[SPARK-52481] Add Spark History Server example
#249
Conversation
```yaml
applicationTolerations:
  restartConfig:
    restartPolicy: Always
    maxRestartAttempts: 9223372036854775807
```
Since the default value of `maxRestartAttempts` is 3 (for normal jobs), I set this to `Long.MAX_VALUE` because SHS is supposed to run forever.

```
$ jshell
|  Welcome to JShell -- Version 21.0.7
|  For an introduction type: /help intro

jshell> Long.MAX_VALUE
$1 ==> 9223372036854775807
```
Thank you, @viirya. Merged to main.
---
The Spark event log cannot be directly written to S3. There are the following errors:

```
WARN S3ABlockOutputStream: Application invoked the Syncable API against stream writing to logs/spark/events/eventlog_v2_spark-61bc831c6f774629a1cd1bc95d6a7288/events_1_spark-61bc831c6f774629a1cd1bc95d6a7288. This is unsupported
```
---
@melin, it's a warning, not an error.

FYI, Apache Spark 3.2+ has the following by default.
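(The quoted defaults are elided in this transcript. As a hedged illustration only: the `S3ABlockOutputStream` Syncable warning above is governed by Hadoop's S3A setting `fs.s3a.downgrade.syncable.exceptions`, which Spark is believed to enable by default since 3.2; treat the exact default wiring as an assumption, not a quote of the original comment.)

```
# Hedged sketch (spark-defaults.conf style). Assumption: this is the S3A
# setting that downgrades unsupported Syncable API calls to a warning
# instead of a failure, so event logs can be written directly to s3a:// paths.
spark.hadoop.fs.s3a.downgrade.syncable.exceptions  true
```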
`spark.eventLog.dir` cannot be directly set to an S3 path, which is not very convenient. There are two solutions:
---
It seems that the above are not questions. So, to be clear: no, your personal assessments and proposed solutions sound wrong to me, @melin. They don't make sense at all for long-running jobs and streaming jobs. In other words, that is not a community recommendation. The Apache Spark community has been using S3 (and S3-compatible object storage) directly as an Apache Spark event log directory for a long time. BTW, an Apache Spark PR is not a good place for this kind of Q&A. I'd recommend sending a message to [email protected] if you have any difficulties or want to discuss this.
---
FYI, @melin. Here is the example.
---
Hi @dongjoon-hyun, thank you for your work here. I am trying to deploy a spark-history-server pod using your example. I see it as running, but I cannot port-forward to it since port 18080 is not being used by the pod (and so not used by the service). Could you please tell me what I should do to fix my issue? I have the following config:

```yaml
apiVersion: spark.apache.org/v1
kind: SparkApplication
metadata:
  name: spark-history-server
  namespace: spark-operator
spec:
  mainClass: "org.apache.spark.deploy.history.HistoryServer"
  sparkConf:
    # spark.jars.packages: "org.apache.hadoop:hadoop-aws:3.4.1"
    spark.jars: "https://repo1.maven.org/maven2/com/google/cloud/bigdataoss/gcs-connector/3.1.1/gcs-connector-3.1.1-shaded.jar"
    spark.jars.ivy: "/tmp/.ivy2.5.2"
    spark.driver.memory: "2g"
    spark.kubernetes.namespace: spark-operator
    spark.kubernetes.authenticate.driver.serviceAccountName: "spark"
    spark.kubernetes.container.image: "apache/spark:4.0.0-java21-scala"
    spark.history.fs.cleaner.enabled: "true"
    spark.history.fs.cleaner.maxAge: "30d"
    spark.history.fs.cleaner.maxNum: "100"
    spark.history.fs.eventLog.rolling.maxFilesToRetain: "10"
    spark.hadoop.fs.gs.impl: "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
    spark.hadoop.fs.AbstractFileSystem.gs.impl: "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
    spark.hadoop.fs.gs.auth.service.account.enable: "true"
    # spark.hadoop.fs.gs.auth.type: "COMPUTE_ENGINE"
    spark.hadoop.fs.gs.project.id: "rcim-prod-data-core-0"
    spark.hadoop.fs.defaultFS: "gs://SOME-BUCKET-GCS-spark-logs"
    spark.history.fs.logDirectory: "gs://SOME-BUCKET-GCS-spark-logs"
  runtimeVersions:
    sparkVersion: "4.0.0"
  applicationTolerations:
    restartConfig:
      restartPolicy: Always
      maxRestartAttempts: 9223372036854775807
```

I changed the value of my GCS bucket name, but again: I don't see any errors in the pod's logs. When I execute the following port-forward command I get this: Thank you very much if you can help!
---
You need to use
---
@dongjoon-hyun thanks for taking the time for this! I confirm that it now works :).
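The actual fix is truncated in this transcript. As a hedged aside (not necessarily the fix the reply above refers to): the History Server's web UI port is controlled by Spark's `spark.history.ui.port` setting, which defaults to 18080, so making it explicit in `sparkConf` is one way to confirm what port the pod should be listening on:

```yaml
# Hedged sketch: explicit UI port for the History Server.
# 18080 is already the Spark default; setting it here only makes the
# intent explicit so the service/port-forward target is unambiguous.
sparkConf:
  spark.history.ui.port: "18080"
```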
What changes were proposed in this pull request?

Add a `Spark History Server` example.

Why are the changes needed?

Since Apache Spark 4.0, Spark rolls the event logs by default and compresses them by default.

- `spark.eventLog.rolling.enabled` by default (spark#43638)
- `spark.eventLog.compress` by default (spark#43036)

However, we still need more configurations to allow SHS to manage the event log directories. This PR aims to provide an example of `Spark History Server` with the configuration.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Manual review.

Was this patch authored or co-authored using generative AI tooling?

No.