Conversation

@attilapiros
Contributor

@attilapiros attilapiros commented Jan 27, 2023

What changes were proposed in this pull request?

Introduce a config that controls whether all active SparkContexts are closed after the main method has finished.
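
For illustration, a minimal sketch of what such an entry could look like in Spark's ConfigBuilder style; the config key, doc text, and default shown here are assumptions for the sketch, not the actual values from this PR:

import org.apache.spark.internal.config.ConfigBuilder

// Hypothetical key and module; the real PR may use a different name and location.
val STOP_ACTIVE_CONTEXTS_AFTER_MAIN =
  ConfigBuilder("spark.kubernetes.driver.stopContextsAfterMain")
    .doc("Whether to stop all active SparkContexts after the user's main method " +
      "returns. Long-running drivers such as job servers, which submit jobs from " +
      "threads other than main, may want to set this to false.")
    .booleanConf
    .createWithDefault(true)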

Why are the changes needed?

We ran into errors after upgrading from Spark 3.1 to Spark 3.2, as the SparkContext got closed right after the application started. The root cause turned out to be SPARK-34674, which introduced the closing of SparkContexts after the main method has finished. For details see #32283.

The application was a Spark job server built on top of Spring Boot, so all job submissions happened outside of the main method.

Does this PR introduce any user-facing change?

No. With the current default (true), the behaviour is the same as for YARN.

How was this patch tested?

Manually.

@github-actions github-actions bot added the CORE label Jan 27, 2023
@attilapiros attilapiros changed the title [SPARK-42219] Introducing a config to close all active SparkContexts after the Main method has finished [SPARK-42219][Core] Introducing a config to close all active SparkContexts after the Main method has finished Jan 27, 2023
@attilapiros attilapiros changed the title [SPARK-42219][Core] Introducing a config to close all active SparkContexts after the Main method has finished [SPARK-42219][CORE] Introducing a config to close all active SparkContexts after the Main method has finished Jan 27, 2023
@attilapiros
Contributor Author

cc @dongjoon-hyun, @HyukjinKwon

Member

@dongjoon-hyun dongjoon-hyun left a comment


That's a bad example from the ancient YARN age, @attilapiros. I'd rather not reinforce those bad habits. Instead, I can give you counterexamples like:

sparkConf.getBoolean("spark.kubernetes.submitInDriver", false)

if (sparkConf.getOption("spark.yarn.appMasterEnv.OMP_NUM_THREADS").isEmpty &&
sparkConf.getOption("spark.mesos.driverEnv.OMP_NUM_THREADS").isEmpty &&
sparkConf.getOption("spark.kubernetes.driverEnv.OMP_NUM_THREADS").isEmpty) {

@attilapiros
Contributor Author

@dongjoon-hyun I have moved the config into the k8s module

@attilapiros
Contributor Author

cc @holdenk

@attilapiros
Contributor Author

This pyspark test failure is unrelated:

Error:  running /__w/spark/spark/python/run-tests --modules=pyspark-core,pyspark-streaming,pyspark-ml --parallelism=1 ; received return code 255
Error: Process completed with exit code 19.

@attilapiros attilapiros requested a review from holdenk January 31, 2023 23:09
@attilapiros
Contributor Author

cc @HyukjinKwon, @srowen

@pan3793
Member

pan3793 commented Mar 19, 2023

SPARK-42698 (#40314) aims to expand the scope of stopping the SparkContext after runMain exits to all cluster managers. If it is accepted, the configuration proposed by this PR should expand its scope as well.

Member

@srowen srowen left a comment


Should this be specific to Kubernetes?
Does it need to be a config or a method you can call?
Actually, why would you not kill the contexts after main exits in any case?

@attilapiros
Contributor Author

@srowen

Should this be specific to Kubernetes?

The original #32283 was Kubernetes specific. This PR just adds a new config that keeps the old behaviour as the default but also makes the new one available.

Does it need to be a config or a method you can call?

Unfortunately there are use cases for both behaviours. See the next point.

Actually, why would you not kill the contexts after main exits in any case?

I bumped into this change when I analysed an application where Spark was used as a job server.
With the older Spark it was running just fine, but after an upgrade it stopped at the very beginning.
The app was built on top of Spring Boot, where job requests were served via REST.
In a Spring Boot app the main method just initialises / registers the REST handlers, and the serving of new requests is done on separate threads. With the new behaviour the SparkContext was closed right after the initialisation.
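
For illustration only, a minimal sketch of the pattern that breaks; the class and method names are made up, not the actual application. main only wires things up and returns, while jobs are submitted later from other threads:

import java.util.concurrent.Executors
import org.apache.spark.sql.SparkSession

object JobServerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("job-server").getOrCreate()
    val pool = Executors.newFixedThreadPool(4)

    // Stand-in for the REST layer: each request submits work on a pool thread.
    def handleRequest(n: Int): Unit = pool.submit(new Runnable {
      override def run(): Unit =
        println(spark.sparkContext.parallelize(1 to n).sum())
    })

    handleRequest(100)
    // main returns here; with the post-SPARK-34674 behaviour the SparkContext
    // is stopped at this point, so any later request fails.
  }
}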

@attilapiros
Contributor Author

cc @mridulm

@zhouyifan279
Contributor

Our customer also encountered this issue recently. They are migrating their Spark job server (a Spring Boot application) from Spark 2.4 to Spark 3.
They were surprised that Spark 3.1.2 stops the SparkContext after the job server has started.

@attilapiros
Contributor Author

I am closing this PR. Job servers have the option to make a blocking call in the main method to avoid the auto-stopping of the active SparkContexts.
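
As a sketch of that workaround (the object and method names are assumptions), the job server's main can block on a latch until an explicit shutdown, so the post-main cleanup never runs while jobs are still being served:

import java.util.concurrent.CountDownLatch

object JobServerMain {
  private val shutdownLatch = new CountDownLatch(1)

  def main(args: Array[String]): Unit = {
    // ... initialise the SparkSession and REST handlers as before ...
    shutdownLatch.await()   // keep main alive until shutdown() is called
  }

  // Invoked from a shutdown endpoint or a JVM shutdown hook.
  def shutdown(): Unit = shutdownLatch.countDown()
}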
