Contributor

@skonto skonto commented May 30, 2018

What changes were proposed in this pull request?

Remove code that is misleading and is a leftover from a previous implementation.

How was this patch tested?

Manually.

Contributor Author

In general I would like to rename SPARK_JAVA_OPTS here; it's a bit misleading, as this property used to exist in Spark and was removed: https://issues.apache.org/jira/browse/SPARK-14453

Member

it would be confusing -- this is local though right, not exported?

Contributor Author

@felixcheung yes, it is local, but so that users don't make assumptions, it would be good to rename it at some point.

SparkQA commented May 30, 2018

Test build #91305 has finished for PR 21462 at commit 74e9011.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 30, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3573/

SparkQA commented May 30, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3573/

Contributor Author

skonto commented May 30, 2018

@foxish pls review.

Contributor

foxish commented May 30, 2018

jenkins, ok to test

SparkQA commented May 30, 2018

Test build #91312 has finished for PR 21462 at commit 74e9011.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 30, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3576/

SparkQA commented May 30, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3576/

Contributor

Did we remove the ability to put jars somewhere in the image and add them to the classpath at runtime? We had this before with #20193.

cc @mccheah

Contributor

We should confirm that the new init-container-less implementation retains the capability of doing that - maybe through an e2e test.

Contributor Author

@skonto skonto May 30, 2018

@foxish I can add that capability, but as far as I can tell it is not there now. I was looking for it, but I am afraid SPARK_EXTRA_CLASSPATH is not used. In addition, since we are using client mode, --driver-class-path should be used here. In general, all options in the common config doc should be functional.

Contributor

You should be able to specify dependencies with the local:// URI scheme. However, for the image itself to specify additional jars without having to list them in spark-submit, you have to put the jars in the same directory as the rest of the Spark distribution jars. It would be good to have an API or environment variable that can point to additional jars.

Contributor Author

@skonto skonto May 31, 2018

@mccheah yes, putting them under /opt/spark/jars is one option because it's included in the classpath by default. I will then add SPARK_EXTRA_CLASSPATH back and have the jars there added to the spark-submit in the container. Sounds good?
I opened another ticket for the driver's extra Java options here.
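
For readers following the thread, the idea of letting an image contribute extra jars through an environment variable could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from this PR: `build_classpath` is a hypothetical helper name, and `/opt/spark` is taken from the path mentioned above. Only `SPARK_EXTRA_CLASSPATH` comes from the discussion itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): builds a classpath string that
# always contains the distribution jars and optionally appends extra entries.
build_classpath() {
  local spark_home="$1"   # e.g. /opt/spark, as mentioned in the thread
  local extra="$2"        # value of SPARK_EXTRA_CLASSPATH, may be empty
  local cp="${spark_home}/jars/*"
  # Append the extra entries only when the variable is non-empty,
  # so no trailing ":" ends up on the classpath.
  if [ -n "$extra" ]; then
    cp="${cp}:${extra}"
  fi
  # Quoted so the "*" wildcard is passed to the JVM rather than glob-expanded.
  printf '%s\n' "$cp"
}

# Usage: the default expansion keeps this safe when the variable is unset.
build_classpath "/opt/spark" "${SPARK_EXTRA_CLASSPATH:-}"
```

With this shape, an image that sets `SPARK_EXTRA_CLASSPATH` gets its jars added, while an image that does not set it behaves exactly as before.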

Contributor Author

It does not create any side effects so it can be removed.

Contributor

Do we not have secrets -> mountpaths support right now? @mccheah @liyinan926

Contributor Author

@skonto skonto Jun 28, 2018

@foxish you are right, we do have it, but this statement has no effect. For example, it does not modify sparkConf.
It was used in the past by the following statement, which was removed:

-    val mountSecretBootstrap = if (executorSecretNamesToMountPaths.nonEmpty) {
-      Some(new MountSecretsBootstrap(executorSecretNamesToMountPaths))
-    } else {
-      None
-    }

Here is the relevant PR:
a83ae0d#diff-7d9979c0153744eafa24348ecbfa1671

Contributor Author

@mccheah @liyinan926 any more to this?

Contributor

This might be a bug, shouldn't we just be mounting the secrets here?

Contributor Author

@skonto skonto Jun 30, 2018

@mccheah @felixcheung I think the logic has changed; otherwise the tests I have in PR #21652 would have failed when I removed that part and re-ran them. They should also have failed anyway if there was a bug. The tests there cover mounted secrets.

SparkQA commented Jun 8, 2018

Test build #91560 has finished for PR 21462 at commit f8635fd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor Author

skonto commented Jun 8, 2018

@foxish pls review. It is quite short :)

cp -R "$SPARK_MOUNTED_FILES_DIR/." .
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt

if [ -n "$SPARK_EXTRA_CLASSPATH" ]; then
Contributor Author

This var exists in docs but not in code.

fi
if [ -n "$SPARK_MOUNTED_FILES_DIR" ]; then
cp -R "$SPARK_MOUNTED_FILES_DIR/." .
readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt
Contributor Author

Rename for historical reasons.

Member

this is local right? shouldn't matter what the name is. also this might be an image running the driver, not an executor?

Contributor

@felixcheung I believe this is because we are running spark-submit from driver. And so the JAVA_OPTS are now only applicable to the executor.
I +1 this change, but would like weigh-in from @foxish

Contributor Author

Yeah spark-submit detects whatever it needs via the spark-launcher.

Contributor

+1, that makes sense. Those env-vars are set by launcher at submission time.
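
As a side note on the `readarray -t SPARK_EXECUTOR_JAVA_OPTS < /tmp/java_opts.txt` line under discussion, the sketch below shows how that bash idiom behaves. It is only a small illustration, not the image's entrypoint: the temporary file stands in for `/tmp/java_opts.txt`, the option values are made up, and `count_args` is a hypothetical helper used to show how the array expands. It assumes bash 4 or later, where `readarray` is available.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for /tmp/java_opts.txt, with made-up options.
opts_file="$(mktemp)"
printf '%s\n' '-Dspark.foo=a b' '-Xmx1g' > "$opts_file"

# readarray -t reads one array element per line and strips the trailing newline.
readarray -t SPARK_EXECUTOR_JAVA_OPTS < "$opts_file"
rm -f "$opts_file"

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Quoted "${array[@]}" expansion passes each line as exactly one argument,
# even when the option contains spaces.
count_args "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
```

The variable name only labels the local array, which is why the reviewers treat the rename as cosmetic.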

SparkQA commented Jun 8, 2018

Test build #91562 has finished for PR 21462 at commit 9f1ddf3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Jun 8, 2018

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3708/

SparkQA commented Jun 8, 2018

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/3708/

Contributor Author

skonto commented Jun 9, 2018

@felixcheung @mccheah pls review.

Contributor Author

skonto commented Jun 12, 2018

@foxish gentle ping.

Contributor Author

skonto commented Jun 14, 2018

@felixcheung gentle ping.

Member

@felixcheung felixcheung left a comment

I think ok, but someone else should review entrypoint.sh

Contributor Author

skonto commented Jun 19, 2018

@felixcheung it's local.

Contributor Author

skonto commented Jun 20, 2018

@foxish gentle ping.

Contributor

@foxish foxish left a comment

Just a couple of questions.
Change LGTM, thanks.

Contributor Author

skonto commented Jun 26, 2018

Thank you @foxish I was also thinking of adding the following:

 if [ -n "$HADOOP_CONFIG_URL" ]; then
     echo "Setting up hadoop config files...."
     mkdir -p /etc/hadoop/conf
     wget "$HADOOP_CONFIG_URL/core-site.xml" -P /etc/hadoop/conf
     wget "$HADOOP_CONFIG_URL/hdfs-site.xml" -P /etc/hadoop/conf
     export HADOOP_CONF_DIR=/etc/hadoop/conf
 fi

I have already tested it with: org.apache.spark.examples.DFSReadWriteTest
At least we get some basic support for integrating with an external HDFS cluster. Are you OK with adding that small code snippet? I know that HDFS/Kerberos HDFS support will come, but this might still be useful since it allows fetching the config files from any place. We use something similar for Spark on DC/OS here.

@ifilonenko
Contributor

@skonto can we hold off on that until I merge Kerberos support? It is in the works :)

Contributor Author

skonto commented Jun 28, 2018

@foxish @felixcheung gentle ping for merge. I replied for mount paths.

@felixcheung
Member

ping @mccheah @liyinan926 on #21462 (comment)

Contributor

foxish commented Jul 2, 2018

LGTM, we can file a separate task to track what happened to secrets mounting without blocking on removing dead code.

Thanks for the fix @skonto.

Contributor

foxish commented Jul 2, 2018

Merging to master
