This repository was archived by the owner on Dec 4, 2024. It is now read-only.
Merged
Changes from 1 commit
Commits
43 commits
5150951
WIP add tests and update manifest with spark.tgz
samvantran Jul 26, 2018
764e6c0
WIP fix cleanUpSubmitArgs to handle special chars and multiargs
samvantran Jul 26, 2018
746969a
Cleanup
samvantran Jul 26, 2018
62b542b
Add vendor pkg go-shellwords
samvantran Jul 26, 2018
a71e59f
Fix url
samvantran Jul 26, 2018
9f1dee7
More cleanup + gofmt
samvantran Jul 27, 2018
65a0f7c
Fix single quote error
samvantran Jul 27, 2018
8d347b9
Fix descrip
samvantran Jul 27, 2018
8336832
More fixes and tests
samvantran Jul 27, 2018
db177cd
Debug why single quote fails via spark run
samvantran Jul 29, 2018
c5df110
Fixes and cleanup
samvantran Jul 29, 2018
55412bc
gofmt
samvantran Jul 29, 2018
5a8e3f4
Comment out test, need to create app to print out options
samvantran Jul 30, 2018
8297693
Add simple app + test for CI
samvantran Jul 31, 2018
93f588a
Cleanup and fix test
samvantran Aug 1, 2018
96ac0a2
Fixes
samvantran Aug 1, 2018
3026e88
Cleanup test cases
samvantran Aug 2, 2018
7c294c7
Address PR comments
samvantran Aug 2, 2018
7b12b34
Fix expected test output
samvantran Aug 2, 2018
ff65589
Write confs to tempfile
samvantran Aug 7, 2018
29701c3
Forgot arg in parent function
samvantran Aug 7, 2018
376bf1b
Let's try escaping the quotes
samvantran Aug 8, 2018
f7ad435
Alternatively, wrap entire fn in file
samvantran Aug 8, 2018
6b32e56
Add function isSparkApp
samvantran Aug 14, 2018
c711c6c
Print out all system.properties in app
samvantran Aug 14, 2018
34dffbc
Run the actual file in test
samvantran Aug 14, 2018
d0a230e
Add run perms to tempfile
samvantran Aug 14, 2018
1afd177
Octals are different in python3
samvantran Aug 15, 2018
644c264
Subprocess.run needs shell=True
samvantran Aug 15, 2018
f267518
Sleep right after chmod (potentially old Docker bug)
samvantran Aug 15, 2018
e7035d0
Holy bejesus it finally works
samvantran Aug 16, 2018
9f86664
Cleanup, move logic to test_spark and revert spark_utils
samvantran Aug 16, 2018
0fef2c3
Simplify test_multi_arg_confs
samvantran Aug 17, 2018
39434da
Address PR comments
samvantran Aug 20, 2018
cfadbf1
Cleanup
samvantran Aug 21, 2018
9a69880
Oops, too hasty with the revert
samvantran Aug 21, 2018
36faae7
Merge branch 'master' into DCOS-38138-shell-escape
samvantran Aug 24, 2018
fbc86dc
Use spark distro 2.6.5 created from default
samvantran Aug 24, 2018
13bb7f0
Resync test.sh from dcos-commons: use DOCKER_IMAGE envvar
Aug 25, 2018
7865b6c
Skip test_jar test
samvantran Aug 28, 2018
0ca555a
Merge branch 'master' into DCOS-38138-shell-escape
samvantran Aug 28, 2018
d0aae3c
Remove checking for bool values
samvantran Aug 29, 2018
7b2bb6c
Move app extensions closer to method
samvantran Aug 30, 2018
Fixes
samvantran committed Aug 2, 2018
commit 96ac0a2d4f494c7f3e598ae3c272cadaa504dcad
6 changes: 3 additions & 3 deletions cli/dcos-spark/submit_builder.go
@@ -149,8 +149,8 @@ Args:
submit.Flag("properties-file", "Path to file containing whitespace-separated Spark property defaults.").
PlaceHolder("PATH").ExistingFileVar(&args.propertiesFile)
submit.Flag("conf", "Custom Spark configuration properties. "+
"For properties with multiple values, wrap in single quotes. "+
"e.g. conf=property='val1 val2'").
"If submitting properties with multiple values, "+
"wrap in single quotes e.g. --conf prop='val1 val2'").
PlaceHolder("prop=value").StringMapVar(&args.properties)
submit.Flag("kerberos-principal", "Principal to be used to login to KDC.").
PlaceHolder("user@REALM").Default("").StringVar(&args.kerberosPrincipal)
@@ -316,7 +316,7 @@ LOOP:
continue LOOP
Contributor:

What if we pass --supervise false or something similar? The code doesn't seem to handle this case explicitly (it looks like it will be joined in the default case, though).

Contributor Author:

You're right: false gets joined with --supervise and the job fails. I just tried it and we get a parse error:

Error when parsing --submit-args: unknown long flag '--supervise false'

In this case, it might make sense to check for boolean flags and, if one is found, check the next argument. If it's a boolean literal like true/false, we could just move the pointer forward and skip it (i += 2).

More generally though, there are a lot of ways to mess up confs. Does it make sense to have some kind of blacklist that guards against 'user error'?

}
}
if strings.Contains(current, "=") {
// already in the form arg=val, no merge required
sparkArgs = append(sparkArgs, current)
i++
2 changes: 1 addition & 1 deletion manifest.json
@@ -2,7 +2,7 @@
"spark_version": "2.2.1",
"default_spark_dist": {
"hadoop_version": "2.7",
"uri": "https://svt-dev.s3.amazonaws.com/spark/spark-2.2.1-bin-2.7.3-dcos-38138.tgz"
"uri": "svt-dev.s3.amazonaws.com/spark/spark-2.2.1-bin-2.7.3-dcos-38138.tgz"
Contributor Author:

Once the PR is in good shape and approved, I'll update the downloads.mesosphere URI to point at the updated tgz.

},
"spark_dist": [
{
2 changes: 1 addition & 1 deletion tests/jobs/scala/src/main/scala/MultiConfs.scala
@@ -13,4 +13,4 @@ object MultiConfs {
val conf = new SparkConf().setAppName(APPNAME)
conf.getAll.foreach(println)
Contributor:

One additional check that tests the whole thing end-to-end: pass in -Dparam1=\"valA valB\" as one of the extraJavaOptions, and then inside this app, you can check the value of System.getProperty("param1") to make sure it was set correctly.

Contributor Author:

Good idea, I'll update.

}
}
}
7 changes: 4 additions & 3 deletions tests/test_spark.py
@@ -108,11 +108,12 @@ def test_sparkPi(service_name=utils.SPARK_SERVICE_NAME):
@pytest.mark.sanity
def test_spark_with_multi_configs(service_name=utils.SPARK_SERVICE_NAME):
utils.run_tests(
app_url="https://s3-us-west-1.amazonaws.com/svt-dev/jars/dcos-spark-scala-tests-assembly-0.2-DCOS-38138.jar",
app_url=utils.dcos_test_jar_url(),
app_args="",
expected_output="spark.executor.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3 -Dparam4=\"This one with spaces\"'",
expected_output="spark.executor.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3",
service_name=service_name,
args=["--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3 -Dparam4=\"This one with spaces\"'",
args=["--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3'",
Contributor Author:

This is giving me trouble. The same command works via the Spark CLI but not with pytest.

Contributor:

Looking at the TeamCity output, the error seems to be thrown by the Parse function called here: https://github.com/mesosphere/spark-build/blob/DCOS-38138-shell-escape/cli/dcos-spark/submit_builder.go#L411. It might be a good idea to check whether that is behaving as expected.

Also, confirming that the following command:

dcos spark --name=spark run --verbose --submit-args="--conf spark.driver.extraJavaOptions='-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3' --conf spark.mesos.containerizer=mesos --class MultiConfs --conf spark.mesos.role=* --conf spark.driver.memory=2g http://infinity-artifacts-ci.s3.amazonaws.com/autodelete7d/spark/test-20180802-062841-eHc7mdB3GD1WgGtj/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar "

exactly as it is generated by the tests works from the CLI would also be good.

Contributor Author:

I can confirm that the exact same command works when submitted via command line and the stdout returns the expected args:
from CLI:

Translated spark-submit arguments: '--conf=spark.driver.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3, --conf=spark.mesos.containerizer=mesos, --class=MultiConfs, --conf=spark.mesos.role=*, --conf=spark.driver.memory=2g, http://infinity-artifacts-ci.s3.amazonaws.com/autodelete7d/spark/test-20180802-062841-eHc7mdB3GD1WgGtj/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar'
...
Run job succeeded. Submission id: driver-20180802152929-0002

stdout

1.448: [GC (Allocation Failure) [PSYoungGen: 64512K->8448K(75264K)] 64512K->8456K(247296K), 0.0073157 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
Running MultiConfs app. Printing out all config values:
(spark.driver.extraJavaOptions,-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Dparam3=val3)
(spark.jars,file:/mnt/mesos/sandbox/dcos-spark-scala-tests-assembly-0.2-SNAPSHOT.jar)
...

The issue seems to be that going through CI requires another layer of Python string parsing, which causes the Go parsing to fail (my guess is that additional quotes are added).

Contributor:

Could you add a few special characters here as well? Maybe a -Dparam3=\"val3 val4\" like in the unit test?

"--conf spark.mesos.containerizer=mesos",
"--class MultiConfs"])

