[SPARK-22151] : PYTHONPATH not picked up from the spark.yarn.appMaste… #21468
Changes from 1 commit
0aee8fa
5e733ae
6ba543e
5423bef
49f37a8
    @@ -813,8 +813,14 @@ private[spark] class Client(
       if (pythonPath.nonEmpty) {
         val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath)
           .mkString(ApplicationConstants.CLASS_PATH_SEPARATOR)
    -    env("PYTHONPATH") = pythonPathStr
    -    sparkConf.setExecutorEnv("PYTHONPATH", pythonPathStr)
    +    val newValue =
    +      if (env.contains("PYTHONPATH")) {
    +        env("PYTHONPATH") + ApplicationConstants.CLASS_PATH_SEPARATOR + pythonPathStr
    +      } else {
    +        pythonPathStr
    +      }
    +    env("PYTHONPATH") = newValue
    +    sparkConf.setExecutorEnv("PYTHONPATH", newValue)
       }

       if (isClusterMode) {
You could just say `env.get("PYTHONPATH") ++=: pythonPath` before turning the list into a string.

But there are also two extra questions here:

- […] py-files? I kinda think after makes more sense, since files are generally provided in the command line.
- […] `appMasterEnv` be reflected in executors? With your code it is. I'm not so sure it should.
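
To illustrate the suggestion above, here is a minimal sketch of how prepending an `Option` to a buffer can stand in for the if/else in the diff. It is not the PR's code: `PrependSketch`, `withIfElse`, `withPrepend`, and the `":"` separator are hypothetical stand-ins (the real code uses `ApplicationConstants.CLASS_PATH_SEPARATOR`), and the sketch assumes `pythonPath` is non-empty, as the surrounding guard in the diff does.

```scala
import scala.collection.mutable.ListBuffer

object PrependSketch {
  // Hypothetical placeholder for ApplicationConstants.CLASS_PATH_SEPARATOR.
  val Sep = ":"

  // if/else form, mirroring the shape of the diff: only prepend when a
  // value is already set. Assumes pythonPath is non-empty.
  def withIfElse(existing: Option[String], pythonPath: Seq[String]): String = {
    val pythonPathStr = pythonPath.mkString(Sep)
    if (existing.isDefined) existing.get + Sep + pythonPathStr else pythonPathStr
  }

  // Prepend form: an Option acts as a collection of zero or one elements,
  // so prepending it adds nothing when it is None.
  def withPrepend(existing: Option[String], pythonPath: Seq[String]): String = {
    // Copy into a fresh buffer so the caller's sequence is left untouched.
    val buf = ListBuffer.empty[String] ++= pythonPath
    existing ++=: buf
    buf.mkString(Sep)
  }
}
```

Both forms give the same string for a non-empty `pythonPath`; the prepend form just defers the string join until every entry is in the list.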
Good questions.

Precedence: right now you can work around this issue by exporting PYTHONPATH before you launch spark-submit. I think that value could just be something in someone's env on the launcher box and might not be what you want in a YARN container. I would think that an explicit PYTHONPATH set via `spark.yarn.appMasterEnv` should take precedence over it, since you explicitly configured it. The second question is where that falls relative to py-files, and that one isn't as clear to me since, like you said, it's explicitly specified. Maybe we do py-files, then `spark.yarn.appMasterEnv.PYTHONPATH`, and last the env PYTHONPATH. That is different from the way it is now, though. Thoughts?

Agreed, this should not be reflected in the executors, so if it is, we shouldn't do that. We should make sure `spark.executorEnv.PYTHONPATH` works.
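
The ordering floated in this comment could look roughly like the sketch below. This is only an illustration of the proposal, not code from the PR: `PrecedenceSketch`, its method names, its parameters, and the `":"` separator are all hypothetical, and the executor variant simply assumes the existing behaviour (launcher env followed by py-files) with the `appMasterEnv` value left out.

```scala
object PrecedenceSketch {
  // Hypothetical placeholder for ApplicationConstants.CLASS_PATH_SEPARATOR.
  val Sep = ":"

  // Proposed application master order: py-files first, then the value from
  // spark.yarn.appMasterEnv.PYTHONPATH, and last the launcher's PYTHONPATH.
  def appMasterPythonPath(
      pyFiles: Seq[String],
      appMasterEnv: Option[String],
      launcherEnv: Option[String]): String =
    (pyFiles ++ appMasterEnv ++ launcherEnv).mkString(Sep)

  // Executors: the appMasterEnv value is deliberately not included here.
  // Launcher env first, then py-files, as in the code before the change.
  def executorPythonPath(pyFiles: Seq[String], launcherEnv: Option[String]): String =
    (launcherEnv ++ pyFiles).mkString(Sep)
}
```

Keeping the two paths as separate values is what stops an application-master-only setting from leaking into the executor environment.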
Thank you for your comments @vanzin, I have made the necessary changes. As far as precedence is concerned, I was not sure I understood your question at first, but @tgravescs clarified it for me.
Also note, as @vanzin said, you can just use the `++=:` operator with the `ListBuffer` to prepend, and get rid of the if conditions before converting to a string.
env.get("PYTHONPATH") ++=: (sys.env.get("PYTHONPATH") ++=: pythonPath)
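
A small sketch of what the nested expression above does, under the assumption that `pythonPath` is a `ListBuffer[String]`. The object and parameter names are hypothetical. Operators ending in `:` are applied to their right-hand operand, so the inner prepend runs first and the outer one places its value at the very front; the buffer is also mutated in place.

```scala
import scala.collection.mutable.ListBuffer

object NestedPrependSketch {
  // amEnv stands in for env.get("PYTHONPATH"), launcherEnv for
  // sys.env.get("PYTHONPATH"), and pyFiles for the initial pythonPath entries.
  def combined(
      amEnv: Option[String],
      launcherEnv: Option[String],
      pyFiles: Seq[String]): List[String] = {
    val pythonPath = ListBuffer.empty[String] ++= pyFiles
    // Equivalent to pythonPath.++=:(launcherEnv) followed by a second
    // ++=:(amEnv) on the same buffer, which is modified in place.
    amEnv ++=: (launcherEnv ++=: pythonPath)
    pythonPath.toList
  }
}
```

With all three inputs present the resulting order is the `env` value, then the `sys.env` value, then the original entries, which is worth comparing against whichever precedence the thread settles on.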
@tgravescs I have replaced the if-else code with the `++` operator.