Closed
Add a comment.
ueshin committed Mar 22, 2019
commit f8b34041fe16f64aac2baaf14e32925635984ff8
2 changes: 2 additions & 0 deletions python/pyspark/worker.py
@@ -253,6 +253,8 @@ def read_udfs(pickleSer, infile, eval_type):
"spark.sql.legacy.execution.pandas.groupedMap.assignColumnsByName", "true")\
.lower() == "true"

# Scalar Pandas UDF handles struct type arguments as pandas DataFrames instead of
# pandas Series. See SPARK-27240.
df_for_struct = eval_type == PythonEvalType.SQL_SCALAR_PANDAS_UDF
Member

It is hard to tell why df_for_struct should be true when eval_type is PythonEvalType.SQL_SCALAR_PANDAS_UDF. A comment explaining the reason would help here.

Member Author

Sure, will add a comment.
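
For readers following the thread, here is a rough illustration of the distinction the new comment describes: a struct-typed argument arriving as a pandas DataFrame (one column per field) rather than as a single pandas Series. This is only a sketch in plain pandas. `struct_to_pandas` is a hypothetical helper, not part of PySpark, and it does not reproduce what `ArrowStreamPandasUDFSerializer` actually does with Arrow data; struct values are modeled here as Python dicts purely for illustration.

```python
import pandas as pd


def struct_to_pandas(rows, field_names, df_for_struct):
    """Hypothetical helper (not a PySpark API).

    `rows` is a list of dicts, each standing in for one struct value.
    """
    if df_for_struct:
        # Struct fields become DataFrame columns, one row per struct value.
        return pd.DataFrame(rows, columns=field_names)
    # Otherwise the whole struct stays a single object-typed Series.
    return pd.Series(rows)


rows = [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]

# Assumed to mirror the scalar pandas UDF case, where df_for_struct is True.
as_df = struct_to_pandas(rows, ["a", "b"], df_for_struct=True)

# Assumed to mirror the other eval types, where df_for_struct is False.
as_series = struct_to_pandas(rows, ["a", "b"], df_for_struct=False)
```

With the flag set, a UDF body can address fields by column name (for example `as_df["a"]`); without it, each element has to be unpacked individually.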

ser = ArrowStreamPandasUDFSerializer(timezone, safecheck, assign_cols_by_name,
df_for_struct)