@naveenminchu
Contributor

a.) I currently have a fix for InferSchema so that it returns NullType and ArrayType(NullType).
b.) @rxin Can you confirm that the second task in [SPARK-12346] refers to
prepareJobForWrite(job: Job) in "/spark-sql_2.10/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala"?
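
For readers unfamiliar with the behavior described in (a), here is a minimal sketch of schema inference that keeps a null type instead of widening it to something else. It is not Spark's InferSchema code: the toy type names ("null", "long", ("array", ...), ("struct", ...)) and the functions infer_type, merge_types, and infer_schema are all hypothetical and exist only for illustration.

```python
import json

NULL = "null"  # toy stand-in for a null type; not Spark's NullType class


def infer_type(value):
    """Infer a toy type for one parsed JSON value."""
    if value is None:
        return NULL
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        element = NULL  # an empty array keeps a null element type
        for item in value:
            element = merge_types(element, infer_type(item))
        return ("array", element)
    if isinstance(value, dict):
        return ("struct", {k: infer_type(v) for k, v in value.items()})
    raise TypeError("unsupported JSON value: %r" % (value,))


def merge_types(left, right):
    """Find a toy type compatible with both inputs."""
    if left == right:
        return left
    if left == NULL:
        return right
    if right == NULL:
        return left
    if isinstance(left, tuple) and isinstance(right, tuple) and left[0] == right[0]:
        if left[0] == "array":
            return ("array", merge_types(left[1], right[1]))
        # Both are structs: merge field by field, keeping fields seen on either side.
        fields = dict(left[1])
        for name, dtype in right[1].items():
            fields[name] = merge_types(fields[name], dtype) if name in fields else dtype
        return ("struct", fields)
    if left in ("long", "double") and right in ("long", "double"):
        return "double"
    return "string"  # assumed fallback for otherwise incompatible types


def infer_schema(json_lines):
    """Fold the per-record types of newline-delimited JSON into one schema."""
    schema = NULL
    for line in json_lines:
        schema = merge_types(schema, infer_type(json.loads(line)))
    return schema


records = ['{"a": null, "b": []}', '{"a": null, "b": [], "c": 1}']
schema = infer_schema(records)
```

In this sketch, a field that is null in every record stays "null", and an array that is empty in every record stays ("array", "null"), which mirrors the NullType and ArrayType(NullType) results mentioned above.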

@AmplabJenkins

Can one of the admins verify this patch?

@rxin
Contributor

rxin commented Jan 6, 2016

cc @yhuai

Contributor

I think you can just delete this case.

Contributor Author

@yhuai I have now removed the NullType case, as suggested.

@yhuai
Contributor

yhuai commented Jan 6, 2016

@naveenminchu I think the next step is to remove those null type fields before we write data to parquet/orc. @liancheng Parquet does not have a null type, right?
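
One way to picture the suggested step is a recursive pass that drops null-typed fields from a schema before handing it to a writer. The sketch below is an illustration only, not the Spark write path: the toy schema encoding and the function prune_null_types are hypothetical. It also assumes that an array whose element type is null, and a struct left with no fields, should be dropped too.

```python
NULL = "null"  # toy stand-in for a null type; not Spark's NullType class


def prune_null_types(dtype):
    """Return dtype without null-typed parts, or None if nothing is left."""
    if dtype == NULL:
        return None
    if isinstance(dtype, tuple) and dtype[0] == "array":
        # Assumption: an array of nulls carries no writable data, so drop it.
        element = prune_null_types(dtype[1])
        return None if element is None else ("array", element)
    if isinstance(dtype, tuple) and dtype[0] == "struct":
        kept = {}
        for name, field_type in dtype[1].items():
            pruned = prune_null_types(field_type)
            if pruned is not None:
                kept[name] = pruned
        # Assumption: a struct with no remaining fields is dropped as well.
        return ("struct", kept) if kept else None
    return dtype  # primitive types pass through unchanged


inferred = ("struct", {
    "id": "long",
    "note": NULL,
    "tags": ("array", NULL),
    "meta": ("struct", {"x": NULL, "y": "string"}),
})
writable = prune_null_types(inferred)
```

Here only "id" and the "y" field inside "meta" survive. A real change would also need to project the rows to match the pruned schema, which this sketch leaves out.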

@yhuai
Contributor

yhuai commented Jan 9, 2016

Can you add tests and also make changes to the write path (Parquet/ORC)? Otherwise, writing a JSON dataset that contains null type fields to Parquet will fail.

@nchammas
Contributor

Is this PR really meant to modify 1,320 files?
