Skip to content

Conversation

@zzcclp
Copy link
Contributor

@zzcclp zzcclp commented Sep 27, 2018

Changes:

  1. According to SPARK-PR#22346, change the parameter type from 'outputColumns: Seq[Attribute]' to 'outputColumnNames: Seq[String]' when call 'writeAndRead' method;
  2. According to SPARK-PR#21815, there are some parameters added 'lazy', so move original class 'CarbonDataSourceScan' to src path 'commonTo2.1And2.2', and add a new class 'CarbonDataSourceScan' in src path 'spark2.3' which is added some lazy parameters.
  3. Upgrade spark integration version to 2.3.2

Be sure to do all of the following checklist to help us incorporate
your contribution quickly and easily:

  • Any interfaces changed? No

  • Any backward compatibility impacted? No

  • Document update required? No

  • Testing done
    Please provide details on
    - Whether new unit test cases have been added or why no new tests are required?
    - How it is tested? Please attach test report.
    - Is it a performance related change? Please attach the performance test report.
    - Any additional information to help reviewers in testing this change.

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

classOf[Seq[Attribute]],
classOf[SparkPlan])
method.invoke(dataSourceObj, mode, query, query.output, physicalPlan)
method.invoke(dataSourceObj, mode, query, query.output.map(_.name), physicalPlan)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameters of 'writeAndRead' method had been changed, please see: SPARK-PR#22346

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added some comments too

override lazy val (outputPartitioning, outputOrdering): (Partitioning, Seq[SortOrder]) =
(partitioning, Nil)

override lazy val metadata: Map[String, String] = md
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameters (supportsBatch, outputPartitioning, outputOrdering, metadata) had been added keyword 'lazy', please see: SPARK-PR#21815

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: made lazy since spark 2.3.2 version (SPARK-PR#21815)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 27, 2018

@jackylk @chenliang613 @sujith71955 please review.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/615/

@sujith71955
Copy link
Contributor

Thanks for raising the PR, It will better if you can add the description about the changes in this PR.

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8876/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/807/

@chenliang613
Copy link
Contributor

retest this please

@@ -0,0 +1,55 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why need to move CarbonDataSourceScan.scala?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move original class 'CarbonDataSourceScan' to src path 'commonTo2.1And2.2', and add a new class 'CarbonDataSourceScan' in src path 'spark2.3' which is added some lazy parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My comment : only for 4 parameters , copy the whole file(CarbonDataSourceScan.scala) for spark 2.3 integration, may not require. see if can add the judgement for different spark version with different code/parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenliang613 , it's not about the different code/parameters, it changes common val parameters to lazy val parameters. I think it's difficult to add lazy keyword on val parameters according to spark version.
Or does anyone know how to do and give me some help, thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/619/

@CarbonDataQA
Copy link

Build Failed with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8880/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/811/

@ravipesala
Copy link
Contributor

@zzcclp Please check and fix the tests

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 28, 2018

@ravipesala can you help me to check why these three test cases fail? It's about the decimal precision.

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 28, 2018

@ravipesala I know how to fix and will fix the tests ASAP.

@zzcclp zzcclp changed the title [WIP] Upgrade spark integration version to 2.3.2 [CARBONDATA-2989] Upgrade spark integration version to 2.3.2 Sep 28, 2018
@zzcclp zzcclp force-pushed the wip_upgrade_to_spark2.3.2 branch from 586cf7b to c8b53cf Compare September 28, 2018 14:52
@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/837/

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/643/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8905/

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 28, 2018

@sujith71955 @chenliang613 @ravipesala @jackylk this pr is ready, please review, thanks.

@chenliang613
Copy link
Contributor

My comment : only for 4 parameters , copy the whole file(CarbonDataSourceScan.scala) for spark 2.3 integration, may not require. see if can add the judgement for different spark version with different code/parameters.

@jackylk
Copy link
Contributor

jackylk commented Sep 29, 2018

I think it is not possible to add lazy variable without breaking spark 2.2 integration. Since this class is small, I think it is ok to have a separate one for spark 2.3 integration

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 29, 2018

@sujith71955 any suggestion for this?

@chenliang613
Copy link
Contributor

LGTM

1 similar comment
@ravipesala
Copy link
Contributor

LGTM

@sujith71955
Copy link
Contributor

sujith71955 commented Sep 30, 2018

LGTM Even though SPARK-PR#21815 changes is a workaround solution and in future its subjected to change. Need a caution here. Can you please add this PR reference in the modified code of the carbon for future reference. I think you add like ' made lazy since spark 2.3.2 version(SPARK-PR#21815)
Thanks for your effort.

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/656/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8920/

@CarbonDataQA
Copy link

Build Failed with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/852/

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 30, 2018

the failed test case is not related to this pr, right?

@sujith71955
Copy link
Contributor

I think better to re-trigger once

@sujith71955
Copy link
Contributor

retest this please

@sujith71955
Copy link
Contributor

sujith71955 commented Sep 30, 2018

Seems to be impact of this PR , but it has passed previously,since we modified the outputpartition as lazy, better to have a relook once

@zzcclp
Copy link
Contributor Author

zzcclp commented Sep 30, 2018

ok, but it doesn't add lazy for spak 2.2.

@sujith71955
Copy link
Contributor

sujith71955 commented Sep 30, 2018 via email

@CarbonDataQA
Copy link

Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder2.1/660/

@CarbonDataQA
Copy link

Build Success with Spark 2.3.1, Please check CI http://136.243.101.176:8080/job/carbondataprbuilder2.3/8923/

@CarbonDataQA
Copy link

Build Success with Spark 2.2.1, Please check CI http://95.216.28.178:8080/job/ApacheCarbonPRBuilder1/855/

@asfgit asfgit closed this in 2081bc8 Sep 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants