
Conversation

@vanzin (Contributor) commented Jun 22, 2018

This passes a unique attempt id instead of the attempt number to v2
data sources and Hadoop APIs, because the attempt number is reused
when stages are retried. When attempt numbers are reused, sources
that track data by partition id and attempt number may incorrectly
clean up data, because the same attempt number can be both committed
and aborted.
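
To make the failure mode concrete, here is a minimal, self-contained sketch (not the actual Spark change; `TrackingCommitter` and `WriteKey` are hypothetical names invented for illustration) of a sink that tracks writes by (partition id, attempt id). With the per-stage attempt number, a stage retry reuses the same number, so aborting the retried task can discard data the original attempt already committed; a globally unique attempt id (in Spark, something along the lines of `TaskContext.taskAttemptId()` rather than `attemptNumber()`) keeps the two attempts distinct.

```scala
object AttemptIdSketch {

  // A write is identified by the partition it belongs to and whatever
  // attempt identifier the driver hands the writer.
  final case class WriteKey(partitionId: Int, attemptId: Long)

  // Hypothetical sink: remembers committed data per key; abort drops it.
  class TrackingCommitter {
    private val committed = scala.collection.mutable.Set.empty[WriteKey]
    def commit(key: WriteKey): Unit = committed += key
    def abort(key: WriteKey): Unit = committed -= key
    def contains(key: WriteKey): Boolean = committed.contains(key)
  }

  def main(args: Array[String]): Unit = {
    // Keyed by the per-stage attempt number: the retried stage reuses 0.
    val byNumber = new TrackingCommitter
    byNumber.commit(WriteKey(partitionId = 3, attemptId = 0)) // first attempt commits
    byNumber.abort(WriteKey(partitionId = 3, attemptId = 0))  // stage retry aborts, same key!
    println(byNumber.contains(WriteKey(3, 0)))                // false: committed data was removed

    // Keyed by a unique attempt id: the retry gets a different key.
    val byUniqueId = new TrackingCommitter
    byUniqueId.commit(WriteKey(partitionId = 3, attemptId = 17L)) // first attempt commits
    byUniqueId.abort(WriteKey(partitionId = 3, attemptId = 42L))  // retry aborts its own key
    println(byUniqueId.contains(WriteKey(3, 17L)))                // true: committed data survives
  }
}
```

Running the sketch prints `false` for the attempt-number case (the abort wiped data the first attempt had committed) and `true` for the unique-id case, which is the behavior the description above is after.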

@tgravescs (Contributor)

+1 pending tests. @rdblue

@rdblue (Contributor) commented Jun 22, 2018

+1

@SparkQA commented Jun 22, 2018

Test build #92226 has finished for PR 21615 at commit a80b57b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Jun 23, 2018

Test build #92234 has finished for PR 21615 at commit f9b134e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Jun 25, 2018
… number for writes.

Author: Marcelo Vanzin <[email protected]>

Closes #21615 from vanzin/SPARK-24552-2.3.
@vanzin closed this Jun 25, 2018
@vanzin (Contributor, Author) commented Jun 25, 2018

Merged to 2.3.

jzhuge pushed a commit to jzhuge/spark that referenced this pull request Aug 20, 2018
… number for writes.

Author: Marcelo Vanzin <[email protected]>

Closes apache#21615 from vanzin/SPARK-24552-2.3.
@vanzin deleted the SPARK-24552-2.3 branch August 24, 2018 19:56