clarify retry docs
jose-torres committed Mar 3, 2018
commit 215c225c5a1623cfa02f617201e21067bbf6088a
@@ -38,10 +38,10 @@
  * succeeds), a {@link WriterCommitMessage} will be sent to the driver side and pass to
  * {@link DataSourceWriter#commit(WriterCommitMessage[])} with commit messages from other data
  * writers. If this data writer fails(one record fails to write or {@link #commit()} fails), an
- * exception will be sent to the driver side, and Spark may retry this writing task for some times,
- * each time {@link DataWriterFactory#createDataWriter(int, int, long)} gets a different
- * `attemptNumber`, and finally call {@link DataSourceWriter#abort(WriterCommitMessage[])} if all
- * retry fail.
+ * exception will be sent to the driver side, and Spark may retry this writing task a few times.
+ * In each retry, {@link DataWriterFactory#createDataWriter(int, int, long)} will receive a
+ * different `attemptNumber`. Spark will call {@link DataSourceWriter#abort(WriterCommitMessage[])}
Contributor:
This is not clear to me. Isn't it the case that abort will be called every time a task attempt ends in an error?
This seems to give the impression that abort is called only after N failed attempts have been made.

Contributor Author:

The local abort will be called every time a task attempt fails. The global abort referenced here is called only when the job fails.

+ * when the configured number of retries is exhausted.
  *
  * Besides the retry mechanism, Spark may launch speculative tasks if the existing writing task
  * takes too long to finish. Different from retried tasks, which are launched one by one after the
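For readers skimming this thread, here is a minimal sketch of the task-side half of the contract described above, assuming the Spark 2.3-era DataSourceV2 interfaces in `org.apache.spark.sql.sources.v2.writer`. The staging-file layout, the `String` record type, and the `CommittedFileMessage` class are illustrative assumptions, not part of this patch or of Spark's API:

```java
// Sketch only: stage records somewhere attempt-specific, publish them in
// commit(), and throw the staging data away in abort(). Paths, the String
// record type, and CommittedFileMessage are illustrative assumptions.
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import org.apache.spark.sql.sources.v2.writer.DataWriter;
import org.apache.spark.sql.sources.v2.writer.WriterCommitMessage;

class StagingFileDataWriter implements DataWriter<String> {

  private final Path stagingFile; // unique per (partitionId, attemptNumber)
  private final Path finalFile;
  private final BufferedWriter out;

  StagingFileDataWriter(int partitionId, int attemptNumber) throws IOException {
    this.stagingFile = Paths.get("/tmp/sink/_staging-" + partitionId + "-" + attemptNumber);
    this.finalFile = Paths.get("/tmp/sink/part-" + partitionId);
    Files.createDirectories(stagingFile.getParent());
    this.out = Files.newBufferedWriter(stagingFile);
  }

  @Override
  public void write(String record) throws IOException {
    // Any exception thrown here fails this attempt: Spark calls abort() on this
    // writer and may ask the factory for a new one with a higher attemptNumber.
    out.write(record);
    out.newLine();
  }

  @Override
  public WriterCommitMessage commit() throws IOException {
    // Publish the staged output; the returned message goes back to the driver
    // and is eventually passed to DataSourceWriter#commit(WriterCommitMessage[]).
    out.close();
    Files.move(stagingFile, finalFile, StandardCopyOption.ATOMIC_MOVE);
    return new CommittedFileMessage(finalFile.toString());
  }

  @Override
  public void abort() throws IOException {
    // Task-level ("local") abort: runs for every failed attempt and only cleans
    // up this attempt's data, unlike the job-level DataSourceWriter#abort.
    out.close();
    Files.deleteIfExists(stagingFile);
  }
}

// WriterCommitMessage is a serializable marker; it carries whatever the driver
// needs to finalize (or roll back) this task's output.
class CommittedFileMessage implements WriterCommitMessage {
  final String committedPath;

  CommittedFileMessage(String committedPath) {
    this.committedPath = committedPath;
  }
}
```

Staging output under a per-attempt path is what makes the retry contract above safe: the task-level abort() runs on every failed attempt (as clarified in the comments) and only discards that attempt's data, while the job-level {@link DataSourceWriter#abort(WriterCommitMessage[])} is left to roll back whatever already-committed tasks reported once the configured retries are exhausted.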