Update ExternalAppendOnlyUnsafeRowArray.scala
cloud-fan authored Sep 3, 2025
commit 32a744d095fd32b46e7669459abe67c7d06c2eeb
@@ -36,14 +36,14 @@ import org.apache.spark.util.collection.unsafe.sort.{UnsafeExternalSorter, Unsaf
  * An append-only array for [[UnsafeRow]]s that strictly keeps content in an in-memory array
  * until [[numRowsInMemoryBufferThreshold]] or [[sizeInBytesInMemoryBufferThreshold]] is reached
  * post which it will switch to a mode (backed by [[UnsafeExternalSorter]]) which would flush to
- * disk after [[numRowsSpillThreshold]] is met (or before if there is excessive memory consumption).
- * Setting these threshold involves following trade-offs:
+ * disk after [[numRowsSpillThreshold]] or [[sizeInBytesSpillThreshold]] is met (or before if there
+ * is excessive memory consumption). Setting these threshold involves following trade-offs:
  *
  * - If [[numRowsInMemoryBufferThreshold]] and [[sizeInBytesInMemoryBufferThreshold]] are too high,
  * the in-memory array may occupy more memory than is available, resulting in OOM.
- * - If [[numRowsSpillThreshold]] is too low, data will be spilled frequently and lead to
- * excessive disk writes. This may lead to a performance regression compared to the normal case
- * of using an [[ArrayBuffer]] or [[Array]].
+ * - If [[numRowsSpillThreshold]] or [[sizeInBytesSpillThreshold]] is too low, data will be spilled
+ * frequently and lead to excessive disk writes. This may lead to a performance regression compared
+ * to the normal case of using an [[ArrayBuffer]] or [[Array]].
  */
 class ExternalAppendOnlyUnsafeRowArray(
Contributor

Does private[sql] need to be removed?

The previous PR added parameters to ExternalAppendOnlyUnsafeRowArray, but the class documentation was not updated to cover them.

Contributor Author

Other data structures (ShuffleExternalSorter, UnsafeExternalSorter, etc.) are all public classes under a private package. I just made this one consistent with them. It's not a big deal, and I'm fine with reverting it as well.

Contributor Author

I've updated the classdoc.

taskMemoryManager: TaskMemoryManager,
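
For readers following the classdoc, the two-phase behaviour it describes can be sketched roughly as below. This is a simplified model and not the Spark implementation: `TwoPhaseRowBuffer`, `usingSpillableMode`, and `spillCount` are hypothetical names, rows are plain byte arrays rather than `UnsafeRow`s, and a "spill" only increments a counter instead of writing to disk through `UnsafeExternalSorter`. The exact comparison used for each threshold is also an assumption.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical, simplified model of the behaviour described in the classdoc.
// The parameter names mirror the thresholds mentioned there; everything else is illustrative.
final class TwoPhaseRowBuffer(
    numRowsInMemoryBufferThreshold: Int,
    sizeInBytesInMemoryBufferThreshold: Long,
    numRowsSpillThreshold: Int,
    sizeInBytesSpillThreshold: Long) {

  // Phase 1: rows are kept strictly in memory.
  private val inMemory = ArrayBuffer.empty[Array[Byte]]
  private var inMemoryBytes = 0L

  // Phase 2: stand-in for the spillable, sorter-backed mode.
  private var spillMode = false
  private var pendingRows = 0
  private var pendingBytes = 0L
  private var spills = 0

  def usingSpillableMode: Boolean = spillMode
  def spillCount: Int = spills

  def add(row: Array[Byte]): Unit = {
    val fitsInMemory =
      inMemory.length < numRowsInMemoryBufferThreshold &&
        inMemoryBytes + row.length <= sizeInBytesInMemoryBufferThreshold

    if (!spillMode && fitsInMemory) {
      inMemory += row
      inMemoryBytes += row.length
    } else {
      if (!spillMode) {
        // An in-memory threshold was reached: hand the buffered rows to the spillable mode.
        spillMode = true
        pendingRows = inMemory.length
        pendingBytes = inMemoryBytes
        inMemory.clear()
        inMemoryBytes = 0L
      }
      pendingRows += 1
      pendingBytes += row.length
      // Either spill threshold triggers a (simulated) flush to disk.
      if (pendingRows >= numRowsSpillThreshold || pendingBytes >= sizeInBytesSpillThreshold) {
        spills += 1
        pendingRows = 0
        pendingBytes = 0L
      }
    }
  }
}
```

In this model, with an in-memory limit of 2 rows and a spill limit of 4 rows, the third `add` switches to the spillable mode and the fourth triggers the first simulated spill. Lowering either spill threshold makes spills more frequent, which is the trade-off the updated classdoc warns about.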