[SPARK-47764][CORE][SQL] Cleanup shuffle dependencies based on ShuffleCleanupMode #45930
Conversation
/**
 * Mark a shuffle that should not be migrated.
 */
def addShuffleToSkip(shuffleId: Int): Unit
let's add a default implementation
Done.
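For illustration, giving the new method a default no-op body means implementations written before `addShuffleToSkip` existed keep compiling and running unchanged. A minimal sketch in Java (the interface and class names here are hypothetical stand-ins, not the actual Spark trait):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a default no-op body means legacy implementations
// that predate addShuffleToSkip are unaffected by the new method.
interface MigratableResolver {
    // Default implementation: ignore the request.
    default void addShuffleToSkip(int shuffleId) {
        // no-op by default
    }
}

// A legacy implementation that does not override the new method;
// it inherits the no-op default.
class LegacyResolver implements MigratableResolver {
}

// An implementation that opts in and records skipped shuffle IDs.
class TrackingResolver implements MigratableResolver {
    final Set<Integer> skipped = new HashSet<>();

    @Override
    public void addShuffleToSkip(int shuffleId) {
        skipped.add(shuffleId);
    }
}
```

The same effect is achieved in Scala by giving the trait method a concrete (empty) body.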
val SHUFFLE_DEPENDENCY_FILE_CLEANUP_ENABLED =
  buildConf("spark.sql.shuffleDependency.fileCleanup.enabled")
    .doc("When enabled, shuffle dependency files will be cleaned up at the end of SQL " +
Suggested change:
-    .doc("When enabled, shuffle dependency files will be cleaned up at the end of SQL " +
+    .doc("When enabled, shuffle files will be cleaned up at the end of SQL " +
Updated.
val inMemoryRelation = sessionWithConfigsOff.withActive {
-  val qe = sessionWithConfigsOff.sessionState.executePlan(planToCache)
+  val qe = sessionWithConfigsOff.sessionState.executePlan(
+    planToCache, shuffleCleanupMode = DoNotCleanup)
isn't this the default?
Tried to be explicit here. Removed the unnecessary argument.
    logicalPlan: LogicalPlan,
    shuffleCleanupMode: ShuffleCleanupMode): DataFrame =
  sparkSession.withActive {
    val qe = sparkSession.sessionState.executePlan(
can we `new QueryExecution` here? Then we don't need to touch the session state builder
Good idea. Done.
  }
}
private val shuffleIdsToSkip = Collections.newSetFromMap[Int](new ConcurrentHashMap)
What's the life cycle of it?
Updated to remove from this Set when the shuffle is unregistered.
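The lifecycle agreed on above — add an ID when the shuffle should not be migrated, remove it when the shuffle is unregistered — can be sketched with a JDK concurrent set. This is a hedged illustration; the class and method names are hypothetical stand-ins for the Spark code:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the lifecycle: IDs are added when a shuffle should be skipped
// for migration and removed on unregisterShuffle, so the set cannot grow
// without bound over the lifetime of the executor.
class ShuffleSkipTracker {
    private final Set<Integer> shuffleIdsToSkip =
        Collections.newSetFromMap(new ConcurrentHashMap<>());

    void addShuffleToSkip(int shuffleId) {
        shuffleIdsToSkip.add(shuffleId);
    }

    boolean shouldSkip(int shuffleId) {
        return shuffleIdsToSkip.contains(shuffleId);
    }

    void unregisterShuffle(int shuffleId) {
        // ... remove shuffle data for shuffleId ...
        shuffleIdsToSkip.remove(shuffleId);
    }
}
```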
    shuffleBlockResolver.removeDataByMap(shuffleId, mapTaskId)
  }
}
shuffleBlockResolver.removeShuffleToSkip(shuffleId)
this is a weird place to do cleanup. Shall we cover all shuffle manager implementations? Shall we do it in the caller of this unregisterShuffle function?
Yeah this is a bit weird... Changed to use a Guava cache with a fixed maximum size (1000) instead, so that we do not need to do cleanups for shufflesToSkip.
}
private val shuffleIdsToSkip =
  CacheBuilder.newBuilder().maximumSize(1000).build[java.lang.Integer, java.lang.Boolean]()
if the value does not matter, shall we just use Object type and always pass null?
Unfortunately Guava cache won't accept null values...
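The size-bounded-cache idea (Guava's `CacheBuilder.maximumSize(1000)`, with `Boolean` values only because Guava caches reject nulls) can be approximated with the JDK alone. A hedged sketch, not the PR's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// JDK-only approximation of a size-bounded cache: a LinkedHashMap that
// evicts its eldest (oldest-inserted) entry once the map exceeds MAX_SIZE,
// so no explicit cleanup path is needed. The Boolean value is a pure
// placeholder, mirroring the Guava usage above.
class BoundedShuffleSkipSet {
    private static final int MAX_SIZE = 1000;

    private final Map<Integer, Boolean> shuffleIdsToSkip =
        new LinkedHashMap<>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                return size() > MAX_SIZE;
            }
        };

    synchronized void add(int shuffleId) {
        shuffleIdsToSkip.put(shuffleId, Boolean.TRUE);
    }

    synchronized boolean contains(int shuffleId) {
        return shuffleIdsToSkip.containsKey(shuffleId);
    }
}
```

Unlike this sketch, the Guava cache used in the PR is already thread-safe, which is why it needs no external synchronization.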
val stageCache: TrieMap[SparkPlan, ExchangeQueryStageExec] =
  new TrieMap[SparkPlan, ExchangeQueryStageExec]()

val shuffleIds: TrieMap[Int, Boolean] = new TrieMap[Int, Boolean]()
what does the value mean? BTW, stageCache uses TrieMap because the key is SparkPlan. For an int key, I think a normal hash map works fine
I think a concurrent hash map is still required, since the context is shared between the main query and all subqueries?
yea, concurrent hash map with int key should be good here.
Updated
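Since the adaptive execution context is shared between the main query and its subqueries, the agreed-upon structure must tolerate concurrent writers. A small sketch of that choice — a `ConcurrentHashMap`-backed key set written to from multiple threads (names hypothetical):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SharedShuffleIds {
    // ConcurrentHashMap-backed set: safe for concurrent adds from the
    // main query thread and subquery threads.
    final Set<Integer> shuffleIds = ConcurrentHashMap.newKeySet();

    // Each "query" is simulated as a thread recording its own range of
    // shuffle IDs into the shared set.
    Thread recorder(int start, int count) {
        return new Thread(() -> {
            for (int i = start; i < start + count; i++) {
                shuffleIds.add(i);
            }
        });
    }
}
```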
thanks, merging to master!
shuffleIds.foreach { shuffleId =>
  queryExecution.shuffleCleanupMode match {
    case RemoveShuffleFiles =>
      SparkEnv.get.shuffleManager.unregisterShuffle(shuffleId)
Shall we call shuffleDriverComponents.removeShuffle? We are on the driver side, and shuffleManager.unregisterShuffle would do nothing in non-local mode.
Thanks for catching this! Will fix this in a follow-up asap.
Created #46302.
}
if (queryExecution.shuffleCleanupMode != DoNotCleanup
    && isExecutedPlanAvailable) {
  val shuffleIds = queryExecution.executedPlan match {
It seems the root node can be a command. Shall we collect all the AdaptiveSparkPlanExec inside the plan ?
Oh this is a good catch! I think we should. cc @bozhang2820
I could be wrong but I thought DataFrames for commands are created in SparkConnectPlanner, and the ones for queries are only created in SparkConnectPlanExecution?
Ideally we should clean up shuffles for CTAS and INSERT as well, as they also run queries.
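The suggestion amounts to a recursive collect over the whole plan tree rather than a check on the root node alone, so that plans rooted at a command (CTAS, INSERT) still yield the adaptive plans nested inside them. A toy sketch with a hypothetical node type, not Spark's actual TreeNode API:

```java
import java.util.ArrayList;
import java.util.List;

// Toy plan tree: collect every node with a given label anywhere in the
// tree, instead of only inspecting the root. (Stand-in for collecting
// AdaptiveSparkPlanExec nodes nested under a command node.)
class PlanNode {
    final String label;
    final List<PlanNode> children;

    PlanNode(String label, PlanNode... children) {
        this.label = label;
        this.children = List.of(children);
    }

    // Pre-order recursive collect of matching nodes.
    List<PlanNode> collect(String target) {
        List<PlanNode> out = new ArrayList<>();
        if (label.equals(target)) {
            out.add(this);
        }
        for (PlanNode child : children) {
            out.addAll(child.collect(target));
        }
        return out;
    }
}
```

A root-only check on the tree below would find nothing, while the recursive collect finds both adaptive plans under the command root.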
…Shuffle to remove shuffle properly

### What changes were proposed in this pull request?
This is a follow-up for #45930, where we introduced ShuffleCleanupMode and implemented cleaning up of shuffle dependencies. There was a bug where `ShuffleManager.unregisterShuffle` was used on the driver, and in non-local mode it is not effective at all. This change fixes the bug by using `ShuffleDriverComponents.removeShuffle` instead.

### Why are the changes needed?
This is to address the comments in #45930 (comment)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Updated unit tests.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #46302 from bozhang2820/spark-47764-1. Authored-by: Bo Zhang <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?
Changes to clean up shuffles generated from running commands (e.g. writes). This was also brought up by cloud-fan and ulysses-you [here](#45930 (comment)).

### Why are the changes needed?
To clean up shuffles generated from commands.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit test added.

### Was this patch authored or co-authored using generative AI tooling?
No

Closes #52157 from karuppayya/SPARK-53413. Lead-authored-by: Karuppayya Rajendran <[email protected]> Co-authored-by: Karuppayya <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This change adds a new trait, ShuffleCleanupMode, under QueryExecution, and two new configs, spark.sql.shuffleDependency.skipMigration.enabled and spark.sql.shuffleDependency.fileCleanup.enabled.

For Spark Connect query executions, ShuffleCleanupMode is controlled by the two new configs, and shuffle dependency cleanup is performed accordingly.

When spark.sql.shuffleDependency.fileCleanup.enabled is true, shuffle dependency files will be cleaned up at the end of query executions.

When spark.sql.shuffleDependency.skipMigration.enabled is true, shuffle dependencies will be skipped during shuffle data migration for node decommissions.

Why are the changes needed?
This is to: 1. speed up shuffle data migration at decommissions and 2. possibly (when file cleanup mode is enabled) release disk space occupied by unused shuffle files.
Does this PR introduce any user-facing change?
Yes. This change adds two new configs, spark.sql.shuffleDependency.skipMigration.enabled and spark.sql.shuffleDependency.fileCleanup.enabled, to control the cleanup behaviors.

How was this patch tested?
Existing tests.
Was this patch authored or co-authored using generative AI tooling?
No