[SPARK-53001] Integrate RocksDB Memory Usage with the Unified Memory Manager #51708
Conversation
Besides Structured Streaming, RocksDB is also used in other areas like the Live UI. Don't they require similar handling?

@LuciferYang Sorry, I think I'm missing something. Could you elaborate on what the suggestion is here?

@LuciferYang - which components are you referring to?

For instance, the Spark Live UI can also use RocksDB as its storage backend, and I'm not sure whether it runs into similar issues. I'm just curious about this. Even if such issues exist, we can still fix them in a separate PR.

Sure, sounds good.
| "Setting this to 0 disables unmanaged memory polling.") | ||
| .version("4.1.0") | ||
| .timeConf(TimeUnit.MILLISECONDS) | ||
| .createWithDefaultString("1s") |
To be safe and avoid a regression, shall we start with 0 by default, @ericm-db , @anishshri-db, @gatorsmile ?
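For reference, a more conservative default could be declared roughly as below. This is only a sketch: the config key name and the object it lives in are assumptions, not necessarily what this PR uses; the point is the `0s` default, which keeps polling disabled unless a user opts in.

```scala
package org.apache.spark.internal.config

import java.util.concurrent.TimeUnit

object UnmanagedMemoryConfSketch {
  // Hypothetical key name for illustration only. With "0s" as the default,
  // unmanaged memory polling stays off and existing workloads are unaffected.
  val UNMANAGED_MEMORY_POLLING_INTERVAL =
    ConfigBuilder("spark.memory.unmanagedMemoryPollingInterval")
      .doc("How often to poll unmanaged memory consumers such as RocksDB. " +
        "Setting this to 0 disables unmanaged memory polling.")
      .version("4.1.0")
      .timeConf(TimeUnit.MILLISECONDS)
      .createWithDefaultString("0s")
}
```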
```scala
   * @param unmanagedMemoryConsumer The consumer to register for memory tracking
   */
  def registerUnmanagedMemoryConsumer(
    unmanagedMemoryConsumer: UnmanagedMemoryConsumer): Unit = {
```
Indentation, @ericm-db ?
```scala
  case class UnmanagedMemoryConsumerId(
    componentType: String,
    instanceKey: String
  )
```
Indentation, @ericm-db ?
```scala
   * - Native libraries with custom memory allocation
   * - Off-heap caches managed outside of Spark
   */
  trait UnmanagedMemoryConsumer {
```
Shall we move this into a separate file, UnmanagedMemoryConsumer.scala?
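For context, a standalone `UnmanagedMemoryConsumer.scala` might group the ID and the trait roughly as sketched below. Only the names visible in this diff (`UnmanagedMemoryConsumerId`, `componentType`, `instanceKey`, `getMemoryUsage`) come from the PR; the package, the docs, and the `unmanagedMemoryConsumerId` accessor are assumptions for illustration.

```scala
package org.apache.spark.memory

/**
 * Identifies an unmanaged memory consumer, e.g. a component type plus an
 * instance key that distinguishes individual instances of that component.
 */
case class UnmanagedMemoryConsumerId(
    componentType: String,
    instanceKey: String)

/**
 * A component whose memory is allocated outside Spark's unified memory
 * manager (native libraries with custom allocation, off-heap caches, ...)
 * but whose usage should still be visible for memory accounting.
 */
trait UnmanagedMemoryConsumer {
  /** Stable identifier used when registering this consumer (assumed name). */
  def unmanagedMemoryConsumerId: UnmanagedMemoryConsumerId

  /** @return Total memory usage in bytes across all tracked components. */
  def getMemoryUsage: Long
}
```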
```scala
   * @return Total memory usage in bytes across all tracked components
   */
  def getMemoryUsage: Long = {

```
nit. Let's remove this redundant empty line.
```scala
    require(db != null && !db.isClosed, "RocksDB must be open to get memory usage")
    RocksDB.mainMemorySources.map { memorySource =>
      getDBProperty(memorySource)
    }.sum
```
Can we have a one-liner? Maybe, the following style?
```diff
- RocksDB.mainMemorySources.map { memorySource =>
-   getDBProperty(memorySource)
- }.sum
+ RocksDB.mainMemorySources.map(getDBProperty).sum
```

```scala
   * Updates the cached memory usage if enough time has passed.
   * This is called from task thread operations, so it's already thread-safe.
   */
  def updateMemoryUsageIfNeeded(): Unit = {
```
This looks like it's invoked frequently in several places. What is the overhead of this method?
It's minimal, on the order of ns
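That cost profile is what you would expect from a throttled refresh: most calls do one monotonic-clock read and return, and only the occasional call actually queries RocksDB. A minimal sketch of that pattern, with assumed names (not the PR's actual implementation):

```scala
import java.util.concurrent.atomic.AtomicLong

// Illustrative throttled refresh: the common path is a System.nanoTime() call
// plus an AtomicLong read, which is why the per-call overhead is ~nanoseconds.
class ThrottledMemoryTracker(refreshIntervalNanos: Long)(readUsageBytes: () => Long) {
  // Initialized so that the very first call performs a refresh.
  private val lastRefreshNanos = new AtomicLong(System.nanoTime() - refreshIntervalNanos)
  @volatile private var cachedUsageBytes: Long = 0L

  def updateMemoryUsageIfNeeded(): Unit = {
    val now = System.nanoTime()
    val last = lastRefreshNanos.get()
    if (now - last >= refreshIntervalNanos && lastRefreshNanos.compareAndSet(last, now)) {
      // Only the caller that wins the CAS pays for the expensive read.
      cachedUsageBytes = readUsageBytes()
    }
  }

  def getMemoryUsage: Long = cachedUsageBytes
}
```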
```diff
   * memory allocation decisions.
   */
- object RocksDBMemoryManager extends Logging {
+ object RocksDBMemoryManager extends Logging with UnmanagedMemoryConsumer{
```
I'm a little surprised the Scala linter didn't catch this: `with UnmanagedMemoryConsumer{`. Could you add a space, like `with UnmanagedMemoryConsumer {`?
```scala
      boundedMemoryEnabled.toString)) {

      import org.apache.spark.memory.UnifiedMemoryManager
      import org.apache.spark.sql.streaming.Trigger
```
Do you have special reasons why we have these import statements in the test code body? Otherwise, please move this to the file header.
```scala
      try {
        // Let the stream run to establish RocksDB instances and generate state operations
        Thread.sleep(2000) // 2 seconds should be enough for several processing cycles
```
This looks a little risky. Can we use `eventually` instead of `Thread.sleep`?
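Spark test suites typically use ScalaTest's `Eventually` for this, polling with a timeout instead of sleeping for a fixed duration. A rough sketch of how the later polling loop could look (the memory accessor is a placeholder, and the 15-second bound mirrors the `maxAttempts` value in the test):

```scala
import org.scalatest.Assertions._
import org.scalatest.concurrent.Eventually._
import org.scalatest.time.SpanSugar._

// Poll until RocksDB reports non-zero unmanaged memory, retrying every second
// and failing the test if this doesn't happen within 15 seconds.
eventually(timeout(15.seconds), interval(1.second)) {
  val rocksDBMemory = currentRocksDBUnmanagedMemory() // placeholder accessor
  assert(rocksDBMemory > 0L)
}
```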
```scala
        val maxAttempts = 15 // 15 attempts with 1-second intervals = 15 seconds max

        while (rocksDBMemory <= 0L && attempts < maxAttempts) {
          Thread.sleep(1000) // Wait between checks to allow memory updates
```
ditto.
```scala
        }

        // Verify memory tracking remains stable during continued operation
        Thread.sleep(2000) // Let stream continue running
```
ditto.
@dongjoon-hyun thank you for all the feedback, I will address this.

@dongjoon-hyun Can you PTAL at this PR: #51778?
### What changes were proposed in this pull request?
This PR aims to document newly added `core` module configurations as a part of Apache Spark 4.1.0 preparation.

### Why are the changes needed?
To help the users use new features easily.
- #47856
- #51130
- #51163
- #51604
- #51630
- #51708
- #51885
- #52091
- #52382

### Does this PR introduce _any_ user-facing change?
No behavior change because this is a documentation update.

### How was this patch tested?
Manual review.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #52626 from dongjoon-hyun/SPARK-53926.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
…efore execution memory OOM

### What changes were proposed in this pull request?
We have a log before OOM for off-heap memory allocation. Before the change, the log is:

> 25/08/05 16:44:32 INFO TaskMemoryManager: 100 bytes of memory are used for execution and 100 bytes of memory are used for storage

After:

> 25/08/05 16:44:32 INFO TaskMemoryManager: 100 bytes of memory are used for execution and 100 bytes of memory are used for storage and 500 bytes of memory are used but unmanaged

### Why are the changes needed?
Following #51708, to allow users to know the reason if unmanaged memory causes the OOM.

### Does this PR introduce _any_ user-facing change?
Only changes a log message.

### How was this patch tested?
Existing tests.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #51848 from zhztheplayer/wip-53128.

Authored-by: Hongze Zhang <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
### What changes were proposed in this pull request?
Currently, RocksDB memory is untracked and not included in Spark's memory decisions. We want to factor RocksDB memory usage into memory allocations so we don't hit OOMs. This change introduces a background memory-polling thread in the MemoryManager that queries RocksDB memory every X seconds (configurable via SQLConf).
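At a high level, that amounts to a scheduled task that periodically asks each registered unmanaged consumer for its usage and exposes the total to the memory manager's allocation decisions. The sketch below is a simplified model of that idea, not the PR's actual classes; all names in it are illustrative.

```scala
import java.util.concurrent.{Executors, TimeUnit}

import scala.collection.concurrent.TrieMap

// Simplified model of unmanaged-memory polling: consumers (e.g. RocksDB state
// stores) register a usage callback, and a background thread sums them up.
class UnmanagedMemoryPoller(pollingIntervalMs: Long) {
  private val consumers = TrieMap.empty[String, () => Long]
  @volatile private var lastPolledBytes: Long = 0L
  private val scheduler = Executors.newSingleThreadScheduledExecutor()

  def register(name: String, getUsageBytes: () => Long): Unit =
    consumers.put(name, getUsageBytes)

  def start(): Unit = {
    // An interval of 0 disables polling, mirroring the config discussed above.
    if (pollingIntervalMs > 0) {
      scheduler.scheduleAtFixedRate(
        () => lastPolledBytes = consumers.values.map(_.apply()).sum,
        0L, pollingIntervalMs, TimeUnit.MILLISECONDS)
    }
  }

  /** Latest polled total, for the memory manager to factor into allocations. */
  def unmanagedMemoryUsed: Long = lastPolledBytes

  def stop(): Unit = scheduler.shutdownNow()
}
```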
### Why are the changes needed?
This helps us avoid OOMs when RocksDB is used as the StateStoreProvider by taking its memory usage into account alongside other Spark allocations.
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Unit tests
### Was this patch authored or co-authored using generative AI tooling?
No