Enable PCBC completionObjects autoShrink to reduce memory usage and GC #3913

Merged
hangc0276 merged 1 commit into apache:master from wenbingshen:wenbing/autoShrinkCompletionObjects
Apr 22, 2023

@wenbingshen
Member

@wenbingshen wenbingshen commented Apr 11, 2023

Motivation

The completionObjects maps in PerChannelBookieClient occupy a lot of heap space that cannot be reclaimed.
The figure below shows that the internal table array of a ConcurrentOpenHashMap has size=0 (no entries in use), yet the array length is still 16384 and its memory overhead is 65552 bytes.
image

image

ConcurrentOpenHashMap uses DefaultConcurrencyLevel=16 by default. We have hundreds of bookie nodes. Because writes rotate across bookies (round-robin), the client keeps long-lived connections to the bookies. As a result, about 65552 * 16 * 1776 = 1.74GB of memory cannot be reclaimed, even though every one of these tables has size=0 (the topics owned by this broker had already moved to other brokers because of Full GC).
image

When the throughput of the Pulsar cluster increases and the bookie cluster expands, this memory usage grows as well. Combined with other known cases of inefficient memory usage in Pulsar, this causes the Pulsar broker to trigger Full GC continuously.

Changes

I think enabling autoShrink on completionObjects can reduce this part of the memory usage and lower the frequency of Full GC.
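
To illustrate the idea, here is a minimal single-threaded sketch of an open-addressing map that shrinks its table once it becomes mostly empty. It is a toy written for this discussion and is not BookKeeper's ConcurrentOpenHashMap: the class name `ShrinkingLongMap` and the constants `MIN_CAPACITY`, `FILL_FACTOR` and `IDLE_FACTOR` are hypothetical, and the real map is concurrent and split into sections.

```java
public class ShrinkingLongMap {
    // Hypothetical tuning values chosen for this sketch only.
    private static final int MIN_CAPACITY = 16;      // table never shrinks below this
    private static final double FILL_FACTOR = 0.66;  // grow when fuller than this
    private static final double IDLE_FACTOR = 0.15;  // shrink when emptier than this

    private final boolean autoShrink;
    private long[] keys;
    private Object[] values; // a null value marks an empty slot
    private int size;

    public ShrinkingLongMap(boolean autoShrink) {
        this.autoShrink = autoShrink;
        this.keys = new long[MIN_CAPACITY];
        this.values = new Object[MIN_CAPACITY];
    }

    public int size() { return size; }

    public int capacity() { return keys.length; }

    public void put(long key, Object value) {
        if (value == null) {
            throw new NullPointerException("value");
        }
        if (size + 1 > keys.length * FILL_FACTOR) {
            rehash(keys.length * 2);
        }
        insert(key, value);
    }

    public Object get(long key) {
        int i = indexOf(key);
        return i < 0 ? null : values[i];
    }

    public Object remove(long key) {
        int i = indexOf(key);
        if (i < 0) {
            return null;
        }
        Object old = values[i];
        values[i] = null;
        size--;
        // Re-insert the rest of the probe cluster so later lookups still find it.
        int mask = keys.length - 1;
        for (int j = (i + 1) & mask; values[j] != null; j = (j + 1) & mask) {
            Object v = values[j];
            values[j] = null;
            size--;
            insert(keys[j], v);
        }
        // Without autoShrink the table keeps its peak length forever.
        if (autoShrink && keys.length > MIN_CAPACITY && size < keys.length * IDLE_FACTOR) {
            rehash(Math.max(MIN_CAPACITY, keys.length / 2));
        }
        return old;
    }

    private static int hash(long key) {
        int h = Long.hashCode(key) * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    private int indexOf(long key) {
        int mask = keys.length - 1;
        for (int i = hash(key) & mask; values[i] != null; i = (i + 1) & mask) {
            if (keys[i] == key) {
                return i;
            }
        }
        return -1;
    }

    private void insert(long key, Object value) {
        int mask = keys.length - 1;
        int i = hash(key) & mask;
        while (values[i] != null) {
            if (keys[i] == key) {
                values[i] = value; // overwrite existing entry
                return;
            }
            i = (i + 1) & mask;
        }
        keys[i] = key;
        values[i] = value;
        size++;
    }

    private void rehash(int newCapacity) {
        long[] oldKeys = keys;
        Object[] oldValues = values;
        keys = new long[newCapacity];
        values = new Object[newCapacity];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldValues[i] != null) {
                insert(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

In this sketch, filling the map with 10000 entries grows the table to 16384 slots. Removing them all leaves a 16384-slot table when autoShrink is off, while with autoShrink on the table is halved step by step back to the minimum capacity, so the large array becomes garbage and can be collected.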

@horizonzy
Member

ConcurrentOpenHashMap uses DefaultConcurrencyLevel=16 by default. We have hundreds of bookie nodes. Because writes rotate across bookies (round-robin), the client keeps long-lived connections to the bookies. As a result, about 65552 * 16 * 1776 = 1.74GB of memory cannot be reclaimed, even though every one of these tables has size=0 (the topics owned by this broker had already moved to other brokers because of Full GC).

I have a question about this. The memory is not occupied by the array itself; the keys (CompletionKey) and values (CompletionValue) are what occupy it. Once a key and its value are removed, the memory usage decreases.
Making the array autoShrink would not help the GC.

@wenbingshen
Member Author

wenbingshen commented Apr 13, 2023

I have a question about this. The memory is not occupied by the array itself; the keys (CompletionKey) and values (CompletionValue) are what occupy it. Once a key and its value are removed, the memory usage decreases. Making the array autoShrink would not help the GC.

@horizonzy With compressed pointers enabled, the in-memory layout of an object array is
array object header (8-byte mark word + 4-byte class pointer + 4-byte length) + length * 4 (bytes per reference)

We have an object array of length 16384, so the space it occupies in memory is 8 + 4 + 4 + 4 * 16384 = 65552 bytes.
In our Pulsar broker, such arrays occupy about 65552 * 16 * 1776 = 1.74GB in total.
image

If we turn on autoShrink, then when size=0 the array length shrinks to 24.
The space it occupies in memory is then 8 + 4 + 4 + 4 * 24 = 112 bytes.
In our Pulsar broker, such arrays would occupy about 112 * 16 * 1776 = 3108KB in total.
image

This way we can reclaim a lot of space in memory.
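
The arithmetic above can be reproduced with a small sketch. It assumes a 64-bit HotSpot JVM with compressed references and compressed class pointers (8-byte mark word, 4-byte class pointer, 4-byte length field, 4 bytes per reference) and 8-byte object alignment. The class and method names are made up for this example, and 1776 is taken as the number of map instances implied by the calculation above.

```java
public class ArrayFootprint {
    // Assumed layout: 64-bit HotSpot with compressed oops and class pointers.
    static final int MARK_WORD = 8;
    static final int CLASS_POINTER = 4;
    static final int LENGTH_FIELD = 4;
    static final int REFERENCE = 4;

    /** Shallow size in bytes of an Object[] with the given length. */
    static long arrayBytes(int length) {
        long raw = MARK_WORD + CLASS_POINTER + LENGTH_FIELD + (long) REFERENCE * length;
        return (raw + 7) / 8 * 8; // round up to the assumed 8-byte alignment
    }

    /** Total bytes for one table per section, across all map instances. */
    static long totalBytes(int length, int sections, int maps) {
        return arrayBytes(length) * sections * maps;
    }

    public static void main(String[] args) {
        int sections = 16; // DefaultConcurrencyLevel
        int maps = 1776;   // assumed number of completionObjects maps
        long before = totalBytes(16384, sections, maps);
        long after = totalBytes(24, sections, maps);
        System.out.printf("without autoShrink: %d bytes (%.2f GiB)%n",
                before, before / (1024.0 * 1024 * 1024));
        System.out.printf("with autoShrink:    %d bytes (%d KiB)%n",
                after, after / 1024);
    }
}
```

Under these assumptions the empty tables hold 1,862,725,632 bytes before shrinking and 3,182,592 bytes (3108 KiB) after. The 1.74GB figure is in binary units (bytes divided by 2^30).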

@horizonzy
Member

In our Pulsar broker, such arrays occupy about 65552 * 16 * 1776 = 1.74GB in total.

Why multiply by 16?

@wenbingshen
Member Author

Why multiply by 16?

@horizonzy ConcurrentOpenHashMap uses DefaultConcurrencyLevel=16 by default, so each map is split into 16 sections and each section has its own table array.

image
image

@horizonzy
Member

Thanks, I got it.

Member

@horizonzy horizonzy left a comment

Nice improvement. LGTM

Contributor

@hangc0276 hangc0276 left a comment

Nice catch!

@wenbingshen
Member Author

@merlimat @eolivelli @dlg99 @zymap Could you help take a look at this PR? Thanks.

Contributor

@dlg99 dlg99 left a comment

LGTM

@hangc0276 hangc0276 merged commit ca33b31 into apache:master Apr 22, 2023
@wenbingshen wenbingshen deleted the wenbing/autoShrinkCompletionObjects branch April 23, 2023 02:55
zymap pushed a commit that referenced this pull request Jun 19, 2023
(cherry picked from commit ca33b31)
zymap pushed a commit that referenced this pull request Dec 6, 2023
(cherry picked from commit ca33b31)
Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024