protobuf, api, core, netty: zero copy into protobuf #7330
ejona86
left a comment
This approach looks good
public interface HasByteBuffer {

  /**
   * Gets a {@link ByteBuffer} containing up to {@code length} bytes of the content, or {@code
I would either expect a get(void) method or a read(length) method. Providing a length here doesn't seem to provide any value as if you specify a smaller length it just limits what is returned but doesn't change any internal state. read(length) actually changes the "read position."
Sounds fair. Changed.
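To make the distinction concrete, here is a minimal sketch of the two method shapes being compared. `ReadVersusGet` and its methods are hypothetical names for illustration only, not the PR's API: `get()` returns a view without touching any state, while `read(length)` returns at most `length` bytes and advances the read position.

```java
import java.nio.ByteBuffer;

// Hypothetical source backed by a single heap buffer (illustration only).
final class ReadVersusGet {
  private final ByteBuffer data;

  ReadVersusGet(byte[] bytes) {
    this.data = ByteBuffer.wrap(bytes);
  }

  // get(): exposes everything that is still readable; internal state is unchanged.
  ByteBuffer get() {
    return data.asReadOnlyBuffer();
  }

  // read(length): exposes up to length bytes and moves the read position past them.
  ByteBuffer read(int length) {
    int n = Math.min(length, data.remaining());
    ByteBuffer view = data.asReadOnlyBuffer();
    view.limit(view.position() + n);
    data.position(data.position() + n);
    return view;
  }

  int remaining() {
    return data.remaining();
  }
}
```

A `get(length)` variant would only truncate the returned view, which the caller can already do with `limit()`, so the length argument adds nothing unless the call also consumes bytes.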
  buffer.reset();
  readableBytes += (buffer.readableBytes() - currentRemain);
}
int size = readableBuffers.size();
Seems you could use rewindableBuffers.pollLast()/rewindableBuffers.removeLast() along with readableBuffers.addFirst()
Fixed, thanks for the suggestion.
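A minimal sketch of the suggested `pollLast()`/`addFirst()` pattern, assuming both collections are `Deque`s. `RewindSketch` is a hypothetical helper, not the code in `CompositeReadableBuffer`: already-consumed buffers are moved back to the front of the readable queue, newest first, so the original order is restored.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper illustrating the rewind step (not the PR's implementation).
final class RewindSketch {
  // Moves every element of rewindable back in front of readable, preserving order.
  static <T> void rewind(Deque<T> rewindable, Deque<T> readable) {
    T buffer;
    // pollLast() returns null once rewindable is empty, which ends the loop.
    while ((buffer = rewindable.pollLast()) != null) {
      readable.addFirst(buffer);
    }
  }

  public static void main(String[] args) {
    Deque<String> rewindable = new ArrayDeque<>();
    rewindable.add("a");
    rewindable.add("b");
    Deque<String> readable = new ArrayDeque<>();
    readable.add("c");
    rewindable.size();
    rewind(rewindable, readable);
    System.out.println(readable);
  }
}
```

Taking from the tail of one deque and pushing onto the head of the other avoids index arithmetic over the readable queue's size.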
if (!owner) {
  buffer = ignoreClose(buffer);
}
return buffer.canUseByteBuffer()
One option to simplify this "which interfaces should we support" logic is to not rely on the interface to determine whether it supports getByteBuffer(). We could have another method like boolean getByteBufferSupported(). See InputStream.markSupported(). Basically, that would allow us to always return an instance that implements HasByteBuffer but some of the time the getByteBuffer() method is non-functional.
Sure, updated.
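The `markSupported()`-style design can be sketched as follows. `ByteBufferSource`, `HeapSource`, `OpaqueSource`, and `SourceInspector` are hypothetical names standing in for the types under review: every implementation exposes the same interface, and callers query a capability method before calling the optional operation.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for the interface discussed in this thread.
interface ByteBufferSource {
  // Capability query, analogous to InputStream.markSupported().
  boolean getByteBufferSupported();

  // Optional operation; throws UnsupportedOperationException when unsupported.
  ByteBuffer getByteBuffer();
}

// Implementation that can expose its content as a ByteBuffer.
final class HeapSource implements ByteBufferSource {
  private final ByteBuffer data;

  HeapSource(byte[] bytes) {
    this.data = ByteBuffer.wrap(bytes);
  }

  @Override
  public boolean getByteBufferSupported() {
    return true;
  }

  @Override
  public ByteBuffer getByteBuffer() {
    return data.asReadOnlyBuffer();
  }
}

// Implementation that shares the interface but does not support the operation.
final class OpaqueSource implements ByteBufferSource {
  @Override
  public boolean getByteBufferSupported() {
    return false;
  }

  @Override
  public ByteBuffer getByteBuffer() {
    throw new UnsupportedOperationException("getByteBuffer is not supported");
  }
}

final class SourceInspector {
  // Returns the bytes available without copying, or -1 if zero copy is unavailable.
  static int zeroCopyBytes(ByteBufferSource source) {
    return source.getByteBufferSupported() ? source.getByteBuffer().remaining() : -1;
  }
}
```

With this shape, a wrapper can always return one concrete type and delegate the capability query, instead of choosing between wrapper classes based on which interfaces the wrapped object implements.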
@Override
public Iterator<ByteBuffer> iterator() {
  try {
    stream.reset();
This approach would work, but does assume that only one iterator would be used at a time. Since it is basically just the same amount of work to make a list of ByteBuffers up-front and provide the list as an iterator, that seems superior as it does not assume one-iterator-at-a-time behavior.
Yeah, that's something I failed to consider. I was only thinking that an iterator has the benefit of expanding the content lazily as it goes, but that doesn't add much value in this case.
Changed to make a list up-front.
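A rough sketch of the up-front approach, under stated assumptions: `ChunkedSource` is a hypothetical stand-in for a stream that exposes `available()`, `getByteBuffer()`, and `skip()`, and `UpFrontList.drain` is an illustrative helper. Once the list is built, any number of iterators can walk it independently, with no shared stream position to reset.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical chunked source (illustration only, not a gRPC class).
final class ChunkedSource {
  private final Deque<ByteBuffer> chunks = new ArrayDeque<>();

  ChunkedSource(byte[]... parts) {
    for (byte[] part : parts) {
      if (part.length > 0) {
        chunks.add(ByteBuffer.wrap(part));
      }
    }
  }

  // Total number of unread bytes across all chunks.
  int available() {
    int total = 0;
    for (ByteBuffer chunk : chunks) {
      total += chunk.remaining();
    }
    return total;
  }

  // Read-only view of the current chunk; call only while available() > 0.
  ByteBuffer getByteBuffer() {
    return chunks.peekFirst().asReadOnlyBuffer();
  }

  // Advances the read position by n bytes, dropping fully consumed chunks.
  void skip(int n) {
    while (n > 0 && !chunks.isEmpty()) {
      ByteBuffer head = chunks.peekFirst();
      int step = Math.min(n, head.remaining());
      head.position(head.position() + step);
      n -= step;
      if (!head.hasRemaining()) {
        chunks.removeFirst();
      }
    }
  }
}

final class UpFrontList {
  // Collects every remaining chunk into a list before any parsing starts.
  static List<ByteBuffer> drain(ChunkedSource source) {
    List<ByteBuffer> buffers = new ArrayList<>();
    while (source.available() > 0) {
      ByteBuffer buffer = source.getByteBuffer();
      source.skip(buffer.remaining());
      buffers.add(buffer);
    }
    return buffers;
  }
}
```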
… used for okhttp as well.
…on for protobuf parse.
njhill
left a comment
Thanks @voidzcy I really like this improvement :)
I saw there was some iteration on the interface design, probably there is something I'm missing but why not just have a List<ByteBuffer> peekByteBuffers(int length) method instead of the mark/reset etc?
And similar to netty how about also int byteBufferCount(int length) and ByteBuffer peekByteBuffer(int length) so to help minimize allocations. The buffer count method then can also serve to indicate whether the ReadableBuffer/InputStream supports buffer peeking by returning -1 if not. WDYT?
  }
  stream.reset();
  cis = CodedInputStream.newInstance(buffers);
} else if (size > 0 && size <= DEFAULT_MAX_MESSAGE_SIZE) {
This check should still be done in the new case I think?
We have decided to keep "normal" messages (2~4 KB) in the current codepath, copying small things isn't necessarily bad with CPU caches. So we only enable this for messages >= 64 KB (most messages should be below this threshold).
Sounds reasonable, would be interesting to benchmark different message sizes/sparsity to verify the threshold (of course likely system dependent).
But what about the size <= DEFAULT_MAX_MESSAGE_SIZE check?
For large messages (larger than DEFAULT_MAX_MESSAGE_SIZE), we would prefer the ByteBuffer approach if it is supported. This shouldn't cause a problem.
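The selection logic described in this thread can be sketched roughly as below. This is an illustration only: `ParsePathChooser`, the `Path` enum, and both constant values are assumptions (the 64 KB figure comes from the comment above, and 4 MiB is a placeholder for `DEFAULT_MAX_MESSAGE_SIZE`), not the code in `ProtoLiteUtils`.

```java
// Hypothetical sketch of the code path selection (not the PR's implementation).
final class ParsePathChooser {
  enum Path { BYTE_BUFFERS, BYTE_ARRAY_COPY, STREAM }

  // Assumed value: the thread mentions enabling zero copy for messages >= 64 KB.
  static final int ZERO_COPY_THRESHOLD = 64 * 1024;
  // Assumed placeholder for DEFAULT_MAX_MESSAGE_SIZE.
  static final int DEFAULT_MAX_MESSAGE_SIZE = 4 * 1024 * 1024;

  static Path choose(int size, boolean byteBufferSupported) {
    // Large messages take the ByteBuffer path whenever it is supported,
    // even above DEFAULT_MAX_MESSAGE_SIZE.
    if (byteBufferSupported && size >= ZERO_COPY_THRESHOLD) {
      return Path.BYTE_BUFFERS;
    }
    // "Normal" messages stay on the existing copy-into-byte[] path.
    if (size > 0 && size <= DEFAULT_MAX_MESSAGE_SIZE) {
      return Path.BYTE_ARRAY_COPY;
    }
    // Everything else falls back to parsing from the stream.
    return Path.STREAM;
  }
}
```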
@Override
public ByteBuffer getByteBuffer() {
  return buffer.nioBuffers()[0];
This approach could be pretty heavy on garbage, especially if underlying buffer is composite. I'd suggest at least doing buffer.nioBufferCount() == 1 ? buffer.nioBuffer() : buffer.nioBuffers()[0].
Okay, sounds fair. Improved.
We discussed just a
The need to iterate over the byte buffers multiple times seems a temporary detail. Protobuf only needs to loop multiple times so it can determine if all the buffers are direct, in which case it takes a faster code path. Using mark/reset allows us to stop using mark/reset in the future and so be able to release consumed ByteBuffers during message parsing. Mark/reset is also useful to some existing users of gRPC. Some interceptors need to read the message, so with mark/reset they can avoid a copy.
  stream.skip(buffer.remaining());
  buffers.add(buffer);
}
stream.reset();
I don't think the reset here really serves any purpose? IIUC the mark is only done to say "don't release the stuff I'm about to read" but it will all be released when the stream is closed once this method returns regardless.
If the reset stays then to be more "correct" shouldn't stream.skip(size) also be called after CodedInputStream.newInstance(buffers)?
Thanks to #7330 (comment).
@voidzcy I was referring to this specific line here, not the existence of the reset() method. This line could be deleted, right?
I see. Deleted.
 * @throws UnsupportedOperationException if this operation is not supported.
 */
@Nullable
ByteBuffer getByteBuffer();
To make the API cleaner how about having this return null for not supported and a (constant) empty ByteBuffer for EOS?
having this return null for not supported and a (constant) empty ByteBuffer for EOS
The semantics of returning null for an unsupported operation aren't strong enough. Most existing Java APIs throw an UnsupportedOperationException for unsupported optional operations (e.g., java.nio.ByteBuffer pairs hasArray() with array()). I think the current approach is cleaner.
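For reference, the JDK precedent cited here looks like this in practice. `OptionalOperationDemo` is a hypothetical helper written for this illustration; the `hasArray()` and `array()` calls are the real `java.nio.ByteBuffer` methods. A read-only view of a heap buffer is used as the unsupported case because it reports no accessible array.

```java
import java.nio.ByteBuffer;

// Hypothetical demo of the hasArray()/array() optional-operation pattern.
final class OptionalOperationDemo {
  // Query the capability first, then call the optional operation.
  static byte[] backingArrayOrNull(ByteBuffer buffer) {
    return buffer.hasArray() ? buffer.array() : null;
  }

  // Reports whether calling array() without checking fails with an exception.
  static boolean arrayThrows(ByteBuffer buffer) {
    try {
      buffer.array();
      return false;
    } catch (UnsupportedOperationException e) {
      // ReadOnlyBufferException is a subclass, so it is caught here too.
      return true;
    }
  }
}
```

The proposed `getByteBufferSupported()`/`getByteBuffer()` pair follows the same query-then-call shape.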
Thanks @ejona86
Right but
Ah, that makes sense. An alternative that comes to mind would be to have some kind of
But otherwise I guess mark/reset already exist in the interface so might be preferable for that reason.
@ejona86 thanks for the detailed response and for humouring me in general :) I don't mean any of it as a criticism of past decisions, this only just occurred to me really. I guess on further reflection what I really feel is that ReadableBuffer-like types could be more appropriate than InputStream, particularly if ownership transfer is later considered where it may be that the chunk of data as a whole is going to be used for something else (something like flatbuffers comes to mind). This is also why I felt accessing the list of backing ByteBuffers in one go rather than having to pull them out one by one to populate an indeterminately-sized list might be nice. I accept your points though and probably the only meaningful difference would be the incremental release behaviour.
This might now be off-topic but I had a thought about a simple way to support ownership transfer - how about a "
I'm also curious about outbound ownership transfer. What if a message is written which has some backing resource that requires releasing? This could be done when the parsed InputStream is closed but IIUC messages might be serialized multiple times in some cases like retries (or am I wrong about that)?
Wouldn't it be better to send this kind of thing as a stream of smaller messages? Otherwise, the example you gave would be a strong case i.m.o. for figuring out a protobuf ownership transfer option so that the message could be parsed with "aliasing" enabled i.e. full zero copy. You won't be reducing mem requirement if you are using netty pooled buffers since these will be separate from the proto-allocated byte[]s.
I don't really disagree. But that's also a different approach. I think our current approach is more "here's an API that you can use to integrate with existing Java code." If we exposed a buffer directly then it's not normally directly useful; you have to convert it to byte[] or ByteBuffer or InputStream to pass to existing code. Our current API is really just ways to convert the data to some other form. But the fact that it would be nicer to expose ReadableBuffer I think points more to "Java's buffer APIs are sucky," which isn't really a revelation. There's also the problem that we didn't/don't want to expose such a "wide" API as ReadableBuffer. I really don't want to get into the buffer business!
That's a neat idea, but yeah, a bit off-topic. I've moved that discussion to #1054 where we have been talking about ways to address ownership transfer.
Generally. But that's obviously more complex and adds no benefit if the entire message has to be processed at once.
Yes, although the user will have to "do something" to enable that optimization since they'll need to manage a lifetime in some fashion; it can't be free. Also, alias can be harmful if you aren't dealing with large
Yes, Netty may not free them immediately, but Netty could reuse them for other RPCs immediately.
Outbound equivalent of grpc#7330. Protobuf doesn't support multiple ByteBuffers in this direction but I don't think that matters much since the outbound buffers are typically allocated/sized to fit the messages.
eee4f0f to
7b4e070
Compare
Thanks for reviewing, all comments addressed. PTAL.
/**
 * Indicates whether or not {@link #getByteBuffer} operation is supported for this buffer.
 */
boolean canUseByteBuffer();
nit: maybe better name would be hasByteBuffer()?
Sure, go as you like. I don't have a strong preference.
 */
@VisibleForTesting
static final boolean IS_JAVA9_OR_HIGHER =
    !"1.7".equals(JAVA_VERSION) && !"1.8".equals(JAVA_VERSION);
How about something like

    static {
      boolean isJava9 = true;
      try {
        Class.forName("java.lang.StackWalker");
      } catch (ClassNotFoundException cnfe) {
        isJava9 = false;
      }
      IS_JAVA9_OR_HIGHER = isJava9;
    }
Okay, seems to be simpler and guess it should work equally well. Changed.
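A self-contained version of the class-probing check might look like the sketch below. `JavaVersionProbe` is a hypothetical class name; the probe relies on `java.lang.StackWalker`, which was introduced in Java 9, so the lookup fails on Java 8 and earlier.

```java
// Hypothetical holder class for the version probe (illustration only).
final class JavaVersionProbe {
  static final boolean IS_JAVA9_OR_HIGHER = detect();

  private static boolean detect() {
    try {
      // StackWalker exists only on Java 9 and later.
      Class.forName("java.lang.StackWalker");
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }
}
```

Compared with matching version strings such as "1.7" and "1.8", probing for a class does not depend on the format of the version property.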
/**
 * Indicates whether or not {@link #getByteBuffer} operation is supported.
 */
boolean getByteBufferSupported();
Why not give this the same name as the equivalent added ReadableBuffer method (my suggestion would be hasByteBuffer())?
It doesn't need to match what we use for ReadableBuffer, right? We'll probably discuss this API this Thursday, and we usually vote on the name.
hasByteBuffer 1 vote now.
As well as for consistency, having it line up with ReadableBuffer would permit ReadableBuffers to themselves be InputStreams implementing this interface, which I think might allow for some other streamlining later.
Maybe, for that reason.
My original thought was that the HasByteBuffer interface should only have the getByteBuffer method. InputStreams not supporting the operation should just not implement this interface. #7330 (comment) suggests combining two InputStream implementations into one by adding this getByteBufferSupported method. This looks OK, but naming the method hasByteBuffer does make this API weird. I'd rather change the hasByteBuffer method on the ReadableBuffer interface to getByteBufferSupported.
@ejona86 All comments addressed, PTAL.
So this is in limbo because Java doesn't really let protobuf avoid the initialization. There is a JDK enhancement request, but it probably will not be resolved soon enough to help users within the next year or two (even after a fix, it will take time to roll out to users). So that leaves this as a not-clear-winner. We did see a benchmark show better performance, in terms of CPU time. Since we no longer need to allocate a large contiguous byte[], maybe that by itself provides benefits. I do question that, or rather, I think there are some cases where HTTP/2 frame fragmentation will reduce the benefit. Although I'd also like to see other optimizations increasing the frame size as appropriate for performance. We can try running the TransportBenchmark with the 'gc' profiler and see if it shows clear benefit. Or maybe some other benchmark. But without any further evidence, I think we should close this PR.
Without array-initialization avoidance in protobuf's ByteBuffer codepath, the improvement this change provides in gRPC is outweighed by the cost of unnecessary initialization in protobuf (note that protobuf's byte array codepath's array initialization is eliminated by JDK's
Closing this PR now, as the change doesn't seem to improve overall performance until the JDK enhancement request is resolved.
Close in favor of #8102. |
TODO: add tests and enhance Javadoc.
The argument for the maximum number of bytes to be kept within the marked range is unimplemented, as underlying buffers do not support it. CompositeReadableBuffer could support it at a coarser granularity, but that does not seem to have much value.
@ejona86 Could you please take a quick look to see if I am heading in the right direction? Just to avoid going down a dead end.