Skip to content

Conversation

@craigcitro
Copy link
Contributor

The goal here is to lower latency for transfers, as well as having some
consistency between client libraries.

For more discussion, see
googleapis/google-api-python-client#482.

PTAL @dhermes -- I also didn't see an obvious place you were setting a download chunk size, but if you point me at one I'm happy to update it. 😁

The goal here is to lower latency for transfers, as well as having some
consistency between client libraries.

For more discussion, see
googleapis/google-api-python-client#482.
@dhermes
Copy link
Contributor

dhermes commented Mar 1, 2018

RE: download chunk size, these are the two most relevant spots:

It may be set somewhere in google-cloud-python.

@dhermes
Copy link
Contributor

dhermes commented Mar 1, 2018

This LGTM, but @jonparrott should LGTM too before a merge.

@dhermes
Copy link
Contributor

dhermes commented Mar 1, 2018

Eek, lots of "256 KB must divide chunk size" errors.

craigcitro added a commit to craigcitro/google-resumable-media-python that referenced this pull request Mar 1, 2018
The validation of `ResumableUpload.chunk_size` checks divisibility by a
constant, and then provides an error with a hardcoded value for that constant.
This can be confusing (eg googleapis#47).

The fix here is to just insert the appropriate value into the error message.
Adding a bad value still gives the expected error message, eg

```
Traceback (most recent call last):
  File "/usr/local/google/home/craigcitro/gh/google-resumable-media-python/tests/unit/test__upload.py", line 245, in test_constructor
    upload = _upload.ResumableUpload(RESUMABLE_URL, 12 * 1024)#chunk_size)
  File "/usr/local/google/home/craigcitro/gh/google-resumable-media-python/google/resumable_media/_upload.py", line 323, in __init__
    resumable_media.UPLOAD_CHUNK_SIZE / 1024))
ValueError: 256 KB must divide chunk size
```
@craigcitro
Copy link
Contributor Author

So looking at the way that constant is used, I think I had misunderstood its purpose.

AFAICT, it's not used as the default for chunk sizes when starting an upload. Instead, the caller always provides chunk_size to the constructor, and the code is simply checking that the provided value is divisible by UPLOAD_CHUNK_SIZE. So any default that exists is probably in google-cloud-storage and/or other libraries that use this.

Is that correct? If so, I'll drop this PR, but my offer to tweak default chunksizes can be happily reapplied to google-cloud-storage if you'd like at no additional charge. 😀

(The error message above was misleading, due to the string 256 KB being hardcoded; sent #48 to fix.)

@dhermes
Copy link
Contributor

dhermes commented Mar 1, 2018

@craigcitro No that isn't correct. The actual storage API requires chunks are divisible by some value (I can't recall what it is).

@craigcitro
Copy link
Contributor Author

@dhermes We're maybe missing each other on this one. :)

Storage does indeed have a requirement on chunk size being divisible by 256KB: from the docs, "Create chunks in multiples of 256 KB (256 x 1024 bytes) in size, except for the final chunk that completes the upload." (This same requirement holds for media transfers across all APIs.)

Now, in the google-resumable-media code, the user always provides a chunk size to the ResumableUpload constructor, and UPLOAD_CHUNK_SIZE is only used to confirm the invariant above for the provided value.

My original goal in this CL was to change the default chunk size, but I think that was misguided: there is no default chunk size. If such a default exists, it's presumably in the callers of this library.

More directly:

@sduskis
Copy link

sduskis commented Jan 16, 2019

@craigcitro, this is an old PR. Can we close it?

@craigcitro
Copy link
Contributor Author

Sure.

@craigcitro craigcitro closed this Jan 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants