Fix issue where nextcloud external storage cannot download files from Cloudflare R2 #35125
Conversation
pserwylo force-pushed from e8ee363 to e9318c0
@pserwylo Please run …
pserwylo force-pushed from e9318c0 to 5330e98
@come-nc: Thanks for the prompt review and the pointer to the …

Question regarding the target release: although I'm happy for it to go with Nextcloud 26, it is probably unsurprising that I would selfishly like it in a patch release of Nextcloud 25 so as to be available earlier. Out of curiosity, is the rationale that it is kind of introducing a new feature (e.g. support for Cloudflare R2), as opposed to a bugfix in the existing S3 storage functionality?

Nevertheless, thanks for reviewing.
Target release is 26 because target branch is master.
Considering the code change, it looks more like a bugfix than a new feature, to accommodate quirks. If the code doesn't cause any harm to non-Cloudflare servers, then it would be OK to backport to 25.
Would be good to hear what @icewind1991 and/or @juliushaertl think.
Great. I'll wait to hear from others, and if anyone else agrees, then it seems the right thing is to ask the bot to backport (based on looking at other PRs). Is that something that you would normally do, or would I?

I also note a couple of other CI jobs failing. One is due to PostgreSQL not starting as expected - is this known to be flaky? Or is it something I should investigate? The other one is the "Performance" one, which is failing because it assumes PRs come from the …
Hi @pserwylo, I'm from the R2 team and would like to thank you for the very thorough investigation into this. In cases like this we're generally open to changing our implementation to more closely match other S3-compatible providers. We'll be releasing a fix for this compatibility issue with the status code of range requests later this week or early next week. I'll post an update when that change goes live.

As for your issue with range requests being truncated, I'm very interested in looking into this with you. If it's convenient for you, you can reach me in the Cloudflare Discord, or if you prefer another communication method, that can be arranged as well.
This fix has been released now. `GetObject` with a range set now unconditionally returns a 206 response with `Content-Range` set. Let me know if that didn't address this issue, and we can keep looking into compatibility between Nextcloud & R2.
Thanks @Frederik-Baetens - great to see someone from the team here, and such a quick turnaround on the fix. For the record, I still believe you were not strictly doing anything incorrect, but fixing it on your end means older versions of Nextcloud get it for free without needing a workaround in this code, and perhaps other S3 clients with similar assumptions may benefit too.

I've closed this PR now as there is no need to work around this quirk. I can confirm that my Nextcloud 24.0.7 (obviously without this fix) can now download files from R2 as expected. I appreciate the reviews from others on this thread too, even though the change turned out not to be required.

Regarding the quirk with truncated files that I first noticed when running unit tests on this change, I will try to find you on Discord when I get some time next week. In the meantime, I've done my best to document and make the issue reproducible in my pserwylo/cloudflare-r2-php-fopen-bug repo. The README documents in detail my (unsuccessful) efforts to track it down.
glad to hear that it got sorted out upstream, thanks a lot 😄
@pserwylo Thanks for confirming that Nextcloud works with R2 now 🎉 I've been able to reproduce that truncated-files issue with the helpful reproduction repo you provided, but given that this seems to be specific to PHP `fopen()`, getting to the bottom of it requires more PHP knowledge than I have. So I'm looking forward to looking into this with you when you find some time next week!
Fixes an issue in External Storage when using an S3 backend, whereby some providers allow Nextcloud to upload, but not subsequently download, files.
I witnessed this using Cloudflare R2, which is a relatively new (but well-priced) S3-compatible storage system that has recently come out of beta.
To reproduce
`region` must be set to `auto` for R2 buckets. Everything else, such as access keys, hosts, and bucket names, is no different from any other S3-compatible storage, and the details can be found in your Cloudflare account.
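For concreteness, an External Storage configuration along these lines reproduces the issue. All values are placeholders (the hostname follows Cloudflare's standard `<account-id>.r2.cloudflarestorage.com` R2 endpoint format):

```
Bucket:     my-nextcloud-bucket
Hostname:   <account-id>.r2.cloudflarestorage.com
Port:       443
Enable SSL: yes
Region:     auto
Access key: <r2-access-key-id>
Secret key: <r2-secret-access-key>
```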
Description of the issue

The error log shows the following when trying to download a file from R2 via Nextcloud:
This error isn't particularly useful though, because the actual issue is that the `SeekableHttpStream::reconnect` function returns `false`, which triggers the `fopen` to fail with the above message.

After investigating, it turned out that this is because the code assumes S3 servers will always return a `206 Partial Content` when requesting a file with the request header `Range: bytes=0-`.

This assumption is not true of Cloudflare R2 (and potentially other providers), which interpret `Range: bytes=0-` as "Can I please have the whole file" and return a `200 OK` response.

Strictly speaking, a `200` response is okay. Although RFC 7233 doesn't specify anything about the edge case of `bytes=0-` (which really means the entire file), it does say:

> A server MAY ignore the Range header field.

It doesn't say that the server MUST respond with a `206 Partial Content` when handling `Range:` request headers. Given this, it seems like the response from Cloudflare here is similar in spirit to ignoring the `Range:` header and returning `200`, and is thus to spec. Therefore, it seemed appropriate to handle this gracefully in the Nextcloud code by continuing to fetch the file without failing.
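To make the quirk concrete, here is an illustrative exchange (hostname and path are placeholders, and this is how R2 behaved at the time, before the fix described later in this thread):

```
GET /my-bucket/my-file.txt HTTP/1.1
Host: <account-id>.r2.cloudflarestorage.com
Range: bytes=0-

HTTP/1.1 200 OK              <-- Nextcloud expected "206 Partial Content"
Content-Length: 4096
...entire file body follows...
```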
Solution
Add a special case to check for a `200 OK` response to a `Range: bytes=0-` request, and if it occurs, request a range that explicitly encompasses the entire length of the file (as defined by the `Content-Length` header from the initial `HEAD` request).
Notes about this PR
Only for `Range: bytes=0-`

This only special-cases `Range: bytes=0-` requests, not other range requests which may potentially be ignored by the server. I feel this is okay for a couple of reasons.
Firstly, the spec saying "Servers MAY ignore `Range` requests" is probably there to cater for old HTTP servers which don't know how to send partial responses. S3 is relatively modern compared to HTTP, and thus I imagine all S3 servers would actually support range requests for parts of the file correctly by returning a `206 Partial Content`, as Nextcloud currently expects. Indeed, this is the case with Cloudflare, which does correctly return `206 Partial Content` responses for range requests that do not start at 0.

Secondly, this ensures that the semantics of "Please open a file starting at this point and read until the end" are not changed. If we put other behaviour in, e.g. for a `Range: bytes=1000-` request where the server returned an `HTTP/1.1 200 OK` response, it would probably stuff up the rest of the Nextcloud server, which was actually expecting only a partial piece of the file.

Notes on testing
This is my first PR to `nextcloud/server`, and I seem to have chosen a curly issue to do with a specific downstream provider that has nothing to do with Nextcloud. I understand there are some integration tests which work with Azure and AWS, but I am not sure what infrastructure would be required on the Nextcloud server in order to get a working PoC up with Cloudflare R2. I believe that the integration tests for S3 storage use a mock S3 server, which likely doesn't exhibit the same behaviour as Cloudflare R2. Furthermore, Cloudflare R2 buckets cannot be made public in the same way as S3 buckets (if made public, they only expose an HTTP endpoint to access files, not a full public S3-compatible API endpoint), so I can't even offer up my own R2 PoC bucket (and you may not want to depend on a random stranger's bucket for your integration tests anyway).

Having said that, I did hack the `./tests/lib/preseed-config.php` to include the following Cloudflare R2 config (for my private bucket):
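A config of roughly this shape; all values are placeholders, and the keys follow the standard `objectstore` S3 arguments, so treat this as an assumed reconstruction rather than the exact block:

```php
<?php
// Assumed shape only — credentials and identifiers are placeholders.
$CONFIG = [
    'objectstore' => [
        'class' => \OC\Files\ObjectStore\S3::class,
        'arguments' => [
            'bucket'         => 'my-test-bucket',
            'autocreate'     => false,
            'key'            => '<r2-access-key-id>',
            'secret'         => '<r2-secret-access-key>',
            'hostname'       => '<account-id>.r2.cloudflarestorage.com',
            'port'           => 443,
            'use_ssl'        => true,
            'region'         => 'auto', // required for Cloudflare R2
            'use_path_style' => true,
        ],
    ],
];
```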
As expected, when run on `master`, I get 11 test failures in `./tests/lib/Files/ObjectStore/S3Test.php` with the following error:

After running the same tests on this branch, they all pass and include coverage of my new block of code, indicating that Cloudflare R2 did indeed work as expected.
Please let me know if there is anything I can do with regard to writing tests for this case.
Feedback requested on HTTP 200 check
Although the regex used to check for `HTTP 200` responses seems relatively naïve, to the best of my knowledge it is to spec according to HTTP/1.1. I'm unsure about HTTP/2 or HTTP/3, but empirically the S3 clients seem to use HTTP/1.1 at this point anyway. Feedback is welcome if you'd like me to change this in any way, though in practice it works for Cloudflare R2 according to my manual testing.
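For reference, this is the kind of status-line check being discussed (a hypothetical stand-in, not necessarily the PR's exact regex). RFC 7230 defines the HTTP/1.x status line as `HTTP-version SP status-code SP reason-phrase`, which is what makes a simple pattern viable; HTTP/2 and HTTP/3 carry the status in a `:status` pseudo-header rather than a textual status line, so seeing raw text like this implies HTTP/1.x anyway:

```php
<?php
// Hypothetical example of matching an HTTP/1.x status line for a 200.
$check = fn(string $statusLine): bool =>
    (bool) preg_match('#^HTTP/1\.\d\s+200\b#', $statusLine);

var_dump($check('HTTP/1.1 200 OK'));              // bool(true)
var_dump($check('HTTP/1.1 206 Partial Content')); // bool(false)
```

Quirky PHP/Cloudflare R2 bug still to be solved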
When running the Nextcloud unit tests, I noticed that a few of the tests intermittently failed when downloading byte ranges, with some bytes being truncated from the response 😮.
I'm still investigating myself with this PoC: https://github.com/pserwylo/cloudflare-r2-php-fopen-bug, but I have noticed the following:
- It only happens via `fopen("https://...")` (i.e. it doesn't happen if I perform the same requests via Guzzle).
- If I try to inspect the traffic with `mitmproxy`, then the AWS signature verification picks up that a MITM has taken place and seems to reject the request.

A massive, massive thanks to @juliushaertl for the foresight to choose the following file sizes in the `S3Test`: https://github.com/pserwylo/nextcloud-server/blame/e8ee363138eee7703cc236a4f40276090eb3a997/tests/lib/Files/ObjectStore/S3Test.php#L145. Without these specific byte sizes, which cross a specific power-of-2 boundary, I would not have observed this issue.