use exact read size to acquire from io semaphore #9287

dannyzaken · 2025-11-18T14:48:39Z

Describe the Problem

In read_object_stream, we used the requested_size passed to the _read function as the value to acquire from the io semaphore. By default, this is 32 MB (the stream's highWaterMark).
For datasets with mostly small objects, this limits the number of concurrent reads more than necessary.
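
To make the concurrency cost concrete, here is a rough back-of-the-envelope sketch. The total semaphore budget (256 MB) and the object size (64 KB) are assumptions chosen for illustration, not values taken from the NooBaa configuration; only the 32 MB highWaterMark comes from the description above.

```javascript
// Illustrative arithmetic only. SEM_BUDGET and OBJECT_SIZE are assumed values.
const MB = 1024 * 1024;
const SEM_BUDGET = 256 * MB;      // assumed total capacity of the io semaphore
const HIGH_WATER_MARK = 32 * MB;  // default requested_size passed to _read
const OBJECT_SIZE = 64 * 1024;    // assumed size of a "small" object

// How many reads can hold the semaphore at once if each acquires `per_read` bytes.
function max_concurrent_reads(budget, per_read) {
    return Math.floor(budget / per_read);
}

// Before: every read acquires the full highWaterMark, whatever the object size.
const before = max_concurrent_reads(SEM_BUDGET, HIGH_WATER_MARK);
// After: each read acquires only the bytes it will actually read.
const after = max_concurrent_reads(SEM_BUDGET, Math.min(OBJECT_SIZE, HIGH_WATER_MARK));

console.log({ before, after }); // { before: 8, after: 4096 }
```

Under these assumed numbers, small-object reads go from 8 concurrent readers to 4096 for the same semaphore budget.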

Explain the Changes

Changed io_sem_size to reflect the actual size requested by the current read.
Also, avoid entering the code under the semaphore if there is nothing more to read.
Also changed the debug level of some read/upload messages from log0 to log1.
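
The sizing decision described above can be sketched as a small pure function. `plan_read` is a hypothetical helper written for this illustration and does not exist in `src/sdk/object_io.js`; the real code performs the same arithmetic inline and, according to the review walkthrough, pushes `null` to the stream in the "nothing to read" case.

```javascript
// Hypothetical helper, for illustration only (not part of object_io.js).
// params_end:     exclusive end offset of the requested range
// reader_pos:     current read position
// requested_size: size hint passed to _read (defaults to the stream's highWaterMark)
function plan_read(params_end, reader_pos, requested_size) {
    const requested_end = Math.min(params_end, reader_pos + requested_size);
    if (requested_end <= reader_pos) {
        // Nothing more to read: skip the semaphore entirely.
        return { done: true, io_sem_size: 0 };
    }
    // Acquire only the bytes this read will actually cover.
    return { done: false, io_sem_size: requested_end - reader_pos };
}

const MB = 1024 * 1024;
// A 100 KB object read with the default 32 MB hint acquires 100 KB, not 32 MB.
console.log(plan_read(100 * 1024, 0, 32 * MB));          // { done: false, io_sem_size: 102400 }
// Once the reader has reached the end, no semaphore acquisition is needed.
console.log(plan_read(100 * 1024, 100 * 1024, 32 * MB)); // { done: true, io_sem_size: 0 }
```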

Issues: Fixed #xxx / Gap #xxx

Testing Instructions:

Doc added/updated
Tests added

Summary by CodeRabbit

Bug Fixes
- Fixed early termination handling for read streams and ensured original errors are rethrown after logging.
- Improved completion signaling when reads finish.
Refactor
- Reduced logging verbosity across upload and read paths to minimize noise.
- Adjusted read resource sizing to better match requested ranges.
- Added a log entry indicating streaming target during uploads.

coderabbitai · 2025-11-18T14:48:53Z

Walkthrough

Replaces several verbose debug logs with lower-verbosity logs across upload/read paths, adds an early-termination guard in read_object_stream, adjusts IO semaphore sizing to use requested_end - reader.pos, adds an explicit re-throw in upload error handling, and inserts a few additional log lines and a minor comment formatting change.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Debug logging standardization and IO logic refinements (`src/sdk/object_io.js`) | Converts verbose debug logs (log0) to less verbose (log1) across upload and read flows; adds early-termination guard in `read_object_stream` when `requested_end ≤ reader.pos` (pushes `null` and returns); changes IO semaphore sizing to use `requested_end - reader.pos`; adds explicit `throw` of caught error in upload path; adds a log line for streaming target bucket/key; minor comment formatting tweak |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Verify semaphore sizing change (requested_end - reader.pos) for buffer and concurrency correctness.
Review early-termination guard in read_object_stream for correct stream-close semantics and edge cases.
Confirm the explicit re-throw in upload error handling preserves original stack/context and does not alter error handling elsewhere.
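
For the re-throw item, a minimal sketch of the log-then-rethrow pattern follows. `upload_with_logging`, `upload_fn`, and `log` are hypothetical names standing in for the real upload call and logger. The point is only that `throw err` rethrows the same error object, so its stack and any custom fields are unchanged.

```javascript
// Hypothetical wrapper, for illustration only (not the actual upload path).
async function upload_with_logging(upload_fn, log) {
    try {
        return await upload_fn();
    } catch (err) {
        log('upload failed:', err.message);
        // Rethrow the caught object itself rather than a new Error,
        // so callers see the original stack and properties.
        throw err;
    }
}

// Usage: the caller receives the very same error instance that was thrown.
const original = new Error('boom');
const check = upload_with_logging(async () => { throw original; }, () => {})
    .catch(err => err === original);
check.then(same => console.log('same error object:', same)); // same error object: true
```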

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change: adjusting IO semaphore acquisition to use the exact read size (requested_end - reader.pos) instead of the default stream size, which is the primary objective. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4abf62d and f4918af.

📒 Files selected for processing (1)

src/sdk/object_io.js

🚧 Files skipped from review as they are similar to previous changes (1)

src/sdk/object_io.js

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: run-package-lock-validation
GitHub Check: Build Noobaa Image
GitHub Check: run-jest-unit-tests

jackyalbo · 2025-11-18T15:47:58Z

src/sdk/object_io.js

            // instead of getting multiple calls from the stream with small slices to return.

            const requested_end = Math.min(params.end, reader.pos + requested_size);
+            if (requested_end <= reader.pos) {


Just thinking here, maybe the logic would be clearer as:
if (requested_size === 0 || reader.pos >= params.end) {
This makes it more obvious that reading is finished.

alphaprinz

Nice. LGTM.

re Jacky's comment, can't say I have a preference. Options are equivalent and both make sense.
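
The equivalence of the two guards can be checked by brute force over a small range of inputs. This sketch assumes `requested_size` is a non-negative integer (a negative size would make the two forms differ); the function names are made up for the comparison.

```javascript
// Guard as written in the PR diff.
function guard_in_pr(params_end, reader_pos, requested_size) {
    const requested_end = Math.min(params_end, reader_pos + requested_size);
    return requested_end <= reader_pos;
}

// Guard as suggested in the review comment.
function guard_suggested(params_end, reader_pos, requested_size) {
    return requested_size === 0 || reader_pos >= params_end;
}

// Compare both forms on every combination of small non-negative integers.
let mismatches = 0;
for (let end = 0; end <= 8; end++) {
    for (let pos = 0; pos <= 8; pos++) {
        for (let size = 0; size <= 8; size++) {
            if (guard_in_pr(end, pos, size) !== guard_suggested(end, pos, size)) {
                mismatches += 1;
            }
        }
    }
}
console.log('mismatches:', mismatches); // mismatches: 0
```

The result follows from the definition: `Math.min(end, pos + size) <= pos` holds exactly when `end <= pos` or `size <= 0`.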

- In read_object_stream, we used the requested_size passed to the _read function as the value to acquire from the io semaphore. By default, this is 32 MB (the stream's highWaterMark).
- For datasets with mostly small objects, this limits the number of concurrent reads more than necessary.
- Changed io_sem_size to reflect the actual size requested by the current read.
- Also, avoid entering the code under the semaphore if there is nothing more to read.
- changed debug level of some read\upload messages to log1 instead of log0

Signed-off-by: Danny Zaken <[email protected]>

dannyzaken requested a review from guymguym November 18, 2025 14:48

pull-request-size bot added the size/S label Nov 18, 2025

jackyalbo reviewed Nov 18, 2025

alphaprinz approved these changes Nov 18, 2025

tangledbytes approved these changes Nov 19, 2025

dannyzaken force-pushed the danny-fixes branch from 4abf62d to f4918af Compare December 22, 2025 15:25
