Added `del` to save memory during reading by jamesbraza · Pull Request #1265 · Future-House/paper-qa

jamesbraza · 2026-01-06T22:09:01Z

Our readers are taking up too much RAM. This PR attempts to remove images from the readers early to reduce overall RAM usage.

Note

Reduces peak RAM during PDF parsing by freeing large intermediate image buffers as soon as they’re no longer needed.

- In `paperqa_nemotron/reader.py`: delete `rendered_page` after conversion, delete `image_for_api` after the API call, and delete cropped regions (`region_pix`) after saving
- In `paperqa_pypdf/reader.py`: delete pdfium bitmaps (`pdfium_rendered_page`, `pix`) after PIL conversion/saving for full-page screenshots, figure crops, and table crops
- Changes are localized to memory cleanup, with no API or behavioral surface changes

Written by Cursor Bugbot for commit db16c88.
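The effect described in the note above can be illustrated with a minimal, standard-library-only sketch. It does not use pdfium or PIL: `render_page`, `to_image`, `encode`, and `PAGE_BYTES` are hypothetical stand-ins, and plain byte buffers play the role of bitmaps. The only claim it makes is that, on CPython, dropping the last reference to an intermediate buffer before the next large allocation lowers the peak traced memory.

```python
import tracemalloc

PAGE_BYTES = 2_000_000  # hypothetical size of one rendered page buffer


def render_page() -> bytearray:
    """Stand-in for rendering a PDF page to a raw bitmap."""
    return bytearray(PAGE_BYTES)


def to_image(bitmap: bytearray) -> bytes:
    """Stand-in for converting the bitmap to an image object (makes a copy)."""
    return bytes(bitmap)


def encode(image: bytes) -> bytes:
    """Stand-in for encoding: needs a page-sized scratch buffer, returns a small result."""
    scratch = bytearray(len(image))
    return bytes(scratch[:16])


def process_page(free_early: bool) -> bytes:
    bitmap = render_page()
    image = to_image(bitmap)
    if free_early:
        # Last reference to the raw bitmap: CPython can release it right away,
        # before encode() allocates its scratch buffer.
        del bitmap
    return encode(image)


def peak_bytes(free_early: bool) -> int:
    """Return the peak traced memory while processing one page."""
    tracemalloc.start()
    try:
        process_page(free_early)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


peak_without_del = peak_bytes(free_early=False)  # bitmap + image + scratch alive at once
peak_with_del = peak_bytes(free_early=True)  # at most two page-sized buffers alive at once
```

In this sketch the saving comes entirely from the ordering: the `del` sits between the last use of `bitmap` and the next large allocation. A `del` placed at the very end of a function would not change the peak, because locals are released on return anyway.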

dosubot · 2026-01-06T22:09:15Z

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

Copilot

Pull request overview

This PR adds explicit del statements to free image objects earlier in the PDF reading process to reduce RAM usage. The changes target image-related objects (pdfium bitmaps, PIL images, and numpy arrays) that are no longer needed after extraction or rendering.

Key Changes:

- Added `del` statements to free pdfium bitmap objects immediately after they're converted or used
- Added `del` for the intermediate numpy array used for API calls once processing is complete
- Added `del` for cropped PIL images after they're saved to `BytesIO` buffers
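The last item in the list above (crop, save to a buffer, then delete the crop) might look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: the page is a flat grayscale byte buffer, `crop` and `encode_regions` are hypothetical helpers, and `zlib.compress` stands in for saving a PIL image as PNG. The name `region_pix` is borrowed from the PR summary purely for readability.

```python
import io
import zlib

WIDTH, HEIGHT = 64, 64  # hypothetical page size, one byte per pixel


def crop(page: bytes, top: int, bottom: int) -> bytes:
    """Copy rows [top, bottom) out of a row-major page buffer."""
    return page[top * WIDTH : bottom * WIDTH]


def encode_regions(page: bytes, regions: list[tuple[int, int]]) -> list[bytes]:
    """Encode each cropped region, keeping only the encoded payloads."""
    encoded = []
    for top, bottom in regions:
        region_pix = crop(page, top, bottom)
        buffer = io.BytesIO()
        # Stand-in for something like image.save(buffer, format="PNG").
        buffer.write(zlib.compress(region_pix))
        # The raw crop is not needed once its encoded form is in the buffer.
        del region_pix
        encoded.append(buffer.getvalue())
    return encoded


page = bytes(WIDTH * HEIGHT)
encoded = encode_regions(page, [(0, 16), (16, 64)])
```

How much this helps depends on what runs between the `del` and the point where the name would have been rebound or gone out of scope. In a tight loop like this one the gain is small, and it grows when further allocations or slow calls follow the save.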

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `packages/paper-qa-pypdf/src/paperqa_pypdf/reader.py` | Frees pdfium bitmap memory for full-page renders, clustered image regions, and table regions immediately after conversion to PNG format |
| `packages/paper-qa-nemotron/src/paperqa_nemotron/reader.py` | Frees pdfium bitmap memory after PIL conversion, numpy array memory after API calls complete, and cropped image memory after saving to buffer |

After thorough review, all the deletions are safe and correctly placed. The variables are deleted only after their last usage, and any derived objects (like PIL images created from pdfium bitmaps) are retained as needed. The comments accurately describe the memory being freed. This is a clean optimization that should help reduce memory usage during PDF processing without introducing any bugs.
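One detail behind the review's point about last usage and retained derived objects: `del` removes a name, not an object. Memory is released only once no references remain, which on CPython happens as soon as the reference count reaches zero. The sketch below shows this with a weak reference; the `Bitmap` class and `demo` function are hypothetical stand-ins, not types from pdfium or paper-qa.

```python
import weakref


class Bitmap:
    """Hypothetical stand-in for a large rendered-page buffer."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)


def demo() -> tuple[bool, bool]:
    bitmap = Bitmap(1024)
    alias = bitmap  # a second reference, e.g. held by a derived object
    probe = weakref.ref(bitmap)  # observes the object without keeping it alive

    del bitmap
    alive_after_first_del = probe() is not None  # alias still holds the object

    del alias
    freed_after_last_del = probe() is None  # no references left (CPython frees immediately)
    return alive_after_first_del, freed_after_last_del


alive_after_first_del, freed_after_last_del = demo()
```

So a `del` only saves memory when it drops the final reference. If a derived object keeps a view onto the same underlying buffer rather than a copy, deleting the original name alone would not release it.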

jamesbraza added 2 commits January 6, 2026 14:06

Del'ing images early in readers from pdfium bitmaps

23a69c6

Del'ing a few other places too

db16c88

jamesbraza requested review from MicPie, sidnarayanan and whitead January 6, 2026 22:09

jamesbraza self-assigned this Jan 6, 2026

Copilot AI review requested due to automatic review settings January 6, 2026 22:09

jamesbraza added the `bug` ("Something isn't working") label Jan 6, 2026

dosubot bot added the `size:XS` ("This PR changes 0-9 lines, ignoring generated files.") label Jan 6, 2026

Copilot started reviewing on behalf of jamesbraza January 6, 2026 22:09

dosubot bot added the `enhancement` ("New feature or request") label Jan 6, 2026

Copilot AI reviewed Jan 6, 2026

nadolskit approved these changes Jan 11, 2026

whitead approved these changes Jan 11, 2026

dosubot bot added the `lgtm` ("This PR has been approved by a maintainer") label Jan 11, 2026

jamesbraza merged commit 68c66ce into main Jan 11, 2026
20 of 21 checks passed

jamesbraza deleted the memory-savings-nemotron-parse branch January 11, 2026 20:44
