Fixed CMYK images crashing PNG encoding in PyPDF reader #1311

Merged
jamesbraza merged 1 commit into main from fix-cmyk-png-crash on Mar 4, 2026

Conversation

@jamesbraza
Collaborator

Closes #1310

Summary

  • When Pillow can't save an image as PNG due to an unsupported color mode (e.g. CMYK from print-oriented PDFs), fall back to converting the image to RGB before re-encoding
  • Added a CMYK test case to test_individual_mode_outputs_png that reproduces the crash

Follow-up to #1298, which handled file format re-encoding but missed color mode incompatibility.
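
The fallback described in the summary can be sketched roughly as follows. This is an illustrative sketch, not the code merged in reader.py: the helper name `encode_png` and the in-memory buffer handling are assumptions made for the example. It relies only on Pillow's `Image.save` and `Image.convert`.

```python
import io

from PIL import Image


def encode_png(image: Image.Image) -> bytes:
    """Encode an image as PNG, converting to RGB if its mode is unsupported.

    Hypothetical helper for illustration; the real reader may differ.
    """
    buffer = io.BytesIO()
    try:
        image.save(buffer, format="PNG")
    except OSError:
        # Pillow raises OSError for modes PNG cannot store (e.g. CMYK).
        # Start from a fresh buffer, convert to RGB, and retry.
        buffer = io.BytesIO()
        image.convert("RGB").save(buffer, format="PNG")
    return buffer.getvalue()


# A small CMYK image, similar to what a print-oriented PDF might embed.
cmyk = Image.new("CMYK", (4, 4), color=(0, 255, 255, 0))
png_bytes = encode_png(cmyk)
```

Converting only on failure leaves images in already-supported modes (such as RGBA or grayscale) untouched, so they keep their original mode.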

Made with Cursor

Copilot AI review requested due to automatic review settings March 3, 2026 21:08
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Mar 3, 2026
@jamesbraza jamesbraza self-assigned this Mar 3, 2026
@jamesbraza jamesbraza added the bug Something isn't working label Mar 3, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses crashes in the PyPDF-based reader when encountering images that Pillow cannot encode as PNG due to unsupported color modes (notably CMYK), by adding a safe fallback conversion to RGB and extending test coverage to reproduce the issue.

Changes:

  • Add an RGB-conversion fallback when PNG encoding fails during individual image extraction.
  • Extend test_individual_mode_outputs_png to include a CMYK input case that previously triggered the crash.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| packages/paper-qa-pypdf/src/paperqa_pypdf/reader.py | Adds try/except around PNG encoding and falls back to RGB conversion on failure. |
| packages/paper-qa-pypdf/tests/test_paperqa_pypdf.py | Updates parametrized test to include a CMYK image case and verifies PNG output. |

@jamesbraza jamesbraza force-pushed the fix-cmyk-png-crash branch 3 times, most recently from 923ad23 to efed641 on March 3, 2026 21:31
PNG doesn't support all color modes (e.g. CMYK from print-oriented
PDFs). When Pillow raises OSError for an unsupported mode, fall back
to converting the image to RGB before re-encoding as PNG.

Made-with: Cursor
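
To see which color modes trigger the OSError mentioned in the commit message, a quick probe along these lines may help. The function name `can_save_as_png` and the list of modes checked are assumptions chosen for illustration; they are not part of the PR.

```python
import io

from PIL import Image


def can_save_as_png(mode: str) -> bool:
    """Return True if Pillow can write an image of this mode as PNG."""
    image = Image.new(mode, (2, 2))
    try:
        image.save(io.BytesIO(), format="PNG")
    except OSError:
        # Raised when the PNG encoder has no mapping for this mode.
        return False
    return True


# Probe a few common modes; CMYK is the one this PR works around.
support = {mode: can_save_as_png(mode) for mode in ("L", "RGB", "RGBA", "CMYK")}
```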
@jamesbraza jamesbraza force-pushed the fix-cmyk-png-crash branch from efed641 to a0c25ca on March 4, 2026 05:23
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 4, 2026
@jamesbraza jamesbraza merged commit 0e4a06e into main Mar 4, 2026
6 of 7 checks passed
@jamesbraza jamesbraza deleted the fix-cmyk-png-crash branch March 4, 2026 07:01
Labels

  • bug: Something isn't working
  • lgtm: This PR has been approved by a maintainer
  • size:S: This PR changes 10-29 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

CMYK images in PDFs crash indexing with OSError

3 participants