Float number in page.rect causes ValueError: CropBox not in MediaBox #4832

@YunkaiXiao

Description of the bug

When a page in a PDF document has high-precision float dimensions, page.set_cropbox(viewer_rect), where viewer_rect is page.rect, sometimes raises ValueError: CropBox not in MediaBox.

How to reproduce the bug

This happens occasionally when loading a book that has a cover page. Some covers report dimensions that appear to be slightly larger than the page itself. My guess is that the real page size is stored with lower precision (and rounded down), so when cropping, the more precise rectangle from page.rect ends up slightly larger than the size that page.set_cropbox() checks against.
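To make the precision angle concrete, here is a minimal sketch in plain Python (no PyMuPDF needed). It assumes, hypothetically, that the page size is written in the PDF as the decimals 419.28 x 583.92; the values actually stored in sample.pdf are not shown in this report. The helper `to_float32` is an illustrative name, not part of any library.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Hypothetical decimal page size (an assumption, not read from sample.pdf).
width, height = 419.28, 583.92

w32, h32 = to_float32(width), to_float32(height)

print(w32, h32)                    # 419.2799987792969 583.9199829101562
print(w32 == width, h32 == height)  # False False
print(w32 < width, h32 < height)    # True True
```

Under that assumption, the numbers in page.rect are what single-precision storage of 419.28 and 583.92 would produce, and both come out slightly below the decimal values. Comparing page.rect against page.mediabox for the failing page would show on which edge, if any, the two actually differ.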

>>> import fitz
>>> doc = fitz.open(pdf_path)
>>> page = doc.load_page(0)
>>> viewer_rect = page.rect
>>> viewer_rect
Rect(0.0, 0.0, 419.2799987792969, 583.9199829101562)
>>> page.set_cropbox(viewer_rect)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kai/pdf_extraction/lib/python3.12/site-packages/pymupdf/__init__.py", line 12964, in set_cropbox
    return self._set_pagebox("CropBox", rect)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/kai/pdf_extraction/lib/python3.12/site-packages/pymupdf/__init__.py", line 10284, in _set_pagebox
    raise ValueError(f"{boxtype} not in MediaBox")
ValueError: CropBox not in MediaBox
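Judging by the error message, the failing step is a containment test of the requested box against the MediaBox. The sketch below illustrates, on plain tuples, how an excess of a tiny fraction of a point is enough to fail such a test, and how clipping the requested rectangle to the outer box first avoids it. The `contains` and `clamp` helpers and the `media` values are hypothetical, and this is not PyMuPDF's implementation; it also ignores any coordinate conversion the library may perform.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def contains(outer: Box, inner: Box) -> bool:
    """Strict containment: every edge of inner lies within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def clamp(inner: Box, outer: Box) -> Box:
    """Clip inner to outer (rectangle intersection)."""
    return (max(inner[0], outer[0]), max(inner[1], outer[1]),
            min(inner[2], outer[2]), min(inner[3], outer[3]))


# Hypothetical outer box, a hair smaller than the requested rectangle.
media = (0.0, 0.0, 419.2799, 583.9199)
# The rectangle reported by page.rect in this issue.
wanted = (0.0, 0.0, 419.2799987792969, 583.9199829101562)

print(contains(media, wanted))                # False
print(contains(media, clamp(wanted, media)))  # True
```

With PyMuPDF itself, a similar clipping step before page.set_cropbox() might work as a stopgap, but that is untested against sample.pdf.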

sample.pdf

PyMuPDF version

1.26.6

Operating system

Linux

Python version

3.12
