BUG: compress_content_stream not readable in Adobe Acrobat #1698

Merged
MartinThoma merged 5 commits into py-pdf:main from pubpub-zz:Compress on Mar 12, 2023

@pubpub-zz
Collaborator

fixes #1654
ContentStream must be stored as individual objects

This is an interim version: this is not in accordance with the PDF reference, as streams must be indirect objects (bottom of page 60 of the PDF 1.7 reference).
As a result, compression must only be applied to pages belonging to a PdfWriter.
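
To illustrate why the page needs a writer, here is a minimal sketch using only the standard library. It is not pypdf's implementation: `ToyStream`, `ToyWriter`, `ToyPage` and their attributes are hypothetical names made up for this example. The idea is that a compressed content stream has to be registered as its own numbered (indirect) object, and only a writer has an object table to register it in.

```python
import zlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyStream:
    """Hypothetical stand-in for a PDF stream object (filters + data)."""
    data: bytes
    filters: List[str] = field(default_factory=list)

    def flate_encode(self) -> "ToyStream":
        # FlateDecode streams hold zlib/deflate-compressed data.
        return ToyStream(zlib.compress(self.data), ["/FlateDecode"])


@dataclass
class ToyWriter:
    """Hypothetical writer that owns a table of numbered (indirect) objects."""
    objects: Dict[int, ToyStream] = field(default_factory=dict)

    def add_object(self, obj: ToyStream) -> int:
        number = len(self.objects) + 1  # object numbers start at 1
        self.objects[number] = obj
        return number


@dataclass
class ToyPage:
    raw_content: bytes
    writer: Optional[ToyWriter] = None   # None: page is not owned by a writer
    contents_ref: Optional[int] = None   # object number the page points to

    def compress_content_streams(self) -> None:
        if self.writer is None:
            # There is no object table to register the new stream in.
            raise ValueError("Page must be part of a PdfWriter")
        compressed = ToyStream(self.raw_content).flate_encode()
        # Store the stream as its own object; the page keeps only a reference.
        self.contents_ref = self.writer.add_object(compressed)


content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 50
writer = ToyWriter()
page = ToyPage(content, writer=writer)
page.compress_content_streams()
stored = writer.objects[page.contents_ref]
print(stored.filters, len(content), "->", len(stored.data))
```

In this toy model, a page created without a writer (`ToyPage(content)`) raises the same `ValueError` message as the one added in this PR, which mirrors the restriction described above.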
@codecov

codecov bot commented Mar 11, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.10 🎉

Comparison is base (8b0f091) 92.37% compared to head (9d74017) 92.47%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1698      +/-   ##
==========================================
+ Coverage   92.37%   92.47%   +0.10%     
==========================================
  Files          33       33              
  Lines        6480     6487       +7     
  Branches     1281     1282       +1     
==========================================
+ Hits         5986     5999      +13     
+ Misses        320      317       -3     
+ Partials      174      171       -3     
Impacted Files | Coverage Δ
pypdf/_page.py | 90.85% <100.00%> (+0.08%) ⬆️

... and 1 file with indirect coverage changes

@pubpub-zz
Collaborator Author

Many changes in the tests, as the issue comes from the fact that streams must be indirect objects: compression can only be applied to pages that are part of a PdfWriter.

@pubpub-zz
Collaborator Author

all good



@pytest.mark.enable_socket()
@pytest.mark.slow()
Member

@MartinThoma MartinThoma Mar 12, 2023

Note for myself: The slowest test is now the one using https://corpora.tika.apache.org/base/docs/govdocs1/950/950337.pdf-tika-950337.pdf, at about 1.4s. We only want to mark tests that take more than 5s as slow, so removing the slow flag here is fine.

@MartinThoma MartinThoma merged commit 3a9d6f6 into py-pdf:main Mar 12, 2023
MartinThoma added a commit that referenced this pull request Mar 12, 2023
Bug Fixes (BUG)
-  compress_content_stream not readable in Adobe Acrobat (#1698)
-  Pass logging parameters correctly in set_need_appearances_writer (#1697)
-  Write /Root/AcroForm in set_need_appearances_writer (#1639)

Robustness (ROB)
-  Allow more whitespaces within linearized file (#1701)
@MartinThoma
Member

I've just noticed that the benchmark failed.

Did this PR make a breaking change or was the benchmark broken before?
Is the raise ValueError("Page must be part of a PdfWriter") actually correct?

MartinThoma added a commit that referenced this pull request Mar 12, 2023
MartinThoma added a commit that referenced this pull request Mar 14, 2023
@MartinThoma changed the title from "BUG : compress_content_stream not readable in acrobat" to "BUG: compress_content_stream not readable in Adobe Acrobat" on Mar 14, 2023
@pubpub-zz pubpub-zz deleted the Compress branch June 24, 2023 08:40

Development

Successfully merging this pull request may close these issues.

Lost of "Page Mode" & preset zoom in "Reduce PDF Size" tutorials
