Skip to content

STY: Simplify _extract_text#1683

Merged
MartinThoma merged 6 commits intomainfrom
simplify-extract-text
Mar 12, 2023
Merged

STY: Simplify _extract_text#1683
MartinThoma merged 6 commits intomainfrom
simplify-extract-text

Conversation

@MartinThoma
Copy link
Member

No description provided.

@MartinThoma MartinThoma marked this pull request as draft March 4, 2023 12:46
@codecov
Copy link

codecov bot commented Mar 4, 2023

Codecov Report

Patch coverage: 90.24% and project coverage change: +0.01 🎉

Comparison is base (39f52dc) 92.37% compared to head (db60d11) 92.39%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1683      +/-   ##
==========================================
+ Coverage   92.37%   92.39%   +0.01%     
==========================================
  Files          33       34       +1     
  Lines        6481     6496      +15     
  Branches     1281     1281              
==========================================
+ Hits         5987     6002      +15     
  Misses        320      320              
  Partials      174      174              
Impacted Files Coverage Δ
pypdf/_text_extraction/__init__.py 89.83% <89.83%> (ø)
pypdf/_page.py 91.13% <100.00%> (+0.36%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@MartinThoma
Copy link
Member Author

@pubpub-zz _extract_text is too complex. By moving some code blocks into own functions I hope I can reduce the complexity. The goal is to make it easier to understand how the text extraction works and make it easier to extend / adjust it.

I still want to group some of the parameters, but I'm uncertain at the moment what makes sense. What do you think about this approach?

@MartinThoma
Copy link
Member Author

The block in if check_crlf_space: could also be a function

@pubpub-zz
Copy link
Collaborator

As the objective is to have as many people as possible to maintain the code, this is OK for me 😀

@MartinThoma MartinThoma force-pushed the simplify-extract-text branch from 1704190 to 15e547a Compare March 5, 2023 09:37
@MartinThoma MartinThoma marked this pull request as ready for review March 5, 2023 09:51
@MartinThoma MartinThoma merged commit 85f9757 into main Mar 12, 2023
@MartinThoma MartinThoma deleted the simplify-extract-text branch March 12, 2023 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants