Fixing Settings.get_index_name being the same for different PDF parsers #1125

Merged
jamesbraza merged 1 commit into main from parser-index-name on Oct 6, 2025

Conversation

@jamesbraza
Collaborator

If we are using a PyPDF-based vs a PyMuPDF-based parser, the autogenerated index names should not clobber each other.

@jamesbraza jamesbraza self-assigned this Oct 5, 2025
Copilot AI review requested due to automatic review settings October 5, 2025 04:06
@jamesbraza jamesbraza added the bug Something isn't working label Oct 5, 2025
@dosubot dosubot bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Oct 5, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR fixes an issue where different PDF parsers (PyMuPDF vs PyPDF) would generate the same index name, causing them to clobber each other's cached parsing results.

  • Added the PDF parser function as a parameter in index name generation to ensure uniqueness
  • Added a test to verify that different parsers produce different index names
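
The idea behind the fix can be sketched as follows. This is a minimal illustration, not the code from `src/paperqa/settings.py`: the function names `parser_id` and `index_name`, the choice of settings fed into the hash, the SHA-256 digest, the `pqa_index_` prefix, and the two stand-in parser functions are all assumptions made for the example.

```python
import hashlib
from collections.abc import Callable


def parser_id(parse_pdf: Callable) -> str:
    """Identify a parser function by its module and qualified name.

    Hypothetical helper: repr() is avoided because it embeds a memory
    address, which would change the index name on every run.
    """
    return f"{parse_pdf.__module__}.{parse_pdf.__qualname__}"


def index_name(
    paper_directory: str, embedding: str, chunk_size: int, parse_pdf: Callable
) -> str:
    """Derive an index name from the settings that affect parsing output."""
    segments = [
        paper_directory,
        embedding,
        str(chunk_size),
        parser_id(parse_pdf),  # without this segment, both parsers collide
    ]
    digest = hashlib.sha256("|".join(segments).encode()).hexdigest()
    return f"pqa_index_{digest}"


# Stand-ins for a PyPDF-based and a PyMuPDF-based parser (hypothetical).
def parse_with_pypdf(path: str) -> str:
    return f"pypdf text of {path}"


def parse_with_pymupdf(path: str) -> str:
    return f"pymupdf text of {path}"


name_a = index_name("papers", "some-embedding", 5000, parse_with_pypdf)
name_b = index_name("papers", "some-embedding", 5000, parse_with_pymupdf)
assert name_a != name_b  # different parsers now map to different indexes
```

With the parser identifier left out of `segments`, both calls would hash the same string and return the same name, which is the clobbering this PR describes.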

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/paperqa/settings.py | Modified get_index_name method to include the PDF parser function in the index name calculation |
| tests/test_paperqa.py | Added test case to verify different PDF parsers generate unique index names |

@dosubot

dosubot bot commented Oct 5, 2025

Related Documentation

Checked 1 published document(s). No updates required.

@jamesbraza jamesbraza merged commit 5db5ee1 into main Oct 6, 2025
5 checks passed
@jamesbraza jamesbraza deleted the parser-index-name branch October 6, 2025 19:25
@dosubot

dosubot bot commented Oct 6, 2025

Documentation Updates

Checked 1 published document(s). No updates required.

Labels

bug (Something isn't working)
size:S (This PR changes 10-29 lines, ignoring generated files.)

3 participants