Conversation

@whitead
Collaborator

@whitead whitead commented Nov 1, 2025

It turns out OpenAlex lookups never worked when querying by DOI. OpenAlex always returns DOIs with the URL prefix (e.g., https://doi.org/10xxx/xxxxx), so the DOIs on the returned articles never matched the requested DOIs and we rejected them as failed. This PR fixes that and adds a low-effort unit test that just checks that OpenAlex is working.
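To make the mismatch concrete, here is a minimal sketch of the normalization idea. The helper name `normalize_doi` and the constant `DOI_URL_PREFIX` are hypothetical and not taken from the actual change in src/paperqa/clients/openalex.py; the example DOI is inferred from the test cassette name.

```python
# Hypothetical helper for illustration; not the code from this PR.
DOI_URL_PREFIX = "https://doi.org/"


def normalize_doi(doi: str) -> str:
    """Return the bare DOI, dropping the URL prefix if it is present."""
    # str.removeprefix (Python 3.9+) leaves the string unchanged
    # when the prefix is absent, so bare DOIs pass through as-is.
    return doi.removeprefix(DOI_URL_PREFIX)


requested = "10.1021/acs.jctc.5b00178"
returned = "https://doi.org/10.1021/acs.jctc.5b00178"

# Comparing the raw values fails, which is how matches got rejected.
print(returned == requested)
# After normalization the two DOIs compare equal.
print(normalize_doi(returned) == requested)
```

This sketch only strips the prefix. Whether the real client also needs case-insensitive comparison is not stated in this PR.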


Note

Normalize OpenAlex DOIs by removing the https://doi.org/ prefix before comparison and add a VCR test to verify DOI queries work via OpenAlexProvider.

  • Clients/OpenAlex:
    • Normalize doi by removing https://doi.org/ prefix in API results before matching against requested DOI.
    • Handle title-search path by selecting first result and validating non-empty results.
  • Tests:
    • Add OpenAlexProvider to tests and new async test test_does_openalex_work querying by DOI with fields=["open_access"].
    • Record VCR cassette for the DOI request to OpenAlex.
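The two client-side behaviors summarized above can be sketched roughly as follows. This is an assumption-laden illustration, not the code in src/paperqa/clients/openalex.py: the function name `select_work`, the error messages, and the shape of `results` (a list of dicts with a `doi` key) are all hypothetical.

```python
from typing import Any, Optional

# Hypothetical names for illustration; not the code from this PR.
DOI_URL_PREFIX = "https://doi.org/"


def select_work(
    results: list[dict[str, Any]], requested_doi: Optional[str] = None
) -> dict[str, Any]:
    """Pick one work from an OpenAlex-style result list."""
    # Validate that the search returned something before indexing into it.
    if not results:
        raise ValueError("OpenAlex returned no results")

    # Strip the URL prefix from every returned DOI before any comparison.
    # A missing DOI becomes an empty string in this sketch.
    normalized = [
        {**work, "doi": (work.get("doi") or "").removeprefix(DOI_URL_PREFIX)}
        for work in results
    ]

    # Title-search path: no DOI to match against, so take the first result.
    if requested_doi is None:
        return normalized[0]

    # DOI path: accept only a work whose bare DOI equals the requested one.
    for work in normalized:
        if work["doi"] == requested_doi:
            return work
    raise ValueError(f"No result matched DOI {requested_doi!r}")


works = [
    {"doi": "https://doi.org/10.1/a", "title": "A"},
    {"doi": "https://doi.org/10.1/b", "title": "B"},
]
print(select_work(works, "10.1/b")["title"])
print(select_work(works)["doi"])
```

Normalizing before branching keeps the returned record consistent on both paths, so callers always see a bare DOI.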

Written by Cursor Bugbot for commit 0da01b2. Configure here.

@whitead whitead requested review from Copilot and jamesbraza November 1, 2025 22:28
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 1, 2025
@dosubot

dosubot bot commented Nov 1, 2025

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

@dosubot dosubot bot added the bug Something isn't working label Nov 1, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for removing the DOI prefix from OpenAlex API responses and includes a test to verify OpenAlex integration works correctly.

  • Adds DOI prefix removal logic to normalize DOI format from OpenAlex responses
  • Introduces a new test to verify OpenAlex provider functionality
  • Adjusts whitespace formatting in the OpenAlex client code

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_clients.py | Imports OpenAlexProvider and adds test for OpenAlex DOI query functionality |
| src/paperqa/clients/openalex.py | Adds DOI prefix normalization to remove "https://doi.org/" prefix from OpenAlex responses and adjusts whitespace |
| tests/cassettes/test_does_openalex_work[10.1021-acs.jctc.5b00178].yaml | VCR cassette recording for the new OpenAlex test |

Collaborator

@jamesbraza jamesbraza left a comment

Nice catch

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 1, 2025
@whitead whitead enabled auto-merge (squash) November 1, 2025 23:15
@whitead whitead disabled auto-merge November 1, 2025 23:29
@whitead whitead merged commit cf2102f into main Nov 1, 2025
4 of 7 checks passed
@whitead whitead deleted the open-alex-fixes branch November 1, 2025 23:29


3 participants