@vayoa
Contributor

@vayoa vayoa commented Oct 23, 2025

This is a fix for my own issue. Fixes #260.

  • Edited apps/worker-py/.venv/lib/python3.11/site-packages/langextract/annotation.py.

    1. Inside _annotate_documents_single_pass, when creating annotated_doc before yielding (around the loop that flushes finished
      documents), wrap annotated_extractions in list(...) (or use copy()), then immediately reset annotated_extractions = [] so
      the next document gets its own list.
    2. Apply the same treatment in the final flush block at the end of the function so the last document is isolated as well.
    3. Ensure any other yield site in this function (including the sequential-pass helper if it reuses the same collector) also
      hands out a fresh list.
    4. Write a simple test.
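
The steps above can be sketched roughly as follows. This is a minimal illustration of the copy-then-reset pattern, not the actual langextract code: `AnnotatedDoc`, `annotate_single_pass`, and the `(document_id, extraction)` input shape are hypothetical stand-ins, and only the `annotated_extractions` collector name comes from the description.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class AnnotatedDoc:
    # Hypothetical stand-in for the library's annotated document type.
    document_id: str
    extractions: List[str] = field(default_factory=list)


def annotate_single_pass(
    pairs: Iterable[Tuple[str, str]],
) -> Iterator[AnnotatedDoc]:
    """Yield one AnnotatedDoc per document.

    Assumes (document_id, extraction) pairs arrive grouped by document.
    """
    annotated_extractions: List[str] = []
    current_id: Optional[str] = None

    for doc_id, extraction in pairs:
        if current_id is not None and doc_id != current_id:
            # Flush the finished document: hand out a copy of the collector,
            # then rebind the name so the next document gets its own list.
            yield AnnotatedDoc(current_id, list(annotated_extractions))
            annotated_extractions = []
        current_id = doc_id
        annotated_extractions.append(extraction)

    # Final flush: the last document gets the same treatment.
    if current_id is not None:
        yield AnnotatedDoc(current_id, list(annotated_extractions))
        annotated_extractions = []


docs = list(annotate_single_pass([("a", "x1"), ("a", "x2"), ("b", "y1")]))
```

Copying at the yield site and rebinding the collector are each enough on their own to break the aliasing in this sketch; doing both, as the description proposes, keeps the yielded document isolated even if a later edit reintroduces in-place mutation of the collector.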

Commits:

  • langextract/langextract/annotation.py:355 and langextract/langextract/annotation.py:404 now hand out a copied extraction list before each yield and immediately reset annotated_extractions so every document receives its own list without bleed-through.
  • langextract/tests/annotation_test.py:745 introduces a regression test with a fake resolver that asserts each annotated document keeps its own extraction payload and that the lists are distinct.
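
For readers unfamiliar with this class of bug, here is a rough sketch of the kind of check such a regression test might make. It does not reproduce the test in the PR: `buggy_flush`, `fixed_flush`, and `lists_are_distinct` are hypothetical names, and the buggy collector shows only one way a shared list could cause every document to end up with the last document's extractions.

```python
def buggy_flush(groups):
    # Hypothetical buggy collector: every yielded document shares one list
    # object, so later mutations overwrite what earlier documents see.
    shared = []
    for doc_id, items in groups:
        shared.clear()
        shared.extend(items)
        yield {"document_id": doc_id, "extractions": shared}


def fixed_flush(groups):
    # Same loop, but each document receives a copy and the collector is reset.
    collector = []
    for doc_id, items in groups:
        collector.extend(items)
        yield {"document_id": doc_id, "extractions": list(collector)}
        collector = []


def lists_are_distinct(docs):
    """Return True if no two documents share the same list object."""
    return len({id(d["extractions"]) for d in docs}) == len(docs)


groups = [("doc1", ["a"]), ("doc2", ["b", "c"])]
buggy = list(buggy_flush(groups))
fixed = list(fixed_flush(groups))

# The buggy version leaks doc2's payload into doc1 once the generator is drained.
assert buggy[0]["extractions"] == ["b", "c"]
assert not lists_are_distinct(buggy)

# The fixed version keeps each payload intact and in its own list.
assert fixed[0]["extractions"] == ["a"]
assert fixed[1]["extractions"] == ["b", "c"]
assert lists_are_distinct(fixed)
```

Checking both the payload contents and the object identity matters: two documents can hold equal lists by coincidence, so equality alone would not catch a shared reference.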

vayoa and others added 2 commits October 5, 2025 02:29
@github-actions github-actions bot added the size/S Pull request with 50-150 lines changed label Oct 23, 2025
@almeidava93-spesia

I'm having the same problem here. This PR fixes the issue. Thanks @vayoa
Can anyone from the core team review this?

@github-actions

⚠️ Branch Update Required

Your branch is 1 commit behind main. Please update your branch to ensure CI checks run with the latest code:

git fetch origin main
git merge origin/main
git push

Note: Enable "Allow edits by maintainers" to allow automatic updates.

@aksg87
Collaborator

aksg87 commented Nov 2, 2025

This was resolved in #276 - which also refactors and optimizes the annotation process. Thanks for reporting this! Will close this PR for now, but reopen if you would like to discuss anything further.

@aksg87 aksg87 closed this Nov 2, 2025

Development

Successfully merging this pull request may close these issues.

Multi-Document extraction bleed (only last result captured)

3 participants