
bia scraper may silently fail to collect new records due to sort order + CL DupChecker #1934

@grossir

I was checking the "latest opinion view" and saw that bia had no new opinions for 40 days, even though the source had published fresh opinions in that period. The problem was caused by two things:

  • we set date_filed_is_approximate and use the same date for all records, so the order is determined by the case names
  • if 5 of the first opinions in that order already exist in the DB, no further records are checked

So, we had been ignoring newer cases due to this sorting artifact.
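
A minimal sketch of the failure mode, not the actual CL DupChecker code. `Record`, `in_db`, `collect_new` and the sample case names are hypothetical, and counting consecutive duplicates against a threshold of 5 is an assumption about how the check works:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    name: str
    date_filed: str
    in_db: bool  # hypothetical flag: True if the record was already ingested


def collect_new(records, dup_threshold=5):
    """Collect unseen records, stopping after `dup_threshold` duplicates in a row."""
    new, dups = [], 0
    for rec in records:
        if rec.in_db:
            dups += 1
            if dups >= dup_threshold:
                break  # assume everything after this point is old too
        else:
            new.append(rec)
            dups = 0
    return new


SAME_DATE = "2025-01-01"  # every record shares one approximate date
old = [Record(f"Matter of {c}", SAME_DATE, True) for c in "ABCDE"]
fresh = [
    Record("Matter of X", SAME_DATE, False),
    Record("Matter of Y", SAME_DATE, False),
]

# Assumed source order: newest records listed first.
source_order = fresh + old
# With identical dates, sorting by (date, name) is effectively a sort by name.
name_order = sorted(source_order, key=lambda r: (r.date_filed, r.name))

print([r.name for r in collect_new(name_order)])    # []
print([r.name for r in collect_new(source_order)])  # ['Matter of X', 'Matter of Y']
```

In the name-sorted run, the five old records come first, the threshold is hit, and the two fresh records are never examined.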

A solution is to override the ordering just for this court, which is the only one that uses date_filed_is_approximate.
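
One way the override could look, as a sketch only. `BaseSite`, `BiaSite` and `order_cases` are hypothetical names, not the scraper's real API, and the sketch assumes the source page already lists the newest opinions first:

```python
class BaseSite:
    """Hypothetical base scraper: orders cases by date, then by name."""

    def __init__(self, cases):
        self.cases = list(cases)

    def order_cases(self):
        # Default ordering; with identical dates the name decides the order.
        return sorted(
            self.cases,
            key=lambda c: (c["date_filed"], c["name"]),
            reverse=True,
        )


class BiaSite(BaseSite):
    """Hypothetical override for the one court with approximate dates."""

    def order_cases(self):
        # Dates carry no ordering information here, so keep the source order
        # (assumed to be newest first) instead of sorting.
        return list(self.cases)


cases = [
    {"name": "Matter of Z", "date_filed": "2025-01-01"},
    {"name": "Matter of A", "date_filed": "2025-01-01"},
    {"name": "Matter of M", "date_filed": "2025-01-01"},
]

print([c["name"] for c in BaseSite(cases).order_cases()])
print([c["name"] for c in BiaSite(cases).order_cases()])
```

Keeping the override in the court-specific class leaves the default ordering untouched for every other scraper.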

The behavior is clear in the logs:
bia-logs.txt

Project status: Coverage Backlog
