<general_rules>
- Always run `make lint` before committing to ensure code passes ruff and mypy checks
- Use `make format` to automatically format code with ruff (includes import sorting and code formatting)
- Follow Google-style docstrings as configured in pyproject.toml
- Maintain type hints for all function parameters and return values
- When creating new functions in the trustcall package, first search existing modules (_base.py, _validation_node.py) to avoid duplication
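The docstring and type-hint conventions above can be sketched as follows. This is a minimal illustration, not code from the package; `shift_value` is a hypothetical helper used only to show the Google docstring layout:

```python
def shift_value(doc: dict, key: str, value: object) -> dict:
    """Return a copy of ``doc`` with ``value`` set at ``key``.

    Args:
        doc: The original document; it is not mutated.
        key: The top-level key to set.
        value: The value to assign.

    Returns:
        A new dict with the key applied.
    """
    updated = dict(doc)
    updated[key] = value
    return updated
```

`make lint` would flag missing parameter annotations or an undocumented argument under this configuration.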
- Use `uv` for all dependency management - never use pip directly
- Run `make tests` for unit tests before pushing changes
- Use `make tests_watch` for continuous testing during development
- For evaluation testing, use `make evals` (requires API keys)
- Always check that new code doesn't break existing functionality
- Public API should only expose necessary functions through trustcall/__init__.py
- Internal modules use underscore prefix (_base.py, _validation_node.py)
- Follow existing import patterns: langchain-core for LLM integration, langgraph for state management
- When adding new dependencies, update pyproject.toml and run `uv sync`
</general_rules>
<repository_structure>
trustcall/ (package):
- __init__.py: Public API exposing create_extractor, ExtractionInputs, ExtractionOutputs
- _base.py: Main extraction logic, tool handling, JSON patch operations, and core extractor functionality
- _validation_node.py: ValidationNode class for tool call validation in LangGraph workflows
- py.typed: Indicates package supports type checking

tests/:
- unit_tests/: Core functionality tests (test_extraction.py, test_strict_existing.py, test_utils.py)
- evals/: Evaluation benchmarks using LangSmith for model comparison (test_evals.py)
- cassettes/: VCR cassettes for mocking API responses in tests
- conftest.py: Pytest configuration with asyncio backend setup

Repository root:
- pyproject.toml: Project metadata, dependencies, tool configuration (ruff, mypy, pytest)
- Makefile: Common development commands (tests, lint, format, build, publish)
- uv.lock: Locked dependency versions managed by uv
- .github/workflows/: CI/CD with unit tests (test.yml) and daily evaluations (eval.yml)
- README.md: Comprehensive usage examples and API documentation
- _static/: Static assets (cover image)
- LICENSE: MIT license
</repository_structure>
<dependencies_and_installation>
- Uses
uvfor fast, reliable dependency management - Never use pip directly - always use
uv run,uv sync, oruv add - Dependencies are defined in pyproject.toml with version constraints
langgraph>=0.2.25: State graph management for LLM workflowsdydantic<1.0.0,>=0.0.8: Dynamic Pydantic model creationjsonpatch<2.0,>=1.33: JSON patch operations for efficient updateslangchain-core: LLM integration and tool calling
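To illustrate why jsonpatch is a dependency: RFC 6902 patches describe document edits as a list of operations, which is cheaper than resending a whole extracted object. The toy applier below is a sketch that handles only top-level `add`/`replace`/`remove` ops; it is not the jsonpatch library's API:

```python
def apply_simple_patch(doc: dict, ops: list[dict]) -> dict:
    """Apply a tiny subset of RFC 6902 operations to a flat dict.

    Only top-level paths (e.g. "/age") are supported in this sketch;
    the real jsonpatch library handles nested paths, moves, and tests.
    """
    result = dict(doc)
    for op in ops:
        key = op["path"].lstrip("/")
        if op["op"] in ("add", "replace"):
            result[key] = op["value"]
        elif op["op"] == "remove":
            result.pop(key, None)
    return result

patched = apply_simple_patch(
    {"name": "Ada"},
    [{"op": "add", "path": "/age", "value": 36}],
)
```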
- Code quality: `ruff` (linting/formatting), `mypy` (type checking)
- Testing: `pytest`, `pytest-asyncio`, `pytest-socket`, `vcrpy`
- LLM providers: `langchain-openai`, `langchain-anthropic`, `langchain-fireworks`
- `uv sync --all-extras --dev`: Install all dependencies including dev tools
- `uv sync`: Install only production dependencies
- `uv add <package>`: Add new dependency
- `uv build`: Build distribution packages
</dependencies_and_installation>
<testing_instructions>
- Uses pytest with asyncio support for async/await testing patterns
- Socket access is disabled by default (`--disable-socket --allow-unix-socket`) to prevent external calls
- VCR cassettes in tests/cassettes/ and tests/evals/cassettes/ mock API responses
- `make tests`: Run unit tests with socket restrictions and detailed output
- `make tests_watch`: Continuous testing during development (uses ptw)
- `make evals`: Run evaluation benchmarks (requires OPENAI_API_KEY, ANTHROPIC_API_KEY, LANGSMITH_API_KEY)
- `make doctest`: Run doctests in the trustcall module
- Unit Tests: Core functionality testing without external API calls
- test_extraction.py: Main extractor functionality and retry logic
- test_strict_existing.py: Schema validation and existing data handling
- test_utils.py: Utility functions like patch application and type conversion
- Evaluation Tests: LangSmith-integrated benchmarks comparing model performance
- test_evals.py: Comparative evaluation across different LLM providers
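The fake-model pattern above can be sketched as follows. `FakeModel` here is a hypothetical stand-in for the repo's FakeExtractionModel, and the test is run directly with `asyncio.run` for illustration; under pytest-asyncio the bare `async def test_*` would be collected automatically:

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeModel:
    """Replays canned tool calls instead of hitting a real LLM endpoint."""

    responses: list
    calls: int = 0

    async def ainvoke(self, messages):
        self.calls += 1
        return self.responses[self.calls - 1]


async def test_fake_model_replays_responses():
    model = FakeModel(responses=[{"name": "UserInfo", "args": {"age": 30}}])
    result = await model.ainvoke([{"role": "user", "content": "extract"}])
    assert result["args"]["age"] == 30
    assert model.calls == 1


asyncio.run(test_fake_model_replays_responses())
```

Because no socket is opened, a test like this passes under the `--disable-socket` restriction.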
- Use FakeExtractionModel for mocking LLM responses in unit tests
- Async tests should use pytest-asyncio decorators
- Mock external API calls using VCR cassettes or custom fake models
- Follow existing patterns for tool validation and schema testing
- Test both success and error scenarios, especially for validation failures
</testing_instructions>