<general_rules>

General Development Rules

Code Quality and Style

  • Always run make lint before committing to ensure code passes ruff and mypy checks
  • Use make format to automatically format code with ruff (includes import sorting and code formatting)
  • Follow Google-style docstrings as configured in pyproject.toml
  • Maintain type hints for all function parameters and return values
  • When creating new functions in the trustcall package, first search existing modules (_base.py, _validation_node.py) to avoid duplication
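
As an illustration of the docstring and type-hint rules above, here is a minimal sketch of a fully annotated function with a Google-style docstring. The function `clamp` is hypothetical and is not part of the trustcall package; it only shows the expected shape.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict a value to a closed interval.

    Args:
        value: The number to clamp.
        low: Lower bound of the interval.
        high: Upper bound of the interval.

    Returns:
        ``value`` if it lies within ``[low, high]``, otherwise the nearer bound.

    Raises:
        ValueError: If ``low`` is greater than ``high``.
    """
    if low > high:
        raise ValueError(f"low ({low}) must not exceed high ({high})")
    return max(low, min(value, high))
```

Every parameter and the return value carry a type hint, and the `Args`, `Returns`, and `Raises` sections follow the Google layout.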

Development Workflow

  • Use uv for all dependency management - never use pip directly
  • Run make tests for unit tests before pushing changes
  • Use make tests_watch for continuous testing during development
  • For evaluation testing, use make evals (requires API keys)
  • Always check that new code doesn't break existing functionality

Import and Module Organization

  • Public API should only expose necessary functions through trustcall/__init__.py
  • Internal modules use underscore prefix (_base.py, _validation_node.py)
  • Follow existing import patterns: langchain-core for LLM integration, langgraph for state management
  • When adding new dependencies, update pyproject.toml and run uv sync

</general_rules>

<repository_structure>

Repository Structure

Core Package (trustcall/)

  • __init__.py: Public API exposing create_extractor, ExtractionInputs, ExtractionOutputs
  • _base.py: Main extraction logic, tool handling, JSON patch operations, and core extractor functionality
  • _validation_node.py: ValidationNode class for tool call validation in LangGraph workflows
  • py.typed: Indicates package supports type checking

Testing Structure (tests/)

  • unit_tests/: Core functionality tests (test_extraction.py, test_strict_existing.py, test_utils.py)
  • evals/: Evaluation benchmarks using LangSmith for model comparison (test_evals.py)
  • cassettes/: VCR cassettes for mocking API responses in tests
  • conftest.py: Pytest configuration with asyncio backend setup

Configuration and Build

  • pyproject.toml: Project metadata, dependencies, tool configuration (ruff, mypy, pytest)
  • Makefile: Common development commands (tests, lint, format, build, publish)
  • uv.lock: Locked dependency versions managed by uv
  • .github/workflows/: CI/CD with unit tests (test.yml) and daily evaluations (eval.yml)

Documentation and Assets

  • README.md: Comprehensive usage examples and API documentation
  • _static/: Static assets (cover image)
  • LICENSE: MIT license

</repository_structure>

<dependencies_and_installation>

Dependencies and Installation

Package Manager

  • Uses uv for fast, reliable dependency management
  • Never use pip directly - always use uv run, uv sync, or uv add
  • Dependencies are defined in pyproject.toml with version constraints

Core Dependencies

  • langgraph>=0.2.25: State graph management for LLM workflows
  • dydantic<1.0.0,>=0.0.8: Dynamic Pydantic model creation
  • jsonpatch<2.0,>=1.33: JSON patch operations for efficient updates
  • langchain-core: LLM integration and tool calling
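
To make the "JSON patch operations for efficient updates" idea concrete: a JSON Patch (RFC 6902) describes a change as a short list of operations instead of a whole new document. The sketch below uses only the standard library and handles only object keys (no array indices or pointer escaping). It is not the jsonpatch package's API and not how trustcall applies patches; `apply_patch`, `existing`, and `patch` are hypothetical names chosen for illustration.

```python
import copy
from typing import Any


def apply_patch(doc: dict[str, Any], ops: list[dict[str, Any]]) -> dict[str, Any]:
    """Apply add/replace/remove operations addressed by object keys only."""
    result = copy.deepcopy(doc)  # leave the caller's document untouched
    for op in ops:
        *parents, leaf = op["path"].lstrip("/").split("/")
        target = result
        for key in parents:
            target = target[key]  # walk down to the parent object
        if op["op"] in ("add", "replace"):
            if op["op"] == "replace" and leaf not in target:
                raise KeyError(f"cannot replace missing path {op['path']}")
            target[leaf] = op["value"]
        elif op["op"] == "remove":
            del target[leaf]
        else:
            raise ValueError(f"unsupported op: {op['op']}")
    return result


existing = {"name": "Ada", "address": {"city": "London", "zip": "N1"}}
patch = [
    {"op": "replace", "path": "/address/city", "value": "Paris"},
    {"op": "remove", "path": "/address/zip"},
    {"op": "add", "path": "/email", "value": "ada@example.com"},
]
updated = apply_patch(existing, patch)
```

Three small operations are enough to update the record; the unchanged `name` field never has to be regenerated.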

Development Dependencies

  • Code quality: ruff (linting/formatting), mypy (type checking)
  • Testing: pytest, pytest-asyncio, pytest-socket, vcrpy
  • LLM providers: langchain-openai, langchain-anthropic, langchain-fireworks

Installation Commands

  • uv sync --all-extras --dev: Install all dependencies including dev tools
  • uv sync: Install only production dependencies
  • uv add <package>: Add new dependency
  • uv build: Build distribution packages

</dependencies_and_installation>

<testing_instructions>

Testing Instructions

Test Framework and Structure

  • Uses pytest with asyncio support for async/await testing patterns
  • Socket access is disabled by default (--disable-socket --allow-unix-socket) to prevent external calls
  • VCR cassettes in tests/cassettes/ and tests/evals/cassettes/ mock API responses
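
An async unit test that avoids network access might be shaped roughly like the sketch below. `StubModel` is a hypothetical stand-in written for this example; it is not the repository's FakeExtractionModel, whose interface is not described here. The block uses only the standard library so it runs on its own; under pytest-asyncio the test function would additionally carry the `@pytest.mark.asyncio` decorator.

```python
import asyncio
from typing import Any


class StubModel:
    """Hypothetical stand-in for an LLM that replays canned responses."""

    def __init__(self, responses: list[dict[str, Any]]) -> None:
        self._responses = list(responses)
        self.calls: list[str] = []

    async def ainvoke(self, prompt: str) -> dict[str, Any]:
        """Record the prompt and return the next canned response."""
        self.calls.append(prompt)
        await asyncio.sleep(0)  # yield to the event loop, as a real call would
        return self._responses.pop(0)


# Under pytest-asyncio this function would be marked with @pytest.mark.asyncio.
async def test_stub_returns_canned_tool_call() -> None:
    model = StubModel([{"name": "UserInfo", "args": {"age": 30}}])
    result = await model.ainvoke("Extract the user's age.")
    assert result["args"] == {"age": 30}
    assert model.calls == ["Extract the user's age."]
```

Because the stub never opens a socket, a test like this stays compatible with the `--disable-socket` setting.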

Running Tests

  • make tests: Run unit tests with socket restrictions and detailed output
  • make tests_watch: Continuous testing during development (uses ptw)
  • make evals: Run evaluation benchmarks (requires OPENAI_API_KEY, ANTHROPIC_API_KEY, LANGSMITH_API_KEY)
  • make doctest: Run doctests in the trustcall module

Test Categories

  • Unit Tests: Core functionality testing without external API calls
    • test_extraction.py: Main extractor functionality and retry logic
    • test_strict_existing.py: Schema validation and existing data handling
    • test_utils.py: Utility functions like patch application and type conversion
  • Evaluation Tests: LangSmith-integrated benchmarks comparing model performance
    • test_evals.py: Comparative evaluation across different LLM providers

Writing Tests

  • Use FakeExtractionModel for mocking LLM responses in unit tests
  • Async tests should use pytest-asyncio decorators
  • Mock external API calls using VCR cassettes or custom fake models
  • Follow existing patterns for tool validation and schema testing
  • Test both success and error scenarios, especially for validation failures

</testing_instructions>