This directory contains end-to-end tests that run against the actual Claude API to verify real-world functionality.
These tests require a valid Anthropic API key. The tests will fail if ANTHROPIC_API_KEY is not set.
Set your API key before running tests:
    export ANTHROPIC_API_KEY="your-api-key-here"
Install the development dependencies:
    pip install -e ".[dev]"
Run all tests:
    python -m pytest e2e-tests/ -v
Run only tests marked e2e:
    python -m pytest e2e-tests/ -v -m e2e
Run a single test:
    python -m pytest e2e-tests/test_mcp_calculator.py::test_basic_addition -v
- Each test typically uses 1-3 API calls
- Tests use simple prompts to minimize token usage
- The complete test suite should cost less than $0.10 to run
Tests the MCP (Model Context Protocol) integration with calculator tools:
- test_basic_addition: Verifies the add tool executes correctly
- test_division: Tests division with decimal results
- test_square_root: Validates square root calculations
- test_power: Tests exponentiation
- test_multi_step_calculation: Verifies multiple tools can be used in sequence
- test_tool_permissions_enforced: Ensures permission system works correctly
Each test validates:
- Tools are actually called (ToolUseBlock present in response)
- Correct tool inputs are provided
- Expected results are returned
- Permission system is enforced
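The first two checks can be sketched as a small helper that scans a response for tool-use blocks. This is a minimal sketch, not the suite's actual code: the ToolUseBlock and TextBlock dataclasses below are simplified stand-ins defined here for illustration (their fields are assumptions), and find_tool_uses is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolUseBlock:
    """Simplified stand-in for a tool-use block (fields are assumed)."""
    name: str
    input: dict[str, Any] = field(default_factory=dict)


@dataclass
class TextBlock:
    """Simplified stand-in for a plain text block."""
    text: str


def find_tool_uses(blocks: list[Any], tool_name: str) -> list[ToolUseBlock]:
    """Return every tool-use block in `blocks` that calls `tool_name`."""
    return [
        block
        for block in blocks
        if isinstance(block, ToolUseBlock) and block.name == tool_name
    ]


# Hypothetical response content, as a test might collect it.
response_blocks = [
    TextBlock(text="Let me add those numbers."),
    ToolUseBlock(name="mcp__calc__add", input={"a": 15, "b": 27}),
]

calls = find_tool_uses(response_blocks, "mcp__calc__add")
assert len(calls) == 1                       # the tool was actually called
assert calls[0].input == {"a": 15, "b": 27}  # with the expected inputs
```

Checking for the block itself, rather than only the final text, guards against a response that states the right answer without ever invoking the tool.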
These tests run automatically on:
- Pushes to the main branch (via GitHub Actions)
- Manual workflow dispatch
The workflow uses ANTHROPIC_API_KEY from GitHub Secrets.
- Set your API key:
    export ANTHROPIC_API_KEY=sk-ant-...
- The tests will not skip; they require the key to run
- Check your API key is valid and has quota available
- Ensure network connectivity to api.anthropic.com
- Verify the allowed_tools parameter includes the necessary MCP tools
- Check that tool names match the expected format (e.g., mcp__calc__add)
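When debugging permission errors, it can help to build tool names programmatically and compare them against the allowed list. The sketch below assumes the format is mcp__<server>__<tool>, which is inferred from the mcp__calc__add example above; mcp_tool_name and missing_tools are hypothetical helpers, not part of any SDK.

```python
def mcp_tool_name(server: str, tool: str) -> str:
    """Build a tool name in the assumed mcp__<server>__<tool> format."""
    return f"mcp__{server}__{tool}"


def missing_tools(allowed_tools: list[str], required: list[str]) -> list[str]:
    """Return the required tool names that are absent from allowed_tools."""
    allowed = set(allowed_tools)
    return [name for name in required if name not in allowed]


# Hypothetical configuration: only the add tool is allowed.
allowed_tools = [mcp_tool_name("calc", "add")]
required = [mcp_tool_name("calc", "add"), mcp_tool_name("calc", "divide")]

# Any name listed here would be blocked by the permission system.
print(missing_tools(allowed_tools, required))
```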
When adding new e2e tests:
- Mark tests with the @pytest.mark.e2e decorator
- Use the api_key fixture to ensure the API key is available
- Keep prompts simple to minimize costs
- Verify actual tool execution, not just mocked responses
- Document any special setup requirements in this README