Run thousands of realistic multi-turn scenarios in parallel. Find edge cases before production.
Capture your implicit decision criteria. Turn intuition into automated evaluation.
Reports that show what to fix and how. Analysis that drives action.
FluxLoop is an open-source toolkit for running reproducible, offline-first simulations of AI agents against dynamic scenarios. It empowers developers to rigorously test agent behavior, evaluate performance against custom criteria, and build confidence before shipping to production.
- Easy to Use: Get started quickly with MCP integration and Flux Agent (TBD) for automated setup
- Local-first: Run experiments on your machine with full control and reproducibility
- Framework-agnostic: Works with any agent framework (LangGraph, LangChain, custom)
- Evaluation-first: Solve the AI evaluation problem properly with rigorous, offline-first testing
Stop guessing, start evaluating.
Get started quickly with MCP Server integration that automatically detects your agent framework and guides you through the setup process. No more manual configuration or guesswork.
Instrument existing agent code with minimal changes: just add @fluxloop.agent() and you're tracing. Works with any Python-based agent framework.
Rigorously test your agents with reproducible experiments and structured evaluation. Define rule-based and LLM-based evaluators, set success criteria, analyze trends and outliers, and generate comprehensive reports with customizable templates, all designed for proper AI evaluation.
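To make the idea of rule-based evaluators and success criteria concrete, here is a minimal sketch in plain Python. It is not FluxLoop's evaluator API: the `Evaluator` class, the `evaluate` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Evaluator:
    """Hypothetical rule-based evaluator: a name plus a pass/fail check."""
    name: str
    check: Callable[[str], bool]  # returns True when an output passes


def evaluate(outputs: List[str], evaluators: List[Evaluator],
             threshold: float = 0.8) -> Dict:
    """Compute each evaluator's pass rate and compare it to a threshold."""
    per_evaluator = {}
    for ev in evaluators:
        passed = sum(1 for out in outputs if ev.check(out))
        rate = passed / len(outputs) if outputs else 0.0
        per_evaluator[ev.name] = {"pass_rate": rate, "ok": rate >= threshold}
    # The run succeeds only if every evaluator meets the threshold.
    success = all(result["ok"] for result in per_evaluator.values())
    return {"evaluators": per_evaluator, "success": success}


# Illustrative agent outputs (made up for this sketch).
sample_outputs = [
    "Your order ships tomorrow.",
    "",
    "Sorry, I cannot help with that.",
]

sample_evaluators = [
    Evaluator("non_empty", lambda out: bool(out.strip())),
    Evaluator("under_200_chars", lambda out: len(out) <= 200),
]

if __name__ == "__main__":
    report = evaluate(sample_outputs, sample_evaluators)
    for name, result in report["evaluators"].items():
        print(f"{name}: pass_rate={result['pass_rate']:.2f} ok={result['ok']}")
    print("success:", report["success"])
```

An LLM-based evaluator could plug into the same shape by wrapping a model call in the `check` callable. That part is omitted here because it depends on the provider you use.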
Run experiments on your machine without cloud dependencies. Full control over your testing environment with reproducible, auditable results.
Visual project management for your IDE. Browse projects, run experiments with one click, parse results into Markdown timelines, and explore outputs in a structured tree, all without leaving your editor.
Available on VS Code Marketplace and Open VSX (Cursor).
Complete command-line control for advanced workflows. Initialize projects, generate test inputs with an LLM, run batch experiments, and parse results, all from your terminal.
FluxLoop consists of multiple integrated packages that work together to provide a complete AI agent testing solution:
Core instrumentation library for tracing and recording agent execution.
Add @fluxloop.agent() decorator to your code to automatically capture traces, observations, and execution context. Supports async, streaming, and complex agent frameworks.
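As a rough picture of what a tracing decorator does, the sketch below records inputs, outputs, and timing for both sync and async functions. It is a toy stand-in, not FluxLoop's implementation: `trace_agent` and the in-memory `TRACES` list are hypothetical, and the real SDK's storage and trace format are not shown here.

```python
import asyncio
import functools
import inspect
import time

# Hypothetical in-memory trace store, used only for this illustration.
TRACES = []


def trace_agent(name=None):
    """Toy tracing decorator (an assumption, not the FluxLoop SDK)."""

    def decorator(func):
        label = name or func.__name__

        def record(args, kwargs, output, started):
            TRACES.append({
                "name": label,
                "input": {"args": args, "kwargs": kwargs},
                "output": output,
                "duration_s": time.perf_counter() - started,
            })

        if inspect.iscoroutinefunction(func):
            @functools.wraps(func)
            async def async_wrapper(*args, **kwargs):
                started = time.perf_counter()
                output = await func(*args, **kwargs)
                record(args, kwargs, output, started)
                return output
            return async_wrapper

        @functools.wraps(func)
        def sync_wrapper(*args, **kwargs):
            started = time.perf_counter()
            output = func(*args, **kwargs)
            record(args, kwargs, output, started)
            return output
        return sync_wrapper

    return decorator


@trace_agent()
def echo_agent(message):
    return f"echo: {message}"


@trace_agent(name="async_echo")
async def async_echo_agent(message):
    await asyncio.sleep(0)  # stands in for an awaited model or tool call
    return f"async echo: {message}"


if __name__ == "__main__":
    echo_agent("hello")
    asyncio.run(async_echo_agent("hello"))
    for trace in TRACES:
        print(trace["name"], "->", trace["output"])
```

With the SDK installed, the equivalent step is to put @fluxloop.agent() on your own agent entry point instead of the toy `trace_agent` shown here.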
Documentation: https://docs.fluxloop.ai/sdk/
PyPI: fluxloop
Command-line orchestration tool for managing experiments end-to-end.
Initialize projects, generate test inputs with an LLM, run batch simulations, and parse results into human-readable formats, all from your terminal.
- New: Pytest Bridge (fluxloop init pytest-template) produces a ready-to-run smoke test wired to the fluxloop_runner fixture; see docs/guides/pytest_bridge.md for examples and the sample GitHub Actions workflow.
Documentation: https://docs.fluxloop.ai/cli/
PyPI: fluxloop-cli
Visual project management for Cursor and VS Code.
Browse projects, run experiments with one click, parse results into Markdown timelines, and explore outputs in a structured tree view, all without leaving your IDE.
Documentation: https://docs.fluxloop.ai/vscode/
Marketplaces: VS Code | Open VSX (Cursor)
AI-assisted integration guidance via Model Context Protocol.
Automatically detect your agent framework, suggest integration patterns, and provide context-aware help for setting up FluxLoop in your codebase.
Documentation: https://docs.fluxloop.ai/mcp/
PyPI: fluxloop-mcp
AI-powered integration assistant built into the VSCode extension.
Flux Agent analyzes your code, consults FluxLoop documentation via MCP, and generates integration suggestions using an LLM. It combines repository analysis with OpenAI models to provide contextualized, framework-specific guidance without making automatic changes. You review and apply suggestions manually.
Documentation: https://docs.fluxloop.ai/vscode/integration-assistant/overview
Features:
- Repository analysis and framework detection
- AI-generated code integration suggestions
- Knowledge search with citation-backed answers
- Manual review and application workflow
Status: Beta (available in VSCode extension v0.1.3+)
# Install Python packages (SDK and MCP require Python 3.11+)
pip install fluxloop fluxloop-cli fluxloop-mcp
# Install VSCode/Cursor Extension
# Search "FluxLoop" in Extensions marketplace
Installation Guides: SDK | CLI | VSCode | MCP
# 1. Create a project
fluxloop init project --name my-agent
# 2. Add @fluxloop.agent() decorator to your code
# 3. Generate test inputs
fluxloop generate inputs --limit 50
# 4. Run experiment
fluxloop run experiment
# 5. Parse results
fluxloop parse experiment experiments/<experiment_dir>
Complete Tutorial: End-to-End Workflow Guide
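If you want to script the CLI steps above, for example in CI, one possible wrapper looks like the sketch below. It uses only the commands listed in this section. The `build_commands` and `run_workflow` helpers are hypothetical, the experiment directory is left as a placeholder, and whether later commands must run from inside the created project directory is an open assumption to check against the CLI docs. Step 2 (adding the decorator) remains a manual code change.

```python
import shlex
import subprocess


def build_commands(project_name, input_limit, experiment_dir):
    """Return the CLI steps from this section as argument lists."""
    return [
        ["fluxloop", "init", "project", "--name", project_name],
        ["fluxloop", "generate", "inputs", "--limit", str(input_limit)],
        ["fluxloop", "run", "experiment"],
        ["fluxloop", "parse", "experiment", experiment_dir],
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops the workflow at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # The experiment directory is a placeholder; substitute the real one
    # produced by your run.
    cmds = build_commands("my-agent", 50, "experiments/<experiment_dir>")
    run_workflow(cmds, dry_run=True)
```

The dry-run default keeps the script safe to try without the CLI installed. Pass `dry_run=False` once the printed commands look right.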
- Instrument Agents: Add decorators to trace execution
- Generate Inputs: Create test scenarios with LLM or deterministic strategies
- Run Simulations: Execute batch experiments with different configurations
- Multi-Turn Conversations: Automatically extend single-input experiments into multi-turn dialogues with an AI supervisor
- Analyze Results: Parse structured outputs into human-readable timelines
- Record & Replay: Capture complex arguments and replay them (advanced)
- AI-Assisted Setup: Use MCP server and Flux Agent for automated integration guidance
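The multi-turn idea in the list above can be pictured as a loop in which a supervisor reads the transcript so far and either supplies the next user message or ends the conversation. The sketch below is an assumption about the general pattern, not FluxLoop's supervisor: `run_multi_turn`, `toy_agent`, and `scripted_supervisor` are hypothetical, and the scripted supervisor stands in for what would be an LLM-driven one.

```python
from typing import Callable, List, Optional, Tuple

Turn = Tuple[str, str]  # (user_message, agent_reply)


def run_multi_turn(
    first_input: str,
    agent: Callable[[str], str],
    supervisor: Callable[[List[Turn]], Optional[str]],
    max_turns: int = 5,
) -> List[Turn]:
    """Extend a single input into a dialogue until the supervisor stops."""
    transcript: List[Turn] = []
    user_message: Optional[str] = first_input
    for _ in range(max_turns):
        reply = agent(user_message)
        transcript.append((user_message, reply))
        # The supervisor returns the next user message, or None to stop.
        user_message = supervisor(transcript)
        if user_message is None:
            break
    return transcript


def toy_agent(message: str) -> str:
    """Stand-in agent that simply restates the message."""
    return f"You said: {message}"


def scripted_supervisor(follow_ups: List[str]):
    """Stand-in supervisor that replays a fixed list of follow-ups."""
    remaining = list(follow_ups)

    def supervisor(transcript: List[Turn]) -> Optional[str]:
        return remaining.pop(0) if remaining else None

    return supervisor


if __name__ == "__main__":
    turns = run_multi_turn(
        "Where is my order?",
        toy_agent,
        scripted_supervisor(["It was due yesterday.", "Can I get a refund?"]),
    )
    for user_message, reply in turns:
        print("user: ", user_message)
        print("agent:", reply)
```

The `max_turns` cap matters in practice: a model-driven supervisor may never decide to stop on its own, so a hard limit keeps batch runs bounded.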
Building trustworthy AI requires a community dedicated to rigorous, transparent evaluation. FluxLoop provides the foundational tooling, but there's much more to do:
- Shape the standard: Define the open contract for AI agent simulation data
- Build integrations: Create adapters for popular frameworks (LangChain, LlamaIndex, CrewAI)
- Enhance developer experience: Improve CLI, SDK, and VSCode extension
- Develop evaluation methods: Create novel ways of measuring agent performance
We're an early-stage project with an ambitious roadmap. Your contributions can have massive impact.
Check out our contribution guide and open issues to get started.
- Issues: Report bugs or suggest features on GitHub
FluxLoop is licensed under the Apache License 2.0.
