Open-source toolkit for running reproducible, offline-first simulations of AI agents against dynamic scenarios

Fluxloop-AI/fluxloop

FluxLoop OSS

Ship Agents with Data. Scale Business.

🎯 Simulate at Scale

Run thousands of realistic multi-turn scenarios in parallel. Find edge cases before production.

📊 Align to Your Standards

Capture your implicit decision criteria. Turn intuition into automated evaluation.

🚀 Act on Insights

Reports that show what to fix and how. Analysis that drives action.


Simulate, Evaluate, and Trust Your AI Agents

FluxLoop is an open-source toolkit for running reproducible, offline-first simulations of AI agents against dynamic scenarios. It empowers developers to rigorously test agent behavior, evaluate performance against custom criteria, and build confidence before shipping to production.

Core Philosophy

  • Easy to Use: Get started quickly with MCP integration and Flux Agent (Beta) for guided setup
  • Local-first: Run experiments on your machine with full control and reproducibility
  • Framework-agnostic: Works with any agent framework (LangGraph, LangChain, custom)
  • Evaluation-first: Make evaluation the core of the workflow, with rigorous, offline-first testing

Stop guessing, start evaluating.


Key Features

🤖 AI-Assisted Setup

Get started quickly with MCP Server integration that automatically detects your agent framework and guides you through the setup process. No more manual configuration or guesswork.

🎯 Simple Decorator-Based Instrumentation

Instrument existing agent code with minimal changes: just add @fluxloop.agent() and you're tracing. Works with any Python-based agent framework.
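
To illustrate what decorator-based instrumentation does in general, here is a minimal sketch. It does not use the FluxLoop SDK: `trace_agent`, `TRACES`, and `support_agent` are hypothetical names invented for this example, and the real `@fluxloop.agent()` decorator may record different fields and store them differently.

```python
import functools
import time
from typing import Any, Callable, Optional

# Hypothetical in-memory trace store; a real SDK would persist or export traces.
TRACES = []


def trace_agent(name: Optional[str] = None) -> Callable:
    """Stand-in decorator factory, applied with parentheses like @fluxloop.agent()."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            started = time.perf_counter()
            record = {"agent": name or func.__name__, "args": args, "kwargs": kwargs}
            try:
                record["output"] = func(*args, **kwargs)
                return record["output"]
            except Exception as exc:
                # Record the failure, then let the caller see the original error.
                record["error"] = repr(exc)
                raise
            finally:
                record["duration_s"] = time.perf_counter() - started
                TRACES.append(record)

        return wrapper

    return decorator


@trace_agent()
def support_agent(message: str) -> str:
    # Placeholder agent logic; a real agent would call an LLM or tools here.
    return f"You said: {message}"


reply = support_agent("Where is my order?")
```

The agent function itself is unchanged; only the decorator line is added, which is what "minimal changes" means in practice.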

📊 Evaluation-First Testing

Test your agents with reproducible experiments and structured evaluation. Define rule-based and LLM-based evaluators, set success criteria, analyze trends and outliers, and generate reports from customizable templates.
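
As a rough picture of rule-based evaluators combined with a success criterion, consider the sketch below. It is plain Python and not FluxLoop's evaluator API: the `Rule` class, the rule names, the sample outputs, and the 0.8 threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A named pass/fail check applied to one agent output (hypothetical)."""
    name: str
    check: Callable[[str], bool]


def evaluate(output: str, rules: List[Rule]) -> Dict[str, bool]:
    # Apply every rule to a single output and report each verdict by name.
    return {rule.name: rule.check(output) for rule in rules}


def pass_rate(results: List[Dict[str, bool]]) -> float:
    # An output passes only if all of its rules pass.
    passed = sum(all(r.values()) for r in results)
    return passed / len(results) if results else 0.0


RULES = [
    Rule("non_empty", lambda text: bool(text.strip())),
    Rule("under_200_chars", lambda text: len(text) <= 200),
    Rule("no_apology_loop", lambda text: text.lower().count("sorry") <= 1),
]

outputs = [
    "Your order shipped yesterday.",
    "Sorry, sorry, sorry, I cannot help.",
    "",
]

results = [evaluate(o, RULES) for o in outputs]
rate = pass_rate(results)

# Success criterion for the whole experiment (threshold chosen arbitrarily).
SUCCESS_THRESHOLD = 0.8
experiment_passed = rate >= SUCCESS_THRESHOLD
```

Keeping each rule as a small named function makes failures easy to attribute: the per-output dictionaries show which rule failed, while the pass rate gives a single number to compare across runs.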

🧪 Offline-First Simulation

Run experiments on your machine without cloud dependencies. Full control over your testing environment with reproducible, auditable results.

🔌 VSCode/Cursor Extension

Visual project management for your IDE. Browse projects, run experiments with one click, parse results into Markdown timelines, and explore outputs in a structured tree, all without leaving your editor.

Available on VS Code Marketplace and Open VSX (Cursor).

🚀 Powerful CLI

Complete command-line control for advanced workflows. Initialize projects, generate test inputs with an LLM, run batch experiments, and parse results, all from your terminal.


📦 Packages

FluxLoop consists of multiple integrated packages that work together to provide a complete AI agent testing solution:

1. SDK (Python 3.11+)

Core instrumentation library for tracing and recording agent execution.

Add the @fluxloop.agent() decorator to your code to automatically capture traces, observations, and execution context. Supports async, streaming, and complex agent frameworks.

📖 Documentation: https://docs.fluxloop.ai/sdk/
📦 PyPI: fluxloop

2. CLI

Command-line orchestration tool for managing experiments end-to-end.

Initialize projects, generate test inputs with an LLM, run batch simulations, and parse results into human-readable formats, all from your terminal.

  • New: Pytest Bridge (fluxloop init pytest-template) produces a ready-to-run smoke test wired to the fluxloop_runner fixture; see docs/guides/pytest_bridge.md for examples and the sample GitHub Actions workflow.

📖 Documentation: https://docs.fluxloop.ai/cli/
📦 PyPI: fluxloop-cli

3. VSCode Extension

Visual project management for Cursor and VS Code.

Browse projects, run experiments with one click, parse results into Markdown timelines, and explore outputs in a structured tree view, all without leaving your IDE.

📖 Documentation: https://docs.fluxloop.ai/vscode/
🛒 Marketplaces: VS Code | Open VSX (Cursor)

4. MCP Server (Python 3.11+)

AI-assisted integration guidance via Model Context Protocol.

Automatically detect your agent framework, suggest integration patterns, and provide context-aware help for setting up FluxLoop in your codebase.

📖 Documentation: https://docs.fluxloop.ai/mcp/
📦 PyPI: fluxloop-mcp

5. Flux Agent (Beta)

AI-powered integration assistant built into the VSCode extension.

Flux Agent analyzes your code, consults FluxLoop documentation via MCP, and generates integration suggestions using an LLM. It combines repository analysis with OpenAI models to provide contextualized, framework-specific guidance without making automatic changes. You review and apply suggestions manually.

📖 Documentation: https://docs.fluxloop.ai/vscode/integration-assistant/overview
✨ Features:

  • Repository analysis and framework detection
  • AI-generated code integration suggestions
  • Knowledge search with citation-backed answers
  • Manual review and application workflow

🔜 Status: Beta (available in VSCode extension v0.1.3+)


Getting Started

Installation

# Install Python packages (SDK and MCP require Python 3.11+)
pip install fluxloop fluxloop-cli fluxloop-mcp

# Install VSCode/Cursor Extension
# Search "FluxLoop" in Extensions marketplace

📖 Installation Guides: SDK | CLI | VSCode | MCP

Quick Workflow

# 1. Create a project
fluxloop init project --name my-agent

# 2. Add @fluxloop.agent() decorator to your code

# 3. Generate test inputs
fluxloop generate inputs --limit 50

# 4. Run experiment
fluxloop run experiment

# 5. Parse results
fluxloop parse experiment experiments/<experiment_dir>

📖 Complete Tutorial: End-to-End Workflow Guide
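
If you want to script the CLI steps above, for example from a CI job, one possible approach is a small Python driver. The sketch below only uses the four commands listed in the Quick Workflow; the helper names `workflow_commands` and `run_workflow` are hypothetical, the experiment directory is left as the same placeholder, and it assumes the commands are run from the appropriate working directory. Step 2 (adding the decorator) remains a manual edit.

```python
import shutil
import subprocess
from typing import List


def workflow_commands(project: str, limit: int, experiment_dir: str) -> List[List[str]]:
    """Build the Quick Workflow CLI steps as argument lists (no shell quoting needed)."""
    return [
        ["fluxloop", "init", "project", "--name", project],
        ["fluxloop", "generate", "inputs", "--limit", str(limit)],
        ["fluxloop", "run", "experiment"],
        ["fluxloop", "parse", "experiment", experiment_dir],
    ]


def run_workflow(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    if not dry_run and shutil.which("fluxloop") is None:
        raise RuntimeError("fluxloop CLI not found on PATH; install fluxloop-cli first")
    for cmd in commands:
        print("$", " ".join(cmd))
        if not dry_run:
            # check=True stops the workflow at the first failing step.
            subprocess.run(cmd, check=True)


# The experiment directory is only known after a run, so the placeholder is kept.
commands = workflow_commands("my-agent", 50, "experiments/<experiment_dir>")
run_workflow(commands, dry_run=True)
```

The dry run is the default so the script can be inspected safely; pass `dry_run=False` once the placeholder has been replaced with a real experiment directory.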

What You Can Do

  • 🎯 Instrument Agents: Add decorators to trace execution
  • 📝 Generate Inputs: Create test scenarios with an LLM or deterministic strategies
  • 🧪 Run Simulations: Execute batch experiments with different configurations
  • 💬 Multi-Turn Conversations: Automatically extend single-input experiments into multi-turn dialogues with an AI supervisor
  • 📊 Analyze Results: Parse structured outputs into human-readable timelines
  • 🔴 Record & Replay: Capture complex arguments and replay them (advanced)
  • 🧠 AI-Assisted Setup: Use the MCP server and Flux Agent for guided integration
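
To make "deterministic strategies" for input generation concrete, here is a minimal sketch of one such strategy: enumerating persona and intent combinations and sampling them with a fixed seed. This is not how FluxLoop generates inputs; `PERSONAS`, `INTENTS`, and `generate_inputs` are hypothetical names used only for illustration.

```python
import itertools
import random
from typing import Dict, List

# Hypothetical building blocks for test scenarios.
PERSONAS = ["new customer", "frustrated customer", "power user"]
INTENTS = ["track an order", "request a refund", "change a password"]


def generate_inputs(limit: int, seed: int = 0) -> List[Dict[str, str]]:
    """Return up to `limit` inputs; the same seed always yields the same list."""
    combos = list(itertools.product(PERSONAS, INTENTS))
    # A private Random instance keeps the result independent of global RNG state.
    rng = random.Random(seed)
    rng.shuffle(combos)
    return [
        {"persona": persona, "input": f"As a {persona}, I want to {intent}."}
        for persona, intent in combos[:limit]
    ]


first = generate_inputs(limit=5, seed=42)
second = generate_inputs(limit=5, seed=42)
```

Because the seed fully determines the selection, rerunning an experiment with the same seed exercises the agent on identical inputs, which is what makes results comparable across runs.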

๐Ÿค Why Contribute?

Building trustworthy AI requires a community dedicated to rigorous, transparent evaluation. FluxLoop provides the foundational tooling, but there's much more to do:

  • Shape the standard: Define the open contract for AI agent simulation data
  • Build integrations: Create adapters for popular frameworks (LangChain, LlamaIndex, CrewAI)
  • Enhance developer experience: Improve CLI, SDK, and VSCode extension
  • Develop evaluation methods: Create novel ways of measuring agent performance

We're an early-stage project with an ambitious roadmap, so individual contributions can shape the project's direction.

Check out our contribution guide and open issues to get started.


🚨 Community & Support

  • Issues: Report bugs or suggest features on GitHub

📄 License

FluxLoop is licensed under the Apache License 2.0.
