AliMostafaRadwan/Retail-Analytics-Copilot

Local Retail Analytics AI Agent

A local, privacy-focused AI agent that answers retail analytics questions using RAG over local documents and SQL over a SQLite database.

Features

  • Hybrid Architecture: Combines RAG (for policies, calendars) and SQL (for data aggregation).
  • Local Execution: Runs entirely on your machine using Ollama (Phi-3.5) and local SQLite.
  • Auditable: Provides citations for every answer, linking to DB tables and document chunks.
  • Optimized: Uses DSPy to optimize SQL generation.
  • Resilient: Includes a repair loop to correct SQL errors or formatting issues.

Setup

  1. Prerequisites:

    • Python 3.10+
    • Ollama installed and running.
    • Model pulled: ollama pull phi3.5:3.8b-mini-instruct-q4_K_M
  2. Installation:

    pip install -r requirements.txt
  3. Data Setup:

    • The Northwind database is downloaded to data/northwind.sqlite.
    • Documentation is in docs/.

Usage

Run the agent on the evaluation dataset:

python run_agent_hybrid.py --batch sample_questions_hybrid_eval.jsonl --out outputs_hybrid.jsonl

Architecture

  • Router: Classifies questions as RAG, SQL, or Hybrid.

  • Retriever: BM25 search over markdown documents.
  • SQL Generator: DSPy module to generate SQLite queries, optimized with BootstrapFewShot.
  • Synthesizer: Combines SQL results and retrieved context to answer the question.
  • Repair Loop: Automatically retries on SQL errors or invalid output formats.
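The repair loop can be pictured with a small sketch. This is not the repository's implementation: `run_with_repair`, `fake_generate_sql`, and the toy `orders` table are hypothetical names used only for illustration, and the real agent would call its DSPy SQL generator where the stand-in function is used here. The idea is to execute the generated SQL, and on failure feed the error message back to the generator for another attempt.

```python
import sqlite3
from typing import Callable, Optional


def run_with_repair(
    conn: sqlite3.Connection,
    question: str,
    generate_sql: Callable[[str, Optional[str]], str],
    max_attempts: int = 3,
) -> dict:
    """Execute generated SQL, passing the last error back to the generator on failure."""
    error: Optional[str] = None
    sql = ""
    for attempt in range(1, max_attempts + 1):
        # Hypothetical generator interface: (question, previous_error) -> SQL string.
        sql = generate_sql(question, error)
        try:
            rows = conn.execute(sql).fetchall()
            return {"sql": sql, "rows": rows, "attempts": attempt, "error": None}
        except (sqlite3.Error, sqlite3.Warning) as exc:
            # Keep the message so the next attempt can try to repair the query.
            error = str(exc)
    return {"sql": sql, "rows": [], "attempts": max_attempts, "error": error}


# Toy in-memory database standing in for the Northwind SQLite file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
conn.executemany("INSERT INTO orders (total) VALUES (?)", [(10.0,), (25.5,)])


def fake_generate_sql(question: str, previous_error: Optional[str]) -> str:
    # Stand-in for the language model: wrong table name first, corrected on retry.
    if previous_error is None:
        return "SELECT COUNT(*) FROM orderz"
    return "SELECT COUNT(*) FROM orders"


result = run_with_repair(conn, "How many orders are there?", fake_generate_sql)
print(result)
```

In this toy run the first query fails on the misspelled table name and the second attempt succeeds, so `result["attempts"]` is 2. Capping the number of attempts keeps a persistently failing question from stalling a batch run.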

Optimization

The GenerateSQL DSPy module was optimized using BootstrapFewShot.

  • Metric: SQL Execution Success (checking if the generated SQL runs against the SQLite DB without error).
  • Result: The optimizer produced a compiled module, agent/compiled_sql_module.json, containing the bootstrapped few-shot examples.
  • Performance Note: On local hardware with the Phi-3.5 mini model, SQL generation can be slow because the schema context is large. If the optimized module causes timeouts, the agent falls back to zero-shot dspy.Predict.
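An execution-success metric of the kind described above might look like the following sketch. It follows the `(example, pred, trace=None)` calling convention that DSPy metrics use, but the factory name `make_sql_execution_metric` and the `sql` field on the prediction are assumptions, not names taken from the repository. A `SimpleNamespace` stands in for a DSPy prediction so the snippet runs without DSPy installed.

```python
import sqlite3
from types import SimpleNamespace


def make_sql_execution_metric(db_path: str):
    """Build a metric that returns True when the predicted SQL runs without error."""

    def metric(example, pred, trace=None) -> bool:
        # Assumption: the prediction exposes the generated query as `pred.sql`.
        sql = (getattr(pred, "sql", "") or "").strip()
        if not sql:
            return False
        conn = sqlite3.connect(db_path)
        try:
            conn.execute(sql).fetchall()
            return True
        except (sqlite3.Error, sqlite3.Warning):
            return False
        finally:
            conn.close()

    return metric


# Demo against an empty in-memory database; the real metric would point
# at data/northwind.sqlite.
metric = make_sql_execution_metric(":memory:")
ok = metric(None, SimpleNamespace(sql="SELECT 1"))
bad = metric(None, SimpleNamespace(sql="SELECT * FROM missing_table"))
empty = metric(None, SimpleNamespace(sql=""))
print(ok, bad, empty)
```

A metric like this only checks that the query executes, not that it answers the question, so a syntactically valid but semantically wrong query still counts as a success.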

Known Limitations

  • Inference Speed: Running the Phi-3.5 mini model on CPU with large schema contexts can be slow.
  • Memory: The agent requires significant RAM (approx. 8GB+ free) to run the model and the retrieval index.
  • Accuracy: Retrieval depends on BM25 and might miss semantic nuances. SQL generation is sensitive to schema complexity.
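The BM25 limitation can be seen in a minimal, self-contained scoring sketch. This is a plain-Python version of the common Okapi BM25 formula, not the retriever used in the repository, and the three document strings are invented toy examples. Because BM25 scores exact term overlap, a query phrased with synonyms ("refund window") scores zero against a document about the return policy.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # no lexical overlap, no contribution
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Invented toy documents, standing in for chunks from docs/.
docs = [
    "Return policy: unopened items can be returned within 30 days.",
    "Marketing calendar: the summer promotion runs in June.",
    "KPI definitions: gross margin is revenue minus cost of goods.",
]

lexical_hit = bm25_scores("return policy", docs)
synonym_miss = bm25_scores("refund window", docs)
print(lexical_hit)
print(synonym_miss)
```

The first query ranks the return-policy document highest, while the second gets a score of zero for every document even though it asks about the same topic.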

Assumptions

  • CostOfGoods is approximated as 0.7 * UnitPrice when calculating margins, as the Northwind database lacks cost data.
  • The agent assumes a standard Northwind schema, plus compatibility views (orders, order_items, products, customers) created on top of it.
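As a worked example of the cost approximation, the sketch below computes gross margin over a toy `order_items` table. The column names (`UnitPrice`, `Quantity`, `Discount`) and the treatment of discounts (applied to revenue only, with cost taken on the undiscounted price) are assumptions made for illustration; the repository's actual view definitions and margin formula may differ.

```python
import sqlite3

COST_RATIO = 0.7  # CostOfGoods is approximated as 0.7 * UnitPrice

conn = sqlite3.connect(":memory:")
# Toy stand-in for the order_items compatibility view (assumed columns).
conn.execute(
    "CREATE TABLE order_items (UnitPrice REAL, Quantity INTEGER, Discount REAL)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [(10.0, 2, 0.0), (20.0, 1, 0.1)],
)

revenue, cost = conn.execute(
    """
    SELECT
        SUM(UnitPrice * Quantity * (1 - Discount)) AS revenue,
        SUM(? * UnitPrice * Quantity)              AS cost
    FROM order_items
    """,
    (COST_RATIO,),
).fetchone()

gross_margin = revenue - cost
print(f"revenue={revenue:.2f} cost={cost:.2f} margin={gross_margin:.2f}")
```

With these two rows, revenue is 20 + 18 = 38 and approximated cost is 14 + 14 = 28, giving a margin of about 10. Because the cost is a fixed fraction of the list price, any margin figure the agent reports reflects this approximation rather than real cost data.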
