Skip to content

Latest commit

 

History

History
1409 lines (1221 loc) · 30.3 KB

File metadata and controls

1409 lines (1221 loc) · 30.3 KB

ChatMind API Documentation

Modern REST API for exploring your rich ChatGPT semantic knowledge graph.

The ChatMind API provides powerful endpoints for discovering insights, exploring semantic connections, and visualizing your processed ChatGPT conversations. Built with FastAPI for automatic documentation and validation, leveraging your rich Neo4j database with hybrid Qdrant vector search capabilities.


🚀 Quick Start

Base URL

http://localhost:8000

API Documentation

Health Check

curl http://localhost:8000/api/health

✅ Test Coverage & Reliability

  • All endpoints are covered by automated tests.
  • High pass rate with comprehensive test coverage (see scripts/test_api_endpoints.py).
  • All endpoints return a standard response with data, message, and error keys.
  • Comprehensive endpoint testing with TDD approach.

📋 API Overview

Rich Semantic Knowledge Graph

Your ChatMind database contains:

  • Chats with positioning data and metadata
  • Messages with content and conversation flow
  • Chunks with embeddings (using Sentence Transformers)
  • Embeddings stored in Qdrant for semantic search
  • Tags with semantic classification and normalization
  • Cluster Summaries with positioning and relationships
  • Chat Summaries with metadata and context
  • Similarity relationships between chats and clusters
  • Cross-database linking between Neo4j and Qdrant

Note: Actual counts depend on your processed data volume.

Core Features

  • Discovery & Exploration: Find topics, patterns, and insights
  • Semantic Search: Search by meaning, not just keywords (using embeddings and cosine similarity)
  • Graph Visualization: 2D/3D positioning and relationships
  • Hybrid Search: Combines Neo4j graph queries with Qdrant vector search
  • Interactive Exploration: Real-time graph navigation
  • Vector Similarity: Embeddings for semantic search
  • Turn-level Retrieval (New): Retrieve coherent pairs of user+assistant turns with micro-summaries
  • Cross-Encoder Reranking (New): Re-rank expanded candidates for relevance and diversity

Authentication

Currently, the API runs without authentication for local development. For production deployment, consider adding API key authentication.

Error Handling & Reliability

  • All endpoints include comprehensive error handling
  • Neo4j connection issues are gracefully handled
  • Invalid parameters return appropriate HTTP status codes
  • All DateTime objects are properly serialized to ISO format strings

CORS Configuration

The API is configured to accept requests from:

  • http://localhost:3000
  • http://localhost:3001
  • http://localhost:5173
  • http://127.0.0.1:5173

Recent Improvements

  • Modular Architecture: Clean separation of routes, models, and utilities
  • Test-Driven Development: All endpoints tested and validated
  • Hybrid Database Integration: Seamless Neo4j + Qdrant operations
  • Error Handling: Robust error handling and graceful fallbacks
  • Performance: Optimized database connections and queries
  • Turnization + NEXT Links (New): (:Turn) nodes with [:HAS_TURN] and [:NEXT]
  • Retrieval Endpoint (New): BM25 → Vector → Graph Expansion → Cross-Encoder Rerank → Pack

📊 Data Models

ApiResponse

{
  "data": "any",
  "message": "string",
  "error": "string (optional)"
}

GraphNode

{
  "id": "string",
  "type": "Chat|Message|Chunk|Topic|Tag",
  "properties": {
    "title": "string",
    "content": "string",
    "tags": ["string"],
    "domain": "string",
    "sentiment": "string",
    "complexity": "string",
    "create_time": "string (ISO format)",
    "timestamp": "number"
  },
  "position": {
    "x": "number",
    "y": "number"
  }
}

GraphEdge

{
  "source": "string",
  "target": "string", 
  "type": "CONTAINS|HAS_CHUNK|SUMMARIZES|TAGS|SIMILAR_TO|HAS_TOPIC",
  "properties": {
    "score": "number",
    "weight": "number",
    "similarity": "number"
  }
}

SearchResult

{
  "chunk_id": "string",
  "content": "string",
  "message_id": "string",
  "chat_id": "string",
  "similarity_score": "number (optional)"
}

Conversation

{
  "chat_id": "string",
  "title": "string",
  "create_time": "number",
  "message_count": "number",
  "position_x": "number",
  "position_y": "number"
}

Message

{
  "message_id": "string",
  "content": "string",
  "role": "user|assistant",
  "timestamp": "number",
  "chat_id": "string"
}

Turn (New)

{
  "turn_uid": "string", 
  "chat_id": "string",
  "turn_id": "number",
  "ts": "string|number",
  "summary": "string",
  "roles": ["user", "assistant"],
  "prev_turn_ids": ["number"]
}

RetrievalPack (New)

{
  "chat_id": "string",
  "turns": [
    {
      "turn_uid": "string",
      "turn_id": "number",
      "summary": "string",
      "rerank_score": "number"
    }
  ]
}

RetrievalResponse (New)

{
  "data": {
    "query": "string",
    "packs": [
      {
        "chat_id": "string",
        "turns": [
          {
            "turn_uid": "string",
            "turn_id": "number",
            "summary": "string",
            "rerank_score": "number"
          }
        ]
      }
    ]
  },
  "message": "string",
  "error": null
}

🔗 API Endpoints

1. Health Check

GET /api/health

Check API health and connectivity.

Response:

{
  "data": {
    "status": "healthy",
    "neo4j": "connected"
  },
  "message": "All services operational",
  "error": null
}

Example:

curl http://localhost:8000/api/health

2. Core Data Access APIs

GET /api/stats/dashboard

Get dashboard statistics and overview.

Response:

{
  "data": {
    "chats": 1714,
    "messages": 32565,
    "chunks": 47575,
    "tags": 23989,
    "clusters": 150
  },
  "message": "Dashboard statistics retrieved successfully",
  "error": null
}

GET /api/chats

Get all chats with metadata.

Query Parameters:

  • limit (int, optional): Maximum results (default: 50, max: 200)

Response:

{
  "data": [
    {
      "id": "chat_abc123",
      "title": "Python Web Development",
      "timestamp": "2025-01-15T10:30:00Z",
      "message_count": 25
    }
  ],
  "message": "Found 50 chats",
  "error": null
}

GET /api/chats/{chat_id}/messages

Get messages for a specific chat.

Path Parameters:

  • chat_id (string, required): Chat ID

Query Parameters:

  • limit (int, optional): Maximum results (default: 50, max: 200)

Response:

{
  "data": [
    {
      "id": "msg_123",
      "content": "How do I build a web app with Python?",
      "role": "user",
      "timestamp": "2025-01-15T10:30:00Z"
    }
  ],
  "message": "Found 25 messages for chat chat_abc123",
  "error": null
}

GET /api/topics

Get all topics/clusters.

Query Parameters:

  • limit (int, optional): Maximum results (default: 50, max: 200)

Response:

{
  "data": [
    {
      "topic_id": "cluster_45",
      "name": "Python Web Development",
      "summary": "Discussions about building web applications with Python",
      "size": 25
    }
  ],
  "message": "Found 150 topics",
  "error": null
}

GET /api/chunks

Get semantic chunks.

Query Parameters:

  • limit (int, optional): Maximum results (default: 50, max: 200)
  • cluster_id (string, optional): Filter by cluster ID

Response:

{
  "data": [
    {
      "chunk_id": "abc123_msg_0_chunk_0",
      "content": "Example chunk content",
      "embedding_hash": "hash123",
      "cluster_id": "cluster_45",
      "cluster_name": "Python Web Development"
    }
  ],
  "message": "Found 50 chunks",
  "error": null
}

GET /api/tags

Get all tags with counts.

Response:

{
  "data": [
    {
      "name": "#python",
      "count": 156,
      "category": "programming"
    }
  ],
  "message": "Found 23989 tags",
  "error": null
}

GET /api/clusters/{cluster_id}

Get details for a specific cluster.

Path Parameters:

  • cluster_id (string, required): Cluster ID

Response:

{
  "data": {
    "cluster_id": "cluster_45",
    "name": "Python Web Development",
    "summary": "Discussions about building web applications with Python",
    "size": 25,
    "chunk_contents": ["content1", "content2"]
  },
  "message": "Cluster details retrieved successfully",
  "error": null
}

3. Search APIs

GET /api/search

Simple search endpoint using Neo4j only.

Query Parameters:

  • query (string, required): Search query
  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [
    {
      "content": "Example message content",
      "message_id": "msg_123"
    }
  ],
  "message": "Found 1 results for 'example'",
  "error": null
}

Example:

curl "http://localhost:8000/api/search?query=python&limit=5"

GET /api/search/semantic

Semantic search using Qdrant vector database.

Query Parameters:

  • query (string, required): Search query
  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [
    {
      "chunk_id": "abc123_msg_0_chunk_0",
      "content": "Example content",
      "message_id": "msg_123",
      "chat_id": "chat_456",
      "similarity_score": 0.95
    }
  ],
  "message": "Found 1 semantic results for 'example'",
  "error": null
}

Features:

  • True Semantic Search: Finds related content even when exact words don't match
  • Cosine Similarity: Uses vector embeddings for semantic comparison
  • Contextual Understanding: "japan" finds "Korea", "Italy", "Mexico" as related countries
  • Programming Context: "python" finds beginner discussions and script references

Example:

curl "http://localhost:8000/api/search/semantic?query=machine learning&limit=5"

GET /api/search/hybrid

Hybrid search combining Neo4j and Qdrant.

Query Parameters:

  • query (string, required): Search query
  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [
    {
      "chunk_id": "abc123_msg_0_chunk_0",
      "content": "Example content",
      "message_id": "msg_123",
      "chat_id": "chat_456",
      "similarity_score": 0.95,
      "source": "qdrant"
    }
  ],
  "message": "Found 1 hybrid results for 'example'",
  "error": null
}

Example:

curl "http://localhost:8000/api/search/hybrid?query=python&limit=5"

GET /api/search/stats

Get search statistics and database status.

Response:

{
  "data": {
    "neo4j_connected": true,
    "qdrant_connected": true,
    "embedding_model_loaded": true,
    "total_conversations": 2,
    "total_messages": 2,
    "total_chunks": 47575
  },
  "message": "Search statistics retrieved",
  "error": null
}

GET /api/search/tags

Search by specific tags.

Query Parameters:

  • tags (string, required): Comma-separated tag list
  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [
    {
      "chunk_id": "abc123_msg_0_chunk_0",
      "content": "Example content",
      "message_id": "msg_123",
      "chat_id": "chat_456",
      "tags": ["#python", "#api"]
    }
  ],
  "message": "Found 0 results for tags: python",
  "error": null
}

GET /api/search/domain/{domain}

Search by domain.

Path Parameters:

  • domain (string, required): Domain to search

Query Parameters:

  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [
    {
      "chunk_id": "abc123_msg_0_chunk_0",
      "content": "Example content",
      "message_id": "msg_123",
      "chat_id": "chat_456"
    }
  ],
  "message": "Found 2 results for domain: technology",
  "error": null
}

GET /api/search/similar/{chunk_id}

Find similar content to a specific chunk.

Path Parameters:

  • chunk_id (string, required): Chunk ID to find similar content for

Query Parameters:

  • limit (int, optional): Maximum results (default: 10, max: 100)

Response:

{
  "data": [],
  "message": "Found 0 similar results for chunk: abc123",
  "error": null
}

GET /api/search/tags/available

Get all available tags.

Response:

{
  "data": [
    {
      "name": "python",
      "count": 156
    }
  ],
  "message": "Found 23989 available tags",
  "error": null
}

POST /api/retrieval/retrieve (New)

Turn-level hybrid retrieval with reranking.

Request Body:

{
  "query": "python web dev",
  "topn": 10,
  "window_tokens": 5000
}

Response:

{
  "data": {
    "query": "python web dev",
    "packs": [
      {
        "chat_id": "chat_abc123",
        "turns": [
          {
            "turn_uid": "chat_abc123:12",
            "turn_id": 12,
            "summary": "We decided X because Y.",
            "rerank_score": 0.84
          }
        ]
      }
    ]
  },
  "message": "Retrieved 10 turns across 4 chats",
  "error": null
}

Pipeline: BM25 (SQLite FTS5 on turn summaries) → Qdrant vector search (chatmind_turns) → NEXT ±1 expansion (Neo4j) → Cross‑encoder rerank → Pack by chat_id.

4. Advanced Search & Discovery APIs

POST /api/search/advanced

Advanced search with filters.

Request Body:

{
  "query": "python web development",
  "filters": {
    "start_date": "2025-01-01",
    "end_date": "2025-01-31",
    "tags": "python,web",
    "domain": "technology",
    "limit": 10
  }
}

Response:

{
  "data": [
    {
      "content": "How do I build a web app with Python?",
      "message_id": "msg_123",
      "role": "user",
      "timestamp": "2025-01-15T10:30:00Z",
      "conversation_title": "Python Web Development"
    }
  ],
  "message": "Found 5 results with advanced search",
  "error": null
}

POST /api/query/neo4j

Execute custom Neo4j queries.

Request Body:

{
  "query": "MATCH (t:Tag) RETURN t.name, t.count ORDER BY t.count DESC LIMIT 5"
}

Response:

{
  "data": [
    {
      "t.name": "#python",
      "t.count": 156
    }
  ],
  "message": "Query executed successfully, returned 5 records",
  "error": null
}

GET /api/connections/explain

Explain why two conversations are connected.

Query Parameters:

  • source_id (string, required): Source chat ID
  • target_id (string, required): Target chat ID

Response:

{
  "data": {
    "source_chat": {
      "id": "chat_abc123",
      "title": "Health Discussion"
    },
    "target_chat": {
      "id": "chat_def456", 
      "title": "Stress Management"
    },
    "shared_tags": ["#stress", "#health"],
    "shared_themes": ["stress management", "wellness"],
    "connection_strength": 0.75,
    "strength_category": "strong",
    "explanation": "Both conversations share themes: #stress, #health"
  },
  "message": "Connection explanation generated successfully",
  "error": null
}

GET /api/search/cross-domain

Find how a topic appears across different domains.

Query Parameters:

  • query (string, required): Search query
  • limit (int, optional): Maximum results per domain (default: 5, max: 20)

Response:

{
  "data": {
    "query": "stress",
    "cross_domain_results": {
      "health": [
        {
          "chat_id": "chat_health_1",
          "chat_title": "Health Discussion",
          "content": "I'm feeling stressed about...",
          "message_id": "msg_123",
          "role": "user",
          "similarity": 0.9
        }
      ],
      "business": [
        {
          "chat_id": "chat_business_1",
          "chat_title": "Work Stress",
          "content": "The workload is causing stress...",
          "message_id": "msg_456",
          "role": "user",
          "similarity": 0.85
        }
      ]
    },
    "domains_searched": ["health", "business", "personal"],
    "total_results": 2
  },
  "message": "Found cross-domain results for 'stress'",
  "error": null
}

GET /api/discover/suggestions

Suggest interesting connections to explore.

Query Parameters:

  • limit (int, optional): Maximum suggestions (default: 5, max: 20)

Response:

{
  "data": [
    {
      "suggestion": "You discussed 'stress management' in health and business contexts",
      "source_domain": "health",
      "target_domain": "business",
      "connection_strength": 0.85,
      "exploration_path": ["health", "stress management", "business"],
      "chat1": {
        "id": "chat_health_1",
        "title": "Health Discussion"
      },
      "chat2": {
        "id": "chat_business_1",
        "title": "Work Stress Management"
      }
    }
  ],
  "message": "Found 1 discovery suggestions",
  "error": null
}

GET /api/timeline/semantic

Get chronological data with semantic connections.

Query Parameters:

  • start_date (string, optional): Start date (YYYY-MM-DD)
  • end_date (string, optional): End date (YYYY-MM-DD)
  • limit (int, optional): Maximum conversations per day (default: 50, max: 200)

Response:

{
  "data": [
    {
      "date": "2025-01-15",
      "conversations": [
        {
          "chat_id": "chat_abc123",
          "title": "Health Discussion",
          "timestamp": "2025-01-15T10:30:00Z",
          "domain": "health",
          "tags": ["#stress", "#health"],
          "semantic_connections": [
            {
              "related_chat": "chat_def456",
              "connection": "similar content",
              "similarity": 0.8
            }
          ]
        }
      ]
    }
  ],
  "message": "Found timeline data for 1 days",
  "error": null
}

POST /api/chats/{chat_id}/summary

Generate a summary for a specific chat.

Path Parameters:

  • chat_id (string, required): Chat ID

Response:

{
  "data": {
    "chat_id": "chat_abc123",
    "title": "Health Discussion",
    "summary": "Chat 'Health Discussion' contains 4 messages covering various topics.",
    "message_count": 4
  },
  "message": "Chat summary generated successfully",
  "error": null
}

GET /api/discover/topics

Discover most discussed topics.

Query Parameters:

  • limit (int, optional): Maximum results (default: 20, max: 100)
  • min_count (int, optional): Minimum chunk count (default: 0)

Response:

{
  "data": [
    {
      "topic_id": "cluster_45",
      "name": "Python Web Development",
      "summary": "Discussions about building web applications with Python",
      "size": 25
    }
  ],
  "message": "Found 20 topics",
  "error": null
}

GET /api/discover/domains

Discover domain distribution.

Response:

{
  "data": [
    {
      "domain": "technology",
      "chat_count": 856
    }
  ],
  "message": "Found 5 domains",
  "error": null
}

GET /api/discover/clusters

Discover semantic clusters.

Query Parameters:

  • limit (int, optional): Maximum results (default: 20, max: 100)
  • min_size (int, optional): Minimum cluster size (default: 0)
  • include_positioning (boolean, optional): Include positioning data (default: true)

Response:

{
  "data": [
    {
      "cluster_id": "cluster_45",
      "name": "Python Web Development",
      "summary": "Discussions about building web applications with Python",
      "size": 25,
      "x": 0.234,
      "y": 0.567
    }
  ],
  "message": "Found 20 clusters",
  "error": null
}

GET /api/conversations/{chat_id}/context

Get conversation context and related content.

Path Parameters:

  • chat_id (string, required): Chat ID

Response:

{
  "data": {
    "title": "Python Web Development",
    "timestamp": "2025-01-15T10:30:00Z",
    "tags": ["#python", "#web"],
    "related_chats": ["chat_456", "chat_789"]
  },
  "message": "Conversation context retrieved successfully",
  "error": null
}

GET /api/search/similar/{chunk_id}

Find similar content for a chunk.

Path Parameters:

  • chunk_id (string, required): Chunk ID

Query Parameters:

  • limit (int, optional): Maximum results (default: 10, max: 50)

Response:

{
  "data": [
    {
      "chunk_id": "def456_msg_1_chunk_0",
      "content": "Similar content about web development",
      "similarity_score": 0.85
    }
  ],
  "message": "Found 5 similar chunks",
  "error": null
}

GET /api/search/domains

Get list of available domains for filtering.

Response:

{
  "data": ["technology", "health", "business", "personal"],
  "message": "Found 4 available domains",
  "error": null
}

5. Analytics APIs

GET /api/analytics/patterns

Get conversation pattern analysis.

Query Parameters:

  • timeframe (string, optional): Timeframe - daily, weekly, monthly (default: daily)
  • include_sentiment (boolean, optional): Include sentiment analysis (default: false)

Response:

{
  "data": {
    "timeframe": "daily",
    "patterns": [
      {
        "date": "2025-01-15",
        "chat_count": 25
      }
    ],
    "total_chats": 1714
  },
  "message": "Pattern analysis for daily timeframe",
  "error": null
}

GET /api/analytics/sentiment

Get basic sentiment overview.

Response:

{
  "data": {
    "total_messages": 32565,
    "note": "Sentiment analysis not implemented yet"
  },
  "message": "Basic message statistics retrieved",
  "error": null
}

6. Graph Exploration APIs

GET /api/graph

Get graph data for visualization.

Response:

{
  "data": {
    "nodes": [
      {
        "id": "chat_abc123",
        "type": "Chat",
        "properties": {
          "title": "Python Web Development",
          "domain": "technology"
        },
        "position": {
          "x": 0.234,
          "y": 0.567
        }
      }
    ],
    "edges": [
      {
        "source": "chat_abc123",
        "target": "cluster_45",
        "type": "SIMILAR_TO",
        "properties": {
          "score": 0.85
        }
      }
    ]
  },
  "message": "Graph data retrieved",
  "error": null
}

GET /api/conversations

Get all conversations.

Response:

{
  "data": [
    {
      "chat_id": "chat_abc123",
      "title": "Python Web Development",
      "create_time": 1732234567.0,
      "message_count": 5,
      "position_x": 0.234,
      "position_y": 0.567
    }
  ],
  "message": "Found 3 conversations",
  "error": null
}

GET /api/conversations/{chat_id}/messages

Get messages for a specific conversation.

Path Parameters:

  • chat_id (string, required): Chat ID

Response:

{
  "data": [
    {
      "message_id": "msg_123",
      "content": "Example message content",
      "role": "user",
      "timestamp": 1732234567.0,
      "chat_id": "chat_abc123"
    }
  ],
  "message": "Found 1 messages for chat: chat_abc123",
  "error": null
}

GET /api/chunks/{chunk_id}

Get details for a specific chunk.

Path Parameters:

  • chunk_id (string, required): Chunk ID

Response:

{
  "data": {
    "chunk_id": "abc123_msg_0_chunk_0",
    "content": "Example chunk content",
    "message_id": "msg_123",
    "chat_id": "chat_456"
  },
  "message": "Chunk details retrieved",
  "error": null
}

GET /api/graph/visualization

Get data for 2D/3D graph visualization.

Query Parameters:

  • node_types (string, optional): Comma-separated node types (Chat,Cluster,Tag)
  • limit (int, optional): Maximum nodes (default: 100)
  • include_edges (boolean, optional): Include relationship data (default: true)
  • filter_domain (string, optional): Filter by domain

Response:

{
  "data": {
    "nodes": [
      {
        "id": "chat_abc123",
        "type": "Chat",
        "properties": {
          "title": "Python Web Development",
          "domain": "technology"
        },
        "position": {
          "x": 0.234,
          "y": 0.567
        }
      }
    ],
    "edges": [
      {
        "source": "chat_abc123",
        "target": "cluster_45",
        "type": "SIMILAR_TO",
        "properties": {
          "score": 0.85
        }
      }
    ]
  },
  "message": "Visualization data retrieved",
  "error": null
}

GET /api/graph/expand/{node_id}

Expand a node to show its connections.

Path Parameters:

  • node_id (string, required): Node ID (chat_id, message_id, or chunk_id)

Response:

{
  "data": {
    "node": {
      "id": "chat_abc123",
      "chat_id": "chat_abc123",
      "message_id": null,
      "chunk_id": null,
      "labels": ["Chat"],
      "properties": {
        "title": "Python Web Development",
        "domain": "technology"
      }
    },
    "connections": [
      {
        "relationship_type": "HAS_MESSAGE",
        "neighbor_labels": ["Message"],
        "neighbor_properties": {
          "content": "How do I build a web app?",
          "role": "user"
        }
      }
    ]
  },
  "message": "Node expansion: 5 connections found",
  "error": null
}

GET /api/graph/neighbors

Get neighbors of a node with similarity filtering.

Query Parameters:

  • node_id (string, required): Node ID
  • limit (int, optional): Maximum results (default: 10, max: 50)
  • min_similarity (float, optional): Minimum similarity threshold (default: 0.0, max: 1.0)

Response:

{
  "data": [
    {
      "neighbor_id": "chat_def456",
      "neighbor_title": "Flask Web Development",
      "similarity": 0.85
    }
  ],
  "message": "Found 5 neighbors for chat_abc123",
  "error": null
}

4. Debug APIs

GET /api/debug/schema

Get database schema information.

Response:

{
  "data": {
    "neo4j_schema": {
      "nodes": [
        {
          "type": "Chat",
          "count": 2,
          "properties": ["chat_id", "title", "create_time", "message_count", "position_x", "position_y"]
        }
      ],
      "relationships": [
        {
          "type": "CONTAINS",
          "count": 2,
          "properties": []
        }
      ]
    },
    "qdrant_schema": {
      "collection_name": "chatmind_embeddings",
      "vector_size": 384,
      "total_points": 47575
    }
  },
  "message": "Schema information retrieved",
  "error": null
}

🔧 Error Handling

Standard Error Response

{
  "data": null,
  "message": "Error description",
  "error": "ERROR_TYPE"
}

Common HTTP Status Codes

  • 200 OK: Request successful
  • 400 Bad Request: Invalid parameters
  • 404 Not Found: Resource not found
  • 500 Internal Server Error: Server error
  • 503 Service Unavailable: Database connection issues

🚀 Development

Starting the API Server

# Using the run script
python run.py

# Direct execution
python main.py

# With uvicorn
python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000

Environment Variables

# Required
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=chatmind123

# Optional
QDRANT_HOST=localhost
QDRANT_PORT=6335
QDRANT_COLLECTION=chatmind_embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
API_HOST=0.0.0.0
API_PORT=8000

Running Tests

# Run comprehensive test suite
python scripts/test_api_endpoints.py --base-url http://localhost:8000
# (Legacy) simple smoke tests
python test_simple_api.py

📊 Integration Examples

Frontend Integration (JavaScript)

// Health check
const healthResponse = await fetch('http://localhost:8000/api/health');
const health = await healthResponse.json();

// Simple search
const searchResponse = await fetch('http://localhost:8000/api/search?query=python&limit=5');
const searchResults = await searchResponse.json();

// Semantic search
const semanticResponse = await fetch('http://localhost:8000/api/search/semantic?query=machine learning&limit=5');
const semanticResults = await semanticResponse.json();

// Graph visualization
const graphResponse = await fetch('http://localhost:8000/api/graph/visualization?limit=100');
const graphData = await graphResponse.json();

// Get conversations
const conversationsResponse = await fetch('http://localhost:8000/api/conversations');
const conversations = await conversationsResponse.json();

Python Integration

import requests

# Health check
response = requests.get('http://localhost:8000/api/health')
health = response.json()

# Hybrid search
response = requests.get('http://localhost:8000/api/search/hybrid?query=python&limit=10')
results = response.json()

# Retrieval
response = requests.post('http://localhost:8000/api/retrieval/retrieve', json={"query": "python web dev", "topn": 5})
retrieval = response.json()

🎯 Architecture

Modular Design

The API follows a clean, modular architecture:

api/
├── main.py                 # FastAPI app setup and startup/shutdown
├── run.py                  # Entrypoint to run the server
├── config.py               # Configuration management
├── routes/                 # Individual route files
│   ├── health.py           # Health checks
│   ├── search.py           # Search endpoints
│   ├── graph.py            # Graph exploration endpoints
│   ├── retrieval.py        # Turn retrieval endpoint (PR2)
│   └── debug.py            # Debug endpoints
├── models/                 # Pydantic models
│   ├── __init__.py
│   └── common.py
├── utils/                  # Utility functions
│   ├── __init__.py
│   └── helpers.py
└── services/               # Business logic
    ├── __init__.py
    └── retrieval.py        # BM25 + vector + expansion + rerank + packing

Database Integration

  • Neo4j: Handles graph relationships, metadata, and complex queries. Includes (:Turn) with [:HAS_TURN] and [:NEXT] per conversation.
  • Qdrant: Handles semantic search with embeddings. Uses chatmind_embeddings (chunks) and chatmind_turns (turn summaries) collections.
  • Hybrid / Retrieval: Combines BM25 prefilter, turn vectors, graph expansion, and cross-encoder reranking.
  • Error Handling: Graceful fallbacks for all database operations.

Performance Features

  • Singleton Pattern: Single instances of database drivers and embedding models
  • Connection Pooling: Efficient database connection management
  • Lazy Loading: Embedding models loaded only when needed
  • Global Connections: Database connections passed via set_global_connections functions

API Version: 4.0.0
Last Updated: Current Development
Maintainer: ChatMind Development Team
Test Status: ✅ 100% Pass Rate (69/69 endpoints)