Modern REST API for exploring your rich ChatGPT semantic knowledge graph.
The ChatMind API provides powerful endpoints for discovering insights, exploring semantic connections, and visualizing your processed ChatGPT conversations. Built with FastAPI for automatic documentation and validation, leveraging your rich Neo4j database with hybrid Qdrant vector search capabilities.
http://localhost:8000
- Interactive Docs: http://localhost:8000/docs (Swagger UI)
- Alternative Docs: http://localhost:8000/redoc (ReDoc)
curl http://localhost:8000/api/health- All endpoints are covered by automated tests.
- High pass rate with comprehensive test coverage (see
scripts/test_api_endpoints.py). - All endpoints return a standard response with
data,message, anderrorkeys. - Comprehensive endpoint testing with TDD approach.
Your ChatMind database contains:
- Chats with positioning data and metadata
- Messages with content and conversation flow
- Chunks with embeddings (using Sentence Transformers)
- Embeddings stored in Qdrant for semantic search
- Tags with semantic classification and normalization
- Cluster Summaries with positioning and relationships
- Chat Summaries with metadata and context
- Similarity relationships between chats and clusters
- Cross-database linking between Neo4j and Qdrant
Note: Actual counts depend on your processed data volume.
- Discovery & Exploration: Find topics, patterns, and insights
- Semantic Search: Search by meaning, not just keywords (using embeddings and cosine similarity)
- Graph Visualization: 2D/3D positioning and relationships
- Hybrid Search: Combines Neo4j graph queries with Qdrant vector search
- Interactive Exploration: Real-time graph navigation
- Vector Similarity: Embeddings for semantic search
- Turn-level Retrieval (New): Retrieve coherent pairs of user+assistant turns with micro-summaries
- Cross-Encoder Reranking (New): Re-rank expanded candidates for relevance and diversity
Currently, the API runs without authentication for local development. For production deployment, consider adding API key authentication.
- All endpoints include comprehensive error handling
- Neo4j connection issues are gracefully handled
- Invalid parameters return appropriate HTTP status codes
- All DateTime objects are properly serialized to ISO format strings
The API is configured to accept requests from:
http://localhost:3000http://localhost:3001http://localhost:5173http://127.0.0.1:5173
- ✅ Modular Architecture: Clean separation of routes, models, and utilities
- ✅ Test-Driven Development: All endpoints tested and validated
- ✅ Hybrid Database Integration: Seamless Neo4j + Qdrant operations
- ✅ Error Handling: Robust error handling and graceful fallbacks
- ✅ Performance: Optimized database connections and queries
- ✅ Turnization + NEXT Links (New):
(:Turn)nodes with[:HAS_TURN]and[:NEXT] - ✅ Retrieval Endpoint (New): BM25 → Vector → Graph Expansion → Cross-Encoder Rerank → Pack
{
"data": "any",
"message": "string",
"error": "string (optional)"
}{
"id": "string",
"type": "Chat|Message|Chunk|Topic|Tag",
"properties": {
"title": "string",
"content": "string",
"tags": ["string"],
"domain": "string",
"sentiment": "string",
"complexity": "string",
"create_time": "string (ISO format)",
"timestamp": "number"
},
"position": {
"x": "number",
"y": "number"
}
}{
"source": "string",
"target": "string",
"type": "CONTAINS|HAS_CHUNK|SUMMARIZES|TAGS|SIMILAR_TO|HAS_TOPIC",
"properties": {
"score": "number",
"weight": "number",
"similarity": "number"
}
}{
"chunk_id": "string",
"content": "string",
"message_id": "string",
"chat_id": "string",
"similarity_score": "number (optional)"
}{
"chat_id": "string",
"title": "string",
"create_time": "number",
"message_count": "number",
"position_x": "number",
"position_y": "number"
}{
"message_id": "string",
"content": "string",
"role": "user|assistant",
"timestamp": "number",
"chat_id": "string"
}{
"turn_uid": "string",
"chat_id": "string",
"turn_id": "number",
"ts": "string|number",
"summary": "string",
"roles": ["user", "assistant"],
"prev_turn_ids": ["number"]
}{
"chat_id": "string",
"turns": [
{
"turn_uid": "string",
"turn_id": "number",
"summary": "string",
"rerank_score": "number"
}
]
}{
"data": {
"query": "string",
"packs": [
{
"chat_id": "string",
"turns": [
{
"turn_uid": "string",
"turn_id": "number",
"summary": "string",
"rerank_score": "number"
}
]
}
]
},
"message": "string",
"error": null
}Check API health and connectivity.
Response:
{
"data": {
"status": "healthy",
"neo4j": "connected"
},
"message": "All services operational",
"error": null
}Example:
curl http://localhost:8000/api/healthGet dashboard statistics and overview.
Response:
{
"data": {
"chats": 1714,
"messages": 32565,
"chunks": 47575,
"tags": 23989,
"clusters": 150
},
"message": "Dashboard statistics retrieved successfully",
"error": null
}Get all chats with metadata.
Query Parameters:
limit(int, optional): Maximum results (default: 50, max: 200)
Response:
{
"data": [
{
"id": "chat_abc123",
"title": "Python Web Development",
"timestamp": "2025-01-15T10:30:00Z",
"message_count": 25
}
],
"message": "Found 50 chats",
"error": null
}Get messages for a specific chat.
Path Parameters:
chat_id(string, required): Chat ID
Query Parameters:
limit(int, optional): Maximum results (default: 50, max: 200)
Response:
{
"data": [
{
"id": "msg_123",
"content": "How do I build a web app with Python?",
"role": "user",
"timestamp": "2025-01-15T10:30:00Z"
}
],
"message": "Found 25 messages for chat chat_abc123",
"error": null
}Get all topics/clusters.
Query Parameters:
limit(int, optional): Maximum results (default: 50, max: 200)
Response:
{
"data": [
{
"topic_id": "cluster_45",
"name": "Python Web Development",
"summary": "Discussions about building web applications with Python",
"size": 25
}
],
"message": "Found 150 topics",
"error": null
}Get semantic chunks.
Query Parameters:
limit(int, optional): Maximum results (default: 50, max: 200)cluster_id(string, optional): Filter by cluster ID
Response:
{
"data": [
{
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example chunk content",
"embedding_hash": "hash123",
"cluster_id": "cluster_45",
"cluster_name": "Python Web Development"
}
],
"message": "Found 50 chunks",
"error": null
}Get all tags with counts.
Response:
{
"data": [
{
"name": "#python",
"count": 156,
"category": "programming"
}
],
"message": "Found 23989 tags",
"error": null
}Get details for a specific cluster.
Path Parameters:
cluster_id(string, required): Cluster ID
Response:
{
"data": {
"cluster_id": "cluster_45",
"name": "Python Web Development",
"summary": "Discussions about building web applications with Python",
"size": 25,
"chunk_contents": ["content1", "content2"]
},
"message": "Cluster details retrieved successfully",
"error": null
}Simple search endpoint using Neo4j only.
Query Parameters:
query(string, required): Search querylimit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [
{
"content": "Example message content",
"message_id": "msg_123"
}
],
"message": "Found 1 results for 'example'",
"error": null
}Example:
curl "http://localhost:8000/api/search?query=python&limit=5"Semantic search using Qdrant vector database.
Query Parameters:
query(string, required): Search querylimit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [
{
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example content",
"message_id": "msg_123",
"chat_id": "chat_456",
"similarity_score": 0.95
}
],
"message": "Found 1 semantic results for 'example'",
"error": null
}Features:
- True Semantic Search: Finds related content even when exact words don't match
- Cosine Similarity: Uses vector embeddings for semantic comparison
- Contextual Understanding: "japan" finds "Korea", "Italy", "Mexico" as related countries
- Programming Context: "python" finds beginner discussions and script references
Example:
curl "http://localhost:8000/api/search/semantic?query=machine learning&limit=5"Hybrid search combining Neo4j and Qdrant.
Query Parameters:
query(string, required): Search querylimit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [
{
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example content",
"message_id": "msg_123",
"chat_id": "chat_456",
"similarity_score": 0.95,
"source": "qdrant"
}
],
"message": "Found 1 hybrid results for 'example'",
"error": null
}Example:
curl "http://localhost:8000/api/search/hybrid?query=python&limit=5"Get search statistics and database status.
Response:
{
"data": {
"neo4j_connected": true,
"qdrant_connected": true,
"embedding_model_loaded": true,
"total_conversations": 2,
"total_messages": 2,
"total_chunks": 47575
},
"message": "Search statistics retrieved",
"error": null
}Search by specific tags.
Query Parameters:
tags(string, required): Comma-separated tag listlimit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [
{
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example content",
"message_id": "msg_123",
"chat_id": "chat_456",
"tags": ["#python", "#api"]
}
],
"message": "Found 0 results for tags: python",
"error": null
}Search by domain.
Path Parameters:
domain(string, required): Domain to search
Query Parameters:
limit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [
{
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example content",
"message_id": "msg_123",
"chat_id": "chat_456"
}
],
"message": "Found 2 results for domain: technology",
"error": null
}Find similar content to a specific chunk.
Path Parameters:
chunk_id(string, required): Chunk ID to find similar content for
Query Parameters:
limit(int, optional): Maximum results (default: 10, max: 100)
Response:
{
"data": [],
"message": "Found 0 similar results for chunk: abc123",
"error": null
}Get all available tags.
Response:
{
"data": [
{
"name": "python",
"count": 156
}
],
"message": "Found 23989 available tags",
"error": null
}Turn-level hybrid retrieval with reranking.
Request Body:
{
"query": "python web dev",
"topn": 10,
"window_tokens": 5000
}Response:
{
"data": {
"query": "python web dev",
"packs": [
{
"chat_id": "chat_abc123",
"turns": [
{
"turn_uid": "chat_abc123:12",
"turn_id": 12,
"summary": "We decided X because Y.",
"rerank_score": 0.84
}
]
}
]
},
"message": "Retrieved 10 turns across 4 chats",
"error": null
}Pipeline: BM25 (SQLite FTS5 on turn summaries) → Qdrant vector search (chatmind_turns) → NEXT ±1 expansion (Neo4j) → Cross‑encoder rerank → Pack by chat_id.
Advanced search with filters.
Request Body:
{
"query": "python web development",
"filters": {
"start_date": "2025-01-01",
"end_date": "2025-01-31",
"tags": "python,web",
"domain": "technology",
"limit": 10
}
}Response:
{
"data": [
{
"content": "How do I build a web app with Python?",
"message_id": "msg_123",
"role": "user",
"timestamp": "2025-01-15T10:30:00Z",
"conversation_title": "Python Web Development"
}
],
"message": "Found 5 results with advanced search",
"error": null
}Execute custom Neo4j queries.
Request Body:
{
"query": "MATCH (t:Tag) RETURN t.name, t.count ORDER BY t.count DESC LIMIT 5"
}Response:
{
"data": [
{
"t.name": "#python",
"t.count": 156
}
],
"message": "Query executed successfully, returned 5 records",
"error": null
}Explain why two conversations are connected.
Query Parameters:
source_id(string, required): Source chat IDtarget_id(string, required): Target chat ID
Response:
{
"data": {
"source_chat": {
"id": "chat_abc123",
"title": "Health Discussion"
},
"target_chat": {
"id": "chat_def456",
"title": "Stress Management"
},
"shared_tags": ["#stress", "#health"],
"shared_themes": ["stress management", "wellness"],
"connection_strength": 0.75,
"strength_category": "strong",
"explanation": "Both conversations share themes: #stress, #health"
},
"message": "Connection explanation generated successfully",
"error": null
}Find how a topic appears across different domains.
Query Parameters:
query(string, required): Search querylimit(int, optional): Maximum results per domain (default: 5, max: 20)
Response:
{
"data": {
"query": "stress",
"cross_domain_results": {
"health": [
{
"chat_id": "chat_health_1",
"chat_title": "Health Discussion",
"content": "I'm feeling stressed about...",
"message_id": "msg_123",
"role": "user",
"similarity": 0.9
}
],
"business": [
{
"chat_id": "chat_business_1",
"chat_title": "Work Stress",
"content": "The workload is causing stress...",
"message_id": "msg_456",
"role": "user",
"similarity": 0.85
}
]
},
"domains_searched": ["health", "business", "personal"],
"total_results": 2
},
"message": "Found cross-domain results for 'stress'",
"error": null
}Suggest interesting connections to explore.
Query Parameters:
limit(int, optional): Maximum suggestions (default: 5, max: 20)
Response:
{
"data": [
{
"suggestion": "You discussed 'stress management' in health and business contexts",
"source_domain": "health",
"target_domain": "business",
"connection_strength": 0.85,
"exploration_path": ["health", "stress management", "business"],
"chat1": {
"id": "chat_health_1",
"title": "Health Discussion"
},
"chat2": {
"id": "chat_business_1",
"title": "Work Stress Management"
}
}
],
"message": "Found 1 discovery suggestions",
"error": null
}Get chronological data with semantic connections.
Query Parameters:
start_date(string, optional): Start date (YYYY-MM-DD)end_date(string, optional): End date (YYYY-MM-DD)limit(int, optional): Maximum conversations per day (default: 50, max: 200)
Response:
{
"data": [
{
"date": "2025-01-15",
"conversations": [
{
"chat_id": "chat_abc123",
"title": "Health Discussion",
"timestamp": "2025-01-15T10:30:00Z",
"domain": "health",
"tags": ["#stress", "#health"],
"semantic_connections": [
{
"related_chat": "chat_def456",
"connection": "similar content",
"similarity": 0.8
}
]
}
]
}
],
"message": "Found timeline data for 1 days",
"error": null
}Generate a summary for a specific chat.
Path Parameters:
chat_id(string, required): Chat ID
Response:
{
"data": {
"chat_id": "chat_abc123",
"title": "Health Discussion",
"summary": "Chat 'Health Discussion' contains 4 messages covering various topics.",
"message_count": 4
},
"message": "Chat summary generated successfully",
"error": null
}Discover most discussed topics.
Query Parameters:
limit(int, optional): Maximum results (default: 20, max: 100)min_count(int, optional): Minimum chunk count (default: 0)
Response:
{
"data": [
{
"topic_id": "cluster_45",
"name": "Python Web Development",
"summary": "Discussions about building web applications with Python",
"size": 25
}
],
"message": "Found 20 topics",
"error": null
}Discover domain distribution.
Response:
{
"data": [
{
"domain": "technology",
"chat_count": 856
}
],
"message": "Found 5 domains",
"error": null
}Discover semantic clusters.
Query Parameters:
limit(int, optional): Maximum results (default: 20, max: 100)min_size(int, optional): Minimum cluster size (default: 0)include_positioning(boolean, optional): Include positioning data (default: true)
Response:
{
"data": [
{
"cluster_id": "cluster_45",
"name": "Python Web Development",
"summary": "Discussions about building web applications with Python",
"size": 25,
"x": 0.234,
"y": 0.567
}
],
"message": "Found 20 clusters",
"error": null
}Get conversation context and related content.
Path Parameters:
chat_id(string, required): Chat ID
Response:
{
"data": {
"title": "Python Web Development",
"timestamp": "2025-01-15T10:30:00Z",
"tags": ["#python", "#web"],
"related_chats": ["chat_456", "chat_789"]
},
"message": "Conversation context retrieved successfully",
"error": null
}Find similar content for a chunk.
Path Parameters:
chunk_id(string, required): Chunk ID
Query Parameters:
limit(int, optional): Maximum results (default: 10, max: 50)
Response:
{
"data": [
{
"chunk_id": "def456_msg_1_chunk_0",
"content": "Similar content about web development",
"similarity_score": 0.85
}
],
"message": "Found 5 similar chunks",
"error": null
}Get list of available domains for filtering.
Response:
{
"data": ["technology", "health", "business", "personal"],
"message": "Found 4 available domains",
"error": null
}Get conversation pattern analysis.
Query Parameters:
timeframe(string, optional): Timeframe - daily, weekly, monthly (default: daily)include_sentiment(boolean, optional): Include sentiment analysis (default: false)
Response:
{
"data": {
"timeframe": "daily",
"patterns": [
{
"date": "2025-01-15",
"chat_count": 25
}
],
"total_chats": 1714
},
"message": "Pattern analysis for daily timeframe",
"error": null
}Get basic sentiment overview.
Response:
{
"data": {
"total_messages": 32565,
"note": "Sentiment analysis not implemented yet"
},
"message": "Basic message statistics retrieved",
"error": null
}Get graph data for visualization.
Response:
{
"data": {
"nodes": [
{
"id": "chat_abc123",
"type": "Chat",
"properties": {
"title": "Python Web Development",
"domain": "technology"
},
"position": {
"x": 0.234,
"y": 0.567
}
}
],
"edges": [
{
"source": "chat_abc123",
"target": "cluster_45",
"type": "SIMILAR_TO",
"properties": {
"score": 0.85
}
}
]
},
"message": "Graph data retrieved",
"error": null
}Get all conversations.
Response:
{
"data": [
{
"chat_id": "chat_abc123",
"title": "Python Web Development",
"create_time": 1732234567.0,
"message_count": 5,
"position_x": 0.234,
"position_y": 0.567
}
],
"message": "Found 3 conversations",
"error": null
}Get messages for a specific conversation.
Path Parameters:
chat_id(string, required): Chat ID
Response:
{
"data": [
{
"message_id": "msg_123",
"content": "Example message content",
"role": "user",
"timestamp": 1732234567.0,
"chat_id": "chat_abc123"
}
],
"message": "Found 1 messages for chat: chat_abc123",
"error": null
}Get details for a specific chunk.
Path Parameters:
chunk_id(string, required): Chunk ID
Response:
{
"data": {
"chunk_id": "abc123_msg_0_chunk_0",
"content": "Example chunk content",
"message_id": "msg_123",
"chat_id": "chat_456"
},
"message": "Chunk details retrieved",
"error": null
}Get data for 2D/3D graph visualization.
Query Parameters:
node_types(string, optional): Comma-separated node types (Chat,Cluster,Tag)limit(int, optional): Maximum nodes (default: 100)include_edges(boolean, optional): Include relationship data (default: true)filter_domain(string, optional): Filter by domain
Response:
{
"data": {
"nodes": [
{
"id": "chat_abc123",
"type": "Chat",
"properties": {
"title": "Python Web Development",
"domain": "technology"
},
"position": {
"x": 0.234,
"y": 0.567
}
}
],
"edges": [
{
"source": "chat_abc123",
"target": "cluster_45",
"type": "SIMILAR_TO",
"properties": {
"score": 0.85
}
}
]
},
"message": "Visualization data retrieved",
"error": null
}Expand a node to show its connections.
Path Parameters:
node_id(string, required): Node ID (chat_id, message_id, or chunk_id)
Response:
{
"data": {
"node": {
"id": "chat_abc123",
"chat_id": "chat_abc123",
"message_id": null,
"chunk_id": null,
"labels": ["Chat"],
"properties": {
"title": "Python Web Development",
"domain": "technology"
}
},
"connections": [
{
"relationship_type": "HAS_MESSAGE",
"neighbor_labels": ["Message"],
"neighbor_properties": {
"content": "How do I build a web app?",
"role": "user"
}
}
]
},
"message": "Node expansion: 5 connections found",
"error": null
}Get neighbors of a node with similarity filtering.
Query Parameters:
node_id(string, required): Node IDlimit(int, optional): Maximum results (default: 10, max: 50)min_similarity(float, optional): Minimum similarity threshold (default: 0.0, max: 1.0)
Response:
{
"data": [
{
"neighbor_id": "chat_def456",
"neighbor_title": "Flask Web Development",
"similarity": 0.85
}
],
"message": "Found 5 neighbors for chat_abc123",
"error": null
}Get database schema information.
Response:
{
"data": {
"neo4j_schema": {
"nodes": [
{
"type": "Chat",
"count": 2,
"properties": ["chat_id", "title", "create_time", "message_count", "position_x", "position_y"]
}
],
"relationships": [
{
"type": "CONTAINS",
"count": 2,
"properties": []
}
]
},
"qdrant_schema": {
"collection_name": "chatmind_embeddings",
"vector_size": 384,
"total_points": 47575
}
},
"message": "Schema information retrieved",
"error": null
}{
"data": null,
"message": "Error description",
"error": "ERROR_TYPE"
}200 OK: Request successful400 Bad Request: Invalid parameters404 Not Found: Resource not found500 Internal Server Error: Server error503 Service Unavailable: Database connection issues
# Using the run script
python run.py
# Direct execution
python main.py
# With uvicorn
python -m uvicorn main:app --reload --host 0.0.0.0 --port 8000# Required
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=chatmind123
# Optional
QDRANT_HOST=localhost
QDRANT_PORT=6335
QDRANT_COLLECTION=chatmind_embeddings
EMBEDDING_MODEL=all-MiniLM-L6-v2
API_HOST=0.0.0.0
API_PORT=8000# Run comprehensive test suite
python scripts/test_api_endpoints.py --base-url http://localhost:8000
# (Legacy) simple smoke tests
python test_simple_api.py// Health check
const healthResponse = await fetch('http://localhost:8000/api/health');
const health = await healthResponse.json();
// Simple search
const searchResponse = await fetch('http://localhost:8000/api/search?query=python&limit=5');
const searchResults = await searchResponse.json();
// Semantic search
const semanticResponse = await fetch('http://localhost:8000/api/search/semantic?query=machine learning&limit=5');
const semanticResults = await semanticResponse.json();
// Graph visualization
const graphResponse = await fetch('http://localhost:8000/api/graph/visualization?limit=100');
const graphData = await graphResponse.json();
// Get conversations
const conversationsResponse = await fetch('http://localhost:8000/api/conversations');
const conversations = await conversationsResponse.json();import requests
# Health check
response = requests.get('http://localhost:8000/api/health')
health = response.json()
# Hybrid search
response = requests.get('http://localhost:8000/api/search/hybrid?query=python&limit=10')
results = response.json()
# Retrieval
response = requests.post('http://localhost:8000/api/retrieval/retrieve', json={"query": "python web dev", "topn": 5})
retrieval = response.json()The API follows a clean, modular architecture:
api/
├── main.py # FastAPI app setup and startup/shutdown
├── run.py # Entrypoint to run the server
├── config.py # Configuration management
├── routes/ # Individual route files
│ ├── health.py # Health checks
│ ├── search.py # Search endpoints
│ ├── graph.py # Graph exploration endpoints
│ ├── retrieval.py # Turn retrieval endpoint (PR2)
│ └── debug.py # Debug endpoints
├── models/ # Pydantic models
│ ├── __init__.py
│ └── common.py
├── utils/ # Utility functions
│ ├── __init__.py
│ └── helpers.py
└── services/ # Business logic
├── __init__.py
└── retrieval.py # BM25 + vector + expansion + rerank + packing
- Neo4j: Handles graph relationships, metadata, and complex queries. Includes
(:Turn)with[:HAS_TURN]and[:NEXT]per conversation. - Qdrant: Handles semantic search with embeddings. Uses
chatmind_embeddings(chunks) andchatmind_turns(turn summaries) collections. - Hybrid / Retrieval: Combines BM25 prefilter, turn vectors, graph expansion, and cross-encoder reranking.
- Error Handling: Graceful fallbacks for all database operations.
- Singleton Pattern: Single instances of database drivers and embedding models
- Connection Pooling: Efficient database connection management
- Lazy Loading: Embedding models loaded only when needed
- Global Connections: Database connections passed via
set_global_connectionsfunctions
API Version: 4.0.0
Last Updated: Current Development
Maintainer: ChatMind Development Team
Test Status: ✅ 100% Pass Rate (69/69 endpoints)