

Henry edited this page Aug 23, 2025 · 1 revision

Performance Optimization

Guide to optimizing MCP Memory Service for maximum performance and scalability.


Quick Wins

1. Choose the Right Backend

# SQLite-vec (Recommended for single-client)
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
# Average read time: ~5ms

# ChromaDB (For multi-client)
export MCP_MEMORY_STORAGE_BACKEND=chroma
# Average read time: ~15ms

# Cloudflare (For distributed/production)
export MCP_MEMORY_STORAGE_BACKEND=cloudflare
# Network dependent

2. Enable HTTP/HTTPS for Better Performance

export MCP_HTTP_ENABLED=true
export MCP_HTTPS_ENABLED=true
export MCP_HTTP_PORT=8443

3. Use Batch Operations

# ❌ Slow: Individual operations
for memory in memories:
    await store_memory(memory)

# ✅ Fast: Batch operation
await store_memories_batch(memories)

Database Optimization

SQLite-vec Configuration

# Optimize SQLite settings
export SQLITE_PRAGMA_CACHE_SIZE=10000
export SQLITE_PRAGMA_SYNCHRONOUS=NORMAL
export SQLITE_PRAGMA_WAL_AUTOCHECKPOINT=1000
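If your build does not pick up these environment variables, the same pragmas can be applied directly on the connection. A minimal sketch using Python's stdlib sqlite3 module (the helper name is ours, not part of the service's API):

```python
import sqlite3

def tune_connection(db_path: str) -> sqlite3.Connection:
    """Open a connection with the pragmas recommended above applied."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA cache_size = 10000")         # pages kept in memory
    conn.execute("PRAGMA synchronous = NORMAL")       # fsync less aggressively
    conn.execute("PRAGMA journal_mode = WAL")         # needed for the autocheckpoint setting
    conn.execute("PRAGMA wal_autocheckpoint = 1000")  # checkpoint every 1000 WAL pages
    return conn
```

Pragmas apply per connection (except `journal_mode`, which persists in the database file), so run this on every connection you open.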

ChromaDB Configuration

# Optimize ChromaDB settings
chroma_settings = {
    "anonymized_telemetry": False,
    "allow_reset": False,
    "is_persistent": True,
    "persist_directory": "/path/to/chroma_db"
}

Database Maintenance

# SQLite maintenance (weekly)
sqlite3 memory.db "VACUUM;"
sqlite3 memory.db "REINDEX;"
sqlite3 memory.db "ANALYZE;"

# Check database size
sqlite3 memory.db "SELECT page_count * page_size as size FROM pragma_page_count(), pragma_page_size();"

Query Performance

Optimize Search Queries

Use Specific Keywords

# ❌ Slow: Vague search
results = await search("thing")

# ✅ Fast: Specific search
results = await search("authentication JWT token")

Limit Results Appropriately

# For quick browsing
results = await search(query, limit=10)

# For existence check
exists = len(await search(query, limit=1)) > 0

# For comprehensive analysis
results = await search(query, limit=100)

Combine Search Types

# Most efficient: Tag search first (indexed)
tagged = await search_by_tag(["python", "error"])

# Then refine with text search
refined = await search("authentication", memories=tagged)

Index Optimization

Tag Indexing

-- Ensure tag indexes exist
CREATE INDEX IF NOT EXISTS idx_memory_tags ON memories(tags);
CREATE INDEX IF NOT EXISTS idx_memory_created_at ON memories(created_at);
CREATE INDEX IF NOT EXISTS idx_memory_content_hash ON memories(content_hash);

Content Indexing

# Use full-text search when available
results = await search_fts("authentication error python")

# Fall back to semantic search for complex queries
results = await search_semantic("how to fix JWT timeout issues")

Memory Management

Memory Usage Patterns

Efficient Memory Allocation

# ❌ Memory intensive
all_memories = await get_all_memories()
filtered = [m for m in all_memories if condition(m)]

# ✅ Stream processing
async for memory in stream_memories():
    if condition(memory):
        yield memory

Cache Management

# Configure embedding cache
EMBEDDING_CACHE_SIZE = 1000  # Number of embeddings to cache
EMBEDDING_CACHE_TTL = 3600   # Cache TTL in seconds

# Query result caching
QUERY_CACHE_SIZE = 100       # Number of query results to cache
QUERY_CACHE_TTL = 300        # Cache TTL in seconds
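The settings above describe a size- and TTL-bounded cache. A sketch of that structure using only the standard library (`TTLCache` is illustrative, not the service's class):

```python
import time
from collections import OrderedDict

class TTLCache:
    """Small LRU cache with per-entry expiry, e.g. for embeddings or query results."""

    def __init__(self, max_size: int = 1000, ttl: float = 3600.0):
        self.max_size = max_size
        self.ttl = ttl
        self._data: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[0] < time.monotonic():
            self._data.pop(key, None)   # drop expired entry
            return None
        self._data.move_to_end(key)     # mark as recently used
        return entry[1]

    def put(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least recently used
```

Bounding both size and age keeps the embedding cache from growing without limit while still serving hot queries from memory.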

Resource Limits

# Limit memory usage
export MCP_MAX_MEMORY_MB=2048

# Limit concurrent operations
export MCP_MAX_CONCURRENT_OPERATIONS=10

# Limit embedding batch size
export MCP_EMBEDDING_BATCH_SIZE=50

Monitoring & Metrics

Performance Metrics to Track

# Query performance
start = time.perf_counter()
results = await search(query)
duration = time.perf_counter() - start
print(f"Query took {duration:.2f}s")

# Memory usage
import psutil
memory_usage = psutil.Process().memory_info().rss / 1024 / 1024
print(f"Memory usage: {memory_usage:.1f}MB")

# Database stats
stats = await get_database_stats()
print(f"Total memories: {stats.count}")
print(f"Database size: {stats.size_mb}MB")

Built-in Performance Tools

# Health check endpoint (use -k if the service runs with a self-signed certificate)
curl -k https://localhost:8443/api/health

# Stats endpoint
curl -k https://localhost:8443/api/stats

# Performance metrics
curl -k https://localhost:8443/api/metrics

Logging Configuration

# Enable performance logging
export MCP_LOG_LEVEL=INFO
export MCP_LOG_PERFORMANCE=true

# Monitor slow queries
export MCP_SLOW_QUERY_THRESHOLD=1000  # Log queries slower than 1000ms (1s)
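Inside application code, the same threshold can be enforced with a timing decorator. A sketch assuming async query functions (`log_slow` is our name, not a service API):

```python
import functools
import logging
import time

SLOW_QUERY_THRESHOLD_MS = 1000  # mirrors MCP_SLOW_QUERY_THRESHOLD

def log_slow(func):
    """Log any awaited call that exceeds the slow-query threshold."""
    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return await func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            if elapsed_ms > SLOW_QUERY_THRESHOLD_MS:
                logging.warning("slow query %s: %.0fms", func.__name__, elapsed_ms)
    return wrapper
```

Timing in a `finally` block means failed queries are logged too, which is often where the slowness hides.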

Troubleshooting Performance Issues

Common Performance Problems

Slow Search Queries

Symptoms: Search takes >2 seconds

Diagnosis:

# Check database size
stats = await get_db_stats()
if stats.size_mb > 1000:
    print("Large database detected")

# Check index usage
explain_plan = await explain_query(search_query)
if "SCAN" in explain_plan:
    print("Full table scan detected")

Solutions:

  1. Add missing indexes
  2. Optimize query patterns
  3. Consider database partitioning
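Whether an added index actually removes the scan can be verified with EXPLAIN QUERY PLAN. A self-contained sketch against a toy `memories` table (the column names are assumptions based on the indexes shown earlier):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, tags TEXT, created_at REAL)")

def plan(sql: str) -> str:
    """Return the EXPLAIN QUERY PLAN detail text for a statement."""
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT id FROM memories WHERE created_at > 0"
before = plan(query)   # contains "SCAN": full table scan

conn.execute("CREATE INDEX IF NOT EXISTS idx_memory_created_at ON memories(created_at)")
after = plan(query)    # now mentions idx_memory_created_at

print(before)
print(after)
```

Run the same check against your real database file to confirm the production query picks up the index.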

High Memory Usage

Symptoms: Process using >4GB RAM

Diagnosis:

# Check embedding cache
cache_stats = await get_embedding_cache_stats()
print(f"Cache size: {cache_stats.size}")

# Check for memory leaks
memory_trend = get_memory_usage_trend(hours=24)
if memory_trend.slope > 0.1:
    print("Potential memory leak")

Solutions:

  1. Reduce cache sizes
  2. Enable garbage collection
  3. Restart service periodically
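Before resorting to periodic restarts, the standard library's tracemalloc can confirm where the memory is going. A sketch (the allocation here is a stand-in for the code path you suspect):

```python
import tracemalloc

tracemalloc.start()

# ... exercise the suspected code path; this allocation is a stand-in ...
suspect = [bytes(1024) for _ in range(1000)]

current, peak = tracemalloc.get_traced_memory()
print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")

# Top allocation sites, grouped by source line, for pinpointing a leak
for stat in tracemalloc.take_snapshot().statistics("lineno")[:3]:
    print(stat)

tracemalloc.stop()
```

If `current` keeps climbing across repeated runs of the same operation, the top allocation sites point at the leak.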

Database Lock Contention

Symptoms: "Database is locked" errors

Diagnosis:

# A hang on even a trivial query indicates a held write lock
sqlite3 memory.db "SELECT count(*) FROM sqlite_master;"

# Check WAL file size
ls -la *.db-wal

Solutions:

  1. Enable WAL mode
  2. Reduce transaction scope
  3. Add connection pooling
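Enabling WAL mode and a lock wait takes two settings. A sketch with stdlib sqlite3 (the path here is a throwaway temp file):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "memory.db")

# timeout: wait up to 5s for a lock instead of failing with "database is locked"
conn = sqlite3.connect(db_path, timeout=5.0)

# WAL mode: readers no longer block the single writer (persists in the db file)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)
```

WAL allows concurrent readers alongside one writer, which removes the most common source of lock errors in multi-process setups.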

Performance Benchmarking

# Benchmark search performance
async def benchmark_search():
    queries = ["python", "error", "authentication", "database"]
    times = []
    
    for query in queries:
        start = time.perf_counter()
        results = await search(query, limit=10)
        duration = time.perf_counter() - start
        times.append(duration)
        print(f"Query '{query}': {duration:.2f}s ({len(results)} results)")
    
    avg_time = sum(times) / len(times)
    print(f"Average search time: {avg_time:.2f}s")

# Run benchmark
await benchmark_search()

Optimization Checklist

Daily Monitoring

  • Check query response times (<1s average)
  • Monitor memory usage (<2GB)
  • Verify database health
  • Review slow query logs

Weekly Maintenance

  • Run database VACUUM
  • Update query statistics
  • Review performance metrics
  • Clean up old logs

Monthly Review

  • Analyze performance trends
  • Update optimization settings
  • Review capacity planning
  • Performance regression testing

Performance Targets

Response Time Goals

  • Search queries: <500ms average
  • Memory storage: <100ms average
  • Health checks: <50ms average
  • Bulk operations: <5s for 100 items

Resource Usage Goals

  • Memory usage: <2GB for 100K memories
  • Disk space: <1GB for 100K memories
  • CPU usage: <10% average load
  • Network: <1MB/s average throughput

Following these optimization guidelines will ensure your MCP Memory Service performs efficiently at any scale.
