# 05 Performance Optimization
Guide to optimizing MCP Memory Service for maximum performance and scalability.
- Quick Wins
- Database Optimization
- Query Performance
- Memory Management
- Monitoring & Metrics
- Troubleshooting Performance Issues
## Quick Wins

### Choose the Right Storage Backend

```bash
# SQLite-vec (recommended for single-client setups)
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
# Average read time: ~5ms

# ChromaDB (for multi-client setups)
export MCP_MEMORY_STORAGE_BACKEND=chroma
# Average read time: ~15ms

# Cloudflare (for distributed/production deployments)
export MCP_MEMORY_STORAGE_BACKEND=cloudflare
# Read time: network dependent
```
### Enable the HTTP/HTTPS Server

```bash
export MCP_HTTP_ENABLED=true
export MCP_HTTPS_ENABLED=true
export MCP_HTTP_PORT=8443
```
### Use Batch Operations

```python
# ❌ Slow: one round-trip per memory
for memory in memories:
    await store_memory(memory)

# ✅ Fast: a single batch operation
await store_memories_batch(memories)
```
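If a batch is large, it can help to keep each call within the configured embedding batch size. A minimal sketch, assuming the `store_memories_batch` call from the example above; the 50-item chunk size mirrors the `MCP_EMBEDDING_BATCH_SIZE` setting shown later on this page:

```python
# Sketch: store memories in bounded chunks so no single batch call
# exceeds the embedding batch size. store_memories_batch is assumed
# from the example above.
async def store_in_chunks(memories: list, chunk_size: int = 50) -> None:
    for start in range(0, len(memories), chunk_size):
        chunk = memories[start:start + chunk_size]
        await store_memories_batch(chunk)  # one round-trip per chunk
```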
## Database Optimization

### SQLite Tuning

```bash
# Optimize SQLite settings
export SQLITE_PRAGMA_CACHE_SIZE=10000
export SQLITE_PRAGMA_SYNCHRONOUS=NORMAL
export SQLITE_PRAGMA_WAL_AUTOCHECKPOINT=1000
```
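If your deployment does not pick up these environment variables, the same pragmas can be applied directly on a connection. A minimal sketch using Python's built-in `sqlite3` module; the database path is illustrative:

```python
# Sketch: applying the equivalent pragmas directly with sqlite3.
# Adjust "memory.db" to your actual database file.
import sqlite3

conn = sqlite3.connect("memory.db")
conn.execute("PRAGMA journal_mode = WAL")         # readers don't block writers
conn.execute("PRAGMA cache_size = 10000")         # pages kept in memory
conn.execute("PRAGMA synchronous = NORMAL")       # fewer fsyncs, still safe with WAL
conn.execute("PRAGMA wal_autocheckpoint = 1000")  # checkpoint every ~1000 WAL pages
conn.close()
```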
### ChromaDB Tuning

```python
# Optimize ChromaDB settings
chroma_settings = {
    "anonymized_telemetry": False,
    "allow_reset": False,
    "is_persistent": True,
    "persist_directory": "/path/to/chroma_db",
}
```
### Routine Maintenance

```bash
# SQLite maintenance (weekly)
sqlite3 memory.db "VACUUM;"
sqlite3 memory.db "REINDEX;"
sqlite3 memory.db "ANALYZE;"

# Check database size
sqlite3 memory.db "SELECT page_count * page_size AS size FROM pragma_page_count(), pragma_page_size();"
```
## Query Performance

### Write Specific Queries

```python
# ❌ Slow: vague search
results = await search("thing")

# ✅ Fast: specific search
results = await search("authentication JWT token")
```

### Match the Limit to the Use Case

```python
# For quick browsing
results = await search(query, limit=10)

# For an existence check
exists = len(await search(query, limit=1)) > 0

# For comprehensive analysis
results = await search(query, limit=100)
```

### Filter by Tag First

```python
# Most efficient: tag search first (indexed)
tagged = await search_by_tag(["python", "error"])

# Then refine with text search
refined = await search("authentication", memories=tagged)
```
### Keep Indexes in Place

```sql
-- Ensure tag indexes exist
CREATE INDEX IF NOT EXISTS idx_memory_tags ON memories(tags);
CREATE INDEX IF NOT EXISTS idx_memory_created_at ON memories(created_at);
CREATE INDEX IF NOT EXISTS idx_memory_content_hash ON memories(content_hash);
```
### Prefer Full-Text Search for Keyword Queries

```python
# Use full-text search when available
results = await search_fts("authentication error python")

# Fall back to semantic search for complex queries
results = await search_semantic("how to fix JWT timeout issues")
```
## Memory Management

### Stream Instead of Loading Everything

```python
# ❌ Memory intensive: loads every memory at once
all_memories = await get_all_memories()
filtered = [m for m in all_memories if condition(m)]

# ✅ Stream processing: constant memory footprint
async def filtered_memories():
    async for memory in stream_memories():
        if condition(memory):
            yield memory
```
### Tune the Caches

```python
# Embedding cache
EMBEDDING_CACHE_SIZE = 1000  # number of embeddings to cache
EMBEDDING_CACHE_TTL = 3600   # cache TTL in seconds

# Query result cache
QUERY_CACHE_SIZE = 100       # number of query results to cache
QUERY_CACHE_TTL = 300        # cache TTL in seconds
```
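What such a cache can look like in application code, sketched with the third-party `cachetools` library (`pip install cachetools`); `compute_embedding` is a hypothetical stand-in for the service's real embedding call:

```python
# Sketch: embedding cache with bounded size and TTL, matching the
# EMBEDDING_CACHE_SIZE / EMBEDDING_CACHE_TTL settings above.
from cachetools import TTLCache

embedding_cache = TTLCache(maxsize=1000, ttl=3600)

def get_embedding(text: str):
    if text not in embedding_cache:
        # compute_embedding is an assumed, expensive call
        embedding_cache[text] = compute_embedding(text)
    return embedding_cache[text]
```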
### Cap Resource Usage

```bash
# Limit memory usage
export MCP_MAX_MEMORY_MB=2048

# Limit concurrent operations
export MCP_MAX_CONCURRENT_OPERATIONS=10

# Limit embedding batch size
export MCP_EMBEDDING_BATCH_SIZE=50
```
## Monitoring & Metrics

### In-Process Measurements

```python
import time
import psutil

# Query performance
start = time.time()
results = await search(query)
duration = time.time() - start
print(f"Query took {duration:.2f}s")

# Memory usage
memory_usage = psutil.Process().memory_info().rss / 1024 / 1024
print(f"Memory usage: {memory_usage:.1f}MB")

# Database stats
stats = await get_database_stats()
print(f"Total memories: {stats.count}")
print(f"Database size: {stats.size_mb}MB")
```
### HTTP Endpoints

```bash
# Health check endpoint
curl https://localhost:8443/api/health

# Stats endpoint
curl https://localhost:8443/api/stats

# Performance metrics
curl https://localhost:8443/api/metrics
```
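For continuous monitoring, a minimal polling sketch using the `requests` library; the URL matches the server configured above, and `verify=False` is only appropriate for self-signed development certificates:

```python
# Sketch: poll the health endpoint and flag slow responses.
import time
import requests

def check_health(url: str = "https://localhost:8443/api/health") -> float:
    start = time.monotonic()
    resp = requests.get(url, timeout=5, verify=False)  # dev certs only
    resp.raise_for_status()
    return time.monotonic() - start

latency = check_health()
if latency > 0.05:  # 50ms health-check target from the list at the end of this page
    print(f"Health check slow: {latency * 1000:.0f}ms")
```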
### Performance Logging

```bash
# Enable performance logging
export MCP_LOG_LEVEL=INFO
export MCP_LOG_PERFORMANCE=true

# Monitor slow queries
export MCP_SLOW_QUERY_THRESHOLD=1000  # log queries slower than 1s
```
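If you want the same slow-query logging in application code, a hypothetical wrapper around the `search` call used throughout this page might look like this; the 1000ms threshold mirrors `MCP_SLOW_QUERY_THRESHOLD`:

```python
# Sketch: log any search slower than the configured threshold.
import logging
import time

logger = logging.getLogger("mcp.performance")
SLOW_QUERY_THRESHOLD_MS = 1000

async def timed_search(query: str, **kwargs):
    start = time.monotonic()
    results = await search(query, **kwargs)
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > SLOW_QUERY_THRESHOLD_MS:
        logger.warning("Slow query (%.0fms): %r", elapsed_ms, query)
    return results
```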
## Troubleshooting Performance Issues

### Slow Searches

**Symptoms:** Search takes more than 2 seconds.

**Diagnosis:**
```python
# Check database size
stats = await get_db_stats()
if stats.size_mb > 1000:
    print("Large database detected")

# Check index usage
explain_plan = await explain_query(search_query)
if "SCAN" in explain_plan:
    print("Full table scan detected")
```
**Solutions:**
- Add missing indexes, then confirm the scan is gone (see the sketch below)
- Optimize query patterns
- Consider database partitioning
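To verify an index actually removes the table scan, a sketch using SQLite's `EXPLAIN QUERY PLAN`; the table and column names follow the `CREATE INDEX` statements earlier on this page:

```python
# Sketch: check whether a query uses an index or falls back to a full scan.
import sqlite3

conn = sqlite3.connect("memory.db")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM memories WHERE content_hash = ?",
    ("abc123",),
).fetchall()
for row in plan:
    detail = row[3]  # the plan-description column
    print(detail)    # "SEARCH ... USING INDEX idx_memory_content_hash" is the goal
    if "SCAN" in detail:
        print("Full table scan: add or rebuild the missing index")
```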
### High Memory Usage

**Symptoms:** Process using more than 4GB of RAM.

**Diagnosis:**
```python
# Check embedding cache
cache_stats = await get_embedding_cache_stats()
print(f"Cache size: {cache_stats.size}")

# Check for memory leaks
memory_trend = get_memory_usage_trend(hours=24)
if memory_trend.slope > 0.1:
    print("Potential memory leak")
```
**Solutions:**
- Reduce cache sizes
- Enable garbage collection (see the pressure-relief sketch below)
- Restart the service periodically
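A sketch of an emergency pressure-relief routine; the cache object and limit are assumptions (`embedding_cache` is the TTLCache-style object from the caching section), and `psutil` is the same library used in the monitoring examples:

```python
# Sketch: shrink caches and force a GC pass when RSS crosses a limit.
import gc
import psutil

MAX_RSS_MB = 2048  # mirrors MCP_MAX_MEMORY_MB

def relieve_memory_pressure(embedding_cache) -> None:
    rss_mb = psutil.Process().memory_info().rss / 1024 / 1024
    if rss_mb > MAX_RSS_MB:
        embedding_cache.clear()  # drop cached embeddings first
        gc.collect()             # then reclaim unreachable objects
```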
Symptoms: "Database is locked" errors Diagnosis:
```bash
# Confirm the database is readable (this query fails if the file is locked)
sqlite3 memory.db "SELECT * FROM sqlite_master WHERE type='table';"

# Check WAL file size (a large -wal file suggests checkpointing is lagging)
ls -la *.db-wal
```
**Solutions:**
- Enable WAL mode (see the sketch below)
- Reduce transaction scope
- Add connection pooling
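A minimal sketch of the WAL and timeout settings that avoid most lock errors, using Python's built-in `sqlite3`; the path and timeout values are illustrative:

```python
# Sketch: enable WAL mode and a busy timeout so writers wait for locks
# instead of failing immediately with "database is locked".
import sqlite3

conn = sqlite3.connect("memory.db", timeout=5.0)  # wait up to 5s for locks
conn.execute("PRAGMA journal_mode = WAL")   # readers no longer block writers
conn.execute("PRAGMA busy_timeout = 5000")  # in milliseconds
```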
### Benchmarking Search Performance

```python
import time

# Benchmark search performance across representative queries
async def benchmark_search():
    queries = ["python", "error", "authentication", "database"]
    times = []
    for query in queries:
        start = time.time()
        results = await search(query, limit=10)
        duration = time.time() - start
        times.append(duration)
        print(f"Query '{query}': {duration:.2f}s ({len(results)} results)")
    avg_time = sum(times) / len(times)
    print(f"Average search time: {avg_time:.2f}s")

# Run the benchmark (inside an existing event loop)
await benchmark_search()
```
## Maintenance Checklist

**Daily:**
- Check query response times (<1s average)
- Monitor memory usage (<2GB)
- Verify database health
- Review slow query logs

**Weekly:**
- Run database VACUUM (see the automation sketch below)
- Update query statistics
- Review performance metrics
- Clean up old logs

**Monthly:**
- Analyze performance trends
- Update optimization settings
- Review capacity planning
- Run performance regression tests
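The weekly database tasks can be scripted. A sketch using the standard library, equivalent to the `sqlite3` CLI commands in the Database Optimization section; the database path is illustrative:

```python
# Sketch: weekly SQLite maintenance in one call.
import sqlite3

def weekly_maintenance(db_path: str = "memory.db") -> None:
    conn = sqlite3.connect(db_path)
    conn.execute("VACUUM")   # reclaim free pages, defragment the file
    conn.execute("REINDEX")  # rebuild indexes
    conn.execute("ANALYZE")  # refresh query-planner statistics
    conn.close()

weekly_maintenance()
```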
## Performance Targets

**Response times:**
- Search queries: <500ms average
- Memory storage: <100ms average
- Health checks: <50ms average
- Bulk operations: <5s for 100 items

**Resource usage:**
- Memory usage: <2GB for 100K memories
- Disk space: <1GB for 100K memories
- CPU usage: <10% average load
- Network: <1MB/s average throughput
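These targets can be wired into the benchmark above. A hypothetical check; the metric names are illustrative, and the values mirror the lists above (`measured` would come from the benchmark and psutil examples earlier):

```python
# Sketch: compare measured values against the targets above.
TARGETS = {
    "search_avg_s": 0.5,  # search queries: <500ms average
    "store_avg_s": 0.1,   # memory storage: <100ms average
    "health_s": 0.05,     # health checks: <50ms average
    "rss_mb": 2048,       # memory usage: <2GB
}

def check_targets(measured: dict) -> bool:
    ok = True
    for metric, limit in TARGETS.items():
        value = measured.get(metric)
        if value is None or value > limit:
            print(f"MISS {metric}: {value} (target <= {limit})")
            ok = False
    return ok
```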
Following these optimization guidelines will help your MCP Memory Service perform efficiently as it scales.