curl http://localhost:8080/metrics | grep -E "(db_query|cache_)"

curl http://localhost:8080/metrics | grep "db_query_duration_seconds"

Measures query execution time in seconds.
Labels:
operation: get, list, create, update, delete
table: models, users, nodes, model_replicas, audit_log_entries
Buckets: 0.001s, 0.002s, 0.004s, 0.008s, 0.016s, 0.032s, 0.064s, 0.128s, 0.256s, 0.512s, 1.024s, +Inf
Example:
db_query_duration_seconds_bucket{operation="get",table="models",le="0.001"} 45
db_query_duration_seconds_bucket{operation="get",table="models",le="0.002"} 48
db_query_duration_seconds_sum{operation="get",table="models"} 0.156
db_query_duration_seconds_count{operation="get",table="models"} 50
Total number of database queries executed.
Labels:
operation: get, list, create, update, delete
table: models, users, nodes, model_replicas, audit_log_entries
status: success, error
Example:
db_queries_total{operation="get",status="success",table="models"} 50
db_queries_total{operation="get",status="error",table="models"} 2
Number of successful cache retrievals.
Labels:
cache_type: redis
table: models
Example:
cache_hits_total{cache_type="redis",table="models"} 35
Number of failed cache retrievals (not found in cache).
Labels:
cache_type: redis
table: models
Example:
cache_misses_total{cache_type="redis",table="models"} 15
Measures cache operation execution time in seconds.
Labels:
operation: get, set, delete
cache_type: redis
table: models
Buckets: 0.0001s, 0.0002s, 0.0004s, 0.0008s, 0.0016s, 0.0032s, 0.0064s, 0.0128s, 0.0256s, 0.0512s, 0.1024s, +Inf
Example:
cache_operation_duration_seconds_bucket{cache_type="redis",operation="get",table="models",le="0.0016"} 48
cache_operation_duration_seconds_sum{cache_type="redis",operation="get",table="models"} 0.012
cache_operation_duration_seconds_count{cache_type="redis",operation="get",table="models"} 50
rate(db_query_duration_seconds_sum[5m]) / rate(db_query_duration_seconds_count[5m])
sum by (table) (rate(db_query_duration_seconds_sum[5m])) / sum by (table) (rate(db_query_duration_seconds_count[5m]))
sum by (operation) (rate(db_query_duration_seconds_sum[5m])) / sum by (operation) (rate(db_query_duration_seconds_count[5m]))
histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m]))
histogram_quantile(0.99, rate(db_query_duration_seconds_bucket[5m]))
topk(5, histogram_quantile(0.99, sum(rate(db_query_duration_seconds_bucket[5m])) by (operation, table, le)))
rate(db_queries_total[5m])
sum(rate(db_queries_total[5m])) by (table)
sum(rate(db_queries_total[5m])) by (operation)
sum(rate(db_queries_total{status="error"}[5m])) / sum(rate(db_queries_total[5m]))
sum by (table) (rate(db_queries_total{status="error"}[5m])) / sum by (table) (rate(db_queries_total[5m]))
increase(db_queries_total{status="error"}[1h])
rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))
(rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))) * 100
rate(cache_misses_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))
rate(cache_operation_duration_seconds_sum[5m]) / rate(cache_operation_duration_seconds_count[5m])
sum by (operation) (rate(cache_operation_duration_seconds_sum[5m])) / sum by (operation) (rate(cache_operation_duration_seconds_count[5m]))
sum(rate(cache_operation_duration_seconds_count[5m])) by (operation)
sum(rate(db_query_duration_seconds_bucket[5m])) by (le, operation)
Visualization: Heatmap (X-axis: time, Y-axis: latency buckets)
(sum(rate(cache_hits_total[5m])) / (sum(rate(cache_hits_total[5m])) + sum(rate(cache_misses_total[5m])))) * 100
Visualization: Gauge (min: 0, max: 100). Thresholds: red < 70%, yellow < 85%, green >= 85%
sum(rate(db_queries_total[5m])) by (operation)
Visualization: Time series graph. Legend: {{operation}}
topk(10, histogram_quantile(0.99, sum(rate(db_query_duration_seconds_bucket[5m])) by (operation, table, le)))
Visualization: Table. Columns: Operation, Table, P99 Latency
rate(db_queries_total{status="error"}[5m])
Visualization: Time series graph. Y-axis: errors per second
- alert: HighDatabaseErrorRate
  expr: sum by (table) (rate(db_queries_total{status="error"}[5m])) / sum by (table) (rate(db_queries_total[5m])) > 0.05
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Database error rate above 5%"
    description: "{{ $labels.table }} has {{ $value | humanizePercentage }} error rate"

- alert: SlowDatabaseQueries
  expr: histogram_quantile(0.95, rate(db_query_duration_seconds_bucket[5m])) > 1
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "95th percentile query latency above 1s"
    description: "{{ $labels.operation }} on {{ $labels.table }} is slow: {{ $value }}s"

- alert: VerySlowDatabaseQueries
  expr: histogram_quantile(0.99, rate(db_query_duration_seconds_bucket[5m])) > 5
  for: 2m
  labels:
    severity: critical
  annotations:
    summary: "99th percentile query latency above 5s"
    description: "{{ $labels.operation }} on {{ $labels.table }} is very slow: {{ $value }}s"

- alert: LowCacheHitRate
  expr: (rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))) < 0.7
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Cache hit rate below 70%"
    description: "{{ $labels.table }} cache hit rate is {{ $value | humanizePercentage }}"

- alert: VeryLowCacheHitRate
  expr: (rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))) < 0.5
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "Cache hit rate below 50%"
    description: "{{ $labels.table }} cache is ineffective: {{ $value | humanizePercentage }}"

- alert: HighQueryVolume
  expr: rate(db_queries_total[5m]) > 1000
  for: 10m
  labels:
    severity: info
  annotations:
    summary: "Query volume above 1000 qps"
    description: "Current rate: {{ $value }} queries/second"

- Total QPS (Stat panel)
- Average Latency (Stat panel)
- Error Rate (Stat panel)
- Cache Hit Rate (Gauge)
- Query Latency Heatmap (Heatmap)
- P95/P99 Latency (Time series)
- Queries by Operation (Time series)
- Queries by Table (Time series)
- Cache Hit/Miss Rate (Time series)
- Cache Operation Latency (Time series)
- Error Rate by Table (Time series)
- Top Errors (Table)
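The panels above can be wired into Grafana by hand, or loaded automatically via Grafana's file-based dashboard provisioning. A minimal sketch follows; the provider name, folder, and paths are illustrative, not part of this implementation:

```yaml
# e.g. /etc/grafana/provisioning/dashboards/ollamamax.yaml (illustrative path)
apiVersion: 1
providers:
  - name: 'ollamamax'                     # illustrative provider name
    folder: 'OllamaMax'                   # Grafana folder the dashboards appear in
    type: file
    options:
      path: /var/lib/grafana/dashboards   # directory holding the dashboard JSON files
```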
# Find operations with P99 > 100ms
histogram_quantile(0.99, rate(db_query_duration_seconds_bucket[5m])) > 0.1
# Top 5 operations by query count
topk(5, rate(db_queries_total[5m]))
# Cache hit rate for each table
sum by (table) (rate(cache_hits_total[5m])) / (sum by (table) (rate(cache_hits_total[5m])) + sum by (table) (rate(cache_misses_total[5m])))
# Compare current vs 1 hour ago
(rate(db_query_duration_seconds_sum[5m]) / rate(db_query_duration_seconds_count[5m]))
/
(rate(db_query_duration_seconds_sum[5m] offset 1h) / rate(db_query_duration_seconds_count[5m] offset 1h))
# Total query-seconds spent per second (roughly the average number of queries in flight)
sum(rate(db_query_duration_seconds_sum[5m]))
scrape_configs:
  - job_name: 'ollamamax'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/metrics'
    scrape_interval: 15s

# Export all metrics
curl -s http://localhost:8080/metrics > metrics.txt
# Export database metrics only
curl -s http://localhost:8080/metrics | grep "^db_" > db_metrics.txt
# Export cache metrics only
curl -s http://localhost:8080/metrics | grep "^cache_" > cache_metrics.txt

- Check that the server is running: curl http://localhost:8080/health
- Verify the metrics endpoint: curl http://localhost:8080/metrics
- Check logs for errors
- Trigger database operations
- Wait for scrape interval (default 15s)
- Check Prometheus targets page
- Check database connection pool
- Review query plans
- Analyze cache hit rates
- Check for lock contention
- Review cache TTL settings
- Check Redis memory usage
- Analyze access patterns
- Consider increasing cache size
- Set appropriate scrape intervals: 15-30s for production
- Use recording rules: Pre-aggregate expensive queries
- Set retention policies: Balance storage vs. historical data
- Create dashboards: Visualize key metrics
- Configure alerts: Proactive monitoring
- Regular reviews: Weekly metric analysis
- Document baselines: Know normal behavior
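The recording-rule recommendation above can be sketched as a Prometheus rule file that pre-aggregates the two most frequently queried expressions in this document. The group and rule names below are illustrative, not part of the implementation:

```yaml
groups:
  - name: ollamamax_recording_rules   # illustrative group name
    interval: 30s
    rules:
      # Pre-aggregated average query latency
      - record: db:query_duration_seconds:avg_rate5m
        expr: rate(db_query_duration_seconds_sum[5m]) / rate(db_query_duration_seconds_count[5m])
      # Pre-aggregated cache hit ratio
      - record: cache:hit_ratio:rate5m
        expr: rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))
```

Dashboards and alerts can then reference the recorded series (e.g. `cache:hit_ratio:rate5m`) instead of re-evaluating the full expression on every refresh.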
- /home/kp/OllamaMax/docs/COMMENT_4_IMPLEMENTATION.md: full implementation details
- /home/kp/OllamaMax/COMMENT_4_COMPLETE.md: implementation summary
For issues or questions:
- Check Prometheus documentation: https://prometheus.io/docs/
- Review Grafana guides: https://grafana.com/docs/
- Check application logs
- Implementation Date: 2025-10-27
- Metrics Version: 1.0
- Compatible Prometheus Version: 2.0+