Production-ready ELK stack configuration for comprehensive log aggregation, processing, and analysis with distributed tracing correlation.
```
Docker Containers -> Filebeat -> Logstash -> Elasticsearch -> Kibana
                                     |
                             Trace Correlation
                                     |
                                  Jaeger
```
Location: /monitoring/filebeat/filebeat.yml
Features:
- Container log collection from Docker
- Docker metadata enrichment
- Kubernetes autodiscovery support
- JSON log parsing
- Load-balanced output to Logstash
- Service and environment tagging
Configuration:

```yaml
# Container logs input
- type: container
  paths:
    - '/var/lib/docker/containers/*/*.log'

# Output to Logstash
output.logstash:
  hosts: ["logstash:5044"]
  compression_level: 3
  worker: 2
  loadbalance: true
```

Location: /monitoring/logstash/pipeline/logstash.conf
Features:
- JSON message parsing
- Timestamp extraction and normalization
- Log level extraction and tagging
- Trace/span ID correlation for Jaeger
- Error and audit log tagging
- Kubernetes metadata enrichment
- Index routing (logs, errors, audit)
Pipeline Flow:
Input (Beats:5044) -> Filter (Parse/Enrich) -> Output (Elasticsearch)
Three index types:

- `ollamamax-logs-YYYY.MM.dd`
  - Standard application logs
  - INFO, DEBUG, WARN levels
  - General operations
- `ollamamax-errors-YYYY.MM.dd`
  - ERROR and FATAL logs
  - Tagged with `error`
  - Priority alerting
- `ollamamax-audit-YYYY.MM.dd`
  - Audit trail events
  - Security-sensitive operations
  - Compliance tracking
  - Tagged with `audit`
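As a sketch of how this routing can be expressed, the following is a minimal, hypothetical filter/output fragment in Logstash's configuration language (not the exact contents of /monitoring/logstash/pipeline/logstash.conf; field and index names follow the conventions above):

```conf
filter {
  # Parse the JSON message body into top-level fields
  json {
    source => "message"
    skip_on_invalid_json => true
  }

  # Tag ERROR/FATAL events so they can be routed to the errors index
  if [level] in ["ERROR", "FATAL"] {
    mutate { add_tag => ["error"] }
  }
}

output {
  if "error" in [tags] {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "ollamamax-errors-%{+YYYY.MM.dd}"
    }
  } else {
    elasticsearch {
      hosts => ["elasticsearch:9200"]
      index => "ollamamax-logs-%{+YYYY.MM.dd}"
    }
  }
}
```

The audit index would be routed the same way, keyed on an `audit` tag.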
Logstash extracts trace_id and span_id from logs and preserves them in Elasticsearch:
```
Application Log -> Filebeat -> Logstash -> Elasticsearch
                                   |
                          trace_id: abc123
                          span_id: def456
                                   |
        Query Jaeger with trace_id -> Full trace visualization
```
To enable trace correlation, applications must log in JSON format:
```json
{
  "timestamp": "2025-01-27T10:30:45.123Z",
  "level": "info",
  "message": "Processing request",
  "trace_id": "abc123def456789",
  "span_id": "def456789",
  "service": "ollamamax-api"
}
```

| Field | Source | Purpose |
|---|---|---|
| `@timestamp` | log.timestamp | Event time |
| `level` | log.level | Log severity |
| `trace_id` | log.trace_id | Distributed trace ID |
| `span_id` | log.span_id | Trace span ID |
| `k8s_namespace` | kubernetes.namespace | K8s namespace |
| `k8s_pod` | kubernetes.pod.name | Pod name |
| `k8s_container` | kubernetes.container.name | Container name |
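To illustrate what the JSON parsing step effectively does with such a log line, here is a small shell sketch that extracts the correlation fields with `jq` (assumes `jq` is installed; the log line is the example above):

```shell
# Hypothetical illustration: pull trace/span IDs out of a JSON log line,
# mirroring what Logstash's json filter does during ingestion.
log_line='{"timestamp":"2025-01-27T10:30:45.123Z","level":"info","message":"Processing request","trace_id":"abc123def456789","span_id":"def456789","service":"ollamamax-api"}'

trace_id=$(echo "$log_line" | jq -r '.trace_id')
span_id=$(echo "$log_line" | jq -r '.span_id')

echo "trace_id=$trace_id"
echo "span_id=$span_id"
```

Once these fields are indexed, any log line can be pivoted to its full Jaeger trace.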
Tags:
- `error`: ERROR/FATAL logs
- `audit`: Audit events
- `forwarded`: From Filebeat
Search logs by trace ID:

```
GET ollamamax-logs-*/_search
{
  "query": {
    "term": {
      "trace_id": "abc123def456789"
    }
  }
}
```

Most recent errors first:

```
GET ollamamax-errors-*/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "@timestamp": "desc" }
  ]
}
```

Audit events from the last 24 hours:

```
GET ollamamax-audit-*/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-24h"
      }
    }
  }
}
```
- Filebeat:
  - Port: None (internal)
  - Collects from: `/var/lib/docker/containers`
  - Sends to: Logstash:5044
- Logstash:
  - Port: 5044 (Beats input)
  - Port: 9600 (Monitoring API)
  - Reads from: Filebeat
  - Writes to: Elasticsearch
- Elasticsearch:
  - Port: 9200 (HTTP API)
  - Port: 9300 (Transport)
  - Storage: `elasticsearch_data` volume
- Kibana:
  - Port: 5601 (Web UI)
  - Connects to: Elasticsearch
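The topology above could be wired with a docker-compose fragment along these lines (a sketch, not the project's actual compose file; image versions are assumptions, volume paths match those documented here):

```yaml
# Hypothetical compose fragment wiring the log pipeline; adjust versions as needed.
services:
  filebeat:
    image: docker.elastic.co/beats/filebeat:8.12.0
    volumes:
      - ./monitoring/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on: [logstash]

  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    ports: ["5044:5044", "9600:9600"]
    volumes:
      - ./monitoring/logstash/pipeline:/usr/share/logstash/pipeline:ro
    depends_on: [elasticsearch]

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    ports: ["9200:9200", "9300:9300"]
    volumes:
      - elasticsearch_data:/usr/share/elasticsearch/data

  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    ports: ["5601:5601"]
    depends_on: [elasticsearch]

volumes:
  elasticsearch_data:
```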
Filebeat:

```
LOGSTASH_HOST=logstash
LOGSTASH_PORT=5044
ENVIRONMENT=production
```

Logstash:

```
ELASTICSEARCH_HOST=elasticsearch
ELASTICSEARCH_PORT=9200
LS_JAVA_OPTS=-Xms256m -Xmx256m
```

Filebeat volumes:
- Config: `./monitoring/filebeat/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro`
- Logs: `/var/lib/docker/containers:/var/lib/docker/containers:ro`
- Socket: `/var/run/docker.sock:/var/run/docker.sock:ro`

Logstash volumes:
- Pipeline: `./monitoring/logstash/pipeline:/usr/share/logstash/pipeline:ro`
Logstash:

```shell
curl http://localhost:9600/_node/stats
```

Elasticsearch:

```shell
curl http://localhost:9200/_cluster/health
```

Kibana:

```shell
curl http://localhost:5601/api/status
```

Monitor via Prometheus:
- Filebeat: Logs shipped/sec
- Logstash: Events processed/sec
- Elasticsearch: Index size, query latency
```shell
# Check Filebeat container
docker logs filebeat

# Verify permissions
docker exec filebeat ls -la /var/lib/docker/containers

# Test Logstash connection
docker exec filebeat ping logstash

# Check pipeline syntax
docker exec logstash /usr/share/logstash/bin/logstash --config.test_and_exit -f /usr/share/logstash/pipeline/logstash.conf

# View logs
docker logs logstash

# Check Elasticsearch connection
docker exec logstash curl -X GET "elasticsearch:9200/_cluster/health?pretty"
```

If trace IDs are missing:
- Verify application logs contain the `trace_id` field
- Check that the Logstash filter extracts the field correctly
- Query Elasticsearch to confirm the field exists:

```shell
curl -X GET "localhost:9200/ollamamax-logs-*/_mapping/field/trace_id?pretty"
```
Logstash:

```yaml
environment:
  - "LS_JAVA_OPTS=-Xms512m -Xmx512m"  # Increase heap
```

Elasticsearch:

```yaml
environment:
  - "ES_JAVA_OPTS=-Xms1g -Xmx1g"  # Increase heap
```

- Index Lifecycle Management:
  - Rotate daily indices
  - Delete old logs after 30 days
  - Archive to S3 for long-term storage
- Log Format:
  - Use JSON structured logging
  - Include trace/span IDs
  - Add context fields (user_id, request_id)
- Performance:
  - Use Filebeat compression
  - Configure Logstash workers
  - Monitor queue depths
- Security:
  - Restrict Elasticsearch access
  - Use Kibana authentication
  - Audit log access
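The daily-rotation and 30-day-deletion practice can be expressed as an ILM policy. A sketch (the policy name `ollamamax-logs-policy` matches the index template used for retention; the S3 archive step would be a separate snapshot lifecycle policy and is omitted):

```
PUT _ilm/policy/ollamamax-logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```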
```shell
# Delete old indices
curl -X DELETE "localhost:9200/ollamamax-logs-2025.01.*"

# Create index template for retention
curl -X PUT "localhost:9200/_index_template/ollamamax-logs" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["ollamamax-logs-*"],
  "template": {
    "settings": {
      "index.lifecycle.name": "ollamamax-logs-policy",
      "index.lifecycle.rollover_alias": "ollamamax-logs"
    }
  }
}
'

# Reload Logstash configuration (Logstash has no reload API endpoint;
# send SIGHUP, or run with config.reload.automatic enabled)
docker exec logstash kill -SIGHUP 1

# Restart Logstash
docker-compose restart logstash
```
- Find Trace ID in Kibana:
  - Search logs by error or event
  - Copy the `trace_id` field
- View Trace in Jaeger:
  - Open Jaeger UI: http://localhost:16686
  - Search by trace ID
  - View complete request flow
- Create Kibana Link to Jaeger:

```
// In Kibana dashboard URL drilldown
{
  "url": "http://localhost:16686/trace/{{trace_id}}",
  "label": "View Trace"
}
```
Filebeat (`filebeat.yml`):

```yaml
# Increase queue size
queue.mem:
  events: 4096
  flush.min_events: 512
  flush.timeout: 1s

# Increase workers
output.logstash:
  worker: 4
```

Logstash (`logstash.yml`):

```yaml
# Increase pipeline workers
pipeline.workers: 4

# Increase batch size
pipeline.batch.size: 125
```

Elasticsearch:

```yaml
# Increase refresh interval
index.refresh_interval: 30s

# Increase bulk queue
thread_pool.bulk.queue_size: 1000
```