From 26402f2538cc1304bfd197c3a771fd2322608b76 Mon Sep 17 00:00:00 2001 From: beorn7 Date: Fri, 15 May 2020 17:56:10 +0200 Subject: [PATCH 1/2] Change the instance name for standard pod scraping to be unique Any of the potentially many containers in a pod can expose one or more ports with Prometheus metrics. However, with our current target labels, all of these targets get the same instance label (just the pod name), which leads to the dreaded `PrometheusOutOfOrderTimestamps` alert, see https://github.com/grafana/deployment_tools/issues/3441 . (In fact, if we get the alert, we are already lucky, because the problem can go unnoticed until someone actually needs one of the time series that receive samples from different targets, rendering them useless.) In practice, we rarely have more than one port to scrape per pod, but it does happen, and it's totally within the intended usage pattern of K8s, which means it can happen more at any time. The two examples I'm aware of: - Kube-state-metrics (KSM) has only one container it its pod, but that container exposes two metrics ports (http-metrics and self-metrics). - Consul pods run a container with the consul-exporter and a container with the statsd-exporter, each exposing their metrics on a different port. Both ports are named http-metrics, which is possible because they are exposed by different containers. (This is the case that triggered the above linked issue.) To avoid the metric duplication, we could add a container and a port label, but it is a Prometheus convention that the instance label alone should be unique within a job. Which brings us to what I'm proposing in this commit: Create the instance label by joining pod name, container name, and port name with `:` in between. In most cases, the resulting instance value will appear redundant, but I believe the consistency has some value. Applying same magic to shorten the instance label when possible would add complexity and remove the consistency. --- .../lib/prometheus-config.libsonnet | 23 ++++++++++++++----- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/prometheus-ksonnet/lib/prometheus-config.libsonnet b/prometheus-ksonnet/lib/prometheus-config.libsonnet index e993b2d60..1302c1c61 100644 --- a/prometheus-ksonnet/lib/prometheus-config.libsonnet +++ b/prometheus-ksonnet/lib/prometheus-config.libsonnet @@ -141,10 +141,16 @@ target_label: 'namespace', }, - // Rename instances to be the pod name + // Rename instances to the concatenation of pod:container:port. + // All three components are needed to guarantee a unique instance label. { - source_labels: ['__meta_kubernetes_pod_name'], + source_labels: [ + '__meta_kubernetes_pod_name', + '__meta_kubernetes_pod_container_name', + '__meta_kubernetes_pod_container_port_name', + ], action: 'replace', + separator: ':', target_label: 'instance', }, @@ -192,11 +198,16 @@ action: 'keep', }, - // Rename instances to be the pod name. - // As the scrape two ports of KSM, include the port name in the instance - // name. Otherwise alerts about scrape failures and timeouts won't work. + // Rename instances to the concatenation of pod:container:port. + // In the specific case of KSM, we could leave out the container + // name and still have a unique instance label, but we leave it + // in here for consistency with the normal pod scraping. { - source_labels: ['__meta_kubernetes_pod_name', '__meta_kubernetes_pod_container_port_name'], + source_labels: [ + '__meta_kubernetes_pod_name', + '__meta_kubernetes_pod_container_name', + '__meta_kubernetes_pod_container_port_name', + ], action: 'replace', separator: ':', target_label: 'instance', From 0ccca409a20a56d209ecfa1fc6a8ba76bef89798 Mon Sep 17 00:00:00 2001 From: beorn7 Date: Fri, 15 May 2020 20:31:37 +0200 Subject: [PATCH 2/2] Add a `container` and `pod` target label This allows joining with cAdvisor metrics. --- .../lib/prometheus-config.libsonnet | 23 +++++++++++++++---- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/prometheus-ksonnet/lib/prometheus-config.libsonnet b/prometheus-ksonnet/lib/prometheus-config.libsonnet index 1302c1c61..fa17e198a 100644 --- a/prometheus-ksonnet/lib/prometheus-config.libsonnet +++ b/prometheus-ksonnet/lib/prometheus-config.libsonnet @@ -134,12 +134,23 @@ replacement: '$1', }, - // But also include the namespace as a separate label, for routing alerts + // But also include the namespace, container, pod as separate labels, + // for routing alerts and joining with cAdvisor metrics. { source_labels: ['__meta_kubernetes_namespace'], action: 'replace', target_label: 'namespace', }, + { + source_labels: ['__meta_kubernetes_pod_name'], + action: 'replace', + target_label: 'pod', // Not 'pod_name', which disappeared in K8s 1.16. + }, + { + source_labels: ['__meta_kubernetes_container_name'], + action: 'replace', + target_label: 'container', // Not 'container_name', which disappeared in K8s 1.16. + }, // Rename instances to the concatenation of pod:container:port. // All three components are needed to guarantee a unique instance label. @@ -169,10 +180,12 @@ ], }, - // A separate scrape config for kube-state-metrics which doesn't add a namespace - // label, instead taking the namespace label from the exported timeseries. This - // prevents the exported namespace label being renamed to exported_namespace, and - // allows us to route alerts based on namespace. + // A separate scrape config for kube-state-metrics which doesn't + // add namespace, container, and pod labels, instead taking + // those labels from the exported timeseries. This prevents them + // being renamed to exported_namespace etc. and allows us to + // route alerts based on namespace and join KSM metrics with + // cAdvisor metrics. { job_name: '%s/kube-state-metrics' % $._config.namespace, kubernetes_sd_configs: [{