kubectl-prof is a kubectl plugin for profiling applications in Kubernetes environments with low overhead. It generates FlameGraphs and, for Java applications, other outputs such as JFR files, thread dumps, heap dumps and class histograms by using jcmd. For Python applications, thread dump output and speedscope format files are also supported. See the Usage section. More functionality will be added in the future.
Running kubectl-prof does not require any modification to existing pods.
This is an open source fork of kubectl-flame with several new features and bug fixes.
- Supported languages: Go, Java (any JVM based language), Python, Ruby, NodeJS, Clang and Clang++.
- Kubernetes clusters that use one of the following container runtimes:
  - Containerd, by using the flag --runtime=containerd (default)
  - CRI-O, by using the flag --runtime=crio
To profile a Java application in pod mypod for 1 minute and save the flamegraph into /tmp, run:
kubectl prof mypod -t 1m -l java -o flamegraph --local-path=/tmp
NOTICE:
- If --local-path is omitted, the flamegraph result will be saved into the current directory.
Profiling a Java application in an Alpine-based container requires the --alpine flag:
kubectl prof mypod -t 1m --lang java -o flamegraph --alpine
NOTICE: this is only required for Java apps; the --alpine flag is unnecessary for other languages.
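One way to decide whether --alpine is needed is to look at the target container's /etc/os-release, where Alpine images identify themselves with ID=alpine. The sketch below is only an illustration and not part of kubectl-prof: the helper name alpine_flag is hypothetical, the sample content is made up, and the command is printed rather than executed so it can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helper: prints "--alpine" when the os-release content on
# stdin identifies an Alpine image (ID=alpine), and nothing otherwise.
alpine_flag() {
  if grep -Eq '^ID="?alpine"?$'; then
    printf '%s\n' '--alpine'
  fi
}

# Sample os-release content standing in for the target container's file.
sample='NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.19.1'

flag=$(printf '%s\n' "$sample" | alpine_flag)

# Print the command instead of running it, so no cluster is required.
echo "kubectl prof mypod -t 1m --lang java -o flamegraph $flag"
```

Against a real pod, the content could come from kubectl exec mypod -- cat /etc/os-release, provided the image ships cat.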
To profile a Java pod and generate JFR output, use the -o/--output jfr option:
kubectl prof mypod -t 5m -l java -o jfr
To generate JFR output with jcmd, use the -o/--output jfr and --tool jcmd options:
kubectl prof mypod -t 5m -l java -o jfr --tool jcmd
To profile a Java pod and generate thread dump output, use the -o/--output threaddump option:
kubectl prof mypod -l java -o threaddump
To profile a Java pod and generate heap dump output, use the -o/--output heapdump option:
kubectl prof mypod -l java -o heapdump --tool jcmd
To profile a Java pod and generate heap histogram output, use the -o/--output heaphistogram option:
kubectl prof mypod -l java -o heaphistogram --tool jcmd
Supported container runtime values are crio and containerd:
kubectl prof mypod -t 1m --lang java --runtime crio
To profile a Python application in pod mypod for 1 minute and save the flamegraph into /tmp, run:
kubectl prof mypod -t 1m --lang python -o flamegraph --local-path=/tmp
To profile a Python pod and generate thread dump output, use the -o/--output threaddump option:
kubectl prof mypod --lang python --local-path=/tmp -o threaddump
To profile a Python pod and generate speedscope output, use the -o/--output speedscope option:
kubectl prof mypod -t 1m --lang python --local-path=/tmp -o speedscope
To profile a Golang application in pod mypod for 1 minute, run:
kubectl prof mypod -t 1m --lang go -o flamegraph
To profile a NodeJS application in pod mypod for 1 minute, run:
kubectl prof mypod -t 1m --lang node -o flamegraph
To profile a NodeJS pod and generate heap snapshot output, use the -o/--output heapsnapshot option:
kubectl prof mypod --lang node -o heapsnapshot
NOTICE: the NodeJS app has to be run with the --heapsnapshot-signal=SIGUSR2 (default) or --heapsnapshot-signal=SIGUSR1 option.
If the NodeJS app was run with the --heapsnapshot-signal=SIGUSR1 option, use the following command:
kubectl prof mypod --lang node -o heapsnapshot --node-heap-snapshot-signal=10
Take into account that the signal has to be a number, not a string.
More info about that here and here.
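Because --node-heap-snapshot-signal expects a number, a small lookup can help avoid mistakes. The sketch below is an illustration and not part of kubectl-prof: the function name signal_number is hypothetical, and the numbers are those used by Linux on x86 and ARM (other architectures may differ). The command is printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: map the signal name given to Node.js via
# --heapsnapshot-signal to the number passed to kubectl-prof.
# Values are for Linux on x86 and ARM; other architectures may differ.
signal_number() {
  case "$1" in
    SIGUSR1|USR1) echo 10 ;;
    SIGUSR2|USR2) echo 12 ;;
    *) echo "unsupported signal: $1" >&2; return 1 ;;
  esac
}

# Print the command rather than run it, so no cluster is needed.
sig=$(signal_number SIGUSR1)
echo "kubectl prof mypod --lang node -o heapsnapshot --node-heap-snapshot-signal=$sig"
```

On a Linux host, kill -l in bash lists the signal names with their numbers, which is a quick way to double-check the value for the node in question.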
To profile a Ruby application in pod mypod for 1 minute, run:
kubectl prof mypod -t 1m --lang ruby -o flamegraph
To profile a Clang application in pod mypod for 1 minute, run:
kubectl prof mypod -t 1m --lang clang -o flamegraph
To profile a Clang++ application in pod mypod for 1 minute, run:
kubectl prof mypod -t 1m --lang clang++ -o flamegraph
To profile a pod for 5 minutes in intervals of 60 seconds for the Java language, giving the CPU limits, the container runtime, the agent image and the image pull policy:
kubectl prof mypod -l java -o flamegraph -t 5m --interval 60s --cpu-limits=1 -r containerd --image=localhost/my-agent-image-jvm:latest --image-pull-policy=IfNotPresent
To run the profiling in the profiling namespace against a pod running in the my-apps namespace, using the profiler service account, for the Go language:
kubectl prof mypod -n profiling --service-account=profiler --target-namespace=my-apps -l go
To profile with custom resource requests and limits for the agent pod (default: neither requests nor limits are set), for the Python language:
kubectl prof mypod --cpu-requests 100m --cpu-limits 200m --mem-requests 100Mi --mem-limits 200Mi -l python
To profile the pods matching the label selector app=myapp for 5 minutes with JFR format for the Java language, use the --selector option:
kubectl prof --selector app=myapp -t 5m -l java -o jfr
In addition, you can define the number of pods to be profiled simultaneously by using --pool-size-profiling-jobs.
For example, the following command will profile five pods simultaneously:
kubectl prof --selector app=myapp -t 5m -l java -o jfr --pool-size-profiling-jobs 5
For the full list of options, run:
kubectl prof --help
To install with Krew, first install Krew, then add the repository and install the plugin:
kubectl krew index add kubectl-prof https://github.com/josepdcs/kubectl-prof
kubectl krew search kubectl-prof
kubectl krew install kubectl-prof/prof
kubectl prof --help
See the release page for the full list of pre-built assets, and download the binary for your architecture:
wget https://github.com/josepdcs/kubectl-prof/releases/download/1.7.0/kubectl-prof_1.7.0_linux_amd64.tar.gz
tar xvfz kubectl-prof_1.7.0_linux_amd64.tar.gz && sudo install kubectl-prof /usr/local/bin/
To build from source, fetch the code, install the dependencies and build the binary:
$ go get -d github.com/josepdcs/kubectl-prof
$ cd $GOPATH/src/github.com/josepdcs/kubectl-prof
$ make install-deps
$ make
Modify the DOCKER_BASE_IMAGE property in the Makefile and run:
$ make agents
kubectl-prof launches a Kubernetes Job on the same node as the target pod. Under the hood, kubectl-prof can use the following tools, depending on the programming language:
- For Java:
  - async-profiler, in order to generate flame graphs, JFR files and the rest of the output types supported by this tool.
    - For generating flame graphs use the options: --tool async-profiler and -o flamegraph.
    - For generating JFR files use the options: --tool async-profiler and -o jfr.
    - For generating collapsed/raw output use the options: --tool async-profiler and -o collapsed or -o raw.
    - Note: Default output is flame graphs if no -o/--output option is given.
  - jcmd, in order to generate JFR files, thread dumps, heap dumps and heap histograms.
    - For generating JFR files use the options: --tool jcmd and -o jfr.
    - For generating thread dumps use the options: --tool jcmd and -o threaddump.
    - For generating heap dumps use the options: --tool jcmd and -o heapdump.
    - For generating heap histograms use the options: --tool jcmd and -o heaphistogram.
    - Note: Default output is JFR if no -o/--output option is given.
  - Note: Default tool is async-profiler if no --tool option is given, and default output is flame graphs if no -o/--output option is given either.
- For Golang: ebpf profiling.
  - For generating flame graphs use the option: -o flamegraph.
  - For generating raw output use the option: -o raw.
  - Note: Default output is flame graphs if no -o/--output option is given.
- For Python: py-spy.
  - For generating flame graphs use the option: -o flamegraph.
  - For generating thread dumps use the option: -o threaddump.
  - For generating speedscope output use the option: -o speedscope.
  - For generating raw output use the option: -o raw.
  - Note: Default output is flame graphs if no -o/--output option is given.
- For Ruby: rbspy.
  - For generating flame graphs use the option: -o flamegraph.
  - For generating speedscope output use the option: -o speedscope.
  - For generating callgrind output use the option: -o callgrind.
  - Note: Default output is flame graphs if no -o/--output option is given.
- For Node.js: ebpf profiling and perf, although the latter is not recommended.
  - For generating flame graphs use the option: -o flamegraph.
  - For generating raw output use the option: -o raw.
  - For generating heap snapshots use the option: -o heapsnapshot.
  - Note: Default output is flame graphs if no -o/--output option is given.
  - In order for JavaScript symbols to be resolved, the node process needs to be run with the --perf-basic-prof flag.
  - For generating the heap snapshot output, the node process needs to be run with the --heapsnapshot-signal flag. Information about that here and here.
- For Clang and Clang++: perf is the default profiler, but ebpf profiling is also supported.
The raw output is a text file with the raw data from the profiler. It could be used to generate flame graphs, or you can use https://www.speedscope.app/ to visualize the data.
kubectl-prof also supports two working modes, discrete and continuous:
- In discrete mode, only one profiling result is requested. Once this result is obtained, the profiling process finishes. This is the default behaviour when only the -t time option is used.
- In continuous mode, more than one result can be produced. Given a session duration and an interval, a result is produced every interval until the profiling session finishes. Only the last produced result is available. It is the client's responsibility to store all the session results.
  - To use this mode, pass the --interval time option in addition to -t time.
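As a rough guide to how many results a continuous session yields, the session duration can be divided by the interval. The sketch below is an estimate under stated assumptions, not kubectl-prof's own logic: the helper names to_seconds and expected_results are hypothetical, durations are assumed to carry a single unit suffix (s, m or h), and one result is assumed per full interval.

```shell
#!/bin/sh
# Hypothetical helper: convert a duration with a single unit suffix
# (s, m or h), such as "60s" or "5m", into seconds.
# The body runs in a subshell so its variables stay local.
to_seconds() (
  value=${1%[smh]}
  case "$1" in
    *s) echo "$value" ;;
    *m) echo $((value * 60)) ;;
    *h) echo $((value * 3600)) ;;
    *) echo "unsupported duration: $1" >&2; exit 1 ;;
  esac
)

# Hypothetical helper: number of full intervals that fit in a session.
expected_results() {
  echo $(( $(to_seconds "$1") / $(to_seconds "$2") ))
}

# Session of 5 minutes with a 60-second interval, as in -t 5m --interval 60s.
expected_results 5m 60s   # prints 5
```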
In addition, kubectl-prof will attempt to profile all the processes detected in the container, based on the provided language. When more than one process is detected, the tool displays a warning similar to:
⚠ Detected more than one PID to profile: [2508 2509]. It will be attempt to profile all of them. Use the --pid flag specifying the corresponding PID if you only want to profile one of them.
But if you want to profile a specific process, you have two options:
- Provide the specific PID using the --pid PID flag if you know the PID (the previous warning can help you identify the PID you want to profile).
- Provide a process name using the --pgrep process-matching-name flag.
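As an illustration of the first option, the PIDs can be pulled out of the warning text and turned into one command per process. This is a sketch that assumes the warning keeps the format shown above; extract_pids is a hypothetical helper name, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: read kubectl-prof output on stdin and print the
# space-separated PIDs from the "Detected more than one PID" warning.
extract_pids() {
  sed -n 's/.*Detected more than one PID to profile: \[\([0-9 ]*\)].*/\1/p'
}

# Sample warning text, shortened from the message shown above.
warning='Detected more than one PID to profile: [2508 2509]. It will be attempt to profile all of them.'

# Print one command per detected PID; nothing is executed here.
for pid in $(printf '%s\n' "$warning" | extract_pids); do
  echo "kubectl prof mypod -t 1m --lang java -o flamegraph --pid $pid"
done
```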
For profiling Java Pods, kubectl-prof runs the agent pod with the PERFMON and SYSLOG capabilities by default.
According to the Kernel documentation,
PERFMON and SYSLOG should be enough for collecting performance samples.
However, if you need to run with SYS_ADMIN, you can specify it using the --capabilities option.
This one can be used multiple times to add more than one capability.
Example:
kubectl prof my-pod -t 5m -l java -o flamegraph --local-path=/tmp --capabilities=SYS_ADMIN
By default, the profiling agent pod will only be scheduled on nodes without taints. If your target pod runs on a node with taints, you can specify tolerations to allow the agent pod to be scheduled on the same node.
Tolerations can be specified using the --tolerations flag with the following formats:
- key=value:effect - Tolerate a taint with a specific key, value, and effect
- key:effect - Tolerate a taint with a specific key and effect (any value)
- key - Tolerate a taint with a specific key (defaults to NoSchedule effect)
You can use the --tolerations flag multiple times to add multiple tolerations.
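The three formats listed above can be told apart by splitting on the last colon and the first equals sign. The sketch below only illustrates the formats and is not kubectl-prof's actual parser; parse_toleration is a hypothetical name, and the marker printed when no value is given is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: split a toleration string into key, value and
# effect, following the three formats listed above.
# The body runs in a subshell so its variables stay local.
parse_toleration() (
  spec=$1
  effect=NoSchedule            # default when no effect is given
  value=
  case "$spec" in
    *:*) effect=${spec##*:}; spec=${spec%:*} ;;   # text after the last ":"
  esac
  case "$spec" in
    *=*) value=${spec#*=}; spec=${spec%%=*} ;;    # text after the first "="
  esac
  echo "key=$spec value=${value:-<unset>} effect=$effect"
)

parse_toleration node.kubernetes.io/disk-pressure=true:NoSchedule
parse_toleration node.kubernetes.io/memory-pressure:NoExecute
parse_toleration node.kubernetes.io/unreachable
```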
Examples:
Tolerate a specific taint with key, value, and effect:
kubectl prof my-pod -t 5m -l java -o flamegraph --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule
Tolerate a taint with key and effect (any value):
kubectl prof my-pod -t 5m -l java -o flamegraph --tolerations=node.kubernetes.io/memory-pressure:NoExecute
Tolerate a taint with just a key (defaults to NoSchedule):
kubectl prof my-pod -t 5m -l java -o flamegraph --tolerations=node.kubernetes.io/unreachable
Add multiple tolerations:
kubectl prof my-pod -t 5m -l java -o flamegraph \
  --tolerations=node.kubernetes.io/disk-pressure=true:NoSchedule \
  --tolerations=node.kubernetes.io/memory-pressure:NoExecute \
  --tolerations=dedicated=profiling:PreferNoSchedule
Please refer to the contributing.md file for information about how to get involved. We welcome issues, questions, and pull requests.
- Josep Damià Carbonell Seguí: [email protected]
Special thanks to the original Author of kubectl-flame
- Eden Federman: [email protected]
- Verizon Media Code
This project is licensed under the terms of the Apache 2.0 open source license. Please refer to LICENSE for the full terms.