Find goroutine leaks and deadlocks in Go programs — before they hit production.
Goroutine leaks and deadlocks are among the hardest bugs in production Go. They're invisible in normal testing, survive code review, and only surface under specific load — causing cascading memory growth or complete service hangs.
ThreadGraph analyzes Go execution traces to find exactly which goroutines are permanently blocked, what they're blocked on, and where they were created — with no binary modification, no code changes, and no instrumentation.
    go install github.com/Heman10x-NGU/threadgraph@latest

    # Auto-capture trace and analyze
    threadgraph run ./...

    # Analyze an existing trace
    threadgraph analyze trace.out

    # With static lock-release analysis
    threadgraph run --static ./...

    # CI-friendly JSON output
    threadgraph run --format json --no-llm ./...

Example output:

    ThreadGraph Analysis
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    5 goroutine leaks
    0 deadlocks
    0 long blocks

    ● GOROUTINE LEAK (high confidence)
    Goroutine 18 blocked on: "chan receive"
    Location: testdata/buggy/main.go:42
    Stack:
      main.leakyWorker
        testdata/buggy/main.go:42 +0x28

    ● GOROUTINE LEAK (high confidence)
    Goroutine 22 blocked on: "chan receive"
    Location: testdata/buggy/main.go:42
    Stack:
      main.leakyWorker
        testdata/buggy/main.go:42 +0x28

    [3 more leaks...]
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    Analyzed 47 goroutines · 2341ms window · /tmp/trace123.out
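For orientation, here is a minimal sketch of the kind of code that produces a "chan receive" leak like the one reported above, next to one possible fix. The contents of testdata/buggy/main.go are not shown in this README, so the function bodies, the `worker` and `runFixed` names, and the done-channel fix are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// leakyWorker (hypothetical body) blocks forever when nothing is ever sent
// on jobs and the channel is never closed. A goroutine parked here is in
// the "chan receive" state when the trace ends.
func leakyWorker(jobs <-chan int) {
	for j := range jobs {
		_ = j
	}
}

// worker is one possible fix: it also watches done, so the caller can
// always release it.
func worker(jobs <-chan int, done <-chan struct{}, exited *int64, wg *sync.WaitGroup) {
	defer wg.Done()
	defer atomic.AddInt64(exited, 1) // runs before wg.Done (defers are LIFO)
	for {
		select {
		case j, ok := <-jobs:
			if !ok {
				return
			}
			_ = j
		case <-done:
			return
		}
	}
}

// runFixed starts n fixed workers, signals shutdown, waits for them,
// and returns how many actually exited.
func runFixed(n int) int {
	jobs := make(chan int)
	done := make(chan struct{})
	var exited int64
	var wg sync.WaitGroup
	wg.Add(n)
	for i := 0; i < n; i++ {
		go worker(jobs, done, &exited, &wg)
	}
	close(done)
	wg.Wait()
	return int(atomic.LoadInt64(&exited))
}

func main() {
	jobs := make(chan int) // never sent to, never closed
	for i := 0; i < 5; i++ {
		go leakyWorker(jobs) // these goroutines can never exit
	}
	fmt.Println("fixed workers exited:", runFixed(5))
}
```

The leaky variant survives normal testing because the process exits cleanly anyway; the blocked goroutines are only visible in a trace or a goroutine dump.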
| Finding Type | How Detected | Confidence |
|---|---|---|
| Goroutine leak | Channel-blocked at trace end | High |
| Mutex deadlock | 2+ goroutines at same lock site >500ms | Medium |
| AB-BA lock inversion | Crossed lock acquisition history | Medium |
| Channel-lock cycle | Lock holder blocked on channel | Medium |
| Lock leak (static) | go/ssa CFG: lock path without unlock | Low |
| N-way lock cycle | Tarjan's SCC on lock-acquisition graph | Medium |
| Data race | Go race detector output parsing | High |
| Orphan goroutine | Created but never scheduled | Low |
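To make the "crossed lock acquisition history" row concrete, the sketch below checks for AB-BA inversions given each goroutine's lock acquisition order. It is an illustration of the general idea only, not ThreadGraph's implementation: the `lockOrder` input shape, the `inversion` type, and `findInversions` are hypothetical names, and it assumes each history lists locks acquired while the earlier ones were still held.

```go
package main

import (
	"fmt"
	"sort"
)

// lockOrder is a hypothetical input shape: the order in which one goroutine
// acquired locks while still holding the earlier ones.
type lockOrder []string

// inversion names a pair of locks that were taken in opposite orders.
type inversion struct{ A, B string }

// findInversions reports every lock pair (A, B) where some goroutine took
// A before B and some goroutine took B before A.
func findInversions(history map[int]lockOrder) []inversion {
	// before[a][b] is true when some goroutine acquired a before b.
	before := map[string]map[string]bool{}
	for _, order := range history {
		for i := 0; i < len(order); i++ {
			for j := i + 1; j < len(order); j++ {
				a, b := order[i], order[j]
				if before[a] == nil {
					before[a] = map[string]bool{}
				}
				before[a][b] = true
			}
		}
	}

	var out []inversion
	for a, succ := range before {
		for b := range succ {
			// a < b reports each unordered pair once.
			if a < b && before[b][a] {
				out = append(out, inversion{A: a, B: b})
			}
		}
	}
	// Sort for stable output, since map iteration order is random.
	sort.Slice(out, func(i, j int) bool {
		if out[i].A != out[j].A {
			return out[i].A < out[j].A
		}
		return out[i].B < out[j].B
	})
	return out
}

func main() {
	history := map[int]lockOrder{
		18: {"muA", "muB"}, // goroutine 18: A then B
		22: {"muB", "muA"}, // goroutine 22: B then A
	}
	for _, inv := range findInversions(history) {
		fmt.Printf("possible AB-BA inversion between %s and %s\n", inv.A, inv.B)
	}
}
```

An inversion found this way is a potential deadlock rather than an observed one, which is consistent with the Medium confidence listed in the table.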
Tested against GoBench GoKer: 68 real blocking bugs extracted from production Go projects and published as part of the GoBench benchmark suite (CGO 2021 paper).
Projects tested: Kubernetes, etcd, CockroachDB, gRPC, Docker, Hugo, Istio, Syncthing
| Version | Score | Key Technique Added |
|---|---|---|
| v0.1 | 31/68 (45%) | Basic leak detection |
| v0.2 | 59/68 (86%) | AB-BA, chan+lock cycle, schedule diversity |
| v0.3 | 62/68 (91%) | testing.T.Run provenance fix, syncHistory |
| v0.4 | 64/68 (94%) | Goroutine provenance tree, Tarjan's SCC N-way deadlock, receiver-aware mutex matching |
False positives on net/http/httptest (224 goroutines): 0
No other open-source Go tool has been benchmarked against this dataset.
ThreadGraph runs go test -trace on your package, so no binary modification is needed.
It parses the Go execution trace (a structured binary log of every goroutine state
transition) and applies six trace-based detection algorithms, plus one optional static pass:
- detectLeaks: goroutines blocked on channels at trace end (with lifetime-ratio confidence scoring)
- detectDeadlocks: mutex contention groups by call site
- detectLockCycles: N-way lock ordering cycles via Tarjan's SCC algorithm (catches 3-goroutine deadlocks)
- detectChanLockCycle: goroutines holding a lock while waiting on a channel
- detectOrphans: goroutines that never ran before the test exited
- detectTransientBlocks: mutex deadlocks unblocked by test timeout
- AnalyzeLockRelease (--static): go/ssa CFG analysis for locks not released on all code paths
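As an illustration of the lock-cycle item in the list above, the following sketch runs Tarjan's strongly connected components algorithm over a lock-order graph and keeps components with more than one lock. It shows the textbook algorithm, not ThreadGraph's actual code: the `graph` type, `tarjanSCC`, and `lockCycles` are hypothetical names, and an edge A -> B is assumed to mean "B was acquired while A was held".

```go
package main

import "fmt"

// graph maps a lock to the locks acquired while it was held (assumed shape).
type graph map[string][]string

// tarjanSCC returns the strongly connected components of g.
func tarjanSCC(g graph) [][]string {
	next := 0
	index := map[string]int{}
	lowlink := map[string]int{}
	onStack := map[string]bool{}
	var stack []string
	var sccs [][]string

	var visit func(v string)
	visit = func(v string) {
		index[v], lowlink[v] = next, next
		next++
		stack = append(stack, v)
		onStack[v] = true

		for _, w := range g[v] {
			if _, seen := index[w]; !seen {
				visit(w)
				if lowlink[w] < lowlink[v] {
					lowlink[v] = lowlink[w]
				}
			} else if onStack[w] && index[w] < lowlink[v] {
				lowlink[v] = index[w]
			}
		}

		// v is the root of a component: pop the stack down to v.
		if lowlink[v] == index[v] {
			var comp []string
			for {
				w := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				onStack[w] = false
				comp = append(comp, w)
				if w == v {
					break
				}
			}
			sccs = append(sccs, comp)
		}
	}

	for v := range g {
		if _, seen := index[v]; !seen {
			visit(v)
		}
	}
	return sccs
}

// lockCycles keeps only components with more than one lock, i.e. sets of
// locks that are mutually reachable in the acquisition order.
func lockCycles(g graph) [][]string {
	var out [][]string
	for _, comp := range tarjanSCC(g) {
		if len(comp) > 1 {
			out = append(out, comp)
		}
	}
	return out
}

func main() {
	g := graph{
		"muA": {"muB"},
		"muB": {"muC"},
		"muC": {"muA"}, // closes a 3-way cycle
		"muD": {"muA"}, // feeds into the cycle but is not part of it
	}
	for _, cycle := range lockCycles(g) {
		fmt.Println("lock cycle of size", len(cycle), cycle)
	}
}
```

A component-based check finds cycles of any length in one pass, which is why it covers the 3-goroutine case that a pairwise AB-BA check would miss.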
If no bugs are found on the first pass, it automatically retries with GOMAXPROCS=1, 2, and 4 to expose scheduling-dependent bugs that only manifest under specific interleavings.
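The retry pass can be pictured as a loop that rebuilds the same `go test -trace` command with a different GOMAXPROCS value each time. The sketch below only constructs and prints the commands; `traceCmd`, `retryPlan`, the `./mypkg` path, and the trace file names are hypothetical, and the way ThreadGraph actually drives the toolchain may differ.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strconv"
)

// traceCmd builds `go test -trace=<out> <pkg>` with GOMAXPROCS pinned via
// the environment. It targets a single package because go test rejects
// -trace when several packages are tested at once.
func traceCmd(pkg, out string, procs int) *exec.Cmd {
	cmd := exec.Command("go", "test", "-trace="+out, pkg)
	// A later duplicate in Env overrides an inherited GOMAXPROCS value.
	cmd.Env = append(os.Environ(), "GOMAXPROCS="+strconv.Itoa(procs))
	return cmd
}

// retryPlan lists the GOMAXPROCS values named in the text, in order.
func retryPlan() []int { return []int{1, 2, 4} }

func main() {
	for _, procs := range retryPlan() {
		out := fmt.Sprintf("trace-p%d.out", procs)
		cmd := traceCmd("./mypkg", out, procs)
		fmt.Printf("GOMAXPROCS=%d %v\n", procs, cmd.Args)
		// A real driver would call cmd.Run() here, analyze the trace,
		// and stop at the first pass that yields findings.
	}
}
```

Pinning GOMAXPROCS changes how goroutines are interleaved across OS threads, so a bug that hides at the default setting may appear at 1 or 2.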
For full algorithm documentation, see ARCHITECTURE.md.
When ANTHROPIC_API_KEY is set, ThreadGraph calls Claude to explain each finding in
plain English and suggest a fix. Pass --no-llm to skip this.
    export ANTHROPIC_API_KEY=sk-ant-...
    threadgraph run ./...

    --format string          Output format: terminal (default) or json
    --no-llm                 Skip Claude AI explanations
    --output string          Write output to file instead of stdout
    --min-block string       Minimum block duration to report (default "500ms")
    --static                 Enable go/ssa static lock-release analysis
    --debug-filtered         Print all blocked goroutines with filter status to stderr
    --save-baseline string   Save current findings as a baseline JSON file
    --baseline string        Suppress known findings; exit 1 only on new regressions
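The baseline flags amount to a set difference: report only findings that are not already in the saved file. The sketch below shows that idea under stated assumptions. The `Finding` struct, its JSON field names, and the type-plus-location key are hypothetical, since the real baseline schema is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding is a hypothetical, simplified record; the real JSON schema may differ.
type Finding struct {
	Type     string `json:"type"`
	Location string `json:"location"`
}

// key identifies a finding for comparison (assumed: type plus location).
func key(f Finding) string { return f.Type + "@" + f.Location }

// newFindings returns the findings in current that are absent from baseline.
func newFindings(current, baseline []Finding) []Finding {
	known := make(map[string]bool, len(baseline))
	for _, f := range baseline {
		known[key(f)] = true
	}
	var fresh []Finding
	for _, f := range current {
		if !known[key(f)] {
			fresh = append(fresh, f)
		}
	}
	return fresh
}

func main() {
	baselineJSON := []byte(`[{"type":"goroutine_leak","location":"testdata/buggy/main.go:42"}]`)
	var baseline []Finding
	if err := json.Unmarshal(baselineJSON, &baseline); err != nil {
		panic(err)
	}

	// Current run: the known leak plus one hypothetical new finding.
	current := append([]Finding{}, baseline...)
	current = append(current, Finding{Type: "mutex_deadlock", Location: "pkg/cache.go:17"})

	fresh := newFindings(current, baseline)
	fmt.Println("new findings:", len(fresh))
	// A CI wrapper would exit non-zero here when len(fresh) > 0.
}
```

Keying on a file and line number is brittle when unrelated edits shift code, so a more robust key might use the function name or a stack signature instead.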
- Goroutine provenance tree: BFS from testing.T roots; only test-owned goroutines reported
- Source code context in AI explanations (±10-line window around finding)
- CI baseline comparison (--save-baseline/--baseline)
- VS Code extension with inline annotations
- GitHub PR check — "this PR introduced 2 goroutine leaks"
- Production sampling mode (LeakProf-style pprof for long-running services)
See CONTRIBUTING.md.
Apache 2.0 — see LICENSE.