Claude Engram

Persistent memory for AI coding assistants. Auto-tracks mistakes, decisions, and context, and retrieves the right memory at the right time using hybrid search (keyword + vector + reranking). Works with any MCP-compatible tool. All data stays local.

Benchmarks

Retrieval benchmarks

Retrieval-only metrics (recall@k). These measure whether the right memory is found in the top results — not end-to-end QA with answer generation and judge scoring, which is what the published LongMemEval leaderboard measures. The MemPalace comparison uses the same retrieval-only methodology (their raw mode, no LLM reranking, top_k=10).

| Benchmark | Claude Engram | MemPalace (raw) |
|---|---|---|
| LongMemEval Recall@5 (500 questions) | 0.966 | 0.966 |
| LongMemEval Recall@10 | 0.982 | 0.982 |
| LongMemEval NDCG@10 | 0.889 | 0.889 |
| ConvoMem (250 items, 5 categories) | 0.960 | 0.929 |
| LoCoMo R@10 (1,986 questions, top_k=10) | 0.649 | 0.603 |
| Speed | 43ms/query | ~600ms/query |
| Dependencies | AllMiniLM (optional) | ChromaDB |

Reproduce: `python tests/bench_longmemeval.py` (likewise `bench_locomo.py` and `bench_convomem.py`)
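For reference, recall@k counts a question as a hit when at least one gold memory appears in the top k retrieved results. A minimal, self-contained sketch of the metric (illustrative only — not the benchmark harness in tests/):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of queries where at least one relevant item
    appears in the top-k retrieved results."""
    hits = 0
    for ranked, gold in zip(retrieved, relevant):
        if any(item in gold for item in ranked[:k]):
            hits += 1
    return hits / len(retrieved)

# Two queries: the first finds a gold memory in its top 2, the second does not.
retrieved = [["m1", "m7", "m3"], ["m4", "m5", "m6"]]
relevant = [{"m7"}, {"m9"}]
print(recall_at_k(retrieved, relevant, k=2))  # → 0.5
```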

Integration benchmarks

These test what the product actually does — not just search retrieval.

| Benchmark | What it tests | Result |
|---|---|---|
| Decision Capture (220 prompts) | Auto-detect decisions from user prompts | 97.8% precision, 36.7% recall |
| Injection Relevance (50 memories, 15 cases) | Right memories surface before edits | 14/15 passed, 100% cross-domain isolation |
| Compaction Survival (6 scenarios) | Rules/mistakes survive context compression | 6/6 passed |
| Error Auto-Capture (53 payloads) | Extract errors, reject noise, deduplicate | 100% recall, 97% precision |
| Multi-Project Scoping (11 cases) | Sub-project isolation + workspace inheritance | 11/11 passed |
| Edit Loop Detection (12 scenarios) | Detect spirals vs iterative improvement | 12/12 passed |

Reproduce: `python tests/bench_integration.py` (run from the tests/ directory)
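The edit-loop heuristic from the table above can be sketched as a sliding-window counter over recent edits. The window size and threshold here are illustrative assumptions, not the shipped values:

```python
from collections import deque

def detect_edit_loops(edit_events, threshold=3, window=5):
    """Flag files edited `threshold`+ times within the last `window`
    edits as possible spirals. Parameters are illustrative assumptions."""
    recent = deque(maxlen=window)
    flagged = set()
    for path in edit_events:
        recent.append(path)
        if list(recent).count(path) >= threshold:
            flagged.add(path)
    return flagged

# a.py is edited three times in quick succession; b.py and c.py are not.
edits = ["a.py", "b.py", "a.py", "a.py", "c.py"]
print(detect_edit_loops(edits))  # → {'a.py'}
```

Distinguishing spirals from iterative improvement would need additional signals (e.g. repeated test failures between edits); this sketch shows only the counting core.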

Comparison with MemPalace

Different approaches. MemPalace is a conversation archive with a spatial palace structure, knowledge graph, AAAK compression, and specialist agents. Claude Engram is live-capture: hooks into the coding lifecycle to auto-track mistakes, decisions, and context as you work. Comparable retrieval, different strengths.

Compatibility

| Platform | What Works | Auto-Capture |
|---|---|---|
| Claude Code (CLI, desktop, VS Code, JetBrains) | Everything | Yes — 10 hook events |
| Cursor | MCP tools (memory, search, scope, etc.) | No hooks |
| Windsurf | MCP tools | No hooks |
| Continue.dev | MCP tools | No hooks |
| Zed | MCP tools | No hooks |
| Any MCP client | MCP tools | No hooks |
| Python code | MemoryStore SDK directly | N/A |

With Claude Code, hooks auto-capture mistakes, decisions, edits, test results, and session state. With other tools, you use the MCP tools manually — the memory system, hybrid search, archiving, and scoring all work the same.

Features

  • Hybrid search — keyword + AllMiniLM vector + reranking. No ChromaDB dependency.
  • Auto-tracks mistakes from any failed tool. Warns before editing the same file.
  • Auto-captures decisions from prompts ("let's use X") via semantic + regex scoring.
  • Detects edit loops when the same file is edited 3+ times.
  • Survives compaction — auto-checkpoint before, re-inject rules/mistakes after.
  • Tiered storage — hot (fast) + archive (cold, searchable, restorable). Rules and mistakes never archive.
  • Scored injection — top 3 memories by file match, tags, recency, importance before every edit.
  • Multi-project — memories scoped per sub-project. Workspace rules cascade down.
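The scored-injection idea above can be sketched as a weighted sum over the four signals named in the feature list. The weights, decay curve, and field names here are assumptions for illustration, not the shipped scoring function:

```python
import time

def score_memory(mem, current_file, active_tags, now):
    """Weighted sum of file match, tag overlap, recency, and importance.
    Weights and the recency decay are illustrative assumptions."""
    file_match = 1.0 if mem["file"] == current_file else 0.0
    tags = len(mem["tags"] & active_tags) / max(len(active_tags), 1)
    age_days = (now - mem["created"]) / 86400
    recency = 1.0 / (1.0 + age_days)  # newer memories score higher
    importance = mem["importance"]    # assumed normalized to [0, 1]
    return 0.4 * file_match + 0.2 * tags + 0.2 * recency + 0.2 * importance

def top_memories(memories, current_file, active_tags, now, k=3):
    """Pick the top-k memories to inject before an edit."""
    key = lambda m: score_memory(m, current_file, active_tags, now)
    return sorted(memories, key=key, reverse=True)[:k]

now = time.time()
mems = [
    {"id": 1, "file": "auth.py", "tags": {"auth"}, "created": now - 86400, "importance": 0.5},
    {"id": 2, "file": "db.py", "tags": {"db"}, "created": now - 7 * 86400, "importance": 0.9},
    {"id": 3, "file": "auth.py", "tags": set(), "created": now - 30 * 86400, "importance": 0.1},
]
picked = top_memories(mems, "auth.py", {"auth"}, now, k=2)
print([m["id"] for m in picked])  # → [1, 3]
```

Note how the file-match signal dominates here: a stale, low-importance memory about the file being edited still outranks an important memory about an unrelated file.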

Install

```shell
git clone https://github.com/20alexl/claude-engram.git
cd claude-engram
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

pip install -e .                # Core
pip install -e ".[semantic]"    # + AllMiniLM for vector search and decision capture

python install.py               # Configure hooks + MCP server
```

Per-Project Setup

```shell
python install.py --setup /path/to/your/project
```

Or copy .mcp.json and CLAUDE.md to your project root.
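For orientation, a minimal .mcp.json typically follows the standard MCP client layout below. The server name and entry-point command are assumptions for illustration — copy the repository's own .mcp.json rather than hand-writing one:

```json
{
  "mcpServers": {
    "claude-engram": {
      "command": "python",
      "args": ["-m", "claude_engram.mcp_server"]
    }
  }
}
```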

Mid-Project Adoption

Already deep in a project? Install normally, then tell your AI to dump what it knows:

```
Save everything you know about this project:
- memory(add_rule) for each project convention
- memory(remember) for key facts about the architecture
- work(log_decision) for decisions we've made and why
```

Ollama (Optional)

Only needed for scout_search, scout_analyze, and LLM-based convention checking. Everything else works without it.

```shell
ollama pull gemma3:4b                    # or gemma3:12b for better semantic search
export CLAUDE_ENGRAM_MODEL="gemma3:4b"   # Linux/Mac
```

Configuration

| Variable | Default | Description |
|---|---|---|
| CLAUDE_ENGRAM_MODEL | gemma3:12b | Ollama model |
| CLAUDE_ENGRAM_OLLAMA_URL | http://localhost:11434 | Ollama endpoint |
| CLAUDE_ENGRAM_ARCHIVE_DAYS | 14 | Days until inactive memories archive |
| CLAUDE_ENGRAM_SCORER_TIMEOUT | 1800 | AllMiniLM server idle timeout (seconds) |
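These are ordinary environment variables, so overriding and checking them is straightforward. A sketch of how the documented defaults might be resolved (illustrative, not the project's actual config module):

```python
import os

# Documented defaults from the table above.
DEFAULTS = {
    "CLAUDE_ENGRAM_MODEL": "gemma3:12b",
    "CLAUDE_ENGRAM_OLLAMA_URL": "http://localhost:11434",
    "CLAUDE_ENGRAM_ARCHIVE_DAYS": "14",
    "CLAUDE_ENGRAM_SCORER_TIMEOUT": "1800",
}

def setting(name):
    """Environment value if set, otherwise the documented default."""
    return os.environ.get(name, DEFAULTS[name])

os.environ["CLAUDE_ENGRAM_MODEL"] = "gemma3:4b"
print(setting("CLAUDE_ENGRAM_MODEL"))  # → gemma3:4b
```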

Documentation

Library Book — design, internals, full usage guide, API reference, gotchas.

License

MIT
