Summary
Add a persistent context memory store that accumulates knowledge across sessions, automatically deduplicating, compressing, and expiring stale context. Think of it as a vector DB with built-in context intelligence.
Problem
Today's AI agents are stateless between sessions. Each new conversation starts from scratch - re-fetching the same docs, re-reading the same code, re-discovering the same patterns. Memory solutions like Mem0 store raw conversation history, but they don't understand redundancy. After 100 sessions, you have 100 copies of "this project uses React with TypeScript."
What Distill should do differently
Distill already knows how to deduplicate, compress, and cluster. Apply that to persistent storage:
- Write: Agent pushes context (code snippets, decisions, errors, learnings)
- Deduplicate on write: New context is compared against existing memory. If semantically redundant, it's merged or discarded.
- Read: Agent queries memory. Results are deduplicated, compressed, and ranked by relevance + recency.
- Decay: Old, unreferenced memories get progressively compressed (full text -> summary -> keywords -> evicted)
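The write path above can be sketched in Go. Everything here is an illustrative assumption, not Distill's actual code: the `Memory` struct, the `cosine` helper over pre-computed embeddings, and the 0.9 similarity threshold are invented to show the dedup-on-write decision.

```go
package main

import (
	"fmt"
	"math"
)

// Memory is a stored context entry. The Embedding field and the
// similarity threshold below are illustrative assumptions.
type Memory struct {
	Text      string
	Embedding []float64
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// storeWithDedup appends entry unless it is semantically redundant with an
// existing memory; in that case the existing memory is kept as-is (a real
// implementation might merge the two texts instead of discarding).
func storeWithDedup(store []Memory, entry Memory, threshold float64) ([]Memory, bool) {
	for _, m := range store {
		if cosine(m.Embedding, entry.Embedding) >= threshold {
			return store, false // deduplicated on write: not stored
		}
	}
	return append(store, entry), true
}

func main() {
	store := []Memory{{Text: "project uses React with TypeScript", Embedding: []float64{1, 0}}}
	_, stored := storeWithDedup(store, Memory{Text: "React + TS project", Embedding: []float64{0.99, 0.05}}, 0.9)
	fmt.Println(stored) // near-duplicate is rejected, so false
}
```

This is what prevents the "100 copies after 100 sessions" failure mode: redundancy is caught at write time, before it ever hits storage.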
API Design
POST /v1/memory/store
{
  "session_id": "session_abc",
  "entries": [
    {"text": "The auth service uses JWT with RS256", "source": "code_review", "tags": ["auth"]},
    {"text": "We switched from HS256 to RS256 in PR #142", "source": "git", "tags": ["auth", "security"]}
  ]
}
Response:
{
  "stored": 1,           // 1 new entry (the other was deduplicated against existing)
  "merged": 1,           // 1 entry merged with existing memory
  "total_memories": 847
}
POST /v1/memory/recall
{
  "query": "How does authentication work in this project?",
  "max_tokens": 2000,
  "recency_weight": 0.3
}
Response:
{
  "memories": [
    {"text": "Auth service uses JWT with RS256 (switched from HS256 in PR #142)", "relevance": 0.94, "last_referenced": "2026-02-14T..."},
    ...
  ],
  "stats": {
    "candidates": 23,
    "deduplicated": 8,
    "returned": 5,
    "token_count": 1840
  }
}
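One plausible way `recency_weight` and `max_tokens` could interact at recall time is a blended score plus a greedy budget fill. The `candidate` type, the `(1-w)*relevance + w*recency` formula, and the greedy strategy are assumptions sketched for illustration, not Distill's actual ranking code:

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a recalled memory with a precomputed relevance score (0..1),
// a recency score (0..1), and a token count. All fields are illustrative.
type candidate struct {
	Text      string
	Relevance float64
	Recency   float64
	Tokens    int
}

// recall ranks candidates by (1-w)*relevance + w*recency, then greedily
// fills the token budget with the highest-scoring entries that still fit.
func recall(cands []candidate, maxTokens int, recencyWeight float64) []candidate {
	sort.SliceStable(cands, func(i, j int) bool {
		si := (1-recencyWeight)*cands[i].Relevance + recencyWeight*cands[i].Recency
		sj := (1-recencyWeight)*cands[j].Relevance + recencyWeight*cands[j].Recency
		return si > sj
	})
	var out []candidate
	used := 0
	for _, c := range cands {
		if used+c.Tokens > maxTokens {
			continue // skip entries that would exceed max_tokens
		}
		out = append(out, c)
		used += c.Tokens
	}
	return out
}

func main() {
	got := recall([]candidate{
		{Text: "JWT RS256", Relevance: 0.94, Recency: 0.8, Tokens: 40},
		{Text: "old design doc", Relevance: 0.90, Recency: 0.1, Tokens: 2000},
	}, 100, 0.3)
	fmt.Println(len(got)) // only the entry that fits the budget is returned
}
```

Greedy fill is the simplest budget strategy; an implementation backed by `pkg/contextlab` would presumably also apply MMR so the returned memories are diverse, not just high-scoring.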
DELETE /v1/memory/forget
{
  "tags": ["deprecated"],
  "older_than": "2025-01-01"
}
Storage backends
| Backend | Use case |
|---|---|
| In-memory (default) | Development, single-session |
| SQLite | Local persistent storage |
| Redis | Shared across instances |
| Postgres + pgvector | Production, multi-tenant |
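The four backends in the table suggest a common interface that each one implements. The `MemoryStore` interface and the naive in-memory default below are assumptions sketching how that pluggability could look; a real backend would recall via embeddings and the dedup/compress pipeline, not substring matching.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// MemoryStore is a hypothetical backend interface. SQLite, Redis, or
// Postgres+pgvector support would mean providing another implementation.
type MemoryStore interface {
	Store(id, text string) error
	Recall(query string) ([]string, error)
	Forget(id string) error
}

// inMemStore is the development/single-session default from the table above.
type inMemStore struct {
	mu      sync.RWMutex
	entries map[string]string
}

func newInMemStore() *inMemStore {
	return &inMemStore{entries: map[string]string{}}
}

func (s *inMemStore) Store(id, text string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[id] = text
	return nil
}

// Recall uses naive substring matching purely for the sketch.
func (s *inMemStore) Recall(query string) ([]string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	var out []string
	for _, text := range s.entries {
		if strings.Contains(text, query) {
			out = append(out, text)
		}
	}
	return out, nil
}

func (s *inMemStore) Forget(id string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.entries, id)
	return nil
}

func main() {
	var store MemoryStore = newInMemStore()
	store.Store("m1", "auth uses JWT")
	hits, _ := store.Recall("JWT")
	fmt.Println(len(hits))
}
```

Keeping the interface small (store/recall/forget) means the decay worker and dedup logic can sit above it, shared across all backends.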
Key design decisions
- Dedup on write, not just read - prevents unbounded growth
- Hierarchical decay - memories compress over time (full -> summary -> keywords -> evicted)
- Source tracking - every memory knows where it came from (file, commit, conversation)
- Tag-based organization - enables scoped recall ("only auth-related memories")
- Token-budgeted recall - caller specifies max tokens, Distill fills the budget optimally
How this connects to existing Distill
- Uses pkg/dedup for write-time deduplication
- Uses pkg/compress for hierarchical decay
- Uses pkg/contextlab (clustering + MMR) for read-time retrieval
- Uses pkg/cache for hot-path acceleration
- Exposes Prometheus metrics and OTEL traces
Deliverables
- pkg/memory/store.go - Memory store interface
- pkg/memory/sqlite.go - SQLite backend
- pkg/memory/memory_test.go - Tests
- cmd/memory.go - CLI commands (distill memory store, distill memory recall, distill memory stats)
- API endpoints: /v1/memory/store, /v1/memory/recall, /v1/memory/forget, /v1/memory/stats
- MCP tools: memory_store, memory_recall
- Decay worker (background goroutine that compresses old memories)
Acceptance Criteria
- Store 10K memories, recall in <50ms
- Write-time dedup prevents duplicate storage
- Hierarchical decay reduces storage over time
- Token-budgeted recall fills context window optimally
- Works as MCP tool in Claude Desktop