# Memedge

Advanced memory management system for LLM agents with Letta-inspired features.
Memedge is a sophisticated memory system designed for building stateful LLM agents on Cloudflare Workers. Inspired by Letta (formerly MemGPT), it provides structured memory blocks, semantic search, recursive summarization, and privacy-aware memory management.
## Features

- 🎯 **Structured Memory Blocks**: Organize information into core blocks (human, persona, context) and custom blocks
- 🔍 **Semantic Search**: Built-in semantic search using Cloudflare AI embeddings (no external vector DB needed!)
- 📚 **Archival Memory**: Long-term storage with searchable history
- 🔄 **Recursive Summarization**: Hierarchical conversation summarization for managing long-term context
- 🔒 **Privacy-Aware**: Built-in privacy markers (`[PRIVATE]`, `[CONFIDENTIAL]`, `[DO NOT SHARE]`)
- ⚡ **Edge-Native**: Optimized for Cloudflare Workers with Durable Objects
- 🛠️ **LLM Tool Integration**: Ready-to-use tool definitions for function calling
- 💾 **SQL-Based**: Uses Cloudflare Durable Objects SQL for persistence
- 🎨 **Effect-Based**: Leverages Effect for type-safe error handling
## Installation

```bash
npm install memedge
# or
yarn add memedge
# or
pnpm add memedge
```

## Quick Start

### Basic Memory Operations

```typescript
import { Effect, Layer } from 'effect';
import {
  MemoryManagerLive,
  MemoryManagerService,
  SqlStorageContext
} from 'memedge/memory';
// Setup SQL storage context
const sqlContext = SqlStorageContext.of({ sql: durableObjectSQL });

// Create and use the memory manager
const program = Effect.gen(function* () {
  const memoryManager = yield* MemoryManagerService;

  // Initialize the database
  yield* memoryManager.initializeDatabase();

  // Write memory
  yield* memoryManager.writeMemory('user_profile', 'Name: Alice, Role: Engineer');

  // Read memory
  const entry = yield* memoryManager.readMemory('user_profile');
  console.log(entry?.text);
});
// Run with context
Effect.runPromise(
  program.pipe(
    Effect.provide(MemoryManagerLive),
    Effect.provide(Layer.succeed(SqlStorageContext, sqlContext))
  )
);
```

### Memory Block Manager

```typescript
import { Effect } from 'effect';
import {
  MemoryBlockManagerLive,
  MemoryBlockManagerService
} from 'memedge/memory';
const program = Effect.gen(function* () {
  const manager = yield* MemoryBlockManagerService;

  // Create a memory block
  yield* manager.createBlock(
    'human',
    'Human',
    'Name: Alice\nRole: Software Engineer\nPrefers: Concise responses',
    'core'
  );

  // Insert content
  yield* manager.insertContent('human', 'Company: TechCorp', 'end');

  // Replace content
  yield* manager.replaceContent('human', 'Concise responses', 'Detailed explanations');

  // Get block
  const block = yield* manager.getBlock('human');
  console.log(block?.content);
});
```

### Semantic Search

```typescript
import { Effect, Layer } from 'effect';
import {
  searchMemoryBlocks,
  generateEmbedding,
  AiBindingContext
} from 'memedge/memory';
const program = Effect.gen(function* () {
  const manager = yield* MemoryBlockManagerService;
  const blocks = yield* manager.getAllBlocks();

  // Search memory blocks semantically
  const results = yield* searchMemoryBlocks(
    'health information',
    blocks,
    5,   // limit
    0.5  // threshold
  );

  results.forEach(r => {
    console.log(`${r.block.label}: ${r.score}`);
    console.log(r.block.content);
  });
});

// Provide the AI binding for embeddings
Effect.runPromise(
  program.pipe(
    Effect.provide(MemoryBlockManagerLive),
    Effect.provide(Layer.succeed(AiBindingContext, { ai: env.AI }))
  )
);
```

### Recursive Summarization

```typescript
import {
  createBaseSummary,
  checkRecursiveSummarizationNeeded,
  createRecursiveSummary
} from 'memedge/summaries';
const program = Effect.gen(function* () {
  // Create a base summary from messages
  const summaryId = yield* createBaseSummary(messages, persona);

  // Check if recursive summarization is needed
  const check = yield* checkRecursiveSummarizationNeeded();
  if (check.needed && check.summaries) {
    // Create a recursive summary at the next level
    const recursiveId = yield* createRecursiveSummary(
      check.summaries,
      check.level!,
      persona
    );
    console.log(`Created level ${check.level} summary: ${recursiveId}`);
  }
});
```

## LLM Tool Integration

Memedge provides ready-to-use tool definitions for LLM function calling:

```typescript
import {
  getMemoryTools,
  getEnhancedMemoryTools,
  getAllMemoryTools
} from 'memedge/tools';
// Basic tools
const basicTools = getMemoryTools();
// { memory_read, memory_write }

// Enhanced Letta-style tools
const enhancedTools = getEnhancedMemoryTools();
// {
//   memory_get_block, memory_insert, memory_replace,
//   memory_rethink, memory_create_block, memory_list_blocks,
//   archival_insert, archival_search, memory_search
// }

// All tools (enhanced + legacy)
const allTools = getAllMemoryTools();

// Use with your LLM provider
const response = await generateText({
  model: openai('gpt-4'),
  tools: allTools,
  // ...
});
```

### Tool Executors

```typescript
import {
  executeMemoryGetBlock,
  executeMemoryInsert,
  executeMemorySearch
} from 'memedge/tools';
// Execute the matching tool based on the LLM's tool call
if (toolCall.name === 'memory_get_block') {
  const result = yield* executeMemoryGetBlock(toolCall.args);
  // { block_id, label, content, updated_at }
}

if (toolCall.name === 'memory_insert') {
  const result = yield* executeMemoryInsert(toolCall.args);
  // { success, message }
}

if (toolCall.name === 'memory_search') {
  const result = yield* executeMemorySearch({
    ...toolCall.args,
    useSemanticSearch: true
  });
  // { results: [{ block_id, label, content, score }] }
}
```

## Core Concepts

### Memory Blocks

Memory blocks are structured containers for different types of information:
- **Core blocks**: always loaded into context (`human`, `persona`, `context`, custom)
- **Archival blocks**: searchable long-term storage, loaded on demand
- **Operations**: `insert`, `replace`, `rethink` (complete rewrite)
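To make the model concrete, a block and its three edit operations might look roughly like this (an illustrative sketch only; the field names and types are assumptions, not Memedge's actual schema):

```typescript
// Illustrative sketch of a memory block record; the real Memedge schema may differ.
type BlockTier = 'core' | 'archival';

interface MemoryBlock {
  blockId: string;   // e.g. 'human', 'persona', 'context', or a custom id
  label: string;     // human-readable name
  content: string;   // free-form text the agent reads and edits
  tier: BlockTier;   // core blocks are always in context; archival are searched on demand
  updatedAt: number; // epoch milliseconds
}

// The three edit operations as plain functions over a block's content:
const insertContent = (b: MemoryBlock, text: string, where: 'start' | 'end'): MemoryBlock => ({
  ...b,
  content: where === 'end' ? `${b.content}\n${text}` : `${text}\n${b.content}`,
  updatedAt: Date.now()
});

const replaceContent = (b: MemoryBlock, oldText: string, newText: string): MemoryBlock => ({
  ...b,
  content: b.content.replace(oldText, newText),
  updatedAt: Date.now()
});

const rethink = (b: MemoryBlock, newContent: string): MemoryBlock => ({
  ...b,
  content: newContent, // complete rewrite
  updatedAt: Date.now()
});
```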
### Privacy-Aware Memory

Memedge supports privacy-aware memory with built-in markers:

```typescript
// Store private information
yield* memoryManager.writeMemory(
  'health_info',
  '[PRIVATE] Allergic to penicillin. [CONFIDENTIAL] Therapy on Tuesdays.'
);
// The system respects these markers when sharing information
```

Supported markers:

- `[PRIVATE]`: personal information
- `[CONFIDENTIAL]`: confidential data
- `[DO NOT SHARE]`: explicitly not shareable
- `[PERSONAL]`: personal notes
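As an illustration of how such markers could gate what gets shared, here is a minimal sketch (`redactPrivate` is a hypothetical helper, not part of Memedge's API; the library's actual enforcement may differ):

```typescript
// Sketch only: one way privacy markers could gate what an agent shares.
const PRIVACY_MARKERS = ['[PRIVATE]', '[CONFIDENTIAL]', '[DO NOT SHARE]', '[PERSONAL]'] as const;

// Split text into sentences and drop any sentence containing a marker.
function redactPrivate(text: string): string {
  return text
    .split(/(?<=[.!?])\s+/)
    .filter(sentence => !PRIVACY_MARKERS.some(m => sentence.includes(m)))
    .join(' ');
}

redactPrivate('[PRIVATE] Allergic to penicillin. Lives in Berlin.');
// → 'Lives in Berlin.'
```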
### How Semantic Search Works

Memedge uses a simple but effective approach to semantic search:

- **Embedding generation**: Cloudflare AI (`@cf/baai/bge-base-en-v1.5`, 768 dimensions)
- **Storage**: embeddings stored as JSON in SQL (no separate vector DB!)
- **Search**: cosine similarity computed in-worker
- **Performance**: sub-50ms search latency for typical queries
- **Cost**: included in Cloudflare Workers costs
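The in-worker similarity step can be sketched as follows (illustrative only; `cosineSimilarity` and `rankByEmbedding` are not Memedge exports, and the real query path goes through SQL):

```typescript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored rows against a query embedding, keeping scores above a threshold.
function rankByEmbedding(
  queryEmbedding: number[],
  rows: { blockId: string; embeddingJson: string }[], // embedding persisted as JSON text
  limit: number,
  threshold: number
): { blockId: string; score: number }[] {
  return rows
    .map(r => ({
      blockId: r.blockId,
      score: cosineSimilarity(queryEmbedding, JSON.parse(r.embeddingJson))
    }))
    .filter(r => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Because the vectors live in the same SQL database as the blocks, no round-trip to an external vector store is needed.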
### How Recursive Summarization Works

Hierarchical conversation summarization for managing long-term context:

- **Level 0**: base summaries (20 messages each)
- **Level 1**: meta-summaries (10 × L0)
- **Level 2**: super-summaries (10 × L1)
- **Level 3**: ultra-summaries (10 × L2)

This logarithmic approach keeps context manageable even with thousands of messages.
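The arithmetic behind that claim: each level multiplies coverage by the fan-out. A sketch (`messagesPerSummary` is an illustrative helper, not a Memedge export; 20 and 10 are the default thresholds from the configuration section):

```typescript
// Messages covered by a single summary at a given level:
// 20 messages per L0 summary, 10 child summaries per higher level.
function messagesPerSummary(level: number, base = 20, fanout = 10): number {
  return base * Math.pow(fanout, level);
}

// One L3 summary covers 20 * 10^3 = 20,000 messages, so a handful of
// recent summaries can stand in for a very long conversation history.
```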
## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                     Memedge System                      │
│                                                         │
│  ┌──────────────────┐   ┌────────────────────────────┐  │
│  │  Memory Manager  │   │    Memory Block Manager    │  │
│  │   (Legacy KV)    │   │        (Letta-style)       │  │
│  │                  │   │                            │  │
│  │ • purpose/text   │   │ • Structured blocks        │  │
│  │ • Privacy        │   │ • Core + Archival          │  │
│  │   markers        │   │ • insert/replace/rethink   │  │
│  └──────────────────┘   └────────────────────────────┘  │
│           │                           │                 │
│           └─────────────┬─────────────┘                 │
│                         ▼                               │
│  ┌───────────────────────────────────────────────────┐  │
│  │          Semantic Search (Cloudflare AI)          │  │
│  │                                                   │  │
│  │  • Generate embeddings (768D)                     │  │
│  │  • Store in SQL as JSON                           │  │
│  │  • Cosine similarity search                       │  │
│  │  • No external vector DB                          │  │
│  └───────────────────────────────────────────────────┘  │
│                         │                               │
│                         ▼                               │
│  ┌───────────────────────────────────────────────────┐  │
│  │              Recursive Summarization              │  │
│  │                                                   │  │
│  │  • Base summaries (L0)                            │  │
│  │  • Recursive meta-summaries (L1, L2, L3)          │  │
│  │  • Hierarchical context compression               │  │
│  └───────────────────────────────────────────────────┘  │
│                         │                               │
│                         ▼                               │
│  ┌───────────────────────────────────────────────────┐  │
│  │            Durable Objects SQL Storage            │  │
│  │                                                   │  │
│  │  • agent_memory (legacy)                          │  │
│  │  • memory_blocks (structured)                     │  │
│  │  • archival_memory (long-term)                    │  │
│  │  • memory_embeddings (vectors)                    │  │
│  │  • conversation_summaries_v2 (recursive)          │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
```
## Configuration

### Summarization

```typescript
const config: SummarizationConfig = {
  baseSummaryThreshold: 20, // Messages before an L0 summary
  recursiveThreshold: 10,   // Summaries before the next level
  maxLevel: 3,              // Maximum recursion depth
  recentSummaryCount: 3     // Recent summaries to load
};
```

### Semantic Search Tuning

```typescript
// Search with a custom threshold and limit
const results = yield* searchMemoryBlocks(
  query,
  blocks,
  10,  // limit: max results
  0.7  // threshold: minimum similarity score
);
```

## Documentation

See the API documentation for a detailed API reference.
## Testing

```bash
# Run tests
npm test

# Watch mode
npm run test:watch

# Coverage
npm run test:coverage
```

## Contributing

Contributions are welcome! Please read our Contributing Guide for details.

## License

MIT License - see the LICENSE file for details.
## Acknowledgments

- Inspired by Letta (MemGPT): thank you to the Letta team for pioneering advanced memory systems for LLM agents
- Built for Cloudflare Workers
- Powered by Effect
## Comparison with Letta

| Feature | Memedge | Letta |
|---|---|---|
| Architecture | Cloudflare Workers + Durable Objects | Python + PostgreSQL + Vector DB |
| Memory Blocks | ✅ Core + Archival | ✅ Core + Archival |
| Semantic Search | ✅ Built-in (Cloudflare AI) | ✅ External Vector DB |
| Embeddings | 768D, stored in SQL | Configurable, separate DB |
| Latency | ~30-50ms (edge) | ~100-200ms (server) |
| Scalability | Edge-native, globally distributed | Server-based |
| Privacy Markers | ✅ Built-in | ❌ Not included |
| Recursive Summarization | ✅ Hierarchical | ⚠️ Simple |
| Tool Integration | ✅ Zod schemas | ✅ Pydantic |
| Cost | Included in Workers | Separate services |
| Visual Tools | ❌ Code-first | ✅ Agent Dev Environment |
Made with ❤️ for the LLM agent community