claude-mem: Review, Architecture, and Alternatives for Claude Code Persistent Memory
Why Persistent Memory Has Become a Critical Issue for Developers
Claude Code has a structural limitation: every session starts with a blank context window. For a developer running daily sessions on the same project, this means a repetitive cycle: re-explaining the architecture, redefining conventions, recalling past decisions. Claude spends an average of 3 to 5 exploration tool calls before producing anything useful in a new session.
Claude Code's native memory (CLAUDE.md and MEMORY.md files) only partially solves the problem. MEMORY.md is limited to 200 lines and offers no semantic search. For projects with hundreds of accumulated sessions, that is insufficient.
claude-mem is an open-source plugin that adds a complete recall layer to Claude Code. With 37.2k GitHub stars, 174 releases, and 22 contributors, it is the most widely adopted persistent memory project in the ecosystem. What follows is a technical analysis of its architecture, strengths, weaknesses, and alternatives.
Technical Architecture: SQLite, Chroma, and the Hook Pipeline
System Overview
claude-mem integrates with Claude Code through the plugin lifecycle hook system. Six hook scripts intercept key session events:
| Hook | Trigger | Role |
|---|---|---|
| | Session opens | Queries the hybrid database, injects relevant summaries into context |
| | Prompt submitted | Initializes session tracking and prepares the observation pipeline |
| PostToolUse | After each tool call | Captures the observation (edit, bash, search, agent task) in real time |
| Stop | Before context ends | Generates semantic summaries via the Claude Agent SDK |
| | Session closes | Cleanup, final data persistence |
| Smart Install | First launch | Automatic dependency configuration |
The Worker Service runs in the background as an HTTP server on port 37777, managed by the Bun runtime. It exposes a local web interface and 10 search endpoints via a REST API.
The SQLite + Chroma Hybrid Storage
This is the technical core of the project. Two complementary storage layers coexist:
Layer 1: SQLite with FTS5
Structured storage: sessions, individual observations per tool call, AI-generated summaries
Keyword search via BM25 scoring
Low latency, no external dependencies
Database stored in ~/.claude-mem/
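As a rough illustration of what FTS5 keyword search with BM25 ranking looks like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and do not reflect claude-mem's actual schema; it also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not claude-mem's real layout.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE observations USING fts5(tool, summary)")
conn.executemany(
    "INSERT INTO observations (tool, summary) VALUES (?, ?)",
    [
        ("Edit", "Fixed JWT expiry check in the authentication middleware"),
        ("Bash", "Ran the database migration for the orders table"),
        ("Edit", "Renamed the logging helper and updated its imports"),
    ],
)


def keyword_search(db, query, limit=10):
    """Return (rowid, summary) pairs, best BM25 match first."""
    # bm25() yields lower (more negative) values for better matches,
    # so ascending order puts the most relevant rows first.
    return db.execute(
        "SELECT rowid, summary FROM observations "
        "WHERE observations MATCH ? ORDER BY bm25(observations) LIMIT ?",
        (query, limit),
    ).fetchall()


results = keyword_search(conn, "authentication")
```

The query only matches rows that contain the literal token, which is exactly the gap the vector layer is meant to cover.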
Layer 2: Chroma Vector Database
Semantic search via embeddings
Enables conceptual queries ("authentication issues" finds JWT-related context even without lexical matches)
Index stored in ~/.claude-mem/vector-db/chroma.sqlite3
Notable detail: Chroma itself stores its index in a SQLite file, meaning two SQLite instances coexist
The hybrid approach combines exact-match search (fast, deterministic) with semantic search (flexible, conceptual). This is a classic RAG (Retrieval-Augmented Generation) pattern, adapted here for the specific use case of session memory.
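How the two result lists are merged is not documented here. One common way to combine a keyword ranking with a semantic ranking is reciprocal rank fusion (RRF), sketched below. This is a swapped-in, generic technique used for illustration, not necessarily what claude-mem does; the IDs and the constant k=60 are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists into one list, best first.

    Each ID earns 1 / (k + rank) per list it appears in, so items
    ranked well by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = [12, 7, 31]   # hypothetical IDs from the FTS5 search
semantic_hits = [7, 44, 12]  # hypothetical IDs from the vector search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Observations found by both searches (7 and 12 here) outrank those found by only one.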
Context Compression
The compression mechanism is central to claude-mem's design. Rather than storing raw conversation transcripts (which can be 5 to 10x larger due to tool call noise), the plugin:
Captures structured observations at each PostToolUse hook: which tool was called, what it returned, which file was modified, what decision was made
Generates semantic summaries via the Claude Agent SDK at the Stop hook: compressed, meaningful representations of what happened
Selectively injects at session start: only the most relevant context is injected, based on semantic and keyword similarity
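To make the capture step concrete, the sketch below reduces a PostToolUse-style payload to a compact record. The input field names (session_id, tool_name, tool_input, tool_response) are assumed to follow the JSON that Claude Code passes to hooks; the output record shape and the 200-character preview limit are hypothetical, not claude-mem's actual format.

```python
import json


def to_observation(payload, max_chars=200):
    """Reduce a PostToolUse-style payload to a compact record.

    The output shape is a hypothetical illustration.
    """
    tool_input = payload.get("tool_input", {})
    # Serialize the response so it can be truncated whatever its type.
    response = json.dumps(payload.get("tool_response", ""))
    return {
        "session_id": payload.get("session_id"),
        "tool": payload.get("tool_name"),
        "file": tool_input.get("file_path"),
        "result_preview": response[:max_chars],
    }


sample = {
    "session_id": "abc123",
    "hook_event_name": "PostToolUse",
    "tool_name": "Edit",
    "tool_input": {"file_path": "src/auth.py"},
    "tool_response": {"success": True, "log": "x" * 5000},
}
observation = to_observation(sample)
```

Truncating the tool response is the simplest form of the noise reduction described above; the summaries generated at the Stop hook go further by rewriting the session into prose.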
The 3-Layer MCP Workflow: Progressive Disclosure
The 4 MCP tools exposed by claude-mem follow a progressive disclosure pattern designed to minimize token consumption:
Step 1: search(query="authentication bug", type="bugfix", limit=10)
→ Returns compact index with IDs (~50-100 tokens/result)
Step 2: timeline(observation_id=123)
→ Chronological context around that observation
Step 3: get_observations(ids=[123, 456])
→ Full details ONLY for filtered IDs (~500-1,000 tokens/result)
This design achieves roughly 10x token savings compared to loading all observations. The idea is to pre-filter before fetching full content.
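The 10x figure follows from the per-result costs quoted above: an index entry costs about a tenth of a full observation. The sketch below works through the arithmetic using the midpoints of those ranges (75 and 750 tokens), which are assumptions, and ignores the cost of the timeline step.

```python
def full_load_tokens(n_results, full_cost=750):
    """Tokens spent if every matching observation is loaded in full."""
    return n_results * full_cost


def progressive_tokens(n_results, n_fetched, index_cost=75, full_cost=750):
    """Tokens spent on a compact index plus full details for a few IDs."""
    return n_results * index_cost + n_fetched * full_cost


def savings_ratio(n_results, n_fetched):
    """How many times cheaper the progressive workflow is."""
    return full_load_tokens(n_results) / progressive_tokens(n_results, n_fetched)


small_search = savings_ratio(10, 2)    # 7500 / 2250
large_search = savings_ratio(100, 2)   # 75000 / 9000
```

Under these assumptions, 10x is a ceiling that is approached only when the index is large relative to the number of observations fetched in full: about 3.3x for 10 results with 2 fetched, about 8.3x for 100 results with 2 fetched.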
The 4 available MCP tools:
`search`: full-text search across the memory index, filterable by type, date, project
`timeline`: chronological context around a specific observation or query
`get_observations`: retrieve full details by ID (always batched)
`__IMPORTANT`: workflow documentation, always visible to Claude
Installation and Configuration
Setup (30 Seconds)
/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem
Restart Claude Code. The plugin is active. No API key, no configuration file required.
System Requirements
| Component | Minimum Version | Notes |
|---|---|---|
| Node.js | 18.0.0+ | Required |
| Claude Code | Latest version | Plugin support required |
| Bun | Auto-installed | JavaScript runtime and process manager |
| uv | Auto-installed | Python package manager for vector search |
| SQLite 3 | Bundled | Included with the plugin |
Configuration File
Settings are stored in ~/.claude-mem/settings.json (auto-created on first launch):
{
  "model": "claude-sonnet-4-20250514",
  "port": 37777,
  "dataDir": "~/.claude-mem",
  "logLevel": "info",
  "contextInjection": {
    "enabled": true,
    "maxTokens": 4000
  }
}
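For scripts that need to read these settings (for example, to find the port before calling the local API), a minimal loader might look like the sketch below. The defaults simply mirror the sample file above and may differ from the plugin's real defaults; load_settings is a hypothetical helper, not part of claude-mem.

```python
import copy
import json
from pathlib import Path

# Mirrors the sample file shown above; actual defaults may differ.
DEFAULTS = {
    "model": "claude-sonnet-4-20250514",
    "port": 37777,
    "dataDir": "~/.claude-mem",
    "logLevel": "info",
    "contextInjection": {"enabled": True, "maxTokens": 4000},
}


def load_settings(path="~/.claude-mem/settings.json"):
    """Return the file's settings layered over DEFAULTS.

    A missing file yields the defaults unchanged.
    """
    settings = copy.deepcopy(DEFAULTS)
    file = Path(path).expanduser()
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        for key, value in user.items():
            if isinstance(value, dict) and isinstance(settings.get(key), dict):
                settings[key].update(value)  # merge one level of nesting
            else:
                settings[key] = value
    return settings
```

Calling load_settings()["port"] then gives the port to use for the worker service, falling back to 37777 if the file has not been created yet.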
Local Web Interface
A viewer is available at http://localhost:37777 for:
Real-time memory stream viewing
Searching past observations
Accessing individual observations via /api/observation/{id}
Switching between stable and beta channels (Settings menu)
Version management
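The observation endpoint can also be called from a script. The sketch below uses only the Python standard library; the port and path come from the description above, while the assumption that the response body is JSON, and the helper names, are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:37777"


def observation_url(observation_id, base_url=BASE_URL):
    """Build the URL for a single observation."""
    return f"{base_url}/api/observation/{observation_id}"


def fetch_observation(observation_id, timeout=5):
    """GET one observation from the local worker.

    Assumes the worker is running and answers with a JSON body;
    raises urllib.error.URLError if it is not reachable.
    """
    url = observation_url(observation_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, fetch_observation(123) would request http://localhost:37777/api/observation/123.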
Privacy Controls
Wrap any sensitive content in <private> tags to prevent it from being stored in memory:
<private>This API key or personal note will not be captured</private>
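How the plugin filters these tags internally is not described here. The sketch below only illustrates the intended semantics, namely that everything between the tags is dropped before storage; strip_private is a hypothetical helper.

```python
import re

# Non-greedy match so that two separate <private> blocks are removed
# independently; DOTALL lets a block span several lines.
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text):
    """Remove every <private>...</private> span from the text."""
    return PRIVATE_BLOCK.sub("", text)


cleaned = strip_private("Deploy with <private>sk-test-123</private> to staging")
```

Text outside the tags is left untouched, so surrounding context can still be captured.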
Endless Mode: The Beta Feature That Divides Opinion
Endless Mode is an experimental feature accessible via the beta channel in the web viewer (Settings, then switch channel).
Described as a "biomimetic memory architecture for extended sessions," it promises:
Up to 95% token reduction per session
Roughly 20x more tool calls before hitting context limits
Effectively "endless" sessions without context window exhaustion
Community assessment: these figures were vigorously disputed during Reddit discussions in December 2025. Several developers called the 95% reduction "highly exaggerated," noting it applied to a then non-functional feature. The standard mode (outside Endless Mode) delivers more modest but verifiable savings.
Detailed Comparison of Persistent Memory Solutions
| Aspect | claude-mem | Mem0 | Supermemory | A-MEM | Native Claude Code |
|---|---|---|---|---|---|
| Architecture | Hooks + SQLite + Chroma | MCP + vector DB | Hooks + Cloud | ChromaDB + graph | CLAUDE.md / MEMORY.md |
| Storage | Local only | Cloud or self-hosted | Cloud | Local | Local |
| Cost | Free (AGPL-3.0) | Freemium (10k mem/month) | Paid (Pro+) | Free (MIT) | Free (built-in) |
| Privacy | Full local control | Cloud by default | Cloud | Full local control | Full local control |
| Setup | 2 commands | API key + config | API key + plugin | Manual | Built-in |
| Memory model | Compressed session logs | Extracted facts | Tool-call captures | Evolving knowledge graph | Flat markdown file |
| Search | Hybrid (FTS5 + vector) | Vector semantic | Vector semantic | Graph traversal (BFS/DFS) | None (file contents) |
| Auto-capture | Yes (hooks) | Yes (MCP) | Yes (hooks) | Partial | Yes (auto memory) |
| Scale | Hundreds of sessions | Unlimited (cloud) | Unlimited (cloud) | Unlimited | 200-line hard limit |
| Token efficiency | Progressive disclosure | 90% reduction (claimed) | Not specified | Graph traversal | Full load every session |
| Interconnection | No (isolated logs) | Partial | No | Yes (evolving graph) | No |
| License | AGPL-3.0 | Proprietary/OSS | Proprietary | MIT | N/A |
Key Differences Analyzed
claude-mem vs Mem0:
claude-mem runs entirely locally; Mem0 is primarily cloud-based
claude-mem uses hooks (automatic capture without agent action); Mem0 uses MCP (the agent must explicitly call memory tools)
Mem0 claims better benchmark results (LOCOMO +26% vs OpenAI); claude-mem has no published benchmark data
Mem0 offers a free tier limited to 10,000 memories per month; claude-mem is unlimited
claude-mem vs A-MEM:
claude-mem: compaction model. Compressed logs are static and don't update each other
A-MEM: knowledge graph model. Memories link to each other and update when new information arrives (e.g., "auth uses JWT" + "JWT expires in 1hr" produces two linked, mutually updated memories)
A-MEM is more sophisticated but potentially more fragile
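The difference between isolated logs and linked memories can be pictured with a small graph. The sketch below is an illustration of the concept only, not A-MEM's actual data model: memory IDs, the dictionary layout, and the related helper are all hypothetical.

```python
from collections import deque

# Each memory lists the IDs it is linked to. In a compaction model
# these link sets would simply not exist.
memories = {
    "auth-uses-jwt": {"text": "auth uses JWT", "links": {"jwt-expiry"}},
    "jwt-expiry": {"text": "JWT expires in 1hr", "links": {"auth-uses-jwt"}},
    "orders-schema": {"text": "orders table has a status column", "links": set()},
}


def related(start, graph):
    """Breadth-first walk over links, returning IDs in visit order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in sorted(graph[node]["links"]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


reachable = related("auth-uses-jwt", memories)
```

Retrieving the JWT memory also surfaces the linked expiry memory, while the unrelated schema memory stays out of the result. With isolated logs, the second fact is only found if the search query happens to match it as well.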
claude-mem vs Native Claude Code Memory:
CLAUDE.md is written by the developer; MEMORY.md is written by Claude with a 200-line limit
claude-mem adds the recall layer that native memory lacks: hybrid search, compression, large-scale archiving
The two are complementary. claude-mem does not replace CLAUDE.md; it extends the system
AGPL-3.0 License Considerations
The AGPL-3.0 license has direct implications for commercial users:
Personal or internal use: no restrictions
Modification and deployment as a network service: obligation to publish modified source code
Derivative works must adopt the same AGPL-3.0 license
The ragtime/ directory is under the PolyForm Noncommercial License 1.0.0, restricted to non-commercial use
If your company is considering integrating claude-mem into a product or service, a legal analysis of AGPL compliance is recommended.
Limitations and Known Issues
Documented Technical Issues (GitHub Issues, March 2026)
| Issue | Description | Reference |
|---|---|---|
| chroma-mcp process leaks | Orphaned subprocesses accumulate, causing SQLite lock errors on the vector database | Workaround: |
| Windows incompatibility | Claude Code hooks freeze the CLI on Windows | Issue #1366 |
| Installer loop | Installer fails in a loop due to | Issue #1371 |
| CRLF line endings | | Issue #1342 |
| AWS Bedrock | Environment variables not passed to CLI; observations fail silently | Issue #1373 |
| parseSummary bug | Creates empty SESSION SUMMARY records | Issue #1360 |
| Viewer crash | | Issue #1359 |
| macOS crashes | Frequent crashes and unresponsiveness reported | Issue #1362 |
Conceptual Limitations
RAG quality degrades at scale: retrieval accuracy decreases as the database grows. This is a fundamental limitation of RAG systems, not specific to claude-mem.
Non-interconnected logs: compressed session summaries remain isolated. Unlike a knowledge graph (A-MEM), newer information does not update older memories. On a very long project, this can lead to inconsistencies between summaries from different periods.
Retrieval noise: some community members describe the system as "a caching layer with cache invalidation problems." The difficulty of consistently retrieving the right past context is inherent to the RAG pattern.
Single-user only: no shared memory across developers. On a team project, each developer has their own memory database with no synchronization capability.
No published benchmarks: unlike Mem0 (which claims +26% on LOCOMO), claude-mem has published no quantitative data on retrieval quality.
Use Cases and Developer Feedback
Developers using persistent memory tools for Claude Code report:
Roughly 60% reduction in time spent re-explaining concepts between sessions
Code aligned with project conventions about 85% of the time (versus 30% without memory)
Time from session start to working feature cut roughly in half
More precise debugging thanks to recognition of recurring patterns
Scenarios where claude-mem delivers the most value:
Multi-week projects with significant context accumulation
Regression debugging where Claude can reference the exact change that introduced the bug
Multi-session refactors requiring coherent architectural understanding across multiple days
Automatic learning of coding preferences (arrow functions, TypeScript patterns, test conventions)
Technical Verdict
claude-mem is a well-designed tool that solves a real problem with a sensible architecture. The SQLite + Chroma hybrid storage choice is sound, the progressive token disclosure pattern is elegant, and integration via Claude Code hooks makes the experience seamless.
The strengths are clear: free, full local privacy, trivial installation, and an active community (37.2k GitHub stars in 7 months).
The weaknesses are equally clear: imperfect stability (process leaks, macOS crashes, Windows incompatibility), no published benchmarks, and inherent limitations of the RAG model at scale.
For an individual developer working on medium to long-duration projects with Claude Code, claude-mem is currently the most complete open-source solution. For teams or production environments, stability issues and AGPL-3.0 license implications warrant careful evaluation.
The most interesting alternative for those seeking a conceptually different approach is A-MEM (evolving knowledge graph, MIT license). For those who prioritize cloud hosting and verified benchmarks, Mem0 deserves evaluation.



