Local-first semantic memory for AI agents and humans.
Store anything. Find it by meaning. No cloud, no APIs, no subscriptions — everything runs on your machine.
Ragger combines dense vector search with keyword matching for better recall, uses local embeddings (no API calls), and supports pluggable storage backends. It's designed as a long-term memory backend for AI agents but works standalone.
A C++ port is also available with the same HTTP API, database format, and config file.
- Local embeddings: `all-MiniLM-L6-v2` via sentence-transformers (384-dim, ~90MB)
- Hybrid search: BM25 keyword + vector cosine similarity (pure Python, configurable blend)
- Fast vector search: NumPy cosine similarity (~10-50ms for 50K documents)
- Pluggable backends: SQLite (default); abstract base class makes it easy to add more
- HTTP & MCP servers: REST API and Model Context Protocol for tool integration
- Collection filtering: Organize memories into searchable collections (e.g., `docs`, `reference`, `memory`)
- Usage tracking: Per-memory access stats for identifying high-value content
- Python API: Reusable `RaggerMemory` class for embedding into your own apps
- Query logging: Track searches with timing, scores, and quality metrics
- Path normalization: `$HOME` → `~/` for portable, privacy-friendly storage
Production install (creates system user/group, installs as service):

    cd /path/to/Ragger
    sudo ./install.sh

The installer is interactive: it asks for single-user or multi-user mode, creates all system resources, installs the executable, then walks through every user on the system, offering to add, remove, or configure client integrations (OpenClaw, Claude Desktop) for each. Safe to re-run on upgrades; only the executable is overwritten.
Development setup (local venv):

    cd /path/to/Ragger
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -r requirements.txt

Usage:

    # Store a memory
    ragger store "The deploy script requires Node 18+"

    # Search (semantic: finds by meaning, not just keywords)
    ragger search "deployment requirements"

    # Import a document (chunked at paragraph boundaries)
    ragger import notes.md --collection docs

    # Start HTTP server (for OpenClaw plugin or any HTTP client)
    ragger serve

The first run downloads the embedding model (~90MB) to your HuggingFace cache. After that, all operations are offline.
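
The import command chunks documents at paragraph boundaries. As a rough illustration of what that can look like (not Ragger's actual chunker), the sketch below splits on blank lines and packs whole paragraphs into chunks. The name `chunk_paragraphs` and the `max_chars` limit are assumptions made for this example; the real chunk size and rules may differ.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Group whole paragraphs into chunks of at most max_chars characters.

    A paragraph is never split: one that is longer than max_chars
    simply becomes its own oversized chunk.
    """
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Splitting only at paragraph breaks keeps each stored chunk a coherent unit of text, which tends to give the embedding model cleaner input than fixed-width slicing.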
| Guide | Description |
|---|---|
| Getting Started | Installation, setup, first run |
| Configuration | Config files, settings reference |
| Collections | Organizing memories into collections |
| Search & RAG | How hybrid search works |
| HTTP API | REST endpoints, MCP server, auth |
| Python API | Library usage, custom backends |
| Chat Persistence | Turn storage, summaries, cleanup |
| Deployment | Production setup, LaunchDaemon, multi-user |
| Project Structure | Code layout, database schema |
| OpenClaw Integration | Plugin setup for OpenClaw |
| Agent Guide | Best practices for AI agents |
| Testing Your Install | Verify with your own data |
| Design Decisions | Why things are the way they are |
Version 0.8.1 — Production-ready with unified install, rebuild-embeddings backup/warning, and comprehensive testing.
Features: multi-user with per-user databases and search merging, common shared memory DB, bearer token authentication (enabled by default), automatic token rotation, per-user inference model selection, rebuild-embeddings verb with backup and confirmation, schema-driven API formats, chat persistence with background summarization, user provisioning CLI, and idempotent install script.
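
Since bearer token authentication is enabled by default, HTTP clients presumably need to send a token with each request. The sketch below builds such a request with the standard library only. It assumes the conventional `Authorization: Bearer <token>` header and reuses the `/search` endpoint and JSON body from the curl examples in this README; the helper name `build_search_request` is hypothetical, and where the token comes from is left to the HTTP API guide.

```python
import json
import urllib.request


def build_search_request(query: str, token: str,
                         base_url: str = "http://localhost:8432",
                         limit: int = 3) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /search.

    The header name and scheme follow the usual bearer-token convention;
    that Ragger expects exactly this header is an assumption.
    """
    payload = json.dumps({"query": query, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

With the server running, the request can be sent with `urllib.request.urlopen(req)`.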
- Python 3.10+
- ~1GB disk for model + dependencies
- SQLite backend: No extra dependencies (uses Python stdlib)
Platforms: macOS and Linux are tested and supported. Windows should work (all dependencies have Windows wheels) but needs a PowerShell install script and Windows service integration. Contributions welcome.
- Store: Text → 384-dim vector via sentence-transformers → backend document with text, embedding, metadata, and timestamp.
- Search: Query → vector → hybrid scoring (NumPy cosine similarity + BM25 keyword relevance) → results ranked by blended score.
- Performance: 50K vectors × 384 dims ≈ 10-50ms on Apple Silicon. The embedding model stays loaded in server mode for fast repeated queries.
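
The search step above can be sketched in a few lines of NumPy. This is an illustration of blended scoring in general, not Ragger's implementation: the function names, the BM25 parameters `k1` and `b`, the blend weight `alpha`, and the choice to scale BM25 scores by their maximum before blending are all assumptions made for the example.

```python
import math
from collections import Counter

import numpy as np


def cosine_scores(query_vec: np.ndarray, doc_vecs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and each row of doc_vecs."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return d @ q


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75) -> np.ndarray:
    """Okapi BM25 score of each tokenized document for the query tokens."""
    n = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(t for d in docs_tokens for t in set(d))
    scores = np.zeros(n)
    for i, doc in enumerate(docs_tokens):
        tf = Counter(doc)
        for term in query_tokens:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            scores[i] += idf * tf[term] * (k1 + 1) / norm
    return scores


def hybrid_scores(query_vec, doc_vecs, query_tokens, docs_tokens, alpha=0.7):
    """Blend vector and keyword scores; alpha weights the vector side."""
    vec = cosine_scores(query_vec, doc_vecs)
    kw = bm25_scores(query_tokens, docs_tokens)
    if kw.max() > 0:
        kw = kw / kw.max()  # scale BM25 into [0, 1] so the blend is comparable
    return alpha * vec + (1 - alpha) * kw
```

Ranking is then a matter of sorting documents by the blended score, for example with `np.argsort(-scores)`.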
Python API:

    from ragger_memory import RaggerMemory

    with RaggerMemory() as memory:
        memory.store("Some fact", metadata={"source": "notes.md"})
        results = memory.search("related query", limit=5)
        for r in results:
            print(f"[{r['score']:.3f}] {r['text']}")

HTTP API:

    # Store
    curl -X POST http://localhost:8432/store \
      -H "Content-Type: application/json" \
      -d '{"text": "Deploy to staging every Friday", "metadata": {"category": "preference"}}'

    # Search
    curl -X POST http://localhost:8432/search \
      -H "Content-Type: application/json" \
      -d '{"query": "deployment schedule", "limit": 3}'

CLI:

    # Store a memory
    ragger store "The deploy script requires Node 18+"

    # Search with filters
    ragger search "API authentication" --collections docs --limit 3

    # Import files
    ragger import notes.md --collection docs
    ragger import doc1.md doc2.md doc3.md --collection reference

    # Export documents
    ragger export docs ./exported/

    # Count memories
    ragger count

    # Rebuild BM25 index
    ragger rebuild-bm25

    # Run MCP server (JSON-RPC over stdin/stdout)
    ragger mcp

Memories are organized into collections: logical groups that let you separate reference material from conversation memories and search the right pool for the right question.
Built-in:
- `memory`: Agent-stored memories such as facts, decisions, preferences, and session summaries (default)

Example custom collections:
- `docs`: Project documentation, API references
- `reference`: Technical manuals, specifications
- `notes`: Meeting notes, research, bookmarks

Search specific collections:

    ragger search "API auth" --collections docs reference

Or search everything (default):

    ragger search "API auth"

See Collections for best practices and AI agent integration tips.
GPL v3 — See LICENSE for details.
Commercial licensing: If you'd like to use Ragger in a proprietary product without GPL obligations, commercial licenses are available. Contact reid@diskerror.com.