Ragger Memory

Local-first semantic memory for AI agents and humans.

Store anything. Find it by meaning. No cloud, no external API calls, no subscriptions: everything runs on your machine.

Ragger combines dense vector search with keyword matching for better recall, uses local embeddings (no API calls), and supports pluggable storage backends. It's designed as a long-term memory backend for AI agents but works standalone.

A C++ port is also available with the same HTTP API, database format, and config file.

Features

Local embeddings — all-MiniLM-L6-v2 via sentence-transformers (384-dim, ~90MB)
Hybrid search — BM25 keyword + vector cosine similarity (pure Python, configurable blend)
Fast vector search — NumPy cosine similarity (~10-50ms for 50K documents)
Pluggable backends — SQLite (default); abstract base class makes it easy to add more
HTTP & MCP servers — REST API and Model Context Protocol for tool integration
Collection filtering — Organize memories into searchable collections (e.g., docs, reference, memory)
Usage tracking — Per-memory access stats for identifying high-value content
Python API — Reusable RaggerMemory class for embedding into your own apps
Query logging — Track searches with timing, scores, and quality metrics
Path normalization — $HOME → ~/ for portable, privacy-friendly storage

Quick Start

Production install (creates system user/group, installs as service):

cd /path/to/Ragger
sudo ./install.sh

The installer is interactive: it asks for single-user or multi-user mode, creates all system resources, installs the executable, then walks through every user on the system — offering to add, remove, or configure client integrations (OpenClaw, Claude Desktop) for each. Safe to re-run on upgrades; only the executable is overwritten.

Development setup (local venv):

cd /path/to/Ragger
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Usage:

# Store a memory
ragger store "The deploy script requires Node 18+"

# Search (semantic — finds by meaning, not just keywords)
ragger search "deployment requirements"

# Import a document (chunked at paragraph boundaries)
ragger import notes.md --collection docs

# Start HTTP server (for OpenClaw plugin or any HTTP client)
ragger serve

First run downloads the embedding model (~90MB) to your HuggingFace cache. After that, all operations are offline.
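
The import command above chunks documents at paragraph boundaries. A rough sketch of that idea follows, assuming paragraphs are separated by blank lines and packed greedily up to a size limit. The function name `chunk_paragraphs` and the `max_chars` parameter are illustrative assumptions; Ragger's actual chunking rules and limits may differ.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Split on blank lines, then pack whole paragraphs into chunks (sketch)."""
    # A paragraph boundary is one or more blank (or whitespace-only) lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

In this sketch a paragraph is never split, so a single paragraph longer than `max_chars` becomes its own oversized chunk.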

Documentation

Guide	Description
Getting Started	Installation, setup, first run
Configuration	Config files, settings reference
Collections	Organizing memories into collections
Search & RAG	How hybrid search works
HTTP API	REST endpoints, MCP server, auth
Python API	Library usage, custom backends
Chat Persistence	Turn storage, summaries, cleanup
Deployment	Production setup, LaunchDaemon, multi-user
Project Structure	Code layout, database schema
OpenClaw Integration	Plugin setup for OpenClaw
Agent Guide	Best practices for AI agents
Testing Your Install	Verify with your own data
Design Decisions	Why things are the way they are

Status

Version 0.8.1 — Production-ready with unified install, rebuild-embeddings backup/warning, and comprehensive testing.

Features: multi-user with per-user databases and search merging, common shared memory DB, bearer token authentication (enabled by default), automatic token rotation, per-user inference model selection, rebuild-embeddings verb with backup and confirmation, schema-driven API formats, chat persistence with background summarization, user provisioning CLI, and idempotent install script.

Requirements

Python 3.10+
~1GB disk for model + dependencies
SQLite backend: No extra dependencies (uses Python stdlib)

Platforms: macOS and Linux are tested and supported. Windows should work, since all dependencies have Windows wheels, but it still lacks a PowerShell install script and Windows service integration. Contributions welcome.

How It Works

Store: Text → 384-dim vector via sentence-transformers → backend document with text, embedding, metadata, and timestamp.
Search: Query → vector → hybrid scoring (NumPy cosine similarity + BM25 keyword relevance) → results ranked by blended score.
Performance: a search over 50K vectors × 384 dims takes roughly 10-50 ms on Apple Silicon. In server mode the embedding model stays loaded, so repeated queries do not pay the model load cost again.
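
The hybrid scoring step can be sketched as follows. This is not Ragger's implementation: it uses only the standard library instead of NumPy so that it runs anywhere, and the function names, the `alpha` blend weight, the BM25 parameters `k1` and `b`, and the divide-by-maximum normalization of keyword scores are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bm25_scores(query: list[str], docs: list[list[str]],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each tokenized document for the query terms."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n if n else 0.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


def hybrid_rank(query_vec, query_terms, doc_vecs, doc_terms, alpha=0.7):
    """Return (doc_index, blended_score) pairs, best first.

    alpha weights the vector score; (1 - alpha) weights the keyword score.
    """
    vec = [cosine(query_vec, v) for v in doc_vecs]
    kw = bm25_scores(query_terms, doc_terms)
    # Assumption: scale BM25 by its maximum so both signals share a range.
    top = max(kw, default=0.0)
    kw_norm = [s / top if top > 0 else 0.0 for s in kw]
    blended = [alpha * v + (1 - alpha) * k for v, k in zip(vec, kw_norm)]
    return sorted(enumerate(blended), key=lambda p: p[1], reverse=True)
```

Setting `alpha=1.0` in this sketch gives pure vector ranking and `alpha=0.0` gives pure keyword ranking, which is one way to read the "configurable blend" mentioned under Features.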

Python API Example

from ragger_memory import RaggerMemory

with RaggerMemory() as memory:
    memory.store("Some fact", metadata={"source": "notes.md"})
    results = memory.search("related query", limit=5)
    for r in results:
        print(f"[{r['score']:.3f}] {r['text']}")

HTTP API Example

# Store
curl -X POST http://localhost:8432/store \
  -H "Content-Type: application/json" \
  -d '{"text": "Deploy to staging every Friday", "metadata": {"category": "preference"}}'

# Search
curl -X POST http://localhost:8432/search \
  -H "Content-Type: application/json" \
  -d '{"query": "deployment schedule", "limit": 3}'

Command-Line Examples

# Store a memory
ragger store "The deploy script requires Node 18+"

# Search with filters
ragger search "API authentication" --collections docs --limit 3

# Import files
ragger import notes.md --collection docs
ragger import doc1.md doc2.md doc3.md --collection reference

# Export documents
ragger export docs ./exported/

# Count memories
ragger count

# Rebuild BM25 index
ragger rebuild-bm25

# Run MCP server (JSON-RPC over stdin/stdout)
ragger mcp

Collections

Memories are organized into collections — logical groups that let you separate reference material from conversation memories and search the right pool for the right question.

Built-in:

memory — Agent-stored memories: facts, decisions, preferences, session summaries (default)

Example custom collections:

docs — Project documentation, API references
reference — Technical manuals, specifications
notes — Meeting notes, research, bookmarks

Search specific collections:

ragger search "API auth" --collections docs reference

Or search everything (default):

ragger search "API auth"

See Collections for best practices and AI agent integration tips.
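
Conceptually, a collection filter narrows the candidate pool before results are ranked. The sketch below shows that idea on in-memory records. The `Memory` dataclass and `filter_by_collections` function are hypothetical stand-ins and do not reflect Ragger's schema or Python API.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    collection: str = "memory"  # the built-in default collection


def filter_by_collections(memories, collections=None):
    """Keep only memories in the named collections; None or empty keeps all."""
    if not collections:
        return list(memories)
    wanted = set(collections)
    return [m for m in memories if m.collection in wanted]
```

Passing `["docs", "reference"]` in this sketch mirrors `--collections docs reference`, and omitting the argument mirrors the default of searching everything.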

License

GPL v3 — See LICENSE for details.

Commercial licensing: If you'd like to use Ragger in a proprietary product without GPL obligations, commercial licenses are available. Contact reid@diskerror.com.
