Self-hosted search API for AI agents. 16 endpoints. 9-strategy content extraction. Optional Tor-anonymized stack. No third-party search API keys, no per-query fees, no vendor lock-in. Optional local bearer auth is supported.
git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/prepare-searxng.sh
docker compose up -d
curl "http://localhost:3939/search?q=distributed+consensus+algorithms"

You now have a deduplicated, multi-engine search API running on :3939.
If you enable auth, pass the token on all non-health endpoints:
export AGENT_SEARCH_TOKEN="change-me"
curl -H "Authorization: Bearer $AGENT_SEARCH_TOKEN" \
  "http://localhost:3939/search?q=distributed+consensus+algorithms"

Prefer not to use Docker for the API server?
git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/install-native.sh
./scripts/run-native.sh

Native mode requires Python 3.11+ and a reachable SearXNG instance with JSON output enabled. It stores AgentSearch state in ./data. See Native Install.
Run the self-contained test suite:
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt pytest requests
pip install -e sdk -e mcp-server
./scripts/prepare-searxng.sh
pytest tests -q
python -m compileall app adapters mcp-server/agent_search_mcp scripts sdk -q
docker compose -f docker-compose.yml config --quiet
docker compose -f docker-compose.yml -f examples/compose.private.yml config --quiet
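docker build -t agent-search-api:test .

The pytest suite mocks SearXNG, so the tests themselves do not require Docker or a running local service. Only the `docker compose ... config` and `docker build` checks need Docker installed.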
Run the optional live localhost check:
AGENTSEARCH_INTEGRATION=1 pytest tests -q

If your local instance requires auth:

AGENT_SEARCH_TOKEN="change-me" AGENTSEARCH_INTEGRATION=1 pytest tests -q

Run the optional Docker smoke tests against a running direct/private stack:
AGENT_SEARCH_TOKEN="change-me" \
AGENTSEARCH_DOCKER_INTEGRATION=1 \
pytest tests/test_live_docker.py -q

AgentSearch wraps SearXNG with a FastAPI layer that adds what LLM agents need from search: deduplication, cross-engine scoring, content extraction, query expansion, domain trust scoring, prompt injection scrubbing, and self-improvement.
Standard stack: `docker compose up` gives you search on :3939.

Private stack: `docker compose -f docker-compose.yml -f examples/compose.private.yml up` adds an anonymized instance on :3940 that routes all traffic through Tor with Snowflake obfuscation. DNS is encrypted via CoreDNS → Cloudflare DoT. Isolation is enforced at the network level: the private stack has no route to the internet except through Tor.
AgentSearch delegates engine support to the connected SearXNG instance. The authoritative list for a running stack is:
curl "http://localhost:3939/engines"

The bundled searxng/settings.example.yml explicitly enables 23 engines: Google, Startpage, Brave, Bing, DuckDuckGo, Google Scholar, Semantic Scholar, arXiv, Crossref, OpenAlex, PubMed, Google News, Bing News, Yahoo News, Wikinews, Wikipedia, Wikidata, Hugging Face, Reddit, Hacker News, Stack Overflow, GitHub, and Lobsters.
Run ./scripts/prepare-searxng.sh to create ignored local runtime files at searxng/settings.yml and searxng/settings.tor.yml with generated SearXNG instance secrets. Do not commit those generated files.
Because SearXNG is configured with use_default_settings: true, your live instance may expose additional enabled engines from the installed SearXNG catalog. Use the engines= query parameter to request specific engines, and use /engines to verify what is available in that deployment.
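As a rough illustration of the `engines=` parameter, the sketch below builds a `/search` URL restricted to specific engines. It assumes the parameter takes a comma-separated list (as SearXNG itself does) and that `arxiv` and `wikipedia` are valid engine names in your deployment; check `/engines` to confirm both.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3939"  # direct stack port used in this README


def build_search_url(query, engines=None, base_url=BASE_URL):
    """Build a /search URL, optionally restricted to specific engines."""
    params = {"q": query}
    if engines:
        # Assumption: engines= accepts a comma-separated list of engine names.
        params["engines"] = ",".join(engines)
    return f"{base_url}/search?{urlencode(params)}"


# Engine names here are illustrative; verify them against /engines.
print(build_search_url("raft consensus", engines=["arxiv", "wikipedia"]))
```

The helper only builds the URL, so it can be checked without a running stack.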
SearXNG finds pages. AgentSearch finds pages, reads them, scores them, deduplicates them, caches them, scrubs prompt injections out of them, detects paywalls, falls back through 9 extraction strategies when the first one fails, and gets better at it over time. One API call.
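To make "deduplicates" and "scores" concrete, here is a minimal sketch of cross-engine deduplication. It is not the AgentSearch implementation: the normalization rules, the `engine` field, and the "more engines means higher rank" scoring are all assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Canonicalize a URL so trivially different forms compare equal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return f"{host}{path}" + (f"?{parts.query}" if parts.query else "")


def dedupe(results):
    """Merge results sharing a normalized URL; rank by engine agreement."""
    merged = {}
    for r in results:
        key = normalize_url(r["url"])
        entry = merged.setdefault(
            key, {"url": r["url"], "title": r["title"], "engines": set()}
        )
        entry["engines"].add(r["engine"])
    # Hypothetical scoring: pages returned by more engines rank first.
    return sorted(merged.values(), key=lambda e: len(e["engines"]), reverse=True)


raw = [
    {"url": "https://www.example.com/raft/", "title": "Raft", "engine": "google"},
    {"url": "https://example.com/raft", "title": "Raft", "engine": "bing"},
    {"url": "https://example.org/paxos", "title": "Paxos", "engine": "brave"},
]
deduped = dedupe(raw)
for item in deduped:
    print(item["url"], sorted(item["engines"]))
```

The first two inputs collapse into one entry credited to both engines, which is why it sorts ahead of the single-engine result.
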
| | AgentSearch | Tavily | Exa | SerpAPI | Raw SearXNG |
|---|---|---|---|---|---|
| Cost | Free | $0.005/query | $0.001/query | $50/mo | Free |
| Self-hosted | ✅ | ❌ | ❌ | ❌ | ✅ |
| Content extraction | 9-strategy kill chain | Basic | Basic | ❌ | ❌ |
| Deduplication | Cross-engine | ❌ | ❌ | ❌ | ❌ |
| Prompt injection scrubbing | ✅ | ❌ | ❌ | ❌ | ❌ |
| Self-improving | ✅ (evolver) | ❌ | ❌ | ❌ | ❌ |
| Tor anonymization | Optional | ❌ | ❌ | ❌ | Manual |

| Endpoint | Method | What it does |
|---|---|---|
| `/search` | GET | Multi-engine web search with deduplication and scoring |
| `/search/deep` | GET | Server-side query expansion: runs variations in parallel, fuses results |
| `/search/extract` | GET | Search + inline content extraction in one call |
| `/search/jobs` | GET | Job search across LinkedIn, Indeed, Glassdoor, ZipRecruiter |
| `/search/policy` | GET | Policy and regulatory document search |
| `/search/sources` | GET | Source discovery with institutional filtering |
| `/search/sources/institutions` | GET | List source registry institutions |
| `/search/stats` | GET | Query statistics and cache metrics |
| `/news` | GET | Structured multi-source news (Google News, Bing News, Yahoo News) |

| Endpoint | Method | What it does |
|---|---|---|
| `/read` | GET | 9-strategy kill chain extraction for any URL |
| `/read/batch` | POST | Concurrent multi-URL extraction in one request |

The kill chain escalates through strategies until one succeeds:
- Direct fetch + smart content selectors
- Readability scoring (paragraph density vs link density)
- User-agent rotation (Chrome/Safari/Firefox/Edge signatures)
- Wayback Machine (CDX API → latest snapshot)
- Google Cache
- Search-about fallback (find coverage elsewhere)
- Custom adapters (pluggable Python modules from disk)
- PDF extraction (pdfplumber)
- YouTube transcript (yt-dlp)
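
The escalation pattern above can be sketched as a loop over ordered strategies that stops at the first usable result. This is an illustrative sketch, not the code in `killchain.py`: the `min_chars` threshold and the stub strategies are assumptions, and the result keys simply mirror the `/read` response example in this README.

```python
def run_kill_chain(url, strategies, min_chars=50):
    """Try each (name, strategy) pair in order; return the first usable result."""
    tried = []
    for name, strategy in strategies:
        tried.append(name)
        try:
            content = strategy(url)
        except Exception:
            content = None  # a failing strategy escalates to the next one
        if content and len(content) >= min_chars:
            return {"url": url, "content": content, "strategy": name,
                    "chars": len(content), "strategies_tried": tried}
    return {"url": url, "content": None, "strategy": None,
            "chars": 0, "strategies_tried": tried}


# Stub strategies standing in for real fetchers (hypothetical behavior).
def direct(url):
    raise ConnectionError("403 Forbidden")


def readability(url):
    return "Subscribe to read"  # too short to count as extracted content


def wayback(url):
    return "Archived article text. " * 10


result = run_kill_chain(
    "https://example.com/article",
    [("direct", direct), ("readability", readability), ("wayback", wayback)],
)
print(result["strategy"], result["strategies_tried"])
```

Treating both exceptions and too-short content as failures keeps the loop simple: every strategy either produces enough text or hands off to the next one.
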
Every request gets SSRF protection, prompt injection detection, paywall detection, and content length caps automatically.
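
A minimal sketch of what an SSRF guard can look like, assuming the check is "resolve the host and reject any non-public address". The function names and the injectable `resolver` are hypothetical, and the real AgentSearch check may differ.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(address):
    """True only for addresses that are routable on the public internet."""
    ip = ipaddress.ip_address(address)
    return not (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def resolve_host(host):
    """Resolve a hostname to all of its IP addresses."""
    return [info[4][0] for info in socket.getaddrinfo(host, None)]


def is_safe_url(url, resolver=resolve_host):
    """Allow only http(s) URLs whose host resolves to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        addresses = resolver(parts.hostname)
        return bool(addresses) and all(is_public_address(a) for a in addresses)
    except (OSError, ValueError):
        return False  # unresolvable or malformed hosts are rejected


# The identity resolver avoids DNS: the host is already an IP literal.
print(is_safe_url("http://169.254.169.254/latest/meta-data", resolver=lambda h: [h]))
print(is_safe_url("http://8.8.8.8/", resolver=lambda h: [h]))
```

A check like this has to be repeated for every redirect hop, since a public URL can redirect to an internal address.
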

| Endpoint | Method | What it does |
|---|---|---|
| `/adapt/report` | POST | Report a fetch failure for a URL |
| `/adapt/stats` | GET | View adaptation metrics and failure patterns |
| `/adapt/evolve` | POST | Trigger self-improvement cycle: analyzes failures, tunes config |

| Endpoint | Method | What it does |
|---|---|---|
| `/health` | GET | Health check (API + SearXNG status) |
| `/engines` | GET | List available search engines and their status |

curl "http://localhost:3939/search/extract?q=python+async+patterns&count=3"

Returns search results with extracted content inline, with no second round-trip to /read.

curl "http://localhost:3939/search/deep?q=ethon+industrial+ai+platform&count=10"

Server-side query variation + parallel execution + result fusion. Surfaces results that flat /search misses.
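
The README does not say how `/search/deep` fuses results. As one plausible illustration, the sketch below uses reciprocal rank fusion (RRF), a common technique for merging ranked lists; the choice of RRF, the constant `k=60`, and the sample rankings are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked URL lists; items ranked high in many lists win."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# One ranked list per query variation (placeholder URLs).
variations = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(variations)
print(fused)
```

Here `d` appears in only one variation, which is the kind of result a single flat query would miss, while `b` rises to the top because it ranks well across all three.
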
curl "http://localhost:3939/read?url=https://example.com/paywalled-article"

{
  "url": "https://example.com/paywalled-article",
  "content": "Full article text extracted via Wayback Machine...",
  "strategy": "wayback",
  "chars": 4821,
  "cached": false,
  "strategies_tried": ["direct", "readability", "ua_rotation", "wayback"]
}

curl -X POST "http://localhost:3939/read/batch" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://a.com", "https://b.com", "https://c.com"]}'

pip install agentsearch-client

from agentsearch import AgentSearch

client = AgentSearch()  # defaults to localhost:3939
results = client.search("manufacturing OEE best practices", count=5)
for r in results.results:
    print(f"{r.title} — {r.url}")

# Content extraction
page = client.read("https://example.com/article")
print(page.content[:500])

# Batch read
pages = client.read_batch(["https://a.com", "https://b.com"])
print(f"{pages.successful}/{pages.total} succeeded")

For authenticated instances, pass token=... or use AGENT_SEARCH_TOKEN, AGENTSEARCH_TOKEN, credentials/agent-search-token.txt, or ~/.config/agent-search/token.
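
For clients that do not use the SDK, the token sources above can be checked with a small helper. This is a sketch only: the precedence order (explicit argument, then environment variables, then files) and the function names are assumptions, not documented SDK behavior.

```python
import os
from pathlib import Path


def resolve_token(explicit=None, env=None, cwd=None, home=None):
    """Return the first token found, or None. Precedence order is assumed."""
    if explicit:
        return explicit
    env = os.environ if env is None else env
    for name in ("AGENT_SEARCH_TOKEN", "AGENTSEARCH_TOKEN"):
        if env.get(name):
            return env[name]
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    candidates = [
        cwd / "credentials" / "agent-search-token.txt",
        home / ".config" / "agent-search" / "token",
    ]
    for path in candidates:
        if path.is_file():
            token = path.read_text(encoding="utf-8").strip()
            if token:
                return token
    return None


def auth_headers(token):
    """Build the bearer header, or no headers when auth is disabled."""
    return {"Authorization": f"Bearer {token}"} if token else {}


print(auth_headers(resolve_token(explicit="change-me")))
```

Returning an empty dict when no token is found lets the same request code work against both authenticated and open instances.
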
from langchain.tools import tool
import requests

@tool
def web_search(query: str) -> str:
    """Search the web using AgentSearch."""
    resp = requests.get(
        "http://localhost:3939/search",
        params={"q": query, "count": 5},
        timeout=30,  # avoid hanging the agent if the API is unreachable
    )
    resp.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return "\n".join(
        f"- {r['title']}: {r['url']}\n  {r['snippet']}"
        for r in resp.json()["results"]
    )

pip install mcp httpx
python mcp-server/server.py

For authenticated instances, set AGENT_SEARCH_TOKEN or run:

python mcp-server/server.py --token "change-me"

Add to Claude Desktop config:
{
  "mcpServers": {
    "agent-search": {
      "command": "python",
      "args": ["/path/to/mcp-server/server.py"]
    }
  }
}

See mcp-server/README.md for details.
The optional private stack adds a fully anonymized search path:
┌──────────┐    ┌──────────────┐    ┌───────────────┐    ┌──────────┐
│  :3940   │───▶│ api-private  │───▶│ searxng-priv  │───▶│   Tor    │──▶ Internet
│ (agent)  │    │  (FastAPI)   │    │   (SearXNG)   │    │(Snowflake│
└──────────┘    └──────────────┘    └───────────────┘    │ + obfs4) │
                                                         └──────────┘
All containers use CoreDNS → Cloudflare DoT
tor-internal network: no direct egress possible
What this gives you:
- Your ISP sees TLS to Cloudflare (DNS) and WebRTC-looking traffic (Snowflake). Not search queries.
- The private SearXNG instance lives on an internal-only Docker network with no internet route except through Tor.
- Port 3939 = direct (fast), port 3940 = anonymized (slower, private).
Setup:
./scripts/prepare-searxng.sh
docker compose -f docker-compose.yml -f examples/compose.private.yml up -d --build

All private stack configs live in examples/. Copy and customize as needed.
Port 3939 (direct)                  Port 3940 (Tor-anonymized)
     │                                     │
     ▼                                     ▼
┌─────────┐                         ┌─────────────┐
│   API   │                         │ api-private │
│(FastAPI)│                         │  (FastAPI)  │
├─────────┤                         ├─────────────┤
│ dedup   │  ┌─────────────────┐    │ same code   │  ┌───────────────┐
│ scoring │  │     SearXNG     │    │ Tor egress  │  │ SearXNG-priv  │
│ cache   │──│ Google, Bing,   │    │ only        │──│ (tor-internal │
│ scrub   │  │ DDG, Brave,     │    └─────────────┘  │   network)    │
│ killchn │  │ /engines list   │                     └───────┬───────┘
│ trust   │  └─────────────────┘                             │
│ evolver │                                            ┌─────┴─────┐
└─────────┘                                            │    Tor    │
     │                                                 │ Snowflake │
     ▼                                                 │ + obfs4   │
┌─────────┐                                            └───────────┘
│ CoreDNS │──▶ Cloudflare DoT (encrypted DNS)
└─────────┘

| Module | LOC | What it does |
|---|---|---|
| `killchain.py` | 1016 | 9-strategy escalating content extraction |
| `main.py` | 920 | FastAPI app, 16 endpoints, auth, rate limiting |
| `source_tracer.py` | 620 | Source provenance tracking and citation chains |
| `scrubber.py` | 539 | Prompt injection detection and content sanitization |
| `source_library.py` | 310 | Curated institutional source registry |
| `domain_trust.py` | 311 | Domain trust scoring (TLD, age, reputation) |
| `evolver.py` | 301 | Self-improvement engine: failure analysis → config tuning |
| `content_cache.py` | 241 | URL-keyed content cache with TTL |
| `query_expansion.py` | 201 | Server-side query variation and fusion |

Plus: 5 pluggable adapters (Cloudflare bypass, Medium, 403 handler, parse error recovery, empty content fallback), MCP server, Python SDK, test suite.
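
To show what "pluggable adapters from disk" can mean in practice, here is a hedged sketch of a directory-based loader built on `importlib`. The `extract(url, html)` hook name and signature are hypothetical; the actual adapter interface is defined by the modules in the adapters directory.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_adapters(adapters_dir):
    """Import every *.py file in adapters_dir and collect its extract() hook."""
    adapters = {}
    for path in sorted(Path(adapters_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # executes the adapter file
        hook = getattr(module, "extract", None)  # hypothetical hook name
        if callable(hook):
            adapters[path.stem] = hook
    return adapters


# Demo: write a throwaway adapter to a temp directory and load it.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "upper.py").write_text(
        "def extract(url, html):\n    return html.upper()\n", encoding="utf-8"
    )
    adapters = load_adapters(tmp)
    demo_output = adapters["upper"]("https://example.com", "hello")
    print(sorted(adapters), demo_output)
```

Loading a module executes its code, so a loader like this should only ever point at a directory you control.
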
A real autonomous research agent ("the wolf") uses every AgentSearch endpoint. Before AgentSearch was wired in correctly, the agent's hand-rolled SearXNG client silently 401'd on three of four engines. Every hunt on a low-profile entity returned 0 frameworks.
After: 17 frameworks per hunt, 7/7 gaps closed. Same agent, same model, same prompts. The difference was the search infrastructure underneath.
→ Full walkthrough: case-studies/wolf.md
Environment variables (set in docker-compose.yml or .env):

| Variable | Default | Description |
|---|---|---|
| `SEARXNG_URL` | `http://searxng:8080` | SearXNG instance URL |
| `SEARXNG_IMAGE` | pinned SearXNG digest | SearXNG container image; override only when intentionally upgrading |
| `PYTHON_BASE_IMAGE` | pinned Python digest | API Docker base image; override only when intentionally upgrading |
| `COREDNS_IMAGE` | pinned CoreDNS digest | Private-stack DNS image |
| `SOCAT_IMAGE` | pinned socat digest | Private-stack TCP forwarder image |
| `TOR_BASE_IMAGE` | pinned Debian digest | Private-stack Tor proxy base image |
| `CACHE_TTL` | `3600` | Cache duration in seconds |
| `RATE_LIMIT` | `60` | Max requests per minute |
| `SQLITE_TIMEOUT` | `1.0` | SQLite lock wait timeout in seconds for query stats |
| `AGENT_SEARCH_TOKEN` | (empty) | Bearer token for auth (optional) |
| `ADAPTERS_DIR` | `/app/adapters` | Path to pluggable adapter modules |
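
As an illustration of how a setting like `CACHE_TTL` typically drives behavior, the sketch below implements a small URL-keyed cache with expiry. It is an assumption-laden example, not the code in `content_cache.py`: the class name, the in-memory dict, and the injectable clock are all hypothetical.

```python
import os
import time


class TTLCache:
    """In-memory URL-keyed cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, url):
        entry = self._store.get(url)
        if entry is None:
            return None
        expires_at, content = entry
        if self.clock() >= expires_at:
            del self._store[url]  # drop stale entries lazily on read
            return None
        return content

    def set(self, url, content):
        self._store[url] = (self.clock() + self.ttl, content)


# Read the TTL from the environment, falling back to the documented default.
cache = TTLCache(ttl_seconds=int(os.environ.get("CACHE_TTL", "3600")))
cache.set("https://example.com/article", "extracted text")
print(cache.get("https://example.com/article"))
```

Using a monotonic clock avoids entries expiring early or late when the system wall clock is adjusted.
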
- Search engines, news engines, rate limits, and failure modes depend on the connected SearXNG instance. `/engines` is the live source of truth.
- Bearer auth is a simple local API gate, not a multi-user authorization system. Treat `AGENT_SEARCH_TOKEN` as a shared service token.
- Rate limiting is in memory. It resets on restart and is per API process.
- Query statistics use local SQLite with WAL and a bounded lock timeout. For high-volume multi-worker deployments, move query logging to an external database or telemetry backend.
- The MCP package intentionally bounds its `mcp` dependency to the tested 1.27.x line. Upgrade deliberately and run the package/CI checks before publishing.
- Content extraction validates the starting URL and every redirect hop before fetching redirected content, but fetched third-party pages are still untrusted and are scrubbed before being returned.
- Google Cache is unreliable because public cache availability changes frequently.
- The Tor/private stack is intentionally slower than direct search.
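
The note about in-memory rate limiting can be made concrete with a sliding-window sketch. This is an assumed design for illustration, not the limiter in `main.py`: keying by client ID and the window algorithm are both guesses, and the defaults only echo the documented `RATE_LIMIT` of 60 per minute.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-process, in-memory limiter: state is lost on restart."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client_id -> request timestamps

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60.0)
print([limiter.allow("agent-1") for _ in range(3)])
```

Because the timestamps live in a plain dict, each worker process keeps its own counts, which is why the effective limit scales with the number of API processes.
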
pip install -r requirements.txt
SEARXNG_URL=http://localhost:8080 uvicorn app.main:app --reload --port 3939
# Run tests
pytest tests/

- `Test` runs on every push and pull request.
- `CodeQL` runs on push, pull request, and a weekly schedule.
- Dependabot watches Python packages, Dockerfiles, and GitHub Actions.
- Version releases are created from semantic tags such as `v2.0.1` or by manually running the `Release` workflow with a tag input.
- Update `CHANGELOG.md` before creating a release tag.
- Fork → branch → commit → PR.
Issues and PRs welcome. If you're building an agent that needs search, this is for you.
The root AgentSearch API, SDK, Docker stack, and docs are MIT licensed. The MCP server under mcp-server/ is AGPL-3.0 licensed; see mcp-server/LICENSE.