Skip to content

brcrusoe72/agent-search

Repository files navigation

AgentSearch

Self-hosted search API for AI agents. 16 endpoints. 9-strategy content extraction. Optional Tor-anonymized stack. No third-party search API keys, no per-query fees, no vendor lock-in. Optional local bearer auth is supported.

License: MIT PyPI

git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/prepare-searxng.sh
docker compose up -d
curl "http://localhost:3939/search?q=distributed+consensus+algorithms"

You now have a deduplicated, multi-engine search API running on :3939.

If you enable auth, pass the token on all non-health endpoints:

export AGENT_SEARCH_TOKEN="change-me"
curl -H "Authorization: Bearer $AGENT_SEARCH_TOKEN" \
  "http://localhost:3939/search?q=distributed+consensus+algorithms"

Prefer not to use Docker for the API server?

git clone https://github.com/brcrusoe72/agent-search.git
cd agent-search
./scripts/install-native.sh
./scripts/run-native.sh

Native mode requires Python 3.11+ and a reachable SearXNG instance with JSON output enabled. It stores AgentSearch state in ./data. See Native Install.

Verify

Run the self-contained test suite:

python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt pytest requests
pip install -e sdk -e mcp-server
./scripts/prepare-searxng.sh
pytest tests -q
python -m compileall app adapters mcp-server/agent_search_mcp scripts sdk -q
docker compose -f docker-compose.yml config --quiet
docker compose -f docker-compose.yml -f examples/compose.private.yml config --quiet
docker build -t agent-search-api:test .

Those tests mock SearXNG, so they do not require Docker or a running local service.

Run the optional live localhost check:

AGENTSEARCH_INTEGRATION=1 pytest tests -q

If your local instance requires auth:

AGENT_SEARCH_TOKEN="change-me" AGENTSEARCH_INTEGRATION=1 pytest tests -q

Run the optional Docker smoke tests against a running direct/private stack:

AGENT_SEARCH_TOKEN="change-me" \
AGENTSEARCH_DOCKER_INTEGRATION=1 \
pytest tests/test_live_docker.py -q

What it does

AgentSearch wraps SearXNG with a FastAPI layer that adds everything LLM agents actually need: deduplication, cross-engine scoring, content extraction, query expansion, domain trust scoring, prompt injection scrubbing, and self-improvement.

Standard stackdocker compose up gives you search on :3939.

Private stackdocker compose -f docker-compose.yml -f examples/compose.private.yml up adds an anonymized instance on :3940 that routes all traffic through Tor with Snowflake obfuscation. Encrypted DNS via CoreDNS → Cloudflare DoT. Network-level isolation — the private stack physically cannot egress without Tor.

Search engines

AgentSearch delegates engine support to the connected SearXNG instance. The authoritative list for a running stack is:

curl "http://localhost:3939/engines"

The bundled searxng/settings.example.yml explicitly enables 23 engines: Google, Startpage, Brave, Bing, DuckDuckGo, Google Scholar, Semantic Scholar, arXiv, Crossref, OpenAlex, PubMed, Google News, Bing News, Yahoo News, Wikinews, Wikipedia, Wikidata, Hugging Face, Reddit, Hacker News, Stack Overflow, GitHub, and Lobsters.

Run ./scripts/prepare-searxng.sh to create ignored local runtime files at searxng/settings.yml and searxng/settings.tor.yml with generated SearXNG instance secrets. Do not commit those generated files.

Because SearXNG is configured with use_default_settings: true, your live instance may expose additional enabled engines from the installed SearXNG catalog. Use the engines= query parameter to request specific engines, and use /engines to verify what is available in that deployment.

Why not just use SearXNG directly?

SearXNG finds pages. AgentSearch finds pages, reads them, scores them, deduplicates them, caches them, scrubs prompt injections out of them, detects paywalls, falls back through 9 extraction strategies when the first one fails, and gets better at it over time. One API call.

AgentSearch Tavily Exa SerpAPI Raw SearXNG
Cost Free $0.005/query $0.001/query $50/mo Free
Self-hosted
Content extraction 9-strategy kill chain Basic Basic
Deduplication Cross-engine
Prompt injection scrubbing
Self-improving ✅ (evolver)
Tor anonymization Optional Manual

Endpoints

Search

Endpoint Method What it does
/search GET Multi-engine web search with deduplication and scoring
/search/deep GET Server-side query expansion — runs variations in parallel, fuses results
/search/extract GET Search + inline content extraction in one call
/search/jobs GET Job search across LinkedIn, Indeed, Glassdoor, ZipRecruiter
/search/policy GET Policy and regulatory document search
/search/sources GET Source discovery with institutional filtering
/search/sources/institutions GET List source registry institutions
/search/stats GET Query statistics and cache metrics
/news GET Structured multi-source news (Google News, Bing News, Yahoo News)

Content extraction

Endpoint Method What it does
/read GET 9-strategy kill chain extraction for any URL
/read/batch POST Concurrent multi-URL extraction in one request

The kill chain escalates through strategies until one succeeds:

  1. Direct fetch + smart content selectors
  2. Readability scoring (paragraph density vs link density)
  3. User-agent rotation (Chrome/Safari/Firefox/Edge signatures)
  4. Wayback Machine (CDX API → latest snapshot)
  5. Google Cache
  6. Search-about fallback (find coverage elsewhere)
  7. Custom adapters (pluggable Python modules from disk)
  8. PDF extraction (pdfplumber)
  9. YouTube transcript (yt-dlp)

Every request gets SSRF protection, prompt injection detection, paywall detection, and content length caps automatically.

Self-improvement

Endpoint Method What it does
/adapt/report POST Report a fetch failure for a URL
/adapt/stats GET View adaptation metrics and failure patterns
/adapt/evolve POST Trigger self-improvement cycle — analyzes failures, tunes config

Infrastructure

Endpoint Method What it does
/health GET Health check (API + SearXNG status)
/engines GET List available search engines and their status

Quick examples

Search with content extraction

curl "http://localhost:3939/search/extract?q=python+async+patterns&count=3"

Returns search results with extracted content inline — no second round-trip to /read.

Deep search (query expansion)

curl "http://localhost:3939/search/deep?q=ethon+industrial+ai+platform&count=10"

Server-side query variation + parallel execution + result fusion. Surfaces results that flat /search misses.

Read a URL (kill chain)

curl "http://localhost:3939/read?url=https://example.com/paywalled-article"
{
  "url": "https://example.com/paywalled-article",
  "content": "Full article text extracted via Wayback Machine...",
  "strategy": "wayback",
  "chars": 4821,
  "cached": false,
  "strategies_tried": ["direct", "readability", "ua_rotation", "wayback"]
}

Batch read

curl -X POST "http://localhost:3939/read/batch" \
  -H "Content-Type: application/json" \
  -d '{"urls": ["https://a.com", "https://b.com", "https://c.com"]}'

Python SDK

pip install agentsearch-client
from agentsearch import AgentSearch

client = AgentSearch()  # defaults to localhost:3939
results = client.search("manufacturing OEE best practices", count=5)
for r in results.results:
    print(f"{r.title}{r.url}")

# Content extraction
page = client.read("https://example.com/article")
print(page.content[:500])

# Batch read
pages = client.read_batch(["https://a.com", "https://b.com"])
print(f"{pages.successful}/{pages.total} succeeded")

For authenticated instances, pass token=... or use AGENT_SEARCH_TOKEN, AGENTSEARCH_TOKEN, credentials/agent-search-token.txt, or ~/.config/agent-search/token.

LangChain tool

from langchain.tools import tool
import requests

@tool
def web_search(query: str) -> str:
    """Search the web using AgentSearch."""
    resp = requests.get("http://localhost:3939/search", params={"q": query, "count": 5})
    return "\n".join(
        f"- {r['title']}: {r['url']}\n  {r['snippet']}"
        for r in resp.json()["results"]
    )

MCP server (Claude Desktop, Cursor, Windsurf)

pip install mcp httpx
python mcp-server/server.py

For authenticated instances, set AGENT_SEARCH_TOKEN or run:

python mcp-server/server.py --token "change-me"

Add to Claude Desktop config:

{
  "mcpServers": {
    "agent-search": {
      "command": "python",
      "args": ["/path/to/mcp-server/server.py"]
    }
  }
}

See mcp-server/README.md for details.

Private stack (Tor + encrypted DNS)

The optional private stack adds a fully anonymized search path:

┌──────────┐    ┌──────────────┐    ┌───────────────┐    ┌──────────┐
│  :3940   │───▶│ api-private  │───▶│ searxng-priv  │───▶│   Tor    │──▶ Internet
│ (agent)  │    │ (FastAPI)    │    │ (SearXNG)     │    │(Snowflake│
└──────────┘    └──────────────┘    └───────────────┘    │ + obfs4) │
                                                          └──────────┘
                 All containers use CoreDNS → Cloudflare DoT
                 tor-internal network: no direct egress possible

What this gives you:

  • Your ISP sees TLS to Cloudflare (DNS) and WebRTC-looking traffic (Snowflake). Not search queries.
  • The private SearXNG instance lives on an internal-only Docker network with no internet route except through Tor.
  • Port 3939 = direct (fast), port 3940 = anonymized (slower, private).

Setup:

./scripts/prepare-searxng.sh
docker compose -f docker-compose.yml -f examples/compose.private.yml up -d --build

All private stack configs live in examples/ — copy and customize as needed.

Architecture

Port 3939 (direct)                    Port 3940 (Tor-anonymized)
     │                                      │
     ▼                                      ▼
┌─────────┐                          ┌─────────────┐
│   API   │                          │ api-private  │
│(FastAPI) │                          │  (FastAPI)   │
├─────────┤                          ├─────────────┤
│ dedup   │  ┌─────────────────┐     │ same code   │  ┌───────────────┐
│ scoring │  │    SearXNG      │     │ Tor egress  │  │ SearXNG-priv  │
│ cache   │──│ Google, Bing,   │     │ only        │──│ (tor-internal │
│ scrub   │  │ DDG, Brave,     │     └─────────────┘  │  network)     │
│ killchn │  │ /engines list   │                       └───────┬───────┘
│ trust   │  └─────────────────┘                               │
│ evolver │                                              ┌─────┴─────┐
└─────────┘                                              │    Tor    │
     │                                                   │ Snowflake │
     ▼                                                   │  + obfs4  │
┌─────────┐                                              └───────────┘
│ CoreDNS │──▶ Cloudflare DoT (encrypted DNS)
└─────────┘

Key modules (6,700 LOC)

Module LOC What it does
killchain.py 1016 9-strategy escalating content extraction
main.py 920 FastAPI app, 16 endpoints, auth, rate limiting
source_tracer.py 620 Source provenance tracking and citation chains
scrubber.py 539 Prompt injection detection and content sanitization
source_library.py 310 Curated institutional source registry
domain_trust.py 311 Domain trust scoring (TLD, age, reputation)
evolver.py 301 Self-improvement engine — failure analysis → config tuning
content_cache.py 241 URL-keyed content cache with TTL
query_expansion.py 201 Server-side query variation and fusion

Plus: 5 pluggable adapters (Cloudflare bypass, Medium, 403 handler, parse error recovery, empty content fallback), MCP server, Python SDK, test suite.

Case study: 0 → 17 frameworks per hunt

A real autonomous research agent ("the wolf") uses every AgentSearch endpoint. Before AgentSearch was wired in correctly, the agent's hand-rolled SearXNG client silently 401'd on three of four engines. Every hunt on a low-profile entity returned 0 frameworks.

After: 17 frameworks per hunt, 7/7 gaps closed. Same agent, same model, same prompts. The difference was the search infrastructure underneath.

→ Full walkthrough: case-studies/wolf.md

Configuration

Environment variables (set in docker-compose.yml or .env):

Variable Default Description
SEARXNG_URL http://searxng:8080 SearXNG instance URL
SEARXNG_IMAGE pinned SearXNG digest SearXNG container image; override only when intentionally upgrading
PYTHON_BASE_IMAGE pinned Python digest API Docker base image; override only when intentionally upgrading
COREDNS_IMAGE pinned CoreDNS digest Private-stack DNS image
SOCAT_IMAGE pinned socat digest Private-stack TCP forwarder image
TOR_BASE_IMAGE pinned Debian digest Private-stack Tor proxy base image
CACHE_TTL 3600 Cache duration in seconds
RATE_LIMIT 60 Max requests per minute
SQLITE_TIMEOUT 1.0 SQLite lock wait timeout in seconds for query stats
AGENT_SEARCH_TOKEN (empty) Bearer token for auth (optional)
ADAPTERS_DIR /app/adapters Path to pluggable adapter modules

Limitations and security notes

  • Search engines, news engines, rate limits, and failure modes depend on the connected SearXNG instance. /engines is the live source of truth.
  • Bearer auth is a simple local API gate, not a multi-user authorization system. Treat AGENT_SEARCH_TOKEN as a shared service token.
  • Rate limiting is in memory. It resets on restart and is per API process.
  • Query statistics use local SQLite with WAL and a bounded lock timeout. For high-volume multi-worker deployments, move query logging to an external database or telemetry backend.
  • The MCP package intentionally bounds its mcp dependency to the tested 1.27.x line. Upgrade deliberately and run the package/CI checks before publishing.
  • Content extraction validates the starting URL and every redirect hop before fetching redirected content, but fetched third-party pages are still untrusted and are scrubbed before being returned.
  • Google Cache is unreliable because public cache availability changes frequently.
  • The Tor/private stack is intentionally slower than direct search.

Development

pip install -r requirements.txt
SEARXNG_URL=http://localhost:8080 uvicorn app.main:app --reload --port 3939

# Run tests
pytest tests/

Release and GitHub governance

  • Test runs on every push and pull request.
  • CodeQL runs on push, pull request, and a weekly schedule.
  • Dependabot watches Python packages, Dockerfiles, and GitHub Actions.
  • Version releases are created from semantic tags such as v2.0.1 or by manually running the Release workflow with a tag input.
  • Update CHANGELOG.md before creating a release tag.

Contributing

  1. Fork → branch → commit → PR.

Issues and PRs welcome. If you're building an agent that needs search, this is for you.

License

The root AgentSearch API, SDK, Docker stack, and docs are MIT licensed. The MCP server under mcp-server/ is AGPL-3.0 licensed; see mcp-server/LICENSE.

About

Self-hosted search API + MCP server for AI agents. Bundles SearXNG. Zero API keys, one-command deploy. Open-source alternative to Tavily, Exa, and Serper.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors