Search overview

Get structured, LLM-optimized search results from web and news sources.


What is the You.com Search API?

The You.com Search API delivers high-quality, structured web and news results optimized for programmatic access in AI applications. Designed for developers building RAG systems, AI agents, knowledge bases, and data-driven applications, our Search API returns clean, structured data with rich metadata, relevant snippets, and full-page content.

How it works

The Search API processes your query and returns unified results from both web and news sources in a single request. Each result includes:

  1. Core information: URLs, titles, and descriptions
  2. LLM-ready snippets: Query-aware text excerpts, selected as the passages most likely to answer your query
  3. Rich metadata: Publication dates, authors, thumbnails, and favicons
  4. Full page content: Live-crawled HTML or Markdown on demand

Our intelligent classification system automatically determines when to include news results based on query intent, ensuring you get the most relevant information for your use case.

What you get

Every search returns structured JSON with two main result types:

Web results

  • Relevant web pages from across the internet
  • Multiple text snippets per result for context
  • Publication dates and author information
  • Thumbnail images and favicons for UI display

News results (when relevant)

  • Recent news articles from authoritative sources
  • Article summaries and headlines
  • Publication timestamps for freshness
  • Associated images and metadata
  • Full article content (HTML or Markdown) via live crawling

All results are returned in clean, structured JSON format requiring no HTML parsing or post-processing.


Key features

Live crawling — full page content per result

With snippets alone, you get ~100–200 words of extracted text per result. With livecrawl enabled, you get the full page content—often 2,000–10,000 words of HTML or Markdown. This is what unlocks deep RAG, comprehensive synthesis, and knowledge base construction from search results.

Add livecrawl to any search request and each matching result gains a contents object with the full page content in your chosen format. Crawl web, news or all results. Set livecrawl_formats to markdown (recommended for LLMs) or html.

from youdotcom import You
from youdotcom.models import LiveCrawl, LiveCrawlFormats

with You(api_key_auth="api_key") as you:
    # Livecrawl both web and news results
    res = you.search.unified(
        query="latest AI developments",
        count=5,
        livecrawl=LiveCrawl.ALL,
        livecrawl_formats=LiveCrawlFormats.MARKDOWN,
    )

    # Access live-crawled content from web results
    if res.results and res.results.web:
        for result in res.results.web:
            if result.contents:
                print(f"Web: {result.title}")
                print(f"Content: {result.contents.markdown[:200]}...\n")

    # Access live-crawled content from news results
    if res.results and res.results.news:
        for result in res.results.news:
            if result.contents:
                print(f"News: {result.title}")
                print(f"Content: {result.contents.markdown[:200]}...\n")

Need content from specific URLs you already have? Use the Contents API instead — it takes a list of URLs directly, without requiring a search query.

Unified web & news results

Get both web pages and news articles in a single API call. Our classification system automatically determines when to include news results based on query intent.

{
  "results": {
    "web": [ // Web search results
      {
        "url": "https://example.com/article",
        "title": "Article Title",
        "description": "Brief description of the content",
        "snippets": [
          "Relevant excerpt from the page",
          "Another relevant passage"
        ],
        "thumbnail_url": "https://example.com/image.jpg",
        "page_age": "2025-11-15T10:30:00",
        "authors": ["Author Name"],
        "favicon_url": "https://example.com/favicon.ico",
        "contents": { // Included when livecrawl is "web" or "all"
          "markdown": "# Article Title\n\nFull page content..."
        }
      }
    ],
    "news": [ // News articles (when relevant)
      {
        "title": "Breaking News Article",
        "description": "News article summary",
        "url": "https://news.com/article",
        "page_age": "2025-11-15T14:00:00",
        "thumbnail_url": "https://news.com/image.jpg",
        "contents": { // Included when livecrawl is "news" or "all"
          "markdown": "# Breaking News\n\nFull article content..."
        }
      }
    ]
  },
  "metadata": {
    "search_uuid": "942ccbdd-7705-4d9c-9d37-4ef386658e90",
    "query": "your search query",
    "latency": 0.342
  }
}
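
The response shape above can be walked with plain dictionary access. The following is a minimal sketch, assuming the JSON body has already been decoded into a Python dict; `flatten_results` is a hypothetical helper name used for illustration and is not part of the SDK.

```python
from typing import Any


def flatten_results(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Merge web and news hits into one list, tagging each with its section."""
    results = response.get("results") or {}
    merged = []
    for section in ("web", "news"):
        # Either section may be missing or empty, so default to an empty list.
        for hit in results.get(section) or []:
            merged.append(
                {
                    "section": section,
                    "title": hit.get("title"),
                    "url": hit.get("url"),
                    # News hits in the sample carry no snippets; fall back to [].
                    "snippets": hit.get("snippets") or [],
                    # Present only when livecrawl covered this section.
                    "markdown": (hit.get("contents") or {}).get("markdown"),
                }
            )
    return merged


sample = {
    "results": {
        "web": [
            {
                "url": "https://example.com/article",
                "title": "Article Title",
                "snippets": ["Relevant excerpt from the page"],
            }
        ],
        "news": [
            {
                "url": "https://news.com/article",
                "title": "Breaking News Article",
                "contents": {"markdown": "# Breaking News"},
            }
        ],
    }
}

for row in flatten_results(sample):
    print(row["section"], row["title"], row["url"])
```

Treating both sections as optional keeps downstream code from failing when the classifier decides not to include news.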

LLM-optimized output

Every result includes:

  • Snippets: Pre-extracted relevant text excerpts
  • Descriptions: Clean summaries without HTML clutter
  • Metadata: Publication dates, authors, thumbnails, favicons
  • Structured JSON: No parsing required, ready for AI consumption

Advanced search operators

Build powerful and precise search queries using search operators:

  • site:domain.com - Search within specific domains
  • filetype:pdf - Filter by file type
  • +term / -term - Include/exclude specific terms
  • Boolean operators: AND, OR, NOT

Learn more about search operators
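
Because operators are just text inside the query string, they can be composed programmatically. The sketch below is one possible way to do that; `build_query` and its argument names are hypothetical, and it only covers the `site:`, `filetype:`, `+term`, and `-term` forms listed above.

```python
def build_query(terms, site=None, filetype=None, require=(), exclude=()):
    """Compose a query string from free text plus the operators listed above."""
    parts = [terms.strip()]
    # +term marks a term to include, -term marks a term to exclude.
    parts.extend(f"+{term}" for term in require)
    parts.extend(f"-{term}" for term in exclude)
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    # Drop empty pieces so a blank free-text part leaves no stray space.
    return " ".join(part for part in parts if part)


print(build_query("climate change", site=".edu", filetype="pdf"))
print(build_query("python", require=["asyncio"], exclude=["django"]))
```

This helper assumes single-word terms for `require` and `exclude`; how multi-word phrases should be written is not covered here.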

Global coverage

Target results by geographic region using the country parameter (ISO 3166-1 alpha-2 country codes) and filter by language using the language parameter (BCP 47 language codes).

Freshness controls

Filter results by recency:

  • day - Last 24 hours
  • week - Last 7 days
  • month - Last 30 days
  • year - Last 365 days
  • YYYY-MM-DDtoYYYY-MM-DD - Custom date range
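
The custom range is two ISO dates joined by the literal word `to`. A small sketch of building that value from `date` objects follows; `freshness_range` is a hypothetical helper, and the start-before-end check is a defensive assumption rather than documented API behavior.

```python
from datetime import date


def freshness_range(start: date, end: date) -> str:
    """Format a custom freshness value as YYYY-MM-DDtoYYYY-MM-DD."""
    if start > end:
        # Assumption: a reversed range is a caller mistake, so fail early.
        raise ValueError("start date must not be after end date")
    return f"{start.isoformat()}to{end.isoformat()}"


print(freshness_range(date(2025, 1, 1), date(2025, 3, 31)))
```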

All optional parameters you can control

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | Your search query (supports search operators) |
| count | integer | Max results per section (default varies, max 100) |
| freshness | string | day, week, month, year, or date range |
| country | string | Country code (e.g., US, GB, FR) |
| language | string | BCP 47 language code (e.g., EN, JP, DE) |
| offset | integer | For pagination (0-9) |
| safesearch | string | off, moderate (default), strict |
| livecrawl | string | web, news, or all |
| livecrawl_formats | string (GET) / array (POST) | html or markdown. GET: repeat the parameter (?livecrawl_formats=html&livecrawl_formats=markdown). POST: JSON array |
| crawl_timeout | integer | Maximum livecrawl timeout in seconds (1-60, default 10). Applies only when livecrawl is enabled |
| include_domains | string (GET) / array (POST) | Restrict results to these domains. GET: comma-separated string. POST: JSON array. Supports up to 500 domains |
| exclude_domains | string (GET) / array (POST) | Exclude results from these domains. GET: comma-separated string. POST: JSON array. Supports up to 500 domains |
| boost_domains | string (GET) / array (POST) | Boost results from these domains without excluding other domains. GET: comma-separated string. POST: JSON array. Supports up to 500 domains. Cannot be used with include_domains |
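
The limits in this table can be checked on the client before a request is sent. The sketch below is one way to do so; `validate_search_params` is a hypothetical helper, it only encodes the constraints stated in the table, and the API itself remains the authority on what is accepted.

```python
from typing import Any

DOMAIN_FIELDS = ("include_domains", "exclude_domains", "boost_domains")


def validate_search_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []

    count = params.get("count")
    if count is not None and count > 100:
        problems.append("count must not exceed 100")

    offset = params.get("offset")
    if offset is not None and not 0 <= offset <= 9:
        problems.append("offset must be between 0 and 9")

    timeout = params.get("crawl_timeout")
    if timeout is not None:
        if not 1 <= timeout <= 60:
            problems.append("crawl_timeout must be between 1 and 60 seconds")
        if not params.get("livecrawl"):
            problems.append("crawl_timeout has no effect unless livecrawl is set")

    if params.get("safesearch") not in (None, "off", "moderate", "strict"):
        problems.append("safesearch must be off, moderate, or strict")

    if params.get("livecrawl") not in (None, "web", "news", "all"):
        problems.append("livecrawl must be web, news, or all")

    for field in DOMAIN_FIELDS:
        if len(params.get(field) or []) > 500:
            problems.append(f"{field} supports at most 500 domains")

    if params.get("include_domains") and params.get("boost_domains"):
        problems.append("boost_domains cannot be combined with include_domains")

    return problems


print(validate_search_params({"query": "ai", "count": 150, "offset": 12}))
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue at once.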

View full API reference


Common use cases

RAG (Retrieval-Augmented Generation)

Use search snippets to ground your LLM's answers in retrieved sources, which helps reduce hallucination. The structured snippets can be inserted directly into your prompt.

Private RAG template

See a working RAG app that lets users ask AI questions about their own private documents.

from youdotcom import You

# Initialize the client (same keyword argument as the other examples on this page)
you = You(api_key_auth="YOUR_KEY")

# Search and collect snippets as context
def get_context(query):
    response = you.search.unified(query=query, count=5)
    snippets = []
    if response.results and response.results.web:
        for result in response.results.web:
            if result.snippets:
                snippets.extend(result.snippets)
    return "\n".join(snippets)

# Example question; in a real app this comes from the user
user_question = "What is quantum computing?"
context = get_context("quantum computing")

# Feed to your LLM
prompt = f"Based on this information:\n{context}\n\nAnswer: {user_question}"
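
When answers need to cite their sources, the snippets can be grouped under their URLs and capped to a size budget. This is a rough sketch under stated assumptions: it works on web results as plain dicts with `url` and `snippets` keys (as in the JSON response), `build_context` and `max_chars` are hypothetical names, and the character budget is a simple stand-in for real token counting.

```python
def build_context(web_results, max_chars=4000):
    """Join snippets into numbered, URL-labelled blocks within a size budget."""
    blocks = []
    used = 0
    for result in web_results:
        snippets = result.get("snippets") or []
        if not snippets:
            continue  # Nothing to quote from this result.
        # Number only the results that are actually included.
        label = f"[{len(blocks) + 1}] {result.get('url', '')}"
        block = label + "\n" + "\n".join(snippets)
        if used + len(block) > max_chars:
            break  # Stop before exceeding the budget.
        blocks.append(block)
        used += len(block)
    return "\n\n".join(blocks)


web = [
    {"url": "https://a.example", "snippets": ["A1", "A2"]},
    {"url": "https://b.example", "snippets": []},
    {"url": "https://c.example", "snippets": ["C1"]},
]
print(build_context(web))
```

Numbering the blocks lets the prompt ask the model to reference sources as [1], [2], and so on.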

AI agent knowledge retrieval

Give your AI agents access to real-time web information. Perfect for building agents that need up-to-date facts, news, or specialized domain knowledge.

Simple Search template

See a working app that runs a web search and returns the top results.

News monitoring & alerts

Track breaking news, competitor mentions, or industry trends. The automatic news classification ensures you get timely articles when relevant.

Content research & analysis

Gather comprehensive information from multiple sources for content creation, competitive intelligence, or market research.

Knowledge base construction

Use live crawling to build comprehensive knowledge bases with full-page content in clean Markdown format.


Examples of advanced search capabilities

Domain filtering

Restrict results to, exclude results from, or boost specific domains. Use include_domains for a strict allowlist, exclude_domains to filter out unwanted domains, and boost_domains to prefer matching domains without filtering out other results. For large domain lists, POST is strongly recommended.

from youdotcom import You

with You(api_key_auth="api_key") as you:
    # Only return results from trusted news sources
    res = you.search.unified(
        query="federal reserve interest rate decision",
        include_domains=["reuters.com", "apnews.com", "ft.com", "bloomberg.com"],
    )

    if res.results and res.results.web:
        for result in res.results.web:
            print(f"{result.title} - {result.url}")

Use boost_domains when you want to prefer sources without making them mandatory. Matching results from boosted domains receive a relative ranking boost, but the boost is not quantified. If boosted domains do not have matching results, results from other domains can still appear. boost_domains can be used with exclude_domains, but not with include_domains.

Search operators

Combine operators for powerful, precise searches:

from youdotcom import You
from youdotcom.models import Freshness

with You(api_key_auth="api_key") as you:
    # Find PDFs about climate change from .edu sites published this year
    res = you.search.unified(
        query="climate change site:.edu filetype:pdf",
        freshness=Freshness.YEAR,
    )

    # Print PDF results with their URLs
    if res.results and res.results.web:
        for result in res.results.web:
            print(f"{result.title}")
            print(f" PDF URL: {result.url}")

Pagination

Use offset to retrieve additional pages of results. The offset value (0-9) skips that many pages, so offset=1 with count=10 returns results 11-20.

from youdotcom import You

with You(api_key_auth="api_key") as you:
    # Get the second page of results
    res = you.search.unified(
        query="machine learning",
        count=10,
        offset=1,
    )

    # Guard against a missing results object, as the other examples do
    if res.results and res.results.web:
        print(res.results.web)
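
The page arithmetic described above (offset counts pages, not individual results) can be written out as follows. `result_window` is a hypothetical helper, and the check that count is at least 1 is an assumption.

```python
def result_window(offset: int, count: int) -> tuple[int, int]:
    """Return the 1-based (first, last) result positions for a page."""
    if not 0 <= offset <= 9:
        raise ValueError("offset must be between 0 and 9")
    if count < 1:
        raise ValueError("count must be at least 1")
    first = offset * count + 1
    last = (offset + 1) * count
    return first, last


# offset=1 with count=10 covers results 11 through 20.
print(result_window(1, 10))
```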

Geographic targeting

Narrow results to a specific country:

from youdotcom import You
from youdotcom.models import Country

# Get Swiss results
with You(api_key_auth="api_key") as you:
    res = you.search.unified(
        query="best restaurants in geneva",
        country=Country.CH,
    )

    # Print restaurant results with descriptions
    if res.results and res.results.web:
        for result in res.results.web:
            print(f"{result.title}")
            if result.description:
                print(f" {result.description}\n")

Refer to the ISO 3166-1 alpha-2 standard for a list of country codes.


Best practices

1. Use snippets for RAG

The snippets array is pre-processed for LLM consumption. Use it directly instead of crawling full pages when possible.

2. Implement caching

Cache frequent queries to reduce API calls and improve response times. Consider a 5-15 minute TTL for most use cases.
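
A minimal in-memory sketch of such a cache is shown below. The `TTLCache` class and its method names are hypothetical, the 10-minute default is just one value inside the suggested 5-15 minute window, and a production setup would more likely use a shared store.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 600.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # Injectable so expiry can be tested without sleeping.
        self._store: dict[Hashable, tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Still fresh: skip the API call.
        value = fetch()  # Missing or expired: call through and store.
        self._store[key] = (now, value)
        return value


cache = TTLCache(ttl_seconds=600)
# The key should include every parameter that changes the response.
key = ("machine learning", 10)
result = cache.get_or_fetch(key, lambda: {"results": {"web": []}})
print(result)
```

In real use, the `fetch` callable would wrap the actual search call for that query and parameter set.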

3. Handle empty results

Always check if results.web or results.news arrays are empty before processing:

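web_results = results.get("results", {}).get("web") or []
if web_results:
    for item in web_results:
        print(item["title"])  # Process results
else:
    print("No web results returned")  # Handle the no-results case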

4. Use appropriate count values

  • For RAG: count=5-10 is usually sufficient
  • For UI display: count=20-50 for pagination
  • For data gathering: count=100 (max) for comprehensive coverage

5. Use search operators and query parameters

Use search operators and specify query parameters in the request to reduce noise and get more relevant results.


GET vs. POST

The Search API supports both GET /v1/search and POST /v1/search. Both methods go through the same underlying logic and return identical responses — the difference is how parameters are encoded on the wire.

Use GET when:

  • Your request only uses simple scalar parameters (query, count, freshness, etc.)
  • HTTP cacheability matters — GET responses can be cached at CDN and proxy layers using the URL and Vary: X-API-Key as a cache key. POST responses are not cached by default per the HTTP spec.
  • You want quick testing with curl or a browser.

Use POST when:

  • You’re using include_domains, exclude_domains, or boost_domains — these accept JSON arrays in POST. With GET, domains must fit in a single comma-separated query string value and are subject to URL length limits.
  • You want forward compatibility: future array parameters work naturally as JSON, with no ambiguity between comma-separated values and repeated params.

Wire format differences for complex fields:

| Field | GET | POST (JSON body) |
| --- | --- | --- |
| include_domains | ?include_domains=a.com,b.com (comma-separated, single value) | "include_domains": ["a.com", "b.com"] |
| exclude_domains | ?exclude_domains=a.com,b.com (comma-separated, single value) | "exclude_domains": ["a.com", "b.com"] |
| boost_domains | ?boost_domains=a.com,b.com (comma-separated, single value) | "boost_domains": ["a.com", "b.com"] |
| livecrawl_formats | ?livecrawl_formats=html&livecrawl_formats=markdown (repeated params) | "livecrawl_formats": ["html", "markdown"] |

GET domain filters do not support repeated parameters. ?include_domains=a.com&include_domains=b.com will not work — only a single comma-separated value is supported. For large domain lists, use POST.
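
The two encodings can be produced from the same parameter dict using only the standard library. This is a sketch under stated assumptions: `to_get_query` and `to_post_body` are hypothetical helpers, and the host and authentication header are deliberately left out because they are not specified here.

```python
import json
from urllib.parse import urlencode

DOMAIN_FIELDS = ("include_domains", "exclude_domains", "boost_domains")


def to_get_query(params: dict) -> str:
    """Encode params as a GET query string using the wire rules above."""
    pairs = []
    for key, value in params.items():
        if key in DOMAIN_FIELDS:
            # Domain filters: one comma-separated value, never repeated keys.
            pairs.append((key, ",".join(value)))
        elif key == "livecrawl_formats":
            # Formats: repeat the parameter once per format.
            pairs.extend((key, fmt) for fmt in value)
        else:
            pairs.append((key, value))
    # safe="," keeps the commas readable instead of percent-encoding them.
    return urlencode(pairs, safe=",")


def to_post_body(params: dict) -> str:
    """Encode the same params as a JSON body; lists stay JSON arrays."""
    return json.dumps(params)


params = {
    "query": "fed rates",
    "include_domains": ["a.com", "b.com"],
    "livecrawl_formats": ["html", "markdown"],
}
print("/v1/search?" + to_get_query(params))
print(to_post_body(params))
```

Keeping one parameter dict and choosing the encoding at send time makes it easy to switch to POST once a domain list grows.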


Pricing

$5.00 per 1,000 calls (up to 100 results per call)

All new accounts receive $100 in free credits to get started. Pricing is simple — you only pay for what you use.

What’s included:

  • Web and news results in a single unified request
  • Up to 100 results per call
  • News results at no extra cost
  • LLM-ready snippets with rich metadata
  • Targeting filters for country, language, recency, domain, and more

Livecrawl add-on — $1.00 per 1,000 pages

Full page content via the livecrawl parameter (HTML, Markdown, or both) is billed separately from the base Search API rate.

Example: A single call with count=10, livecrawl=all, and livecrawl_formats=["html", "markdown"] returns 10 web results and 10 news results, so 20 pages are crawled in total.

| Line item | Calculation | Cost |
| --- | --- | --- |
| Search API call | 1 call × $5.00 / 1,000 | $0.005 |
| Livecrawl | 20 pages × $1.00 / 1,000 | $0.020 |
| Total | | $0.025 |
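
The same arithmetic can be scripted for budgeting. The sketch below hard-codes the two rates quoted on this page, so they should be treated as assumptions that may change; `estimate_cost` is a hypothetical helper. `Decimal` is used to avoid floating-point rounding in currency math.

```python
from decimal import Decimal

# Rates as quoted on this page (USD); treat them as assumptions.
SEARCH_RATE = Decimal("5.00") / 1000      # per search call
LIVECRAWL_RATE = Decimal("1.00") / 1000   # per livecrawled page


def estimate_cost(calls: int, crawled_pages: int = 0) -> Decimal:
    """Estimate spend in USD for a number of calls and livecrawled pages."""
    return calls * SEARCH_RATE + crawled_pages * LIVECRAWL_RATE


# One call that livecrawls 20 pages, as in the worked example.
print(estimate_cost(1, 20))
```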

For volume discounts, annual pricing, or enterprise features, visit you.com/pricing or contact [email protected].


Next steps

Run in Postman

Use our Postman collections to learn and experiment on your own

API reference

Explore the complete API documentation with all parameters and response schemas

Search operators

Master advanced search operators to refine your queries

Guides & tutorials

Learn from practical examples and integration guides

Quickstart guide

Get your API key and make your first search in 5 minutes