feat(context-dev): add Context.dev web + brand data integration #5048
@greptile
@cursor review
PR Summary
Low Risk
Overview
The Implements 22 Docs and catalog updates: integration MDX,
Reviewed by Cursor Bugbot for commit 3ff8e33.
Greptile Summary
Adds a full Context.dev integration block with 22 tools spanning web scraping, structured extraction, brand intelligence, industry classification, and utility prefetch endpoints. The implementation follows the existing Firecrawl pattern:
Confidence Score: 5/5
New integration-only addition with no changes to existing tool or block logic; all 22 tools are isolated behind the context_dev namespace and cannot affect other integrations.
The integration is a well-contained additive change. Auth headers, error handling, credit extraction, and response transforms all follow the established Firecrawl pattern. The block's params dispatch is complete and verified (all 22 operations have matching switch cases, the API key uses user-only visibility, and the previously flagged includeFrames and screenshot MIME-type gaps have been resolved). The one naming ambiguity in classify_sic (input
apps/sim/tools/context_dev/classify_sic.ts: minor naming ambiguity between the
Important Files Changed
Sequence Diagram
sequenceDiagram
participant User as User / LLM
participant Block as ContextDevBlock
participant Tool as Tool (one of 22)
participant API as Context.dev API
User->>Block: invoke with operation + params + apiKey
Block->>Block: params dispatch (switch on operation)
Note over Block: setBool / setString / setNumber coerce inputs
Block->>Tool: resolved params object
alt GET endpoint (scrape_markdown, get_brand, screenshot, map, …)
Tool->>API: GET /v1/… ?params… (Bearer apiKey)
else "POST endpoint (crawl, extract, search, prefetch_*, …)"
Tool->>API: "POST /v1/… { body } (Bearer apiKey)"
end
API-->>Tool: "JSON { ...data, key_metadata }"
Tool->>Tool: parseContextDevResponse → transformResponse
Note over Tool: extractCreditMetadata(key_metadata)
Tool-->>Block: "{ success, output: { ...data, creditsConsumed, creditsRemaining } }"
Block-->>User: output fields
Reviews (3): Last reviewed commit: "fix(context-dev): wire includeFrames, sp..."
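The response path in the sequence diagram above (split `key_metadata` off the payload, then expose `creditsConsumed` / `creditsRemaining` alongside the data) might be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the `key_metadata` field names `credits_consumed` and `credits_remaining` are hypothetical, and the function names carry a `Sketch` suffix to mark them as stand-ins for the real helpers.

```typescript
// Hypothetical shape of `key_metadata`; the real field names are not shown in
// this PR and are assumed here for illustration only.
interface KeyMetadataSketch {
  credits_consumed?: unknown
  credits_remaining?: unknown
}

interface CreditFields {
  creditsConsumed: number | null
  creditsRemaining: number | null
}

// Return a finite number or null, so a missing or malformed field never throws.
function toFiniteNumber(value: unknown): number | null {
  return typeof value === 'number' && Number.isFinite(value) ? value : null
}

// Stand-in for the credit extraction step named in the diagram.
function extractCreditsSketch(meta: KeyMetadataSketch | null | undefined): CreditFields {
  return {
    creditsConsumed: toFiniteNumber(meta?.credits_consumed),
    creditsRemaining: toFiniteNumber(meta?.credits_remaining),
  }
}

// Split `key_metadata` off the payload and merge the credit fields into the output.
function transformResponseSketch(body: Record<string, unknown>) {
  const { key_metadata, ...data } = body
  const credits = extractCreditsSketch(key_metadata as KeyMetadataSketch | null | undefined)
  return { success: true, output: { ...data, ...credits } }
}
```

Returning `null` for absent credit fields keeps the output shape stable even when a response carries no usable metadata.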
Greptile Summary
This PR adds a Context.dev integration with 10 operations spanning web scraping (markdown, HTML, screenshot, crawl, sitemap map), web search, structured extraction, NAICS/SIC classification, and brand data retrieval, modeled closely on the existing Firecrawl pattern (BYOK Bearer auth,
Confidence Score: 4/5
Safe to merge: the integration is self-contained, follows established patterns, and only minor capability gaps were found.
The implementation is consistent across all 10 tools, auth credentials are correctly scoped to
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
User([User / LLM]) --> Block["ContextDevBlock\n(operation dropdown)"]
Block --> Dispatch["params dispatcher\n(switch on operation)"]
Dispatch --> SM["context_dev_scrape_markdown\nGET /web/scrape/markdown"]
Dispatch --> SH["context_dev_scrape_html\nGET /web/scrape/html"]
Dispatch --> SS["context_dev_screenshot\nGET /web/screenshot"]
Dispatch --> CW["context_dev_crawl\nPOST /web/crawl"]
Dispatch --> MP["context_dev_map\nGET /web/scrape/sitemap"]
Dispatch --> SR["context_dev_search\nPOST /web/search"]
Dispatch --> EX["context_dev_extract\nPOST /web/extract"]
Dispatch --> CN["context_dev_classify_naics\nGET /web/naics"]
Dispatch --> CS["context_dev_classify_sic\nGET /web/sic"]
Dispatch --> GB["context_dev_get_brand\nGET /brand/retrieve"]
SS --> FTP["FileToolProcessor\n(download → UserFile)"]
FTP --> File[(Stored File)]
SM & SH & CW & MP & SR & EX & CN & CS & GB --> Credits["creditsConsumed / creditsRemaining\n(key_metadata)"]
Reviews (2): Last reviewed commit: "feat(context-dev): add Context.dev web +..."
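The dispatch step in the flowchart above (a switch on `operation` that selects a tool and forwards coerced params) could look roughly like the sketch below. The tool ids are taken from the flowchart and the helper names `setString` / `setNumber` / `setBool` from the sequence diagram, but their signatures, the input field `crawlMaxPages`, and the overall shape are assumptions made for illustration.

```typescript
type Params = Record<string, string | number | boolean>

// Hypothetical block inputs; values arrive loosely typed from the UI.
interface BlockInputs {
  operation?: unknown
  url?: unknown
  includeFrames?: unknown
  crawlMaxPages?: unknown // hypothetical field name
}

// Each helper copies a value into `out` only when it is meaningfully set.
function setString(out: Params, key: string, value: unknown): void {
  if (typeof value === 'string' && value.trim() !== '') out[key] = value.trim()
}

function setNumber(out: Params, key: string, value: unknown): void {
  const n =
    typeof value === 'number'
      ? value
      : typeof value === 'string' && value.trim() !== ''
        ? Number(value)
        : NaN
  if (Number.isFinite(n)) out[key] = n
}

function setBool(out: Params, key: string, value: unknown): void {
  if (typeof value === 'boolean') out[key] = value
  else if (value === 'true' || value === 'false') out[key] = value === 'true'
}

// Map an operation to its tool id and the params that tool should receive.
function dispatch(inputs: BlockInputs): { tool: string; params: Params } {
  const params: Params = {}
  setString(params, 'url', inputs.url)
  switch (inputs.operation) {
    case 'scrape_markdown':
      setBool(params, 'includeFrames', inputs.includeFrames)
      return { tool: 'context_dev_scrape_markdown', params }
    case 'crawl':
      setNumber(params, 'maxPages', inputs.crawlMaxPages)
      return { tool: 'context_dev_crawl', params }
    case 'screenshot':
      return { tool: 'context_dev_screenshot', params }
    default:
      throw new Error(`Unknown Context.dev operation: ${String(inputs.operation)}`)
  }
}
```

Only three of the operations are shown; the remaining cases would follow the same pattern.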
…mages, prefetch
Expands coverage to all relevant Context.dev endpoints (22 tools): brand by name/email/ticker, simplified brand, transaction identifier, single + catalog product extraction, fonts, styleguide, image discovery, and prefetch utilities. Shared brand output schema and transform helper; verified against the live API.
…erive screenshot MIME
Addresses review feedback:
- includeFrames is now a block subblock + param for scrape_markdown/scrape_html
- crawl and extract use separate Max Pages fields (crawl 1-500, extract 1-50) so a crawl value can no longer be forwarded to extract beyond its limit
- screenshot file MIME type and extension are derived from the returned URL instead of being hardcoded to PNG
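To make the Max Pages split above concrete, here is a minimal sketch of reading the limit from an operation-specific field so that a crawl value never reaches extract. The limits (crawl 1-500, extract 1-50) come from the commit message; the field names `crawlMaxPages` / `extractMaxPages` and the choice to clamp out-of-range values (rather than reject them) are assumptions for illustration.

```typescript
// Limits taken from the commit message: crawl 1-500, extract 1-50.
const MAX_PAGES_LIMITS = {
  crawl: { min: 1, max: 500 },
  extract: { min: 1, max: 50 },
} as const

type PagedOperation = keyof typeof MAX_PAGES_LIMITS

// Hypothetical field names for the two separate Max Pages inputs.
interface PagedInputs {
  crawlMaxPages?: unknown
  extractMaxPages?: unknown
}

// Coerce a raw UI value and clamp it into the operation's allowed range.
function resolveMaxPages(operation: PagedOperation, raw: unknown): number | undefined {
  const n =
    typeof raw === 'number'
      ? raw
      : typeof raw === 'string' && raw.trim() !== ''
        ? Number(raw)
        : NaN
  if (!Number.isFinite(n)) return undefined
  const { min, max } = MAX_PAGES_LIMITS[operation]
  return Math.min(max, Math.max(min, Math.trunc(n)))
}

// Each operation reads only its own field, so values cannot cross over.
function maxPagesFor(operation: PagedOperation, inputs: PagedInputs): number | undefined {
  const raw = operation === 'crawl' ? inputs.crawlMaxPages : inputs.extractMaxPages
  return resolveMaxPages(operation, raw)
}
```

With this shape, a stale crawl value of 400 left in the form is simply ignored when the operation is extract.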
@greptile
@cursor review
✅ Bugbot reviewed your changes and found no new issues!
Reviewed by Cursor Bugbot for commit 3ff8e33.
Summary
Adds a Context.dev integration: a single API-key (Bearer) service for web scraping and brand/firmographic data. Modeled on the existing Firecrawl integration (AuthMode.ApiKey, BYOK via the block's API-key field). Covers all relevant Context.dev endpoints across every API family.
Web scraping
scrape_markdown, scrape_html, scrape_images, screenshot (stored as a downloadable file), crawl, map, search
Web extraction
extract (JSON-schema structured data), extract_product, extract_products, scrape_fonts, scrape_styleguide, classify_naics, classify_sic
Brand intelligence
get_brand (by domain), get_brand_by_name, get_brand_by_email, get_brand_by_ticker, get_brand_simplified, identify_transaction
Utility
prefetch_domain, prefetch_by_email
22 tools total.
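As a rough illustration of the Bearer API-key pattern described in the summary, the sketch below builds (but does not send) a GET request for one of the scraping tools. The base URL is a placeholder on a reserved `.example` domain, and joining it with the `/web/scrape/markdown` route, as well as the query parameter names, are assumptions rather than the integration's actual values.

```typescript
// Placeholder base URL (assumption); the PR text only shows "/v1/…".
const BASE_URL = 'https://api.context.example/v1'

interface RequestSpec {
  url: string
  method: 'GET'
  headers: Record<string, string>
}

type QueryValue = string | number | boolean | undefined

// Build a GET request description with the user's own key as a Bearer token.
function buildGetRequest(
  path: string,
  query: Record<string, QueryValue>,
  apiKey: string
): RequestSpec {
  const url = new URL(`${BASE_URL}${path}`)
  for (const [key, value] of Object.entries(query)) {
    // Skip unset params so they are not sent as the literal string "undefined".
    if (value !== undefined) url.searchParams.set(key, String(value))
  }
  return {
    url: url.toString(),
    method: 'GET',
    headers: { Authorization: `Bearer ${apiKey}` },
  }
}
```

Keeping the request as plain data makes the auth header and query encoding easy to inspect before anything touches the network.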
File handling
The screenshot endpoint returns a hosted image URL. The tool surfaces it as a file-typed output (ToolFileData with url), so the executor's FileToolProcessor downloads and stores it as a UserFile, the same path other file-producing tools use. The file's MIME type/extension are derived from the returned URL. The raw screenshotUrl is also exposed.
Details
- … https://docs.context.dev). Every response's key_metadata credit accounting is surfaced as creditsConsumed / creditsRemaining.
- … context_dev (white background, brand logo icon), registered in blocks/registry.ts; tools registered in tools/registry.ts.
- … ContextDevBlockMeta with 9 use-case templates.
Validation
- bun run type-check: clean for the integration (0 errors in context_dev)
- bun run lint: clean
- bun run check:api-validation: passes (no new boundary routes)
Review
Addressed all inline findings from the first review pass (commit 3ff8e334e2): wired includeFrames into the block, split crawl/extract maxPages so the differing page limits can't be crossed, and derived the screenshot MIME type from the URL.
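The screenshot fix mentioned above (deriving MIME type and extension from the returned URL instead of hardcoding PNG) might be approached along these lines. This is a sketch, not the PR's code: the function name, the extension table, and the PNG fallback for unrecognized or missing extensions are all assumptions.

```typescript
// Illustrative extension table (assumption); the real list is not shown in the PR.
const MIME_BY_EXTENSION: Record<string, string> = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  webp: 'image/webp',
  gif: 'image/gif',
}

interface ImageType {
  extension: string
  mimeType: string
}

// Derive the image type from the URL path, ignoring query string and fragment.
function imageTypeFromUrl(rawUrl: string): ImageType {
  // Assumed fallback when nothing recognizable is found.
  const fallback: ImageType = { extension: 'png', mimeType: 'image/png' }
  let pathname: string
  try {
    pathname = new URL(rawUrl).pathname
  } catch {
    return fallback // not a parseable absolute URL
  }
  const match = /\.([a-z0-9]+)$/i.exec(pathname)
  const extension = match?.[1]?.toLowerCase() ?? ''
  const mimeType = MIME_BY_EXTENSION[extension]
  return mimeType ? { extension, mimeType } : fallback
}
```

Parsing with `URL` first means a signed URL such as `.../shot.JPG?sig=1` still resolves to `image/jpeg`, since only the path is inspected.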