Fetch API
Per-website scraper endpoints that auto-configure themselves. Point the API at any domain and AI discovers optimal CSS selectors, extraction schema, and request settings. Configs are cached so subsequent requests are fast and consistent.
Config Resolution
Each request resolves through a three-layer cache hierarchy, so requests get faster as a config is reused; the sketch after the list below shows the resolution order.
- Memory Cache (< 1 ms): The user-specific config is checked first, then the shared/public config.
- Database Lookup (~50 ms): Queries the config database for a matching domain + path, including community-discovered configs.
- AI Discovery (~3-5 s, first time only): AI crawls the page, analyzes its structure, and generates an optimal scraper config. This happens only once per domain/path.
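In code, the resolution order looks roughly like the sketch below. All names here (memory_cache, db_lookup, ai_discover, resolve_config) are illustrative stand-ins for internal machinery, not part of the public API, and the user-level cache scoping is omitted for brevity.

from typing import Optional

# Layer 1: in-process cache (< 1 ms), keyed by (domain, path).
memory_cache: dict = {}

def db_lookup(domain: str, path: str) -> Optional[dict]:
    # Layer 2 (~50 ms): stand-in for the config database,
    # which includes community-discovered configs.
    return None

def ai_discover(domain: str, path: str) -> dict:
    # Layer 3 (~3-5 s): stand-in for AI discovery -- crawl the page,
    # analyze its structure, generate a scraper config.
    return {"domain": domain, "path": path, "selectors": {"title": "h1"}}

def resolve_config(domain: str, path: str) -> dict:
    key = (domain, path)
    if key in memory_cache:                 # 1. memory hit: sub-millisecond
        return memory_cache[key]
    config = db_lookup(domain, path)        # 2. database lookup
    if config is None:
        config = ai_discover(domain, path)  # 3. first request only
        # a real implementation would persist the new config here
    memory_cache[key] = config              # warm the cache for next time
    return config

print(resolve_config("news.ycombinator.com", "/"))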
Fetch vs. Scrape
Use Fetch When
- You want structured data without writing CSS selectors
- You're scraping a site repeatedly and want cached configs
- You need AI to figure out the best extraction approach
- You want community-validated scraper configs
Use Scrape When
- You already know the exact CSS selectors you need
- You want full control over extraction settings
- You need raw markdown, HTML, or text output
- You're doing a one-off extraction
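To make the trade-off concrete, here is a rough side-by-side sketch. The Fetch call matches the examples later on this page; the Scrape call's endpoint and body are assumptions about the Scrape API's shape, so check its own reference for the real parameters.

import os
import requests

headers = {
    "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
    "Content-Type": "application/json",
}

# Fetch: the AI discovers selectors, schema, and request settings for you.
fetch_resp = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers=headers,
    json={"return_format": "json"},
)

# Scrape: you specify the target and settings yourself (the endpoint and
# body shown here are illustrative assumptions, not confirmed names).
scrape_resp = requests.post(
    "https://api.spider.cloud/scrape",
    headers=headers,
    json={"url": "https://news.ycombinator.com/", "return_format": "markdown"},
)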
What Gets Auto-Configured
CSS Selectors
AI discovers the right selectors for titles, prices, descriptions, images, and other structured fields on each page type.
Request Mode
Determines whether a page needs JavaScript rendering (chrome) or stealth mode, or can be fetched with plain HTTP.
Scroll & Wait
Detects lazy-loaded content that requires scrolling or waiting for specific elements to appear before extraction.
Extraction Schema
Generates a JSON schema describing the structured data that can be extracted from the page.
Content Filtering
Sets root selectors for main content and exclude selectors to skip ads, navigation, and footers.
Confidence Scoring
Each config gets a confidence score. Configs are validated over time and improved as more users access the same endpoint.
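Put together, a discovered config might look something like the sketch below. Every field name is illustrative, inferred from the capabilities listed above rather than taken from the service's actual storage format.

# Illustrative shape of an AI-discovered config -- field names are
# assumptions based on the documented capabilities, not the real format.
example_config = {
    "domain": "news.ycombinator.com",
    "path": "/",
    "request_mode": "http",                  # or "chrome" / "stealth"
    "scroll": False,                         # lazy-loaded content handling
    "wait_for_selector": None,
    "selectors": {                           # CSS selectors per field
        "title": ".titleline > a",
        "score": ".score",
    },
    "schema": {                              # extraction schema
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "score": {"type": "string"},
        },
    },
    "root_selector": "#hnmain",              # content filtering: main content
    "exclude_selectors": ["nav", "footer"],  # skip ads, navigation, footers
    "confidence": 0.92,                      # validated and improved over time
}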
Endpoint Reference
/fetch/{domain}/{path}

URL Parameters

- domain: Target website domain (e.g., news.ycombinator.com)
- path: Page path to scrape (e.g., /newest or /)

Body Parameters (all optional)
AI handles extraction automatically. These only control output format and crawl behavior.
- return_format: json (default), markdown, html, or text
- limit: Number of pages to crawl (default 1, max 100)
- readability: Strip navigation, ads, and sidebars; returns main content only

Code Examples
Python

import os
import requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
data = response.json()
print(data)

cURL

curl -X POST https://api.spider.cloud/fetch/news.ycombinator.com/ \
-H "Authorization: Bearer $SPIDER_API_KEY" \
-H "Content-Type: application/json" \
-d '{"return_format": "json"}' const SPIDER_API_KEY = process.env.SPIDER_API_KEY;
const response = await fetch(
"https://api.spider.cloud/fetch/news.ycombinator.com/",
{
method: "POST",
headers: {
"Authorization": "Bearer " + SPIDER_API_KEY,
"Content-Type": "application/json",
},
body: JSON.stringify({ return_format: "json" }),
}
);
const data = await response.json();
console.log(data); Response Fields
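The body parameters compose; for instance, a multi-page markdown crawl with readability filtering might look like the Python sketch below. The boolean value for readability is an assumption about the parameter's type; the other values come straight from the reference above.

import os
import requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/newest",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "return_format": "markdown",  # json (default) | markdown | html | text
        "limit": 5,                   # crawl up to 5 pages (default 1, max 100)
        "readability": True,          # strip navigation, ads, sidebars
    },
)
print(response.json())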
Response Fields

- url: The final URL after any redirects
- content: Extracted data in your chosen format
- status: HTTP status code of the response
- metadata: Page title, description, keywords, og:image
- css_extracted: Structured data from AI-discovered selectors (JSON format)
- links: All links found on the page
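Continuing from the Python example above, the documented fields can be read like this. Whether multi-page crawls (limit > 1) wrap these objects in a list isn't specified here, so the single-object shape is an assumption.

data = response.json()

# Assumes a single-object body; a crawl with limit > 1 may return
# a list of such objects instead.
print(data["url"])            # final URL after redirects
print(data["status"])         # HTTP status code
print(data["metadata"])       # title, description, keywords, og:image
print(data["css_extracted"])  # structured data from discovered selectors
print(len(data["links"]))     # links found on the page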
Fetch Directory

Browse pre-configured scraper endpoints discovered by the community.
Every time someone uses the Fetch API on a new domain, the AI-discovered config is validated and made available in the public directory. Browse available scrapers, see what fields they extract, and use them instantly.