A practical walkthrough for collecting web data with Spider, from your first crawl to production pipelines.
Spider Developer Platform
Collect web data at scale. Spider handles crawling, rendering, proxy rotation, and anti-bot evasion; you get clean data back through a single API.
How It Works
Every request goes through three stages: fetch (retrieve the page using HTTP or headless Chrome), process (render JavaScript, rotate proxies, handle anti-bot challenges), and deliver (convert to your chosen format and return). Spider's Rust-based engine runs all stages concurrently, so a 500-page crawl takes seconds, not hours.
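To make that flow concrete, here is a minimal sketch of a small crawl from a Python client. The base URL (api.spider.cloud), the `url` and `limit` parameter names, and the list-shaped response are assumptions; check the Crawl API reference for the exact schema.

```python
import requests

API_KEY = "YOUR-API-KEY"

# A 5-page crawl: Spider fetches the start URL, follows links, processes each
# page (rendering, proxy rotation, anti-bot handling), and delivers the
# results in a single response. "url" and "limit" are assumed parameter names.
resp = requests.post(
    "https://api.spider.cloud/crawl",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com", "limit": 5},
    timeout=120,
)
resp.raise_for_status()

for page in resp.json():  # assumes a list with one entry per crawled page
    print(page.get("url"))
```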
API Endpoints
All endpoints accept JSON and return JSON. Authenticate with a Bearer token.
| Method | Path | Description |
|---|---|---|
| POST | /crawl | Start from a URL and follow links to discover and fetch multiple pages. |
| POST | /scrape | Fetch a single page and return its content in any format. |
| POST | /search | Search the web and optionally scrape the results. |
| POST | /screenshot | Capture a full-page screenshot as base64 PNG. |
| POST | /fetch/{domain}/{path} | AI-configured per-website scraper with cached configs. (Alpha) |
| GET | /data/scraper-directory | Browse optimized scraper configs for popular websites. |
| HTTP | proxy.spider.cloud | Route requests through intelligent residential, ISP, or mobile proxies. |
| WS | /v1/browser | Connect a Playwright or Puppeteer client to a cloud browser via CDP. |
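As a second example, this sketch calls /screenshot and writes the decoded PNG to disk. The request and response field names ("url", "content") and the list-shaped response are assumptions; consult the Screenshot API reference for the exact schema.

```python
import base64
import requests

API_KEY = "YOUR-API-KEY"

# Capture a full-page screenshot and save the decoded base64 PNG locally.
resp = requests.post(
    "https://api.spider.cloud/screenshot",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com"},
    timeout=120,
)
resp.raise_for_status()

item = resp.json()[0]                      # assumes a list of per-page results
png_bytes = base64.b64decode(item["content"])  # "content" is an assumed field name
with open("example.png", "wb") as f:
    f.write(png_bytes)
```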
Request Modes
Choose how Spider fetches each page. smart (default) automatically picks between HTTP and Chrome based on the page. Use http for static HTML; it is the fastest and cheapest. Use chrome when you need JavaScript rendering, SPA support, or real browser fingerprints for bot-protected sites. See Concepts for details.
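A minimal sketch of forcing Chrome rendering on a /scrape call. The "request" parameter name and its values are assumptions based on the mode names above; see Concepts and the API reference for the exact field.

```python
import requests

API_KEY = "YOUR-API-KEY"

# Force Chrome rendering for a JavaScript-heavy page.
resp = requests.post(
    "https://api.spider.cloud/scrape",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "url": "https://example.com/spa-dashboard",
        "request": "chrome",   # assumed values: "smart" (default), "http", "chrome"
    },
    timeout=120,
)
print(resp.json())
```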
Proxy Mode
Route any Spider request through proxy.spider.cloud for intelligent proxy management. Spider automatically selects the best proxy pool, rotates IPs, and handles geo-routing based on the target site — no manual configuration required. Choose from residential (real-user IPs across 100+ countries), ISP (stable datacenter IPs, highest throughput), or mobile (real 4G/5G device IPs for maximum stealth). Use the country_code parameter for geolocation targeting and proxy to select a pool. Proxy Mode works with Crawl, Scrape, Screenshot, Search, and Links. See the Proxy-Mode API reference for pricing and configuration.
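For tools that only speak HTTP proxies, a hedged sketch of routing a plain request through proxy.spider.cloud follows. The credential format (API key as the proxy username) and the port are assumptions; see the Proxy-Mode API reference for the exact connection string and for passing `proxy` and `country_code` preferences.

```python
import requests

API_KEY = "YOUR-API-KEY"

# Assumed connection string: API key as username, port 8888. Verify both
# against the Proxy-Mode API reference before use.
proxy_url = f"http://{API_KEY}@proxy.spider.cloud:8888"

resp = requests.get(
    "https://example.com",
    proxies={"http": proxy_url, "https": proxy_url},
    timeout=60,
)
print(resp.status_code)
```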
Browser Cloud
Spider also provides full cloud browsers accessible over a CDP WebSocket at wss://browser.spider.cloud/v1/browser?token=YOUR-API-KEY. Connect any Playwright or Puppeteer client with connectOverCDP() for full page control, AI extraction, and automation. Sessions include built-in stealth, proxy rotation, and optional session recording. 100 concurrent browsers on all plans. See the Browser API reference for examples and configuration. Use the spider-browser npm package for a ready-made client.
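In Python, the Playwright equivalent of connectOverCDP() is connect_over_cdp(). A minimal sketch, using the WebSocket endpoint above (the post-connect handling of contexts and pages is an assumption about the session's initial state):

```python
from playwright.sync_api import sync_playwright

API_KEY = "YOUR-API-KEY"
CDP_URL = f"wss://browser.spider.cloud/v1/browser?token={API_KEY}"

with sync_playwright() as p:
    # Attach to the remote cloud browser instead of launching a local one.
    browser = p.chromium.connect_over_cdp(CDP_URL)
    context = browser.contexts[0] if browser.contexts else browser.new_context()
    page = context.new_page()
    page.goto("https://example.com")
    print(page.title())
    browser.close()
```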
Credits
Usage is measured in credits at $1 / 10,000 credits. Each page costs a base amount, with additional credits for Chrome rendering, proxy usage, and AI extraction. Every response includes a costs object with a per-request breakdown. Monitor your balance on the usage page.
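To track spend per request, read the costs object from each response. The exact field names inside costs are assumptions ("total_cost" here is illustrative); the credits-to-dollars conversion follows the $1 per 10,000 credits rate above.

```python
import requests

API_KEY = "YOUR-API-KEY"

resp = requests.post(
    "https://api.spider.cloud/scrape",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"url": "https://example.com"},
    timeout=120,
)

for item in resp.json():
    # Per-request cost breakdown; "total_cost" is an assumed field name.
    costs = item.get("costs", {})
    credits = costs.get("total_cost", 0)
    print(f"{item.get('url')}: {credits} credits (~${credits / 10_000:.4f})")
```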
Explore our guides
- An overview of Spider's API capabilities, endpoints, request modes, output formats, and how to get started.
- Extract contact information from any website using Spider's AI-powered pipeline. Emails, phone numbers, and more.
- Archive web pages with Spider. Capture full page resources, automate regular crawls, and store content for long-term access.
- Crawl multiple URLs with Spider's LangChain loader, then summarize the results with Groq and Llama 3.
- Build a crewAI research pipeline that uses Spider to scrape financial data and write stock analysis reports.
- Extract company info from inbound emails, scrape their website with Spider, and generate personalized replies with RAG.
- Set up an Autogen agent that scrapes and crawls websites using the Spider API.
- Route requests through Spider's proxy front-end for easy integration with third-party tools.
- Three methods for crawling pages behind login walls: cookies, execution scripts, and AI-driven actions.
- Scaling web scraping for RAG pipelines. Error-first design, retry strategies, and handling failures at volume.
- Choosing your scraper, cleaning HTML for RAG, deduplicating content, and testing on a single site before scaling up.
- Add full-text static search to any website using Spider and Pagefind.
- Build a research agent that searches the web with Spider, evaluates results, and forms answers with OpenAI.
- Set up Spider Bot on your Discord server to fetch and analyze web data using slash commands.
- Practical strategies for scaling headless Chrome, from container orchestration to Rust-based CDP handlers and ALB configuration.
- Search the web and optionally scrape results in a single API call. Built for LLM pipelines, agents, and data collection.