ScrapeGraph MCP Server

A production‑ready Model Context Protocol (MCP) server that connects LLMs to the ScrapeGraph AI API for AI‑powered web scraping, research, and crawling. MIT licensed; requires Python 3.13+.

⭐ Star us on GitHub

If this server is helpful, a star goes a long way. Thanks!

Key Features

  • 8 tools covering markdown conversion, AI extraction, search, crawling, sitemap discovery, and agentic flows
  • Remote HTTP MCP endpoint and local Python server support
  • Works with Cursor, Claude Desktop, and any MCP‑compatible client
  • Robust error handling, timeouts, and production‑tested reliability

Get Your API Key

Create an account and copy your API key from the ScrapeGraph Dashboard.

Recommended: Use the Remote MCP Server

Endpoint:
https://mcp.scrapegraphai.com/mcp
Point any MCP‑compatible client at this endpoint and authenticate by sending your API key in the X-API-Key header.
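
For example, a configuration entry along these lines should work in clients that support remote MCP servers with custom headers (the exact field names vary by client, so check your client's docs):

{
  "mcpServers": {
    "scrapegraph": {
      "url": "https://mcp.scrapegraphai.com/mcp",
      "headers": {
        "X-API-Key": "your-api-key-here"
      }
    }
  }
}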

Local Usage (Python)

Prefer running locally? Install and wire the server via stdio.

Install

# from a local clone of the repository
pip install -e .
# or, with uv
uv pip install -e .
Set your key:
# macOS/Linux
export SGAI_API_KEY=your-api-key-here
# Windows (PowerShell)
$env:SGAI_API_KEY="your-api-key-here"

Run the server

scrapegraph-mcp
# or
python -m scrapegraph_mcp.server
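
To wire the local server into Claude Desktop, an entry like the following in claude_desktop_config.json should work (the server name "scrapegraph" is arbitrary):

{
  "mcpServers": {
    "scrapegraph": {
      "command": "scrapegraph-mcp",
      "env": {
        "SGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}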

Optional: Install via Smithery

npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude

Available Tools

The server exposes 8 enterprise‑ready tools:

1. markdownify

Convert a webpage to clean markdown.
markdownify(website_url: str)
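
Tools are normally invoked by your MCP client, but you can also call them directly for testing. A minimal sketch using the official MCP Python SDK (the mcp package), assuming the local server from the section above is installed:

import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the local server over stdio; pass PATH so the executable
# resolves, plus the API key it expects.
params = StdioServerParameters(
    command="scrapegraph-mcp",
    env={"PATH": os.environ["PATH"], "SGAI_API_KEY": os.environ["SGAI_API_KEY"]},
)

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "markdownify", {"website_url": "https://example.com"}
            )
            print(result.content)

asyncio.run(main())

The same call_tool pattern applies to every tool below; only the tool name and arguments change.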

2. smartscraper

AI‑powered extraction with optional infinite scrolling and heavy‑JS rendering.
smartscraper(
  user_prompt: str,
  website_url: str,
  number_of_scrolls: int | None = None,
  render_heavy_js: bool | None = None
)

3. searchscraper

Search the web and extract structured results.
searchscraper(
  user_prompt: str,
  num_results: int | None = None,
  number_of_scrolls: int | None = None
)

4. scrape

Fetch raw HTML with optional heavy JS rendering.
scrape(website_url: str, render_heavy_js: bool | None = None)

5. sitemap

Discover a site’s URLs and structure.
sitemap(website_url: str)

6. smartcrawler_initiate

Start an async multi‑page crawl (AI or markdown mode).
smartcrawler_initiate(
  url: str,
  prompt: str | None = None,
  extraction_mode: str = "ai",
  depth: int | None = None,
  max_pages: int | None = None,
  same_domain_only: bool | None = None
)

7. smartcrawler_fetch_results

Poll results using the returned request_id.
smartcrawler_fetch_results(request_id: str)
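
Because the crawl is asynchronous, clients poll until results are ready. A sketch of the loop, reusing a ClientSession like the one in the markdownify example above; it assumes the tools return JSON text and that an unfinished crawl reports a "processing" status, which you should verify against actual responses:

import asyncio
import json

async def crawl_and_wait(session, url: str, prompt: str) -> dict:
    # Start the crawl and keep the request_id it returns.
    started = await session.call_tool(
        "smartcrawler_initiate",
        {"url": url, "prompt": prompt, "extraction_mode": "ai"},
    )
    request_id = json.loads(started.content[0].text)["request_id"]

    # Poll until the crawl stops reporting "processing" (field names
    # here are assumptions; adjust to the real payload).
    while True:
        result = await session.call_tool(
            "smartcrawler_fetch_results", {"request_id": request_id}
        )
        payload = json.loads(result.content[0].text)
        if payload.get("status") != "processing":
            return payload
        await asyncio.sleep(5)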

8. agentic_scrapper

Agentic, multi‑step workflows with optional schema and session persistence.
agentic_scrapper(
  url: str,
  user_prompt: str | None = None,
  output_schema: dict | None = None,
  steps: list | None = None,
  ai_extraction: bool | None = None,
  persistent_session: bool | None = None,
  timeout_seconds: float | None = None
)
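
As an illustration, an agentic run can combine explicit steps with an output schema. The steps and schema below are hypothetical and only indicate the expected shapes; session is a ClientSession as in the earlier sketch:

async def run_agentic(session) -> None:
    result = await session.call_tool(
        "agentic_scrapper",
        {
            "url": "https://example.com/login",
            "user_prompt": "Log in, open the dashboard, and extract the plan name",
            # Hypothetical natural-language steps for the agent.
            "steps": [
                "Fill the login form with the provided credentials",
                "Open the dashboard page",
            ],
            # Hypothetical schema; keys depend on what you want back.
            "output_schema": {"plan_name": "string"},
            "ai_extraction": True,
            "persistent_session": True,
            "timeout_seconds": 120.0,
        },
    )
    print(result.content)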

Troubleshooting

  • Verify your key is present in config (X-API-Key for remote, SGAI_API_KEY for local).
  • Claude Desktop logs:
    • macOS: ~/Library/Logs/Claude/
    • Windows: %APPDATA%\Claude\Logs\
  • If a long crawl is “still running”, keep polling smartcrawler_fetch_results.

License

MIT License – see LICENSE file for details.