Select an archive from the list below, enter a URL pattern, and hit Search to query the index. For details on the query API, see the PyWB CDX Server API Reference; when following its examples, replace the API endpoint coll/cdx with one of the endpoints listed below (the list is also available as JSON).
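As a rough illustration of the query API, here is a minimal Python sketch that builds a query URL and parses the response. It assumes the `url`, `output=json`, and `limit` parameters described in the PyWB CDX Server API Reference, and that JSON output arrives as one object per line. `ENDPOINT` is a hypothetical placeholder, and the helper names are invented for this example; substitute a real endpoint from the list below.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical placeholder: substitute an endpoint from the list on this page.
ENDPOINT = "https://index.example.org/COLLECTION-index"


def build_cdx_query(endpoint, url_pattern, limit=None):
    """Return a query URL asking the CDX server for JSON output."""
    params = {"url": url_pattern, "output": "json"}
    if limit is not None:
        params["limit"] = str(limit)
    return endpoint + "?" + urlencode(params)


def parse_cdx_lines(body):
    """Parse a response body holding one JSON object per line."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


def fetch_cdx(endpoint, url_pattern, limit=None):
    """Run the query over HTTP and return the parsed records."""
    with urlopen(build_cdx_query(endpoint, url_pattern, limit)) as resp:
        return parse_cdx_lines(resp.read().decode("utf-8"))
```

Calling `fetch_cdx(ENDPOINT, "example.com/*", limit=5)` with a real endpoint should return a short list of dictionaries, one per captured URL. The exact fields depend on the server, so inspect a record before relying on any key.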
Command-line tools
Tools for working with the CDX server and downloading from Common Crawl can be found on our Examples page.
About the data
Common Crawl data is stored on Amazon Web Services' Public Data Sets. All data and index files are free to download. Feel free to run your own index server, or analyze the index offline.
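To show how an index record can lead to a download, the sketch below turns one record into an HTTP range request for the matching slice of an archive file. It assumes each record carries `filename`, `offset`, and `length` fields, and that files are served under the base URL shown; both are assumptions for this example, as is the helper name, so check them against an actual index response.

```python
# Assumed download host; adjust if the data is served from elsewhere.
DATA_BASE_URL = "https://data.commoncrawl.org/"


def warc_range_request(record, base_url=DATA_BASE_URL):
    """Return (url, headers) for fetching one record's bytes.

    `record` is a dict parsed from an index response; its "offset" and
    "length" values may be strings, so they are converted to integers.
    """
    offset = int(record["offset"])
    length = int(record["length"])
    url = base_url + record["filename"]
    # HTTP byte ranges are inclusive, hence the "- 1" on the end position.
    headers = {"Range": f"bytes={offset}-{offset + length - 1}"}
    return url, headers
```

The returned URL and headers can be passed to any HTTP client. If the file is a gzip-compressed WARC, the bytes that come back would still need to be decompressed before reading.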
Read more about the URL index in the original announcement. For help, visit the Common Crawl user forum or Discord server. See also Getting Started.