Wikipedia-API is an easy-to-use Python wrapper for Wikipedia's API. It supports extracting texts, sections, links, categories, translations, etc. from Wikipedia. The documentation provides code snippets for the most common use cases.
This package requires at least Python 3.9 because it uses IntEnum.
pip3 install wikipedia-api

The goal of Wikipedia-API is to provide a simple and easy-to-use API for retrieving information from Wikipedia. The library provides both a synchronous (Wikipedia) and an asynchronous (AsyncWikipedia) client. Below are examples of common use cases.
Key differences between the sync and async API:
- All data-fetching attributes (summary, text, sections, langlinks, links, backlinks, categories, categorymembers, coordinates, images, pageid, fullurl, displaytitle, …) are explicit @property definitions in both APIs. In the async API every such property returns a coroutine: await page.summary, await page.sections, await page.pageid, etc.
- title, ns, namespace, language, variant are plain @property values in both APIs (no await needed).
- exists() is a plain method in the sync API and a coroutine method in the async API: await page.exists().
- section_by_title() and sections_by_title() are plain synchronous methods in both APIs.
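The coroutine-property pattern described above can be illustrated without the library itself. The stand-in class below is hypothetical (it is not part of wikipediaapi); it only mimics how an async data property returns a coroutine that must be awaited, while plain attributes like title are accessed directly:

```python
import asyncio

class FakeAsyncPage:
    """Stand-in for AsyncWikipediaPage: data-fetching attributes are
    properties that return coroutines, so callers must await them."""

    def __init__(self, title: str) -> None:
        self.title = title  # plain attribute: no await needed

    @property
    def summary(self):
        # Returning the coroutine object lets callers write `await page.summary`.
        return self._fetch_summary()

    async def _fetch_summary(self) -> str:
        await asyncio.sleep(0)  # placeholder for the real HTTP request
        return f"Summary of {self.title}"

async def main() -> str:
    page = FakeAsyncPage("Python_(programming_language)")
    print(page.title)          # plain property: direct access
    return await page.summary  # data property: must be awaited

result = asyncio.run(main())
print(result)  # Summary of Python_(programming_language)
```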
import wikipediaapi
# Synchronous client
wiki = wikipediaapi.Wikipedia(user_agent='MyProjectName ([email protected])', language='en')
# Asynchronous client
wiki = wikipediaapi.AsyncWikipedia(user_agent='MyProjectName ([email protected])', language='en')

Getting a single page is straightforward. You have to initialize a Wikipedia (or AsyncWikipedia)
object and ask for the page by its name.
To initialize it, you have to provide:
- user_agent to identify your project. Please follow the recommended format.
- language to specify the language mutation. It has to be one of the supported languages.
Synchronous
import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia(user_agent='MyProjectName ([email protected])', language='en')
page_py = wiki_wiki.page('Python_(programming_language)')

Asynchronous
import asyncio
import wikipediaapi
async def main():
wiki_wiki = wikipediaapi.AsyncWikipedia(user_agent='MyProjectName ([email protected])', language='en')
page_py = wiki_wiki.page('Python_(programming_language)')
# Data is fetched lazily — await any attribute or property to trigger it
asyncio.run(main())

To check whether a page exists, you can use the function exists.
Synchronous
page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True
page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" % page_missing.exists())
# Page - Exists: FalseAsynchronous
In the async API, exists() is a coroutine — it lazily fetches pageid
via the info API call if not yet cached (same approach as await page.fullurl).
async def main():
page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % await page_py.exists())
# Page - Exists: True
page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" % await page_missing.exists())
# Page - Exists: False

The class WikipediaPage has the property summary, which returns a description of the Wiki page.
In the async API, summary is a coroutine.
Synchronous
import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia(user_agent='MyProjectName ([email protected])', language='en')
page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Title: %s" % page_py.title)
# Page - Title: Python (programming language)
print("Page - Summary: %s" % page_py.summary[0:60])
# Page - Summary: Python is a widely used high-level programming language for

Asynchronous
async def main():
print("Page - Title: %s" % page_py.title)
# Page - Title: Python (programming language)
summary = await page_py.summary
print("Page - Summary: %s" % summary[0:60])
# Page - Summary: Python is a widely used high-level programming language for

WikipediaPage has two properties with the URL of the page: fullurl and canonicalurl.
In the async API, these attributes are awaitables.
Synchronous
print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)
print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)Asynchronous
async def main():
print(await page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)
print(await page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

To get the full text of a Wikipedia page, use the property text, which constructs the text of the page
as a concatenation of the summary and the sections with their titles and texts.
Synchronous
wiki_wiki = wikipediaapi.Wikipedia(
user_agent='MyProjectName ([email protected])',
language='en',
extract_format=wikipediaapi.ExtractFormat.WIKI
)
p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...
wiki_html = wikipediaapi.Wikipedia(
user_agent='MyProjectName ([email protected])',
language='en',
extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...

Asynchronous
async def main():
wiki_wiki = wikipediaapi.AsyncWikipedia(
user_agent='MyProjectName ([email protected])',
language='en',
extract_format=wikipediaapi.ExtractFormat.WIKI
)
page = wiki_wiki.page("Test 1")
text = await page.text
print(text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...

To get all top-level sections of a page, use the property sections. It returns a list of
WikipediaPageSection objects, so you have to use recursion to get all subsections.
Synchronous
def print_sections(sections, level=0):
for s in sections:
print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
print_sections(s.sections, level + 1)
print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

Asynchronous
def print_sections(sections, level=0):
for s in sections:
print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
print_sections(s.sections, level + 1)
async def main():
sections = await page_py.sections
print_sections(sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable

To get the last section of a page with a given title, use the function section_by_title.
It returns the last WikipediaPageSection with this title.
section_by_title works the same in both the sync and async API.
section_history = page_py.section_by_title('History')
print("%s - %s" % (section_history.title, section_history.text[0:40]))
# History - Python was conceived in the late 1980s b

To get all sections of a page with a given title, use the function sections_by_title.
It returns all WikipediaPageSections with this title.
sections_by_title works the same in both the sync and async API.
page_1920 = wiki_wiki.page('1920')
sections_january = page_1920.sections_by_title('January')
for s in sections_january:
print("* %s - %s" % (s.title, s.text[0:40]))
# * January - January 1
# Polish–Soviet War in 1920: The
# * January - January 2
# Isaac Asimov, American author
# * January - January 1 – Zygmunt Gorazdowski, Polish

If you want to get other translations of a given page, use the property langlinks. It is a map
where the key is a language code and the value is a WikipediaPage (or AsyncWikipediaPage).
Synchronous
def print_langlinks(page):
langlinks = page.langlinks
for k in sorted(langlinks.keys()):
v = langlinks[k]
print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))
print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8
page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

Asynchronous
In the async API, langlinks is an awaitable property. Attributes on the returned
page stubs (e.g. fullurl) are also awaitables.
async def main():
langlinks = await page_py.langlinks
for k in sorted(langlinks.keys()):
v = langlinks[k]
print("%s: %s - %s: %s" % (k, v.language, v.title, await v.fullurl))
page_py_cs = langlinks['cs']
print("Page - Summary: %s" % (await page_py_cs.summary)[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk

If you want to get all links to other wiki pages from a given page, use the property links.
It is a map where the key is a page title and the value is a WikipediaPage (or AsyncWikipediaPage).
Synchronous
def print_links(page):
links = page.links
for title in sorted(links.keys()):
print("%s: %s" % (title, links[title]))
print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...

Asynchronous
async def main():
links = await page_py.links
for title in sorted(links.keys()):
print("%s: %s" % (title, links[title]))

If you want to get all categories a page belongs to, use the property categories.
It is a map where the key is a category title and the value is a WikipediaPage (or AsyncWikipediaPage).
Synchronous
def print_categories(page):
categories = page.categories
for title in sorted(categories.keys()):
print("%s: %s" % (title, categories[title]))
print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...

Asynchronous
async def main():
categories = await page_py.categories
for title in sorted(categories.keys()):
print("%s: %s" % (title, categories[title]))

To get all pages from a given category, use the property categorymembers (awaitable in the async API).
It returns all members of the given category.
You have to implement recursion and deduplication by yourself.
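The recursion-plus-deduplication requirement can be sketched with a visited set. The FakePage class below is only a stand-in for WikipediaPage (real pages expose title, ns, and categorymembers similarly), and it avoids re-visiting pages that appear under several parent categories:

```python
# Namespace value for categories in the real library (wikipediaapi.Namespace.CATEGORY)
CATEGORY_NS = 14

class FakePage:
    """Minimal stand-in: title, namespace, and a dict of category members."""
    def __init__(self, title, ns=0, members=None):
        self.title = title
        self.ns = ns
        self.categorymembers = members or {}

def collect_members(categorymembers, visited=None):
    """Walk a category tree, skipping pages already seen (a page can be
    reachable through several categories, so deduplication is required)."""
    if visited is None:
        visited = set()
    for page in categorymembers.values():
        if page.title in visited:
            continue
        visited.add(page.title)
        if page.ns == CATEGORY_NS:
            collect_members(page.categorymembers, visited)
    return visited

# Tiny tree where one page is reachable twice
shared = FakePage("Entropy")
sub = FakePage("Category:Thermodynamics", ns=CATEGORY_NS, members={"Entropy": shared})
root = {"Entropy": shared, "Category:Thermodynamics": sub}
print(sorted(collect_members(root)))  # ['Category:Thermodynamics', 'Entropy']
```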
Synchronous
def print_categorymembers(categorymembers, level=0, max_level=1):
for c in categorymembers.values():
print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)
cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)
# Category members: Category:Physics
# * Statistical mechanics (ns: 0)
# * Category:Physical quantities (ns: 14)
# ** Refractive index (ns: 0)
# ** Vapor quality (ns: 0)
# ** Electric susceptibility (ns: 0)
# ** Specific weight (ns: 0)
# ** Category:Viscosity (ns: 14)
# *** Brookfield Engineering (ns: 0)

Asynchronous
async def print_categorymembers(categorymembers, level=0, max_level=1):
for c in categorymembers.values():
print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
await print_categorymembers(
await c.categorymembers, level=level + 1, max_level=max_level
)
async def main():
cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
await print_categorymembers(await cat.categorymembers)

To get geographic coordinates of a page, use coordinates() on the wiki client or the
coordinates property on the page. Results are Coordinate dataclasses with lat,
lon, primary, and globe fields.
Synchronous
# Basic coordinates
page = wiki_wiki.page('London')
coords = wiki_wiki.coordinates(page)
for c in coords:
print(f"lat={c.lat}, lon={c.lon}, primary={c.primary}")
# Or via the page property (uses default params):
coords = page.coordinates
print(f"Coordinates: {len(coords)}")
# Coordinates with enum parameters (type-safe)
from wikipediaapi import CoordinatesProp, CoordinateType
coords = wiki_wiki.coordinates(
page,
prop=[CoordinatesProp.GLOBE, CoordinatesProp.TYPE, CoordinatesProp.COUNTRY],
primary=CoordinateType.ALL
)
# Get only primary coordinates
coords = wiki_wiki.coordinates(page, primary=CoordinateType.PRIMARY)
# Get only secondary coordinates
coords = wiki_wiki.coordinates(page, primary=CoordinateType.SECONDARY)

Asynchronous
async def main():
# Basic coordinates
page = wiki_wiki.page('London')
coords = await wiki_wiki.coordinates(page)
for c in coords:
print(f"lat={c.lat}, lon={c.lon}, primary={c.primary}")
# Or via the page property:
coords = await page.coordinates
# Coordinates with enum parameters (type-safe)
from wikipediaapi import CoordinatesProp, CoordinateType
coords = await wiki_wiki.coordinates(
page,
prop=[CoordinatesProp.GLOBE, CoordinatesProp.TYPE, CoordinatesProp.COUNTRY],
primary=CoordinateType.ALL
)

To get images (files) used on a page, use images() on the wiki client or the
images property on the page.
Synchronous
page = wiki_wiki.page('London')
imgs = wiki_wiki.images(page)
for title in imgs:
print(title)
# Or via the page property:
imgs = page.images

Asynchronous
async def main():
page = wiki_wiki.page('London')
imgs = await wiki_wiki.images(page)
for title in imgs:
print(title)

To find pages near a geographic point, use geosearch(). Each returned page has a
geosearch_meta property with dist, lat, lon, and primary fields.
Synchronous
# Basic geosearch
results = wiki_wiki.geosearch(coord=wikipediaapi.GeoPoint(51.5074, -0.1278), radius=1000, limit=5)
for title, page in results.items():
meta = page.geosearch_meta
print(f"{title}: {meta.dist:.0f}m away")
# Geosearch with enum parameters (type-safe)
from wikipediaapi import GeoSearchSort, Globe
results = wiki_wiki.geosearch(
coord=wikipediaapi.GeoPoint(51.5074, -0.1278),
sort=GeoSearchSort.DISTANCE,
globe=Globe.EARTH,
radius=1000,
limit=5
)
# Geosearch with different sort
results = wiki_wiki.geosearch(
coord=wikipediaapi.GeoPoint(51.5074, -0.1278),
sort=GeoSearchSort.RELEVANCE,
radius=1000,
limit=5
)

Asynchronous
async def main():
# Basic geosearch
results = await wiki_wiki.geosearch(
coord=wikipediaapi.GeoPoint(51.5074, -0.1278), radius=1000, limit=5
)
for title, page in results.items():
meta = page.geosearch_meta
print(f"{title}: {meta.dist:.0f}m away")
# Geosearch with enum parameters (type-safe)
from wikipediaapi import GeoSearchSort, Globe
results = await wiki_wiki.geosearch(
coord=wikipediaapi.GeoPoint(51.5074, -0.1278),
sort=GeoSearchSort.DISTANCE,
globe=Globe.EARTH,
radius=1000,
limit=5
)

To get random Wikipedia pages, use random().
Synchronous
# Basic random pages
pages = wiki_wiki.random(limit=3)
for title in pages:
print(title)
# Random pages with enum filter (type-safe)
from wikipediaapi import RedirectFilter
pages = wiki_wiki.random(filter_redirect=RedirectFilter.NONREDIRECTS, limit=3)
# Get only redirects
pages = wiki_wiki.random(filter_redirect=RedirectFilter.REDIRECTS, limit=3)
# Get all pages (redirects and non-redirects)
pages = wiki_wiki.random(filter_redirect=RedirectFilter.ALL, limit=3)

Asynchronous
async def main():
# Basic random pages
pages = await wiki_wiki.random(limit=3)
for title in pages:
print(title)
# Random pages with enum filter (type-safe)
from wikipediaapi import RedirectFilter
pages = await wiki_wiki.random(filter_redirect=RedirectFilter.NONREDIRECTS, limit=3)

To search for pages by keyword, use search(). It returns a SearchResults object
with pages, totalhits, and suggestion. Each page has a search_meta
property with snippet, size, wordcount, and timestamp fields.
Synchronous
# Basic search
results = wiki_wiki.search("Python programming", limit=5)
print(f"Total hits: {results.totalhits}")
print(f"Suggestion: {results.suggestion}")
for title, page in results.pages.items():
print(f"{title}: {page.search_meta.wordcount} words")
# Search with enum parameters (type-safe)
from wikipediaapi import SearchProp, SearchInfo, SearchWhat, SearchQiProfile, SearchSort
results = wiki_wiki.search(
"Python programming",
prop=[SearchProp.SIZE, SearchProp.WORDCOUNT, SearchProp.TIMESTAMP],
info=[SearchInfo.TOTAL_HITS, SearchInfo.SUGGESTION],
what=SearchWhat.TEXT,
qi_profile=SearchQiProfile.ENGINE_AUTO_SELECT,
sort=SearchSort.RELEVANCE,
limit=5
)
# Search with different sort options
results = wiki_wiki.search("Python programming", sort=SearchSort.LAST_EDIT_DESC, limit=5)
results = wiki_wiki.search("Python programming", sort=SearchSort.TITLE_NATURAL_ASC, limit=5)
# Search by title only
results = wiki_wiki.search("Python", what=SearchWhat.TITLE, limit=5)
# Near match search
results = wiki_wiki.search("Pythn", what=SearchWhat.NEAR_MATCH, limit=5)

Asynchronous
async def main():
# Basic search
results = await wiki_wiki.search("Python programming", limit=5)
print(f"Total hits: {results.totalhits}")
for title, page in results.pages.items():
print(f"{title}: {page.search_meta.wordcount} words")
# Search with enum parameters (type-safe)
from wikipediaapi import SearchProp, SearchInfo, SearchWhat, SearchQiProfile, SearchSort
results = await wiki_wiki.search(
"Python programming",
prop=[SearchProp.SIZE, SearchProp.WORDCOUNT],
info=[SearchInfo.TOTAL_HITS],
what=SearchWhat.TEXT,
qi_profile=SearchQiProfile.ENGINE_AUTO_SELECT,
sort=SearchSort.RELEVANCE,
limit=5
)

To efficiently fetch coordinates or images for multiple pages at once, use
pages() to create a PagesDict (or AsyncPagesDict), then call
the batch methods.
Synchronous
pd = wiki_wiki.pages(["London", "Paris", "Berlin"])
batch_coords = pd.coordinates()
for title, coords in batch_coords.items():
print(f"{title}: {len(coords)} coordinate(s)")
batch_imgs = pd.images()
for title, imgs in batch_imgs.items():
print(f"{title}: {len(imgs)} image(s)")

Asynchronous
async def main():
pd = wiki_wiki.pages(["London", "Paris", "Berlin"])
batch_coords = await pd.coordinates()
for title, coords in batch_coords.items():
print(f"{title}: {len(coords)} coordinate(s)")

The official API supports many different parameters. You can see them in the sandbox. Not all of these parameters are supported directly as parameters of the functions. If you want to specify them, you can pass them as additional parameters in the constructor. For the info API call you can specify the parameter converttitles. If you want to specify it, you can use:
Synchronous
import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia('MyProjectName ([email protected])', 'zh', 'zh-tw', extra_api_params={'converttitles': 1})
page = wiki_wiki.page("孟卯")
print(repr(page.varianttitles))

Asynchronous
async def main():
wiki_wiki = wikipediaapi.AsyncWikipedia('MyProjectName ([email protected])', 'zh', 'zh-tw', extra_api_params={'converttitles': 1})
page = wiki_wiki.page("孟卯")
print(repr(await page.varianttitles))

Wikipedia-API provides strongly-typed enum parameters for better type safety while maintaining full backward compatibility with string values. Using enums provides IDE autocomplete, type checking, and prevents runtime errors from typos.
Key Benefits:
- Type Safety: IDE autocomplete and compile-time type checking
- No Typos: Enum values are validated by Python
- Backward Compatible: All existing string-based code continues to work
- Self-Documenting: Enum names clearly indicate the purpose
Available Enums:
- SearchProp: Search result properties (SIZE, WORDCOUNT, TIMESTAMP, SNIPPET, etc.)
- SearchInfo: Search metadata (TOTAL_HITS, SUGGESTION, REWRITTEN_QUERY)
- SearchWhat: Search types (TEXT, TITLE, NEAR_MATCH)
- SearchQiProfile: Search ranking profiles (ENGINE_AUTO_SELECT, CLASSIC, etc.)
- SearchSort: Sort options for search results (RELEVANCE, LAST_EDIT_DESC, etc.)
- GeoSearchSort: Sort options for geographic search (DISTANCE, RELEVANCE)
- Globe: Celestial body for coordinates (EARTH, MARS, MOON, VENUS)
- CoordinateType: Coordinate filtering (ALL, PRIMARY, SECONDARY)
- CoordinatesProp: Coordinate properties (COUNTRY, DIM, GLOBE, NAME, REGION, TYPE)
- RedirectFilter: Redirect filtering for random pages (ALL, REDIRECTS, NONREDIRECTS)
- Direction: Sort direction for images (ASCENDING, DESCENDING)
Enum vs String Usage:
import wikipediaapi
from wikipediaapi import (
SearchProp, SearchInfo, SearchWhat, SearchQiProfile, SearchSort,
GeoSearchSort, Globe, CoordinateType, CoordinatesProp, RedirectFilter
)
wiki = wikipediaapi.Wikipedia('MyProjectName ([email protected])', 'en')
# Type-safe enum usage (recommended)
results = wiki.search(
"python",
prop=[SearchProp.SIZE, SearchProp.WORDCOUNT],
info=[SearchInfo.TOTAL_HITS],
what=SearchWhat.TEXT,
qi_profile=SearchQiProfile.ENGINE_AUTO_SELECT,
sort=SearchSort.RELEVANCE
)
geo_results = wiki.geosearch(coord=wikipediaapi.GeoPoint(lat=51.5, lon=-0.1), sort=GeoSearchSort.DISTANCE, globe=Globe.EARTH)
coords = wiki.coordinates(page, prop=[CoordinatesProp.GLOBE, CoordinatesProp.TYPE, CoordinatesProp.COUNTRY], primary=CoordinateType.ALL)
random_pages = wiki.random(filter_redirect=RedirectFilter.NONREDIRECTS)
# Backward-compatible string usage (still works)
results = wiki.search(
"python",
prop=["size", "wordcount"],
info=["totalhits"],
what="text",
qi_profile="engine_autoselect",
sort="relevance"
)
geo_results = wiki.geosearch(coord=wikipediaapi.GeoPoint(lat=51.5, lon=-0.1), sort="distance", globe="earth")
coords = wiki.coordinates(page, prop=["globe", "type", "country"], primary="all")
random_pages = wiki.random(filter_redirect="nonredirects")

Type Aliases for Function Signatures:
The library uses Wiki* type aliases that accept both enum members and strings,
making it easy to write type-annotated code:
from wikipediaapi import (
WikiSearchSort, WikiSearchProp, WikiSearchInfo,
WikiSearchWhat, WikiSearchQiProfile, SearchSort,
WikiCoordinatesProp, WikiCoordinateType
)
def search_function(
query: str,
sort: WikiSearchSort,
prop: list[WikiSearchProp] | None = None,
info: list[WikiSearchInfo] | None = None,
what: WikiSearchWhat | None = None,
qi_profile: WikiSearchQiProfile | None = None
) -> wikipediaapi.SearchResults:
"""Search Wikipedia with either enum or string parameters."""
wiki = wikipediaapi.Wikipedia('MyApp/1.0')
return wiki.search(query, sort=sort, prop=prop, info=info, what=what, qi_profile=qi_profile)
def coords_function(
page: wikipediaapi.WikipediaPage,
prop: list[WikiCoordinatesProp] | None = None,
primary: WikiCoordinateType = CoordinateType.PRIMARY
) -> list[wikipediaapi.Coordinate]:
"""Get coordinates with either enum or string parameters."""
wiki = wikipediaapi.Wikipedia('MyApp/1.0')
return wiki.coordinates(page, prop=prop, primary=primary)
# Both calls work and are type-safe
search_function(
"python",
SearchSort.RELEVANCE,
prop=[SearchProp.SIZE, SearchProp.WORDCOUNT],
info=[SearchInfo.TOTAL_HITS],
what=SearchWhat.TEXT,
qi_profile=SearchQiProfile.ENGINE_AUTO_SELECT
) # Enum input
search_function(
"python",
"relevance",
prop=["size", "wordcount"],
info=["totalhits"],
what="text",
qi_profile="engine_autoselect"
) # String input

All exceptions raised by the library inherit from WikipediaException. You can catch specific
exceptions or the base WikipediaException. The same exception types are raised by both the
sync and async clients.
Synchronous
import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia(user_agent='MyProjectName ([email protected])', language='en')
# Catch any Wikipedia-API error
try:
page = wiki_wiki.page('Python_(programming_language)')
print(page.summary[0:60])
except wikipediaapi.WikipediaException as e:
print("Error: %s" % e)

# Handle specific error types
try:
page = wiki_wiki.page('Python_(programming_language)')
print(page.summary[0:60])
except wikipediaapi.WikiRateLimitError as e:
print("Rate limited! Retry after: %s seconds" % e.retry_after)
except wikipediaapi.WikiHttpError as e:
print("HTTP error %d: %s" % (e.status_code, e))
except wikipediaapi.WikiHttpTimeoutError:
print("Request timed out")
except wikipediaapi.WikiConnectionError:
print("Could not connect to Wikipedia")
except wikipediaapi.WikiInvalidJsonError:
print("Received invalid response from Wikipedia")

Asynchronous
async def main():
wiki_wiki = wikipediaapi.AsyncWikipedia(user_agent='MyProjectName ([email protected])', language='en')
try:
page = wiki_wiki.page('Python_(programming_language)')
print((await page.summary)[0:60])
except wikipediaapi.WikiRateLimitError as e:
print("Rate limited! Retry after: %s seconds" % e.retry_after)
except wikipediaapi.WikiHttpError as e:
print("HTTP error %d: %s" % (e.status_code, e))
except wikipediaapi.WikiHttpTimeoutError:
print("Request timed out")
except wikipediaapi.WikiConnectionError:
print("Could not connect to Wikipedia")
except wikipediaapi.WikiInvalidJsonError:
print("Received invalid response from Wikipedia")

By default, transient errors (HTTP 429, 5xx, timeouts, connection errors) are retried up to 3 times
with exponential backoff. You can configure this behavior in the constructor.
The same options apply to both Wikipedia and AsyncWikipedia.
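Exponential backoff generally means the wait grows by a fixed factor per attempt. The helper below is a generic sketch of that idea, not the library's documented internals; the base and factor are illustrative:

```python
def backoff_delays(max_retries: int, retry_wait: float, factor: float = 2.0):
    """Generic exponential backoff: wait retry_wait, then retry_wait*factor, ..."""
    return [retry_wait * (factor ** attempt) for attempt in range(max_retries)]

# With settings like max_retries=5 and retry_wait=2.0, this sketch yields:
print(backoff_delays(5, 2.0))  # [2.0, 4.0, 8.0, 16.0, 32.0]
```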
import wikipediaapi
# Custom retry: 5 retries with 2-second base wait
wiki_wiki = wikipediaapi.Wikipedia(
user_agent='MyProjectName ([email protected])',
language='en',
max_retries=5,
retry_wait=2.0,
)

# Disable retries entirely
wiki_wiki = wikipediaapi.Wikipedia(
user_agent='MyProjectName ([email protected])',
language='en',
max_retries=0,
)

If you have problems retrieving data, you can get the URL of the underlying API call.
This will help you determine whether the problem is in the library or somewhere else.
Logging works the same for both Wikipedia and AsyncWikipedia.
import sys
import wikipediaapi
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)
# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)
wiki = wikipediaapi.Wikipedia(user_agent='MyProjectName ([email protected])', language='en')
page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki