The docstring for the PageConfig type stub (packages/python/kreuzberg/_internal_bindings.pyi) uses extract_pages=True in its example, which suggests that this is the default or expected usage. The actual default is extract_pages=False, so the example contradicts the documented default.
Location
packages/python/kreuzberg/_internal_bindings.pyi, around line 1059:
class PageConfig:
    r"""Page extraction and tracking configuration.
    ...
    Example:
        >>> from kreuzberg import ExtractionConfig, PageConfig
        >>> config = ExtractionConfig(pages=PageConfig(extract_pages=True))
    """
    extract_pages: bool  # Default: False
    insert_page_markers: bool  # Default: False
    ...
The attributes clearly document Default: False, but the example shows extract_pages=True without explaining why, which is confusing to users who assume the example demonstrates default or typical usage.
Expected behavior
The example should either:
- Show the default (no-arg) constructor: PageConfig(), or
- Explicitly annotate that extract_pages=True is being set to enable page tracking (i.e., it is non-default), and explain what it enables.
Additionally, since using result_format="element_based" without extract_pages=True produces incorrect page numbers (see Issue 1 & 2), the PageConfig docstring should note this interaction.
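One possible wording for the revised docstring is sketched below. This is not the real stub: the class is a hypothetical standalone dataclass that mirrors only the two attributes quoted above, so that the example runs without kreuzberg installed. The note about element-based output restates the behavior described in this report and has not been checked against the library's documentation.

```python
from dataclasses import dataclass


@dataclass
class PageConfig:  # hypothetical stand-in for the real stub, for illustration only
    r"""Page extraction and tracking configuration.

    Both options are off by default. The second example below sets
    ``extract_pages=True`` explicitly (non-default) to enable page tracking.

    Note:
        With ``result_format="element_based"``, leaving ``extract_pages``
        at its default of ``False`` has been reported to yield incorrect
        page numbers.

    Example:
        >>> PageConfig()  # defaults
        PageConfig(extract_pages=False, insert_page_markers=False)
        >>> PageConfig(extract_pages=True)  # opt in to page tracking
        PageConfig(extract_pages=True, insert_page_markers=False)
    """

    extract_pages: bool = False  # Default: False
    insert_page_markers: bool = False  # Default: False
```

Showing the no-arg constructor first makes the defaults visible, and the inline comment on the second call marks extract_pages=True as an opt-in rather than typical usage.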
Environment
- Platform: Windows (reported by user; also reproducible on other platforms)
- Kreuzberg version: (reporter did not specify — please include your version)
- Python version: (reporter did not specify)