
Comparing changes

base repository: Unstructured-IO/unstructured
base: 0.20.6
head repository: Unstructured-IO/unstructured
compare: 0.20.8
  • 2 commits
  • 7 files changed
  • 2 contributors

Commits on Feb 19, 2026

  1. fix: set max decompressed size for elements JSON (#4244)

    Sets a maximum size on the decompressed form of an elements JSON. For
    reference, a fairly large JSON from a 1225-page document is 5MB. One
    place we might still run out of headroom is a JSON from a very large
    document that includes embedded digital images.
    
    If a JSON is too large, the decompressed version will not parse,
    because its tail is cut off. Part of the review should be to determine
    whether this is an acceptable failure mode.
    
    <!-- CURSOR_SUMMARY -->
    ---
    
    > [!NOTE]
    > **Medium Risk**
    > Touches deserialization of compressed element payloads, which can
    affect ingestion/round-tripping for large documents and changes the
    failure mode to explicit exceptions when limits are hit.
    > 
    > **Overview**
    > Adds a hard cap (`MAX_DECOMPRESSED_SIZE`, default 200MB) when
    inflating base64+gzipped elements JSON in
    `elements_from_base64_gzipped_json`, preventing unbounded memory/disk
    blowups; decompression now explicitly fails with
    `DecompressedSizeExceededError` (new) when the limit is hit, or
    `zlib.error` when the payload is incomplete/corrupt.
    > 
    > Bumps version to `0.20.7`, updates the changelog, and adds targeted
    tests covering normal round-trip, incomplete streams, and size-limit
    exceedance (via patching the max size).
    > 
    > <sup>Written by [Cursor
    Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
    a5e5256. This will update automatically
    on new commits. Configure
    [here](https://cursor.com/dashboard?tab=bugbot).</sup>
    <!-- /CURSOR_SUMMARY -->
    qued authored Feb 19, 2026
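
The size cap described in #4244 can be illustrated with a minimal sketch. This is not the code from the commit: it assumes the payload is base64 text wrapping a gzip (or zlib) stream, and the helper names `bounded_inflate` and `elements_json_from_b64` are hypothetical. `MAX_DECOMPRESSED_SIZE` and `DecompressedSizeExceededError` are redefined locally as stand-ins for the names mentioned in the summary.

```python
import base64
import gzip
import json
import zlib

# Stand-in for the constant named in the summary (200MB default).
MAX_DECOMPRESSED_SIZE = 200 * 1024 * 1024


class DecompressedSizeExceededError(Exception):
    """Stand-in for the new exception: inflating would exceed the cap."""


def bounded_inflate(data: bytes, max_size: int = MAX_DECOMPRESSED_SIZE) -> bytes:
    """Hypothetical helper: inflate `data`, never producing more than max_size bytes."""
    # wbits = MAX_WBITS | 32 lets zlib auto-detect a zlib or gzip header.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 32)
    # Request one byte more than the cap so that an overflow is detectable
    # without ever holding more than max_size + 1 bytes of output.
    out = decompressor.decompress(data, max_size + 1)
    if len(out) > max_size:
        raise DecompressedSizeExceededError(
            f"decompressed payload exceeds {max_size} bytes"
        )
    if not decompressor.eof:
        # The stream ended before its trailer: incomplete payload.
        raise zlib.error("incomplete compressed stream")
    return out


def elements_json_from_b64(b64_text: str, max_size: int = MAX_DECOMPRESSED_SIZE) -> list:
    """Hypothetical helper: decode base64, inflate under the cap, parse JSON."""
    raw = base64.b64decode(b64_text)
    return json.loads(bounded_inflate(raw, max_size).decode("utf-8"))
```

In this sketch an oversized payload raises instead of returning a truncated string, so the caller never sees JSON with its tail missing. Whether raising is preferable to a parse failure is the review question raised in the commit message.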
    Commit c6c7462
  2. fix: update dependencies (#4247)

    - resolve lock issue on Windows with Python 3.13 (lack of library
    support): a few dependencies are required only on non-Windows
    systems, or on Windows with Python < 3.13
    - downgrade `wrapt` so that it is compatible with the
    `opentelemetry-instrumentation-httpx` library
    badGarnet authored Feb 19, 2026
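
The platform condition in the first bullet of #4247 can be read as a simple predicate. The sketch below is only an illustration of that condition, assuming it is exactly as the bullet states. The function name `dependency_applies` is hypothetical, and the commit itself may express the rule differently (for example, as environment markers in the dependency metadata).

```python
import sys


def dependency_applies(platform: str, version: tuple) -> bool:
    """Hypothetical predicate: True on non-Windows, or on Windows with Python < 3.13."""
    # sys.platform is "win32" on Windows, for both 32-bit and 64-bit builds.
    return platform != "win32" or version < (3, 13)


# Evaluate the condition for the interpreter running this snippet.
applies_here = dependency_applies(sys.platform, sys.version_info[:2])
```

Under this reading, the affected dependencies are skipped only for the one combination that lacks library support: Windows with Python 3.13 or newer.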
    Commit a8f14ba