Skip to content

Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization#85440

Merged
azat merged 3 commits intoClickHouse:masterfrom
azat:string-deser-fix
Aug 13, 2025
Merged

Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization#85440
azat merged 3 commits intoClickHouse:masterfrom
azat:string-deser-fix

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Aug 12, 2025

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC) or LOGICAL_ERROR

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix possible UB (crashes) in case of MEMORY_LIMIT_EXCEEDED during String deserialization

Previously it was possible to leave offsets and data out of sync if data.resize_exact fails with MEMORY_LIMIT_EXCEEDED, which eventually will lead to various UBs (i.e. SIZES_OF_COLUMNS_DOESNT_MATCH and LOGICAL_ERROR)

@azat azat requested a review from bharatnc August 12, 2025 14:33
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Aug 12, 2025

Workflow [PR], commit [d0a2dba]

Summary:

job_name test_name status info comment
Integration tests (arm_binary, distributed plan, 2/4) failure
test_async_logger_metrics/test.py::test_async_logger_profile_events FAIL
Stress test (amd_ubsan) failure
Server died FAIL
Hung check failed, possible deadlock found (see hung_check.log) FAIL
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL

@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Aug 12, 2025
@azat azat force-pushed the string-deser-fix branch from f1af558 to 2cea8c1 Compare August 12, 2025 14:46
@azat azat force-pushed the string-deser-fix branch 2 times, most recently from 04623a0 to 0e3b4c1 Compare August 12, 2025 21:58
@alexey-milovidov
Copy link
Copy Markdown
Member

What is the scenario when we use an incomplete String column after an exception is thrown?

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test.

You can make a test by gradually increasing the max_memory_usage value.

@alexey-milovidov
Copy link
Copy Markdown
Member

PTAL why everything failed in this PR.

@azat
Copy link
Copy Markdown
Member Author

azat commented Aug 13, 2025

What is the scenario when we use an incomplete String column after an exception is thrown?

*update_field_loaded_block,

@azat
Copy link
Copy Markdown
Member Author

azat commented Aug 13, 2025

Needs a test.

@alexey-milovidov c3e4a81

@azat azat requested a review from alexey-milovidov August 13, 2025 12:27
@azat azat force-pushed the string-deser-fix branch from c3e4a81 to 68630ad Compare August 13, 2025 14:26
azat added 2 commits August 13, 2025 16:26
…alization

Previously it was possible to leave `offsets` and `data` out of sync if
`data.resize_exact` fails with `MEMORY_LIMIT_EXCEEDED`, which eventually
will lead to various UBs (i.e. `SIZES_OF_COLUMNS_DOESNT_MATCH` and
`LOGICAL_ERROR`)
W/o the fix it always fails, i.e.:

    $ src/unit_tests_dbms --gtest_filter=StringSerialization.*
    Note: Google Test filter = StringSerialization.*
    [==========] Running 1 test from 1 test suite.
    [----------] Global test environment set-up.
    [----------] 1 test from StringSerialization
    [ RUN      ] StringSerialization.IncorrectStateAfterMemoryLimitExceeded
    /src/ch/clickhouse/src/DataTypes/Serializations/tests/gtest_string_serialization.cpp:79: Failure
    Expected equality of these values:
      result.getDataAt(result.size() - 1)
        Which is: ar\0foo
      "foobar"
        Which is: 0x555557ce25a3

    [  FAILED  ] StringSerialization.IncorrectStateAfterMemoryLimitExceeded (28 ms)
    [----------] 1 test from StringSerialization (28 ms total)

    [----------] Global test environment tear-down
    [==========] 1 test from 1 test suite ran. (30 ms total)
    [  PASSED  ] 0 tests.
    [  FAILED  ] 1 test, listed below:
    [  FAILED  ] StringSerialization.IncorrectStateAfterMemoryLimitExceeded

     1 FAILED TEST
@azat azat force-pushed the string-deser-fix branch from 68630ad to a1569a2 Compare August 13, 2025 14:26
@azat azat force-pushed the string-deser-fix branch from a1569a2 to d0a2dba Compare August 13, 2025 14:40
@alexey-milovidov
Copy link
Copy Markdown
Member

ClickHouse/src/Dictionaries/HashedArrayDictionary.cpp

Sounds good, but I need to understand, why we continue using the block after an exception is thrown in HashedArrayDictionary.

@alexey-milovidov alexey-milovidov self-assigned this Aug 13, 2025
@azat azat enabled auto-merge August 13, 2025 20:34
@azat azat added this pull request to the merge queue Aug 13, 2025
robot-ch-test-poll3 added a commit that referenced this pull request Aug 15, 2025
Cherry pick #85440 to 24.8: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
robot-clickhouse added a commit that referenced this pull request Aug 15, 2025
robot-ch-test-poll3 added a commit that referenced this pull request Aug 15, 2025
Cherry pick #85440 to 25.3: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
robot-clickhouse added a commit that referenced this pull request Aug 15, 2025
robot-ch-test-poll3 added a commit that referenced this pull request Aug 15, 2025
Cherry pick #85440 to 25.5: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
robot-clickhouse added a commit that referenced this pull request Aug 15, 2025
robot-ch-test-poll3 added a commit that referenced this pull request Aug 15, 2025
Cherry pick #85440 to 25.6: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
robot-clickhouse added a commit that referenced this pull request Aug 15, 2025
robot-ch-test-poll3 added a commit that referenced this pull request Aug 15, 2025
Cherry pick #85440 to 25.7: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
robot-clickhouse added a commit that referenced this pull request Aug 15, 2025
@robot-clickhouse robot-clickhouse added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Aug 15, 2025
clickhouse-gh bot added a commit that referenced this pull request Aug 15, 2025
Backport #85440 to 25.7: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
clickhouse-gh bot added a commit that referenced this pull request Aug 15, 2025
Backport #85440 to 25.6: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
alexey-milovidov added a commit that referenced this pull request Aug 16, 2025
Backport #85440 to 25.5: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
azat added a commit that referenced this pull request Aug 16, 2025
Backport #85440 to 25.3: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
azat added a commit that referenced this pull request Aug 18, 2025
Backport #85440 to 24.8: Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization
@azat azat mentioned this pull request Sep 9, 2025
1 task
yokofly added a commit to timeplus-io/proton that referenced this pull request Dec 15, 2025
…alization

* ClickHouse/ClickHouse#85440 from azat/string-deser-fix

Fix possible UB in case of MEMORY_LIMIT_EXCEEDED during String deserialization

* fix build

---------

Co-authored-by: Azat Khuzhin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants