base58Encode does not work correctly if a string starts with null byte #40587

@zaa

Description

base58Encode returns an empty string when the input starts with a null byte. Compare its output with that of hex for the same input:

:) select base58Encode('\x00\x0b\xe3\xe1\xeb\xa1\x7a\x47\x3f\x89\xb0\xf7\xe8\xe2\x49\x40\xf2\x0a\xeb\x8e\xbc\xa7\x1a\x88\xfd\xe9\x5d\x4b\x83\xb7\x1a\x09') as encoded_data format JSONEachRow;

{"encoded_data":""}

1 row in set. Elapsed: 0.002 sec.

:) select hex('\x00\x0b\xe3\xe1\xeb\xa1\x7a\x47\x3f\x89\xb0\xf7\xe8\xe2\x49\x40\xf2\x0a\xeb\x8e\xbc\xa7\x1a\x88\xfd\xe9\x5d\x4b\x83\xb7\x1a\x09') as encoded_data format JSONEachRow;

{"encoded_data":"000BE3E1EBA17A473F89B0F7E8E24940F20AEB8EBCA71A88FDE95D4B83B71A09"}

1 row in set. Elapsed: 0.002 sec.

Does it reproduce on recent release?

Yes. Reproduced on ClickHouse version 22.8.1.2097.

Expected behavior

The Python base58 package encodes the same data as follows:

$ python3 -m venv ~/.venv/base58
$ ~/.venv/base58/bin/pip install base58
$ ~/.venv/base58/bin/python3
>>> import base58
>>> data = b'\x00\x0b\xe3\xe1\xeb\xa1\x7a\x47\x3f\x89\xb0\xf7\xe8\xe2\x49\x40\xf2\x0a\xeb\x8e\xbc\xa7\x1a\x88\xfd\xe9\x5d\x4b\x83\xb7\x1a\x09'
>>> len(data)
32
>>> base58.b58encode(data)
b'1BWutmTvYPwDtmw9abTkS4Ssr8no61spGAvW1X6NDix'
>>> len(base58.b58encode(data))
43

So I expected the base58Encode call to return '1BWutmTvYPwDtmw9abTkS4Ssr8no61spGAvW1X6NDix'.
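For reference, here is a minimal sketch of how Base58 with the Bitcoin alphabet conventionally treats leading null bytes: each leading 0x00 byte becomes a leading '1', and the remaining bytes are converted as one big-endian integer. The function name `b58encode_sketch` is hypothetical and only for illustration. This is not ClickHouse's implementation, just my understanding of the behavior the Python package shows above.

```python
# Bitcoin-style Base58 alphabet (no 0, O, I, l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode_sketch(data: bytes) -> str:
    """Illustrative Base58 encoder (hypothetical helper, not a library API)."""
    # Leading zero bytes carry no numeric value, so they are counted
    # separately and each one is emitted as the zero digit '1'.
    n_zeros = len(data) - len(data.lstrip(b"\x00"))

    # Treat the whole input as one big-endian integer and convert it
    # to base 58, collecting digits from least to most significant.
    num = int.from_bytes(data, "big")
    digits = []
    while num > 0:
        num, rem = divmod(num, 58)
        digits.append(ALPHABET[rem])

    return "1" * n_zeros + "".join(reversed(digits))


if __name__ == "__main__":
    data = bytes.fromhex(
        "000BE3E1EBA17A473F89B0F7E8E24940F20AEB8EBCA71A88FDE95D4B83B71A09"
    )
    print(b58encode_sketch(data))
```

With this scheme an input that starts with a null byte can never encode to an empty string: the result has at least one '1' for each leading zero byte.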

Note: I am not sure, but the issue might have the same root cause as the one from #40536.

Thank you.

Labels: potential bug (to be reviewed by developers and confirmed/rejected)
