Conversation
There was a problem hiding this comment.
Pull request overview
Adds a public _base64 user-defined literal to decode base64 strings at compile time (C++23), and updates tests/docs to use it.
Changes:
- Introduces
simdutf::literals::operator""_base64in the public header under C++23 + base64 feature guards. - Updates constexpr base64 tests to use the new literal, adding coverage for whitespace and empty input.
- Documents compile-time base64 decoding usage in the README.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/constexpr_base64_tests.cpp | Replaces the local demo literal with the new public _base64 literal and adds more test cases. |
| include/simdutf/implementation.h | Implements the _base64 literal and supporting constexpr decode helpers. |
| README.md | Documents the new compile-time base64 literal and its behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| template <std::size_t InputLen> | ||
| constexpr auto base64_decode_literal(const char *str) { | ||
| base64_decode_result<InputLen> result{}; | ||
| auto r = scalar::base64::base64_to_binary_details_impl( | ||
| str, InputLen, result.buffer.data(), base64_default, loose); | ||
| if (r.error != error_code::SUCCESS) { | ||
| throw "invalid base64 input in _base64 literal"; | ||
| } |
There was a problem hiding this comment.
The README claims “Invalid base64 input causes a compilation error”, but this implementation can also be evaluated at runtime (and would then throw a const char*, which is a poor exception type for a public API). Consider making the literal consteval to enforce compile-time evaluation, and replace the throw-based failure with a compile-time diagnostic approach (e.g., a dependent static_assert) so users get a clearer error message and there is no runtime exception behavior.
There was a problem hiding this comment.
hmmm... does not look like valid criticism?
There was a problem hiding this comment.
this is only possible to invoke if you are using the detail namespace, which users are not supposed to do. I clarified that in a commit I pushed.
I upgraded the function to consteval, so it is never evaluated at runtime.
| template <std::size_t InputLen> struct base64_decode_result { | ||
| static constexpr std::size_t max_out = (InputLen + 3) / 4 * 3; | ||
| std::array<char, max_out> buffer{}; | ||
| std::size_t output_count{}; | ||
| }; |
There was a problem hiding this comment.
Base64 decoding produces arbitrary bytes, but the literal returns std::array<char, N> (and uses char buffers internally). This can be surprising for non-text payloads and encourages treating decoded data as text. Consider returning std::array<std::uint8_t, N> (or std::byte) to better match “binary data” semantics and avoid signedness/encoding confusion; char can still be produced by users when they explicitly want text.
|
|
||
| template <base64_literal_helper a> constexpr auto base64_make_array() { | ||
| constexpr auto decoded = base64_decode_literal<a.size()>(a.storage); | ||
| std::array<char, decoded.output_count> ret{}; |
There was a problem hiding this comment.
Base64 decoding produces arbitrary bytes, but the literal returns std::array<char, N> (and uses char buffers internally). This can be surprising for non-text payloads and encourages treating decoded data as text. Consider returning std::array<std::uint8_t, N> (or std::byte) to better match “binary data” semantics and avoid signedness/encoding confusion; char can still be produced by users when they explicitly want text.
pauldreik
left a comment
There was a problem hiding this comment.
I took the liberty to push a commit upgrading to consteval (C++20), clarifying the detail namespace is not part of the public api and that not only spaces are ignored.
the last bit should be a hint for those wanting to embed something that is so long it should be broken into lines for readability. clang format etc does not know that the contents of the string can be line broken for this use case.
| template <std::size_t InputLen> | ||
| constexpr auto base64_decode_literal(const char *str) { | ||
| base64_decode_result<InputLen> result{}; | ||
| auto r = scalar::base64::base64_to_binary_details_impl( | ||
| str, InputLen, result.buffer.data(), base64_default, loose); | ||
| if (r.error != error_code::SUCCESS) { | ||
| throw "invalid base64 input in _base64 literal"; | ||
| } |
There was a problem hiding this comment.
this is only possible to invoke if you are using the detail namespace, which users are not supposed to do. I clarified that in a commit I pushed.
I upgraded the function to consteval, so it is never evaluated at runtime.
| template <std::size_t InputLen> struct base64_decode_result { | ||
| static constexpr std::size_t max_out = (InputLen + 3) / 4 * 3; | ||
| std::array<char, max_out> buffer{}; | ||
| std::size_t output_count{}; | ||
| }; |
|
|
||
| template <base64_literal_helper a> constexpr auto base64_make_array() { | ||
| constexpr auto decoded = base64_decode_literal<a.size()>(a.storage); | ||
| std::array<char, decoded.output_count> ret{}; |
It seems generally useful to be able to do
"abcd"_base64and get anstd::array.I am basically copying @pauldreik's code from the test up to
implementation.hand allowing users to call it.