extends our base64 public API #955

Merged: lemire merged 8 commits into master from base64_extension on Apr 5, 2026

@lemire (Member) commented Mar 30, 2026

This makes two functions part of the public API: base64_valid and base64_to_binary_details. Previously, they were not presented as public.

@lemire lemire requested a review from pauldreik March 30, 2026 21:14
@pauldreik (Collaborator) left a comment

For this to be complete, I think we should have constexpr span overloads and also a test that at least calls the new functions.

Comment thread include/simdutf_c.h
simdutf_base64_options options,
simdutf_last_chunk_handling_options last_chunk_options);

/* single-character base64 validation */
Collaborator:

What is the use case for these two functions?

Member Author:

Ok so the main use case is when using stop_before_partial. In these cases, you deliberately stop before the partial block (duh). But without the details part, you don't know where you stopped!!!

The stop_before_partial part is useful when you are decoding a stream of base64. Let us say that I give you a large string and you cut it into segments of 4096, and one segment ends with 3 base64 characters. Now, you can't decode them yet because maybe there is a fourth character coming!

It is also useful in case of errors although that's not why I care about it.
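
To make the streaming argument concrete, here is a minimal, self-contained sketch of the bookkeeping involved. It does not call simdutf: `chunk_result`, `decode_complete_blocks`, and `decode_stream` are hypothetical names invented for illustration, and the scalar decoder assumes the standard alphabet with no whitespace and no `=` padding. It only shows why the caller needs to know how many input characters were consumed when decoding stops before a partial block.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for a "details" result: how many input characters
// were consumed and how many output bytes were produced.
struct chunk_result {
  std::size_t input_count;
  std::size_t output_count;
};

// Map one standard-alphabet base64 character to its 6-bit value, or -1.
inline int sextet(char c) {
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Decode only the complete 4-character blocks of `in` and stop before a
// trailing partial block. Assumption: no whitespace and no '=' padding.
inline chunk_result decode_complete_blocks(std::string_view in,
                                           std::vector<std::uint8_t> &out) {
  chunk_result r{0, 0};
  while (in.size() - r.input_count >= 4) {
    std::uint32_t triple = 0;
    for (std::size_t k = 0; k < 4; k++) {
      int v = sextet(in[r.input_count + k]);
      if (v < 0) return r; // stop before the block holding an invalid character
      triple = (triple << 6) | static_cast<std::uint32_t>(v);
    }
    out.push_back(static_cast<std::uint8_t>(triple >> 16));
    out.push_back(static_cast<std::uint8_t>(triple >> 8));
    out.push_back(static_cast<std::uint8_t>(triple));
    r.input_count += 4;
    r.output_count += 3;
  }
  return r;
}

// Feed segments one at a time. Characters that were not consumed are kept
// and prepended to the next segment, which is only possible because the
// result reports where decoding stopped.
inline std::string decode_stream(const std::vector<std::string> &segments) {
  std::vector<std::uint8_t> out;
  std::string pending;
  for (const std::string &segment : segments) {
    pending += segment;
    chunk_result r = decode_complete_blocks(pending, out);
    pending.erase(0, r.input_count);
  }
  return std::string(out.begin(), out.end());
}
```

For example, splitting `SGVsbG8gV29ybGQh` into the segments `SGVsbG8`, `gV29ybG`, and `Qh` leaves three characters pending after the first segment and two after the second. Without the consumed-input count, the caller could not tell which characters to carry forward.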

Comment thread README.md

You can also check whether a single character is a valid base64 character using `base64_valid`:
```cpp
bool is_valid = simdutf::base64_valid('A'); // true
```
Collaborator:

Am I missing something, or are these only available as C functions?

Member Author:

[Image: screenshot, 2026-04-01 at 20 28 17]

Member Author:

They are C++ functions!!!

@lemire lemire requested review from Copilot and pauldreik April 2, 2026 02:12
@lemire (Member, Author) commented Apr 2, 2026

@pauldreik Excellent comments. I think I have addressed them.

Copilot AI (Contributor) left a comment

Pull request overview

This PR expands the public Base64 API by exposing base64_valid and base64_to_binary_details, and wires these APIs through the C/C++ interfaces with accompanying documentation and tests.

Changes:

  • Add public base64_to_binary_details overloads (C++ free functions + span/constexpr usage) and document behavior.
  • Add C API equivalents for “details” decoding and single-character base64 validation.
  • Update CLI tool and add runtime + compile-time tests covering the newly public APIs.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tools/fastbase64.cpp | Switches CLI decoding to the new public base64_to_binary_details free function. |
| src/implementation.cpp | Adds public C++ free-function wrappers for base64_to_binary_details (char/char16_t). |
| include/simdutf/implementation.h | Documents and declares the new public base64_to_binary_details APIs (incl. span overloads). |
| src/simdutf_c.cpp | Implements new C API wrappers returning a simdutf_full_result. |
| include/simdutf_c.h | Adds simdutf_full_result and declares new C API entry points. |
| tests/base64_tests.cpp | Adds runtime tests for base64_to_binary_details and base64_valid. |
| tests/constexpr_base64_tests.cpp | Adds constexpr tests demonstrating compile-time base64_to_binary_details. |
| README.md | Documents base64_to_binary_details / base64_valid usage and semantics. |


Comment thread tools/fastbase64.cpp
Comment on lines +342 to +344
simdutf::full_result r = simdutf::base64_to_binary_details(
input_data.data(), total_input, output_buffer.data(), options,
simdutf::last_chunk_handling_options::loose);
Copilot AI, Apr 2, 2026:

This change switches the CLI from get_active_implementation()->base64_to_binary_details(...) to the public free function simdutf::base64_to_binary_details(...), which (per the implementation pattern in this repo) routes through the default implementation. If the CLI supports selecting/changing the active implementation at runtime, this will ignore that selection and change behavior/perf. Prefer calling simdutf::get_active_implementation()->base64_to_binary_details(...) here to preserve existing CLI semantics while still benefiting from the newly-public API elsewhere.

Suggested change
- simdutf::full_result r = simdutf::base64_to_binary_details(
-     input_data.data(), total_input, output_buffer.data(), options,
-     simdutf::last_chunk_handling_options::loose);
+ simdutf::full_result r =
+     simdutf::get_active_implementation()->base64_to_binary_details(
+         input_data.data(), total_input, output_buffer.data(), options,
+         simdutf::last_chunk_handling_options::loose);

Member Author:

I don't think so.

Comment thread tools/fastbase64.cpp
Comment on lines +366 to +368
simdutf::full_result r = simdutf::base64_to_binary_details(
input_data.data(), total_input, output_buffer.data(), options,
simdutf::last_chunk_handling_options::stop_before_partial);
Copilot AI, Apr 2, 2026:

This change switches the CLI from get_active_implementation()->base64_to_binary_details(...) to the public free function simdutf::base64_to_binary_details(...), which (per the implementation pattern in this repo) routes through the default implementation. If the CLI supports selecting/changing the active implementation at runtime, this will ignore that selection and change behavior/perf. Prefer calling simdutf::get_active_implementation()->base64_to_binary_details(...) here to preserve existing CLI semantics while still benefiting from the newly-public API elsewhere.

Member Author:

I don't think so.

Comment thread include/simdutf/implementation.h Outdated
@pauldreik (Collaborator):

I think this looks good to merge!
Minor things:

  • the span overloads for base64_to_binary_details are grouped together rather than each following the function it overloads
  • the UTF-16 overloads are not tested to the same extent

@lemire lemire merged commit 06258b2 into master Apr 5, 2026
104 checks passed

3 participants