[Variant] Offer simdutf8 as an optional dependency when validating metadata #7902

@friendlymatthew

This is a follow-up to #7878.

The Variant spec states that the string values in the metadata dictionary must be UTF-8 encoded.

We do this check here:

    // Verify the string values in the dictionary are UTF-8 encoded strings.
    let value_buffer =
        string_from_slice(self.bytes, 0, self.first_value_byte as _..self.bytes.len())?;

Since we offer simdutf8 as an optional dependency in other crates, we could do the same when performing the validation above. See @Dandandan's comment.

The rough idea being:

If simdutf8 is enabled, do:

    let value_str = simdutf8::basic::from_utf8(value_buffer)?;

Otherwise, fall back to the existing implementation.
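
To make the rough idea concrete, here is a minimal sketch of how the two paths could be selected at compile time. It assumes a Cargo feature named simdutf8 that pulls in the optional crate; the feature name, the helper validate_utf8, and the error type Utf8ValidationError are all hypothetical and are not taken from the existing code.

```rust
/// Hypothetical error type for this sketch; the real crate has its own error type.
#[derive(Debug)]
struct Utf8ValidationError(String);

/// SIMD-accelerated path, compiled only when the (assumed) `simdutf8`
/// Cargo feature is enabled and the optional dependency is present.
#[cfg(feature = "simdutf8")]
fn validate_utf8(bytes: &[u8]) -> Result<&str, Utf8ValidationError> {
    simdutf8::basic::from_utf8(bytes)
        .map_err(|_| Utf8ValidationError("invalid UTF-8 in metadata dictionary".to_string()))
}

/// Fallback path: the standard library validator, used when the feature is off.
#[cfg(not(feature = "simdutf8"))]
fn validate_utf8(bytes: &[u8]) -> Result<&str, Utf8ValidationError> {
    std::str::from_utf8(bytes)
        .map_err(|_| Utf8ValidationError("invalid UTF-8 in metadata dictionary".to_string()))
}
```

Both functions share one signature, so a call site such as validate_utf8(&self.bytes[start..]) would not need to know which validator was compiled in. Note that simdutf8::basic::from_utf8 reports only that the input is invalid, not where; if the error position matters, the crate's compat module may be a better fit.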

Labels: arrow (changes to the arrow crate), enhancement (any new improvement worthy of an entry in the changelog), parquet (changes to the parquet crate)
