-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Add legacy +POS/-POS handling in sort to pass GNU sort-field-limit test #9501
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Support GNU’s obsolescent +POS1 [-POS2] syntax by translating it to -k before clap parses args, gated by _POSIX2_VERSION. Adds tests for accept and reject cases to ensure sort-field-limit GNU test passes.
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
src/uu/sort/src/sort.rs
Outdated
| let mut idx = 0; | ||
| let bytes = spec.as_bytes(); | ||
| while idx < bytes.len() && bytes[idx].is_ascii_digit() { | ||
| idx += 1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a functional approach is cleaner. Something like:
| let mut idx = 0; | |
| let bytes = spec.as_bytes(); | |
| while idx < bytes.len() && bytes[idx].is_ascii_digit() { | |
| idx += 1; | |
| } | |
| let idx = spec.chars().take_while(|c| c.is_ascii_digit()).count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. 3b52b5d
src/uu/sort/src/sort.rs
Outdated
| let mut char_idx = 0; | ||
| let stripped_bytes = stripped.as_bytes(); | ||
| while char_idx < stripped_bytes.len() && stripped_bytes[char_idx].is_ascii_digit() { | ||
| char_idx += 1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
| let mut char_idx = 0; | |
| let stripped_bytes = stripped.as_bytes(); | |
| while char_idx < stripped_bytes.len() && stripped_bytes[char_idx].is_ascii_digit() { | |
| char_idx += 1; | |
| } | |
| let char_idx = stripped.chars().take_while(|c| c.is_ascii_digit()).count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. 3b52b5d
src/uu/sort/src/sort.rs
Outdated
| #[derive(Debug, Clone)] | ||
| struct LegacyKeyPart { | ||
| field: usize, | ||
| char: usize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name char is a bit unlucky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. 3b52b5d
src/uu/sort/src/sort.rs
Outdated
| to.field.saturating_add(1) | ||
| }; | ||
|
|
||
| let mut end_part = end_field.to_string(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason for introducing end_part? Couldn't you reorder things a bit and push directly to keydef?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. bf7a8ab
src/uu/sort/src/sort.rs
Outdated
| } | ||
| } | ||
|
|
||
| fn parse_legacy_part(spec: &str, default_char: usize) -> Option<LegacyKeyPart> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the default_char param necessary? It's 0 in both places this function is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. 3b52b5d
src/uu/sort/src/sort.rs
Outdated
| if stripped | ||
| .as_bytes() | ||
| .first() | ||
| .is_some_and(|c| c.is_ascii_digit()) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can simplify it with starts_with:
| if stripped | |
| .as_bytes() | |
| .first() | |
| .is_some_and(|c| c.is_ascii_digit()) | |
| { | |
| if stripped.starts_with(|c: char| c.is_ascii_digit()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was fixed in this commit. e477de9
CodSpeed Performance ReportMerging #9501 will degrade performances by 4.51%Comparing Summary
Benchmarks breakdown
Footnotes
|
|
GNU testsuite comparison: |
|
GNU testsuite comparison: |
|
@cakebaker Thanks for the review! I've addressed all the comments. |
Kudos, thanks! |
This change translates GNU’s legacy +POS1 [-POS2] sort key syntax into -k form before clap parsing, gating it by _POSIX2_VERSION to match GNU behavior, and adds by-util tests to ensure acceptance/rejection behavior matches GNU and that the sort-field-limit compatibility test passes.
#9127