-
Notifications
You must be signed in to change notification settings - Fork 34
♻️ Rename SequenceSet internals
#562
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* `str_to_tuples` => `parse_runs` * `str_to_tuple` => `parse_minmax` (alias: `parse_run`) * `parse_string_entry` => `parse_entry`
* `input_to_tuples` => `import_runs` * `input_to_tuple` => `import_minmax` (alias: `import_run`) * `range_to_tuple` => `import_range_minmax` * `to_tuple_int` => `import_num`
* `export_string_entries` => `export_minmaxes`/`export_runs` * `tuple_to_str` => `export_minmax`/`export_run` * `tuple_to_entry` => `export_minmax_entry`/`export_run_entry` * `each_number_in_tuple` => `each_number_in_minmax` * `from_tuple_int` => `export_num`
* `include_tuple?` => `include_run?` * `intersect_tuple?` => `intersect_run?`
* `tuple_gte_with_index` => `bsearch_minmax_with_index` * `tuple_gte_to` => `bsearch_range`
* `tuples_add` => `add_minmaxes` or `add_runs` * `tuples_subtract` => `subtract_minmaxes` or `subtract_runs` * `tuple_add` => `add_minmax` or `add_run` * `tuple_coalesce` => `add_coalesced_minmax` * `tuple_subtract` => `subtract_minmax` or `subtract_run` * `tuple_trim_or_split` => `trim_or_split` * `tuples_trim_or_delete` => `trim_or_delete` Also renamed _some_ local variables for consistency. Mostly `tuple` => `minmax`.
* `each_entry_tuple` => `each_entry_minmax` * `each_tuple_with_index` => `each_minmax_with_index` * `reverse_each_tuple_with_index` => `reverse_each_minmax_with_index` * `lookup_number_by_tuple_index` => `seek_number_in_minmaxes`
'#minmaxes` is currently an alias for `#set_data`, but an upcoming PR will change how the `@set_data` is structured. This prepares for that by documenting which code expects an array (or enum) of `(min, max)` tuples. Some of this code could remain unchanged if `#minmaxes` were changed to return an enumerator.
This is an alias for `#set_data`, and is used to document code that expects an array of "runs" but can be abstract about how those runs are implemented. An upcoming PR will replace `[min, max]` arrays with `u32x2` packed integers, which dramatically reduces memory allocations and speeds up _some_ operations (although it does slow down others).
The instance variable and attribute were renamed to `set_data`. But most uses were replaced with either `runs` or `minmaxes`, based on how they are used.
c1cd9f5 to
7d73944
Compare
nevans
added a commit
that referenced
this pull request
Dec 8, 2025
The results are highly dependant on data, YJIT, etc... but in my tests
this implementation seems to _consistently_ run faster.
```
Comparison:
builtin: L ^ R
local: 967.0 i/s
before #567 (559b86a): 917.5 i/s - 1.05x slower
before #562 (a228ee1): 901.2 i/s - 1.07x slower
v0.5.12: 840.0 i/s - 1.15x slower
v0.5.0: 830.7 i/s - 1.16x slower
v0.4.21: 797.0 i/s - 1.21x slower
v0.5.9: 790.4 i/s - 1.22x slower
builtin: L.xor! R
local: 960.9 i/s
before #562 (a228ee1): 954.3 i/s - 1.01x slower
before #567 (559b86a): 940.0 i/s - 1.02x slower
v0.5.12: 796.4 i/s - 1.21x slower
v0.5.0: 633.9 i/s - 1.52x slower
v0.4.21: 620.1 i/s - 1.55x slower
v0.5.9: 619.0 i/s - 1.55x slower
```
When I first wrote this code, the `master` branch was consistently
1.3x-1.5x slower. The speedup is much less now, so other implementation
details are probably more important. When the internal set data
implementation is changed (see #484), this and other operations should
be re-evaluated. I suspect an iterative "merge" may be the fastest
overall approach, but that code is more complex than simply relying on
the core add/subtract/complement methods with simple boolean algebra.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
sequence-set
Any code the IMAP `sequence-set` data type or grammar rule, especially the SequenceSet class.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR renames the poorly named
@tuplesivar and all of the methods dealing with that data structure. It should not contain any significant refactoring besides renaming (and changing direct ivar access to use an attr_reader).The instance variable and attribute were renamed to the much more abstract
set_data. But most uses were updated to use eitherrunsorminmaxes, based on how they are used:#minmaxesis currently an alias for#set_data, but an upcoming PR will change how@set_datais structured. This prepares for that by documenting which code expects an array (or enum) of(min, max)tuples. Some of this code could remain unchanged if#minmaxeswere changed to return an enumerator.#runsis also currently an alias for#set_data, and is used to document code that expects an array of "runs" but can be abstract about how those runs are implemented. An upcoming PR will replace[min, max]arrays withu32x2packed integers, which dramatically reduces memory allocations and speeds up some operations (although it does slow down others).Most of the internal private methods were renamed:
import_{type}- for converting external inputs into internal typesparse_{type}- for converting strings into internal typesexport_{type}- for converting internal types into exported typesbsearch_{type}- for binary searching set_datatuple(s)withminmax(es)orrun(s)This refactoring is a small step towards #484.