Add hash() and get_index() to phf_shared. #62
Merged
sfackler merged 1 commit into rust-phf:master on Aug 4, 2015
In https://github.com/servo/string-cache, we currently use a `phf::OrderedSet`: its `get_index` method gives us the identifier stored in an `Atom`, and `index` gets the string back from that identifier.

However, the extra indirection of `OrderedSet` compared to `Set` is not necessary. We don’t care about the order, only about getting numeric identifiers.

Additionally, when `get_index` returns `None`, we hash the input string again to find it in the table of dynamic atoms. With this change, we can reuse the phf hash instead: servo/string-cache#103

At first I tried adding hash and index access to `phf::Map`, but the API got messy quickly. Do you think that is worth pursuing more, rather than having string-cache use phf internals?
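To make the hash-reuse idea concrete, here is a rough, std-only sketch. It does not use the real `phf_shared` API: `hash_str`, `Interner`, `static_index`, and `Atom` are hypothetical names, `DefaultHasher` stands in for the phf hash, and a small open-addressed table stands in for the generated perfect-hash table. The only point is that the string is hashed once and the same value serves both the static lookup and the dynamic table.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical stand-in for the phf hash (the real one lives in phf_shared).
fn hash_str(s: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    s.hash(&mut hasher);
    hasher.finish()
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Atom {
    Static(u32),  // index into the static set
    Dynamic(u32), // index into the dynamic table
}

struct Interner {
    static_atoms: &'static [&'static str],
    // Open-addressed slots standing in for the generated perfect-hash table.
    static_slots: Vec<Option<u32>>,
    // Dynamic entries keep their hash so lookups can compare hashes first.
    dynamic: Vec<(u64, String)>,
}

impl Interner {
    fn new(static_atoms: &'static [&'static str]) -> Self {
        // Load factor stays at or below 1/2, so probing always terminates.
        let cap = (static_atoms.len() * 2).next_power_of_two();
        let mut static_slots = vec![None; cap];
        for (i, s) in static_atoms.iter().enumerate() {
            let mut slot = hash_str(s) as usize % cap;
            while static_slots[slot].is_some() {
                slot = (slot + 1) % cap;
            }
            static_slots[slot] = Some(i as u32);
        }
        Interner { static_atoms, static_slots, dynamic: Vec::new() }
    }

    // Loose analogue of a get_index that accepts a precomputed hash.
    fn static_index(&self, hash: u64, s: &str) -> Option<u32> {
        let cap = self.static_slots.len();
        let mut slot = hash as usize % cap;
        while let Some(i) = self.static_slots[slot] {
            if self.static_atoms[i as usize] == s {
                return Some(i);
            }
            slot = (slot + 1) % cap;
        }
        None
    }

    fn intern(&mut self, s: &str) -> Atom {
        let hash = hash_str(s); // the string is hashed exactly once
        if let Some(i) = self.static_index(hash, s) {
            return Atom::Static(i);
        }
        // Static miss: reuse the same hash for the dynamic table.
        let found = self
            .dynamic
            .iter()
            .position(|(h, d)| *h == hash && d.as_str() == s);
        if let Some(pos) = found {
            return Atom::Dynamic(pos as u32);
        }
        self.dynamic.push((hash, s.to_string()));
        Atom::Dynamic((self.dynamic.len() - 1) as u32)
    }
}
```

With `Interner::new(&["a", "div", "span"])`, interning `"div"` yields `Atom::Static(1)`, and interning an unknown string twice yields the same `Atom::Dynamic` value both times without a second hash computation per call.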
Collaborator
I think this makes sense as-is. I see
sfackler added a commit that referenced this pull request on Aug 4, 2015: Add hash() and get_index() to phf_shared.
Contributor, Author
So you’re in favor of string-cache using these details from Could you publish this to crates.io? Thanks!
Collaborator
Yep, I think that strategy makes sense. Will publish now.
bors-servo pushed a commit to servo/string-cache that referenced this pull request on Sep 1, 2015:
Reuse phf hash and remove phf::OrderedSet indirection

<s>Do not merge yet.</s> This depends on rust-phf/rust-phf#62

Use the `phf_shared` and `phf_generator` crates directly instead of `phf`. This allows us to re-use the phf hash in the dynamic table and avoid hashing the same string again.

Also remove the indirection of `phf::OrderedSet` compared to `phf::Set`: we don’t care about the order, only about getting numeric indices. (The optimization mentioned in a comment of using a bit map of the first 64 atoms in the html5ever tree builder was never implemented. If we want it, the indirection and order preservation can be added back while preserving the hash reuse.)

Fixes #38.
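For context on why order preservation would matter for the unimplemented bit map idea: such a bit map only works if the atoms of interest are guaranteed to have indices below 64. A minimal sketch of what that could look like follows; `AtomBitSet` is a hypothetical type, not something from string-cache or html5ever.

```rust
/// Hypothetical set of static atoms, limited to indices 0..64.
#[derive(Clone, Copy, Default, PartialEq, Debug)]
struct AtomBitSet(u64);

impl AtomBitSet {
    /// Sets the bit for `index`; returns false if the index does not fit.
    fn insert(&mut self, index: u32) -> bool {
        if index >= 64 {
            return false;
        }
        self.0 |= 1u64 << index;
        true
    }

    /// Membership test is a single shift and mask.
    fn contains(&self, index: u32) -> bool {
        index < 64 && (self.0 >> index) & 1 == 1
    }
}
```

If atom indices come straight from the hash table rather than from declaration order, nothing guarantees that the relevant atoms land in the first 64 slots, which is why this would need the indirection added back.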