Building a large hash table has a non-trivial cost. See mailing list discussion

https://lists.apache.org/thread.html/rb85519cc21ffb09a836a9107919e07b076165ff81c22fb88b59a8296%40%3Cuser.arrow.apache.org%3E

**Reporter**: [Wes McKinney](https://issues.apache.org/jira/browse/ARROW-10097) / @wesm
**Assignee**: [Ben Kietzman](https://issues.apache.org/jira/browse/ARROW-10097) / @bkietz

#### Related issues:

- [[C++][Dataset] Minimize Expression to a wrapper around compute::Function](https://github.com/apache/arrow/issues/26312) (is fixed by)
- [[C++] Caching pre computed data based on FunctionOptions in the kernel state](https://github.com/apache/arrow/issues/26521) (is related to)

<sub>**Note**: *This issue was originally created as [ARROW-10097](https://issues.apache.org/jira/browse/ARROW-10097). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>