Is your feature request related to a problem or challenge?
This epic organizes efforts to improve DataFusion's ability to process datasets that are larger than the configured memory budget.
Some of DataFusion's "pipeline blocking" operations (SortExec and HashGroupBy) already work with datasets that do not fit in memory, but their performance and usability could be improved.
Note: Joins are another operation that can run out of memory. Today they return an error rather than falling back to another strategy such as Sort-Merge-Join. If people are interested in improving this, I think we could organize a separate project.