Skip to content

Comments

BestFitting: Reduce allocations#7037

Closed
MichaReiser wants to merge 2 commits intomainfrom
best-fitting-reduce-allocations
Closed

BestFitting: Reduce allocations#7037
MichaReiser wants to merge 2 commits intomainfrom
best-fitting-reduce-allocations

Conversation

@MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Sep 1, 2023

Summary

This PR aims to reduce the allocations necessary for best_fitting.

Today:

  • One allocation for interning the object that should be formatted with different variants (outside of best fitting, avoids formatting the same content multiple times)
  • One allocation for the Vec that stores all variants
  • One allocation per variant that stores the format elements for that variant

This PR removes the allocations per variant and instead writes all variants into a single buffer stored on BestFitting.
The downside of this is that resolving the most-flat or most-expanded variants now requires a search for the StartBestFittingEntry or EndBestFittingEntry . I don't expect this to be significant because best-fitting entries tend to be small (<16 entries).

Performance

This gives us a 2% improvement for most files, except the large/dataset.py. I need to dig deeper to understand why only large data set is regressing. This is especially surprising because I would have expected the performance to improve the most for large files that make heavy use of best fitting.

Test Plan

cargo test

@MichaReiser
Copy link
Member Author

MichaReiser commented Sep 1, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@codspeed-hq
Copy link

codspeed-hq bot commented Sep 1, 2023

CodSpeed Performance Report

Merging #7037 will degrade performances by 2.05%

Comparing best-fitting-reduce-allocations (37c62ff) with memoize-text-width (6a31ef4)

Summary

🔥 3 improvements
❌ 1 regressions
✅ 16 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Benchmark memoize-text-width best-fitting-reduce-allocations Change
formatter[large/dataset.py] 58.5 ms 59.7 ms -2.05%
🔥 formatter[pydantic/types.py] 20.2 ms 19.7 ms +2.46%
🔥 formatter[numpy/ctypeslib.py] 11 ms 10.7 ms +2.45%
🔥 formatter[unicode/pypinyin.py] 3.7 ms 3.6 ms +2.69%

@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from a6a6c1a to 243a81b Compare September 1, 2023 08:11
let variants = self.variants.items();

let mut formatted_variants = Vec::with_capacity(variants.len());
let mut buffer = VecBuffer::with_capacity(variants.len() * 8, f.state_mut());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to remove the need for this buffer too but this complicates things a bit because finding the boundaries of a variant suddenly need to account for nested best fitting elements:

StartBestFittingEntry (Outer)
....
StartBestFittingEntry (Inner)
...
EndBestFittingEntry
...
EndBestFittingEntry

The nesting is probably rare because best fitting is almost always used together with Interned, meaning that the nested best fitting most likely ends up being inside of the interned vec, but it remains possible.

@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from 243a81b to b51d602 Compare September 1, 2023 08:27
@MichaReiser MichaReiser added the formatter Related to the formatter label Sep 1, 2023
@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch 2 times, most recently from ddc19b0 to 5f148a7 Compare September 1, 2023 08:54
@MichaReiser MichaReiser changed the base branch from main to token-element September 1, 2023 17:17
@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from 5f148a7 to 0e5caac Compare September 1, 2023 17:17
@MichaReiser MichaReiser added the performance Potential performance improvement label Sep 1, 2023
@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from 0e5caac to 092c9a2 Compare September 1, 2023 17:42
@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from 092c9a2 to 770e2b4 Compare September 1, 2023 18:21
@MichaReiser MichaReiser mentioned this pull request Sep 1, 2023
Base automatically changed from token-element to main September 2, 2023 08:05
@MichaReiser MichaReiser changed the base branch from main to memoize-text-width September 2, 2023 08:33
@MichaReiser MichaReiser force-pushed the best-fitting-reduce-allocations branch from 770e2b4 to 37c62ff Compare September 2, 2023 08:33
Base automatically changed from memoize-text-width to main September 6, 2023 07:10
@MichaReiser
Copy link
Member Author

Isn't as promising as I thought, or even regressing and it introduces additional complexity.

@MichaReiser MichaReiser closed this Sep 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

formatter Related to the formatter performance Potential performance improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant