faster metric scans #7920
Conversation
Codecov Report
```
@@             Coverage Diff              @@
##             master    #7920      +/-   ##
============================================
- Coverage     71.37%   71.25%    -0.12%
- Complexity     4193     4210       +17
============================================
  Files          1595     1594        -1
  Lines         82514    82613       +99
  Branches      12304    12316       +12
============================================
- Hits          58895    58870       -25
- Misses        19643    19768      +125
+ Partials       3976     3975        -1
```
Force-pushed from a999be7 to e71d582.
Force-pushed from d20f123 to 9368ab2.
I added a benchmark to show where the difference comes from, but the benchmark also hints at the minimum cost of query vectorization: an array copy. This cost could be eliminated for reductions.
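As a rough illustration of that floor cost, here is a simplified, self-contained timing sketch (mine, not the benchmark added in the PR; a real measurement would use JMH with proper warmup) comparing a per-docId copy against a bulk copy of a contiguous range:

```java
import java.util.concurrent.ThreadLocalRandom;

// Simplified sketch: the per-doc loop pays a bounds check per element,
// while the bulk path does one range check followed by a vectorized copy.
// A reduction could fold values directly and skip the buffer entirely.
public class CopyCostSketch {
  private static final int NUM_DOCS = 7_500_000;

  public static void main(String[] args) {
    long[] column = new long[NUM_DOCS];
    for (int i = 0; i < NUM_DOCS; i++) {
      column[i] = ThreadLocalRandom.current().nextLong(100);
    }
    int[] docIds = new int[NUM_DOCS];
    for (int i = 0; i < NUM_DOCS; i++) {
      docIds[i] = i; // contiguous range of docIds
    }
    long[] buffer = new long[NUM_DOCS];

    long start = System.nanoTime();
    for (int i = 0; i < NUM_DOCS; i++) {
      buffer[i] = column[docIds[i]]; // bounds-checked per element
    }
    System.out.printf("per-doc copy: %d us%n", (System.nanoTime() - start) / 1000);

    start = System.nanoTime();
    System.arraycopy(column, docIds[0], buffer, 0, NUM_DOCS); // single bulk copy
    System.out.printf("bulk copy:    %d us%n", (System.nanoTime() - start) / 1000);
  }
}
```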
nit: it'd be helpful to comment a bit on when the range can be contiguous and when not.
Why? That doesn't seem like a responsibility of this class, does it?
I think I lack some context to understand this new check. Not necessarily asking for a comment in the code; I'd appreciate it if you could shed some light in the conversation here. Just looking at this check method, I assume it requires the docIds to be unique and sorted in ascending order. So when does that happen and when does it not? This may help me understand when the optimization can kick in.
OK, got it. However, I think that kind of info would be better off in a readme about how the query and storage layers interact. I can write that (because understanding this area of the code has been hard for me), but not in this PR. I don't think the comment belongs here, and it would be superfluous after reading a decent readme.
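For readers following this thread without the diff at hand, a contiguity check of the kind discussed here could look like the sketch below. The class and method names are hypothetical, not necessarily those in the PR, and it leans on exactly the assumption raised above: docIds arrive sorted ascending and de-duplicated.

```java
// Hypothetical helper illustrating the check under discussion; it relies
// on docIds being unique and sorted in ascending order.
public final class ContiguityCheck {
  private ContiguityCheck() {
  }

  /** True when docIds[0..length) forms the run docIds[0], docIds[0] + 1, ... */
  static boolean isContiguousRange(int[] docIds, int length) {
    // For sorted, duplicate-free ids, the run is contiguous exactly when
    // the endpoints are length - 1 apart.
    return length > 0 && docIds[length - 1] - docIds[0] == length - 1;
  }
}
```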
@siddharthteotia to review before merging.
Yes, I will make sure to go through this by EOD. Thank you for waiting.
Force-pushed from a988aa7 to b3a7772.
This feels like a fairly low-risk change for a decent improvement to me. Are there any blockers here?
Force-pushed from b3a7772 to 3a96ce8.
This change is motivated by a profile where a user performed a summation over a raw column in a group-by query, and a significant amount of time was spent in bounds checks.

This change adds `fillValues` methods to the `ForwardIndexReader` interface, which can avoid bounds checks and perform vectorized copies when the range of docIds is contiguous. There is otherwise no way to avoid bounds checks with the current APIs, as there is no way for the compiler to infer that the docIds array is monotonic.

This speeds up `select platform, sum(clicks) from complexWebsite group by platform` on a 7.5M row segment by ~25%: 60ms down to 45ms. Summation is not the best case for the change because it requires conversion from `long` to `double`; accumulating the sum as a `long` would amplify the effect.
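A minimal sketch of the shape of this optimization, under the simplifying assumption that the chunk is a plain in-memory array indexed directly by docId (the real `fillValues` methods on `ForwardIndexReader` and the chunk readers deal with buffers and chunk offsets, and their signatures differ):

```java
// Sketch only: names, signatures, and direct chunk indexing are
// simplifications, not the actual Pinot APIs.
public final class FillValuesSketch {
  private FillValuesSketch() {
  }

  static void fillLongValues(long[] chunk, int[] docIds, int length, long[] out) {
    if (length > 0 && docIds[length - 1] - docIds[0] == length - 1) {
      // Contiguous run: one range check, then a single vectorized copy.
      System.arraycopy(chunk, docIds[0], out, 0, length);
    } else {
      // Fallback: one bounds-checked random access per docId.
      for (int i = 0; i < length; i++) {
        out[i] = chunk[docIds[i]];
      }
    }
  }
}
```

The contiguity test is the same endpoint check discussed in the review thread above, and it is only valid because the docIds handed to the reader are sorted and duplicate-free.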