-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[enhancement] [release-note] Update get bytes to return raw bytes of string and support getBytesMV #9441
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9441 +/- ##
============================================
+ Coverage 63.53% 69.84% +6.30%
- Complexity 5072 5136 +64
============================================
Files 1836 1910 +74
Lines 98545 101832 +3287
Branches 15062 15451 +389
============================================
+ Hits 62614 71122 +8508
+ Misses 31313 25699 -5614
- Partials 4618 5011 +393
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly minor comments
pinot-core/src/main/java/org/apache/pinot/core/common/DataBlockCache.java
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
| } | ||
| break; | ||
| case BYTES: | ||
| case STRING: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove the switch and directly read the values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we don't want to support this for other types?
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
...a/org/apache/pinot/segment/local/realtime/impl/dictionary/StringOnHeapMutableDictionary.java
Outdated
Show resolved
Hide resolved
| String unpaddedString = getUnpaddedString(i, buffer); | ||
| _unpaddedStrings[i] = unpaddedString; | ||
| _unPaddedStringToIdMap.put(unpaddedString, i); | ||
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only read the unpadded bytes, and directly create unpaddedString from the unpadded bytes for better performance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance will not be good because we have do encode every time if we want to read unpadded string? Or you are talking about space efficiency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking about the bytes/string reading during the construction
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); | |
| byte[] unpaddedBytes = getUnpaddedBytes(i, buffer); | |
| String unpaddedString = new String(unpaddedBytes, UTF_8); | |
| ... |
...cal/src/main/java/org/apache/pinot/segment/local/segment/index/readers/StringDictionary.java
Outdated
Show resolved
Hide resolved
...egment-spi/src/main/java/org/apache/pinot/segment/spi/index/mutable/MutableForwardIndex.java
Outdated
Show resolved
Hide resolved
pinot-spi/src/main/java/org/apache/pinot/spi/utils/BytesUtils.java
Outdated
Show resolved
Hide resolved
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise
pinot-core/src/main/java/org/apache/pinot/core/common/DataBlockCache.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
pinot-core/src/main/java/org/apache/pinot/core/common/DataFetcher.java
Outdated
Show resolved
Hide resolved
pinot-core/src/test/java/org/apache/pinot/core/common/DataFetcherTest.java
Outdated
Show resolved
Hide resolved
pinot-core/src/test/java/org/apache/pinot/core/common/DataFetcherTest.java
Show resolved
Hide resolved
| String unpaddedString = getUnpaddedString(i, buffer); | ||
| _unpaddedStrings[i] = unpaddedString; | ||
| _unPaddedStringToIdMap.put(unpaddedString, i); | ||
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm talking about the bytes/string reading during the construction
| _unpaddedBytes[i] = getUnpaddedBytes(i, buffer); | |
| byte[] unpaddedBytes = getUnpaddedBytes(i, buffer); | |
| String unpaddedString = new String(unpaddedBytes, UTF_8); | |
| ... |
.../apache/pinot/segment/local/segment/index/readers/ImmutableDictionaryTypeConversionTest.java
Outdated
Show resolved
Hide resolved
.../apache/pinot/segment/local/segment/index/readers/ImmutableDictionaryTypeConversionTest.java
Outdated
Show resolved
Hide resolved
...-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/ForwardIndexReader.java
Outdated
Show resolved
Hide resolved
...e/src/main/java/org/apache/pinot/core/operator/transform/function/BaseTransformFunction.java
Outdated
Show resolved
Hide resolved
…string and support getBytesMV (apache#9441) * get raw bytes for string value column * get raw bytes for string value column * support getBytesMV * fix fail test * fix style * Address sytle commets * address comments
This change is backward incompatible: