-
Notifications
You must be signed in to change notification settings - Fork 1.4k
bugfix: use consumerDir during lucene realtime segment conversion #13094
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| LOGGER.warn("TEXT_MATCH query timeout on realtime consuming segment {}, column {}, search query {}", _segmentName, | ||
| _column, searchQuery); | ||
| throw new RuntimeException("TEXT_MATCH query timeout on realtime consuming segment"); | ||
| LOGGER.warn("TEXT_MATCH query interrupted while querying the consuming segment {}, column {}, search query {}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed this message to timeout due to some concerns in the original PR, but it's also used during early termination - and calling those queries timeouts is misleading.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #13094 +/- ##
============================================
+ Coverage 61.75% 62.08% +0.32%
+ Complexity 207 198 -9
============================================
Files 2436 2514 +78
Lines 133233 137869 +4636
Branches 20636 21324 +688
============================================
+ Hits 82274 85590 +3316
- Misses 44911 45883 +972
- Partials 6048 6396 +348
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
| */ | ||
| public LuceneTextIndexCreator(String column, File segmentIndexDir, boolean commit, boolean realtimeConversion, | ||
| @Nullable int[] immutableToMutableIdMap, TextIndexConfig config) { | ||
| File consumerDir, @Nullable int[] immutableToMutableIdMap, TextIndexConfig config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: add @Nullable before File consumerDir.
|
Since this PR is a bug fix, can you explain in a high level what is the main issue causing the problem and how you fix it? |
| private boolean _optimizeDictionaryForMetrics = false; | ||
| private double _noDictionarySizeRatioThreshold = IndexingConfig.DEFAULT_NO_DICTIONARY_SIZE_RATIO_THRESHOLD; | ||
| private boolean _realtimeConversion = false; | ||
| private File _consumerDir; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a javadoc to explain what this directory is about.
jackjlli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for making the hotfix!
| _partitionFunction = config.getPartitionFunction(); | ||
| _mainPartitionId = config.getPartitionId(); | ||
| _nullHandlingEnabled = config.isNullHandlingEnabled(); | ||
| _consumerDir = config.getConsumerDir() != null ? new File(config.getConsumerDir()) : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this ever be null? Currently the abstraction is not very clear, and I don't really see proper null handling. Can we ensure it is always set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, it can be null in the MemoryEstimator path. I think it's probably clearer to set a dummy dir there, and remove the @Nullable annotation for consumerDir
For the realtime flow RealtimeTableDataManager.getConsumerDir() ensures it will not be null.
2cdeb5d to
07d8500
Compare
07d8500 to
79e3352
Compare
| _indexCreator = | ||
| new LuceneTextIndexCreator(column, new File(segmentIndexDir.getAbsolutePath() + "/" + segmentName), | ||
| false /* commitOnClose */, true, null, config); | ||
| false /* commitOnClose */, true, null, null, config); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like you are passing in null as consumerDir here. Do we have a test covering this path? I'd imagine it should throw NPE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do - it's covered in LuceneMutableTextIndexTest. consumerDir isn't used for the realtime index, and is only used by the LuceneTextIndexCreator initialized during conversion of the realtime segment. Added a comment clarifying this
Addresses #13039. Previously the consumers dir was derived from indexDir, which was not accurate in the case that
pinot.server.instance.consumerDirwas set. This passes consumerDir through to ensure the path is accurate in all cases.Unrelated: also close
Analyzerobjects, and clarify some exception/log.Testing: Unit tests cover, also sanity tested via quickstart w/
pinot.server.instance.consumerDirset.tag:
bugfix