@kmozaid kmozaid commented Feb 17, 2022

This PR adds the following features:

Added a new partition function, BoundedColumnValuePartitionFunction, which partitions segments on column values while keeping partitionId an integer. The column values to partition on are configured in a new functionConfig property.

Example config:

"tableIndexConfig": {
  "segmentPartitionConfig": {
    "columnPartitionMap": {
      "subject": {
        "functionName": "BoundedColumnValue",
        "functionConfig": {
          "columnValues": "Maths|English|Chemistry"
        }
      }
    }
  }
}

PartitionId is generated from the value's position in columnValues: PartitionId would be 1 for Maths, 2 for English, and so on.
PartitionId 0 is reserved for any other subject that is not present in the given partitionConfig.
The new functionConfig is also saved with the column metadata in metadata.properties and in the ZooKeeper segment metadata.
The broker can also prune segments based on this partition function.
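
The mapping described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Java code in this PR: the helper names (`parse_column_values`, `partition_id`, `can_prune_segment`) are hypothetical, the `|` delimiter is taken from the example config, and exact, case-sensitive matching is an assumption.

```python
from typing import List, Set

# Assumption: values are separated by "|", as in "Maths|English|Chemistry".
DELIMITER = "|"


def parse_column_values(column_values: str) -> List[str]:
    """Split the functionConfig string into an ordered list of values."""
    return [v for v in column_values.split(DELIMITER) if v]


def partition_id(value: str, configured: List[str]) -> int:
    """Return the 1-based position of value in configured, or 0 if absent.

    Assumption: matching is exact and case-sensitive.
    """
    try:
        return configured.index(value) + 1
    except ValueError:
        # Partition 0 is the catch-all for values not listed in the config.
        return 0


def can_prune_segment(filter_value: str,
                      segment_partitions: Set[int],
                      configured: List[str]) -> bool:
    """Hypothetical broker-side check for an equality filter.

    A segment can be skipped when the partition of the filter value is
    not among the partitions recorded in the segment's metadata.
    """
    return partition_id(filter_value, configured) not in segment_partitions


values = parse_column_values("Maths|English|Chemistry")
print(partition_id("English", values))
print(partition_id("Physics", values))
```

With the example config, a segment whose metadata records only partition 2 (English) could be skipped for a query filtering on `subject = 'Maths'`, while a filter on an unlisted subject can only be pruned against segments that do not contain partition 0.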

  1. Segment Partitioning on multiple columns for offline table.

@kmozaid kmozaid closed this Feb 28, 2022
kmozaid pushed a commit that referenced this pull request Aug 11, 2025
* Addition of initial spi change checker code

* Fixes to yaml file, mainly excluding artifacts.zip code to test run further

* changing shell file permissions

* add "exit 1" to shell script and mess with TableConfig method signature

* changing shell file permissions again for some reason

* changing git diff checker file path

* second git diff checker file path change

* removing unnecessary code from files, fixed issue with running GitDiffChecker, added functionality for displaying line number, and reverted temporary change to TableConfig.java

* permission changes

* changing permissions

* fixing compilation error

* fixing parameterization of commits

* trial and error yml file #1

* commit for testing that config file correctly has parameters, changing sh file so that "No incorrect..." message only displayed once, and adding blank line check to GitDiffChecker

* GitDiffChecker: add case to skip ---. otherwise, change parameters to work with pull requests.

* testing

* testing #2

* Revert "testing #2"

This reverts commit 270937f.

* yml: change main to master. shell: change main to master and change conditional to reflect return type change. java: switch from returning line number to string of code, as previous logic did not work if multiple "chunks" of code were changed, and added annotation logic that excluded json-related annotations.

* yml: comment out apache/pinot condition for now. shell: change error message and put text file in my module. java: slightly altered regex after rethinking it, removed System.out.println, i think it's not necessary

* java: slightly altered regex to account for interface definitions using semicolons and not curly braces

* changing "no incorrect spi changes" value to "0", so if the method returns nothing, there isn't an accidental test passing

* minor change to annotation regex, \n changed to $ (end of string)

* fix to pom.xml

* Added logic for outputting line number along with original file code snippet

* per testing on another branch, slightly updating line number logic

* Removed my outdated custom Java implementation of pinot-spi change checking. Switched to the japicmp plugin, with some modifications to the compatibility of checks that japicmp performs. With this, contributors will be able to see if they made incompatible SPI changes when running mvn clean, rather than waiting until they make a PR. Adding a .jar of pinot-spi that japicmp will use for comparisons.

* Per Tianle's comment, added a comment in the pom file explaining our justification for using a baseline jar for comparison, and that we need to eventually transition away from it.

* Removed code that made annotation changes/deletions incompatible. Fixed pom file so that all pinot-spi files are checked

* updated baseline .jar to match updates from apache:master