Skip to content

Conversation

@saurabhd336
Copy link
Contributor

@saurabhd336 saurabhd336 commented Apr 22, 2022

This PR adds support for being able to use complex scalar function expressions for fitlterConfig. In addition to common comparison functions, support for AND, OR and NOT has been added.
GH issue: #8488

How to use

With this change, filterConfig's filterFunction can now be simple SQL like expressions. This simplifies filtering of records especially with Groovy expressions being disabled by default.

"ingestionConfig": {
      "filterConfig": {
        "filterFunction": "svInt = 123 AND svDouble <= 200"
      }

@saurabhd336
Copy link
Contributor Author

GH issue: #8488

@codecov-commenter
Copy link

codecov-commenter commented Apr 25, 2022

Codecov Report

Merging #8582 (58739e4) into master (7edad89) will decrease coverage by 1.67%.
The diff coverage is 39.62%.

@@             Coverage Diff              @@
##             master    #8582      +/-   ##
============================================
- Coverage     70.59%   68.92%   -1.68%     
- Complexity     4412     4413       +1     
============================================
  Files          1667     1669       +2     
  Lines         87314    87365      +51     
  Branches      13085    13098      +13     
============================================
- Hits          61643    60218    -1425     
- Misses        21411    22962    +1551     
+ Partials       4260     4185      -75     
Flag Coverage Δ
integration1 ?
integration2 25.83% <0.00%> (+0.02%) ⬆️
unittests1 66.40% <39.62%> (-0.02%) ⬇️
unittests2 14.35% <0.00%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ot/common/function/scalar/ComparisonFunctions.java 14.28% <14.28%> (ø)
...gment/local/function/InbuiltFunctionEvaluator.java 65.42% <43.18%> (-13.46%) ⬇️
.../pinot/common/function/scalar/ObjectFunctions.java 50.00% <50.00%> (ø)
...pinot/minion/exception/TaskCancelledException.java 0.00% <0.00%> (-100.00%) ⬇️
...nverttorawindex/ConvertToRawIndexTaskExecutor.java 0.00% <0.00%> (-100.00%) ⬇️
...e/pinot/common/minion/MergeRollupTaskMetadata.java 0.00% <0.00%> (-94.74%) ⬇️
...plugin/segmentuploader/SegmentUploaderDefault.java 0.00% <0.00%> (-87.10%) ⬇️
.../transform/function/MapValueTransformFunction.java 0.00% <0.00%> (-85.30%) ⬇️
...ot/common/messages/RoutingTableRebuildMessage.java 0.00% <0.00%> (-81.82%) ⬇️
...verttorawindex/ConvertToRawIndexTaskGenerator.java 5.45% <0.00%> (-80.00%) ⬇️
... and 137 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7edad89...58739e4. Read the comment docs.

@saurabhd336 saurabhd336 force-pushed the conjugationQueryOperators branch from eefad9f to 7a5fff1 Compare April 26, 2022 09:01
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach is better than #8563
Could you please add some description to the PR? Don't put the template as the description as that doesn't help reviewers understand the purpose of the PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) Revert this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding 3 execution nodes: AndExecutionNode, OrExecutionNode, NotExecutionNode. IMO no need to add the concept of ConjugateFunction

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

User will mostly likely use the function name within the FilterKind, and we should include that in the alias

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does svInt = 123 AND svInt < 123 work? Users usually don't use eq or lte explicitly.
Also let's put some filter that can potentially match record to test the behavior

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is was LTE. Hence it worked :)
Had added these to be able to test and / or. Made the tests a bit more extensive.
All the filters actually match the record. SKIP_RECORD_KEY is being checked for.

@saurabhd336 saurabhd336 force-pushed the conjugationQueryOperators branch 2 times, most recently from 612bcd7 to e1c216e Compare April 28, 2022 13:05
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 2 methods need to be annotated with nullableParameters = true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Redundant parenthesis, same for others

Suggested change
Boolean res = (Boolean) (executableNode.execute(row));
Boolean res = (Boolean) executableNode.execute(row);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to annotate the names because it matches the method name in the canonical format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if users will use gt instead of >, and IIRC gt is not standard SQL syntax. We may just make the method name match the filter kind in camel case (they will be canonicalized to the same name)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change the expression to svInt = 123 AND svDouble <= 200. Users won't explicitly write query in this format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

@saurabhd336 saurabhd336 force-pushed the conjugationQueryOperators branch 3 times, most recently from d48c94a to 70aae26 Compare April 29, 2022 12:34
@saurabhd336 saurabhd336 changed the title Support conjugates for scalar functions (with early termination) (WIP) Support conjugates for scalar functions, add more scalar functions Apr 29, 2022
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to make sure these work case-insensitive way (all other scalar functions do)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only minor/nit comments

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor)

Suggested change
Preconditions.checkState(numArguments == 1, "Unsupported number of arguments for NOT operator");
Preconditions.checkState(numArguments == 1, "NOT function expects 1 argument, got: %s", numArguments);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)

Suggested change
return (Math.abs(a - b) >= DOUBLE_COMPARISON_TOLERANCE);
return Math.abs(a - b) >= DOUBLE_COMPARISON_TOLERANCE;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)

Suggested change
return (Math.abs(a - b) < DOUBLE_COMPARISON_TOLERANCE);
return Math.abs(a - b) < DOUBLE_COMPARISON_TOLERANCE;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit)

Suggested change
return (val > a) && (val < b);
return val > a && val < b;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor)

Suggested change
public static boolean isNull(Object obj) {
public static boolean isNull(@Nullable Object obj) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor)

Suggested change
public static boolean isNotNull(Object obj) {
public static boolean isNotNull(@Nullable Object obj) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldn't work rt? it should be svInt == 123

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SQL, we use = for comparison, e.g. SELECT * FROM myTable WHERE svInt = 123

@saurabhd336 saurabhd336 force-pushed the conjugationQueryOperators branch 2 times, most recently from ed6183a to 60d4227 Compare May 4, 2022 06:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants