Skip to content

Conversation

@KKcorps
Copy link
Contributor

@KKcorps KKcorps commented Mar 7, 2022

There are some functions which are missing from Pinot currently which are part of standard SQL. We can still do some of these operations (except for trignometric ones) by writing composite SQL expressions using existing functions. This however leads to complex queries and it is better we support these functions directly in Pinot.

For better performance, Transform function classes have also been added for the new Math functions. The classes have the same function name as scalar and will be given first preference when function is used.

This also helps in providing better support for third party tools such as Tableau.

@KKcorps KKcorps force-pushed the transform_functions_patch branch from 5c56be1 to 41c2050 Compare March 7, 2022 07:46
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding these missing functions. Can you please add unit test to ensure the desired behavior?

@codecov-commenter
Copy link

codecov-commenter commented Mar 14, 2022

Codecov Report

Merging #8304 (ccbba40) into master (515f7a2) will increase coverage by 0.20%.
The diff coverage is 77.16%.

@@             Coverage Diff              @@
##             master    #8304      +/-   ##
============================================
+ Coverage     70.76%   70.96%   +0.20%     
- Complexity     4278     4283       +5     
============================================
  Files          1655     1661       +6     
  Lines         86665    87046     +381     
  Branches      13067    13134      +67     
============================================
+ Hits          61327    61772     +445     
+ Misses        21095    21012      -83     
- Partials       4243     4262      +19     
Flag Coverage Δ
integration1 28.80% <14.18%> (+0.13%) ⬆️
integration2 27.34% <14.18%> (-0.03%) ⬇️
unittests1 67.07% <77.16%> (+0.09%) ⬆️
unittests2 14.15% <0.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../pinot/common/function/DateTimePatternHandler.java 80.00% <0.00%> (-20.00%) ⬇️
...common/function/scalar/TrigonometricFunctions.java 0.00% <0.00%> (ø)
...ot/common/function/scalar/ArithmeticFunctions.java 69.56% <37.50%> (+16.23%) ⬆️
.../pinot/common/function/scalar/StringFunctions.java 71.05% <64.28%> (-2.80%) ⬇️
...nsform/function/RoundDecimalTransformFunction.java 71.79% <71.79%> (ø)
...orm/function/TruncateDecimalTransformFunction.java 71.79% <71.79%> (ø)
...tor/transform/function/PowerTransformFunction.java 80.76% <80.76%> (ø)
...inot/common/function/scalar/DateTimeFunctions.java 95.23% <83.33%> (-0.73%) ⬇️
...form/function/TrigonometricTransformFunctions.java 86.25% <86.25%> (ø)
...orm/function/SingleParamMathTransformFunction.java 95.23% <90.90%> (-0.42%) ⬇️
... and 44 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 515f7a2...ccbba40. Read the comment docs.

@Jackie-Jiang
Copy link
Contributor

@KKcorps There are 6 hidden conversations (github automatically collapsed them) unaddressed, please take a look.

@KKcorps
Copy link
Contributor Author

KKcorps commented Mar 17, 2022

@Jackie-Jiang @richardstartin I have added the unit-tests as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥇

@KKcorps KKcorps requested a review from richardstartin March 21, 2022 13:04
@KKcorps KKcorps force-pushed the transform_functions_patch branch from f38c59d to e98c403 Compare March 21, 2022 16:30
@KKcorps KKcorps requested a review from richardstartin March 22, 2022 11:44
Copy link
Member

@richardstartin richardstartin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, and I appreciate that this has extended a long way beyond its original scope so thanks for being so patient and for your determination. Can we just get rid of the stray semicolons in the trig functions before merging please.

@KKcorps KKcorps force-pushed the transform_functions_patch branch from 0829dcb to 0a63209 Compare March 22, 2022 21:08
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may avoid the boxing/unboxing using primitive double and use NaN to represent not set, same for other functions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the change for this class. Is it fine if I use Integer.MIN_VALUE for other classes or leaved the boxed implementation as it is for now? I would prefer the latter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using boxed value is fine if we unbox the value before getting into the loop (not sure if JVM will do the optimization, cc @richardstartin)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just use a flag rather than sentinel values. I don't know what the best thing to do is, but it probably isn't very important anyway. If this ever shows up as a bottleneck we can fix it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of using a boolean flag, so that it is guaranteed to have the desired behavior without potentially paying the cost of unboxing the values

Copy link
Contributor Author

@KKcorps KKcorps Mar 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, boolean flags are better. using them.

@KKcorps KKcorps force-pushed the transform_functions_patch branch from 2830ca3 to e81d81c Compare March 26, 2022 11:31
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) Rename the second parameter to scale for readability

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang Mar 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Math.round() for this function as BigDecimal calculation is much more costly? If not, please add some comments explaining the reason. Same for the one with scale

_rightTransformFunction),
"Argument must be single-valued with type INT or LONG for transform function: %s", getName());
} else {
_rightTransformFunction = null;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can simplify the handling by setting _scale = 0 and _fixedScale = true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for doing it this way is to use optimal Math.round and Math.floor for this case instead of relying on BigDecimal represenation.

@KKcorps KKcorps merged commit ebd20a4 into apache:master Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants