Skip to content

[score boosting] add more expressions#6054

Merged
coszio merged 3 commits intodevfrom
more-expressions-in-formula
Feb 26, 2025
Merged

[score boosting] add more expressions#6054
coszio merged 3 commits intodevfrom
more-expressions-in-formula

Conversation

@coszio
Copy link
Copy Markdown
Contributor

@coszio coszio commented Feb 24, 2025

Adds the following operations to expression:

  • sqrt
  • pow
  • exp
  • log (base 10)
  • ln
  • abs

@coszio coszio requested review from generall and timvisee February 24, 2025 20:55
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 24, 2025

📝 Walkthrough

Walkthrough

The pull request extends the mathematical expression capabilities across multiple modules. It introduces new variants for operations such as absolute value, square root, power, exponential, base‑10 logarithm, and natural logarithm. These changes span the gRPC protocol definitions, REST schemas, internal expression representations, and evaluation logic. The extensions include new fields in protobuf messages, additional enum variants and structs in REST, and enhanced conversion and evaluation functions within the internal query and scoring systems.

Changes

File(s) Change Summary
lib/api/src/grpc/proto/points.proto
lib/api/src/grpc/qdrant.rs
Updated protobuf definitions and gRPC implementation: extended the Expression message with new fields (abs, sqrt, pow, exp, log10, ln) and introduced a new PowExpression type. Additional methods for each new operation were added.
lib/api/src/rest/schema.rs Added new variants to the Expression enum and corresponding structs (AbsExpression, SqrtExpression, PowExpression, PowParams, ExpExpression, Log10Expression, LnExpression) to support additional mathematical operations.
lib/collection/src/operations/universal_query/formula.rs
lib/collection/src/operations/universal_query/shard_query.rs
Introduced new variants to the ExpressionInternal enum with support for Sqrt, Pow, Exp, Log10, Ln, and Abs. Updated conversion logic from REST expressions to internal representations.
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs
Extended the evaluation logic and parsed formula representation to include new parsed expression variants (Sqrt, Pow, Exp, Log10, Ln, Abs), integrating the new mathematical operations into the scoring system.
docs/grpc/docs.md Added documentation for the new PowExpression type and included descriptions for the new mathematical expressions.
docs/redoc/master/openapi.json Updated the OpenAPI schema to include new mathematical expressions (AbsExpression, SqrtExpression, PowExpression, ExpExpression, Log10Expression, LnExpression).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as GRPC/REST API
    participant Converter as Expression Converter
    participant QueryEngine as Internal Query System
    participant Evaluator as FormulaScorer

    Client->>API: Send expression request (e.g., pow, sqrt)
    API->>Converter: Convert external expression to internal representation
    Converter->>QueryEngine: Map to ExpressionInternal variants (including new operations)
    QueryEngine->>Evaluator: Evaluate expression (e.g., compute pow, sqrt, etc.)
    Evaluator-->>QueryEngine: Return computed result
    QueryEngine-->>API: Forward evaluation result
    API-->>Client: Return final result
Loading

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (4)
lib/api/src/grpc/proto/points.proto (2)

590-595: Add documentation comments for new mathematical operations.

Consider adding descriptive comments for each new operation to improve API documentation:

-        Expression abs = 9; // Absolute value
-        Expression sqrt = 10; // Square root
-        PowExpression pow = 11; // Power
-        Expression exp = 12; // Exponential
-        Expression log10 = 13; // Logarithm
-        Expression ln = 14; // Natural logarithm
+        Expression abs = 9; // Returns the absolute value of the expression
+        Expression sqrt = 10; // Returns the square root of the expression
+        PowExpression pow = 11; // Returns the base expression raised to the power of the exponent expression
+        Expression exp = 12; // Returns e raised to the power of the expression
+        Expression log10 = 13; // Returns the base-10 logarithm of the expression
+        Expression ln = 14; // Returns the natural logarithm (base e) of the expression

618-621: Add documentation comments for PowExpression message.

Consider adding descriptive comments for the PowExpression message and its fields:

+// Message for representing a power operation with a base and exponent
 message PowExpression {
-    Expression base = 1;
-    Expression exponent = 2;
+    Expression base = 1; // The base expression to be raised to a power
+    Expression exponent = 2; // The exponent expression
 }
lib/api/src/grpc/qdrant.rs (2)

5223-5240: LGTM! Consider enhancing documentation with domain constraints.

The new mathematical operations are well-organized and properly tagged. However, consider adding domain constraints to the documentation:

  • Sqrt: Input must be non-negative
  • Log10 and Ln: Input must be positive

Apply this diff to enhance the documentation:

        /// Absolute value
        #[prost(message, tag = "9")]
        Abs(::prost::alloc::boxed::Box<super::Expression>),
-       /// Square root
+       /// Square root (input must be non-negative)
        #[prost(message, tag = "10")]
        Sqrt(::prost::alloc::boxed::Box<super::Expression>),
        /// Power
        #[prost(message, tag = "11")]
        Pow(::prost::alloc::boxed::Box<super::PowExpression>),
        /// Exponential
        #[prost(message, tag = "12")]
        Exp(::prost::alloc::boxed::Box<super::Expression>),
-       /// Logarithm
+       /// Logarithm base 10 (input must be positive)
        #[prost(message, tag = "13")]
        Log10(::prost::alloc::boxed::Box<super::Expression>),
-       /// Natural logarithm
+       /// Natural logarithm (input must be positive)
        #[prost(message, tag = "14")]
        Ln(::prost::alloc::boxed::Box<super::Expression>),

5280-5285: LGTM! Consider adding documentation for PowExpression.

The struct is well-designed with appropriate traits and field types. However, it would benefit from documentation explaining its purpose and any domain constraints.

Apply this diff to add documentation:

+/// Represents a power operation with a base and an exponent.
+/// Both base and exponent can be complex expressions.
+/// Note: Special cases like 0^0 or negative base with fractional exponent should be handled during evaluation.
pub struct PowExpression {
    #[prost(message, optional, boxed, tag = "1")]
    pub base: ::core::option::Option<::prost::alloc::boxed::Box<Expression>>,
    #[prost(message, optional, boxed, tag = "2")]
    pub exponent: ::core::option::Option<::prost::alloc::boxed::Box<Expression>>,
}
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8ec07b6 and eb467da.

📒 Files selected for processing (7)
  • lib/api/src/grpc/proto/points.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (3 hunks)
  • lib/api/src/rest/schema.rs (3 hunks)
  • lib/collection/src/operations/universal_query/formula.rs (2 hunks)
  • lib/collection/src/operations/universal_query/shard_query.rs (2 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus
  • GitHub Check: test
  • GitHub Check: test (macos-latest)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test (windows-latest)
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (8)
lib/api/src/grpc/qdrant.rs (1)

5188-5191: LGTM! Tags are properly ordered for the new mathematical operations.

The tags sequence is continuous and properly ordered to accommodate the new mathematical operations while maintaining backward compatibility.

lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)

44-52: LGTM! The new mathematical operations are well-structured.

The new variants in ParsedExpression are correctly implemented:

  • Each variant appropriately uses Box<ParsedExpression> for recursive types
  • The Pow variant correctly captures both base and exponent expressions
lib/collection/src/operations/universal_query/shard_query.rs (2)

249-267: LGTM! The conversion logic is correctly implemented.

The parse_and_convert method properly handles the new mathematical operations, maintaining consistency with the existing pattern.


538-563: LGTM! The gRPC conversion is well-implemented.

The TryFrom<grpc::Expression> implementation:

  • Properly validates required fields for the Pow variant
  • Maintains consistent error handling
  • Follows the established pattern for other variants
lib/collection/src/operations/universal_query/formula.rs (2)

29-37: LGTM! The internal expression types are well-defined.

The new variants in ExpressionInternal are correctly implemented:

  • Each variant appropriately uses Box<ExpressionInternal> for recursive types
  • The Pow variant correctly captures both base and exponent expressions

86-104: LGTM! The REST conversion is properly implemented.

The From<rest::Expression> implementation correctly handles all new mathematical operations, maintaining consistency with the existing pattern.

lib/api/src/rest/schema.rs (2)

644-650: LGTM! The REST API schema is well-structured.

The new variants in the Expression enum are correctly defined and ordered.


669-715: LGTM! The expression structs are properly implemented.

The new expression structs:

  • Follow the established pattern
  • Use appropriate derive macros
  • Properly use Box for recursive types

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/grpc/docs.md (1)

215-215: Markdown List Indentation

The static analysis hint indicates that an unordered list item at line 215 is indented with 4 spaces instead of the expected 2. Please adjust the indentation to conform with the markdown style guidelines. For example, if the current line is similar to:

-        - [SomeItem](#some-anchor)
+  - [SomeItem](#some-anchor)

this would resolve the issue.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

215-215: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb467da and a89a3c6.

📒 Files selected for processing (2)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (3 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

215-215: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test (macos-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (windows-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
🔇 Additional comments (15)
docs/grpc/docs.md (2)

2703-2708: New Mathematical Operations in the Expression Section

The addition of new rows for abs, sqrt, pow, exp, log10, and ln is clear and consistent with the existing style. In particular, using [PowExpression] as the link for the pow operation sets it apart, which is appropriate given that its definition is expanded in its own section. Please verify that the anchor names (e.g. #qdrant-Expression and #qdrant-PowExpression) match exactly the corresponding anchors in the document.


3715-3724: PowExpression Section Documentation

The newly added "PowExpression" section is well-structured, with a concise header and a table clearly defining the two fields: base and exponent. This nicely complements the new pow entry in the Expression table.

docs/redoc/master/openapi.json (13)

14294-14296: New mathematical operation: Absolute value

Added new AbsExpression schema for computing absolute values in expressions.


14300-14302: New mathematical operation: Square root

Added new SqrtExpression schema for computing square roots in expressions.


14303-14305: New mathematical operation: Power

Added new PowExpression schema for computing power operations in expressions.


14306-14308: New mathematical operation: Exponential

Added new ExpExpression schema for computing exponential (e^x) operations in expressions.


14309-14311: New mathematical operation: Base-10 logarithm

Added new Log10Expression schema for computing base-10 logarithms in expressions.


14312-14314: New mathematical operation: Natural logarithm

Added new LnExpression schema for computing natural logarithms in expressions.


14359-14369: Schema definition for absolute value operation

The AbsExpression schema is properly defined with:

  • Required field "abs" of type Expression
  • Follows consistent schema pattern with other mathematical operations

14401-14411: Schema definition for square root operation

The SqrtExpression schema is properly defined with:

  • Required field "sqrt" of type Expression
  • Follows consistent schema pattern with other mathematical operations

14412-14437: Schema definition for power operation

The PowExpression and PowParams schemas are properly defined with:

  • Required fields "base" and "exponent" of type Expression
  • Follows consistent schema pattern with other mathematical operations
  • Appropriate parameter structure for binary operation

14438-14448: Schema definition for exponential operation

The ExpExpression schema is properly defined with:

  • Required field "exp" of type Expression
  • Follows consistent schema pattern with other mathematical operations

14449-14459: Schema definition for base-10 logarithm operation

The Log10Expression schema is properly defined with:

  • Required field "log10" of type Expression
  • Follows consistent schema pattern with other mathematical operations

14460-14470: Schema definition for natural logarithm operation

The LnExpression schema is properly defined with:

  • Required field "ln" of type Expression
  • Follows consistent schema pattern with other mathematical operations

14294-14470:

Details

❓ Verification inconclusive

Verify mathematical operations are properly integrated

The new mathematical operations are properly integrated into the Expression type union, allowing them to be used in formulas and other expressions throughout the API.


🏁 Script executed:

#!/bin/bash
# Verify that all new expression types are properly referenced in the Expression schema
ast-grep --pattern $'anyOf": [
  $$$
  {
    "$ref": "#/components/schemas/AbsExpression"
  },
  $$$
  {
    "$ref": "#/components/schemas/SqrtExpression"
  },
  $$$
  {
    "$ref": "#/components/schemas/PowExpression"
  },
  $$$
  {
    "$ref": "#/components/schemas/ExpExpression"
  },
  $$$
  {
    "$ref": "#/components/schemas/Log10Expression"
  },
  $$$
  {
    "$ref": "#/components/schemas/LnExpression"
  },
  $$$'

Length of output: 424


Math Operations Integration Needs Manual Verification

The openapi.json snippet shows all the new mathematical operation references (AbsExpression, DivExpression, SqrtExpression, PowExpression, ExpExpression, Log10Expression, LnExpression, and GeoDistance) visibly present. However, our initial automated AST search did not return any output. This suggests that the precise AST pattern we used might not match the actual JSON structure—even though the snippet appears correct.

  • Action: Please manually verify that the Expression type union (or equivalent structure) has been updated to include these operations as intended. This includes confirming that the new math operation references are part of the union used throughout the API.

Comment thread lib/api/src/grpc/proto/points.proto
Comment on lines +160 to +161
// Undefined: Send this point to the bottom of results
Ok(f32::NEG_INFINITY)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work for euclid as well, because we invert later?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All score boosting will have Order::LargerIsBetter, this will be added to documentation. For the case of euclidean distance, users will have to negate the scores, to be explicit.

@coszio coszio merged commit 64ff5f0 into dev Feb 26, 2025
@coszio coszio deleted the more-expressions-in-formula branch February 26, 2025 12:31
timvisee pushed a commit that referenced this pull request Mar 21, 2025
* sqrt, pow, exp, log, ln, abs

* upd openapi and grpc

* handle undefined math as negative infinity
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants