Skip to content

[score boosting] Add decay expressions#6154

Merged
coszio merged 9 commits intodevfrom
decay-function-helper
Mar 18, 2025
Merged

[score boosting] Add decay expressions#6154
coszio merged 9 commits intodevfrom
decay-function-helper

Conversation

@coszio
Copy link
Copy Markdown
Contributor

@coszio coszio commented Mar 11, 2025

Adds exponential, gaussian, and linear decay expressions.

Plots and formulas can be visualized here https://www.desmos.com/calculator/htg0vrfmks

I really tried to name the variables as straightforward as I could, but I know there is room for improvement if anyone comes up with a better name.

@coszio coszio added this to the score boosting milestone Mar 11, 2025
@coszio coszio requested review from generall and timvisee March 11, 2025 21:03
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 11, 2025

📝 Walkthrough

Walkthrough

The changes integrate a new set of decay expression functionalities across multiple components of the system. A new documentation section, DecayParamsExpression, has been added to describe decay parameters with fields such as x, target, scale, and midpoint. Corresponding updates appear in the OpenAPI specification, protocol buffers, and REST schema, introducing new expression types: LinDecayExpression, ExpDecayExpression, and GaussDecayExpression. In the gRPC conversions and query modules, modifications allow for parsing and handling decay expressions by mapping them onto a new Decay variant with an associated DecayKind indicating linear, Gaussian, or exponential decay. Support functions for computing decay values have been added to both the formula evaluation and scoring mechanisms. Collectively, these changes expand the system’s ability to support decay-based calculations in query processing and formula evaluation.

Suggested reviewers

  • generall
  • timvisee

Tip

⚡🧪 Multi-step agentic review comment chat (experimental)
  • We're introducing multi-step agentic chat in review comments. This experimental feature enhances review discussions with the CodeRabbit agentic chat by enabling advanced interactions, including the ability to create pull requests directly from comments.
    - To enable this feature, set early_access to true under in the settings.
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (10)
docs/redoc/master/openapi.json (2)

14714-14719: Consider validating scale parameter in the API schema

The scale parameter's documentation states that it "must be a non-zero positive number," but the schema doesn't enforce this constraint. While validation might happen in the implementation code, it would be beneficial to add schema validation to make this requirement explicit in the API documentation.

          "scale": {
            "description": "The scale factor of the decay, in terms of `x`. Defaults to 1.0. Must be a non-zero positive number.",
            "type": "number",
            "format": "float",
+           "exclusiveMinimum": 0,
            "nullable": true
          },

14720-14725: Consider adding range validation for the midpoint parameter

The midpoint parameter typically represents a value between 0 and 1 for decay functions. While the default is specified as 0.5, there's no validation in the schema to ensure it stays within a reasonable range. Consider adding minimum and maximum constraints if appropriate for your implementation.

          "midpoint": {
            "description": "The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`.",
            "type": "number",
            "format": "float",
+           "minimum": 0,
+           "maximum": 1,
            "nullable": true
          }

Note: Only add these constraints if they are actual requirements for your decay functions implementation.

docs/grpc/docs.md (1)

2461-2462: Markdown Table Formatting Issue – Escape Pipe Character
The description for the midpoint field includes an inline code snippet:

`|x - target| == scale`

This pipe character (|) within the code span may be misinterpreted as a delimiter, causing the linter (MD056) to report extra columns. Please escape the pipe (for example, use \|x - target\| == scale) to ensure the table renders correctly.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

2461-2461: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

lib/collection/src/operations/universal_query/shard_query.rs (2)

460-519: Consider refactoring to reduce code duplication.

The implementation for all three decay variants (ExpDecay, GaussDecay, LinDecay) contains nearly identical code with only the DecayKind being different. This repetition could be extracted into a helper function to improve maintainability.

+ // Helper function to handle decay expressions conversion
+ fn try_from_decay_expression(
+     decay_expression: Box<grpc::DecayParamsExpression>,
+     kind: DecayKind
+ ) -> Result<ExpressionInternal, tonic::Status> {
+     let grpc::DecayParamsExpression {
+         x,
+         target,
+         midpoint,
+         scale,
+     } = *decay_expression;
+
+     let x = *x.ok_or_else(|| tonic::Status::invalid_argument("missing field: x"))?;
+
+     let target = target.map(|t| (*t).try_into()).transpose()?.map(Box::new);
+
+     Ok(ExpressionInternal::Decay {
+         kind,
+         x: Box::new(x.try_into()?),
+         target,
+         midpoint,
+         scale,
+     })
+ }

 Variant::ExpDecay(exp_decay_expression) => {
-    let grpc::DecayParamsExpression {
-        x,
-        target,
-        midpoint,
-        scale,
-    } = *exp_decay_expression;
-
-    let x = *x.ok_or_else(|| tonic::Status::invalid_argument("missing field: x"))?;
-
-    let target = target.map(|t| (*t).try_into()).transpose()?.map(Box::new);
-
-    ExpressionInternal::Decay {
-        kind: DecayKind::Exp,
-        x: Box::new(x.try_into()?),
-        target,
-        midpoint,
-        scale,
-    }
+    try_from_decay_expression(exp_decay_expression, DecayKind::Exp)?
 }
 Variant::GaussDecay(gauss_decay_expression) => {
-    let grpc::DecayParamsExpression {
-        x,
-        target,
-        midpoint,
-        scale,
-    } = *gauss_decay_expression;
-
-    let x = *x.ok_or_else(|| tonic::Status::invalid_argument("missing field: x"))?;
-
-    let target = target.map(|t| (*t).try_into()).transpose()?.map(Box::new);
-
-    ExpressionInternal::Decay {
-        kind: DecayKind::Gauss,
-        x: Box::new(x.try_into()?),
-        target,
-        midpoint,
-        scale,
-    }
+    try_from_decay_expression(gauss_decay_expression, DecayKind::Gauss)?
 }
 Variant::LinDecay(lin_decay_expression) => {
-    let grpc::DecayParamsExpression {
-        x,
-        target,
-        midpoint,
-        scale,
-    } = *lin_decay_expression;
-
-    let x = *x.ok_or_else(|| tonic::Status::invalid_argument("missing field: x"))?;
-
-    let target = target.map(|t| (*t).try_into()).transpose()?.map(Box::new);
-
-    ExpressionInternal::Decay {
-        kind: DecayKind::Lin,
-        x: Box::new(x.try_into()?),
-        target,
-        midpoint,
-        scale,
-    }
+    try_from_decay_expression(lin_decay_expression, DecayKind::Lin)?
 }

472-478: Consider validating scale parameter.

For the scale parameter, you might want to add validation to ensure it's a non-zero positive number as indicated in the documentation for DecayParamsExpression. Currently, there's no validation for this constraint.

 ExpressionInternal::Decay {
     kind: DecayKind::Exp,
     x: Box::new(x.try_into()?),
     target,
-    midpoint,
-    scale,
+    midpoint, 
+    scale: scale.map(|s| if s <= 0.0 {
+        return Err(tonic::Status::invalid_argument("scale must be a positive non-zero value"));
+    } else {
+        s
+    }).transpose()?,
 }
lib/api/src/grpc/conversions.rs (2)

2785-2871: Consider adding runtime checks for invalid lambda.
While this new ParsedExpression::Decay branch correctly generates variants such as LinDecay, ExpDecay, and GaussDecay, the internal debug_assert!(0.0 < lambda && lambda < 1.0) might fail silently in production. Consider returning an error or providing default behavior for out-of-range values to avoid undefined behavior at runtime.


2872-2909: Validate lambda range in production.
This helper function calculates (midpoint, scale) based on the existing lambda and kind. The debug_assert!(0.0 < lambda && lambda < 1.0) helps in development, but no fallback or error is returned if the assertion fails. For robust error handling, consider replacing the debug_assert! with a runtime check that gracefully handles or reports invalid lambda values in production.

lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (2)

261-264: Exponential decay function is concise and clear.
The function calculates e^(diff * lambda). As a minor suggestion, consider verifying that lambda is not negative when invoked to avoid edge-case results.


266-269: Gaussian decay function logic appears correct.
Squaring the difference and dividing by lambda matches standard gauss decay usage. If negative or zero lambda might occur upstream, consider how to handle that gracefully.

lib/api/src/grpc/proto/points.proto (1)

626-635: Definition of DecayParamsExpression Message

The newly introduced DecayParamsExpression message clearly defines the decay parameters with:

  • x: the variable to decay,
  • target: an optional field (defaulting to 0 per the comment),
  • scale: an optional float that must be non-zero and positive (with an expected default of 1.0),
  • midpoint: an optional float (defaulting to 0.5).

These comments provide useful guidance; however, remember that proto3 does not enforce default values automatically. It might be beneficial to ensure that your business logic or documentation highlights that the defaults must be applied during message processing. Also, consider whether renaming lin_decay to linear_decay might enhance clarity, although the current naming is consistent with the abbreviated forms used for exponential and Gaussian decays.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a0862e and 0eca749.

📒 Files selected for processing (10)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/conversions.rs (4 hunks)
  • lib/api/src/grpc/proto/points.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (3 hunks)
  • lib/api/src/rest/schema.rs (2 hunks)
  • lib/collection/src/operations/universal_query/formula.rs (4 hunks)
  • lib/collection/src/operations/universal_query/shard_query.rs (2 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (3 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

141-141: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


2461-2461: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (macos-latest)
  • GitHub Check: test (windows-latest)
  • GitHub Check: test-low-resources
  • GitHub Check: test-consensus
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (23)
docs/redoc/master/openapi.json (2)

14493-14749: New decay expressions properly integrated with Expression schema

Good implementation of the new decay expressions (linear, exponential, and Gaussian) as part of the Expression schema. The integration follows the existing pattern for other expression types, and the structure is consistent with the rest of the API.


14694-14727: Well-structured DecayParamsExpression with clear parameter documentation

The DecayParamsExpression schema is well-defined with appropriate required and optional fields. The documentation for each parameter clearly explains their purpose and default values:

  • x: The input expression (required)
  • target: The value to start decaying from (defaults to 0)
  • scale: The scale factor of the decay (defaults to 1.0)
  • midpoint: The midpoint value of the decay function (defaults to 0.5)

This implementation allows for flexible configuration of the decay functions while providing sensible defaults.

lib/api/src/grpc/qdrant.rs (2)

5185-5185: LGTM: Tags updated to include new decay expressions.

The tags attribute has been updated to include tags 15, 16, and 17 for the new decay expression variants.


5236-5244: LGTM: Added well-documented decay expression variants.

The three new decay expression variants (ExpDecay, GaussDecay, LinDecay) have been added with appropriate documentation comments that clearly describe their purpose.

docs/grpc/docs.md (1)

2451-2460: New DecayParamsExpression Section Documentation Added
The new section clearly introduces the DecayParamsExpression with well-defined fields:

  • x: the variable to decay.
  • target: the reference value (defaults to 0).
  • scale: the decay’s scale factor (defaults to 1.0 and must be positive).
    These descriptions are concise and aligned with the PR’s goal of introducing decay expressions.
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (2)

57-65: Implementation looks good!

The new Decay variant in ParsedExpression is well-structured with clear comments for each field. The pattern follows the existing code style and provides all necessary fields for implementing decay functions (kind, x, target, lambda).


68-76: LGTM!

The DecayKind enum is well-documented, clearly describing the three types of decay functions (Lin, Gauss, Exp) that are being implemented. The naming convention is consistent with the project.

lib/collection/src/operations/universal_query/shard_query.rs (1)

10-12: Import looks good.

Correctly importing the DecayKind from the parsed_formula module to use with the new decay expression handling.

lib/api/src/rest/schema.rs (2)

654-657: Enum variant additions look good.

The new decay expression variants have been added to the Expression enum following the existing pattern and naming conventions of the codebase.


722-747: Well-documented struct definitions.

The new decay expression structs and the DecayParamsExpression struct are well-defined with clear documentation. The comments for each field in DecayParamsExpression provide good information about defaults and constraints, which will be helpful for API users.

One minor suggestion: You might want to add a constraint validation check for the scale parameter in the implementation to ensure it's always a non-zero positive number as specified in the documentation.

lib/api/src/grpc/conversions.rs (2)

16-16: No issues found on the added imports.
This import provides the required identifiers (DecayKind, ParsedExpression, ParsedFormula) without introducing any conflicts or unused references.


53-55: Imports for new expression variants look good.
These additions enable references to newly introduced expression structs such as DecayParamsExpression, ensuring clarity in code organization.

lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (4)

10-10: Importing new decay-related types looks correct.
The addition of DecayKind into the import list is consistent with the usage in the updated scoring logic.


20-20: Default decay target is set to 0.0.
This constant effectively defaults decay calculations toward zero. Ensure it aligns with the business logic of your application (some scenarios might prefer a different default).


238-256: Decay expression evaluation is well-structured.
The usage of exp_decay, gauss_decay, and linear_decay is clearly delineated by DecayKind. Consider validating lambda at this level as well to handle unexpected negative or zero lambda values.


271-274: Linear decay function uses a straightforward approach.
Ensuring the result is clamped to a minimum of zero (.max(0.0)) is a good safeguard. Continue verifying that lambda remains within acceptable bounds to avoid unexpected expansions.

lib/collection/src/operations/universal_query/formula.rs (6)

8-8: New import for Decay expressions is clear.
Including DecayKind, ParsedExpression, and ParsedFormula aligns the formula logic with the new decay capabilities.


16-17: Default decay parameters outlined.
These default midpoint and scale values (0.5 and 1.0) may be appropriate for typical use-cases. Ensure they reflect intended domain logic and are documented for clarity.


51-57: New Decay variant is a valuable extension.
This addition empowers the ExpressionInternal enum to represent linear, exponential, and Gaussian decay parameters. The structured approach is easy to read and maintain.


127-149: Decay expression parsing is robust and consistent.
After computing lambda via decay_params_to_lambda, the code neatly crafts a ParsedExpression::Decay. Just ensure external inputs for midpoint and scale remain valid, preventing runtime errors.


156-188: Lambda computation for decay is well-implemented.
The function enforces constraints on midpoint ((0,1)) and scale (>0). The resulting lambda computation aligns with the decay definitions. If there's a possibility of midpoint or scale being zero or negative at runtime, consider adding more specific errors for clarity.


281-326: Mapping REST decay expressions to internal logic is coherent.
The code reads each decay expression (LinDecay, ExpDecay, GaussDecay) from REST parameters and constructs ExpressionInternal::Decay. This approach keeps concerns well-separated and clearly transforms API inputs into internal structures.

lib/api/src/grpc/proto/points.proto (1)

596-598: New Decay Expression Fields Added to Expression Message

The new oneof fields exp_decay, gauss_decay, and lin_decay (with tags 15, 16, and 17 respectively) expand the Expression message to support exponential, Gaussian, and linear decay. The numbering is sequential and consistent with previous fields. Please verify that all downstream consumers and parsers of Expression are updated accordingly to handle these new variants.

Comment thread lib/api/src/grpc/qdrant.rs
coderabbitai[bot]

This comment was marked as resolved.

@coszio coszio force-pushed the decay-function-helper branch from 8f7a988 to e0de6b3 Compare March 12, 2025 03:00
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/collection/src/operations/universal_query/shard_query.rs (1)

475-497: Implementation looks good with a minor readability suggestion.

The try_from_decay_params function correctly extracts and processes all fields from the DecayParamsExpression. The required field validation and error handling are well implemented.

For improved readability, consider splitting the target processing into multiple steps:

- let target = target.map(|t| (*t).try_into()).transpose()?.map(Box::new);
+ // Process optional target field
+ let target = match target {
+     Some(t) => {
+         let converted = (*t).try_into()?;
+         Some(Box::new(converted))
+     }
+     None => None,
+ };

This approach makes the code slightly more verbose but easier to understand, especially for developers less familiar with Rust's functional style.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f7a988 and e0de6b3.

📒 Files selected for processing (1)
  • lib/collection/src/operations/universal_query/shard_query.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (macos-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test (windows-latest)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
🔇 Additional comments (1)
lib/collection/src/operations/universal_query/shard_query.rs (1)

460-468: LGTM! The decay variants are correctly implemented.

Each decay variant now properly uses its corresponding decay kind, ensuring that the decay functions are correctly represented. This is a good implementation that allows the system to support all three types of decay expressions.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (9)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (9)

61-69: Consider adding a constructor method for the Decay variant

The Decay variant is well-documented, but unlike other variants (e.g., new_div, new_neg), it doesn't have a constructor method. For consistency, consider adding a new_decay method.

+ pub fn new_decay(
+     kind: DecayKind,
+     x: ParsedExpression,
+     target: Option<ParsedExpression>,
+     lambda: ScoreType,
+ ) -> Self {
+     ParsedExpression::Decay {
+         kind,
+         x: Box::new(x),
+         target: target.map(Box::new),
+         lambda,
+     }
+ }

167-167: Remove debug statement

The dbg! macro is typically used during development and should be removed before production.

- dbg!(midpoint);

166-171: Improve error message clarity for midpoint validation

The current error message could be more explicit about the allowed range being exclusive of 0.0 and 1.0.

if midpoint <= 0.0 || midpoint >= 1.0 {
-   dbg!(midpoint);
    return Err(OperationError::validation_error(
-       "Decay midpoint should be between 0.0 and 1.0, not inclusive.",
+       format!("Decay midpoint should be in the range (0.0, 1.0) (exclusive), got {}", midpoint),
    ));
}

173-177: Improve error message clarity for scale validation

The current error message could be more specific about the requirement for scale.

if scale <= 0.0 {
    return Err(OperationError::validation_error(
-       "Decay scale should be non-zero positive.",
+       format!("Decay scale should be positive and non-zero, got {}", scale),
    ));
}

195-206: Consider adding runtime validation for linear decay lambda

You're using a debug assertion to check lambda's range, but this won't be checked in release builds. Consider adding runtime validation for safety.

DecayKind::Lin => {
    // We assume lambda is in the range (0, 1)
-   debug_assert!(0.0 < lambda && lambda < 1.0);
+   if !(0.0 < lambda && lambda < 1.0) {
+       log::warn!("Linear decay lambda should be in range (0, 1), got {}", lambda);
+   }

    // Linear lambda is (1.0 - midpoint) / scale,
    // setting scale to 1.0 allows us to ignore the division,
    // and only set the midpoint to some value.
    //
    // (1.0 - midpoint) / 1.0 = lambda
    // 1.0 - midpoint = lambda
    // midpoint = 1.0 - lambda
    ((-lambda + 1.0), 1.0)
}

208-222: Consider adding runtime validation for Gaussian decay lambda

Similar to the linear decay case, you're using a debug assertion for the lambda range check. Consider adding runtime validation.

DecayKind::Gauss => {
    // We assume lambda is non-zero negative
-   debug_assert!(lambda < 0.0);
+   if !(lambda < 0.0) {
+       log::warn!("Gaussian decay lambda should be negative, got {}", lambda);
+   }

    // Gauss lambda is scale^2 / ln(midpoint)
    // setting midpoint to 1/e (0.3678...) allows us to ignore the division, since ln(1/e) = -1
    // Then we set scale to sqrt(-lambda)
    //
    // scale^2 / ln(1/e) = lambda
    // scale^2 / -1.0 = lambda
    // -scale^2 = lambda
    // scale^2 = -lambda
    // scale = sqrt(-lambda)
    (1.0 / std::f32::consts::E, lambda.neg().sqrt())
}

224-236: Consider adding runtime validation for exponential decay lambda

As with the other decay types, consider adding runtime validation for the lambda range.

DecayKind::Exp => {
    // We assume lambda is non-zero negative
-   debug_assert!(lambda < 0.0);
+   if !(lambda < 0.0) {
+       log::warn!("Exponential decay lambda should be negative, got {}", lambda);
+   }

    // Exponential lambda is ln(midpoint) / scale
    // setting midpoint to 1/e (0.3678...) allows us to ignore the division, since ln(1/e) = -1
    // Then we set scale to -1 / lambda
    //
    // ln(1/e) / scale = lambda
    // -1.0 / scale = lambda
    // scale = -1.0 / lambda
    (1.0 / std::f32::consts::E, -1.0 / lambda)
}

73-80: Consider enhancing DecayKind documentation with formula details

While the enum variants have good descriptive comments, it might be helpful to include the actual formulas implemented by each decay function for completeness.

#[derive(Debug, Copy, Clone, PartialEq)]
pub enum DecayKind {
-   /// Linear decay function
+   /// Linear decay function: f(x) = max(0, 1 - λ * |x - target|)
    Lin,
-   /// Gaussian decay function
+   /// Gaussian decay function: f(x) = exp(-λ * (x - target)²)
    Gauss,
-   /// Exponential decay function
+   /// Exponential decay function: f(x) = exp(-λ * |x - target|)
    Exp,
}

13-14: Consider adding documentation for decay constants

The constants DEFAULT_DECAY_MIDPOINT and DEFAULT_DECAY_SCALE could benefit from brief documentation explaining their significance in the decay functions.

- const DEFAULT_DECAY_MIDPOINT: f32 = 0.5;
- const DEFAULT_DECAY_SCALE: f32 = 1.0;
+ /// Default midpoint value for decay functions when not specified (represents 50% decay point)
+ const DEFAULT_DECAY_MIDPOINT: f32 = 0.5;
+ /// Default scale value for decay functions when not specified (controls the rate of decay)
+ const DEFAULT_DECAY_SCALE: f32 = 1.0;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e0de6b3 and 0d42a84.

📒 Files selected for processing (5)
  • lib/api/src/grpc/conversions.rs (4 hunks)
  • lib/collection/src/operations/universal_query/formula.rs (4 hunks)
  • lib/common/common/src/math.rs (1 hunks)
  • lib/segment/src/common/operation_error.rs (1 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (macos-latest)
  • GitHub Check: test (windows-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (10)
lib/segment/src/common/operation_error.rs (1)

91-96: Clean addition of validation error constructor

This helper method for creating validation errors follows the same pattern as service_error and service_error_light. It provides a consistent way to create validation errors throughout the codebase.

lib/common/common/src/math.rs (2)

20-24: Good implementation of float comparison function

This helper function provides a convenient way to check if two floating-point values are approximately equal using standard tolerance values. It's a good practice to use such functions instead of direct equality comparisons for floating-point values.


26-29: Well-implemented parameterized float comparison

This function follows a standard algorithm for floating-point comparisons with configurable tolerances:

  1. Uses both absolute and relative tolerance
  2. Scales relative tolerance by maximum magnitude of inputs
  3. Provides a flexible foundation for numerical comparisons

The implementation is mathematically sound and handles edge cases appropriately.

lib/collection/src/operations/universal_query/formula.rs (3)

48-54: Well-structured decay expression variant

The new Decay variant has been properly structured with appropriate fields: kind to specify the decay function type, x for the input, target as an optional reference value, and optional scaling parameters. The use of Option for parameters that might have default values is good practice.


124-146: Good implementation of decay expression parsing

The implementation correctly:

  1. Converts the scaling parameters to a lambda value
  2. Recursively parses sub-expressions for x and target
  3. Properly constructs the resulting ParsedExpression::Decay
  4. Handles errors appropriately with the ? operator

The code follows the established pattern for other expressions in this method.


245-289: Comprehensive implementation of all decay types

The implementation handles all three decay types (linear, exponential, and Gaussian) consistently. Each expression type is correctly mapped to the appropriate DecayKind variant while preserving all parameters. The code properly extracts and transforms nested expressions.

lib/api/src/grpc/conversions.rs (1)

2846-2864: Complete decay expression unparsing implementation

The code correctly:

  1. Converts lambda back to midpoint and scale parameters
  2. Unpacks and unparses nested expressions
  3. Constructs the appropriate variant based on decay kind
  4. Ensures all parameters are properly included in the result

This completes the bidirectional conversion between internal and API representations of decay expressions.

lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (3)

324-359: Good implementation of decay tests with property-based testing

The use of property-based testing with proptest is a good approach to verify the mathematical properties of your decay functions across a wide range of inputs. The test correctly verifies that lambda values can be converted to parameters and back without significant precision loss.


154-162: LGTM! Good documentation with visual reference

The documentation for decay_params_to_lambda is clear and includes a helpful link to visualize the formulas, which is excellent for developers to understand the mathematical behavior.


2-2: Good addition of Neg trait

The addition of std::ops::Neg import is correctly used for the .neg() method in the Gaussian decay parameter conversion.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (3)

187-237: Consider consistency in parameter validation across methods.

While decay_params_to_lambda does thorough validation with proper error handling, decay_lambda_to_params only uses debug_assert! to check assumptions about lambda values. Consider adding proper validation with error handling for production use.

For production-ready code, consider changing the debug assertions to proper validation:

pub fn decay_lambda_to_params(lambda: f32, kind: DecayKind) -> (f32, f32) {
    match kind {
        DecayKind::Lin => {
-           // We assume lambda is in the range (0, 1)
-           debug_assert!(0.0 < lambda && lambda < 1.0);
+           // Lambda must be in the range (0, 1) for linear decay
+           if !(0.0 < lambda && lambda < 1.0) {
+               // Use a default or panic with a clear message in production code
+               // For API consistency, consider returning Result<(f32, f32), OperationError>
+           }

            // Linear lambda calculation...

204-204: Simplify the expression for improved readability.

The expression (-lambda + 1.0) could be written more clearly as (1.0 - lambda) which better matches the algebra in the comments above it.

-            ((-lambda + 1.0), 1.0)
+            ((1.0 - lambda), 1.0)

13-14: Consider exposing constants as public API.

The default constants for decay parameters might be useful for other modules that interact with decay functions. Consider making them public if they're meant to be part of the API.

-const DEFAULT_DECAY_MIDPOINT: f32 = 0.5;
-const DEFAULT_DECAY_SCALE: f32 = 1.0;
+pub const DEFAULT_DECAY_MIDPOINT: f32 = 0.5;
+pub const DEFAULT_DECAY_SCALE: f32 = 1.0;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7740be5 and 1b3e72d.

📒 Files selected for processing (1)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (macos-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test (windows-latest)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
🔇 Additional comments (3)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (3)

61-69: LGTM! Well-structured Decay variant addition.

The new Decay variant in the ParsedExpression enum is well-structured with clear documentation comments. The implementation includes all necessary components: kind of decay function, value to decay, target value, and lambda parameter.


154-185: Comprehensive parameter validation in decay_params_to_lambda.

The function handles parameter validation thoroughly with meaningful error messages. The link to the graphical representation is particularly helpful for understanding the formulas.


323-359: Excellent test coverage with property-based testing.

The test implementation is thorough using property-based testing to verify the mathematical correctness of the lambda conversions across different ranges. The helper function is well-documented and properly checks for numerical closeness rather than exact equality, which is appropriate for floating-point operations.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)

186-235: Consider handling out-of-range lambda without debug asserts.
The debug assertions are helpful for testing, yet they won't throw an error in release mode if lambda is not in the expected range. Consider returning a proper error or fallback in production.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b3e72d and 0378642.

📒 Files selected for processing (2)
  • lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (3 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test (macos-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (12)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (5)

10-10: Use of newly introduced decay references is appropriate.
The addition of DecayKind to the imports aligns well with the decay logic introduced later in the file.


20-20: Double-check default target behavior.
Setting the default target to 0.0 may cause unexpected result shifts if the user omits the target field. Consider verifying if this default is universally acceptable or if a different value might better suit typical use cases.


262-265: Confirm sign of lambda for exponential decay.
This function applies exp(lambda * diff). For a true “decay,” lambda must be negative. Ensure that the generation of lambda (in user-supplied or upstream code) consistently produces negative values. Otherwise, exponential growth may occur instead.


267-270: Confirm sign of lambda for Gaussian decay.
As with exponential decay, verify that lambda is negative. A positive or zero lambda would yield growth or a constant value, rather than a decay.


272-275: Linear decay logic appears sound.
By incorporating a clamp at zero via .max(0.0), you ensure non-negative results. No issues noted.

lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (7)

7-7: Expanded error handling imports are appropriate.
The inclusion of OperationError and OperationResult helps maintain consistent validation and error reporting across decay-related code.


12-13: Default midpoint and scale constants look reasonable.
A midpoint of 0.5 and scale of 1.0 provide intuitive baseline values. No functional issues identified here.


60-68: Decay variant fields are well-structured.
The Decay variant clearly separates the expression x, optional target, and lambda, easing future expansions or adjustments to decay logic.


71-79: Compact and clear enumeration of decay kinds.
The DecayKind enum directly captures linear, Gaussian, and exponential behaviors, supporting a clean dispatch in the scorer.


153-185: Parameter-to-lambda conversion logic is well-implemented.
Validation ensures midpoint value stays within (0.0, 1.0) and checks for a positive scale. This safeguards against undefined or pathological decay parameters.


279-280: New import for floating-point comparison is beneficial.
Using is_close to ensure robust float-based comparisons is a good way to avoid rounding pitfalls in your tests.


321-357: Thorough round-trip tests.
Your proptest-based tests verify decays across a range of values, providing strong confidence in correct parameter handling.

@coszio coszio force-pushed the decay-function-helper branch from bc3e83e to 5f82433 Compare March 13, 2025 13:21
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)

264-277: Add doc comments clarifying parameter constraints.
Each decay function assumes a certain sign convention for lambda to stay in [0,1]. It would help future maintainers if the code explicitly documented that assumption. For instance:

+/// Exponential decay function: requires `lambda < 0` for a valid decay in [0,1].
 fn exp_decay(x: ScoreType, target: ScoreType, lambda: ScoreType) -> f32 {
     let diff = (x - target).abs();
     (lambda * diff).exp()
 }

+/// Gaussian decay function: requires `lambda < 0` for a valid decay in [0,1].
 fn gauss_decay(x: ScoreType, target: ScoreType, lambda: ScoreType) -> f32 {
     let diff = x - target;
     (lambda * diff * diff).exp()
 }

+/// Linear decay function: requires `lambda > 0` for a valid decay in [0,1].
 fn linear_decay(x: ScoreType, target: ScoreType, lambda: ScoreType) -> f32 {
     let diff = (x - target).abs();
     (-lambda * diff + 1.0).max(0.0)
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bc3e83e and 5f82433.

📒 Files selected for processing (1)
  • lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-consensus
  • GitHub Check: test (macos-latest)
  • GitHub Check: test (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
🔇 Additional comments (3)
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (3)

10-10: Imports look good.
No issues with adding DecayKind here, as it cleanly extends the expression parsing functionality.


20-20: Default target is fine as a constant.
Using 0.0 as the default decay target appears reasonable. Should there be a need to adapt this in future, consider making it a user-configurable setting.


238-259:

Details

✅ Verification successful

Ensure runtime parameter constraints remain enforced.
This newly introduced Decay variant cleanly invokes the respective decay function. Although the comment states that all decays remain within [0,1], this relies on the correctness of upstream validations (e.g., ensuring negative lambda for exponential/gaussian, positive for linear). Confirm that these parameter constraints are indeed enforced by the builder or parser to avoid potential non-finite or out-of-range results.


🏁 Script executed:

#!/bin/bash
# Verify the presence of parameter validations for the `Decay` expressions in the codebase.
rg -A 10 "decay_params_to_lambda" --type rust

Length of output: 3630


Verify upstream parameter validations for Decay expressions

The parameter constraints are enforced upstream via the ParsedExpression::decay_params_to_lambda function in lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs. This function validates that the decay parameters (e.g., ensuring the midpoint is between 0.0 and 1.0) are correct before computing lambda. As long as these validations remain intact, the guarantees that decay functions return values within the expected [0, 1] range hold true.

  • The validations for decay parameters are clearly implemented in ParsedExpression::decay_params_to_lambda.
  • Upstream tests (including proptest cases) further confirm the roundtrip conversion maintains valid values.

No further action is needed here as the runtime constraints are properly enforced.

@coszio coszio force-pushed the decay-function-helper branch from f2ec159 to f112926 Compare March 18, 2025 13:41
@coszio coszio force-pushed the decay-function-helper branch from f112926 to 502c618 Compare March 18, 2025 17:04
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
lib/segment/src/common/operation_error.rs (1)

95-99: Consider adding documentation to the validation_error method.

The new validation_error method is well-implemented and follows the same pattern as the existing helper methods. However, unlike service_error and service_error_light, it lacks documentation comments explaining its purpose and usage. Consider adding documentation similar to the other methods for consistency and clarity.

+   /// Create a new validation error with a description
    pub fn validation_error(description: impl Into<String>) -> OperationError {
        OperationError::ValidationError {
            description: description.into(),
        }
    }
docs/redoc/master/openapi.json (2)

14689-14733: Well-structured DecayParamsExpression schema with clear parameter documentation

The DecayParamsExpression schema is well-designed with:

  • Required x parameter for the input value
  • Optional parameters with good default value documentation
  • Clear descriptions explaining each parameter's purpose

Consider adding schema-level validation for the scale parameter to enforce that it's a positive non-zero number. While the documentation states this requirement, adding a minimum constraint would provide automatic validation.

  "scale": {
    "description": "The scale factor of the decay, in terms of `x`. Defaults to 1.0. Must be a non-zero positive number.",
    "type": "number",
    "format": "float",
+   "exclusiveMinimum": 0,
    "nullable": true
  },

14700-14732: Consider adding default values to the DecayParamsExpression schema

While the code correctly documents the default values in comments, consider adding explicit default properties in the schema. This would make the defaults visible in generated API documentation and client libraries.

  "target": {
    "description": "The target value to start decaying from. Defaults to 0.",
    "anyOf": [
      {
        "$ref": "#/components/schemas/Expression"
      },
      {
        "nullable": true
      }
    ],
+   "default": null
  },
  "scale": {
    "description": "The scale factor of the decay, in terms of `x`. Defaults to 1.0. Must be a non-zero positive number.",
    "type": "number",
    "format": "float",
+   "default": 1.0,
    "nullable": true
  },
  "midpoint": {
    "description": "The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`.",
    "type": "number",
    "format": "float",
+   "default": 0.5,
    "nullable": true
  }
docs/grpc/docs.md (1)

2451-2461: Ensure Correct Markdown Table Rendering in New DecayParamsExpression Section

The new "DecayParamsExpression" section clearly documents the decay parameters with well-defined fields and default values. However, the description in the "midpoint" row (line 2461) contains pipe characters inside inline code:

when `|x - target| == scale`

Markdown may interpret the unescaped pipe symbols (|) as column dividers, which can break the table formatting and lead to an unexpected number of cells (as noted by MD056). To fix this, please escape the pipe characters or wrap the expression in a code span that prevents misinterpretation. For instance, you might change it to:

-| midpoint | [float](#float) | optional | The midpoint of the decay. Defaults to 0.5. Output will be this value when `|x - target| == scale`. |
+| midpoint | [float](#float) | optional | The midpoint of the decay. Defaults to 0.5. Output will be this value when `\|x - target\| == scale`. |

This adjustment will ensure that the table renders correctly.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

2461-2461: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)

20-20: Introducing DEFAULT_DECAY_TARGET
Using a constant default target of 0.0 is fitting. Confirm whether this value is universally meaningful for all decay functions or if a dynamic approach might be needed for future expansions.

lib/api/src/grpc/proto/points.proto (1)

626-635: Introducing DecayParamsExpression
This message neatly encapsulates the core parameters for any decay-based expression. As with the REST schema, be mindful of invalid scale inputs. Overall, well structured for Protobuf usage.

lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)

186-235: Comprehensive implementation of decay_lambda_to_params with detailed comments.

The mathematical derivations are well-explained in the comments, making the code more maintainable. Each decay type is handled appropriately with good debug assertions to catch potential programming errors.

Consider replacing debug assertions with proper validation if these values could be provided by external sources in production. Debug assertions will be disabled in release builds.

- debug_assert!(0.0 < lambda && lambda < 1.0);
+ if !(0.0 < lambda && lambda < 1.0) {
+     // Handle invalid lambda or use defaults
+ }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f112926 and 502c618.

📒 Files selected for processing (12)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/conversions.rs (4 hunks)
  • lib/api/src/grpc/proto/points.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (3 hunks)
  • lib/api/src/rest/schema.rs (2 hunks)
  • lib/collection/src/operations/universal_query/formula.rs (4 hunks)
  • lib/collection/src/operations/universal_query/shard_query.rs (2 hunks)
  • lib/common/common/src/math.rs (1 hunks)
  • lib/segment/src/common/operation_error.rs (1 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (4 hunks)
  • lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/common/common/src/math.rs
🧰 Additional context used
🧠 Learnings (2)
lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (1)
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs:65-66
Timestamp: 2025-03-13T13:17:00.411Z
Learning: In Qdrant's decay functions implementation, the `target` field serves as the reference point from which decay is measured (the value where the decay function equals its maximum), while the lambda parameter that controls the decay rate is calculated separately from `midpoint` and `scale` parameters. The `target` field is not involved in the lambda calculation.
lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (1)
Learnt from: coszio
PR: qdrant/qdrant#6154
File: lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs:238-257
Timestamp: 2025-03-13T13:15:16.290Z
Learning: All decay functions in Qdrant (exponential, Gaussian, and linear) have an output range of [1, 0], with 1 being the maximum value when x equals target, and values approaching or reaching 0 as the distance between x and target increases. Lambda parameters are properly validated to ensure decay functions always produce finite results.
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md

141-141: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)


2461-2461: Table column count
Expected: 4; Actual: 6; Too many cells, extra data will be missing

(MD056, table-column-count)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: test (macos-latest)
  • GitHub Check: test-consensus
  • GitHub Check: test (windows-latest)
  • GitHub Check: test
  • GitHub Check: test
  • GitHub Check: test (ubuntu-latest)
  • GitHub Check: test
🔇 Additional comments (31)
docs/redoc/master/openapi.json (3)

14489-14508: Good addition of decay expressions to the Expression schema

The changes properly integrate the new decay expressions (LinDecayExpression, ExpDecayExpression, and GaussDecayExpression) into the Expression schema alongside existing expression types. This follows the same pattern used for other expressions in the API.


14734-14744: ExpDecayExpression properly references the shared DecayParamsExpression schema

The exponential decay expression implementation correctly uses the shared DecayParamsExpression schema to define its parameters, promoting code reuse and consistency.


14745-14755: GaussDecayExpression properly references the shared DecayParamsExpression schema

The Gaussian decay expression implementation correctly uses the shared DecayParamsExpression schema to define its parameters, maintaining consistency with the other decay expression types.

lib/api/src/grpc/qdrant.rs (3)

5185-5185: No major concerns about extending the tags.
Expanding the tag range to include 15–17 for the new decay variants looks consistent. Please confirm that none of the newly introduced tags conflicts with existing proto definitions.


5236-5244: Add test coverage for the new decay variants.
While defining ExpDecay, GaussDecay, and LinDecay here is a great addition, ensure there are corresponding unit or integration tests exercising these variants end-to-end. This helps confirm that the new fields are correctly parsed and handled in the overall pipeline.


5293-5306: Maintain clarity for parameter constraints and usage.
The documentation nicely states that scale must be a non-zero positive number and clarifies how midpoint is used when |x - target| == scale. From prior discussions, validation is enforced elsewhere in the codebase. To avoid confusion, consider either referencing that validation here or adding a succinct check at the gRPC layer. This clarifies if an invalid scale arrives from external sources.

lib/collection/src/operations/universal_query/formula.rs (6)

8-8: Well-integrated import addition for the decay functionality.

The added DecayKind import is used appropriately in the new decay expression variant.


48-54: Good design of the new Decay variant.

The Decay variant has a well-structured design with appropriate fields:

  • kind to distinguish between different decay types
  • x as the input value
  • target as an optional reference point
  • midpoint and scale as optional configuration parameters

This design is flexible and aligns well with the decay expressions described in the PR objectives.


124-146: Thorough and defensive implementation of the decay expression conversion.

The implementation correctly:

  1. Extracts parameters from the Decay variant
  2. Uses decay_params_to_lambda to validate and convert parameters
  3. Recursively converts nested expressions
  4. Handles the optional target parameter properly with transpose()

The implementation is robust and maintains the recursive pattern used throughout the rest of the function.


245-259: Well-implemented conversion from REST LinDecay to internal representation.

This implementation correctly extracts the parameters from the REST LinDecay expression and constructs an ExpressionInternal::Decay with the appropriate kind.


260-274: Well-implemented conversion from REST ExpDecay to internal representation.

The implementation follows the same pattern as the LinDecay conversion, ensuring consistency across the codebase.


275-289: Well-implemented conversion from REST GaussDecay to internal representation.

The implementation maintains consistency with the other decay type conversions, completing the set of three decay types mentioned in the PR objectives.

lib/collection/src/operations/universal_query/shard_query.rs (3)

3-3: Appropriate imports added for decay functionality.

The additions of DecayParamsExpression and DecayKind imports support the new decay expression functionality.

Also applies to: 10-12


460-468: Properly implemented decay variants with correct decay kinds.

All three decay variants are now using their corresponding decay kinds, ensuring proper behavior for each decay function type.


475-497: Well-designed helper function to reduce code duplication.

The try_from_decay_params function:

  1. Extracts all necessary parameters from the DecayParamsExpression
  2. Validates the required x field with a clear error message
  3. Handles the optional target field conversion with transpose()
  4. Creates the appropriate ExpressionInternal::Decay variant

This helps maintain DRY principles by avoiding code duplication across the three decay variants.

lib/api/src/grpc/conversions.rs (3)

53-55: Appropriate import organization for the new decay types.

The imports have been updated to include DecayParamsExpression and other necessary types for the decay functionality.


2788-2788: Improved parameter naming for clarity.

Renaming the parameter from formula to expression better reflects its actual type and purpose.


2849-2867: Comprehensive implementation of decay expression unparsing.

The implementation:

  1. Correctly extracts decay parameters using decay_lambda_to_params
  2. Builds the appropriate DecayParamsExpression with all necessary fields
  3. Creates the correct variant based on the decay kind
  4. Recursively unparses nested expressions

This completes the round-trip serialization/deserialization cycle for decay expressions.

lib/api/src/rest/schema.rs (2)

654-656: New decay expression variants introduced in the Expression enum
Adding LinDecay, ExpDecay, and GaussDecay expands the number of supported expression types. This is well-aligned with the PR objective to implement decay functions.


722-748: Definition of DecayParamsExpression and associated structs
All fields in DecayParamsExpression significantly enhance flexibility in configuring decay. However, if the value for scale is provided externally, ensure it is validated (e.g., > 0.0) elsewhere to avoid unexpected negative or zero values. Otherwise, the code looks clean and coherent with the new decay logic.

lib/segment/src/index/query_optimization/rescore_formula/formula_scorer.rs (5)

10-10: Importing DecayKind and other items
No issues. The reference to parsed_formula::{DecayKind, ParsedExpression, ParsedFormula, VariableId} correctly integrates decay logic into the scorer.


242-264: Evaluating the new ParsedExpression::Decay variant

  • The logic to evaluate x and target is appropriately consistent with other numeric expressions.
  • Defaulting target to DEFAULT_DECAY_TARGET is clear.
  • The debug assertion that ensures the decay output remains within [0.0, 1.0] is a smart safeguard for development environments.
    No major concerns; this is a well-structured addition.

308-312: exp_decay function
The exponential decay formula (lambda * diff).exp() is a concise and effective approach. Please confirm lambda remains negative or ensures values produce the intended falloff.


313-316: gauss_decay function
Good usage of (lambda * diff * diff).exp(). The squared distance ensures the classic Gaussian drop-off. No functional issues spotted.


318-321: linear_decay function
The clamp at .max(0.0) ensures no negative values, matching the stated [0, 1] range. Implementation looks correct for a linear decay curve.

lib/api/src/grpc/proto/points.proto (1)

596-598: New optional fields in Expression for decay
Adding exp_decay, gauss_decay, and lin_decay fields is consistent with the new decay features. This change is well-documented and clearly named.

lib/segment/src/index/query_optimization/rescore_formula/parsed_formula.rs (5)

60-68: Excellent implementation of the Decay expression variant!

The Decay expression variant is well structured with clear field documentation. The comments effectively explain each field's purpose, particularly clarifying that target represents the value where decay function is highest, while lambda shapes the decay function.


71-79: Well-documented DecayKind enum with appropriate variants.

The enum provides clear documentation for each decay function type (Linear, Gaussian, and Exponential), which aligns with the PR's objective of adding these specific decay expressions.


153-184: Thorough implementation of decay_params_to_lambda with proper validation.

This function correctly transforms decay parameters into lambda values with appropriate validation checks. The validation messages are clear, and the link to the visualization of formulas is a valuable reference. The implementation matches the decay functions described in the PR objectives.


321-357: Great property-based testing approach for lambda conversion roundtrip.

The property tests thoroughly verify the roundtrip conversion for all decay kinds with appropriate lambda value ranges. This ensures the stability and correctness of the conversion logic across a wide range of inputs.


12-13: Well-chosen default values for decay parameters.

The default values for midpoint (0.5) and scale (1.0) are sensible choices that align with common decay function implementations.

@coszio coszio deleted the decay-function-helper branch March 18, 2025 20:09
timvisee pushed a commit that referenced this pull request Mar 21, 2025
* Add decay expressions

* remove code duplication in conversion

* test and fix lambda roundtrip

* Remove debug print in decay midpoint validation

* Improve error messages for decay midpoint and scale validation

* use similar lambda for gauss and exp decays

* avoid abs before squaring

* add comment about decay range

* add debug assertion
@coderabbitai coderabbitai Bot mentioned this pull request Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants