Skip to content

Fix: attach data_quality_issues for list payloads in wrapper validation#1151

Merged
VaibhavUpreti merged 1 commit intoTracer-Cloud:mainfrom
emefienem:fix-list-aggregation-bug
Apr 30, 2026
Merged

Fix: attach data_quality_issues for list payloads in wrapper validation#1151
VaibhavUpreti merged 1 commit intoTracer-Cloud:mainfrom
emefienem:fix-list-aggregation-bug

Conversation

@emefienem
Copy link
Copy Markdown
Contributor

Fixes #1113

Describe the changes you have made in this PR -

This PR fixes a bug in validate_host_metrics() where list-based payloads
did not properly aggregate and return data_quality_issues at the root level.

Changes made:

  • Updated list handling logic in validate_host_metrics
  • Ensured each list item is fully validated using validate_metrics
  • Aggregated all data_quality_issues from list items into root response
  • Ensured existing flat and nested structure behavior remains unchanged

Demo/Screenshot for feature changes and bug fixes -

Screenshot 2026-04-30 at 5 20 21 PM

Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

  • No, I wrote all the code myself
  • [] Yes, I used AI assistance (continue below)

If you used AI assistance:

  • [] I have reviewed every single line of the AI-generated code
  • [] I can explain the purpose and logic of each function/component I added
  • I have tested edge cases and understand how the code handles them
  • I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:


Checklist before requesting a review

  • I have added proper PR title and linked to the issue
  • I have performed a self-review of my code
  • I can explain the purpose of every function, class, and logic block I added
  • I understand why my changes work and have tested them thoroughly
  • I have considered potential edge cases and how my code handles them
  • If it is a core feature, I have added thorough tests
  • My code follows the project's style guidelines and conventions

Note: Please check Allow edits from maintainers if you would like us to assist in the PR.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 30, 2026

Greptile Summary

This PR fixes the bug in validate_host_metrics where data_quality_issues were never surfaced at the root level for list-based payloads (previously collected by iterating validated_data after _validate_flat_metrics, which never writes that key to individual points).

  • P1 — structural regression in list items: Switching to validator.validate_metrics(data_point.copy()) for each flat data point triggers _validate_cpu_metric and _validate_disk_metric on scalar values, wrapping cpu and disk with {"raw": ..., "validated": False} dicts instead of preserving the original integers. A simpler fix would be to keep _validate_flat_metrics per item and collect issues from validator.issues after each call, avoiding the double-dispatch into the nested-structure branches.
  • P1 — test gap: test_wrapper_list_structure_payload only asserts "data_quality_issues" in result and does not assert the preserved scalar types of cpu/disk inside result["data"], so the regression passes undetected.

Confidence Score: 3/5

Not safe to merge as-is — the fix introduces a structural regression that changes the shape of every validated list item.

Two P1 findings: the switch to validate_metrics on individual flat items silently replaces scalar cpu/disk values with {"raw": ..., "validated": False} dicts, and the new test only checks for the presence of data_quality_issues rather than verifying the unchanged shape of the data items, leaving the regression undetected.

app/tools/utils/data_validation.py (line 398) and tests/tools/utils/test_data_validation.py (lines 129–138) both need attention before merging.

Important Files Changed

Filename Overview
app/tools/utils/data_validation.py The switch from _validate_flat_metrics to validate_metrics for each list item correctly aggregates data_quality_issues, but validate_metrics also runs _validate_cpu_metric/_validate_disk_metric on scalar values, wrapping them in {"raw": ..., "validated": False} dicts and silently changing the data shape of every list item.
tests/tools/utils/test_data_validation.py Removes the xfail marker from test_wrapper_list_structure_payload, but the test only asserts that data_quality_issues is present, leaving the structural regression in individual data-point fields uncaught.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["validate_host_metrics(metrics)"] --> B{Is metrics a dict?}
    B -- No --> C["Return error with data_quality_issues"]
    B -- Yes --> D{"'data' key present and is a list?"}
    D -- No --> E["validator.validate_metrics(metrics) — flat/nested path"]
    D -- Yes --> F["Loop over data points"]
    F --> G{Is data_point a dict?}
    G -- No --> H["Append as-is"]
    G -- Yes --> I["validator.validate_metrics(data_point) ⚠️ wraps scalar cpu/disk"]
    I --> J{"data_quality_issues in validated_point?"}
    J -- Yes --> K["Extend all_issues; del from validated_point"]
    J -- No --> L["Skip"]
    K --> M["Append validated_point"]
    L --> M
    H --> M
    M --> F
    F --> N["result = metrics.copy(); result['data'] = validated_data"]
    N --> O{any all_issues?}
    O -- Yes --> P["result['data_quality_issues'] = all_issues"]
    O -- No --> Q["Return result"]
    P --> Q
Loading

Comments Outside Diff (1)

  1. tests/tools/utils/test_data_validation.py, line 129-138 (link)

    P1 Test coverage is too narrow to catch the structural regression

    test_wrapper_list_structure_payload only asserts "data_quality_issues" in result. It does not verify the shape of the individual data points inside result["data"] — in particular that cpu and disk remain scalars and are not wrapped in {"raw": ..., "validated": False} dicts. Adding assertions such as isinstance(result["data"][0]["cpu"], (int, float)) and isinstance(result["data"][0]["disk"], (int, float)) would have caught the regression introduced by the change on line 398.

Reviews (1): Last reviewed commit: "Fix: attach data_quality_issues for list..." | Re-trigger Greptile

if isinstance(data_point, dict):
# Validate each data point
validated_point = validator._validate_flat_metrics(data_point.copy())
validated_point = validator.validate_metrics(data_point.copy())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 validate_metrics mutates scalar cpu/disk in each list item

Switching from _validate_flat_metrics to validate_metrics fixes issue aggregation, but introduces a structural regression. validate_metrics runs _validate_cpu_metric and _validate_disk_metric on any key named "cpu" or "disk", and both methods return {"raw": <value>, "validated": False} when the input is not a dict (i.e. a plain integer). For a flat data point like {"cpu": 95, "ram": 8589934592, "disk": 50}, the returned point inside validated_data will have cpu and disk replaced with nested dicts instead of their original scalar values — silently changing the shape of every list item in the response.

@muddlebee
Copy link
Copy Markdown
Collaborator

@emefienem review points, CI checks

Copy link
Copy Markdown
Member

@VaibhavUpreti VaibhavUpreti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome @emefienem merging this for velocity!

@VaibhavUpreti VaibhavUpreti merged commit bee6d61 into Tracer-Cloud:main Apr 30, 2026
10 checks passed
@github-actions
Copy link
Copy Markdown
Contributor

😤 @emefienem said "I will fix this" and then actually fixed it. Legendary behavior.


👋 Join us on Discord - OpenSRE : hang out, contribute, or hunt for features and issues. Everyone's welcome.

@emefienem
Copy link
Copy Markdown
Contributor Author

awesome @emefienem merging this for velocity!

🫡

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix validate_host_metrics() list-response aggregation bug

3 participants