fix: add is_available and extract_params to MongoDB, ClickHouse, Kafka, and CloudWatch tools #732

Merged
VaibhavUpreti merged 3 commits into Tracer-Cloud:main from devankitjuneja:fix/730-tool-is-available-leak on Apr 21, 2026

@devankitjuneja
Contributor

Fixes #730

Describe the changes you have made in this PR -

10 tools defaulted to _always_available, so they appeared in the planner for every investigation regardless of pipeline context. In pipelines like rds-postgres-synthetic (Grafana-only), the LLM would select MongoDB, ClickHouse, Kafka, and CloudWatch tools; these failed with TypeError: missing required argument, and the agent wasted investigation loops on dead-end actions.

Root cause: BaseTool.is_available() returns True by default. The requires=[...] field on @tool is documentation-only — not wired to the availability check.
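
To illustrate the root cause, here is a minimal sketch. The `BaseTool` and `tool` definitions below are simplified stand-ins written for this example, not the project's actual code; the only behaviour taken from the description is that `is_available()` returns True by default and that `requires` is never consulted.

```python
from __future__ import annotations

from typing import Any, Callable, Optional


class BaseTool:
    """Hypothetical stand-in for the project's BaseTool."""

    requires: list[str] = []

    def is_available(self, sources: dict[str, dict[str, Any]]) -> bool:
        # Default behaviour: always available, whatever `requires` lists.
        return True


def tool(
    *,
    requires: Optional[list[str]] = None,
    is_available: Optional[Callable[[dict], bool]] = None,
):
    """Hypothetical decorator: stores `requires` but never checks it."""

    def wrap(cls):
        cls.requires = list(requires or [])  # documentation only
        if is_available is not None:
            # Only an explicit is_available callback changes the filter result.
            cls.is_available = lambda self, sources: is_available(sources)
        return cls

    return wrap


@tool(requires=["mongodb"])
class LeakyMongoTool(BaseTool):
    pass


@tool(
    requires=["mongodb"],
    is_available=lambda s: bool(s.get("mongodb", {}).get("connection_verified")),
)
class GatedMongoTool(BaseTool):
    pass


grafana_only = {"grafana": {"connection_verified": True}}
leaky_visible = LeakyMongoTool().is_available(grafana_only)
gated_visible = GatedMongoTool().is_available(grafana_only)
```

In this sketch the tool that only declares `requires` stays visible in a Grafana-only pipeline, while the one with an explicit callback is hidden.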

Changes:

  • Added mongodb_is_available + mongodb_extract_params to app/integrations/mongodb.py
  • Added clickhouse_is_available + clickhouse_extract_params to app/integrations/clickhouse.py
  • Added kafka_is_available + kafka_extract_params to app/integrations/kafka.py
  • Added cloudwatch_is_available to app/tools/utils/availability.py (no extract_params — CloudWatch uses IAM auth, job_queue is alert-specific)
  • Wired is_available + extract_params into all 10 affected tool decorators

extract_params follows the established PostgreSQL/MongoDB pattern — credentials are auto-filled from resolved integrations so the LLM never needs to supply connection_string, host, or bootstrap_servers. ClickHouse and Kafka detect_sources.py wiring is tracked in a follow-up issue; the helpers are ready to activate once that lands.
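
A rough sketch of the extract_params idea for Kafka follows. The function names, the shape of the `sources` dict, and the merge order (integration values override LLM-supplied arguments) are assumptions made for illustration; only the `bootstrap_servers` and `group_id` parameter names come from this PR.

```python
from typing import Any, Callable, Dict


def kafka_extract_params_sketch(sources: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Pull connection details from the resolved Kafka integration (assumed shape)."""
    kafka = sources.get("kafka", {})
    return {"bootstrap_servers": kafka.get("bootstrap_servers", "")}


def call_tool(
    run: Callable[..., str],
    sources: Dict[str, Dict[str, Any]],
    llm_args: Dict[str, Any],
    extract_params: Callable[[Dict[str, Dict[str, Any]]], Dict[str, Any]],
) -> str:
    # The LLM supplies only alert-specific arguments; credentials are merged in
    # afterwards so they win over anything the LLM might have guessed.
    kwargs = {**llm_args, **extract_params(sources)}
    return run(**kwargs)


def consumer_group_lag(bootstrap_servers: str, group_id: str) -> str:
    """Hypothetical tool body that needs both a credential and an alert-specific arg."""
    return f"lag for {group_id} via {bootstrap_servers}"


sources = {"kafka": {"connection_verified": True, "bootstrap_servers": "broker-1:9092"}}
result = call_tool(
    consumer_group_lag,
    sources,
    llm_args={"group_id": "billing"},
    extract_params=kafka_extract_params_sketch,
)
```

Without the merge step, calling `consumer_group_lag(group_id="billing")` alone would raise the TypeError for a missing required argument described above.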

Screenshots of the UI changes (If any) -

N/A


Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

  • No, I wrote all the code myself
  • Yes, I used AI assistance (continue below)

If you used AI assistance:

  • I have reviewed every single line of the AI-generated code
  • I can explain the purpose and logic of each function/component I added
  • I have tested edge cases and understand how the code handles them
  • I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:

The planner filters tools via action.is_available(available_sources) before passing them to the LLM. Tools without a custom is_available always pass this filter. The fix adds named helper functions in each integration module (matching the pattern already used by PostgreSQL and MongoDB tools) and wires them into the @tool decorator. extract_params auto-fills connection credentials from available_sources so the LLM receives only alert-specific parameters it can reason about.
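
The filtering step described above can be sketched roughly as follows. `Action`, `plan_candidates`, and the tool names are hypothetical; the sketch only assumes that each action exposes an `is_available(available_sources)` callable and that the default accepts everything.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Sources = Dict[str, Dict[str, Any]]


def _always_available(sources: Sources) -> bool:
    # Default used when a tool defines no custom check.
    return True


@dataclass
class Action:
    name: str
    is_available: Callable[[Sources], bool] = _always_available


def plan_candidates(actions: List[Action], available_sources: Sources) -> List[str]:
    """Return the names of actions the planner would hand to the LLM."""
    return [a.name for a in actions if a.is_available(available_sources)]


actions = [
    Action("grafana_logs", lambda s: "grafana" in s),
    Action("mongodb_server_status"),  # no custom check: leaks into every plan
    Action(
        "kafka_topic_health",
        lambda s: bool(s.get("kafka", {}).get("connection_verified")),
    ),
]

visible = plan_candidates(actions, {"grafana": {"connection_verified": True}})
```

Here the ungated MongoDB action passes the filter in a Grafana-only pipeline, while the gated Kafka action does not.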


Checklist before requesting a review

  • I have added proper PR title and linked to the issue
  • I have performed a self-review of my code
  • I can explain the purpose of every function, class, and logic block I added
  • I understand why my changes work and have tested them thoroughly
  • I have considered potential edge cases and how my code handles them
  • If it is a core feature, I have added thorough tests
  • My code follows the project's style guidelines and conventions

Note: Please check Allow edits from maintainers if you would like us to assist in the PR.

…a, and CloudWatch tools (Tracer-Cloud#730)

10 tools defaulted to _always_available, causing them to appear in the
planner for every pipeline regardless of context. The LLM would select
them, they would fail with TypeError on missing required arguments, and
the agent wasted investigation loops on dead-end actions.

- Add mongodb_is_available + mongodb_extract_params to integrations/mongodb.py
- Add clickhouse_is_available + clickhouse_extract_params to integrations/clickhouse.py
- Add kafka_is_available + kafka_extract_params to integrations/kafka.py
- Add cloudwatch_is_available to tools/utils/availability.py
- Wire is_available + extract_params into all 10 affected tool decorators

extract_params ensures credentials are auto-filled from resolved
integrations so the LLM never needs to supply connection strings.
ClickHouse and Kafka source blocks in detect_sources.py are tracked
separately; the helpers are ready to activate once that wiring lands.
@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

This PR fixes a planner-pollution bug where 10 tools (MongoDB ×5, ClickHouse ×2, Kafka ×2, CloudWatch ×1) defaulted to BaseTool.is_available() → True, causing the LLM to select them even in pipelines that had none of those integrations configured, wasting investigation loops on dead-end TypeError failures.

Key changes:

  • mongodb_is_available gates on connection_verified (confirmed written by detect_sources.py at line 1099 for MongoDB integrations)
  • mongodb_database_is_available added for MongoDBProfilerTool and MongoDBCollectionStatsTool, which require a database name — prevents those tools appearing when only a bare connection string is configured
  • clickhouse_is_available / kafka_is_available gate on connection_verified; per the PR description, these tools will remain hidden until the ClickHouse/Kafka detect_sources.py wiring lands in a follow-up, which is a safe interim state
  • cloudwatch_is_available correctly gates on the presence of the cloudwatch source key (not connection_verified, which IAM-based auth never writes) — tool activates only when cloudwatch_log_group is present in alert annotations
  • All three previous review concerns (MongoDB connection_verified inconsistency, profiler/collection-stats missing-database footgun, CloudWatch always-False) are addressed in this revision

Confidence Score: 4/5

Safe to merge; all three previously flagged issues are resolved and the implementation is consistent with established patterns.

The core bug is properly fixed. mongodb_is_available now uses connection_verified (matching ClickHouse/Kafka), the database-gated helpers correctly guard profiler/collection-stats tools, and CloudWatch uses key-presence (correct for IAM auth). The only open item is a minor robustness nit on the int() port cast in clickhouse_extract_params, which is a P2 style suggestion rather than a blocking issue.
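
One possible way to harden the port cast is sketched below. The helper name and the fallback of 8123 (ClickHouse's default HTTP port) are assumptions for illustration; the actual default used by clickhouse_extract_params is not shown in this PR.

```python
from typing import Any


def safe_port(value: Any, default: int = 8123) -> int:
    """Convert a configured port to int, falling back to `default` on bad input."""
    try:
        port = int(value)
    except (TypeError, ValueError):
        # None, empty strings, or non-numeric values end up here.
        return default
    # Reject values outside the valid TCP port range.
    return port if 0 < port < 65536 else default


ports = [safe_port("9000"), safe_port(None), safe_port("abc"), safe_port("70000")]
```

A bare `int(value)` would raise on the second and third inputs, which is the failure mode the nit refers to.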

app/integrations/clickhouse.py — minor port-cast robustness nit; all other files are clean.

Important Files Changed

| Filename | Overview |
| --- | --- |
| app/integrations/clickhouse.py | Adds clickhouse_is_available (gates on connection_verified) and clickhouse_extract_params; minor robustness issue with unguarded int() port cast. |
| app/integrations/kafka.py | Adds kafka_is_available and kafka_extract_params following the established pattern; clean implementation. |
| app/integrations/mongodb.py | Adds mongodb_is_available (gates on connection_verified, which detect_sources.py does write at line 1099), mongodb_database_is_available for profiler/collection-stats tools, and mongodb_extract_params; previous thread concerns resolved. |
| app/tools/utils/availability.py | Adds cloudwatch_is_available gating on key presence (correct for IAM-based auth where connection_verified is never written); previous thread concern about always-False resolved. |
| app/tools/MongoDBProfilerTool/`__init__.py` | Correctly wires mongodb_database_is_available to prevent the tool appearing in the planner without a configured database. |
| app/tools/MongoDBCollectionStatsTool/`__init__.py` | Correctly wires mongodb_database_is_available; collection param remains LLM-supplied (alert-specific), consistent with other alert-specific params. |
| app/tools/CloudWatchBatchMetricsTool/`__init__.py` | Wires cloudwatch_is_available; no extract_params (correct, IAM auth, job_queue is alert-specific). |
| app/tools/KafkaConsumerGroupTool/`__init__.py` | Wires kafka_is_available and kafka_extract_params; group_id correctly left for LLM to supply. |
| app/tools/KafkaTopicHealthTool/`__init__.py` | Clean wiring of kafka_is_available and kafka_extract_params. |
| app/tools/ClickHouseQueryActivityTool/`__init__.py` | Clean wiring of clickhouse_is_available and clickhouse_extract_params. |
| app/tools/ClickHouseSystemHealthTool/`__init__.py` | Clean wiring of clickhouse_is_available and clickhouse_extract_params. |

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Planner: filter tools via is_available] --> B{sources dict}

    B --> C[sources.get 'mongodb']
    C --> D{connection_verified?}
    D -- No --> E[Tools hidden from planner]
    D -- Yes --> F{database present?}
    F -- No --> G[ServerStatus / CurrentOps / ReplicaStatus visible]
    F -- Yes --> H[All 5 MongoDB tools visible]

    B --> I[sources.get 'clickhouse']
    I --> J{connection_verified?\ndetect_sources wiring pending}
    J -- No --> K[Tools hidden until follow-up]
    J -- Yes --> L[ClickHouse tools visible]

    B --> M[sources.get 'kafka']
    M --> N{connection_verified?\ndetect_sources wiring pending}
    N -- No --> O[Tools hidden until follow-up]
    N -- Yes --> P[Kafka tools visible]

    B --> Q[sources.get 'cloudwatch']
    Q -- key absent --> R[Tool hidden]
    Q -- key present\nlog_group in annotations --> S[CloudWatchBatchMetricsTool visible\nLLM supplies job_queue]
```

Reviews (3): Last reviewed commit: "fix: cloudwatch_is_available should chec..."

Comment thread app/integrations/mongodb.py Outdated
Comment thread app/integrations/mongodb.py
…tion_verified and add database guard

- mongodb_is_available now checks connection_verified (consistent with
  clickhouse_is_available and kafka_is_available; connection_verified is
  set by detect_sources when MongoDB is configured and validated)
- Add mongodb_database_is_available for tools that require a database
  name (MongoDBProfilerTool, MongoDBCollectionStatsTool); without it
  those tools would appear in the planner and return an error when no
  database is configured, repeating the original bug in a narrower scope
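
The two-level MongoDB gating described in this commit might look roughly like the sketch below. The `_sketch` function names and the `database` key are assumptions; `connection_verified` is the key named in the commit message.

```python
from typing import Any, Dict

Sources = Dict[str, Dict[str, Any]]


def mongodb_is_available_sketch(sources: Sources) -> bool:
    """Gate for tools that only need a verified connection."""
    return bool(sources.get("mongodb", {}).get("connection_verified"))


def mongodb_database_is_available_sketch(sources: Sources) -> bool:
    """Stricter gate for tools that also need a database name (assumed key)."""
    mongodb = sources.get("mongodb", {})
    return bool(mongodb.get("connection_verified") and mongodb.get("database"))


bare_connection = {"mongodb": {"connection_verified": True}}
with_database = {"mongodb": {"connection_verified": True, "database": "orders"}}

checks = (
    mongodb_is_available_sketch(bare_connection),
    mongodb_database_is_available_sketch(bare_connection),
    mongodb_database_is_available_sketch(with_database),
)
```

With only a bare connection, the general gate passes but the database gate does not, so the profiler and collection-stats tools would stay hidden.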
@devankitjuneja
Contributor Author

@greptile-apps re-review

Comment thread app/tools/utils/availability.py Outdated
…nection_verified

detect_sources.py populates sources["cloudwatch"] from alert annotations
(cloudwatch_log_group) but never writes connection_verified — CloudWatch
uses ambient IAM auth, not an API key. Checking connection_verified caused
the tool to be permanently hidden even when CloudWatch context was present.
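
A small sketch contrasting the two checks is shown here. The function names and the `log_group` key inside the source entry are assumptions; what is taken from the commit message is that the cloudwatch entry is populated from alert annotations and never carries connection_verified.

```python
from typing import Any, Dict

Sources = Dict[str, Dict[str, Any]]


def cloudwatch_is_available_old(sources: Sources) -> bool:
    # Earlier revision: never true, because IAM-based auth writes no such flag.
    return bool(sources.get("cloudwatch", {}).get("connection_verified"))


def cloudwatch_is_available_sketch(sources: Sources) -> bool:
    # Revised check: a non-empty cloudwatch entry is enough.
    return bool(sources.get("cloudwatch"))


with_context = {"cloudwatch": {"log_group": "/aws/batch/job"}}

old_result = cloudwatch_is_available_old(with_context)
new_result = cloudwatch_is_available_sketch(with_context)
absent_result = cloudwatch_is_available_sketch({"grafana": {}})
```

Whether an empty cloudwatch entry should count as present is a design choice; this sketch treats it as absent.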
@devankitjuneja
Contributor Author

@greptile-apps re-review

Member

@VaibhavUpreti left a comment

awesome @devankitjuneja! 🚀 Good catch with the bug

@VaibhavUpreti merged commit 0cafa99 into Tracer-Cloud:main on Apr 21, 2026
7 checks passed

Development

Successfully merging this pull request may close these issues.

[BUG] 10 tools without is_available leak into unrelated pipeline planners
