feat: Victoria Logs Integration (resolves #126) by yashksaini-coder · Pull Request #1060 · Tracer-Cloud/opensre

yashksaini-coder · 2026-04-28T19:58:37Z

Summary

Adds a complete, production-ready VictoriaLogs integration to OpenSRE. This is an alternative to #663, addressing all P1/P2 defects identified by Greptile, Copilot, and maintainers before merging.

Closes #126

What is new

Component	File
Pydantic config model	`app/integrations/models.py`
Catalog classification + env loading	`app/integrations/catalog.py`
Verification logic	`app/integrations/verify.py`
HTTP client (sync + async)	`app/services/victoria_logs/client.py`
LangGraph tool	`app/tools/VictoriaLogsTool/__init__.py`
Evidence source type	`app/types/evidence.py`
Documentation	`docs/victoria_logs.mdx`
Tests	`tests/integrations/test_victoria_logs.py`, `tests/tools/test_victoria_logs_tool.py`

Issues resolved from #663 review

P1 — Tool permanently non-functional (catalog wiring)

victoria_logs was absent from both direct_services and load_env_integrations(). Added to both so DB-stored and env-var-backed configs both work.

P1 — `victoria_logs` missing from `EffectiveIntegrations`

EffectiveIntegrations uses extra="forbid". Added victoria_logs: EffectiveIntegrationEntry | None = None.

P1 — `integration_id` injection crashing DB-configured integrations

VictoriaLogsIntegrationConfig had no integration_id field, causing silent ValidationError on any DB-stored integration. Added the field with a default of "".

P1 — `extract_params()` returned `{}` causing `TypeError` at runtime

The investigation executor calls tool.run(**tool.extract_params(sources)). Implemented extract_params() to return query, limit, and start from sources.

P1 — `"victoria_logs"` not a valid `EvidenceSource`

BaseTool.__init_subclass__() validates this at import time. Added "victoria_logs" to the EvidenceSource Literal in app/types/evidence.py.

P2 — `import contextlib` inside NDJSON parsing loop

Moved to module-level imports alongside json and httpx.

P2 — `tenant_id` defaulting to `"0"` always sending `AccountID` header

Changed to str | None = None. Header is now only sent when explicitly configured.

Unrelated changes removed

app/constants/prompts.py and app/nodes/extract_alert/extract.py prompt/routing changes removed per maintainer request (#663 review).

Async support

Added async_query_logs() via httpx.AsyncClient to avoid blocking the event loop in async LangGraph nodes. Sync method retained for verification flows.

Quality gates (all green)

make lint         — 0 ruff errors
make format-check — all 909 files formatted
make typecheck    — 0 new mypy errors
make test-cov     — 3323 passed, 2 skipped, 1 xfailed
VictoriaLogs tests — 9/9 passed

Environment variables

VICTORIA_LOGS_URL="http://localhost:9428"      # or VICTORIA_LOGS_BASE_URL
VICTORIA_LOGS_TENANT_ID="your-tenant-id"       # optional, for multi-tenant deployments

Co-authored-by: Copilot [email protected]

mintlify · 2026-04-28T19:58:53Z

Preview deployment for your docs. Learn more about Mintlify Previews.

Project	Status	Preview	Updated (UTC)
tracer	🟢 Ready	View Preview	Apr 28, 2026, 7:59 PM

💡 Tip: Enable Workflows to automatically generate PRs for you.

Adds a complete, production-ready VictoriaLogs integration: - VictoriaLogsIntegrationConfig Pydantic model with integration_id support - Catalog classification + env loading (VICTORIA_LOGS_URL / VICTORIA_LOGS_BASE_URL) - _verify_victoria_logs() wired into verify and CORE_VERIFY_SERVICES - VictoriaLogsClient with sync and async query methods using LogsQL - VictoriaLogsTool with working is_available() and extract_params() - Added 'victoria_logs' to EvidenceSource Literal - victoria_logs field added to EffectiveIntegrations model - Documentation in docs/victoria_logs.mdx - 9 tests covering integration config, classification, verify, and tool behavior All quality gates pass: lint, format, typecheck, test-cov (3323 passed).

greptile-apps · 2026-04-28T20:02:57Z

Greptile Summary

This PR adds a production-ready VictoriaLogs integration — config model, catalog wiring, verification, an HTTP client (sync + async), a LangGraph tool, and tests. The integration layer (catalog.py, models.py, verify.py, client.py) is well-structured and follows existing patterns. Two P1 regressions need to be resolved before merging:

Tool always returns "base_url is required" at runtime: VictoriaLogsTool.extract_params returns only {query, limit, start} but run() reads base_url/tenant_id from kwargs["sources"], which the investigation executor never populates. All connection config must flow through extract_params (see AlertmanagerAlertsTool for the correct pattern).
Alertmanager classification removed from extraction prompt: The alert_source rule for Prometheus/Alertmanager payloads was present in the base branch but is absent in this PR, causing existing Alertmanager routing to silently break.

Confidence Score: 3/5

Not safe to merge — the VictoriaLogs tool will never function via the investigation executor, and existing Alertmanager routing is regressed

Two P1 defects: the tool's extract_params/run contract is broken (always returns base_url error at runtime), and the alertmanager classification rule was removed from the LLM extraction prompt — a regression for all existing Alertmanager users unrelated to this feature

app/tools/VictoriaLogsTool/init.py (extract_params/run mismatch) and app/nodes/extract_alert/extract.py (dropped alertmanager classification rule)

Important Files Changed

Filename	Overview
app/tools/VictoriaLogsTool/init.py	New LangGraph tool for VictoriaLogs; `extract_params` returns only query params but `run()` reads connection config from `kwargs["sources"]` which the executor never populates — tool will always fail at runtime
app/nodes/extract_alert/extract.py	Alertmanager classification rule accidentally removed from the LLM extraction prompt, breaking Prometheus/Alertmanager payload routing for existing users
app/services/victoria_logs/client.py	New sync/async HTTP client for VictoriaLogs; clean NDJSON parsing, conditional AccountID header, correct timeout — no issues found
app/integrations/catalog.py	Wires VictoriaLogs into service key map, `_classify_service_instance`, `load_env_integrations`, and `direct_services`; logic mirrors existing patterns correctly
app/integrations/models.py	Adds `VictoriaLogsIntegrationConfig`, `integration_id` to `EffectiveIntegrationEntry`, and `victoria_logs` field to `EffectiveIntegrations` — all correct
app/integrations/verify.py	Adds `_verify_victoria_logs` and wires it into `verify_integrations`; verification calls `query_logs("*", limit=1)` which is a safe probe
app/nodes/extract_alert/models.py	Updates AlertDetails alert_source description to add 'eks', 'alertmanager', and 'victorialogs'; 'victorialogs' is not a valid alert source and may cause incorrect LLM classification
app/types/evidence.py	Adds "victoria_logs" to EvidenceSource Literal — correct and required for BaseTool source validation
tests/tools/test_victoria_logs_tool.py	Good coverage for is_available, extract_params, run success/failure/missing-url; tests do not catch the executor-path failure (no base_url in extract_params output)

Sequence Diagram

sequenceDiagram
    participant E as Investigation Executor
    participant T as VictoriaLogsTool
    participant C as VictoriaLogsClient
    participant VL as VictoriaLogs API

    Note over E,T: Current (broken) executor path
    E->>T: extract_params(sources)
    T-->>E: {query: None, limit: 50, start: "-1h"}
    E->>T: run(query=None, limit=50, start="-1h")
    Note over T: kwargs["sources"] = {} (never set)
    T-->>E: {available: false, error: "base_url is required"}

    Note over E,T: Correct pattern (AlertmanagerAlertsTool)
    E->>T: extract_params(sources)
    T-->>E: {base_url: "...", tenant_id: "...", query: "*", ...}
    E->>T: run(base_url="...", tenant_id="...", query="*", ...)
    T->>C: VictoriaLogsClient(config)
    C->>VL: GET /select/logsql/query
    VL-->>C: 200 OK (NDJSON)
    C-->>T: {success: true, rows: [...]}
    T-->>E: {available: true, logs: [...]}

Comments Outside Diff (2)

app/nodes/extract_alert/extract.py, line 64 (link)

Alertmanager classification rule dropped from the extraction prompt

The base version of this prompt included "Set to alertmanager if the payload contains Prometheus/Alertmanager-specific fields such as fingerprint, generatorURL pointing to Prometheus, startsAt/endsAt in the Alertmanager webhook format, or the text mentions Alertmanager." This PR removes that instruction entirely, so the LLM will no longer classify Prometheus/Alertmanager payloads correctly — they will fall through to null instead of "alertmanager", breaking downstream routing for all existing Alertmanager users. The instruction should be restored to its original form.
app/nodes/extract_alert/models.py, line 23-24 (link)

'victorialogs' is not a valid alert source in the field description

'victorialogs' was added to the alert_source description alongside legitimate alert platforms (grafana, datadog, etc.). VictoriaLogs is a log-storage backend — alerts do not originate from it. Listing it here guides the LLM to emit "victorialogs" as an alert_source, a value that has no routing handler and would silently drop any investigation context derived from it. Consider removing 'victorialogs' from this description or replacing it with a note that it is only a log datasource, not an alert origin.

_{Reviews (1): Last reviewed commit: "feat: Victoria Logs Integration (resolve..." | Re-trigger Greptile}

greptile-apps · 2026-04-28T20:03:01Z

+    def extract_params(self, sources: dict) -> dict[str, Any]:
+        vl_conf = sources.get("victoria_logs", {})
+        return {
+            "query": vl_conf.get("query"),
+            "limit": vl_conf.get("limit", 50),
+            "start": vl_conf.get("start", "-1h"),
+        }
+
+    def run(self, query: str, limit: int = 50, start: str = "-1h", **kwargs: Any) -> dict[str, Any]:
+        vl_conf = kwargs.get("sources", {}).get("victoria_logs", {})
+        config = VictoriaLogsIntegrationConfig(
+            base_url=vl_conf.get("base_url", ""),
+            tenant_id=vl_conf.get("tenant_id"),
+            integration_id=vl_conf.get("integration_id", ""),
+        )


extract_params never supplies base_url — tool always fails via executor

The investigation executor calls tool.run(**tool.extract_params(sources)). extract_params returns only {"query": None, "limit": 50, "start": "-1h"} — it never extracts base_url or tenant_id from the integration config. Inside run(), vl_conf = kwargs.get("sources", {}).get("victoria_logs", {}) will always be {} (no sources key is passed), so config.base_url will always be "", and the method immediately returns {"available": False, "error": "VictoriaLogs base_url is required."}. Every executor-driven call to this tool is dead on arrival.

The pattern used by AlertmanagerAlertsTool (see app/tools/AlertmanagerAlertsTool/__init__.py) is the correct approach: extract_params must surface base_url and tenant_id directly, and run() must accept them as explicit parameters instead of fishing them out of **kwargs["sources"].

greptile-apps · 2026-04-28T20:03:10Z

+
+@patch("app.services.victoria_logs.client.httpx.get")
+def test_victoria_logs_verify(mock_get):
+    """Test VictoriaLogs verification logic."""
+    from app.integrations.verify import _verify_victoria_logs
+
+    # Mock successful response
+    mock_response = MagicMock()
+    mock_response.status_code = 200
+    mock_response.text = '{"row": 1}'
+    mock_get.return_value = mock_response
+
+    config = {
+        "base_url": "http://localhost:9428",
+        "tenant_id": "123",
+    }
+


Verification test calls raise_for_status on an un-mocked mock

mock_response.raise_for_status is a MagicMock by default — calling it does nothing and won't raise. The verify path in client.py calls resp.raise_for_status() before parsing the body, so a 4xx/5xx response would never be exercised here. Consider adding mock_response.raise_for_status.return_value = None explicitly, and a separate case where mock_response.raise_for_status.side_effect = httpx.HTTPStatusError(...) to verify the "failed" path through HTTP errors (not just exception injection).

greptile-apps · 2026-04-28T20:03:14Z

+VICTORIA_LOGS_URL="http://localhost:9428"
+VICTORIA_LOGS_TENANT_ID="0"
+```


Documentation example contradicts the "opt-in header" behaviour

VICTORIA_LOGS_TENANT_ID="0" is shown as an example value with no accompanying note. The PR specifically changed tenant_id from "0" to None so the AccountID header is only sent when the variable is explicitly set. A reader following the docs example verbatim will always send AccountID: 0, which is the old default behaviour the PR intentionally removed. Either document "omit this variable entirely for single-tenant deployments" or clarify that "0" is the VictoriaLogs default tenant and is appropriate for single-tenant setups.

Copilot

Pull request overview

Adds a first-class VictoriaLogs integration to OpenSRE, including configuration/loading, verification, a query client, and an investigation tool, plus documentation and tests.

Changes:

Introduces VictoriaLogsIntegrationConfig, env-var loading/classification, and adds victoria_logs to effective integrations + evidence source typing.
Implements VictoriaLogs verification and an HTTP client (sync + async) plus a VictoriaLogsTool.
Adds docs and initial unit tests for env loading/classification, verification, and the tool.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`app/integrations/models.py`	Adds VictoriaLogs config model and extends effective integration models to support `victoria_logs`.
`app/integrations/catalog.py`	Wires VictoriaLogs into service key mapping, classification, env loading, and effective integration resolution.
`app/integrations/verify.py`	Adds VictoriaLogs verification wiring and supporting imports/constants updates.
`app/services/victoria_logs/client.py`	Adds sync + async VictoriaLogs query client with NDJSON parsing.
`app/services/victoria_logs/__init__.py`	Exposes VictoriaLogs client/config and provides a factory helper.
`app/tools/VictoriaLogsTool/__init__.py`	Adds a BaseTool implementation for querying VictoriaLogs via LogsQL.
`app/types/evidence.py`	Extends `EvidenceSource` to include `victoria_logs`.
`app/nodes/extract_alert/models.py`	Updates the `alert_source` field description to include additional platforms.
`app/nodes/extract_alert/extract.py`	Updates the LLM prompt guidance for extracting `alert_source`.
`docs/victoria_logs.mdx`	Adds end-user documentation for configuring and using VictoriaLogs integration.
`tests/integrations/test_victoria_logs.py`	Adds tests for env loading, catalog classification, and verification behavior.
`tests/tools/test_victoria_logs_tool.py`	Adds tests for tool availability, parameter extraction, and run behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-28T20:07:51Z

+        client = VictoriaLogsClient(conf)
+        res = client.query_logs("*", limit=1)
+        if res.get("success"):
+            return _result("victoria_logs", source, "verified", "")


_verify_victoria_logs() returns status="verified" on success, but the rest of verify.py (and verification_exit_code()) expects the success status to be "passed". As written, a successful VictoriaLogs verification will still be treated as failing the core-gate because no core result will have status=="passed". Use "passed" here (and keep "missing"/"failed" for other outcomes) to match the existing contract.

Suggested change

return _result("victoria_logs", source, "verified", "")

return _result("victoria_logs", source, "passed", "")

Copilot · 2026-04-28T20:07:51Z

+            "query": vl_conf.get("query"),
+            "limit": vl_conf.get("limit", 50),
+            "start": vl_conf.get("start", "-1h"),
+        }
+
+    def run(self, query: str, limit: int = 50, start: str = "-1h", **kwargs: Any) -> dict[str, Any]:
+        vl_conf = kwargs.get("sources", {}).get("victoria_logs", {})
+        config = VictoriaLogsIntegrationConfig(
+            base_url=vl_conf.get("base_url", ""),
+            tenant_id=vl_conf.get("tenant_id"),
+            integration_id=vl_conf.get("integration_id", ""),


VictoriaLogsTool.run() reads credentials from kwargs["sources"], but the investigation executor calls tools as action.run(**action.extract_params(available_sources)) (no implicit "sources" kwarg). With the current extract_params() returning only query/limit/start, run() will see an empty sources dict and always fail with missing base_url. Adjust the tool to accept base_url/tenant_id/integration_id as explicit run() args and have extract_params() return them (plus a non-empty default query), or include a "sources" entry in extract_params in a way consistent with the executor contract.

Suggested change

"query": vl_conf.get("query"),

"limit": vl_conf.get("limit", 50),

"start": vl_conf.get("start", "-1h"),

}

def run(self, query: str, limit: int = 50, start: str = "-1h", **kwargs: Any) -> dict[str, Any]:

vl_conf = kwargs.get("sources", {}).get("victoria_logs", {})

config = VictoriaLogsIntegrationConfig(

base_url=vl_conf.get("base_url", ""),

tenant_id=vl_conf.get("tenant_id"),

integration_id=vl_conf.get("integration_id", ""),

"query": vl_conf.get("query") or "*",

"limit": vl_conf.get("limit", 50),

"start": vl_conf.get("start", "-1h"),

"base_url": vl_conf.get("base_url", ""),

"tenant_id": vl_conf.get("tenant_id"),

"integration_id": vl_conf.get("integration_id", ""),

}

def run(

self,

query: str,

limit: int = 50,

start: str = "-1h",

base_url: str = "",

tenant_id: str | None = None,

integration_id: str = "",

**kwargs: Any,

) -> dict[str, Any]:

config = VictoriaLogsIntegrationConfig(

base_url=base_url,

tenant_id=tenant_id,

integration_id=integration_id,

Copilot · 2026-04-28T20:07:51Z

+    }
+
+    result = _verify_victoria_logs("env", config)
+    assert result["status"] == "verified"


This test asserts VictoriaLogs verification status == "verified", but verify.py uses "passed"/"missing"/"failed" as its status contract (and verification_exit_code() checks for "passed"). Once _verify_victoria_logs is aligned to return "passed" on success, update this assertion accordingly.

Suggested change

assert result["status"] == "verified"

assert result["status"] == "passed"

Copilot · 2026-04-28T20:07:52Z

+    with patch("app.tools.VictoriaLogsTool.VictoriaLogsClient", return_value=mock_client):
+        result = tool.run(
+            query="error",
+            sources={"victoria_logs": {"base_url": "http://localhost:9428"}},
+        )


These run() tests pass credentials via a "sources" kwarg, but the investigation executor calls tools as action.run(**action.extract_params(available_sources)) and does not inject "sources". As a result, the tests can pass even if the tool is unusable in real investigations. Update the tool/tests so extract_params returns all required kwargs (e.g. base_url/tenant_id/query/limit/start) and tests invoke run() using only extract_params output (mirroring execute_actions.py).

Copilot · 2026-04-28T20:07:52Z

+```bash
+VICTORIA_LOGS_URL="http://localhost:9428"
+VICTORIA_LOGS_TENANT_ID="0"
+```
+
+### Local Store
+
+Alternatively, you can add it to your local store using the CLI:
+
+```bash
+opensre integration add victoria_logs --base_url http://localhost:9428 --tenant_id 0
+```


The docs show VICTORIA_LOGS_TENANT_ID="0" (and CLI example uses --tenant_id 0), but the integration logic treats tenant_id as optional and only sends the AccountID header when explicitly configured. Suggest updating the examples to omit tenant_id entirely unless the deployment is actually multi-tenant (to avoid encouraging always sending AccountID=0).

Copilot · 2026-04-28T20:07:52Z

+    requires = ["base_url"]
+    input_schema = {
+        "type": "object",
+        "properties": {
+            "query": {
+                "type": "string",
+                "description": "LogsQL query to filter logs. Example: `_stream_id:* AND error`",
+            },
+            "limit": {
+                "type": "integer",
+                "default": 50,
+                "description": "Maximum number of logs to retrieve.",
+            },
+            "start": {
+                "type": "string",
+                "default": "-1h",
+                "description": "Time range, e.g., -1h or -24h.",
+            },
+        },
+        "required": ["query"],
+    }


VictoriaLogsTool.requires includes "base_url", but input_schema doesn't define a base_url field (and build_prompt.py renders required inputs from input_schema properties). This makes the planner metadata inconsistent and hides a required runtime credential from the prompt/tool contract. Add base_url (and any other required credential fields like tenant_id/integration_id, if applicable) to input_schema so the declared interface matches how the tool is invoked.

muddlebee · 2026-04-29T13:39:03Z

@yashksaini-coder we can close #663 then? and reviews/conflicts pending here. LMK.

Copilot AI review requested due to automatic review settings April 28, 2026 19:58

Copilot started reviewing on behalf of yashksaini-coder April 28, 2026 19:59 View session

mintlify Bot deployed to staging - docs April 28, 2026 19:59 View deployment

yashksaini-coder force-pushed the feat/126-victoria-logs-integration branch from f4979b5 to dbd7a22 Compare April 28, 2026 20:02

greptile-apps Bot reviewed Apr 28, 2026

View reviewed changes

mintlify Bot deployed to staging - docs April 28, 2026 20:03 View deployment

Copilot AI reviewed Apr 28, 2026

View reviewed changes

yashksaini-coder closed this Apr 30, 2026

yashksaini-coder deleted the feat/126-victoria-logs-integration branch April 30, 2026 11:12

yashksaini-coder mentioned this pull request Apr 30, 2026

feat(integrations): add VictoriaLogs integration #1144

Merged

	return _result("victoria_logs", source, "verified", "")
	return _result("victoria_logs", source, "passed", "")

	assert result["status"] == "verified"
	assert result["status"] == "passed"

Conversation

yashksaini-coder commented Apr 28, 2026

Summary

What is new

Issues resolved from #663 review

P1 — Tool permanently non-functional (catalog wiring)

P1 — victoria_logs missing from EffectiveIntegrations

P1 — integration_id injection crashing DB-configured integrations

P1 — extract_params() returned {} causing TypeError at runtime

P1 — "victoria_logs" not a valid EvidenceSource

P2 — import contextlib inside NDJSON parsing loop

P2 — tenant_id defaulting to "0" always sending AccountID header

Unrelated changes removed

Async support

Quality gates (all green)

Environment variables

Uh oh!

mintlify Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (2)

Uh oh!

greptile-apps Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

muddlebee commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

P1 — `victoria_logs` missing from `EffectiveIntegrations`

P1 — `integration_id` injection crashing DB-configured integrations

P1 — `extract_params()` returned `{}` causing `TypeError` at runtime

P1 — `"victoria_logs"` not a valid `EvidenceSource`

P2 — `import contextlib` inside NDJSON parsing loop

P2 — `tenant_id` defaulting to `"0"` always sending `AccountID` header

mintlify Bot commented Apr 28, 2026 •

edited

Loading

greptile-apps Bot commented Apr 28, 2026 •

edited

Loading