Skip to content

feat: add MariaDB integration for database RCA#492

Merged
VaibhavUpreti merged 13 commits intoTracer-Cloud:mainfrom
yashksaini-coder:feat/mariadb-integration
Apr 13, 2026
Merged

feat: add MariaDB integration for database RCA#492
VaibhavUpreti merged 13 commits intoTracer-Cloud:mainfrom
yashksaini-coder:feat/mariadb-integration

Conversation

@yashksaini-coder
Copy link
Copy Markdown
Collaborator

What this does

Adds a first-class MariaDB integration so OpenSRE can directly query MariaDB instances during incident investigations — connection health, active queries, lock contention, slow queries, and replication status.

Why

Teams running MariaDB-backed services had no way to pull database-level evidence during incidents. This closes that gap with 5 production-safe, read-only investigation tools.

What's included

  • Config & validationapp/integrations/mariadb.py with connection config, pymysql driver, and connectivity validation
  • 5 investigation tools:
    • Process List — active threads and queries (excluding idle)
    • Global Status — curated server metrics (connections, threads, InnoDB, uptime)
    • InnoDB Status — engine internals (deadlocks, buffer pool, I/O)
    • Slow Queries — top queries by avg execution time from performance_schema
    • Replication — replica I/O/SQL thread state, lag, errors (MariaDB multi-source syntax)
  • Integration wiring — CLI setup, verification, resolve node classification, env var loading, detect_sources
  • Tool availabilityis_available/extract_params so tools only appear when MariaDB is configured and credentials auto-populate
  • Tests — unit tests for config/validation/classification + tool contract tests for all 5 tools
  • Env docsMARIADB_HOST, MARIADB_PORT, MARIADB_DATABASE, MARIADB_USERNAME, MARIADB_PASSWORD, MARIADB_SSL

Safety

  • All queries are read-only (SELECT / SHOW only)
  • Timeouts enforced on connect + read
  • Results capped at max_results (default 50, max 200)
  • Query text truncated to 200 chars in output
  • Passwords never in tool output
  • SSL enabled by default
  • MariaDB-specific: SHOW ALL SLAVES STATUS with SHOW SLAVE STATUS fallback, performance_schema disabled graceful handling

How to test

  1. Set MARIADB_HOST, MARIADB_DATABASE, MARIADB_USERNAME, MARIADB_PASSWORD in .env
  2. Or run opensre integrations setup mariadb
  3. Run opensre integrations verify mariadb

Checks

  • make lint — all passed
  • make typecheck — 298 source files, no issues
  • make test-cov — all MariaDB tests pass (57 new tests)

Closes #331

Copilot AI review requested due to automatic review settings April 7, 2026 10:58
@yashksaini-coder yashksaini-coder force-pushed the feat/mariadb-integration branch from 7457af0 to 1321822 Compare April 7, 2026 11:03
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a first-class MariaDB integration to OpenSRE so investigations can query MariaDB safely (status, process list, InnoDB status, slow queries, replication), including wiring into integration resolution/verification and tool registration.

Changes:

  • Introduces app/integrations/mariadb.py with normalized config, connectivity validation, and read-only diagnostic queries.
  • Adds 5 MariaDB investigation tools (function-based @tool) and wires MariaDB into integration resolution, env loading, source detection, and verification.
  • Adds MariaDB-focused unit/contract tests, plus dependency and env template updates.

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
app/integrations/mariadb.py Core MariaDB config/validation + read-only query helpers for the 5 tools.
app/tools/MariaDBStatusTool/__init__.py Tool wrapper for curated SHOW GLOBAL STATUS metrics.
app/tools/MariaDBProcessListTool/__init__.py Tool wrapper for active process list queries.
app/tools/MariaDBInnoDBStatusTool/__init__.py Tool wrapper for SHOW ENGINE INNODB STATUS.
app/tools/MariaDBSlowQueriesTool/__init__.py Tool wrapper for performance_schema slow query summary.
app/tools/MariaDBReplicationTool/__init__.py Tool wrapper for replication status queries.
app/nodes/resolve_integrations/node.py Classifies/loads MariaDB credentials from store/env into resolved integrations.
app/nodes/plan_actions/detect_sources.py Adds MariaDB to detected sources for investigations.
app/integrations/verify.py Adds MariaDB to effective integrations and verification flow.
app/integrations/models.py Adds MariaDB integration config model and effective-integrations entry.
app/integrations/cli.py Adds interactive opensre integrations setup mariadb handler.
app/cli/constants.py Adds mariadb to supported integration service lists.
app/types/evidence.py Adds mariadb to the canonical EvidenceSource union.
pyproject.toml Adds pymysql runtime dep and types-PyMySQL dev typing dep.
.env.example Documents MariaDB env vars (MARIADB_*).
tests/test_mariadb_integration.py Unit tests for config/env loading/validation and integration classification.
tests/tools/conftest.py Adds MariaDB entry to mock agent state for tool contract tests.
tests/tools/test_mariadb_status_tool.py Contract + basic behavior test for global status tool.
tests/tools/test_mariadb_process_list_tool.py Contract + basic behavior test for process list tool.
tests/tools/test_mariadb_innodb_status_tool.py Contract + basic behavior test for InnoDB status tool.
tests/tools/test_mariadb_slow_queries_tool.py Contract + basic behavior test for slow queries tool.
tests/tools/test_mariadb_replication_tool.py Contract + basic behavior test for replication tool.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +101 to +114
return pymysql.connect(
host=config.host,
port=config.port,
database=config.database,
user=config.username,
password=config.password,
ssl=ssl_arg,
connect_timeout=DEFAULT_MARIADB_TIMEOUT_S,
read_timeout=int(config.timeout_seconds),
write_timeout=int(config.timeout_seconds),
charset="utf8mb4",
autocommit=True,
program_name="opensre",
)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_connection hard-codes connect_timeout=DEFAULT_MARIADB_TIMEOUT_S (5s) while the config’s timeout_seconds default is 10s and is used for read/write timeouts. This makes connect behavior inconsistent with the configured timeout and with the PR’s “timeouts enforced” claim. Consider deriving connect_timeout from config.timeout_seconds (or exposing a separate connect_timeout_seconds field) so all timeouts are consistently configurable.

Copilot uses AI. Check for mistakes.
Comment thread app/integrations/mariadb.py Outdated
Comment on lines +97 to +107
ssl_arg: dict[str, Any] | None = None
if config.ssl:
ssl_arg = {"check_hostname": False}

return pymysql.connect(
host=config.host,
port=config.port,
database=config.database,
user=config.username,
password=config.password,
ssl=ssl_arg,
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When config.ssl is true, the SSL options explicitly disable hostname verification (check_hostname: False). That weakens TLS protections and can allow MITM if certificates are otherwise accepted. Prefer leaving hostname verification enabled by default and only disabling it via an explicit opt-out setting (or require a CA bundle / verify mode configuration).

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +133
def validate_mariadb_config(config: MariaDBConfig) -> MariaDBValidationResult:
"""Validate MariaDB connectivity with a lightweight version query."""
if not config.host:
return MariaDBValidationResult(ok=False, detail="MariaDB host is required.")

try:
conn = _get_connection(config)
try:
with conn.cursor() as cur:
cur.execute("SELECT VERSION()")
row = cur.fetchone()
version = row[0] if row else "unknown"
db_name = config.database or "(default)"
return MariaDBValidationResult(
ok=True,
detail=f"Connected to MariaDB {version}; target database: {db_name}.",
)
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validate_mariadb_config only checks that host is set, but the rest of the integration (tools’ is_available and MariaDBConfig.is_configured) requires both host and database. This can lead to opensre integrations verify mariadb reporting success even when the DB name is missing and the tools won’t be available. Align validation with tool requirements by also requiring database (and likely username) to be non-empty for a “passed” result.

Copilot uses AI. Check for mistakes.
Comment on lines +365 to +406
with conn.cursor() as cur:
row = None
columns: list[str] = []
# MariaDB-specific multi-source syntax first, then legacy
for stmt in ("SHOW ALL SLAVES STATUS", "SHOW SLAVE STATUS"):
try:
cur.execute(stmt)
row = cur.fetchone()
if cur.description:
columns = [d[0] for d in cur.description]
break
except Exception: # noqa: BLE001
continue

if not row:
return {
"source": "mariadb",
"available": True,
"note": "This server is not configured as a replica.",
"replication": {},
}

# Build a dict from column names and row values
full = dict(zip(columns, row))
# Return curated subset (MariaDB uses Master/Slave naming)
_KEYS = (
"Slave_IO_Running",
"Slave_SQL_Running",
"Seconds_Behind_Master",
"Last_Error", "Last_Errno",
"Master_Host", "Master_Port",
"Master_Log_File",
"Relay_Log_Space",
"Exec_Master_Log_Pos",
"Connection_name",
)
replication = {k: full[k] for k in _KEYS if k in full}
return {
"source": "mariadb",
"available": True,
"replication": replication,
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_replication_status executes SHOW ALL SLAVES STATUS but only reads a single row via fetchone(). On multi-source replication setups this will silently drop additional channels, contradicting the stated multi-source support. Consider fetching all rows and returning a list of channel statuses (keyed by Connection_name) or aggregating them into a structured result.

Copilot uses AI. Check for mistakes.
Comment on lines +770 to +780
mariadb_int = (resolved_integrations or {}).get("mariadb")
if mariadb_int and str(mariadb_int.get("host", "")).strip():
sources["mariadb"] = {
"host": str(mariadb_int.get("host", "")).strip(),
"port": mariadb_int.get("port", 3306),
"database": str(mariadb_int.get("database", "")).strip(),
"username": str(mariadb_int.get("username", "")).strip(),
"password": str(mariadb_int.get("password", "")).strip(),
"ssl": mariadb_int.get("ssl", True),
"connection_verified": True,
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

detect_sources adds a mariadb source when only host is present, but the MariaDB tools’ availability/config checks require both host and database. This can yield a partially-populated sources['mariadb'] with connection_verified=True while the tools still won’t show/run. Consider requiring a non-empty database here (or aligning mariadb_is_available/MariaDBConfig.is_configured to only require host if that’s intended).

Copilot uses AI. Check for mistakes.
Comment on lines +239 to +252
else:
mariadb_host = os.getenv("MARIADB_HOST", "").strip()
if mariadb_host:
effective["mariadb"] = {
"source": "local env",
"config": {
"host": mariadb_host,
"port": int(os.getenv("MARIADB_PORT", "3306").strip() or "3306"),
"database": os.getenv("MARIADB_DATABASE", "").strip(),
"username": os.getenv("MARIADB_USERNAME", "").strip(),
"password": os.getenv("MARIADB_PASSWORD", "").strip(),
"ssl": os.getenv("MARIADB_SSL", "true").strip().lower() in ("true", "1", "yes"),
},
}
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the env-var fallback branch, resolve_effective_integrations treats MARIADB_HOST alone as sufficient to include MariaDB in the effective integrations, but the rest of the integration requires a non-empty database (and typically username). This can make verify/display flows report MariaDB as configured even when the tools will be unavailable. Consider only adding the env-backed MariaDB entry when the required fields are present, or make verification fail with a clear message when they’re missing.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +6
"""Shared MariaDB integration helpers.

Provides configuration, connectivity validation, and read-only diagnostic
queries for MariaDB instances. All operations are production-safe: read-only,
timeouts enforced, result sizes capped.
"""
Copy link

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The repo has integration-level E2E coverage for other database integrations (e.g., tests/e2e/mongodb/test_mongodb_e2e.py covers resolution + verify + detect_sources + tools availability), but this PR only adds unit/contract tests for MariaDB. Consider adding a similar tests/e2e/mariadb/ suite to validate the full wiring (env/store resolution → verify → detect_sources → tool availability) and prevent regressions in the end-to-end investigation flow.

Copilot uses AI. Check for mistakes.
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 7, 2026

Greptile Summary

This PR adds a MariaDB integration with 5 read-only diagnostic tools (process list, global status, InnoDB status, slow queries, replication), full CLI/wizard/verify wiring, and 57 new tests. Most of the serious issues flagged in prior rounds (missing _parse_port, _result arity, integration_id omission, program_name kwarg, multi-source replication) have been addressed, and the credential store now gets hardened 0o600 permissions on write.

Confidence Score: 4/5

Safe to merge after addressing the dead-code model and confirming the slow-query timer units are correct in the live file.

All prior critical issues (SyntaxError, TypeError, arity bug, missing _parse_port) appear resolved. Two P2 findings remain: MariaDBIntegrationConfig is dead code that adds confusion, and the new ssl.create_default_context() approach silently enforces strict cert+hostname verification. The slow-query timing values in get_slow_queries warrant a final sanity-check before merge.

app/integrations/mariadb.py (slow-query timer divisor), app/integrations/models.py (unused MariaDBIntegrationConfig)

Important Files Changed

Filename Overview
app/integrations/mariadb.py Core MariaDB integration: connection config, 5 read-only query helpers, SSL via create_default_context() (strict cert verification), and replication multi-channel support. Timer divisor in get_slow_queries still uses 10^9 (microseconds) despite column aliases saying ms.
app/integrations/models.py Adds MariaDBIntegrationConfig, but it is never imported or used anywhere — all live code paths use MariaDBConfig from mariadb.py instead.
app/integrations/verify.py Adds _verify_mariadb with correct 4-arg _result call and env-var fallback with safe inline port parsing; wires into verify_integrations dispatch.
app/nodes/resolve_integrations/node.py Adds MariaDB branch to _classify_integrations with integration_id propagated, and env-var loading via build_mariadb_config.
app/integrations/store.py Adds STORE_PATH.chmod(0o600) after each write, hardening credential file permissions.
app/nodes/plan_actions/detect_sources.py Wires mariadb into detect_sources following the same pattern as postgresql; reads flat resolved_integrations structure correctly.
tests/integrations/test_mariadb.py Comprehensive config-layer and env-var tests; good coverage of validation edge cases and normalization.
tests/test_mariadb_integration.py Tool-layer unit tests with mocked connections; covers all 5 query helpers plus classify_integrations and helper utilities.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    ENV[Env vars\nMARIADB_HOST etc.] -->|mariadb_config_from_env| CFG[MariaDBConfig]
    STORE[Integration Store\ncredentials JSON] -->|build_mariadb_config| CFG
    CLI[opensre integrations setup mariadb] -->|_setup_mariadb / upsert_integration| STORE

    CFG --> VERIFY[validate_mariadb_config\nSELECT VERSION]
    CFG --> CONN[_get_connection\npymysql + ssl.create_default_context]

    CONN --> PL[get_process_list\nPROCESSLIST]
    CONN --> GS[get_global_status\nSHOW GLOBAL STATUS]
    CONN --> IS[get_innodb_status\nSHOW ENGINE INNODB STATUS]
    CONN --> SQ[get_slow_queries\nperformance_schema]
    CONN --> RS[get_replication_status\nSHOW ALL SLAVES STATUS]

    PL --> TOOL1[MariaDBProcessListTool]
    GS --> TOOL2[MariaDBStatusTool]
    IS --> TOOL3[MariaDBInnoDBStatusTool]
    SQ --> TOOL4[MariaDBSlowQueriesTool]
    RS --> TOOL5[MariaDBReplicationTool]

    RESOLVE[resolve_integrations node\n_classify_integrations] -->|flat mariadb dict| DETECT[detect_sources\nsources map]
    DETECT --> TOOL1 & TOOL2 & TOOL3 & TOOL4 & TOOL5
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: app/integrations/models.py
Line: 230-244

Comment:
**`MariaDBIntegrationConfig` is unreferenced dead code**

`MariaDBIntegrationConfig` is defined here but not imported or used anywhere else in the codebase (grep over the full repo finds only this definition). All actual MariaDB wiring — `_classify_integrations`, `_verify_mariadb`, the tool layer — goes through `MariaDBConfig` from `app/integrations/mariadb.py` and its `build_mariadb_config` helper. Having two parallel config models for the same integration without any usage of one of them will confuse future maintainers about which model to use.

Either wire it in (e.g. use it in `verify.py`'s `_verify_mariadb` instead of calling `build_mariadb_config` directly) or remove it.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: app/integrations/mariadb.py
Line: 109-111

Comment:
**`create_default_context()` enforces strict cert + hostname verification**

`ssl.create_default_context()` sets `check_hostname=True` and `verify_mode=CERT_REQUIRED` by default. Connections to MariaDB instances with self-signed certificates, a private CA not in the system trust store, or where users connect via IP address (no hostname in the cert) will fail with an `ssl.SSLCertVerificationError`. The `.env.example` describes `MARIADB_SSL=true` with no hint of this strictness.

Consider either documenting this behaviour, or exposing a `ssl_verify_cert` config field (defaulting to `True`) and relaxing the context when it's `False`:

```python
if config.ssl:
    ssl_ctx = _ssl.create_default_context()
    if not getattr(config, "ssl_verify_cert", True):
        ssl_ctx.check_hostname = False
        ssl_ctx.verify_mode = _ssl.CERT_NONE
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (25): Last reviewed commit: "Merge branch 'Tracer-Cloud:main' into fe..." | Re-trigger Greptile

Comment thread app/integrations/mariadb.py Outdated
Comment on lines +366 to +378
row = None
columns: list[str] = []
# MariaDB-specific multi-source syntax first, then legacy
for stmt in ("SHOW ALL SLAVES STATUS", "SHOW SLAVE STATUS"):
try:
cur.execute(stmt)
row = cur.fetchone()
if cur.description:
columns = [d[0] for d in cur.description]
break
except Exception: # noqa: BLE001
continue

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Multi-source replication silently returns only the first replica

SHOW ALL SLAVES STATUS returns one row per master connection in a multi-source setup, but fetchone() discards every row after the first. The PR description explicitly claims multi-source support, so this is a present functional defect — any replica beyond the first is silently dropped.

Suggested change
row = None
columns: list[str] = []
# MariaDB-specific multi-source syntax first, then legacy
for stmt in ("SHOW ALL SLAVES STATUS", "SHOW SLAVE STATUS"):
try:
cur.execute(stmt)
row = cur.fetchone()
if cur.description:
columns = [d[0] for d in cur.description]
break
except Exception: # noqa: BLE001
continue
rows = cur.fetchall() or []
if cur.description:
columns = [d[0] for d in cur.description]
break

After this change, iterate rows to build a list of replica dicts, return "replicas": [...] instead of "replication": {}.

Prompt To Fix With AI
This is a comment left during a code review.
Path: app/integrations/mariadb.py
Line: 366-378

Comment:
**Multi-source replication silently returns only the first replica**

`SHOW ALL SLAVES STATUS` returns one row per master connection in a multi-source setup, but `fetchone()` discards every row after the first. The PR description explicitly claims multi-source support, so this is a present functional defect — any replica beyond the first is silently dropped.

```suggestion
                rows = cur.fetchall() or []
                if cur.description:
                    columns = [d[0] for d in cur.description]
                break
```

After this change, iterate `rows` to build a list of replica dicts, return `"replicas": [...]` instead of `"replication": {}`.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread app/integrations/mariadb.py Outdated
Comment on lines +96 to +99

ssl_arg: dict[str, Any] | None = None
if config.ssl:
ssl_arg = {"check_hostname": False}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 check_hostname is not a recognized pymysql SSL dict key

pymysql's ssl parameter accepts ca, cert, key, capath, and cipher. The key check_hostname is silently ignored, so hostname-verification behavior is undefined. Passing an empty dict {} or None is clearer and achieves the same "connect with TLS, no cert verification" effect that appears to be the intent here.

Suggested change
ssl_arg: dict[str, Any] | None = None
if config.ssl:
ssl_arg = {"check_hostname": False}
ssl_arg: dict[str, Any] | None = {} if config.ssl else None
Prompt To Fix With AI
This is a comment left during a code review.
Path: app/integrations/mariadb.py
Line: 96-99

Comment:
**`check_hostname` is not a recognized pymysql SSL dict key**

pymysql's `ssl` parameter accepts `ca`, `cert`, `key`, `capath`, and `cipher`. The key `check_hostname` is silently ignored, so hostname-verification behavior is undefined. Passing an empty dict `{}` or `None` is clearer and achieves the same "connect with TLS, no cert verification" effect that appears to be the intent here.

```suggestion
        ssl_arg: dict[str, Any] | None = {} if config.ssl else None
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread app/nodes/resolve_integrations/node.py Outdated
Comment on lines +229 to +242
try:
mariadb_config = build_mariadb_config({
"host": credentials.get("host", ""),
"port": credentials.get("port", 3306),
"database": credentials.get("database", ""),
"username": credentials.get("username", ""),
"password": credentials.get("password", ""),
"ssl": credentials.get("ssl", True),
})
except Exception:
continue

if mariadb_config.host and mariadb_config.database:
resolved["mariadb"] = mariadb_config.model_dump()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 integration_id not propagated for MariaDB

Every other integration in _classify_integrations (Grafana, Datadog, Honeycomb, Coralogix, Sentry, Vercel, OpsGenie) passes "integration_id": integration.get("id", "") when building its config object. The MariaDB block omits this, so resolved["mariadb"]["integration_id"] is always "" for store-backed configurations.

Suggested change
try:
mariadb_config = build_mariadb_config({
"host": credentials.get("host", ""),
"port": credentials.get("port", 3306),
"database": credentials.get("database", ""),
"username": credentials.get("username", ""),
"password": credentials.get("password", ""),
"ssl": credentials.get("ssl", True),
})
except Exception:
continue
if mariadb_config.host and mariadb_config.database:
resolved["mariadb"] = mariadb_config.model_dump()
try:
mariadb_config = build_mariadb_config({
"host": credentials.get("host", ""),
"port": credentials.get("port", 3306),
"database": credentials.get("database", ""),
"username": credentials.get("username", ""),
"password": credentials.get("password", ""),
"ssl": credentials.get("ssl", True),
"integration_id": integration.get("id", ""),
})
Prompt To Fix With AI
This is a comment left during a code review.
Path: app/nodes/resolve_integrations/node.py
Line: 229-242

Comment:
**`integration_id` not propagated for MariaDB**

Every other integration in `_classify_integrations` (Grafana, Datadog, Honeycomb, Coralogix, Sentry, Vercel, OpsGenie) passes `"integration_id": integration.get("id", "")` when building its config object. The MariaDB block omits this, so `resolved["mariadb"]["integration_id"]` is always `""` for store-backed configurations.

```suggestion
            try:
                mariadb_config = build_mariadb_config({
                    "host": credentials.get("host", ""),
                    "port": credentials.get("port", 3306),
                    "database": credentials.get("database", ""),
                    "username": credentials.get("username", ""),
                    "password": credentials.get("password", ""),
                    "ssl": credentials.get("ssl", True),
                    "integration_id": integration.get("id", ""),
                })
```

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +323 to +326
WHERE SCHEMA_NAME = %s
OR SCHEMA_NAME IS NULL
ORDER BY AVG_TIMER_WAIT DESC
LIMIT %s
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 OR SCHEMA_NAME IS NULL may surface unrelated system queries

Rows with SCHEMA_NAME IS NULL in events_statements_summary_by_digest represent cross-schema or administrative statements (SHOW STATUS, SET NAMES, SELECT @@version, etc.) that are unrelated to the target application database. If these statements have a high AVG_TIMER_WAIT they can displace genuinely slow application queries in the results.

Consider removing the OR SCHEMA_NAME IS NULL clause unless including system-level statements is intentional:

Suggested change
WHERE SCHEMA_NAME = %s
OR SCHEMA_NAME IS NULL
ORDER BY AVG_TIMER_WAIT DESC
LIMIT %s
WHERE SCHEMA_NAME = %s
Prompt To Fix With AI
This is a comment left during a code review.
Path: app/integrations/mariadb.py
Line: 323-326

Comment:
**`OR SCHEMA_NAME IS NULL` may surface unrelated system queries**

Rows with `SCHEMA_NAME IS NULL` in `events_statements_summary_by_digest` represent cross-schema or administrative statements (`SHOW STATUS`, `SET NAMES`, `SELECT @@version`, etc.) that are unrelated to the target application database. If these statements have a high `AVG_TIMER_WAIT` they can displace genuinely slow application queries in the results.

Consider removing the `OR SCHEMA_NAME IS NULL` clause unless including system-level statements is intentional:

```suggestion
                    WHERE SCHEMA_NAME = %s
```

How can I resolve this? If you propose a fix, please make it concise.

Comment thread app/integrations/mariadb.py Outdated
Comment thread app/integrations/mariadb.py
Comment thread app/integrations/verify.py Fixed
Comment thread app/nodes/resolve_integrations/node.py Outdated
Comment thread app/integrations/verify.py Outdated
Comment thread app/nodes/resolve_integrations/node.py Outdated
Comment thread app/cli/constants.py
Comment thread app/integrations/cli.py Outdated
@yashksaini-coder
Copy link
Copy Markdown
Collaborator Author

image

@shanduur Tested MariaDB locally using Docker and verified the local env setup successfully.

@yashksaini-coder yashksaini-coder force-pushed the feat/mariadb-integration branch from 81915ab to 2ce6464 Compare April 11, 2026 07:10

import pytest

from app.integrations.cli import _parse_port
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P0 _parse_port does not exist — entire test file fails with ImportError

app.integrations.cli has no _parse_port function. Every test in TestParsePort will raise ImportError at collection time. The PR description claims "all MariaDB tests pass (57 new tests)", but this file cannot even be imported.

The actual port handling in _setup_mariadb uses int(port) if port.isdigit() else 3306, which also doesn't match the tests: "0".isdigit() and "65536".isdigit() both return True, so out-of-range ports are stored rather than falling back to the default.

_parse_port needs to be defined in app/integrations/cli.py and wired into _setup_mariadb:

def _parse_port(value: str, default: int = 3306) -> int:
    try:
        port = int(value)
    except (ValueError, TypeError):
        return default
    return port if 1 <= port <= 65535 else default
Prompt To Fix With AI
This is a comment left during a code review.
Path: tests/cli/test_mariadb_cli.py
Line: 7

Comment:
**`_parse_port` does not exist — entire test file fails with `ImportError`**

`app.integrations.cli` has no `_parse_port` function. Every test in `TestParsePort` will raise `ImportError` at collection time. The PR description claims "all MariaDB tests pass (57 new tests)", but this file cannot even be imported.

The actual port handling in `_setup_mariadb` uses `int(port) if port.isdigit() else 3306`, which also doesn't match the tests: `"0".isdigit()` and `"65536".isdigit()` both return `True`, so out-of-range ports are stored rather than falling back to the default.

`_parse_port` needs to be defined in `app/integrations/cli.py` and wired into `_setup_mariadb`:

```python
def _parse_port(value: str, default: int = 3306) -> int:
    try:
        port = int(value)
    except (ValueError, TypeError):
        return default
    return port if 1 <= port <= 65535 else default
```

How can I resolve this? If you propose a fix, please make it concise.

@yashksaini-coder yashksaini-coder force-pushed the feat/mariadb-integration branch from 2ce6464 to 7607f03 Compare April 11, 2026 07:17
Comment thread tests/cli/test_mariadb_cli.py Fixed
Comment thread tests/integrations/test_store.py Fixed
Comment thread tests/test_mariadb_integration.py Fixed
import pymysql

ssl_arg: dict[str, Any] | None = None
if config.ssl:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing an empty dict to ssl= in pymysql encrypts the channel but skips certificate verification. An attacker on the network can MITM the connection. For a tool that transmits database credentials, this is a meaningful risk.

Fix: pass {"ssl_verify_cert": True, "ssl_ca": "/etc/ssl/certs/ca-certificates.crt"}, or at minimum expose a ssl_verify_cert config option and document the default behavior.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, you're right that an empty dict skips verification. Changed it to pass ssl_verify_cert: True so pymysql actually validates the server certificate against system CAs.

Comment thread app/integrations/store.py
def _save(data: dict[str, Any]) -> None:
STORE_PATH.parent.mkdir(parents=True, exist_ok=True)
STORE_PATH.write_text(json.dumps(data, indent=2) + "\n")
STORE_PATH.chmod(0o600)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct for new files. However, a store file created by an older version (e.g. 0o644) stays world-readable until the next write. Consider adding a read-time check: if the file exists and its mode is not 0o600, fix it immediately before returning data.

Comment thread app/integrations/mariadb.py Outdated
if cur.description:
columns = [d[0] for d in cur.description]
break
except Exception: # noqa: BLE001
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bare except Exception here will silently swallow timeouts and connection-lost errors on SHOW ALL SLAVES STATUS and attempt SHOW SLAVE STATUS on a dead cursor. Should catch only the syntax/unsupported error:
except pymysql.err.ProgrammingError:

Comment thread app/integrations/mariadb.py Outdated
cur.execute(
"""
SELECT DIGEST_TEXT, COUNT_STAR,
ROUND(AVG_TIMER_WAIT / 1000000000000, 4) AS avg_time_ms,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

performance_schema timers count in picoseconds (10⁻¹²). Dividing by 10¹² gives seconds, not milliseconds. The column name avg_time_ms is wrong by 1000×.
Fix: divide by 1000000000 (10⁹) for ms, or rename the columns to avg_time_s.
eg
ROUND(AVG_TIMER_WAIT / 1000000000, 4) AS avg_time_ms,
ROUND(SUM_TIMER_WAIT / 1000000000, 4) AS total_time_ms,

Comment thread app/integrations/mariadb.py Outdated
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Comment thread app/integrations/mariadb.py Outdated
Comment thread app/integrations/verify.py Outdated
"source": "local env",
"config": {
"host": mariadb_host,
"port": int(os.getenv("MARIADB_PORT", "3306").strip() or "3306"),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Bare int() crashes on invalid MARIADB_PORT

If MARIADB_HOST is set and MARIADB_PORT contains a non-numeric value (e.g. "localhost:3306", "abc"), int(...) raises ValueError with no surrounding try/except, crashing resolve_effective_integrations() entirely — breaking verification for all integrations, not just MariaDB. The parallel path in node.py's _load_env_integrations avoids this by routing through build_mariadb_config, which handles coercion gracefully via _normalize_port.

Reuse the existing helper:

from app.integrations.cli import _parse_port
...
"port": _parse_port(os.getenv("MARIADB_PORT", "3306").strip()),
Prompt To Fix With AI
This is a comment left during a code review.
Path: app/integrations/verify.py
Line: 307

Comment:
**Bare `int()` crashes on invalid `MARIADB_PORT`**

If `MARIADB_HOST` is set and `MARIADB_PORT` contains a non-numeric value (e.g. `"localhost:3306"`, `"abc"`), `int(...)` raises `ValueError` with no surrounding try/except, crashing `resolve_effective_integrations()` entirely — breaking verification for all integrations, not just MariaDB. The parallel path in `node.py`'s `_load_env_integrations` avoids this by routing through `build_mariadb_config`, which handles coercion gracefully via `_normalize_port`.

Reuse the existing helper:
```python
from app.integrations.cli import _parse_port
...
"port": _parse_port(os.getenv("MARIADB_PORT", "3306").strip()),
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — using _parse_port now which handles invalid values gracefully instead of bare int().

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — removed the circular import. Using inline safe parsing now instead of importing from cli.py.

Comment thread app/integrations/verify.py Fixed
Copy link
Copy Markdown
Collaborator

@Devesh36 Devesh36 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good !!

Copy link
Copy Markdown
Member

@VaibhavUpreti VaibhavUpreti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome job @yashksaini-coder 🚀 ! Thanks a lot for the reviews @Devesh36 !!!

@VaibhavUpreti VaibhavUpreti merged commit ff29778 into Tracer-Cloud:main Apr 13, 2026
8 checks passed
@VaibhavUpreti
Copy link
Copy Markdown
Member

@yashksaini-coder could you please create a new issue to document this in our list of integrations in the mintlify docs https://www.opensre.com/docs?

@yashksaini-coder yashksaini-coder deleted the feat/mariadb-integration branch April 16, 2026 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add MariaDB integration for database RCA

6 participants