Skip to content

chore(wren-ai-service): fix dry run#1955

Merged
cyyeh merged 1 commit intomainfrom
chore/ai-service/fix-dryrun
Sep 17, 2025
Merged

chore(wren-ai-service): fix dry run#1955
cyyeh merged 1 commit intomainfrom
chore/ai-service/fix-dryrun

Conversation

@cyyeh
Copy link
Copy Markdown
Member

@cyyeh cyyeh commented Sep 17, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Improved stability when running SQL queries: the app no longer errors if server responses contain missing or null fields.
    • More reliable display of SQL-related error details, with graceful fallbacks when data is incomplete.
    • Correlation ID is captured when available to aid troubleshooting.
  • Chores

    • Internal error parsing made more robust without changing any user-facing APIs or behavior.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Sep 17, 2025

Walkthrough

Refactors WrenUI.execute_sql error handling to safely extract dialectSql, plannedSql, and correlationId from potentially sparse GraphQL error payloads using guarded lookups and empty defaults. Public interfaces remain unchanged.

Changes

Cohort / File(s) Summary
Engine provider error handling
wren-ai-service/src/providers/engine/wren.py
Rewrote nested payload extraction for error fields (dialectSql, plannedSql, correlationId) with defensive fallbacks ({} and empty strings). Maintains final SQL resolution order: dialect → planned → original. No public API changes.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant WrenUI
  participant GraphQL API as GraphQL API

  Client->>WrenUI: execute_sql(sql)
  WrenUI->>GraphQL API: query(sql)
  alt Success
    GraphQL API-->>WrenUI: data
    WrenUI-->>Client: result
  else Error
    GraphQL API-->>WrenUI: error payload (possibly sparse)
    note over WrenUI: Safely extract dialectSql/plannedSql/correlationId with guarded lookups and defaults
    WrenUI-->>Client: error response with resolved SQL and correlationId
  end
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

wren-ai-service

Suggested reviewers

  • yichieh-lu
  • imAsterSun

Poem

I nibble through stacks where the errors hop,
With guarded steps so the trace won’t flop.
If keys go missing, I won’t despair—
I burrow for SQL hidden there.
A twitch, a blink, defaults in line,
Now failures report just fine. 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "chore(wren-ai-service): fix dry run" is concise and scoped to the wren-ai-service and the changeset (which hardens WrenUI.execute_sql error extraction) plausibly addresses a dry-run failure, so it is related to the main change; however the phrase "fix dry run" is somewhat generic and does not describe the specific update to safely navigate missing GraphQL error fields.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/ai-service/fix-dryrun

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  • Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.
  • Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.

Please see the documentation for more information.

Example:

reviews:
  pre_merge_checks:
    custom_checks:
      - name: "Undocumented Breaking Changes"
        mode: "warning"
        instructions: |
          Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).

Please share your feedback with us on this Discord post.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
wren-ai-service/src/providers/engine/wren.py (3)

167-168: Return type annotation mismatch vs actual returns (WrenIbis.execute_sql).

Function returns three values throughout, but the annotation declares two. Also the timeout branch returns a string instead of the error dict used elsewhere.

Apply:

-    ) -> Tuple[bool, Optional[Dict[str, Any]]]:
+    ) -> Tuple[bool, Optional[Dict[str, Any]], Dict[str, str]]:

And fix the timeout branch below (see next comment).


286-287: Return type annotation mismatch vs actual returns (WrenEngine.execute_sql).

The function returns a dict for the third item, not an Optional[str]. Align the type to match runtime behavior and other providers.

Apply:

-    ) -> Tuple[bool, Optional[Dict[str, Any]], Optional[str]]:
+    ) -> Tuple[bool, Optional[Dict[str, Any]], Dict[str, str]]:

327-329: Timeout branch returns a string, inconsistent with success/error dicts.

Standardize on the error dict shape to simplify callers.

Apply:

-        except asyncio.TimeoutError:
-            return False, None, f"Request timed out: {timeout} seconds"
+        except asyncio.TimeoutError:
+            return False, None, {"error_message": f"Request timed out: {timeout} seconds", "correlation_id": ""}
🧹 Nitpick comments (6)
wren-ai-service/src/providers/engine/wren.py (6)

90-101: Deduplicate nested lookups; precompute error parts for clarity and fewer allocations.

The repeated get/OR chains are hard to read and maintain. Precompute first error, other, and metadata once, then reuse.

Apply:

-                dialect_sql = (
-                    (
-                        (
-                            (res_json.get("errors", [{}])[0] or {}).get(
-                                "extensions", {}
-                            )
-                            or {}
-                        ).get("other", {})
-                        or {}
-                    ).get("metadata", {})
-                    or {}
-                ).get("dialectSql", "") or ""
+                # Extract nested GraphQL error fields safely (reused below)
+                _err0 = ((res_json.get("errors") or [{}])[0] or {})
+                _other = ((_err0.get("extensions") or {}).get("other") or {})
+                _metadata = (_other.get("metadata") or {})
+                dialect_sql = (_metadata.get("dialectSql") or "")

102-113: Reuse the computed metadata; avoid repeated deep chains.

Apply:

-                planned_sql = (
-                    (
-                        (
-                            (res_json.get("errors", [{}])[0] or {}).get(
-                                "extensions", {}
-                            )
-                            or {}
-                        ).get("other", {})
-                        or {}
-                    ).get("metadata", {})
-                    or {}
-                ).get("plannedSql", "") or ""
+                planned_sql = (_metadata.get("plannedSql") or "")

121-131: Unify correlation_id extraction and add top‑level fallback.

Use the precomputed _other and also fall back to top-level correlationId for parity with the success path.

Apply:

-                        "correlation_id": (
-                            (
-                                (
-                                    (res_json.get("errors", [{}])[0] or {}).get(
-                                        "extensions", {}
-                                    )
-                                    or {}
-                                ).get("other", {})
-                                or {}
-                            ).get("correlationId")
-                            or ""
-                        ),
+                        "correlation_id": (
+                            _other.get("correlationId") or res_json.get("correlationId", "") or ""
+                        ),

85-88: Log with correlation_id for traceability (keep SQL out of logs).

Consider including the extracted correlation_id in the error log once available (or compute it before logging) to ease cross‑service tracing.

If you want, I can propose a small restructuring to compute correlation bits before logging.


184-206: Guard JSON decoding on 204 and standardize error shape (WrenIbis.execute_sql).

Avoid await response.json() on 204 responses; and make the timeout/error returns consistent with other engines (dict with error_message, correlation_id).

Apply:

-            ) as response:
-                if dry_run:
-                    res = await response.text()
-                else:
-                    res = await response.json()
-
-                if response.status == 200 or response.status == 204:
-                    return (
-                        True,
-                        res,
-                        {
-                            "correlation_id": "",
-                        },
-                    )
-
-                return (
-                    False,
-                    None,
-                    {
-                        "error_message": res,
-                        "correlation_id": "",
-                    },
-                )
+            ) as response:
+                if response.status in (200, 204):
+                    if dry_run or response.status == 204:
+                        res = await response.text()
+                    else:
+                        res = await response.json()
+                    return True, res, {"correlation_id": ""}
+
+                # Read error body safely
+                err_text = await response.text()
+                return False, None, {"error_message": err_text, "correlation_id": ""}
         except asyncio.TimeoutError:
-            return False, None, f"Request timed out: {timeout} seconds"
+            return False, None, {"error_message": f"Request timed out: {timeout} seconds", "correlation_id": ""}

67-69: Variable shadowing: “data” reused for response rows.

Minor readability nit: request payload “data” and response “data” share the same name. Consider renaming the latter to rows to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4dd117b and 4eb85f9.

📒 Files selected for processing (1)
  • wren-ai-service/src/providers/engine/wren.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-03-18T10:28:10.593Z
Learnt from: andreashimin
PR: Canner/WrenAI#1414
File: wren-ui/src/apollo/server/utils/error.ts:0-0
Timestamp: 2025-03-18T10:28:10.593Z
Learning: The typo in error code enum names ("IDENTIED_AS_GENERAL" and "IDENTIED_AS_MISLEADING_QUERY" instead of "IDENTIFIED_AS_GENERAL" and "IDENTIFIED_AS_MISLEADING_QUERY") in wren-ui/src/apollo/server/utils/error.ts was recognized but intentionally not fixed as it was considered out of scope for PR #1414.

Applied to files:

  • wren-ai-service/src/providers/engine/wren.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (go)
🔇 Additional comments (1)
wren-ai-service/src/providers/engine/wren.py (1)

293-303: Confirm WrenEngine endpoints accept GET with JSON body

Multiple callers send a GET with a JSON body to /v1/mdl/preview and /v1/mdl/dry-run (wren-ai-service/src/providers/engine/wren.py, wren-ai-service/tools/run_sql.py, wren-ai-service/eval/utils.py, wren-ai-service/eval/data_curation/utils.py, wren-ui/src/apollo/server/adaptors/wrenEngineAdaptor.ts). I could not find server-side route implementations in this repo to verify the API expects GET — confirm the engine accepts GET bodies; if not, change these clients to POST to avoid intermediaries dropping the body.

@cyyeh cyyeh added module/ai-service ai-service related ci/ai-service ai-service related labels Sep 17, 2025
@cyyeh cyyeh requested a review from yichieh-lu September 17, 2025 03:38
@cyyeh cyyeh merged commit c47a4c3 into main Sep 17, 2025
15 checks passed
@cyyeh cyyeh deleted the chore/ai-service/fix-dryrun branch September 17, 2025 04:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/ai-service ai-service related module/ai-service ai-service related wren-ai-service

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants