Skip to content

feat(agents): control plane impl for llm providers v2#2509

Merged
jordan-umusu merged 10 commits intomainfrom
feat/llm-providers-v2-be
Apr 27, 2026
Merged

feat(agents): control plane impl for llm providers v2#2509
jordan-umusu merged 10 commits intomainfrom
feat/llm-providers-v2-be

Conversation

@jordan-umusu
Copy link
Copy Markdown
Collaborator

@jordan-umusu jordan-umusu commented Apr 8, 2026

Summary by cubic

Implements the v2 LLM providers control plane with a shared model catalog, encrypted custom provider management, org/workspace access control, and a canonical default‑model selection. Adds audited, scope‑checked REST APIs and updates the frontend client for these endpoints.

  • New Features

    • Catalog (/organization/agent-catalog): list/get with provider/name filters and cursor pagination; create/update/delete; platform bulk upserts and discovery upserts; effective workspace models via GET /workspaces/{workspace_id}/agent-models; audit events; platform catalog bumped with updated capability flags.
    • Custom providers (/organization/agent-custom-providers): CRUD with encrypted API keys and custom headers; strict base_url validation (HTTP/HTTPS with hostname); passthrough forwarding fix; connection validation; refresh to sync models into the catalog; discovery status tracking (never/running/succeeded/failed); resolves catalog config with decrypted creds/passthrough; audit events.
    • Access control (/organization/agent-model-access): enable/disable models at org or workspace; list enabled models with cursor; audit events.
    • Default model selection (/agent/default-model-selection): GET/PUT the org’s canonical default by catalog_id; stores catalog_id, model_provider, model_name, and optional custom_provider_id; resolves legacy name‑only defaults; frontend adds types and services agentGetDefaultModelSelection and agentSetDefaultModelSelection.
  • Migration

    • Run Alembic migrations to create catalog, provider, and model access tables with tenant RLS.
    • Agent catalog RLS split: platform rows (organization_id is NULL) are readable across orgs but not writable; org‑owned rows remain protected.
    • Set TRACECAT__DB_ENCRYPTION_KEY to enable encryption for provider secrets.

Written for commit 43b3a6b. Summary will update on new commits. Review in cubic

@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 8, 2026 21:42 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu added fix Bug fix agents LLM agents labels Apr 8, 2026
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 8, 2026 21:42 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu changed the base branch from main to feat/llm-providers-v2-models April 8, 2026 21:42
Comment thread tracecat/agent/provider/service.py Outdated
Comment thread tracecat/agent/provider/service.py Outdated
Comment thread tracecat/agent/provider/service.py
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-models branch from b9318a8 to b185e7c Compare April 15, 2026 12:55
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 6ae521f to 4ecff10 Compare April 15, 2026 15:10
Comment thread tracecat/agent/catalog/router.py Outdated
Comment thread tracecat/agent/provider/router.py
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-models branch from 29ee760 to 665fd5b Compare April 20, 2026 14:17
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 4ecff10 to 5dad6bc Compare April 20, 2026 15:38
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-models branch from 7981aa4 to 5bb4db8 Compare April 20, 2026 15:40
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch 3 times, most recently from f7e5329 to 1067cbe Compare April 20, 2026 16:14
@jordan-umusu jordan-umusu marked this pull request as ready for review April 20, 2026 16:17
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 1067cbe to 1416aab Compare April 20, 2026 16:17
@jordan-umusu
Copy link
Copy Markdown
Collaborator Author

@codex review

@jordan-umusu
Copy link
Copy Markdown
Collaborator Author

@cubic review

@cubic-dev-ai
Copy link
Copy Markdown
Contributor

cubic-dev-ai Bot commented Apr 20, 2026

@cubic review

@jordan-umusu I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1416aabc57

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/access/service.py
Comment thread tracecat/agent/catalog/router.py
Comment thread tracecat/agent/provider/service.py Outdated
Comment thread tracecat/agent/catalog/service.py Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9 issues found across 20 files

Confidence score: 2/5

  • High-risk user impact is likely: tracecat/agent/access/service.py can raise MissingGreenlet after commit without a refresh, and tracecat/agent/provider/service.py can silently drop existing secret fields during partial updates (data loss/regression risk).
  • Pagination behavior has concrete correctness issues in tracecat/agent/catalog/service.py (tie-breaker omission can skip rows) and tracecat/agent/catalog/router.py (always returning next_cursor=None can make items beyond limit unreachable).
  • Several router/service error-mapping and integrity gaps (tracecat/agent/access/router.py, tracecat/agent/access/service.py, tracecat/agent/provider/router.py) are likely to surface as 500s or raw DB errors instead of controlled 4xx/domain responses.
  • Pay close attention to tracecat/agent/access/service.py, tracecat/agent/provider/service.py, tracecat/agent/catalog/service.py, tracecat/agent/catalog/router.py - these contain the highest-confidence breakage paths (runtime async ORM failure, config overwrite, and pagination correctness).
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tracecat/agent/provider/schemas.py">

<violation number="1" location="tracecat/agent/provider/schemas.py:50">
P2: Add max-length validation to update fields to match create/DB constraints and prevent oversized PATCH payloads from failing at DB commit.</violation>
</file>

<file name="tracecat/agent/service.py">

<violation number="1" location="tracecat/agent/service.py:511">
P2: Default model selection is written with two independent commits, which can leave settings inconsistent if the second write fails.</violation>
</file>

<file name="tracecat/agent/provider/router.py">

<violation number="1" location="tracecat/agent/provider/router.py:52">
P2: Handle invalid pagination cursors in the router and return HTTP 400 instead of letting `ValueError` propagate.

(Based on your team's feedback about translating known validation/domain errors at router boundaries.) [FEEDBACK_USED]</violation>
</file>

<file name="tracecat/agent/catalog/service.py">

<violation number="1" location="tracecat/agent/catalog/service.py:156">
P1: Cursor pagination can skip rows with identical `created_at` values because the continuation filter ignores the `id` tie-breaker.</violation>
</file>

<file name="tracecat/agent/catalog/router.py">

<violation number="1" location="tracecat/agent/catalog/router.py:214">
P1: This endpoint truncates results but always returns `next_cursor=None`, which can make workspace models beyond `limit` permanently unreachable to clients.</violation>
</file>

<file name="tracecat/agent/access/router.py">

<violation number="1" location="tracecat/agent/access/router.py:83">
P1: Handle `TracecatNotFoundError` in `disable_model` and map it to HTTP 404; otherwise missing IDs return 500 via the global exception handler.

(Based on your team's feedback about translating known domain errors at router/global-handler boundaries.) [FEEDBACK_USED]</violation>
</file>

<file name="tracecat/agent/access/service.py">

<violation number="1" location="tracecat/agent/access/service.py:44">
P1: Missing `IntegrityError` handling for the unique constraint on `(organization_id, workspace_id, catalog_id)`. Enabling the same model twice will surface an unhandled `IntegrityError` instead of a meaningful error. Wrap the commit and translate to a domain error so routers can return HTTP 409.

(Based on your team's feedback about returning HTTP 409 for duplicate assignment operations.) [FEEDBACK_USED]</violation>

<violation number="2" location="tracecat/agent/access/service.py:45">
P1: Missing `await self.session.refresh(access)` after commit. In async SQLAlchemy, the commit expires all loaded objects; accessing attributes on `access` in `model_validate` will raise `MissingGreenlet` because implicit lazy loading is not available in async sessions. Every other agent service refreshes the object before returning it.</violation>
</file>

<file name="tracecat/agent/provider/service.py">

<violation number="1" location="tracecat/agent/provider/service.py:52">
P1: Partial secret update overwrites the entire encrypted config, losing previously stored fields. If a user updates only `api_key`, existing `custom_headers` are silently dropped (and vice versa). Decrypt the existing config first, merge the update, then re-encrypt.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread tracecat/agent/catalog/service.py Outdated
Comment thread tracecat/agent/catalog/router.py Outdated
Comment thread tracecat/agent/access/router.py Outdated
Comment thread tracecat/agent/access/service.py Outdated
Comment thread tracecat/agent/access/service.py
Comment thread tracecat/agent/provider/service.py
Comment thread tracecat/agent/provider/schemas.py Outdated
Comment thread tracecat/agent/service.py Outdated
Comment thread tracecat/agent/provider/router.py Outdated
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 1416aab to 367c8d3 Compare April 20, 2026 22:24
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 367c8d3da5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/service.py
Comment thread tracecat/agent/access/service.py Outdated
Comment thread tracecat/agent/provider/service.py Outdated
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 367c8d3 to 81b3774 Compare April 20, 2026 23:19
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81b3774375

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/provider/router.py Outdated
Comment thread tracecat/agent/access/service.py Outdated
Comment thread tracecat/agent/catalog/service.py
@jordan-umusu jordan-umusu requested a review from daryllimyt April 21, 2026 12:41
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-models branch from 3990191 to 055253b Compare April 27, 2026 14:03
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from f5a6d8e to de58209 Compare April 27, 2026 14:04
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: de58209dbf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/access/service.py
Comment thread tracecat/agent/service.py
Comment on lines +548 to +549
return self._to_default_model_selection(catalog_entry)
return None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Fall back to legacy default when catalog id is stale

This branch returns None as soon as the stored catalog-id setting does not resolve, so the method never checks the legacy agent_default_model value afterward. Since the legacy set_default_model path still updates only the name setting, mixed usage can leave agent_default_model_catalog_id stale and make /agent/default-model-selection disagree with the effective default model (or become null unexpectedly). When catalog-id lookup fails, continue to the legacy-name fallback (or keep both settings synchronized).

Useful? React with 👍 / 👎.

@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-models branch from 055253b to 79607d1 Compare April 27, 2026 14:21
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from de58209 to cb79da3 Compare April 27, 2026 14:22
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb79da3c4c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/provider/schemas.py
Comment thread tracecat/agent/catalog/service.py
Base automatically changed from feat/llm-providers-v2-models to main April 27, 2026 14:34
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from cb79da3 to 86cdaf2 Compare April 27, 2026 14:36
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 14:37 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 14:37 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 86cdaf2 to 843d236 Compare April 27, 2026 15:14
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 15:14 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 15:14 — with GitHub Actions Inactive
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 843d2367c8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread tracecat/agent/service.py
) -> DefaultModelSelection:
"""Set the canonical default model selection for the organization."""
access_svc = AgentModelAccessService(session=self.session, role=self.role)
enabled_models = await access_svc.get_org_models()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid hidden agent:read requirement in default-model update

set_default_model_selection is gated with agent:update, but it calls AgentModelAccessService.get_org_models(), which is decorated with @require_scope("agent:read"). A role that legitimately has update permissions but not read permissions will get a scope-denied error when setting the default model, even though the route advertises update-only access. This makes the endpoint behavior inconsistent with its declared authorization contract.

Useful? React with 👍 / 👎.

Comment thread tracecat/agent/catalog/router.py Outdated
@jordan-umusu jordan-umusu force-pushed the feat/llm-providers-v2-be branch from 843d236 to 43b3a6b Compare April 27, 2026 15:29
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 15:29 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu temporarily deployed to internal-registry-ci April 27, 2026 15:29 — with GitHub Actions Inactive
@jordan-umusu jordan-umusu merged commit 46d95e3 into main Apr 27, 2026
18 checks passed
@jordan-umusu jordan-umusu deleted the feat/llm-providers-v2-be branch April 27, 2026 15:34
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 43b3a6b6dc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +337 to +350
stmt = insert(AgentCatalog).values(values)
stmt = stmt.on_conflict_do_update(
index_elements=[
"organization_id",
"custom_provider_id",
"model_provider",
"model_name",
],
set_={
"model_metadata": stmt.excluded.model_metadata,
"last_refreshed_at": stmt.excluded.last_refreshed_at,
},
)
await self.session.execute(stmt)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prune stale discovered models during provider refresh

upsert_discovered_models() only performs INSERT ... ON CONFLICT DO UPDATE for models returned by the latest discovery run, but it never deletes previously discovered rows that are now absent from the provider response. When a provider removes a model, that stale catalog row remains enabled/selectable and can still be chosen as default, leading to downstream runtime failures against a model that no longer exists. Refresh should reconcile the full set for (organization_id, custom_provider_id, model_provider) by removing rows not present in the new discovery payload.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents LLM agents fix Bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants