[Infra] Isolate unit test workflows with hardened security posture by yuneng-berri · Pull Request #24740 · BerriAI/litellm

yuneng-berri · 2026-03-28T16:57:25Z

Summary

Problem

The current matrix-based unit test workflow (test-litellm-matrix.yml) has two issues:

- Opaque job names: failures show as test (proxy-unit-b6) or test (other-3), making it impossible to tell what broke at a glance
- Single monolithic workflow: no separation between test domains, making it harder to enforce per-workflow security policies

Fix

- Added 16 new workflow files that replace the matrix with individually-named workflows per test domain
- Created a reusable base workflow (_test-unit-base.yml) to eliminate setup duplication
- Hardened security posture across all workflows:
  - Zero secrets: references (unit tests have no access to any secrets)
  - permissions: { contents: read } only (least privilege)
  - All actions pinned to commit SHAs (not tags) to prevent supply chain attacks
  - persist-credentials: false on all checkout steps
  - Template injection prevention via env: indirection (no ${{ }} in run: blocks)
  - Concurrency groups with cancel-in-progress: true
  - Timeouts on all jobs
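
As a rough illustration of how the hardening items above fit together, here is a minimal sketch of a standalone workflow. It is not taken from the PR: the workflow name, the job key `example`, the `ubuntu-latest` runner, the concurrency group expression, and the `HEAD_REF` step are all assumptions made for the example. Only the checkout SHA and its version comment come from the diff quoted in this thread.

```yaml
# Hypothetical workflow; names, runner, and the echo step are illustrative only.
name: "Unit Tests: Example"

on:
  pull_request:
    branches: [main]

# Least privilege: the GITHUB_TOKEN can only read repository contents.
permissions:
  contents: read

# A newer run for the same workflow and ref cancels the older one.
# The group key is an assumption; any per-workflow, per-ref key would do.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  example:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      # Pinned to a full commit SHA; the trailing comment records the tag.
      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
        with:
          persist-credentials: false

      - name: Print branch name safely
        env:
          # The expression is expanded here, outside the script, so the
          # value reaches bash as data rather than as script text.
          HEAD_REF: ${{ github.head_ref }}
        run: |
          echo "Testing branch: ${HEAD_REF}"
```

Writing `echo "Testing branch: ${{ github.head_ref }}"` directly in the `run:` block would paste the branch name into the script before bash parses it, which is the template injection pattern that the `env:` indirection avoids.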

New Workflows

| Workflow | Tests |
| --- | --- |
| `Unit Tests: LLM Provider Transformations` | Vertex AI + all other provider request/response transforms |
| `Unit Tests: Proxy Auth & Key Management` | JWT, RBAC, API key validation, policy engine |
| `Unit Tests: Proxy API Endpoints` | All proxy HTTP endpoint handlers (15 subdirs) |
| `Unit Tests: Proxy Infrastructure` | DB ops, middleware, spend tracking, experimental |
| `Unit Tests: Core Utilities` | Token counting, cost calculation, streaming |
| `Unit Tests: Integrations` | Mocked Langfuse, DataDog, Prometheus callbacks |
| `Unit Tests: Responses, Caching & Types` | Response format conversion, cache strategy, types |
| `Unit Tests: Enterprise, Google GenAI & Routing` | Enterprise features, GenAI transforms, router logic |
| `Unit Tests: MCP, Secrets, Containers & Misc` | Remaining test domains |
| `Unit Tests: Proxy Legacy Tests` | Legacy proxy tests (9 descriptive matrix entries) |
| `Unit Tests: Router` | Router unit tests (new to GHA) |
| `Unit Tests: Pass-Through Endpoints` | Pass-through endpoint tests (new to GHA) |
| `Unit Tests: LiteLLM Utilities` | Utility function tests (new to GHA) |
| `Unit Tests: Security` | Proxy security tests (new to GHA) |
| `Unit Tests: Documentation Validation` | Documentation validation tests (new to GHA) |

Testing

- All 16 workflows trigger and pass on this PR
- Verify job names are descriptive in the Actions tab
- Confirm no secrets are accessible to any unit test job

Type

🚄 Infrastructure
✅ Test

Replace monolithic matrix workflow with individual, descriptively-named workflow files. Each workflow uses a shared reusable base and follows least-privilege security: zero secrets, read-only permissions, SHA-pinned actions, persist-credentials: false, and env-var indirection to prevent template injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

vercel · 2026-03-28T16:57:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Mar 28, 2026 5:18pm |

CLAassistant · 2026-03-28T16:57:34Z

All committers have signed the CLA.

codspeed-hq · 2026-03-28T16:59:14Z

Merging this PR will not alter performance

✅ 16 untouched benchmarks

Comparing litellm_unit_test_workflow_isolation (c717189) with main (2eb3c20)

greptile-apps · 2026-03-28T17:03:56Z

Greptile Summary

This PR introduces a hardened CI architecture by decomposing the single test-litellm-matrix.yml into 11 new individual workflow files (10 caller workflows + 1 reusable base), with 5 additional workflows referenced in the PR description but not yet present in this diff. The security improvements are genuine and well-executed: all external actions are pinned to commit SHAs, persist-credentials: false is applied universally, permissions: contents: read is enforced everywhere, secrets are fully absent, and template injection is prevented via env: indirection throughout.

Key changes:

- _test-unit-base.yml: Centralises the 6-step setup (checkout, Python, Poetry, cache, deps, Prisma) into a reusable workflow_call target, parameterising test-path, workers, reruns, timeout-minutes, and max-failures
- 10 caller workflows: Each maps one or more test directories to the base, with descriptive job names replacing the opaque test (proxy-unit-b6) style
- test-unit-proxy-legacy.yml: Does not reuse the base; its matrix job is defined inline rather than calling the reusable workflow, so it duplicates all 6 setup steps and runs a 9-entry alphabetically-partitioned matrix over tests/proxy_unit_tests/
- Dependency version improvements: google-cloud-aiplatform is now pinned (==1.115.0 vs old >=1.38), openapi-core is pinned (==0.23.0), and nodejs-wheel-binaries==24.13.1 is added for reproducible Prisma client generation
- Coverage verification: Cross-referencing against test-litellm-matrix.yml confirms all 20 old matrix test paths are preserved, with router_strategy correctly migrated from other-3 to enterprise-routing
- The old test-litellm-matrix.yml still exists and will double CI minutes on every PR until removed (noted in prior threads)
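
To make the caller/base split described in the key changes concrete, the following is a minimal sketch of how a reusable `workflow_call` target and one caller might be wired. The input names (`test-path`, `workers`, `reruns`, `timeout-minutes`, `max-failures`) and the job key `run` come from this thread; the default values, the `ubuntu-latest` runner, the caller job key, and the test path are assumptions, and the setup steps are elided.

```yaml
# File 1 (sketch): .github/workflows/_test-unit-base.yml
# Defaults below are assumptions, not the values used in the PR.
on:
  workflow_call:
    inputs:
      test-path:
        required: true
        type: string
      workers:
        type: number
        default: 2
      reruns:
        type: number
        default: 1
      timeout-minutes:
        type: number
        default: 20
      max-failures:
        type: number
        default: 10

permissions:
  contents: read

jobs:
  run:
    runs-on: ubuntu-latest
    timeout-minutes: ${{ inputs.timeout-minutes }}
    steps:
      # ... checkout, Python, Poetry, cache, dependency, and Prisma steps ...
      - name: Run tests
        env:
          # Inputs are passed through env so no expression appears in run:.
          TEST_PATH: ${{ inputs.test-path }}
          WORKERS: ${{ inputs.workers }}
          RERUNS: ${{ inputs.reruns }}
          MAX_FAILURES: ${{ inputs.max-failures }}
        run: |
          poetry run pytest ${TEST_PATH} \
            --maxfail="${MAX_FAILURES}" \
            -n "${WORKERS}" \
            --reruns "${RERUNS}"
---
# File 2 (sketch): a caller workflow; job key and test path are hypothetical.
on:
  pull_request:
    branches: [main]

permissions:
  contents: read

jobs:
  vertex-ai:
    uses: ./.github/workflows/_test-unit-base.yml
    with:
      test-path: tests/path/to/vertex_ai   # hypothetical path
      workers: 1                           # single worker for isolation
```

With this shape, a check would show up as `vertex-ai / run`: the caller's job key followed by the job key inside the base workflow.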

Confidence Score: 5/5

Safe to merge; all findings are P2 style/quality suggestions with no correctness or security impact

The security hardening is correctly implemented throughout. Full test coverage parity with the old matrix workflow was confirmed by cross-referencing both files. The only remaining concerns are P2: a latent alphabetical gap in the proxy-legacy glob partition (no current file falls through) and unquoted glob expansion in the base workflow (existing behaviour, not a regression). Known open items — old matrix not deleted, no workflow_dispatch, no push trigger — are already tracked in prior threads and do not block merge.

test-unit-proxy-legacy.yml — setup duplication and alphabetical glob partitioning warrant a maintenance comment for future contributors
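
The "latent alphabetical gap" mentioned above can be pictured with a small matrix fragment. This is a sketch under assumptions: the group names and glob patterns are hypothetical and cover only part of the alphabet; only the `matrix.test-group` shape with `name` and `path` keys and the `tests/proxy_unit_tests/` directory come from the diff quoted in this thread.

```yaml
# Hypothetical fragment of a job definition; patterns are illustrative only.
strategy:
  matrix:
    test-group:
      - name: "proxy-unit-a-c"
        path: "tests/proxy_unit_tests/test_[a-c]*.py"
      - name: "proxy-unit-d-f"
        path: "tests/proxy_unit_tests/test_[d-f]*.py"
      # A file such as test_1_example.py matches neither range above.
      # Without a catch-all group like this one it would never be run,
      # and no job would fail to signal the omission.
      - name: "proxy-unit-other"
        path: "tests/proxy_unit_tests/test_[!a-f]*.py"
```

One caveat with a catch-all group: when a pattern matches no files, bash passes it to pytest unexpanded and pytest reports the path as not found, so the group only works while at least one file matches it.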

Important Files Changed

| Filename | Overview |
| --- | --- |
| .github/workflows/_test-unit-base.yml | New reusable base workflow with SHA-pinned actions, env-indirection for template injection prevention, and parameterized pytest runner |
| .github/workflows/test-unit-proxy-legacy.yml | Legacy proxy test workflow inlines all 6 setup steps from base (the matrix job does not call the base), with alphabetic glob partitioning across 9 matrix groups; setup must be manually kept in sync with _test-unit-base.yml |
| .github/workflows/test-unit-llm-providers.yml | Splits LLM providers into vertex-ai (workers:1 for isolation) and all-others (--ignore= flag for exclusion), matching old matrix structure |
| .github/workflows/test-unit-proxy-endpoints.yml | Covers 15 proxy endpoint subdirectories split from old monolithic proxy-misc group; rag_endpoints and realtime_endpoints directories absent (pre-existing gap, not a regression) |
| .github/workflows/test-unit-enterprise-routing.yml | Adds router_strategy (was in old other-3) alongside enterprise, google_genai, router_utils; uses default 20-minute timeout for 4 directories |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    PR[Pull Request to main] --> CW1
    PR --> CW2
    PR --> CW3
    PR --> CW4
    PR --> CW5
    PR --> CW6
    PR --> CW7
    PR --> CW8
    PR --> CW9
    PR --> CW10

    CW1["test-unit-llm-providers\n(vertex-ai + other-providers)"]
    CW2["test-unit-proxy-auth\n(auth / hooks / policy / client)"]
    CW3["test-unit-proxy-endpoints\n(15 endpoint subdirs)"]
    CW4["test-unit-proxy-infra\n(db / middleware / spend / pass-through)"]
    CW5["test-unit-core-utils\n(litellm_core_utils)"]
    CW6["test-unit-integrations\n(callbacks & logging)"]
    CW7["test-unit-responses-caching-types\n(responses / caching / types)"]
    CW8["test-unit-enterprise-routing\n(enterprise / google_genai / router)"]
    CW9["test-unit-misc\n(MCP / secrets / containers / root)"]
    CW10["test-unit-proxy-legacy\n(matrix: 9 alphabetic groups)"]

    CW1 --> BASE["_test-unit-base.yml\n(reusable workflow)"]
    CW2 --> BASE
    CW3 --> BASE
    CW4 --> BASE
    CW5 --> BASE
    CW6 --> BASE
    CW7 --> BASE
    CW8 --> BASE
    CW9 --> BASE

    CW10 --> INLINE["Inline setup\n(duplicated from base)"]

    BASE --> STEPS["checkout → python → poetry\n→ cache → install deps\n→ enterprise → prisma → pytest"]
    INLINE --> STEPS2["checkout → python → poetry\n→ cache → install deps\n→ enterprise → prisma\n→ pytest (matrix loop)"]

Reviews (3): Last reviewed commit: "[Infra] Remove workflows that require AP..."

greptile-apps · 2026-03-28T17:04:00Z

+  pull_request:
+    branches: [main]


Old matrix workflow not removed — tests will run twice

The PR description states these new workflows "replace" test-litellm-matrix.yml, but that file still exists and has the same pull_request: branches: [main] trigger. With both active, every PR will now run the old matrix jobs AND all 16 new workflow runs concurrently. This doubles CI minutes and resource consumption for every PR.

The old workflow covers the same test paths (llms-vertex, llms-other, proxy-guardrails, proxy-core, proxy-misc, integrations, core-utils, other-1 through other-3, root, and all 9 proxy-unit-* entries) — fully overlapping the new workflows. You'll want to delete .github/workflows/test-litellm-matrix.yml as part of this PR (or a fast-follow) to avoid the duplication.

greptile-apps · 2026-03-28T17:04:01Z

+  pull_request:
+    branches: [main]
+


No workflow_dispatch trigger on any new workflow

None of the 16 new workflows include a workflow_dispatch trigger, making it impossible to manually re-run a specific test suite from the GitHub Actions UI without pushing a new commit. This is particularly inconvenient when a flaky test needs a targeted re-run. Adding workflow_dispatch: {} (or with optional inputs) to each caller workflow would restore that capability.

Suggested change

-  pull_request:
-    branches: [main]
+on:
+  pull_request:
+    branches: [main]
+  workflow_dispatch: {}

greptile-apps · 2026-03-28T17:04:02Z

+    steps:
+      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
+        with:
+          persist-credentials: false
+
+      - name: Set up Python
+        uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
+        with:
+          python-version: "3.12"
+
+      - name: Install Poetry
+        run: pip install 'poetry==2.3.2'
+
+      - name: Cache Poetry dependencies
+        uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
+        with:
+          path: |
+            ~/.cache/pypoetry
+            ~/.cache/pip
+            .venv
+          key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
+          restore-keys: |
+            ${{ runner.os }}-poetry-
+
+      - name: Install dependencies
+        run: |
+          poetry config virtualenvs.in-project true
+          poetry install --with dev,proxy-dev --extras "proxy semantic-router"
+          poetry run pip install google-genai==1.22.0 \
+            google-cloud-aiplatform==1.115.0 fastapi-offline==1.7.3 python-multipart==0.0.22 openapi-core==0.23.0
+
+      - name: Setup litellm-enterprise
+        run: |
+          poetry run pip install --force-reinstall --no-deps -e enterprise/
+
+      - name: Generate Prisma client
+        env:
+          PRISMA_BINARY_CACHE_DIR: ${{ runner.temp }}/prisma-cache
+        run: |
+          poetry run pip install nodejs-wheel-binaries==24.13.1
+          poetry run prisma generate --schema litellm/proxy/schema.prisma
+
+      - name: Run tests - ${{ matrix.test-group.name }}
+        env:
+          TEST_PATH: ${{ matrix.test-group.path }}
+        run: |
+          poetry run pytest ${TEST_PATH} \
+            --tb=short -vv \
+            --maxfail=10 \
+            -n 2 \
+            --reruns 1 \
+            --reruns-delay 1 \
+            --dist=loadscope \
+            --durations=20


Full setup duplicated from _test-unit-base.yml

Because test-unit-proxy-legacy.yml defines its matrix job inline rather than calling the base workflow, it inlines all seven setup steps (checkout, Python, Poetry install, cache, dependencies, enterprise, Prisma generate) that already live in the base workflow. Any future change to the shared setup (e.g. bumping poetry or google-genai versions, adding a new install step) must be manually applied in both places.

Consider adding a comment to the top of the job explicitly flagging this as intentional duplication that must be kept in sync with _test-unit-base.yml, so future maintainers don't accidentally diverge the two.
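
A maintenance comment of the kind suggested here could look roughly like the sketch below. The job key `proxy-legacy`, the runner, the timeout value, and the comment wording are hypothetical; the checkout step is copied from the diff quoted above, and the remaining setup steps are elided.

```yaml
jobs:
  proxy-legacy:
    # KEEP IN SYNC: the setup steps in this job intentionally duplicate
    # .github/workflows/_test-unit-base.yml. When changing a pinned action,
    # a dependency version, or an install step, update both files.
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955 # v4.3.0
        with:
          persist-credentials: false
      # ... remaining setup steps, identical to the base workflow ...
```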

greptile-apps · 2026-03-28T17:04:03Z

+  pull_request:
+    branches: [main]
+


No push trigger on main — post-merge regressions undetected

All 16 workflows only trigger on pull_request: branches: [main]. Once a PR is merged, none of these jobs run again, so a merge that introduces a regression on main won't be caught until the next PR opens. Consider adding a push: branches: [main] trigger to at least the most critical suites (e.g. proxy-auth, llm-providers) to maintain a green main signal.

This also applies to: test-unit-documentation.yml, test-unit-enterprise-routing.yml, test-unit-integrations.yml, test-unit-litellm-utils.yml, test-unit-llm-providers.yml, test-unit-misc.yml, test-unit-pass-through.yml, test-unit-proxy-auth.yml, test-unit-proxy-endpoints.yml, test-unit-proxy-infra.yml, test-unit-proxy-legacy.yml, test-unit-responses-caching-types.yml, test-unit-router.yml, test-unit-security.yml.
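
A sketch of what the suggested `push` trigger might look like on one caller workflow follows. The job key and test path are hypothetical, and the concurrency expression is an assumption added for illustration: with an unconditional `cancel-in-progress: true`, a second push to main would cancel the run for the previous commit, so the sketch cancels superseded runs only for pull requests.

```yaml
on:
  pull_request:
    branches: [main]
  # Re-run the suite after merge so regressions on main are visible.
  push:
    branches: [main]

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  # Cancel superseded runs on PRs, but let every push to main finish.
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  proxy-auth:
    uses: ./.github/workflows/_test-unit-base.yml
    with:
      test-path: tests/path/to/proxy/auth   # hypothetical path
```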

Rename job keys from generic 'test' to descriptive names (e.g., 'core-utils', 'proxy-auth', 'router') so GitHub checks display as 'core-utils / run' instead of 'test / test'.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

These test suites are not pure unit tests and don't belong in Phase 1:
- litellm_utils_tests: health check tests need OPENAI_API_KEY
- pass_through_unit_tests: tests hit real Anthropic API
- router_unit_tests: tests call real OpenAI moderation endpoints
- proxy_security_tests: requires DATABASE_URL (Postgres)
- documentation_tests: requires docs directory at specific relative path

These will be re-added in later phases with proper secret scoping.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

vercel bot deployed to Preview March 28, 2026 16:58

greptile-apps bot reviewed Mar 28, 2026

vercel bot deployed to Preview March 28, 2026 17:09

vercel bot deployed to Preview March 28, 2026 17:18

yuneng-berri requested a review from ryan-crabbe-berri March 28, 2026 17:25

ryan-crabbe-berri approved these changes Mar 28, 2026

yuneng-berri merged commit 428d837 into main Mar 28, 2026
57 of 105 checks passed

yuneng-berri deleted the litellm_unit_test_workflow_isolation branch March 28, 2026 17:30

yuneng-berri merged 3 commits into main from litellm_unit_test_workflow_isolation