Cli implement #182

Merged
cheyang merged 19 commits into sgl-project:main from diw-zw:cli-implement
Mar 25, 2026

Conversation

diw-zw (Collaborator) commented Mar 6, 2026

Ⅰ. Motivation

Ⅱ. Modifications

Ⅲ. Does this pull request fix one issue?

fixes #XXXX

Ⅳ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

Ⅴ. Describe how to verify it

Ⅵ. Special notes for reviews

Checklist

  • Format your code with make fmt.
  • Add unit tests or integration tests.
  • Update the documentation related to the change.

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the kubectl rbg llm CLI by introducing comprehensive configuration management, model lifecycle operations (pull, list, run), and an extensible plugin architecture. These changes empower users to easily configure their LLM environment, manage model downloads, and deploy inference services directly from the command line, integrating seamlessly with Kubernetes resources.

Highlights

  • LLM Configuration Management: Introduced a new config subcommand under kubectl rbg llm to manage configurations for storage, model sources, and inference engines. This includes commands to add, get, use, set, delete, view, and initialize configurations.
  • Model Pull and Listing: Added pull and models subcommands. The pull command facilitates downloading models from a configured source (e.g., HuggingFace, ModelScope) to a configured storage (e.g., PVC) using Kubernetes Jobs. The models command lists all downloaded models within the active storage.
  • Model Serving Deployment: Implemented a run subcommand to deploy models as inference services. This command generates a Kubernetes Pod template based on the configured engine (e.g., vLLM, SGLang) and specified resources (GPU, CPU, memory), ready for deployment.
  • Extensible Plugin Architecture: Developed a plugin-based system for storage, source, and engine types, allowing for easy extension and integration of new backends. Initial plugins include PVC for storage, HuggingFace and ModelScope for sources, and vLLM and SGLang for engines.
  • Kubernetes Integration for Operations: Leveraged Kubernetes Jobs for model download and listing operations, ensuring these tasks are executed reliably within the cluster environment. Metadata injection is used to track downloaded models.
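
The plugin architecture described in the highlights above can be pictured as a registry keyed by type name. The following is only a minimal sketch under stated assumptions: the `Plugin` interface, `Register`, `Get`, the `pvcPlugin` type, and the `pvcName` config key are hypothetical names chosen for illustration, and the interfaces in this PR define more methods than shown here.

```go
package main

import (
	"fmt"
	"sync"
)

// Plugin is a hypothetical, simplified plugin interface (an assumption,
// not the interface defined in this PR).
type Plugin interface {
	Name() string
	Init(config map[string]string) error
}

// Factory builds a fresh, unconfigured plugin instance.
type Factory func() Plugin

var (
	mu       sync.RWMutex
	registry = map[string]Factory{}
)

// Register stores a factory under a type name. Plugins would typically
// call it from an init() function so that importing the package is enough.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	registry[name] = f
}

// Get creates and initializes a plugin of the given type, or returns an
// error when the type is unknown or the config is rejected.
func Get(name string, config map[string]string) (Plugin, error) {
	mu.RLock()
	f, ok := registry[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown plugin type %q", name)
	}
	p := f()
	if err := p.Init(config); err != nil {
		return nil, fmt.Errorf("init plugin %q: %w", name, err)
	}
	return p, nil
}

// pvcPlugin is an illustrative storage plugin; "pvcName" is a made-up key.
type pvcPlugin struct{ claim string }

func (p *pvcPlugin) Name() string { return "pvc" }

func (p *pvcPlugin) Init(c map[string]string) error {
	claim := c["pvcName"]
	if claim == "" {
		return fmt.Errorf("pvcName is required")
	}
	p.claim = claim
	return nil
}

func init() { Register("pvc", func() Plugin { return &pvcPlugin{} }) }

func main() {
	p, err := Get("pvc", map[string]string{"pvcName": "models"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("loaded plugin:", p.Name())

	_, err = Get("s3", nil)
	fmt.Println("error:", err)
}
```

Registering from `init()` is what makes a blank import of the plugin package sufficient to enable it, which matches the "imported plugin packages to ensure their registration" note in the changelog.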


Changelog
  • cmd/cli/cmd/llm/config/config.go
    • Added the main config command for LLM configuration management, registering various subcommands for storage, source, and engine settings.
  • cmd/cli/cmd/llm/config/config_test.go
    • Added unit tests for the config command, verifying its structure and subcommand registration.
  • cmd/cli/cmd/llm/config/engine.go
    • Added commands for managing LLM engine configurations, including add-engine, get-engines, use-engine, set-engine, and delete-engine.
  • cmd/cli/cmd/llm/config/engine_test.go
    • Added unit tests for the engine configuration management commands.
  • cmd/cli/cmd/llm/config/misc.go
    • Added miscellaneous configuration commands: view to display current settings, set-namespace to change the default namespace, and init for interactive configuration setup.
  • cmd/cli/cmd/llm/config/misc_test.go
    • Added unit tests for miscellaneous configuration commands and helper functions like readLine and selectPlugin.
  • cmd/cli/cmd/llm/config/source.go
    • Added commands for managing LLM source configurations, including add-source, get-sources, use-source, set-source, and delete-source.
  • cmd/cli/cmd/llm/config/source_test.go
    • Added unit tests for the source configuration management commands.
  • cmd/cli/cmd/llm/config/storage.go
    • Added commands for managing LLM storage configurations, including add-storage, get-storages, use-storage, set-storage, and delete-storage.
  • cmd/cli/cmd/llm/config/storage_test.go
    • Added unit tests for the storage configuration management commands.
  • cmd/cli/cmd/llm/llm.go
    • Modified the main llm command to import and register new config, pull, models, and run subcommands.
    • Imported plugin packages to ensure their registration for dynamic loading.
  • cmd/cli/cmd/llm/llm_test.go
    • Added unit tests for the main llm command, verifying the presence and basic properties of new subcommands like config, pull, models, and run.
  • cmd/cli/cmd/llm/models.go
    • Added the models command to list downloaded models in the configured storage.
    • Implemented logic to create and monitor a Kubernetes Job that scans the storage for model metadata.
  • cmd/cli/cmd/llm/models_test.go
    • Added unit tests for the models command, including job creation and output parsing helpers.
  • cmd/cli/cmd/llm/pull.go
    • Added the pull command to download models from a source to storage.
    • Implemented Kubernetes Job creation for model download, including resource limits and metadata injection for tracking.
  • cmd/cli/cmd/llm/pull_test.go
    • Added unit tests for the pull command, covering job creation, state derivation, and shell escaping logic.
  • cmd/cli/cmd/llm/run.go
    • Added the run command to deploy models as inference services.
    • Implemented logic to generate a Kubernetes Pod template based on configured engine, storage, and user-specified resources.
  • cmd/cli/cmd/llm/run_test.go
    • Added unit tests for the run command, verifying flag defaults and environment variable parsing.
  • cmd/cli/cmd/llm/utils.go
    • Added utility functions sanitizeModelID for creating valid Kubernetes resource names and printPodTemplate for outputting Pod YAML.
  • cmd/cli/cmd/llm/utils_test.go
    • Added unit tests for the sanitizeModelID utility function.
  • cmd/cli/config/config.go
    • Added core configuration structures (Config, StorageConfig, SourceConfig, EngineConfig) and management logic.
    • Implemented functions for loading, saving, and performing CRUD operations on storage, source, and engine configurations.
  • cmd/cli/config/config_test.go
    • Added comprehensive unit tests for the configuration management system, covering file operations and CRUD logic for all configuration types.
  • cmd/cli/plugin/engine/interface.go
    • Added the Plugin interface for LLM inference engines, defining methods for configuration, template generation, and service URL retrieval.
    • Implemented a registry for engine plugins to allow dynamic loading and validation.
  • cmd/cli/plugin/engine/interface_test.go
    • Added unit tests for the engine plugin interface and registry.
  • cmd/cli/plugin/engine/sglang.go
    • Added an implementation of the Plugin interface for the SGLang inference engine, including its configuration fields and Pod template generation logic.
  • cmd/cli/plugin/engine/sglang_test.go
    • Added unit tests for the SGLang engine plugin.
  • cmd/cli/plugin/engine/vllm.go
    • Added an implementation of the Plugin interface for the vLLM inference engine, including its configuration fields and Pod template generation logic.
  • cmd/cli/plugin/engine/vllm_test.go
    • Added unit tests for the vLLM engine plugin.
  • cmd/cli/plugin/source/huggingface.go
    • Added an implementation of the Plugin interface for HuggingFace as a model source, including its configuration fields and Pod template generation for model downloads.
  • cmd/cli/plugin/source/huggingface_test.go
    • Added unit tests for the HuggingFace source plugin.
  • cmd/cli/plugin/source/interface.go
    • Added the Plugin interface for LLM model sources, defining methods for configuration and Pod template generation for model downloads.
  • cmd/cli/plugin/source/interface_test.go
    • Added unit tests for the source plugin interface and registry.
  • cmd/cli/plugin/source/modelscope.go
    • Added an implementation of the Plugin interface for ModelScope as a model source, including its configuration fields and Pod template generation for model downloads.
  • cmd/cli/plugin/source/modelscope_test.go
    • Added unit tests for the ModelScope source plugin.
  • cmd/cli/plugin/storage/interface.go
    • Added the Plugin interface for LLM model storage backends, defining methods for configuration, mounting, and model listing.
  • cmd/cli/plugin/storage/interface_test.go
    • Added unit tests for the storage plugin interface and registry.
  • cmd/cli/plugin/storage/pvc.go
    • Added an implementation of the Plugin interface for PersistentVolumeClaim (PVC) storage, including its configuration fields and Pod template modification for volume mounts.
  • cmd/cli/plugin/storage/pvc_test.go
    • Added unit tests for the PVC storage plugin.
  • cmd/cli/plugin/util/util.go
    • Added a utility package with ConfigField struct and ValidateConfig function to standardize plugin configuration validation.
  • cmd/cli/plugin/util/util_test.go
    • Added unit tests for the plugin configuration validation utility.
Activity
  • The pull request was created by diw-zw.
  • The pull request description contains a standard template for motivation, modifications, issue linking, test cases, verification, and special notes, but these sections are currently empty.

gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a comprehensive CLI for managing LLM models, including configuration, pulling, and running models, with a plugin-based architecture for different model sources, storage backends, and inference engines. However, critical command injection vulnerabilities have been identified where user-supplied input (model IDs and revisions) is directly concatenated into shell commands executed within Kubernetes Jobs, allowing arbitrary command execution within the Job containers. Robust shell escaping for all dynamic components or avoiding shell execution is highly recommended. Additionally, there are high-priority issues concerning concurrency and data correctness, and medium-priority improvements for user experience and error handling. Please review the detailed comments for specifics.
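
To illustrate the shell-escaping recommendation in the review above, here is a minimal sketch of POSIX single-quote escaping in Go. It is not the fix that was applied in this PR: `shellQuote` and `buildDownloadScript` are hypothetical names, and `download-model` is a placeholder command, not a real tool.

```go
package main

import (
	"fmt"
	"strings"
)

// shellQuote wraps s in single quotes so a POSIX shell reads it as one
// literal word. An embedded single quote is emitted as '\'' (close the
// quote, add an escaped quote, reopen the quote).
func shellQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
}

// buildDownloadScript is a hypothetical example of assembling a script
// for `sh -c`. "download-model" is a placeholder command name.
func buildDownloadScript(modelID, revision string) string {
	return fmt.Sprintf("download-model --model %s --revision %s",
		shellQuote(modelID), shellQuote(revision))
}

func main() {
	// The injected "; rm -rf /" stays inside the quoted argument.
	fmt.Println(buildDownloadScript("org/model; rm -rf /", "main"))
}
```

Where the container image allows it, passing user input as separate entries in the container's command and args, without going through a shell at all, avoids the need for quoting entirely.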

Comment thread cmd/cli/cmd/llm/pull.go Outdated
Comment thread cmd/cli/plugin/source/huggingface.go Outdated
Comment thread cmd/cli/plugin/source/modelscope.go Outdated
Comment thread cmd/cli/config/config.go
Comment thread cmd/cli/cmd/llm/config/misc.go Outdated
Comment thread cmd/cli/cmd/llm/run.go Outdated
Comment thread cmd/cli/plugin/storage/pvc.go Outdated
diw-zw force-pushed the cli-implement branch 2 times, most recently from 85d03f4 to 36e129f, on March 12, 2026 09:28
diw-zw marked this pull request as ready for review on March 19, 2026 05:27
@gemini-code-assist

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

cheyang requested a review from Copilot on March 24, 2026 08:15
Contributor

Copilot AI left a comment


Pull request overview

Implements a pluggable kubectl rbg llm CLI, including config management, model source/storage/engine plugins, and interactive chat support against running inference services.

Changes:

  • Add plugin interfaces + built-in implementations for storage (PVC/OSS), sources (HuggingFace/ModelScope), and engines (vLLM/SGLang)
  • Introduce CLI config file handling (YAML), plus llm subcommands (config, run, pull, models, list, delete, chat)
  • Add extensive Go unit tests across CLI, config, and plugin packages

Reviewed changes

Copilot reviewed 60 out of 566 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| go.mod | Adds CLI dependencies (readline/term/yaml.v3) needed for interactive UX and config parsing |
| cmd/cli/util/util.go | Adds helpers to create controller-runtime client / REST config for CLI operations |
| cmd/cli/plugin/util/util.go | Adds config validation + masked input utilities for interactive configuration |
| cmd/cli/plugin/util/util_test.go | Tests config validation behavior |
| cmd/cli/plugin/storage/interface.go | Introduces storage plugin interface, registry, and config validation entrypoints |
| cmd/cli/plugin/storage/interface_test.go | Tests storage registry behaviors (unknown type, registration, fields) |
| cmd/cli/plugin/storage/pvc.go | Adds PVC storage plugin implementation |
| cmd/cli/plugin/storage/pvc_test.go | Tests PVC storage plugin behavior and config validation |
| cmd/cli/plugin/storage/oss.go | Adds OSS storage plugin with Secret/PV/PVC provisioning via controller-runtime client |
| cmd/cli/plugin/source/interface.go | Introduces source plugin interface + registry + validation |
| cmd/cli/plugin/source/interface_test.go | Tests source registry behaviors |
| cmd/cli/plugin/source/huggingface.go | Adds HuggingFace source plugin templating logic |
| cmd/cli/plugin/source/huggingface_test.go | Tests HuggingFace source templating, including injection-safety checks |
| cmd/cli/plugin/source/modelscope.go | Adds ModelScope source plugin templating logic |
| cmd/cli/plugin/source/modelscope_test.go | Tests ModelScope source templating and secret precedence |
| cmd/cli/plugin/engine/interface.go | Introduces engine plugin interface + registry + validation |
| cmd/cli/plugin/engine/interface_test.go | Tests engine registry behaviors |
| cmd/cli/plugin/engine/vllm.go | Adds vLLM engine plugin template generator |
| cmd/cli/plugin/engine/vllm_test.go | Tests vLLM engine config/defaults/template generation |
| cmd/cli/plugin/engine/sglang.go | Adds SGLang engine plugin template generator |
| cmd/cli/plugin/engine/sglang_test.go | Tests SGLang engine config/defaults/template generation |
| cmd/cli/config/config.go | Adds CLI YAML config model + load/save + CRUD for storages/sources/engines |
| cmd/cli/config/config_test.go | Adds tests for config path resolution, load/save, and CRUD behaviors |
| cmd/cli/cmd/llm/llm.go | Registers the llm command and new subcommands; imports plugins for registration side-effects |
| cmd/cli/cmd/llm/llm_test.go | Verifies subcommand registration and expected flags |
| cmd/cli/cmd/llm/utils.go | Adds model-id sanitization and YAML printing helpers |
| cmd/cli/cmd/llm/utils_test.go | Tests model-id sanitization logic |
| cmd/cli/cmd/llm/models.go | Adds llm models job-based storage scanner and table output |
| cmd/cli/cmd/llm/models_test.go | Tests list-models Job construction and no-panic printing |
| cmd/cli/cmd/llm/list.go | Adds llm list command to list CLI-created RoleBasedGroups and print a table |
| cmd/cli/cmd/llm/list_printer.go | Adds tabwriter-based list output formatter |
| cmd/cli/cmd/llm/list_test.go | Tests list command metadata and table formatting |
| cmd/cli/cmd/llm/delete.go | Adds llm delete command to delete RoleBasedGroups by name |
| cmd/cli/cmd/llm/delete_test.go | Tests deletion using fake clientset primitives |
| cmd/cli/cmd/llm/pull_test.go | Adds tests for pull job helpers and injection-safe metadata writing |
| cmd/cli/cmd/llm/run_test.go | Adds tests around run command flags and run context resolution |
| cmd/cli/cmd/llm/run/models.yaml | Adds embedded model/mode presets used by run |
| cmd/cli/cmd/llm/run/model_embed.go | Embeds models.yaml into the binary |
| cmd/cli/cmd/llm/run/model_config.go | Adds model/mode config loader and matcher (exact / wildcard / default) |
| cmd/cli/cmd/llm/metadata/metadata.go | Adds label/annotation constants and RunMetadata struct for CLI-created RBGs |
| cmd/cli/cmd/llm/config/config.go | Adds llm config umbrella command and subcommand wiring |
| cmd/cli/cmd/llm/config/config_test.go | Tests config command wiring and subcommand presence |
| cmd/cli/cmd/llm/config/storage.go | Adds storage config subcommands (add/get/use/set/delete) |
| cmd/cli/cmd/llm/config/storage_test.go | Tests storage config subcommand metadata and flags |
| cmd/cli/cmd/llm/config/source.go | Adds source config subcommands (add/get/use/set/delete) |
| cmd/cli/cmd/llm/config/source_test.go | Tests source config subcommand metadata and flags |
| cmd/cli/cmd/llm/config/engine.go | Adds engine config subcommands (set/get/reset) |
| cmd/cli/cmd/llm/config/engine_test.go | Tests engine config subcommand metadata and flags |
| cmd/cli/cmd/llm/config/misc.go | Adds config view and interactive config init wizard |
| cmd/cli/cmd/llm/config/misc_test.go | Tests interactive helpers (ReadLine/selectPlugin) used by the wizard |
| cmd/cli/cmd/llm/chat/chat.go | Adds llm chat interactive/non-interactive chat client with port-forwarding |
| cmd/cli/cmd/llm/chat/client.go | Implements OpenAI-compatible /v1/chat/completions client (streaming + non-streaming) |
| cmd/cli/cmd/llm/chat/portforward.go | Manages kubectl port-forward subprocess lifecycle |
| cmd/cli/cmd/llm/chat/chat_test.go | Adds tests for chat client, REPL behavior, and helper functions |
| cmd/cli/cmd/llm/benchmark/benchmark.go | Updates benchmark code path to use v1alpha2 RBG API |
| .golangci.yml | Suppresses dupl linter for llm config command files |
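
The file summary above describes a model config matcher with exact, wildcard, and default matching. The matcher's real syntax and precedence are not shown in this conversation, so the sketch below makes several assumptions: the `preset` type and `matchPreset` function are hypothetical, wildcards use `path.Match` glob syntax, the fallback entry is keyed `default`, and longer patterns win over shorter ones.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// preset is a hypothetical stand-in for a model/mode preset entry.
type preset struct {
	Name string
	GPUs int
}

// matchPreset resolves a model ID in three steps (an assumed precedence):
// exact key, then glob patterns (longest first), then the "default" key.
func matchPreset(presets map[string]preset, modelID string) (preset, bool) {
	if p, ok := presets[modelID]; ok {
		return p, true
	}

	patterns := make([]string, 0, len(presets))
	for k := range presets {
		if k != "default" {
			patterns = append(patterns, k)
		}
	}
	// Longest pattern first so the more specific one wins; ties are
	// broken lexicographically to keep the result deterministic.
	sort.Slice(patterns, func(i, j int) bool {
		if len(patterns[i]) != len(patterns[j]) {
			return len(patterns[i]) > len(patterns[j])
		}
		return patterns[i] < patterns[j]
	})
	for _, pat := range patterns {
		// Malformed patterns return an error and are skipped.
		if ok, err := path.Match(pat, modelID); err == nil && ok {
			return presets[pat], true
		}
	}

	p, ok := presets["default"]
	return p, ok
}

func main() {
	presets := map[string]preset{
		"Qwen/Qwen2.5-7B": {Name: "exact", GPUs: 1},
		"Qwen/*":          {Name: "wildcard", GPUs: 2},
		"default":         {Name: "default", GPUs: 1},
	}
	for _, id := range []string{"Qwen/Qwen2.5-7B", "Qwen/Qwen3-32B", "meta-llama/Llama-3-8B"} {
		p, _ := matchPreset(presets, id)
		fmt.Printf("%s -> %s\n", id, p.Name)
	}
}
```

Note that in `path.Match` a `*` does not cross a `/`, so a pattern such as `Qwen/*` matches one path segment after the organization name, which suits `org/model` style IDs.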


Comment thread cmd/cli/cmd/llm/chat/chat.go
Comment thread cmd/cli/config/config.go
Comment thread cmd/cli/config/config.go
Comment thread cmd/cli/cmd/llm/delete.go Outdated
Comment thread cmd/cli/cmd/llm/delete.go Outdated
Comment thread cmd/cli/cmd/llm/delete.go
Comment thread cmd/cli/cmd/llm/config/storage.go Outdated
Comment thread cmd/cli/cmd/llm/run/model_config.go Outdated
Comment thread cmd/cli/cmd/llm/config/misc.go Outdated
Comment thread cmd/cli/cmd/llm/llm.go
cheyang

This comment was marked as resolved.

diw-zw force-pushed the cli-implement branch 2 times, most recently from d9120ff to cd6b80d, on March 25, 2026 03:06
cheyang (Collaborator) left a comment


/lgtm
/approve

cheyang merged commit a9106e5 into sgl-project:main on Mar 25, 2026
18 checks passed
3 participants