-
Notifications
You must be signed in to change notification settings - Fork 329
feat(huatuo): add huatuo input plugin with sidecar support #1368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a new input plugin for Huatuo, an eBPF-based monitoring tool, with two operational modes: (1) Sidecar mode that manages the full lifecycle of a local huatuo-bamai process including installation, configuration, and process supervision, and (2) Remote mode that simply scrapes metrics from an existing huatuo instance.
Key changes include:
- New huatuo input plugin with automatic installation from embedded tarballs, dynamic configuration management via TOML overwrites, and autonomous process lifecycle management
- GitHub Actions workflow integration to download and embed huatuo releases for linux-amd64 and linux-arm64
- GoReleaser configuration updates to package embedded huatuo tarballs in architecture-specific releases
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 30 comments.
Show a summary per file
| File | Description |
|---|---|
| inputs/huatuo/huatuo.go | Core plugin implementation with sidecar process management, tarball extraction, config overwriting, port detection, and Prometheus metrics scraping |
| inputs/huatuo/README.md | English documentation describing sidecar and remote modes with configuration examples |
| inputs/huatuo/README_cn.md | Chinese documentation mirroring the English README |
| conf/input.huatuo/huatuo.toml | Configuration template with commented examples for both operational modes |
| agent/metrics_agent.go | Registers the huatuo plugin with the agent |
| go.mod | Promotes pelletier/go-toml/v2 from indirect to direct dependency |
| .goreleaser.yaml | Splits linux builds into separate archives and embeds architecture-specific huatuo tarballs |
| .github/workflows/release.yaml | Downloads latest huatuo releases from GitHub during the release build process |
Critical Issues Found:
- Path traversal vulnerability in tar extraction (Zip Slip)
- Race conditions in process management and shutdown
- Context cancellation not properly respected during long-running process execution
- Missing waitgroup synchronization in Drop() method leading to potential goroutine leaks
- Multiple concurrency safety issues with shared state access
Moderate Issues Found:
- Fragile GitHub API parsing in release workflow without error handling
- Config file reading duplicated across multiple functions
- File glob patterns in GoReleaser could match multiple files with undefined behavior
- Missing validation/verification of downloaded external assets
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func unpackTarGz(r io.Reader, dest string) error { | ||
| gzr, err := gzip.NewReader(r) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| defer gzr.Close() | ||
|
|
||
| tr := tar.NewReader(gzr) | ||
| for { | ||
| header, err := tr.Next() | ||
| if err == io.EOF { | ||
| break | ||
| } | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| target := filepath.Join(dest, header.Name) | ||
| switch header.Typeflag { | ||
| case tar.TypeDir: | ||
| if err := os.MkdirAll(target, 0755); err != nil { | ||
| return err | ||
| } | ||
| case tar.TypeReg: | ||
| f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode)) | ||
| if err != nil { | ||
| return err | ||
| } | ||
| if _, err := io.Copy(f, tr); err != nil { | ||
| f.Close() | ||
| return err | ||
| } | ||
| f.Close() | ||
| } | ||
| } | ||
| return nil |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tar extraction implementation is vulnerable to path traversal attacks (Zip Slip). An attacker could craft a malicious tarball with entries containing path traversal sequences like "../" in header.Name, allowing files to be written outside the intended destination directory. Before constructing the target path, you should validate that filepath.Join(dest, header.Name) after cleaning stays within the dest directory boundary. Consider using filepath.Clean on header.Name and checking that the resulting path has dest as a prefix.
inputs/huatuo/huatuo.go
Outdated
| // Wait for exit | ||
| err := cmd.Wait() | ||
| log.Printf("I! huatuo process exited: %v", err) | ||
|
|
||
| // Allow quick restart unless context cancelled | ||
| time.Sleep(5 * time.Second) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The manageProcess goroutine has a critical issue: it starts an infinite restart loop but never checks if ctx.Done() is signaled during the sleep periods or while the process is running. If the process is long-running, cancelFunc won't take effect until the process exits naturally. The context cancellation should be checked while waiting for the command to finish. Consider using cmd.Wait() in a separate goroutine and selecting on both ctx.Done() and the wait completion, so you can terminate the process immediately when shutdown is requested.
| // Wait for exit | |
| err := cmd.Wait() | |
| log.Printf("I! huatuo process exited: %v", err) | |
| // Allow quick restart unless context cancelled | |
| time.Sleep(5 * time.Second) | |
| // Wait for exit or context cancellation | |
| waitCh := make(chan error, 1) | |
| go func() { | |
| waitCh <- cmd.Wait() | |
| }() | |
| select { | |
| case err := <-waitCh: | |
| log.Printf("I! huatuo process exited: %v", err) | |
| case <-ctx.Done(): | |
| // Context cancelled, try to terminate process | |
| log.Printf("I! context cancelled, terminating huatuo process") | |
| _ = cmd.Process.Signal(os.Interrupt) | |
| // Give it a moment to shut down gracefully | |
| select { | |
| case <-waitCh: | |
| case <-time.After(1 * time.Second): | |
| _ = cmd.Process.Kill() | |
| <-waitCh | |
| } | |
| return | |
| } | |
| // Allow quick restart unless context cancelled | |
| select { | |
| case <-ctx.Done(): | |
| return | |
| case <-time.After(5 * time.Second): | |
| } |
inputs/huatuo/huatuo.go
Outdated
| if err := cmd.Start(); err != nil { | ||
| log.Printf("E! failed to start huatuo: %v", err) | ||
| time.Sleep(10 * time.Second) | ||
| continue | ||
| } | ||
| ins.cmd = cmd | ||
|
|
||
| // Wait for exit | ||
| err := cmd.Wait() | ||
| log.Printf("I! huatuo process exited: %v", err) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The command execution will fail if started without proper permissions since huatuo requires root for eBPF. However, cmd.Start() failure triggers a retry loop with only a 10-second delay. This could generate excessive error logs. Additionally, after successful Start(), the code stores the cmd reference but then immediately calls cmd.Wait() blocking the goroutine. This means ins.cmd will only be accessible for Drop() to kill during this Wait period. If you need to kill the process during startup (before Wait is called), ins.cmd won't be set yet, causing a nil pointer issue in Drop().
inputs/huatuo/huatuo.go
Outdated
| time.Sleep(10 * time.Second) | ||
| continue | ||
| } | ||
| ins.cmd = cmd |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential race condition with ins.cmd. The field is written in the manageProcess goroutine (line 365) and read in the Drop method (line 381) without any synchronization. If Drop is called while manageProcess is between iterations (after cmd.Wait() but before the next cmd assignment), ins.cmd could be nil or stale. Consider protecting ins.cmd access with a mutex, or using atomic operations to safely access the process reference.
| for _, sub := range subEntries { | ||
| oldPath := filepath.Join(srcDir, sub.Name()) | ||
| newPath := filepath.Join(ins.InstallPath, sub.Name()) | ||
| if err := os.Rename(oldPath, newPath); err != nil { | ||
| return fmt.Errorf("failed to move %s to %s: %w", oldPath, newPath, err) | ||
| } | ||
| } |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file moving logic after unpacking has a potential issue. If a file with the same name already exists at the destination (newPath), os.Rename will fail on some systems or overwrite without warning on others. This could happen if the tarball is re-extracted over an existing installation. Consider checking if the destination exists first, or using a safer approach like extracting to a temporary directory first, then moving the entire tree atomically.
inputs/huatuo/huatuo.go
Outdated
| content, err := os.ReadFile(targetFile) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| var data map[string]interface{} | ||
| if err := toml.Unmarshal(content, &data); err != nil { | ||
| return err | ||
| } | ||
|
|
||
| for k, v := range ins.ConfigOverwrites { | ||
| setNestedMap(data, k, v) | ||
| } | ||
|
|
||
| newContent, err := toml.Marshal(data) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| return os.WriteFile(targetFile, newContent, 0644) |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The applyConfig() function reads the entire config file, unmarshals it, modifies values, marshals it back, and writes it every time Init() is called. For the detectPort() function, the config file is read and parsed again. These repeated file I/O operations are inefficient. Consider caching the parsed config or combining these operations into a single read-modify-write cycle, especially since both functions need the same config file.
inputs/huatuo/huatuo.go
Outdated
|
|
||
| func (ins *Instance) scrape(slist *types.SampleList) error { | ||
| client := http.Client{ | ||
| Timeout: 5 * time.Second, |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The timeout value of 5 seconds is hardcoded. Consider making this configurable through the Instance configuration (similar to how other input plugins like prometheus have a configurable Timeout field), allowing operators to adjust based on their network conditions and huatuo response times.
inputs/huatuo/huatuo.go
Outdated
| f.Close() | ||
| return err | ||
| } | ||
| f.Close() |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File handle may be writable as a result of data flow from a call to OpenFile and closing it may result in data loss upon failure, which is not handled explicitly.
| f.Close() | |
| return err | |
| } | |
| f.Close() | |
| closeErr := f.Close() | |
| if closeErr != nil { | |
| return fmt.Errorf("error closing file after copy failure: %v (copy error: %v)", closeErr, err) | |
| } | |
| return err | |
| } | |
| if err := f.Close(); err != nil { | |
| return fmt.Errorf("error closing file after successful copy: %v", err) | |
| } |
inputs/huatuo/huatuo.go
Outdated
| f.Close() | ||
| return err | ||
| } | ||
| f.Close() |
Copilot
AI
Dec 16, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File handle may be writable as a result of data flow from a call to OpenFile and closing it may result in data loss upon failure, which is not handled explicitly.
| f.Close() | |
| return err | |
| } | |
| f.Close() | |
| closeErr := f.Close() | |
| if closeErr != nil { | |
| // Return the write error, but log the close error as well | |
| return fmt.Errorf("write error: %v; close error: %v", err, closeErr) | |
| } | |
| return err | |
| } | |
| if err := f.Close(); err != nil { | |
| return err | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 18 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
inputs/huatuo/huatuo.go
Outdated
| cmd := exec.Command("sudo", binPath, "--config", filepath.Base(confPath), "--region", region) | ||
| cmd.Dir = ins.InstallPath // Set workdir to install path so it finds config if relative | ||
|
|
||
| // Redirect stdout/stderr to log? | ||
| if ins.DebugMod { | ||
| cmd.Stdout = os.Stdout | ||
| cmd.Stderr = os.Stderr | ||
| } | ||
|
|
||
| if err := cmd.Start(); err != nil { | ||
| log.Printf("E! failed to start huatuo: %v", err) | ||
| select { | ||
| case <-ctx.Done(): | ||
| return | ||
| case <-time.After(10 * time.Second): | ||
| continue | ||
| } | ||
| } | ||
|
|
||
| ins.lock.Lock() | ||
| ins.cmd = cmd | ||
| ins.lock.Unlock() | ||
|
|
||
| // Wait for exit or context cancel | ||
| done := make(chan error, 1) | ||
| go func() { | ||
| done <- cmd.Wait() | ||
| }() | ||
|
|
||
| select { | ||
| case <-ctx.Done(): | ||
| // Context cancelled, kill process | ||
| if cmd.Process != nil { | ||
| _ = cmd.Process.Signal(os.Interrupt) | ||
| // Give slight grace period? | ||
| time.Sleep(1 * time.Second) | ||
| _ = cmd.Process.Kill() | ||
| } | ||
| return | ||
| case err := <-done: | ||
| log.Printf("I! huatuo process exited: %v", err) | ||
| } | ||
|
|
||
| // Allow quick restart unless context cancelled | ||
| select { | ||
| case <-ctx.Done(): | ||
| return | ||
| case <-time.After(5 * time.Second): | ||
| // continue loop | ||
| } | ||
| } | ||
| } |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running huatuo-bamai with sudo in an infinite restart loop without rate limiting or backoff is dangerous. If the process fails immediately (e.g., due to misconfiguration), this will spawn processes every 10-15 seconds indefinitely, potentially exhausting system resources. Consider adding exponential backoff, a maximum retry count, or checking if the previous process ran for a minimum duration before restarting.
| _ = cmd.Process.Signal(os.Interrupt) | ||
| // Give slight grace period? | ||
| time.Sleep(1 * time.Second) | ||
| _ = cmd.Process.Kill() |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sending os.Interrupt followed by a 1-second sleep and then Kill is not a graceful shutdown pattern. Consider using cmd.Process.Signal with a proper wait mechanism or use a context with timeout for graceful termination before forcing a kill.
| func setNestedMap(m map[string]interface{}, path string, val interface{}) { | ||
| parts := strings.Split(path, ".") | ||
| current := m | ||
| for i := 0; i < len(parts)-1; i++ { | ||
| part := parts[i] | ||
| next, ok := current[part] | ||
| if !ok || next == nil { | ||
| nextMap := make(map[string]interface{}) | ||
| current[part] = nextMap | ||
| current = nextMap | ||
| } else if nextMap, ok := next.(map[string]interface{}); ok { | ||
| current = nextMap | ||
| } else { | ||
| nextMap := make(map[string]interface{}) | ||
| current[part] = nextMap | ||
| current = nextMap | ||
| } | ||
| } | ||
| current[parts[len(parts)-1]] = val | ||
| } |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The setNestedMap function doesn't handle the case where parts is empty (when path is an empty string). This would cause a panic when accessing parts[len(parts)-1]. Consider adding a check at the beginning to return early or error if path is empty.
| # 远程采集地址 | ||
| url = "http://192.168.1.100:19704/metrics" | ||
|
|
||
| # install_path 必须为空或省略 |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says install_path must be empty or omitted ("必须为空或省略"), but technically it just needs to be empty - omitting it will use the zero value (empty string). Consider clarifying this for consistency with behavior.
| # install_path 必须为空或省略 | |
| # install_path 需保持为空;不配置该字段时其默认值为一个空字符串 |
| ins.lock.Lock() | ||
| ins.cmd = cmd | ||
| ins.lock.Unlock() |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential race condition where ins.cmd is being written without a lock held during the initial assignment, then locked for subsequent accesses. The cmd field should be protected by the lock during the initial assignment at line 378 as well to ensure thread-safe access throughout its lifecycle.
| go ins.manageProcess(ctx) | ||
| ins.wg.Add(1) |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The goroutine started at line 117 is not added to the waitgroup before being launched. The wg.Add(1) call at line 118 happens after the goroutine starts, creating a race condition where the goroutine could potentially call wg.Done() before wg.Add(1) is executed. Move ins.wg.Add(1) before the go statement.
| go ins.manageProcess(ctx) | |
| ins.wg.Add(1) | |
| ins.wg.Add(1) | |
| go ins.manageProcess(ctx) |
| curl -L -o embedded/linux-amd64/huatuo-$VERSION-linux-amd64.tar.gz "https://github.com/ccfos/huatuo/releases/download/$VERSION/huatuo-$VERSION-linux-amd64.tar.gz" | ||
| curl -L -o embedded/linux-arm64/huatuo-$VERSION-linux-arm64.tar.gz "https://github.com/ccfos/huatuo/releases/download/$VERSION/huatuo-$VERSION-linux-arm64.tar.gz" |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The curl downloads lack verification (no checksum validation). If the huatuo release is compromised or the download is corrupted, this will silently include malicious or broken binaries in the categraf release. Consider downloading and verifying checksums, or at least checking that the downloaded files are valid tar.gz archives before proceeding.
| if err := os.MkdirAll(target, 0755); err != nil { | ||
| return err | ||
| } |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating parent directories for every file in the tarball is inefficient. The MkdirAll is called even when the directory already exists from a previous file. Consider checking if the directory exists first or tracking which directories have been created to avoid repeated system calls.
No description provided.