Skip to content

Conversation

@kongfei605
Copy link
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings December 16, 2025 13:26
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new input plugin for Huatuo, an eBPF-based monitoring tool, with two operational modes: (1) Sidecar mode that manages the full lifecycle of a local huatuo-bamai process including installation, configuration, and process supervision, and (2) Remote mode that simply scrapes metrics from an existing huatuo instance.

Key changes include:

  • New huatuo input plugin with automatic installation from embedded tarballs, dynamic configuration management via TOML overwrites, and autonomous process lifecycle management
  • GitHub Actions workflow integration to download and embed huatuo releases for linux-amd64 and linux-arm64
  • GoReleaser configuration updates to package embedded huatuo tarballs in architecture-specific releases

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 30 comments.

Show a summary per file
File Description
inputs/huatuo/huatuo.go Core plugin implementation with sidecar process management, tarball extraction, config overwriting, port detection, and Prometheus metrics scraping
inputs/huatuo/README.md English documentation describing sidecar and remote modes with configuration examples
inputs/huatuo/README_cn.md Chinese documentation mirroring the English README
conf/input.huatuo/huatuo.toml Configuration template with commented examples for both operational modes
agent/metrics_agent.go Registers the huatuo plugin with the agent
go.mod Promotes pelletier/go-toml/v2 from indirect to direct dependency
.goreleaser.yaml Splits linux builds into separate archives and embeds architecture-specific huatuo tarballs
.github/workflows/release.yaml Downloads latest huatuo releases from GitHub during the release build process

Critical Issues Found:

  • Path traversal vulnerability in tar extraction (Zip Slip)
  • Race conditions in process management and shutdown
  • Context cancellation not properly respected during long-running process execution
  • Missing waitgroup synchronization in Drop() method leading to potential goroutine leaks
  • Multiple concurrency safety issues with shared state access

Moderate Issues Found:

  • Fragile GitHub API parsing in release workflow without error handling
  • Config file reading duplicated across multiple functions
  • File glob patterns in GoReleaser could match multiple files with undefined behavior
  • Missing validation/verification of downloaded external assets

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 175 to 210
func unpackTarGz(r io.Reader, dest string) error {
gzr, err := gzip.NewReader(r)
if err != nil {
return err
}
defer gzr.Close()

tr := tar.NewReader(gzr)
for {
header, err := tr.Next()
if err == io.EOF {
break
}
if err != nil {
return err
}

target := filepath.Join(dest, header.Name)
switch header.Typeflag {
case tar.TypeDir:
if err := os.MkdirAll(target, 0755); err != nil {
return err
}
case tar.TypeReg:
f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode))
if err != nil {
return err
}
if _, err := io.Copy(f, tr); err != nil {
f.Close()
return err
}
f.Close()
}
}
return nil
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tar extraction implementation is vulnerable to path traversal attacks (Zip Slip). An attacker could craft a malicious tarball with entries containing path traversal sequences like "../" in header.Name, allowing files to be written outside the intended destination directory. Before constructing the target path, you should validate that filepath.Join(dest, header.Name) after cleaning stays within the dest directory boundary. Consider using filepath.Clean on header.Name and checking that the resulting path has dest as a prefix.

Copilot uses AI. Check for mistakes.
Comment on lines 367 to 372
// Wait for exit
err := cmd.Wait()
log.Printf("I! huatuo process exited: %v", err)

// Allow quick restart unless context cancelled
time.Sleep(5 * time.Second)
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manageProcess goroutine has a critical issue: it starts an infinite restart loop but never checks if ctx.Done() is signaled during the sleep periods or while the process is running. If the process is long-running, cancelFunc won't take effect until the process exits naturally. The context cancellation should be checked while waiting for the command to finish. Consider using cmd.Wait() in a separate goroutine and selecting on both ctx.Done() and the wait completion, so you can terminate the process immediately when shutdown is requested.

Suggested change
// Wait for exit
err := cmd.Wait()
log.Printf("I! huatuo process exited: %v", err)
// Allow quick restart unless context cancelled
time.Sleep(5 * time.Second)
// Wait for exit or context cancellation
waitCh := make(chan error, 1)
go func() {
waitCh <- cmd.Wait()
}()
select {
case err := <-waitCh:
log.Printf("I! huatuo process exited: %v", err)
case <-ctx.Done():
// Context cancelled, try to terminate process
log.Printf("I! context cancelled, terminating huatuo process")
_ = cmd.Process.Signal(os.Interrupt)
// Give it a moment to shut down gracefully
select {
case <-waitCh:
case <-time.After(1 * time.Second):
_ = cmd.Process.Kill()
<-waitCh
}
return
}
// Allow quick restart unless context cancelled
select {
case <-ctx.Done():
return
case <-time.After(5 * time.Second):
}

Copilot uses AI. Check for mistakes.
Comment on lines 360 to 369
if err := cmd.Start(); err != nil {
log.Printf("E! failed to start huatuo: %v", err)
time.Sleep(10 * time.Second)
continue
}
ins.cmd = cmd

// Wait for exit
err := cmd.Wait()
log.Printf("I! huatuo process exited: %v", err)
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The command execution will fail if started without proper permissions since huatuo requires root for eBPF. However, cmd.Start() failure triggers a retry loop with only a 10-second delay. This could generate excessive error logs. Additionally, after successful Start(), the code stores the cmd reference but then immediately calls cmd.Wait() blocking the goroutine. This means ins.cmd will only be accessible for Drop() to kill during this Wait period. If you need to kill the process during startup (before Wait is called), ins.cmd won't be set yet, causing a nil pointer issue in Drop().

Copilot uses AI. Check for mistakes.
time.Sleep(10 * time.Second)
continue
}
ins.cmd = cmd
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition with ins.cmd. The field is written in the manageProcess goroutine (line 365) and read in the Drop method (line 381) without any synchronization. If Drop is called while manageProcess is between iterations (after cmd.Wait() but before the next cmd assignment), ins.cmd could be nil or stale. Consider protecting ins.cmd access with a mutex, or using atomic operations to safely access the process reference.

Copilot uses AI. Check for mistakes.
Comment on lines 156 to 162
for _, sub := range subEntries {
oldPath := filepath.Join(srcDir, sub.Name())
newPath := filepath.Join(ins.InstallPath, sub.Name())
if err := os.Rename(oldPath, newPath); err != nil {
return fmt.Errorf("failed to move %s to %s: %w", oldPath, newPath, err)
}
}
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file moving logic after unpacking has a potential issue. If a file with the same name already exists at the destination (newPath), os.Rename will fail on some systems or overwrite without warning on others. This could happen if the tarball is re-extracted over an existing installation. Consider checking if the destination exists first, or using a safer approach like extracting to a temporary directory first, then moving the entire tree atomically.

Copilot uses AI. Check for mistakes.
Comment on lines 235 to 254
content, err := os.ReadFile(targetFile)
if err != nil {
return err
}

var data map[string]interface{}
if err := toml.Unmarshal(content, &data); err != nil {
return err
}

for k, v := range ins.ConfigOverwrites {
setNestedMap(data, k, v)
}

newContent, err := toml.Marshal(data)
if err != nil {
return err
}

return os.WriteFile(targetFile, newContent, 0644)
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The applyConfig() function reads the entire config file, unmarshals it, modifies values, marshals it back, and writes it every time Init() is called. For the detectPort() function, the config file is read and parsed again. These repeated file I/O operations are inefficient. Consider caching the parsed config or combining these operations into a single read-modify-write cycle, especially since both functions need the same config file.

Copilot uses AI. Check for mistakes.

func (ins *Instance) scrape(slist *types.SampleList) error {
client := http.Client{
Timeout: 5 * time.Second,
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timeout value of 5 seconds is hardcoded. Consider making this configurable through the Instance configuration (similar to how other input plugins like prometheus have a configurable Timeout field), allowing operators to adjust based on their network conditions and huatuo response times.

Copilot uses AI. Check for mistakes.
Comment on lines 204 to 207
f.Close()
return err
}
f.Close()
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File handle may be writable as a result of data flow from a call to OpenFile and closing it may result in data loss upon failure, which is not handled explicitly.

Suggested change
f.Close()
return err
}
f.Close()
closeErr := f.Close()
if closeErr != nil {
return fmt.Errorf("error closing file after copy failure: %v (copy error: %v)", closeErr, err)
}
return err
}
if err := f.Close(); err != nil {
return fmt.Errorf("error closing file after successful copy: %v", err)
}

Copilot uses AI. Check for mistakes.
Comment on lines 204 to 207
f.Close()
return err
}
f.Close()
Copy link

Copilot AI Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File handle may be writable as a result of data flow from a call to OpenFile and closing it may result in data loss upon failure, which is not handled explicitly.

Suggested change
f.Close()
return err
}
f.Close()
closeErr := f.Close()
if closeErr != nil {
// Return the write error, but log the close error as well
return fmt.Errorf("write error: %v; close error: %v", err, closeErr)
}
return err
}
if err := f.Close(); err != nil {
return err
}

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 18 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 358 to 409
cmd := exec.Command("sudo", binPath, "--config", filepath.Base(confPath), "--region", region)
cmd.Dir = ins.InstallPath // Set workdir to install path so it finds config if relative

// Redirect stdout/stderr to log?
if ins.DebugMod {
cmd.Stdout = os.Stdout
cmd.Stderr = os.Stderr
}

if err := cmd.Start(); err != nil {
log.Printf("E! failed to start huatuo: %v", err)
select {
case <-ctx.Done():
return
case <-time.After(10 * time.Second):
continue
}
}

ins.lock.Lock()
ins.cmd = cmd
ins.lock.Unlock()

// Wait for exit or context cancel
done := make(chan error, 1)
go func() {
done <- cmd.Wait()
}()

select {
case <-ctx.Done():
// Context cancelled, kill process
if cmd.Process != nil {
_ = cmd.Process.Signal(os.Interrupt)
// Give slight grace period?
time.Sleep(1 * time.Second)
_ = cmd.Process.Kill()
}
return
case err := <-done:
log.Printf("I! huatuo process exited: %v", err)
}

// Allow quick restart unless context cancelled
select {
case <-ctx.Done():
return
case <-time.After(5 * time.Second):
// continue loop
}
}
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running huatuo-bamai with sudo in an infinite restart loop without rate limiting or backoff is dangerous. If the process fails immediately (e.g., due to misconfiguration), this will spawn processes every 10-15 seconds indefinitely, potentially exhausting system resources. Consider adding exponential backoff, a maximum retry count, or checking if the previous process ran for a minimum duration before restarting.

Copilot uses AI. Check for mistakes.
Comment on lines +391 to +394
_ = cmd.Process.Signal(os.Interrupt)
// Give slight grace period?
time.Sleep(1 * time.Second)
_ = cmd.Process.Kill()
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sending os.Interrupt followed by a 1-second sleep and then Kill is not a graceful shutdown pattern. Consider using cmd.Process.Signal with a proper wait mechanism or use a context with timeout for graceful termination before forcing a kill.

Copilot uses AI. Check for mistakes.
Comment on lines +77 to +96
func setNestedMap(m map[string]interface{}, path string, val interface{}) {
parts := strings.Split(path, ".")
current := m
for i := 0; i < len(parts)-1; i++ {
part := parts[i]
next, ok := current[part]
if !ok || next == nil {
nextMap := make(map[string]interface{})
current[part] = nextMap
current = nextMap
} else if nextMap, ok := next.(map[string]interface{}); ok {
current = nextMap
} else {
nextMap := make(map[string]interface{})
current[part] = nextMap
current = nextMap
}
}
current[parts[len(parts)-1]] = val
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The setNestedMap function doesn't handle the case where parts is empty (when path is an empty string). This would cause a panic when accessing parts[len(parts)-1]. Consider adding a check at the beginning to return early or error if path is empty.

Copilot uses AI. Check for mistakes.
# 远程采集地址
url = "http://192.168.1.100:19704/metrics"

# install_path 必须为空或省略
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says install_path must be empty or omitted ("必须为空或省略"), but technically it just needs to be empty - omitting it will use the zero value (empty string). Consider clarifying this for consistency with behavior.

Suggested change
# install_path 必须为空或省略
# install_path 需保持为空;不配置该字段时其默认值为一个空字符串

Copilot uses AI. Check for mistakes.
Comment on lines +377 to +379
ins.lock.Lock()
ins.cmd = cmd
ins.lock.Unlock()
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a potential race condition where ins.cmd is being written without a lock held during the initial assignment, then locked for subsequent accesses. The cmd field should be protected by the lock during the initial assignment at line 378 as well to ensure thread-safe access throughout its lifecycle.

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +118
go ins.manageProcess(ctx)
ins.wg.Add(1)
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goroutine started at line 117 is not added to the waitgroup before being launched. The wg.Add(1) call at line 118 happens after the goroutine starts, creating a race condition where the goroutine could potentially call wg.Done() before wg.Add(1) is executed. Move ins.wg.Add(1) before the go statement.

Suggested change
go ins.manageProcess(ctx)
ins.wg.Add(1)
ins.wg.Add(1)
go ins.manageProcess(ctx)

Copilot uses AI. Check for mistakes.
Comment on lines +30 to +31
curl -L -o embedded/linux-amd64/huatuo-$VERSION-linux-amd64.tar.gz "https://github.com/ccfos/huatuo/releases/download/$VERSION/huatuo-$VERSION-linux-amd64.tar.gz"
curl -L -o embedded/linux-arm64/huatuo-$VERSION-linux-arm64.tar.gz "https://github.com/ccfos/huatuo/releases/download/$VERSION/huatuo-$VERSION-linux-arm64.tar.gz"
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The curl downloads lack verification (no checksum validation). If the huatuo release is compromised or the download is corrupted, this will silently include malicious or broken binaries in the categraf release. Consider downloading and verifying checksums, or at least checking that the downloaded files are valid tar.gz archives before proceeding.

Copilot uses AI. Check for mistakes.
Comment on lines +224 to +226
if err := os.MkdirAll(target, 0755); err != nil {
return err
}
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating parent directories for every file in the tarball is inefficient. The MkdirAll is called even when the directory already exists from a previous file. Consider checking if the directory exists first or tracking which directories have been created to avoid repeated system calls.

Copilot uses AI. Check for mistakes.
@kongfei605 kongfei605 merged commit 1eb6b41 into flashcatcloud:main Dec 17, 2025
2 of 3 checks passed
@kongfei605 kongfei605 deleted the huatuo branch December 17, 2025 02:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant