Skip to content

Conversation

@kongfei605
Copy link
Collaborator

@kongfei605 kongfei605 commented Dec 19, 2025

No description provided.

Copilot AI review requested due to automatic review settings December 19, 2025 07:51
@kongfei605 kongfei605 changed the title feat(prometheus): robust handling for duplicate metrics without HELP … feat(prometheus): robust handling for duplicate metrics without HELP headers Dec 19, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the Prometheus parser to handle duplicate metrics that lack HELP headers by introducing a nested splitting strategy. The parser now splits metrics by both HELP and TYPE headers, allowing it to properly parse metric streams where metrics are delimited only by TYPE headers.

Key Changes

  • Implemented nested byte splitting: first by HELP headers, then by TYPE headers within each HELP segment
  • Added empty segment filtering to skip whitespace-only content
  • Enhanced error logging to include the actual metric content that failed to parse

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 89 to 104
typeMetrics := bytes.Split(metrics[i], typeHeaderBytes)
for j := range typeMetrics {
if j != 0 {
typeMetrics[j] = append(append([]byte(nil), typeHeaderBytes...), typeMetrics[j]...)
}

if len(bytes.TrimSpace(typeMetrics[j])) == 0 {
continue
}

err := p.parse(typeMetrics[j], slist)
if err != nil {
log.Println("E! parse metrics failed, error:", err, "metrics:", string(typeMetrics[j]))
}
}
}
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new nested splitting logic for handling metrics without HELP headers is not covered by tests. This is complex logic with several edge cases (metrics with only TYPE headers, metrics with both HELP and TYPE headers, empty segments, etc.) that should be tested. The codebase has comprehensive test coverage in other areas (elasticsearch, keepalived, etc. collectors have extensive test files), so this parser should follow the same pattern.

Copilot uses AI. Check for mistakes.
Comment on lines 83 to 93
metrics := bytes.Split(buf, metricHeaderBytes)
for i := range metrics {
if i != 0 {
metrics[i] = append(append([]byte(nil), MetricHeaderBytes...), metrics[i]...)
metrics[i] = append(append([]byte(nil), metricHeaderBytes...), metrics[i]...)
}
err := p.parse(metrics[i], slist)
if err != nil {
log.Println("E! parse metrics failed, error:", err, "metrics:", metrics[i])

typeMetrics := bytes.Split(metrics[i], typeHeaderBytes)
for j := range typeMetrics {
if j != 0 {
typeMetrics[j] = append(append([]byte(nil), typeHeaderBytes...), typeMetrics[j]...)
}
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The nested splitting approach creates multiple intermediate byte slice allocations. Each split operation creates a new slice, and prepending headers with double append creates additional allocations. For large metric payloads, this could result in significant memory overhead.

Consider alternative approaches:

  1. Use a single-pass parser that scans for both HELP and TYPE headers
  2. Pre-allocate slices with estimated capacity
  3. Use bytes.Index to find delimiters without creating intermediate slices

The current approach processes the same data multiple times (split by HELP, then each result split by TYPE), which could impact performance when handling high-volume metric scrapes.

Copilot uses AI. Check for mistakes.
Comment on lines 95 to 97
if len(bytes.TrimSpace(typeMetrics[j])) == 0 {
continue
}
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty check only happens after splitting by TYPE headers, not after splitting by HELP headers. This asymmetry could lead to unnecessary parse attempts on empty or whitespace-only segments from the HELP split. Consider adding a similar check in the outer loop at line 88 to skip empty segments early and avoid the nested TYPE splitting altogether for empty content.

Copilot uses AI. Check for mistakes.

err := p.parse(typeMetrics[j], slist)
if err != nil {
log.Println("E! parse metrics failed, error:", err, "metrics:", string(typeMetrics[j]))
Copy link

Copilot AI Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message now includes the full metric content as a string, which could be quite large for metrics with many labels or long help text. Consider truncating the output or logging only a summary (e.g., first N bytes or first line) to avoid flooding logs, especially when parsing failures are frequent.

Copilot uses AI. Check for mistakes.
@kongfei605 kongfei605 merged commit 823f0e3 into flashcatcloud:main Dec 19, 2025
3 checks passed
@kongfei605 kongfei605 deleted the parser_up branch December 19, 2025 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant