-
Notifications
You must be signed in to change notification settings - Fork 329
feat(snmp_zabbix): support dependent item type and LLD lifetime manag… #1378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(snmp_zabbix): support dependent item type and LLD lifetime manag… #1378
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds support for dependent item types and Low-Level Discovery (LLD) lifetime management to the snmp_zabbix input plugin. The implementation aligns with Zabbix 7.0+ lifecycle features, enabling sophisticated monitoring workflows with master-dependent item relationships and configurable TTL policies for discovered items.
Key Changes:
- Implements dependent item type support with hierarchical processing where dependent items derive their values from master items through preprocessing
- Adds LLD lifetime management with separate delete and disable TTL policies (DELETE_NEVER, DELETE_IMMEDIATELY, DELETE_AFTER, DISABLE_NEVER, DISABLE_IMMEDIATELY, DISABLE_AFTER)
- Introduces internal metrics collection for monitoring item discovery states and countdown timers
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| inputs/snmp_zabbix/template.go | Adds LLD lifetime fields to DiscoveryRule struct, simplifies macro expansion logic, and adds GetStats() and GetSNMPDiscoveryRules() helper functions |
| inputs/snmp_zabbix/snmp.go | Refactors getTemplateStaticItems() to support dependent item tree building and integrates LLD lifetime parsing with scheduler updates |
| inputs/snmp_zabbix/scheduler.go | Implements item state management (IsLost, IsDisabled, LostSince), adds maintenance loop for TTL enforcement, and provides internal metrics collection |
| inputs/snmp_zabbix/discovery.go | Defines lifetime duration constants, adds DependentItems field to MonitorItem, improves OID prefix validation, implements dependent item tree building in ApplyItemPrototypes(), and adds ParseLLDLifetimes() function |
| inputs/snmp_zabbix/collector.go | Adds dependent item processing logic with recursive tree traversal to process master-dependent chains |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *agentItemsMap[key] | ||
| childrenKeys := dependencyMap[key] | ||
| for _, childKey := range childrenKeys { | ||
| childItem := buildTree(childKey) | ||
| item.DependentItems = append(item.DependentItems, childItem) | ||
| } | ||
| return item |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buildTree function creates a value copy of the item from the map (line 570), but this means modifications to item.DependentItems won't be reflected in the original map entry. Since the function is recursive and builds a tree, this is actually correct behavior - each level returns a new copy with its dependents attached. However, there's a potential issue: if a dependent item has a master that doesn't exist in the map, the function will panic with a nil pointer dereference when accessing *agentItemsMap[key]. The validation at lines 554-560 logs a warning but doesn't prevent the panic.
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 14 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| func copyTags(src map[string]string) map[string]string { | ||
| dst := make(map[string]string, len(src)+1) |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Memory allocation inefficiency. The code creates a new map with capacity len(src)+1 but then only copies len(src) items. The +1 suggests space for the "action" tag, but that tag is added by the caller after this function returns. Either remove the +1 or add a capacity parameter to make the function more flexible.
| dst := make(map[string]string, len(src)+1) | |
| dst := make(map[string]string, len(src)) |
|
|
||
| resultChan <- collectionResult | ||
|
|
||
| if len(item.DependentItems) > 0 && err == nil { |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate statement detected. The line resultChan <- collectionResult appears twice consecutively, which will send the same result to the channel twice. This will cause duplicate metrics to be collected for dependent items. Remove one of the duplicate lines.
| IsLabelProvider: false, | ||
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *agentItemsMap[key] |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil pointer dereference. The code dereferences agentItemsMap[key] without checking if the key exists in the map. If a child references a key that doesn't exist (e.g., circular dependency or orphaned item), this will panic. Add an existence check before dereferencing.
| if c.config.DebugMode { | ||
| log.Printf("D! dependent preprocessing failed for %s: %v", item.Key, err) | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error from preprocessing is checked but only logged in debug mode, then the function returns early without sending any result to the channel. This means dependent items with preprocessing errors will silently disappear from metrics. Consider sending an error result to the channel or at least logging at a higher level than debug.
| if c.config.DebugMode { | |
| log.Printf("D! dependent preprocessing failed for %s: %v", item.Key, err) | |
| } | |
| log.Printf("E! dependent preprocessing failed for %s: %v", item.Key, err) | |
| collectionResult.Error = err | |
| resultChan <- collectionResult |
| func (c *SNMPCollector) processDependentItems(agent string, dependents []MonitorItem, masterValue interface{}, resultChan chan<- CollectionResult) { | ||
| for _, depItem := range dependents { |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent variable naming: The parameter is named dependents but elsewhere in the codebase uses DependentItems. Consider renaming to dependentItems for consistency with the field name and other similar variables in the codebase.
| func (c *SNMPCollector) processDependentItems(agent string, dependents []MonitorItem, masterValue interface{}, resultChan chan<- CollectionResult) { | |
| for _, depItem := range dependents { | |
| func (c *SNMPCollector) processDependentItems(agent string, dependentItems []MonitorItem, masterValue interface{}, resultChan chan<- CollectionResult) { | |
| for _, depItem := range dependentItems { |
| for _, depTmpl := range dependentItems { | ||
| child := agentItemsMap[depTmpl.Key] |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent naming: The parameter depTmpl mixes "dep" (abbreviation) with "Tmpl" (abbreviation for template), but this variable represents a dependent item, not a template. Consider renaming to depItem or dependentItem for clarity.
| // >0: 延迟执行的时长 | ||
| func ParseLLDLifetimes(rule DiscoveryRule) (time.Duration, time.Duration) { | ||
| // 1. 解析 Delete 策略 (彻底删除) | ||
| deleteTTL := parseStrategy(rule.LifetimeType, rule.Lifetime, 7*24*time.Hour) // 默认删除时间 7d |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value in the comment doesn't match the code. The comment states "默认删除时间 30d" (default deletion time 30d), but the code uses 7*24*time.Hour which is 7 days, not 30 days. Either update the comment to say "7d" or change the code to use 30 days if that's the intended default.
| deleteTTL := parseStrategy(rule.LifetimeType, rule.Lifetime, 7*24*time.Hour) // 默认删除时间 7d | |
| deleteTTL := parseStrategy(rule.LifetimeType, rule.Lifetime, 7*24*time.Hour) // 默认删除时间 7d |
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *localMap[key] | ||
| children := depMap[key] | ||
| for _, childKey := range children { | ||
| if _, exists := localMap[childKey]; exists { | ||
| childItem := buildTree(childKey) | ||
| item.DependentItems = append(item.DependentItems, childItem) | ||
| } | ||
| } | ||
| return item | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential stack overflow risk. The buildTree function is recursive and doesn't have any protection against circular dependencies. If the dependency graph contains a cycle (A depends on B, B depends on A), this will cause infinite recursion and a stack overflow. Add cycle detection or a visited set to prevent this.
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *agentItemsMap[key] | ||
| childrenKeys := dependencyMap[key] | ||
| for _, childKey := range childrenKeys { | ||
| childItem := buildTree(childKey) | ||
| item.DependentItems = append(item.DependentItems, childItem) | ||
| } | ||
| return item | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential stack overflow risk. Similar to the issue in discovery.go, this recursive buildTree function doesn't have protection against circular dependencies. If there's a cycle in the dependency graph, this will cause infinite recursion and a stack overflow. Add cycle detection or a visited set to prevent this.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 10 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
| func (s *ItemScheduler) runMaintainLoop(ctx context.Context) { | ||
| ticker := time.NewTicker(1 * time.Minute) // 1分钟检查一次,精度更高 |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The maintenance loop ticker is hardcoded to 1 minute. Consider making this configurable or at least defining it as a constant at the package level. This would improve maintainability and allow for easier testing and tuning of the maintenance frequency.
| // 兼容多种形式的上下文键,例如: | ||
| // "SNMPINDEX", "#SNMPINDEX", "{SNMPINDEX}", "{#SNMPINDEX}" | ||
| normalized := macro | ||
|
|
||
| // 去掉外围花括号 | ||
| if strings.HasPrefix(normalized, "{") && strings.HasSuffix(normalized, "}") && len(normalized) > 2 { | ||
| normalized = normalized[1 : len(normalized)-1] | ||
| } | ||
|
|
||
| // 去掉前导 #(discovery 宏前缀) | ||
| normalized = strings.TrimPrefix(normalized, "#") | ||
|
|
||
| if normalized == "" { | ||
| continue | ||
| } | ||
| result = strings.ReplaceAll(result, fmt.Sprintf("{#%s}", normalizedMacro), value) | ||
| result = strings.ReplaceAll(result, fmt.Sprintf("{%s}", normalizedMacro), value) | ||
|
|
||
| result = strings.ReplaceAll(result, fmt.Sprintf("{#%s}", normalized), value) | ||
| result = strings.ReplaceAll(result, fmt.Sprintf("{%s}", normalized), value) |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inline comments in this section are in Chinese (lines 715-725, 731-732). While the code functionality is clear, consider using English for inline comments to maintain consistency with the rest of the codebase and improve accessibility for all contributors. The comment on line 1001 should also be updated to English for consistency.
|
|
||
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *localMap[key] |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential nil pointer dereference. The code dereferences localMap[key] without checking if the key exists in the map. If the key is not found, this will panic. Consider adding a check to ensure the key exists before dereferencing.
| item := *localMap[key] | |
| itemPtr, ok := localMap[key] | |
| if !ok || itemPtr == nil { | |
| // If the key is not present in the local map, return a zero-value item | |
| return MonitorItem{} | |
| } | |
| item := *itemPtr |
| LifetimeType string `yaml:"lifetime_type,omitempty"` // DELETE_NEVER, DELETE_IMMEDIATELY, DELETE_AFTER | ||
| EnabledLifetime string `yaml:"enabled_lifetime,omitempty"` // Disable duration | ||
| EnabledLifetimeType string `yaml:"enabled_lifetime_type,omitempty"` // DISABLE_NEVER, DISABLE_IMMEDIATELY, DISABLE_AFTER |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing documentation for the added lifecycle fields. The new fields LifetimeType, EnabledLifetime, and EnabledLifetimeType lack comments explaining their purpose, valid values, and relationship to Zabbix 7.0+ LLD lifecycle management. Consider adding documentation comments to help future maintainers understand how these fields are used.
| LifetimeType string `yaml:"lifetime_type,omitempty"` // DELETE_NEVER, DELETE_IMMEDIATELY, DELETE_AFTER | |
| EnabledLifetime string `yaml:"enabled_lifetime,omitempty"` // Disable duration | |
| EnabledLifetimeType string `yaml:"enabled_lifetime_type,omitempty"` // DISABLE_NEVER, DISABLE_IMMEDIATELY, DISABLE_AFTER | |
| // LifetimeType controls how long discovered entities are kept after they stop | |
| // being discovered (Zabbix 7.0+ LLD lifecycle). Valid values: | |
| // - "DELETE_NEVER": keep entities indefinitely | |
| // - "DELETE_IMMEDIATELY": remove entities as soon as they are no longer discovered | |
| // - "DELETE_AFTER": remove entities after the duration specified in Lifetime | |
| LifetimeType string `yaml:"lifetime_type,omitempty"` | |
| // EnabledLifetime defines how long a discovered entity is kept in an enabled | |
| // state before it is automatically disabled (but not necessarily deleted). | |
| // The value is a Zabbix duration string (for example: "1h", "7d", "30d") and | |
| // is used together with EnabledLifetimeType for Zabbix 7.0+ LLD lifecycle. | |
| EnabledLifetime string `yaml:"enabled_lifetime,omitempty"` | |
| // EnabledLifetimeType defines when a discovered entity should be disabled in | |
| // relation to EnabledLifetime. Valid values: | |
| // - "DISABLE_NEVER": never automatically disable the entity | |
| // - "DISABLE_IMMEDIATELY": disable as soon as the lifecycle condition is met | |
| // - "DISABLE_AFTER": disable after the duration specified in EnabledLifetime | |
| EnabledLifetimeType string `yaml:"enabled_lifetime_type,omitempty"` |
| func (t *ZabbixTemplate) GetStats() map[string]interface{} { | ||
| stats := make(map[string]interface{}) | ||
|
|
||
| stats["format"] = t.Format | ||
| stats["version"] = t.ZabbixExport.Version | ||
| stats["template_count"] = len(t.ZabbixExport.Templates) | ||
| stats["host_count"] = len(t.ZabbixExport.Hosts) | ||
| stats["total_items"] = len(t.Items) | ||
| stats["total_discovery_rules"] = len(t.DiscoveryRules) | ||
| stats["total_macros"] = len(t.Macros) | ||
|
|
||
| // 按类型统计items | ||
| itemTypes := make(map[string]int) | ||
| for _, item := range t.Items { | ||
| itemType := ConvertZabbixItemType(item.Type) | ||
| itemTypes[itemType]++ | ||
| } | ||
| stats["item_types"] = itemTypes | ||
|
|
||
| // 统计discovery rules中的item prototypes | ||
| totalProtos := 0 | ||
| for _, rule := range t.DiscoveryRules { | ||
| totalProtos += len(rule.ItemPrototypes) | ||
| } | ||
| stats["total_item_prototypes"] = totalProtos | ||
|
|
||
| // 统计triggers | ||
| totalTriggers := 0 | ||
| for _, tmpl := range t.ZabbixExport.Templates { | ||
| totalTriggers += len(tmpl.Triggers) | ||
| } | ||
| for _, host := range t.ZabbixExport.Hosts { | ||
| totalTriggers += len(host.Triggers) | ||
| } | ||
| stats["total_triggers"] = totalTriggers | ||
|
|
||
| // 统计graphs | ||
| totalGraphs := 0 | ||
| for _, tmpl := range t.ZabbixExport.Templates { | ||
| totalGraphs += len(tmpl.Graphs) | ||
| } | ||
| for _, host := range t.ZabbixExport.Hosts { | ||
| totalGraphs += len(host.Graphs) | ||
| } | ||
| stats["total_graphs"] = totalGraphs | ||
|
|
||
| return stats | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new GetStats method lacks test coverage. This method performs aggregation logic across multiple collections (items, discovery rules, triggers, graphs) and should have tests to verify correct counting and proper handling of edge cases like empty templates or nil collections.
| for _, depTmpl := range dependentItems { | ||
| child := agentItemsMap[depTmpl.Key] | ||
| if _, ok := agentItemsMap[depTmpl.MasterKey]; !ok { | ||
| if s.DebugMod { | ||
| log.Printf("W! skipping template item with discovery macros: key=%s, oid=%s", key, oid) | ||
| log.Printf("W! dependent item %s missing master %s on agent %s", child.Key, depTmpl.MasterKey, agentAddr) | ||
| } | ||
| continue | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing validation for master item existence. The code validates that master items exist for dependent items but only logs a warning and continues processing. If a dependent item references a non-existent master, it will later cause a nil pointer dereference in the buildTree function. Consider either skipping the dependent item or ensuring proper error handling.
| // Rebuild root items with full dependency tree | ||
| agentItemsList = nil | ||
| for i := range parsedTemplateItems { | ||
| if parsedTemplateItems[i].MasterKey == "" { | ||
| root := buildTree(parsedTemplateItems[i].Key) | ||
| agentItemsList = append(agentItemsList, &root) | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inefficient slice rebuild. The code sets 'agentItemsList' to nil and then rebuilds it by iterating through all parsedTemplateItems again. This is unnecessary since you can reuse the existing agentItemsList and only rebuild the items that have dependents. Consider optimizing this to avoid the full rebuild.
| // Rebuild root items with full dependency tree | |
| agentItemsList = nil | |
| for i := range parsedTemplateItems { | |
| if parsedTemplateItems[i].MasterKey == "" { | |
| root := buildTree(parsedTemplateItems[i].Key) | |
| agentItemsList = append(agentItemsList, &root) | |
| } | |
| // Enrich existing root items with full dependency tree | |
| for _, itemPtr := range agentItemsList { | |
| rootWithDeps := buildTree(itemPtr.Key) | |
| *itemPtr = rootWithDeps |
| func ParseLLDLifetimes(rule DiscoveryRule) (time.Duration, time.Duration) { | ||
| // 1. 解析 Delete 策略 (彻底删除) | ||
| deleteTTL := parseStrategy(rule.LifetimeType, rule.Lifetime, 7*24*time.Hour) // 默认删除时间 7d | ||
|
|
||
| // 2. 解析 Disable 策略 (停止采集) | ||
| // 默认禁用策略:如果未配置,通常 Zabbix 行为是立即禁用(0)或者永不禁用。 | ||
| // 这里我们设定:如果未显式配置,且旧版 logic (lifetime) 存在,则 Disable 默认为 0 (立即禁用,保持旧版行为一致性) | ||
| disableTTL := parseStrategy(rule.EnabledLifetimeType, rule.EnabledLifetime, DurationImmediately) | ||
|
|
||
| return deleteTTL, disableTTL | ||
| } | ||
|
|
||
| // 辅助函数:通用策略解析 | ||
| // typeStr: DELETE_NEVER, DISABLE_AFTER 等 | ||
| // durationStr: "7d", "1h" | ||
| // defaultValue: 当 typeStr 为空时的默认 duration | ||
| func parseStrategy(typeStr, durationStr string, defaultValue time.Duration) time.Duration { | ||
| // 归一化 | ||
| typeStr = strings.ToUpper(strings.TrimSpace(typeStr)) | ||
|
|
||
| switch typeStr { | ||
| case "DELETE_NEVER", "DISABLE_NEVER": | ||
| return DurationNever | ||
| case "DELETE_IMMEDIATELY", "DISABLE_IMMEDIATELY": | ||
| return DurationImmediately | ||
| case "DELETE_AFTER", "DISABLE_AFTER": | ||
| if durationStr == "" { | ||
| return defaultValue | ||
| } | ||
| return parseZabbixDelay(durationStr) | ||
| default: | ||
| // 兼容旧版配置或空配置 | ||
| // 如果 explicit type 为空,但 durationStr 有值,视为 AFTER | ||
| if typeStr == "" && durationStr != "" { | ||
| return parseZabbixDelay(durationStr) | ||
| } | ||
| // 否则返回默认值 | ||
| return defaultValue | ||
| } | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new ParseLLDLifetimes function lacks test coverage. Given that this function is responsible for parsing critical lifecycle configuration with multiple edge cases (NEVER, IMMEDIATELY, AFTER with durations, empty values, fallback logic), it would benefit from comprehensive unit tests to ensure correct behavior across all scenarios.
| if err != nil { | ||
| if c.config.DebugMode { | ||
| log.Printf("D! dependent preprocessing failed for %s: %v", item.Key, err) | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incomplete error handling. When preprocessing fails for a dependent item, the function returns early without sending any error result through the channel. This means the caller won't be aware of the failure. Consider sending a CollectionResult with the error set before returning.
| } | |
| } | |
| collectionResult.Error = err | |
| resultChan <- collectionResult |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (s *ItemScheduler) CollectInternalMetrics(slist *types.SampleList) { | ||
| s.mu.RLock() | ||
| defer s.mu.RUnlock() | ||
|
|
||
| now := time.Now() | ||
| for _, sch := range s.discovered { | ||
| tags := map[string]string{ | ||
| "agent": sch.Item.Agent, | ||
| "item_key": sch.Item.Key, | ||
| "discovery_rule": sch.Item.DiscoveryRuleKey, | ||
| "oid": sch.Item.OID, | ||
| } | ||
|
|
||
| var state int // 0: Normal, 1: LostActive, 2: LostDisabled | ||
|
|
||
| if !sch.IsLost { | ||
| state = 0 | ||
| } else { | ||
| if sch.IsDisabled { | ||
| state = 2 | ||
| } else { | ||
| state = 1 | ||
| } | ||
| } | ||
| // 1. 上报状态指标 | ||
| slist.PushFront(types.NewSample("", "snmp_zabbix_item_discovery_state", state, tags)) | ||
|
|
||
| // 2. 上报倒计时指标 (仅当 Lost 时) | ||
| if sch.IsLost { | ||
| elapsed := now.Sub(sch.LostSince) | ||
|
|
||
| // 2.1 Disable 倒计时 (仅当 Active 且 DisableTTL != Never) | ||
| if !sch.IsDisabled && sch.DisableTTL != DurationNever { | ||
| remaining := sch.DisableTTL - elapsed | ||
| if remaining < 0 { | ||
| remaining = 0 | ||
| } | ||
| // 复制标签以避免并发修改(虽然这里是新建的map) | ||
| disableTags := copyTags(tags) | ||
| disableTags["action"] = "disable" | ||
| slist.PushFront(types.NewSample("", "snmp_zabbix_item_remaining_seconds", remaining.Seconds(), disableTags)) | ||
| } | ||
|
|
||
| // 2.2 Delete 倒计时 (DeleteTTL != Never) | ||
| if sch.DeleteTTL != DurationNever { | ||
| remaining := sch.DeleteTTL - elapsed | ||
| if remaining < 0 { | ||
| remaining = 0 | ||
| } | ||
| deleteTags := copyTags(tags) | ||
| deleteTags["action"] = "delete" | ||
| slist.PushFront(types.NewSample("", "snmp_zabbix_item_remaining_seconds", remaining.Seconds(), deleteTags)) | ||
| } | ||
| } | ||
|
|
||
| } | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CollectInternalMetrics iterates over all discovered items with only a read lock, but the metrics collection could be resource-intensive with thousands of items, especially when creating and copying tags for each item. Consider implementing pagination, sampling, or batch processing to reduce the performance impact of this operation.
| // Validate masters exist for dependents | ||
| for _, depTmpl := range dependentItems { | ||
| child := agentItemsMap[depTmpl.Key] | ||
| if _, ok := agentItemsMap[depTmpl.MasterKey]; !ok { | ||
| if s.DebugMod { | ||
| log.Printf("W! skipping template item with discovery macros: key=%s, oid=%s", key, oid) | ||
| log.Printf("W! dependent item %s missing master %s on agent %s", child.Key, depTmpl.MasterKey, agentAddr) | ||
| } | ||
| continue | ||
| } | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The validation logs a warning when a master item is missing but continues execution without preventing the dependent item from being added to the dependency map. This could lead to runtime errors when buildTree tries to access a non-existent master. Consider either filtering out these invalid dependents or adding nil checks in buildTree.
| // 如果之前是 Lost,现在恢复了 | ||
| if sch.IsLost { | ||
| sch.IsLost = false | ||
| sch.LostSince = time.Time{} | ||
| log.Printf("I! item recovered from lost: %s", id) | ||
| } | ||
| // 如果之前是 Disabled,现在需要恢复采集 | ||
| wasDisabled := sch.IsDisabled | ||
| if wasDisabled { | ||
| sch.IsDisabled = false | ||
| // 需要重新加入调度队列 | ||
| } | ||
| oldInterval := sch.Interval | ||
| newInterval := newItem.Delay | ||
|
|
||
| if newInterval != oldInterval { | ||
| s.removeFromIntervalSlice(oldInterval, sch) | ||
| // 只有在非 Disabled 状态下,才需要从旧队列移除 | ||
| // 如果 interval 变了,或者之前是 disabled (不在队列中),都需要加入新队列 | ||
| if newInterval != oldInterval || wasDisabled { | ||
| if !wasDisabled { | ||
| // 还在旧队列里,先移除 | ||
| s.removeFromIntervalSlice(oldInterval, sch) | ||
| } | ||
| // 加入新队列 | ||
| s.intervals[newInterval] = append(s.intervals[newInterval], sch) | ||
| sch.Interval = newInterval | ||
| intervalTouchedNew[newInterval] = true | ||
| } |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When an item transitions from disabled to enabled (wasDisabled is true), the TTL fields (DeleteTTL and DisableTTL) are not updated or reset. This means a recovered item may still have stale TTL values from when it was lost. Consider updating or clearing these TTL values when an item is recovered to avoid unexpected behavior.
| for i := range parsedTemplateItems { | ||
| if parsedTemplateItems[i].MasterKey == "" { | ||
| root := buildTree(parsedTemplateItems[i].Key) | ||
| agentItemsList = append(agentItemsList, &root) |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking the address of a loop variable creates a common bug. The variable 'root' is reused across iterations, so all pointers will reference the same memory location. To fix this, create a copy of 'root' inside the loop before taking its address.
| agentItemsList = append(agentItemsList, &root) | |
| rootCopy := root | |
| agentItemsList = append(agentItemsList, &rootCopy) |
| if len(item.DependentItems) > 0 && result.Error == nil { | ||
| c.processDependentItems(agent, item.DependentItems, masterValue, resultChan) |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When master item collection fails, dependent items are silently skipped without any indication. Consider logging a message at debug level when dependent items are skipped due to master collection failure to aid in troubleshooting.
| if len(item.DependentItems) > 0 && result.Error == nil { | |
| c.processDependentItems(agent, item.DependentItems, masterValue, resultChan) | |
| if len(item.DependentItems) > 0 { | |
| if result.Error == nil { | |
| c.processDependentItems(agent, item.DependentItems, masterValue, resultChan) | |
| } else if c.config.DebugMode { | |
| log.Printf("D! skipping %d dependent items for %s on agent %s due to master collection error: %v", | |
| len(item.DependentItems), item.Key, agent, result.Error) | |
| } |
| var agentItemsList []*MonitorItem | ||
| var dependentItems []*TemplateMonitorItem | ||
|
|
||
| key := s.expandMacros(tmplItem.Key) | ||
| oid := s.expandMacros(tmplItem.SNMPOID) | ||
| name := s.expandMacros(tmplItem.Name) | ||
| desc := s.expandMacros(tmplItem.Description) | ||
| for i := range parsedTemplateItems { | ||
| // Deep copy needed for each agent | ||
| newItem := parsedTemplateItems[i].MonitorItem | ||
| newItem.Agent = agentAddr | ||
|
|
||
| itemPtr := &newItem | ||
| agentItemsMap[newItem.Key] = itemPtr | ||
|
|
||
| if parsedTemplateItems[i].MasterKey != "" { | ||
| dependentItems = append(dependentItems, &parsedTemplateItems[i]) | ||
| } else { | ||
| agentItemsList = append(agentItemsList, itemPtr) | ||
| } | ||
| } | ||
|
|
||
| // Static items should not contain discovery macros {#MACRO} | ||
| if strings.Contains(key, "{#") || strings.Contains(oid, "{#") { | ||
| // Validate masters exist for dependents | ||
| for _, depTmpl := range dependentItems { | ||
| child := agentItemsMap[depTmpl.Key] | ||
| if _, ok := agentItemsMap[depTmpl.MasterKey]; !ok { | ||
| if s.DebugMod { | ||
| log.Printf("W! skipping template item with discovery macros: key=%s, oid=%s", key, oid) | ||
| log.Printf("W! dependent item %s missing master %s on agent %s", child.Key, depTmpl.MasterKey, agentAddr) | ||
| } | ||
| continue | ||
| } | ||
| } | ||
|
|
||
| tags := make(map[string]string) | ||
| for _, t := range tmplItem.Tags { | ||
| tags[t.Tag] = t.Value | ||
| } | ||
| dependencyMap := make(map[string][]string) | ||
| for _, dep := range dependentItems { | ||
| dependencyMap[dep.MasterKey] = append(dependencyMap[dep.MasterKey], dep.Key) | ||
| } | ||
|
|
||
| valueType := ConvertZabbixValueType(tmplItem.ValueType) | ||
| monitorItem := MonitorItem{ | ||
| Key: key, | ||
| OID: oid, | ||
| Type: "snmp", | ||
| Name: name, | ||
| Units: tmplItem.Units, | ||
| Delay: parseZabbixDelay(tmplItem.Delay), | ||
| Agent: agentAddr, | ||
| ValueType: valueType, | ||
| Description: desc, | ||
| Preprocessing: tmplItem.Preprocessing, | ||
| Tags: tags, | ||
| IsDiscovered: false, | ||
| IsLabelProvider: false, | ||
| var buildTree func(key string) MonitorItem | ||
| buildTree = func(key string) MonitorItem { | ||
| item := *agentItemsMap[key] | ||
| childrenKeys := dependencyMap[key] | ||
| for _, childKey := range childrenKeys { | ||
| childItem := buildTree(childKey) | ||
| item.DependentItems = append(item.DependentItems, childItem) | ||
| } | ||
| return item | ||
| } | ||
|
|
||
| // Set IsLabelProvider for CHAR/TEXT value types | ||
| switch tmplItem.ValueType { | ||
| case "CHAR", "1", "TEXT", "4": | ||
| monitorItem.IsLabelProvider = true | ||
| monitorItem.LabelKey = extractLabelKey(key) | ||
| // Rebuild root items with full dependency tree | ||
| agentItemsList = nil |
Copilot
AI
Dec 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code creates agentItemsList then immediately sets it to nil on line 580, only to rebuild it. This is wasteful. Consider initializing agentItemsList after determining root items to avoid the unnecessary initial allocation and population.
…ement