[Bug]: freeipmi plugin crashing after 4 hours #21107

@rafaelwastaken

Bug description

The freeipmi plugin crashes about 4 hours after Netdata starts. The crash is consistent: it happens right around the 4-hour mark every time.

The issue appears to have started after Netdata auto-updated to v2.7.0, and it also occurs on v2.7.1. I can reproduce it on Intel-based Dell servers with iDRAC and on AMD-based Supermicro servers with Supermicro IPMI. The plugin does not recover until I manually restart Netdata, and after the restart it crashes again anyway.

Expected behavior

The freeipmi plugin collects IPMI metrics without crashing.

Steps to reproduce

  1. Install Netdata on Arch Linux with the freeipmi plugin on a host with IPMI (Supermicro IPMI, Dell iDRAC, etc)
  2. Configure it to record freeipmi metrics
  3. Leave Netdata running for at least 4 hours
  4. The freeipmi plugin crashes almost exactly 4 hours after Netdata starts
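
To confirm the interval in step 4, the service start time can be compared with the coredump time. The sketch below is only an illustration: it uses the two timestamps from the journald excerpt under "Additional info", and it assumes the year is 2025 (journald's short timestamp format omits the year) and an English locale for the month abbreviation. The `elapsed` helper is a hypothetical name, not part of Netdata.

```python
from datetime import datetime, timedelta

# journald's short format has no year, so one is prepended before parsing.
FMT = "%Y %b %d %H:%M:%S"


def elapsed(start: str, crash: str, year: int = 2025) -> timedelta:
    """Return the time between two journald short-format timestamps."""
    t0 = datetime.strptime(f"{year} {start}", FMT)
    t1 = datetime.strptime(f"{year} {crash}", FMT)
    return t1 - t0


# "Started Netdata" and "dumped core" timestamps copied from the journal excerpt.
delta = elapsed("Sep 26 03:37:44", "Sep 26 07:37:50")
print(delta)  # 4:00:06
```

With these two timestamps the plugin ran for 4 hours and 6 seconds before the core dump.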

Installation method

kickstart.sh

System info

Linux campinas.huenet.net 6.16.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 11 Sep 2025 17:42:36 +0000 x86_64 GNU/Linux
/etc/arch-release:
/etc/os-release:NAME="Arch Linux"
/etc/os-release:PRETTY_NAME="Arch Linux"
/etc/os-release:ID=arch
/etc/os-release:BUILD_ID=rolling
/etc/os-release:ANSI_COLOR="38;2;23;147;209"
/etc/os-release:LOGO=archlinux-logo

Netdata build info

time=2025-10-06T20:14:08.061-04:00 comm=netdata source=daemon level=notice errno="2, No such file or directory" tid=1699283  msg="CONFIG: cannot load user config '/etc/netdata/stream.conf'. Will try stock config."
Packaging:
    Netdata Version ____________________________________________ : v2.7.1
    Installation Type __________________________________________ : kickstart-build
    Package Architecture _______________________________________ : unknown
    Package Distro _____________________________________________ : unknown
    Configure Options __________________________________________ : cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DCMAKE_C_STANDARD=11 -DCMAKE_CXX_STANDARD=14 -DBUILD_SHARED_LIBS= -DCMAKE_C_FLAGS=' -fexceptions -fexceptions -fno-omit-frame-pointer -funwind-tables -fasynchronous-unwind-tables' -DCMAKE_CXX_FLAGS='  -fexceptions -fexceptions -fno-omit-frame-pointer -funwind-tables -fasynchronous-unwind-tables' -DCMAKE_COMPILE_DEFINITIONS='_GNU_SOURCE' -DCMAKE_EXE_LINKER_FLAGS=' -fexceptions -fexceptions -rdynamic' -DCMAKE_SHARED_LINKER_FLAGS=''
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /usr/share/netdata/web
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 6.16.7-arch1-1
    Operating System ___________________________________________ : Arch Linux
    Operating System ID ________________________________________ : arch
    Operating System ID Like ___________________________________ : unknown
    Operating System Version ___________________________________ : unknown
    Operating System Version ID ________________________________ : none
    Detection __________________________________________________ : /etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 48
    CPU Frequency ______________________________________________ : 3350000000
    RAM Bytes __________________________________________________ : 270033293312
    Disk Capacity ______________________________________________ : 132812503351296
    CPU Architecture ___________________________________________ : x86_64
    Virtualization Technology __________________________________ : none
    Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
    Container __________________________________________________ : none
    Container Detection ________________________________________ : systemd-detect-virt
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : none
    Container Operating System ID ______________________________ : none
    Container Operating System ID Like _________________________ : none
    Container Operating System Version _________________________ : none
    Container Operating System Version ID ______________________ : none
    Container Operating System Detection _______________________ : none
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Back-filling (of higher database tiers) ____________________ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (zstd lz4 gzip brotli)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
    Memory Allocator ___________________________________________ : system
Database Engines:
    dbengine (compression) _____________________________________ : YES (zstd lz4)
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : YES
    ZSTD (fast, lossless compression algorithm) ________________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Brotli (generic-purpose lossless compression algorithm) ____ : YES
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : YES
    libcrypto (cryptographic functions) ________________________ : YES
    libyaml (library for parsing and emitting YAML) ____________ : YES
    libmnl (library for working with netfilter) ________________ : YES
    stacktraces (library for getting stack traces) _____________ : libbacktrace (mmap, threads, data)
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    windows (monitor Windows systems) __________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : YES
    ebpf (monitor system calls) ________________________________ : YES
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    network-viewer (monitor TCP/UDP IPv4/6 sockets) ____________ : YES
    systemd-journal (monitor journal logs) _____________________ : YES
    windows-events (monitor Windows events) ____________________ : NO
    nfacct (gather netfilter accounting) _______________________ : NO
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : NO
    Xen VBD Error Tracking _____________________________________ : NO
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : NO
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO
Runtime Information:
    Profile ____________________________________________________ : standalone
    Stream Parent (accept data from Children) __________________ : NO
    Stream Child (send data to a Parent) _______________________ : NO
    Total System Memory ________________________________________ : 270033293312
    Available System Memory ____________________________________ : 205810458624

Additional info

journald logs for the Netdata service, including the freeipmi crash trace. Netdata was updated and restarted on Sep 26 at 03:37, and the crash happened at 07:37 (4 hours later).

-- Boot d54fb2ae10d34be1bc50447b6908d663 --
Sep 13 13:08:49 campinas.huenet.net systemd[1]: Starting Netdata, X-Ray Vision for your infrastructure!...
Sep 13 13:08:49 campinas.huenet.net systemd[1]: Started Netdata, X-Ray Vision for your infrastructure!.
Sep 26 03:37:12 campinas.huenet.net systemd[1]: Stopping Netdata, X-Ray Vision for your infrastructure!...
Sep 26 03:37:13 campinas.huenet.net systemd[1]: netdata.service: Deactivated successfully.
Sep 26 03:37:13 campinas.huenet.net systemd[1]: Stopped Netdata, X-Ray Vision for your infrastructure!.
Sep 26 03:37:13 campinas.huenet.net systemd[1]: netdata.service: Consumed 4d 16h 34.751s CPU time, 2.4G memory peak.
Sep 26 03:37:43 campinas.huenet.net systemd[1]: Starting Netdata, X-Ray Vision for your infrastructure!...
Sep 26 03:37:44 campinas.huenet.net systemd[1]: Started Netdata, X-Ray Vision for your infrastructure!.
Sep 26 07:37:50 campinas.huenet.net systemd-coredump[2724711]: [LNK] Process 2202156 (freeipmi.plugin) of user 957 dumped core.

                                                               Stack trace of thread 2202156:
                                                               #0  0x00007f15b509894c n/a (libc.so.6 + 0x9894c)
                                                               #1  0x00007f15b503e410 raise (libc.so.6 + 0x3e410)
                                                               #2  0x00007f15b502557a abort (libc.so.6 + 0x2557a)
                                                               #3  0x00007f15b5ea0bd1 n/a (libuv.so.1 + 0x9bd1)
                                                               #4  0x000056353cb4c360 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x14360)
                                                               #5  0x00007f15b6280012 n/a (ld-linux-x86-64.so.2 + 0x2012)
                                                               #6  0x00007f15b628416e n/a (ld-linux-x86-64.so.2 + 0x616e)
                                                               #7  0x00007f15b5040cb1 n/a (libc.so.6 + 0x40cb1)
                                                               #8  0x00007f15b5040d8e exit (libc.so.6 + 0x40d8e)
                                                               #9  0x000056353cb6055e n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x2855e)
                                                               #10 0x000056353cb4d134 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x15134)
                                                               #11 0x00007f15b5027675 n/a (libc.so.6 + 0x27675)
                                                               #12 0x00007f15b5027729 __libc_start_main (libc.so.6 + 0x27729)
                                                               #13 0x000056353cb4fd75 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x17d75)
                                                               ELF object binary architecture: AMD x86-64
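
In the trace above, `abort` (frame #2) is reached from libuv (frame #3) while `exit` (frame #8) is still running, but the plugin's own frames are unresolved (`n/a`). A minimal sketch for pulling out the plugin frame offsets so they can be passed to `addr2line` from GNU binutils follows. It assumes the installed binary still carries debug info (the `RelWithDebInfo` build type suggests this but does not guarantee it) and that the module-relative offsets map directly to addresses `addr2line` accepts, which is typical for position-independent executables but not verified here. `module_offsets` is a hypothetical helper name.

```python
import re

# Frames copied from the coredump trace in this report (abbreviated).
TRACE = """\
#3  0x00007f15b5ea0bd1 n/a (libuv.so.1 + 0x9bd1)
#4  0x000056353cb4c360 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x14360)
#8  0x00007f15b5040d8e exit (libc.so.6 + 0x40d8e)
#9  0x000056353cb6055e n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x2855e)
#10 0x000056353cb4d134 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x15134)
#13 0x000056353cb4fd75 n/a (/usr/libexec/netdata/plugins.d/freeipmi.plugin + 0x17d75)
"""

# Matches: "#N  0xADDR symbol (module + 0xOFFSET)"
FRAME = re.compile(r"#(\d+)\s+0x[0-9a-f]+\s+\S+\s+\((\S+) \+ (0x[0-9a-f]+)\)")


def module_offsets(trace: str, module_suffix: str) -> list[tuple[int, str]]:
    """Return (frame_number, offset) pairs for frames inside the given module."""
    return [
        (int(num), off)
        for num, mod, off in FRAME.findall(trace)
        if mod.endswith(module_suffix)
    ]


frames = module_offsets(TRACE, "freeipmi.plugin")
plugin = "/usr/libexec/netdata/plugins.d/freeipmi.plugin"

# -f prints function names, -e selects the binary to read symbols from.
print("addr2line -f -e", plugin, " ".join(off for _, off in frames))
```

The printed command can be run on the affected host; if the binary has been stripped, `addr2line` prints `??` instead of function names.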
