is:issue state:open
Search results
[Feature]: Add garbage collection for preflight gang coordination configmap
  #1452 in NVIDIA/NVSentinel | enhancement | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: Node-drainer cancels a fresh quarantine after a previous UnQuarantined event
  bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: Fault-quarantine does not apply cordon or taint actions from additional unhealthy events
  #1439 in NVIDIA/NVSentinel | bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Feature]: Add optional enrichment pipeline to event-exporter for Pod metadata and business context
  #1434 in NVIDIA/NVSentinel | enhancement | priority/P1 | Max fix SLA: 183 days | Status: Open

[Feature]: Add post-remediation validation gate before clearing node/GPU fault state
  #1427 in NVIDIA/NVSentinel | enhancement | priority/P1 | Max fix SLA: 183 days | Status: Open

[Feature]: Do not uncordon nodes cordoned independently of NVSentinel
  #1424 in NVIDIA/NVSentinel | enhancement | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: syslog-health-monitor can miss XIDs when journald rotates before scan
  bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: remediation-failed node label can remain after the unsupported failing check has recovered
  bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: NVSentinel doesn't work in IPv6-only clusters
  #1407 in NVIDIA/NVSentinel | bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: NIC monitor emits non-fatal events for expected-down ports after reboot
  #1379 in NVIDIA/NVSentinel | bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: NIC monitor emits false FATAL for unprovisioned Ethernet/RoCE Aux ports on cloud shapes
  #1361 in NVIDIA/NVSentinel | bug | priority/P1 | Max fix SLA: 183 days | Status: Open

[Bug]: Platform connector OR-based entity matching silently clears unrelated NIC failures
  #1360 in NVIDIA/NVSentinel | bug | priority/P1 | Max fix SLA: 183 days | Status: Open