Description
A chat with @poettering (feel free to correct me if I remembered something wrong here) at FOSDEM 2025 brought up an interesting consideration. He argues in favor of immutable image-based operating environments, whose life cycle is effectively managed by another immutable OS image living as the initrd (probably with systemd at the heart of it all). In such environments, the currently typical NUT shutdown integration would not be feasible or welcome in the way it is done now.
Roughly speaking, what we do currently on numerous platforms is:
- A NUT server (with physical connection to the UPS) normally runs drivers and the data server, probably also an `upsmon` instance in `primary` role (whoever is actually fed by that UPS should have a copy; most of the time the system managing an UPS is also fed by it).
- If a power outage occurs, this server raises FSD (Forced Shut Down) state to tell everyone else to shut down, and after a while, a locally running `upsmon` shuts down its own operating system too.
- As part of such FSD handling, the `upsmon` creates a file in location specified by `POWERDOWNFLAG` configuration from its `upsmon.conf`.
- Running daemons, including NUT drivers for the UPS and the data server, are stopped as are any other services.
- (Some systems eventually kill all userland processes and remount read-only)
- In case of systemd-driven systems, the late shutdown hook script in `/usr/lib/systemd/system-shutdown/nutshutdown` kicks in, finds that `POWERDOWNFLAG` file, and runs the NUT driver program again to tell the UPS to power-off/power-cycle (so the UPS usually turns on automatically when the wall power returns - or if it already has, and all fed systems are guaranteed to fully restart) at/after the moment we know the power loss would not corrupt any data.
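
The last step of that flow boils down to a very small hook. Here is a minimal sketch, assuming the tools live under `/usr/sbin` (paths vary by distribution); the function name `nut_late_shutdown` and the `UPSMON`/`UPSDRVCTL` override variables are made up for illustration, and the script actually shipped by NUT may differ in detail:

```shell
#!/bin/sh
# Sketch of a late-shutdown hook in the spirit of nutshutdown.
# UPSMON and UPSDRVCTL default to typical install paths (an assumption);
# override them if your packaging places the tools elsewhere.
UPSMON="${UPSMON:-/usr/sbin/upsmon}"
UPSDRVCTL="${UPSDRVCTL:-/usr/sbin/upsdrvctl}"

nut_late_shutdown() {
    # `upsmon -K` succeeds only if the POWERDOWNFLAG file exists and
    # carries the expected magic content, i.e. an FSD is in progress.
    if "$UPSMON" -K >/dev/null 2>&1; then
        # Start the driver(s) afresh and command the UPS to cut power.
        "$UPSDRVCTL" shutdown
    else
        # Ordinary shutdown or reboot: leave the UPS alone.
        return 0
    fi
}
```

A real hook would end with a call to `nut_late_shutdown`; it is left out here so the function can be sourced and exercised with stand-in tools.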
This last bit actually places a number of constraints on the environment:
- The `POWERDOWNFLAG` file location should still be mounted and at least readable;
- The NUT configuration files (e.g. `/etc/nut` or `/etc/ups`) should be on filesystems still mounted and at least readable, so that the correct driver is chosen and connects to the expected device(s);
- The NUT driver programs and any libraries they might dynamically link to (and possibly their resource files - maybe SNMP MIBs, etc.) should be on filesystems still mounted and at least readable (programs also executable).
  - NOTE: In recent NUT releases, there is a way to tell the running driver program to turn the UPS off, instead of re-initializing the connection (can take long, a PITA in case of SNMP walks specifically); but there is no practical use for that to my knowledge. The `drivername -k` handling to kill power automatically tries to talk to an existing daemonized copy first (if found), before taking the matter into its own hands. But the systems/frameworks that indiscriminately kill off userland processes are unlikely to benefit from this anyway, unless they support some method of exempting certain programs from a killing spree.
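
To make these constraints concrete, a hook could verify them before attempting to talk to the UPS. This is only a sketch: `nut_preflight` is a hypothetical helper, all three paths are passed in by the caller (the driver directory in particular differs between distributions), and it does not check dynamically linked libraries or resource files such as MIBs.

```shell
#!/bin/sh
# Check that the late-shutdown prerequisites are still reachable.
# Returns 0 if everything looks usable, 1 otherwise.
nut_preflight() {
    flagfile=$1    # value of POWERDOWNFLAG from upsmon.conf
    confdir=$2     # e.g. /etc/nut or /etc/ups
    driverdir=$3   # directory holding the driver programs
    rc=0

    # The flag file must still be readable.
    [ -r "$flagfile" ] || { echo "unreadable: $flagfile" >&2; rc=1; }

    # ups.conf tells upsdrvctl which driver and device to use.
    [ -r "$confdir/ups.conf" ] || { echo "unreadable: $confdir/ups.conf" >&2; rc=1; }

    # The driver directory must exist and be searchable.
    [ -d "$driverdir" ] && [ -x "$driverdir" ] || { echo "unusable: $driverdir" >&2; rc=1; }

    return "$rc"
}
```

Note that when run as root, the readability tests mostly confirm that the paths still exist on a mounted filesystem, which is the property that matters here.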
These constraints run somewhat counter to the goal of image-based OSes, which want the operating environment fully unmounted, not even leaving read-only tentacles in place.
One feasible idea from the chat was to have NUT driver package(s) installed (also) into the initrd image, automatically pulling whatever dependencies are needed for the libraries it uses. Maybe a user-curated selection of drivers, maybe a vendor/corporation dictated "everything" (for signed images to be ubiquitously useful). A few tools like `upsmon` (to check with `upsmon -K` that FSD is in progress) and `upsdrvctl` would be needed as well.
And it would be that initrd image's shutdown hooks that tell the UPS to go off, after the production environment is safely unmounted and flushed.
- A location like `/run` (maybe it exactly?) might be used to convey not only the `POWERDOWNFLAG` file existence and magic content for that FSD handling to kick in, but also a copy of latest-known NUT configuration files.
- In the `nutshutdown` script, the `NUT_CONFPATH` could point to that copy; `NUT_STATEPATH`, `NUT_ALTPIDPATH` (any other?) envvars could be used to point to respective location usable in the initrd environment (e.g. `/dev/shm` for R/W locations, if at all used in shutdown routine - I think it would be a bug if PID files or socket files are created at that point; location existence may be checked though, not sure).
- The driver programs called from `nutshutdown` could be told to just run as `root` and not drop privileges, as other accounts (and `udev` rules in case of USB/Serial links) would likely not be configured at that point. Or maybe they would be there, if packaging did work all the way in initrd too.
- Not sure about access to networked power devices (SNMP, NetXML, remote IPMI...) - if the system would still have an IP address at that point, or if it goes away when `network{,ing}.service` gets stopped. In legacy systems, the age where the late shutdown originated, an address stayed assigned until the OS power-cycled itself, so an `snmp-ups` could be commanded to power-cycle just as well...
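
The hand-off outlined in these points could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the `/run/nut-shutdown` directory, the helper names `nut_stage_config` and `nut_initrd_poweroff`, and the `NUT_HANDOFF`/`NUT_RWDIR` override variables are all hypothetical. It also assumes the staged `upsmon.conf` names a `POWERDOWNFLAG` location that is still visible from the initrd (e.g. also under `/run`), and it does not address whether networked devices are still reachable at that point.

```shell
#!/bin/sh
# Hypothetical hand-off directory; the proposal only suggests "/run" in general.
NUT_HANDOFF="${NUT_HANDOFF:-/run/nut-shutdown}"
UPSMON="${UPSMON:-/usr/sbin/upsmon}"
UPSDRVCTL="${UPSDRVCTL:-/usr/sbin/upsdrvctl}"

# Run in the production OS before it is torn down: copy the current
# configuration to a location that survives the switch to the initrd.
nut_stage_config() {
    src=$1   # e.g. /etc/nut
    mkdir -p "$NUT_HANDOFF/conf" && cp -R "$src/." "$NUT_HANDOFF/conf/"
}

# Run from the initrd shutdown hook after the production OS is unmounted.
# The body is a subshell so the exported variables do not leak to the caller.
nut_initrd_poweroff() (
    # Nothing staged means nothing to do.
    [ -r "$NUT_HANDOFF/conf/ups.conf" ] || exit 0

    # Point the NUT tools at the staged copy and a scratch R/W location.
    export NUT_CONFPATH="$NUT_HANDOFF/conf"
    export NUT_STATEPATH="${NUT_RWDIR:-/dev/shm}"
    export NUT_ALTPIDPATH="${NUT_RWDIR:-/dev/shm}"

    # Only act if an FSD is actually in progress.
    "$UPSMON" -K >/dev/null 2>&1 || exit 0

    # Stay root: other accounts and udev rules may not exist in the initrd.
    "$UPSDRVCTL" -u root shutdown
)
```

The production OS would call `nut_stage_config /etc/nut` early in its shutdown (or whenever the configuration changes), and the initrd hook would call `nut_initrd_poweroff` as its last step.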