WIP Project Notes - 02/Sep/2025
- Summary
- Potential Applications For A Linux Desktop Helper
- What's Different About A "Sysadmin" Agent?
- Down The Line Bells And Whistles
To date, most interest in the use of AI tools for development and programming-related purposes has focused on their potential to generate working code. Latterly, this has been nicknamed "vibe coding."
"Vibe coding," however, is an extremely challenging place for LLMs to shine as programming assistants: context loads remain an enormous challenge, often leading to rapidly diminishing returns even on state of the art models.
An underappreciated application of code-generation agents is what might be called "routine systems administration." On Linux desktops, this can encompass advanced tasks, even ones that (indirectly) involve headless browsers.
The list of what can be accomplished on Linux (and other OSes) without leaving the shell is vast. Unlike when writing code, LLMs don't have to maintain coherence over long token loads at each turn; they don't need to absorb large repositories into temporary context stores; and the syntax of foundational CLIs in Linux environments remains relatively stable compared with the constantly evolving world of SDKs and APIs used in programming tasks.
Potential applications include:
- "Organise my Github repositories." (File Organisation)
- "Try to make sense out of this messy folder" (File Organisation)
- "Install pyenv and add the Python 3.12."
- "Check what Flathub programs I have and remove anything I haven't touched in the last 6 months."
- "See which Python environments I have on this computer and if my bashrc is configured correctly."
- "Could you set up YADM and add a service job to push it every 6 hours?"
- "I'd like to type gp and action git add, commit, and push. Could you add that as a Bash alias."
- "I'm learning about ML and want to fine tune my own STT model. Could you set me up a Conda environment with some sensible packages. You have my GPU in context."
- "Assume the vantage point of a critical cybersec auditor who's here to pick holes. Audit the cybersecurity of my filesystem and provide a set of remediation suggestions."
- "Could you check my firewall config and close whatever isn't necessary."
- "Am I running an AV package? Am I pulling in updates regularly?"
- "My computer is crazy laggy. Could you check the logs and see if anything obvious is showing up?"
- "Whenever I want to open a new file the file picker freezes. Can we try to debug and if that doesn't work find a substitute?"
- "I cloned this interesting looking agent project from Github. Please follow the readme and install."
- "This is a low resource environment. Could you run some checks with htop, top, heat monitors. See how I'm doing and whether we might add Zram to prevent OOM issues."
- "Could you have a look at journal and see if you can spot any significant warnings that I could fix?"
Why create a specialised agent for this and not just run a regular code-gen CLI?
The reason has nothing to do with the LLM: modern LLMs fine-tuned on code-gen tasks and equipped with agentic capabilities are ideal for this workload.
The reason is that this task becomes far more efficient when the agent is configured with a few special ingredients:
At first run, the agent (WIP) generates a persistent system environment profile, which is refreshed periodically.
This is a context file intended to provide enduring foundational information about the environment:
- What distro?
- What DE?
- What GPU?
- Do we have ROCm / CUDA?
- How much RAM?
- What CPU?
Etc.
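A minimal sketch of how such a profile might be gathered, assuming a JSON file under ~/.config/ezra/ (the path, the field names, and the choice of probes such as lspci and /proc/meminfo are assumptions, not a fixed design):

```python
import json
import os
import platform
import shutil
import subprocess
from pathlib import Path

# Hypothetical location for the persistent profile; the real layout may differ.
PROFILE_PATH = Path.home() / ".config" / "ezra" / "system_profile.json"

def run(cmd: list[str]) -> str:
    """Run a command and return stdout, or an empty string if it isn't available."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return ""

def build_profile() -> dict:
    os_release = platform.freedesktop_os_release() if hasattr(platform, "freedesktop_os_release") else {}
    return {
        "distro": os_release.get("PRETTY_NAME", "unknown"),
        "desktop_environment": os.environ.get("XDG_CURRENT_DESKTOP", "unknown"),
        "gpu": [line for line in run(["lspci"]).splitlines() if "VGA" in line or "3D" in line],
        "cuda_available": shutil.which("nvidia-smi") is not None,
        "rocm_available": shutil.which("rocminfo") is not None,
        "ram_kib": next(
            (int(line.split()[1]) for line in Path("/proc/meminfo").read_text().splitlines()
             if line.startswith("MemTotal")),
            None,
        ),
        "cpu": platform.processor() or platform.machine(),
        "kernel": platform.release(),
    }

if __name__ == "__main__":
    PROFILE_PATH.parent.mkdir(parents=True, exist_ok=True)
    PROFILE_PATH.write_text(json.dumps(build_profile(), indent=2))
```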
At subsequent runs, a lighter ephemeral file is added providing temporary metrics:
- System load
- lsusb: connected devices
The two are concatenated so that, when the agent runs, the user doesn't have to waste time answering repetitive questions.
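A companion sketch of the ephemeral snapshot and the concatenation step, again with assumed file locations and a simple text format:

```python
import os
import subprocess
from pathlib import Path

# Hypothetical paths; kept separate so the persistent file survives between sessions.
PROFILE_PATH = Path.home() / ".config" / "ezra" / "system_profile.json"
SNAPSHOT_PATH = Path.home() / ".cache" / "ezra" / "session_snapshot.txt"

def capture_snapshot() -> str:
    """Collect lightweight, session-specific metrics: load average and connected USB devices."""
    load1, load5, load15 = os.getloadavg()
    try:
        usb = subprocess.run(["lsusb"], capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        usb = "(lsusb unavailable)\n"
    return (
        f"Load average (1/5/15 min): {load1:.2f} {load5:.2f} {load15:.2f}\n\n"
        f"Connected USB devices:\n{usb}"
    )

def build_context() -> str:
    """Concatenate the persistent profile with a fresh snapshot for the agent's system prompt."""
    persistent = PROFILE_PATH.read_text() if PROFILE_PATH.exists() else "(no profile yet)"
    snapshot = capture_snapshot()
    SNAPSHOT_PATH.parent.mkdir(parents=True, exist_ok=True)
    SNAPSHOT_PATH.write_text(snapshot)
    return (
        "## System profile (persistent)\n" + persistent +
        "\n\n## Session snapshot (ephemeral)\n" + snapshot
    )
```

The string returned by build_context() is what would be prepended to the agent's context at launch.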
Most code-gen CLIs assume the following default posture ('I' here is the perspective of the LLM!):
- I'm running within a repo. I cannot move outside of the repo without explicit user consent.
- I'm here to work on a development project that is being built within this repo.
While this default posture makes abundant sense in the context for which it was designed, it doesn't fit the sysadmin use case, in which the agent may need to access large swathes of the filesystem to do its job effectively.
The implementation for Ezra involves the inverse working model: assumed permissive access. Users may fence off parts of the filesystem, but otherwise the agent assumes it can elevate to sudo and run anywhere within /.
For many users, this would be unthinkable. For others, it would be fine under close supervision, or when running a local LLM. The implementation would ideally support all of these scenarios.
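A minimal sketch of that inverse posture: permissive by default, with a hypothetical ~/.config/ezra/fences file (one path per line) carving out subtrees the agent must never enter:

```python
from pathlib import Path

# Hypothetical fence file: one absolute path per line that the agent must never enter.
FENCE_FILE = Path.home() / ".config" / "ezra" / "fences"

def load_fences() -> list[Path]:
    if not FENCE_FILE.exists():
        return []
    return [
        Path(line.strip()).expanduser().resolve()
        for line in FENCE_FILE.read_text().splitlines()
        if line.strip()
    ]

def is_allowed(target: str, fences: list[Path]) -> bool:
    """Permissive by default: allow any path unless it falls inside a fenced-off subtree."""
    resolved = Path(target).expanduser().resolve()
    return not any(resolved == fence or fence in resolved.parents for fence in fences)

# Example: with "~/secrets" fenced, the agent may still roam the rest of /.
fences = load_fences()
print(is_allowed("/etc/fstab", fences))          # True unless /etc (or /) is fenced
print(is_allowed("~/secrets/keys.txt", fences))  # False if ~/secrets is listed
```

How much of this is enforced in the tool layer versus merely advised to the LLM, and whether sudo elevation passes through the same check, remains an open design question.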
Prevention is better than cure. The journal catches many developing hardware and software glitches before they cascade. The implementation could support a post-boot doctor designed to analyse logs in the critical first few minutes after boot, when anomalies are more obvious.
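A sketch of what such a post-boot doctor might look like, using standard journalctl flags to pull warning-and-above entries from the current boot; the ten-minute window and the /proc/uptime approach to estimating boot time are assumptions:

```python
import subprocess
import time
from pathlib import Path

def post_boot_doctor(window_minutes: int = 10) -> list[str]:
    """Return warning-and-above journal entries from the first minutes of the current boot."""
    # Boot time ~= now minus system uptime (first field of /proc/uptime).
    uptime_seconds = float(Path("/proc/uptime").read_text().split()[0])
    cutoff = time.time() - uptime_seconds + window_minutes * 60

    # -b: current boot only; -p warning: priority warning and above;
    # -o short-unix: prefix each entry with a Unix timestamp we can parse.
    result = subprocess.run(
        ["journalctl", "-b", "-p", "warning", "-o", "short-unix", "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    early = []
    for line in result.stdout.splitlines():
        if not line:
            continue
        stamp = line.split(maxsplit=1)[0]
        try:
            if float(stamp) <= cutoff:
                early.append(line)
        except ValueError:
            continue  # skip non-entry lines such as "-- No entries --"
    return early

if __name__ == "__main__":
    for entry in post_boot_doctor():
        print(entry)
```

The filtered entries would then be handed to the LLM for triage rather than dumped raw on the user.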
Local and/or cloud STT integration for easier console input.