In this text, I explain how I use the Bash shell. Of course, there are several other ways to use Bash; this is my personal point of view.
If you think there is something wrong or could be improved, please create an issue in this GitHub project. Thank you!
This text contains two parts: Bash Strict Mode and General Hints and Opinions.
Bash strict mode refers to a set of options and practices used in Bash scripting to make scripts more robust, reliable, and easier to debug. By enabling strict mode, you can prevent common scripting errors, detect issues early, and make your scripts fail in a controlled way when something unexpected happens.
I use this at the top of my Bash scripts:
```shell
#!/usr/bin/env bash
# Bash Strict Mode: https://github.com/guettli/bash-strict-mode
trap 'echo -e "\n🤷 🚨 🔥 Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0" 2>/dev/null || true) 🔥 🚨 🤷"; exit 3' ERR
set -Eeuo pipefail
```

Let's have a closer look:
```shell
#!/usr/bin/env bash
```

This makes sure we use the Bash shell and not a different shell. Writing portable shell scripts is more complicated, and I want to get things done, so I use Bash and its handy features.
The command /usr/bin/env looks up bash in $PATH. This is handy if /bin/bash is outdated on
your system, and you installed a new version in your home directory.
This line prints a warning if the shell script terminates because a command returned a non-zero exit code:

```shell
trap 'echo "Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0")"; exit 3' ERR
```

It shows the line that caused the shell script to exit, and exits with status 3.
Why 3? I use that according to the Nagios Plugin Return
Codes, although I don't use Nagios anymore. 3
means "Unknown."
```shell
set -Eeuo pipefail
```

- `-E` (ERR trap inheritance): ensures that the `ERR` trap is inherited by shell functions, command substitutions, and subshells.
- `-e` (exit on error): causes the script to exit immediately if any command returns a non-zero exit status, unless that command's failure is explicitly handled (for example with `||`).
- `-u` (undefined variables): treats the use of undefined variables as an error, causing the script to exit.
- `-o pipefail` (pipeline failure): ensures that a pipeline (a series of commands connected by `|`) fails if any command within it fails, rather than only failing if the last command fails.
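A quick way to see what `pipefail` changes (a sketch; `false | true` is a pipeline whose first command fails):

```shell
# Without pipefail, a pipeline's status is the status of its last command:
bash -c 'false | true; echo "exit: $?"'                    # prints "exit: 0"
# With pipefail, any failing command makes the whole pipeline fail:
bash -c 'set -o pipefail; false | true; echo "exit: $?"'   # prints "exit: 1"
```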
Quoting the Zen of Python:

> Errors should never pass silently. Unless explicitly silenced.
I think the above strict mode ensures that errors don't go unnoticed and prevents scripts from running into unexpected issues. I prefer a command to fail and show me the failed line, rather than the default behavior of Bash (continuing with the next line in the script).
Imagine you have a simple script:
```shell
grep foo bar.txt >out.txt
echo "all is fine"
```

The script expects a file called `bar.txt`. But what happens if that file does not exist?
If the file does not exist, you get this output (without strict mode):
```
❯ ~/tmp/t.sh
grep: bar.txt: No such file or directory
all is fine
```
The script terminates with a zero (meaning "OK") exit status, even though something went wrong.
That's something I would like to avoid.
With strict mode enabled, you will get:
```
grep: bar.txt: No such file or directory
Warning: A command has failed. Exiting the script. Line was (/home/user/tmp/t.sh:5): grep foo bar.txt >out.txt
```
And the exit status of the script will be 3, which indicates an error.
If you post about `set -e` on the Bash subreddit, you get an automated comment like this:

> Don't blindly use `set -euo pipefail`.

The link explains why you should not use `set -Eeuo pipefail` everywhere.
I disagree. Strict mode has consequences, and dealing with these consequences requires some extra typing. But typing is not the bottleneck. I prefer to type a bit more if it results in more reliable Bash scripts.
This would fail in strict mode if `FOO` is not set:

```shell
if [ -z "$FOO" ]; then
    echo "Env var FOO is not set. Doing completely different things now ..."
    do_different_things
fi
```

Output:

```
line N: FOO: unbound variable
```
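You can reproduce both behaviors with one-liners (a sketch; `FOO` is unset explicitly so the demo is deterministic):

```shell
# Under -u, referencing an unset variable aborts the shell:
bash -c 'set -u; unset FOO; echo "$FOO"' 2>/dev/null || echo "failed"   # prints "failed"
# With the :- default, the expansion is safe and yields the empty string:
bash -c 'set -u; unset FOO; echo "value: [${FOO:-}]"'                   # prints "value: []"
```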
You can work around that easily by defaulting the value to the empty string:

```shell
if [ -z "${FOO:-}" ]; then
    echo "Env var FOO is not set. Doing completely different things now ..."
    do_different_things
fi
```

Non-zero exit codes often indicate an error, but not always.
The command `grep` returns 0 if at least one line matches, 1 if no line matches, and 2 if an error occurred.
For example, you want to filter comments into a file:

```shell
echo -e "foo\n#comment\nbar" | grep '^#' >comments.txt
```

The code above works in strict mode because there is a match. But it fails if there is no comment. In that case, I expect `comments.txt` to be an empty file, and the script should not fail but continue with the next line.
This code fails in strict mode:

```shell
echo -e "foo\nbar" | grep '^#' >comments.txt | some-other-command
```

Workaround:

```shell
echo -e "foo\nbar" | { grep '^#' >comments.txt || true; } | some-other-command
```

With this pattern, you can easily ignore non-zero exit statuses.
In most cases you just want to know: Was the command successful or not?
If you want to know the exit code (`$?`), then you can use this pattern:
```shell
if some-command; then
    code=0
else
    code=$?
fi
echo $code
```

When using Bash strict mode, you want commands that fail (non-zero exit status) to stop execution of the script.
Imagine there are two commands combined with `&&` (`false` is a command that has exit status 1). The non-zero exit status does not get noticed:

```shell
false && false
```

To avoid that pitfall, avoid `&&` and use two lines instead:

```shell
false
false
```

With two lines, Bash in strict mode will fail on the first non-zero exit status (here, the first `false`).
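You can verify this pitfall directly (a sketch):

```shell
# The first false is "protected" by &&, so set -e does not stop the script:
bash -c 'set -e; false && false; echo "still running"'   # prints "still running"
# On its own line, the same false stops the script immediately:
bash -c 'set -e; false; echo "never reached"'            # prints nothing
```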
This will fail:

```shell
random_id=$(tr -dc 'a-z0-9' </dev/urandom | head -c 7)
```

Explanation: when `head` closes its output after reading 7 bytes, `tr` is still writing, but suddenly its output pipe is gone. This causes `tr` to receive a SIGPIPE and exit with a non-zero status (usually 141).

This will work:

```shell
random_id=$(tr -dc 'a-z0-9' </dev/urandom | head -c 7 || true)
```

Thanks to Reddit user "aioeu" for the explanation: cat file | head fails, when using "strict mode" : r/bash
If I want to distinguish between a successful command and a non-zero exit status:

```shell
if ./my-small-script.sh; then
    echo "Success"
else
    echo "Failure"
fi
```

My conclusion: use strict mode!
Use the right tool. But which tool is the right one?
I use Bash if I need to execute several Linux command-line tools (one after the other) to achieve my goal.
This means the script is straightforward. There are no functions, only a few "if/else" statements (mostly for error handling) and a few loops.
For example, provisioning a vanilla virtual machine into a custom virtual machine to meet specific requirements is such a task. Bash fits perfectly for that.
Bash is not a real programming language. For applications, it's better to use Golang or Python (in my opinion).
General Bash Hints: avoid `find ... | while read -r file`; use `while read -r file; do ...; done < <(find ...)` instead.
Imagine you want to collect errors like this, and fail after the loop if the string `errors` is not empty:

```shell
errors=""
find ... | while read -r file; do
    if ...; then
        errors="$errors ..."
    fi
done
if [ -n "$errors" ]; then
    echo "failed: $errors"
    exit 1
fi
```

This won't work, because the body of the loop is executed in a new subshell. Variables set inside a subshell are not accessible outside of it.
If you do it like this, it works:

```shell
while read -r file; do
    ...
done < <(find ...)
```

I prefer to write a second (or third) Bash script instead of writing functions.
This provides a clean interface between your primary and secondary script.
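Returning to the loop above: here is a complete, runnable sketch of the process-substitution pattern. `printf` stands in for `find` so the example is self-contained:

```shell
#!/usr/bin/env bash
set -Eeuo pipefail
errors=""
while read -r file; do
    errors="$errors $file"            # the loop runs in the current shell, not a subshell
done < <(printf '%s\n' a.txt b.txt)   # stand-in for: find ...
echo "collected:$errors"              # prints "collected: a.txt b.txt"
```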
If you want to execute two tasks concurrently, you can do it like this:

```shell
# Bash Strict Mode: https://github.com/guettli/bash-strict-mode
trap 'echo -e "\n🤷 🚨 🔥 Warning: A command has failed. Exiting the script. Line was ($0:$LINENO): $(sed -n "${LINENO}p" "$0" 2>/dev/null || true) 🔥 🚨 🤷"; exit 3' ERR
set -Eeuo pipefail
{
    echo task 1
    sleep 1
} & task1_pid=$!
{
    echo task 2
    sleep 2
} & task2_pid=$!
# Wait for each PID on its own line so you get each child's exit status.
wait "$task1_pid"
wait "$task2_pid"
echo end
```

Why wait for each PID separately?
- You must `wait` to reap background children and avoid zombies.
- `wait pid1 pid2` will wait for both PIDs, but its exit status is the exit status of the last PID waited for. This means an earlier background job can fail, yet the combined `wait` can still return success if the last job succeeds. That is not what you want if you need to detect failures reliably.
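If you need to know which task failed, you can collect each status explicitly. The `|| status=$?` keeps a failing `wait` from triggering the ERR trap under strict mode (a sketch; the `exit 7` simulates a failing task):

```shell
#!/usr/bin/env bash
sleep 0.1 & task1_pid=$!
{ sleep 0.1; exit 7; } & task2_pid=$!   # simulated failure

status1=0; wait "$task1_pid" || status1=$?
status2=0; wait "$task2_pid" || status2=$?
echo "task1: $status1, task2: $status2"   # prints "task1: 0, task2: 7"
```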
Makefiles are similar to strict mode. Let's look at an example:

```make
target: prerequisites
	recipe-command-1
	recipe-command-2
	recipe-command-3
```

If `recipe-command-1` fails, Make stops and does not execute `recipe-command-2`.
The syntax in a Makefile looks like shell, but it is not.
As soon as the commands in a Makefile get complicated, I recommend keeping it simple:

```make
target: prerequisites
	script-in-bash-strict-mode.sh
```

Instead of trying to understand Makefile syntax (for example `$(shell ...)`), I recommend calling a Bash script.
A Bash script has the benefit that formatting (shfmt) and ShellCheck are available in the editor.
Unfortunately there are several different flavours of regular expressions.
Instead of learning the older flavours, I recommend using Perl Compatible Regular Expressions (PCRE).
The good news: `grep` supports PCRE with the `-P` flag. I suggest using it.
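For example, `\d` is PCRE syntax that plain `grep -E` does not understand. This assumes GNU grep; the `-P` flag is missing from some grep builds (e.g. the default grep on macOS):

```shell
# -P enables Perl-compatible regular expressions:
printf 'abc\nitem42\n' | grep -P 'item\d+'   # prints "item42"
```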
I avoid using awk because I am not familiar with its syntax, and since 1996 this has worked out fine for me.
The only time I use awk is when the input is split by whitespace and the length varies.
Example: I want to print the second column:

```shell
command-which-prints-columns | awk '{print $2}'
```

From time to time, I use Perl one-liners.
I think writing portable shell scripts is unnecessary in most cases. It is like trying to write a script that works in both the Python and Ruby interpreters at the same time. Don't do it. Be explicit and write a Bash script (not a generic shell script).
There is a handy shell formatter, shfmt, and a VS Code plugin, shell-format.
There is ShellCheck and a VS Code plugin for ShellCheck which helps you find errors in your script.
ShellCheck can recognize several types of incorrect quoting. It warns you about every unquoted variable. Since it is not much work, I follow ShellCheck's recommendations.
There are several well-known tools for provisioning a machine: Ansible, SaltStack, Puppet, Chef, ...
All of them have their learning costs.
It depends on the environment, but maybe a Bash script in strict mode is easier to maintain.
Some years ago, we used SaltStack to provision and update a lot of virtual machines. We wasted so much time because things did not work as expected, or error messages got swallowed. In hindsight, we would have been much faster if we had taken the pragmatic approach (Bash) instead of being proud to use the same tools as big tech companies.
GitHub Actions (or similar CI tools) have a big drawback: You can't execute them on your local device.
I try to keep our GitHub Actions simple: the YAML config calls Bash scripts, which I can also execute locally.
You can use containers to ensure that all developers have the same environment.
This article is about Bash scripting.
For interactive use, I use:
- Fish Shell
- Starship for the prompt.
- Atuin for the shell history.
- direnv to set directory specific env variables.
- brew
- ripgrep
- fd find
- CopyQ Clipboard Manager
- Activity Watch Automatic time tracker
- VSCode
- Ubuntu LTS.
I usually don't use ripgrep and fd in Bash scripts, because they are not available on most systems.
Imagine you use credentials like this:

```shell
echo "$OCI_TOKEN" | oras manifest fetch --password-stdin "$IMAGE_URL"
```

If you use `set -x`, every executed line gets printed, including the expanded content of `$OCI_TOKEN`. This can reveal your secrets in the logs.
Rule of thumb: never use `set -x` in a script, except temporarily for debugging, and do not commit it to the source code repo.
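If you do need tracing near a secret, you can switch it off just around the sensitive line. This sketch uses a fake token; the `{ set +x; } 2>/dev/null` idiom disables tracing without printing the `set +x` line itself:

```shell
#!/usr/bin/env bash
OCI_TOKEN="fake-token-for-demo"        # hypothetical credential for this demo
set -x
echo "this line is traced" >/dev/null  # appears in the trace output on stderr
{ set +x; } 2>/dev/null                # turn off tracing silently
echo "$OCI_TOKEN" | cat >/dev/null     # not traced: the token stays out of the logs
```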
Thank you to https://www.reddit.com/r/bash/
I got several good hints there.