How to not shoot yourself in the foot on production
Posted on May 07, 2026 in Science
I work in a small, fast-paced team at a custom-research company. We build and operate a fairly involved platform, and there are maybe a handful of us who regularly touch production infrastructure. We can't always afford the organizational overhead to justify a full-blown change-management process with four-eyes policies for every command typed on a server. Sometimes, at 11pm on a Thursday, you need to SSH into a production node and figure out why a Docker service is misbehaving. That is the reality, well, my current reality at least...
But the reality also includes the fact that tired or multitasking engineers type dangerous things. docker volume prune on a node with persistent data. docker swarm leave on a manager. apt upgrade on a system that should only get targeted patches. These are the kinds of commands that, once executed, tend to ruin your evening in painful ways.
What we wanted was something low-ceremony: a tripwire. Something that makes you stop and think for a second before you proceed, and that takes about five seconds to override when you actually know what you are doing. The poor man's approach to not shooting yourself in the foot.
It turns out bash has a little-known mechanism that is almost perfectly suited for this [1]. It lives in a corner of the shell that was designed for debuggers, and most people have never heard of it. Full disclosure: I got this from a Stack Overflow answer with very few upvotes: https://stackoverflow.com/a/55977897 . Thank you, user xhienne!
The DEBUG trap
Bash's trap builtin is most commonly used with signals — trap cleanup EXIT, trap '' INT, that sort of thing. But bash also supports several pseudo-signals, and the one that interests us here is DEBUG.
A DEBUG trap fires before every simple command, for command, case command, select command, arithmetic command, conditional command, and before the first command in a shell function. The variable $BASH_COMMAND holds the text of the command that is about to be executed. This by itself is already interesting: you get to inspect each command before the shell runs it. Note that this happens per simple command, so a line like a; b triggers the trap twice, once for a and once for b.
But the real trick stems from two additional settings that together turn the DEBUG trap into an actual interception mechanism:
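To make the observational part concrete, here is a minimal sketch that only records what the trap is shown. The array name seen is my own choice for illustration, and the script is meant to be run as a standalone file, not pasted into .bashrc.

```bash
#!/usr/bin/env bash
# Record the text of every command the DEBUG trap is shown.
# "seen" is a hypothetical name used only for this illustration.
seen=()
trap 'seen+=("$BASH_COMMAND")' DEBUG

: first command
true

trap - DEBUG   # stop observing (the trap still fires before this command)

printf 'trap saw: %s\n' "${seen[@]}"
```

Nothing is blocked here. The handler merely looks at $BASH_COMMAND, which is all a DEBUG trap can do without the settings described next.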
shopt -s extdebug enables extended debugging mode. Among other things, this changes the behavior of the DEBUG trap: if the trap handler returns a non-zero exit status, the next command is skipped and not executed. This is the critical piece. Without extdebug, the DEBUG trap is purely observational — it can log, but it cannot prevent. With extdebug, a non-zero return from the trap handler tells bash to silently discard the pending command.
set -T enables function tracing. Normally, the DEBUG trap is not inherited by shell functions or subshells. set -T ensures that the trap propagates into functions, command substitutions, and subshells invoked with (command). Without this, a user could sidestep the trap.
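A small sketch of what set -T changes, assuming a throwaway function called wrapper and a flag variable seen_inside (both made up for this example). The trap only observes here, so the effect of inheritance is visible on its own.

```bash
#!/usr/bin/env bash
# Record whether the trap ever sees the command inside the function body.
# "wrapper" and "seen_inside" are hypothetical names for this illustration.
seen_inside=no
trap 'if [[ "$BASH_COMMAND" == ": inside-function" ]]; then seen_inside=yes; fi' DEBUG

wrapper() { : inside-function; }

wrapper
without_T=$seen_inside   # without -T the function does not inherit the trap

seen_inside=no
set -T
wrapper
with_T=$seen_inside      # with -T the trap also fires inside the function

trap - DEBUG
set +T
echo "without -T: seen_inside=$without_T"
echo "with -T:    seen_inside=$with_T"
```

I would expect this to report no for the first call and yes for the second. As far as I can tell from the bash manual, shopt -s extdebug also switches on function tracing by itself, so the explicit set -T is belt and braces more than a hard requirement.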
The combination of these three pieces (trap '...' DEBUG, shopt -s extdebug, and set -T) yields something that blunter alternatives such as a restricted shell do not offer: a hook that fires before every command, has access to the command text, and can prevent execution by returning non-zero.
Putting it together
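The veto behavior is easy to try in isolation. This is a sketch under my own assumptions: dangerous is a harmless stand-in function that only flips a flag, and guard is a hypothetical handler name.

```bash
#!/usr/bin/env bash
# Stand-in for a destructive command; it only flips a flag.
dangerous() { ran=yes; }

# Handler: succeed unless the pending command starts with "dangerous".
guard() { [[ "$1" != dangerous* ]]; }

trap 'guard "$BASH_COMMAND"' DEBUG

ran=no
dangerous                 # extdebug is off: the handler's status is ignored
without_extdebug=$ran

shopt -s extdebug
ran=no
dangerous                 # extdebug is on: a non-zero status skips the command
with_extdebug=$ran

trap - DEBUG
shopt -u extdebug
echo "without extdebug: ran=$without_extdebug"
echo "with extdebug:    ran=$with_extdebug"
```

The first call should go through (ran=yes) and the second should be skipped (ran=no), which is exactly the difference between a trap that can only log and one that can block.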
The implementation is straightforward. In .bashrc, we install a DEBUG trap that pattern-matches $BASH_COMMAND against a list of blocked command prefixes. If a match is found, the trap handler prints a warning and returns false, which causes bash to skip the command. If no match is found, the handler returns true, and execution proceeds normally.
The blocked commands we use include things like docker volume prune, docker swarm leave, docker stack rm, docker service rm, and apt upgrade. These are the commands that, in our context, are most often typed either by accident or without sufficient thought about the consequences on a particular node.
trap '
  if [[ "$BASH_COMMAND" == "docker volume prune"* ]]; then
    printf "[Forbidden: >> %s << Run \"trap - DEBUG\" to allow.]\n" "$BASH_COMMAND"
    false
  elif [[ "$BASH_COMMAND" == "docker swarm leave"* ]]; then
    printf "[Forbidden: >> %s << Run \"trap - DEBUG\" to allow.]\n" "$BASH_COMMAND"
    false
  elif [[ "$BASH_COMMAND" == "docker stack rm"* ]]; then
    printf "[Forbidden: >> %s << Run \"trap - DEBUG\" to allow.]\n" "$BASH_COMMAND"
    false
  fi
' DEBUG
set -T
shopt -s extdebug
When you type a blocked command, you see something like:
[Forbidden: >> docker stack rm mystack << Run "trap - DEBUG" to allow.]
The command is not executed. Your swarm stack is still there.
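If the if/elif chain gets long, the same idea can be written with a list and a loop. This is a variation, not what we actually deploy: the array blocked_prefixes and the function tripwire are names I made up for the sketch.

```bash
#!/usr/bin/env bash
# Hypothetical list of blocked command prefixes; adjust to taste.
blocked_prefixes=(
    "docker volume prune"
    "docker swarm leave"
    "docker stack rm"
)

# Print a warning and return non-zero if the command starts with a blocked prefix.
tripwire() {
    local cmd=$1 prefix
    for prefix in "${blocked_prefixes[@]}"; do
        if [[ "$cmd" == "$prefix"* ]]; then
            printf '[Forbidden: >> %s << Run "trap - DEBUG" to allow.]\n' "$cmd"
            return 1
        fi
    done
    return 0
}

trap 'tripwire "$BASH_COMMAND"' DEBUG
set -T
shopt -s extdebug
```

The matching is still a plain prefix comparison on $BASH_COMMAND, so it shares the limitation of the original: sudo docker stack rm or a differently spaced command line would not match unless you add those prefixes too.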
The escape hatch
This is not a security boundary, and it is not intended to be one. It is a tripwire. If you actually need to run one of the blocked commands — and this happens more often than one thinks — you clear the trap:
trap - DEBUG
This removes the DEBUG trap entirely, for the current shell session. You run your command. The next time you open a shell session (e.g. via ssh), .bashrc is sourced again and the trap is active again. Or you re-source .bashrc right away to put the tripwire back:
source ~/.bashrc
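The whole arm, clear, re-arm cycle can be rehearsed in a throwaway script. Again a sketch with made-up names: risky is a stand-in function that counts how often it actually ran, and the trap is re-installed inline here where a real session would re-source .bashrc.

```bash
#!/usr/bin/env bash
# Stand-in for a blocked command; it only counts its own executions.
risky() { runs=$((runs + 1)); }
runs=0
shopt -s extdebug

trap '[[ "$BASH_COMMAND" != risky* ]]' DEBUG   # tripwire armed
risky                                          # vetoed
trap - DEBUG                                   # escape hatch
risky                                          # runs
trap '[[ "$BASH_COMMAND" != risky* ]]' DEBUG   # armed again
risky                                          # vetoed again
trap - DEBUG

echo "risky ran $runs time(s)"
```

Only the middle call should get through, so the counter should end at 1.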
We deploy this via Ansible to pretty much all "important production" as well as "someone's playground" servers, injected into .bashrc for both the ubuntu and root users. The blocked command list lives in a separate text file, blocked_commands.txt, with one command prefix per line, which makes it easy to update. For anyone who wants to deploy this across their machines, here is the Ansible snippet I use, with some Jinja magic in there:
- name: Read blocked commands from local file
  set_fact:
    blocked_cmds: "{{ lookup('file', 'blocked_commands.txt').splitlines() }}"

- name: Construct trap command
  set_fact:
    trap_cmd: >-
      {% if trap_cmd is defined %}{{ trap_cmd }}{% else %}if [[ "$BASH_COMMAND" == "{{ item }}"* ]]; then printf "[Forbidden: >> %s << This might lead to issues. Run \"trap - DEBUG\" to allow command. source ~/.bashrc to disable it again]\n" "$BASH_COMMAND"; false; {% endif %}
      elif [[ "$BASH_COMMAND" == "{{ item }}"* ]]; then
      printf "[Forbidden: >> %s << This might lead to issues. Run \"trap - DEBUG\" to allow command. source ~/.bashrc to disable it again]\n" "$BASH_COMMAND"; false;
  loop: "{{ blocked_cmds }}"

- name: Apply trap to .bashrc for user ubuntu
  blockinfile:
    dest: "/home/ubuntu/.bashrc"
    block: |
      # via https://stackoverflow.com/a/55977897
      trap '{{ trap_cmd + "fi;" | trim }}' DEBUG
      set -T
      shopt -s extdebug
    marker: '# {mark} ANSIBLE MANAGED BLOCK - command restriction via trap'
    insertbefore: EOF
    create: yes
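If you do not use Ansible, roughly the same generation step can be done in plain bash. This is a sketch, not a port of the playbook: the function name build_trap_body is hypothetical, the warning text is shortened, and I assume the prefixes contain no quote characters, since those would break the quoting.

```bash
#!/usr/bin/env bash
# Build the body of a DEBUG trap from a file with one command prefix per line.
# Assumption: prefixes contain no quote characters, which would break the quoting.
build_trap_body() {
    local file=$1 prefix keyword=if body=""
    while IFS= read -r prefix || [[ -n "$prefix" ]]; do
        [[ -z "$prefix" ]] && continue   # skip blank lines
        body+="$keyword [[ \"\$BASH_COMMAND\" == \"$prefix\"* ]]; then "
        body+='printf "[Forbidden: >> %s << Run \"trap - DEBUG\" to allow.]\n" "$BASH_COMMAND"; false; '
        keyword=elif
    done < "$file"
    if [[ -n "$body" ]]; then
        body+="fi"
    fi
    printf '%s\n' "$body"
}

# Print the lines that would go into .bashrc, if the list file is present.
list_file=${1:-blocked_commands.txt}
if [[ -r "$list_file" ]]; then
    printf "trap '%s' DEBUG\nset -T\nshopt -s extdebug\n" "$(build_trap_body "$list_file")"
fi
```

Run it next to a blocked_commands.txt and append the output to the .bashrc of your choice. Unlike the playbook, this does not manage markers or replace an earlier block, so running it twice means cleaning up by hand.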
Colored shell prompts
While we are at it, here is another cheap .bashrc trick that pairs well with the command tripwire: making it visually obvious where you are. If you spend your day hopping between terminals, it is surprisingly easy to lose track of which shell is connected to which environment, especially when you have four terminal tabs open and they all say ubuntu@ip-10-0-something.
We tackle this by overwriting PS1 per deployment stage, with color-coded [STAGE] suffixes. This trick is a bit more common; I got it from Mitchell Hashimoto in some YouTube video, and I have seen it around since. On a production node the suffix is red and says [PROD]. On staging it is orange and says [STAGING]. You can pick whatever scheme you like in practice. In any case, this is the kind of thing that costs nothing to set up and saves your behind the moment your muscle memory tries to run a prod command in what you thought was a staging session. Here is an example for Debian-based systems:
- name: set vars for shell prompt
  set_fact:
    stage_colors:
      # green
      master: '\[\033[01;32m\]'
      # orange
      staging: '\[\e[38;5;202;1m\]'
      # red
      prod: '\[\e[38;5;160;1m\]'
    green_color: '\[\033[01;32m\]'
    reset_formatting: '\[\033[00m\]'

- name: Update shell prompt
  blockinfile:
    dest: "/home/{{ ansible_user }}/.bashrc"
    marker: '# {mark} ANSIBLE MANAGED BLOCK - overwrite shell prompt PS1'
    insertbefore: "unset color_prompt force_color_prompt"
    block: |
      if [ "$color_prompt" = yes ]; then
          PS1='${debian_chroot:+($debian_chroot)}{{ green_color }}\u@\h{{ reset_formatting }}:\w{{ stage_colors[deployment_stage] }}[{{ deployment_stage | upper }}]\${{ reset_formatting }} '
      else
          PS1='${debian_chroot:+($debian_chroot)}\u@\h:\w[{{ deployment_stage | upper }}]\$ '
      fi
The result is that on a prod machine your prompt reads something like:
ubuntu@ip-10-0-1-42:~[PROD]$
...with the [PROD]$ part in red.
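For a machine that is not managed by Ansible, the same prompt can be put together by hand. A sketch under assumptions: the environment variable DEPLOYMENT_STAGE and the function stage_prompt are names I invented here, unknown stages fall back to green, and the debian_chroot part of the stock Debian prompt is left out for brevity.

```bash
#!/usr/bin/env bash
# Build a PS1 string with a color-coded [STAGE] suffix.
# Color codes are the ones from the Ansible vars above; names are hypothetical.
stage_prompt() {
    local stage=$1 color
    local green='\[\033[01;32m\]' reset='\[\033[00m\]'
    case $stage in
        prod)    color='\[\e[38;5;160;1m\]' ;;   # red
        staging) color='\[\e[38;5;202;1m\]' ;;   # orange
        *)       color=$green ;;                 # anything else: green
    esac
    # printf reuses the %s format for every argument, concatenating the pieces.
    printf '%s' "$green" '\u@\h' "$reset" ':\w' "$color" "[${stage^^}]" '\$' "$reset" ' '
}

# Assumption: DEPLOYMENT_STAGE is exported somewhere, e.g. in /etc/environment.
PS1=$(stage_prompt "${DEPLOYMENT_STAGE:-master}")
```

The ${stage^^} uppercase expansion needs bash 4 or newer. Put the function and the PS1 assignment at the end of .bashrc so that nothing later overwrites the prompt.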
This is how I stop myself from shooting myself in the foot - on prod.
[1] You might think of bash's restricted mode (rbash), which disables things like changing directories, setting PATH, and running commands with slashes. This is far too blunt an instrument. We are not trying to lock people out of the system; we are trying to prevent a specific class of mistakes while leaving everything else fully operational. A restricted shell renders the machine nearly unusable for the kind of debugging and administration work that necessitates SSH access in the first place.