A short defense of boring infrastructure.
Most of the outages I've cleaned up came from someone reaching for novelty. Here's the case for the dullest possible stack β and when to break that rule on purpose.
I have a folder on my laptop called incidents/. It is the closest thing I have to a religious text. Every entry is some version of the same story: a team needed to ship a thing, an engineer reached for a tool that was newer than their experience with it, and at three in the morning on a Saturday, someone called me.
Over the years the names changed β service mesh, then GraphQL gateway, then a particularly enthusiastic Kafka topology, then a vector DB used as a primary store β but the structure didn't. The exciting tool was supposed to save us six weeks. It cost us six months and a person.
What I want to argue is something close to a moral position: your default should be boring. Not because boring is correct, but because boring is cheap to be wrong about.
What I mean by boring
By boring I don't mean old. Plenty of old technology is also bad. I mean: tools you can debug at 3am without reading documentation; tools where the failure modes are folkloric β the team already tells stories about them; tools where the next hire has used them before.
The cost of a tool isn't how long it takes to install. It's how long it takes the fifth person on the team to fix it at midnight.
Postgres is boring. cron is boring. A shell script behind a systemd unit is boring. None of these are correct in every situation β but they have the rare property that the worst case is well-understood.
When to break the rule
I'm not religious about this. There are three situations where I happily reach for the new thing:
- The boring tool is structurally wrong, not just slow.
- You have two engineers who have shipped the new tool to production before, at scale, and they're both on this team.
- The new tool replaces something that has caused you measurable, repeated pain β and you can name the incidents.
Note that “it's on the front page of Hacker News” is not on this list. Neither is “we'll learn a lot.” You will learn a lot from running anything in production. The question is what you'll learn it about.
The senior trap
I am no longer the person who would benefit most from learning the new tool. I am the person who would benefit most from not having to learn it at 3am. There's a temptation, when you reach a certain seniority, to mistake your team's curiosity for your team's judgment.
# the wrong question
> Should we use $NEW_THING?
# the right question
> What does our worst week look like
> if we use $NEW_THING vs. what we have?
I keep that diff in my notes. I look at it before approving an RFC. It is the single most useful piece of judgment I've developed in twelve years.
None of this is an argument against curiosity. Build the weird thing on the weekend. Just don't put it on the load path before you've put it through your own worst case. Boring is a kindness to your future self β and to whoever has to call you at 3am.
β Luca, somewhere near Bologna.





