10 Things I Learned After 4 Years as a Backend Engineer
Four years of building backend systems. Hundreds of deploys. A handful of incidents that taught me more than any course ever could. Here’s what stuck.
1. The Database Is Almost Always the Bottleneck
Not your application code. Not your framework. The database.
I’ve spent more time staring at EXPLAIN ANALYZE output than writing actual application logic. The most impactful performance work I’ve done was adding the right index, rewriting a query to avoid a sequential scan, or denormalizing a hot read path.
Before you optimize your code, check your queries.
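A minimal sketch of how that check can be automated, assuming you already capture the text output of PostgreSQL's EXPLAIN ANALYZE somewhere. The helper name `findSeqScans` and the sample plan are made up for illustration; the only thing it relies on is that sequential scan nodes appear in the plan as "Seq Scan on <table>".

```typescript
// Hypothetical helper: lists the tables that a PostgreSQL plan reads with a
// sequential scan ("Seq Scan on <table>" or "Parallel Seq Scan on <table>").
function findSeqScans(planText: string): string[] {
  const pattern = /Seq Scan on (\S+)/g;
  const tables: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(planText)) !== null) {
    // Report each table once, even if it is scanned in several plan nodes.
    if (tables.indexOf(match[1]) === -1) {
      tables.push(match[1]);
    }
  }
  return tables;
}

// Made-up plan text in the shape EXPLAIN prints; the numbers are illustrative.
const examplePlan = [
  "Nested Loop  (cost=0.29..8350.12 rows=12 width=68)",
  "  ->  Seq Scan on orders  (cost=0.00..8341.00 rows=12 width=36)",
  "        Filter: (status = 'pending'::text)",
  "  ->  Index Scan using users_pkey on users  (cost=0.29..0.75 rows=1 width=36)",
].join("\n");

console.log(findSeqScans(examplePlan)); // ["orders"]
```

A sequential scan is not always a problem (small tables are often cheaper to scan than to index), so treat the result as a list of queries worth a second look, not as a list of bugs.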
2. Idempotency Isn’t Optional
In distributed systems, every operation should be safe to retry. Networks fail. Messages get delivered twice. Deployments restart mid-request.
If your payment endpoint charges twice on retry, you have a very expensive bug.
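// Not this: a retried request creates a second charge
async function chargeUser(userId: string, amount: number) {
  // "usd" is assumed here; the Charges API requires a currency
  await stripe.charges.create({ amount, currency: "usd", customer: userId });
}

// This: requests that share an idempotency key are only charged once
async function chargeUser(idempotencyKey: string, userId: string, amount: number) {
  await stripe.charges.create(
    { amount, currency: "usd", customer: userId }, // "usd" is assumed here
    { idempotencyKey }
  );
}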
Design every write operation with the question: “What happens if this runs twice?”
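The same question applies to your own endpoints, not just to the APIs you call. Here is a rough sketch of the server side, assuming the client sends a unique key per logical operation. `IdempotencyStore` is a hypothetical name, and the in-memory `Map` only works within a single process; a real service would more likely keep keys in a shared store with an expiry.

```typescript
// Hypothetical in-memory idempotency store: the first call for a key runs the
// operation, later calls with the same key get the same promise back.
class IdempotencyStore<T> {
  private readonly results = new Map<string, Promise<T>>();

  run(key: string, operation: () => Promise<T>): Promise<T> {
    const existing = this.results.get(key);
    if (existing !== undefined) {
      return existing; // duplicate or retry: do not run the operation again
    }
    const pending = operation();
    this.results.set(key, pending);
    // Forget failed attempts so a later retry can run the operation again.
    pending.catch(() => {
      if (this.results.get(key) === pending) {
        this.results.delete(key);
      }
    });
    return pending;
  }
}

// Usage with a fake charge function that counts how often it really runs.
let chargeCount = 0;
const fakeCharge = async (): Promise<string> => {
  chargeCount += 1;
  return `charge-${chargeCount}`;
};

const store = new IdempotencyStore<string>();
const first = store.run("order-42", fakeCharge);
const retry = store.run("order-42", fakeCharge);
console.log(first === retry, chargeCount); // true 1
```

Storing the promise rather than the resolved value means two concurrent requests with the same key also share one execution.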
3. Logs Are More Valuable Than Tests
Controversial, but hear me out. Tests tell you if your code works as expected in controlled scenarios. Logs tell you what actually happened in production.
I’m not anti-testing. But I’ve seen teams with 95% test coverage that couldn’t debug production issues because their logging was garbage. Structured logs with correlation IDs, request context, and meaningful messages are worth more than an extra unit test.
Invest in both. But if I had to choose, I’d choose observability.
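To make "structured logs with correlation IDs" concrete, here is a small sketch. The function names and the field names (`timestamp`, `level`, `correlationId`, `message`) are my own choices for illustration, not a standard; in practice you would probably reach for an existing logging library and keep the same idea.

```typescript
type LogLevel = "info" | "warn" | "error";
type LogFields = Record<string, unknown>;

// Builds one JSON log line. Field names here are illustrative assumptions.
function formatLogLine(
  level: LogLevel,
  message: string,
  correlationId: string,
  fields: LogFields = {},
  now: Date = new Date()
): string {
  // Spread caller fields first so they cannot overwrite the core fields.
  return JSON.stringify({
    ...fields,
    timestamp: now.toISOString(),
    level,
    correlationId,
    message,
  });
}

// Binds a correlation ID once per request so every line carries it.
function createRequestLogger(correlationId: string) {
  const write = (level: LogLevel, message: string, fields?: LogFields) =>
    console.log(formatLogLine(level, message, correlationId, fields));
  return {
    info: (message: string, fields?: LogFields) => write("info", message, fields),
    warn: (message: string, fields?: LogFields) => write("warn", message, fields),
    error: (message: string, fields?: LogFields) => write("error", message, fields),
  };
}

// Example: every line from this request can be found by searching "req-7f3a".
const log = createRequestLogger("req-7f3a");
log.error("charge failed", { userId: "u_123", attempt: 2 });
```

One JSON object per line is what makes these logs searchable: you filter by `correlationId` and get the whole story of a single request across handlers.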
4. Simple Beats Clever Every Time
Early in my career, I wrote “clever” code. Event sourcing for a CRUD app. Microservices for a two-person team. Custom framework abstractions that nobody else understood.
Now I write boring code. If/else instead of polymorphic dispatch. SQL instead of an ORM for complex queries. A monolith until the team actually needs microservices.
Clever code impresses in pull requests. Simple code survives on-call rotations.
5. You Don’t Need Microservices (Yet)
Most projects should start as a monolith. Period.
Microservices solve organizational scaling problems, not technical ones. If you have fewer than 20 engineers, you probably don’t need independent deployability. You need fast iteration and simple debugging.
I’ve seen small teams waste months on service mesh configuration, distributed tracing setup, and cross-service schema management — problems that don’t exist in a well-structured monolith.
Split when you feel the pain. Not before.
6. Error Handling Is the Real Logic
The happy path is easy. Anyone can write code that works when everything goes right. The actual engineering is in the failure modes.
- What happens when the downstream service times out?
- What if the database connection pool is exhausted?
- What if the message queue is full?
- What if the payload is 100x larger than expected?
I now spend more time thinking about failure cases than success cases. Every external call gets a timeout, a retry policy, and a circuit breaker. Every queue consumer has a dead letter queue.
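As a sketch of what the retry policy and circuit breaker can look like, here is a deliberately small version. The class, its thresholds, and the backoff defaults are assumptions for illustration; production libraries add jitter, metrics, and per-error rules. Timeouts are left to the client you pass in.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Minimal circuit breaker: opens after N consecutive failures, then lets
// trial calls through once the reset window has passed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(
    private readonly failureThreshold: number,
    private readonly resetAfterMs: number,
    private readonly now: () => number = Date.now
  ) {}

  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetAfterMs) {
      this.state = "half-open"; // window elapsed: allow a trial call
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call, or too many failures in a row, opens the circuit.
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }
}

// Exponential backoff for retries: 100ms, 200ms, 400ms, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 5000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Wraps one external call: fail fast while open, record the outcome otherwise.
async function callThrough<T>(breaker: CircuitBreaker, call: () => Promise<T>): Promise<T> {
  if (!breaker.canRequest()) {
    throw new Error("circuit open: failing fast");
  }
  try {
    const result = await call();
    breaker.recordSuccess();
    return result;
  } catch (err) {
    breaker.recordFailure();
    throw err;
  }
}
```

The clock is injectable (`now`) so the open and half-open transitions can be tested without waiting. A retry loop would call `callThrough` and sleep for `backoffDelayMs(attempt)` between attempts, stopping as soon as the breaker opens.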
7. Migrations Are the Hardest Part
Writing new code is easy. Changing existing systems without breaking things is hard.
Database migrations, API version transitions, data backfills — this is where things go wrong. I’ve learned to:
- Always make migrations backward-compatible
- Deploy schema changes separately from code changes
- Use the expand-contract pattern: add new column → write to both → backfill → migrate reads → drop old column
- Never rename a column in a single deploy
The most senior engineers I know are the ones who can change a running system without anyone noticing.
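Here is roughly what the application side of an expand-contract rename can look like while both columns exist. The table shape and the column names (`full_name` as the old column, `display_name` as the new one) are hypothetical, and the functions work on plain objects rather than a real database so the idea stays visible.

```typescript
// Hypothetical row shape during the transition, when both columns exist.
interface UserRow {
  id: number;
  full_name: string | null; // old column, dropped in the final step
  display_name: string | null; // new column, added in the first step
}

// Expand: every write populates both columns so old and new code can coexist.
function writeName(row: UserRow, value: string): UserRow {
  return { ...row, full_name: value, display_name: value };
}

// Backfill: copy old values into the new column where it is still empty.
function backfill(rows: UserRow[]): UserRow[] {
  return rows.map((row) =>
    row.display_name === null ? { ...row, display_name: row.full_name } : row
  );
}

// Migrate reads: prefer the new column, fall back for rows not yet backfilled.
function readName(row: UserRow): string | null {
  return row.display_name ?? row.full_name;
}
```

Each function maps to its own deploy. Because reads fall back to the old column and writes keep it populated, any of these deploys can be rolled back without losing data, which is the whole point of never renaming in one step.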
8. Premature Abstraction Is Worse Than Duplication
The DRY principle is overrated for early-stage code. When I see two similar-looking functions, my instinct used to be to immediately extract a shared abstraction.
Now I wait. I let the duplication exist until I’ve seen three or four instances and understand the actual pattern. Premature abstractions couple things that shouldn’t be coupled and make future changes harder.
Three similar functions are easy to change independently. One abstraction with five configuration flags is a nightmare.
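A contrived illustration of the difference, with made-up email helpers. The flag-driven `buildEmail` is the kind of abstraction that tends to appear when similar functions are merged too early; the two plain functions after it repeat themselves but stay independent.

```typescript
interface Recipient {
  displayName: string;
}

// Premature abstraction: every new caller added another flag.
interface EmailFlags {
  greet: boolean;
  urgent: boolean;
  includeFooter: boolean;
}

function buildEmail(to: Recipient, body: string, flags: EmailFlags): string {
  const lines: string[] = [];
  if (flags.urgent) lines.push("[ACTION REQUIRED]");
  if (flags.greet) lines.push(`Hi ${to.displayName},`);
  lines.push(body);
  if (flags.includeFooter) lines.push("You can unsubscribe at any time.");
  return lines.join("\n");
}

// Deliberate duplication: each function can change without touching the other.
function welcomeEmail(to: Recipient): string {
  return [
    `Hi ${to.displayName},`,
    "Welcome aboard.",
    "You can unsubscribe at any time.",
  ].join("\n");
}

function passwordResetEmail(to: Recipient): string {
  return [
    "[ACTION REQUIRED]",
    `Hi ${to.displayName},`,
    "A password reset was requested for your account.",
  ].join("\n");
}
```

With three flags there are already eight possible combinations to reason about, most of which no caller uses. Changing the welcome email in the duplicated version touches exactly one function.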
9. Documentation Rots, but Architecture Decisions Don’t
I’ve stopped writing detailed system documentation that nobody reads. Instead, I focus on:
- ADRs (Architecture Decision Records) — why we chose X over Y, with context that was true at the time
- Runbooks — step-by-step instructions for incident response
- README with one command to run locally — if setup takes more than 5 minutes, developers won’t contribute
The “why” ages better than the “what.” Code changes. The reasoning behind architectural choices remains valuable for years.
10. On-Call Changes How You Write Code
Nothing improves your code quality like being woken up at 3 AM by your own bugs.
After being on-call, you start writing differently:
- You add better error messages because you’ll be the one reading them half-asleep
- You add circuit breakers because you know downstream services will fail
- You add metrics because you need to know if something is wrong before users tell you
- You add graceful degradation because a partially working system beats a full outage
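Graceful degradation in particular is easier to show than to describe. A minimal sketch, assuming a page that needs a product list (essential) and recommendations (nice to have); the function and field names are hypothetical:

```typescript
interface HomePage {
  products: string[];
  recommendations: string[];
  degraded: boolean; // lets callers and metrics see that a fallback was used
}

async function loadHomePage(
  fetchProducts: () => Promise<string[]>,
  fetchRecommendations: () => Promise<string[]>
): Promise<HomePage> {
  // Core data: if this fails, the request fails.
  const products = await fetchProducts();
  try {
    const recommendations = await fetchRecommendations();
    return { products, recommendations, degraded: false };
  } catch {
    // Optional data: serve the page without it instead of failing the request.
    return { products, recommendations: [], degraded: true };
  }
}
```

The `degraded` flag matters as much as the fallback itself: count it in a metric, and you find out that recommendations are down before a user tells you.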
The feedback loop from production to development is the most powerful learning mechanism in software engineering. If your team doesn’t have on-call rotations, find another way to close that loop.
These lessons came from production incidents, late-night debugging sessions, and working with engineers far more experienced than me. None of them are original — they’ve all been said before. But there’s a difference between reading advice and living it.
The next four years will probably teach me that half of these are wrong. That’s the point.