Articles
Deep dives on AI, GitHub, RAG, cloud and DevOps - written by engineers shipping production systems in 2026.
Serverless Was Amazing Until My Lambda Timed Out at 3am
I was a serverless evangelist for two years. Then I ran a real production workload on Lambda and learned some expensive lessons about cold starts, timeout limits, and vendor lock-in.
The Cloud Migration That Almost Killed Our Startup (And What Saved Us)
We decided to migrate from our managed hosting to AWS in Q4. Three months of downtime incidents, one near-catastrophic data access issue, and one extremely uncomfortable conversation with investors later, here's what we learned.
What Three Years of Running Kubernetes in Production Actually Taught Me
I've been running Kubernetes in production since 2022. Here's what the blog posts don't tell you: the operational surprises, the 2am incidents, and the things I wish I'd known before I started.
We Replaced Jenkins With GitHub Actions. Six Months Later, Here's the Verdict.
We ran Jenkins for four years. Migration to GitHub Actions took three months. Here's the honest comparison: what got better, what we miss, and whether we'd do it again.
The Monitoring Stack That Saved Our Black Friday
Last November our traffic spiked 12x in 90 minutes. We caught a cascade failure in under four minutes and resolved it without 95% of users ever noticing. Here's the monitoring setup that made that possible.
Platform Engineering vs DevOps: Why the Name Change Actually Matters
When people started calling it 'platform engineering' instead of DevOps, a lot of engineers rolled their eyes at another industry rebrand. They were wrong to. Here's what actually changed.
I Automated Our Entire Deployment Process. Here's the Part Nobody Warned Me About.
We built a fully automated CI/CD pipeline: commit to deploy in under eight minutes, zero manual steps. It was a genuine improvement. It also created a class of problems I didn't anticipate.
I Rebuilt Our Site in Next.js 14. My Honest Six-Month Review.
After four years on a custom React SPA, we migrated to Next.js 14 App Router. Six months later I have real data on performance, developer experience, and the things I wish I'd known before starting.
The React Mistake That Crashed Our App at 50,000 Concurrent Users
We built a live event feature. It worked fine in testing. At 50,000 concurrent users it triggered a re-render cascade that locked up browsers across the entire platform. Here's exactly what happened.
TypeScript: The Pain Was Worth It. Here's What Changed After Two Years.
I resisted TypeScript for longer than I'd like to admit. Two years after finally committing to it for a production codebase, I have data on what actually changed - and one honest admission about where I was wrong.