Blog

Thinking out loud.

Writing about the hard parts of building AI systems in production. Not tutorials — lessons learned.

January 15, 2026|12 min read

Designing Reliable LLM Systems

What it actually takes to build LLM-powered features that work 99.5% of the time. Covers structured outputs, fallback chains, and the retry patterns nobody talks about.

LLM, System Design, Production

December 2, 2025|8 min read

When Not to Use Agents

Agents are powerful but overused. Here's a framework for deciding when a simple chain, a DAG, or a hard-coded pipeline is a better choice than an autonomous agent.

Agents, Architecture, Tradeoffs

October 18, 2025|15 min read

RAG Failure Modes in Production

After running RAG systems serving 500+ daily users, here are the failure modes that don't show up in tutorials: chunking disasters, embedding drift, and the retrieval-generation gap.

RAG, Production, Debugging

September 5, 2025|10 min read

Cost vs Accuracy: The Tradeoffs Nobody Documents

Every ML system has a cost-accuracy frontier. I walk through real examples of navigating this tradeoff in production, from model selection to caching strategies.

Cost Optimization, ML, Production