Truncating LLM Context Is Not Retrieval: Why slice(0, N) Was the Wrong Fix
We capped a 2.6M-character LLM prompt with slice(0, 10000). Latency dropped, dashboards turned green, and the model started answering with the wrong context.
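For concreteness, the cap probably looked something like the sketch below. This is a minimal reconstruction, not the original code; the names (buildPrompt, contextDocs, MAX_CHARS) are hypothetical. The point it illustrates: slice(0, N) keeps whatever context happens to come first, with no regard for relevance to the question.

```typescript
// A minimal sketch of the kind of cap described above.
// All names here are hypothetical, not from the original codebase.
const MAX_CHARS = 10_000;

function buildPrompt(question: string, contextDocs: string[]): string {
  const context = contextDocs.join("\n\n"); // can run to ~2.6M characters
  // The "fix": keep only the first 10,000 characters of context.
  // Whatever was concatenated first survives; relevance plays no part.
  return `${context.slice(0, MAX_CHARS)}\n\nQuestion: ${question}`;
}
```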