Truncating LLM Context Is Not Retrieval: Why slice(0, N) Was the Wrong Fix
We capped a 2.6M-character LLM prompt with slice(0, 10000). Latency dropped, dashboards turned green, and the model started answering with the wrong context.
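For concreteness, the cap probably looked something like the sketch below. This is a minimal reconstruction, not the original code; the names (buildPrompt, contextDocs, MAX_CHARS) are hypothetical. The point it illustrates: slice(0, N) keeps whatever context happens to come first, with no regard for relevance to the question.

```typescript
// A minimal sketch of the kind of cap described above.
// All names here are hypothetical, not from the original codebase.
const MAX_CHARS = 10_000;

function buildPrompt(question: string, contextDocs: string[]): string {
  const context = contextDocs.join("\n\n"); // can run to ~2.6M characters
  // The "fix": keep only the first 10,000 characters of context.
  // Whatever was concatenated first survives; relevance plays no part.
  return `${context.slice(0, MAX_CHARS)}\n\nQuestion: ${question}`;
}
```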