When Your System Lies in Complete Sentences: Observing LLMs in Production
Picture this scenario. Your AI-powered feature is running. Every metric you watch looks clean. Error rate: flat. P99 latency: within SLO. HTTP 200s across the board. Your on-call engineer has nothing to page about. But for the last three days, the model has been producing subtly wrong output. Not wrong in a way that crashes anything. Not wrong in a way that fires a single alert. The responses are fluent, confident, and structurally perfect. They just happen to be incorrect in
Chandra Sekar Reddy
May 24 · 8 min read