Blog

Notes on ML engineering, data platforms, and the developer tools I build along the way.

llm / go / devops

I monitored 6 LLM APIs for 7 days. Here's what I found.

60,000 probes across GPT-4o-mini, Claude 3.5 Haiku, Gemini 2.0 Flash, Llama 3.3 70B, DeepSeek Chat, and Mistral Small. Real latency numbers from continuous monitoring.

llm / python / devtools

How I built Infracost for LLM spend in a day

Building tokentoll, an Infracost-style cost-impact tool for LLM API spend, in a single day. Architecture, model-name resolution, multi-pass constant propagation, and validation across twenty real codebases.
