Currently Studying — ML Systems
The active reading list and learning targets for the next 6–12 months, updated monthly. The aim is the engineering substrate underneath frontier-model work (training infrastructure, inference serving, evaluation tooling), not chasing the research method of the week.
Books in progress
- Deep Learning — Goodfellow, Bengio, Courville. Currently around the Chapter 4–5 boundary, working through optimization and regularization with derivations rather than skimming.
- Designing Data-Intensive Applications — Kleppmann. A re-read that forces me to tie what I’ve shipped back to its formal name.
- Interview-track ML systems texts — selecting one to anchor the upcoming application cycle.
Topics I’m going deep on
In rough priority order:
- Transformer internals. Attention math from scratch, KV cache, RoPE, MQA / GQA, FlashAttention.
- Distributed training. DDP, FSDP, ZeRO, pipeline + tensor parallelism. The actual primitives, not the framework wrapper.
- Inference serving. vLLM, TensorRT-LLM, Triton Inference Server: paged attention, continuous batching, speculative decoding.
- RLHF and preference optimization. DPO, IPO, KTO. What each method simplifies relative to the full RLHF pipeline matters more than the math.
- Eval design. LLM-as-judge, offline eval pipelines, regression-testing models. This is the area where I already have the closest production analog.
- Agent frameworks and tool-use patterns. What’s actually load-bearing in production agents vs. what’s library scaffolding.
- Kubernetes and GPU orchestration. Closing the gap left by ECS-only experience.
What I’m tracking
Public research I read more or less weekly: Anthropic and OpenAI engineering posts, the SemiAnalysis archive, the more rigorous ML blogs (Lilian Weng, Jay Alammar), and the major training-infra papers as they ship.
Side work
A public-artifacts plan is in motion: the goal is to ship a small but serious open-source artifact in the ML-systems space by the time I’m in the Bay Area. Updates land on the writing index when there’s substance to share.