Currently Studying — ML Systems

Open reading list and learning targets for the next 6–12 months, oriented around training infrastructure, inference serving, evaluation tooling, and the systems craft underneath frontier-model work.

Year 2026

The active reading list. This page updates monthly. The aim is the engineering substrate underneath frontier-model work (training, inference, evals, infrastructure), not chasing the research method of the week.

Books in progress

Deep Learning — Goodfellow, Bengio, Courville. Currently around the Chapter 4–5 boundary, working through optimization and regularization with derivations rather than skimming.
Designing Data-Intensive Applications — Kleppmann. Re-read. It forces me to tie what I’ve shipped back to its formal name.
Interview-track ML systems texts — selecting one to anchor the upcoming application cycle.

Topics I’m going deep on

In rough priority order:

Transformer internals. Attention math from scratch, KV cache, RoPE, MQA / GQA, FlashAttention.
Distributed training. DDP, FSDP, ZeRO, pipeline + tensor parallelism. The actual primitives, not the framework wrapper.
Inference serving. vLLM, TensorRT-LLM, Triton — paged attention, continuous batching, speculative decoding.
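RLHF and preference optimization. DPO, IPO, KTO. What each method simplifies away from the full RLHF pipeline (DPO, for instance, drops the separate reward model and the RL loop) matters more than the math.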
Eval design. LLM-as-judge, offline eval pipelines, regression-testing models. This is the area where I already have the most direct production analog.
Agent frameworks and tool-use patterns. What’s actually load-bearing in production agents vs. what’s library scaffolding.
Kubernetes and GPU orchestration. Closing the gap left by ECS-only experience.

What I’m tracking

Public research I read more or less weekly: Anthropic and OpenAI engineering posts, the SemiAnalysis archive, the more rigorous ML blogs (Lilian Weng, Jay Alammar), and the major training-infra papers as they ship.

Side work

A public-artifacts plan is in motion. The goal is to ship a small but serious open-source artifact in the ML-systems space by the time I’m in the Bay Area. Updates land on the writing index when there’s substance to share.