Inference cost · Published April 9, 2026
Batch Inference
Why batch inference can cut costs sharply on AI workloads that do not need real-time latency.
Value often starts with better cost architecture, not with a bigger model.
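To make the cost claim concrete, here is a minimal back-of-the-envelope sketch. Several providers discount batch (non-real-time) processing; the prices, discount rate, and workload volume below are all assumptions for illustration, not actual vendor pricing.

```python
# Hypothetical pricing sketch: batch endpoints trade latency for a discount.
# Every number below is an assumption chosen for illustration.

REALTIME_PRICE_PER_MTOK = 3.00   # assumed $/million tokens, real-time
BATCH_DISCOUNT = 0.50            # assumed fraction off for batch processing


def monthly_cost(tokens_per_month: float, price_per_mtok: float) -> float:
    """Dollar cost for a given monthly token volume at a given rate."""
    return tokens_per_month / 1_000_000 * price_per_mtok


volume = 2_000_000_000  # assumed workload: 2B tokens/month
realtime = monthly_cost(volume, REALTIME_PRICE_PER_MTOK)
batch = monthly_cost(volume, REALTIME_PRICE_PER_MTOK * (1 - BATCH_DISCOUNT))

print(f"real-time: ${realtime:,.0f}/mo  "
      f"batch: ${batch:,.0f}/mo  "
      f"saved: ${realtime - batch:,.0f}/mo")
# → real-time: $6,000/mo  batch: $3,000/mo  saved: $3,000/mo
```

The point is architectural, not model-dependent: if a workload tolerates hours of latency, routing it to a batch path halves the bill here without touching model choice or quality.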
Lab
Articles, repos, and working notes, all openly available.
The current public Claude-Book release shows how I orchestrate multiple agents, state, and multi-pass workflows around a writing system.
How to design agentic workflows that go beyond a wrapper or a linear chatbot.
A repo for comparing retrieval strategies and documenting when embeddings actually add value.
Reducing RAG stack complexity often improves delivery speed, maintenance, and total cost.