Latest Posts
- Why Is LLM Inference So Much Slower on CPU? A Deep Dive
  2026-03-28 · GPU inference is ~10–20x faster than CPU, and the gap isn't about raw compute. A complete walkthrough of how CPUs and GPUs access memory differently, from cache lines to coalesced access, SIMD to SIMT, and shared memory tiling.
- RSS Reader with short summary and novelty score, using a local LLM
  2026-03-27 · Using local LLMs to prioritize the best articles to read.
- Automating Product Demo Videos with AI and Remotion
  2026-03-13 · Building a tool that turns product screenshots into polished 30-second demo videos, using Claude's vision capabilities and Remotion's React-based video engine.