Vish Sangale

Notes from a quiet lab on craft, capacity, & constraints.

Staff ML Researcher at Meta. I work on large-scale AI architectures and recommendation systems — and on the long, slow project of teaching computation to mirror the robustness of biology.

/ now

what's loaded into RAM,
updated when it changes

this week
Scaling Slate-Q to a real catalog. Watching variance climb past slate=12.
rl-recsys slate-q ablations
this month
Reading on hierarchical semantic IDs & codebook collapse. Inference-time entropy keeps eating my catalog.
plum rq-vae
on the bench
Bonsai-LLM, training a 1B at home. Quality > quantity, capacity matters, constraints are the point.
bonsai-llm gemma-3 fineweb-edu

Recent thinking.

N° 05
Beyond the Click: Slate-Q for Sequential Recommendation
Most recommendation systems are designed to maximize immediate engagement, the “next click.” However, true user value is built over entire sessions. In this project, RL-RECSYS, I...
recsys
N° 04
Animating Intelligence: Visualizing AI with Manim-AI
Neural networks are often treated as “black boxes,” full of abstract matrices and hidden weights. To bridge the gap between theory and intuition, I developed...
manim
N° 03
Bonsai-LLM: The 'Small LLMs Lab' Philosophy
In an era of trillion-parameter models and massive compute clusters, it’s easy to forget that capacity matters, constraints are the point, and craft beats brute...
small-llms
N° 02
Modernizing GPT-2: A 3.1x Throughput Leap with 2025 Optimizations
The original GPT-2 architecture, released in 2019, remains the bedrock of modern NLP. However, the “standard” recipe for training Transformers has shifted dramatically. In this...
gpt-2
N° 01
Causal-Informed Hybrid Online Adaptive Optimization for Ad Load Personalization in Large-Scale Social Networks
This paper presents CTRCBO (Cohort-Based Trust Region Contextual Bayesian Optimization), a hybrid framework designed for personalizing ad load in large-scale social networks like Meta.
causal-inference
Total solar eclipse: totality · 2017
aside · photography

The day surrendered to night, and the corona danced.

When I'm not training models I am, more often than is reasonable, pointed at the sky. Eclipses, comets, the Jovian court — light from a long way off, captured patiently on a small sensor.
