The Tokens
Bonding through words... Manpreet & Renaira
Featured
🧠 Brains, Monty & Transformers at Cafe...
You’re sitting in a quiet room, sipping coffee. Outside, the world is silent; inside, billions of neurons fire, cells glow, memories strengthen, expectations form, and predictions whisper through your mind, all while consuming about the same energy as a dim light bulb. Every time...
AI isn’t a miracle cure, it’s physiotherapy...
Only works if you’re willing to move. I’ve been reflecting on how exciting it would be for GPT and Kimi K2 to learn an even better version of AI Snake Oil in their next training! Just kidding, of course! “AI isn’t our replacement, it’s...
Non Deterministic AI
Have you ever asked GPT the same question twice and gotten different answers? Even when you set the temperature to zero (which should make responses deterministic), variations still occur. This seemingly simple observation reveals a fascinating technical challenge that affects every large language model currently...
Glorious AI Noise
Keep Your Sanity When AI Breaks the Speed Limit. I often hear this question about AI, and it can feel overwhelming to read, learn, and adapt. The real challenge lies in knowing where to begin, where to pause, and how to navigate through the...
AI: A Slightly Silly Look into the Future
I have been actively working on AI since before ChatGPT, and it has been amazing to watch the growth and speed at which AI models are changing every discussion and conversation around us. On Saturday evening, I decided to pen down some of my thoughts. These...
Flash Attention
This write-up was not written by GPT in any form, and believe me, it feels good to have typos and grammatical mistakes that make it feel more human. I do not want to talk about how neural networks were inspired by brains, but I do...
GRPO At Its Best
The wild world of fine-tuning large language models is where we feed math problems to a 7-billion-parameter beast (Qwen2.5-7B-Instruct), run it on 8 fire-breathing A100 GPUs, and politely ask it to get smarter without throwing a tantrum. This writeup dives into GRPO, a Reinforcement Learning...
Watch My Models Learn
Fancy models have set the bar high, but guess what? My model is taking a different route by mastering the art of improvement on every forward and backward pass! Let’s explore the numbers that prove this learning leap. (P.S. If you’re new here, check...
AI Leadership
AI is becoming increasingly prevalent in technology, with many products and features being developed in prototype stages. However, pressure to add “AI” to everything often doesn’t lead to a meaningful impact. To effectively harness AI, it is essential to understand business needs and...
GPT & Me 🧠
This write-up is A Neural Network Love Story (Spoiler: It’s Complicated), one neuron at a time – while GPT pretends not to notice! It is my hands-on experience training a Generative Pretrained Transformer with 124 million parameters - powered by 8 massive NVIDIA A100 GPUs,...
Neural Networks and Coffee Breaks ☕
Step at a Time 📚 This writeup provides a beginner-friendly explanation for understanding and training GPT-2. I started by implementing a transformer decoder. You can visit mini-autograd and mini-models for my older work, and now I am slowly graduating to setting up, training, and...