Tokens in LLMs - Ramshankar Yadhunath

A collection of resources to understand how tokens work in Large Language Models.

## What are Tokens?

[Your Chatbot Isn't Reading Words—It's Counting Tokens](https://hackernoon.com/your-chatbot-isnt-reading-wordsits-counting-tokens) - HackerNoon

- Tokens can be words, subwords, characters, or symbols, not necessarily whole words. For example, "unhappiness" becomes "un," "happi," and "ness"
- Tokenization has three stages: splitting (text → subwords), encoding (subwords → integers), and decoding (integers → text)
- Use `tiktoken` to monitor token limits, optimise costs, and debug how your text tokenizes
- Chunk large documents and trim inputs to manage context windows and reduce API costs

## The Cost of LLM Tokens

[Falling LLM Token Prices and What They Mean for AI Companies](https://www.deeplearning.ai/the-batch/falling-llm-token-prices-and-what-they-mean-for-ai-companies/) - DeepLearning.ai

- GPT-4o costs $4 per million tokens, down from $36 per million for GPT-4 in March 2023, which works out to roughly a 79% price drop per year
- Competition from open-weight models (Llama 3.1) and new hardware (Groq, Cerebras) is driving prices down
- Advice: prioritise functionality over cost optimisation. For most apps, LLM costs are negligible
- Even compute-intensive agentic workloads now cost ~$1.44/hour at current rates

## Video: How LLM Tokens Actually Work

<iframe width="800" height="400" src="https://www.youtube.com/embed/nKSk_TiR8YA" title="Most devs don't understand how LLM tokens work" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

**Key takeaways:**

- Tokens are the currency of LLMs: they are the smallest unit a model processes and the unit you are billed in
- Tokenization converts text into numbers; these numbers are what an LLM actually computes on
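
To make the ideas above concrete, here is a minimal Python sketch of the three tokenization stages (splitting, encoding, decoding) and of per-million-token billing. It is a toy, not how production tokenizers work: the three-entry vocabulary and the function names are hypothetical, and the greedy longest-match split stands in for the learned subword algorithms that real tokenizers use. The hourly cost line assumes a steady 100 tokens per second at $4 per million tokens, which is one way to arrive at the ~$1.44/hour figure.

```python
# Hypothetical toy vocabulary (subword -> integer ID); real models use
# learned vocabularies with tens of thousands of entries.
VOCAB = {"un": 0, "happi": 1, "ness": 2}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def split(text):
    """Stage 1: split text into subwords using greedy longest-match."""
    pieces, pos = [], 0
    while pos < len(text):
        # Try the longest candidate first and shrink until a vocabulary entry matches.
        for end in range(len(text), pos, -1):
            if text[pos:end] in VOCAB:
                pieces.append(text[pos:end])
                pos = end
                break
        else:
            raise ValueError(f"no token in the toy vocabulary covers {text[pos:]!r}")
    return pieces


def encode(text):
    """Stage 2: map each subword to its integer ID."""
    return [VOCAB[piece] for piece in split(text)]


def decode(ids):
    """Stage 3: map integer IDs back to text."""
    return "".join(ID_TO_TOKEN[i] for i in ids)


def cost_usd(n_tokens, usd_per_million):
    """Billing is proportional to the number of tokens processed."""
    return n_tokens * usd_per_million / 1_000_000


ids = encode("unhappiness")
print(split("unhappiness"), ids, decode(ids))

# Assumed workload: 100 tokens per second for one hour at $4 per million tokens.
print(cost_usd(100 * 3600, 4.0))
```

For real text, the same count would come from a real tokenizer, for example `len(tiktoken.get_encoding("cl100k_base").encode(text))`, with the result passed to a cost function like the one above.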