A collection of resources to understand how tokens work in Large Language Models.
## What are Tokens?
[Your Chatbot Isn't Reading Words—It's Counting Tokens](https://hackernoon.com/your-chatbot-isnt-reading-wordsits-counting-tokens) - HackerNoon
- Tokens can be whole words, subwords, single characters, or symbols - not always whole words. For example, "unhappiness" may be split into "un," "happi," and "ness"
- Three stages: splitting (text → subwords), encoding (subwords → integers), decoding (integers → text)
- Use `tiktoken` to monitor token limits, optimise costs, and debug how your text tokenizes
- Chunk large documents and trim inputs to manage context windows and reduce API costs
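
The three stages and the chunking advice above can be sketched with a toy tokenizer. This is a minimal illustration, not how any production tokenizer works: the vocabulary and IDs are hypothetical, and the greedy longest-match split stands in for the learned merge rules that real tokenizers use.

```python
# Toy vocabulary: hypothetical subwords and IDs, not taken from any real model.
VOCAB = {"un": 0, "happi": 1, "ness": 2, "token": 3, "s": 4, " ": 5}
ID_TO_SUBWORD = {token_id: subword for subword, token_id in VOCAB.items()}


def split(text):
    """Stage 1: split text into subwords using greedy longest-match."""
    pieces = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(text), pos, -1):
            if text[pos:end] in VOCAB:
                pieces.append(text[pos:end])
                pos = end
                break
        else:
            raise ValueError(f"no subword in the toy vocabulary covers {text[pos:]!r}")
    return pieces


def encode(pieces):
    """Stage 2: map each subword to its integer ID."""
    return [VOCAB[piece] for piece in pieces]


def decode(ids):
    """Stage 3: map integer IDs back to text."""
    return "".join(ID_TO_SUBWORD[token_id] for token_id in ids)


def chunk(ids, max_tokens):
    """Split a token sequence into pieces of at most max_tokens tokens."""
    return [ids[i:i + max_tokens] for i in range(0, len(ids), max_tokens)]


pieces = split("unhappiness")
ids = encode(pieces)
print(pieces)       # ['un', 'happi', 'ness']
print(ids)          # [0, 1, 2]
print(decode(ids))  # unhappiness
print(chunk(ids, 2))  # [[0, 1], [2]]
```

With `tiktoken`, the equivalent steps are `enc = tiktoken.get_encoding("cl100k_base")`, then `enc.encode(text)` and `enc.decode(ids)`; `len(enc.encode(text))` gives the token count to check against a context window. The right encoding name depends on the model you are targeting.
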
## The Cost of LLM Tokens
[Falling LLM Token Prices and What They Mean for AI Companies](https://www.deeplearning.ai/the-batch/falling-llm-token-prices-and-what-they-mean-for-ai-companies/) - DeepLearning.ai
- GPT-4o costs about $4 per million tokens, versus $36 per million for GPT-4 at its March 2023 launch - roughly a 79% price drop per year
- Competition from open-weight models (Llama 3.1) and new hardware (Groq, Cerebras) driving prices down
- Advice: prioritise functionality over cost optimisation - for most apps, LLM costs are negligible
- Even compute-intensive agentic workloads now cost ~$1.44/hour at current rates
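
The two headline figures above can be checked with a little arithmetic. This is a rough sketch under stated assumptions: the 17-month window (March 2023 to about August 2024) and the 100 tokens per second of non-stop usage are assumptions chosen to match the quoted numbers, and the function names are illustrative only.

```python
def annualised_price_change(old_price, new_price, months):
    """Compound yearly price change implied by a change over `months` months."""
    return (new_price / old_price) ** (12 / months) - 1


def hourly_cost(tokens_per_second, price_per_million_tokens):
    """Dollar cost of one hour of continuous token usage."""
    tokens_per_hour = tokens_per_second * 3600
    return tokens_per_hour * price_per_million_tokens / 1_000_000


# Assumption: $36 -> $4 per million tokens over roughly 17 months.
change = annualised_price_change(old_price=36.0, new_price=4.0, months=17)
print(f"{change:.0%}")  # -79%

# Assumption: an agentic workload consuming 100 tokens per second, non-stop.
print(f"${hourly_cost(100, 4.0):.2f}/hour")  # $1.44/hour
```

At these rates, a workload would need to consume millions of tokens per hour before token spend became a significant line item, which is the reasoning behind prioritising functionality first.
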
## Video: How LLM Tokens Actually Work
<iframe width="800" height="400" src="https://www.youtube.com/embed/nKSk_TiR8YA" title="Most devs don't understand how LLM tokens work" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
**Key takeaways:**
- Tokens are the currency of LLMs - the smallest unit of text a model processes
- Usage is billed by the number of tokens you send and receive
- Text is converted into numeric token IDs; the LLM computes on those numbers, not on the text itself