Tokens & embeddings

How text becomes numbers a model can work with, and why "meaning as math" is the secret to how AI relates ideas.

We met tokens briefly on the page about what a model is: a token is a chunk of text, roughly ¾ of a word in English. This page goes deeper, because tokens explain some of the most baffling AI behavior you'll ever see, and embeddings, the next idea, are the secret behind how AI relates concepts at all.

Why tokens explain AI's weirdest failures

A model doesn't see letters or words. It sees tokens: it reads your text as a sequence of tokens and writes its answer one token at a time. That single fact explains a class of failures that otherwise looks insane.

The classic example: ask a top-tier model how many R's are in "strawberry," and it has historically gotten it wrong. Why? Because "strawberry" gets split into separate tokens (something like "straw" + "berry"), and the model never sees the individual letters as letters. Asking it to count R's is like asking someone to count letters in a word you only ever showed them as two puzzle pieces. This is why you can ask an LLM to do something genuinely impressive and then watch it fail a question a child could answer. It's not broken. It just doesn't experience text the way you do. (Newer models often get this one right now, by spelling the word out first, but the underlying reason it ever happened is pure tokenization.)
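To make the puzzle-piece idea concrete, here is a minimal sketch of a tokenizer. The vocabulary, the ID numbers, and the tokenize helper are all made up for illustration; real tokenizers learn far larger vocabularies from data and may split "strawberry" differently.

```python
# Hypothetical toy vocabulary. Real tokenizers learn tens of thousands of
# pieces from data, and their splits and IDs will differ from these.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}


def tokenize(text):
    """Greedy longest-match split of text into pieces found in TOY_VOCAB."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest possible piece first, then shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in TOY_VOCAB:
                pieces.append(text[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {text[i]!r}")
    return pieces


pieces = tokenize("strawberry")
ids = [TOY_VOCAB[p] for p in pieces]

print(pieces)  # ['straw', 'berry']
print(ids)     # [101, 102]  <- what the model receives: no letters in sight
print("strawberry".count("r"))  # 3, easy when you can see the letters
```

The last line is trivial for ordinary code because it works on characters. The model only ever gets the two IDs.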

The same root cause explains why models are bad at strict word counts. A model doesn't know words, it knows tokens, and a token doesn't map cleanly to a word, so it has no real concept of "exactly 600 words." If you've ever fought with an AI for 45 minutes trying to get a write-up under a word limit, watching it miss again and again, that's tokens. (The big providers have since bolted on special tools to count for the model, precisely because this came up so often.)
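A rough sketch of why word counts and token counts drift apart. The splits below are written by hand as an assumption, not produced by any real tokenizer, but the pattern is typical: short common words tend to be one token each, while long or unusual words break into several.

```python
# Hypothetical token splits, written by hand for illustration only.
# A real tokenizer would split these differently.
examples = {
    "the cat sat": ["the", " cat", " sat"],
    "unbelievably tokenized strawberries": [
        "un", "believ", "ably", " token", "ized", " straw", "berries",
    ],
}

counts = {}
for text, tokens in examples.items():
    # Sanity check: the pieces reassemble into the original text.
    assert "".join(tokens) == text
    counts[text] = (len(text.split()), len(tokens))
    print(f"{text!r}: {counts[text][0]} words, {counts[text][1]} tokens")
```

Both phrases are three words long, yet in this sketch one is 3 tokens and the other is 7. A model that works in tokens has no fixed exchange rate back to words.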

See it yourself

Paste any sentence into the OpenAI tokenizer and watch it break into tokens. Try your own name, "Agentforce," and a long word, and notice where the splits land. This is exactly what the model sees.

Embeddings: turning meaning into numbers

A token is still text. Computers do math, not text. So the next step is turning each token into numbers, and not just any numbers: an embedding is a list of numbers that captures a token's meaning as a position in space.

Picture a giant map where every word has coordinates, and words with similar meanings sit near each other. "King" and "queen" land in the same neighborhood. "Dog" and "cat" cluster together, far from "spreadsheet." The model learns these positions from data, the same patterns-from-examples idea as everywhere else.
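Here is the map idea as a tiny sketch. The coordinates are hand-picked assumptions on a flat two-dimensional map; real embeddings are learned from data and typically have hundreds or thousands of dimensions.

```python
import math

# Hypothetical 2-D coordinates, hand-picked for illustration.
positions = {
    "dog": (2.0, 8.0),
    "cat": (2.5, 7.5),
    "spreadsheet": (9.0, 1.0),
}


def distance(a, b):
    """Straight-line distance between two words on the toy map."""
    return math.dist(positions[a], positions[b])


dog_cat = round(distance("dog", "cat"), 2)
dog_sheet = round(distance("dog", "spreadsheet"), 2)

print(dog_cat)    # 0.71 -> close neighbors
print(dog_sheet)  # 9.9  -> far apart
```

"Similar meaning" becomes "small distance," which is something a computer can calculate.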

The famous demonstration: in this meaning-space, you can do arithmetic on concepts. Take the coordinates for king, subtract man, add woman, and the closest word to where you land is queen. Relationships between ideas become literal math. That's not a party trick; the same geometry is what lets AI tell that two differently-worded sentences mean the same thing.
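The arithmetic can be sketched in a few lines. The two-number vectors below are assumptions chosen so the example works cleanly (think of the first number as "royalty" and the second as "masculinity"); with real learned embeddings the result lands near queen rather than exactly on it.

```python
import math

# Hypothetical 2-D embeddings: (royalty, masculinity). Hand-picked values.
vectors = {
    "king":   (0.9, 0.9),
    "queen":  (0.9, 0.1),
    "man":    (0.1, 0.9),
    "woman":  (0.1, 0.1),
    "prince": (0.7, 0.9),
    "dog":    (0.0, 0.5),
}

# king - man + woman, computed coordinate by coordinate.
target = tuple(
    k - m + w
    for k, m, w in zip(vectors["king"], vectors["man"], vectors["woman"])
)

# Look for the nearest word, skipping the three words we started from
# (leaving out the inputs is the usual convention for this demonstration).
candidates = [w for w in vectors if w not in ("king", "man", "woman")]
nearest = min(candidates, key=lambda w: math.dist(vectors[w], target))

print(nearest)  # queen
```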

Why this matters later

Embeddings are how AI searches by meaning instead of keywords. When you hear "grounding" or "RAG" in the Agentforce world, this is the engine underneath: your question gets turned into an embedding, and the system finds the records or documents whose embeddings are nearby in meaning, even if they don't share a single word. This page is the bridge from Foundations to that practical capability.
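A minimal sketch of search by meaning, assuming each document and the question have already been turned into embeddings. The three-number vectors and the document titles are invented for illustration; a real system would get its embeddings from a model and would not necessarily score matches this way, though cosine similarity (how closely two vectors point in the same direction) is a common choice.

```python
import math


def cosine_similarity(a, b):
    """Return how closely two vectors point the same way (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical document embeddings (made-up 3-D values).
documents = {
    "How to reset your password": (0.9, 0.1, 0.0),
    "Quarterly sales report":     (0.0, 0.2, 0.9),
    "Office holiday schedule":    (0.1, 0.9, 0.1),
}

# Hypothetical embedding for the question "I forgot my login".
# Note that it shares no words with the best-matching title.
query_embedding = (0.8, 0.2, 0.1)

best = max(
    documents,
    key=lambda title: cosine_similarity(query_embedding, documents[title]),
)

print(best)  # How to reset your password
```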

📝 Practice

Open the tokenizer and find a word that splits into a surprising number of tokens (long or unusual words are good candidates). Then ask your favorite AI assistant to "write exactly 50 words" and count what you get back. Seeing the word-count miss firsthand makes the token idea stick.
