The hardware behind AI

Why AI runs on GPUs, the shift to specialized chips like TPUs, and the difference between training and inference.

Now that we know the four parts of a computer, we can answer a question that confuses a lot of people: why is AI such a big deal for hardware companies? Why did NVIDIA — a company most people knew for gaming graphics cards — become one of the most valuable companies in the world? The answer is entirely about the parts we just met.

Why AI runs on GPUs, not CPUs

Remember the difference between the CPU (one fast worker, working in sequence) and the GPU (thousands of simpler workers, working in parallel).

It turns out that AI is, under the hood, an enormous amount of the same simple math repeated billions of times — mostly multiplying big grids of numbers together. That's exactly the kind of work a GPU was built for. Handing this job to a CPU would be like asking one brilliant accountant to add up a billion numbers alone; handing it to a GPU is like having a stadium full of people each adding up a small piece at the same time.
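To make "multiplying big grids of numbers" concrete, here is a minimal sketch in plain Python. The function names (cell, matmul) and the tiny 2x2 grids are made up for illustration; real AI systems use specialized libraries and far larger grids. The point to notice is that each output cell is calculated from the inputs alone, so no cell has to wait for any other.

```python
# Illustrative sketch only: multiplying two small grids (matrices) of numbers.

def cell(a, b, row, col):
    """Compute one output cell: one row of a times one column of b."""
    return sum(a[row][k] * b[k][col] for k in range(len(b)))

def matmul(a, b):
    """Multiply grid a (n x m) by grid b (m x p), one cell at a time."""
    rows, cols = len(a), len(b[0])
    # Each (row, col) cell depends only on the inputs, never on another
    # output cell, so in principle all of them could be computed at once.
    return [[cell(a, b, r, c) for c in range(cols)] for r in range(rows)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
result = matmul(a, b)
print(result)  # [[19, 22], [43, 50]]
```

This sketch computes the cells one after another, which is the "one accountant" approach. Because the cells are independent, a GPU can hand each one to a different worker and do them simultaneously, which is the "stadium" approach.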

This is a core reason AI exploded when it did. The hardware that makes it practical, massively parallel GPUs, already existed, and the companies that make those chips suddenly became the picks-and-shovels suppliers of an entire industry.

A seed for later

This is why chipmakers like NVIDIA became so valuable — the whole world suddenly needed parallel-math hardware. We'll dig into the economics of this (and why it's reshaping markets) in a later module. For now, just hold the connection: AI demand → chip demand.

The shift to specialized chips (TPUs)

GPUs were a happy accident — they were designed for graphics and turned out to be great for AI. The natural next step is to build chips specifically for AI math, with nothing wasted on anything else. Google's TPU (Tensor Processing Unit) is the best-known example, and most big players are now designing their own custom AI silicon.

The trend line is clear: general-purpose chips (CPUs) → repurposed parallel chips (GPUs) → purpose-built AI chips (TPUs and friends). Each step trades flexibility for raw speed at the one job that matters most right now.

Two different jobs: training vs. inference

Here's a distinction you'll hear constantly, and it's worth getting straight early. AI hardware does two very different jobs:

Training is building the model: feeding it enormous amounts of data so it learns patterns. For the largest models this is staggeringly expensive, with thousands of chips running for weeks or months. It happens once (or occasionally), in giant data centers.
Inference is using the model — you ask it something, it gives you an answer. This is much cheaper per use, but it happens billions of times a day, every time anyone interacts with an AI feature.

A useful analogy: training is writing the textbook (slow, costly, done once); inference is a student using the textbook to answer a question (fast, cheap, done constantly). Most of what you'll touch day-to-day — including Agentforce — is inference.
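The difference in effort can be sketched with a toy model in plain Python. Everything here is an assumption chosen for illustration: the "model" is a single number w, the function names (train, infer) are hypothetical, and the learning method is a basic gradient-style update. Real models have billions of such numbers, but the shape of the work is the same: training loops over examples many times, while inference is one quick calculation.

```python
# Toy sketch: a "model" that is just one learned number, w.

def train(examples, steps=1000, lr=0.01):
    """Training: repeatedly nudge w to shrink the error on known examples."""
    w = 0.0
    for _ in range(steps):           # many passes over the data
        for x, y in examples:
            error = w * x - y        # how far off the current guess is
            w -= lr * error * x      # small correction toward the answer
    return w

def infer(w, x):
    """Inference: one cheap calculation using the already-trained w."""
    return w * x

examples = [(1, 2), (2, 4), (3, 6)]  # hidden pattern: y is twice x
w = train(examples)                  # slow part, done once
print(round(w, 3))
print(round(infer(w, 10), 2))        # fast part, done on every request
```

In this sketch, training performs 3,000 corrections (1,000 passes over 3 examples) to settle on a value of w close to 2, while each call to infer is a single multiplication. That imbalance, heavy work once and light work constantly, is the textbook analogy in miniature.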

📝 Practice / Homework

In one sentence each, write your own definitions of training and inference — no peeking. Then think of one everyday product you use that's doing inference every time you use it (autocomplete, photo search, a recommendation feed all count). Bring your example.
