Inference

The step where a trained AI model takes your input and produces an output, for example right after you hit enter on a prompt.

What it means

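An AI model has two phases of life: training (learning) and inference (doing). During training, the model's parameters are adjusted to fit example data. Inference is the act of using the already-trained model to make a prediction or generate text; the learned parameters are typically left unchanged during this phase. Every time you send a prompt to ChatGPT, you are running inference.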
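
To make the split between the two phases concrete, here is a minimal sketch in Python. It is only an illustration, not how ChatGPT or any real system is built: the model is a one-variable line fit, and the names train, infer, and params are hypothetical ones chosen for this example.

```python
def train(xs, ys, steps=2000, lr=0.05):
    """Training: adjust parameters w and b to fit the known examples."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Learning step: the parameters change.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(params, x):
    """Inference: apply the fixed, already-learned parameters to a new input."""
    w, b = params
    return w * x + b


# Training happens once, on known examples (toy data following y = 2x + 1).
params = train([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])

# Inference happens every time a new input arrives; params are not modified.
prediction = infer(params, 10.0)
print(prediction)
```

In this sketch the expensive loop lives in train and runs once, while infer is a single cheap calculation that runs once per request. With the toy data above, the prediction for an input of 10 should come out close to 21.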

Why it matters

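Training is largely paid for once per model, but inference is paid for on every request, so its cost grows with usage. For companies running widely used AI products, inference is often a major ongoing expense. Optimizing inference, making each request cheaper and faster, is the battleground for making AI profitable and accessible.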