What it means
An AI model's life has two phases: training (learning from data) and inference (applying what it has learned). Inference is the act of using a trained model to make a prediction or generate text. Every time you send a prompt to ChatGPT, the model runs inference to produce the reply.
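To make the split concrete, here is a minimal sketch in plain Python, assuming a toy one-variable linear model rather than anything resembling a real language model. The names `train` and `infer`, the learning rate, and the sample data are all illustrative assumptions. Training loops over the data many times to adjust the parameters; inference applies the finished parameters to a new input in a single cheap pass.

```python
def train(xs, ys, lr=0.05, epochs=2000):
    """Training: repeatedly adjust w and b to reduce mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def infer(params, x):
    """Inference: apply the frozen parameters to a new input. Nothing is learned here."""
    w, b = params
    return w * x + b


# Hypothetical training data following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

params = train(xs, ys)            # done once, many passes over the data
prediction = infer(params, 10.0)  # done per request, one pass
print(prediction)                 # a value close to 21
```

Note the asymmetry: `train` runs thousands of update steps once, while `infer` is a single evaluation that is repeated for every new request.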
Why it matters
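Training is largely a one-time cost per model, but inference cost recurs with every request, so for a widely used product it can become a major ongoing expense. Making inference cheaper and faster is therefore a key battleground for making AI profitable and accessible.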
