AI-PEDIA

Multimodal AI

AI that can understand and generate multiple types of media, like text, images, and audio.

What it means

Multimodal AI isn't limited to reading text. It can take in several types of input ('modes'), such as images, audio, and documents, and combine them in a single response, which is closer to how a person takes in the world.

Why it matters

This enables much richer interactions. You can show an AI a picture of a broken shelf and ask it how to fix it, or have a fluid voice conversation that feels natural because the AI 'hears' your tone.

Keep reading

A few adjacent definitions to lock in the concept.

Artificial General Intelligence (AGI)

A hypothetical AI that can learn and solve any intellectual task a human can.

Black Box AI

AI systems whose internal decision-making process is opaque to humans.

Foundation Model

A large-scale AI model trained on vast data that serves as a base for many applications.
