AI Media Glossary: The Key Terms Behind AI Images and Deepfakes

Updated June 29, 2026

Plenty of technical terms swirl around AI-generated images and videos, showing up in news reports, platform notices and discussions without ever being explained. Anyone who wants to understand how such content is made and how to spot it soon runs into terms like diffusion model, watermark or provenance. This glossary sorts through the key terms and explains them in plain language.

We focus on what is practically relevant to you as a consumer: how AI media is made, how its origin can be labeled, and what detection methods can and cannot do. One important note up front: none of these technologies delivers a hundred percent reliable answer to whether an image is authentic or AI-generated. This glossary helps you place the terms in context and read claims more critically.

Generative AI and Deepfake

Generative AI is the umbrella term for systems that create new content, meaning images, videos, text or audio, rather than just sorting or classifying what already exists. It learns patterns from large amounts of example data and combines them into new outputs. Well-known applications include image generators and language models.

A deepfake is one specific application of this: an image or video that has been manipulated or fully generated with AI so that a real person appears to do or say something they never did. The term combines deep learning and fake. Not every piece of AI content is a deepfake, but every deepfake is AI-driven.

How AI Creates Images: Diffusion Model and GAN

A diffusion model is the technique behind most of today's image generators. In simple terms, the model learns to turn random noise into a meaningful image step by step, guided by your text input. These models often deliver highly photorealistic results and are the current standard.
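For readers who want to see the step-by-step idea in code, here is a deliberately tiny toy in Python. It is not a diffusion model: a real generator uses a trained neural network to estimate the clean image from the noisy one and from your prompt. In this sketch that network is replaced by a stand-in that already knows the target, and the names toy_generate, denoise_step and distance are made up for illustration. It only shows the loop of starting from noise and refining a little at each step.

```python
import random


def denoise_step(current, predicted_clean, strength):
    """Move each pixel a fraction of the way toward the predicted clean image."""
    return [c + strength * (p - c) for c, p in zip(current, predicted_clean)]


def toy_generate(target, steps=20, seed=0):
    """Start from random noise and refine it step by step (toy illustration only)."""
    rng = random.Random(seed)
    image = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise as the starting point
    history = [image]
    for _ in range(steps):
        # Assumption: a real model would PREDICT this with a trained network,
        # guided by the prompt. Here we simply hand it the answer.
        predicted_clean = target
        image = denoise_step(image, predicted_clean, strength=0.3)
        history.append(image)
    return history


def distance(a, b):
    """Average absolute difference between two pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.5]  # a four-pixel "image"
    history = toy_generate(target)
    for step in (0, 5, 10, 20):
        print(step, round(distance(history[step], target), 4))
```

Printed step by step, the distance to the target shrinks with every pass, which is the intuition behind "working an image out of noise".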

A GAN, or generative adversarial network, is an older approach in which two neural networks work against each other: one creates images, the other tries to tell fakes apart from real images. This race makes the results ever more convincing. GANs were used for face generation for a long time and are the root of many early deepfakes.

What You Put In and What Comes Out: Prompt, Text-to-Image, Text-to-Video

A prompt is the input you use to tell an AI model what to create. It is usually text, but it can also be an image or a combination. How precise and detailed your prompt is strongly influences the result.

Text-to-image refers to generating an image from a text description, and text-to-video accordingly means generating a video clip. Text-to-video is technically far more demanding, because many individual frames have to stay consistent over time. This is exactly where errors often still show up, such as flickering details or objects that jump around.

Targeted Manipulation: Face-Swap, Inpainting and Outpainting

In a face-swap, one face in an image or video is replaced with another. This is a common technique in deepfakes of celebrities and private individuals, and it can be abused for fraud, bullying or disinformation.

Inpainting means that a selected part of an image is refilled or altered by AI, for example to remove or insert an object. Outpainting extends an image beyond its original edges by having the AI add matching image areas. Both methods make it possible to manipulate only part of a real photo, which makes detection harder.

Limits of the Technology: Hallucination, Cheapfake and Shallowfake

Hallucination describes the case where an AI system produces content that looks plausible but is factually wrong or entirely made up. In images this often shows up as impossible details like deformed hands, illegible text or objects that make no physical sense. Such oddities can be an indication, but they are not solid proof.

A cheapfake or shallowfake needs no elaborate AI at all. Simple means are enough here: a video is slowed down or sped up, taken out of context, mislabeled or crudely cut. Such fakes are technically simple, yet they spread fast and often do just as much damage as real deepfakes.

Origin and Labeling: C2PA, SynthID, Watermark, Provenance

Provenance, or proof of origin, means there is a traceable record of where an image comes from and how it was created or edited. C2PA is an open industry standard for this: it stores so-called content credentials, meaning signed information about the origin, directly in the file. If the file is manipulated, this signature can become invalid.
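Why a signature breaks when a file is changed can be shown with a simplified Python sketch. This is not the C2PA format: real content credentials use certificate-based public-key signatures and a structured manifest inside the file. Here a shared demo key and an HMAC stand in for that, and SIGNING_KEY, sign_content and verify_content are hypothetical names chosen for this example.

```python
import hashlib
import hmac

# Assumption: a shared demo key. Real content credentials rely on
# certificates and public-key signatures instead.
SIGNING_KEY = b"demo-key-not-a-real-credential"


def sign_content(image_bytes: bytes, origin: str) -> dict:
    """Bind an origin statement to the exact bytes of the file."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    claim = f"{origin}|{digest}".encode("utf-8")
    signature = hmac.new(SIGNING_KEY, claim, hashlib.sha256).hexdigest()
    return {"origin": origin, "content_hash": digest, "signature": signature}


def verify_content(image_bytes: bytes, credential: dict) -> bool:
    """Recompute the signature from the current bytes and compare."""
    digest = hashlib.sha256(image_bytes).hexdigest()
    claim = f"{credential['origin']}|{digest}".encode("utf-8")
    expected = hmac.new(SIGNING_KEY, claim, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["signature"])


if __name__ == "__main__":
    original = b"example image bytes"
    credential = sign_content(original, "Example Camera")
    print(verify_content(original, credential))            # unchanged file
    print(verify_content(original + b"!", credential))     # edited file
```

Because the signature covers a hash of the file, even a one-byte change makes the check fail. Note that a failed or missing check only tells you the credential no longer matches, not what was changed or why.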

SynthID is a digital watermark developed by Google that embeds an invisible pattern into AI-generated content, which special software can read back out. Such watermarks are meant to mark the origin without visibly changing the image. Important to know: origin data and watermarks can be missing, be removed or get lost in screenshots. Their presence is an indication, but their absence is no proof of authenticity.
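How an invisible watermark can be embedded, read back and then lost is easy to show with a classic textbook technique. The Python sketch below is not SynthID, which works differently and is designed to be more robust. It hides bits in the least significant bit of each pixel value, and the function names and the crude simulate_lossy_copy step are assumptions made for this illustration.

```python
def embed_watermark(pixels, bits):
    """Write one watermark bit into the lowest bit of each pixel value (0-255)."""
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the lowest bit, then set it
    return marked


def read_watermark(pixels, length):
    """Read the lowest bit of the first `length` pixels."""
    return [p & 1 for p in pixels[:length]]


def simulate_lossy_copy(pixels, step=4):
    """Crude stand-in for a screenshot or recompression: round down to coarser levels."""
    return [(p // step) * step for p in pixels]


if __name__ == "__main__":
    pixels = [120, 121, 200, 37, 64, 255, 18, 90]
    bits = [1, 0, 1, 1, 0, 0, 1, 0]

    marked = embed_watermark(pixels, bits)
    print(read_watermark(marked, len(bits)) == bits)   # watermark readable

    copied = simulate_lossy_copy(marked)
    print(read_watermark(copied, len(bits)) == bits)   # watermark after lossy copy
```

Each pixel changes by at most one brightness level, which the eye cannot see, yet a single lossy copy wipes the pattern out. That is the reason a missing watermark proves nothing about authenticity.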

Traces in the Material: Image Metadata, EXIF, Media Forensics and Upscaling

Image metadata is additional information that can be stored in an image file, such as the time of capture, camera model or location. EXIF is the most widespread format for this in photos. This data can be read out, but it can also be removed or faked easily, and many platforms delete it automatically on upload.
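How little effort it takes to remove EXIF data can be sketched in a few lines of Python. In a JPEG file, EXIF normally sits in its own segment (marker APP1, starting with the bytes "Exif") ahead of the image data, so it can be cut out without touching the picture. The sketch below is simplified and assumes a well-formed file with at most one such segment; make_segment exists only to build a tiny synthetic sample, and all function names are hypothetical.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"


def make_segment(marker: int, payload: bytes) -> bytes:
    """Build a JPEG header segment (used here only to create a test sample)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


def find_exif_segment(jpeg: bytes):
    """Return (start, end) byte offsets of the EXIF APP1 segment, or None."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        end = pos + 2 + length  # length counts its own two bytes, not the marker
        if marker == 0xE1 and jpeg[pos + 4:end].startswith(EXIF_HEADER):
            return pos, end
        pos = end
    return None


def strip_exif(jpeg: bytes) -> bytes:
    """Return the file with the EXIF segment cut out; image data is untouched."""
    found = find_exif_segment(jpeg)
    if found is None:
        return jpeg
    start, end = found
    return jpeg[:start] + jpeg[end:]


if __name__ == "__main__":
    sample = (
        b"\xff\xd8"
        + make_segment(0xE0, b"JFIF\x00")
        + make_segment(0xE1, EXIF_HEADER + b"camera=DemoCam")
        + b"\xff\xda" + b"image-data" + b"\xff\xd9"
    )
    print(find_exif_segment(sample) is not None)
    print(find_exif_segment(strip_exif(sample)) is None)
```

The cleaned file still opens as the same picture, only without the camera details. Many upload pipelines do something comparable automatically, which is why missing EXIF data is so common.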

Media forensics examines an image or video for technical traces of manipulation or AI generation, for example unusual noise patterns, compression artifacts or inconsistencies in light and shadow. Upscaling, in turn, enlarges or sharpens an image with AI and invents details that were not present in the original. Even a real photo can pick up artificial elements through upscaling, which makes forensic analyses even harder.

Frequently asked questions

Can you reliably tell from metadata or EXIF data whether an image is AI-generated?

No. Metadata and EXIF data can give hints about the camera, software or editing, but they can be faked or removed, and they are often lost entirely when an image is uploaded to a platform. Missing metadata does not automatically mean an image is fake, and existing metadata is no proof of authenticity. Metadata is only one building block among several.

What is the difference between a deepfake and a cheapfake?

A deepfake is created with AI, for example through a face-swap or a fully generated video. A cheapfake or shallowfake works without AI and uses simple tricks like slowed-down playback, false labeling or taking something out of context. Both can deceive, but the cheapfake is far simpler to produce technically.

Does a watermark like SynthID reliably protect against fakes?

A digital watermark like SynthID can mark that a piece of content comes from a particular AI, and it is a useful tool for labeling. However, not all AI tools carry such watermarks, and they can be lost or weakened through editing, screenshots or deliberate manipulation. A missing watermark therefore says nothing reliable about the origin.

How reliable is AI detection overall?

AI detection never delivers a hundred percent reliable answer. Generation and detection techniques keep evolving in parallel, and content that has been heavily edited or upscaled is especially hard to assess. The most reliable approach is to combine several signals: forensic analysis, proof of origin, the plausibility of the content, and the source that is spreading it.
