What exactly is a deepfake?
A deepfake is an image, video or audio file that has been altered or newly generated using machine learning methods. The goal is almost always to make a person appear to do or say something they never actually did or said. Unlike classic image editing, where someone retouches pixels by hand, here a program learns from many real examples how a face or a voice looks and sounds, and then assembles the result on its own.
Deepfakes come in several forms. In a face swap, one person's face is placed into an existing video of another person. In lip syncing, the face stays the same, but the mouth movements and speech are altered so that the person appears to say words they never spoke. In voice cloning, only the voice is reproduced, often from just a few seconds of audio. Finally, there are fully generated faces of people who do not exist at all.
Not every deepfake is created with bad intentions. The same technology is behind harmless filters, film dubbing or satire. It becomes a problem when a fake deliberately deceives without the people affected or the viewers knowing about it.
How deepfakes are created technically
Most deepfakes are based on neural networks, that is, programs that learn patterns from large amounts of example material. Put simply: anyone who wants to fake a face feeds the system many recordings of that face from different angles and under different lighting. From these, the network learns how the face should look in each situation and can then transfer it onto new footage.
One method that was widespread for a long time is the GAN, or generative adversarial network. Here, two programs work against each other: one produces fakes, the other tries to expose them as fakes. Through this constant contest, the generated images keep getting better. Newer systems often use diffusion models, which build up an image step by step from random noise, removing a little of the noise at each step until a sharp, realistic result emerges. This technique is also behind many well-known image generators.
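The step-by-step idea behind diffusion models can be illustrated with a deliberately tiny sketch. It is not a real diffusion model: a real system uses a trained neural network to estimate the noise, whereas the hypothetical function `denoise_step_by_step` below cheats with an oracle that already knows the clean target. It only shows how an image can emerge gradually from random noise.

```python
import random


def denoise_step_by_step(target, steps=50, seed=0):
    """Toy illustration: turn random noise into `target` in small steps.

    `target` is a flat list of pixel brightness values. Returns the final
    image and the largest remaining pixel error after each step.
    """
    rng = random.Random(seed)
    # Start from pure Gaussian noise, one value per pixel.
    image = [rng.gauss(0.0, 1.0) for _ in target]
    errors = []
    for step in range(steps):
        remaining = steps - step
        # ASSUMPTION: stand-in for the trained network. This oracle knows the
        # clean image, so it can say exactly which part of each pixel is noise.
        predicted_noise = [x - t for x, t in zip(image, target)]
        # Remove 1/remaining of the noise that is left, so that the final
        # step removes whatever noise still remains.
        image = [x - n / remaining for x, n in zip(image, predicted_noise)]
        errors.append(max(abs(x - t) for x, t in zip(image, target)))
    return image, errors


clean = [0.0, 0.25, 0.5, 0.75, 1.0]
result, errors = denoise_step_by_step(clean)
print(f"error after first step: {errors[0]:.3f}, after last step: {errors[-1]:.3f}")
```

The error shrinks a little with every pass, which is the "step by step" behaviour described above. Everything that makes a real generator convincing lives in the learned noise estimate that this sketch replaces with an oracle.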
For voices, the principle is similar. A model analyzes the pitch, speaking pace and timbre of a real voice and can then produce arbitrary sentences in that voice. What used to require hours of recorded material can today sometimes be done with very short audio snippets, for example from a public video or a voice message.
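To make "analyzing the pitch" a little more concrete, here is a minimal sketch of one classic way to measure the pitch of a short audio frame, using plain autocorrelation. It is not voice cloning and not how any particular product works; it only shows that a property such as pitch can be read out of raw audio samples. The function name `estimate_pitch_hz` and the search range of 50 to 500 Hz are assumptions made for this example.

```python
import math


def estimate_pitch_hz(samples, sample_rate, min_hz=50.0, max_hz=500.0):
    """Estimate the fundamental frequency of a short audio frame.

    The frame is compared with shifted copies of itself; the shift at which
    it matches best is taken as the length of one period.
    """
    # ASSUMPTION: 50-500 Hz is used here as a rough search range for speech.
    min_lag = max(1, int(sample_rate / max_hz))
    max_lag = min(int(sample_rate / min_hz), len(samples) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        return None  # silence, or no periodic structure found
    return sample_rate / best_lag


# A synthetic 200 Hz test tone, 0.1 seconds long, sampled at 8000 Hz.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
print(estimate_pitch_hz(tone, 8000))  # prints 200.0
```

A real voice model captures far more than a single number per frame, including how pitch, pace and timbre change over time.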
Spotting a deepfake: what to watch for in a video
There are a number of visible clues that can point to a fake. None of them is reliable proof on its own, but the more of them coincide, the more suspicious you should become. Watch the video at the largest size and best quality available, and pause at individual points instead of just letting it play through once.
The transitions at the edge of the face, the eyes and the mouth are often especially revealing. If these areas look unnatural or do not match the rest of the body, a second look is worthwhile. Lighting and skin tones are also worth checking, because a generated face is not always lit consistently with the neck and the surroundings.
- The blinking seems too rare, too frequent or oddly mechanical
- At the edge of the face, hairline or chin, there is slight flickering or shimmering
- The lip movement does not exactly match the spoken word
- Lighting and shadows on the skin and neck seem contradictory
- Teeth, hair or ears look blurry or unnatural
- The skin texture is strikingly smooth or changes from frame to frame
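The rule that clues count for more when they coincide can be written down as a small checklist helper. This is only a sketch: the short labels, the function name `assess_visual_clues` and the cut-off values are assumptions chosen for illustration, not established thresholds.

```python
# The six clues from the checklist above, keyed by short hypothetical labels.
VISUAL_CLUES = {
    "blinking": "Blinking seems too rare, too frequent or oddly mechanical",
    "edge_flicker": "Flickering or shimmering at the edge of the face",
    "lip_sync": "Lip movement does not exactly match the spoken word",
    "lighting": "Lighting and shadows on skin and neck seem contradictory",
    "blurry_details": "Teeth, hair or ears look blurry or unnatural",
    "skin_texture": "Skin is strikingly smooth or changes between frames",
}


def assess_visual_clues(observed):
    """Count the observed clues and map the count to a rough suspicion level."""
    observed = set(observed)  # the same clue noted twice counts only once
    unknown = observed - set(VISUAL_CLUES)
    if unknown:
        raise ValueError(f"unknown clue labels: {sorted(unknown)}")
    count = len(observed)
    # ASSUMPTION: the cut-offs below are arbitrary illustration values.
    if count == 0:
        level = "no visible clues (this does not prove the video is real)"
    elif count < 3:
        level = "some clues: take a second look"
    else:
        level = "several clues: treat as suspicious and verify the source"
    return count, level


print(assess_visual_clues(["lip_sync", "lighting", "edge_flicker"]))
```

The result is a prompt for further checking, not a verdict. A count of zero in particular says nothing about whether the video is genuine.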
Dangers and misuse of deepfakes
The greatest practical harm today comes from fraud. In so-called CEO fraud, perpetrators use a cloned voice or a fake video to pose as superiors and instruct employees to transfer money in a hurry. In the private version, a modern form of the grandparent scam, a familiar-sounding voice calls and claims to be in trouble and to urgently need money. Because the voice seems real, even cautious people fall for it.
In addition, deepfakes are used for defamation and harassment, for example by inserting faces into compromising or invented scenes. In the public sphere they serve disinformation, when statements are attributed to politicians who never made them. And in advertising, fake celebrity appearances are used to promote dubious products or investment scams.
What all these cases have in common is that they target trust. A familiar face or a familiar voice lowers our guard. That is exactly what the perpetrators exploit.
What you can do if you suspect a deepfake
If a video or a call makes you suspicious, the most important rule is: do not act under time pressure. Fakers deliberately create urgency so that you do not stop to think. Hang up, pause and check calmly.
For money demands by voice or video, the rule is: do not transfer anything and do not hand over any data before you have reached the person through a second channel that you know. Call back on a number you know to be genuine, or agree on a fixed code word within the family. For public content, it is worth checking whether reputable sources report the same thing and where the material originally came from.
- Stay calm and ignore artificially created time pressure
- Verify the person through an independent, known channel
- Trace the source and first publication of the material
- Report suspicious content on the respective platform
- In the case of fraud or a threat, file a report with the police
Why purely visual detection reaches its limits
The visual clues described above still often help today, but let us be honest: they are losing their value. In the past, unnatural blinking or rough edges along the outline of the face were reliable clues. Modern models have largely fixed exactly these weaknesses. What was a clear tell yesterday may be invisible tomorrow.
That is why it is risky to rely on your own eye alone. Content that looks clean and convincing is no proof that it is real. Conversely, a poorly compressed, jerky video can be genuine and still look like a fake. Gut feeling alone can mislead in both directions.
For this reason, technical checking is gaining importance. Special analysis methods examine an image or video for traces that are invisible to the human eye, for example typical patterns in the image noise or inconsistencies in the compression. However, even such methods never deliver a verdict with one hundred percent certainty. They give a well-founded assessment, not a final yes or no. The most reliable approach remains the combination: common sense, checking the source and, when in doubt, a technical analysis.
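One of the traces mentioned above, patterns in the image noise, can be sketched in a few lines. The example compares the fine-grained noise inside a marked region, for instance a face, with the noise in the rest of the frame. It is a simplified illustration under stated assumptions, not a description of how any real detection tool works. The function name `noise_variance_ratio`, the four-neighbour residual and the input format (a grayscale image as a list of rows) are all choices made for this sketch.

```python
from statistics import pvariance


def noise_variance_ratio(gray, box):
    """Compare fine-grained noise inside `box` with the rest of the image.

    `gray` is a list of rows of brightness values. `box` is
    (top, left, bottom, right) in pixel coordinates, bottom/right exclusive.
    Returns the inside/outside variance of a simple high-pass residual, or
    None if the comparison is not possible.
    """
    top, left, bottom, right = box
    inside, outside = [], []
    for y in range(1, len(gray) - 1):
        for x in range(1, len(gray[0]) - 1):
            # Residual = pixel minus the mean of its four direct neighbours.
            # Smooth image content largely cancels out; mostly noise remains.
            neighbours = (gray[y - 1][x] + gray[y + 1][x]
                          + gray[y][x - 1] + gray[y][x + 1])
            residual = gray[y][x] - neighbours / 4.0
            if top <= y < bottom and left <= x < right:
                inside.append(residual)
            else:
                outside.append(residual)
    if len(inside) < 2 or len(outside) < 2:
        return None
    outside_var = pvariance(outside)
    if outside_var == 0:
        return None  # perfectly flat surroundings, nothing to compare with
    return pvariance(inside) / outside_var
```

A ratio far from 1 means the marked region carries noticeably different noise than its surroundings. That is a reason to look closer, not a verdict, since ordinary effects such as blur or heavy compression can produce the same picture. It fits the point made above: such methods give an assessment, not a final yes or no.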