Deepfake Research
[Image: Video-editing software displaying facial-tracking markers on footage of Barack Obama, a technique common in deepfake creation. AP]
A deepfake is content that a machine has produced using deep learning and that contains false information. It pays to keep this definition, however vague, in front of us, because deepfakes distort reality and attempt to persuade us that something false is true.
On May 28, the wrestlers protesting peacefully in New Delhi were tackled to the ground, arrested, and put into a van to keep them from disrupting the inauguration of the new Parliament building. Shortly after, a photo appeared to show four of the beleaguered wrestlers posing with wide smiles for a selfie in the van.
If you had believed the photo to be real, you might also have believed that the
wrestlers had orchestrated a clash with the police and that they wanted to be
photographed while being ‘roughed up’. This is what the person who created the
photo may have intended. Though it emerged later that this photo had been
morphed, and was not a deepfake, creating such visuals has become child’s play.
Deepfaking is a significant ‘upgrade’ from photoshopping images as it transcends
the limits of human skill. Here, machines iteratively process large amounts of data
to falsify images and videos, sometimes in real time, and with fewer imperfections.
Deepfake images and videos thus have an unsettling legacy. People worldwide have
already used the technology to create a video of Barack Obama verbally abusing
Donald Trump, hack facial recognition software, manufacture ‘revenge porn’, etc.
On May 22, a deepfake image purporting to show a towering column of dark smoke
rising from the Pentagon received sober coverage from a few Indian television
news channels. The image was soon found to have been machine-made.
Then again, these tools have also been used for good. Using deep learning, the ALS Association in the U.S. launched a “voice cloning initiative” to restore the voices of those who had lost them to amyotrophic lateral sclerosis. Deep learning has also been adopted in comedy, cinema, music, and gaming. Experts have recreated the voices
and/or visuals of visual artist Andy Warhol, celebrity chef Anthony Bourdain, and
rapper Tupac Shakur, among others, enhancing our ability to understand, and even
reinterpret, history (although some of these attempts haven’t been free of
controversy).
Redeemable technology
Despite its potential to rupture the social fabric, deep learning is entirely redeemable, just like the kitchen knife or the nuclear reactor. The focus, as usual, must be on how we wield it. This is also the question that generative
artificial intelligence like ChatGPT has been forcing us to ask. The major technology
companies behind ChatGPT et al seem to have been driven by ‘can we do this?’
rather than ‘should we do this?’, although not without exceptions.
Clumsy though it may seem, deepfakes merit a similar response: laws that regulate their use, punish bad-faith actors, and keep the door open for democratic inputs to guide the future of such a powerful technology. A good starting point could be
what political philosopher Adrienne de Ruiter wrote in 2021, which is to protect
against the “manipulation of hyper-realistic digital representations of our image
and voice.” This, she said, “should be considered a fundamental moral right in the
age of deepfakes”. And a stepping stone for us, as individuals, is to become more
scientifically, digitally, and public-spiritedly literate. Then, we will be able to look
past an implausible photo and bring to light its concealed creator.
For now, China has responded the most strongly among all countries. It has banned deepfaked visuals whose creators don’t have permission to modify the original material and which aren’t watermarked accordingly. The success of this policy is no
doubt assured by the country’s existing surveillance network. Every measure short
of this requires at least an ampoule of self-restraint. And that is rooted in the kind
of people that we are.
ABHISHEK CHATTERJEE
What is a deepfake?
Deepfakes are synthetic images and audio put together with machine-learning algorithms to spread misinformation and replace a real person’s appearance, voice, or both with artificial likenesses. The technology can create people who do not exist, and it can fake real people saying and doing things they did not say or do.
The term deepfake originated in 2017, from an anonymous Reddit user who called himself “Deepfakes”. This user manipulated Google’s open-source deep-learning technology to create and post pornographic videos, doctored with a technique known as face-swapping: he replaced the faces of the people in the videos with celebrity faces. Deepfake technology is now being used for nefarious purposes like
scams and hoaxes, celebrity pornography, election manipulation, social
engineering, automated disinformation attacks, identity theft and financial fraud,
cybersecurity company Norton said in a blog.
Deepfake technology has been used to impersonate former U.S. Presidents Barack
Obama and Donald Trump, India’s Prime Minister Narendra Modi, Facebook chief
Mark Zuckerberg and Hollywood celebrity Tom Cruise. China’s new rule aims to combat the use of deepfakes for spreading disinformation.
The policy requires deep synthesis service providers and users to ensure that any
doctored content using the technology is explicitly labelled and can be traced back
to its source, the South China Morning Post reported. The regulation also mandates
people using the technology to edit someone’s image or voice, to notify and take
the consent of the person in question. When reposting news made by the
technology, the source can only be from the government-approved list of news
outlets. Deep synthesis service providers must also abide by local laws, respect
ethics, and maintain the “correct political direction and correct public opinion
orientation”, according to the new regulation.
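As an illustration only, here is a minimal Python sketch, not drawn from the regulation itself, of what such labelling and traceability could look like: it stamps a visible notice on a synthetic image and computes a provenance hash tying the file to a creator ID. The file names, label text, and hashing scheme are all assumptions.

```python
# Sketch: visible "AI-generated" label plus a provenance hash.
# Assumes the Pillow imaging library; paths and IDs are hypothetical.
import hashlib
from PIL import Image, ImageDraw

def label_synthetic_image(src_path: str, out_path: str, source_id: str) -> str:
    img = Image.open(src_path).convert("RGB")
    draw = ImageDraw.Draw(img)
    # Visible label, as the rule requires doctored content to be explicitly marked.
    draw.text((10, 10), "AI-GENERATED CONTENT", fill=(255, 0, 0))
    img.save(out_path)
    # Hash of the labelled file plus the creator ID gives a traceable record.
    with open(out_path, "rb") as f:
        return hashlib.sha256(f.read() + source_id.encode()).hexdigest()

record = label_synthetic_image("synthetic.png", "labelled.png", "provider-042")
```

A registry mapping such hashes to providers is one conceivable way content could be “traced back to its source”, though the regulation does not prescribe a mechanism.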
In the European Union, the Code of Practice on Disinformation was signed in October 2018 by online platforms Facebook,
Google, Twitter and Mozilla, as well as by advertisers and other players in the
advertising industry. Microsoft joined in May 2019, while TikTok signed the Code in
June 2020. However, the assessment of the Code revealed important gaps and
hence the Commission has issued a Guidance on updating and strengthening the
Code in order to bridge the gaps. The Code’s revision process was completed in
June 2022.
In July 2021, the U.S. introduced the bipartisan Deepfake Task Force Act to
assist the Department of Homeland Security (DHS) to counter deepfake technology.
The measure directs the DHS to conduct an annual study of deepfakes — assess
the technology used, track its uses by foreign and domestic entities, and come up
with available countermeasures to tackle the same.
Some U.S. states, such as California and Texas, have passed laws
that criminalise the publishing and distributing of deepfake videos that intend to
influence the outcome of an election. The law in Virginia imposes criminal penalties
on the distribution of nonconsensual deepfake pornography.
India, however, has no specific legal rules against using deepfake technology, but existing laws covering copyright violation, defamation, and cyber offences can be invoked against its misuse.
While Canada does not have any regulations to tackle deepfakes, it is in a unique
position to lead the initiative to counter deepfakes. Within Canada, some of the
most cutting-edge AI research is being conducted by the government with a
number of domestic and foreign actors. Furthermore, Canada is a member and
leader in many related multilateral initiatives like the Paris Call for Trust and
Security in Cyberspace, NATO Cooperative Cyber Defence Centre of Excellence and
the Global Partnership on Artificial Intelligence. It can use these forums to
coordinate with global and domestic actors to create deepfake policy in different
areas.
[Image caption: New menace: Having taken cognizance of the issue, almost all social media platforms have some policy for deepfakes. iStock]
What are the ways in which AI-manipulated digital media can impact the lives
of individuals as well as influence the public discourse? How is it employed by
various groups and how can society overcome the ‘infodemic’?
ASHISH JAIMAN
EXPLAINER
Disinformation and hoaxes have evolved from mere annoyance to warfare that can create social discord, increase polarisation, and in some cases, even influence election outcomes. Nation-state actors with geopolitical aspirations, ideological
believers, violent extremists, and economically motivated enterprises can
manipulate social media narratives with easy and unprecedented reach and scale.
The disinformation threat has a new tool in the form of deepfakes.
What are deepfakes?
Deepfakes are digital media (video, audio, and images) edited and manipulated using Artificial Intelligence; in essence, hyper-realistic digital falsification. Deepfakes are created to inflict harm on individuals and institutions. Access to commodity cloud computing, publicly available AI research algorithms, and an abundance of media data has created a perfect storm that has democratised the creation and manipulation of media. This synthetic media content is referred to as deepfakes.
The first case of malicious use of deepfakes was detected in pornography. According to Sensity.ai, 96% of deepfakes are pornographic videos, with over 135 million views on pornographic websites alone. Deepfake pornography exclusively targets women. Pornographic deepfakes can threaten, intimidate, and inflict psychological harm. They reduce women to sexual objects, causing emotional distress and, in some cases, financial loss and collateral consequences like job loss.
Deepfakes can depict a person indulging in antisocial behaviours and saying vile
things that they never did. Even if the victim could debunk the fake via alibi or
otherwise, that fix may come too late to remedy the initial harm.
Deepfakes can also cause short-term and long-term social harm and accelerate the
already declining trust in traditional media. Such erosion can contribute to a culture
of factual relativism, fraying the increasingly strained civil society fabric.
Social media platforms are taking cognizance of the deepfake issue, and almost all
of them have some policy or acceptable terms of use for deepfakes. We also need
easy-to-use and accessible technology solutions to detect deepfakes, authenticate
media, and amplify authoritative sources.
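As a toy illustration of the “authenticate media” idea, the Python sketch below signs a media file’s hash so that a platform can later check the file has not been altered. Real provenance systems use public-key signatures and dedicated standards; the shared secret and file paths here are assumptions.

```python
# Sketch: sign and verify a media file with an HMAC over its SHA-256 hash.
import hashlib
import hmac

SECRET = b"publisher-signing-key"  # hypothetical secret held by the publisher

def sign_media(path: str) -> str:
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).digest()
    return hmac.new(SECRET, digest, hashlib.sha256).hexdigest()

def verify_media(path: str, signature: str) -> bool:
    # Any edit to the file changes its hash, so the signature no longer matches.
    return hmac.compare_digest(sign_media(path), signature)
```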
What was the controversy surrounding the ‘voice cloning’ service provider,
ElevenLabs? What are the potential threats around artificial speech
synthesis? Can audio deepfakes be detected? What is the concern regarding
this technology and the creative industry?
ABHISHEK CHATTERJEE
On January 29, several users of the social media platform 4chan used ElevenLabs, a “speech synthesis” and “voice cloning” service provider, to make voice deepfakes of celebrities like Emma Watson, Joe Rogan, and Ben Shapiro. These deepfake audio clips made racist, abusive, and violent comments. Making deepfake
voices to impersonate others without their consent is a serious concern that could
have devastating consequences. In response to such use of their software,
ElevenLabs tweeted saying, “While we see our tech being overwhelmingly applied to
positive use, we also see an increasing number of voice cloning misuse cases.”
What are voice deepfakes?
A voice deepfake is one that closely mimics a real person’s voice. The voice can
accurately replicate tonality, accents, cadence, and other unique characteristics of
the target person. People use AI and robust computing power to generate such
voice clones or synthetic voices. Sometimes it can take weeks to produce such
voices, according to Speechify, a text-to-speech conversion app.
Creating deepfakes needs high-end computers with powerful graphics cards, often leveraging cloud computing power. Powerful hardware accelerates the rendering process, which can take hours, days, or even weeks, depending on the method. Besides specialised tools and software, generating deepfakes needs training data to be fed to AI models: often, original recordings of the target person’s voice. AI can use this data to render an authentic-sounding voice, which can then be made to say anything.
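To make the pipeline above concrete, here is a toy Python sketch, assuming the librosa audio library and hypothetical file names, of the first step: turning recordings of a target’s voice into numerical features that a cloning model could be trained on. Real systems use far richer representations than these averaged coefficients.

```python
# Sketch: extract crude per-clip "voice prints" from recordings of the target.
import numpy as np
import librosa

def voice_features(paths):
    feats = []
    for p in paths:
        audio, sr = librosa.load(p, sr=16000)                  # resample to 16 kHz
        mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=20)
        feats.append(mfcc.mean(axis=1))                        # average over time
    return np.stack(feats)

# A cloning model would be trained to map text to audio conditioned on such
# speaker features; the more clean recordings, the more convincing the clone.
embeddings = voice_features(["interview_01.wav", "interview_02.wav"])
```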
What are the threats arising from the use of voice deepfakes?
Attackers are using such technology to defraud users, steal their identities, and engage in various other illegal activities like phone scams and posting fake videos on social media platforms.
According to one of Speechify’s blog posts, back in 2020, a manager at a bank in the UAE received a phone call from someone he believed was a company director.
The manager recognised the voice and authorised a transfer of $35 million. The
manager had no idea that the company director’s voice was cloned.
Voice deepfakes used in filmmaking have also raised ethical concerns about the use
of the technology. Morgan Neville’s documentary film on the late legendary chef
Anthony Bourdain used voice-cloning software to make Bourdain say words he
never spoke. This sparked criticism.
Gathering clear recordings of people’s voices is getting easier and can be obtained
through recorders, online interviews, and press conferences. Voice capture
technology is also improving, making the data fed to AI models more accurate and
leading to more believable deepfake voices. This could lead to scarier situations,
Speechify highlighted in their blog.
Microsoft’s VALL-E, My Own Voice, Resemble, Descript, Respeecher, and iSpeech are some of the tools that can be used in voice cloning. Respeecher is the software used by Lucasfilm to recreate Luke Skywalker’s voice in The Mandalorian.
Call centres can also take steps to mitigate the threat from voice deepfakes,
according to voice recognition engineers at Pindrop. Callback functions can end
suspicious calls and request an outbound call to the account owner for direct
confirmation. Multifactor authentication (MFA) and anti-fraud solutions can also
reduce deepfake risks. Pindrop mentioned measures like analysing call metadata for ID verification, digital tone analysis, and key-press analysis for behavioural biometrics.
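A minimal sketch of how such a callback policy might be wired up is below. The metadata signals, weights, and threshold are illustrative assumptions, not Pindrop’s actual method.

```python
# Sketch: score a call on metadata signals; past a threshold, end it and
# verify through an outbound callback to the number on file.
def call_risk(metadata: dict) -> float:
    score = 0.0
    if metadata.get("caller_id") != metadata.get("account_phone"):
        score += 0.4   # call did not come from the number on file
    if metadata.get("carrier_mismatch"):
        score += 0.3   # metadata inconsistent with the claimed device
    if metadata.get("synthetic_tone_flag"):
        score += 0.3   # digital tone analysis flagged possibly synthetic audio
    return score

def handle_call(metadata: dict) -> str:
    if call_risk(metadata) >= 0.5:
        return "end call; place outbound callback to account owner"
    return "proceed, but still require multifactor authentication"

print(handle_call({"caller_id": "+15550100", "account_phone": "+15550199",
                   "synthetic_tone_flag": True}))
```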
NABEEL AHMED
VARUN KRISHAN
Several users on the microblogging platform X have pointed out instances where
the Gemini chatbot seemingly refused to generate images of white people, leading
to factually inaccurate results. Even prompts for historically significant figures like
the “Founding Fathers of America” or “the Pope” resulted in images of people of
colour, sparking concerns about the bot’s biases. Users also noted that the issue persisted with specific prompts: when asked for images of “a white family”, the chatbot said it was “unable to generate images that specified a certain ethnicity or race”, yet when asked for images of a black family, it readily produced them.
Google added the image-generating feature to the Gemini chatbot, formerly known
as Bard, about three weeks ago.
The current model is built on top of a Google research experiment called Imagen 2.
This is not the first time the Indian government has hit out at Google. Earlier this month, Rajeev Chandrasekhar, the Minister of State for Electronics and IT, citing a similar “error” by Gemini’s predecessor Bard, had said that the company’s claim that the model was “under trial” was not an acceptable excuse.
Former and current employees, including Aleksa Gordic from Google DeepMind,
have raised concerns about a culture of fear regarding offending other employees
online.
Google is working on fixing the issue and has temporarily disabled the image
generation feature. “While we do this, we’re going to pause the image generation of
people and will re-release an improved version soon,” Google stated as part of a
post on X.
Gemini is not the first AI chatbot to face backlash over the content it generates. Recently, Microsoft had to adjust its Designer tool after some users employed it to generate deepfake pornographic images of Taylor Swift and other celebrities.
OpenAI, for its part, has put a filter in its image-generation tool to block prompts that mention violent, sexual, or hateful language, as well as requests for images of prominent personalities.
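A naive version of such a filter might look like the Python sketch below. Production systems rely on trained classifiers rather than keyword lists, and the blocked terms and names here are placeholders.

```python
# Sketch: block prompts containing disallowed terms or prominent names.
BLOCKED_TERMS = {"violent", "sexual", "hateful"}   # placeholder categories
BLOCKED_NAMES = {"taylor swift"}                   # placeholder public figures

def allow_prompt(prompt: str) -> bool:
    text = prompt.lower()
    if any(term in text for term in BLOCKED_TERMS):
        return False
    return not any(name in text for name in BLOCKED_NAMES)

print(allow_prompt("portrait of a cat"))       # True
print(allow_prompt("image of Taylor Swift"))   # False
```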
Why has the govt. issued a directive on deepfakes?
How do deepfakes work and what is the technology behind them? What are
the implications of the increasing use of deepfakes by cybercriminals and
scammers?
NABEEL AHMED
Deepfakes have been around since 2017 and refer to videos, audios or images
created using a form of artificial intelligence called deep learning. The term became
popular when a Reddit contributor used publicly available AI-driven software to
impose the faces of celebrities onto the bodies of people in pornographic videos.
Fast forward to 2023: deepfake tech, with the help of AI tools, allows semi-skilled and unskilled individuals to create fake content with morphed audio-visual clips and images. Cyfirma, a cybersecurity company, said researchers have observed a 230% increase in deepfake usage by cybercriminals and scammers, and have predicted that the technology would replace phishing in a couple of years.
The technology involves modifying or creating images and videos using a machine learning technique called a generative adversarial network (GAN). The AI-driven software detects and learns the subjects’ movements and facial expressions from the source material and then duplicates these in another video or image. To ensure that the deepfake is as close to real as possible, creators use a large database of source images, which is why more deepfake videos are created of public figures, celebrities, and politicians. The dataset is used by one network, the generator, to create a fake video, while a second network, the discriminator, looks for signs of forgery in it. Through the adversarial interplay of the two networks, the fake video is refined until the discriminator can no longer detect the forgery. Because the models effectively teach themselves in this loop, the process is a form of “unsupervised learning” by machine-learning models. The method makes it difficult for other software to identify deepfakes.
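The generator-discriminator loop described above can be sketched in a few lines of PyTorch. This toy version works on flat vectors rather than video frames and shows only the adversarial training idea, not a production deepfake pipeline.

```python
# Sketch: a minimal GAN training loop (generator vs. discriminator).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.randn(32, 784)  # stand-in for a batch of real face images
for step in range(100):
    # Discriminator: learn to tell real samples from generated ones.
    fake = G(torch.randn(32, 64)).detach()
    d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator: learn to produce samples the discriminator accepts as real.
    fake = G(torch.randn(32, 64))
    g_loss = bce(D(fake), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
# Training continues until the discriminator can no longer spot the forgery.
```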
India’s IT Rules, 2021 require that all content reported to be fake or produced using deepfakes be taken down by intermediary platforms within 36 hours.
The Indian IT Ministry has also issued notices to social media platforms stating that impersonating someone online is illegal under Section 66D of the Information Technology Act of 2000. The IT Rules, 2021, also prohibit hosting any content that impersonates another person and require social media firms to take down artificially morphed images when alerted.
The EU has issued guidelines for the creation of an independent network of fact-
checkers to help analyse the sources and processes of content creation. The U.S.
has also introduced the bipartisan Deepfake Task Force Act to counter deepfake
technology.