
Proceedings of the Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC)

IEEE Xplore Part Number:CFP20OSV-ART; ISBN: 978-1-7281-5464-0

A Study on Combating Emerging Threat of Deepfake Weaponization

Rahul Katarya
Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India
[email protected]

Anushka Lal
Department of Computer Science and Engineering, Delhi Technological University, New Delhi, India
[email protected]

Abstract—A breakthrough in the emerging use of machine learning and deep learning is the concept of autoencoders and GANs (Generative Adversarial Networks), architectures that can generate believable synthetic content called deepfakes. The threat lies in these low-tech doctored images, videos, and audios blurring the line between fake and genuine content and being used as weapons to cause damage to an unprecedented degree. This paper presents a survey of the underlying technology of deepfakes and the methods proposed for their detection. Based on a detailed study of all the proposed detection models, this paper presents SSTNet, which uses spatial, temporal, and steganalysis features, as the best detection model to date. The threat posed by document and signature forgery, which is yet to be explored by researchers, is also highlighted. The paper concludes with a discussion of research directions in this field and of the development of more robust techniques to deal with the increasing threats surrounding deepfake technology.

Index Terms—Deep Learning, Generative Adversarial Networks, autoencoders, Deepfake detection, Fake image, Fake video

I. INTRODUCTION

Deepfake technology is a new automatic computer graphics tool that portrays entirely unrealistic events as real through digital media manipulation. Deepfake gained its name on the Reddit platform, where an anonymous user coined the term, a combination of "deep learning" and "fake", for replacing celebrities' faces in adult video clips. As soon as the code was made public, widespread interest in generating fake content spread among users. ObamaNet [1] was an architecture that featured an impressive use of lip-syncing technology to generate synchronized, photo-realistic lip-sync videos. The deepfake technique makes it possible to generate unauthentic videos of people expressing or saying things they have never said before [2].

Deepfakes make use of Artificial Intelligence (AI), machine learning, and deep learning concepts. AI deals with intelligence at the machine level. Machine learning is an evolving concept of computer science where machines can be trained to learn from provided data and accordingly take decisions on their own, just as humans do. Deep learning is a broader aspect of machine learning, where highly complex networks are trained to learn from a massive database of unstructured data.

Today there are various free deepfake applications: the Chinese app Zao [3], which lets users easily swap faces with movie stars so they can see themselves playing that role in the movie; DeepNude [4], which can create nonconsensual porn; and FakeApp, FaceSwap, and DeepFaceLab. The existence of such open-source software, and the availability of devices in the market for fabricating and propagating this falsified information, has brought to attention the immediate need for detection and elimination of malicious deepfake content. Deepfakes can act as a powerful weapon for insurgent groups and terrorist organizations, who may depict their adversaries using inflammatory words or engaging in provocative actions to maximize the galvanizing impact on their target audiences. For instance, a member of the Islamic State (ISIS) could falsely generate content that shows government officials or soldiers discussing bombing attacks at a mosque, to aid the group's recruitment [5]. States can use this weapon to undermine their non-state opponents. Deepfakes can affect the outcome of an election, and are hence a threat to democracy. Deepfakes are even weaponizing satellite images of Earth by showing objects in landscapes and locations that do not exist in reality, in order to play with the minds of military analysts and influence their decisions based on these fake images [6][7].

Amidst the threats posed by deepfakes in various sectors, the ability to generate realistic simulations can also have a whole new positive impact on humanity. It can create an array of opportunities in education, entertainment, and business. Historical figures can be made to communicate with students. In movies, face-swapping can be achieved for scenes that cannot be fulfilled by the actors alone. For example, in

978-1-7281-5464-0/20/$31.00 ©2020 IEEE 485

Authorized licensed use limited to: UNIVERSITY OF CONNECTICUT. Downloaded on May 18,2021 at 00:09:24 UTC from IEEE Xplore. Restrictions apply.

2016's Rogue One, the late Peter Cushing's appearance as Grand Moff Tarkin was made possible through similar technology [8]. Deepfakes can be used in business so that customers can see exactly how products they wish to buy would look on them, without trying them on in reality. In the medical world, this technology can play a great role in training doctors, nurses, and surgeons to operate on real-life scenarios in a virtual environment [9].

However, the potential of deepfakes to cause a broad spectrum of serious harm to society is a matter of greater concern. They can act as a new weapon for humiliation and destruction, identity theft and exploitation, defamation, and manipulation of legal evidence. Several methods have been proposed to detect deepfakes by focusing on minute details of the content such as facial texture, head poses, eye-blinking, skin color, lip movements, spatio-temporal features, and capsule forensics. Most of these rely on the same deep learning techniques that are used for the creation of deepfakes.

In the foreseeable future, deepfakes will continue evolving; thus, it is important to investigate their development and improve the methods of detection accordingly. The main objective of this paper is to present a survey of methods used for the creation and detection of deepfakes. Section II explains the popular underlying principles of deepfake architecture. Section III discusses and compares different proposed methods for deepfake detection. Section IV presents the contribution of this paper to the survey. The research opportunities in this direction are further highlighted in Section V.

II. DEEPFAKE CREATION

Deepfakes use Deep Neural Networks (DNNs). A DNN consists of a set of interconnected units called neurons. Together, these units perform some form of computational task and help solve complex problems. Two popular technologies associated with deepfake creation are the autoencoder-decoder model and the GAN architecture, discussed below.

A. Autoencoders

The autoencoder was the first technology to be used in deepfake creation. An autoencoder is used to recreate the images it is trained on. It works in three different phases: an encoder, a latent space, and a decoder [10]. The encoder first compresses the input pixels to a relatively smaller size by encoding special attributes like skin texture, skin color, facial expressions, open and closed eyes, head pose, and any minute details of the face. This compressed image is sent as input to the latent space, which is useful for understanding and learning patterns and structural similarities between the data points [11]. Lastly, the decoder decompresses this information to reconstruct an output based on its representation in latent space. The decoder tries to recreate an image that resembles the original as closely as possible.

Fig. 1 shows how an autoencoder can be used to swap two faces. Following the path indicated by the red arrows, Face B is reconstructed similar to Face A. The important aspect here is that both faces use the same encoder. This helps the encoder use general features that are common to both faces, so their positioning in latent space will also be similar. This allows the autoencoders to produce the same picture with the faces of the target and the original individual swapped. Here, the latent space of Face A is fed to the decoder of Face B, which reconstructs Face B similar to Face A. This technique has applications in various deepfake technologies like DFaker, DeepFaceLab, and TensorFlow-based deepfakes [12].

B. Generative Adversarial Networks

The majority of current deepfake technologies incorporate the use of GANs. The GAN architecture was first proposed by Ian Goodfellow in 2014 [13]. He introduced a framework that involves two neural networks that work by challenging each other: the first one generates new data while the other discriminates this new data from the original training data set. Contesting the two neural networks tends to improve both the quality of the fake data produced and the neural network's discrimination ability. If a large number of images are fed to a GAN, it can create a unique image on its own [14]. However, it is necessary to attach a filter that can help classify these unique outputs as acceptable or not. For this, GANs make use of a discriminative network that checks the generated data against true data. Both are trained to operate together until the discriminator falsely classifies the generated output as authentic almost 50% of the time. This lets us conclude that the generator model is successfully generating plausible examples. Fig. 1 shows the block diagram explaining the workflow of the GAN architecture.

The GAN architecture uses the min-max method for training the generator and discriminator [13]. The min (0) represents a fake output while the max (1) represents a genuine output. The goal of the discriminator is to get as close as possible to the max value, such that a realistic-looking deepfake is generated which can further be used for face-swapping in images and videos. GANs are more suitable for generating new data [15]. The main advantage of GANs over autoencoders is that they can be used for a wider range of tasks, for example to produce several classes of data, similar to the MNIST dataset [16]. Autoencoders, on the other hand, are more suitable for compressing data to lower dimensions or generating semantic vectors from it.
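The shared-encoder, two-decoder arrangement of Fig. 1 can be sketched in a few lines of code. The following is an illustrative toy only: linear "autoencoders" acting on random 16-dimensional vectors that stand in for face images, with all sizes, learning rates, and step counts being arbitrary assumptions rather than values from any system surveyed here; real deepfake pipelines use deep convolutional networks.

```python
import numpy as np

# Toy sketch of the shared-encoder / two-decoder face-swap scheme of Fig. 1.
# "Faces" are random 16-dimensional vectors; all sizes and hyperparameters
# are arbitrary toy assumptions, not values from any surveyed system.

rng = np.random.default_rng(0)
D, LATENT = 16, 4                      # "image" size and latent-space size

E = rng.normal(0, 0.1, (LATENT, D))    # shared encoder weights
Da = rng.normal(0, 0.1, (D, LATENT))   # decoder for identity A
Db = rng.normal(0, 0.1, (D, LATENT))   # decoder for identity B

faces_a = rng.normal(0, 1, (200, D))   # stand-ins for face images of A
faces_b = rng.normal(0, 1, (200, D))   # stand-ins for face images of B

def mse(x, y):
    """Mean-squared reconstruction error."""
    return float(np.mean((x - y) ** 2))

lr = 0.05
for step in range(500):
    # Each identity trains its own decoder, but both update the SAME encoder,
    # so the two identities come to share one latent space.
    for X, Dec in ((faces_a, Da), (faces_b, Db)):
        Z = X @ E.T                               # encode into shared latent space
        err = Z @ Dec.T - X                       # reconstruction error
        Dec -= lr * 2 * err.T @ Z / len(X)        # gradient step on the decoder
        E -= lr * 2 * (err @ Dec).T @ X / len(X)  # gradient step on the encoder

print("reconstruction MSE for A:", mse((faces_a @ E.T) @ Da.T, faces_a))

# The swap: encode a face of A with the shared encoder, then decode it with
# B's decoder (the red-arrow path in Fig. 1).
swapped = (faces_a @ E.T) @ Db.T
```

Because both identities share the encoder, their latent codes live in the same space, so a code obtained from a face of A can be fed to B's decoder, which is the swap path marked by the red arrows in Fig. 1.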

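The min-max game described above can likewise be illustrated with a deliberately minimal, hedged sketch: a one-dimensional linear "generator" and a logistic "discriminator" playing the adversarial game on scalar data. The distributions, learning rates, and step counts below are arbitrary toy choices, not the architecture of any system surveyed in this paper.

```python
import numpy as np

# Minimal sketch of the GAN min-max game: a linear "generator"
# g(z) = a*z + b turns unit-Gaussian noise into samples meant to resemble
# real data drawn from N(4, 1), while a logistic "discriminator"
# D(x) = sigmoid(w*x + c) learns to tell real from generated samples.
# All distributions and hyperparameters are arbitrary toy assumptions.

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

a, b = 1.0, 0.0          # generator parameters
w, c = 0.0, 0.0          # discriminator parameters
lr, batch = 0.02, 128

for step in range(4000):
    real = rng.normal(4.0, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = a * z + b

    # Discriminator ascent: push D(real) toward 1 (max) and D(fake) toward 0 (min).
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # Generator ascent on log D(g(z)): the non-saturating min-max move.
    d_fake = sigmoid(w * fake + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

z = rng.normal(0.0, 1.0, 10_000)
print("mean of generated samples:", round(float(np.mean(a * z + b)), 2))
print("mean D(generated):", round(float(np.mean(sigmoid(w * (a * z + b) + c))), 2))
```

As training alternates, the generated samples should drift toward the real distribution and the discriminator's output on them should hover near 0.5, mirroring the "fooled almost 50% of the time" stopping intuition described above.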

Fig. 1. The workflow of autoencoders and GANs in the creation of deepfakes

III. DEEPFAKE DETECTION

To date, there is much ongoing research in this field. According to data obtained from https://app.dimensions.ai, there were a total of 931 research publications in this field by the end of August 2020, and the number is likely to increase further in the coming years. Image, video, and audio are the three major types of deepfakes that are currently being dealt with globally.

Malicious use of deepfakes has become undeniably powerful. They have become a personal weapon of revenge, and are used to manipulate financial markets or to destabilize international relations. Several efforts have been made to guard against this deceptive and destructive potential of deepfakes. This section presents a survey of deepfake detection methods categorized under fake image, fake video, and fake audio detection.

A. Fake Image Detection

A new wave of face-swapping technology poses a significant threat to identity and can penetrate systems to gain illegitimate access. With many disturbing AI-generated images flooding various platforms, several detection measures have been proposed in this regard. Most of these fake images are synthesized using high-quality GANs that generate highly realistic content that is very difficult to detect. Fig. 2 shows an example of a deepfake image created using a deepfake application called FaceApp. Here, U.S. President Donald Trump's unhappy face (upper block) has been transformed into a smiling face (lower left) using only the smile tool in this deepfake application. Next, American singer Camila Cabello's face is swapped with that of Donald Trump (lower right), and both these images are challenging to detect as fake.

Fig. 2. Use of the deepfake application FaceApp to generate a fake image

Fake image detection has been treated as a binary classification problem where Convolutional Neural Network (CNN) models have mostly been used for detection [17, 18]. Later adoption of the advanced Xception CNN [19] further increased the accuracy of detection. Here they have


carried out the extraction of spatial and steganalysis features of the digital content using CNNs. Spatial features involve the identification of visible inconsistencies in the image, like blurs, facial texture, artificial smoothness, and contrast differences. Steganalysis helps analyze the hidden information in an image through low-level feature extraction. However, with many evolving extensions of GANs, new mechanisms are being adopted, so it is important to improve the generalization ability of detection tools. Instead of focusing on statistical details at the low-level pixel scale, Xuan et al. [20] proposed the use of Gaussian blur and Gaussian noise to force classifiers to learn more detailed and meaningful features from improved statistical similarity at the pixel level in images.

Likewise, in [21] a two-phase model is proposed that pairs fake and real images. These pairs are used to learn a discriminative common fake feature network (CFFN), and the discriminative CFF is used to identify the authenticity of the image. The authors highlighted that it is challenging to identify subjects excluded from the training phase using supervised learning, so they introduced a contrastive loss to learn through pairwise learning. The objective of this framework is to detect newly encountered fake images, including those generated by a new GAN. Their experimental results were demonstrated to outperform the precision and recall of other state-of-the-art methods.

B. Fake Video Detection

When videos are synthesized to swap faces or change facial expressions, the new images do not usually match the lighting conditions or head positioning. This is handled by geometrically transforming them: rotating, resizing, or otherwise distorting them. This process is guaranteed to leave behind digital artifacts in the resulting image that can make it look doctored, and so various algorithms have been trained to detect these artifacts to check the authenticity of the content. Li and Lyu [22] propose a method that exposes deepfake videos based on these face-warping artifacts. Instead of self-generating negative data from available datasets like several other models [23, 24], which can be a very time-consuming and expensive process, they use a simple image processing operation to create negative data. They compare generated face areas with their surroundings using a dedicated CNN model for detection.

Several deepfake algorithms are not able to mimic the normal eye blink rate of a person due to a lack of closed-eye training images. Wang et al. [25] used contour circle fitting to locate the presence of the pupil and thus detect blinks, achieving an accuracy of 96.6%. However, most of these studies were based on datasets created in a laboratory, so their results could not be generalized to real-world scenarios. Li et al. [26] recently proposed the use of a CNN along with a recurrent neural network (RNN) for detecting eye blinks, to differentiate between a fake and an authentic visual. This method outperformed the method proposed in [27], which used facial landmark detectors to identify the eye corners and eyelid contours for precise localization of the eyes and, with the use of two different models, namely an SVM (Support Vector Machine) and an HMM (Hidden Markov Model), detected the rate of eye blinks. This technique was further enhanced to differentiate between a complete and an incomplete eye blink. Fogelton and Benesova [28] designed a model that detects a complete blink, an incomplete blink, or no blink for every frame. This method was implemented on the Researcher's Night dataset, and it outperformed all other related work by almost 8%.

Stressing the fact that several intra-frame inconsistencies and temporal inconsistencies can occur across frames in deepfake video generation, Guera and Delp [23] proposed temporal feature analysis of videos to detect fake videos, with the use of a simple convolutional Long Short-Term Memory (LSTM) structure. Similar to this, an advanced framework known as SSTNet [29] uses a combination of CNN-based spatial feature extraction, steganalysis feature extraction, and temporal feature extraction by a single LSTM for fake video detection. This framework achieved good generalization on the GAN dataset and outperformed previous steganalysis methods when tested on the FaceForensics++ dataset [30]. Fig. 3 shows the working of the SSTNet model.

Fig. 3. The proposed framework of SSTNet [29]

C. Fake Audio Detection

Audio manipulation combines the AI techniques of deepfakes with general processing like speeding, slowing, cutting, or decontextualizing a video, commonly referred to as cheapfakes [31]. Resemblyzer is an open-source tool that detects deepfakes by extracting high-level representations of audio that enable developers to compare two voice samples at any given point in time [32]. Other fake audio detectors focus on the difference in spectrograms, the visual representation of audio, between real and fake audio [33]. Later, an improved CNN-based architecture called WaveNet [34] was introduced on various platforms like Text-To-Speech (TTS) and speech recognition. TTS and Voice Conversion (VC) are two types of speech synthesis software. TTS synthesizes human-like speech based on words or


phonemes, while VC systems can convert the voice of an utterance into another voice with the same content. DNN models have popularly been used to extract dynamic acoustic features of audiovisual content and label it as real or fake [35]. This methodology has been shown to outperform the static feature analysis of Gaussian Mixture Model (GMM) classifiers [36]. The most prominent deepfake detection methods are summarized in Table I.

TABLE I
SUMMARY OF PROMINENT DEEPFAKE DETECTION TECHNIQUES

CGFace Model [17]
  Type: Image. Dataset: real images from CelebA; fake images generated using PCGAN and BEGAN.
  Merits: automatic extraction of abstract features; use of an AdaBoost classifier over softmax.
  Demerits: cannot identify hidden features in images.

Pairwise learning [21]
  Type: Image. Dataset: real images from CelebA; fake images generated from DCGAN, WGAN, and PGGAN.
  Merits: a two-phase model that pairs fake and real images; a common fake feature network is trained to distinguish the features of fake and real images using pairwise learning.
  Demerits: may not detect new fake features generated by a new GAN.

Preprocessing [20]
  Type: Image. Dataset: real images from CelebA-HQ; fake images generated from DCGAN, WGAN-GP, and PGGAN.
  Merits: similar image-level preprocessing is applied to both real and fake images to destroy low-level noise cues, so the model can learn more intrinsic features for classification.
  Demerits: not a very large increase in performance compared to other similar approaches.

Face-warping artifacts [22]
  Type: Video. Dataset: UADFV, DeepfakeTIMIT.
  Merits: makes use of distinctive artifacts produced during the resolution transformation of images.
  Demerits: does not take into consideration temporal inconsistencies across frames.

Eye blinking [26]
  Type: Video. Dataset: real: 49 interview and presentation videos; fake: generated from the above-mentioned videos.
  Merits: highlighted that deepfake videos have inconsistencies in the human blinking rate compared to real videos.
  Demerits: only considers lack of blinking for detection, not the frequency of blinking.

SSTNet: spatial, temporal, and steganalysis [29]
  Type: Video/Image. Dataset: FaceForensics++, GAN-based deepfakes.
  Merits: extracts low-level, mid-level, and high-level artifacts of images in videos and also explores the temporal inconsistencies across successive frames.
  Demerits: does not take into account inconsistencies in the blinking rate.

Dynamic acoustic features [35]
  Type: Audio. Dataset: ASVspoof 2015 database.
  Merits: proposed an HLL scoring method that makes use of only human node outputs for spoof detection.
  Demerits: performance is yet to be investigated on replay attacks and on spoofed speech produced by WaveNet.

IV. CONTRIBUTIONS

After an in-depth review of several proposed detection methods for deepfakes, the SSTNet model proves to be the most convincing one due to its flexibility and its use of a generalized approach to detecting both fake images and fake videos. This model achieved an accuracy level of around 90% to 95%, which is much higher than other models trained on the same datasets. This method can be further improved by incorporating other detection cues, like the rate of complete eye blinks and investigation of lip-syncing evidence, on datasets released by giant tech companies like Google [37] and Facebook [38] for classifying fake videos.

Many researchers have failed to recognize that signature forgery and document forgery are also a matter of great concern. Signatures ensure a great deal of authenticity in sectors such as banking, insurance, healthcare, copyright, and governmental regulatory compliance. Forgery of any form could potentially lead to immensely disastrous consequences. Thus, it is important for researchers to invest equally in this sector as well.

V. CONCLUSION & FUTURE RESEARCH DIRECTIONS

As deepfake technology approaches generating fake content of considerably improved quality, it will likely soon become impossible to detect it. It is thus important to respond immediately to the emerging threat posed by deepfakes with great caution. To be able to distinguish between generated and authentic content, organizations can start developing encrypted digital stamps for authentic digital media.

Believability and accessibility have become the most significant drivers of deepfake technology. Stopping deepfakes from spreading across massive networks is a considerable challenge and requires social media platforms to step up. They need to develop tools and extensions that help with deepfake content moderation and detection, and that prevent mainstream media coverage of deepfakes. Also, confusing deepfake generators into producing more flawed output can help detect deepfakes easily. This can be achieved by adding special noise to digital photos uploaded on social media, such that they create a decoy suggesting there is a face where, in reality, there is none [39].

To combat deepfakes, just developing and deploying one or two successful tools is not enough. It will


require constant reinvention of these tools, as the technology is evolving at a much faster rate, and machine learning plays a crucial role in achieving this. Therefore, the research community should continue developing countermeasures using machine learning and deep learning to combat the weaponization of deepfakes.

REFERENCES
[1] R. Kumar, J. Sotelo, K. Kumar, A. de Brebisson, and Y. Bengio, "ObamaNet: Photo-realistic lip-sync from text," pp. 1–4, 2017. [Online]. Available: http://arxiv.org/abs/1801.01442
[2] D. Yadav and S. Salmani, "Deepfake: A survey on facial forgery technique using generative adversarial network," in Proc. 2019 Int. Conf. Intell. Comput. Control Syst. (ICCS), pp. 852–857, 2019, doi: 10.1109/ICCS45141.2019.9065881.
[3] "Chinese deepfake app Zao sparks privacy row after going viral," The Guardian. https://www.theguardian.com/technology/2019/sep/02/chinese-face-swap-app-zao-triggers-privacy-fears-viral (accessed Aug. 26, 2020).
[4] "AI deepfake app DeepNude transformed photos of women into nudes," Vox. https://www.vox.com/2019/6/27/18761639/ai-deepfake-deepnude-app-nude-women-porn (accessed Aug. 26, 2020).
[5] "Deepfakes and the New Disinformation War," Foreign Affairs. https://www.foreignaffairs.com/articles/world/2018-12-11/deepfakes-and-new-disinformation-war (accessed Aug. 30, 2020).
[6] "The Newest AI-Enabled Weapon: 'Deep-Faking' Photos of the Earth," Defense One. https://www.defenseone.com/technology/2019/03/next-phase-ai-deep-faking-whole-world-and-china-ahead/155944/ (accessed Aug. 24, 2020).
[7] "Deep fakes: AI-manipulated media will be 'WEAPONISED' to trick military," Express.co.uk. https://www.express.co.uk/news/science/1109783/deep-fakes-ai-artificial-intelligence-photos-video-weaponised-china (accessed Aug. 24, 2020).
[8] R. Chesney and D. K. Citron, "Deep Fakes: A Looming Challenge for Privacy, Democracy, and National Security," SSRN Electron. J., pp. 1753–1820, 2018, doi: 10.2139/ssrn.3213954.
[9] "Don't believe your eyes: Exploring the positives and negatives of deepfakes," AI News. https://artificialintelligence-news.com/2019/08/05/dont-believe-your-eyes-exploring-the-positives-and-negatives-of-deepfakes/ (accessed Aug. 25, 2020).
[10] J. Kietzmann, L. W. Lee, I. P. McCarthy, and T. C. Kietzmann, "Deepfakes: Trick or treat?," Bus. Horiz., vol. 63, no. 2, pp. 135–146, 2020, doi: 10.1016/j.bushor.2019.11.006.
[11] E. Tiu, "Understanding Latent Space in Machine Learning," Towards Data Science. https://towardsdatascience.com/understanding-latent-space-in-machine-learning-de5a7c687d8d (accessed Aug. 28, 2020).
[12] T. T. Nguyen, C. M. Nguyen, D. T. Nguyen, D. T. Nguyen, and S. Nahavandi, "Deep Learning for Deepfakes Creation and Detection: A Survey," pp. 1–12, 2019. [Online]. Available: http://arxiv.org/abs/1909.11573
[13] I. J. Goodfellow et al., "Generative adversarial nets," Adv. Neural Inf. Process. Syst., pp. 2672–2680, 2014.
[14] "Generative Adversarial Networks: The Tech Behind DeepFake and FaceApp," Interesting Engineering. https://interestingengineering.com/generative-adversarial-networks-the-tech-behind-deepfake-and-faceapp (accessed Aug. 27, 2020).
[15] M. Mirza and S. Osindero, "Conditional Generative Adversarial Nets," pp. 1–7, 2014. [Online]. Available: http://arxiv.org/abs/1411.1784
[16] "What is the difference between Generative Adversarial Networks and Autoencoders?," Quora. https://www.quora.com/What-is-the-difference-between-Generative-Adversarial-Networks-and-Autoencoders (accessed Aug. 29, 2020).
[17] L. M. Dang, S. I. Hassan, S. Im, J. Lee, S. Lee, and H. Moon, "Deep learning based computer generated face identification using convolutional neural network," Appl. Sci., vol. 8, no. 12, 2018, doi: 10.3390/app8122610.
[18] Y. Aslam and N. Santhi, "A Review of Deep Learning Approaches for Image Analysis," in Proc. 2nd Int. Conf. Smart Syst. Inven. Technol. (ICSSIT), pp. 709–714, 2019, doi: 10.1109/ICSSIT46314.2019.8987922.
[19] F. Chollet, "Xception: Deep learning with depthwise separable convolutions," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 1251–1258, 2017.
[20] X. Xuan, B. Peng, W. Wang, and J. Dong, "On the Generalization of GAN Image Forensics," Lect. Notes Comput. Sci., vol. 11818, pp. 134–141, 2019, doi: 10.1007/978-3-030-31456-9_15.
[21] C.-C. Hsu, Y.-X. Zhuang, and C.-Y. Lee, "Deep Fake Image Detection Based on Pairwise Learning," Appl. Sci., vol. 10, no. 1, 2020, doi: 10.3390/app10010370.
[22] Y. Li and S. Lyu, "Exposing DeepFake Videos By Detecting Face Warping Artifacts."
[23] D. Güera and E. J. Delp, "Deepfake Video Detection Using Recurrent Neural Networks."
[24] D. Afchar, V. Nozick, J. Yamagishi, and I. Echizen, "MesoNet: A Compact Facial Video Forgery Detection Network."
[25] M. Wang, L. Guo, and W. Chen, "Blink detection using Adaboost and contour circle for fatigue recognition," Comput. Electr. Eng., pp. 1–11, 2016, doi: 10.1016/j.compeleceng.2016.09.008.
[26] Y. Li, M.-C. Chang, and S. Lyu, "In Ictu Oculi: Exposing AI Generated Fake Face Videos by Detecting Eye Blinking."
[27] T. Soukupová and J. Čech, "Real-Time Eye Blink Detection Using Facial Landmarks," in Proc. 21st Comput. Vis. Winter Workshop, 2016.
[28] A. Fogelton and W. Benesova, "Eye blink completeness detection," Comput. Vis. Image Underst., vol. 176–177, pp. 78–85, 2018, doi: 10.1016/j.cviu.2018.09.006.
[29] X. Wu, Z. Xie, Y. Gao, and Y. Xiao, "SSTNet: Detecting Manipulated Faces Through Spatial, Steganalysis and Temporal Features," in Proc. IEEE Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2952–2956, 2020.
[30] A. Rössler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, and M. Nießner, "FaceForensics++: Learning to Detect Manipulated Facial Images."
[31] B. Paris and J. Donovan, "Deepfakes and Cheap Fakes," Data & Society, p. 47, 2019. [Online]. Available: https://datasociety.net/library/deepfakes-and-cheap-fakes/
[32] "Resemble AI launches voice synthesis platform and deepfake detection tool," VentureBeat. https://venturebeat.com/2019/12/17/resemble-ai-launches-voice-synthesis-platform-and-deepfake-detection-tool/ (accessed Aug. 30, 2020).
[33] "Detecting Audio Deepfakes With AI," Dessa News, Medium. https://medium.com/dessa-news/detecting-audio-deepfakes-f2edfd8e2b35 (accessed Aug. 30, 2020).
[34] "WaveNet: A generative model for raw audio," DeepMind. https://deepmind.com/blog/article/wavenet-generative-model-raw-audio (accessed Aug. 30, 2020).
[35] H. Yu, Z.-H. Tan, Z. Ma, R. Martin, and J. Guo, "Spoofing Detection in Automatic Speaker Verification Systems Using DNN Classifiers and Dynamic Acoustic Features," IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 10, pp. 4633–4644, 2018, doi: 10.1109/TNNLS.2017.2771947.
[36] D. Reynolds, "Gaussian Mixture Models," Encycl. Biometrics, pp. 659–663, 2009, doi: 10.1007/978-0-387-73003-5_196.
[37] "Google has released a giant database of deepfakes to help fight deepfakes," MIT Technology Review. https://www.technologyreview.com/2019/09/25/132884/google-has-released-a-giant-database-of-deepfakes-to-help-fight-deepfakes/ (accessed Aug. 30, 2020).
[38] "Deepfake Detection Challenge Dataset," Facebook AI. https://ai.facebook.com/datasets/dfdc/ (accessed Aug. 30, 2020).
[39] "Scientists Are Taking the Fight Against Deepfakes to Another Level," Discover Magazine. https://www.discovermagazine.com/technology/scientists-are-taking-the-fight-against-deepfakes-to-another-level (accessed Aug. 30, 2020).
