A Study on Deepfake Detection Methods Using Computer Vision Algorithms
ISSN NO-2584-2706
Abstract:
Deepfake technology has reached unprecedented scale and critical mass in society. Deepfake techniques enable the creation of virtually 'real' but fully fabricated audio-visual media, leaving significant and indelible footprints on the public. Advanced artificial intelligence techniques, above all deep learning, are used to create synthetic material capable of believable simulations of real people in video or audio content, often without their knowledge or consent. Deepfakes therefore represent an effective means of disinformation, with consequences for digital security, human rights, political discourse, information integrity, and trust. Recent progress in deepfake detection relies predominantly on machine learning models that generalize efficiently and reliably; the most effective models exploit both spatial and temporal features (for example, convolutional neural networks for learning generalizable spatial representations) to obtain dependable detection results. This paper presents an overview of the current technologies used to detect deepfake material. It describes the design of the architectures behind these systems and their relative effectiveness on salient problems such as generalization accuracy, robustness, and real-time deployment. In addition, we review the standard datasets for training and testing deepfake detection systems, highlighting their scope, limitations, and relevance to real-world applications. The paper's objectives are to provide a thorough review of recent progress, to flag critical gaps in current approaches, and to discuss possible future directions. These efforts seek to mitigate the threats posed by deepfake technologies and to promote the development of digital content authentication systems.

Keywords:
Deepfake, Computer Vision, Convolutional Neural Networks, Recurrent Neural Networks, Deep Learning.

1. Introduction
Deep learning has brought tremendous improvements to artificial intelligence, resulting in equally striking advances in synthetic media generation. One of the most notable developments is deepfake technology. Deepfakes are digitally manipulated or synthetically generated media (typically videos or audio recordings) that rely heavily on generative adversarial networks (GANs) to generate
These new techniques have the potential to push the quality and interactivity of deepfakes well beyond today's norms, allowing synthetic media to react dynamically to user inputs or to create realistic depth and lighting effects in virtual environments. As they progress, the need for adaptive, smart, and resilient detection systems will grow correspondingly, requiring constant updates to detection algorithms and to the underlying datasets in order to stay effective against new attack vectors.

2.2. Existing Deepfake Detection Methods
To counter the increasing sophistication of deepfake generation algorithms, researchers have proposed a broad range of detection methods that use different architectures and algorithmic approaches to identify tampered media with high accuracy.

The most prominent class of methods is based on Convolutional Neural Network (CNN) models, which excel at recognizing and processing spatial patterns in static images and video frames. These models examine media content for subtle discrepancies (irregular eye reflections, unnatural skin texture, inconsistent illumination, or morphological anomalies) that could reveal manipulation. XceptionNet, a CNN-based model, has performed well in numerous deepfake detection competitions by extracting deep feature representations that distinguish real from fake facial images [4]. CNNs' strong point is their capability to learn deep hierarchical feature patterns that generalize across datasets and deepfake generation techniques, providing a stable basis for assessing media authenticity.

Whereas CNNs are superior at spatial processing, Recurrent Neural Networks (RNNs), such as Long Short-Term Memory (LSTM) networks, are more effective at learning temporal dependencies: how patterns at the pixel or feature level change over video frames. These models are especially useful for identifying frame-level anomalies such as unnatural head motion, jittery facial expressions, or irregular blinking patterns that can be present in deepfake videos. By examining a series of frames, RNNs learn the temporal coherence of visual features and can flag sequences where the continuity of movement appears broken or artificial [5].

In practice, the integration of CNNs and RNNs into hybrid architectures has proven successful for jointly processing spatial and temporal information, yielding improved detection performance in multimedia scenarios where both forms of inconsistency may be present [6].

Beyond conventional deep learning approaches, newer research has begun to investigate transformer-based models such as Vision Transformers (ViTs), which offer a fundamentally different approach to deepfake detection. Unlike CNNs, which emphasize local spatial information, ViTs process an image as a sequence of patches and use self-attention mechanisms to capture global dependencies across the whole image. This enables ViT-based models to learn more holistic, long-range relationships in visual data, potentially increasing detection accuracy where deepfakes introduce subtle but globally distributed artifacts. Transformers also accommodate multi-modal inputs, making it possible to integrate facial expressions, head pose data, and audio features into more complete detection systems.
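As a concrete illustration of the CNN-plus-RNN hybrid idea discussed above, the following is a minimal PyTorch sketch: a small CNN encodes each frame spatially, an LSTM models how those features evolve over time, and a linear head scores the clip. The class name, layer sizes, and architecture are illustrative assumptions of this review, not the design of any published detector.

```python
import torch
import torch.nn as nn

class CnnLstmDetector(nn.Module):
    """Toy hybrid: a small CNN encodes each frame's spatial content,
    an LSTM models how those per-frame features evolve over time, and
    a linear head emits one real-vs-fake logit per clip. All sizes
    here are illustrative, not taken from any published model."""

    def __init__(self, feat_dim=64, hidden_dim=32):
        super().__init__()
        self.cnn = nn.Sequential(                     # per-frame spatial encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)          # clip-level logit

    def forward(self, clip):                          # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))          # (B*T, feat_dim)
        feats = feats.view(b, t, -1)                  # restore the time axis
        _, (h_n, _) = self.lstm(feats)                # h_n: (1, B, hidden_dim)
        return self.head(h_n[-1]).squeeze(-1)         # (B,) logits

model = CnnLstmDetector()
logits = model(torch.randn(2, 8, 3, 64, 64))          # 2 clips of 8 RGB frames
print(logits.shape)                                   # torch.Size([2])
```

In a real system the toy CNN would typically be replaced by a stronger pretrained backbone (e.g. an Xception-style network, as in the competition models cited above), with the LSTM consuming its per-frame embeddings.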
various real-world situations. The DFDC dataset also simulates a realistic test environment by incorporating videos that are post-processed with common methods such as resizing, re-encoding, and compression, factors that typically impair detection accuracy in deployment scenarios [8]. With its size and complexity, this dataset continues to be an essential resource for testing model scalability and real-world resilience.

Another contribution to the area is Celeb-DF, a collection that prioritizes realism through high-quality deepfake videos with natural lip sync, subtle facial expressions, and negligible visual artifacts. In contrast to previous collections, whose distortions were occasionally overt or overdone, Celeb-DF models subtler, more refined manipulations and thus poses a harder task for detection models. It covers deepfakes produced with advanced synthesis methods that minimize temporal flickering, inconsistent facial illumination, and edge deformations. Consequently, it allows researchers to probe the limits of current detection models and determine whether they can identify well-made, visually persuasive deepfakes [9].

In addition to these well-known public datasets, researchers have started curating domain-specific and adversarial datasets to address various attack vectors. These include deepfakes produced under adversarial training conditions, where the forgery is specifically crafted to evade detection, as well as forgeries that go beyond the visual channel to synthetic audio and multimodal manipulation, where both audio and video are modified simultaneously. Such specialized datasets are critical for learning to identify more general categories of deepfake content, especially where manipulations go beyond straightforward facial replacement and instead exploit deeper multimodal contradictions. Without this variety, models trained on a single class of manipulation can struggle when exposed to unknown or novel forgeries in actual use.

Acknowledging the constraints of relying only on naturally occurring data, some researchers have turned to synthetically created datasets that permit controlled experimentation. Such datasets can be generated with controllable parameters, including lighting conditions, head poses, facial expressions, and environmental backgrounds. These artificial datasets offer a two-fold benefit: they complement available real-world data to enhance generalization, and they allow models to be trained to remain robust to delicate artifacts under diverse conditions. For instance, varying lighting and face orientation helps prepare models to detect deepfakes captured under different environmental conditions, such as dim light or skewed angles [9]. Synthetic datasets can also be constructed to cover edge cases and difficult examples that are scarce in natural data, making trained detectors more robust.

In summary, datasets play an indispensable role in the study of deepfake detection. The ongoing creation and diversification of datasets, ranging from high-fidelity manipulations to low-resolution material, adversarial attacks, and synthetic augmentation, is a fundamental necessity for developing detection systems that perform robustly in dynamic, adversarial real-world conditions. As deepfake technology advances, so must the datasets on which detection systems are trained, to ensure that the tools remain adaptive, inclusive, and future-proof.
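The DFDC-style post-processing mentioned above (resizing, re-encoding, compression) can also be reproduced as a training-time augmentation, so a detector learns artifacts that survive real distribution channels. Below is a minimal sketch using Pillow; the parameter ranges are illustrative assumptions of this review, not values taken from the DFDC pipeline.

```python
import io
import random
from PIL import Image

def degrade(frame: Image.Image, rng: random.Random) -> Image.Image:
    """Apply DFDC-style degradations to one training frame:
    a random downscale-then-upscale (resolution loss) followed by
    JPEG re-encoding at a random quality (compression artifacts).
    Parameter ranges are illustrative, not from the DFDC spec."""
    w, h = frame.size
    # 1. resize down and back up to simulate resolution loss
    scale = rng.uniform(0.4, 0.9)
    small = frame.resize((max(1, int(w * scale)), max(1, int(h * scale))))
    frame = small.resize((w, h))
    # 2. JPEG round-trip in memory to simulate re-encoding/compression
    buf = io.BytesIO()
    frame.save(buf, format="JPEG", quality=rng.randint(30, 80))
    buf.seek(0)
    return Image.open(buf).convert("RGB")

rng = random.Random(0)
clean = Image.new("RGB", (128, 128), color=(120, 90, 200))
noisy = degrade(clean, rng)
print(noisy.size)   # (128, 128): same geometry, degraded content
```

Applying such degradations randomly per sample during training is one way to narrow the gap between clean benchmark footage and the compressed video encountered in deployment.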
4. Challenges in Deepfake Detection
In spite of the significant progress in deepfake detection techniques, many critical challenges still hinder the creation of foolproof, fully generalized detection systems. As deepfake creation becomes more advanced, separating manipulated content from real media becomes proportionally more difficult. These challenges are technical as well as systemic, encompassing model generalizability, robustness to adversarial interference, computational requirements, and practical constraints in real-world deployment environments.

One of the most stubborn problems in this area is generalization. Deepfake detection models tend to work well on the precise varieties of manipulated content present in the datasets they were trained on. Yet, when confronted with unknown deepfake variations, especially those produced by newer or less common synthesis methods, these models too often suffer a precipitous decline in performance. This absence of cross-dataset and cross-technique generalization implies that most present detectors depend excessively on dataset-specific artifacts rather than learning truly intrinsic indicators of manipulation [10]. Consequently, the real-world usefulness of most models remains constrained, especially as deepfake generation techniques grow increasingly diverse and refined.

The other critical concern is the susceptibility of detection models to adversarial attacks. Adversaries have begun exploring how to deliberately manipulate deepfake media in ways that trick even the most sophisticated detection algorithms. By slightly modifying pixel values or adding perturbations crafted to deceive machine learning classifiers, adversaries can make deepfakes that not only seem real to humans but also bypass automated detection tools. Such adversarial examples reinforce the demand for strong, resistant models that can withstand these tailored attempts to mislead [11]. Accordingly, researchers are investigating adversarial training and ensemble methods as viable countermeasures, though maintaining steady resistance against sophisticated adversarial techniques remains an ongoing and still unsolved problem.

In addition, the computational burden of running deep learning models for real-time deepfake detection is a major limiting factor for wide-scale use. High-accuracy detection models generally demand substantial processing power, memory, and energy: resources that can be in short supply on edge devices or in low-resource environments. This requirement makes real-time deployment less feasible in applications where response times must be fast, such as social media moderation, live video streaming, and video conferencing [12].

Optimizing model architectures for efficiency without compromising accuracy is a delicate and technically challenging process that remains a focus of continued research and development. These challenges collectively highlight that, for all the progress the field has made, deepfake detection is an ever-evolving and adversarial environment. Mitigating these limitations is imperative to the future success and dependability of any system designed to protect against digital disinformation and media manipulation.
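The pixel-level perturbations described in this section can be illustrated with a minimal one-step sketch in the spirit of the Fast Gradient Sign Method (FGSM): each pixel is nudged by at most a small epsilon in the direction that increases the detector's loss. The tiny linear "detector" below is a hypothetical stand-in for illustration, not a real detection model, and real attacks are usually iterative rather than single-step.

```python
import torch
import torch.nn as nn

def fgsm_perturb(model, frames, labels, eps=2 / 255):
    """One-step FGSM-style perturbation: move every pixel by at most
    eps in the direction that increases the detector's loss, yielding
    frames that look unchanged to a human but can sway the detector.
    Illustrative sketch only; eps and the attack form are assumptions."""
    frames = frames.clone().requires_grad_(True)
    loss = nn.functional.binary_cross_entropy_with_logits(model(frames), labels)
    loss.backward()                                  # gradient w.r.t. pixels
    adv = frames + eps * frames.grad.sign()          # signed-gradient step
    return adv.clamp(0.0, 1.0).detach()              # stay in valid pixel range

# Hypothetical stand-in detector: flattens a frame into a single logit.
detector = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 1))
x = torch.rand(4, 3, 8, 8)                           # four tiny "frames"
y = torch.ones(4, 1)                                 # all labeled fake
x_adv = fgsm_perturb(detector, x, y)
print(x_adv.shape)                                   # torch.Size([4, 3, 8, 8])
```

Adversarial training, one of the countermeasures noted above, amounts to mixing such perturbed samples back into the training batches so the detector learns to resist them.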
Researchers and technologists must therefore be vigilant and proactive, not just by optimizing current models but by adopting fresh paradigms in multimodal analysis, explainable AI, and privacy-preserving learning. The future of this discipline rides on the capacity to adapt rapidly to new threats, to scale solutions to practical applications, and to keep detection tools as dynamic and innovative as the generative processes they are meant to address. By meeting both the technical demands and the ethical calls, the next generation of deepfake research can provide significant protection against the abuse of synthetic media and thus maintain the authenticity and reliability of digital information in a future with ever more emphasis on AI.

References
[1] Grigoryan, A. M., & Agaian, S. S. (2015). Algorithms of the q2r × q2r-point 2-D Discrete Fourier Transform.
[2] He, K., Zhang, X., Ren, S., & Sun, J. (2015). Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.
[3] Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory.
[4] Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.
[5] Klein, G. (2015). Attention is All You Need: Transformers in Vision Tasks.
[6] Laptev, I. (2008). Learning Realistic Human Actions from Movies.
[7] Lucey, P., Cohn, J. F., & Kanade, T. (2009). The Extended Cohn-Kanade Dataset (CK+).
[8] Panahi, I., & Kehtarnavaz, N. (2018). Deep Learning-Based Real-Time Face Detection and Recognition.
[9] Pantic, M., & Rothkrantz, L. J. M. (2000). Automatic Analysis of Facial Expressions.
[10] Zhou, B., Khosla, A., Lapedriza, A., Torralba, A., & Oliva, A. (2017). Learning Deep Features for Discriminative Localization.
[11] Zhang, X., He, K., Ren, S., & Sun, J. (2017). ShuffleNet: An Extremely Efficient CNN for Mobile Devices.
[12] Redmon, J., & Farhadi, A. (2018). YOLOv3: An Incremental Improvement.
[13] Simonyan, K., & Zisserman, A. (2015). Very Deep Convolutional Networks for Large-Scale Image Recognition.
[14] Goodfellow, I., Pouget-Abadie, J., Mirza, M., et al. (2014). Generative Adversarial Networks.
[15] Deng, J., Dong, W., Socher, R., Li, L., Li, K., & Fei-Fei, L. (2009). ImageNet: A Large-Scale Hierarchical Image Database.