
mathematics

Review
Analysis of Colorectal and Gastric Cancer Classification:
A Mathematical Insight Utilizing Traditional Machine
Learning Classifiers
Hari Mohan Rai * and Joon Yoo *

School of Computing, Gachon University, Seongnam-si 13120, Republic of Korea


* Correspondence: [email protected] (H.M.R.); [email protected] (J.Y.)

Abstract: Cancer remains a formidable global health challenge, claiming millions of lives annually.
Timely and accurate cancer diagnosis is imperative. While numerous reviews have explored cancer
classification using machine learning and deep learning techniques, scant literature focuses on tra-
ditional ML methods. In this manuscript, we undertake a comprehensive review of colorectal and
gastric cancer detection specifically employing traditional ML classifiers. This review emphasizes the
mathematical underpinnings of cancer detection, encompassing preprocessing techniques, feature ex-
traction, machine learning classifiers, and performance assessment metrics. We provide mathematical
formulations for these key components. Our analysis is limited to peer-reviewed articles published
between 2017 and 2023, exclusively considering medical imaging datasets. Benchmark and publicly
available imaging datasets for colorectal and gastric cancers are presented. This review synthesizes
findings from 20 articles on colorectal cancer and 16 on gastric cancer, culminating in a total of
36 research articles. A significant focus is placed on mathematical formulations for commonly used
preprocessing techniques, features, ML classifiers, and assessment metrics. Crucially, we introduce
our optimized methodology for the detection of both colorectal and gastric cancers. Our performance
metrics analysis reveals remarkable results: 100% accuracy in both cancer types, but with the lowest
sensitivity recorded at 43.1% for gastric cancer.
Keywords: traditional machine learning; cancer detection; colorectal cancer; gastric cancer; mathematical formulation; preprocessing; feature extraction

MSC: 68T07

Citation: Rai, H.M.; Yoo, J. Analysis of Colorectal and Gastric Cancer Classification: A Mathematical Insight Utilizing Traditional Machine Learning Classifiers. Mathematics 2023, 11, 4937. https://doi.org/10.3390/math11244937

Academic Editors: Florin Leon, Mircea Hulea and Marius Gavrilescu

Received: 19 November 2023; Revised: 9 December 2023; Accepted: 11 December 2023; Published: 12 December 2023

Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Cancer, a longstanding enigma in human history, has experienced a notable upsurge in its prevalence in recent decades due to several contributing causes. These encompass the inexorable aging of populations, the adoption of detrimental lifestyles, and heightened exposure to carcinogens in the environment, food, and beverages [1,2]. The term “cancer” has its origins in the Greek word “karkinos” (καρκίνος), which carries a dual meaning, referring to both a neoplasm and a crustacean of the crab genus. This nomenclature was first introduced into the medical lexicon in the 17th century and signifies a condition characterized by the invasive spread of cells to different anatomical sites, potentially causing harm [3–5].
In the human anatomy, composed of innumerable cells, cancer can emerge in diverse locations, from the extremities to the brain. Cells typically divide and multiply to meet the body’s needs and undergo programmed cell death when necessary; deviations from this process can lead to the uncontrolled replication of damaged or abnormal cells, resulting in the formation of a neoplasm or tumor. These tumors can be categorized as benign (non-malignant) or malignant (cancerous), with the latter having the potential to travel to distant body parts from the original location, often affecting nearby tissues along the way. Notably, blood cancers, like leukemia, do not follow the typical pattern of solid tumor formation but rather tend to involve the proliferation of abnormal blood cells that circulate
within the body and may not form solid masses as seen in other types of cancer. Cancer
arises from genetic anomalies that disrupt the regulation of cellular proliferation. These
genetic anomalies compromise the natural control mechanisms that prevent excessive cell
proliferation. The body has inherent mechanisms designed to remove cells that possess
damaged DNA, but, in certain cases, these fail, allowing abnormal cells to thrive and
potentially develop into tumors, disrupting regular bodily functions; these defenses can
diminish with age or due to various factors [6].
Each instance of cancer exhibits a distinct genetic modification that evolves as the
tumor grows. Tumors often showcase a diversity of genetic mutations across various cells
existing within the same cluster. Genetic abnormalities primarily affect three types of genes:
DNA repair genes, proto-oncogenes, and tumor suppressor genes. Proto-oncogenes are
typically involved in healthy cell division and proliferation. The transformation of these
genes into oncogenes, brought on by specific alterations or increased activity, fuels uncon-
trolled cell growth and plays a role in cancer development. Meanwhile, tumor suppressor
genes meticulously manage cellular division while imposing restraints on unbridled and
unregulated cellular proliferation, and mutations in these genes disable their inhibitory
function, increasing the risk of cancer. DNA repair genes are significant in rectifying DNA
damage, and mutations in these genes can lead to the accumulation of further genetic
abnormalities, making cells more prone to developing cancer. Metastasis is the movement
of cancer cells from the initial site to other parts of the body. It includes cell detachment, local tissue
invasion, blood or lymph system entry, and growth in distant tissues [7,8]. Understanding
the genetic and cellular mechanisms underlying cancer development and metastasis is
crucial for improving diagnostics, developing effective treatments, and advancing cancer
research. Researchers can work toward better strategies for prevention, early detection,
and targeted therapies by unraveling the intricacies of cancer at the molecular level. The
early diagnosis of cancer developments across different body areas requires accurate and
automated computerized techniques. While numerous researchers have made significant
strides in cancer detection, there remains substantial scope for improvement in this field. In
this manuscript, we have scrutinized colorectal and gastric cancers employing conventional
ML techniques solely based on medical imaging datasets. Medical images offer finer and
more specific details compared to other medical data sources.

Literature Review
This section provides an evaluative comparison of the most recent review articles
available, analyzing current review articles dedicated to the utilization of machine learning
and deep learning classifiers for cancer detection across diverse types. The objective is to
summarize the positive aspects and limitations of these review articles, as per the review
presented, on various cancer types. The papers selected for analysis include those that cover
more than two cancer types, are peer-reviewed, and were published between 2019 and 2023.
This present study extends our prior works [9,10] by providing an extensive review that
now encompasses seven distinct cancer types. Levine et al. (2019) [9] focused on cutaneous,
mammary, pulmonary, and various other malignant conditions, emphasizing radiological
practices and diagnostic workflows. The study detailed the construction and deployment of
a convolutional neural network for medical image analysis. However, limitations included a
relative underemphasis on malignancy detection, sparse literature sources, and examination
of a limited set of performance parameters. Huang et al. (2020) [10] explored prostatic,
mammary, gastric, colorectal, solid, and non-solid malignancies. The study presented a
comparative analysis of artificial intelligence algorithms and human pathologists in terms
of prognostic and diagnostic performance across various cancer classifications. However,
limitations included a lack of literature for each malignancy category, the absence of
consideration for machine learning and deep learning classifiers, and a lack of an in-depth
literature review. Saba (2020) [11] examined mammary, encephalic, pulmonary, hepatic,
cutaneous, and leukemic cancers, offering concise explanations of benchmark datasets and
a comprehensive evaluation of diverse performance metrics. However, limitations included
a combined treatment of machine learning and deep learning without a separate analysis
and the absence of a comparative exploration between the two methodologies. Shah et al.
(2021) [12] proposed predictive systems for various cancer types, using a data, prediction
technique, and view (DPV) framework to assess cancer detection. The focus was on data
type, modality, and acquisition. However, the study included a
limited number of articles for each cancer type, lacked a performance evaluation, and only
considered deep learning-based methods.
Majumder and Sen (2021) [13] focused on the domains of mammary, pulmonary,
solid, and encephalic malignancies. The findings demonstrated artificial intelligence’s
application in oncopathology and translational oncology.
However, limitations included a limited consideration of cancer types and literature
sources, along with variations in performance metrics across different sources. Tufail et al.
(2021) [14] evaluated astrocytic, mammary, colorectal, ovarian, gastric, hepatic, thyroid,
and various other cancer types, emphasizing publicly accessible datasets, cancer detection,
and segmentation. However, the exclusive focus on deep learning-based cancer detection
limited a comprehensive examination of each cancer type. Kumar and Alqahtani (2022) [15]
examined mammary, encephalic, pulmonary, cutaneous, prostatic, and various other malig-
nancies, detailing diverse deep learning models and architectures based on image types.
However, limitations included the exclusive focus on deep learning methods and variations
in performance metrics across different literature sources. Kumar et al. (2022) [3] evaluated
various malignancies, offering comprehensive coverage across diverse cancer categories.
The study drew from numerous literature sources, presenting a wide array of performance
metrics and acknowledging challenges. However, limitations included the amalgamation
of all cancer types in a single analysis and the absence of a separate assessment of machine
learning and deep learning approaches. Painuli et al. (2022) [16] concentrated on mammary,
pulmonary, hepatic, cutaneous, encephalic, and pancreatic malignancies. The study exam-
ined benchmark datasets for these cancer types and provided an overview of the utilization
of machine learning and deep learning methodologies. The research identified the most
proficient classifiers based on accuracy but unified the examination of deep learning and
machine learning techniques instead of offering individual assessments.
Rai (2023) [17] conducted a comprehensive analysis of cancer detection and segmenta-
tion, utilizing both deep neural network (DNN) and conventional machine learning (CML)
methods, covering seven cancer types. The review separately scrutinized the strengths and
challenges of DNN and CML classifiers. Despite limitations, such as a limited number of
research articles and the absence of a database and feature extraction analysis, the study
provided valuable insights into cancer detection, laying the foundation for future research
directions. Maurya et al. (2023) [18] assessed encephalic, cervical, mammary, cutaneous,
and pulmonary cancers, providing a comprehensive analysis of the performance parame-
ters and inherent challenges. However, it lacked an independent assessment of machine
learning and deep learning techniques and a dataset description. Mokoatle et al. (2023) [19]
focused on pulmonary, mammary, prostatic, and colorectal cancers, proposing novel de-
tection methodologies utilizing SBERT and the SimCSE approach. However, limitations
included the study’s focus on four cancer types, the lack of a dataset analysis, and reliance
on a single assessment metric. Rai and Yoo (2023) [20] enhanced cancer diagnostics by
classifying four cancer types with conventional machine learning (CML) and deep neural
network (DNN) methods. The study reviewed 130 pieces of literature, outlined benchmark
datasets and features, and presented a comparative analysis of CML and DNN models.
Limitations included a focus on four cancer types and reliance on a single metric (accuracy)
for classifier validation.
This study offers an expansive and in-depth examination of the current landscape and
potential prospects for diagnosing colorectal and gastric cancers through the application of
traditional machine learning methodologies. The key contributions and highlights of this
review can be distilled into the following key points.
• Mathematical Formulations to Augment Cognizance: Inaugurating the realm of
mathematical formulations, meticulously addressing the most frequently utilized
preprocessing techniques, features, machine learning classifiers, and the intricate
domain of assessment metrics.
• Mathematical Deconstruction of ML Classifiers: Engaging in a profound exploration
of the mathematical intricacies underpinning machine learning classifiers commonly
harnessed in the arena of cancer detection.
• Colorectal and Gastric Cancer Detection: Dedicating an analytical focus to the nu-
anced landscape of colorectal and gastric cancer detection. Our scrutiny unfurled a
detailed examination of the methodologies and techniques germane to the diagnosis
and localization of these particular cancer types.
• Preprocessing Techniques and Their Formulation: Penetrating the intricate realm of
preprocessing techniques and probing their pivotal role in elevating the quality and
accuracy of models employed in cancer detection.
• Feature Extraction Strategies and Informative Features: Embarking on a compre-
hensive journey, scrutinizing the multifaceted domain of feature extraction tech-
niques, meticulously counting and discerning the number of features wielded in
research articles.
• A Multidimensional Metrics Analysis: Conducting a holistic examination encompassing
a spectrum of performance evaluation metrics, encapsulating accuracy, sensitivity,
specificity, precision, negative predictive value, F-measure (F1), area under the
curve, and the Matthews correlation coefficient (MCC).
• Evaluation Parameters for Research Articles: Systematically analyzing diverse pa-
rameters, including publication year, preprocessing techniques, features, techniques,
image count, modality nuances, dataset details, and integral metrics (%).
• Prominent Techniques and Their Effectiveness: Expertly identifying the techniques
most prevalently harnessed by researchers in the realm of cancer detection and metic-
ulously pinpointing the most effective among the gamut of options.
• Key Insights and Ongoing Challenges: Highlighting key insights from the scruti-
nized research papers, encompassing advances, groundbreaking revelations, and
challenges in cancer detection using traditional machine learning techniques.
• Architectural Design of Proposed Methodology: Laying out in meticulous detail an
architectural blueprint derived from the reviewed literature. These architectural for-
mulations present invaluable guides for the enhancement of cancer detection models.
• Recognizing Opportunities for Improvement: Executing a methodical comparative
analysis of an array of metrics, meticulously scrutinizing their highest and lowest values,
as well as the gap between them. This granular evaluation aids in the strategic
pinpointing of areas harboring untapped potential for enhancement in cancer detec-
tion practices.
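The assessment metrics enumerated above follow the standard definitions derived from a binary confusion matrix. As a quick illustration (our own sketch, not code drawn from any of the reviewed articles), they can be computed directly from the four confusion-matrix counts:

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)      # recall / true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    precision   = tp / (tp + fp)      # positive predictive value
    npv         = tn / (tn + fn)      # negative predictive value
    f1  = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "precision": precision,
            "npv": npv, "f1": f1, "mcc": mcc}

# Illustrative counts (hypothetical, not from any reviewed study)
m = classification_metrics(tp=80, fp=5, tn=90, fn=25)
print({k: round(v, 3) for k, v in m.items()})
```

Note that accuracy alone can mask poor sensitivity on imbalanced datasets, which is why the review tracks the full spectrum of metrics.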

2. Materials and Methods


2.1. Literature Selection Process
In this section, we will provide a broad overview of the procedures involved in
selecting and employing research articles for the purpose of cancer classification through
traditional ML approaches. These selection criteria encompass both inclusion and exclusion
standards, which we will delineate in depth. The PRISMA flow diagram delineates the
systematic review process employed for the detection of colorectal and stomach (gastric)
cancer utilizing conventional machine learning (CML) methodologies, as illustrated in
Figure 1. Commencing with an initial identification of 571 records through meticulous
database searching, the subsequent removal of 188 duplicates yielded 383 distinct records.
Through a rigorous screening process, 197 records were deemed ineligible, prompting a
detailed assessment of eligibility for 186 full-text articles. Within this subset, the exclusion
of 150 articles on various grounds culminated in the inclusion of 36 studies. This select
group of 36 studies served as the foundational basis for the scoping review, offering a
comprehensive exploration of cancer detection methods employing CML approaches for
both colorectal and stomach cancers.

Figure 1. PRISMA flow diagram for the literature selection process.

2.1.1. Inclusion Criteria
The inclusion criteria for the review of research articles focused on cancer detection were defined across several specific parameters. Firstly, the articles had to pertain exclusively to the classification of cancer using conventional machine learning classifiers. These articles were specifically chosen if they were peer-reviewed and published between 2017 and 2023. The selection was limited to journal articles, omitting conference papers, book chapters, and similar sources to maintain the analytical scope. The studies selected for review utilized medical image datasets related to colorectal and gastric cancers. Additionally, a key criterion was the inclusion of accuracy as a performance metric in the chosen articles. Accuracy stands as a fundamental measure in evaluating the effectiveness of cancer detection models. The selected studies also strictly employed traditional machine learning classifiers for their classification tasks. The review was narrowed down to studies covering two specific high-mortality cancer types: colorectal and gastric cancer. Furthermore, articles were required to be in the English language, a criterion implemented to ensure the enhanced accessibility and comprehension of the research, thereby contributing to clarity and accuracy in the assessment process. Figure 2 illustrates the parameters governing the inclusion and exclusion of research articles in the selection process employed in this manuscript.
Figure 2. Parameters governing the inclusion and exclusion of research articles in the selection process.
2.1.2. Exclusion Criteria
The exclusion criteria, a pivotal aspect of the research review process for cancer detection, served as a strategic filter to ensure the selection of high-quality, pertinent articles. Omitting conference papers and book chapters was a deliberate choice to uphold a superior standard, guided by the in-depth scrutiny and comprehensive nature typically associated with peer-reviewed journal articles. Additionally, the requirement for digital object identifiers (DOIs) within the selected studies aimed to guarantee the reliability and accessibility of the articles, facilitating easy citation, retrieval, and verification processes. The temporal boundary set the scope within a specific timeframe, excluding research published before 2017 or after 2023, with the intention of focusing on the most recent advancements within the field of cancer detection. Language limitations were incorporated, allowing only English publications to ensure a consistent understanding and analysis. Moreover, the exclusion of deep learning classifiers in favor of traditional machine learning methods aligned with the specific objective of assessing the performance and effectiveness of the latter in cancer detection. By narrowing the focus exclusively to colorectal and gastric cancers, the exclusion criteria aimed to ensure a concentrated and comprehensive analysis across these specific high-mortality cancer types. This approach facilitated a deeper understanding of the efficacy of traditional machine learning methods in the context of different cancer types.
To illuminate the research hotspots, we have detailed the quantity of literature references pertaining annually to each cancer category (colorectal and gastric), along with the cumulative total, visually represented in Figure 3. This visual aid is designed to help readers identify pertinent literature related to these specific cancer categories, fostering a more nuanced analysis within the specified years.

2.2. Medical Imaging Datasets


Data collection is the essential first step in any machine learning endeavor, and the
performance of classifiers and detection tasks depends on the characteristics of the datasets
used. The approach for identifying or classifying diseases, particularly cancers, is closely
linked to the nature of the dataset. Various data types, such as images, text, and signal
data, may require distinct processing methods. In the context of cancer detection, medical
image datasets are of paramount importance. These datasets contain images that provide
valuable information about the presence and characteristics of cancerous tissues. Spe-
cialized techniques, including image segmentation and feature extraction, are applied to
extract relevant information for classification or detection. Analyzing image datasets differs
significantly from text or signal datasets due to differences in data structures and feature
extraction techniques. Dataset availability can be categorized as offline or real-time. In the


domain of cancer detection, most research relies on offline datasets sourced from healthcare
institutions, research centers, and platforms like Kaggle and Mendeley. Researchers often
use local datasets from these sources to conduct studies and develop innovative cancer
detection methods. In Table 1, we have described some benchmark imaging datasets of
colorectal and gastric cancers.
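Offline imaging datasets such as those listed in Table 1 are typically distributed as class-labeled folders of image files. A minimal sketch of indexing such a dataset before preprocessing and feature extraction (the directory layout and names are hypothetical, not tied to any specific dataset in Table 1):

```python
from pathlib import Path

def index_image_dataset(root, extensions=(".png", ".jpg", ".tif")):
    """Map each class subfolder name to the sorted list of image paths it contains.

    Assumes the common layout root/<class_name>/<image file>, as used by many
    public histopathology collections.
    """
    root = Path(root)
    index = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir()
                       if f.suffix.lower() in extensions)
        if files:  # skip empty folders
            index[class_dir.name] = files
    return index

# Hypothetical usage on a locally downloaded copy of a Table 1 dataset:
# labels_to_paths = index_image_dataset("NCT-CRC-HE-100K")
```

Such an index makes it straightforward to draw stratified train/test splits per class before any classifier is trained.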

Table 1. Benchmark and public medical imaging datasets for colorectal and gastric cancer with download links.

Cancer Category | Dataset | Modality | Downloadable Link | No. of Data Samples | Pixel Size
Colorectal | NCT-CRC-HE-100K | H&E | https://zenodo.org/record/1214456 (accessed on 15 September 2023) | 100,000 | 224 × 224
Colorectal | Lung and colon histopathological images (LC25000) | H&E | https://academictorrents.com/details/7a638ed187a6180fd6e464b3666a6ea0499af4af (accessed on 15 September 2023) | 10,000 | 768 × 768
Colorectal | CRC-VAL-HE-7K | H&E | https://zenodo.org/record/1214456 (accessed on 15 September 2023) | 7180 | 224 × 224
Colorectal | Kather-CRC-2016 (KCRC-16) | H&E | https://zenodo.org/record/53169#.W6HwwP4zbOQ (accessed on 15 September 2023) | 5000; 10 | 150 × 150; 5000 × 5000
Stomach (Gastric) | Kvasir V-2 dataset (KV2D) | Endoscopy | https://dl.acm.org/do/10.1145/3193289/full/ (accessed on 15 September 2023) | 4000 | 720 × 576 to 1920 × 1072
Stomach (Gastric) | HyperKvasir dataset (HKD) | Endoscopy | https://osf.io/mh9sj/ (accessed on 15 September 2023) | 110,079 images and 374 videos | ----
Stomach (Gastric) | Gastric histopathology sub-size image database (GasHisSDB) | H&E | https://gitee.com/neuhwm/GasHisSDB (accessed on 15 September 2023) | 245,196 | 160 × 160, 120 × 120, 80 × 80

Figure 3. Temporal Analysis of Literature Utilization Across Cancer Categories (2017–2023).



2.3. Preprocessing
In cancer detection, preprocessing is essential to prepare data for analysis and clas-
sification. It refines diverse data types, like medical images and genetic and clinical data,
addressing noise and inconsistencies. Medical image preprocessing includes noise reduc-
tion, enhancement, normalization, and format standardization. Augmentation enhances
data diversity. Quality preprocessed data improves cancer detection model performance.
Common tasks include noise reduction, data cleaning, transformation, normalization, and
standardization. Preprocessing optimizes data for analysis, contributing to effective cancer
diagnosis. Key preprocessing techniques are summarized in Table 2.
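As a concrete illustration of two of these techniques, the sketch below (our own NumPy implementation, not code from any reviewed article) applies the contrast-stretching and Gaussian-kernel formulas given in Table 2 to a small grayscale array; the array values are hypothetical:

```python
import numpy as np

def contrast_stretch(img, out_min=0.0, out_max=255.0):
    """CEI formula: linearly map [min(img), max(img)] onto [out_min, out_max]."""
    in_min, in_max = img.min(), img.max()
    return (img - in_min) / (in_max - in_min) * (out_max - out_min) + out_min

def gaussian_kernel(n, sigma):
    """(2n+1) x (2n+1) kernel from the Gaussian formula, normalized to sum to 1."""
    ax = np.arange(-n, n + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
    return k / k.sum()  # normalize so smoothing preserves mean intensity

img = np.array([[10., 20.], [30., 50.]])   # toy grayscale patch
stretched = contrast_stretch(img)          # maps 10 -> 0 and 50 -> 255
k = gaussian_kernel(1, sigma=1.0)          # 3x3 smoothing kernel
```

Convolving an image with such a kernel implements the image-filtering summation in Table 2; larger sigma values produce stronger blurring.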

Table 2. Fundamental preprocessing techniques, associated formulas, and detailed descriptions.

Image Filtering: I_filtered(A, B) = ∑_{x=−N}^{N} ∑_{y=−N}^{N} I(A − x, B − y) · K(x, y). Here, I_filtered(A, B) is the filtered image pixel at location (A, B), I(A − x, B − y) is the pixel value at location (A − x, B − y) in the original image, and K(x, y) is the value of the convolution kernel at location (x, y). The summation is performed over a window of size (2N + 1) × (2N + 1) centered at (A, B).

Image Denoising: I_denoised = argmin(E(I_denoised) + R(I_denoised)). I_denoised represents the denoised image, E(I_denoised) is the data fidelity term, which measures how well the denoised image matches the noisy input image, and R(I_denoised) is the regularization term, which imposes a prior on the structure of the denoised image [21].

Gaussian Filtering: Filtered_value = (1/(2πσ²)) · e^{−(x² + y²)/(2σ²)}. Filtered_value represents the resulting value after applying Gaussian filtering, x and y are the spatial coordinates, and σ is the standard deviation, controlling the amount of smoothing or blurring.

Contrast Enhancement of Images (CEI): Pixel_OP = ((Pixel_IP − Min_IP)/(Max_IP − Min_IP)) · (Max_OP − Min_OP) + Min_OP. Pixel_OP is the enhanced pixel value, derived from Pixel_IP in the input image. Min_IP and Max_IP are the minimum and maximum pixel values in the input image; Min_OP and Max_OP represent the desired minimum and maximum pixel values in the output image [22].

Linear Transformation: T(v) = Av, where T is the transformation operator, v is the input vector, and A is a matrix defining the transformation.

Contrast Limited Adaptive Histogram Equalization (CLAHE): O(A, B) = T(I(A, B)). O(A, B) is the enhanced output pixel at (A, B) using a contrast-enhancing transformation function T(·) based on pixel intensity using the cumulative distribution function (CDF).

Discrete Cosine Transform (DCT): X[m] = ∑_{k=0}^{N−1} x[k] · cos(π(2k + 1)m/(2N)). X[m] represents the DCT coefficient at frequency index m, x[k] is the input signal, and N is the number of samples in the signal. The summation is performed over all samples in the signal.

Wavelet Transform (WT): W(x, y) = ∑_{a=0}^{N−1} ∑_{b=0}^{M−1} I(a, b) · ψ_{x,y}(a, b). W(x, y) is the DWT coefficient, I(a, b) is the pixel value at (a, b), and ψ_{x,y}(a, b) is the 2D wavelet function.

RGB to Gray Conversion (RGBG): Gray_value = (0.2989 · Red_value) + (0.5870 · Green_value) + (0.1140 · Blue_value). Gray_value is the converted gray value from the RGB channels (Red_value, Green_value, Blue_value); the coefficients 0.2989, 0.5870, and 0.1140 are weights assigned to the R, G, and B channels, respectively [23].

Cropping (ROI): I_cropped = I[y : y + h, x : x + w]. The cropped image I_cropped is obtained by cropping the input image I at coordinates (x, y) with width w and height h.
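Two of the simpler formulas in Table 2, RGBG conversion and CEI contrast stretching, can be sketched in a few lines of NumPy; the function names are ours, not from the reviewed studies:

```python
import numpy as np

def rgb_to_gray(img):
    """RGBG conversion from Table 2: weighted sum of the R, G, B channels."""
    return 0.2989 * img[..., 0] + 0.5870 * img[..., 1] + 0.1140 * img[..., 2]

def contrast_stretch(img, min_out=0.0, max_out=255.0):
    """CEI from Table 2: map [Min_IP, Max_IP] linearly onto [Min_OP, Max_OP]."""
    min_in, max_in = float(img.min()), float(img.max())
    return (img - min_in) / (max_in - min_in) * (max_out - min_out) + min_out

patch = np.array([[[100, 0, 0]]], dtype=float)   # a single pure-red pixel
gray = rgb_to_gray(patch)                        # 0.2989 * 100 = 29.89
stretched = contrast_stretch(np.array([[10.0, 20.0], [30.0, 50.0]]))
```

After stretching, the smallest input value maps exactly to Min_OP and the largest to Max_OP, which is the defining property of the CEI formula.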

2.4. Feature Engineering


Feature engineering is a critical component in solving classification problems, particu-
larly with traditional machine learning methods. Features represent dataset attributes used
by the model to classify or predict. Instead of using the entire dataset, relevant features are
extracted and serve as classifier inputs, delivering the desired outcomes. Proper prepro-
cessing is essential before feature engineering to ensure data quality. Feature engineering
involves selecting which features to extract, choosing methods, defining the domain, and
specifying the number of features. Categories of feature engineering include extraction,
selection, reduction, fusion, and enhancement. Commonly used features for predicting
lung and colorectal cancers in medical images are outlined below.

2.4.1. Histogram-Based First-Order Features (FOFs)


These are statistical features extracted from an image’s histogram, providing valuable
information about the distribution and characteristics of pixel intensities [24]. Here are some
significant FOFs, along with their mathematical formulae presented in Equations (1)–(4).
Skewness (s): Skewness quantifies the asymmetry of the histogram and is calculated as:

s = (1/σ³) ∑_{i=1}^{G_max} {(i − µ)³ · h_i}  (1)

Here, i is the gray level, h_i is its frequency, G_max is the highest grayscale intensity, and µ and σ² are the mean and variance, respectively.
Excess Kurtosis (k): Excess kurtosis measures the peakedness of the histogram and is calculated as:

k = (1/σ⁴) ∑_{i=1}^{G_max} {(i − µ)⁴ · h_i} − 3  (2)

Energy: Energy reflects the overall intensity in the image and is computed as:

Energy = ∑_{i=1}^{G_max} [h_i]²  (3)

Entropy (HIST): Entropy quantifies the information or randomness in the histogram and is calculated as:

Entropy = −∑_{i=1}^{G_max} {h_i · ln(h_i)}  (4)
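Equations (1)–(4) can be computed directly from a normalized image histogram; the following NumPy sketch (our own illustration, with the histogram normalized so the h_i sum to one) shows one way to do it:

```python
import numpy as np

def first_order_features(img, levels=256):
    """Histogram-based FOFs of Equations (1)-(4) for a grayscale image."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    h = hist / hist.sum()                      # normalized frequencies h_i
    i = np.arange(levels)
    mu = (i * h).sum()                         # mean gray level
    sigma = np.sqrt(((i - mu) ** 2 * h).sum()) # standard deviation
    skew = ((i - mu) ** 3 * h).sum() / sigma ** 3        # Equation (1)
    kurt = ((i - mu) ** 4 * h).sum() / sigma ** 4 - 3    # Equation (2)
    energy = (h ** 2).sum()                              # Equation (3)
    nz = h[h > 0]                              # skip empty bins (ln 0 undefined)
    entropy = -(nz * np.log(nz)).sum()                   # Equation (4)
    return skew, kurt, energy, entropy

img = np.random.default_rng(0).integers(0, 256, size=(64, 64))
s, k, e, H = first_order_features(img)
```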

2.4.2. Gray-Level Co-Occurrence Matrix (GLCM) Features


GLCM is a technique used for texture analysis in image processing. It assesses the
association between pixel values in an image, relying on the likelihood of specific pixel
pairs with particular gray levels occurring within a defined spatial proximity [25–27]. Here
are some important GLCM features, along with their mathematical formulas as provided
in Equations (5)–(10).
Here, (x, y) pairs typically refer to the intensity values of adjacent or neighboring pixels.
Sum of Squares Variance (SSV): SSV quantifies the variance in gray levels within the texture.

SSV = ∑_{x,y} (x − µ)² · GLCM(x, y)  (5)

Inverse Different Moment (IDM): IDM measures the local homogeneity and is higher for textures with similar gray levels.

IDM = ∑_{x,y} GLCM(x, y) / (1 + (x − y)²)  (6)

Correlation (Corr): Correlation quantifies the linear dependency between pixel values in the texture. It spans from −1 to 1, with 1 signifying flawless positive correlation.

Corr = (∑_{x,y} (x · y · GLCM(x, y)) − µ_a · µ_b) / (σ_a · σ_b)  (7)

Dissimilarity: Dissimilarity quantifies how different neighboring pixel values are.

Dissimilarity = ∑_{x,y} |x − y| · GLCM(x, y)  (8)

Autocorrelation (AuCorr): Autocorrelation measures the similarity between pixel values at different locations in the texture.

AuCorr = ∑_{x,y} x · y · GLCM(x, y)  (9)

Inverse Difference (ID): ID measures the local homogeneity and is higher for textures with similar gray levels at different positions.

ID = ∑_{x,y} GLCM(x, y) / (1 + |x − y|)  (10)
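A minimal sketch of a GLCM and a few of the features above, assuming a single pixel offset (dx, dy) and a normalized matrix (the helper names are ours):

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=8):
    """Normalized gray-level co-occurrence matrix for offset (dx, dy)."""
    g = np.zeros((levels, levels))
    h, w = img.shape
    for r in range(h - dy):
        for c in range(w - dx):
            g[img[r, c], img[r + dy, c + dx]] += 1
    return g / g.sum()

def glcm_features(g):
    """IDM, Dissimilarity, and AuCorr from Equations (6), (8), and (9)."""
    x, y = np.meshgrid(np.arange(g.shape[0]), np.arange(g.shape[1]), indexing="ij")
    idm = (g / (1 + (x - y) ** 2)).sum()          # Equation (6)
    dissimilarity = (np.abs(x - y) * g).sum()     # Equation (8)
    autocorr = (x * y * g).sum()                  # Equation (9)
    return idm, dissimilarity, autocorr

img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
idm, dis, ac = glcm_features(glcm(img))
```

On this blocky test image, 8 of the 12 horizontal pixel pairs have equal gray levels, so IDM is high (10/12) and dissimilarity low (4/12), matching the intuition that IDM rewards homogeneity.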

2.4.3. Gray-Level Run Length Matrix (GLRLM)


This is a statistical procedure employed in image processing and texture assessment
to quantify the distribution of run lengths of specific gray levels within an image. Here are
some significant GLRLM features along with their corresponding mathematical formulas,
as presented in Equations (11)–(22).
Short Run Emphasis (SRE): SRE evaluates the dispersion of shorter runs characterized by lower gray-level values.

SRE = ∑_{x,y} C(x, y) / x²  (11)

Here, (x, y) are gray levels, and C(x, y) is the co-occurrence matrix value reflecting the frequency of each gray-level combination.
Long Run Emphasis (LRE): LRE assesses the presence of extended runs marked by higher gray-level values [28].

LRE = ∑_{x,y} C(x, y) · x²  (12)

Gray Level Nonuniformity (GLN): GLN quantifies the nonuniformity of gray-level values in runs.

GLN = ∑_{x,y} C(x, y)²  (13)

Run Length Nonuniformity (RLN): RLN evaluates the irregularity in the lengths of runs.

RLN = ∑_{x,y} C(x, y) / y²  (14)

Run Percentage (RP): RP represents the percentage of runs in the matrix.

RP = ∑_{x,y} C(x, y) / N²  (15)

Run Entropy (RE): RE calculates the entropy of run lengths and gray levels.

RE = −∑_{x,y} C(x, y) · log(C(x, y) + ε)  (16)

Low Gray-Level Run Emphasis (LGRE): LGRE accentuates shorter runs with lower gray-level values.

LGRE = ∑_{x,y} C(x, y) / y², for y ≤ (N + 1)/2  (17)

High Gray-Level Run Emphasis (HGRE): HGRE highlights longer runs with higher gray-level values.

HGRE = ∑_{x,y} C(x, y) · y², for y > (N + 1)/2  (18)

Short Run Low Gray-Level Emphasis (SRLGLE): SRLGLE highlights shorter runs that contain lower gray-level values.

SRLGLE = ∑_{x,y} C(x, y) / (x² · y²), for x, y ≤ (N + 1)/2  (19)

Short Run High Gray-Level Emphasis (SRHGLE): SRHGLE highlights shorter runs that contain higher gray-level values.

SRHGLE = ∑_{x,y} C(x, y) · y² / x², for x ≤ (N + 1)/2, y > (N + 1)/2  (20)

Long Run Low Gray-Level Emphasis (LRLGLE): LRLGLE emphasizes longer runs featuring lower gray-level values.

LRLGLE = ∑_{x,y} C(x, y) · x² / y², for x > (N + 1)/2, y ≤ (N + 1)/2  (21)

Long Run High Gray-Level Emphasis (LRHGLE): LRHGLE highlights extended sequences with higher gray-level values.

LRHGLE = ∑_{x,y} C(x, y) · x² · y², for x, y > (N + 1)/2  (22)
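The run-length matrix itself is easy to build for horizontal runs. In the sketch below (our illustration), we index C by gray level and run length, and — since Equations (11) and (12) are written in terms of a single matrix index — we take that index to be the run length r, which is the usual convention for SRE/LRE:

```python
import numpy as np

def glrlm(img, levels, max_run):
    """Gray-level run-length matrix over horizontal runs:
    C[g, r-1] counts runs of gray level g with length r."""
    C = np.zeros((levels, max_run))
    for row in img:
        run = 1
        for a, b in zip(row, row[1:]):
            if a == b:
                run += 1
            else:
                C[a, run - 1] += 1   # run of value a just ended
                run = 1
        C[row[-1], run - 1] += 1     # close the final run of the row
    return C

def sre_lre(C):
    """Short/Long Run Emphasis in the spirit of Equations (11) and (12)."""
    r = np.arange(1, C.shape[1] + 1)   # run lengths 1..max_run
    sre = (C / r ** 2).sum()
    lre = (C * r ** 2).sum()
    return sre, lre

img = np.array([[0, 0, 0, 1],
                [1, 1, 2, 2]])
C = glrlm(img, levels=3, max_run=4)
sre, lre = sre_lre(C)
```

The test image has runs of lengths 3, 1, 2, and 2, so LRE = 9 + 1 + 4 + 4 = 18, confirming that long runs dominate the LRE sum.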

2.4.4. Neighborhood Gray-Tone Difference Matrix (NGTDM)


This is another texture analysis method used in image processing to characterize the
spatial arrangement of gray tones in an image. Here are some key NGTDM features along
with their respective mathematical formulas, as outlined in Equations (23)–(27).
Coarseness: Measures the coarseness of the texture based on differences in gray tones.

Coars = ∑_{x=1}^{N_g} C(x, y) / (∆x)²  (23)

N_g refers to the highest achievable discrete intensity level within the image.
Contrast (NGTD): Quantifies the contrast or sharpness in the texture.

Contrast_NGTD = ∑_{x=1}^{N_g} ∑_{y=1}^{N_g} C(x, y) · |x − y|  (24)

Busyness: Represents the level of activity or complexity in the texture.

Busyness = ∑_{x=1}^{N_g} ∑_{y=1}^{N_g} C(x, y) · y  (25)

Complexity: Measures the complexity or intricacy of the texture.

Complexity = ∑_{x=1}^{N_g} ∑_{y=1}^{N_g} P(x, y) / (1 + |x − y|²)  (26)

Texture Strength (TS): Quantifies the strength or intensity of the texture.

TS = √( ∑_{x=1}^{N_g} ∑_{y=1}^{N_g} P(x, y) · (x/N_g − y/N_g)² )  (27)

These features provide a detailed analysis of texture patterns in images, making them
valuable for various applications, including image classification, quality control, and texture
discrimination in fields such as geology, material science, and medical imaging.

2.5. Traditional Machine Learning Classifiers


Machine learning-based classifiers are renowned for their capabilities in detecting cancer and are especially effective when combined with non-invasive diagnostic techniques. Researchers have employed a range of ML classifiers to identify different malignancies and disorders. Some commonly used classifiers include:

2.5.1. K-Nearest Neighbors (KNN)


K-Nearest Neighbors (KNN) is a widely used and simple machine learning algorithm,
suitable for classification and regression tasks. It relies on the assumption that similar
inputs lead to similar outputs, assigning a class label to a test input based on the prevalent
class among its k closest neighbors. The formal definition involves representing a test
point ‘x’ and determining its set of ‘k’ nearest neighbors, denoted as ‘Nx’, where ‘k’ is a
user-defined parameter.
The Minkowski distance is a flexible distance metric that can be tailored by adjusting
the value of the parameter ‘p.’ The Minkowski distance between two data points ‘x’ and ‘z’
in a ‘d’-dimensional space is defined by Equation (28):
dist(x, z) = (∑_{r=1}^{d} |x_r − z_r|^p)^{1/p}  (28)
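Equation (28) and the neighbor-voting rule can be sketched together in a few lines (a minimal illustration, not the implementation used in the reviewed studies):

```python
import numpy as np
from collections import Counter

def minkowski(x, z, p):
    """Equation (28): Minkowski distance; p=1 is Manhattan, p=2 Euclidean."""
    return float((np.abs(x - z) ** p).sum() ** (1.0 / p))

def knn_predict(X_train, y_train, x, k=3, p=2):
    """Assign x the majority class among its k nearest training points."""
    d = [minkowski(xi, x, p) for xi in X_train]
    nearest = np.argsort(d)[:k]                 # indices of the set N_x
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
label = knn_predict(X, y, np.array([0.95, 0.95]), k=3)
```

Varying p changes the geometry of the neighborhood: p=2 gives spherical neighborhoods, while p=1 gives diamond-shaped ones.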

The “1-NN Convergence Proof” states that, as the dataset grows infinitely large, the 1-Nearest Neighbor (1-NN) classifier’s error will not be more than twice the error of the Bayes optimal classifier, which represents the best possible classification performance. This also holds for k-NN with larger values of k. It highlights the ability of the K-Nearest Neighbors algorithm to approach optimal performance with increasing data [29]. As n approaches infinity, Z_NN converges to Z_t, and the probability of different labels for Z_t when returning Z_NN’s label is described in Equation (29) [30].

∈_NN = P(y*|Z_t)(1 − P(y*|Z_NN)) + P(y*|Z_NN)(1 − P(y*|Z_t)) ≤ (1 − P(y*|Z_NN)) + (1 − P(y*|Z_t)) = 2(1 − P(y*|Z_t)) = 2 ∈_BO  (29)

Here, BO is the Bayes optimal classifier. If the test point and its nearest neighbor are indistinguishable, misclassification occurs if they have different labels. This probability is outlined in Equation (30) and Figure 4 [29,31].

(1 − p(s|x)) p(s|x) + p(s|x)(1 − p(s|x)) = 2 p(s|x)(1 − p(s|x))  (30)

Figure 4. Probabilistic analysis of misclassification for identical test point and nearest neighbor scenario.

Equation (30) represents the misclassification probability when the test point and its nearest neighbor have differing labels.

2.5.2. Multilayered Perceptron (MLP)
In contrast to static kernels, neural network units have adaptable internal parameters for an adjustable structure. A perceptron, inspired by biological neurons, comprises three components: (i) weighted edges for individual multiplications, (ii) a summation unit for calculating the sum, and (iii) an activation unit applying a non-linear function [32–34]. The single-layer unit function involves a linear combination passed through a non-linear activation, represented by Equation (31) and Figure 5 [33,34].

y^(1)(x) = f(w_0^(1) + ∑_{j=1}^{N} w_j^(1) x_j)  (31)

Figure 5. Contrasts (a) biological neurons, showcasing intricate neural architecture, with (b) artificial perceptrons in neural networks, depicting simplified representations and emphasizing structural differences.

In a single-layer neural network unit, y^(1)(x) is the output, w_0^(1) is the bias, and ∑_{j=1}^{N} w_j^(1) x_j is the weighted sum of inputs. In general, we compute U_1 units as feature transformations in learning models, described in Equation (32) [33,34].

model(x, w) = w_0 + y_1(x) w_1 + · · · + y_{U_1}(x) w_{U_1}  (32)

The input vector x can be denoted as represented in Equation (33) [33,34].

x = [1, x_1, . . . , x_N]^T  (33)

The vector representation comprises input values x_1 to x_N, and an additional element of 1. Internal parameters of single-layer units include bias w_{0,j}^(1) and weights w_{1,j}^(1) through w_{N,j}^(1). These parameters form the jth column of a matrix W^(1) with dimensions (N + 1) × U_1, as demonstrated in Equation (34) below [34]:

W_1 = [ w_{0,1}^(1)  w_{0,2}^(1)  · · ·  w_{0,U_1}^(1)
        w_{1,1}^(1)  w_{1,2}^(1)  · · ·  w_{1,U_1}^(1)
        . . .
        w_{N,1}^(1)  w_{N,2}^(1)  · · ·  w_{N,U_1}^(1) ]  (34)

Notably, the matrix–vector product W_1^T x encompasses all linear combinations within our U_1 units, as given in Equation (35) [33].

(W_1^T x)_j = w_{0,j}^(1) + ∑_{n=1}^{N} w_{n,j}^(1) x_n,  j = 1, . . . , U_1  (35)

We extend the activation function f to handle a general d × 1 vector v in Equation (36) [34]:

f(v) = [f(v_1), . . . , f(v_d)]^T  (36)

In Equation (37), f(W_1^T x) is a U_1 × 1 vector containing all U_1 single-layer units [33,34]:

f(W_1^T x)_j = f(w_{0,j}^(1) + ∑_{n=1}^{N} w_{n,j}^(1) x_n),  j = 1, . . . , U_1  (37)

The mathematical expression for an L-layer unit in a general multilayer perceptron, built recursively from single-layer units, is given by Equation (38) [33,34].

y^(L)(x) = f(w_0^(L) + ∑_{i=1}^{U^(L−1)} w_i^(L) f_i^(L−1)(x))  (38)
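The recursive construction of Equations (34)–(38) amounts to repeatedly prepending a 1 to the activations and applying f(W^T [1; a]); a minimal NumPy forward pass (weights here are random placeholders, and tanh stands in for any non-linear f):

```python
import numpy as np

def forward(x, weights):
    """Forward pass in the spirit of Equation (38):
    each layer computes f(W^T [1; a]), with row 0 of W holding the biases."""
    f = np.tanh                        # any non-linear activation function
    a = x
    for W in weights:                  # W has shape (N_in + 1, N_out)
        a = f(W.T @ np.concatenate(([1.0], a)))
    return a

rng = np.random.default_rng(0)
# 2 inputs -> 4 hidden units -> 1 output (shapes include the bias row)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(5, 1))]
y = forward(np.array([0.5, -0.2]), weights)
```

Because tanh is bounded, the output of every layer stays in (−1, 1); the shape of each weight matrix is (inputs + 1) × units, mirroring the (N + 1) × U_1 dimensions of Equation (34).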
2.5.3. Support Vector Machine (SVM)

SVMs, employed for regression and classification tasks, stand out in supervised machine learning for their precision with complex datasets. Particularly effective in binary classification, SVMs aim to discover an optimal hyperplane, maximizing the boundary between classes. Serving as a linear classifier, SVMs build on the perceptron introduced by Rosenblatt in 1958 [35–37]. Unlike perceptrons, SVMs identify the hyperplane (H) with the maximum separation margin, defined in Equation (39).

h(x) = sign(w^T x + b)  (39)

The SVM classifies in {+1, −1}, emphasizing the key concept of finding a hyperplane with maximum margin σ. Figure 6 illustrates this importance, with the margin expressed in Equation (40) [35]

σ = min_{(x_j, y_j) ∈ D} |w · x_j|  (40)

where input vectors x_j are within the unit sphere, σ is the closest data point from the hyperplane, and the vector w resides on the unit sphere.

Figure 6. Separating Hyperplanes and Maximum Margin Hyperplane in Support Vector Machines.
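The margin of Equation (40) is easy to compute numerically. In the sketch below (our illustration) we include the division by ‖w‖, which Equation (40) can omit because there w is assumed to lie on the unit sphere:

```python
import numpy as np

def margin(w, b, X):
    """Smallest distance from the hyperplane w^T x + b = 0 to any point in X,
    i.e. the margin sigma of Equation (40) for general (non-unit) w."""
    return float(np.min(np.abs(X @ w + b)) / np.linalg.norm(w))

X = np.array([[2.0, 0.0], [0.0, 2.0], [-1.0, -1.0]])
sigma = margin(np.array([1.0, 1.0]), 0.0, X)   # every point is 2/sqrt(2) away
```

Maximizing this quantity over (w, b), subject to correct classification, is exactly the constrained problem developed in the equations that follow.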
Max Margin Classifier: We formulate our pursuit of the maximizing-margin hyperplane as a constrained optimization task, aiming to enhance the margin while ensuring correct classification of all data points. This is expressed in Equation (41) [35,37]:
max_{u,δ} σ(u, δ) such that ∀i y_i (u^T x_i + δ) ≥ 0  (41)

Here the objective maximizes the margin, while the constraints enforce a separating hyperplane. Upon substituting the definition of σ, Equation (42) is derived, as given below.

max_{u,δ} (1/‖u‖_2) min_{x_i ∈ D} |u^T x_i + δ| s.t. ∀i y_i (u^T x_i + δ) ≥ 0  (42)

Scaling invariance enables flexible adjustment of u and δ. Smart value selection ensures min_{x ∈ D} |u^T x + δ| = 1, introduced as an equality constraint in the objective per Equation (43) [37]:

max_{u,δ} (1/‖u‖) · 1 = min_{u,δ} ‖u‖ = min_{u,δ} u^T u  (43)

Utilizing the fact that f(z) = z² is monotonically increasing for z ≥ 0 and ‖u‖ ≥ 0, minimizing ‖u‖ is equivalent to minimizing u^T u. This reformulates the optimization problem in Equation (44), and a structural diagram of a multi-SVM has been visualized in Figure 7.

min_{u,δ} u^T u subject to ∀i y_i (u^T x_i + δ) ≥ 0, min_i |u^T x_i + δ| = 1  (44)

Figure 7. Structural diagram of the multi-class support vector machine (SVM).

2.5.4. Bayes and Naive Bayes (NB) Classifier

The Bayes classifier, an ideal algorithm, assigns class labels based on class probabilities given observed features and prior knowledge. It predicts the class with the highest estimated probability, often used as a benchmark but requiring complete knowledge of underlying probability distributions. To estimate P(y|x) for the Bayes classifier, the common approach is maximum likelihood estimation (MLE), especially for the discrete variable y, as outlined in Equation (45) [37]:

P(y|x) = (∑_{k=1}^{n} I(x_k = x ∧ y_k = y)) / (∑_{i=1}^{n} I(x_i = x))  (45)
Naive Bayes addresses MLE’s limitations with sparse data by assuming feature indepen-
dence. It estimates P(y) and P( x |y) instead of P(y| x ) using Bayes’ rule (Equation (46)) [37]:

P(y|x) = P(x|y) P(y) / P(x)  (46)

Generative learning estimates P(y) and P( x |y), with P(y) resembling tallying occur-
rences for discrete binary values (Equation (47)).

P(y = c) = (∑_{i=1}^{n} I(y_i = c)) / n = π_c  (47)
To simplify estimation, the Naive Bayes (NB) assumption is introduced, a key element
of the NB classifier. It assumes feature independence given the class label, formalized in
Equation (48) for P( x |y).
P(x|y) = ∏_{α=1}^{d} P(x_α|y)  (48)

Here, xα is the value of feature α, assuming feature values, given class label y, are
entirely independent. Despite potential complex relationships, NB classifiers are effec-
tive. The Bayes classifier, defined in Equation (49), further simplifies to (50) due to P( x )
independence from y, and using logarithmic property, it can be expressed as (51).

h(x) = argmax_y P(y|x) = argmax_y P(x|y) P(y)  (49)

h(x) = argmax_y ∏_{α=1}^{d} P(x_α|y) P(y)  (50)

h(x) = argmax_y ∑_{α=1}^{d} log(P(x_α|y)) + log(P(y))  (51)

Estimating log( P( xα |y)) is straightforward for one dimension. P(y) remains unaf-
fected and is calculated independently. In Gaussian NB, where features are continuous
( xα ∈ R), P( xα |y) follows a Gaussian distribution (Equation (52)). This assumes each fea-
ture ( xα ) follows a class-conditional Gaussian distribution with mean µαc and variance
σ2αc (Equations (53) and (54)), using parameter estimates in the Gaussian NB classifier for
each class [37].

P(x_α|y = c) = (1/√(2π σ²_αc)) exp(−(x_α − µ_αc)² / (2σ²_αc))  (52)

µ_αc = (1/n_c) ∑_{k=1}^{n} I(y_k = c) x_kα  (53)

σ²_αc = (1/n_c) ∑_{k=1}^{n} I(y_k = c) (x_kα − µ_αc)²  (54)
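Equations (47) and (51)–(54) translate almost line-for-line into code; the following is a minimal Gaussian NB sketch of our own (no smoothing, so it assumes every class has non-constant features):

```python
import numpy as np

class GaussianNB:
    """Minimal Gaussian Naive Bayes following Equations (47) and (51)-(54)."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.pi = np.array([(y == c).mean() for c in self.classes])       # Eq. (47)
        self.mu = np.array([X[y == c].mean(axis=0) for c in self.classes])  # Eq. (53)
        self.var = np.array([X[y == c].var(axis=0) for c in self.classes])  # Eq. (54)
        return self

    def predict(self, X):
        out = []
        for x in X:
            # Equation (51): sum of per-feature log Gaussians plus log prior
            log_like = (-0.5 * np.log(2 * np.pi * self.var)
                        - (x - self.mu) ** 2 / (2 * self.var)).sum(axis=1)
            out.append(self.classes[np.argmax(log_like + np.log(self.pi))])
        return np.array(out)

X = np.array([[0.0], [0.2], [-0.1], [5.0], [5.2], [4.9]])
y = np.array([0, 0, 0, 1, 1, 1])
pred = GaussianNB().fit(X, y).predict(np.array([[0.1], [5.1]]))
```

Working in log space, as Equation (51) suggests, avoids the numerical underflow that the raw product in Equation (50) would cause for many features.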

2.5.5. Logistic Regression (LR)


Logistic regression, commonly used in classification, calculates the probability of a
binary label based on input features. In logistic regression (LR), the logistic (sigmoid)
function transforms a linear combination of input features x, weights w, and a bias term b
into a likelihood estimate between 0 and 1. Mathematically, logistic regression is defined in
Equation (55) [38]:
P(y = 1|x) = 1 / (1 + e^{−(w^T x + b)})  (55)

In Equation (55), P(y = 1|x) is the likelihood of class 1 given features x. w and b are estimated using statistical methods, minimizing assumptions about P(x_i|y), allowing flexibility in underlying distributions [38].
The Maximum Likelihood Estimate (MLE): MLE maximizes P(y | x, w), the probabil-
ity of observing y ∈ Rn given feature values xi . It aims to find parameters maximizing this
function, assuming independence among yi given xi and w. Equation (56) captures the
mathematical expression for the conditional data likelihood.
P(y|x, w) = ∏_{k=1}^{m} P(y_k|x_k, w)  (56)

Now, by taking the logarithm of the product in Equation (56), we obtain Equation (57):
log(∏_{k=1}^{m} P(y_k|x_k, w)) = −∑_{k=1}^{m} log(1 + e^{−y_k w^T x_k})  (57)

To find the MLE for w, we aim to minimize the function provided in Equation (58):
w_MLE = argmax_w −∑_{k=1}^{m} log(1 + e^{−y_k w^T x_k}) = argmin_w ∑_{k=1}^{m} log(1 + e^{−y_k w^T x_k})  (58)

Minimizing the function in Equation (58) is our goal, achieved through gradient
descent on the negative log likelihood in Equation (59).
L(w) = ∑_{k=1}^{m} log(1 + e^{−y_k w^T x_k})  (59)

Maximum a Posteriori (MAP): In maximum a posteriori (MAP), assuming a Gaussian


prior, the objective is to find w MAP that maximizes the posterior probability, represented
mathematically in Equation (60). Reformulating, this becomes an optimization problem, as
shown in Equation (61), where λ = 1/(2σ²), and gradient descent is employed on the negative
log posterior l (w) for parameter optimization [32,37].

w_MAP = argmax_w log(P(y|x, w) P(w)), with P(w|y, x) ∝ P(y|x, w) P(w)  (60)

w_MAP = argmin_w ∑_{k=1}^{m} log(1 + e^{−y_k w^T x_k}) + λ w^T w  (61)
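Gradient descent on the regularized objective of Equation (61) can be sketched directly, assuming labels in {−1, +1} and folding the bias into the weight vector (hyperparameter values here are arbitrary choices for illustration):

```python
import numpy as np

def fit_logistic(X, y, lam=0.0, lr=0.1, steps=2000):
    """Gradient descent on Equation (61):
    sum_k log(1 + exp(-y_k w^T x_k)) + lam * w^T w, labels y in {-1, +1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        s = 1.0 / (1.0 + np.exp(y * (Xb @ w)))  # sigmoid(-y_k w^T x_k)
        grad = -(Xb * (y * s)[:, None]).sum(axis=0) + 2 * lam * w
        w -= lr * grad
    return w

X = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y = np.array([-1, -1, 1, 1])
w = fit_logistic(X, y, lam=0.01)
pred = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
```

With λ > 0 (the MAP view with a Gaussian prior), the weights stay bounded even on linearly separable data, where the unregularized MLE would diverge.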

2.5.6. Decision Tree (DT)


Decision trees, used for regression and classification, form a hierarchical structure
with nodes for decisions, branches for outcomes, and leaves for predictions. The goal
is a compact tree with pure leaves, ensuring each contains instances from a single class.
Achieving consistency is computationally challenging due to the NP-hard complexity of
finding a minimum-size tree [37]. Impurity functions in decision trees, evaluated on a
dataset D with pairs ( a1 , b1 ), . . . , ( an , bn ), where bi takes values in {1, . . . , m} representing
m classes, are crucial for assessing tree quality.
Gini Impurity: Gini impurity in a decision tree is calculated for a leaf using Equa-
tion (62), and the Gini impurity for the entire tree is given by Equation (63).

I(D) = ∑_{m=1}^{k} q_m (1 − q_m)  (62)

G_T(D) = (|D_L|/|D|) G_T(D_L) + (|D_R|/|D|) G_T(D_R)  (63)
where D = D_L ∪ D_R, D_L ∩ D_R = ∅, |D_L|/|D| represents the fraction of inputs in the left subtree, and |D_R|/|D| represents the fraction of inputs in the right subtree. The binary decision tree with class labels has been visualized in Figure 8.

Figure 8. Binary Decision Tree with Sole Storage of Class Labels.

Entropy in Decision
ID3 Algorithm: The Trees: Entropy stops
ID3 algorithm in decision trees measures
tree-building when alldisorder using
labels are the class
same
fractions. Minimizing entropy aligns with a uniform distribution, promoting random-
or no more attributes can split further. If all share the same label, a leaf with that label is
ness. KL-Divergence
created. KL( p||qattributes
The KL divergence gauges the closeness of p to the uniform distribution q, as in Equation (64).

KL(p||q) = Σ_n p_n log(p_n/q_n) > 0 ← KL Divergence, where q_n = 1/c
         = Σ_n [p_n log(p_n) − p_n log(q_n)]
         = Σ_n [p_n log(p_n) + p_n log(c)]
         = Σ_n p_n log(p_n) + log(c), where log(c) ← constant, Σ_n p_n = 1 (64)

max_p KL(p||q) = max_p Σ_n p_n log(p_n) = min_p −Σ_n p_n log(p_n) = min_p H(S) ← Entropy

ID3 Algorithm: The ID3 algorithm stops tree-building when all labels are the same or no more attributes can split further. If all share the same label, a leaf with that label is created. If no more splitting attributes exist, a leaf with the most frequent label is generated (Equation (65)) [39].

ID3(S):
    if ∃ ȳ s.t. ∀(x, y) ∈ S, y = ȳ, return leaf with label ȳ
    if ∃ x̄ s.t. ∀(x, y) ∈ S, x = x̄, return leaf with mode(y : (x, y) ∈ S) (65)

CART (Classification and Regression Trees): CART is suitable for continuous labels (y ∈ R), using the squared loss function (Equation (66)). It efficiently finds the best split (attribute and threshold) by minimizing the average squared difference from the average label ȳ_S [37].

L(S) = (1/|S|) Σ_{(x,y)∈S} (y − ȳ_S)² ← Average squared difference from average label (66)

where ȳ_S = (1/|S|) Σ_{(x,y)∈S} y ← average label
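The identity at the heart of Equation (64) — the KL divergence to a uniform reference equals log(c) minus the entropy, so maximizing that divergence is the same as minimizing entropy — can be checked numerically. A minimal sketch (illustrative three-class label distribution, natural logarithm):

```python
import math

def entropy(p):
    # Shannon entropy H(p) = -sum_n p_n log(p_n), natural logarithm
    return -sum(pn * math.log(pn) for pn in p if pn > 0)

def kl_uniform(p):
    # KL(p || q) with the uniform reference q_n = 1/c, where c = len(p)
    c = len(p)
    return sum(pn * math.log(pn * c) for pn in p if pn > 0)

# Hypothetical label distribution at a tree node (3 classes)
p = [0.7, 0.2, 0.1]

# Equation (64): KL(p || uniform) = log(c) - H(p), so maximizing the
# divergence from uniform is equivalent to minimizing the entropy H(S).
assert abs(kl_uniform(p) - (math.log(len(p)) - entropy(p))) < 1e-12
print(round(entropy(p), 4))  # → 0.8018
```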
2.5.7. Ensemble Classifier (EC)


Ensemble classifiers represent a sophisticated class of machine learning techniques
aimed at enhancing the precision and resilience of predictive models. Their fundamental
premise revolves around the amalgamation of predictions from multiple foundational
models. Below, we delve into several prominent types of ensemble classifiers, each with its
distinct modus operandi.
Mathematics 2023, 11, 4937 20 of 40

Bagging (Bootstrap Aggregating): Bagging orchestrates the training of multiple foundational models in parallel. Each model operates independently on distinct, resampled subsets of the training data. The bias/variance decomposition helps us understand the sources of error in our models and is described by Equation (67) [37,40]:

E[(f_k(x) − y)²] = E[(f_k(x) − f̄(x))²] + E[(f̄(x) − ȳ(x))²] + E[(ȳ(x) − y)²] (67)
     Error             Variance             Bias                Noise

where f̄(x) denotes the expected classifier and ȳ(x) the expected label. In Equation (67), the "Error" decomposes into three components: "Variance", "Bias", and "Noise". Our primary objective is to minimize the "Variance" term, which is expressed as Equation (68):

E[(f_k(x) − f̄(x))²] ← Variance (68)

Ensemble learning minimizes variance by averaging individual predictions f_k(x). Bagging enhances ML classifiers by creating multiple datasets, training individual classifiers h_i(z), and aggregating predictions in the final ensemble classifier h(z) through averaging (Equation (69)) [40]:

h(z) = (1/n) Σ_{i=1}^{n} h_i(z) (69)
In practice, a larger value of n often leads to a better-performing ensemble, as it
leverages diverse base models for more robust predictions.
Random Forest (RF): RF stands as one of the most renowned and beneficial bagging algorithms. The RF algorithm entails creating multiple datasets, building decision trees with random feature subsets for each dataset, and averaging their predictions for the final classifier h(x) = (1/m) Σ_{j=1}^{m} h_j(x) [37,40].
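A from-scratch schematic of this recipe — bootstrap samples plus random feature subsets — with depth-1 "stumps" standing in for full decision trees and entirely made-up data (so this is only a sketch of the idea, not a complete RF implementation):

```python
import random
from collections import Counter

random.seed(1)

# Hypothetical data: 5 features, but the label depends only on features 0 and 1
X = [[random.random() for _ in range(5)] for _ in range(200)]
y = [1 if x[0] + x[1] > 1.0 else 0 for x in X]

def fit_stump(Xb, yb, feats):
    # Best (feature, threshold) pair on this bootstrap sample, predicting
    # the majority label on each side of the split
    best = None
    for f in feats:
        for t in (0.25, 0.5, 0.75):
            left = [lab for x, lab in zip(Xb, yb) if x[f] <= t]
            right = [lab for x, lab in zip(Xb, yb) if x[f] > t]
            if not left or not right:
                continue
            lm = Counter(left).most_common(1)[0][0]
            rm = Counter(right).most_common(1)[0][0]
            hits = sum(l == lm for l in left) + sum(r == rm for r in right)
            if best is None or hits > best[0]:
                best = (hits, f, t, lm, rm)
    return best[1:]

# Forest: each tree sees a bootstrap sample and a random feature subset
trees = []
for _ in range(25):
    idx = [random.randrange(len(X)) for _ in X]
    Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
    trees.append(fit_stump(Xb, yb, random.sample(range(5), 2)))

def predict(x):
    # Majority vote across the forest (the classification analogue of averaging)
    votes = [lm if x[f] <= t else rm for f, t, lm, rm in trees]
    return Counter(votes).most_common(1)[0][0]

acc = sum(predict(x) == lab for x, lab in zip(X, y)) / len(X)
print(acc)  # typically well above the 0.5 chance level on this toy data
```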
Boosting: Boosting addresses high bias in machine learning models, specifically when dealing with the hypothesis class H. Boosting reduces bias by iteratively constructing an ensemble of weak learners H_T(x) = Σ_{t=1}^{T} α_t h_t(x), with each iteration introducing a new classifier, guided by gradient descent in function space [37,41].
Gradient descent: Gradient descent in functional space optimizes the loss function ℓ within hypothesis class H by finding the appropriate step size α and weak learner h that minimize ℓ(H + αh). The technique uses a Taylor approximation to approximate the optimal weak learner h with α fixed to a small constant around 0.1 (Equation (70)) [34].

argmin_{h∈H} ℓ(H + αh) ≈ argmin_{h∈H} ⟨∇ℓ(H), h⟩ = argmin_{h∈H} Σ_{i=1}^{n} (∂ℓ/∂[H(x_i)]) h(x_i) (70)

Here, each prediction serves as an input to the loss function. The function ℓ(H) can be expressed by Equation (71).

ℓ(H) = Σ_{i=1}^{n} ℓ(H(x_i)) = ℓ(H(x_1), . . . , H(x_n)) (71)

This approximation enables the utilization of boosting as long as there exists a method, denoted as A, capable of solving Equation (72).

h_{t+1} = argmin_{h∈H} Σ_{i=1}^{n} r_i h(x_i), where r_i = ∂ℓ/∂[H(x_i)] (72)

where A({(x_1, r_1), . . . , (x_n, r_n)}) = argmin_{h∈H} Σ_{i=1}^{n} r_i h(x_i); progress is made as long as Σ_{i=1}^{n} r_i h(x_i) < 0, even if h is not an excellent learner.
AnyBoost (Generic Boosting): AnyBoost, a versatile boosting technique, iteratively
combines weak learners, prioritizing challenging data points for enhanced accuracy. It cre-
ates a strong learner from weak ones, effectively reducing bias and improving predictions.
See Algorithm 1 for the pseudo-code [41].

Algorithm 1: Pseudo-code for AnyBoost

Input: ℓ, α, {(x_i, y_i)}, A
H_0 = 0
for t = 0 : T − 1 do
    ∀i : r_i = ∂ℓ(H_t(x_1), y_1, . . . , H_t(x_n), y_n) / ∂H_t(x_i)
    h_{t+1} = A({(x_1, r_1), . . . , (x_n, r_n)}) = argmin_{h∈H} Σ_{i=1}^{n} r_i h(x_i)
    if Σ_{i=1}^{n} r_i h_{t+1}(x_i) < 0 then
        H_{t+1} = H_t + α_{t+1} h_{t+1}
    else
        return H_t (negative gradient orthogonal to descent direction)
    end
end
return H_T

Gradient Boosted Regression Trees (GBRT): GBRT, a sequential regression algorithm,


combines decision trees to correct errors iteratively for precise predictions. Applicable to
both classification and regression, it uses weak learners, often shallow regression trees,
with a fixed depth. The step size (α) is a small constant, and the loss function (l) must be
differentiable, convex, and decomposable over individual samples. The ensemble’s overall
loss is defined in Equation (73) [41].
L(H) = Σ_{i=1}^{n} ℓ(H(x_i)) (73)

GBRT minimizes the loss by iteratively adding weak learners to the ensemble. Pseudo-
code is in Algorithm 2 [41].

Algorithm 2: Pseudo-code for GBRT

Input: ℓ, α, {(x_i, y_i)}, A
H = 0
for t = 1 : T do
    ∀i : t_i = y_i − H(x_i)
    h = argmin_{h∈H} Σ_{i=1}^{n} (h(x_i) − t_i)²
    H ← H + αh
end
return H
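Algorithm 2 with the squared loss and depth-1 regression trees as weak learners can be sketched as follows (toy 1-D data and a hand-rolled stump fitter, purely illustrative):

```python
# Hypothetical 1-D regression target: y = x^2 on a grid
xs = [i / 20 for i in range(40)]
ys = [x * x for x in xs]

def fit_stump(xs, resid):
    # Depth-1 regression tree: threshold split minimizing squared error,
    # predicting the mean residual on each side of the threshold
    best = None
    for thr in xs:
        left = [r for x, r in zip(xs, resid) if x <= thr]
        right = [r for x, r in zip(xs, resid) if x > thr]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm

# Algorithm 2: H = 0; repeatedly fit h to the residuals t_i = y_i - H(x_i)
# and update H <- H + alpha * h with a small fixed step size alpha
alpha, stumps = 0.1, []

def H(x):
    return sum(alpha * h(x) for h in stumps)

for _ in range(200):
    resid = [yy - H(x) for x, yy in zip(xs, ys)]
    stumps.append(fit_stump(xs, resid))

mse = sum((H(x) - yy) ** 2 for x, yy in zip(xs, ys)) / len(xs)
print(mse)  # shrinks toward zero as boosting rounds accumulate
```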

AdaBoost: AdaBoost is a binary classification algorithm utilizing weak learners h


producing binary predictions. Key components include step-size α and exponential loss
ℓ(H), given by Equation (74):

ℓ(H) = Σ_{i=1}^{n} e^{−y_i H(x_i)} (74)

The gradient function r_i needed to find the optimal weak learner is computed using Equation (75).

r_i = ∂ℓ/∂H(x_i) = −y_i e^{−y_i H(x_i)} (75)

Introducing w_i = (1/Z) e^{−y_i H(x_i)} for clarity and convenience, where Z = Σ_{i=1}^{n} e^{−y_i H(x_i)} normalizes the weights. Each w_i signifies the role of (x_i, y_i) in the global loss. To find the next weak learner, we solve the optimization problem in Equation (76) with h(x_i) ∈ {+1, −1} [42].

h(x_i) = argmin_{h∈H} Σ_{i=1}^{n} r_i h(x_i)   (substitute in: r_i = −y_i e^{−H(x_i) y_i})
       = argmin_{h∈H} −Σ_{i=1}^{n} y_i e^{−H(x_i) y_i} h(x_i)   (substitute in: w_i = (1/Z) e^{−H(x_i) y_i})
       = argmin_{h∈H} −Σ_{i=1}^{n} w_i y_i h(x_i)   (y_i h(x_i) ∈ {+1, −1}, with h(x_i) y_i = 1 ⟺ h(x_i) = y_i)
       = argmin_{h∈H} (Σ_{i:h(x_i)≠y_i} w_i − Σ_{i:h(x_i)=y_i} w_i)   (Σ_{i:h(x_i)=y_i} w_i = 1 − Σ_{i:h(x_i)≠y_i} w_i)
       = argmin_{h∈H} Σ_{i:h(x_i)≠y_i} w_i   (this is the weighted classification error) (76)

In (76), ε = Σ_{i:h(x_i)≠y_i} w_i, representing the weighted classification error. AdaBoost seeks a classifier minimizing this error without requiring high accuracy. The optimal step size, denoted as α, minimizes the loss ℓ most effectively in the closed-form optimization problem (77) [41].

α = argmin_α ℓ(H + αh) = argmin_α Σ_{i=1}^{n} e^{−y_i [H(x_i) + α h(x_i)]} (77)

Taking the derivative with respect to α and setting it to zero, as shown by Equations (78)–(80):

Σ_{i=1}^{n} y_i h(x_i) e^{−y_i [H(x_i) + α h(x_i)]} = 0   (y_i h(x_i) ∈ {+1, −1}) (78)

−Σ_{i:h(x_i) y_i = 1} e^{−(y_i H(x_i) + α)} + Σ_{i:h(x_i) y_i ≠ 1} e^{−(y_i H(x_i) − α)} = 0   (w_i = (1/Z) e^{−y_i H(x_i)}) (79)

−Σ_{i:h(x_i) y_i = 1} w_i e^{−α} + Σ_{i:h(x_i) y_i ≠ 1} w_i e^{α} = 0   (ε = Σ_{i:h(x_i) y_i = −1} w_i) (80)

For further simplification, with ε representing the sum over misclassified examples, as given in Equation (81):

−(1 − ε) e^{−α} + ε e^{α} = 0 (81)

Solving for α, as shown in Equations (82) and (83):

e^{2α} = (1 − ε)/ε (82)

α = (1/2) ln((1 − ε)/ε) (83)
The optimal step size α, derived from the closed-form solution in (83), facilitates
rapid convergence in AdaBoost. After each step Ht+1 = Ht + αh, recalculating and re-
normalizing all weights is crucial for the algorithm’s progression. The pseudo-code for
AdaBoost Ensemble classifier is presented in Algorithm 3 [37,41].

Algorithm 3: Pseudo-code for AdaBoost

Input: ℓ, α, {(x_i, y_i)}, A
H = 0
∀i : w_i = 1/n
for t = 1 : T do
    h = A((w_1, x_1, y_1), . . . , (w_n, x_n, y_n))
    ε = Σ_{i:h(x_i)≠y_i} w_i
    if ε < 1/2 then
        α = (1/2) ln((1 − ε)/ε)
        H_{t+1} = H_t + αh
        ∀i : w_i ← w_i e^{−α h(x_i) y_i} / (2[ε(1 − ε)]^{1/2})
    else
        return H_t
    end
end
return H
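A from-scratch sketch of Algorithm 3, with threshold "stumps" playing the role of the weak learner A (hypothetical 1-D data, labels in {+1, −1}):

```python
import math

# Hypothetical 1-D training set, labels in {+1, -1}
data = [(-3, -1), (-2, -1), (-1, -1), (-0.5, 1), (0.5, 1), (1, 1), (2, -1), (3, 1)]
xs = [x for x, _ in data]

def weak_learner(w):
    # A: threshold stump minimizing the weighted classification error (Eq. (76))
    best = None
    for t in xs:
        for sign in (1, -1):
            err = sum(wi for wi, (x, y) in zip(w, data)
                      if (sign if x > t else -sign) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    return best

w = [1.0 / len(data)] * len(data)
ensemble = []  # list of (alpha, threshold, sign) triples
for _ in range(10):
    eps, t, sign = weak_learner(w)
    if eps >= 0.5:
        break
    a = 0.5 * math.log((1 - eps) / eps)          # Equation (83)
    ensemble.append((a, t, sign))
    # w_i <- w_i * exp(-alpha * h(x_i) * y_i), then renormalize
    w = [wi * math.exp(-a * (sign if x > t else -sign) * y)
         for wi, (x, y) in zip(w, data)]
    z = sum(w)
    w = [wi / z for wi in w]

def predict(x):
    s = sum(a * (sign if x > t else -sign) for a, t, sign in ensemble)
    return 1 if s > 0 else -1

correct = sum(predict(x) == y for x, y in data)
print(correct, "of", len(data))
```

Misclassified points gain weight after each round, so later stumps concentrate on the hard cases, which is the mechanism behind the bias reduction described above.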

2.6. Assessment Metrics


The crucial next step in evaluating machine learning classifiers is the use of a separate
test dataset that has not been part of the training process. Evaluation involves various
parameters, with the confusion matrix being a widely adopted tool. This matrix forms
the basis for determining assessment metrics, essential for validating model performance,
whether it is a traditional or deep neural network classifier. In cancer prediction tasks,
numerous metrics are employed to assess effectiveness, including error rate, accuracy,
sensitivity, specificity, recall, precision, predictivity, F1 score, area under the curve (AUC), negative predictive value (NPV), false positive rate (FPR), false negative rate (FNR), and Matthews correlation coefficient (MCC) [43]. These metrics quantify predictive capabilities
and are vital for diverse prediction tasks. Multiple performance evaluation metrics rely on
the confusion matrix, as visualized in Figure 9, for multiclass classification.

Figure 9. Confusion Matrix for Multiclass Classification Evaluation.

Accuracy (Acc): This metric is a fundamental indicator of a model's overall performance. It measures the ratio of accurately categorized cases (both cancer and non-cancer) to the overall cases in the test dataset. It may not be suitable when the dataset is imbalanced.

Accuracy (%ACC) = (TP + TN)/(Total Samples) × 100

Error Rate (ER): The complement of accuracy equates to the error rate. It quantifies the proportion of instances that the model incorrectly classifies. A lower error rate suggests a more accurate model, and it is especially useful when you want to know how often the model makes incorrect predictions.

Error rate (ER) = 1 − Acc

%ER = (FP + FN)/(Total Samples) × 100 = 100 − (%ACC)
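With hypothetical confusion-matrix counts, both metrics follow directly:

```python
# Hypothetical binary confusion-matrix counts (illustrative only)
TP, TN, FP, FN = 45, 40, 5, 10
total = TP + TN + FP + FN

acc = (TP + TN) / total * 100   # %ACC
err = (FP + FN) / total * 100   # %ER = 100 - %ACC
print(acc, err)  # → 85.0 15.0
```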
Specificity (% Spe): True negative rate, commonly known as specificity, is a metric
that evaluates a model’s accuracy in correctly identifying true negative cases. This is crucial
in minimizing false alarms.

Specificity (%Sp) = True Negative Rate (%TNR) = TN/(Total Negative) × 100

Sensitivity (% Sen): This metric, also termed recall or the true positive rate (TPR),
gauges the model’s capability to accurately identify true positive values, which correspond
to cases of cancer, among the total positive cases within a dataset [42].

Sensitivity (%Sen) = Recall (%Re) = True Positive Rate (%TPR) = TP/(Total Positive) × 100
Precision (% Pr): Precision, also recognized as positive predictive value (PP), denotes
the ability to accurately predict positive values among the true positive predictions. A high
precision score signifies that the model effectively reduces false positive errors.

Precision (%Pr) = Positive Predictivity (%PP) = TP/(True Prediction) × 100
F1 Score (% F1): An equitable metric that amalgamates positive predictive value and
recall forms the F1 score [44]. It is particularly valuable when you require a singular metric
that contemplates both incorrect positive predictions and missed positive predictions.

F1-score (%F1) = (2 × TP)/(2 × TP + FP + FN) × 100 = (2 × PP × TPR)/(PP + TPR) × 100

Area Under the Curve (AUC): The AUC assesses the classifier’s capacity to differen-
tiate between affirmative and negative occurrences. It gauges the general efficacy of the
model concerning receiver operating characteristic (ROC) graphs. A superior AUC score
signifies enhanced differentiation capability.
Negative Predictive Value (% NPV): It measures the classifier’s capability to accu-
rately predict negative instances among all instances classified as negative. A high NPV
suggests that the classifier is effective at identifying non-cancer cases when it predicts them
as such, reducing the likelihood of unnecessary treatments.

Negative Predictive Value (%NPV) = TN/(TN + FN) × 100

False Positive Rate (%FPR): This quantifies how often the classifier falsely identifies
a negative instance as positive. It provides insights into the model’s propensity for false
positive errors. In cancer detection, a high FPR can lead to unnecessary distress and
treatments for individuals who do not have cancer.
False Positive Rate (%FPR) = FP/(Total Negative) × 100

False Negative Rate (%FNR): It determines the classifier’s tendency to falsely identify
a positive instance as negative. It reveals the model’s performance regarding false negative
errors, which is critical in cancer detection to avoid missing real cases. High FNR can lead
to undiagnosed cancer cases and potentially delayed treatments.

False Negative Rate (%FNR) = FN/(Total Positive) × 100
Matthews Correlation Coefficient (MCC): The Matthews correlation coefficient (MCC)
represents a pivotal metric utilized for evaluating the effectiveness of binary (two class) predic-
tions, prominently beneficial when dealing with scenarios where classes are asymmetrically
distributed in their volume and representation within the dataset. The formula to calculate
MCC is:

MCC = [(TP × TN) − (FP × FN)] / √((True Prediction) × (False Prediction) × (Total Positive) × (Total Negative))

where TN (True Negative) is accurately recognized negatives; TP (True Positive) is accurately recognized positives; FP (False Positive) is negatives incorrectly identified as positives; FN (False Negative) is positives incorrectly recognized as negatives; Total Positive is the sum of TP and FN (all actual positives); Total Negative is the sum of TN and FP (all actual negatives); True Prediction is the sum of TP and FP (all predicted positives); False Prediction is the sum of FN and TN (all predicted negatives); and Total Samples is the sum of TP, TN, FP, and FN (the entire dataset).
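Using the same confusion-matrix conventions (hypothetical counts; standard binary definitions, with NPV computed as TN over all predicted negatives), the full metric suite reduces to a few lines:

```python
import math

# Hypothetical binary confusion-matrix counts (illustrative only)
TP, TN, FP, FN = 45, 40, 5, 10
P, N = TP + FN, TN + FP              # total actual positives / negatives

sen = TP / P * 100                   # sensitivity / recall / TPR
spe = TN / N * 100                   # specificity / TNR
pre = TP / (TP + FP) * 100           # precision / positive predictivity
f1 = 2 * TP / (2 * TP + FP + FN) * 100
npv = TN / (TN + FN) * 100           # negative predictive value
fpr = FP / N * 100
fnr = FN / P * 100
mcc = (TP * TN - FP * FN) / math.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

print(f"Sen {sen:.1f}  Spe {spe:.1f}  Pre {pre:.1f}  F1 {f1:.1f}  MCC {mcc:.3f}")
```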

3. Review Analysis
In this section, we present a thorough and extensive analysis of cancer detection
utilizing conventional machine learning models applied to medical imaging datasets. Our
study is focused exclusively on the detection of two specific types of cancer: colorectal
and stomach cancer. For each of these cancer types, we have meticulously compiled a
comprehensive review table that encompasses the relevant literature published during
the period spanning 2017 to 2023. This table encompasses a range of crucial review
parameters, including the year of publication, the datasets utilized, preprocessing methods,
feature extraction techniques, machine learning classifiers employed, the number of images
involved, the imaging modality, and various performance metrics. In total, our review
encompasses 36 research articles that have harnessed medical imaging datasets to detect
these specific types of cancer. Our primary emphasis lies in scrutinizing the utilization of
traditional machine learning methodologies in the context of cancer detection using image
datasets. We have conducted this analysis based on the meticulously assembled review
tables. Subsequent subsections provide in-depth and comprehensive reviews for both
colorectal and stomach cancer. Within our analysis, we delve into the intricate application
of machine learning approaches for the intent of cancer prediction. Our overarching goal
is to furnish valuable insights into the efficacy and constraints of conventional machine
learning models when applied to the realm of cancer detection using medical imaging
datasets. Through a meticulous examination and comparative analysis of results derived
from various studies, our objective is to make a meaningful contribution to the evolution of
cancer detection methodologies and to offer guidance for future research endeavors in this
critical domain.

3.1. Analysis of Colorectal Cancer Prediction


Table 3 showcases 20 studies conducted from 2017 to 2023, focusing on machine
learning-based colorectal cancer detection. These studies underscore the vital role of pre-
processing methods in enhancing detection accuracy. The highest accuracy achieved is
100%, with the lowest at 76.00%. Various techniques, including cropping, stain normal-
ization, contrast enhancement, smoothing, and filtering, were employed in conjunction
with segmentation, feature extraction, and machine learning algorithms like SVM, MLP,

RF, and KNN. These approaches successfully detect colorectal cancer using modalities
such as endocytoscopy, histopathological images, and clinical data. The studies employed
varying quantities of images, patients, or slices, ranging from 54 to 100,000. The “KCRC-16”
datasets are prominently featured in these analyses.
In a comparative analysis of colorectal cancer detection studies, (Talukder et al., 2022) [45]
stood out with an impressive accuracy of 100%. Their approach included preprocessing
steps like resizing, BGR2RGB conversion, and normalization. Deep learning models such as
DenseNet169, MobileNet, VGG19, VGG16, and DenseNet201 were employed. Performance
assessment was conducted using a combination of voting, XGB, EC, MLP, LGB, RF, SVM,
LR, and hybrid techniques on a dataset comprising 2800 H&E images from the LC25000
dataset. Their best model achieved a flawless 100% accuracy. In contrast, (Ying et al., 2022) [46]
achieved the lowest accuracy of 76.0% in colorectal cancer detection. Their approach involved
manual region of interest (ROI) selection and various preprocessing techniques. They leveraged
multiple features, including FOS, shape, GLCM, GLSZM, GLRLM, NGTDM, GLDM, LoG,
and WT. Classification was carried out using the MLR technique on a dataset consisting of
276 CECT images from a private dataset. Their least-performing model achieved an accuracy
of 76.00%. Moreover, their study exhibited a sensitivity of 65.00%, specificity of 80.00%, and
precision of 54.00%, indicating relatively suboptimal performance in accurately identifying
colorectal cancer cases.
(Khazaee Fadafen and Rezaee 2023) [47] conducted a remarkable colorectal cancer
detection study by utilizing a substantial dataset (the highest number of images among all)
comprising a total of 100,000 medical images sourced from the H&E NCT-CRC-HE-100K
dataset. Their preprocessing methodology encompassed the conversion of RGB images
to the HSV color space and the utilization of the lightness space. For classification, they
harnessed the dResNet architecture in conjunction with DSVM, which resulted in an out-
standing accuracy rate of 99.76%. (Jansen-Winkeln et al., 2021) [48] conducted a study with the smallest dataset, comprising only 54 medical images. Their preprocessing approach included smoothing and normalization. For classification purposes, they employed
a combination of MLP, SVM, and RF techniques. This approach yielded commendable
results with an accuracy of 94.00%, sensitivity at 86.00%, and specificity reaching 95.00%.
Notably, their analysis identified MLP as the most effective model in their study.
Within the corpus of 20 studies dedicated to the realm of colorectal cancer detection,
researchers have deployed an array of diverse preprocessing strategies encompassing
endocytoscopy, cropping, IPP, stain normalization, CEI, smoothing, normalization, filtering,
THN, DRR, augmentation, UM-SN, resizing, BGR2RGB, normalization, scaling, labeling,
RGBG, VTI, HOG, RGB to HSV, lightness space, edge preserving, and linear transforma-
tion. These sophisticated methodologies collectively served as the linchpin for optimizing
machine learning-based colorectal cancer detection, ushering in a new era of precision
and accuracy. However, it is captivating to note that, within the comprehensive assessment of 20 studies, a select quartet of research endeavors chose to forgo the utilization of any specific preprocessing techniques. This exceptional cluster includes the works of
(Bora et al., 2021) [49], (Fan et al., 2021) [50], and (Lo et al., 2023) [51]. Astonishingly, these
studies defied conventional wisdom by attaining commendable accuracies that spanned
the spectrum from 94.00% to an impressive 99.44%. Such outcomes suggest that, in cases
where the dataset is inherently pristine and impeccably aligned with the demands of the
classification task, the impact of preprocessing techniques on the classifier’s performance
might indeed exhibit a marginal influence.
In the comprehensive analysis of the research studies under scrutiny, it is noteworthy
that only the works of (Grosu et al., 2021) [52] and (Ying et al., 2022) [46] registered
accuracy figures falling below the 90% threshold, specifically at 84.7% and 76%, respectively.
This observation underscores the intriguing possibility that traditional machine learning
models can indeed yield highly accurate cancer detection performance, provided they are
meticulously optimized.

Table 3. Performance comparison of traditional ML-based colorectal cancer prediction methods.

Year References Pre- Features Techniques Dataset Data Train Test Modality Metrics (%)
Processing Samples Data Data
2017 [53] Endocytoscopy Texture, SVM Private 5843 5643 200 ENI Acc 94.1
nuclei Sen 89.4
Spe 98.9
Pre 98.8
NPV 90.1
2019 [54] IPP CSQ, Color WSVMCS Private 180 108 72 H&E Acc 96.0
histogram
2019 [55] Cropping Biophysical NB, MLP, OMIS 316 237 79 OMIS Acc 92.6
characteristic, data Sen 96.3
WLD, Spe 88.9
2021 [56] Filtering HOS, FOS, ANN, KCRC-16 5000 4550 450 H&E Acc 95.3
GLCM, Gabor, RSVM,
WPT, LBP
2021 [57] IPP, Augmen- VGG-16 MLP KCRC-16 5000 4825 175 H&E Acc 99.0
tation Sen 96.0
Spe 99.0
Pre 96.0
NPV 99.0
F1 96.0
2021 [50] --- AlexNet EC, SVM, LC25000 10,000 4-fold cross H&E Acc 99.4
AlexNet, validation
2021 [58] THN, DRR BmzP NN MALDI 559 Leave-One-Out H&E Acc 98.0
MSI cross-validation Sen 98.2
Spe 98.6
2021 [52] Filtering Filters, RF Private 287 169 77 CT Acc 84.7 *
Texture, Sen 82.0
GLHS, Shape Spe 85.0
AUC 91.0
2021 [49] --- GFD, MLP Private 734 five-fold NBI, Acc 95.7
NSCT, Shape LSSVM, cross-validation WLI Sen 95.3
Spe 95.0
Pre 93.2
F1 90.5
2021 [48] Normalization, Spatial MLP, Private 54 Leave-One-Out HSI Acc 94.0
smoothing Information SVM, RF cross-validation Sen 86.0
Spe 95.0
2022 [59] VTI Haralick, VTF RF Private 63 cross-validation CT Acc 92.2
method Sen 88.4
Spe 96.0
AUC 96.2
2022 [60] RGBG GLCM ANN, RF, KCRC-16 5000 4500 500 H&E Acc 98.7
KNN Sen 98.6
Spe 99.0
Pre 98.9
2022 [45] Resize, Deep Features EC, LC25000 2800 10-fold H&E Acc 100.0
BGR2RGB, Hybrid, cross-validation
Normaliza- LR, LGB,
tion, MLP, RF,
SVM,
XGB,
Voting
2022 [46] ROI FOS, GLCM, MLR Private 276 194 82 CECT Acc 76.0
GLDM, Sen 65.0
GLRLM, Spe 80.0
GLSZM, LoG, Pre 54.0
NGTDM, NPV 86.0
Shape, WT

Table 3. Cont.

Year References Pre- Features Techniques Dataset Data Train Test Modality Metrics (%)
Processing Samples Data Data
2022 [61] UM-SN HIM, GLCM, LDA, LC25000 1000 900 100 H&E Acc 99.3
Statistical MLP, RF, Sen 99.5
SVM, Pre 99.5
XGB, F1 99.5
LGB
2022 [26] --- Color Spaces, ANN, DT, KCRC-16 5000 3504 1496 H&E Acc 97.3
Haralick KNN, Sen 97.3
QDA, Spe 99.6
SVM Pre 97.4
2023 [62] Filtering, Color CatBoost, NCT- 12,042 8429 3613 H&E Acc 90.7
linear Trans- characteristic, DT, GNB, CRCHE- Sen 97.6
formation, DBCM, KNN, RF 7K Spe 97.4
normalization SMOTE Pre 90.6
Rec 90.5
F1 90.5
2023 [51] --- Clinical, SEKNN Private 1729 tenfold ENI Acc 94.0
FEViT cross-validation Sen 74.0
Spe 98.0
AUC 93.0
2023 [47] RGB to HSV, Lightness space dResNet DSVM KCRC-16 5000 4000 1000 H&E Acc 98.8
                                            NCT-CRC-HE-100K 100,000 80,003 19,997 H&E Acc 99.8
2023 [63] HOG, RGBG, Morphological SVM Private 540 420 120 ENI Acc 97.5
Resizing
* Not given in the paper, calculated from the result table; bold font signifies the best model in the 'Techniques' column. Abbreviations: BGR2RGB, Blue-Green-Red to Red-Green-Blue; BmzP, Binning of m/z Points; CatBoost, Categorical Boosting; CECT, Contrast-Enhanced CT; CSQ, Color Space Quantization; DBCM, Differential Box Count Method; DRR, Dynamic Range Reduction; dResNet, Dilated ResNet; DSVM, Deep Support Vector Machine; ENI, Endomicroscopy Images; FEViT, Feature Ensemble Vision Transformer; FOS, First-Order Statistics; GFD, Generic Fourier Descriptor; GLDM, Gray-Level Dependence Matrix; GLHS, Gray Level Histogram Statistics; GLSZM, Gray Level Size Zone Matrix; GNB, Gaussian Naive Bayes; HIM, Hu Invariants Moments; HOG, Histogram of Oriented Gradients; HOS, Higher-Order Statistic; HSI, Hyperspectral Imaging; HSV, Hue-Saturation-Value; LBP, Local Binary Pattern; LDA, Linear Discriminant Analysis; LGB, Light Gradient Boosting; LoG, Laplacian of Gaussian; LSSVM, Least Square Support Vector Machine; MLR, Multivariate Logistic Regression; NGTDM, Neighboring Gray Tone Difference Matrix; NSCT, Non-Subsampled Contourlet Transform; OMIS, Optomagnetic Imaging Spectroscopy; QDA, Quadratic Discriminant Analysis; SEKNN, Subspace Ensemble K-Nearest Neighbor; THN, TopHat and Normalization; UM-SN, Unsharp Masking and Stain Normalization; VTF, Vector Texture Features; VTI, Vector Texture Images; WLD, Wavelength Difference; WLI, White Light Imaging; WPT, Wavelet Packet Transform; WSVMCS, Wavelet Kernel SVM with Color Histogram; XGB, Extreme Gradient Boosting.

The analysis of colorectal cancer detection using traditional machine learning tech-
niques reveals a notable disparity in model performance across various crucial metrics,
showcasing substantial discrepancies between the models with the highest and lowest val-
ues as shown in Figure 10. The most proficient model achieved an extraordinary accuracy
of 100.0%, whereas the least effective model achieved an accuracy of 76.0%, resulting in a
substantial difference of 24.0%. When considering sensitivity, the top-performing model
reached an impressive 99.5%, whereas the lowest-performing model registered a mere
65.0%, leading to a remarkable disparity of 34.5%. Similarly, concerning specificity, the
superior model attained 99.6%, while the inferior model managed only 80.0%, resulting
in a significant difference of 19.6%. In terms of precision, the best model demonstrated
99.5%, while the worst model exhibited a precision of only 54.0%, resulting in a substantial
difference of 45.5%. When examining the F1-score, the model with the highest performance
achieved 99.5%, whereas the least proficient model attained a score of 63.2%, yielding a
notable difference of 36.3%. Lastly, in the case of the area under the curve (AUC), the
top model achieved a score of 96.2%, while the bottom model scored 76.0%, marking a
significant difference of 20.2%. These conspicuous differences underscore the pivotal role of choosing appropriate machine learning techniques and feature sets in the effectiveness of colorectal cancer detection. Effective cancer detection has far-reaching implications, influencing not only patient outcomes but also the operational efficiency of healthcare systems and the allocation of valuable medical resources.

Figure 10. Metrics comparison for the prediction of colorectal cancer.

3.2. Analysis of Gastric Cancer Prediction

Table 4 meticulously encapsulates 16 distinct studies conducted within the temporal frame of 2018 to 2023, each ardently devoted to machine learning-based gastric cancer detection. These investigations collectively underscore the pivotal role of preprocessing in elevating the accuracy of stomach cancer detection models. Notably, the pinnacle of achievement in this realm reached a remarkable 100.0% accuracy, whereas the lowest point stood at 71.2%. This diverse spectrum of performance underscores the profound influence of preprocessing techniques, spanning resizing, filtering, cropping, and color enhancement. These preprocessing strategies, in harmony with segmentation, feature
extraction, and the adept utilization of machine learning algorithms encompassing SVM,
MLP, RF, and KNN, have collectively converged to engender a triumphant era of stomach
cancer detection. This progress extends across diverse modalities such as endoscopy, CT,
MRI, and histopathology images. The quantity of images, patients, or slices underpinning
these studies spanned a substantial range, from 30 to a staggering 245,196. It is intriguing
to note that the enigmatic “Private” dataset emerged as the most recurrently harnessed
resource in this insightful analysis.
The research conducted by (Ayyaz et al., 2022) [64] achieved outstanding results in
stomach cancer detection, with a remarkable accuracy of 99.80%. They employed var-
ious preprocessing techniques, including resizing, contrast enhancement, binarization,
and filtering. However, the segmentation method used was not specified in the study.
Feature extraction was carried out with deep learning models like VGG19 and AlexNet.
For classification, they used multiple techniques such as DT, NB, KNN, SVM, and more.
Among these, the cubic SVM model performed the best, achieving an accuracy of 99.80%.
This model also had a high sensitivity, precision, F1-score, and an AUC of 100.0%. On
the other hand, the study conducted by (Mirniaharikandehei et al., 2021) [65] achieved
comparatively lower performance in stomach cancer detection, with an accuracy of 71.20%.
Their preprocessing techniques involved filtering and ROI selection, and they utilized the
HTS segmentation method. Feature extraction was done using radiomics features such as
GLRLM, GLDM, and WT LoG. The classification was carried out using various machine
learning models, including SVM, LR, RF, DT, and GBM. The worst-performing model in
their analysis was GBM, with an accuracy of 71.20%. This model had lower sensitivity but a
higher specificity, precision, and F1-score. (Hu et al., 2022) [66] conducted a stomach cancer
detection study with a large dataset of 245,196 medical images. They used various prepro-
cessing techniques, including ROI selection, cropping, filtering, rotation, and disruption.
The study extracted features such as color histograms, LBP, and GLCM. For classifica-
tion, they applied RF and LSVM classifiers, achieving an accuracy of 85.99%. RF was the
best-performing model in their analysis. On the other hand, (Naser and Zeki 2021) [67]
conducted a stomach cancer detection study with a smaller dataset of only 30 medical
images. They applied DIFQ-based preprocessing techniques, and their study used FCM for
classification and achieved an accuracy of 85.00%. Table 4 provides an overview of different
machine learning-based techniques for stomach (gastric) cancer detection, encompassing
16 reviewed studies. Notably, three of these studies specifically, namely, (Korkmaz and
Esmeray 2018) [68], (Nayyar et al., 2021) [69], and (Hu et al., 2022a) [70], opted not to
employ any preprocessing techniques. Surprisingly, they achieved noteworthy accuracies
of 87.77%, 99.8%, and 85.24%, respectively. This demonstrates the potential for effective
stomach cancer detection even in the absence of preprocessing methods. However, it is
essential to highlight that a significant portion of the studies examined in the table chose
to implement various preprocessing techniques, including CEI, filtering, resizing, Fourier
transform, cropping, ROI selection, rotation, disruption, binarization, augmentation, and
RSA. These preprocessing steps underscore their pivotal role in enhancing the performance
of machine learning models for stomach cancer detection.
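As an illustration of two of the preprocessing steps named above, the following minimal Python sketch applies min-max contrast stretching followed by fixed-threshold binarization to a grayscale patch represented as a nested list. It is a simplified illustration rather than any reviewed study's implementation, and the threshold of 128 is an assumed value:

```python
def contrast_stretch(img):
    """Linearly rescale pixel intensities to the full 0-255 range."""
    flat = [p for row in img for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[0 for _ in row] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]

def binarize(img, threshold=128):
    """Map each pixel to 1 (foreground) or 0 (background)."""
    return [[1 if p >= threshold else 0 for p in row] for row in img]

# A tiny 2x3 grayscale patch
patch = [[2, 4, 6],
         [6, 4, 2]]
stretched = contrast_stretch(patch)   # [[0, 127, 255], [255, 127, 0]]
mask = binarize(stretched)            # [[0, 0, 1], [1, 0, 0]]
```

In practice, libraries such as OpenCV or scikit-image provide optimized equivalents of these operations.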
Out of the 16 studies focused on gastric cancer detection, half (8 studies) achieved an
accuracy rate of over 90%, indicating highly accurate results, while the other half achieved
less than 90% accuracy. This discrepancy in performance
might be attributed to the utilization of private datasets in these studies. Private datasets
may not undergo the same level of processing or standardization as publicly available
datasets, potentially leading to variations in data quality and affecting the performance of
the machine learning models.

Table 4. Performance comparison of traditional ML-based gastric cancer prediction methods.

Year References Preprocessing Features Techniques Dataset Data Samples Train Data Test Data Modality Metrics (%)
2018 [71] Fourier BRISK, SURF, DT, DA Private 180 90 90 H&E Acc 86.7
transform MSER
2018 [72] Resizing LBP, HOG ANN, RF Private 180 90 90 H&E Acc 100.0

2018 [68] --- SURF, DFT NB Private 180 90 90 H&E Acc 87.8
Private 720 360 360 H&E Acc 90.3
2018 [73] CEI, filtering, GLCM SVM Private 207 126 81 NBI Acc 96.3
resizing Sen 96.7
Spe 95.0
Pre 98.3
2019 [74] Resizing, GLCM, Shape, SVM Private 490 326 164 CT Acc 71.3
cropping FOF, GLSZM Sen 72.6
Spe 68.1
Pre 82.0
NPV 50.0
2021 [67] DIFQ SMI FCM, Private 30 --- --- MRI Acc 85.0
KMC
2021 [75] Resizing Extract HOG RF, MLP Private 180 90 90 H&E Acc 98.1
2021 [76] Resizing TSS BP, Private 78 --- --- MRI Acc 94.6
BPSVM,
SVM
2021 [69] --- Deep Features CSVM, Private 4000 2800 1200 WCE Acc 99.8
Bagged Sen 99.0
Trees, Pre 99.3
KNNs, F1 99.1
SVMs AUC 100
2021 [65] Filtering, ROI LoG, WT, GBM, DT, Private 159 Leave-One-Out CT Acc 71.2
GLDM, RF, LR, cross-validation Sen 43.1
GLRLM SVM. Spe 87.1
Pre 65.8
2022 [77] Augmentation, InceptionNet, SVM, RF, HKD 10,662 37,788 9610 Endoscopy Acc 98.0
resizing, VGGNet KNN (47,398 Sen 100
filtering augmented) Pre 100
F1 100
MCC 97.8
2022 [70] --- GLCM, LBP, NSVM, GasHisSDB 245,196 196,157 49,039 H&E Acc 85.2
HOG, LSVM, Sen 84.9 #
histogram, LR, NB, Pre 84.6 #
luminance, RF, ANN, Spe 84.9 #
Color KNN F1 84.8 #
histogram
2022 [64] Binarization, VGG19 Bagged Private 2590 10-fold EUS Acc 99.8
CEI, filtering, Alexnet Tree, cross-validation Sen 99.8
resizing Coarse Pre 99.8
Tree, F1 99.8
CSVM, AUC 100
CKNN,
DT, Fine
Tree,
KNN, NB
2022 [66] Cropping, Color LSVM, GasHisSDB 245,196 196,157 49,039 H&E Acc 85.9
disruption, histogram, RF Sen 86.2 #
filtering, ROI, GLCM, LBP Spe 86.2 #
Rotation Pre 85.7 #
F1 85.9 #
2023 [78] Augmentation, MobileNet- Bayesian, KV2D 4854 10-fold Endoscopy Acc 96.4
CEI V2 CSVM, cross-validation Pre 97.6
LSVM, Sen 93.0
QSVM, F1 95.2
Softmax
2023 [79] RSA RSF PLS-DA, Private 450 Leave-One-Out H&E Acc 94.8
LOO, cross validation Sen 91.0
SVM Spe 100
AUC 95.8
# Calculated by averaging the normal and abnormal classes; techniques in bold represent the best model.
Abbreviations: BPSVM, Binary Robust Invariant Scalable Keypoints; BRISK, Binary Robust Invariant Scalable
Keypoints; CKNN, Cosine K-Nearest Neighbor; CSVM, Cubic SVM; DA, Discriminant Analysis; DIFQ, Dividing
an image into four quarters; FCM, Fuzzy C-Means; GGF, Global Graph Features; HOG, Histogram of Oriented
Gradients; HTSS, Hybrid Tumor Segmentation; KMC, K-Means Clustering; LOO, Leave-One-Out; LSVM, Linear
Support Vector Machine; MSER, Maximally Stable Extremal Regions; NSVM, Non-Linear Support Vector Machine;
OAT, Otsu Adaptive Thresholding; PLS-DA, Partial Least-Squares Discriminant Analysis; QSVM, Quadratic
SVM; RSA, Raman Spectral Analysis; RSF, Raman Spectral Feature; SM, Seven Moments Invariants; SMI, Seven
Moments Invariants; SURF, Speeded Up Robust Features; TSS, Tumor Scattered Signal.
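The footnote's averaging over the normal and abnormal classes amounts to macro-averaging of per-class metrics. The short sketch below illustrates this for recall, using invented counts rather than values from any study in Table 4:

```python
def per_class_recall(tp, fn, tn, fp):
    """Recall of the abnormal (positive) and normal (negative) classes."""
    recall_abnormal = tp / (tp + fn)   # sensitivity
    recall_normal = tn / (tn + fp)     # specificity
    return recall_abnormal, recall_normal

def macro_average(values):
    """Unweighted mean over per-class values (macro-averaging)."""
    return sum(values) / len(values)

# Illustrative counts only
r_abn, r_norm = per_class_recall(tp=90, fn=10, tn=80, fp=20)
avg = macro_average([r_abn, r_norm])   # (0.9 + 0.8) / 2 = 0.85
```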

The analysis of gastric cancer detection reveals substantial variations in model perfor-
mance across key metrics, with significant differences observed between the highest and
lowest values as shown in Figure 11. Accuracy (Acc) showcased a noteworthy contrast,
with the best-performing model achieving a flawless 100.00% and the least effective model
scoring 71.20%. This substantial 28.80% difference underscores the pivotal role of model
selection in achieving accurate gastric cancer detection. Sensitivity (Sen) displayed a con-
siderable gap, with the top model achieving a perfect 100.00%, while the lowest model only
reached 43.10%. This marked difference of 56.90% emphasizes the necessity of sensitive
detection techniques in identifying gastric cancer. Similarly, specificity (Spe) followed suit,
with the highest model reaching 100.00% and the lowest model achieving 68.10%. The
substantial 31.90% difference highlights the importance of correctly identifying non-cancer
cases in diagnostic accuracy. Precision (Pre) also exhibited a significant disparity, with
the best model achieving 100.00%, and the least effective model achieving 65.80%. The
difference of 34.20% underscores the significance of precise identification of gastric cancer
cases. It is noteworthy that the negative predictive value (NPV) remained constant at
50.00% for both the highest and lowest models, signifying that neither model excelled in
ruling out non-cancer cases. However, since NPV is only used in a single article, its impact
on the overall analysis may be limited.
Additionally, the F1-score showed a substantial difference, with the top model achiev-
ing a perfect 100.00%, while the lowest model reached 84.80%. The 15.20% difference
emphasizes the balance between precision and sensitivity in gastric cancer detection. Lastly,
in terms of the area under the curve (AUC), the best model achieved a near-perfect 100.00%,
while the lowest model attained a still impressive 95.80%. The modest 4.20% difference
indicates that both models performed well in distinguishing between gastric cancer and
non-cancer cases. It is also worth noting that the area under the curve (AUC) metric was
utilized in only three articles, and the differences in AUC were relatively modest. Therefore,
the impact of AUC on the overall analysis may be less generalized. These findings
underscore the critical role of model choice and feature selection in the effective detection
of gastric cancer. Accurate and sensitive diagnostic tools are crucial for improving patient
outcomes and optimizing healthcare resources. While NPV and AUC may have a limited
impact in this context due to their restricted usage, the other metrics highlight the
significance of selecting appropriate models for reliable gastric cancer detection.

Figure 11. Metrics comparison for the prediction of gastric cancer (bar chart of the highest value,
lowest value, and their difference for Acc, Sen, Spe, Pre, NPV, F1-score, and AUC).
4. Proposed Methodology
In this section, we delineate our proposed methodology for the detection of colorectal
and gastric cancer through the application of traditional machine learning techniques.
These approaches were crafted from the insights and observations gleaned from the
comprehensive review tables. Our primary goal is to introduce an optimized approach,
accompanied by the most suitable parameters, in order to attain superior results: an
efficient, effective, automated, and highly precise technique for the detection of colorectal
and gastric cancer.
4.1. Detection of Colorectal Cancer
Figure 12 is a comprehensive visualization of the architectural framework that underpins
our proposed model for the detection of colorectal cancer. This blueprint draws its
inspiration from the wealth of insights extracted from Table 3, which provides a
foundational understanding of the methodologies that have proven effective in this domain.
While we have opted to use the H&E modality as an illustrative example, it is imperative
to recognize that our model can seamlessly accommodate other modalities. This flexibility
is a testament to the adaptability and robustness of our approach, as it allows for the
incorporation of diverse data sources to enrich the depth and scope of our analysis. At the
crux of our methodology lies the preprocessing phase, an instrumental step that sets the
stage for the rigorous examination of input images. Within this phase, we meticulously
execute four pivotal steps: image enhancement, pixel enhancement, RGB-to-gray conversion,
and image segmentation. These sequential operations are not arbitrary but have been
thoughtfully selected and implemented to systematically prepare the input images.
Their collective objective is to optimize the images, ensuring they are in a suitable form
for efficient feature extraction and subsequent in-depth analysis. The realm of feature
engineering is where our approach truly shines. Here, we introduce an innovative and
nuanced strategy. Instead of relying solely on one type of feature, we merge two distinct
categories: deep learning-based features, which are often referred to as “deep features”, and
a varied assortment of other features. This assortment includes Discrete Wavelet Transform
(DWT), Gray Level Co-occurrence Matrix (GLCM), Local Binary Pattern (LBP), Texture,
and Gray Level Size Zone Matrix (GLSZM). The fusion of these diverse feature sets is not a
random choice but a deliberate effort to enhance the robustness and comprehensiveness
of our analysis. This fusion is designed to ensure that our model captures both the intri-
cate, high-level representations obtained through deep learning and handcrafted features
meticulously tailored to highlight specific aspects of tumor characteristics. By incorporat-
ing these different types of features, our model becomes versatile, capable of effectively
identifying patterns and characteristics in the data that may not be discernible when using
only one type of feature. By executing this innovative approach, we aim to enhance the
model’s ability to interpret and understand the complex information contained within
medical images. This, in turn, contributes to the accuracy and efficiency of colorectal cancer
detection. Furthermore, it enables our model to adapt and excel in different scenarios and
datasets, making it a powerful tool for healthcare professionals and researchers working in
the field of cancer detection.
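As a concrete, simplified illustration of this feature fusion, the sketch below builds a toy co-occurrence (GLCM-style) count matrix over horizontally adjacent gray levels, normalizes and flattens it, and concatenates it with a placeholder deep-feature vector. In a real pipeline the deep features would come from a pretrained CNN and the handcrafted features from full GLCM/LBP/DWT implementations; both are assumptions here:

```python
def glcm_horizontal(img, levels):
    """Co-occurrence counts of horizontally adjacent gray levels."""
    counts = [[0] * levels for _ in range(levels)]
    for row in img:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
    return counts

def fuse_features(deep_vec, img, levels=4):
    """Concatenate deep features with a normalized, flattened co-occurrence matrix."""
    counts = glcm_horizontal(img, levels)
    flat = [c for row in counts for c in row]
    total = sum(flat) or 1
    handcrafted = [c / total for c in flat]   # normalize counts to probabilities
    return list(deep_vec) + handcrafted

# A quantized 2x3 patch with three gray levels
patch = [[0, 1, 1],
         [1, 2, 2]]
deep = [0.12, 0.55, 0.33]            # placeholder CNN embedding
fused = fuse_features(deep, patch, levels=3)
# len(fused) == 3 + 3*3 == 12
```

The fused vector is what the later selection and classification stages would consume; concatenation keeps the two feature families independent so that either can dominate where it is most informative.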

Figure 12. Proposed architectural flow diagram for the detection of colorectal cancer using
traditional machine learning models from an imaging database.
The combination of these diverse features enhances the model’s capability to en-
compass both intricate, high-level representations acquired through deep learning and
meticulously tailored handcrafted features that accentuate distinct tumor characteristics.
Moving forward in the workflow, we encounter the crucial stages of feature selection and
optimization. This pivotal process serves a dual role: it reduces feature redundancy while
enhancing the overall model performance by focusing on the most distinctive attributes.
Our model evaluation process is underpinned by a rigorous data-partitioning strategy,
effectively splitting the dataset into training and testing subsets. The training dataset
undergoes additional scrutiny through a k-fold cross-validation approach, fortifying the
model’s training and facilitating a robust performance assessment. This approach not
only guards against overfitting but also assesses the model’s adaptability to various data
scenarios. The test dataset becomes the arena for predicting colorectal cancer, with the
cubic support vector machine (SVM) taking the lead in this classification task. The SVM is a
formidable presence among traditional machine learning classifiers, known for its prowess
in handling high-dimensional data and executing binary classification tasks, making it
ideally suited for the intricacies of cancer detection. In summary, our proposed model
architecture harmoniously integrates advanced image preprocessing techniques, innovative
feature-engineering methodologies, and the proven machinery of a traditional machine
learning classifier. This synthesis yields an efficient and accurate framework for colorectal
cancer detection. Pending further validation and testing on diverse datasets, this approach
has the potential to revolutionize early cancer detection and diagnosis, potentially leading
to improved patient outcomes and a transformation in healthcare effectiveness.
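The data-partitioning strategy described above can be sketched in a few lines. The following pure-Python illustration generates k-fold train/validation index splits; in practice a library routine such as scikit-learn's `KFold` would typically be used, and the cubic SVM itself is not reimplemented here:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) index pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Distribute any remainder across the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
# 5 folds; each validation fold holds 2 samples and each training set 8
```

Each sample appears in exactly one validation fold, so averaging the per-fold scores gives the cross-validated performance estimate referred to above; the held-out test set stays untouched until the final evaluation.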

4.2. Detection of Gastric Cancer


The system architecture flow diagram, as depicted in Figure 13, outlines our compre-
hensive and adaptable approach to stomach cancer (Gastric) detection employing tradi-
tional machine learning classifiers. Informed by the top-performing models scrutinized in
Table 4, our proposed architecture is intentionally crafted to accommodate both endoscopy
video datasets, which have gained prominence in recent years, and static image datasets.
Initiating with endoscopy video datasets as the primary data source, our architecture
seamlessly extends its capabilities to image datasets by extracting individual frames from
the video sequences. Subsequently, these extracted frames undergo preprocessing, which
encompasses various techniques such as noise reduction, RGB-to-grayscale conversion,
or other pertinent methods contingent on the specific application and dataset attributes.
Acknowledging the potential constraint of limited video datasets, we introduce data aug-
mentation techniques as part of our solution. This augmentation process generates an
ample supply of augmented image datasets, enabling the model to undergo training on a
more diverse and representative set of samples. This augmentation strategy empowers the
model to generalize better, ultimately leading to enhanced performance outcomes. Moving
into the feature extraction phase, we advocate the simultaneous use of deep features and
texture-based features. Deep features are sourced from state-of-the-art deep learning mod-
els, while texture-based features encompass attributes like GLCM, GLRLM, and GLSZM,
harnessed through conventional feature extraction methods. This fusion of diverse feature
types ensures that the model possesses the capability to encapsulate both abstract high-level
representations and the specific characteristics embedded in the stomach cancer data.
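A minimal sketch of this augmentation step, operating on a decoded frame represented as a nested list (a real pipeline would use an imaging library on actual video frames), might look as follows:

```python
def flip_horizontal(frame):
    """Mirror a frame left-to-right."""
    return [row[::-1] for row in frame]

def rotate_90_cw(frame):
    """Rotate a frame 90 degrees clockwise."""
    n = len(frame)
    return [[frame[n - 1 - j][i] for j in range(n)] for i in range(len(frame[0]))]

def augment(frame):
    """Return the original frame plus two augmented variants."""
    return [frame, flip_horizontal(frame), rotate_90_cw(frame)]

frame = [[1, 2],
         [3, 4]]
variants = augment(frame)
# flip_horizontal(frame) == [[2, 1], [4, 3]]
# rotate_90_cw(frame)    == [[3, 1], [4, 2]]
```

Because these transformations preserve the diagnostic label, each extracted frame yields several training samples, which is the mechanism by which augmentation enlarges a limited video-derived dataset.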
Upon the amalgamation of these features, the subsequent step in our approach in-
volves feature optimization. Here, we employ well-suited algorithms to meticulously select
the most pertinent attributes among the fused features. This optimization process serves
a dual function: firstly, it mitigates the peril of overfitting, a common pitfall in machine
learning endeavors, and secondly, it bolsters the overall efficiency of the model. The care-
fully curated selection of features enhances the model’s capacity to discriminate between
different classes, resulting in improved classification accuracy. Following the optimization
phase, the dataset undergoes a deliberate partitioning into two distinct subsets: the training
set and the testing set. This partitioning is a strategic maneuver that ensures the robust
training and rigorous evaluation of traditional machine learning classifiers. The distribution
of the dataset is thoughtfully orchestrated to prevent any data leakage and to create a
reliable foundation for our model's assessment. Depending on the specific nature of the
classification task and the unique requirements of the application, we employ a range of
classifiers known for their effectiveness in various scenarios. These include, but are not
limited to, support vector machines (SVM), random forest (RF), logistic regression (LR),
backpropagation neural networks (BPNN), and artificial neural networks (ANN). Each of
these classifiers is chosen judiciously to cater to the specific characteristics of the dataset
and the intricacies of the task at hand, and they excel in categorizing stomach cancer into
distinct types, thereby providing valuable insights essential for accurate diagnosis and
tailored treatment. A standout feature of our proposed system architecture is its inherent
adaptability. This architectural flexibility empowers the system to seamlessly accommodate
both image and video datasets, rendering it versatile and suitable for a wide spectrum of
applications. By harnessing the capabilities of traditional machine learning methods and
integrating the novel approaches of feature fusion and optimization, our system architecture
exhibits substantial potential for delivering heightened efficiency and accuracy in the realm
of stomach cancer detection. Nonetheless, it is imperative to emphasize the essentiality of
conducting further validation and in-depth evaluation of our system's performance.

Figure 13. Proposed architectural flow diagram for the detection of stomach cancer using
traditional machine learning models from an imaging dataset.
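The feature-optimization step outlined in this section can be illustrated with a simple filter method: rank the fused features by variance over the training samples and keep the top k. This is only one of many possible selection criteria, and the reviewed studies do not prescribe this particular algorithm:

```python
def variance(column):
    """Population variance of one feature column."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)

def select_top_k_features(samples, k):
    """Keep the indices (and values) of the k highest-variance feature columns."""
    n_features = len(samples[0])
    columns = [[s[i] for s in samples] for i in range(n_features)]
    ranked = sorted(range(n_features), key=lambda i: variance(columns[i]), reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[s[i] for i in keep] for s in samples]

# Three samples with four features; feature 1 is constant, so it carries no information
X = [[0.0, 5.0, 1.0, 10.0],
     [1.0, 5.0, 3.0, 20.0],
     [2.0, 5.0, 5.0, 30.0]]
kept, X_reduced = select_top_k_features(X, k=2)
# kept == [2, 3]; each reduced sample retains only those two columns
```

Discarding near-constant or redundant columns in this way is one route to the dual benefit described above: a smaller hypothesis space (less overfitting) and faster training.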
4.3. Key Observations


The comprehensive assessment of colorectal and gastric cancer detection techniques
using traditional machine learning methods and medical image datasets has revealed
several key insights:
Dataset Diversity: The evaluation includes colorectal and gastric cancer datasets ranging
from 30 to 245,196 images. The varied dataset sizes showcase the effectiveness of machine
learning classifiers given appropriate tuning.
Exceptional Model Performances: Models achieve 100% accuracy for both colorectal and
gastric cancer, with perfect scores in key metrics like sensitivity, specificity, precision, and
F1-score, showcasing the potential of traditional ML classifiers with optimal parameters.
Preprocessing Techniques: Researchers employ various preprocessing techniques, includ-
ing image filtering, denoising, wavelet transforms, RGB-to-gray conversion, normalization,
cropping (ROI), sampling, and binarization, to optimize model performance and minimize
biases during data manipulation.
Literature Review Significance: This analysis spans 36 literature sources related to col-
orectal and gastric cancer, underscoring the significant interest in cancer detection through
traditional ML classifiers. Researchers have explored an extensive range of cancer types,
diverse evaluation metrics, and datasets, collectively advancing the field.
Dominant Traditional ML Techniques: SVM is a commonly used traditional ML classifier
in cancer detection tasks, emphasizing the need to understand each classifier’s strengths
and limitations for optimal selection.
Insightful Dataset and Feature Analysis: Reviewed studies predominantly utilized bench-
mark medical image datasets, with researchers employing feature extraction techniques
like GLCM for informative feature extraction in cancer detection.
Prudent Model Architecture Design: Optimal results in cancer detection require thought-
ful and optimized model architectures, which can enhance accuracy, generalizability, and
interpretability, addressing challenges in medical image analysis.

4.4. Key Challenges and Future Scope


Traditional ML classifiers have shown remarkable potential in cancer detection. How-
ever, several challenges and the future scope in their application have been identified:
Variability in Accuracy: Traditional ML classifiers exhibit variable accuracy rates across
cancer types, ranging from 76% to 100%. Overcoming these variations poses a challenge,
underscoring the need for enhanced models. Future research should prioritize refining
models for consistent and accurate performance across diverse cancer types.
Metric Disparities: Metric variations, especially in sensitivity (43.1% to 100%) for gastric
cancer, suggest potential data imbalance challenges. Addressing these issues is crucial
for accurate model assessments. Future research should focus on developing strategies to
handle imbalanced data and improve model robustness.
Preprocessing Challenges: Balancing raw and preprocessed data is crucial to ensure input
data quality and reliability, contributing to robust cancer detection model performance.
Future research should explore advanced preprocessing techniques and optimization
methods to further enhance model robustness.
Limited Use of Evaluation Metrics: The sparse use of metrics like NPV, AUC, and MCC
in the reviewed literature highlights the challenge of comprehensive model assessment.
Addressing this limitation and exploring a broader range of metrics is crucial for future
research to enhance understanding and effectiveness in cancer detection tasks.
Generalizing to Novel Cancer Types: The literature primarily focuses on colorectal and
gastric cancers, posing a challenge for extending traditional ML classifiers to less-explored
cancer types. Future research should aim to develop versatile ML models with robust
feature extraction techniques to adapt to diverse cancer types and domains.
Addressing Overfitting and Model Selection: The diversity in ML classifiers poses chal-
lenges in model selection for specific cancers, emphasizing the need for careful evaluation
to avoid overfitting. Future research should focus on refining model selection strategies to
enhance the robustness of cancer detection techniques and improve diagnostic accuracy.

5. Conclusions
In this manuscript, a thorough review and analysis of colorectal and gastric cancer de-
tection using traditional machine learning techniques are presented. We have meticulously
scrutinized 36 research papers published between 2017 and 2023, specifically focusing on
the domain of medical imaging datasets for detecting these types of cancers. Mathematical
formulations elucidating frequently employed preprocessing techniques, feature extraction
methods, traditional machine learning classifiers, and assessment metrics are provided.
These formulations offer valuable guidance to researchers when selecting the most suitable
techniques for their cancer detection studies. To conduct this analysis, a range of criteria
such as publication year, preprocessing methods, dataset particulars, image quantities,
modality, techniques, best models, and metrics (%) were considered. An extensive array
of metrics was employed to evaluate model performance comprehensively. Notably, the
study delves into the highest and lowest metric values and their disparities, highlighting
opportunities for enhancement. Remarkably, we found that the highest achievable value for
all metrics reached an astonishing 100%, with gastric cancer detection registering the lowest
sensitivity at 43.10%. This underscores the potential of traditional ML classifiers, while
indicating areas for further refinement. Drawing from these insights, we present a proposed
(optimized) methodology for both colorectal and gastric cancer detection, aiding in the
selection of an optimized approach for future cancer detection research. The manuscript
concludes by delineating key findings and challenges that offer valuable directions for
future research endeavors.
In our future research endeavors, we plan to implement the proposed optimized
methodology for the detection of colorectal and gastric cancer within the specified exper-
imental framework. This proactive approach aligns with our commitment to enhancing
the effectiveness of cancer detection methodologies. Furthermore, we will conscientiously
incorporate and address the challenges and limitations identified in this study, ensuring a
comprehensive and iterative improvement in our investigative efforts.

Author Contributions: Original Draft Preparation: H.M.R.; Review and Editing: H.M.R.; Visualiza-
tion: H.M.R.; Supervision: J.Y.; Project Administration: J.Y.; Funding Acquisition: J.Y. All authors
have read and agreed to the published version of the manuscript.
Funding: This work was supported by the National Research Foundation of Korea (NRF) Grant
funded by the Korea government (MSIT) (NRF-2021R1F1A1063640).
Data Availability Statement: Data sharing is not applicable to this article as no datasets were
generated or analyzed during the current study.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Faguet, G.B. A brief history of cancer: Age-old milestones underlying our current knowledge database. Int. J. Cancer 2014,
136, 2022–2036. [CrossRef] [PubMed]
2. Afrash, M.R.; Shafiee, M.; Kazemi-Arpanahi, H. Establishing machine learning models to predict the early risk of gastric cancer
based on lifestyle factors. BMC Gastroenterol. 2023, 23, 6. [CrossRef] [PubMed]
3. Kumar, Y.; Gupta, S.; Singla, R.; Hu, Y.-C. A systematic review of artificial intelligence techniques in cancer prediction and
diagnosis. Arch. Comput. Methods Eng. 2021, 29, 2043–2070. [CrossRef] [PubMed]
4. Nguon, L.S.; Seo, K.; Lim, J.-H.; Song, T.-J.; Cho, S.-H.; Park, J.-S.; Park, S. Deep learning-based differentiation between mucinous
cystic neoplasm and serous cystic neoplasm in the pancreas using endoscopic ultrasonography. Diagnostics 2021, 11, 1052.
[CrossRef] [PubMed]
5. Kim, S.H.; Hong, S.J. Current status of image-enhanced endoscopy for early identification of esophageal neoplasms. Clin. Endosc.
2021, 54, 464–476. [CrossRef] [PubMed]
6. NCI. What Is Cancer?—NCI. National Cancer Institute. Available online: https://www.cancer.gov/about-cancer/understanding/
what-is-cancer (accessed on 9 June 2023).
Mathematics 2023, 11, 4937 38 of 40

7. Zhi, J.; Sun, J.; Wang, Z.; Ding, W. Support vector machine classifier for prediction of the metastasis of colorectal cancer. Int. J.
Mol. Med. 2018, 41, 1419–1426. [CrossRef] [PubMed]
8. Zhou, H.; Dong, D.; Chen, B.; Fang, M.; Cheng, Y.; Gan, Y.; Zhang, R.; Zhang, L.; Zang, Y.; Liu, Z.; et al. Diagnosis of Distant
Metastasis of Lung Cancer: Based on Clinical and Radiomic Features. Transl. Oncol. 2017, 11, 31–36. [CrossRef] [PubMed]
9. Levine, A.B.; Schlosser, C.; Grewal, J.; Coope, R.; Jones, S.J.; Yip, S. Rise of the Machines: Advances in Deep Learning for Cancer
Diagnosis. Trends Cancer 2019, 5, 157–169. [CrossRef] [PubMed]
10. Huang, S.; Yang, J.; Fong, S.; Zhao, Q. Artificial intelligence in cancer diagnosis and prognosis: Opportunities and challenges.
Cancer Lett. 2019, 471, 61–71. [CrossRef]
11. Saba, T. Recent advancement in cancer detection using machine learning: Systematic survey of decades, comparisons and
challenges. J. Infect. Public Health 2020, 13, 1274–1289. [CrossRef]
12. Shah, B.; Alsadoon, A.; Prasad, P.; Al-Naymat, G.; Beg, A. DPV: A taxonomy for utilizing deep learning as a prediction technique
for various types of cancers detection. Multimed. Tools Appl. 2021, 80, 21339–21361. [CrossRef]
13. Majumder, A.; Sen, D. Artificial intelligence in cancer diagnostics and therapy: Current perspectives. Indian J. Cancer 2021,
58, 481–492. [CrossRef] [PubMed]
14. Bin Tufail, A.; Ma, Y.-K.; Kaabar, M.K.A.; Martínez, F.; Junejo, A.R.; Ullah, I.; Khan, R. Deep Learning in Cancer Diagnosis and
Prognosis Prediction: A Minireview on Challenges, Recent Trends, and Future Directions. Comput. Math. Methods Med. 2021,
2021, 9025470. [CrossRef] [PubMed]
15. Kumar, G.; Alqahtani, H. Deep Learning-Based Cancer Detection-Recent Developments, Trend and Challenges. Comput. Model.
Eng. Sci. 2022, 130, 1271–1307. [CrossRef]
16. Painuli, D.; Bhardwaj, S.; Köse, U. Recent advancement in cancer diagnosis using machine learning and deep learning techniques:
A comprehensive review. Comput. Biol. Med. 2022, 146, 105580. [CrossRef] [PubMed]
17. Rai, H.M. Cancer detection and segmentation using machine learning and deep learning techniques: A review. Multimed. Tools
Appl. 2023, 1–35. [CrossRef]
18. Maurya, S.; Tiwari, S.; Mothukuri, M.C.; Tangeda, C.M.; Nandigam, R.N.S.; Addagiri, D.C. A review on recent developments in
cancer detection using Machine Learning and Deep Learning models. Biomed. Signal Process. Control. 2023, 80, 104398. [CrossRef]
19. Mokoatle, M.; Marivate, V.; Mapiye, D.; Bornman, R.; Hayes, V.M. A review and comparative study of cancer detection using
machine learning: SBERT and SimCSE application. BMC Bioinform. 2023, 24, 112. [CrossRef]
20. Rai, H.M.; Yoo, J. A comprehensive analysis of recent advancements in cancer detection using machine learning and deep learning
models for improved diagnostics. J. Cancer Res. Clin. Oncol. 2023, 149, 14365–14408. [CrossRef]
21. Ullah, A.; Chen, W.; Khan, M.A. A new variational approach for restoring images with multiplicative noise. Comput. Math. Appl.
2016, 71, 2034–2050. [CrossRef]
22. Azmi, K.Z.M.; Ghani, A.S.A.; Yusof, Z.M.; Ibrahim, Z. Natural-based underwater image color enhancement through fusion of
swarm-intelligence algorithm. Appl. Soft Comput. 2019, 85, 105810. [CrossRef]
23. Alruwaili, M.; Gupta, L. A statistical adaptive algorithm for dust image enhancement and restoration. In Proceedings of
the 2015 IEEE International Conference on Electro/Information Technology (EIT), Dekalb, IL, USA, 21–23 May 2015; IEEE:
Piscataway, NJ, USA, 2015; pp. 286–289.
24. Cai, J.-H.; He, Y.; Zhong, X.-L.; Lei, H.; Wang, F.; Luo, G.-H.; Zhao, H.; Liu, J.-C. Magnetic Resonance Texture Analysis in
Alzheimer’s disease. Acad. Radiol. 2020, 27, 1774–1783. [CrossRef]
25. Chandrasekhara, S.P.R.; Kabadi, M.G.; Srivinay, S. Wearable IoT based diagnosis of prostate cancer using GLCM-multiclass SVM
and SIFT-multiclass SVM feature extraction strategies. Int. J. Pervasive Comput. Commun. 2021. ahead-of-print. [CrossRef]
26. Alqudah, A.M.; Alqudah, A. Improving machine learning recognition of colorectal cancer using 3D GLCM applied to different
color spaces. Multimed. Tools Appl. 2022, 81, 10839–10860. [CrossRef]
27. Vallabhaneni, R.B.; Rajesh, V. Brain tumour detection using mean shift clustering and GLCM features with edge adaptive total
variation denoising technique. Alex. Eng. J. 2018, 57, 2387–2392. [CrossRef]
28. Rego, C.H.Q.; França-Silva, F.; Gomes-Junior, F.G.; de Moraes, M.H.D.; de Medeiros, A.D.; da Silva, C.B. Using Multispectral
Imaging for Detecting Seed-Borne Fungi in Cowpea. Agriculture 2020, 10, 361. [CrossRef]
29. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27. [CrossRef]
30. Callen, J.L.; Segal, D. An Analytical and Empirical Measure of the Degree of Conditional Conservatism. J. Account. Audit. Financ.
2013, 28, 215–242. [CrossRef]
31. Weinberger, K. Lecture 2: K-Nearest Neighbors. Available online: https://www.cs.cornell.edu/courses/cs4780/2017sp/lectures/
lecturenote02_kNN.html (accessed on 12 November 2023).
32. Weinberger, K. Lecture 3: The Perceptron. Available online: https://www.cs.cornell.edu/courses/cs4780/2017sp/lectures/
lecturenote03.html (accessed on 12 November 2023).
33. Watt, J.; Borhani, R.; Katsaggelos, A.K. Machine Learning Refined; Cambridge University Press (CUP): Cambridge, UK, 2020;
ISBN 9781107123526.
34. Watt, R.B.J. 13.1 Multi-Layer Perceptrons (MLPs). Available online: https://kenndanielso.github.io/mlrefined/blog_posts/13
_Multilayer_perceptrons/13_1_Multi_layer_perceptrons.html (accessed on 12 November 2023).
35. Weinberger, K. Lecture 9: SVM. Available online: https://www.cs.cornell.edu/courses/cs4780/2017sp/lectures/lecturenote09.
html (accessed on 13 November 2023).
36. Balas, V.E.; Mastorakis, N.E.; Popescu, M.-C.; Balas, V.E. Multilayer Perceptron and Neural Networks. 2009. Available online:
https://www.researchgate.net/publication/228340819 (accessed on 18 September 2023).
37. Murphy, K.P. Machine Learning: A Probabilistic Perspective; MIT Press: Cambridge, MA, USA, 2012.
38. Islam, U.; Al-Atawi, A.; Alwageed, H.S.; Ahsan, M.; Awwad, F.A.; Abonazel, M.R. Real-Time Detection Schemes for Memory DoS
(M-DoS) Attacks on Cloud Computing Applications. IEEE Access 2023, 11, 74641–74656. [CrossRef]
39. Houshmand, M.; Hosseini-Khayat, S.; Wilde, M.M. Minimal-Memory, Noncatastrophic, Polynomial-Depth Quantum
Convolutional Encoders. IEEE Trans. Inf. Theory 2012, 59, 1198–1210. [CrossRef]
40. Bagging. Available online: https://www.cs.cornell.edu/courses/cs4780/2017sp/lectures/lecturenote18.html (accessed on
13 November 2023).
41. Boosting. Available online: https://www.cs.cornell.edu/courses/cs4780/2017sp/lectures/lecturenote19.html (accessed on
13 November 2023).
42. Dewangan, S.; Rao, R.S.; Mishra, A.; Gupta, M. Code Smell Detection Using Ensemble Machine Learning Algorithms. Appl. Sci.
2022, 12, 10321. [CrossRef]
43. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2018, 17, 168–192. [CrossRef]
44. Leem, S.; Oh, J.; So, D.; Moon, J. Towards Data-Driven Decision-Making in the Korean Film Industry: An XAI Model for Box
Office Analysis Using Dimension Reduction, Clustering, and Classification. Entropy 2023, 25, 571. [CrossRef] [PubMed]
45. Talukder, A.; Islam, M.; Uddin, A.; Akhter, A.; Hasan, K.F.; Moni, M.A. Machine learning-based lung and colon cancer detection
using deep feature extraction and ensemble learning. Expert Syst. Appl. 2022, 205, 117695. [CrossRef]
46. Ying, M.; Pan, J.; Lu, G.; Zhou, S.; Fu, J.; Wang, Q.; Wang, L.; Hu, B.; Wei, Y.; Shen, J. Development and validation of a radiomics-
based nomogram for the preoperative prediction of microsatellite instability in colorectal cancer. BMC Cancer 2022, 22, 524.
[CrossRef]
47. Fadafen, M.K.; Rezaee, K. Ensemble-based multi-tissue classification approach of colorectal cancer histology images using a novel
hybrid deep learning framework. Sci. Rep. 2023, 13, 8823. [CrossRef]
48. Jansen-Winkeln, B.; Barberio, M.; Chalopin, C.; Schierle, K.; Diana, M.; Köhler, H.; Gockel, I.; Maktabi, M. Feedforward artificial
neural network-based colorectal cancer detection using hyperspectral imaging: A step towards automatic optical biopsy. Cancers
2021, 13, 967. [CrossRef]
49. Bora, K.; Bhuyan, M.K.; Kasugai, K.; Mallik, S.; Zhao, Z. Computational learning of features for automated colonic polyp
classification. Sci. Rep. 2021, 11, 4347. [CrossRef]
50. Fan, J.; Lee, J.; Lee, Y. A Transfer learning architecture based on a support vector machine for histopathology image classification.
Appl. Sci. 2021, 11, 6380. [CrossRef]
51. Lo, C.-M.; Yang, Y.-W.; Lin, J.-K.; Lin, T.-C.; Chen, W.-S.; Yang, S.-H.; Chang, S.-C.; Wang, H.-S.; Lan, Y.-T.; Lin, H.-H.; et al.
Modeling the survival of colorectal cancer patients based on colonoscopic features in a feature ensemble vision transformer.
Comput. Med. Imaging Graph. 2023, 107, 102242. [CrossRef] [PubMed]
52. Grosu, S.; Wesp, P.; Graser, A.; Maurus, S.; Schulz, C.; Knösel, T.; Cyran, C.C.; Ricke, J.; Ingrisch, M.; Kazmierczak, P.M. Machine
learning–based differentiation of benign and premalignant colorectal polyps detected with CT colonography in an asymptomatic
screening population: A proof-of-concept study. Radiology 2021, 299, 326–335. [CrossRef]
53. Takeda, K.; Kudo, S.-E.; Mori, Y.; Misawa, M.; Kudo, T.; Wakamura, K.; Katagiri, A.; Baba, T.; Hidaka, E.; Ishida, F.; et al. Accuracy
of diagnosing invasive colorectal cancer using computer-aided endocytoscopy. Endoscopy 2017, 49, 798–802. [CrossRef] [PubMed]
54. Yang, K.; Zhou, B.; Yi, F.; Chen, Y.; Chen, Y. Colorectal Cancer Diagnostic Algorithm Based on Sub-Patch Weight Color Histogram
in Combination of Improved Least Squares Support Vector Machine for Pathological Image. J. Med. Syst. 2019, 43, 306. [CrossRef]
[PubMed]
55. Dragicevic, A.; Matija, L.; Krivokapic, Z.; Dimitrijevic, I.; Baros, M.; Koruga, D. Classification of Healthy and Cancer States of
Colon Epithelial Tissues Using Opto-magnetic Imaging Spectroscopy. J. Med. Biol. Eng. 2018, 39, 367–380. [CrossRef]
56. Trivizakis, E.; Ioannidis, G.S.; Souglakos, I.; Karantanas, A.H.; Tzardi, M.; Marias, K. A neural pathomics framework for classifying
colorectal cancer histopathology images based on wavelet multi-scale texture analysis. Sci. Rep. 2021, 11, 15546. [CrossRef]
57. Damkliang, K.; Wongsirichot, T.; Thongsuksai, P. Tissue classification for colorectal cancer utilizing techniques of deep learning
and machine learning. Biomed. Eng. Appl. Basis Commun. 2021, 33, 2150022. [CrossRef]
58. Mittal, P.; Condina, M.R.; Klingler-Hoffmann, M.; Kaur, G.; Oehler, M.K.; Sieber, O.M.; Palmieri, M.; Kommoss, S.; Brucker, S.;
McDonnell, M.D.; et al. Cancer tissue classification using supervised machine learning applied to MALDI mass spectrometry
imaging. Cancers 2021, 13, 5388. [CrossRef]
59. Cao, W.; Pomeroy, M.J.; Liang, Z.; Abbasi, A.F.; Pickhardt, P.J.; Lu, H. Vector textures derived from higher order derivative
domains for classification of colorectal polyps. Vis. Comput. Ind. Biomed. Art 2022, 5, 16. [CrossRef]
60. Deif, M.A.; Attar, H.; Amer, A.; Issa, H.; Khosravi, M.R.; Solyman, A.A.A. A New Feature Selection Method Based on Hybrid
Approach for Colorectal Cancer Histology Classification. Wirel. Commun. Mob. Comput. 2022, 2022, 7614264. [CrossRef]
61. Chehade, A.H.; Abdallah, N.; Marion, J.-M.; Oueidat, M.; Chauvet, P. Lung and colon cancer classification using medical imaging:
A feature engineering approach. Phys. Eng. Sci. Med. 2022, 45, 729–746. [CrossRef]
62. Tripathi, A.; Misra, A.; Kumar, K.; Chaurasia, B.K. Optimized Machine Learning for Classifying Colorectal Tissues. SN Comput.
Sci. 2023, 4, 461. [CrossRef]
63. Kara, O.C.; Venkatayogi, N.; Ikoma, N.; Alambeigi, F. A Reliable and Sensitive Framework for Simultaneous Type and Stage
Detection of Colorectal Cancer Polyps. Ann. Biomed. Eng. 2023, 51, 1499–1512. [CrossRef] [PubMed]
64. Ayyaz, M.S.; Lali, M.I.U.; Hussain, M.; Rauf, H.T.; Alouffi, B.; Alyami, H.; Wasti, S. Hybrid deep learning model for endoscopic
lesion detection and classification using endoscopy videos. Diagnostics 2021, 12, 43. [CrossRef] [PubMed]
65. Mirniaharikandehei, S.; Heidari, M.; Danala, G.; Lakshmivarahan, S.; Zheng, B. Applying a random projection algorithm to
optimize machine learning model for predicting peritoneal metastasis in gastric cancer patients using CT images. Comput.
Methods Programs Biomed. 2021, 200, 105937. [CrossRef]
66. Hu, W.; Li, C.; Li, X.; Rahaman, M.; Ma, J.; Zhang, Y.; Chen, H.; Liu, W.; Sun, C.; Yao, Y.; et al. GasHisSDB: A new gastric
histopathology image dataset for computer aided diagnosis of gastric cancer. Comput. Biol. Med. 2022, 142, 105207. [CrossRef]
[PubMed]
67. Naser, E.F.; Zeki, S.M. Using Fuzzy Clustering to Detect the Tumor Area in Stomach Medical Images. Baghdad Sci. J. 2021, 18, 1294.
[CrossRef]
68. Korkmaz, S.A.; Esmeray, F. A New Application Based on GPLVM, LMNN, and NCA for Early Detection of the Stomach Cancer.
Appl. Artif. Intell. 2018, 32, 541–557. [CrossRef]
69. Nayyar, Z.; Khan, M.A.; Alhussein, M.; Nazir, M.; Aurangzeb, K.; Nam, Y.; Kadry, S.; Haider, S.I. Gastric tract disease recognition
using optimized deep learning features. Comput. Mater. Contin. 2021, 68, 2041–2056. [CrossRef]
70. Hu, W.; Chen, H.; Liu, W.; Li, X.; Sun, H.; Huang, X.; Grzegorzek, M.; Li, C. A comparative study of gastric histopathology
sub-size image classification: From linear regression to visual transformer. Front. Med. 2022, 9, 1072109. [CrossRef]
71. Korkmaz, S.A. Recognition of the Gastric Molecular Image Based on Decision Tree and Discriminant Analysis Classifiers by using
Discrete Fourier Transform and Features. Appl. Artif. Intell. 2018, 32, 629–643. [CrossRef]
72. Korkmaz, S.A.; Binol, H. Classification of molecular structure images by using ANN, RF, LBP, HOG, and size reduction methods
for early stomach cancer detection. J. Mol. Struct. 2018, 1156, 255–263. [CrossRef]
73. Kanesaka, T.; Lee, T.-C.; Uedo, N.; Lin, K.-P.; Chen, H.-Z.; Lee, J.-Y.; Wang, H.-P.; Chang, H.-T. Computer-aided diagnosis for
identifying and delineating early gastric cancers in magnifying narrow-band imaging. Gastrointest. Endosc. 2018, 87, 1339–1344.
[CrossRef] [PubMed]
74. Feng, Q.-X.; Liu, C.; Qi, L.; Sun, S.-W.; Song, Y.; Yang, G.; Zhang, Y.-D.; Liu, X.-S. An Intelligent Clinical Decision Support System
for Preoperative Prediction of Lymph Node Metastasis in Gastric Cancer. J. Am. Coll. Radiol. 2019, 16, 952–960. [CrossRef]
75. Korkmaz, S.A. Classification of histopathological gastric images using a new method. Neural Comput. Appl. 2021, 33, 12007–12022.
[CrossRef]
76. Dai, H.; Bian, Y.; Wang, L.; Yang, J. Support Vector Machine-Based Backprojection Algorithm for Detection of Gastric Cancer
Lesions with Abdominal Endoscope Using Magnetic Resonance Imaging Images. Sci. Program. 2021, 2021, 9964203. [CrossRef]
77. Haile, M.B.; Salau, A.; Enyew, B.; Belay, A.J. Detection and classification of gastrointestinal disease using convolutional neural
network and SVM. Cogent Eng. 2022, 9, 2084878. [CrossRef]
78. Noor, M.N.; Nazir, M.; Khan, S.A.; Song, O.-Y.; Ashraf, I. Efficient Gastrointestinal Disease Classification Using Pretrained Deep
Convolutional Neural Network. Electronics 2023, 12, 1557. [CrossRef]
79. Yin, F.; Zhang, X.; Fan, A.; Liu, X.; Xu, J.; Ma, X.; Yang, L.; Su, H.; Xie, H.; Wang, X.; et al. A novel detection technology for early
gastric cancer based on Raman spectroscopy. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2023, 292, 122422. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.