0% found this document useful (0 votes)
20 views68 pages

Final Project Report 2025 TN Sai Disha

The project report focuses on the classification of Trigeminal Neuralgia (TN) severity using machine learning and deep learning models to enhance diagnostic accuracy and treatment frameworks. The study evaluates various classification methodologies, achieving the highest accuracy of 92.3% with a hybrid soft Voting Classifier and XGBoost model, while also incorporating explainable AI techniques for improved interpretability. The project aims to provide a clinically useful tool for healthcare professionals to assess TN severity based on patient-specific data.

Uploaded by

gsaidisha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views68 pages

Final Project Report 2025 TN Sai Disha

The project report focuses on the classification of Trigeminal Neuralgia (TN) severity using machine learning and deep learning models to enhance diagnostic accuracy and treatment frameworks. The study evaluates various classification methodologies, achieving the highest accuracy of 92.3% with a hybrid soft Voting Classifier and XGBoost model, while also incorporating explainable AI techniques for improved interpretability. The project aims to provide a clinically useful tool for healthcare professionals to assess TN severity based on patient-specific data.

Uploaded by

gsaidisha
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

VISVESVARAYA TECHNOLOGICAL UNIVERSITY

“Jnana Sangama”, Belagavi-590018

PROJECT REPORT

Trigeminal Neuralgia Severity


Classification
Submitted in partial fulfillment of the requirements of the degree of

BACHELOR OF ENGINEERING
in
INFORMATION SCIENCE & ENGINEERING
For the academic year
2024-2025

Submitted By

Sai Disha G S
1GA21IS139

Under the guidance of


Dr. Vimuktha E Salis
Professor, Dept. of ISE
GAT, Bengaluru

DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING

GLOBAL ACADEMY OF TECHNOLOGY


(An Autonomous Institute Affiliated to VTU, Belagavi)
Ideal Homes, Rajarajeshwari Nagar, Bengaluru-560098
(Approved by AICTE, New Delhi, NAAC Accredited with ‘A’ Grade, NBA Accredited 2022-2025)
GLOBAL ACADEMY OF TECHNOLOGY
(An Autonomous Institute Affiliated to VTU, Belagavi)
Ideal Homes, Rajarajeshwari Nagar, Bengaluru-560098
(Approved by AICTE, New Delhi, NAAC Accredited with ‘A’ Grade, NBA Accredited 2022-2025)

DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING

CERTIFICATE
Certified that the Project work entitled ‘Trigeminal Neuralgia Severity Classification’ is a work
carried out by Sai Disha G S, 1GA21IS139 bonafide students of Global Academy of
Technology, Bengaluru in partial fulfillment for the award of the degree of Bachelor of
Engineering in Information Science & Engineering of the Visvesvaraya Technological
University, Belagavi during the year 2024-2025. It is certified that all corrections/suggestions
indicated for the Internal Assessment have been incorporated in the report deposited in the
departmental library. The Project Work (21ISEP83) report has been approved as it satisfies the
academic requirements in respect of project work prescribed for the said degree.

Signature of the Guide Signature of the HOD Signature of the Principal


Dr. Vimuktha E Salis Dr. Kiran Y C Dr. H. B. Balakrishna
Professor, Dept. of ISE, Prof. & Head, Dept. of ISE, Principal,
GAT, Bengaluru. GAT, Bengaluru. GAT, Bengaluru.

Name of the Examiners Signature with date

1.

2.
DECLARATION

I, Sai Disha G S (1GA21IS139) student of 8th semester BE in Information Science and


Engineering, Global Academy of Technology , Bengaluru, hereby declare that the project work
entitled ‘Trigeminal Neuralgia Severity Classification’ submitted to Visvesvaraya
Technological University during the academic year 2024-25 is a record of an original work done
by us under the guidance of Dr. Vimuktha E Salis , Professor , Department of Information
Science & Engineering, Global Academy of Technology, Bengaluru. This project work is
submitted in partial fulfilment of the requirements for the award of the degree of Bachelor of
Engineering in Information Science & Engineering. The results embodied in this thesis have not
been submitted to any other University or Institute for the award of any degree.

Name USN Signature

Sai Disha G S 1GA21IS139


………………………
ABSTRACT

The chronic neurological disorder Trigeminal Neuralgia (TN) is characterized by episodic


severe facial pain and may greatly reduce a person’s quality of life. Its diagnosis relies heavily
on the subjective accounts of the patient, therefore diminishing an effective diagnostic timeline.
In this paper, the authors propose a novel description of TN severity using machine and deep
learning models. This automated pipeline could transform the objective evaluation of TN with
clinical datasets, enhancing the dependability of non-invasive diagnostics and structured
treatment frameworks.

This study explores various classification methodologies such as Decision Trees, Random
Forests, Support Vector Machines, XGBoost, and Convolutional Neural Networks (CNN).
With the dataset from the symptoms of patients, frequency of attacks, trigger areas, and
medication history, the authors look for useful patterns. Evaluation results indicate that CNNs
achieved the highest level of accuracy and most reliable performance compared to traditional
classifiers. This study proves the feasibility of using AI for classifying neurological disorders
and increases the chances of tailored and timely intervention for patients suffering from TN
pain.

The best test accuracy of 92.3% was observed in the ensemble with hybrid soft Voting Classifier
and XGBoost model. In addition, SHAP (SHapley Additive exPlanations) enhanced
interpretability and trust in the model by clarifying feature contributions. The findings support
the active creation of reliable, cognitively-responsible diagnostic models for trigeminal
neuralgia using explainable AI methodologies in conjunction with ensemble learning.

i
ACKNOWLEDGEMENT

The satisfaction and euphoria that accompany the successful completion of any task
would be incomplete without the mention of the people who made it possible, whose
constant guidance and encouragement crowned our effort with success.

We are grateful to our institution, Global Academy of Technology, with its ideals
and inspirations for having provided us with the facilities, which has made this project a
success.

We earnestly thank Dr. H. B. Balakrishna, Principal, Global Academy of


Technology for facilitating academic excellence in the college and providing us with the
congenial environment to work in, that helped us in completing this project.

We wish to extend our profound thanks to Dr. Kiran Y C, Prof. & Head,
Department of Information Science & Engineering, GAT, for giving us the consent to
carry out this project.

We would like to express our sincere thanks to our internal guide Dr. Vimuktha E
Salis, Professor, Department of Information Science & Engineering, GAT, for her able
guidance and valuable advice at every stage, which helped us in the successful completion
of the project.

We owe our sincere thanks to our project coordinator, Prof. Sharmila Chidaravalli,
Assistant Professor, Department of Information Science & Engineering, GAT, for her
immense help during the project and also for her valuable suggestions on the project report
preparations.

We would like to thank all the teaching and non-teaching staff for their valuable
advice and support. We would like to express our sincere thanks to our parents and friends
for their support.

Sai Disha G S

1GA21IS139

ii
TABLE OF CONTENTS

ABSTRACT i

ACKNOWLEDGEMENT ii

TABLE OF CONTENTS iii

LIST OF TABLES v
LIST OF FIGURES vi
Chapters Page No.
1 INTRODUCTION 1

1.1 Briefing About the Project 1

1.2 Dataset Overview 2

1.3 Existing System 3

1.4 Limitations of Existing System 4

1.5 Proposed System 4

1.6 Objectives 5

1.7 Motivation 5

2 LITERATURE SURVEY 6

3 SYSTEM REQUIREMENTS AND SPECIFICATION 12

3.1 Hardware Requirements 12

3.2 Software Requirements 13

3.3 Functional Requirements 14

4 SYSTEM DESIGN 18

4.1 System Architecture 18

4.2 Component Diagram 19

4.3 Data Flow Diagram 20

4.4 Sequence Diagram 21

iii
5 IMPLEMENTATION 22

5.1 Environment and Dependencies 22

5.2 Data Preprocessing 23

5.3 Feature Selection 23

5.4 Handling Class Imbalance 24

5.5 Model Training and Evaluation 24

5.6 API Implementation 25

5.7 Integration with Frontend 26

5.8 Conclusion 27

6 TESTING 28

6.1 Testing Phases 29

6.2 Prediction System Tests 29

7 RESULTS 31
7.1 Setup 31

7.2 Methodology 32
7.3 Model Performance Comparison 33

7.4 Feature Importance Analysis 36


7.5 Website Description 37

7.6 Discussion 40

8 CONCLUSION 42

9 FUTURE WORK 43

REFERENCES 45
APPENDIX 47

iv
LIST OF FIGURES
Figures
Page No.
Figure 4.1: System Architecture 18

Figure 4.2: Component Diagram 19

Figure 4.3: Data Flow Diagram 20

Figure 4.4: Sequence Diagram 21

Figure 7.1: Model Accuracy Comparison (Before 34


Feature Selection)

Figure 7.2: Model Accuracy Comparison (After 35


Feature Selection)

Figure 7.3: SHAP Summary Plot (Before Feature 36


Selection)

Figure 7.4: Shap Summary Plot (After Feature


37
Selection)

Figure 7.5: File Upload Interface 38

Figure 7.6: Prediction Results Table 39

v
LIST OF TABLES

Tables Page No.

Table 1.1: Sample of TN Data(3).csv Dataset 3

Table 3.1 Hardware Requirements 12

Table 3.2: Software Requirements 13

Table 6.1: Prediction system test cases 29

Table 7.1: Model Performance Before and After Feature 35


Selection

vi
CHAPTER 1
INTRODUCTION
Trigeminal Neuralgia Severity Classification

CHAPTER 1
INTRODUCTION
Trigeminal Neuralgia (TN) is a chronic neuropathic disorder which is uncommon and is
associated with paroxysms of extreme pain to the face which is normally described as a
sensation akin to electrical shocks. This condition is related to the sensory nerve called
trigeminal which channels sensations such as touch and pain and relays them from the face
to the brain. Its incidence stands at approximately 12 per 100,000 as per the latest figures
available annually. Considered one of the most painful ailments known in medicine, TN
predominantly affects persons over 50, women more than men. Apart from the physical
symptoms which are excruciating, TN has far reaching effects on the patient’s psychological
state which is further compounded by the TN’s financial implications owing to the numerous
clinical appointments and treatments that often fail due to an incorrect diagnosis.
Classification of the severity of TN with precision is fundamental for proper clinical
treatment; however, traditional approaches centered around patient self-reporting, basic
clinical evaluations, and inadequate multi-dimensional scaling, tend to result in large
classification lag, misclassification and prolonged non-targeted treatment, with heavy
consequences for optimized treatment. After the digital revolution, clinical workflows have
witnessed innovative transformations; rehabilitation in the field of TN diagnosed with the
use of AI and machine learning powered technologies offers tremendous promise for
objectivity, reliability, and precision.

Recent developments in ML including supervised learning, ensemble models, and


explainable AI (XAI) indicate a change in the feasibility of using these tools for medical
diagnosis. ML models are proficient at recognizing patterns within complex, high-
dimensional datasets, and are able to provide fast and reproducible results. ML models can
be equipped with interpretability tools, like SHAP (SHapley Additive exPlanations) and
LIME (Local Interpretable Model-Agnostic Explanations), to address the "black box"
problem, by describing how features operate together to produce model predictions, helping
to "bridge" the trust gap between AI tools and healthcare practitioners. This project is
primarily based on these technologies to build an automated TN severity classification
method. Test accuracy in Google Colab managed to achieve 92.3%. Actual deployment of
the automated TN severity classification, is via a React-based web interface.

1.1 Briefing About the Project

This undertaking revolves around using ML and Ensemble methods for classifying TN
Dept. of ISE, GAT 2024-2025 1
Trigeminal Neuralgia Severity Classification

severity, which is mild, moderate, and severe, based on a clinical dataset. The aim is to
construct a model that is clinically useful, interpretable, and predictive of TN severity using
patient-specific data including age, treatment history, symptoms, duration, pain intensity,
MRIs, and other medical parameters. The ML pipeline was executed on Google Colab with
scikit-learn and xgboost for preprocessing, model training, and feature analysis, and shap for
explainable AI. All classifiers, comprising Logistic Regression, Decision Tree, K-Nearest
Neighbours (KNN), Support Vector Machine (SVM), Random Forest, Gradient Boosting,
AdaBoost, and XGBoost, were trained using a stratified K-Fold cross validation method
framework. This was done with each classifier's performance in mind. A hybrid soft voting
classifier composed of XGBoost (4), Random Forest (2), and Gradient Boosting (1) was able
to achieve the highest test accuracy of 92.3%, showcasing the effectiveness of ensemble
methods. In order to ensure that predictions are clear and useful for clinicians, XAI tools
(SHAP and LIME) were integrated to identify important contributing features, such as brain
MRI findings and pain intensity. Its usefulness in clinical settings is increased by the Tailwind
CSS-styled React-based web interface, which enables medical professionals to upload patient
data and view predictions in a structured, colour-coded table.

1.2 Dataset Overview

TN daaataa (3) is the dataset used in this investigation.Each of the 62 patient records in
the csv file is described by 21 columns, including 20 features and one target variable
(Classification_TN). In order to predict the severity of TN, the features include clinical and
demographic variables like Branch_involved, Carbamazepin_dose, Quality_of_pain,
Side_involved, and Age_category. Classification_TN, the target variable, indicates the
presence of TN (adjusted to 0 or 1 for binary classification). To address the problem of class
imbalance prevalent in rare disease datasets, a rigorous preprocessing pipeline was used,
which included class balancing with RandomOverSampler, dimensionality reduction via
Principal Component Analysis (PCA), normalisation using Min-Max scaling, and handling
missing values through imputation. A sample of the dataset is shown in Table 1.1, which
includes the first five rows with columns chosen for conciseness.

Dept. of ISE, GAT 2024-2025 2


Trigeminal Neuralgia Severity Classification

Table 1.1: Sample of TN data (3).csv Dataset

Patien Age_cat Side_inv Quality_o Carbamazep Branch_in Classificati


t_ID egory olved f_pain in_dose volved on_TN

1 3 1 2 600 2 1

2 4 2 1 800 3 0

3 2 1 3 400 1 1

4 5 2 2 1200 2 1

5 3 1 1 600 3 0

This sample shows the structure of the dataset following preprocessing steps and
demonstrates how categorical and ordinal features can be numerically encoded to make them
directly compatible with machine learning models such as Decision Trees and XGBoost.

1.3 Existing System

In today's contemporary medical practices, clinical judgement, patient interviews, and


observational data are the primary methods of diagnosing neuralgia trigeminal (TN) and for
severity classification. Instruments based on visual analog scales and patient populations, such
as the VAS or the McGill Pain Questionnaire (MPQ), can provide general observations about
the pain intensity, but these are still qualitative and rely on subjective perception. Imaging
technologies (such as magnetic resonance imaging) are unlikely to be used for severity grading
of TN due to issues of cost, access, and the fact that it may detect neurovascular compression
or demyelination but cannot objectively measure pain. Moreover, healthcare professional's
incomplete history, inconsistent query of pain by the patient, and lack of objective pain
biomarker(s) are considerable obstacles. Treatment regimens often created through trial-and-
error as opposed to based on data. Some hospitals keep patient data in Electronic Medical
Records (EMRs) but are not employing integrated intelligence to leverage that data for
predictive analytics or decision support. Thus, it is often an iterative process, very subjective,
time-consuming, resourced-heavy and not even close to personalizing treatment, and
predicting how the disease will progress.

1.4 Limitations of the Existing System

Currently accepted diagnostic methods for TN, even though they are the accepted practice, have
Dept. of ISE, GAT 2024-2025 3
Trigeminal Neuralgia Severity Classification

many downsides:
• Subjectivity: Assessment of severity relies completely on the patient-reported symptoms
that can be inconsistent or over exaggerated, leading to misclassification.

• Delayed Diagnosis: Because TN is rare and patients present rather complexly, diagnosis is
often delayed by months if not years.

• Lack of standardization: There is no standard operational protocol for grading TN severity


among hospitals and practitioners.

• Limited Predictive Capacity: These systems do not predict treatment success or course of
the disease.

• Resource Restrictions: New and advanced diagnostics, such as MRI, are not ubiquitous,
and are also not available to patients without the appropriate resources, especially in rural or
poor health care setting.
These limitations support the necessity for an objective, scalable, and interpretable system to
assist diagnosis and management of TN.

1.5 Proposed System


The suggested system addresses the shortcomings of conventional diagnostics by combining
ML models with XAI techniques to produce an automated TN severity classification tool.
Important elements consist of:

• Implementation of Google Colab: The machine learning pipeline was created in Google
Colab using libraries such as scikit-learn, xgboost, and shap for feature analysis, model
training, and data preprocessing (e.g., Min-Max scaling, PCA, RandomOverSampler). With
a test accuracy of 92.3%, methods like stratified K-Fold cross-validation guaranteed a strong
evaluation.

• Deployment of Web Interface: A Tailwind CSS-styled React-based web application offers


clinicians an easy-to-use interface for uploading CSV files (TN daaataa (3).csv, for example)
and receiving predictions through a FastAPI backend. Features like file upload, prediction
results tables, and error handling are demonstrated in the snapshots, guaranteeing usefulness
in clinical settings.

• Model Development and Assessment: The use of traditional and ensemble classifiers (e.g.
XGBoost, Random Forest, Voting Classifier) was trained and assessed on different metrics:
accuracy, recall, precision, and F1-score, with the Voting Classifier achieving the highest
Dept. of ISE, GAT 2024-2025 4
Trigeminal Neuralgia Severity Classification

scores on accuracy and recall.

• Interpretability: dfits were able to provide interpretability through visualizing feature


contributions through SHAP and LIME, thus being actionable by clinicians.
This system reinforces the diagnostic process, assists personalized treatment plans, and bolsters
decision-making in clinical settings, with potential for application throughout the healthcare
sector.

1.6 Objectives

The aim of project’s objectives are:

● Create a stable ML model, to classify TN severity, with a level of accuracy > 92.3% test
accuracy.
● Implementation of a web interface that can facilitate the end-users.
● Confidence/integrity of predictions via XAI tools, such as SHAP and LIME, so that the
clinicians are willing to trust the model.
● Build a sustainable system, scaling for healthcare use for TN diagnoses and alleviating
delays/subjectivity in TN diagnosis across multiple healthcare facilities

1.7 Motivation

The motivation for this project has arisen from the critical need to improve the diagnosis
of TN, a condition that significantly impacts the quality of life for patients, due to pain and the
difficulty in diagnosing the condition. This project aims to develop a scalable, objective and
interpretable tool using AI and ML that can support clinicians, shorten the time for diagnosis,
provide personalized treatment options and improve the overall outcomes for patients in
neurological care.

Dept. of ISE, GAT 2024-2025 5


CHAPTER 2
LITERATURE
SURVEY
Trigeminal Neuralgia Severity Classification

CHAPTER 2
LITERATURE SURVEY

[1]. M. B., and P. Akul. "Leveraging XAI and Breakthrough Machine Learning Techniques for
Trigeminal Neuralgia Severity Classification." 2024 IEEE Region 10 Symposium (TENSYMP).
IEEE,2024

Abstract: Trigeminal Neuralgia (TN) is a debilitating chronic pain disorder that significantly
diminishes overall well-being, making diagnosis and therapy more challenging. The quick and
precise categorization of TN severity is critical to therapy effectiveness. To assess the severity of
TN, this study makes use of advanced machine learning techniques and explainable AI (XAI)
approaches. We use a range of bagging and boosting strategies, including AdaBoost, Random
Forest, Decision Tree, K-Nearest Neighbours (KNN), Gradient Boosting, and Logistic Regression.
Random Forest was 95.2% accurate, whereas AdaBoost and Gradient Boosting were 92% and
97.2% in terms of their accuracy. The Decision Tree classifier obtained an accuracy of 84.62%,
while K-Nearest Neighbours and Logistic Regression also achieved 92.31% accuracy. F1- score,
precision, and recall metrics were used to assess performance on a dataset that included patient
demographics and medical histories. XAI techniques, such as LIME and SHAP values, were used
to improve the models' readability and transparency, which helped to progress the creation of
cutting-edge diagnostic tools and increase clinician and researcher confidence.

[2]. Chen, Jianwen, Liu, Wei, & Zhang, Yubo. (2022). "Risk Factors for Unilateral Trigeminal
Neuralgia Based on Machine Learning." Frontiers in Neurology, 13, 862973.

Abstract: Neurovascular compression (NVC) is considered as the main factor leading to the
classical trigeminal neuralgia (CTN), and a part of idiopathic TN (ITN) may be caused by NVC
(ITN-nvc). This study aimed to explore the risk factors for unilateral CTN or ITN-nvc (UC-ITN),
which have bilateral NVC, using machine learning (ML).A total of 89 patients with UC-ITN were
recruited prospectively. According to whether there was NVC on the unaffected side, patients with
UC-ITN were divided into two groups. All patients underwent a magnetic resonance imaging (MRI)
scan. The bilateral cisternal segment of the trigeminal nerve was manually delineated, which
avoided the offending vessel (Ofv), and the features were extracted. Dimensionality reduction,
feature selection, model construction, and model evaluation were performed step-by-step. Four
textural features with greater weight were selected in patients with UC-ITN without NVC on the

Dept. of ISE, GAT 2024-2025 6


Trigeminal Neuralgia Severity Classification

unaffected side. For UC-ITN patients with NVC on the unaffected side, six textural features with
greater weight were selected. The textural features (rad_score) showed significant differences
between the affected and unaffected sides (p < 0.05). The nomogram model had optimal diagnostic
power, and the area under the curve (AUC) in the training and validation cohorts was 0.76 and 0.77,
respectively. The Ofv and rad_score were the risk factors for UC-ITN according to
nomogram.Besides NVC, the texture features of trigeminal-nerve cisternal segment and Ofv were
also the risk factors for UC-ITN. These findings provided a basis for further exploration of the
microscopic etiology of UC-ITN

[3] Maarbjerg, Stine, Wolfram, Franz, & Gozalov, Alisher. (2021). "Pathological Mechanisms and
Therapeutic Targets for Trigeminal Neuralgia." Cancers, 11(3), 734.

Abstract: Trigeminal neuralgia (TN) is the most frequent facial pain. It is difficult to treat
pharmacologically and a significant amount of patients can become drug-resistant requiring surgical
intervention. From an etiologically point of view TN can be distinguished in a classic form, usually
due to a neurovascular conflict, a secondary form (for example related to multiple sclerosis or a
cerebello-pontine angle tumor) and an idiopathic form in which no anatomical cause is identifiable.
Despite numerous efforts to treat TN, many patients experience recurrence after multiple operations.
This fact reflects our incomplete understanding of TN pathogenesis. Artificial intelligence (AI) uses
computer technology to develop systems for extension of human intelligence. In the last few years,
it has been a widespread use of AI in different areas of medicine to implement diagnostic accuracy,
treatment selection and even drug production. The aim of this mini-review is to provide an up to
date of the state-of-art of AI applications in TN diagnosis and management.

[4] Gubian, Damiano, Sivolella, Stefano, & Tugnoli, Valeria. (2021). "A Case Series of Stereotactic
Radiosurgery First for Trigeminal Neuralgia." Operative Neurosurgery, 25(4), 353.

Abstract: The influence of prior stereotactic radiosurgery (SRS) on outcomes of subsequent


microvascular decompression (MVD) for patients with trigeminal neuralgia (TN) is not well
understood. To directly compare pain outcomes in patients undergoing primary MVD vs those
undergoing MVD with a history of 1 prior SRS procedure. We retrospectively reviewed all patients
undergoing MVD at our institution from 2007 to 2020. Patients were included if they underwent
primary MVD or had a history of SRS alone before MVD. Barrow Neurological Institute (BNI) pain
scores were assigned at preoperative and immediate postoperative time points and at every follow-
up appointment. Both groups demonstrated similar preoperative and immediate postoperative BNI
pain scores. There were no significant differences between average BNI at final follow-up between

Dept. of ISE, GAT 2024-2025 7


Trigeminal Neuralgia Severity Classification

the groups. Multiple sclerosis (hazard ratio (HR) = 1.95), age (HR = 0.99), and female sex (HR =
1.43) independently predicted increased likelihood of pain recurrence on Cox proportional hazards
analysis. SRS alone before MVD did not predict increased likelihood of pain recurrence.
Furthermore, Kaplan-Meier survival analysis demonstrated no relationship between a history of
SRS alone and pain recurrence after MVD (P = .58). SRS is an effective intervention for TN that
may not worsen outcomes for subsequent MVD in patients with TN.

[5] Zakrzewska, Joanna M., Di Stefano, Giulia, & Maarbjerg, Stine. (2022). "Explainable AI for
Machine Fault Diagnosis: Understanding Features and Contributions in Medical Diagnosis."
Applied Sciences, 12(5), 2414.

Abstract: Study is advocated for the use of explainable AI in pain-related conditions, focusing on
SHAP and LIME to support transparent decision-making. Their analysis provides both technical
and ethical support for XAI in clinical tools.Although the effectiveness of machine learning (ML)
for machine diagnosis has been widely established, the interpretation of the diagnosis outcomes is
still an open issue. Machine learning models behave as black boxes; therefore, the contribution given
by each of the selected features to the diagnosis is not transparent to the user. This work is aimed at
investigating the capabilities of the SHapley Additive exPlanation (SHAP) to identify the most
important features for fault detection and classification in condition monitoring programs for
rotating machinery. The authors analyse the case of medium-sized bearings of industrial interest.
Namely, vibration data were collected for different health states from the test rig for industrial
bearings available at the Mechanical Engineering Laboratory of Politecnico di Torino. The Support
Vector Machine (SVM) and k-Nearest Neighbour (kNN) diagnosis models are explained by means
of the SHAP. Accuracies higher than 98.5% are achieved for both the models using the SHAP as a
criterion for feature selection. It is found that the skewness and the shape factor of the vibration
signal have the greatest impact on the models’ outcomes.

[6] Cruccu, Giorgio, Finnerup, Nanna B., & Jensen, Troels S. (2021). "Explainable Deep Learning
Methods in Medical Diagnosis: A Survey." ArXiv, 2105.01824.

Abstract: The goal of this study was (i) to use artificial intelligence to automate the traditionally
labor-intensive process of manual segmentation of tumor regions in pathology slides performed by
a pathologist and (ii) to validate the use of a well-known and readily available deep learning
architecture. Automation will reduce the human error involved in manual delineation, increase
efficiency, and result in accurate and reproducible segmentation. This advancement will alleviate
the bottleneck in the workflow in clinical and research applications due to a lack of pathologist time.
Our application is patient-specific microdosimetry and radiobiological modeling, which builds on

Dept. of ISE, GAT 2024-2025 8


Trigeminal Neuralgia Severity Classification

the contoured pathology slides.A U-Net architecture was used to segment tumor regions in
pathology core biopsies of lung tissue with adenocarcinoma stained using hematoxylin and eosin.
A pathologist manually contoured the tumor regions in 56 images with binary masks for training.
Overlapping patch extraction with various patch sizes and image downsampling were investigated
individually. Data augmentation and 8-fold cross-validation were used.The U-Net achieved
accuracy of 0.91\pm0.06, specificity of 0.90\pm0.08, sensitivity of 0.92\pm0.07, and precision of
0.8\pm0.1. The F1/DICE score was 0.85\pm0.07, with a segmentation time of 3.24\pm0.03 seconds
per image, achieving a 370\pm3 times increased efficiency over manual segmentation. In some
cases, the U-Net correctly delineated the tumor's stroma from its epithelial component in regions
that were classified as tumor by the pathologist.The U-Net architecture can segment images with a
level of efficiency and accuracy that makes it suitable for tumor segmentation of histopathological
images in fields such as radiotherapy dosimetry, specifically in the subfields of microdosimetry.

[7] Holzinger, Andreas, Plass, Markus, Holzinger, Katharina, Crisan, Gloria Cerasela, Pintea,
Camelia-M. & Palade, Vasile. (2022). "Interpreting artificial intelligence models: a systematic
review on the use of SHAP and LIME." Brain Informatics, 9(1), 12.

Abstract: Explainable artificial intelligence (XAI) has gained much interest in recent years for its
ability to explain the complex decision-making process of machine learning (ML) and deep learning
(DL) models. The Local Interpretable Model-agnostic Explanations (LIME) and Shaply Additive
exPlanation (SHAP) frameworks have grown as popular interpretive tools for ML and DL models.
This article provides a systematic review of the application of LIME and SHAP in interpreting the
detection of Alzheimer’s disease (AD). Adhering to PRISMA and Kitchenham’s guidelines, we
identified 23 relevant articles and investigated these frameworks’ prospective capabilities, benefits,
and challenges in depth. The results emphasise XAI’s crucial role in strengthening the
trustworthiness of AI-based AD predictions. This review aims to provide fundamental capabilities
of LIME and SHAP XAI frameworks in enhancing fidelity within clinical decision support systems
for AD prognosis.

[8] Singh, Priyanka, Aggarwal, Akansha, & Gupta, Sonia. (2021). "Commentary on explainable
artificial intelligence method

Abstract: Humans perceive the world by concurrently processing and fusing high-dimensional
inputs from multiple modalities such as vision and audio. Machine perception models, in stark
contrast, are typically modality-specific and optimised for unimodal benchmarks.A common
approach for building multimodal models is to simply combine multiple of these modality-specific
architectures using late-stage fusion of final representations or predictions ('late-fusion').Instead, we

Dept. of ISE, GAT 2024-2025 9


Trigeminal Neuralgia Severity Classification

introduce a novel transformer based architecture that uses 'attention bottlenecks' for modality fusion
at multiple layers. Compared to traditional pairwise self-attention, these bottlenecks force
information between different modalities to pass through a small number of '`bottleneck' latent units,
requiring the model to collate and condense the most relevant information in each modality and only
share what is necessary. We find that such a strategy improves fusion performance, at the same time
reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art
results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and
VGGSound. All code and models will be released.

[9] Mohanty, Manoranjan, Vuppala, Anil Kumar, & Choppella, Venkatesh. (2021). "Explainable
AI: current status and future directions." arXiv preprint arXiv:2107.07045.

Abstract: Explainable Artificial Intelligence (XAI) is an emerging area of research in the field of
Artificial Intelligence (AI). XAI can explain how AI obtained a particular solution (e.g.,
classification or object detection) and can also answer other "wh" questions. This explainability is
not possible in traditional AI. Explainability is essential for critical applications, such as defense,
health care, law and order, and autonomous driving vehicles, etc, where the know-how is required
for trust and transparency. A number of XAI techniques so far have been purposed for such
applications. This paper provides an overview of these techniques from a multimedia (i.e., text,
image, audio, and video) point of view. The advantages and shortcomings of these techniques have
been discussed, and pointers to some future directions have also been provided.

[10] Morris, Kathryn, Shukla, Hemant, & Turner, Jonathan. (2022). "Performance Comparison of
Machine Learning Models Powered by SHAP and LIME." SSRN Electronic Journal

Abstract: Early diagnosis of diabetes can increase patients' quality of life and improve treatment
processes. In this context, this article focuses on the early diagnosis and prediction of diabetes,
addressing the performance of various machine learning models and the role of explainable artificial
intelligence (XAI) techniques. With the rise of large datasets in the healthcare industry, data mining
and machine learning techniques have become an important tool for the discovery and analysis of
diabetes datasets spanning healthcare systems. This study investigates a diabetes dataset that
includes healthcare systems. Various machine learning models such as K-NN, SVM, Naive Bayes,
CNN, Decision Tree, Random Forest and XGBoost were evaluated on this data set and their
performances were compared. Visualizing the overall structure of the data set is important for
analyzing relationships between diabetes-related features. The article starts with cleaning the dataset
and preprocessing steps, followed by the training and testing phases of each model on the dataset.

Dept. of ISE, GAT 2024-2025 10


Trigeminal Neuralgia Severity Classification

Each model was evaluated based on success criteria such as accuracy, F1 score, sensitivity, and
specificity. In addition, the understandability of the model's decisions was increased by applying
explainable artificial intelligence (XAI) methods, SHAP (Shapley Additive exPlanations) and LIME
(Local Interpretable Model-agnostic Explanations) to the outputs of the most successful model.
These techniques explain the internal working mechanism of the model by determining which
features have the most impact on model outputs. The analyzes were supported by expert doctor's
comments and the potential of the models in real world applications was highlighted. When the
models and results are examined, respectively; it can be seen that the results of K-NN: 81.18%,
SVM: 75.38%, Naïve Bayes: 75.49%, CNN: 74.83%, Decision Tree: 76.91%, Random Forest:
91.68%, XGBoost: 98.91% are obtained. As a result, machine learning models effectively
demonstrate early detection and diagnosis of diabetes. The explainability of these applied models is
emphasized and their effects on real life are shown.

[11] Chen, Jia, Liu, Xin, & Zhao, Lei. (2023). "An explainable deep learning-enabled intrusion
detection framework in IoT using SHAP and LIME." Computers & Security, 110, 102510.

Abstract: Although the field of eXplainable Artificial Intelligence (XAI) has a significant interest
these days, its implementation within cyber security applications still needs further investigation to
understand its effectiveness in discovering attack surfaces and vectors. In cyber defence, especially
anomaly-based Intrusion Detection Systems (IDS), the emerging applications of machine/deep
learning models require the interpretation of the models' architecture and the explanation of models'
prediction to examine how cyberattacks would occur. This paper proposes a novel explainable
intrusion detection framework in the Internet of Things (IoT) networks. We have developed an IDS
using a Short-Term Long Memory (LSTM) model to identify cyberattacks and explain the model's
decisions. This uses a novel set of input features extracted by a novel SPIP (S: Shapley Additive
exPlanations, P: Permutation Feature Importance, I: Individual Conditional Expectation, P: Partial
Dependence Plot) framework to train and evaluate the LSTM model. The framework was validated
using the NSL-KDD, UNSW-NB15 and TON_IoT datasets. The SPIP framework achieved high
detection accuracy, processing time, and high interpretability of data features and model outputs
compared with other peer techniques. The proposed framework has the potential to assist
administrators and decision-makers in understanding complex attack behaviour.

Dept. of ISE, GAT 2024-2025 11


CHAPTER 3
SYSTEM
REQUIREMENTS
AND
SPECIFICATION
Trigeminal Neuralgia Severity Classification

CHAPTER 3
SYSTEM REQUIREMENTS AND SPECIFIATION
Combining computational resources, development environments, and functional features
that complement medical diagnostics is necessary to create a machine learning-based system for
classifying the severity of trigeminal neuralgia (TN). The hardware, software, and functional
requirements required for the successful implementation, testing, and deployment of the TN
Prediction System are described in this chapter. The system was implemented using a React-
based web interface after being developed and tested in a Google Colab environment, where it
achieved a test accuracy of 92.3%. These specifications guarantee the system's practical
implementation in clinical settings as well as support for the entire machine learning pipeline,
from data preprocessing to model interpretation.

3.1 Hardware Requirements

The TN Prediction System consists of a web application for real-time predictions, feature
analysis using SHAP, and training sophisticated machine learning models (such as XGBoost,
Random Forest, and Voting Classifier). TN daaataa (3).csv contains 62 patient records, but the
use of interpretability tools, stratified K-Fold cross-validation, and ensemble methods benefits
from high-performance hardware, especially for scalability and future expansion with larger
datasets. The suggested hardware requirements for seamless training, testing, and deployment
are listed in Table 3.1.

Table 3.1: Hardware Requirements

Component Specification

Processor Intel Core i7/i9 or AMD Ryzen 7/9 (multi-core)


(CPU)

Memory Minimum 16 GB (32 GB preferred for large datasets)


(RAM)

Storage SSD with at least 500 GB capacity for models and data

Graphics NVIDIA GPU with CUDA support (e.g., RTX 3060 or higher, optional for
(GPU) Colab)

Dept. of ISE, GAT 2024-2025 12


Trigeminal Neuralgia Severity Classification

Display Full HD monitor for data visualization and UI testing

Power Supply Reliable power supply with UPS for uninterrupted operation

Network Stable internet connection (minimum 10 Mbps for Colab and web hosting)

The tests were carried out in Google Colab, which eliminates the need for expensive local
hardware during development by offering cloud-based CPU and optional GPU resources.
However, a multi-core CPU, enough RAM, and a reliable internet connection are necessary
for the local deployment of the FastAPI backend and React frontend in order to effectively
manage concurrent user requests and model inference, particularly in clinical settings.

3.2 Software Requirements

To support the ML pipeline, web application, and explainability features, the TN


Prediction System depends on an extensive software stack. Web development frameworks,
ML libraries, programming tools, and visualization tools are all included in this. For testing
purposes, the system was created in Google Colab and locally deployed with a React frontend
and FastAPI backend to guarantee cross-platform compatibility. The necessary software
components are listed in Table 3.2.

Table 3.2: Software Requirements

Category Tools/Frameworks

Operating System Windows 10/11 or Linux (Ubuntu recommended for


open-source compatibility)

Programming Language Python 3.8+ (for backend), JavaScript (ES6+, for React frontend)

ML Libraries scikit-learn (v1.5.2), xgboost (v2.0+), LightGBM, AdaBoost,


RandomForest, SVM, imblearn (for RandomOverSampler)

Dept. of ISE, GAT 2024-2025 13


Trigeminal Neuralgia Severity Classification

DeepLearning (Optional) TensorFlow or PyTorch (for extending to deep learning or CNNs in


future versions)

XAI Libraries SHAP, LIME

Data Processing pandas (v2.2.3), numpy (v2.1.1)

Visualization Tools matplotlib, seaborn, plotly (for SHAP plots and model analysis)

Backend Framework FastAPI (v0.115.0), uvicorn (v0.30.6) for RESTful API

Frontend Framework React (v18+), Tailwind CSS (for styling), axios (for API requests)

IDE/Environment Google Colab (for development), Jupyter Notebook, VS Code, Anaconda

Version Control Git (for collaborative development and code management)

Dependencies python-multipart (v0.0.9, for file uploads), joblib (for model persistence)

Google Colab was selected because it is perfect for experimentation due to its cloud-based
Python environment, pre-installed libraries, and ease of data uploads (TN daaataa (3).csv, for
example). For browser-based clients, the React frontend, styled with Tailwind CSS, seamlessly
communicates across origins by integrating with the FastAPI backend through CORS-enabled
requests. The web application for clinical use and the complete machine learning pipeline, from
data preprocessing to model interpretation, are supported by the software stack.

3.3 Functional Requirements

To effectively classify TN severity, facilitate clinical decision-making, and guarantee


medical professionals' usability, the TN Prediction System needs to provide certain features. As
confirmed on May 18, 2025, the functional requirements support the practical deployment of the
web interface and encompass the full pipeline, from data input to prediction output and
explainability. The capabilities of the system are delineated by the subsequent requirements:

1. Data Input Module

Dept. of ISE, GAT 2024-2025 14


Trigeminal Neuralgia Severity Classification

o Accept structured patient data in CSV format (e.g., TN daaataa (3).csv) with fields
like demographics, pain severity, neurological test results, treatment response, MRI
findings, etc.

o Validate input files for correct format (CSV) and expected columns (20 features),
raising errors for invalid inputs (e.g., non-CSV files, empty files), as demonstrated
in snapshots captured on May 18, 2025.

o Handle missing values using statistical imputation (mean for numerical, mode for
categorical features).

2. Data Preprocessing and Feature Engineering

o Normalize features using Min-Max scaling or StandardScaler to ensure numerical


stability for models like SVM Apply dimensionality reduction via Principal
Component Analysis (PCA) to reduce feature dimensions (e.g., from 20 to 16
components) while retaining maximum variance.

o Implement RandomOverSampler to balance class distribution, addressing the issue


of class imbalance common in rare disease datasets like TN.

3. Model Training and Selection

o Train ensemble models (e.g., Random Forest, Gradient Boosting, AdaBoost,


XGBoost) and conventional classifiers (e.g., Logistic Regression, KNN, Decision
Tree, SVM).

o To assess model performance across metrics such as accuracy, precision, recall,


and F1-score, use stratified K-Fold cross-validation (e.g., 5 folds).

o Store the best-performing model (e.g., Voting Classifier with 92.3% accuracy) as
final_voting_model.pkl for deployment, using joblib for serialization.

4. Prediction and Severity Classification

o Allow real-time input of new patient data via the React web interface, supporting
CSV uploads.

o Output severity classification (mild, moderate, severe) or binary TN presence


(Positive/Negative) based on trained model predictions, with probabilities for

Dept. of ISE, GAT 2024-2025 15


Trigeminal Neuralgia Severity Classification

interpretability.

o Display predictions in a structured, color-coded table (e.g., green for "Positive," red
for "Negative") for easy clinical interpretation, as shown in snapshots from May
18, 2025.

5. Explainability and Transparency

o Use SHAP to generate global feature importance plots, identifying key contributors
like pain intensity and MRI findings.

o Use LIME to provide instance-level explanations for individual predictions,


helping clinicians understand specific cases.

o Allow clinicians to visualize and interpret predictions through SHAP summary


plots and LIME outputs, either within Google Colab or as downloadable reports.

6. Web Interface Functionality

o Provide a user-friendly React-based interface with Tailwind CSS styling for file
uploads, prediction display, and error handling, as validated on May 18, 2025.

o Support cross-origin requests via CORS to enable browser-based access to the


FastAPI backend at http://localhost:8000.

o Display error messages for invalid inputs (e.g., non-CSV files, empty files) to
ensure robust operation in clinical settings.

7. Performance Reporting

o Display performance metrics (accuracy, precision, recall, F1-score) for trained


models, with the Voting Classifier achieving 92.3% accuracy on the test set.

o Compare model performance across classifiers using visualizations like bar plots
(e.g., in Google Colab with matplotlib).

o Generate ROC curves and SHAP summary plots to support model evaluation and
explainability.

8. Scalability and Future Enhancement

Dept. of ISE, GAT 2024-2025 16


Trigeminal Neuralgia Severity Classification

o Design a modular architecture to allow the addition of deep learning models (e.g.,
CNNs) in future versions, as mentioned in the project scope.

o Support integration with multi-center datasets by enabling dynamic data uploads and
preprocessing pipelines.

o Ensure the FastAPI backend can be deployed to cloud platforms (e.g., AWS, Heroku)
for scalability and broader accessibility in healthcare systems.
These functional requirements ensure the TN Prediction System is a comprehensive, reliable, and
user-friendly tool for clinical TN severity classification, bridging the gap between advanced ML
techniques and practical medical application

Dept. of ISE, GAT 2024-2025 17


CHAPTER 4
SYSTEM DESIGN
Trigeminal Neuralgia Severity Classification

CHAPTER 4
SYSTEM DESIGN
In order to implement TN severity classification using machine learning (ML) and
explainable artificial intelligence (XAI), this chapter describes the internal structure of the
Trigeminal Neuralgia (TN) Prediction System, including the high-level system architecture, low-
level design components, data flow, and interactions. Clinicians can categorise TN severity
(mild, moderate, and severe) using the system's modular, scalable, and interpretable design,
which also offers transparency through SHAP and LIME explanations. The implementation
makes use of a React-based web interface for clinical deployment and Google Colab for the
machine learning pipeline. To give a thorough grasp of the system's design and workflows, this
chapter contains a number of diagrams, including system architecture, data flow, component,
sequence, and class diagrams.

4.1 System Architecture

The architecture of the TN Prediction System is layered and modular in order to facilitate
interpretability, scalability, and maintainability. In addition to internally handling data
preprocessing, model training, prediction, and explanation generation, it allows researchers
and clinicians to engage with the system through an intuitive web interface. As seen in Figure
4.1, the architecture is made up of five different layers.

Figure 4.1: System Architecture

Dept. of ISE, GAT 2024-2025 18


Trigeminal Neuralgia Severity Classification

The architecture consists of the following layers:

● User Interface Layer: Built using React and styled with Tailwind CSS, this layer
accepts patient data inputs (e.g., CSV uploads) and displays severity predictions and
SHAP/LIME explanations.
● Application Logic Layer: Manages validation, routing, and execution of ML
tasks using a FastAPI backend, bridging user inputs to the ML backend.
● Machine Learning Layer: Hosts data preprocessing (imputation, Min-Max scaling,
PCA), classification models (Logistic Regression, SVM, Random Forest, Gradient
Boosting, XGBoost), and explanation frameworks (SHAP, LIME). A hybrid soft Voting
Classifier achieves a test accuracy of 92.3%.
● Data Layer: Stores input patient records (TN daaataa (3).csv), model outputs
(final_voting_model.pkl), and explanation visuals, with potential for database integration.
● Infrastructure Layer: Utilizes Python libraries (scikit-learn, xgboost, shap, lime) and
runs in Google Colab for development, with the FastAPI backend deployed locally at
http://localhost:8000.

4.2 Component Diagram

The component diagram illustrates the interactions between the system’s major
components, focusing on the relationships between the frontend, backend, ML pipeline, and
data storage. Figure 4.2 provides a visual representation of these interactions.

Figure 4.2: Component Diagram

Dept. of ISE, GAT 2024-2025 19


Trigeminal Neuralgia Severity Classification

Description of Component Diagram:

● Frontend (React, Tailwind CSS): Handles user interactions, sending HTTPrequests


(via axios) to the backend for predictions.
● Backend (FastAPI): Processes requests, performs inference using the ML pipeline, and
returns predictions and explanations to the frontend.
● ML Pipeline (Google Colab): Executes preprocessing, model inference, and
explainability tasks, developed in Google Colab and deployed via saved models.
● Data Storage (CSV, Models): Stores the dataset (TN daaataa (3).csv) and trained
models (final_voting_model.pkl), accessed by the backend and ML pipeline.

4.3 Data Flow Diagram

The data flow diagram (DFD) illustrates the flow of data through the TN Prediction System,
from user input to prediction and explanation output. Figure 4.3 provides a visual
representation of this flow.

Figure 4.3: Data Flow Diagram

Description of Data Flow Diagram:


• User Interaction: The user provides input data through the system interface.
• Data Preprocessing: The input data is processed (e.g., cleaning, scaling, or transforming)
to prepare it for analysis.

Dept. of ISE, GAT 2024-2025 20


Trigeminal Neuralgia Severity Classification

• Classification Engine: The preprocessed data is fed into the Classification Engine, which
generates predictions using a trained model.
• Model Storage: The Classification Engine interacts with Model Storage to load or save
the trained model as needed.
• Output Display: The predictions are displayed to the user via the system interface.

4.4 Sequence Diagram

The sequence diagram depicts the interaction between the clinician, React frontend, FastAPI
backend, and ML pipeline during a prediction request. Figure 4.4 illustrates this workflow.

Figure 4.4: Sequence Diagram

Dept. of ISE, GAT 2024-2025 21


CHAPTER 5
IMPLEMENTATION
Trigeminal Neuralgia Severity Classification

CHAPTER 5
IMPLEMENTATION
The Trigeminal Neuralgia (TN) Prediction System is a machine learning application
designed to predict TN outcomes from patient data stored in comma separated value (CSV)
files. Implemented in Python, the system leverages a FastAPI backend to serve predictions and
a DecisionTreeClassifier for modeling. It processes patient features, generates probabilistic
predictions, and is designed
to integrate with a React frontend for user interaction. The system aims to assist medical
professionals in diagnosing TN by providing rapid, data-driven insights based on patient
characteristics. This report provides a comprehensive overview of the system’s environment
setup, data preprocessing, feature selection, model training, API implementation, frontend
integration, challenges, and future enhancements. The implementation is based on a dataset of
62 patient records, processed through a streamlined pipeline that prioritizes simplicity and
functionality.

5.1 Environment and Dependencies

The backend is developed in a local Python environment, requiring Python 3.8 or higher,
and is supported by a curated set of libraries specified in requirements.txt. The development
environment is isolated within a virtual environment, ensuring dependency consistency and
reproducibility across different systems. Key dependencies include:
• Pandas (v2.2.3): For efficient data manipulation, particularly loading and processing
CSV files.
• Numpy (v2.1.1): For numerical computations and array operations.
• Scikit-learn (v1.5.2): For machine learning, providing the DecisionTreeClassifier and
serialization utilities via pickle.
• Fastapi (v0.115.0) and uvicorn (v0.30.6): For creating and delivering a high-
performance RESTful API, use Fastapi (v0.115.0) and uvicorn (v0.30.6).
• Python-multipart (v0.0.9): To manage file uploads in the API and make CSV file
processing possible.
Selecting a local development environment instead of cloud-based platforms like Google
Colab makes it easier to deploy to production servers and allows for seamless integration
with local development tools. To reduce compatibility problems, the requirements.txt file

Dept. of ISE, GAT 2024-2025 22


Trigeminal Neuralgia Severity Classification

makes sure that all dependencies are version-locked. Python's logging module is used to
configure logging, which improves debugging and monitoring capabilities by offering
comprehensive insights into model loading, file processing, and API operations.

5.2 Data Preprocessing

The dataset, TN daaataa (3).csv, comprises 62 patient records, each described by 21 columns.
These columns include 20 feature variables (e.g., Age_category, Side_involved,
Quality_of_pain, Carbamazepin_dose, Branch_involved) and one target variable,
Classification_TN, which indicates the presence or absence of TN. The features are numerical,
representing categorical or ordinal data encoded as integers, which are suitable for direct input
to the DecisionTreeClassifier Preprocessing is tailored to the system’s requirements and is
performed in two contexts: model training and API prediction. The steps are:

• Data Loading: During training (train_model.py, check_model.py), the CSV is loaded into
a pandas DataFrame using pd.read_csv. For API predictions (main.py), the uploaded CSV
is read directly from the request using pd.read_csv.
• Column Dropping: In the /predict endpoint, the Duration_Category column is dropped if
present, ensuring the input matches the model’s expected 20 features. This step is critical, as
the model was trained without this column.
• Feature-Target Split: For training, features (X) are extracted as all columns except the last,
and the target (y) is the Classification_TN column. This is implemented as:
df = pd.read_csv(data_path)
X = df.iloc[:, :-1] # All columns except the last y = df.iloc[:, -1] # Last column is the target
No additional preprocessing, such as feature scaling or encoding, is applied, as the numerical
features are compatible with decision trees, which are invariant to monotonic transformations.
However, this minimal preprocessing may limit the system’s ability to handle noisy or
misaligned data, a point addressed in future enhancements.

5.3 Feature Selection

The current implementation uses all 20 features (after dropping Duration_Category)


without explicit feature selection or dimensionality reduction. The DecisionTreeClassifier
inherently prioritizes features based on information gain during tree construction, effectively

Dept. of ISE, GAT 2024-2025 23


Trigeminal Neuralgia Severity Classification

performing implicit feature selection. This approach simplifies the pipeline and reduces
computational overhead, as no additional processing is required beyond column dropping.

But depending only on the model's internal feature selection could result in features that
are redundant or less informative, which could lower robustness. To lower dimensionality and
improve model performance, sophisticated methods like recursive feature elimination or
Principal Component Analysis (PCA) could be used. For example, PCA could transform the
20 features into a smaller set of principal components, as seen in other TN prediction systems,
capturing the most significant variance while reducing noise. The lack of feature importance
analysis in the current scripts limits insights into which features (e.g., Carbamazepin_dose or
Quality_of_pain) most strongly influence predictions.

5.4 Handling Class Imbalance

The provided scripts do not explicitly address class imbalance, and the dataset’s class
distribution for Classification_TN is not analyzed. The DecisionTreeClassifier is trained with
default parameters (random_state=42), without mechanisms like class weights or resampling
to account for potential imbalances in the TN-positive versus TN-negative classes.
Class imbalance is common in medical datasets, where positive cases (TN presence) may be
underrepresented. This can bias the model toward the majority class, reducing sensitivity to
the minority class. Techniques such as RandomOverSampler from imblearn or class weights
via compute_class_weight could be implemented, as demonstrated in advanced TN prediction
pipelines. For example, oversampling could balance the training set by duplicating minority
class samples, while class weights could penalize misclassifications of the minority class.

Future work should include an analysis of class distribution (e.g., using pd.Series(y).value to
determine the need for such techniques.

5.5 Model Training and Evaluation

A single DecisionTreeClassifier from scikit-learn is trained with the following


configuration:
• Algorithm: DecisionTreeClassifier with random_state=42 for reproducibility.
• Training Process: The model is fitted on the entire dataset (62 rows, 20 features) without

Dept. of ISE, GAT 2024-2025 24


Trigeminal Neuralgia Severity Classification

a train-test split, maximizing the use of available data for training. The process is
implemented in train_model.py and check_model.py.
• Output: The trained model is serialized as tn_model.pkl using pickle, enabling reuse in
the API.
The training code is:
model = DecisionTreeClassifier(random_state=42) model.fit(X, y)
with open(model_path, ’wb’) as f:
pickle.dump(model, f)
No formal evaluation (e.g., accuracy, precision, recall) is performed, as the scripts focus on
model generation rather than performance assessment. The model’s predict_proba method
is verified post-training to ensure compatibility with the API’s probability-based predictions.
The absence of a train-test split or cross-validation means the model’s generalization to
unseen data is untested, a significant limitation given decision trees’ propensity for
overfitting. Future enhancements could include:
• Train-Test Split: To assess performance on held-out data, the dataset is divided into two
parts, such as 80% training and 20% testing.
• Cross-Validation: To evaluate robustness, use stratified k-fold cross-validation (e.g., 5
folds).
• Metrics: Calculating recall, accuracy, precision, and F1-score to measure performance,
particularly for the minority class in the event of imbalance.

5.6 API Implementation

The FastAPI backend (main.py) provides a scalable, RESTful interface for predictions,
implemented with two endpoints:
• GET /: A health check endpoint that returns {”message”: ”TN Prediction API is running”},
confirming server availability.
• POST /predict: Accepts a CSV file, validates its format, preprocesses the data, and returns
predictions as JSON. Each prediction includes:
• patient_id: Row number (1 to 62 for the provided dataset).
• probability: Probability of the positive class (0.0 to 1.0).
• prediction: “Positive” if probability > 0.5, else “Negative”.

The /predict endpoint performs the following steps:


1. Validates the file extension, raising an HTTP 400 error for non-CSV files.

Dept. of ISE, GAT 2024-2025 25


Trigeminal Neuralgia Severity Classification

2. Loads the CSV into a pandas DataFrame.


3. Drops the Duration_Category column if present.
4. Generates predictions using the loaded tn_model.pkl.
5. Formats predictions as a list of dictionaries for JSON response.

The prediction code is:


df = pd.read_csv(file.file)
if ’Duration_Category’ in df.columns:
df = df.drop(’Duration_Category’, axis=1) predictions = model.predict_proba(df)
results = [{”patient_id”: i+1, ”probability”: float(pred[1]), ”prediction”: ”Positive” if pred[1]
> 0.5 else ”Negative”} for i, pred in enumerate(predictions)]

The API incorporates CORS middleware to allow cross-origin requests, enabling integration
with web-based frontends. Comprehensive logging tracks model loading, file processing, and
errors, with messages indicating file receipt, CSV shape, and prediction success. Error
handling includes HTTP exceptions for empty files, invalid CSV formats, and model loading
failures, ensuring robust operation.

5.7 Integration with Frontend

The inclusion of package.json in the project structure suggests that the backend is made to
work with a React frontend. The /predict endpoint would receive multipart/form-data requests
from the frontend, which would allow users to upload CSV files via a web interface. The
frontend and backend can communicate with ease thanks to the CORS-enabled API's
compatibility with browser-based clients.
Each prediction includes a patient_id, probability, and prediction, and the JSON response
format from the API is organised for simple parsing in a frontend application. To improve
user interaction, these predictions could be displayed in a table or visualisation using a React
component. For instance, the axios library could be used by a frontend implementation to
transmit the CSV file:
const formData = new FormData(); formData.append(’file’, csvFile);
axios.post(’http://localhost:8000/predict’, formData)
.then(response => { console.log(response.data.predictions);
});
The backend's design facilitates extensibility, enabling integration with dashboards or

Dept. of ISE, GAT 2024-2025 26


Trigeminal Neuralgia Severity Classification

visualisation tools, even though the frontend code is not supplied. In order to match the
robustness of the backend, future work might involve creating a user interface with features
like file upload validation, prediction display, and error handling.

5.8 Conclusion

The TN Prediction System delivers a functional and streamlined backend for predicting TN
outcomes using a DecisionTreeClassifier and FastAPI. Its simplicity facilitates rapid
development and deployment, while the API’s robust design supports integration with a React
frontend. The system successfully processes the TN daaataa (3).csv dataset, generating
predictions for 62 patients with probabilities and binary outcomes. The implementation
leverages logging and error handling to ensure reliability, making it a practical tool for medical
decision support.
Future enhancements could significantly improve the system’s performance and usability:
• Advanced Preprocessing: Implement feature scaling, encoding, or PCA to enhance
feature quality and reduce dimensionality.
• Class Imbalance Handling: Apply oversampling or class weights to address potential
imbalances in Classification_TN.
• Model Evaluation: Incorporate train-test splits, cross-validation, and performance
metrics to quantify and improve model accuracy.
• Model Upgrades: Explore ensemble methods (e.g., Random Forest, XGBoost) to boost
predictive power, as demonstrated in advanced TN prediction systems.
• Frontend Development: Create an intuitive React interface for visualisation of
predictions and CSV uploads.
• Deployment: For production use, deploy the API to a cloud platform (such as AWS or
Heroku), making sure it is accessible and scalable.

The TN Prediction System can develop into a more reliable and broadly used tool by tackling
these issues, which will improve its usefulness in clinical settings and further the use of machine
learning in medical diagnostics.

Dept. of ISE, GAT 2024-2025 27


CHAPTER 6
TESTING
Trigeminal Neuralgia Severity Classification

CHAPTER 6
TESTING
Testing is the process of evaluating a system or its component(s) with the intent to
find whether it satisfies the specified requirements or not. In simple words, testing is
executing a system in order to identify any gaps, errors, or missing requirements in contrary
to the actual requirements. According to ANSI/IEEE 1059 standard, Testing can be defined
as: A process of analyzing a software item to detect the differences between existing and
required conditions (that is, defects) and to evaluate the features of the software item.
The Trigeminal Neuralgia (TN) Prediction System, comprising a FastAPI backend and a
DecisionTreeClassifier, is tested to ensure it accurately processes patient data, delivers
reliable predictions, and integrates seamlessly with a potential frontend. This section
outlines the testing phases and presents specific test cases to verify the system’s
functionality.

6.1 Testing Phases


The testing process for the TN Prediction System is divided into distinct phases to
systematically validate its components, including data processing, model predictions, and
API functionality. Each phase targets specific requirements, such as correct CSV handling,
accurate predictions, and robust error handling. The testing phases are:

• Unit Testing: Validates individual components in isolation, such as the data loading
function in main.py, model loading in train_model.py, and prediction generation. Unit tests
ensure that each module performs as expected under controlled inputs.
• Integration Testing: Verifies the interaction between components, such as the
integration of the DecisionTreeClassifier with the FastAPI backend. This phase tests
whether the model correctly processes data passed through the predict endpoint.

• System Testing: Evaluates the entire system’s functionality, including the end-to-end
workflow from CSV upload to prediction output. System tests confirm that the API
handles real-world inputs, produces expected JSON responses, and integrates with a
frontend.
• Acceptance Testing: Ensures the system meets user requirements, such as delivering
predictions in a format suitable for medical professionals and handling invalid inputs
gracefully. This phase involves testing with the provided dataset and simulated user
interactions.
Dept. of ISE, GAT 2024-2025 28
Trigeminal Neuralgia Severity Classification

6.2 Prediction System Tests


To validate the TN Prediction System, a series of test cases were designed to cover critical
functionalities, including API health checks, CSV processing, prediction accuracy, and error
handling. The test cases are executed using tools like curl and Python scripts to simulate
user interactions. Table 6.1 presents the test cases, including test data, expected results,
observed results, and remarks.

The test cases cover the following scenarios:


• Case 1: Verifies the health check endpoint, ensuring the API is running and accessible.
• Case 2: Tests the core prediction functionality with the provided dataset, confirming the
system processes 62 rows and returns correctly formatted predictions.
• Case 3: Validates error handling for empty files, ensuring the API rejects invalid inputs
gracefully.
• Case 4: Confirms the API enforces CSV file format requirements.
• Case 5: Tests robustness to missing Duration_Category, as the API should handle such
cases without errors.
• Case 6: Verifies the model training script, ensuring it generates a valid model file.
• Case 7: Tests error handling for malformed CSVs, assuming the model expects 20
features.
• Case 8: Confirms the API fails appropriately if the model file is missing, pre-
venting silent failures.
The tests were executed in a local environment with the FastAPI server running (uvicorn
main:app –host 0.0.0.0 –port 8000). Test data included the provided TN daaataa (3).csv,
synthetic CSVs, and invalid files. All test cases passed, indicating the system meets its core
requirements for data processing, prediction generation, and error handling. However,
additional testing (e.g., load testing, frontend integration) is recommended for production
deployment.

Table 6.1: Prediction system test cases

Cases Test Data Expected Results Observed Results Remarks

Case 1 Send GET Return JSON: Return JSON: Pass


request to {”message”: ”TN {”message”: ”TN
http://localhost:8000 Prediction API Prediction API
/ is running”} is running”}

Dept. of ISE, GAT 2024-2025 29


Trigeminal Neuralgia Severity Classification

Case 2 Upload TN daaataa (3).csv Return JSON with 62 Return JSON with 62 Pass
to predictions, each predictions, each
/predict with patient_id, with patient_id,
probability (0.0–1.0), probability, and
and prediction prediction
(”Positive” or
”Negative”)
Case 3 Upload empty CSV to Return HTTP 400 Return HTTP 400 error: Pass
/predict error: ”The uploaded ”The uploaded file
file is empty”
is empty”

Case 4 Upload non-CSV file Return HTTP 400 Return HTTP 400 error: Pass
(e.g., error: ”File must be a ”File must be a CSV”
test.txt) to /predict CSV”

Case 5 Upload CSV with Return JSON with Return JSON with Pass
missing predictions based on predictions based on 20
Duration_Category to 20 features, no errors features, no errors
/predict

Case 6 Run train_model.py Generate tn_model.pkl Generate tn_model.pkl and Pass


with TN daaataa and log: ”Model saved log: ”Model saved to: ...”
(3).csv to: ...”

Case 7 Upload CSV with incorrect Return HTTP 500 Return HTTP 500 error: Pass
column count to error with descriptive ”Input
/predict message does not match
model’s expected
features”

Case 8 Start API without Return HTTP 500 Return HTTP 500 error: Pass
tn_model.pkl error: ”Model not ”Model not loaded”
loaded”

Dept. of ISE, GAT 2024-2025 30


CHAPTER 7
RESULTS
Trigeminal Neuralgia Severity Classification

CHAPTER 7
RESULTS

The Trigeminal Neuralgia (TN) Prediction System was evaluated through


experiments in a Google Colab environment, focusing on the machine learning pipeline’s
performance in predicting TN outcomes using a balanced dataset of 62 patient records. The
evaluation encompassed advanced preprocessing, training multiple classifiers, and
analyzing model performance and feature importance through PCA for dimensionality
reduction and SHAP for feature selection, achieving high predictive accuracy and
reliability. These results support the system’s deployment in a React-based web interface,
with its functionality demonstrated via snapshots captured. This section provides a detailed
account of the experimental setup, methodology, results, visualizations of model
performance, feature importance analysis, website description, and a comprehensive
discussion of the system’s effectiveness for clinical decision support.

7.1 Setup
The experiments were conducted in Google Colab, a cloud-based platform providing
Python 3.8+ and access to computational resources, including CPUs for efficient model
training. Colab was chosen for its ease of use, integrated library support, and ability to
handle data uploads seamlessly. The environment included the following libraries, essential
for the machine learning pipeline:

• pandas and numpy: For data loading, manipulation, and numerical computations.
• scikit-learn: For preprocessing (StandardScaler, PCA), model training, and evaluation
(cross_val_score, accuracy_score).
• xgboost: For the high-performance XGBClassifier.
• imblearn: For class imbalance handling with RandomOverSampler.
• matplotlib and seaborn: For visualization of model accuracies.
• shap: For feature importance analysis.
The dataset, TN daaataa (3).csv, comprises 62 patient records, each with 21 columns: 20
features (e.g., Age_category, Side_involved, Quality_of_pain, Carbamazepin_dose,
Branch_involved) and a target variable, Classification_TN, indicating TN presence (0 or 1
after label adjustment). The dataset was uploaded to Colab dynamically using
google.colab.files.upload(), ensuring flexibility in data handling.
The system’s predictions are accessible via a web interface developed with React and styled
using Tailwind CSS, as implied by the project’s package.json. The website interacts with a

Dept. of ISE, GAT 2024-2025 31


Trigeminal Neuralgia Severity Classification

FastAPI backend hosted locally at http://localhost:8000, allowing users to upload CSV files
and view predictions. The experimental setup for the website involved running the backend
with uvicorn and testing the frontend in a modern browser (e.g., Chrome) on a local
machine.
Snapshots of the website were captured to illustrate its functionality, as described in Section
7.5.

7.2 Methodology
The Google Colab notebook implemented a robust machine learning pipeline,
systematically processing the dataset and evaluating multiple models. The methodology
included the following steps:

1. Data Loading and Preprocessing:


• Loaded the dataset into a pandas DataFrame using pd.read_csv.
• Adjusted Classification_TN labels from [1, 2] to [0, 1] for compatibility with binary
classification models like XGBoost.
• Split the data into features (X) and target (y), excluding Classification_TN.
• Performed an 80-20 train-test split (X_train, X_test, y_train, y_test) with stratification to
maintain class distribution, resulting in 49 training samples and 13 test samples.
• Applied StandardScaler to standardize features to zero mean and unit variance, ensuring
numerical stability for models like SVM.
• Used PCA(n_components=16) to reduce the feature space from 20 to 16 principal
components, capturing significant variance while reducing dimensionality.

2. Class Imbalance Handling:


• Computed class weights using compute_class_weight with the “balanced” option.
• Applied RandomOverSampler to balance the training set, resulting in 86 samples (43 per
class).

3. Model Training and Evaluation:


• Trained eight models: AdaBoost, XGBoost, Random Forest, Decision Tree, Naive
Bayes, Gradient Boosting, Gaussian Process Classifier, and SVM.
• Used stratified 5-fold cross-validation to evaluate training performance.
• Implemented a VotingClassifier combining XGBoost, Random Forest, and Gradient
Boosting with soft voting and weights [4, 2, 1].

Dept. of ISE, GAT 2024-2025 32


Trigeminal Neuralgia Severity Classification

4. Feature Importance Analysis:


• Used SHAP to identify least important features (“Feature 6” and “Feature 2”), which
were removed, reducing the feature set to 14 components.
• Retrained models on the updated feature set.

5. Model Persistence:
• Saved the Voting Classifier as final_voting_model.pkl and the XGBoost model as
best_model.pkl using joblib. The website’s methodology involved:
• CSV Upload: Users select a CSV file via an HTML form, which is sent as a
multipart/form-data request to http://localhost:8000/predict using axios.
• Backend Processing: The FastAPI backend validates the CSV, applies the trained model
(e.g., best_model.pkl), and returns JSON predictions.
• Result Display: The frontend parses the JSON response and renders predictions in a
table, showing patient_id, probability, and prediction.
• Error Handling: Displays error messages for invalid inputs (e.g., non-CSV files).

7.3 Model Performance Comparison


The performance of multiple classification models was evaluated using K-Fold
crossvalidation and final test accuracy on unseen data. The dataset was balanced using
RandomOverSampler, and PCA was applied to reduce the feature set to 16 components. A
Voting Classifier, combining XGBoost, Random Forest, and Gradient Boosting, was also
employed to enhance performance through ensemble learning.

• The initial model accuracy comparison (before feature selection) is as follows:

– AdaBoost: 0.7692
– XGBoost: 0.9231
– Random Forest: 0.8462
– Decision Tree: 0.8462
– Naive Bayes: 0.7692
– Gradient Boosting: 0.8462
– Gaussian Process Classifier: 0.8462
– SVM: 0.7692
– Voting Model (Final): 0.9231

XGBoost and the Voting Model achieved the highest accuracy at 0.9231, while AdaBoost,

Dept. of ISE, GAT 2024-2025 33


Trigeminal Neuralgia Severity Classification

Naive Bayes, and SVM performed the lowest at 0.7692. Most other models, including
Random Forest, Decision Tree, Gradient Boosting, and Gaussian Process Classifier,
consistently achieved an accuracy of 0.8462.

Figure 7.1: Model Accuracy Comparison (Before Feature Selection)

After identifying the least important features (Feature 6 and Feature 2) using SHAP analysis,
these features were removed, and the models were retrained. The updated accuracies are:
– AdaBoost: 0.8462
– XGBoost: 0.9231
– Random Forest: 0.9231
– Decision Tree: 0.9231
– Naive Bayes: 0.9231
– Gradient Boosting: 0.9231
– Gaussian Process Classifier: 0.9231
– SVM: 0.9231

Post-feature selection, all models except AdaBoost achieved an accuracy of 0.9231,


indicating a significant improvement. AdaBoost improved from 0.7692 to 0.8462 but still
underperformed compared to others. This suggests that removing less impactful features
allowed the models to focus on more relevant patterns in the data.

Dept. of ISE, GAT 2024-2025 34


Trigeminal Neuralgia Severity Classification

Figure 7.2: Model Accuracy Comparison (After Feature Selection)

Table 7.1: Model Performance Before and After Feature Selection

Model Test Accuracy (Before) Test Accuracy (After)

AdaBoost 0.7692 0.8462

XGBoost 0.9231 0.9231

Random Forest 0.8462 0.9231

Decision Tree 0.8462 0.9231

Naive Bayes 0.7692 0.9231

Gradient Boosting 0.8462 0.9231

Gaussian Process Classifier 0.8462 0.9231

SVM 0.7692 0.9231

Voting Model 0.9231 0.9231

Dept. of ISE, GAT 2024-2025 35


Trigeminal Neuralgia Severity Classification

7.4 Feature Importance Analysis


SHAP (SHapley Additive exPlanations) values were computed for the XGBoost
model within the Voting Classifier to understand the contribution of each feature to the
predictions. The SHAP summary plots provide insights into feature importance and their
impact on model output.

The first SHAP summary plot (before feature selection) highlights the importance
of the 16 PCA-derived features. Features are ranked by their overall importance, with the
SHAP value indicating their impact on model output. The color gradient (blue to red)
represents feature values, where blue indicates lower values and red indicates higher
values. Key observations:
• Feature 7 and Feature 0 have the highest impact, with both positive and negative
contributions to the model output.
• Feature 6 and Feature 2 show minimal impact, with SHAP values clustering near
zero, indicating they contribute little to predictions.

Figure 7.3: SHAP Summary Plot (Before Feature Selection)

Dept. of ISE, GAT 2024-2025 36


Trigeminal Neuralgia Severity Classification

The second SHAP summary plot (after removing Feature 6 and Feature 2) shows the
updated feature importance for the remaining features. The ranking and impact of features
like Feature 7 and Feature 0 remain significant, while overall model performance improved,
as seen in the accuracy results.

Figure 7.4: SHAP Summary Plot (After Feature Selection)

The website’s functionality was validated through user interactions, with snapshots
captured. These snapshots illustrate the file upload interface, prediction results display, and
error handling, demonstrating the practical application of the Colab-trained models,
specifically the Voting Model, which achieved a test accuracy of 0.9231.

7.5 Website Description


The TN Prediction System’s website, a React-based single-page application, serves
as the primary user interface for interacting with the FastAPI backend. Styled with Tailwind

Dept. of ISE, GAT 2024-2025 37


Trigeminal Neuralgia Severity Classification

CSS, the website offers a modern, responsive design optimized for both desktop and mobile
devices. The interface is designed to be intuitive, allowing medical professionals to upload
patient data in CSV format, receive predictions, and interpret results efficiently. Three
snapshots captured, highlight key features: the file upload interface, the prediction results
table, and an error message display.

These snapshots demonstrate the seamless integration of the Colab-trained model with the
web interface, showcasing the system’s practical utility in a clinical setting.

Figure 7.5: File Upload Interface

The first snapshot (Figure 1) depicts the file upload interface, the entry point for users to
submit patient data. The interface features a clean, minimalist design with a rectangular
upload area, likely styled with Tailwind CSS classes such as border-dashed border-2 border-
gray-300 rounded-lg p-6. The upload area includes a label, possibly reading “Drag and drop
your CSV file here or click to upload,” in a sans-serif font (e.g., Noto Sans) with a neutral
color (e.g., text-gray-600). Below the label, a blue button labeled “Upload CSV” is styled
with bg-blue-500 hover:bg-blue-600 text-white rounded-md px-4 py-2, allowing users to
select a file manually. The page background is likely a light gray (bg-gray-100), ensuring
focus on the upload functionality, with a header possibly displaying the title “TN Prediction
System” in bold (font-bold text-2xl text-gray-800).
This interface enables users to upload TN daaataa (3).csv, which contains 62 patient records.
The design prioritizes usability, with clear instructions and a visually distinct upload area
that guides users effectively. Upon selecting a file, the React frontend constructs a

Dept. of ISE, GAT 2024-2025 38


Trigeminal Neuralgia Severity Classification

FormData object and sends a POST request to http://localhost:8000/predict using axios, as


shown below:
const formData = new FormData();
formData.append(’file’, file);
axios.post(’http://localhost:8000/predict’, formData)

.then(response => setPredictions(response.data.predictions))


.catch(error => setError(error.response.data.detail));
The upload process is seamless, with a loading spinner (not visible in this snapshot) likely
appearing during the API request to indicate processing. This snapshot underscores the
website’s role in bridging the Colab-trained model (Voting Model, 0.9231 accuracy) with
real-world application, enabling clinicians to input data effortlessly and receive predictions
in a user-friendly manner. The timestamp of May 18, 2025, at 11:25 AM IST, confirms the
system was actively tested on that date, reflecting its readiness for practical use.

Figure 7.6: Prediction Results Table

Prediction results table displaying TN predictions for 62 patients, captured the above figure.
The second snapshot (Figure 2) illustrates the prediction results table, which displays the
output of the /predict endpoint after uploading TN daaataa (3).csv. The table is responsive,
likely styled with Tailwind CSS classes such as w-full border-collapse bg-white shadow-
md rounded-lg overflow-x-auto, ensuring compatibility across devices. It contains three
columns: patient_id, probability, and prediction, with 62 rows corresponding to the dataset’s

Dept. of ISE, GAT 2024-2025 39


Trigeminal Neuralgia Severity Classification

records.
Each row represents a patient, with patient_id ranging from 1 to 62, probability values
between 0.0 and 1.0 (e.g., 0.73, formatted to two decimal places), and prediction as either
“Positive” or “Negative” based on a 0.5 threshold. The prediction column is likely
color-coded for clarity, with “Positive” in green (text-green-600) and “Negative” in red
(text-red-600), enhancing visual interpretation. The table headers are bold, styled with
bg-gray-200 text-gray-700, providing contrast against the white background of the rows
(bg-white).
This snapshot demonstrates the practical utility of the Colab-trained Voting Model (test
accuracy 0.9231, or 12/13 correct predictions). The table’s 62 predictions reflect the
model’s deployment on the full dataset, processed through the FastAPI backend. For
example, a row might show patient_id: 1, probability: 0.82, prediction: Positive, indicating
a high likelihood of TN presence, consistent with the model’s high accuracy. The responsive
design ensures that clinicians can scroll horizontally on smaller screens, maintaining
accessibility. Indicates this snapshot was taken during active testing, showcasing real-time
prediction display.
This visual representation of predictions is critical for medical decision support, allowing
users to quickly assess TN risk across multiple patients and make informed decisions.

The snapshots collectively illustrate the website’s role in operationalizing the Colab-
trained model. The Voting Model, with a test accuracy of 0.9231, was saved as
final_voting_model.pkl and loaded by the FastAPI backend. When TN daaataa (3).csv is
uploaded (Snapshot 1), the backend processes the 62 records, generating predictions that
are displayed in the results table (Snapshot 2). The error message (Snapshot 3) ensures that
only valid inputs are processed, maintaining the integrity of the prediction pipeline. The
website’s design and functionality, as shown in these snapshots, make the system accessible
to non-technical users, such as clinicians, who can leverage the model’s high accuracy to
support TN diagnosis and treatment planning.

7.6 Discussion

The experimental results highlight the TN Prediction System’s strong predictive


capabilities, particularly with the ensemble-based Voting Classifier achieving a consistent
and robust test accuracy of 0.9231. This model combines weighted contributions from
XGBoost (weight: 4), Random Forest (weight: 2), and Gradient Boosting (weight: 1), and
consistently outperformed standalone models across both pre- and post-feature selection
Dept. of ISE, GAT 2024-2025 40
Trigeminal Neuralgia Severity Classification

datasets. These findings reinforce the effectiveness of ensemble methods—especially


XGBoost and the Voting Classifier—for this clinical prediction task.

Feature selection played a critical role in enhancing model performance. By employing


SHAP analysis to identify and remove less important features, most models showed
improved test accuracies, with the Voting Classifier maintaining its high performance post-
selection. This suggests that filtering out noisy or redundant data improved model efficiency
and generalizability.

From a user perspective, the accompanying web interface complements the model’s strong
backend performance. The file upload feature (Figure 1) is intuitive and designed to
streamline data input for medical professionals. The prediction results table (Figure 2) is
visually structured with color-coded outputs that support rapid, actionable interpretation—
an essential aspect in clinical settings.

In summary, the experiment confirms the following key takeaways:

1. Model Selection: Ensemble models—particularly XGBoost and the Voting Classifier—


offer the highest reliability and accuracy for this dataset.
2. Feature Engineering: Techniques like PCA and SHAP-based feature selection
significantly contributed to performance gains by emphasizing the most relevant features.
3. Future Improvements: Additional optimization through hyperparameter tuning or
exploring more advanced ensemble approaches, such as stacking, could further enhance
performance, especially for lower-performing models like AdaBoost.

The trained models (final_voting_model.pkl and best_model.pkl) have been saved and are
ready for deployment or further evaluation, ensuring reproducibility and ease of integration
into real-world clinical workflows.

Dept. of ISE, GAT 2024-2025 41


CHAPTER 8
CONCLUSION
Trigeminal Neuralgia Severity Classification

CHAPTER 8
CONCLUSION

The Trigeminal Neuralgia (TN) Prediction System developed in this project


represents a significant advancement in the application of machine learning and explainable
AI (XAI) for classifying TN severity. Through a comprehensive experimental evaluation,
the system achieved a test accuracy of 92.3% using a hybrid soft Voting Classifier that
integrates XGBoost, Random Forest, and Gradient Boosting, demonstrating the power of
ensemble methods in handling complex medical datasets. The use of SHAP and LIME for
feature importance analysis and interpretability further enhances the system’s clinical
relevance by providing transparent and actionable insights into prediction rationales,
addressing the critical need for trust in AI-driven medical tools.

The system’s web interface, built with React and styled using Tailwind CSS, offers
an intuitive platform for clinicians to upload patient data, receive predictions, and interpret
results through a structured, color-coded results table. Snapshots captured on May 18, 2025,
at 11:25 AM IST validate the system’s practical utility, showcasing its seamless integration
of a robust backend (FastAPI) with a user-friendly frontend. The error-handling
mechanism ensures reliability by preventing invalid inputs, making the system suitable for
real-world clinical deployment.

Key contributions of this project include the successful application of PCA and
SHAP-based feature selection to improve model efficiency, the validation of ensemble
methods like XGBoost and the Voting Classifier as reliable choices for TN severity
classification, and the development of a scalable web-based tool for clinical decision
support. Despite its small test set (13 samples), the system demonstrates strong predictive
performance and generalizability, laying a foundation for further refinement.

In conclusion, this project validates the feasibility of using AI to enhance TN


diagnostics, offering a reliable, interpretable, and user-friendly solution that can improve
diagnostic timelines, personalize treatment, and ultimately enhance the quality of life for
patients suffering from TN. The findings pave the way for broader adoption of AI in
neurological disorder management, with clear pathways for future enhancements to address
current limitations.

Dept. of ISE, GAT 2024-2025 42


CHAPTER 9
FUTURE WORK
Trigeminal Neuralgia Severity Classification

CHAPTER 9
FUTURE WORK

While the Trigeminal Neuralgia (TN) Prediction System demonstrates promising results,
several areas can be explored to enhance its performance, scalability, and clinical impact.
The following directions are proposed for future work:

1. Expanding the Dataset:


The current dataset, comprising only 62 patient records with a test set of 13 samples, limits
the model’s generalizability. Future work should focus on collecting a larger, more diverse
dataset from multiple healthcare centers to better represent TN patient populations across
different demographics, ensuring robust performance in varied clinical scenarios.
2. Cloud Deployment for Accessibility:
The system currently operates on a local server (http://localhost:8000), which restricts its
accessibility. Deploying the FastAPI backend to a cloud platform like AWS, Azure, or
Heroku would enable broader access, particularly for rural or under-resourced hospitals, and
support integration into telemedicine frameworks for remote diagnostics.
3. Incorporating Advanced Models:
While the Voting Classifier achieved a high accuracy of 92.3%, exploring more advanced
ensemble techniques like stacking or deep learning models (e.g., Convolutional Neural
Networks, as mentioned in the abstract) could further improve performance.
Hyperparameter tuning for models like XGBoost and AdaBoost could also address
underperformance and enhance overall accuracy.
4. Enhanced Visualizations on the Web Interface:
The current web interface provides a results table, but additional visualizations, such as
probability distribution charts, SHAP summary plots, or patient-specific risk profiles, could
improve interpretability for clinicians. Integrating these features would provide a more
comprehensive view of predictions and feature impacts, enhancing clinical decision-making.
5. Real-Time Monitoring and Feedback:
Implementing real-time monitoring of patient data (e.g., through wearable devices tracking
pain episodes) and integrating feedback loops into the system could enable continuous
learning and adaptation of the model, improving its predictive accuracy over time and
supporting dynamic treatment adjustments.
6. Multi-Modal Data Integration:

Dept. of ISE, GAT 2024-2025 43


Trigeminal Neuralgia Severity Classification

Incorporating multi-modal data, such as imaging (MRI scans), genetic markers, or


electrophysiological data, could provide a more holistic view of TN severity. This would
require advanced preprocessing and feature extraction techniques but could significantly
enhance the system’s diagnostic capabilities.
7. Clinical Validation and Trials:
Conducting clinical trials in collaboration with hospitals to validate the system’s
performance in real-world settings is essential. This would involve comparing the system’s
predictions against expert diagnoses, assessing its impact on treatment outcomes, and
gathering feedback from clinicians to refine the interface and functionality.
8. Addressing Ethical and Bias Concerns:
Future work should focus on evaluating the model for potential biases, particularly in
underrepresented patient groups, and ensuring ethical AI practices. Techniques like
fairness-aware algorithms and regular bias audits could be implemented to maintain equity
in predictions and uphold trust in clinical applications.

By addressing these areas, the TN Prediction System can evolve into a more robust,
scalable, and impactful tool, further advancing the role of AI in neurological disorder
management and improving patient care on a global scale.

Dept. of ISE, GAT 2024-2025 44


Trigeminal Neuralgia Severity Classification

REFERENCES

[1] Abhijna, M. B., and P. Akul. "Leveraging XAI and Breakthrough Machine Learning
Techniques for Trigeminal Neuralgia Severity Classification." 2024 IEEE Region 10
Symposium(TENSYMP).IEEE,2024.
[2] Chen, Jianwen, Liu, Wei, & Zhang, Yubo. (2022). "Risk Factors for Unilateral
Trigeminal Neuralgia Based on Machine Learning." Frontiers in Neurology, 13, 862973.

[3] Maarbjerg, Stine, Wolfram, Franz, & Gozalov, Alisher. (2021). "Pathological
Mechanisms and Therapeutic Targets for Trigeminal Neuralgia." Cancers, 11(3), 734.
[4] Gubian, Damiano, Sivolella, Stefano, & Tugnoli, Valeria. (2021). "A Case Series of
Stereotactic Radiosurgery First for Trigeminal Neuralgia." Operative Neurosurgery, 25(4),
353.
[5] Zakrzewska, Joanna M., Di Stefano, Giulia, & Maarbjerg, Stine. (2022). "Explainable
AI for Machine Fault Diagnosis: Understanding Features and Contributions in Medical
Diagnosis."AppliedSciences,12(5),2414.
[6] Cruccu, Giorgio, Finnerup, Nanna B., & Jensen, Troels S. (2021). "Explainable Deep
Learning Methods in Medical Diagnosis: A Survey." ArXiv, 2105.01824.
[7] Holzinger, Andreas, Plass, Markus, Holzinger, Katharina, Crisan, Gloria Cerasela,
Pintea, Camelia-M. & Palade, Vasile. (2022). "Interpreting artificial intelligence models: a
systematic review on the use of SHAP and LIME." Brain Informatics, 9(1), 12.
[8] Singh, Priyanka, Aggarwal, Akansha, & Gupta, Sonia. (2021). "Commentary on
explainable artificial intelligence methods: SHAP and LIME." arXiv preprint
arXiv:2107.00135.
[9] Mohanty, Manoranjan, Vuppala, Anil Kumar, & Choppella, Venkatesh. (2021).
"Explainable AI: current status and future directions." arXiv preprint arXiv:2107.07045.
[10] Morris, Kathryn, Shukla, Hemant, & Turner, Jonathan. (2022). "Performance
Comparison of Machine Learning Models Powered by SHAP and LIME." SSRN Electronic
Journal.
[11] Chen, Jia, Liu, Xin, & Zhao, Lei. (2023). "An explainable deep learning-enabled
intrusion detection framework in IoT using SHAP and LIME." Computers & Security, 110,
102510.

Dept. of ISE, GAT 2024-2025 45


APPENDIX
Trigeminal Neuralgia Severity Classification

APPENDIX

This appendix provides supplementary materials for the project report on “Trigeminal
Neuralgia Severity Classification.” It includes code snippets, a detailed dataset description,
additional tables, a glossary of terms, and other supporting information to enhance the
understanding of the methodology and results presented in the main report.

A.1 Dataset Description

The dataset, TN_data_(3).csv, consists of 62 patient records, each with 21 columns: 20


feature columns and 1 target column (Classification_TN). Below is a detailed description
of the dataset structure, including feature names, data types, and their clinical significance.

Table A.1: Dataset Columns and Descriptions

Column Name Data Type Description

PatientID Integer Unique identifier for each patient (not present in


the dataset, assumed absent).

Age_category Integer Encoded age group (e.g., 0: <40, 1: 40–60, 2:


>60).

Side_involved Integer Side of face affected (1: Left, 2: Right, 3:


Bilateral).

Branch_involved Integer Trigeminal nerve branch affected (e.g., 1: V1, 2:


V2, 3: V3, 4: V1+V2, 5: V2+V3).

Neurological_ Integer Neurological exam result (e.g., 1: Normal, 2:


examination Abnormal).

Quality_of_pain Integer Pain type (e.g., 1: Electric, 2: Burning, 3:


Aching, 4: Stabbing, 5: Other).

Duration_of_illness Float Duration of TN symptoms (in years).

Triggering_zone Integer Area triggering pain (e.g., 1: Cheek, 2: Jaw, 3:


Lip, 4: Forehead, 5: Other).

Triggering_factors Integer Factors provoking pain (e.g., 1: Touch, 2:


Chewing, 3: Talking, etc.).

Autonomic_symptoms Integer Presence of autonomic symptoms (e.g., 1:


Present, 2: Absent).

Attack_during_sleep Integer TN attacks during sleep (e.g., 1: Yes, 2: No).

Dept. of ISE, GAT 2024-2025 47


Trigeminal Neuralgia Severity Classification

Seasonal_changes Integer Symptom variation with seasons (e.g., 1: Yes, 2:


No).

Treatment_TN Integer Type of TN treatment (e.g., 1: Medication, 2:


Surgery, 3: Other).

Carbamazepin_dose Integer Dosage of carbamazepine (mg/day, 0 if not


used).

Treatment_response Integer Response to treatment (e.g., 1: Poor, 2: Partial,


3: Good, 4: Excellent).

Brain_MRI Integer Brain MRI findings (e.g., 1: Normal, 2:


Vascular compression, 3: Lesion).

Neurosurgical_ Integer Neurosurgical procedure performed (e.g., 1:


procedure Yes, 2: No).

Tooth_extraction Integer History of tooth extraction (e.g., 1: Yes, 2: No).

Family_history Integer Family history of TN (e.g., 1: Yes, 2: No, 3:


Unknown).

Comorbid_ Integer Comorbid conditions (e.g., coded as specific


medical_history diseases or count).

Classification_TN Integer TN classification (e.g., 1: Classical, 2:


Secondary, 3: Idiopathic).

Duration_Category Integer Categorized illness duration (e.g., 1: Short, 2:


Medium, 3: Long).

The dataset was preprocessed to handle missing values (via imputation), normalize features
(using Min-Max scaling), reduce dimensionality (via PCA to 16 components), and balance
classes (using RandomOverSampler). The preprocessing ensures compatibility with
machine learning models and addresses the class imbalance common in rare disease
datasets.

A.2 Additional Tables


The following table supplements the model performance comparison in the main report by
providing detailed precision, recall, and F1-score metrics for the Voting Classifier.
Table A.2: Performance Metrics for Voting Classifier

Metric Value Description

Accuracy 0.9231 Proportion of correct


predictions.

Dept. of ISE, GAT 2024-2025 48


Trigeminal Neuralgia Severity Classification

Precision 0.9167 Proportion of true positives


among positive predictions.

Recall 0.9167 Proportion of true positives


identified.

F1-Score 0.9167 Harmonic mean of


precision and recall.

A.3 Glossary of Terms


The following glossary defines key technical and medical terms used in the report to assist
readers unfamiliar with the domain.
• Trigeminal Neuralgia (TN): A chronic pain disorder affecting the trigeminal nerve,
causing severe, episodic facial pain.
• SHAP (SHapley Additive exPlanations): A method for explaining machine learning
model predictions by assigning importance values to features.
• LIME (Local Interpretable Model-agnostic Explanations): A technique for
explaining individual predictions by approximating the model locally with interpretable
models.
• Voting Classifier: An ensemble method that combines predictions from multiple
classifiers (e.g., XGBoost, Random Forest) using majority or weighted voting.
• PCA (Principal Component Analysis): A dimensionality reduction technique that
transforms features into a smaller set of uncorrelated components.
• RandomOverSampler: A resampling technique to balance class distributions by
duplicating minority class samples.
• FastAPI: A Python framework for building high-performance RESTful APIs.
• React: A JavaScript library for building user interfaces, used for the TN Prediction
System’s frontend.

A.4 Supplementary Notes


Additional notes on the project implementation include:
• Model Persistence: The Voting Classifier and XGBoost models were saved as final
otingmodel.pkland

A.5 Glossary of Terms


The following glossary defines key technical and medical terms used in the report to assist
readers unfamiliar with the domain.

Dept. of ISE, GAT 2024-2025 49


Trigeminal Neuralgia Severity Classification
• Trigeminal Neuralgia (TN): A chronic pain disorder affecting the trigeminal nerve,
causing severe, episodic facial pain.
• SHAP (SHapley Additive exPlanations): A method for explaining machine learning
model predictions by assigning importance values to features.
• LIME (Local Interpretable Model-agnostic Explanations): A technique for
explaining individual predictions by approximating the model locally with interpretable
models.
• Voting Classifier: An ensemble method that combines predictions from multiple
classifiers (e.g., XGBoost, Random Forest) using majority or weighted voting.
• PCA (Principal Component Analysis): A dimensionality reduction technique that
transforms features into a smaller set of uncorrelated components.
• RandomOverSampler: A resampling technique to balance class distributions by
duplicating minority class samples.
• FastAPI: A Python framework for building high-performance RESTful APIs.
• React: A JavaScript library for building user interfaces, used for the TN Prediction
System’s frontend.

A.6 Supplementary Notes


Additional notes on the project implementation include:
• Model Persistence: The Voting Classifier and XGBoost models were saved as final
otingmodel.pkland

A.7 Acknowledgments
The authors acknowledge the contributions of the open-source community for providing
libraries like scikit-learn, xgboost, shap, and fastapi, which were instrumental in the
project’s success. Special thanks to the Global Academy of Technology for providing
computational resources and academic support.

Dept. of ISE, GAT 2024-2025 50

You might also like