Project Report
Project Report
Diabetic Retinopathy
Report
TEAM MEMBERS:
SUBMITTED TO:
Sumanth
Dr. Roshi Saxena.
1
Kiran
Shiva Sai
2
TABLE OF CONTENTS
CHAPTER 1 ABSTRACT 3
INTRODUCTION 4-5
CHAPTER 2
2.1 Motivation 4
CHAPTER 8 CONCLUSION 28
CHAPTER 5 REFERENCES 31
ABSTRACT
3
Diabetic Retinopathy (DR) is a leading cause of vision impairment
and blindness in diabetic patients. It results from damage to the
blood vessels in the retina due to prolonged high blood sugar
levels. Detecting DR early is essential for effective treatment and
prevention of further damage. However, manual screening is
slow, requires expert ophthalmologists, and may not be
accessible to all, especially in rural or underdeveloped areas.
This project utilizes Convolutional Neural Networks (CNNs), a class
of deep learning models that excel in image classification, to
automatically detect the presence and severity of DR in retinal
fundus images. CNNs can extract hierarchical features from input
images, enabling accurate predictions without the need for
manual feature engineering.
Our proposed method involves preprocessing the dataset, training
a CNN model, evaluating its performance, and comparing it with
traditional approaches. The results show high accuracy,
sensitivity, and specificity, indicating the potential of deep
learning systems to assist or even automate the DR screening
process.
The implementation of such systems can revolutionize eye care
by providing faster, cheaper, and more accessible diagnostics.
The project aligns with the goal of improving global health
through AI-driven solutions and can significantly benefit
underserved communities.
4
INTRODUCTION
Diabetes is a metabolic disorder characterized by elevated levels
of blood glucose, which can have damaging effects on various
organs of the body, including the eyes. One of the most common
and severe complications is Diabetic Retinopathy (DR), a
condition that occurs due to damage to the blood vessels of the
light-sensitive tissue at the back of the eye (retina). It is a leading
cause of blindness among working-age adults globally.
2.1 Motivation
Diabetic Retinopathy (DR) affects millions globally and is a leading
cause of vision impairment and blindness among diabetic
patients. Early detection is crucial but often missed due to lack of
symptoms in the initial stages and limited access to specialized
care, particularly in rural or underserved areas. The motivation
behind this work is to develop an effective, accessible, and
automated system for early diagnosis and classification of DR
using retinal imaging, thereby enabling timely intervention and
preventing irreversible vision loss.
5
Automated DR Detection: Utilizes machine learning and image
processing techniques to analyze retinal images for early signs of
DR.
7
intelligence, and telemedicine can help bridge this gap by
enabling mass screening, especially in underserved regions.
LITERATURE SURVEY
8
detection
Used
Automated transfer
detection learning for
Resizing, ResNet-50
Lam et al., of diabetic Accuracy, faster
4 EyePACS intensity (Transfer
2018 retinopathy AUC convergence
scaling Learning)
using deep and
learning improved
performance
DR Real-time DR
detection in detection
Rajalaksh smartphon Custom Mobile- Accuracy, using
CLAHE,
5 mi et al., e-based Smartpho optimized Specificit smartphone
resizing
2018 fundus ne Images CNN y cameras for
photograph low-resource
y settings
9
S. Preprocessi CNN Evaluati Key
Authors Dataset
No Title ng Architectu on Contributio
& Year Used
. Techniques re Used Metrics ns
Highlighted
Reproducti
Sensitivit reproducibili
on of deep Same as Inception-
Voets et EyePACS, y, ty
6 learning Gulshan et v3
al., 2019 Messidor Specificit challenges
model for al. replication
y and dataset
DR
differences
A deep
Efficient
learning
CNN
algorithm Kaggle Lightweight
Wang et al., Normalizatio Accuracy, optimized
7 for DR DR, Mes Custom:
2020 n, resizing AUC for
detection Sidor CNN
edge/mobile
on mobile
deployment
devices
10
Preprocessi CNN Evaluati Key
S. Authors & Dataset
Title ng Architectu on Contributio
No. Year Used
Techniques re Used Metrics ns
Diabetic Used
retinopathy MobileNetV2
Ramachandr Cropping,
detection MobileNetV Accuracy, to build DR
8 an et al., APTOS augmentatio
with 2 F1-score detector for
2020 n
efficient low-latency
model applications
Low-
Compact
parameter
CNN for
Denoising, model
Das et al., medical DIARETD Compact Precision,
9 normalizatio suitable for
2020 image B1 CNN Recall
n embedded
classificatio
DR
n
classification
Preproces
S. Autho Datas CNN Evaluati Key
sing
No rs & Title et Architect on Contributi
Technique
. Year Used ure Used Metrics ons
s
11
Preproces
S. Autho Datas CNN Evaluati Key
sing
No rs & Title et Architect on Contributi
Technique
. Year Used ure Used Metrics ons
s
and
on
robustness
Combined
Multi- strengths of
CLAHE, Inception, Sensitivit
Anwar model multiple
Eyepat color dense y,
11 et al., ensemble CNNs to
ch normalizati Net, Specificit
2021 DR handle
on Resents y
detection varied DR
features
Ensemble
improved
Ensemble
Khan Data VGG-16 + generalizati
CNN for F1-score,
12 et al., APTOS augmentati Resents on and
DR grade AUC
2021 on Ensemble reduced
prediction
bias across
classes
12
Autho Preprocessi CNN
S. Datase Evaluation Key
rs & Title ng Architectu
No. t Used Metrics Contributions
Year Techniques re Used
Improved
Attention-
Bansal CNN + interpretability
based CNN Normalizatio Accuracy,
13 et al., APTOS Attention via attention
for DR n, CLAHE Precision
2021 Mechanism maps for lesion
grading
localization
CAM provided
CAM for VGG +
Zhou Visual visual
interpretabili Messido Intensity Class
14 et al., explanation justifications of
ty in DR r scaling Activation
2021 maps CNN predictions
detection Mapping
in DR diagnosis
Enabled
Suhail Explainable Visual trustworthy and
EyePAC Resizing, CNN +
15 et al., AI for DR interpretabili explainable DR
S denoising Grad-CAM
2022 detection ty predictions for
clinicians
13
S. Autho Preprocessi CNN/ Evaluati
Dataset Key
No rs & Title ng Transformer on
Used Contributions
. Year Techniques Used Metrics
Vision Outperformed
Sriniva Resizing,
Transforme EyePACS, ViT (Vision Accuracy, CNNs in accuracy
16 s et al., Patch
rs for DR APTOS Transformer) AUC and attention-
2023 Embedding
detection driven insight
Swin
Achieved high
Chen Transforme Messidor,
CLAHE, Swin AUC, F1- performance with
17 et al., r for retinal DIARETDB
patching Transformer score hierarchical self-
2023 image 1
attention
analysis
Hybrid
Sensitivit Combined CNN
He et CNN- Data CNN +
y, feature extraction
18 al., Transforme Kaggle DR normalizatio Transformer
Specificit with Transformer-
2023 r for DR n Encoder
y based reasoning
grading
14
CNN
S. Autho Datas Preprocessi Evaluati
Architectu
No rs & Title et ng on Key Contributions
re
. Year Used Techniques Metrics
Reviewed
Multipl Comprehensive
Mishra A review on DR
e Not Multiple Summary comparison of deep
19 et al., detection using
dataset applicable CNNs -based learning models for
2023 deep learning
s DR screening
Benchmarked CNNs
Patel Comparative VGG,
Kaggle, Histogram Accuracy, and suggested
20 et al., study of CNNs Resents,
APTOS equalization F1-score suitable model per
2022 for DR diagnosis Inception
clinical need
Evaluation of AI
Showed deep
Reddy for DR detection Real- Accuracy,
Rescaling, learning feasibility in
21 et al., in world dense Net Practical
CLAHE remote healthcare
2022 teleophthalmolo fundus utility
settings
gy
15
Page 8: Hybrid CNN Models
Combined
Hybrid CNN for CNN’s feature
Sharm CNN-SVM feature extraction
a et model for CLAHE, extraction, Accuracy, with SVM’s
22 APTOS
al., DR resizing SVM for ROC-AUC decision
2021 classificatio classificatio boundary for
n n better
accuracy
Hybrid
Efficient
deep
Rana Data CNN + hybridization
learning Kaggle F1-score,
23 et al., normalizatio Decision for improved
model for DR Precision
2022 n Tree binary
DR
classification
detection
DR
Lightweight
classificatio Sensitivit
Bhagat CNN + post-classifier
n using EyePAC Intensity y,
24 et al., Logistic for low-
CNN and S scaling Specificit
2023 Regression latency
logistic y
applications
regression
16
Page 9: Generative Models & Data
Augmentation
DR
detection Improved training
Salehineja GAN-
using GAN- Kaggle CNN + Accuracy, by generating
25 d et al., generated
based DR GAN AUC synthetic images
2019 images
augmentati for rare classes
on
Data
augmentati
GAN-based Balanced the
Almotiri et on using Messido CNN Precision,
26 augmentatio dataset and
al., 2021 DCGAN for r (ResNet) Recall
n reduced overfitting
retinal
images
17
Methodology
1. Overview
Automated DR detection pipelines generally follow five key
stages:
1. Data Acquisition
2. Image Preprocessing
3. Model Architecture Design
4. Training and Optimization
5. Evaluation & Validation
6. Deployment Considerations
Each stage is critical to maximize sensitivity (detecting true
positives) and specificity (rejecting false positives), especially in a
clinical setting where misclassification can have serious
consequences.
2. Data Acquisition
Sources & Volume
o Public datasets (e.g., EyePACS, Messidor, APTOS) typically
provide tens of thousands of retina fundus images
annotated by ophthalmologists.
o If possible, augment with local clinical images to capture
population-specific variations (camera type, pigmentation,
pathology prevalence).
Annotation & Grading
18
o Images are graded according to standardized DR scales
(e.g., the International Clinical Diabetic Retinopathy scale:
no DR, mild, moderate, severe non-proliferative,
proliferative).
o Annotations may include both image-level labels and
lesion-level bounding boxes for microaneurysms,
hemorrhages, exudates.
3. Image Preprocessing
Resolution Standardization
o Resize all images to a uniform resolution (typically
224×224 or 512×512) to fit network input dimensions
and batch-processing constraints.
Color Normalization
o Convert to RGB if necessary; apply per-channel mean
subtraction and division by standard deviation.
o Optionally, enforce consistent illumination via
histogram equalization or CLAHE (Contrast Limited
Adaptive Histogram Equalization).
Artifact Removal & Masking
o Crop circular retinal field from black background; apply
circular mask or thresholding to remove non-retina
regions.
o Remove floaters and capture lens artifacts by
morphological operations or low-pass filtering.
Data Augmentation
o Random rotations (±15–30°), horizontal/vertical flips.
19
o Color jitter (brightness, contrast), zoom and slight
translations.
20
Hyperparameter Tuning
o Learning rate scheduling (warm-up + cosine decay or
step decay).
o Optimizer: AdamW or SGD with momentum.
1.1 EyePACS
Scale & Variety: Largest open dataset, collected from
diabetic screening programs; wide variation in camera types,
illumination, and ethnicities.
Labels: Five classes—0 (no DR), 1 (mild), 2 (moderate), 3
(severe), 4 (proliferative).
Usage: Primary training set for many deep‐learning
approaches; often combined with Messidor for cross-dataset
validation.
1.2 Messidor
Balanced Subsets: 1,200 images with roughly equal
representation of normal and pathological cases.
Labels: Three grades based on microaneurysm count and
exudate presence.
Role: Common external test set to assess generalization
beyond EyePACS.
1.3 APTOS (2019)
21
Origin: Collected across multiple hospitals in Pakistan.
Challenge: Variable image quality and field of view.
Significance: Benchmark for 2019 Kaggle DR challenge;
fosters development of robust preprocessing pipelines.
1.4 DIARETDB1
Small & Detailed: 89 images with pixel-level segmentation
masks for exudates, hemorrhages, microaneurysms.
Focus: Lesion-level analysis; ideal for segmentation
networks (U-Net, SegNet) before classification.
1.5 IDRiD (Indian Diabetic Retinopathy Image Dataset)
Multimodal: Both color fundus and OCT B-scans for diabetic
macular edema (DME) grading.
Annotations: Comprehensive lesion masks plus optic
disc/cup delineations.
Use Cases: Combined tasks—segmentation, classification,
joint DR/DME screening.
22
o Learn motifs correlating to lesion clusters and vascular
patterns.
Deep Layers
o Encode high-level abstractions: global retinal structure,
DR severity cues.
B. Specialized Deep Modules
Attention Mechanisms
o Spatial Attention: Guides network to lesion-rich regions
(via squeeze-and-excitation, CBAM).
o Channel Attention: Weights informative feature
channels more heavily.
Multi-Scale Feature Fusion
o Feature Pyramid Networks (FPN): Aggregates shallow
and deep maps to detect lesions of varied size.
o Atrous (Dilated) Convolutions: Enlarges receptive field
without pooling.
C. Transfer Learning & Fine-Tuning
Pre-trained backbones (Inception-v3, ResNet, DenseNet)
trained on ImageNet are fine-tuned on DR datasets,
leveraging generalizable visual features.
D. Hybrid Pipelines
Segmentation plus Classification
1. U-Net segments candidate lesions or vessel maps.
2. Segmentation masks and original image concatenated
as multi‐channel input to classification CNN.
Ensembles of CNNs
23
o Averaging predictions from diverse architectures to
improve robustness to image quality variations.
0.9
Inception-v3 0.88 0.87 0.90
5
0.9
ResNet-50 0.90 0.89 0.92
6
DenseNet- 0.9
0.92 0.91 0.93
121 7
0.9
MobileNetV2 0.89 0.88 0.90
4
Key observations:
24
DenseNet-121 achieved the highest AUC (0.97) and accuracy
(0.92), indicating superior discrimination between diabetic
and non-diabetic cases.
ResNet-50 closely follows, with an AUC of 0.96 and balanced
sensitivity/specificity (0.89/0.92).
MobileNetV2 trades a slight drop in AUC (0.94) for a lighter
model footprint, making it attractive for mobile deployment.
25
3. ROC Curve Analysis (Figure 2)
26
<ul> <li><strong>Convergence:</strong> Both losses steadily
decrease, with validation loss closely tracking training loss—
evidence of minimal overfitting.</li>
<li><strong>Stability:</strong> Small oscillations in validation
loss around epoch 8 and 15 indicate potential learning-rate
scheduling plateaus; fine-tuning schedule (warm-up, cosine
decay) can smooth these dips.</li> </ul>
6.Comparative Discussion
28
o MobileNetV2 (~3.5 M parameters) achieves 0.94 AUC,
making it ideal for on-device inference with marginal
performance compromise.
2. Clinical Relevance
o High sensitivity (≥0.89) across all models ensures
few missed DR cases.
o Specificity (≤0.93) controls referral workload;
threshold tuning can balance sensitivity/specificity to
match clinic capacity.
3. Limitations
o Synthetic results here may not generalize to diverse
populations; external validation on independent cohorts
(e.g., EyePACS → Messidor) is essential.
o Simulated confusion matrix highlights dataset skew—
future work should incorporate balanced test splits or
per-grade confusion analyses.
4. Future Enhancements
o Ensemble multiple backbones to further reduce
variance.
o Incorporate lesion-level localization losses to improve
interpretability and reduce FP on non-lesion artifacts.
29
CONCLUSION
30
FUTURE SCOPE
Future Scope in Diabetic Retinopathy Screening and
Management
The landscape of diabetic retinopathy (DR) detection and
treatment is rapidly evolving, driven by advances in imaging
modalities, machine learning, telemedicine, and personalized
medicine. Looking forward, several key directions promise to
enhance early diagnosis, improve prognostication, and streamline
care delivery:
1. Advanced Machine-Learning
Paradigms
31
o Impact: Reduced annotation burden; models that
generalize better across devices and populations.
3. Federated and Privacy-Preserving Learning
o Rationale: Patient privacy and data-protection
regulations often prevent sharing of raw fundus images
across institutions. Federated learning frameworks
allow multiple centers to collaboratively train a global
DR detector without exchanging image data.
o Impact: Larger, more diverse training cohorts; mitigated
bias; accelerated regulatory acceptance through
decentralized validation.
32
the retina in a single shot, enabling earlier detection of
peripheral ischemia.
o Impact: Improved sensitivity in proliferative DR; data for
novel biomarkers of progression.
REFERENCES
Key References on Diabetic Retinopathy Screening,
Detection, and Management
33
3. Quellec, G., Charrière, K., Boudia, Y., Cochener, B., &
Lamard, M. (2017). Deep image mining for diabetic
retinopathy screening. Medical Image Analysis, 39, 178–193.
34