DEEP INSIGHT: UNVEILING DIABETIC RETINOPATHY THROUGH ADVANCED NEURAL NETWORKS
Namratha S¹, K R Sumana²
¹PG Student, ²Assistant Professor
Department of MCA, The National Institute of Engineering, Mysuru, Visvesvaraya Technological University, India
________________________________________________________________________________________________________
Abstract: Diabetic retinopathy (DR) is a severe eye condition caused by diabetes, potentially leading to blindness if untreated. This project develops a machine learning model to classify DR stages using color fundus images. Data was sourced from the Aptos 2019 Blindness Detection dataset, the IDRiD dataset, and a custom dataset from Paraguay. Preprocessing included enhancing images with CLAHE. DenseNet169 was chosen after evaluating several pre-trained models and was fine-tuned on the combined dataset. This model provides an effective solution for early detection and classification of DR.
IndexTerms – Diabetic Retinopathy, DenseNet169, Deep Learning, Retinopathy Classification
________________________________________________________________________________________________________
I. INTRODUCTION
Diabetic retinopathy (DR) is a leading cause of vision loss globally, and its timely detection is crucial for preventing severe
vision impairment. Manual diagnosis of DR is resource-intensive and time-consuming, often leading to treatment delays and poorer
patient outcomes. This project aims to develop a neural network system that can accurately detect and categorize DR from retinal
images using advanced deep learning techniques, such as convolutional neural networks (CNNs). By analyzing images of patients'
left and right eyes, the system will classify the severity of DR into five categories: no DR, mild, moderate, severe, and proliferative
DR, scored from 0 to 4. The primary objective is to create an automated analysis system that can quickly generate a score based on
this scale, thus facilitating early diagnosis and intervention. The initiative seeks to minimize the risk of vision loss and enhance
patient care by streamlining the screening process.
II. LITERATURE SURVEY
To categorize the DR stages, Qummar et al. [1] trained an ensemble architecture of five deep CNN models in 2019: ResNet50, InceptionV3, Xception, DenseNet121, and DenseNet169. This ensemble architecture enables rich feature encoding, which enhances classification performance. According to the experimental findings, the suggested model performs better than other popular models trained on the same Kaggle dataset and is capable of correctly identifying the five stages of DR. By integrating several architectures, including residual and Inception modules, the methodology allows for a comprehensive analysis of retinal images, capturing their richness and depth.
An innovative hybrid Strawberry-based Convolution Neural Framework (SbCNF) was developed in 2023 by K Mithili et al. to
identify and classify retinopathy from retinal images [2]. The principal aim is to enable prompt identification and management of
conditions linked to increased blood pressure and glucose levels. The SbCNF algorithm is designed specifically to extract retinal
veins from fundus retinal images, providing essential information for disease diagnosis. The framework combines high accuracy with a reduced processing cost, making it effective for retinopathy identification. Nevertheless, the work does not address any
potential drawbacks or difficulties with the suggested SbCNF algorithm and does not provide a thorough analysis of the
classification methods used.
A hybrid Convolutional Neural Network (CNN) model that combines ResNet50 and InceptionV3 for reliable feature extraction in DR identification from fundus images was proposed by Ghulam Ali et al. [3] in 2023. An analysis carried out on a publicly accessible fundus image dataset highlights how successful the suggested strategy is. The model shows encouraging results in
DR identification, indicating its potential significance in clinical practice. This is in line with the recognition of the importance of
early detection and intervention in preventing diabetic vision loss.
Mohaimenul et al. developed a robust and lightweight deep learning model for the categorization of a wide variety of diabetic
retinopathy (DR) images [4]. The goal of the project is to develop an automatic DR identification system that is faster and more
accurate than manual approaches. To this end, three different DR datasets—APTOS, Messidor2, and IDRiD—are combined to
create a collection of 5,819 raw images. Three augmentation strategies yield a balanced dataset, while several image preparation techniques improve image quality. The paper also compares the performance of six pre-trained models: VGG16,
VGG19, MobileNetV2, ResNet50, InceptionV3, and Xception. The suggested model outperforms the others in terms of accuracy
and is robust.
III. PROPOSED WORK
The proposed system leverages advanced deep learning techniques to automate the detection and classification of diabetic
retinopathy stages from retinal images. The process begins with image acquisition, where retinal images are captured using
standard fundus cameras and uploaded to the system. These images then undergo preprocessing, including the application of
Contrast Limited Adaptive Histogram Equalization (CLAHE), to enhance their quality. A pre-trained deep learning model,
DenseNet169, is fine-tuned on the Aptos Blindness Detection 2019 and IDRiD datasets to accurately classify the images into
different stages of diabetic retinopathy. The system processes the images and predicts the stage of diabetic retinopathy, providing a
confidence score for each prediction. A user-friendly web interface built using Flask allows users to upload images and view the
prediction results, ensuring an accessible and efficient diagnostic tool for medical professionals.
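To make the pipeline concrete, the following is a minimal sketch of such a Flask interface. The model file name, route, and label strings are illustrative assumptions rather than the authors' actual code, and the preprocessing mirrors the steps described in Section 4.2.

import numpy as np
import cv2
from flask import Flask, request, jsonify
from tensorflow.keras.models import load_model

app = Flask(__name__)
model = load_model("densenet169_dr.h5")  # hypothetical fine-tuned model file
STAGES = ["No_DR", "Mild", "Moderate", "Severe", "Proliferative_DR"]

def preprocess(image_bytes):
    # Decode the upload, resize to the network input, and scale to [0, 1].
    # CLAHE enhancement (Section 4.2) would be applied here as well.
    img = cv2.imdecode(np.frombuffer(image_bytes, np.uint8), cv2.IMREAD_COLOR)
    img = cv2.resize(img, (224, 224)).astype("float32") / 255.0
    return np.expand_dims(img, axis=0)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects a multipart form upload with an "image" field.
    probs = model.predict(preprocess(request.files["image"].read()))[0]
    idx = int(np.argmax(probs))
    return jsonify({"stage": STAGES[idx], "confidence": float(probs[idx])})

if __name__ == "__main__":
    app.run()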
IV. METHODOLOGY
4.1 Data Collection
The Aptos 2019 Blindness Detection dataset, the Indian Diabetic Retinopathy Image Dataset (IDRiD), and the UNA, Paraguay fundus image dataset are the main datasets used in this study. These datasets were selected because they include extensive, annotated retinal images, which are essential for creating a reliable diabetic retinopathy detection algorithm.
Aptos 2019 Blindness Detection: This dataset is made up of retinal photographs collected under a range of imaging conditions. Each image is labeled with the severity of diabetic retinopathy on a scale of 0 to 4, where 0 denotes no diabetic retinopathy and 4 denotes the most severe level. This dataset was selected in particular because APTOS, created using a variety of camera models and types, includes 5,590 retinal images.
The Indian Diabetic Retinopathy Image Dataset (IDRiD): To add more granularity and support the model's generalization, the IDRiD dataset comprises images classified by different degrees of diabetic retinopathy and diabetic macular edema. The dataset annotates diabetic retinopathy lesions and normal structures and contains 516 retinal images, enabling researchers to train and evaluate machine learning models effectively.
UNA, Paraguay Fundus Images: This dataset consists of 757 color fundus photographs taken with ZEISS's VISUCAM 500 camera. Expert ophthalmologists have divided these images into groups such as Proliferative Diabetic Retinopathy (PDR) and Non-Proliferative Diabetic Retinopathy (NPDR).
4.2 Data Preprocessing
To improve image quality and guarantee that the input fed into the neural network is standardized and normalized, effective preprocessing is essential. The preprocessing steps are as follows.
Image Resizing: To maintain uniformity and conformity with the input requirements of the pre-trained models employed, all images are resized to 224 x 224 pixels.
Contrast Limited Adaptive Histogram Equalization (CLAHE): CLAHE is a sophisticated image processing method used to enhance contrast. In contrast to conventional histogram equalization, which enhances the contrast of the entire image, CLAHE operates by separating the image into discrete, contextual regions known as tiles. Histogram equalization is applied to each tile independently, enabling localized contrast enhancement. This method works especially well for images with varying lighting or detail levels in different areas, and it is used here to improve the contrast of the retinal images. By modifying the image histogram, CLAHE helps emphasize the characteristics of the retina, which is especially useful for medical images. CLAHE includes a contrast-limiting step to prevent over-amplification of noise and to avoid areas that are too bright or too dark: a clip limit is specified that caps the amplification of pixel values above a predetermined level, and the excess is redistributed throughout the histogram to keep the enhancement balanced. The processed tiles are then blended using bilinear interpolation to eliminate false borders and produce a smooth, improved image. In medical imaging, CLAHE is frequently used to improve the visibility of characteristics such as blood vessels and lesions, for example in the examination of diabetic retinopathy. By enhancing local contrast while avoiding the drawbacks of conventional approaches, CLAHE provides a more accurate and thorough representation of the image data, assisting in improved diagnosis and analysis. A short sketch of this step appears below.
Normalization: Pixel values are divided by 255 so that they fall between 0 and 1. This step guarantees a consistent data distribution and facilitates faster convergence during model training.
Data Augmentation: Data augmentation strategies are used to improve the model's generalization and prevent overfitting. Among these methods are rotation (rotating images at random within a range), zooming (making random adjustments to the zoom level), horizontal and vertical shifts (translating images along each axis), shearing (applying shear transformations to the images), and flipping (random horizontal flips that increase the diversity of the training data). TensorFlow's ImageDataGenerator is used for data augmentation; during training, it applies these transformations dynamically, as in the sketch below.
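A minimal sketch of such a generator follows; the transformation ranges and the directory layout are typical choices assumed for illustration, not necessarily those used in this study.

from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # normalization step from above
    rotation_range=20,        # random rotation within +/-20 degrees
    zoom_range=0.15,          # random zoom
    width_shift_range=0.1,    # horizontal shift
    height_shift_range=0.1,   # vertical shift
    shear_range=0.1,          # shear transform
    horizontal_flip=True,     # random horizontal flip
)
val_datagen = ImageDataGenerator(rescale=1.0 / 255)  # no augmentation for validation

# Hypothetical directory layout: one subfolder per DR stage (0-4).
train_gen = train_datagen.flow_from_directory(
    "data/train", target_size=(224, 224), batch_size=32, class_mode="categorical"
)
val_gen = val_datagen.flow_from_directory(
    "data/val", target_size=(224, 224), batch_size=32, class_mode="categorical"
)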
4.3 Model Selection
To determine which convolutional neural network (CNN) architecture is best suited for this task, a number of pre-trained CNN models are assessed. Among the chosen models are: DenseNet169, well known for its dense layer connectivity, which enhances gradient flow and promotes feature reuse; MobileNet, which provides an excellent balance between efficiency and performance and is tailored for embedded and mobile vision applications; DenseNet121, a shallower member of the same densely connected family, whose direct connections between layers likewise ease the training of deeper networks; and InceptionV3, which combines multiple convolutional filters of varying sizes, enabling the network to capture a variety of spatial hierarchies. Each model is fine-tuned on the retinal images, making use of weights pre-trained on ImageNet to gain the advantage of transfer learning, as in the sketch below.
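The following sketch illustrates this comparison under stated assumptions: the classifier head, optimizer, and epoch count are illustrative, and train_gen and val_gen are the data generators from Section 4.2.

from tensorflow.keras import layers, models, applications

CANDIDATES = {
    "DenseNet169": applications.DenseNet169,
    "DenseNet121": applications.DenseNet121,
    "MobileNet": applications.MobileNet,
    "InceptionV3": applications.InceptionV3,
}

def build(backbone_fn, num_classes=5):
    # Load the ImageNet-pretrained backbone and attach a fresh 5-way head.
    base = backbone_fn(weights="imagenet", include_top=False,
                       input_shape=(224, 224, 3))
    x = layers.GlobalAveragePooling2D()(base.output)
    out = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(base.input, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Fine-tune each candidate and compare its best validation accuracy.
for name, fn in CANDIDATES.items():
    model = build(fn)
    history = model.fit(train_gen, validation_data=val_gen, epochs=10)
    print(name, max(history.history["val_accuracy"]))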
4.4 DenseNet169 Model
DenseNets are a family of sophisticated convolutional neural network architectures known for a dense connectivity pattern in which each layer receives input from all previous layers and passes its own feature maps to all subsequent layers; DenseNet169 is one such design. Several advantages of this pattern make DenseNet169 particularly effective for medical image analysis, including the detection of diabetic retinopathy. In DenseNet169, every layer is directly connected, in a feed-forward fashion, to every subsequent layer, which means that the input to each layer consists of the feature maps of all preceding layers. These dense connections encourage feature reuse, improve feature propagation, significantly reduce the number of parameters, and mitigate the vanishing gradient problem. DenseNet169's growth rate, the number of feature maps each layer adds, is 32. This balanced growth rate keeps the model's complexity manageable while maintaining a rich feature representation. Before performing 3x3 convolutions, DenseNet169 employs bottleneck layers, which use 1x1 convolutions to decrease the number of input feature maps; this bottleneck design lowers the parameter count and increases computational efficiency. DenseNet169 uses batch normalization, 1x1 convolution, and 2x2 average pooling in its transition layers to regulate the network's size and complexity. By lowering the number and spatial dimensions of feature maps, these layers improve efficiency and generalization. For transfer learning, DenseNet169 is typically initialized with weights pre-trained on the ImageNet dataset, which contains over a million images across 1,000 classes. The training environment also includes mechanisms for hyperparameter adjustment and validation to guard against overfitting and ensure good model performance, as sketched below.
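A hedged sketch of such a training setup, reusing the build helper and the data generators from the earlier snippets (the patience values and epoch count are illustrative assumptions):

from tensorflow.keras import applications
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

model = build(applications.DenseNet169)  # helper defined in the earlier sketch

# Validation-based safeguards against overfitting.
callbacks = [
    EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True),
    ReduceLROnPlateau(monitor="val_loss", factor=0.2, patience=3),
]
model.fit(train_gen, validation_data=val_gen, epochs=30, callbacks=callbacks)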
Fig.1 Overview of the Proposed System
V. RESULT ANALYSIS
Figure 2 illustrates the precision comparison among DenseNet121, DenseNet169, and InceptionV3 models across different
diabetic retinopathy classes. DenseNet121 shows high precision for the 'No_DR' class (0.94) but struggles with the 'Severe' class
(0.00). DenseNet169 maintains high precision for 'No_DR' (0.94) and 'Mild' (0.76). InceptionV3 has the lowest precision for
'Mild' (0.53) but performs well for 'No_DR' (0.93), similar to the other models. These results highlight each model's strengths and
areas for improvement in accurately identifying diabetic retinopathy stages.
Fig.2 – Precision Comparison Graph
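For reference, per-class precision tables of this kind can be produced with scikit-learn; the label arrays below are placeholders standing in for the evaluation set's true and predicted stages.

import numpy as np
from sklearn.metrics import classification_report

STAGES = ["No_DR", "Mild", "Moderate", "Severe", "Proliferative_DR"]
# Placeholder labels; in practice these come from the held-out evaluation set.
y_true = np.array([0, 0, 1, 2, 3, 4, 2, 0])
y_pred = np.array([0, 0, 1, 2, 1, 4, 2, 0])
# Reports per-class precision, recall, and F1; zero_division=0 handles
# classes with no predicted samples (e.g., a 0.00 precision for 'Severe').
print(classification_report(y_true, y_pred, target_names=STAGES, zero_division=0))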
REFERENCES
[1] Qummar, S., Khan, F. G., Shah, S., Khan, A., Shamshirband, S., Rehman, Z. U., Khan, I. A. and Jadoon, W. 2019. A Deep Learning Ensemble Approach for Diabetic Retinopathy Detection. IEEE Access, 7.
[2] Mithili, K. et al. 2023. A Hybrid Strawberry-based Convolution Neural Framework (SbCNF) for the Identification and Classification of Retinopathy from Retinal Images.
[3] Ali, G. et al. 2023. A Hybrid CNN Model Combining ResNet50 and InceptionV3 for Diabetic Retinopathy Detection from Fundus Images.
[4] Mohaimenul et al. A Robust and Lightweight Deep Learning Model for the Classification of Diabetic Retinopathy Images.