
Accepted Manuscript

Optimal deep learning model for classification of lung cancer on CT images

Lakshmanaprabu S.K., Sachi Nandan Mohanty, Shankar K., Arunkumar N., Gustavo Ramirez

PII: S0167-739X(18)31701-1
DOI: https://doi.org/10.1016/j.future.2018.10.009
Reference: FUTURE 4514

To appear in: Future Generation Computer Systems

Received date: 25 July 2018
Revised date: 16 September 2018
Accepted date: 4 October 2018

Please cite this article as: Lakshmanaprabu S.K., et al., Optimal deep learning model for
classification of lung cancer on CT images, Future Generation Computer Systems (2018),
https://doi.org/10.1016/j.future.2018.10.009

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form.
Please note that during the production process errors may be discovered which could affect the
content, and all legal disclaimers that apply to the journal pertain.
Optimal Deep Learning Model for Classification of Lung Cancer on CT Images

Lakshmanaprabu S.K.1,*, Sachi Nandan Mohanty2, Shankar K.3, Arunkumar N.4, Gustavo Ramirez5

1 Department of Electronics and Instrumentation Engineering, B.S. Abdur Rahman Crescent Institute of Science and Technology, Chennai, India. Email: [email protected]
2 Department of Computer Science & Engineering, Gandhi Institute for Technology, Bhubaneswar, India. Email: [email protected]
3 School of Computing, Kalasalingam Academy of Research and Education, Krishnankoil, India. Email: [email protected]
4 Department of Electronics and Instrumentation Engineering, SASTRA University, Thanjavur, India. Email: [email protected]
5 Department of Telematics, University of Cauca, Colombia. Email: [email protected]
* Corresponding author.

Abstract

Lung cancer is one of the most dangerous diseases and is responsible for a large share of cancer deaths worldwide. Early detection of lung cancer is the only possible way to improve a patient's chance of survival. A Computed Tomography (CT) scan is used to locate the tumor and identify the level of cancer in the body. The current study presents an innovative automated diagnosis classification method for CT images of the lungs. In this paper, CT lung images were analyzed with the assistance of an Optimal Deep Neural Network (ODNN) and Linear Discriminant Analysis (LDA). Deep features are extracted from the CT lung images, and the dimensionality of the features is then reduced using LDA in order to classify lung nodules as either malignant or benign. The ODNN is applied to the CT images and then optimized using the Modified Gravitational Search Algorithm (MGSA) to identify the lung cancer classes. The comparative results show that the proposed classifier gives a sensitivity of 96.2%, a specificity of 94.2%, and an accuracy of 94.56%.

Key Terms: Image Processing, Computed Tomography, Lung Cancer, LDA, Optimization, Classification.
Nomenclature

CT    Computed Tomography
LDA   Linear Discriminant Analysis
ODNN  Optimal Deep Neural Network
GSA   Gravitational Search Algorithm
MGSA  Modified Gravitational Search Algorithm
KNN   K-Nearest Neighbor
ANN   Artificial Neural Network
SVM   Support Vector Machine
CAD   Computer-Aided Diagnosis
DNN   Deep Neural Network
DCNN  Deep Convolutional Neural Network
CNN   Convolutional Neural Network
GLCM  Gray Level Co-occurrence Matrix
ELDA  LDA based on Euclidean Distance
ROIs  Regions of Interest
RLDA  Regularized Linear Discriminant Analysis
PCA   Principal Component Analysis
DWT   Discrete Wavelet Transform
EC    Evolutionary Computation
DBN   Deep Belief Network
RBM   Restricted Boltzmann Machine
NN    Neural Network
PPV   Positive Predictive Value
NPV   Negative Predictive Value

1. Introduction

Medical image analysis plays a major role in the health sector, particularly in noninvasive treatment and clinical examination [1]. The acquired medical images, such as X-ray, CT, MRI, and ultrasound images, are used for specific diagnoses [2]. In medical imaging, CT is a scanning mechanism that captures cross-sectional images of the body [3]. Lung cancer alone leads to 1.61 million deaths per year. In Indonesia, lung cancer is ranked third among the prevalent cancers, mostly found in MIoT centers [4]. The survival rate is higher if the cancer is diagnosed at an early stage, but early discovery of lung cancer is not a simple task: around 80% of patients are effectively diagnosed only at the middle or advanced phase of the cancer [5]. Globally, lung cancer is ranked second among males and tenth among females [2]. The information given in these studies is a general portrayal of a lung cancer detection framework that contains four basic stages. Lung cancer is the third most frequent cancer in women, after breast and colorectal cancers [6, 7]. Feature extraction is one of the simplest and most efficient dimensionality reduction techniques in image processing [8][9]. One of the striking features of CT imaging is its non-invasive character, and the range of views that can be obtained is unusual compared to parallel imaging modalities [10].

The selected or extracted feature set carries the relevant information from the input data into the reduction process [11]. The reduced features are assigned to a support vector machine for training and testing. The models used for lung cancer image classification are neural network models with binarized image pre-processing [12]. Existing research on lung cancer classification was performed using a neural network model that provided 80% accuracy [13]. Various investigations have been conducted on lung cancer classification using classifiers such as SVM, KNN, and ANN [14]. The SVM is a general-purpose learning method based on statistical learning theory [15]. However, these techniques are expensive and detect lung cancer only at its advanced stages, at which point the chance of survival is very low. Early detection of cancer can help in curing the disease completely, so the need for a technique that detects the occurrence of cancerous nodules at an early stage is increasing [16].

The contribution of the current work comprises two important phases. In the first phase, the CT lung images are processed and the selected features are passed to the LDA reduction process; in the second phase, an optimal deep learning classifier with the MGSA optimization algorithm is used to classify the CT lung cancer images. The proposed method outperformed other methods, and the performance improvement is shown to be statistically significant. In the rest of this paper, Section 2 discusses the literature, Section 3 describes the current issues of classifiers, and Section 4 elaborates the proposed methodology. Section 5 presents the implementation and analysis of this work, followed by the conclusion with recommendations for future work.

1.1 Causes and detection of lung cancer

• The general prognosis of lung cancer is poor, since specialists are unable to discover the infected region until it reaches an advanced stage. Five-year survival is around 54% for early-stage lung cancer that is restricted to the lungs, but only around 4% for advanced, inoperable lung cancer.

• The risk of lung cancer increases with the number of cigarettes smoked over time; specialists refer to this risk in terms of pack-years of smoking history. A small fraction of lung cancers occur in individuals with no known risk factors for the illness. Some of these may simply be random events with no external cause.

• To examine lung cancer, patients normally undergo X-ray, CT, or MRI scans to detect abnormal growths in the lungs. In particular, highly sensitive CT can identify small nodules that may or may not be cancerous.

• Early detection of lung cancer: earlier identification of lung cancer allows greater treatment alternatives and a far higher chance of survival. However, only 16% of individuals are diagnosed at an early stage, when the disease is most treatable.

2. Literature review

In 2018, Yutong Xie et al. [17] proposed an algorithm for lung nodule classification that fuses texture, shape, and deep model-learned information (Fuse-TSD) at the decision level. This algorithm uses a GLCM-based texture descriptor and a Fourier shape descriptor to characterize the heterogeneity of nodules, and a DCNN to learn the features of the nodules.

Hiba Chougrad et al. [18] investigated a CAD framework based on CNNs to classify breast cancer. Deep learning generally requires large datasets to train networks, whereas transfer learning can make use of small datasets of medical images; the CNNs were optimally trained with the help of transfer learning. The CNN achieved the best outcome in terms of accuracy, i.e., 98.94%. Heba Mohsen et al. [19] demonstrated a DNN classifier for brain tumor classification, where the DNN is combined with the wavelet transform and principal component analysis.

In 2015, Alok Sharma et al. [20] proposed a regularized linear discriminant analysis method in which the regularization parameter is computed without the traditional cross-validation algorithm. Investigating medical data for disease prediction requires a proper set of features, and many evolutionary algorithms have been applied to obtain an optimal selection of features. Recently, the gravitational search algorithm and Elephant Herd optimization have been utilized for the selection of optimal features [21] [28].

Kuruvilla, J. and Gunavathi, K. (2014) developed an ANN-based cancer classification for CT images. Statistical features were used to develop the classification model. The paper claimed that the feed-forward backpropagation network provides better accuracy than feed-forward networks, and that the skewness feature contributes most to the enhancement of classifier accuracy [22].

In a study by Hao Wang et al., a generalized LDA method based on the Euclidean norm, called ELDA, was proposed to overcome the existing disadvantages of the conventional LDA procedure. A multi-class SVM is applied to perform gait classification. The experimental results showed that this algorithm achieves better accuracy and effectiveness than the other gait recognition procedures in the model [23].

3. Existing problems of classifiers

• In existing techniques, the lung images were captured and subjected to segmentation, after which the SVM classifier was applied and the accuracies were measured [22].

• The current framework has a limitation in that it cannot predict the type, shape, or size of the tumor; it deals with a number of pixels, which is not valuable for the earlier detection of cancer [17]. Furthermore, when an ANN produces a testing solution, it does not provide insight into why and how, which reduces confidence in the network.

• Neural networks are 'black boxes' and have a restricted ability to explicitly identify possible causal relationships. Deep networks in particular have many hidden layers and are capable of modeling complex structures; however, the training algorithm is more complex and dynamically sensitive, which can cause problems [17, 18].

• It is estimated that, by utilizing this model, various existing data mining and image processing strategies could be made to work together in multiple ways. The main disadvantage of the LDA technique is that it only distinguishes the images containing anomalies [23].

• The drawback of GSA is the Metropolis criterion used for comparing the positions of moving particles and for controlling particle movement to overcome its randomness.

4. Methodology

The proposed approach used to classify CT images of the human lung has several stages: preprocessing, feature extraction, feature reduction, and finally classification. Initially, the CT images are preprocessed to improve image quality, followed by the feature extraction procedure, which extracts histogram, texture, and wavelet features of the images. After feature extraction, a dimensionality reduction technique, namely LDA, is applied to reduce the features for the classification process; the purpose of dimension reduction is to reduce the computational time and cost of the classification method, since using the maximum number of features for classification increases both computation time and storage memory. During the classification phase, the CT lung images are classified as normal, benign, or malignant based on the extracted features. Generally, the classification problem has two phases, training and testing: the classifier is trained with the chosen features of the training data, and during the testing phase the outcome of the classification procedure signifies whether the images contain lung cancer regions or non-cancer regions. The current study utilizes an ODNN classifier, and MGSA optimization is used to optimize its structure. This approach, illustrated in Figure 1, offers simplicity and low cost in both the training and the testing processes of CT lung image classification.


Fig 1: Block diagram for the proposed CT image classification

4.1 Filtering and contrast enhancement phase

The collected medical database images were corrupted with several kinds of noise. A median filter is applied: if the image is noisy and the values of the pixels neighboring the target pixel lie around 0 and 255, the pixel value is replaced with the median value. After removing noise from the database images, a contrast enhancement process based on adaptive histogram equalization is considered.

$Contrast(i, j) = rank \times max\_intensity(i, j)$, with $rank$ initially 0 and incremented as $rank \leftarrow rank + 1$    (1)

The histogram at the first position of each line is acquired using the histogram at the first position of the previous row, by subtracting the trailing column and including the new leading row. The contrast of the CT images is to be increased and set with a limit, so that the method automatically recognizes the gray levels of the image and adaptively modifies the spacing between two neighboring gray levels in the new histogram.
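As an illustration of this preprocessing stage, the following is a minimal sketch, assuming a grayscale CT slice loaded as a NumPy array; the median kernel size and the adaptive-equalization parameters are assumptions, not values taken from the paper, whose implementation was in MATLAB.

```python
# Sketch of the preprocessing stage: median filtering to suppress
# salt-and-pepper-like noise, then adaptive histogram equalization (CLAHE)
# for contrast enhancement. Kernel size and clip limit are illustrative.
import cv2
import numpy as np

def preprocess_ct_slice(image: np.ndarray) -> np.ndarray:
    """Denoise a grayscale (uint8) CT slice and enhance its contrast."""
    denoised = cv2.medianBlur(image, 3)  # replace noisy pixels with the local median
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(denoised)         # adaptive histogram equalization
```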

4.2 Feaature extracction


The purpose of the feature extraction technique is to represent the image in a compact and unique form of single values or a feature vector. Feature extraction achieves dimensionality reduction in image processing, on the basis of which the image can be used for classification. It involves reducing the input data to a representative set of features, which are used as inputs to classifiers that assign them to the class they represent. The aim of feature extraction is to reduce the data by measuring certain properties. In the current study, histogram features, texture features, and wavelet features are extracted from different bands of the CT images.

4.2.1 Histogram features

In histogram features, the image is represented in terms of pixels. The histogram shows the number of pixels in an image at each intensity value. Transforming the intensity values so that the histogram of the image approximately matches a predetermined histogram is also possible. From the input image, the total range of gray levels is evaluated by the histogram method; here there are 256 gray levels, ranging from 0 to 255. Common histogram features include the mean, variance, standard deviation, skewness, and kurtosis.

Variance: The variance gives the amount of gray-level fluctuation from the mean gray-level value. Like other statistical distribution measures, it can be used to distinguish subtle contrasts in the texture.

Mean: The mean gives the average gray level of each region; it is useful only as a rough indication of intensity, not of the image texture.

Standard Deviation: The standard deviation is defined as the square root of the variance and denotes image contrast. The image contrast level is evaluated by high and low variance values: a high-contrast image has high variance, whereas a low-contrast image has low variance.

Skewness: The image skewness is calculated based on the tail of the histogram. The tail of the histogram is categorized into two types, positive and negative.

Kurtosis: Kurtosis is a measure of the shape of the probability distribution of a real-valued random variable and describes how anomalous the image distribution is. Kurtosis and skewness are used in statistical analysis to gain insight into the shape of the distribution.
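A minimal sketch of these first-order histogram statistics is given below, assuming the image is available as a NumPy array; the function and key names are illustrative only.

```python
# Histogram (first-order statistical) features described above,
# computed over all pixel intensities of a grayscale image.
import numpy as np
from scipy.stats import skew, kurtosis

def histogram_features(image: np.ndarray) -> dict:
    pixels = image.ravel().astype(np.float64)   # flatten to a 1-D intensity sample
    return {
        "mean": pixels.mean(),          # average gray level
        "variance": pixels.var(),       # fluctuation around the mean
        "std": pixels.std(),            # square root of variance (contrast)
        "skewness": skew(pixels),       # asymmetry of the histogram tail
        "kurtosis": kurtosis(pixels),   # shape/peakedness of the distribution
    }
```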

4.2.2 Texture features

Texture features are extracted from the input image next, after the histogram features. Since the abnormality is generally spread across the image, the textural orientation of each class is distinctive, which helps to attain better classification accuracy. The gray-level co-occurrence matrix is a statistical method of examining texture that takes into account the spatial relationship of pixels. The GLCM functions characterize the texture of an image by estimating how frequently pairs of pixels with the same values occur. Generally, these features are calculated using the GLCM probability values; of the roughly 22 such features, a few are considered in the current study for the CT lung image classification process.

$P_{ij} = F_{ij} \Big/ \sum_{i,j=0}^{L-1} F_{ij}$    (2)

In the above equation, $F_{ij}$ denotes the frequency of occurrences between two grey levels 'i' and 'j' for a given displacement vector and specified window size, and L is the number of quantized grey levels.

Energy: This feature ensures that constant values or periodic uniformity in the gray-level distribution yield the maximum texture energy.

Entropy: Entropy refers to the quantity of information in the image that is required for the compression process. An image with low entropy exhibits little contrast and large runs of pixels with the same values.

Homogeneity: The homogeneity measure, also called the inverse difference moment, evaluates image homogeneity and takes larger values for smaller gray-tone differences in pixel pairs. Thus, homogeneity is a measure that takes larger values for low-contrast images.

Contrast: Contrast measures the spatial frequency of an image and is the difference moment of the GLCM. It represents the difference between the highest and the lowest values of a contiguous set of pixels.

Correlation: Correlation evaluates the linear dependence of the gray levels of neighboring pixels. Digital image correlation is an optical procedure that employs tracking and image registration to measure variations in images.
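The following is a hedged sketch of these GLCM features using scikit-image; the displacement distance, angle, and gray-level count are assumptions rather than values specified in the paper.

```python
# GLCM texture features: the normalized co-occurrence matrix P_ij of
# equation (2) is built for one displacement, then the listed properties
# are read off it; entropy is computed directly from the probabilities.
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(image: np.ndarray, levels: int = 256) -> dict:
    glcm = graycomatrix(image, distances=[1], angles=[0],
                        levels=levels, symmetric=True, normed=True)
    feats = {prop: float(graycoprops(glcm, prop)[0, 0])
             for prop in ("energy", "homogeneity", "contrast", "correlation")}
    p = glcm[:, :, 0, 0]
    feats["entropy"] = float(-np.sum(p[p > 0] * np.log2(p[p > 0])))
    return feats
```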

4.2.3 Wavelet-based features

The wavelet transform provides useful information for image processing because of its beneficial properties. The DWT is a linear transformation that operates on a data vector and preserves its energy. In the wavelet transform, feature extraction is carried out in two stages: first, the subbands of the original image are constructed, and then these subbands are evaluated at various resolutions. The wavelet is a powerful mathematical tool for feature extraction and has been used to extract the wavelet coefficients from images. The mean of the DWT coefficients is computed by taking the average of the coarse (approximation) coefficients.

$Coeff[a_t] = \delta_{a_t}$    (3)

where $\delta_{a_t}$ is the mean value of the approximation coefficients. Initially, the image is fed to a low-pass filter, which retains the low-frequency content of the image below the cut-off frequency. Thereafter, the image signal is fed to a high-pass filter, which retains the high-frequency content exceeding the cut-off frequency.
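A minimal sketch of this wavelet feature step using PyWavelets is shown below, assuming a single-level 2-D DWT with a Haar wavelet; the wavelet family is an assumption, since the paper does not specify one.

```python
# Wavelet-based features: a 2-D DWT splits the image into one approximation
# (low-pass) and three detail (high-pass) subbands; the mean of the
# approximation coefficients corresponds to equation (3).
import numpy as np
import pywt

def wavelet_features(image: np.ndarray) -> dict:
    cA, (cH, cV, cD) = pywt.dwt2(image.astype(np.float64), "haar")
    return {
        "approx_mean": float(cA.mean()),                                     # delta_at in equation (3)
        "detail_energy": float(sum(np.sum(d ** 2) for d in (cH, cV, cD))),   # energy of detail subbands
    }
```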


4.3 Feature Reduction: Linear Discriminant Analysis

The objective is to reduce the original dataset by estimating specific characteristics, or features, that differentiate one data pattern from another. All of the band features of the CT image are to be combined so as to reduce the feature set. LDA is a dimensionality reduction process in which the original input space is transformed into an independent feature space whose dimensions are free of one another. The LDA model is illustrated in Figure 2. It is used as a dimensionality reduction factor for the feature vectors before the classification process, without any loss of data.

Fig 2: LDA

The feature reduction (within-class scatter) matrix is given as

$M_w = \sum_{j=1}^{c} \sum_{i=1}^{N_s} (m_{ij} - \alpha_j)(m_{ij} - \alpha_j)^T$    (4)

where 'c' denotes the number of classes, and $m_{ij}$, $N_s$, and $\alpha_j$ are a sample of a class, the number of samples in the class, and the mean of the class, respectively. The between-class reduction matrix is calculated using the following equation:

$R_s = \sum_{j=1}^{k} (m_j - m)(m_j - m)^T$    (5)

where 'm' is the mean of all classes. The LDA strategy is applied with the linear discriminant criterion: this criterion attempts to maximize the ratio of the determinant of the between-class scatter matrix of the projected samples to the determinant of the within-class scatter matrix of the projected samples. Multi-class LDA is considered, where the relationship within one set of classes is not the same as in another set. From the minimal set of features available for the classification procedure, the CT images are classified.
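As a hedged illustration of this reduction step, the sketch below applies scikit-learn's LDA to a feature matrix; the variable names and the choice of two discriminant components (at most c - 1 = 2 for three classes) are assumptions.

```python
# LDA-based feature reduction: project the extracted feature vectors onto
# the discriminant axes before classification.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def reduce_features(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """X: (n_images, n_features) extracted features; y: class labels."""
    lda = LinearDiscriminantAnalysis(n_components=2)  # at most (classes - 1) components
    return lda.fit_transform(X, y)
```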

4.5 Classification of lung CT images

For the CT image classification model, the current study proposes a DNN based on the deep learning approach. The deep learning structure extends the conventional NN by adding more hidden layers to the network design between the input and output layers, so as to model more complex and nonlinear relationships. After feature selection, the classification step is performed with the help of the DNN on the resultant feature vector. This classifier works with the help of two components: the Deep Belief Network (DBN) and the Restricted Boltzmann Machine (RBM). In order to improve the classification performance of the proposed model, MGSA optimization is considered; the steps of the optimal deep learning model are described in the sections below, and the optimal DNN is illustrated in Figure 3.

4.5.1 Deep Belief Network

During the training stage, a DBN is utilized, which is a deep, feed-forward neural network architecture with multiple hidden layers. The DBN model allows the network to generate visible states based on the states of its hidden units, which represent the network's belief. The parameters of a DBN are the weights between the units of the layers, in addition to the biases of each layer. Setting up these parameters to train the DNN is a challenging task and is carried out with the help of a Restricted Boltzmann Machine (RBM) [18].

4.5.2 Restricted Boltzmann Machine (RBM)

The RBM is a two-layer recurrent neural network in which stochastic binary inputs are connected to stochastic binary outputs by symmetrically weighted connections. A training example is presented, in which the class label is ignored, and it is propagated stochastically through the RBM according to equation (6). This vector is also passed in the reverse direction through the RBM, which results in a confabulation (reconstruction) of the original input data.

$F(w, h) = -\sum_{i=1}^{I} \sum_{j=1}^{J} I_{ij} w_i h_j - \sum_{i=1}^{I} \alpha_i w_i - \sum_{j=1}^{J} \beta_j h_j$    (6)

where $I_{ij}$ represents the symmetric interaction term between the visible unit $w_i$ and the hidden unit $h_j$, $\alpha$ and $\beta$ are the bias terms, and i, j index the visible and hidden units, of which there are I and J respectively.

Fig 3: Optimal DNN structure
4.5.3 Modified gravitational search algorithm for weight optimization

This novel population-based heuristic algorithm is based on the law of gravity and mass interactions: the masses cooperate through a direct form of communication via the gravitational force. In the GSA approach, each mass represents a solution through its position and gravitation, and the algorithm is guided [28] by properly adjusting the gravitational and inertial masses, with the fitness of a solution reflected in its inertial mass. The new position is updated using a probability function for random value selection, following the technique considered for the optimization process.

(i) Weight initialization

Initially, 'w' sets of agents are considered, and their positions are specified and represented as follows. In equation (7), $w_{i1}$ is the position of an agent and $w_{is}$ spans the search space of the agent for choosing weights.

$w = \{ w_{i1}, \ldots, w_{is} \}$    (7)

(ii) Fitness evaluation

In this CT lung image classification, the maximum specificity ratio, based on the trained and tested structure of the DNN, is considered as the fitness function, as shown in equation (8).

$Fit = \max(spec)$    (8)

(iii) Mass and force updates for generation of new solution

The force between two particles is directly proportional to their masses and inversely proportional to their distance, and each particle moves towards the particles that are heavier in mass. This is expressed in equations (9) and (10).

$Mass_i(t) = D_i(t) \Big/ \sum_{j=1}^{N} D_j(t)$    (9)

$D_i(t) = \dfrac{Fit_i(t) - worst(t)}{Best(t) - worst(t)}$    (10)

where $Fit_i(t)$ represents the fitness value of particle 'i' at time 't'. For the maximization problem, and for estimating the acceleration of an agent, the total force applied by the set of heavier masses has to be taken into account.

(iv) Force evaluation

To give a stochastic character to GSA, the total force that acts on particle 'i' in the d-th dimension is taken as a randomly weighted sum of the d-th components of the forces exerted by the other agents.

$Force_i(t) = \sum_{j \in kbest} Rn_j \, gr(t) \, \dfrac{Mass_j(t) \times Mass_i(t)}{Ed_{ij} + \varepsilon} \big( w_j(t) - w_i(t) \big)$    (11)

Here $w_j(t)$ represents the position of the j-th particle in that dimension; $Mass_i$ and $Mass_j$ denote the gravitational masses of particles 'i' and 'j'; $gr(t)$ is the gravitational constant; $Ed_{ij}$ denotes the Euclidean distance between particles 'i' and 'j' in generation 't'; $\varepsilon$ is a small constant greater than 0; and $Rn_j$ is a random number.

(v) Modification process to select a random value

In equation (11), the random values are selected by calculating a probability function during the mass and force updating process, given by the following equation.

$prob = 0.3\left(1 - \dfrac{I}{I_{max}}\right)$    (12)

where I is the current iteration and $I_{max}$ is the maximum number of iterations. The algorithm is applied iteratively over many iterations to converge to a sufficiently good solution. It is worth noting that the random walk of the agent is constrained by the exploration rate at the current iteration.

(vi) The optimal solution with the Termination process

The best solutions that fulfill the objective function are identified, and the algorithm is set up to give exact solutions in light of maximizing the accuracy of CT lung image classification. If the optimal result is not obtained in the current iteration, the algorithm moves to iteration_new = iteration + 1, and the steps are repeated until the optimal weights of the DNN are obtained.
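A hedged sketch of this weight-optimization loop is given below. The fitness callable stands for the specificity of the trained DNN (equation 8); the population size, iteration count, and gravitational-constant decay schedule are assumptions, and the probability-based modification of equation (12) is omitted for brevity.

```python
# Gravitational search loop following equations (9)-(11): masses from fitness,
# pairwise gravitational forces, then velocity/position updates of the agents
# (each agent position is a candidate DNN weight vector).
import numpy as np

def gsa_optimize(fitness, dim, n_agents=20, iters=50, g0=100.0, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1.0, 1.0, size=(n_agents, dim))    # candidate weight vectors
    vel = np.zeros_like(pos)
    for t in range(iters):
        fit = np.array([fitness(x) for x in pos])
        best, worst = fit.max(), fit.min()
        d = (fit - worst) / (best - worst + eps)           # equation (10)
        mass = d / (d.sum() + eps)                         # equation (9)
        gr = g0 * np.exp(-8.0 * t / iters)                 # decaying gravitational constant
        force = np.zeros_like(pos)
        for i in range(n_agents):
            for j in range(n_agents):
                if i == j:
                    continue
                dist = np.linalg.norm(pos[j] - pos[i])
                force[i] += (rng.random() * gr * mass[i] * mass[j]
                             / (dist + eps) * (pos[j] - pos[i]))   # equation (11)
        acc = force / (mass[:, None] + eps)
        vel = rng.random(pos.shape) * vel + acc            # velocity and position update
        pos = pos + vel
    return pos[np.argmax([fitness(x) for x in pos])]       # best weight vector found
```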

4.5.4 Fine tuning phase

The working principle of this stage depends on the normal backpropagation algorithm. To detect and classify abnormalities, an output layer is placed at the top of the DNN. Additionally, there are 'N' input neurons (based on the features), and three hidden layers are used in the DNN of the current study. The optimized weights are learned during the training stage with the assistance of the training data set, where backpropagation begins with the weights that were achieved in the pre-training stage. From the optimal weights, the layer function is updated as follows.

$T(m_i = 1 \mid n) = \sigma\big(m_i + \sum_j opt\_w_{ji}\, n_j\big), \qquad T(n_i = 1 \mid m) = \sigma\big(n_i + \sum_j opt\_w_{ji}\, m_j\big)$    (13)

where m and n denote the bias vectors for the visible and hidden layers and σ is the logistic function with range (0, 1). Further, training on the dataset is continued until the optimized weights are obtained, or maximum accuracy is attained, with the help of equation (13). Finally, on the basis of the optimal weights (w), the lung images are classified in the testing stage using the test data set.
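The conditional activations of equation (13) reduce to a logistic function of bias plus weighted input; a minimal sketch is shown below, with variable and function names that are illustrative rather than taken from the paper.

```python
# Conditional activation probabilities of equation (13): sigmoid of the unit's
# bias plus the weighted sum from the opposite layer, using the optimized weights.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_activation_prob(n_bias, opt_w, m):
    """T(n_i = 1 | m): probabilities of the hidden units turning on."""
    return sigmoid(n_bias + opt_w.T @ m)

def visible_activation_prob(m_bias, opt_w, n):
    """T(m_i = 1 | n): probabilities of the visible units turning on."""
    return sigmoid(m_bias + opt_w @ n)
```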

5. Result and discussion

The proposed CT lung image classification model was implemented in MATLAB 2016 on a system with an i5 processor and 4 GB of RAM. In this cancer image classification process, a standard CT database was used, and the proposed model was compared with existing classifiers such as NN, SVM, KNN, and DNN based on different measures of the classification model.


5.1 Database description

In the proposed work, a database comprising 50 low-dose lung cancer CT image datasets is used for the detection purpose [30]. The CT scan images, with a slice thickness of 1.25 mm, were obtained in a single breath. The locations of the nodules were identified by the radiologist and are also provided in the dataset. The test images considered for the proposed work are shown in Figure 4.

Fig 4: Sample database images

5.2 Performance Metrics

The most commonly used evaluation measures for a classification model are included in Table 2. The database images used for training and testing in the ODNN-based classification process are tabulated in Table 3. For the lung image investigation, 70 images were considered for training and the remaining 30 images were considered for the testing process.

Table 2: Performance Metrics

Metric                 Formula
True Positive (TP)     -
True Negative (TN)     -
False Positive (FP)    -
False Negative (FN)    -
Sensitivity            Sen = TP / (TP + FN)
Specificity            Spc = TN / (TN + FP)
Accuracy               Acc = (TP + TN) / (TP + TN + FP + FN)
PPV                    PPV = TP / (TP + FP)
NPV                    NPV = TN / (TN + FN)
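As a worked illustration of the Table 2 measures, the sketch below computes them from true and predicted labels for one class treated as positive; the array and function names are illustrative.

```python
# Confusion-matrix based metrics from Table 2 for a chosen positive class.
import numpy as np

def binary_metrics(y_true, y_pred, positive=1):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == positive) & (y_true == positive))
    tn = np.sum((y_pred != positive) & (y_true != positive))
    fp = np.sum((y_pred == positive) & (y_true != positive))
    fn = np.sum((y_pred != positive) & (y_true == positive))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }
```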

Table 3: Database images for training and testing analysis

Phase      Target image    ODNN classifier output
                           Normal   Malignant   Benign   Total Images
Training   Normal            22         1          4         27
           Malignant          2        18          2         22
           Benign             1        20          0         21
           Total Images      25        39          6         70
Testing    Normal             6         0          2          8
           Malignant          1         9          1         11
           Benign             0        11          0         11
           Total Images       7        20          3         30

The results attained by the ODNN model are illustrated with point operations. From the results, the performance of the proposed model is determined by its ability to distinguish cancerous from non-cancerous lung images. Based on the testing data, the model is able to predict the medical condition of the lung in a new patient's record.


Fig 5: Feature reduction time comparison (time in seconds per image for PCA, LDA, and ICA)

LDA was utilized to lessen the complexity of the framework in terms of feature reduction time, as illustrated in Figure 5. The dimensionality of the feature vector obtained from the images was reduced. The comparison graph clearly shows that the proposed technique achieves a lower computational time coupled with high classification accuracy (because of the LDA-based feature reduction). The time needed for network training is not considered, since the weights/biases of the LDA should remain unchanged unless the properties of the images change a great deal. Restricting the feature vectors to the subset chosen by the LDA leads to an increase in precision rates.

Table 4: Optimal weights based hidden layer vs error rate

Hidden layer (weight)   Number of features   Training error   Testing error
1 (0.8)                 8                    0.22             0.28
2 (0.65)                4                    0.26             0.33
3 (0.25)                6                    0.29             0.35
4 (0.33)                3                    0.35             0.31


Table 4 shows the hidden layers with their weights and the corresponding training and testing error values, where the reduced features are considered for the training-testing process. In hidden layer 4, three extracted features are assigned as input factors to give the minimum Mean Square Error (MSE) for the training data and the minimum MSE for the testing information.

Table 5: Proposed CT lung image classification with pre-processed results

Image type          Class       Accuracy   Sensitivity   Specificity
CT image            Normal      95.21      92.22         86.58
                    Benign      85.48      88.52         82.18
                    Malignant   92.22      93.22         91.29
Contrast enhanced   Normal      96.45      91.58         89.28
                    Malignant   94.58      90.52         93.29
                    Benign      92.22      86.22         84.58

Table 5 shows the accuracy levels of the lung cancer image classification rates for the proposed approach. In this test, two classifiers based on supervised machine learning are exhibited, labeling each CT image as normal, benign, or malignant. The proposed ODNN is compared with existing classifiers, and it is demonstrated that the proposed algorithm provides better classification results. It is inferred from the tabulated results that the proposed work removes the sensitivity to the initial values of clustering because of the evolutionary classification algorithm. Secondly, the texture and color features are considered for grouping the CT lung cancer datasets to enhance the classification proficiency. From the validation analysis, the maximum accuracy and the significant variation in accuracy between the kernel functions can be observed.

Fig 6: Classifier comparative analysis: (a) accuracy, sensitivity and specificity; (b) PPV and NPV, for the ODNN, MLP, RBF, Linear, ANN, KNN and DNN classifiers

Figure 6 provides the comparative analysis of the classifiers with various measures, namely PPV, NPV, accuracy, sensitivity, and specificity. In this investigation, two classifiers based on supervised machine learning are displayed for the classification of normal/abnormal human lung CT images. It was inferred that the proposed technique produces a classification accuracy of 99% in the proposed classifier, which was demonstrated during the testing phase, compared with an accuracy of 82.29% for NN, 90.54% for SVM, and 74.55% for DNN. The PPV and NPV values also indicate better performance, at nearly 98%, for the proposed model. After completing the analysis, the classification specificity was 95%, which, like the sensitivity parameter, is not considered an outstanding performance. This may be because of the noise present in the image data, due to which some images were misclassified.

Table 6: K folds’ validation results

Number of Folds   Accuracy   Sensitivity   Specificity   PPV    NPV

10 92.12 88.56 88.54 77.45 90.1

15 93.11 89.45 71.2 72.2 82.2

20 96.2 90.2 83.5 88.65 56.2

25 88.54 75.5 93.2 90.1 67.1

30 89.52 90.2 88.41 79.5 90.1

35 96.52 86.45 92.1 82.45 93.3

40 95.45 92.85 89.4 88.5 59.2

Table 6 shows the K-fold validation results of the CT lung cancer classification, with the accuracy of the proposed classifier approaching 100%. Indeed, the median operator produces the most noticeably poor classification performance, even lower than single feature extraction. Each time, one fold is used for validation and the rest are used for training. The variance of the results is reduced with a larger k. All the observations are used for both training and validation, and each observation is used for validation only once.
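An illustrative sketch of this K-fold protocol with scikit-learn is shown below; the stand-in classifier, its layer sizes, and the value of k are assumptions and do not reproduce the paper's ODNN.

```python
# K-fold cross-validation: each fold serves once as the validation set while
# the remaining folds are used for training; scores are averaged over folds.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

def kfold_accuracy(X: np.ndarray, y: np.ndarray, k: int = 10):
    clf = MLPClassifier(hidden_layer_sizes=(64, 32, 16), max_iter=500, random_state=0)
    scores = cross_val_score(clf, X, y, cv=k, scoring="accuracy")
    return scores.mean(), scores.std()
```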

6. Conclusion

The proposed ODNN with feature reduction demonstrated better classification of lung CT images compared with other classification techniques. An automatic lung cancer classification approach reduces the manual labeling time and avoids human error. Through machine learning techniques, the researchers aimed to achieve better precision and accuracy in distinguishing normal from abnormal lung images. According to the experimental outcomes, the proposed technique is effective for the classification of human lung images, with accuracy, sensitivity, and specificity values of 94.56%, 96.2%, and 94.2%, respectively. The accuracy level clearly shows that the proposed algorithm is highly proficient in recognizing cancer-affected regions in CT images. The classification performance of this investigation demonstrates the advantages of this strategy: it is fast, simple to operate, non-invasive, and cheap. In future work, we will use high-dose CT lung images and optimal feature selection with multiple classifiers for the cancer detection process.

References

[1] Rattan, S., Kaur, S., Kansal, N. and Kaur, J., 2017, December. An optimized lung

cancer classification system for computed tomography images. In Image Information

Processing (ICIIP), 2017 Fourth International Conference on (pp. 1-6). IEEE.


[2] Naresh, P. and Shettar, R., 2014. Image Processing and Classification Techniques for

Early Detection of Lung Cancer for Preventive Health Care: A Survey. International

Journal on Recent Trends in Engineering & Technology, 11(1), p.595.

[3] Detterbeck, F.C., 2017. The 8th Edition Lung Cancer Stage Classification: What Does It Mean on Main Street, pp.1-26.

[4] Li, J., Wang, Y., Song, X. and Xiao, H., 2018. Adaptive multinomial regression with

overlapping groups for multi-class classification of lung cancer. Computers in

Biology and Medicine.pp.1-22.

[5] Wutsqa, D.U. and Mandadara, H.L.R., 2017, October. Lung cancer classification

using radial basis function neural network model with point operation. In Image and

Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), 2017 10th

International Congress on (pp. 1-6). IEEE.

[6] Sharma, D. and Jindal, G., 2011. Computer Aided Diagnosis System for Detection of

Lung Cancer in CT Scan Images. International Journal of Computer and Electrical

Engineering, 3(5), p.714.

[7] Bhatnagar, D., Tiwari, A.K., Vijayarajan, V. and Krishnamoorthy, A., 2017,

November. Classification of normal and abnormal images of lung cancer. In IOP

Conference Series: Materials Science and Engineering (Vol. 263, No. 4, p. 042100).

IOP Publishing.

[8] Sui, X., Jiang, W., Chen, H., Yang, F., Wang, J. and Wang, Q., 2017. Validation of

the stage groupings in the eighth edition of the TNM classification for lung cancer.

Journal of Thoracic Oncology, 12(11), pp.1679-1686.

[9] El-Sherbiny, B., Nabil, N., El-Naby, S.H., Emad, Y., Ayman, N., Mohiy, T. and

AbdelRaouf, A., 2018, March. BLB (Brain/Lung cancer detection and segmentation
and Breast Dense calculation). In Deep and Representation Learning (IWDRL), 2018

First International Workshop on (pp. 41-47). IEEE.

[10] Chen, F., Zhang, D., Wu, J. and Zhang, B., 2017, December. Computerized analysis

of tongue sub-lingual veins to detect lung and breast cancers. In Computer and

Communications (ICCC), 2017 3rd IEEE International Conference on (pp. 2708-

2712). IEEE.

[11] Al-Tarawneh, M.S., 2012. Lung cancer detection using image processing techniques.

Leonardo Electronic Journal of Practices and Technologies, 11(21), pp.147-58.

[12] Akay, Mehmet Fatih. "Support vector machines combined with feature selection for

breast cancer diagnosis." Expert systems with applications 36, no. 2 (2009): 3240-

3247.

[13] Xie, Y., Zhang, J., Xia, Y., Fulham, M. and Zhang, Y., 2018. Fusing texture, shape

and deep model-learned information at decision level for automated classification of

lung nodules on chest CT. Information Fusion, 42, pp.102-110.

[14] Sharma, D. and Jindal, G., 2011. Computer Aided Diagnosis System for Detection of

Lung Cancer in CT Scan Images. International Journal of Computer and Electrical

Engineering, 3(5), p.714.

[15] Shankar, K., Lakshmanaprabu, S.K., Gupta, D., Maseleno, A. and de Albuquerque, V.H.C., 2018. Optimal feature-based multi-kernel SVM approach for thyroid disease classification. Journal of Supercomputing, pp.1-16.

[16] Sarker, P., Shuvo, M.M.H., Hossain, Z. and Hasan, S., 2017, September.

Segmentation and classification of lung tumor from 3D CT image using K-means

clustering algorithm. In Advances in Electrical Engineering (ICAEE), 2017 4th

International Conference on (pp. 731-736). IEEE.


[17] Xie, Y., Zhang, J., Xia, Y., Fulham, M. and Zhang, Y., 2018. Fusing texture, shape

and deep model-learned information at decision level for automated classification of

lung nodules on chest CT. Information Fusion, 42, pp.102-110.

[18] Chougrad, H., Zouaki, H. and Alheyane, O., 2018. Deep convolutional neural

networks for breast cancer screening. Computer methods and programs in

biomedicine, 157, pp.19-30.

[19] Mohsen, H., El-Dahshan, E.S.A., El-Horbaty, E.S.M. and Salem, A.B.M., 2017.

Classification using deep learning neural networks for brain tumors. Future

Computing and Informatics Journal., pp.1-4.

[20] Sharma, A. and Paliwal, K.K., 2015. A deterministic approach to regularized linear

discriminant analysis. Neurocomputing, 151, pp.207-214.

[21] Nagpal, S., Arora, S. and Dey, S., 2017. Feature Selection using Gravitational Search

Algorithm for Biomedical Data. Procedia Computer Science, 115, pp.258-265.

[22] Kuruvilla, J. and Gunavathi, K., 2014. Lung cancer classification using neural

networks for CT images. Computer methods and programs in biomedicine, 113(1),

pp.202-209.

[23] Wang, H., Fan, Y., Fang, B. and Dai, S., 2018. Generalized linear discriminant

analysis based on euclidean norm for gait recognition. International Journal of

Machine Learning and Cybernetics, 9(4), pp.569-576.

[24] Wang, Z. and Tao, J., 2006, November. A fast implementation of adaptive histogram

equalization. In Signal Processing, 2006 8th International Conference on (Vol. 2).

IEEE.pp.1-4.

[25] Hiremath, P.S. and Shivashankar, S., 2006. Wavelet based features for texture

classification. GVIP journal, 6(3), pp.55-58.


[26] Ren, H. and Chang, Y.L., 2005, November. Feature extraction with modified Fisher's

linear discriminant analysis. In Chemical and Biological Standoff Detection III (Vol.

5995, p. 599506). International Society for Optics and Photonics.

[27] Eldos, T. and Al Qasim, R., 2013. On the performance of the Gravitational Search

Algorithm. International Journal of Advanced Computer Science Applications. West

Yorkshire, United Kingdom.pp.1-5.

[28] Lakshmanaprabu, S. K., Shankar, K., Khanna, A., Gupta, D., Rodrigues, J. J.,

Pinheiro, P. R., & De Albuquerque, V. H. C. Effective Features to Classify Big Data

Using Social Internet of Things. IEEE Access, 6, 24196-24204, 2018.

[29] Shankar K, Mohamed Elhoseny, Lakshmanaprabu S K, Ilayaraja M, Vidhyavathi RM,

Mohamed A. Elsoud, Majid Alkhambashi. Optimal feature level fusion based ANFIS

classifier for brain MRI image classification. Concurrency Computat Pract Exper.

2018; e4887. https://doi.org/10.1002/cpe.4887

[30] http://www.via.cornell.edu/lungdb.html
BIOGRAPHY

LAKSHMANAPRABU S.K. completed his Bachelor of Engineering (B.E.) degree in Electronics and Instrumentation Engineering at R.M.K. Engineering College, Chennai, in the year 2009. He completed his Masters (M.E.) in Industrial Engineering at Sudharsan Engineering College, Pudukkottai, Tamilnadu, in the year 2011. He is a senior research fellow and is currently pursuing his Ph.D. degree in multivariable process control in the Department of Electronics and Instrumentation Engineering of B.S. Abdur Rahman Crescent Institute of Science and Technology, Chennai, India. He received the award of National Fellowship from the University Grants Commission, Govt. of India, Delhi, for his Ph.D. studies for the years 2013-2018. His areas of interest include Multivariable Control, Evolutionary Algorithms, Fuzzy Logic Control, Artificial Intelligence, Internet of Things, Model Based Development and Hardware-in-the-loop Testing.

Prof. Dr. Sachi Nandan Mohanty received his Ph.D. from IIT Kharagpur, India, in the year 2014, with an MHRD scholarship from the Govt. of India. He has recently joined as Associate Professor in the Department of Computer Science & Engineering at Gandhi Institute for Technology, Bhubaneswar. His research areas include Data Mining, Big Data Analysis, Cognitive Science, Fuzzy Decision Making, Brain-Computer Interface, Cognition, and Computational Intelligence. Prof. S. N. Mohanty received 2 Best Paper Awards during his Ph.D. at IIT Kharagpur: one from an International Conference in Beijing, China, and the other at the International Conference on Soft Computing Applications organized by IIT Roorkee in the year 2013. He was awarded the Best Thesis Award (first prize) by the Computer Society of India in the year 2015. He has published 15 papers in international journals of repute and has been elected as a Member of the Institute of Engineers and of the IEEE Computer Society. He is also a reviewer for the IJAP and IJDM international journals.

K. Shankar is an assistant professor in the Department of Computer Science and Information Technology at Kalasalingam University, Krishnankoil, Tamilnadu, India. He received his Master of Computer Applications, Master of Philosophy in Computer Science and Ph.D. degree in Computer Science from Alagappa University, Karaikudi, India. He has several years of experience working in research, academia and teaching. His current research interests include Cryptography and Network Security, Cloud Security, Image Processing and Soft Computing Techniques.

Arunkumar N. completed his BE, ME and PhD in Electronics and Communication Engineering with specialization in Biomedical Engineering. He has strong academic teaching and research experience of more than 10 years at SASTRA University, India. He is appreciated for his innovative, research-oriented teaching that relates practical life experiences to the principles of engineering. He is active in research and has been giving direction to active researchers across the globe.

Gustavo Ramírez González is a professor in the department of telematics engineering at Universidad del Cauca, Colombia. He has published several research papers and worked on many research projects.
Optimal Deep Learning Model for Classification of Lung Cancer on CT Images

Highlights:

• The current study presents an innovative approach for automated diagnosis based on the classification of Computed Tomography (CT) lung images.

• The CT lung images are converted into different feature subsets for the reduction process by using Linear Discriminant Analysis (LDA) to diminish the dimensionality of the features.

• An automatic lung cancer classification approach reduces the manual labeling time and avoids human error. The proposed method aims to achieve better precision and accuracy in recognizing normal and abnormal lung images.

• An Optimal Deep Neural Network (ODNN) classifier is used to characterize the images, and the deep structure is optimized using the Modified Gravitational Search Algorithm (MGSA).

• The proposed work obtained a sensitivity of 96.2%, a specificity of 94.2% and an accuracy of 94.56% when compared to existing classifiers.
