
Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information)

17/2 (2024), 121-126. DOI: http://dx.doi.org/10.21609/jiki.v17i2.1181

Classification of Coffee Fruit Maturity Level based on Multispectral Image Using Naïve Bayes Method
I’zaz Dhiya ‘Ulhaq, Muhamad Arief Hidayat, and Tio Dharmawan*

Faculty of Computer Science, University of Jember, Jember, Indonesia

E-mail: [email protected]

Abstract

The current research about the classification of coffee fruit ripeness based on multispectral images has
been developed using the Convolutional Neural Network (CNN) method to extract patterns from high-
dimensional multispectral images. The high complexity of CNN allows the model to capture complex
features but requires more time and computational resources for model training and testing. Therefore,
in this study, classification is performed using a more straightforward method such as Naïve Bayes
because its complexity only depends on the number of features and samples. The method only considers
each feature independently, so it has high speed and does not require a lot of computational resources.
Naïve Bayes is applied to color and texture features extracted from multispectral images of coffee fruit.
There are 300 features consisting of 60 color features and 240 texture features. Experiments were
conducted based on the comparison of training and testing data and the use of each feature. The
combination of color and texture features showed better performance than color or texture features
alone, with the highest accuracy reaching 91.01%. In conclusion, using Naïve Bayes is still reasonably
good in classifying the ripeness of coffee fruit based on multispectral images.

Keywords: Coffee fruit maturity, Multispectral image, Naïve Bayes

1. Introduction

Coffee (Coffea sp.) is an agricultural commodity that plays a significant role in world economic growth. Coffee has become popular because processed coffee drinks made from the best-quality coffee have a delicious taste and distinctive aroma. That quality is strongly influenced by the maturity level of the coffee fruit at harvest time. Coffee picked when the fruit is red or ripe produces good-quality coffee, while coffee picked when it is young has reduced taste and aroma [1].

Researchers have made various efforts to develop technology that can detect the maturity level of coffee fruit. One of the techniques used is multispectral imaging, which is believed to be better than conventional imaging, especially for checking the quality of agricultural commodities [2]. This technique can generate more data and capture spectral signatures containing much information [3].

Multispectral images of coffee fruit are obtained using a specially modified camera whose specifications are a wide electromagnetic spectrum, a controlled illumination space, and a narrow LED bandwidth [4], [5]. The camera produces 15 color channels: violet, royal blue, blue, azure, cyan, green, lime, yellow, amber, red-orange, red, deep red, far red, and two Near Infrared (NIR) channels. An ordinary camera, by contrast, produces only three color channels: red, green, and blue.

The research in [4] utilized the ability of the Convolutional Neural Network (CNN) method to extract patterns from high-dimensional multispectral images of coffee fruit. The high complexity of CNN allows the model to capture more complex features in multispectral images. However, it also requires more time and computational resources for training and testing the model. The complexity depends on the model architecture, the number of parameters, and the training complexity.

Therefore, this study conducted an experiment using a more straightforward method, Naïve Bayes. This method was chosen because its complexity depends only on the number of features and samples. In addition, the method considers each feature independently, so it is fast and requires few computational resources.

Jurnal Ilmu Komputer dan Informasi (Journal of Computer Science and Information), volume 17, issue 2, June 2024

2. Literature Review

This study was conducted based on previous studies on classifying the maturity level of coffee fruit. Details of the related studies can be seen in Fig. 1.

Fig. 1. Details of previous studies

Research by Tamayo-Monsalve et al. [4] was conducted using multispectral images of coffee fruit, in which wavelengths were detected using instruments sensitive to particular wavelengths, including infrared and ultraviolet. Multispectral images add information that the human eye cannot capture because they cover a wider range of wavelengths. In contrast, ordinary color imagery captures only three bands: red, green, and blue.

The three main components of a multispectral image camera are a wide electromagnetic spectrum, a controlled illumination space, and a narrow LED bandwidth. The camera shown in Fig. 2 is designed to produce 15 wavelengths in the range of 400-1000 nm.

Fig. 2. Special multispectral image camera

The camera uses 30 LEDs with different power and a bandwidth of less than 20 nm. The illumination chamber controls glare, shadows, and light to prevent outside light from entering or reflecting from inside.

Sandoval et al. [6] conducted research using ordinary images and the Naïve Bayes classification method. The classification of coffee fruit maturity levels was based on color, texture, and shape features extracted from coffee fruit images. The three feature types were combined to produce a total of 208 features.

The features were then selected so that only nine final features were used for training. The feature selection process showed that the texture features had a higher discrimination value. This indicates that the maturity level of coffee fruit can be classified not only by color but also by the texture of the fruit. Because the color, texture, and shape features are continuous numerical data, they were processed using Gaussian Naïve Bayes, which achieved an accuracy of 96.8%.

Research by Syahputra et al. [7] classified the maturity level of coffee fruit based on color by utilizing color histograms and color moments. There were 19 features, consisting of 10 color histogram features and nine color moment features. That study found that color histogram features characterized the maturity level of coffee fruit better than color moments.

Based on an in-depth study of the three previous studies, the method used in this study was derived, as seen in Fig. 3.

Fig. 3. The classification of coffee fruit maturity method

3. Methodology

3.1. Data Collection

Multispectral images of coffee fruit were obtained from https://doi.org/10.5281/zenodo.4914786 as NumPy data. The dataset contains 640 multispectral images, each 224x224 pixels in size. Each image has 15 color channels with different wavelengths, as shown in Table 1.

The maturity level of coffee fruit is grouped into five classes: immature, semimature, mature, overripe, and dry. The number of images at each maturity level can be seen in Table 2.

Table 1. Color channels in the multispectral image of coffee fruit

Wavelength (nm)    Color
410                Violet
450                Royal Blue
470                Blue
490                Azure
505                Cyan
530                Green
560                Lime
590                Yellow
600                Amber
620                Red Orange
630                Red
650                Deep Red
720                Far Red
840                NIR
950                NIR

Table 2. Number of images per ripening stage

Ripening Stage     Number of Images
Immature           130
Semimature         160
Mature             160
Overripe           112
Dry                78
Total              640

3.2. Image Pre-Processing

Multispectral image data first goes through a segmentation process, which is needed to remove the unnecessary background in the multispectral image of the coffee fruit. An extensive background can reduce the accuracy of the maturity classification.

The first phase is finding the mask of each image. A Gaussian blur with a 7x7 kernel is used to remove image detail. The Sobel operator, with 3x3 kernels in the x and y directions, is then applied to the blurred result to find the edges of the object. After that, morphological operations (erosion, dilation, and hole filling) are applied to obtain a completely filled object. The image is eroded using a 5x5 kernel to remove small objects, then dilated using a 3x3 kernel to emphasize the object. Finally, the hole-filling operation is applied to the dilated image. The result shows several objects in white.

The second phase is finding and cropping the coffee fruit object. A well-segmented color channel is used as the reference for cropping the other channels. The segmentation stages can be seen in Fig. 4.

Fig. 4. Image segmentation process: (a) original, (b) blur, (c) Sobel, (d) erosion, (e) dilation, (f) hole-filling, (g) bounding box, (h) cropped

3.3. Features Extraction

After the multispectral image of coffee fruit has been cropped according to the segmentation result on the best color channel, feature extraction is carried out to find the values that uniquely characterize the image. Various features can be extracted, but this research uses color and texture features.

Color feature extraction uses a color histogram because it describes the distribution of colors in an image. Color histograms can be applied in various color spaces, including multispectral images. This research utilizes statistical values of the color histogram: mean, variance, skewness, and kurtosis. These statistics are calculated using the equations below [8]:

$$ Mean = \mu = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} P_{i,j} \tag{1} $$

$$ Variance = \sigma = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} (P_{i,j} - \mu)^2} \tag{2} $$

$$ Skewness = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} (P_{i,j} - \mu)^3}{MN\sigma^3} \tag{3} $$

$$ Kurtosis = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} (P_{i,j} - \mu)^4}{MN\sigma^4} - 3 \tag{4} $$

Here P_{i,j} is the pixel value at position (i, j) of an M x N channel and \mu is the mean from Eq. (1). Because Eq. (2) includes the square root, it yields the standard deviation \sigma used in Eqs. (3) and (4).

Texture feature extraction uses the Gray Level Co-Occurrence Matrix (GLCM) method. The GLCM is a matrix representing how often two neighboring pixels with given intensities occur at a certain distance and angle. The GLCM is computed for four angular directions, namely 0°, 45°, 90°, and 135°, with a distance between pixels of 1 pixel [9]. Texture features, namely contrast, energy, homogeneity, and correlation, are then extracted from the matrix using the equations below:

$$ Contrast = \sum_{i,j=0}^{levels-1} P_{i,j} (i - j)^2 \tag{5} $$

$$ Energy = \sqrt{\sum_{i,j=0}^{levels-1} P_{i,j}^2} \tag{6} $$

$$ Homogeneity = \sum_{i,j=0}^{levels-1} \frac{P_{i,j}}{1 + (i - j)^2} \tag{7} $$

$$ Correlation = \sum_{i,j=0}^{levels-1} P_{i,j} \left[ \frac{(i - \mu_i)(j - \mu_j)}{\sqrt{(\sigma_i^2)(\sigma_j^2)}} \right] \tag{8} $$

Here P_{i,j} is the normalized GLCM entry for gray levels i and j, and \mu_i, \mu_j, \sigma_i^2, and \sigma_j^2 are the means and variances of the GLCM along its rows and columns.

Color feature extraction is done in each color channel, so it produces 60 features (15 channels x 4 statistics). Texture features are extracted in each color channel at four angles, namely 0°, 45°, 90°, and 135°, so texture extraction yields 240 features (15 channels x 4 angles x 4 properties). Combined, the features total 300.

3.4. Gaussian Naïve Bayes

Naïve Bayes is an algorithm that applies simple probability calculations by summing frequencies and combinations of values in a data set [10]. The variant of Naïve Bayes used must be matched to the nature of the data in the dataset. Color and texture features extracted from multispectral images of coffee fruit are continuous numerical data, so they are processed using Gaussian Naïve Bayes [11]. The Gaussian Naïve Bayes calculation is done with the following steps:
a. Calculate the mean of each feature for each class.
b. Calculate the standard deviation of each feature for each class.
c. Calculate the prior probability by dividing the number of samples in a class by the total number of samples.
d. Calculate the Gaussian likelihood using the following equation:

$$ P(X_i = x_i \mid Y = y_j) = \frac{1}{\sqrt{2\pi\sigma_{ij}^2}} \, e^{-\frac{(x_i - \mu_{ij})^2}{2\sigma_{ij}^2}} \tag{9} $$

e. Calculate the posterior probability of each class by multiplying all the Gaussian likelihoods by the prior probability.
f. The prediction is the class with the highest posterior probability.

3.5. Model Evaluation

The classification experiments are based on the features used, namely color features, texture features, and a combination of color and texture features. The experiments were conducted using K-Fold cross-validation with K = 10.

Through these experiments, the performance of each model is measured using a Confusion Matrix. A Confusion Matrix is a table that describes the performance of a particular algorithm [11]. In the layout of Table 3, each row represents a predicted class and each column represents the actual class. All metrics are averaged using the macro average, which sums the per-class values of a metric and divides the sum by the number of classes. The metrics obtained from the Confusion Matrix include the following:

$$ Precision = \frac{TP}{FP + TP} \tag{11} $$

$$ Recall = \frac{TP}{FN + TP} \tag{12} $$

$$ F1\ Score = 2 \times \frac{Precision \times Recall}{Precision + Recall} \tag{13} $$

Table 3. Confusion Matrix

                      Actual Positive        Actual Negative
Predicted Positive    True Positive (TP)     False Positive (FP)
Predicted Negative    False Negative (FN)    True Negative (TN)

4. Result and Analysis

The multispectral image segmentation of coffee fruit was performed on all 15 color channels. Applying blur, Sobel edge detection, erosion, dilation, and hole filling in sequence gave the results in Table 4.

Table 4. Image segmentation experiment

Color         Number of failures
Violet        640
Royal Blue    627
Blue          573
Azure         571
Cyan          557
Green         493
Lime          354
Yellow        403
Amber         163
Red Orange    61
Red           97
Deep Red      276
Far Red       640
NIR           640
NIR           640

An image fails to segment when the segmentation does not land on the coffee fruit object or when part of the coffee fruit object is cut off. Based on the results in Table 4, the red-orange color channel has the fewest failures, 61 images. Therefore, the red-orange color channel is used as the reference for cropping the other color channels.

Next, color and texture features are extracted from the multispectral image of coffee fruit as described in Section 3.3. Extraction is done in each color channel, resulting in 60 color features and 240 texture features, for a total of 300 features. The extraction results can be seen in Fig. 5 and Fig. 6.

Fig. 5. Color feature extraction result

Fig. 6. Texture feature extraction result
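As a rough illustration of how the 300-dimensional feature vector described above could be assembled, the following Python sketch computes the four color statistics of Eqs. (1)-(4) and the four GLCM properties of Eqs. (5)-(8) for every channel. It is not the authors' implementation. It uses only the standard library, and the function names, the `levels=8` quantisation, the non-symmetric GLCM, and the convention of returning a correlation of 1 for a constant matrix are all assumptions made for the example. The input image is made-up data, assumed to be already quantised to integer gray levels.

```python
import math

# Offsets (row, col) for a pixel distance of 1 at 0, 45, 90 and 135 degrees.
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]


def color_stats(channel):
    """Mean, Eq. (2) spread, skewness and excess kurtosis of one channel."""
    pixels = [p for row in channel for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    sigma = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    if sigma == 0:  # assumption: a flat channel gets zero higher moments
        return [mean, 0.0, 0.0, 0.0]
    skew = sum((p - mean) ** 3 for p in pixels) / (n * sigma ** 3)
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * sigma ** 4) - 3
    return [mean, sigma, skew, kurt]


def glcm(channel, offset, levels):
    """Normalised co-occurrence matrix for integer levels in [0, levels)."""
    dr, dc = offset
    rows, cols = len(channel), len(channel[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[channel[r][c]][channel[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]


def texture_props(p):
    """Contrast, energy, homogeneity and correlation of a normalised GLCM."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = math.sqrt(sum(v * v for _, _, v in cells))
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    if denom == 0:  # assumption: constant texture counts as fully correlated
        correlation = 1.0
    else:
        correlation = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells) / denom
    return [contrast, energy, homogeneity, correlation]


def extract_features(image, levels=8):
    """image: list of channels; each channel is a list of rows of ints."""
    features = []
    for channel in image:  # 4 color statistics per channel
        features.extend(color_stats(channel))
    for channel in image:  # 4 angles x 4 properties per channel
        for offset in OFFSETS:
            features.extend(texture_props(glcm(channel, offset, levels)))
    return features


# Synthetic 15-channel, 4x4 image with gray levels 0..7 (made-up data).
demo_image = [[[(r * c + k) % 8 for c in range(4)] for r in range(4)]
              for k in range(15)]
demo_features = extract_features(demo_image)
print(len(demo_features))  # 15 * 4 + 15 * 4 * 4 = 300
```

Placing the 60 color values first and the 240 texture values second makes it straightforward to slice the vector when comparing the color-only, texture-only, and combined models. A practical pipeline would more likely rely on array libraries than on nested lists; the plain-Python form is used here only to keep each formula visible.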

The resulting dataset is classified using Naïve Bayes by considering the division of training data and testing data. The performance of each classification experiment is compared in Table 5.

The precision of the three models does not differ significantly. The highest precision, 91.65%, was obtained using the combined Color and Texture features. It means the model can predict each ripening level precisely. This model also has the highest recall, 92.05%.

Table 5. Naive Bayes classification performance comparison

             Color     Texture   Color & Texture
Precision    91.11%    90.44%    91.65%
Recall       91.52%    90.79%    92.05%
Accuracy     89.98%    89.80%    91.01%
F1-Score     89.98%    90.65%    90.93%

The model with the Color and Texture features also had the highest accuracy and F1-Score, at 91.01% and 90.93%, respectively. The model predicted the ripening stage accurately.

As shown in Fig. 7, the model had comparable ability to predict each ripening stage. The highest number of misclassifications is in the semimature ripening stage, with 22 samples. The lowest number of misclassifications is in the dry ripening stage, with only 1 sample.
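The evaluation behind Table 5 can be approximated with the sketch below, which follows steps a to f of Section 3.4 and the macro-averaged metrics of Section 3.5. It is a minimal standard-library illustration under stated assumptions, not the authors' code. The function names, the interleaved fold assignment, the pooling of predictions across folds, the small `eps` added to each standard deviation, the use of log probabilities, and the toy two-feature data are all choices made for the example. The macro F1 is taken here as the mean of the per-class F1 values.

```python
import math
from collections import defaultdict


def fit_gnb(X, y, eps=1e-9):
    """Steps a-c: per-class feature means, standard deviations and priors."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # eps (an assumption) keeps a constant feature from giving sigma = 0
        stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) + eps
                for col, m in zip(columns, means)]
        model[label] = (means, stds, n / len(X))
    return model


def predict_gnb(model, row):
    """Steps d-f in log space, so many small factors do not underflow."""
    best_label, best_score = None, -math.inf
    for label, (means, stds, prior) in model.items():
        score = math.log(prior)
        for x, m, s in zip(row, means, stds):
            score += -0.5 * math.log(2 * math.pi * s * s) - (x - m) ** 2 / (2 * s * s)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def k_fold_predict(X, y, k=10):
    """Interleaved K-fold: each sample is predicted by a model that never saw it."""
    predictions = [None] * len(X)
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        train = [i for i in range(len(X)) if i not in test_idx]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            predictions[i] = predict_gnb(model, X[i])
    return predictions


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1 (Eqs. 11-13)."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true))
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append((precision, recall, f1))
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(pc[0] for pc in per_class) / n,
        "recall": sum(pc[1] for pc in per_class) / n,
        "f1": sum(pc[2] for pc in per_class) / n,
    }


# Made-up, well-separated two-feature data standing in for the 300 features.
X = [[0.1 * i, 5.0 - 0.1 * i] for i in range(10)] + \
    [[10.0 + 0.1 * i, -5.0 + 0.1 * i] for i in range(10)]
y = ["immature"] * 10 + ["mature"] * 10
print(macro_scores(y, k_fold_predict(X, y, k=5)))
```

Working in log space replaces the product in step e with a sum, which leaves the winning class unchanged while avoiding numerical underflow when hundreds of small likelihoods are combined.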

Fig. 7. Confusion matrix of Color & Texture Model

The gain from the combined features is not significant. It differed by only 0.28% from the color feature model. When complexity is taken into account, the color feature model is recommended.

5. Conclusion

Based on the experiments, classifying the maturity level of coffee fruit from multispectral images using Naïve Bayes performed well. However, it did not perform as well as the previous study. The previous study [4] reached an F1-Score of 98.47%, while the best model in these experiments reached an F1-Score of 90.93%. When complexity is considered, the color feature model is recommended for implementation in the system. Otherwise, CNN is worth considering.

6. Limitations

This study uses the data published by [4]. The segmentation method proposed in this study is limited to the conditions in that dataset.

7. Future Work

Future studies should consider images with dynamic backgrounds, so that the segmentation process can be done dynamically without being constrained by the background.

References

[1] Y. Abubakar, D. Hasni, and S. A. Wati, “Analisis Kualitas Buah Merah Kopi Arabika Gayo dan Korelasinya dengan Kualitas Biji Pada Ketinggian Berbeda,” Jurnal Tanaman Industri dan Penyegar, vol. 9, no. 1, pp. 1–14, 2022, doi: 10.21082/jtidp.v9n1.2022.p1-14.
[2] J. Qin, K. Chao, M. S. Kim, R. Lu, and T. F. Burks, “Hyperspectral and Multispectral Imaging for Evaluating Food Safety and Quality,” J Food Eng, vol. 118, no. 2, pp. 157–171, 2013, doi: 10.1016/j.jfoodeng.2013.04.001.
[3] D. Lorente, N. Aleixos, J. Gómez-Sanchis, S. Cubero, O. L. García-Navarrete, and J. Blasco, “Recent Advances and Applications of Hyperspectral Imaging for Fruit and Vegetable Quality Assessment,” Food and Bioprocess Technology, vol. 5, no. 4, pp. 1121–1142, May 2012, doi: 10.1007/s11947-011-0725-1.
[4] M. A. Tamayo-Monsalve et al., “Coffee Maturity Classification Using Convolutional Neural Networks and Transfer Learning,” IEEE Access, vol. 10, pp. 42971–42982, 2022, doi: 10.1109/ACCESS.2022.3166515.
[5] M. A. T. Monsalve, G. Osorio, N. L. Montes, S. Lopez, S. Cubero, and J. Blasco, “Characterization of a Multispectral Imaging System Based on Narrow Bandwidth Power LEDs,” IEEE Trans Instrum Meas, vol. 70, 2021, doi: 10.1109/TIM.2020.3010109.
[6] Z. Sandoval, F. Prieto, and J. Betancur, “Digital Image Processing for Classification of Coffee Cherries,” in 2010 IEEE Electronics, Robotics and Automotive Mechanics Conference, CERMA 2010, 2010, pp. 417–421, doi: 10.1109/CERMA.2010.54.
[7] H. Syahputra, F. Arnia, and K. Munadi, “Karakterisasi Kematangan Buah Kopi Berdasarkan Warna Kulit Kopi Menggunakan Histogram dan Momen Warna,” JURNAL NASIONAL TEKNIK ELEKTRO, vol. 8, no. 1, p. 42, Mar. 2019, doi: 10.25077/jnte.v8n1.615.2019.
[8] O. D. Nurhayati, “Pengolahan Citra untuk Identifikasi Jenis Telur Ayam Lehorn dan Omega-3 Menggunakan K-Mean Clustering dan Principal Component Analysis,” Jurnal Sistem Informasi Bisnis, vol. 10, no. 1, pp. 84–93, Jun. 2020, doi: 10.21456/vol10iss1pp84-93.
[9] R. Widodo et al., “Pemanfaatan Ciri Gray Level Co-Occurrence Matrix (GLCM) Citra Buah Jeruk Keprok (Citrus reticulata Blanco) untuk Klasifikasi Mutu,” Jurnal Pengembangan Teknologi Informasi dan Ilmu Komputer, vol. 2, no. 11, pp. 5769–5776, 2018, [Online]. Available: http://j-ptiik.ub.ac.id
[10] A. Saleh, “Implementasi Metode Klasifikasi Naive Bayes dalam Memprediksi Besarnya Penggunaan Listrik Rumah Tangga,” Citec Journal, vol. 2, no. 3, pp. 207–217, 2015.
[11] Q. Hasanah, H. Oktavianto, and Y. D. Rahayu, “Analisis Algoritma Gaussian Naive Bayes Terhadap Klasifikasi Data Pasien Penderita Gagal Jantung Gaussian Naive Bayes Algorithm Analysis Of Data Classification Of Heart Failure Patiens,” Jurnal Smart Teknologi, vol. 3, no. 4, pp. 2774–1702, 2022, [Online]. Available: http://jurnal.unmuhjember.ac.id/index.php/JST
