How IoT and computer vision could improve the casting quality

Iker Pastor-López, José Gaviria de la Puerta, Borja Sanz
[email protected]
Faculty of Engineering, University of Deusto
Bilbao, Vizcaya
industry, different approaches have been made to the use of computer vision for the detection of the assembly in the fuses of gearboxes [30].

Following this field over the years, several studies have also been carried out on castings. In some of them, the detection of certain kinds of defects in iron castings was executed through modelling the foundry production parameters as input [26]. These kinds of studies use the combination of machine learning tools with the data extracted from the foundry using different sensors.

These industrial processes have been searching for years for a convenient way to automate the failure detection process and to interconnect the whole process. As Bringas rightly stated [3], industrial processes have changed with the advent of IoT, which makes the interconnection of the different phases of the processes much more optimal by sharing information among them. As an example, there have been some developed projects, like IPRO (Intelligent Foundry Business Process), that seek the generation of intelligent systems during the industrial process control of the foundry. These systems have a strong sensorization aspect and use artificial intelligence for the detection and prevention of undesirable situations.

Figure 1: The Foundry Processes.

Figure 1 shows the entire casting process, from the instant that the casting is performed to the moment when the pieces are categorized depending on their quality, as good or bad. One of the steps of this process is the part where the whole process is interconnected through the data. At the end of the process, the defects are successfully detected using computer vision. The information provided by the use of this technology will result in the improvement of the process in the early stages and thus avoid defects.

In recent years, there has been a great evolution in the machine learning area, using algorithms that are known as Deep Learning (DL) [17], which is a sub-field of Artificial Neural Networks (ANNs). A standard neural network (NN) is a collection of simple connected processors called neurons. Each neuron produces a sequence of real-valued activations. The neurons that obtain the inputs from the environment are called input neurons, and other neurons get activated through weighted connections from previously active neurons. The output neurons show the final results. DL algorithms use the same bases, but include different hidden layers, and in the last years different architectures have been developed within these kinds of networks [13].

These techniques have also been applied to industrial inspection [31]. Optical Quality Control (OQC) ensures that the product is visually free of imperfections or defects. In this way, it is a common practice in the industry to create a new set of features that are manually engineered when a problem arises. Deep Learning techniques allow us to avoid this manual step and make the "features engineering" process automatic. In order to achieve this, it is necessary to get a big labelled data-set. Unfortunately, in industrial processes there are several issues that pose a challenge when trying to acquire this dataset.

The size of a typical data-set to train these systems may serve as an example. The IRIS data-set [10] is a traditional dataset that is included in every traditional machine learning course. It is composed of 150 instances with 4 attributes each. On the other hand, the MNIST [18] dataset, which is the equivalent data-set in the deep learning area, is composed of around 70,000 instances of 20x20 images (i.e., with 20x20x3 channels (Red, Green and Blue), 1,200 features per instance).

In summary, the first dataset contains 600 elements and the second one around 84 million elements. Thus, nowadays, the manual feature engineering process is still in use in the industry. As shown in Section 2, our dataset is not big enough to apply deep learning techniques. In recent years, several authors tried to develop new techniques to avoid the use of this manual feature engineering process when dealing with small data-sets [16]. Our research is focused on this "small data" area.

We organize the rest of the paper as follows: First, we start with a complete description of how we prepare the dataset that is used for surface defect categorization. Secondly, we describe the machine learning algorithms that are used for the experimentation. Finally, we present the results obtained with the combination of different groups of variables and algorithms, as well as the conclusions of our work.

2 DATASET PREPARATION

Data Gathering
To acquire data from the surface of the castings, we developed a functional machine vision system (see Figure 2) [20]
IoT 2019, October 22–25, 2019, Bilbao, Spain
composed of: (i) a laser-based triangulation camera with 3D technology, (ii) a robotic arm and (iii) a computer with several data processing capabilities [22].

(1) Image device. We use a laser-based triangulation camera. By taking advantage of the high-power (3-B class) laser, we are able to scan the castings even though their surface tends to be dark.
(2) Processing device. We use a workstation with a Xeon E5506 processor working with 32 GB of RAM and a Quadro FX1800 graphics processing unit. This system processes all the data and manages all the devices.
(3) Robotic arm. We have adapted the camera to the robotic arm due to the diversity of the castings.

Figure 2: Proposed prototype for data gathering.

To start with the data acquisition, the casting is put over a black painted table and adjusted on a special silicon mould. We use this color to decrease unwanted reflections of the laser, thus minimizing the noise. Additionally, the mould ensures that all the scanned castings are in the same position. Then, making a linear movement with the robotic arm, the laser is projected over the surface of the casting, and with this projection a set of height points is calculated. The precision of our system is 0.2 mm. Then, we remove the points related to the working table in order to reduce the noise. As a result of the process, we obtain a matrix with the height values of the casting at each point.

Data Representation
Using this height matrix, we create three other representations (see Figure 3) of the data. With these representations we aim to have more complete information about the surface of the castings and different aspects of the data.
• Grey-scale Height Map [29]: In this representation we convert each value of the height matrix into a range between 0 and 255. As a result, we obtain a grey-scale image with different levels of grey.
• Normals map: It does not only show data about the heights, but also the direction of the surface at each point. The resulting vectors for each point have three components, one per dimension (x, y, z). Then, we codify each one with an RGB color code (R = x, G = y, B = z), resulting in an image with the vectors' data [22].
• Merge of the normals map and grey-scale height map: The last generated representation merges the data of the grey-scale height map and the normal vectors of the previous one. In particular, we follow the next process: (1) We calculate the cosine distance between each normal vector and a model vector defined as (0, 0, 1). This model vector represents a flat surface. A high value for this distance implies a pixel with a casting edge or surface defect.
will increase the classification time and will influence the accuracy of the result).

This algorithm does not have a model training stage; it just compares the distance between several instances. Traditionally, the metric used to evaluate distances has been the Euclidean distance, and the aggregation metric used to evaluate the class of the instances has been the simple vote [32].

Bayesian Networks
Bayesian Networks [23], which are based on the Bayes Theorem, are defined as directed acyclic graph (DAG) models for multivariate analysis. This model can help to know the statistical dependencies between system variables. Each node represents a problem variable that can be either a premise or a conclusion, and each link represents a conditional dependency between such variables. Each node has an associated probability distribution function. Moreover, the probability function illustrates the strength of these relationships in the graph [5]. The most important capability of Bayesian Networks is their ability to determine the probability that a certain hypothesis is true (e.g., the probability of an executable being malware [9]) given a historical data-set.

Support Vector Machines (SVM)
SVM algorithms first map the input vector into a higher-dimension space, and then divide the n-dimensional space representation of the data into two regions using a hyperplane. In binary classification, the main goal of the algorithm is to maximize the margin between those two regions or classes. The margin is the largest separation between the examples of the two classes, computed based on the distance between the closest instances of both classes, which are called support vectors [28].

Decision Trees
A Decision Tree classifies a sample through a sequence of decisions, in which the current decision helps to make the subsequent one. In these algorithms, nodes represent conditions regarding the variables of the problem, whereas final nodes represent the ultimate decision of the algorithm [24]. Thus, they can be represented graphically as trees. To train the models, we use Random Forest, a combination of weak classifiers (i.e., an ensemble) of different randomly-built decision trees [2], and J48, the WEKA [12] implementation of the C4.5 algorithm [25].

4 EMPIRICAL VALIDATION
In order to evaluate the performance of our detector, we use a dataset collected from a foundry specialized in the automotive sector. More specifically, the foundry is focused on the creation of pieces for safety and precision components within this industry.

To create the dataset, we collect 645 foundry castings using the segmentation system previously described. We use 176 correct castings to train the model, and 469 for testing. With this seed, we create a dataset composed of 5785 segments to train the machine-learning models. We focus on the detection of 3 different defects: 1) inclusion, 2) cold lap and 3) misrun. We also include a new category, called "Correct", that represents the segments that are correct, even though the method has marked them as faulty. The number of samples in each category is indicated in Table 1.

Table 1: Number of samples for each category.

Category    Number of samples
Inclusion   387
Cold Lap    16
Misrun      52
Correct     5030

The criteria for acceptance are based on the final requirements of the customer. Due to the final destination industry, the quality standards are very restrictive. To this end, we label each possible segment with its defects within the castings.

The dataset is not balanced across the existing classes due to scarce data. To minimize the problems that the algorithms tend to have in these cases (scarce and unbalanced data), we apply the Synthetic Minority Over-sampling Technique (SMOTE) [7], which over-samples the less populated classes.

Next, we conduct the following methodology to evaluate the precision of our method to categorize the segments:
• Cross validation: a commonly used methodology in machine-learning evaluation. We use 10 as the value of k (i.e., we split our dataset 10 times into a learning set (90% of the dataset) and a testing set (10% of the total data)).
• SMOTE: in order to balance the dataset, we apply this method, which was previously described.
• Training step: in this step, we use the algorithms described in Table 2 to find the algorithm that has the best performance.
Finally, to determine the best results, we focus on the maximization of the accuracy and the Area Under the ROC Curve (AUC). The first one shows the number of correctly classified instances, and the second one takes into account the relationship between the true positive rate and the false positive rate. In our experiment, we have more than two classes (one per defect type),
to this end, we have used weighted values of accuracy and AUC.

5 RESULTS DISCUSSION AND CONCLUSIONS
We compare the detection capabilities using the different algorithms. The results of the experiments are shown in Table 3. We can see that the results of BCLP and FFT are very similar. The Area Under the ROC Curve and the accuracy using both categorization strategies are very close (e.g., using a Bayesian network with a K2 kernel (BN: K2), the difference in accuracy is only 0.538). Some algorithms have a lower performance with these categorization strategies (e.g., Naïve Bayes with FFT only gets 0.3099 accuracy, worse than a coin flip). As usual, we achieve the best results using Decision Trees; the bigger the value of N, the better the results of the model. Particularly, using a Decision Tree with Random Forest and 100 as the value of N, we obtain more than 0.91 accuracy and a 0.9238 Area Under the ROC Curve with the BCLP categorization method. The problem with these algorithms is that they tend to over-fit the data.

We obtain the best results using the Simple segmentation with Random Forest, N = 100 (DT: RF N = 100). With this configuration we obtain a 0.9664 accuracy and a 0.9763 area under the ROC curve. In general, the best results are obtained through the Simple method. All the values of AUC are over 0.9, except for KNN; the worst results are obtained with this segmentation. On the other hand, the best results are again achieved through Decision Trees, followed by the SVM algorithm family. Regarding accuracy, all values are near 0.9, except for the Naïve Bayes algorithm.

The COM configuration results are 0.2 below the other segmentation strategies. The worst results are obtained through Naïve Bayes, with the lowest accuracy, 0.4376, and with a low area result (just 0.7342).

In Table 4 we show the performance of the system combining all the feature sets. In this way, the Decision Trees once more have the best performance, both in accuracy and in the area under the ROC curve. There are no significant changes in the performance after combining the segmentation techniques in this family of algorithms. On the other hand, there are improvements in other families of algorithms (e.g., SVM). Combining these segmentation techniques slightly changes the overall results, thus improving the performance of the machine learning algorithms. In this way, the worst results are obtained using KNN algorithms. Although changing the value of K improves the results, the AUC is still below the results presented by the other families of algorithms. In this way, the most balanced results are obtained through the SVM family, since both accuracy and AUC get values over 0.9.

In conclusion, from this first experimentation we can conclude that the best strategy is to use the Simple characterization method, but both BCLP and FFT get a good performance and could be used in some circumstances. The worst performance is obtained by the COM method, and we discourage the use of this characterization method in these cases. From the second experiment we can conclude that combining the
Table 3: Results of the categorization using BCLP, FFT, COM, and Simple feature sets by themselves.
Table 4: Results of the categorization using the combination of Simple feature set with BCLP, FFT, COM, and a combination
of all the feature sets.
characterization techniques improves the general results. Regarding machine learning algorithms, we can conclude that, despite the fact that the best results are achieved through Random Forest, SVM algorithms in general have better performance when facing new instances; therefore, they would be the best option to be included in a prototype of the system.