
2019 3rd IEEE International Conference on Robotics and Automation Sciences

AGV Robot Based on Computer Vision and Deep Learning

Chen Chuixin, Cheng Hanxiang
Electrical and Electronic Engineering College, Tian He College of GuangDong Polytechnic Normal University, GuangZhou, China
e-mail: [email protected], [email protected]

Abstract—AGV is widely used in automated production, but current AGV research still has shortcomings. An AGV robot based on computer vision and deep learning is proposed. The hardware is composed of the car body, power supply, driving device, steering device, precise parking device and controller. The software algorithm adopts a convolutional neural network structure. Through continuous training on the collected image information, the steering angle of the robot is determined, realizing virtual path navigation and safety protection. Experiments prove the correctness of the algorithm. The system can be widely used in the field of automation and provides a good reference for the study of artificial intelligence.

Keywords-AGV; computer vision; deep learning; neural network

I. INTRODUCTION

With the gradual development of factory automation and computer integrated manufacturing system technology, and the wide application of flexible manufacturing systems and automated warehouses, the AGV (Automated Guided Vehicle), as a necessary means of automated handling and unloading that links and regulates discrete logistics systems to keep their operation continuous, has developed rapidly in both application scope and technical level [1].

An AGV is an unmanned automatic guided vehicle with a microcontroller as its control core [2], a battery as its power source, and a non-contact guidance device. Its basic automatic functions are guided driving, accurate address parking and load transfer. As an effective means of modern logistics automation and a key piece of equipment in flexible manufacturing systems, the AGV is used more and more widely, and AGV research has very important theoretical and practical significance [3].

Since the advent of the AGV, most AGV control has been content with models based on kinematics [4], while few researchers have carried out control design based on computer vision and deep learning. It is found that improving the AGV body dynamics model with computer vision and deep learning can capture the nonlinear coupling relationship between direct motor input and the speeds of the running and guiding wheels. This is of great and far-reaching significance for guiding the design of the vehicle body's mechanical structure, path planning and rational path-tracking control laws [5].

The AGV is composed of the car body, battery and charging system, driving device, steering device, precise parking device, vehicle controller, communication device [6], information sampling subsystem, ultrasonic obstacle detection and protection subsystem, visual monitoring subsystem, loading device and car body azimuth calculation subsystem, as shown in Figure 1.

Figure 1. AGV composition

II. OVERALL DESCRIPTION OF THE SYSTEM

The designed AGV robot has independent computing ability and four-wheel forward steering, and can be used for data acquisition and algorithm testing.

The AGV robot carries a self-designed data processing platform, so it has a certain data processing ability and can cooperate with a remote host. When collecting samples, the data processing platform controls the camera to take photos and saves the pictures on the platform. After sample collection, the convolutional neural network is trained and synchronized to the remote host. During testing, the data processing platform runs the trained convolutional neural network: from the environmental images collected by the camera, the network predicts the steering angle in real time and drives the AGV robot. The robot and the remote host communicate over the network.

The AGV robot can complete CNN training. Training is accomplished by the cooperation of the GPU and CPU: the CPU performs data preprocessing and other preparatory work, and the GPU performs the forward and backward propagation calculations of the neural network.
When choosing a GPU, factors such as core count and memory size should be considered. The number of GPU cores determines the training time of the network; because this algorithm is trained offline, training speed is not critical. GPU memory limits the size of the network and of the training batches: the activation values and errors of every layer of the neural network are stored on the GPU, which requires a great deal of memory. Reducing the batch size during training, or switching to convolution kernels with larger strides, can reduce the memory the CNN requires, but may cost some accuracy, so a trade-off is needed in practical applications [7].
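As a rough illustration of this trade-off, the activation memory of a single convolution layer grows linearly with the batch size (float32, 4 bytes per value); the batch sizes in the Python sketch below are arbitrary examples, not values used by the authors.

def activation_mb(batch, height, width, channels, bytes_per_value=4):
    # Memory needed to store one layer's float32 activations, in megabytes.
    return batch * height * width * channels * bytes_per_value / 1e6

# The first convolution layer output in this system is 65*113*24 (see section IV.B).
for batch in (16, 64, 256):
    print(batch, round(activation_mb(batch, 113, 65, 24), 1), "MB")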
III. SYSTEM HARDWARE ARCHITECTURE

The AGV robot adopts a four-wheel arrangement. The front wheels are steered by a group of differential steering components controlled by a digital steering servo, and the rear wheels are driven independently by two groups of brushless DC motors. The frame of the AGV is made of 2525 aluminium alloy, which meets both the robot's load design requirements and the aesthetic requirements of its appearance, as shown in Figure 2.

Figure 2. AGV frame

The steering structure of the AGV robot uses a digital steering servo to control the steering components and achieve the steering function. The design details of the steering components are shown in Figure 3.

Figure 3. Steering gear control and steering shaft structure

In its shape and structure design, the AGV robot fully considers the convenience of later modification while always taking practical performance into account. The final fuselage design is shown in Figure 4.

Figure 4. Fuselage design drawing

IV. SOFTWARE ALGORITHM

A. Overall Description of System Software Algorithms

Figure 5. End-to-end control algorithm

This algorithm is based on deep learning and automatic robot control. It treats autonomous driving as a whole and establishes an independent learning system.

The learning system is a convolutional neural network (CNN) composed of seven convolution layers and four fully connected layers. The input of the network is the image taken by a binocular camera installed at the front of the AGV. The output is a floating-point number representing the steering angle to be predicted. The loss is measured by the squared error. After CNN training, images captured by the camera are mapped to steering angles through the CNN. As it advances, the AGV continuously captures first-person-view images of the current environment, the CNN predicts the steering angle, and the AGV adjusts its direction in real time according to the predicted angle [8]. The overall flow of the algorithm is shown in Figure 5.
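To make this loop concrete, the following Python sketch shows one possible shape of the capture-predict-steer cycle. The model file name and the capture_frame/set_steering functions are hypothetical placeholders for the camera and actuator interfaces, which the paper does not specify; the resize and normalization steps are likewise assumptions of this sketch.

import numpy as np
import tensorflow as tf

def capture_frame():
    # Hypothetical stand-in for the camera driver: returns one RGB frame.
    return np.random.randint(0, 255, size=(480, 640, 3), dtype=np.uint8)

def set_steering(angle):
    # Hypothetical stand-in for the steering servo interface.
    print(f"steering command: {angle:+.3f}")

def preprocess(frame):
    # The paper pre-processes images to width 129, height 225; scaling to
    # [0, 1] is an assumption of this sketch.
    img = tf.image.resize(tf.cast(frame, tf.float32), (225, 129)) / 255.0
    return tf.expand_dims(img, 0)  # add a batch dimension

model = tf.keras.models.load_model("steering_cnn.h5")  # assumed trained-model file

for _ in range(10):  # deployed on the AGV this would loop continuously
    angle = float(model(preprocess(capture_frame()))[0, 0])  # single output node
    set_steering(angle)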
The system takes the binocular camera image as input and outputs a steering command. The characteristics of this system are as follows:

First, the system is based on depth measurement with a stereo camera, so a binocular camera is used to collect data.

Second, the computing resources of the AGV robot used in the system are relatively strong: its computing ability meets the needs of the task, and it can make autonomous decisions during motion.

Third, the automatic driving function of the system predicts the steering angle as a regression problem. Compared with predicting discrete left-turn and right-turn actions, the steering angle describes the motion accurately, but training is more difficult; therefore, several measures to improve the training effect and generalization ability are put forward.
B. Design and Implementation of the Convolutional Neural Network

The convolutional neural network designed in this system takes the first-person-view image of the AGV robot as input and outputs the steering angle, a continuous value. The design elements mainly include the loss function, the network structure, the convolution layers and the activation function.
1) Loss function

The steering angle output by the CNN is a continuous value, so the squared error is used to measure the loss. The objective of network optimization is to minimize the squared error between the steering angle predicted by the CNN and the manually chosen steering angle. Let m be the number of training samples, n the number of features, d the output dimension, and L the loss. Then the feature matrix of the m samples is X (m x n), the weight matrix is W (n x d), and the label matrix is y (m x d). The loss function is shown in formula (1):

L = ||XW - y||^2    (1)
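As a quick numerical illustration of formula (1), the following snippet evaluates the squared-error loss for random matrices with the shapes defined above (here d = 1, matching the single steering output):

import numpy as np

m, n, d = 4, 3, 1                 # toy sizes: m samples, n features, d outputs
rng = np.random.default_rng(0)
X = rng.normal(size=(m, n))       # feature matrix X (m x n)
W = rng.normal(size=(n, d))       # weight matrix W (n x d)
y = rng.normal(size=(m, d))       # labels y (m x d): manually chosen angles
L = np.sum((X @ W - y) ** 2)      # formula (1): squared norm of the residual
print(L)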
2) System network architecture

The CNN has 11 layers: 7 convolution layers and 4 fully connected layers. The final output node is the steering angle [9]. The original image is pre-processed to 129*225 size. Images are fed into the network and pass through five convolution layers with 5*5 kernels; the spatial size of the feature maps shrinks after each convolution layer, but the number of feature maps increases: 24, 36, 48, 64 and 64. The data then passes through two convolution layers with 3*3 kernels, which extract further features without downscaling. After the convolution layers come the fully connected layers: three hidden layers with 1164, 100 and 50 neurons, and a final output node representing the steering angle. The network structure is shown in Figure 6.

Figure 6. CNN network architecture
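For concreteness, this architecture can be sketched in TensorFlow/Keras roughly as follows. The stride-2, zero-padded 5*5 layers reproduce the 65*113*24 first-layer output computed in section 3) below, and the ELU activations follow section 4); the filter counts of the two 3*3 layers (64) and the optimizer choice are assumptions of this sketch, not details published by the authors.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_steering_cnn():
    model = models.Sequential([
        layers.Input(shape=(225, 129, 3)),  # H = 225, W = 129, RGB input
        # Five 5*5 convolutions with 24, 36, 48, 64, 64 filters as in the paper.
        layers.Conv2D(24, 5, strides=2, padding="same", activation="elu"),
        layers.Conv2D(36, 5, strides=2, padding="same", activation="elu"),
        layers.Conv2D(48, 5, strides=2, padding="same", activation="elu"),
        layers.Conv2D(64, 5, strides=2, padding="same", activation="elu"),
        layers.Conv2D(64, 5, strides=2, padding="same", activation="elu"),
        # Two 3*3 convolutions that extract features "without scaling": stride 1.
        layers.Conv2D(64, 3, strides=1, padding="same", activation="elu"),
        layers.Conv2D(64, 3, strides=1, padding="same", activation="elu"),
        layers.Flatten(),
        # Three hidden fully connected layers: 1164, 100 and 50 neurons.
        layers.Dense(1164, activation="elu"),
        layers.Dense(100, activation="elu"),
        layers.Dense(50, activation="elu"),
        layers.Dense(1),                    # single node: the steering angle
    ])
    model.compile(optimizer="adam", loss="mse")  # squared-error loss, formula (1)
    return model

build_steering_cnn().summary()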

3) Convolution layer

The convolution layer is the core layer of a convolutional neural network and carries most of its computational load. Its function is to extract image features layer by layer. The parameters of a convolution layer consist of learnable convolution kernels, each small in width and height. The kernels used in this algorithm are 3*3 and 5*5: compared with larger kernels, smaller kernels make the network more nonlinear and reduce the number of parameters. A convolution kernel has the same depth as its input data volume. The kernel size in the first layer of the network is 5*5*3, representing the width, height and depth of the kernel; because the input image has three channels, the kernel depth is also 3. During forward propagation, the kernel slides over the width and height of the input volume, the inner product between the kernel and the data block under it is computed, and the result is passed through the activation function to become the activation value of the next layer of neurons. When the kernel has slid across the entire input volume, a two-dimensional activation map has been generated [10], each element of which is an activation value. When a kernel has learned features that are useful for decision-making, the corresponding neurons show higher activation values. The features learned by higher convolution layers are more abstract and generalized than those of lower layers. Each convolution layer can have more than one kernel; in this algorithm the first convolution layer has 24 kernels. Each kernel produces one activation map when convolved with the input, so 24 activation maps are generated. These maps are stacked in the depth direction to form the output, which is still arranged along width, height and depth. Neurons in the convolution layer are connected locally: when dealing with high-dimensional data such as images, it is unrealistic to connect every neuron to all neurons in the previous layer, as the network would then have too many parameters to learn well. Therefore each neuron is connected only to a local region of the previous layer; the size of this region is called the receptive field, it equals the spatial size of the convolution kernel, and it is a hyperparameter. Figure 7 shows a neuron diagram.

Figure 7. Neuron schematic diagram

The size of the output volume depends not only on the input size W and the kernel size F, but also on the depth, stride and zero-padding parameters. Depth refers to the number of kernels in the convolution layer. Stride refers to the number of pixels the filter moves at each step of its sliding; generally the stride is 1 or 2, and if the stride S > 1 the output volume shrinks spatially. Zero padding P is used to control the size of the input volume; the common usage is to pad the edges with zeros so that the output volume keeps the same width and height. The width W' of the output volume is usually calculated as shown in equation (2):

W' = (W - F + 2P)/S + 1    (2)
In this algorithm, the input volume has width W = 129, height H = 225 and 3 channels. In the first convolution layer, the kernel size is F = 5, the stride is S = 2, the zero padding is P = 2, and there are K = 24 kernels. Thus the output width is W' = (129-5+2*2)/2+1 = 65 and the output height is H' = (225-5+2*2)/2+1 = 113, so the output volume size is 65*113*24. The sizes of the subsequent convolution layers can be calculated in turn. Note that when designing the hyperparameters of a convolution layer, the value of W-F+2P should be divisible by the stride S, so that the kernel tiles the input volume neatly; zero padding is usually applied to guarantee this divisibility.

In the convolution layer, parameter sharing is used to reduce the number of parameters. Parameter sharing rests on the assumption that if a feature detector is useful at pixel (x1, y1), it is also useful at pixel (x2, y2). Therefore the neurons of each depth slice use the same weights and bias: one set of weight parameters corresponds to one depth slice. If the input volume depth of a convolution layer is D1, the layer has F*F*D1*K weights and K biases. A convolution layer in this algorithm has 24 depth slices; it has 129*225*24 = 696600 neurons but only 5*5*3*24 = 1800 weight parameters (plus 24 bias parameters). Weight sharing therefore greatly reduces the number of parameters in the convolution layer.
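The arithmetic above is easy to verify in a few lines of Python, together with the divisibility check on W-F+2P and the shared-weight count:

# Check of equation (2) and the parameter count of the first convolution layer.
def conv_output_size(w, f, p, s):
    assert (w - f + 2 * p) % s == 0, "W-F+2P should be divisible by the stride S"
    return (w - f + 2 * p) // s + 1

W, H, F, P, S, K, D1 = 129, 225, 5, 2, 2, 24, 3
print(conv_output_size(W, F, P, S))   # 65  (output width)
print(conv_output_size(H, F, P, S))   # 113 (output height)
print(F * F * D1 * K, K)              # 1800 shared weights, 24 biases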

4) Activation function

This algorithm uses the ELU activation function. Because the back-propagated gradient may be very large in regression training, RELU neurons can die easily; the ELU function has left soft saturation, so it avoids neuron death and accelerates convergence to a certain extent. The RELU activation function is plotted in Figure 8, and its expression is shown in formula (3):

RELU(x) = x if x >= 0; 0 if x < 0    (3)

Figure 8. RELU function

The RELU activation function is non-saturating, which effectively avoids vanishing gradients and strongly accelerates network convergence. On the ImageNet classification task, comparing RELU with the sigmoid function shows that a network with RELU activations converges about 6 times faster. RELU also avoids the exponential operations of the sigmoid and tanh functions, which consume computing resources during forward and backward propagation. However, RELU has shortcomings, the main one being the neuron "death" phenomenon: when a large gradient flows through a RELU neuron, x may be pushed onto the negative half-axis, where both the activation value and the gradient of the neuron become zero; such a neuron will never be activated again. This can be mitigated to some extent by reducing the learning rate. The algorithm of this system predicts the steering angle, and large gradients are possible with the squared-error loss function; tests showed that using RELU neurons resulted in about 40% neuron death. Therefore this algorithm uses the ELU activation function, which combines the advantages of the RELU and sigmoid functions and has left soft saturation. The ELU function is plotted in Figure 9, and its expression is shown in formula (4):

ELU(x) = x if x >= 0; a(exp(x) - 1) if x < 0    (4)

Figure 9. ELU function

Here a is a constant that can be selected by cross validation; this algorithm sets it to 0.1. The linear part on the right side of the ELU function alleviates vanishing gradients, and the soft saturation on the left side reduces the neuron "death" phenomenon, so convergence is faster, which suits this system.
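Formulas (3) and (4) translate directly into NumPy; the values printed for negative inputs show the left soft saturation of ELU (with a = 0.1 as above). In practice Keras also ships an ELU layer, so a hand-written implementation is not required.

import numpy as np

def relu(x):
    return np.where(x >= 0, x, 0.0)

def elu(x, a=0.1):
    # Left soft saturation: approaches -a as x -> -inf instead of cutting to 0.
    return np.where(x >= 0, x, a * (np.exp(x) - 1.0))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(relu(x))   # [0. 0. 0. 1. 3.]
print(elu(x))    # [-0.095 -0.063  0.     1.     3.   ]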
V. SYSTEM TESTING

A. Training Data Acquisition

Each training sample consists of a first-person-view picture from the AGV robot together with the appropriate steering angle and throttle value for that picture. During data acquisition, the steering angle and throttle of the robot are controlled manually: the collector drives the AGV robot by remote control, observes the environment in front of it and, considering obstacle distance, marking-line distance, forward direction and position, steers with an appropriate angle, which serves as the sample label. The data processing platform records the current camera image together with the steering angle and throttle value chosen by the collector; together these constitute one training sample. The control interface used in data acquisition is shown in Figure 10; the image captured by the camera is displayed in it in real time. This cycle repeats until the number of samples reaches the expected amount, completing the collection of training samples. The algorithm is therefore a supervised learning algorithm in which the supervisor is the sample collector. To ensure the diversity and representativeness of the training samples, collection must be carried out under different maps, ground environments and illumination conditions. Sample collection for this system was carried out in the laboratory; during collection, manual marking lines were used to simulate the transportation environment of a factory. A total of 17320 samples were collected. The collector should follow a definite strategy, one that makes the AGV robot run smoothly and avoid obstacles, and the strategy should remain consistent throughout the acquisition process: strategy consistency ensures training data consistency, which keeps the learning process sound and prevents over-fitting to noisy data. The basic sampling strategies in the experiment are listed below.

Figure 10. Data Acquisition Interface
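As an illustration only, one collected sample could be logged as an image file plus a CSV row, as in the sketch below; the directory layout and field order are hypothetical, since the paper does not document its recording format (Pillow is assumed for image writing).

# Hypothetical logger for one training sample: image + steering angle + throttle.
import csv, os
import numpy as np
from PIL import Image

os.makedirs("samples", exist_ok=True)

def record_sample(index, frame, steering, throttle, log_path="samples/labels.csv"):
    img_path = f"samples/{index:06d}.png"
    Image.fromarray(frame).save(img_path)
    with open(log_path, "a", newline="") as f:
        csv.writer(f).writerow([img_path, steering, throttle])

# Dummy frame and operator inputs standing in for the camera and remote control.
frame = np.random.randint(0, 255, size=(225, 129, 3), dtype=np.uint8)
record_sample(0, frame, steering=0.12, throttle=0.4)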

First, when there is no obstacle, or the AGV robot has already passed an obstacle, the robot should stay in the middle of the road and drive along the landmark line. Second, when an obstacle appears in the robot's field of view, the steering should be adjusted early, and a series of steering actions should be taken so that the obstacle is avoided as smoothly as possible rather than with a rapid turn. The reason is that the camera can only observe obstacles in front of it: once the front of the vehicle has passed an obstacle, obstacles beside the vehicle can no longer be observed. If the AGV robot turns sharply near an obstacle, it can easily touch the obstacle even if the front of the robot clears it, so the robot should plan a smooth avoidance trajectory in advance. Third, the map type is varied during sample collection; in the experiment, map types such as S-shape, octagon and pentagon are arranged to ensure sample diversity, as shown in Figure 11. Fourth, the illumination conditions are varied during sampling; in the experiment, different illumination conditions are simulated by changing the brightness of natural light and the color and brightness of artificial light. Fifth, different types of obstacles are used to simulate the obstacles of a traffic environment. Finally, obstacle-avoidance behaviors in extreme situations are collected, such as rapid rotation when the robot is close to an obstacle and obstacle avoidance on roads without markers, to simulate driving in unstructured environments and make the algorithm more robust.
Figure 11. Experimental topographic map

B. Training Outcome Expectations

After sample collection, the designed convolutional neural network can be trained; the samples are fed to the network after data augmentation and steering angle preprocessing. All samples are divided into a training set, a verification set and a test set: the training set contains 17320 samples, the verification set 5000 samples and the test set 2320 samples. The convolutional neural network is built on Tensorflow, an efficient deep learning framework that describes the computation with a data flow graph, in which nodes represent mathematical operations and the edges between nodes carry the multidimensional data arrays, namely tensors, that flow between them. Tensorflow automatically computes the differentials needed in the back propagation of neural networks, provides programming interfaces such as Python and C++, and is easily deployed in distributed multi-CPU and multi-GPU environments.

The verification set is used to select hyper-parameters during network training, including the convolution kernel size, the regularization coefficient, the dropout rate and so on. At the same time, the verification set can prevent the model from over-fitting during training. The specific method is to compute the loss on the verification set after each round of training and to stop training when the verification loss no longer decreases, a practice called early stopping: when the training loss keeps decreasing but the verification loss does not, the model is gradually over-fitting, and training should be stopped.
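This early-stopping scheme maps directly onto a standard TensorFlow/Keras callback. The sketch below reuses the build_steering_cnn() model sketched in section IV.B and random stand-in data in place of the collected samples; the patience value is an arbitrary choice of this sketch.

import numpy as np
import tensorflow as tf

model = build_steering_cnn()  # model sketch from section IV.B above

# Dummy data standing in for the preprocessed samples; real training would
# load the collected images and steering labels instead.
X = np.random.rand(64, 225, 129, 3).astype("float32")
y = np.random.rand(64, 1).astype("float32")

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # stop when the verification loss stops decreasing
    patience=3,                 # tolerate a few flat rounds before stopping
    restore_best_weights=True,
)
model.fit(X, y, validation_split=0.2, epochs=50, callbacks=[early_stop])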

ACKNOWLEDGMENT

The system consists of hardware and software. The hardware mainly includes the dynamic gear train structure of the AGV robot and the necessary circuits; the software mainly includes the computer vision and deep learning algorithms. The convolutional neural network of the learning system has been studied in depth, and the correctness of the algorithm has been verified by experiments. This system is suitable for AGV systems in different application environments at home and abroad, and is applied in automated production lines such as automobile manufacturing, improving the industrial automation level of Chinese enterprises and their ability to participate in international competition. This article has been strongly supported by family members, colleagues and enterprises, and also draws on the important literature of predecessors, for which I would like to express my gratitude.

REFERENCES

[1] Zhu Yunhong, "Optimal Path Search Based on Improved A* Algorithms," Computer Technology and Development, 2018, 28(4): 55-59.
[2] Wang Dianjun, "Path Planning of Indoor Mobile Robot Based on Improved A* Algorithms," Journal of Tsinghua University (Natural Science Edition), 2012, 52(8): 1085-1089.
[3] Frampton K. D., "Acoustic self-localization in a distributed sensor network," IEEE Sensors Journal, 2016, 6(1): 166-177.
[4] Huang Zhiqiu, "Overview of the Development of Automatic Navigation Vehicles," Mechanical Design and Manufacturing Engineering, 2010, 39(1): 53-59.
[5] Chen Hongbo, "Development Prospect and Market Application of AGV in Logistics Industry," Robot Technology and Application, 2015(6): 39-40.
[6] Lu S., "A RFID-enabled positioning system in automated guided vehicle for smart factories," Journal of Manufacturing Systems, 2017, 44: 179-190.
[7] Zhen Shaohua, "Research on Multi-Path Fast Detection Algorithms for Visual Navigation AGV," Electronic Design Engineering, 2016(11): 123.
[8] Huang Yiun, "Quantity Configuration Planning of Cars in Workshop AGV Material Handling System," Industrial Engineering and Management, 2015(4): 63.
[9] Yu Hongjie, "Research on Extraction of Center Line of Visual AGV Navigation Mark under Complex Conditions," Computer Measurement and Control, 2016, 24(1): 212-215.
[10] Xu Wenbin, "Design of AGV Assembly Robot Control System Based on Industrial Computer," Computer Technology and Its Application, 2013, 39(7): 131-134.