CSE488 - Lab 7 - Neural Networks and TensorFlow
• Input: This layer is similar to the dendrites; it receives inputs from other neurons/networks.
• Summation: This layer functions like the soma of a neuron; it aggregates the input signals received.
• Activation: This layer is also similar to the soma; it takes the aggregated information and fires a
signal only if the aggregated input crosses a certain threshold value. Otherwise, it does not fire.
• Output: This layer is similar to the axon terminals; it may be connected to other neurons/networks
or act as the final output layer (for predictions).
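To make these four stages concrete, here is a minimal NumPy sketch of a single artificial neuron. The input, weight, bias, and threshold values are illustrative assumptions, not taken from the lab.

import numpy as np

x = np.array([0.5, 0.8])          # input signals arriving from other neurons
w = np.array([0.4, 0.6])          # connection weights
b = 0.1                           # bias

s = np.dot(w, x) + b              # summation: aggregate the weighted inputs
output = 1.0 if s > 0.5 else 0.0  # activation: fire only if the threshold is crossed
print(output)                     # output passed on to other neurons or used as the prediction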
The inputs are fed to each of the hidden-layer neurons by multiplying each input value by a weight (W) and summing with a bias value (b). So, the equations at the hidden-layer neurons will be as follows:
H1 = W1 ∗ X1 + W4 ∗ X2 + b1
H2 = W2 ∗ X1 + W5 ∗ X2 + b2
H3 = W3 ∗ X1 + W6 ∗ X2 + b3
The values H1, H2, and H3 will be passed to the output layer, with weights W7, W8, and W9, respectively. The output layer will produce the final predicted value of Y, denoted Yp:
Yp = W7 ∗ H1 + W8 ∗ H2 + W9 ∗ H3
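A small NumPy sketch of this forward pass; the input and weight values are assumptions chosen only for illustration.

import numpy as np

X1, X2 = 1.0, 2.0                                              # example inputs
W = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])   # W1..W9
b1, b2, b3 = 0.01, 0.02, 0.03                                  # hidden-layer biases

H1 = W[0] * X1 + W[3] * X2 + b1    # H1 = W1*X1 + W4*X2 + b1
H2 = W[1] * X1 + W[4] * X2 + b2    # H2 = W2*X1 + W5*X2 + b2
H3 = W[2] * X1 + W[5] * X2 + b3    # H3 = W3*X1 + W6*X2 + b3
Yp = W[6] * H1 + W[7] * H2 + W[8] * H3   # Yp = W7*H1 + W8*H2 + W9*H3
print(Yp)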
As the input data (X1 and X2) in this network flows in a forward direction to produce the final outcome Yp, this is said to be a feed-forward network; because the data propagates in a forward manner, the computation is called forward propagation. Now, suppose the actual value of the output is known (denoted by Y). In this case, we can calculate the difference between the actual value and the predicted value, i.e., L = (Y − Yp)², where L is the loss value. To minimize the loss value, we try to optimize the weights accordingly by taking the derivative of the loss function with respect to each weight.
For example, to find the rate of change of the loss function with respect to W7, we take the derivative of the loss function with respect to W7 (dL/dW7), and so on. As we can see from the preceding diagram, the process of taking these derivatives moves in a backward direction; that is, backward propagation is occurring. There are multiple optimizers available to perform backward propagation, such as stochastic gradient descent (SGD) and AdaGrad, among others.
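Continuing the NumPy sketch above, one gradient-descent update of W7 could look like this; the target value Y and the learning rate are assumptions.

Y = 2.5                                # assumed actual target value
L = (Y - Yp) ** 2                      # squared-error loss
dL_dW7 = -2 * (Y - Yp) * H1            # dL/dW7, since Yp = W7*H1 + W8*H2 + W9*H3

learning_rate = 0.1                    # assumed learning rate
W[6] = W[6] - learning_rate * dL_dW7   # move W7 against the gradient to reduce the loss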
[3]: c = tf.constant(2,name='c')
x = tf.Variable(3,name='x')
y = tf.Variable(c*x,name='y')
But now if you try to see the internal value of y (you would expect a value of 6) with the print() function, you
will see that it gives you the object and not the value.
[4]: print(x)
print(y)
[5]: X = tf.placeholder("int32")
Y = tf.placeholder("int32")
Once you have defined all the variables involved, i.e., you have defined the mathematical model at the
base of the system, you need to perform the appropriate processing and initialize the whole model with
the tf.global_variables_initializer() function.
[6]: model = tf.global_variables_initializer()
Now that you have a model initialized and loaded into memory, you need to start doing the calculations,
but to do that you need to communicate with the TensorFlow runtime system. For this purpose a
TensorFlow session is created, during which you can launch a series of commands to interact with the
underlying graph corresponding to the model you have created. You can create a new session with the
tf.Session() constructor. Within a session, you can perform the calculations and receive the values of
the variables obtained as results, i.e., you can check the status of the graph during processing.
You have already seen that the operation of TensorFlow is based on the creation of an internal graph
structure, in which the nodes are able to perform processing on the flow of data inside tensors that follow
the connections of the graph. So when you start a session, in practice you do nothing but instantiate this
graph.
A session has two main methods:
• session.extend() allows you to make changes to the graph during the calculation, such as adding new nodes or connections.
• session.run() launches the execution of the graph and allows you to obtain the results in output.
Since several operations are carried out within the same session, it is preferable to use the with: construct, placing all the calls to session methods inside it. In this simple case, you simply want to see the values of the variables defined in the model and print them on the terminal.
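A minimal sketch of such a session, assuming the TensorFlow 1.x compatibility mode used throughout this notebook:

with tf.Session() as session:
    session.run(model)         # run the initializer defined above
    print(session.run(y))      # now the value 6 is returned, not the Variable object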
1.6 Tensors
The basic element of the TensorFlow library is the tensor. In fact, the data that flow through the
Data Flow Graph are tensors.
[8]: t = np.arange(9).reshape((3,3))
print(t)
[[0 1 2]
[3 4 5]
[6 7 8]]
Now you can convert this multidimensional array into a TensorFlow tensor very easily using the
tf.convert_to_tensor() function, which takes two parameters. The first parameter is the array t that
you want to convert and the second is the type of data you want to convert it to, in this case int32.
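A sketch of that conversion, whose output is shown below; the session is used only to display the tensor's contents.

tensor = tf.convert_to_tensor(t, dtype=tf.int32)   # convert the NumPy array t into a TensorFlow tensor
with tf.Session() as session:
    print(session.run(tensor))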
[[0 1 2]
[3 4 5]
[6 7 8]]
But tensors can also be built directly from TensorFlow, without using the NumPy library. There are a number
of functions that make it possible to build tensors quickly and easily.
[10]: t0 = tf.zeros((3,3),'float64')
t1 = tf.random_uniform((3, 3), minval=0, maxval=1, dtype=tf.float32)
with tf.Session() as session:
    print(session.run(t0))
    print(session.run(t1))
[[0. 0. 0.]
[0. 0. 0.]
[0. 0. 0.]]
[[0.22495162 0.35007727 0.53897154]
[0.23294413 0.8983315 0.16192901]
[0.3974179 0.2803402 0.08716547]]
TensorFlow also provides functions to perform operations between tensors, such as element-wise addition and matrix multiplication. Assuming a second 3x3 tensor t2 defined like t1:

t2 = tf.random_uniform((3, 3), minval=0, maxval=1, dtype=tf.float32)  # assumption: a second operand
sum = tf.add(t1, t2)     # element-wise sum
mul = tf.matmul(t1, t2)  # matrix multiplication
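The printouts below come from a cell (not reproduced here) that defines the 11-point training set used in the rest of this lab. The following is only a sketch of such a definition, with the values taken from the outputs shown in this notebook.

import numpy as np

inputX = np.array([[1., 3.], [1., 2.], [1., 1.5], [1.5, 2.], [2., 3.], [2.5, 1.5],
                   [2., 1.], [3., 1.], [3., 2.], [3.5, 1.], [3.5, 3.]])   # 11 training points
inputY = [[1., 0.]] * 6 + [[0., 1.]] * 5   # one-hot class labels: first 6 points in class 0, last 5 in class 1
print(inputX.shape)
print(inputY)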
(11, 2)
[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0,
1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
To better see how these points are arranged spatially and which classes they belong to, there is no better
approach than to plot everything with matplotlib.
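A sketch of such a plot; the per-point class list yc is derived from the one-hot labels (its printout appears below), and matplotlib is assumed to be available.

import matplotlib.pyplot as plt

yc = [int(y[1]) for y in inputY]   # 0 = first class, 1 = second class
print(yc)
plt.scatter(inputX[:, 0], inputX[:, 1], c=yc, s=50, alpha=1)
plt.show()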
[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
The SLP Model Definition
If you want to do a deep learning analysis, the first thing to do is define the neural network model you want to implement. So you should already have in mind the structure to be implemented: how many neurons and layers (in this case only one layer), the weights of the connections, and the cost function to be applied.
Following the TensorFlow practice, you can start by defining a series of parameters necessary to characterize the execution of the calculations during the learning phase. The learning_rate is a parameter that regulates the learning speed of each neuron; it plays a very important role in determining the efficiency of a neural network during the learning phase. Another parameter to be defined is training_epochs, which defines how many epochs (learning cycles) will be applied to the neural network for the learning phase. During program execution, it will be necessary in some way to monitor the progress of learning, and this can be done by printing values on the terminal. You can decide after how many epochs a printout of the results is displayed, and store that value in the display_step parameter. A reasonable value is every 50 or 100 steps.
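For example (the exact values here are assumptions; the MLP example later in this lab uses similar settings):

learning_rate = 0.01       # assumed learning rate for the SLP
training_epochs = 2000     # number of learning cycles
display_step = 50          # print the progress every 50 epochs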
To make the implemented code reusable, it is necessary to add parameters that specify the number of elements that make up the training set and into how many batches it must be divided. In this case you have a small training set of only 11 items, so you can use them all in one batch. Finally, you can add two more parameters that describe the dimensionality of the incoming data and the number of classes it belongs to.
[15]: n_samples = 11
batch_size = 11
total_batch = int(n_samples/batch_size)
Now that you have defined the parameters of the method, let’s move on to building the neural network.
First, define the inputs and outputs of the neural network through the use of placeholders.
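A sketch of those placeholders, assuming two input features and two output classes as in the training set above:

x = tf.placeholder(tf.float32, [None, 2])   # input features of each sample
y = tf.placeholder(tf.float32, [None, 2])   # expected one-hot class labels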
Now that you have defined the placeholders, you can deal with the weights and the bias, which, as you saw,
are used to define the connections of the neural network. These tensors W and b are defined as variables
by the constructor Variable() and initialized to all zero values with tf.zeros().
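A sketch consistent with that description (the shapes follow from the two input features and two classes):

W = tf.Variable(tf.zeros([2, 2]))   # one weight per (input feature, output class) pair
b = tf.Variable(tf.zeros([2]))      # one bias per output class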
The variables W and b you have just defined will be used to define the evidence x ∗ W + b, which
characterizes the neural network in mathematical form. The tf.matmul() function performs the multiplication
between the tensors x and W, while the tf.add() function adds to the result the value of the bias b.
[18]: evidence = tf.add(tf.matmul(x, W), b)
From the value of the evidence, you can directly calculate the probabilities of the output values with the
tf.nn.softmax() function.
[19]: y_ = tf.nn.softmax(evidence)
Continuing with the construction of the model, you must now think about establishing the rules by which these parameters are optimized, and you do so by defining the cost (or loss) function. In this phase you can choose among many functions; one of the most common is the mean squared error loss.
But you can use any other function that you think is more convenient. Once the cost (or loss) function
has been defined, an algorithm must be established to perform the minimization at each learning cycle
(optimization). You can use the tf.train.GradientDescentOptimizer() function as an optimizer that
bases its operation on the Gradient Descent algorithm.
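A sketch of this step, using the mean squared error mentioned above (the exact form of the cost expression is an assumption):

cost = tf.reduce_sum(tf.pow(y - y_, 2)) / (2 * n_samples)   # mean squared error between labels and predictions
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)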
With the definition of the cost optimization method (minimization), you have completed the definition of
the neural network model. Now you are ready to begin to implement its learning phase.
Learning Phase
Before starting, define two lists that will serve as containers for the results obtained during the learning phase. In avg_set you will enter all the cost values for each epoch (learning cycle), while in epoch_set you will enter the relative epoch number. These data will be useful at the end to visualize the cost trend during the learning phase of the neural network, which is very useful for understanding the efficiency of the chosen learning method. Then, before starting the session, you need to initialize all the variables with the function you've seen before, tf.global_variables_initializer().
[22]: avg_set = []
epoch_set=[]
init = tf.global_variables_initializer()
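A sketch of the learning phase, assuming the tensors and parameters defined above; last_result holds the softmax output at the end of training and is used for the plot below.

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        avg_cost = 0.0
        for i in range(total_batch):
            # with batch_size == n_samples there is a single batch containing the whole training set
            _, batch_cost = sess.run([optimizer, cost], feed_dict={x: inputX, y: inputY})
            avg_cost += batch_cost / total_batch
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost =", "{:.9f}".format(avg_cost))
            avg_set.append(avg_cost)
            epoch_set.append(epoch + 1)
    last_result = sess.run(y_, feed_dict={x: inputX})   # class probabilities for the training points
    print("Training phase finished")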
Do not be misled by this example, which takes only a minute to compute; in real cases the training may take days, and often you have to make many attempts, adjusting and calibrating the different parameters, before developing an efficient neural network that is very accurate at class recognition or at performing any other task.
Now you can move on to see the results of the classification during the last step of the learning phase.
[25]: yc = last_result[:,1]
plt.scatter(inputX[:,0],inputX[:,1],c=yc, s=50, alpha=1)
plt.show()
The color ranges from blue (belonging 100% to the first group) to yellow (belonging 100% to the second group). As you can see, the division of the training-set points into the two classes is quite good, with some uncertainty for the four points on the central diagonal (green).
Test Phase and Accuracy Calculation
Now that you have a trained neural network, you can make the evaluations and calculate the accuracy. First you define a testing set with elements different from those of the training set. For convenience, these examples always use 11 elements.
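A sketch of such a test set, using the values shown in the printout below (the original defining cell is not reproduced here):

testX = np.array([[1., 2.25], [1.25, 3.], [2., 2.5], [2.25, 2.75], [2.5, 3.],
                  [2., 0.9], [2.5, 1.2], [3., 1.25], [3., 1.5], [3.5, 2.], [3.5, 2.5]])
testY = [[1., 0.]] * 5 + [[0., 1.]] * 6   # one-hot labels of the test points
print(testX)
print(testY)
print([int(y[1]) for y in testY])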
[[1. 2.25]
[1.25 3. ]
[2. 2.5 ]
[2.25 2.75]
[2.5 3. ]
[2. 0.9 ]
[2.5 1.2 ]
[3. 1.25]
[3. 1.5 ]
[3.5 2. ]
[3.5 2.5 ]]
[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0,
1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
[0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
[28]: inputX
[28]: array([[1. , 3. ],
[1. , 2. ],
[1. , 1.5],
[1.5, 2. ],
[2. , 3. ],
[2.5, 1.5],
[2. , 1. ],
[3. , 1. ],
[3. , 2. ],
[3.5, 1. ],
[3.5, 3. ]])
[[-0.7092779 0.70927787]
[ 0.6299925 -0.6299924 ]]
Accuracy: 1.0
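The accuracy value shown above is typically obtained by comparing, for each test sample, the predicted class with the expected one. A sketch of the usual pattern; these lines are assumed to run inside the same session (sess) opened in the learning-phase sketch above.

correct_prediction = tf.equal(tf.argmax(y_, 1), tf.argmax(y, 1))     # 1 where the predicted class matches
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))   # fraction of correct predictions
result = sess.run(y_, feed_dict={x: testX})                          # class probabilities for the test set
print("Accuracy:", sess.run(accuracy, feed_dict={x: testX, y: testY}))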
As you can see, the neural network was able to correctly classify all 11 samples. The following code displays the points on the Cartesian plane with the same system of color gradations, ranging from dark blue to yellow.
[31]: yc = result[:,1]
plt.scatter(testX[:,0],testX[:,1],c=yc, s=50, alpha=1)
plt.show()
The results can be considered optimal, given the simplicity of the model used and the small amount of data in the training set. Now you will face the same problem with a more complex neural network, the Multi-Layer Perceptron (MLP).
learning_rate = 0.001
training_epochs = 2000
display_step = 50
n_samples = 11
batch_size = 11
total_batch = int(n_samples/batch_size)
n_input = 2     # number of input features (assumed, as in the SLP example)
n_classes = 2   # number of output classes (assumed)
# tf Graph input
X = tf.placeholder("float", [None, n_input])
Y = tf.placeholder("float", [None, n_classes])
Now you have to deal with the definition of the various weights W and biases b for the different connections. The neural network is now much more complex, having several layers to take into account. An efficient way to parameterize them is to define them as follows, commenting out the weight and bias parameters for the second hidden layer (since this example uses an MLP with only one hidden layer).
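A sketch of such a parameterization; the number of hidden neurons and the use of random normal initialization are assumptions.

n_hidden_1 = 2   # neurons in the first hidden layer (assumed)
# n_hidden_2 = 2 # second hidden layer, commented out for this single-hidden-layer MLP

weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    # 'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_1, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    # 'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}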
To create a neural network model that takes into account all the parameters you've specified dynamically, you need to define a convenient function, which you'll call multilayer_perceptron().
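A sketch of such a function, assuming a single hidden layer with a sigmoid activation:

def multilayer_perceptron(x):
    # hidden layer: linear combination of the inputs followed by a sigmoid activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.sigmoid(layer_1)
    # output layer: linear combination of the hidden-layer activations
    out_layer = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])
    return out_layer

The softmax, cost function, optimizer, and training loop then follow the same pattern as in the SLP example, using multilayer_perceptron(X) as the evidence.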
[0.20411345 0.7958866 ]
[0.25564632 0.74435365]
[0.02743917 0.9725609 ]
[0.16072556 0.8392744 ]
[0.00802145 0.9919785 ]
[0.27143803 0.728562 ]]
Accuracy: 1.0
(11, 2)
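The training log below comes from a Keras version of the classifier; its defining cell is not reproduced here. The following is only a rough sketch of what such a model might look like, with the architecture, optimizer, loss, and metric chosen as assumptions rather than taken from the original notebook.

model = tf.keras.Sequential([
    tf.keras.layers.Dense(2, activation='sigmoid', input_shape=(2,)),   # hidden layer (assumed size)
    tf.keras.layers.Dense(2, activation='softmax')                      # probabilities for the 2 classes
])
model.compile(optimizer='sgd', loss='mse', metrics=['acc'])             # assumed optimizer, loss, and metric
model.fit(np.array(inputX), np.array(inputY), epochs=10)                # 11 training samples, 10 epochs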
Train on 11 samples
Epoch 1/10
11/11 [==============================] - 0s 273us/sample - loss: 0.1198 - acc:
0.8182
Epoch 2/10
11/11 [==============================] - 0s 91us/sample - loss: 0.1197 - acc:
0.8182
.
.
Epoch 8/10
11/11 [==============================] - 0s 182us/sample - loss: 0.1194 - acc:
0.8182
Epoch 9/10
11/11 [==============================] - 0s 182us/sample - loss: 0.1193 - acc:
0.8182
Epoch 10/10
11/11 [==============================] - 0s 182us/sample - loss: 0.1193 - acc:
0.8182