Module - 5
Visualization in Python
Matplotlib Pyplot
Most of the Matplotlib utilities lie under the pyplot submodule, and
are usually imported under the plt alias:
import matplotlib.pyplot as plt
Example
Draw a line in a diagram from position (0, 0) to position (6, 250):
import matplotlib.pyplot as plt
import numpy as np
xpoints = np.array([0, 6])
ypoints = np.array([0, 250])
plt.plot(xpoints, ypoints)
plt.show()
Plotting x and y points
The plot() function is used to draw points (markers) in a diagram.
By default, the plot() function draws a line from point to point.
The function takes parameters for specifying points in the diagram.
Parameter 1 is an array containing the points on the x-axis.
Parameter 2 is an array containing the points on the y-axis.
If we need to plot a line from (1, 3) to (8, 10), we have to pass two arrays, [1, 8]
and [3, 10], to the plot() function.
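The two-array call described above can be sketched as follows. This is a minimal illustration; the variable name lines is an assumption added here to hold the value that plot() returns.

```python
import matplotlib.pyplot as plt
import numpy as np

# x-coordinates of the two end points: 1 and 8
xpoints = np.array([1, 8])
# y-coordinates of the two end points: 3 and 10
ypoints = np.array([3, 10])

# plot() returns a list with one Line2D object per drawn line
lines = plt.plot(xpoints, ypoints)
plt.show()
```

Note that the first array collects all x-values and the second all y-values, so the point (1, 3) is split across the first entry of each array.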
Example
Draw a line in a diagram from position (1, 3) to (2, 8) then to (6, 1) and finally to position
(8, 10):
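xpoints = np.array([1, 2, 6, 8])  # x-coordinates of the four positions
ypoints = np.array([3, 8, 1, 10])  # y-coordinates of the four positions
plt.plot(xpoints, ypoints)
plt.show()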
Default X-Points
If we do not specify the points on the x-axis, they will get the default
values 0, 1, 2, 3, etc., depending on the length of the y-points.
So, if we pass six y-points and leave out the x-points, the x-points
in the example are [0, 1, 2, 3, 4, 5].
Example
Plotting without x-points:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10, 5, 7])
plt.plot(ypoints)
plt.show()
Markers
You can use the keyword argument marker to emphasize each point
with a specified marker:
Example
Mark each point with a circle:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, marker = 'o')
plt.show()
Format Strings fmt
You can also use the shortcut string notation parameter to specify
the marker.
This parameter is also called fmt, and is written with this syntax:
marker|line|color
Example
Mark each point with a circle, and connect the points with a dotted red line:
import matplotlib.pyplot as plt
import numpy as np
ypoints = np.array([3, 8, 1, 10])
plt.plot(ypoints, 'o:r')
plt.show()
Line Reference
Line Syntax    Description
'-'            Solid line
':'            Dotted line
'--'           Dashed line
'-.'           Dashed/dotted line
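As a rough sketch of how the line styles in the reference above can be used, the following draws the same points once per style. The vertical offset and the names styles and lines are assumptions added only to keep the four lines apart and to collect the results.

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# One line per style from the reference; the offset only keeps the lines apart
styles = ['-', ':', '--', '-.']
lines = []
for offset, style in enumerate(styles):
    # The style string is passed as the fmt argument, right after the data
    lines += plt.plot(ypoints + 2 * offset, style)
plt.show()
```

Because no marker or color is given in the format string, Matplotlib draws plain lines and picks the colors itself.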
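You can use the keyword argument markersize, or the shorter version ms, to set the size of the markers.
You can use the keyword argument markeredgecolor, or the shorter version mec, to set the color of the edge of the markers.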
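A minimal sketch of the ms and mec keyword arguments follows. The size 20 and the color 'r' are illustrative values chosen here, not values taken from the notes.

```python
import matplotlib.pyplot as plt
import numpy as np

ypoints = np.array([3, 8, 1, 10])

# ms sets the marker size, mec sets the marker edge color (illustrative values)
lines = plt.plot(ypoints, marker='o', ms=20, mec='r')
plt.show()
```

The long forms markersize=20 and markeredgecolor='r' should give the same result.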
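The subplot() Function
The subplot() function takes three arguments that describe the layout of the figure.
The layout is organized in rows and columns, which are represented by the first and
second arguments.
The third argument represents the index of the current plot.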
plt.subplot(1, 2, 1)
#the figure has 1 row, 2 columns, and this plot is the first plot.
plt.subplot(1, 2, 2)
#the figure has 1 row, 2 columns, and this plot is the second plot.
So, if we want a figure with 2 rows and 1 column (meaning that the two plots will be
displayed on top of each other instead of side by side), we can write the syntax like
this:
import matplotlib.pyplot as plt
import numpy as np
#plot 1:
x = np.array([0, 1, 2, 3])
y = np.array([3, 8, 1, 10])
plt.subplot(2, 1, 1)
plt.plot(x,y)
#plot 2:
x = np.array([0, 1, 2, 3])
y = np.array([10, 20, 30, 40])
plt.subplot(2, 1, 2)
plt.plot(x,y)
plt.show()
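For comparison, the side-by-side layout with 1 row and 2 columns can be sketched with the same data. The names left and right are assumptions added here to hold the axes that subplot() returns.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array([0, 1, 2, 3])

# plot 1: left half of a figure with 1 row and 2 columns
left = plt.subplot(1, 2, 1)
plt.plot(x, np.array([3, 8, 1, 10]))

# plot 2: right half of the same figure
right = plt.subplot(1, 2, 2)
plt.plot(x, np.array([10, 20, 30, 40]))

plt.show()
```

Only the first two arguments differ from the stacked version: (1, 2, ...) instead of (2, 1, ...).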
Creating Scatter Plots
With Pyplot, you can use the scatter() function to draw a scatter plot.
The scatter() function plots one dot for each observation. It needs two
arrays of the same length, one for the values of the x-axis, and one for
values on the y-axis:
Example
A simple scatter plot:
import matplotlib.pyplot as plt
import numpy as np
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
plt.show()
The observations in the example above are the result of 13 cars passing by.
The x-axis shows the age of each car, and the y-axis shows its speed.
It seems that the newer the car, the faster it drives, but that could be a
coincidence; after all, we only registered 13 cars.
Compare Plots
In the example above, there seems to be a relationship between speed
and age, but what if we plot the observations from another day as
well? Will the scatter plot tell us something else?
Draw two plots on the same figure:
import matplotlib.pyplot as plt
import numpy as np
#day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
plt.scatter(x, y)
#day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
plt.scatter(x, y)
plt.show()
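To make the two days easier to tell apart, one option is to label each scatter() call and add a legend. The sketch below is only an illustration: the label texts and axis titles are assumptions, with x read as age and y as speed, following the order given in the comments.

```python
import matplotlib.pyplot as plt
import numpy as np

# day one, the age and speed of 13 cars:
x = np.array([5,7,8,7,2,17,2,9,4,11,12,9,6])
y = np.array([99,86,87,88,111,86,103,87,94,78,77,85,86])
day_one = plt.scatter(x, y, label='Day one')

# day two, the age and speed of 15 cars:
x = np.array([2,2,8,1,15,8,12,9,7,3,11,4,7,14,12])
y = np.array([100,105,84,105,90,99,90,95,94,100,79,112,91,80,85])
day_two = plt.scatter(x, y, label='Day two')

# Axis titles are assumptions: x is read as age, y as speed
plt.xlabel('Age of car')
plt.ylabel('Speed')
plt.legend()  # shows which color belongs to which day
plt.show()
```

Matplotlib assigns a different default color to each scatter() call, so the legend is enough to identify the days.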
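Creating Bars
With Pyplot, you can use the bar() function to draw bar graphs:
Example
Draw 4 bars:
x = np.array(["A", "B", "C", "D"])  # illustrative category labels
y = np.array([3, 8, 1, 10])  # illustrative bar heights
plt.bar(x, y)
plt.show()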
Horizontal Bars
If you want the bars to be displayed horizontally instead of vertically, use the
barh() function:
Example
Draw 4 horizontal bars:
plt.barh(x, y)  # x and y as in the bar() example above
plt.show()
The bar() and barh() functions take the keyword argument color to set the color
of the bars.
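A small sketch of the color argument follows. The category labels, bar heights, and the color "red" are illustrative assumptions, not data from the notes.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.array(["A", "B", "C", "D"])  # illustrative category labels
y = np.array([3, 8, 1, 10])         # illustrative bar heights

# color applies to every bar; bar() returns a container with one patch per bar
bars = plt.bar(x, y, color="red")
plt.show()
```

Replacing plt.bar with plt.barh in this sketch should draw the same four red bars horizontally.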
Histogram
A histogram is a graph showing frequency distributions.
It is a graph showing the number of observations within each given
interval.
Example: If you ask 250 people for their height, you might end up
with a histogram from which you can read approximately how many
people fall within each height interval.
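A histogram of this kind can be sketched with the hist() function. The data below is simulated, not measured: it assumes 250 heights drawn from a normal distribution centered on 170 with a standard deviation of 10, and the seed is only there to make the sketch repeatable.

```python
import matplotlib.pyplot as plt
import numpy as np

# Assumption: 250 simulated heights, normal distribution, mean 170, std 10
rng = np.random.default_rng(seed=0)
heights = rng.normal(170, 10, 250)

# hist() returns the count per interval, the interval edges, and the drawn bars
counts, bin_edges, patches = plt.hist(heights)
plt.show()
```

By default hist() splits the data into 10 intervals of equal width; counts holds the number of observations that fall into each one.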
Creating Pie Charts
With Pyplot, you can use the pie() function to draw pie charts:
Example
A simple pie chart:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
plt.pie(y)
plt.show()
Labels
Add labels to the pie chart with the labels parameter.
The labels parameter must be an array with one label for each wedge:
A pie chart with labels:
import matplotlib.pyplot as plt
import numpy as np
y = np.array([35, 25, 25, 15])
mylabels = ["Apples", "Bananas", "Cherries", "Dates"]
plt.pie(y, labels = mylabels)
plt.show()