Visualization Using Matplotlib
Matplotlib
It is one of the most popular libraries for data visualization in Python.
It provides a flexible and comprehensive toolkit for creating a wide variety of static,
animated, and interactive visualizations in Python.
Matplotlib is a versatile library that allows you to create a wide range of visualizations, from
simple plots to complex multi-plot grids.
With its extensive functionality and customization options, it is an invaluable tool for anyone
working with data in Python.
Whether you are analyzing data, presenting findings, or simply exploring datasets, mastering
Matplotlib will significantly enhance your data visualization capabilities.
Installing and Setting Up Visualization Libraries
To get started with Matplotlib, you need to install it. You can use pip, Python's package manager, to
install Matplotlib along with NumPy and Pandas for data handling.
pip install matplotlib numpy pandas
After installation, you can import Matplotlib in your Python scripts or Jupyter notebooks:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Canvas and Axes
In Matplotlib, the figure is the overall canvas on which plots are drawn. The axes are the regions
of the figure where the actual plotting occurs. Each figure can contain multiple axes, and you can
customize each one independently.
Creating a Simple Plot:
# Create a new figure
plt.figure()
# Plot a simple line (this also creates the axes on the current figure)
plt.plot([1, 2, 3], [1, 4, 9])
# Add labels and a title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Plot')
# Show the plot
plt.show()
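The same plot can also be written against explicit figure and axes objects, which makes the figure/axes distinction described above visible in the code. The following is a minimal sketch of that object-oriented style; the title text is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt

# Create the figure and a single axes explicitly
fig, ax = plt.subplots()

# Plot on the axes object rather than through the plt state machine
ax.plot([1, 2, 3], [1, 4, 9])

# Labels and title are set with the axes' set_* methods
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Simple Plot (object-oriented)')
```

When running this as a script, finish with plt.show() to display the figure; in a Jupyter notebook the figure is usually rendered automatically.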
Subplots
Subplots allow you to create multiple plots in a single figure. You can specify the number of rows and
columns for the subplots.
Example:
fig, axs = plt.subplots(2, 2) # 2 rows, 2 columns
axs[0, 0].plot([1, 2, 3], [1, 4, 9])
axs[0, 0].set_title('Plot 1')
axs[0, 1].bar([1, 2, 3], [3, 7, 5])
axs[0, 1].set_title('Plot 2')
axs[1, 0].scatter([1, 2, 3], [5, 1, 6])
axs[1, 0].set_title('Plot 3')
axs[1, 1].hist([1, 2, 1, 2, 3], bins=3)
axs[1, 1].set_title('Plot 4')
plt.tight_layout() # Adjusts subplots to fit into the figure area
plt.show()
Common Plots
Scatter Plots:
Useful for visualizing the relationship between two numerical variables.
x = np.random.rand(50)
y = np.random.rand(50)
plt.scatter(x, y)
plt.title('Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()
Histograms:
Useful for displaying the distribution of a numerical variable.
data = np.random.randn(1000)
plt.hist(data, bins=30, alpha=0.7)
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
Box Plots:
Useful for visualizing the spread and identifying outliers in data.
data = [np.random.normal(0, std, 100) for std in range(1, 4)]
plt.boxplot(data, vert=True, patch_artist=True)
plt.title('Box Plot')
plt.xlabel('Group')
plt.ylabel('Value')
plt.show()
Logarithmic Scale
You can set logarithmic scales for axes to better visualize data that spans several orders of
magnitude.
x = np.linspace(0.1, 10, 100)
y = np.exp(x)
plt.plot(x, y)
plt.yscale('log')
plt.title('Logarithmic Scale')
plt.xlabel('X-axis')
plt.ylabel('Y-axis (log scale)')
plt.show()
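When both variables span several orders of magnitude, both axes can be made logarithmic. The sketch below assumes a made-up power-law relationship (y = x squared), which appears as a straight line on log-log axes; the data and labels are illustrative only.

```python
import numpy as np
import matplotlib.pyplot as plt

# 50 points spaced evenly on a log scale from 1 to 1000
x = np.logspace(0, 3, 50)
y = x ** 2  # illustrative power law

fig, ax = plt.subplots()
ax.plot(x, y)

# Set a logarithmic scale on each axis independently
ax.set_xscale('log')
ax.set_yscale('log')

ax.set_title('Log-Log Scale')
ax.set_xlabel('X-axis (log scale)')
ax.set_ylabel('Y-axis (log scale)')
```

Add plt.show() at the end when running this outside a notebook.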
Placement of Ticks and Custom Tick Labels
You can customize tick placement and labels to improve the readability of your plots.
x = np.arange(0, 10, 1)
y = x ** 2
plt.plot(x, y)
plt.xticks(ticks=[0, 2, 4, 6, 8, 10], labels=['Zero', 'Two', 'Four', 'Six', 'Eight', 'Ten'])
plt.title('Custom Tick Labels')
plt.show()
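Tick positions and labels can also be set on an axes object, and the labels can be rotated when they would otherwise overlap. This is a small sketch of that approach; the tick positions, label text, and 45-degree rotation are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 10, 1)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y)

# Fix the tick positions first, then assign one label per position
ax.set_xticks([0, 3, 6, 9])
ax.set_xticklabels(['Zero', 'Three', 'Six', 'Nine'], rotation=45)

ax.set_title('Rotated Tick Labels')
```

Setting the positions before the labels keeps the two lists aligned; call plt.show() to display the result from a script.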
Pandas Visualization
Pandas also provides built-in visualization capabilities that leverage Matplotlib.
Example:
df = pd.DataFrame({
'A': np.random.randn(100),
'B': np.random.randn(100)
})
df.plot.scatter(x='A', y='B', title='Pandas Scatter Plot')
plt.show()
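Because Pandas plotting is built on Matplotlib, the plotting methods return a Matplotlib axes object that can be customized further. The sketch below illustrates this with the same kind of random data; the reference line and grid are illustrative additions, not part of the original example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'A': np.random.randn(100),
    'B': np.random.randn(100)
})

# Pandas plotting methods return a Matplotlib Axes object
ax = df.plot.scatter(x='A', y='B')

# The returned axes accepts the usual Matplotlib calls
ax.set_title('Pandas Scatter Plot (customized)')
ax.axhline(0, color='gray', linestyle='--')  # horizontal reference line at y = 0
ax.grid(True)
```

As with the other examples, plt.show() displays the figure when running as a script.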
Style Sheets
Matplotlib allows you to change the visual style of your plots using style sheets.
plt.style.use('ggplot') # Use ggplot style
plt.plot(np.random.randn(100).cumsum())
plt.title('Styled Plot')
plt.show()
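Note that plt.style.use changes the style for every plot created afterwards. As a sketch of two related tools, the example below lists the style names available in the installed Matplotlib version and applies a style to a single plot only, using a context manager.

```python
import numpy as np
import matplotlib.pyplot as plt

# List the style sheets shipped with the installed Matplotlib version
print(plt.style.available)

# Apply a style only inside the with-block; the global style is unchanged
with plt.style.context('ggplot'):
    fig, ax = plt.subplots()
    ax.plot(np.random.randn(100).cumsum())
    ax.set_title('Temporarily Styled Plot')
```

The exact list of available styles depends on the Matplotlib version, so it is worth printing it before picking a name.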
Plot Types
Here are some common plot types available in Matplotlib:
Area Plot:
Shows quantitative data visually, with the area below the line filled.
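One way to draw an area plot in Matplotlib is to plot a line and fill the region beneath it with fill_between. This is a minimal sketch with made-up data (a shifted sine curve, so that all values are positive).

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
y = np.sin(x) + 1.5  # shifted up so every value is positive

fig, ax = plt.subplots()
ax.plot(x, y)

# Fill the area between the curve and y = 0
ax.fill_between(x, y, alpha=0.3)

ax.set_title('Area Plot')
```

Call plt.show() to display the figure from a script.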
Bar Plots:
Useful for categorical data.
categories = ['A', 'B', 'C']
values = [10, 15, 7]
plt.bar(categories, values)
plt.title('Bar Plot')
plt.show()
Line Plots:
Useful for displaying trends over time.
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.title('Line Plot')
plt.show()
Hexagonal Bin Plot:
Useful for bivariate data with a lot of points.
x = np.random.randn(1000)
y = np.random.randn(1000)
plt.hexbin(x, y, gridsize=30, cmap='Blues')
plt.colorbar()
plt.title('Hexagonal Bin Plot')
plt.show()
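Kernel Density Estimation Plot (KDE):
Estimates the probability density function of a random variable. This example uses Seaborn, a
separate library built on top of Matplotlib (install it with pip install seaborn).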
import seaborn as sns
data = np.random.normal(size=1000)
sns.kdeplot(data)
plt.title('KDE Plot')
plt.show()
Distribution Plots
Visualizing the distribution of data is essential for understanding its characteristics. You can use
histograms and KDE together.
sns.histplot(data, kde=True)
plt.title('Distribution Plot with KDE')
plt.show()
Categorical Data Plots
Matplotlib and Seaborn provide ways to visualize categorical data using bar plots, count plots, and
box plots.
# Using Seaborn for a count plot
import seaborn as sns
tips = sns.load_dataset('tips')
sns.countplot(x='day', data=tips)
plt.title('Count Plot for Days')
plt.show()
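A count plot can also be produced with Pandas and Matplotlib alone by counting the categories first and passing the counts to a bar plot. The sketch below uses a small hypothetical list of day labels, made up for illustration, rather than the tips dataset.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data, made up for illustration
days = pd.Series(['Thur', 'Fri', 'Sat', 'Sat', 'Sun', 'Sun', 'Sun', 'Thur'])

# Count occurrences of each category and put them in a fixed day order
counts = days.value_counts().reindex(['Thur', 'Fri', 'Sat', 'Sun'])

fig, ax = plt.subplots()
ax.bar(counts.index, counts.values)
ax.set_title('Count Plot for Days (Matplotlib only)')
ax.set_xlabel('day')
ax.set_ylabel('count')
```

The reindex step is there because value_counts sorts by frequency, which would otherwise scramble the weekday order.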
Combining Categorical Plots
You can combine different categorical plots to gain more insights.
sns.catplot(x='day', y='total_bill', kind='box', data=tips)
plt.title('Box Plot of Total Bill by Day')
plt.show()
Matrix Plots
Matrix plots, like heatmaps, are useful for visualizing the relationships between variables.
data = np.random.rand(10, 12)
sns.heatmap(data)
plt.title('Heatmap')
plt.show()
Regression Plots
Visualizing the relationship between variables can be enhanced using regression plots.
sns.regplot(x='total_bill', y='tip', data=tips)
plt.title('Regression Plot')
plt.show()
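To see roughly what a regression plot draws, a straight line can be fitted with NumPy and overlaid on a scatter plot. This is only a sketch on synthetic data (the slope of 0.15, the intercept of 1, and the noise level are assumptions made for illustration), and unlike sns.regplot it does not draw a confidence band.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data: a linear trend plus random noise (values are illustrative)
rng = np.random.default_rng(0)
x = rng.uniform(0, 50, 100)
y = 0.15 * x + 1 + rng.normal(0, 1, 100)

# Least-squares fit of a degree-1 polynomial: y is approximately slope * x + intercept
slope, intercept = np.polyfit(x, y, 1)

fig, ax = plt.subplots()
ax.scatter(x, y, alpha=0.6)

# Draw the fitted line across the observed x range
x_line = np.linspace(x.min(), x.max(), 100)
ax.plot(x_line, slope * x_line + intercept, color='red')
ax.set_title('Scatter Plot with Fitted Line')
```

Finish with plt.show() when running as a script.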
Grids
Matplotlib provides the ability to create grid layouts for more complex visualizations.
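# Sample data for the four panels
x = np.linspace(0, 10, 100)
y = np.sin(x)
data = np.random.randn(1000)
fig, axs = plt.subplots(2, 2, figsize=(10, 10))
axs[0, 0].plot(x, y)
axs[0, 1].scatter(x, y)
axs[1, 0].hist(data, bins=30)
axs[1, 1].boxplot(data)
plt.tight_layout()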
plt.show()
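For layouts where the panels are not all the same size, a grid specification lets one axes span several cells. The following sketch assumes a 2x2 grid in which the top plot spans both columns; the data and variable names are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)
data = np.random.randn(1000)

fig = plt.figure(figsize=(10, 8))
gs = fig.add_gridspec(2, 2)  # 2 rows, 2 columns

# Slicing the grid spec selects which cells an axes occupies
ax_top = fig.add_subplot(gs[0, :])    # top row, spanning both columns
ax_left = fig.add_subplot(gs[1, 0])   # bottom-left cell
ax_right = fig.add_subplot(gs[1, 1])  # bottom-right cell

ax_top.plot(x, np.sin(x))
ax_left.hist(data, bins=30)
ax_right.boxplot(data)

fig.tight_layout()
```

Call plt.show() to display the figure from a script.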