Concepts of Data Visualisation
Aliaa Zaki
Visualization with Python
“SHOW, DON’T TELL”
AGENDA
Numerical Data and Categorical Data
A. Continuous data
• Can be split into smaller and smaller units, and a still smaller unit always exists.
• Can take on any value within an interval, so there are many possible values.
• Length
• Weight
• Temperature
• Age
B. Discrete data
• Countable values, usually a finite number of them.
Categorical Data
Classifies items into different groups
Nominal
• Groups are merely names, no ranking.
Ordinal
• Groups have an order or ranking.
Histogram
Modality
Skewness
Kurtosis
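Skewness and kurtosis can also be computed directly, not only read off the histogram. The sketch below is a minimal illustration that assumes the simple moment-based definitions (population moments, excess kurtosis relative to a normal distribution); the function name `sample_moments` and the sample values are made up for this example.

```python
def sample_moments(values):
    """Return (skewness, excess_kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    # Second, third and fourth central moments.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis


# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 1, 10]

sym_skew, sym_kurt = sample_moments(symmetric)
right_skew, right_kurt = sample_moments(right_skewed)
print("symmetric:", sym_skew, sym_kurt)
print("right-skewed:", right_skew, right_kurt)
```

A positive skewness means a longer right tail, a negative one a longer left tail. To see the shape itself, the same values can be passed to Matplotlib's `plt.hist(values, bins=...)`.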
Box plot
• When you want to compare the
distribution of a continuous
variable across categories.
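As a rough sketch of this use case, the example below draws one box per category with Matplotlib. The group names and scores are invented for illustration, and the `Agg` backend is chosen only so the script runs without a display.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

random.seed(1)
# Hypothetical data: a continuous score for each of three made-up groups.
scores = {
    "Group A": [random.gauss(70, 8) for _ in range(50)],
    "Group B": [random.gauss(75, 5) for _ in range(50)],
    "Group C": [random.gauss(65, 12) for _ in range(50)],
}

fig, ax = plt.subplots()
# One box per category; boxes are placed at x = 1, 2, 3 by default.
box = ax.boxplot(list(scores.values()))
ax.set_xticks(range(1, len(scores) + 1))
ax.set_xticklabels(list(scores.keys()))
ax.set_xlabel("Group")
ax.set_ylabel("Score")
ax.set_title("Score distribution per group")
```

In a notebook, `plt.show()` displays the figure; `fig.savefig("boxplot.png")` writes it to a file.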
Scatter plots
• 1. You have two continuous variables.
• 2. You want to see the relationship between the two
variables.
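A minimal Matplotlib sketch of a scatter plot for two continuous variables follows. The height and weight values are generated at random for illustration only, with an assumed roughly linear relationship.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

random.seed(2)
# Hypothetical data: height (cm) and weight (kg), roughly linearly related.
heights = [random.uniform(150, 195) for _ in range(100)]
weights = [0.9 * h - 90 + random.gauss(0, 6) for h in heights]

fig, ax = plt.subplots()
# Each observation becomes one point: x = height, y = weight.
points = ax.scatter(heights, weights, alpha=0.7)
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.set_title("Height vs. weight")
```

An upward-sloping cloud of points suggests a positive relationship; a shapeless cloud suggests none.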
Line plots
• 1. You have two continuous variables.
• 2. You want to answer questions
about their relationship.
• Usually, but not always, the x-axis is
dates or times.
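The sketch below shows the common case where the x-axis holds dates. The daily temperature readings are invented for this example; Matplotlib accepts `datetime.date` values on the x-axis directly.

```python
from datetime import date, timedelta

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical data: one temperature reading per day for two weeks.
start = date(2023, 1, 1)
days = [start + timedelta(days=i) for i in range(14)]
temperatures = [12.0, 12.5, 11.8, 13.1, 14.0, 13.6, 12.9,
                13.4, 14.2, 15.0, 14.7, 15.3, 16.1, 15.8]

fig, ax = plt.subplots()
# Consecutive observations are joined, which makes the trend over time visible.
(line,) = ax.plot(days, temperatures, marker="o")
ax.set_xlabel("Date")
ax.set_ylabel("Temperature (°C)")
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
```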
Bar plots
• 1. You have a categorical variable.
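As an illustration, the sketch below counts how often each category occurs and draws one bar per category. The list of fruit names is made up for this example.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical categorical variable: one fruit name per observation.
fruits = ["apple", "banana", "apple", "cherry", "banana", "apple"]

counts = Counter(fruits)
categories = sorted(counts)                 # fixed, alphabetical bar order
bar_heights = [counts[c] for c in categories]

fig, ax = plt.subplots()
bars = ax.bar(categories, bar_heights)
ax.set_xlabel("Fruit")
ax.set_ylabel("Count")
ax.set_ylim(bottom=0)  # bars should start from a zero baseline
```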
Design of visualization
Omitting the baseline
• Baseline at 0
• Truncated graph
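To make the effect of a truncated axis concrete, the sketch below plots the same values twice: once with the baseline at 0 and once with the y-axis cut off just below the smallest value. The product names, values and axis limits are assumptions chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

# Hypothetical data: three values that differ by only a few percent.
names = ["Product A", "Product B", "Product C"]
values = [98, 100, 102]
truncated_bottom = 97  # assumed cut-off for the misleading version

fig, (ax_full, ax_trunc) = plt.subplots(1, 2, figsize=(8, 3))

ax_full.bar(names, values)
ax_full.set_ylim(0, 110)
ax_full.set_title("Baseline at 0")

ax_trunc.bar(names, values)
ax_trunc.set_ylim(truncated_bottom, 103)
ax_trunc.set_title("Truncated y-axis")

# How much taller C looks than A in each panel.
actual_ratio = values[2] / values[0]
apparent_ratio = (values[2] - truncated_bottom) / (values[0] - truncated_bottom)
print(round(actual_ratio, 2), apparent_ratio)
```

In the truncated panel the bar for Product C appears five times as tall as the bar for Product A, although the underlying values differ by about 4%.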
Manipulating the y-axis
Cherry-picking data
Using the wrong graph
Color
[Figure: chart comparing the red-apple and green-apple series]
Chart junk
The chart junk of Steve Jobs
Data-ink ratio
Example of ink-ratio
Lie factor
1978: 18 miles/gallon, drawn as a line 0.6 inches long
1985: 27.5 miles/gallon, drawn as a line 5.3 inches long
$$\text{Lie factor} = \frac{(5.3 - 0.6)/0.6}{(27.5 - 18)/18} = 14.8$$
https://infovis-wiki.net/wiki/Lie_Factor
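The lie factor compares the relative change shown in the graphic with the relative change in the data. The short sketch below reproduces the fuel-economy arithmetic from the slide; the function name `lie_factor` and its parameter names are chosen for this example only.

```python
def lie_factor(graphic_start, graphic_end, data_start, data_end):
    """Ratio of the effect shown in the graphic to the effect in the data."""
    effect_in_graphic = (graphic_end - graphic_start) / graphic_start
    effect_in_data = (data_end - data_start) / data_start
    return effect_in_graphic / effect_in_data


# Line lengths in inches (0.6 -> 5.3) and fuel economy in miles/gallon (18 -> 27.5).
result = lie_factor(0.6, 5.3, 18.0, 27.5)
print(round(result, 1))  # 14.8
```

A lie factor close to 1 means the graphic represents the data faithfully; here the graphic exaggerates the change by a factor of almost 15.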
Types of data analysis
Univariate, Bivariate, Multivariate
Python libraries for data visualization
Matplotlib
Seaborn
Plotly
ggplot
Matplotlib: the most popular 2D plotting library
With just a few lines of code, Matplotlib can generate line plots, bar charts, histograms, power spectra, stem plots, scatter plots, error
charts, pie charts and many other chart types.
https://matplotlib.org/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py
Seaborn is a library based on Matplotlib. It provides a high-level interface for drawing statistical graphics.
Plotly is a web-based toolkit for building data visualizations.
Plotly can also be used from a Python notebook and has a great API, with
features such as contour plots, dendrograms, and 3D charts.
It’s coding time