
Concepts of Data Visualisation

The document discusses different types of data and methods for visualizing data using Python libraries. It covers numerical and categorical data, as well as common graph types like histograms, box plots, scatter plots and line plots. It also discusses best practices for designing effective data visualizations.

Introduction to data visualization with Python

Aliaa Zaki
“SHOW, DON’T TELL”
AGENDA

01 Benefits of Data Visualization
02 Data Type
03 Design of Data Visualization
04 Python library for data visualization
How to get insights from data
• Calculating summary statistics
• Running models
• Drawing plots
DATA TYPE
Numerical Data
Categorical Data
Data Type
• Numerical Data: Continuous data, Discrete data
• Categorical Data: Ordinal, Nominal
Numerical Data

A. Continuous data
• can be split into smaller and smaller
units, and still a smaller unit exists.
• could take on any value within an interval,
many possible values.
• Length
• Weight
• Temperature
• Age
B. Discrete data
• countable value, finite number of values.
Categorical Data
Classifies items into different groups

• Ordinal: groups have an order or ranking.
• Nominal: groups are merely names, no ranking.
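The difference between nominal and ordinal groups can be sketched with pandas, which is an assumption here since the slides do not name it. The example values (`fruit`, `size`) are hypothetical.

```python
import pandas as pd

# Hypothetical example values, not taken from the slides.
# Nominal: the groups are only names, so no order is declared.
fruit = pd.Categorical(["apple", "banana", "apple", "cherry"])

# Ordinal: the groups have a ranking, declared with `categories` and `ordered=True`.
size = pd.Categorical(
    ["medium", "small", "large", "small"],
    categories=["small", "medium", "large"],
    ordered=True,
)

print(fruit.ordered)            # whether a ranking is defined for the nominal data
print(size.ordered)             # whether a ranking is defined for the ordinal data
print(size.min(), size.max())   # lowest and highest group by rank
print(size > "small")           # element-wise comparison against a rank
```

Comparisons such as `size > "small"` and calls such as `size.min()` are only meaningful for the ordered variable; pandas raises a `TypeError` if they are attempted on an unordered one.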
Histogram
Modality
Skewness

Kurtosis
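A minimal sketch of how skewness and kurtosis might be computed next to a histogram, assuming NumPy and Matplotlib are installed. The sample is synthetic, and the helper names `skewness` and `excess_kurtosis` are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=5000)  # synthetic right-skewed data

def skewness(x):
    """Third standardized moment: a positive value means a longer right tail."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(sample, bins=30)  # the number of peaks shows the modality
ax.set_title(
    f"skewness={skewness(sample):.2f}, "
    f"excess kurtosis={excess_kurtosis(sample):.2f}"
)
# plt.show()  # uncomment in an interactive session
```

For this synthetic sample the histogram has a single peak (unimodal) and a long right tail, so the skewness comes out positive.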
Box plot
• When you want to compare the
distributions of the continuous
variable for each category.
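As a sketch of this use case, the following draws one box per category with Matplotlib. The three groups and their weight values are synthetic and only serve as an illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
# Hypothetical data: one continuous variable (weight) observed in three categories.
groups = {
    "A": rng.normal(60, 5, 200),
    "B": rng.normal(70, 8, 200),
    "C": rng.normal(65, 3, 200),
}

fig, ax = plt.subplots()
result = ax.boxplot(list(groups.values()))  # one box per category
ax.set_xticklabels(list(groups.keys()))     # label each box with its category
ax.set_xlabel("category")
ax.set_ylabel("weight")
# plt.show()  # uncomment in an interactive session
```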
Scatter plots
• 1. You have two continuous variables.
• 2. You want to see the relationship between the two
variables.
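A short sketch of a scatter plot for two continuous variables, assuming NumPy and Matplotlib. The data are synthetic, and the Pearson correlation in the title is an addition for illustration, not something the slides prescribe.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
# Synthetic data: y depends linearly on x, plus noise.
x = rng.uniform(0, 10, 200)
y = 2.0 * x + rng.normal(0, 2, 200)

r = float(np.corrcoef(x, y)[0, 1])  # Pearson correlation between the two variables

fig, ax = plt.subplots()
ax.scatter(x, y, s=10)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title(f"Pearson r = {r:.2f}")
# plt.show()  # uncomment in an interactive session
```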
Line plots
• 1. You have two continuous variables.
• 2. You want to answer questions
about their relationship.
• Usually, but not always, the x-axis is
dates or times.
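A minimal line-plot sketch with dates on the x-axis, assuming Matplotlib. The temperature readings are made up for illustration.

```python
import datetime as dt
import matplotlib.pyplot as plt

# Hypothetical daily measurements; the values are made up for illustration.
start = dt.date(2024, 1, 1)
dates = [start + dt.timedelta(days=i) for i in range(7)]
temperature = [12.1, 13.4, 11.8, 14.0, 15.2, 14.7, 16.1]

fig, ax = plt.subplots()
(line,) = ax.plot(dates, temperature, marker="o")  # consecutive points are joined
ax.set_xlabel("date")
ax.set_ylabel("temperature")
fig.autofmt_xdate()  # rotate the date labels so they do not overlap
# plt.show()  # uncomment in an interactive session
```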
Bar plots
• 1. You have a categorical variable.
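One common way to draw a bar plot for a categorical variable is to count the observations per category first. The sketch below assumes Matplotlib; the observations are hypothetical.

```python
from collections import Counter
import matplotlib.pyplot as plt

# Hypothetical observations of one categorical variable.
observations = ["red", "green", "red", "blue", "red", "green"]
counts = Counter(observations)

labels = list(counts.keys())
heights = [counts[k] for k in labels]

fig, ax = plt.subplots()
bars = ax.bar(labels, heights)  # one bar per category
ax.set_ylim(bottom=0)           # bar lengths encode the counts, so start at 0
ax.set_ylabel("count")
# plt.show()  # uncomment in an interactive session
```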
Design of
visualization
Omitting the baseline
• Baseline at 0
• Truncated graph
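To see why a truncated graph misleads, the sketch below draws the same two values with a baseline of 0 and with a truncated baseline. The values and the helper `apparent_ratio` are hypothetical; Matplotlib is assumed.

```python
import matplotlib.pyplot as plt

# Hypothetical values that differ by only 4%.
labels = ["A", "B"]
values = [50, 52]

def apparent_ratio(a, b, baseline):
    """Ratio of the visible bar heights when the y-axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

fig, (ax_full, ax_trunc) = plt.subplots(1, 2, figsize=(8, 3))

ax_full.bar(labels, values)
ax_full.set_ylim(0, 55)   # baseline at 0
ax_full.set_title(f"baseline 0: B looks {apparent_ratio(50, 52, 0):.2f}x A")

ax_trunc.bar(labels, values)
ax_trunc.set_ylim(48, 55)  # truncated baseline
ax_trunc.set_title(f"baseline 48: B looks {apparent_ratio(50, 52, 48):.2f}x A")
# plt.show()  # uncomment in an interactive session
```

With the baseline at 0 the bars differ by 4%, while with the baseline at 48 bar B appears twice as tall as bar A even though the data are identical.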
Manipulating the y-axis
Cherry-picking data
Using the wrong graph
Color
[Figure: bar chart with the categories red-apple and green-apple]
Chart junk
The chart junk of Steve Jobs
Data-ink ratio
Example of ink-ratio
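The data-ink ratio is commonly described as the share of a graphic's ink that is devoted to the data itself. As a rough sketch of raising it, the example below draws the same hypothetical values twice with Matplotlib, once with a heavy grid and full frame and once with that non-data ink removed. The styling choices are illustrative assumptions.

```python
import matplotlib.pyplot as plt

# Hypothetical values for illustration.
labels = ["A", "B", "C"]
values = [4, 7, 5]

fig, (ax_before, ax_after) = plt.subplots(1, 2, figsize=(8, 3))

# Before: default frame plus a heavy grid.
ax_before.bar(labels, values)
ax_before.grid(True, linewidth=1.5)
ax_before.set_title("more non-data ink")

# After: drop the top and right frame lines and the grid.
ax_after.bar(labels, values, color="gray")
for side in ("top", "right"):
    ax_after.spines[side].set_visible(False)
ax_after.grid(False)
ax_after.set_title("less non-data ink")
# plt.show()  # uncomment in an interactive session
```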
Lie factor

$$\text{Lie factor} = \frac{(5.3 - 0.6)/0.6}{(27.5 - 18)/18} = 14.8$$

1978: 18 miles/gallon (0.6 inches in the graph)
1985: 27.5 miles/gallon (5.3 inches in the graph)

• The standard required an increase in mileage from 18 to 27.5, an increase of 53%: $\frac{27.5 - 18}{18} \times 100 = 53\%$
• The magnitude of increase shown in the graph is 783%: $\frac{5.3 - 0.6}{0.6} \times 100 = 783\%$

https://infovis-wiki.net/wiki/Lie_Factor
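The lie factor arithmetic for the fuel-economy example can be checked with a few lines of Python. The function names `relative_change` and `lie_factor` are hypothetical helpers; the numbers (0.6 and 5.3 inches, 18 and 27.5 miles/gallon) are the ones from the slide.

```python
def relative_change(first, second):
    """Relative change from `first` to `second`, e.g. 0.5 for +50%."""
    return (second - first) / first

def lie_factor(shown_first, shown_second, data_first, data_second):
    """Size of the effect shown in the graphic divided by the size of the effect in the data."""
    return relative_change(shown_first, shown_second) / relative_change(data_first, data_second)

shown = relative_change(0.6, 5.3)    # line lengths in inches
actual = relative_change(18, 27.5)   # miles per gallon
factor = lie_factor(0.6, 5.3, 18, 27.5)

print(f"shown: {shown:.0%}, actual: {actual:.0%}, lie factor: {factor:.1f}")
# prints "shown: 783%, actual: 53%, lie factor: 14.8"
```

A lie factor close to 1 means the graphic represents the change in the data faithfully; here the graphic exaggerates the change by a factor of almost 15.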
Type of data analysis
Univariate Bivariate Multivariate
Python library for
data visualization
Matplotlib
Seaborn
Plotly
ggplot
Matplotlib
The most popular 2D plotting library.
With just a few lines of code, one can generate plots, bar charts, histograms, power spectra, stemplots, scatterplots, error
charts, pie charts and many other types.
https://matplotlib.org/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py
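As an illustration of the "few lines of code" claim, here is a minimal pyplot sketch. The numbers and labels are arbitrary examples.

```python
import matplotlib.pyplot as plt

# Arbitrary example numbers.
x = [1, 2, 3, 4]
y = [1, 4, 9, 16]

plt.plot(x, y, "ro-")   # red circles joined by a line
plt.xlabel("x")
plt.ylabel("x squared")
plt.title("A first pyplot figure")
# plt.show()            # uncomment in an interactive session
```

Swapping `plt.plot` for `plt.bar`, `plt.hist`, `plt.scatter` or `plt.pie` gives several of the other chart types listed above, although each of these functions expects its own arguments.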
Seaborn is a library based on Matplotlib
Plotly is a web-based toolkit for building data visualizations.
Plotly can also be accessed from a Python notebook and has a great API.
It also offers functionality such as contour plots, dendrograms, and 3D charts.
It’s coding time
