Data visualization
Presented By
Dr. Deepa A
Presentation Overview
Data Visualization Introduction – What is DV?
Why DV?
Benefits
Techniques
Who Uses DV
Steps in Data Visualization
Data Visualization Roles
Data Visualization Tools
Techniques in Programming
Examples
INTRODUCTION
What is DV?
Data visualization (DV) is the practice of translating information into a visual
context, such as a map or graph.
It is one of the steps of the data science process, which states
that after data has been collected, processed and modeled, it
must be visualized for conclusions to be made.
The term is often used interchangeably with others, including
information graphics, information visualization and statistical
graphics.
Why DV?
The main goal of data visualization is to make it easier to identify
patterns, trends and outliers in large data sets.
DV aims to identify, locate, manipulate, format and deliver data
in the most efficient way possible.
Benefits of DV
• The ability to absorb information quickly, improve insights and
make faster decisions.
• An increased understanding of the next steps that must be taken to
improve the organization.
• An improved ability to maintain the audience's interest with information
they can understand.
• An easy distribution of information that increases the opportunity to
share insights with everyone involved.
• Less reliance on data scientists, since data is more accessible
and understandable.
• An increased ability to act on findings quickly and, therefore, achieve
success with greater speed and fewer mistakes.
Who uses DV
Data visualization is important for almost every career.
DV in Education:
To monitor students' learning progress throughout the semester.
Advisers can take prompt action to help students who are failing.
It can be used by teachers to display student test results, by
computer scientists exploring advancements in artificial intelligence
(AI) or by executives looking to share information with stakeholders.
DV in Business
Provides businesses with a vastly improved decision-making process because it
enhances the available information and represents it in a pictorial format.
Visualization charts with actionable data can help to communicate
the message effectively.
Data visualization can support organization leaders in identifying patterns and
gaps and interpreting the information in a meaningful manner.
DV in Military
For the military, clear and actionable data is critical.
It helps to quickly share accurate information in the most concise structure.
A better understanding of past data can make future decisions more accurate.
DV Steps
Develop your research question.
Get or create your data.
Clean your data.
Choose a chart type.
Choose your tool.
Prepare the data.
Create the chart.
DV Techniques
Box plots
Histograms
Heat Maps
Charts
Tree Maps
Network Diagram
Boxplots
• A box-and-whisker plot displays the five-number summary of a set
of data.
• The five-number summary is the minimum, first quartile, median,
third quartile and maximum.
• In a box plot, we draw a box from the first quartile to the third
quartile.
• A line goes through the box at the median (vertical when the box is
drawn horizontally).
• The whiskers extend from the first quartile to the minimum and from
the third quartile to the maximum.
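The five-number summary described above can be computed before anything is drawn. The following is a minimal sketch using only Python's standard library. The function name five_number_summary and the sample values are assumptions made for illustration, and the quartiles follow the "inclusive" convention of statistics.quantiles, which may differ slightly from the convention used by a given plotting tool.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # With n=4, quantiles() returns the three cut points: Q1, median, Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical sample, chosen only for illustration.
scores = [7, 1, 9, 3, 5, 2, 8, 4, 6]
minimum, q1, median, q3, maximum = five_number_summary(scores)
print(f"min={minimum} Q1={q1} median={median} Q3={q3} max={maximum}")
```

The box would span Q1 to Q3, the line inside it would sit at the median, and the whiskers would reach the minimum and maximum.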
Histogram
• A histogram is a type of bar
chart.
• The bars of a histogram
represent ranges (bins) along a
continuous, quantifiable spectrum.
• It is a chart that shows the
frequency distribution of data
points across a continuous range
of data values.
• A chart that displays numeric data.
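To make the idea of bars as ranges concrete, the sketch below counts how many values fall into each equal-width bin, which is the frequency distribution a histogram draws. It is a simplified illustration using only the standard library. The function name histogram_counts, the choice of three bins, and the sample values are assumptions, not part of the original slides.

```python
def histogram_counts(values, bins):
    """Split the value range into equal-width bins and count values per bin."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land in an extra bin, so clamp it
        # into the last one.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts


# Hypothetical measurements, chosen only for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]
edges, counts = histogram_counts(data, bins=3)
for left, right, count in zip(edges, edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f} | {'#' * count}")
```

Each printed row plays the role of one bar: its label is a range of values and its length is the number of data points in that range.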
Line Chart
A line chart is a graphical representation
of information that changes over time.
It is a visual comparison of how two variables,
shown on the X and Y axes, are related or vary with
each other.
A line graph helps to determine the relationship
between two sets of values.
When changes are minor, it is better to use
line charts than bar graphs.
Pie Chart
A pie chart is a circular chart with multiple divisions, where
each division shows the contribution of each
value to the total value.
It is a graphical representation
of information that shows a part-to-whole relationship.
One can easily see the biggest or smallest
share of the total data.
It displays the relative proportions of multiple
classes of data.
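The part-to-whole relationship can be sketched numerically: each division's share of the total determines its percentage and the angle of its slice. The example below is a minimal illustration using only the standard library. The function name pie_slices and the category values are assumptions made for the example.

```python
def pie_slices(values):
    """Map each label to (percentage of total, slice angle in degrees)."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        label: (100 * value / total, 360 * value / total)
        for label, value in values.items()
    }


# Hypothetical category totals, chosen only for illustration.
sales = {"North": 50, "South": 30, "East": 20}
for label, (percent, angle) in pie_slices(sales).items():
    print(f"{label}: {percent:.1f}% of the whole, {angle:.1f} degrees")
```

The percentages add up to 100 and the angles to 360, which is why a pie chart suits data whose classes together make up one whole.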
Scatter Chart
A scatter plot or chart uses dots to represent
values for different numeric variables.
Scatter plots are used to observe relationships
between variables.
The position of each dot on the horizontal and
vertical axes indicates the values for an individual
data point.
It is particularly useful for researchers,
economists, scientists and journalists.
Bar Chart
A bar chart represents categorical data with
rectangular bars whose heights or lengths are proportional to the
values that they represent.
It is used when you want to show a distribution
of data points or perform a comparison of metric
values across different subgroups of your data.
From a bar chart, we can see which groups are
highest or most common.
A bar chart can be of two types: horizontal or
vertical.
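As a rough illustration of bars whose lengths are proportional to the values they represent, the sketch below counts categories and prints a horizontal text bar chart, most common group first. It uses only the standard library. The function name bar_chart_rows, the bar width of 20 characters, and the survey answers are assumptions made for the example.

```python
from collections import Counter


def bar_chart_rows(categories, width=20):
    """Return text rows whose bar length is proportional to each count."""
    counts = Counter(categories)
    if not counts:
        return []
    largest = max(counts.values())
    rows = []
    # most_common() orders groups from highest to lowest count.
    for label, count in counts.most_common():
        bar = "#" * round(count / largest * width)
        rows.append(f"{label:<8} {bar} {count}")
    return rows


# Hypothetical survey answers, chosen only for illustration.
answers = ["tea", "coffee", "tea", "juice", "tea", "coffee"]
for row in bar_chart_rows(answers):
    print(row)
```

Reading the rows from top to bottom shows at a glance which groups are highest or most common.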
Pair Plot
Pair plot (pairplot) is a function of the Seaborn library, which
provides a high-level interface for drawing
attractive and informative statistical graphics.
It visualizes the given data to find the relationships
between variables, which can be
continuous or categorical.
A pair plot is a representation that plots
the pairwise relationships in the data.
It is used to understand the best set of features
to explain a relationship between two variables.
KDE Chart
A Kernel Density Estimation (KDE) plot is used for
visualizing the probability density of a
continuous variable.
Interpretation of the density curve:
If the density curve is left skewed, then the mean is
less than the median.
If the density curve is right skewed, then the mean is greater
than the median.
If the density curve has no skew, then the mean is equal to
the median.
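The sketch below illustrates both ideas on a small sample: a kernel density estimate built by averaging one Gaussian bump per data point, and the comparison of mean and median for a right-skewed sample. It uses only the standard library. The function name gaussian_kde, the bandwidth of 1.0, and the sample values are assumptions made for the example, and plotting libraries usually choose the bandwidth automatically.

```python
import math
import statistics


def gaussian_kde(sample, bandwidth):
    """Return a function that estimates the probability density at x."""
    scale = 1.0 / (len(sample) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump centred on each observation.
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


# Hypothetical right-skewed sample: one large value stretches the right tail.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 12]
density = gaussian_kde(sample, bandwidth=1.0)

mean = statistics.mean(sample)
median = statistics.median(sample)
print(f"mean={mean} median={median}")
print(f"density near the peak (x=3): {density(3):.3f}")
print(f"density in the tail (x=12): {density(12):.3f}")
```

For this sample the mean is greater than the median, which matches the rule for a right-skewed density curve.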
Area Plot
An area chart displays quantitative
data graphically.
It is based on the line chart.
The area between the axis and the line is commonly
emphasized with colours and textures.
It is used to showcase data that depicts a time-series
relationship.
It is a great chart for visualizing a change in volume over a period
of time.
Hex Bin Plot
A hexbin plot is used to represent the
relationship between two numerical variables
when many data points are present.
In a hexbin plot, the points do not overlap; instead,
the plot area is split into several hexagonal bins (hexbins).
It shows the density of data points in a 2D space.
The colour of each bin represents the number of data points
within that bin.
It uses hexagons to split the area into several parts.
Heat Maps
A heat map is a graphical representation of data
that uses a system of colour coding to represent
different values.
It is most commonly used to show user
behaviour on specific webpages.
It is a two-dimensional data visualization that
represents the magnitude of individual values
within a data set as a colour.
Heat maps are applicable in A/B testing and helpful in redesigning
websites, content marketing, etc.
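The colour coding can be sketched as a mapping from each value's magnitude to a position on a scale. In the simplified example below, text characters stand in for colours, from a blank (lowest) to "#" (highest). It uses only the standard library. The function name heat_map_rows, the five-step shade scale, and the grid of click counts are assumptions made for the example.

```python
def heat_map_rows(matrix, shades=" .:*#"):
    """Render a grid of numbers as text, darker characters for larger values."""
    flat = [value for row in matrix for value in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    rows = []
    for row in matrix:
        # Normalize each value to the 0..1 range, then pick a shade.
        rows.append("".join(
            shades[int((value - lo) / span * (len(shades) - 1))]
            for value in row
        ))
    return rows


# Hypothetical click counts per page region, chosen only for illustration.
clicks = [
    [0, 5, 10],
    [10, 5, 0],
]
for line in heat_map_rows(clicks):
    print(f"[{line}]")
```

A real heat map replaces the characters with a colour gradient, but the step from magnitude to colour is the same.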
Change over time
Spark Line Chart Examples
Part-to-whole composition
Flows and processes
How data is distributed?
Comparing values between groups
Relationships between variables
Geographical Data
Data Visualization Tools
• Tableau
• Infogram
• ChartBlocks
• D3.js
• Google Charts
• FusionCharts
• Chart.js
Tableau
• Tableau has a variety of options available, including a desktop app,
server and hosted online versions, and a free public option.
• There are hundreds of data import options available, from CSV files to
Google Ads and Analytics data to Salesforce data.
• Output options include multiple chart formats as well as mapping
capability. That means designers can create color-coded maps that
showcase geographically important data in a format that’s much easier to
digest than a table or chart could ever be.
Infogram
• Infogram is a fully-featured drag-and-drop visualization tool that allows
even non-designers to create effective visualizations of data for marketing
reports, infographics, social media posts, maps, dashboards, and more.
• Finished visualizations can be exported into a number of formats: .PNG,
.JPG, .GIF, .PDF, and .HTML. Interactive visualizations are also possible,
perfect for embedding into websites or apps.
• Infogram also offers a WordPress plugin that makes embedding
visualizations even easier for WordPress users.
ChartBlocks
• ChartBlocks claims that data can be imported from “anywhere” using
their API, including from live feeds. While they say that importing data
from any source can be done in “just a few clicks,” it’s bound to be more
complex than in other apps that have automated modules or extensions for
specific data sources.
• The app allows for extensive customization of the final visualization
created, and the chart building wizard helps users pick exactly the right
data for their charts before importing the data.
• Designers can create virtually any kind of chart, and the output is
responsive—a big advantage for data visualization designers who want
D3.js
• D3.js is a JavaScript library for manipulating documents using data.
• D3.js requires at least some JS knowledge, though there are apps out
there that allow non-programming users to utilize the library.
• Those apps include NVD3, which offers reusable charts for D3.js;
Plotly’s Chart Studio, which also allows designers to create WebGL and
other charts; and Ember Charts, which also uses the Ember.js
framework.
Google Charts
• Google Charts is a powerful, free data visualization tool that is
specifically for creating interactive charts for embedding online.
• It works with dynamic data and the outputs are based purely on
HTML5 and SVG, so they work in browsers without the use of additional
plugins. Data sources include Google Spreadsheets, Google Fusion
Tables, Salesforce, and other SQL databases.
• There are a variety of chart types, including maps, scatter charts,
column and bar charts, histograms, area charts, pie charts, treemaps,
timelines, gauges, and many others.
FusionCharts
• FusionCharts is another JavaScript-based option for creating web and
mobile dashboards. It includes over 150 chart types and 1,000 map
types.
• It can integrate with popular JS frameworks (including React, jQuery,
Ember, and Angular) as well as with server-side programming
languages and frameworks (including PHP, Java, Django, and Ruby on Rails).
• FusionCharts gives ready-to-use code for all of the chart and map
variations, making it easier to embed in websites even for those
designers with limited programming knowledge.
Chart.js
• Chart.js is a simple but flexible JavaScript charting library.
• It’s open source, provides a good variety of chart types (eight total),
and allows for animation and interaction.
• Chart.js uses HTML5 Canvas for output, so it renders charts well
across all modern browsers.
• Charts created are also responsive, so it’s great for creating
visualizations that are mobile-friendly.
Visualization using Programming
Python
• Python is considered one of the top programming languages
for data visualization because of its many libraries,
which allow for greater flexibility, and its large and active scientific
computing community.
• It also gives control over the specific elements of the created graphics and
makes the specifications repeatable through code.
• Python is also very good at processing data; its open-source community and
rich third-party libraries allow continuous optimization for data visualization.
– matplotlib
– seaborn
– plotly
– pylab
R
• R is an open-source software environment designed for statistical
computing and creating graphics.
• R is designed for data analysis.
• Although Python is becoming more and more popular, especially in
the areas of machine learning and deep learning, the R language
still has strong advantages in data analysis and visualization: the
ggplot2 package and its extension packages offer a user-friendly
grammar of graphics favored by users, especially bioinformatics and medical
researchers.
Power BI
• Power BI is able to extract data from a variety of data sources in
addition to supporting Microsoft's own products.
• The drag-and-drop graphical development model used by Power BI
frees data analysts from visual chores so that they can put more effort into data
management, algorithm research, and business communication.
Features of Tableau
Tableau products
Tableau Reports
Tableau file types…
Tableau can connect to:
Connecting to Data
Sample Data
Importing Data to Tableau
Data in Tableau
Tableau Worksheet
Making Charts
Changing the axes
Multiple Dimensions in Charts
Multiple Measures in Charts
Various Types of Charts
Tree Map
Mark Options in Tableau
Color
Size
Label
Detail
Scatter Plot
Sorting
Filters
Types of Aggregation
Count
Results in Percentages
Column-wise Percentage
Row-wise Percentage
Calculated Fields
Creating Calculated Fields
Viewing Calculated Fields
Using Calculated Fields
Sets
Creating Sets
Using Sets
Hierarchy
Maps
Creating Maps
Making Dashboards in Tableau
Conclusion
The tutorial covers the basic functionalities in Tableau. Many more
options are available, which one can explore independently after
getting a feel for the software.
THANK YOU