Data Warehousing & Mining (MU-Sem. 6-Comp.) 3-25
3.10 Data Visualisation
Data visualisation is presenting the data in a graphical or pictorial format. Visualisation
techniques help people to analyse things which are otherwise not possible when the data is
large. Patterns in the data can be marked very easily using data visualisation.
Some of the data visualisation techniques are as follows :
1. Pixel-oriented visualization techniques
- In pixel based visualisation techniques, there is a separate sub-window for the values
of each attribute, and each value is represented by one coloured pixel.
- This maximises the amount of information represented at one time without any overlap.
- A tuple with m variables has m different coloured pixels, one to represent each variable, and
each variable has its own sub-window.
Fig. 3.10.1: Pixel visualisation with four variables
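The mapping from tuples to sub-windows can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `pixel_windows`, the 0-255 grey-level scale and the sample data are hypothetical and do not come from the text, and the pixels are left in input order rather than arranged in any particular pattern.

```python
def pixel_windows(tuples):
    """Return one sub-window per variable: a flat list of 0-255 grey levels.

    Assumption: each value is scaled linearly between the minimum and
    maximum of its own variable, so every variable uses the full range.
    """
    m = len(tuples[0])  # number of variables, hence number of sub-windows
    windows = []
    for j in range(m):
        column = [row[j] for row in tuples]
        lo, hi = min(column), max(column)
        span = (hi - lo) or 1  # a constant column maps to level 0 everywhere
        windows.append([round(255 * (v - lo) / span) for v in column])
    return windows


# Hypothetical data: three tuples with four variables each.
data = [(1, 10.0, 5, 0.2), (2, 20.0, 5, 0.4), (3, 30.0, 5, 0.8)]
windows = pixel_windows(data)
```

Each tuple contributes exactly one pixel to each of the four sub-windows, at the same position in every window, which is what lets a reader compare variables across windows.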
2. Geometric projection visualization techniques
(ii) Hyperslice : It is an extension to scatter plot matrices. They represent a
multidimensional function as a matrix of orthogonal two dimensional slices.
(iii) Parallel co-ordinates : Parallel vertical lines, separated from one another, define the axes. A point
in Cartesian coordinates corresponds to a polyline in parallel coordinates.
Fig. 3.10.2: An example of scatter plot
Fig. 3.10.3: An example of parallel coordinates (axes: Sepal width, Sepal length, Petal width, Petal length; classes: Setosa, Versicolor, Virginica)
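The statement that a point in Cartesian coordinates becomes a polyline in parallel coordinates can be made concrete with a small sketch. The function name `to_polyline`, the per-axis scaling to [0, 1] and the sample point are assumptions made for illustration only.

```python
def to_polyline(point, mins, maxs):
    """Map one n-dimensional point to the vertices of its polyline.

    Vertex i sits on the i-th vertical axis at a height obtained by
    scaling the i-th coordinate into [0, 1] (an assumed convention).
    """
    vertices = []
    for axis, (value, lo, hi) in enumerate(zip(point, mins, maxs)):
        span = (hi - lo) or 1  # guard against a zero-width axis range
        vertices.append((axis, (value - lo) / span))
    return vertices


# Hypothetical 4-D point and axis ranges.
polyline = to_polyline(
    (3.0, 5.0, 0.5, 2.0),
    mins=(2.0, 4.0, 0.0, 1.0),
    maxs=(4.0, 8.0, 2.0, 5.0),
)
```

Joining consecutive vertices with straight segments gives the polyline; drawing one such polyline per tuple gives a plot of the kind in Fig. 3.10.3.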
3. Icon-based visualization techniques
- Icon based visualisation techniques are also known as iconic display techniques.
Each multidimensional data item is mapped to an icon.
This technique allows visualisation of large amounts of data.
The two most commonly used icon based techniques are :
(i) Chernoff faces
(ii) Stick figures
(i) Chernoff faces
- Illustration of trends in multidimensional data can be done by using Chernoff faces.
- This concept was introduced by Herman Chernoff in the year 1973.
- The faces in Chernoff faces are related to the facial expressions or features of human
beings, so it is easy to distinguish between them.
- Different data dimensions are mapped to different facial features, for example the
face width, the length or curvature of the mouth, the length of the nose, etc.
- An example of Chernoff faces is shown below; they use facial features to represent
trends in the values of the data, not the specific values themselves.
- They display multidimensional data of up to 18 variables or dimensions.
- In Fig. 3.10.4, each face represents an n-dimensional data point (n <= 18).
Fig. 3.10.4: An example of Chernoff faces
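The idea of assigning one data dimension to one facial feature can be sketched as a simple lookup. The feature list and its order below are hypothetical (only face width, mouth length, mouth curvature and nose length are named in the text), and the values are assumed to be already scaled to [0, 1]; drawing the face is left out.

```python
# Hypothetical feature order; a real implementation would list up to 18.
FEATURES = ["face_width", "mouth_curvature", "mouth_length", "nose_length"]

MAX_DIMENSIONS = 18  # limit stated in the text


def to_face(values, features=FEATURES):
    """Pair each data dimension with one facial feature parameter."""
    if len(values) > MAX_DIMENSIONS:
        raise ValueError("Chernoff faces display at most 18 dimensions")
    if len(values) > len(features):
        raise ValueError("not enough facial features for these dimensions")
    return dict(zip(features, values))


# Hypothetical data point with two dimensions, already scaled to [0, 1].
face = to_face([0.2, 0.9])
```

Because each face is built from relative feature sizes, comparing faces shows trends across data points rather than exact values.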
(ii) Stick figures
- Stick figures were introduced by Pickett and Grinstein.
- Fig. 3.10.5 shows a stick figure icon and a family of twelve of them.
- Stick icons can be used to display multidimensional data, and the resulting texture
helps to differentiate the data.
Fig. 3.10.5: Example of stick figure: (a) a five-stick figure, (b) a stick icon family
4. Hierarchical visualization techniques
- The visualisation techniques discussed above display multiple dimensions
simultaneously. However, for a large data set with a large number of dimensions, the
above techniques may not be useful.
- Hierarchical visualisation techniques partition all dimensions into subsets (subspaces).
- These subspaces are visualised in a hierarchical manner.
Some of the visualisation techniques are :
(i) Dimensional stacking
(ii) Mosaic plot
(iii) Worlds-within-worlds
(iv) Tree-map
(v) Visualizing complex data and relations
(i) Dimensional stacking
- In dimensional stacking, the n-dimensional attribute space is partitioned into two dimensional
subspaces.
- Attribute values are partitioned into various classes.
- Each element in a two dimensional space is an xy plot.
- The important attributes are marked and used on the outer levels.
Fig. 3.10.6: Data in dimension stacking
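One way to see how the stacked subspaces fit together is to compute the grid cell a tuple falls into. The sketch below assumes a common convention that the text does not spell out: attributes are listed outermost first and are assigned alternately to the x and y axes. The function name `stacked_cell` and the sample values are hypothetical.

```python
def stacked_cell(bins, sizes):
    """Return the (x, y) grid cell for one tuple in a dimensional stack.

    bins[i]  - class index of attribute i (0-based)
    sizes[i] - number of classes attribute i was partitioned into
    Assumed convention: attributes are ordered outermost first; those at
    even positions refine the x axis, those at odd positions the y axis.
    """
    x = y = 0
    for i, (b, n) in enumerate(zip(bins, sizes)):
        if not 0 <= b < n:
            raise ValueError("class index out of range for attribute %d" % i)
        if i % 2 == 0:
            x = x * n + b  # step into the inner block along x
        else:
            y = y * n + b  # step into the inner block along y
    return x, y


# Hypothetical tuple: four attributes partitioned into 2, 3, 4 and 5 classes.
cell = stacked_cell(bins=(1, 2, 3, 4), sizes=(2, 3, 4, 5))
```

The outer attributes select a coarse block and the inner ones a position inside it, which is why the most important attributes are placed on the outer levels.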
(ii) Mosaic plot
- Mosaic plots give a graphical illustration of the successive decompositions.
- Rectangles are used to represent the counts of categorical data, and at every stage the
rectangles are split parallel.
- To draw a mosaic plot, a contingency table of the data and a chosen ordering of the variables,
with the response variable, is required.
- Example: In the Titanic example, out of all women, 67% survived, which is coded as 1,
and 33% died, which is coded as 0. So the women bar shows a 67/33 split. Among
men, only 17% survived, so this bar shows a 17/83 split.
Fig. 3.10.7: Mosaic plot for Titanic
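The split of each bar follows directly from the contingency table. The sketch below uses hypothetical counts, chosen only so that they reproduce the percentages quoted in the example; they are not the real Titanic figures, and the function name `split_fractions` is likewise an assumption.

```python
def split_fractions(table):
    """Turn {group: {outcome: count}} into {group: {outcome: fraction}}."""
    result = {}
    for group, outcomes in table.items():
        total = sum(outcomes.values())
        result[group] = {k: v / total for k, v in outcomes.items()}
    return result


# Hypothetical counts (1 = survived, 0 = died), NOT the real data set.
counts = {
    "women": {1: 67, 0: 33},
    "men": {1: 17, 0: 83},
}
fractions = split_fractions(counts)
```

In a drawn mosaic plot each group's bar would then be cut at these fractions, with the bar's own width proportional to the group's total count.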
(iii) Worlds-within-worlds
Fig. 3.10.8: Worlds within worlds visualization
(iv) Tree-maps
- Tree-map visualization techniques are well suited for displaying large amounts of
hierarchically structured data.
- The visualization space is divided into multiple rectangles that are sized and ordered
according to a quantitative variable.
- The levels in the hierarchy are seen as rectangles containing other rectangles.
- Each set of rectangles on the same level in the hierarchy represents a column or an
expression in a data set.
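How rectangles end up nested and sized by a quantitative variable can be sketched with a simple slice-and-dice layout, which alternates the split direction at each level. This is one possible layout chosen for brevity, not necessarily the one used for the figures in this section; the node format, function names and sample hierarchy are all hypothetical.

```python
def weight(node):
    """Total value of a node: its own value, or the sum over its children."""
    if "children" in node:
        return sum(weight(child) for child in node["children"])
    return node["value"]


def slice_rect(rect, values, horizontal):
    """Cut rect = (x, y, w, h) into strips proportional to values."""
    x, y, w, h = rect
    total = sum(values)
    strips, offset = [], 0.0
    for v in values:
        share = v / total
        if horizontal:
            strips.append((x + offset * w, y, share * w, h))
        else:
            strips.append((x, y + offset * h, w, share * h))
        offset += share
    return strips


def treemap(node, rect, horizontal=True):
    """Return (name, rectangle) pairs; children lie inside their parent."""
    placed = [(node["name"], rect)]
    children = node.get("children", [])
    if children:
        sizes = [weight(child) for child in children]
        for child, strip in zip(children, slice_rect(rect, sizes, horizontal)):
            placed.extend(treemap(child, strip, not horizontal))
    return placed


# Hypothetical two-level hierarchy with made-up values.
tree = {
    "name": "root",
    "children": [
        {"name": "A", "children": [{"name": "A1", "value": 1},
                                   {"name": "A2", "value": 1}]},
        {"name": "B", "value": 2},
    ],
}
layout = dict(treemap(tree, (0, 0, 8, 4)))
```

Here `A` and `B` each take half of the root rectangle, and `A1` and `A2` split the area of `A` between them, so containment in the drawing mirrors containment in the hierarchy.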
Fig. 3.10.9: Web traffic by location tree-map
- Each individual rectangle on a level in the hierarchy represents a category in a
column.
- For example, in Fig. 3.10.9, one rectangle represents the globe, below which there are
rectangles representing continents, which contain several rectangles representing the
countries in that continent.
- Each rectangle representing a country may in turn contain rectangles representing
states in these countries.
(v) Visualizing complex data and relations
- This technique is useful to visualize non-numeric data such as text, pictures, blog
entries and product reviews.
- A tag cloud is a visualization method which helps to understand the information in
user generated tags.
- Arrange the tags alphabetically or by user preference, with different font sizes
and colors.
- Tag clouds are used in two ways: the size of a tag shows either how many times that
tag has been applied to an item by different users, or how many items that tag has been
applied to.
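A tag cloud of the first kind (size by how often a tag was applied) can be sketched with the standard library. The function name `tag_cloud`, the font-size range of 10 to 30 and the sample tags are assumptions made for illustration; rendering the text is left out.

```python
from collections import Counter


def tag_cloud(tags, min_size=10, max_size=30):
    """Return (tag, font_size) pairs in alphabetical order.

    Font size grows linearly with how many times the tag was applied;
    the size range is an assumed default, not taken from the text.
    """
    counts = Counter(tags)
    if not counts:
        return []
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # all tags equally frequent -> all get min_size
    return [
        (tag, min_size + (max_size - min_size) * (counts[tag] - lo) // span)
        for tag in sorted(counts)
    ]


# Hypothetical tags applied to one item by different users.
cloud = tag_cloud(["python", "data", "python", "mining", "data", "python"])
```

For the second kind of tag cloud, the same function could be fed one entry per tagged item instead of one entry per user.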
Syllabus Topic : Data Preprocessing
3.11 Data Preprocessing
A process that involves the transformation of data into information through classifying, sorting,
merging, recording, retrieving, transmitting, or reporting is called data processing. Data
processing can be manual or computer based.
In the business world, data processing refers to processing data so as to enable the
effective functioning of organisations and businesses.
Computer data processing refers to a process that takes data input via a program and
summarizes or analyses it, or converts it to useful information.
The processing of data may also be automated.
Data processing systems are also known as information systems.
When data processing does not involve any data manipulation and only converts the data
type, it may be called data conversion.
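The difference between data conversion and data processing can be shown with a small sketch. The function names and the sample records are hypothetical; the point is only that the first function changes the data type and nothing else, while the second sorts and summarises.

```python
def convert(records):
    """Data conversion: only the type changes (text to integers)."""
    return [int(r) for r in records]


def summarise(values):
    """Data processing: sorting and summarising produce information."""
    return {
        "count": len(values),
        "total": sum(values),
        "sorted": sorted(values),
    }


raw = ["12", "7", "30"]      # hypothetical input records
numbers = convert(raw)       # same values, different type
report = summarise(numbers)  # derived information
```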
3.11.1 Form of Data Pre-processing
Why Pre-processing Is Required ?
1. Real world data are generally :
- Incomplete : The data is said to be incomplete when certain attributes or attribute
values are missing, or only the aggregate data is available.
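Incompleteness of this kind can be detected before any mining step. The sketch below counts, for each attribute, the records in which it is absent or empty; the function name `missing_report`, the use of `None` as the marker for a missing value and the sample records are assumptions made for illustration.

```python
def missing_report(records, attributes):
    """Count, per attribute, the records that lack a value for it.

    A value counts as missing when the key is absent or set to None
    (an assumed convention for this sketch).
    """
    return {
        attr: sum(1 for record in records if record.get(attr) is None)
        for attr in attributes
    }


# Hypothetical records with gaps.
records = [
    {"name": "A", "age": 30, "city": "X"},
    {"name": "B", "age": None},
    {"name": "C", "city": "Y"},
]
report = missing_report(records, ["name", "age", "city"])
```

A non-zero count for an attribute signals that the data set is incomplete in the sense described above and needs attention during pre-processing.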