Unit 5 Notes

The document discusses various pixel-oriented and geometric visualization techniques for representing multidimensional data, including pixel-oriented visualization, space-filling curves, scatter plots, and parallel coordinates. It also covers icon-based techniques like Chernoff faces and stick figures, as well as hierarchical visualization methods such as tree-maps and worlds-within-worlds. Additionally, it highlights the importance of visualizing complex data types and relationships, using examples like tag clouds and disease influence graphs.

Uploaded by

ganeshparsab999
Copyright
© All Rights Reserved
UNIT 5

Pixel-Oriented Visualization Techniques


Pixel-oriented visualization shows data as colored pixels. Each pixel represents one value
from the data, and the color of the pixel encodes that value. The technique is useful when
records have several dimensions (such as income and credit limit) and you want to see how
those dimensions relate to each other. Each attribute gets its own window (screen area), and
the windows are compared side by side. The technique works as follows:
1. Each Attribute Gets a Window:
o For a dataset with m attributes (e.g., income, credit limit), create m
separate windows (like small grids) on your screen.

2. Sort All Records in the Same Order:


o Decide on a global order for all records (e.g., sort customers by income from
lowest to highest). This order is used across all windows, so the pixel at a given
position represents the same customer in every window.

3. Map Values to Pixels:


o In each window, represent the sorted records as pixels. The color of each pixel
reflects the value of that attribute. For example:
 Income Window: Lighter pixels for lower income, darker pixels
for higher income.
 Credit Limit Window: Lighter pixels for lower credit limits,
darker pixels for higher credit limits.
 Transaction Volume Window: Lighter pixels for fewer
transactions, darker pixels for more transactions.
 Age Window: Lighter pixels for younger ages, darker pixels for
older ages.
4. Analyze Patterns
o By looking at the patterns of colors across the windows, you can spot
relationships between the attributes.
 For example:
 If the credit limit window shows darker pixels where the income
window also has darker pixels, it suggests that higher income
correlates with higher credit limits.
 If there’s no clear pattern between the age window and the income
window, it suggests that age and income are not strongly related.
Here’s what the visualization might look like for the four attributes:

Attribute               Description
(a) Income              Pixels get darker as income increases.
(b) Credit Limit        Pixels get darker as credit limit increases.
(c) Transaction Volume  Pixels get darker as transaction volume increases.
(d) Age                 Pixels get darker as age increases.

By sorting customers by income first, you can easily compare how other attributes (credit limit,
transaction volume, age) change as income changes.
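The four steps above can be sketched in a few lines of Python. This is only a minimal sketch: the customer records are made-up values, the function name pixel_windows is hypothetical, and each "pixel" is just a darkness level from 0 (lightest) to 255 (darkest) instead of a real image.

```python
# Hypothetical customer records; the values are made up for illustration.
customers = [
    {"income": 52000, "credit_limit": 8000, "transactions": 40, "age": 35},
    {"income": 31000, "credit_limit": 3000, "transactions": 12, "age": 58},
    {"income": 98000, "credit_limit": 20000, "transactions": 75, "age": 44},
    {"income": 45000, "credit_limit": 6000, "transactions": 30, "age": 23},
]


def pixel_windows(records, sort_key):
    """Return one list of darkness levels (0 = lightest, 255 = darkest) per attribute."""
    # Step 2: one global order shared by every window.
    ordered = sorted(records, key=lambda r: r[sort_key])
    windows = {}
    # Step 1: one window per attribute.
    for attr in ordered[0]:
        values = [r[attr] for r in ordered]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1  # avoid division by zero for constant attributes
        # Step 3: map each value to a darkness level.
        windows[attr] = [round(255 * (v - lo) / span) for v in values]
    return windows


windows = pixel_windows(customers, "income")
# Step 4: compare the windows position by position.
for attr, pixels in windows.items():
    print(f"{attr:>13}: {pixels}")
```

In this toy data the credit limit window darkens in the same order as the income window, while the age window shows no such trend.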
The following two techniques are powerful tools in data visualization, helping to make complex,
high-dimensional data more understandable and accessible.
1. Space-Filling Curves
A space-filling curve sets the order in which the pixels of a window are filled: the sorted,
one-dimensional sequence of records is laid out along the curve in a 2D space. Examples include
the Hilbert curve, Gray code curve, and Z-curve. These curves ensure that all parts of the space
are covered without gaps. In the figure, each curve shows how a 2D space is filled in a
specific pattern:

 Hilbert Curve: Fills the space in a more structured, grid-like manner.


 Gray Code Curve: Fills the space in a zigzag pattern.
 Z-Curve: Fills the space in a diagonal, step-like pattern.
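To make the idea concrete, here is a small sketch that turns a record's position in the sort order into a pixel coordinate. The function names z_curve_xy and hilbert_xy are made up for this example. The Z-curve version de-interleaves the bits of the index, and the Hilbert version uses the commonly published iterative index-to-coordinate conversion. The Gray code curve is left out to keep the sketch short.

```python
def z_curve_xy(d, bits):
    """Position d along a Z-curve -> (x, y) on a 2**bits by 2**bits grid."""
    x = y = 0
    for i in range(bits):
        x |= ((d >> (2 * i)) & 1) << i       # even bits of d form x
        y |= ((d >> (2 * i + 1)) & 1) << i   # odd bits of d form y
    return x, y


def hilbert_xy(n, d):
    """Position d along a Hilbert curve -> (x, y) on an n by n grid (n a power of 2)."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:              # rotate or flip the current quadrant
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


# Where the first 16 sorted records land in a 4 x 4 window.
print([z_curve_xy(d, 2) for d in range(16)])
print([hilbert_xy(4, d) for d in range(16)])
```

One practical difference shows up in the output: consecutive positions on the Hilbert curve are always neighbouring pixels, while the Z-curve sometimes jumps.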
2. Circle Segment Technique
It uses circular segments to represent multiple dimensions. Each dimension is a segment
of a circle, and data records are plotted along these segments, making it easier to compare
dimensions.

 Figure 2.12(a): Shows how a single data record is represented in circle segments. Each
dimension (Dim 1, Dim 2, ..., Dim 6) has its own segment, and the data record is plotted
as a point in each segment.
 Figure 2.12(b): Shows how multiple data records are laid out in circle segments. The
segments are arranged in a circular pattern, and the data records are plotted along these
segments, forming a circular layout.
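The geometry of the circle segments can be sketched as follows. This is a simplified sketch, not the full technique: the function names are hypothetical, and each record is placed on the centre line of its segment with the radius growing with the record's rank, whereas a real implementation fills the whole width of the segment with pixels.

```python
import math


def segment_angles(num_dims):
    """Return (start, end) angles in radians for each dimension's segment."""
    width = 2 * math.pi / num_dims
    return [(k * width, (k + 1) * width) for k in range(num_dims)]


def record_position(dim, record_index, num_dims, num_records):
    """Place one record of one dimension on the centre line of its segment.

    Simplification: earlier records sit near the centre of the circle,
    later records near the rim (radius 1.0).
    """
    start, end = segment_angles(num_dims)[dim]
    angle = (start + end) / 2
    radius = (record_index + 1) / num_records
    return (radius * math.cos(angle), radius * math.sin(angle))


# Six dimensions, as in the figure; ten records is an arbitrary choice.
for dim in range(6):
    print(dim, record_position(dim, record_index=9, num_dims=6, num_records=10))
```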

Geometric Projection Visualization Techniques


Pixel-Oriented Visualization Limitations
Pixel-oriented visualization techniques provide little insight into how data is distributed
in a multidimensional space. They often struggle to show dense regions and patterns clearly,
because they focus on individual data points rather than the overall distribution.
Why Geometric Projection Techniques Are Needed
Geometric projection techniques address this limitation by helping us visualize high-dimensional
data in a lower-dimensional space (usually 2D or 3D).
Example: Scatter Plot

A scatter plot displays 2-D data points using Cartesian coordinates. A third dimension can be
added by drawing the points in different colors or shapes. Here, X and Y are two spatial
attributes and the third dimension is represented by different shapes. Through this visualization,
we can see that points of types “+” and “*” tend to be colocated.
A 3D scatter plot lets you visualize 3 variables (X, Y, Z), and adding color allows you to
include a 4th variable.

When you have a dataset with more than 4 dimensions, it becomes hard to visualize everything
using a single scatter plot. A scatter-plot matrix solves this problem by showing all possible
pairs of dimensions in a grid format.
Example: The Iris dataset has 5 dimensions (Sepal Length, Sepal Width, Petal Length, Petal
Width, and Species).
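The size of the grid is easy to count. The sketch below assumes that the four numeric Iris dimensions are used as axes and that Species is shown by color, which is a common choice but an assumption here.

```python
from itertools import combinations

numeric_dims = ["Sepal Length", "Sepal Width", "Petal Length", "Petal Width"]

# Full grid: one panel per (row, column) position, n x n panels in total.
grid = [(row, col) for row in numeric_dims for col in numeric_dims]

# Distinct unordered pairs: the panels on one side of the diagonal.
pairs = list(combinations(numeric_dims, 2))

print(len(grid))   # number of panels in the 4 x 4 grid
print(len(pairs))  # number of distinct pairs of dimensions
for a, b in pairs:
    print(f"{a} vs. {b}")
```

In general, n dimensions give an n x n grid with n(n-1)/2 distinct pairs, which is why the matrix grows quickly as dimensions are added.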
When the number of dimensions in a dataset becomes very large, scatter-plot matrices become
cluttered and hard to interpret.
A better alternative is parallel coordinates, which can effectively visualize high-dimensional
data in a compact and intuitive way. Instead of plotting dimensions on perpendicular axes (like in
a scatter plot), parallel coordinates use n parallel vertical axes, one for each dimension. Each
data record is represented by a polygonal line that connects its values across all dimensions.
The main limitation of parallel coordinates is that they cannot effectively show a dataset with
many records: with too many data points, the lines overlap and the visualization becomes messy.
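The construction of the polygonal lines can be sketched as below. The function name to_polylines and the records are made up for illustration. Each value is scaled to the range 0 to 1 within its own dimension so that every axis has the same height, which is one common choice and an assumption here.

```python
def to_polylines(records, dims):
    """Turn each record into a list of (axis_index, height) vertices.

    Heights are min-max scaled to [0, 1] per dimension so that every
    axis has the same drawn length regardless of units.
    """
    ranges = {}
    for d in dims:
        values = [r[d] for r in records]
        ranges[d] = (min(values), max(values))
    lines = []
    for r in records:
        line = []
        for i, d in enumerate(dims):
            lo, hi = ranges[d]
            # A constant dimension is drawn at mid-height.
            height = (r[d] - lo) / (hi - lo) if hi > lo else 0.5
            line.append((i, height))
        lines.append(line)
    return lines


# Made-up records with three dimensions.
records = [
    {"age": 25, "income": 30000, "spend": 400},
    {"age": 40, "income": 90000, "spend": 1500},
    {"age": 60, "income": 60000, "spend": 900},
]
lines = to_polylines(records, ["age", "income", "spend"])
for line in lines:
    print(line)
```

Each inner list is one record's line: connecting its vertices from left to right gives the polygonal line across the parallel axes.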
Icon-Based Visualization Techniques
What Are Icon-Based Visualization Techniques?
Icon-based visualization techniques use small icons or symbols to represent multidimensional
data. Two popular icon-based techniques are:
1. Chernoff faces and
2. Stick figures.
1. Chernoff Faces
Chernoff faces visualize multidimensional data by representing each data record as a
cartoon-like human face. This technique uses different facial
features (like eyes, nose, mouth, etc.) to encode up to 18 dimensions of data. By looking at the
faces, you can quickly spot trends, similarities, or differences in the data. Different dimensions of
the data are mapped to specific facial features. For example:
 Dimension 1: Eye size
 Dimension 2: Nose length
 Dimension 3: Mouth width
 Dimension 4: Pupil size
 Dimension 5: Eyebrow slant
 Dimension 6: Eye eccentricity and so on...
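The mapping from dimensions to facial features can be sketched as a simple lookup. The feature ranges below are assumed drawing units chosen for this example, not values from any library, and the input record is assumed to be already scaled to the range 0 to 1.

```python
# Assumed feature ranges in arbitrary drawing units; not from any library.
FEATURE_RANGES = {
    "eye_size": (0.1, 0.4),
    "nose_length": (0.2, 0.6),
    "mouth_width": (0.2, 0.8),
    "pupil_size": (0.02, 0.1),
    "eyebrow_slant": (-30.0, 30.0),   # degrees
    "eye_eccentricity": (0.5, 1.5),
}


def face_parameters(normalized_record):
    """Map values in [0, 1], one per dimension, onto facial feature settings."""
    if len(normalized_record) > len(FEATURE_RANGES):
        raise ValueError("more dimensions than facial features")
    params = {}
    for value, (feature, (lo, hi)) in zip(normalized_record, FEATURE_RANGES.items()):
        # Linear interpolation between the smallest and largest feature setting.
        params[feature] = lo + value * (hi - lo)
    return params


p = face_parameters([0.0, 0.5, 1.0, 0.25, 0.5, 1.0])
for feature, setting in p.items():
    print(f"{feature}: {setting:.3f}")
```

A drawing routine would then use these settings to draw one face per record.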
Advantages of Chernoff Faces
 Compact Representation: Multiple dimensions can be visualized in a single icon (a
face), making it easy to compare many data points at once.
 Human Intuition: Humans are naturally good at recognizing faces and subtle
differences, which helps in quickly identifying patterns or anomalies.
 Visual Appeal: Faces are engaging and memorable, making the visualization more
intuitive and accessible.
Limitations of Chernoff Faces
 Cognitive Load: While humans are good at recognizing faces, interpreting the meaning
of each facial feature can be challenging, especially when there are many dimensions.
 Overcrowding: With too many dimensions, the faces can become cluttered and difficult
to interpret.
 Subjectivity: The interpretation of facial features can be subjective, and not everyone
may perceive the same patterns.
Asymmetrical Chernoff faces remove the requirement for symmetry, allowing the left and
right sides of the face to be different. This doubles the number of facial characteristics (36
dimensions instead of just 18) that can be used to encode data.
2. Stick Figures
The stick figure visualization technique is a creative way to represent multidimensional data
using simple stick figures. Each stick figure has five parts: a body and four limbs (two arms and
two legs). The technique maps dimensions of the data to the position, angle, or length of these
parts, allowing you to visualize complex datasets in an intuitive way.
Example: Census Data
Let’s say we’re analyzing census data with the following dimensions:
1. Age → X-axis
2. Income → Y-axis
3. Gender → Left arm angle (horizontal = male, vertical = female).
4. Education Level → Right arm length (longer = higher education).
5. Employment Status → Leg angles (straight = employed, bent = unemployed).
If the data is dense (many stick figures close together), the stick figures form a texture pattern
that highlights trends, such as:
 Highly educated people (long right arms) tend to cluster in high-income areas.
 Unemployed people (bent legs) are more common in lower-income regions.
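The census encoding above can be sketched as a function that turns one record into the settings of one stick figure. Several details are assumptions made for this sketch: education is coded as a level from 0 to 4, a bent leg is drawn at 45 degrees, and the field and function names are hypothetical.

```python
import math


def stick_figure(record, max_education=4):
    """Describe one stick figure for a census record (illustrative encoding)."""
    left_arm_angle = 0.0 if record["gender"] == "male" else 90.0    # degrees
    right_arm_length = 0.5 + 0.5 * record["education"] / max_education
    leg_angle = 0.0 if record["employed"] else 45.0                  # degrees of bend
    return {
        "position": (record["age"], record["income"]),  # X-axis, Y-axis
        "left_arm_angle": left_arm_angle,
        "right_arm_length": right_arm_length,
        "leg_angle": leg_angle,
    }


def limb_end(origin, angle_deg, length):
    """End point of a limb that starts at origin."""
    a = math.radians(angle_deg)
    return (origin[0] + length * math.cos(a), origin[1] + length * math.sin(a))


# A made-up record.
fig = stick_figure(
    {"age": 30, "income": 50000, "gender": "female", "education": 4, "employed": False}
)
print(fig)
print(limb_end((0.0, 0.0), fig["left_arm_angle"], 1.0))
```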

Hierarchical Visualization Techniques


Hierarchical visualization techniques partition all dimensions into subsets (i.e., subspaces). The
subspaces are visualized in a hierarchical manner.
“Worlds-within-Worlds,” also known as n-Vision, is a representative hierarchical visualization
method. Consider the example of a 6-dimensional dataset with dimensions:

 F: The dimension we want to study (e.g., "Happiness Score").


 X1, X2, X3, X4, X5: Other dimensions (e.g., Age, Income, Education Level, Location,
Spending Habits).
We want to observe how dimension F changes with respect to the other dimensions.

Step 1: Fix Dimensions


Example: Set X3 = c3, X4 = c4, X5 = c5
 Fix Education Level = College, Location = Urban, and Spending Habits = High.

Step 2: Create the Outer World


 The outer world is a 3D plot showing Education Level, Location, Spending Habits.
 The origin of the inner world is located at the point (College, Urban, High) in the outer
world.
Step 3: Create the Inner World
 The inner world is a 3D plot showing Happiness Score (F) vs. Age (X1) and Income
(X2) for customers with College education, living in Urban areas, and having High
spending habits.
Step 4: Interact
 Move the origin in the outer world to a new location (e.g., High School, Rural, Low)
and observe how the inner world changes.
 Swap dimensions: Use Age, Income, Spending Habits in the outer world and
Happiness Score, Education Level, Location in the inner world.
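The relationship between the two worlds can be sketched as a filter: the origin chosen in the outer world selects which records appear in the inner world. The records, field names, and the function name inner_world are hypothetical and only illustrate the idea.

```python
def inner_world(records, fixed, inner_axes=("age", "income"), target="happiness"):
    """Return (x1, x2, F) points for records matching the fixed outer values."""
    return [
        (r[inner_axes[0]], r[inner_axes[1]], r[target])
        for r in records
        if all(r[dim] == value for dim, value in fixed.items())
    ]


# Made-up records; field names are hypothetical.
records = [
    {"age": 30, "income": 50, "education": "College", "location": "Urban",
     "spending": "High", "happiness": 7.1},
    {"age": 45, "income": 80, "education": "College", "location": "Urban",
     "spending": "High", "happiness": 6.4},
    {"age": 52, "income": 35, "education": "High School", "location": "Rural",
     "spending": "Low", "happiness": 6.9},
]

# Steps 1 to 3: fix the outer dimensions and plot the inner world at that origin.
origin = {"education": "College", "location": "Urban", "spending": "High"}
print(inner_world(records, origin))

# Step 4: moving the origin in the outer world selects a different inner world.
origin = {"education": "High School", "location": "Rural", "spending": "Low"}
print(inner_world(records, origin))
```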

Thus, given more dimensions, more levels of worlds can be used, which is why the method is
called “worlds-within-worlds.”
As another example of hierarchical visualization methods, tree-maps display hierarchical data as
a set of nested rectangles.
Example of a tree-map visualizing Google news stories:
Top-Level Categories
 The entire dataset is divided into seven main categories, each shown as a large
rectangle:
o Politics
o Sports
o Technology
o Business
o Health
o Entertainment
o Science
Each category is assigned a unique color (e.g., blue for Politics, green for Sports).
Subcategories
 Within each category, the news stories are further divided into subcategories or
individual stories.
o Example:
 Under "Sports":
 Football: A medium-sized rectangle.
 Basketball: A smaller rectangle.
 Tennis: An even smaller rectangle.
o The size of each rectangle reflects the number of news stories in that subcategory.
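How rectangle sizes follow from story counts can be sketched with the simplest tree-map layout, often called slice-and-dice, which splits the available rectangle along alternating directions. This is not necessarily the layout used by the news tree-map in the example, and the story counts below are made up.

```python
def size(node):
    """Number of stories under a node: its own count or the sum of its children."""
    if "children" in node:
        return sum(size(c) for c in node["children"])
    return node["count"]


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out a tree as nested rectangles, splitting along alternating axes."""
    if out is None:
        out = []
    out.append((node["name"], x, y, w, h))
    children = node.get("children", [])
    total = sum(size(c) for c in children)
    offset = 0.0
    for child in children:
        share = size(child) / total
        if depth % 2 == 0:   # split the width at even depths
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                # split the height at odd depths
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Made-up story counts for two of the categories.
news = {"name": "News", "children": [
    {"name": "Sports", "children": [
        {"name": "Football", "count": 6},
        {"name": "Basketball", "count": 3},
        {"name": "Tennis", "count": 1},
    ]},
    {"name": "Politics", "count": 10},
]}

rects = slice_and_dice(news, 0, 0, 100, 100)
for name, x, y, w, h in rects:
    print(f"{name:<10} x={x:.0f} y={y:.0f} w={w:.0f} h={h:.0f} area={w * h:.0f}")
```

The area of each rectangle is proportional to its story count, so Football comes out larger than Basketball, which is larger than Tennis.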
Visualizing Complex Data and Relations
In the early days, data visualization focused primarily on numeric data. Modern technologies
now give us access to a wide variety of complex data types, including textual data, network
data, and multimedia data. Visualizing and analyzing such data attracts a lot of attention.
One common way to visualize non-numeric data, such as text and social media content, is
through tag clouds. A tag cloud is a visualization technique used to display statistics of
user-generated tags (e.g., in blogs, social media, or product reviews). Often, in a tag cloud,
tags are listed alphabetically or in a user-preferred order. The importance of a tag is indicated
by font size or color.

Tag clouds are often used in two ways.


1. Tag Cloud for a Single Item
In this use case, the tag cloud focuses on a single object (e.g., one product, one blog post, or one
image). The size of each tag reflects how often users have applied that specific tag to the item.
How It Works
 Tag Size: Represents the frequency of a tag assigned to the item by different users.
o Larger font = More users applied this tag.
o Smaller font = Fewer users applied this tag.
2. Tag Cloud for Multiple Items
In this use case, the tag cloud summarizes tagging data across multiple items (e.g., all products
in a category, all blog posts on a website, or all images in a gallery). The size of each tag reflects
how many items the tag has been applied to, indicating its popularity across the dataset.
How It Works
 Tag Size: Represents the number of items that a tag has been applied to.
o Larger font = The tag is applied to more items (i.e., it’s more popular).
o Smaller font = The tag is applied to fewer items.
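The two uses differ only in what is counted, as the following sketch shows. The tagging events, the font-size range of 10 to 36 points, and the function name font_sizes are assumptions made for this example.

```python
from collections import Counter


def font_sizes(counts, min_pt=10, max_pt=36):
    """Scale counts linearly onto a font-size range (sizes are assumptions)."""
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1
    return {
        tag: round(min_pt + (c - lo) / span * (max_pt - min_pt))
        for tag, c in sorted(counts.items())  # tags listed alphabetically
    }


# Hypothetical tagging events: (user, item, tag).
events = [
    ("u1", "photo1", "sunset"), ("u2", "photo1", "sunset"),
    ("u3", "photo1", "beach"), ("u1", "photo2", "beach"),
    ("u2", "photo3", "beach"), ("u3", "photo3", "city"),
]

# Use 1: a single item. Size = number of users who applied the tag to it.
single = Counter(tag for user, item, tag in events if item == "photo1")

# Use 2: multiple items. Size = number of distinct items carrying the tag.
multi = Counter(tag for item, tag in {(item, tag) for _, item, tag in events})

print(font_sizes(single))
print(font_sizes(multi))
```

For photo1 alone, "sunset" is the largest tag, but across all items "beach" is the largest because it appears on the most items.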
In addition to complex data, complex relations among data entries also raise challenges for
visualization. For example, in a disease influence graph, nodes represent diseases, and edges
represent correlations between them. This type of visualization is particularly useful for
understanding how different diseases might influence or co-occur with one another.
Example:
Suppose you are visualizing a disease influence graph for common illnesses:
 Nodes:
o Diseases like "Flu" and "Common Cold" would have large nodes because they are
prevalent.
o Diseases like "Ebola" would have small nodes because they are rare.
 Edges:
o Diseases like "Flu" and "Pneumonia" might have thick edges, indicating a strong
correlation (since flu can lead to pneumonia).
o Diseases like "Flu" and "Diabetes" might have thin edges, indicating a weaker
correlation.
This visualization helps researchers and healthcare professionals understand how diseases are
related and how they might influence each other.
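The encoding in this example can be sketched as two small functions, one for node size and one for edge thickness. All numbers below are placeholders chosen only to illustrate the encoding. They are not real medical statistics, and the function names are hypothetical.

```python
# Placeholder values on a 0-to-1 scale; NOT real prevalence or correlation data.
prevalence = {
    "Flu": 0.9, "Common Cold": 1.0, "Pneumonia": 0.3, "Diabetes": 0.5, "Ebola": 0.01,
}
correlation = {("Flu", "Pneumonia"): 0.8, ("Flu", "Diabetes"): 0.2}


def node_radius(disease, min_r=5, max_r=40):
    """More prevalent diseases get larger nodes."""
    return min_r + prevalence[disease] * (max_r - min_r)


def edge_width(a, b, max_w=8):
    """Stronger correlations get thicker edges; unknown pairs get no edge (0.0)."""
    strength = correlation.get((a, b), correlation.get((b, a), 0.0))
    return strength * max_w


for disease in prevalence:
    print(f"{disease}: radius {node_radius(disease):.1f}")
for a, b in correlation:
    print(f"{a} - {b}: width {edge_width(a, b):.1f}")
```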
