
R Data Visualization Techniques

The document discusses various methods for creating graphical representations of data in R including scatter plots, line charts, bar plots, box plots, kernel density plots and dot plots. It provides code examples for generating these different plot types and customizing aspects like colors, labels and titles.

Uploaded by

vedhvirat

BMAT202P- Probability and statistics lab

Table and Graphical Representations


⚫ Importing CSV and Tabular Data Files
We can change the current working directory as follows:
⚫ setwd("<location of the dataset>")

⚫ Example
>setwd("C:\\Users\\admin\\Desktop\\")
>data=read.csv("stud.csv")
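⚫ Comma-separated values (CSV) files
⚫ Data files come in many formats, and R has a loading function
for each; read.csv() loads CSV files.
⚫ Instead of changing the working directory, we can pass the full
path, written with doubled backslashes:

>data=read.csv("C:\\Users\\admin\\Desktop\\Mokesh\\stud.csv")

Or with forward slashes:

>data=read.csv("C:/Users/admin/Desktop/Mokesh/stud.csv")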
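The contents of stud.csv are not shown, so the following is only a minimal sketch of the write-and-read round trip. The columns name and marks are hypothetical, and the file is written to a temporary directory so that the example does not depend on any local path.

```r
# Hypothetical data standing in for stud.csv (its real columns are not shown)
students <- data.frame(name = c("A", "B", "C"), marks = c(72, 85, 64))

# Write the file to a temporary directory so no local path is assumed
csv_path <- file.path(tempdir(), "stud.csv")
write.csv(students, csv_path, row.names = FALSE)

# Read it back and inspect the result
data <- read.csv(csv_path)
str(data)    # column names and types
head(data)   # first rows
```

Checking the result with str() or head() right after loading is a cheap way to catch a wrong separator or a missing header row.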
Graphics on R
⚫ A simple plot, plot(X), draws each element of a variable X
on the y-axis against the element's index on the x-axis
>v <- c(7,12,28,3,41)
>t <- c(14,7,6,19,3)
> plot(v,type = "o", col = "red", xlab = "Month", ylab =
"Rain fall",main = "Rain fall chart")
>lines(t, type = "o", col = "blue")
[Figure: "Rain fall chart", a line plot with Month (1 to 5) on the x-axis and Rain fall on the y-axis]
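plot() fixes the y-axis range from the first series, and lines() does not rescale it. In the example above this is harmless because every value of t lies within the range of v. The sketch below shows one way to handle a second series that goes beyond that range; the series t2 is a made-up variant of t, used only for illustration.

```r
v  <- c(7, 12, 28, 3, 41)
t2 <- c(14, 7, 6, 19, 50)   # hypothetical second series; 50 exceeds max(v)

# Compute a y-range that covers both series before drawing the first one
yr <- range(v, t2)

plot(v, type = "o", col = "red", ylim = yr,
     xlab = "Month", ylab = "Rain fall", main = "Rain fall chart")
lines(t2, type = "o", col = "blue")
legend("topleft", legend = c("v", "t2"),
       col = c("red", "blue"), lty = 1, pch = 1)
```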
⚫ Line chart
⚫ A line chart is a simple plot with consecutive points connected
by lines; the type argument of plot() selects how the points and lines are drawn
[Figure: eight panels showing the same x-y data drawn with type = "p", "l", "o", "b", "c", "s", "S" and "h"]
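A panel figure like the one above can be reproduced with a short loop. This is a sketch with arbitrary illustrative values for x and y, not the data behind the original figure.

```r
x <- 1:5
y <- c(1, 3, 2, 5, 4)   # arbitrary illustrative values

# "p" points, "l" lines, "o" points overplotted on lines,
# "b" points joined by lines, "c" the lines of "b" alone,
# "s" and "S" stair steps, "h" vertical lines
types <- c("p", "l", "o", "b", "c", "s", "S", "h")

op <- par(mfrow = c(2, 4))   # 2 x 4 grid of panels
for (ty in types) {
  plot(x, y, type = ty, main = paste0("type = ", ty))
}
par(op)                      # restore the previous layout
```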
⚫ Scatterplot
A scatterplot, plot(X,Y), draws each element of a variable Y
on the y-axis against the corresponding element
of variable X on the x-axis

# scatterplot (wt in mtcars is recorded in units of 1000 lbs)
>attach(mtcars)
>plot(wt, mpg, main="Weight / MPG graph",
xlab="Car Weight (1000 lbs)", ylab="Miles Per Gallon",
pch=19)
[Figure: "Weight / MPG graph", a scatterplot of Miles Per Gallon (y-axis) against Car Weight (x-axis)]
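As a variation on the scatterplot above, the sketch below uses with() instead of attach(), so that mtcars is not left on the search path, and overlays a least-squares line. The fitted line is an addition for illustration and is not part of the original slides.

```r
# with() evaluates the plot call inside mtcars without attaching it
with(mtcars, plot(wt, mpg, pch = 19,
                  main = "Weight / MPG graph",
                  xlab = "Car Weight (1000 lbs)",
                  ylab = "Miles Per Gallon"))

# Least-squares line of mpg on wt, drawn on top of the points
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, col = "red", lwd = 2)

coef(fit)   # intercept and slope of the fitted line
```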
⚫ Kernel density plots
⚫ Kernel density plots show the shape of a distribution as a smooth curve.
They are often preferable to histograms, even histograms with a normal
curve overlaid, because the shape of a histogram depends strongly on the
number of bins used and on outliers.
⚫ # Kernel density plot
⚫ >d <- density(mtcars$mpg) # kernel density estimates
⚫ >plot(d)
⚫ # Filled density plot
⚫ >d <- density(mtcars$mpg)
⚫ >plot(d, main="Kernel Density of Miles Per Gallon")
⚫ >polygon(d, col="red", border="blue")
[Figure: "Kernel Density of Miles Per Gallon", a filled density curve with Density on the y-axis; N = 32, Bandwidth = 2.477]
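A density plot has its own tuning parameter, the bandwidth, which plays a role similar to the bin count of a histogram. The sketch below compares the default bandwidth with a narrower and a wider one through the adjust argument of density(); the factors 0.5 and 2 are arbitrary choices for illustration.

```r
x <- mtcars$mpg

d_default <- density(x)                # default bandwidth rule
d_narrow  <- density(x, adjust = 0.5)  # half the default bandwidth
d_wide    <- density(x, adjust = 2)    # twice the default bandwidth

# Draw the narrow-bandwidth curve first, then overlay the other two
plot(d_narrow, col = "red",
     main = "Effect of bandwidth on the density of mpg")
lines(d_default, col = "black")
lines(d_wide, col = "blue")
rug(x)                                 # mark the observations on the x-axis
legend("topright", legend = c("adjust = 0.5", "default", "adjust = 2"),
       col = c("red", "black", "blue"), lty = 1)

d_default$bw   # the bandwidth reported under the default plot
```

A smaller bandwidth follows the data more closely and produces more bumps; a larger one smooths them away.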
⚫ Barplot
⚫ barplot(X) draws a bar plot. If X is a vector, its
elements are the heights of the bars. If X is a
matrix, each column gives one bar, and the values in that
column are stacked on top of each other within the bar.

⚫ If the argument beside=TRUE, then the values in each
column are juxtaposed, not stacked.

⚫ The argument horiz=TRUE creates a horizontal
barplot.
⚫ > # simple barplot
⚫ > barplot (VADeaths[,"Rural Male"])

[Figure: bar plot of the "Rural Male" column of VADeaths, one bar per age group from 50-54 to 70-74]


⚫ > # stacked barplots
⚫ > barplot(VADeaths[,c("Rural Male", "Rural Female")])

[Figure: stacked bar plot with one bar each for Rural Male and Rural Female]


⚫ > # juxtaposed barplots
⚫ > barplot(VADeaths[,c("Rural Male", "Rural Female")],beside=TRUE)

[Figure: juxtaposed bar plot with one group of bars each for Rural Male and Rural Female]
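The horiz=TRUE argument is described above but not demonstrated. The sketch below combines it with beside=TRUE on the same two columns of VADeaths; the colors, legend and titles are illustrative choices.

```r
rural <- VADeaths[, c("Rural Male", "Rural Female")]

# beside = TRUE puts the age groups side by side within each column;
# horiz = TRUE turns the bars so that they run left to right
mids <- barplot(rural, beside = TRUE, horiz = TRUE,
                col = gray.colors(nrow(rural)),
                legend.text = rownames(rural),
                xlab = "Deaths per 1000",
                main = "VADeaths, rural population")

dim(mids)   # barplot() returns the bar midpoints, one per bar
```

The returned midpoints are useful when adding text or error bars to the bars afterwards.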


>H <- c(7,12,28,3,41)
>M <- c("Mar","Apr","May","Jun","Jul")
>barplot(H,names.arg = M,xlab = "Month",ylab =
"Revenue",col="blue",main = "Revenue chart")
[Figure: "Revenue chart", a bar plot with Month (Mar to Jul) on the x-axis and Revenue on the y-axis]
Example :-
>colors <- c("green","orange","brown")
>months <- c("Mar","Apr","May","Jun","Jul")
>regions <- c("East","West","North")
>Values <-
matrix(c(2,9,3,11,9,4,8,7,3,12,5,2,8,10,11),nrow = 3,ncol
= 5,byrow =TRUE)
>barplot(Values,main = "total revenue",names.arg =
months,xlab = "month",ylab = "revenue",col=colors)
>legend("topleft", regions, cex = 1.3, fill = colors)
[Figure: "total revenue", a stacked bar plot with month (Mar to Jul) on the x-axis, revenue on the y-axis and a legend for East, West and North]
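In a stacked bar plot the height of each bar is the column sum of the matrix. The sketch below makes that explicit for the revenue matrix above, and attaches the region and month names to the matrix itself so that barplot() can pick them up. Printing the totals above the bars is an illustrative addition.

```r
Values <- matrix(c(2, 9, 3, 11, 9,
                   4, 8, 7, 3, 12,
                   5, 2, 8, 10, 11),
                 nrow = 3, ncol = 5, byrow = TRUE,
                 dimnames = list(c("East", "West", "North"),
                                 c("Mar", "Apr", "May", "Jun", "Jul")))

totals <- colSums(Values)   # height of each stacked bar
totals

# legend.text = TRUE takes the legend labels from the row names
bp <- barplot(Values, main = "total revenue",
              xlab = "month", ylab = "revenue",
              col = c("green", "orange", "brown"),
              ylim = c(0, max(totals) * 1.3),
              legend.text = TRUE, args.legend = list(x = "topleft"))

# Write each total just above its bar
text(bp, totals, labels = totals, pos = 3)
```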
⚫ # Simple Dotplot
>dotchart(mtcars$mpg,labels=row.names(mtcars),cex=.7,
main="Gas Mileage for Car Models",xlab="Miles Per Gallon")
⚫ # Dotplot: Grouped Sorted and Colored
⚫ # Sort by mpg, group and color by cylinder
⚫ >x <- mtcars[order(mtcars$mpg),] # sort by mpg
⚫ >x$cyl <- factor(x$cyl) # it must be a factor
⚫ >x$color[x$cyl==4] <- "red"
⚫ >x$color[x$cyl==6] <- "blue"
⚫ >x$color[x$cyl==8] <- "darkgreen"
⚫ >dotchart(x$mpg,labels=row.names(x),cex=.7,groups=x$cyl,
main="Gas Mileage for Car Models\ngrouped by cylinder",
xlab="Miles Per Gallon",gcolor="black",color=x$color)
[Figure: "Gas Mileage for Car Models grouped by cylinder", a dot chart with Miles Per Gallon (10 to 30) on the x-axis and one row per car model, grouped into 4, 6 and 8 cylinders]


⚫ Pie
⚫ pie(x) draws a circle (pie) cut into segments (slices), one slice per
element of x; the size of each slice is proportional to that element's
share of sum(x). To show the relative frequency of each unique value of a
variable, pass its frequency table, table(x), to pie().
# simple pie
>counts <- table(mtcars$cyl) # number of cars for each cylinder count
>pie(counts, labels = names(counts), main="Pie Chart of N. of cylinders")
# pie with percentages and colors
>with(mtcars, {
counts <- table(cyl)
percent.cyl <- round(counts/sum(counts)*100,2)
lbls <- paste(names(counts)," cyl=",percent.cyl,"%", sep="")
pie(counts, labels = lbls, main="Pie Chart of N. of cylinders",
col=rainbow(length(lbls)))})
[Figure: "Pie Chart of N. of cylinders" with slices labelled 4 cyl=34.38%, 6 cyl=21.88% and 8 cyl=43.75%]
>x <- c(21, 62, 10, 53)
>labels <- c("London", "New York", "Singapore",
"Mumbai")
>pie(x,labels)
[Figure: pie chart with slices labelled London, New York, Singapore and Mumbai]
>x = c(21, 62, 10, 53)
>labels = c("London", "New York", "Singapore",
"Mumbai")
>pie(x, labels, main = "City pie chart", col =
rainbow(length(x)))
[Figure: "City pie chart", the same pie drawn with rainbow colors]
>x <- c(21, 62, 10,53)
>labels <- c("London","New York","Singapore","Mumbai")
>piepercent<- round(100*x/sum(x), 1)
>pie(x, labels = piepercent, main = "City pie chart",col =
rainbow(length(x)))
>legend("topright", c("London","New York","Singapore",
"Mumbai"), cex = 0.8,fill = rainbow(length(x)))
[Figure: "City pie chart" with slices labelled by percentage (14.4, 42.5, 6.8, 36.3) and a legend for London, New York, Singapore and Mumbai]
⚫ histogram
⚫ hist(X) draws a histogram: a bar plot with the ranges (bins) of the values
in X on the x-axis and the frequency of the values in each bin on the y-axis
⚫ A cumulative distribution curve shows, for each position on the x-axis,
the proportion of the values of X that are at or below that position

⚫ > # simple histogram


⚫ > hist(faithful$waiting)
[Figure: "Histogram of faithful$waiting", with faithful$waiting (40 to 100) on the x-axis and Frequency on the y-axis]
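# xx and yy are not defined in the original; here they are assumed to be
# the bin midpoints and bin densities of the histogram
>h <- hist(faithful$waiting, plot = FALSE)
>xx <- h$mids
>yy <- h$density
# draw the histogram
>hist(faithful$waiting, prob = TRUE, xlim = range(h$breaks), border =
"gray", col = "gray90")
# adds the frequency polygon
>lines(xx, yy, lwd=2, col = "royalblue")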
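The cumulative distribution curve described above has no code on the slide. One way to draw it, offered here as a sketch rather than as the author's method, is the empirical cumulative distribution function from ecdf(). The cut-off of 70 minutes is an arbitrary example value.

```r
w <- faithful$waiting

Fn <- ecdf(w)   # step function: proportion of observations <= x

plot(Fn, main = "Cumulative distribution of waiting time",
     xlab = "Waiting time (min)",
     ylab = "Proportion of observations <= x")

Fn(70)          # proportion of waits of at most 70 minutes
mean(w <= 70)   # the same proportion computed directly
```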
⚫ boxplot
boxplot(X) draws a box-and-whisker plot of the values of variable X.
It is an effective way to summarize larger datasets.

# Boxplot of MPG by Car Cylinders

> boxplot(mpg~cyl,data=mtcars, main="Car Mileage Data",
xlab="Number of Cylinders", ylab="Miles Per Gallon")
[Figure: "Car Mileage Data", box plots of Miles Per Gallon (y-axis) for 4, 6 and 8 cylinders (x-axis, Number of Cylinders)]
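The numbers behind each box can be inspected without drawing anything. The sketch below calls boxplot() with plot = FALSE and compares the middle row of the returned statistics with the group medians.

```r
# plot = FALSE returns the summary statistics instead of drawing
b <- boxplot(mpg ~ cyl, data = mtcars, plot = FALSE)

b$names   # one group per cylinder count
b$stats   # rows: lower whisker, lower hinge, median,
          #       upper hinge, upper whisker
b$out     # points drawn individually as outliers, if any

# The third row of b$stats holds the group medians
tapply(mtcars$mpg, mtcars$cyl, median)
```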
⚫ Pairs
⚫ pairs(X) draws a matrix of scatterplots, one for each pair of columns of X

⚫ >pairs(~mpg+disp+drat+wt,data=mtcars,
main="Scatterplot Matrix MPG, Displacement, Rear axle ratio, Weight")
[Figure: scatterplot matrix of mpg, disp, drat and wt, titled "Scatterplot Matrix MPG, Displacement, Rear axle ratio, Weight"]
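pairs() also accepts a data frame directly, and its panel argument changes what is drawn in each cell. The sketch below adds a smooth curve to every panel with panel.smooth and prints the correlation matrix as a numeric companion to the plot. Both additions are illustrative and not part of the original slides.

```r
vars <- mtcars[, c("mpg", "disp", "drat", "wt")]

# Same scatterplot matrix, with a smooth trend curve in every panel
pairs(vars, panel = panel.smooth,
      main = "Scatterplot Matrix with smooth curves")

# Pairwise correlations for the same four columns
cm <- cor(vars)
round(cm, 2)
```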


⚫ Contour
⚫ contour(X,Y,Z) draws a contour plot, with vector X for the rows,
vector Y for the columns and matrix Z for the data
>x <- 10*(1:nrow(volcano)); x.at <- seq(100, 800, by=100)
>y <- 10*(1:ncol(volcano)); y.at <- seq(100, 600, by=100)
# Using Terrain Colors
>image(x, y, volcano, col=terrain.colors(100),axes=FALSE)
>contour(x, y, volcano, levels=seq(90, 200, by=5), add=TRUE,
col="brown")
>axis(1, at=x.at)
>axis(2, at=y.at)
>box()
>title(main="Maunga Whau Volcano", sub =
"col=terrain.colors(100)", font.main=4)
[Figure: "Maunga Whau Volcano", a terrain-colored image with brown contour lines; x-axis 100 to 800, y-axis 100 to 600; subtitle "col=terrain.colors(100)"]
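To see how the X, Y and Z arguments fit together, it can help to build the matrix by hand. The sketch below uses a made-up surface, a two-dimensional bell shape, rather than the volcano data; outer() fills z so that z[i, j] is the height at (x[i], y[j]).

```r
x <- seq(-2, 2, length.out = 50)
y <- seq(-2, 2, length.out = 40)

# z[i, j] = f(x[i], y[j]); rows follow x and columns follow y
z <- outer(x, y, function(a, b) exp(-(a^2 + b^2)))
dim(z)   # one row per x value, one column per y value

contour(x, y, z, nlevels = 8, xlab = "x", ylab = "y",
        main = "Contours of exp(-(x^2 + y^2))")
```

The same x, y and z can be passed to image() or persp() unchanged, since all three functions share this layout.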
⚫ Persp
persp(X,Y,Z) draws a 3D surface plot, with vector X for the rows, vector Y
for the columns and matrix Z for the data

# # (2) Visualizing a simple DEM model


>z <- 2 * volcano # Exaggerate the relief
>x <- 10 * (1:nrow(z)) # 10 meter spacing (S to N)
>y <- 10 * (1:ncol(z)) # 10 meter spacing (E to W)
>persp(x, y, z, theta = 120, phi = 15, scale = FALSE, axes =
FALSE)
⚫ Tables
Example:
> library(MASS)
>ships
> table(ships$type)

>table(ships$type,ships$year)
Example :-
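>library(MASS)
>USArrests # built-in dataset, available even without MASS
>table(USArrests[,3]) # frequency of each value in column 3 (UrbanPop)
>table(cut(USArrests[,3],pretty(USArrests[,3]))) # frequencies by class interval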

Example :-
> airquality
> table(airquality[,4],airquality[,5])
>table(cut(airquality[,4],pretty(airquality[,4])),
airquality[,5])
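The airquality example above refers to columns by position. The sketch below repeats it with column names (column 4 is Temp and column 5 is Month) and adds row and column totals and within-month proportions. The totals and proportions are illustrative additions.

```r
# Class intervals for temperature, with "pretty" break points
temp_class <- cut(airquality$Temp, breaks = pretty(airquality$Temp))

# Two-way frequency table: temperature class by month
tab <- table(temp_class, Month = airquality$Month)
tab

addmargins(tab)                         # add row and column totals
round(prop.table(tab, margin = 2), 2)   # proportions within each month
```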
Example :-

> library(MASS)
> cars
