Advanced R Programming GGPLOT2 Notes

The document provides an overview of the ggplot2 package in R, which is used for creating complex data visualizations with a programmatic interface. It outlines the building blocks of ggplot2, including data, aesthetics, geometries, facets, statistics, coordinates, and themes, along with examples of how to implement these features. Additionally, it discusses the installation process, data formatting, and customization options for creating high-quality plots.

ggplot2 Packages

ggplot2:
ggplot2 is a plotting package that provides helpful commands
to create complex plots from data in a data frame. It provides a
more programmatic interface for specifying what variables to
plot, how they are displayed, and general visual properties.
Therefore, we only need minimal changes if the underlying data
change or if we decide to change from a bar plot to a
scatterplot. This helps in creating publication quality plots with
minimal amounts of adjustments and tweaking.

ggplot2 plots work best with data in the ‘long’ format, i.e., a
column for every variable and a row for every observation.
Well-structured data will save a lot of time when making figures
with ggplot2.
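As a rough sketch of what reshaping to the long format can look like, the snippet below uses base R's reshape() on a small made-up table. The data frame `wide` and its column names are hypothetical and are not part of these notes.

```r
library(ggplot2)

# Hypothetical wide data: one column per year (not from the notes)
wide <- data.frame(country = c("A", "B"),
                   y2020 = c(1, 2),
                   y2021 = c(3, 4))

# Reshape to long format: one row per country-year observation
long <- reshape(wide,
                direction = "long",
                varying = c("y2020", "y2021"),
                v.names = "value",
                timevar = "year",
                times = c(2020, 2021),
                idvar = "country")

# Each variable is now a single column that can be mapped inside aes()
p <- ggplot(long, aes(x = year, y = value, colour = country)) +
  geom_line()
print(p)
```

In the long table, year and value are ordinary columns, so switching to another geometry only requires changing the geom.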

ggplot2 is a free, open-source, and easy-to-use visualization package
for the R Programming Language, written by Hadley Wickham. It
implements the Grammar of Graphics and is one of the most widely used
visualization packages in R. A plot is built up from several layers,
described below.

Building blocks of layers in the grammar of graphics

• Data: the data set itself
• Aesthetics: the mapping of data variables onto aesthetic attributes such as
x-axis, y-axis, color, fill, size, labels, alpha, shape, line width, line type
• Geometries: how the data are displayed, e.g. point, line,
histogram, bar, boxplot
• Facets: display subsets of the data in columns and rows
• Statistics: binning, smoothing, descriptive, intermediate
• Coordinates: the space between data and display, e.g. Cartesian,
fixed, polar, limits
• Themes: non-data ink
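To see how the building blocks fit together, here is a minimal sketch that uses one call per layer on the built-in mtcars data. The choice of variables, limits, and theme_bw() is only illustrative and is not taken from the notes.

```r
library(ggplot2)

p <- ggplot(data = mtcars,                        # data
            aes(x = wt, y = mpg)) +               # aesthetics
  geom_point() +                                  # geometry
  facet_grid(. ~ am) +                            # facets
  stat_smooth(method = "lm", formula = y ~ x) +   # statistics
  coord_cartesian(ylim = c(5, 40)) +              # coordinates
  theme_bw()                                      # theme
print(p)
```

Each `+` adds one layer or setting, so a block can be removed or swapped without touching the others.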

Note: Before starting to program, we first need to install the ggplot2 package
and then load the required packages.
#Install ggplot2 package
install.packages("ggplot2")
#Load ggplot2 package
library(ggplot2)
#Load dplyr package (install it first with install.packages("dplyr") if needed)
library(dplyr)
Data set used
ggplot(data=mtcars)+labs(title="MTcars Data plot")
print(mtcars)
1) Aesthetic: Data variables are mapped onto aesthetic attributes such as
x-axis, y-axis, color, fill, size, labels, alpha, shape, line width, line type.
Without a geometric layer, the example below draws only the axes.
Example
ggplot(data=mtcars,aes(x=hp,y=mpg,col=disp))+labs(title="MTcars Data plot")
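A point that often causes confusion is the difference between mapping an aesthetic to a variable and setting it to a constant. The sketch below, using mtcars, is one way to illustrate it; the object names p_mapped and p_fixed are made up for this example.

```r
library(ggplot2)

# Mapping: colour depends on a variable, so it goes inside aes()
p_mapped <- ggplot(mtcars, aes(x = hp, y = mpg, colour = disp)) +
  geom_point()

# Setting: a constant colour goes outside aes(), as an argument of the geom
p_fixed <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point(colour = "red")

print(p_mapped)
print(p_fixed)
```

Writing colour = "red" inside aes() would instead treat "red" as a data value, which produces a legend and a default colour rather than red points.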

2) Geometric layer: How our data being displayed using point, line,
histogram, bar, boxplot
Example
Set the colour and size
#A fixed colour or size is set outside aes(), in the geom itself
ggplot(data=mtcars,aes(x=hp,y=mpg))+geom_point(col="red",size=3)+
labs(title="Miles per gallon vs horse power",
x="Horse power",
y="Miles per Gallon")
Example
Adding shape and color
ggplot(data=mtcars,aes(x=hp,y=mpg,col=factor(cyl),shape=factor(am)))+
geom_point()+labs(title="Miles per gallon vs horse power",
x="Horse power",
y="Miles per Gallon")
Example
Load the class data set
setwd("E:\\R programme\\R.Directory")
class<-read.csv("E:\\R programme\\R.Directory\\CLASS (2).CSV")
print(class)
View(class)

Plot weight against height
ggplot(data=class,aes(x=Height,y=Weight,col=factor(Age),shape=factor(Sex)))+
geom_point()+labs(title="Weight vs Height",
x="Height",
y="Weight")

Example
ggplot(data=class,aes(x=Age))+geom_histogram(binwidth=0.5)+
labs(title="Class Information",
x="Age",
y="Count")
3) Facets layer: The facet layer splits the data into subsets and displays
each subset in its own panel of the same plot.
Example
Separate rows
dm<-ggplot(data=class,aes(x=Weight,y=Height,shape=factor(Sex)))+
geom_point()
dm+facet_grid(Age~.)+labs(title="Class Information",
x="Weight",
y="Height")
Separate columns
p<-ggplot(data=class,aes(x=Weight,y=Height,shape=factor(Sex)))+
geom_point()
p+facet_grid(.~Age)+
labs(title="Class Information",
x="Weight",
y="Height")
print(mtcars)
Separate rows according to transmission type
p <- ggplot(data = mtcars, aes(x = hp, y = mpg, shape = factor(cyl))) +
geom_point()
p + facet_grid(am ~ .) +
labs(title = "Miles per Gallon vs Horsepower",
x = "Horsepower",
y = "Miles per Gallon")

Separate columns according to cylinders


p <- ggplot(data = mtcars, aes(x = hp, y = mpg, shape = factor(cyl))) +
geom_point()

p + facet_grid(. ~ cyl) +
labs(title = "Miles per Gallon vs Horsepower",
x = "Horsepower",
y = "Miles per Gallon")
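The row and column forms can also be combined in a single facet_grid() call. The following is a small sketch on mtcars; the use of label_both to print the variable name in each strip is an optional choice made for this example.

```r
library(ggplot2)

# Rows by transmission type (am), columns by number of cylinders (cyl)
p <- ggplot(data = mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  facet_grid(am ~ cyl, labeller = label_both) +
  labs(title = "Miles per Gallon vs Horsepower",
       x = "Horsepower",
       y = "Miles per Gallon")
print(p)
```

The formula reads rows ~ columns, so am ~ . gives rows only and . ~ cyl gives columns only.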

4) Statistics: In this layer, we transform the data using binning,
smoothing, descriptive, and intermediate statistics.
Example
ggplot(data=class,aes(x=Weight,y=Age))+geom_point()+
stat_smooth(method=lm,col="red")+labs(title="class Information")
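To make the smoothing step less of a black box, the sketch below fits the same kind of linear model directly with lm() and overlays it on the smoothed line. It uses mtcars rather than the class data so that it runs without the CSV file; the dashed abline is added purely for comparison.

```r
library(ggplot2)

# Fit the straight line that stat_smooth(method = "lm") estimates internally
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))

# Overlay the smoothed line (red) and the manually fitted line (dashed)
p <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x, col = "red") +
  geom_abline(intercept = unname(coef(fit)[1]),
              slope = unname(coef(fit)[2]),
              linetype = "dashed")
print(p)
```

The two lines should coincide, since both come from a least-squares fit of mpg on wt.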

5) Coordinates layer: In this layer, the data coordinates are mapped onto
the plane of the graphic. Here we adjust the axes and change the spacing of
the displayed data to control the plot dimensions.
Example
ggplot(data=mtcars,aes(x=wt,y=mpg))+geom_point()+
stat_smooth(method=lm,col="red")+
scale_y_continuous("Miles per gallon",limits=c(2,35),expand=c(0,0))+
scale_x_continuous("Weight",limits=c(0,25),expand=c(0,0))+coord_equal()+
labs(title="Miles per gallon",
x="Weight",
y="Miles per gallon")

coord_cartesian(): The coord_cartesian() function zooms in or out on a figure
in ggplot2 without affecting the underlying data. This is useful when the data
contain extreme outliers that obscure the subtleties of the remaining data.
Example
The xlim and ylim parameters of the coord_cartesian() function specify the
x- and y-axis boundaries
ggplot(data=class,aes(x=Age,y=Height))+geom_point(col="pink")+
geom_smooth()+
coord_cartesian(xlim=c(10,16))
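The practical difference between zooming with coord_cartesian() and setting limits on a scale can be sketched as follows. This uses mtcars so that it runs on its own; the range c(2, 4) and the object names are arbitrary choices for the illustration.

```r
library(ggplot2)

base <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)

# Zoom only: every row still contributes to the fitted line
p_zoom <- base + coord_cartesian(xlim = c(2, 4))

# Scale limits: rows with wt outside [2, 4] are dropped (with a warning)
# before the line is fitted, so the fit can change
p_clip <- base + scale_x_continuous(limits = c(2, 4))

# Number of rows that the scale limits would discard
n_outside <- sum(mtcars$wt < 2 | mtcars$wt > 4)
print(n_outside)

print(p_zoom)
print(p_clip)
```

When the goal is only to look more closely at part of a plot, coord_cartesian() is usually the safer choice because the statistics are still computed on the full data.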
#Create bar plot.
Example
data<-data.frame(language=c("python","SAS","c++","R","Javascript"),
popularity=c(14,27,21,20,10))
print(data)
ggplot(data,aes(x=language,y=popularity))+
geom_bar(stat="identity",fill="steelblue")+
labs(title="programming language")
6) Theme layer: The theme layer is used for customizations ranging from
changing the location of the legend to setting the background color of the
plot. It controls the finer points of display, such as font size and
background color.
Example
ggplot(data=class,aes(x=Weight,y=Height))+geom_point()+facet_grid(.~Age)+
theme(plot.background=element_rect(fill="pink",colour="grey"))+
labs(title="class information")
Example
ggplot(data=class,aes(x=Weight,y=Height))+geom_point()+facet_grid(Sex~Age)+
theme()+
labs(title="class information")
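The legend location and font size mentioned above can be set through arguments of theme(). Below is a minimal sketch on mtcars; the particular values ("bottom", size 14, white panel) are only illustrative.

```r
library(ggplot2)

p <- ggplot(data = mtcars, aes(x = hp, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(title = "Miles per Gallon vs Horsepower", colour = "Cylinders") +
  theme(legend.position = "bottom",                # move the legend
        text = element_text(size = 14),            # base font size
        panel.background = element_rect(fill = "white", colour = "grey"))
print(p)
```

Arguments passed to theme() change only the non-data parts of the plot; the points themselves are unaffected.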

Creating a panel of different plots

install.packages("gridExtra")
library(gridExtra)

Example
Step-1:selecting specific columns
selected_cols<-c("mpg","disp","hp","drat")
selected_data<-mtcars[,selected_cols]
selected_data
Step-2 creating histogram for each column
hist_plot_mgp<-ggplot(selected_data,aes(x=mpg))+
geom_histogram(binwidth=2,fill="blue",color="white")+
labs(title="Histogram:Miles per gallon",
x="Miles",y="frequency")

hist_plot_disp<-ggplot(selected_data,aes(x=disp))+
geom_histogram(binwidth=50,fill="red",color="white")+
labs(title="Histogram:displacement",
x="displacement",y="frequency")

hist_plot_hp<-ggplot(selected_data,aes(x=hp))+
geom_histogram(binwidth=20,fill="green",color="white")+
labs(title="Histogram:horse power",
x="horse power",y="frequency")

hist_plot_drat<-ggplot(selected_data,aes(x=drat))+
geom_histogram(binwidth=0.5,fill="orange",color="white")+
labs(title="Histogram:drat",
x="drat",y="frequency")

Step-3: Arrange the plots
grid.arrange(hist_plot_mgp,hist_plot_disp,hist_plot_hp,hist_plot_drat,ncol=3)
Step-4: Save the graph in different formats
sav<-grid.arrange(hist_plot_mgp,hist_plot_disp,hist_plot_hp,hist_plot_drat,ncol=2)
#save file png
ggsave("sav.png",sav)
#save file pdf
ggsave("sav.pdf",sav)
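For publication-quality output it is often useful to control the size and resolution of the saved file. The sketch below shows one way to do this with the width, height, units, and dpi arguments of ggsave(); the file name and the temporary directory are assumptions made so the example does not overwrite anything.

```r
library(ggplot2)

p <- ggplot(data = mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "blue", color = "white") +
  labs(title = "Histogram: Miles per gallon", x = "Miles", y = "frequency")

# Hypothetical output path in the session's temporary directory
out_file <- file.path(tempdir(), "mpg_hist.png")

# Save at 6 x 4 inches and 300 dots per inch
ggsave(out_file, plot = p, width = 6, height = 4, units = "in", dpi = 300)
```

The file format is taken from the file extension, so changing ".png" to ".pdf" saves a PDF instead.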
