R PROGRAMMING LAB(20) (1)
To evolve as a centre of excellence for nurturing computer professionals with research and
innovation skills, inculcating moral values and societal concerns.
MISSION
M1: To transform students into creative computer engineers to meet global challenges.
M2: To produce competent and quality professionals by imparting computer concepts and
techniques and a zest for research and higher studies.
M3: To build entrepreneurial skills and leadership qualities in the students by inculcating the spirit
of ethical values.
LIST OF PROGRAMS:
1. Download and install the R programming environment and install basic packages using the install.packages() command in R.
2. Learn all the basics of R programming (data types, variables, operators, etc.).
3. Implement R loops with different examples.
4. Learn the basics of functions in R and implement with examples.
5. Implement data frames in R. Write a program to join columns and rows in a data frame using cbind() and rbind() in R.
6. Implement different string manipulation functions in R.
7. Implement different data structures in R (vectors, lists, data frames).
8. Write a program to read a CSV file and analyze the data in the file in R.
9. Create pie charts and bar charts using R.
10. Create a dataset and do statistical analysis on the data using R.
11. Write an R program to find correlation and covariance.
12. Write an R program for regression modeling.
13. Write an R program to build a classification model using the KNN algorithm.
14. Write an R program to build a clustering model using the K-means algorithm.
INDEX
1. Download and install the R programming environment and install basic packages using the install.packages() command in R.
2. Learn all the basics of R programming (data types, variables, operators, etc.).
3. Implement R loops with different examples.
7. Implement different data structures in R (vectors, lists, data frames).
8. Write a program to read a CSV file and analyze the data in the file in R.
9. Create pie charts and bar charts using R.
13. Write an R program to build a classification model using the KNN algorithm.
14. Write an R program to build a clustering model using the K-means algorithm.
Brief Introduction to the R Programming Language:
R is an open-source programming language that is widely used as statistical software and as a data
analysis tool. R generally comes with a command-line interface. R is available across widely used
platforms such as Windows, Linux, and macOS, and it remains a popular tool for statistical computing.
It was designed by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand,
and is currently developed by the R Development Core Team. The R programming language is an
implementation of the S programming language, combined with lexical scoping semantics
inspired by Scheme. The project was conceived in 1992, with an initial version released in 1995 and a
stable beta version in 2000.
Use of R Programming:
It's a platform-independent language. This means it can be used on all operating systems.
It's an open-source, free language. That means anyone can install it in any organization without
purchasing a license.
R programming is used as a leading tool for machine learning, statistics, and data analysis. Objects,
functions, and packages can easily be created in R.
The R programming language is not only a statistics package but also allows us to integrate with
other languages (C, C++). Thus, it can easily interact with many data sources and statistical
packages.
The R programming language has a vast community of users, and it's growing day by day.
R is currently one of the most requested programming languages in the data science
job market, which makes it a popular choice nowadays.
1. Installation of RStudio on Windows:
Step 3: Run the .exe and follow the installation instructions.
Click Next on the welcome window.
Enter/browse the path to the installation folder and click Next to proceed.
Click Finish to end the installation.
Output:
Install the R Packages:-
First, run RStudio.
After clicking on the Packages tab, click on Install. The following dialog box will
appear.
In the Install Packages dialog, write the name of the package you want to install in
the Packages field and then click Install. This will install the package you searched
for or give you a list of matching packages based on the text you typed.
Installing Packages:-
install.packages("PackageName")
# Install the package named "XML".
install.packages("XML")
Loading Packages:-
Once the package is downloaded to your computer, you can access the functions and
resources provided by the package in two different ways:
# Load the package to use in the current R session
library(packagename)
# Or call a single function without loading the package: packagename::functionname()
Getting Help on Packages:-
"C:/ProgramFiles/R/R-3.2.2/library"
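A script that is run repeatedly does not need to reinstall a package every time. The following is a minimal sketch, assuming a hypothetical helper named is_installed() (it is not part of R or of this manual), that checks for a package with requireNamespace() before deciding whether install.packages() is needed. The base package "stats" is used only so that the example runs without a network connection.

```r
# Hypothetical helper (not from the manual): TRUE if the package can be loaded.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

pkg <- "stats"  # illustrative choice; "stats" ships with every R installation

if (!is_installed(pkg)) {
  # Only reached when the package is missing; this step needs a network connection.
  install.packages(pkg)
}

# Load the package for the current session; character.only lets us pass a string.
library(pkg, character.only = TRUE)
print(is_installed(pkg))
```

Replacing "stats" with another package name, such as "XML", would trigger the installation step only on machines where that package is absent.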
2. Learn all the basics of R programming (data types, variables, operators, etc.)
Program Description:
Variables are reserved memory locations used to store values. This means that when you create a
variable, you reserve some space in memory.
A variable provides us with named storage that our programs can manipulate. A variable in R can store
an atomic vector, a group of atomic vectors, or a combination of many R objects. A valid variable name
consists of letters, numbers, and the dot or underscore characters. The variable name starts with a letter,
or with a dot not followed by a number.
An operator is a symbol that tells the interpreter to perform specific mathematical or logical
manipulations. The R language is rich in built-in operators, including arithmetic, relational, logical, and
assignment operators.
Data Types:
Numeric:
v <- 23.5
print(class(v))
Logical:
v <- TRUE
print(class(v))
Integer:
v <- 2L
print(class(v))
Output:
R-objects:
Vectors
Lists
Matrices
Arrays
Factors
Data Frames
Vectors
When you want to create a vector with more than one element, you should use the c() function, which
combines the elements into a vector.
# Create a vector.
apple <- c('red', 'green', "yellow")
print(apple)
# Get the class of the vector.
print(class(apple))
Output:
Lists
A list is an R-object which can contain many different types of elements inside it, such as vectors,
functions, and even another list.
# Create a list.
list1 <- list(c(2, 5, 3), 21.3, sin)
# Print the list.
print(list1)
Output:
Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix
function.
# Create a matrix.
M = matrix(c('a', 'a', 'b', 'c', 'b', 'a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
Output:
Arrays
While matrices are confined to two dimensions, arrays can have any number of dimensions. The array()
function takes a dim argument that sets the size of each dimension. In the example below
we create an array made up of two 3x3 matrices.
# Create an array.
a <- array(c('green', 'yellow'), dim = c(3, 3, 2))
print(a)
Output:
Factors
Factors are R-objects that are created from a vector. A factor stores the vector along with the distinct
values of its elements as labels. The labels are always character, irrespective of whether
the input vector is numeric, character, Boolean, etc. They are useful in statistical modeling.
Factors are created using the factor() function. The nlevels() function gives the count of levels.
# Create a vector.
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')
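The listing above stops after the vector is created. A minimal sketch of the remaining steps described in the text, assuming the variable name factor_apple (it does not appear in the manual), might look like this:

```r
# Vector of colours taken from the listing above.
apple_colors <- c('green', 'green', 'yellow', 'red', 'red', 'red', 'green')

# Create a factor object from the vector (factor_apple is an assumed name).
factor_apple <- factor(apple_colors)

# Print the factor; the distinct values are stored as its levels.
print(factor_apple)

# Count the number of levels.
print(nlevels(factor_apple))
```

The levels are stored in sorted order, so levels(factor_apple) lists the three distinct colours alphabetically.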
Output:
Variables:
Variables can be assigned values using the leftward, rightward, and equal-to operators. The values
of the variables can be printed using the print() or cat() function. The cat() function combines multiple items
into a continuous print output.
# Assignment using equal operator.
var.1 = c(0, 1, 2, 3)
# Assignment using leftward operator.
var.2 <- c("learn", "R")
# Assignment using rightward operator.
c(TRUE, 1) -> var.3
print(var.1)
cat("var.1 is ", var.1, "\n")
cat("var.2 is ", var.2, "\n")
cat("var.3 is ", var.3, "\n")
Output:
R Operators:
Types of Operators
Arithmetic Operators
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)
print(v + t)
Relational Operators
v <- c(2, 5.5, 6, 9)
t <- c(8, 2.5, 14, 9)
print(v > t)
Logical Operators
v <- c(3, 1, TRUE, 2+3i)
t <- c(4, 1, FALSE, 2+3i)
print(v & t)
Assignment Operators
v1 <- c(3, 1, TRUE, 2+3i)
v2 <<- c(3, 1, TRUE, 2+3i)
v3 = c(3, 1, TRUE, 2+3i)
print(v1)
print(v2)
print(v3)
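Beyond the operators shown above, R also provides the remainder (%%), integer division (%/%), exponent (^), and membership (%in%) operators. The following is a small sketch that reuses the vectors v and t from the arithmetic example; the variable names remainder, quotient, power, and found are chosen here for illustration.

```r
v <- c(2, 5.5, 6)
t <- c(8, 3, 4)

# Remainder and integer quotient of each element of v divided by t.
remainder <- v %% t
quotient <- v %/% t
print(remainder)
print(quotient)

# Exponentiation, applied element by element.
power <- v ^ 2
print(power)

# %in% tests whether each element on the left occurs in the vector on the right.
found <- c(8, 5) %in% t
print(found)
```

All of these work element by element, in the same way as the + and > examples.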
Output:
3. Implement R loops with different examples.
Program Description:
A for loop is one of the most commonly used control flow statements. It iterates over the elements of a
vector or list and runs the body once for each element. A while loop, by contrast, checks a condition
before each execution of the body and keeps repeating for as long as the condition is TRUE.
# Create fruit vector
fruit <- c('Apple', 'Orange', "Guava", 'Pineapple', 'Banana', 'Grapes')
# Create the for statement
for (i in fruit) {
  print(i)
}
Output:
# Creating a matrix
mat <- matrix(data = seq(10, 21, by = 1), nrow = 6, ncol = 2)
# Creating the loop with r and c to iterate over the matrix
for (r in 1:nrow(mat))
  for (c in 1:ncol(mat))
    print(paste("mat[", r, ",", c, "]=", mat[r, c]))
print(mat)
Output:
R while loop:
v <- c("Hello", "while loop")
cnt <- 2
while (cnt < 7) {
  print(v)
  cnt = cnt + 1
}
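R also has a repeat loop, which runs until a break statement is reached, and a next statement, which skips to the following iteration. The sketch below is an illustrative addition rather than part of the original programs; it collects the odd numbers below 10 in a vector named odds.

```r
# Collect the odd numbers below 10 with a repeat loop.
odds <- c()
i <- 0
repeat {
  i <- i + 1
  if (i >= 10) {
    break        # leave the loop once the limit is reached
  }
  if (i %% 2 == 0) {
    next         # skip even numbers and go to the next iteration
  }
  odds <- c(odds, i)
}
print(odds)
```

Unlike while, repeat has no condition of its own, so the break statement is what prevents an endless loop.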
Output:
4. Learn the basics of functions in R and implement with examples.
Program Description:
When a function is called, it performs its task and returns control to the interpreter, along with any
result, which may be stored in other objects.
Built-in Function
# Create a sequence of numbers from 32 to 44.
print(seq(32, 44))
# Find mean of numbers from 25 to 82.
print(mean(25:82))
# Find sum of numbers from 41 to 68.
print(sum(41:68))
Output:
User-defined Function
We can create user-defined functions in R. They are specific to what a user wants, and once created
they can be used like the built-in functions. Below is an example of how a function is created and used.
# Create a function to print squares of numbers in sequence.
# Call the function new.function supplying 6 as an argument.
new.function(6)
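The body of new.function is not shown in the listing above. A minimal sketch that matches the two comments, assuming the function takes a single argument a and prints the squares of 1 to a, could be:

```r
# Assumed reconstruction: print the squares of the numbers 1 to a.
new.function <- function(a) {
  for (i in 1:a) {
    b <- i^2
    print(b)
  }
}

# Call the function new.function supplying 6 as an argument.
new.function(6)
```

The function has to be defined before it is called; calling new.function(6) first would stop with an error saying the function could not be found.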
5. Implement data frames in R. Write a program to join columns and rows in a
data frame using cbind() and rbind() in R.
Program Description:
# Creating vector objects
Name <- c("Shubham Rastogi", "Nishka Jain", "Gunjan Garg", "Sumit Chaudhary")
Address <- c("Moradabad", "Etah", "Sambhal", "Khurja")
Marks <- c(255, 355, 455, 655)
# Combining vectors into one data frame
info <- as.data.frame(cbind(Name, Address, Marks), stringsAsFactors = FALSE)
print(info)
# Creating another data frame with similar columns
new.stuinfo <- data.frame(
  Name = c("Deepmala", "Arun"),
  Address = c("Khurja", "Moradabad"),
  Marks = c("755", "855"),
  stringsAsFactors = FALSE
)
# Printing a header.
cat("### The Second data frame\n")
# Printing the data frame.
print(new.stuinfo)
# Combining rows from both the data frames.
all.info <- rbind(info, new.stuinfo)
# Printing a header.
cat("### The combined data frame\n")
print(all.info)
Output :
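As a supplementary sketch, cbind() can also add a new column to an existing data frame, and rbind() can append a single row. The data frame students, its values, and the other variable names below are made up for illustration and are not part of the program above.

```r
# Small illustrative data frame (values are made up for the example).
students <- data.frame(
  Name = c("Asha", "Ravi"),
  Marks = c(72, 85),
  stringsAsFactors = FALSE
)

# cbind() joins column-wise: add a Grade column.
with_grade <- cbind(students, Grade = c("B", "A"))

# rbind() joins row-wise: append one more student with the same columns.
new_row <- data.frame(Name = "Meena", Marks = 91, Grade = "A",
                      stringsAsFactors = FALSE)
all_students <- rbind(with_grade, new_row)

print(all_students)
print(dim(all_students))
```

rbind() matches columns by name, so the new row must use the same column names as the data frame it is appended to.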
6. Implement different string manipulation functions in R
Program Description:
String manipulation refers to the process of handling and analyzing strings. It involves various
operations concerned with modifying and parsing strings in order to use and change their data. R offers a
series of built-in functions to manipulate the contents of a string. In this section, we will study different
functions concerned with the manipulation of strings in R.
Concatenation of Strings
String concatenation is the technique of combining two strings. It can be done
in several ways:
pr-1
# R program for string concatenation
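The code for pr-1 is not shown in the listing. A minimal sketch of string concatenation, assuming the paste() function was intended and using example strings chosen here, is:

```r
# Concatenation using the paste() function (the default separator is a space).
str <- paste("Learn", "Code")
print(str)

# paste0() joins the pieces with no separator at all.
str0 <- paste0("Learn", "Code")
print(str0)
```

A different separator can be supplied through the sep argument, for example paste("Learn", "Code", sep = "-").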
Output:
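pr-2
# Concatenation using cat() function
# cat() writes the joined text to the console and returns NULL,
# so its result is not assigned to a variable or passed to print().
cat("learn", "code", "tech", sep = ":")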
Output:
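Other commonly used base R string functions include nchar(), toupper(), tolower(), substr(), strsplit(), and gsub(). The sketch below is an illustrative addition; the example string and the variable names are chosen here and do not come from the manual.

```r
s <- "Learn Code Tech"

# Number of characters in the string.
n <- nchar(s)

# Change case.
upper <- toupper(s)
lower <- tolower(s)

# Extract characters 7 to 10.
part <- substr(s, 7, 10)

# Split the string on spaces; strsplit() returns a list, so take its first element.
words <- strsplit(s, " ")[[1]]

# Replace every space with a hyphen.
hyphenated <- gsub(" ", "-", s)

print(n)
print(upper)
print(lower)
print(part)
print(words)
print(hyphenated)
```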
7. Implement different data structures in R (vectors, lists, data frames)
Program Description:
Vectors are the most basic R data objects, and there are six types of atomic vectors: logical,
integer, double, complex, character, and raw.
Lists are R objects which contain elements of different types: numbers, strings, vectors, and even
another list. A list can also contain a matrix or a function as its elements. A list is created using the list()
function.
Vectors
# Create a vector.
apple <- c('red', 'green', "yellow")
print(apple)
# Get the class of the vector.
print(class(apple))
Output:
Lists
A list is an R-object which can contain many different types of elements inside it, such as vectors,
functions, and even another list.
# Create a list.
list1 <- list(c(2, 5, 3), 21.3, sin)
# Print the list.
print(list1)
[[1]]
[1] 2 5 3
[[2]]
[1] 21.3
[[3]]
function (x)  .Primitive("sin")
Output:
Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the matrix
function.
# Create a matrix.
M = matrix(c('a', 'a', 'b', 'c', 'b', 'a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)
Output:
Data Frames:
# Print the data frame
Data_Frame
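The code that creates Data_Frame is not shown above. A minimal sketch, assuming three illustrative columns (Training, Pulse, and Duration, with made-up values that are not taken from the manual), might be:

```r
# Create a data frame; the column names and values are illustrative assumptions.
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

# Print the data frame
Data_Frame

# Inspect its structure and per-column summary statistics.
str(Data_Frame)
summary(Data_Frame)
```

Each column of a data frame is a vector of the same length, which is why a data frame can mix character and numeric columns while a matrix cannot.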
Output:
8. Write a program to read a CSV file and analyze the data in the file in R
Program Description:
In R, we can read data from files stored outside the R environment. We can also write data into files, which
will be stored and accessed by the operating system. R can read and write various file formats such as
CSV, Excel, XML, etc.
# Getting and printing the current working directory.
print(getwd())
# Setting the current working directory (use forward slashes in Windows paths).
setwd("C:/Users/sreek/OneDrive/Desktop/SAISANTHOSHI-MRCET-2023")
# Getting and printing the current working directory.
print(getwd())
Output:
Reading a CSV file
data <- read.csv("record.csv")
print(data)
Output:
Getting the maximum salary
# Creating a data frame.
csv_data <- read.csv("record.csv")
# Getting the maximum salary from data frame.
max_sal <- max(csv_data$salary)
print(max_sal)
Output:
Getting the details of all the persons who are working in the IT department
Output:
Getting the details of the persons whose salary is greater than 600 and working in the IT department.
Output:
Getting details of those people who joined on or after 2014.
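The code for the three filtering tasks above is not shown. The sketch below illustrates them with subset() on a small in-memory data frame instead of record.csv, so that it runs on its own. The salary and start_date columns appear elsewhere in this program; the dept column, the other columns, and all values are assumptions made for the example.

```r
# Illustrative records; the dept column and all values are assumptions.
csv_data <- data.frame(
  id = 1:5,
  name = c("Shubham", "Nishka", "Gunjan", "Sumit", "Arpita"),
  salary = c(523, 552, 669, 825, 762),
  start_date = c("2012-01-01", "2013-09-23", "2014-11-15",
                 "2014-05-11", "2015-03-27"),
  dept = c("IT", "Operations", "IT", "HR", "Finance"),
  stringsAsFactors = FALSE
)

# Everyone working in the IT department.
it_staff <- subset(csv_data, dept == "IT")
print(it_staff)

# IT staff whose salary is greater than 600.
it_high <- subset(csv_data, dept == "IT" & salary > 600)
print(it_high)

# People who joined on or after 1 January 2014.
recent <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
print(recent)
```

With a real file, the data.frame() call would be replaced by read.csv("record.csv"), provided the file has matching column names.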
Output:
Writing into a CSV file:
csv_data <- read.csv("record.csv")
# Getting details of those people who joined on or after 2014
details <- subset(csv_data, as.Date(start_date) >= as.Date("2014-01-01"))
# Writing filtered data into a new file.
write.csv(details, "output.csv")
new_details <- read.csv("output.csv")
print(new_details)
Output:
9. Create pie charts and bar charts using R
Program Description:
A pie chart is a representation of values as slices of a circle with different colors. The slices are labeled,
and the number corresponding to each slice is also represented in the chart.
# Create data for the graph.
geeks <- c(23, 56, 20, 63)
labels <- c("Mumbai", "Pune", "Chennai", "Bangalore")
# Plot the pie chart.
pie(geeks, labels)
Output:
# Create the data for the chart
A <- c(17, 32, 8, 53, 1)
# Plot the bar chart.
barplot(A)
Output:
10. Create a dataset and do statistical analysis on the data using R
Program Description:
The R programming language provides quick and easy tools that let us summarize our data and convert
it into visually insightful elements such as graphs.
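No code is shown for this program. The following is a minimal sketch of what such an analysis could look like, assuming a small made-up dataset of student marks; the data frame dataset and all values are illustrative and not taken from the manual.

```r
# Illustrative dataset: marks of ten students (values are made up).
marks <- c(45, 67, 89, 56, 72, 91, 38, 67, 80, 59)
dataset <- data.frame(student = 1:10, marks = marks)

# Basic descriptive statistics.
mean_marks <- mean(dataset$marks)
median_marks <- median(dataset$marks)
sd_marks <- sd(dataset$marks)
range_marks <- range(dataset$marks)

print(mean_marks)
print(median_marks)
print(sd_marks)
print(range_marks)

# Minimum, quartiles, mean, and maximum in one call.
print(summary(dataset$marks))
```

A call such as hist(dataset$marks) or boxplot(dataset$marks) would show the same distribution graphically.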
Output:
11. Write an R program to find correlation and covariance
Program Description:
Covariance indicates the direction of the linear relationship between two variables.
Correlation, by contrast, measures both the strength and the direction of the linear relationship between
two variables.
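The link between the two measures can be checked numerically: the Pearson correlation is the covariance divided by the product of the two standard deviations. The sketch below computes both by hand and compares them with the built-in cov() and cor() functions; the names cov_manual and cor_manual are chosen here for illustration.

```r
x <- c(1, 2, 3, 4, 5, 6, 7)
y <- c(1, 3, 6, 2, 7, 4, 5)
n <- length(x)

# Sample covariance computed from its definition.
cov_manual <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)

# Pearson correlation is the covariance divided by both standard deviations.
cor_manual <- cov_manual / (sd(x) * sd(y))

# Compare with the built-in functions.
print(cov_manual)
print(cov(x, y))
print(cor_manual)
print(cor(x, y))
```

Because it is divided by the standard deviations, the correlation always lies between -1 and 1, whereas the covariance depends on the units of x and y.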
# R program to illustrate
# Pearson correlation testing
# using cor()
# Taking two numeric
# vectors with same length
x = c(1, 2, 3, 4, 5, 6, 7)
y = c(1, 3, 6, 2, 7, 4, 5)
# Calculating
# correlation coefficient
# using cor() method
result = cor(x, y, method = "pearson")
# Print the result
cat("Pearson correlation coefficient is:", result)
Output:
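Covariance
# Data vectors
x <- c(1, 3, 5, 10)
# The y vector is missing from the source; these values are illustrative.
y <- c(2, 4, 6, 20)
# Print covariance using different methods
print(cov(x, y))
print(cov(x, y, method = "pearson"))
print(cov(x, y, method = "kendall"))
print(cov(x, y, method = "spearman"))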
Output:
12. Write an R program for regression modeling
Program Description:
Regression analysis is a very widely used statistical tool to establish a relationship model between
two variables. One of these variables is called the predictor variable, whose value is gathered through
experiments. The other variable is called the response variable, whose value is derived from the predictor
variable.
# Generate random IQ values with mean = 30 and sd = 2
IQ <- rnorm(40, 30, 2)
# Sorting IQ level in ascending order
IQ <- sort(IQ)
# Generate vector with pass and fail values of 40 students
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
            0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
            1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
# Data frame
df <- as.data.frame(cbind(IQ, result))
# Print data frame
print(df)
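The listing above only prepares the data; the model-fitting step is not shown. Since result is a 0/1 outcome, one reasonable option is a logistic regression fitted with glm(). The sketch below is an assumption about how the program might continue, not the manual's own code; the seed, the variable names model, new_student, and p, and the IQ value 31 are all chosen here for illustration.

```r
set.seed(1)  # assumed seed so that the random IQ values are repeatable

# Same data preparation as in the listing above.
IQ <- sort(rnorm(40, 30, 2))
result <- c(0, 0, 0, 1, 0, 0, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 1, 0, 0, 1, 0,
            0, 0, 1, 0, 0, 1, 1, 0, 1, 1,
            1, 1, 1, 0, 1, 1, 1, 1, 0, 1)
df <- data.frame(IQ = IQ, result = result)

# Fit a logistic regression of result (response) on IQ (predictor).
model <- glm(result ~ IQ, family = binomial, data = df)
print(summary(model))

# Predicted pass probability for a hypothetical student with IQ 31.
new_student <- data.frame(IQ = 31)
p <- predict(model, newdata = new_student, type = "response")
print(p)
```

For a continuous response variable, lm(response ~ predictor, data = df) would fit an ordinary linear regression with the same formula syntax.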
Output:
13. Write an R program to build a classification model using the KNN algorithm
Program Description:
# Loading data
data(iris)
# Structure
str(iris)
# Installing packages
install.packages("e1071")
install.packages("caTools")
install.packages("class")
# Loading packages
library(e1071)
library(caTools)
library(class)
# Loading data
data(iris)
head(iris)
# Splitting data into train and test data
# (sample.split() expects the label vector, not the whole data frame)
split <- sample.split(iris$Species, SplitRatio = 0.7)
train_cl <- subset(iris, split == TRUE)
test_cl <- subset(iris, split == FALSE)
# Feature scaling
train_scale <- scale(train_cl[, 1:4])
test_scale <- scale(test_cl[, 1:4])
# Fitting KNN model to training dataset
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 1)
classifier_knn
# Confusion matrix
cm <- table(test_cl$Species, classifier_knn)
cm
# Model evaluation - choosing K
# Calculate out-of-sample error
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
# K = 3
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 3)
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
# K = 5
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 5)
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
# K = 7
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 7)
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
# K = 15
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 15)
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
# K = 19
classifier_knn <- knn(train = train_scale,
                      test = test_scale,
                      cl = train_cl$Species,
                      k = 19)
misClassError <- mean(classifier_knn != test_cl$Species)
print(paste('Accuracy =', 1 - misClassError))
Output:
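The repeated blocks for each value of K can be written more compactly as a loop. The sketch below is an illustrative alternative, not the manual's own code: it uses only the class package, replaces sample.split() with a plain random split made by sample(), and fixes an assumed seed so that the split is repeatable. The names train_idx, k_values, and accuracy are chosen here.

```r
library(class)  # provides knn(); a recommended package included in most R installations

data(iris)
set.seed(42)  # assumed seed for a repeatable split

# Random 70/30 split of the row indices (not stratified by species).
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_cl <- iris[train_idx, ]
test_cl <- iris[-train_idx, ]

# Scale the four numeric features.
train_scale <- scale(train_cl[, 1:4])
test_scale <- scale(test_cl[, 1:4])

# Accuracy on the test set for each candidate value of k.
k_values <- c(1, 3, 5, 7, 15, 19)
accuracy <- sapply(k_values, function(k) {
  pred <- knn(train = train_scale, test = test_scale,
              cl = train_cl$Species, k = k)
  mean(pred == test_cl$Species)
})

print(data.frame(k = k_values, accuracy = accuracy))
```

The value of k with the highest accuracy in the printed table is a natural candidate, although the ranking can change with a different random split.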
14. Write an R program to build a clustering model using the K-means algorithm
Program Description:
K-means clustering in R is an unsupervised, non-linear algorithm that clusters data based
on similarity. It seeks to partition the observations into a pre-specified number of
clusters. The data is segmented so that each training example is assigned to a segment called a cluster.
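Because the number of clusters has to be chosen in advance, one common heuristic is to compare the total within-cluster sum of squares for several values of k and look for the point where the improvement levels off. The sketch below illustrates this with base R's kmeans() on the iris measurements; the names iris_features, wss, and fit, and the range 1 to 6, are choices made here and not part of the manual.

```r
data(iris)
iris_features <- iris[, 1:4]   # drop the Species label

set.seed(240)

# Total within-cluster sum of squares for k = 1 to 6.
wss <- sapply(1:6, function(k) {
  kmeans(iris_features, centers = k, nstart = 20)$tot.withinss
})
print(wss)

# Fit a model with three clusters and inspect the cluster sizes and centers.
fit <- kmeans(iris_features, centers = 3, nstart = 20)
print(fit$size)
print(fit$centers)
```

Plotting the values with plot(1:6, wss, type = "b") makes the levelling-off point easier to see.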
# Loading data
data(iris)
# Structure
str(iris)
# Installing packages
install.packages("ClusterR")
install.packages("cluster")
# Loading packages
library(ClusterR)
library(cluster)
# Removing initial label of
# Species from original dataset
iris_1 <- iris[, -5]
# Fitting K-means clustering model
# to training dataset
set.seed(240) # Setting seed
kmeans.re <- kmeans(iris_1, centers = 3, nstart = 20)
kmeans.re
# Cluster identification for
# each observation
kmeans.re$cluster
# Confusion matrix
cm <- table(iris$Species, kmeans.re$cluster)
cm
## Plotting cluster centers
kmeans.re$centers
kmeans.re$centers[, c("Sepal.Length", "Sepal.Width")]
## Visualizing clusters
y_kmeans <- kmeans.re$cluster
clusplot(iris_1[, c("Sepal.Length", "Sepal.Width")],
         y_kmeans,
         lines = 0,
         shade = TRUE,
         color = TRUE,
         labels = 2,
         plotchar = FALSE,
         span = TRUE,
         main = paste("Cluster iris"),
         xlab = 'Sepal.Length',
         ylab = 'Sepal.Width')
Output: