R Programming-Chapter 4
Chapter 4
BASE OF R
When programming in R, we need to define variables, which store information to be
referenced and manipulated. R imposes some rules for constructing a valid variable name:
it must start with a letter (or with a dot that is not followed by a digit), and it may
contain only letters, digits, dots (.), and underscores (_). R also provides several useful
features related to variables, such as listing them and deleting them when they are no
longer needed. The "ls()" function lists every variable currently available in the
workspace; it can also match variable names against a pattern. The "rm()" function
removes variables from the workspace. As a convention, we will start learning R
programming by defining variables and printing their values in the R console with
"print()".
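# Create a variable and print its value
v_1 <- 100
print(v_1)
# Invalid name: a variable name cannot start with a digit
1v <- 23
# List the variables in the workspace
print(ls())
# Remove v_1, then try to print it again
rm(v_1)
print(v_1)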
[1] 100
> 1v<-23
Error: unexpected symbol in "1v"
> print(ls())
[1] "v_1"
> rm(v_1)
> print (v_1)
Error in print(v_1) : object ’v_1’ not found
>
Remark. In the code above, the variable v_1 is printed normally, but the variable 1v is
rejected because a variable name must not start with a digit. In addition, we used the
"ls()" function to check which variables are stored in the workspace, deleted v_1 with
the "rm()" function, and then tried to print its value again, which failed because the
variable no longer exists.
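As a complement to the example above, the following minimal sketch shows the pattern matching of "ls()" and a check with "exists()" before and after "rm()". The variable names are hypothetical and chosen only for this illustration.

```r
# Hypothetical variables, defined only for this illustration
v_1 <- 100
v_2 <- "text"
other.name <- 3.5      # dots are allowed inside a name
.hidden <- TRUE        # valid name, but not listed by ls() by default

# List only the variables whose names start with "v_"
matched <- ls(pattern = "^v_")
print(matched)

# Check that v_1 exists, remove it, then check again
had_v1 <- exists("v_1")
rm(v_1)
has_v1 <- exists("v_1", inherits = FALSE)
print(c(had_v1, has_v1))
```

Checking with "exists()" first avoids the "object not found" error seen in the console output above.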
In general, a program needs different variables to store different kinds of information,
and R is no exception. R provides many data types, for example:
• complex: represents complex numbers. A complex value can be created directly by
writing the number, or with the "as.complex()" or "complex()" functions.
Remark. You can find the data type of any variable in R by using "typeof()" or "class()".
Let us consider the following code in R to illustrate each data type.
v8<- TRUE
# Know the type of each one
typeof(v1)
typeof(v2)
typeof(v3)
typeof(v4)
typeof(v5)
typeof(v6)
typeof(v7)
typeof(v8)
[1] "character"
[1] "double"
[1] "integer"
[1] "integer"
[1] "complex"
[1] "complex"
[1] "complex"
[1] "logical"
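The following sketch illustrates how "typeof()" and "class()" report the basic data types. The variable names and values are assumptions made for this illustration; they are not the definitions used in the listing above.

```r
# Hypothetical values, one or more per basic data type
x_chr  <- "R"                                # character
x_dbl  <- 2.5                                # double
x_int  <- 7L                                 # integer (note the L suffix)
x_cpl1 <- 2 + 3i                             # complex, written directly
x_cpl2 <- as.complex(4)                      # complex, by conversion
x_cpl3 <- complex(real = 1, imaginary = -2)  # complex, from its parts
x_lgl  <- TRUE                               # logical

# Apply typeof() to every value at once
values <- list(x_chr, x_dbl, x_int, x_cpl1, x_cpl2, x_cpl3, x_lgl)
types <- sapply(values, typeof)
print(types)

# class() can differ from typeof(): a double has class "numeric"
dbl_class <- class(x_dbl)
print(dbl_class)
```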
In R, data are stored as R objects. An object is a data structure that has attributes and
methods applied to those attributes, and its name is how you retrieve the saved data.
For example, suppose we have a sample of 5 men with ages 25, 30, 29, 28, and 31. These
data can be stored as a column vector with 5 rows for the variable age. Let us look at
the most frequently used types of R objects:
Vectors: A vector is the simplest object in R. It consists of a number of elements
of the same type and can be constructed with "c()".
# Create a vector
indep.var <- c('age', 'height', 'temperature')
print(indep.var)
# Get the class of the vector.
print(class(indep.var))
Remark. We can create a vector holding a sequence of numbers by using ":" when
consecutive numbers differ by exactly 1.
 [1] 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
[32] 41 42 43 44 45 46 47 48 49 50
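A short sketch of sequence creation follows. The use of "seq()" for steps other than 1 is an addition for illustration and is not discussed in the text above.

```r
# Sequence from 10 to 50 in steps of 1, using the ":" operator
s1 <- 10:50
print(length(s1))

# ":" also counts downwards when the first value is larger
s2 <- 5:1
print(s2)

# For any other step size, seq() with the "by" argument can be used
s3 <- seq(10, 50, by = 5)
print(s3)
```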
Matrices: A matrix is a two-dimensional arrangement of elements, all of which must be
of the same data type. To create a matrix, we use the basic form:
"matrix(data, nrow, ncol, byrow, dimnames)"
where data is the input, nrow is the number of rows, and ncol is the number of
columns. Set byrow to TRUE to fill the matrix row by row, or to FALSE (the default) to
fill it column by column. Finally, dimnames gives the names assigned to the rows and
columns. Let us create a matrix taking numbers as input.
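The effect of the byrow argument can be sketched as follows. The object names m_col and m_row are hypothetical, and the input 1:6 is chosen only for illustration.

```r
# Fill a 2x3 matrix column by column (byrow = FALSE is the default)
m_col <- matrix(1:6, nrow = 2, ncol = 3)
print(m_col)

# Fill the same data row by row
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(m_row)

# The element in row 1, column 2 differs between the two layouts
print(c(m_col[1, 2], m_row[1, 2]))

# dim() returns the number of rows and columns
print(dim(m_col))
```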
Lists: A list in R can contain elements of different types, such as numbers, vectors, and
even other lists. A list is created with the "list()" function. The names and contents of
the list components are specified as arguments of "list()" using the = character. In the
following code, we define the names of the rows and columns of the previous matrix.
row<-c("row1", "row2")
col<-c("col1", "col2", "col3")
l <- list(rows = row, columns = col)
M <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3, byrow = FALSE, dimnames = l)
print(l)
print(M)
$rows
[1] "row1" "row2"
$columns
[1] "col1" "col2" "col3"
      columns
rows   col1 col2 col3
  row1    1    3    5
  row2    2    4    6
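Once a list is created, its components can be retrieved by name or by position. The sketch below assumes a hypothetical list called sample_info, built from the ages mentioned earlier in this chapter.

```r
# A hypothetical list mixing a string, a numeric vector, and a nested list
sample_info <- list(
  label  = "men",
  ages   = c(25, 30, 29, 28, 31),
  nested = list(checked = TRUE)
)

# Names of the components
print(names(sample_info))

# Access a component by name with $ or [[ ]]
print(sample_info$label)
second_age <- sample_info[["ages"]][2]
print(second_age)

# Access a component of the nested list
print(sample_info$nested$checked)
```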
Arrays: Arrays generalize vectors and matrices: a vector is a one-dimensional array and a
matrix is a two-dimensional array. An array can therefore store data in more than two
dimensions, and its components must all be of the same data type, much like with
matrices and vectors. In the example below, we create an array with "array()" that has
4 elements, each of which is a 2x3 rectangular matrix.
, , 1
, , 2
, , 3
, , 4
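A minimal sketch of such an array follows. The input values 1:24 are an assumption made for this illustration, since any 24 values of the same type would do.

```r
# Build an array of 4 slices, each a 2x3 matrix (2 * 3 * 4 = 24 values)
a <- array(1:24, dim = c(2, 3, 4))

# The dimensions are rows, columns, slices
print(dim(a))

# Extract the first 2x3 slice
first_slice <- a[, , 1]
print(first_slice)

# Extract a single element: row 2, column 3 of the fourth slice
last_value <- a[2, 3, 4]
print(last_value)
```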
Factors: In R, factors represent categorical variables (i.e. data whose range of values is
a collection of codes). Factor variables typically take only a few possible values, which
are known as levels. A factor is easily created with the "factor()" function. Now consider
the variable data shown in the following example.
[1] Male Female East Male North East Female West Female East
North
Levels: East Female Male North West
[1] "East" "Female" "Male" "North" "West"
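The output above can be reproduced with a sketch along the following lines. The input vector is reconstructed from the printed values, and the variable names are assumptions.

```r
# Values reconstructed from the printed output above
data <- c("Male", "Female", "East", "Male", "North", "East",
          "Female", "West", "Female", "East", "North")

# Convert the character vector to a factor
factor_data <- factor(data)
print(factor_data)

# The levels are the distinct values, in sorted order
lv <- levels(factor_data)
print(lv)

# Count how many times each level occurs
counts <- table(factor_data)
print(counts)
```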
Data frames: A data frame holds data displayed as a table. Unlike in a matrix, different
columns can hold different types of elements (numeric, factor, or character), but all
elements within a single column must have the same type. In fact, a data frame is
required by the majority of statistical modeling operations in R. Data frames are created
using the "data.frame()" function. For example, let us define some data and then apply
this function to see the outcome.
age<-c(18,20,34,50)
weight<-c(45,70,77,61)
height<-c(1.89,1.68,1.82,1.60)
gender<-c("male","female","male","female")
#create the data frame
data.frame(age,weight,height,gender)
  age weight height gender
1  18     45   1.89   male
2  20     70   1.68 female
3  34     77   1.82   male
4  50     61   1.60 female
Remark. Note that all columns must have the same number of data elements, and each
column has a name.
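Once the data frame is stored under a name, its columns and rows can be inspected. The sketch below reuses the data from the example above; the name df and the explicit stringsAsFactors = FALSE argument are assumptions added so that the result does not depend on the R version.

```r
# Same data as in the example above, stored under the hypothetical name df
df <- data.frame(
  age    = c(18, 20, 34, 50),
  weight = c(45, 70, 77, 61),
  height = c(1.89, 1.68, 1.82, 1.60),
  gender = c("male", "female", "male", "female"),
  stringsAsFactors = FALSE   # keep gender as character in every R version
)

# Each column has its own type
col_types <- sapply(df, class)
print(col_types)

# Access one column by name and summarise it
mean_age <- mean(df$age)
print(mean_age)

# Select the rows that satisfy a condition
older <- df[df$age > 30, ]
print(older)
```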
1. Arithmetic operators perform the six basic operations in R: addition, subtraction,
multiplication, division, modulus, and exponentiation.
3. Logical operators combine one statement with another by using "AND, OR, and
NOT".
4. Assignment operators assign a value to a variable; the value can be that of another
variable or the result of an arithmetic expression.
For more information, check out figure 4.1, which presents all the operators [7].
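The operator families listed above can be sketched as follows. The values of a and b are hypothetical, and the comparisons inside the logical expressions are included only to produce TRUE and FALSE operands.

```r
a <- 7
b <- 2

# Arithmetic operators: + - * / %% (modulus) ^ (exponent)
arith <- c(a + b, a - b, a * b, a / b, a %% b, a ^ b)
print(arith)

# Logical operators: & (AND), | (OR), ! (NOT)
both   <- (a > 5) & (b > 5)
either <- (a > 5) | (b > 5)
negate <- !(a > 5)
print(c(both, either, negate))

# Assignment operators: store the result of an arithmetic expression
total <- a + b
print(total)
```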
4.3.1 If statement:
An if statement starts with a Boolean condition, i.e. an expression that evaluates to
either TRUE or FALSE, followed by the statements to run. The statements inside the if
statement execute if the condition is true. If the condition is false, they are skipped and
the program goes on to the subsequent lines of code. Below is the syntax for an if
statement in R:
if (Boolean_condition) {
statements .....
}
Let’s have an example:
[1] 6
Error in x > 4 : invalid comparison with complex values
Remark. In the example above, we have an integer variable called x with a value of 5.
The first if statement checks whether x is an integer. Since the condition is satisfied, we
enter the nested if statements, which check whether x is greater than 4 or less than 3.
Because x is greater than 4, the print statement nested below it is executed. The second
value of x is a complex number, and since complex values cannot be compared with >,
R stops with an error instead of running the nested statements.
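The behaviour described in the remark can be sketched as follows. This is a hypothetical reconstruction, not the original listing: the variable names, the value printed, and the use of "tryCatch()" to capture the error message are assumptions.

```r
# Case 1: an integer value (the L suffix makes 5 an integer)
x <- 5L
result <- NULL
if (is.integer(x)) {
  if (x > 4) {
    result <- x + 1L     # executed because 5 > 4
    print(result)
  }
  if (x < 3) {
    print(x - 1L)        # skipped because 5 is not less than 3
  }
}

# Case 2: a complex value cannot be ordered, so the comparison fails.
# tryCatch() captures the error message instead of stopping the script.
z <- 2 + 3i
msg <- tryCatch({
  if (z > 4) print(z)
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```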
When the Boolean condition is false, it is useful to have an alternative branch, introduced
by else, whose statements run instead. Below is the syntax for an if...else statement in R:
if (boolean_condition) {
statements that execute if the boolean_condition is true
} else {
statements that execute if the boolean_condition is false
}
Let’s have an example:
[1] -97
Remark. In the above example, we have a simple piece of code that checks whether the
number entered is greater than 100 or not. As you can see, because the condition in the
if statement returned false, the command under the else statement runs and prints its
result, as happens whenever the number is not greater than 100.
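A minimal sketch of an if...else statement of this kind is given below. The input value 3 and the expression in the else branch are assumptions chosen for illustration, not the original listing.

```r
# Hypothetical input value
x <- 3

if (x > 100) {
  # Runs only when x is greater than 100
  out <- x
} else {
  # Runs in every other case
  out <- x - 100
}
print(out)
```

Note that "else" must appear on the same line as the closing brace of the if branch when the statement is typed at the top level of a script or the console.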
A switch statement allows a variable to be tested for equality against a list of values.
Each value is called a case, and the variable being switched on is checked against each
case [5]. To apply this statement, you need to know some rules. For example, if the value
of the expression is not a character string, it is coerced to an integer, and if there is more
than one match, the first matching element is returned. For more information, please
check [5]. Below is the syntax for a switch statement in R:
switch(expression, case1, case2, case3....)
Let’s have an example:
#Testing the equality between the first variable and other statements
switch(TRUE,"Boolean","numeric","complex")
switch(2,"Boolean","numeric","complex")
[1] "Boolean"
[1] "numeric"
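When the expression is a character string, "switch()" matches it against the names of the cases instead of their positions. The sketch below assumes a hypothetical helper function describe_type; the unnamed last argument acts as the default, and an empty case falls through to the next one.

```r
# Hypothetical helper: map the result of typeof() to a short label
describe_type <- function(x) {
  switch(typeof(x),
         logical = "Boolean",
         integer = ,            # empty case: falls through to "double"
         double  = "numeric",
         complex = "complex",
         "other")               # default when no name matches
}

r1 <- describe_type(TRUE)
r2 <- describe_type(5L)
r3 <- describe_type(2 + 3i)
r4 <- describe_type("text")
print(c(r1, r2, r3, r4))
```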
4.4 Loops
Usually, every developer needs to repeat a series of statements either a specified
number of times or until a break is encountered. R provides different loops to handle
this need. In table 6.3, we present those loops with their syntax in R.
Remark. When you need to end a loop and transfer execution to the statement immediately
following it, use the break statement by writing just "break". There is also the next
statement, written "next", which skips the rest of the current iteration and continues
with the next one.
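The three loop forms of R ("for", "while", and "repeat") together with "break" and "next" can be sketched as follows. The variable names and bounds are hypothetical.

```r
# for loop: add the odd numbers, stopping once i exceeds 7
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next    # skip even numbers
  if (i > 7) break         # leave the loop entirely
  total <- total + i
}
print(total)

# while loop: repeat as long as the condition is true
count <- 0
while (count < 3) {
  count <- count + 1
}
print(count)

# repeat loop: runs until break is encountered
n <- 0
repeat {
  n <- n + 2
  if (n >= 6) break
}
print(n)
```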
4.5 Functions in R
R has a rich set of functions that can be used to code almost every task. It is very
important to understand the purpose and syntax of R functions and to know how to
create and use them.
R has many useful built-in functions that can be used for a variety of tasks. The
mathematical functions, for instance, compute absolute values, squares, and many other
quantities. These functions are easy to look up: write "Built-in-functions" in the first
red box shown in figure 4.2, and the page named "The R base package" appears. Scroll
down the page to find the list of letters shown in figure 4.3, from which you can view all
the functions. Clicking on a function opens an explanation of how to use it together with
an illustrative example, as in figure 4.4. Figure 4.5 shows an example of applying these
functions to a vector of real numbers, along with the results.
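A few built-in functions applied to a vector of real numbers are sketched below. The vector is hypothetical and is not the one shown in figure 4.5.

```r
# Hypothetical vector of real numbers
x <- c(-2.5, 4, -9, 16)

# Element-wise functions return a vector of the same length
abs_x  <- abs(x)
sqrt_x <- sqrt(abs_x)
print(abs_x)
print(sqrt_x)

# Summary functions return a single value
print(sum(x))
print(mean(x))
print(max(x))

# The help page of any function can be opened from the console, e.g.:
# help("abs")
```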
Sometimes the functions built into R are not enough, so we need to create our own
function, with its own variables and conditions, to perform a special task. R therefore
provides a way to declare a user-defined function. Below is the syntax for defining
functions:
function_name <- function(argument1, argument2, ...) { Function body }
Where: function_name is the name of the function, which should be clear and meaningful
because the function is stored in the R environment under this name after its definition.
The arguments of a function, sometimes called parameters, are the variables of the
function, and they are optional. The function body contains a collection of statements
that defines what the function does.
Remark. After defining a function, a developer can call it again and again. To do so, we
write the defined name and put the necessary arguments inside the parentheses.
print(m)} }}
#defining the vectors
c1<-c(1,2,5,4)
c2<-c(1,2,3)
c3<-c(1,2)
#Calling the function
matrixfunc(c1,c2,c3)
[,1]
[1,] 1
[,1] [,2]
[1,] 1 2
[,1]
[1,] 1
[2,] 2
[,1] [,2]
[1,] 1 2
[2,] 5 4
[,1]
[1,] 1
[2,] 2
[3,] 5
[,1] [,2]
[1,] 1 2
[2,] 5 4
[3,] 1 2
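For comparison, a smaller user-defined function is sketched below. The function vec_summary and its digits argument are hypothetical and unrelated to matrixfunc; the sketch only illustrates definition, a default argument, and repeated calls.

```r
# Hypothetical function: summarise a numeric vector
vec_summary <- function(v, digits = 2) {
  # Stop early with a clear message if the input is not numeric
  if (!is.numeric(v)) {
    stop("v must be a numeric vector")
  }
  # The value of the last expression is returned
  c(min = min(v), mean = round(mean(v), digits), max = max(v))
}

# Call the function several times with different arguments
res1 <- vec_summary(c(1, 2, 5, 4))
res2 <- vec_summary(c(1, 2, 3))
res3 <- vec_summary(c(1, 2), digits = 0)
print(res1)
print(res2)
print(res3)
```

The digits argument has a default value, so it can be omitted from a call, which is one way in which arguments are optional.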
4.6 Exercise
1. Create a vector v_1 of the values of sin(e^x) at x = 4, 3, 2, 1, 5.
3
2. Create a vector v_2 of the values 53 , 99 , cos(3.5π), 3, abs(2 + 3i)
35
3. Create a matrix of 2 rows and 5 columns using the data from v_1 and v_2.
4. Write a function multpvect which takes two arguments vect1 and vect2 of any type.
The function should return the product of the two vectors. Cover all the cases
that a vector could have.