
Lab 1

lab lecture notes of R language.

Uploaded by

neilzhaony

Introduction to RStudio

MACC7006 Accounting Data and Analytics

Keri Hu

Faculty of Business and Economics

Today’s objectives

By the end of today’s lab, you should be able to:

• Locate and identify the essential parts of the RStudio interface

• Create, edit, and save .R and .RData files

• Generate objects and differentiate between datasets, numbers, strings,
and functions

RStudio interface

Figure 1: RStudio interface

Arithmetic operations

R can be used as a calculator:

5 + 3

## [1] 8

5 / 3

## [1] 1.666667

5 ^ 3

## [1] 125

• The [1] is the index of the first element shown on that line of output
(here each result is a vector with a single element).
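
As a small illustration of what the bracketed number means, the sketch below prints a vector long enough to wrap over several lines. The name long_vector is made up for this example.

```r
# A vector with 40 elements, long enough to wrap in most consoles
long_vector <- 101:140
long_vector
# Each printed line starts with the index of its first element.
# The first line starts with [1]; where later lines start depends
# on the width of the console.
```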


An “object-oriented” programming language

Objects, any pieces of information stored by R, can be:

• A dataset (e.g. “WHO”)
• A subset of a dataset (e.g. only the even observations of “WHO”)
• A number (e.g. 2π + 1)
• A text string (e.g. “HKU is awesome”)
• A function (e.g. a function that takes in x and gives you x^2 + 8)

Create objects

R can store objects with a name of our choice. Use <- as an assignment
operator for objects.

object_1 <- 5 + 3
object_1

## [1] 8

If we assign a new value to the same object name, then we will overwrite
this object (so be careful when doing so!)

object_1 <- 5 - 3
object_1

## [1] 2
Objects (cont.)

R can also represent other types of values as objects, such as strings of
characters:

MySchool <- "HKU"
MySchool

## [1] "HKU"
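
One way to tell these kinds of objects apart is the base function class(). The sketch below reuses the number, string, and function examples from the earlier list of object types. The object names are chosen only for illustration.

```r
a_number   <- 2 * pi + 1                # a number
a_string   <- "HKU is awesome"          # a text string
a_function <- function(x) x^2 + 8       # a function of x

class(a_number)     # "numeric"
class(a_string)     # "character"
class(a_function)   # "function"

a_function(2)       # 2^2 + 8 = 12
```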

A vector stores information in a given order

We use the function c(), which stands for “concatenate,” to enter a data
vector (with commas separating elements of the vector):

vector.1 <- c(93, 92, 83, 99, 96, 97)
vector.1

## [1] 93 92 83 99 96 97

• Note: a vector in R has no dimensions of its own; when it is converted
to a matrix (e.g. with as.matrix()), it becomes a column vector (n × 1).

seq function

An easy way to create a long sequence of numbers is the seq function.

• The sequence starts at the first argument, ends at the second
argument, and jumps in increments defined by the third argument.

seq(0, 20, 5)

## [1] 0 5 10 15 20

• If you have 1000 data points and you want to number them from 1 to
1000, you can use seq(1, 1000, 1).
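
A few more calls may help make the three arguments concrete. This is a minimal sketch; the name ids is made up, and 1:1000 is the colon shorthand for consecutive integers.

```r
ids <- seq(1, 1000, 1)   # 1, 2, ..., 1000
length(ids)              # 1000
all(ids == 1:1000)       # TRUE: same values as the colon shorthand

seq(0, 1, 0.25)          # 0.00 0.25 0.50 0.75 1.00
seq(10, 0, -5)           # 10 5 0 (a negative increment counts down)
```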

Retrieve part of a vector

To access specific elements of a vector, we use square brackets [ ]. This
is called indexing. Indices start at 1, and a negative index drops the
element at that position:

vector.1[2]

## [1] 92

vector.1[c(2, 4)]

## [1] 92 99

vector.1[-4]

## [1] 93 92 83 96 97
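
Other index forms work the same way. The sketch below, which reuses vector.1, shows a range of positions, several negative indices, a logical condition, and the even-numbered positions.

```r
vector.1 <- c(93, 92, 83, 99, 96, 97)

vector.1[2:4]                           # 92 83 99
vector.1[-c(1, 2)]                      # 83 99 96 97 (drops the first two)
vector.1[vector.1 > 95]                 # 99 96 97 (elements above 95)
vector.1[seq(2, length(vector.1), 2)]   # 92 99 97 (even positions)
```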

Multiply a vector by a number

Since each element of this vector is a numeric value, we can apply
arithmetic operations to it:

vector.1 * 1000

## [1] 93000 92000 83000 99000 96000 97000

Element-wise operations of vectors

vec1 <- c(1, 2, 3)
vec2 <- c(3, 3, 3)
vec1 + vec2

## [1] 4 5 6

vec1 * vec2

## [1] 3 6 9

vec1 / vec2

## [1] 0.3333333 0.6666667 1.0000000
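
The same element-wise rule applies to the other arithmetic operators. A short sketch with the same two vectors; the last line combines an element-wise product with sum().

```r
vec1 <- c(1, 2, 3)
vec2 <- c(3, 3, 3)

vec1 - vec2        # -2 -1  0
vec1 ^ vec2        #  1  8 27
sum(vec1 * vec2)   # 3 + 6 + 9 = 18
```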

Functions

A function takes input object(s) and returns an output object. In R, a
function is generally called as funcname(input). Some basic functions useful
for summarizing data include:

• length(): length of a vector (number of elements)
• min(): minimum value
• max(): maximum value
• range(): minimum and maximum of the data, returned as a vector of two values
• mean(): mean
• sd(): sample standard deviation
• sum(): sum

Try these with vector.1.

Functions (cont.)

length(vector.1)

## [1] 6

min(vector.1)

## [1] 83

max(vector.1)

## [1] 99

range(vector.1)

## [1] 83 99
Functions (cont.)

mean(vector.1)

## [1] 93.33333

sd(vector.1)

## [1] 5.680376

sum(vector.1)

## [1] 560
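
To see how these functions relate to one another, the sketch below rebuilds the mean and the standard deviation from sum() and length(). The names n, manual_mean, and manual_sd are made up for this example; sd() divides by n - 1 (the sample standard deviation).

```r
vector.1 <- c(93, 92, 83, 99, 96, 97)

n           <- length(vector.1)
manual_mean <- sum(vector.1) / n
manual_sd   <- sqrt(sum((vector.1 - manual_mean)^2) / (n - 1))

all.equal(manual_mean, mean(vector.1))   # TRUE
all.equal(manual_sd, sd(vector.1))       # TRUE
```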

R script

• A text file containing a set of commands and comments

Why use an R script? Instead of re-entering code each time you want to
execute a set of commands, you can save the commands in a script and rerun
them.

• Reproducibility: anyone, anywhere, with the data and the R script can
reproduce the results.
• Big time savings when repeating an analysis on data
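
A script might look like the sketch below. The file name lab1_example.R and the object names are hypothetical; lines starting with # are comments, which R ignores.

```r
# lab1_example.R (hypothetical file name)
# Purpose: compute the average of six scores

scores <- c(93, 92, 83, 99, 96, 97)   # enter the data
average_score <- mean(scores)         # compute the mean
print(average_score)                  # display the result
```

Once saved, the whole file can be rerun with source("lab1_example.R"), assuming the file sits in the working directory.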

Create an R script

Specify a working directory in R

Working directory: the default location where R searches for files and
where it saves files

• Use the function setwd() to change the working directory

setwd("/Users/Keri/MACC7006")

• Use the function getwd() to display the current working directory.

getwd()

## [1] "/Users/Keri/MACC7006"
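
A cautious pattern is to remember the old working directory, switch, and switch back afterwards. In this sketch R's temporary directory, tempdir(), stands in for a real project folder, and old_wd is a made-up name.

```r
old_wd <- getwd()    # remember where we started
setwd(tempdir())     # tempdir() stands in for a project folder
getwd()              # now shows the temporary directory
list.files()         # files R can find here without a full path
setwd(old_wd)        # return to the original working directory
```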

Loading data from working directory

Dataset WHO.csv: recent statistics about 194 countries from the World
Health Organization (WHO)

• For CSV files:

WHO <- read.csv("WHO.csv")

• For RData files:

# load() restores the saved objects under their original names, so no assignment is needed
load("WHO.RData")
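
The value that load() returns is only the names of the restored objects, not the data. The sketch below shows this with a toy data frame and a temporary file; toy, its values, and rdata_path are made up for illustration.

```r
# A toy data frame with made-up values (not WHO data)
toy <- data.frame(Country = c("A", "B"), Population = c(10, 20))

rdata_path <- file.path(tempdir(), "toy.RData")
save(toy, file = rdata_path)        # write the object to disk
rm(toy)                             # remove it from the workspace

loaded_names <- load(rdata_path)    # restores the object called toy
loaded_names                        # "toy": a name, not the data itself
nrow(toy)                           # 2: the data frame is back
```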

Data frames

A data frame is a data structure that stores a dataset as a table, with one
row per observation and one column per variable (we can think of it as an
Excel spreadsheet). Useful functions for data frames include:

• str(): examine the structure of the object
• names(): return a vector of variable names
• nrow(): return the number of rows
• ncol(): return the number of columns
• dim(): return the number of rows and the number of columns as a vector
• summary(): provide a statistical summary
• head(): display the first six observations (by default)
• tail(): display the last six observations (by default)
• View(): display the entire data frame in a spreadsheet-style viewer

Load WHO.csv, assign it to an object called WHO (as we did on the previous
slide), and try the above functions on this newly created data frame.
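
If WHO.csv is not at hand, the same functions can be tried on a small data frame built by hand. In this sketch the object toy and the Score column are made up; only the country names come from the WHO examples in these notes.

```r
toy <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria", "Andorra"),
  Score   = c(1.5, 2.5, 3.5, 4.5)   # made-up values, not WHO data
)

str(toy)       # structure: 4 observations of 2 variables
names(toy)     # "Country" "Score"
nrow(toy)      # 4
ncol(toy)      # 2
dim(toy)       # 4 2 (rows first, then columns)
summary(toy)   # summary of each variable
head(toy, 2)   # first two rows
tail(toy, 2)   # last two rows
```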

Example of a data frame

               Variable 1                            Variable 2
Observation 1  Variable 1’s value of Observation 1
Observation 2
Observation 3

Retrieve part of a data frame: using []

We can retrieve specified observations and variables using brackets [ ]
with a comma in the form [rows, columns]:

WHO[1:3, "Country"]

## [1] "Afghanistan" "Albania" "Algeria"

WHO[1:4, 1]

## [1] "Afghanistan" "Albania" "Algeria" "Andorra"

Observe that “Country” is the first variable in the “WHO” data frame.
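
Leaving one side of the comma empty selects everything on that side. The sketch below uses a hand-made data frame; toy, high, and the Score values are made up. The character output assumes R 4.0 or later, where strings are not converted to factors.

```r
toy <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria", "Andorra"),
  Score   = c(1.5, 2.5, 3.5, 4.5)   # made-up values, not WHO data
)

toy[2, ]                   # second row, all columns
toy[, "Score"]             # all rows of one column: 1.5 2.5 3.5 4.5
toy[c(1, 3), "Country"]    # "Afghanistan" "Algeria"

high <- toy[toy[, "Score"] > 2, ]   # rows where Score exceeds 2
nrow(high)                          # 3
```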

Retrieve part of a data frame: using $

The $ operator is another way to access variables from a data frame:

head(WHO$Country, 5)

## [1] "Afghanistan" "Albania" "Algeria" "Andorra" "Angola"

Note: the “5” after the comma specifies how many observations to display.
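
Because $ returns the variable as a vector, the vector tools from earlier apply to it directly. A minimal sketch on a hand-made data frame; toy and the Score values are made up.

```r
toy <- data.frame(
  Country = c("Afghanistan", "Albania", "Algeria", "Andorra"),
  Score   = c(1.5, 2.5, 3.5, 4.5)   # made-up values, not WHO data
)

mean(toy$Score)               # 3
toy$Score[2]                  # 2.5
toy$Country[toy$Score > 3]    # countries with Score above 3: Algeria, Andorra
```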

Save R script

Save objects

When you quit RStudio, you will be asked whether you would like to save
the workspace. You should answer no in general.

• To export CSV:

write.csv(WHO, file = "WHO.csv", row.names = FALSE)

• To export RData:

save(WHO, file = "WHO.RData")

Go ahead and export your data frame as RData.
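
One detail to watch when exporting CSV: by default, write.csv() also writes the row names as an extra first column, which comes back as a variable named X when the file is read again. The sketch below compares the two behaviours with a toy data frame and temporary files; all object and file names here are made up.

```r
toy <- data.frame(
  Country = c("Afghanistan", "Albania"),
  Score   = c(1.5, 2.5)   # made-up values, not WHO data
)

csv_default <- file.path(tempdir(), "toy_default.csv")
csv_clean   <- file.path(tempdir(), "toy_clean.csv")

write.csv(toy, file = csv_default)                     # row names included
write.csv(toy, file = csv_clean, row.names = FALSE)    # row names omitted

names(read.csv(csv_default))   # "X" "Country" "Score"
names(read.csv(csv_clean))     # "Country" "Score"
```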

Here are the commands/operators we covered today:

• <-
• c(), seq()
• vector[]
• length(), min(), max(), range(), mean(), sd(), sum()
• setwd(), getwd()
• read.csv(), load()
• str(), names(), nrow(), ncol(), dim(), summary(), head(), tail(), View()
• write.csv(), save()
• $
