Week 10

Introduction to Non-Tabular Data Types: Time series, spatial data, Network data.
Data Transformations: Converting Numeric Variables into Factors, Date Operations,
String Parsing, Geocoding

Aim: To implement time series, spatial data, and network data in R, and to perform data
transformations: converting numeric variables into factors, date operations, string parsing,
and geocoding.

TIME SERIES:

Description:

A time series is a sequence of data points in which each data point is associated with a
timestamp. A simple example is the price of a stock at different points of time on a given
day. Another example is the amount of rainfall in a region in different months of the year.
R provides many functions to create, manipulate, and plot time series data. The data for a
time series is stored in an R object called a time-series object, which is an R data object
like a vector or a data frame.
The time-series object is created by using the ts() function.
Syntax
The basic syntax of the ts() function is:
timeseries.object.name <- ts(data, start, end, frequency)

Arguments:

• data is a vector or matrix containing the values used in the time series.
• start specifies the time of the first observation in the time series.
• end specifies the time of the last observation in the time series.
• frequency specifies the number of observations per unit of time (for example, 12 for
  monthly data or 4 for quarterly data when the unit of time is one year).
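
As a small illustration of how these arguments interact, the sketch below builds a monthly series and inspects it. The rainfall values and the names monthly_rainfall, rain_ts, and summer are made up for this example; only ts(), start(), end(), frequency(), and window() come from base R.

```r
# Hypothetical monthly rainfall values (mm) for one year; not real measurements
monthly_rainfall <- c(79.6, 88.2, 72.5, 91.0, 110.3, 125.8,
                      140.1, 133.4, 98.7, 85.2, 77.9, 81.3)

# frequency = 12 means 12 observations per unit of time (one year);
# start = c(2020, 1) places the first observation in period 1 of 2020
rain_ts <- ts(monthly_rainfall, start = c(2020, 1), frequency = 12)

# 'end' was not supplied, so ts() derives it from the length of the data
start(rain_ts)      # first period of the series
end(rain_ts)        # last period of the series
frequency(rain_ts)  # observations per year

# window() extracts a sub-series, here June to August 2020
summer <- window(rain_ts, start = c(2020, 6), end = c(2020, 8))
print(summer)
```

Supplying start and frequency is usually enough; end is only needed when the series should be cut short of the data supplied.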

Source code:

# library required for the decimal_date() and ymd() functions
library(lubridate)

# Weekly data of COVID-19 positive cases from
# 22 January, 2020 to 15 April, 2020
x <- c(580, 7813, 28266, 59287, 75700,
       87820, 95314, 126214, 218843, 471497,
       936851, 1508725, 2072113)

# output to be created as png file
png(file = "timeSeries.png")

# creating time series object
# from date 22 January, 2020
mts <- ts(x, start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# plotting the graph
plot(mts, xlab = "Weekly Data",
     ylab = "Total Positive Cases",
     main = "COVID-19 Pandemic",
     col.main = "darkgreen")

# saving the file
dev.off()

Multivariate Time Series Analysis


# Weekly data of COVID-19 positive cases and
# weekly deaths from 22 January, 2020 to
# 15 April, 2020
positiveCases <- c(580, 7813, 28266, 59287,
                   75700, 87820, 95314, 126214,
                   218843, 471497, 936851,
                   1508725, 2072113)

deaths <- c(17, 270, 565, 1261, 2126, 2800,
            3285, 4628, 8951, 21283, 47210,
            88480, 138475)

# library required for decimal_date() function
library(lubridate)

# output to be created as png file
png(file = "multivariateTimeSeries.png")

# creating multivariate time series object
# from date 22 January, 2020
mts <- ts(cbind(positiveCases, deaths),
          start = decimal_date(ymd("2020-01-22")),
          frequency = 365.25 / 7)

# plotting the graph
plot(mts, xlab = "Weekly Data",
     main = "COVID-19 Cases",
     col.main = "darkgreen")

# saving the file
dev.off()
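
A multivariate time-series object behaves like a matrix whose columns share one time index, so individual series can be pulled out by column name. The following is a minimal sketch of that idea, not part of the original exercise: the names start_time, cases_ts, deaths_ts, and death_ratio are hypothetical, and the start value is written out by hand (assumed to match what decimal_date(ymd("2020-01-22")) returns) so the block runs without lubridate.

```r
positiveCases <- c(580, 7813, 28266, 59287, 75700, 87820, 95314,
                   126214, 218843, 471497, 936851, 1508725, 2072113)
deaths <- c(17, 270, 565, 1261, 2126, 2800, 3285,
            4628, 8951, 21283, 47210, 88480, 138475)

# Assumption: 22 January is day 22 of the leap year 2020, so 21 of 366 days
# have elapsed; this stands in for decimal_date(ymd("2020-01-22"))
start_time <- 2020 + 21 / 366

mts <- ts(cbind(positiveCases, deaths),
          start = start_time, frequency = 365.25 / 7)

# column names are taken from the variables passed to cbind()
colnames(mts)

# selecting a column keeps the time index, so each result is still a ts object
cases_ts  <- mts[, "positiveCases"]
deaths_ts <- mts[, "deaths"]

# element-wise arithmetic works because both series share the same time index
death_ratio <- deaths_ts / cases_ts
round(as.numeric(death_ratio), 4)
```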

SPATIAL DATA:

Description:

tmap element: The building block for drawing thematic maps. All element functions have the
prefix tm_.

Quick thematic map plot: qtm() draws a thematic map quickly. It is a convenient wrapper
around the main plotting method, which stacks tmap elements. When called without arguments
or with a search term, it draws an interactive map.

Syntax:

qtm(shp, fill = NA, symbols.size = NULL, symbols.col = NULL, symbols.shape = NULL,
    dots.col = NULL, text = NULL, text.size = 1, text.col = NA, lines.lwd = NULL,
    lines.col = NULL, raster = NA, borders = NA, by = NULL, scale = NA, title = NA,
    projection = NULL, bbox = NULL, basemaps = NA, overlays = NA, style = NULL,
    format = NULL, ...)
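
Since qtm() is described as a wrapper around stacked tmap elements, it may help to see the stacked form written out. The sketch below is an illustration under assumptions: it presumes the tmap package is installed and that its bundled World and metro datasets are available, and the name layered_map is hypothetical. It is intended to be roughly comparable to combining qtm(World, fill = "continent") with qtm(metro, symbols.size = "pop2010"), not an exact equivalent.

```r
library(tmap)

# World (country polygons) and metro (city points) ship with tmap
data(World, metro)

# static plotting mode, so the map is drawn on the current graphics device
tmap_mode("plot")

# each tm_shape() names a spatial object; the tm_ layers that follow draw it
layered_map <- tm_shape(World) +
  tm_polygons("continent") +          # colour countries by continent
  tm_shape(metro) +
  tm_bubbles(size = "pop2010")        # bubble size from 2010 population

# printing the object draws the map
layered_map
```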
Source code:

> library(tmap)
> data(World, rivers, metro)
> qtm(World)

> qtm(World, fill = "HPI", fill.n = 9, fill.palette = "div",
+ fill.title = "Happy Planet Index", fill.id = "name", style = "gray",
+ format = "World", projection = "+proj=eck4")

> qtm(World, fill = "continent", format = "World", style = "col_blind",
+ projection = "+proj=eck4")

> qtm(World, fill = "economy", format = "World", style = "col_blind",
+ projection = "+proj=eck4")

> qtm(World, borders = NULL) + qtm(metro, symbols.size = "pop2010",
+ symbols.title.size = "Metropolitan Areas", symbols.id = "name", format = "World")

old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()
old-style crs object detected; please recreate object with a recent sf::st_crs()

> current.mode <- tmap_mode("view")

tmap mode set to interactive viewing

> qtm("Viskhapatnam")

tmaptools::geocode_OSM didn't found any results for: "Viskhapatnam".

> tm_shape(World) + tm_polygons("HPI")

> tm1 <- tm_shape(World, projection = "+proj=eck4", simplify = 0.05) + tm_polygons() +
+ tm_layout("Simplification: 0.05")

> tm1

NETWORK DATA:

Description:

igraph is a library and R package for network analysis.

The printed description of an igraph object starts with up to four letters:
• D or U, for a directed or undirected graph
• N for a named graph (where nodes have a name attribute)
• W for a weighted graph (where edges have a weight attribute)
• B for a bipartite (two-mode) graph (where nodes have a type attribute)
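
To make the letter codes concrete, here is a minimal sketch that builds a small named, weighted, undirected graph and checks each property. The vertex names and weights are invented for this illustration, and it assumes the igraph package is installed.

```r
library(igraph)

# graph_from_literal() with plain dashes creates undirected edges;
# using symbolic vertex names makes the graph "named"
g <- graph_from_literal(Alice - Bob, Bob - Carol)

# adding a 'weight' edge attribute makes the graph "weighted"
E(g)$weight <- c(2, 5)

is_directed(g)   # FALSE, so the first letter is U
is_named(g)      # TRUE,  so the second letter is N
is_weighted(g)   # TRUE,  so the third letter is W

# the header line of the printed object should carry the letters UNW-
# followed by the vertex count (3) and the edge count (2)
print(g)
```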

Functions used in Social Network Analysis

• library() function
library() loads and attaches add-on packages.
Syntax:
library(package, help, logical.return = FALSE, ...)

• make_full_graph() function
This function creates a full graph, in which every pair of vertices is connected by an edge.
Syntax:
make_full_graph(n, directed = FALSE, loops = FALSE)

• make_ring() function
This function creates a ring graph. A ring is a one-dimensional lattice; the more general
make_lattice() function can create lattices of arbitrary dimensions, periodic or
non-periodic.
Syntax:
make_ring(n, directed = FALSE, mutual = FALSE, circular = TRUE)

• make_star() function
This function creates a star graph, where every vertex is connected to the center vertex
and to no other vertex.
Syntax:
make_star(n, mode = c("in", "out", "mutual", "undirected"), center = 1)

• sample_gnp() function
This function generates a random graph from a simple model (the Erdős-Rényi G(n, p) model)
in which every possible edge is created independently with the same constant probability p.
Syntax:
sample_gnp(n, p, directed = FALSE, loops = FALSE)

• plot() function
This function draws the given graph in the active graphics window.
Syntax:
plot(defined_graph_name)

Full Graph
Syntax:
make_full_graph()
Parameters:
• n = Number of vertices.
• directed = TRUE/FALSE Whether to create a directed graph or not.
• loops = TRUE/FALSE Whether to add self-loops to the graph or not.

Ring Graph
The ring graph is a one-dimensional lattice and is a special case of the make_lattice()
function.
Syntax:
make_ring()
Parameters:
• n = Number of vertices.
• directed = TRUE/FALSE Whether to create a directed graph or not.
• mutual = TRUE/FALSE Whether directed edges are mutual or not. It is ignored in
  undirected graphs.
• circular = TRUE/FALSE Whether to create a circular ring.

Star Graph
A star graph is one where every vertex is connected to the center vertex and to no other
vertex.
Syntax:
make_star()
Parameters:
• n = Number of vertices.
• center = Id of the center vertex.
• mode = Direction of the edges: in/out/mutual/undirected.
  - in: The edges point to the center.
  - out: The edges point from the center.
  - mutual: A directed star graph is created with mutual edges.
  - undirected: The edges are undirected.
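
One way to check these generators is to compare their edge counts with what the definitions imply. The sketch below does that for the three deterministic graphs and also draws one random graph with sample_gnp(); the object names and the seed value are arbitrary choices for this illustration, and it assumes igraph is installed.

```r
library(igraph)

full_graph <- make_full_graph(8)                        # every pair connected
ring_graph <- make_ring(12)                             # circular by default
star_graph <- make_star(10, mode = "undirected", center = 1)

ecount(full_graph)   # 8 * 7 / 2 = 28 edges
ecount(ring_graph)   # one edge per vertex: 12
ecount(star_graph)   # every non-center vertex has one edge: 9
degree(star_graph)   # the center has degree 9, all other vertices degree 1

# G(n, p) random graph: each of the choose(20, 2) = 190 possible edges
# is included independently with probability 0.2
set.seed(42)         # arbitrary seed, only to make the draw repeatable
random_graph <- sample_gnp(20, p = 0.2)
ecount(random_graph) # varies with the seed; the expected value is 0.2 * 190 = 38
plot(random_graph)
```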

Source code:

> ## Download and install the package
> install.packages("igraph")
> ## Load package
> library(igraph)
Attaching package: ‘igraph’

The following objects are masked from ‘package:stats’:

decompose, spectrum

The following object is masked from ‘package:base’:

union
> #Create networks
> g1 <- graph( edges=c(1,2, 2,3, 3,1), n=3, directed=FALSE )
> plot(g1)
> class(g1)
[1] "igraph"
> g1
IGRAPH ac528bb U--- 3 3 --
+ edges from ac528bb:
[1] 1--2 2--3 1--3

> g4 <- graph( c("John", "Jim", "Jim", "Jack", "Jim", "Jack", "John", "John"),
+ isolates=c("Jesse", "Janis", "Jennifer", "Justin") )
> # In named graphs we can specify isolates by providing a list of their names.
> plot(g4, edge.arrow.size=.5, vertex.color="gold", vertex.size=15,
+ vertex.frame.color="gray", vertex.label.color="black",
+ vertex.label.cex=0.8, vertex.label.dist=2, edge.curved=0.2)

> plot(graph_from_literal(a---b, b---c)) # the number of dashes doesn't matter

> plot(graph_from_literal(a:b:c---c:d:e))

> library(igraph)

> Full_Graph <- make_full_graph(8, directed = FALSE)

> plot(Full_Graph)

> library(igraph)
> Ring_Graph <- make_ring(12, directed = FALSE, mutual = FALSE, circular = TRUE)
> plot(Ring_Graph)

> library(igraph)
> Star_Graph <- make_star(10, center = 1)
> plot(Star_Graph)

CONVERTING NUMERIC VARIABLES INTO FACTORS:

Description:

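cut() divides the range of x into intervals and codes the values in x according to the
interval into which they fall. The leftmost interval corresponds to level one, the next
leftmost to level two, and so on. When every distinct value should become its own level
instead, as.factor() can be used, as in the source code of this section.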

Syntax:
cut(x, ...)

## Default S3 method:
cut(x, breaks, labels = NULL,
include.lowest = FALSE, right = TRUE, dig.lab = 3, ordered_result = FALSE, ...)
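
Arguments:

• x: a numeric vector that is to be converted to a factor.
• breaks: either a numeric vector of two or more cut points, or a single number (at least 2)
  giving the number of intervals into which x is to be cut.
• include.lowest: if TRUE, a value equal to the lowest break (or the highest break when
  right = FALSE) is included in the first (or last) interval.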
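
The effect of breaks and include.lowest is easiest to see on a few values. The sketch below uses the same points values as the data frame in this section; the break points, the labels, and the name band are arbitrary choices made for this illustration.

```r
points <- c(12, 15, 22, 29, 35, 24, 11, 24)

# Three intervals: (10,20], (20,30], (30,40]
# right = TRUE (the default) closes each interval on the right
band <- cut(points, breaks = c(10, 20, 30, 40),
            labels = c("low", "medium", "high"))
table(band)   # counts per level: low 3, medium 4, high 1

# A value equal to the lowest break is excluded by default ...
cut(10, breaks = c(10, 20, 30))                          # NA
# ... and included when include.lowest = TRUE
cut(10, breaks = c(10, 20, 30), include.lowest = TRUE)   # [10,20]
```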

Source code:

#create data frame
df <- data.frame(team=c('A', 'A', 'B', 'B', 'C', 'C', 'C', 'D'),
                 points=c(12, 15, 22, 29, 35, 24, 11, 24))

#view data frame
df

#view structure of data frame
str(df)

#convert points column from numeric to factor
df$points <- as.factor(df$points)

#view updated data frame
df

#view updated structure of data frame
str(df)

Output:
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
 $ team  : chr "A" "A" "B" "B" ...
 $ points: num 12 15 22 29 35 24 11 24
  team points
1    A     12
2    A     15
3    B     22
4    B     29
5    C     35
6    C     24
7    C     11
8    D     24
'data.frame': 8 obs. of 2 variables:
 $ team  : chr "A" "A" "B" "B" ...
 $ points: Factor w/ 7 levels "11","12","15",..: 2 3 4 6 7 5 1 5
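
The integers at the end of the last str() line (2 3 4 6 7 5 1 5) are level codes, not the original values. A short sketch of the difference, using the same points values; the name points_factor is chosen only for this illustration:

```r
points <- c(12, 15, 22, 29, 35, 24, 11, 24)
points_factor <- as.factor(points)

# levels are the sorted distinct values, stored as character strings
levels(points_factor)       # "11" "12" "15" "22" "24" "29" "35"

# converting the factor directly yields the level codes
as.integer(points_factor)   # 2 3 4 6 7 5 1 5

# to recover the original numbers, go through the level labels first
as.numeric(as.character(points_factor))   # 12 15 22 29 35 24 11 24
```

Going through as.character() matters whenever a factor that came from numbers has to be used in arithmetic again.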
