Introduction To R
A. Di Bucchianico
Types of statistical software
• command-line software
– requires knowledge of syntax of commands
– reproducible results through scripts
– detailed analyses possible
• GUI-based software
– does not require knowledge of commands
– actions are not reproducible
• hybrid types (both command-line and GUI)
Well-known statistical software
• SAS
• SPSS
• Minitab
• Statgraphics
• S-Plus
• R
• …
R
• free
• language almost the same as S
• maintained by top-quality experts
• available on all platforms
• continuous improvement
Contents
• Basic operations
• Data creation + I/O
• Component extraction
• Plots
• Basic statistics
• Libraries
• Regression analysis
• Survival analysis
Basic operations
• assignment operation: a <- 2+sqrt(5)
• help function:
– help(pnorm)
– help.search("normal distribution")
• probability functions:
– d (density): dgamma(x,shape,rate)
– p (probability=cdf): pweibull(x,3,2)
– q (quantile): qnorm(0.95)
– r (random numbers): rexp(10,rate)
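The operations above can be tried out as follows. This is a minimal sketch using base R only; the evaluation point x and the gamma and exponential parameters are arbitrary values chosen for illustration and do not come from the slides.

```r
# Assignment: store the value of an expression in a variable
a <- 2 + sqrt(5)

# Help pages open interactively, so they are commented out here
# help(pnorm)
# help.search("normal distribution")

# The four prefixes d, p, q, r applied to different distributions
x <- 1.5                                    # illustrative evaluation point
dens  <- dgamma(x, shape = 2, rate = 1)     # d: density at x
prob  <- pweibull(x, shape = 3, scale = 2)  # p: cdf, P(X <= x)
quant <- qnorm(0.95)                        # q: 95% quantile of N(0, 1)
set.seed(1)                                 # make the random draw repeatable
draws <- rexp(10, rate = 2)                 # r: 10 exponential random numbers
```

Naming the arguments (shape, rate, scale) avoids mistakes when distributions use different parameter orders.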
Data creation + I/O
• create
– vectors: c(1,2,3)
– matrices: matrix(c(1,2,3,4,5,6),2,3,byrow=TRUE) (2 rows, 3 columns)
– list
• patterns:
– ":" operator: 1:3 gives (1,2,3)
– seq: seq(1,3,by=1) gives (1,2,3)
• working directories and files:
– setwd
– getwd
– attach
• read data
– from file: read.table("file.txt",header=TRUE)
– from web: read.table also accepts a URL in place of the file name
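A short sketch of data creation and file input follows. The temporary file and the column names age and height are hypothetical and exist only to make the example self-contained; they do not come from the slides.

```r
# Creating objects
v <- c(1, 2, 3)                                   # vector
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3, byrow = TRUE)
l <- list(name = "sample", values = v)            # list with named components

# Patterns: both expressions produce the values 1, 2, 3
same <- all(1:3 == seq(1, 3, by = 1))

# Working directory
wd <- getwd()                                     # current working directory
# setwd("some/other/directory")                   # hypothetical path, not run

# Write a small table to a temporary file, then read it back
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(age = c(23, 35), height = c(170, 182)),
            file = tmp, row.names = FALSE)
d <- read.table(tmp, header = TRUE)               # first line holds column names
```

With byrow = TRUE the matrix is filled row by row, so the first row is 1, 2, 3 and the second row is 4, 5, 6.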
Component extraction
• d[r,]: rth row of object d
• d[,c]: cth column of object d
• d[r,c]: entry in row r and column c of object d
• length(d): length of d
• d[d<20]: extract all elements of d that are smaller than 20
• d["age"]: extract column "age" from object d
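The extraction rules above can be illustrated on a small data frame. The data values and column names are made up for this sketch.

```r
# Hypothetical data frame with two numeric columns
d <- data.frame(age = c(18, 25, 31, 19), height = c(172, 180, 165, 158))

row2   <- d[2, ]        # second row (a one-row data frame)
col1   <- d[, 1]        # first column (a numeric vector)
entry  <- d[3, 2]       # entry in row 3, column 2
ncols  <- length(d)     # for a data frame, length() counts the columns
agecol <- d["age"]      # column "age", still a data frame

# Logical indexing on a vector: keep the elements smaller than 20
v <- d$age
young <- v[v < 20]
```

Note that length() applied to a vector counts its elements, while for a data frame it counts columns; nrow(d) gives the number of rows.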
Plots
• plot: both 1D and 2D plots
• hist: histogram
• qqnorm: normal probability plot ("quantile-quantile" plot)
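The plotting functions above might be used as in the following sketch. The data are simulated purely for illustration; qqline is an extra base R function, not mentioned on the slide, that adds a reference line to the normal probability plot.

```r
set.seed(42)               # repeatable simulated data
x <- rnorm(100)            # illustrative sample
y <- 2 * x + rnorm(100)    # second variable related to x

plot(x)                    # 1D: values plotted against their index
plot(x, y)                 # 2D: scatter plot of y against x
h <- hist(x)               # histogram; the bin counts are returned in h
qqnorm(x)                  # normal probability plot
qqline(x)                  # reference line through the quartiles
```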
Basic statistics
• summary
• mean
• sd
• t.test
• boxplot
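A brief sketch of these functions on a small sample follows. The measurements and the hypothesized mean of 5 are invented for illustration.

```r
# Hypothetical sample of eight measurements
x <- c(4.9, 5.1, 5.3, 4.8, 5.6, 5.0, 5.2, 4.7)

s  <- summary(x)          # minimum, quartiles, median, mean, maximum
m  <- mean(x)             # sample mean
s1 <- sd(x)               # sample standard deviation
tt <- t.test(x, mu = 5)   # one-sample t-test of H0: mean = 5
p  <- tt$p.value          # extract the p-value from the test result
boxplot(x)                # box-and-whisker plot of the sample
```

The object returned by t.test also contains the confidence interval (tt$conf.int) and the degrees of freedom (tt$parameter).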
Packages
• specialized functions available through packages and libraries
• in Windows interface choose Packages -> Load Packages
• examples of packages:
– qcc (quality control)
– survival
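Packages can also be loaded from the command line, which keeps scripts reproducible. The sketch below assumes that the survival package is already installed (it normally ships with R as a recommended package); qcc usually has to be installed from CRAN first, so that line is commented out.

```r
# install.packages("qcc")    # one-time download from CRAN; needs internet access
library(survival)             # attach an installed package for this session
attached <- search()          # names of everything currently attached
```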
Functions
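Analyses that have to be performed often can be put in the form of functions.
Example:
simple <- function(data, mean = 0, alpha = 0.05) {
  hist(data)   # histogram of the sample
  # two-sided t-test of H0: mu = mean, with confidence level 1 - alpha
  t.test(data, mu = mean, conf.level = 1 - alpha, alternative = "two.sided")
}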
Survival analysis
• through package survival (function Surv creates survival objects)
• Cox proportional hazards: coxph
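As an illustration, a Cox model could be fitted as follows. This sketch assumes the survival package is installed and uses its bundled example data set lung; the choice of covariates age and sex is arbitrary and not taken from the slides.

```r
library(survival)

# Surv combines follow-up time and event status into one survival object.
# In lung, status is coded 1 = censored, 2 = dead.
s <- Surv(lung$time, lung$status)

# Cox proportional hazards model with two illustrative covariates
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
summary(fit)   # coefficients, hazard ratios and tests
```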
Useful web sites
• www.r-project.org
• http://cran.r-project.org/doc/contrib/Short-refcard.pdf
• http://www.uni-muenster.de/ZIV/Mitarbeiter/BennoSueselbeck/s-html/shelp.html
• http://www.maths.lth.se/help/R/
• http://www.mas.ncl.ac.uk/~ndjw1/teaching/sim/R-intro.html
• http://stats.math.uni-augsburg.de/JGR/
• http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/index.html