An Introduction To R


5/28/2015

Table of Contents
Preface
1 Introduction and preliminaries
1.1 The R environment
1.2 Related software and documentation
1.3 R and statistics
1.4 R and the window system
1.5 Using R interactively
1.6 An introductory session
1.7 Getting help with functions and features
1.8 R commands, case sensitivity, etc.
1.9 Recall and correction of previous commands
1.10 Executing commands from or diverting output to a file
1.11 Data permanency and removing objects
2 Simple manipulations; numbers and vectors
2.1 Vectors and assignment
2.2 Vector arithmetic
2.3 Generating regular sequences
2.4 Logical vectors
2.5 Missing values
2.6 Character vectors
2.7 Index vectors; selecting and modifying subsets of a data set
2.8 Other types of objects
3 Objects, their modes and attributes
3.1 Intrinsic attributes: mode and length
3.2 Changing the length of an object
3.3 Getting and setting attributes
3.4 The class of an object
4 Ordered and unordered factors
4.1 A specific example
4.2 The function tapply() and ragged arrays
4.3 Ordered factors
5 Arrays and matrices
5.1 Arrays
5.2 Array indexing. Subsections of an array
5.3 Index matrices
5.4 The array() function
5.4.1 Mixed vector and array arithmetic. The recycling rule
5.5 The outer product of two arrays
5.6 Generalized transpose of an array
5.7 Matrix facilities
5.7.1 Matrix multiplication
http://cran.rproject.org/doc/manuals/Rintro.html#One_002dandtwo_002dsampletests


5.7.2 Linear equations and inversion
5.7.3 Eigenvalues and eigenvectors
5.7.4 Singular value decomposition and determinants
5.7.5 Least squares fitting and the QR decomposition
5.8 Forming partitioned matrices, cbind() and rbind()
5.9 The concatenation function, c(), with arrays
5.10 Frequency tables from factors
6 Lists and data frames
6.1 Lists
6.2 Constructing and modifying lists
6.2.1 Concatenating lists
6.3 Data frames
6.3.1 Making data frames
6.3.2 attach() and detach()
6.3.3 Working with data frames
6.3.4 Attaching arbitrary lists
6.3.5 Managing the search path
7 Reading data from files
7.1 The read.table() function
7.2 The scan() function
7.3 Accessing builtin datasets
7.3.1 Loading data from other R packages
7.4 Editing data
8 Probability distributions
8.1 R as a set of statistical tables
8.2 Examining the distribution of a set of data
8.3 One- and two-sample tests
9 Grouping, loops and conditional execution
9.1 Grouped expressions
9.2 Control statements
9.2.1 Conditional execution: if statements
9.2.2 Repetitive execution: for loops, repeat and while
10 Writing your own functions
10.1 Simple examples
10.2 Defining new binary operators
10.3 Named arguments and defaults
10.4 The '...' argument
10.5 Assignments within functions
10.6 More advanced examples
10.6.1 Efficiency factors in block designs
10.6.2 Dropping all names in a printed array
10.6.3 Recursive numerical integration
10.7 Scope
10.8 Customizing the environment
10.9 Classes, generic functions and object orientation
11 Statistical models in R

11.1 Defining statistical models; formulae
11.1.1 Contrasts
11.2 Linear models
11.3 Generic functions for extracting model information
11.4 Analysis of variance and model comparison
11.4.1 ANOVA tables
11.5 Updating fitted models
11.6 Generalized linear models
11.6.1 Families
11.6.2 The glm() function
11.7 Nonlinear least squares and maximum likelihood models
11.7.1 Least squares
11.7.2 Maximum likelihood
11.8 Some non-standard models
12 Graphical procedures
12.1 High-level plotting commands
12.1.1 The plot() function
12.1.2 Displaying multivariate data
12.1.3 Display graphics
12.1.4 Arguments to high-level plotting functions
12.2 Low-level plotting commands
12.2.1 Mathematical annotation
12.2.2 Hershey vector fonts
12.3 Interacting with graphics
12.4 Using graphics parameters
12.4.1 Permanent changes: The par() function
12.4.2 Temporary changes: Arguments to graphics functions
12.5 Graphics parameters list
12.5.1 Graphical elements
12.5.2 Axes and tick marks
12.5.3 Figure margins
12.5.4 Multiple figure environment
12.6 Device drivers
12.6.1 PostScript diagrams for typeset documents
12.6.2 Multiple graphics devices
12.7 Dynamic graphics
13 Packages
13.1 Standard packages
13.2 Contributed packages and CRAN
13.3 Namespaces
14 OS facilities
14.1 Files and directories
14.2 Filepaths
14.3 System commands
14.4 Compression and Archives
Appendix A A sample session

Appendix B Invoking R
B.1 Invoking R from the command line
B.2 Invoking R under Windows
B.3 Invoking R under OS X
B.4 Scripting with R
Appendix C The command-line editor
C.1 Preliminaries
C.2 Editing actions
C.3 Command-line editor summary
Appendix D Function and variable index
Appendix E Concept index
Appendix F References
This is an introduction to R ("GNU S"), a language and environment for statistical computing
and graphics. R is similar to the award-winning[1] S system, which was developed at Bell
Laboratories by John Chambers et al. It provides a wide variety of statistical and graphical
techniques (linear and nonlinear modelling, statistical tests, time series analysis, classification,
clustering, ...).

This manual provides information on data types, programming elements, statistical modelling
and graphics.

This manual is for R, version 3.2.0 (2015-04-16).

Copyright © 1990 W. N. Venables
Copyright © 1992 W. N. Venables & D. M. Smith
Copyright © 1997 R. Gentleman & R. Ihaka
Copyright © 1997, 1998 M. Maechler
Copyright © 1999-2015 R Core Team

Permission is granted to make and distribute verbatim copies of this manual
provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under
the conditions for verbatim copying, provided that the entire resulting derived work
is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another
language, under the above conditions for modified versions, except that this
permission notice may be stated in a translation approved by the R Core Team.

Preface

This introduction to R is derived from an original set of notes describing the S and S-PLUS
environments written in 1990-2 by Bill Venables and David M. Smith when at the University of
Adelaide. We have made a number of small changes to reflect differences between the R and S
programs, and expanded some of the material.

We would like to extend warm thanks to Bill Venables (and David Smith) for granting
permission to distribute this modified version of the notes in this way, and for being a supporter
of R from way back.

Comments and corrections are always welcome. Please address email correspondence to
R-[email protected].

Suggestions to the reader

Most R novices will start with the introductory session in Appendix A. This should give some
familiarity with the style of R sessions and more importantly some instant feedback on what
actually happens.

Many users will come to R mainly for its graphical facilities. See Graphics, which can be read at
almost any time and need not wait until all the preceding sections have been digested.

1 Introduction and preliminaries
1.1 The R environment

R is an integrated suite of software facilities for data manipulation, calculation and graphical
display. Among other things it has

• an effective data handling and storage facility,
• a suite of operators for calculations on arrays, in particular matrices,
• a large, coherent, integrated collection of intermediate tools for data analysis,
• graphical facilities for data analysis and display either directly at the computer or on
  hardcopy, and
• a well developed, simple and effective programming language (called 'S') which includes
  conditionals, loops, user defined recursive functions and input and output facilities.
  (Indeed most of the system supplied functions are themselves written in the S language.)

The term "environment" is intended to characterize it as a fully planned and coherent system,
rather than an incremental accretion of very specific and inflexible tools, as is frequently the case
with other data analysis software.

R is very much a vehicle for newly developing methods of interactive data analysis. It has
developed rapidly, and has been extended by a large collection of packages. However, most
programs written in R are essentially ephemeral, written for a single piece of data analysis.
1.2 Related software and documentation

R can be regarded as an implementation of the S language which was developed at Bell
Laboratories by Rick Becker, John Chambers and Allan Wilks, and also forms the basis of the
S-PLUS systems.

The evolution of the S language is characterized by four books by John Chambers and coauthors.
For R, the basic reference is The New S Language: A Programming Environment for Data
Analysis and Graphics by Richard A. Becker, John M. Chambers and Allan R. Wilks. The new
features of the 1991 release of S are covered in Statistical Models in S edited by John M.
Chambers and Trevor J. Hastie. The formal methods and classes of the methods package are
based on those described in Programming with Data by John M. Chambers. See References, for
precise references.

There are now a number of books which describe how to use R for data analysis and statistics,
and documentation for S/S-PLUS can typically be used with R, keeping the differences between
the S implementations in mind. See What documentation exists for R? in The R statistical system
FAQ.
1.3 R and statistics

Our introduction to the R environment did not mention statistics, yet many people use R as a
statistics system. We prefer to think of it as an environment within which many classical and
modern statistical techniques have been implemented. A few of these are built into the base R
environment, but many are supplied as packages. There are about 25 packages supplied with R
(called "standard" and "recommended" packages) and many more are available through the
CRAN family of Internet sites (via http://CRAN.Rproject.org) and elsewhere. More details on
packages are given later (see Packages).

Most classical statistics and much of the latest methodology is available for use with R, but users
may need to be prepared to do a little work to find it.

There is an important difference in philosophy between S (and hence R) and the other main
statistical systems. In S a statistical analysis is normally done as a series of steps, with
intermediate results being stored in objects. Thus whereas SAS and SPSS will give copious
output from a regression or discriminant analysis, R will give minimal output and store the
results in a fit object for subsequent interrogation by further R functions.
1.4 R and the window system

The most convenient way to use R is at a graphics workstation running a windowing system.
This guide is aimed at users who have this facility. In particular we will occasionally refer to the
use of R on an X window system although the vast bulk of what is said applies generally to any
implementation of the R environment.

Most users will find it necessary to interact directly with the operating system on their computer
from time to time. In this guide, we mainly discuss interaction with the operating system on
UNIX machines. If you are running R under Windows or OS X you will need to make some
small adjustments.

Setting up a workstation to take full advantage of the customizable features of R is a
straightforward if somewhat tedious procedure, and will not be considered further here. Users in
difficulty should seek local expert help.
1.5 Using R interactively

When you use the R program it issues a prompt when it expects input commands. The default
prompt is '>', which on UNIX might be the same as the shell prompt, and so it may appear that
nothing is happening. However, as we shall see, it is easy to change to a different R prompt if
you wish. We will assume that the UNIX shell prompt is '$'.

In using R under UNIX the suggested procedure for the first occasion is as follows:

1. Create a separate sub-directory, say work, to hold data files on which you will use R for
   this problem. This will be the working directory whenever you use R for this particular
   problem.

   $ mkdir work
   $ cd work

2. Start the R program with the command

   $ R

3. At this point R commands may be issued (see later).
4. To quit the R program the command is

   > q()

   At this point you will be asked whether you want to save the data from your R session. On
   some systems this will bring up a dialog box, and on others you will receive a text prompt
   to which you can respond yes, no or cancel (a single letter abbreviation will do) to save the
   data before quitting, quit without saving, or return to the R session. Data which is saved
   will be available in future R sessions.

Further R sessions are simple.

1. Make work the working directory and start the program as before:

   $ cd work
   $ R

2. Use the R program, terminating with the q() command at the end of the session.

To use R under Windows the procedure to follow is basically the same. Create a folder as the
working directory, and set that in the Start In field in your R shortcut. Then launch R by double
clicking on the icon.
1.6 An introductory session

Readers wishing to get a feel for R at a computer before proceeding are strongly advised to work
through the introductory session given in A sample session.
1.7 Getting help with functions and features

R has an inbuilt help facility similar to the man facility of UNIX. To get more information on any
specific named function, for example solve, the command is

> help(solve)

An alternative is

> ?solve

For a feature specified by special characters, the argument must be enclosed in double or single
quotes, making it a "character string". This is also necessary for a few words with syntactic
meaning including if, for and function.

> help("[[")

Either form of quote mark may be used to escape the other, as in the string "It's important".
Our convention is to use double quote marks for preference.

On most R installations help is available in HTML format by running

> help.start()

which will launch a Web browser that allows the help pages to be browsed with hyperlinks. On
UNIX, subsequent help requests are sent to the HTML-based help system. The 'Search Engine
and Keywords' link in the page loaded by help.start() is particularly useful as it contains a
high-level concept list which searches through available functions. It can be a great way to get
your bearings quickly and to understand the breadth of what R has to offer.

The help.search command (alternatively ??) allows searching for help in various ways. For
example,

> ??solve

Try ?help.search for details and more examples.

The examples on a help topic can normally be run by

> example(topic)

Windows versions of R have other optional help systems: use

> ?help

for further details.
1.8 R commands, case sensitivity, etc.

Technically R is an expression language with a very simple syntax. It is case sensitive as are
most UNIX based packages, so A and a are different symbols and would refer to different
variables. The set of symbols which can be used in R names depends on the operating system
and country within which R is being run (technically on the locale in use). Normally all
alphanumeric symbols are allowed[2] (and in some countries this includes accented letters) plus
'.' and '_', with the restriction that a name must start with '.' or a letter, and if it starts with '.'
the second character must not be a digit. Names are effectively unlimited in length.

Elementary commands consist of either expressions or assignments. If an expression is given as
a command, it is evaluated, printed (unless specifically made invisible), and the value is lost. An
assignment also evaluates an expression and passes the value to a variable but the result is not
automatically printed.
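
The difference between an expression and an assignment can be tried directly at the prompt. The following is only a minimal sketch; the variable name total is an illustrative assumption and does not appear elsewhere in this manual.

```r
# An expression used as a command is evaluated and printed; the value is then lost.
2 + 3

# An assignment stores the value in a variable and prints nothing.
total <- 2 + 3

# Wrapping an assignment in parentheses makes it an expression again, so it is printed.
(total <- total * 2)

# invisible() marks a value as invisible, so it is not printed automatically.
invisible(total + 1)
```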

Commands are separated either by a semi-colon (';'), or by a newline. Elementary commands
can be grouped together into one compound expression by braces ('{' and '}'). Comments can be
put almost[3] anywhere, starting with a hashmark ('#'); everything to the end of the line is a
comment.

If a command is not complete at the end of a line, R will give a different prompt, by default

+

on second and subsequent lines and continue to read input until the command is syntactically
complete. This prompt may be changed by the user. We will generally omit the continuation
prompt and indicate continuation by simple indenting.

Command lines entered at the console are limited[4] to about 4095 bytes (not characters).
1.9 Recall and correction of previous commands

Under many versions of UNIX and on Windows, R provides a mechanism for recalling and
re-executing previous commands. The vertical arrow keys on the keyboard can be used to scroll
forward and backward through a command history. Once a command is located in this way, the
cursor can be moved within the command using the horizontal arrow keys, and characters can be
removed with the DEL key or added with the other keys. More details are provided later: see The
command-line editor.

The recall and editing capabilities under UNIX are highly customizable. You can find out how to
do this by reading the manual entry for the readline library.

Alternatively, the Emacs text editor provides more general support mechanisms (via ESS, Emacs
Speaks Statistics) for working interactively with R. See R and Emacs in The R statistical system
FAQ.
1.10 Executing commands from or diverting output to a file

If commands[5] are stored in an external file, say commands.R in the working directory work, they
may be executed at any time in an R session with the command

> source("commands.R")

For Windows Source is also available on the File menu. The function sink,

> sink("record.lis")

will divert all subsequent output from the console to an external file, record.lis. The command

> sink()

restores it to the console once again.
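
The round trip through source() and sink() can be sketched as follows. This is an illustration under stated assumptions: temporary files created with tempfile() stand in for commands.R and record.lis, and the names script, record, a and b are hypothetical.

```r
# Write two commands to a temporary script file (a stand-in for commands.R).
script <- tempfile(fileext = ".R")
writeLines(c("a <- 1:5", "b <- sum(a)"), script)

# Execute the stored commands in the current session.
source(script)

# Divert console output to a temporary file (a stand-in for record.lis).
record <- tempfile(fileext = ".lis")
sink(record)
print(b)
sink()   # restore output to the console

# The diverted output can now be read back from the file.
readLines(record)
```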

1.11 Data permanency and removing objects

The entities that R creates and manipulates are known as objects. These may be variables, arrays
of numbers, character strings, functions, or more general structures built from such components.

During an R session, objects are created and stored by name (we discuss this process in the next
section). The R command

> objects()

(alternatively, ls()) can be used to display the names of (most of) the objects which are currently
stored within R. The collection of objects currently stored is called the workspace.

To remove objects the function rm is available:

> rm(x, y, z, ink, junk, temp, foo, bar)

All objects created during an R session can be stored permanently in a file for use in future R
sessions. At the end of each R session you are given the opportunity to save all the currently
available objects. If you indicate that you want to do this, the objects are written to a file called
.RData[6] in the current directory, and the command lines used in the session are saved to a file
called .Rhistory.

When R is started at a later time from the same directory it reloads the workspace from this file.
At the same time the associated commands history is reloaded.

It is recommended that you should use separate working directories for analyses conducted with
R. It is quite common for objects with names x and y to be created during an analysis. Names
like this are often meaningful in the context of a single analysis, but it can be quite hard to decide
what they might be when several analyses have been conducted in the same directory.
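
As a small illustration of objects() and rm() described in this section, the sketch below creates two objects and removes one of them. The object names and values are hypothetical.

```r
# Create two objects in the workspace.
x <- c(1, 2, 3)
junk <- "scratch value"

# List the names of the objects currently stored; ls() is equivalent.
objects()

# Remove one object by name.
rm(junk)

# junk is gone, while x is still stored.
exists("junk", inherits = FALSE)
"x" %in% ls()
```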

2 Simple manipulations; numbers and vectors
2.1 Vectors and assignment

R operates on named data structures. The simplest such structure is the numeric vector, which is
a single entity consisting of an ordered collection of numbers. To set up a vector named x, say,
consisting of five numbers, namely 10.4, 5.6, 3.1, 6.4 and 21.7, use the R command

> x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

This is an assignment statement using the function c() which in this context can take an arbitrary
number of vector arguments and whose value is a vector got by concatenating its arguments end
to end.[7]

A number occurring by itself in an expression is taken as a vector of length one.

Notice that the assignment operator ('<-') consists of the two characters '<' ("less than")
and '-' ("minus") occurring strictly side-by-side, and that it "points" to the object receiving the
value of the expression. In most contexts the '=' operator can be used as an alternative.

Assignment can also be made using the function assign(). An equivalent way of making the
same assignment as above is with:

> assign("x", c(10.4, 5.6, 3.1, 6.4, 21.7))

The usual operator, <-, can be thought of as a syntactic short-cut to this.

Assignments can also be made in the other direction, using the obvious change in the assignment
operator. So the same assignment could be made using

> c(10.4, 5.6, 3.1, 6.4, 21.7) -> x

If an expression is used as a complete command, the value is printed and lost[8]. So now if we
were to use the command

> 1/x

the reciprocals of the five values would be printed at the terminal (and the value of x, of course,
unchanged).

The further assignment

> y <- c(x, 0, x)

would create a vector y with 11 entries consisting of two copies of x with a zero in the middle
place.
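
The three forms of assignment described above can be checked against each other. In this sketch x2 and x3 are hypothetical names introduced only so that the results can be compared with x.

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# assign() and the right-pointing arrow create the same vector.
assign("x2", c(10.4, 5.6, 3.1, 6.4, 21.7))
c(10.4, 5.6, 3.1, 6.4, 21.7) -> x3
identical(x, x2)
identical(x, x3)

# An expression on its own is printed; x itself is unchanged.
1/x

# Two copies of x with a zero in the middle place.
y <- c(x, 0, x)
length(y)
```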
2.2 Vector arithmetic

Vectors can be used in arithmetic expressions, in which case the operations are performed
element by element. Vectors occurring in the same expression need not all be of the same length.
If they are not, the value of the expression is a vector with the same length as the longest vector
which occurs in the expression. Shorter vectors in the expression are recycled as often as need be
(perhaps fractionally) until they match the length of the longest vector. In particular a constant is
simply repeated. So with the above assignments the command

> v <- 2*x + y + 1

generates a new vector v of length 11 constructed by adding together, element by element, 2*x
repeated 2.2 times, y repeated just once, and 1 repeated 11 times.
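
The recycling just described can be written out explicitly with rep(). This is a sketch rather than a description of R's internals; the names x_rec and v_explicit are hypothetical. Because 11 is not a multiple of 5, R also issues a warning for this expression, which is suppressed here.

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)
y <- c(x, 0, x)

# R recycles x fractionally (2.2 times) and warns that the lengths do not divide evenly.
v <- suppressWarnings(2*x + y + 1)

# The same computation with the recycling written out by hand.
x_rec <- rep(x, length.out = length(y))
v_explicit <- 2*x_rec + y + rep(1, length(y))

length(v)
all.equal(v, v_explicit)
```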

The elementary arithmetic operators are the usual +, -, *, / and ^ for raising to a power. In
addition all of the common arithmetic functions are available. log, exp, sin, cos, tan, sqrt, and
so on, all have their usual meaning. max and min select the largest and smallest elements of a
vector respectively. range is a function whose value is a vector of length two, namely c(min(x),
max(x)). length(x) is the number of elements in x, sum(x) gives the total of the elements in x,
and prod(x) their product.

Two statistical functions are mean(x) which calculates the sample mean, which is the same as
sum(x)/length(x), and var(x) which gives

sum((x-mean(x))^2)/(length(x)-1)

or sample variance. If the argument to var() is an n-by-p matrix the value is a p-by-p sample
covariance matrix got by regarding the rows as independent p-variate sample vectors.
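
The two formulas can be verified numerically. In the sketch below the names m, s2 and mat are hypothetical, and the 10-by-2 matrix of random numbers is an assumption made only to show the shape of the covariance matrix.

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# Sample mean and sample variance from their definitions.
m  <- sum(x) / length(x)
s2 <- sum((x - mean(x))^2) / (length(x) - 1)

all.equal(m, mean(x))
all.equal(s2, var(x))

# For an n-by-p matrix, var() returns a p-by-p covariance matrix.
set.seed(1)
mat <- matrix(rnorm(20), nrow = 10, ncol = 2)
dim(var(mat))
```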

sort(x) returns a vector of the same size as x with the elements arranged in increasing order;
however there are other more flexible sorting facilities available (see order() or sort.list()
which produce a permutation to do the sorting).

Note that max and min select the largest and smallest values in their arguments, even if they are
given several vectors. The parallel maximum and minimum functions pmax and pmin return a
vector (of length equal to their longest argument) that contains in each element the largest
(smallest) element in that position in any of the input vectors.
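
A short comparison may make the difference between max and pmax concrete, and show how order() relates to sort(). The vectors a and b are hypothetical.

```r
a <- c(3, 9, 1)
b <- c(7, 2, 5)

max(a, b)    # one value: the largest element over both vectors
pmax(a, b)   # element by element: the larger of each pair
pmin(a, b)   # element by element: the smaller of each pair

# order() gives the permutation that sorts a; applying it reproduces sort(a).
order(a)
a[order(a)]
```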

For most purposes the user will not be concerned if the "numbers" in a numeric vector are
integers, reals or even complex. Internally calculations are done as double precision real
numbers, or double precision complex numbers if the input data are complex.

To work with complex numbers, supply an explicit complex part. Thus

sqrt(-17)

will give NaN and a warning, but

sqrt(-17+0i)

will do the computations as complex numbers.
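
The two calls can be compared as follows. The names r and z are hypothetical, and the warning from the first call is suppressed only to keep the output short.

```r
# Real arithmetic: the result is NaN (a warning is issued unless suppressed).
r <- suppressWarnings(sqrt(-17))
is.nan(r)

# Complex arithmetic: the result is purely imaginary.
z <- sqrt(-17+0i)
Re(z)
Im(z)
```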
2.3 Generating regular sequences

R has a number of facilities for generating commonly used sequences of numbers. For example
1:30 is the vector c(1, 2, ..., 29, 30). The colon operator has high priority within an expression,
so, for example 2*1:15 is the vector c(2, 4, ..., 28, 30). Put n <- 10 and compare the sequences
1:n-1 and 1:(n-1).

The construction 30:1 may be used to generate a sequence backwards.
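
The suggested comparison can be run as written. Because the colon operator binds more tightly than the arithmetic operators, 1:n-1 means (1:n)-1.

```r
n <- 10

1:n-1     # (1:n) - 1, that is 0, 1, ..., 9
1:(n-1)   # 1, 2, ..., 9

2*1:15    # 2 * (1:15), that is 2, 4, ..., 30
30:1      # 30, 29, ..., 1
```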

The function seq() is a more general facility for generating sequences. It has five arguments,
only some of which may be specified in any one call. The first two arguments, if given, specify
the beginning and end of the sequence, and if these are the only two arguments given the result is
the same as the colon operator. That is seq(2,10) is the same vector as 2:10.

Arguments to seq(), and to many other R functions, can also be given in named form, in which
case the order in which they appear is irrelevant. The first two arguments may be named
from=value and to=value; thus seq(1,30), seq(from=1, to=30) and seq(to=30, from=1) are all
the same as 1:30. The next two arguments to seq() may be named by=value and length=value,
which specify a step size and a length for the sequence respectively. If neither of these is given,
the default by=1 is assumed.

For example

> seq(-5, 5, by=.2) -> s3

generates in s3 the vector c(-5.0, -4.8, -4.6, ..., 4.6, 4.8, 5.0). Similarly

> s4 <- seq(length=51, from=-5, by=.2)

generates the same vector in s4.

The fifth argument may be named along=vector, which is normally used as the only argument to
create the sequence 1, 2, ..., length(vector), or the empty sequence if the vector is empty (as it
can be).
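
The named forms of seq() can be compared directly. One assumption to note: the sketch spells the argument names out in full as length.out and along.with, of which length= and along= above are abbreviations. The character vector used with along.with is hypothetical.

```r
# Step size given: from -5 to 5 in steps of 0.2.
s3 <- seq(-5, 5, by = 0.2)

# Length given instead of the end point.
s4 <- seq(length.out = 51, from = -5, by = 0.2)

length(s3)
all.equal(s3, s4)

# along.with gives 1, 2, ..., length(vector), or an empty sequence for an empty vector.
seq(along.with = c("a", "b", "c"))
seq(along.with = character(0))
```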

A related function is rep() which can be used for replicating an object in various complicated
ways. The simplest form is

> s5 <- rep(x, times=5)

which will put five copies of x end-to-end in s5. Another useful version is

> s6 <- rep(x, each=5)

which repeats each element of x five times before moving on to the next.
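
The difference between times and each is easiest to see on a short vector. The three-element x below is a hypothetical stand-in for the x used earlier, and two repetitions are used instead of five to keep the output short.

```r
x <- c(1, 2, 3)

rep(x, times = 2)   # whole vector repeated: 1 2 3 1 2 3
rep(x, each = 2)    # each element repeated: 1 1 2 2 3 3
```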
2.4 Logical vectors

As well as numerical vectors, R allows manipulation of logical quantities. The elements of a
logical vector can have the values TRUE, FALSE, and NA (for "not available", see below). The first
two are often abbreviated as T and F, respectively. Note however that T and F are just variables
which are set to TRUE and FALSE by default, but are not reserved words and hence can be
overwritten by the user. Hence, you should always use TRUE and FALSE.

Logical vectors are generated by conditions. For example

> temp <- x > 13

sets temp as a vector of the same length as x with values FALSE corresponding to elements of x
where the condition is not met and TRUE where it is.

The logical operators are <, <=, >, >=, == for exact equality and != for inequality. In addition if c1
and c2 are logical expressions, then c1 & c2 is their intersection ("and"), c1 | c2 is their union
("or"), and !c1 is the negation of c1.

Logical vectors may be used in ordinary arithmetic, in which case they are coerced into numeric
vectors, FALSE becoming 0 and TRUE becoming 1. However there are situations where logical
vectors and their coerced numeric counterparts are not equivalent, for example see the next
subsection.
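
A brief sketch of these operators, using the vector x from the earlier sections. The thresholds 20 and 5 are arbitrary choices for illustration.

```r
x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

# A condition yields a logical vector of the same length as x.
temp <- x > 13
temp

# Combining conditions with "and", "or" and negation.
temp & x < 20
temp | x < 5
!temp

# In arithmetic TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
# and mean() gives the proportion of TRUE values.
sum(temp)
mean(x > 5)
```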
2.5 Missing values

In some cases the components of a vector may not be completely known. When an element or
value is "not available" or a "missing value" in the statistical sense, a place within a vector may
be reserved for it by assigning it the special value NA. In general any operation on an NA becomes
an NA. The motivation for this rule is simply that if the specification of an operation is
incomplete, the result cannot be known and hence is not available.

The function is.na(x) gives a logical vector of the same size as x with value TRUE if and only if
the corresponding element in x is NA.

> z <- c(1:3,NA); ind <- is.na(z)

Notice that the logical expression x == NA is quite different from is.na(x) since NA is not really a
value but a marker for a quantity that is not available. Thus x == NA is a vector of the same
length as x all of whose values are NA as the logical expression itself is incomplete and hence
undecidable.

Note that there is a second kind of "missing" value, produced by numerical
computation: the so-called Not a Number, NaN, values. Examples are

> 0/0

or

> Inf - Inf

which both give NaN since the result cannot be defined sensibly.

In summary, is.na(xx) is TRUE both for NA and NaN values. To differentiate these, is.nan(xx) is
only TRUE for NaNs.

Missing values are sometimes printed as <NA> when character vectors are printed without quotes.
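
The behaviour of NA and NaN described above can be checked with a short example. The vector xx is a hypothetical one that mixes an ordinary number, an NA and two NaN values.

```r
z <- c(1:3, NA)

# is.na() marks the missing element.
is.na(z)

# Comparing with NA gives NA for every element.
z == NA

# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
xx <- c(1, NA, 0/0, Inf - Inf)
is.na(xx)
is.nan(xx)
```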
2.6 Character vectors
Character quantities and character vectors are used frequently in R, for example as plot labels.
Where needed they are denoted by a sequence of characters delimited by the double quote
character, e.g., "x-values", "New iteration results".
Character strings are entered using either matching double (") or single (') quotes, but are
printed using double quotes (or sometimes without quotes). They use C-style escape sequences,
using \ as the escape character, so \\ is entered and printed as \\, and inside double quotes " is
entered as \". Other useful escape sequences are \n, newline, \t, tab and \b, backspace; see
?Quotes for a full list.
Character vectors may be concatenated into a vector by the c() function; examples of their use
will emerge frequently.
The paste() function takes an arbitrary number of arguments and concatenates them one by one
into character strings. Any numbers given among the arguments are coerced into character
strings in the evident way, that is, in the same way they would be if they were printed. The
arguments are by default separated in the result by a single blank character, but this can be
changed by the named argument, sep=string, which changes it to string, possibly empty.
For example
> labs <- paste(c("X","Y"), 1:10, sep="")

makes labs into the character vector
c("X1", "Y2", "X3", "Y4", "X5", "Y6", "X7", "Y8", "X9", "Y10")

Note particularly that recycling of short lists takes place here too; thus c("X", "Y") is repeated 5
times to match the sequence 1:10. (See footnote 9.)
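As a short illustration of the separator argument and of recycling in paste(), the following sketch repeats the example from the text and adds two calls of its own (the strings "Run", "a" and "b" are arbitrary).

```r
# The example from the text: c("X", "Y") is recycled against 1:10.
labs <- paste(c("X", "Y"), 1:10, sep = "")

# Default separator is a single blank; numbers are coerced as if printed.
runs <- paste("Run", 1:3)

# An empty separator glues the pieces together directly.
glued <- paste("a", "b", sep = "")

print(labs)
print(runs)
print(glued)
```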
2.7 Index vectors; selecting and modifying subsets of a data set
Subsets of the elements of a vector may be selected by appending to the name of the vector an
index vector in square brackets. More generally, any expression that evaluates to a vector may
have subsets of its elements similarly selected by appending an index vector in square brackets
immediately after the expression.
Such index vectors can be any of four distinct types.
1. A logical vector. In this case the index vector is recycled to the same length as the vector
from which elements are to be selected. Values corresponding to TRUE in the index vector
are selected and those corresponding to FALSE are omitted. For example
> y <- x[!is.na(x)]

creates (or re-creates) an object y which will contain the non-missing values of x, in the
same order. Note that if x has missing values, y will be shorter than x. Also
> (x+1)[(!is.na(x)) & x>0] -> z

creates an object z and places in it the values of the vector x+1 for which the corresponding
value in x was both non-missing and positive.
2. A vector of positive integral quantities. In this case the values in the index vector must
lie in the set {1, 2, ..., length(x)}. The corresponding elements of the vector are selected
and concatenated, in that order, in the result. The index vector can be of any length and
the result is of the same length as the index vector. For example x[6] is the sixth
component of x and
> x[1:10]

selects the first 10 elements of x (assuming length(x) is not less than 10). Also
> c("x","y")[rep(c(1,2,2,1), times=4)]

(an admittedly unlikely thing to do) produces a character vector of length 16 consisting of
"x", "y", "y", "x" repeated four times.
3. A vector of negative integral quantities. Such an index vector specifies the values to be
excluded rather than included. Thus
> y <- x[-(1:5)]

gives y all but the first five elements of x.
4. A vector of character strings. This possibility only applies where an object has a names
attribute to identify its components. In this case a sub-vector of the names vector may be
used in the same way as the positive integral labels in item 2 further above.
> fruit <- c(5, 10, 1, 20)
> names(fruit) <- c("orange", "banana", "apple", "peach")
> lunch <- fruit[c("apple","orange")]

The advantage is that alphanumeric names are often easier to remember than numeric
indices. This option is particularly useful in connection with data frames, as we shall see
later.
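The four kinds of index vector can be compared side by side on the fruit vector from item 4. This is only a sketch; the result names (by_logical and so on) are invented for the illustration.

```r
fruit <- c(5, 10, 1, 20)
names(fruit) <- c("orange", "banana", "apple", "peach")

by_logical   <- fruit[fruit > 4]              # 1. logical index vector
by_position  <- fruit[c(1, 3)]                # 2. positive integers
by_exclusion <- fruit[-(1:2)]                 # 3. negative integers: drop the first two
by_name      <- fruit[c("apple", "orange")]   # 4. character strings (needs names)

print(by_logical)
print(by_position)
print(by_exclusion)
print(by_name)
```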
An indexed expression can also appear on the receiving end of an assignment, in which case the
assignment operation is performed only on those elements of the vector. The expression must be
of the form vector[index_vector], as having an arbitrary expression in place of the vector name
does not make much sense here.
For example
> x[is.na(x)] <- 0

replaces any missing values in x by zeros and
> y[y < 0] <- -y[y < 0]

has the same effect as
> y <- abs(y)
2.8 Other types of objects
Vectors are the most important type of object in R, but there are several others which we will
meet more formally in later sections.
matrices or more generally arrays are multi-dimensional generalizations of vectors. In
fact, they are vectors that can be indexed by two or more indices and will be printed in
special ways. See Arrays and matrices.
factors provide compact ways to handle categorical data. See Factors.
lists are a general form of vector in which the various elements need not be of the same
type, and are often themselves vectors or lists. Lists provide a convenient way to return the
results of a statistical computation. See Lists.
data frames are matrix-like structures, in which the columns can be of different types.
Think of data frames as "data matrices" with one row per observational unit but with
(possibly) both numerical and categorical variables. Many experiments are best described
by data frames: the treatments are categorical but the response is numeric. See Data
frames.
functions are themselves objects in R which can be stored in the project's workspace. This
provides a simple and convenient way to extend R. See Writing your own functions.
3 Objects, their modes and attributes
3.1 Intrinsic attributes: mode and length
The entities R operates on are technically known as objects. Examples are vectors of numeric
(real) or complex values, vectors of logical values and vectors of character strings. These are
known as "atomic" structures since their components are all of the same type, or mode, namely
numeric (footnote 10), complex, logical, character and raw.
Vectors must have their values all of the same mode. Thus any given vector must be
unambiguously either logical, numeric, complex, character or raw. (The only apparent exception
to this rule is the special "value" listed as NA for quantities not available, but in fact there are
several types of NA.) Note that a vector can be empty and still have a mode. For example the
empty character string vector is listed as character(0) and the empty numeric vector as
numeric(0).
R also operates on objects called lists, which are of mode list. These are ordered sequences of
objects which individually can be of any mode. Lists are known as "recursive" rather than atomic
structures since their components can themselves be lists in their own right.
The other recursive structures are those of mode function and expression. Functions are the
objects that form part of the R system along with similar user-written functions, which we
discuss in some detail later. Expressions as objects form an advanced part of R which will not be
discussed in this guide, except indirectly when we discuss formulae used with modeling in R.
By the mode of an object we mean the basic type of its fundamental constituents. This is a
special case of a "property" of an object. Another property of every object is its length. The
functions mode(object) and length(object) can be used to find out the mode and length of any
defined structure (footnote 11).
Further properties of an object are usually provided by attributes(object), see Getting and
setting attributes. Because of this, mode and length are also called "intrinsic attributes" of an
object.
For example, if z is a complex vector of length 100, then in an expression mode(z) is the
character string "complex" and length(z) is 100.
R caters for changes of mode almost anywhere it could be considered sensible to do so (and a
few where it might not be). For example with
> z <- 0:9

we could put
> digits <- as.character(z)

after which digits is the character vector c("0", "1", "2", ..., "9"). A further coercion, or
change of mode, reconstructs the numerical vector again:
> d <- as.integer(digits)

Now d and z are the same (footnote 12). There is a large collection of functions of the form
as.something() for either coercion from one mode to another, or for investing an object with
some other attribute it may not already possess. The reader should consult the different help
files to become familiar with them.
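The round trip through character mode described above can be verified directly with mode(), length() and identical(). The sketch below uses only the objects named in the text.

```r
z <- 0:9
digits <- as.character(z)   # coercion to character mode
d <- as.integer(digits)     # coercion back to integer

print(mode(z))              # integer vectors report mode "numeric"
print(mode(digits))
print(length(digits))
print(identical(d, z))      # the round trip reconstructs z
```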
3.2 Changing the length of an object
An "empty" object may still have a mode. For example
> e <- numeric()

makes e an empty vector structure of mode numeric. Similarly character() is an empty character
vector, and so on. Once an object of any size has been created, new components may be added to
it simply by giving it an index value outside its previous range. Thus
> e[3] <- 17

now makes e a vector of length 3 (the first two components of which are at this point both NA).
This applies to any structure at all, provided the mode of the additional component(s) agrees with
the mode of the object in the first place.
This automatic adjustment of lengths of an object is used often, for example in the scan()
function for input. (See The scan() function.)
Conversely, to truncate the size of an object requires only an assignment to do so. Hence if alpha
is an object of length 10, then
> alpha <- alpha[2 * 1:5]

makes it an object of length 5 consisting of just the former components with even index. (The
old indices are not retained, of course.) We can then retain just the first three values by
> length(alpha) <- 3

and vectors can be extended (by missing values) in the same way.
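The following sketch walks through the same steps with concrete values. The starting contents of alpha (10, 20, ..., 100) are an assumption chosen only to make the effect visible.

```r
e <- numeric()
e[3] <- 17                  # extends e to length 3; e[1] and e[2] are NA

alpha <- (1:10) * 10        # illustrative data: 10, 20, ..., 100
alpha <- alpha[2 * 1:5]     # keep components with even index
length(alpha) <- 3          # truncate to the first three values
length(alpha) <- 5          # extend again; the new components are NA

print(e)
print(alpha)
```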
3.3 Getting and setting attributes
The function attributes(object) returns a list of all the non-intrinsic attributes currently defined
for that object. The function attr(object, name) can be used to select a specific attribute. These
functions are rarely used, except in rather special circumstances when some new attribute is
being created for some particular purpose, for example to associate a creation date or an operator
with an R object. The concept, however, is very important.
Some care should be exercised when assigning or deleting attributes since they are an integral
part of the object system used in R.
When attr() is used on the left hand side of an assignment it can be used either to associate a new
attribute with object or to change an existing one. For example
> attr(z, "dim") <- c(10,10)

allows R to treat z as if it were a 10-by-10 matrix.
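A minimal sketch of both uses of attr(), assuming z holds 100 values. The attribute name "created" and its value are hypothetical, chosen to mirror the "creation date" use mentioned above.

```r
z <- 1:100
attr(z, "dim") <- c(10, 10)            # z is now treated as a 10-by-10 matrix
attr(z, "created") <- "2015-05-28"     # hypothetical custom attribute

print(is.matrix(z))
print(attr(z, "created"))
print(names(attributes(z)))            # lists the non-intrinsic attributes
```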
3.4 The class of an object
All objects in R have a class, reported by the function class. For simple vectors this is just the
mode, for example "numeric", "logical", "character" or "list", but "matrix", "array",
"factor" and "data.frame" are other possible values.
A special attribute known as the class of the object is used to allow for an object-oriented style
(footnote 13) of programming in R. For example if an object has class "data.frame", it will be
printed in a certain way, the plot() function will display it graphically in a certain way, and other
so-called generic functions such as summary() will react to it as an argument in a way sensitive
to its class.
To remove temporarily the effects of class, use the function unclass(). For example if winter
has the class "data.frame" then
> winter

will print it in data frame form, which is rather like a matrix, whereas
> unclass(winter)

will print it as an ordinary list. Only in rather special situations do you need to use this facility,
but one is when you are learning to come to terms with the idea of class and generic functions.
Generic functions and classes will be discussed further in Object orientation, but only briefly.
4 Ordered and unordered factors
A factor is a vector object used to specify a discrete classification (grouping) of the components
of other vectors of the same length. R provides both ordered and unordered factors. While the
"real" application of factors is with model formulae (see Contrasts), we here look at a specific
example.
4.1 A specific example
Suppose, for example, we have a sample of 30 tax accountants from all the states and territories
of Australia (footnote 14) and their individual state of origin is specified by a character vector of
state mnemonics as
> state <- c("tas", "sa",  "qld", "nsw", "nsw", "nt",  "wa",  "wa",
             "qld", "vic", "nsw", "vic", "qld", "qld", "sa",  "tas",
             "sa",  "nt",  "wa",  "vic", "qld", "nsw", "nsw", "wa",
             "sa",  "act", "nsw", "vic", "vic", "act")

Notice that in the case of a character vector, "sorted" means sorted in alphabetical order.
A factor is similarly created using the factor() function:
> statef <- factor(state)

The print() function handles factors slightly differently from other objects:
> statef
 [1] tas sa  qld nsw nsw nt  wa  wa  qld vic nsw vic qld qld sa
[16] tas sa  nt  wa  vic qld nsw nsw wa  sa  act nsw vic vic act
Levels:  act nsw nt qld sa tas vic wa

To find out the levels of a factor the function levels() can be used.
> levels(statef)
[1] "act" "nsw" "nt"  "qld" "sa"  "tas" "vic" "wa"

4.2 The function tapply() and ragged arrays
To continue the previous example, suppose we have the incomes of the same tax accountants in
another vector (in suitably large units of money)
> incomes <- c(60, 49, 40, 61, 64, 60, 59, 54, 62, 69, 70, 42, 56,
               61, 61, 61, 58, 51, 48, 65, 49, 49, 41, 48, 52, 46,
               59, 46, 58, 43)

To calculate the sample mean income for each state we can now use the special function
tapply():
> incmeans <- tapply(incomes, statef, mean)

giving a means vector with the components labelled by the levels
   act    nsw     nt    qld     sa    tas    vic     wa
44.500 57.333 55.500 53.600 55.000 60.500 56.000 52.250

The function tapply() is used to apply a function, here mean(), to each group of components of
the first argument, here incomes, defined by the levels of the second argument, here statef
(footnote 15), as if they were separate vector structures. The result is a structure of the same
length as the levels attribute of the factor containing the results. The reader should consult the
help document for more details.
Suppose further we needed to calculate the standard errors of the state income means. To do this
we need to write an R function to calculate the standard error for any given vector. Since there is
a built-in function var() to calculate the sample variance, such a function is a very simple
one-liner, specified by the assignment:
> stderr <- function(x) sqrt(var(x)/length(x))

(Writing functions will be considered later in Writing your own functions, and in this case was
unnecessary as R also has a built-in function sd().) After this assignment, the standard errors are
calculated by
> incster <- tapply(incomes, statef, stderr)

and the values calculated are then
> incster
   act    nsw     nt    qld     sa    tas    vic     wa
   1.5 4.3102    4.5 4.1061 2.7386    0.5  5.244 2.6575

As an exercise you may care to find the usual 95% confidence limits for the state mean incomes.
To do this you could use tapply() once more with the length() function to find the sample sizes,
and the qt() function to find the percentage points of the appropriate t-distributions. (You could
also investigate R's facilities for t-tests.)
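One possible approach to the exercise is sketched below; it is not the only answer. The standard-error helper is named stdError here (an assumption, to avoid masking R's own stderr() connection function), and ns, tcrit and ci are names introduced for the illustration.

```r
state <- c("tas", "sa",  "qld", "nsw", "nsw", "nt",  "wa",  "wa",
           "qld", "vic", "nsw", "vic", "qld", "qld", "sa",  "tas",
           "sa",  "nt",  "wa",  "vic", "qld", "nsw", "nsw", "wa",
           "sa",  "act", "nsw", "vic", "vic", "act")
incomes <- c(60, 49, 40, 61, 64, 60, 59, 54, 62, 69, 70, 42, 56,
             61, 61, 61, 58, 51, 48, 65, 49, 49, 41, 48, 52, 46,
             59, 46, 58, 43)
statef <- factor(state)

stdError <- function(x) sqrt(var(x) / length(x))   # renamed helper (assumption)

incmeans <- tapply(incomes, statef, mean)       # group means
incster  <- tapply(incomes, statef, stdError)   # group standard errors
ns       <- tapply(incomes, statef, length)     # group sample sizes

# Two-sided 95% limits use the 97.5% point of t with n - 1 degrees of freedom.
tcrit <- qt(0.975, df = ns - 1)
ci <- cbind(lower = incmeans - tcrit * incster,
            upper = incmeans + tcrit * incster)
print(round(ci, 2))
```

States with only two accountants have one degree of freedom, so their intervals come out very wide.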
The function tapply() can also be used to handle more complicated indexing of a vector by
multiple categories. For example, we might wish to split the tax accountants by both state and
sex. However in this simple instance (just one factor) what happens can be thought of as follows.
The values in the vector are collected into groups corresponding to the distinct entries in the
factor. The function is then applied to each of these groups individually. The value is a vector of
function results, labelled by the levels attribute of the factor.
The combination of a vector and a labelling factor is an example of what is sometimes called a
ragged array, since the subclass sizes are possibly irregular. When the subclass sizes are all the
same the indexing may be done implicitly and much more efficiently, as we see in the next
section.
4.3 Ordered factors
The levels of factors are stored in alphabetical order, or in the order they were specified to
factor if they were specified explicitly.
Sometimes the levels will have a natural ordering that we want to record and want our statistical
analysis to make use of. The ordered() function creates such ordered factors but is otherwise
identical to factor. For most purposes the only difference between ordered and unordered factors
is that the former are printed showing the ordering of the levels, but the contrasts generated for
them in fitting linear models are different.
5 Arrays and matrices
5.1 Arrays
An array can be considered as a multiply subscripted collection of data entries, for example
numeric. R allows simple facilities for creating and handling arrays, and in particular the special
case of matrices.
A dimension vector is a vector of non-negative integers. If its length is k then the array is
k-dimensional, e.g. a matrix is a 2-dimensional array. The dimensions are indexed from one up to
the values given in the dimension vector.
A vector can be used by R as an array only if it has a dimension vector as its dim attribute.
Suppose, for example, z is a vector of 1500 elements. The assignment
> dim(z) <- c(3,5,100)

gives it the dim attribute that allows it to be treated as a 3 by 5 by 100 array.
Other functions such as matrix() and array() are available for simpler and more natural looking
assignments, as we shall see in The array() function.
The values in the data vector give the values in the array in the same order as they would occur
in FORTRAN, that is "column major order," with the first subscript moving fastest and the last
subscript slowest.
For example if the dimension vector for an array, say a, is c(3,4,2) then there are 3 * 4 * 2 = 24
entries in a and the data vector holds them in the order a[1,1,1], a[2,1,1], ..., a[2,4,2],
a[3,4,2].
Arrays can be one-dimensional: such arrays are usually treated in the same way as vectors
(including when printing), but the exceptions can cause confusion.
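Column major order is easy to see when the data vector is simply 1:24, so that each entry equals its own position in the data vector. This is a small sketch using the array a with dimension vector c(3,4,2) from the text.

```r
a <- array(1:24, dim = c(3, 4, 2))

print(a[2, 1, 1])   # position 2: the first subscript moves fastest
print(a[1, 2, 1])   # position 4: the second subscript advances after 3 entries
print(a[1, 1, 2])   # position 13: the last subscript moves slowest
print(a[3, 4, 2])   # position 24: the final entry
```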
5.2 Array indexing. Subsections of an array
Individual elements of an array may be referenced by giving the name of the array followed by
the subscripts in square brackets, separated by commas.
More generally, subsections of an array may be specified by giving a sequence of index vectors
in place of subscripts; however if any index position is given an empty index vector, then the full
range of that subscript is taken.
Continuing the previous example, a[2,,] is a 4 by 2 array with dimension vector c(4,2) and data
vector containing the values
c(a[2,1,1], a[2,2,1], a[2,3,1], a[2,4,1],
  a[2,1,2], a[2,2,2], a[2,3,2], a[2,4,2])

in that order. a[,,] stands for the entire array, which is the same as omitting the subscripts
entirely and using a alone.
For any array, say Z, the dimension vector may be referenced explicitly as dim(Z) (on either side
of an assignment).
Also, if an array name is given with just one subscript or index vector, then the corresponding
values of the data vector only are used; in this case the dimension vector is ignored. This is not
the case, however, if the single index is not a vector but itself an array, as we next discuss.
5.3 Index matrices
As well as an index vector in any subscript position, a matrix may be used with a single index
matrix in order either to assign a vector of quantities to an irregular collection of elements in the
array, or to extract an irregular collection as a vector.
A matrix example makes the process clear. In the case of a doubly indexed array, an index
matrix may be given consisting of two columns and as many rows as desired. The entries in the
index matrix are the row and column indices for the doubly indexed array. Suppose for example
we have a 4 by 5 array x and we wish to do the following:
Extract elements x[1,3], x[2,2] and x[3,1] as a vector structure, and
Replace these entries in the array x by zeroes.
In this case we need a 3 by 2 subscript array, as in the following example.
> x <- array(1:20, dim=c(4,5))   # Generate a 4 by 5 array.
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    9   13   17
[2,]    2    6   10   14   18
[3,]    3    7   11   15   19
[4,]    4    8   12   16   20
> i <- array(c(1:3,3:1), dim=c(3,2))
> i                             # i is a 3 by 2 index array.
     [,1] [,2]
[1,]    1    3
[2,]    2    2
[3,]    3    1
> x[i]                          # Extract those elements
[1] 9 6 3
> x[i] <- 0                     # Replace those elements by zeros.
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    5    0   13   17
[2,]    2    0   10   14   18
[3,]    0    7   11   15   19
[4,]    4    8   12   16   20
>

Negative indices are not allowed in index matrices. NA and zero values are allowed: rows in the
index matrix containing a zero are ignored, and rows containing an NA produce an NA in the result.
As a less trivial example, suppose we wish to generate an (unreduced) design matrix for a block
design defined by factors blocks (b levels) and varieties (v levels). Further suppose there are n
plots in the experiment. We could proceed as follows:
> Xb <- matrix(0, n, b)
> Xv <- matrix(0, n, v)
> ib <- cbind(1:n, blocks)
> iv <- cbind(1:n, varieties)
> Xb[ib] <- 1
> Xv[iv] <- 1
> X <- cbind(Xb, Xv)

To construct the incidence matrix, N say, we could use
> N <- crossprod(Xb, Xv)

However a simpler direct way of producing this matrix is to use table():
> N <- table(blocks, varieties)

Index matrices must be numerical: any other form of matrix (e.g. a logical or character matrix)
supplied as a matrix is treated as an indexing vector.
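The block design construction can be tried on a toy experiment. The sketch below assumes three blocks and two varieties with each variety appearing once per block; these data are invented for the illustration.

```r
# Hypothetical design: 3 blocks, 2 varieties, 6 plots.
blocks    <- factor(c(1, 1, 2, 2, 3, 3))
varieties <- factor(c("A", "B", "A", "B", "A", "B"))
n <- length(blocks)
b <- nlevels(blocks)
v <- nlevels(varieties)

Xb <- matrix(0, n, b)
Xv <- matrix(0, n, v)
ib <- cbind(1:n, blocks)      # cbind() uses the integer codes of the factor
iv <- cbind(1:n, varieties)
Xb[ib] <- 1                   # one indicator per plot for its block
Xv[iv] <- 1                   # one indicator per plot for its variety
X <- cbind(Xb, Xv)

N <- crossprod(Xb, Xv)        # incidence matrix, b by v
print(N)
print(table(blocks, varieties))
```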
5.4 The array() function
As well as giving a vector structure a dim attribute, arrays can be constructed from vectors by the
array function, which has the form
> Z <- array(data_vector, dim_vector)

For example, if the vector h contains 24 or fewer numbers, then the command
> Z <- array(h, dim=c(3,4,2))

would use h to set up a 3 by 4 by 2 array in Z. If the size of h is exactly 24 the result is the same as
> Z <- h ; dim(Z) <- c(3,4,2)

However if h is shorter than 24, its values are recycled from the beginning again to make it up to
size 24 (see The recycling rule) but dim(h) <- c(3,4,2) would signal an error about
mismatching length. As an extreme but common example
> Z <- array(0, c(3,4,2))

makes Z an array of all zeros.
At this point dim(Z) stands for the dimension vector c(3,4,2), and Z[1:24] stands for the data
vector as it was in h, and Z[] with an empty subscript or Z with no subscript stands for the entire
array as an array.
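The difference between array() and a direct dim assignment for a short data vector can be sketched as follows. The vector h of length 6 is an assumption made for the illustration, and try() is used only so that the expected error does not stop the script.

```r
h <- 1:6                                  # shorter than 24

Z <- array(h, dim = c(3, 4, 2))           # h is recycled four times
print(Z[1:24])                            # the recycled data vector

# Assigning dim directly does not recycle and signals an error instead.
attempt <- try(dim(h) <- c(3, 4, 2), silent = TRUE)
print(inherits(attempt, "try-error"))
```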
Arrays may be used in arithmetic expressions and the result is an array formed by
element-by-element operations on the data vector. The dim attributes of operands generally need
to be the same, and this becomes the dimension vector of the result. So if A, B and C are all
similar arrays, then
> D <- 2*A*B + C + 1

makes D a similar array with its data vector being the result of the given element-by-element
operations. However the precise rule concerning mixed array and vector calculations has to be
considered a little more carefully.
5.4.1 Mixed vector and array arithmetic. The recycling rule

The precise rule affecting element by element mixed calculations with vectors and arrays is
somewhat quirky and hard to find in the references. From experience we have found the
following to be a reliable guide.
The expression is scanned from left to right.
Any short vector operands are extended by recycling their values until they match the size
of any other operands.
As long as short vectors and arrays only are encountered, the arrays must all have the same
dim attribute or an error results.
Any vector operand longer than a matrix or array operand generates an error.
If array structures are present and no error or coercion to vector has been precipitated, the
result is an array structure with the common dim attribute of its array operands.
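Two of these rules can be observed on a small case. In this sketch A and v are made-up operands, and try() is used only to capture the error that the fourth rule predicts.

```r
A <- array(1:6, dim = c(2, 3))
v <- c(10, 20)

r <- A + v            # v is recycled along the data vector of A
print(r)              # the result keeps the dim attribute of A

# A vector operand longer than the array operand generates an error.
too_long <- try(A + 1:7, silent = TRUE)
print(inherits(too_long, "try-error"))
```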
5.5 The outer product of two arrays
An important operation on arrays is the outer product. If a and b are two numeric arrays, their
outer product is an array whose dimension vector is obtained by concatenating their two
dimension vectors (order is important), and whose data vector is got by forming all possible
products of elements of the data vector of a with those of b. The outer product is formed by the
special operator %o%:
> ab <- a %o% b

An alternative is
> ab <- outer(a, b, "*")

The multiplication function can be replaced by an arbitrary function of two variables. For
example if we wished to evaluate the function f(x; y) = cos(y)/(1 + x^2) over a regular grid of
values with x- and y-coordinates defined by the R vectors x and y respectively, we could proceed
as follows:
> f <- function(x, y) cos(y)/(1 + x^2)
> z <- outer(x, y, f)

In particular the outer product of two ordinary vectors is a doubly subscripted array (that is a
matrix, of rank at most 1). Notice that the outer product operator is of course non-commutative.
Defining your own R functions will be considered further in Writing your own functions.
An example: Determinants of 2 by 2 single-digit matrices

As an artificial but cute example, consider the determinants of 2 by 2 matrices [a, b; c, d] where
each entry is a non-negative integer in the range 0, 1, ..., 9, that is a digit.
The problem is to find the determinants, ad - bc, of all possible matrices of this form and
represent the frequency with which each value occurs as a high density plot. This amounts to
finding the probability distribution of the determinant if each digit is chosen independently and
uniformly at random.
A neat way of doing this uses the outer() function twice:
> d <- outer(0:9, 0:9)
> fr <- table(outer(d, d, "-"))
> plot(as.numeric(names(fr)), fr, type="h",
       xlab="Determinant", ylab="Frequency")

Notice the coercion of the names attribute of the frequency table to numeric in order to recover
the range of the determinant values. The "obvious" way of doing this problem with for loops, to
be discussed in Loops and conditional execution, is so inefficient as to be impractical.
It is also perhaps surprising that about 1 in 20 such matrices is singular.
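The proportion of singular matrices can be read off the frequency table directly, since a matrix is singular exactly when its determinant is zero. A brief sketch follows; singular_share is a name introduced for the illustration.

```r
d  <- outer(0:9, 0:9)             # all products of two digits (ad, and also bc)
fr <- table(outer(d, d, "-"))     # frequency of each value of ad - bc

total <- sum(fr)                              # number of matrices considered
singular_share <- as.numeric(fr["0"]) / total # share with determinant zero
print(total)
print(singular_share)
```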
5.6 Generalized transpose of an array
The function aperm(a, perm) may be used to permute an array, a. The argument perm must be a
permutation of the integers {1, ..., k}, where k is the number of subscripts in a. The result of the
function is an array of the same size as a but with old dimension given by perm[j] becoming the
new j-th dimension. The easiest way to think of this operation is as a generalization of
transposition for matrices. Indeed if A is a matrix (that is, a doubly subscripted array) then B
given by
> B <- aperm(A, c(2,1))

is just the transpose of A. For this special case a simpler function t() is available, so we could
have used B <- t(A).
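A short sketch of how perm maps old dimensions to new ones, using a made-up 3 by 4 by 2 array and the permutation c(2, 3, 1), so that old dimension 2 becomes new dimension 1 and so on.

```r
a <- array(1:24, dim = c(3, 4, 2))
p <- aperm(a, c(2, 3, 1))         # new dims are old dims 2, 3, 1

print(dim(p))
print(p[4, 2, 3] == a[3, 4, 2])   # p[i, j, k] corresponds to a[k, i, j]

A <- matrix(1:6, nrow = 2)
print(all(aperm(A, c(2, 1)) == t(A)))   # the matrix case is the transpose
```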
5.7 Matrix facilities
As noted above, a matrix is just an array with two subscripts. However it is such an important
special case it needs a separate discussion. R contains many operators and functions that are
available only for matrices. For example t(X) is the matrix transpose function, as noted above.
The functions nrow(A) and ncol(A) give the number of rows and columns in the matrix A
respectively.
5.7.1 Matrix multiplication

The operator %*% is used for matrix multiplication. An n by 1 or 1 by n matrix may of course be
used as an n-vector if in the context such is appropriate. Conversely, vectors which occur in
matrix multiplication expressions are automatically promoted either to row or column vectors,
whichever is multiplicatively coherent, if possible (although this is not always unambiguously
possible, as we see later).
If, for example, A and B are square matrices of the same size, then
> A * B

is the matrix of element by element products and
> A %*% B

is the matrix product. If x is a vector, then
> x %*% A %*% x

is a quadratic form (footnote 16).
The function crossprod() forms "crossproducts", meaning that crossprod(X, y) is the same as
t(X) %*% y but the operation is more efficient. If the second argument to crossprod() is omitted
it is taken to be the same as the first.
The meaning of diag() depends on its argument. diag(v), where v is a vector, gives a diagonal
matrix with elements of the vector as the diagonal entries. On the other hand diag(M), where M is
a matrix, gives the vector of main diagonal entries of M. This is the same convention as that used
for diag() in MATLAB. Also, somewhat confusingly, if k is a single numeric value then diag(k)
is the k by k identity matrix!
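These operations can be compared on a small symmetric matrix. The values of A and x below are illustrative assumptions.

```r
A <- matrix(c(2, 1, 1, 3), 2, 2)
x <- c(1, 2)

elementwise <- A * A             # element by element products
product     <- A %*% A           # matrix product
q           <- x %*% A %*% x     # quadratic form, returned as a 1 by 1 matrix

print(q)
print(all.equal(crossprod(A, x), t(A) %*% x))

print(diag(c(1, 2)))   # vector argument: diagonal matrix
print(diag(A))         # matrix argument: vector of diagonal entries
print(diag(2))         # single number: 2 by 2 identity matrix
```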
5.7.2 Linear equations and inversion

Solving linear equations is the inverse of matrix multiplication. When after
> b <- A %*% x

only A and b are given, the vector x is the solution of that linear equation system. In R,
> solve(A,b)

solves the system, returning x (up to some accuracy loss). Note that in linear algebra, formally x
= A^{-1} %*% b where A^{-1} denotes the inverse of A, which can be computed by
> solve(A)

but rarely is needed. Numerically, it is both inefficient and potentially unstable to compute x <-
solve(A) %*% b instead of solve(A,b).
The quadratic form x %*% A^{-1} %*% x, which is used in multivariate computations, should be
computed by something like (footnote 17) x %*% solve(A,x), rather than computing the inverse of A.
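A minimal sketch of both recommendations, with an invented 2 by 2 system: solve(A, b) recovers x, and the quadratic form is computed without forming the inverse explicitly.

```r
A <- matrix(c(4, 1, 1, 3), 2, 2)
x <- c(1, 2)
b <- A %*% x

x_hat <- solve(A, b)              # solves A x = b directly
print(x_hat)

qf_direct  <- x %*% solve(A, x)           # preferred: no explicit inverse
qf_inverse <- x %*% solve(A) %*% x        # same value, but forms the inverse
print(all.equal(qf_direct, qf_inverse))
```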
5.7.3 Eigenvalues and eigenvectors

The function eigen(Sm) calculates the eigenvalues and eigenvectors of a symmetric matrix Sm.
The result of this function is a list of two components named values and vectors. The
assignment
> ev <- eigen(Sm)

will assign this list to ev. Then ev$val is the vector of eigenvalues of Sm and ev$vec is the matrix
of corresponding eigenvectors. Had we only needed the eigenvalues we could have used the
assignment:
> evals <- eigen(Sm)$values

evals now holds the vector of eigenvalues and the second component is discarded. If the
expression
> eigen(Sm)

is used by itself as a command the two components are printed, with their names. For large
matrices it is better to avoid computing the eigenvectors if they are not needed by using the
expression
> evals <- eigen(Sm, only.values = TRUE)$values

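As a quick check of what eigen() returns, the sketch below uses a small symmetric matrix chosen for illustration and confirms the defining relation Sm V = V diag(values).

```r
Sm <- matrix(c(2, 1, 1, 2), 2, 2)     # illustrative symmetric matrix

ev <- eigen(Sm)
print(ev$values)                       # eigenvalues, largest first
print(ev$vectors)                      # matching eigenvectors as columns

# Each column v of ev$vectors satisfies Sm v = lambda v.
print(all.equal(Sm %*% ev$vectors, ev$vectors %*% diag(ev$values)))

evals <- eigen(Sm, only.values = TRUE)$values   # skips the eigenvectors
```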
5.7.4 Singular value decomposition and determinants

The function svd(M) takes an arbitrary matrix argument, M, and calculates the singular value
decomposition of M. This consists of a matrix of orthonormal columns U with the same column
space as M, a second matrix of orthonormal columns V whose column space is the row space of M
and a diagonal matrix of positive entries D such that M = U %*% D %*% t(V). D is actually returned
as a vector of the diagonal elements. The result of svd(M) is actually a list of three components
named d, u and v, with evident meanings.
If M is in fact square, then it is not hard to see that
> absdetM <- prod(svd(M)$d)

calculates the absolute value of the determinant of M. If this calculation were needed often with a
variety of matrices it could be defined as an R function
> absdet <- function(M) prod(svd(M)$d)

after which we could use absdet() as just another R function. As a further trivial but potentially
useful example, you might like to consider writing a function, say tr(), to calculate the trace of a
square matrix. [Hint: You will not need to use an explicit loop. Look again at the diag()
function.]
R has a built-in function det to calculate a determinant, including the sign, and another,
determinant, to give the sign and modulus (optionally on log scale).
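One possible answer to the tr() exercise, together with a check of absdet() against det(), is sketched below. The matrix M is an arbitrary example, and the one-line body of tr() is only one way to follow the hint.

```r
absdet <- function(M) prod(svd(M)$d)   # as defined in the text
tr     <- function(M) sum(diag(M))     # possible solution: sum the diagonal

M <- matrix(c(1, 3, 2, 4), 2, 2)       # rows are (1, 2) and (3, 4)

print(det(M))        # signed determinant: 1*4 - 2*3
print(absdet(M))     # absolute value, from the singular values
print(tr(M))         # trace: 1 + 4
```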
5.7.5 Least squares fitting and the QR decomposition

The function lsfit() returns a list giving results of a least squares fitting procedure. An
assignment such as
> ans <- lsfit(X, y)

gives the results of a least squares fit where y is the vector of observations and X is the design
matrix. See the help facility for more details, and also for the follow-up function ls.diag() for,
among other things, regression diagnostics. Note that a grand mean term is automatically
included and need not be included explicitly as a column of X. Further note that you almost
always will prefer using lm(.) (see Linear models) to lsfit() for regression modelling.
Another closely related function is qr() and its allies. Consider the following assignments
> Xplus <- qr(X)
> b <- qr.coef(Xplus, y)
> fit <- qr.fitted(Xplus, y)
> res <- qr.resid(Xplus, y)

These compute the orthogonal projection of y onto the range of X in fit, the projection onto the
orthogonal complement in res and the coefficient vector for the projection in b, that is, b is
essentially the result of the MATLAB "backslash" operator.
It is not assumed that X has full column rank. Redundancies will be discovered and removed as
they are found.
This alternative is the older, low-level way to perform least squares calculations. Although still
useful in some contexts, it would now generally be replaced by the statistical models features, as
will be discussed in Statistical models in R.
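The relationships between b, fit and res can be verified numerically. The sketch below uses simulated data (an assumption for illustration); unlike lsfit(), the qr() route needs the column of ones to be included in X explicitly.

```r
set.seed(1)                                  # reproducible illustrative data
X <- cbind(1, 1:10)                          # intercept column plus one predictor
y <- 2 + 3 * (1:10) + rnorm(10)

Xplus <- qr(X)
b   <- qr.coef(Xplus, y)      # coefficient vector
fit <- qr.fitted(Xplus, y)    # projection of y onto the range of X
res <- qr.resid(Xplus, y)     # projection onto the orthogonal complement

print(b)
print(all.equal(fit + res, y))               # the two projections add up to y
print(max(abs(crossprod(X, res))))           # residuals are orthogonal to X
```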
5.8 Forming partitioned matrices, cbind() and rbind()
As we have already seen informally, matrices can be built up from other vectors and matrices by
the functions cbind() and rbind(). Roughly cbind() forms matrices by binding together matrices
horizontally, or column-wise, and rbind() vertically, or row-wise.
In the assignment
> X <- cbind(arg_1, arg_2, arg_3, ...)

the arguments to cbind() must be either vectors of any length, or matrices with the same column
size, that is the same number of rows. The result is a matrix with the concatenated arguments
arg_1, arg_2, ... forming the columns.
If some of the arguments to cbind() are vectors they may be shorter than the column size of any
matrices present, in which case they are cyclically extended to match the matrix column size (or
the length of the longest vector if no matrices are given).
The function rbind() does the corresponding operation for rows. In this case any vector
arguments, possibly cyclically extended, are of course taken as row vectors.
Suppose X1 and X2 have the same number of rows. To combine these by columns into a matrix X,
together with an initial column of 1s we can use
> X <- cbind(1, X1, X2)

The result of rbind() or cbind() always has matrix status. Hence cbind(x) and rbind(x) are
possibly the simplest ways explicitly to allow the vector x to be treated as a column or row
matrix respectively.
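A minimal sketch of the binding rules above, using small made-up matrices X1 and X2 (their contents are arbitrary):

```r
X1 <- matrix(1:6, nrow = 3)       # 3 x 2
X2 <- matrix(7:9, nrow = 3)       # 3 x 1
X  <- cbind(1, X1, X2)            # the single value 1 is recycled down 3 rows
dim(X)                            # 3 rows, 4 columns

x <- 1:3
dim(cbind(x))                     # 3 1: x treated as a column matrix
dim(rbind(x))                     # 1 3: x treated as a row matrix
```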
5.9 The concatenation function, c(), with arrays
It should be noted that whereas cbind() and rbind() are concatenation functions that respect dim
attributes, the basic c() function does not, but rather clears numeric objects of all dim and
dimnames attributes. This is occasionally useful in its own right.
The official way to coerce an array back to a simple vector object is to use as.vector()
> vec <- as.vector(X)

However a similar result can be achieved by using c() with just one argument, simply for this
side-effect:
> vec <- c(X)

There are slight differences between the two, but ultimately the choice between them is largely a
matter of style (with the former being preferable).
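The following sketch shows both routes on a small made-up matrix, together with one of the slight differences: for an atomic vector, c() keeps a names attribute whereas as.vector() drops it.

```r
X  <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))
v1 <- as.vector(X)
v2 <- c(X)
is.null(dim(v1))        # TRUE: the dim attribute is gone
is.null(dim(v2))        # TRUE
identical(v1, v2)       # TRUE for this matrix

x <- c(a = 1, b = 2)    # a named vector
names(c(x))             # names are kept
names(as.vector(x))     # NULL: names are removed
```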
5.10 Frequency tables from factors
Recall that a factor defines a partition into groups. Similarly a pair of factors defines a two-way
cross-classification, and so on. The function table() allows frequency tables to be calculated
from equal length factors. If there are k factor arguments, the result is a k-way array of
frequencies.
Suppose, for example, that statef is a factor giving the state code for each entry in a data vector.
The assignment
> statefr <- table(statef)

gives in statefr a table of frequencies of each state in the sample. The frequencies are ordered
and labelled by the levels attribute of the factor. This simple case is equivalent to, but more
convenient than,
> statefr <- tapply(statef, statef, length)

Further suppose that incomef is a factor giving a suitably defined "income class" for each entry
in the data vector, for example with the cut() function:
> factor(cut(incomes, breaks = 35+10*(0:7))) -> incomef

Then to calculate a two-way table of frequencies:
> table(incomef, statef)
         statef
incomef   act nsw nt qld sa tas vic wa
  (35,45]   1   1  0   1  0   0   1  0
  (45,55]   1   1  1   1  2   0   1  3
  (55,65]   0   3  1   3  2   2   2  1
  (65,75]   0   1  0   0  0   0   1  0

Extension to higher-way frequency tables is immediate.
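A small self-contained sketch of the same steps. The six observations below are invented stand-ins for the manual's state and income data, which are defined in an earlier chapter.

```r
# Hypothetical data: six observations only
state   <- c("nsw", "vic", "nsw", "qld", "vic", "nsw")
incomes <- c(40, 61, 58, 49, 70, 52)

statef  <- factor(state)
incomef <- cut(incomes, breaks = 35 + 10 * (0:4))   # classes (35,45] to (65,75]

statefr <- table(statef)                 # one-way table of state counts
all(statefr == tapply(statef, statef, length))      # same counts via tapply()

tab2 <- table(incomef, statef)           # two-way table: income class by state
dim(tab2)                                # number of income classes, number of states
sum(tab2)                                # equals the number of observations
```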

6 Lists and data frames
6.1 Lists
An R list is an object consisting of an ordered collection of objects known as its components.
There is no particular need for the components to be of the same mode or type, and, for example,
a list could consist of a numeric vector, a logical value, a matrix, a complex vector, a character
array, a function, and so on. Here is a simple example of how to make a list:
> Lst <- list(name="Fred", wife="Mary", no.children=3,
              child.ages=c(4,7,9))

Components are always numbered and may always be referred to as such. Thus if Lst is the
name of a list with four components, these may be individually referred to as Lst[[1]], Lst[[2]],
Lst[[3]] and Lst[[4]]. If, further, Lst[[4]] is a vector subscripted array then Lst[[4]][1] is its
first entry.
If Lst is a list, then the function length(Lst) gives the number of (top level) components it has.
Components of lists may also be named, and in this case the component may be referred to either
by giving the component name as a character string in place of the number in double square
brackets, or, more conveniently, by giving an expression of the form
> name$component_name

for the same thing.
This is a very useful convention as it makes it easier to get the right component if you forget the
number.
So in the simple example given above:
Lst$name is the same as Lst[[1]] and is the string "Fred",
Lst$wife is the same as Lst[[2]] and is the string "Mary",
Lst$child.ages[1] is the same as Lst[[4]][1] and is the number 4.

Additionally, one can also use the names of the list components in double square brackets, i.e.,
Lst[["name"]] is the same as Lst$name. This is especially useful when the name of the
component to be extracted is stored in another variable as in
> x <- "name"; Lst[[x]]

It is very important to distinguish Lst[[1]] from Lst[1]. '[[...]]' is the operator used to select a
single element, whereas '[...]' is a general subscripting operator. Thus the former is the first
object in the list Lst, and if it is a named list the name is not included. The latter is a sublist of the
list Lst consisting of the first entry only. If it is a named list, the names are transferred to the
sublist.
The names of components may be abbreviated down to the minimum number of letters needed to
identify them uniquely. Thus Lst$coefficients may be minimally specified as Lst$coe and
Lst$covariance as Lst$cov.
The vector of names is in fact simply an attribute of the list like any other and may be handled as
such. Other structures besides lists may, of course, similarly be given a names attribute also.
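The distinction between the two subscripting operators can be checked directly on the example list. This is only a sketch of the behaviour described above:

```r
Lst <- list(name = "Fred", wife = "Mary", no.children = 3,
            child.ages = c(4, 7, 9))

Lst[[1]]             # the component itself: the string "Fred"
Lst[1]               # a sublist of length one, which keeps the name
class(Lst[[1]])      # "character"
class(Lst[1])        # "list"

x <- "child.ages"
Lst[[x]][1]          # same as Lst$child.ages[1]

Lst$child            # partial matching: only child.ages starts with "child"
length(Lst)          # number of top-level components
```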
6.2 Constructing and modifying lists
New lists may be formed from existing objects by the function list(). An assignment of the
form
> Lst <- list(name_1=object_1, ..., name_m=object_m)

sets up a list Lst of m components using object_1, ..., object_m for the components and giving
them names as specified by the argument names (which can be freely chosen). If these names
are omitted, the components are numbered only. The components used to form the list are copied
when forming the new list and the originals are not affected.
Lists, like any subscripted object, can be extended by specifying additional components. For
example
> Lst[5] <- list(matrix=Mat)

6.2.1 Concatenating lists
When the concatenation function c() is given list arguments, the result is an object of mode list
also, whose components are those of the argument lists joined together in sequence.
> list.ABC <- c(list.A, list.B, list.C)

Recall that with vector objects as arguments the concatenation function similarly joined together
all arguments into a single vector structure. In this case all other attributes, such as dim attributes,
are discarded.
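A short sketch of concatenating and extending lists. The lists list.A and list.B here are made up, and extension is shown by name with the $ form, which is one of several ways to add a component.

```r
list.A  <- list(a = 1, b = "two")       # hypothetical lists
list.B  <- list(c = c(3, 4))
list.AB <- c(list.A, list.B)            # components joined in sequence
length(list.AB)                         # 3
names(list.AB)                          # "a" "b" "c"

list.AB$d <- diag(2)                    # extend by adding a named component
length(list.AB)                         # 4
length(list.A)                          # 2: the original list is not affected
```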
6.3 Data frames
A data frame is a list with class "data.frame". There are restrictions on lists that may be made
into data frames, namely
• The components must be vectors (numeric, character, or logical), factors, numeric
matrices, lists, or other data frames.
• Matrices, lists, and data frames provide as many variables to the new data frame as they
have columns, elements, or variables, respectively.
• Numeric vectors, logicals and factors are included as is, and by default(18) character vectors
are coerced to be factors, whose levels are the unique values appearing in the vector.
• Vector structures appearing as variables of the data frame must all have the same length,
and matrix structures must all have the same row size.
A data frame may for many purposes be regarded as a matrix with columns possibly of differing
modes and attributes. It may be displayed in matrix form, and its rows and columns extracted
using matrix indexing conventions.
6.3.1 Making data frames

Objects satisfying the restrictions placed on the columns (components) of a data frame may be
used to form one using the function data.frame:
> accountants <- data.frame(home=statef, loot=incomes, shot=incomef)

A list whose components conform to the restrictions of a data frame may be coerced into a data
frame using the function as.data.frame().
The simplest way to construct a data frame from scratch is to use the read.table() function to
read an entire data frame from an external file. This is discussed further in Reading data from
files.
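A minimal sketch of both constructions, with invented values standing in for statef, incomes and incomef. The stringsAsFactors argument is given explicitly in the second call so that the result does not depend on the default in the R version at hand.

```r
# Hypothetical values for illustration
statef  <- factor(c("nsw", "vic", "qld", "nsw"))
incomes <- c(60, 49, 40, 61)
incomef <- cut(incomes, breaks = 35 + 10 * (0:3))

accountants <- data.frame(home = statef, loot = incomes, shot = incomef)
dim(accountants)                 # 4 rows, 3 variables
accountants[2, "loot"]           # matrix-style indexing works on data frames

# A conforming list can be coerced with as.data.frame()
lst <- list(id = c("a", "b"), x = c(1.5, 2.5))
df  <- as.data.frame(lst, stringsAsFactors = FALSE)
dim(df)                          # 2 rows, 2 variables
```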
6.3.2 attach() and detach()

The $ notation, such as accountants$home, for list components is not always very convenient. A
useful facility would be somehow to make the components of a list or data frame temporarily
visible as variables under their component name, without the need to quote the list name
explicitly each time.
The attach() function takes a 'database' such as a list or data frame as its argument. Thus
suppose lentils is a data frame with three variables lentils$u, lentils$v, lentils$w. The attach
> attach(lentils)

places the data frame in the search path at position 2, and provided there are no variables u, v or w
in position 1, u, v and w are available as variables from the data frame in their own right. At this
point an assignment such as
> u <- v+w

does not replace the component u of the data frame, but rather masks it with another variable u in
the working directory at position 1 on the search path. To make a permanent change to the data
frame itself, the simplest way is to resort once again to the $ notation:
> lentils$u <- v+w

However the new value of component u is not visible until the data frame is detached and
attached again.
To detach a data frame, use the function
> detach()

More precisely, this statement detaches from the search path the entity currently at position 2.
Thus in the present context the variables u, v and w would be no longer visible, except under the
list notation as lentils$u and so on. Entities at positions greater than 2 on the search path can be
detached by giving their number to detach, but it is much safer to always use a name, for
example by detach(lentils) or detach("lentils")
Note: In R lists and data frames can only be attached at position 2 or above, and
what is attached is a copy of the original object. You can alter the attached values
via assign, but the original list or data frame is unchanged.
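The masking behaviour described above can be traced step by step. In this sketch the contents of lentils are invented, and old_u is a helper variable introduced only to record what the attached copy holds.

```r
lentils <- data.frame(u = 1:3, v = 4:6, w = 7:9)   # hypothetical data
attach(lentils)

u <- v + w              # creates u in the workspace; lentils$u is untouched
lentils$u               # still 1 2 3

lentils$u <- v + w      # permanent change to the data frame itself
rm(u)                   # remove the masking variable from the workspace
old_u <- u              # the attached copy still holds the old values

detach("lentils")       # detach by name rather than by position
"lentils" %in% search() # FALSE once detached
```

Because attach() works on a copy, the change made with lentils$u only becomes visible under the plain name u after detaching and attaching again.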
6.3.3 Working with data frames

A useful convention that allows you to work with many different problems comfortably together
in the same working directory is
• gather together all variables for any well defined and separate problem in a data frame
under a suitably informative name;
• when working with a problem attach the appropriate data frame at position 2, and use the
working directory at level 1 for operational quantities and temporary variables;
• before leaving a problem, add any variables you wish to keep for future reference to the
data frame using the $ form of assignment, and then detach();
• finally remove all unwanted variables from the working directory and keep it as clean of
left-over temporary variables as possible.
In this way it is quite simple to work with many problems in the same directory, all of which
have variables named x, y and z, for example.
6.3.4 Attaching arbitrary lists
attach() is a generic function that allows not only directories and data frames to be attached to
the search path, but other classes of object as well. In particular any object of mode "list" may
be attached in the same way:
> attach(any.old.list)

Anything that has been attached can be detached by detach, by position number or, preferably,
by name.
6.3.5 Managing the search path

The function search shows the current search path and so is a very useful way to keep track of
which data frames and lists (and packages) have been attached and detached. Initially it gives
> search()
[1] ".GlobalEnv"   "Autoloads"    "package:base"

where .GlobalEnv is the workspace.(19)
After lentils is attached we have
> search()
[1] ".GlobalEnv"   "lentils"      "Autoloads"    "package:base"
> ls(2)
[1] "u" "v" "w"

and as we see ls (or objects) can be used to examine the contents of any position on the search
path.
Finally, we detach the data frame and confirm it has been removed from the search path.
> detach("lentils")
> search()
[1] ".GlobalEnv"   "Autoloads"    "package:base"


7 Reading data from files
Large data objects will usually be read as values from external files rather than entered during an
R session at the keyboard. R input facilities are simple and their requirements are fairly strict and
even rather inflexible. There is a clear presumption by the designers of R that you will be able to
modify your input files using other tools, such as file editors or Perl(20) to fit in with the
requirements of R. Generally this is very simple.
If variables are to be held mainly in data frames, as we strongly suggest they should be, an entire
data frame can be read directly with the read.table() function. There is also a more primitive
input function, scan(), that can be called directly.
For more details on importing data into R and also exporting data, see the R Data Import/Export
manual.
7.1 The read.table() function
To read an entire data frame directly, the external file will normally have a special form.
• The first line of the file should have a name for each variable in the data frame.
• Each additional line of the file has as its first item a row label and the values for each
variable.
If the file has one fewer item in its first line than in its second, this arrangement is presumed to
be in force. So the first few lines of a file to be read as a data frame might look as follows.
Input file form with names and row labels:
     Price    Floor     Area   Rooms     Age  Cent.heat
01   52.00    111.0      830     5       6.2      no
02   54.75    128.0      710     5       7.5      no
03   57.50    101.0     1000     5       4.2      no
04   57.50    131.0      690     6       8.8      no
05   59.75     93.0      900     5       1.9     yes
...

By default numeric items (except row labels) are read as numeric variables and non-numeric
variables, such as Cent.heat in the example, as factors. This can be changed if necessary.
The function read.table() can then be used to read the data frame directly
> HousePrice <- read.table("houses.data")

Often you will want to omit including the row labels directly and use the default labels. In this
case the file may omit the row label column as in the following.
Input file form without row labels:
Price    Floor     Area   Rooms     Age  Cent.heat
52.00    111.0      830     5       6.2      no
54.75    128.0      710     5       7.5      no
57.50    101.0     1000     5       4.2      no
57.50    131.0      690     6       8.8      no
59.75     93.0      900     5       1.9     yes
...

The data frame may then be read as
> HousePrice <- read.table("houses.data", header=TRUE)

where the header=TRUE option specifies that the first line is a line of headings, and hence, by
implication from the form of the file, that no explicit row labels are given.
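Both file forms can be tried without an external file by writing a few lines to a temporary file first. This is only a sketch: tmp1 and tmp2 are hypothetical stand-ins for "houses.data", and only the first three rows of the example are used.

```r
# Form with row labels: the header line has one fewer item than the data lines
tmp1 <- tempfile(fileext = ".data")
writeLines(c(
  "Price Floor Area Rooms Age Cent.heat",
  "01 52.00 111.0  830 5 6.2 no",
  "02 54.75 128.0  710 5 7.5 no",
  "03 57.50 101.0 1000 5 4.2 no"), tmp1)
HousePrice <- read.table(tmp1)             # header and row labels are inferred
names(HousePrice)
dim(HousePrice)

# Form without row labels: header = TRUE must be given explicitly
tmp2 <- tempfile(fileext = ".data")
writeLines(c(
  "Price Floor Area Rooms Age Cent.heat",
  "52.00 111.0  830 5 6.2 no",
  "54.75 128.0  710 5 7.5 no",
  "57.50 101.0 1000 5 4.2 no"), tmp2)
HousePrice2 <- read.table(tmp2, header = TRUE)
dim(HousePrice2)

unlink(c(tmp1, tmp2))                      # remove the temporary files
```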
7.2 The scan() function
Suppose the data vectors are of equal length and are to be read in parallel. Further suppose that
there are three vectors, the first of mode character and the remaining two of mode numeric, and
the file is input.dat. The first step is to use scan() to read in the three vectors as a list, as follows
> inp <- scan("input.dat", list("",0,0))

The second argument is a dummy list structure that establishes the mode of the three vectors to
be read. The result, held in inp, is a list whose components are the three vectors read in. To
separate the data items into three separate vectors, use assignments like
> label <- inp[[1]]; x <- inp[[2]]; y <- inp[[3]]

More conveniently, the dummy list can have named components, in which case the names can be
used to access the vectors read in. For example
> inp <- scan("input.dat", list(id="", x=0, y=0))

If you wish to access the variables separately they may either be re-assigned to variables in the
working frame:
> label <- inp$id; x <- inp$x; y <- inp$y

or the list may be attached at position 2 of the search path (see Attaching arbitrary lists).
If the second argument is a single value and not a list, a single vector is read in, all components
of which must be of the same mode as the dummy value.
> X <- matrix(scan("light.dat", 0), ncol=5, byrow=TRUE)

There are more elaborate input facilities available and these are detailed in the manuals.
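A self-contained sketch of both uses of scan(), reading from a temporary file rather than input.dat or light.dat. The file contents are invented, and quiet = TRUE only suppresses the message reporting how many items were read.

```r
tmp <- tempfile()                            # stand-in for "input.dat"
writeLines(c("a 1 10", "b 2 20", "c 3 30"), tmp)

# A named dummy list fixes the mode of each of the three vectors
inp <- scan(tmp, list(id = "", x = 0, y = 0), quiet = TRUE)
label <- inp$id; x <- inp$x; y <- inp$y

# A single dummy value reads one vector, reshaped here into a matrix
writeLines(c("1 2 3", "4 5 6"), tmp)
X <- matrix(scan(tmp, 0, quiet = TRUE), ncol = 3, byrow = TRUE)

unlink(tmp)
```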
7.3 Accessing builtin datasets
Around 100 datasets are supplied with R (in package datasets), and others are available in
packages (including the recommended packages supplied with R). To see the list of datasets
currently available use
data()

All the datasets supplied with R are available directly by name. However, many packages still
use the obsolete convention in which data was also used to load datasets into R, for example
data(infert)

and this can still be used with the standard packages (as in this example). In most cases this will
load an R object of the same name. However, in a few cases it loads several objects, so see the
on-line help for the object to see what to expect.
7.3.1 Loading data from other R packages

To access data from a particular package, use the package argument, for example
data(package="rpart")
data(Puromycin, package="datasets")

If a package has been attached by library, its datasets are automatically included in the search.
User-contributed packages can be a rich source of datasets.
7.4 Editing data
When invoked on a data frame or matrix, edit brings up a separate spreadsheet-like environment
for editing. This is useful for making small changes once a data set has been read. The command
> xnew <- edit(xold)

will allow you to edit your data set xold, and on completion the changed object is assigned to
xnew. If you want to alter the original dataset xold, the simplest way is to use fix(xold), which is
equivalent to xold <- edit(xold).
Use
> xnew <- edit(data.frame())

to enter new data via the spreadsheet interface.

8 Probability distributions
8.1 R as a set of statistical tables
One convenient use of R is to provide a comprehensive set of statistical tables. Functions are
provided to evaluate the cumulative distribution function P(X <= x), the probability density
function and the quantile function (given q, the smallest x such that P(X <= x) >= q), and to
simulate from the distribution.
Distribution        R name     additional arguments
beta                beta       shape1, shape2, ncp
binomial            binom      size, prob
Cauchy              cauchy     location, scale
chi-squared         chisq      df, ncp
exponential         exp        rate
F                   f          df1, df2, ncp
gamma               gamma      shape, scale
geometric           geom       prob
hypergeometric      hyper      m, n, k
log-normal          lnorm      meanlog, sdlog
logistic            logis      location, scale
negative binomial   nbinom     size, prob
normal              norm       mean, sd
Poisson             pois       lambda
signed rank         signrank   n
Student's t         t          df, ncp
uniform             unif       min, max
Weibull             weibull    shape, scale
Wilcoxon            wilcox     m, n

Prefix the name given here by 'd' for the density, 'p' for the CDF, 'q' for the quantile function
and 'r' for simulation (random deviates). The first argument is x for dxxx, q for pxxx, p for qxxx
and n for rxxx (except for rhyper, rsignrank and rwilcox, for which it is nn). In not quite all
cases is the non-centrality parameter ncp currently available: see the on-line help for details.
The pxxx and qxxx functions all have logical arguments lower.tail and log.p and the dxxx ones
have log. This allows, e.g., getting the cumulative (or "integrated") hazard function, H(t) =
- log(1 - F(t)), by
- pxxx(t, ..., lower.tail = FALSE, log.p = TRUE)

or more accurate log-likelihoods (by dxxx(..., log = TRUE)), directly.
In addition there are functions ptukey and qtukey for the distribution of the studentized range of
samples from a normal distribution, and dmultinom and rmultinom for the multinomial
distribution. Further distributions are available in contributed packages, notably SuppDists.
Here are some examples
> ## 2-tailed p-value for t distribution
> 2*pt(-2.43, df = 13)
> ## upper 1% point for an F(2, 7) distribution
> qf(0.01, 2, 7, lower.tail = FALSE)

See the on-line help on RNG for how random-number generation is done in R.
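The four prefixes can be tried on a few of the distributions in the table. This sketch also checks the hazard identity for an exponential distribution, where the cumulative hazard is simply rate times t; the values t0 and the rate of 2 are arbitrary choices.

```r
p <- pnorm(1.96)                          # CDF: P(X <= 1.96), standard normal
q <- qnorm(p)                             # quantile function inverts the CDF
d <- dbinom(3, size = 10, prob = 0.5)     # density: P(X = 3) = choose(10, 3) / 2^10

set.seed(42)
r <- rpois(5, lambda = 2)                 # simulation: five Poisson deviates

# Cumulative hazard H(t) = -log(1 - F(t)) for an exponential with rate 2
t0 <- 1.5
H  <- -pexp(t0, rate = 2, lower.tail = FALSE, log.p = TRUE)
all.equal(H, 2 * t0)
```

Working on the log scale through log.p = TRUE avoids computing 1 - F(t) explicitly, which loses accuracy far out in the tail.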
8.2 Examining the distribution of a set of data
Given a (univariate) set of data we can examine its distribution in a large number of ways. The
simplest is to examine the numbers. Two slightly different summaries are given by summary and
fivenum and a display of the numbers by stem (a "stem and leaf" plot).
> attach(faithful)
> summary(eruptions)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  1.600   2.163   4.000   3.488   4.454   5.100
> fivenum(eruptions)
[1] 1.6000 2.1585 4.0000 4.4585 5.1000
> stem(eruptions)

  The decimal point is 1 digit(s) to the left of the |

  16 | 070355555588
  18 | 000022233333335577777777888822335777888
  20 | 00002223378800035778
  22 | 0002335578023578
  24 | 00228
  26 | 23
  28 | 080
  30 | 7
  32 | 2337
  34 | 250077
  36 | 0000823577
  38 | 2333335582225577
  40 | 0000003357788888002233555577778
  42 | 03335555778800233333555577778
  44 | 02222335557780000000023333357778888
  46 | 0000233357700000023578
  48 | 00000022335800333
  50 | 0370

A stem-and-leaf plot is like a histogram, and R has a function hist to plot histograms.
> hist(eruptions)
## make the bins smaller, make a plot of density
> hist(eruptions, seq(1.6, 5.2, 0.2), prob=TRUE)
> lines(density(eruptions, bw=0.1))
> rug(eruptions) # show the actual data points

More elegant density plots can be made by density, and we added a line produced by density in
this example. The bandwidth bw was chosen by trial-and-error as the default gives too much
smoothing (it usually does for "interesting" densities). (Better automated methods of bandwidth
choice are available, and in this example bw = "SJ" gives a good result.)
[Figure: images/hist]
We can plot the empirical cumulative distribution function by using the function ecdf.
> plot(ecdf(eruptions), do.points=FALSE, verticals=TRUE)

This distribution is obviously far from any standard distribution. How about the right-hand
mode, say eruptions of longer than 3 minutes? Let us fit a normal distribution and overlay the
fitted CDF.
> long <- eruptions[eruptions > 3]
> plot(ecdf(long), do.points=FALSE, verticals=TRUE)
> x <- seq(3, 5.4, 0.01)
> lines(x, pnorm(x, mean=mean(long), sd=sqrt(var(long))), lty=3)

[Figure: images/ecdf]
Quantile-quantile (Q-Q) plots can help us examine this more carefully.
par(pty="s") # arrange for a square figure region
qqnorm(long); qqline(long)

which shows a reasonable fit but a shorter right tail than one would expect from a normal
distribution. Let us compare this with some simulated data from a t distribution
[Figure: images/QQ]
x <- rt(250, df = 5)
qqnorm(x); qqline(x)

which will usually (if it is a random sample) show longer tails than expected for a normal. We
can make a Q-Q plot against the generating distribution by
qqplot(qt(ppoints(250), df = 5), x, xlab = "Q-Q plot for t dsn")
qqline(x)

Finally, we might want a more formal test of agreement with normality (or not). R provides the
Shapiro-Wilk test
> shapiro.test(long)

         Shapiro-Wilk normality test

data:  long
W = 0.9793, p-value = 0.01052

and the Kolmogorov-Smirnov test
> ks.test(long, "pnorm", mean = mean(long), sd = sqrt(var(long)))

         One-sample Kolmogorov-Smirnov test

data:  long
D = 0.0661, p-value = 0.4284
alternative hypothesis: two.sided

(Note that the distribution theory is not valid here as we have estimated the parameters of the
normal distribution from the same sample.)
8.3 One- and two-sample tests
So far we have compared a single sample to a normal distribution. A much more common
operation is to compare aspects of two samples. Note that in R, all "classical" tests including the
ones used below are in package stats which is normally loaded.
Consider the following sets of data on the latent heat of the fusion of ice (cal/gm) from Rice
(1995, p.490)
Method A: 79.98 80.04 80.02 80.04 80.03 80.03 80.04 79.97
          80.05 80.03 80.02 80.00 80.02
Method B: 80.02 79.94 79.98 79.97 79.97 80.03 79.95 79.97

Boxplots provide a simple graphical comparison of the two samples.
A <- scan()
79.98 80.04 80.02 80.04 80.03 80.03 80.04 79.97
80.05 80.03 80.02 80.00 80.02

B <- scan()
80.02 79.94 79.98 79.97 79.97 80.03 79.95 79.97

boxplot(A, B)

which indicates that the first group tends to give higher results than the second.
[Figure: images/ice]
To test for the equality of the means of the two samples, we can use an unpaired t-test by
> t.test(A, B)

         Welch Two Sample t-test

data:  A and B
t = 3.2499, df = 12.027, p-value = 0.00694
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.01385526 0.07018320
sample estimates:
mean of x mean of y
 80.02077  79.97875

which does indicate a significant difference, assuming normality. By default the R function does
not assume equality of variances in the two samples (in contrast to the similar S-PLUS t.test
function). We can use the F test to test for equality in the variances, provided that the two
samples are from normal populations.
> var.test(A, B)

         F test to compare two variances

data:  A and B
F = 0.5837, num df = 12, denom df = 7, p-value = 0.3938
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1251097 2.1052687
sample estimates:
ratio of variances
         0.5837405

which shows no evidence of a significant difference, and so we can use the classical t-test that
assumes equality of the variances.
> t.test(A, B, var.equal=TRUE)

         Two Sample t-test

data:  A and B
t = 3.4722, df = 19, p-value = 0.002551
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.01669058 0.06734788
sample estimates:
mean of x mean of y
 80.02077  79.97875

All these tests assume normality of the two samples. The two-sample Wilcoxon (or Mann-
Whitney) test only assumes a common continuous distribution under the null hypothesis.
> wilcox.test(A, B)

         Wilcoxon rank sum test with continuity correction

data:  A and B
W = 89, p-value = 0.007497
alternative hypothesis: true location shift is not equal to 0

Warning message:
Cannot compute exact p-value with ties in: wilcox.test(A, B)

Note the warning: there are several ties in each sample, which suggests strongly that these data
are from a discrete distribution (probably due to rounding).
There are several ways to compare graphically the two samples. We have already seen a pair of
boxplots. The following
> plot(ecdf(A), do.points=FALSE, verticals=TRUE, xlim=range(A, B))
> plot(ecdf(B), do.points=FALSE, verticals=TRUE, add=TRUE)

will show the two empirical CDFs, and qqplot will perform a Q-Q plot of the two samples. The
Kolmogorov-Smirnov test is of the maximal vertical distance between the two ecdfs, assuming
a common continuous distribution:
> ks.test(A, B)

         Two-sample Kolmogorov-Smirnov test

data:  A and B
D = 0.5962, p-value = 0.05919
alternative hypothesis: two-sided

Warning message:
cannot compute correct p-values with ties in: ks.test(A, B)


9 Grouping, loops and conditional execution
9.1 Grouped expressions
R is an expression language in the sense that its only command type is a function or expression
which returns a result. Even an assignment is an expression whose result is the value assigned,
and it may be used wherever any expression may be used; in particular multiple assignments are
possible.
Commands may be grouped together in braces, {expr_1; ...; expr_m}, in which case the value of
the group is the result of the last expression in the group evaluated. Since such a group is also an
expression it may, for example, be itself included in parentheses and used as part of an even larger
expression, and so on.
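A brief sketch of these three points; the variable names are arbitrary.

```r
a <- b <- 3                        # multiple assignment: b <- 3 has value 3

z <- { u <- 2; u * 5 }             # value of the group is its last expression: 10

w <- ({ v <- 1; v + 1 }) * 10      # a group used inside a larger expression: 20
```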
9.2 Control statements
9.2.1 Conditional execution: if statements

The language has available a conditional construction of the form
> if (expr_1) expr_2 else expr_3

where expr_1 must evaluate to a single logical value and the result of the entire expression is
then evident.
The "short-circuit" operators && and || are often used as part of the condition in an if statement.
Whereas & and | apply element-wise to vectors, && and || apply to vectors of length one, and
only evaluate their second argument if necessary.
There is a vectorized version of the if/else construct, the ifelse function. This has the form
ifelse(condition, a, b) and returns a vector of the length of its longest argument, with
elements a[i] if condition[i] is true, otherwise b[i].
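The three constructs can be compared on a small made-up vector. The names sign1, ok and roots are introduced only for this sketch.

```r
x <- c(-2, 0, 3)

# if() needs a single logical value, so test one element
sign1 <- if (length(x) > 0 && x[1] < 0) "negative" else "non-negative"

# && short-circuits: the right-hand side is never evaluated when the left is FALSE
y  <- NULL
ok <- !is.null(y) && y[1] > 0

# ifelse() works element by element and returns a vector as long as x
roots <- ifelse(x >= 0, sqrt(abs(x)), NA)
```

abs() is used inside sqrt() because ifelse() evaluates the whole of each branch before selecting elements, so sqrt(x) alone would warn about the negative entry.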
9.2.2 Repetitive execution: for loops, repeat and while

There is also a for loop construction which has the form
> for (name in expr_1) expr_2

where name is the loop variable. expr_1 is a vector expression (often a sequence like 1:20), and
expr_2 is often a grouped expression with its sub-expressions written in terms of the dummy
name. expr_2 is repeatedly evaluated as name ranges through the values in the vector result of
expr_1.
As an example, suppose ind is a vector of class indicators and we wish to produce separate plots
of y versus x within classes. One possibility here is to use coplot(),(21) which will produce an
array of plots corresponding to each level of the factor. Another way to do this, now putting all
plots on the one display, is as follows:
> xc <- split(x, ind)
> yc <- split(y, ind)
> for (i in 1:length(yc)) {
    plot(xc[[i]], yc[[i]])
    abline(lsfit(xc[[i]], yc[[i]]))
  }

(Note the function split() which produces a list of vectors obtained by splitting a larger vector
according to the classes specified by a factor. This is a useful function, mostly used in
connection with boxplots. See the help facility for further details.)
Warning: for() loops are used in R code much less often than in compiled
languages. Code that takes a 'whole object' view is likely to be both clearer and
faster in R.
Other looping facilities include the
> repeat expr

statement and the
> while (condition) expr

statement.
The break statement can be used to terminate any loop, possibly abnormally. This is the only
way to terminate repeat loops.
The next statement can be used to discontinue one particular cycle and skip to the "next".
Control statements are most often used in connection with functions which are discussed in
Writing your own functions, and where more examples will emerge.
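The looping statements above can be sketched on simple arithmetic, together with a whole-object alternative of the kind the warning recommends. All names here are invented for the example.

```r
# for with next: add up the odd numbers from 1 to 10
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next          # skip this cycle for even i
  total <- total + i
}

# repeat must be left with break: smallest n with 2^n > 1000
n <- 0
repeat {
  n <- n + 1
  if (2^n > 1000) break
}

# the same search written as a while loop
m <- 1
while (2^m <= 1000) m <- m + 1

# whole-object version of the first loop, with no explicit loop at all
total2 <- sum(seq(1, 9, by = 2))
```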

10 Writing your own functions
As we have seen informally along the way, the R language allows the user to create objects of
mode function. These are true R functions that are stored in a special internal form and may be
used in further expressions and so on. In the process, the language gains enormously in power,
convenience and elegance, and learning to write useful functions is one of the main ways to
make your use of R comfortable and productive.
It should be emphasized that most of the functions supplied as part of the R system, such as
mean(), var(), postscript() and so on, are themselves written in R and thus do not differ
materially from user written functions.
A function is defined by an assignment of the form
> name <- function(arg_1, arg_2, ...) expression

The expression is an R expression (usually a grouped expression) that uses the arguments,
arg_i, to calculate a value. The value of the expression is the value returned for the function.
A call to the function then usually takes the form name(expr_1, expr_2, ...) and may occur
anywhere a function call is legitimate.
10.1Simpleexamples
Asafirstexample,considerafunctiontocalculatethetwosampletstatistic,showingallthe
steps.Thisisanartificialexample,ofcourse,sincethereareother,simplerwaysofachieving
thesameend.
Thefunctionisdefinedasfollows:
>twosam<function(y1,y2){
n1<length(y1);n2<length(y2)
yb1<mean(y1);yb2<mean(y2)
s1<var(y1);s2<var(y2)
s<((n11)*s1+(n21)*s2)/(n1+n22)
tst<(yb1yb2)/sqrt(s*(1/n1+1/n2))
tst
}

Withthisfunctiondefined,youcouldperformtwosamplettestsusingacallsuchas
>tstat<twosam(data$male,data$female);tstat

As a second example, consider a function to emulate directly the MATLAB backslash command,
which returns the coefficients of the orthogonal projection of the vector y onto the column space
of the matrix, X. (This is ordinarily called the least squares estimate of the regression
coefficients.) This would ordinarily be done with the qr() function; however, this is sometimes a
bit tricky to use directly and it pays to have a simple function such as the following to use it
safely.
Thus given an n by 1 vector y and an n by p matrix X then X \ y is defined as (X'X)^{-}X'y, where
(X'X)^{-} is a generalized inverse of X'X.
> bslash <- function(X, y) {
    X <- qr(X)
    qr.coef(X, y)
  }

After this object is created it may be used in statements such as
> regcoeff <- bslash(Xmat, yvar)

and so on.
The classical R function lsfit() does this job quite well, and more[22]. It in turn uses the
functions qr() and qr.coef() in the slightly counterintuitive way above to do this part of the
calculation. Hence there is probably some value in having just this part isolated in a simple to
use function if it is going to be in frequent use. If so, we may wish to make it a matrix binary
operator for even more convenient use.
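
The sketch below exercises bslash() on a small simulated problem and compares it with a direct solution of the normal equations. Xmat and yvar are hypothetical data invented for the illustration, and the design matrix is assumed to have full column rank so that the ordinary inverse exists.

```r
bslash <- function(X, y) {
  X <- qr(X)          # QR decomposition of the design matrix
  qr.coef(X, y)       # least squares coefficients from the decomposition
}

set.seed(2)
Xmat <- cbind(1, rnorm(20), rnorm(20))            # intercept column plus two regressors
yvar <- as.vector(Xmat %*% c(1, 2, -1)) + rnorm(20, sd = 0.1)

regcoeff  <- bslash(Xmat, yvar)
# Same estimate via the normal equations (X'X) b = X'y, for comparison only.
normal_eq <- solve(crossprod(Xmat), crossprod(Xmat, yvar))
```

The QR route is generally preferred numerically; the normal equations appear here only as an independent check.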
10.2 Defining new binary operators
Had we given the bslash() function a different name, namely one of the form
%anything%

it could have been used as a binary operator in expressions rather than in function form.
Suppose, for example, we choose ! for the internal character. The function definition would then
start as
> "%!%" <- function(X, y) { ... }

(Note the use of quote marks.) The function could then be used as X %!% y. (The backslash
symbol itself is not a convenient choice as it presents special problems in this context.)
The matrix multiplication operator, %*%, and the outer product matrix operator %o% are other
examples of binary operators defined in this way.
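
A complete version of the operator might look as follows. This is a sketch that simply reuses the body of bslash(); the matrix Xmat and vector yvar are made-up values chosen so that the fit is exact.

```r
# Binary operator form of the least squares "backslash" function.
"%!%" <- function(X, y) {
  X <- qr(X)
  qr.coef(X, y)
}

Xmat <- cbind(1, 1:5)          # intercept column and one regressor
yvar <- c(3, 5, 7, 9, 11)      # exactly 1 + 2 * (1:5)

coefs <- Xmat %!% yvar         # intercept 1 and slope 2, up to rounding
```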
10.3 Named arguments and defaults
As first noted in Generating regular sequences, if arguments to called functions are given in the
"name=object" form, they may be given in any order. Furthermore the argument sequence may
begin in the unnamed, positional form, and specify named arguments after the positional
arguments.
Thus if there is a function fun1 defined by
> fun1 <- function(data, data.frame, graph, limit) {
    [function body omitted]
  }
then the function may be invoked in several ways, for example
> ans <- fun1(d, df, TRUE, 20)
> ans <- fun1(d, df, graph=TRUE, limit=20)
> ans <- fun1(data=d, limit=20, graph=TRUE, data.frame=df)

are all equivalent.
In many cases arguments can be given commonly appropriate default values, in which case they
may be omitted altogether from the call when the defaults are appropriate. For example, if fun1
were defined as
> fun1 <- function(data, data.frame, graph=TRUE, limit=20) { ... }

it could be called as
> ans <- fun1(d, df)

which is now equivalent to the three cases above, or as
> ans <- fun1(d, df, limit=10)

which changes one of the defaults.
It is important to note that defaults may be arbitrary expressions, even involving other arguments
to the same function; they are not restricted to be constants as in our simple example here.
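
To illustrate defaults that are expressions involving other arguments, here is a small sketch. The function standardize() and its argument names are hypothetical, chosen only for the illustration.

```r
# 'centre' and 'spread' default to expressions computed from the argument x.
standardize <- function(x, centre = mean(x), spread = sd(x)) {
  (x - centre) / spread
}

z1 <- standardize(c(2, 4, 6))                  # both defaults used
z2 <- standardize(c(2, 4, 6), centre = 0)      # override one default by name
z3 <- standardize(spread = 1, x = c(2, 4, 6))  # named arguments in any order
```

Because arguments are evaluated lazily, the default expressions are only computed when the function body first uses them, by which time x is available.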
10.4 The '...' argument
Another frequent requirement is to allow one function to pass on argument settings to another.
For example many graphics functions use the function par() and functions like plot() allow the
user to pass on graphical parameters to par() to control the graphical output. (See The par()
function, for more details on the par() function.) This can be done by including an extra
argument, literally '...', of the function, which may then be passed on. An outline example is
given below.
fun1 <- function(data, data.frame, graph=TRUE, limit=20, ...) {
  [omitted statements]
  if (graph)
    par(pch="*", ...)
  [more omissions]
}

Less frequently, a function will need to refer to components of '...'. The expression list(...)
evaluates all such arguments and returns them in a named list, while ..1, ..2, etc. evaluate them
one at a time, with ..n returning the n-th unmatched argument.
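
The following sketch shows both uses of '...' described above: passing unmatched arguments on to another function, and inspecting them with list(...) and ..1. The function names paste_upper() and show_dots() are hypothetical.

```r
# Pass any extra arguments straight through to paste().
paste_upper <- function(x, ...) paste(toupper(x), ...)
joined <- paste_upper(c("a", "b"), collapse = "-")

# Inspect the unmatched arguments directly.
show_dots <- function(...) {
  args <- list(...)                 # all unmatched arguments, evaluated
  list(n = length(args), names = names(args), first = ..1)
}
res <- show_dots(a = 10, b = "x", 3)
```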
10.5 Assignments within functions
Note that any ordinary assignments done within the function are local and temporary and are
lost after exit from the function. Thus the assignment X <- qr(X) does not affect the value of the
argument in the calling program.
To understand completely the rules governing the scope of R assignments the reader needs to be
familiar with the notion of an evaluation frame. This is a somewhat advanced, though hardly
difficult, topic and is not covered further here.
If global and permanent assignments are intended within a function, then either the
"superassignment" operator, <<-, or the function assign() can be used. See the help document for
details. S-PLUS users should be aware that <<- has different semantics in R. These are discussed
further in Scope.
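
The difference between ordinary assignment and superassignment can be seen in a short sketch. The names counter, bump_local() and bump_super() are invented for the illustration.

```r
counter <- 0

bump_local <- function() {
  counter <- counter + 1    # creates a local 'counter'; lost on exit
  counter
}

bump_super <- function() {
  counter <<- counter + 1   # modifies 'counter' in the enclosing environment
  counter
}

local_result <- bump_local()   # 1, but the outer counter is untouched
after_local  <- counter        # still 0
super_result <- bump_super()   # 1
after_super  <- counter        # now 1
```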
10.6 More advanced examples
10.6.1 Efficiency factors in block designs

As a more complete, if a little pedestrian, example of a function, consider finding the efficiency
factors for a block design. (Some aspects of this problem have already been discussed in Index
matrices.)
A block design is defined by two factors, say blocks (b levels) and varieties (v levels). If R and
K are the v by v and b by b replications and block size matrices, respectively, and N is the b by v
incidence matrix, then the efficiency factors are defined as the eigenvalues of the matrix
E = I_v - R^{-1/2} N^T K^{-1} N R^{-1/2} = I_v - A^T A, where A = K^{-1/2} N R^{-1/2}.
One way to write the function is given below.
> bdeff <- function(blocks, varieties) {
    blocks <- as.factor(blocks)             # minor safety move
    b <- length(levels(blocks))
    varieties <- as.factor(varieties)       # minor safety move
    v <- length(levels(varieties))
    K <- as.vector(table(blocks))           # remove dim attr
    R <- as.vector(table(varieties))        # remove dim attr
    N <- table(blocks, varieties)
    A <- 1/sqrt(K) * N * rep(1/sqrt(R), rep(b, v))
    sv <- svd(A)
    list(eff=1 - sv$d^2, blockcv=sv$u, varietycv=sv$v)
  }

It is numerically slightly better to work with the singular value decomposition on this occasion
rather than the eigenvalue routines.
The result of the function is a list giving not only the efficiency factors as the first component,
but also the block and variety canonical contrasts, since sometimes these give additional useful
qualitative information.
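
As a sanity check, the sketch below applies bdeff() to a randomized complete block design with 4 blocks and 3 varieties, each variety appearing once in every block. The design is a made-up example; for such a design every entry of A equals 1/sqrt(12), so A has a single non-zero singular value equal to 1 and the efficiency factors should be 0, 1, 1.

```r
bdeff <- function(blocks, varieties) {
  blocks <- as.factor(blocks)
  b <- length(levels(blocks))
  varieties <- as.factor(varieties)
  v <- length(levels(varieties))
  K <- as.vector(table(blocks))       # block sizes
  R <- as.vector(table(varieties))    # replications
  N <- table(blocks, varieties)       # b by v incidence matrix
  A <- 1/sqrt(K) * N * rep(1/sqrt(R), rep(b, v))
  sv <- svd(A)
  list(eff = 1 - sv$d^2, blockcv = sv$u, varietycv = sv$v)
}

# Complete block design: 4 blocks, each containing all 3 varieties once.
blocks    <- rep(1:4, each = 3)
varieties <- rep(1:3, times = 4)
res <- bdeff(blocks, varieties)
res$eff
```

The zero factor corresponds to the overall mean; the remaining factors equal to 1 reflect that a complete block design loses no information on variety contrasts.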
10.6.2 Dropping all names in a printed array

For printing purposes with large matrices or arrays, it is often useful to print them in close block
form without the array names or numbers. Removing the dimnames attribute will not achieve this
effect, but rather the array must be given a dimnames attribute consisting of empty strings. For
example to print a matrix, X
> temp <- X
> dimnames(temp) <- list(rep("", nrow(X)), rep("", ncol(X)))
> temp; rm(temp)

This can be much more conveniently done using a function, no.dimnames(), shown below, as a
"wrap around" to achieve the same result. It also illustrates how some effective and useful user
functions can be quite short.
no.dimnames <- function(a) {
  ## Remove all dimension names from an array for compact printing.
  d <- list()
  l <- 0
  for(i in dim(a)) {
    d[[l <- l + 1]] <- rep("", i)
  }
  dimnames(a) <- d
  a
}

With this function defined, an array may be printed in close format using
> no.dimnames(X)

This is particularly useful for large integer arrays, where patterns are the real interest rather than
the values.
10.6.3 Recursive numerical integration

Functions may be recursive, and may themselves define functions within themselves. Note,
however, that such functions, or indeed variables, are not inherited by called functions in higher
evaluation frames as they would be if they were on the search path.
The example below shows a naive way of performing one-dimensional numerical integration.
The integrand is evaluated at the end points of the range and in the middle. If the one-panel
trapezium rule answer is close enough to the two panel, then the latter is returned as the value.
Otherwise the same process is recursively applied to each panel. The result is an adaptive
integration process that concentrates function evaluations in regions where the integrand is
farthest from linear. There is, however, a heavy overhead, and the function is only competitive
with other algorithms when the integrand is both smooth and very difficult to evaluate.
The example is also given partly as a little puzzle in R programming.
area <- function(f, a, b, eps = 1.0e-06, lim = 10) {
  fun1 <- function(f, a, b, fa, fb, a0, eps, lim, fun) {
    ## function 'fun1' is only visible inside 'area'
    d <- (a + b)/2
    h <- (b - a)/4
    fd <- f(d)
    a1 <- h * (fa + fd)
    a2 <- h * (fd + fb)
    if(abs(a0 - a1 - a2) < eps || lim == 0)
      return(a1 + a2)
    else {
      return(fun(f, a, d, fa, fd, a1, eps, lim - 1, fun) +
             fun(f, d, b, fd, fb, a2, eps, lim - 1, fun))
    }
  }
  fa <- f(a)
  fb <- f(b)
  a0 <- ((fa + fb) * (b - a))/2
  fun1(f, a, b, fa, fb, a0, eps, lim, fun1)
}

10.7 Scope
The discussion in this section is somewhat more technical than in other parts of this document.
However, it details one of the major differences between S-PLUS and R.
The symbols which occur in the body of a function can be divided into three classes: formal
parameters, local variables and free variables. The formal parameters of a function are those
occurring in the argument list of the function. Their values are determined by the process of
binding the actual function arguments to the formal parameters. Local variables are those whose
values are determined by the evaluation of expressions in the body of the functions. Variables
which are not formal parameters or local variables are called free variables. Free variables
become local variables if they are assigned to. Consider the following function definition.
f <- function(x) {
  y <- 2*x
  print(x)
  print(y)
  print(z)
}

In this function, x is a formal parameter, y is a local variable and z is a free variable.
In R the free variable bindings are resolved by first looking in the environment in which the
function was created. This is called lexical scope. First we define a function called cube.
cube <- function(n) {
  sq <- function() n*n
  n*sq()
}

The variable n in the function sq is not an argument to that function. Therefore it is a free
variable and the scoping rules must be used to ascertain the value that is to be associated with it.
Under static scope (S-PLUS) the value is that associated with a global variable named n. Under
lexical scope (R) it is the parameter to the function cube since that is the active binding for the
variable n at the time the function sq was defined. The difference between evaluation in R and
evaluation in S-PLUS is that S-PLUS looks for a global variable called n while R first looks for a
variable called n in the environment created when cube was invoked.
## first evaluation in S
S> cube(2)
Error in sq(): Object "n" not found
Dumped
S> n <- 3
S> cube(2)
[1] 18
## then the same function evaluated in R
R> cube(2)
[1] 8
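
The R behaviour shown in the transcript can be reproduced with the short sketch below, which also defines a global n to confirm that it plays no part in the result.

```r
cube <- function(n) {
  sq <- function() n * n   # 'n' is free in sq; it is found in cube's environment
  n * sq()
}

n <- 3                     # a global 'n' does not influence the result in R
result <- cube(2)          # 2 * (2 * 2)
```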

Lexical scope can also be used to give functions mutable state. In the following example we
show how R can be used to mimic a bank account. A functioning bank account needs to have a
balance or total, a function for making withdrawals, a function for making deposits and a
function for stating the current balance. We achieve this by creating the three functions within
account and then returning a list containing them. When account is invoked it takes a numerical
argument total and returns a list containing the three functions. Because these functions are
defined in an environment which contains total, they will have access to its value.
The special assignment operator, <<-, is used to change the value associated with total. This
operator looks back in enclosing environments for an environment that contains the symbol
total and when it finds such an environment it replaces the value, in that environment, with the
value of the right hand side. If the global or top-level environment is reached without finding the
symbol total then that variable is created and assigned to there. For most users <<- creates a
global variable and assigns the value of the right hand side to it[23]. Only when <<- has been used
in a function that was returned as the value of another function will the special behavior
described here occur.
open.account <- function(total) {
  list(
    deposit = function(amount) {
      if(amount <= 0)
        stop("Deposits must be positive!\n")
      total <<- total + amount
      cat(amount, "deposited.  Your balance is", total, "\n\n")
    },
    withdraw = function(amount) {
      if(amount > total)
        stop("You don't have that much money!\n")
      total <<- total - amount
      cat(amount, "withdrawn.  Your balance is", total, "\n\n")
    },
    balance = function() {
      cat("Your balance is", total, "\n\n")
    }
  )
}

ross <- open.account(100)
robert <- open.account(200)
ross$withdraw(30)
ross$balance()
robert$balance()
ross$deposit(50)
ross$balance()
ross$withdraw(500)
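
The same mechanism can be shown in a smaller form. The sketch below, with the hypothetical name make_counter(), returns a function that keeps its own private count; each call to make_counter() creates a separate environment, so the two counters do not interfere, just as the accounts of ross and robert are independent.

```r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # updates 'count' in the environment of make_counter
    count
  }
}

c1 <- make_counter()
c2 <- make_counter()

first  <- c1()   # 1
second <- c1()   # 2
other  <- c2()   # 1: c2 has its own 'count'
```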

10.8 Customizing the environment
Users can customize their environment in several different ways. There is a site initialization file
and every directory can have its own special initialization file. Finally, the special functions
.First and .Last can be used.
The location of the site initialization file is taken from the value of the R_PROFILE environment
variable. If that variable is unset, the file Rprofile.site in the R home subdirectory etc is used.
This file should contain the commands that you want to execute every time R is started under
your system. A second, personal, profile file named .Rprofile[24] can be placed in any directory.
If R is invoked in that directory then that file will be sourced. This file gives individual users
control over their workspace and allows for different startup procedures in different working
directories. If no .Rprofile file is found in the startup directory, then R looks for a .Rprofile file
in the user's home directory and uses that (if it exists). If the environment variable
R_PROFILE_USER is set, the file it points to is used instead of the .Rprofile files.
Any function named .First() in either of the two profile files or in the .RData image has a
special status. It is automatically performed at the beginning of an R session and may be used to
initialize the environment. For example, the definition in the example below alters the prompt to
$ and sets up various other useful things that can then be taken for granted in the rest of the
session.
Thus, the sequence in which files are executed is Rprofile.site, the user profile, .RData and
then .First(). A definition in later files will mask definitions in earlier files.
> .First <- function() {
    options(prompt="$ ", continue="+\t")  # $ is the prompt
    options(digits=5, length=999)         # custom numbers and printout
    x11()                                 # for graphics
    par(pch = "+")                        # plotting character
    source(file.path(Sys.getenv("HOME"), "R", "mystuff.R"))
                                          # my personal functions
    library(MASS)                         # attach a package
  }

Similarly a function .Last(), if defined, is (normally) executed at the very end of the session. An
example is given below.
> .Last <- function() {
    graphics.off()                        # a small safety measure.
    cat(paste(date(),"\nAdios\n"))        # Is it time for lunch?
  }

10.9 Classes, generic functions and object orientation
The class of an object determines how it will be treated by what are known as generic functions.
Put the other way round, a generic function performs a task or action on its arguments specific to
the class of the argument itself. If the argument lacks any class attribute, or has a class not
catered for specifically by the generic function in question, there is always a default action
provided.
An example makes things clearer. The class mechanism offers the user the facility of designing
and writing generic functions for special purposes. Among the other generic functions are plot()
for displaying objects graphically, summary() for summarizing analyses of various types, and
anova() for comparing statistical models.
The number of generic functions that can treat a class in a specific way can be quite large. For
example, the functions that can accommodate in some fashion objects of class "data.frame"
include
[     [[<-    any    as.matrix
[<-   mean    plot   summary

A currently complete list can be got by using the methods() function:
> methods(class="data.frame")

Conversely the number of classes a generic function can handle can also be quite large. For
example the plot() function has a default method and variants for objects of classes
"data.frame", "density", "factor", and more. A complete list can be got again by using the
methods() function:
> methods(plot)

For many generic functions the function body is quite short, for example
> coef
function (object, ...)
UseMethod("coef")

The presence of UseMethod indicates this is a generic function. To see what methods are available
we can use methods()
> methods(coef)
[1] coef.aov*         coef.Arima*       coef.default*     coef.listof*
[5] coef.nls*         coef.summary.nls*

   Non-visible functions are asterisked

In this example there are six methods, none of which can be seen by typing its name. We can
read these by either of
> getAnywhere("coef.aov")
A single object matching 'coef.aov' was found
It was found in the following places
  registered S3 method for coef from namespace stats
  namespace:stats
with value
function (object, ...)
{
    z <- object$coef
    z[!is.na(z)]
}

> getS3method("coef", "aov")
function (object, ...)
{
    z <- object$coef
    z[!is.na(z)]
}

A function named gen.cl will be invoked by the generic gen for class cl, so do not name
functions in this style unless they are intended to be methods.
The reader is referred to the R Language Definition for a more complete discussion of this
mechanism.
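
The gen.cl naming convention can be seen at work in a small user-defined generic. Everything in this sketch (the generic area_of(), the classes "circle" and "rectangle", and the objects shape1 to shape3) is hypothetical and serves only to illustrate dispatch.

```r
# A generic: dispatches on the class of its first argument.
area_of <- function(shape, ...) UseMethod("area_of")

# Methods follow the gen.cl naming convention.
area_of.circle    <- function(shape, ...) pi * shape$r^2
area_of.rectangle <- function(shape, ...) shape$w * shape$h
area_of.default   <- function(shape, ...) NA_real_   # fallback action

shape1 <- structure(list(r = 1), class = "circle")
shape2 <- structure(list(w = 2, h = 3), class = "rectangle")
shape3 <- list(side = 4)                              # no class attribute

areas <- c(area_of(shape1), area_of(shape2), area_of(shape3))
```

The third call falls through to the default method because shape3 has no class for which a specific method exists.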

11 Statistical models in R
This section presumes the reader has some familiarity with statistical methodology, in particular
with regression analysis and the analysis of variance. Later we make some rather more ambitious
presumptions, namely that something is known about generalized linear models and nonlinear
regression.
The requirements for fitting statistical models are sufficiently well defined to make it possible to
construct general tools that apply in a broad spectrum of problems.
R provides an interlocking suite of facilities that make fitting statistical models very simple. As
we mention in the introduction, the basic output is minimal, and one needs to ask for the details
by calling extractor functions.
11.1 Defining statistical models; formulae
The template for a statistical model is a linear regression model with independent, homoscedastic
errors
y_i = sum_{j=0}^p beta_j x_{ij} + e_i,     i = 1, ..., n,
where the e_i are NID(0, sigma^2). In matrix terms this would be written
y = X beta + e
where y is the response vector, X is the model matrix or design matrix and has columns x_0,
x_1, ..., x_p, the determining variables. Very often x_0 will be a column of ones defining an
intercept term.
Examples

Before giving a formal specification, a few examples may usefully set the picture.
Suppose y, x, x0, x1, x2, ... are numeric variables, X is a matrix and A, B, C, ... are factors. The
following formulae on the left side below specify statistical models as described on the right.
y ~ x
y ~ 1 + x

Both imply the same simple linear regression model of y on x. The first has an implicit
intercept term, and the second an explicit one.
y ~ 0 + x
y ~ -1 + x
y ~ x - 1

Simple linear regression of y on x through the origin (that is, without an intercept term).
log(y) ~ x1 + x2

Multiple regression of the transformed variable, log(y), on x1 and x2 (with an implicit
intercept term).
y ~ poly(x,2)
y ~ 1 + x + I(x^2)

Polynomial regression of y on x of degree 2. The first form uses orthogonal polynomials,
and the second uses explicit powers, as basis.
y ~ X + poly(x,2)

Multiple regression of y with model matrix consisting of the matrix X as well as polynomial
terms in x to degree 2.
y ~ A

Single classification analysis of variance model of y, with classes determined by A.
y ~ A + x

Single classification analysis of covariance model of y, with classes determined by A, and
with covariate x.
y ~ A*B
y ~ A + B + A:B
y ~ B %in% A
y ~ A/B

Two factor non-additive model of y on A and B. The first two specify the same crossed
classification and the second two specify the same nested classification. In abstract terms
all four specify the same model subspace.
y ~ (A + B + C)^2
y ~ A*B*C - A:B:C

Three factor experiment but with a model containing main effects and two factor
interactions only. Both formulae specify the same model.
y ~ A * x
y ~ A/x
y ~ A/(1 + x) - 1

Separate simple linear regression models of y on x within the levels of A, with different
codings. The last form produces explicit estimates of as many different intercepts and
slopes as there are levels in A.
y ~ A*B + Error(C)

An experiment with two treatment factors, A and B, and error strata determined by factor
C. For example a split plot experiment, with whole plots (and hence also subplots),
determined by factor C.
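
Some of the equivalences listed above can be checked directly. The sketch below uses a tiny made-up data frame d, compares model matrix columns with model.matrix(), and compares the expanded term labels of two formulae with terms(); no model is fitted.

```r
d <- data.frame(y = c(1.1, 1.9, 3.2, 3.9), x = 1:4)   # illustrative data only

# Implicit and explicit intercept give the same model matrix columns.
cols_implicit <- colnames(model.matrix(y ~ x, d))
cols_explicit <- colnames(model.matrix(y ~ 1 + x, d))

# Removing the intercept leaves only the column for x.
cols_origin <- colnames(model.matrix(y ~ x - 1, d))

# Both three-factor formulae expand to main effects and two-factor interactions.
labels_a <- attr(terms(y ~ (A + B + C)^2), "term.labels")
labels_b <- attr(terms(y ~ A*B*C - A:B:C), "term.labels")
```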
The operator ~ is used to define a model formula in R. The form, for an ordinary linear model, is
response ~ op_1 term_1 op_2 term_2 op_3 term_3 ...

where
response
is a vector or matrix (or expression evaluating to a vector or matrix) defining the response
variable(s).
op_i
is an operator, either + or -, implying the inclusion or exclusion of a term in the model
(the first is optional).
term_i
is either
    a vector or matrix expression, or 1,
    a factor, or
    a formula expression consisting of factors, vectors or matrices connected by formula
    operators.
In all cases each term defines a collection of columns either to be added to or removed
from the model matrix. A 1 stands for an intercept column and is by default included in the
model matrix unless explicitly removed.
The formula operators are similar in effect to the Wilkinson and Rogers notation used by such
programs as Glim and Genstat. One inevitable change is that the operator '.' becomes ':' since
the period is a valid name character in R.
The notation is summarized below (based on Chambers & Hastie, 1992, p.29):
Y ~ M

Y is modeled as M.
M_1 + M_2

Include M_1 and M_2.
M_1 - M_2

Include M_1 leaving out terms of M_2.
M_1 : M_2
The tensor product of M_1 and M_2. If both terms are factors, then the "subclasses" factor.
M_1 %in% M_2

Similar to M_1:M_2, but with a different coding.
M_1 * M_2

M_1 + M_2 + M_1:M_2.
M_1 / M_2

M_1 + M_2 %in% M_1.
M^n

All terms in M together with "interactions" up to order n.
I(M)

Insulate M. Inside M all operators have their normal arithmetic meaning, and that term
appears in the model matrix.
Note that inside the parentheses that usually enclose function arguments all operators have their
normal arithmetic meaning. The function I() is an identity function used to allow terms in model
formulae to be defined using arithmetic operators.
Note particularly that the model formulae specify the columns of the model matrix, the
specification of the parameters being implicit. This is not the case in other contexts, for example
in specifying nonlinear models.
11.1.1 Contrasts

We need at least some idea how the model formulae specify the columns of the model matrix.
This is easy if we have continuous variables, as each provides one column of the model matrix
(and the intercept will provide a column of ones if included in the model).
What about a k-level factor A? The answer differs for unordered and ordered factors. For
unordered factors k - 1 columns are generated for the indicators of the second, ..., k-th levels of
the factor. (Thus the implicit parameterization is to contrast the response at each level with that
at the first.) For ordered factors the k - 1 columns are the orthogonal polynomials on 1, ..., k,
omitting the constant term.
Although the answer is already complicated, it is not the whole story. First, if the intercept is
omitted in a model that contains a factor term, the first such term is encoded into k columns
giving the indicators for all the levels. Second, the whole behavior can be changed by the
options setting for contrasts. The default setting in R is
options(contrasts = c("contr.treatment", "contr.poly"))

The main reason for mentioning this is that R and S have different defaults for unordered factors,
S using Helmert contrasts. So if you need to compare your results to those of a textbook or paper
which used S-PLUS, you will need to set
options(contrasts = c("contr.helmert", "contr.poly"))
This is a deliberate difference, as treatment contrasts (R's default) are thought easier for
newcomers to interpret.
We have still not finished, as the contrast scheme to be used can be set for each term in the
model using the functions contrasts and C.
We have not yet considered interaction terms: these generate the products of the columns
introduced for their component terms.
Although the details are complicated, model formulae in R will normally generate the models
that an expert statistician would expect, provided that marginality is preserved. Fitting, for
example, a model with an interaction but not the corresponding main effects will in general lead
to surprising results, and is for experts only.
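
The column counts described in this section can be inspected with model.matrix(). The sketch below uses a made-up three-level factor A and sets the coding explicitly through the contrasts.arg argument, so that it does not depend on the current options setting.

```r
A <- factor(c("a", "b", "c", "a", "b", "c"))   # illustrative 3-level factor

# Treatment coding: intercept plus k - 1 = 2 indicator columns.
mm_treat <- model.matrix(~ A, contrasts.arg = list(A = "contr.treatment"))

# Helmert coding: same number of columns, different contrasts.
mm_helm  <- model.matrix(~ A, contrasts.arg = list(A = "contr.helmert"))

# Without an intercept the factor is coded by k = 3 indicator columns.
mm_noint <- model.matrix(~ A - 1)

helmert3 <- contr.helmert(3)   # the Helmert contrast matrix for 3 levels
```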
11.2 Linear models
The basic function for fitting ordinary multiple regression models is lm(), and a streamlined
version of the call is as follows:
> fitted.model <- lm(formula, data = data.frame)

For example
> fm2 <- lm(y ~ x1 + x2, data = production)

would fit a multiple regression model of y on x1 and x2 (with implicit intercept term).
The important (but technically optional) parameter data = production specifies that any
variables needed to construct the model should come first from the production data frame. This
is the case regardless of whether data frame production has been attached on the search path or
not.
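
Since the production data frame is not supplied with this document, the sketch below simulates a stand-in with known coefficients. The column names follow the example above, while the values themselves are invented.

```r
set.seed(3)
# Hypothetical stand-in for the 'production' data frame used in the text.
production <- data.frame(x1 = runif(50), x2 = runif(50))
production$y <- 1 + 2 * production$x1 - 3 * production$x2 +
  rnorm(50, sd = 0.1)

fm2 <- lm(y ~ x1 + x2, data = production)
est <- coef(fm2)    # should be close to the true values 1, 2, -3
```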
11.3 Generic functions for extracting model information
The value of lm() is a fitted model object; technically a list of results of class "lm". Information
about the fitted model can then be displayed, extracted, plotted and so on by using generic
functions that orient themselves to objects of class "lm". These include
add1    deviance   formula   predict     step
alias   drop1      kappa     print       summary
anova   effects    labels    proj        vcov
coef    family     plot      residuals

A brief description of the most commonly used ones is given below.
anova(object_1, object_2)

Compare a submodel with an outer model and produce an analysis of variance table.
coef(object)

Extract the regression coefficient (matrix).
Long form: coefficients(object).
deviance(object)

Residual sum of squares, weighted if appropriate.
formula(object)

Extract the model formula.
plot(object)

Produce four plots, showing residuals, fitted values and some diagnostics.
predict(object, newdata=data.frame)

The data frame supplied must have variables specified with the same labels as the original.
The value is a vector or matrix of predicted values corresponding to the determining
variable values in data.frame.
print(object)

Print a concise version of the object. Most often used implicitly.
residuals(object)

Extract the (matrix of) residuals, weighted as appropriate.
Short form: resid(object).
step(object)

Select a suitable model by adding or dropping terms and preserving hierarchies. The
model with the smallest value of AIC (Akaike's An Information Criterion) discovered in
the stepwise search is returned.
summary(object)

Print a comprehensive summary of the results of the regression analysis.
vcov(object)

Returns the variance-covariance matrix of the main parameters of a fitted model object.
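
A few of these extractor functions are shown in use below, as a brief sketch on the cars data frame that ships with R (stopping distance against speed). The object names are arbitrary.

```r
fit <- lm(dist ~ speed, data = cars)

b    <- coef(fit)          # intercept and slope
rss  <- deviance(fit)      # residual sum of squares
res  <- residuals(fit)     # one residual per observation
V    <- vcov(fit)          # 2 by 2 variance-covariance matrix
pred <- predict(fit, newdata = data.frame(speed = c(10, 20)))
```

Note that the data frame passed as newdata uses the same variable name, speed, as the original fit.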
11.4 Analysis of variance and model comparison
The model fitting function aov(formula, data=data.frame) operates at the simplest level in a
very similar way to the function lm(), and most of the generic functions listed in the table in
Generic functions for extracting model information apply.
It should be noted that in addition aov() allows an analysis of models with multiple error strata
such as split plot experiments, or balanced incomplete block designs with recovery of inter-block
information. The model formula
response ~ mean.formula + Error(strata.formula)

specifies a multi-stratum experiment with error strata defined by the strata.formula. In the
simplest case, strata.formula is simply a factor, when it defines a two strata experiment, namely
between and within the levels of the factor.
For example, with all determining variables factors, a model formula such as that in:
> fm <- aov(yield ~ v + n*p*k + Error(farms/blocks), data=farm.data)
would typically be used to describe an experiment with mean model v + n*p*k and three error
strata, namely "between farms", "within farms, between blocks" and "within blocks".
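
The farm.data object above is not available here, so the sketch below uses the npk data frame from the datasets package that ships with R instead. It is a simpler case with a single blocking factor, giving two error strata rather than three.

```r
# Classical N, P, K factorial experiment laid out in 6 blocks.
fm <- aov(yield ~ N * P * K + Error(block), data = npk)

strata <- names(fm)   # includes the "block" and "Within" strata
summary(fm)           # one ANOVA table per error stratum
```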
11.4.1 ANOVA tables

Note also that the analysis of variance table (or tables) are for a sequence of fitted models. The
sums of squares shown are the decrease in the residual sums of squares resulting from an
inclusion of that term in the model at that place in the sequence. Hence only for orthogonal
experiments will the order of inclusion be inconsequential.
For multistratum experiments the procedure is first to project the response onto the error strata,
again in sequence, and to fit the mean model to each projection. For further details, see
Chambers & Hastie (1992).
A more flexible alternative to the default full ANOVA table is to compare two or more models
directly using the anova() function.
> anova(fitted.model.1, fitted.model.2, ...)

The display is then an ANOVA table showing the differences between the fitted models when
fitted in sequence. The fitted models being compared would usually be a hierarchical sequence,
of course. This does not give different information to the default, but rather makes it easier to
comprehend and control.
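
A direct comparison of two nested models might look as follows. This sketch uses the mtcars data frame that ships with R; the choice of variables is arbitrary.

```r
fm1 <- lm(mpg ~ wt, data = mtcars)        # smaller model
fm2 <- lm(mpg ~ wt + hp, data = mtcars)   # adds one regressor

tab <- anova(fm1, fm2)   # one row per model; the second row tests the added term
```

The RSS column of the table holds the residual sums of squares of the two fits, and the Df column shows that one extra parameter was added.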
11.5 Updating fitted models
The update() function is largely a convenience function that allows a model to be fitted that
differs from one previously fitted usually by just a few additional or removed terms. Its form is
> new.model <- update(old.model, new.formula)

In the new.formula the special name consisting of a period, '.', only, can be used to stand for
"the corresponding part of the old model formula". For example,
> fm05 <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = production)
> fm6  <- update(fm05, . ~ . + x6)
> smf6 <- update(fm6, sqrt(.) ~ .)

would fit a five variate multiple regression with variables (presumably) from the data frame
production, fit an additional model including a sixth regressor variable, and fit a variant on the
model where the response had a square root transform applied.
Note especially that if the data= argument is specified on the original call to the model fitting
function, this information is passed on through the fitted model object to update() and its allies.
The name '.' can also be used in other contexts, but with slightly different meaning. For
example
> fmfull <- lm(y ~ . , data = production)

would fit a model with response y and regressor variables all other variables in the data frame
production.
Other functions for exploring incremental sequences of models are add1(), drop1() and step().
The names of these give a good clue to their purpose, but for full details see the on-line help.
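
The same pattern can be tried on data that ships with R. The sketch below uses mtcars in place of the production data frame; the variable choices are arbitrary.

```r
fm1 <- lm(mpg ~ wt, data = mtcars)
fm2 <- update(fm1, . ~ . + hp)          # add a regressor
fm3 <- update(fm2, sqrt(.) ~ .)         # transform the response
fmfull <- lm(mpg ~ ., data = mtcars)    # all other columns as regressors

formula(fm2)
formula(fm3)
```

Because data = mtcars was given in the original call, update() finds the data again without it being repeated.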
11.6Generalizedlinearmodels
Generalizedlinearmodelingisadevelopmentoflinearmodelstoaccommodatebothnonnormal
responsedistributionsandtransformationstolinearityinacleanandstraightforwardway.A
generalizedlinearmodelmaybedescribedintermsofthefollowingsequenceofassumptions:
There is a response, $y$, of interest and stimulus variables $x_1, x_2, \ldots$, whose values
influence the distribution of the response.
The stimulus variables influence the distribution of $y$ through a single linear function,
only. This linear function is called the linear predictor, and is usually written
$$\eta = \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p,$$
hence $x_i$ has no influence on the distribution of $y$ if and only if $\beta_i$ is zero.
The distribution of $y$ is of the form
$$f_Y(y; \mu, \phi) = \exp\left( \frac{A}{\phi} \left( y \lambda(\mu) - \gamma(\lambda(\mu)) \right) + \tau(y, \phi) \right)$$
where $\phi$ is a scale parameter (possibly known), and is constant for all observations, $A$
represents a prior weight, assumed known but possibly varying with the observations, and
$\mu$ is the mean of $y$. So it is assumed that the distribution of $y$ is determined by its
mean and possibly a scale parameter as well.
The mean, $\mu$, is a smooth invertible function of the linear predictor:
$$\mu = m(\eta), \qquad \eta = m^{-1}(\mu) = \ell(\mu)$$
and this inverse function, $\ell()$, is called the link function.
These assumptions are loose enough to encompass a wide class of models useful in statistical
practice, but tight enough to allow the development of a unified methodology of estimation and
inference, at least approximately. The reader is referred to any of the current reference works on
the subject for full details, such as McCullagh & Nelder (1989) or Dobson (1990).
11.6.1 Families
The class of generalized linear models handled by facilities supplied in R includes gaussian,
binomial, poisson, inverse gaussian and gamma response distributions and also quasi-likelihood
models where the response distribution is not explicitly specified. In the latter case the variance
function must be specified as a function of the mean, but in other cases this function is implied
by the response distribution.
Each response distribution admits a variety of link functions to connect the mean with the linear
predictor. Those automatically available are shown in the following table:
Family name        Link functions
binomial           logit, probit, log, cloglog
gaussian           identity, log, inverse
Gamma              identity, inverse, log
inverse.gaussian   1/mu^2, identity, inverse, log
poisson            identity, log, sqrt
quasi              logit, probit, cloglog, identity, inverse, log, 1/mu^2, sqrt

The combination of a response distribution, a link function and various other pieces of
information that are needed to carry out the modeling exercise is called the family of the
generalized linear model.
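A family can be inspected directly, which may make the definition more concrete. This is a small sketch using components of the family objects returned by the standard generators in package stats; the values passed in are arbitrary.

```r
fam <- binomial(link = "probit")   # a family object: distribution plus link
fam$family                         # name of the response distribution
fam$link                           # name of the link function
fam$linkfun(0.5)                   # link: maps the mean mu to the linear predictor
fam$linkinv(0)                     # inverse link: maps the linear predictor back to mu
poisson()$variance(3)              # variance function implied by the Poisson distribution
```

The link and its inverse are stored as ordinary R functions, so they can be evaluated at any value of the mean or the linear predictor.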
11.6.2 The glm() function

Since the distribution of the response depends on the stimulus variables through a single linear
function only, the same mechanism as was used for linear models can still be used to specify the
linear part of a generalized model. The family has to be specified in a different way.
The R function to fit a generalized linear model is glm() which uses the form
> fitted.model <- glm(formula, family=family.generator, data=data.frame)

The only new feature is the family.generator, which is the instrument by which the family is
described. It is the name of a function that generates a list of functions and expressions that
together define and control the model and estimation process. Although this may seem a little
complicated at first sight, its use is quite simple.
The names of the standard, supplied family generators are given under "Family Name" in the
table in Families. Where there is a choice of links, the name of the link may also be supplied
with the family name, in parentheses as a parameter. In the case of the quasi family, the variance
function may also be specified in this way.
Some examples make the process clear.
The gaussian family

A call such as
> fm <- glm(y ~ x1 + x2, family = gaussian, data = sales)

achieves the same result as
> fm <- lm(y ~ x1 + x2, data = sales)

but much less efficiently. Note that the gaussian family defaults to the identity link; the other
links listed for it in the table in Families (log and inverse) may be supplied as a parameter. If a
problem requires a gaussian family with some other link, this can usually be achieved through
the quasi family, as we shall see later.
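The equivalence of the two calls can be checked numerically. The sketch below uses simulated data; the sales data frame and its coefficients are made up for illustration and are not the manual's data.

```r
set.seed(1)
# Made-up data standing in for the 'sales' data frame.
sales <- data.frame(x1 = runif(30), x2 = runif(30))
sales$y <- 1 + 2 * sales$x1 - sales$x2 + rnorm(30, sd = 0.1)

fm_glm <- glm(y ~ x1 + x2, family = gaussian, data = sales)  # iterative fit
fm_lm  <- lm(y ~ x1 + x2, data = sales)                      # direct least squares

all.equal(coef(fm_glm), coef(fm_lm))   # compares the two coefficient vectors
```

Both fits minimize the same residual sum of squares, so the coefficients agree up to numerical tolerance; lm() simply gets there without the iterative reweighting.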
The binomial family

Consider a small, artificial example, from Silvey (1970).
On the Aegean island of Kalythos the male inhabitants suffer from a congenital eye disease, the
effects of which become more marked with increasing age. Samples of islander males of various
ages were tested for blindness and the results recorded. The data is shown below:
Age:          20   35   45   55   70
No. tested:   50   50   50   50   50
No. blind:     6   17   26   37   44
The problem we consider is to fit both logistic and probit models to this data, and to estimate for
each model the LD50, that is the age at which the chance of blindness for a male inhabitant is
50%.
If $y$ is the number of blind at age $x$ and $n$ the number tested, both models have the form
$y \sim B(n, F(\beta_0 + \beta_1 x))$ where for the probit case, $F(z) = \Phi(z)$ is the standard
normal distribution function, and in the logit case (the default), $F(z) = e^z/(1 + e^z)$. In both
cases the LD50 is $\mathrm{LD50} = -\beta_0/\beta_1$, that is, the point at which the argument of
the distribution function is zero.
The first step is to set the data up as a data frame
> kalythos <- data.frame(x = c(20,35,45,55,70), n = rep(50,5),
                         y = c(6,17,26,37,44))

To fit a binomial model using glm() there are three possibilities for the response:
If the response is a vector it is assumed to hold binary data, and so must be a 0/1 vector.
If the response is a two-column matrix it is assumed that the first column holds the number
of successes for the trial and the second holds the number of failures.
If the response is a factor, its first level is taken as failure (0) and all other levels as
'success' (1).
Here we need the second of these conventions, so we add a matrix to our data frame:
> kalythos$Ymat <- cbind(kalythos$y, kalythos$n - kalythos$y)

To fit the models we use
> fmp <- glm(Ymat ~ x, family = binomial(link=probit), data = kalythos)
> fml <- glm(Ymat ~ x, family = binomial, data = kalythos)

Since the logit link is the default the parameter may be omitted on the second call. To see the
results of each fit we could use
> summary(fmp)
> summary(fml)

Both models fit (all too) well. To find the LD50 estimate we can use a simple function:
> ld50 <- function(b) -b[1]/b[2]
> ldp <- ld50(coef(fmp)); ldl <- ld50(coef(fml)); c(ldp, ldl)

The actual estimates from this data are 43.663 years and 43.601 years respectively.
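As a check on the LD50 formula, the fitted probability of blindness at each model's own LD50 should be one half. The sketch below repeats the steps of this example in one self-contained script; the use of predict() with type = "response" is an addition for illustration and not part of the original example.

```r
kalythos <- data.frame(x = c(20, 35, 45, 55, 70), n = rep(50, 5),
                       y = c(6, 17, 26, 37, 44))
kalythos$Ymat <- cbind(kalythos$y, kalythos$n - kalythos$y)  # successes, failures

fmp <- glm(Ymat ~ x, family = binomial(link = probit), data = kalythos)
fml <- glm(Ymat ~ x, family = binomial, data = kalythos)

ld50 <- function(b) -b[1] / b[2]
ages <- c(probit = unname(ld50(coef(fmp))), logit = unname(ld50(coef(fml))))

# Fitted probability at each model's own LD50 age.
p_probit <- predict(fmp, data.frame(x = ages[["probit"]]), type = "response")
p_logit  <- predict(fml, data.frame(x = ages[["logit"]]),  type = "response")
c(p_probit, p_logit)
```

At the LD50 the linear predictor is zero, and both Phi(0) and e^0/(1 + e^0) equal 0.5, which is what the two predictions reproduce.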
Poisson models

With the Poisson family the default link is the log, and in practice the major use of this family is
to fit surrogate Poisson log-linear models to frequency data, whose actual distribution is often
multinomial. This is a large and important subject we will not discuss further here. It even forms
a major part of the use of non-gaussian generalized models overall.
Occasionally genuinely Poisson data arises in practice and in the past it was often analyzed as
gaussian data after either a log or a square-root transformation. As a graceful alternative to the
latter, a Poisson generalized linear model may be fitted as in the following example:
> fmod <- glm(y ~ A + B + x, family = poisson(link=sqrt),
              data = worm.counts)
Quasi-likelihood models

For all families the variance of the response will depend on the mean and will have the scale
parameter as a multiplier. The form of dependence of the variance on the mean is a characteristic
of the response distribution; for example for the poisson distribution $\mathrm{Var}(y) = \mu$.
For quasi-likelihood estimation and inference the precise response distribution is not specified,
but rather only a link function and the form of the variance function as it depends on the mean.
Since quasi-likelihood estimation uses formally identical techniques to those for the gaussian
distribution, this family provides a way of fitting gaussian models with non-standard link
functions or variance functions, incidentally.
For example, consider fitting the non-linear regression
$$y = \frac{\theta_1 z_1}{z_2 - \theta_2} + e$$
which may be written alternatively as
$$y = \frac{1}{\beta_1 x_1 + \beta_2 x_2} + e$$
where $x_1 = z_2/z_1$, $x_2 = -1/z_1$, $\beta_1 = 1/\theta_1$, and $\beta_2 = \theta_2/\theta_1$.
Supposing a suitable data frame to be set up we could fit this non-linear regression as
> nlfit <- glm(y ~ x1 + x2 - 1,
               family = quasi(link=inverse, variance=constant),
               data = biochem)

The reader is referred to the manual and the help document for further information, as needed.
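The quasi fit can be exercised on simulated data. This is a sketch under stated assumptions: the biochem data frame below is made up, with true coefficients 0.5 and 0.2 chosen arbitrarily, and the link and variance are given as quoted strings.

```r
set.seed(42)
# Made-up data: y = 1 / (0.5 * x1 + 0.2 * x2) + small gaussian error.
biochem <- data.frame(x1 = runif(50, 1, 3), x2 = runif(50, 1, 3))
biochem$y <- 1 / (0.5 * biochem$x1 + 0.2 * biochem$x2) + rnorm(50, sd = 0.01)

nlfit <- glm(y ~ x1 + x2 - 1,                        # no intercept term
             family = quasi(link = "inverse", variance = "constant"),
             data = biochem)
coef(nlfit)   # estimates of beta_1 and beta_2
```

The inverse link makes the linear predictor equal to 1/mu, and the constant variance function gives the equal weighting of ordinary least squares, so the fit corresponds to the reparametrized regression above.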
11.7 Nonlinear least squares and maximum likelihood models
Certain forms of nonlinear model can be fitted by Generalized Linear Models (glm()). But in the
majority of cases we have to approach the nonlinear curve fitting problem as one of nonlinear
optimization. R's nonlinear optimization routines are optim(), nlm() and nlminb(), which
provide the functionality (and more) of S-PLUS's ms() and nlminb(). We seek the parameter
values that minimize some index of lack-of-fit, and they do this by trying out various parameter
values iteratively. Unlike linear regression for example, there is no guarantee that the procedure
will converge on satisfactory estimates. All the methods require initial guesses about what
parameter values to try, and convergence may depend critically upon the quality of the starting
values.
11.7.1 Least squares

One way to fit a nonlinear model is by minimizing the sum of the squared errors (SSE) or
residuals. This method makes sense if the observed errors could have plausibly arisen from a
normal distribution.
Here is an example from Bates & Watts (1988), page 51. The data are:
> x <- c(0.02, 0.02, 0.06, 0.06, 0.11, 0.11, 0.22, 0.22, 0.56, 0.56,
         1.10, 1.10)
> y <- c(76, 47, 97, 107, 123, 139, 159, 152, 191, 201, 207, 200)

The fit criterion to be minimized is:
> fn <- function(p) sum((y - (p[1] * x)/(p[2] + x))^2)

In order to do the fit we need initial estimates of the parameters. One way to find sensible
starting values is to plot the data, guess some parameter values, and superimpose the model
curve using those values.
> plot(x, y)
> xfit <- seq(.02, 1.1, .05)
> yfit <- 200 * xfit/(0.1 + xfit)
> lines(spline(xfit, yfit))

We could do better, but these starting values of 200 and 0.1 seem adequate. Now do the fit:
> out <- nlm(fn, p = c(200, 0.1), hessian = TRUE)

After the fitting, out$minimum is the SSE, and out$estimate are the least squares estimates of the
parameters. To obtain the approximate standard errors (SE) of the estimates we do:
> sqrt(diag(2*out$minimum/(length(y) - 2) * solve(out$hessian)))

The 2 which is subtracted in the line above represents the number of parameters. A 95%
confidence interval would be the parameter estimate +/- 1.96 SE. We can superimpose the least
squares fit on a new plot:
> plot(x, y)
> xfit <- seq(.02, 1.1, .05)
> yfit <- 212.68384222 * xfit/(0.06412146 + xfit)
> lines(spline(xfit, yfit))
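The fitting steps above can be collected into one script that also assembles the approximate 95% intervals. This is a sketch of the same calculation without the plots; the names est, se and ci, and the row labels Vm and K, are introduced here for illustration.

```r
x <- c(0.02, 0.02, 0.06, 0.06, 0.11, 0.11, 0.22, 0.22, 0.56, 0.56, 1.10, 1.10)
y <- c(76, 47, 97, 107, 123, 139, 159, 152, 191, 201, 207, 200)

fn  <- function(p) sum((y - (p[1] * x) / (p[2] + x))^2)  # SSE criterion
out <- nlm(fn, p = c(200, 0.1), hessian = TRUE)

est <- out$estimate
# Approximate SEs: residual variance SSE/(n - 2) times the inverse of half the Hessian.
se  <- sqrt(diag(2 * out$minimum / (length(y) - 2) * solve(out$hessian)))
ci  <- cbind(lower = est - 1.96 * se, estimate = est, upper = est + 1.96 * se)
rownames(ci) <- c("Vm", "K")
ci
```

The factor 2 in the numerator arises because the Hessian of the SSE is twice the matrix whose inverse, scaled by the residual variance, approximates the covariance of the estimates.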

The standard package stats provides much more extensive facilities for fitting non-linear models
by least squares. The model we have just fitted is the Michaelis-Menten model, so we can use
> df <- data.frame(x=x, y=y)
> fit <- nls(y ~ SSmicmen(x, Vm, K), df)
> fit
Nonlinear regression model
  model:  y ~ SSmicmen(x, Vm, K)
   data:  df
          Vm            K
212.68370711   0.06412123
 residual sum-of-squares:  1195.449
> summary(fit)

Formula: y ~ SSmicmen(x, Vm, K)

Parameters:
    Estimate Std. Error t value Pr(>|t|)
Vm 2.127e+02  6.947e+00  30.615 3.24e-11
K  6.412e-02  8.281e-03   7.743 1.57e-05

Residual standard error: 10.93 on 10 degrees of freedom

Correlation of Parameter Estimates:
      Vm
K 0.7651
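SSmicmen() is a self-starting model, so it needs no initial values. When no self-starting form is available, nls() can be given the model formula and starting values explicitly. The sketch below, which assumes the same data and the starting guesses used earlier, fits the model both ways and compares the results.

```r
x <- c(0.02, 0.02, 0.06, 0.06, 0.11, 0.11, 0.22, 0.22, 0.56, 0.56, 1.10, 1.10)
y <- c(76, 47, 97, 107, 123, 139, 159, 152, 191, 201, 207, 200)
df <- data.frame(x = x, y = y)

# Self-starting form: initial values are computed from the data.
fit_ss <- nls(y ~ SSmicmen(x, Vm, K), data = df)

# Explicit formula: starting values must be supplied by the user.
fit_ex <- nls(y ~ Vm * x / (K + x), data = df, start = list(Vm = 200, K = 0.1))

rbind(selfstart = coef(fit_ss), explicit = coef(fit_ex))
deviance(fit_ss)   # residual sum of squares
```

Both calls minimize the same sum of squares, so they should agree up to the convergence tolerance of nls().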

11.7.2 Maximum likelihood

Maximum likelihood is a method of nonlinear model fitting that applies even if the errors are not
normal. The method finds the parameter values which maximize the log likelihood, or
equivalently which minimize the negative log-likelihood. Here is an example from Dobson
(1990), pp. 108-111. This example fits a logistic model to dose-response data, which clearly
could also be fit by glm(). The data are:
> x <- c(1.6907, 1.7242, 1.7552, 1.7842, 1.8113,
         1.8369, 1.8610, 1.8839)
> y <- c( 6, 13, 18, 28, 52, 53, 61, 60)
> n <- c(59, 60, 62, 56, 63, 59, 62, 60)

The negative log-likelihood to minimize is:
> fn <- function(p)
     sum( - (y*(p[1]+p[2]*x) - n*log(1+exp(p[1]+p[2]*x))
             + log(choose(n, y)) ))

We pick sensible starting values and do the fit:
> out <- nlm(fn, p = c(-50,20), hessian = TRUE)

After the fitting, out$minimum is the negative log-likelihood, and out$estimate are the maximum
likelihood estimates of the parameters. To obtain the approximate SEs of the estimates we do:
> sqrt(diag(solve(out$hessian)))

A 95% confidence interval would be the parameter estimate +/- 1.96 SE.
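Since the text notes that this model could also be fitted by glm(), the two routes can be compared. The following sketch repeats the fit and checks it against a binomial glm(); the names gfit and eta are introduced here for illustration.

```r
x <- c(1.6907, 1.7242, 1.7552, 1.7842, 1.8113, 1.8369, 1.8610, 1.8839)
y <- c( 6, 13, 18, 28, 52, 53, 61, 60)
n <- c(59, 60, 62, 56, 63, 59, 62, 60)

# Negative binomial log-likelihood under a logistic model.
fn <- function(p) {
  eta <- p[1] + p[2] * x
  -sum(y * eta - n * log(1 + exp(eta)) + log(choose(n, y)))
}
out <- nlm(fn, p = c(-50, 20), hessian = TRUE)
se  <- sqrt(diag(solve(out$hessian)))        # approximate standard errors

gfit <- glm(cbind(y, n - y) ~ x, family = binomial)
rbind(nlm = out$estimate, glm = coef(gfit))  # the two sets of estimates
c(nlm = out$minimum, glm = -as.numeric(logLik(gfit)))
```

The second comparison works because fn() includes the binomial coefficient term, so its value at the glm() estimates is exactly the negative of the log-likelihood that logLik() reports.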
11.8 Some non-standard models
We conclude this chapter with just a brief mention of some of the other facilities available in R
for special regression and data analysis problems.
Mixed models. The recommended nlme package provides functions lme() and nlme() for
linear and non-linear mixed-effects models, that is linear and non-linear regressions in
which some of the coefficients correspond to random effects. These functions make heavy
use of formulae to specify the models.
Local approximating regressions. The loess() function fits a nonparametric regression
by using a locally weighted regression. Such regressions are useful for highlighting a trend
in messy data or for data reduction to give some insight into a large data set.
Function loess is in the standard package stats, together with code for projection pursuit
regression.
Robust regression. There are several functions available for fitting regression models in
a way resistant to the influence of extreme outliers in the data. Function lqs in the
recommended package MASS provides state-of-the-art algorithms for highly-resistant fits.
Less resistant but statistically more efficient methods are available in packages, for
example function rlm in package MASS.
Additive models. This technique aims to construct a regression function from smooth
additive functions of the determining variables, usually one for each determining variable.
Functions avas and ace in package acepack and functions bruto and mars in package mda
provide some examples of these techniques in user-contributed packages to R. An
extension is Generalized Additive Models, implemented in user-contributed packages
gam and mgcv.
Tree-based models. Rather than seek an explicit global linear model for prediction or
interpretation, tree-based models seek to bifurcate the data, recursively, at critical points of
the determining variables in order to partition the data ultimately into groups that are as
homogeneous as possible within, and as heterogeneous as possible between. The results
often lead to insights that other data analysis methods tend not to yield.
Models are again specified in the ordinary linear model form. The model fitting function is
tree(), but many other generic functions such as plot() and text() are well adapted to
displaying the results of a tree-based model fit in a graphical way.
Tree models are available in R via the user-contributed packages rpart and tree.

12 Graphical procedures
Graphical facilities are an important and extremely versatile component of the R environment. It
is possible to use the facilities to display a wide variety of statistical graphs and also to build
entirely new types of graph.
The graphics facilities can be used in both interactive and batch modes, but in most cases,
interactive use is more productive. Interactive use is also easy because at startup time R initiates
a graphics device driver which opens a special graphics window for the display of interactive
graphics. Although this is done automatically, it may be useful to know that the command used is
X11() under UNIX, windows() under Windows and quartz() under OS X. A new device can
always be opened by dev.new().
Once the device driver is running, R plotting commands can be used to produce a variety of
graphical displays and to create entirely new kinds of display.
Plotting commands are divided into three basic groups:
High-level plotting functions create a new plot on the graphics device, possibly with axes,
labels, titles and so on.
Low-level plotting functions add more information to an existing plot, such as extra
points, lines and labels.
Interactive graphics functions allow you to interactively add information to, or extract
information from, an existing plot, using a pointing device such as a mouse.
In addition, R maintains a list of graphical parameters which can be manipulated to customize
your plots.
This manual only describes what are known as 'base' graphics. A separate graphics sub-system
in package grid coexists with base; it is more powerful but harder to use. There is a
recommended package lattice which builds on grid and provides ways to produce multi-panel
plots akin to those in the Trellis system in S.
12.1 High-level plotting commands
High-level plotting functions are designed to generate a complete plot of the data passed as
arguments to the function. Where appropriate, axes, labels and titles are automatically generated
(unless you request otherwise). High-level plotting commands always start a new plot, erasing
the current plot if necessary.
12.1.1 The plot() function

One of the most frequently used plotting functions in R is the plot() function. This is a generic
function: the type of plot produced is dependent on the type or class of the first argument.
plot(x, y)
plot(xy)

If x and y are vectors, plot(x, y) produces a scatterplot of y against x. The same effect can
be produced by supplying one argument (second form) as either a list containing two
elements x and y or a two-column matrix.
plot(x)
If x is a time series, this produces a time-series plot. If x is a numeric vector, it produces a
plot of the values in the vector against their index in the vector. If x is a complex vector, it
produces a plot of imaginary versus real parts of the vector elements.
plot(f)
plot(f, y)

f is a factor object, y is a numeric vector. The first form generates a bar plot of f; the
second form produces boxplots of y for each level of f.
plot(df)
plot(~ expr)
plot(y ~ expr)

df is a data frame, y is any object, expr is a list of object names separated by '+' (e.g., a + b
+ c). The first two forms produce distributional plots of the variables in a data frame (first
form) or of a number of named objects (second form). The third form plots y against every
object named in expr.
12.1.2 Displaying multivariate data

R provides two very useful functions for representing multivariate data. If X is a numeric matrix
or data frame, the command
> pairs(X)

produces a pairwise scatterplot matrix of the variables defined by the columns of X, that is, every
column of X is plotted against every other column of X and the resulting n(n-1) plots are arranged
in a matrix with plot scales constant over the rows and columns of the matrix.
When three or four variables are involved a coplot may be more enlightening. If a and b are
numeric vectors and c is a numeric vector or factor object (all of the same length), then the
command
> coplot(a ~ b | c)

produces a number of scatterplots of a against b for given values of c. If c is a factor, this simply
means that a is plotted against b for every level of c. When c is numeric, it is divided into a
number of conditioning intervals and for each interval a is plotted against b for values of c within
the interval. The number and position of intervals can be controlled with the given.values=
argument to coplot(); the function co.intervals() is useful for selecting intervals. You can
also use two given variables with a command like
> coplot(a ~ b | c + d)

which produces scatterplots of a against b for every joint conditioning interval of c and d.
The coplot() and pairs() functions both take an argument panel= which can be used to
customize the type of plot which appears in each panel. The default is points() to produce a
scatterplot but by supplying some other low-level graphics function of two vectors x and y as the
value of panel= you can produce any type of plot you wish. An example panel function useful for
coplots is panel.smooth().
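Both functions can be tried on data sets that ship with R. The sketch below is illustrative only: the choice of the iris and quakes data sets and of four conditioning intervals is an assumption, and pdf(NULL) is used so that the code also runs without a display (omit it, and the final dev.off(), to see the plots on screen).

```r
pdf(NULL)                        # null device, so the sketch runs non-interactively

X <- iris[, 1:4]                 # four numeric columns: 4 * 3 = 12 off-diagonal panels
pairs(X)                         # default panels drawn by points()
pairs(X, panel = panel.smooth)   # same layout, with a smooth curve added to each panel

# Latitude against longitude, conditioned on intervals of depth.
coplot(lat ~ long | depth, data = quakes)
coplot(lat ~ long | depth, data = quakes, number = 4, panel = panel.smooth)

# The intervals themselves can be computed in advance with co.intervals().
ints <- co.intervals(quakes$depth, number = 4)
ints

dev.off()
```

Each row of ints gives the lower and upper end of one conditioning interval; by default neighbouring intervals overlap.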
12.1.3 Display graphics

Other high-level graphics functions produce different types of plots. Some examples are:
qqnorm(x)
qqline(x)
qqplot(x, y)

Distribution-comparison plots. The first form plots the numeric vector x against the
expected Normal order scores (a normal scores plot) and the second adds a straight line to
such a plot by drawing a line through the distribution and data quartiles. The third form
plots the quantiles of x against those of y to compare their respective distributions.
hist(x)
hist(x, nclass=n)
hist(x, breaks=b, ...)

Produces a histogram of the numeric vector x. A sensible number of classes is usually
chosen, but a recommendation can be given with the nclass= argument. Alternatively, the
breakpoints can be specified exactly with the breaks= argument. If the probability=TRUE
argument is given, the bars represent relative frequencies divided by bin width instead of
counts.
dotchart(x, ...)

Constructs a dotchart of the data in x. In a dotchart the y-axis gives a labelling of the data
in x and the x-axis gives its value. For example it allows easy visual selection of all data
entries with values lying in specified ranges.
image(x, y, z, ...)
contour(x, y, z, ...)
persp(x, y, z, ...)

Plots of three variables. The image plot draws a grid of rectangles using different colours to
represent the value of z, the contour plot draws contour lines to represent the value of z,
and the persp plot draws a 3D surface.
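The scaling used by hist() with probability=TRUE can be verified from the object that hist() returns. This sketch uses simulated data and arbitrary breakpoints, both assumptions made for illustration, and computes the histogram without drawing it.

```r
set.seed(1)
x <- rnorm(200)                           # simulated data
b <- seq(-5, 5, by = 0.5)                 # explicit breakpoints covering the data

h <- hist(x, breaks = b, plot = FALSE)    # compute the histogram without plotting
sum(h$counts)                             # every observation falls in some bin
sum(h$density * diff(h$breaks))           # total area of the bars on the density scale
```

On the density scale each bar's height is its relative frequency divided by its width, so the areas of the bars add up to one.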
12.1.4 Arguments to high-level plotting functions

There are a number of arguments which may be passed to high-level graphics functions, as
follows:
add=TRUE

Forces the function to act as a low-level graphics function, superimposing the plot on the
current plot (some functions only).
axes=FALSE

Suppresses generation of axes; useful for adding your own custom axes with the axis()
function. The default, axes=TRUE, means include axes.
log="x"
log="y"
log="xy"

Causes the x, y or both axes to be logarithmic. This will work for many, but not all, types
of plot.
type=

The type= argument controls the type of plot produced, as follows:
type="p"

Plot individual points (the default)
type="l"

Plot lines
type="b"

Plot points connected by lines (both)
type="o"

Plot points overlaid by lines
type="h"

Plot vertical lines from points to the zero axis (high-density)
type="s"
type="S"

Step-function plots. In the first form, the top of the vertical defines the point; in the
second, the bottom.
type="n"

No plotting at all. However axes are still drawn (by default) and the coordinate
system is set up according to the data. Ideal for creating plots with subsequent
low-level graphics functions.
xlab=string
ylab=string

Axis labels for the x and y axes. Use these arguments to change the default labels, usually
the names of the objects used in the call to the high-level plotting function.
main=string

Figure title, placed at the top of the plot in a large font.
sub=string

Sub-title, placed just below the x-axis in a smaller font.
12.2 Low-level plotting commands
Sometimes the high-level plotting functions don't produce exactly the kind of plot you desire. In
this case, low-level plotting commands can be used to add extra information (such as points,
lines or text) to the current plot.
Some of the more useful low-level plotting functions are:
points(x, y)
lines(x, y)

Adds points or connected lines to the current plot. plot()'s type= argument can also be
passed to these functions (and defaults to "p" for points() and "l" for lines()).
text(x, y, labels, ...)

Add text to a plot at points given by x, y. Normally labels is an integer or character
vector in which case labels[i] is plotted at point (x[i], y[i]). The default is
1:length(x).
Note: This function is often used in the sequence
> plot(x, y, type="n"); text(x, y, names)

The graphics parameter type="n" suppresses the points but sets up the axes, and the text()
function supplies special characters, as specified by the character vector names for the
points.
abline(a, b)
abline(h=y)
abline(v=x)
abline(lm.obj)

Adds a line of slope b and intercept a to the current plot. h=y may be used to specify
y-coordinates for the heights of horizontal lines to go across a plot, and v=x similarly for the
x-coordinates for vertical lines. Also lm.obj may be a list with a coefficients component of
length 2 (such as the result of model-fitting functions), which are taken as an intercept and
slope, in that order.
polygon(x, y, ...)

Draws a polygon defined by the ordered vertices in (x, y) and (optionally) shades it in with
hatch lines, or fills it if the graphics device allows the filling of figures.
legend(x, y, legend, ...)

Adds a legend to the current plot at the specified position. Plotting characters, line styles,
colors etc., are identified with the labels in the character vector legend. At least one other
argument v (a vector the same length as legend) with the corresponding values of the
plotting unit must also be given, as follows:
legend( , fill=v)

Colors for filled boxes
legend( , col=v)

Colors in which points or lines will be drawn
legend( , lty=v)

Line styles
legend( , lwd=v)

Line widths
legend( , pch=v)

Plotting characters (character vector)
title(main, sub)

Adds a title main to the top of the current plot in a large font and (optionally) a sub-title sub
at the bottom in a smaller font.
axis(side, ...)

Adds an axis to the current plot on the side given by the first argument (1 to 4, counting
clockwise from the bottom). Other arguments control the positioning of the axis within or
beside the plot, and tick positions and labels. Useful for adding custom axes after calling
plot() with the axes=FALSE argument.
Low-level plotting functions usually require some positioning information (e.g., x and y
coordinates) to determine where to place the new plot elements. Coordinates are given in terms
of user coordinates which are defined by the previous high-level graphics command and are
chosen based on the supplied data.
Where x and y arguments are required, it is also sufficient to supply a single argument being a
list with elements named x and y. Similarly a matrix with two columns is also valid input. In this
way functions such as locator() (see below) may be used to specify positions on a plot
interactively.
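Several of the low-level functions described in this section can be combined on a single plot. The sketch below builds a figure from an empty type="n" plot; the simulated data, the labels and the use of pdf(NULL) (so that the code runs without a display) are assumptions made for illustration.

```r
pdf(NULL)                                  # null device; omit to draw on screen

set.seed(2)
x <- 1:20
y <- 3 + 0.5 * x + rnorm(20)               # simulated data
fit <- lm(y ~ x)

# Set up the coordinate system only: no points and no axes yet.
plot(x, y, type = "n", axes = FALSE, xlab = "x", ylab = "y")
points(x, y, pch = 4)                      # add the data
abline(fit, lty = 2)                       # line from the fitted intercept and slope
abline(h = mean(y), col = "grey")          # horizontal reference line
axis(1); axis(2)                           # custom axes on the bottom and left
legend("topleft", legend = c("data", "least-squares line"),
       pch = c(4, NA), lty = c(NA, 2))
title(main = "Low-level additions", sub = "built on an empty plot")

dev.off()
```

All coordinates passed to the low-level calls are user coordinates, fixed by the initial plot() call from the ranges of x and y.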
12.2.1 Mathematical annotation

In some cases, it is useful to add mathematical symbols and formulae to a plot. This can be
achieved in R by specifying an expression rather than a character string in any one of text,
mtext, axis, or title. For example, the following code draws the formula for the Binomial
probability function:
> text(x, y, expression(paste(bgroup("(", atop(n, x), ")"), p^x, q^{n-x})))

More information, including a full listing of the features available, can be obtained from within R
using the commands:
> help(plotmath)
> example(plotmath)
> demo(plotmath)
12.2.2 Hershey vector fonts

It is possible to specify Hershey vector fonts for rendering text when using the text and contour
functions. There are three reasons for using the Hershey fonts:
Hershey fonts can produce better output, especially on a computer screen, for rotated
and/or small text.
Hershey fonts provide certain symbols that may not be available in the standard fonts. In
particular, there are zodiac signs, cartographic symbols and astronomical symbols.
Hershey fonts provide Cyrillic and Japanese (Kana and Kanji) characters.
More information, including tables of Hershey characters, can be obtained from within R using
the commands:
> help(Hershey)
> demo(Hershey)
> help(Japanese)
> demo(Japanese)

12.3 Interacting with graphics
R also provides functions which allow users to extract or add information to a plot using a
mouse. The simplest of these is the locator() function:
locator(n, type)

Waits for the user to select locations on the current plot using the left mouse button. This
continues until n (default 512) points have been selected, or another mouse button is
pressed. The type argument allows for plotting at the selected points and has the same
effect as for high-level graphics commands; the default is no plotting. locator() returns
the locations of the points selected as a list with two components x and y.
locator() is usually called with no arguments. It is particularly useful for interactively selecting
positions for graphic elements such as legends or labels when it is difficult to calculate in
advance where the graphic should be placed. For example, to place some informative text near
an outlying point, the command
> text(locator(1), "Outlier", adj=0)

may be useful. (locator() will be ignored if the current device, such as postscript, does not
support interactive pointing.)
identify(x, y, labels)

Allow the user to highlight any of the points defined by x and y (using the left mouse
button) by plotting the corresponding component of labels nearby (or the index number of
the point if labels is absent). Returns the indices of the selected points when another
button is pressed.
Sometimes we want to identify particular points on a plot, rather than their positions. For
example, we may wish the user to select some observation of interest from a graphical display
and then manipulate that observation in some way. Given a number of (x, y) coordinates in two
numeric vectors x and y, we could use the identify() function as follows:
> plot(x, y)
> identify(x, y)

The identify() function performs no plotting itself, but simply allows the user to move the
mouse pointer and click the left mouse button near a point. If there is a point near the mouse
pointer it will be marked with its index number (that is, its position in the x/y vectors) plotted
nearby. Alternatively, you could use some informative string (such as a case name) as a highlight
by using the labels argument to identify(), or disable marking altogether with the plot=FALSE
argument. When the process is terminated (see above), identify() returns the indices of the
selected points; you can use these indices to extract the selected points from the original vectors
x and y.
12.4 Using graphics parameters
When creating graphics, particularly for presentation or publication purposes, R's defaults do not
always produce exactly that which is required. You can, however, customize almost every aspect
of the display using graphics parameters. R maintains a list of a large number of graphics
parameters which control things such as line style, colors, figure arrangement and text
justification among many others. Every graphics parameter has a name (such as 'col', which
controls colors) and a value (a color number, for example).
A separate list of graphics parameters is maintained for each active device, and each device has a
default set of parameters when initialized. Graphics parameters can be set in two ways: either
permanently, affecting all graphics functions which access the current device; or temporarily,
affecting only a single graphics function call.
12.4.1 Permanent changes: The par() function

The par() function is used to access and modify the list of graphics parameters for the current
graphics device.
par()

Without arguments, returns a list of all graphics parameters and their values for the current
device.
par(c("col", "lty"))

With a character vector argument, returns only the named graphics parameters (again, as a
list).
par(col=4, lty=2)

With named arguments (or a single list argument), sets the values of the named graphics
parameters, and returns the original values of the parameters as a list.
Setting graphics parameters with the par() function changes the value of the parameters
permanently, in the sense that all future calls to graphics functions (on the current device) will be
affected by the new value. You can think of setting graphics parameters in this way as setting
"default" values for the parameters, which will be used by all graphics functions unless an
alternative value is given.
Note that calls to par() always affect the global values of graphics parameters, even when par()
is called from within a function. This is often undesirable behavior; usually we want to set some
graphics parameters, do some plotting, and then restore the original values so as not to affect the
user's R session. You can restore the initial values by saving the result of par() when making
changes, and restoring the initial values when plotting is complete.
> oldpar <- par(col=4, lty=2)
  ... plotting commands ...
> par(oldpar)

To save and restore all settable graphical parameters use
> oldpar <- par(no.readonly=TRUE)
  ... plotting commands ...
> par(oldpar)
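Inside a function, the save-and-restore idiom is often paired with on.exit(), so that the original values come back even if a plotting command fails. The sketch below is one way this might be written; the function name dashed_plot is hypothetical, and pdf(NULL) is used only so that the code runs without a display.

```r
pdf(NULL)                          # null device; omit to draw on screen

# Hypothetical helper: draws a dashed blue line without leaving par() changed.
dashed_plot <- function(x, y) {
  oldpar <- par(lty = 2, col = 4)  # par() returns the previous values
  on.exit(par(oldpar))             # restore them when the function exits
  plot(x, y, type = "l")
}

before <- par("lty")
dashed_plot(1:10, (1:10)^2)
after <- par("lty")
identical(before, after)           # line type is back to its earlier value

dev.off()
```

Registering the restore with on.exit() immediately after the change keeps the two steps together and covers early returns and errors as well as normal completion.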

12.4.2 Temporary changes: Arguments to graphics functions

Graphics parameters may also be passed to (almost) any graphics function as named arguments.
This has the same effect as passing the arguments to the par() function, except that the changes
only last for the duration of the function call. For example:
> plot(x, y, pch="+")

produces a scatterplot using a plus sign as the plotting character, without changing the default
plotting character for future plots.
Unfortunately, this is not implemented entirely consistently and it is sometimes necessary to set
and reset graphics parameters using par().
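A small check of the temporary nature of such arguments might look as follows. The data vectors and variable names are made up for the illustration, and pdf(file = NULL) is assumed only to keep the sketch non-interactive.

```r
pdf(file = NULL)                 # a device that produces no file
x <- 1:10
y <- x^2
pch_before <- par("pch")         # default plotting character for the device
plot(x, y, pch = "+")            # pch applies to this call only
pch_after <- par("pch")          # the device default is unchanged
dev.off()
```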
12.5 Graphics parameters list
The following sections detail many of the commonly-used graphical parameters. The R help
documentation for the par() function provides a more concise summary; this is provided as a
somewhat more detailed alternative.
Graphics parameters will be presented in the following form:
name=value

A description of the parameter's effect. name is the name of the parameter, that is, the
argument name to use in calls to par() or a graphics function. value is a typical value you
might use when setting the parameter.
Note that axes is not a graphics parameter but an argument to a few plot methods: see xaxt and
yaxt.
Graphical elements:

Axes and tick marks:

Figure margins:

Multiple figure environment:
12.5.1 Graphical elements

R plots are made up of points, lines, text and polygons (filled regions). Graphical parameters
exist which control how these graphical elements are drawn, as follows:
pch="+"

Character to be used for plotting points. The default varies with graphics drivers, but it is
usually a circle. Plotted points tend to appear slightly above or below the appropriate
position unless you use "." as the plotting character, which produces centered points.
pch=4

When pch is given as an integer between 0 and 25 inclusive, a specialized plotting symbol
is produced. To see what the symbols are, use the command
> legend(locator(1), as.character(0:25), pch = 0:25)

Those from 21 to 25 may appear to duplicate earlier symbols, but can be coloured in
different ways: see the help on points and its examples.
In addition, pch can be a character or a number in the range 32:255 representing a character
in the current font.
lty=2

Line types. Alternative line styles are not supported on all graphics devices (and vary on
those that do) but line type 1 is always a solid line, line type 0 is always invisible, and line
types 2 and onwards are dotted or dashed lines, or some combination of both.
lwd=2

Line widths. Desired width of lines, in multiples of the standard line width. Affects axis
lines as well as lines drawn with lines(), etc. Not all devices support this, and some have
restrictions on the widths that can be used.
col=2

Colors to be used for points, lines, text, filled regions and images. A number from the
current palette (see ?palette) or a named colour.
col.axis
col.lab
col.main
col.sub

The color to be used for axis annotation, x and y labels, main and sub-titles, respectively.
font=2

An integer which specifies which font to use for text. If possible, device drivers arrange so
that 1 corresponds to plain text, 2 to bold face, 3 to italic, 4 to bold italic and 5 to a symbol
font (which includes Greek letters).
font.axis
font.lab
font.main
font.sub

The font to be used for axis annotation, x and y labels, main and sub-titles, respectively.
adj=-0.1

Justification of text relative to the plotting position. 0 means left justify, 1 means right
justify and 0.5 means to center horizontally about the plotting position. The actual value is
the proportion of text that appears to the left of the plotting position, so a value of -0.1
leaves a gap of 10% of the text width between the text and the plotting position.
cex=1.5

Character expansion. The value is the desired size of text characters (including plotting
characters) relative to the default text size.
cex.axis
cex.lab
cex.main
cex.sub

The character expansion to be used for axis annotation, x and y labels, main and sub-titles,
respectively.
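As a non-interactive alternative to the legend(locator(1), ...) command, several of the parameters in this section can be tried in one plot. The sketch below is only an illustration: the particular values, the label text and the variable names are arbitrary choices, and pdf(file = NULL) is assumed so that nothing is written to disk.

```r
pdf(file = NULL)                           # a device that produces no file
old <- par(cex = 1.5, font = 2)            # larger, bold text; keep the old values
plot(0:25, rep(1, 26), pch = 0:25, col = 2,
     ylim = c(0, 2), xlab = "pch value", ylab = "")
lines(c(0, 25), c(0.5, 0.5), lty = 2, lwd = 2)  # dashed line, double width
text(0, 1.5, "left-justified label", adj = 0)   # adj = 0: text starts at x = 0
par(old)                                   # restore the saved values
restored <- par("cex", "font")
dev.off()
```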
12.5.2 Axes and tick marks

Many of R's high-level plots have axes, and you can construct axes yourself with the low-level
axis() graphics function. Axes have three main components: the axis line (line style controlled
by the lty graphics parameter), the tick marks (which mark off unit divisions along the axis line)
and the tick labels (which mark the units). These components can be customized with the
following graphics parameters.
lab=c(5, 7, 12)

The first two numbers are the desired number of tick intervals on the x and y axes
respectively. The third number is the desired length of axis labels, in characters (including
the decimal point). Choosing a too-small value for this parameter may result in all tick
labels being rounded to the same number!
las=1

Orientation of axis labels. 0 means always parallel to axis, 1 means always horizontal, and
2 means always perpendicular to the axis.
mgp=c(3, 1, 0)

Positions of axis components. The first component is the distance from the axis label to the
axis position, in text lines. The second component is the distance to the tick labels, and the
final component is the distance from the axis position to the axis line (usually zero).
Positive numbers measure outside the plot region, negative numbers inside.
tck=0.01

Length of tick marks, as a fraction of the size of the plotting region. When tck is small
(less than 0.5) the tick marks on the x and y axes are forced to be the same size. A value of
1 gives grid lines. Negative values give tick marks outside the plotting region. Use
tck=0.01 and mgp=c(1,-1.5,0) for internal tick marks.
xaxs="r"
yaxs="i"

Axis styles for the x and y axes, respectively. With styles "i" (internal) and "r" (the
default) tick marks always fall within the range of the data; however, style "r" leaves a
small amount of space at the edges. (S has other styles not implemented in R.)
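The internal tick mark settings mentioned above can be tried as in the following sketch. The data and variable names are arbitrary, and pdf(file = NULL) is assumed only so that the example runs without producing a file.

```r
pdf(file = NULL)                                  # a device that produces no file
old <- par(las = 1, tck = 0.01, mgp = c(1, -1.5, 0))
plot(1:10, (1:10)^2, xaxs = "i", yaxs = "r")      # axis styles passed per call
settings <- par("las", "tck", "mgp")              # values in force for this plot
par(old)                                          # restore the previous values
dev.off()
```

Here las, tck and mgp are set through par() and so stay in force until restored, whereas xaxs and yaxs are passed to plot() and apply to that call only.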
12.5.3 Figure margins

A single plot in R is known as a figure and comprises a plot region surrounded by margins
(possibly containing axis labels, titles, etc.) and (usually) bounded by the axes themselves.
A typical figure is
[Figure: images/fig11]
Graphics parameters controlling figure layout include:
mai=c(1, 0.5, 0.5, 0)

Widths of the bottom, left, top and right margins, respectively, measured in inches.
mar=c(4, 2, 2, 1)

Similar to mai, except the measurement unit is text lines.
mar and mai are equivalent in the sense that setting one changes the value of the other. The
default values chosen for this parameter are often too large; the right-hand margin is rarely
needed, and neither is the top margin if no title is being used. The bottom and left margins must
be large enough to accommodate the axis and tick labels. Furthermore, the default is chosen
without regard to the size of the device surface: for example, using the postscript() driver with
the height=4 argument will result in a plot which is about 50% margin unless mar or mai are set
explicitly. When multiple figures are in use (see below) the margins are reduced, however this
may not be enough when many figures share the same page.
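The link between mar and mai can be observed directly. In this sketch the variable names are arbitrary and pdf(file = NULL) is assumed so that no file is written; the margins are set in text lines and the value in inches is then read back.

```r
pdf(file = NULL)                     # a device that produces no file
mai_default <- par("mai")            # default margins, in inches
old <- par(mar = c(4, 2, 2, 1))      # set margins in text lines
mai_after <- par("mai")              # mai has changed to match
plot(1:10, main = "Tighter margins")
par(old)                             # restore the previous margins
dev.off()
```

Each element of mai_after should be the corresponding element of mar multiplied by the height of one text line in inches.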
12.5.4 Multiple figure environment

R allows you to create an n by m array of figures on a single page. Each figure has its own
margins, and the array of figures is optionally surrounded by an outer margin, as shown in the
following figure.
[Figure: images/fig12]
The graphical parameters relating to multiple figures are as follows:
mfcol=c(3, 2)
mfrow=c(2, 4)

Set the size of a multiple figure array. The first value is the number of rows; the second is
the number of columns. The only difference between these two parameters is that setting
mfcol causes figures to be filled by column; mfrow fills by rows.
The layout in the Figure could have been created by setting mfrow=c(3,2); the figure shows
the page after four plots have been drawn.
Setting either of these can reduce the base size of symbols and text (controlled by
par("cex") and the pointsize of the device). In a layout with exactly two rows and columns
the base size is reduced by a factor of 0.83: if there are three or more of either rows or
columns, the reduction factor is 0.66.
mfg=c(2, 2, 3, 2)

Position of the current figure in a multiple figure environment. The first two numbers are
the row and column of the current figure; the last two are the number of rows and columns
in the multiple figure array. Set this parameter to jump between figures in the array. You
can even use different values for the last two numbers than the true values for unequally
sized figures on the same page.
fig=c(4, 9, 1, 4)/10

Position of the current figure on the page. Values are the positions of the left, right, bottom
and top edges respectively, as a percentage of the page measured from the bottom left
corner. The example value would be for a figure in the bottom right of the page. Set this
parameter for arbitrary positioning of figures within a page. If you want to add a figure to
a current page, use new=TRUE as well (unlike S).
oma=c(2, 0, 3, 0)
omi=c(0, 0, 0.8, 0)

Size of outer margins. Like mar and mai, the first measures in text lines and the second in
inches, starting with the bottom margin and working clockwise.
Outer margins are particularly useful for page-wise titles, etc. Text can be added to the outer
margins with the mtext() function with argument outer=TRUE. There are no outer margins by
default, however, so you must create them explicitly using oma or omi.
More complicated arrangements of multiple figures can be produced by the split.screen() and
layout() functions, as well as by the grid and lattice packages.
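A two-by-two layout with a page-wise title could be set up as in the sketch below. The panel contents, the title text and the variable names are invented for the illustration, and pdf(file = NULL) is assumed to avoid writing a file.

```r
pdf(file = NULL)                                   # a device that produces no file
old <- par(mfrow = c(2, 2), oma = c(0, 0, 3, 0))   # 2 x 2 array, 3-line top outer margin
cex_in_layout <- par("cex")                        # base size is reduced in this layout
for (i in 1:4) {
  plot(1:10, (1:10)^i, main = paste("Power", i))   # panels are filled by rows
}
mtext("Four panels on one page", outer = TRUE, cex = 1.2)
par(old)                                           # back to a single figure
dev.off()
```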
12.6 Device drivers
R can generate graphics (of varying levels of quality) on almost any type of display or printing
device. Before this can begin, however, R needs to be informed what type of device it is dealing
with. This is done by starting a device driver. The purpose of a device driver is to convert
graphical instructions from R (draw a line, for example) into a form that the particular device
can understand.
Device drivers are started by calling a device driver function. There is one such function for
every device driver: type help(Devices) for a list of them all. For example, issuing the command
> postscript()

causes all future graphics output to be sent to the printer in PostScript format. Some commonly-
used device drivers are:
X11()

For use with the X11 window system on Unix-alikes
windows()

For use on Windows
quartz()

For use on OS X
postscript()

For printing on PostScript printers, or creating PostScript graphics files.
pdf()

Produces a PDF file, which can also be included into PDF files.
png()

Produces a bitmap PNG file. (Not always available: see its help page.)
jpeg()

Produces a bitmap JPEG file, best used for image plots. (Not always available: see its help
page.)
When you have finished with a device, be sure to terminate the device driver by issuing the
command
> dev.off()

This ensures that the device finishes cleanly; for example in the case of hardcopy devices this
ensures that every page is completed and has been sent to the printer. (This will happen
automatically at the normal end of a session.)
PostScript diagrams for typeset documents:
Multiple graphics devices:

12.6.1 PostScript diagrams for typeset documents

By passing the file argument to the postscript() device driver function, you may store the
graphics in PostScript format in a file of your choice. The plot will be in landscape orientation
unless the horizontal=FALSE argument is given, and you can control the size of the graphic with
the width and height arguments (the plot will be scaled as appropriate to fit these dimensions).
For example, the command
> postscript("file.ps", horizontal=FALSE, height=5, pointsize=10)

will produce a file containing PostScript code for a figure five inches high, perhaps for inclusion
in a document. It is important to note that if the file named in the command already exists, it will
be overwritten. This is the case even if the file was only created earlier in the same R session.
Many usages of PostScript output will be to incorporate the figure in another document. This
works best when encapsulated PostScript is produced: R always produces conformant output,
but only marks the output as such when the onefile=FALSE argument is supplied. This unusual
notation stems from S-compatibility: it really means that the output will be a single page (which
is part of the EPSF specification). Thus to produce a plot for inclusion use something like
> postscript("plot1.eps", horizontal=FALSE, onefile=FALSE,
             height=8, width=6, pointsize=10)
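A complete open, plot, close cycle for such a file might look like this. The sketch writes to a temporary file (via tempfile()) rather than to plot1.eps so that nothing in the working directory is overwritten; the plot itself and the variable name eps_file are arbitrary.

```r
eps_file <- tempfile(fileext = ".eps")       # throwaway file name for the sketch
postscript(eps_file, horizontal = FALSE, onefile = FALSE,
           height = 8, width = 6, pointsize = 10)
plot(1:10, (1:10)^2, main = "Example figure")
dev.off()                                    # completes and closes the file
file.exists(eps_file)
```

The file is only guaranteed to be complete once dev.off() has been called.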

12.6.2 Multiple graphics devices

In advanced use of R it is often useful to have several graphics devices in use at the same time.
Of course only one graphics device can accept graphics commands at any one time, and this is
known as the current device. When multiple devices are open, they form a numbered sequence
with names giving the kind of device at any position.
The main commands used for operating with multiple devices, and their meanings are as follows:
X11()

[UNIX]
windows()
win.printer()
win.metafile()
[Windows]
quartz()

[OS X]
postscript()
pdf()
png()
jpeg()
tiff()
bitmap()

Each new call to a device driver function opens a new graphics device, thus extending by
one the device list. This device becomes the current device, to which graphics output will
be sent.
dev.list()

Returns the number and name of all active devices. The device at position 1 on the list is
always the null device which does not accept graphics commands at all.
dev.next()
dev.prev()

Returns the number and name of the graphics device next to, or previous to the current
device, respectively.
dev.set(which=k)

Can be used to change the current graphics device to the one at position k of the device
list. Returns the number and label of the device.
dev.off(k)

Terminate the graphics device at point k of the device list. For some devices, such as
postscript devices, this will either print the file immediately or correctly complete the file
for later printing, depending on how the device was initiated.
dev.copy(device, ..., which=k)
dev.print(device, ..., which=k)

Make a copy of the device k. Here device is a device function, such as postscript, with
extra arguments, if needed, specified by '...'. dev.print is similar, but the copied device is
immediately closed, so that end actions, such as printing hardcopies, are immediately
performed.
graphics.off()

Terminate all graphics devices on the list, except the null device.
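The device list can be explored without any screen device, as in the sketch below. It assumes that no other devices are open at the start (graphics.off() is called first to make that so) and uses pdf(file = NULL) devices, which produce no files; the variable names are arbitrary.

```r
graphics.off()                 # start from an empty device list
pdf(file = NULL)               # opens device 2 (position 1 is the null device)
pdf(file = NULL)               # opens device 3, which becomes current
open_devices <- dev.list()     # numbers and names of the active devices
current <- dev.cur()           # device 3
dev.set(which = 2)             # make device 2 current
after_set <- dev.cur()
dev.off(3)                     # close device 3 explicitly
graphics.off()                 # close whatever is left
remaining <- dev.list()        # NULL when no devices are open
```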
12.7 Dynamic graphics
R does not have builtin capabilities for dynamic or interactive graphics, e.g. rotating point clouds
or brushing (interactively highlighting) points. However, extensive dynamic graphics
facilities are available in the system GGobi by Swayne, Cook and Buja available from
http://www.ggobi.org/
and these can be accessed from R via the package rggobi, described at
http://www.ggobi.org/rggobi.
Also, package rgl provides ways to interact with 3D plots, for example of surfaces.

13 Packages
All R functions and datasets are stored in packages. Only when a package is loaded are its
contents available. This is done both for efficiency (the full list would take more memory and
would take longer to search than a subset), and to aid package developers, who are protected
from name clashes with other code. The process of developing packages is described in Creating
R packages in Writing R Extensions. Here, we will describe them from a user's point of view.
To see which packages are installed at your site, issue the command
> library()

with no arguments. To load a particular package (e.g., the boot package containing functions
from Davison & Hinkley (1997)), use a command like
> library(boot)

Users connected to the Internet can use the install.packages() and update.packages() functions
(available through the Packages menu in the Windows and OS X GUIs, see Installing packages
in R Installation and Administration) to install and update packages.
To see which packages are currently loaded, use
> search()

to display the search list. Some packages may be loaded but not available on the search list (see
Namespaces): these will be included in the list given by
> loadedNamespaces()

To see a list of all available help topics in an installed package, use
> help.start()

to start the HTML help system, and then navigate to the package listing in the Reference section.
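The difference between the search list and the loaded namespaces can be seen in a short session. The sketch uses the standard packages splines and tools rather than boot, purely because they should be present in any installation; requireNamespace() is used here as one way of loading a namespace without attaching the package.

```r
library(splines)                                  # load and attach a package
on_search_path <- "package:splines" %in% search()
ns_loaded <- "splines" %in% loadedNamespaces()

requireNamespace("tools", quietly = TRUE)         # load a namespace only
tools_loaded <- "tools" %in% loadedNamespaces()
tools_attached <- "package:tools" %in% search()   # typically FALSE at this point
```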
Standard packages:

Contributed packages and CRAN:
Namespaces:

13.1 Standard packages
The standard (or base) packages are considered part of the R source code. They contain the basic
functions that allow R to work, and the datasets and standard statistical and graphical functions
that are described in this manual. They should be automatically available in any R installation.
See R packages in R FAQ, for a complete list.
13.2 Contributed packages and CRAN
There are thousands of contributed packages for R, written by many different authors. Some of
these packages implement specialized statistical methods, others give access to data or hardware,
and others are designed to complement textbooks. Some (the recommended packages) are
distributed with every binary distribution of R. Most are available for download from CRAN
(http://CRAN.R-project.org/ and its mirrors) and other repositories such as Bioconductor
(http://www.bioconductor.org/) and Omegahat (http://www.omegahat.org/). The R FAQ contains
a list of CRAN packages current at the time of release, but the collection of available packages
changes very frequently.
13.3 Namespaces
All packages have namespaces, and have since R 2.14.0. Namespaces do three things: they allow
the package writer to hide functions and data that are meant only for internal use, they prevent
functions from breaking when a user (or other package writer) picks a name that clashes with
one in the package, and they provide a way to refer to an object within a particular package.
For example, t() is the transpose function in R, but users might define their own function named
t. Namespaces prevent the user's definition from taking precedence, and breaking every function
that tries to transpose a matrix.
There are two operators that work with namespaces. The double-colon operator :: selects
definitions from a particular namespace. In the example above, the transpose function will
always be available as base::t, because it is defined in the base package. Only functions that are
exported from the package can be retrieved in this way.
The triple-colon operator ::: may be seen in a few places in R code: it acts like the double-colon
operator but also allows access to hidden objects. Users are more likely to use the getAnywhere()
function, which searches multiple packages.
Packages are often inter-dependent, and loading one may cause others to be automatically
loaded. The colon operators described above will also cause automatic loading of the associated
package. When packages with namespaces are loaded automatically they are not added to the
search list.
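The t() example can be reproduced in a few lines. The user-defined t below and the variable names are made up for the illustration; the point is only that base::t still reaches the transpose function while the user's definition masks the plain name in the workspace.

```r
t <- function(x) "user-defined t"   # a user function that masks the name t
m <- matrix(1:6, nrow = 2)
masked <- t(m)                      # calls the user's definition
explicit <- base::t(m)              # always the transpose from package base
rm(t)                               # remove the mask; t() is the transpose again
```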

14 OS facilities
R has quite extensive facilities to access the OS under which it is running: this allows it to be
used as a scripting language and that ability is much used by R itself, for example to install
packages.
Because R's own scripts need to work across all platforms, considerable effort has gone into
making the scripting facilities as platform-independent as is feasible.
Files and directories:

Filepaths:

System commands:

Compression and Archives:
14.1 Files and directories
There are many functions to manipulate files and directories. Here are pointers to some of the
more commonly used ones.
To create an (empty) file or directory, use file.create or dir.create. (These are the analogues
of the POSIX utilities touch and mkdir.) For temporary files and directories in the R session
directory see tempfile.
Files can be removed by either file.remove or unlink: the latter can remove directory trees.
For directory listings use list.files (also available as dir) or list.dirs. These can select files
using a regular expression: to select by wildcards use Sys.glob.
Many types of information on a filepath (including for example if it is a file or directory) can be
found by file.info.
There are several ways to find out if a file exists (and a file can exist on the filesystem and not be
visible to the current user). There are functions file.exists, file.access and file_test with
various versions of this test: file_test is a version of the POSIX test command for those
familiar with shell scripting.
Function file.copy is the R analogue of the POSIX command cp.
Choosing files can be done interactively by file.choose: the Windows port has the more
versatile functions choose.files and choose.dir and there are similar functions in the tcltk
package: tk_choose.files and tk_choose.dir.
Functions file.show and file.edit will display and edit one or more files in a way appropriate
to the R port, using the facilities of a console (such as RGui on Windows or R.app on OS X) if
one is in use.
There is some support for links in the filesystem: see functions file.link and Sys.readlink.
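Several of these functions can be combined in a short round trip. The sketch works entirely inside the session's temporary directory; the directory and file names (r-intro-demo, notes.txt, notes-copy.txt) are invented for the illustration.

```r
work_dir <- file.path(tempdir(), "r-intro-demo")   # hypothetical scratch directory
dir.create(work_dir)                               # analogue of mkdir
data_file <- file.path(work_dir, "notes.txt")
file.create(data_file)                             # analogue of touch
copy_ok <- file.copy(data_file, file.path(work_dir, "notes-copy.txt"))
listing <- list.files(work_dir, pattern = "\\.txt$")  # select by regular expression
is_dir <- file.info(work_dir)$isdir
unlink(work_dir, recursive = TRUE)                 # remove the whole directory tree
gone <- !file.exists(work_dir)
```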
14.2 Filepaths
With a few exceptions, R relies on the underlying OS functions to manipulate filepaths. Some
aspects of this are allowed to depend on the OS, and do, even down to the version of the OS.
There are POSIX standards for how OSes should interpret filepaths and many R users assume
POSIX compliance: but Windows does not claim to be compliant and other OSes may be less
than completely compliant.
The following are some issues which have been encountered with filepaths.
• POSIX filesystems are case-sensitive, so foo.png and Foo.PNG are different files. However,
the defaults on Windows and OS X are to be case-insensitive, and FAT filesystems
(commonly used on removable storage) are not normally case-sensitive (and all file paths
may be mapped to lower case).
• Almost all the Windows OS services support the use of slash or backslash as the filepath
separator, and R converts the known exceptions to the form required by Windows.
• The behaviour of filepaths with a trailing slash is OS-dependent. Such paths are not valid
on Windows and should not be expected to work. POSIX-2008 requires such paths to
match only directories, but earlier versions allowed them to also match files. So they are
best avoided.
• Multiple slashes in filepaths such as /abc//def are valid on POSIX filesystems and treated
as if there was only one slash. They are usually accepted by Windows OS functions.
However, leading double slashes may have a different meaning.
• Windows UNC filepaths (such as \\server\dir1\dir2\file and
\\?\UNC\server\dir1\dir2\file) are not supported, but they may work in some R functions.
POSIX filesystems are allowed to treat a leading double slash specially.
• Windows allows filepaths containing drives and relative to the current directory on a drive,
e.g. d:foo/bar refers to d:/a/b/c/foo/bar if the current directory on drive d: is /a/b/c. It
is intended that these work, but the use of absolute paths is safer.
Functions basename and dirname select parts of a file path: the recommended way to assemble a
file path from components is file.path. Function path.expand does tilde expansion, substituting
values for home directories (the current user's, and perhaps those of other users).
On filesystems with links, a single file can be referred to by many filepaths. Function
normalizePath will find a canonical filepath.
Windows has the concepts of short (8.3) and long file names: normalizePath will return an
absolute path using long file names and shortPathName will return a version using short names.
The latter does not contain spaces and uses backslash as the separator, so is sometimes useful for
exporting names from R.
File permissions are a related topic. R has support for the POSIX concepts of read/write/execute
permission for owner/group/all but this may be only partially supported on the filesystem (so for
example on Windows only read-only files (for the account running the R session) are recognized).
Access Control Lists (ACLs) are employed on several filesystems, but do not have an agreed
standard and R has no facilities to control them. Use Sys.chmod to change permissions.
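The path helpers named above can be tried on a made-up relative path. The components "project", "data" and "morley.tab" are only illustrative and need not exist; normalizePath is applied to the session's temporary directory, which does exist.

```r
p <- file.path("project", "data", "morley.tab")  # assembled with "/" on all platforms
file_part <- basename(p)                         # last component of the path
dir_part <- dirname(p)                           # everything before it
home <- path.expand("~")                         # tilde expansion (value is user-dependent)
canonical_tmp <- normalizePath(tempdir())        # canonical form of an existing path
```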
14.3 System commands
Functions system and system2 are used to invoke a system command and optionally collect its
output. system2 is a little more general but its main advantage is that it is easier to write
cross-platform code using it.
system behaves differently on Windows from other OSes (because the API C call of that name
does). Elsewhere it invokes a shell to run the command: the Windows port of R has a function
shell to do that.
To find out if the OS includes a command, use Sys.which, which attempts to do this in a
cross-platform way (unfortunately it is not a standard OS service).
Function shQuote will quote filepaths as needed for commands in the current OS.
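These functions are typically used together: check that a command exists, quote its arguments, then run it. In the sketch below the command name definitely-not-a-command-xyz is deliberately fictitious, the file name is invented, and echo is only run if Sys.which finds it (it may not exist as a separate program on Windows). The explicit type = "sh" asks for POSIX shell quoting regardless of platform.

```r
missing_cmd <- Sys.which("definitely-not-a-command-xyz")  # "" when not found
quoted <- shQuote("my file.txt", type = "sh")             # quote for a POSIX shell

echo_path <- Sys.which("echo")
if (nzchar(echo_path)) {
  # collect the command's output as a character vector
  out <- system2("echo", shQuote("hello from R"), stdout = TRUE)
}
```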
14.4 Compression and Archives
Recent versions of R have extensive facilities to read and write compressed files, often
transparently. Reading of files in R is to a very large extent done by connections, and the file
function, which is used to open a connection to a file (or a URL), is able to identify the
compression used from the magic header of the file.
The type of compression which has been supported for longest is gzip compression, and that
remains a good general compromise. Files compressed by the earlier Unix compress utility can
also be read, but these are becoming rare. Two other forms of compression, those of the bzip2
and xz utilities are also available. These generally achieve higher rates of compression
(depending on the file, much higher) at the expense of slower decompression and much slower
compression.
There is some confusion between xz and lzma compression (see http://en.wikipedia.org/wiki/Xz
and http://en.wikipedia.org/wiki/LZMA): R can read files compressed by most versions of either.
File archives are single files which contain a collection of files, the most common ones being
tarballs and zip files as used to distribute R packages. R can list and unpack both (see functions
untar and unzip) and create both (for zip with the help of an external program).
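Transparent reading of a compressed file can be checked with a small round trip. The sketch writes two lines through a gzfile connection to a temporary file and reads them back with readLines, which opens an ordinary file connection; the file contents and variable names are arbitrary.

```r
gz_path <- tempfile(fileext = ".gz")     # throwaway file name
con <- gzfile(gz_path, "w")              # connection that writes gzip-compressed text
writeLines(c("first line", "second line"), con)
close(con)

# readLines() opens a plain file connection, which should recognise
# the gzip magic header and decompress while reading.
lines_back <- readLines(gz_path)
unlink(gz_path)
```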

Appendix A A sample session
The following session is intended to introduce to you some features of the R environment by
using them. Many features of the system will be unfamiliar and puzzling at first, but this
puzzlement will soon disappear.
Start R appropriately for your platform (see Invoking R).

The R program begins, with a banner.
(Within R code, the prompt on the left hand side will not be shown to avoid confusion.)
help.start()

Start the HTML interface to on-line help (using a web browser available at your machine).
You should briefly explore the features of this facility with the mouse.
Iconify the help window and move on to the next part.
x <- rnorm(50)
y <- rnorm(x)

Generate two pseudo-random normal vectors of x- and y-coordinates.
plot(x, y)

Plot the points in the plane. A graphics window will appear automatically.
ls()

See which R objects are now in the R workspace.
rm(x, y)

Remove objects no longer needed. (Clean up).
x <- 1:20

Make x = (1, 2, ..., 20).
w <- 1 + sqrt(x)/2

A weight vector of standard deviations.
dummy <- data.frame(x=x, y= x + rnorm(x)*w)
dummy

Make a data frame of two columns, x and y, and look at it.
fm <- lm(y ~ x, data=dummy)
summary(fm)

Fit a simple linear regression and look at the analysis. With y to the left of the tilde, we are
modelling y dependent on x.
fm1 <- lm(y ~ x, data=dummy, weights=1/w^2)
summary(fm1)

Since we know the standard deviations, we can do a weighted regression.
attach(dummy)

Make the columns in the data frame visible as variables.
lrf <- lowess(x, y)

Make a nonparametric local regression function.
plot(x, y)

Standard point plot.
lines(x, lrf$y)

Add in the local regression.
abline(0, 1, lty=3)

The true regression line: (intercept 0, slope 1).
abline(coef(fm))

Unweighted regression line.
abline(coef(fm1), col = "red")

Weighted regression line.
detach()

Remove data frame from the search path.
plot(fitted(fm), resid(fm),
     xlab="Fitted values",
     ylab="Residuals",
     main="Residuals vs Fitted")

A standard regression diagnostic plot to check for heteroscedasticity. Can you see it?
qqnorm(resid(fm), main="Residuals Rankit Plot")

A normal scores plot to check for skewness, kurtosis and outliers. (Not very useful here.)
rm(fm, fm1, lrf, x, dummy)

Clean up again.
The next section will look at data from the classical experiment of Michelson to measure the
speed of light. This dataset is available in the morley object, but we will read it to illustrate the
read.table function.
filepath <- system.file("data", "morley.tab", package="datasets")
filepath

Get the path to the data file.
file.show(filepath)

Optional. Look at the file.
mm <- read.table(filepath)
mm
Read in the Michelson data as a data frame, and look at it. There are five experiments
(column Expt) and each has 20 runs (column Run) and Speed is the recorded speed of light,
suitably coded.
mm$Expt <- factor(mm$Expt)
mm$Run <- factor(mm$Run)

Change Expt and Run into factors.
attach(mm)

Make the data frame visible at position 2 (the default).
plot(Expt, Speed, main="Speed of Light Data", xlab="Experiment No.")

Compare the five experiments with simple boxplots.
fm <- aov(Speed ~ Run + Expt, data=mm)
summary(fm)

Analyze as a randomized block, with runs and experiments as factors.
fm0 <- update(fm, . ~ . - Run)
anova(fm0, fm)

Fit the sub-model omitting runs, and compare using a formal analysis of variance.
detach()
rm(fm, fm0)

Clean up before moving on.
We now look at some more graphical features: contour and image plots.
x <- seq(-pi, pi, len=50)
y <- x

x is a vector of 50 equally spaced values in the interval [-pi, pi]. y is the same.
f <- outer(x, y, function(x, y) cos(y)/(1 + x^2))

f is a square matrix, with rows and columns indexed by x and y respectively, of values of
the function cos(y)/(1 + x^2).
oldpar <- par(no.readonly = TRUE)
par(pty="s")

Save the plotting parameters and set the plotting region to square.
contour(x, y, f)
contour(x, y, f, nlevels=15, add=TRUE)

Make a contour map of f; add in more lines for more detail.
fa <- (f-t(f))/2

fa is the asymmetric part of f. (t() is transpose).
contour(x, y, fa, nlevels=15)

Make a contour plot,
par(oldpar)

and restore the old graphics parameters.
image(x, y, f)
image(x, y, fa)

Make some high density image plots, (of which you can get hardcopies if you wish),
objects(); rm(x, y, f, fa)
and clean up before moving on.
R can do complex arithmetic, also.
th <- seq(-pi, pi, len=100)
z <- exp(1i*th)

1i is used for the complex number i.
par(pty="s")
plot(z, type="l")

Plotting complex arguments means plot imaginary versus real parts. This should be a
circle.
w <- rnorm(100) + rnorm(100)*1i

Suppose we want to sample points within the unit circle. One method would be to take
complex numbers with standard normal real and imaginary parts
w <- ifelse(Mod(w) > 1, 1/w, w)

and to map any outside the circle onto their reciprocal.
plot(w, xlim=c(-1,1), ylim=c(-1,1), pch="+", xlab="x", ylab="y")
lines(z)

All points are inside the unit circle, but the distribution is not uniform.
w <- sqrt(runif(100))*exp(2*pi*runif(100)*1i)
plot(w, xlim=c(-1,1), ylim=c(-1,1), pch="+", xlab="x", ylab="y")
lines(z)

The second method uses the uniform distribution. The points should now look more
evenly spaced over the disc.
rm(th, w, z)

Clean up again.
q()

Quit the R program. You will be asked if you want to save the R workspace, and for an
exploratory session like this, you probably do not want to save it.

Appendix B Invoking R
Users of R on Windows or OS X should read the OS-specific section first, but command-line use
is also supported.
Invoking R from the command line:
Invoking R under Windows:

Invoking R under OS X:

Scripting with R:

B.1 Invoking R from the command line
When working at a command line on UNIX or Windows, the command R can be used both for
startingthemainRprogramintheform
R[options][<infile][>outfile],

or,viatheRCMDinterface,asawrappertovariousRtools(e.g.,forprocessingfilesinR
documentationformatormanipulatingaddonpackages)whicharenotintendedtobecalled
directly.
AttheWindowscommandline,Rterm.exeispreferredtoR.
YouneedtoensurethateithertheenvironmentvariableTMPDIRisunsetoritpointstoavalid
placetocreatetemporaryfilesanddirectories.
Most options control what happens at the beginning and at the end of an R session. The startup
mechanism is as follows (see also the on-line help for topic Startup for more information, and
the section below for some Windows-specific details).
Unless --no-environ was given, R searches for user and site files to process for setting
environment variables. The name of the site file is the one pointed to by the environment
variable R_ENVIRON; if this is unset, R_HOME/etc/Renviron.site is used (if it exists). The
user file is the one pointed to by the environment variable R_ENVIRON_USER if this is set;
otherwise, files .Renviron in the current or in the user's home directory (in that order) are
searched for. These files should contain lines of the form name=value. (See
help("Startup") for a precise description.) Variables you might want to set include
R_PAPERSIZE (the default paper size), R_PRINTCMD (the default print command) and R_LIBS
(specifies the list of R library trees searched for add-on packages).
Then R searches for the site-wide startup profile unless the command line option
--no-site-file was given. The name of this file is taken from the value of the R_PROFILE
environment variable. If that variable is unset, the default R_HOME/etc/Rprofile.site is
used if this exists.
Then, unless --no-init-file was given, R searches for a user profile and sources it. The
name of this file is taken from the environment variable R_PROFILE_USER; if unset, a file
called .Rprofile in the current directory or in the user's home directory (in that order) is
searched for.
It also loads a saved workspace from file .RData in the current directory if there is one
(unless --no-restore or --no-restore-data was specified).
Finally, if a function .First() exists, it is executed. This function (as well as .Last()
which is executed at the end of the R session) can be defined in the appropriate startup
profiles, or reside in .RData.
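
To see which of these files would come into play on a given machine, the search order above can be sketched in R itself. The helper name startup_candidates() and the role labels are assumptions made for illustration. The function only reports candidate paths and whether they exist: it does not read or source anything, and it does not reproduce every detail of the real startup code (see help("Startup") for the precise rules). Note that when R_ENVIRON_USER or R_PROFILE_USER is set, the .Renviron or .Rprofile candidates listed after it are not searched.

```r
# Sketch: list the startup files R would look for, in the order described above.
startup_candidates <- function() {
  # Value of an environment variable, or a default when unset or empty.
  env_or <- function(name, default) {
    value <- Sys.getenv(name, unset = NA)
    if (is.na(value) || !nzchar(value)) default else value
  }
  files <- c(
    site_environ = env_or("R_ENVIRON", file.path(R.home("etc"), "Renviron.site")),
    user_environ = env_or("R_ENVIRON_USER", NA_character_),
    cwd_environ  = file.path(getwd(), ".Renviron"),
    home_environ = path.expand("~/.Renviron"),
    site_profile = env_or("R_PROFILE", file.path(R.home("etc"), "Rprofile.site")),
    user_profile = env_or("R_PROFILE_USER", NA_character_),
    cwd_profile  = file.path(getwd(), ".Rprofile"),
    home_profile = path.expand("~/.Rprofile"),
    workspace    = file.path(getwd(), ".RData")
  )
  data.frame(role = names(files), path = unname(files),
             exists = !is.na(files) & file.exists(ifelse(is.na(files), "", files)),
             row.names = NULL, stringsAsFactors = FALSE)
}

print(startup_candidates())
```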
In addition, there are options for controlling the memory available to the R process (see the
on-line help for topic Memory for more information). Users will not normally need to use these
unless they are trying to limit the amount of memory used by R.
R accepts the following command-line options.
--help
-h

Print short help message to standard output and exit successfully.
--version

Print version information to standard output and exit successfully.
--encoding=enc

Specify the encoding to be assumed for input from the console or stdin. This needs to be
an encoding known to iconv: see its help page. (--encoding enc is also accepted.) The
input is re-encoded to the locale R is running in and needs to be representable in the
latter's encoding (so e.g. you cannot re-encode Greek text in a French locale unless that
locale uses the UTF-8 encoding).
RHOME

Print the path to the R "home directory" to standard output and exit successfully. Apart
from the front-end shell script and the man page, R installation puts everything
(executables, packages, etc.) into this directory.
--save
--no-save

Control whether data sets should be saved or not at the end of the R session. If neither is
given in an interactive session, the user is asked for the desired behavior when ending the
session with q(); in non-interactive use one of these must be specified or implied by some
other option (see below).
--no-environ

Do not read any user file to set environment variables.
--no-site-file

Do not read the site-wide profile at startup.
--no-init-file

Do not read the user's profile at startup.
--restore
--no-restore
--no-restore-data

Control whether saved images (file .RData in the directory where R was started) should be
restored at startup or not. The default is to restore. (--no-restore implies all the specific
--no-restore-* options.)
--no-restore-history

Control whether the history file (normally file .Rhistory in the directory where R was
started, but can be set by the environment variable R_HISTFILE) should be restored at
startup or not. The default is to restore.
--no-Rconsole

(Windows only) Prevent loading the Rconsole file at startup.
--vanilla

Combine --no-save, --no-environ, --no-site-file, --no-init-file and --no-restore.
Under Windows, this also includes --no-Rconsole.
-f file
--file=file

(not Rgui.exe) Take input from file: '-' means stdin. Implies --no-save unless --save has
been set. On a Unix-alike, shell metacharacters should be avoided in file (but spaces are
allowed).
-e expression

(not Rgui.exe) Use expression as an input line. One or more -e options can be used, but
not together with -f or --file. Implies --no-save unless --save has been set. (There is a
limit of 10,000 bytes on the total length of expressions used in this way. Expressions
containing spaces or shell metacharacters will need to be quoted.)
--no-readline

(UNIX only) Turn off command-line editing via readline. This is useful when running R
from within Emacs using the ESS ("Emacs Speaks Statistics") package. See The
command-line editor, for more information. Command-line editing is enabled for default
interactive use (see --interactive). This option also affects tilde-expansion: see the help
for path.expand.
--min-vsize=N
--min-nsize=N

For expert use only: set the initial trigger sizes for garbage collection of vector heap (in
bytes) and cons cells (number) respectively. Suffix 'M' specifies megabytes or millions of
cells respectively. The defaults are 6Mb and 350k respectively and can also be set by
environment variables R_NSIZE and R_VSIZE.
--max-ppsize=N

Specify the maximum size of the pointer protection stack as N locations. This defaults to
10000, but can be increased to allow large and complicated calculations to be done.
Currently the maximum value accepted is 100000.
--max-mem-size=N

(Windows only) Specify a limit for the amount of memory to be used both for R objects
and working areas. This is set by default to the smaller of the amount of physical RAM in
the machine and for 32-bit R, 1.5Gb (footnote 26), and must be between 32Mb and the maximum
allowed on that version of Windows.
--quiet
--silent
-q

Do not print out the initial copyright and welcome messages.
--slave

Make R run as quietly as possible. This option is intended to support programs which use
R to compute results for them. It implies --quiet and --no-save.
--interactive

(UNIX only) Assert that R really is being run interactively even if input has been
redirected: use if input is from a FIFO or pipe and fed from an interactive program. (The
default is to deduce that R is being run interactively if and only if stdin is connected to a
terminal or pty.) Using -e, -f or --file asserts non-interactive use even if --interactive
is given.
Note that this does not turn on command-line editing.
--ess

(Windows only) Set Rterm up for use by R-inferior-mode in ESS, including asserting
interactive use (without the command-line editor) and no buffering of stdout.
--verbose

Print more information about progress, and in particular set R's option verbose to TRUE. R
code uses this option to control the printing of diagnostic messages.
--debugger=name
-d name

(UNIX only) Run R through debugger name. For most debuggers (the exceptions are
valgrind and recent versions of gdb), further command line options are disregarded, and
should instead be given when starting the R executable from inside the debugger.
--gui=type
-g type

(UNIX only) Use type as graphical user interface (note that this also includes interactive
graphics). Currently, possible values for type are 'X11' (the default) and, provided that
Tcl/Tk support is available, 'Tk'. (For back-compatibility, 'x11' and 'tk' are accepted.)
--arch=name

(UNIX only) Run the specified sub-architecture.
--args

This flag does nothing except cause the rest of the command line to be skipped: this can be
useful to retrieve values from it with commandArgs(TRUE).
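
The effect of --args can be pictured with a small stand-in for what commandArgs(TRUE) returns: everything after the first --args on the command line. The helper trailing_args() and the sample command line below are made up for illustration; in real use commandArgs(TRUE) does this work.

```r
# Sketch: mimic what commandArgs(TRUE) keeps from a full command line.
trailing_args <- function(full) {
  pos <- match("--args", full)          # position of the first --args, or NA
  if (is.na(pos)) character(0) else full[-seq_len(pos)]
}

# A made-up command line, as commandArgs() might report it.
full <- c("R", "--slave", "--no-restore", "--args", "input.csv", "10")
print(trailing_args(full))
```

Options before --args are still interpreted by R; only the values after it are left for the script.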
Note that input and output can be redirected in the usual way (using '<' and '>'), but the line
length limit of 4095 bytes still applies. Warning and error messages are sent to the error channel
(stderr).
The command R CMD allows the invocation of various tools which are useful in conjunction with
R, but not intended to be called "directly". The general form is
R CMD command args

where command is the name of the tool and args the arguments passed on to it.
Currently, the following tools are available.
BATCH

Run R in batch mode. Runs R --restore --save with possibly further options (see
?BATCH).
COMPILE

(UNIX only) Compile C, C++, Fortran ... files for use with R.
SHLIB

Build shared library for dynamic loading.
INSTALL

Install add-on packages.
REMOVE

Remove add-on packages.
build

Build (that is, package) add-on packages.
check

Check add-on packages.
LINK

(UNIX only) Front-end for creating executable programs.
Rprof

Post-process R profiling files.
Rdconv
Rd2txt

Convert Rd format to various other formats, including HTML, LaTeX, plain text, and
extracting the examples. Rd2txt can be used as shorthand for Rdconv -t txt.
Rd2pdf

Convert Rd format to PDF.
Stangle

Extract S/R code from Sweave or other vignette documentation.
Sweave

Process Sweave or other vignette documentation.
Rdiff

Diff R output ignoring headers etc.
config

Obtain configuration information.
javareconf

(Unix only) Update the Java configuration variables.
rtags

(Unix only) Create Emacs-style tag files from C, R, and Rd files.
open

(Windows only) Open a file via Windows' file associations.
texify

(Windows only) Process (La)TeX files with R's style files.
Use
R CMD command --help

to obtain usage information for each of the tools accessible via the R CMD interface.
In addition, you can use options --arch=, --no-environ, --no-init-file, --no-site-file and
--vanilla between R and CMD: these affect any R processes run by the tools. (Here --vanilla is
equivalent to --no-environ --no-site-file --no-init-file.) However, note that R CMD does not
of itself use any R startup files (in particular, neither user nor site Renviron files), and all of the R
processes run by these tools (except BATCH) use --no-restore. Most use --vanilla and so invoke
no R startup files: the current exceptions are INSTALL, REMOVE, Sweave and SHLIB (which uses
--no-site-file --no-init-file).
The same interface can also be used in the form
R CMD cmd args

for any other executable cmd on the path or given by an absolute filepath: this is useful to have
the same environment as R or the specific commands run under, for example to run ldd or
pdflatex. Under Windows cmd can be an executable or a batch file, or if it has extension .sh or
.pl the appropriate interpreter (if available) is called to run it.
B.2 Invoking R under Windows
There are two ways to run R under Windows. Within a terminal window (e.g. cmd.exe or a more
capable shell), the methods described in the previous section may be used, invoking by R.exe or
more directly by Rterm.exe. For interactive use, there is a console-based GUI (Rgui.exe).
The startup procedure under Windows is very similar to that under UNIX, but references to the
"home directory" need to be clarified, as this is not always defined on Windows. If the
environment variable R_USER is defined, that gives the home directory. Next, if the environment
variable HOME is defined, that gives the home directory. After those two user-controllable settings,
R tries to find system defined home directories. It first tries to use the Windows "personal"
directory (typically C:\Documents and Settings\username\My Documents in Windows XP). If that
fails, and environment variables HOMEDRIVE and HOMEPATH are defined (and they normally are)
these define the home directory. Failing all those, the home directory is taken to be the starting
directory.
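
The lookup order just described can be written down as a small pure function. This is only a sketch of the documented order, not R's internal code: resolve_home(), its arguments and the sample paths are hypothetical, and the Windows "personal" directory is passed in as an argument because it is not available from environment variables alone.

```r
# Sketch of the lookup order described above; not R's internal code.
# env:          named character vector of environment variables.
# personal_dir: the Windows "personal" directory, or NA if it cannot be found.
# start_dir:    the directory in which R was started.
resolve_home <- function(env, personal_dir = NA_character_, start_dir = ".") {
  get <- function(name) {
    value <- env[name]
    if (is.na(value) || !nzchar(value)) NA_character_ else unname(value)
  }
  if (!is.na(get("R_USER"))) return(get("R_USER"))
  if (!is.na(get("HOME")))   return(get("HOME"))
  if (!is.na(personal_dir))  return(personal_dir)
  if (!is.na(get("HOMEDRIVE")) && !is.na(get("HOMEPATH")))
    return(paste0(get("HOMEDRIVE"), get("HOMEPATH")))
  start_dir
}

env <- c(HOMEDRIVE = "C:", HOMEPATH = "\\Users\\alice")
print(resolve_home(env))                        # falls through to HOMEDRIVE + HOMEPATH
print(resolve_home(c(env, HOME = "D:\\work")))  # HOME takes precedence
```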
You need to ensure that the environment variables TMPDIR, TMP and TEMP are either unset or
one of them points to a valid place to create temporary files and directories.
Environment variables can be supplied as 'name=value' pairs on the command line.
If there is an argument ending .RData (in any case) it is interpreted as the path to the workspace
to be restored: it implies --restore and sets the working directory to the parent of the named file.
(This mechanism is used for drag-and-drop and file association with RGui.exe, but also works for
Rterm.exe. If the named file does not exist it sets the working directory if the parent directory
exists.)
The following additional command-line options are available when invoking RGui.exe.
--mdi
--sdi
--no-mdi

Control whether Rgui will operate as an MDI program (with multiple child windows
within one main window) or an SDI application (with multiple top-level windows for the
console, graphics and pager). The command-line setting overrides the setting in the user's
Rconsole file.
--debug

Enable the "Break to debugger" menu item in Rgui, and trigger a break to the debugger
during command line processing.
Under Windows with R CMD you may also specify your own .bat, .exe, .sh or .pl file. It will be
run under the appropriate interpreter (Perl for .pl) with several environment variables set
appropriately, including R_HOME, R_OSTYPE, PATH, BSTINPUTS and TEXINPUTS. For example, if you
already have latex.exe on your path, then
R CMD latex.exe mydoc

will run LaTeX on mydoc.tex, with the path to R's share/texmf macros appended to TEXINPUTS.
(Unfortunately, this does not help with the MiKTeX build of LaTeX, but R CMD texify mydoc
will work in that case.)
B.3 Invoking R under OS X
There are two ways to run R under OS X. Within a Terminal.app window by invoking R, the
methods described in the first subsection apply. There is also a console-based GUI (R.app) that by
default is installed in the Applications folder on your system. It is a standard double-clickable
OS X application.
The startup procedure under OS X is very similar to that under UNIX, but R.app does not make
use of command-line arguments. The "home directory" is the one inside the R.framework, but the
startup and current working directory are set as the user's home directory unless a different
startup directory is given in the Preferences window accessible from within the GUI.
B.4 Scripting with R
If you just want to run a file foo.R of R commands, the recommended way is to use R CMD BATCH
foo.R. If you want to run this in the background or as a batch job use OS-specific facilities to do
so: for example in most shells on Unix-alike OSes R CMD BATCH foo.R & runs a background job.
You can pass parameters to scripts via additional arguments on the command line: for example
(where the exact quoting needed will depend on the shell in use)
R CMD BATCH "--args arg1 arg2" foo.R &

will pass arguments to a script which can be retrieved as a character vector by
args <- commandArgs(TRUE)
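
Because the arguments arrive as character strings, a script usually has to convert and check them before use. A minimal sketch follows; the function parse_args() and the two parameters n and rate are hypothetical and stand for whatever the script actually expects.

```r
# Sketch: turn the character vector from commandArgs(TRUE) into typed values.
# The two parameters (n, rate) and their defaults are hypothetical.
parse_args <- function(args, defaults = c(n = 10, rate = 0.5)) {
  values <- defaults
  if (length(args) > length(defaults))
    stop("expected at most ", length(defaults), " arguments")
  supplied <- suppressWarnings(as.numeric(args))
  if (anyNA(supplied)) stop("arguments must be numeric")
  values[seq_along(supplied)] <- supplied   # positional: first n, then rate
  values
}

# In a script this would be: params <- parse_args(commandArgs(TRUE))
params <- parse_args(c("25", "0.1"))
print(params)
```

Arguments that are left out keep their defaults, so the same script can be run with zero, one or two values.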

This is made simpler by the alternative front-end Rscript, which can be invoked by
Rscript foo.R arg1 arg2

and this can also be used to write executable script files like (at least on Unix-alikes, and in some
Windows shells)
#! /path/to/Rscript
args <- commandArgs(TRUE)
...
q(status=<exit status code>)

If this is entered into a text file runfoo and this is made executable (by chmod 755 runfoo), it can
be invoked for different arguments by
runfoo arg1 arg2

For further options see help("Rscript"). This writes R output to stdout and stderr, and this can
be redirected in the usual way for the shell running the command.
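
Since the two streams can be redirected separately, it helps to keep results and diagnostics apart inside the script. The sketch below assumes a hypothetical helper report_mean(); it relies only on the standard behaviour that cat() writes to stdout and message() writes to stderr.

```r
# Sketch: results go to stdout via cat(), diagnostics to stderr via message().
report_mean <- function(x) {
  message("processing ", length(x), " values")   # written to stderr
  cat(format(mean(x)), "\n", sep = "")           # written to stdout
  invisible(mean(x))
}

report_mean(c(2.9, 3.1, 3.4))
```

With this split, redirecting stdout to a file captures only the computed value, while progress messages still reach the terminal or a separate log.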
If you do not wish to hardcode the path to Rscript but have it in your path (which is normally the
case for an installed R except on Windows, but e.g. OS X users may need to add /usr/local/bin
to their path), use
#! /usr/bin/env Rscript
...

At least in Bourne and bash shells, the #! mechanism does not allow extra arguments like #!
/usr/bin/env Rscript --vanilla.
One thing to consider is what stdin() refers to. It is commonplace to write R scripts with
segments like
chem <- scan(n=24)
2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70

and stdin() refers to the script file to allow such traditional usage. If you want to refer to the
process's stdin, use "stdin" as a file connection, e.g. scan("stdin", ...).
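
One way to keep this distinction manageable is to write the reading code against a connection and decide separately where that connection comes from. The helper read_numbers() below is a hypothetical name; the demonstration uses an in-memory text connection as a stand-in for real input.

```r
# Sketch: read whitespace-separated numbers from any connection.
read_numbers <- function(con) {
  scan(con, what = numeric(), quiet = TRUE)
}

# Demonstration with an in-memory connection standing in for real input.
demo <- textConnection("2.90 3.10 3.40\n3.70 2.80")
x <- read_numbers(demo)
close(demo)
print(summary(x))

# In a script run by Rscript, the process's standard input would be read with
#   x <- read_numbers(file("stdin"))
```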
Another way to write executable script files (suggested by François Pinard) is to use a here
document like
#!/bin/sh
[environment variables can be set here]
R --slave [other options] <<EOF
R program goes here...
EOF

but here stdin() refers to the program source and "stdin" will not be usable.
Short scripts can be passed to Rscript on the command-line via the -e flag. (Empty scripts are
not accepted.)
Note that on a Unix-alike the input filename (such as foo.R) should not contain spaces nor shell
metacharacters.

Appendix C The command-line editor
C.1 Preliminaries
When the GNU readline library is available at the time R is configured for compilation under
UNIX, an inbuilt command line editor allowing recall, editing and re-submission of prior
commands is used. Note that other versions of readline exist and may be used by the inbuilt
command line editor: this used to happen on OS X.
It can be disabled (useful for usage with ESS (footnote 27)) using the startup option --no-readline.
Windows versions of R have somewhat simpler command-line editing: see Console under the
Help menu of the GUI, and the file README.Rterm for command-line editing under Rterm.exe.
When using R with readline capabilities, the functions described below are available, as well as
others (probably) documented in man readline or info readline on your system.
Many of these use either Control or Meta characters. Control characters, such as Control-m, are
obtained by holding the CTRL down while you press the m key, and are written as C-m below. Meta
characters, such as Meta-b, are typed by holding down META (footnote 28) and pressing b, and
written as M-b in the following. If your terminal does not have a META key enabled, you can still
type Meta characters using two-character sequences starting with ESC. Thus, to enter M-b, you could
type ESC b. The ESC character sequences are also allowed on terminals with real Meta keys. Note that
case is significant for Meta characters.
C.2 Editing actions
The R program keeps a history of the command lines you type, including the erroneous lines,
and commands in your history may be recalled, changed if necessary, and re-submitted as new
commands. In Emacs-style command-line editing any straight typing you do while in this editing
phase causes the characters to be inserted in the command you are editing, displacing any
characters to the right of the cursor. In vi mode character insertion mode is started by M-i or M-a,
characters are typed and insertion mode is finished by typing a further ESC. (The default is
Emacs-style, and only that is described here: for vi mode see the readline documentation.)
Pressing the RET command at any time causes the command to be re-submitted.
Other editing actions are summarized in the following table.
C.3 Command-line editor summary
Command recall and vertical motion
C-p

Go to the previous command (backwards in the history).
C-n

Go to the next command (forwards in the history).
C-r text

Find the last command with the text string in it.
On most terminals, you can also use the up and down arrow keys instead of C-p and C-n,
respectively.
Horizontal motion of the cursor
C-a

Go to the beginning of the command.
C-e

Go to the end of the line.
M-b

Go back one word.
M-f

Go forward one word.
C-b

Go back one character.
C-f

Go forward one character.
On most terminals, you can also use the left and right arrow keys instead of C-b and C-f,
respectively.
Editing and re-submission
text

Insert text at the cursor.
C-f text

Append text after the cursor.
DEL

Delete the previous character (left of the cursor).
C-d

Delete the character under the cursor.
M-d

Delete the rest of the word under the cursor, and "save" it.
C-k

Delete from cursor to end of command, and "save" it.
C-y

Insert (yank) the last "saved" text here.
C-t

Transpose the character under the cursor with the next.
M-l

Change the rest of the word to lower case.
M-c

Capitalize the rest of the word (in the default readline bindings, M-u changes it to upper case).
RET

Re-submit the command to R.
The final RET terminates the command line editing sequence.
The readline key bindings can be customized in the usual way via a ~/.inputrc file. These
customizations can be conditioned on application R, that is by including a section like
$if R
"\C-xd": "q('no')\n"
$endif


Appendix D Function and variable index
Index Entry: Section

!: Logical vectors
!=: Logical vectors
%*%: Multiplication
%o%: The outer product of two arrays
&: Logical vectors
&&: Conditional execution
*: Vector arithmetic
+: Vector arithmetic
-: Vector arithmetic
.: Updating fitted models
.First: Customizing the environment
.Last: Customizing the environment
/: Vector arithmetic
:: Generating regular sequences
::: Namespaces
:::: Namespaces
<: Logical vectors
<<-: Scope
<=: Logical vectors
==: Logical vectors
>: Logical vectors
>=: Logical vectors
?: Getting help
??: Getting help
^: Vector arithmetic
|: Logical vectors
||: Conditional execution
~: Formulae for statistical models

A
abline: Low-level plotting commands
ace: Some non-standard models
add1: Updating fitted models
anova: Generic functions for extracting model information
anova: ANOVA tables
aov: Analysis of variance and model comparison
aperm: Generalized transpose of an array
array: The array() function
as.data.frame: Making data frames
as.vector: The concatenation function c() with arrays
attach: attach() and detach()
attr: Getting and setting attributes
attr: Getting and setting attributes
attributes: Getting and setting attributes
attributes: Getting and setting attributes
avas: Some non-standard models
axis: Low-level plotting commands

B
boxplot: One- and two-sample tests
break: Repetitive execution
bruto: Some non-standard models

C
c: Vectors and assignment
c: Character vectors
c: The concatenation function c() with arrays
c: Concatenating lists
C: Contrasts
cbind: Forming partitioned matrices
coef: Generic functions for extracting model information
coefficients: Generic functions for extracting model information
contour: Display graphics
contrasts: Contrasts
coplot: Displaying multivariate data
cos: Vector arithmetic
crossprod: Index matrices
crossprod: Multiplication
cut: Frequency tables from factors

D
data: Accessing builtin datasets
data.frame: Making data frames
density: Examining the distribution of a set of data
det: Singular value decomposition and determinants
detach: attach() and detach()
determinant: Singular value decomposition and determinants
dev.list: Multiple graphics devices
dev.next: Multiple graphics devices
dev.off: Multiple graphics devices
dev.prev: Multiple graphics devices
dev.set: Multiple graphics devices
deviance: Generic functions for extracting model information
diag: Multiplication
dim: Arrays
dotchart: Display graphics
drop1: Updating fitted models

E
ecdf: Examining the distribution of a set of data
edit: Editing data
eigen: Eigenvalues and eigenvectors
else: Conditional execution
Error: Analysis of variance and model comparison
example: Getting help
exp: Vector arithmetic

F
F: Logical vectors
factor: Factors
FALSE: Logical vectors
fivenum: Examining the distribution of a set of data
for: Repetitive execution
formula: Generic functions for extracting model information
function: Writing your own functions

G
getAnywhere: Object orientation
getS3method: Object orientation
glm: The glm() function

H
help: Getting help
help: Getting help
help.search: Getting help
help.start: Getting help
hist: Examining the distribution of a set of data
hist: Display graphics

I
identify: Interacting with graphics
if: Conditional execution
if: Conditional execution
ifelse: Conditional execution
image: Display graphics
is.na: Missing values
is.nan: Missing values

J
jpeg: Device drivers

K
ks.test: Examining the distribution of a set of data

L
legend: Low-level plotting commands
length: Vector arithmetic
length: The intrinsic attributes mode and length
levels: Factors
lines: Low-level plotting commands
list: Lists
lm: Linear models
lme: Some non-standard models
locator: Interacting with graphics
loess: Some non-standard models
loess: Some non-standard models
log: Vector arithmetic
lqs: Some non-standard models
lsfit: Least squares fitting and the QR decomposition

M
mars: Some non-standard models
max: Vector arithmetic
mean: Vector arithmetic
methods: Object orientation
min: Vector arithmetic
mode: The intrinsic attributes mode and length

N
NA: Missing values
NaN: Missing values
ncol: Matrix facilities
next: Repetitive execution
nlm: Nonlinear least squares and maximum likelihood models
nlm: Least squares
nlm: Maximum likelihood
nlme: Some non-standard models
nlminb: Nonlinear least squares and maximum likelihood models
nrow: Matrix facilities

O
optim: Nonlinear least squares and maximum likelihood models
order: Vector arithmetic
ordered: Ordered factors
ordered: Ordered factors
outer: The outer product of two arrays

P
pairs: Displaying multivariate data
par: The par() function
paste: Character vectors
pdf: Device drivers
persp: Display graphics
plot: Generic functions for extracting model information
plot: The plot() function
pmax: Vector arithmetic
pmin: Vector arithmetic
png: Device drivers
points: Low-level plotting commands
polygon: Low-level plotting commands
postscript: Device drivers
predict: Generic functions for extracting model information
print: Generic functions for extracting model information
prod: Vector arithmetic

Q
qqline: Examining the distribution of a set of data
qqline: Display graphics
qqnorm: Examining the distribution of a set of data
qqnorm: Display graphics
qqplot: Display graphics
qr: Least squares fitting and the QR decomposition
quartz: Device drivers

R
range: Vector arithmetic
rbind: Forming partitioned matrices
read.table: The read.table() function
rep: Generating regular sequences
repeat: Repetitive execution
resid: Generic functions for extracting model information
residuals: Generic functions for extracting model information
rlm: Some non-standard models
rm: Data permanency and removing objects

S
scan: The scan() function
sd: The function tapply() and ragged arrays
search: Managing the search path
seq: Generating regular sequences
shapiro.test: Examining the distribution of a set of data
sin: Vector arithmetic
sink: Executing commands from or diverting output to a file
solve: Linear equations and inversion
sort: Vector arithmetic
source: Executing commands from or diverting output to a file
split: Repetitive execution
sqrt: Vector arithmetic
stem: Examining the distribution of a set of data
step: Generic functions for extracting model information
step: Updating fitted models
sum: Vector arithmetic
summary: Examining the distribution of a set of data
summary: Generic functions for extracting model information
svd: Singular value decomposition and determinants

T
T: Logical vectors
t: Generalized transpose of an array
t.test: One- and two-sample tests
table: Index matrices
table: Frequency tables from factors
tan: Vector arithmetic
tapply: The function tapply() and ragged arrays
text: Low-level plotting commands
title: Low-level plotting commands
tree: Some non-standard models
TRUE: Logical vectors

U
unclass: The class of an object
update: Updating fitted models

V
var: Vector arithmetic
var: The function tapply() and ragged arrays
var.test: One- and two-sample tests
vcov: Generic functions for extracting model information
vector: Vectors and assignment

W
while: Repetitive execution
wilcox.test: One- and two-sample tests
windows: Device drivers

X
X11: Device drivers

Appendix E Concept index
Index Entry: Section

A
Accessing builtin datasets: Accessing builtin datasets
Additive models: Some non-standard models
Analysis of variance: Analysis of variance and model comparison
Arithmetic functions and operators: Vector arithmetic
Arrays: Arrays
Assignment: Vectors and assignment
Attributes: Objects

B
Binary operators: Defining new binary operators
Box plots: One- and two-sample tests

C
Character vectors: Character vectors
Classes: The class of an object
Classes: Object orientation
Concatenating lists: Concatenating lists
Contrasts: Contrasts
Control statements: Control statements
CRAN: Contributed packages and CRAN
Customizing the environment: Customizing the environment

D
Data frames: Data frames
Default values: Named arguments and defaults
Density estimation: Examining the distribution of a set of data
Determinants: Singular value decomposition and determinants
Diverting input and output: Executing commands from or diverting output to a file
Dynamic graphics: Dynamic graphics

E
Eigenvalues and eigenvectors: Eigenvalues and eigenvectors
Empirical CDFs: Examining the distribution of a set of data

F
Factors: Factors
Factors: Contrasts
Families: Families
Formulae: Formulae for statistical models

G
Generalized linear models: Generalized linear models
Generalized transpose of an array: Generalized transpose of an array
Generic functions: Object orientation
Graphics device drivers: Device drivers
Graphics parameters: The par() function
Grouped expressions: Grouped expressions

I
Indexing of and by arrays: Array indexing
Indexing vectors: Index vectors

K
Kolmogorov-Smirnov test: Examining the distribution of a set of data

L
Least squares fitting: Least squares fitting and the QR decomposition
Linear equations: Linear equations and inversion
Linear models: Linear models
Lists: Lists
Local approximating regressions: Some non-standard models
Loops and conditional execution: Loops and conditional execution

M
Matrices: Arrays
Matrix multiplication: Multiplication
Maximum likelihood: Maximum likelihood
Missing values: Missing values
Mixed models: Some non-standard models

N
Named arguments: Named arguments and defaults
Namespace: Namespaces
Nonlinear least squares: Nonlinear least squares and maximum likelihood models

O
Object orientation: Object orientation
Objects: Objects
One- and two-sample tests: One- and two-sample tests
Ordered factors: Factors
Ordered factors: Contrasts
Outer products of arrays: The outer product of two arrays

P
Packages: R and statistics
Packages: Packages
Probability distributions: Probability distributions

Q
QR decomposition: Least squares fitting and the QR decomposition
Quantile-quantile plots: Examining the distribution of a set of data

R
Reading data from files: Reading data from files
Recycling rule: Vector arithmetic
Recycling rule: The recycling rule
Regular sequences: Generating regular sequences
Removing objects: Data permanency and removing objects
Robust regression: Some non-standard models

S
Scope: Scope
Search path: Managing the search path
Shapiro-Wilk test: Examining the distribution of a set of data
Singular value decomposition: Singular value decomposition and determinants
Statistical models: Statistical models in R
Student's t test: One- and two-sample tests

T
Tabulation: Frequency tables from factors
Tree-based models: Some non-standard models

U
Updating fitted models: Updating fitted models

V
Vectors: Simple manipulations; numbers and vectors

W
Wilcoxon test: One- and two-sample tests
Workspace: Data permanency and removing objects
Writing functions: Writing your own functions

Appendix F References
D. M. Bates and D. G. Watts (1988), Nonlinear Regression Analysis and Its Applications. John
Wiley & Sons, New York.
Richard A. Becker, John M. Chambers and Allan R. Wilks (1988), The New S Language.
Chapman & Hall, New York. This book is often called the "Blue Book".
John M. Chambers and Trevor J. Hastie eds. (1992), Statistical Models in S. Chapman & Hall,
New York. This is also called the "White Book".
John M. Chambers (1998), Programming with Data. Springer, New York. This is also called the
"Green Book".
A. C. Davison and D. V. Hinkley (1997), Bootstrap Methods and Their Applications. Cambridge
University Press.
Annette J. Dobson (1990), An Introduction to Generalized Linear Models. Chapman and Hall,
London.
Peter McCullagh and John A. Nelder (1989), Generalized Linear Models. Second edition,
Chapman and Hall, London.
John A. Rice (1995), Mathematical Statistics and Data Analysis. Second edition. Duxbury Press,
Belmont, CA.
S. D. Silvey (1970), Statistical Inference. Penguin, London.

Footnotes

(1)
ACMSoftwareSystemsaward,1998:
http://awards.acm.org/award_winners/chambers_6640862.cfm.
(2)
For portable R code (including that to be used in R packages) only A-Za-z0-9 should be used.
(3)
not inside strings, nor within the argument list of a function definition
(4)
some of the consoles will not allow you to enter more, and amongst those which do some will
silently discard the excess and some will use it as the start of the next line.
(5)
of unlimited length.
(6)
The leading dot in this file name makes it invisible in normal file listings in UNIX, and in
default GUI file listings on OS X and Windows.
(7)
With other than vector types of argument, such as list mode arguments, the action of c() is
rather different. See Concatenating lists.
(8)
Actually, it is still available as .Last.value before any other statements are executed.
(9)
paste(..., collapse=ss) joins the arguments into a single character string putting ss in between,
e.g., ss <- "|". There are more tools for character manipulation, see the help for sub and
substring.
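As a small illustration of footnote 9, the following sketch joins a character vector with a separator and then tries the two helpers the footnote mentions. The sample strings and the variable names (labs, joined, first4, swapped) are made up for this example and do not come from the manual.

```r
# Separator used between the joined pieces, as in the footnote.
ss <- "|"

# A hypothetical character vector to collapse into one string.
labs <- c("X", "Y", "Z")
joined <- paste(labs, collapse = ss)
print(joined)

# substring() extracts characters by position; sub() replaces the first match.
first4  <- substring("statistics", 1, 4)
swapped <- sub("a", "A", "banana")
print(first4)
print(swapped)
```

Note that sub() only replaces the first match; gsub() would replace every occurrence.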

(10)
numeric mode is actually an amalgam of two distinct modes, namely integer and double
precision, as explained in the manual.
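The distinction in footnote 10 can be seen by comparing mode() with typeof(). This is only a sketch; the variable names d and i are chosen for illustration.

```r
d <- 3     # a plain numeric constant is stored as double precision
i <- 3L    # the L suffix requests integer storage

# Both values report the same mode ...
print(mode(d))
print(mode(i))

# ... but typeof() reveals the underlying storage type.
print(typeof(d))
print(typeof(i))
```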
(11)
Note however that length(object) does not always contain intrinsic useful information, e.g.,
when object is a function.
(12)
In general, coercion from numeric to character and back again will not be exactly reversible,
because of roundoff errors in the character representation.
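A quick way to see the effect described in footnote 12 is to round-trip a value that has no short decimal representation. The value 1/3 and the names s and back are assumptions made for this sketch; on a typical R build the exact comparison is expected to fail while the tolerant one succeeds, but the details depend on how many digits as.character() emits.

```r
x <- 1/3
s <- as.character(x)     # decimal text with a limited number of significant digits
back <- as.numeric(s)    # parse the text back into a double

# Exact comparison of the round-tripped value with the original.
print(identical(back, x))

# Comparison up to a small numerical tolerance.
print(isTRUE(all.equal(back, x)))
```

For this reason all.equal() is usually the safer test when values have passed through a character representation.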
(13)
A different style using 'formal' or 'S4' classes is provided in package methods.
(14)
Readers should note that there are eight states and territories in Australia, namely the Australian
Capital Territory, New South Wales, the Northern Territory, Queensland, South Australia,
Tasmania, Victoria and Western Australia.
(15)
Note that tapply() also works in this case when its second argument is not a factor, e.g.,
tapply(incomes, state), and this is true for quite a few other functions, since arguments are
coerced to factors when necessary (using as.factor()).
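The coercion mentioned in footnote 15 can be checked directly. The short state and incomes vectors below are invented for this sketch (the manual uses longer ones), and mean is supplied as the summary function for illustration.

```r
# Hypothetical data: a character vector of state codes and matching incomes.
state   <- c("tas", "sa", "qld", "nsw", "nsw", "sa")
incomes <- c(60, 49, 40, 61, 64, 60)

# Explicit factor, as in the main text.
statef <- factor(state)
m_factor <- tapply(incomes, statef, mean)

# Plain character vector: tapply() coerces it with as.factor().
m_plain <- tapply(incomes, state, mean)

print(m_plain)
print(isTRUE(all.equal(m_factor, m_plain)))
```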
(16)
Note that x %*% x is ambiguous, as it could mean either x'x or xx', where x is the column form.
In such cases the smaller matrix seems implicitly to be the interpretation adopted, so the scalar
x'x is in this case the result. The matrix xx' may be calculated either by cbind(x) %*% x or x %*%
rbind(x) since the result of rbind() or cbind() is always a matrix. However, the best way to
compute x'x or xx' is crossprod(x) or x %o% x respectively.
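The alternatives listed in footnote 16 can be compared side by side. This is a minimal sketch with an arbitrary three-element vector; the result names (inner, outer1 and so on) are chosen for illustration only.

```r
x <- c(1, 2, 3)              # a plain vector with no dim attribute

inner  <- x %*% x            # interpreted as x'x, a 1 x 1 matrix
inner2 <- crossprod(x)       # the recommended way to compute x'x

outer1 <- cbind(x) %*% x     # cbind(x) is 3 x 1, so x is taken as a row: xx'
outer2 <- x %*% rbind(x)     # rbind(x) is 1 x 3, so x is taken as a column: xx'
outer3 <- x %o% x            # the recommended way to compute xx'

print(dim(inner))
print(dim(outer3))
print(all(as.vector(outer1) == as.vector(outer3)))
```

The element-wise comparison goes through as.vector() because cbind() and rbind() attach dimnames that x %o% x does not carry.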
(17)
Even better would be to form a matrix square root B with A = BB' and find the squared length of
the solution of By = x, perhaps using the Cholesky or eigen decomposition of A.
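One way to carry out the suggestion in footnote 17 is sketched below using the Cholesky factor. The matrix A and vector x are arbitrary values picked for this example, and the comparison against x %*% solve(A, x) assumes the quantity of interest is the quadratic form x'A^{-1}x.

```r
# A small symmetric positive definite matrix and a vector (illustrative values).
A <- matrix(c(4, 2,
              2, 3), nrow = 2, byrow = TRUE)
x <- c(1, 2)

# chol() returns the upper triangular R with R'R = A, so B = R' satisfies A = BB'.
R <- chol(A)
B <- t(R)

# Solve the lower triangular system By = x, then take the squared length of y.
y <- forwardsolve(B, x)
q_chol <- sum(y^2)

# Direct evaluation of the same quadratic form for comparison.
q_direct <- drop(x %*% solve(A, x))

print(q_chol)
print(isTRUE(all.equal(q_chol, q_direct)))
```

The triangular solve avoids forming the inverse of A explicitly.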
(18)
Conversion of character columns to factors is overridden using the stringsAsFactors argument
to the data.frame() function.
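The effect of the argument named in footnote 18 can be seen by building the same data frame twice. The column names id and grp are hypothetical. The argument is set explicitly in both calls because its default differs between R versions (it became FALSE in R 4.0.0).

```r
# Keep the character column as character.
df_chr <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                     stringsAsFactors = FALSE)

# Convert the character column to a factor.
df_fac <- data.frame(id = 1:3, grp = c("a", "b", "a"),
                     stringsAsFactors = TRUE)

print(class(df_chr$grp))
print(class(df_fac$grp))
print(levels(df_fac$grp))
```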
(19)
See the on-line help for autoload for the meaning of the second term.
(20)
Under UNIX, the utilities sed or awk can be used.
(21)
to be discussed later, or use xyplot from package lattice.
(22)
See also the methods described in Statistical models in R
(23)
In some sense this mimics the behavior in S-PLUS since in S-PLUS this operator always creates or
assigns to a global variable.
(24)
So it is hidden under UNIX.
(25)
Some graphics parameters such as the size of the current device are for information only.
(26)
2.5Gb on versions of Windows that support 3Gb per process and have the support enabled: see
the rw-FAQ Q2.9; 3.5Gb on most 64-bit versions of Windows.
(27)
The Emacs Speaks Statistics package; see the URL http://ESS.R-project.org
(28)
On a PC keyboard this is usually the Alt key, occasionally the Windows key. On a Mac
keyboard normally no meta key is available.
