Course Introduction

Learning Objectives:

• Why study time series?
• Course outline
• What is a time series?

Code files:
I will post relevant code files here

Reading:
I will post relevant readings here

Justin Slater

STAT*4360
NVIDIA stock price

Time series data are everywhere.

Data that are collected over time require specialized methods for analysis.

What are some key features of this time series?

Climate Change

From Shumway and Stoffer 2017


Opioid-related incidents

https://www.hamilton.ca/people-programs/public-health/alcohol-drugs-gambling/hamilton-opioid-information-system
Examples of time series data: Speech recognition

From Shumway and Stoffer 2017


Why study time series
Time series data are different from other data:
• X-axis is always time
• Data are in a specific order
In your other classes, many of the examples probably assumed that the data are generated from independent and identically distributed (iid) random variables.
Examples:
$X_1, \dots, X_n \overset{\text{iid}}{\sim} N(0, \sigma)$, meaning that there are $n$ random variables that are not related to each other and all have the same statistical properties.
Linear regression: $y_i \overset{\text{iid}}{\sim} N(\beta_0 + \beta_1 x_i, \sigma)$. For fixed $x$, $y_1, \dots, y_n$ are not related and have the same statistical properties.
Time series data is not iid

Adjacent time points are (cor)related

Left plot: Pearson correlation is ~ 0.7
Right plot: Pearson correlation is ~ 0
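
One way to check this numerically is to correlate each point with the point that follows it. The sketch below is written in Python with only the standard library (the course itself uses R). The dependent series is a hypothetical example, chosen so that its lag-1 correlation lands near 0.7; it is not the data behind the plots, and the function names are made up for illustration.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lag1_correlation(series):
    """Correlation between each point and the point right after it."""
    return pearson(series[:-1], series[1:])


def simulate_iid(n, rng):
    """n independent draws from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def simulate_dependent(n, phi, rng):
    """Hypothetical dependent series: x_t = phi * x_{t-1} + noise."""
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(4360)  # fixed seed so the run is reproducible
iid_series = simulate_iid(5000, rng)
dependent_series = simulate_dependent(5000, 0.7, rng)

print("lag-1 correlation, iid series:      ", round(lag1_correlation(iid_series), 2))
print("lag-1 correlation, dependent series:", round(lag1_correlation(dependent_series), 2))
```

The iid series should give a value close to 0 and the dependent series a value close to 0.7, up to sampling noise.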


Why we study time series

We study time series because points that are close in time tend to be
similar to each other.

This means that we need specialized methods to analyze such data.

Some things we may want to do with time series data are:

• Forecast. What will the temperature be over the next week?


• Look for trends over time. Are lake levels increasing year to year?
• Classification. What word is a person trying to say? We won’t do much of this.

Syllabus

Instructor: Dr. Justin Slater

[email protected]

Office hours: ???? MACN 521

Lectures: M/W/F 8:30am to 9:20am in CRSC101

Prerequisites: STAT*3240 Applied regression analysis.

You can email me, but please include STAT*4360 in the subject line or your
email may be ignored. Please allow a business day or two for me to reply.

Office hours

Quick Survey

Stats Major?

Stats minors?

Data science?

Grad students?

Who here has used R?

Who here has R-markdown experience?

Who has taken Mathematical statistics?

Who is taking Mathematical Statistics right now?

Textbooks

Required book:

Shumway, R. H., & Stoffer, D. S. Time Series Analysis and Its Applications: With R Examples, 4th edition. Springer, 2017.

Optional reference:

Brockwell, P. J., & Davis, R. A. Introduction to Time Series and Forecasting, 3rd edition. Springer, 2016.

These texts are freely available online through the UofG library.

Term tests

3 term tests, 50 minutes each. Best two are worth 12.5%, worst is worth 5%.

First test is Sept 27.

Quizzes will be a mix of conceptual questions and some light mathy questions.

Missed quiz (syllabus contains official policy)

If you miss 1 test for a legitimate reason, the other two tests are worth 15% each.

See course outline for further details.
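
As a rough illustration of the weighting above, the sketch below computes the term-test portion of the grade (out of 30 percentage points). It is in Python, the function name `term_test_contribution` is hypothetical, and it only encodes the weights stated on this slide; the syllabus remains the official policy.

```python
def term_test_contribution(scores, missed_one=False):
    """Percentage points (out of 30) earned from the term tests.

    scores: test marks as fractions between 0 and 1.
    missed_one: True if one test was missed for a legitimate reason,
                in which case `scores` holds the two tests that were written.
    """
    if missed_one:
        if len(scores) != 2:
            raise ValueError("expected two scores when one test is missed")
        return sum(15.0 * s for s in scores)  # two tests at 15% each
    if len(scores) != 3:
        raise ValueError("expected three scores")
    ordered = sorted(scores, reverse=True)  # best first, worst last
    weights = [12.5, 12.5, 5.0]  # best two at 12.5%, worst at 5%
    return sum(w * s for w, s in zip(weights, ordered))


# Three tests written: 12.5 * 0.90 + 12.5 * 0.80 + 5 * 0.60
print(term_test_contribution([0.80, 0.60, 0.90]))

# One test missed for a legitimate reason: 15 * 0.80 + 15 * 0.90
print(term_test_contribution([0.80, 0.90], missed_one=True))
```

Sorting before applying the weights means the lowest mark always receives the 5% weight, regardless of which test it came from.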

Participation

5% of your grade. In no particular order:

• Show up to class almost all the time


• Contribute to in-class discussions
• Do practice questions
• Come to office hours if you are confused

Assignments
There will be 3 assignments in this course, each worth 10%

Assignments will be a mix of math, simulation, and data analysis

I want the coding portion of your assignments to be reproducible. You should submit an .rmd file along with a compiled .pdf file. I will give you an example of how to do this.

Coding is something that you need to learn by doing. I will provide instruction
on R-coding, along with tips and tricks.

Ultimately, building your autonomy with coding just takes time and practice.

See syllabus for late policy.

Project

Proposal is due on the last day of class. Project will be due soon after. Can
decide as a group.

Proposal is worth 5%, report worth 35%

This is to be done individually.

You will be required to find a data set on your own, although I am happy to help you.

Rmarkdown should be used.

Project tips
Tips for completing project:

• Think of a topic you are interested in, or want people to think you are
interested in: Sports, Finance, public health, air quality, climate change

• Find some time series data pertaining to that thing. Use Google.
• Some scientific papers make their data available.
• Avoid websites like Kaggle.

What is a time series?

A random variable is a function from a probability space to the real line*.

Suppose we have some set of outcomes {heads, tails}. A random variable will
assign a number to these outcomes.

Ex: Suppose you toss a coin once. Define $X$ as

$X = \begin{cases} 1 & \text{if heads} \\ 0 & \text{if tails} \end{cases} \implies X(\text{heads}) = 1,\ X(\text{tails}) = 0$


If X mapped to non-numbers then we couldn’t take expectations, variances,
describe distributions, etc.
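
The coin example can be sketched in a few lines of Python (the course itself uses R). The fair-coin probabilities are an assumption added for illustration; the slide does not specify them.

```python
# A random variable is a function from outcomes to real numbers.
def X(outcome):
    return {"heads": 1, "tails": 0}[outcome]


# Assumption for illustration only: a fair coin.
prob = {"heads": 0.5, "tails": 0.5}

# Because X maps to numbers, expectations and variances are well defined.
expectation = sum(X(w) * p for w, p in prob.items())
variance = sum((X(w) - expectation) ** 2 * p for w, p in prob.items())

print(expectation, variance)  # prints: 0.5 0.25
```

If `X` returned the strings "heads" and "tails" instead of 1 and 0, the sums above would fail, which is the point made in the preceding paragraph.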

A time series is simply a sequence of random variables indexed by time.

*All random variables in this course will map to the real line, but they can map to other things.
What is a time series?
$X_t$ is the random variable at time $t$.
$\{X_t : t = 1, \dots, T\}$ is a sequence of random variables indexed by time. We call this a time series. We may shorten this to $\{X_t\}$.

However, we also call a realization of these random variables a time series, and denote it with the lowercase version of that letter:

$\{x_t : t = 1, \dots, T\}$
Hence, we use the term “time series” to denote both random variables, and
data generated from those random variables.

In this course, the indices t are all equally spaced. They will be 1,2,3… not
1,2,15,48…
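
To make the distinction between the random variables and a realization concrete, the sketch below draws two realizations from the same process. It is in Python (the course itself uses R), and the process is a hypothetical random walk chosen only for illustration; the function name `realize` is made up.

```python
import random


def realize(T, rng):
    """Draw one realization {x_t : t = 1, ..., T} of a hypothetical random walk.

    The process is X_t = X_{t-1} + W_t with W_t ~ N(0, 1) and X_0 = 0.
    """
    x = []
    level = 0.0
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)
        x.append(level)
    return x


T = 10
first = realize(T, random.Random(1))   # one realization
second = realize(T, random.Random(2))  # another realization of the same process

# The index t is equally spaced: 1, 2, ..., T.
for t, (a, b) in enumerate(zip(first, second), start=1):
    print(t, round(a, 2), round(b, 2))
```

The function `realize` plays the role of the random variables, while `first` and `second` are observed data: same process, different numbers.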
To-do’s

Read Syllabus - I will assume that you have read the syllabus by Friday
September 13 at 8:29am.

Download E-textbooks from library
