Notes Scatterplots

Uploaded by

samueljacobmiller

Linear Regression

CHAPTER 3: SCATTERPLOTS AND CORRELATION

Describing a scatterplot

Scatterplots examine the relationship between 2 quantitative variables.

Explanatory Variable: the independent variable (in this example, median income)

Response Variable: the dependent variable (in this example, crime rate)
• Scatterplots are described by 3 things: Direction, Form and Strength.
• Direction: Positive or negative
  • Positive: as the explanatory variable increases, the response variable also tends to increase (equivalently, as the explanatory variable decreases, the response variable also tends to decrease).
  • Negative: as the explanatory variable increases, the response variable tends to decrease.
• FORM: Linear or non-linear
[Figure: two example scatterplots, one linear and one non-linear]
• STRENGTH: Strong, Moderate, Weak (with r-value)
  • Correlation coefficient (r-value): the measure of the strength and direction of the linear association.
  • r-values are between -1 and 1, with 0 the weakest and 1 and -1 the strongest.
  • The r-value has no units.
[Figure: example scatterplots with r = -0.65, r = 0.99, r ≈ 0.4 and r ≈ 0]
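
To make the r-value concrete, here is a minimal Python sketch of how a correlation coefficient can be computed from paired data. The function name `pearson_r` and the data lists are made up for illustration and are not from these notes. The last lines check the "no units" property by rescaling x (as if changing units), which should leave r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (not from the notes)
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]     # perfectly linear, increasing
y_down = [10, 8, 6, 4, 2]   # perfectly linear, decreasing

r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)

# Changing the units of x (multiplying by 100) should not change r
r_rescaled = pearson_r([100 * v for v in x], y_up)

print(r_up, r_down, r_rescaled)
```

The sign of r gives the direction and its distance from 0 gives the strength; r alone says nothing about whether the form is linear, so the plot still needs to be checked.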
• Give the direction, form and strength for each:

Direction: Positive, Positive, Neither, Negative, No direction
Form: Linear, Linear, Non-linear, Linear, No form
Strength: Strong, Moderate, Strong, Strong, No correlation
• Estimate the r-value for each:

[Figure: the same five scatterplots as in the previous exercise]
• Describe the scatterplot.

There is a moderate, negative, linear relationship between median income and crime rate.
More on Correlation

• Correlation does NOT prove Causation!

• Examples:
  • There is a positive association between the amount of damage done at a fire and the number of firefighters who report to the fire.
  • Does this mean that the firefighters are causing the damage?
  • No, it could be that bigger fires cause more damage and also require more firefighters.
  • There is a positive association between the number of AP classes students take in school and their GPA in college.
  • Does this mean that if more students are encouraged to take AP classes, they will do better in college?
  • No, it could be that there is something inherently different about students who choose to take AP classes (such as high motivation) which also causes them to want to succeed in college. Students pressured to take AP classes may not necessarily do better in college.
  • There is a positive association between chocolate sales and car accidents.
  • Does this mean that buying chocolate (or eating chocolate) is causing car accidents?
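
The firefighter example can be illustrated with a small simulation. This is only a sketch under invented assumptions: the numbers, the noise levels and the variable names are made up, and the model simply assumes that fire size drives both the number of firefighters and the damage. Firefighters never appear in the damage formula, yet the two still come out strongly correlated.

```python
import math
import random

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed model: fire size is the lurking variable behind both quantities.
fire_size = [random.uniform(1, 10) for _ in range(500)]
firefighters = [3 * s + random.gauss(0, 2) for s in fire_size]   # depends only on size
damage = [10 * s + random.gauss(0, 10) for s in fire_size]       # depends only on size

r = pearson_r(firefighters, damage)
print(f"r between firefighters and damage: {r:.2f}")
```

A strong positive r appears even though, by construction, sending more firefighters has no effect on damage in this model.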
[Figure: scatterplot of calories against fat (grams)]
LSRL

• LSRL: “Least Squares Regression Line”

• y = mx + b
• $\hat{y} = a + bx$ (here a is the y-intercept and b is the slope)
• Put data in L1 and L2
• Stat, Calc, 8: Linear Regression (LinReg) L1, L2
• If r and r² do not show:
  • Go to Catalog (at the bottom), scroll until you find DiagnosticOn, then hit Enter twice, then go back and do 8: LinReg, L1, L2.
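
For readers without the calculator, the same kind of fit can be sketched in Python. This is a minimal illustration of the standard least-squares formulas, not the calculator's own routine; the function name `least_squares_line` and the two lists (named `L1` and `L2` only to mirror the calculator lists) are hypothetical and do not come from the notes.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

# Hypothetical data standing in for calculator lists L1 and L2
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]

a, b = least_squares_line(L1, L2)
print(f"y-hat = {a:.3f} + {b:.3f} x")
```

For these made-up points the intercept comes out to about 2.2 and the slope to about 0.6.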
$\hat{y} = b + mx$ (the same line as y = mx + b, with the y-intercept b written first and m the slope)

$\widehat{\text{calories}} = 22 + 9.143\,(\text{fat})$

22: y-intercept. We predict a slice of cheese with 0 grams of fat would have 22 calories.

9.143: slope. We predict the calories will increase by 9.143 for each 1 gram increase in fat.
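
The two interpretations can be checked directly from the equation. The sketch below hard-codes the intercept and slope quoted in the notes; the function and constant names are invented for illustration.

```python
INTERCEPT = 22.0   # predicted calories at 0 grams of fat (from the notes)
SLOPE = 9.143      # predicted change in calories per extra gram of fat (from the notes)

def predicted_calories(fat_grams):
    """Predicted calories for a cheese slice with the given grams of fat."""
    return INTERCEPT + SLOPE * fat_grams

# The y-intercept is the prediction when fat = 0
at_zero = predicted_calories(0)

# The slope is the change in the prediction for a 1 gram increase in fat
change_per_gram = predicted_calories(5) - predicted_calories(4)

print(at_zero, round(change_per_gram, 3))
```

Any two fat values one gram apart give the same difference, which is what "slope" means for a linear model.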
Describe the scatterplot

• Describe the scatterplot (include the r-value)

• There is a strong, positive, linear relationship between grams of fat and calories in cheese slices, with a correlation of r = 0.96.
Coefficient of Determination (r²)

• r²: the percent of variation in y that is explained by the linear model.

• 91.4% of the variation in calories can be explained by the linear model.
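
One way to see what "variation explained" means is to compare the squared prediction errors of the line with the total squared variation of y around its mean. The sketch below does this on made-up data (not the cheese data, which the notes do not list) and checks that the result matches the square of r. All names are hypothetical.

```python
import math

def fit_and_r_squared(xs, ys):
    """Fit the least-squares line and return (r, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)   # total variation in y (SST)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Variation left over after using the line (sum of squared residuals, SSE)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

    r = sxy / math.sqrt(sxx * syy)
    r_squared = 1 - sse / syy   # fraction of the variation in y explained by the line
    return r, r_squared

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r, r_squared = fit_and_r_squared(x, y)
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

For these points r² is 0.6, so the line accounts for 60% of the variation in y and the remaining 40% is left in the residuals.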
Predictions

• Use the model to predict the calories in a slice of American cheese with 6 grams of fat.
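
As a worked check of the arithmetic, substituting fat = 6 into the cheese model from the LSRL slide gives the following. Rounding to one decimal place is a choice made here, not something the notes specify.

```latex
\widehat{\text{calories}} = 22 + 9.143\,(6) = 22 + 54.858 = 76.858 \approx 76.9
```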
Predictions