Statistical_Computing

The document outlines a statistical computing assignment focused on linear regression analysis using the least squares method to estimate model parameters. It includes steps for validating model assumptions, performing bootstrap procedures for parameter estimation, and constructing confidence intervals for various parameters related to newborn lengths based on parental heights and smoking status. Key results include estimates of model parameters, their significance, and confidence intervals, indicating the effects of parental heights and smoking on newborn length.

Uploaded by

chaoyang.soconsulting


Statistical Computing CW3

November 2024

(a) The goal of the least squares method is to estimate the model parameters by minimizing the Residual
Sum of Squares (RSS). For the regression model:

yi = α + βmi + γfi + ϵi ,

we aim to find the parameter estimates (α̂, β̂, γ̂) that minimize the RSS. In R, this can be achieved
using the lm() function, which computes the least squares estimates for the regression coefficients.
In this linear model, the “smoker” variable is not included. After running the code in R, we obtained
the following least squares estimates for (α, β, γ):

α̂ = 6.65996, β̂ = 0.17165, γ̂ = 0.03128.

The F-statistic is 4.22 with a p-value of 0.02192, which indicates that the model as a whole is
statistically significant at the 0.05 level. The p-value for β is 0.013, which is significant at the
0.05 level, whereas the p-value for γ is not. Hence the mother's height (data$mheight) has a
statistically significant effect on the baby's length, while the father's height (data$fheight) does
not show statistical significance and may not contribute meaningfully to the model.

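# Read the tab-separated birth-length data; header = TRUE takes column names from the first row.
data <- read.table("birth_length(1).txt", header = TRUE, sep = "\t")

data  # print the data for inspection
# Least squares fit of baby length on mother's and father's height (smoker is excluded).
birthmodel<-lm(data$length ~ data$mheight + data$fheight, data = data)
summary(birthmodel)  # estimates, standard errors, p-values and the F-statistic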
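
As a cross-check on what lm() computes, the following is a minimal sketch that solves the normal equations directly and compares the result with lm(). The data are simulated stand-ins (the data frame `sim`, its values and the seed are assumptions for illustration, not the assignment's data set); only the column names mirror those used above.

```r
# Simulated stand-in data; the numbers are illustrative, not the real birth-length data.
set.seed(1)
n_sim <- 42
sim <- data.frame(mheight = rnorm(n_sim, 165, 6),
                  fheight = rnorm(n_sim, 178, 7))
sim$length <- 6.7 + 0.17 * sim$mheight + 0.03 * sim$fheight + rnorm(n_sim)

# Design matrix with an intercept column, then solve (X'X) b = X'y.
X <- cbind(1, sim$mheight, sim$fheight)
y <- sim$length
beta_hat_ne <- solve(crossprod(X), crossprod(X, y))

# lm() minimises the same residual sum of squares, so the estimates should agree.
fit <- lm(length ~ mheight + fheight, data = sim)
max_abs_diff <- max(abs(drop(beta_hat_ne) - unname(coef(fit))))
rss <- sum(residuals(fit)^2)
max_abs_diff
```

The printed difference should be negligible (numerical round-off only), which illustrates that the coefficients reported by lm() are the minimisers of the RSS.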

(b) The residuals from the least squares fit of the regression model are defined as:

ϵ̂i = yi − (α̂ + β̂mi + γ̂fi ),

where yi is the observed length of the new-born baby, and α̂ + β̂mi + γ̂fi is the fitted value of the
baby's length based on the predictor variables mi (mother's height) and fi (father's height) in the
linear model.
A key assumption of the linear regression model is the normality of the errors. In Figure 1, the
histogram of the residuals appears approximately normal, and the Q-Q plot in Figure 2 is consistent
with this.
This check of the normality assumption supports the reliability of the model's statistical inferences,
such as the confidence intervals analyzed later.

n<-length(data$length)
coefficient<-coef(birthmodel)
residual<-residuals(birthmodel)
birthlength_fitted<-fitted(birthmodel)

[Figure 1: Histogram of Residuals (x-axis: residual_vector; y-axis: Frequency)]

[Figure 2: Q-Q Plot of the Distribution of Residuals with regards to the Theoretical Normal Distribution (x-axis: Theoretical Quantiles; y-axis: Sample Quantiles)]

residual_vector<-numeric(n)
for (i in 1:n){
residual_vector[i]<-data$length[i]-birthlength_fitted[i]
}
residual_vector
mean(residual_vector)
hist(residual_vector)
qqnorm(residual_vector, main = "Q-Q Plot of Residuals")
qqline(residual_vector, col = "blue", lwd = 2)

(c) In this question, the Bootstrap procedure for this linear model is outlined as follows:
1. Fit the linear model to the birth-length data. Specifically, estimate the parameter γ in the model:

yi = α + βmi + γfi + ϵi ,

where yi is the response variable, mi and fi are predictors, and ϵi is the random error term.
2. Obtain the residuals (ϵ̂1 , . . . , ϵ̂42 ) as computed previously in part (b).
3. Generate bootstrap samples by evaluating

yj∗ = α̂ + β̂mj + γ̂fj + ϵ∗j ,

where ϵ∗j is chosen uniformly at random from the residuals (ϵ̂1 , . . . , ϵ̂42 ).
4. Refit the linear model to each of the B bootstrap samples to obtain B bootstrap realizations γ̂b∗ ,
b = 1, . . . , B.
5. Summarize the results by calculating the bias and standard error of γ̂, as well as the proportion
Pr(γ̂ ∗ > 0).
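The bias of γ̂ is given by

$$\mathrm{Bias}(\hat\gamma) = \frac{1}{B}\sum_{b=1}^{B}\left(\hat\gamma_b^* - \hat\gamma\right).$$

The standard error of γ̂ is

$$\mathrm{SE}(\hat\gamma) = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat\gamma_b^* - \bar\gamma^*\right)^2},$$

where

$$\bar\gamma^* = \frac{1}{B}\sum_{b=1}^{B}\hat\gamma_b^*.$$

The proportion Pr(γ̂ > 0) is estimated as:

$$\Pr(\hat\gamma > 0) = \frac{\#(\hat\gamma_b^* > 0)}{B}.$$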

Figure 3 shows the histogram of the 1000 bootstrap estimates γ̂b∗ . After running the code, we
obtained the following results:

Bias(γ̂) = 0.002054614, SE(γ̂) = 0.05668251, Pr(γ̂ > 0) = 0.715.

The estimated bias is small relative to the standard error (under 4% of it), so these results give
little evidence that γ̂ is materially biased.

[Figure 3: Histogram of Bootstrap Estimates γ̂b∗ , b = 1, . . . , 1000 (x-axis: gamma_resid_result; y-axis: Frequency)]

bootRes <- function(B) {
  # Fit the model once and keep its fitted values and residuals.
  birthmodel <- lm(data$length ~ data$mheight + data$fheight, data = data)
  residual <- residuals(birthmodel)
  birthlength_fitted <- fitted(birthmodel)
  gamma_resid <- numeric(B)  # one slot per bootstrap replicate
  for (i in 1:B) {
    # Resample the residuals with replacement and rebuild the response.
    residual_star <- sample(residual, n, replace = TRUE)
    birthlength_star <- birthlength_fitted + residual_star
    fit_star <- lm(birthlength_star ~ data$mheight + data$fheight)
    gamma_resid[i] <- coef(fit_star)[3]
  }
  gamma_resid
}
B <- 1000
gamma_resid_result <- bootRes(B)
hist(gamma_resid_result)

# Bootstrap bias: mean of the replicates minus the original estimate.
Bias_gamma <- mean(gamma_resid_result) - coefficient[3]
Bias_gamma
# Bootstrap standard error: standard deviation of the replicates (divisor B - 1).
se_gamma <- sd(gamma_resid_result)
se_gamma

# Proportion of bootstrap replicates with gamma* > 0.
gammal0 <- ifelse(gamma_resid_result > 0, 1, 0)
Prgammal0 <- mean(gammal0)
Prgammal0
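
Because the code above depends on the birth-length file, here is a minimal self-contained sketch of the same residual bootstrap on simulated data. The data frame `sim`, the seed and the choice of 500 replicates are assumptions for illustration only; the final lines check that sd() of the replicates matches the SE formula with divisor B − 1.

```r
# Simulated stand-in data; the values are illustrative, not the real data.
set.seed(2)
n_sim <- 42
sim <- data.frame(mheight = rnorm(n_sim, 165, 6),
                  fheight = rnorm(n_sim, 178, 7))
sim$length <- 6.7 + 0.17 * sim$mheight + 0.03 * sim$fheight + rnorm(n_sim)

fit <- lm(length ~ mheight + fheight, data = sim)
gamma_hat <- unname(coef(fit)["fheight"])
res <- residuals(fit)
fitted_vals <- fitted(fit)

B_sim <- 500
gamma_star <- numeric(B_sim)
for (b in seq_len(B_sim)) {
  # New response = fitted values + residuals resampled with replacement.
  sim_star <- sim
  sim_star$length <- fitted_vals + sample(res, n_sim, replace = TRUE)
  gamma_star[b] <- coef(lm(length ~ mheight + fheight, data = sim_star))["fheight"]
}

bias_gamma <- mean(gamma_star) - gamma_hat   # bootstrap bias estimate
se_gamma <- sd(gamma_star)                   # bootstrap standard error
pr_positive <- mean(gamma_star > 0)          # proportion of replicates above zero

# The SE formula written out explicitly, for comparison with sd().
se_formula <- sqrt(sum((gamma_star - mean(gamma_star))^2) / (B_sim - 1))
```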

(d) To produce a 95% confidence interval for the parameter β using the bootstrap-t confidence interval
methodology, we proceed as follows:
1. Estimate the parameter and standard error: Using the lm() function, we compute the least squares
estimate β̂ for β, obtaining

$$\hat\beta = 0.17165,$$

along with the estimate of its standard error,

$$\widehat{se} = 0.06593.$$

2. Generate bootstrap samples: For i = 1, . . . , B, repeat the following steps:

(i) Generate a bootstrap dataset by resampling the residuals with replacement and adding them to the
fitted values, as in part (c).
(ii) Refit the linear model to the bootstrapped dataset to obtain the bootstrap realization
$\hat\beta_i^*$ and its corresponding standard error $\widehat{se}_i^*$.
(iii) Compute the Z-score for the bootstrap realization as:

$$Z_i^* = \frac{\hat\beta_i^* - \hat\beta}{\widehat{se}_i^*}.$$

3. Calculate quantiles of Z∗ : Determine the empirical 2.5% and 97.5% quantiles of the Z∗ values,
denoted as:

$$\hat t_{0.025} = -1.876585 \quad\text{and}\quad \hat t_{0.975} = 2.071633.$$

4. Compute the 95% bootstrap-t confidence interval: The confidence interval is computed as:

$$\left(\hat\beta - \hat t_{0.975}\cdot\widehat{se},\ \hat\beta - \hat t_{0.025}\cdot\widehat{se}\right).$$

Substituting the values:

$$(0.17165 - 2.071633 \cdot 0.06593,\ 0.17165 + 1.876585 \cdot 0.06593) \approx (0.0351,\ 0.2954).$$

Interpretation: The resulting 95% bootstrap-t confidence interval for β is (0.03507, 0.29536). This
interval provides a range of plausible values for β based on the observed data and accounts for
uncertainty in the estimate. Since 0 does not fall in the confidence interval, β = 0 is not a
plausible value at the 95% confidence level.

CIbeta <- function(B) {
  birthmodel <- lm(data$length ~ data$mheight + data$fheight, data = data)
  residual <- residuals(birthmodel)
  birthlength_fitted <- fitted(birthmodel)
  z_star <- numeric(B)
  beta_star <- numeric(B)
  se_star <- numeric(B)
  for (i in 1:B) {
    # Residual bootstrap: resample residuals and rebuild the response.
    residual_star <- sample(residual, n, replace = TRUE)
    birthlength_star <- birthlength_fitted + residual_star
    fit_star <- lm(birthlength_star ~ mheight + fheight, data = data)
    beta_star[i] <- coef(fit_star)[2]  # coefficient of mheight
    # Standard error of the mheight coefficient in the bootstrap fit
    se_star[i] <- summary(fit_star)$coefficients["mheight", "Std. Error"]
    # Studentized bootstrap statistic
    z_star[i] <- (beta_star[i] - coef(birthmodel)[2]) / se_star[i]
  }
  z_star
}

B <- 1000
CIbeta_result <- CIbeta(B)
# Empirical 2.5% and 97.5% quantiles of the studentized statistics
CIbeta_quantile <- quantile(CIbeta_result, probs = c(0.025, 0.975))
CIbeta_quantile
se_bar <- summary(birthmodel)$coefficients["data$mheight", "Std. Error"]
# Bootstrap-t interval: the upper quantile gives the lower limit and vice versa.
CIt <- coef(birthmodel)[2] - c(CIbeta_quantile[2], CIbeta_quantile[1]) * se_bar
CIt
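
The substitution in step 4 can be checked with a few lines of arithmetic. This sketch only plugs in the rounded values reported above (the variable names are illustrative), so it should agree with the interval quoted in the interpretation, (0.03507, 0.29536), up to rounding of the inputs.

```r
# Reported (rounded) values from part (d)
beta_hat_d <- 0.17165    # least squares estimate of beta
se_hat_d   <- 0.06593    # its estimated standard error
t_lower    <- -1.876585  # empirical 2.5% quantile of Z*
t_upper    <-  2.071633  # empirical 97.5% quantile of Z*

# Bootstrap-t interval: subtract the upper quantile for the lower limit,
# and the lower quantile for the upper limit.
ci_beta <- c(beta_hat_d - t_upper * se_hat_d,
             beta_hat_d - t_lower * se_hat_d)
round(ci_beta, 4)
```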

(e) In this question, the parameter of interest is E(M ) − E(F ), which we denote as θ. We estimate θ and
construct a 95% confidence interval using a bootstrap-t procedure. The steps are as follows:
1. Estimate the parameter and standard error: By computing the difference between the mean of the
mother’s height (M ) and the mean of the father’s height (F ), we obtain the point estimate θ̂ for θ:

θ̂ = −6.357143.

The standard error of θ̂ is estimated as:

$$\widehat{se} = \sqrt{\frac{\mathrm{var}(\mathrm{mheight} - \mathrm{fheight})}{42}} = 0.5029777.$$

2. Generate bootstrap samples: For i = 1, . . . , B, repeat the following steps:

(i) Combine the mother's and father's heights into a single bivariate dataset, treating the data as
paired observations (M, F ). Randomly sample with replacement from the paired data to create a
bootstrap sample.
(ii) Recompute the difference in means for each bootstrap sample to obtain $\hat\theta_i^*$, along with
its corresponding standard error $\widehat{se}_i^*$.
(iii) Calculate the Z-score for each bootstrap realization as:

$$Z_i^* = \frac{\hat\theta_i^* - \hat\theta}{\widehat{se}_i^*}.$$

3. Determine the quantiles of Z∗ : Compute the empirical 2.5% and 97.5% quantiles of the Z∗ values,
denoted as:

$$\hat t_{0.025} = -1.483659 \quad\text{and}\quad \hat t_{0.975} = 2.131360.$$

4. Compute the 95% bootstrap-t confidence interval: The confidence interval is given by:

$$\left(\hat\theta - \hat t_{0.975}\cdot\widehat{se},\ \hat\theta - \hat t_{0.025}\cdot\widehat{se}\right).$$

Substituting the values:

$$(-6.357143 - 2.131360 \cdot 0.5029777,\ -6.357143 + 1.483659 \cdot 0.5029777),$$

we compute:

$$(-7.429169,\ -5.610895).$$

Interpretation: The resulting 95% bootstrap-t confidence interval for θ is (−7.429169, −5.610895). This
interval provides a range of plausible values for the difference between the mean heights of mothers and
fathers. The interval lies entirely below zero and is narrow relative to the size of θ̂, which suggests
that the estimate is reasonably precise. Since 0 does not fall in the confidence interval, it is not
plausible that E(M ) − E(F ) = 0, that is, E(M ) = E(F ), at the 95% confidence level.

n <- length(data$length)
Boopair <- function(B) {
  diff_MF <- numeric(B)
  z_star_d <- numeric(B)
  se_star_d <- numeric(B)
  for (i in 1:B) {
    # Resample (mheight, fheight) pairs by drawing row indices with replacement.
    ind <- sample(n, n, replace = TRUE)
    x1_star <- data$mheight[ind]
    x2_star <- data$fheight[ind]
    diff_MF[i] <- mean(x1_star) - mean(x2_star)
    se_star_d[i] <- sqrt(var(x1_star - x2_star) / n)
    # Studentized bootstrap statistic
    z_star_d[i] <- (diff_MF[i] - (mean(data$mheight) - mean(data$fheight))) / se_star_d[i]
  }
  z_star_d
}
Wholecase_result <- Boopair(100)
# Empirical 2.5% and 97.5% quantiles of the studentized statistics
wholecase_quantile <- quantile(Wholecase_result, probs = c(0.025, 0.975))
wholecase_quantile
# Standard error of the observed difference in means (paired data)
wholecase_se_bar <- sqrt(var(data$mheight - data$fheight) / n)
wholecase_se_bar
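# The source line is cut off after "wholecase_quantile["; the ending below is an assumed
# completion by analogy with CIt in part (d) and the interval formula in step 4.
CI_wholecase <- (mean(data$mheight) - mean(data$fheight)) - c(wholecase_quantile[2], wholecase_quantile[1]) * wholecase_se_bar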
CI_wholecase
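
For readers without the data file, the paired bootstrap-t procedure of part (e) can be sketched end to end on simulated heights. The vectors `m_sim` and `f_sim`, their distributions, the seed and the use of 1000 replicates are all assumptions made for illustration; resampling the paired differences is equivalent to resampling the (M, F) pairs, because the difference of the means equals the mean of the differences.

```r
# Simulated paired heights; the values are illustrative, not the real data.
set.seed(3)
n_sim <- 42
m_sim <- rnorm(n_sim, 165, 6)     # hypothetical mothers' heights
f_sim <- rnorm(n_sim, 171.5, 7)   # hypothetical fathers' heights
d_sim <- m_sim - f_sim            # paired differences

theta_hat_e <- mean(d_sim)                # estimate of E(M) - E(F)
se_hat_e <- sqrt(var(d_sim) / n_sim)      # its standard error

B_sim <- 1000
z_star_e <- numeric(B_sim)
for (b in seq_len(B_sim)) {
  # Draw pairs with replacement by resampling the differences.
  d_star <- d_sim[sample(n_sim, n_sim, replace = TRUE)]
  z_star_e[b] <- (mean(d_star) - theta_hat_e) / sqrt(var(d_star) / n_sim)
}

# Quantiles of the studentized statistics, then the bootstrap-t interval.
t_q <- quantile(z_star_e, probs = c(0.025, 0.975), names = FALSE)
ci_theta <- c(theta_hat_e - t_q[2] * se_hat_e,
              theta_hat_e - t_q[1] * se_hat_e)
ci_theta
```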

(f) In this analysis, the parameter of interest, θ, is defined as the difference in expected baby length
between non-smoking mothers and smoking mothers:

θ = E(Y |S = 0) − E(Y |S = 1).

We conduct the following procedure:

1. For i = 1, . . . , B, we repeat the following steps:
Treat the baby lengths and smoker indicators as paired bivariate data, and resample the indices of
these pairs with replacement.
For each resample, compute the bootstrap estimate

$$\hat\theta_i^* = \hat E(Y^* \mid S^* = 0) - \hat E(Y^* \mid S^* = 1),$$

where Y∗ and S∗ are the resampled baby lengths and smoker indicators, and Ê denotes the sample mean
within each group.

2. Using the bootstrap estimates, compute Pr(θ̂ > 0) as:

$$\Pr(\hat\theta > 0) = \frac{\#(\hat\theta_i^* > 0)}{B}.$$

From the results, we find that Pr(θ̂ > 0) = 0.925: in 92.5% of the bootstrap resamples, the mean
baby length was greater for non-smoking mothers than for smoking mothers.
In Figure 4, the bootstrap estimates form an approximately normal distribution centered at 0.5198,
which is consistent with babies of non-smoking mothers being longer on average. Because the
proportion 0.925 is below 0.95, this evidence is suggestive rather than conclusive at the 5% level.

[Figure 4: Histogram of Bootstrap Estimates $\widehat{\left(E(Y \mid S = 0) - E(Y \mid S = 1)\right)}^{*}_{i}$ (x-axis: BooYS_result; y-axis: Frequency)]

BooYS <- function(B) {
  theta_star <- numeric(B)
  n <- length(data$length)
  for (i in 1:B) {
    # Resample (length, smoker) pairs by drawing row indices with replacement.
    ind <- sample(n, n, replace = TRUE)
    Y_star <- data$length[ind]
    S_star <- data$smoker[ind]
    Y1_star <- Y_star[S_star == 1]  # babies of smoking mothers
    Y0_star <- Y_star[S_star == 0]  # babies of non-smoking mothers
    theta_star[i] <- mean(Y0_star) - mean(Y1_star)
  }
  return(theta_star)
}

B<-1000
BooYS_result<-BooYS(B)
BooYSl0 <- ifelse(BooYS_result[1:B] > 0, 1, 0)
BooYSl0
Pr_thetal0<-mean(BooYSl0)
Pr_thetal0
hist(BooYS_result)
