Assignment 7 Business Analytics

The document discusses regression analyses conducted on various datasets, including cereal nutritional data, employee salary data, Major League Baseball performance, and Ohio education performance. Key findings include significant relationships between calories and sugars/carb intake, as well as predictors of current salary and baseball wins. Additionally, it highlights the importance of multicollinearity and the effectiveness of models in predicting outcomes based on independent variables.

Uploaded by

Trang Phạm Thị Thảo

Full name: Phạm Thị Thảo Trang

Student ID: 31241027994
Course section code: 25C1BUS50321001
_____________________________________________________________________

17. The Excel file Cereal Data provides a variety of nutritional information about
67 cereals and their shelf location in a supermarket. Use regression analysis to
find the best model that explains the relationship between calories and the other
variables. Investigate the model assumptions and clearly explain your
conclusions. Keep in mind the principle of parsimony!
R Square = 0.726 → Sugars and Carbs together explain 72.6% of the variation in
Calories.
H₀: The regression coefficients of Sugars and Carbs are both 0
H₁: At least one of the regression coefficients of Sugars and Carbs ≠ 0

ANOVA table: F = 84.702, p-value < 0.001 (reported as 0.000)

→ Reject H₀: Sugars and Carbs significantly explain variation in Calories.
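The overall F statistic can be cross-checked from R Square alone, since F = (R²/k) / ((1 − R²)/(n − k − 1)). The following is a minimal Python sketch, assuming all n = 67 cereals from the problem statement were used and that the model has k = 2 predictors plus an intercept. The function name is illustrative and not part of the original analysis.

```python
def f_from_r_squared(r_squared: float, n: int, k: int) -> float:
    """Overall F statistic of a linear regression that includes an intercept."""
    explained = r_squared / k                        # explained variation per predictor
    unexplained = (1.0 - r_squared) / (n - k - 1)    # residual variation per degree of freedom
    return explained / unexplained

# R Square as reported; n = 67 is an assumption taken from the problem statement.
f_stat = f_from_r_squared(0.726, n=67, k=2)
print(round(f_stat, 2))
```

The value comes out close to the reported F = 84.702. A small gap is expected because R Square is only given to three decimals.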

Individual coefficient tests (H₀ for each test: the coefficient = 0):

For Sugars: t = 12.387, p-value < 0.001, CI [3.276; 4.535]
For Carbs: t = 9.325, p-value < 0.001, CI [2.639; 4.078]
→ Both variables are statistically significant → Reject H₀ for each coefficient. Neither confidence interval contains 0.
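The coefficient estimates themselves are not listed, but they can be recovered from the reported intervals. This is a sketch under two assumptions: each interval is symmetric around the estimate (as for the usual t-based interval), and each t value tests the coefficient against 0. The helper name is hypothetical.

```python
def estimate_and_se(ci_low: float, ci_high: float, t_value: float):
    """Recover the point estimate and standard error from a symmetric CI."""
    estimate = (ci_low + ci_high) / 2.0   # midpoint of the interval
    std_error = estimate / t_value        # because t = estimate / std_error
    return estimate, std_error

# (CI lower bound, CI upper bound, t value) as reported above
reported = {
    "Sugars": (3.276, 4.535, 12.387),
    "Carbs": (2.639, 4.078, 9.325),
}

recovered = {name: estimate_and_se(*vals) for name, vals in reported.items()}
for name, (est, se) in recovered.items():
    print(f"{name}: estimate ~ {est:.3f}, standard error ~ {se:.3f}")
```

The midpoints, about 3.9 for Sugars and 3.4 for Carbs, are the estimated change in Calories per one-unit increase in each variable, holding the other fixed.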

18. The Excel file Salary Data provides information on current salary, beginning
salary, previous experience (in months) when hired, and total years of education
for a sample of 100 employees in a firm.
a. Develop a multiple regression model for predicting current salary as a function
of the other variables.
b. Find the best model for predicting current salary using the t-value criterion
R Square = 0.832 → Beginning Salary, Previous Experience, and Education
together explain 83.2% of the variation in Current Salary.

H₀: The regression coefficients of all independent variables = 0

H₁: At least one regression coefficient ≠ 0

ANOVA table: F = 130.521, p-value < 0.001 (reported as 0.000)

→ Reject H₀: The model significantly explains variation in Current Salary.

Individual coefficient tests:

- Beginning Salary: t = 15.203, p < 0.001 → significant

- Experience: t = -1.404, p = 0.164 → not significant
- Education: t = 2.045, p = 0.044 → significant

=> Experience is not significant at the 5% level, so the best model includes Beginning Salary and Education.
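The t-value criterion used above can be sketched as one screening step: find the predictor that is not significant and has the smallest |t|, and drop it. The significance level of 0.05 is an assumption (the source does not state one), and the function name is hypothetical. A full backward elimination would refit the regression after each removal, which this sketch does not do.

```python
ALPHA = 0.05  # assumed significance level; not stated in the source

# (t value, p-value) as reported in the regression output above
reported = {
    "Beginning salary": (15.203, 0.000),
    "Experience": (-1.404, 0.164),
    "Education": (2.045, 0.044),
}

def weakest_predictor(results, alpha=ALPHA):
    """Return the non-significant predictor with the smallest |t|, or None."""
    candidates = {name: abs(t) for name, (t, p) in results.items() if p >= alpha}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)

to_drop = weakest_predictor(reported)
kept = [name for name in reported if name != to_drop]
print("drop:", to_drop)
print("keep:", kept)
```

After dropping a variable, the reduced model should be re-estimated, because the remaining t values and R Square will change.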

21. The Excel file Major League Baseball provides data on the 2010 season.
a. Construct and examine the correlation matrix. Is multicollinearity a potential
problem?
b. Suggest an appropriate set of independent variables that predict the number
of wins by examining the correlation matrix.
c. Find the best multiple regression model for predicting the number of wins.
How good is your model? Does it use the same variables you thought were
appropriate in part (b)?
a. Construct and examine the correlation matrix. Is multicollinearity a
potential problem?

From the correlation matrix, we can observe:

- The dependent variable Won (number of wins) is significantly correlated with several
independent variables, such as:
+ Runs (r = 0.785, p < 0.001)
+ Runs Batted In (r = 0.758, p < 0.001)
+ Earned Run Average (r = -0.682, p < 0.001)
+ Walks (r = 0.533, p = 0.002)
- Among the independent variables, several pairs are also correlated with each other:
+ Runs and Runs Batted In (r = 0.995, very high)
+ Runs and Home Runs (r = 0.727, high)
+ Runs Batted In and Home Runs (r = 0.759, high)
+ Doubles and Hits (r = 0.434, moderate)

→ Yes, multicollinearity is a potential problem, most clearly between Runs and Runs Batted In.
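Screening the correlation matrix for multicollinearity amounts to flagging predictor pairs whose |r| exceeds a cutoff. The sketch below uses only the four correlations reported above. The cutoff of 0.7 is an assumed rule of thumb, not a value taken from the assignment.

```python
THRESHOLD = 0.7  # assumed rule-of-thumb cutoff; not taken from the source

# Pairwise correlations between independent variables, as reported above
pairs = {
    ("Runs", "Runs Batted In"): 0.995,
    ("Runs", "Home Runs"): 0.727,
    ("Runs Batted In", "Home Runs"): 0.759,
    ("Doubles", "Hits"): 0.434,
}

# Keep the pairs whose absolute correlation reaches the cutoff
flagged = [pair for pair, r in pairs.items() if abs(r) >= THRESHOLD]

for first, second in flagged:
    print(f"Potential multicollinearity: {first} and {second} (r = {pairs[(first, second)]})")
```

With this cutoff, the Doubles and Hits pair is not flagged, while the three pairs involving Runs, Runs Batted In, and Home Runs are.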

b. Suggest an appropriate set of independent variables that predict the
number of wins by examining the correlation matrix.

Based on the correlation matrix:

- The dependent variable Won is significantly correlated (at the 5% level) with:
+ Runs (r = 0.785, p < 0.001)
+ Runs Batted In (r = 0.758, p < 0.001)
+ Earned Run Average (ERA) (r = -0.682, p < 0.001)
+ Walks (r = 0.533, p = 0.002)
+ Home Runs (r = 0.438, p = 0.016)
- However, there are high correlations among some independent variables, such
as:

+ Runs and Runs Batted In: r = 0.995 → almost identical information.

+ Runs and Home Runs: r = 0.727
+ Runs Batted In and Home Runs: r = 0.759

→ Because Runs, Runs Batted In, and Home Runs overlap heavily, a reasonable candidate set keeps only one of them (Runs has the highest correlation with Won) together with Earned Run Average and Walks.
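How severe these overlaps are can be illustrated with the variance inflation factor, VIF = 1 / (1 − R²), where R² comes from regressing one predictor on the others. With only the pairwise correlation r available, 1 / (1 − r²) gives a lower bound on the VIF of either variable when both are in the model. The function name is illustrative, and the common "VIF above 10" warning level mentioned in the comment is an assumed rule of thumb, not something stated in the assignment.

```python
def vif_lower_bound(r: float) -> float:
    """Lower bound on the VIF implied by a single pairwise correlation r."""
    return 1.0 / (1.0 - r ** 2)

# Pairwise correlations between predictors, as reported above
reported = {
    "Runs / Runs Batted In": 0.995,
    "Runs / Home Runs": 0.727,
    "Runs Batted In / Home Runs": 0.759,
}

bounds = {pair: vif_lower_bound(r) for pair, r in reported.items()}
for pair, bound in bounds.items():
    # A VIF above about 10 is often treated as a warning sign (assumed rule of thumb).
    print(f"{pair}: VIF >= {bound:.1f}")
```

For Runs and Runs Batted In the bound is around 100, which supports keeping only one of the two in the model.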

c. Find the best multiple regression model for predicting the number of wins.
How good is your model? Does it use the same variables you thought were
appropriate in part (b)?
R Square = 0.941 → 94.1% of the variation in the number of wins
(Won) is explained by the predictors (Walks, Earned Run Average, Runs
Batted In, and Runs).

ANOVA table: F = 100.430, p-value < 0.001 (reported as 0.000)

→ Reject H₀: The regression model as a whole is significant, meaning that at least
one independent variable has a significant relationship with the dependent
variable.

Individual coefficient tests:

- Runs: t = 1.898, p = 0.069 → not significant at the 5% level

- Runs Batted In: t = -0.355, p = 0.725 → not significant
- Earned Run Average: t = -10.875, p < 0.001 → significant
- Walks: t = -2.068, p = 0.049 → significant at the 5% level, but only marginally
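To judge how good the model is while respecting parsimony, R Square can be adjusted for the number of predictors: adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 30 observations (one per team in the 2010 season; the assignment does not state the sample size) and k = 4 predictors. The function name is illustrative.

```python
def adjusted_r_squared(r_squared: float, n: int, k: int) -> float:
    """Adjusted R Square for a regression with k predictors and an intercept."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)

# R Square as reported; n = 30 is an assumption, not a value from the source.
adj = adjusted_r_squared(0.941, n=30, k=4)
print(round(adj, 3))
```

Under these assumptions the adjusted value stays at roughly 0.93, so the high R Square is not just a result of including four predictors.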

24. The State of Ohio Department of Education has a mandated ninth-grade
proficiency test that covers writing, reading, mathematics, citizenship (social
studies), and science. The Excel file Ohio Education Performance provides data
on success rates (defined as the percent of students passing) in school districts in
the greater Cincinnati metropolitan area along with state averages.
a. Suggest the best regression model to predict math success as a function of
success in the other subjects by examining the correlation matrix; then run the
regression tool for this set of variables
b. Develop a multiple regression model to predict math success as a function of
success in all other subjects using the systematic approach described in this
chapter. Is multicollinearity a problem?
c. Compare the models in parts (a) and (b). Are they the same? Why or why not?
a. Suggest the best regression model to predict math success as a function of
success in the other subjects by examining the correlation matrix; then run
the regression tool for this set of variables

All pairwise correlations among the independent variables (Writing, Reading,
Citizenship, and Science) and the dependent variable (Math) are strong and positive.
Science shows the strongest correlation with Math (r = 0.895).

b. Develop a multiple regression model to predict math success as a function
of success in all other subjects using the systematic approach described in
this chapter. Is multicollinearity a problem?
R Square = 0.875 → 87.5% of the variation in Math is
explained by the predictor Science.

ANOVA table: F = 210.184, p-value < 0.001 (reported as 0.000)

→ Reject H₀: The regression model as a whole is significant, meaning that
Science success has a statistically significant relationship with Math success.

Individual coefficient tests:

Science: t = 14.498, p < 0.001 → significant
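With a single predictor, the overall F test and the t test on that predictor are the same test, so F should equal t². A quick Python check on the reported values (variable names are illustrative):

```python
t_science = 14.498    # reported t value for Science
f_reported = 210.184  # reported F from the ANOVA table

t_squared = t_science ** 2
difference = abs(t_squared - f_reported)

print(f"t squared = {t_squared:.3f}, reported F = {f_reported}")
print(f"difference = {difference:.3f}")
```

The two values agree up to rounding in the reported t value, which is consistent with a model that contains Science as its only predictor.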
