Paper: Regression Analysis
Practical 3: Regression on Dummy variables
Kaushik Jain
18006
Q1) Codes:
The fitted regression line turns out to be:
y = 36.98560 - 0.02661*(x1) + 15.00425*(x2)
where y is the effective life of a cutting tool, x1 is the lathe speed in revolutions per
minute, and x2 is the type of cutting tool used.
The intercept suggests that when both x1 and x2 are zero, the estimated value of y is
36.98560.
For each one-unit (1 rpm) increase in x1 (speed), holding x2 constant, we expect y to
decrease by approximately 0.02661 units of tool life.
The coefficient for x2 indicates that, on average, changing from the reference tool type
(x2 = 0) to the other tool type (x2 = 1) increases the predicted response by
approximately 15.00425 units, with speed held constant.
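The interpretation of the two slopes can be checked numerically. The following is a
minimal Python sketch that evaluates the fitted equation. The function name
predict_tool_life and the 0/1 coding of x2 are assumptions made for illustration, and the
speeds in the loop are arbitrary inputs, not observations from the data set.

```python
# Coefficients copied from the fitted regression line reported above.
B0 = 36.98560   # intercept
B1 = -0.02661   # slope for x1 (lathe speed in rpm)
B2 = 15.00425   # shift for x2 (tool-type dummy)


def predict_tool_life(speed_rpm, tool_type):
    """Return the fitted tool life for a given speed and tool-type dummy.

    Assumption: tool_type is coded 0 for the reference tool and 1 for the other.
    """
    if tool_type not in (0, 1):
        raise ValueError("tool_type must be 0 or 1")
    return B0 + B1 * speed_rpm + B2 * tool_type


if __name__ == "__main__":
    for speed in (500, 600, 700):  # illustrative speeds only
        ref = predict_tool_life(speed, 0)
        other = predict_tool_life(speed, 1)
        # The gap between the two tool types equals B2 at every speed.
        print(f"speed={speed}: x2=0 -> {ref:.3f}, x2=1 -> {other:.3f}, "
              f"gap={other - ref:.5f}")
```

The gap between the two predictions is the same at every speed, so the model describes
two parallel lines, one per tool type, separated by the x2 coefficient.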
The model fits the data well: the R-squared value of 0.9003 means that the model explains
about 90% of the variability in the response variable.
The F-statistic for the model is 76.75 with a very low p-value (3.086e-09). This indicates
that the overall model is statistically significant.
The p-values for all the coefficients are less than 0.05, hence the individual
coefficients are statistically significant.
Interpretation:
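The estimated coefficient for x2 is approximately 15.00425, with a 95% confidence interval
ranging from 12.13 to 17.87. Because x2 is a dummy variable, this means that changing the
tool type (x2 from 0 to 1), holding other variables constant, is expected to increase the
response variable by 15.00425 on average, and we are 95% confident that the true increase
lies between 12.13 and 17.87. The interval does not contain zero, which agrees with the
statistical significance of this coefficient.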
Q2) Codes:
The fitted regression line turns out to be:
Income = 14276.1 + 1471.7*(age) + 2479.7*(married) - 8397.4*(divorced)
Intercept: The intercept represents the average income for a single individual who is
zero years old. Obviously you can’t be zero years old, so it doesn’t make sense to
interpret the intercept by itself in this particular regression model.
Age: Each one year increase in age is associated with an average increase of $1471.70
in income. Since the p-value (0.00428) < 0.05, age is a statistically significant
predictor of income.
Married: A married individual, on average, earns $2479.70 more than a single
individual. Since the p-value (0.80018) is not less than 0.05, this difference is not
statistically significant.
Divorced: A divorced individual, on average, earns $8397.40 less than a single
individual. Since the p-value (0.53187) is not less than 0.05, this difference is not
statistically significant.
Since neither dummy variable was statistically significant, we could drop marital
status as a predictor from the model.
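The dummy coding behind these interpretations can be sketched in Python as follows. This
is an illustration only: the function names and the status labels are assumptions, the
coefficients are the values quoted in the Age, Married, and Divorced interpretations
above, and the age passed in is an arbitrary input.

```python
# Coefficient values as quoted in the interpretations above.
COEF = {
    "intercept": 14276.1,
    "age": 1471.7,
    "married": 2479.7,
    "divorced": -8397.4,
}

# Three marital-status categories need only two dummies; "single" is the baseline.
DUMMIES = {
    "single": (0, 0),
    "married": (1, 0),
    "divorced": (0, 1),
}


def encode_marital_status(status):
    """Return the (married, divorced) dummy pair for a status label."""
    return DUMMIES[status.lower()]


def predict_income(age, status):
    """Evaluate the fitted equation for a given age and marital status."""
    married, divorced = encode_marital_status(status)
    return (COEF["intercept"]
            + COEF["age"] * age
            + COEF["married"] * married
            + COEF["divorced"] * divorced)


if __name__ == "__main__":
    age = 30  # arbitrary illustrative age
    base = predict_income(age, "single")
    for status in ("single", "married", "divorced"):
        diff = predict_income(age, status) - base
        print(f"{status:9s} predicted income difference vs single: {diff:.1f}")
```

Each dummy coefficient is the difference from the baseline category (single) at any fixed
age, which is why the married and divorced effects are both read relative to a single
individual.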
The R-squared value of 0.9008 indicates that the model explains 90.08% of the variance in
income, which is quite high. The adjusted R-squared is 0.8584, which is lower because it
penalizes the number of predictors, but it still indicates a good fit.
Intercept (β0): We are 95% confident that the mean income when all predictor variables
are zero lies between approximately -10343.16 and 38895.40.
Age (β1): We are 95% confident that each one-year increase in age changes mean income by
between approximately 633.55 and 2309.80. The interval lies entirely above zero, which
agrees with age being statistically significant.
Married (β2): We are 95% confident that the difference in mean income between married
and single individuals lies between approximately -19821.65 and 24781.14. The interval
contains zero, which agrees with this difference not being statistically significant.
Divorced (β3): We are 95% confident that the difference in mean income between divorced
and single individuals lies between approximately -38596.88 and 21802.07. The interval
contains zero, which agrees with this difference not being statistically significant.
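The link between these intervals and the p-values reported earlier can be checked with a
short Python sketch. It relies on two standard facts about t-based intervals: a 95%
interval excludes zero exactly when the two-sided p-value is below 0.05, and the interval
is centred on the point estimate. The helper names are assumptions for illustration; the
numbers are the ones quoted in this section.

```python
# (point estimate, lower bound, upper bound) as quoted in this section.
INTERVALS = {
    "Intercept": (14276.1, -10343.16, 38895.40),
    "Age": (1471.7, 633.55, 2309.80),
    "Married": (2479.7, -19821.65, 24781.14),
    "Divorced": (-8397.4, -38596.88, 21802.07),
}


def excludes_zero(lower, upper):
    """True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


def midpoint(lower, upper):
    """Centre of the interval, which should match the point estimate."""
    return (lower + upper) / 2


if __name__ == "__main__":
    for name, (estimate, lower, upper) in INTERVALS.items():
        print(f"{name:9s} excludes zero: {excludes_zero(lower, upper)!s:5s} "
              f"midpoint: {midpoint(lower, upper):.2f} (estimate {estimate})")
```

Among these coefficients, only the Age interval excludes zero, which matches age being the
only predictor with a p-value below 0.05.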
******************