
T Test

Uploaded by

Tran Selena
Copyright
© © All Rights Reserved

T test: compares the means of 2 groups

Analyze => Compare Means => Independent-Samples T Test

1. Independent Sample T test


Group Statistics
gender N Mean Std. Deviation Std. Error Mean
total Female 64 102.03 13.896 1.737
Male 41 98.29 17.196 2.686

Independent Samples Test

Levene's Test for Equality of Variances
                                     F      Sig.
total  Equal variances assumed       2.019  .158

t-test for Equality of Means (Lower/Upper = 95% Confidence Interval of the Difference)
                                     t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
total  Equal variances assumed       1.224  103     .224             3.739            3.053                  -2.317  9.794
       Equal variances not assumed   1.169  72.421  .246             3.739            3.198                  -2.637  10.114

Levene’s test
Sig. = significance = p-value; the most commonly used significance level is 0.05.
If the Sig. value in Levene’s test > 0.05 => the variances are not different => read the "Equal variances assumed" line.
If the Sig. value in Levene’s test < 0.05 => the variances are different => read the "Equal variances not assumed" line.
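The Levene rule above can be written as a small helper. This is only a sketch: the function name `row_to_read` and the `alpha` parameter are illustrative assumptions and not part of SPSS, and treating a Sig. value exactly equal to 0.05 as "not assumed" is a choice the notes do not cover.

```python
def row_to_read(levene_sig, alpha=0.05):
    """Pick which row of the Independent Samples Test table to interpret."""
    if levene_sig > alpha:
        # Variances are not significantly different.
        return "Equal variances assumed"
    # Variances differ (a tie at alpha is treated the same way; an assumption).
    return "Equal variances not assumed"


# Levene Sig. for the "total" variable in the table above.
print(row_to_read(0.158))
```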

Group Statistics
gender N Mean Std. Deviation Std. Error Mean
gpa Female 64 2.8967 .74622 .09328
Male 41 2.5949 .76346 .11923

Independent Samples Test

Levene's Test for Equality of Variances
                                   F     Sig.
gpa  Equal variances assumed       .331  .566

t-test for Equality of Means (Lower/Upper = 95% Confidence Interval of the Difference)
                                   t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
gpa  Equal variances assumed       2.004  103     .048             .30184           .15062                 .00312  .60056
     Equal variances not assumed   1.994  83.974  .049             .30184           .15138                 .00080  .60288

In Levene’s test, since Sig. (0.566) > 0.05, the variances of the 2 populations are not different. Therefore, use the result of the t test for
Equality of Means in the line “Equal variances assumed”.
In the t test for Equality of Means, since Sig. (0.048) < 0.05, it can be concluded that there is a statistically significant difference in the
mean value of the gpa variable between the two gender groups, male and female.
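As a cross-check, the t and df values in the gpa table can be recomputed from the Group Statistics alone. The sketch below uses the standard pooled and Welch formulas; the function name `t_from_summary` is an illustrative assumption, and small differences from the SPSS output come from rounding in the printed means.

```python
import math


def t_from_summary(n1, m1, s1, n2, m2, s2):
    """Return (pooled_t, pooled_df, welch_t, welch_df) from group summaries."""
    diff = m1 - m2
    # "Equal variances assumed": pooled variance, df = n1 + n2 - 2.
    pooled_df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / pooled_df
    pooled_se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # "Equal variances not assumed": Welch standard error, Satterthwaite df.
    v1, v2 = s1**2 / n1, s2**2 / n2
    welch_se = math.sqrt(v1 + v2)
    welch_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return diff / pooled_se, pooled_df, diff / welch_se, welch_df


# Female vs. male gpa, values copied from the Group Statistics table.
pooled_t, pooled_df, welch_t, welch_df = t_from_summary(
    64, 2.8967, 0.74622, 41, 2.5949, 0.76346
)
print(round(pooled_t, 3), pooled_df, round(welch_t, 3), round(welch_df, 3))
```

The two t values and the Welch df should land close to the 2.004, 1.994 and 83.974 reported by SPSS.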
2. Paired Sample T test
Paired Samples Statistics
Mean N Std. Deviation Std. Error Mean
Pair 1 quiz1 7.47 105 2.481 .242
quiz2 7.98 105 1.623 .158

Paired Samples Correlations


N Correlation Sig.
Pair 1 quiz1 & quiz2 105 .673 .000

Paired Samples Test

Paired Differences (Lower/Upper = 95% Confidence Interval of the Difference)
                        Mean   Std. Deviation  Std. Error Mean  Lower  Upper  t       df   Sig. (2-tailed)
Pair 1  quiz1 - quiz2   -.514  1.835           .179             -.869  -.159  -2.872  104  .005
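The paired-samples numbers fit together as follows: the standard deviation of the differences follows from the two quiz standard deviations and their correlation, and t is the mean difference divided by its standard error. This is a sketch with values copied from the three paired-samples tables; the variable names are illustrative, and the result differs slightly from SPSS because the inputs are rounded.

```python
import math

n = 105
sd_quiz1, sd_quiz2 = 2.481, 1.623   # Paired Samples Statistics
r = 0.673                           # Paired Samples Correlations
mean_diff = -0.514                  # Paired Samples Test (quiz1 - quiz2)

# SD of the differences: sqrt(s1^2 + s2^2 - 2 * r * s1 * s2).
sd_diff = math.sqrt(sd_quiz1**2 + sd_quiz2**2 - 2 * r * sd_quiz1 * sd_quiz2)
# Standard error of the mean difference and the t statistic.
se_diff = sd_diff / math.sqrt(n)
t = mean_diff / se_diff
df = n - 1

print(round(sd_diff, 3), round(se_diff, 3), round(t, 2), df)
```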

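ANOVA
Analyze => Compare Means => One-Way ANOVA
In the dialog, the variable whose mean is compared (e.g. total) goes in the upper box and the factor goes in the lower box.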
1.

Descriptives
Overall, I am satisfied with the price performance ratio of Oddjob Airways.
95% Confidence Interval for Mean
N Mean Std. Deviation Std. Error Lower Bound Upper Bound Minimum Maximum
Blue 677 4.47 1.641 .063 4.35 4.60 1 7
Silver 245 4.03 1.560 .100 3.84 4.23 1 7
Gold 143 3.99 1.556 .130 3.73 4.24 1 7
Total 1065 4.31 1.625 .050 4.21 4.40 1 7
Test of Homogeneity of Variances
Overall, I am satisfied with the price performance ratio of Oddjob Airways.
                                       Levene Statistic  df1  df2       Sig.
Based on Mean                          .907              2    1062      .404
Based on Median                        .068              2    1062      .934
Based on Median and with adjusted df   .068              2    1017.925  .934
Based on trimmed mean                  .771              2    1062      .463

Step 1: Levene’s test
Since Sig. (0.404) > 0.05, the variances of the groups are not different. Therefore, use the result in the ANOVA table.
If Sig. is less than 0.05, use the Robust Tests of Equality of Means table instead.

ANOVA
Overall, I am satisfied with the price performance ratio of Oddjob Airways.
Sum of Squares df Mean Square F Sig.
Between Groups 51.755 2 25.878 9.963 .000
Within Groups 2758.455 1062 2.597
Total 2810.210 1064

Step 2: ANOVA
Since Sig. (0.000) < 0.05, it is concluded that there is a statistically significant difference between at least 2 customer groups in the mean
value of price satisfaction.
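The F value in the ANOVA table is the ratio of the two mean squares, and each mean square is a sum of squares divided by its df. A minimal sketch with the values from the table above (variable names are illustrative):

```python
# Values copied from the ANOVA table for price satisfaction.
ss_between, df_between = 51.755, 2
ss_within, df_within = 2758.455, 1062

ms_between = ss_between / df_between   # Mean Square, Between Groups
ms_within = ss_within / df_within      # Mean Square, Within Groups
f_stat = ms_between / ms_within        # F

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```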
Robust Tests of Equality of Means
Overall, I am satisfied with the price performance ratio of Oddjob Airways.
        Statistic(a)  df1  df2      Sig.
Welch   10.230        2    345.211  .000
a. Asymptotically F distributed.
Step 3: One-Way ANOVA => Post Hoc => tick Tukey

Multiple Comparisons
Dependent Variable: Overall, I am satisfied with the price performance ratio of Oddjob Airways.
Tukey HSD
                                                                                          95% Confidence Interval
(I) Traveler Status  (J) Traveler Status  Mean Difference (I-J)  Std. Error  Sig.  Lower Bound  Upper Bound
Blue                 Silver               .440*                  .120        .001  .16          .72
                     Gold                 .487*                  .148        .003  .14          .83
Silver               Blue                 -.440*                 .120        .001  -.72         -.16
                     Gold                 .047                   .170        .959  -.35         .44
Gold                 Blue                 -.487*                 .148        .003  -.83         -.14
                     Silver               -.047                  .170        .959  -.44         .35
*. The mean difference is significant at the 0.05 level.


Since sig.(0.001)<0.05, it is concluded that there is a statistically significant difference in the mean value of price satisfaction between the Blue
and Silver groups.
Since sig.(0.003)<0.05, it is concluded that there is a statistically significant difference in the mean value of price satisfaction between the Blue
and Gold groups.
Since sig.(0.959)>0.05, it is concluded that there is no statistically significant difference in the mean value of price satisfaction between the
Gold and Silver groups.
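Reading the Tukey table amounts to comparing each pair's Sig. value with 0.05. The sketch below does this for the three distinct pairs; the dictionary layout and the function name `significant_pairs` are illustrative assumptions.

```python
# Sig. values copied from the Tukey HSD table (each pair listed once).
tukey_sig = {
    ("Blue", "Silver"): 0.001,
    ("Blue", "Gold"): 0.003,
    ("Silver", "Gold"): 0.959,
}


def significant_pairs(pairs, alpha=0.05):
    """Return the pairs whose mean difference is significant at alpha."""
    return [pair for pair, sig in pairs.items() if sig < alpha]


for pair in significant_pairs(tukey_sig):
    print(pair, "differ significantly in mean price satisfaction")
```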
Descriptives
total
95% Confidence Interval for Mean
N Mean Std. Deviation Std. Error Lower Bound Upper Bound Minimum Maximum
Native 5 95.20 17.094 7.645 73.98 116.42 75 115
Asian 20 102.90 12.876 2.879 96.87 108.93 78 123
Black 24 100.08 14.714 3.004 93.87 106.30 65 124
White 45 102.27 14.702 2.192 97.85 106.68 51 123
Hispanic 11 92.91 21.215 6.397 78.66 107.16 52 120
Total 105 100.57 15.299 1.493 97.61 103.53 51 124
• Asian (Mean = 102.90) and White (Mean = 102.27) groups have the highest average scores, indicating these groups scored slightly higher on
average compared to others.
• The Black group has a mean of 100.08, which is very close to the overall mean of 100.57.
• The Native group has a mean of 95.20, and Hispanic has the lowest mean at 92.91, suggesting that these two groups scored lower on average.

Test of Homogeneity of Variances


Levene Statistic df1 df2 Sig.
total Based on Mean .930 4 100 .450
Based on Median .449 4 100 .773
Based on Median and with .449 4 78.532 .773
adjusted df
Based on trimmed mean .825 4 100 .512

Step 1: Levene’s Test


Since sig.(0.450)>0.05, the variances of the groups are not different. Therefore, use the result in ANOVA table.
ANOVA
total
Sum of Squares df Mean Square F Sig.
Between Groups 1033.572 4 258.393 1.109 .357
Within Groups 23310.142 100 233.101
Total 24343.714 104

Step 2: ANOVA
Since Sig. (0.357) > 0.05, it is concluded that there is no statistically significant difference between ethnic groups in the mean value of the
total variable.
Multiple Comparisons
Dependent Variable: total
Tukey HSD
Mean Difference 95% Confidence Interval
(I) ethnicity (J) ethnicity (I-J) Std. Error Sig. Lower Bound Upper Bound
Native Asian -7.700 7.634 .851 -28.91 13.51
Black -4.883 7.506 .966 -25.74 15.97
White -7.067 7.197 .863 -27.06 12.93
Hispanic 2.291 8.235 .999 -20.59 25.17
Asian Native 7.700 7.634 .851 -13.51 28.91
Black 2.817 4.623 .973 -10.03 15.66
White .633 4.103 1.000 -10.77 12.03
Hispanic 9.991 5.731 .413 -5.93 25.91
Black Native 4.883 7.506 .966 -15.97 25.74
Asian -2.817 4.623 .973 -15.66 10.03
White -2.183 3.859 .980 -12.90 8.54
Hispanic 7.174 5.559 .698 -8.27 22.62
White Native 7.067 7.197 .863 -12.93 27.06
Asian -.633 4.103 1.000 -12.03 10.77
Black 2.183 3.859 .980 -8.54 12.90
Hispanic 9.358 5.135 .367 -4.91 23.62
Hispanic Native -2.291 8.235 .999 -25.17 20.59
Asian -9.991 5.731 .413 -25.91 5.93
Black -7.174 5.559 .698 -22.62 8.27
White -9.358 5.135 .367 -23.62 4.91
CORRELATION
What is correlation?
Correlation is a common measure of how strongly two variables relate to each other.
How can we measure correlation?
Using Pearson’s correlation coefficient (r), which ranges from -1 to 1.
r = 0 => there is no linear relationship
r < 0 => negative relationship
r > 0 => positive relationship
What are some thresholds?
Thresholds are applied to the absolute value of the correlation, |r|.
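To make the sign rule concrete, the sketch below computes Pearson's r by hand and reports the direction of the relationship. The data (`hours`, `scores`) are hypothetical and not from the Oddjob Airways file, and the function names are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def direction(r):
    """Translate the sign of r into words."""
    if r > 0:
        return "positive relationship"
    if r < 0:
        return "negative relationship"
    return "no linear relationship"


# Hypothetical example data.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 3), direction(r), "| strength is judged on", round(abs(r), 3))
```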
Create a correlation table
Step 1: Analyze => Correlate => Bivariate

Correlations
S1: ... with Oddjob Airways you will arrive on time.
S2: … the entire journey with Oddjob Airways will occur as booked.
S3: ... in case something does not work out as planned, Oddjob Airways will find a good solution.
S4: … the flight schedules of Oddjob Airways are reliable.

                          S1      S2      S3      S4
S1  Pearson Correlation   1       .739**  .619**  .717**
    Sig. (2-tailed)               .000    .000    .000
    N                     1038    1037    952     1033
S2  Pearson Correlation   .739**  1       .694**  .766**
    Sig. (2-tailed)       .000            .000    .000
    N                     1037    1040    952     1034
S3  Pearson Correlation   .619**  .694**  1       .645**
    Sig. (2-tailed)       .000    .000            .000
    N                     952     952     954     951
S4  Pearson Correlation   .717**  .766**  .645**  1
    Sig. (2-tailed)       .000    .000    .000
    N                     1033    1034    951     1035
**. Correlation is significant at the 0.01 level (2-tailed).

The correlation between S1 and S2 is 0.739 > 0.49, which indicates a strong relationship according to Cohen.
The correlation between S2 and S4 is the highest (0.766), suggesting that S4 is strongly associated with variable S2.
REGRESSION ANALYSIS
LINEAR REGRESSION: variable X has an effect on variable Y

Y = B0 + B1*X1 + B2*X2 + B3*X3 + B4*X4 + ...
Y: dependent variable
X1, X2, ...: independent variables (explanatory variables) => simple linear regression (exactly 1 independent variable) and multiple linear regression (2 or more independent variables)
B1, B2, B3, ...: regression coefficients
B0: intercept
Step 1: Analyze => Regression => Linear

Variables Entered/Removed(a)
Model  Variables Entered                                                  Variables Removed  Method
1      Top 10% HS, Median SAT, Expenditures/Student, Acceptance Rate(b)   .                  Enter
a. Dependent Variable: Graduation %
b. All requested variables entered.

Model Summary
Model  R        R Square  Adjusted R Square  Std. Error of the Estimate
1      .731(a)  .534      .492               5.308
a. Predictors: (Constant), Top 10% HS, Median SAT, Expenditures/Student, Acceptance Rate

Interpreting R-square:

53% of the variation in the dependent variable (Graduation %) is explained by the independent variables in the model (specifically, Top 10% HS,
Median SAT, Expenditures/Student, Acceptance Rate).
R-square has 2 limitations:
1. R-square is sensitive to the sample size.
2. When more independent variables are added to the model => R-square increases, even if the added variables are not useful.
Adjusted R square
Adjusted R square = 0.492
49% of the variation in the dependent variable (Graduation %) is explained by the independent variables in the model (specifically, Top 10% HS,
Median SAT, Expenditures/Student, Acceptance Rate), adjusting for the sample size and the number of independent variables in the model.
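The adjustment can be reproduced with the usual adjusted R-square formula, 1 - (1 - R²)(n - 1)/(n - k - 1). This is a sketch: the function name is illustrative, and n = 49 is read off the regression output (total df of 48 in the ANOVA table) rather than stated in the notes.

```python
def adjusted_r_square(r_square, n, k):
    """Adjust R-square for sample size n and number of predictors k."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


# R Square = .534 from the Model Summary; 4 predictors; n = 49 (assumed from df).
adj = adjusted_r_square(0.534, n=49, k=4)
print(round(adj, 3))
```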
ANOVA(a)
Model          Sum of Squares  df  Mean Square  F       Sig.
1  Regression  1423.209        4   355.802      12.627  .000(b)
   Residual    1239.852        44  28.178
   Total       2663.061        48
a. Dependent Variable: Graduation %
b. Predictors: (Constant), Top 10% HS, Median SAT, Expenditures/Student, Acceptance Rate

Compare Sig. to 0.05:

If Sig. < 0.05, the overall model is statistically significant at the 5% significance level.
If Sig. > 0.05, the overall model is not statistically significant at the 5% significance level.
Here Sig. (0.000) < 0.05, so the overall model is statistically significant.
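The regression ANOVA table also ties back to the Model Summary: R-square is the regression sum of squares divided by the total sum of squares, and F is the ratio of the two mean squares. A minimal sketch with the table values (variable names are illustrative):

```python
# Values copied from the regression ANOVA table.
ss_regression, df_regression = 1423.209, 4
ss_residual, df_residual = 1239.852, 44
ss_total = 2663.061

r_square = ss_regression / ss_total          # share of variation explained
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual         # overall model F

print(round(r_square, 3), round(f_stat, 3))
```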
Coefficients(a)
                          Unstandardized Coefficients  Standardized Coefficients
Model                     B        Std. Error          Beta                       t       Sig.
1  (Constant)             17.921   24.557                                         .730    .469
   Median SAT             .072     .018                .606                       4.004   .000
   Acceptance Rate        -.249    .083                -.446                      -2.990  .005
   Expenditures/Student   .000     .000                -.282                      -2.057  .046
   Top 10% HS             -.163    .079                -.296                      -2.051  .046
a. Dependent Variable: Graduation %

Reading the regression results

Step 1: is the regression coefficient statistically significant?
Sig. = significance = p-value; the most commonly used significance level is 0.05.
If Sig. < 0.05 => statistically significant => go to step 2.
If Sig. > 0.05 => not statistically significant => X has no statistically significant impact on Y.

Step 2: is the impact positive or negative?
If the coefficient sign is (+): X has a positive impact on Y. For example, Median SAT scores have a positive impact on graduation rates.
If the coefficient sign is (-): X has a negative impact on Y (X and Y move in opposite directions).

LIKELY TO BE ON THE EXAM

Notes:
The signs of the Unstandardized and Standardized coefficients are always the same.
Unstandardized Coefficients: coefficients that still retain the original units of the variable.
Standardized Coefficients: coefficients that have been standardized and no longer retain the original units of the variable.
To explain the economic meaning, the Unstandardized Coefficients are used.

INTERPRETATION (QUESTION 7)
If X increases by 1 unit, by how many units does Y change, keeping other factors unchanged?
If the median SAT score increases by 1 point, the graduation rate increases by 0.072 percentage points, keeping other factors unchanged.

If the acceptance rate increases by 1 percentage point, the graduation rate decreases by 0.249 percentage points, keeping other factors unchanged.

Regression Model
Lower acceptance rates suggest higher graduation rates.

The coefficient of acceptance rate is statistically significant and negative. This indicates that the acceptance rate has a negative influence on the
graduation rate.
If the acceptance rate increases by 1 point, the graduation rate decreases by 0.249 percentage points, keeping other factors unchanged.
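The "1 unit of X changes Y by B units" reading is simply the unstandardized coefficient multiplied by the change in X, with the other variables held fixed. The sketch below uses the B values from the Coefficients table; Expenditures/Student is left out because its B is printed as .000 after rounding, and the function name `change_in_y` is an illustrative assumption.

```python
# Unstandardized coefficients (B) copied from the Coefficients table.
coefficients = {
    "Median SAT": 0.072,
    "Acceptance Rate": -0.249,
    "Top 10% HS": -0.163,
}


def change_in_y(variable, change_in_x):
    """Predicted change in Graduation % when one X changes, others held fixed."""
    return coefficients[variable] * change_in_x


print(change_in_y("Acceptance Rate", 1))   # effect of a 1-point increase
print(change_in_y("Median SAT", 10))       # effect of a 10-point increase
```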
MULTICOLLINEARITY (QUESTION 6)

A problem that occurs when there is a very strong correlation between the independent variables in the model.
When |r| > 0.7: a sign of multicollinearity.
2 consequences for a model with multicollinearity:
1. The sign of a coefficient may change (e.g. the true coefficient is positive, but multicollinearity turns the sign negative).
2. The Sig. value of a coefficient increases, causing the variable to lose statistical significance.

Correlations
                                             Median SAT  Acceptance Rate  Expenditures/Student  Top 10% HS
Median SAT            Pearson Correlation    1           -.602**          .573**                .503**
                      Significance(2-tailed)             .000             .000                  .000
                      N                      49          49               49                    49
Acceptance Rate       Pearson Correlation    -.602**     1                -.284*                -.610**
                      Significance(2-tailed) .000                         .048                  .000
                      N                      49          49               49                    49
Expenditures/Student  Pearson Correlation    .573**      -.284*           1                     .506**
                      Significance(2-tailed) .000        .048                                   .000
                      N                      49          49               49                    49
Top 10% HS            Pearson Correlation    .503**      -.610**          .506**                1
                      Significance(2-tailed) .000        .000             .000
                      N                      49          49               49                    49
**. Correlation at 0.01(2-tailed):...
*. Correlation at 0.05(2-tailed):...

◦ If there is no problem of multicollinearity, we continue with the regression estimation.


◦ If there is a problem of multicollinearity, we run the regression estimation with highly-correlated independent variables in separate
regressions.
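The |r| > 0.7 screen can be applied mechanically to the correlation table of the independent variables. The sketch below lists each pair once with the values from that table; the dictionary layout and the function name `collinear_pairs` are illustrative assumptions.

```python
# Pairwise correlations between the independent variables (from the table).
correlations = {
    ("Median SAT", "Acceptance Rate"): -0.602,
    ("Median SAT", "Expenditures/Student"): 0.573,
    ("Median SAT", "Top 10% HS"): 0.503,
    ("Acceptance Rate", "Expenditures/Student"): -0.284,
    ("Acceptance Rate", "Top 10% HS"): -0.610,
    ("Expenditures/Student", "Top 10% HS"): 0.506,
}


def collinear_pairs(corrs, threshold=0.7):
    """Return the pairs whose absolute correlation exceeds the threshold."""
    return [pair for pair, r in corrs.items() if abs(r) > threshold]


flagged = collinear_pairs(correlations)
if flagged:
    print("Run separate regressions for:", flagged)
else:
    print("No pair exceeds |r| = 0.7; continue with the regression.")
```

With these values no pair is flagged, so under the 0.7 rule the four predictors can stay in one regression.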

REGRESSION WITH CATEGORICAL VARIABLES


Regression=> linear
Coefficients(a)
                Unstandardized Coefficients  Standardized Coefficients
Model           B          Std. Error        Beta                       t       Significance
1  (Constant)   893.588    1824.575                                     .490    .628
   Age          1044.146   42.141            .975                       24.777  .000
   MBA          14767.232  1351.802          .430                       10.924  .000
a. Dependent Variable: Salary
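A categorical variable such as MBA enters the regression as a dummy, so its coefficient is the gap in predicted salary between the two categories at the same age. The sketch below assumes MBA is coded 1 = has an MBA and 0 = no MBA; that coding is not stated in the output, and the function name `predicted_salary` is illustrative.

```python
def predicted_salary(age, has_mba):
    """Predicted Salary from the Coefficients table (MBA coding is assumed)."""
    mba = 1 if has_mba else 0          # assumed dummy coding: 1 = MBA, 0 = no MBA
    return 893.588 + 1044.146 * age + 14767.232 * mba


# Same age, with and without an MBA: the gap equals the MBA coefficient.
gap = predicted_salary(30, True) - predicted_salary(30, False)
print(round(predicted_salary(30, False), 3), round(gap, 3))
```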
