Statistical Analysis


By Rama Krishna Kompella
Relationships Between Variables
• The relationship between variables can be
  explained in various ways such as:
  –   Presence /absence of a relationship
  –   Directionality of the relationship
  –   Strength of association
  –   Type of relationship
Relationships Between Variables
• Presence / absence of a relationship
  – E.g., if we are interested in studying the customer
    satisfaction levels of a fast-food restaurant, then
    we need to know whether the quality of food and
    customer satisfaction have any relationship
Relationships Between Variables
• Direction of the relationship
  – The direction of a relationship can be either
    positive or negative
  – Food quality perceptions are related positively to
    customer commitment toward a restaurant.
Relationships Between Variables
• Strength of association
  – Associations are generally categorized as nonexistent, weak,
    moderate, or strong.
  – Quality of food is strongly associated with customer
    satisfaction in a fast-food restaurant
Relationships Between Variables
• Type of association
  – How can the link between Y and X best be
    described?
  – There are different ways in which two variables
    can share a relationship
     • Linear relationship
     • Curvilinear relationship
Chi-Square (χ2) and Frequency Data
• Today the data that we analyze consists of frequencies; that
  is, the number of individuals falling into categories. In other
  words, the variables are measured on a nominal scale.
• The test statistic for frequency data is Pearson Chi-Square.
  The magnitude of Pearson Chi-Square reflects the amount of
  discrepancy between observed frequencies and expected
  frequencies.
Steps in Test of Hypothesis
1.   Determine the appropriate test
2.   Establish the level of significance: α
3.   Formulate the statistical hypothesis
4.   Calculate the test statistic
5.   Determine the degrees of freedom
6.   Compare computed test statistic against a
     tabled/critical value
1. Determine Appropriate Test
• Chi Square is used when both variables are
  measured on a nominal scale.
• It can be applied to interval or ratio data that
  have been categorized into a small number of
  groups.
• It assumes that the observations are randomly
  sampled from the population.
• All observations are independent (an individual
  can appear only once in a table and there are no
  overlapping categories).
• It does not make any assumptions about the
  shape of the distribution nor about the
  homogeneity of variances.
2. Establish Level of Significance
• α is a predetermined value
• The convention
     • α = .05
     • α = .01
     • α = .001
3. Determine The Hypothesis:
Whether There is an Association
            or Not
• Ho : The two variables are independent
• Ha : The two variables are associated
4. Calculating Test Statistics
• The test statistic contrasts the observed frequencies in each cell
  of a contingency table with the expected frequencies.
• The expected frequencies represent the number of
  cases that would be found in each cell if the null
  hypothesis were true (i.e., the nominal variables are
  unrelated).
• The expected frequency of a cell under independence is the
  product of its row and column frequencies divided by the
  number of cases.
            Fe = (Fr × Fc) / N
4. Calculating Test Statistics

$$\chi^2 = \sum \frac{(F_o - F_e)^2}{F_e}$$
where
• Fo = observed frequencies
• Fe = expected frequencies
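The formula can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `pearson_chi_square` and the 2x2 `toy_table` are hypothetical, chosen so that every expected frequency comes out to 25.

```python
def pearson_chi_square(observed):
    """Return (chi_square, expected) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency of each cell: Fe = Fr * Fc / N
    expected = [[fr * fc / n for fc in col_totals] for fr in row_totals]
    # Sum (Fo - Fe)^2 / Fe over every cell of the table
    chi_square = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    return chi_square, expected


# Hypothetical 2x2 table (not from the slides).
toy_table = [[20, 30], [30, 20]]
toy_chi_square, toy_expected = pearson_chi_square(toy_table)
print(toy_expected)    # every cell has Fe = 50 * 50 / 100 = 25
print(toy_chi_square)  # four cells, each contributing (5^2) / 25 = 1
```

The expected table is derived only from the row and column totals, so the same function applies to tables of any size.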
5. Determine Degrees of Freedom

          df = (R-1)(C-1)

• R = number of levels in the row variable
• C = number of levels in the column variable
6. Compare computed test statistic
      against a tabled/critical value
• The computed value of the Pearson chi-square statistic is compared
  with the critical value to determine whether the computed value is
  improbable under the null hypothesis
• The critical tabled values are based on the sampling distribution
  of the Pearson chi-square statistic
• If the calculated χ2 is greater than the tabled χ2 value, reject Ho
Example
• Suppose a researcher is interested in buying
  preferences of environmentally conscious
  consumers.
• A questionnaire was developed and sent to a
  random sample of 90 voters.
• The researcher also collects information about
  the gender of the sample of 90 respondents.
Bivariate Frequency Table or
                Contingency Table

               Favor   Neutral   Oppose   f row

Male           10      10        30       50

Female         15      15        10       40


f column       25      25        40       n = 90
Reading the contingency table:
• The cell counts (10, 10, 30 and 15, 15, 10) are the observed frequencies
• The "f row" column (50, 40) holds the row frequencies
• The "f column" row (25, 25, 40) holds the column frequencies
1. Determine Appropriate Test

1. Gender (2 levels), measured on a nominal scale
2. Buying preference (3 levels), measured on a nominal scale
2. Establish Level of Significance

            Alpha of .05
3. Determine The Hypothesis
• Ho : There is no difference between men and
  women in their opinion on pro-environmental
  products.

• Ha : There is an association between gender
  and opinion on pro-environmental products.
4. Calculating Test Statistics

               Favor    Neutral    Oppose     f row

Men            fo =10   fo =10     fo =30     50
               fe =13.9 fe =13.9   fe=22.2
Women          fo =15   fo =15     fo =10     40
               fe =11.1 fe =11.1   fe =17.8
f column       25       25         40         n = 90
Each expected frequency is the row frequency times the column frequency divided by n:
• Men, Favor or Neutral: fe = 50*25/90 = 13.9
• Women, Favor or Neutral: fe = 40*25/90 = 11.1
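The expected frequencies in the table can be reproduced with a short Python sketch. The counts are taken from the contingency table above; the variable names are illustrative only.

```python
# Observed frequencies from the gender x opinion contingency table.
observed = {
    "Men": {"Favor": 10, "Neutral": 10, "Oppose": 30},
    "Women": {"Favor": 15, "Neutral": 15, "Oppose": 10},
}

# Row frequencies (f row) and column frequencies (f column).
row_totals = {group: sum(cells.values()) for group, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for opinion, fo in cells.items():
        col_totals[opinion] = col_totals.get(opinion, 0) + fo
n = sum(row_totals.values())

# fe = row frequency * column frequency / n for every cell.
expected = {
    group: {opinion: row_totals[group] * col_totals[opinion] / n
            for opinion in col_totals}
    for group in observed
}

for group, cells in expected.items():
    print(group, {opinion: round(fe, 1) for opinion, fe in cells.items()})
```

Rounded to one decimal place, the output matches the fe values shown in the table (13.9, 13.9, 22.2 for men and 11.1, 11.1, 17.8 for women).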
4. Calculating Test Statistics

$$\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.22)^2}{22.22} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.78)^2}{17.78} = 11.03$$
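To see how much each cell adds to the total, the sum can be broken into per-cell contributions. The sketch below uses the observed counts from the example and unrounded expected frequencies; the variable names are illustrative.

```python
# Rows: Men, Women. Columns: Favor, Neutral, Oppose.
observed = [[10, 10, 30], [15, 15, 10]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell: (fo - fe)^2 / fe, with fe = fr * fc / n.
contributions = []
for row, fr in zip(observed, row_totals):
    cells = []
    for fo, fc in zip(row, col_totals):
        fe = fr * fc / n
        cells.append((fo - fe) ** 2 / fe)
    contributions.append(cells)

chi_square = sum(sum(cells) for cells in contributions)

for label, cells in zip(["Men", "Women"], contributions):
    print(label, [round(c, 2) for c in cells])
print(round(chi_square, 3))
```

With unrounded expected frequencies the total is 11.025, which rounds to the 11.03 on the slide. The two Oppose cells contribute the most, so the disagreement between men and women is concentrated there.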
5. Determine Degrees of
        Freedom
      df = (R-1)(C-1) =
       (2-1)(3-1) = 2
6. Compare computed test statistic
       against a tabled/critical value
•   α = 0.05
•   df = 2
•   Critical tabled value = 5.991
•   Test statistic, 11.03, exceeds critical value
•   Null hypothesis is rejected
•   Men and women differ significantly in their
    opinions on pro-environmental products
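The tabled value can be checked without a table in this particular case. As a sketch under one stated assumption: with df = 2 the chi-square distribution has the closed-form upper-tail probability exp(-x/2), so both the critical value and the p-value follow from the standard library alone. This shortcut holds only for df = 2; other degrees of freedom need a table or a statistics library.

```python
import math

alpha = 0.05
chi_square = 11.025  # unrounded test statistic from the example (df = 2)

# For df = 2 only: P(X > x) = exp(-x / 2), so the critical value
# solves exp(-x / 2) = alpha.
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi_square / 2)

reject_null = chi_square > critical_value

print(round(critical_value, 3))  # agrees with the tabled 5.991
print(round(p_value, 3))
print(reject_null)
```

Comparing the statistic with the critical value and comparing the p-value with α are equivalent decisions; both lead to rejecting Ho here.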
SPSS Output Example

                     Chi-Square Tests

                                                     Asymp. Sig.
                                Value         df      (2-sided)
Pearson Chi-Square              11.025 (a)     2         .004
Likelihood Ratio                11.365         2         .003
Linear-by-Linear Association     8.722         1         .003
N of Valid Cases                    90
  a. 0 cells (.0%) have expected count less than 5. The
     minimum expected count is 11.11.
Additional Information in SPSS Output
• Exceptions that might distort χ2 Assumptions
  – Associations in some but not all categories
  – Low expected frequency per cell
• Extent of association is not the same as statistical
  significance (demonstrated through an example)
Another Example: Heparin Lock Placement

Complication Incidence * Heparin Lock Placement Time Group Crosstabulation
(Time group: 1 = 72 hrs, 2 = 96 hrs)

                                                                   Heparin Lock
                                                               Placement Time Group
Complication Incidence                                              1         2      Total
Had Compilca      Count                                             9        11         20
                  Expected Count                                 10.0      10.0       20.0
                  % within Heparin Lock Placement Time Group    18.0%     22.0%      20.0%
Had NO Compilca   Count                                            41        39         80
                  Expected Count                                 40.0      40.0       80.0
                  % within Heparin Lock Placement Time Group    82.0%     78.0%      80.0%
Total             Count                                            50        50        100
                  Expected Count                                 50.0      50.0      100.0
                  % within Heparin Lock Placement Time Group   100.0%    100.0%     100.0%

from Polit Text: Table 8-1
Hypotheses in Heparin Lock Placement


• Ho: There is no association between
  complication incidence and heparin lock
  placement time. (The variables are
  independent).
• Ha: There is an association between
  complication incidence and heparin lock
  placement time. (The variables are related).
More of SPSS Output

                                     Chi-Square Tests

                                                     Asymp. Sig.    Exact Sig.   Exact Sig.
                                Value         df      (2-sided)      (2-sided)    (1-sided)
Pearson Chi-Square               .250 (b)      1         .617
Continuity Correction (a)        .063          1         .803
Likelihood Ratio                 .250          1         .617
Fisher's Exact Test                                                     .803         .402
Linear-by-Linear Association     .248          1         .619
N of Valid Cases                  100
  a. Computed only for a 2x2 table
  b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
Pearson Chi-Square
• Pearson Chi-Square = .250, p = .617
• Since p > .05, we fail to reject the null hypothesis
  that the complication rate is unrelated to heparin lock
  placement time.
• Continuity correction is used in situations in which
  the expected frequency for any cell in a 2 by 2
  table is less than 10.
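The Pearson and continuity-corrected values in the SPSS output can be reproduced by hand. The sketch below assumes that the continuity correction reported is Yates' correction, which subtracts 0.5 from each |fo − fe| before squaring. It also uses the fact that for df = 1 the upper-tail probability of chi-square equals erfc(sqrt(x/2)). The function name `p_value_df1` is illustrative.

```python
import math

# Observed and expected counts from the crosstabulation
# (rows: had / had no complication; columns: 72 hrs, 96 hrs).
observed = [[9, 11], [41, 39]]
expected = [[10.0, 10.0], [40.0, 40.0]]

cells = [
    (fo, fe)
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
]

# Pearson chi-square: sum of (fo - fe)^2 / fe.
pearson = sum((fo - fe) ** 2 / fe for fo, fe in cells)

# Yates' continuity correction (assumed to be what SPSS reports).
yates = sum((abs(fo - fe) - 0.5) ** 2 / fe for fo, fe in cells)


def p_value_df1(x):
    """Upper-tail probability of chi-square with df = 1 (valid for df = 1 only)."""
    return math.erfc(math.sqrt(x / 2))


print(round(pearson, 3), round(p_value_df1(pearson), 3))
print(round(yates, 3), round(p_value_df1(yates), 3))
```

The corrected statistic is smaller and its p-value larger, so the correction makes the test more conservative.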
More SPSS Output

                                    Symmetric Measures

                                                         Asymp.        Approx.    Approx.
                                               Value   Std. Error (a)   T (b)       Sig.
Nominal by Nominal     Phi                     -.050                                .617
                       Cramer's V               .050                                .617
Interval by Interval   Pearson's R             -.050       .100         -.496       .621 (c)
Ordinal by Ordinal     Spearman Correlation    -.050       .100         -.496       .621 (c)
N of Valid Cases                                 100
  a. Not assuming the null hypothesis.
  b. Using the asymptotic standard error assuming the null hypothesis.
  c. Based on normal approximation.
Phi Coefficient
• Pearson Chi-Square provides information about the existence of
  a relationship between 2 nominal variables, but not
  about the magnitude of the relationship
• Phi coefficient is the measure of the strength
  of the association

$$\phi = \sqrt{\frac{\chi^2}{N}}$$
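As a quick check against the SPSS output, phi can be computed from the chi-square value of the heparin lock example. The square-root formula gives only the magnitude. The signed value that SPSS prints (-.050) can be obtained for a 2x2 table from the cell counts with the standard cross-product form, which is added here as an illustration and is not taken from the slides.

```python
import math

chi_square = 0.250  # Pearson chi-square from the heparin lock example
n = 100             # number of valid cases

# Magnitude of phi from the formula on the slide.
phi_magnitude = math.sqrt(chi_square / n)

# Signed phi for a 2x2 table with cells [[a, b], [c, d]]:
# (a*d - b*c) / sqrt(row1 * row2 * col1 * col2)
a, b = 9, 11
c, d = 41, 39
signed_phi = (a * d - b * c) / math.sqrt(
    (a + b) * (c + d) * (a + c) * (b + d)
)

print(round(phi_magnitude, 3))
print(round(signed_phi, 3))
```

Both forms agree in absolute value; the sign only reflects how the rows and columns happen to be ordered.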
Cramer’s V
• When the table is larger than 2 by 2, a different index must be
  used to measure the strength of the relationship between the
  variables. One such index is Cramer’s V.
• If Cramer’s V is large, it means that there is a tendency for
  particular categories of the first variable to be associated with
  particular categories of the second variable.

$$V = \sqrt{\frac{\chi^2}{N(k - 1)}}$$
where
• N = number of cases
• k = the smaller of the number of rows and the number of columns
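Cramer's V can be sketched as a small helper and applied to both examples in the deck. The function name `cramers_v` is hypothetical; the inputs are the chi-square values, sample sizes and table dimensions reported earlier.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi_square / (N * (k - 1))), k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))


# Gender x opinion example: 2 x 3 table, chi-square = 11.025, N = 90.
v_gender = cramers_v(11.025, 90, n_rows=2, n_cols=3)

# Heparin lock example: 2 x 2 table, chi-square = 0.250, N = 100.
v_heparin = cramers_v(0.250, 100, n_rows=2, n_cols=2)

print(round(v_gender, 2))
print(round(v_heparin, 3))
```

For a 2 by 2 table k − 1 = 1, so V reduces to the magnitude of phi, which is why SPSS reports .050 for both measures in the heparin lock output.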
Q & As

T10 statisitical analysis

  • 1. Statistical Analysis By Rama Krishna Kompella
  • 2. Relationships Between Variables • The relationship between variables can be explained in various ways such as: – Presence /absence of a relationship – Directionality of the relationship – Strength of association – Type of relationship
  • 3. Relationships Between Variables • Presence / absence of a relationship – E.g., if we are interested to study the customer satisfaction levels of a fast-food restaurant, then we need to know if the quality of food and customer satisfaction have any relationship or not
  • 4. Relationships Between Variables • Direction of the relationship – The direction of a relationship can be either positive or negative – Food quality perceptions are related positively to customer commitment toward a restaurant.
  • 5. Relationships Between Variables • Strength of association – They are generally categorized as nonexistent, weak, moderate, or strong. – Quality of food is strongly associated with customer satisfaction in a fast-food restaurant
  • 6. Relationships Between Variables • Type of association – How can the link between Y and X best be described? – There are different ways in which two variables can share a relationship • Linear relationship • Curvilinear relationship
  • 7. Chi-Square (χ2) and Frequency Data • Today the data that we analyze consists of frequencies; that is, the number of individuals falling into categories. In other words, the variables are measured on a nominal scale. • The test statistic for frequency data is Pearson Chi-Square. The magnitude of Pearson Chi-Square reflects the amount of discrepancy between observed frequencies and expected frequencies.
  • 8. Steps in Test of Hypothesis 1. Determine the appropriate test 2. Establish the level of significance:α 3. Formulate the statistical hypothesis 4. Calculate the test statistic 5. Determine the degree of freedom 6. Compare computed test statistic against a tabled/critical value
  • 9. 1. Determine Appropriate Test • Chi Square is used when both variables are measured on a nominal scale. • It can be applied to interval or ratio data that have been categorized into a small number of groups. • It assumes that the observations are randomly sampled from the population. • All observations are independent (an individual can appear only once in a table and there are no overlapping categories). • It does not make any assumptions about the shape of the distribution nor about the homogeneity of variances.
  • 10. 2. Establish Level of Significance • α is a predetermined value • The convention • α = .05 • α = .01 • α = .001
  • 11. 3. Determine The Hypothesis: Whether There is an Association or Not • Ho : The two variables are independent • Ha : The two variables are associated
  • 12. 4. Calculating Test Statistics • Contrasts observed frequencies in each cell of a contingency table with expected frequencies. • The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true ( i.e. the nominal variables are unrelated). • Expected frequency of two unrelated events is product of the row and column frequency divided by number of cases. Fe= Fr Fc / N
  • 13. 4. Calculating Test Statistics  ( Fo − Fe )  2 χ = ∑ 2   Fe 
  • 14. 4. Calculating Test Statistics O fre bse qu rv en ed cie s  ( Fo − Fe )  2 χ = ∑ 2   Fe  Ex que fre pe nc cte y d qu ted cy fre pec en Ex
  • 15. 5. Determine Degrees of of ber Num ls in leve n m df = (R-1)(C-1) colu le b Freedom varia Numb e levels r of in ro variab w le
  • 16. 6. Compare computed test statistic against a tabled/critical value • The computed value of the Pearson chi- square statistic is compared with the critical value to determine if the computed value is improbable • The critical tabled values are based on sampling distributions of the Pearson chi- square statistic • If calculated χ2 is greater than χ2 table value, reject Ho
  • 17. Example • Suppose a researcher is interested in buying preferences of environmentally conscious consumers. • A questionnaire was developed and sent to a random sample of 90 voters. • The researcher also collects information about the gender of the sample of 90 respondents.
  • 18. Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Male 10 10 30 50 Female 15 15 10 40 f column 25 25 40 n = 90
  • 19. Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Male 10 10 30 50 Female 15 15 10 40 f column e d 25 25 40 n = 90 erv cies bs en O qu fre
  • 20. Bivariate Frequency Table or Row frequency Contingency Table Favor Neutral Oppose f row Male 10 10 30 50 Female 15 15 10 40 f column 25 25 40 n = 90
  • 21. Bivariate Frequency Table or Contingency Table Favor Neutral Oppose f row Male 10 10 30 50 Female 15 15 10 40 f column 25 25 40 n = 90 Column frequency
  • 22. 1. Determine Appropriate Test 1. Gender ( 2 levels) and Nominal 2. Buying Preference ( 3 levels) and Nominal
  • 23. 2. Establish Level of Significance Alpha of .05
  • 24. 3. Determine The Hypothesis • Ho : There is no difference between men and women in their opinion on pro-environmental products. • Ha : There is an association between gender and opinion on pro-environmental products.
  • 25. 4. Calculating Test Statistics
                   Favor        Neutral      Oppose       f row
      Men          fo = 10      fo = 10      fo = 30        50
                   fe = 13.9    fe = 13.9    fe = 22.2
      Women        fo = 15      fo = 15      fo = 10        40
                   fe = 11.1    fe = 11.1    fe = 17.8
      f column     25           25           40           n = 90
  • 26. 4. Calculating Test Statistics (table as on slide 25; callout: the Men fe of 13.9 is computed as 50*25/90)
  • 27. 4. Calculating Test Statistics (table as on slide 25; callout: the Women fe of 11.1 is computed as 40*25/90)
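Written out for each distinct cell, and applying Fe = Fr Fc / N to the row and column totals of the table, the expected frequencies work out as follows (values rounded to two decimals):

```latex
\begin{align*}
F_e(\text{Men, Favor})     = F_e(\text{Men, Neutral})   &= \frac{50 \times 25}{90} \approx 13.89 \\
F_e(\text{Men, Oppose})                                  &= \frac{50 \times 40}{90} \approx 22.22 \\
F_e(\text{Women, Favor})   = F_e(\text{Women, Neutral}) &= \frac{40 \times 25}{90} \approx 11.11 \\
F_e(\text{Women, Oppose})                                &= \frac{40 \times 40}{90} \approx 17.78
\end{align*}
```

As a check, the expected frequencies in each row add up to the row total (13.89 + 13.89 + 22.22 = 50 and 11.11 + 11.11 + 17.78 = 40).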
  • 28. 4. Calculating Test Statistics $\chi^2 = \frac{(10 - 13.89)^2}{13.89} + \frac{(10 - 13.89)^2}{13.89} + \frac{(30 - 22.2)^2}{22.2} + \frac{(15 - 11.11)^2}{11.11} + \frac{(15 - 11.11)^2}{11.11} + \frac{(10 - 17.8)^2}{17.8} = 11.03$
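The hand calculation can be reproduced with unrounded expected frequencies. This is a sketch in plain Python using the counts from the gender by buying preference table. The variable names are chosen for the illustration.

```python
# Observed counts from the contingency table (Favor, Neutral, Oppose).
observed = [[10, 10, 30],   # Male
            [15, 15, 10]]   # Female

row_totals = [sum(row) for row in observed]          # 50, 40
col_totals = [sum(col) for col in zip(*observed)]    # 25, 25, 40
n = sum(row_totals)                                  # 90

chi_square = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n       # unrounded expected count
        chi_square += (fo - fe) ** 2 / fe

print(round(chi_square, 3))
```

Without intermediate rounding the sum is 11.025, which rounds to the 11.03 shown on the slide.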
  • 29. 5. Determine Degrees of Freedom df = (R-1)(C-1) = (2-1)(3-1) = 2
  • 30. 6. Compare computed test statistic against a tabled/critical value • α = 0.05 • df = 2 • Critical tabled value = 5.991 • Test statistic, 11.03, exceeds critical value • Null hypothesis is rejected • Men and women differ significantly in their opinions on pro-environmental products
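For the special case df = 2, the chi-square upper-tail probability has the closed form exp(-x/2), so both the tabled value and a p-value can be checked with the standard library alone. The sketch below relies on that identity, which holds only for two degrees of freedom. The variable names are illustrative.

```python
from math import exp, log

alpha = 0.05
chi_square = 11.025            # unrounded statistic for the example (11.03 on the slide)

# With df = 2: P(chi-square > x) = exp(-x / 2).
critical_value = -2 * log(alpha)
p_value = exp(-chi_square / 2)

print(round(critical_value, 3))
print(round(p_value, 3))
print(chi_square > critical_value)
```

The critical value agrees with the tabled 5.991, and the p-value is roughly .004, well below α = .05, so the decision to reject the null hypothesis is the same either way.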
  • 31. SPSS Output Example
      Chi-Square Tests
                                       Value      df   Asymp. Sig. (2-sided)
      Pearson Chi-Square               11.025(a)   2   .004
      Likelihood Ratio                 11.365      2   .003
      Linear-by-Linear Association      8.722      1   .003
      N of Valid Cases                 90
      a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 11.11.
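The Likelihood Ratio row can be reproduced with the usual likelihood-ratio (G) statistic, G = 2 Σ Fo ln(Fo / Fe). The slides do not state this formula, so treat the sketch below as an assumption about what that row reports. It uses the same observed counts as the worked example.

```python
from math import log

observed = [[10, 10, 30],   # Male:   Favor, Neutral, Oppose
            [15, 15, 10]]   # Female: Favor, Neutral, Oppose

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

likelihood_ratio = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n
        likelihood_ratio += 2 * fo * log(fo / fe)   # all cells are non-zero here

print(round(likelihood_ratio, 3))
```

The result is close to the Pearson value, as is typical when no expected count is small.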
  • 32. Additional Information in SPSS Output • Exceptions that might distort χ2 assumptions – Associations in some but not all categories – Low expected frequency per cell • Extent of association is not the same as statistical significance (demonstrated through an example)
  • 33. Another Example: Heparin Lock Placement
      Complication Incidence * Heparin Lock Placement Time Group Crosstabulation
      Heparin Lock Placement Time Group: 1 = 72 hrs, 2 = 96 hrs

                                                                          Group 1   Group 2   Total
      Had complication      Count                                            9        11        20
                            Expected Count                                  10.0      10.0      20.0
                            % within Heparin Lock Placement Time Group      18.0%     22.0%     20.0%
      Had NO complication   Count                                           41        39        80
                            Expected Count                                  40.0      40.0      80.0
                            % within Heparin Lock Placement Time Group      82.0%     78.0%     80.0%
      Total                 Count                                           50        50       100
                            Expected Count                                  50.0      50.0     100.0
                            % within Heparin Lock Placement Time Group     100.0%    100.0%    100.0%
      from Polit Text: Table 8-1
  • 34. Hypotheses in Heparin Lock Placement • Ho: There is no association between complication incidence and heparin lock placement time. (The variables are independent.) • Ha: There is an association between complication incidence and heparin lock placement time. (The variables are related.)
  • 35. More of SPSS Output
      Chi-Square Tests
                                       Value     df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                      (2-sided)     (2-sided)    (1-sided)
      Pearson Chi-Square               .250(b)    1   .617
      Continuity Correction(a)         .063       1   .803
      Likelihood Ratio                 .250       1   .617
      Fisher's Exact Test                                           .803         .402
      Linear-by-Linear Association     .248       1   .619
      N of Valid Cases                 100
      a. Computed only for a 2x2 table
      b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 10.00.
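The Fisher's Exact Test row can be checked from the margins of the crosstab with the hypergeometric distribution. The sketch below assumes the common convention that the two-sided p-value sums every table no more probable than the observed one. The slides do not say which convention SPSS applies, and the variable names are illustrative.

```python
from math import comb

n_total = 100          # all patients
n_group1 = 50          # patients in time group 1
n_complication = 20    # patients who had a complication
observed_k = 9         # complications observed in group 1


def weight(k):
    """Number of ways to place k of the complications in group 1."""
    return comb(n_group1, k) * comb(n_total - n_group1, n_complication - k)


denominator = comb(n_total, n_complication)
support = range(0, n_complication + 1)

# One-sided: tables with as few or fewer complications in group 1.
one_sided = sum(weight(k) for k in support if k <= observed_k) / denominator
# Two-sided: tables that are no more probable than the observed one.
two_sided = sum(weight(k) for k in support
                if weight(k) <= weight(observed_k)) / denominator

print(round(one_sided, 3), round(two_sided, 3))
```

Comparing the integer weights rather than floating-point probabilities avoids tie-breaking problems between equally likely tables such as k = 9 and k = 11.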
  • 36. Pearson Chi-Square • Pearson Chi-Square = .250, p = .617. Since p > .05, we fail to reject the null hypothesis that the complication rate is unrelated to heparin lock placement time. • Continuity correction is used in situations in which the expected frequency for any cell in a 2 by 2 table is less than 10.
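Both the Pearson value and the continuity-corrected value in the output can be reproduced from the crosstab counts. The sketch assumes the continuity correction is the Yates form, which subtracts 0.5 from each |Fo - Fe| before squaring. It also uses the fact that a chi-square variable with 1 df is a squared standard normal, so its upper tail is erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt

observed = [[9, 11],     # had complication: group 1, group 2
            [41, 39]]    # had no complication
expected = [[10.0, 10.0],
            [40.0, 40.0]]

cells = [(fo, fe)
         for observed_row, expected_row in zip(observed, expected)
         for fo, fe in zip(observed_row, expected_row)]

pearson = sum((fo - fe) ** 2 / fe for fo, fe in cells)
corrected = sum((abs(fo - fe) - 0.5) ** 2 / fe for fo, fe in cells)


def p_value_df1(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return erfc(sqrt(x / 2))


print(round(pearson, 3), round(p_value_df1(pearson), 3))
print(round(corrected, 4), round(p_value_df1(corrected), 3))
```

The corrected statistic is 0.0625, which SPSS displays as .063.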
  • 37. More SPSS Output
      Symmetric Measures
                                                      Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
      Nominal by Nominal     Phi                      -.050                                         .617
                             Cramer's V                .050                                         .617
      Interval by Interval   Pearson's R              -.050   .100                   -.496          .621(c)
      Ordinal by Ordinal     Spearman Correlation     -.050   .100                   -.496          .621(c)
      N of Valid Cases                                100
      a. Not assuming the null hypothesis.
      b. Using the asymptotic standard error assuming the null hypothesis.
      c. Based on normal approximation.
  • 38. Phi Coefficient • Pearson Chi-Square provides information about the existence of a relationship between 2 nominal variables, but not about the magnitude of the relationship • Phi coefficient is the measure of the strength of the association: $\phi = \sqrt{\frac{\chi^2}{N}}$
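A short sketch of the phi calculation for the heparin lock example. The square-root formula only gives the magnitude. The sign reported by SPSS can be recovered for a 2x2 table from the cross-product form (ad - bc) / sqrt(product of the four marginal totals), which is added here as a supplement to the slide rather than taken from it.

```python
from math import sqrt

chi_square = 0.250
n = 100

phi_magnitude = sqrt(chi_square / n)

# Cell counts of the 2x2 crosstab: a, b in the first row; c, d in the second.
a, b, c, d = 9, 11, 41, 39
signed_phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

print(round(phi_magnitude, 3), round(signed_phi, 3))
```

The magnitude is .050 and the signed value is -.050, matching the Symmetric Measures output.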
  • 39. Cramer’s V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V. • If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable. $V = \sqrt{\frac{\chi^2}{N(k - 1)}}$
  • 40. Cramer’s V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer’s V. • If Cramer’s V is large, it means that there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable. $V = \sqrt{\frac{\chi^2}{N(k - 1)}}$, where N is the number of cases and k is the smaller of the number of rows or columns
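As a sketch, the formula can be applied to both examples in the deck. The function name `cramers_v` is made up for the illustration. The chi-square values, sample sizes and table dimensions are the ones reported on the earlier slides.

```python
from math import sqrt


def cramers_v(chi_square, n, n_rows, n_cols):
    """V = sqrt(chi-square / (N * (k - 1))), k = smaller of rows and columns."""
    k = min(n_rows, n_cols)
    return sqrt(chi_square / (n * (k - 1)))


# Gender (2 levels) by buying preference (3 levels): chi-square = 11.025, N = 90.
v_gender = cramers_v(11.025, 90, 2, 3)
# Complication incidence by time group (2 x 2): chi-square = .250, N = 100.
v_heparin = cramers_v(0.250, 100, 2, 2)

print(round(v_gender, 3), round(v_heparin, 3))
```

For a 2 by 2 table k - 1 = 1, so V reduces to the magnitude of phi, which is why SPSS reports .050 for both in the heparin lock example.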
