Epidemiological Applications in Health Services Research
Introduction to Multivariate Analysis
Dr. Ibrahim Awad Ibrahim.
Areas to be addressed today
Introduction to variables and data
Simple linear regression
Correlation
Population covariance
Multiple regression
Canonical correlation
Discriminant analysis
Logistic regression
Survival analysis
Principal component analysis
Factor analysis
Cluster analysis
Types of variables (Stevens' classification, 1951)
Nominal: distinct categories (race, religion, county, sex)
Ordinal: rankings (education, health status, smoking levels)
Interval: equal differences between levels (time, temperature, blood glucose levels)
Ratio: interval with a natural zero (bone density, weight, height)
Variables used in data analysis
Dependent: result or outcome (e.g. developing CHD)
Independent: explanatory (age, sex, diet, exercise)
Latent constructs: SES, satisfaction, health status
Measurable indicators: education, employment, revisits, miles walked
Variables in data example

Name         Description                # of characters   Position
STFIPS       FIPS code (state)          1                 2
STCENSUS                                1                 3
LEVEL                                   1                 4
STABBREV                                1                 5
AREANAME     Name of US/state/county    7                 6
POPULATION   1992 ABS                   7                 13
ITEM002      xyz                                          20
Data
Data screening and transformation
Normality
Independence
Correlation (or lack of independence)
Variable types and measures of central tendency
Nominal: mode
Ordinal: median
Interval: mean
Ratio: geometric mean and harmonic mean
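A minimal sketch, using NumPy and SciPy on made-up values, of the measure of central tendency matched to each variable type above.

```python
import numpy as np
from scipy import stats

nominal = ["white", "black", "white", "asian", "white"]   # nominal -> mode
ordinal = [1, 2, 2, 3, 4]                                  # ordinal -> median
interval = [98.6, 99.1, 97.8, 98.2]                        # interval -> mean
ratio = [55.0, 60.5, 72.3, 80.1]                           # ratio -> geometric / harmonic mean

mode = max(set(nominal), key=nominal.count)    # most frequent category
median = np.median(ordinal)
mean = np.mean(interval)
geometric_mean = stats.gmean(ratio)
harmonic_mean = stats.hmean(ratio)

print(mode, median, mean, geometric_mean, harmonic_mean)
```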
Simple linear regression
Y = A + BX
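A minimal sketch of fitting the line Y = A + BX by least squares with NumPy; the x and y values are made up for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

B, A = np.polyfit(x, y, deg=1)   # slope B and intercept A
y_hat = A + B * x                # fitted values on the regression line
print(f"Y = {A:.2f} + {B:.2f} X")
```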
Correlation
Mean: μ = Σx / N
Variance: σ² = Σ(x − μ)² / N
Population covariance: σxy = Σ(X − μx)(Y − μy) / N
Product moment coefficient: ρ = σxy / (σx σy)
ρ lies between -1 and 1
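A minimal sketch computing these quantities with NumPy on made-up x and y values; the product moment coefficient agrees with np.corrcoef and lies between -1 and 1.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.5, 3.1, 4.2, 6.8, 7.9])

mean_x, mean_y = x.mean(), y.mean()
var_x = ((x - mean_x) ** 2).mean()                # population variance of x
var_y = ((y - mean_y) ** 2).mean()                # population variance of y
cov_xy = ((x - mean_x) * (y - mean_y)).mean()     # population covariance
r = cov_xy / np.sqrt(var_x * var_y)               # product moment coefficient

print(r, np.corrcoef(x, y)[0, 1])                 # the two values agree
```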
Example: physical and mental health indicators
Correlations
PHYSICAL MENTAL
PHYSICAL Pearson Correlation 1.000 .230**
Sig. (2-tailed) . .000
N 109888 109888
MENTAL Pearson Correlation .230** 1.000
Sig. (2-tailed) .000 .
N 109888 109888
**. Correlation is significant at the 0.01 level (2-tailed).
Negative correlation
Correlations
WEIGHT AGEDIAB
WEIGHT Pearson Correlation 1.000 -.029**
Sig. (2-tailed) . .000
N 109888 109888
AGEDIAB Pearson Correlation -.029** 1.000
Sig. (2-tailed) .000 .
N 109888 109888
**. Correlation is significant at the 0.01 level (2-tailed).
Population covariance
[Scatter plots illustrating correlations of ρ = 0.00, 0.33, 0.6, and 0.88.]
Multiple regression and correlation
Simple linear: Y = α + βX
Multiple regression: Y = α + β1X1 + β2X2 + β3X3 + . . . + βpXp
Example: EF (ejection fraction) as a function of exercise and body fat
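A minimal sketch of a multiple regression fit by least squares with NumPy; the ejection fraction, exercise, and body fat values are made up for illustration.

```python
import numpy as np

exercise = np.array([1.0, 3.0, 5.0, 2.0, 4.0])        # hours per week
body_fat = np.array([30.0, 25.0, 20.0, 28.0, 22.0])   # percent
ef = np.array([50.0, 58.0, 65.0, 53.0, 61.0])         # ejection fraction

# Design matrix with an intercept column: Y = alpha + b1*exercise + b2*body_fat
X = np.column_stack([np.ones_like(exercise), exercise, body_fat])
coef, *_ = np.linalg.lstsq(X, ef, rcond=None)
alpha, b_exercise, b_fat = coef
print(alpha, b_exercise, b_fat)
```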
Issues with regression
Missing values: may be random or follow a pattern; common remedies are mean substitution and maximum likelihood (ML) estimation
Dummy variables: category codes should not be treated as equally spaced (equal intervals!); see the coding sketch below
Multicollinearity: independent variables are highly correlated
"Garbage can" method: entering every available variable without theoretical justification
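A minimal sketch, assuming pandas is available, of dummy coding a nominal variable and of screening for multicollinearity by inspecting pairwise correlations among the independent variables; the column names and values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "age":  [45, 52, 61, 38, 57, 49, 66, 44],
    "bmi":  [24.0, 27.5, 30.1, 22.8, 29.4, 26.3, 31.0, 23.5],
    "race": ["white", "black", "white", "asian", "black", "white", "asian", "black"],
})

# Dummy variables: one 0/1 column per category (one category dropped as the
# reference), instead of a single numeric code that would imply equal intervals.
X = pd.get_dummies(df, columns=["race"], drop_first=True)
print(X.head())

# High pairwise correlations among independent variables flag multicollinearity.
print(X.astype(float).corr().round(2))
```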
Canonical correlation
An extension of multiple regression: multiple Y variables and multiple X variables.
We find several linear combinations of the X variables and the same number of linear combinations of the Y variables.
These combinations are called canonical variables, and the correlations between the corresponding pairs of canonical variables are called CANONICAL CORRELATIONS.
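A minimal sketch of canonical correlation using scikit-learn's CCA on simulated data; the canonical correlations are computed as the correlations between corresponding pairs of canonical variables.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                                  # three X variables
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(200, 2))    # two related Y variables

cca = CCA(n_components=2)
U, V = cca.fit_transform(X, Y)          # pairs of canonical variables
canonical_corrs = [np.corrcoef(U[:, i], V[:, i])[0, 1] for i in range(2)]
print(canonical_corrs)                   # the canonical correlations
```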
Correlation matrix
Pearson correlations (N per pair ranges from 38,018 to 109,888)

            WTFORHTX  GENHLTH  PHYSHLTH  MENTHLTH  POORHLTH  HLTHPLAN  BPTAKE   TOLDHI
WTFORHTX     1.000     .072**   -.008**    .016**   -.005      .023**   .011**   .000
GENHLTH       .072**  1.000     -.228**   -.061**   -.147**    .035**  -.084**  -.091**
PHYSHLTH     -.008**  -.228**   1.000      .223**    .295**   -.011**   .083**   .030**
MENTHLTH      .016**  -.061**    .223**   1.000     -.120**   -.038**   .019**   .014**
POORHLTH     -.005    -.147**    .295**   -.120**   1.000     -.001     .055**   .014**
HLTHPLAN      .023**   .035**   -.011**   -.038**   -.001     1.000     .152**   .022**
BPTAKE        .011**  -.084**    .083**    .019**    .055**    .152**  1.000     .039**
TOLDHI        .000    -.091**    .030**    .014**    .014**    .022**   .039**  1.000
**. Correlation is significant at the 0.01 level (2-tailed).
Discriminant analysis
A method used to classify an individual into one of two or more groups based on a set of measurements.
Examples: at risk for heart disease, cancer, diabetes, etc.
It can be used for prediction and description.
Discriminant analysis
[Scatter plot of two overlapping groups, A and B; points a and b are wrongly classified.]
The discriminant function describes the probability of being classified in the right group.
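A minimal sketch of discriminant analysis with scikit-learn's LinearDiscriminantAnalysis on two simulated groups A and B; the group means and sample sizes are made up.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
group_a = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(50, 2))   # group A measurements
group_b = rng.normal(loc=[2.0, 2.0], scale=1.0, size=(50, 2))   # group B measurements
X = np.vstack([group_a, group_b])
y = np.array([0] * 50 + [1] * 50)        # 0 = group A, 1 = group B

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[1.0, 1.0]]))         # predicted group for a new individual
print(lda.predict_proba([[1.0, 1.0]]))   # probability of belonging to each group
```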
Logistic regression
An alternative to discriminant analysis for classifying an individual into one of two populations based on a set of criteria.
It is appropriate for any combination of discrete or continuous variables.
It uses maximum likelihood estimation to classify individuals based on the independent variable list.
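A minimal sketch using scikit-learn's LogisticRegression (which fits by penalized maximum likelihood by default) on simulated data; the predictor names, coefficients, and outcome are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
age = rng.uniform(30, 80, size=200)              # continuous predictor
smoker = rng.integers(0, 2, size=200)            # discrete predictor
X = np.column_stack([age, smoker])

# Simulated outcome: probability of CHD rises with age and smoking.
p = 1 / (1 + np.exp(-(-8 + 0.1 * age + 1.0 * smoker)))
chd = rng.binomial(1, p)

model = LogisticRegression(max_iter=1000).fit(X, chd)
print(model.coef_, model.intercept_)
print(model.predict_proba([[65.0, 1]]))          # P(no CHD), P(CHD) for a 65-year-old smoker
```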
Survival analysis (event history analysis)
Analyzes the length of time it takes for a specific event to occur.
Time to death, organ failure, retirement, etc.
Length of time is modeled as a function of explanatory variables (covariates).
Survival data example
[Timeline of subjects followed from 1980 to 1990: three died, one was lost to follow-up, and one was still surviving at the end of follow-up.]
Log-linear regression
A regression model in which the dependent variable is the log of survival time (t) and the independent variables are the explanatory variables.
Multiple regression: Y = α + β1X1 + β2X2 + β3X3 + . . . + βpXp
Log-linear: log(t) = α + β1X1 + β2X2 + β3X3 + . . . + βpXp + e
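A minimal sketch of the log-linear model: ordinary least squares on log survival time, ignoring censoring for simplicity; the covariates and coefficients are simulated.

```python
import numpy as np

rng = np.random.default_rng(3)
age = rng.uniform(40, 80, size=100)
treated = rng.integers(0, 2, size=100)
# Simulated log survival times that shorten with age and lengthen with treatment.
log_t = 5.0 - 0.03 * age + 0.5 * treated + rng.normal(scale=0.3, size=100)

X = np.column_stack([np.ones(100), age, treated])     # intercept + covariates
beta, *_ = np.linalg.lstsq(X, log_t, rcond=None)
print(beta)   # alpha, beta_age, beta_treated on the log-time scale
```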
Cox proportional hazards model
Another method to model the relationship between
survival time and a set of explanatory variables.
[Survival curve from 1980 to 1990; the proportion of the population who die up to time t is the shaded area.]
Cox proportional hazards model
The hazard functions (h) of groups 1 and 2 are proportional over time, so that h1(t)/h2(t) is constant.
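A minimal sketch of a Cox proportional hazards fit, assuming the lifelines package is available; the data frame, column names, and event indicator are simulated for illustration.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(4)
df = pd.DataFrame({
    "age": rng.uniform(40, 80, size=200),
    "treated": rng.integers(0, 2, size=200),
})
# Simulated survival times and event indicator (1 = died, 0 = censored).
df["time"] = rng.exponential(scale=(10 / (1 + 0.02 * df["age"])).to_numpy(), size=200)
df["died"] = rng.integers(0, 2, size=200)

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="died")
cph.print_summary()   # hazard ratios for age and treated
```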
Principal component analysis
Aimed at simplifying the description of a set of interrelated variables.
All variables are treated equally.
You end up with uncorrelated new variables called principal components.
Each one is a linear combination of the original variables.
The measure of the information conveyed by each is its variance.
The PCs are arranged in descending order of the variance explained.
Principal component analysis
A general rule is to retain PCs explaining at least 5% of the variance, but you can use a higher cutoff for parsimony.
Theory should guide the selection of the cutoff point.
PCA is sometimes used to alleviate multicollinearity.
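A minimal sketch of PCA with scikit-learn on simulated standardized variables; explained_variance_ratio_ gives the variance explained by each component in descending order, to which the 5% rule above can be applied.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
X = rng.normal(size=(300, 6))
X[:, 3] = X[:, 0] + 0.5 * rng.normal(size=300)   # make two variables correlated

pca = PCA()
scores = pca.fit_transform(StandardScaler().fit_transform(X))  # principal components
print(pca.explained_variance_ratio_)    # variance explained, in descending order

# Keep components above the chosen cutoff (e.g. at least 5% of the variance).
keep = pca.explained_variance_ratio_ >= 0.05
print(keep.sum(), "components retained")
```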
Factor analysis
The objective is to understand the underlying structure explaining the relationships among the original variables.
We use the factor loading of each variable on the generated factors to determine the usability of that variable.
Again, theory should guide what structures are depicted by the common factors encompassing the selected variables.
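A minimal sketch of factor analysis with scikit-learn's FactorAnalysis on simulated data; components_ holds the loading of each observed variable on each factor.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(6)
latent = rng.normal(size=(500, 2))                        # two underlying factors
loadings = rng.normal(size=(2, 8))
X = latent @ loadings + 0.5 * rng.normal(size=(500, 8))   # eight observed variables

fa = FactorAnalysis(n_components=2).fit(X)
print(fa.components_.round(2))   # loading of each variable on each factor
```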
Factor analysis
Total Variance Explained

            Initial Eigenvalues                  Extraction Sums of Squared Loadings
Component   Total   % of Variance  Cumulative %  Total   % of Variance  Cumulative %
1           1.699   16.986         16.986        1.699   16.986         16.986
2           1.663   16.629         33.614        1.663   16.629         33.614
3           1.108   11.083         44.697        1.108   11.083         44.697
4           1.035   10.351         55.048        1.035   10.351         55.048
5            .908    9.077         64.125
6            .881    8.808         72.933
7            .834    8.338         81.271
8            .788    7.879         89.150
9            .571    5.714         94.865
10           .514    5.135        100.000
Extraction Method: Principal Component Analysis.
Factor analysis
Component Matrix (a)
Component:        1       2       3       4
GENHLTH .450 .207 -.150 -.552
PHYSHLTH -.770 .254 -3.31E-03 -.208
MENTHLTH .652 -.232 -6.74E-02 .353
POORHLTH -.612 6.329E-02 -1.03E-02 .110
BPTAKE -.128 .352 -.465 .474
BLOODCHO 6.411E-02 .335 -.563 .158
SEATBELT .166 .697 .242 .222
SFTYLT16 .137 .676 .447 .188
BIKEHLMT .156 .414 .210 -.299
SMOKENOW -.112 -.382 .495 .356
Extraction Method: Principal Component Analysis.
a. 4 components extracted.
Cluster analysis
A method for classifying individuals into previously unknown groups.
It proceeds from the most general to the most specific:
Kingdom: Animalia
Phylum: Chordata
Subphylum: Vertebrata
Class: Mammalia
Order: Primates
Family: Hominidae
Genus: Homo
Species: sapiens
Patient clustering
Major: patients
Type: medical
Subtype: neurological
Class: genetic
Order: late onset
Disease: Guillain-Barré syndrome
Hierarchical clustering can be divisive or agglomerative (an agglomerative sketch follows below).
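A minimal sketch of agglomerative hierarchical clustering with SciPy on simulated patient measurements; the two underlying groups and the variables are made up.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(7)
patients = np.vstack([
    rng.normal(loc=0.0, size=(20, 4)),   # one underlying patient group
    rng.normal(loc=3.0, size=(20, 4)),   # a second, previously unknown group
])

Z = linkage(patients, method="ward")                # agglomerative merge tree
labels = fcluster(Z, t=2, criterion="maxclust")     # cut the tree into two clusters
print(labels)
```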
Conclusions
Presentation Schedule
4 presentations each on 4/22 and 4/27
5 presentations on 4/29
Each presentation should be a maximum of 10 minutes, with 5 minutes for discussion.
E-mail me your software and hardware requirements for your presentation.
Final projects are due 5/7/99 by 5:00 pm in my office.
Presentation Schedule 1
Date Time Who
4/22 1:00 - 1:15
1:16 - 1:30
1:31 - 1:45
1:46 - 2:00
Presentation Schedule 2
Date Time Who
4/27 1:00 - 1:15
1:16 - 1:30
1:31 - 1:45
1:46 - 2:00
2:01 - 2:15
Presentation Schedule 3
Date Time Who
4/29 1:00 - 1:15
1:16 - 1:30
1:31 - 1:45
1:46 - 2:00