C01 Introduction S

The document provides an overview of multivariate analysis, emphasizing its complexity due to simultaneous measurements on multiple variables and the need for advanced statistical techniques. It discusses various objectives of multivariate methods, such as data reduction, sorting, dependence investigation, prediction, and hypothesis testing, along with applications in fields like psychology, education, and environmental science. Additionally, it covers data structure, types of variables, handling missing values, graphical techniques, and descriptive statistics relevant to multivariate data analysis.

Uploaded by

janae gardener

1. INTRODUCTION

1.1 Aspects
Multivariate data arise when researchers record the values of several random variables
on a number of subjects, objects, or other entities of interest (we will use the general
term “units” or “items”), leading to a vector-valued or multidimensional observation
for each. Since the data include simultaneous measurements on many variables, this
body of methodology is called multivariate analysis.

The need to understand the relationships among many variables makes multivariate
analysis an inherently difficult subject. The human mind is easily overwhelmed by the
sheer bulk of the data, and more mathematics is required to derive multivariate
statistical techniques for making inferences.

Our emphasis will be on the analysis of measurements obtained without actively
controlling or manipulating any of the variables on which the measurements are made.

Multivariate data are ubiquitous as is illustrated by the examples below:

(i) Psychologists and other behavioural scientists often record the values of several
different cognitive variables on a number of subjects.

(ii) Educational researchers may be interested in the examination marks obtained
by students for a variety of different subjects.

(iii) Archaeologists may make a set of measurements on artefacts of interest.

(iv) Environmentalists might assess pollution levels of a set of cities along with
noting other characteristics of the cities related to climate and human ecology.

1.2 Classification
It is difficult to establish a classification scheme for multivariate techniques that is both
widely accepted and indicates the appropriateness of the techniques.
▪ One classification distinguishes techniques designed to study interdependent
relationships from those designed to study dependent relationships.
▪ Another classifies techniques according to the number of populations and the
number of sets of variables being studied.

The choice of methods and the types of analyses employed are largely determined by
the objective of the investigation.

The objectives of scientific investigations to which multivariate methods lend
themselves include the following:

(i) Data reduction or structural simplification:- The phenomenon being
studied is represented as simply as possible without sacrificing valuable
information. It is hoped that this will make interpretation easier.

(ii) Sorting and grouping:- Groups of “similar” objects or variables are created,
based upon measured characteristics. Alternatively, rules for classifying objects
or variables into well-defined groups may be required.

(iii) Investigation of the dependence among variables:- The nature of the
relationships among variables is of interest. Are all the variables mutually
independent or are one or more variables dependent on the others? If so, how?

(iv) Prediction:- Relationships between variables must be determined for the
purpose of predicting the values of one or more variables on the basis of
observations on the other variables.

(v) Hypothesis construction and testing:- Specific statistical hypotheses,
formulated in terms of the parameters of multivariate populations, are tested.
This may be done to validate assumptions or to reinforce prior convictions.

1.3 Applications
To give some indication of the usefulness of multivariate techniques, a few examples
are given below. These examples are multifaceted and could be placed in more than
one category.

Data reduction or structural simplification

▪ Track records from many nations were used to develop an index of performance
for both male and female athletes.
▪ Multispectral image data collected by a high-altitude scanner were reduced to a
form that could be viewed as images of a shoreline in two dimensions.
▪ Data on several variables related to cancer patient responses to radiotherapy
were used to construct a simple measure of patient response to radiotherapy.
▪ Data on several variables relating to yield and protein content were used to
create an index to select parents of subsequent generations of improved bean
plants.

Sorting and grouping

▪ Measurements of several physiological variables were used to develop a
screening procedure that discriminates alcoholics from non-alcoholics.
▪ Data related to responses to visual stimuli were used to develop a rule for
separating people suffering from a multiple-sclerosis-caused visual pathology
from those not suffering from the disease.

Investigation of the dependence among variables

▪ The associations between measures of risk-taking propensity and measures of
socioeconomic characteristics for top-level business executives were used to
assess the relation between risk-taking behaviour and performance.
▪ Data on several variables were used to identify factors that were responsible for
client success in hiring external consultants.

Prediction
▪ Measurements on several accounting and financial variables were used to
develop a method for identifying potentially insolvent property-liability insurers.
▪ cDNA microarray experiments (gene expression data) are increasingly used to
study molecular variations among cancer tumours. A reliable classification of
tumours is essential for successful diagnosis and treatment of cancer.

Hypothesis construction and testing

▪ Experimental data on several variables were used to see whether the nature of
the instructions makes any difference in perceived risks, as quantified by test
scores.
▪ Data on several variables were used to determine whether different types of
firms in newly industrialised countries exhibited different patterns of
innovation.

1.4 Data Structure

An investigator seeking to understand a social or physical phenomenon selects a
number $p \ge 1$ of variables or characters to record.

The values of these variables are all recorded for each item, individual, or experimental
unit.

The notation $x_{jk}$ indicates the particular value of the kth variable that is observed
on the jth item. That is,

$x_{jk}$ = measurement of the kth variable on the jth item

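Consequently, n measurements on p variables can be displayed as follows:

\[
\begin{array}{l|cccccc}
 & \text{Variable 1} & \text{Variable 2} & \cdots & \text{Variable } k & \cdots & \text{Variable } p \\ \hline
\text{Item 1} & x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
\text{Item 2} & x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
\text{Item } j & x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
\text{Item } n & x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{array}
\]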

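This can be represented as a matrix of n rows and p columns:

\[
X = \begin{bmatrix}
x_{11} & x_{12} & \cdots & x_{1k} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2k} & \cdots & x_{2p} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{j1} & x_{j2} & \cdots & x_{jk} & \cdots & x_{jp} \\
\vdots & \vdots & & \vdots & & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nk} & \cdots & x_{np}
\end{bmatrix}
\]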

This contains the data consisting of all of the observations on all of the variables.

Example 1.1
A selection of four receipts from a bookstore was obtained to investigate the nature of
book sales. Each receipt provided, among other things, the number of books sold and
the total amount of the sale.

Let the first variable be total dollar sales and the second variable the number of books
sold. We can then regard the numbers on the four receipts as four measurements on
two variables.

Variable 1 (dollar sales):     42  52  48  58

Variable 2 (number of books):   4   5   4   3

In notation:

\[ x_{11} = 42, \quad x_{21} = 52, \quad x_{31} = 48, \quad x_{41} = 58 \]
\[ x_{12} = 4, \quad x_{22} = 5, \quad x_{32} = 4, \quad x_{42} = 3 \]

Matrix (data array):

\[
X = \begin{bmatrix}
42 & 4 \\
52 & 5 \\
48 & 4 \\
58 & 3
\end{bmatrix}
\]

1.4.1 Types of variables

(i) Nominal
Unordered categorical variables. Examples include the sex of a respondent, hair
colour, presence or absence of depression, and nationality.

(ii) Ordinal
Variables where there is an ordering but no implication of equal distance between
the different points of the scale. Examples include educational attainment (no
schooling, primary, secondary, tertiary), social class, and degree classification
(distinction, merit, pass).

(iii) Interval
Variables where there are equal differences between successive points on the scale
but the position of zero is arbitrary. An example is the measurement of temperature
using the Celsius or Fahrenheit scales.

(iv) Ratio
The highest level of measurement, where one can investigate the relative
magnitudes of scores as well as the differences between them. The position of
zero is fixed. Examples include the absolute measure of temperature in Kelvin,
age, weight, and length.

1.4.2 Missing values
Consider the table below, where NA denotes a missing value. Missing values are one
of the problems that face statisticians undertaking statistical analysis in general and
multivariate analysis in particular.

ID  Sex     Age  IQ   Depression  Health     Weight

1   Male    21   120  Yes         Very good  150
2   Male    43   NA   No          Very good  160
3   Male    22   135  No          Average    135
4   Female  16   130  Yes         Good       110
5   Female  NA   150  Yes         Good       110
6   Male    86   150  No          Average    140
7   Female  22   84   No          Average    105
8   Female  22   84   No          Very good  105

Missing values are observations and measurements that should have been recorded
but, for one reason or another, were not.

In multivariate data, missing values arise from, for example, non-response in sample
surveys, dropouts in longitudinal studies, and refusal to answer particular questions
in a questionnaire.

The most important way of dealing with missing data is to try to avoid them during
the data-collection stage of a study. However, this is not always possible.

There are several ways to deal with missing values:-

▪ Complete-case analysis
▪ Available-case analysis
▪ Imputation
▪ Multiple imputation
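
The first three approaches can be sketched on the numeric columns of the table above. This is a hedged illustration in plain Python, not a recommended procedure: `None` stands in for NA, the field names are invented here, and the imputation shown is the simplest single (mean) imputation. Multiple imputation needs a model for the missing values and is not attempted.

```python
from statistics import mean

# Numeric columns of the table above; None stands in for NA.
records = [
    {"id": 1, "age": 21,   "iq": 120,  "weight": 150},
    {"id": 2, "age": 43,   "iq": None, "weight": 160},
    {"id": 3, "age": 22,   "iq": 135,  "weight": 135},
    {"id": 4, "age": 16,   "iq": 130,  "weight": 110},
    {"id": 5, "age": None, "iq": 150,  "weight": 110},
    {"id": 6, "age": 86,   "iq": 150,  "weight": 140},
    {"id": 7, "age": 22,   "iq": 84,   "weight": 105},
    {"id": 8, "age": 22,   "iq": 84,   "weight": 105},
]
variables = ["age", "iq", "weight"]

# Complete-case analysis: keep only units with no missing value on any variable.
complete = [r for r in records if all(r[v] is not None for v in variables)]

# Available-case analysis: each summary uses every value observed for that variable.
available_means = {
    v: mean(r[v] for r in records if r[v] is not None) for v in variables
}

# Single (mean) imputation: replace each missing value by the observed mean
# of the same variable.
imputed = [
    {key: (available_means[key] if key in variables and value is None else value)
     for key, value in r.items()}
    for r in records
]

print([r["id"] for r in complete])  # [1, 3, 4, 6, 7, 8]
print(available_means["weight"])    # 126.875
```

Complete-case analysis discards units 2 and 5 entirely, whereas available-case analysis still uses their observed values, so the two approaches can base different summaries on different sets of units.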

1.5 Graphical Techniques
There are several graphical displays that can be used to aid in data analysis. It is
impossible to plot all the measurements made on several variables simultaneously and
study the configurations, but plots of individual variables and plots of pairs of
variables can still be very informative.

Below are a few plots that frequently aid in data analysis.

In the diagram below, specimen (board) 16 and possibly specimen (board) 9 are
identified as unusual observations. Figures 1.12(a), (b), and (c) contain perspectives
of the stiffness data in the $x_1, x_2, x_3$ space.

These views were obtained by continually rotating and turning the three-dimensional
coordinate axes. Spinning the coordinate axes allows one to get a better understanding
of the three-dimensional aspects of the data. Figure 1.12(d) gives one picture of the
stiffness data in $x_2, x_3, x_4$ space. Notice that Figures 1.12(a) and (d) visually
confirm specimens 9 and 16 as outliers. Specimen 9 is very large in all three coordinates.

A counter-clockwise-like rotation of the axes in Figure 1.12(a) produces Figure 1.12(b),
and the two unusual observations are masked in this view. A further spinning of the
$x_2, x_3$ axes gives Figure 1.12(c); one of the outliers (16) is now hidden.

Additional insights can sometimes be gleaned from visual inspection of the slowly
spinning data. It is this dynamic aspect that statisticians are just beginning to understand
and exploit.

Plots like those in Figure 1.12 allow one to readily identify observations that do not
conform to the rest of the data and that may heavily influence inferences based on
standard data-generating models.

1.6 Descriptive Statistics
Information contained in the data can be assessed by calculating summary statistics.
For example, the sample mean provides a measure of location, i.e. a “central value”
for a set of numbers. The average of the squares of the distances of all the numbers
from the mean provides a measure of the variation (or spread) in the numbers.

Regard the dataset as a matrix $X$; we can then treat each column of $X$ separately.
For example, $x_{11}, x_{21}, \ldots, x_{n1}$ are the n measurements on the first variable.

For each column we can find the:

▪ sample mean

\[ \bar{x}_j = \frac{1}{n} \sum_{i=1}^{n} x_{ij} \quad \text{for } j = 1, 2, \ldots, p \]

▪ sample variance

\[ s_j^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right)^2 \quad \text{for } j = 1, 2, \ldots, p \]

The positive square root of $s_j^2$ is known as the sample standard deviation.

Consider n pairs of measurements on variables 1 and 2:

\[
\begin{bmatrix} x_{11} \\ x_{12} \end{bmatrix},
\begin{bmatrix} x_{21} \\ x_{22} \end{bmatrix},
\ldots,
\begin{bmatrix} x_{n1} \\ x_{n2} \end{bmatrix}
\]

That is, $x_{j1}$ and $x_{j2}$ are observed on the jth experimental item ($j = 1, 2, \ldots, n$).

A measure of the linear association between the measurements of variables 1 and 2 is
given by the sample covariance

\[ s_{12} = \frac{1}{n-1} \sum_{j=1}^{n} \left( x_{j1} - \bar{x}_1 \right) \left( x_{j2} - \bar{x}_2 \right) \]


▪ sample covariance between two variables (a measure of their linear dependence)

\[ s_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right) \left( x_{ik} - \bar{x}_k \right) \quad j = 1, 2, \ldots, p; \; k = 1, 2, \ldots, p \]

Note that the sample covariance reduces to the sample variance when $j = k$,
i.e., $s_j^2 = s_{jj}$.

Also, $s_{jk} = s_{kj}$ for all j and k.

▪ the (Pearson’s) sample correlation coefficient

This measure of the strength of the linear relationship between two variables does
not depend on the units of measurement.

The sample correlation coefficient for the jth and kth variables is

\[
r_{jk} = \frac{\sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right) \left( x_{ik} - \bar{x}_k \right)}
{\sqrt{\sum_{i=1}^{n} \left( x_{ij} - \bar{x}_j \right)^2} \, \sqrt{\sum_{i=1}^{n} \left( x_{ik} - \bar{x}_k \right)^2}}
= \frac{s_{jk}}{\sqrt{s_{jj}} \sqrt{s_{kk}}}
\quad j = 1, \ldots, p; \; k = 1, \ldots, p
\]

Note that $r_{jk} = r_{kj}$ for all j and k.

▪ Properties of the Correlation Coefficient

(i) $-1 \le r_{jk} \le 1$ for all j, k.

(ii) The magnitude $|r_{jk}|$ gives the strength of the linear relationship, with
values of $|r_{jk}|$ close to one implying a strong relationship and values close
to zero implying a weak relationship.

(iii) The sign of $r_{jk}$ gives the direction of the association.

(iv) If $r_{jk} = \pm 1$, then there are constants a and b such that $x_{ij} = a + b x_{ik}$
for $i = 1, 2, \ldots, n$.

(v) The value of $r_{jk}$ does not change if either variable is subject to a linear
transformation with a positive scale factor (a negative scale factor reverses
the sign of $r_{jk}$ but not its magnitude).
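
Properties (iv) and (v) can be checked numerically. The sketch below uses the two variables of Example 1.1; the function name `correlation` and the particular transformations are illustrative choices, not part of the notes. The divisor $n - 1$ cancels between numerator and denominator, so it is omitted.

```python
from math import sqrt


def correlation(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))


dollars = [42, 52, 48, 58]
books = [4, 5, 4, 3]

r = correlation(dollars, books)

# Property (v): a change of units with a positive scale factor (dollars to
# cents, plus an arbitrary shift) leaves r unchanged.
cents_shifted = [100 * d + 250 for d in dollars]
r_rescaled = correlation(cents_shifted, books)

# A negative scale factor keeps the magnitude but reverses the sign.
r_flipped = correlation([-d for d in dollars], books)

# Property (iv), read in reverse: an exact linear relation y = a + b*x with
# b > 0 gives r = 1 up to rounding error.
r_linear = correlation(dollars, [3 + 2 * d for d in dollars])

print(abs(r - r_rescaled) < 1e-12)  # True
print(abs(r + r_flipped) < 1e-12)   # True
print(abs(r_linear - 1) < 1e-12)    # True
```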

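We can represent all these quantities as arrays:

Sample means

\[
\bar{x} = \begin{bmatrix} \bar{x}_1 \\ \bar{x}_2 \\ \vdots \\ \bar{x}_p \end{bmatrix}
\]

Sample variances-covariances

\[
S_n = \begin{bmatrix}
s_{11} & s_{12} & \cdots & s_{1p} \\
s_{21} & s_{22} & \cdots & s_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
s_{p1} & s_{p2} & \cdots & s_{pp}
\end{bmatrix}
\]

Sample correlations

\[
R = \begin{bmatrix}
1 & r_{12} & \cdots & r_{1p} \\
r_{21} & 1 & \cdots & r_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
r_{p1} & r_{p2} & \cdots & 1
\end{bmatrix}
\]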

Example 1.6.1
Consider the data given in Example 1.1. Compute the sample means, variances,
covariance and correlation coefficient.
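
The notes leave this example as an exercise. One possible way to carry out the computation is sketched below in plain Python, assuming the divisor $n - 1$ used in the formulas of this section; the names `x_bar`, `S` and `R` are illustrative.

```python
from math import sqrt

# Data array of Example 1.1: rows = receipts, columns = (dollar sales, books).
X = [[42, 4], [52, 5], [48, 4], [58, 3]]
n, p = len(X), len(X[0])

# Sample mean of each column.
x_bar = [sum(row[j] for row in X) / n for j in range(p)]

# Sample covariance matrix with divisor n - 1; the diagonal holds the variances.
S = [
    [sum((row[j] - x_bar[j]) * (row[k] - x_bar[k]) for row in X) / (n - 1)
     for k in range(p)]
    for j in range(p)
]

# Sample correlation matrix: r_jk = s_jk / (sqrt(s_jj) * sqrt(s_kk)).
R = [
    [S[j][k] / (sqrt(S[j][j]) * sqrt(S[k][k])) for k in range(p)]
    for j in range(p)
]

print(x_bar)              # [50.0, 4.0]
print(S[0][1])            # -2.0
print(round(R[0][1], 4))  # -0.3638
```

Working by hand gives the same values: the deviations from the means are (-8, 2, -2, 8) and (0, 1, 0, -1), so $s_{11} = 136/3$, $s_{22} = 2/3$, $s_{12} = -6/3 = -2$ and $r_{12} = -6/\sqrt{272} \approx -0.36$.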

1.7 Geometry
(i) The straight-line or Euclidean distance of a point $P = (x_1, x_2, \ldots, x_p)$ from
the origin $O = (0, \ldots, 0)$ is

\[ d(O, P) = \sqrt{x_1^2 + x_2^2 + \cdots + x_p^2} \tag{1.7.1} \]

All points that lie a constant squared distance, such as $c^2$, from the origin
satisfy the equation

\[ d^2(O, P) = x_1^2 + x_2^2 + \cdots + x_p^2 = c^2 \tag{1.7.2} \]

This is the equation of a hypersphere (a circle, if $p = 2$); points equidistant
from the origin lie on a hypersphere.

(ii) The straight-line distance between two arbitrary points P and Q with
coordinates $P = (x_1, x_2, \ldots, x_p)$ and $Q = (y_1, y_2, \ldots, y_p)$ is given by

\[ d(P, Q) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_p - y_p)^2} \tag{1.7.3} \]

This straight-line measure is unsatisfactory for most statistical purposes,
because each coordinate contributes equally to the calculation of Euclidean
distance.
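
Equations (1.7.1) and (1.7.3) translate directly into code. The following is a minimal sketch; the function name `euclidean_distance` and the sample points are illustrative only.

```python
from math import sqrt


def euclidean_distance(P, Q):
    """Straight-line distance between points P and Q, as in (1.7.3)."""
    if len(P) != len(Q):
        raise ValueError("P and Q must have the same number of coordinates")
    return sqrt(sum((x - y) ** 2 for x, y in zip(P, Q)))


# Distance from the origin, as in (1.7.1), is the special case Q = O.
P = (3, 4)
O = (0, 0)
print(euclidean_distance(P, O))  # 5.0

# Every coordinate receives the same weight, so rescaling one coordinate
# (here the first is multiplied by 100) lets it dominate the distance.
d_before = euclidean_distance((1, 2), (2, 4))
d_after = euclidean_distance((100, 2), (200, 4))
print(d_after > d_before)  # True
```

The second comparison shows the weakness described above: the same two units appear far apart or close together depending only on the units in which the first variable is recorded.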

(iii) We need to find a ‘statistical’ distance that accounts for differences in variation
and, in due course, the presence of correlation. Our choice will depend upon
the sample variances and covariances. The statistical distance is fundamental to
multivariate analysis.

One way to proceed is to divide each coordinate by the sample standard
deviation, which gives standardised coordinates. These are now on the same
footing, so we find the distance by using the standard Euclidean formula.

(a) Consider the distance of the point $P = (x_1, x_2)$ from the origin $O = (0, 0)$.

The statistical distance is

\[
d(O, P) = \sqrt{\left( \frac{x_1}{\sqrt{s_{11}}} \right)^2 + \left( \frac{x_2}{\sqrt{s_{22}}} \right)^2}
= \sqrt{\frac{x_1^2}{s_{11}} + \frac{x_2^2}{s_{22}}} \tag{1.7.4}
\]

• In two dimensions, the difference between (1.7.1) and (1.7.4) is due to the
weights $k_1 = 1/s_{11}$ and $k_2 = 1/s_{22}$ attached to $x_1^2$ and $x_2^2$.

• If the sample variances are the same, $k_1 = k_2$, then $x_1^2$ and $x_2^2$ will
receive the same weight.
• If the variability in the $x_1$ direction is the same as the variability
in the $x_2$ direction, and the $x_1$ values vary independently of the
$x_2$ values, Euclidean distance is appropriate.

Using (1.7.4), we see that all points which have coordinates $(x_1, x_2)$ and
are a constant squared distance, $c^2$, from the origin must satisfy

\[ \frac{x_1^2}{s_{11}} + \frac{x_2^2}{s_{22}} = c^2 \]

which is the equation of an ellipse centred at the origin, whose major
and minor axes coincide with the coordinate axes. That is, the statistical
distance in (1.7.4) has an ellipse as the locus of all points a constant
distance from the origin. This general case is shown below.

(b) Consider the distance from the point $P = (x_1, x_2, \ldots, x_p)$ to any fixed
point $Q = (y_1, y_2, \ldots, y_p)$.
Assuming that the coordinates vary independently of one another, the
statistical distance is

\[
d(P, Q) = \sqrt{\frac{(x_1 - y_1)^2}{s_{11}} + \frac{(x_2 - y_2)^2}{s_{22}} + \cdots + \frac{(x_p - y_p)^2}{s_{pp}}} \tag{1.7.5}
\]
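
A short sketch of (1.7.5) makes the effect of the weights concrete. The function name `statistical_distance` is illustrative, and the sample variances below are made-up values chosen only to show the behaviour.

```python
from math import sqrt


def statistical_distance(P, Q, variances):
    """Distance (1.7.5): each squared difference is divided by s_ii."""
    if not (len(P) == len(Q) == len(variances)):
        raise ValueError("P, Q and variances must have the same length")
    return sqrt(sum((x - y) ** 2 / s for x, y, s in zip(P, Q, variances)))


# Made-up sample variances: s_11 = 4 and s_22 = 1.
s = (4.0, 1.0)
O = (0.0, 0.0)

# Both points lie at Euclidean distance 2 from the origin, but the point
# along the more variable x_1 direction is statistically closer.
d_along_x1 = statistical_distance((2.0, 0.0), O, s)
d_along_x2 = statistical_distance((0.0, 2.0), O, s)
print(d_along_x1, d_along_x2)  # 1.0 2.0

# With equal unit variances the formula reduces to Euclidean distance.
print(statistical_distance((3.0, 4.0), O, (1.0, 1.0)))  # 5.0
```

A deviation of a given size counts for less along a coordinate that varies a lot, which is the behaviour the ellipse of constant distance describes.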

(c) Equation (1.7.5) assumes independent coordinates. Suppose instead that the
coordinates of the pairs $(x_1, x_2)$ exhibit a tendency to be large or small
together, so that the sample correlation coefficient is positive, and that the
variability in the $x_2$ direction is larger than the variability in the $x_1$
direction.

In that case, we rotate the original coordinate system through the angle $\theta$
while keeping the scatter fixed, and label the rotated axes $\tilde{x}_1$ and
$\tilde{x}_2$. This suggests that we calculate the sample variances using the
$\tilde{x}_1$ and $\tilde{x}_2$ coordinates and measure distance as in (1.7.4). That is,

\[
d(O, P) = \sqrt{\frac{\tilde{x}_1^2}{\tilde{s}_{11}} + \frac{\tilde{x}_2^2}{\tilde{s}_{22}}} \tag{1.7.6}
\]

where $\tilde{s}_{11}$ and $\tilde{s}_{22}$ denote the sample variances computed from the
rotated coordinates.

[Figure: A scatterplot for positively correlated measurements and a rotated coordinate system]

The relation between the original coordinates $(x_1, x_2)$ and the rotated coordinates
$(\tilde{x}_1, \tilde{x}_2)$ is given by

\[ \tilde{x}_1 = x_1 \cos(\theta) + x_2 \sin(\theta) \]
\[ \tilde{x}_2 = -x_1 \sin(\theta) + x_2 \cos(\theta) \]

In terms of the original coordinates,

\[ d(O, P) = \sqrt{a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2} \]

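where

\[
a_{11} = \frac{\cos^2(\theta)}{\cos^2(\theta)\, s_{11} + 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{22}}
+ \frac{\sin^2(\theta)}{\cos^2(\theta)\, s_{22} - 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{11}}
\]

\[
a_{22} = \frac{\sin^2(\theta)}{\cos^2(\theta)\, s_{11} + 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{22}}
+ \frac{\cos^2(\theta)}{\cos^2(\theta)\, s_{22} - 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{11}}
\]

\[
a_{12} = \frac{\cos(\theta) \sin(\theta)}{\cos^2(\theta)\, s_{11} + 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{22}}
- \frac{\sin(\theta) \cos(\theta)}{\cos^2(\theta)\, s_{22} - 2 \sin(\theta) \cos(\theta)\, s_{12} + \sin^2(\theta)\, s_{11}}
\]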
Hence, in general,

\[
d(P, Q) = \sqrt{
\begin{aligned}
& a_{11} (x_1 - y_1)^2 + a_{22} (x_2 - y_2)^2 + \cdots + a_{pp} (x_p - y_p)^2 \\
& + 2 a_{12} (x_1 - y_1)(x_2 - y_2) + 2 a_{13} (x_1 - y_1)(x_3 - y_3) + \cdots \\
& + 2 a_{p-1,p} (x_{p-1} - y_{p-1})(x_p - y_p)
\end{aligned}
}
\]

where $P = (x_1, x_2, \ldots, x_p)$ is the point of interest and $Q = (y_1, y_2, \ldots, y_p)$
is a specified fixed point. Taking Q to be the origin $O = (0, 0, \ldots, 0)$ gives the
distance of P from the origin.
