
unit 3

The document outlines the principles of measurement in research, detailing the process of assigning numbers to empirical events and the various levels of measurement, including nominal, ordinal, interval, and ratio scales. It emphasizes the importance of reliability and validity in measurement, as well as the criteria for good measurement and the significance of sampling design in research. Additionally, it discusses attitude measurement and the different types of attitude scales used to assess individual opinions and feelings.

Measurement in Research

06/27/25
Introduction
Measurement in research consists of
assigning numbers to empirical events in
compliance with a set of rules.
1)Selecting observable empirical events
2)Using numbers or symbols to represent
aspects of the events
3)Applying a mapping rule to connect the
observation to the symbol

Introduction (cont.)
Example 1:
To study people who attend a computer
exhibition at PWTC where all of the new
computer models are on display. You
are interested in learning the male-to-female
ratio among visitors of the exhibition. You
observe those who enter the exhibition area.
• Record male as ‘m’ and female as ‘f’ or
• Record male as ‘1’ and female as ‘2’.
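The three steps of measurement (select events, choose symbols, apply a mapping rule) can be sketched for this example as follows. This is only an illustration: the visitor list and the names `MAPPING_RULE`, `encode` and `male_to_female_ratio` are hypothetical and not part of the original slides.

```python
from collections import Counter

# Mapping rule (assumed for illustration): observed category -> numeric symbol.
MAPPING_RULE = {"male": 1, "female": 2}

def encode(observations):
    """Apply the mapping rule to each observed visitor."""
    return [MAPPING_RULE[obs] for obs in observations]

def male_to_female_ratio(codes):
    """Count the symbols and return the number of males per female."""
    counts = Counter(codes)
    return counts[1] / counts[2]

# Hypothetical observations of six visitors entering the exhibition area.
visitors = ["male", "female", "male", "male", "female", "male"]
codes = encode(visitors)
ratio = male_to_female_ratio(codes)
print(codes, ratio)
```

The codes 1 and 2 are only labels here, so counting them is meaningful but averaging them is not.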

Introduction (cont.)
Example 2:
To measure the opinion of people on several
new computer models. This can be achieved
by interviewing a sample of visitors and
assigning their opinions to a scale ranging from
Strongly Agree (1) … Neutral (3) … to
Strongly Disagree (5).

What is measured?
Concepts used in research may be classified
as:
Objects
•Include the things of ordinary experience such
as people, automobiles, food etc.
Phenomena
•Things that are not concrete such as attitudes,
perception, opinion, satisfaction etc.
Properties
•Characteristics of the objects
What is measured? (cont.)
• A person’s physical properties may be stated
in terms of weight, height, posture.
• Psychological properties include attitudes
and intelligence.
• Social properties include leadership ability,
class affiliation or status.
Rules of Measurement
A rule is a guide that instructs us on what to do. An
example of a rule of measurement might be:
•Assign the numerals 1 through 7 to individuals
according to how productive they are. If the
individual is an unproductive worker with little
output, assign the numeral 1.
•If a study on office computer systems is not
concerned with a person’s depth of experience but
defines people as users or nonusers, a ‘1’ for
experience with the system and a ‘0’ for no
experience with the system can be used.
Levels of Measurement
Variables can be further differentiated in terms of
the ‘level’ or nature of measurement, according to
whether they are ‘continuous’ or ‘discrete’ in form.
Continuous variables
•Have an infinite number of values that flow along
a continuum.
•On a continuum, values can be divided and sub-
divided indefinitely in mathematical theory.
•Even a five-point scale could be divided into a
larger number of smaller units by sub-dividing
between each pair of points on the scale.
Levels of Measurement
Discrete variables
• Have a relatively fixed set of separate values or
variable attributes.
• Instead of a smooth continuum of values,
discrete variables contain distinct categories (e.g.
Gender: Male and Female).
Measurement Levels
Continuous and discrete variables yield four levels
of measurement (degree of precision of
measurement).

The four levels of measurement are:


1.Nominal
2.Ordinal
3.Interval
4.Ratio
Measurement Levels
Discrete / Categorical (Frequency)
• Nominal: Categories with no order.
• Ordinal: Categories with some order.
Continuum / Continuous / Scale (Score)
• Interval: Arranges objects according to their
magnitudes in units of equal interval.
• Ratio: Arranges objects according to their
magnitudes in units of equal interval & has a true
zero point.
Nominal Scale
• The simplest type of scale.
• A scale in which the numbers or letters assigned
to objects serve as labels for identification or
classification.

GENDER
Males: 1
Females: 2

RACE
Malays: 1
Chinese: 2
Indian: 3
Ordinal Scale
• A scale that arranges objects or alternatives
according to their magnitudes.
• A typical ordinal scale asks respondents to rate
services, brands, and so on as ‘excellent’, ‘good’,
‘fair’, or ‘poor’.
• We know ‘excellent’ is higher than ‘good’ but we do
not know by how much nor would we know whether
the gaps between ranks are the same or different.
Interval Scale
• A scale that not only arranges objects according to
their magnitudes, but also distinguishes this ordered
arrangement in units of equal interval.
• Example 1: Ratings of radio programs would involve
program evaluations using a five- or seven-point scale.
• Hence, it would be possible not only to determine
which program was best liked, second best liked, third
best liked, etc. but also the amount by which one
program was more liked than another.
Interval Scale (cont.)
• Example 2: If a temperature is 90 degrees Celsius, it
cannot be said that it is twice as hot as 45 degrees
Celsius.
• The reason for this is that 0 degrees Celsius does not
represent the absence of temperature but an arbitrary
reference point on the Celsius scale.
• Due to the lack of an absolute zero point, the interval
scale does not allow the conclusion that 90 is twice as
great as 45, only that its distance from the arbitrary
zero of the scale is two times greater.
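A short worked check of the temperature example, assuming the standard Celsius-to-Kelvin conversion (the Kelvin scale is not mentioned in the slides; it is used here only because it has a true zero):

```python
def celsius_to_kelvin(celsius):
    """Convert to the Kelvin scale, whose zero is an absolute zero."""
    return celsius + 273.15

# Ratio of the raw Celsius readings: looks like "twice as hot".
naive_ratio = 90 / 45

# Ratio on a scale with a true zero point: far from 2.
kelvin_ratio = celsius_to_kelvin(90) / celsius_to_kelvin(45)

print(naive_ratio)             # 2.0
print(round(kelvin_ratio, 3))  # 1.141
```

The ratio of the readings changes when the zero point moves, which is why ratios are not interpretable on an interval scale.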
Ratio Scale
• At the ratio level, it is possible to measure the extent
to which one variable exceeds another on a
particular dimension, and in addition, the scale of
measurement has a true zero point.
• Example: when measuring distance in meters, zero
means no distance at all. It is an absolute and non-
arbitrary zero point.
• When measuring money in currency values, again
zero means no money at all. The absolute zero point
is an important factor because such scales also have
exactly equal intervals between the separate points
on the scale.
Criteria for good measurement
1. Reliability
The degree to which measures are free from
error and therefore yield consistent results.
The reliability of a measure indicates the
stability and consistency with which the
instrument measures the concept.
Example: imperfections in the measuring
process that affect the assignment of scores or
numbers in different ways each time a measure
is taken, such as a respondent who
misunderstands a question, are a cause of
low reliability.
Criteria for good measurement
2. Validity
Is a test of how well an instrument measures the
particular concept it is intended to measure; it is
concerned with whether we measure the right
concept.
There are two types of validity: internal and
external validity.
Internal validity: concerned with the issue of the
authenticity of the cause-and-effect
relationships.
External validity: concerned with the issue of the
generalizability to the external environment.
Goodness of Measures
1. Item Analysis
Test whether items in the instrument
should belong there. Steps:
1. Calculate Total Score
2. Divide respondents into high and
low score
3. Compute t-test for each item
4. Use only items that are significant

2. Reliability Analysis
Is the measure without bias (error free)
and therefore consistent across time
and across items in the instrument?
i.e. is it stable and consistent?

3. Validity Analysis
Is the instrument measuring the concept
it sets out to measure and not
something else?

Goodness of Measures
GOODNESS OF DATA
Reliability (Accuracy)
• Stability: Test-retest, Parallel form
• Consistency: Interitem consistency, Split-half
Validity (Actuality)
• Logical (content): Face
• Criterion related: Predictive, Concurrent
• Congruent (construct): Convergent, Discriminant

Reliability and Validity
[Figure: three panels labelled ‘Valid but Unreliable’, ‘Valid & Reliable’ and ‘Reliable but NOT Valid’]

Reliability
Observed scores may reflect true scores,
but they may reflect other factors as well:
• stable characteristics: two people having the
same opinion may circle different responses
• transient personal factors such as mood
• situational factors such as time pressure
• variations in administration and mechanical
factors
Reliability: Stability and consistency

Stability – over time, conditions, state of
respondents

Consistency – Homogeneity of items; items can
measure the construct independently

Reliability of Measures
RELIABILITY
Stability
• Test-retest: Repeated measures on the same
respondent; high correlation – high reliability
• Parallel form: Two comparable sets of measures
for the same construct; same items, same response
format but different wording; Analysis - correlation
Consistency
• Interitem: Consistency of respondents’ answer to
all the items; high correlation among responses to
the items – Cronbach alpha
• Split-half: Correlation between two halves of a
measure; correlation between the two halves
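Both test-retest and split-half reliability come down to a correlation coefficient. The sketch below is a minimal illustration using Pearson's r; the data are invented, the function names are hypothetical, and the odd/even split of items is an assumption (the slides do not say how the halves are formed).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def retest_reliability(first_scores, second_scores):
    """Correlate total scores of the same respondents at two points in time."""
    return pearson(first_scores, second_scores)

def split_half_reliability(rows):
    """Correlate each respondent's odd-item total with the even-item total."""
    odd_half = [sum(row[0::2]) for row in rows]
    even_half = [sum(row[1::2]) for row in rows]
    return pearson(odd_half, even_half)

# Hypothetical total scores of five respondents measured twice.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 16, 17, 21]
r_retest = retest_reliability(time_1, time_2)

# Hypothetical ratings of five respondents on four items.
ratings = [[5, 4, 5, 4], [2, 1, 2, 2], [4, 4, 3, 4], [1, 2, 1, 1], [3, 3, 4, 3]]
r_split = split_half_reliability(ratings)
print(round(r_retest, 2), round(r_split, 2))
```

A coefficient close to 1 in either case would be read as high reliability.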
Validity
Multiple indicators are often used to capture a
given construct, e.g. attitude:
• to cover the domain of the construct
• to be robust, i.e. reduce random error
Cronbach alpha measures the intercorrelation
between indicators: they should be positively
correlated but not perfectly correlated
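As an illustration of how Cronbach alpha can be computed, the sketch below uses the usual textbook formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The formula is not given in the slides, and the ratings and the name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item ratings per respondent (same items for all)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings of five respondents on four indicators of one construct.
ratings = [[5, 4, 5, 4], [2, 1, 2, 2], [4, 4, 3, 4], [1, 2, 1, 1], [3, 3, 4, 3]]
alpha = cronbach_alpha(ratings)
print(round(alpha, 2))
```

Values near 1 indicate that the indicators move together; identical indicators would give exactly 1, which is the "perfectly correlated" case the slide warns against.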
Construct Validity

Face validity

Convergent validity (Correlation to assess it)

Divergent validity

Validity
VALIDITY
Logical (content)
• Face: Ensures adequate and representative set of
items that tap the concept. Panel of judges – face
validity
Criterion related
• Predictive: Does measure differentiate to predict a
future criterion variable? Analysis – Correlation
• Concurrent: Does measure differentiate to predict a
criterion variable currently? Analysis – Correlation
Congruent (construct)
• Convergent: Do the two instruments measuring
the concept correlate highly?
• Discriminant: Does the measure have low
correlation with an unrelated variable?
Data Source: Sampling
Two Central Questions

Do we sample or census?

If sample:

How to identify Who/what to include in
the sample? - sampling design

How many to include in the sample? -
sample size

What is a Good Sample?

Representative of the Population


Estimates from sample are accurate

Estimates from sample are precise

Steps in Sampling Design
• What is the relevant population?
• What are the parameters of interest?
• What is the sampling frame?
• What is the type of sample?
• What size sample is needed?
• How much will it cost?

Types of Sampling Design
Non-probability Design
• Convenience
• Judgement
• Quota
• Snowball
Probability Design
• One-stage design: Simple Random, Systematic,
Stratified, Cluster
• Multistage design: Simple Random, Stratified,
Combination
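Three of the one-stage probability designs can be sketched with the standard library as below. This is an illustration under assumptions: the sampling frame is a plain list, the seeds are fixed only to make runs repeatable, and all function names are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=0):
    """Every unit in the frame has an equal chance of selection."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n, seed=0):
    """Take every k-th unit after a random start, with k = len(frame) // n."""
    k = len(frame) // n
    start = random.Random(seed).randrange(k)
    return frame[start::k][:n]

def proportionate_stratified_sample(strata, n, seed=0):
    """strata: dict of stratum name -> list of units.

    Allocates n across strata in proportion to stratum size; with awkward
    proportions the rounded shares may not add up to exactly n.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    return {
        name: rng.sample(units, round(n * len(units) / total))
        for name, units in strata.items()
    }

# Hypothetical frame of 100 numbered visitors, 60 male and 40 female.
frame = list(range(1, 101))
strata = {"male": frame[:60], "female": frame[60:]}

srs = simple_random_sample(frame, 10)
systematic = systematic_sample(frame, 10)
stratified = proportionate_stratified_sample(strata, 10)
print(srs, systematic, stratified)
```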
Choosing a Sampling Design
Is REPRESENTATIVENESS critical?
YES: Choose PROBABILITY design
• Generalizability: Simple random samples;
Systematic samples; Cluster samples if not enough RM
• Subgroup differences: Equal sized subgroups?
YES: Proportionate stratified samples
NO: Disproportionate stratified samples
• Collect localized information: Area samples
• Information about subsets of sample: Double samples
NO: Choose NON-PROBABILITY design
• Quick, unreliable information: Convenience
• Relevant information about certain groups:
Only experts have information: Judgement
Info from special interest groups: Quota
Sample Size: Factors
Homogeneity of sampling units
Confidence level
Precision
Analytical Procedure

Cost, Time and Personnel

Roscoe’s Rule of Thumb
Sample sizes larger than 30 and less than 500
are appropriate for most research
A minimum of 30 for each sub-sample
Multivariate research: at least 10 times
the number of variables
Simple experiments with tight controls:
samples as small as 10 to 20
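The rules of thumb above can be turned into a small checklist function. This is only a sketch: `roscoe_issues` and its parameters are hypothetical names, and the thresholds are taken directly from the list above.

```python
def roscoe_issues(n, n_variables=0, subsample_sizes=(), tight_experiment=False):
    """Return a list of rule-of-thumb violations; an empty list means none."""
    issues = []
    if tight_experiment:
        # Simple experiments with tight controls: samples as small as 10 to 20.
        if n < 10:
            issues.append("tightly controlled experiment: use at least 10")
        return issues
    if not 30 < n < 500:
        issues.append("overall sample should be larger than 30 and less than 500")
    for size in subsample_sizes:
        if size < 30:
            issues.append(f"sub-sample of {size} is below the minimum of 30")
    if n_variables and n < 10 * n_variables:
        issues.append(f"multivariate research: need at least {10 * n_variables}")
    return issues

print(roscoe_issues(200, n_variables=8, subsample_sizes=[100, 100]))  # []
print(roscoe_issues(50, n_variables=8))
```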

MEASUREMENT OF
ATTITUDE : ATTITUDE SCALES
Attitude
Thurstone defines attitude as the degree
of positive or negative affect associated
with some psychological object.
Remmers et al. define attitude as a feeling
for or against something.
Characteristics of Attitude
Favourableness : It is the degree to
which a person is for or against a
psychological object.
Intensity : It is the strength of the feeling.
Salience : How freely or spontaneously an
individual expresses his/her attitude.
Attitudes are acquired, not inborn or innate.
Attitudes are more or less permanent.
Attitudes involve a subject-object relationship,
i.e. attitudes are formed in relation to some
person, object or situation.
Attitudes involve affective, cognitive and
action components.
Attitudes are inferred : Attitude of a person
cannot be observed directly. Attitude can
only be inferred from individual’s actions,
behaviour and statements.
Attitude Scale
The inquiry form that attempts to assess the
attitude of an individual is known as an
Opinionnaire or Attitude Scale. It consists of a
number of items that have been carefully
prepared, selected and edited according to
some criteria. Items of attitude scales are called
Statements, which can be defined as ‘anything
that is said about a psychological object.’ An
individual responds to these statements by
indicating his/her agreement or disagreement.
Assumption in Measurement of Attitude

An individual’s behaviour with respect to
the object of attitude will be consistent
from one situation to another.
Attitude cannot be observed directly. It is,
therefore, assumed that it must be inferred
from the statements, actions and
behaviour of an individual.
Different Types of Attitude
Scales
Method of Equal Appearing Intervals or
Thurstone Scale.
Method of Summated Ratings or Likert
Scale.
Method of Cumulative Scaling.
Semantic Differentials.
Criteria for Writing Statements
(Edwards, 1957)
Avoid statements that refer to the past rather
than the present.
Avoid factual statements.
Avoid statements that may be interpreted in
more than one way.
Avoid statements that are irrelevant to the
psychological object under consideration.
Avoid statements that are likely to be endorsed
by almost everyone or almost no one.
The Method of Equal Appearing
Intervals
Originally developed by Thurstone and
Chave (1929)
Assumptions underlying the Method :
The intervals into which the statements are
sorted or rated are equal.
The attitude of the judge does not influence
the sorting of the statements into the
various intervals.
Steps in the Method of Equal
Appearing Intervals
Step-I : Collection of Statements : A large
number of statements (about 100 to 200)
showing both favourable and unfavourable
attitudes in varying degrees towards the
psychological object are written or
collected by the researcher from different
sources.
Select statements that are believed to cover the entire
range of the affective scale of interest.
Keep the language of the statements simple, clear
and direct.
Statements should be short, rarely exceeding 20
words.
Each statement should contain only one complete
thought.
Statements containing universals such as all, always,
none or never often introduce ambiguity and should
be avoided.
Words such as only, just, merely and others of a
similar nature should be used with care and
moderation in writing statements.
Whenever possible, statements should be
in the form of simple sentences.
Avoid words that may not be understood
by those who are to be given the complete
scale.
Avoid the use of double negatives.
Step-II : The Sorting of Statements : In the
second step, the statements are sent to the
experts or judges for classification on an 11-point
continuum, according to the favourableness or
unfavourableness of each statement towards
the psychological object under study. The
researcher proceeds as follows :
Each statement is printed on a separate
sheet / card.
Each judge is then given 11 cards
(envelopes) on which the letters A to K (or
numbers 1 to 11) are written.
These cards / envelopes are arranged before the
judges in such a manner that ‘A’ is kept at the extreme
left and ‘K’ is kept at the extreme right.
Statements that seem to express the most
unfavourable feelings about the object of attitude
are to be placed on the card / envelope ‘A’.
Statements that seem to express the most
favourable feelings about the object of attitude
are to be placed on the card / envelope ‘K’.
Statements that express neither favourable nor
unfavourable feelings about the object of attitude
are to be placed in the middle ‘F’ card / envelope
that is described as the neutral card.
The cards / envelopes lettered from ‘G’ to
‘K’ represent various degrees of
favourableness and the cards from ‘E’ to
‘A’ represent various degrees of
unfavourableness.
Step-III : Selection of Statements for the Final
Scale : The next step is to determine the Scale
Value and Q-value of each statement. Thurstone
and Chave used the median of the judges’ ratings
as the scale value, to show the favourableness or
unfavourableness of a statement, and the quartile
deviation as the Q-value, a measure of the spread
of the ratings for that statement. The
final form of the scale is then constructed by
selecting 30 to 35 statements that are most
relevant, least ambiguous and which cover or
represent the different intensities of the attitude.
They are then arranged in a random order.
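The scale value and Q-value of a single statement can be sketched as below, assuming the judges' placements are coded 1 to 11 (A to K). This is a simplification: it works on the raw ratings with `statistics.quantiles` rather than on a grouped frequency table, and the two sets of ratings are invented.

```python
from statistics import median, quantiles

def scale_value_and_q(ratings):
    """ratings: the interval (1 to 11) each judge assigned to one statement.

    Returns (scale value, Q-value), i.e. the median and the quartile
    deviation (half the distance between the first and third quartiles).
    """
    first_quartile, _, third_quartile = quantiles(ratings, n=4)
    return median(ratings), (third_quartile - first_quartile) / 2

# Hypothetical ratings by seven judges for two statements.
agreed_statement = [8, 9, 9, 9, 9, 10, 10]     # judges largely agree
ambiguous_statement = [2, 4, 6, 7, 9, 10, 11]  # judges disagree widely

print(scale_value_and_q(agreed_statement))     # (9, 0.5)
print(scale_value_and_q(ambiguous_statement))  # (7, 3.0)
```

A small Q-value means the judges placed the statement consistently, so the first statement would be the less ambiguous candidate for the final scale.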
Step-IV :Determining Reliability : Split-
half technique is used.
Step-V : Determining Validity : By
correlating the average attitude scores with
actual behaviour of the subjects.
The Method of Summated
Ratings
Introduced by Likert (1932)
Step-I : Collection of Items :
A large number of statements that express an opinion or
feeling towards the psychological object are collected.
The statements express definite favourableness or
unfavourableness to the psychological object.
The number of favourable and unfavourable statements
should be approximately equal.
Statements are then edited keeping in view the criteria.
Each item is followed by five responses, viz. Strongly Agree,
Agree, Undecided, Disagree and Strongly Disagree.
One of these responses is to be checked by the respondent.
Step-II : The Try Out :
The preliminary draft of the scale is administered to
a sample of about 200 subjects selected from the
target population.
Arbitrary scoring weights are used as follows :

SA A UD D SD
Favourable 5 4 3 2 1
Unfavourable 1 2 3 4 5

An individual’s score on a particular attitude scale is
the sum of his/her ratings on all items.
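The scoring weights can be applied as in the following sketch. The four example responses and the names `WEIGHTS`, `item_score` and `total_score` are hypothetical; the weights themselves are those of the table above.

```python
# Weights for a favourable statement; unfavourable statements are reversed.
WEIGHTS = {"SA": 5, "A": 4, "UD": 3, "D": 2, "SD": 1}

def item_score(response, favourable):
    """Score one response, reversing the weight for unfavourable statements."""
    weight = WEIGHTS[response]
    return weight if favourable else 6 - weight

def total_score(responses, favourable_flags):
    """An individual's score: the sum of the ratings on all items."""
    return sum(item_score(r, f) for r, f in zip(responses, favourable_flags))

# Hypothetical respondent answering two favourable and two unfavourable items.
responses = ["SA", "D", "A", "SD"]
favourable_flags = [True, False, True, False]
score = total_score(responses, favourable_flags)
print(score)  # 5 + 4 + 4 + 5 = 18
```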
Step-III : Selection of Items and
Preparation of the Final Draft :
Items are selected in this method using Item
Analysis.
On the basis of the total scores, 25 % of the
subjects with the highest total scores and 25%
of the subjects with the lowest total scores are
taken.
In evaluating the responses of high and low
groups to the individual statements, ‘t’ ratio is
found out.
The value of ‘t’ measures the extent to which a
given statement differentiates between the high
and low groups.
A t-value equal to or greater than 1.75 indicates
that the average response of the high and low
groups to a statement differs significantly.
About 20-25 statements with the largest t-values (t
≥ 1.75) are selected for the final draft of the
attitude scale.
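The item-analysis steps can be sketched as follows. The slides do not give the formula for the ‘t’ ratio, so this example assumes the usual two-sample form, (mean of high group − mean of low group) / sqrt(s²_H/n_H + s²_L/n_L); the data and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def high_low_groups(totals, fraction=0.25):
    """Indices of the top and bottom `fraction` of respondents by total score."""
    order = sorted(range(len(totals)), key=lambda i: totals[i])
    size = max(2, int(len(totals) * fraction))
    return order[-size:], order[:size]

def item_t(item_scores, high, low):
    """t ratio comparing the high and low groups on one item."""
    high_scores = [item_scores[i] for i in high]
    low_scores = [item_scores[i] for i in low]
    std_error = sqrt(variance(high_scores) / len(high_scores)
                     + variance(low_scores) / len(low_scores))
    difference = mean(high_scores) - mean(low_scores)
    if std_error == 0:  # no spread in either group
        return float("inf") if difference else 0.0
    return difference / std_error

# Hypothetical scores of eight respondents on two items.
item_a = [5, 5, 4, 4, 3, 2, 2, 1]   # separates high and low scorers
item_b = [3, 4, 3, 3, 4, 3, 4, 3]   # hardly separates them
totals = [a + b for a, b in zip(item_a, item_b)]

high, low = high_low_groups(totals)
t_a = item_t(item_a, high, low)
t_b = item_t(item_b, high, low)
selected = [name for name, t in (("A", t_a), ("B", t_b)) if t >= 1.75]
print(t_a, t_b, selected)
```

With these invented data only item A reaches the 1.75 cut-off, so only item A would be kept.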
Step-IV :Determining Reliability : Split-
half technique is used.
Step-V : Determining Validity :
By correlating the results with scores obtained
using other scales. Also, by correlating the
average attitude scores with the actual
behaviour of the subjects.
Analysis and Interpretation of Data:
In equal-appearing interval scales, the attitude
score of an individual is taken as the mean of the scale
values of the statements with which he/she agrees, and
the interpretation of an attitude score is made
independently of the distribution of scores for a particular
group of individuals.
On the other hand, the interpretation of an attitude
score on a summated-rating scale cannot be made
independently of the distribution of scores of some
defined or norm group. The interpretation of the
summated-rating attitude score of an individual in terms
of favourableness or unfavourableness is always done
with the help of the mean of the norm group.
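The contrast between the two interpretations can be illustrated with a short sketch. All numbers are invented, and `thurstone_score` and `deviation_from_norm` are hypothetical helper names.

```python
from statistics import mean

def thurstone_score(scale_values, agreed_statements):
    """Mean scale value of the statements the individual agrees with."""
    return mean(scale_values[s] for s in agreed_statements)

def deviation_from_norm(score, norm_group_scores):
    """Summated-rating score expressed relative to the norm group mean."""
    return score - mean(norm_group_scores)

# Equal-appearing intervals: hypothetical scale values on the 11-point continuum.
scale_values = {"s1": 2.0, "s2": 6.5, "s3": 9.5, "s4": 10.5}
attitude = thurstone_score(scale_values, ["s3", "s4"])
print(attitude)  # 10.0, read directly against the continuum

# Summated ratings: the same kind of reading needs a norm group.
difference = deviation_from_norm(82, [60, 70, 80])
print(difference)  # 12 points above the norm group mean
```

The first score can be read against the continuum itself, whereas the second only becomes meaningful once the norm group mean is known.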
Limitations in Measurement of
Attitude
An individual may conceal his/her real
attitude and express socially accepted
opinions only.
An individual may not really know how
(s)he feels about a social issue.
An individual may not be able to express
his/her attitude towards a situation in
abstract.
LIKERT SCALE
ORIGIN OF LIKERT SCALE
The original idea for the Likert scale is found in
Rensis Likert’s 1932 article in Archives of
Psychology titled “A Technique for the
Measurement of Attitudes”.
Likert-type or frequency scales use fixed
choice response formats and are designed
to measure attitudes or opinions.
What is Likert scale?
• It is a psychometric scale commonly
involved in research that employs
questionnaires.
• It is the most widely used approach to
scaling responses in survey research.
• Likert scales are a non-comparative
scaling technique and are one-
dimensional in nature.
• When responding to a Likert questionnaire
item respondents specify their level of
agreement or disagreement on a
symmetric agree-disagree scale for a
series of statements.
The format of a typical Seven-level
Likert item
The format of a typical five-level
Likert item
I believe that ecological questions are the
most important issues facing human
beings today.
Strongly agree / agree / don’t know /
disagree / strongly disagree
Each of the five (or seven) responses
would have a numerical value which would
be used to measure the attitude under
investigation.
When to Use Likert Scales
We can use it to get an overall
measurement of a particular topic, opinion,
or experience and also collect specific data
on contributing factors.
Choose a particular scale (3 point, 5 point,
7 point, etc) and use it as your standard to
cut down on potential confusion and
fatigue. This will also allow for comparisons
within and between your data sets.
ADVANTAGES
• Likert Scale questions use psychometric
testing to measure beliefs, attitudes and
opinions.
• Because the data are quantitative, it is easy to
draw conclusions, reports, results and
graphs from the responses.
• Likert Scale questions use a scale, so people
are not forced to express an either-or
opinion; rather, it allows them to be neutral.
• It is a very easy and quick type of survey and
it can be sent out through all modes of
communication, including even text
messages.
LIMITATIONS

They are uni-dimensional, because they
only give a certain number of choices.
Previous questions may influence
responses to any further questions that
are asked.
Participants may not be completely honest,
which may be intentional or unintentional.
Participants may base answers on feelings
toward the surveyor or subject.
The scale requires a great deal of
decision-making.
It can take a long time to analyze the data.

SEMANTIC DIFFERENTIAL
SCALE
The semantic differential scale, or S.D. scale,
developed by Charles E. Osgood, G.J. Suci
and P.H. Tannenbaum (1957), is an attempt to
measure the psychological meanings of an
object to an individual.
This scale is based on the presumption that an
object can have different dimensions of
connotative meanings which can be located in
multidimensional property space, or what can be
called the semantic space in the context of S.D.
scale.
This scaling consists of a set of bipolar rating
scales, usually of 7 points, by which one or
more respondents rate one or more concepts
on each scale item.
For instance, the S.D. scale items for analysing
candidates for leadership position may be shown
as under:

(E) Successful ... Unsuccessful
(P) Severe ... Lenient
(P) Heavy ... Light
(A) Hot ... Cold
(E) Progressive ... Regressive
(P) Strong ... Weak
(A) Active ... Passive
(A) Fast ... Slow
(E) True ... False
(E) Sociable ... Unsociable
3 2 1 0 –1 –2 –3
Candidates for leadership position (along with
the concept—the ‘ideal’ candidate) may be
compared and we may score them from +3 to –3
on the basis of the above stated scales.

(The letters E, P, A, showing the relevant
factor viz., evaluation, potency and activity
respectively, written along the left side are not
written in actual scale. Similarly the numeric
values shown are also not written in actual
scale.)
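An average score per factor can be worked out from one respondent's ratings as in this sketch. The factor assigned to each adjective pair follows the E, P, A letters in the list of scale items; the ratings and the names `SCALE_FACTORS` and `factor_scores` are hypothetical.

```python
from statistics import mean

# Adjective pair -> factor (E = evaluation, P = potency, A = activity).
SCALE_FACTORS = {
    "Successful-Unsuccessful": "E", "Severe-Lenient": "P",
    "Heavy-Light": "P", "Hot-Cold": "A",
    "Progressive-Regressive": "E", "Strong-Weak": "P",
    "Active-Passive": "A", "Fast-Slow": "A",
    "True-False": "E", "Sociable-Unsociable": "E",
}

def factor_scores(ratings):
    """Average the +3 to -3 ratings of one concept within each factor."""
    by_factor = {}
    for scale, rating in ratings.items():
        by_factor.setdefault(SCALE_FACTORS[scale], []).append(rating)
    return {factor: mean(values) for factor, values in by_factor.items()}

# Hypothetical ratings of one candidate by one respondent.
candidate = {
    "Successful-Unsuccessful": 3, "Severe-Lenient": -1,
    "Heavy-Light": 1, "Hot-Cold": 0,
    "Progressive-Regressive": 2, "Strong-Weak": 3,
    "Active-Passive": 2, "Fast-Slow": 1,
    "True-False": 3, "Sociable-Unsociable": 0,
}
scores = factor_scores(candidate)
print(scores)  # E averages 2, P averages 1, A averages 1
```

Profiles computed this way for each candidate could then be compared with the profile of the ‘ideal’ candidate.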
Osgood and others produced a list of
adjective pairs for attitude research purposes and
concluded that semantic space is
multidimensional rather than unidimensional.
They ultimately found
that three factors, viz., evaluation, potency and
activity, contributed most to meaningful
judgements by respondents.
The evaluation dimension generally accounts
for between 1/2 and 3/4 of the extractable variance, and the
other two factors account for the balance.
Procedure : The various steps involved in
developing an S.D. scale are as follows :
First of all, the concepts to be studied are
selected. The concepts are usually chosen by
personal judgement, keeping in view the nature of
the problem.
The next step is to select the scales, bearing in
mind the criterion of factor composition and the
criterion of the scale’s relevance to the concepts
being judged (it is common practice to use at
least three scales for each factor, with the help of
which an average factor score has to be worked
out). One more criterion to be kept in view is that
scales should be stable across subjects and
concepts.
Then a panel of judges is used to rate the
various stimuli (or objects) on the various
selected scales, and the responses of all
judges are then combined to determine
the composite scaling.
THANK YOU
