Psychometrics ppt
MEGHANAA KARANAM
Psychological Testing vs. Assessment
Factors Affecting Psychological Testing
1. Examiner Variables
Examples:
Wechsler Adult Intelligence Scale (WAIS-IV) – Measures intelligence and cognitive
abilities.
Stanford-Binet Intelligence Scales – Used for giftedness and intellectual disability
assessment.
Thematic Apperception Test (TAT) – Measures unconscious motives and emotions.
Based on Administration: Individual
vs. Group Testing
Group Tests
Administered to multiple people simultaneously, often in educational and
organizational settings.
Advantages:
Time-efficient and cost-effective.
Standardized administration and scoring.
Useful for screening large populations.
Examples:
Raven’s Progressive Matrices (RPM) – Non-verbal intelligence test.
Scholastic Aptitude Test (SAT) – Measures reasoning and critical
thinking skills.
Tests for Special Populations
Designed for individuals with specific needs such as disabilities,
neurodivergent conditions, or cultural backgrounds.
Verbal Tests
Non-Verbal Tests
Performance Tests
Culture-Fair Tests
The difficulty index is calculated as:
P = (number of examinees answering the item correctly) / (total number of examinees)
Where:
P is the difficulty index.
The value of P ranges from 0 to 1.
A higher value (closer to 1) indicates an easier item.
A lower value (closer to 0) indicates a more difficult item.
Item Analysis
Example Calculation:
Suppose 100 students take a test.
80 students answer a specific question correctly.
The difficulty index (P) is:
P = 80/100 = 0.80
Since 80% of students got the item correct, it is considered easy.
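As a rough illustration, the calculation above can be expressed in a few lines of Python; the function name and data below are illustrative only, not part of the original slides.

```python
# Minimal sketch of the item difficulty index, assuming item responses
# are scored 1 (correct) or 0 (incorrect).

def difficulty_index(responses):
    """Return P, the proportion of examinees answering the item correctly."""
    return sum(responses) / len(responses)

# 100 students, 80 correct answers -> P = 0.80 (an easy item)
responses = [1] * 80 + [0] * 20
print(difficulty_index(responses))  # 0.8
```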
Item Analysis
Item Discrimination
Definition: Item discrimination refers to the ability of a test item to
differentiate between high-performing and low-performing examinees.
It is calculated using the discrimination index (D), which measures the
difference in performance between high and low scorers.
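One common formulation is D = P_upper − P_lower: the proportion of high scorers answering the item correctly minus the proportion of low scorers doing so. The sketch below assumes this formulation and uses a conventional 27% upper/lower grouping, which is not taken from the slides; all names are illustrative.

```python
# Sketch of a discrimination index D = P_upper - P_lower for one item.
# The 27% upper/lower split is a conventional choice.

def discrimination_index(item_correct, total_scores, fraction=0.27):
    """item_correct: list of 0/1 scores on the item, one per examinee.
    total_scores: list of total test scores, in the same order."""
    n = len(total_scores)
    k = max(1, int(n * fraction))
    # Rank examinees by total test score
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower
```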
• The EQUIVALENCE aspect considers how much error may be introduced by different investigators or by different samples of the items being studied. A good way to test the equivalence of measurements made by two investigators is to compare their observations of the same events.
Methods for assessing reliability
1. Test-Retest Method
Advantages:
Can be applied when only one version of the test is available.
Provides a simple and intuitive measure of reliability.
Methods for assessing reliability
1. Test-Retest Method
Limitations:
Conducting multiple test sessions can be costly and time-consuming.
Participants may remember their previous responses, leading to
artificially high reliability.
Psychological factors (e.g., anxiety, motivation) may change over time,
affecting scores.
A low correlation does not necessarily indicate poor reliability but could
suggest changes in the underlying construct.
The longer the interval between tests, the higher the chance of true
changes in the measured construct.
Reactivity effects may cause participants to change their attitudes
between test and retest.
Methods for assessing reliability
2. Alternative Form Method (Equivalent/Parallel Forms Method)
This method addresses some of the limitations of the test-retest method by using
two different but equivalent versions of a test instead of repeating the same test.
These alternate forms contain questions of equal difficulty but differ in content to
prevent memory bias. The reliability coefficient is calculated by correlating the
scores from both forms, typically administered about two weeks apart.
Advantages:
Reduces memory-related biases that may inflate reliability.
Provides a more rigorous assessment of measurement precision.
Limitations:
Developing equivalent test forms is challenging and time-intensive.
Requires participants to take two different tests, which may be burdensome.
Administering two separate tests increases the demand on resources.
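As noted above, the parallel-forms reliability coefficient is simply the correlation between scores on the two forms. A minimal sketch, assuming each person has a score on Form A and Form B (the data below are made-up illustrative values):

```python
# Parallel-forms reliability as Pearson's r between the two forms.
from scipy.stats import pearsonr

form_a = [12, 15, 9, 20, 17, 11, 14, 18]
form_b = [13, 14, 10, 19, 18, 10, 15, 17]

r, _ = pearsonr(form_a, form_b)
print(f"Parallel-forms reliability: r = {r:.2f}")
```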
Methods for assessing reliability
3. Split-Half Method
This method assesses the internal consistency of a test by dividing it into two
equal halves and comparing the scores from each half. A common way to split the
test is by grouping odd-numbered and even-numbered items separately (Odd-
Even reliability). The correlation between the two halves is calculated using
Pearson's correlation coefficient, which is then adjusted using the Spearman-
Brown formula to estimate full-test reliability.
Spearman-Brown Formula: r_full = 2 × r_half / (1 + r_half), where r_half is the correlation between the two halves and r_full is the estimated reliability of the full-length test.
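A minimal sketch of the odd-even split-half procedure with the Spearman-Brown correction, assuming a small 0/1 item-response matrix (rows = examinees, columns = items); all data are illustrative.

```python
# Odd-even split-half reliability with the Spearman-Brown correction.
from scipy.stats import pearsonr

responses = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 0],
    [1, 1, 1, 0, 1, 1],
]

# Score the odd-numbered and even-numbered items separately
odd_scores  = [sum(row[0::2]) for row in responses]
even_scores = [sum(row[1::2]) for row in responses]

r_half, _ = pearsonr(odd_scores, even_scores)

# Spearman-Brown: estimated reliability of the full-length test
r_full = (2 * r_half) / (1 + r_half)
print(f"Half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")
```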
Advantages:
Requires only one test administration, unlike the test-retest and parallel forms
methods.
Suitable when time or resources do not allow for multiple testing sessions.
Limitations:
The reliability estimate varies based on how the test is split (e.g., first vs.
second half vs. odd-even).
Different methods of splitting items may lead to different reliability coefficients.
Validity in Measurement
Validity refers to the extent to which a measuring instrument accurately
measures what it is intended to measure. In social science and development
research, establishing validity is crucial, especially for complex variables like
malnutrition or intellectual development, where direct measurement is difficult.
1. Content Validity
Content validity assesses whether a test adequately
covers the domain of interest.
It ensures that the test represents all aspects of the
construct it aims to measure.
Example: A language proficiency test should include
grammar, vocabulary, comprehension, and writing skills. If
the test aligns well with instructional objectives, it is
content valid.
Types of Validity in Measurement
2. Criterion Validity
This type of validity examines how well a test correlates with an independent
criterion (i.e., an external measure).
a) Predictive Validity
Measures how well a test predicts future performance on a relevant criterion.
Example: Entrance exams (e.g., GRE, SAT) are validated by correlating test scores
with students’ academic performance in the future.
b) Concurrent Validity
Assesses how well a test correlates with an external measure taken at the same
time.
Example: A diagnostic test that differentiates students who need extra coaching
from those who do not has concurrent validity.
Key Difference: Predictive validity concerns future outcomes, while concurrent validity
evaluates present characteristics.
Types of Validity in Measurement
3. Construct Validity
Construct validity examines how well a test measures a
theoretical construct rather than a directly observable trait.
It is used when there is no universally accepted criterion or
content framework for measurement.
Construct validation requires:
a) Defining theoretical relationships between concepts.
b) Empirically testing these relationships.
c) Interpreting findings to confirm or refine the theory.
Example: Intelligence tests should correlate with other cognitive
ability measures to demonstrate construct validity.
Reliability vs. Validity: Which is More Important?
Key Differences: