1 VALIDITY, RELIABILITY AND USABILITY
2 Essential assessment characteristics
Validity
Reliability
Usability
3 Validity and reliability
Validity
adequacy and appropriateness of the interpretations and uses of assessment
results
E.g.
If the results are to be used as a measure of students’ reading skills
our interpretations are to be based on evidence that the scores actually reflect
reading skills
not impacted by irrelevant factors, such as vocabulary or linguistic complexity
4 Validity and reliability
Reliability
the consistency of assessment results
E.g.
we get similar scores when the same assessment procedure is used with the same
students on two different occasions
a high degree of reliability from one occasion to another
We get similar scores when different teachers independently rate student
performances on the same assessment task
a high degree of reliability from one rater to another
5 Validity and reliability
Reliability
we are concerned with consistency of the results
rather than with appropriateness of the interpretations made from the results
(which is validity).
Reliability (consistency) of measurement is needed to obtain valid results, but we can
have reliability without validity
6 Usability
Refers to the practicality of the procedure
Not about whether the other qualities (validity and reliability) are present
Assessment procedure should
Be economical in terms of time and money
Be easily administered
Be easily scored
Produce results that can be accurately interpreted
7 Nature of validity
Validity
The appropriateness of the interpretation and use of the results
A matter of degree
it does not exist on an all-or-none basis (high validity, low validity)
Specific to some particular use or interpretation for a specific population of test
takers
No assessment is valid for all purposes
When indicating computational skill
a mathematics test may have a high degree of validity for 3rd and 4th graders but a low degree of validity for 2nd and 5th graders
A reading test
may have high validity for skimming and scanning and low validity for
inferencing
Necessary to consider the specific interpretation or use to be made of the results
8 Major considerations in assessment validation
Content
The assessment content and specifications from which it was derived
Construct
The nature of the characteristics being measured
Assessment-criterion relationships
The relation of the assessment results to other measures
Consequences
The consequences of the uses and interpretations of the results
9 Content
How an individual performs on a domain of tasks that the assessment is supposed
to represent
E.g. knowledge of 200 words
we select 20 words and generalize it to the knowledge of 200
the extent to which our 20-word test constituted a representative sample of the
200 words
the goal in the consideration of content validation
to determine if a set of assessment tasks
provides a relevant and representative sample of the domain of tasks
10 Content
The definition of the domain to be assessed
derive from the identification of goals and objectives
The assessment begins with a content area that reflects the goals and objectives
Steps
Specifying the domain of instructionally relevant tasks
Specifying the emphasis according to the priority of goals and objectives
Constructing or selecting a representative set of assessment tasks
From what has been taught
to what is to be measured
to what should be emphasized in the assessment
to a representative sample of relevant tasks
11 Content
Assessment development to enhance validity
Table of specifications
Subject-matter content (topics to be learned)
Instructional objectives (types of performance)
12 Content
Assessment development to enhance validity
The percentages in the table
indicate the relative degree of emphasis that each content area and each instructional objective is to be given in the test
13 Content
Table of specifications
The specifications should be in harmony with what was taught
The weights assigned in the table reflect the emphasis that was given during
instruction
The more closely the questions match the specified sample
the more valid the measure of student learning will be
It can be used in selecting tests that publishers prepare
How well do they match with our table of specifications?
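A hypothetical table of specifications is sketched below; the content areas, objectives, and percentages are purely illustrative and not taken from any particular course.

Content area    Knows basic terms   Applies principles   Interprets data   Total
Fractions              10%                 10%                  5%           25%
Decimals                10%                 15%                  5%           30%
Percentages             10%                 20%                 15%           45%
Total                   30%                 45%                 25%          100%

Each cell gives the proportion of assessment tasks devoted to that combination of content area and objective, so the row and column totals show the overall emphasis placed on each topic and each type of performance.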
14 Construct
Is the test actually measuring the construct it claims to measure?
A construct is an individual characteristic or an abstract theoretical concept
assumed to exist to explain some aspect of behavior
Reading comprehension, inferencing, speaking proficiency, intelligence,
creativity, anxiety, mathematical reasoning, etc.
These are called constructs because they are theoretical constructions that are used
to explain performance on an assessment
15 Construct
Construct validation
the process of determining if the performance on an assessment can be
interpreted in terms of a construct(s)
Two questions are important in construct validation
Does the assessment adequately represent the intended construct? (construct
underrepresentation)
Problem-solving task turning into a memorization task
Is performance influenced by factors that are irrelevant to the construct?
(construct-irrelevant variance)
A mathematics test influenced by reading demands
16 Methods used in construct validation
Defining the domain (area) of tasks to be measured (also done in content validation)
Analyzing the response process required by the assessment tasks
Thinking aloud or interviewing (to check on mental process)
Comparing the scores of known groups
A prediction of differences for a particular test or assessment can be checked
against groups that are known to differ, and the results used as partial support
for construct validation (e.g. mathematics majors vs. English majors)
The test should be able to distinguish them
Comparing scores before and after a particular learning experience or experimental
treatment
Scores increase with instruction?
Comparing scores with other similar measures (also an assessment-criterion
consideration)
E.g. high correlation between like tests and lower correlation between unlike tests
17 Assessment-criterion considerations
When test scores are to be used
to predict future performance
to estimate current performance on some valued measure other than the test
itself (called a criterion)
Concerned with evaluating the relationship between the test and the criterion
18 Assessment-criterion considerations
For example, can ALES scores indicate success in master's program exams?
The degree of relationship can be described by statistically correlating the two sets of scores
The resulting correlation coefficient provides a numerical summary of the degree
of relationship between the two sets of scores
Scatter plots and expectancy tables can also be used.
19 Example (worked in Excel)
20 Interpretation of the correlation coefficient (table of value ranges)
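A minimal sketch of how such a coefficient might be computed, using made-up scores for illustration (any spreadsheet or statistics package would give the same result):

```python
# Illustrative only: invented scores for 8 students.
# x = test scores (e.g. an entrance exam), y = criterion scores (e.g. later exam average).
x = [55, 60, 62, 70, 75, 80, 85, 90]
y = [2.1, 2.4, 2.3, 2.8, 3.0, 3.2, 3.5, 3.7]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson correlation coefficient: covariance of the two score sets
# divided by the product of their standard deviations.
cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
var_x = sum((xi - mean_x) ** 2 for xi in x)
var_y = sum((yi - mean_y) ** 2 for yi in y)
r = cov / (var_x ** 0.5 * var_y ** 0.5)

print(f"r = {r:.2f}")  # a value near +1 indicates a strong positive relationship
```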
21 Consideration of consequences
Assessments are intended to contribute to improved learning, but do they?
What impact do assessments have on teaching?
What are the possibly negative, unintended consequences of a particular use of
assessment results?
High importance attached to test results can lead teachers to focus narrowly on
what is on the test while ignoring important parts of the curriculum not covered by
the test
E.g. changing the construct taught from problem solving to memorization because of a high-stakes test
An example: college professors prepare for the YDS for several years, eventually pass the exam, but still cannot speak English
22 Factors influencing validity
Factors in the test or assessment itself
Unclear directions
Difficult language
Ambiguity
Inadequate time limit (construct-irrelevant variance)
Overemphasis of easy-to-assess aspects and disregard of difficult-to-assess aspects (construct underrepresentation)
Poorly constructed test items (e.g. providing clues)
Test too short (i.e. may not be representative)
Improper arrangement of test items (e.g. most difficult items first)
Identifiable pattern of answers (T, F, T, F, T, F, T, F)
23 Factors influencing validity
Factors in administration and scoring
Insufficient time
Unfair aid to students
Cheating
Unreliable scoring
Failing to follow directions
Adverse physical and psychological conditions
Factors in student responses (like motivation, fear, anxiety)
24 Reliability
The consistency of measurement
how consistent test scores or results are from one assessment to another
The more consistent the assessment results are from one measurement to another
the fewer errors there will be
Consequently, the greater the reliability
25 Reliability
An estimate of reliability refers to a particular type of consistency
Different periods of time
Different samples of tasks
Different raters
Low reliability means low validity
But high reliability does not mean high validity
26 Determining reliability with correlation methods
Consistency
over a period of time
over different forms of assessment
within the assessment itself
across different raters
27 Test-retest method
The same assessment
administered twice to the same group of students
with a given time interval between the two (a measure of stability)
The interval should be neither too long nor too short for the purpose
The longer the interval between the first and second assessments
the more the results are influenced by changes in the student characteristic being measured
and the smaller the reliability coefficient will be
28 Test-retest method
Stability is important when results are used for several years
like English test scores, but not as important for a unit test
The test-retest method is not very relevant for teacher-constructed classroom tests
Not desirable to readminister the same assessment
In choosing standardized tests, stability is an important criterion
29 Equivalent(parallel)-forms method
Uses two different but equivalent forms of an assessment
Two different tests are prepared based on the same set of specifications
Administered to the same group of students in a short period of time
The resulting assessment scores are correlated
It does not tell anything about long-term stability
30 Split-half method
The assessment is administered to a group of students in the usual manner and
then is divided in half for scoring purposes
E.g. to score the even-numbered and the odd-numbered tasks separately
This produces two scores for each student
When correlated, provides a measure of internal consistency
To estimate the reliability of scores based on the full-length assessment, the Spearman-Brown formula is applied
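A minimal sketch of the Spearman-Brown correction for a test of doubled length, with an assumed half-test correlation:

```python
def spearman_brown_full_test(r_half: float) -> float:
    """Estimate full-length reliability from the correlation between the two halves."""
    return (2 * r_half) / (1 + r_half)

# Assumed example: the odd- and even-numbered halves correlate at .60,
# so the estimated reliability of the full-length test is about .75.
print(round(spearman_brown_full_test(0.60), 2))  # 0.75
```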
31 Interrater consistency
When student work is judgmentally scored
whether the same scores are assigned by another judge
Consistency can be evaluated with correlation
the scores assigned by one judge with those assigned by another judge
To achieve acceptable levels of interrater consistency
Agreed-on scoring rubrics
Training of raters to use those rubrics with examples of student work
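One simple index of interrater consistency, the rate of exact agreement, is sketched below with invented rubric scores; correlating the two raters' scores, as in the other methods, is equally common:

```python
# Invented rubric scores (1-4 scale) given by two raters to the same ten essays.
rater_a = [3, 2, 4, 3, 1, 2, 4, 3, 2, 3]
rater_b = [3, 2, 3, 3, 1, 2, 4, 4, 2, 3]

# Proportion of essays on which the two raters agree exactly.
exact_agreement = sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)
print(f"exact agreement = {exact_agreement:.0%}")  # 80% for these invented scores
```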
32 Writing rubric (examples on the original slides)
40 Examples
41 Reliability methods
42 Standard error of measurement
The amount of variation in the scores is directly related to the reliability of the assessment procedure
Low reliability is indicated by large variation in a student's assessment results
High reliability by little variation from one assessment to another
To estimate the amount of variation to be expected in the scores
Standard error of measurement
The standard error of measurement is the standard deviation of the errors of
measurement
When the standard error of measurement is small, the confidence band is narrow
(indicating high reliability)
Greater confidence that the obtained score is near the true score
A teacher who is aware of the standard error of measurement realizes that it is
impossible to be dogmatic in interpreting minor differences in assessment scores
43 Standard error of measurement
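A common estimate, sketched here with assumed numbers, is SEM = SD × √(1 − reliability); the obtained score ± 1 SEM then gives a rough confidence band around the true score.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = standard deviation of scores * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1 - reliability)

# Assumed example: scores with a standard deviation of 10 and a reliability of .91.
sem = standard_error_of_measurement(sd=10, reliability=0.91)
print(round(sem, 1))  # 3.0 -> an obtained score of 75 suggests a band of roughly 72-78
```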
44 Factors influencing reliability measures
Number of assessment tasks
The larger the number of assessment tasks (e.g. questions) on an assessment, the
higher its reliability will be
Spread of scores
The larger the spread of scores, the higher the estimate of reliability
With a larger spread, individuals tend to stay in the same relative position in the group from one assessment to another
Objectivity
Degree to which equally competent scorers obtain the same results
Objectivity can be increased by careful phrasing of the questions and by a
standard set of rules for scoring
45 Usability
Ease of administration
Easy directions? Complicated directions? Requires expertise to implement?
Time required for administration
Allot as much time as is needed to obtain valid and reliable scores, but no more
Ease of interpretation and application
If results are misinterpreted, they are of no use and may even be harmful to some individuals or groups
Availability of equivalent forms or comparable forms
Can also be useful in measuring development
Cost of testing
To save money, one should not prefer tests with lower validity and reliability
estimates