THE ADVANTAGES AND DISADVANTAGES OF STANDARDIZED
HIGH-STAKES ASSESSMENT
By: Danyil Repenko & Jackson Lindmark
ID — 1819855 & 1625896
Submitted to Dr. D. Ripley
EDU 100 Section 800
Fall Session 2023
October 18, 2023
[Section 1]: Introduction: Why this is an issue; my philosophical stance.
High-stakes assessment is a type of standardized assessment with significant
consequences for students and teachers alike in North America. This evaluation method has
shaped the very way in which students learn, or believe they must learn, and how teachers
teach, or believe they must teach. Given the accumulated experience with high-stakes assessment
and newer knowledge in the field of education, many arguments, both inside the field and in
popular discourse, have arisen over whether to improve, change, or abolish this assessment
system. Because high-stakes assessment has become the backbone of modern education, any change
to it is, in effect, a change to the whole system.
In this paper, we, Danyil Repenko and Jackson Lindmark, attempt to define what we
believe are the advantages and disadvantages of high-stakes assessment (for a better
understanding of both proponents and opponents), the current struggles the system needs to
overcome, and whether, indeed, it must. We approach this topic as individuals grounded in the beliefs of
essentialism and progressivism. According to Martin & Loomis (2014), the former emphasizes
the existence of some essential knowledge, skills, and frameworks of mind that are defined by
the needs of society, which, in turn, enable students to function effectively in the world at large.
The latter foregrounds highly student-centered instruction, hands-on learning, encouraged
self-expression, and the fluidity of certain knowledge.
[Section 2]: Demonstrate Critical Reflection and ANSWER for the reader: How might this
philosophical stance and your own experiences impact your interpretation of the
topic/issue? Why is this topic important to you?
Having been put through different systems with standardized tests, we can both say with
confidence that to a student, tests are school. School is a test. And while this is true to a certain
extent, the grind of unit test after unit test in preparation for a diploma has severely distorted
what education looks like to a student. To most, it has removed the act of growth and learning
and has replaced it with an endless stream of impersonal assessment. It is no wonder, then, that
most students dream of finishing high school and never looking back. We wish to change that
outlook by taking steps that might one day result in more students appreciating their education
rather than resenting it. Hence the need to understand the good and evil of high-stakes assessment.
The essentialists within us can accept that there are lessons and outcomes that must be
learned and achieved in order to succeed, both within the classroom and beyond it. Thus, to some
extent, a standardized test to measure those outcomes is welcome. But not at the risk of losing
progressivist values by widening disparities between ethnic groups or social
classes, limiting learning to only what the government deems necessary, and forgetting that
behind every test grade lies a human being. That human being must be tended to and cared for,
and government-set quotas do not do that; that, we believe, is achieved by professional teachers
who want to help build their community. This is the gap our essentialist progressivism wishes to
bridge, and why the issue is so important to us.
[Section 3]: Summary of arguments that favor or provide benefits.
Let us begin by examining the argued benefits of the existing high-stakes examination.
First, it removes the impact of teachers’ subjective judgment (Phelps, 2008). Research
from as early as the beginning of the 19th century supports the observation that grades for the
same tasks fluctuate from one teacher to another. In the 1910s, for example, researchers Starch &
Elliott (1912; as cited in Phelps, 2008) produced copies of two actual English examination
papers and sent them to teachers to grade and return. The marks ranged from 50 to 98 percent.
Later, Lincoln & Workman (1936, p. 7; as cited in Phelps, 2008), considering similar results of
similar experiments, concluded that “there is abundant evidence that teachers’ marks are a very
unreliable means of measurement”. Teachers, it is believed, can narrow the curriculum
themselves, or simply be outwitted by clever students. In addition, non-academic factors,
such as behavior and class participation, are included in evaluation. One American study of teacher
grading practices discovered that 66 percent of teachers “felt that their perception of a student’s
ability should be taken into consideration in awarding the final grade” (Frary, Cross, & Weber,
1993; as cited in Phelps, 2008, p. 2). For some subjects, supporters say, this can be detrimental.
Arguably the biggest positive outcome of removed subjectivity is said to be the improved
equality of opportunity and recognition for people of differing social standing, such as minorities
(Phelps, 2008), because the judgment is based on actual performance over equivalent content.
Consequently, the resulting empirical data are believed to be useful for diverse
purposes. One is accountability. Students don’t always learn what they are taught, and it is
impossible to predict what they will pick up. Considering that every child starts from a unique
base of knowledge and draws distinctive understandings from individual experience, only
assessment can establish whether the intended learning occurred (Dylan, 2013, p. 28).
Educators and school administrations can be held accountable for their results before
parents, students, and other stakeholders. Decision-making, then, would be grounded in facts,
providing an effective foundation to seek understanding and facilitate remediation, to support
classroom planning duties, to improve teaching and learning situations, to identify students
eligible for further education, and to decide on the kinds of education students should receive
(Agrey, 2004; Dylan, 2013).
With such high-profile use of exam results, proponents claim, teachers and especially
pupils will feel more compelled to make an effort and take education seriously, so as not to face
consequences (Agrey, 2004).
Unfortunately for supporters of the current high-stakes assessment system, each of these
benefits comes with ambiguity or severe repercussions.
[Section 4]: Summary of arguments dealing with the challenges / negatives.
Despite the benefits of standardized and high-stakes testing, there is overwhelming
criticism from educators who have to deal with the repercussions of the institution. In 1995, Marvin
F. Wideen, Thomas O'Shea, Ivy Pye, and George Ivany conducted a two-year study in B.C.,
focusing on the effects of standardized assessments in the classroom. Sixteen of the eighteen
grade 12 teachers who participated “ranged from mild ambivalence to strong dislike in their
views of the government examinations” (Wideen et al., 1997). Concerns were raised over the
narrowing of the curriculum to fit the test, which in turn caused psychological pressures for both
students and teachers, greatly restricted the creativity within the classroom and thus student
engagement, and forced teachers to omit difficult but important lessons (Wideen et al., 1997).
While the effects are different in every classroom, there appeared to be a near consensus that a)
high-stakes assessments have impacted the classroom at least moderately, and b) this impact has
been mostly negative for both teachers and students (Wideen et al., 1997). They concluded that
when “teachers [are] being circumscribed and controlled by examinations, and students […] only
focus on what will be tested, the conversation is limited to only testable aspects of the
discipline”, and that arguably the most crucial aspect of education, “enquiry into ourselves and
[surroundings], is not part of that conversation” (Wideen et al., 1997). This is supported by Ken
Jones, who claims that the continuous disempowerment of teachers within their classrooms has
contributed to the growing teacher shortage, given the disillusionment it can cause (Jones, 2004).
Loren Agrey has compiled the insights of many educators who often battled against
standardized testing. Agrey draws on Alfie Kohn and his experience in the field; Kohn posits that
“the variance in test scores has a higher correlation to non-instructional factors”, wholly
unrelated to tests or teachers “such as the number of parents living at home, parental educational
background, type of community and poverty rate, than to instructional performance” (Agrey,
2004). Dylan (2013) mentions an example: “only 11 per cent of the variation in students’ science
scores in PISA in 2006 was attributable to the school”.
To buttress these concerns, Sepideh Masoodi argues that high-stakes assessments have
produced new inequalities that disproportionately affect ESL students, like other minorities.
Masoodi claims “the cultural content of the standardized tests might not be suitable for ESL
students coming from diverse cultural backgrounds, [which stems] from the fact that the tests
were designed to target mainstream education students” (Masoodi, 2014). Students appear to
suffer on these tests if they are not familiar with the nuance of certain expressions (“‘figurative
language,’ ‘sayings,’ ‘proverbs’ and ‘analogies’” (Masoodi, 2014)) that would only be natural for
native English speakers, and if they lack the cultural references assumed by mainstream content,
which often focuses on English (and some French) Post-Confederation Canada. Given that the 2006
census states around 22% of Canadians are allophones, this is a subject that needs to be
addressed.
[Section 5]: Your final position on the issue, based on your research.
After reading the research and reflecting on our philosophical stance and our experiences,
we believe that the current high-stakes exam system is problematic, to say the least. Even rigged
and flawed, the essentialists in us agree, it serves important educational, political, and
communal functions. Thus, it should stay, but with many reworks, as the progressivists in us cannot
possibly overlook the array of shortcomings. If it stays unchanged, it won’t suffice.
Fortunately, people in the field already have suggestions (Dylan, 2013; Jones, 2004;
Masoodi, 2014).
This project opened our eyes to the perceived benefits of the high-stakes exam, including those
achieved only in theory. It is the implementation, however, not the theory, that counts. The
broader spectrum of repercussions, once laid out, clearly shows the extent to which standardized
tests have failed in their purposes and brought about more negative effects than anticipated. The
findings only reinforced our initial position.
[Section 6]: Conclusion: provide a concluding paragraph that synthesizes the main points
of your research and your final position.
Standardized high-stakes assessment has had both positive and negative effects on the
world of education, especially in Canada, which should not be underestimated. While it has
brought more objectivity, assessment equality, and faster (and cheaper) general evaluation, it has
also created new inequalities and greatly increased stress, with moral dilemmas, distorted
standards, and missed opportunities for students and teachers alike. For us, standardized
assessments have provided a framework through which a more objective lens can be cast on the
mass testing of learning outcomes at a nationwide level, but in their current form, they cause as
much harm to students as they help to organize them. This has been a hotly debated and
researched issue within education for many years, and it will undoubtedly continue to be one,
likely even after we’ve retired from the field, but that does not mean improvements cannot be
made. The necessary and helpful components of the institution, some highlighted by Richard
Phelps, should remain, while areas of inequality and unnecessary meddling in the
classroom, as described by Loren Agrey and a host of others, must be altered toward a more
student-oriented approach so that future students can develop, grow, and learn as best they
can. For that is the purpose of education, not meeting government quotas, and therein lies the
difference we, as future educators, wish to make.
References
Agrey, L. (2004). The pressure cooker in education: Standardized assessment and high stakes.
Canadian Social Studies, 38(3). https://eric.ed.gov/?id=EJ1073917 - Peer
reviewed-scholarly
Dylan, W. (2013). Assessment: The bridge between teaching and learning. Voices from the
Middle, 21(2), 15-20.
https://www.proquest.com/docview/1464750635/fulltextPDF/95BBD977A6774EB5PQ/1
?accountid=14474 - Peer reviewed-scholarly
Jones, K. (2004). A balanced school accountability model: An alternative to high-stakes testing.
Phi Delta Kappan, 85(8), 584-590. https://doi.org/10.1177/003172170408500805
Masoodi, S. (2014). Using high-stakes standardized examinations for ESL students: Challenges
and implications. European Journal of Educational Sciences, 1(2), 96-113.
https://doi.org/10.19044/ejes.v1no2a8 - Peer reviewed-scholarly
Phelps, R. P. (2008). The role and importance of standardized testing in the world of teaching
and training: Paper presented at the 15th congress of the world association for
educational research. Nonpartisan Education Review / Essays, 4(3).
https://nonpartisaneducation.org/Review/Essays/v4n3.pdf
Wideen, M. F., O’Shea, T., Pye, I., & Ivany, G. (1997). High-stakes testing and the teaching of
science. Canadian Journal of Education / Revue canadienne de l'éducation, 22(4),
428-444. https://www.jstor.org/stable/1585793 - Peer reviewed-scholarly
Rubric
CONTENT_________________/10
Excellent(10) ● States the Inquiry Question
● Exceptionally clear and succinct introduction and conclusion.
● Evidence from article is accurately and insightfully summarized providing
context for the reader (W5 – who, what, where, when, why)
● Overall purpose of the paper is clearly and articulately stated.
● Demonstrates clear understanding of the complexity of the issue
● Philosophical stance is reflected upon within the context of the issue
Very Good ● States the Inquiry Question
(8-9) ● Clear and succinct introduction and conclusion.
● Evidence from articles is accurately summarized providing context for the
reader (W5 - who, what, where, when, why)
● Overall purpose of the paper is articulately stated.
● Demonstrates a general understanding of the complexity of the issue/topic
● Philosophical stance is reflected upon
Good (6-7) ● States the Inquiry Question
● Paper contains a basic introduction and conclusion.
● Evidence from articles is summarized but summaries may lack insight,
detail and/or lack clarity.
● A basic overall purpose of the paper is stated.
● Demonstrates a general understanding of the issue/topic
Satisfactory ● States the Inquiry Question
(5) ● Paper contains an introduction and conclusion.
● Evidence from articles is summarized but summaries may lack insight,
detail and/or lack clarity
● A basic purpose of the paper is evident but not clearly stated
● Demonstrates a basic understanding of the issue/topic
Unsatisfactory ● Doesn’t state the inquiry question
(0-4) ● Paper may be lacking an introduction and conclusion.
● Evidence from articles is not adequately summarized. Summaries may be
unclear, and/or lacking.
● The basic overall purpose of the paper is lacking.
● Demonstrates minimal exploration of the issue/topic
QUALITY OF RESEARCH & SUPPORT _________/10
Excellent(10) ● Four or more, relevant (current, mainly Canadian) articles were critically
chosen.
● Articles clearly represent diverse perspectives and styles (e.g. at least
two peer reviewed-scholarly articles and two trade-popular articles). Peer
reviewed-scholarly sources are labeled as such.
● Highly relevant sources have been included
● Critical discussion is always supported by relevant sources and research.
Very Good (8-9) ● Four or more, relevant (current, mainly Canadian) articles were critically
chosen.
● Articles clearly represent diverse perspectives and styles (e.g. at least
two peer reviewed-scholarly articles and two trade-popular articles). Peer
reviewed-scholarly sources are labeled as such.
● Highly relevant sources have been included
● Critical discussion is most often supported by relevant sources and
research.
Good (6-7) ● Four relevant (current, mainly Canadian) articles were chosen.
● Articles represent diverse perspectives and styles (e.g. two peer
reviewed-scholarly and two trade-popular articles). Peer
reviewed-scholarly sources are labeled as such.
● Discussion is usually supported by research.
Satisfactory (5) ● Four articles were chosen.
● Articles are similar in perspective and styles (e.g. two peer
reviewed-scholarly and trade-popular articles) and/or some articles lack
relevance.
● Discussion is somewhat supported by research
Unsatisfactory ● Less than four articles were chosen; and/or
(0-4) ● Articles are inappropriate, or irrelevant for the chosen topic.
● Discussion lacks support by research; over-generalizing is common
WRITING MECHANICS, STYLE, APA FORMAT _____ /5
Excellent(5) ● Ideas are articulated utilizing clear, coherent and fully developed
paragraphs with smooth and effective transitions.
● Writing is free of GUSP* and/or APA style errors.
● All assignment criteria are effectively met, including a proper title page
& rubric
Good (4) ● Ideas are generally articulated utilizing clear, coherent and fully
developed paragraphs with smooth and effective transitions.
● Writing has few GUSP* and/or APA style errors.
● Meets most assignment criteria including a proper title page & rubric
attached
Satisfactory (3) ● Ideas communicated utilizing organized sentences / paragraphs.
● Writing has some GUSP* and/or APA style errors.
● Pertinent assignment criteria are lacking, such as section headings.
Unsatisfactory ● Ideas incoherent and unclear; sentences / paragraphs are confusing.
(0-2) ● Writing has numerous GUSP* and/or APA style errors.
● Citations missing or inaccurate; little attempt to meet APA citation
guidelines.
● Assignment criteria has not been met to an acceptable standard.
1819855 & 1625896
*GUSP = grammar; usage of words; spelling; punctuation.
Instructor assigned mark: _____ /25
Self-Assessment
[Section 1]: Introduction: Why this is an issue; my philosophical stance.
● CONTENT SELF-ASSESSMENT: 8/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 8/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 20/25
[Section 2]: Demonstrate Critical Reflection and ANSWER for the reader: How might this
philosophical stance and your own experiences impact your interpretation of the
topic/issue? Why is this topic important to you?
● CONTENT SELF-ASSESSMENT: 8/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 7/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 19/25
[Section 3]: Summary of arguments that favor or provide benefits.
● CONTENT SELF-ASSESSMENT: 7/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 7/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 18/25
[Section 4]: Summary of arguments dealing with the challenges / negatives.
● CONTENT SELF-ASSESSMENT: 9/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 8/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 21/25
[Section 5]: Your final position on the issue, based on your research.
● CONTENT SELF-ASSESSMENT: 8/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 7/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 19/25
[Section 6]: Conclusion: provide a concluding paragraph that synthesizes the main points
of your research and your final position.
● CONTENT SELF-ASSESSMENT: 8/10
● QUALITY OF RESEARCH & SUPPORT SELF-ASSESSMENT: 8/10
● WRITING MECHANICS, STYLE, APA FORMAT SELF-ASSESSMENT: 4/5
● SELF-ASSESSMENT: 20/25