
Interchange (2021) 52:101–113

https://doi.org/10.1007/s10780-021-09416-6

“IELTS-out/TOEFL-out”: Is the End of General English for Academic Purposes Near? Tertiary Student Achievement Across Standardized Tests and General EAP

Robert C. Johnson1 · M. Gregory Tweedie2

Received: 22 October 2020 / Accepted: 18 January 2021 / Published online: 27 January 2021
© The Author(s), under exclusive licence to Springer Nature B.V., part of Springer Nature 2021

Abstract
It has been widely asserted that general English for Academic Purposes (EAP) in post-secondary education has a limited future, given both the benefits of discipline-specific EAP and the widespread use by universities of international standardized assessments (such as IELTS, PTE, and TOEFL) as proof of English language proficiency. The extra time and expense required to complete EAP courses makes this admission route less appealing than to “IELTS-out” (bypassing an EAP program by obtaining a standardized test score that enables direct admission into a university program). This article investigates the capacity of nine widely-used measures of English language proficiency to predict post-secondary student achievement (n = 1918) across multiple academic programs at a Canadian university, over a seven-year admission period. All standardized tests measuring English language proficiency for admission readiness (including IELTS and TOEFL) were insignificant and/or problematically weak predictors of achievement, in both the first and final semester of study. In contrast, completion of pre-enrolment EAP programming proved to be a significant predictor of academic achievement, with a moderate association. With general EAP the only measure of language proficiency to significantly predict student achievement, the study’s findings suggest that predictions about the demise of such programs may be premature.

Keywords IELTS · TOEFL · CAEL · English language proficiency · Language testing · English for Academic Purposes

The reports of my death have been greatly exaggerated.


Attributed to Mark Twain.

* M. Gregory Tweedie, [email protected]
Robert C. Johnson, [email protected]
1 University of Calgary Qatar, Doha, Qatar
2 Werklund School of Education EDT 1032, University of Calgary, 2500 University Drive NW, Calgary, AB T2N 1N4, Canada


Introduction: The Death of General EAP?

In some settings, prospects for the future of general EAP (English for Academic Purposes) programming appear gloomy. At many English-medium universities, international students unable to present a required score on one of several standardized tests of English Language Proficiency (ELP; see “Defining ELP for University Admission” below) must complete pre-enrollment language support courses, often EAP courses of a general nature, meant to prepare students for study in a broad range of academic disciplines. English as an Additional Language (EAL) students seeking university entrance can forgo both the (often substantial) expense and time required to complete an EAP program by instead demonstrating ELP through a requisite score on one of a number of internationally recognized assessments. Student demand for these gatekeeping assessments has spawned a billion-dollar industry in testing and test preparation (Cavanagh 2015), and has given rise to criticism that an overemphasis on test preparation is undermining the “real business of learning the language” (Gan 2009, p. 25). While the verbs “IELTS-out” and “TOEFL-out” may not yet be in the Oxford Dictionary, many EAP instructors will immediately recognize the process they describe: withdrawing from an EAP course (often already in progress) by obtaining a standardized test score that facilitates direct admission into a university program. In informal discussions, more than one international student has indicated that the private tutoring and multiple attempts required to finally achieve an IELTS band of 6.5 for direct entry were still a fraction of the price, and far less time-consuming, than even one semester of EAP (Personal communications, 2017). Given the costs associated with studying abroad, it is certainly understandable that international students would want to avoid both extending the length of the study program and incurring additional course costs. However, students’ “IELTS-out” strategy has left some EAP instructors demoralized and with questions surrounding professional identity (Personal communications, 2017). The authors have come across various institutional strategies to counter the “IELTS-out” phenomenon, including offering standardized test preparation workshops alongside—or in some cases even in place of—EAP curricula; closing admissions loopholes through the creation of a clause disallowing use of any other proof of ELP after enrollment in EAP; and intensifying marketing efforts to convince students that EAP is well worth the extra time and expense. In circumstances such as these, where EAP is positioned as a post-secondary admission gatekeeper alongside, or even in competition with, other measures of demonstrating ELP, it is easy to be pessimistic about EAP’s future.1
A growing emphasis on discipline-specificity for postsecondary English language preparation seems also to present a challenge to the future of general EAP. Murray (2016, pp. 89–90) contrasts what he calls “generic EAP” with the virtues of an “academic literacies” approach. Among Murray’s characterizations of generic EAP are a “grounding in generic study skills” and a program “out of kilter with the notion of academic literacy as something tied to a particular domain of application”. So out of kilter, in fact, that Murray argues “it is likely that many will have to unlearn some of what they have absorbed in those programmes if they are to meet the requirements of their [post-EAP program] disciplines”.

1 The institutional setting in which this study takes place requires proof of ELP either through completion of a pre-enrollment general EAP program, or requisite scores on one of a number of prescribed standardized language tests.
Advocates of academic literacies for tertiary preparation assert the inadequacy
of a “study skills approach” associated with general EAP, which is said to treat lit-
eracy, and in particular writing, as “an individual cognitive skill where the formal
features of writing are learnt and easily applied across different contexts”, with typi-
cal emphases being “a focus on sentence structure, creating a paragraph and punc-
tuation” (Sheridan 2011, p. 130). Murray labels this a “one-size-fits-all view of
academic literacy” that “tends to dislocate those skills from particular disciplinary
contexts” (2016, p. 85). Further unsettling to EAP professionals is the claim that
such approaches are “generally constructed within discourses of deficit and remedia-
tion” (Henderson and Hirst 2007, p. 26).
If a student can “IELTS-out” of general EAP to gain university admission, and if,
as its critics claim, general EAP by definition represents content disembodied from
the academic disciplines it purports to serve, the future of such programs seems very
much in doubt. It is within this context that the current study was undertaken to
investigate the relative overlap of the two indicators of English language proficiency
(ELP) most higher education institutions accept—standardized language tests (such
as IELTS Academic, TOEFL, and others) and EAP courses—and student success in
academic programs. Should EAP courses be interchangeable with standardized test results as determinants of required proficiency in the language of instruction, or should generic EAP be, as Murray suggests, problematic for preparing EAL students for academic programs, we would expect to see this reflected in the capacity of these ELP indicators to predict future student success.

Background

Defining ELP for University Admission

Whether driven by post-secondary institutions’ desire to increase internationalization, to broaden participation, or to receive the revenue generated through international students’ additional fees, energetic goals to further increase enrolment of
international students abound in many countries. The government of Canada, for
example, set a goal of doubling the number of international students at its post-sec-
ondary institutions in a 10-year period (Macgregor and Folinazzo 2017). Though
BANA nations (Britain, Australasia, North America) are traditionally seen as study
destinations for English-medium postsecondary education, an increasing num-
ber of other countries are competing for international students by offering degrees
with English as a medium of instruction (Lee 2015; Macaro et al. 2018). This has
increased institutions’ recognition of the necessity to both support language learn-
ing needs of international students, and quantify their ELP, despite the fact that ELP
is typically only vaguely defined, with definitions varying widely even within the
same institution (Murray and Hicks 2016). A bewildering number of standardized tests offer to provide such quantification, each with its own distinctive slant on what constitutes ELP. The developers of the TOEFL2, for example, acknowledge the challenges of defining ELP for the purposes of assessment encountered throughout that test’s evolution (Chapelle et al. 2008; Jamieson et al. 2008). Another widely used standardized assessment, the PTE2, “measures English language proficiency for communication in tertiary level academic settings” (Zheng and Dejong 2011, p. 3), but Zheng and Dejong’s discussion of PTE’s construct validity makes no attempt to explicitly define ELP. Murray’s (2016, p. 70) characterization of typical attempts at defining the ELP construct as “rather vague” and “catch-all” rings true.
As well as a lack of clarity around what constitutes ELP, further complication is
introduced by the process through which institutions determine which standardized
tests to use for admission, and the setting of particular cut scores from those tests.
Uneven at best (Tweedie and Chu 2019), test selection and cut score identification
are often done simply by referring to scores used by competitor institutions, or by consulting comparison tables, their questionable usefulness notwithstanding (Taylor 2004).
The institutional murkiness of what exactly defines language levels needed for
post-secondary success is further compounded where completion of a pre-enroll-
ment, gatekeeping EAP course can be used as proof of ELP. The emphases of these
courses vary widely between institutions, and unlike degree courses whose trans-
ferability is specified in credit transfer agreements between universities, such EAP
courses are in many cases non-transferable. Thus, attempting to pin down what
graduates of a particular gatekeeping course can actually do in terms of language, or
linking their ELP to particular standardized test scores, remains problematic.
Determining equivalency among assessments of ELP is also thorny. Assessments
are scored on different scales, the minimum cut scores on assessments required for
admission vary across institutions, and universities typically do not make explicit
the criteria being considered when using a particular test. One large Canadian uni-
versity, for example, requires an IELTS score of Band 6.5 or a PTE score of 61 for
undergraduate admission (University of Alberta 2018), yet a band 6.5 on IELTS is
equated by the test developers to a range of 58–64 points on the PTE Academic
(Pearson Education 2017).
Treating assessments as equivalent, when they have been created utilizing different frameworks and/or even different target constructs, invites misuse of the results (AERA APA and NCME 2014; Kane 2013). When test developers make differing claims for their assessments, it follows that the uses of results should also differ.

Operationalizing Academic Success: Why Grades Matter

Defining academic success engenders challenges similar to those encountered when attempting to define ELP, and multiple definitions have been put forward, suggesting inclusion
of a wide variety of indicators. Alongside measures like Grade Point Average
(GPA), proposed indicators of student success have included: scores on entrance
exams; credit hours taken consecutively; time to degree completion; perceptions
of institutional quality; willingness to re-enroll; post-graduation achievement; social integration into campus life; appreciation for diversity; adherence to democratic values, among others (e.g., see Kuh et al. 2006). York et al. (2015), in
building upon the extensive literature review of Kuh and colleagues, propose
that the definition of student success be a multidimensional one “inclusive of
academic achievement, attainment of learning objectives, acquisition of desired
skills and competencies, satisfaction, persistence, and postcollege performance”
(p. 5). Of these multiple dimensions, we have limited our consideration for
the purpose of this study to focus on GPA. Critics are quick to object to using
grades as the only measure of academic success, given the breadth of experi-
ences which constitute students’ post-secondary paths. We affirm the value of
multidimensional means of benchmarking student achievement, but maintain
that grades are an important measure for a number of reasons.
First, rightly or wrongly, grades are used by a wide variety of stakeholders
as a measure of student abilities. Universities themselves consider grades when
making decisions on admission, for both undergraduate and graduate program
entrance. Course GPA is often a central criterion for getting one’s preferred
study major. Organizations providing scholarship funding utilize grades as an
important selection mechanism, and post-graduation, employers may factor
university GPA into hiring decisions. It follows that the combined effect of the
above gatekeepers would result in the central stakeholder, the students them-
selves, placing a high value on grades. York et al. acknowledge that their “con-
structivist method” of reviewing the success construct limits the inclusion of
student and parent voices (2015, p. 9). We expect that including student and
parent voices in defining academic success would strengthen the case for consid-
ering grades.
Further, some have argued that achievement of course learning objectives
represents a more accurate depiction of academic success, since grades are only
substitute measurements of actual learning. For this reason, York et al. (2015)
argue for a separation between grades, attainment of learning objectives and
achievement of skills and competencies when conceptualizing student achieve-
ment. In our view though, it follows that achieving the learning objectives of
a course or program, and gaining the requisite skills and competencies, should
lead to higher grades. We readily admit that actual knowledge (as opposed to
assessed knowledge) is exceptionally difficult to quantify, and that, as York and
colleagues assert, a student’s GPA is only a “proxy” measurement for what may
have actually been learnt (p. 7). Such philosophical considerations notwithstand-
ing, we are not optimistic that students, their parents, scholarship committees,
university admission policy-makers or employers will, in the near future, opt for
wholesale adoption of (more difficult to measure) actual learning over the more
measurable, but admittedly proxy, learning that is reflected in grades.
Finally, as GPA is by far the most widely available measure of student per-
formance to which researchers have access, it is conveniently comparable across
institutions and contexts, making it highly useful.


Context of the Study

This enquiry took place at a large, research-intensive Canadian university, where English is the medium of instruction. All applicants to the institution must demonstrate ELP for direct admission, which, for international students, can be done by presenting a prescribed score on one of eight international standardized assessments: TOEFL iBT, TOEFL CBT, TOEFL PBT, IELTS Academic, CAEL, MELAB, PTE, or CAE.2 A ninth option is available: applicants who do not meet the requisite cut scores on one of these assessments may opt to enroll in the institution’s EAP program, which provides pre-enrollment instruction in general academic English. Students are placed in one of three levels by means of an in-house placement instrument, and complete three semester-length courses at each level: academic writing and grammar, reading comprehension and proficiency, and listening comprehension and oral proficiency. Learners attaining a grade of at least 70% in the program’s third level are considered to have satisfied the ELP requirement and are then admitted to the university.
Given that the institution accepts nine different means of demonstrating ELP for
admission (the 8 standardized tests and completion of the EAP program), it stands
to reason that comparability among measures should be subjected to scrutiny, as
should any differential performance of instruments predicting student success. To
the best of our knowledge, no previous study has ever investigated the comparability
of all eight of these assessments with respect to their predictive capacity for student
achievement, or compared their performance with EAP course results. Since many
of the eight assessments are used for admission to English-medium universities
around the world, we anticipate that the findings will have implications for a very
large number of ELP test-users internationally.

Methods

This quantitative research study considered anonymized data for 1918 EAL students
at a Canadian university, ranging from Fall semester 2010 to Fall semester 2016,
as provided by the institution’s student services office. Participants (49.9% female,
49.8% male, 0.3% no response) constituted a multinational, multilingual, and mul-
tidisciplinary sample, with a total of 107 different nationalities and 19 different
academic programs represented. Each student had completed: (i) at least one of the
eight ELP tests officially accepted for entry into the university, and/or (ii) the EAP
program at the institution. Finally, data included GPA for students’ first semester of study at the institution, final semester, or both. Cumulative GPA, however, was not available in the records provided.

2 For TOEFL® iBT/PBT (Test of English as a Foreign Language internet-Based Test/Paper-Based Test) see ETS (2018). For IELTS Academic (International English Language Testing System) see IELTS (2018). For CAEL (Canadian Academic English Language) test see Paragon Testing Enterprises (2018). For MELAB (Michigan English Language Assessment Battery) see Michigan Language Assessment (2018). For PTE (Pearson Test of English—Academic) see Pearson (2019). For CAE (Cambridge Advanced English; more recently C1 Advanced) see UCLES (2019).
Table 1  Descriptive statistics for academic program GPA, ELP scores, and EAP course results
N Minimum Maximum Mean Std. Deviation

First semester GPA 629 .43 4.00 2.932 .744
Final semester GPA 627 .50 4.00 3.323 .588
CAE 0
CAEL 73 40 90 68.630 9.024
IELTS academic 809 4.0 9.0 6.740 .840
MELAB 32 68 94 84.690 6.172
PTE 6 63 85 72.000 9.798
TOEFL CBT 16 163 273 234.370 29.145
TOEFL iBT 735 43 119 92.29 11.293
TOEFL PBT 20 477 667 578.35 45.121
EAP reading 517 1.00 4.00 2.644 .528
EAP writing 514 1.00 4.00 2.531 .528
EAP listening and speaking 399 1.00 4.00 2.629 .5312

Pearson correlation coefficients (r) were used to estimate the capacity of the different ELP tests and EAP courses for predicting student success in academic programs (both in aggregate and for each program). Coefficients of determination (r2) were used when it was felt more beneficial to discuss the amount of variance in GPA which seemed to be determined by variance in a predictive variable (a specific ELP test or EAP course).
Despite the large number of participants, once broken into subgroups (e.g., students presenting CAEL results who completed EAP Reading), the sample sizes often became quite small or even zero. While there are no clear guidelines as to what an acceptable sample size is for a Pearson correlation calculation, the authors decided to follow David’s (1938) long-held and oft-cited recommendation of a minimum of 25. As such, only results for which the sample size was 25 or greater are typically presented and discussed.
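For readers who wish to replicate this kind of analysis, a minimal sketch of the procedure in Python is given below. It is illustrative only, not the authors’ code; the data file and column names are hypothetical assumptions.

import pandas as pd
from scipy.stats import pearsonr

MIN_N = 25  # minimum subgroup size, following David (1938)

def predictive_capacity(df, predictor, outcome):
    """Pearson r, p-value, r-squared, and n for one ELP indicator vs. a GPA column."""
    sub = df[[predictor, outcome]].dropna()  # keep students with both values on record
    n = len(sub)
    if n < MIN_N:
        return None  # subgroup too small to report, as in the study
    r, p = pearsonr(sub[predictor], sub[outcome])
    return {"r": r, "p": p, "r_squared": r ** 2, "n": n}

# Hypothetical usage (file and column names are illustrative only):
# records = pd.read_csv("anonymized_student_records.csv")
# print(predictive_capacity(records, "ielts_academic", "first_semester_gpa"))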

Results

Table 1 reports the descriptive statistics for academic program GPA, ELP tests, and
EAP results. In the seven years of data provided, not one EAL student reported a
CAE score as evidence of ELP, only six presented PTE scores, and none of these
particular students had first or final semester GPA on record. In addition, very few
students had computer-based (CBT) or paper-based (PBT) TOEFL outcomes (n = 16
and 20, respectively), which might be expected given the near-complete transition to
the Internet-based TOEFL (iBT) over the past decade. As none of these instruments
had a sample size of 25 or greater, they were omitted from further analyses.


Table 2  Correlation between ELP indicator results and students’ first and final semester GPA
Predictor First semester GPA Final semester GPA

CAEL
r .073 .097
p (2-tailed) .696 .603
N 31 31
IELTS academic
r .054 .199*
p (2-tailed) .495 .011
N 163 162
TOEFL iBT
r .246** .218**
p (2-tailed) .000 .000
N 341 340
EAP reading
r .246** .386**
p (2-tailed) .005 .000
N 127 126
EAP writing
r .144 .280**
p (2-tailed) .107 .002
N 126 125
EAP listening and speaking
r .205* .326**
p (2-tailed) .021 .000
N 126 125

*Correlation is significant at the 0.05 level (2-tailed)
**Correlation is significant at the 0.01 level (2-tailed)

Pearson correlations estimating the predictive capacity of each ELP test and EAP
course, for first and final semester GPA, are reported in Table 2.
The CAEL failed to significantly predict student performance in either first or
final semester of study. While the insignificant outcomes for the CAEL could, poten-
tially, be blamed on the relatively small sample size (n = 31), the insignificant and/
or weak predictive capacity found with regards to IELTS and TOEFL iBT results
cannot. The IELTS Academic did not significantly predict first semester student
performance (r = .054, p = .495, n = 163) and did so only weakly for final semes-
ter (r = .199, p = .011, n = 162). The TOEFL iBT was the only test to significantly
predict GPA in both first (r = .246, p = .000, n = 341) and final (r = .218, p = .000,
n = 340) semesters, though it did so weakly.
While none of the EAP course results demonstrated a strong association with
academic program GPAs, results, overall, would seem to demonstrate better overlap
than standardized test results. EAP Reading outcomes showed a significant, weak association with first semester performance (r = .246, p = .005, n = 127) and a moderate association with final semester performance (r = .386, p = .000, n = 126). EAP Writing results did not significantly
predict first semester grades (r = .144, p = .107, n = 126) but significantly, weakly
predicted final (r = .280, p = .002, n = 125). The EAP Listening and Speaking course
results, meanwhile, showed a significant, weak correlation with first-semester GPA
(r = .205, p = .021, n = 126), and significant, moderate relationship with final semes-
ter success (r = .326, p = .000, n = 125).

Discussion

One limitation of the study is that, despite the large dataset (n = 1918), once broken
down by predictor (specific ELP test or EAP course result), many resulting cells had
problematically small sample sizes (n < 25). However, efforts were taken to address
only those outcomes with samples larger than 25 participants. Another possible lim-
itation is the use of GPAs as the index for student success. It has long been noted that, as a score out of 4.00, the relatively limited range of the measure likely contributes to (at least somewhat) muted correlation coefficients and, therefore, potential underestimation of overlap between predictors and actual student success. However, while
noting this potential limitation, GPA is by far the most widely available measure of
student performance to which researchers have access. It is also nearly universal in
terms of its use as an estimate of student success and, resultantly, conveniently com-
parable across institutions and contexts.
Another potential limitation of the study, or more specifically the measures
involved, is that any predictor, whether a test or an EAP course, will always be at
least somewhat limited in its capacity to predict student success. No single instru-
ment or process can, for example, address all of the skills, knowledge, attitudes, and
behaviours which influence student performance in academic programs. However,
it is equally important to remember that this is, at the very heart of the matter,
what institutions are doing with these tests and courses. They administer the tests
to determine who is ready to succeed in academic programs now and who needs
more tuition in language skills before they are likely to thrive. Students in the latter category are assigned to EAP courses specifically intended to improve the skills and knowledge required to succeed in academic programs. To this end,
then, we should expect to see considerable (though certainly not absolute) overlap
between variance in these test results and course outcomes, and variance in aca-
demic success. The skills, knowledge, and/or attitudes which determine success in
future academic studies should at least be substantially addressed in the tests accepted and programs on offer, or there is little point to their use.
As a final note, it is worth pointing out the surprising number of ELP instru-
ments still officially accepted by the institution which were rarely (if ever) actually
presented by incoming students. The CAE, for example, had no data whatsoever,
meaning no students used this test to demonstrate ELP in the seven-year range of
the data. Similarly, a total of six students presented a PTE result, 16 a computer-
based TOEFL, and 20 a paper-based TOEFL outcome in order to gain entry to the
university. These numbers are low enough to make discerning the effectiveness of the instruments in assisting placement decisions problematic or impossible. As a result, the institution may wish to consider removing them from their accepted list.
One needs data to evaluate the usefulness of information, including how well evi-
dence like a test score may or may not be contributing to beneficial (and high stakes)
decisions about incoming students. Currently, the institution simply does not have
enough data to know whether or not the evidence provided by these instruments is
contributing (or would contribute) to decisions that are beneficial to incoming stu-
dents and the institution. Reliance on equivalency research from the test developers
themselves, without updated, institution-specific evidence of equivalencies, or with-
out contextualized information on what a particular test score means, in our view is
not sufficient justification for continued acceptance of a given measure as demon-
stration of ELP for admission.
As seen in the results section, the findings indicate little overlap between the
skills and abilities measured by the standardized tests used for ELP by the institu-
tion and those which determine GPA in academic programs. Scores for the CAEL,
for example, did not predict student GPA in students’ first or final semester of study.
IELTS Academic results demonstrated significant but weak overlap with final
semester of study only (r = .199, p = .011), and TOEFL scores showed significant,
weak association with both first- (r = .246, p = .000) and final-semester (r = .218,
p = .000) GPA. Put in terms of coefficient of determination (r2), this result indicates
approximately 4% of variance in student success (in final semester of study) would
seem to be influenced by the competencies measured by IELTS Academic. This
is troubling, as the instrument is, after all, designed specifically for the purpose of
determining who is capable of success in higher education, heavily and continuously
researched and developed, and widely used across the globe. Other studies, however,
have typically reported similarly problematic results for IELTS. Investigations find-
ing a strong (e.g., Bellingham 1993; Harsch et al. 2017; Hill et al. 1999) or even
moderate (e.g., Al-Malki 2014; Woodrow 2006; Yen and Kuzma 2009) relationship
between IELTS outcomes and academic success are in the minority. Far more com-
mon are outcomes suggesting the relationship between IELTS and academic success
is weak (Denham and Oner 1992; Feast 2002; Kerstjens and Nery 2000) or not
significant (Bayliss and Ingram 2006; Cotton and Conrow 1998; Dooey and Oliver
2002; Ingram and Bayliss 2007).
While the TOEFL iBT scores did significantly overlap both first- and final-
semester GPA, coefficients of determination indicate the skills (and other factors)
influencing performance on the test only determine some 4 to 6% of the variance in GPA (r2 = 0.061 and r2 = 0.048, respectively). Here, too, we find that, despite the instrument’s design for
assessing academic English competency, continuous and considerable research and
development, and widespread international use, it is typically found to be a weak
predictor of academic success. Cho and Bridgeman (2012), for example, also found a
3% overlap between the TOEFL iBT scores and GPAs of 2594 university students.
Wongtrirat (2010) found a very similar 3.5% estimate of determinance between
TOEFL iBT and GPA in a meta-analysis of 22 studies conducted between 1987 and
2009.
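To make the arithmetic explicit, the percentages of variance quoted above follow directly from squaring the reported correlation coefficients:

\[
r^2_{\text{IELTS, final}} = (0.199)^2 \approx 0.040,\qquad
r^2_{\text{TOEFL iBT, first}} = (0.246)^2 \approx 0.061,\qquad
r^2_{\text{TOEFL iBT, final}} = (0.218)^2 \approx 0.048
\]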
The best predictors of EAL student success in academic programs, overall, would
appear to be EAP courses. While EAP Writing predicted student success similarly to IELTS and TOEFL iBT results—not significantly for first semester GPA (r = .144,
p = .107) and significantly but weakly for final semester (r = .280, p = .002)—EAP
Reading and EAP Listening and Speaking courses were the only indicators which:
(i) significantly predicted both first and final semester GPAs, and (ii) demonstrated
moderate strength in doing so for at least one semester. EAP Reading course results
showed an association with first semester success that was significant, though weak
(r = .246, p = .005) and with final semester success that was moderate (r = .386,
p = .000). Similarly, Listening and Speaking was found to significantly predict both
semesters’ GPAs, doing so weakly for first semester (r = .205, p = .021) and moder-
ately for final (r = .326, p = .000). Coefficients of determination suggest EAP courses
overlap some 2 to 6% of first semester student performance, and 8 to 16% for final
semester. While these percentages may not seem extremely strong, they are considerably higher than the best overlap (3 to 6%) with IELTS and TOEFL results, found not only in this study but also typically reported elsewhere.
The results also challenge the claim that generic EAP programs, given their lim-
ited direct connection to the literacies of particular disciplines, are of little utility for
student success. In this study, EAP courses, despite their lack of discipline-specific
content, substantially outstripped the predictive capacity of any standardized ELP
test for student achievement.

Conclusion

The findings of this study underscore unsettling questions about institutional prac-
tices for benchmarking English Language Proficiency of prospective students. The
results highlight the need for tertiary institutions to regularly evaluate which meas-
ures of ELP are accepted for admission, and provide justification for their use. In
the case of the university considered for this study, seven years had passed in which no applicant used the CAE, and only six applicants used the PTE, as proof of ELP. Upon what basis, then, can the institution consider these two assessments “equivalent” to other measures of ELP accepted for admission? This raises the question of how and why certain tests are accepted in perpetuity as proof of ELP without institution-specific evidence of being fit for purpose.
benchmark ELP needs to be justified for its use in a given context, not just accepted
on the claims of the test developers, or because it is accepted by competitor institu-
tions. Future research may seek to make explicit what is now a largely opaque pro-
cess: the means by which institutional policymakers arrive at specific benchmarks of
ELP. Consultations with ELP instructors regarding measures of English Language
Proficiency, though not widely utilized, represent a valuable resource for identifying
what various benchmarks actually mean, and admission policy would benefit from
such practice-informed discussions.
The data here may also warrant consideration by students. Certainly, opting to
“IELTS-out” of EAP courses by means of an international assessment of ELP may
translate into significantly shorter program length and therefore reduced financial
costs. The findings of this study, however, indicate that an “IELTS-out” strategy may
not necessarily translate into higher grades.


At the beginning of this article we sounded a note of pessimism about the future
of generic EAP, given the many competing means available with which students can
demonstrate ELP for university admission. The findings from this study of student
achievement, however, suggest a more cautious approach when predicting the end of general EAP.

References
AERA APA & NCME. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association. Retrieved from http://www.apa.org/science/programs/testing/standards.aspx.
Al-Malki, M. A. S. (2014). Testing the predictive validity of the IELTS test on Omani English candidates’ professional competencies. International Journal of Applied Linguistics and English Literature, 3(5), 166–172. https://doi.org/10.7575/aiac.ijalel.v.3n.5p.166.
Bayliss, A., & Ingram, D. E. (2006). IELTS as a predictor of academic language performance. In Australian International Education Conference (pp. 1–12).
Bellingham, L. (1993). The relationship of language proficiency to academic success for international students. New Zealand Journal of Educational Studies, 30(2), 229–232.
Cavanagh, S. (2015). It’s a $1.7 billion testing market (but it’s a $600 billion ed. system). EDWEEK Market Brief. Retrieved February 4, 2015, from https://marketbrief.edweek.org/marketplace-k-12/its_17_billion_testing_market_but_its_a_600_billion_ed_system/.
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (2008). Test score interpretation and use. Building a validity argument for the Test of English as a Foreign Language (pp. 1–25). New York: Routledge.
Cho, Y., & Bridgeman, B. (2012). Relationship of TOEFL iBT® scores to academic performance: Some evidence from American universities. Language Testing, 29(3), 421–442. https://doi.org/10.1177/0265532211430368.
Cotton, F., & Conrow, F. (1998). An investigation of the predictive validity of IELTS amongst a group of international students studying at the University of Tasmania. IELTS Research Reports, 1, 72–115.
David, F. N. (1938). Tables of the ordinates and probability integral of the distribution of the correlation coefficient in small samples. Cambridge: Cambridge University Press.
Denham, P. A., & Oner, J. A. (1992). IELTS research project: Validation study of listening sub-test. In: Validation study of listening sub-test (IDP/IELTS commissioned report). Canberra: University of Canberra.
Dooey, P., & Oliver, R. (2002). An investigation into the predictive validity of the IELTS test as an indicator of future academic success. Prospect, 17(1), 36–54.
ETS. (2018). The TOEFL test. Retrieved October 28, 2018, from https://www.ets.org/toefl/.
Feast, V. (2002). The impact of IELTS scores on performance at university. International Education Journal, 3(4), 70–85.
Gan, Z. (2009). IELTS preparation course and student IELTS performance: A case study in Hong Kong. RELC Journal, 40(1), 23–41. https://doi.org/10.1177/0033688208101449.
Harsch, C., Ushioda, E., & Ladroue, C. (2017). Investigating the predictive validity of TOEFL iBT® test scores and their use in informing policy in a United Kingdom university setting. ETS Research Report Series, 2017(1), 1–80. https://doi.org/10.1002/ets2.12167.
Henderson, R., & Hirst, E. (2007). Reframing academic literacy: Re-examining a short-course for “disadvantaged” tertiary students. English Teaching: Practice and Critique, 6(2), 25–38.
Hill, K., Storch, N., & Lynch, B. (1999). A comparison of IELTS and TOEFL as predictors of academic success. International English Language Testing System Research Reports, 2, 52–63.
IELTS. (2018). IELTS introduction. Retrieved October 28, 2018, from https://www.ielts.org/what-is-ielts/ielts-introduction.
Ingram, D., & Bayliss, A. (2007). IELTS as a predictor of academic language performance, Part 1: The view from participants. IELTS Research Reports, 7, 137–204. Retrieved from https://www2.warwick.ac.uk/fac/soc/al/research/groups/llta/research/past_projects/strand_2_project_report_public.pdf.
Jamieson, J. M., Eignor, D., Grabe, W., & Kunnan, A. J. (2008). Frameworks for a new TOEFL. Building a validity argument for the test of English as a foreign language (pp. 55–95). New York: Routledge.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. https://doi.org/10.1111/jedm.12000.
Kerstjens, M., & Nery, C. (2000). Predictive validity in the IELTS test: A study of the relationship between IELTS scores and students’ subsequent academic performance. English Language Testing System Research Reports, 3, 85–108.
Kuh, G. D., Kinzie, J., Buckley, J. A., Bridges, B. K., & Hayek, J. C. (2006). What matters to student success: A review of the literature. Washington, DC. Retrieved from https://www.ue.ucsc.edu/sites/default/files/WhatMattersStudentSuccess(Kuh,July2006).pdf.
Lee, J. T. (2015). Education hubs in Asia: A common facade for uncommon visions. In R. Bhandari & A. Lefebure (Eds.), Asia: The next higher education superpower? (pp. 93–107). New York: Institute of International Education.
Macaro, E., Curle, S., Pun, J., An, J., & Dearden, J. (2018). A systematic review of English medium instruction in higher education. Language Teaching, 51(01), 36–76. https://doi.org/10.1017/S0261444817000350.
Macgregor, A., & Folinazzo, G. (2017). Best practices in teaching international students in higher education: Issues and strategies. TESOL Journal, 9(2), 299–329. https://doi.org/10.1002/tesj.324.
Michigan Language Assessment. (2018). MELAB. Retrieved October 28, 2018, from https://michiganassessment.org/test-takers/tests/melab/.
Murray, N. (2016). Standards of English in higher education: Issues, challenges and strategies. Cambridge: Cambridge University Press.
Murray, N., & Hicks, M. (2016). An institutional approach to English language proficiency. Journal of Further and Higher Education, 40(2), 170–187. https://doi.org/10.1080/0309877X.2014.938261.
Paragon Testing Enterprises. (2018). CAEL. Retrieved October 28, 2018, from https://www.cael.ca/.
Pearson. (2019). PTE Academic. Retrieved July 18, 2019, from https://pearsonpte.com/.
Pearson Education. (2017). Accurate fact sheet. Retrieved April 23, 2018, from https://pearsonpte.com/wp-content/uploads/2014/07/AccurateFactsheet.pdf.
Sheridan, V. (2011). A holistic approach to international students, institutional habitus and academic literacies in an Irish third level institution. Higher Education, 62(2), 129–140. https://doi.org/10.1007/s10734-010-9370-2.
Taylor, L. (2004). Issues of test comparability. Research Notes, 15(2). Retrieved from http://www.cambridgeenglish.org/images/23131-research-notes-15.pdf.
Tweedie, M. G., & Chu, M.-W. (2019). Challenging equivalency in measures of English language proficiency for university admission: Data from an undergraduate engineering programme. Studies in Higher Education, 44(4), 683–695. https://doi.org/10.1080/03075079.2017.1395008.
UCLES. (2019). C1 Advanced. Retrieved July 18, 2019, from https://www.cambridgeenglish.org/exams-and-tests/advanced/.
University of Alberta. (2018). Language requirements. Retrieved April 11, 2018, from https://www.ualberta.ca/admissions/undergraduate/admission/admission-requirements/language-requirements.
Wongtrirat, R. (2010). English language proficiency and academic achievement of international students: A meta-analysis. Old Dominion University. Retrieved from https://eric.ed.gov/?id=ED519065.
Woodrow, L. (2006). Academic success of international postgraduate education students and the role of English proficiency. University of Sydney Papers in TESOL, 1, 51–70.
Yen, D., & Kuzma, J. (2009). Higher IELTS score, higher academic performance? The validity of IELTS in predicting the academic performance of Chinese students. Worcester Journal of Learning and Teaching, 3, 1–7.
York, T. T., Gibson, C., & Rankin, S. (2015). Defining and measuring academic success. Practical Assessment, Research & Evaluation, 20(5), 1–20.
Zheng, Y., & Dejong, J. H. A. L. (2011). Research note: Establishing construct and concurrent validity of Pearson Test of English Academic. London. Retrieved from https://pearsonpte.com/wp-content/uploads/2014/07/RN_EstablishingConstructAndConcurrentValidityOfPTEAcademic_2011.pdf.

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published
maps and institutional affiliations.
