Large-scale education tests often come with side effects


When results come out for big education tests like the Programme for International Student Assessment (PISA), which primarily measures 15-year-old students’ knowledge and skills in reading, mathematics and science, the focus is often on which countries scored the highest.

The education systems of countries that do well on this test are often portrayed as models for the rest of the world. For example, the United Kingdom has committed $54.2 million to help 8,000 schools adopt the math teaching methods of PISA’s top performer, Shanghai, by 2020. The United Kingdom has adopted Chinese textbooks as well.

Some educators have found that there are problems with emulating the top PISA scorers. According to our research and that of our colleagues, big education tests – known as large-scale assessments in the education world – come with some serious and damaging side effects. Students in countries that did the best on the PISA – the results of which were released on Dec. 3 – often have lower well-being, as measured by students’ satisfaction with life and school. Six out of the top 10 performing countries for reading have student well-being levels that are below the Organisation for Economic Co-operation and Development average.

This suggests to us the need to take a more critical look at what large-scale assessments like the PISA really show and why countries with high PISA scores also score low in well-being.

Another question is whether these tests should hold as much power as they do when it comes to shaping educational policy and practice, or judging the “quality” of one country’s education system over another.

Here is a series of problems that have been shown to occur when too much emphasis is placed on the results of big education tests like the PISA.


Big education tests can distort the definition of quality education. For example, high PISA-scoring educational systems, such as Singapore, Finland, Korea and Shanghai, are viewed as high-quality systems. But we think the quality of education is much more than any education test can assess.

Large-scale education tests can also distort what is actually taught in schools by narrowing it to a limited number of assessed subjects: typically reading, math and, in some cases, science. Meanwhile, other subjects, such as music, art, social studies and languages, are overlooked.

Furthermore, these tests can distort instruction by inducing teachers to teach to the test. For example, the No Child Left Behind Act of 2001, which brought tests as accountability measures to U.S. schools, has led to an increase in instruction time on tested subjects. However, other essential skills, like creativity, problem-solving and organization of knowledge, have been neglected.

Leads to corruption and cheating

Large-scale assessments create incentives and pressure that can lead to corruption and cheating. In 2019, for instance, 50 Americans were charged in a college admission scandal that involved cheating on college entrance exams as well as bribing their children’s way into college.

Cheating is not limited to the U.S. In China, cheating on the National College Entrance Exam and other large-scale assessments is a frequent occurrence.

Aggravates inequity

Large-scale assessments can be biased against students from disadvantaged and minority backgrounds and favor advantaged students. Take the SAT as an example. Its scores have a strong positive correlation with family income, which means students from wealthier families tend to score higher than those from lower-income families.

As a result, students from lower-income families are not afforded the same opportunities to attend college, or they end up attending a less prestigious college. This has a long-term socioeconomic impact, because graduating from college and pursuing advanced degrees make notable differences in lifelong earning opportunities. The ability to attend top-tier colleges greatly increases the likelihood of graduating and being accepted for advanced degrees. When biased large-scale assessments and unequal opportunities limit students’ options, the result may serve only to aggravate inequality and injustice.

In many countries, large-scale assessments are used as gatekeepers to access higher education. This leads parents, teachers, schools, media, policymakers and students to focus on high scores. Scores are then associated with the worthiness of students. When scores become equated with worth, it can demoralize students, teachers and other stakeholders and cause them psychological damage.

Exam-induced suicide has been reported in places such as Korea, Singapore, Hong Kong and China. These countries also tend to be the high achievers on other large-scale assessments like the PISA.

Potential overlooked

Large-scale assessments can provide useful information for education policy, but overreliance on test results may cause problems. When the focus is on students’ scores and countries’ ranking, other important things, such as creativity, entrepreneurial thinking, social-emotional well-being and critical thinking, might be neglected. These treasured educational outcomes are things that large-scale assessments often fail to capture.

Author Bios: Yurou Wang is a Clinical Assistant Professor at the University of Alabama and Trina E. Emler is at the University of Kansas.