In 2020, in addition to the results of PISA, ICILS, PIAAC and TALIS, which had been published a year earlier, members of the public interested in education trends could also access the results of the TIMSS 2019 study, in which Kazakhstan regularly takes part.

This volume of data on the state and development of the education system in the Republic of Kazakhstan, obtained through various assessments, makes possible an in-depth analysis of education indicators, both by comparing them with the results of other countries and by studying in-country trends.

This section of the digest presents an analysis of research articles on the use of the results of international comparative studies and the “pitfalls” arising in their application to justify reforms.

As E. Klieme (2020) notes in his work on the use of international large-scale assessments (ILSAs) in education policy and practice, their main purposes are to measure indicators of the effectiveness, equity and productivity of educational systems, to set benchmarks for international comparison, to track trends over time, and to inform educational policies at different levels on innovation in education management and curriculum.

In many countries, including Kazakhstan, ILSA results (e.g., PISA results) are widely reported and have “far-reaching influence on education policy” (Klieme, 147, in Hall et al., 2020).

Noting that ILSAs can further develop the educational effectiveness research movement by providing extensive and thoroughly verified data, the author cautions against simplistic interpretation of the results of these studies, which can lead to faulty generalizations.

For example, students’ previous achievements (preschool education, additional education, etc.) may strongly influence their final results, so the “direction of causality” may be unclear. The author notes that both policymakers and researchers themselves have often been misled by hasty interpretations and overly far-reaching conclusions.

Klieme also makes an interesting point: instead of substantiating claims about educational effectiveness as such (e.g., assessing the impact of policies and practices on student achievement), ILSA data can be used to obtain information about the distribution of educational opportunities among students, families, schools and regions.

In such a case, policy and practice are treated as dependent variables, while student performance and family background are treated as independent variables. In other words, the author proposes not to treat the results of PISA or TIMSS as evidence of the effectiveness of a particular program or reform but, on the contrary, to examine how policies and practices are distributed across students with different levels of achievement and different family backgrounds.

According to the author, this makes it possible to ask questions such as “Do migrant students and students from socially vulnerable families have an equal share of well-trained teachers, school principals, a favorable and motivating classroom environment and opportunities for extracurricular learning?” or “Who gets differentiated instruction, supportive feedback and support from their teachers?”

In his chapter “Policies and Practices of Assessment: A Showcase for the Use (and Misuse) of International Large-Scale Assessments in Educational Effectiveness Research,” Klieme analyzes the “invariance” of the measurements used in PISA 2015, that is, whether a construct is understood in the same way in different countries.

For example, after conducting a secondary analysis of student responses to questions about teachers’ feedback across country clusters grouped by language and region, the author concludes that the OECD’s approach to ranking countries on this issue is erroneous, and “instead of providing meaningful and useful information, generates misleading myths about differences between countries” (Klieme, 163, in Hall et al., 2020). Specifically, the author found bias concerning English-speaking countries that scored above the OECD average on feedback and formative assessments. However, Klieme recognizes the potential for using ILSA data to inform research on educational effectiveness.

In 2020, the International Association for the Evaluation of Educational Achievement (IEA) published a book on the reliability and validity of ILSAs, which includes a series of articles analyzing data from IEA research.

In a chapter titled “Understanding the Policy Influence of International Large-Scale Assessments in Education,” Rutkowski et al. analyze approaches to systematizing policy decision-making based on ILSA data.

As the authors note, the educational community, and sometimes politicians themselves, may wrongly expect ILSAs to offer policy solutions automatically. It is rather difficult to trace the real relationship between the results of studies such as TIMSS and PIRLS and the measures taken in educational policy, the effectiveness of reforms, etc.; high-profile releases of research results are often used to justify an already existing agenda.

Rutkowski et al. warn that even when such an association is identified, it is still difficult to ascertain the direction of the relationship or the degree of influence that ILSAs have had on any resulting policy change. In other words, it is difficult to prove the counterfactual: that a policy change would not have occurred in the absence of a specific ILSA.

The authors note that countries very rarely carry out a systematic analysis of ILSA data in the context of a specific education system.

As the authors note, reforms tend to be linked to ILSA results through an assumption based on the sequence of events: since one event occurs after another, it is assumed to follow from it (p. 264). However, statements of causation implying that a particular study led to a particular policy decision require a methodological framework that may simply not be possible due to the complexity of most national systems.

The authors developed a model for assessing the impact of ILSAs on policy, building on the model previously proposed by Oliveri et al. (2018) for determining whether countries’ education goals are in line with ILSA goals.


Klieme, E. (2020) “Policies and Practices of Assessment: A Showcase for the Use (and Misuse) of International Large-Scale Assessments in Educational Effectiveness Research” in International Perspectives in Educational Effectiveness Research, (eds.) Hall, J., Lindorff, A., Sammons, P. Springer Nature Switzerland.

Rutkowski, D., Thompson, G., & Rutkowski, L. (2020) “Understanding the Policy Influence of International Large-Scale Assessments in Education” in Reliability and Validity of International Large-Scale Assessment. Understanding IEA’s Comparative Studies of Student Achievement, (ed.) Wagemaker, H. (2020). Springer, C.

Oliveri, M. E., Rutkowski, D., & Rutkowski, L. (2018). Bridging validity and evaluation to match international large-scale assessment claims and country aims. ETS Research Report Series, 2018(1), 1–9.


