International Assessments

International Association for Educational Assessment, International Association for the Evaluation of Educational Achievement, IEA and OECD Studies of Reading Literacy

OVERVIEW

Larry Suter

Ton Luijten

Tjeerd Plomp

Vincent Greaney
Thomas Kellaghan

Ronald E. Anderson

Albert E. Beaton

Judith Torney-Purta
Jo-Ann Amadeo
John Schwille


International comparisons of student achievement involve assessing the knowledge of elementary and secondary school students in subjects such as mathematics, science, reading, civics, and technology. The comparisons use test items that have been standardized and agreed upon by participating countries. These complex studies have been carried out since 1959 to explicitly compare student performance among countries for students at a common age. To participate in such a comparative study, a country must demonstrate that it has had prior experience in conducting empirical studies of education.

Comparing student achievement between countries has several goals. To policymakers, country-to-country comparisons of student performance help indicate whether their educational system is performing as well as it could. To a researcher of education issues, the studies provide a basis for hypothesizing whether some policies and practices in education are necessary or sufficient for high student performance (such as requiring all teachers to obtain college degrees in the subject area they teach). To teachers and school administrators, international studies provide examples of behavior that may be a source of new forms of practice and self-evaluation.

Types of Study Results

The results of a large international study in 1995 showed that eighth-grade teachers in the United States, unlike teachers in other nations, are often not involved in decisions about the content areas they teach. U.S. teachers work longer hours than those in most other countries, have less time during the day to prepare for classes, and have their daily classroom teaching disrupted more often by things such as announcements, band practice, and scheduling changes. Moreover, the curriculum used by elementary and middle schools in the United States appears not to be organized around topics that will propel students toward a more advanced understanding of mathematics. Comparisons with other countries show that U.S. students are just as interested in science and mathematics as other students, study just as long, and watch just as much television.

Organizational History

Education researchers and policymakers from twelve countries first established a plan for large-scale comparisons of student performance across countries in 1958 at the UNESCO Institute for Education in Hamburg, Germany. The first successful large-scale quantitative international study in mathematics was conducted in 1965 by the International Association for the Evaluation of Educational Achievement (IEA) and included Australia, Belgium, England, Finland, France, Germany, Israel, Japan, the Netherlands, Scotland, Sweden, and the United States. Since then, studies in fourteen or more countries have been conducted periodically in several subject areas of elementary and secondary education.

Between 1965 and 2001 the IEA sponsored studies of mathematics in 1965, 1982, 1995, and 1999; science in 1970, 1986, 1995, and 1999; reading in 1970, 1991, and 2001; civics in 1970 and 1998; and technology in 1990 and 1999. The Educational Testing Service conducted an International Assessment of Educational Progress in science and mathematics in 1990. The Adult Literacy and Lifeskills survey is a large-scale comparative survey designed to identify and measure prose literacy, numeracy, and analytical reasoning in the adult population (those between sixteen and sixty-five years of age). This survey was conducted in 1994 and 2001.

Studies such as these require the development of a set of test items, which are translated into the languages of the participating countries. The translated items are checked for accuracy and pretested in each country to determine whether they contain ambiguities or errors that would make them unsuitable for use in the final study (about three times as many items are written as are finally used). The participating countries collectively agree upon a framework that defines the critical aspects of the topic area. For example, an elementary mathematics test would include items in numbers, geometry, algebra, functions, analysis, and measurement, and would also have items that represent different aspects of student performance, such as knowing the topic, using procedures, solving problems, reasoning, and communicating. However, no single assessment can comprehensively cover an entire topic for all countries.
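The framework described above can be pictured as a grid of content areas crossed with aspects of performance. The following Python sketch is only an illustration: the category names and the three-to-one drafting ratio come from the text, while the function name `blueprint` and the choice of two final items per cell are hypothetical assumptions.

```python
from itertools import product

# Categories taken from the elementary mathematics example in the text.
CONTENT_AREAS = ["numbers", "geometry", "algebra",
                 "functions", "analysis", "measurement"]
PERFORMANCE_ASPECTS = ["knowing", "using procedures", "solving problems",
                       "reasoning", "communicating"]

# "About three times as many items are written as are finally used."
DRAFT_RATIO = 3


def blueprint(items_per_cell):
    """Build a test blueprint with one cell per (content area, aspect) pair.

    `items_per_cell` is a hypothetical planning number, not a figure
    from any actual study.
    """
    return {
        cell: {"final": items_per_cell,
               "to_draft": items_per_cell * DRAFT_RATIO}
        for cell in product(CONTENT_AREAS, PERFORMANCE_ASPECTS)
    }


plan = blueprint(items_per_cell=2)
final_total = sum(cell["final"] for cell in plan.values())
draft_total = sum(cell["to_draft"] for cell in plan.values())
print(len(plan), final_total, draft_total)
```

Under these assumptions the grid has 30 cells, so a 60-item final test would call for roughly 180 drafted items before translation checks and pretesting.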

The tests are administered to a sample of students in 100 to 200 schools, which are selected to represent all students in the country. An international referee monitors the school selection process to ensure that all countries follow correct sampling procedures. The test items are scored according to internationally agreed-upon procedures and are analyzed at an international center to ensure cross-national comparability. Countries that do not meet high standards of participation are not included in the comparisons.
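The idea of sampling schools first and students second can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming simple random selection at both stages; the text does not specify the sampling design, and real studies typically add refinements such as stratification. The function name, the sampling frame, and all numbers below are hypothetical.

```python
import random


def two_stage_sample(school_rosters, n_schools, n_students_per_school, seed=0):
    """Draw schools at random, then draw students within each chosen school.

    `school_rosters` maps a school identifier to a list of student
    identifiers. Simple random sampling at both stages is an assumption
    made for illustration only.
    """
    rng = random.Random(seed)
    # Stage 1: choose the schools (sorted first so the draw is reproducible).
    chosen_schools = rng.sample(sorted(school_rosters), n_schools)
    sample = {}
    for school in chosen_schools:
        roster = school_rosters[school]
        # Stage 2: choose students, taking everyone if the school is small.
        k = min(n_students_per_school, len(roster))
        sample[school] = rng.sample(roster, k)
    return sample


# Hypothetical sampling frame: 1,000 schools with 40 students each.
frame = {
    f"school_{i:04d}": [f"s{i:04d}_{j:02d}" for j in range(40)]
    for i in range(1000)
}
drawn = two_stage_sample(frame, n_schools=150, n_students_per_school=25)
print(len(drawn), sum(len(students) for students in drawn.values()))
```

Fixing the seed makes the draw reproducible, which is the kind of property an external referee would need in order to verify that a selection followed the agreed procedure.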

Problems of Comparability

Some educators believe that learning is too elusive and culturally specific to be measured in a statistical survey. They believe that the outcomes of education are too diverse, indirect, and unpredictable to be measured in a single instrument. Others believe that comparisons are "odious" because practices that work in one culture may not be appropriate in another culture due to differences in social context and history.

The first IEA study planners were not confident that cross-national comparisons would be valid. They were concerned that the curriculum of different countries would stress different aspects of mathematics, science, or reading, and that any test of student performance might not reflect what students had been taught. To recognize national differences in teaching, the first studies measured the degree to which topics that were emphasized in the school system were actually covered. Curriculum differences were categorized as intended, implemented, or attained curriculum in order to separate the policies of the school district from classroom presentations and actual student performance. The amount of coverage of a topic became an important explanatory variable for between-school and between-country differences in achievement. The analysis showed that students in every country covered the same topics, although often in a different order and with a different emphasis. Because the tests therefore reflect content areas that the participating countries share, international comparisons of student achievement do make sense.

Education practices in the countries studied have been found to have more similarities than differences. The differences can be studied, however, and give important insights into which practices can be improved. International studies have helped policymakers understand that student performance is strongly determined by how schools articulate the content areas they are responsible for.

For example, a study conducted in 1965 showed significant differences in how countries approached the teaching of mathematics. Subsequent studies showed which topics of mathematics each country considered important, at what age they were introduced, and how the topics were sequenced. These studies led educators to pay closer attention to the underlying curriculum and the training of teachers in the United States. They also led to the earliest efforts by mathematics education professionals to develop a single set of standards for mathematics teaching.

Studies of writing have had difficulty in achieving standards that permit comparison across countries. After several attempts to develop a standard set of principles for grading the writing of students across countries, the IEA gave up its efforts to evaluate writing across cultures. However, a study of reading achievement was successfully conducted in elementary and middle school grades in 1970, and further studies are being conducted by the IEA and the Organisation for Economic Co-operation and Development (OECD). International studies have shown that U.S. students perform at a high level in reading in elementary school compared with students in the other participating countries, but only at a moderate level by grade nine. These results indicate that U.S. students begin school with sufficient ability to read and interpret texts.

Forms of Inquiry

Comparative studies of student achievement require carefully designed statistical surveys. The populations must be defined in a common way for each country, even though definitions of a grade might differ from country to country. For example, one way to ensure comparability is to carefully select a sample of all students who attend whatever grade is most common for fourteen-year-old students. These surveys involve students taking a test for about an hour and filling in a background questionnaire about their attitudes toward school. Teachers are asked to complete questionnaires about the curriculum topics they cover and their own professional training.
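The age-based population definition mentioned above can be made concrete with a short Python sketch. It assumes that enrollment records are available as (age, grade) pairs and that the target grade is simply the one attended by the largest number of students of the target age; the function name and the enrollment figures are hypothetical.

```python
from collections import Counter


def modal_grade(students, target_age):
    """Return the grade attended by the most students of `target_age`.

    `students` is an iterable of (age, grade) pairs. Picking the single
    most common grade is an illustrative assumption about how a common
    population might be defined.
    """
    counts = Counter(grade for age, grade in students if age == target_age)
    if not counts:
        raise ValueError(f"no students aged {target_age} in the data")
    return counts.most_common(1)[0][0]


# Hypothetical enrollment records for two countries with different
# school-entry ages, so fourteen-year-olds cluster in different grades.
country_a = [(14, 8)] * 70 + [(14, 9)] * 25 + [(14, 7)] * 5 + [(13, 8)] * 30
country_b = [(14, 9)] * 80 + [(14, 8)] * 20

print(modal_grade(country_a, 14))  # 8
print(modal_grade(country_b, 14))  # 9
```

In this made-up example the study would test grade 8 in the first country and grade 9 in the second, so that students of a common age are compared even though the grade labels differ.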

Since the 1990s studies have sometimes involved the use of videotape technology to collect information on teaching practices and student activities. For example, large national samples of mathematics classrooms were videotaped in 1995 in Japan, Germany, and the United States, and classrooms for other subjects were videotaped in additional countries in 2000. Videotape methods permit a more careful description of teaching practices than classroom surveys, and they provide a check on the validity of teachers' self-reporting of their practices. Detailed case studies of educational practices in several countries have also provided information about the social context in which students are taught.

International Assessments in the Twentieth Century

The first international studies were carried out by university research centers unaffiliated with government agencies. The results of those studies were published in academic journals, technical volumes, and academic books. During the 1980s these studies influenced policies in American education. Beginning in 1989 government agencies decided that they should have a larger role in organizing and supporting the studies and improving their quality. The National Center for Education Statistics (NCES), an agency of the U.S. Department of Education, and the National Science Foundation provided the leadership and funding support for creating international assessments. The U.S. National Academy of Sciences established an oversight committee called the Board on International Comparative Studies in Education to monitor the progress of these studies.

By 1995 international comparative studies had become an accepted, continuing part of describing the status of educational outcomes and were being carried out regularly by the NCES. Many countries originally participated in these studies in order to conduct an analysis of a single subject area in a single year. They have since shifted toward a more strategic plan to develop consistently measured trends in educational achievement with international benchmarks.

International Assessments in the Twenty-First Century

The complexity of conducting standardized comparisons of student achievement in many countries will always challenge researchers, yet such comparisons have become institutionalized in many countries. The OECD, which is based in Paris, has gained support from at least twenty-five governments for a continuing series of international comparisons of reading, mathematics, and science. These comparisons began in 2000. Also in 2000 UNESCO established the UNESCO Institute for Statistics to further institutionalize a process for improving the use of comparative statistics for policymaking.

Studies on the use of technology in schools are being developed to provide new information on forms of instructional technology that are becoming widespread in schools. Schools all over the world have introduced the use of computers and other forms of technology to classroom instruction, and studies seek to determine how educational practices are being altered by these systems.


BLACK, PAUL, and WILIAM, DYLAN. 1998. "Inside the Black Box: Raising Standards through Classroom Assessment." Phi Delta Kappan 80 (2):139–148.

COMBER, L. C., and KEEVES, JOHN P. 1973. Science Education in Nineteen Countries: An Empirical Study. Stockholm: John Wiley.

HARNQVIST, KJELL. 1987. "The IEA Revisited." Comparative Education Review 31 (1):48–55.

HUSÉN, TORSTEN, ed. 1967. International Study of Achievement in Mathematics, Volume 1. New York: John Wiley.

HUSÉN, TORSTEN. 1979. "An International Research Venture in Retrospect: The IEA Surveys." Comparative Education Review 23 (3):371–385.

HUSÉN, TORSTEN, and POSTLETHWAITE, T. NEVILLE, eds. 1985. The International Encyclopedia of Education Research and Studies. Oxford: Pergamon Press.

MULLIS, INA; MARTIN, MICHAEL O.; BEATON, ALBERT E.; GONZALEZ, EUGENIO J.; KELLY, DANA L.; and SMITH, TERESA A. 1998. Mathematics and Science Achievement in the Final Year of Secondary School: IEA's Third International Mathematics and Science Study (TIMSS). Boston: Center for the Study of Testing, Evaluation, and Education Policy, Boston College.

ROBITAILLE, DAVID F.; SCHMIDT, WILLIAM H.; RAIZEN, SENTA; McKNIGHT, CURTIS; BRITTON, EDWARD; and NICOL, CYNTHIA. 1993. Curriculum Frameworks for Mathematics and Science. Vancouver, BC, Canada: Pacific Educational Press.

SCHMIDT, WILLIAM H., et al. 1996. Characterizing Pedagogical Flow: An Investigation of Mathematics and Science Teaching in Six Countries. Dordrecht, Netherlands: Kluwer.

STIGLER, JAMES W.; GONZALES, PATRICK A.; KAWANAKA, TAKAKO; KNOLL, STEFFEN; and SERRANO, ANA. 1999. The TIMSS Videotape Classroom Study: Methods and Findings from an Exploratory Research Project on Eighth-Grade Mathematics Instruction in Germany, Japan, and the United States. Washington, DC: National Center for Education Statistics.

SUTER, LARRY. 2001. "Is Student Achievement Immutable? Evidence from International Studies on Schooling and Student Achievement." Journal for the Review of Educational Research 70 (4):529–545.

TRAVERS, KENNETH J., and WESTBURY, IAN. 1990. The IEA Study of Mathematics, I: Analysis of Mathematics Curricula. Oxford: Pergamon Press.



NATIONAL CENTER FOR EDUCATION STATISTICS. 2001. <http://nces.ed.gov/surveys/SurveyGroups.asp?Group=06>.



