Classroom assessments are those developed or selected by teachers for use during their day-to-day instruction. They are different from the standardized tests that are conducted annually to gauge student achievement, and are most frequently used to serve formative purposes, that is, to help students learn. However, classroom assessments also can be used summatively to determine a student's report card grade. Standardized tests, on the other hand, tend to be considered summative assessments, as they are used to judge student progress over an extended period of time.

As the research summarized below reveals, assessment used during instruction can have a profound impact on student achievement. But to do so, the assessments must provide accurate information and they must be used in appropriate ways.

Research on Impact

In 1984, Benjamin Bloom published a summary of research on the impact of mastery learning models, comparing standard whole-class instruction (the control condition) with two experimental interventions: a mastery learning environment (where students aspire to achieve specific learning standards) and one-on-one tutoring of individual students. One hallmark of both experimental conditions was extensive use of formative classroom assessment during the learning process. Analysis of summative results revealed unprecedented gains in achievement for students in the experimental treatments when compared with the control groups. To be sure, the entire effect cannot be attributed to the effective use of classroom assessment. But, according to Bloom, a major portion can.

Based on his 1988 compilation of available research, Terry Crooks concluded that classroom assessment can have a major impact on student learning when it:

  • Places great emphasis on understanding, not just recognition or recall of knowledge; as well as on the ability to transfer learning to new situations and other patterns of reasoning
  • Is used formatively to help students learn, and not just summatively for the assignment of a grade
  • Yields feedback that helps students see their growth or progress while they are learning, thereby maintaining the value of the feedback for students
  • Relies on student interaction in ways that enhance the development of self-evaluation skills
  • Reflects carefully articulated achievement expectations that are set high, but attainable, so as to maximize students' confidence that they can succeed if they try and to prevent them from giving up in hopelessness
  • Consolidates learning by providing regular opportunities for practice with descriptive, not judgmental, feedback
  • Relies on a broad range of modes of assessment aligned appropriately with the diversity of achievement expectations valued in most classrooms
  • Covers all valued achievement expectations and does not reduce the classroom to focus only on that which is easily assessed

A decade later, Paul Black and Dylan Wiliam examined the measurement research literature worldwide in search of answers to three questions: (1) Is there evidence that improving the quality and effectiveness of use of formative (classroom) assessments raises student achievement as reflected in summative assessments? (2) Is there research evidence that formative assessments are in need of improvement? (3) Is there evidence about the kinds of improvements that are most likely to enhance student achievement? They uncovered forty articles that addressed the first question with sufficiently rigorous research designs to permit an estimation of the effects of improved classroom assessment on subsequent standardized test scores. They also uncovered profoundly large effects, including score gains that, if realized in the international math and science tests of the 1990s, would have raised the United States and England from the middle of the pack in the rank order of forty-two participating nations to the top five. Black and Wiliam go on to reveal that "improved formative assessment helps low achievers more than other students, and so reduces the range of achievement while raising achievement overall" (p. 141). They contend that this result has direct implications for districts having difficulty reducing achievement gaps between minorities and other students. The answer to their second question is equally definitive. Citing a litany of research similar to that referenced above, they describe the almost complete international neglect in assessment training for teachers.

Their answer to the third question, asking what specific improvements in classroom assessment are likely to have the greatest impact, is the most interesting of all. They describe the positive effects on student learning of (a) increasing the accuracy of classroom assessments, (b) providing students with frequent informative feedback, rather than infrequent judgmental feedback, and (c) involving students deeply in the classroom assessment, record keeping, and communication processes. They conclude that "self-assessment by pupils, therefore, far from being a luxury, is in fact an essential component of formative assessment. When anyone is trying to learn, feedback about the effort has three elements: redefinition of the desired goal, evidence about present position, and some understanding of a way to close the gap between the two. All three must be understood to some degree by anyone before he or she can take action to improve learning" (p. 143).

Standards of Quality

To have such positive effects, classroom assessments must be carefully developed to yield dependable evidence of student achievement. If they meet the five standards of quality described below, they will, in all probability, produce accurate results.

These standards can take the form of five questions that the developer can ask about the assessment: (1) Am I clear about what I want to assess? (2) Do I know why I am assessing? (3) Am I sure about how to gather the evidence that I need? (4) Have I gathered enough evidence? (5) Have I eliminated all relevant sources of bias in results? Answers to these questions help judge the quality of classroom assessments. Each is considered in greater detail below.

Standard 1. In any classroom assessment context, one must begin the assessment development process by defining the precise vision of what it means to succeed. Proper assessment methods can be selected only when one knows what kind of achievement needs to be assessed. Are students expected to master subject-matter content, meaning to know and understand it? If so, does this mean they must know it outright, or does it mean they must know where and how to find it using reference sources? Are they expected to use their knowledge to reason and solve problems? Should they be able to demonstrate mastery of specific performance skills, where it's the doing that is important, or to use their knowledge, reasoning, and skills to create products that meet standards of quality?

Because there is no single assessment method capable of assessing all these various forms of achievement, one cannot select a proper method without a sharp focus on which of these expectations is to be assessed. The main quality-control challenge is to be sure the target is clear before one begins to devise assessment tasks and scoring procedures to measure it.

Standard 2. The second quality standard is to build each assessment in light of specific information about its intended users. It must be clear what purposes a particular assessment will serve. One cannot design sound assessments without asking who will use the results, and how they will use them. To provide quality information that will meet people's needs, one must analyze their needs. For instance, if students are to use assessment results to make important decisions about their own learning, it is important to conduct the assessment and provide the results in a manner that will meet their needs, which might be distinctly different from the information needs of a teacher, parent, or principal. Thus, the developer of any assessment should be able to provide evidence of having investigated the needs of the intended user of that assessment, and of having conducted that assessment in a manner consistent with that purpose. Otherwise the assessment is without purpose. The quality-control challenge is to develop and administer an assessment only after it has been determined precisely who will use its results, and how they will use them.

Within this standard of quality, the impact research cited above suggests that special emphasis be given to one particular assessment user, the student. While there has been a tendency to think of the student as the subject (or victim) of the assessment, the fact is that the decisions students make on the basis of teacher assessments of their success drive their ultimate success in school. Thus, it is essential that they remain in touch with and feel in control of their own improvement over time.

Standard 3. Since there are several different kinds of achievement to assess, and since no single assessment method can reflect them all, educators must rely on a variety of methods. The options available to the classroom teacher include selected response (multiple choice, true/false, matching, and fill-in), essays, performance assessments (based on observation and judgment), and direct personal communication with the student. The assessment task is to match a method with an intended target, as depicted in Table 1. The quality-control challenge is to be sure that everyone concerned with quality assessment knows and understands how the various pieces of this puzzle fit together.

Standard 4. All assessments rely on a relatively small number of exercises to permit the user to draw inferences about a student's mastery of larger domains of achievement. A sound assessment offers a representative sample of all those possibilities that is large enough to yield dependable inferences about how the respondent would perform if given all possible exercises. Each assessment context places its own special constraints on sampling procedures, and the quality-control challenge is to know how to adjust the sampling strategies to produce results of maximum quality at minimum cost in time and effort.

Standard 5. Even if one devises clear achievement targets, transforms them into proper assessment methods, and samples student performance appropriately, there are still factors that can cause a student's score on a test to misrepresent his or her real achievement. Problems can arise from the test, from the student, or from the environment where the test is administered.

For example, tests can consist of poorly worded questions; they can place reading or writing demands on respondents that are confounded with mastery of the material being tested; or they can have more than one correct response, be incorrectly scored, or contain racial or ethnic bias. The student can experience extreme evaluation anxiety or interpret test items differently from the author's intent, and students may cheat, guess, or lack motivation. In addition, the assessment environment could be uncomfortable, poorly lighted, noisy, or otherwise distracting. Any of these factors could give rise to inaccurate assessment results. Part of the quality-control challenge is to be aware of the potential sources of bias and to know how to devise assessments, prepare students, and plan assessment environments to deflect these problems before they ever have an impact on results.


BLACK, PAUL, and WILIAM, DYLAN. 1998. "Assessment and Classroom Learning." Assessment in Education 5 (1):7–74.

BLACK, PAUL, and WILIAM, DYLAN. 1998. "Inside the Black Box: Raising Standards through Classroom Assessment." Phi Delta Kappan 80 (2):139–148.

BLOOM, BENJAMIN. 1984. "The Search for Methods of Group Instruction as Effective as One-to-One Tutoring." Educational Leadership 41:4–17.

CROOKS, TERRY J. 1988. "The Impact of Classroom Evaluation on Students." Review of Educational Research 58 (4):438–481.

STIGGINS, RICHARD J. 2001. Student-Involved Classroom Assessment, 3rd edition. Columbus, OH: Merrill.



