18 minute read

Research Methods

Qualitative And Ethnographic, School And Program Evaluation, Verbal ProtocolsOVERVIEW

Georgine M. Pion
David S. Cordray

LeAnn G. Putney
Judith L. Green
Carol N. Dixon

Laura Desimone

Tammy Bourg


How do people learn to be effective teachers? What percentage of American students has access to computers at home? What types of assessments best measure learning in science classes? Do college admission tests place certain groups at a disadvantage? Can students who are at risk for dropping out of high school be identified? What is the impact of new technologies on school performance? These are some of the many questions that can be informed by the results of research.

Although research is not the only source used for seeking answers to such questions, it is an important one and the most reliable if executed well. Research is a process in which measurements are taken of individuals or organizations and the resulting data are subjected to analysis and interpretation. Special care is taken to provide as accurate an answer as possible to the posed question by subjecting "beliefs, conjectures, policies, positions, sources of ideas, traditions, and the like … to maximum criticism, in order to counteract and eliminate as much intellectual error as possible" (Bartley, pp. 139–140). In collecting the necessary information, a variety of methodologies and procedures can be used, many of which are shared by such disciplines as education, psychology, sociology, cognitive science, anthropology, history, and economics.

Evidence–The Foundation of Research

In education, research is approached from two distinct perspectives on how knowledge should be acquired. Research using quantitative methods rests on the belief that individuals, groups, organizations, and the environments in which they operate have an objective reality that is relatively constant across time and settings. Consequently, it is possible to construct measures that yield numerical data on this reality, which can then be further probed and interpreted by statistical analyses. In contrast, qualitative research methods are rooted in the conviction that "features of the social environment are constructed as interpretations by individuals and that these interpretations tend to be transitory and situational" (Gall, Borg, and Gall, p. 28). It is only through intensive study of specific cases in natural settings that these meanings and interpretations can be revealed and common themes educed. Although debate over which perspective is "right" continues, qualitative and quantitative research share a common feature–data are at the center of all forms of inquiry.

Fundamentally, data gathering boils down to two basic activities: Researchers either ask individuals (or other units) questions or observe behavior. More specifically, individuals can be asked about their attitudes, beliefs, and knowledge about past or current behaviors or experiences. Questions can also tap personality traits and other hypothetical constructs associated with individuals. Similarly, observations can take on a number of forms: (1) the observer can be a passive transducer of information or an active participant in the group being observed;(2) those being observed may or may not be aware that their behavior is being chronicled for research purposes; and (3) data gathering can be done by a human recorder or through the use of technology (e.g., video cameras or other electronic devices). Another distinction that is applicable to both forms of data gathering is whether the data are developed afresh within the study (i.e., primary data) or stem from secondary sources (e.g., data archives; written documents such as academic transcripts, individualized educational plans, or teacher notes; and artifacts that are found in natural settings). Artifacts can be very telling about naturally occurring phenomena. These can involve trace and accretion measures–that is, "residue" that individuals leave behind in the course of their daily lives. Examples include carpet wear in front of exhibits at children's museums (showing which exhibits are the most popular), graffiti written on school buildings, and websites visited by students.

What should be clear from this discussion so far is that there exists a vast array of approaches to gathering evidence about educational and social phenomena. Although reliance on empirical data distinguishes research-based disciplines from other modes of knowing, decisions about what to gather and how to structure the data gathering process need to be governed by the purpose of the research. In addition, a thoughtful combination of data gathering approaches has the greater chance of producing the most accurate answer.

Purposes of Research

The array of questions listed in the introductory paragraph suggests that research is done for a variety of purposes. These include exploring, describing, predicting, explaining, or evaluating some phenomenon or set of phenomena. Some research is aimed at replicating results from previous studies; other research is focused on quantitatively synthesizing a body of research. These two types of efforts are directed at strengthening a theory, verifying predictions, or probing the robustness of explanations by seeing if they hold true for different types of individuals, organizations, or settings.

Exploration. Very little may be known about some phenomena such as new types of settings, practices, or groups. Here, the research question focuses on identifying salient characteristics or features that merit further and more concerted examination in additional studies.

Description. Often, research is initiated to carefully describe a phenomenon or problem in terms of its structure, form, key ingredients, magnitude, and/or changes over time. The resulting profiles can either be qualitative or narrative, quantitative (e.g., x number of people have this characteristic), or a mixture of both. For example, the National Center for Education Statistics collects statistical information about several aspects of education and monitors changes in these indicators over time. The information covers a broad range of topics, most of which are chosen because of their interest to policymakers and educational personnel.

Prediction. Some questions seek to predict the occurrence of specific phenomena or states on the basis of one or more other characteristics. Short-and long-term planning are often the main rationale for this type of research.

Explanation. It is possible to be able to predict the occurrence of a certain phenomenon but not to know exactly why this relationship exists. In explanatory research, the aim is to not only predict the out-come or state of interest but also understand the mechanisms and processes that result in one variable causing another.

Evaluation. Questions of this nature focus on evaluating or judging the worth of something, typically an intervention or program. Of primary interest is to learn whether an organized set of activities that is aimed at correcting some problem (e.g., poor academic skills, low self-esteem, disruptive behavior) is effective. When these efforts are targeted at evaluating the potential or actual success of policies, regulations, and laws, this is often known as policy analysis.

Replication. Some questions revolve around whether a demonstrated relationship between two variables (e.g., predictive value of the SAT in college persistence) can be again found in different populations or different types of settings. Because few studies can incorporate all relevant populations and settings, it is important to determine how generalizable the results of a study to a particular group or program are.

Synthesis. Taking stock of what is known and what is not known is a major function of research. "Summing-up" a body of prior research can take quantitative (e.g., meta-analysis) and qualitative (narrative summaries) forms.

Types of Research Methods

The purpose or purposes underlying a research study guide the choice of the specific research methods that are used. Any individual research study may address multiple questions, not all of which share the same purpose. Consequently, more than one research method may be incorporated into a particular research effort. Because methods of investigation are not pure (i.e., free of bias), several types of data and methods of gathering data are often used to "triangulate" on the answer to a specific question.

Measurement development. At the root of most inquiry is the act of measuring key conceptual variables of interest (e.g., learning strategies, intrinsic motivation, learning with understanding). When the outcomes being measured are important (e.g., grade placement, speech therapy, college admission), considerable research is often needed prior to conducting the main research study to ensure that the measure accurately describes individuals' status or performance. This can require substantial data collection and analysis in order to determine the measure's reliability, validity, and sensitivity to change; for some measures, additional data from a variety of diverse groups must be gathered for establishing norms that can assist in interpretation. With the exception of exploratory research, the quality of most studies relies heavily upon the degree to which the data-collection instruments provide reliable and valid information on the variables of interest.

Survey methodology. Survey research is primarily aimed at collecting self-report information about a population by asking questions directly of some sample of it. The members of the target population can be individuals (e.g., local teachers), organizations (e.g., parent–teacher associations), or other recognized bodies (e.g., school districts or states). The questions can be directed at examining attitudes and preferences, facts, previous behaviors, and past experiences. Such questions can be asked by interviewers either face-to-face or on the telephone; they can also be self-administered by distributing them to groups (e.g., students in classrooms) or delivering them via the mail, e-mail, or the Internet.

High-quality surveys devote considerable attention to reducing as much as possible the major sources of error that can bias the results. For example, the target population needs to be completely enumerated so that important segments or groups are not unintentionally excluded from being eligible to participate. The sample is chosen in a way as to be representative of the population of interest, which is best accomplished through the use of probability sampling. Substantial time is given to constructing survey questions, pilot testing them, and training interviewers so that item wording, question presentation and format, and interviewing styles are likely to encourage thoughtful and accurate responses. Finally, concerted efforts are used to encourage all sampled individuals to complete the interview or questionnaire.

Surveys are mainly designed for description and prediction. Because they rarely involve the manipulation of independent variables or random assignment of individuals (or units) to conditions, they generally are less useful by themselves for answering explanatory and effects-oriented evaluative questions. If survey research is separated into its two fundamental components–sampling and data gathering through the use of questionnaires–it is easy to see that survey methods are embedded within experimental and quasi-experimental studies. For example, comparing learning outcome among students enrolled in traditional classroom-based college courses with those of students completing the course through distance learning would likely involve the administration of surveys that assess student views of the instructor and their satisfaction with how the course was taught. As another illustration, a major evaluation of Sesame Street that randomly assigned classrooms to in-class viewing of the program involved not only administering standardized reading tests to the students participating but also surveys of teachers and parents. So, in this sense, many forms of inquiry can be improved by using state-of-the-art methods in questionnaire construction and measurement.

Observational methods. Instead of relying on individuals' self-reports of events, researchers can conduct their own observations. This is often preferable when there is a concern that individuals may misreport the requested information, either deliberately or inadvertently (e.g., they cannot remember). In addition, some variables are better measured by direct observation. For example, in comparing direct observations of how long teachers lecture in a class as opposed to asking teachers to self-report the time they spent lecturing; it should be obvious that the latter could be influenced (biased upward or downward) by how the teachers believe the researcher wants them to respond.

Observational methods are typically used in natural settings, although, as with survey methods, observations can be made of behaviors even in experimental and quasi-experimental studies. Both quantitative and qualitative observation strategies are possible. Quantitative strategies involve either training observers to record the information of interest in a systematic fashion or employing audiotape recorders, video cameras, and other electronic devices. When observers are used, they must be trained and monitored as to what should be observed and how it should be recorded (e.g., the number of times that a target behavior occurs during an agreed-upon time period).

Qualitative observational methods are distinctly different in several ways. First, rather than coding a prescribed set of behaviors, the focus of the observations is deliberately left more open-ended. By using open-ended observation schemes, the full range of individuals' responses to an environment can be recorded. That is, observations are much broader in contrast to quantitative observational strategies that focus on specific behaviors. Second, observers do not necessarily strive to remain neutral about what they are observing and may include their own feelings and experiences in interpreting what happened. Also, observers who employ quantitative methods do not participate in the situations that they are observing. In contrast, observers in qualitative research are not typically detached from the setting being studied; rather, they are more likely to be complete participants where the researcher is a member of the setting that is being observed.

Qualitative strategies are typically used to answer exploratory questions as they help identify important variables and hypotheses about them. They also are commonly used to answer descriptive questions because they can provide in-depth information about groups and situations. Although qualitative strategies have been used to answer predictive, explanatory, and evaluative questions, they are less able to yield results that can eliminate all rival explanations for causal relationships.

Experimental methods. Experimental research methods are ideally suited for examining explanatory questions that seek to ascertain whether a cause-and-effect relationship exists among two or more variables. In experiments, the researcher directly manipulates the cause (the independent variable), assigns individuals randomly to various levels of the independent variable, and measures their responses (the expected effect). Ideally, the researcher has a high degree of control over the presentation of the purported cause–where, when, and in what form it is delivered; who receives it; and when and how the effect is measured. This level of control helps rule out alternative or rival explanations for the observed results. Exercising this control typically requires that the research be done under laboratory or contrived conditions rather than in natural settings. Experimental methods, however, can also be used in real-world settings–these are commonly referred to as field experiments.

Conducting experiments in the field is more difficult inasmuch as the chances increase that integral parts of the experimental method will be compromised. Participants may be more likely to leave the study and thus be unavailable for measurement of the outcomes of interest. Subjects who are randomly assigned to the control group, which may receive no tutoring, may decide to obtain help on their own–assistance that resembles the intervention being tested. Such problems essentially work against controlling for rival explanations and the key elements of the experimental method are sacrificed. Excellent discussions of procedures for conducting field experiments can be found in the 2002 book Experimental and Quasi-Experimental Designs for Generalized Causal Inference, written by William R. Shadish, Thomas D. Cook, and Donald T. Campbell, and in Robert F. Boruch's 1997 book Randomized Field Experiments for Planning and Evaluation: A Practical Guide.

Quasi-experimental methods. As suggested by its name, the methods that comprise quasi-experimental research approximate experimental methodologies. They are directed at fulfilling the same purposes–explanation and evaluation–but may provide more equivocal answers than experimental designs. The key characteristic that distinguishes quasi experiments from experiments is the lack of random assignment. Because of this, researchers must make concerted efforts to rule out the plausible rival hypotheses that random assignment is designed to eliminate.

Quasi-experimental designs constitute a core set of research strategies because there are many instances in which it is impossible to successfully assign participants randomly to different conditions or levels of the independent variable. For example, the first evaluation of Sesame Street that was conducted by Samuel Ball and Gerry Bogatz in 1970 was designed as a randomized experiment where individual children in five locations were randomly assigned to either be encouraged to watch the television program (and be observed in their homes doing it) or not encouraged. Classrooms in these locations were also either given television sets or not, and teachers in classrooms with television sets were encouraged to allow the children to view the show at least three days per week. The study, however, turned into a quasi experiment because Sesame Street became so popular that children in the control group (who were not encouraged to watch) ended up watching a considerable number of shows.

The two most frequently used quasi-experimental strategies are time-series designs and nonequivalent comparison group designs, each of which has some variations. In time-series designs, the dependent variable or expected effect is measured several times before and after the independent variable is introduced. For example, in a study of a zero tolerance policy, the number of school incidents related to violence and substance use are recorded on a monthly basis for twelve months before the policy is introduced and twelve or more months after its implementation. If a noticeable reduction in incidents occurs soon after the new policy is introduced and the reduction persists, one can be reasonably confident that the new policy was responsible for the observed increase if no other events occurred that could have resulted in a decline and there was evidence that the policy was actually enforced. This confidence may be even stronger if data are collected on schools that have similar student populations and characteristics but no zero tolerance policies during the same period and there is no reduction in illegal substance and violence-related incidents.

Establishing causal relationships with the nonequivalent comparison group design is typically more difficult. This is because when groups are formed in ways other than random assignment (e.g., participant choice), this often means that they differ in other ways that affect the outcome of interest. For example, suppose that students who are having problems academically are identified and allowed to choose to be involved or not involved in an after-school tutoring program. Those who decide to enroll are also those who may be more motivated to do well, who may have parents who are willing to help their children improve, and who may differ in other ways from those who choose not to stay after school. They may also have less-serious academic problems. Such factors all may contribute to these students exhibiting higher academic gains than their nontutored counterparts do when after-tutoring testing has been completed. It is difficult, however, to disen-tangle the role that tutoring contributed to any observed improvement from these other features. The use of well-validated measures of these characteristics for both groups prior to receiving or not receiving tutoring can help in this process, but the difficulty is to identify and measure all the key variables other than tutoring receipt that can influence the observed outcomes.

Secondary analysis and meta-analysis. Both secondary analysis and meta-analysis are part of the arsenal of quantitative research methods, and both rely on research data already collected by other studies. They are invaluable tools for informing questions that seek descriptive, predictive, explanatory, or evaluative answers. Studies that rely on secondary analysis focus on examining and reanalyzing the raw data from prior surveys, experiments, and quasi experiments. In some cases, the questions prompting the analysis are ones that were not examined by the original investigator; in other cases, secondary analysis is performed because the researcher disagrees to some extent with the original conclusions and wants to probe the data, using different statistical techniques.

Secondary analyses occupy a distinct place in educational research. Since the 1960s federal agencies have sponsored several large-scale survey and evaluation efforts relevant to education, which have been analyzed by other researchers to re-examine the reported results or answer additional questions not addressed by the original researchers. Two examples, both conducted by the National Center for Education Statistics, include the High School and Beyond Survey, which tracks seniors and sophomores as they progress through high school and college and enter the workplace; and the Schools and Staffing Survey, which regularly collects data on the characteristics and qualifications of teachers and principals, class size, and other school conditions.

The primary idea underlying meta-analysis or research synthesis methods is to go beyond the more traditional, narrative literature reviews of research in a given area. The process involves using systematic and comprehensive retrieval practices for accumulating prior studies, quantifying the results by using a common metric (such as the effect size), and statistically combining this collection of results. In general, the reported results that are used from studies involve intermediate statistics such as means, standard deviations, proportions, and correlations.

The use of meta-analysis grew dramatically in the 1990s. Its strength is that it allows one to draw conclusions across multiple studies that addressed the same question (e.g., what have been the effects of bilingual education?) but used different measures, populations, settings, and study designs. The use of both secondary analysis and meta-analysis has increased the longer-term value of individual research efforts, either by increasing the number of questions that can be answered from one large-scale survey or by looking across several small-scale studies that seek answers to the same question. These research methods have contributed much in addressing policymakers' questions in a timely fashion and to advancing theories relevant to translating educational research into recommended practices.


BALL, SAMUEL, and BOGATZ, GERRY A. 1970. The First Year of Sesame Street: An Evaluation. Princeton, NJ: Educational Testing Service.

BARTLEY, WILLIAM W., III. 1962. The Retreat to Commitment. New York: Knopf.

BORUCH, ROBERT F. 1997. Randomized Field Experiments for Planning and Evaluation: A Practical Guide. Thousand Oaks, CA: Sage.

BRYK, ANTHONY S., and RAUDENBUSH, STEPHEN W. 1992. Hierarchical Linear Models: Applications and Data Analysis Methods. Newbury Park, CA: Sage.

COOK, THOMAS D.; COOPER, HARRISON; CORDRAY, DAVID S.; HARTMANN, HEIDI; HEDGES, LARRYV.; LIGHT, RICHARD J.; LOUIS, THOMAS A.; and MOSTELLER, FREDERICK, eds. 1992. Metaanalysis for Explanation: A Casebook. New York: Russell Sage Foundation.

COOPER, HARRISON, and HEDGES, LARRY V., eds. 1994. The Handbook of Research Synthesis. New York: Russell Sage Foundation.

GALL, MERIDITH D.; BORG, WALTER R.; and GALL, JOYCE P. 1966. Educational Research: An Introduction, 6th edition. White Plains, NY: Long-man

SHADISH, WILLIAM R.; COOK, THOMAS D.; and CAMPBELL, DONALD T. 2002. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin.



Additional topics

Education - Free Encyclopedia Search EngineEducation Encyclopedia