
Readability Indices

Readability Formulas, Readability and Comprehension Processes

For several decades researchers have been concerned with the question of determining how easy or difficult a text will be for a particular reader to comprehend. For example, if a teacher is assigning a textbook to an eighth-grade class, how would the teacher determine whether the class will actually comprehend the book? Writers may also want to consider the comprehensibility of their texts during the writing and revision process. To address these goals, researchers have developed readability indices, which are tools or methods that provide assessments of the comprehensibility of texts. The term readability indices has also been used to describe the legibility of writing or the interest value of texts, but these aspects of readability will not be discussed here. The most effective readability indices for assessing comprehensibility are those that take into account how people actually go about processing the information in texts.

Readability Formulas

Beginning in the 1920s, many efforts were undertaken to describe the readability of texts in terms of objective characteristics that could be measured and analyzed. First, researchers tabulated surface characteristics of written texts that may be related to how difficult the texts would be to comprehend. Some examples of these characteristics are the difficulty or frequency of the words in a text, the average number of syllables per word, and measures of the length or complexity of sentences. These data were then compared with predetermined standards, such as the average grade level of students who could correctly answer a certain percentage of questions generated from the text passage. Text characteristics that provided the most accurate predictions of the standards were judged to be indices of readability. These characteristics were then developed into readability formulas, which were equations that specified how much weight to assign to each characteristic. Probably the most widely used readability formula was one of Rudolf Flesch's, which used the number of syllables per 100 words and the average number of words per sentence.
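To make the idea of a readability formula concrete, the following is a minimal Python sketch of a Flesch-style score. The constants are those of Flesch's 1948 Reading Ease formula, on which higher scores indicate easier text. The helper names (count_syllables, flesch_reading_ease), the sentence splitter, and the vowel-group syllable heuristic are assumptions made for illustration; they are not part of the original formula, and a careful analysis would count syllables and sentences more rigorously.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # A trailing silent "e" usually adds no syllable ("make"), except in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Score a text from its syllables per 100 words and words per sentence."""
    # Naive sentence split on terminal punctuation (an assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_100_words = 100 * syllables / len(words)

    # Weights from Flesch (1948): longer sentences and longer words lower the score.
    return 206.835 - 1.015 * words_per_sentence - 0.846 * syllables_per_100_words


if __name__ == "__main__":
    print(flesch_reading_ease("The cat sat on the mat."))
    print(flesch_reading_ease(
        "Comprehensibility determination necessitates considerable investigation."
    ))
```

Under this heuristic the first example, with six one-syllable words in one sentence, scores far higher (easier) than the second, which is built from long words. The sketch shows only how the two surface characteristics are weighted; it says nothing about whether the text is actually understood.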

Readability formulas also gained favor as guidelines for revising texts to improve comprehensibility. The Flesch formula, for example, implies that texts can be made more comprehensible by using shorter words and shorter sentences. Such guidelines have been used in many contexts and are appealing at least in part because they provide concrete feedback and can be automated. The value of readability formulas is limited, however, because these surface characteristics do not cause texts to be easy or difficult to comprehend. Long words and sentences simply happen to be typical characteristics of texts that are hard to comprehend. This limitation is particularly problematic when considering text revision. In fact, George R. Klare, in a 1963 book titled The Measurement of Readability, reviewed efforts to improve comprehensibility by revising texts to lower the readability scores, and he found this method to be ineffective.

Readability and Comprehension Processes

Since the 1970s, researchers have made great advances in understanding the psychological processes that are involved in reading, and thus the factors that make a text comprehensible. As readers take in information from a text, they attempt to construct a coherent mental representation of the information. Semantic and causal coherence are critical factors that contribute to this process. When current text information is not coherent with previous text information, a reader must generate an inference in order to make sense out of the new information. More coherent texts require fewer inferences and are therefore easier to comprehend. Walter Kintsch and colleagues developed an index of semantic coherence based on the extent to which concepts are repeated across sentences. Tom Trabasso and colleagues have indexed events in a text according to the extent to which they are integral to the causal structure described in the text. Both of these indices of coherence are highly predictive of comprehension. In addition, generating inferences requires readers to use their own knowledge, such that readers with a lot of relevant knowledge will find texts more readable than readers with little knowledge. Readability is therefore not just determined by the text itself but is a product of the interaction between the text and reader.
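The notion of coherence as concept repetition across sentences can be illustrated with a small Python sketch. This is only a word-level approximation and should not be read as the index developed by Kintsch and colleagues: it treats shared content words as a stand-in for repeated concepts. The function names and the short stop-word list are assumptions introduced for this example.

```python
import re

# Small illustrative stop-word list (an assumption, far from complete).
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "but",
    "is", "are", "was", "were", "it", "its", "that", "this", "with",
}


def split_sentences(text: str) -> list[str]:
    """Naive split on terminal punctuation."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def content_words(sentence: str) -> set[str]:
    """Lowercased words of a sentence, minus stop words."""
    return {w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS}


def overlap_coherence(text: str) -> float:
    """Fraction of adjacent sentence pairs that share at least one content word."""
    sentences = split_sentences(text)
    if len(sentences) < 2:
        raise ValueError("need at least two sentences to measure coherence")
    linked = sum(
        1
        for first, second in zip(sentences, sentences[1:])
        if content_words(first) & content_words(second)
    )
    return linked / (len(sentences) - 1)


if __name__ == "__main__":
    print(overlap_coherence("The storm flooded the river. The river destroyed the bridge."))
    print(overlap_coherence("The storm flooded the river. Commuters arrived late."))
```

In the first example the repeated word "river" links the two sentences. In the second, the reader must infer that the flood delayed the commuters, and the sketch reports no overlap. A pair with no shared content word marks a place where an inference is likely to be needed.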

Semantic and causal indices of readability can also be used effectively in revising texts to make them more readable. In a 1991 article, Bruce K. Britton and Sami Gülgöz reported revising a textbook passage by repairing breaks in semantic coherence, a change that incidentally did not alter the passage's readability formula score. In another revision, they shortened words and sentences to improve the score on readability formulas. Comprehension improved for the coherence revision but not for the readability formula revision, indicating that readability indices based on psychological processes provide better guidance for revision than readability formulas.

Although textual coherence and reader knowledge can be effective readability indices, they are quite labor intensive to analyze and are impractical for use with long texts. Thomas Landauer and his colleagues, however, developed a computer model that provides a potential solution to this problem: Latent Semantic Analysis (LSA). LSA estimates the semantic similarity of words and sets of words from the way they are used in large amounts of natural language. As reported in a 1998 article, Peter W. Foltz, Kintsch, and Landauer used LSA to assess the semantic coherence of texts by calculating the similarity between adjacent sentence pairs. The LSA model showed very high correlations between semantic coherence and comprehension, thereby suggesting an automatic measure of readability based on coherence. LSA has also been used to automatically assess knowledge about a particular topic by calculating the similarity between the content of short essays and standard texts on the topic. These LSA knowledge assessment scores were also predictive of comprehension, according to a 1998 article by Michael B. W. Wolfe and colleagues. LSA can therefore provide readability indices related both to the coherence of a text itself and to the interaction between the text and the knowledge of a particular reader.
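The adjacent-sentence comparison described above can be sketched in Python. This sketch does not implement LSA. LSA derives its vectors from a large corpus through dimensionality reduction, so that related words count as similar even when they differ. Here each sentence is represented by raw word counts, so only literal word repetition contributes. The function names are assumptions made for illustration; only the final step, cosine similarity between neighboring sentence vectors, mirrors the procedure described.

```python
import math
import re
from collections import Counter


def sentence_vector(sentence: str) -> Counter:
    """Bag-of-words count vector (a stand-in for an LSA sentence vector)."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[term] * v[term] for term in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def adjacent_similarities(text: str) -> list[float]:
    """Similarity of each sentence to the one that follows it."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    vectors = [sentence_vector(s) for s in sentences]
    return [cosine(a, b) for a, b in zip(vectors, vectors[1:])]


def mean_coherence(text: str) -> float:
    """Average adjacent-sentence similarity, used here as a coherence score."""
    sims = adjacent_similarities(text)
    if not sims:
        raise ValueError("need at least two sentences to measure coherence")
    return sum(sims) / len(sims)


if __name__ == "__main__":
    passage = "Cats chase mice. Mice fear cats. Dogs bark loudly."
    print(adjacent_similarities(passage))
    print(mean_coherence(passage))
```

Replacing sentence_vector with vectors from a trained semantic space would bring the sketch closer to the approach described. The comparison and averaging steps would stay the same.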

In order to be truly useful, a readability index should be grounded in the psychological processes of reading. The LSA model represents a promising new technique for determining the readability of texts because it has psychological validity and can be automated. Nevertheless, LSA has not yet been developed into a practical index of readability for particular texts and readers. Only further testing on a broad range of texts and readers will reveal its potential.


BRITTON, BRUCE K., and GÜLGÖZ, SAMI. 1991. "Using Kintsch's Computational Model to Improve Instructional Text: Effects of Repairing Inference Calls on Recall and Cognitive Structures." Journal of Educational Psychology 83:329–345.

FLESCH, RUDOLF. 1948. "A New Readability Yardstick." Journal of Applied Psychology 32:221–233.

FOLTZ, PETER W.; KINTSCH, WALTER; and LANDAUER, THOMAS K. 1998. "The Measurement of Textual Coherence with Latent Semantic Analysis." Discourse Processes 25:285–307.

KLARE, GEORGE R. 1963. The Measurement of Readability. Ames: Iowa State University Press.

LORCH, ROBERT F., and O'BRIEN, EDWARD J., eds. 1995. Sources of Coherence in Reading. Hillsdale, NJ: Erlbaum.

MILLER, JAMES R., and KINTSCH, WALTER. 1980. "Readability and Recall of Short Prose Passages: A Theoretical Analysis." Journal of Experimental Psychology: Human Learning and Memory 6:335–354.

WOLFE, MICHAEL B. W.; SCHREINER, M. E.; REHDER, BOB; LAHAM, DARRELL; FOLTZ, PETER W.; KINTSCH, WALTER; and LANDAUER, THOMAS K. 1998. "Learning from Text: Matching Readers and Texts by Latent Semantic Analysis." Discourse Processes 25:309–336.

