CommonLit Putting Cognitive Science Research to Practice

Earlier this year, we were discussing ways that we might incorporate machine learning and AI into CommonLit’s digital features. During the discussion, one of our engineers said something that made me think: Every time that we make an update to our data model, we are implicitly defining a CommonLit “Theory of Reading.”

In other words, even the smallest features we engineer, define, or choose not to define have a trickle-down impact on student learning. This starts with the way that different pieces of information are associated with one another (or not associated) in our database architecture. For example, does our app “know” that texts are made up of sentences, and that sentences are made up of turns of phrase? How should we categorize turns of phrase? How do they relate to the Common Core Standards? Is one standard a “sub-standard” of another? Does our app need to know phonemic letter-sound combinations, since that’s where some students might get tripped up?
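One way to picture these questions is as a small data model. The sketch below is illustrative only: it is not CommonLit's actual schema, and every class name, field name, and standard code in it (`Text`, `Sentence`, `Phrase`, `Standard`, `STD-A`) is a hypothetical placeholder. It simply shows how choosing to store a relationship, such as a phrase being tagged with a standard or one standard sitting under another, becomes something the app can "know."

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Standard:
    """A reading standard. `parent` models the "sub-standard" relationship."""
    code: str  # placeholder codes, not real Common Core identifiers
    parent: Optional["Standard"] = None


@dataclass
class Phrase:
    """A turn of phrase, optionally tagged with the standards it exercises."""
    words: List[str]
    standards: List[Standard] = field(default_factory=list)


@dataclass
class Sentence:
    phrases: List[Phrase]


@dataclass
class Text:
    title: str
    sentences: List[Sentence]

    def word_count(self) -> int:
        """Count words by walking text -> sentences -> phrases."""
        return sum(len(p.words) for s in self.sentences for p in s.phrases)


# Hypothetical example: an idiom tagged with a sub-standard.
figurative = Standard(code="STD-A")
idiom = Standard(code="STD-A.1", parent=figurative)
text = Text(
    title="Example",
    sentences=[
        Sentence(
            phrases=[
                Phrase(words=["The", "test", "was"]),
                Phrase(words=["a", "piece", "of", "cake"], standards=[idiom]),
            ]
        )
    ],
)
```

If the `parent` field were left out, the app would have no way to represent one standard as a sub-standard of another, which is the sense in which a data model implies a theory of reading.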

There were two things that we all agreed on:

  1. We have an obligation to get it right, and
  2. Getting it right means putting what works into practice, at scale.

[Image: A list of the Cognitive Science reading group logistics and norms.]

What we did next makes me proud: We started an internal, cross-disciplinary team at CommonLit called the Cognitive Science Research Group. We have three goals. The first goal is to read the research in order to establish a common language at CommonLit for how we talk about texts, reading comprehension, and learning. The second goal is to put this science to practical use within our application through the design and continuous testing of new features. The third goal is to make our learning visible to the broader education and research community.

Our hope is that through our work, we can help other organizations replicate our process of becoming a highly effective learning engineering organization. Through this blog series and other venues, we also hope to shed light on how cognitive science can inform the design of new tools. We’ll do that here by translating what we learn into layman’s terms so that practitioners might be able to act on it.

Sessions #1–9: A Summary of the Research

In the first several sessions, we assigned readings from chapters 12 and 13 of The Science of Reading: A Handbook. These chapters focus on the meaning of reading comprehension and give us a common language for how we talk about texts.

Every text has four different “levels” of comprehension. First is the text’s linguistic level. This is the simplest level of a text: the text’s words and phrases. Understanding this level of a text requires phonemic awareness and knowledge of letters, letter sounds, and letter combinations. It also requires word recognition skills and a body of sight words that you likely learned in kindergarten. Without this foundational piece, students often get tripped up at a text’s linguistic level. Their working memory spends so much effort decoding or identifying specific words that little capacity is left to build a bigger picture.

The next level of a text is called its microstructure. A text’s microstructure refers to the word- and sentence-level ideas that the text conveys. In order to understand this level of a text, you need to know lots of connecting words that show relationships: words like “and,” “but,” “for,” “because,” and “so.” You also have to be able to make inferences, meaning that your brain fills in the blanks when something isn’t explicitly stated in the text. For example, when I read the sentence, “He was playing in the sand,” I know that he was likely at the beach. That’s because I have background knowledge (beaches are filled with sand) stored in my long-term memory.

As you can imagine, as texts get more complex, the background knowledge required to make these inferences also becomes more demanding. Consider all the background knowledge required to fully understand a 5th-grade text about George Washington Carver on CommonLit. By this grade, we expect kids to know basic ideas about college, environmentalism, growing plants, and the history of discrimination against African Americans, and this is just the background knowledge required to understand one text.

Students also need to be able to connect pronouns with their antecedents, even when they are far away from each other in the text. Connecting pronouns with the nouns they refer to is called anaphora resolution. Finally, understanding this level of a text requires short-term memory processing. Just as with linguistic-level comprehension, when students are making substantial cognitive efforts to understand phrase- and sentence-level ideas, they are unlikely to really be comprehending the text’s larger meaning.

The next level of text is the macrostructure. This refers to the overarching topics, themes, or arguments that are conveyed. Processing this level of the text requires an understanding of the rhetorical schemata of the text, which will help you understand the big ideas or “main idea” of the text.

Finally, you have the Situation Model. This refers to the mental model or image of a text that a reader forms in their mind. This mental image of the text exists in your brain in relationship to all the other ideas you have in your brain based on your prior reading, prior knowledge, or prior experience. For example:

  • You read that a blue whale is the largest animal in the world. You had never heard of blue whales and thought that elephants were the biggest. You learn something new, updating not only your situation model for this text but also your background knowledge.
  • You read an opinion about who the best baseball player is. You bring your background knowledge with your own opinion, deeming the text inaccurate or uninformed, leading you to question other claims the author makes. Your situation model is updated to reflect your stance towards the author in the text.

Before we go even deeper (it gets more complicated), let’s stop here for a moment and remind ourselves that only 20% of low-income students read at grade level. This means that, on the whole, we are doing a terrible job of delivering equitable reading instruction. The reason this research is so urgent is that it can help shed light on why: Struggling readers who get caught up at the text’s linguistic level or microstructure will not be able to do well on questions that ask about a text’s macrostructure. They will also likely not be able to create an accurate Situation Model of the text and store it in their memories, which would have given them the background knowledge to comprehend future texts. It also shows us why most reading comprehension assessments that ask questions solely about a text’s macrostructure are not helpful at diagnosing where our struggling readers are getting caught up. Finally, the research also shows us why teaching a struggling reader a lesson on a text’s main idea (macrostructure) will likely not work.
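The diagnostic point above can be sketched in code. This is a minimal illustration, not a description of any existing CommonLit feature: it assumes each assessment question has been tagged with one of the four levels, and the names `Level`, `accuracy_by_level`, and `lowest_struggling_level`, along with the 0.6 cutoff, are hypothetical choices made for the example.

```python
from collections import defaultdict
from enum import Enum


class Level(Enum):
    """The four levels of comprehension, ordered from most to least basic."""
    LINGUISTIC = 1
    MICROSTRUCTURE = 2
    MACROSTRUCTURE = 3
    SITUATION_MODEL = 4


def accuracy_by_level(responses):
    """Map each level to the fraction of its questions answered correctly.

    `responses` is an iterable of (Level, bool) pairs, one per question.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for level, is_correct in responses:
        totals[level] += 1
        if is_correct:
            correct[level] += 1
    return {level: correct[level] / totals[level] for level in totals}


def lowest_struggling_level(responses, threshold=0.6):
    """Return the most basic level with accuracy below `threshold`, else None.

    The threshold is an arbitrary assumption for illustration.
    """
    scores = accuracy_by_level(responses)
    for level in sorted(scores, key=lambda lv: lv.value):
        if scores[level] < threshold:
            return level
    return None


# Hypothetical responses from one student.
responses = [
    (Level.LINGUISTIC, True),
    (Level.LINGUISTIC, True),
    (Level.MICROSTRUCTURE, False),
    (Level.MICROSTRUCTURE, True),
    (Level.MACROSTRUCTURE, False),
]
print(lowest_struggling_level(responses))
```

An assessment that only asked macrostructure questions would show this student missing the main idea, but it could not reveal that the difficulty already appears at the microstructure level.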

There are a couple of other big predictors of success. First, to comprehend a text, research suggests that you need to know about 90% of the words in that text (Nagy & Scott, 2000). Not surprisingly, explicit vocabulary instruction is one of the most promising interventions for diverse student populations (Lesaux, 2010). Another big predictor of success is a reader’s standard of coherence, or having the goal of reading to comprehend rather than having the goal of simply processing individual words. With an appropriate standard of coherence, a new biology student is unlikely to become demoralized when they encounter a difficult section of a biology textbook. They know this is new material and that they are expected to learn it, not to already know it. A student who has a low standard of coherence might find the textbook to be just another case of them not “getting it.” Comprehension monitoring also plays an important role here, so that readers have the mindset and strategies to go back and re-read in order to repair misunderstandings. And of course, a reader’s motivation strongly influences whether they will understand a text. Without motivation, they won’t monitor their comprehension, their standard of coherence will dip below appropriate levels, and they will ultimately fail to understand what they read. Motivation is key, always hidden at the center of all that we teach. Put simply, a reader has to actually care if the text makes sense to them or not. Success begets more success; when a reader has a high standard of coherence, they are likely to become more motivated, and motivation encourages a higher standard of coherence.
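The 90% figure can be made concrete with a small calculation. The sketch below is a rough illustration under stated assumptions: it counts running words (so repeated words count each time), uses a deliberately simple tokenizer that only recognizes unaccented letters and straight apostrophes, and treats a student's vocabulary as a plain set of words. The function names `known_word_coverage` and `meets_coverage` are hypothetical.

```python
import re


def known_word_coverage(text, known_words):
    """Return the fraction of word tokens in `text` found in `known_words`.

    Matching is case-insensitive. Returns 0.0 for a text with no words.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    known = {word.lower() for word in known_words}
    return sum(1 for token in tokens if token in known) / len(tokens)


def meets_coverage(text, known_words, threshold=0.90):
    """Check whether coverage reaches the roughly 90% level cited above."""
    return known_word_coverage(text, known_words) >= threshold


# A reader who knows every word in the sentence except "sand".
sentence = "He was playing in the sand"
vocabulary = {"he", "was", "playing", "in", "the"}
print(known_word_coverage(sentence, vocabulary))
print(meets_coverage(sentence, vocabulary))
```

In a six-word sentence a single unknown word already drops coverage below 90%, which hints at why unfamiliar vocabulary can derail comprehension so quickly.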

What’s next?

CommonLit’s Cognitive Science Research Club will meet every other Wednesday. Throughout the fall, we’ll be forming smaller dedicated reading groups to go even deeper into the areas we discussed: background knowledge, reader motivation, inferences, vocabulary instruction, anaphora resolution, and more. Moving into the spring, we’ll conduct design sprints, prototype new features, and plan experiments to implement cognitive science within our app. During this time, we’ll be guided by our “North Star” metric: student reading growth.

Throughout all of this, we’ll be blogging about our learning and inviting researchers and practitioners to join our group.

Interested in joining CommonLit’s Cognitive Science Research Club? Got feedback for us? Don’t hesitate to reach out to julian.mante@commonlit.org.