Education and a Civil Society: Teaching Evidence-Based Decision Making

Chapter 5: Learning to Reason about Evidence and Explanations: Promising Directions in Education

Authors
Eamonn Callan, Tina Grotzer, Jerome Kagan, Richard E. Nisbett, David N. Perkins, and Lee S. Shulman
Project
Teaching Evidence-Based Decision Making in K-16 Education

Tina Grotzer

INTRODUCTION

Recently people were sickened by eating tainted spinach. Government authorities rushed to gather evidence about the source and scale of the problem and whether it extended to other leafy green vegetables. How would you decide whether to feed leafy green vegetables to your child?

Sometimes when you put on sunscreen, you break out in an angry red rash. You haven’t noticed any obvious pattern in which sunscreens cause a rash. How can you figure out what to do?

What does it mean to encourage a scientifically literate population? We live in an age where an abundance of information is readily available at our fingertips. How do we help children grow up to be critical consumers of scientific information and to reason effectively about it in the service of better lives and a healthier planet? What about creating the next generation of scientists? How do we open the door so that more students can pursue science? These are questions that science education researchers strive to answer and K–12 educators grapple with every day.

This paper considers the difficulties students have in reasoning about scientific evidence and the current discourse in K–12 education research and practice about the puzzles and what appear to be promising directions. It draws from the research literatures in cognitive science, developmental psychology, and science education to provide an overview of the issues and to illuminate the nuanced and highly challenging problem space of scientific reasoning and evidence evaluation. Finally, it offers suggestions for the focus of our collective efforts for improving practice toward developing a more scientifically literate population. Given space considerations and the extensive research literature on this topic, this paper samples the available research literature but is by no means exhaustive.

WHAT MAKES REASONING ABOUT SCIENTIFIC EVIDENCE SO DIFFICULT?

Educating for scientific literacy requires an answer to what makes reasoning about scientific evidence so hard. An expansive literature addresses this question, ranging from research on why people attend to certain types of information over others in a perceptual sense (e.g., Mack and Rock 1998) to why certain information is successful in capturing our attention (e.g., Sunstein 2002) and how default biases and heuristics make it difficult to reason well in any given situation (e.g., Nickerson et al. 1985). It includes research on people’s difficulties with proportional reasoning, probabilities, and statistics (for a review, see Piattelli-Palmarini 1994). Another body of research focuses on reasoning about complexity—not only the requisite skills and patterns of thinking but ways of dealing with the cognitive load of many interrelated, typically dynamic concepts (e.g., Dorner 1989; Feltovich et al. 1993).

In the event that one surmounts these challenges in gaining, sustaining, and focusing attention and helping the public successfully reason about the scientific evidence at hand, the literature in the field also invites us to consider the sociology of the problem space. A diverse literature in its own right, it considers questions such as what is the nature of science and how the patterns in science interact with how people attend to, believe, and understand it. For instance, if science is a process of trading up for more powerful explanatory models, as Thomas Kuhn (1962) suggests, where does this leave the public, who might hold an accumulation model of how science works, in trying to reason about the available scientific evidence? From the perspective of the public, science might seem to change its mind a lot. The literature in the field also addresses how scientists handle uncertainty (e.g., Zehr 1999), how disagreements are vetted, and how local knowledge and “expert knowledge” interact to inform a problem space (e.g., Jasanoff 1997; Wynne 1992). Highly public incidents that draw upon scientific knowledge, such as the spinach scare outlined above, bring the public “along for the ride,” which invites all sorts of opportunities to misunderstand the evidence and doubt the process of generating it. Highly visible cases where the scientific establishment, through failures of the disciplinary means that vet good science from bad, was initially wrong lead to further doubt and mistrust. An example is the well-publicized case of Judah Folkman, who proposed a radically different theory of cancer growth that was initially spurned by his colleagues. Folkman persisted in developing the theory of angiogenesis, which went on to change the course of cancer treatment. However, it is easier for the public to take away the simple message that the scientific establishment was wrong than to evaluate the broader set of cases, among which are many instances where the processes of science policed themselves to keep bad science out of the public domain.

Given these challenges, it is no wonder that achieving a scientifically literate population is a focus of concern. We know a fair amount about the problem space and why it is so complicated. However, this knowledge doesn’t lead to easy answers about what we should do. The sections that follow explore some of the key areas of research about scientific reasoning that illuminate the direction of research and practice in education. While these directions offer promise, the concluding section reflects upon these directions and considers what else we might be doing in light of how complicated the problem space is.

What Is Needed to Get Us to Change Our Minds? Confirmation Bias and Other Tendencies

How do people respond to different types of evidence? In a seminal work, Deanna Kuhn and colleagues (1988) reviewed a wide-ranging set of studies and conducted targeted studies related to how people reason about evidence. They found that people typically don’t change their minds and that they have a number of difficulties reasoning about evidence. For instance, Kuhn et al. found that people did not distinguish between the evidence and the theory itself. When evidence was discrepant with a theory, people failed to acknowledge it, or they adjusted or ignored it. In cases where they only moderately believed the theory, they often adjusted the theory to fit the evidence without considering the implications of doing so. They also had difficulty perceiving instances of a nonoperative variable or an operative variable that led to an outcome opposite of what they expected. These studies are part of a line of research leading to what has been called “confirmation bias”—the tendency to use evidence to confirm one’s existing beliefs. Confirmation bias has been widely studied and corroborated in the field. For instance, Klahr and colleagues (e.g., Schunn and Klahr 1993; Klahr et al. 1993) found that subjects were less skeptical of self-generated hypotheses than of those generated by others and tested other-generated hypotheses more rigorously.

However, further discourse in the field suggested that when viewed in context and with a nuanced understanding of the nature of scientific reasoning, subjects’ responses in earlier studies might make scientific sense (e.g., Karmiloff-Smith 1984; Koslowski 1996). Koslowski (1996) argued that scientists do not work in a theory vacuum, so studies asking subjects to reason about theoretically impoverished situations are incomplete and might distort what would be considered scientifically appropriate in a given situation. Like scientists who use “working hypotheses” that don’t fit all available data to reduce processing demands and to organize data in order to search for patterns, people might hold on to an initial theory because of a temporary lack of a better one. Koslowski raised the question of when theory modification is scientifically legitimate. The real issue, she argued, is not whether one seeks confirmation or disconfirmation but whether one considers plausible alternative hypotheses. Science often proceeds by reformulating rather than discarding a theory because of disconfirming evidence. Dunbar (1995) studied scientists in the lab and found that experienced scientists tend to pay great attention to inconsistent results but that most often they change particular features of their hypotheses rather than discarding them entirely. On the other hand, he found less confirmation bias in experienced scientists than in inexperienced ones, but the former often discarded useful data.

Chinn and Brewer (1998) studied how students responded when confronted with anomalous data or evidence that contradicts their current theories of the world. They found that students responded in seven ways. Students might 1) ignore the data; 2) reject the data (often offering a different explanation); 3) exclude the data from the domain of the specific theory; 4) hold the data in abeyance; 5) reinterpret the data while retaining the theory; 6) reinterpret the data and make peripheral changes to the theory; and 7) accept the data and change theory A, possibly in favor of theory B. However, for each response type, Chinn and Brewer offered examples from historical cases of scientific reasoning to show that scientists also use these response patterns. Karmiloff-Smith (1984) found that children temporarily ignore disconfirming data until they form a solid theory; they then revisit those data and try to come up with a new theory to explain them; finally, they generate a unifying theory.

Reaching a conclusion opposite of what one originally expects may also depend upon the tools in one’s repertoire. A greater focus on concepts that support the interpretation of research data, such as statistical inference, random sampling, reliability, and regression to the mean, can support effective modeling and reasoning about data. These concepts typically bridge the science and math curricula and might not receive attention in either place. Petrosino and colleagues (2003) argue that concepts of variability, which are central to data modeling, are given short shrift in school instruction. They report on an eight-week unit in which fourth graders were given the tools to think about what was suggested by the distribution of their data and thus successfully coordinated the data with conjecture to reach a finding opposite of what they initially expected.

This key area of research points to the importance of nuance in how evidence is interpreted. While important general patterns of reasoning are involved, the effective interpretation of evidence relies on many contextual factors. This suggests that mastering and being able to transfer such skills requires diverse, contextualized opportunities from which learners can discern more expert patterns of engagement.

When Does One Thing Actually Cause Another? Causality as Covariation

Consider how you might think about the following questions: “What is causing a rash sometimes when you put on sunscreen but not other times?” Or, “What might be causing you to feel ill sometimes after eating?” One of the first things that most people do is to search for a pattern that goes along with or co-occurs with the outcome. For instance, you might realize that you only get a rash when you wear a sunscreen strength of 40 SPF or higher. Or perhaps you get ill after eating certain kinds of foods or at certain times of the day.

How we detect that a cause and effect relationship exists has a profound impact on what we view as relevant evidence. Kuhn and colleagues (1988) found that people relied on covariation of two or more variables to suggest a causal relationship. The variables might covary in different ways (“When I drink less water, I feel more tired” or “The more caffeine I drink, the more productive I am”) or in more than one direction (“The more caffeine I drink, the more productive I am until I crash and get nothing done”). They also found that people tended toward false inclusion—to view any covariate as causal even in cases where it was merely correlational. Finding covariation with one variable, subjects were unlikely to test other uncontrolled variables. For instance, you might decide that sunscreen of SPF 40 and higher is the culprit in the rash, but you put on sunscreen of SPF 40 and higher only when you spend long periods of time in intense sun. Perhaps the high SPF level isn’t what is causing your rash but long periods of exposure to intense sun.

Overapplied covariation can lead to assumptions and erroneous conclusions. For example, the College Board established a high positive correlation between students who took algebra in eighth or ninth grade and those who went to college. The U.S. secretary of education interpreted this to mean that courses in mathematics, including algebra, were the gateway to college (Bracey 1998). Yet, we don’t know if a third variable, perhaps related to parenting or ability level, led to both outcomes. Further, while covariation in terms of temporal and spatial contiguity can be an important cue to the possibility of a causal relationship, as causality becomes more complex, relying on covarying, contiguous variables can result in shortsightedness because some causes are spatially and temporally remote from their effects (Shultz and Mendelson 1975). Both of these kinds of errors can be detected in people’s reasoning—the tendency to assign a causal relationship to one that is merely correlational and the tendency to look for local causes and effects and to miss spatially and temporally remote ones.
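The algebra-and-college example can be made concrete with a small simulation. The sketch below is purely illustrative, not real College Board data: the numbers, the `support` variable (standing in for parenting or ability level), and all probabilities are invented assumptions. It shows how a lurking third variable that raises the probability of both behaviors produces a strong covariation between them even though neither causes the other.

```python
import random

random.seed(0)

def simulate(n=100_000):
    """Hypothetical model: a hidden 'support' factor independently raises
    the chance of taking algebra early AND of attending college.
    There is no direct causal link between the two outcomes."""
    both = algebra = college = 0
    for _ in range(n):
        support = random.random() < 0.5          # hidden third variable
        took_algebra = random.random() < (0.8 if support else 0.2)
        went_college = random.random() < (0.7 if support else 0.2)
        algebra += took_algebra
        college += went_college
        both += took_algebra and went_college
    p_college_given_algebra = both / algebra
    p_college_overall = college / n
    return p_college_given_algebra, p_college_overall

given_algebra, overall = simulate()
# Algebra-takers go to college far more often than the base rate,
# despite the model containing no causal path between the two.
print(given_algebra > overall)  # prints True
```

Under these invented parameters, roughly 60 percent of algebra-takers attend college versus a 45 percent base rate, so covariation alone would wrongly suggest that algebra causes college attendance.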

Research shows that people do use covariation, but this does not appear to be the whole story. Infants use contingency, expecting that objects act on one another only if they touch and that action is not enacted from a distance (Borton 1979; Leslie 1982, 1984; Leslie and Keeble 1987; Oakes 1993; Spelke et al. 1995; Van de Walle and Spelke 1993), and clear, age-related increases occur in people’s ability to use contingency data to cue the possibility of a causal relationship. By preschool, children have gained greater understanding of the contextual nuances of applying temporal and spatial cues. They are less likely to pick a spatially remote event as a cause (Koslowski 1976; Koslowski and Snipper 1977; Lesser 1977), and when spatial contiguity cues conflict with temporal cues, children override the former in favor of the latter (Bullock and Gelman 1979). Time delays between causes and effects introduce difficulties for preschoolers (Siegler and Liebert 1974), but by five years of age children are able to deal with time delays when the delays have an identifiable explanation (Mendelson and Shultz 1976); for instance, the physical effects of illness that follow contact with a contaminant (Kalish 1997). Adults are able to override temporal and spatial contiguity cues when the specific context—for instance, the possibility of a complex interaction between two competing causes—suggests the need to (Michotte 1946/1963). Adults also are more likely to have and therefore use information about the contingency relations between two events—how often they are likely to co-occur (e.g., Kelley 1973; Nisbett and Ross 1980).

Research also underscores how nuanced people’s reliance on covariation data is. Siegler and Liebert (1974) found that when they varied the degree of covariation (100 percent to 50 percent) between events and introduced variability into the temporal contiguity (immediate versus five-second delay), eight- and nine-year-olds were more sensitive to the lack of perfect covariation than five-year-olds, who were perhaps distracted by the time delays and thus less likely to notice the imperfect covariation. Gopnik et al. (2004) have argued that young children can override imperfect correlation, may be accepting of the probabilistic context, and may indeed reason as Bayesians, summing across experiences to discern frequencies of particular patterns. Investigations of how people use contingency data in determining cause-and-effect relationships suggest an age-related increase in the accuracy of third- and seventh-grade children’s and adults’ judgments of a cause-effect relationship (Shaklee and Goldston 1989; Shaklee and Mims 1981).
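One common normative summary of contingency data in this literature is the delta-P rule: the probability of the effect when the candidate cause is present minus its probability when the cause is absent, computed from a 2x2 table of co-occurrence counts. The sketch below illustrates the computation; the sunscreen counts are hypothetical numbers chosen for the running example, not data from any study.

```python
def delta_p(a, b, c, d):
    """Contingency (delta-P) from a 2x2 table of counts:
    a: cause present, effect present    b: cause present, effect absent
    c: cause absent,  effect present    d: cause absent,  effect absent
    Returns P(effect | cause) - P(effect | no cause); ranges -1 to 1."""
    return a / (a + b) - c / (c + d)

# Hypothetical sunscreen diary: rash on 8 of 10 high-SPF days, but also
# on 6 of 12 days without high SPF -- only a weak contingency.
print(round(delta_p(8, 2, 6, 6), 2))  # prints 0.3
```

A delta-P near zero would indicate mere co-occurrence rather than a dependable contingency, which is one way to formalize the distinction between a covariate and a likely cause that the studies above probe.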

Is covariation all that people use in determining a causal relationship? Consider the following two statements: 1) A high correlation exists between the numbers of churches and the crime rate in the United States; 2) A high correlation exists between growing up in certain neighborhoods and the likelihood that one will engage in crime. If you are like most people, you considered what causal mechanism might be in play in each case and were less likely to assign a causal link between churches and crime rate than between growing up in a tough neighborhood and engaging in crime. Research suggests that people do not engage in Popperian mechanism-independent evaluation of evidence. In the sunscreen example, you might also notice that each time you got the rash the sunscreen came out of a white bottle. However, most of us would rule out bottle color as a plausible causal mechanism no matter how highly correlated it is.

Even preschool children attend to mechanism in their causal explanations (e.g., Bullock 1979; Baillargeon et al. 1981). Despite a clear recognition of covariation as a cue to a potential causal relationship, even young children are sensitive to mechanism (Bullock 1979) and as early as four and five years of age can override spatial discontiguity if plausible mechanisms exist. Even three-year-olds aren’t indifferent to causal mechanisms; they just do not have much information about them (Baillargeon et al. 1981). Preschoolers realize that seemingly uncaused events require explanation, even if they are unable to specify the details (Bullock 1984, 1985; Corrigan 1995; Gopnik et al. 2004; Shultz and Kestenbaum 1985).

This area of key research suggests that our biases and our abilities might be separate: while we are inclined to confuse correlation and causation, we are capable of assessing causation better than we routinely do. This suggests that helping people reflect upon their tendencies when evaluating evidence, and cueing them to instances when they need to be alert to other variables, should be effective.

There’s No Simple Way to Handle Complexity: Reductive Biases and Causal Default Patterns

In the process of analyzing evidence, one implicitly makes decisions about how to bound the problem space. An unbounded space is untenable in many reasoning contexts; yet, as with spatially and temporally remote causes, more extended searching is sometimes needed. Similarly, one makes decisions about how to characterize the inherent features and complexity. In the same spirit of efficiency and comprehensibility, when dealing with complexity people tend to reveal “reductive biases.” Feltovich et al. (1993) identified characteristics of concepts or situations that cause difficulty for most people and found that people tend to simplify phenomena in a type of reductive bias. For instance, people often reduce dynamic phenomena to static snapshots and continuous processes to discrete steps.

People reduce information into sets of heuristics as well. For instance, people tend to use an “availability heuristic” when sampling information (Tversky and Kahneman 1982). We sample the most available information from memory. Drawing conclusions from frequency of experience tends to serve us well but can cause errors by biasing the sample (examples include remembering events with shock value and so overestimating their occurrence; Sunstein 2002). Research on “intuitive toxicology” reveals that people hold assumptions about the nature of chemicals that are quite divergent from what experts believe (Sunstein 2002). Laypeople tend to believe that any human-made chemical is bad while any substance from nature is benevolent, whereas experts are more interested in the dosage and recognize that chemicals in nature can be as harmful as or more harmful than manufactured ones. Experts view naturally occurring radon as a much greater cancer threat than human-made sources of radiation. How such biases can influence public understanding and direct public sentiment in reasoning about scientific evidence is exemplified in the current look back at Rachel Carson’s effectiveness in the banning of DDT, the lack of scientific evidence to support the banning, and the subsequent malaria deaths (Sunstein 2002; Tierney 2007).

How we reach conclusions about the world around us is also impacted by how we structure causality. Grotzer and colleagues (Grotzer 2004; Perkins and Grotzer 2005) identified nine simplifying causal assumptions that people tend to make in their explanations. People assumed 1) linearity as opposed to nonlinearity; 2) direct connections between causes and effects without intervening steps or indirect connections; 3) unidirectionality as opposed to bidirectionality; 4) sequentiality as opposed to simultaneity; 5) that causes and effects are obvious and perceptible as opposed to nonobvious and imperceptible; 6) active or intentional agents as opposed to nonintentional ones; 7) determinism—where effects must consistently follow “causes” or the “cause” is not considered to be the cause as opposed to probabilistic causation; 8) spatial and temporal contiguity between causes and effects as opposed to spatial gaps or temporal delays; 9) centralized causes with few agents—missing more complex interactions or emergent effects as opposed to decentralized causes or distributed agency. Substantial support for these tendencies exists in the research literature (e.g., Feltovich et al. 1993; Ferrari and Chi 1998; Wilensky and Resnick 1999). For instance, Resnick (1996) refers to the “centralized mindset” and Driver et al. (1985) identified a tendency toward simple linear causal explanations.

Reductive biases and heuristics can help us quickly and efficiently process information. Therefore, in some respects they make sense. In other contexts, however, we might miss parts of the causal story and make uninformed decisions. This key area of research invites reflection on the tendency in education to reduce complex phenomena to simpler versions and raises questions about how we can help learners develop awareness and a reflective stance on the heuristics and causal reasoning patterns that they employ.

Summary Points

The sum of the research in these key areas suggests the difficulty of understanding the nature of scientific reasoning and evidence evaluation without the nuanced context that surrounds such reasoning and evaluation. Strategies that might seem simplistic, reductive, or misguided in one context might fit with scientific practice in another context. Brem and Rips (2000) found that people’s ability to reason about evidence was significantly better in information-rich than information-poor conditions. Participants in their study tended to prefer evidence, but the value they placed on explanations rose significantly when they believed information was scarce. Participants in rich information contexts referred to evidence twice as often as those in poor information contexts. At the same time, people engage in simplifying strategies that complicate their ability to evaluate evidence well. People appear to hold biases that compete with expert evidence-evaluation skills. While generalizable skills do apply across evidence-evaluation contexts, how those skills are enacted in a given situation depends largely on the nuances of the context. Researchers have written about this as the coordination of two problem spaces (e.g., Klahr and Dunbar 1988; Kuhn et al. 1988). What does all this suggest for research on understanding causality and reasoning about evidence? Further, what does it suggest for educational practice?

The developmental research community has responded to the difficulties of the problem space by carefully controlling task demands so that young children can reveal their emerging competencies. Broadly, this work reveals young children to be far more competent in their reasoning than earlier research suggested. These studies are useful as a suggestion of what aspects of more complex competencies children hold at given ages. They also suggest what might be possible in real-world contexts. However, these competencies don’t typically carry over to scientific reasoning tasks that feature authentic, ill-structured, and compound forms of complexity within a given problem context (for a review, see Grotzer 2003). The response of the educational research community has been to study more-authentic context-rich tasks. Research investigations that reflectively build our knowledge of evidence evaluation skills through a combination of context-lean and context-rich tasks and specifically analyze them through that lens are likely to offer the deepest insights into how people reason about scientific evidence.

EFFORTS IN K–12 EDUCATION AND RESEARCH TO IMPROVE SCIENTIFIC EVIDENTIAL REASONING

Evidence evaluation holds a clear place in the K–12 science standards. The standards call for the ability to reason about evidence in relation to scientific explanation. For instance, they state that students should “use evidence to generate explanations, propose alternative explanations, and critique explanations and procedures” (National Research Council 1995). Further, students should “develop descriptions, explanations, predictions, and models using evidence” (National Research Council 1995). The standards place a particular focus on the relationship between evidence and explanations by requiring that students be able to “think critically and logically to make the relationships between evidence and explanations” and be able to account for anomalous data. The standards also recognize some of the puzzles involved in teaching children to reason well about evidence. For instance, they call for the ability to “recognize and analyze alternative explanations and predictions” (National Research Council 1995). They acknowledge some of the difficulties that students have in evaluating evidence and attempt to alert teachers to these difficulties. For instance, “teachers of science for middle-school students should note that students tend to center on evidence that confirms their current beliefs and concepts (i.e., personal explanations), and ignore or fail to perceive evidence that does not agree with their current concepts” (National Research Council 1995). Of course, what the standards call for and what happens in K–12 classrooms are two different pictures. The following paragraphs consider some of the promising directions that science education research has taken and some of the difficulties of enacting these innovations in the classroom.

Inquiry-Based Science

The focus in the standards is largely on the evaluation of evidence in the context of scientific inquiry—as opposed to evaluating evidence in the process of assessing research in the popular press and elsewhere. For instance, the standards focus on “the ability to conduct inquiry and develop understanding about scientific inquiry, including asking questions, planning and conducting investigations, using appropriate tools and techniques to gather data, thinking critically and logically about relationships between evidence and explanations, constructing and analyzing alternative explanations, and communicating scientific arguments” (National Research Council 1995).

Increasingly, inquiry-based science approaches, in which students pursue questions through experimentation, are finding their way into K–12 classrooms. While inquiry-based instruction is typically hands-on, the two approaches are not synonymous, with “hands-on” including teacher-posed activities and recipe-like lab experiences that are not necessarily “minds-on” too (Driver et al. 1985). The move toward inquiry-based science recognizes the important epistemological knowledge that students gain in such contexts, the increased likelihood of transfer, and the ability to deal with ill-structured real-world problems. Inquiry-based science also offers students the opportunity to learn scientific reasoning—how nuanced scientific reasoning can be and what aspects of scientific reasoning generalize across different contexts (Kuhn et al. 1992).

What do we know about the impact of inquiry-based instruction? Students have difficulty formulating research questions and plans for investigating them (Krajcik et al. 1998). Zimmerman (2000) summarized fourteen self-directed experimentation studies and found the following patterns: In general, third to sixth graders often generated uninformative experiments; made judgments that were based on inconclusive or insufficient data while ignoring inconsistent data and disregarding surprising results; attended to causal factors but not noncausal factors; were unsystematic in recording data, outcomes, and plans; and were influenced by and had difficulty disconfirming prior beliefs.

However, this research also suggests some promise. Students did learn new strategies, but these appear to live side by side with old ones, not necessarily replacing them. Masnick and Klahr (2003) found that second- and fourth-grade children could both propose and recognize potential sources of error before they could design unconfounded experiments. They used evidence to guide their reasoning, making predictions and drawing conclusions based on the design of their experiments, and they were sensitive to the context of reasoning: they differentiated the role of error in relative and absolute measurements. Masnick and Klahr (2003) argued that long before children have acquired the formal procedures necessary to control error, they have a surprisingly rich—albeit unsystematic—understanding of its various sources.

Further, the research on how good learners and bad learners think about evidence suggests specific ways that we might help all learners to learn better. Schauble et al. (1992) found that good learners generate and state many more alternative hypotheses, their experiments are more controlled, and they do a more extensive search of the problem space. They are more systematic—recording results and goal-oriented plans. When students hold more-sophisticated conceptual models, they are more likely to try to make sense of disconfirming or surprising evidence (e.g., Dunbar 1993).

Helping students engage in authentic, inquiry-based science gives them the opportunity to learn scientific reasoning and also increases the knowledge demands on teachers in numerous ways. To be effective, teachers need nuanced epistemological knowledge, a considerably greater content knowledge basis upon which to respond to puzzles and questions that arise, and much more sophisticated pedagogical knowledge (to name but a few of their needs). What students end up learning may indeed be more valuable, but what is learned and the quality of learning are typically far more divergent than with traditional, didactic teaching approaches, and content costs may accrue if the teacher is not highly skilled in figuring out how to weave content exploration into the inquiry context.

The Epistemology of Science

Engaging in effective inquiry-based science practices is more than knowing how to carry out experiments, however. Sandoval and Reiser (2004) have argued that students need to understand the epistemological commitments that scientists make—the processes they value for generating and validating knowledge. They call for foregrounding these commitments in the context of inquiry-based approaches. This goes beyond helping students learn to “think like scientists.” The aim here is to help students learn the nature of science and the epistemological underpinnings of the discipline. The national science standards also address the importance of knowledge of the epistemology of science. They call for an understanding of the ways of knowing and finding out in the discipline; for instance, “scientists develop explanations using observations (evidence) and what they already know about the world (scientific knowledge). Good explanations are based on evidence from investigations.” And “scientific explanations emphasize evidence, have logically consistent arguments, and use scientific principles, models, and theories. The scientific community accepts and uses such explanations until displaced by better scientific ones. When such displacement occurs, science advances” (National Research Council 1995).

Research suggests that people's ability to reason about evidence may be limited by their epistemological development (Lederman et al. 2002; Sandoval 2003, 2005) and that students with more epistemological knowledge generally perform better in science (e.g., Linn and Songer 1993). This type of knowledge provides the broad context for everything else learned in science and puts students in a better position to join scientific communities.

But understanding the epistemology of science isn’t just for future scientists. Making sense of the wealth of scientific data around us and trusting the scientific enterprise as a whole requires an understanding of how that knowledge is generated. The popular view of science often differs from the view scientists hold and leads to a different set of expectations (Chalmers 1999). For example, scientists realize that conclusions are always tentative and that science is not a steady accumulation of facts: when evidence no longer fits a prevailing model, we trade up for a more explanatory model in a Kuhnian paradigm shift (Kuhn 1962). What counts as evidence also can change as part of a paradigm shift. However, if the general population expects scientific knowledge simply to accumulate, trading up looks like scientists changing their minds, and, if that is part of science, why would you place your trust in scientific evidence and the conclusions scientists draw from it? Scientists need to consider carefully how to represent and communicate uncertainty (Zehr 1999), and the general population needs to understand what that uncertainty means in the context of a scientific framework.

In practice, helping students learn the epistemology of science holds a number of challenges: 1) understanding the tacit assumptions that scientists make; 2) preparing teachers to understand those assumptions; and 3) finding ways to make epistemological assumptions accessible at the right level for students without reducing them to a stereotyped set of steps.

Unpacking the tacit assumptions that scientists actually make has been an important line of research (e.g., Dunbar 1995). Scientific discovery and reasoning are highly nuanced and opportunistic endeavors. In telling the story of the discovery of binary pulsars, for which he eventually won a Nobel Prize, Russell Hulse (2003) recounts the process leading up to the discovery. He describes the data collection and careful note taking on the behavior of pulsars and the “error” in his data that emerged first as an annoyance, then, after further reflection, as an interesting aberration, and, ultimately, as a pattern describing a previously unidentified phenomenon. His story underscores the nuanced attention to certain kinds of patterns, and the reasoning processes that follow from them, that characterizes scientific reasoning—nuance that can be difficult to unpack and capture in describing the scientific process.

In most cases, K–12 teachers do not hold the epistemological assumptions of a scientist. This is particularly an issue for K–5 teachers, who may have little or no science background yet are expected to teach all the disciplines, understand developmental issues, and work with students who have different abilities, disabilities, and skill levels or who may not speak English as their primary language, and so on. However, epistemology can also be an issue for middle and high school teachers who are well versed in content but unfamiliar with how knowledge is generated in the sciences. Even if they once knew, the predominant modes of inquiry shift and change over time with new tools. For instance, in genetics, computers now make it possible to search huge numbers of gene sequences at random for a match rather than using a theory-driven approach to eliminate certain sequences first. Teachers need these understandings if they are to help students learn them.

Finally, making nuanced assumptions visible and accessible without reducing them to a set of stereotyped steps is difficult. Scientific inquiry is often taught in K–12 classrooms as a set of steps to follow; “the scientific method,” for example, is often taught this way. In the first step, students are told to “get a hypothesis”—leaving out the entire idea-generation phase of scientific discovery! Even lessons designed around inquiry may engage students in the activity with or without explicit reflection on the processes involved. These stereotyped versions of science shape what the population expects of science as a field. Yet, as Bauer (1992) has argued, what the general population believes about the nature of scientific thinking includes many popular “shoulds” that would actually impede scientific progress. For instance, if scientists published all their data, the community would be mired in unsound or misleading data along with that which might be constructive.

One approach is to teach about the nature of science by exploring historical and current-day examples of scientists and scientific reasoning. While this has the benefit of inviting students to consider the nuances of the field, such an approach would need to pay careful attention to how the cases are presented. As with the Folkman case, it’s easy to skew students’ sense of the discipline by sharing only examples that stand out for one reason or another, such as those that warrant historical recognition. Revolutionary science that cuts across disciplinary boundaries or that shifts the current paradigm is far less common than the everyday science that involves solving smaller-scale puzzles and slogging through data.

Despite the challenges, understanding epistemology is clearly a key to developing a population that knows how to think about scientific evidence. Students can learn to think about epistemological issues (Smith et al. 2000), and explicit reflection on epistemology results in more-informed views of the nature of science (Khishfe and Abd-El-Khalick 2002). A promising direction in science education research investigates students’ ability to gain from the use of technology-based, epistemic tools that scaffold their framing of the epistemology (e.g., Bell and Linn 2000; Sandoval and Reiser 2004; Scardamalia and Bereiter 2004). In classrooms using these resources, students have, for instance, demonstrated the ability to negotiate the terms of explanations, engage in planful investigation (Schauble et al. 1991), and evaluate whether evidence fits with their explanations or not (Sandoval and Reiser 2004).

Science as Argumentation

The science education research community has shown growing interest in argumentation as a central scientific practice that students should learn (e.g., Driver et al. 2000; Kuhn 1993; Sandoval and Millwood 2005). The mere presentation of contradictory evidence is not enough to get students to change their minds (Chinn and Brewer 1998). Driver et al. (2000) argue that the practices of science teaching need to be reconceptualized so as to portray scientific knowledge as socially constructed—emphasizing the role of argumentation in science.

The standards outline a clear role for scientific discourse and argumentation in science classrooms. They call for teachers to “structure and facilitate ongoing formal and informal discussion based on a shared understanding of the rules of scientific discourse” (National Research Council 1995) and for students to develop the

ability to engage in the presentation of evidence, reasoned argument, and explanation comes from practice. Teachers encourage informal discussion and structure science activities so that students are required to explain and justify their understanding, argue from data and defend their conclusions, and critically assess and challenge the scientific explanations of one another. (National Research Council 1995)

How argumentation is carried out matters. Debate about why certain explanations are better than others appears to be a critical component in developing epistemic criteria (Rosebery et al. 1992; Sandoval and Reiser 2004). Hogan (1999) found that students who engaged in lessons designed to encourage “metacognitive, regulatory, and strategic aspects of knowledge co-construction” were subsequently better able to articulate their collaborative reasoning processes than students in control classrooms. Further, when one student uses an evidence evaluation strategy in a discussion, the strategy is more likely to be used by his or her classmates (Anderson et al. 2001; Pluta and Chinn 2007). Unfortunately, most teachers do not provide many opportunities for group or class discussion—expressing uncertainty about how to support such discussions, they opt instead to lecture (Newton et al. 1999). Student discussion, when it does occur, tends to focus on procedural aspects of the practical work rather than on the actual science.

Modeling

The epistemology of science involves thinking about the explanatory power of a model in light of the available evidence (e.g., Giere 1988; Hestenes 1992). Scientific knowledge is generated by discarding models that no longer fit the evidence and then trading up for more powerful models. Consider how different this is from typical “school science” where students are taught a “right answer” and are not always taught the rationale for that explanation.

Models are a natural extension of a classroom discourse that involves argumentation and teaching of the epistemology of science. Models are debated and defended and in this way render students’ thinking visible (Lehrer and Schauble 2006) to the person espousing the model, to other students, and to the teacher. The models become an artifact of the sociocultural process in the classroom. Others have argued that models assist in the transfer of learning (Clement 2000) and in conceptual change (Gobert 2000).

How students think about models bears on how they use them to reason about evidence. Harrison and Treagust (2000) argue that most students believe in a one-to-one correspondence between a model and reality; therefore, students need to learn that all models fit in some ways but not in others. Perception of models appears to be rooted in students’ understanding of the epistemology of science. Chittleborough and colleagues (2005) found that many students do have a good understanding of the role of models in the process of science and appreciate the multiple, representational, yet changing nature of models. While some students have a fascination with true facts and single, correct models, others exhibit more sophisticated epistemologies of science. Chittleborough and colleagues also found that these understandings improve with learning opportunities. Thinking about models in science (as a learner or a scientist) demands a flexible commitment to the model that one holds: one needs to be able to view the model as a tentative explanation until a more fully explanatory one comes along. This can be hard for students, who often hold robust, alternative conceptions of many science concepts (e.g., Driver et al. 1985). The research on confirmation bias underscores the fact that people do not naturally consider rival models (Driver et al. 1996; Grosslight et al. 1991). Generating rival models from the outset is one way to encourage flexible commitment and deep consideration of the model in light of the available evidence (Grotzer 2002), and it promises to lead to better evaluation of evidence.
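The notion of flexible commitment, holding a model tentatively and weighing rivals against the available evidence, can be illustrated with a toy computation. The sketch below is a hypothetical Bayesian update, not drawn from any study cited in this chapter; the two rival models and all probabilities are invented for illustration.

```python
# Toy illustration (hypothetical): weighing two rival models against
# accumulating evidence via Bayes' rule. Model A predicts an event with
# probability 0.8; rival model B predicts it with probability 0.3.
p_event = {"A": 0.8, "B": 0.3}
prior = {"A": 0.5, "B": 0.5}  # flexible commitment: neither model favored at the outset

observations = [True, True, False, True, True]  # did the event occur on each trial?

posterior = dict(prior)
for obs in observations:
    # Likelihood of this observation under each model.
    likelihood = {m: (p if obs else 1 - p) for m, p in p_event.items()}
    norm = sum(posterior[m] * likelihood[m] for m in posterior)
    # Renormalized posterior: support shifts toward the better-fitting model.
    posterior = {m: posterior[m] * likelihood[m] / norm for m in posterior}

print(posterior)  # model A explains four occurrences in five trials far better
```

Note that even the favored model ends with a posterior short of 1, mirroring the tentativeness of scientific explanations discussed above.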

Infusing a Focus on Causality in Science Learning

When reasoning about complex phenomena, students rely on a series of default assumptions that often distort the nature of the causality involved. For instance, students often give linear or narrative explanations that are story-like: “first this happened, then it made that happen, and so on.” Such explanations have a domino-like quality to them that actually is absent from many science concepts. Concepts such as symbiosis, pressure or density differentials, and electrical circuits are distinctly nonlinear in form. They involve mutual, relational, or cyclic patterns (Grotzer 2003). Concepts might appear straightforward but reveal complexity as soon as one dives below the surface. In addition to nonlinear patterns, they may include nonobvious causes; time delays and spatial gaps between causes and effects; distributed, nonintentional agency; and probabilistic causation where the level of correspondence between causes and effects varies. Abrahamson and Wilensky (2005) found that many of the heuristics necessary for reasoning about complex systems run counter to those involved in reasoning about the linear systems with which students appear to be more familiar.
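Probabilistic causation, one of the harder causal patterns named above, can be made concrete with a small simulation. The sketch below is purely illustrative; the effect rates are invented and do not model any phenomenon from this chapter.

```python
import random

# Hypothetical sketch of probabilistic causation: the cause raises the
# probability of the effect without guaranteeing it, so the correspondence
# between cause and effect varies from instance to instance.
random.seed(0)

def trial(cause_present: bool) -> bool:
    # Invented rates: effect occurs 70% of the time with the cause, 10% without.
    return random.random() < (0.7 if cause_present else 0.1)

n = 10_000
with_cause = sum(trial(True) for _ in range(n)) / n
without_cause = sum(trial(False) for _ in range(n)) / n

# The cause clearly matters on average, yet any single trial can show a
# cause without the effect, or the effect without the cause.
print(with_cause, without_cause)
```

This is exactly the pattern students find counterintuitive: no single observation settles whether the cause "worked," only the aggregate does.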

Causal default assumptions impact what evidence people attend to and the salience that they attach to it. For example, when spatial gaps and time delays are present, people are less likely to notice relevant evidence. Instances of distributed causal agency, where many people acting on one level contribute to an emergent outcome on another level, are difficult to analyze. For example, in the case of global warming, recognizing our individual contributions to the problem (and thus what we might do about them) is extremely difficult because evidence of the problem is available only at the level of the collective outcome.

Increasingly, science education research is considering how to help students learn to think about causality in more complex ways. Students need explicit opportunities to reflect on their default patterns, learn the new causal patterns, and observe how the latter do a better job of explaining the phenomenon at hand. Offering rich opportunities to learn complex causal concepts, where students have the opportunity to discover different phenomena, has shown some promise. Wilensky and Resnick (e.g., Resnick 1996; Wilensky 1998; Wilensky and Resnick 1999) have used multiagent modeling in numerous studies on the concept of emergence. They have demonstrated that constructionist opportunities to work with dynamic, object-based models that reveal complex causal concepts result in new insights into the nature of complex phenomena such as gas laws and the behavior of slime molds. Chiu and colleagues (2002) found that having mentors unpack their thinking about complex problems just beyond the independent ability of the students made a significant difference, compared to a control group, in the students’ understanding of simultaneity and randomness as these relate to chemical equilibrium. Grotzer and colleagues (e.g., Grotzer and Basca 2003; Grotzer 2000; Perkins and Grotzer 2005) contrasted three pedagogical conditions across a number of science concepts (electricity, ecosystems, air pressure, and density) that engaged students in thinking about simultaneity, multiple causes and effects, nonobvious causes, nonlinearity, and outcomes due to balance or imbalance between multiple variables. They found that a combination of activities designed to reveal the underlying causal concepts and explicit discussion about the nature of causality (what is hard to grasp about it, how particular causal patterns differ from others) led to deeper understanding. These students significantly outperformed students who participated only in the causally focused activities, as well as students who did not participate in the causally focused activities or discussion but did have “best practices” science units that included extensive model building by students, evaluation of evidence, Socratic discussion, dynamic computer models, and attention to students’ evolving models. The differences were especially dramatic for science concepts where the causality was least linear, sequential, and direct.
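The flavor of multiagent modeling can be conveyed with a minimal sketch. The code below is not Wilensky and Resnick's software; it is a hypothetical illustration of emergence in which a simple local rule, followed independently by every agent, produces a reliable aggregate pattern that no individual agent exhibits or intends.

```python
import random

# Minimal agent-based sketch of emergence (hypothetical, for illustration):
# each agent random-walks on a line; no agent "intends" to spread out.
random.seed(1)

def run(num_agents: int, steps: int) -> float:
    positions = [0] * num_agents
    for _ in range(steps):
        # Local rule: each agent independently steps left or right.
        positions = [p + random.choice((-1, 1)) for p in positions]
    # Emergent, collective-level outcome: the mean squared displacement
    # grows in proportion to the number of steps (diffusion).
    return sum(p * p for p in positions) / num_agents

msd = run(num_agents=5000, steps=100)
print(msd)  # statistically close to 100, though no single walk predicts it
```

Here the diffusive spread is a property of the collective, much like the emergent outcomes (gas laws, slime mold behavior) described above, and it only becomes visible at the level of the aggregate.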

Beyond Teaching Empiricism

Osborne (2002) has argued that we need to move beyond a focus on empiricism when thinking about how students analyze evidence. A majority of the public interacts with science text in popular accounts or journalistic versions. Korpan and colleagues (1997) have argued that media reports of scientific research are a pervasive and important source of new scientific knowledge and that the ability to evaluate conclusions found in those reports is an important form of scientific literacy.

Korpan and colleagues (1997) found that students generated a range of requests for information, focusing most often on how the research was conducted and why the results might have occurred. They made fewer requests for other types of information, such as what was found, who conducted the research, where it was conducted, or whether related research had been conducted. They seemed most influenced by the plausibility of the conclusions, the typicality of the phenomena, and their own personal familiarity with the phenomena. In a series of studies conducted with his graduate students, Chinn (Buckland and Chinn 2007; DiFranco and Chinn 2007; Hung and Chinn 2007; Pluta and Chinn 2007) has also explored how students make sense of evidence in published studies and how they coordinate data across studies. Pluta and Chinn (2007) considered how seventh graders resolved conflicting pairs of news stories about scientific cases (deformed frogs, dinosaur metabolism, aspartame, etc.). The stories were similar to published newspaper accounts but placed slightly greater emphasis on methodology. Students primarily reasoned at the level of explanations. However, they often had difficulty connecting the study details to a specific explanation. They ignored study details that might have helped them integrate the evidence with their explanation, as well as details that did not fit with their initial ideas. Direct replication failure was genuinely baffling to students. While few individual students were able to use the details of a study to coordinate the results, the groups of students collectively revealed a wide range of strategies.

Ninety percent of students used more than one strategy (such as methodological differences, bias, chance, etc.) to account for the conflicting scientific interpretations within a given problem and 70 percent of students used more than one strategy across different problem types. This is promising, particularly in classrooms where students engage in discussion about evidence. Pluta and Chinn (2007) found that certain forms of reasoning were less common, such as reasoning about bias or chance. These concepts warrant additional attention to help students understand and apply them. In general, these studies stress the need for authentic practice in helping students learn how to apply strategies in particular instances.

SUGGESTIONS FOR NEXT STEPS AND NEW DIRECTIONS TO IMPROVE SCIENTIFIC EVIDENTIAL REASONING

Scientific evidential reasoning is a challenging yet important problem space. We have much to build on in taking next steps, and some clear needs are present that suggest directions for our collective efforts in improving the scientific literacy of the general population through education.

A fairly extensive and informative research literature exists; it raises many puzzles but also offers solid information about approaches that are most promising. We have the most information about how students reason in the context of inquiry-based learning. The existing standards offer a strong basis from which to work. They draw solidly from the research base and make useful recommendations to classroom teachers. The focus, however, is largely on experimentation. Standards that include analyzing research evidence from published sources would promote greater scientific literacy even among those who do not plan to become scientists.

Far less research has looked at how students of different ages reason about published scientific reports of real-world studies (Chinn and Malhotra 2002). Chinn and colleagues (e.g., Hung and Chinn 2007; Pluta and Chinn 2007) have taken some important first steps here. More research focused on interpreting findings is needed. Continued exploration of the barriers to understanding is also needed so that we can find approaches that offer the most leverage in solving the problem. For instance, work focused on helping students examine their causal default assumptions may systematically impact how they generate explanations. Similarly, helping students understand the epistemology of science helps them view science differently as an endeavor and realize that it is more than just learning facts.

A better bridge between research and practice is needed. While a fair amount is known about evidence evaluation skills and the epistemological knowledge that supports them, these findings have been slow to make their way into mainstream practice. This lack of an effective bridge is a perennial problem in applying research findings. If future investigations were more closely situated at the intersection of practice and basic research on learning—in “Pasteur’s Quadrant” (Stokes 1997)—a bridge might no longer be necessary. We need accessible pedagogies that build upon the research results. The kind of work that Sandoval and colleagues (e.g., Sandoval and Reiser 2004) are engaged in—designing computer programs that teach the epistemology of science—holds promise here.

Probably the biggest hurdle, however, has to do with supporting teachers. To assume that teachers will be able to teach the epistemology of science in all its nuances without a considerable investment in professional development and instructional materials is irrationally optimistic. The type of nuanced understanding called for in the science education research comes from a deep understanding of scientific inquiry and the expertise associated with having opportunities to engage in science. This stands in sharp contrast to the comfort level that many teachers, especially at the elementary level, have with science and the high rate of teacher turnover, particularly in urban schools.

What is obvious, even from this brief overview, is how challenging the problem space of scientific reasoning and education is. The payoff from engaging with it, however, is substantial: a scientifically literate population of critical consumers who can reason effectively about the abundance of available scientific information, and a generation of scientists capable of understanding a complex, dynamic world. Substantial resources already exist to support these efforts. The imperative is clear.

ACKNOWLEDGMENTS

The author would like to acknowledge the contributions of Megan Powell and Rebecca Miller in assisting with the literature review in this paper.

REFERENCES

Abrahamson, D., and U. Wilensky. 2005. The stratified learning zone: Examining collaborative-learning design in demographically-diverse mathematics classrooms. Paper presented at the annual meeting of the American Educational Research Association, Montreal, Canada.

Anderson, R. C., K. Nguyen-Jahiel, B. McNurlen, A. Archodidou, S. Kim, A. Reznitskaya, M. Tillmanns, and L. Gilbert. 2001. The snowball phenomenon: Spread of ways of talking and ways of thinking across groups of children. Cognition and Instruction 19:1–46.

Baillargeon, R., R. Gelman, and E. Meck. 1981. Are preschoolers truly indifferent to causal mechanism? Paper presented at the biennial meeting of the Society for Research in Child Development, Boston, MA.

Bauer, H. 1992. Scientific Literacy and the Myth of the Scientific Method. Urbana, IL: University of Illinois Press.

Bell, P., and M. Linn. 2000. Scientific argumentation as learning artifacts: Designing for learning from the web with KIE. International Journal of Science Education 22(8):797–817.

Borton, R. W. 1979. The perception of causality in infants. Paper presented at the biennial meeting of the Society of Research in Child Development, San Francisco, CA.

Bracey, G. W. 1998. Tinkering with TIMSS. Phi Delta Kappan 80(1):32–36.

Brem, S. K., and L. J. Rips. 2000. Explanation and evidence in informal argument. Cognitive Science 24:573–604.

Buckland, L., and C. A. Chinn. 2007. Integrating evidence and models at the 7th grade: A preliminary investigation. Paper presented at the American Educational Research Association Conference, Chicago, IL.

Bullock, M. 1979. Aspects of the young child’s theory of causation. Ph.D. diss., University of Pennsylvania.

Bullock, M. 1984. Preschool children’s understanding of causal connections. British Journal of Developmental Psychology 2:139–148.

Bullock, M. 1985. Causal reasoning and developmental change over the preschool years. Human Development 28:169–191.

Bullock, M., and R. Gelman. 1979. Preschool children’s assumptions about cause and effect: Temporal ordering. Child Development 50:89–96.

Chalmers, A. F. 1999. What is This Thing Called Science? 3rd ed. Indianapolis: Hackett Publishing Co.

Chinn, C. A., and W. F. Brewer. 1998. An empirical test of a taxonomy of responses to anomalous data in science. Journal of Research in Science Teaching 35:623–654.

Chinn, C. A., and B. A. Malhotra. 2002. Epistemologically authentic reasoning in schools: A theoretical framework for evaluating inquiry tasks. Science Education 86:175–218.

Chittleborough, G. D., D. F. Treagust, and T. L. Mamiala. 2005. Students’ perceptions of the role of models in the process of science and in the process of learning. Research in Science and Technological Education 23(2):195–212.

Chiu, M., C. Chou, and C. Liu. 2002. Dynamic processes of conceptual change: Analysis of constructing mental models of chemical equilibrium. Journal of Research in Science Teaching 39(8):688–712.

Clement, J. 2000. Model-based reasoning as a key research area for science education. International Journal of Science Education 22(9):937–977.

Corrigan, R. 1995. How infants and young children understand the causes of events. In Social development: Review of Personality and Social Psychology, Vol. 15, ed. N. Eisenberg. Thousand Oaks, CA: Sage Publications, Inc.

DiFranco, J., and C. A. Chinn. 2007. Reasoning about disparate data. Paper presented at the American Educational Research Association Conference, Chicago, IL.

Dorner, D. 1989. The Logic of Failure. New York: Metropolitan Books.

Driver, R., E. Guesne, and A. Tiberghien, eds. 1985. Children’s Ideas in Science. Philadelphia: Open University Press.

Driver, R., J. Leach, R. Millar, and P. Scott. 1996. Young People’s Images of Science. Buckingham, UK: Open University Press.

Driver, R., P. Newton, and J. Osborne. 2000. Establishing the norms of scientific argumentation in classrooms. Science Education 84:287–312.

Dunbar, K. 1993. Concept discovery in a scientific domain. Cognitive Science 17:397–434.

Dunbar, K. 1995. How scientists really reason: Scientific reasoning in real-world laboratories. In Mechanisms of Insight, ed. R. J. Sternberg and J. Davidson, 365–395. Cambridge, MA: MIT Press.

Feltovich, P. J., R. J. Spiro, and R. L. Coulson. 1993. Learning, teaching, and testing for complex conceptual understanding. In Test Theory for a New Generation of Tests, ed. N. Frederiksen and I. Bejar, 181–217. Hillsdale, NJ: LEA.

Ferrari, M., and M. T. C. Chi. 1998. The nature of naïve explanations of natural selection. International Journal of Science Education 20:1231–1256.

Giere, R. N. 1988. Explaining Science: A Cognitive Approach. Chicago: University of Chicago Press.

Gobert, J. D. 2000. A typology of causal models for plate tectonics: Inferential power and barriers to understanding. International Journal of Science Education 22(9):937–977.

Gopnik, A., C. Glymour, D. M. Sobel, L. E. Schulz, T. Kushnir, and D. Danks. 2004. A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review 111:1–31.

Grosslight, L., C. Unger, E. Jay, and C. Smith. 1991. Understanding models and their use in science: Conceptions of middle and high school students and experts. Journal of Research in Science Teaching 28:799–822.

Grotzer, T. A. 2000. How conceptual leaps in understanding the nature of causality can limit learning: An example from electrical circuits. Paper presented at the American Educational Research Association Conference, New Orleans, LA.

Grotzer, T. A. 2002. Causal Patterns in Ecosystems: Lessons to Infuse into Ecosystems Units. Cambridge, MA: Project Zero, Harvard Graduate School of Education.

Grotzer, T. A. 2003. Learning to understand the forms of causality implicit in scientific explanations. Studies in Science Education 39:1–74.

Grotzer, T. A. 2004. Putting science within reach: Addressing patterns of thinking that limit science learning. Principal Leadership. October.

Grotzer, T. A., and B. B. Basca. 2003. How does grasping the underlying causal structures of ecosystems impact students’ understanding? Journal of Biological Education 38(1):16–29.

Harrison, A. G., and D. F. Treagust. 2000. Typology of school science models. International Journal of Science Education 22(9):1011–1026.

Hestenes, D. 1992. Modeling games in the Newtonian world. American Journal of Physics 60(8):732–748.

Hogan, K. 1999. Thinking aloud together: A test of an intervention to foster students’ collaborative scientific reasoning. Journal of Research in Science Teaching 36(10):1085–1109.

Hulse, R. 2003. Keynote Speech: Research on Learning and Education (R.O.L.E.) Principal Investigators’ and Contractors’ Meeting, National Science Foundation, October 27–28, 2003, Crystal City, VA.

Hung, C. C., and C. A. Chinn. 2007. Learning to reason about the methodology of scientific studies: A classroom experiment in the middle school. Paper presented at the American Educational Research Association Conference, Chicago, IL.

Jasanoff, S. 1997. Civilisation and madness: The great BSE scare of 1996. Public Understanding of Science 6:221–232.

Kalish, C. 1997. Preschooler’s understanding of mental and bodily reactions to contamination: What you don’t know can hurt you, but cannot sadden you. Developmental Psychology 33(1):79–91.

Karmiloff-Smith, A. 1984. Children’s problem solving. In Advances in Developmental Psychology, vol. 3, ed. M. E. Lamb, A. L. Brown, and B. Rogoff, 39–90. Hillsdale, NJ: Lawrence Erlbaum Associates.

Kelley, H. H. 1973. The processes of causal attribution. American Psychologist 28(2):107–128.

Khishfe, R., and F. Abd-El-Khalick. 2002. Influence of explicit and reflective versus implicit inquiryoriented instruction on sixth graders’ views of nature of science. Journal of Research in Science Teaching 39:551–578.

Klahr, D., and K. Dunbar. 1988. Dual space search during scientific reasoning. Cognitive Science 12:1–48.

Klahr, D., A. L. Fay, and K. Dunbar. 1993. Heuristics for scientific experimentation: A developmental study. Cognitive Psychology 25:111–146.

Korpan, C. A., G. L. Bisanz, and J. Bisanz. 1997. Assessing literacy in science: Evaluation of scientific news briefs. Science Education 81(5):515–532.

Koslowski, B. 1976. Learning About an Instance of Causation. Unpublished manuscript, Cornell University.

Koslowski, B. 1996. Theory and Evidence. Cambridge, MA: MIT Press.

Koslowski, B., and A. Snipper. 1977. Learning about an instance of nonmechanical causality. Unpublished manuscript, Cornell University.

Krajcik, J., P. Blumenfeld, R. W. Marx, K. H. Bass, and J. Fredericks. 1998. Inquiry in project-based science classrooms: Initial attempts by middle school students. Journal of the Learning Sciences 7(3/4):313–350.

Kuhn, D. 1993. Science as argument: Implications for teaching and learning scientific thinking. Science Education 77(3):319–337.

Kuhn, D., E. Amsel, and M. O’Loughlin. 1988. The Development of Scientific Thinking Skills. Orlando: Academic Press.

Kuhn, D., L. Schauble, and M. Garcia-Mila. 1992. Cross-domain development of scientific reasoning. Cognition and Instruction 9(4):285–327.

Kuhn, T. S. 1962. The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Lederman, N. G., F. Abd-El-Khalick, R. L. Bell, and R. S. Schwartz. 2002. Views of nature of science questionnaire: Toward valid and meaningful assessment of learners’ conceptions of nature of science. Journal of Research in Science Teaching 39(6):497–521.

Lehrer, R., and L. Schauble. 2006. Cultivating model-based reasoning in science education. In Cambridge Handbook of the Learning Sciences, ed. K. Sawyer, 371–388. New York: Cambridge University Press.

Leslie, A. M. 1982. The perception of causality in infants. Perception 11:173–186.

Leslie, A. M. 1984. Spatiotemporal continuity and the perception of causality in infants. Perception 13:287–305.

Leslie, A. M., and S. Keeble. 1987. Do six-month-old infants perceive causality? Cognition 25:265–288.

Lesser, H. 1977. The growth of perceived causality in children. The Journal of Genetic Psychology 130:142–152.

Linn, M., and N. Songer. 1993. How do students make sense? Merrill Palmer Quarterly 39(1):47–73.

Mack, A., and I. Rock. 1998. Inattentional Blindness. Cambridge, MA: MIT Press.

Masnick, A. M., and D. Klahr. 2003. Error matters: An initial exploration of elementary school children’s understanding of experimental error. Journal of Cognition and Development 4(1):67–98.

Mendelson, R., and T. R. Shultz. 1976. Covariation and temporal contiguity as principles of causal inference in young children. Journal of Experimental Child Psychology 22:408–412.

Michotte, A. 1946/1963. The Perception of Causality, trans. T. R. Miles and E. Miles. New York: Basic Books.

National Research Council. 1995. National Science Education Standards. Washington, DC: National Academies Press. http://www.nap.edu/readingroom/books/nses/.

Newton, P., R. Driver, and J. Osborne. 1999. The place of argumentation in the pedagogy of school science. International Journal of Science Education 21(5):553–576.

Nickerson, R., D. Perkins, and E. Smith. 1985. The Teaching of Thinking. Hillsdale, NJ: LEA.

Nisbett, R., and L. Ross. 1980. Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice-Hall.

Oakes, L. M. 1993. The perception of causality by 7- and 10-month-old infants. Paper presented at the Meeting of the Society for Research in Child Development, New Orleans, LA.

Osborne, J. 2002. Science without literacy: A ship without a sail? Cambridge Journal of Education 32(2):203–218.

Perkins, D. N., and T. A. Grotzer. 2005. Dimensions of causal understanding: The role of complex causal models in students’ understanding of science. Studies in Science Education 41:117–165.

Petrosino, A. J., R. Lehrer, and L. Schauble. 2003. Structuring error and experimental variation as distribution in the fourth grade. Mathematical Thinking and Learning 5(2/3):131–156.

Piatelli-Palmarini, M. 1994. Inevitable Illusions: How Mistakes of Reason Rule Our Minds. New York: John Wiley and Sons, Inc.

Pluta, W. J., and C. A. Chinn. 2007. Making sense of conflicting studies: Can students build complex evidence-based models? Paper presented at the American Educational Research Association Conference, Chicago, IL.

Resnick, M. 1996. Beyond the centralized mindset. Journal of the Learning Sciences 5(1):1–22.

Rosebery, A. S., B. Warren, and F. R. Conant. 1992. Appropriating scientific discourse: Findings from language minority classrooms. Journal of the Learning Sciences 2(1):61–94.

Sandoval, W. A. 2003. Conceptual and epistemic aspects of students’ scientific explanations. Journal of the Learning Sciences 12(1):5–51.

Sandoval, W. A. 2005. Understanding students’ practical epistemologies and their influence on learning through inquiry. Science Education 89(4):634–656.

Sandoval, W. A., and K. A. Millwood. 2005. The quality of students’ use of evidence in written scientific explanations. Cognition and Instruction 23(1):23–55.

Sandoval, W. A., and B. J. Reiser. 2004. Explanation-driven inquiry: Integrating conceptual and epistemic scaffolds for scientific inquiry. Science Education 88:345–372.

Scardamalia, M., and C. Bereiter. 1994. Computer support for knowledge building communities. The Journal of the Learning Sciences: Special Issue: Computer Support for Collaborative Learning 3(3):265–283.

Schauble, L., R. Glaser, K. Raghavan, and M. Reiner. 1991. Causal models and experimentation strategies in scientific reasoning. The Journal of the Learning Sciences 1(2):201–238.

Schauble, L., R. Glaser, K. Raghavan, and M. Reiner. 1992. The integration of knowledge and experimentation strategies in understanding a physical system. Applied Cognitive Psychology 6:321–343.

Schunn, C. D., and D. Klahr. 1993. Self vs. other generated hypotheses in scientific discovery. Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, 1–7.

Shaklee, H., and D. Goldston. 1989. Development in causal reasoning: Information sampling and judgment rule. Cognitive Development 4:269–281.

Shaklee, H., and M. Mims. 1981. Development of rule use in judgments of covariation between events. Child Development 52:317–325.

Shultz, T. R., and N. R. Kestenbaum. 1985. Causal reasoning in children. Annals of Child Development 2:195–249.

Shultz, T. R., and R. Mendelson. 1975. The use of covariation as a principle of causal analysis. Child Development 46:394–399.

Siegler, R., and R. Liebert. 1974. Effects of contiguity, regularity, and age on children’s causal inferences. Developmental Psychology 10(4):574–579.

Smith, C. L., D. Maclin, C. Houghton, and M. G. Hennessey. 2000. Sixth-grade students’ epistemologies of science: The impact of school science experiences on epistemological development. Cognition and Instruction 18:349–422.

Spelke, E. S., A. Phillips, and A. L. Woodward. 1995. Infants’ knowledge of object motion and human action. In Causal Cognition: A Multidisciplinary Debate, ed. D. Sperber, D. Premack, and A. J. Premack, 44–78. Oxford, UK: Clarendon Press.

Stokes, D. E. 1997. Pasteur’s Quadrant: Basic Science and Technological Innovation. Washington, DC: Brookings Institution Press.

Sunstein, C. R. 2002. Risk and Reason: Safety, Law, and the Environment. Cambridge, UK: Cambridge University Press.

Tierney, J. 2007. Fateful voice of a generation still drowns out real science. New York Times. June 5.

Tversky, A., and D. Kahneman. 1982. Judgment under uncertainty: Heuristics and biases. In Judgment Under Uncertainty: Heuristics and Biases, ed. D. Kahneman, P. Slovic, and A. Tversky, 3–20. Cambridge, UK: Cambridge University Press.

Van de Walle, G., and E. S. Spelke. 1993. Integrating information over time: Infant perception of partly occluded objects. Presented at the biennial meeting of the Society for Research in Child Development, New Orleans.

Wilensky, U. 1998. GasLab: An extensible modeling toolkit for connecting micro- and macro-properties of gases. In Computer Modeling in Science and Mathematics Education, ed. N. Roberts, W. Feurzeig, and B. Hunter. Berlin: Springer-Verlag.

Wilensky, U., and M. Resnick. 1999. Thinking in levels: A dynamic systems approach to making sense of the world. Journal of Science Education and Technology 8(1):3–19.

Wynne, B. 1992. Misunderstood misunderstanding: Social identities and public uptake of science. Public Understanding of Science 1:281–304.

Zehr, S. 1999. Scientists’ representation of uncertainty. In Communicating Uncertainty: Media Coverage of New and Controversial Science, ed. S. M. Friedman, S. Dunwoody, and C. L. Rogers. Mahwah, NJ: LEA.

Zimmerman, C. 2000. The development of scientific reasoning skills. Developmental Review 20:99–149.