Cognition

Volume 199, June 2020, 104223

A test of two processes: The effect of training on deductive and inductive reasoning

https://doi.org/10.1016/j.cognition.2020.104223

Abstract

Dual-process theories posit that separate kinds of intuitive (Type 1) and reflective (Type 2) processes contribute to reasoning. Under this view, inductive judgments are more heavily influenced by Type 1 processing, and deductive judgments are more strongly influenced by Type 2 processing. Alternatively, single-process theories propose that both types of judgments are based on a common form of assessment. The competing accounts were respectively instantiated as two-dimensional and one-dimensional signal detection models, and their predictions were tested against specifically targeted novel data using signed difference analysis. In two experiments, participants evaluated valid and invalid arguments, under induction or deduction instructions. Arguments varied in believability and type of conditional argument structure. Additionally, we used logic training to strengthen Type 2 processing in deduction (Experiments 1 & 2) and belief training to strengthen Type 1 processing in induction (Experiment 2). The logic training successfully improved validity-discrimination, and differential effects on induction and deduction judgments were evident in Experiment 2. While such effects are consistent with popular dual-process accounts, crucially, a one-dimensional model successfully accounted for the results. We also demonstrate that the one-dimensional model is psychologically interpretable, with the model parameters varying sensibly across conditions. We argue that single-process accounts have been prematurely discounted, and formal modeling approaches are important for theoretical progress in the reasoning field.

Introduction

A widespread view is that there are two types of processes in high-level cognition (see Evans & Stanovich, 2013; Melnikoff & Bargh, 2018), as epitomized by the well-known Star Trek characters, Captain Kirk and Mr. Spock. Kirk reasons via gut-feelings and intuitions, while Spock generally applies cold analytical thinking and logic. For a given problem, it seems that people can reason either like Kirk or like Spock. In the lab, researchers have studied this using an argument evaluation task (e.g., Evans, Handley, Harper, & Johnson-Laird, 1999; Rips, 2001; Rotello & Heit, 2009). In this task, participants consider arguments such as:

If the US cuts fuel emissions then global warming will be reduced.
The US did not cut fuel emissions.
-----------------------------------------------------------------
Global warming was not reduced.   (1)

Some participants are given induction reasoning instructions, in which they are asked to judge whether the conclusion below the line is plausible based on the premises above the line. Others are given deduction reasoning instructions in which they judge whether the conclusion necessarily follows from the premises. For Argument (1), under induction instructions people may reason more like Kirk and use their prior beliefs about fuel emissions and global warming to decide that the conclusion is plausible. In contrast, under deduction instructions, if people correctly apply Spock-like logic, the conclusion would be deemed not necessarily true (the argument structure is denying the antecedent, which is logically invalid). Though these might appear to be different ways of drawing inferences or conclusions, a key question is whether they reflect the operation of qualitatively different cognitive processes.

Popular dual-process theories propose that there are distinct “Type 1” and “Type 2” processes in human reasoning, judgment and decision making. Such views have been highly influential, with programs based on these theories now advocated in education and assessment (Gillard, Van Dooren, Schaeken, & Verschaffel, 2009; Stanovich, 2016), medical diagnosis (Croskerry, Singhal, & Mamede, 2013), and managerial decision making (Dane & Pratt, 2007), and the concept is being taken up in industry to try to avoid reasoning errors (see Melnikoff & Bargh, 2018). Type 1 processing is generally assumed to be intuitive: It is autonomous, does not require working memory, tends to be fast, and tends to produce responses biased by background knowledge. In contrast, Type 2 processing is seen as reflective: It involves effortful hypothetical thinking, requires working memory, tends to be slow, and tends to produce normative responses (see Evans & Stanovich, 2013). Some theorists propose that the two kinds of processes operate in parallel (e.g., Handley & Trippas, 2015; Sloman, 1996, 2014), while others suggest that Type 1 processing generates intuitive default responses, which may or may not be altered by subsequent high-effort Type 2 processing (e.g., De Neys, 2012; Evans, 2007, 2008; Kahneman & Frederick, 2002). Regardless of the particular version that is preferred, according to dual-process theories, when people consider a reasoning problem such as Argument (1), they could access distinct assessments of argument strength based on Type 1 or Type 2 processes. It is often assumed that induction judgments are particularly dependent on Type 1 processes, while deduction judgments are more dependent on Type 2 processes (Evans, Handley, & Bacon, 2009; Evans & Over, 2013; Rotello & Heit, 2009; Singmann & Klauer, 2011; Verschueren, Schaeken, & d'Ydewalle, 2005).

In contrast, single-process theories propose that a common core process underlies responding in various reasoning, judgment and decision making tasks (cf. Keren, 2013; Keren & Schul, 2009; Kruglanski, 2013; Kruglanski & Gigerenzer, 2011; Osman, 2004, 2013). Under this view, both induction and deduction judgments for reasoning problems such as Argument (1) are based on a common assessment of subjective argument strength (Rips, 2001). One possibility is that this strength assessment may be produced by generating and testing mental models of the premises and conclusions (Johnson-Laird, 1994). Another is that it is based on the perceived conditional probability of the conclusion given the premises (Lassiter & Goodman, 2015; Oaksford & Chater, 2001, 2007).

Dual-process accounts are often framed as verbal models, and a key form of empirical support for them is the existence of functional dissociations (Evans, 2008; Evans & Stanovich, 2013) – including important demonstrations that particular factors affect induction judgments more than deduction judgments, or vice versa (for reviews see e.g., Heit, Rotello, & Hayes, 2012; Stephens, Dunn, & Hayes, 2018). In many studies demonstrating such dissociations, arguments are presented like those in Table 1, which vary in both logical validity and prior believability (based on background knowledge), and participants are asked to evaluate the arguments according to deduction or induction instructions. Factors such as consistency with background causal knowledge, argument length, and premise-conclusion similarity have a greater effect on induction judgments (e.g., Handley, Newstead, & Trippas, 2011; Heit & Rotello, 2010; Rips, 2001; Rotello & Heit, 2009; Singmann & Klauer, 2011), while argument validity, working memory load, and cognitive ability have a stronger impact on deduction judgments (e.g., Evans, Handley, Neilens, & Over, 2010; Heit & Rotello, 2010; Howarth, Handley, & Walsh, 2016; Rotello & Heit, 2009).

However, as many have argued (e.g., Dunn & Kirsner, 1988; Newell & Dunn, 2008; Prince, Brown, & Heathcote, 2012; Stephens et al., 2018; Stephens, Matzke, & Hayes, 2019), although such dissociations are consistent with dual-process accounts, they do not provide compelling evidence for the existence of more than one underlying mechanism for assessing arguments. For this reason, important tests of competing single-process and dual-process theories in reasoning have instead been based on the more rigorous logic of reversed association or state-trace analysis (Bamber, 1979; Dunn & Kalish, 2018; Dunn & Kirsner, 1988), or signed difference analysis (Dunn & Anderson, 2018; Dunn & James, 2003). State-trace analysis tests for evidence against simple single-process models that postulate one key underlying mechanism or latent variable, while signed difference analysis allows the testing of more detailed, formal single- and dual-process models.

In the sections that follow, we review evidence based on state-trace analysis of reasoning data and on formal reasoning models tested via signed difference analysis (i.e., Hayes, Stephens, Ngo, & Dunn, 2018; Rips, 2001; Singmann & Klauer, 2011; Stephens et al., 2018). This review shows that although the simplest single-process models can be rejected, a more complex version based on the signal detection framework can successfully account for induction and deduction judgments, making it a viable alternative to dual-process competitors. We then discuss how the successful single-process model can be more explicitly tested, and present two new experiments that do so via manipulations including training participants in deductive logic. We use signed difference analysis to test the model in its most general form. To foreshadow, despite the popularity of dual-process accounts, a single-process model can also explain both extant and new data. We also test a more specific (Gaussian) version of the single-process model to show that the model parameters are psychologically interpretable.

According to one of the simplest possible single-process models, induction and deduction judgments are governed by a single underlying latent variable, corresponding to a common psychological dimension of argument strength (Rips, 2001). A key implication of this account is that endorsement rates for arguments assessed under induction versus deduction instructions must be monotonically related; the model “forbids” an ordinal pattern of difference between two conditions in which induction endorsements increase while deduction endorsements decrease, or vice versa (i.e., contributing to a reversed association; see Dunn & Kirsner, 1988). This prediction holds without having to commit to strong (and potentially false) assumptions about exactly how the single latent variable maps onto observed endorsement rates – simply that the mapping functions are monotonic. However, there has been some – albeit inconsistent – evidence against a monotonic relationship between induction and deduction judgments.

Rips (2001) initially demonstrated a monotonicity violation when there was conflict between the validity and believability of arguments. Rips found that valid-unbelievable arguments received higher deduction endorsements than invalid-believable arguments, but this pattern was reversed for induction judgments. However, Stephens et al. (2018) applied state-trace analysis to these data (see Dunn & Kalish, 2018), which involved plotting induction endorsement rates against deduction endorsement rates, and examining the fit of a model that assumes a single latent variable and thus a monotonic relationship (using an appropriate conjoint monotonic regression [CMR] procedure; Kalish, Dunn, Burdakov, & Sysoev, 2016). They found that the model closely approximated the data, so there was no clear evidence for more than one kind of argument strength assessment. Similarly, Hayes et al. (2018) applied state-trace analysis to three experiments where people made deduction or induction judgments about arguments varying in validity and believability. Notably, these experiments tested several factors relevant to important theoretical distinctions between Type 1 and 2 processing: working memory load, working memory capacity, and decision time (e.g., De Neys, 2012; Evans & Stanovich, 2013; Handley & Trippas, 2015). Although there were clear dissociations between argument endorsement rates in induction and deduction, Hayes et al. (2018) found the data could be explained by variations in a single latent variable (again, based on CMR tests).
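
To make the logic of this monotonicity test concrete, here is a minimal sketch in Python. The endorsement rates are invented, and the isotonic-regression check is our simplified stand-in for the CMR procedure of Kalish et al. (2016), which additionally handles binomial sampling error in both variables:

```python
# Illustrative state-trace monotonicity check with made-up endorsement rates.
# This is NOT the CMR procedure; it simply fits the best increasing function
# of the deduction rates to the induction rates and measures the misfit.
import numpy as np
from sklearn.isotonic import IsotonicRegression

# One (deduction, induction) endorsement-rate pair per experimental condition
deduction = np.array([0.25, 0.40, 0.55, 0.70, 0.90])
induction = np.array([0.45, 0.50, 0.62, 0.58, 0.85])

iso = IsotonicRegression(increasing=True)
fitted = iso.fit_transform(deduction, induction)
misfit = np.sum((induction - fitted) ** 2)

print("isotonic fit:", np.round(fitted, 3))
print("squared deviation from monotonicity:", round(float(misfit), 4))
# A misfit near zero (relative to sampling error) is consistent with a single
# latent variable; a large misfit signals a reversed association.
```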

However, Singmann and Klauer (2011) found larger violations of monotonicity across deduction and induction tasks. In two experiments they manipulated validity, believability (plausible, implausible and neutral) and argument type. Argument types were “affirmation” causal conditional arguments that contrast the valid modus ponens (if A then B, A, therefore B) with the invalid affirming the consequent (if A then B, B, therefore A), and “denial” causal conditional arguments that contrast the valid modus tollens (if A then B, not B, therefore not A) with the invalid denying the antecedent (if A then B, not A, therefore not B). Stephens et al. (2018) applied state-trace analysis to data from both experiments, and this time found reliable departures from monotonicity, rejecting the hypothesis that induction and deduction judgments were driven by a single latent variable. This result is consistent with the view that there are two distinct psychological dimensions of argument strength, as predicted by dual-process accounts. However, by themselves, demonstrations of reversed associations or monotonicity violations are silent on exactly what the multiple latent variables may be. For this reason, it is important to specify and test formal models that instantiate the competing theories, with latent variables defined by the model parameters. More complex single-process models might – and indeed, can – account for non-monotonic data like those from the Singmann and Klauer (2011) experiments (Stephens et al., 2018).

Signal detection theory offers a useful framework for formulating and testing single- and dual-process accounts of the argument evaluation task (Heit & Rotello, 2010; Rotello & Heit, 2009; Rotello, Heit, & Kelly, 2019; Stephens et al., 2018). As shown schematically in Fig. 1, under this framework, arguments are assumed to vary along continuous dimension(s) of subjective argument strength. On the one hand, dual-process models are two-dimensional (2D) – they assume that induction and deduction judgments are based on two different strength dimensions, one based primarily on the output of Type 1 processing and the other based primarily on the output of Type 2 processing, respectively (Fig. 1b). On the other hand, single-process models are one-dimensional (1D) – they assume only a single strength dimension, such that both induction and deduction judgments are based on a common assessment of argument strength (Fig. 1a). Both model classes assume that valid and invalid arguments form distinct distributions in their 2D or 1D space, with the relative distance between them reflecting the extent to which participants distinguish the two argument types. This distance is captured respectively by two discriminability parameters for the 2D model (d_D for deduction and d_I for induction), or a single discriminability parameter (d) for the 1D model. The models also assume that in the argument evaluation task, participants set a decision threshold or criterion, endorsing only those arguments that sit above the criterion in strength. Fig. 1 shows the most general model variants, labelled the independent-1D and independent-2D models by Stephens et al. (2018) – these variants have distinct, independent criterion parameters for deduction and induction judgments (c_D and c_I, respectively). Notably, this “single-process” model actually has three parameters or latent variables, while the corresponding “dual-process” model has four. However, simpler 1D and 2D model variants can also be tested, such as those that include only one shared criterion parameter for both deduction and induction (i.e., dependent-1D and -2D models), or those that fix the criterion across different experiment conditions (e.g., factorial combinations of believability and affirmation/denial argument type) – hence there is no criterion parameter (i.e., fixed criterion-1D and -2D models).
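
As a concrete illustration (not the authors' implementation), the sketch below writes down equal-variance Gaussian versions of the independent-1D and independent-2D models; the function names and parameter values are ours, chosen only to show how the discriminability and criterion parameters map onto the four endorsement rates:

```python
# Minimal equal-variance Gaussian sketch of the two model classes in Fig. 1.
# Invalid arguments are assumed ~ N(0, 1) and valid arguments ~ N(d, 1);
# an argument is endorsed when its strength exceeds the relevant criterion.
from scipy.stats import norm

def independent_1d(d, c_D, c_I):
    """One strength dimension; deduction and induction use separate criteria."""
    return {
        "deduction_valid":   1 - norm.cdf(c_D - d),
        "deduction_invalid": 1 - norm.cdf(c_D),
        "induction_valid":   1 - norm.cdf(c_I - d),
        "induction_invalid": 1 - norm.cdf(c_I),
    }

def independent_2d(d_D, d_I, c_D, c_I):
    """Two strength dimensions with separate discriminabilities d_D and d_I."""
    return {
        "deduction_valid":   1 - norm.cdf(c_D - d_D),
        "deduction_invalid": 1 - norm.cdf(c_D),
        "induction_valid":   1 - norm.cdf(c_I - d_I),
        "induction_invalid": 1 - norm.cdf(c_I),
    }

print(independent_1d(d=1.5, c_D=1.0, c_I=0.4))   # illustrative parameter values
print(independent_2d(d_D=2.0, d_I=0.8, c_D=1.0, c_I=0.4))
```

Because the 1D variant shares a single discriminability across tasks, any manipulation that raises validity discrimination for deduction must also raise (or at least not lower) it for induction; this is the constraint that signed difference analysis tests without committing to the Gaussian form.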

The distributions of argument strength in Fig. 1 are shown as different Gaussian and Gamma distributions, but we use these for illustrative purposes only. As we and others have noted (e.g., Dunn, 2008; Loftus, 1978; Rouder, Pratte, & Morey, 2010; Stephens et al., 2018), the true forms of such response distributions are unknown. Hence, it is prudent to test rival signal detection models of reasoning using an approach that makes only minimal assumptions about the relationship between changes in model parameters and the observed induction and deduction endorsement rates. The approach that we adopt, signed difference analysis (SDA; Dunn & Anderson, 2018; Dunn & James, 2003), simply assumes that this relationship is monotonic; for example, if some manipulation produces a positive shift in the model parameters we will see an increase (or at a minimum, no decrease) in argument endorsements. SDA is a natural extension of state-trace analysis. While the latter involves testing for evidence against a model with a single latent variable or parameter in two-dimensional data space (e.g., induction vs. deduction endorsement rates), SDA can be used to test for evidence against more complex models, in higher dimensions.

In the current application of SDA, the data space is four dimensional, based on induction and deduction endorsement rates for valid and invalid arguments (Stephens et al., 2018), as shown along the x-axis in Fig. 2a. Endorsement rates for deduction-valid, deduction-invalid, induction-valid and induction-invalid form the “dependent variables”, and SDA involves testing observed ordinal patterns of difference between “conditions” across these four dependent variables. For example, the “hypothetical” data in Fig. 2a are based on the results observed by Evans et al. (2009) from an argument evaluation task with causal conditional arguments (the non-speeded conditions). In the Figure, Condition 2 roughly corresponds to a condition with affirmation arguments (modus ponens and affirming the consequent) and low believability, and Condition 1 roughly corresponds to a condition with denial arguments (modus tollens and denying the antecedent) and high believability. The ordinal pattern of difference here is that deduction-valid and induction-valid endorsements are higher in Condition 2 than in Condition 1, while deduction-invalid and induction-invalid endorsements are lower in Condition 2 than in Condition 1. Note that this pattern is consistent with more accurate validity discrimination in Condition 2 relative to Condition 1 (i.e., endorsing more valid arguments and rejecting more invalid arguments).

Crucially, each signal detection model can be shown to predict some ordinal data patterns but not others. Observation of ordinal patterns that are “forbidden” by a given 1D or 2D model would falsify that model. These forbidden patterns can be formally identified (see Stephens et al., 2018) – but are also often consistent with the contrasting intuitive assumptions of each reasoning model. For example, all 1D models assume that induction and deduction judgments share a single discriminability parameter. Hence, across two experimental conditions, validity discrimination should never be found to both increase for deduction and decrease for induction (or vice versa). A hypothetical example of this qualitative forbidden pattern is illustrated in Fig. 2b (and will be discussed in more detail shortly); relative to Condition 2, Condition 1 suggests more accurate validity discrimination for deduction (i.e., a higher endorsement rate for valid arguments and a lower endorsement rate for invalid arguments), but less accurate discrimination for induction (i.e., a lower endorsement rate for valid arguments and a higher endorsement rate for invalid arguments).

Stephens et al. (2018) used SDA to test a set of 16 1D and 2D models (the independent-1D and independent-2D models described above, plus variants with more restricted criteria parameters) against a large database of 26 experiments with induction and deduction endorsement rates for valid and invalid arguments (including the data from Rips, 2001, and Singmann & Klauer, 2011). These experiments included a wide range of factors such as variation in the number of premises, causal consistency or believability, the similarity between categories in the premise and conclusion statements, the cognitive ability of participants, time pressure, and the pleasantness of the content. Stephens et al. (2018) also conducted a new experiment involving a manipulation of perceived base-rates, to test the prediction of the independent-1D model (vs. the competing dependent-2D model) that induction and deduction response criteria are indeed independent and thus can be pushed in opposite directions.

Applying an SDA extension of the CMR statistical test to these datasets (see Experiment 1 Results below for further details on the CMR test), Stephens et al. (2018) found that all models were ruled out except the independent-1D model and the independent-2D model. Given that the independent-1D model has three parameters while the 2D variant is a saturated model with four parameters, the former may be preferred on the grounds of parsimony. At a minimum, the success of the independent-1D model shows that the distinction between Type 1 and Type 2 processing, or any similar distinction that assumes two separate assessments of argument strength, is not required to account for extant data from the argument evaluation task. This evidence of a viable single-process model is especially impressive given that the Stephens et al. (2018) database included variation in factors such as cognitive ability, causal consistency, argument structure, and time pressure that have all been claimed by previous researchers to differentially affect Type 1 and Type 2 processing.

However, an important limitation of the Stephens et al. (2018) archival analysis is that none of the reasoning experiments examined were specifically designed to test the competing predictions of the independent-1D and independent-2D models. Therefore, the primary goal of the current experiments was to perform a more targeted test of the independent-1D model, searching for the critical evidence that would reject it in favor of the more complex independent-2D model.

In order to perform a targeted SDA test of the independent-1D model, its permitted and forbidden ordinal data patterns must be understood. In SDA, the relevant ordinal differences between conditions can be captured by signed difference vectors with four elements corresponding to the dependent variables: in this case, (deduction-valid, deduction-invalid, induction-valid, induction-invalid). Each element can take the value +, −, or 0, although it is rare to observe mean differences of exactly zero. Fig. 2 presents two hypothetical examples of different possible signed difference vectors. Fig. 2a shows an example of ±(+, −, +, −), in which relative to Condition 1, the Condition 2 means are higher for deduction-valid and induction-valid, but lower for deduction-invalid and induction-invalid. Fig. 2b shows an example of ±(+, −, −, +), in which relative to Condition 2, the Condition 1 means are higher for deduction-valid and induction-invalid, but lower for deduction-invalid and induction-valid. In total, there are 40 possible signed difference vectors (see Stephens et al., 2018).

Different models specify different forbidden and permitted data patterns. Stephens et al. (2018) identified that the independent-1D model has one forbidden sign vector, ±(+, −, −, +), as shown in Fig. 2b. This is the pattern we considered earlier – it is consistent with separate discriminability parameters for deduction and induction judgments, because validity discrimination rates can shift in opposite directions. For example, in the Figure, relative to Condition 2, in Condition 1 deduction judgments show higher discrimination of valid versus invalid arguments (i.e., a higher endorsement rate for valid arguments and a lower endorsement rate for invalid arguments), while the induction judgments show the opposite pattern. This pattern is consistent with the saturated independent-2D model and, if observed, it would count as compelling evidence against the independent-1D model.
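
The bookkeeping involved is simple enough to spell out. The sketch below (with invented endorsement rates loosely patterned after Fig. 2, not experimental data) computes the signed difference vector between two conditions, checks it against the pattern forbidden by the independent-1D model, and verifies the count of 40 distinct vectors:

```python
# Sketch of the SDA sign-vector bookkeeping; endorsement rates are invented.
# Order of the four dependent variables throughout:
# deduction-valid, deduction-invalid, induction-valid, induction-invalid.
import itertools

FORBIDDEN_1D = ("+", "-", "-", "+")   # forbidden up to an overall sign flip

def sign_vector(cond_a, cond_b):
    """Element-wise sign of (cond_a - cond_b) across the four dependent variables."""
    return tuple("+" if a > b else "-" if a < b else "0"
                 for a, b in zip(cond_a, cond_b))

def flips(vec):
    """A vector and its overall sign flip count as the same pattern."""
    return {vec, tuple({"+": "-", "-": "+", "0": "0"}[s] for s in vec)}

cond1 = (0.95, 0.30, 0.60, 0.55)   # e.g., trained, denial, high-believability
cond2 = (0.90, 0.40, 0.75, 0.35)   # e.g., untrained, affirmation, low-believability

v = sign_vector(cond1, cond2)
print("observed sign vector:", v)                                    # (+, -, -, +)
print("forbidden by the independent-1D model:", v in flips(FORBIDDEN_1D))

# 3^4 = 81 sign assignments, minus the all-zero vector, with a vector and its
# flip counted once, gives the 40 possible signed difference vectors.
all_vecs = set(itertools.product("+-0", repeat=4)) - {("0",) * 4}
patterns = {frozenset(flips(vec)) for vec in all_vecs}
print("number of distinct signed difference vectors:", len(patterns))  # 40
```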

If there are distinct Type 1 and Type 2 processes that differentially affect induction and deduction judgments, it should be possible to observe the critical pattern forbidden by the independent-1D model, ±(+, −, −, +). How might an experiment produce this complex pattern? It is most likely to be observed between two conditions formed by a combination of experimental factors, such as untrained versus trained judgments, believability, and type of argument form. The first part of the critical pattern (i.e., the initial (+, −, ...) for deduction-valid and deduction-invalid) requires an experimental manipulation that improves validity discrimination, more so for deduction than for induction. This might be achieved by training participants in deductive logic.

The potential for a training manipulation to improve validity discrimination is highlighted by the fact that such discrimination is often poor for the typical untrained undergraduate participant, for many types of logical arguments. For example, across all conditions included in the Stephens et al. (2018) database, mean deduction endorsement rates were 0.80 (SD = 0.14) for valid arguments and 0.34 (SD = 0.27) for invalid arguments, which suggests room for improvement towards the normative values of 1 and 0, respectively. A training manipulation also has theoretical importance for tests of single- and dual-process reasoning models; in particular, it is possible that participants' general lack of logic training masked the observation of distinct Type 1 and 2 processing in the Stephens et al. (2018) database. If participants do not know how to correctly assess validity for some argument forms, perhaps Type 2 processing has simply not been able to exert an influence on responses that is sufficiently distinguishable from that of Type 1 processing. In addition to teaching participants how to assess validity correctly, logic training may also further clarify the distinction between the induction and deduction tasks. Thus post-training there may be a stronger influence of Type 1 processing in induction and Type 2 processing in deduction.
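
To give a rough sense of scale (a back-of-the-envelope calculation of ours, not a figure reported by Stephens et al., 2018), an equal-variance signal detection analysis converts these aggregate rates into a discriminability of only about 1.25:

```python
# Rough equal-variance discriminability implied by the aggregate deduction
# endorsement rates above (0.80 for valid, 0.34 for invalid arguments).
from scipy.stats import norm

hit_rate, false_alarm_rate = 0.80, 0.34
d_prime = norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)
print(round(d_prime, 2))  # ~1.25, well short of perfect discrimination
```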

For these reasons, we include logic training as a key factor in our experiments. In Experiment 1, we train both induction and deduction participant groups on how to distinguish valid and invalid arguments. This confirms that the logic training procedure is successful, and examines people's judgments when only logic-based reasoning is trained. In Experiment 2, to further magnify differences between deduction and induction judgments after training, we also train both groups on how to distinguish arguments with believable and unbelievable content. Thus, we train people in both “logic-based” reasoning and “belief-based” reasoning, in principle creating the ideal conditions for them to apply different types of processing in induction and deduction.

The second part of the critical ±(+, −, −, +) pattern (i.e., the final (..., −, +) for induction-valid and induction-invalid) might be more likely to appear if – for untrained participants – there is one argument type for which valid arguments are endorsed more often than invalid arguments, and another argument type for which the opposite occurs. This kind of responding has been observed before, for causal conditional arguments with content about real-world events and believable versus unbelievable variants, similar to those shown in Table 1. For affirmation arguments, participants typically endorse valid arguments (modus ponens) more than invalid arguments (affirming the consequent), but for denial arguments (modus tollens vs. denying the antecedent), this effect is often reversed. For instance, these effects were found by Evans et al. (2009), Singmann and Klauer (2011), Trippas et al. (2014), and the new experiment by Stephens et al. (2018). To illustrate, as mentioned earlier, the “hypothetical” results in Fig. 2a are based on the results observed by Evans et al. (2009) and Stephens et al. (2018), with Condition 2 roughly corresponding to their (non-speeded) affirmation, low-believability condition, and Condition 1 roughly corresponding to their (non-speeded) denial, high-believability condition. It is suggested that participants rely primarily on the believability of the conditional when assessing the validity of modus tollens (Evans et al., 2010; Singmann & Klauer, 2011).

Considering the pattern of responses in Fig. 2a, note that if a new condition could “correct” validity discrimination for the deduction judgments of Condition 1 (while leaving the induction judgments unaltered), this would produce the (idealized) data pattern in Fig. 2b that is forbidden by the independent-1D model. Training may produce this kind of effect, as discussed above. To be clear, to create the ordinal ±(+, −, −, +) pattern, training needs to have a larger impact on deduction than induction – not necessarily no impact on induction. Note that the hypothetical conditions in Fig. 2b would then correspond to a combination of different experiment factors: trained, denial, high-believability responses (Condition 1) versus untrained, affirmation, low-believability responses (Condition 2). Therefore, in addition to training, our experiments also include the factors of argument type (affirmation versus denial causal conditional arguments) and believability (low, high, and neutral, with the latter included simply to increase opportunities for observing the critical SDA forbidden pattern).

Although we do not know of any previous studies of the effects of logic training (and/or belief training) on both induction and deduction judgments, some studies have investigated the effects of laboratory-based logic training on deductive reasoning alone. Some training procedures have been shown to improve accuracy for various deduction tasks, including evaluating categorical syllogisms (Prowse Turner & Thompson, 2009), generating conclusions from conditional (and other) premise structures (Klauer, Meiser, & Naumer, 2000; Klauer, Stegmaier, & Meiser, 1997), and the related Wason selection task (e.g., Cheng, Holyoak, Nisbett, & Oliver, 1986; Klaczynski & Laipple, 1993). Although it is unclear exactly which training components are essential, explaining valid and invalid inferences with concrete examples, together with practice that includes immediate feedback, appears to be beneficial. Therefore, we include these features in our logic training procedure.

Our primary goal is to use signed difference analysis to test the independent-1D and -2D models in their most general form, with minimal distributional assumptions. As we have argued, this SDA testing is important because the “true” distributional forms are unknown. The approach is also powerful because if SDA rules out the independent-1D model, all possible variants of the model with stronger distributional assumptions are also ruled out. However, given that we end up retaining both the independent-1D and -2D models, our subsequent goal is to assess them further, in a more exploratory fashion with stronger distributional assumptions (i.e., Gaussian distributions). This allows us to examine the models' best-fitting parameter values for the experimental conditions. First, we investigate whether the two discriminability parameters of the independent-2D model are correlated. If so, this model is mimicking the independent-1D model and thus its extra complexity is unwarranted. Second, we examine the interpretability of the discriminability and two criterion parameters of the independent-1D model; how do they vary across experimental conditions?
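
For readers interested in what the Gaussian analysis involves, a minimal sketch of fitting the independent-1D model to one condition's endorsement counts by maximum likelihood is given below; the counts and starting values are invented, and the fitting procedure actually used for the experiments may differ in detail:

```python
# Maximum-likelihood fit of the Gaussian independent-1D model (parameters
# d, c_D, c_I) to one condition's endorsement counts (invented data).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, binom

# (endorsements, trials) for deduction-valid, deduction-invalid,
# induction-valid, induction-invalid
counts = np.array([[46, 50], [12, 50], [40, 50], [22, 50]])

def neg_log_lik(params):
    d, c_D, c_I = params
    p = np.array([
        1 - norm.cdf(c_D - d),   # deduction, valid
        1 - norm.cdf(c_D),       # deduction, invalid
        1 - norm.cdf(c_I - d),   # induction, valid
        1 - norm.cdf(c_I),       # induction, invalid
    ])
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return -np.sum(binom.logpmf(counts[:, 0], counts[:, 1], p))

fit = minimize(neg_log_lik, x0=[1.0, 0.5, 0.0], method="Nelder-Mead")
d_hat, c_D_hat, c_I_hat = fit.x
print(f"d = {d_hat:.2f}, c_D = {c_D_hat:.2f}, c_I = {c_I_hat:.2f}")
```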


Experiment 1

The primary aim of Experiment 1 was to test for the ordinal pattern forbidden by the independent-1D model, using signed difference analysis. The dependent variables for SDA were induction and deduction endorsement rates (manipulated between groups), elicited for a common set of valid and invalid arguments. Three within-participants factors were the type of causal conditional argument (denial vs. affirmation), the believability of the content (unbelievable, believable, or neutral) and logic training (judgments made before vs. after training).

Experiment 2

The primary aim of Experiment 2 was to further test for the ordinal pattern forbidden by the independent-1D model. The design of Experiment 2 was identical to Experiment 1 except that between the logic training and the final post-training block, all participants also completed some “content training”, which asked them to consider whether argument content was “sensible”. Changes to the design are noted below. Our intention was that after both training tasks, deduction participants would know how

Examination of the models with distributional assumptions

The SDA tests have shown that Experiments 1 and 2 do not reject the independent-1D model – in its most general form, with minimal distributional assumptions. Therefore, the 1D model remains as a viable alternative to the saturated independent-2D model – both models can account for the data. While the 1D variant may be preferred on the basis of parsimony, other considerations are also important in model comparison, such as how the parameters vary across conditions and whether they do so in a psychologically interpretable way.

General discussion

Across two experiments, we performed a targeted test for evidence against a “single-process” independent-1D model, which had successfully accounted for a large body of existing data from the argument evaluation task (Stephens et al., 2018). Participants made induction or deduction judgments about valid and invalid arguments, which varied in argument believability and conditional argument form. Additionally, we used training to heighten Type 1 versus Type 2 processing, under a dual-process account.

CRediT authorship contribution statement

Rachel G. Stephens: Conceptualization, Formal analysis, Investigation, Methodology, Software, Visualization, Writing - original draft, Writing - review & editing. John C. Dunn: Conceptualization, Funding acquisition, Methodology, Software, Resources, Supervision, Writing - review & editing. Brett K. Hayes: Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Supervision, Writing - review & editing. Michael L. Kalish: Conceptualization, Funding acquisition,

Funding

This work was supported by Australian Research Council Discovery Grants DP150101094 and DP190102160 to authors BKH and JCD, and Australian Research Council Discovery Grant DP130101535 to JCD and MLK.

Declaration of competing interest

None.

Notes

The authors thank Eric Moskowitz, Rebecca Leonard and Tennazha Bradley for their assistance with data collection, and Nicole Cruz for assistance with statistical analyses. Materials and data are available as supplementary files, and at https://osf.io/cwf8x/.

References (90)

  • I.P.L. McLaren et al. (2014). Associations and propositions: The case for a dual-process account of learning in humans. Neurobiology of Learning and Memory.
  • D.E. Melnikoff et al. (2018). The mythical number two. Trends in Cognitive Sciences.
  • B.R. Newell et al. (2008). Dimensions in data: Testing psychological models using state-trace analysis. Trends in Cognitive Sciences.
  • S.E. Newstead et al. (1992). The source of belief bias effects in syllogistic reasoning. Cognition.
  • M. Oaksford et al. (2001). The probabilistic approach to human reasoning. Trends in Cognitive Sciences.
  • C.M. Rotello et al. (2019). Do modals identify better models? A comparison of signal detection and probabilistic models of inductive reasoning. Cognitive Psychology.
  • H. Singmann et al. (2016). Probabilistic conditional reasoning: Disentangling form and content with the dual-source model. Cognitive Psychology.
  • N. Skovgaard-Olsen et al. (2016). The relevance effect and conditionals.
  • R.G. Stephens et al. (2019). Disappearing dissociations in experimental psychology: Using state-trace analysis to test for multiple processes. Journal of Mathematical Psychology.
  • F.G. Ashby et al. (1998). A neuropsychological theory of multiple systems in category learning. Psychological Review.
  • A. Baddeley (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology.
  • D. Bates et al. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software.
  • D.H. Brainard (1997). The psychophysics toolbox. Spatial Vision.
  • P. Croskerry et al. (2013). Cognitive debiasing 1: Origins of bias and theory of debiasing. British Medical Journal: Quality and Safety.
  • E. Dane et al. (2007). Exploring intuition and its role in managerial decision making. Academy of Management Review.
  • W. De Neys (2012). Bias and conflict: A case for logical intuitions. Perspectives on Psychological Science.
  • C. Dube et al. (2010). Assessing the belief bias effect with ROCs: It’s a response bias effect. Psychological Review.
  • J.C. Dunn (2008). The dimensionality of the remember-know task: A state-trace analysis. Psychological Review.
  • J.C. Dunn et al. (2018). State-trace analysis.
  • J.C. Dunn et al. (1988). Discovering functionally independent mental processes: The principle of reversed association. Psychological Review.
  • J.S.B.T. Evans (2007). On the resolution of conflict in dual process theories of reasoning. Thinking & Reasoning.
  • J.S.B.T. Evans (2008). Dual-processing accounts of reasoning, judgment, and social cognition. Annual Review of Psychology.
  • J.S.B.T. Evans (2010). Intuition and reasoning: A dual-process perspective. Psychological Inquiry.
  • J.S.B.T. Evans et al. (1983). On the conflict between logic and belief in syllogistic reasoning. Memory & Cognition.
  • J.S.B.T. Evans et al. (2009). Reasoning under time pressure: A study of causal conditional inference. Experimental Psychology.
  • J.S.B.T. Evans et al. (1999). Reasoning about necessity and possibility: A test of the mental model theory of deduction. Journal of Experimental Psychology: Learning, Memory, and Cognition.
  • J.S.B.T. Evans et al. (2010). The influence of cognitive ability and instructional set on causal conditional inference. The Quarterly Journal of Experimental Psychology.
  • J.S.B.T. Evans et al. (2013). Reasoning to and from belief: Deduction and induction are still distinct. Thinking & Reasoning.
  • J.S.B.T. Evans et al. (2013). Dual-process theories of higher cognition: Advancing the debate. Perspectives on Psychological Science.
  • E. Gillard et al. (2009). Dual processes in the psychology of mathematics education and cognitive psychology. Human Development.
  • U. Hahn et al. (2007). The rationality of informal argumentation: A Bayesian approach to reasoning fallacies. Psychological Review.
  • S.J. Handley et al. (2011). Logic, beliefs, and instruction: A test of the default interventionist account of belief bias. Journal of Experimental Psychology: Learning, Memory, and Cognition.
  • B.K. Hayes et al. (2017). Comparing single- and dual-process models of memory development. Developmental Science.
  • B.K. Hayes et al. (2017). Inductive reasoning 2.0. Wiley Interdisciplinary Reviews: Cognitive Science.
  • B.K. Hayes et al. (2018). The dimensionality of reasoning: Inductive and deductive inference can be explained by a single process. Journal of Experimental Psychology: Learning, Memory, & Cognition.