Introduction

Proofs can come in many forms, ranging from systems of logical deductions carried out in a formalistic, symbolic way, as in the Principia Mathematica (Whitehead and Russell 1910), through plain-text arguments supplemented by formulas, which are the standard form in many research papers and textbooks, to so-called “Proofs Without Words” given by an image only (Nelsen 1993).

Computers have added to this variety by giving rise to computable proofs, i.e. proofs that can be checked by a computer (Voevodsky 2015), or even proofs executed by a program such as the much debated (Tymoczko 1979) proof of the Four-Color Theorem (Appel and Haken 1977), too long to ever be completely reviewed by a human.

The rise of computers, and with it the digital transformation of all aspects of human activity, has also made possible environments in which mathematics can be done interactively. These environments come in different representation modes: some are more formal-symbolic in nature, such as Mathematica or Jupyter Notebook, while others have rather informal visual representations, such as GeoGebra for geometry or Ariadne for topology (Footnote 1). In the same way as programming stimulated computational thinking (Papert 1980), we claim that mathematical simulations nurture mathematical thinking; the focus in both cases lies on human thought and development and not on technology.

Undoubtedly, these programs allow for mathematical activities such as exploration or checking special cases of conjectures (Borwein and Devlin 2008). These “experimental mathematics” (Borwein 2011) activities can be regarded as equally important as “rigorous proving” in mathematics (Jaffe and Quinn 1993). However, we argue from a theoretical perspective that they can also provide new means to formulate proofs, that is, mathematical arguments fulfilling the roles of proofs pointed out, for example, by de Villiers (1990) (see “The Status of Proof in Mathematics and Mathematics Education”).

Research on the impact of these new technologies on the concept of proof, which plays a central role in mathematics, is not well developed, despite their well-known impact on the understanding of mathematics in general (Hoyles and Lagrange 2010). Research focuses more on the role technology plays in establishing conviction of a fact, which may even be counterproductive in justifying the necessity of proof (Marrades and Gutiérrez 2000; Hoyles and Noss 2003; Bolite Frant and Rabello de Castro 2000; Christou et al. 2004), and on efforts to remedy this (Hadas et al. 2000; Jones 2000); see Sinclair and Robutti (2013) for an overview. This focus on technology establishing conviction is related to the use of many of these technologies primarily in educational contexts. In curricula, proofs are well established almost only in geometry (Hanna and de Bruyn 1999; Stylianides 2007), where students do proofs by geometric constructions in the spirit of Euclid (Mogetta et al. 1999). Some go as far as to say that the central reason for the existence of geometry in the curriculum is to serve as a paradigm for deductive proof (Hanna 1998). Overall, research investigating possible evolutions of the concept of proof when digital tools are used is not well developed and focuses on their use for such constructions in Euclidean geometry (Marrades and Gutiérrez 2000).

Moreover, the effect of using digital geometry environments in educational contexts is controversial (Hanna 1998), as “proofs” in such environments are often seen as providing evidence for the truth of a statement by making available a large, by continuity seemingly infinite, number of examples (Hanna 1998), without actually being a real proof (Nam 2012). An example would be dragging the corners of a triangle while keeping track of the sum of the internal angles, which can convince a student that the sum is always 180° without giving any explanation of this fact or of its relation to the parallel axiom, and without providing certainty that no case exists in which the sum is different. The scope of such software is thus often seen as limited to exploration (Christou et al. 2004) and conjecturing based on this exploration (Mogetta et al. 1999; Venema 2013).

This deeply rooted view of such technology as being there for exploration and conjecturing, and of proofs as being formal in their representation, has led to attempts at doing proofs by integrating such technology and formal proofs in digital environments. This manifests, for example, in the implementation of proof assistants in dynamic geometry environments (Albano et al. 2019; Nam 2012; Kovács 2015; Miyazaki et al. 2017; Hanna et al. 2019).

We argue that these limitations and the separation into informal/exploration and formal/proving are mostly due to the nature of geometrical constructions and the role of Dynamic Geometry Environment (DGE) software in proof and technology related activities. To emphasize that these limitations are not inherent to digital environments in general, but stem from the practice of their use in geometry, we present several proofs (Footnote 2) done in a different type of software, a Dynamic Topology Environment called Ariadne (Sümmermann 2019a). In Ariadne, the user can not only explore basic concepts of topology including points, paths, and homotopies of paths, but can also formulate proofs of non-existence using invariants such as the winding number of a path. Examples of such proofs are given in Appendix A. They highlight the features of simulation-based proofs in such a digital environment, which are not limited to exploration, experiments, and constructions, but can satisfy the same functions as traditional proofs. In our argumentation, Ariadne serves as an example; it can be exchanged for any simulation with similar characteristics.
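To make the notion of an invariant such as the winding number more concrete, the following minimal sketch computes the winding number of a closed polygonal path around a point. It is an illustration only and makes no claim about how Ariadne represents or processes paths; the names `winding_number` and `circle` and the representation of a path as a list of vertices are assumptions made for this example.

```python
import math

def winding_number(path, point=(0.0, 0.0)):
    """Winding number of a closed polygonal path around `point`.

    `path` is a list of (x, y) vertices; the last vertex is joined back to
    the first one. The path is assumed not to pass through `point`.
    """
    px, py = point
    total = 0.0
    n = len(path)
    for i in range(n):
        x0, y0 = path[i][0] - px, path[i][1] - py
        x1, y1 = path[(i + 1) % n][0] - px, path[(i + 1) % n][1] - py
        # Signed angle swept by this segment as seen from `point`.
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return round(total / (2 * math.pi))

def circle(turns, samples=200, center=(0.0, 0.0), radius=1.0):
    """Hypothetical helper: a sampled circle traversed `turns` times."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * turns * k / samples),
         cy + radius * math.sin(2 * math.pi * turns * k / samples))
        for k in range(samples)
    ]

once = winding_number(circle(1))                         # loop around the origin
twice = winding_number(circle(2))                        # same loop, run twice
outside = winding_number(circle(1, center=(3.0, 0.0)))   # origin not enclosed
```

Because the winding number does not change under homotopies that avoid the point, two loops with different values cannot be deformed into each other without crossing it. This is the kind of non-existence statement referred to above.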

Our first findings in this theoretical study are the necessary conditions that the broader class of environments and tools representing mathematics have to fulfill, in order to possibly allow the construction of proofs in such environments. These conditions define the new concept of a mathematical simulation and with it the term simulation(-based) proofs. An emphasis must be put on the word necessary in the previous formulation, as there are examples of mathematical simulations such as Matlab or Maple, in which proofs that go beyond calculations are harder to achieve due to the focus on numerical manipulations.

Following an analysis of this kind of simulation proofs, we present a classification of proofs along the lines of interactivity and formality, showing the place of simulation-based proof in the context of more traditional or alternative forms of proof. We also present an analysis of different functions of simulation proofs, following and extending the framework created for proofs in general by de Villiers (1990). This leads to the conclusion that simulation proofs are of particular interest in mathematics education, with some caveats.

Going beyond the familiar debate of the non-surveyability of computer-generated proofs (Tymoczko 1979) such as the proofs of the Four-Color Theorem and the Kepler conjecture, we introduce technological reasons as a category of acceptance criteria for proofs in the context of simulations.

Article Organization

Roughly speaking, we will give definitions, examples, and intricacies of the use of simulation-based proofs. Then follows a classification of simulation-based proof, first external (in relation to other types of proof) and then internal (highlighting functions of proof).

More precisely, in “The Proof Process in a Mathematical Simulation”, we will start by explaining more closely what we mean when talking about proofs. We then define the notion of a mathematical simulation, in particular distinguishing simulations from animations and microworlds. This is followed by a general analysis of the user interaction process with a simulation. As errors occurring in the interaction process are vital in establishing trust in the simulation, we give a categorization and examples of different types of error that may arise in the process.

This is followed by “Dimensions of Proofs”, in which various forms of proof and proving environments are classified along the dimensions of interactivity and formality. This aids in understanding the place of simulation proofs and mathematical simulations in relation to other forms of proof and mathematical environments.

Subsequently, “Functions of Simulation Proofs” employs de Villiers’ (1990) framework to discuss the functions of simulation proofs. In particular, this also encompasses new categories regarding the conviction of the proof recipient particular to simulation proofs.

Finally, “Conclusion” contains conclusions and implications for mathematics education and mathematics research, followed by some descriptions of simulation proofs in Appendix A.

The Proof Process in a Mathematical Simulation

The Status of Proof in Mathematics and Mathematics Education

While proofs are at the center of mathematics, their nature is highly contested and comprises a wide range of different objects. The spectrum of proofs begins with “formal proofs” in a mathematical logic theory sense, as chains of formalized deductions, which can at least in theory be checked by a computer. As Krabbe (2008) states, these “are a logician’s gadget,” and do not exist in practice (Aberdein 2008). The view that formal proofs are practically non-existent is challenged by new generations of formal proof theories and resulting computational advances, leading to formalized proofs as feasible tools for mathematicians’ work (Voevodsky 2015), but it is still certainly true for “most” mathematicians and almost all of mathematics in educational contexts.

The next “step” consists of formal proofs in the sense of axiomatic, symbolic mathematics. Such proofs contain elements of an argumentation and are the type of proof that is most commonly employed in mathematics contexts. That mathematical proofs can be argumentations (Aberdein 2008), i.e. leave room for debate, is again a controversial notion and rejected by some (Johnson 2012), as it conflicts with the view of proofs providing absolute conviction of truth beyond doubt (Krabbe 2008).

Proofs in educational settings, such as schools or undergraduate courses, are then again different, as they adhere to forms of reasoning and are communicated with forms of expression “that are valid and known to, or within the conceptual reach of, the classroom community” (Stylianides 2007). This can include other forms than the stricter representation in communications within the mathematical community.

What we propose, for the purpose of this article, is more of an implicit definition of proof, giving a list of attributes defining a proof in the sense of Lakoff (1987) and Weber (2014): By showing that simulation-based proofs can fulfill the roles and functions of proofs as specified by de Villiers (1990), they may be regarded as such. In this way, we adopt a view of proofs as a cluster concept in the sense of Weber (2014).

It should be noted that when we argue that “simulation-based argumentations” can be regarded as proofs, it is not our intent to somehow bypass the sociomathematical norms (in the sense of Yackel and Cobb 1996), and certainly not to set them. We cannot define externally what constitutes a proof; this has to be done by the mathematics community and by time.

Mathematical Simulations

Generally, a simulation can be defined as any attempt to mimic a real or imaginary environment or system (Rieber 1996). Based on this, we define a mathematical simulation (MS) to be a simulation mimicking mathematics by following the mechanisms of action immanent to the mathematics being simulated, creating the representation as a consequence of general underlying rules (see Richter-Gebert 2013). A mathematical simulation comes equipped with a certain representation of the mathematical content, in general defined by the person who built the simulation, and the capability to allow user interaction with this representation, be it through manipulation of objects, images, symbols, or other such modes of representation (see “Formality”) yet to be conceived. We will call proofs based on a simulation either simulation-based proofs or simply simulation proofs.

Simulations are fundamentally different from animations, even if it may not be easy for a user to distinguish them from one another. An animation is defined to be software following a pre-determined stimulus-response mechanism (Richter-Gebert 2013). While animations are defined by what is being presented, simulations are defined by how the presentation is made.

An example borrowed from physics may help to clarify the differences between animations and simulations. Consider two tablet apps, both of which allow a ball to be moved on the screen by dragging. When let go, the ball falls to the ground, bouncing a few times. Behind the visual representation, this could be realized by implementing the mechanisms of action of the ball, such as gravity and the spring force calculated from the kinetic energy. This would be a (physics) simulation. Another possibility would be to simply let the ball go down pixel by pixel until its coordinates reach a certain value, which then triggers a predefined movement along some curve, giving the “illusion” of the ball bouncing; this is an animation. Although they may not look different, at least at first sight, the simulation may allow the exploration of physical phenomena even beyond the intent of the programmer, while the animation does not. In the scenario of the bouncing ball, this may mean the deformation of the ball on impact, which may not have been intended by the programmer of the simulation but may nevertheless be observed, while it is only visible in the animation if the programmer has explicitly thought of it.
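The distinction can be sketched in a few lines of code. The following is a simplified illustration under stated assumptions, not an implementation of any existing app: the simulation replaces the spring force mentioned above with a plain coefficient of restitution, and all names and constants (`simulate_ball`, `animate_ball`, `FALL_PER_FRAME`, the decay curve) are hypothetical.

```python
import math

def simulate_ball(height, gravity=9.81, restitution=0.8, steps=3000, dt=0.001):
    """Simulation: every position follows from rules (gravity, impact)."""
    y, v = height, 0.0
    trajectory = []
    for _ in range(steps):
        v -= gravity * dt            # gravity accelerates the ball downwards
        y += v * dt
        if y < 0.0:                  # impact: reflect and lose some energy
            y = -y
            v = -v * restitution
        trajectory.append(y)
    return trajectory

FALL_PER_FRAME = 0.005  # fixed drop per frame, chosen by the animator

def animate_ball(height, frames=3000):
    """Animation: fixed descent, then a predefined curve that looks like bouncing."""
    trajectory = []
    y = height
    landed_at = None
    for frame in range(frames):
        if landed_at is None:
            y = max(0.0, y - FALL_PER_FRAME)
            if y == 0.0:
                landed_at = frame    # threshold reached: trigger the canned motion
        else:
            t = (frame - landed_at) / 1000.0
            # Decaying arches drawn by hand; they obey no physical law.
            y = 0.5 * height * math.exp(-2.0 * t) * abs(math.sin(4.0 * t))
        trajectory.append(y)
    return trajectory

earth = simulate_ball(1.0)
moon = simulate_ball(1.0, gravity=1.62)   # the simulation reacts to a changed rule
canned = animate_ball(1.0)                # the animation has no rule to change
```

Changing `gravity` alters the simulated trajectory as a consequence of the underlying rule, whereas the animation can only ever replay the behavior its author anticipated.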

A mathematical example would be software generating graphs of quadratic functions: a simulation really plots the function, whereas an animation presents a predefined image from a collection of graphs, namely the one best fitting the entered parameters. As the number of different graphs on a screen with a finite resolution is finite, the difference between the two cases would be virtually impossible to identify for a user (Footnote 3).
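A toy version of this example might look as follows. It is a sketch under strong assumptions: the “screen” is a tiny pixel grid, the pre-rendered collection is deliberately coarse, and the names `rasterize`, `GALLERY`, and `lookup` are hypothetical.

```python
WIDTH, HEIGHT = 21, 21   # a deliberately tiny "screen"

def rasterize(a, b, c):
    """Simulation: compute the pixels of y = a*x^2 + b*x + c from the rule."""
    pixels = set()
    for col in range(WIDTH):
        x = col - WIDTH // 2
        row = round(a * x * x + b * x + c) + HEIGHT // 2
        if 0 <= row < HEIGHT:
            pixels.add((col, row))
    return frozenset(pixels)

# Animation: a finite collection of pre-rendered images, indexed by parameters.
GALLERY = {
    (a, b, c): rasterize(a, b, c)
    for a in (-1, 0, 1) for b in (-1, 0, 1) for c in (-1, 0, 1)
}

def lookup(a, b, c):
    """Animation: return the stored image whose parameters fit best."""
    best = min(
        GALLERY,
        key=lambda k: (k[0] - a) ** 2 + (k[1] - b) ** 2 + (k[2] - c) ** 2,
    )
    return GALLERY[best]

same_image = lookup(1, 0, 0.2) == rasterize(1, 0, 0.2)         # indistinguishable
different_image = lookup(0.25, 0, 0) != rasterize(0.25, 0, 0)  # outside the collection
```

For parameters close to a stored case, the finite resolution hides the difference between the two programs; for parameters the collection does not cover, only the rule-based version still produces the graph of the entered function.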

These examples showcase the predefined nature of animations, which certainly makes the construction of an animation easier than that of a simulation and may also have some advantages. By their very definition, they are, however, useless for the exploration or representation of mathematics beyond the built-in set of cases.

A term closely related to that of a mathematical simulation is “microworld,” a concept introduced by Papert (1980) describing a subset of reality or a constructed reality whose structure matches that of a given cognitive mechanism so as to provide an environment where the latter can operate effectively (Papert 1980, p. 204), allowing learning. Hoyles et al. (1996) also stress the importance of interactivity for microworlds, stating “Software which fails to provide the learner with a means of expressing mathematical ideas also fails to open any window on the processes of mathematical learning. A student working with even the very best simulation, is intent on grasping what the simulation is demonstrating rather than attempting to articulate the relationships involved” (Hoyles et al. 1996, p. 54) [emphasis added].

Simulations can, however, be distinguished from these microworlds through design choices made in the design of the latter so as to fit the environment to the learners’ cognitive state (Rieber 1996). The aims of a microworld go beyond those of a simulation, as it not only represents a mathematical object faithfully, but also specifically strives to enable learning, making microworlds a subclass of simulations. In a microworld, the objects can be manipulated by the user “with the purpose of inducing or discovering their properties and the functioning of the system as a whole” (Edwards 1995, p. 144).

Mathematical simulations can serve as a framework for formulating proofs, as in addition to the capability of representing mathematics in a certain way, which can also be said for Proofs Without Words or even writing on a blank sheet of paper, the user can interact with the simulation in a meaningful way. The interaction is meaningful, as conjectures and arguments can be made in response to the behavior of the simulation, because the simulation reacts according to mathematical laws. Furthermore, this central feature of simulations makes it possible to not only replicate known results, but also to discover new results, as the grounding in underlying mathematical rules allows the user to go beyond the simulation designer’s imagination and intentions.

The simulation used to demonstrate some examples of simulation-based proofs in this paper is Ariadne (Footnote 4), a dynamic topology environment developed by Sümmermann (2019a). In Ariadne, the user can explore different mathematical spaces including points, paths, and homotopies of paths using touch gestures, constructing points by a touch, paths by dragging points, and homotopies by dragging paths. Mathematically speaking, the “dragging” feature of many DGEs represents a homotopy, so this exploration of spaces via homotopies goes to the core of this “most central tool of DGEs” (Sinclair and Robutti 2013).

The Interaction Process with a Mathematical Simulation

When arguing and possibly even proving based on a mathematical simulation, trust of the user in the simulation is crucial; for example, trust in the physics engine in the bouncing-ball simulation mentioned above. Errors in the interaction with a mathematical simulation could undermine the trust of the user. To determine the kind of errors that can arise in the interaction, we must analyze the interaction process with such a simulation. This analysis is itself independent of the purpose of the interaction, be it exploration or argumentation. The interaction process follows the same principles as the interaction with software in general. It can be described using a Subject-Object-Artefact triangle, derived from situated instrumented activity (Vérillon 2000); our model is specialized for the case of user interaction with mathematical simulations.

In an interaction process with a simulation, the users are in a cycle of adjusting their knowledge of mathematics to the observation of the simulation’s behavior, leading to the formulation of new actions to be taken, resulting in new output of the simulation, which is again observed by the users. Each step of the proof involving an interaction with the software leads to such an adjustment. The general nature of the relationship between the user, the software, and the underlying mathematics can be described using the triangle in Fig. 1.

Fig. 1 An interaction process with the mathematical simulation

During the interaction process, the relations in the triangle are traversed by repeating several steps:

  1. The users decide their next action as a consequence of their knowledge of mathematics (influenced by mathematics), taking the form of an input to the mathematical simulation.

  2. The simulation internally computes a new representation as an output, influenced by the mathematics implemented in it.

  3. The users interpret the new representation and adjust their expectations on the simulation’s behavior and possibly also their knowledge of mathematics.

  4. With new expectations in mind, based on the representation of the simulation, the users decide their next action, leading to a new input for the simulation.

Errors in the Interaction Process with the MS

For a productive use of mathematical simulations, the process described in “The Interaction Process with a Mathematical Simulation” must function properly. Different types of errors can disrupt this process, which are characterized in this section. We distinguish two categories: errors by the user and errors by the software. An error by the user means an incorrect interpretation of the representation of mathematics of the mathematical simulation, and an error by the simulation an incorrect representation of mathematics by the mathematical simulation.

It is, of course, impossible to determine what, in general, an incorrect representation of mathematics is. But the representation of mathematics in a mathematical simulation constitutes an agreement between user and software to the “language” used to carry information. So even if a representation generated by a simulation – like any representation – cannot be false by itself, it can be judged by its adherence to the representation mode agreed upon and thus implicit to the simulation. In the following, the words “correct” or “error” will be used in this sense. In the triangle from Fig. 1, this error would be situated in “Representation by the software”.

Nevertheless, a mathematical error originating from interaction with the software can only be made by the user if the simulation’s dissenting representation leads to false assumptions about the represented mathematics. Notably, this implies that the error was observed by the user; if an error of the mathematical simulation has no consequences on the representation given as output or is overlooked, then it cannot have implications on the expectations and on the mathematical knowledge of the user.

In addition, the user can not only hold correct or incorrect assumptions on mathematics, but also make errors in the interpretation of the simulation’s representation. The chance of misinterpretation may increase if the representation mode is not defined explicitly. In Fig. 1, this would correspond to errors in the arrows between artefact and subject.

This leads to the distinction shown in Table 1. The user can correctly or incorrectly interpret the representation by the mathematical simulation. In the same way, the mathematical simulation can render the representation agreed upon correctly or incorrectly. Even if all possible interactions may be situated in this grid, not all errors can be identified in this way; the user can also draw the right or wrong conclusions, corresponding to errors in the subject corner in Fig. 1, or the software can go beyond its representation capabilities.

Table 1 The different types of errors along two dimensions, accounting for representation-related errors

To illustrate the types of errors given in Table 1, examples based on a fictional software are given. These simple examples do not represent proving situations, but suffice to showcase the different possible errors of user and software on the mathematical content.

The software is a function plotter, which can represent a given polynomial function by drawing it in a coordinate system. It does so by randomly sampling points and connecting them by linear splines. It leaves a gap where the function is not defined. For all presented functions, the user wants to investigate singularities of the function using the software, disregarding the fact that a thoughtful user might not trust the software (compare “Criterium 1: Trust in the Technology”), or double-check the result by other means. The error types in Fig. 2 can then be described as follows.

  (a) The representation is correctly understood by the users: They see the singularity pointed out by the software.

  (b) The users have the correct assumptions about the representation used, but the software depicts an incorrect representation: It might be that the software did not include 1 in the approximation of the curve, and thus did not observe the singularity. The users “correctly” assume that the function does not have a singularity.

  (c) The users have incorrect assumptions about the representation used, but the software has no fault in the display: The user does not understand that the circled point is a singularity, even if removable, and believes the function to be free of singularities.

  (d) The users have incorrect assumptions about mathematics and the simulation has a fault in the display: The software displays a continuous line along the jump discontinuity, which is not the agreed representation for this type of singularity (correct representation in Fig. 2a). The users think such an abrupt and non-differentiable change of slope is a sign of a discontinuity, and interpret the function to have singularities at the points (0.99,1) and (1.01,1.5).

Fig. 2 The error types, as seen in a software for function plotting, according to Table 1
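The software error of case (b) can be reproduced with a minimal sketch of such a fictional plotter. The concrete function `f` with a removable singularity at x = 1, the sampling interval, and the helper names are assumptions chosen for illustration; they are not taken from Fig. 2.

```python
import random

def f(x):
    """(x^2 - 1) / (x - 1): undefined at x = 1 (removable singularity)."""
    return (x * x - 1.0) / (x - 1.0)

def sample_plot(func, xs):
    """Evaluate func at the sample points; None marks a gap in the plot."""
    points = []
    for x in sorted(xs):
        try:
            points.append((x, func(x)))
        except ZeroDivisionError:
            points.append((x, None))   # agreed representation: leave a gap
    return points

def shows_gap(points):
    """True if the plot contains the agreed marker for an undefined point."""
    return any(y is None for _, y in points)

rng = random.Random(0)                       # fixed seed for reproducibility
random_xs = [rng.uniform(0.0, 2.0) for _ in range(50)]

missed = sample_plot(f, random_xs)           # random sampling skips x = 1
caught = sample_plot(f, random_xs + [1.0])   # x = 1 sampled explicitly
```

With purely random sample points, the value x = 1 is almost never evaluated, so the linear interpolation draws an unbroken line and the representation no longer adheres to the agreed mode. A user who reads the plot correctly is then led to a false conclusion about the function.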

Dimensions of Proofs

Mathematical proofs can be classified along several dimensions, for example along the dimension of formality (Lakatos 1978). Besides this dimension, we concentrate on “interactivity” as a second dimension, to account for the more dynamic proof forms that are the focus of this article. The classification helps to distinguish the various currently existing environments on which proofs can be based, together with the corresponding proofs, and subsequently to point out the place simulation-based proofs take in relation to these other forms of proof. We will also discuss the concept of “transferability” of proofs, which certainly plays an important role in this context. These dimensions are not necessarily orthogonal, but represent a way of distinguishing some proofs or mathematical environments.

Formality

Formality describes the adherence to standardized mathematical notation in the representation of mathematics, which is distinct from “formal,” meaning written in a formalized language with formalized derivation rules (Krabbe 2008). Here, formal means adhering to a strict, structured symbolic representation.

Proofs, as all mathematical content, can be distinguished with regard to their representation. Different systems have been proposed to categorize representation modes, for example enactive, iconic, and symbolic by Bruner (1966), or somatic, mythic, romantic, philosophical, and ironical by Egan (1997).

We use the much coarser distinction into formal–informal, as this suffices to classify existing environments allowing proofs for the purposes of this article.

Interactivity

Interactivity is defined by the dictionary Merriam-Webster as “mutually or reciprocally active.” That means that not only does the medium have to be active, such as in a video, but the user’s actions must also have consequences for the activity of the medium.

Proofs can present themselves at different levels of interactivity. At one end of the spectrum are static proofs. These can range, in different levels of structuredness, from Proofs Without Words (Nelsen 1993), through “traditional” formal proofs from mathematical practice, such as proofs in textbooks, to proofs that can be checked by a computer, the “platonic ideal” (Lamport 2012).

More interactive proofs are given by videos or even animations, which may even be altered by tools such as sliders. These give the users some kind of control over the way the proof is presented to them, but do not constitute a simulation proof in the sense described in “Mathematical Simulations”.

At the other end of the spectrum are fully interactive proofs in mathematical simulations, such as the ones presented in Ariadne. These are proofs in which the user has total control to alter the proof, albeit limited by the allowances of the simulation used.

Overview on representation modes of proofs

Several different types of environments in which proofs can be shown or done, such as digital environments or simply textbooks, are depicted in Fig. 3. The aim of this overview is to give a sense of the place mathematical simulations and with them simulation-based proofs take in comparison to other forms of proof-supporting environments. We will now give some more information about the objects referenced in the figure.

Fig. 3 Some proofs and environments for doing proofs sorted along the dimensions interactivity and formality, ranging from static to interactive and formal to informal, respectively. For examples of mechanical proofs, see Richard et al. (2019)

The standard type of proof is a text with symbols, sometimes accompanied by images for clarification or illustration purposes, such as in a standard textbook. Standard non-interactive proofs can, however, range from informal to formal. A very informal category are Proofs Without Words, where the image itself is the proof, only sometimes accompanied by text or symbols for clarification. Their status as to being a proof or representing an idea of a proof is not clear (Footnote 5).

A relatively new type of argumentation is the so-called “Proofs Without Words 2.0” (Doyle et al. 2014), which are animated versions of Proofs Without Words, sometimes giving the user some control over the animation. This places them at the same level of formality as traditional Proofs Without Words, but makes them more interactive, together with digital textbooks incorporating them such as Mathigon (Legner, https://mathigon.org/). Further up in the figure are environments that allow the user to manipulate the objects more freely and to come up with their own proofs, such as DGEs or other mathematical simulations. These programs are mostly informal in their representation, which is, however, no requirement for a mathematical simulation.

On the more formal side of the modes of representation is software such as Surface Evolver (Brakke 1992) or Jupyter Notebook (Kluyver et al. 2016). Surface Evolver is a program for performing operations on surfaces, such as the simulation of geometric flows, working with a text-based interface. Jupyter Notebook is a web-based interactive computational environment, making it possible to mix text with programming language outputs of, for example, Python, making “seamless the communication between human and machine” (Barba 2015).

An environment that is both highly interactive and formal is also possible, for example in the form of proof assistants such as Isabelle (Paulson n.d.) or Lurch (Carter and Monks 2013), or even general mathematical text editors, possibly giving another mode of distinction, which is, however, not the focus of this paper.

Transferability

While the idea behind a proof is in theory independent of its representation, it is nevertheless hard to do comparisons between proofs in different representation modes (Giaquinto 2020). It is even hard to compare proofs in the same representation: given two proofs of the same theorem, there is no formal notion of which proof is “simpler” (Cain 2019) (also compare Inglis and Aberdein’s (2014) discussion on the meaning of “simple” in mathematics). Also, not every proof can be represented in all modes of representation with the same ease, as many Proofs Without Words demonstrate; many would be challenging to formalize, and probably even lose their elegance.

There are, however, cases where the representation of a proof can be changed without changing the proof’s idea. We describe the degree to which this is possible and the amount of “mental work” necessary to do so by transferability. This concept is certainly related to the concept of cognitive unity (Boero et al. 1996; Pedemonte 2007), but does not consider the transition from a person’s argumentation to a formal proof, but the comparison of the products of proving processes in general. There is no question that, based on this vague definition, transferability cannot easily be quantified. It is nevertheless an important concept for understanding the role of proofs in different modes of representation. The dimension of transferability is relevant in mathematics as well as mathematics education, as the current de facto representation of choice for proofs is formal-symbolic and all other representation modes are measured by their alignment to this representation (Brunner and Reusser 2019).

Functions of Simulation Proofs

We follow de Villiers’ (1990) extension of a categorization by Bell (1976) to analyze the functions of simulation proofs. De Villiers highlights the functions of explanation, systematization, discovery, verification, and communication. We split the function of verification into relative and absolute conviction to emphasize the difference between believing that the statement has a high probability of being true, which is relative conviction, and believing that the statement is certainly true, which is referred to as absolute conviction (Weber and Mejia Ramos 2015).

We identify several criteria particular to simulation-based proofs that influence the relative conviction function: trust in the technology, level of detail, and limits of the representation. These criteria add to the established ones for proofs in general as well as perhaps also to the criteria surrounding computer-based proofs.

Explanation

Explaining why a result holds is one of the main motivations for a formal proof and goes beyond the relative conviction that it holds. That relative conviction can be achieved by, for example, a lack of counterexamples after some searching, is also one of the criticisms of the use of dynamic geometry in educational settings; by being able to provide a large, seemingly infinite, number of examples, students are so convinced of the result they should prove that they no longer feel the need to prove it (Bolite Frant and Rabello de Castro 2000). In the framework of Harel (2013), the students draw certainty from an “inductive proof scheme” instead of the educationally desired “transformational proof scheme”.

On the other hand, numerous Proofs Without Words give examples of visual arguments explaining why a result holds. This shows that the capacity of a proof for explanation does not necessarily depend on the representation. “Proofs Without Words 2.0” denote animated versions of such proofs (Doyle et al. 2014), which can certainly be seen as being something between static visual proofs and simulation-based proofs, and retain the explanatory capacity of Proofs Without Words while being more interactive.

A mathematical simulation goes even further, giving the user more freedom to explore the phenomenon being reviewed and thus more opportunities to find an explanation. There are many examples of explanations using dynamic geometry software, which show that the problem addressed by the criticism above should be directed more to the overemphasis of the relative conviction aspect of proofs in education (Hanna 1998), which leads to the question of “why” being overshadowed and thus neglected by students.

Relative Conviction

While the notion of “relative conviction” was only put forward in Weber and Mejia Ramos (2015), the connection between exploration of mathematical situations using technology and conviction through quasi-empirical testing of the truth of a result is a well-explored concept (see the examples and references in the introduction). It is the main reason for the use of simulations such as Mathematica to check statements by computation in research, and the employment of dynamic geometry software in education. In such a setting, verification is often, especially in lower grades, understood as quasi-empirical verification providing relative conviction (Hanna 1998), as opposed to a deductive proof, which would provide absolute conviction. We believe that simulation proofs can go beyond this kind of relative conviction and also provide absolute conviction by being more than just a collection of examples, however compelling, but providing insights and explanations as to why a result holds.

Absolute Conviction

This aspect denotes not only the absolute conviction of the truth of the result, but also the conviction of the validity of the proof (which certainly implies the former). As there is no consensus on a definition of proof in mathematical practice, there are no generally accepted criteria for its validity (Hanna and Jahnke 1996). Proofs have evolved through history and span a wide variety of types, and while there are some techniques of proof accepted by most, such as mathematical induction or reductio ad absurdum, there is no one proof type that fulfills all needs and demands of every mathematician (McAllister 2005). Thus, the question of the validity of a proof is a deeply subjective one, influenced by the ever-changing norms of the mathematical community (Sommerhoff and Ufer 2019). This sort of change of norms is not unique to mathematics: because quantum theory offered no visualizations, some physicists did not accept it, until overwhelming evidence forced physicists to reshape their criteria for theory acceptance and abandon the need for visual representations of phenomena (McAllister 2005).

However, different criteria can be identified that may influence this absolute conviction to a varying extent. We propose that these be grouped into the following categories:

  1. Mathematical-logical reasons, the personal understanding of the theorem, and an a priori judgment of its validity; the logical consistency of the proof.

  2. Socio-cultural reasons, such as the trust in the author or in the mathematical community that examines the proof.

  3. Technological reasons.

The first two categories contain established criteria for proof acceptance (Sommerhoff and Ufer 2019; Yackel and Cobb 1996; Hilbert 1931) and are not the focus of this paper.

The third category of technological reasons is of particular importance for proofs in the context of computers, including mathematical simulations. Several criteria of this category can be identified.

Criterium 1: Trust in the Technology

If a proof involves the work of a computer, then trust in the computer may be a factor in accepting the proof. This is a multi-faceted aspect, which encompasses several subcategories.

It can mean trust in the operations performed by a computer, as in computer-assisted proofs. This can be elaborated further: the original Appel-Haken proof of the Four-Color Theorem, for example, required trust not only in the operations executed, but also in the validity of the computer program generating the proof itself. The latter requirement was eliminated only decades later by a formal proof in Coq (The Coq Development Team 2019), which reduced the required trust to the Coq system and the computer operations (Gonthier 2008).

The main problem with computer-based proofs, such as computer-assisted proofs, is often their length, which makes them non-surveyable, that is inaccessible for verification to the mathematical community (Tymoczko 1979). But even if a proof seems surveyable, its acceptance by the recipient can still require trust in the technology, as the following example demonstrates.

Users who drag a triangle in a DGE most probably expect that only the position of the triangle changes, not its area or other intrinsic features. Each time they perform the dragging operation, they can visually inspect whether the triangle has changed, strengthening their trust that dragging is a faithful representation of the translation they have in mind. They could further strengthen this trust by looking at the source code of the application, if accessible.
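The visual inspection described above can be complemented by a computational check. The following minimal Python sketch is not taken from any DGE; the names triangle_area and translate, as well as the sample coordinates, are our own illustrative assumptions. It models dragging as a translation and confirms, in exact rational arithmetic, that the area of one sample triangle is unchanged.

```python
from fractions import Fraction


def triangle_area(a, b, c):
    """Area of the triangle abc, computed with the shoelace formula."""
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    return abs(cross) / 2


def translate(point, dx, dy):
    """Hypothetical model of 'dragging': shift a point by the vector (dx, dy)."""
    return (point[0] + dx, point[1] + dy)


# A sample triangle with exact rational coordinates (illustrative values).
triangle = [
    (Fraction(0), Fraction(0)),
    (Fraction(4), Fraction(0)),
    (Fraction(1), Fraction(3)),
]

area_before = triangle_area(*triangle)

# Drag the whole triangle by the vector (7/2, -2).
dragged = [translate(p, Fraction(7, 2), Fraction(-2)) for p in triangle]
area_after = triangle_area(*dragged)

print(area_before, area_after)
```

Such a check strengthens trust for the tested triangle and the tested drag vector only. Like the visual inspection, it therefore remains quasi-empirical and does not by itself show that the software represents every translation faithfully.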

At some points, the users will check against their expectations and mathematical knowledge whether the simulation is indeed a simulation, that is, whether the mathematics is accurately represented. At other times, they will trust the software to be honest with them, and will adjust their expectations and mathematical knowledge according to their interpretation of the visualization given by the software.

Criterium 2: Level of detail

In a traditional paper-and-pencil proof, every argument can be broken down further, until the steps performed are small enough to be understood by the recipient or the axioms are reached. This strengthens the trust of the users in a proof, as they have the certainty that they could, in theory, probe every part of it.

In a digital environment, such as a mathematical simulation, the level of detail of a proof is limited by the resources the software provides. Besides interfering with the users’ trust, and with it their conviction in the proof, this massively inhibits the users’ creativity: unlike the software designer, they are not able to construct their own objects independently of the software’s functionality. Hopefully, future simulations can overcome this limitation.

A prime example is the “winding number tool” in Ariadne, which provides a way of computing the winding number of a path around a point. By doing this in one particular way, it discourages users not only from devising their own way of computing the winding number of a path, but also from deciding which invariant of a path to choose in the first place. Therefore, proving in such an environment always means proving within these constraints.
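To illustrate that a tool necessarily commits to one particular way of computing this invariant, the following Python sketch shows one possible method: summing the signed angles under which the segments of a closed polygonal path are seen from the point. This is an assumption made for illustration and not a description of how Ariadne computes winding numbers; the function name winding_number and the restriction to polygonal paths are our own choices.

```python
import math


def winding_number(path, point):
    """Winding number of a closed polygonal path around a point.

    `path` is a list of (x, y) vertices; the last vertex is joined back
    to the first. The point must not lie on the path.
    """
    px, py = point
    total_angle = 0.0
    n = len(path)
    for i in range(n):
        # Vectors from the point to two consecutive vertices of the path.
        x0, y0 = path[i][0] - px, path[i][1] - py
        x1, y1 = path[(i + 1) % n][0] - px, path[(i + 1) % n][1] - py
        # Signed angle between the two vectors, in the interval (-pi, pi].
        total_angle += math.atan2(x0 * y1 - x1 * y0, x0 * x1 + y0 * y1)
    return round(total_angle / (2 * math.pi))


# A square traversed counterclockwise (illustrative data).
square = [(1, 1), (-1, 1), (-1, -1), (1, -1)]

print(winding_number(square, (0, 0)))        # point inside the square
print(winding_number(square[::-1], (0, 0)))  # same square, clockwise
print(winding_number(square + square, (0, 0)))  # square traversed twice
print(winding_number(square, (3, 0)))        # point outside the square
```

A user could equally well count signed crossings of the path with a ray starting at the point. A tool that offers only one of these methods has already made this choice on the user's behalf.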

Criterium 3: Limits of the representation

Every representation mode has its limits, which are independent of the technology used. This is a problem in some types of proof such as visual Proofs Without Words that can hamper proof acceptance (Bardelle 2010). This factor is magnified in dynamic proofs in digital environments, as the representations of mathematical objects here are not directly generated by the prover, i.e. the mathematician. They are thus even more susceptible to error, as the possibility of a simulation going beyond the capabilities of its representation always exists (compare “Errors in the Interaction Process with the MS”). The users know this, which can impede their relative conviction.

Instances of these limits are found readily, for example given a mathematical simulation plotting a function with a removable singularity such as \( f:\mathbb{R}\setminus \{2\}\rightarrow \mathbb{R},\ x\mapsto x^{2} \). A programmer would have had to think of a way to highlight this singularity, for example by drawing a small circle at (2,4), as no resolution would be sufficiently fine to show its existence. In such a way, the programmer must have thought about the representation of every possible function that might be plotted by the software.
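The following Python sketch makes this limit concrete under simplifying assumptions. It compares the function above, which is undefined at 2, with the function g that squares every real number, on a sampling grid that happens to miss the point 2. The grid, the names g and hole_marker, and the step size h are hypothetical choices made for illustration, not features of any particular plotting software.

```python
def f(x):
    """The squaring function on the real numbers without the point 2."""
    if x == 2:
        raise ValueError("f is undefined at x = 2")
    return x * x


def g(x):
    """The squaring function on all real numbers (no singularity)."""
    return x * x


# A sampling grid on [0, 3] with 8 points; none of them equals 2.
xs = [3 * i / 7 for i in range(8)]

samples_f = [(x, f(x)) for x in xs]
samples_g = [(x, g(x)) for x in xs]

# The sampled data of f and g coincide, so a plot drawn from the samples
# alone cannot show that f has a hole.
indistinguishable = samples_f == samples_g


def hole_marker(func, x0, h=1e-6):
    """Estimate where to draw the open circle for a removable singularity.

    The position x0 must be supplied explicitly; the height is estimated
    as the mean of the function values just left and right of x0.
    """
    return (x0, (func(x0 - h) + func(x0 + h)) / 2)


marker = hole_marker(f, 2)
print(indistinguishable, marker)
```

The marker lies approximately at (2, 4), but only because the position of the singularity was passed in by hand. This is exactly the kind of case the programmer has to anticipate in advance.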

The more general and powerful the simulation, the more cases have to be covered beforehand. In the research-level proving process, one is certainly in the realm of new mathematical objects and connections between them, which makes it all but impossible to construct the environment to account for all possible cases. Herein lies the special challenge for the designers of such mathematical simulations, but also for the user interacting with the software, who has to take this type of limitation into account.

Systematization

Systematization describes the organization of several results into a (deductive) system. This is certainly made easier by using the standardized formal Bourbaki-style notation ubiquitous in mathematics today, allowing the description of results from different fields in the same language.

For a specialized piece of software such as a simulation designed to simulate a certain part of mathematics, the systematization of results from different fields is certainly hard to achieve. If the simulation uses visualization as its mode of representation, this may add to this problem, as at least today, generality and power of a simulation stand opposed to its intuitiveness and informality. This means that a visual simulation often tries to incorporate an informal and intuitive interface, which then limits its generality. Hopefully, future simulations will be able to overcome this limitation.

One could say that systematization is thus a weak point of a mathematical simulation.

Local systematization, on the other hand, is very much possible, as the definitions, lemmas, and theorems of a single (sub-)field are more homogeneous in their representation. An example is a function plotter, which can very well classify different functions such as polynomials by their coefficients, or Ariadne, which systematizes paths in the plane by their homotopy classes and shows the relation between theorems on these objects.

Discovery

Discovery of new results is certainly not limited to proofs, but is an aspect relevant to proofs. The historical example of sphere eversions illustrates a discovery process in mathematics.

The eversion of the sphere is a regular homotopy turning \( \mathbb {S}^{2} \) in \( \mathbb {R}^{3} \) inside out. Constructions of such eversions were contrived by, for example, Shapiro (Levy and Thurston 1995), Morin (turned into a video by Max 1977), and Thurston (featured in a movie (Levy et al. 1994)).

Visualization in itself can bring new ideas into mathematical research (Bartzos et al. 2018). Now imagine that software had been available to these mathematicians that made it possible to deform the manifolds shown in the videos by directly controlling their manipulation. It is quite possible that the construction of such an eversion would have been more accessible. This is even more plausible as the discovery of the later eversions correlates with the expanded use of visualization software.

In education, one of the main uses of dynamic geometry software is to facilitate the forming of conjectures by students who analyze specific mathematical settings given by the teacher (Mogetta et al. 1999). This is certainly a part of the discovery aspect of proofs; by investigating arguments for the validity of statements, new statements are conjectured. Furthermore, there is no reason to believe that this feature is only of use in education: given software capable of representing objects of interest to research-level mathematicians, it would certainly be used to discover new results.

Communication

It is a central function of a proof to communicate mathematical ideas. In a simulation, this function encompasses both the communication with the mathematical community and the communication with the software itself. In both situations, the software provides a medium to express thought, which is strongly dependent on the representation form of the proof.

The most common form of communicating proofs is, by and large, a static and formal representation. As Mazur (2014) puts it, “As with almost all advances, something was lost,” referring to the loss of “the public” (i.e. non-mathematicians) through this “code, unintelligible for the uninitiated.” However, Victor (2014) argues that even more was lost: the use of formal notation restricts many human capabilities, forcing humans to use only a part of their cognitive abilities as they are confined to sitting in front of a small screen or piece of paper, manipulating it indirectly with a pen or a mouse and creating static objects.

Simulation-based proofs may create the possibility to externalize thoughts in a way closer to the way we think. Touch- or gesture-based interfaces can express argumentation in a more embodied fashion, reacting dynamically to human input (Abrahamson and Bakker 2016). While this may also be seen as an economical argument as such an interface may allow “more” work to be done, the idea is to achieve more humane means of communication.

Mathematical simulations also allow, via the internet, a collaborative form of proving in which several users manipulate objects simultaneously (Borba et al. 2017). While collaboration has been a standard way of working in mathematics for a long time, it can now be made independent of physical constraints.

A mode of representation often chosen by simulations is a visual one, which has several influencing factors. Visualizations have a long history in mathematics, and in the acceptance of theories in general, having been seen as a prerequisite or at least a necessary component of a “proof” (von Fritz 1955). Also, visual representation modes of mathematics may be “closer” to the way we think, as mathematical thinking more often deals with images than with formulas, at least in some areas of mathematics (Hadamard 1954). Hopefully, future simulations can unite the advantages of visual embodied communication with the power, precision, and universality of formal-symbolic representation modes.

Conclusion

This article points out the role mathematical simulations can play in the context of proving. As this connection is not yet well explored and as proving is a core activity in mathematics, this has far-reaching implications for mathematics educational practice and research, as well as for mathematics research itself.

These implications require, to some extent, that the presented simulation-based proofs can indeed be regarded as proofs. This issue can surely not be resolved in this general formulation, but will depend on the same criteria other proofs need to fulfill as well, such as their exact form and implementation, and the context of their use.

Implications for Mathematics Education Practice and Research

The development of new technologies and the programming of new software are changing the educational landscape. Decades ago, calculators replaced slide rules and logarithm tables in classrooms. Years ago, DGE and CAS revolutionized teaching. Their impact on proof has been to provide a preliminary exploratory and quasi-empirical step that increases relative conviction regarding a conjecture, without influencing the actual proof, the content and form of which have not changed. Now, with more mathematical simulations such as Ariadne appearing, the form of the actual proof is challenged. This makes it all the more important to do research in this area.

As simulation-based proofs can fulfill all functions of proof, and can do so without the at times inhibiting factor of formulaic representation, they are of particular interest for educators in undergraduate courses. Here, the restrictions of creativity through design choices of the software (compare “Criterium 2: Level of detail”) can even be thought of as a feature, as such restriction can constitute an aid in doing proofs by narrowing the number of choices available. Note that this paper does not address the question of whether simulation-based proofs are to be considered “real” proofs, in the sense of being a perfect substitute for traditional symbolic proofs; nor is such substitution their purpose. Furthermore, such a discussion cannot be settled by theoretical debate, as the status of proofs depends on the norms of the community (compare “The Status of Proof in Mathematics and Mathematics Education”). Such proofs can rather be a gateway into proving, giving an alternative access to proofs in a non-formal, highly interactive setting. They may guide learners on their transition from mental argumentations in the sense of Mamona-Downs and Downs (2010) to formal proofs by giving them an appropriate environment to project their thoughts upon, maybe with the guiding rails of the affordances of the environment.

Examples may be the introduction of the concept of continuity through a suitable mathematical simulation software, for example giving a visual representation of the 𝜖-δ definition of continuity, maybe along the lines of Fig. 4. This software might then be used to prove the continuity or non-continuity of some functions. However, as described in “Mathematical Simulations” and “Discovery”, the software must be powerful enough to allow real discovery by being more than just an animation, but a mathematical simulation.

Fig. 4 A GeoGebra applet letting the user explore the uniform continuity of some functions (Dikovic 2017)
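As a textual counterpart to such an applet, the following Python sketch tests the 𝜖-δ condition for continuity at a point on finitely many sample points. The names sampled_check and delta_for_square, the sample count, and the test values are our own illustrative assumptions and are not taken from the applet in Fig. 4. For the squaring function, the choice δ = min(1, 𝜖/(2|x₀|+1)) is justified by the estimate |x² − x₀²| = |x − x₀||x + x₀| < δ(2|x₀| + 1) ≤ 𝜖 for |x − x₀| < δ ≤ 1.

```python
def delta_for_square(x0, eps):
    """A delta that provably works for the squaring function at x0."""
    return min(1.0, eps / (2 * abs(x0) + 1))


def sampled_check(func, x0, eps, delta, samples=1000):
    """Test |func(x) - func(x0)| < eps on sample points with 0 < |x - x0| < delta.

    Returns False as soon as a sampled counterexample is found. A result
    of True only means that no counterexample was found among the samples.
    """
    for i in range(1, samples):
        offset = delta * i / samples  # strictly between 0 and delta
        for x in (x0 - offset, x0 + offset):
            if abs(func(x) - func(x0)) >= eps:
                return False
    return True


def square(x):
    return x * x


def step(x):
    """A function with a jump at 0 (not continuous there)."""
    return 0.0 if x < 0 else 1.0


good = sampled_check(square, 3.0, 0.1, delta_for_square(3.0, 0.1))
too_large = sampled_check(square, 3.0, 0.1, 1.0)
jump = sampled_check(step, 0.0, 0.5, 1e-9)
print(good, too_large, jump)
```

The sampled test alone yields only relative conviction, since it inspects finitely many points. The absolute conviction for the squaring function comes from the estimate stated above, which holds for every x in the δ-neighborhood.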

There may be an obstruction to the implementation of simulation-based proofs in educational settings: the absence, at times, of transferability or “parallelism” (Miller 2012) to formal representation also implies a deficiency of connectivity with further study or even other fields of mathematics. In an educational setting, visual proofs or arguments are mostly made to support a formal proof, which currently is the gold standard in mathematics, and to aid in its understanding. A visual proof thus has to be transferable into a formal proof to a certain degree, which may not always be possible (compare “Transferability”). It is, however, possible that, with future development in mathematics visualization, this relation can be made more balanced or even reversed: a proof would first be provided visually and then translated into a formal proof as an addition, for example for automated verification purposes.

Following the distinction into “learning to argue” and “arguing to learn” (Baker et al. 2019), simulation-based proofs may not only be used to learn mathematics by proving in a mathematical simulation, but also for inciting a discussion with students on the nature of proof. The debate on whether, or to which extent, the arguments made in a simulation constitute a proof, or how visual arguments relate to proofs in general in the sense of the above discussion, can be used to foster the understanding of the concept of proof.

The consideration of simulation proofs also has implications for mathematics education research. Approaching the subject from the students’ point of view, researchers investigating ways of teaching proofs or the learning of proofs may consider the use of simulations for either purpose. The implications of their use might then be assessed for their influence on beliefs about the nature of proof. From the technological side, researchers working on the assessment or development of learning software might be driven to consider existing software for use in the learning and teaching of proofs. For educational designers, requisites for the design of such environments must also be established.

Implications for Mathematics Research

Visual proofs are playing an ever larger role in research (Bartzos et al. 2018). As the possibilities of representing mathematics in computer environments as well as the possibilities of interacting with computer-generated content continue to expand, this trend will surely continue. We believe that simulation-based proofs may be a bridge between traditional proofs and experimental mathematics, combining the deductiveness of the former and the explorational capacities of the latter.

As this is highly dependent on available software, which is arguably harder to develop for research-level mathematics than for undergraduate or school mathematics in general, the long-term developments in this area are hard to predict.

Outlook

While simulation-based proofs can in principle already fulfill all functions of proofs, we are still at the beginning of the digital age, so many more changes are to be expected. Further advances both in technology and in representations will hopefully lead to simulations that are vastly more powerful while still being intuitive to use, realizing many of the features outlined in this article. This will undoubtedly lead to a shift in the use of technology away from purely exploratory capacities to other areas of mathematics practice. Research is and will be needed to understand which areas these are, such as proofs or problem posing and solving, and how they are affected.

As remarked in “Absolute Conviction”, proof is not a static concept, but shaped by the community. It would be interesting to examine the opinions of research mathematicians as well as educators on the status of simulation proofs as acceptable proofs, in mathematics as a scientific discipline and as an item in the curriculum, and likely also identify further factors influencing their opinion.