1 Introduction

The purpose of this paper is twofold: the first aim is to deepen the understanding of how research co-exists with governance policy in the process of preparing innovations in mathematics education; the second is to provide an example of how historical comparisons can be a fruitful method in implementation research. The paper compares three Swedish development projects in mathematics that included innovations based on research prepared under different circumstances with respect to policies of school governance. All three projects were driven by central school authorities. The first project, the Swedish New Math project, was launched in the 1960s to prepare for the 1969 curriculum. The New Math project was conceived in a time when the overall trend was to centralise school governance, a trend that peaked in the late 1960s and early 1970s (Prytz, 2018). The second project, PUMP, was part of the decentralisation trend that started in the early 1970s, which would continue for almost 40 years (Sect. 4). PUMP, a reaction to the ideas of New Math, also entailed another way to develop innovations, a focus that becomes apparent in this paper. The third project, the Boost project, was launched in the early 2010s when the Swedish school system had undergone several reforms aimed at decentralisation, moving decisions from national to local school boards as well as from school boards to teachers and parents (Sect. 4). Thus, this paper covers three development projects in three contexts of school governance policy.

My main research question is the following: in what way did the process of preparing the innovations based on research co-vary with the shift from centralisation to decentralisation?

My analysis of the process for preparing innovations based on research has been guided by four sub-questions:

  1. What was the innovation?

  2. What was the aim of the innovation?

  3. What methods and theories were used?

  4. What was the role of the researchers?

Apart from the introduction and the sections on previous research and on theory and method, the paper comprises five more sections. There is a section on Swedish school governance, which is followed by three sections on the development projects, one section per project. In these sections, I answer the four sub-questions. In the final section, I answer the main question, present my conclusions, and discuss how my findings add new insights to previous research.

2 Previous research

The background of this ZDM issue is that implementation of mathematics education research is an emerging area of knowledge. Consequently, there is a need for new research methods. One way forward is to use historical comparisons as these can give unique opportunities to discover features of contemporary phenomena, which are difficult to discern just by analysing contemporary sources (cf. Tosh, 2000). In this paper, I show how historical comparisons can be relevant for mathematics education and implementation research (IR). However, the relevance of such comparisons goes beyond mathematics education. Based on an international overview on IR by Century and Cassata (2016), historical comparisons of development projects appear to be very rare in IR in general. To show that historical comparisons can be relevant in the field of implementation of mathematics education research, I discuss how my findings bring new insights to previous studies, Swedish as well as non-Swedish.

The overview by Century and Cassata (2016, pp. 185–186) identifies different organisational and environmental factors that influence implementation processes. One type of factor is organisational culture. In my view, policies of governance and research are part of an organisational culture. However, the nature of the relationship between these two types of policies, and how this relationship has changed over a longer period of time, are not addressed by Century and Cassata (2016). These issues are addressed in this paper by an analysis of how researchers in different development projects have operated according to these policies.

There are, nonetheless, studies in mathematics education that, to varying degrees, concern the role of researchers in development and reform processes. These studies reach different conclusions on the role of the researcher. For example, Burkhardt and Schoenfeld (2003, p. 5), in a policy paper on how mathematics education research should progress, stressed the need for robust mechanisms that take ideas from research (they use the word “laboratory”) and put them to wider use in teaching practice. They represent a view in which researchers and teachers are separated: researchers generate innovations outside school that teachers apply in school. However, Burkhardt and Schoenfeld (2003, pp. 7–9) advocated a closer connection between researchers, practices, and development groups. Krainer and Zehetmeier (2013, p. 884), in a reflective paper on an Austrian nationwide development project, sought to overcome this separation as they supported the idea of intervention research that brings together researchers and teachers. Krainer (2014, p. 56) took this idea further and argued that teachers should produce knowledge and researchers should support teachers’ roles in the production of knowledge. In this perspective, researchers and teachers are co-creators of innovations. Similarly, Potari et al. (2019) highlighted the co-creator dimension in their study of how spheres of research, policy, and teaching interacted in the development of an innovative national mathematics curriculum in Greece, a process that contained conflicts. Potari et al. (2019, pp. 431–433) stressed the significance of people moving between the spheres, functioning as brokers between divergent standpoints, and making the work go forward.

A different relation between researchers and teachers was reported by Ryve and Hemmi (2019). For several years (2012–2017), they managed a development project (400 teachers in total) in one Swedish municipality. Although Ryve and Hemmi were attentive to teachers’ needs, not least their autonomous position in the educational system, they developed a reform agenda with the purpose of challenging and changing the teachers’ behaviour. The agenda was based on questionnaires, student exams, textbook analysis, and interviews. For example, they wanted teachers to be active rather than reactive in the classroom. The agenda was not very detailed, which was a conscious choice that was intended to avoid collision with the culture of teacher autonomy. As I see it, the teachers in this case were not co-creators of the agenda but rather informants. Moreover, Ryve and Hemmi were in a position in which they were supposed to steer the teachers. This position was different from the roles of researchers in the contemporary Boost project, who had a much less steering role (Sect. 7).

A common feature of the above-mentioned studies is that they do not address how and why researchers are given certain roles in the projects. Burkhardt and Schoenfeld (2003, p. 13) touched on the issue as they argued that educational research is too weak, and a “symptom” of this weakness is that politicians as well as academics in other disciplines can tell educational researchers what to do, which is a type of organisational culture. However, Burkhardt and Schoenfeld (2003) did not specify what it means to tell people what to do. It is important to note that Burkhardt and Schoenfeld (2003) addressed other questions, concerning why robust mechanisms that move research innovations into practice have not been developed.

This paper pursues the thought expressed by Burkhardt and Schoenfeld (2003), and examines the relationship between politics and educational research. More precisely, I examine the relationship between the role of researchers in development projects and policies of governance. I have no ambition to give a definite answer to the question of why researchers get certain roles. In the final section, I discuss instead how my findings can contribute to such answers.

Finally, in this section I include some words about previous research on the projects studied in this paper. The Boost project was studied by Boesen et al. (2015) and the New Math project was studied by Prytz and Karlberg (2016) and Prytz (2017, 2018). Boesen et al. (2015) examined reports about the design and preparations of the Boost project and analysed how claims were based on educational research. Prytz and Karlberg (2016) studied the preparations for the New Math reform, while Prytz (2018) explained in what respect the reform was a failure and why it failed. Prytz (2017) compared the New Math project with earlier attempts to promote change in Swedish school mathematics. I return to the work of Boesen et al. (2015) in the final sections, since my findings are a bit different. The results of Prytz and Karlberg (2016) and Prytz (2017, 2018) are referred to in this paper, but they are not discussed further since their conclusions are not challenged.

3 Theory and method

The three projects studied were different, for example with respect to the number of teachers involved, the mathematical content, the connections to curriculum reforms, and the nature of the innovations. In addition, the processes of creating and implementing innovations were complex and several aspects can be considered when making comparisons. Therefore, the study requires limitations. These limitations are based on Century and Cassata’s (2016) IR theory and Bray et al.’s (2007) basic ideas concerning comparative research.

Century and Cassata’s (2016) IR theory was used to map the content of the analysis. On the basis of this map, I formulated the purpose and questions to consider. Century and Cassata (2016) defined IR as “the systematic inquiry of innovations enacted in controlled settings or in ordinary practice, the factors that influence innovation enactment and relationships between innovations, influential factors, and outcomes” (p. 181). Thus, innovation is a key object of study. In this paper, I study what constituted the innovation as an object, conceptually as well as physically. For example, an innovation can be a set of pedagogic principles, a textbook design, a textbook, or an assessment tool. An innovation can also be a combination of these things such as a textbook based on pedagogic principles. I have also considered the aim of the innovation, as existing objects can be used to achieve new aims in teaching; when this is the case, the innovative feature lies in the aims and not in the objects.

Apart from innovation, Century and Cassata (2016, pp. 181–188) considered types of factors that influence the implementation process and its outcome. These are characteristics of the individual users, organisational and environmental factors, attributes of the innovation, implementation support strategies, and implementation over time. I have chosen to focus on organisational and environmental factors, more precisely, on policies of governance, scientific policies, and the role of the researchers.

The sub-questions were answered by analysing reports and governmental decisions about the development projects. The reports and decisions were used mainly as descriptive sources. That is, when the sources acknowledged the innovative elements or aims of a project, I counted these as the innovation and the aim. On the basis of such statements, I answered sub-questions 1 and 2. As to sub-questions 3 and 4, I considered explicit descriptions of the theories and methods used, and how the researchers generated innovations. However, I also considered how theories, results of trials, and research literature were used in the reports to justify conclusions related to the innovations and the role of the researchers.

As to the main question, I used the answers to sub-questions 1–4. On the basis of these answers, I looked for what is stable and what is changing across the three projects.

To make comparisons, there need to be common units (Bray et al. 2007). In this study, the common units were the researchers and their actions as they developed the innovations. Despite the projects’ many differences, all the projects involved researchers, and the sources contain many descriptions of the processes of creating the innovations and what the researchers were doing. Finally, the main question also concerns policy of school governance. This part of my analysis relies on previous studies and well-established results about Swedish school governance.

4 School governance in Sweden, 1900–2020

The standard narrative about Swedish school governance in the twentieth century includes two basic movements: from 1900 to the mid-1970s the school system was centralised, and from the mid-1970s to the 2000s the system was decentralised. The notion of governance here includes economic, judicial, and ideological governance. Centralisation refers to a process where decision making is moved from local authorities to central authorities. The latter authorities were often national school authorities, but centralisation was also a matter of merging municipalities and creating larger schools and larger local school authorities, especially in the 1960s and 1970s. Decentralisation refers to the opposite process, but it did not involve dissolving the larger municipalities. The two processes were slow and gradual and involved a number of reforms and other changes. There were also deviations from these two basic movements. For example, some decisions were delegated from the state to the larger local school authorities already in the 1960s. For a more detailed overview, see the report by Prytz and Ringarp (2020).

As to decentralisation, a basic principle was governance by goals. In a school context, this principle meant that politicians and school administrators on the national level formulated only goals for what students should learn; there should be no guidelines about how to teach. The teachers alone were responsible for developing the teaching methods. And teaching methods were judged by how well they fulfilled the national goals. The 1980 curriculum was a small step in that direction, and the 1994 curriculum was a greater and more decisive step. Interestingly, the 1994 curriculum was accompanied by market-oriented reforms related to other parts of the school system that were supposed to improve teaching. Students and parents were allowed to choose a school, but all schools were free of charge, private as well as municipal. All costs were paid by the municipality via a school voucher. This arrangement meant that the schools had to compete for students. The private schools were even allowed to make a profit. This competitive element was supposed to improve teaching methods and ultimately student performance (cf. Ringarp et al. 2015, p. 82; Telhaug et al. 2006, pp. 268–271).

After 2000, there was a partial turn towards centralisation, which included increasing the number of national examinations. By the time of the new curriculum in 2011, national mathematics examinations were given to students in the years when national goals were to be evaluated, namely, years 3, 6, and 9. However, the policy of management by goals was not abandoned; the curriculum continued to have no guidelines concerning teaching methods.

It is important to note that these movements between centralisation and decentralisation were products of governance policies. They did not happen by accident.

The governance of school mathematics in the twentieth century fits this general narrative about movements between centralisation and decentralisation. What follows are the steps towards centralisation. Between 1900 and 1969, the national formal curricula successively received more details as to what to teach and what teaching methods to use. In the late 1930s, a national textbook review was established to approve all textbooks. In the 1940s, national examinations (standardprov) in the primary schools (years 1–7) were introduced. Since the nineteenth century, secondary schools were required to give national examinations. In the 1960s, the central school authorities launched two major development projects with the aim of improving teaching in mathematics, and one of these was the New Math project. (For a comprehensive account of governance of Swedish school mathematics between 1910 and 1980, see Prytz, 2017.)

In 1969, a new mathematics curriculum based on New Math was launched, which included new pedagogical principles on how to introduce and explain concepts. Thus, it was not just a matter of adding new content. Moreover, all textbooks had to be approved by the textbook review committee and all textbooks seem to have adapted to the new principles in the curriculum. (For a more detailed account of the Swedish New Math reform, see Prytz, 2018.)

As to the move towards decentralisation, an early change was the abandonment of central parts of the New Math reform as early as 1973. The central school authorities refrained from enforcing pedagogical principles of the New Math, leaving such decisions to teachers. In addition, in 1974, the textbook review in mathematics and some other subjects became voluntary. After these changes in governing policy, the supply of textbooks changed as more traditional textbooks in mathematics began to appear. (For more details, see Prytz, 2018.)

Further decentralisation steps were taken with the 1980 mathematics curriculum. Rather than prescribing what should be taught each year, the curriculum prescriptions concerned spans of 3 years (1–3, 4–6, and 7–9). The number of national exams (standardprov) was also reduced: from three in years 3, 6, and 9 to one in year 9. The 1994 mathematics curriculum continued according to this trend as it contained goals for years 5 and 9 only. These goals were formulated in a brief and general manner. Instead of the authorities prescribing details, teachers were expected to formulate local curricula. Moreover, unlike all previous curricula in the twentieth century, the 1994 curriculum did not contain methodological guidelines (Prytz, 2015, pp. 312–313). In addition, all types of textbook reviews disappeared in 1991. All these changes meant that decentralisation increased teacher autonomy. Therefore, the governance of school mathematics followed the overall narrative of Swedish school governance, especially for development of governance tools, including textbook reviews, curriculum designs, and national examinations.

However, if innovations are put in focus, a different picture emerges. As shown by Prytz (2017, pp. 63–64), between 1910 and 1960, the central school authorities refrained from initiating and driving innovations in school mathematics even though central governance tools were available. Instead, it was textbook authors who initiated and drove innovations, albeit on a relatively small scale. In the 1960s, this changed. In fact, the New Math project was the first attempt of the Swedish central school authorities to initiate and drive innovations in school mathematics through a major development project.

5 The Swedish New Math project

The Swedish New Math project (years 1–9) comprised two distinct phases: 8 years of preparations of a curriculum reform in the 1960s and the implementation of the curriculum reform beginning in 1969. Thus, innovations were not introduced little by little over several years, but all at once.

This paper focuses on the preparation phase, which contained two components. One component was to develop a new curriculum, which differed from the previous one on several points as it included many innovations. Due to the great number of innovations, it is not possible to go into details. I will just list the major innovations in the 1969 curriculum, where each innovation comprised smaller innovations concerning, for example, new concepts or explanations (cf. SÖ 1962, pp. 164–170; SÖ 1969, pp. 4–6). Innovations included the following:

  • New content as in completely new topics or sub-topics, for example, statistics from year 2, vectors from year 7, and trigonometric functions from year 9.

  • New content such as in old topics in earlier years, for example, equations and geometry from year 1.

  • New principles for teaching the content. It is here that set theory played an important role. When the students were to be taught something new (a concept, an algorithm, etc.), concepts and notations from set theory were to be used. Set theory was also meant to bridge topics. This entailed that set theory was an integral part of all other topics.

The second component was to develop teaching methods and a textbook design that could be used together with the new curriculum, both the new content and the new teaching principles. Actually, most of the preparations concerned this second component, to which I return below.

The official reports do not make it clear how the new types of textbooks should be disseminated to teachers; the idea seems to have been that during the development process new ideas and experiences should spread to both teachers and textbook authors and publishers (cf. NKMM 1967, p. 94). However, a governmental tool was available—i.e., the mandatory textbook review that checked compliance with the curriculum. How publishing companies dealt with this is not well known, but we know that some companies hired people involved in the New Math project and we know a great number of textbooks followed the new teaching principles (Prytz, 2018, p. 205).

As to the aim of the innovations related to New Math in Sweden, it was grand: mathematics education should change swiftly and thoroughly. In the final report, this aim was justified by references to broad changes in society (NKMM 1967, pp. 5–6):

  1. An increasing need of mathematics in working life, research, and development. This need was linked to the fact that mathematics was an important tool in science and technology and had become increasingly important in economics and social sciences.

  2. An increasing need of mathematics as a part of civic education. A modern and highly developed society demanded much of its citizens in terms of mathematics, as information was presented quantitatively and urbanisation and automatization required citizens to use mathematics in their everyday lives.

  3. This increasing use of mathematics was related to advances in the scientific discipline of mathematics, most often relatively recent advances.

  4. School mathematics lagged behind the advances in the scientific discipline and its applications.

  5. Experiments indicated that teaching should focus on insight and understanding rather than mechanical skills. This idea was linked to psychological theories developed by Jean Piaget (1896–1980) and Jerome Bruner (1915–2016). Thus, changes in Swedish school mathematics were also justified by advances in the academic discipline of psychology.

Therefore, innovations related to New Math were included in a program that not only should change Swedish school mathematics in a fundamental way, but also solve pressing societal problems.

An important feature of this program was that the development of all innovations was based on one single broad theory about learning and teaching mathematics, which in turn was based on the works of Piaget and Bruner. The basic idea was that children, even young ones, possess cognitive structures that resemble mathematical structures. This knowledge should be used to facilitate better teaching. More precisely, a stronger focus on structures in the teaching would give better understanding, which in turn would improve students’ learning (cf. Bjarnadottír 2014, p. 451). It is important to note the hypothetical nature of this theory; in the early 1960s, it was not clear how the theory should be applied to all topics and to all years from 1 to 9. This lack of clarity is evident in one of the major international reports on New Math, published in 1961. One of the report’s final recommendations was to engage in extensive development programs for teaching methods and textbooks (OECD 1961, pp. 123–125).

In Sweden, this meant 6 years of development, starting in 1961. The New Math project was a joint project of the Nordic countries Sweden, Finland, Norway, and Denmark. The development process had clear scientific features, especially in the development of textbooks. The textbooks were created by a team of authors pursuing one single theory on the teaching and learning of mathematics. The teaching with the new textbooks was then tried systematically in a total of 542 school classes in years 1–9. The teachers submitted standardised reports about their efforts. Finally, the textbooks and the teaching were evaluated by comparing test results from the experimental classes and control classes. The experimental classes followed the New Math curriculum and used the New Math textbooks for 2 or 3 years, whereas the control classes followed the current curriculum and used traditional textbooks (NKMM 1967, pp. 94–97). The tests were given for years 3, 6, 8, and 9. In general, there were only small differences between experimental classes and the control classes (see Prytz & Karlberg, 2016, for a detailed analysis of the results). There was one exception: high achievers in year 9 performed better with the traditional curriculum. These results indicated that the New Math curriculum and textbooks could be used in teaching; they were not better than their traditional counterparts, but they worked. (For further arguments concerning this standpoint, see Prytz & Karlberg, 2016, pp. 87–89.)

As to the role of the researchers in the project, about 30 people from all the Nordic countries were involved in the development of the curriculum and the textbooks. These people included the leaders of the project, authors of textbooks (eight Swedes), people handling the collected material, and people performing the quantitative analysis (NKMM 1967, pp. 220–222). I consider these people the researchers in the development process. The researchers conformed to one single theory about teaching and learning mathematics. Collectively, they created and evaluated an innovation in a scientific manner. After the evaluation, the innovation was revised.

6 The PUMP project

PUMP was a research and development project running between 1973 and 1977. It was organised by the Department of Education at Gothenburg University and financed by the National Board of Education (Skolöverstyrelsen, SÖ) (Kilborn, 1979, p. 5). PUMP’s primary aim was to develop an instrument to help teachers make pedagogical decisions both in a short- and long-term perspective (Kilborn & Lundgren, 1973, p. 2). A key component of the instrument was the diagnostic material for arithmetic. This material was also used by the researchers to study the teaching of arithmetic in Swedish schools and several textbook series. The researchers also analysed the development of the current curriculum—i.e., the New Math curriculum (Kilborn, 1979, p. 5). The abbreviation PUMP can be translated as the process analysis of teaching in mathematics/psycholinguistics. In this paper, I do not discuss the psycholinguistic aspect of PUMP.

The major innovation of the PUMP project was diagnostic material in arithmetic for years 1–6, which I refer to as the PUMP material. This might sound like a very small innovation in comparison to the New Math project, which metaphorically was an avalanche of innovations. However, PUMP included very detailed material and many objectives.

According to its creators, an innovative property of the PUMP material was the combination of a very high level of detail (high complexity) and the possibility for teachers to manage the material on their own. Previous diagnostic material was either too simple with respect to content or too complicated for teachers to use without the help of an outside expert (Kilborn & Lundgren, 1973, p. 21).

The fundamental component of the PUMP material was four two-dimensional matrices, one matrix for each arithmetic operation. Each matrix included all the types of tasks involving that one operation that the students would be taught. Each dimension or axis had several entries. On the x-axis, there were number ranges: 0–9, 0–19, 0–99, etc. On the y-axis, there were the numbers of tens transitions the students had to handle as they solved the task (Johansson & Kilborn, 1974, pp. 6–10, 29). The basic idea was that every task concerning one arithmetic operation that students could come across should belong to one and only one cell $M_{yx}$ in the matrix. Moreover, for any given cell $M_{ab}$, all cells to the left or above (all $M_{yx}$ where $y \le a$ and $x \le b$) should contain tasks representing knowledge and skills that were necessary for solving the tasks in $M_{ab}$. In practice, the matrices were not perfect rectangles: some rows and columns had to be extended beyond the rectangle and some cells were empty (see Appendix 1). The matrices for addition and subtraction concerned only one type of operation each, whereas the matrix for multiplication included addition, and the matrix for division included subtraction.
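The prerequisite relation between cells can be illustrated with a minimal sketch. It assumes an idealised, fully rectangular matrix with zero-based row and column indices; the axis labels and the function name `prerequisite_cells` are hypothetical and do not come from the PUMP reports, and the sketch ignores the extended rows, extended columns, and empty cells of the actual matrices.

```python
from itertools import product

# Hypothetical axis labels for illustration only (not the full PUMP axes).
NUMBER_RANGES = ["0-9", "0-19", "0-99"]   # x-axis: columns
TENS_TRANSITIONS = [0, 1, 2]              # y-axis: rows


def prerequisite_cells(a, b):
    """Return every cell (y, x) with y <= a and x <= b, except (a, b) itself.

    In the idealised reading of the matrix, these are the cells whose tasks
    represent knowledge and skills needed for the tasks in cell (a, b).
    """
    return [
        (y, x)
        for y, x in product(range(a + 1), range(b + 1))
        if (y, x) != (a, b)
    ]


if __name__ == "__main__":
    # Cells that come before the cell in row 1 (one tens transition),
    # column 1 (number range 0-19) under the assumptions above.
    for y, x in prerequisite_cells(1, 1):
        print(f"row {TENS_TRANSITIONS[y]} transitions, range {NUMBER_RANGES[x]}")
```

Under these assumptions, a diagnostic test covering a given cell implicitly also probes every cell returned by the function, which is one way to read the claim that the material shows the progression from easy to difficult tasks.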

According to its proponents, the PUMP material had several pedagogical advantages. Teachers could get a better idea about the progression from easy to difficult tasks. The teachers could use ready-made diagnostic tests or create their own that covered several cells and they knew exactly to what cell each task belonged. This meant that teachers could easily spot the students’ weaknesses and identify what knowledge and skills they needed to develop. Therefore, the PUMP material was supposed to help teachers plan their teaching (cf. Kilborn & Lundgren, 1973, p. 25; Johansson & Kilborn, 1974, pp. 16–18; Kilborn, 1979, pp. 17–19). However, the results from the PUMP material could also, if used in several classes, be used by school leaders to distribute resources such as extra support for low performing students (Kilborn, 1979, pp. 19–21).

According to PUMP proponents, the aim of the PUMP material was to resolve a critical situation: poor student outcomes and inefficient teaching in arithmetic. As already mentioned, the material could be used by teachers to plan their teaching, but the people behind the PUMP material also used it to identify the causes of the crisis. They concluded that the curriculum and textbooks were of low quality. Much of the critique of the formal curriculum concerned a lack of precision concerning how algorithms were introduced and how the teaching should proceed (e.g., moving from easy to complicated algorithms). They also argued that the textbooks went too fast from easy to complicated algorithms and even left out important steps. The basis for this critique was a comparison of the results of teaching with the PUMP material, where the latter functioned as a basic norm for good teaching (Kilborn, 1979, pp. 34–42). Here, it is important to mention that the reports concerning PUMP did not advocate any particular teaching method. In fact, it was emphasised that the PUMP material was a tool for planning and teachers were free to choose whatever teaching method they wanted (Kilborn, 1979, p. 56). In this respect, the PUMP project differed from the New Math project, which contained clear principles for teaching methods.

In fact, much of the PUMP critique mentioned above concerned New Math. It was the New Math curriculum and consequently New Math textbooks that were criticised. However, when it comes to the development process and the role of researchers, there were clear similarities as well as differences between the PUMP and New Math projects.

The underlying theory of the PUMP project concerned frames in the educational process. According to this theory, there are three types of frames, namely, curriculum frames, organisational frames, and time frames. Together, these frames restrict and govern teaching as they make events possible or impossible (Kilborn & Lundgren, 1973, p. 19). Although the PUMP reports did not explicitly state what type of frames the PUMP material addressed, it seems to me that it was a curriculum frame, at least if we assume that the curriculum involves the content of the teaching. This theory of frames was used to problematize the educational process and identify a research object—i.e., frames that restrict and govern the teaching process. The theory of frames was not used for the construction of the matrices, as the construction (described above) was based only on basic concepts of arithmetic. Nor was any cognitive theory involved. However, in the final report (Kilborn, 1979, pp. 46–52), specific concepts concerning cognition in combination with the PUMP material were used to explain why certain types of teaching are more efficient than other types. These concepts were working memory and long-term memory. The basic idea is that working memory is limited with respect to how much information can be handled at the same time. On average, the limit is seven bits of information. Together with the matrices, this means that teaching that proceeded too fast to the bottom right corner of the matrices would not work. That is, the students would not have automatized how to solve tasks in the upper and left part of the matrices, which would cause overload in the working memory and therefore hamper learning. This reasoning about the limits of working memory resembles the basic parts of what today is called Cognitive Load Theory, but that name was not used by Kilborn (1979).

However, the matrices were not just a theoretical product. They were also tried and revised systematically in the following way. A matrix was constructed on the basis of what the researchers thought should be the right position of a cell. Thus, its status was hypothetical. A student test was constructed with tasks representing each cell and it was given to students to solve. About 1000 students were involved in the trials (Johansson & Kilborn, 1974, p. 11). The principle for the trials was that if the positions of the cells were correct, a task with higher solution frequency would belong to a cell above or to the left of a cell with tasks with lower solution frequencies (Johansson & Kilborn, 1974, pp. 11–12). Thus, there was a principle for checking the hypothetical matrix and revising it. It took approximately 2 years to develop the diagnostic material and the printed products to be used by the teachers, if we consider that the project started in 1973 and the PUMP material (Johansson & Kilborn, 1975) was published in 1975.
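The checking principle described above can be illustrated with a minimal sketch. It assumes that each cell of a matrix is summarised by one solution frequency and that a correctly ordered matrix has frequencies that never increase to the right or downwards. The function name and the numbers are hypothetical and are not taken from the PUMP reports; the sketch shows the logic of the principle, not the procedure the researchers actually used.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column), counted from the top left corner


def find_order_violations(freq: List[List[float]]) -> List[Tuple[Cell, Cell]]:
    """Return pairs (cell, neighbour) where the neighbour to the right of
    or below a cell has a HIGHER solution frequency than the cell itself.

    Under the principle, such a pair signals that at least one of the two
    cells is misplaced in the hypothetical matrix.
    """
    violations = []
    rows, cols = len(freq), len(freq[0])
    for r in range(rows):
        for c in range(cols):
            # Neighbour to the right should not be easier than this cell.
            if c + 1 < cols and freq[r][c] < freq[r][c + 1]:
                violations.append(((r, c), (r, c + 1)))
            # Neighbour below should not be easier than this cell.
            if r + 1 < rows and freq[r][c] < freq[r + 1][c]:
                violations.append(((r, c), (r + 1, c)))
    return violations


# Hypothetical solution frequencies (share of students solving the task).
hypothetical_frequencies = [
    [0.95, 0.90, 0.80],
    [0.88, 0.92, 0.70],
    [0.75, 0.65, 0.50],
]

print(find_order_violations(hypothetical_frequencies))
# → [((0, 1), (1, 1)), ((1, 0), (1, 1))]
```

In this invented example, the cell in the second row and second column is solved more often than the cells above it and to its left, so it would be a candidate for repositioning when the matrix is revised.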

The researchers’ role in developing the innovation was as follows. They had one single theory concerning what limits and governs the teaching process. This helped them identify a type of innovation. Collectively, they created an innovation on the basis of very basic mathematical concepts. The innovation was tried and refined according to a systematic, science-like method.

7 The Boost project

Launched in 2013 and ending in 2016, the Boost project for mathematics (Matematiklyftet) was an in-service training program for Swedish mathematics teachers for years 1–12. Much of the material is still available via the website of the Swedish National Agency for Education. It was a major program in two ways: 76 per cent of all Swedish mathematics teachers (years 1–12) followed the program (Source B, p. 6) and it covered many parts of the mathematics curriculum, to which I return below. The Boost project was driven by the central school authorities.

The organisational principle of the Boost project was peer learning among teachers with the support of researchers in mathematics education at university departments. The role of the researchers was twofold. The first role was to educate a number of expert teachers at university for 8 or 9 days (Source B, p. 9). Later, in the schools, these expert teachers were in charge of educating their colleagues. The expert teachers were supposed to be experienced and highly skilled teachers before the project began (Source A, p. 4). The second role of the researchers was to develop modules the teachers would use in their education. I call these the Boost modules.

The Boost modules covered several parts of the curriculum. There was a module for each of the subject-specific topics of arithmetic, geometry, algebra, and functions. In addition, there were modules for the general topics of problem solving, digitalisation, and language in mathematics, as well as modules for kindergarten, schools for students with learning disabilities, and adult education (Source B, p. 25). Apart from the main topic, each module had four themes: abilities, formative assessment, interaction, and socio-mathematical norms. The modules covered four spans of school years: 1–3, 4–6, 7–9, and 10–12. In total, there were 36 modules (Source B, Bilaga 1). The teachers of a certain span of years chose two modules for their training program.

Each module comprised films, audio clips, and web texts with instructions and questions for peer learning sessions. Each module also comprised two types of scientific articles: an international overview of each topic and the four themes, and specific articles about the same matter in a Swedish context. As the latter type of article was not dominant, it is not possible to conclude that the research presented to the teachers stemmed from studies of Swedish students and teachers.

The Boost modules can be considered the innovative part of the project that was based on research. However, it was not the whole innovation. As I see it, in the Boost project, the innovations were created in cooperation between researchers and teachers. Innovations per se were not mentioned in this way in the final government decision (Source A) to start the Boost project. However, the decision and the passages about peer learning refer to a report issued by the Ministry of Finance (Source A, p. 4). The report and the chapter about how to achieve change in teaching practices emphasised that experts and researchers did not have all the answers, as teachers’ experiences, values, and convictions were needed (Åman, 2011, p. 88). That is, the design of teaching innovations was not merely the domain of researchers; teachers needed to be involved.

There were two goals of the Boost project: to achieve a changed teaching culture with a focus on developing teaching, and to achieve a changed in-service training culture (fortbildningskultur) where the schools could see the strengths of peer learning. The specific goals for the teachers were to achieve a higher degree of reflection about their teaching decisions, and to have a wider set of teaching methods and attitudes (förhållningssätt) at their disposal so that they could easily adjust the teaching to the students’ different needs (Source B, p. 5). The problem that justified these goals was that student results in mathematics in international and national evaluations had decreased for about 15 years (Source A, p. 3). In the final government decision about the Boost project, a plausible cause of the decreasing results was identified: teaching was led to a lesser extent by the teachers, because the students were working on their own without supervision or feedback from teachers. This conclusion was backed in several official reports (Source A, p. 3).

The Boost project was cast as a peer learning program based on educational research on teacher training, although the only reference to this aspect was a report by the Ministry of Finance (Åman, 2011). This report was a sort of international research overview concerning several aspects of schooling and teaching. As to teacher training, a recurring reference in Åman (2011) was a more systematic international research overview on teacher training (Timperley, 2007) issued by the New Zealand Ministry of Education. Timperley’s overview identified three factors that contribute to positive and sustainable changes in teaching practices. Indeed, one factor underscored the importance of teachers having opportunities to “learn collaboratively with colleagues as they tested the impact of their teaching on student”. The other two factors concerned the content of the change: the teachers’ “skills for ongoing inquiry into the impact of the practice on students” and the “depth of principled knowledge”. The latter referred to the importance of teachers understanding how their adaptations fit with the fundamental principles of the change agenda and their practice context. Therefore, it was important to give teachers strong theoretical knowledge (Timperley, 2007, p. 219). As to mathematics education, it was important for a training program to focus on both mathematics and pedagogy in order to have a positive impact on students’ learning and results (Timperley, 2007, pp. 91–93). Importantly, Timperley (2007) stated that the “evidence base for sustainability in teacher professional learning is disappointingly thin” (p. 225).

Timperley provided no explicit conclusion as to the nature of the innovations and to what extent they should be created by researchers, but researchers were by no means ruled out. Timperley (2007, p. 223) considered how prescriptive a training program needed to be to produce sustainable positive effects. The answer was not definite; there were successful examples that were less prescriptive and that gave teachers more autonomy, but there were also successful examples that were more prescriptive and that gave teachers less autonomy. As I see it, the issue of prescriptiveness concerns researchers and innovations indirectly. If something is prescriptive, that prescription is created by someone, and that someone can be a researcher. Furthermore, the prescription can be an innovation concerning how mathematics should be taught.

This leads to the consideration of how the Boost modules were created. The whole project was administered by the National Agency for Education (Skolverket) together with the National Centre for Mathematics Education at Gothenburg University (Source A, pp. 1–2). To guarantee scientific quality, the work of creating the modules was divided among university departments that conducted research in mathematics education. To further secure high quality, the modules were reviewed by other researchers in mathematics education (Source B, p. 7). Thereafter, teachers were involved in adapting the modules. The final report was unclear about how this process was carried out, but it did mention school visits, surveys, and group interviews (Source B, p. 7). It also noted that 308 teachers were involved in trials during a school year (2012/13) before the whole training program was started in 2013 (Source B, p. 5). A general and systematic procedure for the trials was not mentioned for the 36 modules to be tested.

Taken together, the researchers’ roles in the Boost project were to gather relevant research articles and results for teachers, to suggest how mathematics can be taught, and to formulate questions about mathematics teaching. Unlike the New Math and PUMP researchers, the Boost researchers did not conform to one explicit general theory about teaching and learning mathematics as they developed the modules. Of course, this does not mean that such theories were not used. To a certain extent, the Boost researchers were involved in trials of the material they had developed, but it is not clear how this was done or if there was a certain procedure. No more than 1 year was spent on development, trials, and revisions before the training program was launched in full scale, unlike the New Math and PUMP projects, where 6 and 2 years, respectively, were spent on development, trials, and revisions.

8 Concluding discussion

The findings presented above support the conclusion that the role of the researchers and the procedures for developing the innovations co-varied with the shift from centralisation to decentralisation that happened gradually from the mid-1970s to the 2010s.

The changing researcher role concerns the ability to create an innovation that teachers could use to improve their teaching. In the highly centralised Swedish school system of the 1960s, the New Math proponents believed that researchers could develop a new curriculum and suitable textbooks. In addition, they believed that researchers could do this on the basis of one single general theory about cognition, learning, and teaching paired with scientific methods for trials.

In the 1970s, the policy of school governance had started to change towards decentralisation. However, in the PUMP project, it was still believed that researchers, on the basis of one single theory and with the use of scientific methods for trials, could develop an innovation (diagnostic material) that teachers could use to improve their teaching. The guiding theory concerned educational processes with a focus on organisation, but a cognitive theory was later added to explain certain aspects. The innovation was different, however, and it is possible to link this difference to a decentralisation policy. The diagnostic material said nothing about how teachers should teach. In fact, the PUMP researchers even underscored that teachers should choose the teaching methods they saw fit. The PUMP material was a tool for planning the teaching, as it was meant to give teachers precise information about the students’ knowledge and about the tasks on which they needed more training.

It is important to note that many teachers and students were involved in the New Math and PUMP projects; the innovations were tried and evaluated as they were used by the teachers and students. However, the researcher was in charge of the designs, trials, and evaluations.

The trend towards decentralisation prevailed in Sweden for about 40 years. In the 2010s, the role of the Boost researcher was markedly different. At that point, it was considered inefficient for researchers alone to be in charge of the creation and development of the innovations. Instead, the researchers had a supportive role, while the teachers would collaboratively develop the innovations. The role of the researchers was to create an environment of possible innovations that could be developed by the teachers. Another difference is that the researchers did not rely on just one single theory about learning and teaching.

These different roles of the researchers were also part of different procedures concerning how the innovations were developed. Among other differences, these procedures differed with respect to time allotted for development of the innovation. In the New Math project, it took the researchers 6 years to develop examples of textbooks that could be used with the new curriculum covering years 1–9. This process involved authoring of the curriculum and textbooks, trials, teacher reports, revisions, and finally further trials with experimental and control classes. The process was systematic and well documented. In the PUMP project, it took the researchers 2 years to develop diagnostic material in arithmetic for years 1–6. This process involved the writing of the material, testing it in schools, and revisions of the material. The process was systematic and well documented. In the Boost project, it took the researchers 1 year to develop the modules, and the reports mention a trial process but not how the trials were conducted or whether they were systematic. However, another major task for the Boost researchers was to examine the research field and collect relevant research articles and results to present to the teachers.

These roles of the researchers and the procedures for developing the innovations in the three projects fit different phases from centralisation (New Math) to early decentralisation (PUMP) to late decentralisation (Boost). In the case of New Math, many decisions about the innovations—what and how to teach—were tied to the researchers. In the PUMP project, all decisions about how to teach were left to the teachers; the innovation concerned what to teach and in what order, and these issues were controlled by researchers. In the Boost project, the teachers were supposed to develop the innovations, so teachers made most of the critical decisions and the researchers had a supportive role.

As to scientific principles, the researchers pursued different logics. The New Math and PUMP researchers followed a scientific logic that I denote as Logic A:

  A. Formulate a general hypothesis about teaching and learning mathematics on the basis of previous research in one distinct field of research → design an innovation → try the innovation in several classes → modify the innovation → complete the innovation.

Logic A, or at least something similar, was present in the Boost project, but very little time was allotted to it—1 year rather than 6 years in New Math and 2 years in PUMP. Here, we should recall that PUMP just concerned arithmetic in years 1–6, whereas Boost concerned the whole curriculum and years 1–12. In the Boost project, another scientific logic stands out, which I denote as Logic B:

  B. Examine a broad research field → select relevant publications and results for teachers → design a context of innovations → let research colleagues review it → try it with teachers → modify it → let teachers complete the innovations.

Here, I want to return to previous research and the issue of organisational culture and why researchers get certain roles in processes of developing innovations—a truly international issue (cf. Century & Cassata, 2016, pp. 185–186). On the basis of my findings, I think it is possible to modify Burkhardt and Schoenfeld’s (2003) claims about the weakness of educational research and politicians telling educational researchers what to do in the process of creating and implementing innovations. In my view, it can be a matter of an enduring (not recent) organisational culture where research and governance policy are conflated. My conclusion about co-variation, stated above, suggests that this conflation has existed at least since the 1960s in Sweden.

The decision making concerning the Boost project provides further insights about this conflation culture. In the justification of the organisation of the Boost project, educational research results were indeed applied. As shown above, the main scientific reference was a research overview by Timperley (2007) about teacher training. However, the overview did not explicitly say much about the procedures for developing innovations and to what extent researchers should be involved. In addition, the overview did not refute the idea of more prescriptive teacher training programs, which can limit teacher autonomy and open up the project for more researcher influence. In a similar way, Boesen et al. (2015, pp. 138–139) conclude that the Boost project conformed to a large well-established body of international research literature, but the reports on the project poorly documented how decisions taken were grounded in the literature. As to why this situation occurred, they suggest that knowledge could have been transmitted orally via seminars or other meetings. I suggest another explanation. The well-established body of literature offered different opportunities rather than presenting guidelines and arguments for a specific solution. The people in charge were mainly operating according to a prevailing policy of decentralisation, a central element in the organisational culture, and opted for a solution (Logic B rather than Logic A) that suited that policy, namely, more influence from teachers and less influence from centrally-positioned researchers.

My claim about an influential policy of decentralisation is confirmed by the work of Ryve and Hemmi (2019), in the sense that they are clear about being attentive to the autonomy of Swedish teachers as they designed the reform agenda of their development project in the 2010s. As I see it, a high level of teacher autonomy is a result of a decentralisation policy.

However, the fact that Ryve and Hemmi had more control of the design of the innovations than the Boost researchers indicates that another role of researchers was possible. This raises questions about why that was the case and about the local organisational cultures of the projects (cf. Century & Cassata, 2016, pp. 185–186). It may be a practical matter. What is possible with 400 teachers in one municipality is perhaps not possible with 76 per cent of all Swedish mathematics teachers. On the other hand, it may also be a matter of culture, as the researchers depended on different authorities and politicians: municipal in one case and national in the other. Perhaps municipal politicians view the role of researchers differently from national politicians.

To further illustrate how my historical study can give new insights to implementation research in mathematics education outside the Swedish context (cf. Sect. 2), I turn to Potari et al.’s (2019) study of the creation of a contemporary innovative Greek curriculum. They claim that the Greek educational system is highly centralised (p. 421) and the process of designing the curriculum had a strong research orientation (p. 431). They characterise the influence of research in the following way:

The object of the research activity was to define research and theory informed principles and objectives; distribute mathematics content, taking into account research findings about pupils’ learning; enrich teaching with materials promoting inquiry and mathematical understanding; and facilitate teachers’ engagement in transforming curriculum resources into the actual classroom teaching. (p. 431)

This characterisation is more similar to Logic B than Logic A. But, in Sweden, Logic B was conflated with a policy of decentralisation, giving more influence to teachers and less influence to centrally-positioned researchers. This raises questions regarding Potari et al.’s (2019) conclusion about the significance of conflict-solving brokers in a reform process moving between spheres of research, policy, and teaching. If decision-making were centralised, concentrated in the research sphere, control of compliance with decisions in other spheres would be more important than brokerage. So, can it be that the existence of brokers presupposes a local policy of decentralisation? This is speculation, however, and I suggest further historical research about such questions in dialogue with Potari et al. (2019). As to Krainer and Zehetmeier’s (2013) and Krainer’s (2014) conclusions about researchers and teachers as co-creators of innovations, I suggest a similar line of questions and research.

I think it is important that researchers who are involved in processes where innovations are created and implemented be aware of possible conflations of research and governance policy. What is best from a governmental perspective is not necessarily optimal from a research perspective.