1 Introduction

This paper concerns the sorts of functions that can be computed by a human or a finite machine. For the sake of definiteness, computable functions have to start with a finite input (coded as a natural number) and produce a finite output (a natural number again). This paper will concentrate on functions which always produce natural number output given natural number input (that is to say, the total functions), since they can be characterised in terms of logical systems more readily, and are desirable in practice (since no-one wants a program which, given some input, gets stuck in an endless loop and never returns any output). The emphasis is on the computable functions themselves in systems of the typed lambda calculus and corresponding logical systems. This paper does not include any material on higher order recursion theory, which is a very rich area (see Longley (2005) for a survey), primarily because of the emphasis of higher order recursion theory on relative rather than absolute computability, but also because higher order computations require a hypercomputer to realize (see Koepke and Seyfferth (2009) for a realization of ordinal recursion on an ordinal computer with an infinite tape indexed by transfinite ordinals). That is not to say that computable total natural number functions do not involve higher order structures (such as impredicative logical systems and transfinite ordinals), but the emphasis is on what a finite computer (with unbounded memory) could actually compute. There is a fine line here, but the distinction lies at any point where the results of a previous computation that could not be performed by a finite computer (Turing machine) are required to complete a computation.

There is a focus in this paper on typed systems of lambda calculus. To introduce the typed lambda calculus informally, a type is a property of a set of objects. It is usual to allow that types and their objects are definable by finite expressions, the finite expressions defining objects being called terms. Terms in the lambda calculus are variables and (in some systems) constants, functions (written using the lambda symbol) and the application of functions to previously defined terms. The process of forming a function from a term containing a variable is called abstraction, and it is the inverse of application. In typed systems of lambda calculus, each term has a type. For example, \((\lambda x:A)(y:B)\) is a term representing the function that takes term (variable) x of type A to term (variable) y of type B. In some typed systems of lambda calculus types are only ever represented by variables, while in others constants are allowed, and in some systems expressions involving variables and constants are allowed. There is further explanation in this paper, but texts such as Geuvers and Nederpelt (2014) and Guallart (2015) provide a good introduction. Another central concept for this paper is the reduction of a lambda term. The idea is that a term is simplified as far as possible using function application (although there are often other rules, to deal with equality for example) until an irreducible normal form results. For example \((\lambda x:A)(y:B)(a:A)\) reduces to y for variable y : B, while \((\lambda x:A)(\lambda y:B)(yx)(a:A)\) reduces to \((\lambda y:B)(ya)\) for a : A.
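To make the reduction examples concrete, they can be rendered as Haskell terms (an illustrative sketch, with Haskell used for code sketches throughout; the names constY and apTo are chosen here, and in the second example y is given a function type so that the application yx is well typed):

```haskell
-- (\x:A)(y:B): the function that takes x of type A to a fixed y of type B.
constY :: b -> (a -> b)
constY y = \x -> y
-- Applying it to a : A beta-reduces to y:  constY y a  ~>  y.

-- (\x:A)(\y:B)(y x) applied to a : A, with y read as a function on A:
apTo :: a -> (a -> c) -> c
apTo = \x -> \y -> y x
-- apTo a  ~>  \y -> y a, the normal form (\y:B)(y a) from the text.
```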

Typed systems of lambda calculus form a bridge between computer programs on the one hand and axiomatic deductive systems on the other. When lambda terms are reduced or in general converted to other terms, the process forms a computer program, while the rules for reduction or conversion correspond to logical rules (see Sect. 4 below). All the systems of typed lambda calculus described in this paper can give rise to functional programming languages.

The style of this paper is to discuss and build on key papers in the literature of computation theory and formal logic to show what has been achieved and then to make suggestions on how the computational systems described in these papers can be extended. These suggestions aim to characterise the sorts of formal systems and programming languages which could be developed to extend existing computational paradigms. The original content in this paper is contained in Sects. 7 and 10, and comprises a translation of sets to types, the characterisation of a recursion schema that results by substitution in computable functions, and the views on the limits of ordinal complexity that can be used to define computable functions. Sects. 2–6 comprise a literature survey of what might be called modern type theory, while Sects. 8 and 9 are critical reviews of homotopy type theory (an intensional theory of types) and the untyped lambda calculus respectively.

2 Gödel and Kleene: General Recursion and the Undecidability of Totality

It is a well known result due to Kleene (see Kleene (1943)) that, given the computation and the program instructions, computations in a formal language can be verified as correct or otherwise in a primitive recursive way by coding the input and output, the program and the computation as natural numbers. Kleene showed that any computable function that returns a value can be written as a primitive recursive function that operates on a formally verified computation, usually written as \(U((\mu y)T(e,x,y))\) for primitive recursive function U, input x, computation of least (\(\mu\)) code y, program of code e and primitive recursive computation-checking predicate T.
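The shape of this normal form can be sketched as follows (an illustration only, not Kleene's actual coding; the arguments t and u are hypothetical placeholders standing in for the T predicate and the U function):

```haskell
-- A sketch of the shape U((mu y)T(e,x,y)): an unbounded search for the code
-- y of a correct halting computation, then a primitive recursive extraction.
kleeneShape :: (Integer -> Integer -> Integer -> Bool) -- t e x y: does y code a halting computation of program e on input x?
            -> (Integer -> Integer)                    -- u y: extract the output from the computation code y
            -> Integer -> Integer -> Integer
kleeneShape t u e x = u (mu (t e x))
  where
    -- the mu operator: least y satisfying p; it diverges if there is none,
    -- which is exactly where non-totality enters
    mu :: (Integer -> Bool) -> Integer
    mu p = head [y | y <- [0 ..], p y]
```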

If we consider programs that always have a natural number output given natural number input, called total programs, we see that this fact can be expressed as \((\forall x)(\exists !y)(f(x)=y)\) for program f, where \((\exists !z)P(z):=(\exists z)(P(z)\wedge (\forall w\ne z)\lnot P(w))\). We can clearly re-write this formula as \((\forall x)(\exists !y)(\exists !z)(z=U(y)\wedge T(e,x,y)\wedge (\forall w<y)\lnot T(e,x,w))\), where e is a numerical program code for f, indicating that totality of a program has logical complexity \(\Pi _{2}^{0}\) (i.e. can be written in the form \((\forall x)(\exists y)\ldots\), where the ellipses contain no further quantifiers), while the set of terminating programs and inputs has logical complexity \(\Sigma _{1}^{0}\) (i.e. can be written in the form \((\exists x)\ldots\)). Gödel showed in Gödel (1931) that no formal axiomatic proof system (that has quantifiers and can represent addition and multiplication of natural numbers), which is a type of computation system, can decide the truth of every proposition of quantifier complexity \(\Pi _{1}^{0}\) or richer (so \(\Pi _{2}^{0}\), or \(\Sigma _{2}^{0}\) for example). Kleene showed that the characteristic function corresponding to \((\exists y)T(x,x,y)\) has undecidable totality (often called the “halting problem” and first shown undecidable for computers by Turing in Turing (1937)).

As important as these theorems are (and they are arguably the foundational results in mathematical logic and computability theory in the 20th Century), we must not overlook what Kleene (Kleene 1943) showed en route as far as computer programs are concerned. That is, deciding whether a computation is correct (via Kleene’s T predicate) is in general computationally (far) easier than performing a computation, since computational correctness is decidable in primitive recursive arithmetic or weaker, being expressible as a \(\Delta _{0}^{0}\) proposition (i.e. a proposition that can be written with quantifier values bounded by the length of the expression), whereas the least y returned by f(x) is an unbounded search operation. Ackermann in Ackermann (1928) showed that there are total programs which grow faster than any primitive recursive total program (as a nested double recursion)Footnote 1; and it is possible to execute such a program, even though it will quickly exhaust the memory of actual computers. In fact Kleene’s result shows that what makes a computable function hard to compute is the rate of growth of the function.
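Ackermann's function is short enough to state in full, here in the two-argument form due to Péter; the nested double recursion (a recursive call appearing as an argument of a recursive call) is what takes it beyond primitive recursion:

```haskell
-- The Ackermann(-Péter) function: total and computable, but not primitive
-- recursive because of the nested double recursion in the third equation.
ackermann :: Integer -> Integer -> Integer
ackermann 0 n = n + 1
ackermann m 0 = ackermann (m - 1) 1
ackermann m n = ackermann (m - 1) (ackermann m (n - 1))
```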

3 Gödel’s System T and Girard’s System F

The provably total functions (i.e. the programs which always return a result when given input and which can be proved to have that property in a formal axiomatic theory) of first-order Peano Arithmetic (including Ackermann’s function that grows faster than any primitive recursive function) are precisely the hierarchy of functions reducible by application from primitive recursive functionals of finite type over the natural numbers in a quantifier-free higher-order computation system called System T (a result proved by Gödel in his famous Dialectica article, Gödel (1958), Gödel (1994)). To be more precise, Gödel showed that it is possible to compute a sequence of terms which witness any theorem of first-order Heyting arithmeticFootnote 2 (which applies equally to first-order Peano arithmetic by using Gödel’s double negation translation or the direct method due to Shoenfield in Shoenfield (1967)), and since every non-empty term without free variables can be reduced by function application to a numeral and not to an empty term (a term in the empty type, the type of a contradiction),Footnote 3 first-order Peano arithmetic is consistent (see (Avigad and Feferman 1998) 4.2).Footnote 4 In the 1970s an extension of Gödel’s System T of primitive recursive functionals of finite type led the way as a foundation for computing. Girard’s System F (see Girard (1989)) is an extension of System TFootnote 5 which allows variables over, and abstraction over, types (known as type polymorphism). System F is a very powerful impredicative systemFootnote 6 of the typed lambda calculus, having the same provable total natural number functions as second-order Peano arithmetic with a comprehension axiom for any formula of second-order arithmetic (see Girard (1989), Avigad and Feferman (1998)).
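The heart of System T is the recursor R at each finite type. A sketch in Haskell (where the type parameter t stands for an arbitrary finite type; note that Haskell's general recursion is not totality-checked, whereas System T terms are total by construction) shows how recursion at higher type exceeds primitive recursion:

```haskell
-- System T's recursor R at an arbitrary type t (a sketch):
-- R base step 0 = base;  R base step (n+1) = step n (R base step n).
recT :: t -> (Integer -> t -> t) -> Integer -> t
recT base _    0 = base
recT base step n = step (n - 1) (recT base step (n - 1))

-- At type t = Integer this is ordinary primitive recursion, e.g. addition:
plus :: Integer -> Integer -> Integer
plus m = recT m (\_ acc -> acc + 1)

-- At higher type t = Integer -> Integer, Ackermann's function becomes
-- definable, using A(m+1, n) = A(m)^(n+1)(1):
ack :: Integer -> Integer -> Integer
ack = recT (+ 1) (\_ f n -> iterN f (n + 1) 1)
  where iterN f k = foldr (.) id (replicate (fromIntegral k) f)
```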

4 Curry-Howard Isomorphism

System F also explicitly pursues an idea due to Curry in Curry (1934) and Howard in Howard (1980a), that all computer programs can be regarded as a way of verifying the type of the data associated with a program. This idea is known as the Curry-Howard isomorphism. The same idea can be expressed in terms of proofs: a proof of a logical expression corresponds to a means of verifying the well-formedness of a type corresponding to the logical expression.

To be more specific, the logical rules enabling the formation, introduction and elimination of propositions and predicates in a natural deduction formulation of logic (see Gentzen (1969a), Prawitz (2006)) correspond to the rules enabling the formation, introduction and elimination of types, which in turn correspond to the rules for term rewriting in some typed version of lambda calculus. In Howard (1980a), Howard shows that a fragment of first-order predicate logic (actually intuitionistic Heyting Arithmetic in the second part of the paper) comprising \(\wedge\), \(\rightarrow\) and \(\forall\) corresponds to pairing and projection, and to lambda abstraction and application as introduction and elimination rules for implication, for term (not type) variables (with it being possible to add other logical connectives).Footnote 7 It is also possible to represent types as propositions in first-order propositional logic (i.e. without quantification over propositions) for \(\wedge\) and \(\rightarrow\) but not for \(\forall\). In System F Girard extends this correspondence at propositional level to introduction and elimination rules for universal and existential quantification for type variables by allowing lambda abstraction over types (written \(\varLambda\)),Footnote 8 which corresponds to “second-order” propositional logic with quantification over propositions. In fact since termsFootnote 9 can be coded as numbers, types are properties of numbers, and polymorphism allows the type of the natural numbers to be defined and quantified over (at least in the sense of “if the natural number structure exists, then ...”).Footnote 10 System F is powerful enough to represent second-order Heyting Arithmetic as a deductive theory (with comprehension for arbitrary formulas in the language of second order arithmetic).
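Under the correspondence, inhabited types read as tautologies. A short sketch (with Haskell's RankNTypes approximating Girard's \(\varLambda\)-abstraction over types):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Conjunction: introduction is pairing, elimination is projection.
andIntro :: a -> b -> (a, b)
andIntro x y = (x, y)

andElimL :: (a, b) -> a
andElimL (x, _) = x

-- Implication: lambda abstraction is introduction, application is elimination.
impElim :: (a -> b) -> a -> b
impElim f x = f x

-- A System F style term: abstraction over the proposition/type a itself,
-- corresponding to second-order quantification over propositions.
polyId :: forall a. a -> a
polyId x = x
```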

In Girard (1989) Girard showed System F to be a conservative extension for computable total functions of second-order Heyting Arithmetic (and therefore second-order Peano arithmetic). Girard’s proof is very interesting because second-order Heyting Arithmetic (like System F) is an impredicative theory. Girard showed that a normal form always exists for all terms in System F and a reduction sequence that has a numeral as a normal form can be represented and proved to terminate in second-order Heyting Arithmetic, and conversely that all provably total functions in second-order Heyting Arithmetic correspond to reduction sequences in System F. Girard’s argument quantifies over all sets of what he called reducibility candidates, which are reductions that avoid lambda abstraction, pairing and non-logical type construction (see Girard (1989), Pistone (2018)), the main argument being by induction on formula complexity, the comprehension axiom schema being used to show that sets of reduction candidates can be parameterized.

5 Martin-Löf’s Intuitionistic Type Theory

Martin-Löf went further than Girard in Martin-Löf (1984) by allowing that types could be defined for any formula expressed in terms of logical connectives over individuals but not types (allowing a type to mix types and terms, which is known as a dependent type). An example is that \((\forall x:A)(Px\rightarrow Qx)\) is a type, often written \((\Pi x:A)(Px\rightarrow Qx)\), where \(\Pi\) is the type binder corresponding to universal quantification. Similarly, \((\exists x:A)(Px\rightarrow Qx)\) is a type, often written \((\Sigma x:A)(Px\rightarrow Qx)\), where \(\Sigma\) is the type binder corresponding to existential quantification. It is the job of the mathematician according to Martin-Löf to show that a type (called a “proof” type) is non-empty using the standard (Heyting) interpretation of intuitionistic logic (see van Dalen (2001) for a description using the lambda calculus),Footnote 11 which establishes the validity of the inference represented by the type (under the Curry-Howard isomorphism). To be clear, according to this conception of type theory proofs inhabit types (represented by terms) and provide a validation of the logical inference represented by the type. The resulting theory, known as Intuitionistic Type Theory, is actually weaker in its mature formFootnote 12 than System F because type abstraction is not permitted. Intuitionistic Type Theory is a very clean foundation for intuitionistic mathematics in terms of computable functions because proofs are computations. Intuitionistic Type Theory includes inductively defined types (see Martin-Löf (1984) p. 42 et seq.). An inductively defined type (or inductive type) is any type where terms of the type can be defined with respect to previously defined terms by a finitely definable relationship in a formal language that is well-founded, i.e. every chain of definitions of terms is finite and returns a definite value.Footnote 13 Intuitionistic Type Theory with inductively defined types is a powerful theory (although the induction may need to terminate at the first non-recursively definable ordinal without the ability to form variables over types, in the sense of System F)Footnote 14.

Intuitionistic Type Theory also led to a view of the Curry-Howard isomorphism that is often called “Propositions as types”,Footnote 15 that is to say that propositions correspond to types and proofs of propositions correspond to proofs of a type being non-empty, by exhibiting a term that inhabits the type. The reason for this correspondence is that in Intuitionistic Type Theory every well-formed first-order intuitionistic logical term has a type and vice versa, and closed terms correspond to propositions.Footnote 16

Martin-Löf’s Intuitionistic Type Theory is a beautiful foundational theory for computation and mathematics. The value of introducing dependent types should not be underestimated. With dependent types comes the ability to create transfinite types that System FFootnote 17 does not have. For example, if \([0]:=N\) and \([n+1]:=[n]\rightarrow N\), then \((\lambda n:N)[n]\) is a transfinite type (see Avigad and Feferman (1998) 10.1), and a function (term) \(t:(\lambda n:N)[n]\) will be such that t(n) : [n], i.e. the return type of the function is dependent on the value of natural number n. Dependent types also enable all logical connectives to be defined in a unified way.Footnote 18
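This particular dependent type can be approximated in Haskell with type families (a sketch under that encoding: Iter plays the role of [n], and the singleton SNat lets the runtime value of n fix the return type):

```haskell
{-# LANGUAGE DataKinds, TypeFamilies, GADTs #-}

data Nat = Z | S Nat

-- A singleton so a runtime value of n can fix the type-level index.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- The transfinite type of the text: [0] := N, [n+1] := [n] -> N.
type family Iter (n :: Nat) where
  Iter 'Z     = Integer
  Iter ('S m) = Iter m -> Integer

-- A dependent function t : (lambda n : N)[n]; its return type varies with n.
t :: SNat n -> Iter n
t SZ     = 0
t (SS m) = \x -> collapse m x

-- Evaluate an inhabitant of [m] to a number by feeding it a canonical argument.
collapse :: SNat m -> Iter m -> Integer
collapse SZ x     = x
collapse (SS m) f = f (t m)
```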

6 Girard’s System \(F_{\omega }\) and Coquand’s Calculus of Constructions

It is possible to extend System F so that types can be defined with functional variables from any finite type to finite type, that is of the form \(A\rightarrow B\), where A and B are built up from base types such as the generic type variable \(*\)Footnote 19 (this is known as allowing type constructors over finite types), which gives the System \(F_{\omega }\). Type constructors can be regarded as defining a type variable by a formula involving type variables and type constants. In the same way that System F has the same provably total computable functions as second-order Peano Arithmetic, System \(F_{\omega }\) has the same provably total computable functions as the union of all higher finite order deductive systems of Peano Arithmetic (see Girard (1973), Girard (1986)).
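Haskell's kind system is essentially this fragment of \(F_{\omega }\): \(*\) is the kind of types, and type constructors have kinds of the form \(A\rightarrow B\). A brief sketch:

```haskell
-- Pair is a type constructor of kind * -> * -> *: it maps two types to a type.
data Pair a b = Pair a b

-- Twice is a type-level function of kind (* -> *) -> * -> *:
-- it takes a type constructor f and a type a to the type f (f a).
newtype Twice f a = Twice (f (f a))

-- Example inhabitant: Twice Maybe Integer is Maybe (Maybe Integer).
example :: Twice Maybe Integer
example = Twice (Just (Just 3))
```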

One of the most powerful type-based systems is Coquand’s Calculus of Constructions in Coquand and Huet (1988),Footnote 20 which allows abstraction over terms, types and type constructors (see Geuvers and Nederpelt (2014) for a clear recent exposition). The Calculus of Constructions is a strict extension of System F and System \(F_{\omega }\) because it allows types to depend on terms (allowing the introduction of transfinite types, see Sect. 5 above), as well as allowing abstraction over types and type constructors. The Calculus of Constructions is very elegant because of the symmetries between types and terms, in the sense that terms can be defined from terms (as in the simply typed lambda calculus)Footnote 21, terms can be defined from types (as in System F), types can be defined from terms (as in Intuitionistic Type Theory) and types can be defined from types (as in System \(F_{\omega }\)). The Calculus of Constructions is at the top right of what is known as the Barendregt Cube (see Barendregt (1991) and Guallart (2015) for an overview).

There is a question though of whether there should be a correspondence between types and generic terms of those types (as the mapping between terms and a type is many-to-one). For example, if you allow a universal or generic type variable “Type” or “*”, what is the generic term variable corresponding to the type “*”? The only sensible answer is that * is the type of a universal term variable “Term”. But then you have to allow that Term is not also a type variable. In Coquand and Huet (1988) Coquand makes clear that the type of * is not *, but that raises the question of what the type of * is. You could say that the smallest type of * is ** (or the “Type of Type”), but it is not difficult to see that you will end up with ramified types (admitting *, **, *** etc.), see Russell (1908), or with Russell’s paradox (if you allow a type to be its own type). It is possible to say that the smallest type of * is something else, a sort say (often written \(\diamond\)), rather than a type **; but then the same regress occurs when you ask what the sort of the sort \(\diamond\) is. There is no suggestion that the Calculus of Constructions is inconsistent as a formal system (after all “*” does not have to have a type), but for such a powerful type theory it is incomplete in terms of type assignment.Footnote 22 It is possible to create a hierarchy of types (often called a type universe) in the case where types are defined predicatively in terms of other types, and a hierarchy of universes emerges in order to refer to all of the types in the preceding universes.

It is tempting to say that “Type” and “Proposition” are large types (that is, the type equivalent of a proper class), while individual types and propositions are small types. If that were the case then Proposition would not be a member of Type (or more naturally Proposition would not be a Type) because both would be the size of proper classes, which do not form a hierarchy of increasing cardinality (assuming the Axiom of Limitation of Size, see von Neumann (1925)). There are further strengthenings of the Calculus of Constructions such as the Calculus of Inductive Constructions (Paulin-Mohring (2015)), the Extended Calculus of Constructions (Luo (1989)) and the Extended Calculus of Constructions with inductive definitions (Ore (1992)), which add features to the universe of types of the Calculus of Constructions (a countably infinite hierarchy of predicative types and a single impredicative layer of propositions that is lifted to the predicative type hierarchy).

7 Definable Types

In my view Intuitionistic Type Theory with impredicative type and functional variable (or type constructor) quantification and inductive type definitions would be a reasonable foundation of computation, which is equivalent to the Calculus of Constructions with inductive definitions.Footnote 23 One could allow any type for which an inductive definition could be given in terms of existing types, which can then be used in further definitions of types. It is possible in Intuitionistic Type Theory to define well-ordered linear orderings which can be computed, the order types of which are represented by ordinals less than the first non-computable ordinal. The limit of complexity of computable linear orderings is the first non-computable ordinal, but tree structures (which are partial orders) can be much richer. Even a binary tree defines a rich type: a tree of countably infinite height in which each node has two successor nodes has \(2^{\aleph _{0}}\) branches. Now \(2^{\aleph _{0}}\) branches cannot be computed, but a complete binary tree with branches of height \(\omega\) (often called a universal fan)Footnote 24 can be characterised uniquely as the smallest structure obtained by quantifying universally over all trees that satisfy the inductive definition.Footnote 25 This could be achieved in System F or in an impredicative system which allowed quantification over dependent types (such as allowing type \((\Pi P)(\Pi x:X)\ldots P(x)\ldots\) to be formed from type P that depends on x, where \(\Pi\) is the type binder corresponding to \(\forall\)). In such a system very rich structures could be defined as types. For universal quantification over all structures satisfying a definition this means admitting lambda abstraction over types that represent the structures, which is a reasonable step to take. In practice the construction of a type does not need to specify the universal quantification explicitly, as this can be left implicit in the construction. For the purposes of computation this is sufficient, since a computer with any finite runtime would describe a universal fan of finite depth, which in the limit would give the universal fan. Admittedly the universal fan is not computable, but it is definable as a type.
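Characterising a structure as the smallest one satisfying its inductive definition, by quantifying universally over all types, is exactly the impredicative (Church-style) encoding available in System F. A sketch for finite binary trees (the universal fan itself, being infinite, is not computable, as noted above):

```haskell
{-# LANGUAGE RankNTypes #-}

-- System F style impredicative encoding: a tree is whatever can be folded
-- over every candidate type x equipped with a leaf value and a node operation.
newtype Tree = Tree (forall x. x -> (x -> x -> x) -> x)

leaf :: Tree
leaf = Tree (\l _ -> l)

node :: Tree -> Tree -> Tree
node (Tree a) (Tree b) = Tree (\l n -> n (a l n) (b l n))

-- Because Tree quantifies over all candidate types x, it is the smallest
-- structure closed under leaf and node; e.g. its height is definable:
height :: Tree -> Integer
height (Tree t) = t 0 (\a b -> 1 + max a b)
```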

The main reason why definable structures are relevant to computable functions is due to the relationship between formal deductive systems and computability. We know that structure definitions give rise to axioms that characterise those structures, and that any sufficiently rich formal deductive theory is incomplete with respect to deduction (see Sect. 2). Exactly the same is true of typed systems of lambda calculus. New structures (types) mean new terms to rewrite and to (try to) generate a normal form.

Tree based types can be used as a representative of power types (the type-theoretic version of the set of all subsets of a set, the power set, introduced in Cardelli (1988) as a way to capture the notion of generic subtype). By a process of iteration and abstraction at ordinal limits (producing a limit type), a type theoretic view of the cumulative hierarchy of set theory can be reproduced. If type identity is treated as extensional (that is, if two types have the same extension, namely x : A if and only if x : B for any term x of type A or of type B, then \(A=B\)),Footnote 26 and any sequence of term reductions is finite, then type theory is no different from set theory with the Axiom of FoundationFootnote 27, although types themselves and abstraction and application operators are more readily viewed as logics (via the Curry-Howard isomorphism) and functional construction processes than is the cumulative hierarchy of sets. The correspondence is presented in the table below. The axioms in the table are not a minimal set of axioms (as the Axiom schema of Replacement implies the Axiom schema of Separation), and the Axiom of Choice is not usually considered a core axiom of first-order Zermelo Fraenkel set theory. The axioms cited however do correspond to distinctive types (Table 1):

Table 1 The correspondence between sets and types

Type theory is also easier to use than set theory (see Farmer (2007)), although richer type theories (such as the Extended Calculus of Constructions and type theories with an intensional view of identity) can be extremely complicated. My own view is that mathematics is almost exclusively concerned with constructions carried out by functions, and type theory is correspondingly a natural foundation for mathematics and for computable functions in particular.Footnote 28 Type abstraction can be viewed as carrying the least amount of information greater than the information in any application of the type,Footnote 29 which is natural for types because abstraction requires the notion of a generic instance of a typeFootnote 30 (for example a variable that serves as a term of that type) but is not natural for sets viewed as collections.

8 Homotopy Type Theory

A recent development in type theory has been homotopy type theory (see Awodey (2012, 2013)). The basic premiss of homotopy type theory is that in the same way as there is a correspondence between propositions and types, there is a correspondence between types and homotopy spaces. An homotopy space is a topological space equivalent up to homotopy (which roughly means that any two families of n-dimensional open neighbourhoods that cover the space can be deformed continuously into one another).Footnote 31 The promise of this approach is that new axioms of type theory will emerge that are fundamentally topological in nature. An example from Awodey (2013) is Whitehead’s Principle (that a point-wise isomorphism between homotopy groups on well-behaved topological spaces is equivalent to the homotopic equivalence between the two spaces) applied to homotopy types. Homotopy type theory is also part of what is known as the univalence foundational programme for mathematics. Univalence seems to be the view that (homotopic) equivalence is what is meant by identity. That is to say, if identity is understood intensionally, and identity of two terms needs a type, introduction and elimination rules and a proof that terms are equal, then proofs of identity of terms can be viewed as paths in an homotopy space, identity between proofs of identity can be viewed as an homotopy between paths, and so on up the hierarchy of higher-order identities (of types, types of identity over types and so on) (see Awodey and Warren (2009)). This makes sense if identity of two terms is treated as needing a topological explanation that does not reduce to the syntactic equality of the two terms or to one term being a shorthand for another. But given that “propositions as homotopy spaces” is an analogy (or functor in category theory terms) in the same way “propositions as types” is, it is arguable that the term rewriting rules are as clear an explanation of identity of terms as it is possible to get.

It is difficult to judge how successful homotopy type theory will be, although there is no reason in type theory why a type should not be a geometrical or topological space, and it is unclear what the advantage is of treating every type as though it were a homotopy space (other than having a means to handle intensional identity of terms and types). There is no doubt a desire to put to use some very rich mathematical structures described in category theory which do not seem to be reducible to sets and have an inherent geometrical structure, but it is not yet known whether homotopy type theory introduces any new classes of computable function over and above those introduced by axiomatizations of existing areas of mathematics.

9 The Untyped Lambda Calculus and Domains

An alternative to systems of typed lambda calculus is to move away from the requirement of provable totality of functions and embrace programs which do not always terminate. In the untyped lambda calculus (due to Church) terms do not always reduce to a normal form in finitely many steps. This is attractive in light of the undecidability of the halting problem because the untyped lambda calculus is equivalent to a universal Turing machine in its power. Introducing a partial ordering where there is a tree of partial functions ordered by inclusion and having programs which always terminate as their limit is a powerful idea (due to Scott in Scott (1976), Scott (1993)). In fact Scott produced a model of the untyped lambda calculus (as a term-rewriting system, not as a logic), showing that there are partially ordered sets which have the same structure as the continuous functions from the partially ordered set into itself (see Turner (1990) for a clear account). The problem is that the untyped lambda calculus does not always result in a normal form when the lambda terms are rewritten using a reduction rule (see Geuvers and Nederpelt (2014)), and does not result in a consistent logic. It is possible to produce a lattice based logic based on domains (see Abramsky (1991), Scott (1993) for example), but the significance of the axioms (or rules) is unclear as the axioms reflect the lattice structure rather than a logical inference system.Footnote 32
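A minimal interpreter makes the failure of normalisation concrete. In this sketch substitution is naive (adequate for the closed example term below, not for general capture-avoiding reduction), and omega is the standard term with no normal form:

```haskell
-- Untyped lambda terms and one step of normal-order beta reduction (a sketch).
data Term = Var String | Lam String Term | App Term Term deriving Show

-- Naive substitution: adequate here because bound and free names never clash.
subst :: String -> Term -> Term -> Term
subst x s (Var y)   = if x == y then s else Var y
subst x s (Lam y b) = if x == y then Lam y b else Lam y (subst x s b)
subst x s (App f a) = App (subst x s f) (subst x s a)

-- Leftmost-outermost reduction; Nothing means the term is in normal form.
step :: Term -> Maybe Term
step (App (Lam x b) a) = Just (subst x a b)
step (App f a)         = case step f of
                           Just f' -> Just (App f' a)
                           Nothing -> App f <$> step a
step (Lam x b)         = Lam x <$> step b
step (Var _)           = Nothing

-- omega = (\x. x x)(\x. x x) reduces to itself forever: no normal form.
omega :: Term
omega = App w w where w = Lam "x" (App (Var "x") (Var "x"))
```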

10 What Might Type Theory Look Like?

As suggested in Sect. 7, type theory should include a type-theoretic version of Zermelo Fraenkel set theory, arguably with intuitionistic logic.Footnote 33 Intuitionistic logic has the great advantage over classical logic that when an existence claim is made there must be a concrete method for constructing an object. But at this point we are going to abandon intuitionistic logic. The reason is that we can define double negation elimination (i.e. the inference from \(\lnot \lnot A\) to A) as an elimination rule for negation, which is true of “truth” if not of “proof”. Doing that introduces a symmetry to the status of the “there exists” and “for all” quantifiers, i.e. \((\exists x:B)P(x):=\lnot (\forall x:B)\lnot P(x)\) and \((\forall x:B)P(x):=\lnot (\exists x:B)\lnot P(x)\).Footnote 34 We can define the dependent type \((\Sigma x:B)P(x):=[(\forall x:B)(P(x)\rightarrow \bot )\rightarrow \bot ]\), and a term \(t:(\Sigma x:B)P(x)\) would have the form \(g[(\lambda x:B)(\lambda p:P(x))f(p)]\) for terms f and g of type \((\Pi x:B)(P(x)\rightarrow \bot )\) and \((\Pi x:B)(P(x)\rightarrow \bot )\rightarrow \bot\) respectively, but this is problematic because no term has type \(\bot\). It is better to define \(t:(\Sigma x:B)P(x)\) in a second-order way as \((\lambda A)(g[(\lambda x:B)(\lambda p:P(x))f(p)])\), for type variable A, dependent type P(x) and terms f and g of type \((\Pi x:B)(P(x)\rightarrow A)\) and \((\Pi x:B)(P(x)\rightarrow A)\rightarrow A\) respectively, which is the Russell-Prawitz-Girard approach (see Rathjen (2018)). While \(\lnot (\forall x:B)\lnot P(x)\) is not the same as \((\exists x:B)P(x)\), it is equivalent via double negation elimination.Footnote 35 It is also well known that classical negation can be interpreted in programming terms as a context continuation, i.e. capturing state information and then passing that as a parameter to a program (see Griffin (1989)).
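The second-order definition of \(\Sigma\) can be written down directly in a language with rank-2 polymorphism; in this sketch the quantified type variable a plays the role of the type variable A above:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Russell-Prawitz-Girard second-order encoding of the existential type:
-- (Sigma x:B)P(x) becomes: for every result type a, any consumer of all
-- possible witnesses (forall x. p x -> a) can be answered.
newtype Exists p = Exists (forall a. (forall x. p x -> a) -> a)

-- Introduction: pack a witness of P at some (hidden) x.
pack :: p x -> Exists p
pack w = Exists (\k -> k w)

-- Elimination: use the packed witness without ever learning which x it was.
unpack :: Exists p -> (forall x. p x -> a) -> a
unpack (Exists e) k = e k
```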

For the purposes of computation, any operation that we introduce must be computable in the sense that, given a natural number input, after finitely many steps we arrive at a natural number output (see Introduction above). The question arises how to treat functionals of higher type than \(N\rightarrow N\). The view taken here is that a functional needs to be absolutely computable in the sense that with constant previously defined input values (which could also be functionals of lower type) the natural number output can be computed in finitely many steps by means of a recursion in a well-founded structure indexed by ordinal numbers. In order to be computable a functional is replaced by a term in the typed lambda calculus, and the ordinal recursion must have the form that the ordinal of the reduct of a term is strictly less than the ordinal of the term. In general form we admit the following recursion schema for ordinal recursive functionals (commas not being used and brackets only used to keep the text readable):Footnote 36

\(Fgf0:=f\) and \(Fgf\beta :=g\beta (\lambda \gamma :\beta )Fgf\gamma\), where f has type T, g has type \(\alpha \rightarrow (\alpha \rightarrow T)\rightarrow T\) and Fgf has type \(\alpha \rightarrow T\) for any type T over the natural numbers, N, and \(\alpha\) an infinite ordinal; term \(Fgf\beta\) is a function of finitely many terms \(Fgf\gamma\), depending on the choice of \(g\beta\) for \(\beta >0\) and of f for \(\beta =0\). \(\gamma\), F, f and g may contain parameters, which are taken to be the same for each and absorbed into type T.

Box 1: Recursion schema for computable functionals
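Instantiated at \(\alpha =\omega\), the schema of Box 1 is course-of-values recursion and can be written down directly (a sketch; the requirement that g consult only earlier values is the analogue of the finite dependency condition discussed below):

```haskell
-- Box 1 at alpha = omega (a sketch): F g f 0 = f, and F g f n may consult
-- any finite set of earlier values via the function (\k -> schema f g k).
schema :: t -> (Integer -> (Integer -> t) -> t) -> Integer -> t
schema f _ 0 = f
schema f g n = g n (\k -> schema f g k)  -- well-founded only if g calls k < n

-- Example: factorial, where g consults the single predecessor n - 1.
factorial :: Integer -> Integer
factorial = schema 1 (\n rec -> n * rec (n - 1))
```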

The idea of this recursion schema is to incorporate a computable finite-branching finite-depth tree for the value of F at ordinal \(\beta\). Using a technique made explicit in Terlouw (1982), a well-ordering \(\prec\) of order type \(\alpha\) can be thought of as a well-ordering \(\prec _{*}\) of order type \(2^{\alpha }\) (using ordinal exponentiation), understood as follows. If \(\prec\) is a well ordering of the ordinal numbers \(<\alpha\), a limit ordinal for simplicity, then ordinal codes (under a primitive recursive coding function ⌈⌉ such as the ordinal sum) of finite strictly descending sequences of ordinals \(\langle x_{i}\rangle _{i<n}\), \(n<\omega\), such that \(x_{j}\prec x_{k}\) if \(j>k\), can be given a lexicographical ordering \(\prec _{*}\) defined as follows: \(\left\lceil \langle x_{i<n}\rangle \right\rceil \prec _{*}\left\lceil \langle y_{i<m}\rangle \right\rceil\) if \((\exists k<n)[(\forall l<k)(x_{l}=y_{l})\wedge (x_{k}\prec y_{k})]\) or \((\forall l<n)(x_{l}=y_{l})\wedge n<m\); \(\left\lceil \langle x_{i<n}\rangle \right\rceil =_{*}\left\lceil \langle y_{i<m}\rangle \right\rceil\) if \((\forall l<n)(x_{l}=y_{l})\wedge n=m\); and \(\left\lceil \langle y_{i<m}\rangle \right\rceil \prec _{*}\left\lceil \langle x_{i<n}\rangle \right\rceil\) otherwise. We can think of a computation tree of ordinal complexity \(2^{\beta }\) as a tree where all \(F(\gamma )\) terms of ordinal \(\gamma <\beta\) are substituted into g, the well-foundedness of the computation of \(F(\beta )\) following from \(2^{\prec }\)-induction (\(\prec\)-induction for each descending finite sequence of terms).

We can replace the \(\prec _{*}\) well ordering with a higher type and well ordering \(\prec\) as follows, after Terlouw (1982), for ordinals less than \(\epsilon _{0}\) (the first ordinal \(\alpha\) such that \(\omega ^{\alpha }=\alpha\) using ordinal exponentiation)Footnote 37: \(H(Fgf0\beta )(\gamma )=Fgf\gamma\) if \(\gamma \prec _{*}\beta\); \(H(Fgf0\beta )(\gamma )=g\gamma (\lambda \delta \prec _{*}\gamma )Fgf(\delta )\) if \(\gamma =\beta\); \(H(Fgf0\beta )(\gamma )=0_{T}\) if \(\beta \prec _{*}\gamma\); \(H(Fgf\zeta \beta )(\gamma )=H(H(Fgfp(\gamma ,\beta )\beta )p(\gamma ,\beta )(\beta +2^{p(\gamma ,\beta )}))(\gamma )\) if \(p(\gamma ,\beta )\prec \zeta\); and \(H(Fgf\zeta \beta )(\gamma )=0_{T}\) if \(p(\gamma ,\beta )\succeq \zeta\), where \(p(\gamma ,\beta )\) is the least \(\nu\) such that \(\gamma <\beta +2^{\nu +1}\), and \(0_{T}\) is a representation of the natural number 0 in type T. HFgf has a higher degree than Fgf: \(degree(a\rightarrow b)\) is defined by \(max(degree(a)+1,degree(b))\), Fgf has type \(\alpha \rightarrow T\) and HFgf has type \(\alpha \rightarrow 2^{\alpha }\rightarrow (\alpha \rightarrow T)\rightarrow \alpha \rightarrow T\), and so \(degree(HFgf)=max(degree(\alpha \rightarrow T)+1,degree(2^{\alpha })+1)\ge degree(Fgf)+1\). What the recursion H does is to start, at \(\zeta =0\), from the definition of F by \(\prec _{*}\) recursion, and to nest the recursion along \(\prec\) in the case of a successor ordinal \(p(\gamma ,\beta )+1\), corresponding to the double application of H needed to achieve \(2^{p(\gamma ,\beta )+1}=2^{p(\gamma ,\beta )}+2^{p(\gamma ,\beta )}\) in the \(\prec _{*}\) recursive definition. 
While this specific functional recursion only works for ordinals less than \(\epsilon _{0}\), it does show that wherever there is a finite computation tree defined by a partial ordering on a well ordering \(\alpha\), an in general nested recursion can be performed along \(\alpha\) by substitution of previously computed terms, while at the same time being performed by unnested recursion along the ordering of the finite computation sequences, which has ordinal \(2^{\alpha }\) (or \(\omega ^{\alpha }\)) (see Tait (1961); Fairtlough and Wainer (1992)).

In general a recursive functional, G say, may not result in a computable function, since in principle \(G(\beta )\) for limit \(\beta \ge \omega\) has infinitely many predecessor terms \(G(\gamma )\) for \(\gamma <\beta\). (This is not true of H because the function p is strictly decreasing for suitable ordinal input.) To make G computable (when reduced by substitution to a function \(N\rightarrow N\)) we require that \(G(\beta )\) depends on finitely many predecessor terms. We could use an ordinal notation system where every ordinal can be written as a computable function of smaller ordinals and find a suitable “predecessor” function (as in the discussion above), or we could follow a hierarchy of total recursive functions. In the latter case the standard way to enforce the finite ordinal dependency is to treat each limit ordinal \(\beta <\alpha\) as a term representing a monotonically increasing one-to-one computable function \(N\rightarrow \beta\) (called a fundamental sequence) such that \(\beta\) is the least upper bound of \(\beta (n)\) for \(n\in N\), with \(\beta (n)<\beta\) for all n. Then for a limit ordinal \(\delta\) we could have \(G(\delta )(n)=G(\delta (n))(n)\). In a similar vein, it is possible to treat \(\beta\) as a term which accepts natural number or higher type input (which can be reduced to a natural number type by substitution). By treating an ordinal as a term, we can see that nesting functionals corresponds to ordinal exponentiation, and for hierarchies of total computable functions diagonalization corresponds to limit ordinals.Footnote 38 The study of these computable functions and functionals has a very rich and current literature, see for example Fairtlough and Wainer (1992) and Rathjen (2006), and it is worth noting that ordinals in this literature are countable ordinals described in a notation system, and that having notations for (countably many) uncountable ordinals extends the hierarchy of total computable natural number functions.
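For ordinals below \(\epsilon _{0}\) in Cantor normal form, fundamental sequences and the resulting hierarchies can be programmed directly. The sketch below uses one standard assignment of fundamental sequences (conventions vary) and defines the fast-growing hierarchy, whose limit case is exactly the diagonalization \(G(\delta )(n)=G(\delta (n))(n)\) of the text:

```haskell
-- Ordinals below epsilon_0 in Cantor normal form: Cnf [a1,...,ak] stands for
-- w^a1 + ... + w^ak with a1 >= ... >= ak (so Cnf [] is 0).
data Cnf = Cnf [Cnf] deriving (Eq, Show)

isSucc :: Cnf -> Bool
isSucc (Cnf xs) = not (null xs) && last xs == Cnf []

predecessor :: Cnf -> Cnf          -- defined only for successor ordinals
predecessor (Cnf xs) = Cnf (init xs)

-- One standard fundamental sequence lambda(n) for a limit ordinal lambda.
fund :: Cnf -> Int -> Cnf
fund (Cnf xs) n
  | isSucc a  = Cnf (rest ++ replicate n (predecessor a)) -- (g+w^(b+1))(n) = g + w^b * n
  | otherwise = Cnf (rest ++ [fund a n])                  -- (g+w^l)(n)     = g + w^(l(n))
  where a = last xs; rest = init xs

-- The fast-growing hierarchy: each limit stage diagonalizes over the
-- fundamental sequence, mirroring G(delta)(n) = G(delta(n))(n).
fgh :: Cnf -> Int -> Int
fgh a n
  | a == Cnf [] = n + 1                                -- F_0(n)     = n + 1
  | isSucc a    = iterate (fgh (predecessor a)) n !! n -- F_{b+1}(n) = F_b^n(n)
  | otherwise   = fgh (fund a n) n                     -- F_l(n)     = F_{l(n)}(n)
```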

There is a natural upper bound to recursions given by the information content of a typed lambda term when represented as a quantificational deductive theory. Information content is defined by transfinite recursion over the type of natural numbers, N, by \(info(N):=\aleph _{0}\) and \(info(P\rightarrow Q):=info(P)+1\), where \(+\) is cardinal addition. For a dependent type R(a) for some term a of type T, \(info(R(a))=Card(T)\), \(info((\Pi x:T)R(x))=info((\Sigma x:T)R(x))=Card(T)+1\) and \(info((\Pi R:T)R)=info((\Sigma R:T)R)=Card(T)+1\), where Card(T) is the cardinality of type T. The reason for this definition is that predicates can be thought of as Boolean functions, i.e. as functions with value True or False. Functions are many-to-one in general, and thus the information in a function cannot be less than the information in the domain of the function. In the case of an infinite type T, taking the least upper bound of ordinals that could be assigned to a term of type T in the domain of a Boolean function yields \(info(T\rightarrow \{True,False\})=Card(T)+1\).Footnote 39 The information content has the property that it forms a natural upper bound for transfinite induction determined by the cardinality of the type. The idea is that any finite sequence of functional applications which reduces the type of the term to N can be viewed as reducing the information content from \(Card(T)+1\) to \(\aleph _{0}\) in finitely many steps. Any recursion on a term of type T that has finitely many steps from an initial value has only finitely many additional steps over \(Card(T)+1\) and can therefore be regarded as a transfinite recursion up to \(Card(T)+1\).

In general we would have to use infinitary rules for computing the truth of a quantified proposition, so to be computable the best one can hope for is to extend Gödel’s Dialectica interpretation to evaluate the truth value of a quantified proposition as a definition of a recursive functional by a finite process of term reduction. This is possible if the induction axiom of first-order Peano arithmetic is replaced by a principle of transfinite induction up to a certain ordinal, \(\alpha \le Card(T)+1\), and all quantifiers are treated as many-sorted first-order variables. Then the Dialectica interpretation will result in quantifier-free ordinal recursive functionals up to the same ordinal, \(\alpha\), in the form of the schema in Box 1, since at limit ordinals in applications of the principle of transfinite induction the finite dependency condition must apply to ensure a finite sequence of terms witnessing theorems in the theory.

This Dialectica-based (System T) approach does not address impredicative axioms such as comprehension axioms except insofar as ordinals can be defined impredicatively. This may be sufficient if every type is isomorphic to an ordinal, which requires the Axiom of Choice and reduces a type to a set; but in general we should use an impredicative dependent type system of the typed lambda calculus such as the Calculus of Constructions with inductive definitions to define recursive functionals of transfinite type in the form of the schema in Box 1. To be clear, what is being suggested is an impredicative, dependent type extension of Gödel’s System T where the impredicativity is used to define structures (such as ordinals) and the dependent types enable the hierarchy of transfinite types to be extended to these impredicative structures.

The table below (Table 2) gives some upper bounds on definition by transfinite recursion in various systems of the typed lambda calculus, comparing the (in general uncountable) cardinal bound with the countable ordinal bound (the proof-theoretic ordinal) of the computable functions (where known).Footnote 40

Table 2 A comparison of the cardinal and ordinal strength of theories of the typed lambda calculus

It may be surprising to see uncountable cardinals (and therefore ordinals) in this table, but they are present because in impredicative, higher order systems of the typed lambda calculus, and in the corresponding deductive theories, such ordinals are definable as types. \(2^{\aleph _{0}}\) was defined in Sect. 7 above, and \(\aleph _{1}\) can be defined impredicatively as the intersection of well orderings \(\le 2^{\aleph _{0}}\) which include an uncountable well ordering. The computation of the value (or normal form) of a particular term involving types will always be a finite computation involving term rewriting, but there is no ordinal limit on the types of the terms involved. The reason why all the systems of the typed lambda calculus do not complete the set of \(\Pi _{2}^{0}\) arithmetical truths (and therefore the set of arithmetically definable total functions) is simply that all the term rewriting or deduction systems can be coded as operations on numbers, and all that richer and richer types add is the ability to frame larger and larger sets of total functions of the natural numbers. The set of computable functions is countably infinite because there are countably infinitely many finite computer programs, but the set of computable functions cannot be computable, otherwise it would be possible to diagonalize out of the set to produce a computable function not in the set.

11 Conclusions

This paper has argued that computable functions can be identified with the functions that correspond to reduction sequences of terms in systems at least as strong as the Calculus of Constructions. What one ends up with ultimately (at least with extensional identity of types) is a version of Zermelo Fraenkel set theory with restricted transfinite recursion and a functional rather than collection based semantics operating on generic variables or constants of a given type (which could be an inductively defined type such as a power type or a limit type). It is possible to go further and look at the computable functions of subtheories of second-order Zermelo Fraenkel set theory (that is, with a second-order replacement axiom), but it is then necessary to treat the set theoretic universe as a determinate whole, as a proper class. New set formation principles do arise from the theory of proper classes (see for example Hellman (1989), Welch and Horsten (2016) and Roberts (2017)), but given that the literature on large cardinals and reflection principles is very large, it is sufficient here to note that computable functions are not bounded by any deductive theory or system of the typed lambda calculus, only by the requirement that they form a non-computable countably infinite set.