Introduction

Nullum est iam dictum, quod non dictum sit prius.

Nothing is now said that has not been said before.

Terence, The Eunuch, 161 BCE

In the hope of avoiding the trap mentioned by Terence almost 22 centuries ago, we shall refrain from repeating all of the ideas set out recently by ourselves (Cárdenas et al. 2018; Mariscal et al. 2019; Cornish-Bowden and Cárdenas 2020) and others (Ramstead et al. 2018; Bich and Green 2018; Branscomb and Russell 2018a, b; Fiévet et al. 2018; Cleland 2019; Nicholson 2019; Peretó 2019). Instead we shall examine a number of points that were not considered in detail, or not considered at all, in the cited articles. This approach may appear to make the account somewhat disconnected:Footnote 1

“VERRA interesting, but a wee bit disconnected,” the Scottish lady is reported to have said when she returned the dictionary thinking it was a novel. The same judgment may well be passed on the present collection of quotations, at least as far as the disconnection is concerned. I can only hope that the reader may find the interest makes up for it.

McEachran (1974, p. 1)

Inevitability of life

All life is a stage and a game: either learn to play it, laying by seriousness, or bear its pains.

Palladas, translated by Mackail (1890, xii.xliii)

Life as a random result

Die Hauptsache war, daß man lebte. Das war die Hauptsache.

The main thing was to stay alive; that was the main thing.

Rainer Maria Rilke (1910)

According to Jacques Monod (1972, pp. 145–146),

If [the emergence of the human species] was unique, as may, perhaps, have been the appearance of life itself, then before it did appear its chances of doing so were infinitely slender. The universe was not pregnant with life nor the biosphere with man. Our number came up in the Monte Carlo game.

Monod’s position was probably a minority one, however, and still is: many biochemists are more inclined to agree with George Wald (1963) that

Life in fact is probably a universal phenomenon wherever in the universe conditions permit and sufficient time has elapsed.

After quoting Wald and others who take the deterministic view, a third Nobel prizewinner, Christian de Duve (1991, pp. 212–213) expressed it as follows:

My model is similarly burdened with uncertainties and biases — mostly those of a biochemist, as it happens. But, within its hypothetical framework it is sufficiently detailed and precise to allow a clearcut conceptualization: it is emphatically and unambiguously deterministic. If the model is correct, then anywhere in the universe where the conditions that prevailed in the early days of our planet should obtain, life would develop in the same way, to the point of using the same amino acids and other structural building blocks, the same metabolic pathways; the same sorts of enzymes and coenzymes, the same kinds of proteins and nucleic acids, the same blueprint — if not in every small detail, at least in all essential features.

The long last sentence seems to us to be extreme: we do not expect extraterrestrial life to use exactly the same molecules and metabolic pathways as the life that we know, but in broad outlines the earlier part is probably correct. For us and others, such as Soto et al. (2016), the crucial first step was the appearance of self-organization. Until a primitive system had become able to organize itself and maintain its organization, further development towards life would not have been possible: staying alive was the crucial step before reproduction and evolution could occur.

The appearance of life, followed by natural selection, may well have been inevitable, and that may be all that de Duve was saying, but it is a large step to conclude from this that the appearance of intelligent life, and in particular of humanity, was inevitable. Without being dogmatic about it, we share Monod’s view that “the universe was not pregnant with life nor the biosphere with man.” His next sentence, that “Our number came up in the Monte Carlo game,” expresses the extreme improbability of the emergence of life, and possibly of humanity. Nonetheless, once the apes had evolved it may have been just a matter of time before humans, or organisms resembling humans, evolved.

A natural path to self-organization

Kauffman (1986) argued that self-organization could arise more easily than one might imagine:

The prebiotic emergence of reflexively autocatalytic sets of protein-like polymers may have been highly probable.

He arrived at this conclusion by considering mixtures of very large numbers of fundamental components (which do not have to be amino acids or nucleic acid bases) such that each component has a small but finite probability of catalysing the formation of a bond between two other components. He found that even if the probability of such a catalysis is very small, a self-organizing collection of components, which he called an autocatalytic set, will arise naturally in a sufficiently large population. Notice that his system relies on entirely random events: although self-organization may arise independently on two different planets with appropriate conditions, there is no reason to expect it to arise in the same way, and thus to contain the same or even similar metabolites and metabolic pathways, let alone enzymes; indeed there is every reason to expect them to be completely different. Cleland (2019) has even suggested that fundamentally different forms of microbial life, the shadow biosphere, may exist on Earth, so far undetected.

Kauffman took \(10^{-9}\) as the probability that a random component would catalyse a selected reaction:

Let \(P = 10^{-9}\). Then for each specified reaction, on the order of \(10^9\) polymers must be tried to find at least a weak catalyst.

However, he also considered other values in the range \(10^{-8}\)–\(10^{-4}\) and gave no reason to think that \(10^{-9}\) was fundamental. Nonetheless, it is interesting to note that experimental estimates (Ellington and Szostak 1990; Sassanfar and Szostak 1993) indicate that roughly one in \(10^{10}\)–\(10^{11}\) polynucleotides of random sequence folds in such a way as to create a specific binding site for small ligands such as organic dyes and ATP. Subsequent studies with polypeptides 80 residues in length (Chaput and Szostak 2004) suggest that a similar probability of \(10^{-11}\) applies to random polypeptides. Thus the value of \(10^{-9}\) used by Kauffman was not totally removed from reality.
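The arithmetic behind these estimates can be checked with a short calculation. The sketch below assumes, as a simplification that is ours and not Kauffman's, that each random polymer is an independent trial with the same probability P of catalysing a specified reaction; the function name is hypothetical.

```python
import math

def prob_at_least_one_catalyst(p: float, n_polymers: float) -> float:
    """Probability that at least one of n independent random polymers
    catalyses a specified reaction, each succeeding with probability p."""
    # This is 1 - (1 - p)**n, written with log1p/expm1 so that it stays
    # accurate when p is as small as 1e-9 or 1e-11.
    return -math.expm1(n_polymers * math.log1p(-p))

if __name__ == "__main__":
    for p in (1e-9, 1e-11):
        n = 1 / p  # "on the order of 1/P polymers must be tried"
        chance = prob_at_least_one_catalyst(p, n)
        print(f"P = {p:g}, polymers tried = {n:g}, chance of a catalyst = {chance:.3f}")
```

Under these assumptions, trying about 1/P polymers gives a chance of roughly 1 − 1/e (about 63%) of finding at least one weak catalyst, and trying ten times as many makes success almost certain, whatever the value of P.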

Emergence

Is life an emergent property?

If a living organism were simply the sum of its parts it would not be an emergent system, but if it only functions as an entire system then it may be: the parts can only be understood in terms of the whole organism (Cornish-Bowden et al. 2004). Cairns-Smith (1985, p. 39) made a similar point by saying that “it is the whole machine that makes sense of its components.”Footnote 2 If there were a linear sequence of upward causation from genes through several steps to a whole organism then there would be no emergence. However, it is clear that properties of a whole organism affect its component processes, so there is some downward causation, and a more complete description involves democratic causation (Westerhoff et al. 1990; Hofmeyr and Westerhoff 2001), with elements of both upward and downward causation entering into consideration: there is no “up” and no “down;” everything affects everything else. There is thus no hierarchy in a living organism (Letelier et al. 2011). Noble (2008, 2012) argued that downward causation was essential, and we adopted the view of Westerhoff et al. (1990) that democratic causation was an essential part of the picture (Fig. 43 of Cornish-Bowden and Cárdenas 2020).

Fig. 1

Closure to efficient causation. (a) Rosen’s diagram in Fig. 10C.6 of Life Itself to illustrate a series of alternating efficient and material causes (Rosen’s use of arrows was inconsistent between his Figs. 10C.5 and 10C.6, but we have avoided inconsistency here): f, the efficient cause (catalyst) of metabolism, acts on A, the set of substrates of metabolic reactions, to produce B, the set of products; B is the source for forming f, under the action of \(\Phi \), which catalyses the regeneration of f. If \(\beta \), which catalyses the formation of \(\Phi \) from f, is represented as a property of B (not as the whole of B), then infinite regress is avoided. (b) Our attempt to make the diagram more intelligible (Letelier et al. 2011, with some modifications). The irreversible conversion of nutrients into waste, thermodynamically necessary, and explicit in Essays on Life Itself (Rosen 2000, pp. 17–18), but only implicit in panel (a), is shown explicitly here. (c) Definitions of the types of arrow in (b) as material causes, or substances transformed (“substrates”), and efficient causes, or catalysts

Rosen (1991) based his view of causation on Aristotle’s classification of four causes of some result, illustrated here for glucose 6-phosphate, as discussed earlier (Cornish-Bowden et al. 2007):

  • The material cause defines what it is made from, glucose and ATP;

  • The formal cause defines what it is, an intermediate in glycolysis;

  • The efficient cause defines what brought about its formation, the enzyme hexokinase;Footnote 3

  • The final cause defines the reason why it was made, to participate in harnessing metabolic energy.Footnote 4

Rosen was primarily concerned with the efficient cause, or catalyst in the context of metabolic processes, though he also tried to rescue the final cause from the disrepute into which it had fallen since the time of David Hume, without, however, attributing it to the Will of God. He considered that all catalysts needed by an organism needed to be products of the organism itself, and expressed this ideaFootnote 5 by saying that “organisms are closed to efficient causation.” The diagram that he used to illustrate this assertion (Fig. 1a) is not much easier to understand than the statement itself, but it is drawn in a somewhat clearer way as Fig. 1b, with a more concrete representation in Fig. 2. We have given a much fuller explanation of this diagram in a recent review (Cornish-Bowden and Cárdenas 2020). The essential point is that a living organism is an emergent system, because nothing apart from nutrients is given from outside: all the catalysts (apart from metals, as discussed later) are produced by the organism itself. Figure 2Footnote 6 illustrates how a metabolism-replacement system, or (M, R)-systemFootnote 7 can be conceived in less abstract terms than the diagrams in Fig. 1. Notice that in Fig. 2, a single catalyst, STU, catalyses two different reactions. This point is more important than it may seem, and we shall return to it later in the context of moonlighting.

Fig. 2

A simple self-organizing system (Cornish-Bowden et al. 2007; Cornish-Bowden and Cárdenas 2007). The net effect is to bring about the reaction S + T \(\rightarrow \) ST, with the help of two catalysts STU and SU, which are themselves products of the system. STU, SU and ST all decay, and need to be continually resynthesized

The value of theory

Hertz and Maxwell could invent nothing, but it was their useless theoretical work which was seized upon by a clever technician and which has created new means for communication, utility, and amusement by which men whose merits are relatively slight have obtained fame and earned millions. Who were the useful men? Not Marconi, but Clerk Maxwell and Heinrich Hertz. Hertz and Maxwell were geniuses without thought of use. Marconi was a clever inventor with no thought but use.

Abraham Flexner (1939)Footnote 8

Rosen’s work is very abstract, and more generally the development of theories of life could be regarded as a useless activity, leading to no practical applications. Contreras et al. (2011), for example, expressed a widespread view in writing as follows:

Two crucial models that definitively put metabolic control at the very center of biological organization are Autopoiesis, formulated by Maturana and Varela (1980) and Rosen’s (M, R) Systems (Rosen 1958). But these two theoretical studies (like the Chemoton or Autocatalytic sets), although very clarifying in basic aspects, have not produced technical results that illuminate the daily life of bench biologists involved in experimental research.

We agree with this general statement, and we think that the difficulty of understanding Rosen’s ideas has meant that they have not yet had much impact on experimental research. In consequence, we and Letelier have regarded it as an important responsibility to bring these ideas to a broader audience by making them more intelligible: the theory of (M, R) systems provides several landmarks for experimental research, such as the inevitability of moonlighting, the limit of how small a self-organizing system can be, and the absence of a hierarchy from biological organization, all discussed elsewhere in this paper. For more discussion see Letelier et al. (2011), and Cornish-Bowden and Cárdenas (2020).

Fig. 3

Quadrant diagram of modes of seeking solutions. This kind of representation was suggested by Whitesides (2015), and is explained in the text

Boi (2019) discussed the differences between developing a theory and finding a practical solution to a problem (Fig. 3), and drew our attention to the quotation of Flexner (1939) above. The four parts of Fig. 3 have the following meanings:

  (a) Theories of life can be very helpful for understanding the nature of organisms, but they have not led in any obvious way to technological advances.

  (b) Bioinspired catalysts based on biological exemplars are now being devised, for example by Papini et al. (2019), who use their understanding of chemical catalysis to arrive at practical methods.

  (c) Failed attempts to improve yields of industrially valuable metabolites have often been based on the idea that overexpression of the enzyme catalysing the rate-limiting step of a metabolic process will lead to unlimited production of the desired metabolite. Although in some cases, such as the rate of translation of mRNA as a function of the availability of tRNA (Garel 1974; Liljenström et al. 1985), the rate of a metabolic reaction may be proportional to the amount of catalyst, the proportionality is usually confined to the physiological range of conditions;Footnote 9 it does not allow the large increases desired for most technological applications, because the sensitivity of a flux to any catalyst concentration always decreases when that concentration increases (Kacser et al. 1995).

  (d) Living organisms such as Escherichia coli have no understanding of what they are doing, but they are very successful at living.
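The diminishing return described under (c) can be illustrated with a toy calculation. The sketch below assumes a hypothetical two-step linear pathway in which each enzyme contributes a "resistance" a_i/e_i and the steady-state flux is the reciprocal of their sum; this functional form, the parameter values and the function names are illustrative assumptions, not taken from the cited papers.

```python
def flux(e1: float, e2: float = 1.0, a1: float = 1.0, a2: float = 1.0) -> float:
    """Steady-state flux of the assumed toy pathway: J = 1 / (a1/e1 + a2/e2)."""
    return 1.0 / (a1 / e1 + a2 / e2)

def flux_control_coefficient(e1: float, e2: float = 1.0,
                             a1: float = 1.0, a2: float = 1.0) -> float:
    """Scaled sensitivity d ln J / d ln e1 of the toy flux to enzyme 1."""
    r1, r2 = a1 / e1, a2 / e2
    return r1 / (r1 + r2)

if __name__ == "__main__":
    # Overexpress enzyme 1 by factors of 10 while enzyme 2 stays fixed.
    for e1 in (1.0, 10.0, 100.0):
        print(f"e1 = {e1:6.1f}  J = {flux(e1):.3f}  "
              f"sensitivity = {flux_control_coefficient(e1):.4f}")
```

In this toy model a hundredfold overexpression of the first enzyme less than doubles the flux, because the sensitivity of the flux to that enzyme falls as its concentration rises and control passes to the other step.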

Catalysis and regulation

Catalytic cycles

Catalytic cycles are an essential characteristic of current theories of life (Contreras et al. 2011; Kreyssig et al. 2012). Gánti (2003) laid great emphasis on them in his chemoton model, as did Eigen and Schuster (1977, 1978a, 1978b) in their development of the hypercycle. They are less obviously present in autocatalytic sets (Kauffman 1969) and autopoiesis (Maturana and Varela 1980), but they must certainly be implied if these theories are to be fully developed. Kauffman (1969) specifically refers to catalysis, and thus implicitly assumes cycles (see below). Maturana and Varela (1980) do not mention catalysis, but their model could not work without it.

Fig. 4

Catalysis as a cycle. Any catalysed process (a) can be written as a cycle of uncatalysed reactions (b) in which the catalyst is regenerated at the end of the cycle. Although we label STU as the catalyst we could equally regard STUS or STUST as the catalyst; thus the whole cycle is the catalyst. A more concrete example from metabolism is shown in Fig. 5

The case of (M, R) systems (Rosen 1958, 1991) can be seen by examining Fig. 2, or system I in Fig. 8 (below): the intermediate STU catalyses two processes: S + T \(\rightarrow \) ST and S + U \(\rightarrow \) SU, and SU catalyses a third, ST + U \(\rightarrow \) STU. The cyclic nature becomes more evident if we write each process as a series of uncatalysed reactions, as illustrated for example in Fig. 4. It is then evident that STU \(\rightarrow \) STUS \(\rightarrow \) STUST \(\rightarrow \) STU forms a cycle in which STU is regenerated and thus catalyses the whole process (it could equally well be considered to be catalysed by STUS or STUST). The idea of a cycle is implicit in any catalysis, as the catalyst must be regenerated at the end of the process: as long as we accept that catalysis is needed we must accept that there are cycles. As Eigen and Schuster (1977) emphasized, any enzyme-catalysed reaction can be represented as a cycle of uncatalysed reactions, and the example of the reaction catalysed by hexokinase is illustrated in Fig. 5.
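That a catalysed process amounts to a cycle of uncatalysed steps can be checked by simple bookkeeping. The sketch below writes the STU-catalysed reaction as three uncatalysed steps and sums their stoichiometry; the order of binding and the final release step (STUST giving STU and ST) are our reading of the cycle described above, and the helper name is hypothetical.

```python
from collections import Counter

# Each step is (reactants, products). The order of binding is an assumption.
CYCLE = [
    (["STU", "S"], ["STUS"]),
    (["STUS", "T"], ["STUST"]),
    (["STUST"], ["STU", "ST"]),
]

def net_reaction(steps):
    """Sum the steps: negative counts are net consumption,
    positive counts are net production, and zeros are dropped."""
    net = Counter()
    for reactants, products in steps:
        net.subtract(reactants)
        net.update(products)
    return {species: n for species, n in net.items() if n != 0}

if __name__ == "__main__":
    print(net_reaction(CYCLE))  # {'S': -1, 'T': -1, 'ST': 1}
```

STU, STUS and STUST all cancel from the sum, leaving only S + T \(\rightarrow \) ST, which is why any of the three could equally well be labelled as the catalyst.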

More than 200 years ago, Elizabeth Fulhame (1794) had a good understanding of catalysis as a cycle of reactions (Laidler and Cornish-Bowden 1997), though she did not use the word catalysis, which was introduced by Berzelius 40 years later. She believed that water participated as a reactant in all oxidation–reduction reactions (an exaggeration, of course) and that it was regenerated at the end of the cycle:

The carbone attracts the oxygen of the water, and forms carbonic acid, while the hydrogen of the water unites with oxygen of the vital air, and forms a new quantity of water equal to that decomposed.

Catalysis at the origin of life

Any model of life must involve catalysis and therefore catalysis must have been required at the beginning. Although present-day metabolic processes are nearly all catalysed by protein enzymes, that cannot have been the case at the origin of life, when much simpler molecules (including short polypeptides or polynucleotides) or ions, especially metal ions, must have sufficed. Moreover, at the origin of life the materials could have been quite different from the present ones, as argued by Cairns-Smith (1985, pp. 62–64).

Some metal ions, such as \(\hbox {Mg}^{2+}\), \(\hbox {Zn}^{2+}\) and Mo (various oxidation states), still have important roles. Ionized forms of iron are still involved in many processes, but the unionized metal \(\hbox {Fe}^0\) is not, as it has only become available again in the environment since the beginning of the iron age about 3000 years ago, a trivial period on an evolutionary scale. However, the bombardment of the Earth more than \(4 \times 10^9\) years ago by planetesimals caused large amounts of \(\hbox {Fe}^0\) to be raised to the upper atmosphere, from where it rained down for some \(10^7\) years (Genda et al. 2017; Marchi et al. 2018), after which it was oxidized to \(\hbox {Fe}^{2+}\) by \(\hbox {SO}_4^{2-}\) and later on to \(\hbox {Fe}^{3+}\) by \(\hbox {O}_2\). Thus \(\hbox {Fe}^0\) may well have been available as a catalyst at the origin of life (Muchowska et al. 2017).

Fig. 5

Any catalysed process can be written as a cycle of uncatalysed reactions. The reaction catalysed by hexokinase is shown here both (a) as a single catalysed reaction, and (b) as a cycle of uncatalysed processes. Although all the reactions are in principle reversible, only the interconversion of ternary complexes is irreversible for practical purposes. The scheme incorporates the usual assumption that glucose is the first substrate and glucose 6-phosphate the last product, but that may not be the case for all forms of hexokinase (Monasterio and Cárdenas 2003)

The role of divalent metal ions in catalysis appears to raise an objection to Rosen’s view that all catalysts must be produced by the organism. Of course, metals are not produced by living organisms, but organisms do synthesize molecules that trap metal ions and allow them to be brought into the cell. Thus the autopoietic concept is respected even in the case of metals. Bacteria in general have evolved mechanisms to acquire metals, such as \(\hbox {Mn}^{2+}\), \(\hbox {Fe}^{2+}\), \(\hbox {Co}^{2+}\), \(\hbox {Ni}^{2+}\), \(\hbox {Cu}^{2+}\) and \(\hbox {Zn}^{2+}\), from the environment. For example, the pathogen Pseudomonas aeruginosa produces a molecule, pseudopaline, that allows it to sequester \(\hbox {Zn}^{2+}\) and \(\hbox {Ni}^{2+}\) (Lhospice et al. 2017).

Diverse functions of enzymes: moonlighting

It has usually been assumed that the first organisms could not have relied on highly specific enzymes such as those known today, and that the early enzymes must have been low in both activity and specificity. For example, Kacser and Beeby (1984) wrote as follows:

In essence our argument is that enzyme systems in early “cells” consisted of a small number of catalytic proteins with low specificity and low turnover numbers, a view previously expounded by Waley (1969), Yčas (1974) and Jensen (1976).

These arguments are based on plausibility, but cannot be seriously doubted, and it is certain that most modern enzymes have both high activity and high specificity. Nonetheless, their specificity is not nearly as total as often thought, and mechanisms are needed to repair the damage caused by the toxic products released by parasitic side reactions (Van Schaftingen et al. 2015; Bommer et al. 2020). These include 1,5-anhydroglucitol 6-phosphate, a side product of the hexokinase-catalysed reaction, which is normally detoxicated by specialized repair enzymes, as illustrated in Fig. 6. When these are absent or insufficiently active the unwanted product can cause neutropenia, a disease of white blood cells. In addition, the repair enzymes are not themselves perfectly specific, as noted by Bommer et al. (2020):

Like all enzymes, metabolite repair enzymes are also expected to have side-activities. In some cases, these activities might appear to catalyze novel physiological reactions, albeit at a rate that is far too low to influence the metabolism of the specific metabolites.

Fig. 6

Undesirable side reactions. (a) The reaction catalysed by hexokinase normally has glucose as substrate and glucose 6-phosphate as product, but (b) 1,5-anhydroglucitol can also act as substrate, releasing 1,5-anhydroglucitol 6-phosphate as product, a strong inhibitor of the enzyme, and thus capable of interfering with the physiologically important production of glucose 6-phosphate. Although this can be removed by the glucose 6-phosphate transporter it causes neutropenia (an abnormally low concentration of circulating neutrophils) if allowed to accumulate. Adapted from Bommer et al. (2020)

The examples discussed in these papers concern non-standard molecules that resemble the specific substrates and products sufficiently to interact with the active sites of the enzymes. In addition, increasing numbers of cases of moonlighting (Jeffery 2003) are known, in which the unexpected function may be completely different from the expected one. For example, it has long been known that glycolytic enzymes such as alcohol dehydrogenase have structural roles in lens crystallin (Piatigorsky 1992). Glyceraldehyde 3-phosphate dehydrogenase, which has around ten different functions, provides a striking example.

We noted earlier that in Fig. 2 the intermediate STU catalyses two processes: S + T \(\rightarrow \) ST and S + U \(\rightarrow \) SU. The question then arises of whether this is just a fault in the scheme, an example of moonlighting, or something unavoidable. In our view (Cornish-Bowden et al. 2007) it is unavoidable, an essential consequence of closure to efficient causation: if every catalyst catalyses just one reaction, and does nothing else, and if every metabolite has just one function, it is impossible to escape from infinite regress. Thus moonlighting is an essential feature of living systems, not just an interesting property.
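The closure condition can be made concrete for the three reactions of Fig. 2. The sketch below is a hypothetical encoding (the data structure and function names are ours): it checks that every catalyst is itself a product of the system, and counts how many reactions each catalyst serves.

```python
from collections import Counter

# (reactants, product, catalyst) for the system of Fig. 2 as described in the text.
REACTIONS = [
    (("S", "T"), "ST", "STU"),
    (("S", "U"), "SU", "STU"),
    (("ST", "U"), "STU", "SU"),
]

def is_closed_to_efficient_causation(reactions) -> bool:
    """True if every catalyst is produced by some reaction of the system."""
    products = {product for _, product, _ in reactions}
    catalysts = {catalyst for _, _, catalyst in reactions}
    return catalysts <= products

def reactions_per_catalyst(reactions) -> Counter:
    """Number of reactions catalysed by each catalyst."""
    return Counter(catalyst for _, _, catalyst in reactions)

if __name__ == "__main__":
    print(is_closed_to_efficient_causation(REACTIONS))
    print(dict(reactions_per_catalyst(REACTIONS)))
```

In this encoding there are three reactions but only two catalysts, and the system is closed only because STU serves twice. Replacing the catalysts by three dedicated ones that the system does not make (say E1, E2 and E3, hypothetical names) breaks closure, and each would then need a further catalysed reaction to produce it.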

Figure 6 illustrates one of several examples of repair in metabolism, including the operation of chaperones, which act to convert misfolded proteins to active forms (Jaenicke 1995), and proofreading for correcting errors in DNA replication caused by insufficient specificity (Hopfield 1974; Ninio 1975). None of these correspond to the sense in which Rosen (1991) used the term repair, meaning really replacement (see footnote 7).

Metabolic regulation

Kinetic regulation is a crucial component of a theory of life. However, some authors, such as Russell et al. (2014), discuss life almost entirely in terms of thermodynamics, implying that if sufficient energy is available then life is possible.Footnote 10

Benner (2009) has described an example to illustrate why it is not as simple as that. He suggested an experiment in which a sample of breakfast cereal and water is divided into two parts. One part is heated in an oven at 220 \(^\circ \)C and maintained at that temperature until there is no further change, and the other is placed in a cage with a pair of adult guinea pigs and maintained at ambient temperature for several weeks. Despite the ample supply of energy the first sample will yield nothing apart from asphalt or tar, whereas the second will yield some new-born guinea pigs in response to a much smaller input of energy. It follows that a living organism not only needs a supply of energy but it also needs internal organization and a means of regulating the chemical reactions that are in principle possible.

However, kinetic regulation is ignored in (M, R) systems, autopoiesis, the chemoton, autocatalytic sets, the hypercycle, and most other theories of life.Footnote 11 Indeed, none of them attempts to incorporate feedback inhibition, which was first described by Dische (1941), as recently discussed (Cornish-Bowden 2021), although it is usually attributed to others (Umbarger 1956; Yates and Pardee 1956). It is now widely recognized to be a crucial mechanism of metabolic regulation (Hofmeyr and Cornish-Bowden 2000; Cornish-Bowden 2012) that needs to be included in theories of life. As we have discussed this in detail in a recent review (Cornish-Bowden and Cárdenas 2020) we shall not do so here.

Nutrition at the origin of life

Tell me: what’s the point of life? I say it’s drinking. Look at the trees along torrent streams that stay moist all day and all night; how large and beautiful they grow! But those that resist are destroyed root and branch.

Antiphanes, Fragment 228, translated by S. Douglas Olson

Heterotrophic organisms as understood today subsist on other organisms or on their products. However, at the time of the first organisms (long before the last universal common ancestor of all extant organisms, or LUCA, to be discussed shortly) there were no others to subsist on, so heterotrophy in that sense was clearly impossible, but multiple self-organizing systems that use products of one another, as we have suggested when analysing ecosystems (Cárdenas et al. 2018), could have existed. Some materials still need to come from the environment, however. The prebiotic environment probably contained amino acids and carbohydrates, as we have discussed in the context of hexokinase evolution (Cárdenas et al. 1998), sugars (in complex mixtures) being products of the polymerization of formaldehyde in the formose reaction (Boutlerow 1861). Furthermore, it has been known for many years that a simple gas mixtureFootnote 12 (\(\hbox {CH}_4\), \(\hbox {H}_2\), \(\hbox {H}_2\)O and \(\hbox {NH}_3\)) exposed to repeated electric discharges can produce a variety of organic compounds (Miller 1953).Footnote 13

Present-day autotrophs subsist on inorganic precursors: photoautotrophs, such as green plants, obtain energy from light; chemoautotrophs, such as bacteria that oxidize \(\hbox {H}_2\)S, obtain it by transforming the chemical environment; some fungi even use ionizing radiation in derelict nuclear power stations (Dadachova et al. 2007). Current theories of the origin of life suggest that prebiotic reactions may have been driven by sunlight (Rapf and Vaida 2016), or by chemical energy available close to deep-ocean vents (Martin and Russell 2007; Branscomb and Russell 2018a, b), or in volcanic hot springs (Damer and Deamer 2020), as long as chemical gradients across boundaries were capable of existing.

Other views on the origin of life

There is at least strong suspicion that [the] first biochemical materials were quite different from now....

Above all what turned out to be incidental was the unity of biochemistry. This was a big Red Herring. The unity of biochemistry was seen to be an effect of a quite substantial period of evolution.

Cairns-Smith (1985, pp. 63, 65)

Two other theories are noteworthy:

  1. Cairns-Smith (1985, pp. 28–30) criticized the lack of a satisfactory explanation of information storage in conventional ideas of the origin of life, nucleic acids being too difficult to synthesize to be plausibly present at the origin of life. He suggested a mechanism for information management that depends on crystals of clay rather than nucleic acids, in other words on a material that is abundantly available in the inorganic environment and requires no elaborate synthesis.

  2. Wächtershäuser (1997) criticized the idea of a “prebiotic soup” (Haldane 1929; Oparin 1953), arguing that it piled hypothesis on hypothesis without ever arriving at a testable falsifiable prediction:

    With every modification the prebiotic broth theory increased in vagueness and ambiguity and it decreased in explanatory power and falsifiability. This development of the theory was counterscientific.

    He proposed that protometabolism occurred on pyrite (\(\hbox {FeS}_2\)) surfaces, i.e., that the first replicators existed in an iron-sulphur world. Freire (2020) has recently incorporated Wächtershäuser’s view of prebiotic chemistry into what he calls “a dual origin of metabolism.”

LUCA, the last universal common ancestor

Fig. 7

The last universal common ancestor (LUCA). After the origin of life approximately \(3.8 \times 10^9\) years ago a very long time passed before the divergence of the three domains known today, the Eubacteria, Archaea and Eukarya. (The horizontal scale is not intended to be a linear scale of time.) LUCA is generally considered to be the most recent common ancestor of these three domains, and is shown at (a). However, if the giant viruses derive from an organism that diverged earlier (see text), then LUCA needs to be moved to (b). If the hypothetical “shadow biosphere” exists then it could have diverged either before the giant virus ancestor, as shown at (c), or between (a) and (b). Points on lines that became extinct without leaving descendants (d, d\('\)) could not be LUCA

Figure 7 illustrates a view of the divergence of life from its origins to the three domains recognized today, the Eubacteria, Archaea, and Eukarya. The organism at the point where these three diverge, marked (a) in the figure, is LUCA. Note that we (Cornish-Bowden and Cárdenas 2017, 2020) consider LUCA to have existed a very long time after the origin of life, as also implied by Fig. 1 of a recent article of Nasir et al. (2020), and by their statement that “LUCA was not the first cell. It was the last population of cells that diversified into modern cells.”

Reality may be more complicated than that, however. First of all, two-thirds of the genome of Pithovirus sibericum, one of the giant viruses, appeared initially to be very different from other known genomes (Legendre et al. 2014), though now there is more information about its relationships to other viruses (Christo-Foroux et al. 2020). Although this virus does not itself satisfy the definition of a living organism, its genome may suggest that an ancestral organism, now extinct, existed at point (b), before the divergence of the three domains. If so, then LUCA will need to be redefined as point (b). Moreover, the genomes of the other giant viruses may likewise derive from a very archaic organism, as suggested by Nasir et al. (2020).

A second complication arises from the hypothesis that the shadow biosphere of Cleland (2019) still exists. If so, it could have diverged anywhere at all from the other lines, maybe more than \(10^9\) years ago, and if it is discovered there will need to be further revision of the location of LUCA in the figure, for example at point (c).
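How the position of LUCA depends on which lineages are counted can be illustrated by computing a most recent common ancestor on a small tree. The sketch below assumes a hypothetical topology loosely following Fig. 7: the node names a, b and c are those of the figure, but the unresolved three-way split of the domains at (a) and the placement of the shadow biosphere at (c) are assumptions, and the calculation isn't a phylogenetic method.

```python
# child -> parent; "origin" (the origin of life) is the root of the assumed tree.
PARENT = {
    "c": "origin",
    "b": "c",
    "a": "b",
    "Eubacteria": "a",
    "Archaea": "a",
    "Eukarya": "a",
    "giant_virus_ancestor": "b",
    "shadow_biosphere": "c",
}

def lineage(node):
    """Path from a node back to the root, starting with the node itself."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def last_common_ancestor(taxa):
    """Most recent node shared by the lineages of all the given taxa."""
    paths = [lineage(taxon) for taxon in taxa]
    shared = set(paths[0]).intersection(*paths[1:])
    # Walking one lineage from tip to root, the first shared node is the most recent.
    return next(node for node in paths[0] if node in shared)

if __name__ == "__main__":
    domains = ["Eubacteria", "Archaea", "Eukarya"]
    print(last_common_ancestor(domains))
    print(last_common_ancestor(domains + ["giant_virus_ancestor"]))
    print(last_common_ancestor(domains + ["giant_virus_ancestor", "shadow_biosphere"]))
```

With only the three domains the answer is (a); including the giant virus ancestor moves it back to (b), and including a shadow biosphere placed as assumed here moves it back to (c).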

Most authors today (Tuller et al. 2010; Cornish-Bowden and Cárdenas 2017) regard luca as an organism itself, probably resembling a modern bacterial species. Indeed, Chapman and Ragan (1980) were more explicit:

We assume that the common ancestor to all extant forms of life (concept organismFootnote 14) — were it alive today — would be placed in the genus Clostridium, or a biochemically similar genus.

However, there are some, reviewed by Mariscal and Doolittle (2015) and Mariscal et al. (2019), who argue that it should be regarded as a community of organisms. We find that improbable for luca itself, but in any case it must have lived long after the origin of life, the only survivor from a very long period of evolution. This long period started with chemical evolution (Meléndez-Hevia et al. 2008) before the Darwinian threshold (Woese 2002), and was followed by natural selection of genetically defined organisms.

Coexistence of protoorganisms

Even if we do not regard luca itself as a community, there is no corresponding difficulty in thinking that the first living organisms were formed from a community of interacting self-organizing systems capable of passing material between one another (Benomar et al. 2015; Ranava et al. 2021) and sustaining one another. We have described elsewhere how two primitive (M, R) systems could interact in this way (Cárdenas et al. 2018), as illustrated in Fig. 8.

Fig. 8

Two interacting self-organizing systems (Cárdenas et al. 2018), each of the type shown in Fig. 2. Both consume food molecules U and T, assumed to be present in the environment. In addition, system I excretes Z, used as a food molecule by system II, which excretes S, used as a food molecule by system I. In the absence of S and Z neither system could organize itself, but together they can

Closely related to this is the question of ecosystems: in what sense are individual organisms individuals, if they depend on other individuals, even if only as food, in order to survive? The system illustrated in Fig. 8 can be regarded as a primitive ecosystem, and life as it is known today consists of a huge number of such interacting systems. These can in principle evolve as separate organisms, but only if changes in one are tolerated by changes in the other. Thus in addition to the separate evolution of the individual organisms the ecosystem as a whole must evolve.

The definition becomes even more difficult when we consider obligate symbionts. As an extreme example, the cedar aphid Cinara cedri cannot synthesize tryptophan, and it relies on two bacterial symbionts, Buchnera aphidicola and Serratia symbiotica, neither of which has a complete tryptophan biosynthesis pathway by itself (Ponce-de-Leon et al. 2017).

The RNA world

The discovery that RNA is capable of catalytic functions that are now primarily the responsibility of protein enzymes led Gilbert (1986) to suggest that an RNA world, “containing only RNA molecules that serve to catalyse the synthesis of themselves,” existed at an early stage of evolution and was later supplanted by a world with proteins, still encoded by RNA. “Finally, DNA appeared on the scene, the ultimate holder of information copied from the genetic RNA molecules by reverse transcription,” leading to the world as we know it today, with only a few catalytic functions still carried out by RNA, primarily acting on RNA itself, such as self-splicing (Zaug and Cech 1986).

Today, of course, peptide bond formation is catalysed by the ribosome, i.e., by RNA, but there must have been an earlier period in which RNA was not needed. Although Gilbert’s commentary was entitled “Origin of life: the RNA world” we believe that the RNA world he described, if it existed, must have been far too elaborate and evolved to be close to the origin of life. The first organisms must have relied on far simpler catalysts than RNA molecules, such as metals (\(\hbox {Fe}^0\), \(\hbox {Zn}^{2+}\)..., as already discussed) and very simple organic molecules, including simple peptides. Bregestovski (2015), for example, presented the RNA world as the fifth of seven stages of evolution, conceptually closer to today, therefore, than to the origin. He considered even that limited interpretation to be highly improbable (as we do), listing the following problems:

  • Unreliability of the synthesis of initial components;

  • Instability of the molecules, which increases with chain elongation;

  • Catastrophically low probability of meaningful sequences;

  • Lack of a mechanism for the formation of membrane-bound vesicles permeable to the nitrogenous bases and other RNA precursors, as well as able to divide on a regular basis;

  • Lack of driving forces that might ensure the transition from the RNA world to the much more complex DNA world.

We find nothing to disagree with in Bregestovski’s list, as long as we take his “DNA world” to mean our present world in which RNA, DNA and proteins all have necessary roles.

Despite much progress in finding examples of RNA catalysis since the pioneering study of Wilson and Szostak (1995), most examples refer, even today (Walton et al. 2020), to reactions of RNA itself: We know of no examples of RNA-catalysed reactions of small molecules that might plausibly have existed at the origin of life. Polypeptides must also have existed in the primitive world alongside polynucleotides, and so polypeptide- and polynucleotide-catalysed processes ought to have evolved in parallel. There is no reason to postulate a world in which RNA was responsible for all catalysis. Any polyelectrolyte would have non-specific catalytic properties.

We share the view of Cairns-Smith (1985, pp. 42–45) that simple amino acids could plausibly have been present as prebiotic molecules at the origin of life, whereas primed nucleotides could not, though primitive organisms could have evolved to produce them. Cairns-Smith’s book appeared before the RNA world was proposed and did not, of course, mention it. In a paper that appeared more than two decades later (Cairns-Smith 2008), he discussed the RNA world as a possibility worthy of consideration, but one that must have existed well after the origin of life.

Arbitrary results of natural selection

Chirality

How would you like to live in Looking-glass House, Kitty? I wonder if they’d give you milk in there? Perhaps, Looking-glass milk isn’t good to drink.

Lewis Carroll (1871)

As any biochemist will realize, the answer to Alice’s question is no, because proteins such as casein in looking-glass milk will be based on d-amino acids, and could not be digested by a normal cat.Footnote 15 Chirality is widespread in biological molecules but how did it arise, and how is it maintained (Korenić et al. 2020)? Why are proteins composed of l-amino acids rather than d? Why do most sugars in biological systems have the d-configuration? We can picture the early organisms placed on a knife edge: either l or d would work equally well, as the chemical and physical properties are identical, but any slight preponderance of one configuration will then be multiplied by an autocatalytic process until the “wrong” configuration is essentially eliminated (Blackmond 2010).

Fig. 9

The reaction catalysed by lactate dehydrogenase, which links pyruvate at the end of glycolysis with pyruvate at the beginning of gluconeogenesis. Some groups of organisms (including insects) use an l-specific enzyme, some (including spiders) use a d-specific enzyme, and some (including mammals) have both enzymes

A macroscopic and easily visible example is found in snails and their predators in the Ryukyu Islands, in the far south of Japan. All snails are asymmetric, with shells that coil either clockwise or anticlockwise. Mating between individuals with opposite chirality is very difficult, giving an obvious selective advantage to snails with the majority clockwise chirality. However, in the Ryukyu Islands the minority of anticlockwise snails appears to be maintained, raising the question of the advantage of belonging to the minority. This seems to be explained by the fact that the principal predator, the snake Pareas iwasakii, has an asymmetric jaw adapted to extracting clockwise individuals from their shells (Hoso et al. 2007), thus providing a survival advantage to anticlockwise snails. Notice that both snails and snakes must have evolved as communities, not as individuals.

Different organisms use different enantiomers of lactate to link glycolysis and gluconeogenesis in the Cori cycle (Fig. 9). This is the only metabolic function of lactate, and, as pyruvate is achiral and l-lactate and d-lactate are chemically identical, either enantiomer can in principle act as intermediate. However, l-lactate dehydrogenase does not recognize d-lactate, and d-lactate dehydrogenase does not recognize l-lactate. Both forms of enzyme exist in nature but they have very different structures and no perceptible similarity (Taguchi and Ohta 1991). Some groups of organisms use an l-specific enzyme, some use a d-specific enzyme, and some have both enzymes. For example, spiders use d-lactate, whereas insects use l-lactate. Such disparity between two groups of arthropods usually regarded as relatives could be interpreted as a suggestion that the two lineages diverged very early in evolution, but this interpretation may seem too improbable to be entertained.Footnote 16 However, both enzymes exist in some organisms, such as humans and other mammals (Cristescu et al. 2007), so a less implausible explanation is possible: The common ancestor of spiders and insects used both d-lactate and l-lactate, but different enzymes were lost in the two lineages.

A similar argument may explain why Archaea use membranes formed from sn-glycerol 1-phosphate, whereas Eubacteria and Eukarya use its stereoisomer sn-glycerol 3-phosphate, and the enzymes that produce the two isomers are not homologous (Peretó et al. 2004).

Genetic code

When the “universal” genetic code was first deciphered it was widely believed that it must be universal (see Fig. 10), because any change would affect so many different proteins that it would inevitably be lethal (Hinegardner and Engelberg 1963). We now realize that this is an oversimplification, because variant genetic codes are used by some mitochondria, and furthermore selenocysteine, an amino acid used by some organisms (Stadtman 1974), is encoded by UGA, a stop codon in the normal code.Footnote 17 Moreover, the mitochondria of the yeast Candida glabrata do not use four codons at all, so these could be adapted to encode any amino acids without changing any proteins. For fuller discussion see Cornish-Bowden (2016, pp. 55–62).

Benner (2009) examined various examples of what he considered less than ideal directions taken by natural selection, such as the use of a rather unstable fragment of degraded RNA as the coenzyme NAD, in which the characteristics that make it unstable play no role in its biochemical activity. He and his co-workers also discussed how a six-base code could work much better than the four-base code that we know (Yang et al. 2011).Footnote 18

Fig. 10

Disastrous effect of a change in the genetic code. In the usual code there are six codons for serine, but only two for glutamate. One could suppose that serine has more codons than it “needs,” and glutamate has fewer. If the code were made more “efficient” by transferring the codon AGC from serine to glutamate, this would mean that every serine residue in a protein encoded by AGC would be replaced by glutamate
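The consequence described in this caption can be made concrete with a minimal sketch. The codon table below is deliberately incomplete (only a handful of standard assignments), and the message `mrna`, the function `translate` and the table `REASSIGNED` are hypothetical names introduced purely for illustration.

```python
# A few assignments from the standard genetic code (not the full table).
STANDARD = {
    "AUG": "Met", "UCU": "Ser", "AGC": "Ser", "AGU": "Ser",
    "GAA": "Glu", "GAG": "Glu", "AAA": "Lys", "UAA": "Stop",
}

# Hypothetical "more efficient" code: AGC transferred from serine to glutamate.
REASSIGNED = dict(STANDARD, AGC="Glu")


def translate(mrna, code):
    """Translate an mRNA string codon by codon, stopping at a stop codon."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues


# Made-up message: AUG AGC UCU GAA AGC AAA UAA
mrna = "AUGAGCUCUGAAAGCAAAUAA"

before = translate(mrna, STANDARD)
after = translate(mrna, REASSIGNED)
print(before)
print(after)
```

Only the serine residues encoded by AGC change; the one encoded by UCU is untouched. In a real genome the same substitution would occur simultaneously in every protein that uses AGC, which is the point of the caption.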

These examples may seem to illustrate a fallacy first discussed by Dawkins and Carlisle (1976) and called the Concorde fallacy by Dawkins (1989, p. 150) in The Selfish Gene. However, there is an important difference: In the case of wasteful investment either in economics or animal behaviour, it is not absolutely impossible to write off the waste and start again, but there is no way to abandon the existing genetic code and start again with a better one.

Natural selection as a biassed random walk

The process of natural selection is an example of a biassed random walk, which does not involve making the best possible change at each step, but one that is at least not significantly worse than what preceded it.Footnote 19 Although we cannot directly observe natural selection on a time scale of millions of years, we can observe an analogous process on an experimental time scale in chemotactic bacteria such as Salmonella typhimurium. A bacterial cell can move towards a higher concentration of an attractant such as glucose when the concentration in its environment is about 1 μM. An assumed sampling volume of \(10^{-16}\) L \(\equiv \) 0.1 \(\upmu \)\(\hbox {m}^3\) (about 10% of the volume of a bacterial cell)Footnote 20 contains about 60 molecules at 1 \(\upmu \)M. As this is subject to statistical variation of the order of 8 molecules (about 13% of the total), it is clearly impossible to use spatial comparisons to detect a difference of 1 part in \(10^4\) between the “front” and “back” of a bacterial cell. Macnab and Koshland (1972) showed that instead bacteria use temporal sensing, comparing the concentration at one instant with what it was a few seconds earlier (Fig. 11): If the environment is improving, the cell continues moving in the same direction for longer than it does if the environment is deteriorating; in the latter case it “tumbles” and selects a new direction entirely at random, that is to say unrelated to the previous direction or to the direction of higher concentrations of attractant.
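The figures of about 60 molecules and a variation of about 8 can be checked with a short calculation. This sketch assumes that the number of molecules in the sampling volume follows Poisson counting statistics, so that the standard deviation is the square root of the mean; the function name is ours.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_volume(concentration_molar, volume_litres):
    """Mean number of molecules in a given volume at a given concentration."""
    return concentration_molar * volume_litres * AVOGADRO


mean_count = molecules_in_volume(1e-6, 1e-16)   # 1 uM in 1e-16 L
fluctuation = math.sqrt(mean_count)             # Poisson standard deviation (assumed)
relative_noise = fluctuation / mean_count

print(f"molecules: {mean_count:.1f}")
print(f"fluctuation: {fluctuation:.1f}")
print(f"relative noise: {relative_noise:.3f}")
```

On this assumption the relative noise is more than a thousand times larger than the difference of 1 part in \(10^4\) that a spatial comparison would need to resolve.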

Fig. 11

Bacterial chemotaxis as a model of natural selection. a A chemotactic organism like Salmonella typhimurium appears to move smoothly towards higher concentrations of an attractant such as glucose when viewed on a gross level. In reality the colony consumes the glucose as it moves, but this complication does not affect an individual cell, which consumes too little glucose to have a perceptible effect on the ambient concentration. b An individual cell moves in fits and starts, with periods of straight swimming interrupted by “tumbling,” after which there is a new direction that is entirely random. The long-term direction is towards the higher glucose concentration, however, because the time that is spent moving in a straight line is longer when it is moving in a “good” direction. This type of motion is called a biassed random walk
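The motion in panel b can be imitated with a minimal sketch of a biassed random walk. It is an illustration under stated assumptions, not a model of Salmonella: the attractant concentration is taken to increase linearly with x, the cell moves one unit per time step, and the two tumbling probabilities are arbitrary values chosen by us.

```python
import math
import random


def chemotaxis_walk(steps, p_tumble_improving, p_tumble_worsening, seed=0):
    """Run-and-tumble walk in two dimensions with temporal sensing.

    The attractant concentration is assumed proportional to x. After each
    unit step the cell compares the current concentration with the previous
    one and tumbles (picks a new direction uniformly at random) with a
    probability that is lower when the concentration has increased.
    """
    rng = random.Random(seed)
    x = y = 0.0
    angle = rng.uniform(0.0, 2.0 * math.pi)
    previous = x
    for _ in range(steps):
        x += math.cos(angle)
        y += math.sin(angle)
        improving = x > previous          # temporal comparison only
        previous = x
        p_tumble = p_tumble_improving if improving else p_tumble_worsening
        if rng.random() < p_tumble:
            angle = rng.uniform(0.0, 2.0 * math.pi)  # new direction, fully random
    return x, y


x_biassed, _ = chemotaxis_walk(2000, 0.1, 0.5, seed=1)
x_unbiassed, _ = chemotaxis_walk(2000, 0.3, 0.3, seed=1)
print(f"biassed walk, final x: {x_biassed:.0f}")
print(f"unbiassed walk, final x: {x_unbiassed:.0f}")
```

No new direction is ever chosen with knowledge of the gradient; the net movement up the gradient comes only from the longer runs in favourable directions.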

Like any model or analogy this one should not be taken too literally, and temporal and spatial sensing are not necessarily mutually exclusive: both can contribute, and in bacteria larger than Salmonella typhimurium spatial sensing may increase in importance. The probability that any mutation during evolution will be favourable is several orders of magnitude smaller than the probability 0.5 that a new direction taken by a bacterial cell in chemotaxis will be favourable. However, fixing a new and favourable variant, or a better direction, occurs after the mutation or tumbling. More important, in the case of chemotaxis one is examining the behaviour of an individual, whereas evolution applies to populations, not to individuals. Perhaps most important, evolution of protein or gene sequences is primarily a consequence of neutral drift (Kimura 1983; Ohta and Gillespie 1996; Cornish-Bowden 2016).

Models and modelling

[Einstein’s] reaction to the living world was illustrated one day as he stood with a friend watching flocks of emigrating birds flying overhead: “I think it is easily possible that they follow beams which are so far unknown to us.”

Albert Einstein, quoted by Clark (1972, p. 35)

Einstein’s remark recalls the comment of Schrödinger (1944, p. 76) in What is Life?

From all we have learnt about the structure of living matter, we must be prepared to find it working in a manner that cannot be reduced to the ordinary laws of physics.

This suggestion has not been generally accepted, as we have discussed elsewhere (Cornish-Bowden and Cárdenas 2020), but it remains an open question that ought to be seriously considered.

Modelling is different from simulation

We cannot stress this too strongly, simulation is not modelling.

Rosen (1993)

Rosen (1991, pp. 182–201) insisted that closure to efficient causation implied that it was impossible to make a functional model of an organism. To understand this, it is important to note the distinction that he made between modelling and simulating a complex system.Footnote 21 His view has been controversial, as discussed previously (Mossio et al. 2009; Cárdenas et al. 2010): we consider that much of the controversy has been due to a failure to understand what Rosen meant. See also the analysis by Louie (2007).

Behaviour can be simulated, for example in a computer, without any claim that the simulation replicates the internal structure of the system simulated: it is sufficient to reproduce the properties of interest. A model, in contrast, requires a far stronger condition to be satisfied, that this internal structure is reproduced. Rosen (1993) argued that mathematical models of complex systems in terms of differential equations worked satisfactorily as long as the systems were large enough to avoid sampling variation, but would eventually break down when such sampling effects overwhelmed the averaging underlying the mathematics: this paralleled the failure of classical (Newtonian) mechanics to handle systems small enough for quantum effects to become relevant.

Fig. 12

Kinetic model of an autocatalytic system intended as a possible prebiotic metabolism. It is a more explicit form of the system in Fig. 2, with the catalysed processes shown as cycles of uncatalysed reactions (cf. Figs. 4 and 5). The rate constants shown, as well as the fixed concentrations of S, T and U, are those used in the stochastic simulations (Piedrafita et al. 2012) that led to Fig. 13. Adapted from Piedrafita et al. (2012)

How small can a self-organizing system be?

These considerations also apply to our studies of the smallest size of a system capable of maintaining its organization (Piedrafita et al. 2010, 2012). We examined the model of Fig. 12, which is a redrawing of Fig. 2 with the catalytic processes treated as cycles of uncatalysed chemical reactions, as in Figs. 4 and 5, using the rate constants shown, as listed by Piedrafita et al. (2010, 2012). This redrawing seems to add an extra layer of complication, but it is exactly the same model, the redrawing being necessary solely to allow computer simulation.

We first made a deterministic simulation of the system (Piedrafita et al. 2010), deterministic in the sense that it was assumed to be big enough for the amounts of the components to be expressed as concentrations. With the external concentrations of S, T and U held constant at the fixed values shown in Fig. 12, and all but one of the internal concentrations set to zero,Footnote 22 the system proved capable of arriving at a non-trivial steady state with all concentrations non-zero. There were two other steady states: a trivial steady state in which all concentrations and rates were zero, and an unstable steady state that would collapse immediately to one of the two others if the system ever found itself in that state. In the deterministic simulations the three decay rate constants were not held at 0.3 \(\hbox {s}^{-1}\) (as indicated in Fig. 12) but varied in constant proportion in the range 0.0–0.6 \(\hbox {s}^{-1}\).

However, bacteria and viruses are much too small to permit the amounts of molecules to be expressed in terms of concentrations. We therefore also carried out stochastic simulations in terms of numbers of molecules (Piedrafita et al. 2012), starting with amounts corresponding to the unstable steady state in the deterministic simulation. For a volume of 10\(^{-17}\) L (10\(^{-2}\) \(\upmu \)\(\hbox {m}^3\), or about 1% of the volume of a cell of Escherichia coli) the system evolved either to the trivial steady state with all amounts zero, or to a quasi-steady state in which the amounts fluctuated about the non-trivial deterministic steady state (Fig. 13). For a volume of 10\(^{-19}\) L \(\equiv \) 10\(^{-4}\) \(\upmu \)\(\hbox {m}^3\), about 0.01% of a bacterial cell (light grey lines in Fig. 13), the evolution was much faster, but the fluctuations at the quasi-steady state were much larger.

At neither volume would the system remain in the quasi-steady state indefinitely. Eventually a fluctuation large enough to cross the barrier of the deterministic unstable steady state would occur, resulting in a collapse to the trivial steady state, from which no recovery would be possible. The highlighted point in the figure refers to a moment in which that nearly happened, but from which the system was able to recover. However, for a large enough volume, the time needed for collapse would be very large, and at the value of 10\(^{-15}\) L \(\equiv \) 1 \(\upmu \)\(\hbox {m}^3\), the volume of a cell of Escherichia coli, or above, it would be infinite for practical purposes.
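The dependence of the time to collapse on volume can be illustrated with a deliberately simple toy system. It is not the model of Fig. 12 and uses none of its rate constants: it is a single self-reproducing species with first-order decay and a crowding term, simulated with the Gillespie direct method, and all names and parameter values are assumptions of ours. Unlike the model of Fig. 12 it has no unstable steady state, but it shares the essential feature that zero molecules is a state from which no recovery is possible.

```python
import random


def time_to_collapse(volume, t_max, rng, birth=2.0, decay=1.0, crowding=1.0):
    """Gillespie simulation of a toy birth-death system.

    Returns the time at which the number of molecules reaches zero, or None
    if the system is still organized at t_max. Volume is in arbitrary units;
    the deterministic steady state is approximately `volume` molecules.
    """
    n = max(1, round(volume * (birth - decay) / crowding))  # start near steady state
    t = 0.0
    while n > 0:
        rate_up = birth * n                                   # autocatalytic production
        rate_down = decay * n + crowding * n * (n - 1) / volume
        total = rate_up + rate_down
        t += rng.expovariate(total)                           # waiting time to next event
        if t > t_max:
            return None
        if rng.random() * total < rate_up:
            n += 1
        else:
            n -= 1
    return t


def collapsed_fraction(volume, runs=20, t_max=50.0, seed=1):
    """Fraction of independent runs that collapse to zero before t_max."""
    rng = random.Random(seed)
    results = [time_to_collapse(volume, t_max, rng) for _ in range(runs)]
    return sum(r is not None for r in results) / runs


small = collapsed_fraction(2)
large = collapsed_fraction(100)
print(f"collapsed fraction, small volume: {small}")
print(f"collapsed fraction, large volume: {large}")
```

In this toy, as in the simulations described above, collapse is only a matter of time at any finite volume, but the expected waiting time rises so steeply with volume that it soon exceeds any practical simulation length.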

Fig. 13

Time evolution of the concentration of ST for two different volumes: \(10^{-19}\) L \(\equiv \) 10\(^{-4}\) \(\upmu \)\(\hbox {m}^3\) (light-grey lines) and \(10^{-17}\) L \(\equiv \) 10\(^{-2}\) \(\upmu \)\(\hbox {m}^3\) (dark grey lines). In each case 300 independent simulations were made, with rate constants as shown in Fig. 12, starting from the deterministic unstable steady-state concentrations. The significance of the highlighted point at 2200 s is discussed in the text. Adapted from Piedrafita et al. (2012)

Further study suggested that a quasi-steady state would be effectively impossible for volumes smaller than \(10^{-20}\) L \(\equiv \) 10\(^{-5}\) \(\upmu \)\(\hbox {m}^3\), 0.001% of the volume of a bacterial cell. This conclusion depended, of course, on the specific rate constants used for the simulation, which were reasonable and in accord with the known rate constants for biological processes. More important than the numerical value of the critical volume is the conclusion that a critical volume must exist, and we should be surprised if the true value proved to be much less than \(10^{-19}\) L \(\equiv \) 10\(^{-4}\) \(\upmu \)\(\hbox {m}^3\).

We are not aware of many other stochastic simulations of very small models of life, but there are one or two that can be mentioned. Marin et al. (2013) extended the work illustrated in Fig. 13, and considered the relationship between population size and the survival time of a population before collapse. Gatherer and Galpin (2013) used the process algebra Bio-PEPA to study the model of Fig. 12, and obtained results consistent with those in Fig. 13. Neither of these papers directly considered the question of how large a volume a system needs to have for self-organization for an extended period to be possible.

Most viruses are too small to self-organize

Causa latet, vis est notissima.

The cause is hidden, but the effect is well known.

Ovid, Metamorphoses, 8 CE

We know, of course, that bacteria can self-organize. However, we are writing this while confined at home during the Covid-19 pandemic that is sweeping the world,Footnote 23 and the question arises of whether a Coronavirus is large enough to establish and maintain an organization. According to Neuman et al. (2006) Coronavirus particles are approximately spherical, not including the spikes, or “corona,” with a diameter of about 100 nm, or a radius of about \(5 \times 10^{-8}\) m and hence a volume of about \(\frac{4}{3} \pi \times (5 \times 10^{-8})^3 = 5.24 \times 10^{-22} \mathrm {m}^3\). That is about \(5 \times 10^{-19}\) L \(\equiv \) 5 \(\times \) 10\(^{-4}\) \(\upmu \)\(\hbox {m}^3\), small enough to make it unlikely that a particle of Coronavirus could self-organize indefinitely, so it would need to rely on the host cell to fulfil various functions that a living organism fulfils for itself. If one believes, as we do, that capacity for self-organization is an essential part of the definition of the living state, then it follows that normal viruses should not be accepted as living organisms. Quite apart from the question of volume, viruses have no metabolism, and without that they could not self-organize. They can use the host’s mechanisms to organize themselves, but that does not make them alive.Footnote 24

The “giant viruses” are much larger, however. For example, Philippe et al. (2013) report that Pandoravirus has ovoid particles 1 \(\upmu \)m in length and 0.5 \(\upmu \)m in diameter, or about one-tenth of the volume of a cell of Escherichia coli. This is large enough to allow the possibility of self-organization, but probably small enough to incur many errors.
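The two volume estimates can be reproduced with a few lines of arithmetic. The Coronavirus particle is treated as a sphere, as in the text; treating the ovoid Pandoravirus particle as a prolate ellipsoid is our assumption, as is the round figure of 1 \(\upmu \)\(\hbox {m}^3\) used here for a cell of Escherichia coli.

```python
import math


def sphere_volume(diameter):
    """Volume of a sphere of the given diameter (same length unit cubed)."""
    return 4.0 / 3.0 * math.pi * (diameter / 2.0) ** 3


def prolate_ellipsoid_volume(length, diameter):
    """Volume of an ellipsoid with one long axis and two equal short axes."""
    return 4.0 / 3.0 * math.pi * (length / 2.0) * (diameter / 2.0) ** 2


E_COLI_UM3 = 1.0  # assumed cell volume in cubic micrometres

corona_m3 = sphere_volume(100e-9)       # diameter 100 nm, in metres
corona_litres = corona_m3 * 1e3         # 1 m^3 = 1000 L
corona_um3 = corona_m3 * 1e18           # 1 m^3 = 1e18 um^3

pandora_um3 = prolate_ellipsoid_volume(1.0, 0.5)  # lengths in micrometres

print(f"Coronavirus: {corona_m3:.3g} m^3 = {corona_litres:.3g} L = {corona_um3:.3g} um^3")
print(f"Pandoravirus: {pandora_um3:.3g} um^3, {pandora_um3 / E_COLI_UM3:.2f} of an E. coli cell")
```

On these assumptions the Coronavirus particle falls within a factor of ten of the smallest volume used in the simulations of Fig. 13, whereas the Pandoravirus particle is more than two hundred times larger.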

Even if Coronavirus is too small to be regarded as a living organism, it has the capacity to cause death in humans, ending the life of a living organism.

Thou talkest much, O man, and thou art laid in earth after a little: keep silence, and while thou yet livest, meditate on death.

Palladas, translated by Mackail (1890, XII.XLVI)

The Painter. And is this the end?

Hieronimo. O no, there is no end: the end is death and madness!

Thomas Kyd (about 1587) The Spanish Tragedy