Introduction

Time–frequency analyses of cultural data have long been a central theme in archaeological research, with contributions to issues pertaining to the construction of relative chronologies, and inferences on patterns of social learning, community and population structure, and identity. The rise and fall of cultural variants are ultimately the aggregate outcome of many individual decision-making processes, and thus, it is tempting to ask whether one can infer those processes, as well as the factors conditioning them, by analysing changes in the relative frequencies of discrete variants of a single cultural trait through time.

Mathematical models of cultural transmission pioneered in the early 1980s by scholars such as Cavalli-Sforza, Feldman, Richerson, and Boyd (Boyd & Richerson, 1985; Cavalli-Sforza & Feldman, 1981) provided the foundations for such an endeavour. Given a putative social learning process, these formal models can make qualitative and quantitative predictions of how frequencies of cultural variants are expected to change over time. Forty years on, the now mature field of cultural evolutionary studies has considerably expanded its suite of transmission models (Kendal et al., 2018), but most importantly, it has been increasingly testing them against empirical evidence through experimental and observational studies (see Creanza et al., 2017; Mesoudi, 2017 for reviews). Observational studies have, in particular, witnessed a transition from earlier applications where inferential tools were directly borrowed and adapted from population genetics (Neiman, 1995; Steele et al., 2010) to the development of tailored methods designed to handle the specific challenges posed by cultural evolution (Acerbi & Bentley, 2014; Bentley & Shennan, 2003; Bentley et al., 2011; Crema et al., 2016; Kandler & Crema, 2019; Kandler & Shennan, 2013; Nakamura et al., 2019; O’Dwyer & Kandler, 2017). The analysis of cultural frequency data arguably represents one of the best examples of this research trend. Early applications aimed to determine whether observed cultural diversity deviates from the expectations of an unbiased transmission process, whereby the probability of copying a cultural variant is simply dictated by its relative frequency in the population, and hence, changes in frequencies are driven solely by the rate of innovation and random drift (i.e. the cumulative effect of these processes over time or geographic distance) (Bentley & Shennan, 2003; Bentley et al., 2004; Neiman, 1995; Steele et al., 2010). The methodological and theoretical insight underpinning this approach is the neutral theory of molecular evolution (Kimura, 1968), adapted to the investigation of cultural data by replacing alleles with cultural variants and employing haploid versions of different mathematical models (Neiman, 1995, see Kandler & Crema, 2019 for a review).

However, students of cultural transmission must consider a wider range of alternative hypotheses and mechanisms than neutral evolution (unbiased transmission) versus selection. For this reason, there has been a substantial effort to develop inferential tools capable of determining the goodness-of-fit of other modes of cultural transmission (also known as social learning strategies, see Laland, 2004), including conformist and anti-conformist biases, pro-novelty bias, and attraction bias (Acerbi & Bentley, 2014; Crema et al., 2016; O’Dwyer & Kandler, 2017). These new statistical techniques have been applied to a wide range of cultural data sets, including baby names (Bentley et al., 2004), colour terms (Acerbi & Bentley, 2014), bitcoins (ElBahrawy et al., 2017), and music samples (Youngblood, 2019).

A substantial proportion of case studies also concerns archaeological datasets. In fact, Neiman’s seminal work (Neiman, 1995), which first introduced the idea of borrowing concepts from the neutral theory of molecular evolution, aimed to investigate the stylistic variation of an archaeological dataset. This was followed by a small but steadily increasing number of similar studies over the last 20 years (Crema et al., 2014, 2016; Kohler et al., 2004; Lipo, 2001; Romanowska et al., 2021; Shennan & Wilkinson, 2001; Steele et al., 2010), mostly focused on (but not limited to) the study of ceramic assemblages and actively devoted to methodological development in this realm. Indeed, some of the most recent and advanced inferential techniques such as the use of progeny distributions (Bentley & Shennan, 2003) or approximate Bayesian computation (Crema et al., 2014; Kandler & Shennan, 2015) were first developed in the context of these archaeological applications.

Archaeological case studies, however, have also highlighted some distinctive theoretical and methodological challenges. These include potential confounding biases, such as those encountered in time-averaged datasets (Miller-Atkins & Premo, 2018; Premo, 2014), or unwarranted assumptions, such as the attainment of equilibrium conditions in cultural systems (i.e. a substantial balance between the number of cultural variants that are gained or lost at each generation/temporal unit of observation; see Crema et al., 2016). In this paper, we examine another potential problem in the analysis of cultural frequency data specific to archaeological contexts: the role played by ‘objects’ in the context of cultural change at a community/population level and as observational data underpinning archaeological inference. We argue that when cultural transmission is mediated through objects, we face several theoretical and inferential challenges, in particular if—as we are often bound to do in archaeological contexts—we use material cultural variants as a direct proxy for ideational cultural variants and their frequency. In “The Role of ‘Objects’ in Cultural Transmission” section, we briefly review the theoretical foundation of transmission through objects, and then in the “Inferential Challenges” section, we comment on existing methods designed to infer transmission processes from frequency data. The “Model Description” section introduces a simple simulation model emulating the key features of object-mediated transmission and inference and describes two experiments designed to assess the robustness of some of the existing inferential techniques most commonly used in the field. We discuss our results (see “Results”) before turning to their broader implications for cultural evolutionary science (see “Discussion”).

The Role of ‘Objects’ in Cultural Transmission

Mathematical models of cultural change derived from population genetics are predicated on the notion that culture is a social phenomenon which exists as the result of many mechanisms of transmission, inheritance, and differential persistence among individuals and groups. This condition makes culture amenable to being modelled and analysed in much the same way as we understand biological evolution. The aptness of this analogy has been, and continues to be, the subject of considerable debate (see for example the various comments on Mesoudi et al., 2006), complicated by debate within evolutionary biology itself concerning the robustness of the Weismann barrier (Sabour & Schöler, 2012; Surani, 2016) and the relative importance of environmental conditions for genetic expression (Hall, 2003; Jablonka & Lamb, 2005). In cultural evolutionary studies, it is generally accepted that the cultural equivalents of genes—variously labelled ‘culturgens’, ‘memes’, or increasingly just ‘cultural traits’—are ideas (Aunger, 2002; Dawkins, 1982; Lumsden & Wilson, 1981; Mesoudi, 2011; Mesoudi & O’Brien, 2008; O’Brien et al., 2010), although there is certainly debate about their granularity (Bloch, 2000; Henrich et al., 2008; Mesoudi & O’Brien, 2008; Mesoudi et al., 2006; O’Brien et al., 2010). However, there is no equivalent consensus about exactly what role (or roles) objects play in cultural inheritance (Lake, 1998). In the formalism of Hull’s (1980a, b) replicator-interactor-lineage scheme, objects might be the physical manifestation of replicators, and/or they might be interactors (Lake, 1998). If objects are the physical manifestation of replicators, then that makes them the cultural equivalent of DNA, that is to say, they directly describe or ‘code-for’ ideational cultural traits just as DNA codes-for genes. On the other hand, if objects are interactors but not replicators, then that makes them the cultural equivalent of phenotypic organisms (or parts thereof, depending on one’s stance on the thorny issue of granularity noted above), in which case they might be said to ‘express’ rather than ‘code-for’ an ideational cultural trait. This distinction carries with it various philosophical implications. Most straightforwardly, and thinking in terms of the biological analogy, it places objects on different sides of the Weismann barrier and so impacts whether cultural evolution is most appropriately conceived as a Darwinian or Lamarckian process (Hull, 1982; Kronfeldner, 2007; Mesoudi, 2011). More exotically, it has implications regarding human agency—can the machine-copying of an object such as a musical score actually replicate a cultural trait (in this case a musical idea or motif) even if no human ever looks at the resulting copies (Dennett, 1995; see also De Block & Ramsey, 2015)? Although these questions are undoubtedly interesting, they take us too far from the purpose of this paper, and so we have adopted the following pragmatic approach.

We define ‘objects’ as the public and physical correlates of cultural traits, which in accordance with the general consensus we assume to be ideational. This terminology is intended to emphasise that whilst we assume there is a correlation between the form of objects and the ideational content of cultural traits, we make no assumption regarding whether that correlation exists because objects ‘code-for’ or ‘express’ cultural traits. Additionally, we adopt De Block and Ramsey’s (2015) view that most students of cultural evolution are primarily interested in the transmission of ‘cultural genotypes’ from one biological organism (usually a human) to another. Putting these two propositions together, we use the term object-mediated transmission to describe any transmission process in which the relative frequency of objects influences the relative frequency of ideational cultural traits. Note that our use of the term ‘object’ is not intended to necessarily equate cultural traits with whole artefacts, since, as has been discussed at length elsewhere (Mesoudi & O’Brien, 2008; O’Brien et al., 2010), some kinds of cultural trait (for example that associated with a class of ceramic vessel) comprise multiple hierarchically lower traits (for example manufacturing technique, choice of motif).

In summary, we are interested in processes in which the probability of a cultural trait being copied is some function of the relative frequency with which it is present as the corresponding object in the population of objects. If the number of physical manifestations of cultural traits produced by individuals is variable and objects are copied at random, then we have unbiased object-mediated transmission. In human culture, object-mediated transmission could potentially occur in a number of different scenarios, for example when people copy objects in the absence of those who produced them, or when people are more likely to copy cultural traits directly from those who produce more objects (which might be viewed as a kind of model bias in conventional Dual Inheritance Theory). These two processes are quite different when viewed through the lens of Hull’s distinction between replicators and interactors, but what matters here is that both are object-mediated in the sense that we have defined it. Admittedly, the differences between these processes might matter in terms of where transmission error occurs, and we return to that point shortly.

Discussion of the potential impact of ‘objects’ on cultural/behavioural trait frequencies in a second inheritance system is not new in cultural evolutionary studies. Indeed, Jablonka and Lamb (2005) have argued that behaviour-inducing substances should be regarded as a subcategory of behavioural inheritance along with imitative and non-imitative social learning. For example, the preference for the consumption of juniper berries can be transmitted between rabbits via faecal pellets (Bilkó et al., 1994). Similarly, some forms of non-imitative social learning, such as stimulus enhancement, also entail transmission of behaviour via the physical traces of that behaviour even in the absence of an individual demonstrating it, for example opened milk bottles in blue tits (Sherry & Galef, 1984). In both these examples, the frequency of the objects (i.e. the quantity of faecal pellets or opened bottles) can increase the likelihood of transmission of the behaviours which produce them. From our perspective, both are cases of object-mediated transmission, even if the status of juniper and milk consumption as explicitly cultural traits (Aplin et al., 2014) depends on whether one considers that ‘true’ culture is necessarily founded on imitative social learning and exhibits cumulative change (Boyd & Richerson, 1996; Caldwell & Millen, 2009; Dean et al., 2012; Heyes, 1993; Tennie et al., 2009; Tomasello et al., 1993).

The most extensive body of ideas exploring the consequences of object-mediated transmission in human culture is probably that offered by Cultural Attraction Theory (hereafter CAT) (Scott-Phillips et al., 2018; Sperber, 1996). The key argument advanced by proponents of this theory is that at each human-to-human transmission event social learners attempt to reconstruct the mental representations that specify ideational cultural traits from their public manifestations (i.e. objects or behaviours); consequently, the transmission of cultural traits is never strictly replicative, that is, never psychologically direct (Sperber, 2000, see also earlier arguments by Heyes, 1993 and Heyes & Plotkin, 1989), but always occurs through public manifestations. Given the lack of direct replication, much of CAT is focused on explaining the stability of cultural forms as the result of cognitive factors that lead to a convergence in the reconstructive process (Boyer, 1994; Sperber, 2000; Sperber & Hirschfeld, 2004). Importantly, both theoretical and experimental studies suggest that, at least under specific circumstances, a reconstructive rather than replicative process (assumed in the Dual-Inheritance school) underlying cultural transmission can lead to different evolutionary expectations (Claidière & Sperber, 2007; Tamariz & Kirby, 2015; Scott-Phillips, 2017; but see also Henrich et al., 2008). It is true that the centrality of psychological processes in cultural transmission is not accepted by all, for example ‘externalist’ memeticists (Gatherer, 1998; see also Dennett, 1995), but for our current purposes, CAT helpfully highlights the two potential loci of error in an object-mediated transmission process: during encoding or during decoding.

Encoding error occurs during the production of objects and results in a mismatch between the cultural trait (a mental representation) and its physical correlate (the ‘object’), whereas decoding error occurs during the reconstructive process and results in a failure to recreate the original cultural trait from its physical correlate. It is worth noting here that both are instances of ‘copy with error’ in the sense that the social learner acquires a modified version of the model’s mental representation; the difference between them is where that error occurs in a process of transmission mediated through objects. Of course, if cultural transmission were psychologically direct, there would be no need for this distinction, but as noted above, we accept the arguments underpinning CAT that this is rarely or even never the case. It is also worth commenting on our use of the term ‘error’. Ultimately, ‘error’ refers to a lack of fidelity in transmission: the copy does not exactly match the original. As such, error can be unintentional or intentional (Eerkens & Lipo, 2005), and in the context of some technologies it may be both. For example, in experimental work on the reproduction of pottery shapes, Gandon et al. (2021) emphasise the ‘irreducible copying error owing to the sensorimotor limits of human performances’ but also note the ‘idiosyncratic manner with which each potter fashioned each pottery type’ (p. 13). Here, however, we are concerned with unintentional—in the sense of undirected—error, because we aim to assess the reliability of using the neutral model, which assumes undirected error, to infer the presence of other social learning strategies in cultural transmission.

By way of illustration, consider an episode of object-mediated cultural transmission between two potters, the model and the social learner, in which the social learner copies a vessel made by the model and, lacking direct access to the model’s intention, nonetheless aims to reproduce it. An example of encoding error would be if the model intended to create a series of squares as a decorative motif on a ceramic vessel, but circumstantial events accidentally resulted in the production of a vessel marked with a series of rectangles instead (perhaps the clay was unusually wet, and the vessel sagged before firing). If the social learner accurately copies this particular vessel, they will nevertheless reconstruct an erroneous mental representation, a series of rectangles, and may subsequently go on to produce more vessels displaying this new motif. However, encoding error is object-specific and does not affect the model’s original mental representation, so the model may subsequently produce additional vessels with the ‘correct’ square motif, in which case the accuracy of the episode of cultural transmission depends on which specific vessel (object) the social learner happens to copy. In contrast, an example of decoding error would be if the model accurately realised their intention by creating square motifs on the ceramic vessel, but, perhaps owing to differential curvature of the surface of the vessel, the social learner perceives the vertical sides of the decorative squares to be shorter than the horizontal sides, resulting in them reconstructing an erroneous mental representation, a series of rectangles. As a result, unlike in the case of encoding error, the social learner will go on to produce one or more vessels which always have this new motif. It is worth noting that decoding error is likely to be conditioned by whether the learning process is emulative (i.e. exclusively based on the information contained in the object) or imitative (i.e. with the additional insights of the actions leading to the production of the object), with the former characterised by higher error rates.

The hypothetical example presented above illustrates how transmission errors can potentially occur either when realising cultural traits as objects or reconstructing them from objects. Considered in the context of a one-off episode of cultural transmission between two humans, both encoding and decoding errors ultimately result in the second cultural generation, that is the social learner, having acquired an erroneous (possibly new) cultural trait. If that were the end of the matter, the distinction between the two types of error would only be significant to the extent that their different causes might contribute differentially to the total error rate. However, in reality, potters usually produce more than one vessel, there may be more than one potter who attempts to copy another potter’s wares, and copying will take place on multiple occasions leading to multiple cultural generations (which may be more frequent than the human generations of potters). Consequently, at any point in time, there will be a population of potters and a population of vessels. These two populations are likely to differ in size. At the population level, the two types of error are expected to generate different dynamics. Encoding error directly increases the richness of variants observed in the population of objects, but there is no concomitant change in the richness of mental representations unless an erroneously produced object serves as the model for a naïve social learner in the next cultural generation. Conversely, still assuming each error introduces a novel variant, decoding error directly increases the richness of mental representations, but only alters the richness of objects if there is a subsequent production event in which the new cultural trait is realised as an object. After discussing additional inferential challenges, we present a model designed to compare the population dynamics resulting from encoding and decoding errors in unbiased object-mediated transmission.

Inferential Challenges

In the context of contemporary cultural evolution, it is an open question whether ideational cultural traits are directly observable. This depends on whether or not one accepts that direct perception of mental states is possible (Froese & Leavens, 2014) because, if not, the ‘instructions’ to create cultural traits can only ever be inferred (Sperber, 2000). However, in the case of the historical sciences, there is no such debate, since physical objects and traces of behaviour are all that survive as a ‘record’ of past cultural traits. That being so, there has been surprisingly little work to tease apart the exact relationship between frequencies of ideational cultural traits and the frequencies of their corresponding objects, even though as we have just suggested, this can be expected to differ according to where errors occur in the process of cultural transmission and how frequently a given ideational cultural trait is realised as an object.

As discussed above, the earliest cultural micro-evolutionary studies effectively treated variant frequencies as something comparable to allele frequencies in molecular evolution and examined whether observed values (or their change over time) significantly deviate from what would be expected from a particular social learning process (Acerbi & Bentley, 2014; Bentley et al., 2004, 2007; Brand et al., 2019; Carrignon et al., 2019; Crema et al., 2016; Kandler & Shennan, 2015; Kohler et al., 2004; Neiman, 1995; O’Dwyer & Kandler, 2017; Shennan & Wilkinson, 2001; Steele et al., 2010). These studies share the fundamental assumption that the observed frequency of cultural variants in the archaeological or historical record provides a reliable estimate of the relative proportion of the cultural variants in the population. If we observe that 20 out of 100 potsherds have a particular design, it would be assumed that the proportion of this cultural variant in the population is 0.2. There are two issues here. First, some authors have rightly pointed out that issues such as time-averaging (Premo, 2014) and sampling error (Kandler & Shennan, 2015) can potentially bias estimates of the original frequency of cultural variants as exhibited by objects. Second, even if such biases can be ‘corrected’, most studies treat the frequency of cultural variants exhibited by objects as a direct proxy for the frequency of ideational cultural traits. In other words, cultural evolutionary studies typically proceed as if objects are the population of interest even if many—perhaps even most—archaeologists would claim that their primary interest is the population of people in the past and the ideas that they have.

A notable exception to this disconnect is offered by Shennan and Wilkinson’s (2001) study of the decorative design of prehistoric Linearbandkeramik (LBK) pottery from the Merzbach valley in Germany, in which the authors compared the observed homogeneity of cultural variants with that theoretically expected as a result of innovation, random copying, and drift. The expected homogeneity—based on Ewens’ sampling formula—requires an estimate of the effective population size N. Leaving aside the very specific meaning of ‘effective population’ in the Wright–Fisher model of neutral evolution (Crow & Kimura, 1970; Premo, 2016, p.607), the particularly interesting feature of Shennan and Wilkinson’s study is that they explore three different populations of interest: the number of potters, the number of vessels, and the number of motifs. Whilst the choice of different values of N did not make a qualitative difference to the conclusion of this particular study (but see Crema et al., 2016 for a recent re-examination of the same dataset), there was a quantitative difference in the results. Moreover, there is of course an important theoretical difference between treating N as the number of potters or as the number of vessels. The former is effectively a measure of the number of ideational cultural traits one could observe in a population of people, and the latter is the number of their publicly observable material correlates (objects displaying categorically different cultural variants) in an archaeological assemblage.

The two numbers coincide only if we consider transmission and production as parts of the same ‘event’. In this case, if we ignore encoding error, the process effectively resembles a Wright–Fisher model: each generation consists of a transmission process whereby individuals copy a randomly chosen object produced in the previous generation and in turn produce one object based on their updated mental representations. However, the two processes (transmission and production) are not necessarily part of the same event. A potter could copy a design and then carry on producing several vessels with identical motifs. Under such a scenario, there is a potential mismatch between the number of objects and the number of potters. Premo (2014, p.112) made a similar point when he asked whether existing methods are appropriate “if some individuals deposit more than their ‘fair share’ of variants to the record or … if some individuals are not allowed to deposit their variants at all”. The implications are not limited to estimates of N. Consider a hypothetical assemblage of 100 vessels, an equal number of which display one or the other of two categorically different designs (i.e. two cultural variants). If this were the outcome of 100 potters producing one vessel each, we would argue that cultural diversity is comparatively low. However, if it were the product of two potters, each producing 50 replicates of their motif, we would be dealing with the highest possible level of cultural diversity (i.e. every potter has a different motif). In other words, how reliably we can use objects for our inferential goals depends on specific processes such as the distinction between encoding and decoding errors and the bias introduced by multiple production events. Here, we explore whether ignoring these two processes (encoding/decoding errors and production bias) in object-based cultural frequency data can lead to inferential errors.
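To make the contrast concrete, the following R snippet (purely illustrative; the motif labels are hypothetical) computes the object-level homogeneity of this assemblage, which is identical in both scenarios, alongside the ratio of variants to people, which is not.

```r
## Hypothetical assemblage: 100 vessels evenly split between two motifs.
vessels <- rep(c("squares", "rectangles"), each = 50)
p <- table(vessels) / length(vessels)  # relative frequency of each motif
sum(p^2)                               # object-level homogeneity: 0.5 in both scenarios

## Scenario 1: 100 potters, one vessel each -> 2 variants among 100 people
## Scenario 2: 2 potters, 50 vessels each  -> 2 variants among 2 people,
##             i.e. every potter holds a different motif
c(scenario1 = 2 / 100, scenario2 = 2 / 2)  # richness relative to N
```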

Model Description

Consider a population of N individuals, each associated with a mental representation of a cultural variant. At each generation, each individual produces n copies of a physical realisation of the mental representation, with n drawn from a Poisson distribution with intensity λ. After this production event, all individuals update their mental representation by randomly selecting a cultural variant from the pool of objects produced. New variants are introduced in the population either via decoding error or via encoding error. In the first scenario, focal individuals fail, with probability \(\mu_d\), to correctly reconstruct the mental representation associated with the object they copy. New variants are thus introduced in the population following the infinite allele model (i.e. with zero probability of convergent error). In the encoding error scenario, there is a probability \(\mu_e\) that an object produced by the focal individual is no longer directly associated with its parent mental representation. As in the decoding error scenario, new variants are introduced following the infinite allele model. However, in this case, the mental representation of the focal individual remains unchanged after the error unless the mutant object is selected in the transmission process, and a single individual can produce multiple distinct mutant objects, each associated with a mental representation that does not yet exist in the population of producers.

Both variants of the model (Fig. 1) have been implemented as agent-based models in the R statistical computing environment (R Development Core Team, 2022; see ‘Data Availability’ below for access to code and scripts). At initialisation, N individuals are created, each with a different mental representation represented by an integer value. Then, at each time-step (generation), the following processes occur (a minimal code sketch follows the list):

1. Production. For each agent i, \(n_i\) identical objects with the same numerical value as the mental representation of the agent are created, with \(n_i \sim \mathrm{Poisson}(\lambda)\).

2. Encoding error. With probability \(\mu_e\), each object is associated with a new integer value that has not yet been introduced in the simulation run.

3. Object-mediated cultural transmission. Each agent randomly selects an object and updates its mental representation with the integer value associated with the object. This is comparable to an unbiased transmission process since, on average, the probability of a variant being selected is proportional to its frequency in the population.

4. Decoding error. With probability \(\mu_d\), each agent updates its mental representation with a new integer value that has not been previously used in the simulation run. This represents an instance where the individual fails to correctly reconstruct the mental representation of an object.
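A minimal R sketch of these four steps is given below, assuming integer-coded variants and a running counter used to mint new variants under the infinite allele model. Function and variable names are illustrative and are not taken from the published scripts (see ‘Data Availability’).

```r
## One generation of unbiased object-mediated transmission.
## reps:    integer vector of the N agents' mental representations
## counter: largest variant label used so far (infinite allele model)
om_generation <- function(reps, lambda, mu_e, mu_d, counter) {
  N <- length(reps)
  ## 1. Production: agent i creates n_i ~ Poisson(lambda) identical objects
  n <- rpois(N, lambda)
  objects <- rep(reps, times = n)
  if (length(objects) == 0) objects <- reps  # guard for the rare no-production case (an assumption)
  ## 2. Encoding error: each object independently mutates to a brand-new variant
  enc <- runif(length(objects)) < mu_e
  objects[enc] <- counter + seq_len(sum(enc))
  counter <- counter + sum(enc)
  ## 3. Object-mediated transmission: each agent copies a randomly chosen object
  reps <- objects[sample.int(length(objects), N, replace = TRUE)]
  ## 4. Decoding error: each agent misreconstructs to a brand-new variant
  dec <- runif(N) < mu_d
  reps[dec] <- counter + seq_len(sum(dec))
  counter <- counter + sum(dec)
  list(reps = reps, objects = objects, counter = counter)
}
```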

Fig. 1 Wright–Fisher and object-mediated transmission models with encoding and decoding error over two generations. Squares are individuals with their mental representation (integer values), and circles are the corresponding public manifestations (i.e. objects). Arrows represent the transmission process. Integer values in italics and dashed lines represent error events

The models are run for \(T_{\mathrm{burnin}}+T_{\mathrm{collection}}\) time-steps. The first \(T_{\mathrm{burnin}}\) time-steps ensure that the simulation reaches equilibrium conditions, whilst the frequencies of the different cultural variants associated with the objects are collected during the remaining \(T_{\mathrm{collection}}\) time-steps. Although both \(\mu_e\) and \(\mu_d\) can be positive, here we explore only instances where one or the other of the two error rates is positive. We will refer to simulation runs where \(\mu_e>0\) and \(\mu_d=0\) as the encoding error model and those with \(\mu_e=0\) and \(\mu_d>0\) as the decoding error model. In order to provide a comparative baseline, we also created a standard Wright–Fisher model where agents at time-step t randomly select an agent from time-step \(t-1\), copying its mental representation with a probability µ of error in the transmission process.
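Under the same illustrative conventions, the Wright–Fisher baseline and the burn-in/collection loop might look as follows; this is an assumption-laden sketch reusing `om_generation` from the previous block, not the published implementation.

```r
## Wright–Fisher baseline: each agent copies the mental representation of a
## random agent from the previous generation, mutating with probability mu.
wf_generation <- function(reps, mu, counter) {
  N <- length(reps)
  reps <- reps[sample.int(N, N, replace = TRUE)]  # unbiased copying of people
  mut <- runif(N) < mu                            # infinite allele mutation
  reps[mut] <- counter + seq_len(sum(mut))
  list(reps = reps, counter = counter + sum(mut))
}

## Illustrative driver: burn-in followed by a single collection step,
## as in experiment 1 (T_burnin = 4999, T_collection = 1).
run_om <- function(N = 100, lambda = 1, mu_e = 0, mu_d = 0.01, t_max = 5000) {
  state <- list(reps = seq_len(N), counter = N)
  for (t in seq_len(t_max)) {
    state <- om_generation(state$reps, lambda, mu_e, mu_d, state$counter)
  }
  state  # $objects holds the final assemblage; $reps the mental representations
}
```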

Experimental Design

We compared the output of the three models using two measures of diversity, richness and homogeneity (Gjesfjeld et al., 2020; Kandler & Shennan, 2013; Steele et al., 2010), and the progeny distribution (O’Dwyer & Kandler, 2017).

Experiment 1: Diversity

Steele et al. (2010) explored the effect of drift on assemblage diversity. In an empirical application, they found that the frequency distribution of Hittite rim sherds was consistent with the null hypothesis of random copying derived from the Wright–Fisher model, but noted that despite this, there was other evidence for selective decision-making by potters. Other studies have explored the effect of changing population size on the frequency distribution of cultural variants (Kandler & Shennan, 2013), and recently, Gjesfjeld et al. (2020) compared long-term patterns in the richness of American automobiles with the richness of European Neolithic ‘cultures’.

We computed the richness k (the number of unique variants) and the homogeneity H (equivalent to \(\sum_{i=1}^{k} p_i^2\), with \(p_i\) being the proportion of the i-th variant) of the distribution of cultural variants in the final set of mental representations of the N agents and in the final set of objects they had produced after 5000 time-steps (i.e. \(T_{\mathrm{burnin}}=4999\) and \(T_{\mathrm{collection}}=1\)). We explored the impact of the number of individuals (agents) N and the production rate λ using two different sets of parameter values. In experiment 1a, we considered all possible combinations of three values of N (100, 500, 1000) and four values of λ (0.5, 1, 5, and 10). Note that in this case, the average number of objects produced in each time-step (λN) varies from 50 to 10,000. However, a major part of the rationale for the present study is that in most real-world situations we do not have estimates of N or λ, and we simply infer modes of transmission from the frequency distribution of the observed objects. In experiment 1b, we emulate this situation by exploring four different values of N (50, 500, 1000, and 2000), each paired with a specific production rate λ (20, 2, 1, and 0.5), to ensure that the average number of objects produced in each time-step (λN) was equal to 1000. We then compared the richness and homogeneity of the final sets of objects against the expectations of a standard Wright–Fisher model with a population size randomly drawn from a Poisson distribution with intensity λN (i.e. 1000). This ensured that the Wright–Fisher expectation was also affected by a comparable, albeit very small, variation in the number of individuals/objects. In both experiments, \(\mu_e\), \(\mu_d\), and µ of the three models were all set to 0.01. A total of 1000 repetitions were carried out for each parameter combination.
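The two summary statistics are straightforward to compute; for concreteness, a sketch in R (the function names are ours):

```r
## Richness and homogeneity of a vector of variant labels (one entry per
## object, or per agent when summarising mental representations).
richness <- function(x) length(unique(x))

homogeneity <- function(x) {
  p <- as.numeric(table(x)) / length(x)  # p_i: proportion of the i-th variant
  sum(p^2)                               # H = sum_i p_i^2
}

homogeneity(rep(c("A", "B"), each = 50))  # e.g. 0.5 for a 50/50 assemblage
```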

To summarise, experiment 1 explores whether, and to what extent, diversity estimates of assemblage composition under encoding/decoding errors and production bias differ measurably from those of the corresponding mental representations (experiment 1a) and from those expected under a process of neutral transmission in which the number of objects (λN) is taken as the population size (N) (experiment 1b).

Experiment 2: Progeny Distribution

Bentley et al. (2004) first noted that, given a temporal interval T, the frequency \(P_k\) of variants appearing k times follows a power-law distribution under neutral cultural evolution, with an exponent that is a function of N and μ. It thus follows that the number of variants occurring once (i.e. \(k=1\)) within T is far greater than those appearing twice (\(k=2\)), three times (\(k=3\)), etc., and that this rate of decline in occurrence with increasing values of k can be predicted if T is sufficiently large. O’Dwyer and Kandler (2017) have further examined the shape of such a progeny distribution and noticed that (1) there is indeed a power-law distribution, but with a constant exponent of − 3/2, which does not depend on population size, N, and mutation rate, μ; (2) the power law is actually followed by an exponential cut-off; and (3) the cut-off point depends on the innovation rate μ.
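In practice, the progeny distribution is a 'counts of counts'. A sketch of its computation in R, assuming a vector pooling every variant label observed during the collection window:

```r
## P_k: proportion of variants appearing exactly k times in the window.
progeny_distribution <- function(pooled) {
  occurrences <- table(pooled)   # how many times each variant appears
  pk <- table(occurrences)       # how many variants appear k times
  data.frame(k  = as.integer(names(pk)),
             Pk = as.numeric(pk) / length(occurrences))
}

## On log-log axes, a neutral Wright–Fisher run should show a straight
## section with slope close to -3/2 followed by an exponential cut-off.
```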

Figure 2 confirms and illustrates these properties using the results of an agent-based simulation of Wright–Fisher transmission with different combinations of N and μ (and with \(T_{\mathrm{burnin}}=10000\) and \(T_{\mathrm{collection}}=10000\)).

Fig. 2 Progeny distribution under the Wright–Fisher model with different settings of N and μ

Experiment 2 explores whether and how the shape of the progeny distribution is affected by unbiased object-mediated transmission with encoding and decoding errors. We examined two sets of parameter combinations: the first holding the number of individuals maintaining cultural traits, N, constant at 300 and sweeping the rate at which they produce objects, λ (0.1, 1, 5, 10); the second holding the number of objects, λN, constant at 1000. With a sufficiently large temporal window, the stochastic differences in the shape of the progeny distribution become negligible; hence, below, we explored each parameter combination once, using \(T_{\mathrm{burnin}}=10000\) and \(T_{\mathrm{collection}}=10000\) with an error rate (\(\mu_e\) or \(\mu_d\)) of 0.01, as in experiment 1.

Results

Experiment 1: Richness and Homogeneity

Experiment 1a was designed to explore the impact of increasing the number of individuals (agents) N and/or the production rate λ on diversity, and whether the richness and homogeneity of the composition of the population of cultural variants differ for mental representations versus the objects produced from them. Figures 3 and 4 show that, for a given population size N, diversity increases (richness increases and homogeneity decreases) as the production rate λ is increased, although the rate of increase in diversity declines as λ grows. In the case of homogeneity (Fig. 3), there are no discernible differences between statistics calculated on the frequency of mental representations and those calculated on the frequency of traits exhibited by objects, nor do they vary according to whether error occurred during encoding or decoding, effectively yielding similar values for a given combination of N and λ. Importantly, however, this was not the case for richness (Fig. 4), particularly under encoding error regimes with larger settings of N. Under these settings, the number of unique variants present among objects was significantly higher than the number present among mental representations. The difference between the two summary statistics is most likely caused by the fact that, under encoding error regimes, the number of new variants introduced at a given time-step is on average \(N\lambda\mu_e\); most of these variants are, however, not adopted by social learners and hence go extinct after one generation (see also experiment 2 below), leading to a marked difference between the richness of variants among objects and among mental representations. In contrast, under decoding error regimes, the number of new variants is simply \(N\mu_d\), with λ this time defining the frequencies of these new variants; this leads to lower richness values, as the number of novel variants is λ times smaller than under the encoding error regime.

Fig. 3 Homogeneity under object-mediated transmission with either encoding error or decoding error for different parameter combinations of λ and N. Homogeneity has been calculated on the frequency of variants in the population of mental representations and on the frequency of variants in the population of objects

Fig. 4 Richness under object-mediated transmission with either encoding error or decoding error for different parameter combinations of λ and N. Richness has been calculated on the frequency of variants in the population of mental representations and on the frequency of variants in the population of objects

Experiment 1b was designed to capture the frequently encountered real-world situation where we must infer modes of transmission from the frequency distribution of the observed objects without having estimates of N or λ, in other words, without knowing whether those objects resulted from a few people making many objects each or many people making a few. Importantly, the result was consistently lower diversity (lower richness and greater homogeneity) for both versions of object-mediated transmission than for transmission conforming with the Wright–Fisher null hypothesis, even in situations where N is larger than the value set for the Wright–Fisher model (see Fig. 5). With lower values of N (and consequently, in this case, higher values of λ), the number of mental representations is limited, and hence, the amount of variability in the system is constrained. For example, under the decoding error scenario with \(N=50\), the highest possible richness (assuming all individuals have a different mental representation) is 50 (\(k=N\)), whilst the corresponding (lowest) homogeneity is 0.02 (\(\frac{1}{k}\) with \(k=N\), every variant being present in equal proportion). The situation is slightly different under encoding error regimes. Richness is higher (limited approximately to \(k_{max}=N+\lambda N\mu_e\), in this case 60 when \(\lambda=20\) and \(\mu_e=0.01\)), and the corresponding homogeneity slightly lower, capped at approximately \(\left(\frac{1-\mu_e}{N}\right)^2 N+\frac{\mu_e}{\lambda N}\) (equal in this case to 0.019612, see also ESM). In both cases, the difference is dictated by the encoding error rate, \(\mu_e\), and the production rate, λ, with higher values leading to higher diversity (higher richness and lower homogeneity). These differences (or lack thereof) can be observed when we compare the two forms of error for each parameter combination of object-mediated transmission.
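These approximations can be checked numerically (a quick sketch; the values match those quoted above):

```r
## Approximate bounds for the encoding error scenario of experiment 1b.
N <- 50; lambda <- 20; mu_e <- 0.01
N + lambda * N * mu_e                         # k_max = 60
((1 - mu_e) / N)^2 * N + mu_e / (lambda * N)  # homogeneity cap = 0.019612
```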

Fig. 5 Homogeneity and richness under object-mediated transmission and the Wright–Fisher model with different parameter combinations of λ and N. The dashed lines represent the 95% percentile of the Wright–Fisher simulations; blue and red dots are simulation runs yielding richness and homogeneity below and above these thresholds, whilst black dots are those within this range

The systematically lower diversity in object-mediated transmission is, however, not explained solely by the combination of low N and high λ. In two of the parameter combinations, we considered instances where N was equal to or higher than that of the Wright–Fisher model (executed with \(N=1000\)), yet both still yielded lower diversity (lower richness and higher homogeneity). This can be explained by the stronger drift caused by stochasticity in the production event. For example, when \(\lambda=1\), approximately 36.79% of individuals (i.e. \(\frac{\lambda^k e^{-\lambda}}{k!}=e^{-1}=0.3679\) with \(k=0\)) will not produce an object, limiting the amount of diversity that can be realised at each production event.
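The same quantity is easily obtained for all the production rates used in experiment 1b (a quick check in R):

```r
## Probability that an individual produces no object in a generation,
## for the production rates paired with each N in experiment 1b.
dpois(0, lambda = c(20, 2, 1, 0.5))  # ~0, 0.1353, 0.3679, 0.6065
```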

Experiment 2: Progeny Distribution

Under decoding error regimes (Fig. 6a and b), we observe significant differences between object-mediated and Wright–Fisher transmission. When N is held constant (Fig. 6a) and production rates are low (\(\lambda \le 1\)), the power-law section coincides with that of the Wright–Fisher model, although the threshold of the exponential tail occurs at different points. With larger values of λ (\(\lambda \ge 5\)), there is an additional power-law segment for low values of k, followed by a power-law section with a similar slope to the Wright–Fisher model. This initial section arises because, when \(\lambda \ge 5\), observing only \(k=1\) instance of a variant becomes difficult: even if a mental representation is associated with a single individual for one generation, the number of objects carrying it is likely to be larger than 1 (e.g. with \(\lambda=5\) the probability of producing more than one copy of a variant is 0.9596).

Fig. 6 Progeny distribution for Wright–Fisher and object-mediated transmission. Upper row: decoding error with \(\mu_d=0.01\); lower row: encoding error with \(\mu_e=0.01\)

When \(\lambda N\) is held constant (Fig. 6b), object-mediated transmission mostly conforms to the Wright–Fisher expectation (with small differences in the exponential tail), except for cases in which the production rate is high. In this instance, the high production rate (\(\lambda=20\)) did not yield any variants appearing fewer than five times within the window of analysis.

A high production rate is also a major driver of deviation from the Wright–Fisher expectation when transmission error occurs during the encoding process (Fig. 6c and d). In both the constant N (Fig. 6c) and constant \(\lambda N\) (Fig. 6d) regimes, larger values of λ determine a drop in \(P(k)\). This is because errors occur independently during the production of objects, ensuring a consistently large number of variants appearing only once (\(k=1\)). Low values of k greater than 1 would instead occur only when a variant is actually adopted (i.e. when the mental representation is present in the population) and associated objects are produced. As we saw earlier, deviation from the Wright–Fisher expectation becomes larger with higher values of λ. For example, with \(\lambda=5\), a variant can appear in the assemblage twice (\(k=2\)) either because an individual produced two instances of its mental representation in a given generation (a probability of 0.0842) but possessed the mental representation only for that particular generation and no agents copied the two objects, or because an individual produced one instance in each of two generations (a joint probability of 0.0011) before changing its mental representation (again without any individual copying the variant). Both scenarios are rare, and hence, \(P(k)\) for low k is lower than would be expected under a Wright–Fisher transmission process. This is also confirmed by the fact that the power-law section of the progeny distribution appears to start approximately at \(k=\lambda\), at which point this effect is reduced and the occurrence of the specific number of variants becomes more common.
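These probabilities follow directly from the Poisson production process (a quick check in R):

```r
## Poisson probabilities quoted above (lambda = 5).
1 - ppois(1, 5)  # P(n > 1): producing more than one copy, 0.9596
dpois(2, 5)      # two copies in a single generation, 0.0842
dpois(1, 5)^2    # one copy in each of two generations, 0.0011
```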

Discussion

Our simulation analyses show, across all experiments, that object-mediated transmission and object-based inference result in patterns that deviate from the expectations of an unbiased transmission process as portrayed by the Wright–Fisher model. It follows that, if the objective of a particular study is to determine whether or not patterns observed in the archaeological record resulted from pure random copying, relying on Wright–Fisher transmission as the null hypothesis runs the risk of falsely rejecting the null model and thus mistakenly concluding that some form of transmission bias or selection process was in play. It is worth emphasising that this problem does not reside in the statistical tools we employed to measure the distribution of cultural traits (richness, homogeneity, and progeny distribution): in most instances, these methods correctly detected departures from a Wright–Fisher model. Rather, the risk of faulty inference stems from incorrectly equating the Wright–Fisher model with a scenario of unbiased cultural transmission where that transmission is mediated through objects and the number of objects produced per person can vary. More broadly, potential issues may arise when (1) random copying of a teacher/demonstrator and random copying of an object are conflated and both described as ‘unbiased transmission’ and (2) the frequency of a cultural variant observed among objects in an archaeological assemblage is assumed to be an unbiased proxy for the frequency of the corresponding mental representation that a population of people carry in their heads.

The latter point is particularly crucial in archaeological research, where the association between cultural variants and the individuals engaging in social learning is not directly observable. Indeed, whilst in some of the scenarios we modelled decoding and encoding errors do have different statistical signatures, it is variation in the productivity parameter λ that has the greatest impact, particularly at larger values. Under these scenarios, the discrepancy between the number of objects and the number of individuals engaged in their production, as well as in social learning, becomes sufficiently large to generate patterns that deviate from those expected under a pure Wright–Fisher model, with lower diversity in all cases. These results have a number of implications from historical and archaeological standpoints.

It is in fact critical to analyse changes in frequency data in the light of other socioeconomic variables at both micro and macro scales of observation. At a group level, changes in the production techniques of individual producers, in skill-transfer practices, and in social stratification, as well as higher individual specialisation, may lead to patterns characterised by lower diversity and higher standardisation (Roux, 2003, 2015) that could easily be misinterpreted as increased conformist-biased transmission or as a preference based on content/value. The same effect is also linked, at a regional scale, to broader transitions towards more specialised economies, where fewer individuals produce a higher number of objects and a given cultural trait or technique may evolve to fixation in specific localities to gain comparative advantage (Shennan, 1999). At the same time, potential confounding effects linked to production bias may be introduced by a shift from a need-based economy towards an exchange economy (Bentley et al., 2005), which may be linked to changes in the underlying network connecting individuals and groups via, for example, a shift to small-world networks of highly specialised producers, limiting exchange within the broader population (Manzo et al., 2018). Production bias and errors arising from reverse-engineering mental representations from material culture are therefore especially relevant for archaeology: they can be observed at a local/micro scale through frequency data, where empirical evidence is sufficient, and can be used to formulate questions that resonate with higher-level processes observed at a chronological and geographical macroscale, where archaeological assemblages are less likely to be underdetermined for inferring patterns and processes of cultural evolution (Perreault, 2019).

From a methodological point of view, the inferential tools we examined provide different kinds of information regarding departures from Wright–Fisher regimes. Examined on their own, diversity indices such as homogeneity and richness do not provide sufficient insight into whether observed deviations are specifically linked to production bias or to different social learning strategies. Diversity indices are also strongly biased by time-averaging (Premo, 2014), making these statistics unsuited to most archaeological datasets. Progeny distributions seem to reveal some unique signatures that can help distinguish encoding and decoding errors when values of λ are high. However, it is worth noting that such signatures are strongly dependent on our ability to detect rare cultural variants through appropriate random sampling (i.e. low k in Fig. 6). This result replicates O’Dwyer and Kandler’s (2017) demonstration that, under certain circumstances, removing exclusively rare variants from a complete population can have a profound impact on the resulting progeny distribution, such that a distribution originally consistent with neutral transmission instead appears consistent with novelty bias. From an archaeological perspective, there are two issues here. The first is simply that rare cultural variants are less likely to be represented in the sample recovered for further study. The inferential impact of this falls outside the scope of this paper and mostly relates to the specific statistical analyses employed. A random sample of the population will miss many of the rare variants, but one can speculate that a strong encoding error regime would still yield a higher number of low-frequency variants than a sample generated under a Wright–Fisher process. The second, perhaps less obvious but potentially more impactful, point concerns the process of classification by which cultural variants are defined. Given that discrete cultural variants are often defined by the subjective decision-making of specialists (see Lyman & O’Brien, 2003 for an account of traditional approaches), the extent to which rare variants and singletons are dismissed entirely or amalgamated with more common variants remains unclear. Such bias would make the sample unrepresentative of the population, and the inferential process is likely to be severely affected as a result (see O’Dwyer & Kandler, 2017 for examples). A more recent alternative approach, not explored in this paper, is the use of approximate Bayesian computation and other generative inference techniques. The flexibility of this approach has already led to a number of archaeological applications (Carrignon et al., 2020; Cortell-Nicolau et al., 2021; Crema et al., 2014, 2016; DiNapoli et al., 2021; Kandler & Shennan, 2015; Kovacevic et al., 2015; Porčić & Nikolić, 2016; Rubio-Campillo, 2016), and it could generate expected frequencies of cultural variants given a transmission model that incorporates encoding and decoding errors, as well as the potential impact of a specialised economy with high production rates. Whilst such an inferential framework could accommodate a wide range of potential processes, the extent to which the observed frequency of cultural variants in objects can tease these processes apart remains an open question.

Finally, it is worth noting that some of the details of the model explored here are not necessarily representative of all types of object-mediated transmission. For example, the infinite allele model we employed does not consider the possibility of convergent innovation, and equally, the discrete unit representation we employed does not capture the proximity between variants in design space or its relationship to cognitive attractors and reconstructive processes. Both experimental and simulation studies have shown that cognitive attractors do have an impact on error accumulation (see for example Scott-Phillips, 2017; Claidière & Sperber, 2007), but the psychological and ecological factors affecting attraction are domain-specific, and hence, there must be some doubt about the extent to which their implications can be formalised within a generalised framework for empirical research (cf. Buskell, 2019).

Conclusion

In cultural evolutionary studies, there is a long-running interest in determining the mechanisms which have shaped the evolution of cultural traits. This work typically proceeds by using some measure of trait distribution to detect deviation from a null hypothesis, which is the distribution expected as a result of unbiased cultural transmission (random drift). To date, studies have derived that expectation by borrowing the Wright–Fisher model of neutral evolution from evolutionary biology. We have argued that Wright–Fisher neutral evolution may not be an appropriate null model when cultural transmission is mediated by objects, in other words, when the relative frequencies of traits exhibited by objects in an assemblage do not simply reflect the relative frequencies of the underlying mental representations or ideational cultural ‘genotypes’ carried by a population of people. This may be a general problem, but it is obviously of particular significance in archaeology and other historical sciences which can only directly observe objects. There is currently no consensus about the exact role of objects in cultural transmission, so we have attempted to skirt a somewhat thorny philosophical thicket by focusing on the two things that we think are most significant for cultural evolution: the rate at which ideational cultural traits are given a public manifestation as ‘objects’ and the fact that error may occur during one, or both, of ‘encoding’ a cultural trait in an object and ‘decoding’ a cultural trait from an object. Our computer simulations of unbiased object-mediated cultural transmission produce trait distributions which differ from those predicted by the Wright–Fisher model, as measured by richness, homogeneity, and progeny distribution. The fact that by the usual logic one would conclude that transmission was biased in some way is not down to a failure of the measures currently in use, but instead results from the application of a null model which does not adequately capture the population dynamics that arise when cultural transmission is mediated through objects. If this insight is accepted, then hopefully future research can further explore the impact of the different kinds of error and especially the production rate and its intersection with archaeological sampling.