Introduction

The identification of knapping skill within the Oldowan has long been investigated (Apel 2008; Chavaillon 1976; Delagnes and Roche 2005; Geribàs et al. 2010; Kibunjia 1994; Olausson 1998; Roche et al. 1999; Stout et al. 2019), ranging from the identification of individual skill levels to discussions on skill-related variation across the entire techno-complex (de la Torre 2004; Delagnes and Roche 2005; Toth et al. 2006). Such skill-related studies can be broadly separated into two main approaches: firstly, technological qualitative lithic analyses of archaeological assemblages (de la Torre 2004; Delagnes and Roche 2005; Pelegrin 1993, 1990) and secondly, experimental investigations into technological attributes associated with varying skill levels (Ludwig 1999; Stout 2011, 2002; Toth et al. 2006).

The ability to accurately and repeatedly identify varying skill levels within an archaeological assemblage is essential for our understanding of technological variability in the Early Stone Age (ESA). The suggestion that technological skill within the Oldowan remains static has long been investigated (Leakey 1971; Semaw et al. 2003; Stout et al. 2010). Some have argued, based on assemblages from Omo (Chavaillon 1976, 1970) and Lokalalei 1 (Kibunjia 1998, 1994), that a less technically proficient period within the Oldowan was present prior to 2 Ma, characterised by diminutive cores (e.g. at Omo), a high proportion of fragmented and waste products compared to complete flakes, a restricted range of typological core forms, excessive battering on knapping platforms and a high incidence of knapping accidents (step scars on cores and step and hinge terminations on flakes) (Merrick et al. 1973; Roche 1989; Kibunjia 1994; Chavaillon 1976). It was argued that the tool makers were unable to identify adequate natural knapping angles despite the local availability of high-quality raw material and showed a lack of manual dexterity, evidenced by intensive battering along platform edges (Kibunjia 1994).

Conversely, significant technical competence within the early Oldowan has also been argued for based on assemblages from Gona (Semaw et al. 2003, 1997; Stout et al. 2010, 2005), Lokalalei 2C (Delagnes and Roche 2005; Roche et al. 1999) and re-analyses of the Omo assemblages (de la Torre 2004). It was argued that technical competence within the early Oldowan is characterised by an ability to produce the same range of typological forms seen in the post-2 Ma Oldowan (Semaw et al. 1997; Stout et al. 2010), a high degree of core reduction (Semaw et al. 2003, 1997) and the ability to maximise the production of flakes (Delagnes and Roche 2005), and efficiently exploit small raw materials (de la Torre 2004). Furthermore, considerable raw material selectivity was practiced (Delagnes and Roche 2005; Goldman-Neuman and Hovers 2012; Harmand 2009a, 2009b, 2007; Stout et al. 2010). Flakes and cores from these assemblages possess well-placed impact points, clear knapping platforms, a lack of excessive platform battering (Delagnes and Roche 2005) and low frequency of step scars (Semaw et al. 2003, 1997), and show the ability to rectify knapping accidents (Braun et al. 2019; Delagnes and Roche 2005). Recent discoveries of the earliest lithic technology at Lomekwi 3 (Harmand et al. 2015) and the earliest Oldowan at Bokol Dora (Braun et al. 2019) show the ability to repeatedly use conchoidal fracture to detach superimposed contiguous flakes from the edge of cores. However, when assessed quantitatively, the lithic material also indicates that hominins were unable to optimise impact point locations and produced a high frequency of hinge terminating flakes (Braun et al. 2019; Harmand et al. 2015).

Complementing the archaeological record, experimental studies have contributed greatly to our understanding of knapping skill (Bril et al. 2005; Callahan 2006; Eren et al. 2011a, 2011b; Geribàs et al. 2010; Nami 2010; Nonaka et al. 2010; Pelegrin 2006; Toth 1982), including studies of the production of Oldowan technology (Harlacker 2003; Schick et al. 1999; Stout et al. 2009; Toth 1982; Toth et al. 2006, 1993). These comparative experimental studies aimed to identify either the technical skill of ESA hominins (Toth et al. 2006) or technical attributes associated with varying skill levels (Harlacker 2003; Stout et al. 2009).

A comparative analysis of Oldowan assemblages from Gona, experimental assemblages flaked by modern knappers, and an assemblage produced by two captive bonobos (Pan paniscus) (Toth et al. 2006), identified ‘key’ attributes on both cores and flakes associated with increased knapping skill (Table 1). Many of these characteristics are associated either directly or indirectly with the ability to better exploit the volume of a core. Furthermore, attributes associated with a higher level of manual dexterity, such as a lack of battering and step scars, are found at higher knapping skill levels (Toth et al. 2006).

Table 1 Proposed attributes for cores and flakes indicating higher knapping skill in the Oldowan identified by Toth et al. (2006)

Stout et al. (2009) compared novice, intermediate and expert assemblages replicating Oldowan tools across a range of raw material qualities (limestone, vein quartz, quartzite and metabasalt). The results showed that higher skilled knappers were associated with an increased frequency of detached products, greater core reduction, lower percentages of cortex on cores, a greater ability to maintain acute edge angles throughout reduction and an increased range of core forms, specifically an increase in bifacial exploitation (Stout et al. 2009). Additionally, experts produced significantly larger, more elongated complete flakes, and more non-cortical platform flakes (Stout et al. 2009). Conversely, ‘novices’ often detached products through wedge initiations, applying excessive force to inadequate regions of the core. Recently, however, Stout et al. (2019) have suggested that raw material variation (specifically at Gona) plays little part in the variation between exploitation strategies, be they unifacial or bifacial.

Stout et al. (2009) suggest that site-specific investigation into raw material effects on skill level attributes would provide a more detailed baseline for the interpretation of the local archaeological record. Olduvai Gorge presents an interesting and potentially problematic case for the study of knapping skill in Oldowan assemblages, as it exhibits the use of markedly different raw materials within broadly contemporary assemblages, often in the same geographic location (Leakey 1971). Oldowan lithic assemblages in Beds I and II at Olduvai Gorge often consist of a combination of quartzite, chert and various lavas in differing quantities (McHenry and de la Torre 2018; de la Torre and Mora 2005; Kyara 1999; Hay 1976; Leakey 1971; Proffitt 2018), which have been shown to affect the degree of inter-analyst agreement and analysis validity (Proffitt and de la Torre 2014).

The diverse range of raw material types, qualities and morphologies that constitute the archaeological record of the Oldowan at Olduvai may affect both the final morphology of the lithic material (Jones 1979, 1994) and our ability to analyse this material (Proffitt and de la Torre 2014). As such, a greater understanding of how raw material variation affects the physical manifestation and identification of varying skill levels is paramount for future analyses of technological skill and for understanding technological variation within the Oldowan at Olduvai. The present study addresses this issue by introducing an experimental programme which pursues the following two research hypotheses. (1) Different knapping skill levels produce the same consistent technological attributes across the broad spectrum of raw material types at Olduvai Gorge. (2) Different knapping skill levels are identifiable and distinguishable from each other regardless of raw material type at Olduvai Gorge.

Experimental studies cannot replicate all facets or factors of the past and have to limit the variables under consideration (Classen 1981; Freeman 1968; Gifford-Gonzalez 1991; Johnson 1978; Olausson 2010; Roux et al. 1995; Warren 1914; Young 1989). Previous experimental and ethnographic studies have noted that knapping skill is not binary and carries with it myriad nuances within a continuum of skill (Flenniken 1984; Roux et al. 1995; Stout 2002; Stout et al. 2005). The aim of this study is to identify major differences between skill levels, and as such it focuses on the two ends of this continuum, novices and experts. The conclusions from this study are therefore derived from extreme skill level differences. Such differences are unlikely to be as clear in the archaeological record, and these results should be seen as a basis and starting point for identifying inter-assemblage skill level when dealing with multiple raw material types.

Materials and Methods

Materials

Raw Material

The three primary raw material groups identified throughout the Oldowan in Beds I and II of Olduvai Gorge are quartzite, lavas and chert (Hay 1976). Quartzite and various lava artefacts are prevalent throughout Bed I, with chert artefacts found in assemblages from a confined period of time during Bed II associated with the Tuff IIA interval, when the palaeolake was in regression (Hay 1976; Leakey 1971; McHenry and de la Torre 2018). At present, quartzite at Olduvai is found at two primary locations, the Naisiusiu and Naibor Soit inselbergs. Quartzite found at Naisiusiu is finer grained, whilst that derived from Naibor Soit is usually coarser grained (McHenry and de la Torre 2018), with the latter being the most commonly found in the Olduvai archaeological sites (Hay 1976). To replicate the archaeological assemblages, quartzite blocks were collected from the Naibor Soit inselberg, a quartzite outcrop 2 km north of the main branch of the gorge, and the primary source in the landscape for this raw material. This particular quartzite is coarse-grained and possesses micaceous layers which are foliated and lineated (Hay 1976).

During Beds I and II, lavas were available as boulders and cobbles in streams and rivers, originating from the surrounding volcanic highlands (Hay 1976; Kyara 1999; McHenry and de la Torre 2018). The archaeological assemblages of Beds I and II indicate the use of a wide variety of lavas, including basalt, trachy-andesite and phonolite (Hay 1976; McHenry and de la Torre 2018). To ensure consistency in this study, only basalt cobbles were selected, all of which were collected from modern day streams within Olduvai Gorge. It must be noted, however, that these basalt cobbles inevitably possess variations in internal quality as has been shown from petrographic studies (McHenry and de la Torre 2018), although these differences were not identifiable from visual inspection.

Chert was available during a relatively short period of time within Lower-Middle Bed II at Olduvai (Hay 1976; Stiles et al. 1974). It was formed chemically through the precipitation of sodium silicate minerals from the saline alkaline Olduvai palaeolake and made available to hominins during periods of lake regression (Hay 1976). This chert is found as highly irregular fine-grained nodules with a relatively thick chalk cortex. To date, the best-known source of this raw material is found at an outcrop called the main chert unit, in the MNK locality of the Side Gorge (Kimura 2002, 1999, 1997; Stiles et al. 1974). All chert nodules used in this study were collected from this location.

To maintain consistency, raw material selection followed a number of qualitative assessments. All raw material types collected were roughly the same overall size. Each raw material nodule, block or cobble could be easily held and manipulated in the hand, and each blank possessed at least one adequate natural knapping angle of less than 90°. Furthermore, care was taken to select raw material with no external faults or fractures and that was as internally homogeneous as possible. All basalt cobbles collected for this experiment were selected for their fine-grained structure. Quartzite blocks were selected from the same raw material source as employed during Beds I and II, and as such represent the variability in quality seen in Beds I and II. Similarly, chert nodules were selected from the primary chert source at Olduvai and all nodules were fine grained and homogeneous.

Hammerstone selection plays a vital role in the successful exploitation of a core (Mora and de la Torre 2005). Hammerstones used during Beds I and II at Olduvai were mainly rounded lava and quartzite river cobbles as well as reused multifacial quartzite cores (Arroyo and de la Torre 2018; de la Torre and Mora 2018, 2005; Mora and de la Torre 2005).

Both basalt and quartzite hammerstones were used during the production of novice and expert assemblages. The novice hammerstones were sourced directly from the Olduvai riverbed; due to logistical restrictions, the expert hammerstones were quartzite cobbles sourced from the UK.

Experimental Assemblages

Nine novice knappers (7 females and 2 males), with no previous knapping experience, produced the novice assemblages. Each knapper was provided with a single stone flaking demonstration outlining the fundamental steps required to detach a flake from a core, and no further tuition or assistance was provided. Each novice knapper was provided with a single nodule, block or cobble of each Olduvai raw material type, resulting in nine assemblages for each raw material and 27 in total. The expert assemblages were produced by an experienced male knapper in the UK with over 20 years’ experience in exploiting a wide variety of raw material types. The expert knapper produced three assemblages for each Olduvai raw material, resulting in nine assemblages in total.

To maintain consistency during the knapping process, each participant adhered to a set of criteria. Each block/nodule/cobble was knapped with the primary intention of producing usable flakes with one or more sharp edges, not to produce core tools (as defined by Leakey 1971). The knapper was permitted to rotate the core as many times as they saw fit. Reduction ceased only when the knapper was unable to detach further flakes and considered further reduction no longer possible. All knappers were permitted to freely choose, switch and rotate hammerstones at any point during the experiment.

Analysis of Lithic Assemblages

All experimental assemblages were subjected to a full technological analysis, with each artefact labelled and measured (maximum length, width, thickness, weight). All artefacts larger than 20 mm were subdivided into technological categories, consisting of complete flakes, fragmented flakes, cores and angular chunks. Detailed technological analysis was conducted on all cores and complete flakes, which included both an attribute analysis and a technological analysis (see Supplementary Material).

The Results section is separated into two parts, each of which addresses one of the two stated research hypotheses. The first part relates to identifying the same skill level across raw material groups, whilst the second reports a full comparative techno-typological analysis between each skill assemblage, for each raw material individually as well as at an assemblage level.

Statistical Methods

Absolute and relative frequencies were established for all technological categories within each raw material. Statistical variation between skill levels in both categorical and numerical attributes was assessed. For categorical attributes, a chi-square test or, where a 2 × 2 contingency table was possible, a Fisher’s exact test was used, followed by a post hoc assessment of the adjusted residuals (AR) to identify the source of any significant variation. Adjusted residual values represent the difference between the observed and expected frequencies for each variable divided by the standard error. Adjusted residual values greater than +2 or less than −2 indicate significant (p = 0.05) over-representation or under-representation of that variable relative to the expected frequency. Numerical data were subjected to a Kruskal-Wallis test.
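
As an illustration of the post hoc step described above, the following Python sketch computes a Pearson chi-square statistic and the adjusted residual for each cell of a contingency table. It is a minimal sketch and not the code used in this study: the function name `chi_square_adjusted_residuals` and the counts in the example table are hypothetical, and no p-value is calculated.

```python
from math import sqrt


def chi_square_adjusted_residuals(table):
    """Return (chi2, dof, adjusted_residuals) for a table of observed counts.

    `table` is a list of rows. Every row and column total must be greater
    than zero, otherwise the expected count (and the division) is undefined.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Standard error of (observed - expected) for this cell.
            std_error = sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            residual_row.append((observed - expected) / std_error)
        residuals.append(residual_row)

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, residuals


# Hypothetical counts (NOT the study data): rows are raw materials,
# columns are complete flakes, fragmented flakes, cores, angular chunks.
example = [
    [120, 60, 9, 40],  # chert
    [80, 55, 9, 10],   # basalt
    [70, 90, 9, 12],   # quartzite
]
chi2, dof, ar = chi_square_adjusted_residuals(example)

# Cells whose adjusted residual exceeds the +/-2 threshold used in the text.
flagged = [
    (i, j, round(value, 2))
    for i, row in enumerate(ar)
    for j, value in enumerate(row)
    if abs(value) > 2
]
```

For a 2 × 2 table the squared adjusted residual of any cell equals the chi-square statistic, which offers a quick check on the arithmetic.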

All statistical analyses were conducted at two levels of resolution to identify the effects of each raw material group. Firstly, skill level was assessed on an individual raw material basis for each group of knappers, with chert, basalt and quartzite being compared with each other within novice assemblages alone and expert assemblages alone. This was done to test whether individual skill levels manifested consistently across each raw material type. Secondly, novice and expert assemblages, both at an individual raw material level and at a combined total assemblage level, were compared to identify significant markers of skill variation within and across raw material types. Whilst acknowledging that the inclusion of several expert knappers would have made it possible to account for individual variation between knappers, this study is primarily concerned with identifying broad scale differences between raw materials when dealing with vastly differing skill levels, and so the statistical tests used here are appropriate for identifying significant deviations from random chance.
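
The numerical comparisons made at both levels of resolution rely on the rank-based Kruskal-Wallis H statistic. The sketch below shows how that statistic can be calculated, with the usual correction for tied values. It is an illustrative sketch only: the function name `kruskal_wallis_h` and the flake lengths are hypothetical, and converting H to a p-value (by comparison with a chi-square distribution with k − 1 degrees of freedom) is left out.

```python
def kruskal_wallis_h(*groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for 2+ groups."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies, and
    # accumulate the tie term (t^3 - t) for every run of t equal values.
    mean_rank = {}
    tie_term = 0
    start = 0
    while start < n:
        end = start
        while end < n and pooled[end] == pooled[start]:
            end += 1
        run_length = end - start
        mean_rank[pooled[start]] = (start + 1 + end) / 2
        tie_term += run_length ** 3 - run_length
        start = end

    rank_sum_term = sum(
        sum(mean_rank[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Hypothetical maximum flake lengths in mm (NOT the study data).
chert = [21.0, 24.5, 26.0, 30.5]
basalt = [33.0, 38.5, 41.0, 47.5]
quartzite = [36.0, 44.0, 52.5, 58.0]
h_statistic = kruskal_wallis_h(chert, basalt, quartzite)
```

Because the statistic depends only on ranks, it does not assume normally distributed dimensions, which suits skewed measurements such as flake weight.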

Results

Identification of the Same Skill Level Across Raw Materials

To test if assemblages of the same skill levels are internally consistent across the different raw material types, a technological comparison between raw material types was conducted for each skill level.

Expert Assemblages

A chi-square test indicates significant variation in total frequencies of technological categories between all raw materials (χ²(6) = 41.629, p < 0.001). This variation is derived from a significant over-representation of flakes and, to a lesser degree, angular chunks in the chert assemblage, as well as a significant over-representation of fragmented flakes in the quartzite assemblages (Table 2).

Table 2 Absolute, relative and adjusted residual (AR) values for frequencies of all technological categories of expert and novice assemblages, separated by raw material (bold = significant to 0.05)

For expert cores, a chi-square test indicates significant variation between raw materials only in platform battering and flake extraction range (Table 3). This variation is derived from an over-representation of platform battering and of cores with >9 extractions in the basalt assemblage (ST1). There is, however, a significant difference in maximum dimensions (length, width, thickness and weight) of cores between all raw materials (Table 3). Chert cores are significantly smaller than basalt and quartzite cores (Fig. 1). Conversely, no difference in either technological or maximum extraction dimensions is found between raw materials (Table 3), suggesting that expertly reduced cores, whilst differing in size across raw materials, retain similar extraction dimensions at the end of their reduction.

Table 3 Chi-square and Kruskal-Wallis* test results for comparison of categorical and nominal attributes between raw materials for all expert and novice cores and flakes (bold = significant to 0.05)

Fig. 1 Boxplots showing maximum (a) length, (b) width, (c) thickness and (d) weight for all expert and novice cores, separated by raw material, and maximum (e) length, (f) width, (g) thickness and (h) weight for all expert and novice flakes, separated by raw material

Half of the categorical attributes identified on complete flakes show significant differences between raw materials (Table 3). AR values indicate an over-representation of non-faceted, cortical platforms on quartzite flakes with 0-3 dorsal extractions, whereas semi-cortical, uni-faceted platforms, <50% dorsal cortex and between 2 and 9 dorsal extractions predominate on chert flakes. Basalt flakes, however, show an over-representation of non-cortical platforms, along with an under-representation of <50% cortical platforms and dorsal surfaces (ST2).

A Kruskal-Wallis test shows significant variation in maximum length, width, thickness and weight (Table 3) for expert flakes between raw materials. Pair-wise comparisons indicate that chert flakes are significantly smaller in all cases compared to quartzite and basalt (Fig. 1e-h), and quartzite flakes are, on average, larger than both basalt and chert in all dimensions.

Considering fragmented flakes, siret (i.e. split) fractures are significantly over-represented (χ²(2) = 17.503, p < 0.001) in the quartzite assemblages, and under-represented in the chert assemblages (Table 4).

Table 4 Absolute, relative and adjusted residual (AR) values for comparison of split flakes between raw materials for all expert and novice fragmented flakes (bold = significant to 0.05)

Novice Assemblages

There is significant (χ²(6) = 29.470, p < 0.001) variation in total frequencies of technological categories between all raw materials for novice assemblages. This is derived from a significant over-representation of angular chunks in chert assemblages, and a corresponding under-representation of this category in quartzite and basalt assemblages (Table 2). Once angular chunks are removed, there is no clear difference between the remaining technological categories (χ²(4) = 2.018, p = 0.732).

No significant differences in categorical attributes between raw materials are found for novice cores (Table 3). A noteworthy exception is highlighted by the adjusted residuals, where chert cores show a significant over-representation of multifacial exploitation (ST1). Novice cores, however, differ significantly in length, width, thickness and weight between raw materials (Table 3), with chert cores being shorter, narrower and lighter than both basalt and quartzite cores, and thinner than quartzite cores only. Chert core extractions are also significantly smaller than basalt and quartzite core extractions (Table 3).

When considering complete flakes, nine of the twelve categorical attributes show significant variation between raw materials (Table 3). This variation is derived from significant differences between each raw material, rather than a systematic over-representation or under-representation of a single rock type (ST3). Chert flakes possess an increased frequency of non-faceted and multi-faceted, <50% cortical platforms. Additionally, novice chert flakes possess an over-representation of step terminations, partially cortical (<50% and >50%) dorsal surfaces, and early stage (0% or >75% cortex) flakes. Quartzite flakes show a significant over-representation of non-cortical uni-faceted platforms and of late stage flakes with non-cortical dorsal surfaces; conversely, early stage flakes are under-represented. Basalt novice flakes possess a significant over-representation of concave platforms and Toth type III flake categories, and an under-representation of <50% cortical platforms and dorsal surface cortex coverage. A Kruskal-Wallis test shows no significant difference in length, width, thickness or weight (Table 3), with flakes in all raw material groups showing homogeneous average dimensions (Fig. 1).

Novice fragmented flakes possess a significant over-representation of split fractures in quartzite compared to chert and basalt (χ²(2) = 42.984, p < 0.001) (Table 4).

Identification of Skill Level Variation Across Raw Materials

To test if skill level was discernible both at an individual raw material level and at a total assemblage level, a comparison of technological attributes between skill levels was conducted for each raw material and all raw materials grouped together.

Assemblage General Characteristics

Novice knappers produced a total of 1103 artefacts, compared to 569 for expert knappers, excluding small debris (Table 2). A single novice was unable to initiate knapping of a basalt nodule, which resulted in a complete failure to detach a single knapping product. This failed core has been excluded from the analysis as no flakes were obtained from it.

At an assemblage level (all raw materials grouped), a chi-square test shows a significant difference (χ²(3) = 26.986, p < 0.001) in frequencies of technological categories. AR values show that novice knappers produced a higher frequency of fragmented flakes and angular chunks, with complete flakes associated with expert assemblages (Table 5). When broken down by raw materials, no significant difference between novice and expert assemblages in either basalt (χ²(3) = 4.792, p = 0.188) or quartzite (χ²(3) = 1.573, p = 0.666) is found. Chert, however, elicits a significant inter-skill variation in technological frequencies (χ²(3) = 40.521, p < 0.001), which is substantiated by the AR values (Table 5).

Table 5 Expert and novice adjusted residual values for the frequency of all technological categories at raw material and assemblage levels

At both an assemblage and raw material level, the expert knapper converted a greater total weight of the original core into detached pieces (complete and fragmented flakes). At an assemblage level, 49.7% of the total weight of the original nodules was converted into complete flakes and a further 17% into fragmented flakes, with exhausted cores accounting for only 26.9% of the total assemblage weight. This contrasts sharply with the productivity of novice knappers, where 57.6% of the weight of the assemblage was attributed to cores and only 24.9% of the original total weight was converted into complete and fragmented flakes. This pattern is also replicated at an individual raw material level. Interestingly, however, for basalt exploitation, the expert knapper retained a larger percentage of the total original nodule weight as cores (43.9%) compared to complete flakes (38.8%) (Table 6).

Table 6 Absolute and relative total exploited weight of the main technological categories for each skill level at raw material and total assemblage levels

Table 6 Results of statistical tests on all categorical (chi-squared and Fisher’s exact test) and nominal attributes (Kruskal-Wallis test) between expert and novice cores at a raw material and assemblage level (* = Fisher’s exact test used, bold = significant)

A similar trend is observed when comparing total weight of complete flakes with that of the unaltered nodules. At an assemblage level, expert knapping resulted in higher percentage weights of complete flakes (50% versus 18%), with an average weight of 33.5 g compared with 16.7 g for novice flakes (Table 6). This productivity difference is consistent across all raw materials, with quartzite yielding the highest percentage (53.6% versus 16.9%) and basalt seeing the least difference (38.8% versus 17.5%).
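
The productivity measure used above is the share of total assemblage weight held by each technological category. A minimal sketch of that arithmetic follows. The function name `weight_shares` and the weights in the example are hypothetical and are not the values reported in Table 6.

```python
def weight_shares(weights_g):
    """Return each category's percentage of the total weight.

    `weights_g` maps a technological category to its summed weight in grams.
    """
    total = sum(weights_g.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {
        category: 100 * weight / total
        for category, weight in weights_g.items()
    }


# Hypothetical summed weights in grams for one assemblage (NOT study data).
example_assemblage = {
    "complete flakes": 1500.0,
    "fragmented flakes": 500.0,
    "cores": 800.0,
    "angular chunks": 200.0,
}
shares = weight_shares(example_assemblage)

# Share of the original weight converted into detached pieces.
detached_share = shares["complete flakes"] + shares["fragmented flakes"]
```

Expressing each category as a share of total weight, rather than as a count, keeps the comparison between skill levels independent of how many nodules each group reduced.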

Core Analysis

At an assemblage level, there is no significant difference in overall core dimensions and weight between novices and experts (Table 6). This lack of inter-skill level variation is also observable across raw materials, apart from novice quartzite cores being significantly thicker. Although no overall significant variation based on mean dimensions is observed, novice cores all possess greater dimensional and weight ranges, compared to expert cores (Fig. 1).

Significant inter-skill variation at an assemblage level is identified for exploitation strategy, reduction type, knapping accidents, cortex coverage and platform battering (Table 6). AR values indicate significantly higher frequencies of bifacially exploited cores, no step scars, <50% cortex coverage and an absence of platform battering on expert cores (ST4). Novice cores, on the other hand, show significantly greater frequencies of uni-facial and multi-facial exploitation, step scars, >50% cortex coverage and platform battering (ST5).

When individual raw materials are assessed, statistical differences become less clear, particularly in basalt. Quartzite cores show significant inter-skill variation in reduction type, knapping accidents, cortex coverage and knapping platform battering (Table 6). Expert quartzite cores are characterised by an increase of bifacial exploitation, a lack of accidents, <50% cortex and an absence of battering (ST4), compared to an increase of step scars, >50% cortex and high levels of battering for novice cores (ST5). Chert cores show similar inter-skill variation (Table 6). Expert chert cores show the same characteristics as expert quartzite cores (ST4). Novice chert cores are characterised by an increased frequency of step scars, >50% cortex coverage, platform battering and smaller flake scars (ST5). Basalt cores, on the other hand, show no significant difference between skill levels (Table 7); however, adjusted residual values indicate a lack of knapping accidents and lower levels of cortex for expert cores, with the opposite being true for novices (ST4).

Table 7 Results of statistical tests on all categorical (chi-squared and Fisher's exact test) and nominal attributes (Kruskal-Wallis test) of expert and novice flakes at a raw material and assemblage level (* = Fisher’s exact test used, bold = significant)

At an assemblage level, there is no significant difference in either technological or maximum extraction dimensions. This pattern holds for both basalt and quartzite cores. However, significant inter-skill variation in technological length, maximum length and maximum width of extractions is clear for chert cores (Table 6).

Novice basalt cores are characterised by large battered areas and fragmented core edges due to persistent failed removal attempts along inadequate edge angles and an inability to correctly place hammerstone blows. Novice flaking surfaces often retain numerous step scars and stacks of step scars, prohibiting continuation of reduction. This contrasts sharply with the expert basalt cores, where flakes were detached from around the entire circumference of the core, knapping accidents are absent, and battering, although present, remains localised along the edges of the knapping platforms. Novice cores exhibit an overall low level of reduction due to the development of knapping accidents and an inability to identify and deal with them, together with a loss of flaking surface convexity and of adequate knapping angles. Conversely, the full volume of cores was exploited by the expert knapper through the detachment of invasive flakes and correct use of naturally occurring knapping platforms.

Novice quartzite cores are characterised by the repeated detachment of small, non-invasive, step-terminating flakes with very little rotation of the core, causing exploitation to cease prematurely due to an inability to manage the flaking surfaces and knapping angles. Novices repeatedly attempted flake removals in areas where flaking was not feasible, resulting in substantial areas of battering. Conversely, expert quartzite cores are highly reduced, with invasive flake removals and a lack of battering. Knapping accidents are rare, with step terminations caused by natural inclusions or fractures within the raw material. Exploitation predominantly centres on the larger planes of the core, maximising flake dimensions. Exploitation ceased in these cases primarily due to the development of concave flaking surfaces, loss of adequate knapping angles, the development of step fractures or a combination of these factors.

Novice chert cores are often smaller than those of the other two raw materials (Fig. 2). This is caused not by increased exploitation but by the unintentional fracturing of the original nodule, producing smaller cores. There is a general ability to produce good-sized flakes (in relation to original core size), which are either cortical or >50% cortical. However, following the initial reduction, knapping angles quickly become obtuse and unsuitable. This is accompanied by the development of multiple series of step scars that are insurmountable for novice knappers, which in turn results in extensive areas of battering across the knapping platforms (Fig. 3). Nevertheless, a number of well-executed chert cores were also produced by novices (Fig. 3). These were often the result of secondary exploitation of unintentionally detached angular fragments.

Fig. 2 Examples of novice basalt (a), quartzite (b) and chert (c) cores, and expert basalt (d), quartzite (e) and chert (f) cores

Fig. 3 Examples of novice cores on chert (a-d), basalt (e-f) and quartzite (g-h) produced during this experiment

Conversely, expert chert cores are the most heavily reduced of all three raw materials. Chert exploitation differs slightly from the other raw materials: whilst in quartzite and basalt there is a tendency towards a bifacial interaction between planes, in chert cores a non-cortical surface (created by either a flake removal or fracture) is used as the primary knapping platform, with very little bifacial interaction between the associated flaking surface and this platform. Large and invasive flakes which travel far across the centre point are detached unidirectionally, moving back into the volume of the core, with little rotation of the platform. In contrast to the novice chert cores, there are no step scars or stacks of steps, with all flake removals ending in clear feather terminations. There is no battering on the knapping platforms; instead, impact points are correctly located at points which are conducive to well-formed flake removals. Exploitation ceased primarily due to the development of concave flaking surfaces.

Flake Analysis

At an assemblage level, experts were better able to produce complete flakes, and these flakes were significantly larger than those produced by novices. This inter-skill variation is consistent across all raw materials (Table 7).

There is significant variation between skill levels for most categorical attributes at an assemblage level, with dorsal surface step scars being the exception (Table 7). Expertly produced flakes show an increased frequency of central impact points on either uni-faceted or multi-faceted rectilinear platforms which are either non-cortical or partially cortical (ST6). There is also a significant over-representation of flakes with no knapping accidents, clearly defined cross sections and non-cortical or partially cortical dorsal surfaces (ST6).

When broken down by raw material, this differentiation is less clear, as not all attributes are significantly different between skill levels. Basalt and chert flakes show a similar degree of inter-skill variation for 8 of the 12 attributes analysed (Table 7), although these attributes differ between the two raw materials. No significant variation is identified for knapping accidents and number of extractions for basalt flakes, and striking platform morphology and platform shape in chert flakes. Quartzite flakes present the least amount of inter-skill variation, with half of the attributes varying significantly (Table 7).

Impact point type and striking platform cortex coverage both show significant differences between skill levels in chert and basalt, although not in quartzite (Table 7). AR values of chert and basalt expert flakes indicate an over-representation of uni-faceted, non-cortical or partially cortical platforms, with the impact point located centrally. Novice chert and basalt flakes, on the other hand, tend to possess non-faceted, cortical platforms, with mostly de-centred impact points (ST6 and ST7). Dorsal cortex shows no significant difference between skill levels in chert and basalt flakes, although there is a clear difference in quartzite. Novice flakes possess significantly more cortical dorsal surfaces, compared to an over-representation of >50% cortical expert flakes. Striking platform morphology is significantly different only in basalt flakes, with lineal platforms prevalent for novice flakes and well-formed platforms on expert flakes (ST6 and ST7).

Novice knappers tended to produce initial cortical basalt and quartzite flakes that were wide and short (Fig. 4), increasing the convexity of the core flaking surface beyond a suitable angle. Additionally, novice platforms on basalt and quartzite are very thin or lineal, often with a crushed morphology (Fig. 5), caused by hammerstone impacts too close to the edge of the knapping platform. Novice basalt flakes also possess a high degree of battering on the platforms caused by multiple failed removal attempts. Novice chert flakes are less battered than those of basalt, although often present an irregular shape due to an inability to detach invasive removals and overcome the initial irregular morphology of the original chert nodules.

Fig. 4 Examples of expert basalt (a), quartzite (b) and chert (c) flakes which are elongated and well produced, and novice basalt (d) and quartzite (e) flakes retaining a short and wide morphology

Fig. 5 Examples of novice quartzite and basalt flakes. Novice quartzite edge core flakes (a), and short and wide basalt flakes with heavily crushed knapping platforms and impact points (b)

Among novices, there is a preponderance of early stage angular edge core cortical flakes with triangular cross sections and divergent distal ends, representing the corners of tabular quartzite blocks (Fig. 5). Although this type of flake does occur in the expert assemblages, it is far less prevalent, with most expertly produced quartzite flakes being large and thin. Additionally, expertly produced initial cortical flakes are large, thick and invasive, often longer than they are wide, with very little battering (Fig. 4). This ensures that the volume of the core is immediately accessed, providing new, non-cortical areas to act as future knapping platforms.

The frequency of split flakes was significantly higher (χ²(1) = 106.953, p < 0.001) in the expert total assemblage than in the novice assemblage, with the same trend observed in the chert (χ²(1) = 14.777, p = 0.001), basalt (χ²(1) = 49.229, p < 0.001) and quartzite (χ²(1) = 38.564, p < 0.001) assemblages.
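For readers unfamiliar with this kind of comparison, the following is a minimal Python sketch of a Pearson chi-square test of independence on a 2 × 2 table (skill level against split versus non-split flakes, one degree of freedom). The counts are hypothetical placeholders and not the data from this study, the function name is illustrative, and no continuity correction is applied, which is an assumption because the text does not state whether one was used.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (df = 1) for a 2 x 2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of skill level and flake type.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With one degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows = expert, novice; columns = split, not split.
example = [[30, 70], [10, 90]]
stat, p = chi_square_2x2(example)
print(f"chi2(1) = {stat:.3f}, p = {p:.4f}")
```

Running the same function separately on the counts for each raw material, and once on the pooled counts, mirrors the assemblage-level and raw material-level comparisons reported above.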

Discussion

This study set out to test whether the broad range of raw material types found at Olduvai Gorge, Tanzania, affected the ability to identify knapping skill level using conventional technological lithic analytical techniques. To this end, we conducted an extensive analysis of both novice and expert lithic assemblages produced on three different Olduvai raw materials, namely quartzite, basalt and chert. It is important to reiterate that experimental studies such as this cannot replicate all aspects of Early Stone Age lithic production, and it is unrealistic to suggest that such experimental replication studies can fully replicate the range of biomechanical motions of Plio-Pleistocene hominins, or their cognitive abilities. Indeed, it has been suggested that lithic experimental studies may provide datasets which are not comparable to individual archaeological assemblages (Dibble et al. 2016). These assemblages often represent more than merely brief periods of time and have been affected by a wide range of natural and anthropogenic taphonomic influences (Dibble et al. 2016). It is also true that the majority of lithic assemblages at Olduvai Gorge, and indeed in other Plio-Pleistocene contexts, are heavily time averaged and liable to have been affected by varying levels of spatial disturbance (Perreault 2019; Blumenschine and Masao 1991).

This study, however, seeks to identify broad level differences between skill levels and raw material groups. As such, it has focused on two ends of the skill continuum and limits the study to the three main raw materials found at Olduvai. These results are therefore not directly comparable to any individual archaeological assemblage. Instead, they should be seen as a starting point for identifying diachronic skill level variation. Understanding basic patterns such as this is a fundamental step in the wider goal of interpreting Early Stone Age lithic variation. Previous studies have noted a number of key technological attributes argued to be associated with varying skill levels (see Table 1) (Toth et al. 2006). In this study, however, we have focused on whether a wide range of commonly used lithic attributes show differences between skill levels for each raw material. This bottom-up approach potentially allows for an understanding of the attributes which may be affected by raw material types across skill levels. With this in mind, we believe that taking a broad diachronic perspective of lithic variation (Proffitt 2018; Ludwig 1999) may, to some extent, allow interpretations of skill variation despite these known taphonomic issues (Dibble et al. 2016).

Does Skill Level Manifest Equally in All Raw Materials?

The results of this study act as a cautionary note when assessing knapper skill by using relative frequencies of technological categories. In the expert assemblage, the significant over-representation of fragmented flakes in quartzite, and of complete flakes in chert, are related to raw material quality, not knapper skill. It is known that quartz and quartzite often fragment more when compared to more homogenous raw materials (Tallavaara et al. 2010). Similarly, novice knappers produced higher frequencies of angular chunks in chert compared to both quartzite and basalt, highlighting the effect that raw material may have on the composition of lithic assemblages. When assessing knapper skill at a total assemblage level, these variations suggest that raw material indeed has an effect on assemblage composition.

Technological categories in each raw material display differing ranges of variation between skill levels. Both expert and novice cores show a general homogeneity of categorical attributes between raw materials, with only expert basalt cores showing significant variation in the knapping platform battering and increased flake scars, and with novices producing more multifacial chert cores.

The metrical attributes of cores, however, present a more complex picture. Both expert and novice cores show significant variation between raw materials in maximum dimensions and weight. For expert cores, this is derived from the ability to more heavily reduce chert nodules but greater difficulty in fully exploiting basalt cores. For novice assemblages, the diminutive size of chert cores is not necessarily a direct result of knapping competence. Instead, it arises because some novice knappers, after initially fracturing a chert nodule into multiple smaller angular chunks (arguably a feature of lower skill levels), then proceeded to use these chunks as core blanks.

For experts and novices, both chert and quartzite cores are on average smaller than basalt: this may be due to the initial morphology of both raw materials. As Olduvai chert nodules have a highly irregular morphology, smaller chunks have an increased chance of being detached during reduction, and in turn used as separate cores. A similar argument can be posed for quartzite exploitation, where the original tabular morphology of the blank is more likely to break into large chunks, which are subsequently used as core blanks. Rounded basalt cobbles, however, make the detachment of large chunks less likely. As this observation occurs at both a novice and expert level, it is difficult to assign to a specific skill variation. The increased frequency of chert fragment reuse as cores may help explain the diminutive dimensions of chert cores found in Bed II of Olduvai (de la Torre and Mora 2018; Mora and de la Torre 2005; Proffitt 2018). Having said this, however, it also remains a possibility that, during the Oldowan, smaller blocks were selected from the raw material source. Novice cores also demonstrate significant raw material variation in both technological and maximum dimensions of extractions. Assessing knapping skill purely on dimensional attributes of cores would, therefore, be confounded by raw material-derived variation. In the case of Olduvai, any investigation into skill level using these attributes should be undertaken at a raw material level as opposed to an assemblage level.

When considering the flake assemblages, significant differences in how skill levels manifest between raw materials are identified for six of the 12 attributes for experts and nine of the 12 attributes for novices. If raw material had no effect on the manifestation of these attributes, one would expect to either observe the same over-representation and under-representation of technological attributes across each raw material or no significant difference between raw materials. This heterogeneity (Table 8) of technological attributes suggests that raw material type affects the expression of these attributes in both skill levels, highlighting a potential problem in assessing skill level between archaeological assemblages of differing raw material types.

Table 8 Significant over (arrow up) and under (arrow down) representation of technological attributes associated with expert and novice knappers for each Olduvai raw material. The results are derived from chi-square test with post hoc adjusted residuals calculated
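To illustrate the kind of post hoc calculation named in the caption above, the following Python sketch computes adjusted standardised residuals for a contingency table and labels each cell as over- or under-represented. The counts, the attribute categories, the function names and the ±1.96 cut-off (the conventional two-sided 5% level) are all assumptions made for illustration; they are not taken from this study.

```python
import math


def adjusted_residuals(table):
    """Adjusted standardised residuals for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Variance term corrects for the row and column proportions.
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            out.append((observed - expected) / math.sqrt(variance))
        residuals.append(out)
    return residuals


def direction(residual, threshold=1.96):
    """Label a residual as 'over', 'under' or 'none' (assumed cut-off)."""
    if residual > threshold:
        return "over"
    if residual < -threshold:
        return "under"
    return "none"


# Hypothetical counts: rows = expert, novice;
# columns = three illustrative platform categories.
counts = [[20, 55, 25], [50, 45, 5]]
for label, row in zip(("expert", "novice"), adjusted_residuals(counts)):
    print(label, [direction(r) for r in row])
```

Cells labelled "over" and "under" correspond to the up and down arrows of a summary table of this type, while "none" marks cells that do not depart significantly from the expected frequency.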

A similar picture emerges when considering complete flake dimensions: no raw material-dependent variation is apparent for novice flakes, whereas expert flakes show significant raw material variation. In this instance, chert flakes are smaller than quartzite and basalt flakes. As expert knappers are demonstrably able to produce large flakes, this variation may be closely associated with the irregular and undulating morphology of the Olduvai chert nodules, whereas novice knappers are equally unable to produce large flakes across all raw materials. This suggests that chert raw material morphology may have played a part in the apparent abundance of smaller chert flakes in Olduvai assemblages located within the Tuff IIA interval (de la Torre and Mora 2018; Leakey 1971; Mora and de la Torre 2005; Proffitt 2018). However, these increased frequencies of diminutive chert flakes may also be affected by increased core exploitation (Braun et al. 2005) of a high-quality raw material. Alternatively, such a pattern may be influenced by increased distance from raw material sources, as MNK CFS, an archaeological site located directly over a chert raw material source (Stiles et al. 1974), has numerous large chert flakes (Proffitt 2018, 2016).

Interestingly, significant raw material-dependent variation is clear when assessing frequencies of split flakes in both novice and expert assemblages. Split flakes are often regarded as knapping accidents and associated with reduced skill levels (Harmand 2009a; Kibunjia 1994). For both skill levels, split fractures were less common in chert compared to quartzite, whilst this fracture type was also prevalent in novice basalt assemblages. Thus, it is important to note, when using the frequency of split flakes as a proxy for knapper skill, that raw material type has an identifiable impact on the presence of such accidents. Indeed, it has been argued that the brittle nature of quartzite increases the frequency of split flakes (de Lombera Hermida 2009).

When the Olduvai raw materials are compared at both skill levels, it is clear that a wide range of common technological attributes are affected. If the raw material had no impact, one would expect no significant differences at the same skill level between raw materials. Accuracy in assessing knapper skill may, therefore, be lost if assessed at a total assemblage level (i.e. when all raw materials are grouped together), whilst a more accurate assessment may be achieved if raw materials are assessed individually.

Is It Possible to Distinguish Between Skill Levels in All Olduvai Raw Materials?

At an assemblage level, there is significant inter-skill variation in technological categories, with expert assemblages possessing higher ratios of complete/fragmented flakes than novices. This distinction, however, is solely related to the exploitation of chert and is not consistent across each raw material. Once chert is removed, there is no statistical difference between the skill levels using this marker.

Although assemblage composition may not be able to differentiate skill levels, the ability of an experienced knapper to get the most from a core is clearly a skill-related variable (Delagnes and Roche 2005). At both total assemblage and raw material levels, the expert knapper showed a greater degree of productivity, producing a higher total weight of flakes relative to the total weight of cores, as a proportion of the original raw material weight, a factor which is consistent across all raw materials. An interesting nuance of this variation is that expert basalt assemblages retained a considerably higher proportion of the total weight as cores, compared to chert and quartzite. Although the percentage weight of expert basalt cores is less than that of novice assemblages, it is significantly higher compared to the exploitation of both quartzite and chert. This may suggest that Olduvai basalt, even when exploited by expert knappers, is challenging in terms of gaining access to the volume of the core, hindering extensive reduction. Recent analyses of lithic assemblages from the Oldowan site of HWK EE (de la Torre and Mora 2018) have suggested the practice of initially splitting lava cobbles to gain access and create adequate knapping angles. This study suggests that this may represent a technological adaptation to overcome the apparent difficulties in accessing the volume of rounded basalt cobbles where no naturally occurring knapping platform is present.
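One simple way to express the productivity comparison described above is as the share of the original raw material weight that ends up as cores and as flakes. The Python sketch below shows this arithmetic under stated assumptions: the function name, the dictionary keys and the weights in grams are hypothetical, and the exact index used in this study may differ.

```python
def weight_proportions(original_g, cores_g, flakes_g):
    """Express core and flake weight as percentages of the original weight.

    All weights are in grams. The remainder (fragments, debris) is whatever
    is not accounted for by cores and flakes.
    """
    if original_g <= 0 or cores_g <= 0:
        raise ValueError("weights must be positive")
    return {
        "core_pct": 100 * cores_g / original_g,
        "flake_pct": 100 * flakes_g / original_g,
        "flake_to_core": flakes_g / cores_g,
    }


# Hypothetical weights for two assemblages knapped from 1000 g of raw material.
expert = weight_proportions(1000, 400, 450)
novice = weight_proportions(1000, 700, 200)
print(expert["flake_to_core"] > novice["flake_to_core"])
```

A higher flake-to-core ratio, or a lower percentage of weight retained as cores, would indicate more productive reduction under this illustrative measure.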

Inter-skill knapping productivity at both an assemblage and raw material level, therefore, may provide a comparative baseline by which archaeological assemblages can be assessed. Recovery bias, behavioural factors and post-depositional factors all, however, affect the degree to which full reduction sequences are preserved in the archaeological record and inevitably affect assemblage composition.

The ability to reduce cores to diminutive dimensions has been associated with increased levels of knapping skill (Toth et al. 2006). Our study, however, suggests that assessing skill level by core dimensions is potentially problematic. There is little statistical difference in maximum dimensions at either an assemblage or raw material level, even though the expert knapper produced cores of the same or similar dimensions whereas novice cores span a wider range.

The diminutive dimensions of the novice chert cores, and to a lesser extent the quartzite cores, are derived from an entirely different reduction sequence to that identified for the expert cores. This consists of a higher incidence of breakage due to the original nodule morphology and subsequent reuse of small angular chunks as secondary cores. Expert cores, however, show longer reduction sequences, further highlighted by the categorical attributes. Expert cores possess less cortex compared to those produced by novices. Additionally, at an assemblage level, they also tend to be bifacially knapped, whilst novices primarily relied on unifacial and multifacial exploitation strategies. This method in the expert assemblage may represent an increased ability to effect greater reduction by using two flaking surfaces whilst maintaining, and minimising the loss of, an adequate knapping angle, an identified feature of Oldowan reduction (de la Torre and Mora 2005; Toth 1985). This characteristic is shared for quartzite and chert but is not identified for basalt cores. The ability to more efficiently exploit a core has been highlighted by previous studies as a characteristic of varying skill levels (Ferguson 2008; Högberg 2008; Shelley 1990; Stout et al. 2005), and the data from this study corroborate this as a characteristic of expert knappers in both chert and quartzite, but not necessarily basalt.

The ability to overcome knapping accidents has been highlighted as a sign of increased knapping skills within the Oldowan (Delagnes and Roche 2005). The results of our study confirm that, at both an assemblage and raw material level, step scars and knapping platform battering can be considered significant differentiators between skill levels. Higher knapping skill levels may be implied by the lack of repeated hammerstone blows and crushing of the knapping platform edge (Delagnes and Roche 2005), and the ability to detach complete flakes that lack step terminations (Hovers 2012). A lack of knapping platform battering and step scars in expert cores on both chert and quartzite supports this argument and can be used as a proxy for skilled knapping in these raw materials. These technological attributes were, however, not quantifiably different for basalt cores between novices and experts. In this case, the location and extent of knapping platform battering are more effective indicators of varying skill levels. The battering associated with the expert assemblage was located along intersecting edges which possessed adequate knapping angles, whereas novice battering was widely distributed across the platform and along intersecting edges with highly obtuse knapping angles. This variation in battering location reflects the differing levels of technical understanding. Although the raw material morphology made exploitation difficult, the expert knapper had a clear understanding of where to place hammerstone blows to take advantage of even the slightest natural knapping angle, whilst novices showed little to no awareness of such morphological considerations. Based on (1) the lack of battering on the other raw materials, (2) the fact that knapping platform battering is rarely seen in skilled archaeological assemblages on basalt (Delagnes and Roche 2005) and (3) the likelihood that its presence in this study is associated with the expert knapper’s unfamiliarity with Olduvai basalt as a raw material, it can be confidently proposed that a lack of knapping platform battering is a marker for skilled knapping in all raw materials.

The lack of significant differences between skill levels in both nominal and categorical attributes for basalt should not, however, be dismissed. The lack of variation in basalt should be seen as an important limitation in statistically assessing skill level differences within an archaeological assemblage where poorer quality basalt is a predominant raw material.

Turning to the lack of inter-skill variation in the frequency of core extractions, previous experimental studies have shown that highly reduced cores possess a limited number of flake scars (Braun et al. 2005). As the core is reduced, there is a higher probability of removing flake scars associated with previous stages of reduction. This factor may explain the inter-skill homogeneity in this attribute for our study. Novices detach fewer flakes than experts; however, as the expert cores are more highly exploited, previous flake scars are removed, which helps to mask this clear skill-related difference. Another factor is the invasiveness of flaking, identified here through a qualitative assessment of reduction sequences. This is borne out by the flake data, with significant variation identified at both total assemblage and raw material levels in all flake dimensions, with expert flakes being significantly larger. The expert knapper was better able to maintain the flaking surface convexities and the knapping platform angles through the removal of invasive flakes across the flaking surfaces in all raw materials. Conversely, flake scar morphologies on the novice cores show a predominance of wide, short flakes, often terminating in step fractures which rapidly exhausted any adequate knapping angle between platforms and flaking surfaces. This is caused by the novice knapper’s inability to place hammerstone blows with sufficient accuracy and force, with impact points located too close to or too far from the platform edge. Such observations have been identified as a characteristic trait for novice knappers elsewhere (Milne 2005; Finlay 2008; Pigeot 1990; Roux et al. 1995). Our study confirms these observations in the Olduvai raw materials. These features are in stark contrast to the well-placed, individual impact points and the overall lack of core battering seen on expert cores, suggesting a higher ability to control percussion motions (Delagnes and Roche 2005; Maddux 2011) and an ability to evaluate the potential of the core rather than repeatedly attempting extractions in unsuitable areas (Milne 2005).

The issue of flaking surface maintenance is more conspicuous in chert than in basalt and quartzite, with the development of numerous stacks of step scars on novice chert cores (Table 9). These are caused by a combination of failed removal attempts from overly concave flaking surfaces, coupled with an inability to recognise the barrier that step scars pose to continued exploitation. Novice knappers rarely rotated the core to take advantage of a new knapping platform following the development of a stack of step scars, a common method of continuing exploitation (Delagnes and Roche 2005; Pigeot 1990). The ability to overcome these simple knapping accidents is a common proxy for increased skill levels (Milne 2005; Nichols and Allstadt 1978; Shelley 1990), and our study corroborates this relationship for the Olduvai raw materials. It has been noted that knapping ability can be separated into skill related to motor functions and skill related to understanding or cognition (Pelegrin 2005). The major differences between novice and expert knappers for each raw material in our experiments clearly fall within these two categories (Table 9). It is clear that the expert knapper has a greater ability to correctly place hammerstone blows as well as possessing a far greater understanding of fracture mechanics, which results in the production of higher quality flakes. In addition to this, the higher degree of cognitive skill exhibited by the expert knapper is clear when considering the increased degree of core exploitation and flaking surface maintenance, as well as the degree of rotation of the core and lack of knapping accidents. The results of this study show that entirely novice knappers possess little in terms of motor and cognitive skills. These results confirm previous suggestions that motor skills (i.e. the ability to place successful hammerstone blows and manipulate the core) are likely to develop prior to a fuller understanding of the more nuanced aspects of lithic reduction (cognitive skills) (Pelegrin 2005).

Table 9 Technological attributes of cores, flakes and flake fragments which are significantly variable based on knapper skill level, separated by individual raw material and at a total assemblage level

It has been noted that raw material variation has a direct effect on the production (Jones 1994) and analysis (Proffitt and de la Torre 2014) of lithics from Olduvai. However, there have been few studies which investigate the effect that raw material has on production of Oldowan assemblages at Olduvai, and on how knappers of varying skill levels interact with this range and quality of raw materials. In her assessment of cultural variation within the Olduvai basin, Mary Leakey, applying a typological approach, implicitly suggested that varying knapping skill levels were present between the Classic Oldowan, the Developed Oldowan A, the Developed Oldowan B and Early Acheulean at Olduvai (Leakey 1971). This was potentially attributable to different associated hominin species (Leakey 1975). More recent studies have, however, used relative frequencies of technological attributes within Olduvai assemblages to infer variation of skill levels over time (de la Torre and Mora 2005; Kimura 2002, 1999; Ludwig 1999; Proffitt 2018). Within the wider Oldowan, however, varying skill level has been a source of debate for some time. Leakey (1971) argued that the classic Oldowan exhibited a degree of technological stasis, a view supported by some more recent studies (Semaw et al. 2003; Stout et al. 2010). Subsequent research and archaeological discoveries, as well as the identification of increasingly older examples of Oldowan assemblages, have highlighted potential technical variation within this time period (Delagnes and Roche 2005; de la Torre 2004; Braun et al. 2019). The oldest Oldowan from Ledi-Geraru now further highlights this intra-technology variation whilst showing a degree of simplicity compared to the Oldowan assemblages at Gona (Braun et al. 2019). Our results strongly suggest that, when assessing skill levels between Oldowan assemblages, if multiple raw materials are present, each should be compared separately.

Conclusions

In this study, we have sought to address two primary research questions, namely how varying knapper skill levels manifest themselves in each Olduvai raw material, and whether it is possible to identify knapping skill variation between the raw materials found at Olduvai Gorge. The first of these questions was addressed by assessing whether common technological attributes are identified across all raw materials for each skill level. The second question was addressed through a comparative analysis of skill at both an individual raw material level and a total assemblage level. The results suggest that differentiations in knapping skill are not equally represented in each raw material when assessed through technological analyses. Furthermore, the morphology and quality of the raw material strongly affect the ability to identify varying skill levels within the archaeological record. Nevertheless, by using a combination of both quantitative and qualitative technological approaches, the identification of skill-related variations across raw materials is possible, especially when assessed at an individual raw material level. The various raw materials available at Olduvai Gorge have been shown to have a direct effect on the identification of technical skill attributes. For this reason, it is suggested that attempts to differentiate skill levels should be undertaken at a raw material level, as opposed to grouping all raw materials together. To best assess relative skill levels within an archaeological assemblage, both cores and flakes should be included, and multiple technological attributes considered, as varying skill levels are not expressed equally across technological categories and attributes. In the Olduvai raw materials, many categorical attributes showed no inter-skill variation in cores. Furthermore, the presence of knapping accidents may, in some instances, not be directly related to lower skill levels, as highlighted here by the equal frequency of quartzite split fractures for both skill levels.

Based on the results of this study, we would advocate that, when assessing knapper skill variation using technological attributes of both cores and flakes, it is important to treat each raw material type separately, as skill-related attributes are not represented equally between raw materials. Furthermore, this study highlights the need to establish, through experimentation, which technological attributes can be used as proxies for knapping skill levels prior to assessing the archaeological record. This type of initial assessment of how different raw materials react to varying skill levels should be a component of such studies prior to assessing Early Stone Age assemblages.