1 Introduction

It is widely accepted that the involvement of parents in their children’s education benefits both learning and social engagement (Jeynes, 2005). Beyond impacting positively on attainment (Castro et al., 2015; Hill & Tyson, 2009; LeFevre et al., 2009), parental involvement has been shown to improve children’s motivation and enhance self-efficacy (Cheung & Pomerantz, 2011), reduce subject anxiety (Vukovic et al., 2013), improve both attendance (Simon, 2001) and behaviour (Aunola et al., 2003), and lead to improved participation in post-compulsory education (Ross, 2016). From the perspective of young children’s learning of number, the focus of this paper, it is widely assumed that the more parents engage in number-related activity, the higher the child’s attainment (Kleemans et al., 2016, p. 71). However, the typically quantitative literature is inconsistent with respect to the nature and influence of such activities. For example, with respect to kindergarten children, there are studies asserting a positive relationship, with “parent-child numeracy activities and parents’ numeracy expectations … uniquely related to early numeracy skills” (Kleemans et al., 2012, p. 476) and later stimulation at school (Anders et al., 2012). Alternatively, there are studies showing, even when a large number of parental activities are examined, no relationship between parental involvement and numeracy (Missall et al., 2015).

In broad terms, parents’ number-related activities have been categorised as either direct or indirect (LeFevre et al., 2009; Purpura et al., 2020), formal or informal (Huntsinger et al., 2016; Vasilyeva et al., 2018) and advanced or basic (Skwarchuk et al., 2014; Zippert & Ramani, 2017). However, irrespective of their labels, the findings of this research seem ambivalent with respect to what forms of activity predict what forms of learning. With respect to children’s mathematics achievement, there are studies finding a positive impact of formal mathematical activities and a negative impact of informal mathematics activities (Huntsinger et al., 2016). Others, despite an initial aim of examining the impact of both formal and informal activities, have, for a variety of reasons, abandoned the latter and examined only formal activities and found a positive impact on numeracy (LeFevre et al., 2010; Manolitsis et al., 2013). Other studies’ results have been more nuanced, with Skwarchuk et al. (2014) showing that formal home numeracy activities predict children’s symbolic number knowledge, while informal activities predicted children’s non-symbolic arithmetical competence. By way of contrast, Vasilyeva et al. (2018) found that formal activities predict children’s number identification, informal activities predict number magnitude understanding, while both formal and informal activities predict arithmetical competence. Finally, LeFevre et al. (2009), in one of the most cited papers, found, inter alia, that direct number skills interventions had no discernible impact on either children’s mathematical knowledge or fluency, while the playing of number-related games impacted positively on both. In sum, the literature on the impact of parent-initiated activity seems problematically ambivalent. In this respect, a recent review by Mutaf-Yıldız et al. (2020) concluded that the lack of consistency was likely to be a consequence of research being biased towards mothers’ reports of home numeracy activities, privileging investigations of formal numeracy activities over informal, being dominated by self-report surveys and, finally, being inconsistent in the measures of children’s mathematical knowledge used. While we would not disagree with these conclusions, we would argue that the tacit acceptance of broad categorisations like formal and informal exacerbates the problem.

In this paper, therefore, we offer a critique of the ways in which different quantitative studies have operationalised parent-initiated number-related activity.

In general, quantitative studies typically fall into three broad categories, which share two important features. First, they derive their data from self-report surveys aimed at eliciting the frequency with which parents undertake various predetermined home-initiated activities. Second, their analyses, albeit based on differing criteria, examine the impact on achievement of variables created from aggregations of individual activity scores. The differences between the three categories lie in the procedures employed to identify activities for aggregation. Studies in the first category identify activities for aggregation by means of different forms of exploratory factor analyses (EFA). Those in the second exploit confirmatory factor analyses (CFA) before aggregating, while those in the third category simply aggregate frequency scores from a variety of activities structured by, typically, predetermined classifications but with limited attention paid to the robustness of those predetermined classifications.

In the following, therefore, we offer critiques of a small number of influential studies in each category as, essentially, case studies of their respective genres. Before doing so, it is important to acknowledge that while there are quantitative exceptions to the parent survey study, as with the national cohort studies of Domina (2005) and Driessen et al. (2005), bottom-up studies in which parents’ views are solicited are rare and, when undertaken, typically focus on the relationship between minority groups at risk of being disenfranchised and school mathematics (see, for example, de Abreu & Cline, 2005; O’Toole & de Abreu, 2005; Remillard & Jackson, 2006). Finally, what follows draws on summaries of various factor analyses, with the consequence that frequent reference is made to both the labels colleagues have given their factors and the individual items that constitute them. To simplify the reader’s task, throughout the following, factor labels are presented in bold type and individual items in italic.

2 Exploratory factor analyses and principal component analyses (PCA)

In the following, we are mindful that many statistical programmes not only offer both PCA and EFA within a collection of exploratory factor analysis options but that PCA is typically the default (Pohlmann, 2004). Importantly, they have different functions. On the one hand, a PCA is explicitly a data reduction process whereby the outcome variables are linear combinations of the original (Fabrigar et al., 1999; Widaman, 2012). PCAs are typically used to reduce a large set of observed variables to a smaller number of variables representative of some common characteristic of the observed variables (Beavers et al., 2013; Costello & Osborne, 2005). By way of contrast, an EFA is a true factor analysis whereby researchers are not explicitly aiming to reduce data but to identify “a set of latent constructs underlying a battery of measured variables” (Fabrigar et al., 1999, p. 275). In such instances, the original variables are linear combinations of these constructs and the aim is to reduce the impact of as many latent variables on each observed variable as possible. In the following, we present both forms of study, although our concerns lie less in the approaches adopted than the interpretation of the factors identified by those approaches.
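The distinction between the two approaches can be written compactly. The notation here is ours, introduced only for illustration and not drawn from any of the cited sources: x_i denotes the p observed survey items, c_j the principal components with weights w_{ji}, f_k the latent factors with loadings \lambda_{ik}, and u_i the part of each item unique to that item.

```latex
% PCA: each component is a weighted sum of the observed variables
c_j = \sum_{i=1}^{p} w_{ji}\, x_i , \qquad j = 1, \dots, m \le p

% Common factor model (EFA): each observed variable is a weighted sum
% of the latent factors plus a unique term
x_i = \sum_{k=1}^{m} \lambda_{ik}\, f_k + u_i , \qquad i = 1, \dots, p
```

In the first expression the observed items sit on the right-hand side; in the second they sit on the left, which is the sense in which the direction of the linear combination is reversed.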

2.1 A first PCA

With respect to PCAs, one of the most widely cited studies is that of LeFevre et al. (2009). Motivated by an acknowledgement that one of the reasons “for the lack of consensus across studies is that researchers have not distinguished amongst different types of home numeracy experiences”, they proposed that a “consideration of a variety of indirect and direct experiences” would be “useful in understanding the relations between home experiences and numeracy development” (ibid, p. 56). To this end, parents of Canadian K-2 children were invited to indicate, on a 0–4 scale, how often their child participated in 40 home-based activities “compiled from a variety of sources” (ibid, p. 57) that included 20 activities with an emphasis on number. For a variety of reasons, three of these were excluded from the analysis. Two, using number or arithmetic flash cards and playing with number fridge magnets, were too infrequently used, while the third, learning simple sums, had been incorrectly printed and led to ambiguous responses.

The remaining 17 activities were subjected to “a principal components analysis with varimax rotation to reduce the number of variables and to determine whether certain activities grouped together” (ibid, p. 59). This process yielded four factors, which are summarised in Table 1. Two factors were interpreted as representing indirect activities and two as direct activities. In this instance, direct activities are “used by parents for the explicit purpose of developing quantitative skills”, while “indirect activities are real-world tasks… for which the acquisition of numeracy is likely to be incidental” (ibid, p. 56).

Table 1 A summary of the four factors identified by LeFevre et al.’s (2009) PCA

The indirect factors addressed number-related games and applications, respectively, while the direct factors addressed number skills and number books. Following this, the authors evaluated the impact of the four factors on children’s mathematics competence, which was evaluated against two measures, one focused on knowledge and one on fluency. They found that an aggregated games score correlated positively with both mathematical knowledge and mathematical fluency. However, number books correlated negatively with fluency. Neither number skills nor applications correlated with either mathematical knowledge or mathematical fluency. Moreover, regression analyses showed these four forms of activity accounting for only 4% of the variance in mathematical knowledge, of which 3% was associated with games. Most of the remaining variance was explained by family factors such as parental education. With respect to mathematical fluency, the four factors collectively accounted for 13% of the variance, with most of the rest also explained by family factors.

Aside from the poor predictive power of the regression analyses, our concern lies in the fact that each of the four factors drew on a range of qualitatively different forms of activity, which, when the goal is to identify which forms of parental interventions are productive, seems a little counter-intuitive. For example, the five items loading on number skills involve the counting of objects; counting down; printing numbers; identifying the names of written numbers; and sorting by size, colour or shape. Two of these allude to oral skills, two to symbolic skills and the fifth underpins logical thinking in ways that have no obvious relationship to number. Moreover, within the structure of the factor, sorting things by size, colour or shape not only has a greater significance than all activities bar counting objects, its frequency in parents’ repertoire of activities is relatively low, further problematising its inclusion. Further, while we might concede that printing numbers is unlikely to occur in any context other than one in which the activity has been directed, activities like counting objects, counting down and identifying the names of written numbers could occur in a range of contexts. For example, is encouraging a child to count door numbers while walking along a street a direct or indirect activity? In other words, to conclude that their measure of number skills has limited predictive power when it draws on such diversity of activity is problematic. Is it reasonable to conclude that encouraging oral counting competence has the same effect as symbolic recognition, which has the same effect as sorting objects against a range of criteria? Indeed, as mathematics educators, we are disappointed that such a core competence as sorting is marginalised in such an oversimplified manner.

The applications factor comprised five items concerning encouraging the child to wear a watch, measuring ingredients when cooking, using calendars, talking about money when shopping, and playing with calculators. Again, all five activities are qualitatively different and address different aspects of mathematical learning. For example, measuring ingredients when cooking is a different competence from talking about money when shopping. The former is likely to involve some form of proportional reasoning—if the recipe is for two people how much butter will be needed for four?—while the latter is likely to involve simple arithmetic—if a chocolate bar is 13 kronor how much change will I get from a 20-kronor note? Encouraging children to wear a watch or play with calculators are problematic because one can only ever infer what the consequences may be. There will be parents who systematically encourage their children to interpret a device’s display and others who do not. Also, while using calendars can be construed as addressing temporal knowledge, it is a qualitatively different form of temporal knowledge from that related to telling the time. Overall, the applications factor seems to draw on too broad a conceptualisation of activity to be meaningful. Importantly, because proportional reasoning and arithmetic are core elements of mathematical learning, any measure that conflates the two will necessarily miss the impact on achievement of either and may help explain the factor’s poor predictive power.

Similar comments can be made about the remaining two factors. The games factor draws on four items concerning the playing of card games, making collections, playing board games with dice or spinners, and being timed. As above, similar conceptual problems emerge. First, it is not clear, beyond administrative convenience, how making a collection constitutes a games-related activity. Indeed, our view is that making a collection necessarily involves making decisions about what to collect and developing criteria for categorising the collection. In other words, making a collection is unlikely to occur independently of some process of sorting by size, colour or shape, which is a core number skills activity. In other words, the authors construe the reciprocally related activities of sorting and making collections as direct and indirect, respectively. Finally, with respect to games, it is not clear how being timed is either games-related or indirect as any manifestation is dependent on how it is presented by the parents concerned.

With respect to the number books factor, two of the three items are so vague as to be practically meaningless; what are number activity books and number story books, and what number-related competences are embedded within them? Moreover, in both cases, what makes them direct rather than indirect? The third activity, connecting-the-dots, is no more than an application of counting and, we argue, related to number books more by chance than any underlying causality. Indeed, the ambiguity of two of the three activities and somewhat trivial focus of the third may help explain the factor’s negative impact on fluency.

2.2 A second PCA

Of course, the sceptical reader may say, “but LeFevre et al.’s (2009) factors were the result of appropriately conducted PCAs and, therefore, should be accepted for what they are.” However, before offering our response to what might be interpreted as a reasonable observation, we examine other studies conducted in similar ways. LeFevre et al.’s instrument has been exploited internationally, including analyses of Dutch parents (Kleemans et al., 2013), Flemish parents (Mutaf Yıldız et al., 2018), Chinese parents (Zhang et al., 2020) and comparative studies of Canadian and Greek parents (LeFevre et al., 2010). One of the most widely cited adaptations of LeFevre et al.’s (2009) instrument, and one that exemplifies the interpretive problem of PCA studies, is that of Skwarchuk et al. (2014), who framed their study against the assertion that in

comparison with research on home literacy, evidence linking children’s early numeracy learning to home experiences is more recent and less thorough—and, as a result, is less conclusive. Inconsistent results may indicate that researchers have not developed a clear distinction between informal and formal activities… that are related to numeracy. (ibid, p. 64)

In this instance, in an apparent shift from activities described as direct or indirect, the authors’ goal was to examine “children’s home experiences as predictors of academic outcomes” (ibid, p. 65). In so doing, emphasising the role of shared experiences, they defined

formal numeracy activities as shared experiences in which parents directly and intentionally teach their children about numbers, quantity, or arithmetic to enhance numeracy knowledge. In contrast, informal numeracy activities are those shared activities for which teaching about numbers, quantity, or arithmetic is not the purpose of the activity but may occur incidentally. (ibid, p. 65)

In this respect, despite changes in terminology, formal activities resonate closely with the earlier direct, and informal activities with the earlier indirect. With respect to informal activities, parents were shown a list of games, some of which were genuine and some of which were fabrications devised solely for the purpose of their study, and invited to indicate which were familiar to them. In our view, this seems a bizarre proxy, not least because, as far as we can discern, parents were asked only to indicate which games they recognised rather than played. Formal activities were based on a range of items, assessed against a 0–4 frequency of use score, and subjected to a PCA as a process of data reduction. Their first PCA, based on 12 items, led to a two-factor solution but with four items loading on both. These items were removed and a second PCA undertaken. This yielded two factors, each comprising four items, interpreted as representing advanced formal activities and basic formal activities, respectively. All twelve original items, each with its mean frequency on the 0–4 scale and factor loadings, can be seen in Table 2.

Table 2 The various activities, with no implied relationship between them, included in the analyses

We see the above as problematic for at least four reasons. First, the definition of informal activities as those “for which teaching about numbers, quantity, or arithmetic is not the purpose of the activity but may occur incidentally” (ibid, p. 65), necessarily excludes those parents for whom such activities are used deliberately to teach about numbers but in ways that subordinate learning to the enjoyment of playing. Indeed, as Dubé and Keenan (2016, p. 167) note, in a good mathematical game, learning and enjoyment “are one and the same and this keeps children playing, providing ample opportunity for practice and eventual mastery of the mathematics skills inherit (sic) in the game”. Indeed, play and learning are inseparable entities (Pramling Samuelsson & Johansson, 2006) with important learning benefits for young children (Björklund et al., 2018; Reikerås, 2020; Van Oers, 2010). Also, and similar comments could be made of much work in the field, authors seem to have valorised particular perspectives on mathematical knowledge that may differ from that of the home (De Abreu, 1995).

Second, beyond alerting the reader to the forms of activity most or least typical of Canadian kindergarten parents, the impact of any particular activity on later learning is lost. This is principally due to a manifestation of the same problem discussed above: each of the two factors draws on qualitatively different forms of activity, which, we posit, posed a substantial interpretive challenge to the authors. What is it about the four advanced activities that makes them advanced? Alternatively, what is it about the four basic activities that makes them basic?

Third, the four excluded items were excluded principally because, as can be seen in Table 2, their high frequency of use ensured that any variance for which they are accountable necessarily falls across the two factors, with the consequence that their significance with respect to later learning was arbitrarily ignored when, in fact, any sensible analysis would look to examine these popular activities’ influence on later learning. Indeed, confirming this conjecture, Table 2 shows all four loading on both the factors identified by PCA1.

Fourth, the two factors that remained were necessarily diminished by these omissions. For example, if I help my child learn simple sums was interpreted as representing an advanced activity, then so would, we suggest, we play games that involve counting, adding, or subtracting. Similarly, if I help my child weigh, measure, and compare quantities represents an advanced activity, then so would I ask about quantities. In other words, the instrument seems poorly conceptualised and, as a consequence, has failed to identify which forms of parent-initiated activity contribute to later learning. Indeed, the aggregate scores from the two factors indicated that basic activities had no impact on either children’s non-symbolic arithmetic or their symbolic number knowledge, while advanced activities impacted on symbolic number knowledge only, confirming that aggregations of such diverse activities are unlikely to offer any predictive power.
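The item-removal step described above, in which items loading on both factors of a first solution are dropped before a second analysis is run, can be sketched as follows. This is a minimal illustration only: the item names, the loadings and the 0.40 cut-off are hypothetical assumptions of ours and are not the values reported by Skwarchuk et al. (2014).

```python
# Hypothetical two-factor loadings, for illustration only.
# These are NOT the values reported by Skwarchuk et al. (2014).
LOADINGS = {
    "item_a": (0.72, 0.10),
    "item_b": (0.65, 0.05),
    "item_c": (0.48, 0.45),  # salient on both factors
    "item_d": (0.12, 0.70),
    "item_e": (0.41, 0.52),  # salient on both factors
    "item_f": (0.08, 0.61),
}


def split_by_cross_loading(loadings, cutoff=0.40):
    """Return (kept, dropped).

    An item is dropped when its absolute loading meets the cut-off
    on more than one factor; every other item is kept.
    The cut-off of 0.40 is an assumed value.
    """
    kept, dropped = [], []
    for item, values in loadings.items():
        salient_count = sum(abs(v) >= cutoff for v in values)
        if salient_count > 1:
            dropped.append(item)
        else:
            kept.append(item)
    return kept, dropped


kept, dropped = split_by_cross_loading(LOADINGS)
print("kept for the second analysis:", kept)
print("dropped as cross-loading:", dropped)
```

Note that the rule consults only the loadings. Nothing in it asks how often parents report an activity or what the activity involves, which is why frequently reported items can disappear from the subsequent analysis.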

2.3 EFA

Another study, unrelated to that of LeFevre et al. (2009), is that of Huntsinger et al. (2016), which was motivated by the assertion that much

research has focused on the influence of home environments and parental attitudes, while less attention has been given to what parents actually do to promote children’s learning, particularly in mathematics. Thus, the present study was an in-depth investigation of the activities in which parents engage their young children in order to facilitate academic preparedness. (ibid, p. 1)

Their analyses drew on data from a parent survey, “developed specifically for the present study” and comprising items “derived from methods that parents had named in… interviews conducted in previous research” (Huntsinger et al., 2016, p. 6). Sadly, due we assume to unnoticed omissions following the anonymity of peer review, no references to this interview-based research were included. Ten items, based on a three-point scale, addressed the frequency with which parents undertake specific things like posing mathematics challenges in the car. In addition, 28 items, presented on a four-point scale, assessed how often a child undertakes a variety of home-based activities like doing mathematics-related workbooks or worksheets. Interestingly, no argument was proposed to explain why some items were on a three-point scale and others on a four. Of these 38 items, 23 focused specifically on mathematics and were subjected to two EFAs, undertaken on data gathered from the same participants 1 year apart. The first EFA yielded three factors, which the authors interpreted as representing informal activities, formal activities and fine motor activities. The second EFA, undertaken with 48.5% of the previous cohort, also yielded three factors, which they labelled formal activities, informal activities and games, blocks and toys.

The results of the two analyses, which can be seen in Table 3, show considerable variation in the distribution of the various items and the authors’ interpretations of the factors. First, only 19 of the 23 activities were implicated in both sets of factors, of which only eight loaded on comparable factors across the two analyses. Second, only three of the ten activities interpreted as informal on the first analysis were interpreted as informal on the second. For example, if using mathematics in everyday home routines and playing made-up mathematics games were interpreted as informal activities on the first analysis, by what process have they become formal activities by the time of the second? Similarly, only five of the nine activities interpreted as formal on the second analysis were so interpreted on the first. In short, if factor analytic studies are to have any relevance, then authors need to be consistent in their application of terms like formal and informal to individual items. An activity described as informal one day cannot conveniently become formal the next.

Table 3 The results of the two EFAs undertaken by Huntsinger et al. (2016) at times 1 and 2. F represents formal activities, I informal activities, M fine motor activities and G games

Admittedly, the authors comment that because “parents change the activities that they do with their children as their children learn and mature, we believed the factor structure would be somewhat different a year later at Time 2” (p. 6). If this were the case, and parents’ activities actually change as much as the authors imply, then any factor analytic study will be of limited value unless accompanied by a caution along the lines of “at the time of this study, undertaken in a particular cultural context, parents of children of age n years privileged a particular set of home-based activities.” That being said, there are at least two alternative explanations. The first may be inferred from the factor loadings shown in Table 3. The study drew on data from 200 surveys at time 1 but only 97 at time 2. With samples of such sizes, a cut-off of 0.4 would be typical (Ford et al., 1986; Pohlmann, 2004), although the authors have elected to use 0.3. This decision, acknowledging that “a factor loading for a sample size of at least 300 would need to be at least 0.32 to be considered statistically meaningful” (Yong & Pearce, 2013, p. 85), may have compromised the authors’ ability to interpret their factors. Moreover, as argued by Costello and Osborne (2005), a robust factor requires five or more strongly loading items (0.50 or better), which is the case only for the informal factor identified at time 1.

The second explanation may lie in the fact that fewer than half the parents involved at time 1 were involved at time 2. Thus, it seems plausible that, in fact, parents’ activities had not changed between the two time points and that differences were due to the effect of the missing parents. However, even if that were the case, the problem of interpretation remains; what credibility can be inferred from a study that concludes that formal activities have a positive impact on mathematics achievement and informal activities have a significant negative impact when many activities have been described in both ways? As Briggs and Cheek (1986, p. 119) note, “factors that do not replicate are of little value”.
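The loading criteria discussed under the first explanation can be made concrete with a small sketch. The loadings below are hypothetical and are not taken from Huntsinger et al. (2016); the function names are ours. The cut-offs of 0.30, 0.40 and 0.50, and the requirement of five strongly loading items, are those mentioned in the text.

```python
def items_meeting(loadings, cutoff):
    """Return the items whose absolute loading meets or exceeds the cut-off."""
    return [item for item, value in loadings.items() if abs(value) >= cutoff]


def is_robust_factor(loadings, strong_cutoff=0.50, min_items=5):
    """Apply the rule of thumb attributed in the text to Costello and
    Osborne (2005): at least five items loading 0.50 or better."""
    return len(items_meeting(loadings, strong_cutoff)) >= min_items


# Hypothetical loadings for a single factor, for illustration only.
FACTOR = {
    "item_1": 0.71,
    "item_2": 0.58,
    "item_3": 0.44,
    "item_4": 0.36,
    "item_5": 0.31,
    "item_6": 0.52,
}

# Number of items retained on the factor under each cut-off.
counts = {cutoff: len(items_meeting(FACTOR, cutoff)) for cutoff in (0.30, 0.40, 0.50)}
print(counts)
print("robust:", is_robust_factor(FACTOR))
```

With these invented values, a factor that appears to comprise six items at the more lenient cut-off shrinks as the cut-off rises and does not meet the five-item criterion, which illustrates how the choice of cut-off shapes what there is to interpret.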

Finally, the authors conclude (Huntsinger et al., 2016, p. 13) that their survey

seems to be a promising instrument for identifying home-based activities which promote mathematics … development in young children in the United States… This research, which has identified home activities that appear to encourage young children’s mathematics … knowledge and skills, may provide practical information which could be disseminated to parents to aid them in building strong foundations for their young children’s academic development.

Our view is that their study has shown their instrument to be far from promising and likely to be of limited help to parents, teachers or researchers wishing to understand or investigate further how parents may best support young children’s acquisition of number competence.

So, why are we so vexed by these studies? For the main part, it seems that the factors yielded by the typical exploratory study are less about identifying activities implicated in children’s learning, whether positively or negatively, than satisfying statistical criteria for inclusion. Thus, when the item pool is diverse, as with all the critiqued studies, items loading on a particular factor do so not because they represent some common form of activity, which is what would be expected from conventional EFAs and PCAs, but because they represent similar response patterns. In other words, there seems to be a tacit assumption that similar activities will yield similar patterns of response, while dissimilar activities will not. The evidence of the critiqued studies suggests that this is not the case, but colleagues try to interpret their factors as though it were. Consequently, it is unsurprising that many of the factors yielded in this way are, in essence, uninterpretable. Moreover, such studies typically lead to the exclusion of activities clearly commonplace in parents’ behavioural repertoires, which seems to counter researchers’ goals of identifying productive activities.
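The point about response patterns can be illustrated with a small sketch. Factor analytic procedures start from the correlations between items, and a correlation is computed from responses alone. The 0–4 frequency responses below are invented for eight hypothetical parents, and the pairing of item names with response patterns is an assumption made purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of responses."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical 0-4 frequency responses from eight parents (invented data).
sorting_objects = [4, 3, 4, 1, 0, 2, 3, 1]
printing_numbers = [4, 3, 3, 1, 0, 2, 4, 1]  # different content, similar pattern
counting_down = [0, 4, 1, 3, 2, 4, 0, 3]     # number-related content, different pattern

r_similar_pattern = pearson(sorting_objects, printing_numbers)
r_different_pattern = pearson(sorting_objects, counting_down)
print(round(r_similar_pattern, 2), round(r_different_pattern, 2))
```

In this invented case, two items with little conceptual overlap correlate strongly simply because the same parents report doing both often, and so would tend to be grouped together, whatever the activities involve.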

3 Confirmatory factor analyses

In brief, confirmatory factor analyses differ from exploratory factor analyses in a number of ways (Widaman, 2012). The most obvious of these is that while the goal of an EFA is to uncover any structures inherent in data, that of a CFA is to determine how well data fit a predetermined structural model (Taylor & Pastor, 2007). In conducting CFAs, investigators assume the existence of factor structures in order to test hypotheses generated by earlier EFAs (Hurley et al., 1997; Stevens & Zvoch, 2007), a process requiring investigators to “select a small set of the best indicators for each factor” (Widaman, 2012, p. 377).

From the perspective of this paper, a number of studies have employed CFAs to investigate the impact of different forms of parent-initiated activity on children’s learning of mathematics. As we show, these are similarly problematic. In the following, we critique Hart et al. (2016), although similar critiques could have been made of, for example, Dearing et al. (2012), Huang et al. (2017), Missall et al. (2015), Napoli and Purpura (2018), Purpura et al. (2020) or Segers et al. (2015). Hart et al. (2016) devised a 48-item survey focused on the frequency of number-related and other activities to examine their impact on parents’ perceptions of their young children’s mathematics achievement. Drawing on activities identified in previous studies, direct, indirect as well as spatial activities were included because they “might (our emphasis) be related to the home math environment” (Hart et al., 2016, p. 6). Parents were informed of the study’s aim and asked to indicate how often they undertook each of the activities on a 1–6 scale that ranged from never through monthly or less, less than once a week but a few times a month, about once a week, a few times a week to almost daily. Mean scores were calculated for each activity, giving an indication of the value parents placed on it. However, in accordance with the conventions of such research, the impacts of individual activities were not examined by the researchers, who subsequently ran eight CFAs to establish the best fitting model based on their predetermined categorisations of direct, indirect and spatial. During this process, six activities were rejected due to low response levels, while a further 19, shown in Table 4, were rejected after the final CFA. The remaining 23 activities, shown in Table 5, yielded the three factors comprising what the researchers call the home math environment.

Table 4 Activities, with means, excluded after confirmatory factor analyses
Table 5 Summary of activities, with frequency means and factor loadings, retained by the CFA according to the a priori categorisation of direct and indirect

Hart et al.’s (2016) predetermined categorisation of number-related activities as direct or indirect seems problematic in at least two ways. First, it is difficult to discern, beyond author assertions, how direct activities are distinguishable from indirect. For example, the extent to which playing with numerical magnets may be construed as direct will depend on the role parents adopt, the purpose they assign to such playing and whether or not they monitor the activity. Also, what distinguishes the directness of noting numbers on signs when driving or walking with children from the indirectness of using numbers when referring to temperatures, time, and dates? What is it about being timed that makes it an indirect activity? Indeed, being timed can be interpreted in a variety of ways. For example, not unreasonable possibilities might involve children being timed when counting to twenty or tying a shoelace. In such circumstances, one can envisage a child wanting to repeat (or being encouraged to repeat) the task in ways that lead to conversations involving words like slower or faster and an introduction to relative magnitude. How could such activities be construed as indirect?

Second, as found in the studies discussed above, the failure to distinguish between the different forms of activity within each categorisation remains problematic. For example, with respect to indirect activities, four are explicitly connected to time, while a fifth concerns measuring ingredients when cooking. Moreover, when set alongside talking about money when shopping or playing card games, the activities described as indirect are qualitatively different, with different implications for learning. Understanding such distinctions matters, particularly if one’s goal is to identify which home-initiated activities are implicated in learning. Is it more important to encourage a child to sort things by colour, shape, or size, and, we assume, encourage logical thinking and an awareness of different forms of mathematical relationship, or note numbers on signs when driving or walking with children, which may help them to identify names of written numbers, count down, or recite numbers in order? Indeed, as discussed above, arguing for sorting as direct and making collections as indirect seems arbitrary. They are mutually dependent activities more closely related to the development of logical thinking, a core goal of mathematics education, than anything explicitly concerned with number.

Hart et al.’s (2016) CFAs we regard as problematic for different reasons. First, it is not clear why the authors opted for CFAs when, we argue, EFAs may have been more appropriate in such an exploratory context. The fact that the authors had to run eight CFAs before finding a satisfactory model is indicative, it seems to us, of a fishing expedition and may indicate, as with many CFA studies, an inadequate initial theorisation (Hurley et al., 1997). Second, notwithstanding the fact that six activities were removed from the analysis due to their being too infrequently reported by parents, a further 19, shown in Table 4, were removed after failing to fit the “best” structural model. This is particularly concerning as, acknowledging the earlier assertion that activities were included because they “might be related to the home math environment” (ibid, p. 6), the authors appear less interested in evaluating the impact on learning of an individual activity than whether or not it fits their statistical model. Moreover, the total number of excluded activities, more than half the original set, tends to support an argument that the authors did not “select a small set of the best indicators for each factor” (Widaman, 2012, p. 377).
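
The item counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the variable names are illustrative only and the figures are those quoted in the text for Hart et al. (2016).

```python
# Item counts as reported in the text for Hart et al. (2016).
TOTAL_ITEMS = 48           # activities in the original survey
DROPPED_LOW_RESPONSE = 6   # rejected due to low response levels
DROPPED_AFTER_CFA = 19     # rejected after the final CFA (Table 4)

excluded = DROPPED_LOW_RESPONSE + DROPPED_AFTER_CFA
retained = TOTAL_ITEMS - excluded
share_excluded = excluded / TOTAL_ITEMS

print(f"excluded: {excluded}, retained: {retained}")  # excluded: 25, retained: 23
print(f"share excluded: {share_excluded:.1%}")        # share excluded: 52.1%
```

The 23 retained items match the number shown in Table 5, and the 25 excluded items amount to just over half of the original 48.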

This leads to our third concern. Many of the activities rejected by the CFA (see Table 4) were, according to Hart et al.’s coding, at least weekly occurrences in parents’ repertoires. Moreover, the mean frequency for the rejected spatial activities exceeded four, while the mean of those activities that satisfied the CFA’s statistical criteria failed to reach three. In other words, the goal of determining which activities are implicated in children’s learning is confounded by the rejection of so many high frequency activities because they failed to fit the desired structural model. In sum, it seems to us that in the desire to achieve statistical significance with an arbitrarily conceptualised model, the things parents actually do get lost. Moreover, the sheer volume of rejected items not only casts doubt over the validity of the process but suggests that expectations that investigators should “select a small set of the best indicators for each factor” (Widaman, 2012, p. 377) have been ignored.
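
To make the frequency contrast concrete, the sketch below maps a mean score back onto the 1–6 response scale described earlier. It is a rough reading aid only: the two mean values are hypothetical, chosen merely to sit on either side of the thresholds mentioned above, and treating the mean of an ordinal scale in this way is itself a simplifying assumption.

```python
import math

# Response scale as described in the text (1 = never ... 6 = almost daily).
SCALE = {
    1: "never",
    2: "monthly or less",
    3: "less than once a week but a few times a month",
    4: "about once a week",
    5: "a few times a week",
    6: "almost daily",
}


def describe_mean(mean_score):
    """Return the label of the highest scale point the mean reaches."""
    point = min(6, max(1, math.floor(mean_score)))
    return SCALE[point]


# Hypothetical means, chosen only to illustrate the contrast in the text.
rejected_mean = 4.2   # a mean exceeding four
retained_mean = 2.8   # a mean failing to reach three

print(describe_mean(rejected_mean))  # about once a week
print(describe_mean(retained_mean))  # monthly or less
```

On this reading, a mean above four corresponds to an activity undertaken at least weekly, whereas a mean below three corresponds to one undertaken less often than a few times a month.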

Fourth, in addition to acknowledging that many of the factor loadings shown in Table 5 are surprisingly low, little can be added that has not already been discussed. Importantly, activities with low loadings but high means indicate to us that they are likely to have loaded on more than one factor and confirm that attempts to identify general forms of parent-initiated activity are likely to be unsuccessful. Fifth, while Hart et al.’s CFA confirmed playing with numerical magnets as an element of the direct activity factor, it was rejected by LeFevre et al. (2009) as a consequence of its achieving too low a frequency of use. Similarly, while Skwarchuk et al. (2014) identified helping my child learn simple sums as a contributor to their advanced activities factor, learning simple sums was rejected by Hart et al.’s CFA. Such differences may suggest, although our view is that this is unlikely acknowledging their cultural proximity, that the Canadian parents of LeFevre’s and Skwarchuk’s studies construe their roles differently from their American neighbours. An alternative explanation, as previously indicated, is that such differences further confirm the problematic nature of such research.

Overall, it seems to us that studies exploiting CFAs typically assume a structural relationship between diverse activities that have been inappropriately categorised in order to satisfy the statistician’s desire for elegance. There is no consistent logic applied to these categorisations and little awareness that such diverse collections of activity are unlikely to yield neat solutions. Indeed, authors rarely explain why a particular activity has been defined as informal or formal, indirect or direct, or advanced or basic. The reader is left to accept such decisions, which have gone unchallenged throughout the literature. Moreover, as we show below, any act of aggregation eliminates the influence of any particular form of activity by burying it beneath a mass of noise.

4 Aggregation studies

A not insubstantial number of studies, having argued for the need to uncover the impact of home-based activity on young children’s mathematics achievement, have simply aggregated scores on a range of qualitatively different activities to create a composite measure for analysis purposes (see, for example, Cai et al., 1999; Dearing et al., 2012; Del Río et al., 2017; Domina, 2005; Driessen et al., 2005; Susperreguy & Davis-Kean, 2016; Vasilyeva et al., 2018; Zippert & Ramani, 2017). Consequently, many of the critiques above are relevant to these studies and are not repeated. That being said, some aggregations are less problematic than others. For example, Niklas and Schneider (2014) evaluated the quality of the home numeracy environment (HNE) by means of three items concerning the frequency with which parents of kindergarten children played dice games, counting games or calculation games with their children. They concluded, after analyses based on an aggregation of the three scores, that the HNE is an important predictor of mathematical abilities at the end of kindergarten and beyond. In this instance, although we might still argue that each item represented a different form of game, at least the totality reflected some sense of parental encouragement of mathematics-related game playing.
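
The way in which aggregation can bury the influence of a single activity may be illustrated with a small simulation. The following is a minimal sketch under strong and purely hypothetical assumptions: ten activities whose frequencies are independent, standardised scores, of which only the first influences the outcome. The function names, sample size and effect size are illustrative inventions and are not drawn from any of the studies cited.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_children=20000, n_activities=10, effect=0.5, seed=1):
    """Hypothetical data in which only activity 0 influences the outcome."""
    rng = random.Random(seed)
    # Assumption: activity frequencies are independent standard-normal scores.
    items = [[rng.gauss(0, 1) for _ in range(n_activities)]
             for _ in range(n_children)]
    # Outcome depends on activity 0 plus unrelated noise.
    outcome = [effect * row[0] + rng.gauss(0, 1) for row in items]
    single = [row[0] for row in items]
    composite = [sum(row) for row in items]  # simple aggregated score
    return pearson(single, outcome), pearson(composite, outcome)


r_single, r_composite = simulate()
print(f"single activity r = {r_single:.2f}")
print(f"composite score r = {r_composite:.2f}")
```

Under these assumptions, the correlation between the composite and the outcome is approximately the single-activity correlation divided by the square root of the number of aggregated activities, so the more unrelated activities are added, the weaker the apparent relationship becomes. Real survey items are ordinal and typically intercorrelated, so the size of the attenuation in any actual study would differ.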

5 Discussion

This paper was motivated by ambivalent research concerning the relationship between parent-initiated learning activities and the mathematics learning of their young children. As we investigated the literature in hitherto unconsidered ways, it became clear that despite their best intentions, typically focused on identifying general forms of productive or unproductive activities, the manner in which colleagues have undertaken their research is disappointingly flawed. As we read their papers, it was clear that colleagues’ goals, framed by well-warranted assertions of an inconsistent field, were initially focused on the identification of home-initiated activities likely to promote mathematical learning. For some, these goals were presented generally, as in “the present study was an in-depth investigation of the activities in which parents engage their young children in order to facilitate academic preparedness” (Huntsinger et al., 2016, p. 1) and the study’s aim was “to determine what children and their parents do inside the home that might be related to children’s math achievement in school” (Hart et al., 2016, p. 6). Others were more particular, as in a “consideration of a variety of indirect and direct experiences” would be “useful in understanding the relations between home experiences and numeracy development” (LeFevre et al., 2009, p. 56) and “researchers have not developed a clear distinction between informal and formal activities … that are related to numeracy” (Skwarchuk et al., 2014, p. 64). Whatever their intentions, such studies typically draw on survey instruments, with items being included because they “might be related to the home math environment” (Hart et al., 2016, p. 6). In most cases, data are subjected to some form of factor-analytic process, the results of which, for at least three reasons, have underpinned our concerns.

The first, which is more technical than pragmatic, concerns the robustness of the factors identified. For example, Costello and Osborne (2005) have argued that five high-loading items are necessary for a robust construct, and yet both of Skwarchuk et al.’s (2014) factors, two of Huntsinger et al.’s (2016) factors and two of LeFevre et al.’s (2009) factors were based on four or fewer activities.

The second, which is both technical and pragmatic, concerns the exclusion of items. In conventional factor analyses, item exclusion serves only to strengthen constructs, not least because items will typically have been designed to avoid loading on different constructs. In the studies reported here, item exclusion seems not to have strengthened constructs but weakened them. This, we argue, can be explained in two ways. The first is that the activities under scrutiny have not been designed according to some developmental principles but, essentially, selected from a random collection of possibilities alongside an unarticulated and rather naive hope that parents’ choices will be structured by some underlying logic amenable to generalisation. But, of course, all parents will engage in direct and indirect, formal and informal, advanced and basic activities in ways that defy the researchers’ tidy ambitions. The second, which is a consequence of the first, is that the activities most frequently used by parents tend to load on multiple factors and are, therefore, excluded. In other words, if a researcher’s goal is to uncover those activities likely to impact mathematical learning, then excluding the most popular activities because they fail to satisfy the statistics of inclusion seems to invalidate the whole study. In short, and somewhat impolitely, it seems to us that many factor-analytic studies in this developing field are “something akin to a ‘fishing expedition’” (Reio & Shuck, 2015, p. 14). Indeed, acknowledging that research has shown no clear indication that parents privilege any particular form of activity, however broadly it may be defined, factors emerging from studies like those above may be due more to serendipity than any structural similarity.

The third, which is more pragmatic than technical, concerns factor interpretation, the quality of which is key to the success of any analysis. In this regard, interpretation should be defensible; do the factors make sense and, importantly, are they reflected in the characteristics of the items on which they are based (Pohlmann, 2004)? It seems obvious to us not only that interpreting constructs as direct or indirect, formal or informal, advanced or basic cannot account for the qualitatively different activities identified by the analyses but also that colleagues’ fixation on such labels seems to have blinded them to alternatives (Ford et al., 1986). This is well exemplified by the five items loading on LeFevre et al.’s (2009) number skills factor; activities alluding to oral skills and symbolic skills are fundamentally different, as any teacher of young children would testify, and should necessarily prompt a reconceptualisation.

In closing, and acknowledging that the above also applies to aggregation studies, we appeal to colleagues working in the field to reconceptualise their work. Their objectives may have been soundly warranted, but their well-justified desires to identify activities that support children’s mathematical development have been undermined by their analytical approaches to survey data. This does not mean that surveys have no role to play, the opposite in fact, but it is the impact of individual activities rather than arbitrary aggregations of activities that needs to be examined. In other words, we should ask ourselves, whether as authors or reviewers, what is the purpose of such research? Is it to identify those activities that actually support learning or to offer statistically robust factors, which, due to the diversity of activities embedded within them, offer few useful insights?