1 Introduction

Anxiety is an unpleasant emotional experience that any person can encounter in his/her life (Gregor 2005). Generally, anxiety is defined by psychologists as a feeling of tension, fear or apprehension that excites the body system above normal functioning capacity and negatively impacts individual performance (Coon and Mitterer 2008).

Test anxiety, in particular, refers to the specific issue of anxiety related to achieving high test scores in educational careers (Chapell et al. 2005; Miller and Bichsel 2004). Since test scores are crucial for student evaluation at school, test anxiety has become a universal issue in contemporary society. Empirical evidence has shown an inverse relationship between test anxiety and school achievement (see, among others, Everson et al. 1991; Nadeem et al. 2012). More specifically, Syokwaa et al. (2014) stressed “…when a student is too anxious, his/her concentration tends to focus too much on his/her anxiety build-up process to the extent that he/she loses focus of the task at hand”. Other researchers also stated that performance goals interfere with test anxiety, namely, higher goals increase test anxiety due to the fear of failure (Elliot et al. 1999; Carver and Scheier 1994; Kandemir 2013). Despite the relevance of this issue, in Italy, a comprehensive study at the national level that attempts to assess and measure the impact of test anxiety on school performance is still lacking. There are some empirical studies at the local level, and most of them refer to math anxiety, which is a distinct subtype of anxiety that can be experienced even in the absence of both general and test anxiety (Hembree 1990; Cargnelutti et al. 2017). As stressed by Ramirez et al. (2013), a measure of math anxiety is not related to children’s other learning abilities, such as reading achievement; consequently, it cannot be considered a proxy for general academic anxiety.

The 2015 wave of the Programme for International Student Assessment (PISA) survey (OECD 2016a) provides a new general index of anxiety, named the index of school work anxiety. This index was computed by measuring students’ cognitive and emotional reactions to schoolwork and testing. For this reason, this general index of anxiety allows us to overcome the limitation common to previous studies. Indeed, we provide a more complete overview of the relationship between anxiety and school performance by studying how that index affect the students’ score in three areas of study, namely, math, science and reading.

Moreover, even if there are several empirical studies at the international level in this framework, the research in the field is mainly focused (albeit with some exceptions; see, for example, Satake and Amato (1995) or Ramirez et al. (2013)) on assessing the effect of anxiety on math performance using standard regression approaches or simple hypothesis testing, with the findings valid mainly around the centre of the distribution of math scores. As a consequence, these findings do not allow us to distinguish the different degrees of association that test anxiety may have among high-performing and low-performing students.

Therefore, starting from these premises, the objective of this paper is twofold. First, our aim is to deeply investigate the relationship between test anxiety and school performance (measured by the scores obtained in math, science and literature) in Italy, controlling for other individual and contextual factors (such as school environment) that are also recognised to have an influence on school performance (see, among others, Agasisti and Vittadini 2012; Azzolini et al. 2012; Bratti et al. 2007; Longobardi et al. 2018; Masci et al. 2016; Matteucci and Mignani 2014). Second, we empirically test whether the negative impact of test anxiety increases as students’ skill increases. In this manner, we indirectly also measure whether higher goals increase test anxiety, starting from the hypothesis that high-skilled students generally have high goals. To this end, we implement an M-quantile regression model that is well suited to the study of this relationship, as explained later in Sect. 3.3. In particular, we use the W-MQRE regression introduced by Schirripa Spagnolo et al. (2020), which allows us to take into account the hierarchical structure and sampling weights of PISA-OECD 2015 data.

The findings represent an attempt to generalize the relationship between test anxiety and school performance at the Italian national level. They quantify the effect of test anxiety along the overall conditional distribution (namely, at different student skill levels once other crucial individual and contextual factors are controlled for) of math, literature and science scores. A similar conditional approach was followed by Faria and Portela (2016) who applied multilevel quantile regression model to analyse the determinants of student achievement in mathematics in Portugal using PISA-OECD 2009 data. However, the W-MQRE regression represents a more flexible alternative to multilevel quantile regression model and it allows taking into account the informativeness of the sampling probabilities in the estimation procedure.

In summary, it was previously shown by different scholars that anxiety has a negative impact on school performance; however, the effect of anxiety at different quantiles had not been studied before in Italy. Our study demonstrates how M-quantile regression can be easily used to measure this effect and how anxiety has a greater association on high-skilled students as it was found in other empirical contexts.

This analysis also attempts to contribute to the current literature by addressing an obstacle that might undermine students’ lifelong learning and thus can help inform the implementation of specific educational actions aimed at reducing the negative impact of test anxiety. The remainder of the paper is organized as follows. In the next section, we review the main findings on the relationship between test anxiety and test scores. A general description of the data, variables and methods is provided in Sect. 3. The main results are discussed in Sect. 4. Some final remarks are given at the end of the paper.

2 Literature review

In the last decade, numerous studies have investigated the relationship between the level of test anxiety and student performance. These studies have mainly focused on a specific issue of anxiety known as math anxiety, which refers to some form of discomfort observed while students work on mathematical problems (Mutodi and Ngirande 2014; Dowker et al. 2016). In fact, anxiety is recognised as one of the most important predictors of student performance, with a negative effect on student achievement. This negative relationship was confirmed in a meta-analysis study by Hembree (1990), who focused on math anxiety and concluded that a higher level of math anxiety decreases performance in mathematics. Other, more recent papers have also focused on math anxiety. McDonald (2001) reviewed the literature on the effects of test anxiety on children in compulsory education and found that test anxiety is widespread among students; therefore, this issue relates to substantial detrimental effects on test performance.

In addition, Núñez-Peña et al. (2013), among others, demonstrated that math anxiety is the most influential factor in explaining the lower performance of students in a specific course at the University of Barcelona; these authors also showed that students who failed the course had higher mathematical anxiety than those who passed the course. Núñez-Peña et al. (2016) confirmed this negative relationship when concentrating on gender differences, showing that females had a higher level of test anxiety than males but that the gender difference in anxiety does not affect exam grades.

The gender issue was also studied by Devine et al. (2012) and Jolejole-Caube et al. (2019). In the first study, the results demonstrated among a sample of British secondary school children that the levels of math anxiety and test anxiety were higher for girls than for boys, while regression analyses demonstrated that math anxiety was a significant predictor of performance for girls but not for boys. In the latter study, the authors implemented an empirical analysis in the Philippines during the second quarter of the 2017–2018 school year. They found a statistically significant negative association between learners’ level of anxiety and mathematics performance, and this association was stronger for male learners than for their female counterparts. However, Payne et al. (1983), in a sample of students in US, found that the correlation between test anxiety and performance in science was the same for girls and boys.

Furthermore, Van Mier et al. (2019) studied the relation between math anxiety and math performance in young children in the Netherlands and found that math anxiety was more pronounced in women than in men. A more general study by Syokwaa et al. (2014) confirmed a negative relationship between anxiety level and academic achievement in select secondary schools in Kenya. Nadeem et al. (2012) obtained the same results for a sample of Pakistani students. Martinčić (2019) studied the relationship between math anxiety, math attitudes and math performance by comparing Italian and Croatian 3rd and 5th graders. Although the overall negative relationship between achievement and anxiety has been well established (see also Zhang et al. 2019), some studies asserted that anxiety might have more serious consequences for high-achieving students because these students likely have higher expectations and/or higher potentialities. Satake and Amato (1995) found an inverse relationship between math anxiety and achievement among Japanese elementary-school children; moreover, they showed that high achievers reported higher levels of anxiety than low achievers because high achievers are under great pressure to enter more prestigious middle schools.

More recently, the work of Ramirez et al. (2013) explored the relationship between math anxiety and math achievement among first- and second-grade schoolchildren in the United States, finding a negative relationship between math anxiety and math achievement only among children with high working memory, likely to also be the students who had the greatest potential for high achievement. Furthermore, Kandemir (2013) found in a sample of students preparing for university exams in Ankara that anxiety increases as perfectionist characteristics and performance goals increase. Regarding Italy, to the best of our knowledge, empirical evidence on this issue is quite limited and basically related to specific local contexts. The relationship between school performance and anxiety has been studied by Mazzone et al. (2007), who analysed a sample of children and adolescents attending elementary and high school in the city of Catania. The findings showed that children with a high level of anxiety are more likely to have insufficient school grades than children with a low level of anxiety, and this effect is much more significant among high school students. Hill et al. (2016) confirmed the negative relationship between math anxiety and students’ math performance in a sample of approximately one thousand students attending primary and secondary school. Passolunghi et al. (2016) showed that sixth- and eighth-grade students who had high math anxiety levels also had lower mathematics achievement and lower scores on working memory performance than children with low levels of math anxiety. Cargnelutti et al. (2017) studied whether anxiety is related to math performance among young students using a longitudinal study of children in Grades 2 to 3 in Italy.

However, all the previously mentioned studies have common traits, namely, in addition to the anxiety-performance relationship being analysed in a local context, it is also assessed only for a single institution or for a small group of schools or universities. Consequently, the studies often cannot provide a complete overview of the phenomenon at a national level. Recently, the wide-ranging discussion and debate on the relationship between educational performance and anxiety have been addressed using the PISA survey. Nevertheless, the existing papers on this issue have studied anxiety related to mathematics because until 2012, only a math anxiety index (that is, students’ feelings of helplessness and stress when dealing with mathematics) was measured. In particular, Lee (2009) studied how three math self-constructs (math self-concept, math self-efficacy, and math anxiety) affect math performance across 41 PISA 2003–participating countries. The findings stressed that in European countries, the association between math anxiety and mathematics performance was higher than in Asian countries. Thien and Ong (2015) analysed the effects of affective characteristics (motivation, self-efficacy and anxiety) on math performance in Malaysia and Singapore using PISA 2012 data. They found that mathematics anxiety had a significant and negative effect in both countries.

Kalaycioglu (2015) used structural equation modeling to study the predictors of mathematics achievement on a sample of students from England, Greece, Hong Kong, the Netherlands, Turkey, and the USA who participated in PISA 2012. She found that the relationship between math anxiety and mathematics achievement was significant only in countries that had low mathematics achievement (Turkey, Greece, and the USA). Foley et al. (2017) found that students reporting higher levels of math anxiety had lower levels of math performance than their peers who reported lower levels of math anxiety. Moreover, they also found that countries with higher scores had lower levels of country-level math anxiety. OECD (2013a) reported that anxiety was associated with a higher decrease in performance among high-skilled students than among low-skilled students.

Finally, Masci et al. (2018) employed multilevel regression trees with a general measure of test anxiety as a student-level covariate to analyse PISA 2015 data in nine countries. They found that in some countries, including Italy, the most significant variable in explaining the students’ achievement in mathematics was the test anxiety indicator. However, their approach focuses only on mathematics performance and does not allow for quantifying the effect of anxiety on high and low-skilled students.

3 Data, variables and method

3.1 Data

The data used for this study come from the 2015 wave of the Programme for International Student Assessment (PISA). This survey is promoted by the Organisation for Economic Co-operation and Development (OECD), and it was first performed in 2000 and then repeated every three years to test the knowledge of 15-year-old students around the world in reading, science and mathematics. Over the past decade, PISA has become the most important and largest survey for evaluating the quality, equity and efficiency of school systems. In 2015, approximately 540,000 students completed the assessment, representing approximately 29 million 15-year-old students in the schools of the 72 participating countries and economies. In particular, our analysis is based only on students enrolled in public upper-secondary schools in Italy, corresponding to the 3rd level of the International Standard Classification of Education (ISCED). This choice is determined by the fact that the upper-secondary level of education is considered a crucial indicator of the output of the educational system of a country (OECD 2003). The data are characterized by a hierarchical structure (Goldstein 2011). Consequently, our estimation sample is composed of 7142 students with non-missing response variables and covariate information. The 7142 students are clustered into 283 schools.

3.2 Variables

The empirical analysis is based on non-experimental, exploratory correlational models. The aim is to study the relationship between a main predictor variable (the test anxiety indicator) and three response variables measuring math, reading and science achievement, controlling for other predictor factors that have a known influence on academic achievement. In this section, we describe in detail the variables used.

The response variables in this paper are the student scores in mathematics, science and reading. In PISA 2015, 10 predictions of the student score, known as plausible values (PV), are provided (for more details about the methodology, see OECD 2017). We analyse all plausible values by using multiple imputation (MI) formulas (Rubin 2004) to obtain correct standard errors.

The main predictor of this study is the index of school work anxiety provided in the 2015 PISA data. This index was computed by measuring students’ cognitive and emotional reactions to schoolwork and testing. First, students were assessed through the following 5 statements: “I often worry that it will be difficult for me to take a test”; “I worry I will get poor grades at school”; “I feel very anxious even if I am well prepared for a test”; “I get very tense when I study for a test”; and “I get nervous when I do not know how to solve a task at school”. Answers were measured on a four-point Likert scale with the following categories: strongly agree, agree, disagree or strongly disagree. Second, the item response theory (namely, the generalized partial credit model (GPCM); cf. OECD 2017) scaling methodology was used to construct the final index of anxiety. The index is standardized to have a mean of 0 and a standard deviation of 1 across OECD countries. Positive values indicate that students had higher levels of schoolwork-related anxiety than the average student across OECD countries, and negative values indicate that students reported lower levels of anxiety than the average student (OECD 2016b). In our sample, the average level of anxiety is equal to 0.48 (cf. Table 1), which is higher than that of other OECD countries (OECD 2016b). As control variables, we used i) gender (ref. male); ii) immigration status (ref. native, namely, students whose mother or father (or both) were born in the country where the PISA test is administered); iii) grade repetition (ref. no repeated grades at any level); iii) arriving late to school (ref. no late arrival to school in the two weeks before the test); iv) socio-economic status; v) proxy for ambition level of the student (ref. not ambitious); vi) the type of school (lyceums vs other schools); and vii) the region (North (baseline) vs South). Both theoretical and empirical considerations drove the choice of this set of control variables. The empirical considerations are mainly related to avoiding multicollinearity issues in the regression models, while the theoretical considerations may be summarized as follows. Gender differences in school performance in Italy have been confirmed by several studies (see, among others, Grilli et al. 2016; Bratti et al. 2007; Giambona and Porcu 2015; Contini et al. 2017); in particular, boys tend to outperform girls in mathematics and science, whereas girls tend to outperform boys in reading (Invalsi (2016) Indagine OCSE PISA 2015). It is also interesting to note that we also tested whether the interaction term between gender and text anxiety was significantly different from zero across quantiles. However, we did not find any significant effect; therefore, we did not include this term in our model. The educational integration of immigrant children is a fundamental step for the economic assimilation of immigrants in host societies (Schneeweis 2011). However, PISA data reveal that children with an immigrant background face enormous challenges in school, and in particular, Azzolini et al. (2012) demonstrated that in Italy, a highly significant gap between immigrants and native students persists even after controlling for family-level variables and allowing for school random effects. Ever since the report of Coleman et al. (1966), educational researchers have acknowledged the importance of students’ socio-economic status in determining their educational achievement. In our analysis, we include the index of economic, social and cultural status (ESCS) as measured in PISA.

The ESCS scale has been transformed such that zero is the score of an average OECD student and one is the standard deviation across equally weighted OECD countries. Higher values indicate a higher socio-economic status (OECD 2016b). The average ESCS value in Italy is negative, highlighting a worse socio-economic condition than the OECD average. We also control for grade repetition among students. Indeed, retained students are more likely to drop out, stay longer in the school system, and spend less time in the labour force (OECD 2013b). Additionally, students who arrive late to school may miss out on learning opportunities, negatively affecting their performance (OECD 2013a). The aspirations and expectations of students may influence their school achievement (Khattab 2015; Gorard et al. 2012). As a proxy for students’ ambition, we considered whether they wanted to achieve top grades in most of the courses they were attending (no (baseline) or yes).

Many studies (see, among others, Agasisti and Vittadini 2012; Agasisti et al. 2017; Bratti et al. 2007; Checchi 2004) have demonstrated that students attending schools located in Northern and Central Italy perform better than those in Southern regions. The type of school is another determinant of student achievement (see, among others, Bratti et al. 2007). On the basis of the kind of education provided, the type of school has been recoded into two categories: lyceums (baseline), which provide academically oriented education, and other types of schools, which provide technically oriented education.

Moreover, we add in the model the school mean of the following individual-level variables: gender, immigration status, grade repetition, arriving late at school, and socio-economic status. However, we do not include the school mean of the index of schoolwork-related anxiety and the school mean of the proxy for ambition because they are not significant in all quantiles on the three scores. These variables can be viewed as proxies of the social context of the school and allow us to estimate peer effects, namely, how the student is affected by the characteristics of other pupils in the same school (Schneeweis and Winter-Ebmer 2007; Dumay and Dupriez 2008). It is worth noting that in preliminary analyses, we also controlled for school climate measured by the variable Student-related factors affecting school climate available in the PISA dataset (PISA, 2017). Indeed, the correlation between the school-mean of the index of anxiety and this variable was negative (\(-0.15\)) and significantly different from zero (\(p \text {-value}=0.0104\)). Therefore, a good school climate would appear as a protective factor and therefore reduce anxiety (Tramonte and Willms 2012). Nevertheless, the effect of school climate in the econometric models was not significant. For this reason, we have decided to present our results omitting this variable.

Table 1 describes the sample by reporting descriptive statistics for the outcomes and regressors.

Table 1 Descriptive statistics of the outcome variable and covariates

3.3 Methods

Educational data have a natural hierarchical structure, namely, pupils are nested into schools. This data structure creates dependence between observations (students) belonging to the same cluster (i.e., schools), and econometric models must take into account such dependence to obtain reliable results. In particular, failing to recognise a hierarchical data structure involves the underestimation of the standard errors of regression coefficients, leading to overstatements of statistical significance. For this reason, empirical studies based on PISA usually use multilevel modelling (Goldstein 2011; OECD 2013c).

However, classical multilevel models estimate an average relationship between the output variable and the explanatory variables. In other words, classical multilevel models allow us to estimate only the average effect of anxiety on students’ achievement in math, literature and science. For this reason, in this paper, we use the M-quantile random effects regression model (MQRE-2L) proposed by Tzavidis et al. (2016) and adapted to complex survey designs by Schirripa Spagnolo et al. (2020), henceforth model W-MQRE. This approach focuses on the estimation of the regression parameters at different points of the conditional distribution of the outcome variable given the set of regressors. Therefore, the W-MQRE allows us to estimate the relationship between test anxiety and student performance at different quantiles and therefore to test whether anxiety has a greater impact on high-performing than on low-performing students.

Therefore, we have a two-level population with M schools (clusters) and\(N_j\) level-1 units (students) within the jth school. Suppose that the whole population is not observed and the sampled data are selected with the following two-stage sampling design: at the first stage, m schools are selected with inclusion probabilities \(\pi _j\), \(j=1,\ldots ,M\), and at the second stage, n students are selected within the jth sampled school with conditional probabilities \(\pi _{(i|j)}\), \(i=1,\ldots ,N_j\). Therefore, the first level sampling weights are \(w_{i|j}= 1/\pi _{i|j}\), i.e. the inverse of the selection probability of unit i in cluster j given that cluster j has been sampled, and the level 2 sampling weights are\(w_j=1/\pi _j\), i.e. the inverse of the selection probability of cluster j. In the PISA database, the student-level weights are student-level overall inclusion weights (\(w_{ij}\) not \(w_{i|j}\)) adjusted for compensating the schools non-participation, the grade non-response and non-participation of students who are included in the sample (see OECD 2017 for more details about the weighting process in PISA 2015).

Let \(y_{ij}\) be the response variable for the ith student and the jth school. Three different M-quantile random intercept models are estimated using the math, reading and science scores as follows:

$$\begin{aligned} MQ_{q}(y_{ij}|I_{ij}, {\mathbf {x}}_{ij}, {{\mathbf {z}}}_{j}, {\varvec{\gamma }}_{qj}; \psi )= I_{ij}\beta _{I;\psi q} + {\mathbf {x}}^T_{ij}{\varvec{\beta }}_{\psi q} + {{\mathbf {z}}}^T_{j}\varvec{\alpha }_{\psi q}+ {\varvec{\gamma }}_{qj} + e_{qij}, \end{aligned}$$
(1)

where \(I_{ij}\) is the anxiety indicator, \(\beta _{I;\psi q}\) is the associated regression coefficient at each M-quantile q, \({\mathbf {x}}_{ij}^T\) is the covariate-vector of the previously described student-level control variables, \({{\mathbf {z}}}^T_{j}\) is the vector of school-level covariates, \({\varvec{\beta }}_{\psi q}\) and \(\alpha _{\psi q}\) are the vectors of the regression parameters for each M-quantile q, \(q \in (0,1)\), and for a chosen influence function \(\psi _q\) for student- and school-level variables, \({\varvec{\gamma }}_{qj}\) is the vector of cluster-specific random effects at the given M-quantile q and \(e_{qij}\) is the individual q-specific random effect. In particular, the influence function \(\psi _q(u)\) is the derivative of an asymmetric loss function \(\rho _q(u)=2|q-I(u<0)|\rho (u)\). In this paper, we use the popular Huber loss function (Huber 1981; Breckling and Chambers 1988):

$$\begin{aligned} \rho (u)= \left\{ \begin{array}{ll} c|u|-c^2/2 &{} \quad |u|>c \\ u^2/2 &{} \quad |u|\le c, \end{array} \right. \end{aligned}$$
(2)

where c is a tuning constant bounded away from zero. In this paper, we set \(c=1.345\); this value gives reasonably high efficiency under normality producing 95% efficiency when the errors are normal and protects against outliers (Huber 1981).

The estimating equations of the model in Eq. (1) are as follows:

$$\begin{aligned}&\sum _{j \in s} w_j \left[ {{\mathbf {H}}}_j^{T} {{\mathbf {V}}}_{qj}^{-1} {{\mathbf {U}}}_{qj}^{1/2} \psi _q ({{\mathbf {r}}}_{qj}) \right] = \mathbf {0}, \end{aligned}$$
(3)
$$\begin{aligned}&\quad -\frac{1}{2} \sum _{j \in s} w_j \left[ K_{2qj} \mathrm {tr}({{\mathbf {V}}}_{qj}^{-1} {{\mathbf {B}}}_j{{\mathbf {B}}}_j^T)- \psi _{q} ({{\mathbf {r}}}_{qj})^{T} {{\mathbf {U}}}_{qj}^{1/2} {{\mathbf {V}}}_{qj}^{-1} {{\mathbf {B}}}_j{{\mathbf {B}}}_j^T {{\mathbf {V}}}_{qj}^{-1} {{\mathbf {U}}}_{qj}^{1/2} \psi _ {q} ({{\mathbf {r}}}_{qj}) \right] = 0, \end{aligned}$$
(4)
$$\begin{aligned}&\quad -\frac{1}{2} \sum _{j \in s} w_j \left[ K_{2qj} \mathrm {tr}({{\mathbf {V}}}_{qj}^{-1} {{\mathbf {W}}}_j^{-1})- \psi _{q} ({{\mathbf {r}}}_{qj})^{T} {{\mathbf {U}}}_{qj}^{1/2} {{\mathbf {V}}}_{jq}^{-1} {{\mathbf {W}}}_{j}^{-1} {{\mathbf {U}}}_{qj}^{1/2} {{\mathbf {V}}}_{qj}^{-1} \psi _{q} ({{\mathbf {r}}}_{qj}) \right] = 0, \end{aligned}$$
(5)

where \({{\mathbf {y}}}_j\) is the vector of response variable (in this case the students’ performance); \({{\mathbf {H}}}_j\) is the matrix of all covariates included in the model specification (both at student- and school-level: \(I_{ij}\),\({\mathbf {x}}_{ij}\) and \({{\mathbf {z}}}_{j}\)) in the school j; \(w_j=1/\pi _j\) is the school-level sampling weight, i.e. the inverse of the selection probability of school j; \({{\mathbf {W}}}_j =\mathrm {diag}(w_{1|j},\ldots , w_{n_j|j})\) is a diagonal matrix of the student-level sampling weights; \({{\mathbf {r}}}_{qj}= {{\mathbf {U}}}_{qj}^{-1/2}({{\mathbf {y}}}_j-I_{ij}\beta _{I;\psi q} - {\mathbf {x}}^T_{ij}{\varvec{\beta }}_{\psi q} - {{\mathbf {z}}}^T_{j}\varvec{\alpha }_{\psi q})\) is a vector of scaled residuals with components \(r_{qij}\); \({{\mathbf {U}}}_{qj}\) is a diagonal matrix with diagonal elements equal to the diagonal elements of the variance-covariance matrix of the jth cluster\({{\mathbf {V}}}_{qj}= \sigma ^2_{\varepsilon _q}{{\mathbf {W}}}_j^{-1} + \sigma ^2_{\gamma _q}{{\mathbf {B}}}_j {{\mathbf {B}}}_j^T\); \(\sigma ^2_{\gamma _q}\) and \(\sigma ^2_{\varepsilon _q}\) are the M-quantile specific variance parameters; \({{\mathbf {B}}}_j\) is an \(n_j \times 1\) vector of known positive constants; \(K_{2qj}= \mathrm {E}[\psi _{q} (e_j)\psi _{q} (e_j)^T]\) with \(e_j \sim N(\mathbf {0}, {{\mathbf {I}}}_{n_j})\). A Newton-Raphson algorithm is used to obtain estimators of the regression coefficients using Eq. (3) while Eqs. (4) and (5) are solved iteratively using the fixed-point iterative method for getting the estimates of the school- and student-level variance, respectively. The variance of the regression parameters is obtained using a sandwich estimator (Schirripa Spagnolo et al. 2020). The model has been fitted using a specific function in the statistical programming environment R (R Core Team 2020).

Interpretation of the regression coefficients is straightforward: each parameter estimates the change in a specified q-quantile of the target variable produced by a one-unit change in the corresponding covariate (the others are fixed). In the following, we estimate an M-quantile random intercept model with random intercepts at the school level and with the student- and school-level weights provided in the PISA 2015 database. The model is estimated for each score separately.

Robust prediction of random effects at the school level was proposed in Schirripa Spagnolo et al. (2020) following the idea of Tzavidis et al. (2016) based on the modified Fellner equation Fellner (1986). The estimation of the random-effects is performed separately at each quantile, therefore the vector of school-random effects depends on the q.

Finally, the goodness-of-fit of our models has been evaluated using the two measures of pseudo-\(R^2\) proposed by Snijders and Bosker (1999). They are:

$$\begin{aligned} \text {pseudo-}R_1^2= 1- (\sigma ^2_{\varepsilon _q}+\sigma ^2_{\gamma _q})/({\tilde{\sigma }}^2_{\varepsilon _q}+{\tilde{\sigma }}^2_{\gamma _q}) \end{aligned}$$
(6)

and

$$\begin{aligned} \text {pseudo-}R_2^2= 1- (\sigma ^2_{\varepsilon _q}/{\tilde{n}}+\sigma ^2_{\gamma _q})/({\tilde{\sigma }}^2_{\varepsilon _q}/{\tilde{n}}+{\tilde{\sigma }}^2_{\gamma _q}), \end{aligned}$$
(7)

where \({\tilde{\sigma }}^2_{\varepsilon _q}\) and \({\tilde{\sigma }}^2_{\gamma _q}\) refer to the variance components of null W-MQRE model and \({\tilde{n}}\) is the harmonized mean of the cluster sizes (Snijders and Bosker 1999). The pseudo-\(R_1^2\) allows us to evaluate the level-one explained proportion of variance while the pseudo-\(R_2^2\) measures the level-two explained proportion of variance.

4 Results

We fit three W-MQRE regression models using the math, reading and science scores as dependent variables. The results of the estimated models are presented in Tables 2, 3 and 4. Figure 1 instead summarizes in a graph the estimated coefficients of the anxiety index at each quantile for the three different scores.

Fig. 1
figure 1

Estimated effect of the anxiety index on mathematics (blue line), science (green line) and reading (red line) (color figure online)

When student- and school-level variables are controlled for, the general measure of anxiety remains very large across the whole distribution and has a significant and negative effect on student achievement regardless of the subject. Moreover, the impact of anxiety is higher at the upper tail of each score distribution than at the lower tail. Therefore, the results obtained suggest that high-performing students are more affected by emotional reactions to tests and school-work anxiety than low-performing students. The findings, therefore, confirm what has been previously stressed by the literature in this framework, as we highlighted in Sect. 2. In addition, these findings generate also new insight concerning the relationship between anxiety and school performance in Italy. However, some differences among the three scores exist. First, the anxiety coefficients for math and science show faster decreases than those observed for reading. It is interesting to note that the smaller variation of the coefficients related to the reading response across quantiles compared to those related to math and science is also connected to the empirical distribution of reading students’ score that appears more concentrated (see Table 1). In general, the three lines do not appear to be parallel, even if the estimated coefficients in reading are always lower than those in math and science. Nevertheless, as expected, the profiles for math are very similar to those observed for science at each quantile. Accordingly, it seems reasonable to conclude that the association between anxiety and reading score is lower along the entire score distribution. In other words, underachievers and higher achievers in reading are less differentiated in terms of their general measures of anxiety, whereas in quantitative subjects (math and science), students appear to be more differentially influenced by anxiety.

The effects of the control variables (see Tables 2, 3 and 4) are as expected and offer further interesting results that we comment on here only briefly, as they are not the main focus of our analysis. Males outperform females in mathematics and science, while they underperform in reading relative to females. Therefore, even if individual- and school-level characteristics are controlled for, the gender gap in favour of boys is larger among good students in math and poor students in reading. As we expected, native students perform better than migrants on all three scores. The difference is significant at each quantile on mathematics and reading scores and increases as we move from the lowest to the highest quantile. Moreover, both the grade repetition and arriving late at school variables have a negative impact on school performance. However, the effect of socio-economic status varies across the three scores and is not always significant at each quantile. The results also confirm the gap between students attending schools located in the North and the South of Italy. The coefficient associated with the macro region is always significantly different from zero, and the effect size is greater along the mathematics distribution than for the other two scores.

Table 2 Results—Weighted-MQRE model for science scores
Table 3 Results—Weighted-MQRE model for mathematics scores
Table 4 Results—Weighted-MQRE model for reading scores

Furthermore, the type of school has a determinant role. The effect of the share of females is negative at the upper parts of the conditional distribution of mathematics and science. The share of immigrants has a negative and decreasing effect on math and science, while the effect increases on reading score. The share of retained students (i.e., the average value of grade repetitions at the school level (“Mean repeat a grade”)) does not affect performance in reading and science, but it negatively affects high-performing students in mathematics. The share of negligent students (i.e., the variable “Mean arriving late”) has no impact on the low-achieving students. The effect of the school mean of socio-economic status is positive along the conditional distribution of the scores, and its decreasing effect suggests that low-performing students can improve their performance in more equal mixed schools.

The between-school and between-student variability,\(\sigma ^2_{\gamma _q}\) and\(\sigma ^2_{\varepsilon _q}\), respectively, tend to display an inverted-U curve for all scores. The proportion of variance explained by clustering, measured by the intraclass correlation coefficient (ICC), tends to be higher in the middle of the outcome distributions, with schools explaining less of the variability in the performance of low- and high-achieving students and explaining more of the variability for students in the middle of the distribution.

In order to evaluate the school-effects at different quantiles, we plot (see Fig. 2) the ranks of the estimated school-effect at the 10th-quantile versus the ranks of the same effect estimated at the 90th-quantile. Obviously, schools with a low rank have a negative effect on the respective score, while schools with high ranks tend to have a high and positive effect on the respective score. The three plots suggest that the school-effect is different at the two extremes of the conditional distribution because the same effect at both quantiles would have produced points aligned on the bisector (dashed line). It is also evident that some schools have a little effect (i.e. low rank) on each of the three scores (science, math and reading) at the higher quantile whereas they ranked very highly at the lower quantile (i.e the school-effect is high and positive). The pattern of these points is very similar in each of the three scores. The reasons for these patterns may be various. One of these reasons may be the result of school policies. For instance, many schools could have implemented policies to help students who perform poorly instead of focusing on policies to maximise the score of high-performing students. Therefore, this sort of equity-focused policies could result in a positive effect on the score of some schools at the lower tail and may disappear at the higher quantiles.

Fig. 2
figure 2

Plot of school random effect ranks (W-MQRE for \(q=0.10, 0.90\)) for science (a), mathematics (b) and reading (c). Solid line refers to the regression line and dashed line to the bisector

Finally, to assess the model fit, we used the two measures of pseudo-\(R^2\) introduced in Sect. 3.3 in Eqs. (6) and (7). The estimated values indicate a good fit for the three models at all quantiles. In particular, the pattern of these two measures is very similar across the quantiles for all the three scores. Indeed, at the centre of the outcome distribution (\(q=0.5\)) we observe higher values than those at the tails (\(q=0.1\) and\(q=0.9\)) and the values of pseudo-\(R_2^2\) are higher than those of pseudo-\(R_1^2\) at each quantile for all scores.

5 Conclusions

This exploratory-correlational study conducted with data from PISA 2015 and based on the W-MQRE regression model proposed by Schirripa Spagnolo et al. (2020) used student performance as the response variable and anxiety index as the main explanatory variable. Other individual and school characteristics are used as control factors in the econometric model. This analysis allows us to elucidate that anxiety plays a crucial role in test scores regardless of the specific area of study, and it negatively impacts individual performance. In this respect, the negative association between anxiety and academic achievement is confirmed. This study adds to the existing literature in this framework not only by assessing the association between a general measure of anxiety and math, reading and science scores; the results further highlight that anxiety has a greater impact on good students. Indeed, as affirmed by Zeidner (1998), test anxiety cannot be explained by a lack of effort or poor exam performance because conscientious and highly motivated students also suffer from its debilitating impact. Consequently, it is reasonable that high-performing students are more worried about achieving poor grades and have a greater fear of failure than low-performing students. The logical conclusion is that anxiety likely leads to underestimation of the true ability of good students. In this respect, a deeper examination of such phenomena should be addressed in the future. As asserted by Kandemir (2013), test anxiety increases with the increase in perfectionist characteristics and performance goals. Coping with anxiety is essential to improving individual performance; moreover, considering that anxiety impacts good students more, reducing the level of anxiety may improve the general educational performance of a country overall. Teachers are key in reducing anxiety. It is fundamental for teachers to recognise that anxiety has a larger effect on the best-performing students and to give them adequate support to lessen it. More important than offering individual help is building good relationships with all students and improving the school environment. Additionally, parents can help children to manage anxiety by encouraging them to have more faith in their ability to accomplish school tasks (OECD 2016b).