1 Introduction

It is a historical fact that education policy was conceived in terms of free and mandatory public schooling (financed by public funds) when it was introduced in the West (Germany, France and later the UK and the US), and free and mandatory schooling still lies at the basis of Western educational systems today. Several motives have been identified for the introduction of compulsory education (Fyfe 2005). In Prussia, where such a system was first introduced in 1763, the Protestant religious motive seems to have prevailed (on this see also Becker and Woessmann 2010). In France and Italy, compulsory education laws, dating back to 1881 and 1861 respectively, are mainly seen as part of the construction of a national state (see also Cipolla 1969). In Japan, it was the desire for modernisation that drove the introduction of mandatory schooling after the opening to the West in 1886. The UK and the US, by far the most industrialized countries at the time, also passed compulsory education laws at the end of the XIX century (1880 in the UK; from 1885 to 1918, depending on the state, in the US). This slight delay might come as a surprise, but a possible reason for it has been identified in the need for cheap child labour: for example, Galor (2006) suggests that education was made compulsory only when technological progress created the need for a literate workforce. In that case, parents who may profit from their children’s labour or contribution to home production (Balestrino et al. 2017; Cigno 2013) may have to be forced to send their kids to school. As far as the US is concerned, Bandiera et al. (2018) also stress a nation-building motive aimed at instilling civic values in migrants during the “Age of Mass Migration” from 1850 to 1914, an attempt at building social capital, one might say.Footnote 1

Initially, the length of mandatory schooling and the enforcement of attendance requirements were relatively limited, especially in Southern European countries and, more generally, in the countryside, where children were seasonally employed in agriculture. After World War II, the length of compulsory education increased steadily in Western countries. Murtin and Viarengo (2011) show that there was a strong convergence in the length of mandatory schooling in fifteen Western European countries during the period from 1950 to 2000. At the end of the 1930s, the years of compulsory education ranged from three in Portugal to nine in the UK. After the reforms that occurred in the second half of the XX century, the range narrowed to nine to twelve years. According to Murtin and Viarengo (2011), this convergence can be traced to the decreasing returns to educational investments, and to the related fact that all countries had reached approximately the same level of profitability. Nowadays, this convergence is further reinforced by globalisation: stronger competition in global markets can only be met with a more educated workforce.

In developing countries, however, free public education is not always guaranteed, and thus compulsory education is still an issue today, despite being one of the main prerequisites not only for economic development but also for democratization and human rights (as an end in itself and as a primary tool in the fight against child labour). Elementary education should be made compulsory according to art. 26 of the Universal Declaration of Human Rights (1948), and this principle has been reaffirmed in a number of conventions and treaties up to Goal 4 of the UNDP Sustainable Development Goals, which calls for achieving inclusive and quality education for all, and more specifically “ensures that all girls and boys complete free primary and secondary schooling by 2030”.

The two motives behind the establishment of compulsory education that we mentioned before, industrialization and productivity needs more generally on the one side and nation-building on the other, do not of course necessarily conflict with each other. However, as pointed out above, the productivity motive may have been the main driving factor behind the expansion of compulsory education that took place in Europe after the Second World War.Footnote 2

For these reasons, we focus on the productivity motive, and therefore build a model in which center stage is taken by the actors of the industrial world, entrepreneurs and workers.Footnote 3 Also, we adopt a positive, rather than a normative, viewpoint. It is of course always possible to argue in favour of compulsory education in normative terms, e.g. because of horizontal equity requirements (Balestrino et al. 2017). However, normative desirability is not enough to explain why compulsory schooling has become an indispensable part of the modern educational policy package. If we take the political economy view that policies are designed according to voters’ preferences by office- or policy-motivated politicians, then it follows that someone’s interests must be furthered by the presence of a mandatory education period. What we require, therefore, is an argument showing that education policy is likely to be part of a winning policy scheme in a political context.Footnote 4

Specifically, we investigate whether there might be a social group that is interested in introducing compulsory schooling as part of the equilibrium policy and is endowed with sufficient political power to actually do so.Footnote 5 In our model, as we said, agents are classified into two occupational groups, entrepreneurs and workers. One of the implications of such a division of society is that entrepreneurs have a stronger interest in education policy than workers. The rationale for this is not that entrepreneurs want their own children to be well educated, because they will tend to provide the required education anyway; the point is that they want the children of their workers to be educated, in order to enjoy a better workforce one generation ahead. For this reason, entrepreneurs favour compulsory schooling, financed by the tax system; such a scheme should then prevail at the political equilibrium if the entrepreneurs are able to impose their preferred policy.

Notice also that the fact that in our model both entrepreneurs and workers have a say on education policy through their voting behaviour is consistent with our focus on the productivity factor, which we saw is presumably the main one behind the more recent introductions or expansions of the compulsory education system. Indeed, universal suffrage was not present in the countries where compulsory education was first introduced (see above): due to their low education and income levels, at the time workers did not have the right to vote.Footnote 6

Additionally, we may remark that the phenomenon of “industrial paternalism” provides some indirect evidence that entrepreneurs have historically cared about their workers’ education. Industrial paternalism entailed, by and large, the provision by entrepreneurs of basic health and education services to their workers. It covers a period ranging roughly from 1860 to 1950 and is quite common across different places and cultures: we have examples in Europe (France: Reid 1985; Italy: Ciuffetti 2004; Finland: Fellman 2019; UK: Dellheim 1987), in Asia (Japan: Tsutsui 1997), in Africa (Belgian Congo, now Democratic Republic of the Congo: Juif 2019) as well as in the US (Tone 1997). It may be difficult to say what the main rationale for this was, and there was possibly a host of different motives (many of which are studied in the references above), but it is undeniable that having a minimally healthy and educated workforce tends to increase productivity (see also fn. 2 and fn. 3). Indeed, it is an established stylised fact that the children of educated parents are more likely to acquire an education (see e.g. Checchi 2005 and the references therein); it is then plausible to conceive of the entrepreneurs as aware of the potential impact of education on productivity because they see the effects of education on themselves and their children. Workers, instead, not experiencing education first-hand, are less likely to regard it as an investment for their children and may easily end up in a vicious circle whereby, generation after generation, they prevent their own offspring from reaping the benefits of having an education.Footnote 7 These are the main reasons why we regard the interpretation of the emergence of compulsory education as a response to productivity requirements as more convincing than alternative explanations, such as those based on the presence of a majority of poor workers caring for their children’s welfare and voting for an education system financed by a tax on rich entrepreneurs. Note, however, that the two explanations are not contradictory and that some altruism on the part of the workers may reinforce the effects of the productivity argument.

The paper is structured as follows. Section 2 presents the model and illustrates the nature of the free-market equilibrium, while Sect. 3 introduces the policy instruments and discusses the policy preferences of the agents. Section 4 expounds the political equilibrium achieved via a probabilistic voting process. Finally, Sect. 5 concludes.

2 The model

We consider an overlapping generations economy in which agents live for three periods, \(i =0 ,1 ,2.\) In period 0, however, an agent has only a passive role: she receives an education and supplies the time not absorbed by the educational process to the production of a domestically produced service. We refer to agents in period 0 as “children”, in period 1 as “young adults” and in period 2 as “mature adults”. The latter two are the periods in which economically relevant decisions are taken and carried out. Agents cease to exist at the end of period 2. For our purposes, then, two economically active generations overlap at each point in time: young adults, y, and mature adults, m.

Agents live in households, each made up of one parent and one child; in turn, this child will grow up to become a parent, and so on. There are two social groups, entrepreneurs and workers, who perpetuate themselves generation after generation (no interclass mobility).Footnote 8 Kids are born in period 1, when parents are young adults; in the same period, each parent decides how much education her child should receive. Education requires a money input (out-of-pocket expenditure) as well as a time input (opportunity cost); the time that the kid does not spend in education is combined with the parent’s time and employed to provide a household public good. Notice that it is important to characterize the educational process in such a way that the kid’s time allocation is explicitly accounted for: indeed, it is exactly because parents may wish to rely on their children’s time for the provision of the household public good that they may also wish to reduce or ban altogether school attendance.Footnote 9 This is why we model monetary expenditures and time employment as separate inputs in the children’s education.

2.1 Incomes and preferences

There are n entrepreneurs (n/2 young and n/2 mature adults), resulting in n/2 firms. Entrepreneurs’ incomes are given by the profits generated by the firms they own. The ownership structure is specified as follows: each young adult entrepreneur co-owns the firm with her parent, and they share the profits; one period ahead, the same agent, now a mature adult, will share ownership and earnings with her own child (again, this is just for simplicity, and without loss of generality). Monetary earnings are not the only objective of an entrepreneur, who also cares about her reputation as a successful manager of the firm. Since the actions of the entrepreneur display part of their effect after her death, we assume that the entrepreneur takes this into account when making her decisions.

Each firm produces a share of the only good that exists in the economy, whose price is unity. Labour is the only (variable) input and there are constant returns to scale. Each worker supplies a fixed amount of labour, the same for all, and produces \(y^{i} =\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right)\) units of the good in each productive period (\(i =1 ,2\)), where \(\underline{y}\) is the minimum level of production by an uneducated worker, \(e_{\omega }^{t}\) represents the amount of educational expenditure bestowed upon, and \(d_{\omega }^{t}\) denotes the time spent in education by a worker of generation t in period 0 (the total time available is normalized to 1, so that \(1 -d_{\omega }^{t}\) is the time devoted to the production of the household public good). \(y ( \cdot )\) is an increasing and strictly concave function satisfying

$$\begin{aligned} y \left( 0 ,d_{\omega }^{t}\right) =y \left( e_{\omega }^{t} ,0\right) =0 ;\quad \frac{ \partial ^{2}y}{ \partial e_{\omega }^{t} \partial d_{\omega }^{t}} =\frac{ \partial ^{2}y}{ \partial d_{\omega }^{t} \partial e_{\omega }^{t}} >0 ; \end{aligned}$$
(1)

that is, both inputs into the educational process are essential in order to produce more than the minimum level, \(\underline{y}\), and they exhibit technological complementarity: the more time you spend in education, the more effective is the money you spend on it, and vice versa (for example, if a kid goes to a high-quality school costing more money, this should make the time spent in education more profitable).Footnote 10 The agent’s non-working time, which is clearly also fixed, is employed in the production of a household public good.

Workers and entrepreneurs bargain over the sharing of output \(y^{i} =\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right)\). Let \(\mu\) be the index of the power of the entrepreneurs and \((1 -\mu )\) the index of that of the workers, with \(\mu >0.5\), i.e. entrepreneurs have higher bargaining power than workers. Each firm incurs bargaining costs that are increasing in the number of workers. We assume that such costs increase in a discontinuous way (for example because at some point a further increase in the number of employees requires an additional person to carry on the bargaining effort), implying that each firm, correctly anticipating its bargaining costs, employs a fixed number of workers that represents the equilibrium profit-maximizing level of employment.Footnote 11 We denote this level of employment by 2s, \(s \ge 1\), where s is the number of young adult workers and also the number of mature adult workers.Footnote 12 Given that there are n/2 firms, we have in total \(S =s n \ge n\) employed workers.

The objective of the entrepreneurs is to maximize their share of per-period profit \(\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) -w_{\omega }^{i t} -C\), where \(w_{\omega }^{i t}\) represents the per-period wage and C is the per-worker bargaining cost, while the objective of a worker is to maximize \(w_{\omega }^{i t}\). We assume (generalized) Nash bargaining and posit (i) that the bargaining costs are sunk, because they are borne independently of the bargaining outcome, and (ii) that bargaining occurs before production costs are incurred; then, the disagreement point is \(( -C ,0)\)Footnote 13 and the wage level results from the solution of

$$\begin{aligned} \max _{w_{\omega }^{i t}}(\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) -w_{\omega }^{i t})^{\mu } \left( w_{\omega }^{i t}\right) ^{1 -\mu }, \end{aligned}$$

which yields

$$\begin{aligned} w_{\omega }^{i t} =(1 -\mu ) [\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) ] . \end{aligned}$$
(2)
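
As a sketch of the intermediate step behind (2), write \(Y \equiv \underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right)\) for per-worker output (the shorthand \(Y\) is introduced here for illustration only and is not part of the original notation). Taking logs of the Nash product and differentiating with respect to the wage gives

$$\begin{aligned} \frac{ \partial }{ \partial w_{\omega }^{i t}} \left[ \mu \ln \left( Y -w_{\omega }^{i t}\right) +(1 -\mu ) \ln w_{\omega }^{i t}\right] = -\frac{\mu }{Y -w_{\omega }^{i t}} +\frac{1 -\mu }{w_{\omega }^{i t}} =0 \quad \Longrightarrow \quad w_{\omega }^{i t} =(1 -\mu ) Y , \end{aligned}$$

so that the worker receives a share \(1 -\mu\) of output and the sunk cost C does not affect the wage. Since the log-transformed objective is strictly concave in the wage for \(0<\mu <1\), the first-order condition identifies the unique maximum.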

Assuming a perfect credit market with zero interest rate, a worker thus earns lifetime income

$$\begin{aligned} w_{\omega }^{t} =2 (1 -\mu ) [\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) ] . \end{aligned}$$
(3)

The per-worker profit in each period will be

$$\begin{aligned} \pi ^{i ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) =\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) -w_{\omega }^{i ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) =\mu \left[ \underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) \right] ~ i =1 ,2. \end{aligned}$$
(4)
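
The sharing rule in (2) and (4) can be checked numerically. The following is a minimal sketch, not part of the original model: the Cobb-Douglas form chosen for \(y ( \cdot )\), the parameter values, and all function names are assumptions made purely for illustration. The chosen form does satisfy the properties in (1), since it vanishes when either input is zero and has a positive cross-partial derivative.

```python
import math


def education_output(e, d, scale=1.0, a=0.3, b=0.4):
    """Hypothetical y(e, d): zero if either input is zero, positive cross-partial."""
    return scale * e**a * d**b


def nash_wage_closed_form(total_output, mu):
    """Wage from equation (2): the worker receives the share (1 - mu) of output."""
    return (1.0 - mu) * total_output


def nash_wage_grid(total_output, mu, steps=100_000):
    """Maximise the log of the Nash product over an evenly spaced wage grid."""
    best_wage, best_value = None, -math.inf
    for k in range(1, steps):  # interior wages only, so both logs are defined
        wage = total_output * k / steps
        value = mu * math.log(total_output - wage) + (1.0 - mu) * math.log(wage)
        if value > best_value:
            best_wage, best_value = wage, value
    return best_wage


y_min = 1.0   # assumed minimum output of an uneducated worker
mu = 0.6      # assumed bargaining power of entrepreneurs (the model requires mu > 0.5)
total_output = y_min + education_output(e=0.5, d=0.8)

wage = nash_wage_closed_form(total_output, mu)
wage_check = nash_wage_grid(total_output, mu)
profit = total_output - wage  # per-worker, per-period profit as in equation (4)
```

Under these assumptions the grid search and the closed form agree up to the grid spacing, and the residual profit equals \(\mu\) times output, as in (4).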

Each entrepreneur earns lifetime income

$$\begin{aligned} w_{\eta }^{t}= & w_{\eta }^{1 ,t} \left( e_{\omega }^{t -1} ,d_{\omega }^{t -1} ,e_{\eta }^{t -1} ,d_{\eta }^{t -1} ,e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t}\right) \nonumber \\&+\,w_{\eta }^{2 ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t} ,e_{\omega }^{t +1} ,d_{\omega }^{t +1} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) , \end{aligned}$$
(5)

where the subscript \(\eta\) denotes a variable pertaining to an entrepreneur. The lifetime income is given by the sum of the entrepreneur’s incomes in periods 1 and 2, \(w_{\eta }^{1 ,t}\) and \(w_{\eta }^{2 ,t}\), where

$$\begin{aligned}&w_{\eta }^{1 ,t} \left( e_{\omega }^{t -1} ,d_{\omega }^{t -1} ,e_{\eta }^{t -1} ,d_{\eta }^{t -1} ,e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t}\right) \nonumber \\&\quad =\alpha \left\{ \left[ \pi ^{1 ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) +\pi ^{1 ,t -1} \left( e_{\omega }^{t -1} ,d_{\omega }^{t -1}\right) \right] s +g \left( e_{\eta }^{t -1} ,d_{\eta }^{t -1}\right) +g \left( e_{\eta }^{t} ,d_{\eta }^{t}\right) \right\} ; \end{aligned}$$
(6)
$$\begin{aligned}&w_{\eta }^{2 ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t} ,e_{\omega }^{t +1} ,d_{\omega }^{t +1} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) \nonumber \\&\quad =\left( 1 -\alpha \right) \left\{ \left[ \pi ^{2 ,t +1} \left( e_{\omega }^{t +1} ,d_{\omega }^{t +1}\right) +\pi ^{2 ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) \right] s +g \left( e_{\eta }^{t} ,d_{\eta }^{t}\right) +g \left( e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) \right\} , \end{aligned}$$
(7)

and where \(\alpha\) \(\left( 1 -\alpha \right)\) is the share of earnings accruing to a young (mature) adult; \(g ( \cdot )\) is an increasing and concave function converting, for both co-owners of the firm and in each period, the educational inputs received into income. Such a function might therefore represent the returns to entrepreneurial ability as mediated by the investments in human capital. Mirroring the preceding assumptions on \(y ( \cdot )\), we posit

$$\begin{aligned} g \left( 0 ,d_{\eta }^{t}\right) =g \left( e_{\eta }^{t} ,0\right) =0 ;\quad \frac{ \partial ^{2}g}{ \partial e_{\eta }^{t} \partial d_{\eta }^{t}} =\frac{ \partial ^{2}g}{ \partial d_{\eta }^{t} \partial e_{\eta }^{t}} >0, \end{aligned}$$
(8)

so that both educational expenditure and time spent in education are essential to develop entrepreneurial ability and the two inputs exhibit technological complementarity.

Also mirroring the assumptions made on the workers’ time allocation, we assume that each entrepreneur supplies a fixed amount of time for management, the same for all, and that the remaining fixed leisure time is employed along with the kid’s non-educational time to produce a household public good.

As a final remark, we notice that, given \(\mu >0.5\), (3) and (5) imply that the worker’s lifetime income is lower than the entrepreneur’s lifetime income:

$$\begin{aligned} w_{\omega }^{t} <w_{\eta }^{t} ,\quad \forall t . \end{aligned}$$
(9)
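
One way to see why (9) holds is to consider, purely as an illustrative assumption, a stationary situation in which educational inputs are constant across generations, so that per-worker output equals a constant \(Y\) and the entrepreneurial return equals a constant \(g\) (both shorthands are introduced here and are not part of the original notation). Since \(g ( \cdot )\) is increasing and vanishes when either input is zero, \(g \ge 0\). Then (4) gives \(\pi =\mu Y\) in every period, and (5), (6) and (7) reduce to

$$\begin{aligned} w_{\eta }^{t} =\left[ \alpha +\left( 1 -\alpha \right) \right] \left( 2 \mu Y s +2 g\right) =2 \mu s Y +2 g \ge 2 \mu Y >2 (1 -\mu ) Y =w_{\omega }^{t} , \end{aligned}$$

where the weak inequality uses \(s \ge 1\) and \(g \ge 0\), and the strict inequality uses \(\mu >0.5\).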

Turning now to the agents’ preferences, we assume that all agents are selfish.Footnote 14 Neither the workers nor the entrepreneurs derive any direct utility from their children’s education. However, while workers do not derive any indirect utility either, young entrepreneurs derive an indirect advantage from investing in their own children’s education because this positively affects next-period profits (see (7)). Moreover, mature entrepreneurs exhibit a concern for their firm’s future profitability, a “reputational effect”. We choose to focus on this specific element due to our interest in the productivity/industrialisation rationale for educational policy: to keep things simple, we ignore other possible variants like assuming that the parents take pride in their offspring’s education or are worried about the effect of the children’s homework. The impact of the reputational effect on the agents’ voting preferences is discussed in Sect. 4, where it is made clear that the introduction of such an effect brings about only quantitative changes in the results and that nothing of substance is modified.

Therefore, the workers’ utility function is

$$\begin{aligned} U_{\omega } =u \left( c_{\omega }^{1 ,t}\right) +v \left( c_{\omega }^{2 ,t}\right) +f \left( 1 -d_{\omega }^{t +1}\right) , \end{aligned}$$
(10)

where \(f ( \cdot )\) represents the utility from the production of the household public good that we mentioned above. Since the parent’s leisure is fixed, however, we write the sub-utility directly as a function of the kid’s domestic time only, with the proviso that \(f \left( 0\right) >0\), i.e. that only parental time is essential to the production of the household public good.

As to the utility function of the entrepreneurs, it still depends on consumption and on the provision of the household public good. Moreover, we capture the reputational effect by directly introducing a fraction \(\beta\), \(0<\beta <1\), of future profits in the utility function:

$$\begin{aligned} U_{\eta }= & u \left( c_{\eta }^{1 ,t}\right) +v \left( c_{\eta }^{2 ,t}\right) +f \left( 1 -d_{\eta }^{t +1}\right) \nonumber \\&+\,\beta w_{\eta }^{3 ,t} \left( e_{\omega }^{t +1} ,d_{\omega }^{t +1} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1} ,e_{\omega }^{t +2} ,d_{\omega }^{t +2} ,e_{\eta }^{t +2} ,d_{\eta }^{t +2}\right) \end{aligned}$$
(11)

where

$$\begin{aligned}&w_{\eta }^{3 ,t} \left( e_{\omega }^{t +1} ,d_{\omega }^{t +1} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1} ,e_{\omega }^{t +2} ,d_{\omega }^{t +2} ,e_{\eta }^{t +2} ,d_{\eta }^{t +2}\right) \nonumber \\&\quad =\left[ \pi ^{3 ,t +1} \left( e_{\omega }^{t +1} ,d_{\omega }^{t +1}\right) +\pi ^{3 ,t +2} \left( e_{\omega }^{t +2} ,d_{\omega }^{t +2}\right) \right] s \nonumber \\&\qquad +\,g \left( e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) +g \left( e_{\eta }^{t +2} ,d_{\eta }^{t +2}\right) , \end{aligned}$$
(12)

is the profit generated in the period following the death of the entrepreneur.

We start by describing the laissez-faire economy; government interventions will be considered later on.

2.2 Agent optimisation in a free market

Each worker maximises (10) by choosing her consumption basket and the composition of her kid’s educational process subject to her lifetime budget constraint

$$\begin{aligned} c_{\omega }^{1 ,t} +c_{\omega }^{2 ,t} +e_{\omega }^{t +1} =2 (1 -\mu ) [\underline{y} +y \left( e_{\omega }^{t} ,d_{\omega }^{t}\right) ] \end{aligned}$$
(13)

and her child time constraint

$$\begin{aligned} d_{\omega }^{t +1} \le 1 , \end{aligned}$$
(14)

plus non-negativity constraints for all the choice variables. Since the educational expenditure for the next generation \(e_{\omega }^{t +1}\) does not appear in the utility function, and the time spent by children in education \(d_{\omega }^{t +1}\) appears as a bad, it is clear that \(e_{\omega }^{t +1} =d_{\omega }^{t +1} =0\) at the optimum for all workers of all generations. Thus, the problem reduces to

$$\begin{aligned} \text {Max}\,u \left( c_{\omega }^{1 ,t}\right) +v \left( 2 (1 -\mu ) \underline{y} -c_{\omega }^{1 ,t}\right) , \end{aligned}$$
(15)

where the budget constraint (13) has been substituted into the utility function. The FOC w.r.t. \(c_{\omega }^{1 ,t}\) is, quite simply,

$$\begin{aligned} u^{ \prime } =v^{ \prime } . \end{aligned}$$
(16)

Workers smooth their consumption over time. Since no worker gains from sending her child to school, however, the workers never get an education.
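
As a simple illustration of (16), suppose (an assumption made here only for the example) that the two period utilities coincide, \(u =v\), and are strictly concave. Then \(u^{ \prime } \left( c_{\omega }^{1 ,t}\right) =u^{ \prime } \left( c_{\omega }^{2 ,t}\right)\) requires equal consumption in the two periods, and the budget constraint (13) with \(e_{\omega }^{t} =d_{\omega }^{t} =e_{\omega }^{t +1} =0\) gives

$$\begin{aligned} c_{\omega }^{1 ,t} =c_{\omega }^{2 ,t} =(1 -\mu ) \underline{y} , \end{aligned}$$

i.e. in this special case the uneducated worker consumes exactly her per-period wage in each period.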

As for the entrepreneurs, they maximise (11) subject to their lifetime budget constraint

$$\begin{aligned}&c_{\eta }^{1 ,t} +c_{\eta }^{2 ,t} +e_{\eta }^{t +1} =w_{\eta }^{1 ,t} \left( e_{\omega }^{t -1} ,d_{\omega }^{t -1} ,e_{\eta }^{t -1} ,d_{\eta }^{t -1} ,e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t}\right) \nonumber \\&\quad +\,w_{\eta }^{2 ,t} \left( e_{\omega }^{t} ,d_{\omega }^{t} ,e_{\eta }^{t} ,d_{\eta }^{t} ,e_{\omega }^{t +1} ,d_{\omega }^{t +1} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) \end{aligned}$$
(17)

and their time constraint

$$\begin{aligned} d_{\eta }^{t +1} \le 1. \end{aligned}$$
(18)

Letting \(\lambda\) denote the Lagrange multiplier for the budget constraint, the FOCs w.r.t. \(c_{\eta }^{1 ,t}\), \(c_{\eta }^{2 ,t}\), \(e_{\eta }^{t +1}\), and \(d_{\eta }^{t +1}\) are

$$\begin{aligned} u^{ \prime } =\lambda ;~v^{ \prime } =\lambda ;\quad \lambda \left[ \left( 1 -\alpha \right) \frac{ \partial g}{ \partial e_{\eta }^{t +1}} -1\right] +\beta \frac{ \partial g}{ \partial e_{\eta }^{t +1}} =0 ;\quad \left[ \lambda \left( 1 -\alpha \right) +\beta \right] \frac{ \partial g}{ \partial d_{\eta }^{t +1}} \ge f^{ \prime } , \end{aligned}$$
(19)

respectively, so that the budget allocation is ruled by

$$\begin{aligned} u^{ \prime } =v^{ \prime } =\left[ \lambda \left( 1 -\alpha \right) +\beta \right] \frac{ \partial g}{ \partial e_{\eta }^{t +1}}. \end{aligned}$$
(20)
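
The conditions in (19) can be traced back to the Lagrangian of the entrepreneur’s problem. A sketch, in which the symbol \(\mathcal {L}\) is introduced here for illustration and the time constraint (18) is handled separately, is

$$\begin{aligned} \mathcal {L} =u \left( c_{\eta }^{1 ,t}\right) +v \left( c_{\eta }^{2 ,t}\right) +f \left( 1 -d_{\eta }^{t +1}\right) +\beta w_{\eta }^{3 ,t} +\lambda \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t} -c_{\eta }^{1 ,t} -c_{\eta }^{2 ,t} -e_{\eta }^{t +1}\right) . \end{aligned}$$

The entrepreneur takes the workers’ educational inputs as given. Her choice of \(e_{\eta }^{t +1}\) and \(d_{\eta }^{t +1}\) enters \(w_{\eta }^{2 ,t}\) through the term \(\left( 1 -\alpha \right) g \left( e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right)\) in (7) and \(w_{\eta }^{3 ,t}\) through the term \(g \left( e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right)\) in (12), while \(w_{\eta }^{1 ,t}\) does not depend on it. Differentiating with respect to \(e_{\eta }^{t +1}\) and \(d_{\eta }^{t +1}\) therefore gives

$$\begin{aligned} \lambda \left[ \left( 1 -\alpha \right) \frac{ \partial g}{ \partial e_{\eta }^{t +1}} -1\right] +\beta \frac{ \partial g}{ \partial e_{\eta }^{t +1}} =0 \quad \Longleftrightarrow \quad \lambda =\left[ \lambda \left( 1 -\alpha \right) +\beta \right] \frac{ \partial g}{ \partial e_{\eta }^{t +1}} ;\qquad \left[ \lambda \left( 1 -\alpha \right) +\beta \right] \frac{ \partial g}{ \partial d_{\eta }^{t +1}} -f^{ \prime } \ge 0 , \end{aligned}$$

where the last condition can hold with strict inequality only at the corner \(d_{\eta }^{t +1} =1\). Combining the first expression with \(u^{ \prime } =v^{ \prime } =\lambda\) yields (20).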

Again, consumption will be smoothed over the two periods; however, as far as the entrepreneurs are concerned, each of them gains from having her kid educated, because in the next two periods that kid will own part of the firm, and will contribute her managerial skills to the production process and thus first to the earnings and then to the reputation of the entrepreneur. Therefore, children belonging to this class are educated, and might indeed go to school full-time \((d_{\eta }^{t +1} =1)\). Notice that the reputational effect (\(\beta >0\)) raises the levels of \(e_{\eta }^{t +1}\) and \(d_{\eta }^{t +1}\) but is not necessary for the entrepreneurs to educate their children.

2.3 Characteristics of the free market equilibrium

In the laissez-faire equilibrium, some agents (the entrepreneurs) educate their children while others (the workers) do not. Notice that the reason why workers are not educated is that educational expenses must be paid by the parent, who does not obtain any return from her child’s education. Moreover, the time devoted to education is subtracted from the production of the household public good. The entrepreneurs, on the contrary, in addition to the gain they get from educating their children, may also benefit from having an educated workforce. This may open the way for policies that oblige parents to send their kids to school.

3 Agent optimisation and policy preferences

In order to investigate whether a compulsory public education policy could gain the support of the majority of voters, we must first assess whether such a measure can actually improve the welfare of the entrepreneurs, of the workers, or of both categories. As far as the policy tools are concerned, we consider a compulsory education package and a linear income tax/subsidy to be employed both for financing such education measures and for redistributive purposes. We let \(\tau _{\omega }\) and \(\tau _{\eta }\) denote the group-specific marginal income tax rates for workers and entrepreneurs, respectively (possibly negative),Footnote 15 while \(\overline{e}\) represents the minimum expenditure on a child’s education that is imposed upon households and \(\overline{d}\) the minimum amount of time that a child must spend in school. Consequently, e and d will now represent the amounts of money and time that are freely allocated to education by households on top of the prescribed levels. Notice that, since the time allocation for the parent is fixed, \(\tau _{\omega }\) and \(\tau _{\eta }\) are not distortionary, and are basically equivalent to lump-sum transfers.

3.1 Agent optimisation in the presence of an active policy

Consider first the workers. Taking into account (2) and the education policy described above, a worker’s per-period after-tax income is

$$\begin{aligned} (1 -\tau _{\omega }) w_{\omega }^{i ,t} \equiv \left( 1 -\tau _{\omega }\right) \left( 1 -\mu \right) \left( \underline{y} +y \left( \overline{e} +e_{\omega }^{t} ,\overline{d} +d_{\omega }^{t}\right) \right) . \end{aligned}$$
(21)

Further, the worker’s budget constraint (13) becomes

$$\begin{aligned} c_{\omega }^{1 ,t} +c_{\omega }^{2 ,t} +\overline{e} +e_{\omega }^{t +1} =2 \left( 1 -\tau _{\omega }\right) \left( 1 -\mu \right) \left( \underline{y} +y \left( \overline{e} +e_{\omega }^{t} ,\overline{d} +d_{\omega }^{t}\right) \right) +\overline{e}, \end{aligned}$$
(22)

where \(\overline{e}\) appears on both sides of the constraint because the policy is publicly financed (either each household receives a subsidy or monetary educational expenses are paid by the government). The time constraint of the worker’s child (14) becomes

$$\begin{aligned} d_{\omega }^{t +1} \le 1 -\overline{d}. \end{aligned}$$
(23)

Just as in the free-market equilibrium, the additional education expenditure for children \(e_{\omega }^{t +1}\) does not appear in the utility function, and the additional time spent by children in education \(d_{\omega }^{t +1}\) appears as a bad, therefore \(e_{\omega }^{t +1} =d_{\omega }^{t +1} =0\) at the optimum for all workers of all generations. Thus, a worker’s maximization problem reduces to

$$\begin{aligned} \text {Max}\,u \left( c_{\omega }^{1 ,t}\right) +v \left( 2 \left( 1 -\tau _{\omega }\right) (1 -\mu ) \left( \underline{y} +y \left( \overline{e} ,\overline{d}\right) \right) -c_{\omega }^{1 ,t}\right) +f \left( 1 -\overline{d}\right) . \end{aligned}$$
(24)

The FOC w.r.t. \(c_{\omega }^{1 ,t}\) obtains as

$$\begin{aligned} u^{ \prime } =v^{ \prime }, \end{aligned}$$
(25)

i.e. it has the same form as the FOC (16) obtained in a free market, leading again to consumption smoothing. But now the worker is obliged to have the kid spend \(\overline{d}\) as study time. The worker also spends \(\overline{e}\) on her child’s education, but this is financed by the tax system.
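
To make the worker’s problem (24) concrete, the sketch below evaluates her maximised utility for given policy values. Everything specific in it is an assumption made for illustration and does not come from the model: logarithmic and identical period utilities (so that (25) implies equal consumption in the two periods), a Cobb-Douglas \(y ( \cdot )\), a form for \(f ( \cdot )\) with \(f \left( 0\right) >0\), and all parameter values and function names.

```python
import math


def education_output(e, d, scale=1.0, a=0.3, b=0.4):
    """Hypothetical y(e, d): zero if either input is zero."""
    return scale * e**a * d**b


def household_good_utility(child_home_time, f0=0.5, phi=1.0):
    """Hypothetical f(.): increasing, concave, with f(0) = f0 > 0."""
    return f0 + phi * math.log(1.0 + child_home_time)


def worker_indirect_utility(tau_w, e_bar, d_bar, mu=0.6, y_min=1.0):
    """Value of problem (24) under log utility, where c1 = c2 by condition (25)."""
    lifetime_income = (
        2.0 * (1.0 - tau_w) * (1.0 - mu) * (y_min + education_output(e_bar, d_bar))
    )
    consumption = lifetime_income / 2.0  # equal split across the two periods
    return 2.0 * math.log(consumption) + household_good_utility(1.0 - d_bar)


# No policy at all: the child spends all her time on the household public good.
utility_no_policy = worker_indirect_utility(tau_w=0.0, e_bar=0.0, d_bar=0.0)

# Same compulsory package, financed with and without a tax on the worker.
utility_untaxed = worker_indirect_utility(tau_w=0.0, e_bar=0.5, d_bar=0.8)
utility_taxed = worker_indirect_utility(tau_w=0.2, e_bar=0.5, d_bar=0.8)
```

The sketch only evaluates utility for given policy values; it does not impose the government budget constraint, so it says nothing about which combinations of tax rates and education requirements are feasible.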

Let us now consider the entrepreneurs: the entrepreneur’s budget constraint (17) becomes

$$\begin{aligned} c_{\eta }^{1 ,t} +c_{\eta }^{2 ,t} +\overline{e} +e_{\eta }^{t +1} =\left( 1 -\tau _{\eta }\right) \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\overline{e} . \end{aligned}$$
(26)

Since \(e_{\omega }^{t} =d_{\omega }^{t} =0\) for the reasons given above, we have from (6), (7) and (12) that the entrepreneur’s incomes in periods 1, 2 and 3 are as follows:

$$\begin{aligned}&w_{\eta }^{1 ,t} \left( \overline{e} ,\overline{d} ,e_{\eta }^{t -1} ,d_{\eta }^{t -1} ,e_{\eta }^{t} ,d_{\eta }^{t}\right) \nonumber \\&\quad =\alpha \left\{ \left[ \pi ^{1 ,t} \left( \overline{e} ,\overline{d}\right) +\pi ^{1 ,t -1} \left( \overline{e} ,\overline{d}\right) \right] s +g \left( \overline{e} +e_{\eta }^{t -1} ,\overline{d} +d_{\eta }^{t -1}\right) \right. \nonumber \\&\qquad \left. +\,g \left( \overline{e} +e_{\eta }^{t} ,\overline{d} +d_{\eta }^{t}\right) \right\} ; \end{aligned}$$
(27)
$$\begin{aligned}&w_{\eta }^{2 ,t} \left( \overline{e} ,\overline{d} ,e_{\eta }^{t} ,d_{\eta }^{t} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1}\right) \nonumber \\&\quad =\left( 1 -\alpha \right) \left\{ \left[ \pi ^{2 ,t +1} \left( \overline{e} ,\overline{d}\right) +\pi ^{2 ,t} \left( \overline{e} ,\overline{d}\right) \right] s +g \left( \overline{e} +e_{\eta }^{t} ,\overline{d} +d_{\eta }^{t}\right) \right. \nonumber \\&\qquad \left. +\,g \left( \overline{e} +e_{\eta }^{t +1} ,\overline{d} +d_{\eta }^{t +1}\right) \right\} ; \end{aligned}$$
(28)
$$\begin{aligned}&w_{\eta }^{3 ,t} \left( \overline{e} ,\overline{d} ,e_{\eta }^{t +1} ,d_{\eta }^{t +1} ,e_{\eta }^{t +2} ,d_{\eta }^{t +2}\right) \nonumber \\&\quad =\left[ \pi ^{3 ,t +1} \left( \overline{e} ,\overline{d}\right) +\pi ^{3 ,t +2} \left( \overline{e} ,\overline{d}\right) \right] s +\left[ g \left( \overline{e} +e_{\eta }^{t +1} ,\overline{d} +d_{\eta }^{t +1}\right) +g \left( \overline{e} +e_{\eta }^{t +2} ,\overline{d} +d_{\eta }^{t +2}\right) \right] . \end{aligned}$$
(29)

Entrepreneurs maximise their utility function

$$\begin{aligned} U_{\eta } =u \left( c_{\eta }^{1 ,t}\right) +v \left( c_{\eta }^{2 ,t}\right) +f \left( 1 -\overline{d} -d_{\eta }^{t +1}\right) +\beta w_{\eta }^{3 ,t} \end{aligned}$$
(30)

by choice of \(c_{\eta }^{1 ,t} ,~c_{\eta }^{2 ,t} ,~e_{\eta }^{t +1}\) and \(d_{\eta }^{t +1}\) subject to the budget constraint (26) and the additional time constraint

$$\begin{aligned} d_{\eta }^{t +1} \le 1 -\overline{d}. \end{aligned}$$
(31)

Since it will become clear in the next subsection that there cannot exist a political equilibrium where both entrepreneurs and workers are constrained, we only consider interior solutions for \(e_{\eta }^{t +1}\) and \(d_{\eta }^{t +1}\). The FOCs then are

$$\begin{aligned}&u^{ \prime } =\lambda ;\quad v^{ \prime } =\lambda ;\quad \lambda \left[ \left( 1 -\tau _{\eta }\right) \left( 1 -\alpha \right) \frac{ \partial g}{ \partial e_{\eta }^{t +1}} -1\right] +\beta \frac{ \partial g}{ \partial e_{\eta }^{t +1}} =0 ; \nonumber \\&\left[ \lambda \left( 1 -\tau _{\eta }\right) \left( 1 -\alpha \right) +\beta \right] \frac{ \partial g}{ \partial d_{\eta }^{t +1}} =f^{ \prime }. \end{aligned}$$
(32)

3.2 Policy preferences

We now have to check which of the possible constellations of policy tools is preferred by the agents. Let us begin by writing the government revenue constraint under the assumption that the educational expenditure ration \(\overline{e}\) is paid for by the government:

$$\begin{aligned} \tau _{\omega } (1 -\mu ) \left( \underline{y} +y^{t -1} +\underline{y} +y^{t}\right) \frac{S}{2} +\tau _{\eta } \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t -1}\right) \frac{n}{2} =\frac{\left( n +S\right) }{2} \overline{e} , \end{aligned}$$
(33)

where we dropped the arguments in \(y^{t},\) \(y^{t -1},\) \(w_{\eta }^{1 ,t}\) and \(w_{\eta }^{2 ,t -1}\) to avoid clutter. For future use, we write the public budget in per-capita terms and we express it in terms of \(\tau _{\omega } (\tau _{\eta } ,\overline{e} ,\overline{d})\):

$$\begin{aligned} \tau _{\omega } (\tau _{\eta } ,\overline{e} ,\overline{d}) =\frac{\overline{e} -\tau _{\eta } \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t -1}\right) \left( 1 -\sigma \right) }{\Psi \sigma } , \end{aligned}$$
(34)

where \(\sigma =S/\left( n +S\right)\) and \(\Psi =(1 -\mu ) \left( 2 \underline{y} +y^{t} +y^{t -1}\right)\). Next, differentiating \(\tau _{\omega } ( \cdot )\) with respect to \(\tau _{\eta }\), \(\overline{e}\), and \(\overline{d}\), we obtain

$$\begin{aligned} \frac{ \partial \tau _{\omega }}{ \partial \tau _{\eta }}= & -\frac{1 -\sigma }{\Psi \sigma } <0 , \end{aligned}$$
(35)
$$\begin{aligned} \frac{ \partial \tau _{\omega }}{ \partial \overline{e}}= & \frac{1}{\Psi \sigma } >0 , \end{aligned}$$
(36)
$$\begin{aligned} \frac{ \partial \tau _{\omega }}{ \partial \overline{d}}= & \frac{ -\tau _{\eta } \left( \frac{ \partial w_{\eta }^{1 ,t}}{ \partial \overline{d}} +\frac{ \partial w_{\eta }^{2 ,t -1}}{ \partial \overline{d}}\right) \left( 1 -\sigma \right) \Psi \sigma -2 \left( 1 -\mu \right) \frac{ \partial y}{ \partial \overline{d}} \sigma (\overline{e} -\tau _{\eta } \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t -1}\right) \left( 1 -\sigma \right) )}{\left( \Psi \sigma \right) ^{2}} . \end{aligned}$$
(37)

Notice that current output depends on the education level of the parents, \(y^{t}\), and of the grandparents, \(y^{t -1}\), whereas any increase in education prescribed by the policy would affect the earnings of the children. Similarly, the current revenue of the entrepreneurs, \(w_{\eta }^{1 ,t}\) and \(w_{\eta }^{2 ,t -1}\), is not affected by a change in \(\overline{d}\). This implies the following

$$\begin{aligned} \frac{ \partial y^{t -1}}{ \partial \overline{d}} =\frac{ \partial y^{t}}{ \partial \overline{d}} =\frac{ \partial w_{\eta }^{1 ,t}}{ \partial \overline{d}} =\frac{ \partial w_{\eta }^{2 ,t -1}}{ \partial \overline{d}} =0. \end{aligned}$$
(38)

By using (38), (37) can be re-written as

$$\begin{aligned} \frac{ \partial \tau _{\omega }}{ \partial \overline{d}} =0. \end{aligned}$$
(39)
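
As a cross-check on the step from (33) to (34), the algebra can be retraced as follows (a sketch that uses only the definitions of \(\sigma\) and \(\Psi\) given after (34), and that treats \(\Psi\) and the entrepreneurs’ current incomes as given, as in the text). Dividing both sides of (33) by \(\left( n +S\right) /2\), and noting that \(S/\left( n +S\right) =\sigma\) and \(n/\left( n +S\right) =1 -\sigma\), gives

$$\begin{aligned} \tau _{\omega } \Psi \sigma +\tau _{\eta } \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t -1}\right) \left( 1 -\sigma \right) =\overline{e} , \end{aligned}$$

which solves to (34). Since \(\overline{e}\) enters this expression linearly, (36) follows at once; and once every derivative in (38) is set to zero, both terms in the numerator of (37) vanish, which is (39).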

Let the indirect utility, written as a function of the policy instruments, be denoted by

$$\begin{aligned} V_{\iota } =V_{\iota } \left( \tau _{\iota } ,\overline{e} ,\overline{d}\right) ,\quad \iota =\omega ,\eta _{y} ,\eta _{m} , \end{aligned}$$
(40)

where \(\eta _{y}\) denotes a young entrepreneur and \(\eta _{m}\) denotes a mature entrepreneur. The derivatives of (40) with respect to the policy instruments for the workers are

$$\begin{aligned} \frac{ \partial V_{\omega }}{ \partial \tau _{\omega }}= & -2 (1 -\mu ) \left( \underline{y} +y^{t}\right) v^{ \prime } <0 ; \end{aligned}$$
(41)
$$\begin{aligned} \frac{ \partial V_{\omega }}{ \partial \overline{e}}= & \left( 1 -\tau _{\omega }\right) 2 (1 -\mu ) \frac{ \partial y^{t}}{ \partial \overline{e}} v^{ \prime } ; \end{aligned}$$
(42)
$$\begin{aligned} \frac{ \partial V_{\omega }}{ \partial \overline{d}}= & \left( \left( 1 -\tau _{\omega }\right) 2 (1 -\mu ) \frac{ \partial y^{t}}{ \partial \overline{d}}\right) v^{ \prime } -f^{ \prime } ; \end{aligned}$$
(43)

where again,

$$\begin{aligned} \frac{ \partial y^{t}}{ \partial \overline{e}} =\frac{ \partial y^{t}}{ \partial \overline{d}} =0, \end{aligned}$$
(44)

as far as the parents’ and grandparents’ income is concerned. Therefore, (42) and (43) can be re-written as

$$\begin{aligned} \frac{ \partial V_{\omega }}{ \partial \overline{e}} =0 ;~\frac{ \partial V_{\omega }}{ \partial \overline{d}} = -f^{ \prime } <0. \end{aligned}$$
(45)

Regarding the entrepreneurs, we must distinguish between the young and the mature ones. For the young, the derivative of (40) with respect to the entrepreneur’s income tax rate obtains as

$$\begin{aligned} \frac{ \partial V_{\eta _{y}}}{ \partial \tau _{\eta }} = -\left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) \lambda <0. \end{aligned}$$
(46)

As to the derivatives with respect to the minimum expenditure on a child’s education, \(\overline{e}\), and the minimum amount of time a child must spend in school, \(\overline{d}\), we have

$$\begin{aligned} \frac{ \partial V_{\eta _{y}}}{ \partial \overline{e}} =\left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{e}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{e}}>0 ;\quad \frac{ \partial V_{\eta _{y}}}{ \partial \overline{d}} =\left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{d}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{d}} >0 , \end{aligned}$$
(47)

where we have considered that

$$\begin{aligned} \frac{ \partial w_{\eta }^{1 ,t}}{ \partial \overline{e}} =\frac{ \partial w_{\eta }^{1 ,t}}{ \partial \overline{d}} =0, \end{aligned}$$
(48)

because education affects only next-period profits. Notice that the entrepreneurs’ per-period income is made up of four terms (see (27) and (28)). Since the entrepreneurs are not constrained, the compulsory education policy does not induce any change in returns to period two and three entrepreneurial activity \(\left( 1 -\alpha \right) g\), but, given (4), it creates more income via increases in per-worker profits \(\pi ^{2 ,t}\) and \(\left( 1 -\tau _{\eta }\right) \pi ^{3 ,t}\). This means that we can be certain that

$$\begin{aligned} \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{e}}>0 ;~\frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{d}}>0 ;\frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{e}}>0 ;~\frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{d}} >0. \end{aligned}$$
(49)

Consequently, both derivatives in (47) are positive. In fact, the policy measure has no impact on the amount of time and money invested in the education of an entrepreneur’s child: any increase in the compulsory components of e and d is offset by an equal reduction in the time and money used to top up the compulsory amounts. As a consequence, the entrepreneurs benefit from the increased education of their workforce without incurring any distortion of their own educational choices.

The mature entrepreneurs incur the cost of education without obtaining any monetary return, obtaining instead a benefit in terms of reputation. The reputational effect is therefore key to making them favour an educational policy, in that it makes them care, indirectly, for the workers’ children’s education one generation ahead. Qualitatively, this works just as if we assumed that the mature entrepreneurs had become altruistic (which, as we mentioned in fn. 14, would still be compatible with our results); the interpretation as an interest in the reputation of the firm is, however, more in line with our basic assumptions. While helpful, the reputational effect is not strictly necessary to obtain our main result (the young entrepreneurs’ interest in workers’ education would be enough). It does, however, allow us to show the entrepreneurs as consistently supporting mandatory education rather than withdrawing that support in old age, which sits comfortably with a view of the entrepreneur as having a long-term interest in the family business.

For them, the derivatives of (40) with respect to \(\tau _{\eta }\), \(\overline{e}\), and \(\overline{d}\) obtain as

$$\frac{ \partial V_{\eta _{m}}}{ \partial \tau _{\eta }} = -\left( w_{\eta }^{2 ,t -1}\right) \lambda <0 ;\quad \frac{ \partial V_{\eta _{m}}}{ \partial \overline{e}} =\beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{e}} ;\quad \frac{ \partial V_{\eta _{m}}}{ \partial \overline{d}} =\beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{d}}.$$
(50)

We are now in a position to calculate each group’s preferred policy. Specifically, the preferred policies can be found by using (34) to replace \(\tau _{\omega }\) in (40) and then choosing \(\tau _{\eta } ,~\overline{e}\) and \(\overline{d}\) so as to maximise:

$$\begin{aligned} V_{\omega }= & V_{\omega } (\tau _{\omega } (\tau _{\eta } ,\overline{e} ,\overline{d}) ,\overline{e} ,\overline{d}), \end{aligned}$$
(51)
$$\begin{aligned} V_{\eta _{k}}= & V_{\eta _{k}} (\tau _{\eta } ,\overline{e} ,\overline{d}) ;\quad k =y ,m, \end{aligned}$$
(52)

for the workers and the entrepreneurs, respectively, under non-negativity constraints for \(\overline{e}\) and \(\overline{d}\) and the constraints that

$$\begin{aligned} \tau _{\iota } \le 1 ,\quad \iota =\omega ,\eta ;\quad \overline{d} \le 1. \end{aligned}$$
(53)

For the workers, the FOCs are:

$$\begin{aligned} \frac{d V_{\omega }}{d \tau _{\eta }}= & \frac{ \partial V_{\omega }}{ \partial \tau _{\omega }} \frac{ \partial \tau _{\omega }}{ \partial \tau _{\eta }} =2 (1 -\mu ) \left( \underline{y} +y^{t}\right) v^{ \prime } \frac{\left( 1 -\sigma \right) }{2 (1 -\mu ) \left( \underline{y} +y^{t}\right) \sigma } =v^{ \prime } \frac{\left( 1 -\sigma \right) }{\sigma } >0 ; \end{aligned}$$
(54)
$$\begin{aligned} \frac{d V_{\omega }}{d \overline{e}}= & \frac{ \partial V_{\omega }}{ \partial \tau _{\omega }} \frac{ \partial \tau _{\omega }}{ \partial \overline{e}} = -2 (1 -\mu ) \left( \underline{y} +y^{t}\right) v^{ \prime } \frac{1}{2 (1 -\mu ) \left( \underline{y} +y^{t}\right) \sigma } = -\frac{1}{\sigma } v^{ \prime } <0 ; \end{aligned}$$
(55)
$$\begin{aligned} \frac{d V_{\omega }}{d \overline{d}}= & -f^{ \prime } +\frac{ \partial V_{\omega }}{ \partial \tau _{\omega }} \frac{ \partial \tau _{\omega }}{ \partial \overline{d}} = -f^{ \prime } <0 , \end{aligned}$$
(56)

implying that the workers’ preferred tax rate on entrepreneurs is \(\tau _{\eta } =1\), while \(\overline{e}\) and \(\overline{d}\) should optimally be set to zero.

For the young entrepreneurs, the FOCs are:

$$\begin{aligned} \frac{ \partial V_{\eta _{y}}}{ \partial \tau _{\eta }}= & -\left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) \lambda <0 ; \end{aligned}$$
(57)
$$\begin{aligned} \frac{ \partial V_{\eta _{y}}}{ \partial \overline{e}}= & \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{e}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{e}} >0 ; \end{aligned}$$
(58)
$$\begin{aligned} \frac{ \partial V_{\eta _{y}}}{ \partial \overline{d}}= & \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{d}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{d}} >0. \end{aligned}$$
(59)

The FOCs for the mature ones are instead:

$$\begin{aligned} \frac{ \partial V_{\eta _{m}}}{ \partial \tau _{\eta }}= & -\left( w_{\eta }^{2 ,t -1}\right) \lambda <0 ; \end{aligned}$$
(60)
$$\begin{aligned} \frac{ \partial V_{\eta _{m}}}{ \partial \overline{e}}= & \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{e}} >0 ; \end{aligned}$$
(61)
$$\begin{aligned} \frac{ \partial V_{\eta _{m}}}{ \partial \overline{d}}= & \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{d}} >0. \end{aligned}$$
(62)

We know from our previous analysis that both \(\partial V_{\eta _{y}}/ \partial \overline{e}\) and \(\partial V_{\eta _{y}}/ \partial \overline{d}\) are positive, for two reasons: a positive indirect effect, since the compulsory education policy creates more income via increases in the after-tax per-worker next-period profits \(\left( 1 -\tau _{\eta }\right) \pi ^{2 ,t}\) (see (47)), and a positive reputational effect. The latter is also present in the case of the mature entrepreneurs.

Therefore, the entrepreneurs would prefer to face a zero marginal tax rate while at the same time having positive values for \(\overline{e}\) and \(\overline{d}\) (indeed, entrepreneurs would always favour pushing each ration to its upper limit). This implies that the workers should face a positive tax rate in order to finance education expenditure. The upper limit for \(\overline{d}\) is clearly unity, while that for \(\overline{e}\) can be deduced by observing that, given the preferred tax rates, the maximum level of \(\overline{e}\) is reached when \(\tau _{\omega } (\tau _{\eta } ,\overline{e} ,\overline{d}) =1,\) implying \(\overline{e} =(1 -\mu ) \left( 2 \underline{y} +y^{t} +y^{t -1}\right) \sigma\).
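
The bound on \(\overline{e}\) stated above can be verified directly from (34). As a sketch, write \(\overline{e}^{\max }\) for the upper limit (this symbol is introduced here only for illustration) and set the entrepreneurs’ preferred rate \(\tau _{\eta } =0\):

$$\begin{aligned} \tau _{\omega } \left( 0 ,\overline{e} ,\overline{d}\right) =\frac{\overline{e}}{\Psi \sigma } \le 1 \quad \Longrightarrow \quad \overline{e}^{\max } =\Psi \sigma =(1 -\mu ) \left( 2 \underline{y} +y^{t} +y^{t -1}\right) \sigma . \end{aligned}$$

In other words, the largest ration the entrepreneurs could ask for is the one that taxes away the workers’ whole income base \(\Psi\), weighted by their population share \(\sigma\).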

While the results are possibly too sharp to be taken literally, their qualitative interpretation is clear: the workers do not perceive any benefit from compulsory education but would favour redistributive income taxation, whereas entrepreneurs gain from compulsory education but would like to shift the entire cost onto the workers.

4 Political equilibrium

Let us now focus on the voting process through which an educational policy package is chosen in the political arena. To perform our analysis, we consider a probabilistic voting model with a two-candidate electoral competition (see e.g. Lindbeck and Weibull 1987). In this setup, candidates are uncertain about whether citizens will participate in voting: they could abstain, maybe because they cannot clearly perceive the distance between the proposed platforms. Consequently, the candidates are uncertain about how citizens will vote for any given political proposal. Following a standard approach, we suppose that the voters’ decisions depend on the differences in the expected utilities from the candidates’ platforms, and that each candidate perceives the probability that a voter will participate in voting and support her platform as a function of the distance between her own platform and that proposed by the rival candidate. Politicians are assumed to be opportunistic, i.e. they are purely office-motivated, and thus aim at maximising their vote share. No credibility issues arise, because it is also assumed that politicians can make binding commitments to policy platforms proposed during the electoral campaign.Footnote 16

To sum up, the electoral procedure is as follows. Two candidates simultaneously propose their policy platforms, that is, their educational policy packages plus their redistributive policy platforms. Then, citizens vote for their preferred candidate. Finally, the elected candidate implements the policy she promised during the electoral campaign.

Each candidate selects her policy platform in order to maximise her share of total votes, which depends on the probabilities that each voter will vote for her, taking the rival candidate’s platform as given. Now, let the probability perceived by candidate \(j ,~j =A ,B\) that an agent votes for her be \(\gamma _{\iota }^{j} ,~\iota =\omega ,\eta _{y} ,\eta _{m}\), where we distinguish between young and mature entrepreneurs because they have different policy preferences.Footnote 17 The expected vote share of a candidate will then be:

$$\begin{aligned} p^{j} =\sigma \gamma _{\omega }^{j} +\frac{(1 -\sigma )}{2} \left( \gamma _{\eta _{y}}^{j} +\gamma _{\eta _{m}}^{j}\right) ,\quad j =A ,B . \end{aligned}$$
(63)

As usual, we posit

$$\begin{aligned} \gamma _{\iota }^{j}= & \Gamma _{\iota } \left[ V_{\iota } \left( \tau _{\eta }^{j} ,\tau _{\omega }^{j} \left( \tau _{\eta }^{j} ,\overline{e}^{j} ,\overline{d}^{j}\right) ,\overline{e}^{j} ,\overline{d}^{j}\right) \right. \nonumber \\&\quad \left. -\,V_{\iota } \left( \tau _{\eta }^{ -j} ,\tau _{\omega }^{ -j} \left( \tau _{\eta }^{ -j} ,\overline{e}^{ -j} ,\overline{d}^{ -j}\right) ,\overline{e}^{ -j} ,\overline{d}^{ -j}\right) \right] , \nonumber \\&\quad \iota =\omega ,\eta _{y} ,\eta _{m} ;\quad j =A ,B , \end{aligned}$$
(64)

where \(\Gamma _{\iota }\) is a smooth, continuous and increasing function varying between 0 and 1, and we use the superscript j, \(j =A ,B\), to denote a policy variable proposed by candidate j.

The assumption that agents will show up at elections with a certain positive probability is of course standard in probabilistic voting models; it is also standard to assume that this probability varies with the agent’s type and, more precisely, that each individual’s voting behaviour is affected by her own ideological attachment to a party (usually represented by an idiosyncratic taste shock, a random variable with a density function taken to be symmetric around zero). However, we wish to highlight here a different mechanism, namely the positive relationship between income and voting participation: active participation in public life, including active voting, is usually found to be positively related to income at the individual level and, relatedly, negatively associated with income inequality at the aggregate level (see for example Greene and Nikolaev 1999, Benabou 2000, and Horn 2011).Footnote 18

Therefore, we assume that the probability of an individual participating in voting is positively related to her income. In our framework, this means that the entrepreneurs are more active than workers in the voting process, i.e. their abstention probability is lower. We take it that \(\Gamma _{\omega } ( \Delta V_{\iota }) <\Gamma _{\eta _{k}} ( \Delta V_{\iota }),\) \(k =y ,m,\) for any value of the difference in the utility from the two platforms.

Each candidate maximises (63) by choosing her own policy platform \(\tau _{\eta }^{j} ,\overline{e}^{j} ,\overline{d}^{j}\) while taking the other candidate’s platform as given; in a Nash equilibrium in which the candidates announce their policies simultaneously, the resulting equilibrium policies will be identical. As is well known, the objective function of a candidate in a probabilistic voting model, that is (63), then coincides with a generalised utilitarian social welfare function (see Mueller 2003).
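
The equivalence with a weighted welfare function can be made explicit with a short sketch. The notation \(x^{j}\) for a generic instrument, \(\Delta V_{\iota }\) for the utility difference inside \(\Gamma _{\iota }\) in (64), \(\Gamma _{\iota }^{ \prime }\) for the derivative of \(\Gamma _{\iota }\), and \(W\) for the implied welfare function is introduced here for illustration only; the derivative of \(V_{\omega }\) is understood to include the effect working through \(\tau _{\omega }\) in (34). Differentiating (63), with \(\gamma _{\iota }^{j}\) given by (64) and the rival platform held fixed, yields

$$\begin{aligned} \frac{ \partial p^{j}}{ \partial x^{j}} =\sigma \Gamma _{\omega }^{ \prime } \left( \Delta V_{\omega }\right) \frac{ \partial V_{\omega }}{ \partial x^{j}} +\frac{1 -\sigma }{2} \sum _{k =y ,m} \Gamma _{\eta _{k}}^{ \prime } \left( \Delta V_{\eta _{k}}\right) \frac{ \partial V_{\eta _{k}}}{ \partial x^{j}} ,\quad x^{j} \in \left\{ \tau _{\eta }^{j} ,\overline{e}^{j} ,\overline{d}^{j}\right\} . \end{aligned}$$

When the two platforms coincide, \(\Delta V_{\iota } =0\) for every type, so these derivatives are the same as those of

$$\begin{aligned} W =\sigma \Gamma _{\omega }^{ \prime } (0) V_{\omega } +\frac{1 -\sigma }{2} \left[ \Gamma _{\eta _{y}}^{ \prime } (0) V_{\eta _{y}} +\Gamma _{\eta _{m}}^{ \prime } (0) V_{\eta _{m}}\right] , \end{aligned}$$

that is, a sum of utilities in which each group is weighted by its population share and by the sensitivity of its vote at the equilibrium point.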

In what follows, we will assume that for the income tax rate proposed by candidate j for entrepreneurs, \(\tau _{\eta }^{j}\), we always have interior solutions at the political equilibrium. In other words we assume that the abstention rate of the workers (who outnumber the entrepreneurs) is such as to guarantee an interior solution.Footnote 19 As far as the education package is concerned, notice that there cannot exist an equilibrium in which both the workers and the entrepreneurs are constrained. If that were the case, one of the candidates could easily improve the outcome for both groups by simultaneously reducing the ration and the tax rates. For each candidate, the FOCs are:

$$\begin{aligned}&{[}\tau _{\eta }^{j}]\quad \frac{ \partial p^{j}}{ \partial \tau _{\eta }^{j}} =\sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} \frac{ \partial V_{\omega }}{ \partial \tau _{\eta }^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \frac{ \partial V_{\eta _{y}}}{ \partial \tau _{\eta }^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \frac{ \partial V_{\eta _{m}}}{ \partial \tau _{\eta }^{j}} =0 , \end{aligned}$$
(65)
$$\begin{aligned}&{[}\overline{e}^{j}]\quad \frac{ \partial p^{j}}{ \partial \overline{e}^{j}} =\sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} \frac{ \partial V_{\omega }}{ \partial \overline{e}^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \frac{ \partial V_{\eta _{y}}}{ \partial \overline{e}^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \frac{ \partial V_{\eta _{m}}}{ \partial \overline{e}^{j}} \ge 0 , \end{aligned}$$
(66)
$$\begin{aligned}&[\overline{d}^{j}]\quad \frac{ \partial p^{j}}{ \partial \overline{d}^{j}} =\sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} \frac{ \partial V_{\omega }}{ \partial \overline{d}^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \frac{ \partial V_{\eta _{y}}}{ \partial \overline{d}^{j}} +\frac{(1 -\sigma )}{2} \frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \frac{ \partial V_{\eta _{m}}}{ \partial \overline{d}^{j}} \ge 0 , \end{aligned}$$
(67)

where the derivatives of the indirect utility functions w.r.t. the policy parameters are given by (54)–(56) for the workers, (57)–(59) for the young entrepreneurs, and (60)–(62) for the mature entrepreneurs.

Substituting for \(\partial V_{\omega }/ \partial \tau _{\eta }^{j}\) and \(\partial V_{\eta _{k}}/ \partial \tau _{\eta }^{j},\) \(k =y ,m,\) from (54), (57), and (60), condition (65) becomes:

$$\begin{aligned}&\sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} v^{ \prime } \frac{1 -\sigma }{\sigma } -\frac{1 -\sigma }{2} \lambda \left[ \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} w_{\eta }^{2 ,t -1}\right] =0 , \nonumber \\&\frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} v^{ \prime } =\frac{\lambda }{2} \left[ \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} w_{\eta }^{2 ,t -1}\right] . \end{aligned}$$
(68)

In words, the marginal tax rates are set so as to equalise the marginal utilities of income weighted by the sensitivity of the two citizen types’ vote to the candidate’s proposal at the equilibrium point,Footnote 20 that is when there is no difference in the proposed platforms (see Mueller 2003, ch. 12). Since the candidates’ proposed policies are identical at the equilibrium, \(\tau _{\eta }^{A} =\tau _{\eta }^{B} =\tau _{\eta }\), \(\overline{e}^{A} =\overline{e}^{B} =\overline{e}\), and \(\overline{d}^{A} =\overline{d}^{B} =\overline{d}\).

The intuition behind these results is the following. Let us suppose, for example, that \(\partial \Gamma _{\eta _{k}}/ \partial V_{\eta _{k}},\) \(k =y ,m,\) is, for any given value of the difference between the utilities in (64), larger than \(\partial \Gamma _{\omega }/ \partial V_{\omega }\), meaning that entrepreneurs respond with a higher increase in the probability of voting for the candidate if the latter marginally differentiates her proposed platform in their favour; then, \(\tau _{\eta }\) will be set in such a way that the marginal utility of income for the entrepreneurs, \(\lambda\), is lower than the marginal utility of income for the workers, \(v^{ \prime }\). That is, the policy favours the citizen whom the candidate perceives as more likely to vote for her as a consequence of such a favour.Footnote 21
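
To see this from (68), one can solve for the ratio of marginal utilities (a rearrangement only, assuming \(\lambda >0\) and \(\partial \Gamma _{\omega }/ \partial V_{\omega } >0\)):

$$\begin{aligned} \frac{v^{ \prime }}{\lambda } =\frac{1}{2} \left[ \frac{ \partial \Gamma _{\eta _{y}}/ \partial V_{\eta _{y}}}{ \partial \Gamma _{\omega }/ \partial V_{\omega }} \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\frac{ \partial \Gamma _{\eta _{m}}/ \partial V_{\eta _{m}}}{ \partial \Gamma _{\omega }/ \partial V_{\omega }} w_{\eta }^{2 ,t -1}\right] . \end{aligned}$$

For given entrepreneurial incomes, the ratio \(v^{ \prime }/\lambda\) rises with the entrepreneurs’ vote sensitivities relative to the workers’, so a more responsive entrepreneurial electorate is associated with a lower \(\lambda\) relative to \(v^{ \prime }\).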

In (68) we consider interior solutions for \(\tau _{\eta }.\) Notice, however, that the characteristics of the solution depend on the abstention rate of the workers. Considering that there are more workers than entrepreneurs, in general \(\tau _{\omega }\) cannot be positive unless the workers’ abstention rate is particularly high even for large differences in the utility they can obtain from the two candidates’ platforms,

$$\begin{aligned}&V_{\iota } \left( \tau _{\eta }^{j} ,\tau _{\omega }^{j} \left( \tau _{\eta }^{j} ,\overline{e}^{j} ,\overline{d}^{j}\right) ,\overline{e}^{j} ,\overline{d}^{j}\right) \nonumber \\&\quad -\,V_{\iota } \left( \tau _{\eta }^{ -j} ,\tau _{\omega }^{ -j} \left( \tau _{\eta }^{ -j} ,\overline{e}^{ -j} ,\overline{d}^{ -j}\right) ,\overline{e}^{ -j} ,\overline{d}^{ -j}\right) ,\quad \iota =\omega ,\eta _{y} ,\eta _{m} . \end{aligned}$$
(69)

The educational policy must usually be paired with redistributive taxation in favour of the workers because the latter suffer from a reduction in the household public good. This scenario can arise if the entrepreneurs’ benefits from the workers’ education are high enough to cover the cost of both the compulsory education package and the redistributive policy. Clearly, such a cost would be lower if the workers attached positive value to their children’s education. In that case, less redistribution would be needed for the workers to accept the compulsory education policy. If education were highly valued by the workers, the equilibrium policy could even prescribe positive values for \(\tau _{\omega }\).

When workers do not attach any value to their children’s education, substituting (55), (58), and (61) into condition (66) and (56), (59), and (62) into condition (67), the two conditions can be re-written as

$$\begin{aligned}&{[}\overline{e}] \quad -\frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} v^{ \prime } +\frac{1 -\sigma }{2} \left\{ \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \left[ \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{e}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{e}}\right] \right. \nonumber \\&\quad \left. +\,\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{e}}\right\} \ge 0, \end{aligned}$$
(70)
$$\begin{aligned}&{[}\overline{d}] \quad -\sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} f^{ \prime } +\frac{1 -\sigma }{2} \left\{ \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \left[ \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{d}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{d}}\right] \right. \nonumber \\&\quad \left. +\,\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{d}}\right\} \ge 0. \end{aligned}$$
(71)

Further, substituting from (68), the above conditions can be re-written as

$$\begin{aligned}&{[}\overline{e}] \quad \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} (1 -\sigma ) \left( \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{e}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{e}}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} (1 -\sigma ) \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{e}} \nonumber \\&\quad \ge \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \lambda \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \lambda w_{\eta }^{2 ,t -1} ;\; \end{aligned}$$
(72)
$$\begin{aligned}&{[}\overline{d}] \quad \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} (1 -\sigma ) \left( \left( 1 -\tau _{\eta }\right) \frac{ \partial w_{\eta }^{2 ,t}}{ \partial \overline{d}} \lambda +\beta \frac{ \partial w_{\eta }^{3 ,t}}{ \partial \overline{d}}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} (1 -\sigma ) \beta \frac{ \partial w_{\eta }^{3 ,t -1}}{ \partial \overline{d}} \nonumber \\&\quad \ge \frac{ \partial \Gamma _{\eta _{y}}}{ \partial V_{\eta _{y}}} \sigma \lambda \frac{f^{ \prime }}{v^{ \prime }} \left( w_{\eta }^{1 ,t} +w_{\eta }^{2 ,t}\right) +\frac{ \partial \Gamma _{\eta _{m}}}{ \partial V_{\eta _{m}}} \sigma \lambda \frac{f^{ \prime }}{v^{ \prime }} w_{\eta }^{2 ,t -1} .~ \end{aligned}$$
(73)

On the l.h.s. of (72) we have a measure of the marginal benefit of educational expenditure for the young and the mature entrepreneurs, weighted by the respective vote sensitivities to the candidates’ proposals; on the r.h.s. we have the weighted marginal cost of \(\overline{e}\), expressed in utility terms. Similarly, on the l.h.s. of (73) we have a measure of the weighted marginal benefit of school time, while on the r.h.s. we have the weighted marginal cost of \(\overline{d}\), expressed as an opportunity cost.
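
The passage from (70)–(71) to (72)–(73) can be sketched compactly. The shorthand below is introduced for illustration only: \(B_{e}\) and \(B_{d}\) stand for the terms in braces in (70) and (71), and \(C\) for the term in square brackets in (68). Condition (68) then reads \(\left( \partial \Gamma _{\omega }/ \partial V_{\omega }\right) v^{ \prime } =\lambda C/2\), and, assuming \(v^{ \prime } >0\), substituting it into (70) and (71) gives

$$\begin{aligned}&\frac{1 -\sigma }{2} B_{e} \ge \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} v^{ \prime } =\frac{\lambda }{2} C \quad \Longrightarrow \quad (1 -\sigma ) B_{e} \ge \lambda C ; \nonumber \\&\frac{1 -\sigma }{2} B_{d} \ge \sigma \frac{ \partial \Gamma _{\omega }}{ \partial V_{\omega }} f^{ \prime } =\sigma \frac{\lambda }{2} \frac{f^{ \prime }}{v^{ \prime }} C \quad \Longrightarrow \quad (1 -\sigma ) B_{d} \ge \sigma \lambda \frac{f^{ \prime }}{v^{ \prime }} C . \end{aligned}$$

Expanding \(B_{e}\), \(B_{d}\) and \(C\) returns (72) and (73).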

Then, as long as marginal benefits exceed or equal marginal costs, a solution in which a certain level of compulsory education is enforced emerges. We have then the interesting result that a compulsory education policy may be implemented at the political equilibrium, despite the fact that one of the two groups of which the society is composed would not educate the children in a free-market equilibrium. The driving force behind this result is the fact that the entrepreneurs gain from having an educated workforce.

The reputational effect plays a role in this respect, by making mature entrepreneurs care about future profits and thus value workers’ education, but it is by no means necessary to achieve our general result. In the absence of such an effect (i.e. for \(\beta =0\)), the l.h.s. of (72) and (73) would be lower, yielding lower equilibrium levels for \(\overline{e}\) and \(\overline{d}\). The benefit of the policy would be enjoyed only by young entrepreneurs, implying a milder policy where the equilibrium exists and a lower probability that an equilibrium with the above characteristics exists, but the general conclusion would be unaffected.

5 Conclusions

Over the years, there have been several contributions to the political economy of education. However, their focus seems to have been mostly on secondary or tertiary education. Also, typically, the main driving force behind the results has been the presence of income dispersion. Consider, for example, the work by Epple and Romano (1996b). In their model, a publicly provided private good, which could be education, is financed through a flat-rate income tax and policy is determined by majority rule; agents differ by their fixed incomes. At the political equilibrium, the private good is publicly provided as long as it is possible to top it up; interestingly, for some preference configurations, the political equilibrium is of the “ends-against-the-middle” variety, i.e. low- and high-income agents favour low levels of public provision whereas the middle-income agents favour high levels of public provision (see also Epple and Romano 1996a). Another well-established result is that post-compulsory education policies are at least partially regressive, redistributing income from the lower-income groups to middle- and high-income groups (see e.g. Fernandez and Rogerson 1995).

We have taken a different route here, paying attention specifically to the question of whether education should be made compulsory or not. We considered an economy with two categories of agents: entrepreneurs and workers. The type of occupation, rather than the income dispersion, plays a crucial role in the analysis. In laissez-faire, the former gain from having their children educated, while the latter have no interest in sending their children to school. We characterised the preferred education policy-cum-redistributive taxation for the two groups, and found that entrepreneurs favour a compulsory education policy while workers prefer a purely redistributive taxation scheme (in both cases, the policy should preferably be financed entirely by the other group). Then, we introduced a political process with probabilistic voting and verified that an equilibrium with both a compulsory education policy and some redistribution may exist in which the workers are constrained but the entrepreneurs, who benefit from hiring educated workers, are not.

We should note that, on top of establishing a political economy rationale for compulsory schooling based on the intuition that, for productivity reasons, entrepreneurs care about education more than workers do, our model also shows that compulsory schooling must co-exist with a (limited) redistribution policy. In particular, the model allows us to make the point that a certain amount of redistribution is needed to compensate the workers for the fact that they are forced to “over-educate” their kids. This kind of perspective on redistribution could only be achieved, we believe, within the structure of the present model.

To the best of our knowledge, the literature on this topic, at least if we consider the political economy line of work, is limited. Most papers follow different approaches from ours. As an example of these alternative views, consider the contribution by Gradstein (2000), whose elegant argument is based on the idea of time inconsistency. He argues that private financing of education can be an inferior public choice if the current government representing the parents is unable to pre-commit the next generation to a restrained redistributive policy. He observes that public education, relative to private education, generates a more equal income distribution for the children, and therefore suggests that in the future the government will have to implement a relatively moderate redistributive policy, as chosen by the median voter. This reduces the incentive to under-invest in the children’s education, an incentive that would instead be large if the parents expected a more aggressive redistribution policy. Thus, human capital should be accumulated at a faster pace under a public education regime, and this would make it preferable for a majority of parents to the alternative of a private education regime. Another interesting example of a paper in the same vein is Correa et al. (2020), where the political support for different education funding regimes in a one-person, one-vote system is studied. Free education, in which all pupils are treated equally (the same amount of resources is spent on each of them), turns out to be the Condorcet winner. This is because, in economies with some degree of income inequality, any other system concentrates the educational expenditure in some way, either favouring the richest families or those with the smartest kids, and therefore lacks majority support. This provides a political economy explanation for the observation that governments tend to favour free education for all students (i.e., to spend the same amount on each student).

Clearly, ours is an entirely different line of reasoning. It does not contradict the contributions above, but it rests on other foundations and focuses specifically on whether education should be made compulsory or not, rather than on whether it should be financed by the State or not (which is of course a somewhat different issue). In our model, income disparity plays a part, in particular by supporting the assumption that entrepreneurs are not constrained by compulsory education. What really drives our result on the desirability of compulsory education at the political equilibrium, however, is the difference in occupation, i.e. the different role that education plays for the entrepreneurs as opposed to the workers because of the industrialization/productivity reason we have mentioned in the Introduction. Redistribution also plays a role: by compensating workers for the loss of their children’s production, it makes workers accept the presence of compulsory education at the political equilibrium.