1 Introduction

In this paper we consider the following question:

Question

Given k polynomials \(f_1,\dots ,f_k\in \mathbb {R}[X]\) of degree at most d with \(f_1(0)=\dots =f_k(0)=0\), how small can we make the fractional parts \(\Vert f_1(n)\Vert _{\mathbb {R}/\mathbb {Z}},\dots ,\Vert f_k(n)\Vert _{\mathbb {R}/\mathbb {Z}}\) over positive integers \(n\le x\)?

Here \(\Vert \cdot \Vert _{\mathbb {R}/\mathbb {Z}}\) denotes the distance to the nearest integer. Since the polynomial \(f(n)=n+1/2\) certainly doesn’t attain arbitrarily small fractional parts, it is natural to impose the condition \(f_1(0)=\dots =f_k(0)=0\) so that all polynomials individually can obtain small fractional parts. Indeed, it is known that if \(f\in \mathbb {R}[X]\) has degree at most \(d\ge 2\) and satisfies \(f(0)=0\) then the fractional part \(\Vert f(n)\Vert _{\mathbb {R}/\mathbb {Z}}\) can become arbitrarily small, and the recent work of Baker [Bak16] shows that

$$\begin{aligned} \min _{n\le x}\Vert f(n)\Vert _{\mathbb {R}/\mathbb {Z}}\ll _{d}\frac{1}{x^{1/(2d^2-2d)+o(1)}}. \end{aligned}$$
(1.1)

It is worth emphasizing that the bound depends only on x and d, and is otherwise completely uniform over all such polynomials f. The exponent \(1/(2d^2-2d)\) is based on the resolution of Vinogradov’s Mean Value Theorem for \(d\ge 4\) by Bourgain–Demeter–Guth [BDG16], but estimates of the shape \(O(1/d^2)\) were known since the work of Wooley [Woo12] and estimates of the form \(O(1/d^2\log {d})\) go back to Vinogradov [Vin04]. The bound (1.1) is certainly not expected to be tight; for monomials the exponent can be improved to \(O(1/d\log {d})\), for example, and it is conjectured [Bak86] that this should be improvable to \(1+o(1)\). Unfortunately, current techniques do not appear to offer a feasible approach to improving the shape of these exponents.
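As a small illustration of the quantity bounded in (1.1), the following Python sketch computes \(\min _{n\le x}\Vert f(n)\Vert _{\mathbb {R}/\mathbb {Z}}\) exactly for a polynomial with rational coefficients. The function names, and the restriction to rational coefficients (so that exact arithmetic with `fractions.Fraction` is available), are assumptions made for this illustration and are not part of the paper.

```python
from fractions import Fraction

def dist_to_nearest_int(t):
    """Return ||t||_{R/Z}, the distance from the rational t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)   # fractional part, in [0, 1)
    return min(frac, 1 - frac)

def min_fractional_part(coeffs, x):
    """Return (min ||f(n)||, first n attaining it) over 1 <= n <= x.

    Here f(X) = sum_j coeffs[j-1] * X^j, so that f(0) = 0 automatically.
    """
    best = None
    for n in range(1, x + 1):
        value = sum(c * n**j for j, c in enumerate(coeffs, start=1))
        d = dist_to_nearest_int(value)
        if best is None or d < best[0]:
            best = (d, n)
    return best
```

For instance, with \(f(n)=n^2/7\) the minimum over \(n\le 6\) is 1/7, while allowing \(n=7\) gives 0; for \(f(n)=n+1/2\), which violates \(f(0)=0\), every value has distance exactly 1/2 from the nearest integer.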

In the case of k polynomials \(f_1,\dots ,f_k\in \mathbb {R}[X]\) of degree at most d with \(f_1(0)=\dots =f_k(0)=0\), the current record, due to Baker and Harman [BH84], is that for some \(c_d>0\)

$$\begin{aligned} \min _{n\le x}\max _{i\le k}\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\ll _{k,d} \frac{1}{x^{1/(k^2+k c_d)+o(1)}}. \end{aligned}$$
(1.2)

This refines the initial groundbreaking work of Schmidt [Sch77] from 1977 who had a similar exponent of the form \(1/2k^2\) when \(d=2\). A simple argument based on choosing the coefficients of \(f_1,\dots ,f_k\) uniformly at random shows that one certainly cannot hope to have a result stronger than

$$\begin{aligned} \min _{n\le x}\max _{i\le k}\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\ll \frac{1}{x^{1/k}}. \end{aligned}$$
(1.3)
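The random-coefficient heuristic behind (1.3) can be made explicit in the monomial case. The following is only a sketch, under the assumptions that \(f_i(n)=\alpha _i n^d\) with \(\alpha _1,\dots ,\alpha _k\) chosen independently and uniformly from [0, 1) and that x is a positive integer. For each fixed \(n\ge 1\) the quantities \(\alpha _i n^d\) are then independent and uniformly distributed modulo 1, so for \(0<\epsilon \le 1/2\)

$$\begin{aligned} \mathbb {E}\,\#\Bigl \{n\le x:\,\max _{i\le k}\Vert \alpha _i n^d\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon \Bigr \}=\sum _{n\le x}(2\epsilon )^k=x(2\epsilon )^k. \end{aligned}$$

Taking \(\epsilon =\tfrac{1}{2}(2x)^{-1/k}\) makes the right hand side equal to 1/2, so some choice of \(\alpha _1,\dots ,\alpha _k\) admits no \(n\le x\) with \(\max _{i\le k}\Vert \alpha _i n^d\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon \). For that choice \(\min _{n\le x}\max _{i\le k}\Vert \alpha _i n^d\Vert _{\mathbb {R}/\mathbb {Z}}\gg x^{-1/k}\).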

These and related questions have been the object of a large amount of study in analytic number theory; see [MT19, Bak17, Bak78, Bak77, Bak80, Bak08, VW00, Woo13, Woo93, Bak18, Sch95, Zah95] for some recent related work. We refer the reader to the book [Bak86] for a comprehensive overview of these questions.

Our main result is to establish a bound for (1.2) with an exponent \(O_d(1/k)\). By comparing this with (1.3) we see that this bound is of the optimal shape in the k-aspect. This result is new even in the simplest non-linear case when \(f_i(n)=\alpha _i n^2\) for \(1\le i\le k\) which corresponds to simultaneous Diophantine approximation with squares. More precisely, our main result is the following.

Theorem 1.1

Let k, d be positive integers. There is a constant \(C_d>2\) depending only on d and a constant \(C_{d,k}>2\) depending only on d and k such that the following holds.

Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Let \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and put \(\Delta =\prod _{i=1}^k\epsilon _i\).

If \(\Delta ^{-1}\le x^{1/C_d}\) and \(x>C_{d,k}\) then there is a positive integer \(n<x\) such that

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\quad \text {for all }i\in \{1,\dots ,k\}. \end{aligned}$$

Choosing \(\epsilon _1=\dots =\epsilon _k=x^{-1/(kC_d)}\) in Theorem 1.1 gives the improvement mentioned above. In the language of [Bak79], this confirms the conjecture that an arbitrary system of polynomials with \(f_1(0)=\dots =f_k(0)=0\) has ‘Heilbronn status’.
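To spell out the arithmetic behind this choice (a routine check, with \(C_d\) the constant of Theorem 1.1): taking \(\epsilon _1=\dots =\epsilon _k=x^{-1/(kC_d)}\) gives

$$\begin{aligned} \Delta =\prod _{i=1}^k\epsilon _i=\bigl (x^{-1/(kC_d)}\bigr )^k=x^{-1/C_d},\qquad \text {so}\qquad \Delta ^{-1}=x^{1/C_d}. \end{aligned}$$

Thus the hypothesis \(\Delta ^{-1}\le x^{1/C_d}\) holds with equality, and \(\epsilon _i\le 1/100\) as soon as \(x\ge 100^{kC_d}\). Theorem 1.1 then produces \(n<x\) with \(\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le x^{-1/(kC_d)}\) for all i, which is a bound of the shape \(x^{-c_d/k}\) with \(c_d=1/C_d\); the remaining bounded range of x can be absorbed into the implied constant.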

Corollary 1.2

Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Then there is a positive integer \(n<x\) such that

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\ll _{d,k} x^{-c_d/k}\quad \text {for all }i\in \{1,\dots ,k\}. \end{aligned}$$

Here \(c_d>0\) is a constant depending only on d, and the implied constant depends only on d and k.

Specializing just to the case when \(f_i(n)=\alpha _i n^d\) for each i (for any choice of fixed \(d\ge 2\)), we find that we obtain a new result on simultaneous Diophantine approximation with \(d^{th}\) powers, which is of the optimal shape.

Corollary 1.3

(Simultaneous Diophantine approximation). Let \(d\ge 2\) and \(\alpha _1,\dots ,\alpha _k\in \mathbb {R}\). Then there is a positive integer \(n<x\) such that

$$\begin{aligned} \Vert \alpha _i n^d\Vert _{\mathbb {R}/\mathbb {Z}}\ll _{d,k} x^{-c_d/k}\quad \text {for all }i\in \{1,\dots ,k\}. \end{aligned}$$

Here \(c_d>0\) is a constant depending only on d, and the implied constant depends only on d and k.

As with previous works, a noteworthy feature of Theorem 1.1 and Corollary 1.2 is that the result is completely uniform over the coefficients of the polynomials (or the choice of the \(\alpha _i\) in Corollary 1.3), with the implied constants depending only on d and k.

The proof as given in this paper would yield a constant \(c_d\) in Corollary 1.2 or Corollary 1.3 which is exponentially small in d (\(c_d=10^{-d}\) would probably suffice), but it is likely that with only a small amount of additional effort the constant could be taken to be of the form \(c_d=C/d^2\) or perhaps even \(C/(d+d^2/k)\) for a relatively small explicit absolute constant C. In the interests of emphasizing the main ideas we have chosen not to pursue such explicit bounds in the d-aspect. Similarly we have made no effort to control the implied constant’s dependence on d or k, although it is likely that adapting the ideas behind [GT09, Proposition A.2] would give a reasonable and explicit dependence on k and d.

2 Outline

In the interest of simplicity we consider the case when \(f_i(n)=\alpha _i n^2\), since this case still has most of the main features of the problem at hand. As in Schmidt’s original work, the argument follows an increment strategy, where either the situation looks ‘random’ or there is additive structure allowing us to pass to a self-similar situation with one fewer polynomial. We obtain improved bounds by getting more structural control over the arithmetic nature of the large Fourier coefficients, allowing for a more complicated but more efficient increment strategy (this successfully achieves the challenge mentioned in [Bak86, Page 5] of sharpening Schmidt’s ‘determinant argument’).

For a generic choice of \(\alpha _1,\dots ,\alpha _k\), we expect that the vector of fractional parts \(\mathbf {v}(n)=(\Vert \alpha _1 n^2\Vert _{\mathbb {R}/\mathbb {Z}},\dots ,\Vert \alpha _k n^2\Vert _{\mathbb {R}/\mathbb {Z}})\) will equidistribute in the torus \(\mathbb {R}^k/\mathbb {Z}^k\). Fourier analysis is well-suited to showing such equidistribution, and one finds that given any intervals \(I_1,\dots ,I_k\) of length \(\delta \) (for some small \(\delta >0\)) and \(x>\delta ^{-k-o(1)}\), there is an \(n<x\) such that \(\mathbf {v}(n)\in I_1\times \dots \times I_k\) unless there is a Diophantine relation

$$\begin{aligned} h_1\alpha _1+\dots +h_k\alpha _k\approx \frac{a}{q} \end{aligned}$$

for some constants \(h_i\le \delta ^{-1-o(1)}\) and some \(q<\delta ^{-O(k)}\). If \(\delta >x^{-c/k}\) for some small \(c>0\), such a relation is unusual but would mean that it is genuinely not the case that \(\mathbf {v}(n)\) equidistributes at this scale. (For example, if \(\alpha _1=\alpha _2\), then there is clearly not equidistribution.)

Schmidt [Sch77] addressed this potential issue by restricting to integers n that are multiples of \(h_1q\) whenever there is such a Diophantine relation (assuming \(h_1\ne 0\), as we may do by relabeling the indices). For such n we see that

$$\begin{aligned} \Vert \alpha _1 n^2\Vert _{\mathbb {R}/\mathbb {Z}}\approx \left\| \sum _{i=2}^k h_i\alpha _i n^2 /h_1\right\| _{\mathbb {R}/\mathbb {Z}}, \end{aligned}$$

and so if we can find \(a_1,\dots ,a_k\in \mathbb {Z}\) and \(n'<x/q h_1\) such that \(|\alpha _i (q h_1n')^2-a_i|<\delta ^{1+o(1)}\) for \(2\le i\le k\) and such that \(\sum _{i=1}^k h_i a_i=0\) then we can find an \(n<x\) such that \(\Vert \alpha _i n^2\Vert _{\mathbb {R}/\mathbb {Z}}<\delta \) for \(1\le i\le k\). This essentially reduces the problem of finding \(n<x\) such that \(\Vert \alpha _i n^2\Vert _{\mathbb {R}/\mathbb {Z}}\) is small for \(1\le i\le k\) to one of finding \(m<x\delta ^{k+o(1)}\) such that \(\Vert \alpha _i' m^2\Vert _{\mathbb {R}/\mathbb {Z}}\) is small for \(1\le i\le k-1\) for some reals \(\alpha '_1,\dots ,\alpha '_{k-1}\). Since the problem is now analogous to the original but with one fewer variable, we may repeat the above procedure O(k) times. We maintain a non-trivial range for n provided \(\delta >x^{-c/k^2}\) for some small constant \(c>0\), which gives Schmidt’s result that there is an \(n<x\) such that \(\Vert \alpha _i n^2\Vert _{\mathbb {R}/\mathbb {Z}}<x^{-c/k^2}\) for some constant c independent of k or \(\alpha _1,\dots ,\alpha _k\).
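The bookkeeping behind the exponent \(c/k^2\) can be sketched as follows, ignoring the \(o(1)\) terms and treating the loss at every stage as exactly \(\delta ^{k}\) (a simplifying assumption made only for illustration). Writing \(x_j\) for the range available after j reductions,

$$\begin{aligned} x_0=x,\qquad x_{j+1}=x_j\delta ^{k},\qquad \text {so}\qquad x_k=x\delta ^{k^2}. \end{aligned}$$

The final range is non-trivial, say \(x_k\ge 1\), precisely when \(\delta \ge x^{-1/k^2}\), which matches the shape \(\delta >x^{-c/k^2}\) of Schmidt’s bound.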

The above procedure would produce a bound of size \(\Vert \alpha _i n^2\Vert _{\mathbb {R}/\mathbb {Z}}<x^{-c/k}\) if at each stage the denominator q was of size \(\delta ^{-O(1)}\) instead of size \(\delta ^{-O(k)}\). Therefore let us consider the most problematic case when q is of size \(\delta ^{-O(k)}\approx x^{c}\). In this case one still has suitable equidistribution via Fourier analysis unless there are many vectors \((h_1,\dots ,h_k)\in [0,\delta ^{-1+o(1)}]^k\) and coprime integers a, q with \(q\in [Q,2Q]\) such that

$$\begin{aligned} h_1\alpha _1+\dots +h_k\alpha _k\approx \frac{a}{q}. \end{aligned}$$

The key new idea in our proof is to exploit the fact that we have many such relations rather than just one, and that the \(h_i\) must lie in an additively structured set, which will show that the rationals a/q cannot have many distinct denominators. In fact, we will show that several of these relations must have the same denominator q, from which we can reduce to a much lower dimensional situation.

If many of the relations do have the same denominator q, then there must be many linearly independent solutions with the same denominator, and so (after relabelling) we can find short vectors \(\mathbf {h}^{(1)}=(h_{1}^{(1)},\dots ,h_k^{(1)}),\dots ,\mathbf {h}^{(r)}=(h_{1}^{(r)},\dots ,h^{(r)}_k)\) in \([0,\delta ^{-1+o(1)}]^{k}\) such that \((h^{(1)}_1,\dots ,h^{(1)}_r), \dots ,(h^{(r)}_1,\dots ,h^{(r)}_r)\) are linearly independent in \(\mathbb {Z}^r\) and if q|n then

$$\begin{aligned} \left\| \sum _{i=1}^r h_i^{(1)}\alpha _i n^2\right\| _{\mathbb {R}/\mathbb {Z}}&\approx \left\| \sum _{i=r+1}^k h_i^{(1)}\alpha _i n^2\right\| _{\mathbb {R}/\mathbb {Z}},\\&\vdots \\ \left\| \sum _{i=1}^r h_i^{(r)}\alpha _i n^2\right\| _{\mathbb {R}/\mathbb {Z}}&\approx \left\| \sum _{i=r+1}^k h_i^{(r)}\alpha _i n^2\right\| _{\mathbb {R}/\mathbb {Z}}. \end{aligned}$$

If we also restrict to \(\det ( h^{(i)}_j)_{1\le i,j\le r}|n\) then we find that, as in Schmidt’s argument, we can reduce the problem to a lower dimensional one. In this case, however, we reduce the dimension by r rather than just 1, and in fact we can take \(r\gg \log {x}/\log {q}\). In the case when we always have \(q \approx x^c\) this process terminates after O(1) iterations rather than O(k) iterations, allowing us to maintain a non-trivial range of n if \(\delta >x^{-c/k}\) for some small constant c.

Alternatively, if there are many different denominators which occur (say \(Q^{1/100}\) different denominators of size Q) then it turns out we may find a subset of \(Q^{1/200}\) of these denominators which are almost all coprime to one another apart from some fixed integer d which divides all of the denominators in this subset. From this coprimality relation, we see that by adding r of these equations together one finds

$$\begin{aligned} \left\| \sum _{j=1}^r \sum _{i=1}^k \alpha _i h_i^{(j)}\right\| _{\mathbb {R}/\mathbb {Z}}\approx \left\| \frac{a^{(1)}}{q^{(1)}}+\dots +\frac{a^{(r)}}{q^{(r)}} \right\| _{\mathbb {R}/\mathbb {Z}}\ge \frac{1}{q^{(1)}\cdots q^{(r)}}. \end{aligned}$$

In particular, we see that almost all combinations of r/2 of the equations are distinct, and so there are \(\gg Q^{r/200}\) different non-zero combinations. However, we also have that the number of different choices of the coefficients of the \(\alpha _i\) in these relations is bounded by \(r^k\delta ^{-k-o(1)}\). This is less than \(Q^{r/200}\) if r is sufficiently large, giving a contradiction. Hence there cannot be many relations with different denominators q.
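For concreteness, the comparison of the two counts can be checked as follows in the problematic regime, under the illustrative assumption that \(Q=\delta ^{-Ak}\) for some fixed \(A>0\) and ignoring the \(o(1)\) terms. Then

$$\begin{aligned} Q^{r/200}=\delta ^{-Akr/200},\qquad r^k\delta ^{-k}=\delta ^{-k\left( 1+\log {r}/\log (1/\delta )\right) }, \end{aligned}$$

so \(r^k\delta ^{-k}<Q^{r/200}\) holds as soon as \(Ar/200>1+\log {r}/\log (1/\delta )\). For example, any fixed \(r>400/A\) works once \(\delta \) is small enough that \(\log {r}\le \log (1/\delta )\).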

The above sketch is too simplistic in multiple ways—there are quantitative issues if the vectors \(\mathbf {h}^{(j)}\) are not essentially orthogonal to each other, and one actually needs to find suitable low height relations to avoid an accumulation of losses through the induction procedure. These can be achieved by exploiting ideas from the geometry of numbers. It is also the case (and was even in Schmidt’s original argument) that it is necessary to consider more general approximations by lattice vectors of the vector of polynomials such that the difference lies in a given convex set, which is essentially equivalent to considering approximations of the form \(\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\) for some given reals \(\epsilon _1,\dots ,\epsilon _k\).

Remark

It is interesting to note that the above argument can be interpreted as a density-increment argument in the style of Roth’s theorem on arithmetic progressions. This is the first setting of which we are aware in which such a strategy produces essentially optimal polynomial-type bounds in a non-trivial situation. Moreover, it has been speculated (see [Gow]) that even in less structured problems one might hope to have either a small density increment on a ‘small codimension’ set, or a large density increment on a ‘large codimension’ set. We achieve something very much along these lines in this (more structured) setting of fractional parts of polynomials.

3 Notation

Throughout the paper we assume that we have polynomials \(f_1,\dots ,f_k\in \mathbb {R}[X]\) of degree at most d with \(f_1(0)=\dots =f_k(0)=0\). We let these polynomials be given by \(f_i(X)=\sum _{j=1}^d f_{i,j}X^j\). Furthermore, we have reals \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and we put \(\Delta :=\prod _{i=1}^k\epsilon _i\).

To avoid any confusion about the quantifiers in statements of the form ‘if \(A\ll 1\) then \(B\ll 1\), with all implied constants depending only on d and k’, we emphasize that we take this statement to mean that for any positive function \(f(d,k)\) of d and k, there is a positive function \(g_f(d,k)\) depending only on f such that if \(|A|\le f(d,k)\) then we have \(|B|\le g_f(d,k)\).

4 The Main Argument

Our proof of Theorem 1.1 relies on three key propositions. In this section we show how the theorem follows quickly from the propositions, leaving us with the task of establishing the propositions separately from one another.

Our first proposition is a standard result which follows from Weyl’s bound for polynomial exponential sums.

Proposition 4.1

(Equidistribution or many linear relations). Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Put \(f_i(X)=\sum _{j=1}^d f_{i,j} X^j\). Let \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and put \(\Delta =\prod _{i=1}^k\epsilon _i\).

Then there is a constant \(C_d>0\) depending only on d such that, provided \(\Delta ^{-C_d}<x\), at least one of the following holds:

  1. (1)

    We have

    $$\begin{aligned} \#\{n\le x:\, \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i\,\forall i\}\gg x\prod _{i=1}^k\epsilon _i. \end{aligned}$$
  2. (2)

    There is some \(Q\le \Delta ^{-C_d}\) such that there are at least \(Q^{1/C_d}\) triples \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathbb {Z}^d\times \mathbb {Z}^d\times \mathbb {Z}^k\) satisfying:

    1. (a)

      \(\gcd (a_j,q_j)=1\) and \(1\le q_j\le Q\) for \(1\le j\le d\).

    2. (b)

      \(h_i\ll \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) for \(1\le i\le k\).

    3. (c)

      For each \(j\in \{1,\dots ,d\}\) we have

      $$\begin{aligned} \sum _{i=1}^k h_i f_{i,j}=\frac{a_j}{q_j}+O\left( \frac{Q^{C_d}}{x^{j}}\right) . \end{aligned}$$

All implied constants depend only on d and k.

Our second proposition allows us to find structure in the large Fourier coefficients with many of them giving rise to rationals with the same denominator.

Proposition 4.2

(Many relations must have the same denominator). Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Put \(f_i(X)=\sum _{j=1}^d f_{i,j} X^j\). Let \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and put \(\Delta =\prod _{i=1}^k\epsilon _i\).

Let \(C>2\) and \(Q\le \Delta ^{-C}\) be such that there are at least \(Q^{1/C}\) triples \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathbb {Z}^d\times \mathbb {Z}^d\times \mathbb {Z}^k\) satisfying:

  1. (1)

    \(\gcd (a_j,q_j)=1\) and \(1\le q_j\le Q\) for \(1\le j\le d\).

  2. (2)

    \(h_i\ll \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) for \(1\le i\le k\).

  3. (3)

    For each \(j\in \{1,\dots ,d\}\) we have

    $$\begin{aligned} \sum _{i=1}^k h_i f_{i,j}=\frac{a_j}{q_j}+O\left( \frac{Q^{C}}{x^{j}}\right) . \end{aligned}$$

Then there is a constant \(C'_d>0\) depending only on d and C such that provided \(\Delta ^{-C'_d}<x\) there is some positive integer \(q\le Q^{C'_d}\) and at least \(Q^{1/C'_d}\) pairs \((\mathbf {a},\mathbf {h})\in \mathbb {Z}^d\times \mathbb {Z}^k\) such that:

  1. (1)

    \(h_i\ll \epsilon _i^{-1}\Delta ^{-2/(2k)^4}\) for \(i\in \{1,\dots ,k\}\).

  2. (2)

    For each \(j\in \{1,\dots ,d\}\) we have

    $$\begin{aligned} \sum _{i=1}^k h_i f_{i,j}=\frac{a_j}{q}+O\left( \frac{Q^{C'_d}}{x^{j}}\right) . \end{aligned}$$

All implied constants depend only on d and k.

Our third key proposition allows us to pass from many relations with the same denominator to a reduced system of approximations.

Proposition 4.3

(Many relations with the same denominator give rise to a reduced dimension problem). Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Put \(f_i(X)=\sum _{j=1}^d f_{i,j} X^j\). Let \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and put \(\Delta =\prod _{i=1}^k\epsilon _i\).

Let \(C>2\) be such that \(\Delta ^{-1}\le x^{1/4C^2}\), and let q be a positive integer with \(q<Q^C\).

Let \(\mathcal {S}\) be the set of pairs \((\mathbf {a},\mathbf {h})\in \mathbb {Z}^d\times \mathbb {Z}^k\) such that for \(j\in \{1,\dots ,d\}\) we have

$$\begin{aligned} \left| \sum _{i=1}^k h_if_{i,j}-\frac{a_j}{q}\right| \ll \frac{Q^C}{x^j}, \end{aligned}$$

and such that \(|h_i|\ll \epsilon _i^{-1}\Delta ^{-2/(2k)^4}\). Assume that \(\#\mathcal {S}>Q^{1/C}\).

Then there is an integer \(k'<k\), polynomials \(g_1,\dots ,g_{k'}\in \mathbb {R}[X]\) of degree at most d with \(g_1(0)=\dots =g_{k'}(0)=0\) and quantities \(\epsilon _1',\dots ,\epsilon _{k'}'\in (0,1/100]\) and \(y<x\) such that:

  1. (1)

    (Approximations in the new system produce approximations in the old system.) If there is an integer \(n'<y\) such that,

    $$\begin{aligned} \Vert g_i(n')\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i'\quad \text {for all }1\le i\le k' \end{aligned}$$

    then there is an integer \(n<x\) such that

    $$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i\quad \text {for all }1\le i\le k. \end{aligned}$$
  2. (2)

    (Increased density of approximations.) We have

    $$\begin{aligned} y(\epsilon _1'\cdots \epsilon _{k'}')^{3C^2-C^2/k'{}^3}\gg x(\epsilon _1\cdots \epsilon _k)^{3C^2-C^2/k^3}. \end{aligned}$$

All implied constants depend only on k and d.

We see that case (2) of the conclusion of Proposition 4.1 satisfies the assumptions of Proposition 4.2, and the conclusion of Proposition 4.2 satisfies the conditions of Proposition 4.3. Thus, putting these three propositions together we obtain

Proposition 4.4

(Induction Step). Let d, k be positive integers. There is a constant \(C_d>2\) depending only on d and \(C_{d,k}>2\) depending only on d and k such that the following holds.

Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d such that \(f_1(0)=\dots =f_k(0)=0\). Put \(f_i(X)=\sum _{j=1}^d f_{i,j} X^j\). Let \(\epsilon _1,\dots ,\epsilon _k\in (0,1/100]\), and put \(\Delta =\prod _{i=1}^k\epsilon _i\). Let \(\Delta ^{-1}\le x^{2/C_d}\).

If there is no positive integer \(n<x\) such that

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i\quad \text {for all }i\in \{1,\dots ,k\}, \end{aligned}$$

then there is a positive integer \(k'<k\) and polynomials \(g_1,\dots ,g_{k'}\in \mathbb {R}[X]\) of degree at most d with \(g_1(0)=\dots =g_{k'}(0)=0\) and reals \(\epsilon _1',\dots ,\epsilon _{k'}'\in (0,1/100]\) and \(y\in \mathbb {R}\) with \(y<x\) such that both of the following hold:

  1. (1)

    There is no positive integer \(n'<y\) such that

    $$\begin{aligned} \Vert g_i(n')\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i'\quad \text {for all }i\in \{1,\dots ,k'\}. \end{aligned}$$
  2. (2)

    We have

    $$\begin{aligned} y(\epsilon _1'\cdots \epsilon _{k'}')^{C_d(3-1/(k')^3)}\ge \frac{x(\epsilon _1\cdots \epsilon _k)^{C_d(3-1/k^3)}}{C_{d,k}}. \end{aligned}$$

All implied constants depend only on k and d.

Proof of Theorem 1.1 assuming Proposition 4.4

Let \(C_d\) and \(C_{d,k}\) be the constants of Proposition 4.4, and let \(C_0=\sup _{j\le k}C_{d,j}\) (which depends only on d and k). Assume for a contradiction that there is no positive \(n<x\) such that \(\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\) for all \(i\in \{1,\dots ,k\}\).

We will apply Proposition 4.4 repeatedly to reduce the dimension of the problem we consider. Let us define a System to be a tuple \((k,\mathbf {g},\varvec{\delta },y)\) consisting of:

  1. (1)

    A positive integer k.

  2. (2)

    A k-tuple \(\mathbf {g}\) of real polynomials \((g_1,\dots ,g_{k})\) of degree at most d satisfying \(g_1(0)=\dots =g_{k}(0)=0\).

  3. (3)

    A k-tuple \(\varvec{\delta }\) of reals \((\delta _1,\dots ,\delta _k)\) with \(\delta _i\in (0,1/100]\) for all \(i\in \{1,\dots ,k\}\).

  4. (4)

    A real y such that there is no positive integer \(n<y\) satisfying

    $$\begin{aligned} \Vert g_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \delta _i\quad \text {for all }i\in \{1,\dots ,k\}. \end{aligned}$$

Given a System \((k,\mathbf {g},\varvec{\delta },y)\), let \(\Delta (\varvec{\delta })=\prod _{i=1}^k\delta _i\). By Proposition 4.4, if a system \((k_j,\mathbf {g}_j,\varvec{\delta }_j,y_j)\) satisfies \(\Delta (\varvec{\delta }_j)^{-1}<y_j^{2/C_d}\) then there is a system \((k_{j+1}, \mathbf {g}_{j+1},\varvec{\delta }_{j+1},y_{j+1})\) such that \(k_{j+1}<k_j\), \(y_{j+1}\le y_j\) and

$$\begin{aligned} y_{j+1}\Delta (\varvec{\delta }_{j+1})^{C_d(3-1/k_{j+1}^2)}\ge \frac{y_j\Delta (\varvec{\delta }_j)^{C_d(3-1/k_j^2)}}{C_0}. \end{aligned}$$

In particular, if

$$\begin{aligned} y_j\Delta (\varvec{\delta }_j)^{C_d(3-1/k_j^2)}>C_0^{k_j} \end{aligned}$$

then

$$\begin{aligned} y_{j+1}\Delta (\varvec{\delta }_{j+1})^{C_d(3-1/k_{j+1}^2)}>C_0^{k_j-1}\ge C_0^{k_{j+1}}. \end{aligned}$$

Moreover, since \(C_d,C_0>2\), this implies that \(\Delta (\varvec{\delta }_{j+1})^{-1}<y_{j+1}^{2/C_d}\). Thus, given a System \((k_1,\mathbf {g}_1,\varvec{\delta }_1,y_1)\) with \(y_1\Delta (\varvec{\delta }_1)^{C_d(3-1/k_1^2)}>C_0^{k_1}\), we may repeatedly apply Proposition 4.4 to obtain an infinite sequence of Systems \((k_j,\mathbf {g}_j,\varvec{\delta }_j,y_j)\) for all \(j=1,2,\dots \). But the \(k_j\) are a decreasing sequence of positive integers, and so no such sequence can exist. Thus there can be no System \((k_1,\mathbf {g}_1,\varvec{\delta }_1,y_1)\) with \(y_1\Delta (\varvec{\delta }_1)^{C_d(3-1/k_1^2)}>C_0^{k_1}\).
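The step ‘this implies \(\Delta (\varvec{\delta }_{j+1})^{-1}<y_{j+1}^{2/C_d}\)’ can be unpacked as follows. Write E for the exponent of \(\Delta (\varvec{\delta }_{j+1})\) in the preceding display (the symbol E is introduced here only for this check). It is \(C_d\) times a number in [2, 3), so \(E\ge 2C_d\). Since \(C_0^{k_{j+1}}>1\) and \(\Delta (\varvec{\delta }_{j+1})<1\), we have

$$\begin{aligned} y_{j+1}>C_0^{k_{j+1}}\Delta (\varvec{\delta }_{j+1})^{-E}>\Delta (\varvec{\delta }_{j+1})^{-E}\ge \Delta (\varvec{\delta }_{j+1})^{-2C_d},\qquad \text {hence}\qquad \Delta (\varvec{\delta }_{j+1})^{-1}<y_{j+1}^{1/(2C_d)}\le y_{j+1}^{2/C_d}, \end{aligned}$$

where the last inequality uses \(y_{j+1}>1\) and \(1/(2C_d)\le 2/C_d\).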

Let us be given a positive integer k, a k-tuple \(\mathbf {f}=(f_1,\dots ,f_k)\) of real polynomials of degree at most d with \(f_1(0)=\dots =f_k(0)=0\), a k-tuple of reals \(\varvec{\epsilon }=(\epsilon _1,\dots ,\epsilon _k)\) with \(\epsilon _i\in (0,1/100]\) for all \(i\in \{1,\dots ,k\}\) and a real x with \(x>\Delta (\varvec{\epsilon })^{-C^*_d}\), where \(C^*_d:=C_d(3-1/k^2)+\log {C_0}/\log {100}\). Then, since \(\epsilon _i\le 1/100\), we see that \(\Delta (\varvec{\epsilon })\le 100^{-k}\) so

$$\begin{aligned} x>\Delta (\varvec{\epsilon })^{-C_d(3-1/k^2)}\Delta (\varvec{\epsilon })^{-\log {C_0}/ \log {100}}\ge C_0^k \Delta (\varvec{\epsilon })^{-C_d(3-1/k^2)}. \end{aligned}$$

Therefore \((k,\mathbf {f},\varvec{\epsilon },x)\) cannot form a System, and so there must be a positive integer \(n<x\) such that

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i\quad \text {for all }i\in \{1,\dots ,k\}. \end{aligned}$$

This gives the result. \(\square \)

Since Proposition 4.4 follows immediately from Propositions 4.1, 4.2 and 4.3, it remains to establish these three propositions. We establish each of these in turn over the next three sections.

5 Initial Fourier Analysis and Proposition 4.1

In this section we establish Proposition 4.1. The arguments in this section are standard and well-known to researchers in the field, but we give full proofs for completeness, since the versions we use are slightly different from some occurrences in the literature.

Lemma 5.1

(Weyl exponential sum bound). Let \(f\in \mathbb {R}[X]\) be a monic polynomial of degree d, and \(\alpha \in \mathbb {R}\) satisfy \(\alpha =a/q+O(1/q Q)\) for some \(q<Q\). Then there is a constant \(c_d>0\) depending only on d such that

$$\begin{aligned} \sum _{n<x}e(f(n)\alpha )\ll \frac{x}{q^{c_d}}+\frac{x}{(x^d/q)^{c_d}}. \end{aligned}$$

The implied constant depends only on d.

Proof

This follows from [Vau97, Lemma 2.4]. \(\square \)
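The shape of the first term in Lemma 5.1 can be seen numerically in the simplest case \(f(n)=n^2\) and \(\alpha =a/q\). The sketch below is an illustration only, with a hypothetical function name. It relies on the classical fact (not taken from the paper) that for odd q and \(\gcd (a,q)=1\) each complete block of q consecutive terms is a quadratic Gauss sum of absolute value \(\sqrt{q}\), so a sum over \(x=mq\) terms has size \(x/\sqrt{q}\).

```python
import cmath
import math

def quadratic_weyl_sum(a, q, x):
    """Return sum_{1 <= n <= x} e(a * n^2 / q), where e(t) = exp(2*pi*i*t)."""
    total = 0j
    for n in range(1, x + 1):
        r = (a * n * n) % q          # reduce a*n^2 modulo q exactly, before using floats
        total += cmath.exp(2j * math.pi * r / q)
    return total
```

For example, `abs(quadratic_weyl_sum(1, 101, 1010))` agrees with \(10\sqrt{101}\) up to rounding error, which is far smaller than the trivial bound 1010 and is consistent with a saving of a power of q.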

Lemma 5.2

(Modified Weyl exponential sum bound). Let \(f(X)=\sum _{i=1}^d f_i X^i\in \mathbb {R}[X]\) be a polynomial of degree d with \(f(0)=0\). Then there is a constant \(C_d''>2\) depending only on d such that the following holds.

If there is some \(Q\in [2,x^{1/C_d''}]\) such that

$$\begin{aligned} \left| \sum _{n<x}e(f(n))\right| \ge \frac{x}{Q} \end{aligned}$$

then there are positive integers \(q_1,\dots ,q_d<Q^{C_d''}\) and integers \(a_1,\dots ,a_d\) such that \(\gcd (a_j,q_j)=1\) for \(j\in \{1,\dots ,d\}\) and

$$\begin{aligned} f_j=\frac{a_j}{q_j}+O\left( \frac{Q^{C_d''}}{x^j}\right) \end{aligned}$$

for \(j\in \{1,\dots ,d\}\). The implied constants depend only on d.

Proof

We prove the result by induction. Assume that for each \(j>d-\ell \) we have that the coefficient \(f_j\) of f(X) satisfies

$$\begin{aligned} f_j=\frac{a_j}{q_j}+O\left( \frac{Q^{C_j''}}{x^j}\right) \end{aligned}$$
(5.1)

for some coprime integers \(a_j,q_j\) with \(q_j<Q^{C_j''}\) and some constants \(C_{j}''\) bounded only in terms of d. In the base case with \(\ell =0\) we make no assumption. We wish to show that there is a constant \(C_{d-\ell }''\) bounded only in terms of d such that if \(Q\le x^{1/C''_{d-\ell }}\) then there are coprime integers \(a_{d-\ell },q_{d-\ell }\) with \(q_{d-\ell }<Q^{C''_{d-\ell }}\) such that (5.1) holds with \(j=d-\ell \). This would then give the result by applying this statement for each \(\ell \in \{0,\dots ,d-1\}\) in turn (noting that, since there are only d different values of \(\ell \) to consider, all implied constants remain bounded only in terms of d).

Let \(C\ge \max _{j>d-\ell }C_j''\) be taken sufficiently large in terms of d, and let \(\tilde{q}=q_d q_{d-1}\cdots q_{d-\ell +1}<Q^{d C}\) (let \(\tilde{q}=1\) if \(\ell =0\)). Since we assume that \(Q<x^{1/C''_{d-\ell }}\), we have \(Q^{2C}\tilde{q}<x^{1/3}\) on restricting to \(C''_{d-\ell }>10 d C\). We can split \(\{1,\dots ,x\}\) into \(O(Q^{2C}\tilde{q})\) disjoint arithmetic progressions with modulus \(\tilde{q}\) each containing between \(x/Q^{2C}\tilde{q}\) and \(2x/Q^{2C}\tilde{q}\) elements. (For each residue class \(b\ (\mathrm {mod}\ \tilde{q})\) greedily take the \(\lceil x/Q^{2C}\tilde{q}\rceil \) smallest elements until less than \(2\lceil x/Q^{2C}\tilde{q}\rceil \) remain.) Then by the triangle inequality

$$\begin{aligned} \left| \sum _{n<x}e(f(n))\right| \ll Q^{2C}\tilde{q}\sup _{\begin{array}{c} x/Q^{2C}\tilde{q}\le y\le 2x/Q^{2C}\tilde{q}\\ x_0\le x \end{array}}\left| \sum _{n<y}e(f(x_0+\tilde{q}n))\right| . \end{aligned}$$

By the hypothesis of the lemma, the left hand side is at least x/Q. Thus there must be a choice of integers \(y\asymp x/\tilde{q}Q^{2C}\) and \(x_0<x\) such that

$$\begin{aligned} \left| \sum _{n<y}e(f(x_0+\tilde{q}n))\right| \gg \frac{y}{Q}. \end{aligned}$$
(5.2)
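The greedy splitting of \(\{1,\dots ,x\}\) into progressions used above can be sketched in Python as follows. The function name and the parameter `block`, which plays the role of \(\lceil x/Q^{2C}\tilde{q}\rceil \), are assumptions made for this illustration; the sketch also assumes that every residue class modulo \(\tilde{q}\) contains at least `block` elements of \(\{1,\dots ,x\}\), as is the case in the proof.

```python
def split_into_progressions(x, modulus, block):
    """Partition {1,...,x} into progressions with common difference `modulus`.

    Each piece has at least `block` and fewer than 2*block elements, assuming
    every residue class contains at least `block` elements of {1,...,x}.
    """
    pieces = []
    for b in range(1, modulus + 1):            # one representative per residue class
        members = list(range(b, x + 1, modulus))
        start = 0
        # Peel off the `block` smallest remaining elements while at least 2*block remain.
        while len(members) - start >= 2 * block:
            pieces.append(members[start:start + block])
            start += block
        pieces.append(members[start:])         # final piece of this residue class
    return pieces
```

With `x = 100`, `modulus = 3` and `block = 5` this produces 18 pieces, each a progression of common difference 3 with between 5 and 9 elements, and the number of pieces is of order `x / block` as used in the triangle inequality step.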

From the Diophantine approximations (5.1) and the periodicity of e(t) we see that for \(\tilde{q}n<x/Q^{2C}\) we have

$$\begin{aligned} e\left( \sum _{j>d-\ell }(x_0+\tilde{q}n)^jf_j\right)&=e\left( \sum _{j>d-\ell }f_j x_0^j\right) e\left( \sum _{j>d-\ell }\sum _{i=1}^j\left( {\begin{array}{c}j\\ i\end{array}}\right) f_j\tilde{q}^in^i x_0^{j-i}\right) \\&=e\left( \sum _{j>d-\ell }f_j x_0^j\right) e\left( O\left( \frac{Q^C\tilde{q}n}{x}\right) \right) \\&=e\left( \sum _{j>d-\ell }f_j x_0^j\right) +O\left( \frac{1}{Q^C}\right) . \end{aligned}$$

Thus we have

$$\begin{aligned} \left| \sum _{n<y}e(f(x_0+\tilde{q}n))\right| =\left| \sum _{n<y}e(g(n))\right| +O\left( \frac{y}{Q^C}\right) , \end{aligned}$$
(5.3)

where g is the degree \(d-\ell \) polynomial

$$\begin{aligned} g(X)=\sum _{i=1}^{d-\ell }(x_0+\tilde{q}X)^if_i. \end{aligned}$$

Taking C sufficiently large in terms of d, we see that (5.2) and (5.3) show that

$$\begin{aligned} \left| \sum _{n<y}e(g(n))\right| \gg \frac{y}{Q} \end{aligned}$$
(5.4)

for some \(y\asymp x/Q^{2C}\tilde{q}\) and some \(x_0\). Let \(\alpha =f_{d-\ell }\tilde{q}^{d-\ell }\) be the lead coefficient of g. If \(\alpha =0\) then (5.1) clearly holds for \(j=d-\ell \). Thus we may assume \(\alpha \ne 0\). By Dirichlet’s Theorem, for any choice of \(C'\), there is an approximation

$$\begin{aligned} \alpha =\frac{a_{d-\ell }}{q_{d-\ell }}+O\left( \frac{Q^{C'}}{q_{d-\ell }x^{d-\ell }}\right) \end{aligned}$$

for some coprime integers \(a_{d-\ell },q_{d-\ell }\) with \(q_{d-\ell }<x^{d-\ell }/Q^{C'}\). By applying Lemma 5.1 to the polynomial \(g(X)/\alpha \) we see that

$$\begin{aligned} \frac{y}{Q}\ll \left| \sum _{n<y}e(g(n))\right| \ll \frac{y}{q_{d-\ell }^{c_d}}+\frac{y}{(y^{d-\ell }/q_{d-\ell })^{c_d}}. \end{aligned}$$
(5.5)

We recall that

$$\begin{aligned} y^{d-\ell }\ge \frac{x^{d-\ell }}{Q^{2C(d-\ell )}\tilde{q}^{d-\ell }}\ge \frac{x^{d-\ell }}{Q^{(d+2)C(d-\ell )}}, \end{aligned}$$

and that \(q_{d-\ell }\le x^{d-\ell }/Q^{C'}\). Therefore

$$\begin{aligned} y^{d-\ell }/q_{d-\ell }>Q^{C'-(d+2)C(d-\ell )}. \end{aligned}$$

On choosing \(C'\) large compared with \(c_d\) and C, we see that this implies \((y^{d-\ell }/q_{d-\ell })^{c_d}\ge Q^2\). Thus (5.5) implies that \(q_{d-\ell }\le Q^{1/c_d}\le Q^{C'}\). This gives (5.1) with \(j=d-\ell \) and \(C''_{d-\ell }\) large enough in terms of d, and so gives the result. \(\square \)
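The appeal to Dirichlet’s Theorem in the proof above can be illustrated by a brute-force search. This is a sketch only: the function name is hypothetical, \(\alpha \) is taken to be rational so that the arithmetic is exact, and the integer N plays the role of \(x^{d-\ell }/Q^{C'}\), so that the error \(1/(qN)\) corresponds to \(Q^{C'}/(q_{d-\ell }x^{d-\ell })\).

```python
from fractions import Fraction

def dirichlet_approximation(alpha, N):
    """Find a reduced fraction a/q with 1 <= q <= N and |alpha - a/q| < 1/(q*N).

    alpha is a Fraction and N a positive integer; Dirichlet's theorem
    guarantees that the search below succeeds.
    """
    for q in range(1, N + 1):
        a = round(q * alpha)                      # nearest integer to q*alpha
        if abs(q * alpha - a) < Fraction(1, N):
            return Fraction(a, q)                 # Fraction reduces to lowest terms
    raise AssertionError("unreachable by Dirichlet's theorem")
```

For example, `dirichlet_approximation(Fraction(314159, 100000), 10)` returns `Fraction(22, 7)`. Reducing to lowest terms can only decrease the denominator, so the returned fraction still satisfies the stated inequality.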

Lemma 5.3

(Equidistribution or many large Fourier coefficients). Let \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be real valued functions, and \(\epsilon _1,\dots ,\epsilon _k\in (0,1/2]\) be real numbers, with \(\Delta :=\prod _{i=1}^k\epsilon _i\). Then at least one of the following holds:

  1. (1)

    We have

    $$\begin{aligned} \#\{n\le x:\, \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\,\forall i\}\gg \Delta x. \end{aligned}$$
  2. (2)

    There is a quantity \(Q\ge 2\) such that there are at least \(Q^{1/2}\) distinct values of \(\mathbf {h}\in \mathbb {Z}^k\backslash \{\mathbf {0}\}\) with \(|h_i|<\epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) such that

    $$\begin{aligned} \frac{x}{Q}\le \left| \sum _{n\le x}e\left( \sum _{i=1}^k h_i f_i(n)\right) \right| \le \frac{2x}{Q}. \end{aligned}$$

Proof

We fix a smooth function \(\phi :\mathbb {R}\rightarrow [0,1]\) with \(\phi (t)\) supported on \(|t|<1\) which is 1 on \(|t|<1/2\) and let all implied constants depend on \(\phi \). Let

$$\begin{aligned} \Phi _i(t)=\sum _{m\in \mathbb {Z}}\phi \left( \frac{t+m}{\epsilon _i}\right) , \end{aligned}$$

which is clearly 1-periodic, smooth, and supported on \(\Vert t\Vert _{\mathbb {R}/\mathbb {Z}}<\epsilon _i\). By Poisson summation

$$\begin{aligned} \Phi _i(t)=\epsilon _i\sum _{h\in \mathbb {Z}}\hat{\phi }(\epsilon _i h) e(h t). \end{aligned}$$

Since \(\phi \) is fixed and smooth, \(\phi ^{(j)}(t)\ll _j 1\), so \(|\hat{\phi }(u)|\ll _j |u|^{-j}\) for all \(j\ge 0\). Thus we see that the terms with \(|h|\ge \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) contribute \(O(\Delta ^{100})\), and so

$$\begin{aligned} \Phi _i(t)=\epsilon _i\sum _{|h|<\epsilon _i^{-1}\Delta ^{-1/(2k)^4}}\hat{\phi }(\epsilon _i h) e(h t)+O(\Delta ^{100}). \end{aligned}$$

Thus we find that (recalling \(\hat{\phi }(t)\ll 1\))

$$\begin{aligned}&\#\{n\le x:\, \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\forall i\}\ge \sum _{n\le x}\prod _{i=1}^k \Phi _i(f_i(n))\\&\quad =\Delta \sum _{\begin{array}{c} h_1,\dots h_k\\ |h_i|<\epsilon _i^{-1}\Delta ^{-1/(2k)^4} \end{array}}\left( \prod _{i=1}^k \hat{\phi }(\epsilon _i h_i) \right) \sum _{n\le x}e\left( \sum _{i=1}^k h_i f_i(n)\right) +O(x\Delta ^{99})\\&\quad =x\Delta \hat{\phi }(0)^k+O\left( \Delta \sum _{\begin{array}{c} \mathbf {h}\in \mathbb {Z}^k\backslash \{\mathbf {0}\}\\ |h_i|<\epsilon _i^{-1}\Delta ^{-1/(2k)^4} \end{array}}\left| \sum _{n\le x}e\left( \sum _{i=1}^k h_i f_i(n)\right) \right| \right) +O(x\Delta ^{99}). \end{aligned}$$

For \(\Delta \) sufficiently small we see that \(\Delta \hat{\phi }(0)^k+O(\Delta ^{99})\gg \Delta \), and so either

$$\begin{aligned} \#\{n\le x:\, \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \epsilon _i\forall i\}\gg \Delta x \end{aligned}$$

or

$$\begin{aligned} \sum _{\begin{array}{c} \mathbf {h}\in \mathbb {Z}^k\backslash \{\mathbf {0}\}\\ |h_i|<\epsilon _i^{-1}\Delta ^{-1/(2k)^4} \end{array}}\left| \sum _{n\le x}e\left( \sum _{i=1}^k h_i f_i(n)\right) \right| \gg x. \end{aligned}$$

In the latter case, by the pigeonhole principle there is some \(Q=2^j\) such that there are at least \(Q^{1/2}\) choices of \(\mathbf {h}\) in the outer summation such that

$$\begin{aligned} \frac{x}{Q}\le \left| \sum _{n\le x}e\left( \sum _{i=1}^k h_i f_i(n)\right) \right| \le \frac{2x}{Q}. \end{aligned}$$

This gives the result. \(\square \)
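The quantity counted in conclusion (1) of Lemma 5.3 is easy to compute directly for small examples. The sketch below is an illustration only: the names `dist_to_nearest_int` and `count_small` are hypothetical, polynomials are assumed to be given by their coefficient lists \([c_1,\dots ,c_d]\) (no constant term), and exact rationals are used in place of real coefficients.

```python
from fractions import Fraction


def dist_to_nearest_int(t):
    """Distance from a rational t to the nearest integer."""
    frac = t - (t.numerator // t.denominator)  # fractional part in [0, 1)
    return min(frac, 1 - frac)


def count_small(polys, eps, x):
    """Count n <= x with ||f_i(n)|| <= eps_i for every i.

    polys[i] is the coefficient list [c_1, ..., c_d] of
    f_i(n) = c_1*n + ... + c_d*n^d (an assumed encoding).
    """
    count = 0
    for n in range(1, x + 1):
        values = (
            sum(c * n ** (j + 1) for j, c in enumerate(coeffs))
            for coeffs in polys
        )
        if all(dist_to_nearest_int(v) <= e for v, e in zip(values, eps)):
            count += 1
    return count


if __name__ == "__main__":
    polys = [[Fraction(1, 4)], [0, Fraction(1, 3)]]  # n/4 and n^2/3
    eps = [Fraction(1, 4), Fraction(1, 10)]
    print(count_small(polys, eps, 12))
```

Comparing such a count with \(\Delta x\) for various \(x\) gives a rough feel for when the first alternative of the lemma holds.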

Proof of Proposition 4.1

Assume that conclusion (1) of Proposition 4.1 does not hold, so that we wish to establish conclusion (2). By Lemma 5.3, there is a parameter \(Q_1\) such that there are \(Q_1^{1/2}\) choices of \(\mathbf {h}\) for which the corresponding exponential sum is large (of size \(\gg x/Q_1\)). Since the total number of choices of \(\mathbf {h}\) is \(O(\Delta ^{-1-k/(2k)^4})\), we must have \(Q_1\ll \Delta ^{-2-2k/(2k)^4}\), and so if \(\Delta ^{-1}<x^{1/B}\) for B sufficiently large in terms of d, then \(Q_1<x^{1/C_d''}\). We can then apply Lemma 5.2, which shows that each of these values of \(\mathbf {h}=(h_1,\dots ,h_k)\) then gives rise to a linear equation

$$\begin{aligned} \sum _{i=1}^k h_if_{i,j}=\frac{a_j}{q_j}+O\left( \frac{Q_1^{C_d}}{x^j}\right) \end{aligned}$$

with \((a_j,q_j)=1\) and \(q_j\le Q_1^{C_d''}\). Letting \(Q=Q_1^{C_d''}\) and taking \(C_d\) sufficiently large compared with \(C_d''\) then gives the result. \(\square \)

6 Structure in the Large Fourier Coefficients and Proposition 4.2

In this section we prove Proposition 4.2 by showing many different linear relations with small denominators must give rise to several relations with the same denominator.

Lemma 6.1

(Expansion or same denominators). Let \(\delta \in (0,1/200)\) and let r be a positive integer. Let \(Q>0\) be large enough in terms of \(\delta \) and r, and let \(\mathcal {S}\subset \mathbb {Z}\times \mathbb {Z}\times \mathbb {Z}^k\) be a set of triples \((a,q,\mathbf {h})\) with \(\gcd (a,q)=1\) and \(q\le Q\) such that \(\#\mathcal {S}\ge Q^\delta \). Then one of the following holds:

  1. (1)

    There is a \(q_0\le Q\) such that at least \(\#\mathcal {S}^{1/2}\) of the triples \((a,q,\mathbf {h})\in \mathcal {S}\) have \(q=q_0\).

  2. (2)

    The set

    $$\begin{aligned} \mathcal {A}= & {} \left\{ \frac{a_1}{q_1}+\dots +\frac{a_r}{q_r}:\,\text {there exists }\mathbf {h}_1,\dots ,\mathbf {h}_r\right. \\&\left. \in \mathbb {Z}^k\text { s.t. }(a_i,q_i,\mathbf {h}_i)\in \mathcal {S}\text { for }1\le i\le r\right\} \end{aligned}$$

    has cardinality at least \(\#\mathcal {S}^{r/5}\).

Proof

Throughout the lemma we will assume that Q is large enough in terms of \(\delta \) and r without further comment. We first restrict our attention to a suitable subset of the q’s appearing in \(\mathcal {S}\). For \(j=0,1,\dots \) let

$$\begin{aligned} \mathcal {B}_j=\left\{ q\in [2^j,2^{j+1}):\,\exists \, (a,\mathbf {h})\in \mathbb {Z}\times \mathbb {Z}^k\text { with }(a,q,\mathbf {h})\in \mathcal {S}\right\} . \end{aligned}$$

Clearly \(\mathcal {B}_j\) is empty if \(j>2\log {Q}\) since if \((a,q,\mathbf {h})\in \mathcal {S}\) then \(q\le Q\).

If

$$\begin{aligned} \#\{q:\,\exists (a,\mathbf {h})\in \mathbb {Z}\times \mathbb {Z}^k\text { with }(a,q,\mathbf {h})\in \mathcal {S}\}=\sum _{2^j\le Q}\#\mathcal {B}_{j}\le \#\mathcal {S}^{1/2} \end{aligned}$$

then (by the pigeonhole principle) there is a \(q\le Q\) such that there are at least \(\#\mathcal {S}^{1/2}\) choices of \((a,\mathbf {h})\) with \((a,q,\mathbf {h})\in \mathcal {S}\), since there are this many on average. Thus condition (1) is satisfied in this case.

Thus we may assume that

$$\begin{aligned} \sum _{j}\#\mathcal {B}_{j}>\#\mathcal {S}^{1/2}\ge Q^{\delta /2}, \end{aligned}$$

and so there is some \(j_0\le 2\log {Q}\) such that \(\#\mathcal {B}_{j_0}>\#\mathcal {S}^{2/5}\). Note that we must have \(2^{j_0}>Q^{\delta /3}\) from the trivial bound \(\#\mathcal {B}_j\le 2^{j}\).

If there is an integer d which divides at least \(\#\mathcal {B}_{j_0}/d^{\delta /10}\) elements of \(\mathcal {B}_{j_0}\), we restrict our attention to this subset. By performing this repeatedly, we may assume that there is a fixed integer \(d_0\) and a set \(\mathcal {B}_{j_0}'\subseteq \mathcal {B}_{j_0}\) such that \(\#\mathcal {B}_{j_0}'\ge \#\mathcal {B}_{j_0}/d_0^{\delta /10}\), all elements of \(\mathcal {B}'_{j_0}\) are multiples of \(d_0\), and there is no integer \(\ell >1\) such that at least \(\#\mathcal {B}_{j_0}'/\ell ^{\delta /10}\) elements of \(\mathcal {B}'_{j_0}\) are multiples of \(d_0\ell \). Since we must have \(d_0\le 2^{j_0+1}\le 2Q\), we see that \(\#\mathcal {B}_{j_0}'\ge \#\mathcal {S}^{2/5}/Q^{\delta /10}\ge \#\mathcal {S}^{1/4}\). Since \(\mathcal {B}_{j_0}'\subseteq \{b\in [2^{j_0},2^{j_0+1}):\,d_0|b\}\), which is a set of size \(O(2^{j_0}/d_0)\), we see this also implies that \(d_0<2^{j_0}/Q^{\delta /4}\). Finally, we let

$$\begin{aligned} \mathcal {B}=\{b:d_0b\in \mathcal {B}'_{j_0}\}, \end{aligned}$$

and note that \(\mathcal {B}\subseteq [B,2B)\) where we have set \(B:=2^{j_0}/d_0\). The above discussion implies that \(\#\mathcal {B}\ge \#\mathcal {S}^{1/4}\), that \(B\in [\#\mathcal {S}^{1/4},Q]\), and that there is no integer \(\ell >1\) such that \(\ell \) divides at least \(\#\mathcal {B}/\ell ^{\delta /10}\) elements of \(\mathcal {B}\).

We now wish to show that if we fix a choice of integers a(b) for \(b\in \mathcal {B}\) satisfying \(\gcd (a(b),d_0 b)=1\), then as \((b_1,\dots ,b_r)\) varies in \(\mathcal {B}^r\), many of the sums

$$\begin{aligned} \frac{a(b_1)}{d_0b_1}+\dots +\frac{a(b_r)}{d_0b_r} \end{aligned}$$

have different denominators when written as a single fraction in reduced terms, and so in particular many of the expressions are distinct. If all of \(d_0,b_1,\dots ,b_r\) were pairwise coprime then the denominator would be \(d_0b_1\cdots b_r\), and by the divisor bound there are few different choices of \(b_1,\dots ,b_r\) which would give the same denominator. Instead we are in the situation where the \(b_i\)’s are ‘close’ to coprime, since we expect they typically have small \(\gcd \)s by construction of \(\mathcal {B}\).

Consider the graph \(G=(\mathcal {V},\mathcal {E})\) where the vertex set \(\mathcal {V}\) is taken to be \(\mathcal {B}\), and the edge set \(\mathcal {E}\) is defined by

$$\begin{aligned} \mathcal {E}=\{(b_1,b_2)\in \mathcal {B}^2:\,\gcd (b_1,b_2)\ge B^{\delta ^2/r^2}\}. \end{aligned}$$

We consider separately two cases.

Case 1: \(\#\mathcal {E}\ge \#\mathcal {V}^2/10r^2\).

In this case there are many pairs with a large \(\gcd \). If we pick a vertex v in G at random, then the expected number of vertices connected to v is at least \(\#\mathcal {V}/10 r^2\), and so (by the pigeonhole principle) there is some \(b_0\in \mathcal {B}\) such that there are at least \(\#\mathcal {B}/10r^2\) elements \(b\in \mathcal {B}\) with \(\gcd (b,b_0)\ge B^{\delta ^2/r^2}\). Since there are at most \(B^{o(1)}\) divisors of \(b_0\), there must be a divisor \(d\ge B^{\delta ^2/r^2}\) such that \(d|b\) for at least \(\#\mathcal {B}/(10r^2 B^{o(1)})>\#\mathcal {B}/d^{\delta /10}\) elements \(b\in \mathcal {B}\). But this contradicts the fact that \(\mathcal {B}\) is constructed to have no such integers. Thus we must instead have \(\#\mathcal {E}<\#\mathcal {V}^2/10r^2\).

Case 2: \(\#\mathcal {E}<\#\mathcal {V}^2/10r^2\).

In this case the edge density is small, and so a large number of pairs have a very small \(\gcd \). If we pick r distinct vertices in G uniformly at random, then the expected number of edges between these vertices is less than 1/9. In particular, the probability that there are no edges between any of the r chosen vertices is at least 8/9 (by Markov’s inequality). Thus, if we define

$$\begin{aligned} \mathcal {C}=\left\{ (b_1,\dots ,b_r)\in \mathcal {B}^r:\, \gcd (b_i,b_j)<B^{\delta ^2/r^2}\text { for }1\le i< j\le r\right\} , \end{aligned}$$

then \(\#\mathcal {C}\gg _r \#\mathcal {B}^r\).

We now consider the possible denominators of rationals of the form \(a_1/b_1+\dots +a_r/b_r\) where \((b_1,\dots ,b_r)\in \mathcal {C}\). Given \((b_1,\dots ,b_r)\in \mathcal {C}\), let

$$\begin{aligned} \mathcal {R}(b_1,\dots ,b_r)&=\left\{ (b_1',\dots ,b_r')\in \mathcal {C}:\,\exists \,a_1,\dots ,a_r,a_1',\dots ,a_r'\text { s.t. }\right. \\&\left. \gcd (a_i,d_0b_i)=\gcd (a_i',d_0b_i')=1\,\forall i, \frac{a_1}{b_1}+\dots +\frac{a_r}{b_r}=\frac{a_1'}{b_1'}+\dots +\frac{a_r'}{b_r'}\right\} . \end{aligned}$$

We note that for any choice of \(a_1,\dots ,a_r\) with \(\gcd (a_i,b_i)=1\), the denominator of \(a_1/b_1+\dots + a_r/b_r\) is a multiple of \(p^\ell \) if \(p^\ell \) divides exactly one of \(b_1,\dots ,b_r\) and \(p^{\ell +1}\) divides none of them. Let \(\gcd (b,p^\infty )\) denote the largest power of p dividing \(b>1\), and \(\gcd (b_i,b_j,p^\infty )\) the largest power of p dividing both \(b_i\) and \(b_j\). We now define

$$\begin{aligned} g_p:=\frac{\prod _{i=1}^r \gcd (b_i,p^\infty )}{\prod _{1\le i<j\le r}\gcd (b_i,b_j,p^\infty )^2}. \end{aligned}$$

We see that \(g_p\le p^\ell \) if \(p^\ell \) divides exactly one of \(b_1,\dots ,b_r\) and \(p^{\ell +1}\) divides none of them. Similarly, \(g_p\le 1\) if \(p^\ell \) divides at least 2 of the \(b_i\) but \(p^{\ell +1}\) divides none of them. (If \(b_j\) maximizes \(\gcd (b_j,p^\infty )\), then \(\gcd (b_j,b_i,p^\infty )=\gcd (b_i,p^\infty )\).) Taking the product over all p, we see that for any choice of \(a_1,\dots ,a_r\) with \((a_i,b_i)=1\), the denominator of \(a_1/b_1+\dots +a_r/b_r\) must be of size at least \(\prod _p g_p\). However, if \((b_1,\dots ,b_r)\in \mathcal {C}\) then all pairwise \(\gcd \)’s are small. Therefore, (regardless of \(a_1,\dots ,a_r\)) the denominator must be of size at least

$$\begin{aligned} \prod _p g_p=\frac{\prod _{i=1}^r b_i}{\prod _{1\le i<j\le r}\gcd (b_i,b_j)^2}\ge B^{r-2\delta ^2}. \end{aligned}$$

Moreover, any such denominator is clearly of size \(O(B^r)\). Thus, given \((b_1,\dots ,b_r)\in \mathcal {C}\), there are \(O(B^{2\delta ^2})\) possible denominators for \(a_1/b_1+\dots +a_r/b_r\). Given such a denominator \(q>B^{r-2\delta ^2}\), if the denominator of \(a_1'/b_1'+\dots +a_r'/b_r'\) is also equal to q then q must divide \(\prod _{i=1}^r b'_i\). Thus there are \(O(B^{2\delta ^2})\) such choices of \(\prod _{i=1}^rb_i'\ll B^r\) given q, and so \(O(B^{2\delta ^2+o(1)})\) choices of \(b_1',\dots ,b_r'\) (using the divisor bound). Hence for any choice of \((b_1,\dots ,b_r)\in \mathcal {C}\) there are at most \(B^{5\delta ^2}\) choices of \((b_1',\dots ,b_r')\) in total, and so \(\#\mathcal {R}(b_1,\dots ,b_r)\le B^{5\delta ^2}\).

For each \(b\in \mathcal {B}\), let a(b) be an integer coprime to \(d_0b\) such that \((a(b),d_0b,\mathbf {h})\in \mathcal {S}\) for some \(\mathbf {h}\). (This exists from the definition of \(\mathcal {B}\).) We now note that given \((b_1,\dots ,b_r)\in \mathcal {C}\) the rational \(a(b_1)/d_0b_1+\dots + a(b_r)/d_0b_r\) occurs for at most \(\#\mathcal {R}(b_1,\dots ,b_r)\) elements of \(\mathcal {C}\). Thus

$$\begin{aligned} \#\mathcal {A}&\ge \#\left\{ \frac{a(b_1)}{d_0b_1}+\dots +\frac{a(b_r)}{d_0b_r}:\,(b_1,\dots ,b_r) \in \mathcal {C}\right\} \\&\ge \sum _{(b_1,\dots ,b_r)\in \mathcal {C}}\frac{1}{\#\mathcal {R}(b_1,\dots ,b_r)}\\&\ge \frac{\#\mathcal {C}}{B^{5\delta ^2}}\ge \frac{\#\mathcal {B}^r}{Q^{6\delta ^2}}. \end{aligned}$$

Recalling that \(r\ge 1>200\delta \) and \(\#\mathcal {B}>\#\mathcal {S}^{1/4}\ge Q^{\delta /4}\), this gives condition (2), as required. \(\square \)
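The key arithmetic input in Case 2, namely that the reduced denominator of \(a_1/b_1+\dots +a_r/b_r\) is at least \(\prod _i b_i/\prod _{i<j}\gcd (b_i,b_j)^2\) whenever \(\gcd (a_i,b_i)=1\), can be checked by brute force for small moduli. The following sketch is only a sanity check under that reading of the proof; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product
from math import gcd, prod


def denominator_lower_bound(bs):
    """prod(b_i) / prod_{i<j} gcd(b_i, b_j)^2 as an exact rational."""
    num = prod(bs)
    den = prod(gcd(bi, bj) ** 2 for bi, bj in combinations(bs, 2))
    return Fraction(num, den)


def min_denominator(bs):
    """Smallest reduced denominator of sum a_i/b_i over a_i coprime to b_i.

    Only residues a_i mod b_i matter, so a finite search suffices.
    """
    choices = [[a for a in range(1, b + 1) if gcd(a, b) == 1] for b in bs]
    return min(
        sum(Fraction(a, b) for a, b in zip(numerators, bs)).denominator
        for numerators in product(*choices)
    )


if __name__ == "__main__":
    for bs in [(4, 6), (12, 18), (8, 9, 25), (6, 10, 15)]:
        print(bs, min_denominator(bs), denominator_lower_bound(bs))
```

When the \(b_i\) are pairwise coprime the bound is attained with equality, which matches the heuristic described before the graph \(G\) is introduced.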

Lemma 6.2

(Many linear relations must have the same denominator). Let \(\delta \in (0,1/200)\), let \(k\ge 2\) be a positive integer and let \(\alpha _1,\dots ,\alpha _k\in [0,1)\). Let Q be large enough in terms of \(\delta \) and k, and let \(\epsilon _1,\dots ,\epsilon _k\in (0,1]\) be such that \(\Delta =\prod _{i=1}^k\epsilon _i\) satisfies \(Q^{10\delta }\le \Delta ^{-1}\le Q^{(2k)^4}\).

Let \(\mathcal {S}\subset \mathbb {Z}\times \mathbb {Z}\times \mathbb {Z}^k\) be a set of triples \((a,q,\mathbf {h})\) with \(\gcd (a,q)=1\), \(q\le Q\) and \(|h_i|\le \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) such that \(\#\mathcal {S}\ge Q^\delta \) and such that if \((a,q,\mathbf {h})\in \mathcal {S}\) then

$$\begin{aligned} \left| h_1\alpha _1+\dots +h_k\alpha _k-\frac{a}{q}\right| \le \left( \prod _{i=1}^k \epsilon _i\right) ^{100/\delta }. \end{aligned}$$

Then there is a \(q_0\le Q\) such that at least \(\#\mathcal {S}^{1/2}\) of the triples \((a,q,\mathbf {h})\in \mathcal {S}\) have \(q=q_0\).

Proof

Choose an integer r such that \(Q^{\delta r/20}>\prod _{i=1}^k\epsilon _i^{-1}\ge Q^{\delta r/30}\). We see that such an r must exist and satisfy \(r\in [20,30(2k)^4/\delta ]\) from our bounds on \(\prod _{i=1}^k\epsilon _i^{-1}\) in terms of Q. In particular, we may assume that Q is sufficiently large in terms of r. If \((a_1,q_1,\mathbf {h}_1),\dots ,(a_r,q_r,\mathbf {h}_r)\in \mathcal {S}\) then we have for \(1\le j\le r\)

$$\begin{aligned} \sum _{i=1}^k\alpha _i(\mathbf {h}_j)_i=\frac{a_j}{q_j}+O\left( \left( \prod _{i=1}^k \epsilon _i\right) ^{100/\delta }\right) =\frac{a_j}{q_j}+O(Q^{-10 r/3}). \end{aligned}$$

Adding these together (and recalling that Q is sufficiently large so \(r Q^{-r/3}\le 1\)) gives

$$\begin{aligned} \frac{a_1}{q_1}+\dots +\frac{a_r}{q_r}+O(Q^{-3 r})=\sum _{i=1}^k\alpha _i\tilde{h}_i, \end{aligned}$$

where \(\tilde{h}_i=\sum _{j=1}^r(\mathbf {h}_j)_i\). Since the denominator of \(a_1/q_1+\dots +a_r/q_r\) when written as a single fraction is at most \(Q^r\), we see that this fraction is uniquely determined by the integers \(\tilde{h}_1,\dots ,\tilde{h}_k\), since it is the best rational approximation to \(\sum _{i=1}^k\alpha _i\tilde{h}_i\) with denominator at most \(Q^r\). But \(|\tilde{h}_i|\ll r\epsilon _i^{-1}\Delta ^{-1/(2k)^4}\), and so we find

$$\begin{aligned}&\#\left\{ \frac{a_1}{q_1}+\dots +\frac{a_r}{q_r}:\,\exists \mathbf {h}_1,\dots ,\mathbf {h}_r\text { s.t. }(a_1,q_1,\mathbf {h}_1),\dots ,(a_r,q_r,\mathbf {h}_r)\in \mathcal {S}\right\} \\&\quad \le \#\{(\tilde{h}_1,\dots ,\tilde{h}_k)\in \mathbb {Z}^k:\,|\tilde{h}_i|\ll r\epsilon _i^{-1}\Delta ^{-1/(2k)^4}\}\\&\quad \le \left( \prod _{i=1}^k\epsilon _i^{-1}\right) ^{2}\\&\quad \le Q^{\delta r/10}. \end{aligned}$$

Here we used the fact that \(r\le \delta ^{-1}\log (\prod _{i=1}^k\epsilon _i^{-1})\) and \(\prod _{i=1}^k\epsilon _i^{-1}\ge Q^{10\delta }\) can be assumed to be sufficiently large in terms of \(\delta \) and k.

We see that our situation satisfies all the hypotheses of Lemma 6.1, but the above bound is incompatible with the bound of case (2) in Lemma 6.1, since in our situation case (2) would imply that

$$\begin{aligned} \#\left\{ \frac{a_1}{q_1}+\dots +\frac{a_r}{q_r}:\,\exists \mathbf {h}_1,\dots ,\mathbf {h}_r\text { s.t. }(a_1,q_1,\mathbf {h}_1),\dots ,(a_r,q_r,\mathbf {h}_r)\in \mathcal {S}\right\} \ge Q^{\delta r/5}. \end{aligned}$$

Thus, case (1) of Lemma 6.1 must hold; there must be a \(q_0\le Q\) such that at least \(\#\mathcal {S}^{1/2}\) of the triples \((a,q,\mathbf {h})\in \mathcal {S}\) have \(q=q_0\). \(\square \)
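The uniqueness step in the proof above rests on the fact that two distinct rationals with denominators at most \(D\) differ by at least \(1/D^2\), so a sufficiently accurate approximation determines the fraction. A small sketch of this (the helper names are hypothetical; `Fraction.limit_denominator` from the Python standard library returns the closest fraction with bounded denominator):

```python
from fractions import Fraction


def min_gap(D: int) -> Fraction:
    """Smallest gap between distinct fractions in [0, 1] with denominator <= D."""
    fracs = sorted({Fraction(a, q) for q in range(1, D + 1) for a in range(q + 1)})
    return min(right - left for left, right in zip(fracs, fracs[1:]))


def recover_fraction(value: Fraction, max_den: int) -> Fraction:
    """Closest fraction to `value` with denominator at most `max_den`."""
    return value.limit_denominator(max_den)


if __name__ == "__main__":
    D = 7
    print(min_gap(D), min_gap(D) >= Fraction(1, D * D))
    noisy = Fraction(3, 7) + Fraction(1, 1000)  # perturbation well below 1/(2*D^2)
    print(recover_fraction(noisy, 10))
```

In the lemma the role of \(D\) is played by \(Q^r\) and the perturbation is \(O(Q^{-3r})\), which is far smaller than the gap \(Q^{-2r}\).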

Lemma 6.3

(Many systems of linear relations must have the same denominators). Let \(\delta \in (0,1/200)\), let \(k,d\ge 2\) be positive integers and let \(\alpha _{i,j}\in [0,1)\) for \(1\le i\le k\), \(1\le j\le d\) be reals. Let Q be large enough in terms of \(\delta ,d\) and k, and let \(\epsilon _1,\dots ,\epsilon _k\in (0,1]\) be such that \(\Delta =\prod _{i=1}^k\epsilon _i\) satisfies \(Q^{10\delta }\le \Delta ^{-1}\le Q^{(2k)^4}\).

Let \(\mathcal {S}\subset \mathbb {Z}^d\times \mathbb {Z}^d\times \mathbb {Z}^k\) be a set of triples \((\mathbf {a},\mathbf {q},\mathbf {h})\) satisfying:

  1. (1)

    \(\gcd (a_j,q_j)=1\) and \(q_j\le Q\) for \(j\in \{1,\dots ,d\}\).

  2. (2)

    \(|h_i|\le \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) for \(i\in \{1,\dots ,k\}\).

  3. (3)

    \(\#\mathcal {S}\ge Q^{2^d \delta }\).

  4. (4)

    For each \(j\in \{1,\dots ,d\}\) we have

    $$\begin{aligned} \left| h_1\alpha _{1,j}+\dots +h_k\alpha _{k,j}-\frac{a_j}{q_j}\right| \le \left( \prod _{i=1}^k\epsilon _i\right) ^{100/\delta }. \end{aligned}$$

Then there is a \(\mathbf {q}_0\in \mathbb {Z}^d\) such that at least \(\#\mathcal {S}^{1/2^d}\) of the triples \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}\) have \(\mathbf {q}=\mathbf {q}_0\).

Proof

This follows from d applications of Lemma 6.2. Given \(\mathcal {S}'\subseteq \mathcal {S}\), for \(j\in \{1,\dots ,d\}\) let

$$\begin{aligned} \pi _j(\mathcal {S}')=\{(a,q,\mathbf {h}):\,\exists \,\mathbf {a},\mathbf {q}\text { s.t. }(\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}'\text { and }a_j=a,\, q_j=q\}. \end{aligned}$$

We note that

$$\begin{aligned} \left( \prod _{i=1}^k\epsilon _i\right) ^{100/\delta }\le Q^{-10}, \end{aligned}$$

and so given \(\mathbf {h}\in \mathbb {Z}^k\) there is at most one choice of \(\mathbf {a},\mathbf {q}\in \mathbb {Z}^d\) such that \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}\), since \(a_j/q_j\) is the best rational approximation with denominator at most Q to \(h_1\alpha _{1,j}+\dots +h_k\alpha _{k,j}\) if there is any \(\mathbf {a},\mathbf {q}\) such that \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}\) (recall that we must have \(\gcd (a_j,q_j)=1\) if \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}\)). In particular, \(\#\pi _j(\mathcal {S}')=\#\mathcal {S}'\) for all j for any set \(\mathcal {S}'\subseteq \mathcal {S}\).

Given \(\mathcal {S}'\subseteq \mathcal {S}\), let \(\ell _j(\mathcal {S}')\) be an integer maximizing

$$\begin{aligned} \#\{(a,\mathbf {h}):\,(a,\ell ,\mathbf {h})\in \pi _j(\mathcal {S}')\} \end{aligned}$$

over all choices of \(\ell \in \mathbb {Z}\) (if there are multiple possibilities we make an arbitrary choice of one). If \(\#\pi _j(\mathcal {S}')\ge Q^{\delta }\), then by Lemma 6.2 at least \(\#\pi _j(\mathcal {S}')^{1/2}\) triples \((a,q,\mathbf {h})\in \pi _j(\mathcal {S}')\) have \(q=\ell _{j}(\mathcal {S}')\). We now let \(\mathcal {S}_0=\mathcal {S}\), and define \(\mathcal {S}_1\supseteq \dots \supseteq \mathcal {S}_d\) in turn, by

$$\begin{aligned} \mathcal {S}_j:=\{(\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}_{j-1}:\,q_j= \ell _j(\mathcal {S}_{j-1})\}. \end{aligned}$$

Since \(\#\mathcal {S}\ge Q^{2^d\delta }\), we see that \(\#\mathcal {S}_{j}\ge \#\mathcal {S}_{j-1}^{1/2}\ge Q^{2^{d-j}\delta }\ge Q^\delta \) for \(j\in \{1,\dots ,d\}\) by repeatedly applying Lemma 6.2. In particular, we have \(\#\mathcal {S}_d\ge \#\mathcal {S}^{1/2^d}\). Finally, we note that \(\mathcal {S}_d\) is the set of triples \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathcal {S}\) such that

$$\begin{aligned} \mathbf {q}=\mathbf {q}_0:=(\ell _{1}(\mathcal {S}_0),\ell _{2}(\mathcal {S}_1),\dots , \ell _{d}(\mathcal {S}_{d-1})), \end{aligned}$$

and so we have the result. \(\square \)
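The nested construction \(\mathcal {S}_0\supseteq \mathcal {S}_1\supseteq \dots \supseteq \mathcal {S}_d\) is a coordinate-by-coordinate "most popular value" refinement. The following sketch shows only this combinatorial skeleton under simplifying assumptions: each triple is represented by its denominator vector \(\mathbf {q}\) alone (repeated entries stand for different \(\mathbf {h}\)), and the function name is hypothetical. The square-root loss at each step comes from Lemma 6.2 and is not something the code proves.

```python
from collections import Counter


def refine_to_common_denominators(qs):
    """Restrict to the most popular value of q_1, then of q_2, and so on.

    qs: non-empty list of d-tuples of denominators.
    Returns (q0, survivors), where every survivor equals q0.
    """
    survivors = list(qs)
    q0 = []
    d = len(qs[0])
    for j in range(d):
        # ell_j(S_{j-1}): a most frequent j-th denominator among survivors
        most_common, _ = Counter(q[j] for q in survivors).most_common(1)[0]
        q0.append(most_common)
        survivors = [q for q in survivors if q[j] == most_common]
    return tuple(q0), survivors


if __name__ == "__main__":
    qs = [(2, 3), (2, 5), (2, 3), (7, 3), (2, 3)]
    print(refine_to_common_denominators(qs))
```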

Proof of Proposition 4.2

By assumption, there is some \(Q\le (\prod _{i=1}^k\epsilon _i)^{-C}\) such that there are at least \(Q^{1/C}\) triples \((\mathbf {a},\mathbf {q},\mathbf {h})\in \mathbb {Z}^d\times \mathbb {Z}^d\times \mathbb {Z}^k\) with \(\gcd (a_j,q_j)=1\) and \(q_j\le Q\) for \(j\in \{1,\dots ,d\}\), and with \(h_i\ll \epsilon _i^{-1}\Delta ^{-1/(2k)^4}\) for \(i\in \{1,\dots ,k\}\), and with

$$\begin{aligned} \sum _{i=1}^k h_if_{i,j}=\frac{a_j}{q_j}+O\left( \frac{Q^{C}}{x^{j}}\right) . \end{aligned}$$

If \(Q\le \Delta ^{-1/(2k)^4}\) then we just take one such triple \((\mathbf {a},\mathbf {q},\mathbf {h})\). In this case the triples \((j\mathbf {a},j\mathbf {q},j\mathbf {h})\) for \(j\in \{1,\dots ,Q\}\) then give Q relations of the desired type provided \(C_d'>C+1\), since \(jh_i\ll Q\epsilon _i^{-1}\Delta ^{-1/(2k)^4}\ll \epsilon _i^{-1}\Delta ^{-2/(2k)^4}\). Thus we may assume that \(Q^{(2k)^4}>\prod _{i=1}^k\epsilon _i^{-1}\).

We now apply Lemma 6.3 with \(\delta =(10\cdot 2^d\cdot C)^{-1}\). Provided \(C_d'>100/\delta +C^2\) we see the bounds \(Q\le (\prod _{i=1}^k\epsilon _i^{-1})^{C}\) and \((\prod _{i=1}^k\epsilon _i^{-1})^{C'_d}<x\) imply that we have \(Q^{C}/x^d<(\prod _{i=1}^k\epsilon _i)^{100/\delta }\), and so all the hypotheses of Lemma 6.3 are satisfied. This shows that there is a \(\mathbf {q}_0\in \mathbb {Z}^d\) such that at least \(Q^{\delta }\) of the triples \((\mathbf {a},\mathbf {q},\mathbf {h})\) have \(\mathbf {q}=\mathbf {q}_0\). Thus these all give rise to a rational \(a_j/q\) where \(q=\prod _{i=1}^d(\mathbf {q}_0)_i\). This gives the result. \(\square \)

7 Dimension Reduction via Geometry of Numbers and Proposition 4.3

In this section we prove Proposition 4.3 using estimates from the geometry of numbers, thereby completing the proof of Theorem 1.1.

Lemma 7.1

(Many relations give rise to orthogonal generators). Let \(\eta >0\) be sufficiently small in terms of k and d. Let \(B_1,\dots ,B_k>1\) satisfy \(\prod _{i=1}^kB_i\le \eta ^{-1/2}\), and \(\beta _{i,j}\in \mathbb {R}\) for \(1\le i\le k\), \(1\le j\le d\). Let \(\mathcal {R}\) be the region in \(\mathbb {R}^{k+d}\) defined by

$$\begin{aligned}&\mathcal {R}=\left\{ (h_1,\dots ,h_k,a_1,\dots ,a_d)\in \mathbb {R}^{k+d}:\,\left| \sum _{i=1}^k h_i \beta _{i,j}-a_j\right| \le \eta ^j\text { for }1\le j\le d,\right. \\&\quad \left. |h_i|\le B_i\text { for }1\le i\le k\right\} , \end{aligned}$$

and assume that \(\#(\mathcal {R}\cap \mathbb {Z}^{k+d})= N\), with N sufficiently large in terms of k and d.

Then there is an integer \(r\in \{1,\dots ,k\}\) and vectors \(\mathbf {h}^{(1)},\dots ,\mathbf {h}^{(r)}\in \mathbb {Z}^k\) and \(\mathbf {a}^{(1)},\dots ,\mathbf {a}^{(r)}\in \mathbb {Z}^d\) such that:

  1. (1)

    (The \(\mathbf {h}^{(j)},\mathbf {a}^{(j)}\) are a system of Diophantine approximations.) For each \(j\in \{1,\dots ,r\}\) the vector \((h^{(j)}_1,\dots ,h^{(j)}_k,a_1^{(j)},\dots ,a_d^{(j)})\) lies in \(\mathcal {R}\cap \mathbb {Z}^{k+d}\).

  2. (2)

    (The \(\mathbf {h}^{(j)}\) are quasi-orthogonal after rescaling.) Let \(\tilde{h}^{(j)}_i=h^{(j)}_i/B_i\) for \(1\le j\le r\), \(1\le i\le k\). Then we have

    $$\begin{aligned} \Vert \tilde{\mathbf {h}}^{(1)}\wedge \dots \wedge \tilde{\mathbf {h}}^{(r)}\Vert \asymp \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty . \end{aligned}$$
  3. (3)

    (The \(\mathbf {h}^{(j)}\) generate many elements of \(\mathcal {R}\cap \mathbb {Z}^{k+d}\)). Let \(\tilde{\mathbf {h}}^{(j)}\) be as above. We have

    $$\begin{aligned} \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty \ll \frac{1}{N^{1/(d+1)}}. \end{aligned}$$

All implied constants depend at most on k and d.

We recall that \(\Vert \mathbf {h}^{(1)}\wedge \dots \wedge \mathbf {h}^{(r)}\Vert \) is the r-dimensional volume of the parallelepiped formed by the vectors \(\mathbf {h}^{(1)},\dots ,\mathbf {h}^{(r)}\), which is the (Euclidean) length of the vector in \(\mathbb {R}^{\left( {\begin{array}{c}k\\ r\end{array}}\right) }\) of all determinants of \(r\times r\) submatrices of the \(k\times r\) matrix with columns \(\mathbf {h}^{(1)},\dots ,\mathbf {h}^{(r)}\).
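For concreteness, the wedge norm just recalled can be computed directly from its definition. The sketch below is illustrative (the function names are hypothetical, and the Laplace-expansion determinant is only suitable for small matrices); it also compares the result with the Gram determinant \(\det (H^TH)\), which equals the squared wedge norm by the Cauchy–Binet formula.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def wedge_norm_squared(vectors):
    """Sum of squares of all r x r minors of the k x r matrix with the given columns."""
    r, k = len(vectors), len(vectors[0])
    return sum(
        det([[v[i] for v in vectors] for i in rows]) ** 2
        for rows in combinations(range(k), r)
    )


def gram_determinant(vectors):
    """det of the r x r matrix of inner products <h^(a), h^(b)>."""
    gram = [[sum(x * y for x, y in zip(u, v)) for v in vectors] for u in vectors]
    return det(gram)


if __name__ == "__main__":
    hs = [(1, 1, 0), (1, -1, 0)]
    print(wedge_norm_squared(hs), gram_determinant(hs))
```

Quasi-orthogonality in conclusion (2) says that this quantity is comparable to the product of the sup norms, as happens for the orthogonal pair in the example.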

Proof

After potentially permuting the \(B_i\) and \(\beta _{i,j}\), we may assume without loss of generality that \(B_1\ge B_2\ge \dots \ge B_k\). Let \(\Lambda \subset \mathbb {R}^{k+d}\) be the lattice

$$\begin{aligned}&\mathbb {Z}\left( \frac{1}{B_1}\mathbf {e}_1-\sum _{j=1}^d\frac{\beta _{1,j}}{\eta ^j} \mathbf {e}_{k+j}\right) +\dots +\mathbb {Z}\left( \frac{1}{B_k}\mathbf {e}_k-\sum _{j=1}^d \frac{\beta _{k,j}}{\eta ^j}\mathbf {e}_{k+j}\right) \\&\quad +\mathbb {Z}\frac{1}{\eta }\mathbf {e}_{k+1} +\dots +\mathbb {Z}\frac{1}{\eta ^d}\mathbf {e}_{k+d}, \end{aligned}$$

where \(\mathbf {e}_1,\dots ,\mathbf {e}_{k+d}\) are the standard basis vectors of \(\mathbb {Z}^{k+d}\). We see that elements of \(\mathcal {R}\cap \mathbb {Z}^{k+d}\) correspond to elements of \(\Lambda \) with all components bounded by 1 in absolute value. By standard lattice theory, there is a basis \(\mathbf {b}_1,\dots ,\mathbf {b}_{k+d}\) of \(\Lambda \) (see [May20, Lemma 4.1], for example) such that for any \(n_1,\dots ,n_{k+d}\in \mathbb {Z}\) we have

$$\begin{aligned} \left\| \sum _{i=1}^{k+d}n_i\mathbf {b}_i\right\| _\infty \asymp \sum _{i=1}^{k+d} |n_i| \Vert \mathbf {b}_i\Vert _\infty \asymp \sum _{i=1}^{k+d} |n_i|\lambda _i \end{aligned}$$

where \(0<\lambda _1\le \lambda _2\le \dots \le \lambda _{k+d}\) are the successive minima of \(\Lambda \) and the implied constants depend only on k and d.

If \(\lambda _1\le 1/N^{1/(d+1)}\) then the conclusion of the lemma is satisfied with \(r=1\) and \(h^{(1)}_i=(\mathbf {b}_1)_i B_i\) for \(1\le i \le k\) and \(a_j^{(1)}=\eta ^j(\mathbf {b}_1)_{k+j}+\sum _{i=1}^k (\mathbf {b}_1)_i\beta _{i,j}B_i\) for \(1\le j\le d\). (Such a vector \(\mathbf {h}^{(1)}\) is non-zero since if \(\mathbf {b}_1\) were 0 in the first k coordinates it would have norm at least \(1/\eta >1\).) Thus we may assume that \(\lambda _1>1/N^{1/(d+1)}\).

Recall that \(B_1\cdots B_k<\eta ^{-1/2}\) and \(\lambda _1\cdots \lambda _{k+d}\asymp \det (\Lambda )\). We have that

$$\begin{aligned} \frac{1}{\det (\Lambda )}&=\text {vol}\left\{ \mathbf {t}\in \mathbb {R}^{d+k}:\,\left\| \sum _{i=1}^k \left( \frac{t_i}{B_i}\mathbf {e}_i+\sum _{j=1}^d\frac{t_i\beta _{i,j}}{\eta ^j} \mathbf {e}_{k+j}\right) -\sum _{i=1}^{d}\frac{t_{k+i}}{\eta ^i}\mathbf {e}_{k+i} \right\| _\infty \le 1\right\} \\&=B_1\cdots B_k\eta ^{d(d+1)/2}\\&\le \eta ^{d^2/2}. \end{aligned}$$

In particular, \(\lambda _1\cdots \lambda _{k+d}\gg \eta ^{-d^2/2}>1\), and so \(\lambda _j>1\) for some j (since \(\eta \) is sufficiently small in terms of k and d). Since \(\mathcal {R}\cap \mathbb {Z}^{k+d}\ne \emptyset \), we also have that \(\lambda _1\le 1\). Thus there must be some integer J such that \(\lambda _{J}\le 1< \lambda _{J+1}\). We note that for some suitably large constant C (depending only on k and d)

$$\begin{aligned} \{\mathbf {x}\in \Lambda :\,\Vert \mathbf {x}\Vert _{\infty }<1\}&=\left\{ \sum _{i=1}^{k+d}n_i\mathbf {b}_i:\, (n_1,\dots ,n_{k+d})\in \mathbb {Z}^{k+d},\,\left\| \sum _{i=1}^{k+d}n_i\mathbf {b}_i\right\| _\infty \le 1\right\} \\&\subseteq \left\{ \sum _{i=1}^{k+d}n_i\mathbf {b}_i:\,(n_1,\dots ,n_{k+d})\in \mathbb {Z}^{k+d},\, \sum _{i=1}^{k+d}|n_i|\Vert \mathbf {b}_i\Vert _\infty \le C\right\} \\&\subseteq \left\{ \sum _{i=1}^{k+d}n_i\mathbf {b}_i:\,(n_1,\dots ,n_{k+d})\in \mathbb {Z}^{k+d},\,|n_i|\le C\lambda _i^{-1}\right\} . \end{aligned}$$

The final set on the right hand side has cardinality

$$\begin{aligned} \ll \prod _{i=1}^{k+d}\left( 1+\frac{C}{\lambda _i}\right) \ll \frac{1}{\lambda _1\cdots \lambda _J}. \end{aligned}$$

Thus we have \(\lambda _1\cdots \lambda _J\ll N^{-1}\). Since \(\lambda _1\ge N^{-1/(d+1)}\) we have \(\lambda _1\cdots \lambda _d\ge N^{-d/(d+1)}\gg \lambda _1\cdots \lambda _J\). Thus we see that \(J>d\).
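The determinant computation for \(\Lambda \) used above can be checked on a small instance. The following sketch assumes rational \(B_i\), \(\beta _{i,j}\) and \(\eta \) purely so that the arithmetic is exact; the function names are hypothetical. It builds the \(k+d\) generators of \(\Lambda \) as rows and confirms that \(1/\det (\Lambda )=B_1\cdots B_k\eta ^{d(d+1)/2}\), which holds because the generator matrix is triangular.

```python
from fractions import Fraction


def lattice_basis(B, beta, eta):
    """Rows are the k+d generators of Lambda from the definition in the proof."""
    k, d = len(B), len(beta[0])
    rows = []
    for i in range(k):
        row = [Fraction(0)] * (k + d)
        row[i] = Fraction(1) / B[i]
        for j in range(d):
            row[k + j] = -Fraction(beta[i][j]) / eta ** (j + 1)
        rows.append(row)
    for j in range(d):
        row = [Fraction(0)] * (k + d)
        row[k + j] = Fraction(1) / eta ** (j + 1)
        rows.append(row)
    return rows


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not m:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


if __name__ == "__main__":
    B = [2, 3]
    beta = [[Fraction(1, 3), Fraction(2, 5)], [Fraction(1, 7), Fraction(3, 4)]]
    eta = Fraction(1, 10)
    M = lattice_basis(B, beta, eta)
    print(1 / det(M), B[0] * B[1] * eta ** 3)  # d(d+1)/2 = 3 when d = 2
```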

The determinant of \(\Lambda \) is given by the determinant of the \((k+d)\times (k+d)\) matrix \(M_1\) with columns \(\mathbf {b}_1,\dots ,\mathbf {b}_{k+d}\), and satisfies \(\det (\Lambda )=\det (M_1)\asymp \Vert \mathbf {b}_1\Vert _\infty \cdots \Vert \mathbf {b}_{k+d}\Vert _\infty \asymp \lambda _1\cdots \lambda _{k+d}\). This implies that some \(J\times J\) submatrix \(M_2\) of the \((k+d)\times J\) matrix with columns \(\mathbf {b}_1,\dots ,\mathbf {b}_J\) has \(\det (M_2)\asymp \lambda _1\cdots \lambda _J\). (If all such submatrices had determinant bounded by \(\delta \lambda _1\cdots \lambda _J\), then by expanding the determinant of \(M_1\) into a sum of such determinants, and using \(\Vert \mathbf {b}_j\Vert _{\infty }\le \lambda _j\), we see the determinant of \(M_1\) would be \(O_k(\delta \lambda _1\cdots \lambda _{k+d})\), contradicting our lower bound if \(\delta \) is sufficiently small in terms of k and d). Similarly, we see that there is some choice of \(\mathcal {I}=\{i_1,\dots ,i_{J-d}\}\subseteq \{1,\dots ,J\}\) such that the \((J-d)\times (J-d)\) submatrix \(M_\mathcal {I}\) of \(M_2\) formed by removing the final d rows and removing the \(i^{th}\) column for each \(i\in \{1,\dots ,J\}\setminus \mathcal {I}\) from \(M_2\) satisfies

$$\begin{aligned} \det (M_\mathcal {I})\gg \prod _{i\in \mathcal {I}}\lambda _i. \end{aligned}$$

(Consider expanding the determinant of \(M_2\) via the bottom d rows so that it is a sum of O(1) of such determinants with the coefficient of \(\det (M_\mathcal {I})\) of size \(\ll \prod _{i\notin \mathcal {I}}\Vert \mathbf {b}_i\Vert _\infty \ll \prod _{i\notin \mathcal {I}}\lambda _i\).)

Let \(\mathbf {b}_i'\in \mathbb {R}^{k}\) be the vector formed by removing the last d coordinates of \(\mathbf {b}_i\) for \(1\le i\le k+d\). The above discussion implies that

$$\begin{aligned} \Vert \mathbf {b}_{i_1}'\wedge \dots \wedge \mathbf {b}_{i_{J-d}}'\Vert \ge \det (M_{\mathcal {I}})\gg \prod _{i\in \mathcal {I}}\lambda _i \ge \prod _{i\in \mathcal {I}}\Vert \mathbf {b}_i'\Vert _\infty , \end{aligned}$$

since one of the \((J-d)\times (J-d)\) submatrices formed from taking \(J-d\) rows from \(\mathbf {b}_{i_1}',\dots ,\mathbf {b}_{i_{J-d}}'\) is \(M_{\mathcal {I}}\). (Recall that \(M_{\mathcal {I}}\) was formed from \(M_2\) by removing the final d rows, and so cannot contain the rows corresponding to the final d coordinates of the \(\mathbf {b}_i\).) Finally, on recalling that \(N^{-1/(d+1)}\le \lambda _1\le \dots \le \lambda _J\) and \(\lambda _1\cdots \lambda _J\ll N^{-1}\), we have

$$\begin{aligned} \prod _{i\in \mathcal {I}}\Vert \mathbf {b}'_i\Vert _\infty \ll \Vert \mathbf {b}_{i_1}'\wedge \dots \wedge \mathbf {b}_{i_{J-d}}'\Vert \ll \prod _{i\in \mathcal {I}}\lambda _i\ll N^{d/(d+1)}\lambda _{1}\cdots \lambda _J\ll \frac{1}{N^{1/(d+1)}}. \end{aligned}$$

We now have the result of the lemma on putting \(r=J-d\) and taking

$$\begin{aligned} \mathbf {h}^{(j)}&=(B_{1}(\mathbf {b}_{i_j})_1,\dots ,B_{k}(\mathbf {b}_{i_j})_k) \in \mathbb {Z}^k,\\ a_\ell ^{(j)}&=\eta (\mathbf {b}_{i_j})_{k+\ell }+\sum _{i=1}^k (\mathbf {b}_{i_j})_i\beta _{i,\ell }B_i\text { for }1\le \ell \le d, \end{aligned}$$

for \(1\le j\le r=J-d\). Note that this choice has \(\tilde{\mathbf {h}}^{(j)}=\mathbf {b}_{i_j}'\). \(\square \)
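The determinant expansions used in the parenthetical arguments of this proof are presumably instances of the generalized Laplace expansion along a block of columns (or rows). The following is a minimal sketch checking that identity on a small integer matrix; the function names `det` and `laplace_by_columns` and the test matrix are illustrative choices and do not come from the paper.

```python
from itertools import combinations


def det(m):
    """Exact integer determinant by Laplace expansion along the first row.

    Only intended for tiny matrices; the empty matrix has determinant 1.
    """
    if len(m) == 0:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def laplace_by_columns(m, J):
    """Expand det(m) along its first J columns.

    The result is a signed sum, over all row subsets S of size J, of the
    J x J minor on rows S and the first J columns, times the complementary
    minor on the remaining rows and columns.
    """
    n = len(m)
    col_sum = sum(range(J))
    total = 0
    for S in combinations(range(n), J):
        rest = [i for i in range(n) if i not in S]
        left = [[m[i][c] for c in range(J)] for i in S]
        right = [[m[i][c] for c in range(J, n)] for i in rest]
        sign = (-1) ** (sum(S) + col_sum)
        total += sign * det(left) * det(right)
    return total


# Hypothetical 4 x 4 example: both evaluations of the determinant agree.
M = [[2, 1, 0, 3], [1, 4, 1, 0], [0, 1, 3, 1], [5, 0, 1, 2]]
print(laplace_by_columns(M, 2) == det(M))
```

In the setting of the proof, each complementary minor is bounded by the product of the sup-norms of the remaining columns, which is why a lower bound for \(\det (M_1)\) forces at least one \(J\times J\) minor of the first J columns to be large.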

Lemma 7.2

Let \(H_1\) be an \(r\times r\) invertible integer matrix and \(H_2\) an \(r\times \ell \) integer matrix. Let \(\Lambda _1,\Lambda _2\subseteq \mathbb {Z}^r\) and \(\Lambda _3\subseteq \mathbb {Z}^\ell \) be lattices defined by

$$\begin{aligned} \Lambda _1&=H_1(\mathbb {Z}^r),\\ \Lambda _2&=H_1(\mathbb {Z}^r)+H_2(\mathbb {Z}^{\ell }),\\ \Lambda _3&=\{\mathbf {y}\in \mathbb {Z}^{\ell }:\,\exists \,\mathbf {x}\in \mathbb {Z}^r\text { s.t. }H_1\mathbf {x}=H_2\mathbf {y}\}. \end{aligned}$$

Then

$$\begin{aligned} \det (\Lambda _1)=\det (\Lambda _3)\det (\Lambda _2). \end{aligned}$$

Proof

We first note that \(\Lambda _1,\Lambda _2,\Lambda _3\) are all full rank since \(H_1\) has non-zero determinant. Let \(\Lambda _1\) have determinant \(D_1=\det (H_1)\) and \(\Lambda _2\) have determinant \(D_2\). Since \(\Lambda _1\subseteq \Lambda _2\subseteq \mathbb {Z}^r\), \(D_1\) and \(D_2\) are integers with \(D_2|D_1\). Since \(D_1\mathbb {Z}^r\subseteq \Lambda _1\), we have that \(D_1\mathbb {Z}^{\ell }\subseteq \Lambda _3\). We also have that \(D_1\mathbb {Z}^{r}\subseteq D_2\mathbb {Z}^r\subseteq \Lambda _2\). Thus, letting \([D_1]\) denote \(\{1,\dots ,D_1\}\), we have

$$\begin{aligned} \frac{D_1^r}{\det (\Lambda _2)}=\#\{\mathbf {z}\in [D_1]^r:\,\exists \,\mathbf {x}\in [D_1]^r,\mathbf {y}\in [D_1]^{\ell }\text { s.t. }H_1\mathbf {x}-H_2\mathbf {y}=\mathbf {z}\ (\mathrm {mod}\ D_1)\}. \end{aligned}$$

The number of representations \(r(\mathbf {z})\) of \(\mathbf {z}\) as \(H_1\mathbf {x}-H_2\mathbf {y}\) with \(\mathbf {x}\in [D_1]^r\), \(\mathbf {y}\in [D_1]^\ell \) is either 0 (if \(\mathbf {z}\notin \Lambda _2\)) or equal to \(r(\mathbf {0})\) (if \(\mathbf {z}\in \Lambda _2\)) by linearity. Thus, since \(\sum _{\mathbf {z}\in [D_1]^r}r(\mathbf {z})=D_1^{\ell +r}\), we have

$$\begin{aligned} \frac{D_1^r}{\det (\Lambda _2)}=\frac{D_1^{\ell +r}}{r(\mathbf {0})}. \end{aligned}$$

But then we have that for any given \(\mathbf {y}\in [D_1]^\ell \), the number \(r_2(\mathbf {y})\) of \(\mathbf {x}\in [D_1]^r\) such that \(H_1\mathbf {x}=H_2\mathbf {y}\pmod {D_1}\) is either 0 (if \(H_2\mathbf {y}\notin \Lambda _1\)) or is equal to \(r_2(\mathbf {0})\) (if \(H_2\mathbf {y}\in \Lambda _1\)). Thus

$$\begin{aligned} r(\mathbf {0})&=\#\{\mathbf {y}\in [D_1]^{\ell }:\,\exists \,\mathbf {x}\in [D_1]^r\text { s.t. }H_1\mathbf {x}=H_2\mathbf {y}\pmod {D_1}\}\cdot r_2(\mathbf {0})\\&=\frac{D_1^{\ell }}{\det (\Lambda _3)}\cdot \#\{\mathbf {x}\in [D_1]^r:\,H_1\mathbf {x}=\mathbf {0}\pmod {D_1}\}\\&=\frac{D_1^{\ell }}{\det (\Lambda _3)}\cdot \frac{D_1^r}{[D_1\mathbb {Z}^r:\Lambda _1]}. \end{aligned}$$

But \(\det (\Lambda _1)=D_1=[\Lambda _1:\mathbb {Z}^r]\), and \([D_1\mathbb {Z}^r:\mathbb {Z}^r]=D_1^r\) so \([D_1\mathbb {Z}^r:\Lambda _1]=D_1^{r}/\det (\Lambda _1)\). Thus we have

$$\begin{aligned} \frac{D_1^r}{\det (\Lambda _2)}=\frac{D_1^{\ell +r}}{r(\mathbf {0})}=\det (\Lambda _3)\cdot [D_1\mathbb {Z}^r:\Lambda _1]=D_1^r\frac{\det (\Lambda _3)}{\det (\Lambda _1)}. \end{aligned}$$

\(\square \)
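The identity of Lemma 7.2 can be checked by brute force on small examples, following the counting modulo \(D_1\) used in the proof. The sketch below is only an illustration under stated assumptions: the function names and the example matrices are hypothetical, it takes \(D_1=|\det (H_1)|\), and it enumerates all residues modulo \(D_1\), so it is practical only for very small matrices.

```python
from itertools import product


def det(m):
    """Exact integer determinant by Laplace expansion (tiny matrices only)."""
    if len(m) == 0:
        return 1
    total = 0
    for c, entry in enumerate(m[0]):
        minor = [row[:c] + row[c + 1:] for row in m[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def matvec_mod(m, v, modulus):
    """Return the matrix-vector product m*v reduced modulo `modulus`."""
    return tuple(sum(a * b for a, b in zip(row, v)) % modulus for row in m)


def lattice_dets(h1, h2):
    """Return (det Lambda_1, det Lambda_2, det Lambda_3) for integer h1, h2.

    Uses that D_1 * Z^r is contained in Lambda_1 and Lambda_2, and that
    D_1 * Z^l is contained in Lambda_3, so each lattice is determined by
    its set of residues modulo D_1.
    """
    r, ell = len(h1), len(h2[0])
    d1 = abs(det(h1))
    residues = range(d1)
    # Residues of Lambda_1 = H_1(Z^r) modulo D_1.
    img1 = {matvec_mod(h1, x, d1) for x in product(residues, repeat=r)}
    # Residues of H_2 y for each y modulo D_1.
    img_h2 = {y: matvec_mod(h2, y, d1) for y in product(residues, repeat=ell)}
    # Residues of Lambda_2 = H_1(Z^r) + H_2(Z^l) modulo D_1.
    img2 = {
        tuple((a + b) % d1 for a, b in zip(u, w))
        for u in img1
        for w in img_h2.values()
    }
    # Residues y modulo D_1 lying in Lambda_3, i.e. with H_2 y in Lambda_1.
    lam3 = [y for y, w in img_h2.items() if w in img1]
    return d1, d1 ** r // len(img2), d1 ** ell // len(lam3)


# Hypothetical example with r = 2 and l = 1.
d1, d2, d3 = lattice_dets([[4, 0], [0, 4]], [[2], [0]])
print(d1 == d2 * d3)
```

In this example \(\Lambda _1=4\mathbb {Z}^2\), \(\Lambda _2=2\mathbb {Z}\times 4\mathbb {Z}\) and \(\Lambda _3=2\mathbb {Z}\), with determinants 16, 8 and 2 respectively.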

Lemma 7.3

(Orthogonal relations give rise to reduced dimension problem). Let \(C>2\) and \(f_1,\dots ,f_k\in \mathbb {R}[X]\) be polynomials of degree at most d with \(f_1(0)=\dots =f_k(0)=0\). Put \(f_i(X)=\sum _{j=1}^df_{i,j}X^j\).

Let \(B_1,\dots ,B_k\ge 1\) and \(q_0\in \mathbb {Z}_{>0}\). Let \(\eta \in [0,1/100]\) be such that

$$\begin{aligned} \eta <\frac{q_0^C}{x}. \end{aligned}$$

Let \(r\in \{1,\dots ,k\}\) and \(\mathbf {h}^{(1)},\dots ,\mathbf {h}^{(r)}\in \mathbb {Z}^k\) and \(\mathbf {a}^{(1)},\dots ,\mathbf {a}^{(r)}\in \mathbb {Z}^d\) satisfy:

  1. (1)

    \(|{h}^{(\ell )}_i|\le B_i\) for \(1\le i\le k\) and \(1\le \ell \le r\).

  2. (2)

    For \(1\le \ell \le r\) and \(1\le j\le d\) we have

    $$\begin{aligned} \left| \sum _{i=1}^k h^{(\ell )}_i f_{i,j}-\frac{a^{(\ell )}_j}{q_0}\right| \le \eta ^j. \end{aligned}$$
  3. (3)

    Put \(\tilde{h}^{(\ell )}_i= h^{(\ell )}_i/B_i\) for \(1\le \ell \le r\) and \(1\le i\le k\). We have

    $$\begin{aligned} \Vert \tilde{\mathbf {h}}^{(1)}\wedge \dots \wedge \tilde{\mathbf {h}}^{(r)}\Vert \asymp \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty , \end{aligned}$$

    and

    $$\begin{aligned} \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty \ll \frac{1}{q_0^{1/C}}. \end{aligned}$$

Then there is an integer \(k'<k\), real polynomials \(g_1,\dots ,g_{k'}\in \mathbb {R}[X]\) of degree at most d with \(g_1(0)=\dots =g_{k'}(0)=0\) and quantities \(B_1',\dots ,B_{k'}'\ge 2\) and \(y<x\) such that:

  1. (1)

    (Approximations in the new system produce approximations in the old system) If there is an integer \(n'<y\) such that

    $$\begin{aligned} \Vert g_i(n')\Vert _{\mathbb {R}/\mathbb {Z}}<\frac{1}{B_i'}\quad \text {for all }1\le i\le k' \end{aligned}$$

    then there is an integer \(n<x\) such that

    $$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}<\frac{1}{B_i}\quad \text {for all }1\le i\le k. \end{aligned}$$
  2. (2)

    (Increased density of approximations) We have

    $$\begin{aligned} \frac{y}{(B_1'\cdots B_{k'}')^{3C^2-C^2/k'{}^3}}\gg \frac{x}{(B_1\cdots B_k)^{3C^2-C^2/k^3-C^2/k^4}}. \end{aligned}$$

All implied constants depend only on k and d.

Proof

Let \(\tilde{M}\) be the \(r\times k\) matrix with rows \(\tilde{\mathbf {h}}^{(1)},\dots ,\tilde{\mathbf {h}}^{(r)}\), and M be the \(r\times k\) matrix with rows \(\mathbf {h}^{(1)},\dots ,\mathbf {h}^{(r)}\). Let \(\tilde{H}_1\) be the \(r\times r\) submatrix of \(\tilde{M}\) with largest determinant (in absolute value). By permuting the coordinates (and permuting the \(f_i\) accordingly), we may assume that \(\tilde{H}_1\) is formed by taking the first r columns of \(\tilde{M}\). Let \(\tilde{H}_2\) be the \(r\times (k-r)\) matrix formed by taking the final \(k-r\) columns of \(\tilde{M}\). Similarly, let \(H_1\) be the matrix formed by taking the first r columns of M, and \(H_2\) the matrix formed by taking the last \(k-r\) columns of M.

By the set-up of the lemma, we have for each \(j\in \{1,\dots ,d\}\)

$$\begin{aligned} \tilde{H}_1\begin{pmatrix} f_{1,j} B_1 \\ \vdots \\ f_{r,j} B_r\end{pmatrix} = -\tilde{H}_2\begin{pmatrix}f_{r+1,j}B_{r+1}\\ \vdots \\ f_{k,j} B_k\end{pmatrix}+\frac{1}{q_0}\begin{pmatrix} a^{(1)}_j\\ \vdots \\ a^{(r)}_j \end{pmatrix} + O(\eta ^j). \end{aligned}$$
(7.1)

By construction, \(\det (\tilde{H}_1)\) is the largest determinant (in absolute value) of any \(r\times r\) submatrix of \(\tilde{M}\), whose rows are \(\tilde{\mathbf {h}}^{(1)},\dots ,\tilde{\mathbf {h}}^{(r)}\), and so \(\det (\tilde{H}_1)\gg \Vert \tilde{\mathbf {h}}^{(1)}\wedge \dots \wedge \tilde{\mathbf {h}}^{(r)}\Vert \). By the assumption of the lemma we have \( \Vert \tilde{\mathbf {h}}^{(1)}\wedge \dots \wedge \tilde{\mathbf {h}}^{(r)}\Vert \gg \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty \), and so \(\det (\tilde{H}_1)\asymp \Vert \tilde{\mathbf {h}}^{(1)}\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}^{(r)}\Vert _\infty \). It follows that \((\tilde{H}_1^{-1})_{i,j}\ll \Vert \tilde{\mathbf {h}}^{(j)}\Vert _\infty ^{-1}\) for all \(1\le i,j\le r\).
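The bound on the entries of \(\tilde{H}_1^{-1}\) can be seen from the cofactor formula for the inverse. As a sketch, write \(\tilde{H}_1^{(j,i)}\) for the matrix obtained from \(\tilde{H}_1\) by deleting row j and column i (this notation is introduced only for this remark). Then

$$\begin{aligned} |(\tilde{H}_1^{-1})_{i,j}|=\frac{|\det (\tilde{H}_1^{(j,i)})|}{|\det (\tilde{H}_1)|}\ll \frac{\prod _{\ell \ne j}\Vert \tilde{\mathbf {h}}^{(\ell )}\Vert _\infty }{\prod _{\ell =1}^r\Vert \tilde{\mathbf {h}}^{(\ell )}\Vert _\infty }=\frac{1}{\Vert \tilde{\mathbf {h}}^{(j)}\Vert _\infty }. \end{aligned}$$

Here the numerator is bounded by expanding the minor as a sum of O(1) products, each containing one entry from every row \(\tilde{\mathbf {h}}^{(\ell )}\) with \(\ell \ne j\), and the denominator is bounded below using the estimate for \(\det (\tilde{H}_1)\) just obtained.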

We multiply (7.1) by \(n^j\) and then sum over \(j\in \{1,\dots ,d\}\). Rearranging, we find that for any choice of \(b_1,\dots ,b_k\in \mathbb {Z}\), we have

$$\begin{aligned} \begin{pmatrix} (f_1(n)-b_1)B_1 \\ \vdots \\ (f_r(n)-b_r)B_r\end{pmatrix}&= -\tilde{H}_1^{-1}\tilde{H}_2\begin{pmatrix}(f_{r+1}(n)-b_{r+1})B_{r+1}\\ \vdots \\ (f_k(n)-b_k)B_k\end{pmatrix}\\&\quad + \tilde{H}_1^{-1}\left( H_2\begin{pmatrix}b_{r+1}\\ \vdots \\ b_k\end{pmatrix}- H_1\begin{pmatrix}b_1\\ \vdots \\ b_r\end{pmatrix}+\frac{1}{q_0}\begin{pmatrix} \sum _{j=1}^d a^{(1)}_j n^j\\ \vdots \\ \sum _{j=1}^d a^{(r)}_j n^j \end{pmatrix}\right) \\&\quad +O\left( (n\eta +n^d\eta ^d)\begin{pmatrix} 1/\Vert \tilde{\mathbf {h}}_1\Vert _\infty \\ \vdots \\ 1/\Vert \tilde{\mathbf {h}}_r\Vert _\infty \end{pmatrix}\right) . \end{aligned}$$

We note that \((\tilde{H}_2)_{i,j}\ll \Vert \tilde{\mathbf {h}}^{(i)}\Vert _\infty \) for all \(1\le i\le r\) and \(1\le j\le k-r\). Recalling that \((\tilde{H}_1^{-1})_{i,j}\ll \Vert \tilde{\mathbf {h}}^{(j)}\Vert _\infty ^{-1}\) for all \(1\le i,j\le r\), we see that all entries of \(\tilde{H}_1^{-1}\tilde{H}_2\) are of size O(1). In particular, if \(b_1,\dots , b_k\) are such that for each \(r+1\le j\le k\)

$$\begin{aligned} |f_j(n)-b_j|\le \frac{\delta }{B_j}, \end{aligned}$$
(7.2)

then we have that

$$\begin{aligned} -\tilde{H}_1^{-1}\tilde{H}_2\begin{pmatrix}(f_{r+1}(n)-b_{r+1})B_{r+1}\\ \vdots \\ (f_k(n)-b_k)B_k\end{pmatrix}=O(\delta ). \end{aligned}$$

If we have \(\delta <1\) and

$$\begin{aligned} n\le \frac{\delta \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty }{\eta }, \end{aligned}$$
(7.3)

then (recalling \(\Vert \tilde{\mathbf {h}}_i\Vert _\infty \le 1\))

$$\begin{aligned} (n\eta +n^d\eta ^d)\begin{pmatrix}1/\Vert \tilde{\mathbf {h}}_1\Vert _\infty \\ \vdots \\ 1/\Vert \tilde{\mathbf {h}}_r\Vert _\infty \end{pmatrix}=O(\delta ). \end{aligned}$$

Thus, if (7.2) and (7.3) hold and also we have

$$\begin{aligned} H_1\begin{pmatrix}b_1\\ \vdots \\ b_r\end{pmatrix}-H_2\begin{pmatrix}b_{r+1}\\ \vdots \\ b_k\end{pmatrix}=\frac{1}{q_0}\begin{pmatrix} \sum _{j=1}^d a^{(1)}_j n^j\\ \vdots \\ \sum _{j=1}^d a^{(r)}_j n^j \end{pmatrix}, \end{aligned}$$
(7.4)

then we have for each \(1\le i\le r\)

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\ll \frac{\delta }{B_i}. \end{aligned}$$

In particular, we have \(\Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le 1/B_i\) for \(1\le i \le r\) if \(\delta \) is chosen to be a sufficiently small constant (depending only on k and d) provided (7.4), (7.2) and (7.3) hold.

Let \(\Lambda _1\subseteq \Lambda _2\subseteq \mathbb {Z}^r\) be full-rank lattices defined in terms of the matrices \(H_1,H_2\) by

$$\begin{aligned} \Lambda _1=H_1(\mathbb {Z}^r),\quad \Lambda _2=H_1(\mathbb {Z}^r)+H_2(\mathbb {Z}^{k-r}). \end{aligned}$$

(They are both full rank since \(H_1\) has non-zero determinant.) Let \(\Lambda _1\) have determinant \(D_1=\det (H_1)\) and \(\Lambda _2\) have determinant \(D_2\). Since \(\Lambda _1\subseteq \Lambda _2\subseteq \mathbb {Z}^r\), \(D_1\) and \(D_2\) are integers with \(D_2|D_1\). Any sublattice of \(\mathbb {Z}^r\) with determinant \(D_2\) contains \(D_2\mathbb {Z}^r\). Therefore for each \(j\in \{1,\dots ,d\}\) there exists a choice of \(b_{1,j}',\dots ,b_{k,j}'\in \mathbb {Z}\) such that

$$\begin{aligned} H_1\begin{pmatrix}b_{1,j}'\\ \vdots \\ b_{r,j}'\end{pmatrix}-H_2\begin{pmatrix}b_{r+1,j}'\\ \vdots \\ b_{k,j}'\end{pmatrix}=D_2^j q_0^{j-1}\begin{pmatrix}a^{(1)}_j\\ \vdots \\ a^{(r)}_j \end{pmatrix}. \end{aligned}$$

We restrict our attention to \(b_i\) of the form \(b_i=\sum _{j=1}^d b_{i,j}'n^j/(q_0^j D_2^j)+b_i''\) for \(1\le i\le k\), where \(b_1'',\dots ,b_k''\in \mathbb {Z}\) satisfy

$$\begin{aligned} H_1\begin{pmatrix}b_1''\\ \vdots \\ b_r''\end{pmatrix}-H_2\begin{pmatrix}b_{r+1}''\\ \vdots \\ b_k''\end{pmatrix}=0. \end{aligned}$$
(7.5)

To ensure that \(b_1,\dots ,b_k\in \mathbb {Z}\), we will restrict our consideration to integers n such that \(D_2q_0|n\).
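As a sanity check on this ansatz (a sketch, assuming \(D_2q_0|n\), so that each \(n^j/(q_0^jD_2^j)\) is an integer, and that (7.5) holds), substituting the above form of the \(b_i\) and using the defining relation for the \(b_{i,j}'\) gives

$$\begin{aligned} H_1\begin{pmatrix}b_1\\ \vdots \\ b_r\end{pmatrix}-H_2\begin{pmatrix}b_{r+1}\\ \vdots \\ b_k\end{pmatrix}&=\sum _{j=1}^d\frac{n^j}{q_0^jD_2^j}\cdot D_2^jq_0^{j-1}\begin{pmatrix}a^{(1)}_j\\ \vdots \\ a^{(r)}_j\end{pmatrix}\\&=\frac{1}{q_0}\begin{pmatrix}\sum _{j=1}^da^{(1)}_jn^j\\ \vdots \\ \sum _{j=1}^da^{(r)}_jn^j\end{pmatrix}, \end{aligned}$$

which is exactly the condition (7.4). The contribution of the \(b_i''\) vanishes by (7.5).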

The equation (7.5) forces \((b_{r+1}'',\dots ,b_k'')\) to lie in a rank \(k-r\) lattice \(\Lambda _3\), given by

$$\begin{aligned} \Lambda _3=\{\mathbf {y}\in \mathbb {Z}^{k-r}:\,\exists \mathbf {x}\in \mathbb {Z}^r\text { s.t. }H_1\mathbf {x}=H_2\mathbf {y}\}. \end{aligned}$$

By Lemma 7.2, \(\Lambda _3\) has determinant \(D_1/D_2\).

Let \(\mathbf {z}_1,\dots ,\mathbf {z}_{k-r}\) be a Minkowski-reduced basis for \(\Lambda _3\), so in particular \(D_1/D_2=\det (\Lambda _3)\asymp \Vert \mathbf {z}_1\Vert _\infty \cdots \Vert \mathbf {z}_{k-r}\Vert _\infty \). Let \(n=D_2q_0n'\) for some \(n'\in \mathbb {Z}\). We see that we can find \((b_1,\dots ,b_k)\) satisfying the conditions (7.4) and (7.2) provided we can find \(m_1,\dots , m_{k-r}\in \mathbb {Z}\) such that for \(r+1\le i\le k\) we have

$$\begin{aligned} \left| f_i(D_2q_0n')-\sum _{j=1}^d b_{i,j}'(n')^j -\sum _{s=1}^{k-r}m_s(\mathbf {z}_s)_{i-r}\right|&=\left| \tilde{f}_i(n')-\sum _{s=1}^{k-r} m_s(\mathbf {z}_s)_{i-r}\right| \nonumber \\&\le \frac{\delta }{B_i}. \end{aligned}$$
(7.6)

Here we have set \(\tilde{f}_i(X)\in \mathbb {R}[X]\) to be the polynomial

$$\begin{aligned} \tilde{f}_i(X)=f_i(D_2q_0X)-\sum _{j=1}^d b'_{i,j}X^j \end{aligned}$$

for each \(i\in \{r+1,\dots ,k\}\). We note that each \(\tilde{f}_i\) has degree at most d and satisfies \(\tilde{f}_i(0)=0\).

Let Z be the \((k-r)\times (k-r)\) matrix with columns \(\mathbf {z}_1,\dots ,\mathbf {z}_{k-r}\). Since \(\det (Z)\asymp \Vert \mathbf {z}_1\Vert _\infty \cdots \Vert \mathbf {z}_{k-r}\Vert _\infty \) we have that \(Z^{-1}_{i,j}\ll \Vert \mathbf {z}_j\Vert _\infty ^{-1}\). Thus we see that if we can find \(n'\) and \(m_1,\dots ,m_{k-r}\) such that

$$\begin{aligned} \left| Z^{-1}\begin{pmatrix} \tilde{f}_{r+1}(n')\\ \vdots \\ \tilde{f}_k(n') \end{pmatrix}-\begin{pmatrix} m_1\\ \vdots \\ m_{k-r}\end{pmatrix} \right| \le \delta ^2 \begin{pmatrix}1/(B_{r+1}\Vert \mathbf {z}_1\Vert _\infty ) \\ \vdots \\ 1/(B_k\Vert \mathbf {z}_{k-r}\Vert _\infty ) \end{pmatrix}, \end{aligned}$$

then (7.6) holds if \(\delta \) is a sufficiently small constant (depending only on k). We now let \(k'=k-r\) and \(B_i'=\delta ^{-2}B_{r+i}\Vert \mathbf {z}_i\Vert _\infty \) and \(g_1,\dots ,g_{k-r}\in \mathbb {R}[X]\) be given by

$$\begin{aligned} \begin{pmatrix}g_1(n')\\ \vdots \\ g_{k-r}(n') \end{pmatrix}=Z^{-1}\begin{pmatrix} \tilde{f}_{r+1}(n')\\ \vdots \\ \tilde{f}_k(n') \end{pmatrix}. \end{aligned}$$

We see that the \(g_i\) are polynomials of degree at most d with \(g_i(0)=0\) since the \(\tilde{f}_i\) are. Finally, we put \(y=\delta x \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty /(q_0^{C+1}D_2)\), and note that since \(\eta <q_0^C/x\) we have \(y<\delta \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty /(\eta q_0D_2)\).

Putting everything together, we see that if there is an \(n'<y\) such that

$$\begin{aligned} \Vert g_i(n')\Vert _{\mathbb {R}/\mathbb {Z}}\le \frac{1}{B_i'} \end{aligned}$$

for \(1\le i\le k'=k-r\), then there is an \(n=n' q_0 D_2\) with \(n<x\) and \(n<\delta \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty /\eta \) such that

$$\begin{aligned} \Vert f_i(n)\Vert _{\mathbb {R}/\mathbb {Z}}\le \frac{1}{B_i} \end{aligned}$$

for \(1\le i\le k\).
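For completeness, the two size conditions on \(n=n'q_0D_2\) can be checked as follows (a sketch, assuming \(\eta >0\), \(\delta <1\), \(q_0\ge 1\) and \(\Vert \tilde{\mathbf {h}}_i\Vert _\infty \le 1\), as above). Since \(n'<y\), the definition of y and the upper bound for y in terms of \(\eta \) give

$$\begin{aligned} n<yq_0D_2=\frac{\delta x\min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty }{q_0^{C}}\le x\qquad \text {and}\qquad n<yq_0D_2<\frac{\delta \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty }{\eta }, \end{aligned}$$

so that in particular (7.3) holds for this n.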

Thus we are left to verify the size estimates with this choice of \(B_1',\dots ,B_{k-r}'\) and y. We have that (recalling that \(1/\delta =O(1)\))

$$\begin{aligned} \prod _{i=1}^{k-r}B_i'&=\frac{\Vert \mathbf {z}_1\Vert _\infty \cdots \Vert \mathbf {z}_{k-r}\Vert _\infty \prod _{i=r+1}^kB_i}{\delta ^{2(k-r)}}\\&\ll \frac{D_1\prod _{i=r+1}^k B_i }{D_2}. \end{aligned}$$

This implies that

$$\begin{aligned} \frac{y}{(\prod _{i=1}^{k-r}B_i')^{C_2}}&=\frac{\delta x \min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty }{q_0^{C+1} D_2(\prod _{i=1}^{k-r}B_i')^{C_2}}\\&\gg \frac{x D_2^{C_2}\min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty }{ q_0^{C+1} D_1^{C_2}(\prod _{i=r+1}^k B_i )^{C_2}}. \end{aligned}$$

We recall that \(D_1=\det (H_1)=B_1\cdots B_r\det (\tilde{H}_1)\asymp B_1\cdots B_r\Vert \tilde{\mathbf {h}}_1\Vert _\infty \cdots \Vert \tilde{\mathbf {h}}_r\Vert _\infty \ll B_1\cdots B_r/q_0^{1/C}\), and that \(\min _i\Vert \tilde{\mathbf {h}}_i\Vert _\infty \gg \prod _{i=1}^r\Vert \tilde{\mathbf {h}}_i\Vert _\infty \gg D_1/(B_1\cdots B_r)\). This gives

$$\begin{aligned} \frac{y}{(\prod _{i=1}^{k-r}B_i')^{C_2}}&\gg \frac{x}{(\prod _{i=r+1}^k B_i )^{C_2}} \frac{D_2^{C_2-1} }{D_1^{C_2-1}q_0^{C+1} B_1\cdots B_r}\\&\gg \frac{x}{(\prod _{i=1}^k B_i )^{C_2} } q_0^{(C_2-1)/C-C-1}D_2^{C_2-1}. \end{aligned}$$

Finally, we choose \(C_2=3C^2-C^2/(k-r)^3\). Since \(D_2,q_0\ge 1\) and \((C_2-1)/C-C-1>0\), this gives

$$\begin{aligned} \frac{y}{(\prod _{i=1}^{k-r}B_i')^{3C^2-C^2/(k-r)^3}}\gg \frac{x}{(B_1\cdots B_k)^{3C^2-C^2/(k-r)^3}}. \end{aligned}$$

Since \(k>r\ge 1\) we have \(1/(k-r)^3\ge 1/k^3+1/k^4\); as \(B_1,\dots ,B_k\ge 1\), this gives the result. \(\square \)
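The two elementary inequalities used at the end of this proof, namely \(1/(k-r)^3\ge 1/k^3+1/k^4\) for \(k>r\ge 1\) and \((C_2-1)/C-C-1>0\) for \(C>2\), can be spot-checked with exact rational arithmetic. The sketch below is a finite numerical check rather than a proof; the function names, the range bound `K_MAX` and the sample values of C are arbitrary choices made for illustration.

```python
from fractions import Fraction


def exponent_gap(k, r):
    """Return 1/(k-r)^3 - (1/k^3 + 1/k^4) as an exact rational."""
    return Fraction(1, (k - r) ** 3) - Fraction(1, k ** 3) - Fraction(1, k ** 4)


def q0_exponent(C, k, r):
    """Return (C_2 - 1)/C - C - 1, where C_2 = 3C^2 - C^2/(k-r)^3."""
    C = Fraction(C)
    C2 = 3 * C ** 2 - C ** 2 / (k - r) ** 3
    return (C2 - 1) / C - C - 1


K_MAX = 40  # arbitrary finite range for the check
pairs = [(k, r) for k in range(2, K_MAX + 1) for r in range(1, k)]

# First inequality: the gap is non-negative for every 1 <= r < k <= K_MAX.
gaps_ok = all(exponent_gap(k, r) >= 0 for k, r in pairs)

# Second inequality: the exponent of q_0 is positive for sample values C > 2.
sample_C = (Fraction(21, 10), 3, 10)
q0_ok = all(q0_exponent(C, k, r) > 0 for C in sample_C for k, r in pairs)

print(gaps_ok, q0_ok)
```

The extreme case for the first inequality is \(r=1\), where it reduces to \(k(3k^2-3k+1)\ge (k-1)^3\); for the second, \(C_2\ge 2C^2\) gives \((C_2-1)/C-C-1\ge C-1/C-1\), which is positive for \(C>2\).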

Proof of Proposition 4.3

Lemma 7.1 (taking \(\beta _{i,j}=f_{i,j}\), \(\eta =Q^C/x\) and \(B_i\ll \epsilon _i^{-1}\Delta ^{-2/(2k)^4}\)) shows that if the assumptions of Proposition 4.3 hold then we can find a subset of essentially orthogonal generators. Using these generators in Lemma 7.3 then gives the required conclusion by taking \(\epsilon _i'=1/B_i'\). Indeed, the first two claims of the proposition are clear. For the final claim we note that

$$\begin{aligned} y (\epsilon _1'\cdots \epsilon _{k'}')^{3C^2-C^2/k'{}^3}&=\frac{y}{(B_1'\cdots B_{k'}')^{3C^2-C^2/k'{}^3}}\\&\gg \frac{x}{(B_1\cdots B_k)^{3C^2-C^2/k^3-C^2/k^4}}\\&=x(\epsilon _1\cdots \epsilon _k\Delta ^{2k/(2k)^4})^{(3C^2-C^2/k^3-C^2/k^4)}\\&\gg x(\epsilon _1\cdots \epsilon _k)^{3C^2-C^2/k^3}. \end{aligned}$$

This gives the final claim, establishing Proposition 4.3. \(\square \)