Cauchy-Schwarz complexity

This is my own presentation of a technique that is quite fundamental to higher order additive combinatorics: the repeated use of the Cauchy-Schwarz inequality to extract generic additive structure from mixed additive structure. The way I have I chosen follows closely the definition and particular form used by Manners [Ma] (Appendix B) in his recent quantitative proof of the inverse theorem for Gowers uniformity norms. The same proof could easily be adapted to other circumstances (e.g. one could use multiplicative derivatives instead of additive derivatives, thereby bounding averages of functions over linear forms by appropriate Gowers uniformity norms, as done in [GoWo], Theorem 2.3, for example).

We say that a collection of integer vectors $u_1,\ldots,u_\ell \in \mathbb{Z}^d$ has Cauchy-Schwarz complexity $\leq t$ with denominator $Q$ at position $1\leq j\leq \ell$ if there exist $\sigma_1,\ldots,\sigma_{t+1}\in \mathbb{Z}^d$ such that

  1. $u_j\cdot \sigma_r\neq 0$ and divides $Q$ for each $1\leq r\leq t+1$,</ul>
  2. for all $i\neq j$ there exists some $1\leq r\leq t+1$ such that $u_i\cdot \sigma_r=0$.
In other words, we can find a collection of $t+1$ hyperplanes such that their union covers all our set except $u_j$.

A simple example is the collection of vectors that determine a $k$-term arithmetic progression, namely $u_1,\ldots,u_k = (1,0),\ldots,(1,k-1)$. This determines an arithmetic progression in that as $(x,d)$ ranges over $H^2$, for some abelian group $H$, the set $\{ u_i\cdot (x,d)\}$ ranges over the arithmetic progressions of length $k$. This collection has Cauchy-Schwarz complexity $k-1$, and hence the behaviour of arithmetic progressions of length $k$ is determined by the $U^{k-1}$ norm.

A more elaborate example, which is that used by Manners in his proof of Lemma 3.4.5 of [Ma], is the collection of vectors in $\mathbb{Z}^{k+2}$ given by \[u_\eta = (1,\eta,0)\textrm{ for }\eta\subset [k]\textrm{ and }\] \[v_\nu = (1,\nu,1)\textrm{ for }\nu\subset[k]\textrm{ with }i\in \nu\textrm{ and }\Abs{\nu}\leq R.\] This has Cauchy-Schwarz complexity at most $R-1$ with denominator $(k-1)!$ at any $\nu_0\subset [k]$ with $i\in \nu_0$ and size $\Abs{\nu_0}\leq R$ (note in particular that the complexity is much smaller than the size). Indeed, we can take the vectors $(0,\ldots,0,1)$, and $(0,\ldots,1,\ldots,0)$ which is only non-zero at some fixed $j\neq i\in \nu_0$, and for $\Abs{\nu_0}\leq r\leq R-1$ we take the vector $(-r-2,1,\ldots,1)$. This is a collection of $R$ vectors, which is easily verified to have the required properties.

The use of Cauchy-Schwarz complexity is that it allows us to convert information about functions 'along a collection of linear forms' into quite generic additive information about those functions. We define the discrete derivative $\partial_h f(x) = f(x+h)-f(x)$, and extend this to vectors in the obvious fashion. One can characterise polynomials of degree $s$ (when, say, dealing with functions $\mathbb{R}\to\mathbb{R}$) by saying that the derivatives of order $s+1$ vanish. Inspired by this, we say that a function $f:H\to \mathbb{R}$ is an $\epsilon$-approximate polynomial of degree $s$ if there are at least $\epsilon \lvert H\rvert^{s+2}$ many $(x,h_1,\ldots,h_{s+1})\in H^{s+2}$ such that $\partial_{h_1,\ldots,h_{s+1}}f(x)=0$. The following lemma (which is basically Lemma 3.5.3 of [Ma]) shows that functions which correlate along systems of low complexity must be approximate polynomials.

Suppose that $u_1,\ldots,u_\ell\in \mathbb{Z}^d$ are a collection of integer vectors of Cauchy-Schwarz complexity $\leq t$ with denominator $Q$ at position $j$, and suppose that $f_1,\ldots,f_\ell:H\to\bbr$ are such that \[\Abs{\left\{ x\in H^d : \sum_i f_i(u_i\cdot x)=0\right\}}\geq \delta\Abs{H}^d.\] Then, provided $(\Abs{H},Q)=1$, the function $f_j$ is a $\delta^{2^{t+1}}$-approximate polynomial of degree $t$.
Note that the conclusion is precisely saying that $\sum (-1)^{\lvert \omega\rvert} f_j(\omega\cdot (x,h_1,\ldots,h_{t+1}))=0$ vanishes for many $(x,h_1,\ldots,h_{t+1})\in H^{t+2}$, where $\omega$ ranges over the vectors of the form $(1,\omega')$ as $\omega'\in\{0,1\}^{t+1}$ is arbitrary. This is a system of complexity $\leq t$ at any place. Thus one can view this lemma as moving from vanishing of a collection of functions along a system of complexity $\leq t$ to vanishing of a single one of those functions along another (more generic) system of complexity $\leq t$. Both aspects of this, the emphasis on one particular function of interest, and the genericity of the new system, are useful.

Proof

Let $F_i:H^d\to \bbr$ be defined by $F_i(x)=f_i(u_i\cdot x)$. We prove by induction on $R$ that, for any $0\leq R\leq t+1$, if we write $\Sigma_R$ for the set of those $i$ such that $u_i\cdot \sigma_r=0$ for some $r\leq R$, then \[\sum_{x\in H^d} \sum_{\mathbf{h}\in H^R}1_{\sum_{i\not\in \Sigma_R}\partial_{\sigma_1h_1,\ldots,\sigma_Rh_R}F_i(x)=0}\geq \delta^{2^R} \Abs{H}^{d+R}.\] The base case $R=0$ is given by assumption. Suppose then that we have this inequality for some fixed $0\leq R\leq t$. Let $\Sigma'=\Sigma_{R+1}\backslash \Sigma_R$. For $i\in \Sigma'$ the value of $F_i$ does not change under shifts by $h\sigma_{R+1}$, since $u_i\cdot \sigma_{R+1}=0$ by construction. It follows that \[\sum_{x\in H^d}\sum_{\mathbf{h}\in H^R} \sum_{h\in H}1_{\sum_{i\not\in \Sigma_{R+1}}\partial_{\sigma_1h_1,\ldots,\sigma_Rh_R}F_i(x+h\sigma_{R+1})=-\sum_{i\in \Sigma'}\partial_{\sigma_1h_1,\ldots,\sigma_Rh_R}F_i(x)}\geq \delta^{2^R}\Abs{H}^{d+R+1}.\] The Cauchy-Schwarz inequality and a change of variables then gives \[\sum_{x\in H^d}\sum_{\mathbf{h}\in H^R} \sum_{h\in H}1_{\sum_{i\not\in \Sigma_{R+1}}\partial_{\sigma_1h_1,\ldots,\sigma_Rh_R}F_i(x+h\sigma_{R+1})=\sum_{i\not\in \Sigma_{R+1}}\partial_{\sigma_1h_1,\ldots,\sigma_Rh_R}F_i(x)}\geq \delta^{2^{R+1}}\Abs{H}^{d+R+1}.\] By definition, \[F_i(x+h\sigma_{R+1})-F_i(x) = \partial_{\sigma_{R+1}h}F_i,\] which completes the inductive proof.

By construction, when $R=t+1$, the only remaining index is $j$, and so we have shown \[\sum_{x\in H^d} \sum_{\mathbf{h}\in H^{t+1}}1_{\partial_{\sigma_1h_1,\ldots,\sigma_{t+1}h_{t+1}}F_j(x)=0}\geq \delta^{2^{t+1}} \Abs{H}^{d+t+1}.\] Recalling that $F_j(x)=f_j(u_j\cdot x)$, we can write the left-hand side as \[\sum_{x'\in H}\sum_{\mathbf{h}'\in H^{t+1}}\sum_{\substack{x\in H^d\\ u_j\cdot x=x'}} \sum_{\substack{\mathbf{h}\in H^{t+1}\\ h_i(\sigma_i\cdot u_j)=h_i'}}1_{\partial_{\mathbf{h}'}f_j(x')=0},\] Since each $\sigma_i\cdot u_j$ is coprime to $\Abs{H}$, the sum over $\mathbf{h}$ is uniquely determined, and the sum over $x$ is at most $\Abs{H}^{d-1}$, and hence \[\sum_{x'\in H}\sum_{\mathbf{h}'\in H^{t+1}}1_{\partial_{\mathbf{h}'}f_j(x')=0},\geq \delta^{2^{t+1}} \Abs{H}^{t+2}\] as required.