Matérn Kernel, Sobolev Space, and Gaussian Random Field

1 Reproducing kernel Hilbert space

Let $X\subset\mathbb{R}^{d}$ be a non-empty set. A function $K:X\times X\to\mathbb{R}$ is called a symmetric positive definite (SPD) kernel if

$\displaystyle K(x,y)=K(y,x)\quad\mathrm{and}\quad\sum_{i,j=1}^{n}c_{i}c_{j}K(x_{i},x_{j})\geq 0$ (1)

for every finite set of points $\{x_{1},\dots,x_{n}\}\subset X$ and every choice of real coefficients $\{c_{1},\dots,c_{n}\}\subset\mathbb{R}$.
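Condition (1) says that every Gram matrix $[K(x_{i},x_{j})]_{i,j}$ is symmetric and positive semi-definite. The following is a minimal numerical sketch of that check on one finite point set; the helper names `gram_matrix` and `looks_spd`, the Gaussian test kernel, and the tolerance are illustrative assumptions, and passing the check on one point set is of course only a necessary condition, not a proof.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Return the n x n matrix [K(x_i, x_j)] for the given points."""
    n = len(points)
    G = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            G[i, j] = kernel(points[i], points[j])
    return G

def looks_spd(kernel, points, tol=1e-10):
    """Check symmetry and non-negative eigenvalues of one Gram matrix (hypothetical helper)."""
    G = gram_matrix(kernel, points)
    symmetric = np.allclose(G, G.T)
    # The smallest eigenvalue is the minimum of c^T G c over unit vectors c.
    smallest = np.linalg.eigvalsh((G + G.T) / 2).min()
    return bool(symmetric and smallest >= -tol)

def gaussian_kernel(x, y):
    """A standard SPD kernel, used here only as a test case."""
    return np.exp(-np.sum((x - y) ** 2))

def negated_kernel(x, y):
    """Symmetric but not positive semi-definite: the Gram matrix has trace -n."""
    return -gaussian_kernel(x, y)

rng = np.random.default_rng(0)
pts = rng.normal(size=(20, 2))
ok_gaussian = looks_spd(gaussian_kernel, pts)
ok_negated = looks_spd(negated_kernel, pts)
```

Here `ok_gaussian` should come out true and `ok_negated` false, since the negated kernel violates the inequality in (1) already for $n=1$.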

The classical Moore-Aronszajn theorem asserts that for every such kernel $K$ there exists a unique Hilbert space $(\mathcal{H},\langle\cdot,\cdot\rangle_{\mathcal{H}})$ of real-valued functions on $X$ with the following properties:

(i) For each $x\in X$, the function $K(\cdot,x)$ belongs to $\mathcal{H}$;

(ii) For every $f\in\mathcal{H}$ and every $x\in X$,

$\displaystyle f(x)=\langle f,\;K(\cdot,x)\rangle_{\mathcal{H}}.$ (2)

The Hilbert space $\mathcal{H}$ is called the reproducing kernel Hilbert space (RKHS) associated with the kernel $K$, and $K$ itself is the reproducing kernel of $\mathcal{H}$. Property (2) is usually called the reproducing property; it implies that, for each $x\in X$, the evaluation functional $f\mapsto f(x)$ is a continuous linear functional on $\mathcal{H}$.
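As a short worked illustration of the reproducing property (not part of the original argument), take a finite kernel expansion $f=\sum_{i}\alpha_{i}K(\cdot,x_{i})$; the coefficients $\alpha_{i}$ and points $x_{i}$ are introduced here only for the example. Applying (2) term by term gives:

```latex
\begin{aligned}
\langle f, K(\cdot,x)\rangle_{\mathcal{H}}
  &= \sum_{i=1}^{n}\alpha_{i}\,\langle K(\cdot,x_{i}), K(\cdot,x)\rangle_{\mathcal{H}}
   = \sum_{i=1}^{n}\alpha_{i}\,K(x,x_{i}) = f(x), \\
\|f\|_{\mathcal{H}}^{2}
  &= \sum_{i,j=1}^{n}\alpha_{i}\alpha_{j}\,K(x_{i},x_{j}) \;\geq\; 0, \\
|f(x)|
  &= |\langle f, K(\cdot,x)\rangle_{\mathcal{H}}|
   \;\leq\; \|f\|_{\mathcal{H}}\,\sqrt{K(x,x)} .
\end{aligned}
```

The second line shows why the inequality in (1) is needed for the norm to be non-negative, and the third line (Cauchy-Schwarz together with $\|K(\cdot,x)\|_{\mathcal{H}}^{2}=K(x,x)$) gives an explicit bound for the evaluation functional.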

2 RKHS with Matérn kernel and Sobolev space

A particularly important class of RKHSs arises from shift-invariant kernels of the form

$\displaystyle K(x,y)=\Phi(x-y),\qquad\Phi:\mathbb{R}^{d}\to\mathbb{R},$

where $\Phi$ is an SPD kernel function. The next result describes the structure of the RKHS associated with such a kernel; see [4, Theorem 10.12].

Theorem 1 (RKHS of a shift-invariant kernel)

Let $K(x,y)=\Phi(x-y)$ be a shift-invariant kernel on $\mathbb{R}^{d}$ with $\Phi\in C(\mathbb{R}^{d})\cap L^{1}(\mathbb{R}^{d})$. Then the associated RKHS is

$\displaystyle\mathcal{H}\;=\;\Bigl\{f\in L^{2}(\mathbb{R}^{d})\cap C(\mathbb{R}^{d}):\mathcal{F}[\Phi]^{-\frac{1}{2}}\,\mathcal{F}[f]\in L^{2}(\mathbb{R}^{d})\Bigr\},$ (3)

where $\mathcal{F}[\Phi]$ is the Fourier transform of $\Phi$. The inner product on $\mathcal{H}$ is

$\displaystyle\langle f,g\rangle_{\mathcal{H}}=\frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^{d}}\frac{\mathcal{F}[f](\omega)\,\overline{\mathcal{F}[g](\omega)}}{\mathcal{F}[\Phi](\omega)}\,\mathrm{d}\omega.$
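As a sanity check on the normalization in Theorem 1, one can verify that the Fourier form of the inner product reproduces $\|K(\cdot,x_{0})\|_{\mathcal{H}}^{2}=K(x_{0},x_{0})=\Phi(0)$. The sketch below assumes the unitary convention $\mathcal{F}[f](\omega)=(2\pi)^{-d/2}\int f(x)e^{-i\langle\omega,x\rangle}\,\mathrm{d}x$ and that $\mathcal{F}[\Phi]$ is integrable, so that Fourier inversion applies; the point $x_{0}$ is introduced only for the example.

```latex
\begin{aligned}
\mathcal{F}[\Phi(\cdot-x_{0})](\omega) &= e^{-i\langle\omega,x_{0}\rangle}\,\mathcal{F}[\Phi](\omega), \\
\|K(\cdot,x_{0})\|_{\mathcal{H}}^{2}
  &= \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^{d}}
     \frac{|\mathcal{F}[\Phi](\omega)|^{2}}{\mathcal{F}[\Phi](\omega)}\,\mathrm{d}\omega
   = \frac{1}{(2\pi)^{d/2}}\int_{\mathbb{R}^{d}}\mathcal{F}[\Phi](\omega)\,\mathrm{d}\omega
   = \Phi(0).
\end{aligned}
```

The last equality is Fourier inversion evaluated at the origin, which agrees with the reproducing property (2) applied to $f=K(\cdot,x_{0})$ at $x=x_{0}$.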

Consider the Matérn kernel

$\displaystyle K_{\nu,l}(x,y)=\Phi_{\nu,l}(x-y),\quad\Phi_{\nu,l}(x)=\frac{2^{1-\nu}}{\Gamma(\nu)}\Bigl(\sqrt{2\nu}\,\frac{\|x\|}{l}\Bigr)^{\nu}B_{\nu}\Bigl(\sqrt{2\nu}\,\frac{\|x\|}{l}\Bigr),$ (4)

where $\nu,l>0$, and $B_{\nu}$ is the modified Bessel function of the second kind. The Fourier transform of $\Phi_{\nu,l}$ is

$\displaystyle\mathcal{F}[\Phi_{\nu,l}](\omega)=C_{\nu,l,d}\left(\frac{2\nu}{l^{2}}+4\pi^{2}\|\omega\|^{2}\right)^{-(\nu+\frac{d}{2})},$ (5)

where $C_{\nu,l,d}$ is a constant depending only on $(\nu,l,d)$; see [5, §4.2]. Thus, the condition in (3) becomes $\left(\frac{2\nu}{l^{2}}+4\pi^{2}\|\omega\|^{2}\right)^{(\nu+\frac{d}{2})/2}\widehat{f}(\omega)\in L^{2}(\mathbb{R}^{d})$. Recall that for any $s\in\mathbb{R}$, the fractional Sobolev space on $\mathbb{R}^{d}$ is

$\displaystyle H^{s}(\mathbb{R}^{d})=\left\{f\in L^{2}(\mathbb{R}^{d}):(1+\|\omega\|^{2})^{\frac{s}{2}}\widehat{f}(\omega)\in L^{2}(\mathbb{R}^{d})\right\},$ (6)

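where $\widehat{f}(\omega)=(2\pi)^{-d/2}\int_{\mathbb{R}^{d}}f(x)\,e^{-i\langle\omega,x\rangle}\,\mathrm{d}x$ is the Fourier transform of $f$. The Sobolev embedding theorem states that $H^{s}(\mathbb{R}^{d})\subset C(\mathbb{R}^{d})$ if $s>\frac{d}{2}$; since $\nu+\frac{d}{2}>\frac{d}{2}$, the continuity requirement in (3) is automatically satisfied. Note that $c_{1}(1+\|\omega\|^{2})\leq\left(\frac{2\nu}{l^{2}}+4\pi^{2}\|\omega\|^{2}\right)\leq c_{2}(1+\|\omega\|^{2})$ for all $\omega\in\mathbb{R}^{d}$, for instance with $c_{1}=\min\{2\nu/l^{2},4\pi^{2}\}$ and $c_{2}=\max\{2\nu/l^{2},4\pi^{2}\}$. (The factor $4\pi^{2}$ in (5) reflects the frequency convention used in [5]; rescaling $\omega$ to match the convention above changes only these constants.) Therefore, combining (3), (5) and (6), we conclude that the RKHS with Matérn kernel $K(x,y)=\Phi_{\nu,l}(x-y)$ coincides with the Sobolev space $H^{\nu+\frac{d}{2}}(\mathbb{R}^{d})$ as a set of functions, and the two norms are equivalent.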
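For concreteness, the radial profile $\Phi_{\nu,l}$ in (4) can be evaluated numerically. The sketch below uses SciPy's `scipy.special.kv` for the modified Bessel function of the second kind and `scipy.special.gamma` for $\Gamma$; the function name `matern` and the handling of $r=0$ by the limiting value $1$ are choices made for this example. It also compares against the well-known closed forms for $\nu=\tfrac12$ and $\nu=\tfrac32$.

```python
import numpy as np
from scipy.special import gamma, kv

def matern(r, nu, ell):
    """Evaluate Phi_{nu,l} from (4) at distances r = ||x|| >= 0 (illustrative helper)."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    z = np.sqrt(2.0 * nu) * r / ell
    out = np.ones_like(z)              # limiting value of the formula as r -> 0
    pos = z > 0
    out[pos] = 2.0 ** (1.0 - nu) / gamma(nu) * z[pos] ** nu * kv(nu, z[pos])
    return out

r = np.linspace(0.0, 3.0, 31)
ell = 0.7

# nu = 1/2: the exponential kernel exp(-r / l)
exp_kernel = np.exp(-r / ell)
# nu = 3/2: (1 + sqrt(3) r / l) * exp(-sqrt(3) r / l)
z32 = np.sqrt(3.0) * r / ell
matern32_closed = (1.0 + z32) * np.exp(-z32)

matches_half = np.allclose(matern(r, 0.5, ell), exp_kernel)
matches_three_halves = np.allclose(matern(r, 1.5, ell), matern32_closed)
```

Larger $\nu$ gives a smoother profile at the origin, in line with the Sobolev order $\nu+\frac{d}{2}$ obtained above.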

For a domain $\Omega\subset\mathbb{R}^{d}$ and $s\geq 0$, the Sobolev space $H^{s}(\Omega)$ is the set of restrictions of functions from $H^{s}(\mathbb{R}^{d})$ to $\Omega$, equipped with the norm

$\displaystyle\|f\|_{H^{s}(\Omega)}:=\inf\left\{\|g\|_{H^{s}(\mathbb{R}^{d})}:g\in H^{s}(\mathbb{R}^{d}),\;g|_{\Omega}=f\right\}.$

The RKHS with kernel $K_{\nu,l}$ can also be defined on $\Omega$. If $\Omega$ is a Lipschitz domain, then $\mathcal{H}$ with Matérn kernel is norm-equivalent to the Sobolev space $H^{\nu+\frac{d}{2}}(\Omega)$; see [4, §10.7].

3 Gaussian random field and covariance operator

The following result states that a Gaussian random field (GRF) with Matérn kernel can be given as the solution of a stochastic partial differential equation (SPDE); see [2].

Theorem 2

Let $GP(0,K_{\nu,l})$ be a centered GRF on a Lipschitz domain $\Omega\subset\mathbb{R}^{d}$ such that the covariance function is the Matérn kernel $K_{\nu,l}$, and let $\Delta$ be the Laplacian with Dirichlet boundary condition. Then $u\sim GP(0,K_{\nu,l})$ is the unique solution of the fractional elliptic SPDE

$\displaystyle\left(\frac{2\nu}{l^{2}}-\Delta\right)^{\frac{\gamma}{2}}u=W,\quad\gamma=\nu+\frac{d}{2},$ (7)

where $W$ is a spatial Gaussian white noise on $L^{2}(\Omega)$.

The above SPDE is understood in the distributional sense. Note that $W$ is an isometry from $L^{2}(\Omega)$ to a centered Gaussian space, and write $\langle W,\phi\rangle:=W(\phi)$. Since $\gamma>\frac{d}{2}$, the operator $(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}$ is Hilbert-Schmidt; this implies that the random element $u$ lies in $L^{2}(\Omega)$ almost surely, and the GRF $GP(0,K_{\nu,l})$ induces a Gaussian measure on $L^{2}(\Omega)$. The Cameron-Martin space of the induced Gaussian measure is the RKHS $\mathcal{H}_{K_{\nu,l}}$, which is norm-equivalent to $H^{\nu+\frac{d}{2}}(\Omega)$ and continuously embedded into $L^{2}(\Omega)$. For any $\phi$ in the domain of $(2\nu/l^{2}-\Delta)^{\frac{\gamma}{2}}$, the SPDE and the self-adjointness of this operator imply that

$\displaystyle\langle W,\phi\rangle=\langle(2\nu/l^{2}-\Delta)^{\frac{\gamma}{2}}u,\phi\rangle_{L^{2}(\Omega)}=\langle u,(2\nu/l^{2}-\Delta)^{\frac{\gamma}{2}}\phi\rangle_{L^{2}(\Omega)},$

hence, taking $\phi=(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\psi$, we obtain $\langle W,(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\psi\rangle=\langle u,\psi\rangle_{L^{2}(\Omega)}$ for any $\psi\in L^{2}(\Omega)$, since $(2\nu/l^{2}-\Delta)$ is invertible on $L^{2}(\Omega)$. For any $\phi,\psi\in L^{2}(\Omega)$, it follows that

$\displaystyle\begin{aligned}\mathbb{E}[\langle u,\phi\rangle_{L^{2}(\Omega)}\langle u,\psi\rangle_{L^{2}(\Omega)}]&=\mathbb{E}[\langle W,(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\phi\rangle\langle W,(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\psi\rangle]\\&=\langle(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\phi,(2\nu/l^{2}-\Delta)^{-\frac{\gamma}{2}}\psi\rangle_{L^{2}(\Omega)}\\&=\langle(2\nu/l^{2}-\Delta)^{-\gamma}\phi,\psi\rangle_{L^{2}(\Omega)}.\end{aligned}$

Note that $\mathbb{E}[\|u\|_{L^{2}(\Omega)}^{2}]<\infty$ by Fernique's theorem. This implies that $B_{u}$ defined by $B_{u}(\phi)=\langle u,\phi\rangle_{L^{2}(\Omega)}$ is a regular zero-mean generalized Gaussian field on $L^{2}(\Omega)$, and its covariance operator is $(2\nu/l^{2}-\Delta)^{-\gamma}$. Therefore, the covariance operator can be written as the kernel integral operator

$\displaystyle(T_{K}f)(x)=\int_{\Omega}K_{\nu,l}(x,y)f(y)\,\mathrm{d}y$ (8)

defined on $L^{2}(\Omega)$; see [3, Exercise 3.2.14]. That is, $T_{K}=(2\nu/l^{2}-\Delta)^{-\gamma}$.
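The covariance computation above has a direct finite-dimensional analogue, which may help to see where each step comes from. The sketch below is an illustration under stated assumptions, not the construction used in [2]: it takes $d=1$, $\Omega=(0,1)$, a standard second-order finite-difference Dirichlet Laplacian on $n$ interior grid points with spacing $h$, and models discretized white noise as a vector $W_{h}$ with covariance $I/h$, so that $h\sum_{i}W_{h,i}\phi_{i}$ has variance $h\sum_{i}\phi_{i}^{2}\approx\|\phi\|_{L^{2}}^{2}$. All function and variable names are hypothetical.

```python
import numpy as np

def dirichlet_operator(n, kappa2):
    """Finite-difference matrix for (kappa2 - d^2/dx^2) on (0, 1) with zero boundary values."""
    h = 1.0 / (n + 1)
    lap = (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1)) / h**2
    return kappa2 * np.eye(n) - lap, h

def frac_power(A, s):
    """A**s for a symmetric positive definite matrix, via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**s) @ V.T

n, nu, ell, d = 200, 1.5, 0.2, 1
gamma_ = nu + d / 2
A, h = dirichlet_operator(n, 2 * nu / ell**2)

# Discrete solution u = A^(-gamma/2) W_h with Cov(W_h) = I / h.
half = frac_power(A, -gamma_ / 2)
cov_u = half @ (np.eye(n) / h) @ half.T

# Two test functions sampled on the interior grid.
x = np.linspace(h, 1 - h, n)
phi = np.sin(np.pi * x)
psi = x * (1 - x)

# E[<u, phi>_h <u, psi>_h] with the discrete inner product <f, g>_h = h * sum(f * g).
lhs = h**2 * phi @ cov_u @ psi
# <A^(-gamma) phi, psi>_h, the discrete counterpart of the last line of the derivation.
rhs = h * (frac_power(A, -gamma_) @ phi) @ psi
```

The two numbers `lhs` and `rhs` agree up to rounding error, mirroring the identity $\mathbb{E}[\langle u,\phi\rangle\langle u,\psi\rangle]=\langle(2\nu/l^{2}-\Delta)^{-\gamma}\phi,\psi\rangle$ at the matrix level.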

A byproduct of the above result is the decay rate of the eigenvalues of the trace-class operator $T_{K}$ on $L^{2}(\Omega)$ or $H^{\nu+\frac{d}{2}}(\Omega)$ (the eigenvalues are the same on these two spaces). For a bounded domain $\Omega\subset\mathbb{R}^{d}$, recall that the Laplacian $-\Delta$ with Dirichlet boundary condition has eigenvalues $\{\lambda_{j}\}_{j}$ (arranged in ascending order) that grow according to Weyl's law $\lambda_{j}\asymp j^{2/d}$; see [1, §6.4]. Therefore, the eigenvalues of $T_{K}$ are the eigenvalues of $(2\nu/l^{2}-\Delta)^{-\gamma}$, which decay at the rate

$\displaystyle\mu_{j}=(2\nu/l^{2}+\lambda_{j})^{-\gamma}\asymp j^{-2\gamma/d}=j^{-\left(1+\frac{2\nu}{d}\right)},\qquad j\to\infty.$ (9)

Therefore, the eigenvalues of $T_{K}$ with Matérn kernel decay polynomially, at the rate $j^{-(1+2\nu/d)}$.
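The rate in (9) can be checked numerically in the simplest case. The sketch below assumes $d=1$ and $\Omega=(0,1)$, where the Dirichlet Laplacian has the explicit eigenvalues $\lambda_{j}=(j\pi)^{2}$, and fits the slope of $\log\mu_{j}$ against $\log j$ over the tail of the spectrum; the function name, the parameter values, and the cut-off $j\geq 100$ are arbitrary choices for illustration.

```python
import numpy as np

def matern_eigenvalues_1d(nu, ell, n_terms):
    """mu_j = (2 nu / l^2 + lambda_j)^(-gamma) on (0, 1), with lambda_j = (j pi)^2 and d = 1."""
    j = np.arange(1, n_terms + 1)
    lam = (np.pi * j) ** 2
    gamma_ = nu + 0.5
    return (2.0 * nu / ell**2 + lam) ** (-gamma_)

nu, ell = 1.5, 1.0
n_terms = 10_000
mu = matern_eigenvalues_1d(nu, ell, n_terms)
j = np.arange(1, n_terms + 1)

# Least-squares fit of log(mu_j) = slope * log(j) + intercept on the tail j >= 100.
tail = j >= 100
slope, intercept = np.polyfit(np.log(j[tail]), np.log(mu[tail]), 1)
predicted = -(1.0 + 2.0 * nu / 1)   # exponent from (9) with d = 1
```

For these parameters the fitted `slope` is close to the predicted exponent $-(1+2\nu)=-4$; for small $j$ the shift $2\nu/l^{2}$ is not negligible, which is why only the tail is used in the fit.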

References

[1] D. Borthwick (2020) Spectral theory: basic concepts and applications. Springer.
[2] F. Lindgren, H. Rue, and J. Lindström (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B: Statistical Methodology 73 (4), pp. 423–498.
[3] S. V. Lototsky, B. L. Rozovsky, et al. (2017) Stochastic partial differential equations. Vol. 11, Springer.
[4] H. Wendland (2004) Scattered data approximation. Vol. 17, Cambridge University Press.
[5] C. K. Williams and C. E. Rasmussen (2006) Gaussian processes for machine learning. Vol. 2, MIT Press, Cambridge, MA.