1 Reproducing kernel Hilbert space

Let $X \subset \mathbb{R}^d$ be a non-empty set. A function $K : X \times X \to \mathbb{R}$ is called a symmetric positive definite (SPD) kernel if

$$K(x,y) = K(y,x) \quad \text{and} \quad \sum_{i,j=1}^{n} c_i c_j K(x_i, x_j) \ge 0 \tag{1}$$

for every finite set of points $\{x_1, \dots, x_n\} \subset X$ and every choice of real coefficients $\{c_1, \dots, c_n\} \subset \mathbb{R}$.
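Condition (1) can be probed numerically on a finite point set. The following is a minimal sketch, assuming a one-dimensional Gaussian kernel as the illustrative choice of $K$; the function names and the sampled points are hypothetical and not part of the text. It checks symmetry exactly and evaluates the quadratic form in (1) for many random coefficient vectors. A finite sample can only fail to refute positive definiteness; it does not prove it.

```python
import math
import random


def gaussian_kernel(x, y, length_scale=1.0):
    """Gaussian kernel exp(-|x - y|^2 / (2 l^2)) on the real line (illustrative choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * length_scale ** 2))


def quadratic_form(kernel, points, coeffs):
    """Evaluate sum_{i,j} c_i c_j K(x_i, x_j), the left-hand side of the inequality in (1)."""
    return sum(
        ci * cj * kernel(xi, xj)
        for xi, ci in zip(points, coeffs)
        for xj, cj in zip(points, coeffs)
    )


rng = random.Random(0)
points = [rng.uniform(-3.0, 3.0) for _ in range(8)]  # hypothetical sample points

# Symmetry: K(x, y) == K(y, x) for all sampled pairs.
is_symmetric = all(
    gaussian_kernel(x, y) == gaussian_kernel(y, x) for x in points for y in points
)

# Smallest value of the quadratic form over 200 random coefficient vectors.
min_form = min(
    quadratic_form(gaussian_kernel, points, [rng.gauss(0.0, 1.0) for _ in points])
    for _ in range(200)
)
```

Up to rounding error, `min_form` should be nonnegative, in agreement with (1).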

The classical Moore-Aronszajn theorem asserts that for every such kernel $K$ there exists a unique Hilbert space $(\mathcal{H}, \langle \cdot, \cdot \rangle_{\mathcal{H}})$ of real-valued functions on $X$ with the following properties:

  (i) For each $x \in X$, the function $K(\cdot, x)$ belongs to $\mathcal{H}$;

  (ii) For every $f \in \mathcal{H}$ and every $x \in X$,

    $$f(x) = \langle f, K(\cdot, x) \rangle_{\mathcal{H}}. \tag{2}$$

The Hilbert space $\mathcal{H}$ is called the reproducing kernel Hilbert space (RKHS) associated with the kernel $K$, and $K$ itself is the reproducing kernel of $\mathcal{H}$. Property (2), usually called the reproducing property, shows that for each $x \in X$ the evaluation functional $f \mapsto f(x)$ is a continuous linear functional on $\mathcal{H}$.
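For functions in the span of kernel sections, the reproducing property can be verified by hand. The sketch below assumes the exponential kernel $e^{-|x-y|}$ on the real line and hypothetical centers and coefficients. For $f = \sum_i a_i K(\cdot, x_i)$ it uses $\langle K(\cdot, x_i), K(\cdot, x) \rangle_{\mathcal{H}} = K(x_i, x)$ to evaluate $\langle f, K(\cdot, x) \rangle_{\mathcal{H}}$, and it checks the Cauchy-Schwarz bound $|f(x)| \le \|f\|_{\mathcal{H}} \sqrt{K(x,x)}$ that underlies the continuity of point evaluation.

```python
import math


def kernel(x, y):
    """Exponential kernel exp(-|x - y|) on the real line (illustrative choice)."""
    return math.exp(-abs(x - y))


centers = [-1.0, 0.2, 1.5]   # hypothetical centers x_i
weights = [0.7, -1.3, 0.4]   # hypothetical coefficients a_i


def f(x):
    """f = sum_i a_i K(., x_i), a finite kernel expansion in the RKHS."""
    return sum(a * kernel(x, c) for a, c in zip(weights, centers))


def inner_with_kernel_section(x):
    """<f, K(., x)>_H computed from <K(., x_i), K(., x)>_H = K(x_i, x)."""
    return sum(a * kernel(c, x) for a, c in zip(weights, centers))


# Squared RKHS norm of f: sum_{i,j} a_i a_j K(x_i, x_j).
norm_sq = sum(
    ai * aj * kernel(ci, cj)
    for ai, ci in zip(weights, centers)
    for aj, cj in zip(weights, centers)
)

test_points = [-2.0, 0.0, 0.9, 3.1]

# Reproducing property (2): f(x) agrees with <f, K(., x)>_H at every test point.
reproducing_gap = max(abs(f(x) - inner_with_kernel_section(x)) for x in test_points)

# Continuity of evaluation: |f(x)| <= ||f||_H * sqrt(K(x, x)).
bound_holds = all(
    abs(f(x)) <= math.sqrt(norm_sq) * math.sqrt(kernel(x, x)) + 1e-12
    for x in test_points
)
```

On a finite span the identity is immediate from the symmetry of $K$; the content of the Moore-Aronszajn theorem is that it extends to the completion.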

2 RKHS with Matérn kernel and Sobolev space

A particularly important class of RKHSs arises from shift-invariant kernels of the form

$$K(x,y) = \Phi(x - y), \qquad \Phi : \mathbb{R}^d \to \mathbb{R},$$

where $\Phi$ is an SPD kernel function. The next result describes the structure of the RKHS associated with such a kernel; see [4, Theorem 10.12].

Theorem 1 (RKHS of a shift-invariant kernel)

Let $K(x,y) = \Phi(x - y)$ be a shift-invariant kernel on $\mathbb{R}^d$ with $\Phi \in C(\mathbb{R}^d) \cap L^1(\mathbb{R}^d)$. Then the associated RKHS is

$$\mathcal{H} = \left\{ f \in L^2(\mathbb{R}^d) \cap C(\mathbb{R}^d) : \mathcal{F}[\Phi]^{-\frac{1}{2}} \, \mathcal{F}[f] \in L^2(\mathbb{R}^d) \right\}, \tag{3}$$

where $\mathcal{F}[\Phi]$ is the Fourier transform of $\Phi$. The inner product on $\mathcal{H}$ is

$$\langle f, g \rangle_{\mathcal{H}} = \frac{1}{(2\pi)^{d/2}} \int_{\mathbb{R}^d} \frac{\mathcal{F}[f](\omega) \, \overline{\mathcal{F}[g](\omega)}}{\mathcal{F}[\Phi](\omega)} \, d\omega.$$

Consider the Matérn kernel

$$K_{\nu,l}(x,y) = \Phi_{\nu,l}(x - y), \qquad \Phi_{\nu,l}(x) = \frac{2^{1-\nu}}{\Gamma(\nu)} \left( \frac{\sqrt{2\nu} \, \|x\|}{l} \right)^{\nu} B_{\nu}\!\left( \frac{\sqrt{2\nu} \, \|x\|}{l} \right), \tag{4}$$

where $\nu, l > 0$, and $B_{\nu}$ is the modified Bessel function of the second kind. The Fourier transform of $\Phi_{\nu,l}$ is

$$\mathcal{F}[\Phi_{\nu,l}](\omega) = C_{\nu,l,d} \left( \frac{2\nu}{l^2} + 4\pi^2 \|\omega\|^2 \right)^{-(\nu + \frac{d}{2})}, \tag{5}$$

where $C_{\nu,l,d}$ is a constant depending only on $(\nu, l, d)$; see [5, §4.2]. Thus, the condition in (3) becomes $\left( \frac{2\nu}{l^2} + 4\pi^2 \|\omega\|^2 \right)^{(\nu + \frac{d}{2})/2} \hat{f}(\omega) \in L^2(\mathbb{R}^d)$. Recall that for any $s \in \mathbb{R}$, the fractional Sobolev space on $\mathbb{R}^d$ is

$$H^s(\mathbb{R}^d) = \left\{ f \in L^2(\mathbb{R}^d) : (1 + \|\omega\|^2)^{\frac{s}{2}} \hat{f}(\omega) \in L^2(\mathbb{R}^d) \right\}, \tag{6}$$

where $\hat{f}(\omega) = (2\pi)^{-d/2} \int_{\mathbb{R}^d} f(x) e^{-i \langle \omega, x \rangle} \, dx$ is the Fourier transform of $f$. The Sobolev embedding theorem states that $H^s(\mathbb{R}^d) \subset C(\mathbb{R}^d)$ if $s > \frac{d}{2}$. Note that $c_1 (1 + \|\omega\|^2) \le \frac{2\nu}{l^2} + 4\pi^2 \|\omega\|^2 \le c_2 (1 + \|\omega\|^2)$ for some $c_1, c_2 > 0$. Therefore, combining (3), (5) and (6), we conclude that the RKHS with Matérn kernel $K(x,y) = \Phi_{\nu,l}(x - y)$ is norm-equivalent to the Sobolev space $H^{\nu + \frac{d}{2}}(\mathbb{R}^d)$.
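The constants in the two-sided bound can be made explicit. Writing $a = 2\nu/l^2$, $b = 4\pi^2$ and $t = \|\omega\|^2 \ge 0$, one admissible choice is $c_1 = \min(a, b)$ and $c_2 = \max(a, b)$, since $(a + bt) - c_1(1 + t) = (a - c_1) + (b - c_1)t \ge 0$, and similarly for $c_2$. The sketch below checks this on a grid; the parameter values and names are hypothetical and chosen only for illustration.

```python
import math


def equivalence_constants(nu, length_scale):
    """Constants with c1 (1 + t) <= a + b t <= c2 (1 + t) for all t >= 0,
    where a = 2 nu / l^2, b = 4 pi^2 and t = ||omega||^2."""
    a = 2.0 * nu / length_scale ** 2
    b = 4.0 * math.pi ** 2
    return min(a, b), max(a, b)


nu, length_scale, d = 1.5, 0.3, 2   # hypothetical parameter values
a = 2.0 * nu / length_scale ** 2
b = 4.0 * math.pi ** 2
c1, c2 = equivalence_constants(nu, length_scale)

# Check the two-sided bound at t = 0 and on a logarithmic grid of t = ||omega||^2.
grid = [0.0] + [10.0 ** (k / 4.0) for k in range(-24, 25)]
tol = 1.0 + 1e-12  # slack for floating-point rounding
two_sided_ok = all(
    c1 * (1.0 + t) <= (a + b * t) * tol and (a + b * t) <= c2 * (1.0 + t) * tol
    for t in grid
)

# Raising to the power s/2 with s = nu + d/2 compares the weights in (3) and (6):
# c1^(s/2) (1 + t)^(s/2) <= (a + b t)^(s/2) <= c2^(s/2) (1 + t)^(s/2).
s = nu + d / 2.0
weight_ratio_bounds = (c1 ** (s / 2.0), c2 ** (s / 2.0))
```

The pair `weight_ratio_bounds` then bounds the ratio of the two Fourier weights uniformly in $\omega$, which is what norm equivalence requires.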

For a domain $\Omega \subset \mathbb{R}^d$ and $s \ge 0$, the Sobolev space $H^s(\Omega)$ is the set of restrictions of functions from $H^s(\mathbb{R}^d)$ to $\Omega$, equipped with the norm

$$\|f\|_{H^s(\Omega)} := \inf \left\{ \|g\|_{H^s(\mathbb{R}^d)} : g \in H^s(\mathbb{R}^d), \ g|_{\Omega} = f \right\}.$$

The RKHS $\mathcal{H}$ with kernel $K_{\nu,l}$ can also be defined on $\Omega$. If $\Omega$ is a Lipschitz domain, then $\mathcal{H}$ with Matérn kernel is norm-equivalent to the Sobolev space $H^{\nu + \frac{d}{2}}(\Omega)$; see [4, §10.7].
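As a concrete check of the Matérn formula (4), the kernel can be evaluated directly and compared with its well-known closed forms: $e^{-r/l}$ for $\nu = 1/2$ and $(1 + \sqrt{3} r/l) e^{-\sqrt{3} r/l}$ for $\nu = 3/2$, where $r = \|x\|$. The sketch below is standard-library only and assumes the integral representation $B_\nu(z) = \int_0^\infty e^{-z \cosh t} \cosh(\nu t) \, dt$ for $z > 0$, discretized by the trapezoid rule. The helper `bessel_k` is a hypothetical stand-in written for this illustration; a library routine for the modified Bessel function would normally be used instead.

```python
import math


def bessel_k(nu, z, step=0.01):
    """Modified Bessel function of the second kind for z > 0, via the integral
    int_0^inf exp(-z cosh t) cosh(nu t) dt approximated by the trapezoid rule."""
    # Truncate once z * cosh(t) is far above 60, where the integrand is negligible.
    t_max = math.acosh(1.0 + 60.0 / z) + 1.0
    n_steps = int(t_max / step) + 1
    total = 0.5 * math.exp(-z)  # endpoint t = 0 with trapezoid weight 1/2
    for k in range(1, n_steps + 1):
        t = k * step
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return total * step


def matern(r, nu=1.5, length_scale=1.0):
    """Phi_{nu,l} from (4) as a function of the distance r = ||x||."""
    if r == 0.0:
        return 1.0  # limiting value as r -> 0
    z = math.sqrt(2.0 * nu) * r / length_scale
    return 2.0 ** (1.0 - nu) / math.gamma(nu) * z ** nu * bessel_k(nu, z)


r = 0.8  # hypothetical test distance

# nu = 1/2: the Matern kernel reduces to exp(-r / l).
err_half = abs(matern(r, nu=0.5) - math.exp(-r))

# nu = 3/2: the Matern kernel reduces to (1 + z) exp(-z) with z = sqrt(3) r / l.
z = math.sqrt(3.0) * r
err_three_halves = abs(matern(r, nu=1.5) - (1.0 + z) * math.exp(-z))
```

Both errors should be at the level of quadrature and rounding error, which supports reading the argument of $B_\nu$ in (4) as $\sqrt{2\nu} \, \|x\| / l$.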

3 Gaussian random field and covariance operator

The following result states that a Gaussian random field (GRF) with Matérn kernel can be given as the solution of a stochastic partial differential equation (SPDE); see [2].

Theorem 2

Let $GP(0, K_{\nu,l})$ be a centered GRF on a Lipschitz domain $\Omega \subset \mathbb{R}^d$ such that the covariance function is the Matérn kernel $K_{\nu,l}$, and let $\Delta$ be the Laplacian with Dirichlet boundary condition. Then $u \sim GP(0, K_{\nu,l})$ is the unique solution of the fractional elliptic SPDE

$$\left( \frac{2\nu}{l^2} - \Delta \right)^{\frac{\gamma}{2}} u = W, \qquad \gamma = \nu + \frac{d}{2}, \tag{7}$$

where $W$ is a spatial Gaussian white noise on $L^2(\Omega)$.

The above SPDE is defined in the distributional sense. Note that $W$ is an isometry from $L^2(\Omega)$ to a centered Gaussian space, and write $\langle W, \phi \rangle := W(\phi)$. Since $(2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}}$ is a Hilbert-Schmidt operator when $\gamma > \frac{d}{2}$, the random element $u$ lies in $L^2(\Omega)$ almost surely, and the GRF $GP(0, K_{\nu,l})$ induces a Gaussian measure on $L^2(\Omega)$. The Cameron-Martin space of the induced Gaussian measure is the RKHS $\mathcal{H}_{K_{\nu,l}}$, which is norm-equivalent to $H^{\nu + \frac{d}{2}}(\Omega)$ and continuously embedded into $L^2(\Omega)$. For any $\phi \in L^2(\Omega)$, the SPDE implies that

$$\langle W, \phi \rangle = \left\langle (2\nu/l^2 - \Delta)^{\frac{\gamma}{2}} u, \phi \right\rangle_{L^2(\Omega)} = \left\langle u, (2\nu/l^2 - \Delta)^{\frac{\gamma}{2}} \phi \right\rangle_{L^2(\Omega)},$$

hence $\langle W, (2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}} \psi \rangle = \langle u, \psi \rangle_{L^2(\Omega)}$ for any $\psi \in L^2(\Omega)$, since $(2\nu/l^2 - \Delta)$ is invertible on $L^2(\Omega)$. For any $\phi, \psi \in L^2(\Omega)$, it follows that

$$\begin{aligned}
\mathbb{E}\left[ \langle u, \phi \rangle_{L^2(\Omega)} \langle u, \psi \rangle_{L^2(\Omega)} \right]
&= \mathbb{E}\left[ \left\langle W, (2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}} \phi \right\rangle \left\langle W, (2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}} \psi \right\rangle \right] \\
&= \left\langle (2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}} \phi, (2\nu/l^2 - \Delta)^{-\frac{\gamma}{2}} \psi \right\rangle_{L^2(\Omega)} \\
&= \left\langle (2\nu/l^2 - \Delta)^{-\gamma} \phi, \psi \right\rangle_{L^2(\Omega)}.
\end{aligned}$$

Note that $\mathbb{E}\left[ \|u\|_{L^2(\Omega)}^2 \right] < \infty$ by Fernique's theorem. This implies that $B_u$, defined by $B_u(\phi) = \langle u, \phi \rangle_{L^2(\Omega)}$, is a regular zero-mean generalized Gaussian field on $L^2(\Omega)$, and its covariance operator is $(2\nu/l^2 - \Delta)^{-\gamma}$. Therefore, the covariance operator can be written as the kernel integral operator

$$(T_K f)(x) = \int_{\Omega} K_{\nu,l}(x, y) f(y) \, dy \tag{8}$$

defined on $L^2(\Omega)$; see [3, Exercise 3.2.14]. That is, $T_K = (2\nu/l^2 - \Delta)^{-\gamma}$.
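The role of the condition $\gamma > \frac{d}{2}$ can be seen in a one-dimensional example. The sketch below assumes $\Omega = (0, 1)$, where the Dirichlet eigenvalues of $-\Delta$ are $\lambda_j = (j\pi)^2$, so the covariance operator has eigenvalues $(2\nu/l^2 + \lambda_j)^{-\gamma}$ and its trace is their sum. The parameter values and function names are hypothetical. It compares how the partial sums grow between $10^3$ and $10^4$ terms for $\gamma = \nu + \frac{d}{2}$ and for the borderline exponent $\frac{d}{2}$.

```python
import math


def covariance_eigenvalue(j, nu, length_scale, gamma):
    """Eigenvalue (2 nu / l^2 + lambda_j)^(-gamma) with lambda_j = (j pi)^2 on (0, 1)."""
    return (2.0 * nu / length_scale ** 2 + (j * math.pi) ** 2) ** (-gamma)


def partial_trace(n_terms, nu, length_scale, gamma):
    """Sum of the first n_terms eigenvalues of the covariance operator."""
    return sum(
        covariance_eigenvalue(j, nu, length_scale, gamma)
        for j in range(1, n_terms + 1)
    )


nu, length_scale, d = 0.5, 1.0, 1   # hypothetical parameter values
gamma = nu + d / 2.0                # exponent from (7); here gamma = 1 > d / 2

# gamma > d/2: the partial sums have essentially converged after 1000 terms.
growth_trace_class = (
    partial_trace(10_000, nu, length_scale, gamma)
    - partial_trace(1_000, nu, length_scale, gamma)
)

# Borderline exponent d/2 (the limit nu -> 0): the partial sums grow like log(n).
growth_borderline = (
    partial_trace(10_000, nu, length_scale, d / 2.0)
    - partial_trace(1_000, nu, length_scale, d / 2.0)
)
```

In the first case the sum of eigenvalues is finite, consistent with $\mathbb{E}[\|u\|_{L^2(\Omega)}^2] < \infty$; in the borderline case the partial sums keep growing and the operator fails to be trace-class.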

A byproduct of the above result is the decay rate of the eigenvalues of the trace-class operator $T_K$ on $L^2(\Omega)$ or $H^{\nu + \frac{d}{2}}(\Omega)$ (the eigenvalues are the same on these two spaces). For a bounded domain $\Omega \subset \mathbb{R}^d$, recall that the Dirichlet Laplacian $-\Delta$ has eigenvalues $\{\lambda_j\}_{j \in \mathbb{N}}$ (arranged in ascending order) that grow according to Weyl's law $\lambda_j \asymp j^{2/d}$; see [1, §6.4]. Therefore, the eigenvalues of $T_K$ are the eigenvalues of $(2\nu/l^2 - \Delta)^{-\gamma}$, which decay at the rate

$$\mu_j = (2\nu/l^2 + \lambda_j)^{-\gamma} \asymp j^{-2\gamma/d} = j^{-(1 + \frac{2\nu}{d})}, \qquad j \to \infty. \tag{9}$$

Therefore, the eigenvalues of $T_K$ with Matérn kernel have a polynomial decay rate.
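The rate in (9) can be checked in the simplest setting. The sketch below assumes $d = 1$ and $\Omega = (0, 1)$, where the Dirichlet eigenvalues are $\lambda_j = (j\pi)^2$ exactly, and estimates the log-log slope of $\mu_j$ between two large indices. The parameter values and names are hypothetical.

```python
import math


def matern_operator_eigenvalue(j, nu, length_scale, d=1):
    """mu_j = (2 nu / l^2 + lambda_j)^(-gamma) on (0, 1), with lambda_j = (j pi)^2
    and gamma = nu + d / 2."""
    gamma = nu + d / 2.0
    return (2.0 * nu / length_scale ** 2 + (j * math.pi) ** 2) ** (-gamma)


nu, length_scale, d = 1.5, 0.5, 1   # hypothetical parameter values
j_lo, j_hi = 10 ** 3, 10 ** 4

# Empirical decay exponent: slope of log(mu_j) against log(j).
slope = (
    math.log(matern_operator_eigenvalue(j_hi, nu, length_scale, d))
    - math.log(matern_operator_eigenvalue(j_lo, nu, length_scale, d))
) / (math.log(j_hi) - math.log(j_lo))

# Exponent predicted by (9): -(1 + 2 nu / d).
predicted = -(1.0 + 2.0 * nu / d)
```

For these values the predicted exponent is $-4$, and the estimated slope approaches it as the indices grow, since the shift $2\nu/l^2$ becomes negligible next to $\lambda_j$.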

References

  • [1] D. Borthwick (2020) Spectral theory: basic concepts and applications. Springer.
  • [2] F. Lindgren, H. Rue, and J. Lindström (2011) An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society Series B: Statistical Methodology 73 (4), pp. 423–498.
  • [3] S. V. Lototsky, B. L. Rozovsky, et al. (2017) Stochastic partial differential equations. Vol. 11, Springer.
  • [4] H. Wendland (2004) Scattered data approximation. Vol. 17, Cambridge University Press.
  • [5] C. K. Williams and C. E. Rasmussen (2006) Gaussian processes for machine learning. Vol. 2, MIT Press, Cambridge, MA.