Gibbs Phenomena for $L^q$-Best Approximation in Finite Element Spaces -- Some Examples

Recent developments in the context of minimum residual finite element methods are paving the way for designing finite element methods in non-standard function spaces. This, in particular, permits the selection of a solution space in which the best approximation of the solution has desirable properties. One of the biggest challenges in designing finite element methods is the presence of non-physical oscillations near thin layers and jump discontinuities. In this article we investigate Gibbs phenomena in the context of $L^q$-best approximation of discontinuities in finite element spaces with $1\leq q<\infty$. Using carefully selected examples, we show that on certain meshes the Gibbs phenomenon can be eliminated in the limit as $q$ tends to $1$. The aim here is to show the potential of $L^1$ as a solution space in connection with suitably designed meshes.


Introduction
This article investigates the Gibbs phenomenon in the context of the $L^q$-best approximation of discontinuous functions in finite element spaces by considering a few carefully selected simple examples that can be analysed in detail. The Gibbs phenomenon was originally discovered by Henry Wilbraham (1848), and described by Willard Gibbs (1899) in the context of approximating jump discontinuities by partial sums of Fourier series. It also occurs in the best approximation of functions either by a trigonometric polynomial in the $L^1$-metric (Moskona et al., 1995) or by spline functions in the $L^2$-metric (Richards, 1991). The best approximation in finite element spaces consisting of piecewise polynomials is closely related to the last example. Saff and Tashev (1999) show that in one dimension the best approximation of a jump discontinuity by polygonal lines leads to Gibbs phenomena for all $1 < q < \infty$, but that these vanish as $q \to 1$; this is the starting point of our investigation.
We consider several meshes in one and two dimensions and show that on certain meshes the over- and undershoots in the best approximation can be eliminated in the limit $q \to 1$. These results are extensions of Saff and Tashev (1999). However, there exist meshes in both one and two dimensions that do not satisfy this property. The aim of this article is therefore to illustrate which properties the underlying mesh must satisfy to ensure that the oscillations vanish in the $L^q$-best approximation of discontinuous functions.
This study of $L^q$-best approximations in finite element spaces is motivated by approximating solutions to partial differential equations (PDEs) in subspaces of $L^1(\Omega)$. Guermond (2004) points out that there are only very few attempts at achieving this despite the fact that first-order PDEs and their non-linear generalizations have been extensively studied in $L^1(\Omega)$. The existing numerical methods which seek an approximation directly in $L^1(\Omega)$ include the ones outlined in the articles by Lavery (1988, 1989, 1991), the reweighted least-squares method of Jiang (1993, 1998) and the methods outlined in the series of articles by Guermond et al. (Guermond, 2004; Guermond and Popov, 2007; Guermond et al., 2008; Guermond and Popov, 2008/09, 2009). More recently, a novel approach to designing finite element methods in a very general Banach space setting has been introduced by Muga and van der Zee (2017) and applied to the advection-reaction equation (Muga et al., 2019) and to the convection-diffusion-reaction equation (Houston et al., 2019). This approach is based on the so-called discontinuous Petrov-Galerkin methods (e.g., Demkowicz and Gopalakrishnan, 2014) and extends the concept of optimal test norms and functions from Hilbert spaces to more general Banach spaces. At least in an abstract sense, this approach outlines how to design a numerical method that leads to a quasi-best approximation of the solution in a space of choice, provided the continuous problem is well-posed in a suitable sense. In practice, there are hurdles to overcome to design a practical method, but this is not the subject of this article. Nonetheless, it opens up a new approach to designing numerical methods that raises the question of which norms and spaces are favourable for the approximation of certain types of PDEs.
In the context of approximating solutions containing discontinuities and under-resolved interior and boundary layers, the numerical results for existing $L^1$-methods suggest that such features can be approximated as sharply as a given mesh permits without exhibiting spurious over- or undershoots. This property clearly gives them an enormous advantage over traditional finite element methods yielding approximations in subspaces of $L^2(\Omega)$. Indeed, it is well-known that even seemingly simple examples such as the transport equation or convection-dominated diffusion equations require extra care in the design of the method, with the standard Galerkin finite element method being unstable, and alternative methods often requiring so-called stabilization and/or shock-capturing techniques (e.g., John and Knobloch, 2007a,b, 2008; Roos et al., 2008).
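The subdifferential of a function $f : V \to \mathbb{R}$ at a point $v \in V$ is denoted by $\partial f(v) \subset V^*$, where $V^*$ is the dual space of $V$. Furthermore, for $v, w \in V$, we write $\partial f(v)(w)$ to denote $\varphi(w)$ for an arbitrary $\varphi \in \partial f(v)$.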
where n denotes the unit outward normal vector on the boundary of the domain.
We seek an approximation of the analytical solution in a finite dimensional space that consists of continuous piecewise linear polynomials defined on a given mesh. In one dimension, we are interested both in uniform and non-uniform meshes. In two dimensions we consider predominantly uniform and structured meshes, although we include one example of an unstructured mesh.
If $\varepsilon \ll 1$, then the second-order term is completely dominated by the first-order term and away from the outflow boundary the solution is essentially given by the solution to the advection problem obtained by setting $\varepsilon$ to zero. For the above problems this means that $u \approx 1$ away from the outflow boundary. Due to the Dirichlet boundary conditions, a boundary layer forms near the outflow boundary. If the diameter of the elements near the boundary layer is large compared with $\varepsilon$, the layer is fully contained within these elements and, in the above problems, $u \approx 1$ in the rest of the domain. Numerically, this essentially means that we approximate the problems (1.1)/(1.2) with $\varepsilon = 0$ while still keeping the boundary conditions at both ends. Clearly, the analytical solution for the above problems with $\varepsilon = 0$ and the boundary conditions only imposed on the inflow part of the boundary is $u \equiv 1$. This motivates us to consider the best approximations of $u \equiv 1$ by linear finite element functions satisfying the boundary conditions given in (1.1) and (1.2), respectively.
Fig. 1 shows the $L^q$-best approximation of $u \equiv 1$ by a piecewise linear function $u_h$ satisfying $u_h(0) = 1$ and $u_h(1) = 0$ on a uniform mesh consisting of four elements with $q = 2$ and $q = 1.2$. We can see that in both cases over- and undershoots are present in the approximation, but that the magnitude of these oscillations is significantly smaller for $q = 1.2$. This example illustrates the phenomenon of reducing oscillations in the approximation as $q \to 1$ that we shall investigate in this article.
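The experiment behind Fig. 1 can be imitated in a few lines of Python. The following is a minimal sketch and not the authors' implementation: it assumes a uniform mesh of $(0, 1)$, evaluates the $L^q$ error exactly element by element, and minimises over the interior nodal values by cyclic coordinate descent with a golden-section line search. All function names are illustrative.

```python
def element_integral(a, b, q, h):
    """Exact integral of |e|^q over an element of length h on which the
    error e is linear with end values a and b."""
    if abs(b - a) < 1e-10:
        return h * abs(0.5 * (a + b)) ** q
    # An antiderivative of |s|^q is s*|s|^q/(q+1); valid across sign changes.
    return h * (b * abs(b) ** q - a * abs(a) ** q) / ((q + 1.0) * (b - a))


def lq_error(nodal, q):
    """||1 - u_h||_{L^q}^q for u_h given by its nodal values on a uniform mesh."""
    n = len(nodal) - 1
    h = 1.0 / n
    e = [1.0 - v for v in nodal]
    return sum(element_integral(e[i], e[i + 1], q, h) for i in range(n))


def golden_min(f, lo, hi, iters=80):
    """Golden-section search for the minimiser of a unimodal function."""
    r = (5.0 ** 0.5 - 1.0) / 2.0
    c = hi - r * (hi - lo)
    d = lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)


def best_approximation(q, n=4, sweeps=200):
    """Nodal values of the L^q-best approximation of u = 1 with
    u_h(0) = 1 and u_h(1) = 0 (coordinate descent; a sketch only)."""
    nodal = [1.0] * n + [0.0]
    for _ in range(sweeps):
        for i in range(1, n):
            def f(v, i=i):
                trial = nodal[:i] + [v] + nodal[i + 1:]
                return lq_error(trial, q)
            nodal[i] = golden_min(f, -1.0, 3.0)
    return nodal


u_q2 = best_approximation(2.0)
u_q12 = best_approximation(1.2)
print("q = 2.0:", u_q2)
print("q = 1.2:", u_q12)
```

For $q = 2$ the minimisation reduces to a linear system with the tridiagonal mass matrix, whose solution gives the interior nodal values $57/56$, $13/14$ and $71/56$; the sketch should reproduce these up to the line-search tolerance. Coordinate descent is chosen here only because it needs no external libraries; it is not expected to be efficient for $q$ very close to $1$.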

Problem Statement
We consider a subdivision $\Omega_h$ of the domain $\Omega = (0, 1)^d$, $d = 1, 2$, into $n$ disjoint open simplicial elements (i.e., subintervals when $d = 1$ and triangles when $d = 2$) $\kappa_i$, $i = 1, \dots, n$, such that $\overline{\Omega} = \bigcup_{i=1}^{n} \overline{\kappa}_i$, and define $U_h$ to be the standard finite element space consisting of continuous piecewise linear polynomials on the mesh $\Omega_h$. Let $u \equiv 1$ and consider the following (constrained) best approximation problem:
Note that the constraint can be removed by using a Dirichlet lift argument as commonly employed in the context of finite element methods and restricting the space $U_h$ to the functions that are zero on the part of the boundary where boundary conditions are imposed. In one dimension, instead of $u(x) \equiv 1$, we also consider the $L^q$-best approximation of $u(x) = \operatorname{sgn}(x)$ on $(-1, 1)$ by a continuous piecewise linear function $u_h$ satisfying $-u_h(-1) = u_h(1) = 1$. We use this example to establish the link between our work and Saff and Tashev (1999).

Summary of Results
The main result of this article consists of the precise analysis of very simple examples that illustrate the behaviour of $L^q$-best approximations of discontinuities by continuous piecewise linear polynomials on coarse meshes. We have furthermore included some numerical examples that confirm the theoretical analysis and illustrate how the observed behaviour in simple model examples applies to more general scenarios. In particular, we demonstrate that the over- and undershoots observed in $L^q$-best approximations for $1 < q < \infty$ decrease as $q \to 1$. Whether these oscillations disappear entirely depends on the mesh used to define the underlying finite dimensional approximation space. In one dimension, Gibbs phenomena can be eliminated on uniform meshes both for a boundary discontinuity and a jump discontinuity present in the interior of the domain. For non-uniform meshes it depends on the relative sizes of the elements.
In two dimensions, we show that there exist uniform and structured meshes for which Gibbs phenomena are not eliminated. However, we also include examples of meshes in two dimensions on which the over- and undershoots vanish as $q \to 1$. Furthermore, we will illustrate that there exist infinitely many $L^1$-best approximations in certain cases, which is due to the fact that $L^1(\Omega)$ is not strictly convex.
The first example we consider is the approximation problem (1.3) with $d = 1$. The following theorem precisely characterises the $L^q$-best approximation for all $1 \le q < \infty$ for any two-element mesh on $(0, 1)$. Note that in this case, the approximation problem only has one degree of freedom due to the boundary conditions. Furthermore, we prove for an $N$-element mesh that there exists an $L^1$-best approximation with no over- or undershoot if either a grading-type mesh condition is satisfied or a stronger, simple element-size condition. The precise result is given below.
Theorem 1.1 ($L^q$-best approximation of a boundary discontinuity).
1. Consider the mesh given by the subdivision of $(0, 1)$ into the two intervals $(0, 1-h)$ and $(1-h, 1)$ with $h \in (0, 1)$. For $1 \le q < \infty$, the solution of the approximation problem (1.3) with $u \equiv 1$ and $d = 1$ is given by a continuous piecewise linear polynomial $u_h$ that satisfies the boundary conditions and $u_h(1-h) = \alpha$, where $\alpha$ is defined as follows

Let the mesh be given by a subdivision of the interval $(0, 1)$ into $N \ge 2$ subintervals.
Then condition (1.8) is a sufficient condition for the existence of an $L^1$-best approximation $u_h$ of $u \equiv 1$ with $u_h(0) = 1$ and $u_h(1) = 0$ satisfying $u_h(x_i) = 1$ for all $i = 1, \dots, N-1$. Hence, $u_h$ contains no over- or undershoots.

Let the mesh be given by a subdivision of the interval $(0, 1)$ into $N \ge 2$ subintervals.
The length $h_i$ of the $i$th subinterval is given by $h_i = x_i - x_{i-1}$, $i = 1, \dots, N$.

Remark 1.2. Note that in the second part of Theorem 1.1, condition (1.8) essentially states that elements cannot be too small compared to their neighbouring element closer to the discontinuity. Furthermore, there are no conditions on the size of the elements contained in (0,
This means that the mesh can be designed in such a way that it is allowed to be arbitrary away from the discontinuity without leading to oscillations. This observation is particularly useful if more than one discontinuity is to be approximated.

Remark 1.3. With very similar arguments as in the proof of the final part of Theorem 1.1, it is easy to see that if $h_N > h_{N-1}$, but $h_{N-1} \le h_i$ for all $i = 1, \dots, N-2$, then every $L^1$-best approximation must contain over- or undershoots. Moreover, there exists an $L^1$-best approximation with overshoot only at the node $x_{N-1}$ and no further over- or undershoots, i.e., $u_h(x_i) = 1$ for $i = 1, \dots, N-2$. The value of $u_h(x_{N-1})$ follows from the first part of the theorem by rescaling the interval.
Fig. 2 shows $\alpha$ specified in Theorem 1.1 for two different ranges of $q$ and three different choices of $h$. The plot shows that $\alpha < 2$ for all $1 \le q < \infty$ and that $\alpha$ decreases as $q \to 1$ for all three choices of $h$. Furthermore, we can see that the behaviour as $q \to \infty$ is very similar for all choices of $h$, but that there are clear differences as $q \to 1$. For $h = 0.25$ and $h = 0.5$, $\alpha$ approaches $1$ as $q \to 1$, hence the overshoot vanishes as $q \to 1$, whereas for $h = 0.75$ it approaches $\sqrt{2h} \approx 1.2247$, hence the overshoot does not vanish. This is consistent with the results obtained for the $L^1$-best approximation, cf. (1.4).
In Section 6.2 we include examples of two three-element meshes violating the sufficient condition in part three of Theorem 1.1 such that one of the meshes satisfies (1.8), whereas the other mesh violates this condition as well.We will demonstrate that for the latter mesh the overshoot does indeed not vanish entirely as q → 1.
The second best approximation problem we analyse is the best approximation of $u(x) = \operatorname{sgn}(x)$ on $(-1, 1)$ on a mesh consisting of exactly four elements that is symmetric with respect to $x = 0$. The main difference to the result in part one of Theorem 1.1 is that there exists a whole family of best approximations if $q = 1$. For $q > 1$, we observe the same behaviour as before.
4. In the limit $q \to 1$ the $L^q$-best approximation converges to the $L^1$-best approximation with $u_h(0) = 0$ for any $h \in (0, 1)$. The corresponding $L^1$-best approximation is anti-symmetric and satisfies
We again observe that the presence of over- and undershoots in the $L^1$-best approximation depends on the choice of mesh. Furthermore, there exists a whole family of $L^1$-best approximations in this case, which is possible since $L^1$ is not strictly convex and therefore minimizers are not necessarily unique. We recover uniqueness if we define the minimizer as the limit as $q \to 1$ of the $L^q$-minimizer. Moreover, it follows from the proof of Theorem 1.4 that the $L^1$-best approximation is unique if the subdivision of the interval is no longer symmetric, as we will see in Section 4.
In order to see how this result relates to the work in Saff and Tashev (1999), it first has to be noted that there are two major differences between our investigation and Saff and Tashev (1999):
1. The interval in Saff and Tashev (1999) is subdivided into $2n$ subintervals of equal length. In contrast to this, we only consider the special case that $(-1, 1)$ is subdivided into $4$ subintervals and instead allow the subdivision to be non-uniform but still symmetric with respect to the center of the interval.
2. We consider bounded domains with fixed boundary conditions, which are relevant to finite element approximations, whereas the investigation in Saff and Tashev (1999) considers the limit $n \to \infty$ for the interval $[-nh, nh]$ (essentially an infinite domain) with no boundary conditions.
In Saff and Tashev (1999) it is shown that for a uniform subdivision of the interval $[-nh, nh]$, the over- and undershoots disappear as $n \to \infty$ and $q \to 1$. The last point in Theorem 1.4 shows that, on a fixed mesh, we recover the result that the over- and undershoots disappear as $q \to 1$ for $h \le 0.5$, which includes the case of a uniform mesh. However, if $h > 0.5$, the over- and undershoots do not disappear as $q \to 1$.
The final theoretical result concerns the solution to (1.3) with $d = 2$ on the four meshes shown in Fig. 3. Note that the discrete space $U_h$ has only one degree of freedom on Mesh 1, corresponding to the value at the midpoint, and $U_h$ has three degrees of freedom on the other meshes, corresponding to the values at the three nodes on the line $x = 0.5$. For the first mesh, we analyse the $L^q$-best approximation for all $1 \le q < \infty$ and show that the solution contains an overshoot that does not disappear as $q \to 1$. For Mesh 2, we show that any $L^1$-best approximation must contain over- or undershoots and characterise an $L^1$-best approximation. Furthermore, we prove that there exists an $L^1$-best approximation on Meshes 3 and 4 without over- or undershoots. Finally, we demonstrate numerically in Section 6.3.1 that the $L^q$-best approximation on Meshes 2, 3 and 4 indeed approaches the $L^1$-best approximation characterised in the theorem below.
Theorem 1.5 shows that, on Meshes 1 and 2, the $L^q$-best approximations exhibit an overshoot for all $q$, including $q = 1$, while on Meshes 3 and 4 the $L^1$-best approximation does not contain any over- or undershoots.
Fig. 4 shows the parameter $\alpha$ defining the $L^q$-best approximation on Mesh 1 for two different ranges of $q$. The plot shows that $\alpha < 2$ for all $q$ and that $\alpha$ decreases as $q \to 1$, where it approaches $1.32$. This is consistent with the result in Theorem 1.5 obtained for the $L^1$-best approximation. To confirm the theoretical results, we have also determined the $L^q$-best approximation numerically by implementing (5.5) using FEniCS (Alnaes et al., 2015). The solution to the resulting non-linear system can be approximated using a Newton iteration if $q$ is sufficiently close to $2$. Note that this solver is not robust in $q$ and stalls or diverges for $q$ close to $1$ and for $q \gg 2$. The left plot in Fig. 4 shows numerically determined approximations of $\alpha$ for selected values of $q$ which confirm the theoretical results.
We also include further numerical experiments in Section 6 illustrating that the observations remain the same if $u$ is a more general smooth function and that the over- and undershoots cannot be eliminated by refining the mesh.

Outline of the Paper
The remainder of this article is organised as follows: in Section 2 we describe a characterisation of the $L^q$-best approximation of a function in a finite dimensional subspace that we will use to prove our theoretical results; Sections 3, 4 and 5 contain the proofs of Theorems 1.1, 1.4 and 1.5, respectively. We conclude with several numerical examples in Section 6 illustrating the effect of mesh refinement in one and two dimensions and showing the behaviour of the $L^q$-best approximation as $q \to 1$ in one dimension, as well as on structured and unstructured meshes in two dimensions.

Characterisation of Best $L^q$-Approximation
In this section we describe a characterisation of best approximation in Banach spaces and, more specifically, in the Lebesgue spaces $L^q(\Omega)$, $1 \le q < \infty$. This characterisation will be used in the remainder of this article to determine the best $L^q$-approximation for specific examples.
If $U$ is a Banach space and $f : U \to \mathbb{R}$ is a convex function that is Gâteaux differentiable, the subdifferential is single valued and agrees with the Gâteaux derivative. We now quote the following theorem, cf. (Singer, 1970, Theorem 1.1).
Theorem 2.1 (Characterisation of best approximation). Let $U$ be a Banach space, $U_h \subset U$ a closed subspace and $u \in U$. The following statements are equivalent:

Remark 2.2. The subdifferential $\partial(\|\cdot\|_U)(\cdot)$ can be characterised as follows, cf., e.g., (Cioranescu, 1990, Chapter 1, Proposition 3.4). For any $w \in U$,
This characterisation allows us to translate the above formulation of Theorem 2.1 directly into the formulation found in Singer (1970). In Muga and van der Zee (2017) the same theorem is stated in terms of the so-called duality mapping, which can also be easily translated into the above formulation.
First we will use Theorem 2.1 to characterise best approximants in subspaces of $L^q(\Omega)$, $1 < q < \infty$. To this end, we determine the subdifferential $\partial\|\cdot\|_{L^q(\Omega)}(w)$ for an arbitrary $w \in L^q(\Omega)$ and $1 < q < \infty$. Note that in this case the norm is Gâteaux differentiable; indeed, we can compute for $w \not\equiv 0$:
by the canonical identification of an element in the dual space of $L^q(\Omega)$ with a function in $L^{q'}(\Omega)$, where $1 = 1/q + 1/q'$. The following Corollary is an immediate consequence of this by setting $w = u - u_h$.
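For orientation, the standard form of this Gâteaux derivative can be written out as follows. This is a textbook identity stated here as a hedged sketch; the notation $g_w$ is introduced only for this illustration and may differ from the authors' display.

```latex
% Gateaux derivative of the L^q norm at w \not\equiv 0 in direction v, 1 < q < \infty:
\[
  \frac{d}{dt}\,\|w + t v\|_{L^q(\Omega)}\Big|_{t=0}
  = \|w\|_{L^q(\Omega)}^{1-q} \int_\Omega \operatorname{sgn}(w)\,|w|^{q-1}\, v \,dx
  = \int_\Omega g_w\, v \,dx ,
\]
% with the representing function (hypothetical name g_w)
\[
  g_w := \|w\|_{L^q(\Omega)}^{1-q}\,\operatorname{sgn}(w)\,|w|^{q-1} \in L^{q'}(\Omega),
  \qquad \|g_w\|_{L^{q'}(\Omega)} = 1,
\]
% where the norm identity uses (q-1)q' = q.
```

Since the scalar factor $\|w\|_{L^q(\Omega)}^{1-q}$ is positive, it does not affect whether the integral against a test function vanishes, which is why only $\operatorname{sgn}(w)|w|^{q-1}$ appears in the orthogonality-type conditions used later.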
Corollary 2.3. Let $U := L^q(\Omega)$ and $U_h \subset U$ a closed subspace. The function $u_h \in U_h$ is an $L^q$-best approximation of $u$ if and only if
Next we will use (2.1) to characterise best approximations in subspaces of $L^1(\Omega)$. Note that in this case the subdifferential $\partial\|\cdot\|_{L^1(\Omega)}(w)$ is in general not single valued for an arbitrary $w \in L^1(\Omega)$. From (2.1), we deduce that
It is easy to see that any $\psi$ such that $\psi = \operatorname{sgn}(w)$ if $w \neq 0$ and $|\psi| \le 1$ almost everywhere satisfies the above conditions. Conversely, the first property implies $|\psi(x)| \le 1$ almost everywhere and the second property implies that $\psi(x) = 1$ almost everywhere on $\{w(x) > 0\}$ and $\psi(x) = -1$ almost everywhere on $\{w(x) < 0\}$ since
It is important to note that the only condition on $\psi$ on the set $\{w(x) = 0\}$ is that $|\psi| \le 1$ almost everywhere. The following Corollary characterising $L^1$-best approximations is a direct consequence of this by setting $w = u - u_h$.
Corollary 2.4. Let $U := L^1(\Omega)$ and $U_h \subset U$ a closed subspace. The function $u_h \in U_h$ is an $L^1$-best approximation of $u$ if and only if there exists a function
Note that in the case that $u$ and $u_h$ only agree on a set of measure zero, the choice of $\psi_0 \in [-1, 1]$ becomes irrelevant.

Best Approximation of a Boundary Discontinuity in One Dimension
In this section we consider the best approximation problem (1.3) in one dimension and provide a proof of Theorem 1.1. We split this into three parts: Sections 3.1 and 3.2 contain the proof of the first part of the theorem, where the former addresses the case $q = 1$ and the latter the case $1 < q < \infty$; Section 3.3 contains the proof of the second and third part of the theorem. In the first part of Theorem 1.1, we consider the mesh consisting of the two subintervals $(0, 1-h)$ and $(1-h, 1)$. The best approximation $u_h$ of $u \equiv 1$ by a continuous piecewise linear function satisfying the boundary conditions $u_h(0) = 1$ and $u_h(1) = 0$ is determined entirely by the value it takes at the point $x = 1-h$. Thereby, we can write $u_h = \varphi_0 + \alpha\varphi_1$, where $\alpha$ is to be determined and $\varphi_0$ and $\varphi_1$ denote the nodal basis functions associated with the nodes $x = 0$ and $x = 1-h$, respectively. Fig. 5 shows the two functions $\varphi_0$ and $\varphi_1$ as well as an approximation $u_h$ of $u \equiv 1$ with $\alpha > 1$. To eliminate the constraint by introducing a Dirichlet lift in the best approximation problem (1.3), we could define the subspace $U_h$ as the span of $\varphi_1$ and redefine $u = 1 - \varphi_0$ and $u_h = \alpha\varphi_1$. Note, however, that $u - u_h$ remains the same. The main consequence of this observation is that (2.2) and (2.3) do not have to be satisfied for $w_h = \varphi_0$ due to the boundary condition constraint.

$L^1$-Best Approximation
In this section we give a proof of the first part of Theorem 1.1 for the case $q = 1$. More precisely, we show that the $L^1$-best approximation of $u \equiv 1$ on $(0, 1)$ by a continuous piecewise linear function $u_h$ satisfying the boundary conditions $u_h(0) = 1$ and $u_h(1) = 0$ is given by $u_h = \varphi_0 + \alpha\varphi_1$, where
Proof. Using the characterisation of the best approximation given in Corollary 2.4, we distinguish between two cases:
1. The set $\{x \in (0, 1) : (u - u_h)(x) = 0\}$ has measure zero. (For continuous piecewise linear functions this set has to consist of a finite number of points.) This means that $\psi = \operatorname{sgn}(u - u_h)$ everywhere except on a set of measure zero and is thus uniquely defined almost everywhere.
2. The set {x ∈ (0, 1) : (u − u h )(x) = 0} has positive measure, i.e., the set contains an interval of positive length.This means that ψ is not uniquely defined on a set with positive measure.

$L^q$-Best Approximation
In this section we give a proof of the first part of Theorem 1.1 for the case $1 < q < \infty$. More precisely, we show that the $L^q$-best approximation of $u \equiv 1$ on $(0, 1)$ by a continuous piecewise linear function $u_h$ satisfying the boundary conditions $u_h(0) = 1$ and $u_h(1) = 0$ is given by $u_h = \varphi_0 + \alpha\varphi_1$, where $\alpha > 1$ and
Again, we have to split the integral on each element into the parts where $u - u_h > 0$, $u - u_h < 0$ and $u - u_h = 0$. We consider three cases
If $\alpha < 1$, we have $u - u_h > 0$ everywhere in $(0, 1)$ and thus both $\operatorname{sgn}(u - u_h)|u - u_h|^{q-1} > 0$ and $\varphi_1 > 0$ in $(0, 1)$. Therefore, we have
hence $\alpha = 1$ is not possible. We can therefore assume $\alpha > 1$. In this case $u$
Hence, the $L^q$-best approximation can be determined by finding $\alpha > 1$ satisfying
existence of which is guaranteed since the $L^q$-best approximation always exists.
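As a worked sketch of the case $\alpha > 1$, the error functional can be evaluated in closed form. The derivation below is our own computation under the stated setting, assuming $\varphi_0$ and $\varphi_1$ are the nodal hat functions at $x = 0$ and $x = 1-h$; the name $J$ is introduced only for this illustration. The resulting scalar equation agrees, up to an overall sign, with the equation for $\alpha$ quoted in Section 4.3.

```latex
% Error on (0, 1-h): u - u_h = -(\alpha - 1)\, x/(1-h).
% Error on (1-h, 1): u - u_h = 1 - \alpha + \alpha t, \quad t = (x - (1-h))/h \in (0,1).
\[
  J(\alpha) := \|u - u_h\|_{L^q(0,1)}^q
  = \frac{(1-h)(\alpha-1)^q}{q+1}
  + \frac{h\left[(\alpha-1)^{q+1} + 1\right]}{\alpha\,(q+1)},
  \qquad \alpha > 1 .
\]
% Setting J'(\alpha) = 0 and multiplying by \alpha^2 (q+1):
\[
  (1-h)\,\alpha^2 q\,(\alpha-1)^{q-1} + h\,(\alpha q + 1)(\alpha-1)^q - h = 0 .
\]
% Formal limit q = 1:
\[
  (1-h)\,\alpha^2 + h\,(\alpha+1)(\alpha-1) - h = \alpha^2 - 2h = 0
  \quad\Longrightarrow\quad \alpha = \sqrt{2h},
\]
% which is admissible (\alpha > 1) only if h > 1/2.
```

The last line is consistent with the value $\sqrt{2h} \approx 1.2247$ quoted for $h = 0.75$; for $h \le 1/2$ the equation with $q = 1$ has no root with $\alpha > 1$, which matches the observation that the overshoot vanishes in that regime.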

Sufficient Conditions on General Meshes
In this section we provide a proof of the second and third parts of Theorem 1.1. To this end, let the mesh be given by a subdivision of the interval $(0, 1)$ into $N \ge 2$ subintervals $(x_{i-1}, x_i)$, $i = 1, \dots, N$. The length $h_i$ of the $i$th subinterval is given by $h_i = x_i - x_{i-1}$, $i = 1, \dots, N$. In order to prove the second part of Theorem 1.1, we show that the following conditions are sufficient for the existence of an $L^1$-best approximation $u_h$ of $u \equiv 1$ with $u_h(0) = 1$ and $u_h(1) = 0$ satisfying $u_h(x_i) = 1$ for all $i = 1, \dots, N-1$:
where $\vartheta_N := 0$,
We then show that the much simpler condition $h_N \le \min_{i=1,\dots,N-1} h_i$ implies (3.2), which proves the third part of Theorem 1.1.
For $\alpha \in (0, 1]$, define $\psi_\alpha(x)$ as follows:
We claim that there exists $\alpha \in (0, 1]$ such that 2) is satisfied. First we show that $\psi_\alpha$ is well defined. For this we require $\vartheta_i$ to be well defined, i.e., we require $\vartheta_i \in [0, 1]$ for all $i = M, \dots, N-1$. This is trivially true for $\vartheta_N$. Otherwise, for $i \ge M$,
With this in mind, we now consider (3.3) for $i > M$. In this case
Hence (3.4) becomes zero if and only if
which is the definition of $\vartheta_i$. Next, we consider $i = M$. We have already established $\vartheta_M^2 \le 1/2$. With $\alpha > 0$, this implies θM $< 1$. Furthermore, (3.6) still holds with $i = M$, whereas we obtain
Hence (3.4) becomes zero for $i = M$ if and only if
which is the definition of θM. Finally, we have to show that there exists $\alpha \in (0, 1]$ such that
The last expression becomes zero if and only if
Hence, we need $\alpha$ such that
By the definition of $M$, we have
This shows that (3.2) is indeed a sufficient condition for the existence of an $L^1$-best approximation such that $u_h(x_i) = 1$ for all $i = 1, \dots, N-1$, which finishes the proof of part two of Theorem 1.1.
Next, we show that $h_N \le \min_{i=1,\dots,N-1} h_i$ implies that (3.2) is satisfied. Let $\vartheta_i$ for $i = 1, \dots, N-1$ be defined as in Lemma 2. We first show that if $\vartheta_i$ is real and $\vartheta_i < 1 - 1/\sqrt{2}$ for some $k \ge 1$ and all $i \ge k$, then the following holds:
Indeed, by the definition of $\vartheta_i$, we have
Hence, (3.8) is equivalent to
which is true since $\vartheta_i \in (0, 1)$ for $i \ge k$. Next, note that if we apply (3.8) recursively, we obtain (3.9)
In order to prove that (3.2) is satisfied, first note that (3.2) reduces to $h_{N-1} \ge h_N$ for $i = N-1$ since $\vartheta_N = 0$. Now assume for the sake of contradiction that for some $j > 0$ and all $i = j, j+1, \dots, N-1$, $\vartheta_i < 1 - 1/\sqrt{2}$, and assume (3.2) holds for $i = j+1, \dots, N-1$ but not for $i = j$. In this case, we have
which is a contradiction.
However, the proof given above shows that the condition h N ≤ min i=1,...,N −1 h i is always stronger than (3.2).

Over- and Undershoots at Jump Discontinuities
In this section we consider the $L^q$-best approximation of $u(x) = \operatorname{sgn}(x)$ in $(-1, 1)$ as an example of a jump discontinuity in the interior of the domain and provide a proof of Theorem 1.4. We split this into three parts: in Section 4.1, we consider the first part of Theorem 1.4, i.e., we consider the case where the $L^1$-best approximation does not exhibit Gibbs phenomena. In Section 4.2, we consider the case where the $L^1$-best approximation does exhibit Gibbs phenomena (part two of Theorem 1.4); finally, in Section 4.3 we consider the $L^q$-best approximation for $1 < q < \infty$ and the limit as $q \to 1$ (parts three and four of Theorem 1.4).
The condition for $u_h$ to be an $L^1$-best approximation in Corollary 2.4 can be written as follows: there exists $\psi : (-1, 1) \to [-1, 1]$ such that

$L^1$-Best Approximation without Over- or Undershoots
In this section we provide a proof of the first part of Theorem 1.4. More precisely, if $h \le 0.5$, a continuous piecewise linear function $u_h$ on the mesh shown in Fig. 6 such that $-u_h(-1) = u_h(1) = 1$ is an $L^1$-best approximation of $u(x) = \operatorname{sgn}(x)$ if and only if $u_h(0) = \beta$, with $\beta \in [-1, 1]$ arbitrary, and
Proof. We first show that $u_h$ must satisfy $-u_h(-h) = u_h(h) = 1$. For the sake of contradiction assume
) almost everywhere and we obtain
This is a contradiction since the condition (4.1) is violated. Note that we can use the same argument with opposite sign for $u_h(-h) > -1$. Hence, we have proven that $u_h(-h) = -1$.
Due to the symmetry of the problem, it can easily be seen that the argument for $u_h$
To this end, we distinguish the following three cases:
All three cases are shown in Fig. 7a-c, where $u_h(0) = 0$ was chosen as an example for the second case. In the first case, $u - u_h = 0$ in $(-1, -h)$ and $u - u_h < 0$ in $(-h, 0)$. We compute
This integral is equal to zero for
Note that $u - u_h = 0$ in $[0, h]$; it is easy to see that this integral becomes zero if and only if $\psi_0 \equiv 1$ in $[0, h]$. The remaining integral then becomes
This integral is zero for
Due to the symmetry of the problem, it is easy to see that the third case, i.e., $u_h(0) = -1$, also defines an $L^1$-best approximation if $h \le 1 - h$. This leaves the second case. As in the first case, $u - u_h = 0$ in $[-1, -h]$ and $u - u_h < 0$ in $[-h, 0]$. This implies that we have to require $h \le 1 - h$ in order to find $\psi_0$ such that the integral involving $\varphi_1$ becomes zero. Furthermore, note that, again due to the symmetry of the problem, the same applies to the integral involving $\varphi_3$. Finally, since $u - u_h < 0$ in $[-h, 0]$ and
Thus, any choice of $u_h(0) \in [-1, 1]$ defines an $L^1$-best approximation if $h \le 1/2$. This completes the proof.
Remark 4.1. Note that we have shown that there is a whole family of $L^1$-best approximations with no over- or undershoots for this particular example if $h \le 1/2$. The situation is quite different if we instead consider a non-symmetric subdivision of the interval $(-1, 1)$ into $(-1, -h_1)$, $(-h_1, 0)$, $(0, h_2)$ and $(h_2, 1)$ with $h_1 \neq h_2$. The integral involving $\varphi_2$ then implies that the case $-1 < u_h(0) < 1$ does not yield an $L^1$-best approximation; the case $u_h(0) = 1$ is an $L^1$-best approximation if and only if $h_1 < h_2 \le 1/2$, and the case $u_h(0) = -1$ is an $L^1$-best approximation if and only if $h_2 < h_1 \le 1/2$.
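The non-uniqueness on the symmetric mesh can be checked numerically. The following minimal sketch (illustrative names, not taken from the article) evaluates the $L^1$ error of a piecewise linear $u_h$ on the mesh $-1 < -h < 0 < h < 1$ exactly, using the fact that $u = \operatorname{sgn}(x)$ is constant on every element. It only compares error values for a few candidates; it does not prove optimality. The overshoot value $\sqrt{2h}$ used for $h = 0.75$ is taken from the boundary-discontinuity example and is an assumption here.

```python
import math


def l1_element(a, b, length):
    """Exact integral of |e| over an element on which e is linear from a to b."""
    if a * b >= 0.0:
        return 0.5 * length * (abs(a) + abs(b))
    # e changes sign inside the element: two triangles.
    return 0.5 * length * (a * a + b * b) / (abs(a) + abs(b))


def l1_error_sgn(h, left, beta, right):
    """L^1 error of u_h for u = sgn(x), with u_h(-1) = -1, u_h(-h) = left,
    u_h(0) = beta, u_h(h) = right and u_h(1) = 1."""
    nodes = [-1.0, -h, 0.0, h, 1.0]
    values = [-1.0, left, beta, right, 1.0]
    total = 0.0
    for i in range(4):
        u = -1.0 if nodes[i] < 0.0 else 1.0  # sgn(x) on element i
        total += l1_element(u - values[i], u - values[i + 1],
                            nodes[i + 1] - nodes[i])
    return total


# h <= 1/2: every beta in [-1, 1] gives the same error, namely h.
family = [l1_error_sgn(0.4, -1.0, beta, 1.0) for beta in (-1.0, -0.3, 0.0, 0.7, 1.0)]
# Adding an overshoot at x = h increases the error when h <= 1/2 ...
perturbed = l1_error_sgn(0.4, -1.0, 0.0, 1.1)
# ... whereas for h > 1/2 an overshoot of size sqrt(2h) reduces it.
a = math.sqrt(2 * 0.75)
with_overshoot = l1_error_sgn(0.75, -a, 0.0, a)
without_overshoot = l1_error_sgn(0.75, -1.0, 0.0, 1.0)
print(family, perturbed, with_overshoot, without_overshoot)
```

The constant error across the family reflects the lack of strict convexity of $L^1$: moving $u_h(0)$ trades error on $(-h, 0)$ for error on $(0, h)$ at exactly the same rate.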

L 1 -Best Approximation with Over-and Undershoots
In this section we provide a proof of the second part of Theorem 1.4.More precisely, if h > 0.5, we show that a continuous piecewise linear function u h on the mesh shown in Fig. 6 such that −u h (−1) = u h (1) = 1 is an L 1 -best approximation of u(x) = sgn(x) if and only if Proof.We start by showing that there exists ψ as in Corollary 2.4, such that u h as defined in (4.2) satisfies (2.3) with v h = ϕ 1 .In order to determine ψ(x), we have to find the points where u h intersects u.Note that u h cannot intersect u in (−1, −h) or (h, 1) unless it is identical with u in all of (−1, −h) or (h, 1), respectively.In order to determine the intersections in (−h, 0) and (0, h), we write u h as defined in (4.2) as follows: We start with finding the intersection in (−h, 0); note that β − α = 0 ⇔ β = −1 and u − u h = 0 everywhere in (−1, 0) in this case.If we assume β = −1, we obtain in (−h, 0) and If now β > −1, we have α < −1 and thus sgn(u − u h ) = 1 in (−1, −hϑ) and sgn(u − u h ) = −1 in (−hϑ, 0).Hence ψ = sgn(u − u h ) in (−1, 0) and we compute has to either change sign or become zero for u h to satisfy (1.3).The point of intersection of u and u h at −ϑh and the value of α = u h (−h) uniquely defines u h (0) = β such that (4.2a) holds.Hence, it is also a necessary condition that u h satisfies (4.2a) in order for u h to be an L 1 -best approximation if β > −1.Note that in this case the computation for β < −1 is completely analogous with opposite signs and therefore we have not yet established that β ≥ −1 is a necessary condition.
Similarly, γ − β = 0 ⇔ β = 1 and u − u h = 0 everywhere in (0, 1) in this case.If we assume β = 1, we obtain in (0, h) If now β < 1, we have γ > 1 and thus sgn(u − u h ) = 1 in (0, hϑ) and sgn(u − u h ) = −1 in (hϑ, 1).Due to the symmetry of the problem, the computation is up to the sign the same as for β = −1 and ϕ 1 and we obtain Hence u h as defined in (4.2) satisfies (2.3) with v h = ϕ 3 .Conversely, u h (0) = β implies u h (h) = γ as defined in (4.2) by an analogous argument to the proof of the implication u h (0) = β ⇒ u h (−h) = α.Thus, u h (h) = γ as defined in (4.2c) is a necessary condition for u h to be an L 1 -best approximation if β < 1. Again note that the computation for β > 1 is completely analogous with opposite signs and therefore we have not yet established that β ≤ 1 is a necessary condition.
The only two remaining cases are $\beta = 1$ and $\beta = -1$. If $\beta = 1$, then $\gamma = 1$ and $\alpha < -1$. Thus, $\operatorname{sgn}(u - u_h)$ on $(-1, 0)$ is the same as in the case $\beta \in (-1, 1)$ and $u - u_h = 0$ in $(0, 1)$. We now simply have to determine a valid choice for $\psi(x)$ on $(0, 1)$ such that all integrals in (4.1) are zero. One possible choice is trivially given by simply choosing the same as in the case $\beta \in (-1, 1)$. Note that we have already established that $\alpha$ has to be of the form (4.2a) for any and (2.3) is violated with $v_h = \phi_3$. Due to the symmetry of the problem, the case $\beta = -1$ is analogous.

$L^q$-Best Approximation
In this section, we prove the third and fourth part of Theorem 1.4. More precisely, we show that a continuous piecewise linear function $u_h$ on the mesh shown in Fig. 6 such that $-u_h(-1) = u_h(1) = 1$ is an $L^q$-best approximation of $u(x) = \operatorname{sgn}(x)$ for $1 < q < \infty$ if and only if $-u_h(-h) = u_h(h) = \alpha$ and $u_h(0) = 0$, where $\alpha$ satisfies $0 = -(1 - h)\alpha^2 q(\alpha - 1)^{q-1} - h(\alpha q + 1)(\alpha - 1)^q + h$ and $\alpha > 1$. Furthermore, we show that in the limit $q \to 1$ the $L^q$-best approximation converges to the $L^1$-best approximation as defined in (4.2) with $\beta = 0$, for any $h \in (0, 1)$, i.e., the corresponding $L^1$-best approximation is anti-symmetric and satisfies

Proof. We use the characterisation of the $L^q$-best approximation in Corollary 2.3. Due to the uniqueness of the $L^q$-best approximation for $1 < q < \infty$ and the symmetry of the problem, we may assume that the $L^q$-best approximation is an odd function. This means that $u_h(0) = 0$ and for any choice of $\alpha$ and that To determine for which $\alpha$ the latter two integrals become zero, note that this is the same situation as in the example presented in Sections 3.1 and 3.2, only mirrored. Therefore, we again obtain that $\alpha$ satisfies $0 = -(1 - h)\alpha^2 q(\alpha - 1)^{q-1} - h(\alpha q + 1)(\alpha - 1)^q + h$. This completes the proof of the first part of the lemma. Since $u_h$ is an odd function for any $1 < q < \infty$, the limit as $q \to 1$ must be an odd function as well and must therefore be zero at $x = 0$. Since the $L^1$-best approximation is uniquely determined by the value it takes at zero according to the first two parts of Theorem 1.4, this completes the proof. Therefore, in the limit we obtain the solution in Fig. 7b if $h \leq 1/2$. The corresponding $L^1$-best approximation for $h > 1/2$ is shown in Fig. 7d.
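The scalar equation for $\alpha$ stated above can be solved numerically for given $h$ and $q$. The following is a minimal sketch, assuming SciPy is available; the function names `residual` and `overshoot_value` and the bracket end point `upper=3.0` are illustrative choices, not part of the original article.

```python
from scipy.optimize import brentq


def residual(alpha, h, q):
    """Right-hand side of the equation for alpha, valid for alpha >= 1 and q > 1."""
    d = alpha - 1.0
    return (-(1.0 - h) * alpha**2 * q * d ** (q - 1.0)
            - h * (alpha * q + 1.0) * d**q
            + h)


def overshoot_value(h, q, upper=3.0):
    """Return the root alpha > 1 of residual(., h, q) for h in (0, 1), q > 1.

    residual(1) = h > 0, residual(3) < 0, and the residual is decreasing
    for alpha > 1, so [1, upper] brackets exactly one root.
    """
    return brentq(residual, 1.0, upper, args=(h, q))


if __name__ == "__main__":
    for q in (2.0, 1.5, 1.2):
        print(q, overshoot_value(0.25, q))
```

For $h = 0.25$ the computed root moves toward 1 as $q$ decreases, in line with the limit statement above.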

Best Approximation of a Boundary Discontinuity in Two Dimensions
In this section we consider the best approximation problem (1.3) with $d = 2$ and provide a proof of Theorem 1.5. Hence, we consider the function $u \equiv 1$ on $(0, 1)^2$. We consider the four meshes shown in Fig. 8 and determine the best approximation of $u$ by a continuous function $u_h$ that is a linear polynomial on each of the triangles and takes the following values in the four corners: $u_h(0, 0) = u_h(0, 1) = 1$ and $u_h(1, 0) = u_h(1, 1) = 0$. For all meshes except the first one, we additionally fix the boundary conditions $u_h(0, 0.5) = 1$ and $u_h(1, 0.5) = 0$. The free parameter of the best approximation problem for the first mesh is $\alpha = u_h(0.5, 0.5)$; there are three free parameters for each of the remaining meshes. For Meshes 2-4, we denote by $v_1$ the continuous piecewise linear function that is 1 at the node $(0.5, 0)$ and 0 at all other nodes; by $v_2$ the continuous piecewise linear function that is 1 at $(0.5, 0.5)$ and 0 at all other nodes; and by $v_3$ the continuous piecewise linear function that is 1 at $(0.5, 1)$ and 0 at all other nodes. The coefficients defining the solution $u_h$ are denoted as follows: $u_h(0.5, 0) = \alpha$, $u_h(0.5, 0.5) = \beta$, $u_h(0.5, 1) = \gamma$.
We split the proof of Theorem 1.5 into three parts: in Section 5.1, we prove the first part of the theorem, i.e., we consider Mesh 1 with $q = 1$; in Section 5.2 we continue with part two of the theorem and consider Mesh 1 with $1 < q < \infty$; Section 5.3 finally contains a proof of parts three, four and five of the theorem, i.e., we show that the $L^q$-best approximation contains over- or undershoots on all four meshes if $q > 1$ and consider $q = 1$ for Meshes 2, 3 and 4.

Figure 9: Left: Reference element $\tau$. Center: The function $u_h$ with $\alpha > 1$. The intersection with $u$ is marked with red lines. Right: The mesh with the area where $(u - u_h) < 0$ coloured in blue and the area where $(u - u_h) > 0$ coloured in green.
Proof. We again use the characterisation of the $L^1$-best approximation in Corollary 2.4. The space $U_h$ is the span of the continuous function $v_h$ that is a linear polynomial on each element, zero at the boundary of the domain and 1 at the centroid $(0.5, 0.5)$. We will use the reference triangle $\tau$ as depicted on the left in Fig. 9 for all computations. To this end, we define the affine transformations $\xi_i : \tau_i \to \tau$, $i = 0, 1, 2, 3$, that are each composed of a scaling by 0.5, a rotation and a translation. On $\tau$ we define the basis functions $\varphi_0$, $\varphi_1$ and $\varphi_2$ as $\varphi_0 = 1 - x - y$, $\varphi_1 = x$ and $\varphi_2 = y$ in the coordinates of the reference element $\tau$. Note that $\xi_i(v_h|_{\tau_i}) = \varphi_0$ for all $i = 0, 1, 2, 3$. We consider the following two cases:
1. The set $\{(x, y) \in (0, 1)^2 : (u - u_h)(x, y) = 0\}$ has measure zero.
The second case is only possible if $\alpha = 1$. In this case we have $u_h = u$ in $\tau_3$ and $u_h < u$ in $\tau_i$, $i = 0, 1, 2$. Noting that the $\tau_i$ and $\tau$ are similar and that the area of $\tau$ is precisely twice the area of any $\tau_i$, we obtain that On the other hand, where $\psi(x, y)$ is arbitrary on $\tau_3$ with $-1 \leq \psi(x, y) \leq 1$. Hence, for any choice of $\psi$, we have This shows that $\alpha = 1$ does not yield a best approximation on this mesh. We can furthermore rule out the case $\alpha < 1$; indeed, in this case $u - u_h > 0$ in the whole domain $(0, 1)^2$ and since $v_h > 0$ in $(0, 1)^2$, we have The only remaining case is $\alpha > 1$; in this setting we have $u - u_h < 0$ in $\tau_3$ and thus In the remaining three triangles, $u - u_h$ changes sign within the element. This is illustrated in Fig. 9. Furthermore, $u - u_h = 0$ on a set of measure zero, hence $\psi = \operatorname{sgn}(u - u_h)$ almost everywhere. In order to determine the sections of each element where $u - u_h$ is positive and negative, respectively, we will consider $\xi_i(u - u_h)$ for $i = 0, 1, 2$. Due to the symmetry of the approximation problem, we can assume We compute $\xi_0(u - u_h) = 1 - \alpha\varphi_0 - \varphi_1 = -(\alpha - 1) + (\alpha - 1)x + \alpha y$.
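As a check of the last identity, the expansion can be written out under the assumption that the reference basis functions are $\varphi_0 = 1 - x - y$ and $\varphi_1 = x$; this choice is the one forced by requiring the stated result to hold for every $\alpha$:

```latex
\begin{align*}
\xi_0(u - u_h)
  &= 1 - \alpha\varphi_0 - \varphi_1
   = 1 - \alpha(1 - x - y) - x \\
  &= -(\alpha - 1) + (\alpha - 1)x + \alpha y .
\end{align*}
```

For $\alpha > 1$ this expression vanishes on the line $y = (\alpha - 1)(1 - x)/\alpha$, which joins $(1, 0)$ to $(0, (\alpha - 1)/\alpha)$ in the reference element; $u - u_h$ is negative on the side containing the origin and positive on the other side.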
Remark 5.1 (Uniform refinement). If the mesh is refined uniformly, keeping the same structure as shown in Fig. 10a, it is easy to see that an $L^1$-best approximation is given by $u_h(x_i, y_j) = \alpha$, with $\alpha$ as specified in Section 5.1, if the node $(x_i, y_j)$ is connected with the boundary $x = 1$, and $u_h(x_i, y_j) = 1$ at the remaining interior nodes. Indeed, in this case we can choose $\psi = \psi_0$ on the set $\{x : u(x) - u_h(x) = 0\}$ as shown in Fig. 10b. This shows that the overshoot in the $L^1$-best approximation remains constant under this type of mesh refinement.
We will now show that there is an $L^1$-best approximation with $\beta = \gamma = 1$ and $\alpha > 1$. We have already established that (5.6) is independent of $\alpha$; this leaves the integrals involving $v_2$ and $v_3$. We have $u - u_h > 0$ in $\tau_i$, $i = 3, 6, 7$, and $u - u_h < 0$ in $\tau_0$. Moreover, we have already fixed $\psi \equiv -1$ in $\tau_4$. If we now furthermore choose $\psi \equiv -1$ in $\tau_1$, we obtain this is again independent of $\alpha$. Finally, we consider $v_3$ to determine $\alpha$. Let again $\xi_i$ be the linear transformation that maps $\tau_i$ onto $\tau$; we have $u - u_h > 0$ on $\tau_0$. Furthermore, Thus, Therefore, we compute Putting all three integrals together yields Hence, $\alpha > 1$ has to satisfy the equation $0 = -3\alpha^3 + 8\alpha - 4$.
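The cubic above has three real roots, only one of which exceeds 1. A short sketch, assuming NumPy, that isolates this root (the function name `overshoot_root` is illustrative):

```python
import numpy as np


def overshoot_root():
    """Return the root alpha > 1 of 0 = -3*alpha**3 + 8*alpha - 4."""
    roots = np.roots([-3.0, 0.0, 8.0, -4.0])
    # Keep the real roots, then select the ones larger than 1.
    real_roots = roots[np.abs(np.imag(roots)) < 1e-9].real
    candidates = real_roots[real_roots > 1.0]
    assert candidates.size == 1
    return float(candidates[0])


if __name__ == "__main__":
    print(overshoot_root())  # approximately 1.27
```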

Numerical Examples
In this section we consider selected examples of meshes for which we have determined the solution of the best approximation problem (1.3) numerically by interpreting the condition (2.2) as a variational problem that can be implemented using standard finite element techniques. Here, we have used FEniCS (Alnaes et al., 2015) for the implementation. In Section 6.1 we illustrate that the overshoot in the $L^q$-best approximation does not vanish if the mesh is refined and that these observations even apply if $u$ is a more general smooth function.
In Section 6.3 we illustrate that the $L^q$-best approximation on the three meshes considered in Section 5.3 (second half of Theorem 1.5) converges to the $L^1$-best approximation characterised in the theorem. Furthermore, we show how the understanding of these special cases can be applied to predict the behaviour of the $L^q$-best approximation on a more general mesh.

Gibbs Phenomenon on Meshes in Two Dimensions
We start with Mesh 1 depicted in Fig. 8 and the refinement shown in Fig. 10a that preserves the structure of the mesh. We have already shown in Remark 5.1 that for $q = 1$ there exists an $L^1$-best approximation such that the overshoot remains constant as we refine the mesh. Indeed, Fig. 11 shows the maximum value of $u_h$ for this example with $q = 2$ and for several refinements of the mesh, as well as the approximation $u_h$ for a mesh with this structure consisting of 100 elements. We can clearly see that the maximum value remains constant under this type of refinement, which suggests that the maximum overshoot also remains constant for $q = 1$, as well as in the limit $q \to 1$.

Gibbs Phenomenon on Meshes in One Dimension
Next, we consider a one-dimensional example such that $u$ is not piecewise linear and compute the $L^q$-best approximation numerically. Let $u(x) = 1 + 0.1\sin(2\pi x)$ on $(0, 1)$ and consider the $L^q$-best approximation $u_h$ with $u_h(0) = 1$ and $u_h(1) = 0$ on four different grids: two uniform grids with 5 and 100 elements, respectively, and two meshes where all elements are the same size except the last one, which is twice the size of the others. Again we consider a mesh with 5 elements and one with 100 elements. Note that the latter two meshes violate the conditions in parts two and three of Theorem 1.1, but satisfy the condition in Remark 1.3. We therefore expect the overshoot to vanish as $q \to 1$ in the first two cases and to decrease but still be present in the last two. Remark 1.3 and the observations for the previous example suggest that for $u \equiv 1$, we could expect the overshoot to be the same both when 5 and 100 elements are employed on both the uniform and the non-uniform meshes.
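The setup described above can be reproduced on a small scale without a finite element library. The following is a minimal sketch that replaces the variational formulation by a direct minimisation of the discretised functional $\int_0^1 |u - u_h|^q \, dx$ with SciPy; it is not the FEniCS implementation used for the figures. The function name `lq_best_approximation`, the Gauss-Legendre quadrature with `n_quad` points per element, and the BFGS optimiser are assumptions made for illustration. The quadrature is only approximate where $u - u_h$ changes sign, and the minimisation becomes harder as $q$ approaches 1.

```python
import numpy as np
from scipy.optimize import minimize


def lq_best_approximation(u, nodes, q, left=1.0, right=0.0, n_quad=16):
    """Nodal values of a P1 function minimising the L^q error (sketch).

    u      : vectorised callable on [nodes[0], nodes[-1]]
    nodes  : increasing array of mesh nodes
    left, right : prescribed boundary values u_h(nodes[0]), u_h(nodes[-1])
    """
    nodes = np.asarray(nodes, dtype=float)
    gp, gw = np.polynomial.legendre.leggauss(n_quad)  # rule on [-1, 1]
    a, b = nodes[:-1], nodes[1:]
    half = 0.5 * (b - a)
    x = 0.5 * (a + b)[:, None] + half[:, None] * gp[None, :]
    w = half[:, None] * gw[None, :]
    t = 0.5 * (gp + 1.0)  # local coordinate in [0, 1] on each element
    ux = u(x)

    def objective(interior):
        c = np.concatenate(([left], interior, [right]))
        uh = c[:-1, None] * (1.0 - t)[None, :] + c[1:, None] * t[None, :]
        return np.sum(w * np.abs(ux - uh) ** q)

    x0 = u(nodes[1:-1])  # start from the nodal interpolant
    res = minimize(objective, x0, method="BFGS")
    return np.concatenate(([left], res.x, [right]))


if __name__ == "__main__":
    u = lambda x: 1.0 + 0.1 * np.sin(2.0 * np.pi * x)
    uniform = np.linspace(0.0, 1.0, 6)                    # 5 equal elements
    graded = np.concatenate((np.arange(5) / 6.0, [1.0]))  # last element doubled
    for name, mesh in (("uniform", uniform), ("non-uniform", graded)):
        c = lq_best_approximation(u, mesh, q=2.0)
        print(name, np.max(c - u(mesh)))
```

For $u \equiv 1$ and $q = 2$ the error $u - u_h$ is piecewise linear, so the quadrature is exact and the result can be compared with the solution of the corresponding tridiagonal system.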
Fig. 12 shows the maximum error at the nodes in all four cases for several values of $q$. We observe that the overshoot indeed decreases as $q \to 1$. Furthermore, we see that the overshoot is very similar for the coarse and fine meshes in both cases, which confirms that the overshoot does not disappear under mesh refinement. However, the overshoot is not identical for 5 and for 100 elements in both cases, which can be attributed to the fact that $u$ is not constant. Furthermore, note that the overshoot for the non-uniform mesh is consistently larger than for the uniform mesh, which suggests that it does not disappear entirely as $q \to 1$. Note that on the non-uniform mesh when $u \equiv 1$ and $q = 1$, the overshoot would be

(Vanishing) Overshoot in One Dimension
To illustrate the graded mesh condition in part two of Theorem 1.1, we consider two three-element meshes on $(0, 1)$. For the first one we choose $h_1 = 0.1$ and $h_2 = h_3 = 0.45$, i.e., the mesh consists of the subintervals $(0, 0.1)$, $(0.1, 0.55)$ and $(0.55, 1)$. For the second one we choose $h_1 = 0.1$, $h_2 = 0.5$ and $h_3 = 0.4$, i.e., the mesh consists of the subintervals $(0, 0.1)$, $(0.1, 0.6)$ and $(0.6, 1)$. We will check the condition (1.8) for both meshes; indeed, we will see that for the first mesh the condition is violated but it is satisfied for the second mesh. In the latter case, we therefore know that there exists an $L^1$-best approximation without over- or undershoots. In the former case, it is a priori unknown whether or not such an $L^1$-best approximation exists, since it is an open problem whether (1.8) is also a necessary condition.
In the first case, we obtain from (1.6) that $\vartheta_3 = \vartheta_2 = 0$, yielding the following sufficient conditions for the existence of an $L^1$-best approximation without over- or undershoots: $h_2 \geq h_3$ and $h_1 \geq h_2$. The second condition is violated. In fact, it is easy to show that, if $h_2 = h_3$, the condition $h_1 \geq h_2$ is necessary for the existence of an $L^1$-best approximation without over- or undershoots. Moreover, one can show that the $L^1$-best approximation is unique in this case by solving (2.3) for the points where $u$ and $u_h$ intersect. The intersection points uniquely determine $u_h(0.1) \approx 0.9931$ and $u_h(0.55) \approx 1.0247$. For brevity, the details are omitted here.
Fig. 13 shows the $L^q$-best approximation on both meshes for $q = 2$ and $q = 1$ on the left and the maximal nodal error on both meshes for several values of $q$ on the right. The approximations for $q > 1$ were again obtained using the implementation of the best approximation problem in FEniCS. We can clearly see that the maximal overshoot is always larger on the first mesh. In both cases it decreases as $q \to 1$, but the overshoot only vanishes completely on the second mesh. However, even on the first mesh the maximal overshoot is very small for $q = 1$. Note that, if $h_2$ and $h_1$ as chosen for the first mesh were swapped, the maximal overshoot for $q = 1$ would be $u_h(0.55) - 1 = 0.2792$ according to Remark 1.3 and thus significantly larger than the overshoot we can observe. This shows that the effect of an element being too small and causing the $L^1$-best approximation to contain over- and undershoots is much weaker away from the discontinuity than near the discontinuity.

Figure 13: $L^q$-best approximations on two three-element meshes on $(0, 1)$ with $h_1 = 0.1$ and two different choices for $h_2$ ($h_2 = 0.45$ and $h_2 = 0.5$). Left: best approximation $u_h(x)$ with $q = 2$ and $q = 1$. Right: maximal nodal error for several values of $q$.
(Vanishing) Overshoot in Two Dimensions

Overshoot on Meshes 2, 3 and 4 from Section 5.3
Fig. 14 shows the best approximations for $q = 2$ and $q = 1.1$ for three of the meshes we have considered in Section 5.3. Even just a comparison of these two cases for each of the meshes illustrates clearly how the overshoot gradually vanishes on Mesh 3 and Mesh 4. On Mesh 2, the overshoot vanishes away from the boundary $y = 0$; this is consistent with the $L^1$-best approximation described above that only exhibits an overshoot at the node $(0.5, 0)$ and no overshoot at all other nodes. Fig. 15 shows the maximum overshoot for all three meshes for different values of $q$. The overshoot for $q = 1$ is taken from the theoretically determined $L^1$-best approximations discussed in that section. All remaining values have been determined numerically with an implementation in FEniCS (Alnaes et al., 2015). The plot shows that for the third and fourth mesh, the overshoot indeed disappears as $q \to 1$, whereas for the second mesh it decreases but does not vanish.

Overshoot on Unstructured Meshes
As a final example, we consider the unstructured mesh shown on the left in Fig. 16. From the computations for the previous examples, we deduce that the $L^1$-best approximation exhibits no overshoot if for every interior node $(x_i, y_i)$ that is connected to the boundary $x = 1$ through one edge, the area of all triangles whose boundaries contain the node $(x_i, y_i)$ and at least one node on the boundary $x = 1$ is smaller than the area of all remaining triangles whose boundaries contain the node $(x_i, y_i)$. Furthermore, the numerics for Mesh 2 (cf. Fig. 8) shown in Fig. 14 suggest that the overshoot disappears for $q \to 1$ away from the nodes violating this condition on the area of connected elements.
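The area condition stated above is straightforward to check for a given triangulation. The following sketch assumes the domain is the unit square, that the mesh is given as an array of node coordinates and an array of triangle vertex indices, and that two nodes are joined by an edge exactly when they share a triangle; the function name `area_condition` and its interface are illustrative and not taken from the article.

```python
import numpy as np


def area_condition(points, triangles, x_boundary=1.0, tol=1e-12):
    """Check the area condition at interior nodes adjacent to x = x_boundary.

    Returns a dict mapping each such node index to True if the total area
    of its triangles touching the boundary x = x_boundary is smaller than
    the total area of its remaining triangles, and to False otherwise.
    """
    points = np.asarray(points, dtype=float)
    triangles = np.asarray(triangles, dtype=int)
    x, y = points[:, 0], points[:, 1]
    on_gamma = np.abs(x - x_boundary) < tol
    # Interior nodes of the unit square (assumed domain).
    interior = (x > tol) & (x < 1.0 - tol) & (y > tol) & (y < 1.0 - tol)

    p0, p1, p2 = (points[triangles[:, k]] for k in range(3))
    e1, e2 = p1 - p0, p2 - p0
    areas = 0.5 * np.abs(e1[:, 0] * e2[:, 1] - e1[:, 1] * e2[:, 0])
    touches_gamma = on_gamma[triangles].any(axis=1)

    result = {}
    for node in np.flatnonzero(interior):
        around = (triangles == node).any(axis=1)
        near = areas[around & touches_gamma].sum()
        far = areas[around & ~touches_gamma].sum()
        if near > 0.0:  # node shares at least one triangle with the boundary
            result[int(node)] = bool(near < far)
    return result


if __name__ == "__main__":
    # Unit square split into four triangles by its diagonals.
    pts = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]
    tris = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (3, 0, 4)]
    print(area_condition(pts, tris))
```

On the four-triangle mesh in the usage example, three of the four triangles around the centre node touch the boundary $x = 1$, so the condition is violated there.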
The interior nodes connected to the boundary are labelled 1, 2, ..., 7 in Fig. 16. It is easy to see that for the nodes 1, 4 and 7 the total area of all triangles touching both the node and the boundary $x = 1$ is smaller than the total area of all remaining triangles touching the node, whereas this condition is violated for the nodes 2, 3, 5 and 6. Thus, we expect that the overshoot vanishes at the nodes 1, 4 and 7 as $q \to 1$, while at the nodes 2, 3, 5 and 6 it reduces but does not disappear entirely. Fig. 16 shows the $L^q$-best approximation on the unstructured mesh for $q = 2$ and $q = 1.2$ in the center and on the right, respectively. Here, we clearly observe that the approximation for $q = 2$ exhibits overshoots at all nodes connected to the boundary $x = 1$, with larger overshoots at the nodes 2, 3, 5 and 6. At these nodes the overshoot is reduced but still clearly visible for $q = 1.2$. On the other hand, at the nodes 1, 4 and 7 the overshoot has nearly vanished for $q = 1.2$.

Fig. 17 shows two further unstructured meshes which have been designed in such a way that for every interior node $(x_i, y_i)$ that is connected to the boundary $x = 1$ through one edge, the area of all triangles whose boundaries contain the node $(x_i, y_i)$ and at least one node on the boundary $x = 1$ is smaller than the area of all remaining triangles whose boundaries contain the node $(x_i, y_i)$. The difference between the two meshes is that the distance between the boundary $x = 1$ and the vertical line containing all nodes

Figure 14 panels: (a) Mesh 2, $q = 2$; (b) Mesh 3, $q = 2$; (c) Mesh 4, $q = 2$; (d) Mesh 2, $q = 1.1$; (e) Mesh 3, $q = 1.1$; (f) Mesh 4, $q = 1.1$.

Fig. 17c shows the maximum value of $u_h$ for different $q$ and Meshes A, B and C. This illustrates that the overshoot decreases on all three meshes as $q \to 1$. The overshoot on Mesh C is always smaller than the overshoot on the other two meshes and the overshoot on Mesh A is always larger than on the other two meshes. This illustrates that if the area of the elements connected to the boundary is decreased in comparison to the area of the remaining elements, then the overshoot is reduced for any $q$ and decreases more rapidly as $q \to 1$. This is consistent with the theoretical results in one dimension illustrated at the start of this article in Fig. 2.

Conclusions
In this article, we have investigated Gibbs phenomena in the $L^q$-best approximation of discontinuities within finite element spaces. Using selected examples, we have proven that the Gibbs phenomenon can be eliminated as $q \to 1$ on certain meshes. However, we have seen that there exist non-uniform meshes in one dimension that lead to Gibbs phenomena even if $q = 1$. In two dimensions, even some uniform meshes lead to Gibbs phenomena if $q = 1$. Nonetheless, the magnitude of the oscillations decreases as $q \to 1$ on all meshes.
The computational examples presented in this article confirm the theoretical results. Moreover, we have seen that similar observations can be made for more general examples. Furthermore, we have demonstrated that the Gibbs phenomenon cannot be eliminated on certain meshes under mesh refinement. For the final computational example, we have been able to establish a link between the structure of the mesh near the discontinuity and the magnitude of the overshoot at the nodes. This observation suggests that the oscillations can be eliminated in the limit as $q$ tends to 1 if the mesh structure near the discontinuity is suitably adjusted. Indeed, this has been used to design meshes for the non-linear Petrov-Galerkin method for the convection-diffusion-reaction equation presented in Houston et al. (2019).

Figure 2: Values for $\alpha$ for different ranges of $q$ and three different choices of $h$.

Figure 4: Values for $\alpha$ for different ranges of $q$ on Mesh 1.

Figure 10: Uniform refinement of the mesh preserving the structure.

Figure 11: Left: $\max(u_h)$ for $q = 2$ and several refinements as shown in Fig. 10a. Right: $u_h$ with 100 elements.

Figure 12: Maximal nodal error for different values of $q$ and four different meshes.

Figure 14: $L^q$-best approximation of a boundary discontinuity

Figure 16: $L^q$-best approximation on an unstructured mesh.