Finite Element Approximation of Steady Flows of Colloidal Solutions

We consider the mathematical analysis and numerical approximation of a system of nonlinear partial differential equations that arises in models that have relevance to steady isochoric flows of colloidal suspensions. The symmetric velocity gradient is assumed to be a monotone nonlinear function of the deviatoric part of the Cauchy stress tensor. We prove the existence of a unique weak solution to the problem, and under the additional assumption that the nonlinearity involved in the constitutive relation is Lipschitz continuous we also prove uniqueness of the weak solution. We then construct mixed finite element approximations of the system using both conforming and nonconforming finite element spaces. For both of these we prove the convergence of the method to the unique weak solution of the problem, and in the case of the conforming method we provide a bound on the error between the analytical solution and its finite element approximation in terms of the best approximation error from the finite element spaces. We propose first a Lions-Mercier type iterative method and next a classical fixed-point algorithm to solve the finite-dimensional problems resulting from the finite element discretisation of the system of nonlinear partial differential equations under consideration and present numerical experiments that illustrate the practical performance of the proposed numerical method.


Introduction
The classical incompressible Navier-Stokes constitutive equation and its usual generalisations, the constitutive relations for the incompressible Stokesian fluid, are explicit expressions for the Cauchy stress in terms of the symmetric part of the velocity gradient. The Stokesian fluid is defined by the constitutive expression Navier-Stokes fluid is a special sub-class of (1.1) that is linear in the symmetric part of the velocity gradient and is defined through: T = −pI + 2µD, (1.2) where µ is the viscosity of the fluid. Power-law fluids are another popular sub-class of (1.1), the power-law fluid being defined through the constitutive equation where µ 0 and α are positive constants and m is a constant; if m is zero we recover the Navier-Stokes fluid model, if it is negative we have a shear-thinning fluid model and if it is positive we have a shear-thickening fluid model. There are however many fluids that cannot be described by constitutive equations of the form (1.1) but require "relations", in the true mathematical sense of the term, between the Cauchy stress and the symmetric part of the velocity gradient. Implicit constitutive relations that involve higher time derivatives of the stress and the symmetric part of the velocity gradient have been proposed to describe the response of non-Newtonian fluids that exhibit viscoelastic response 1 (see Burgers (1939) [13], Oldroyd (1950) [30]); that is fluids that exhibit phenomena like stress relaxation. However, purely implicit algebraic relationship between the stress and the symmetric part of the velocity gradient were not considered to describe non-Newtonian fluids until recently. Such models are critical if one is interested in describing the response of fluids which do not exhibit viscoelasticity but whose material properties depend on the mean value of the stress and the shear rate, a characteristic exhibited by many fluids and colloids, as borne out by numerous experiments. Consider for example an incompressible fluid whose viscosity depends on the mechanical pressure 2 (mean value of the stress) and is shear-thinning, whose constitutive relation takes the form T = −pI + 2µ p, tr(D 2 ) D. which is an implicit relationship between the stress and the symmetric part of the velocity gradient. Rajagopal (2003) [34], (2006) [35] introduced the implicit relationship of the above form (and also the much more general implicit relationship between the history of the stress and the history of the deformation gradient) to describe materials whose properties depend upon the pressure and the shear rate. In fact, the properties of all fluids depend upon the pressure: it is just a matter of how large the variation of the pressure is in order for one to take the variation of the properties into account. The book by Bridgman (1931) [9] entitled "Physics of High Pressures" provides copious references to the experimental literature before 1931 on the variation of the viscosity of fluids with pressure, and one can find recent references to the experimental literature on the dependence of viscosity on pressure in Málek and Rajagopal (2006) [27]. Stokes (1845) [40] recognised that the viscosity of fluids varies with pressure, but in the case of sufficiently slow flows in channels and pipes he assumed that the viscosity could be considered a constant. Suffice to say, constitutive relations of the class (1.7) are necessary to describe the response of fluids whose viscosity depends on the pressure. Also as mentioned earlier, the implicit constitutive relation (1.7) is useful to describe the behaviour of colloids.
Recently, Perlácová and Prǔša (2015) [32] (see also LeRoux and Rajagopal (2013) [39]) used an implicit model belonging to a sub-class of (1.7) to describe the response of colloidal solutions as presented in the experimental work of Boltenhagen et al. (1997) [4], Hu et al. (1998) [21], Lopez-Diaz et al. (2010) [26] among others. Notice that while one always expresses the incompressible Navier-Stokes fluid by the representation (1.2), it is perfectly reasonable to describe it as (1. 8) In fact, it is the representation (1.8) that is in keeping with causality as the stress is the cause and the velocity and hence its gradient is the effect, and this fact cannot be overemphasised. Such a representation would imply that the Stokes assumption that is often appealed to is incorrect (see Rajagopal (2013) [36] for a detailed discussion of the same). Málek et al. (2010) [33] generalised (1.8) to stress power-law fluids, namely constitutive relations of the form: where T d is the deviatoric part of the Cauchy stress, γ and β are positive constants, and n is a constant that can be positive, negative or zero. The constitutive relation (1.9) is capable of describing phenomena that the classical power-law models are incapable of describing. For instance, the constitutive models (1.9) can describe limiting strain rate as well as fluids which allow the possibility of the strain rate initially increasing with stress and later decreasing with stress; both such responses cannot be described by the classical powerlaw fluid model (1.3) (see the discussion in Málek et al. (2010) [33] with regard to the difference in the response characteristics of the stress power-law fluid and the classical power-law fluid). We are interested in a further generalization of the constitutive relation of the form (1.9) that is appropriate for describing the response of colloidal solutions. This constitutive relation takes the form: where α, β, and γ are positive constants, n is a real number, and T d is the deviatoric part of the Cauchy stress. The shear stress in a fluid undergoing simple shear flow, that is described by the constitutive relation given above, increases from zero to a maximum, then decreases to a local minimum, and then increases monotonically as the shear stress increases from zero. As discussed by Le Roux and Rajagopal [39], and Perlácová and Prǔša [32], many colloids exhibit such behavior. The constitutive relation that we introduce first in (2.10) and next in (3.1) includes (1.10) as a special sub-class. It can be posed within a Hilbert space setting owing to the presence of the coefficient α in (1.10), but nevertheless, it is a challenging problem as it involves two nonlinearities: the monotone part in the constitutive relation and the inertial (convective) term. The problem without the inertial term, see Subsection 2.2 below, has already been analysed in [5], while the analysis of the steady-state incompressible Navier-Stokes equations is well-established, see for instance [41,20]. With both nonlinearities present in the model, proving the existence of a weak solution, for instance, to the best of our knowledge cannot be done by simply coupling the techniques used for these two problems, namely the Browder-Minty theorem and the Galerkin method combined with Brouwer's fixed point theorem and a weak compactness argument. More refined arguments are needed; they are crucial to the proofs of Lemmas 4 and 5 below. This work is organised as follows. The notation and the functional-analytic setting are recalled in the next subsection. In Section 2, both linear and fully nonlinear versions of the formulation are briefly analysed for the Stokes system, i.e., without the inertial (convective) term. The theoretical analysis of the complete nonlinear system is carried out in Section 3. The main results of this section are Theorem 1 for the existence of a solution and Proposition 2 for the uniqueness of a solution under additional assumptions on the input data. In Section 4, conforming finite element approximations of these models are proposed and error estimates are derived. The cases of both simplicial and hexahedral elements are discussed. The analysis of the latter is less satisfactory as it requires subdivisions consisting of parallelepipeds and suffers from a higher computational cost. This motivates the introduction of nonconforming approximations in Section 5. In Section 6, two decoupling algorithms are presented and compared: a Lions-Mercier algorithm adapted to a system with a monotone part and an elliptic part, and a classical fixed-point algorithm alternating between the approximation of a Navier-Stokes system and the nonlinear constitutive relation for the stress. Numerical experiments are performed with conforming finite elements on a square mesh in two dimensions. The theoretically established convergence of the scheme is confirmed and convergence of both decoupled algorithms is observed.

Notation and preliminaries
Let Ω ⊂ R d , d ∈ {2, 3}, be a bounded, open, simply connected Lipschitz domain. We consider the function spaces Q := L 2 0 (Ω), V := H 1 0 (Ω) d and M := {S ∈ L 2 (Ω) d×d sym : tr(S) = 0}, (1.11) for the pressure, the velocity, and the deviatoric stress tensor, respectively. As usual, the zero mean value constraint being introduced to fix the undetermined additive constant in the mechanical pressure. Here the subscript sym indicates that the d × d tensors under consideration are assumed to be symmetric. Henceforth, the symmetric gradient of the velocity field v (or, briefly, symmetric velocity gradient) will be denoted by and the deviatoric part of a d × d tensor S is defined by with I the d × d identity tensor; thus the trace of S d is zero. We denote by V the subspace of V consisting of all divergence-free functions contained in V ; that is, (1.14) For vector-valued functions v : with | · | signifying the Euclidean norm on R d , while for tensor-valued functions S : Ω → R d×d , we define is the Frobenius norm of S. Clearly, M is a Hilbert space with this norm. We recall the Poincaré and Korn inequalities, which are, respectively, the following: there exist positive constants C P and C K such that We endow V (and V) with the norm · V := D(·) L 2 (Ω) . (1.17) Both V and V are Hilbert spaces with this norm, because · V is equivalent to both the H 1 (Ω) d×d norm and the H 1 (Ω) d×d semi-norm, thanks to (1.15), (1.16) and the trivial relation D(v) L 2 (Ω) ≤ ∇v L 2 (Ω) .

Stokes system with linear and nonlinear constitutive relations
In this section we study two preliminary model problems without the inertial term; the first one simply reduces to the Stokes system, while the second model problem involves a monotone nonlinearity treated by the Browder-Minty approach.

The Stokes system
Let us consider the problem in Ω, u = 0 on ∂Ω, where f : Ω → R d is a prescribed external force, D(u) is defined by (1.12), the unknown tensor T is symmetric, and α is a given positive constant, the reciprocal of the viscosity coefficient. Here, we assume that f ∈ L 2 (Ω) d for simplicity, but a similar analysis holds for the general case f ∈ V ′ = H −1 (Ω) d ; see for instance Remark 1 in Section 3. By decomposing the Cauchy stress T as T = T d + 1 d tr(T )I and inserting this in the first equation of (2.1) we arrive at the following equivalent problem: which we recognise to be the Stokes system where the mechanical pressure (mean normal stress) is p := − 1 d tr(T ). Recalling the spaces M, V, Q defined in (1.11) and using the relation which holds 3 for any symmetric tensor S, the weak formulation of problem (2.2) can be written as follows: For any S ∈ M , v ∈ V , and q ∈ Q, we set As is usual for the Stokes problem, the unknown pressure can be eliminated from (2.3) by restricting the test functions v to V. In addition, the variable u can also be eliminated by treating the first line of (2.3) as a constraint, thus leading to an equivalent (reduced) problem for which the two variables p and u are eliminated. The equivalence is based on the following (inf-sup) conditions For any R, S ∈ R d×d , with S symmetric, we have that S : R = S+S t 2 : R = 1 2 S : R + 1 2 S t : R = 1 2 S : R + 1 2 S : R t = S : R+R t 2 . and ∃β > 0 : inf where we have used that D(v) L 2 (Ω) ≤ ∇v L 2 (Ω) . It is well-known that the spaces V and Q defined in (1.11) satisfy the inf-sup condition (2.5), see for instance [20], while the relation (2.4) can be easily shown by observing that, for a given v ∈ V, we have R : We can then eliminate the incompressibility constraint by seeking u ∈ V, yielding the (partially reduced) problem: (2.6) Clearly, each solution of (2.3) satisfies (2.6). Conversely, it follows from the inf-sup condition (2.5) that for any solution (T d , u) of (2.6) there exists a unique p ∈ Q such that (T d , u, p) is the solution of (2.3); see [20].
Hence these two problems are equivalent. Furthermore, we can eliminate the unknown u by proceeding as follows; see [5]. First, we introduce the decomposition M = M ⊕ M ⊥ with with C P and C K the constants in Poincaré's and Korn's inequalities (1.15) and (1.16), respectively. We finally get the (fully reduced) problem: find T d 0 ∈ M such that The well-posedness of problem (2.9) follows from the Lax-Milgram lemma, while its equivalence to the original problem (2.3) is guaranteed by (2.4) and (2.5).
Of course, in this simple model with a linear constitutive relation, T d 0 = 0 since the right-hand side of (2.9) vanishes and α(·, ·) Ω is an inner product on M. However, the framework developed here will be used in the sequel in a more general setting.

Stokes model with a nonlinear constitutive relation
Next, we consider the following Stokes-like system with a nonlinear relation between the stress and the symmetric velocity gradient: with γ a given positive constant, and where µ ∈ C 1 ((0, +∞)) ∩ C 0 ([0, +∞)) is a given function satisfying and µ(a) > 0 and µ(a)a ≤ C 1 ∀ a ∈ R ≥0 (2.12) for some positive constant C 1 . Since µ is continuous on any subinterval of R ≥0 , the second part of (2.12) implies that µ is bounded above and we denote its maximum by µ max , Moreover, proceeding as in the proof of [12, Lemma 4.1], we deduce from (2.11) and (2.12) that for any R, S ∈ R d×d , the following monotonicity property hold: with equality if and only if R = S. Introducing again p := − 1 d tr(T ), the weak formulation of problem (2.10) reads as follows: find a triple (2.15) Proceeding exactly as in Section 2.1, we first eliminate the pressure, and we thus deduce that problem (2.15) is equivalent to the following problem: which is further equivalent to the following problem: find with T d f ∈ M ⊥ the solution of (2.8). The Browder-Minty theorem, see for instance [29], guarantees the existence of a solution to problem (2.17). Indeed, let A : where ·, · M denotes the duality pairing between M and its dual space, M ′ . It then easily follows that the mapping is bounded, monotone, coercive and hemi-continuous. By the Browder-Minty theorem these imply surjectivity of A and thereby existence of a solution, while its uniqueness follows from the strict monotonicity of A.

Navier-Stokes with nonlinear constitutive relation
Now, we focus on our problem of interest, where a convective term is added to the first equation of (2.10), i.e., we consider the problem in Ω, u = 0 on ∂Ω. (3.1) We prove a priori estimates, construct a solution, and give sufficient conditions for global uniqueness.

(3.2)
In order to bring forth an elliptic term on the left-hand side of the first equation of (3.2), we rewrite the second equation in (3.2) as and thus by substituting this relation into the first equation of (3.2) we get in Ω, div(u) = 0 in Ω, u = 0 on ∂Ω. (3.4) The weak formulation of (3.4) reads: As previously, we eliminate the pressure by restricting the test functions to V, and we thus obtain the following equivalent reduced problem: for all (S, v) ∈ M × V. Interestingly, (3.6), (3.7) can be further reduced by observing that, given u, (3.7) uniquely determines T d thanks to the Browder-Minty theorem; see the end of Section 2.2. Thus, we define the mapping G : V → M by u → T d with T d ∈ M being the unique solution of where we recall that A is defined in (2.18). With this mapping, (3.6), (3.7) is equivalent to the following problem: find u ∈ V such that Before embarking on the proof of existence of a solution to problem (3.6), (3.7) we establish a series of a priori estimates under the assumption that a solution exists.

A priori estimates
Assuming that problem (3.6), (3.7) has a solution, the following a priori estimates hold for any solution with C P and C K signifying the constants in Poincaré's and Korn's inequality, respectively, and C 1 the constant in (2.12). Proof.
Using then the positivity of µ, see (2.12), we get To obtain a bound for u, we recall the well-known relation which is easily obtained by integration by parts, as follows: Therefore, taking v = u in (3.6) and using (2.12) we obtain from which we directly deduce (3.10); (3.11) follows by applying (3.10) to (3.12).
Proof. The ingredients of the proof are similar to those used in the proof of Lemma 1 and only the derivation of the bound for D(u) is different. First notice that combining (3.6) and (3.7) we have (3.16) Taking v = u in (3.16) we then find that Notice that T d : D(u) ≥ 0 a.e. in Ω. Indeed, from (3.7) we have that in Ω.
Therefore, taking S = D(u) in (3.7) and using the upper bound µ max for µ and the bound (3.17) we have which yields (3.14). Finally, the bound (3.15) for T d is obtained by substituting (3.14) in (3.12).
Remark 1. Similar a priori estimates can be derived in the case when f ∈ V ′ (with V ′ = H −1 (Ω) d ). More precisely, all occurrences of C P C K f L 2 (Ω) can be replaced by f V ′ , where , (3.19) and ·, · V denotes the duality pairing between V ′ and V . The same observation holds for all that follows.
Remark 2. By a direct argument we can also prove that This leads to the same a priori bound (3.15) for T d , Indeed, the choice S = D(u) in (3.7) gives directly (without invoking (3.18)) Then (3.20) follows by substituting the bound into the preceding inequality. This also yields (3.15).

Construction of a solution
In this subsection we prove the existence of a solution in a bounded Lipschitz domain without any restrictions on the data, other than those stated at the beginning of Section 2.2. The first part of the construction is fairly standard: a suitable sequence of (finite-dimensional) Galerkin approximations to the infinite-dimensional problem is constructed, followed by Brouwer's fixed point theorem to prove that each finite-dimensional problem in the sequence has a solution; uniform a priori estimates, similar to those derived in Lemma 1, are established for the Galerkin solutions, which are then used for passing to the (weak) limit, via a weak compactness argument. However, because of the combined effect of the nonlinearities, identifying the limit as a solution to the infinite-dimensional problem requires a more refined argument. For the sake of clarity, the argument is split into several steps.
Step 1 (Finite-dimensional approximation). Formulation (3.9) lends itself readily to a Galerkin discretisation. Since the only unknown is u in V, a separable Hilbert space, we introduce a countably infinite basis {w 1 , w 2 , . . .} of orthonormal functions of V with respect to the inner product whose span is dense in V. Next, we truncate this basis, i.e., for each m ≥ 1 we define and for u m ∈ V m we denote byû m ∈ R m its representation with respect to this basis. Finally, we fix m ≥ 1 and consider the following finite-dimensional problem: find u m ∈ V m such that, for all 1 ≤ j ≤ m, Problem (3.22), which can be seen as the projection of (3.9) onto V m , is equivalent to the following: find u m ∈ R m such that F(û m ) = 0, where F = (F 1 , . . . , F m ) t : R m → R m is the continuous function defined, for j = 1, . . . , m, by Step 2 (Existence of a discrete solution). Problem (3.22) is a system of m nonlinear equations in m unknowns. The existence of a solution to this problem can be established by the following variant of Brouwer's fixed point theorem (see e.g. [17,20]). (3.24) Moreover, T d m = G(u m ) satisfies the uniform bound Proof. We infer from Lemma 3 that F has a zero in the ball B m (0, r) with Indeed, using the antisymmetry property (3.13), which holds because u m ∈ V m ⊂ V, we get where we have used Poincaré's and Korn's inequalities (1.15) and (1.16), respectively, to bound the second term and the relation (2.12) for the third one. As D(u m ) L 2 (Ω) = |û m |, we deduce from the last inequality that if |û m | = r with r as defined above, then Thanks to Lemma 3, there exists a pointû m ∈ B m (0, r) such that F(û m ) = 0, i.e., problem (3.22) has a solution u m ∈ V m that satisfies the uniform bound (3.24). Finally, it is easily shown that T d m := G(u m ) satisfies the bound (3.25).
Step 3 (Passage to the limit m → ∞ and identification of the limit). We consider the sequences (u m ) m≥1 Thanks to the uniform estimates (3.24) and (3.25) there exist two subsequences (not relabelled) such that Our objective is to show that the pair (T d ,ū) ∈ M × V is a solution to the problem under consideration by passing to the limit in (3.22), (3.23).
Passing to the limit in (3.22), (3.23) is however not straightforward because of the lack of strong convergence of T d m in M . Identifying the pair (T d ,ū) ∈ M × V as a solution will be achieved by means of the following two lemmas, the first of which (Lemma 4) relies on the equations and the strong convergence of the sequence (u m ) m≥1 in L q (Ω) d shown above, and the second lemma (Lemma 5) follows from the monotonicity property (2.14). The proof, included below, that the pair (T d ,ū) satisfies (3.7) is inspired by the arguments in [11], where a more general constitutive relation than ( Proof. By testing equation (3.23) with S = D(w j ) and substituting into (3.22) we deduce that Multiplying (3.27) by (û m ) j , summing over j, and applying (3.13), we derive Thus we obtain on the one hand On the other hand, letting m tend to infinity in (3.27) for fixed j and considering the strong convergence of u m , we infer that In view of (3.13), the choice v =ū in (3.30) yields and (3.26) then follows from (3.29) and (3.31).
for all S ∈ M . Taking then S = T d m −T d and using the monotonicity property (2.14) we get Finally, we take the limit m → ∞ of both sides and apply (3.26) to obtain Proof. It follows from Lemma 5 that on the one hand (T d ,ū) solves (3.7) and on the other hand, Indeed, passing to the limit in (3.23) gives, for any S ∈ M , Therefore, taking the limit as m → ∞ in (3.22) we get for each j = 1, 2, . . ., and thus the density of m≥1 V m in V implies that which is precisely (3.6).

Global conditional uniqueness
We now prove global uniqueness of the solution under additional assumptions on the function µ and the input data. The notion of uniqueness we establish is global and conditional in the sense that it holds under suitable restrictions on the data, but it is also global because no other solution exists. Let R d×d sym,0 denote the space of symmetric d × d matrices with vanishing trace and let C S be the smallest positive constant in the following Sobolev embedding: (3.33) If the input data satisfy then the solution of problem (3.6), (3.7) is unique.
Proof. We use a variational argument. Suppose that (T d 1 , u 1 ), (T d 2 , u 2 ) ∈ M × V are solutions of (3.6), (3.7). Let us write δT d := T d 1 − T d 2 and δu := u 1 − u 2 . Subtracting the equations solved by (T d 2 , u 2 ) from those solved by (T d 1 , u 1 ) we get for all (S, v) ∈ M × V the following pair of equalities: The choice S = δT d in (3.37), thanks to the monotonicity property (2.14), leads to Then, by noting that by testing (3.36) with v = δu, and recalling (3.13) we obtain The assumption (3.35) on the data guarantees that the factor on the right-hand side of the last inequality is strictly smaller than 1 α , thus implying that D(δu) L 2 (Ω) = 0, i.e., u 1 = u 2 . Finally, applying this result to (3.38) Remark 3. The strategy used in deriving the second a priori estimate stated in Lemma 2 leads to uniqueness when (3.35) is replaced by In fact both strategies lead to the same condition (3.39); namely, we also get (3.39) by using (3.14) instead of (3.10) to bound D(u 1 ) L 2 (Ω) in the proof of Proposition 2.
Note that both (3.35) and (3.39) hold when γ and f are sufficiently small.

Remark 4.
Under the Lipschitz condition (3.34), the proof of (3.20) and (3.15) is valid with µ max replaced by Λ, and the L 2 (Ω) d norm of f (multiplied by C P C K ) replaced by its norm in V ′ , see Remark 1. More precisely,

Comparison of the a priori estimates
At this stage, it is useful to compare the a priori estimates derived in the previous sections. We have where Λ is replaced by µ max if we do not make the Lipschitz assumption (3.34). For p we have where V ⊥ denotes the orthogonal complement of V in V with respect to the inner product (3.21).

Remark 5.
We can replace C 2 S by the product C p C r of the smallest constants C p and C r from the Sobolev embedding of H 1 (Ω) d into L p (Ω) d and L r (Ω) d , respectively, with p = 6 and r = 3. We could also use the best constantĈ such that In the former case,Ĉ ≤ C p C r while in the latter case,Ĉ ≤ C 3 K C p C r .

Conforming finite element approximation
In this section, we study conforming finite element approximations of problem (3.2), where conformity refers to the discrete velocity space. To facilitate the implementation, it is useful to relax the zero trace restriction on the discrete tensor space, but this is not quite a nonconformity because the theoretical analysis of the preceding sections holds without this condition. In particular, the inf-sup condition (2.4) is still valid (supremum over a larger space). We start with the numerical analysis of general conforming approximations, including existence of discrete solutions, convergence, and error estimates, and give specific examples further on.

General conforming approximation
As stated above, here M = L 2 (Ω) d×d sym . Up to this modification, we propose to discretise the formulation derived from (3.2): (4.1) Note that, since div(u) = 0, by taking S = I the second line of (4.1) implies that the solution T d of (4.1) satisfies tr(T d ) = 0 a.e. in Ω, even though this condition was not explicitly imposed on elements of M . Let h > 0 be a discretisation parameter that will tend to zero and, for each h, let V h ⊂ V , Q h ⊂ Q and M h ⊂ M be three finite-dimensional spaces satisfying the following basic approximation properties, for all S ∈ M , v ∈ V and q ∈ Q: We assume on the one hand that the pair for some constant β * > 0, independent of h, and on the other hand that M h and V h,0 are compatible in the sense that Note that the latter assumption may be prohibitive when considering conforming finite elements on quadrilateral (d = 2) or hexahedral (d = 3) meshes, see Subsection 4.2; this motivates the study of non-conforming finite elements considered in Section 5. The inf-sup condition (4.3) guarantees that which can be shown using a standard argument; see for instance [20]. Here, c b denotes the continuity constant of b 2 (·, ·) on V × Q.
As the divergence of functions of V h,0 is not necessarily zero, the antisymmetry property (3.13) does not hold in the discrete spaces. Since this property is a crucial ingredient in the analysis of our problem, it is standard (see for instance [41,20]) to introduce the trilinear form d : The trilinear form d is obviously antisymmetric and it is consistent thanks to the fact that Moreover, a standard computation shows that there exists a constantD ≤ min(C 2 We then consider the following approximation of problem (4.1): (4.9)

Existence of a discrete solution
Existence of a solution to problem (4.9) without restrictions on the data is established by Brouwer's fixed point theorem, as in Section 3.3. To begin with, for any function v ∈ V , we define the discrete analogue of the mapping G, see This finite-dimensional square system has one and only one solution G h (v) thanks to the properties of the left-hand side: the first term is elliptic and the second term is monotone. As in Section 3.3, in view of the inf-sup condition (4.3), problem (4.9) is equivalent to finding where T h := G h (u h ). By proceeding as in Proposition 1, it is easy to prove that problem (4.11) has at least one solution u h ∈ V h,0 , and by the above equivalence, each solution u h determines a pair ( . Moreover, each solution of problem (4.9) satisfies the same estimates as in (3.10) and (3.11). For the sake of simplicity, since the approximation is conforming, we state them in terms of the norm of f in H −1 (Ω) d , and Regarding the other a priori bounds, (3.20) and (3.15) are satisfied by u h and T h and, if (3.34) holds, so are (3.41) and (3.40), all up to the above norm for f . In contrast, however, we do not have enough information to claim that (3.14) is valid because it relies on the nonnegativity of T h : D(u h ) almost everywhere in Ω; the integral average is positive but this does not always guarantee pointwise nonnegativity. Thus we replace the constant C u of (3.42) by the constant C u in the following inequality: 14) where the last term is included when (3.34) holds. Because C u ≤ C u , we shall use C u to bound both u and u h in order to simplify the constants in the computations that will now follow.
Finally, let us establish the convergence of the sequence of discrete solutions in the limit of h → 0. The above uniform a priori estimates imply that, up to a subsequence of the discretisation parameter h, Clearly, the symmetry of T h implies that ofT and div(ū) = 0 follows from the fact that u h belongs to V h,0 . Then the approximation properties of the discrete spaces and (4.5) permit to replicate the steps of the proof of Lemma 4 and yield (4.15) To fully identify the limit, in addition toT d := G(ū), which has trace zero since div(ū) = 0, we introduce the auxiliary tensorT h := G h (ū). On the one hand Since bothT h andT d are bounded in M uniformly with respect to h, and again a uniform bound, then the approximation properties of M h and the monotonicity property (2.14) imply that lim On the other hand, the auxiliary tensorT h permits us to argue as in the proof of Lemma 5. Indeed, the monotonicity property (2.14) yields From (4.15) and (4.16), we easily derive that the above right-hand side tends to zero. Hence and then combining this with (4.16) we infer that Hence uniqueness of the limit implies thatT =T d = G(ū). This, and (4.3), permit to identify the limit as in Lemma 5 and Theorem 1, and proves convergence to a weak solution without restrictions on the data. Thus we have proved the following result.
Theorem 2. (Convergence for all data) Under the above approximation properties and compatibility of the discrete spaces, up to a subsequence, where (T d , u, p) is a solution of (3.6), (3.7).

Error estimate
We now prove an a priori error estimate between (T d , u, p) and (T h , u h , p h ), under the assumption (3.34) that has not been used so far, and the small data condition (4.18) below. Note that this small data condition is in fact the same as the uniqueness condition (3.35), upon replacing C u by C u . To simplify the notation and compress some of the long displayed lines of mathematics, we shall write · V , · M and · Q instead of D(·) L 2 (Ω) (as a norm on V ), · L 2 (Ω) (as a norm on M ) and · L 2 (Ω) (as a norm on Q), respectively.
Theorem 3. In addition to (3.34), let the input data satisfy where 0 < θ < 1 andD is the constant from (4.8). Then, there exists a constant C > 0 independent of h such that the difference between the solution (T h , u h , p h ) of (4.9) and (T d , u, p) of (4.1) satisfies Proof. Since we are using conforming finite element spaces, taking (S, v, q) = (S h , v h , q h ) in (4.1) and subtracting the equations of (4.9) we easily get The rest of the proof is divided into three steps.
Step 1 (Error bound for the pressure). By the triangle inequality we have, for any q h ∈ Q h , and it therefore suffices to derive a bound on q h − p h Q . From the (discrete) inf-sup condition we have Again, using the first equation of (4.20) we have where we can take c b = C K using the relation div(v) 2 L 2 (Ω) + rot(v) 2 L 2 (Ω) = ∇v 2 L 2 (Ω) that holds because we have homogeneous Dirichlet boundary conditions (otherwise take c b = √ dC K ). Thus, we obtain for any q h ∈ Q h .
Step 2 (Error bound for the stress tensor). Again, we start with the triangle inequality; for any S h ∈ M h we have that and we then bound T h − S h M . Thanks to the monotonicity property (2.14) and the second equation of (4.20), we have and thus Step 3 (Error bound for the velocity). Recalling the definition of V h,0 in (4.2), let v h,0 ∈ V h,0 and let v h := v h,0 − u h ∈ V h,0 . We will first show the relation (4.19) by taking the infimum over V h,0 instead of V h . As before, we use the triangle inequality to get Thanks to the assumption (4.4), we can take S h = D(v h ) in the second equation of (4.20) yielding Using the first equation of (4.20), we can easily derive the equality thanks to the fact that b 2 (v h , q h − p h ) = 0. To bound the convective term, we use Therefore, using the assumption (4.18) on the input data, we obtain for any v h,0 ∈ V h,0 . Finally, combining (4.21), (4.22) and (4.23) we obtain We can then conclude the proof using (4.6).

Examples of conforming approximation
From now on, we assume that the boundary of the Lipschitz domain Ω ⊂ R d is a polygonal line (when d = 2) or a polyhedral surface (when d = 3), so that it can be exactly meshed. For each h, let T h be a conforming mesh on Ω consisting of elements E, triangles or quadrilaterals in two dimensions, tetrahedra or hexahedra (all planar-faced) in three dimensions, conforming in the sense that the mesh has no hanging nodes. As usual, the diameter of E is denoted by h E , and ̺ E is the diameter of the largest ball inscribed in E.

The simplicial case
In the case of simplices, the family of meshes T h is assumed to be regular in the sense of Ciarlet [14]: i.e., it is assumed that there exists a constant σ > 0, independent of h, such that This condition guarantees that there is an invertible affine mapping F E that maps the unit reference simplex onto E. For any integer k ≥ 0, let P k denote the space of polynomials in d variables of degree at most k. In each element E, the functions will be approximated in the spaces P k . The specific choice of finite element spaces is dictated by two considerations. First, conditions (4.3) and (4.4) must be satisfied. Next, since the number of unknowns in (4.9) is large, the degree k of the finite element functions should be small. It is well-known that the lowest degree of conforming approximation of (u, p) satisfying (4.3), without modification of the bilinear forms, is the Taylor-Hood P d 2 -P 1 element, see [20,3], provided each element has at least one interior vertex.
In view of (4.4), this implies that T d is approximated by P d×d 1 . Thus the corresponding finite element spaces are It is easy to check that with these spaces on a simplicial mesh, under condition (4.24), problem (4.9) has at least one solution. Furthermore, if the data satisfy (4.18), then Theorem 3 yields provided that the solution is sufficiently smooth, namely u ∈ H 3 (Ω) d ∩ H 1 0 (Ω) d , T d ∈ H 2 (Ω) d×d , and p ∈ H 2 (Ω) ∩ L 2 0 (Ω). Therefore the scheme has order two for an optimal number of degrees of freedom, i.e., this order of convergence cannot be achieved with fewer degrees of freedom.

The quadrilateral/hexahedral case
The notion of regularity is more complex for quadrilateral and much more complex for hexahedral elements. In the case of quadrilaterals [20], the family of meshes is regular if the elements are convex and, moreover, the subtriangles associated to each vertex (there is one per vertex) all satisfy (4.24). In the case of hexahedra with plane faces, convexity and the validity of (4.24) for the subtetrahedra associated to each vertex are necessary but not sufficient. This difficulty has been investigated by many authors, see for instance [42,23]; the most relevant publication concerning hexahedra with plane faces is however [22], where the minimum of the Jacobian in the reference cubeÊ is bounded below by the minimum of the coefficients of its Bézier expansion and this minimum is determined by an efficient algorithm. The details of this are beyond the scope of this work, and we shall simply assume here that the minimum of these Bézier coefficients is strictly positive and that furthermore, denoting by J E the Jacobian determinant of F E , with a constantĉ independent of E and h. If these conditions hold, there is an invertible bi-affine mapping F E in two dimensions or tri-affine in three dimensions that maps the unit reference square or cube onto E. We let Q k be the space of polynomials in d variables of degree at most k in each variable. In contrast to the case of simplicial meshes, the space Q k is not invariant under the composition with F E , which makes the compatibility condition (4.4) between D(V h ) and M h problematic. To circumvent this issue, we restrict ourselves to affine maps F E , thereby allowing subdivisions consisting of parallelograms/parallelepipeds. In addition, the situation is less satisfactory when a quadrilateral or hexahedral mesh is used, because although the Taylor-Hood Q d 2 -Q 1 element satisfies (4.3), the second condition (4.4) does not hold if T d is approximated by Q d×d 1 since the components of the gradient of Q 2 functions belong to a space, intermediate between Q 2 and Q 1 , that is strictly larger than both Q 1 and P 2 . Therefore, in order to satisfy (4.4), the simplest option is to discretise each component of T d by Q 2 . The corresponding finite element spaces are With these spaces and under the above regularity conditions, problem (4.9) has at least one solution and the error estimate (4.25) holds if the data satisfy (4.18). However, this triple of spaces is no longer optimal, because the degree two approximation of T d now requires far too many degrees of freedom with no gain in accuracy. For instance, when d = 3, its approximation by (Q 2 ) 3×3 sym requires 27 × 6 = 162 unknowns inside each element instead of 8 × 6 = 48 unknowns for (Q 1 ) 3×3 sym . The nonconforming finite element approximations discussed in Section 5 do not require an affine mapping F E and, by considering P-type approximations on the physical element E, do not suffer from the computational cost overhead mentioned above.

Nonconforming finite element approximation
The nonconforming approximations developed here will not only allow the use of elements of degree one for u, but will also lead to locally mass-conserving schemes. Because of the discontinuity of the finite element functions, the proofs are in some cases more complex; this is true in particular for the proof of the inf-sup condition for the discrete divergence.

The quadrilateral/planar-faced hexahedral case
Here we consider quadrilateral/hexahedral grids T h with planar faces, satisfying the regularity assumptions stated in Section 4.2. There is a wide choice of possible approximations with nonconforming finite elements. Here we propose globally discontinuous velocities in P d k , k ≥ 1, in each cell associated with globally discontinuous pressures and stresses both of degree at most k − 1. Thus we consider As usual, the full nonconformity of V h is compensated by adding to the forms consistent jumps and averages on edges when d = 2 or faces when d = 3; see for instance [38]. Let Γ h = Γ i h ∪ Γ b h denote the set of all edges when d = 2 or all faces when d = 3 with Γ i h and Γ b h signifying the set of all interior and the set of all boundary edges (d = 2) or faces (d = 3), respectively. A unit normal vector n e is attributed to each e ∈ Γ h ; its direction can be freely chosen. Here, the following rule is applied: if e ∈ Γ b h , then n e = n Ω , the exterior unit normal to Ω; if e ∈ Γ i h , then n e points from E i to E j , where E i and E j are the two elements of T h adjacent to e and the number i of E i is smaller than that of E j . The jumps and averages of any function f on e (smooth enough to have a trace) are defined by When e ∈ Γ b h , the jump and average are defined to coincide with the trace on e. The terms involving jumps and averages that are added to each form are not unique; here we make the following fairly standard choice: The trilinear form d is approximated by a centred discretisation, as follows: Clearly, the jump terms in (5.4) and (5.6) vanish when v h belongs to H 1 0 (Ω) d . Likewise, the jump and divergence terms in (5.5) vanish when u h and v h belong to H 1 0 (Ω) d and div(u h ) = 0. Moreover, (5.5) is constructed so that d h is antisymmetric, Finally, the following positive definite form acts as a penalty to compensate the nonconformity of u h :

8)
where h e is the average of the diameter of the two elements adjacent to e, if e ∈ Γ i h , or the diameter of the element adjacent to e otherwise. The parameters σ e > 0 will be chosen below to guarantee stability of the scheme, see (5.28) and (5.24). This form is also used to define the norm on denotes the associated semi-norm. Also, in view of (5.6), we define the space of discretely divergence-free functions, The discrete scheme reads: As expected, b 2h (v h , 1) = 0, and therefore the system (5.12) is unchanged when the zero mean value constraint is lifted from the functions of Q h .

Properties of the norm and forms
All constants below depend on the regularity of the mesh but are independent of h. In particular, we shall use C to denote such generic constant independent of h. In addition, we shall use the following "edge to interior" inequality. There exists a constantĈ, depending only on the dimension d and the degree of the polynomials, such that for all v h ∈ V h , all e ∈ Γ h and any element E, adjacent to e, v h L 2 (e) ≤Ĉ |e| |E| It is easy to check that (5.9) defines a norm on V h . Next, the results in [6,7] yield the following consequences of a discrete Korn inequality: and where ∇ h v h is the broken gradient (i.e., the local gradient in each element). Moreover, by following the work in [19,24,10], this can be generalised for all finite p ≥ 1 when d = 2 and all p ∈ [1, 6] when d = 3, to Finally, the inequality below is used in choosing σ e . Its proof is fairly straightforward, but it is included here for the reader's convenience.
Similarly, for e ∈ Γ b h , which is the face of an element E adjacent to ∂Ω, we have

By using the last two inequalities in
with the notational convention that when summing over e ∈ Γ b h the element E under the summation sign is the element adjacent to ∂Ω with face e, and when summing over e ∈ Γ i h the elements E 1 and E 2 under the summation sign are the ones that share the face e. Hence, .
The asserted result (5.23) follows from the last inequality by noting that, for each E ∈ T h , the factor D(u h ) 2 L 2 (E) appears at most 2d times.
Concerning the expression appearing in (5.24) we note that, thanks to the regularity assumption on the family of meshes, we have that h e |e| |E| ≤ C and so (5.25)

First a priori estimates
By testing the first equation of (5.12) with v h = u h , applying the third equation and the antisymmmetry (5.7) of d h , we obtain Next, by testing the second equation of (5.12) with S h = T h and substituting the above equality, we deduce that Thus, in view of (5.14), we have our first bound: A further bound is arrived at by testing the second equation of (5.12) with S h = D(u h ); hence, Then Proposition 3 gives, for any δ > 0, We choose δ = 1 and, upon recalling (5.25), assume that σ e is chosen so that Next, by adding J h (u h , u h ) to both sides of (5.27), applying (5.26) to bound this term, and using the norm of V h , we infer that To close the estimates, we return to (5.26) and get for any δ 2 > 0. Thus Thus we have shown the following uniform and unconditional bounds: An a priori estimate for the pressure requires an inf-sup condition. This is the subject of the next subsection.

An inf-sup condition
In the nonconforming case considered here, the analogue of (4.3) reads with a constant β * > 0 independent of h. To check this condition, recall Fortin's lemma; see for instance [20].

32)
and with a constant C independent of h.
Originally, Fortin's lemma was stated for discrete functions in subspaces of H 1 0 (Ω) d , but the extension to spaces of discontinuous functions is straightforward, as long as the form b 2h (·, ·) is consistent with the divergence, which is the case here.
As the proof of (5.32), (5.33) is fairly technical, we restrict the discussion to the first order case, i.e., k = 1, in hexahedra. The quadrilateral case is much simpler.

The inf-sup condition in planar-faced hexahedra for k = 1
The construction of a suitable operator Π h is usually done by correcting a good approximation operator R h . For instance, we can use the L 2 projection onto the space of polynomials of degree one defined locally in each element, so that R h (v) belongs to V h and satisfies optimal approximation properties; see for instance [8].
By expanding b 2h and denoting by q E the value of q h in E, (5.34) reads Green's formula in each element yields 35) with n E the unit exterior normal to E. Consider now an interior face e shared by E 1 and E 2 , so that n e is interior to E 2 ; the contribution of e to the left-hand side of (5.35) is with a similar contribution to the right-hand side. Notice also that the contribution of a boundary face e ∈ Γ b h is equal to zero on both sides of (5.35). Therefore a sufficient condition for (5.35) is that We will thus construct c h ∈ V h by imposing (5.36) for each element E ∈ T h and each face e ∈ ∂E. To simplify the notation, we will write from now on c h and ( Let E be an arbitrary hexahedral element of T h with faces e i , centre of face b i , and exterior unit normal n i , 1 ≤ i ≤ 6. To be specific, let a i , i = 1, 2, 3, 4, be the vertices of e 1 , a i , i = 1, 3, 5, 6, the vertices of e 2 , a i , i = 1, 2, 5, 7, the vertices of e 3 , a i , i = 5, 6, 7, 8, the vertices of e 4 , a i , i = 2, 4, 7, 8, the vertices of e 5 , and a i , i = 3, 4, 6, 8, the vertices of e 6 . The ordering of the nodes is illustrated in Figure 1. Note that for i = 1, 2, 3, e i+3 is the face opposite to e i , opposite in the sense that its intersection with e i is empty. Without loss of generality, we assume that the vertex a 1 is located at the origin and that the face e 1 lies on the x 3 = 0 plane. Indeed, this situation can be obtained via a rigid motion (translation plus rotation), which preserves all normal vectors. Therefore, the normal to the face e 1 is parallel to the x 3 axis. Now, the idea is to transform E onto a "reference" elementÊ by an affine mapping F E so that the subtetrahedron S 1 of E based on e 1 and containing the origin a 1 is mapped onto the unit tetrahedronŜ 1 . More precisely, as e 2 and e 3 are both adjacent to e 1 , S 1 is the subtetrahedron with vertices a 1 , a 2 , a 3 , and a 5 , andŜ 1 has verticesâ 1 = (0, 0, 0),â 2 = (0, 1, 0),â 3 = (1, 0, 0),â 5 = (0, 0, 1), see Figure 1 for an illustration and some notation. This transformation and notation will be used till the end of this subsection. It stems from the regularity of the family of triangulations that there exists a constant M , independent of E and h, such that diameter(Ê) ≤ M. (5.37) The affine mapping F E has the expression where the constant term is zero since a 1 is the origin, and the matrix B is nonsingular; its columns are respectively a 3 = (a 1 3 , a 2 3 , 0) t , a 2 = (a 1 2 , a 2 2 , 0) t and a 5 = (a 1 5 , a 2 5 , a As F E is an affine transformation, it transforms faces onto faces, edges onto edges, and vertices onto vertices. Thus, since a 4 is in the plane x 3 = 0, thenâ 4 is in the planex 3 = 0. Likewise,â 6 is in the planex 2 = 0,â 7 in the planex 1 = 0, andâ 8 in the plane determined byâ 4 ,â 2 ,â 7 , as well as the plane determined byâ 7 ,â 5 ,â 6 , and the plane determined byâ 6 ,â 3 ,â 4 , hence in the intersection of these three planes. ThereforeÊ is located in the first octant of R 3 . Letn i denote the unit exterior normal vector toê i . It is related to n i by the general formulâ The advantage of having e 1 on the plane x 3 = 0 is thatn 1 = n 1 = (0, 0, −1) t . We also haven 2 = (0, −1, 0) t , andn 3 = (−1, 0, 0) t . Thus |n 3 1 | = |n 2 2 | = |n 1 3 | = 1, (5.39) and the regularity of the family T h implies that there exists a constant ν 0 , independent of h and E, such that |n 3 4 |, |n 2 5 |, |n 1 6 | ≥ ν 0 . (5.40) With this transformation, and after cancelling |detB| on both sides, (5.36) reads locally where the hat denotes composition with F E . Thus, by performing the change of variablê This is a linear system of six equations in twelve unknowns, the coefficients ofd h . Therefore, we can freely choose six coefficients and we have the following existence lemma. Collecting these results and the two extra assumptions mê 1 (d 1 ) = mê 1 (d 2 ) = 0 in (5.42), we find that The three facesê 1 ,ê 5 ,ê 6 share the vertexâ 4 , and the regularity of the hexahedron implies that the three vectors along the segments [â 4 ,â 3 ], [â 4 ,â 8 ], and [â 4 ,â 2 ] is a set of three linearly independent vectors of R 3 . Then the regularity of the hexahedron implies that a polynomial of degree one is uniquely determined by its moments on the four facesê 1 ,ê 5 ,ê 6 ,ê i for any i in the set {2, 3, 4}. Hence, asd 1 (respectively,d 2 ) is a polynomial of degree one, the first set (respectively, second set) of equalities and the regularity of the hexahedron imply thatd 1 = 0, respectively,d 2 = 0. When i = 4, this leads to mê 4 (d 3 ) = 0. Consequently, Let MÊ be the 6 × 6 matrix of the system (5.41) under the restriction (5.42). It stems from Lemma 7 that MÊ is nonsingular. Furthermore, the regularity of the hexahedron implies that MÊ is a continuous function ofÊ, thus continuous in a compact set of R 3 . Hence the norm of its inverse is bounded by a constant C, independent ofÊ, The stability of the correction follows now easily.

Lemma 8.
There exists a constantĈ, independent of h and E, such that for all E in T h and all e in Γ h , 47) where E 1 and E 2 are the two elements sharing e, when e is an interior face, and the sum is reduced to one term, namely the element E adjacent to e, when e is a boundary face.
Proof. The notationĈ below refers to different constants that are all independent of h and E. Recalling (5.41), (5.37) and the transformation fromŜ 1 onto S 1 , we observe that, for any i, By a trace inequality inÊ and the approximation property of R h inÊ, we have Then, by reverting to E, In view of (5.46) and the regularity of the family T h , the above relations lead to the following bound ond h : Since h S1 < h E , we immediately deduce from (5.48) the first two inequalities in (5.47). Finally, the third inequality follows from (5.48) and That completes the proof of the lemma.
As a consequence of Lemma 8 we have the following bounds: Finally, since the construction of Lemma 7 yields a unique correction, it is easy to check that the mapping v → c h defines a linear operator from V h into itself, i.e., c h = c h (v).
On the other hand, we infer from standard approximation properties of R h and the regularity of the mesh, that

A bound on the pressure
As usual, the inf-sup condition (5.31) yields a bound on the pressure. Indeed, it follows from the first equation of (5.12) together with (5.22), (5.20), (5.15) and (5.16) that Then (5.30) implies, with a constant C independent of h (but depending on α), that With the inf-sup condition (5.31), this implies that for another constant C independent of h.

Existence and convergence
The proof of existence of a solution of (5.12) is the same as in the conforming case. Recall that the case of interest is k = 1, which is assumed for the remainder of this subsection, but all of what follows can be straightforwardly extended to a general polynomial degree k ≥ 1 as long as the inf-sup condition (5.31) holds. First, the problem is reduced to one equation by testing the first equation of (5.12) with v h ∈ V h,0 and by observing that the second equation determines for each u h ∈ V h a unique T h in M h . This is expressed by writing T h = G h,DG (u h ). Then, (5.12) is equivalent to finding a u h ∈ V h,0 such that By means of the a priori estimates (5.30), existence of a solution is deduced by Brouwer's fixed point theorem.
However, in order to pass to the limit in the equations of the scheme, following [16], we need to introduce discrete differential operators related to distributional differential operators. These are G sym The polynomial degree one in this space is convenient for proving the convergence of the nonlinear term; see (5.62). The straightforward scaling argument used in proving Proposition 3 shows that and and thus by (5.15) with different constants C independent of h. At the same time, this gives existence of these two operators. The next proposition relates G sym h (u h ) and D(ū). The proof is an easy extension of that written in [16], but we include it below for the reader's convenience. Proof. On the one hand, the bounds (5.56) and (5.30) imply that there exists a functionw ∈ L 2 (Ω) d×d sym such that, up to a subsequence, lim On the other hand, take any tensor F in H 1 (Ω) d×d sym and let P 0 h (F ) be its orthogonal L 2 (Ω) d×d projection on constants in each E. We have that tends to zero with h. Therefore, the definition (5.54) of G sym and a straightforward argument yields that the first term tends to zero. Hence

Now, an application of Green's formula in each
A comparison with (5.59) and uniqueness of the limit yield D(ū) =w, thus proving (5.58).
Remark 6. The fact thatū belongs to H 1 0 (Ω) d is an easy consequence of the above proof.
A similar argument to the one in Proposition 4 gives that Hence, by passing to the limit in the last equation of (5.12), we immediately deduce that div(ū) = 0; thus u belongs to V and satisfies the third equation of (4.1).
In the next theorem, these results are used to show that the limit satisfies the remaining equations of (4.1).
Theorem 5. Let the family of hexahedra T h be regular in the sense defined above. Then the triple (T ,ū,p) solves (4.1).
Proof. The proof proceeds in two steps.
Step 1. Let us start with the first equation of (5.12). Take a function v ∈ D(Ω) d and let v h ∈ V h be the L 2 (Ω) d orthogonal projection of v on P d 1 in each element. It is easy to check that Therefore the weak convergence of T h and the definition of G sym Similarly, As the right-hand side tends to Thanks to the antisymmetry of d h , we have For the first term, the strong convergence of u h in L 4 (Ω) d and the strong convergence of the broken gradient sinceū ∈ V. For the second term, take any piecewise constant approximationv h of v. Then The boundedness of u h in V h and the convergence to zero of v h −v h in L ∞ (Ω) d imply that the first term tends to zero. For the second term, we deduce from the definition of G div As div(ū) = 0, G div h (u h ) tends to zero weakly in L 2 (Ω). Then the strong convergence of u h in L 2 (Ω) d and that ofv h in L ∞ (Ω) d show that this second term tends to zero. It remains to examine the last term of (5.61). Here we use the fact that, for any This, with the boundedness of u h in V h , gives that this last term tends to zero. Thus, we conclude that The conclusion of these limits and a density argument is that the triple (T ,ū,p) satisfies the first equation of (4.1) Step 2. The argument for recovering the constitutive relationT = G(ū) is close to that for the conforming case, up to some changes. On the one hand, we observe that and, since J h (u h , u h ) is positive and bounded, this implies that On the other hand, we infer from (5.63) that and defineT h = G h,DG (ū), i.e., where the second equality holds thanks to the fact thatū belongs to H 1 0 (Ω) d . The fact that div(ū) = 0 implies that the trace ofT d is zero and justifies the above superscript. Therefore and, as in the conforming case, we conclude that Finally, the difference between the equations satisfied by T h andT h yields By testing this equation with S h = T h −T h and using the monotonicity property (2.14), we deduce that However, by (5.54), and it follows from Proposition 4 and (5.65) that Then, by passing to the limit in (5.66), we obtain in view of (5.64) the inequality α lim This proves that (T ,ū) satisfies the second equation of (4.1).

The tetrahedral case
Here we study briefly two examples of finite element discretisations on tetrahedral meshes, the triangular case being simpler. Many of the details are skipped because they follow closely those in the previous subsection. The family of meshes T h is assumed to be regular as in (4.24). Let us start with the same spaces V h , Q h , and M h defined on T h by (5.1), (5.2), and (5.3), respectively, and the same bilinear (5.5), and (5.6), respectively. Then the scheme is again given by (5.12) and, under assumption (4.24), all proofs from the previous subsections are valid in this case, except possibly the proof of the inf-sup condition. In fact, Theorem 4 holds with a much simpler proof. Indeed, take any tetrahedron E. Recalling that the case of interest is k = 1, a polynomial of P 1 is uniquely determined in E by its values at the centre points b e of its four faces e. Then, instead of (5.36), we can use the sufficient condition and this defines uniquely the correction c h . Furthermore, thanks to (4.24), the stability of this correction follows from the fact that E is the image of the unit tetrahedronÊ by an invertible affine mapping whose matrix satisfies the same properties as the matrix B used above. Thus the conclusion of Theorem 4 is valid in this case.
As a second example, it would be tempting to use the Crouzeix-Raviart element of degree one on tetrahedra; see [15]. This would be possible if the analysis did not invoke Korn's inequality (with respect to the broken symmetric gradient), because it is not satisfied by the Crouzeix-Raviart element; cf [18]. Thus, the simplest way to bypass this difficulty is to introduce the jump penalty term J h (u h , v h ) defined in (5.8). Let us describe this discretisation. Again, we suppose that (4.24) holds. The discrete spaces Q h and M h are the same, with k = 1, as in (5.2) and (5.3), respectively. However, instead of V h , we now use the space V CR h whose elements are also piecewise polynomials of degree one in each element, but in contrast with (5.1), they are continuous at the centre points of all interior faces e ∈ Γ i h , and are set to zero at the centre points of all boundary faces e ∈ Γ b h . Thanks to this pointwise continuity and boundary condition, the scheme now involves the following bilinear/trilinear forms, compare with (5.4), (5.5), (5.6): With these new forms, analogously to (5.12), the finite element approximation of the problem reads as follows: find a triple ( is obviously antisymmetric and is simpler. Although the norm of the broken gradient is a norm on V CR h , the mapping v h → D(v h ) h is not a norm on V CR h . According to [6,7], we have instead (5.14) and (5.15). That is why we use again the norm v h V h defined in (5.9) and keep the term J h (u h , v h ) in the first line of (5.71). Note however that the parameters σ e need not be tuned by Proposition 3 since there are no surface terms in b CR 1h (T h , v h ); thus it suffices for instance to take σ e = 1 for each face e. Moreover, the analysis used for the general discontinuous elements substantially simplifies here. First, as there are no surface terms in the bilinear forms, the bounds are simpler. Next, the operator Π h satisfying the statement of Lemma 6 is constructed directly by setting, for v in H 1 0 (Ω) d , see [15]. Clearly, as v ∈ H 1 0 (Ω) d , (5.72) defines a piecewise polynomial function of degree one in V CR h . Finally, convergence of the scheme is derived without the discrete differential operators G sym h and G div h . Indeed, property (5.17) can be extended as is asserted in the proposition.
with a constant C independent of h, then there exists a functionv ∈ H 1 0 (Ω) d satisfying (5.17) and lim where D h stands for the broken symmetric gradient.
The proof, contained in [15], relies on the fact that the integral average of the jump [v h ] e vanishes on any face e and hence, for any tensor Thus, there is no need for G sym h ; the same is true for G div h . This permits to pass directly to the limit in (5.71).

Numerical illustrations
We introduce two decoupled iterative algorithms. The first one is based on a Lions-Mercier decoupling strategy while the second one is a fixed point algorithm. All the algorithms are implemented using the deall.ii library [1]. For simplicity, we focus on conforming finite element approximations for which an a priori error estimate has been derived in Subsection 4.1.2. Performing numerical experiments in the case of the nonconforming approximation scheme will be the subject of future work.
The general setup is the following: • Dirichlet boundary conditions are imposed on the entire domain boundary (not necessarily homogeneous); • A sequence of uniformly refined meshes with square elements of diameter h = √ 2/2 n , n = 2, . . . , 6 (level of refinement) are considered for the mesh refinement analysis; • The finite element spaces M h , V h , and Q h consist, respectively, of discontinuous piecewise polynomials of degree 2, continuous piecewise polynomials of degree 2, and continuous piecewise polynomials of degree 1 (see Subsection 4.2.2).
Following [5], we replace the constitutive relation to design an exact solution. Then, given T d , u and p, we compute the corresponding right-hand sides g and f (forcing term), where we recall that Finally, we choose µ(s) = 1 √ 1+s 2 which corresponds to (1.10) with β = 1 and n = −1/2.

Lions-Mercier decoupled iterative algorithm
We present here an iterative algorithm to compute approximately the solution to problem (3.5), which is based on the formulation (3.4): . Note that problem (6.1) is equivalent to problem (4.9) analysed in Section 4.
To compute the solution to problem (6.1), we propose a decoupled algorithm based on a Lions-Mercier splitting algorithm [25] (alternating-direction method of the Peaceman-Rachford type [31]) applied to the unknown T h . Following the discussion in [5,Section 7], the algorithm reads, for a pseudo-time step τ > 0: Then, for k = 0, 1, . . . , perform the following two steps: Step 1: Find T Step 2: The solution to (6.2) is obtained by first determining u . A standard argument shows that the above algorithm generates uniformly bounded sequences. Thus they converge up to subsequences. However, the identification of a unique limit for the entire sequence is currently unclear.
Regarding the implementation, we make the following comments: • Stopping criterion: For the main loop (Lions-Mercier algorithm), the stopping criterion is ≤ 10 −5 ; (6.4) • Initialisation: We solve the Navier-Stokes system associated to problem (6.2) using Newton's method (the iterates are indexed by m) until the following stopping criterion is met: As an initial guess, we take the solution of the associated Stokes system without the convective term.
The solution to each saddle-point system of the form is obtained using a Schur complement formulation To solve for P, we use the conjugate gradient algorithm in the case of the Stokes problem and GM-RES for the (linearised) Navier-Stokes problems. In both cases, the pressure mass matrix is used as preconditioner and the tolerance for the iterative algorithm is set to 10 −6 BA −1 F − G ℓ2 . A direct method is advocated for every occurrence of A −1 and also to obtain T (0) . Recall that discontinuous piecewise polynomial approximations are used for the stress and so T h is determined locally on each element E ∈ T h as the solution to We again employ Newton's method starting with T so that the global residual is less than 10 −6 . Note that in this case, it might happen that no iteration is needed (e.g. when γ = 0), in which case T • Step 2 : This step is similar to the initialisation step except that we take (u h ) as our initial guess for Newton's method for solving the finite element approximation of the Navier-Stokes system. , and in particular it has vanishing trace. We observe that u is divergence-free. Moreover, the pressure satisfies p = − 1 2 tr(T ) and has zero mean. We report in Table 1 the error for each component of the solution for the case α = 1 and γ = 0, while Table 2 contains the results for α = γ = 1. Note that we use the H 1 semi-norm for the velocity and not the (equivalent) L 2 (Ω) 2×2 norm of the symmetric gradient. We observe in Table 2 n h T d − T h L 2 (Ω) ∇(u − u h ) L 2 (Ω) p − p h L 2 (Ω) iter 2 0.354 6.04199×10 −2 7.51266×10 −2 3.02263×10  Table 2: Case 1, α = γ = 1, δ = 10 −5 , τ = 0.01. that all three errors are O(h 2 ). The deterioration of the convergence rate we observe for T d and p in Table  2 is due to the stopping criterion. Indeed, if we use 10 −6 instead of 10 −5 in the stopping criterion (6.4) for the main loop, then for h = 0.044 (n = 5) we need 250 iterations and we get T d − T h L 2 (Ω) = 4.66898 × 10 −4 , u − u h L 2 (Ω) = 1.13118 × 10 −3 and p − p h L 2 (Ω) = 3.62581 × 10 −4 , compare with the fourth row of Table 2. We give in Tables 3, 4 and 5 the results obtained when a larger pseudo-time step is used. We see that  Table 3: Case 1, α = γ = 1, δ = 10 −5 , τ = 0.05.

A fixed-point algorithm
Instead of the Lions-Mercier type algorithm introduced in Subsection 6.1, we explore the following fixed-point strategy.
Step 1: for all (u h , q h ) ∈ V h × Q h .
Step 2: Find T It is easy to show that this algorithm produces uniformly bounded sequences. The solvers used for these two steps are similar to those described in Subsection 6.1. In particular, we take (u (k) h , p (k) h ) as initial guess for Newton's method for the finite element approximation of the Navier-Stokes system, except when k = 0, in which case we use the solution of the associated Stokes problem.
Concerning the computational cost when similar results are obtained, i.e., when τ = 0.5 for the Lions-Mercier type algorithm, we note that the latter requires the solution of one more equation per iteration, namely the linear equation for T   Table 5) and the algorithm of Section 6.2.