CONVERGENCE RATE ESTIMATES FOR THE LOW MACH AND ALFVÉN NUMBER THREE-SCALE SINGULAR LIMIT OF COMPRESSIBLE IDEAL MAGNETO-HYDRODYNAMICS

Singular limits of the compressible ideal magneto-hydrodynamics equations at small Mach and Alfvén numbers are considered, in which those two parameters are allowed to tend to zero at different rates. After showing that solutions exist in a fixed time interval and developing explicit formulas for the limit system, convergence-rate estimates are obtained using an improvement of the uniform bounds shown previously by the authors, the time-integration method, and a detailed analysis of exact and approximate fast, intermediate, and slow modes. 35B25, 35L45.


INTRODUCTION
A uniform existence theorem and a convergence theorem as the small parameters tend to zero were recently developed [CJS18] for singular limits of symmetric hyperbolic systems of the form In this paper we begin the study of the rate of convergence of solutions of three-scale singular limits to corresponding solutions of their limit equations, an issue that was not considered in [CJS18]. The convergence rate in the general case will undoubtedly be very complicated, since in general many different limit systems are obtained for different power-law relations between the two small parameters as they both tend to zero [CJS18,§4]. In this paper we study the particular case of the low Mach and Alfvén number limit of the compressible ideal magneto-hydrodynamics (MHD) equations in the presence of a large uniform magnetic field, which was the motivating example for our work. As we will show, that system has essentially only one limit system, although the limit µ lim := lim appears in that limit system as a parameter. Even for the MHD system the study of the convergence rate is much more intricate than for standard two-scale singular limits [KM81,KM82,Maj84].
The MHD system in three spatial dimensions that we study, derived in Appendix A from a standard formulation of that system, is a(ε M r) ∂ t r + (u·∇)r) + a(ε M r)ρ(ε M r) where ε M has been rescaled to make the pressure law p(ρ) satisfy Note that in (1.4) "symmetric terms" are grouped into the same vertical column. As noted in Appendix A, the divergence-free condition (1.4d) on the magnetic field is preserved by the dynamics of (1.4c), and so is just a restriction on the initial data. Hence straightforward calculations show that the system (1.4) has the form (1.1), with V = (r, u, b).
When µ is fixed so that only two time scales are present, uniform existence and convergence results were obtained in [KM81] for the limit of smooth solutions of initial value problems of systems (1.1) as the small parameter tends to zero. Moreover, whenever µ lim > 0 corresponding results can be proven via cosmetic changes to the theory in [KM81].
We therefore focus on the more challenging case when µ → 0, although our results will be phrased so as to remain valid when µ lim > 0. We prove the following two theorems about the behavior of solutions to the MHD system (1.4) as the small parameters ε A and ε M tend to zero, of which the first is an application to system (1.4) of an improvement to be shown in §2 of the results of [CJS18] plus an extension of results from [CJS18] itself, and the second is our main result.
Before stating the theorems we define some notations and three operators that will appear in the limit equations or the convergence-rate estimates. First, we let k denote the H k norm. Next, for any vector w let its "horizontal" part w h denote its x, y components ( w 1 w 2 ), and let P div h denote the two-dimensional Leray-Helmholtz projection onto divergence-free velocity fields in the x, y plane or 2-torus, i.e. (1.7) Define the Alfvén and Mach operators ( (1.9) Throughout this paper c and C denote positive constants that are independent of ε A and ε M , which may take different values in each appearance.
Theorem 1.1. Let n ≥ 3 be an integer, and assume that the small parameters ε A and ε M satisfy (1.10) Assume that the spatial domain is R 3 or T 3 and that the initial data V 0 := (r 0 , u 0 , b 0 ), which may depend on the parameters ε A and ε M , are uniformly bounded in H n and satisfy the constraint (1.4d) and the "wellpreparedness" condition (1.11) Then there exist fixed positive T and K such that for (ε A , ε M ) satisfying (1.10) the solution to (1.4) having the initial data V 0 exists for 0 ≤ t ≤ T and satisfies (1.12) in particular the solution is uniformly bounded in H n . Assume in addition that as (ε A , ε M ) satisfying (1.10) tend to zero, their ratio ε A ε M converges to some value µ lim and the initial data V 0 converges in H n to V 0 . Then the solution V = (r, u = (u h , u 3 ), b = (b h , b 3 )) of the MHD system (1.4) with initial data V 0 converges in C 0 ([0, T ]; H n−α ) for every α > 0. Its limit is independent of z, and is identically zero if the spatial domain is R 3 , while if the spatial domain is T 3 it is the unique solution V = (r, (ū h ,ū 3 ), (b h ,b 3 )) of the limit system having initial data V 0 .
Note that if µ lim > 0 then the limit equation (1.13a) for the density perturbation includes a term involving the magnetic field, and the limit equation (1.13c) for the vertical fluid velocity contains a term involving the product of the magnetic field and derivatives of the density, even though no such terms appear in the corresponding equations of the original MHD system (1.4). Those terms arise on account of the coupling between the density perturbation and the vertical magnetic field that occurs in (1.13e) when µ lim is nonzero.
Although the above uniform existence and convergence theorem does not require any assumption about the rate at which the initial data converge to their limit, such an assumption is obviously required in order to obtain a rate of convergence of solutions of the PDE. In the following convergence rate theorem we assume that the initial data has the form developed in §3, which is a specialization of the general form of initial data satisfying (1.11). Specifically, after expanding the initial data in powers of the small parameters and their ratio, the leading-order terms are assumed to be independent of the small parameters in order to avoid degrading the convergence rate. Nevertheless, full generality is retained as far as convergence rates are concerned because their proofs will not rely on the compactness of sequences of solutions, a feature that allows us to consider one solution to the MHD system with fixed ε A , ε M and initial data and compare it to the limit system(s) equipped with initial data that are the leading order terms of the (fixed) initial data for the MHD system.
It will be convenient to write the convergence-rate results using just powers of ε M , by defining a parameter ν determined by (1.14) Since the case when ε A ≥ ε M is covered by the theory of two-scale singular limits or a trivial generalization of that theory, in developing the convergence rate we will assume for notational simplicity that ε A < ε M , i.e., µ < 1 and ν > 0. (1. (1.16e) Under the scaling of (1.10) with ε 0 M and c suitably chosen, ν ≤ 1 n−1 + ln 1 ε M c < 1 n−2 so all powers of ε M appearing in (1.16) are positive, which means that a nontrivial rate of convergence is obtained over the full range of allowed values of ν.
When the initial data are chosen so that the first time derivative of solutions are uniformly bounded at time zero then the convergence rate for solutions of two-scale singular limit systems is typically first order in the single small parameter appearing in those systems [Sch88,Sch94]. However, that result depends crucially on the uniform boundedness of the first time derivative of solutions being propagated for positive time. That propagation result does not generally hold for three-scale systems of the form (1.1) satisfying ε A ε M → 0, as the example in [CJS18,§2] shows. Hence for three-scale singular limits the rate of convergence obtained in (1.16) is slower than for two-scale singular limits, and the proofs are more complicated. Nevertheless, the convergence rates (1.16a) and (1.16c) tend to the standard two-scale O(ε M ) convergence rate as the parameter ν tends to zero, which corresponds to the three-scale system tending to a two-scale system. In order to obtain such a rate of convergence, asymptotically consistent with the two-scale rate, it is necessary to improve the estimate ε M V t n−1 ≤ c obtained in [CJS18], since using that bound in our estimates would have led to a bound c ε ν M in (1.16a), which tends to the trivial O(1) estimate as the three-scale limit tends to the two-scale case in which ν = 0. The required improved bound (1.12) will be proven for general systems (1.1) in §2.
Moreover, an O(ε M ) rate of convergence has been obtained in (1.16b), and for the H 1 norm in (1.16a) and (1.16c). The latter estimates are obtained by using the fact that the first time derivative of the solution does remain uniformly bounded in L 2 even though, as noted above, it does not remain uniformly bounded in H n−1 . The H j estimates (1.16a) and (1.16c) for intermediate values of j are obtained using interpolation. Those estimates are the starting point for an improved estimate for z-averages of products derived in Appendix C, which is combined with techniques described below to yield (1.16b).
One ), respectively, depend on the parameter µ. For simplicity we will therefore define the fast, intermediate, and slow modes to be the fixed projections onto the limits as µ → 0 of the exact eigenspaces. Formulas for all the modes and estimates for the fast and intermediate modes in terms of the operator L A + µL M applied to V will be determined in §3. However, certain estimates in §5, in particular those for the intermediate modes, will require use of the exact µ-dependent modes, which makes their derivations more complicated than in the two-scale case. Even so, the partitioning here is much simpler than for general systems of the form (1.1), for which 1 ]; we will show in §3 that such eigenvalues are not present for the MHD system. The projections defined in §3 will also be used there to determine the appropriate form of the initial data used in Theorem 1.2.
For the two-scale singular limit of the compressible Euler equations without magnetic fields, the timeintegration method developed by one of the authors of this paper together with the special structure of those equations yields an O(ε 2 M ) estimate for the difference between the incompressible component of solutions to the original equations and corresponding solutions of the incompressible limit equations [Che12,Che14]. In those papers the solution is separated into a slow part that satisfies the constraints obeyed by the limit system and a fast part that does not, and different estimates are given for each part. In similar fashion, (1.16) gives separate estimates for the slow, intermediate, and fast parts of the solution. Although estimate (1.16b) is not quadratic in the size of the fast parts of the horizontal fluid velocity and horizontal magnetic fields, the time-integration method will nevertheless play an important role in its derivation, because that method helps to mitigate the effect of the lack of uniform boundedness of the first time derivatives.
The corrector estimate (1.16e) has been included because estimate (1.16d) is rather weak in the main case of interest in which µ lim = 0. The constraint (1.10) together with the relation (1.14) between the parameters implies that ε , which is much larger than the other error bounds in (1.16). The improved estimate (1.16e) involving the corrector shows that (1.16d) is in fact sharp in this case, and gives a formula for principal error term.
The uniform existence part of Theorem 1.1, which concerns solutions to (1.4), will be deduced in §4 from the uniform existence result for (1.1) in §2. When µ lim = 0 the convergence of solutions to (1.4) can be deduced from the convergence theorem in [CJS18, §4] for the general system (1.1), as will be indicated in §4; in particular, the general convergence result simplifies considerably for the MHD equations thanks to the absence shown in §3 of eigenvalues of the large operator of size O( µ j ε A ) for j ≥ 2. However, in order to treat the cases µ lim = 0 and µ lim > 0 together, a direct proof of the convergence part of Theorem 1.1 will be presented in The restriction (1.10) on the relation between ε A and ε M needed to obtain uniform existence does not allow taking the limit when ε A tends to zero but ε M is fixed. As shown in [BK82] and [CJS18], if the initial data for a system of the form (1.1) imply that the first n ≥ ⌊ d 2 ⌋ + 2 time derivatives of the solution are uniformly bounded at time zero rather than only the first time derivative being uniformly bounded, then the restriction (1.10) is not needed, so it is possible to let ε A tend to zero with ε M fixed. The limit equations for a slight variant of the MHD system (1.4) for that case were obtained in [BK82]. However, the additional conditions placed on the initial data are very restrictive: while condition (1.11) only requires that the fast and intermediate modes of the initial data be of size ε A and ε M , respectively, the "super-well-posed" condition that time derivatives through order n be bounded initially determines the expansion of the fast and intermediate modes of the initial data up to the n th powers of the small parameters, and each term in the expansion depends nonlinearly on the slow mode of the initial data. Furthermore, that restriction cannot be removed by the perturbative technique that estimates in H n−1 the difference between solutions with super-well-posed initial data and solutions having perturbed initial data that only yield one bounded time derivative at time zero, because the example in [CJS18,§2] implies that no such three-scale perturbation theorem can possibly hold. However, the estimates in §5 make use of a perturbation result concerning both initial data and inhomogeneous terms for symmetric-hyperbolic systems without large terms, which is proven in Appendix B.

IMPROVED UNIFORM BOUND
In the notation in this paper, in which the smaller parameter is denoted ε A rather than δ and the larger small parameter is denoted ε M rather than ε, Theorem 3.6 of [CJS18] says that solutions to the three-scale symmetric hyperbolic system (1.1) satisfy the uniform estimate for some positive T and finite K provided that: Assumption 2.1.
(1) the operators L A and L M are constant-coefficient differential operators of order at most one and are skew-adjoint on L 2 , (2) n ≥ s 0 + 1, where s 0 := ⌊ d 2 ⌋ + 1 is the Sobolev embedding exponent, i.e., the smallest integer s for which f L ∞ ≤ c f H s , (3) the matrices A i are smooth symmetric functions for j ≥ 0 and the matrix A 0 is positive definite, (4) the small parameters are restricted to the region (1.10).
(5) the initial data V 0 are uniformly bounded in H n and satisfy (1.11).
Actually, [CJS18, Theorem 3.6] is only stated for n = s 0 + 1, but the proof remains valid in the more general case and, as noted in [CJS18,§2], the value of n can be increased arbitrarily by artificially adding extra dimensions that the system and initial data do not depend on. In this section we show the improvement (1.12) of (2.1).
Theorem 2.2. When Assumption 2.1 holds there exist a positive T and finite K such that (1.12) holds.
Remark 2.3. Assumption (1.10) ensures that the new estimate (1.12) is at least as strong as the old estimate (2.1), and is stronger when lim sup ε M →0 ν < 1 n−1 . In particular, for the two scale case in which ν → 0, (1.12) recovers the uniform H n−1 estimate for V t which is not guaranteed by (2.1).
Proof. Theorem 2.2 differs from Theorem 3.6 of [CJS18] only by having different weights multiplying the norms of time derivatives. Hence it suffices to show that in all places in the proof of [CJS18,Theorem 3.6] where the use of the weights in (2.1) was justified the use of the weights in (1.12) is also justified. There are only two such places, namely where it was shown that the weighted sum of norms is bounded at time zero and where it was shown that the small parameters scale out of the estimate for the time derivative of an appropriately weighted energy.
As noted in the proof of Lemma 3.5 of [CJS18], assumption (1.11) ensures that V t t=0 n−1 is bounded uniformly in the small parameters, and hence the PDE (1.1) yields ∂ j t V t=0 n− j ≤ c ε 1− j A for 1 ≤ j ≤ n. Therefore, for 1 ≤ j ≤ n, which shows that the weighted sum of norms in (1.12) is also bounded at time zero uniformly in the small parameters.
The energy estimate both in [CJS18] and here makes use of the norms where V is a solution to (1.1) and D α = ∂ α 1 x 1 · · · ∂ α d x d . As shown in [CJS18], in order to prove a weighted energy estimate like (2.1) or (1.12) it suffices to obtain a uniform bound for where the weights w j are ε j M for the estimate (2.1) or 3. ANALYSIS OF THE LARGE OPERATOR Following [CJS18, §4] but without treating each Fourier mode separately, let P 0 denote the L 2 -orthogonal projection operator onto the nullspace of L A , and let P 1 denote the L 2 -orthogonal projection operator onto the nullspace of P 0 L M P 0 . Since P 0 , being a projection, satisfies P 0 (I − P 0 ) = 0, R(I − P 0 ) ⊆ N(P 0 L M P 0 ) = R(P 1 ) = N(I − P 1 ). Hence R(I − P 0 ) ⊥ R(I − P 1 ), so Expanding the factors in (3.1) shows that P 0 P 1 = P 1 P 0 , which implies that P := P 0 P 1 is an orthogonal projection operator satisfying P(I − P j ) = 0 = (I − P j )P for j ∈ {0, 1}. Moreover, (3.1) together with the definition of P yield (I − P 0 ) + (I − P 1 ) + P = I + (I − P 0 )(I − P 1 ) = I, which shows that the sum of the fast, intermediate, and slow modes defined by Moreover, the mutual L 2 orthogonality of the projections defining the modes together with the fact that the P j are projections onto the null spaces of constant-coefficient differential operators and so commute with derivatives implies that they are also orthogonal in any H k . Therefore combining estimates for different modes to obtain an estimate for the full solution yields estimates that are as sharp up to a constant factor as the component estimates.
These general facts about projections were developed in [Che12,CM13] for the two-scale case. Since the operators L A and L M used here have constant coefficients, our projections P j operate separately on each Fourier mode, so the results for them also follow from perturbation theory of (anti)symmetric operators [Kat82,CJS18]. Since the modes as defined above do not depend on the parameter µ they are in general not the direct sums of the eigenspaces of L A + µL M whose eigenvalues are strictly O(1), O(µ), and o(µ), respectively, but are rather the limits as µ → 0 of those direct sums of eigenspaces [Kat82], [CJS18,§4].
We now calculate the projections, the fast, intermediate, and slow modes, and some elementary results about them for the specific case of the MHD system (1.4). For brevity, we restrict consideration to V = (r, u, b) satisfying ∇·b = 0, and since we only consider initial data and solutions satisfying that constraint that restriction causes no difficulties.
Lemma 3.1. Assume that the spatial domain is T 3 and that V = (r, u, b) where b satisfies ∇·b = 0. Then where the operator av was defined in (1.9). (3) The formulas for the projections are where all missing entries vanish, and P div h and a z are defined in (1.7) and (1.9). (4) All eigenvalues of L A + µL M that are o(µ) are identically zero.
(5) Using the notations V ℓ = (r ℓ , u ℓ , b ℓ ) for ℓ ∈ {F, I, S} and w ℓ = (w ℓ h , w ℓ 3 ) for w ∈ {u, b}, the formulas for the fast, intermediate and slow modes are  (3.7) In particular, (3.10) (6) For any nonnegative integer j, there are constants c 1 and c 2 such that Hence the last part of L A V vanishes iff ∂ z u h = 0 and ∇ h ·u h = 0. Furthermore, solving the divergence-free condition ∇·b = 0 on the magnetic field for its horizontal component to obtain ∇ h ·b h = −∂ z b 3 and substituting that formula into the horizontal divergence of (3.12) On the other hand, when those conditions hold then each term in the second part of L A V vanishes. Similarly, The last component of the second part vanishes iff ∂ z r = 0 and, since ∂ z u 3 = ∇·u − ∇ h ·u h , the first and last parts vanish iff ∇ h ·u h = 0 and ∂ z u = 0. Next, use (3.12) to write the horizontal divergence of the second part as and since it has been shown that ∂ z r = 0, (3.14) vanishes iff ∆(b 3 + ε A ε M r) = 0, which is equivalent to the last equation of (3.4). That equation together with the vanishing of ∂ z r implies that ∂ z b 3 = 0. Finally, the last equation of (3.4) also shows that the horizontal components of the second part of (3.13) vanish iff ∂ z b h = 0.
The formula for P 0 follows directly from the conditions for L A V to vanish in the first part of the lemma, and that formula together with the formula for L M that can be seen from the O(ε −1 M ) terms in (3.13) implies that In turn, formula (3.15) implies the formula for P 1 , and the formulas for the P j imply (3.6). The conditions (3.4) for (L A + µL M )V to vanish differ from the conditions to belong to the null space of P in (3.6) only by adding an O(µ) term to the formula for b 3 . Hence the rank of the restriction to any Fourier mode of N(L A + µL M ) equals the rank of the restriction to that mode of P, which in turn equals the dimension of the direct sum of all eigenspaces of L A + µL M in that Fourier mode having eigenvalues of size o(µ) [Kat82], [CJS18,§4]. Hence all such eigenvalues vanish identically.
Alternatively, the eigenvalues and eigenvectors of the restriction of L A + µL M to any Fourier mode (k, l, m) with m = 0 and (k, l) = (0, 0) can be easily calculated, showing that the zero eigenvalue has multiplicity five and the remaining eigenvalues are ±i 1 + µ 2 √ k 2 + l 2 . For Fourier modes with m = 0, (0, 0 3 , (k, l, m)) is an eigenvector of L A + µL M having eigenvalue zero, while the formulas (1.8) for L A and (3.15) for P 0 L M P 0 imply their restriction to the Fourier mode have rank four and two respectively, so perturbation theory [Kat82] shows that the sum of the dimensions of the eigenspaces of L A + µL M in such Fourier modes of sizes strictly O(1) and strictly O(µ) remain four and two, respectively, for |µ| sufficiently small. In each case the sum of the dimensions of the eigenspaces corresponding to eigenvalues of size strictly O(1), strictly O(µ), and identically zero add up to the full dimension seven of the restriction of L A + µL M to a Fourier mode, so there can be no nonzero eigvalues of size o(µ).
The formulas (3.7) for the modes follow from their definition (3.2) and the formulas (3.5)-(3.6) for the projections P j and P. Since a z b is independent of z, ∇·b S = ∇·(a z b h , av b 3 ) = ∇·(a z b) = a z ∇·b = 0, and trivially ∇·b I = 0, which implies the rest of (3.8). Each component of V S contains a z or av, so (3.9) holds. The left equation in (3.10) follows from the presence of the operator P div h in the formula for u S h , while the right equation there follows from (3.8)-(3.9).
The formula for u F h in (3.7) and the formula (3.13) for Next, by the ellipticity of ∆ together with the formula for b F 3 in (3.7), the identity (3.14) and formula (3.13), By the formula (3.7) defining the modes and formula (3.13) for L A + µL M , Combining (3.16), (3.17), (3.18), (3.19), (3.20), and (3.21) yields the right inequality in (3.11), and since the expressions estimated there can be combined to yield all the terms in (L A + µL M )V the left inequality also holds.
Since it is possible to obtain a bound for (L A + µL M )V in H n−1 by solving the PDE for that expression and using the bounds (1.12) to estimate the result, the estimate (3.11) can be used to obtain "static" estimates for the fast and intermediate modes, i.e., estimates for those modes at any given time in terms of just V and ∂ t V at that time. Those estimates will be obtained in §5. However, we will also need to obtain "dynamic" estimates for the intermediate and slow modes via differential inequalities. For that purpose it does not suffice to use the modes defined above because they are not direct sums of eigenspaces for the operator L A + µL M but only limits as µ → 0 of such sums. The exact slow eigenspace of the zero eigenvalue was determined in Lemma 3.1. In the next lemma we obtain formulas for the 7-vectors of Fourier multiplier operators V α and V β and the self-adjoint Fourier multiplier operator Q, such that for every vector-valued function V. Equations (3.22) imply that for every Fourier mode (k, l, m) the linear combinations (V α ± V β )e i(kx+ly+mz) are eigenfunctions of the operator L A + µL M with purely imaginary or zero eigenvalues ∓imµ Q where i.e., they yield the exact µ-dependent intermediate eigenspaces.
In other words, in the equations for ∂ t (V α ± V β ) · V the largest terms will be, by the MHD system itself, the skew-adjointness of L A , L M and (3.23), (3.24) However, in §5 we will obtain dynamic estimates for (1 − a z )V α · V and (1 − a z )V β · V rather than (V α ± V β ) · V to reduce dispruption to the structure of the rest of the PDE. The operator 1 − a z is applied because the eigenvalues ±imµQ are only of size µ when m = 0. Also since we only consider initial data for which ∇·b = 0 we can and will use the variant V no-div α defined in (3.25) that omits the term in V α containing a gradient in the magnetic field component.
Since the limits as µ → 0 of V α and V β should belong to the intermediate mode defined in (3.7), trying various perturbations leads to the ansatz where the vectors have been normalized by setting the first component of V α to the identity operator 1, and factors of ∆ −1 have been included so that if (A , B, C , D) are all homogeneous order zero Fourier multipliers then all components of V α and V β will also be Fourier multipliers of order zero. Furthermore, in the limit as µ → 0 the operator −∂ z Q should tend to the operator −∂ z appearing in P 0 L M P 0 in (3.15), i.e., Q should tend to one.
Lemma 3.2. The vectors V α and V β defined in (3.25) satisfy (3.22) with (3.28) which will hold provided that (3.29) Solving the first three equations in (3.29) for A , D, and B yields the formulas for those operators claimed in (3.27). Substituting those formulas into the fourth equation in (3.29) and solving the result for C yields the formula for that operator in (3.27). Substituting the formulas obtained so far into the last equation in whose only solution tending to one as µ → 0 is (3.26).
We now turn to explicating the form that initial data satisfying (1.11) takes. Lemma 3.3. Initial data V 0 = (r 0 , u 0 , b 0 ) will be uniformly bounded in H n and satisfy the constraint ∇·b 0 = 0 and the condition (1.11) iff it has the form where every term w 0,ℓ with w ∈ {r, u h , u 3 , b h , b 3 } and ℓ ∈ {F, I, S} has the form specified for the w component of the ℓ mode in (3.7), and may depend on (ε A , ε M ) but satisfies uniformly in those parameters, and Proof. Since the terms appearing in (3.30) are allowed to depend on ε A and ε M , that formula simply expresses the separation of the initial data into fast, intermediate, and slow modes, with the factors of ε A and ε M being purely for later convenience, as is the inclusion of the specific term −µ(1 − av)r 0,S as part of the fast part of b 0 3 . By Lemma 3.1, the condition ∇·b = 0 implies that each mode is divergence-free, and the conditions ∇·b ℓ = 0 for ℓ ∈ {F, I, S} clearly imply that ∇·b = 0, so for initial data of the form (3.30) the conditions (3.32) are equivalent to the assumed condition that ∇·b 0 = 0.
Since, as shown above, the square of the H n norm of V 0 equals the sum of the squares of the H n norms of its modes, the assumed uniform boundedness of V 0 n is equivalent to (3.33) When k is set equal to n − 1 the sum of terms estimated in (3.11) is equivalent to the H n−1 norm of (L A + µL M )V. Hence the assumed uniform boundedness of 1 We note that, by a slight notational exception, the fast part of b 0 3 in (3.30) satisfies showing that, to the leading O(µ) order, (b 0 3 ) F only depends on the slow part of r 0 . Although Lemma 3.3 determines the most general initial data satisfying the conditions needed for Theorem 1.1, as noted in the introduction we need to take more specific data in order to obtain a rate of convergence. In order to allow the initial data to contain all modes but not interfere with the convergence rate, we will still assume that the initial data have the form (3.30), and that (3.32) holds, but will assume that all terms w 0,ℓ in (3.30) are uniformly bouned in H n and each term w 0,S is fixed, → 0 to the unique solution V(t, x) of the limit equations [CJS18, (4.14)-(4.15)] having the limit initial data. When the spatial domain is R 3 then the fact that the limit lies in the range of P implies by Lemma 3.1 that it is independent of z and hence vanishes identically. When the spatial domain is T 3 a calculation shows that for the MHD equations and projection P from (3.6) those limit equations reduce to (1.13) with µ lim set to zero.
of Theorem 1.1. As noted in the introduction, the condition ∇·b = 0 included in (1.4) will be satisfied automatically provided that it is satisfied at time zero, so in view of the assumption that the initial data satisfy that condition it may be omitted from (1.4). The remaining part of (1.4) has the form (1.1) to which Theorem 2. . Since a standard L 2 energy estimate shows that solutions of the limiting equations (1.13) with initial data in H n are unique, convergence to the same limit V holds for all ε A and ε M converging to zero with their ratio µ converging to the same limiting value µ lim .
Multiplying (1.1) by ε A or applying P 0 to it and multiplying the result by ε M yields and since the uniform bounds (1.12) show that the expression in brackets on the right side of (4.1) is bounded in L 2 , taking the limit yields When µ lim = 0 then the left equation in (4.2) implies that V = P 0 V, and hence the right equation there implies that V = PV. Since the vanishing of ∇·b 0 implies that b remains divergence free, Lemma 3.1 then shows that V is independent of z, the horizontal parts of its velocity and magnetic field are divergence free, and (1.13e) holds. When µ lim > 0 then Lemma 3.1 shows that the same conclusion follows from just the left equation of (4.2). When the spatial domain is R 3 then the fact that the limit is independent of z implies that it vanishes, so from now on we consider the case when that domain is T 3 . We next note from (3.6) and (3.13) that multiplying the large operator 1 Hence the limits of the velocity equation and the horizontal magnetic field equation can be determined by taking the limit of P applied to (1.4), since the resulting equations contain no large terms and each O(1) term not containing a time derivative converges strongly, and each term containing a time deriavtive converges weakly, to the term obtained by replacing each variable with its limit.
Moreover, since each equation includes only one term with a time derivative, the equations together with the strong convergence of the terms without time derivatives show that the terms containing time derivatives also converge strongly. This yields (1.13b), (1.13c), and (1.13d); the term in (1.13c) containing a factor of µ lim arises from the limit of the term a z (b h ·∇ h b 3 ) on account of (1.13e).
Finally, in order to determine the limit equation for the time evolution of the density we will add to the equation (1.4a) for r an appropriate multiple of the last component of the equation (1.4c) so as to eliminate the large term that remains in P applied to (1.4). However, it will be convenient to divide (1.4a) by a(ε M r) to recover the conservation form which is advantageous since the operator a z from the definition of the slow modes annihilates z derivatives.
Applying a z to (4.3) yields The third component of (1.4c) can be written in conservation form as in view of the fact that b is divergence free. Applying a z to (4.5), then multiplying by µ, and subtracting the resulting equation from (4.4) yields which contains no large terms. Hence we can take the limit of (4.6) to obtain where we have used the facts that the limit variables are independent of z and thatū h andb h are divergence free. Substituting (1.13e) into (4.7) yields (1.13a).

CONVERGENCE RATE ESTIMATES
The estimates will be obtained by estimating separately the size of the fast modes, the size of the intermediate modes, and the difference between the slow modes and the solution of the limit system, with earlier estimates used when deriving later ones. The estimates for the fast modes will be purely static, i.e., will be obtained simply by solving for certain terms in the PDE system (1.4) and estimating their norms. In contrast, the estimates for difference between the slow modes and the limit solution will be obtained dynamically, i.e., by obtaining energy estimates for the time evolution of norms of that difference. Estimates for the intermediate modes will first be obtained via static estimates and will then be improved via dynamic estimates that use those static estimates.
Recall that µ = ε A ε M = ε ν M is assumed here to be less than one. In order to obtain sharp dynamic estimates we will make use of the fact that smaller static estimates can be obtained by using weaker norms, on account of the interpolation estimate (2.7). In order to see easily the estimate expected to be obtained as a function of the norm used, we introduce an increasing geometric sequence ε j := ε 1+ν− jν M so that by (1.10), (1.14) ε 0 = ε A , ε 1 = ε M , ε j = µε j+1 , ε n ≤ c.
(5.1) 5.1. Static estimates. Static estimates will be obtained by solving the PDE system for (L A + µL M )V and using the uniform bounds (1.12) together with the standard Sobolev product and function estimates f g j ≤ c f n−1 g j , j = 0, . . . , n − 1 (5.2) F(g) n−1 ≤ C( g n−1 ), (5.3) which all will be used henceforth without mention, plus the interpolation estimate (2.7). We will also need an estimate for the time integral of certain fast terms, which will be obtained similarly from the time integral of the PDE.
Note that the bounds in (5.5) for the time integrals of fast components are smaller than the bounds in (5.4) for those components themselves, a phenomenon that seems to have first been utilized in the two-scale case in [KLN91] and is of independent interest, although only the estimates for the first and last expressions on the left side of (5.5) will be used below. Also note that while (5.4) is in the fashion of "interpolative estimates", the time-integral estimates (5.5) can not be improved by reducing the Sobolev norm index, to the best of our knowledge.
The bounds on time integrals in (5.5) are obtained by integrating (1.1) with respect to time, which yields The case j = 0 in (5.16) and estimate (5.17) were already proven in Theorem 5.1. The remaining cases in (5.16) are an improvement over the corresponding cases in (5.6) by one factor of µ.
As shown in Lemma 3.2 and the discussion preceeding it, and in particular formula (3.25), the appropriate expressions to use in the dynamic estimate for the intermediate modes are not the fixed intermediate modes defined in (3.7) but rather where the operators C , D, and Q were defined in (3.27).
As a preliminary to proving Theorem 5.2 we will derive a system of PDEs satisfied by (α, β ), with remainder terms that are consistent with the desired estimate (5.16). The general idea is to apply each of the operators V no-div α and V β to the PDE, note that by Lemma 3.2 the large terms of the result are ∂ z Q applied to the other operator, and calculate the form of the order one terms. In order to simplify that calculation it will be convenient to simplify the original equations beforehand by moving to the right sides of the equations those terms whose H j norms can be estimated by a constant times ε j or the H j norms of r I and u I 3 using (5.4), (5.6), and (2.7). To do so we will make use of the formulas in (3.7) for the fast, intermediate, and slow parts of the solution, and in particular we will use the formula Starting from the MHD equations (1.4), we replace the argument ε M r of ρ and a by ε M r S or zero where possible and compensate by adding terms to the right sides of the equations, except that we retain a factor of a(ε M r S ) multiplying 1 ε M ∇ h ·u h because that expression will then cancel exactly when we build the time evolution equation for α, (5.23). In addition, we apply 1 − a z to the equations since that operator appears in all terms of the formulas (5.18) for α and β . When doing so we reap the first benefit from the transfer of the coefficients depending on intermediate and fast modes to the right side: the fact that slow modes are independent of z allows us to move the operator 1 − a z past most coefficients so that it applies only to differentiated factors. This yields where the equation for the horizontal magnetic field has been omitted since it does not enter into α or β , and Wherever u I 3 appears undifferentiated in R i , the H j norm of the term in which it appears can be estimated by a constant times the H j norm of u I 3 , for j = 0, . . . , n − 1. Similarly, for any smooth function F and j = 0, . . . , n − 1, j in view of the bound for V t n−1 in (1.12) and the constraint (1.10). Finally, terms containing u h without further derivatives can be estimated in the H j norm by a constant times ε j , for j = 0, . . . , n − 1, in view of (5.4). Since these cases cover all the terms in the R i , In order to obtain the evolution equation for α we apply each component of V no-div α to the corresponding equation in (5.20), then multiply the equation involving b 3 by a(ε M r S ) and add the results. In other words we subtract µa(ε M r S ) times (5.20d) from (5.20a) and add a(ε M r S )µ 3 C Q∆ −1 ∂ 2 z applied to (5.20d) to the result. Rearranging the equation so obtained by commuting C Q∆ −1 ∂ 2 z past u S ·∇, making the coefficient of the large terms that do not cancel exactly be aρ everywhere while compensating via terms on the right side, forcing the function to which µ(b S h ·∇ h ) is applied to be β √ 1+µ 2 for reasons that will be explained later and again compensating on the right side, and using the identity 1 = QD from (3.27) that makes the large terms exactly involve β , as we know from (3.22) that they must, yields comes from the right sides of the modified equations (5.20), comes from commuting the operator applied to (5.20d) past the coefficient u S 3 , comes from adding compensating terms to the right side and moving entire terms there, and comes from forcing the term involving b S h ·∇ h to have the desired form, and will be rearranged further later. The form of the large term in both (5.23) and (5.28) below follows from (3.24) or direct calculation using (3.27).
Similarly, to obtain the evolution equation for β we begin by applying V β to (5.20), i.e., adding µ 2 C ∆ −1 ∂ z ∇ h · applied to (5.20b) to D applied to (5.20c), and rearranging terms in similar fashion as for (5.23). In addition, we force the function to which b S h ·∇ h is applied on the left side of the equation to be µ √ 1+µ 2 α for reasons to be explained later, and subtract appropriate constants from the factors appearing inside commutators since that does not affect their value, in order to facilitate estimate the size of the resulting terms. Also, to ensure the symmetry of the resulting system for α and β we multiply the large term µ 2 appearing on the left side of the equation by a(ε M r S )ρ(ε M r S ) and compensate by adding a(ε M r S )ρ(ε M r S ) − 1 times that term to the right side. By lemma 3.2 or by using the formulas in (3.27) there, the resulting large term exactly involves α. This yields comes from the right sides of (5.20), comes from commuting the operators applied to (5.20c) and (5.20b) past coefficients in those equations, comes from moving terms to the right side, balancing a term added on the left side, and using (3.12), and comes from forcing the term involving b S h ·∇ h to have the desired form, and will be rearranged further later. We now estimate the terms R α,i and R β ,i . Since the operators applied to the R i in R α,1 and R β ,1 are all bounded, (5.22) implies that The commutator terms in R α,2 and R β ,2 all involve commutators of a constant-coefficient zeroth-order pseudo-differential operator that is bounded as a function of µ with a function in H n . Each of the two terms making up the commutator is therefore a bounded operator on H j for j ≤ n. The following lemma says that the commutator actually gains one derivative, which in many cases is a vital improvement. [P, f ]g j ≤ c f n g j−1 . (5.34) The constant-coefficient pseudo-differential operators appearing in the commutators in R α,2 and R β ,2 are all homogeneous of degree zero and bounded uniformly in µ (in particular (5.42) below implies a bound on D−1 µ 2 ), and they are real analytic for (k, l, m) = (0, 0, 0) since the denominators in the formulas for C , D, and Q in (3.26), (3.27) are positive for µ < 1, so they satisfy the conditions of Lemma 5.3. The expressions to which the commutators are applied have one of two forms: either they consist of a single spatial derivative applied to a fast or intermediate component that is estimated in (5.4) or (5.6), or they are some component of (∂ t + (u S ·∇))V. In the former case the expression contains a factor µ p with p ≥ 2, so by Lemma 5.3 its H j norm for j = 0, . . . , n − 1 can be estimated by In the latter case the expression contains the factor µ 2 ε M so by the interpolation bounds (2.7) its H j norm for j = 0, . . . , n − 1 can be estimated by cµ 2 ε M ( V t j−1 + c) ≤ cµ 2 ε M µ − j ≤ cµε j . Hence To estimate the terms in R α,3 and R β ,3 note that those terms either have a factor of µ 2 ε M , which is smaller than ε j for any j ∈ {0, . . . , n − 1}, have a zeroth-order pseudo-differential operator applied to ∂ z r I , b z (after applying the z-derivative in the expression to some factor of b), ∇ h ·u h , or ∂ z (b F 3 + µ∆ −1 ∆ h r), all of which have their H j norms estimated in (5.4) or (5.6), or have a zeroth-order pseudo-differential operator applied to . In the first case using the uniform estimate (1.12) shows that the H n−1 norm is bounded by c ε 0 . In the middle cases, since there each term also contains a factor of µ 2 the H j norm is bounded by cµ 2 ε j+1 ≤ c ε j . In the final case, since ∆ −1 ∂ z f j ≤ c f max( j−1,0) and the term again contains a factor of µ 3 , it is bounded by cµ 3 ∇∂ z u I 3 max( j−1,0) ≤ cµ 3 ∂ z u I 3 max( j,1) ≤ cµ 3 ε j+2 ≤ cµε j . Together, these estimates yield Before estimating the remainders R α,4 and R β ,4 we must rearrange those expressions further. For j ≤ n − 2 a simpler method suffices because (5.4) and (5.6) imply that where V F * means all components of V F except a z b F 3 , which is not estimated in (5.4). However, (5.37) is not valid for j = n − 1 because (5.4) and (5.6) do not hold for j = n. For j ≤ n − 2 we can therefore obtain the estimate R α,4 j + R β ,4 j ≤ cε j by using the formulas for α and β in (5.18) together with the facts that D−1 µ 2 is a bounded zeroth-order pseudo-differential operator and that plus estimates similar to those used for R α,3 and R β ,3 . However, in order to obtain an estimate in H n−1 rearrangement is necessary, and it is sufficient to show that R α,4 and R β ,4 can be expressed as linear combinations of the terms , ∂ z r and ∂ z u 3 (5.39) whose H j norms are estimated in (5.4) or (5.6) even though they involve a first derivative of V, and in addition a factor of µ must be present mutiplying the terms ∂ z (r I , u I 3 ) to compensate for the extra factor of 1 µ in (5.6). Substituting (5.38) into the formula for α in (5.18) and solving the result for r I yields Applying (1 − a z ) to (5.38), which turns the final −µ(r − av r) into −µr I , dividing (5.40) by 1 + µ 2 and substituting the result for that final r I , and substituting the result into (5.32) shows that R β ,4 equals plus a sum of terms involving operators of order zero applied to the expressions in (5.39) whose H j norms can be bounded by c ε j using (5.4) and (5.6). To estimate (5.41) we use the identity (5.42) derived from (3.27). Since the constant term cancels, the fact that the other term in (5.42) contains a factor of ∂ z multiplied by an operator of order −1 together with the facts that the z-derivative of all the constituents of α are estimated in (5.4) and (5.6) and that an overall factor of µ is present in (5.42) implies that the H j norm of R β ,4 is bounded by a constant times ε j even for j = n − 1. Similarly, upon substituting formula (5.18) for β into the definition (5.27) of R α,4 and then substituting (5.42) into the result, the constant term from (5.42) cancels and the remaining terms all involve the expressions from (5.39) and so can be estimated by a constant times ε j since an overall factor of µ is present. This shows that R α,4 j + R β ,4 j ≤ c ε j , j = 0, . . . , n − 1.

Equations and estimates for horizontal components of the slow mode.
In similar fashion to the intermediate mode dynamic estimates, in order to estimate the difference between the slow modes of the solution to the original system and the solution of the limit system, it is necessary to obtain PDEs for the exact zero eigenspace of the large operator L A + µL M . For the horizontal components of the velocity and magnetic field the slow modes belong to that exact zero eigenspace, so we will write the evolution equations for the slow modes in the form of the limit equations with error terms added. To do that we apply the projection P onto the slow horizontal modes to the original system, expand all dependent variables into fast, intermediate, and slow modes, and move all terms except those involving purely slow modes to the right sides of the equations. The slow part of the density and vertical velocity will be treated in the following subsection.
Theorem 5.4. Assume that the conditions of Theorem 1.2 hold. Let (r, u, b) be the solution obtained for the MHD system (1.4). In addition, let (r,ū h ,b h ) be the solution of the limit system (1.13a), (1.13b), (1.13d) with the initial data equal to the O(1) part (r 0,S , u 0,S , b 0,S + µ lim (1 − av)r 0,S ) of the initial data (3.30) of the original system. Then Before proving Theorem 5.4 we need to derive appropriate equations. Since all slow modes contain the averaging operator a z in the z direction, it will be convenient to write the equations in conservation form, so that derivatives with respect to z disappear when a z is applied. In particular, (5.47) Using the continuity equation (A.1a), the momentum equation (1.4b) can be rewritten as where Φ is some scalar-valued function defined on T 3 , i.e., periodic in x, y, z. Apply P div h a z to the first two components of (5.48) and a z to the first two components of (1.4c), and simplify the result by using the identity (5.47) not only with w = u but also with w = b since the constraint ∇·b = 0 implies that b·∇ f equals ∇·( f b). The resulting equations can be further simplified by using the definitions (3.7) of the modes to obtain the identities a z b h = b S h and a z (ρu h ) = a z (ρu S h ) + a z (ρu F h ) = (a z ρ)u S h + a z (ρu F h ), and then using the facts that ρ = 1 + ε M r and a z u F h is a gradient to obtain P div where the tensor product ⊗ follows the convention that , (5.49) together with the bounds (1.12) and (1.10) and the relation (1.14) show that Recalling that u h and b h have no intermediate part, we separate them into their fast and slow parts in the tensor products in (5.49a): where trsp denotes the tensor transpose. By (3.9)-(3.10), the slow parts (u S h , b S h ) are independent of z and divergence-free, so the first term on the right side of (5.51) simplifies to u is identically zero and so can be omitted from the second term on the right side of (5.51). Next, we can drop the P div h operator from (5.49a) at the cost of adding a term ∇ h θ (t, x, y) to that equation, since a 2-vector is in the kernel of P div h if and only if it is the horizontal gradient of some scalar-valued function.
The second term on the right side is a "slow-fast" product, which can be rewritten using time-integrated variable as the time derivative of a small term plus a small term, since (5.5) shows that Hence we obtain The bound (5.4) together with the constraint (1.10), the relation (1.14) between the parameters and the definition (5.1) of the ε j implies that u F h n−1 + b F h n−1 ≤ c ε ν M . Using that estimate, the time-derivative bound (5.50), estimate (5.53) for A, the formula ρ = 1 + ε M r, and Corollary C.3 yields the estimates Applying the same ideas to the slow magnetic equation (5.49b), and in particular noting that a z (u S h ⊗ b F h + trsp) = 0 yields The same bounds as for (5.55) show that In order to simplify the equation for the time evolution of u S h further, we need to use an equation for the time evolution of r S . Using the first part of (3.10) and the facts from (3.7) that u I h = 0 and a z r = r S , the vertically-averaged conservation law form (4.4) can be simplified to Using the time-integrated variable A from (5.52), (5.59) can be rewritten as Subtracting u S h times (5.60) from (5.54) noting that a z ρ = 1 + ε M r S , and rewriting the term u S h ∂ t ∇ h ·A on the right side of the result as ∂ also satisfy Ξ 2 n−1 + ξ 2 n−2 ≤ c ε 1+ν M (5.63) in view of the estimate (5.55) and the same bounds used to obtain that estimate. We now move the ε M r S term from the left to the right side and also replace the time-derivative term on the right side of (5.61) by its divergence-free part, which only changes the divergence term, obtaining Using the estimate (5.63), the fact from (5.50) that ∂ t u S h = O(1), and the fact that the projection P div h does not increase Sobolev norms yields ) and ξ U = 0. The estimates (5.66), (5.58) together with the above-mentioned estimate on the difference of the initial data then imply that the hypotheses of Theorem B.1 hold with k = n, r = 1, and δ = ε M . Hence the conclusion of that theorem yields (5.46). , 0 3 is a zero eigenvector of the full large operator L A + µL M . However, as shown in Lemma 3.1, the µ-dependent zero eigenvector of L A + µL M having a nontrivial projection onto the density component is not just the slow mode r S but is a linear combination of r S and b 3 . Specifically, (3.4) implies that V r = a z , 0 3 , 0 2 −µa z satisfies (L A + µL M )V r · V = 0 for all vector-valued functions V, which by the MHD system (1.4) and the skew-adjointness of (L A + µL M ) implies that the PDE satisfied by a z (r − µb 3 ) = V r · V will only contain terms that are at most O(1). We therefore need to calculate the PDE system satisfied by a z (r − µb 3 ) and a z u 3 . It turns out that while the PDEs for those two variables are coupled by terms that are strictly O(1), the coupling to other components of the solution contains only terms that are o(1) and so can be considered as small perturbations.
Theorem 5.5. Under the conditions of Theorem 5.4, In order to treat the term in accordance with the expression estimated in (5.4). Using (5.74) while noting that (b S h ·∇ h )(av r S ) = 0, and using the facts from (1.6), (3.9), and (3.7) that ρ = 1 + ε M r, r S and u S 3 are independent of z and a z u I , and a z r = r S , we obtain As a step towards symmetrizing the system consisting of (5.71), (5.77), we want to replace r S in the last term on the left side of (5.77) by r S − µa z b F 3 , which requires adding a balancing term involving a z b F 3 , which must also be rewritten using (5.74). This leads us to write µ Equating those two expressions and comparing the coefficients of (b S h ·∇ h )(a z b F 3 ) shows that k 2 = k 1 µ, and then comparing the coefficients of (b h ·∇ h )r S yields k 1 = µ 1+µ 2 , i.e., (5.81) The system consisting of (5.71), (5.79) can be symmetrized by multiplying the latter equation by 1 + µ 2 lim . The estimates used to obtain (5.73) together with the time-derivative estimates (5.50), (5.76) and the time-integrated estimate (5.5) show that (5.83) Using (1.13e) and the fact that av r 0,S is a constant, the limit equations (1.13a) and (1.13c) can be rewritten as the system for the dependent variables (1 + µ 2 lim )r − µ 2 lim av r 0,S andū 3 , which has the same form as the system (5.71), (5.79) for the dependent variables r S − µa z b 3 and u S 3 , except that the terms on the right sides are omitted. Since the evolution equation for r shows that av r = av r 0 = av r 0,S + ε M av r 0,I = av r 0,S , (5.86) Hence the form of the initial data from (3.30), (3.36) shows that the difference between the initial data for the two systems is bounded in H n by a constant times ε 1+2ν M + |µ − µ lim |. In view of that bound plus the estimates (5.73), (5.82), and (5.83) for the right sides of (5.71), (5.79), Theorem B.1 shows that (5.87) By (5.86), the static estimate (5.4) with j = n − 3 applied to the −µa z (b F 3 + µ∆ −1 ∆ h r) term of (5.86) shows that (5.87) implies that (5.69) holds.
Remark 5.6. As discussed in the introduction, the term |µ − µ lim | is the dominating error term in (5.87) and (5.69) whenever µ lim = 0, but that term will be eliminated in Theorem 5.7 below by adding corrector terms. The only term yielding an error of size O(ε  Equations (5.91)-(5.92) have the same form as as the system (5.71), (5.79) for the dependent variables r S − µa z b 3 and u S 3 , except that the terms on the right sides are omitted and all occurrences of µ lim on the left sides are replaced by µ. Omitting the step of replacing µ by µ lim in the derivation of (5.71), (5.79) yields those equations with all occurrences of µ lim on the left sides replaced by µ and the terms of order µ − µ lim omitted from their right sides. Since the terms of order µ 2 in (5.91) now involve µ 2 rather than µ 2 lim , as in (5.86) and in contrast to (5.84), there is also no longer a term of size O(|µ − µ lim |) in the difference in the initial data. Hence applying Theorem B.1 now yields an estimate without the term involving |µ − µ lim |, and by using (5.86) the estimate so obtained can be written as (5.90). Here ε M denotes the well-known Mach number, ρ is the fluid density, p(ρ) is the pressure law that satisfies p ′ > 0, u is the fluid velocity, and B is the magnetic field. The parameter ε A , as we call the Alfvén number in this article, is the ratio between flow velocity and speed of the magneto-sonic waves; in [KM81] the Alfvén number is the reciprocal of our version. We consider the case in which a uniform magnetic field is applied in the direction e z parallel to the z-axis, which subjects the fluid to a large Lorenz force. To reformulate the system (A.1) into a form to which the results of [CJS18] can be applied, we begin by rescaling the magnetic field and the density via Applying the well-known calculus identities (e.g., [JJL10,p. 372 having the same initial value u 0 ∈ H k , where k ≥ ⌊ d 2 ⌋ + 2, the matrices A i are smooth and symmetric and A 0 is positive-definite, F is a given function of t and x, L is a first-order differential operator with constant coefficients, with L * denoting its L 2 -adjoint, and Ξ u , Ξ U , ξ u , and ξ U satisfy Ξ u k−r + Ξ U k−r + ξ u k−r−1 + ξ U k−r−1 ≤ cδ for some 0 ≤ r ≤ k − 1, Then max 0≤t≤T u −U k−r−1 ≤ cδ . Proof. Subtracting (B.2) from (B.1) and rearranging terms yields the system Since the A i are smooth and u and U belong to H k , . Moreover, multiplying (B.2) by U t and integrating over the spatial variables shows that which shows that U t L 2 ≤ c( U 1 + 1) without the need for any hypothesis on the size of V . Applying a spatial derivative D α to (B.2), multiplying the result by D α U t , integrating over space, estimating D α U t L 2 in similar fashion, and summing over 0 ≤ |α| ≤ k − r − 1 yields the estimate U t k−r−1 ≤ C( U k−r + 1). Since the assumption on the size of k ensures that f g s ≤ f s g k−1 for s ≤ k − 1, this estimate for U t together with the assumed bounds for u, U , Ξ, and ξ show that the right side of (B.3) equals L * (v −V ) plus terms whose H k−r−1 norm is O( (u −U ) − (Ξ u − Ξ U ) r−k−1 + δ ). Since L((u −U ) − (Ξ u − Ξ U )) = 0, and L is a constantcoefficient differential operator, the term L * (v −V ) drops out of the H k−r−1 energy estimate. Moreover Ξ u , Ξ U , ξ U , and ξ U are O(δ ), so the remaining terms on the right side are O(|(u −U ) − (Ξ u − Ξ U )| + δ ). The standard H k−r−1 energy estimate for symmetric-hyperbolic systems together with Gronwall's inequality therefore shows that (u − U ) − (Ξ u − Ξ U ) k−r−1 ≤ cδ . Using once more the fact that (Ξ u , Ξ U ) k−r−1 is O(δ ) yields the claimed estimate.

APPENDIX C. CALCULUS INEQUALITIES FOR z-AVERAGES
The following result is sharper than what would be obtained by the standard product estimate (e.g., [Maj84, Proposition 2.1A], because the entire product is estimate using the W 1,1 norm rather than pulling out one factor in the L ∞ norm, and the Gagliardo-Nirenberg inequalities are used in dimension two rather than three. Lemma C.1. For all j ≥ 1 there exists a constant C j such that for all f , g ∈ H j (T 3 ) a z ( f g) H j−1 (T 2 ) ≤ C j f H j (T 3 ) g L 2 (T 3 ) + f L 2 (T 3 ) g H j (T 3 ) . (C.1) Proof. We first prove (C.1) for j = 1: By the Gagliardo-Nirenberg inequality h L 2 (T 2 ) ≤ c h W 1,1 (T 2 ) , the standard estimate for integrals | h| ≤ |h|, and the Cauchy-Schwartz inequality, a z ( f g) L 2 (T 2 ) ≤ c a z ( f g) W 1,1 (T 2 ) = c a z ( f g) L 1 (T 2 ) + a z ( f ∇ x,y g) L 1 (T 2 ) + a z (g∇ x,y f ) L 1 (T 2 ) ≤ c f g L 1 (T 3 ) + f ∇g L 1 (T 3 ) + g∇ f L 1 (T 3 ) ≤ c f L 2 (T 3 ) g L 2 (T 3 ) + + f L 2 (T 3 ) ∇g L 2 (T 3 ) + g L 2 (T 3 ) ∇ f L 2 (T 3 )