An adaptive edge element method and its convergence for an electromagnetic constrained optimal control problem

In this work, an adaptive edge element method is developed for an H(curl)-elliptic constrained optimal control problem. We use the lowest-order Nédélec edge elements of the first family and the piecewise (element-wise) constant functions to approximate the state and the control, respectively, and propose a new adaptive algorithm whose error estimators involve both residual-type error estimators and lower-order data oscillations. By using a local regular decomposition for H(curl)-functions and the standard bubble function techniques, we derive the a posteriori error estimates for the proposed error estimators. We then exploit the convergence properties of the orthogonal $L^2$-projections and the mesh-size functions to demonstrate that the sequences of discrete states and controls generated by the adaptive algorithm converge strongly to the exact state and control in the energy norm and the $L^2$-norm, respectively, by first establishing their strong convergence towards the solution of a limiting control problem. Three-dimensional numerical experiments are also presented to confirm our theoretical results and the quasi-optimality of the adaptive edge element method.


Introduction
Many electromagnetic simulation problems involve the following H(curl)-elliptic equation [13] [44]: curl(µ −1 curl y) + σ y = f in Ω , (1.1) where σ and µ are the electric permittivity and the magnetic permeability, respectively. This problem is also encountered when an implicit time-stepping scheme is used for solving the full time-dependent Maxwell system [31].
In this work, we are interested in a relevant optimal control problem, namely to find a specially designed external current source profile so that the resulting electromagnetic field achieves an optimal target design. This optimal control problem has direct applications in many areas, such as induction heating, magnetic levitation and electromagnetic material design. We refer the readers to [63] [12] [64] [25] for some recent results on the theory and applications of electromagnetic optimal control problems.
The main purpose of this work is to design and analyse an adaptive edge element method for a class of optimal control problems with the applied current density u being the control variable and the magnetic potential y being the state variable. Let y d and u d be our target magnetic field and control, and let U ad denote the unbounded closed convex admissible set: U ad := { u ∈ L 2 (Ω) ; u ≥ ψ a.e. in Ω } , (1.2) where ψ is some obstacle function with a certain regularity. Then we can write the control problem as a constrained minimization problem: find a pair (y * , u * ) ∈ H 0 (curl, Ω) × U ad that minimizes the quadratic objective functional (1.3). By the simple transformation u ← u − ψ, we may set ψ in (1.2) to zero in our subsequent analysis for the sake of simplicity, that is, the admissible set shall take the form: U ad := { u ∈ L 2 (Ω) ; u ≥ 0 a.e. in Ω } .
In this work, we are interested in the numerical study of the practically important situation where local singularities may be expected for the solution to the optimal control problem (1.3)-(1.4), due to the possible irregular geometry of the domain Ω, the (possibly large) jumps across the interface between two different physical media and the non-smooth source terms (cf. [19] [20]). In these cases, one typically needs to solve the problem on a very fine mesh to fully resolve the singularities, which may be computationally expensive on account of the exponential growth of the number of degrees of freedom (DoFs). To balance the computational cost and the numerical accuracy, it has proved more promising to refine the mesh adaptively where the numerical accuracy is insufficient. The theory of adaptive finite element methods (AFEMs), initiated by the pioneering work of Babuška and Rheinboldt [7] in 1978 for elliptic boundary value problems, has become very popular and well developed in the past four decades; see [2] [48] and the references therein for an overview. The relevant research in the case of Maxwell's equations, which dates back to Beck et al. [10], has also reached a mature level nowadays (cf. [67] [68][53] [34]). However, the theory of AFEMs for PDE-constrained optimization problems is still a work in progress. Recently, Gong and Yan [28] proved the convergence and quasi-optimality of AFEMs for an elliptic optimal control problem with the help of the variational discretization and a duality argument, and in this setting there is no need to mark the data oscillations as in [29] [26]. In [42] [43], the authors considered elliptic optimal control problems with integral control constraints and gave a rigorous proof of the convergence and quasi-optimality. More recent advances on AFEMs for control problems can be found in [11] [39] [5] [27].
The first contribution towards adaptive methods for optimal control problems of Maxwell's equations dates back to Hoppe and Yousept [35]. An AFEM was designed there for the same model problem (1.3)-(1.4), and the a posteriori error estimation was also presented, including the reliability and efficiency, with Nédélec's edge element approximations for both the control and the state. However, the convergence of the adaptive method was not established in [35]. Later, Pauly and Yousept [49] considered an optimal control problem for the first-order magnetostatic equations and presented the a posteriori error analysis. Meanwhile, Xu and Zou [61] considered the adaptive approximation of an optimal control problem associated with an electromagnetic saddle-point model and proved the convergence of the discrete solutions.
The current work is a continuation and further improvement of the work [35], where a piecewise linear H(curl)-conforming edge element was used to approximate the optimal control variable u. One crucial motivation of our work is the observation that the optimal control u * is generally of low regularity, at most H 1 (even under some reasonable conditions) (cf. Proposition 2.1). Therefore we cannot expect a better L 2 -approximation accuracy of the control from any higher-order approximation than the piecewise constant one (i.e., the simplest, and also cheapest, approximation). Because of this, we shall use the piecewise constant functions to approximate the control variable u, and propose a new adaptive algorithm with error indicators incorporating the residual-type error estimators and the lower-order data oscillations. As pointed out in [40] [38], the precise structure of the admissible set and the way we discretize the distributed control problem are the critical ingredients when designing an adaptive algorithm, and they often essentially influence the analysis of the resulting AFEM. It turns out that in our case, due to the pointwise unilateral constraints in the admissible set U ad , the inconsistency between the discrete spaces of the state and control, as well as the low regularity of the given data, the data oscillation plays an important role in the performance of the algorithm and has to be included in the error estimates (as well as the marking strategy) to guarantee the convergence of the algorithm. We can readily notice that such a change in the approximation of the control can significantly reduce the number of DoFs and make our algorithm much easier to implement than the algorithm in [35], while still preserving the same numerical accuracy; see Section 5 for detailed discussions and the implementation issues of the algorithm.
To theoretically justify the effectiveness of our adaptive algorithm, we use a Clément-type quasi-interpolation operator established by Schöberl in [53] and the bubble functions [2] to derive the a posteriori error estimates (reliability and efficiency estimates) for the new error estimators. It is worth mentioning that there is a technical issue in [35] when deriving the optimality system for the discrete optimization problem: the first-order optimality condition (2.8d) there cannot imply its complementarity condition (2.10), as the edge element discretization was adopted for the control (hence part of the estimates provided in [35] needs some necessary modifications). This is another motivation of the current work. Instead, in our a posteriori analysis, we shall directly cope with the original forms of the first-order optimality conditions by using the contraction properties of L 2 -projections, without the help of the complementarity problems (although they hold in our setting), so that our estimates can be established in a rigorous manner. Apart from the aforementioned contributions towards the algorithmic aspects, another main contribution of this work is the strong convergence of the finite element solutions generated by the proposed adaptive method. This work appears to be the first one on the convergence analysis of AFEMs applied to electromagnetic optimal control problems involving a variational inequality structure. In the finite element analysis provided in [63] for a quasilinear H(curl)-elliptic optimal control problem, the discrete compactness of the Nédélec edge elements [36] is the main tool for establishing strong convergence. However, the discrete compactness can only be applied to discrete divergence-free sequences, and it is still unknown whether it holds on adaptively generated meshes.
During the course of our convergence analysis, we shall borrow some techniques from nonlinear optimization to avoid the need for discrete compactness, and exploit some strategies with limiting spaces, which were initially adopted in [9] and then developed systematically in [45]. To overcome the difficulties arising from the structure of the constraint set and its discretization, we establish a convergence property of L 2 -projections associated with the meshes (cf. Proposition 4.2), which connects the convergence behaviors of the adaptive meshes and the discrete spaces. It further allows us to define a limiting optimization problem and characterize the limit point of the discrete controls and states (cf. Theorems 4.4 and 4.5). Thereafter, we show that the maximal error estimators and the residuals corresponding to the sequence of adaptive states and adjoint states vanish (cf. Lemmas 4.6 and 4.7), where we use a generalized convergence result for the mesh-size functions (cf. (4.4)) instead of introducing a buffer layer (cf. [61]) between the meshes at different levels, so that our proof is significantly simplified. By means of these auxiliary results and again the properties of L 2 -projections, we are able to prove that the limit point of the discrete triplets {(y * k , p * k , u * k )} solves the variational inequality associated with the control problem (cf. Theorem 4.8). Hence, the strong convergence of Algorithm 1, i.e., Theorem 4.1, follows immediately.
This work is organized as follows. In Section 2, we briefly review the electromagnetic optimal control problem and consider its finite element approximation. Then we propose the a posteriori error estimator and design an adaptive algorithm. We show that the error estimator is both reliable and efficient in Section 3. After that, we address the convergence of the adaptive solutions in Section 4. Finally, some numerical results are presented in Section 5 to illustrate the theoretical results and indicate the effectiveness and robustness of the adaptive approach versus the uniform mesh refinement. We end this section with a shorthand notation: x ≲ y for x ≤ Cy with some generic constant C that is independent of the mesh size, but may depend on other quantities, e.g., σ, µ −1 , f and the shape regularity of the initial triangulation T 0 . If x ≲ y and y ≲ x hold simultaneously, we write simply x ≈ y.

The optimal control problem and its discretization
We start by introducing the standard notation for Sobolev spaces and the involved physical parameters (cf. [1] [44]).
Let Ω ⊂ R 3 be a bounded polyhedral domain with polyhedral Lipschitz subdomains Ω i , 1 ≤ i ≤ m, such that Ω = ∪ m i=1 Ω i . Let G be any open bounded subset of Ω with a Lipschitz continuous boundary ∂G, and let n be its unit outward normal vector. For any real s > 0, we define the Hilbert space H s (G) (resp. H s (G) := H s (G, R 3 )) of Sobolev scalar functions (resp. vector fields) of order s with an inner product (·, ·) s,G and a norm ∥ · ∥ s,G . We also introduce the spaces: H(curl, G) = {v ∈ L 2 (G) ; curl v ∈ L 2 (G)} and H(div, G) = {v ∈ L 2 (G) ; div v ∈ L 2 (G)}, equipped with the norms ∥v∥ curl,G = (∥v∥ 2 0,G + ∥curl v∥ 2 0,G ) 1/2 and ∥v∥ div,G = (∥v∥ 2 0,G + ∥div v∥ 2 0,G ) 1/2 , respectively. Here and in what follows, a bold typeface is used to indicate a vector-valued function. The tangential trace mapping γ t (u) := n × u and the normal trace mapping γ n (u) := u · n are well-defined on H(curl, G) and H(div, G), respectively. Then the zero tangential trace space can be introduced by H 0 (curl, G) := {v ∈ H(curl, G) ; γ t (v) = 0 on ∂G}. We will also use the fractional-order space H s (curl, Ω) := {v ∈ H s (Ω) ; curl v ∈ H s (Ω)}, equipped with the norm ∥ · ∥ H s (curl,Ω) := ∥ · ∥ s,Ω + ∥curl · ∥ s,Ω . In what follows, we write V for the most frequently used Sobolev space H 0 (curl, Ω).
We are now well-prepared for the mathematical formulation of the control problem. In this work, we focus on the following constrained minimization problem: where the state y(u) ∈ V is the solution to the variational system: We assume that the given source term f ∈ L 2 (Ω) satisfies f| Ω i ∈ H(div, Ω i ) for each 1 ≤ i ≤ m, (2.4) while the target magnetic field y d and the target control u d satisfy y d ∈ H 0 (curl, Ω) and u d ∈ L 2 (Ω) . (2.5) We remark that although f is not in H(div) globally in Ω, namely f ∉ H(div, Ω), div f is well-defined on each Ω i (hence almost everywhere on Ω) by the assumption (2.4). In what follows, we write div f for Σ m i=1 (div f) χ Ω i , by abuse of notation, where χ Ω i is the characteristic function of Ω i . It is also clear that the weak divergence of f is globally defined on Ω if and only if the jump of the normal trace [γ n (f)] Γ across the interfaces Γ := ∪ m i=1 ∂Ω i \∂Ω vanishes. The physical parameters µ and σ are supposed to be polynomials on each subdomain Ω i (hence piecewise polynomials on Ω) and satisfy µ(x) ≥ µ 0 > 0 and σ(x) ≥ σ 0 > 0. The real α > 0 is a stabilization parameter. We define the symmetric bilinear form B(·, ·) associated with the state equation (2.3): It readily follows from the assumptions on the physical parameters that B is continuous and coercive, thus defining an inner product on H(curl, Ω) whose induced energy norm ∥ · ∥ B := B(·, ·) 1/2 is equivalent to ∥ · ∥ curl,Ω .
By the direct method in the calculus of variations (see, for instance, Theorems 2.14-2.16 in [58]) and noting that the cost functional J(u) is weakly lower semicontinuous and strictly convex, we conclude that there always exists a unique minimizer u * ∈ U ad of the optimization problem. Then the first-order optimality condition yields the variational inequality (2.6). In what follows, we shall write y * for y(u * ) for short. If we introduce the adjoint state p ∈ V associated with y ∈ V by the adjoint equation, then (2.6) can be equivalently written as the following optimality system in terms of the state and the adjoint state, as well as the control variable: If we further define the Lagrange multiplier (or the adjoint control) for the variational inequality (2.7c): then (2.7c) can be reformulated in a compact form: where ∂I U ad is the subdifferential of the indicator function of the admissible set U ad [33]. We remark that λ * is actually the Fréchet derivative of the functional J(u). We then conclude from (2.7c), or (2.9), that u * is nothing else than the L 2 -projection of −p * /α + u d onto U ad , i.e., Here and throughout this work, we denote by P E the L 2 -projection onto the convex subset E of L 2 (Ω). It is worth mentioning that P U ad in (2.10) can also be understood in the pointwise sense due to the unilateral form of U ad : where the maximum is taken componentwise. We now end our discussion of the continuous optimal control problem with some regularity results.
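Since U ad imposes only a pointwise unilateral constraint, the projection P U ad reduces to a componentwise truncation. The following minimal Python sketch (with hypothetical sample values at some nodes or quadrature points; not part of the paper's implementation) illustrates the pointwise formula u * = max(0, −p * /α + u d ):

```python
import numpy as np

def project_onto_Uad(p_star, alpha, u_d):
    """Pointwise L2-projection onto U_ad = {u >= 0 a.e.}:
    u* = max(0, -p*/alpha + u_d), applied componentwise at the
    (hypothetical) sample points."""
    return np.maximum(0.0, -p_star / alpha + u_d)

# toy values of the adjoint state and target control at three sample points
p = np.array([1.0, -2.0, 0.5])
ud = np.array([0.0, 0.0, 1.0])
u = project_onto_Uad(p, alpha=2.0, u_d=ud)
# u = max(0, [-0.5, 1.0, 0.75]) = [0.0, 1.0, 0.75]
```

The truncation structure is exactly why the obstacle ψ can be normalized to zero: shifting the data shifts the truncation threshold.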
Remark 2.2. The above result essentially relies on the regularity theory for Maxwell's equations (cf. [60]). Here, we consider the case where the coefficients are isotropic and constant, so that the proof is direct. By using the recent result [3, Theorem 1], one can similarly obtain the following generalization. Assume that µ is constant, σ ∈ W 1,3+δ (Ω) for some δ > 0 and Ω is of class C 1,1 . If u d ∈ H 1 (Ω) and f ∈ H(div, Ω) hold as above, then the solution (y * , p * , u * ) to the optimality system (2.7a)-(2.7c) still satisfies u * ∈ H 1 (Ω) and y * , p * ∈ H 1 (curl, Ω).
From the above discussion about the regularity, we can see that if there are no additional assumptions on the physical coefficients, the given data and the domain Ω, the optimal control u * should generally be of low regularity (less regular than H 1 ). Hence we may expect from the standard (h-version) FEM theory that on uniform meshes, the simplest piecewise constant approximation can already achieve the optimal approximation accuracy, and any higher-order approximation cannot improve the numerical accuracy of the optimal control. In fact, this remains true even if the optimal control u * has the H 1 -regularity, since we aim only at the L 2 -error of the control variable. This is one of the main motivations of the current work. We would like to further remark that if the solution is piecewise analytic (singularities are allowed on a lower-dimensional manifold), one may use general versions of the FEM (e.g., the hp-FEM) to get a better convergence rate, even an exponential rate [57] [8][54] [56] (but the concrete design and implementation of such algorithms would typically be difficult and beyond the scope of this work).
The rest of this section is devoted to the finite element discretization of the control problem. We consider a family of conforming and shape regular triangulations {T h } of Ω, which are aligned with the partition Ω = ∪ m i=1 Ω i in the sense that each Ω i is the union of the elements T ∈ T h contained in it. We set h := max T ∈T h h T , where h T denotes the diameter of T . We also refer to F h := {∂T ∩ Ω ; T ∈ T h }, the set of all the interior faces, as the skeleton of the triangulation T h . The diameter of a face F is denoted by h F . In our algorithm, we use the lowest-order H(curl)-conforming edge element space V h of Nédélec's first family with zero tangential trace to approximate both the state and the adjoint state. For approximating the control variable, we employ the simplest space U h of element-wise constant functions with respect to the triangulation T h . Here and in what follows, P k (T ) is the space of polynomials of degree ≤ k over T . The discrete admissible set U ad h is then given by U ad h := U ad ∩ U h . With the help of these notions, we formulate the discrete optimal control problem: As in the continuous case, the existence and uniqueness of the solution to the discrete system (2.12)-(2.13) can be guaranteed, and the corresponding optimality system reads as follows: where P h denotes the L 2 -projection P U h : L 2 (Ω) → U h , (2.15) and the approximate target control u d h in (2.14c) is given by P h u d . Similarly, we denote by y h (u h ) the solution to (2.13) and introduce the discrete reduced cost functional accordingly. We also see from (2.14c) and the relation P U ad h = P U ad P h that (2.17) holds, which directly implies a pointwise representation (2.18) for the discrete optimal control.
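The relation P U ad h = P U ad P h means that the discrete optimal control is obtained by an elementwise mean followed by a truncation: the L 2 -projection P h onto piecewise constants is just the (quadrature-weighted) average over each element. A small sketch under an assumed data layout (hypothetical quadrature weights w and sampled values v of −p * h /α + u d ; not the paper's implementation):

```python
import numpy as np

def discrete_control(cells, w, v):
    """u*_h on each element: first P_h (the weighted mean over the
    element, i.e. the L2-projection onto piecewise constants), then
    the truncation max(0, .) enforcing the unilateral constraint."""
    out = []
    for idx in cells:  # idx: quadrature-point indices of one element
        mean = np.dot(w[idx], v[idx]) / np.sum(w[idx])  # P_h restricted to T
        out.append(max(0.0, mean))                      # projection onto U_ad
    return np.array(out)

# two elements with two sample points each
cells = [np.array([0, 1]), np.array([2, 3])]
w = np.ones(4)
v = np.array([1.0, -3.0, 2.0, 4.0])
# means are -1.0 and 3.0, so the truncated control is [0.0, 3.0]
```

Note the order of operations: averaging first and truncating afterwards is what the identity P U ad h = P U ad P h encodes; truncating before averaging would give a different (wrong) discrete control.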

Adaptive algorithm and the a posteriori error analysis
In this section, we give a complete discussion of the adaptive edge element method for solving the H(curl)-elliptic control problem and derive the a posteriori error estimates. An adaptive finite element method typically consists of the successive loops SOLVE → ESTIMATE → MARK → REFINE, where the a posteriori error estimation (module ESTIMATE) is the core step in the design of an adaptive algorithm, through which we can extract the information on the error distribution. For our algorithm and analysis, we shall use residual-type a posteriori error estimators in terms of numerically computable quantities, which include the element residuals η T , T ∈ T h , and the face residuals η F , F ∈ F h . Here η (i) y,T and η (i) p,T denote the element residuals, η (i) y,F and η (i) p,F denote the face residuals, and [·] F stands for the jump across the face F . We remark that it may not be easy to see how to define these terms at first glance, but we shall see that they arise naturally in the reliability analysis. For ease of exposition, we denote by η h (T ) the residual-type error indicator associated with an element T . We should note that there are no face residuals on the boundary of the domain, since the state equation satisfies the homogeneous boundary condition. For the a posteriori error estimates and the convergence analysis, a lower-order data oscillation osc h (u d ) related to u d is needed. Some higher-order data oscillations osc h (y d ) and osc h (f) associated with y d and f shall also be involved, where we assume that y d h ∈ V h and f h ∈ U h are some approximations of y d and f, respectively. In contrast to the element residuals and face residuals associated with the discrete solutions, the data oscillation osc h (u d ) is typically of lower order for a non-smooth target control u d , and at most of order O(h), by the Poincaré inequality, even if u d has a certain regularity. Meanwhile, the data oscillations osc h (y d ) and osc h (f) are of the same order as the residuals, and shall be of higher order if the data y d and f have additional regularity.
Therefore, we could replace y d and f in the element residuals η (i) y,T , i = 1, 2, and η (1) p,T , as well as some face residuals, with y d h and f h for ease of implementation, without any influence on the performance of the algorithm or any essential change to the analysis.
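For orientation, the lower-order oscillation term measures the best piecewise-constant approximation of u d ; in the standard convention (a hedged reconstruction, since the precise displayed definition is not retained in this excerpt) it reads:

```latex
\operatorname{osc}_h(u^d)^2
  \;=\; \sum_{T\in\mathcal T_h}\bigl\| u^d - P_h u^d \bigr\|_{0,T}^2
  \;=\; \bigl\| u^d - u^d_h \bigr\|_{0,\Omega}^2 ,
\qquad u^d_h := P_h u^d .
```

With this convention, the elementwise Poincaré inequality indeed gives osc h (u d ) = O(h) whenever u d ∈ H 1 (Ω), matching the order stated above.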
As we shall see, since inconsistent discrete spaces are used and no additional regularity assumptions are imposed on the data, the lower-order data oscillations may have a significant contribution to the total error, so that a single residual-type error estimator η h is not enough to capture the error distribution accurately. Therefore, to design an efficient adaptive algorithm for our problem, it is necessary to take into account the data oscillations [26,29]. In view of this, we introduce a new mixed error indicator η̃ h in (3.2), by incorporating the lower-order data oscillation osc h (u d ) into η h . We are now in a position to present the algorithm (see Algorithm 1 below), based on the error estimator η̃ h . In what follows, we shall use the iteration index k to indicate the dependence of a variable or a quantity on a particular mesh generated in the adaptive process; for instance, we write η k (T ) for η h (T ) to emphasize that we are considering the error estimator η h (T ) on the mesh T k .
Algorithm 1 Adaptive edge element method
1: Specify a shape regular initial mesh T 0 on Ω and set the iteration index k := 0.
2: (SOLVE) Compute the numerical solution (y * k , p * k , u * k ) of the discrete optimality system (2.14a)-(2.14c) on the mesh T k .
3: (ESTIMATE) Compute the error estimator η̃ k (T ) defined in (3.2) for each element T ∈ T k .
4: (MARK) Mark a subset M k ⊂ T k containing at least one element T̃ k with the largest error indicator, i.e., η̃ k (T̃ k ) = max T ∈T k η̃ k (T ). (3.3)
5: (REFINE) Refine the elements in M k and other necessary elements by bisection to generate the smallest conforming mesh T k+1 with T k+1 ∩ M k = ∅.
6: Set k := k + 1 and go to Step 2 until the preset stopping criterion is met.
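The loop structure of Algorithm 1 can be sketched schematically as follows; the four modules are user-supplied callbacks with hypothetical interfaces (a sketch of the control flow only, not of the paper's solver):

```python
def adaptive_loop(mesh, tol, solve, estimate, mark, refine, max_iter=50):
    """Schematic SOLVE -> ESTIMATE -> MARK -> REFINE loop of an AFEM.
    estimate() returns a dict {element: indicator}; mark() must return
    a set containing an element with the largest indicator."""
    for k in range(max_iter):
        sol = solve(mesh)               # discrete optimality system
        eta = estimate(mesh, sol)       # per-element error indicators
        if max(eta.values()) < tol:     # one possible stopping criterion
            break
        marked = mark(eta)              # bulk or maximum marking
        mesh = refine(mesh, marked)     # bisection + conformity closure
    return mesh, sol

# toy run: the "mesh" is just a refinement level and the single
# indicator halves with each refinement
mesh, sol = adaptive_loop(
    0, 0.1,
    solve=lambda m: m,
    estimate=lambda m, s: {0: 2.0 ** (-m)},
    mark=lambda eta: {0},
    refine=lambda m, marked: m + 1,
)
# stops at level 4, where the indicator 2^-4 = 0.0625 < 0.1
```

In practice the stopping test would of course use the full estimator η̃ k rather than its maximum alone; the sketch only fixes the order of the four modules.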
Several further remarks concerning each module in Algorithm 1 are in order. First, in the module SOLVE, we are required to solve a large-scale quadratic optimization problem with discretized PDE-constraints efficiently. It is nowadays a very active research area, and many high-performance algorithms and preconditioners have been developed for this purpose (cf. [52][51] [50][55] [16]).
Second, to guarantee the convergence of the algorithm, we only need the condition (3.3) in the module MARK that the marked elements contain an element with the largest error indicator, which can be met by many popular marking strategies, such as the maximum strategy [7], the equidistribution strategy [23] and Dörfler's strategy [21]. In our numerical experiments (see Section 5), we shall use Dörfler's strategy to select the marked set M k with minimal cardinality such that, for a given θ ∈ (0, 1), Σ T ∈M k η̃ 2 k (T ) ≥ θ Σ T ∈T k η̃ 2 k (T ). Third, we only include the data oscillation osc h (u d ) in the definition of η̃ h , since it is a lower-order term that may provide a dominant error contribution among all the data oscillations. On the other hand, the behavior of the optimal control u * relies largely on the properties of the obstacle function ψ, whose information is further transferred to u d . We hence expect that the approximation ability of U h associated with the current mesh T h for u d can directly influence the performance of our AFEM; see also [29] [26] for related discussions. In the case where ψ is a constant function and u d = 0, osc h (u d ) vanishes and η̃ h becomes the standard residual-type error estimator η h .
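Dörfler (bulk) marking with minimal cardinality is obtained greedily: sort the indicators in decreasing order and take elements until the marked set carries the fraction θ of the total squared estimator. A minimal sketch (hypothetical dict-based interface):

```python
def doerfler_mark(eta, theta):
    """Smallest set M with sum_{T in M} eta(T)^2 >= theta * sum_T eta(T)^2.
    eta: dict mapping elements to their indicators; theta in (0, 1)."""
    total = sum(e ** 2 for e in eta.values())
    marked, acc = [], 0.0
    for T, e in sorted(eta.items(), key=lambda kv: -kv[1]):
        marked.append(T)   # largest indicators first, so (3.3) holds
        acc += e ** 2
        if acc >= theta * total:
            break
    return marked

# with indicators {a: 3, b: 1, c: 2} and theta = 0.5, marking the single
# largest element already suffices: 3^2 = 9 >= 0.5 * 14
```

Since the largest indicator is always taken first, the greedy set automatically satisfies the minimal marking condition (3.3) for any θ.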
Finally, for the module REFINE, all elements of M k are bisected at least once, and some additional elements in T k \M k may also need to be subdivided in order to generate a sequence of uniformly shape regular and conforming meshes {T k } k≥0 ; see [17] and [48,Section 4] for the detailed mesh refinement algorithm and the necessary assumptions on the initial triangulation T 0 . Such a refinement process ensures that all the generic constants involved in the inequalities below depend only on the shape regularity of the initial mesh and the given data.
We shall next present the a posteriori error analysis based on the error estimator η̃ h in (3.2), including both the reliability and efficiency estimates. The analysis is similar in spirit to the ones given in [35] [53], but with several essential difficulties and differences as stated in the introduction, especially those caused by the inconsistency between the discrete spaces of the state and control.

Reliability
In this section, we show that the error estimator η̃ h is reliable in the sense that it provides an upper bound for the total error between the true solution and the numerical solution. For this purpose, following [53], we introduce a helpful quasi-interpolation operator Π h . We start with the definition of the extended neighborhood Ω T of an element T ∈ T h and fix some notation. Recall that the neighborhood Ω v of a vertex v is defined as the union of all the elements that contain the vertex v, and the extended neighborhood of a vertex v is obtained by enlarging Ω v with the neighborhoods of its neighboring vertices. Conversely, the collection of extended neighborhoods Ω T , T ∈ T h , covers each element in T h uniformly finitely many times: Here the constants C(T 0 ) depend only on the initial mesh T 0 .
Lemma 3.1. [53, Theorem 1] There exists an interpolation operator Π h : H 0 (curl, Ω) → V h such that for any u ∈ H 0 (curl, Ω), u − Π h u admits the decomposition: Since the optimal state y * and the adjoint state p * satisfy the coupled system (2.7a)-(2.7c), the Galerkin orthogonality, which is important for the a posteriori error analysis of linear boundary value problems, does not hold here. To compensate for this, we introduce the intermediate state and adjoint state, y(u * h ) and p(u * h ), defined by the equations: If we use the variational discretization for the control variable, we can derive the following error equivalence (cf. [28] [30]): which allows us to directly conclude the reliability of the error estimator from the known result concerning Maxwell's equations [53, Corollary 2]. If we use the edge element discretization for the control as in [35], the above error equivalence still holds, up to the data oscillation osc h (u d ). However, this is not the case for our algorithm, since the discrete spaces of the control and state variables are different. Instead, we have the following result.
Lemma 3.2. Let the triplets (y * , p * , u * ) and (y * h , p * h , u * h ) be the solutions to (2.7a)-(2.7c) and (2.14a)-(2.14c), respectively. Then it holds that Proof. By the well-posedness of the H(curl)-elliptic variational problem, we have which, combined with the triangle inequality, reduces the proof of the lemma to the estimate of ∥u * h − u * ∥ 0,Ω . In view of (2.10) and (2.17), we get by the contraction property of L 2 -projections [15, Proposition 5.3]. Moreover, we can deduce, by taking φ = p * − p(u * h ) in (3.9) and ψ = y * − y(u * h ) in (3.10), that Combining (3.14) with (3.13), we obtain which completes the proof of the lemma.
With the above preparations, we are now ready to prove the reliability of the error estimator.
Theorem 3.3. Let the triplets (y * , p * , u * ) and (y * h , p * h , u * h ) be the solutions to the continuous and discrete optimality systems (2.7a)-(2.7c) and (2.14a)-(2.14c), respectively. Then we have the following reliability estimate: Proof. By Lemma 3.2, it suffices to estimate ∥y(u * h ) − y * h ∥ curl,Ω + ∥p(u * h ) − p * h ∥ curl,Ω to obtain the reliability estimate (3.15). We first consider the estimate for the state variable y. Let e y := y(u * h ) − y * h , and recall the norm equivalence ∥v∥ B ≈ ∥v∥ curl,Ω . We can derive, by the Galerkin orthogonality, A direct application of Lemma 3.1 gives us the decomposition e y − Π h e y = ∇ϕ + z with ϕ ∈ H 1 0 (Ω) and z ∈ H 1 0 (Ω). Substituting it into (3.16) and using integration by parts on each T , we have To proceed, we use the scaled trace inequality [48, Corollary 6.1]: By the property (3.6), the desired estimate follows from (3.19) and the Cauchy-Schwarz inequality: The error e p := p(u * h ) − p * h for the adjoint state can be analysed similarly. We note that ∥e p ∥ 2 curl,Ω ≈ B(e p , e p − Π h e p ) + B(e p , Π h e p ) , and write e p − Π h e p = ∇ϕ + z by Lemma 3.1. Then some elementary calculations give us that ,Ω e p curl,Ω . Moreover, by equations (2.14b) and (3.10), we have ,Ω e p curl,Ω . Then we can derive, by using integration by parts and the above estimates, that ,Ω e p curl,Ω , which, by (3.20), the trace inequality (3.18) and Lemma 3.1, gives Combining estimates (3.20) and (3.21) with Lemma 3.2 and the definition of η̃ h completes the proof of (3.15).

Efficiency
In this section, we consider the efficiency estimate, which is another aim of the a posteriori error analysis. For this, we need the so-called bubble functions, which play a similar role to cut-off functions and can help us estimate the local errors. As we shall see shortly, when we deal with the divergence parts of the residual-type error estimator η h (i.e., η (2) y,T , η (2) p,T , η (2) y,F and η (2) p,F ), higher-order bubble functions have to be used to ensure the vanishing boundary traces of certain terms. Moreover, the curl structures on the right-hand sides of the adjoint equations (2.7b) and (2.14b) also require special and careful treatment. These important points were not addressed in [35].
For the reader's convenience, we next briefly review the definition of the bubble functions and some basic results; see [59] and [2] for a comprehensive introduction to this topic. We define the bubble function b T for an element T ∈ T h in terms of the barycentric coordinate functions λ T,i , 1 ≤ i ≤ 4, associated with the four vertices of T . Similarly, the bubble function b F for a face F ∈ F h is defined for x ∈ T ∈ ω F . Here, ω F := {T ∈ T h ; F ⊂ ∂T } is the element pair for a face F ∈ F h , and λ F,i are the barycentric coordinate functions associated with the vertices of the face F , which can be naturally extended to ω F . To extend the face residuals defined on F to ω F , we introduce the extension operator as follows. We first define the operator Ê : C(F̂ ) → C(T̂ ) on the reference element T̂ in R 3 , where F̂ is the face of T̂ lying on the (x̂, ŷ)-plane. By using the affine mapping F T (x̂) = A T x̂ + a T : T̂ → T ∈ ω F , the general extension operator E : C(F ) → C(ω F ) can be introduced by (3.22), where the mappings F T , T ∈ ω F , are chosen such that F̂ is mapped to F and E[p] is well-defined and continuous on ω F . The next lemma summarizes the important properties of the bubble functions, which can be easily verified by the equivalence of norms on a finite-dimensional linear space and the standard scaling argument.
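For concreteness, the commonly used normalized bubble functions read as follows (cf. Verfürth [59]; the normalization constants 256 = 4^4 and 27 = 3^3, which make the maximum value equal to one, are the standard choice and are stated here as an assumption rather than quoted from this excerpt):

```latex
b_T \;=\; 256\,\lambda_{T,1}\lambda_{T,2}\lambda_{T,3}\lambda_{T,4}
  \ \text{ on } T, \qquad b_T \equiv 0 \ \text{ on } \Omega\setminus T,
\\[2pt]
b_F\big|_{T} \;=\; 27\,\lambda_{F,1}\lambda_{F,2}\lambda_{F,3}
  \ \text{ for } T \in \omega_F, \qquad b_F \equiv 0 \ \text{ on } \Omega\setminus\omega_F .
```

Both bubbles vanish on the boundary of their supports, which is exactly the property exploited below to insert localized test functions into the variational equations.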
Lemma 3.4. Let $k$ be a positive integer and $s$ be a positive real number. For any $T \in \mathcal{T}_h$ and $F \in \mathcal{F}_h$, there holds
$$\|\varphi\|_{0,T}^2 \lesssim (b_T \varphi, \varphi)_{0,T}, \quad \|b_T^s \varphi\|_{0,T} \lesssim \|\varphi\|_{0,T} \quad \text{for } \varphi \in P_k(T), \qquad (3.23)$$
$$\|\varphi\|_{0,F}^2 \lesssim (b_F E[\varphi], \varphi)_{0,F}, \quad \|b_F^s E[\varphi]\|_{0,T} \lesssim h_F^{1/2}\|\varphi\|_{0,F} \qquad (3.24)$$
for $T \in \omega_F$ and $\varphi \in P_k(F)$.
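As a side illustration (not part of the paper's analysis or code), the element and face bubble functions can be evaluated via barycentric coordinates; the normalization constants $256 = 4^4$ and $27 = 3^3$ are the standard choices (an assumption of this sketch) making the maximum value 1 at the respective barycenters:

```python
import numpy as np

def barycentric(vertices, x):
    """Barycentric coordinates of a point x w.r.t. an n-simplex.

    vertices: (n+1, n) array of simplex vertices; x: point in R^n.
    Solves the affine system sum_i lam_i * v_i = x with sum_i lam_i = 1."""
    v = np.asarray(vertices, dtype=float)
    A = np.vstack([v.T, np.ones(len(v))])           # (n+1) x (n+1) system
    b = np.append(np.asarray(x, dtype=float), 1.0)
    return np.linalg.solve(A, b)

def element_bubble(tet_vertices, x):
    """b_T = 256 * lam_1 lam_2 lam_3 lam_4 on a tetrahedron; vanishes on
    the boundary of T and attains the value 1 at the barycenter."""
    return 256.0 * np.prod(barycentric(tet_vertices, x))

def face_bubble(tri_vertices, x):
    """b_F = 27 * lam_1 lam_2 lam_3 on a triangular face; vanishes on the
    edges of F and attains the value 1 at the face barycenter."""
    return 27.0 * np.prod(barycentric(tri_vertices, x))

# Reference tetrahedron and a 2D reference face
T = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
F = np.array([[0, 0], [1, 0], [0, 1]], dtype=float)

print(element_bubble(T, T.mean(axis=0)))  # 1.0 at the barycenter
print(element_bubble(T, [0.0, 0.0, 0.0])) # 0.0 at a vertex (on the boundary)
print(face_bubble(F, F.mean(axis=0)))     # 1.0 at the face barycenter
```

The vanishing of $b_T$ on $\partial T$ is exactly the property that makes $b_T$-localized residuals admissible test functions in $H_0(\mathrm{curl}, \Omega)$ below.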
We are now in a position to state and prove the main result of this section.
Theorem 3.5. Let the triplets $(y^*, p^*, u^*)$ and $(y^*_h, p^*_h, u^*_h)$ be the solutions to the continuous and discrete optimality systems (2.7a)-(2.7c) and (2.14a)-(2.14c), respectively, and let the multipliers $\lambda^*$ and $\lambda^*_h$ be given by (2.8) and (2.16). Then we have the efficiency estimate. Before we start the proof, we remark that, in comparison with [35], it is necessary to consider the error of the multiplier $\|\lambda^*_h - \lambda^*\|_{0,\Omega}$ here in order to estimate $\tilde{\eta}_h$, since the additional error estimator $\eta^{(3)}_{p,T}$ is included in $\tilde{\eta}_h$. We start with the local efficiency estimate for $\eta_{p,T}$, which follows from the definitions of $\lambda^*$, $\lambda^*_h$ and $\eta_{p,T}$ and the triangle inequality. Our proof then proceeds by establishing further local efficiency estimates for $\eta_T$, divided into the following four groups of estimates, for $T \in \mathcal{T}_h$ and $F \in \mathcal{F}_h$, where $T^+$ and $T^-$ are the two elements in $\omega_F$ with $F = T^+ \cap T^-$.
Proof. We prove the above four groups of inequalities in the following four steps.
(1) We start with $\eta^{(1)}_{y,T}$ and readily see by the triangle inequality that it suffices to bound the localized element residual. It is clear that $b_T$ vanishes on $\partial T$, and hence we can define the corresponding localized test function in $H_0(\mathrm{curl}, \Omega)$. The desired bound then follows from the estimates (3.26) and (3.27), the inverse inequality and the norm equivalence. The estimate of $\eta^{(1)}_{p,T}$ is similar. We define $z_h := b_T(\mathrm{curl}\,y_d + \mathrm{curl}\,\mu^{-1}\mathrm{curl}\,p^*_h + \sigma p^*_h) \in H_0(\mathrm{curl}, \Omega)$. Then a similar estimate as above applies, where in (3.29) we have used $(\mathrm{curl}\,y^*_h, \mathrm{curl}\,z_h)_{0,T} = 0$, from the fact that $y^*_h$ is a first-order polynomial on $T$, and in (3.30) we have used the adjoint equation. Further applying the inverse estimate for $\|z_h\|_{\mathrm{curl},T}$ in (3.30) and recalling (3.28), we arrive at the desired bound.

(2) It is clear that $\nabla z_h$ is a polynomial on $T$ that vanishes on the boundary $\partial T$, which gives $\nabla z_h \in H_0(\mathrm{curl}, \Omega)$. A direct calculation then applies, where we have used in the last equality the identity obtained from (2.7a) with the test function $\varphi = \nabla z_h$. Again, by Lemma 3.4 and the inverse estimate, we can derive the desired bound from the definition of $\eta^{(2)}_{y,T}$ and (3.31). Likewise, for $\eta^{(2)}_{p,T}$, taking $z_h = \mathrm{div}(\sigma p^*_h)\, b_T^2$ and observing from (2.7b) that $(\sigma p^*, \nabla z_h)_{0,\Omega} = (\mathrm{curl}\,y^* - y_d, \mathrm{curl}\,\nabla z_h)_{0,\Omega} = 0$, we can derive the analogous bound by almost the same arguments as above.

(3) Since the lowest-order edge element is used for the discretization, $\mathrm{curl}\,y^*_h$ is a piecewise constant vector. Then $\gamma_t(\mu^{-1}\mathrm{curl}\,y^*_h)$ is a polynomial defined on $F$ and can be extended to $\omega_F$ by the extension operator $E$ introduced in (3.22). By the estimate (3.23) and integration by parts over $\omega_F$, we obtain the lower bound for the jump term, while the estimate (3.24) and the inverse estimate bound the test function. Combining (3.33) with the formula (3.32), we get the desired estimate, where we have used $(\mathrm{curl}\,\mathrm{curl}\,y^*_h, z_h)_{0,T} = 0$ for $T \in \omega_F$ in (3.34).
Then, by the inverse estimate and Lemma 3.4, a direct estimate leads to the desired bound.

(4) For $\eta^{(2)}_{y,F}$, we define the test function via the face bubble and note by the triangle inequality that it suffices to bound the localized jump term. Again by the property of the bubble functions (3.23) and integration by parts over $\omega_F$, we obtain an identity in which the last term can be rewritten by means of (3.36). Therefore, we obtain the desired estimate from (3.37)-(3.39) and the Cauchy-Schwarz inequality, applying the inverse estimate to $\|\nabla z_h\|_{0,\omega_F}$ and the trace inequality to $\|z_h\|_{0,F}$ (cf. (3.18)), together with the trivial bound $h_F \lesssim 1$. The estimate for $\eta^{(2)}_{p,F}$ follows from the same (even simpler) argument. In fact, we can define $z_h = b_F^2 E([\gamma_n(\sigma p^*_h)]_F)$, which implies $\nabla z_h \in H_0(\mathrm{curl}, \Omega)$ and, by (2.7b), $(\sigma p^*, \nabla z_h)_{0,\Omega} = 0$. Then a typical calculation together with Lemma 3.4 and the inverse estimate yields the desired bound.

Convergence
We devote this whole section to establishing our main result that the sequence of adaptively generated finite element solutions $\{(y^*_k, p^*_k, u^*_k)\}_{k\ge 0}$ converges strongly to the true solution $(y^*, p^*, u^*)$.

Theorem 4.1. Let $\{(y^*_k, p^*_k, u^*_k)\}_{k\ge 0}$ be the sequence of discrete triplets generated by the adaptive Algorithm 1 and $(y^*, p^*, u^*)$ be the solution to the system (2.7a)-(2.7c). Then we have the strong convergences, as $k \to \infty$: $y^*_k \to y^*$ and $p^*_k \to p^*$ in $H_0(\mathrm{curl}, \Omega)$, and $u^*_k \to u^*$ in $L^2(\Omega)$.

As already pointed out in the introduction, the first step of the proof of convergence is the introduction of a limiting minimization problem that characterizes the limit of the discrete solutions to the system (2.14a)-(2.14c), while the second step is to show that the solution to the limiting problem actually coincides with that of (2.7a)-(2.7c). To do so, let us first state some helpful auxiliary results on the convergence behavior of the adaptive meshes $\{\mathcal{T}_k\}_{k\ge 0}$. We associate each triangulation $\mathcal{T}_k$ with a mesh-size function $h_k \in L^\infty(\Omega)$, defined by $h_k|_T = h_T$ for $T \in \mathcal{T}_k$. Thanks to the monotonicity of $\{h_k\}_{k\ge 0}$, we are allowed to define a limiting mesh-size function $h_\infty$ by the pointwise limit: $h_k(x) \to h_\infty(x)$ as $k \to \infty$. It is also known that the pointwise convergence of $\{h_k\}$ can be further improved to uniform convergence [48, Lemma 4.2]:
$$\lim_{k\to\infty} \|h_k - h_\infty\|_{L^\infty(\Omega)} = 0. \qquad (4.4)$$
We should note that $h_\infty(x) \not\equiv 0$ in general. If $h_\infty(x) > 0$ at some point $x$, there is an element $T$ containing $x$ and an index $k(x)$ depending on $x$ such that $T \in \mathcal{T}_l$ for all $l \ge k(x)$. This observation motivates us to split $\mathcal{T}_k$ into two classes of elements: $\mathcal{T}^+_k$ consists of the elements in $\mathcal{T}_k$ that are not refined after the $k$th iteration, while $\mathcal{T}^0_k \subset \mathcal{T}_k$ contains the elements that are refined at least once in the subsequent iterations. We are thus allowed to decompose the domain $\Omega$ into two parts: $\Omega^+_k := \bigcup_{T \in \mathcal{T}^+_k} T$ and $\Omega^0_k := \bigcup_{T \in \mathcal{T}^0_k} T$, and we write $\widetilde{\Omega}^0_k := \bigcup_{T \in \mathcal{T}^0_k} \Omega_T$ for the extended neighborhood of $\Omega^0_k$ (with $\Omega_T$ the patch of elements touching $T$).
We remark that this uniform convergence result for the mesh-size functions is crucial for our subsequent analysis. We next give an interesting characterization of the limiting behavior of the $L^2$-projections $\{P_k\}_{k\ge 0} := \{P_{U_k}\}_{k\ge 0}$ with the help of the convergence property of the adaptive meshes $\{\mathcal{T}_k\}_{k\ge 0}$, which establishes a connection between the limit of the mesh-size functions, the $L^2$-projections and the limiting problem that we shall propose and deal with in the next subsection.

Lemma 4.2. Let $\{P_k\}_{k\ge 0} := \{P_{U_k}\}_{k\ge 0}$ be the sequence of orthogonal $L^2$-projections (cf. (2.15)) associated with the adaptive meshes $\{\mathcal{T}_k\}_{k\ge 0}$ generated by Algorithm 1. Then for each $f \in L^2(\Omega)$, the limit of the sequence $\{P_k f\}_{k\ge 0}$, denoted by $P_\infty f$, exists as $k \to \infty$. Furthermore, the corresponding limiting operator $P_\infty$ is also an orthogonal $L^2$-projection, with range and kernel given by
$$\mathrm{ran}(P_\infty) = U_\infty := \overline{\textstyle\bigcup_{k\ge 0} U_k}^{L^2(\Omega)} \quad \text{and} \quad \ker(P_\infty) = U_\infty^\perp. \qquad (4.5)$$

Proof. It suffices to show that the limit of the sequence $\{P_k f\}$ exists for all $f \in C^\infty_c(\Omega)$ (infinitely differentiable vector-valued functions with compact support in $\Omega$), by the facts that the operators $\{P_k\}$ are uniformly bounded and the space $C^\infty_c(\Omega)$ is dense in $L^2(\Omega)$. Given two iteration indices $k_1, k_2$ with $k_2 > k_1$, define $\mathcal{T}^0_{k_1,k_2} := \mathcal{T}_{k_1}\setminus(\mathcal{T}_{k_1} \cap \mathcal{T}_{k_2})$, which is the subset of $\mathcal{T}^0_{k_1}$ consisting of the elements that are refined between the $k_1$th and the $k_2$th iterations. We then have, by the equivalent definition of $P_k$ in (2.15), a representation of $P_{k_2} f - P_{k_1} f$ in terms of local averages, where $\chi_{T_i}$ is the characteristic function of $T_i$. Recalling the limiting behavior of the mesh-size functions $\{h_k\}$ in (4.4), we have that for any $\delta > 0$, there exists an index $k(\delta)$ depending on $\delta$ such that $\|h_k\|_{\infty, \widetilde{\Omega}^0_k} \le \delta$ for all $k > k(\delta)$. Combining this with the uniform continuity of $f$, we have that for any $\varepsilon > 0$, there exists an index $k(\varepsilon, \delta)$ depending on $\varepsilon$ and $\delta$ such that for any integers $k_1, k_2$ satisfying $k_2 > k_1 > k(\varepsilon, \delta)$, and for any elements $T \in \mathcal{T}^0_{k_1,k_2}$, $T_i \in \mathcal{T}_{k_2}$ with $T_i \subset T$, the local oscillation of $f$ over $T$ is bounded by $\varepsilon$. We hence conclude that $\{P_k f\}$ is a Cauchy sequence in $L^2(\Omega)$.
Then it follows from the completeness of $L^2(\Omega)$ that the limit of $\{P_k f\}$ exists. We have proved that for any $f \in L^2(\Omega)$, $P_\infty f$ is well-defined, which further allows us to define the bounded linear operator $P_\infty$ on $L^2(\Omega)$, by the uniform boundedness principle. We now show that $P_\infty$ is an orthogonal $L^2$-projection with the property (4.5). To do so, we first observe that for any $g \in L^2(\Omega)$, it holds that $(P_\infty f, g)_{0,\Omega} = \lim_{k\to\infty} (P_k f, g)_{0,\Omega} = \lim_{k\to\infty} (P_k g, f)_{0,\Omega} = (f, P_\infty g)_{0,\Omega}$, which implies that $P_\infty$ is self-adjoint. On the other hand, $P_\infty$ is idempotent. Thus, we can conclude that $P_\infty$ is an orthogonal $L^2$-projection. To characterize its range and kernel, we denote by $P$ the orthogonal projection associated with the closed space $U_\infty$ defined in (4.5). By definition, there holds $\mathrm{ran}(P) = U_\infty$ and $\ker(P) = U_\infty^\perp$. We readily see from the definition of $P_\infty$ that $\mathrm{ran}(P_\infty) \subset U_\infty$, since every element in $\mathrm{ran}(P_\infty)$ can be approximated by a sequence from $\bigcup_{k\ge 0} U_k$. Conversely, to prove $U_\infty \subset \mathrm{ran}(P_\infty)$, we note that for any $f \in U_n$, $P_k f = f$ holds for each $k \ge n$. Then letting $k \to \infty$ gives us $P_\infty f = f$, which indicates $U_n \subset \mathrm{ran}(P_\infty)$. Since $n$ is arbitrary, we readily see $\overline{\bigcup_{k\ge 0} U_k}^{L^2} \subset \mathrm{ran}(P_\infty)$. The proof is complete.
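To illustrate the cell-average characterization of $P_k$ (cf. (2.15)) in a simplified setting, the following self-contained 1D sketch (not part of the paper's MATLAB implementation; the test function and meshes are arbitrary choices) computes the piecewise-constant $L^2$-projection by cell averaging and exhibits the expected $O(h)$ decay of the projection error under uniform refinement:

```python
import numpy as np

def project_piecewise_constant(f, mesh):
    """Orthogonal L2-projection onto piecewise constants over a 1D mesh
    (array of breakpoints): on each cell, the projection is the cell average,
    computed here with an 8-point Gauss-Legendre rule."""
    nodes, weights = np.polynomial.legendre.leggauss(8)
    averages = []
    for a, b in zip(mesh[:-1], mesh[1:]):
        x = 0.5 * (b - a) * nodes + 0.5 * (a + b)
        integral = 0.5 * (b - a) * np.dot(weights, f(x))
        averages.append(integral / (b - a))
    return np.array(averages)

def l2_error(f, mesh, coeffs):
    """L2 norm of f minus its piecewise-constant approximation."""
    nodes, weights = np.polynomial.legendre.leggauss(8)
    err2 = 0.0
    for (a, b), c in zip(zip(mesh[:-1], mesh[1:]), coeffs):
        x = 0.5 * (b - a) * nodes + 0.5 * (a + b)
        err2 += 0.5 * (b - a) * np.dot(weights, (f(x) - c) ** 2)
    return np.sqrt(err2)

f = np.cos  # an arbitrary smooth test function on [0, pi]
errors = []
for k in range(1, 6):  # successively (uniformly) refined meshes
    mesh = np.linspace(0.0, np.pi, 2 ** k + 1)
    errors.append(l2_error(f, mesh, project_piecewise_constant(f, mesh)))

# Cell averaging is the L2-optimal piecewise-constant approximation,
# so the error decays like O(h): it roughly halves at each refinement.
print(errors)
```

On adaptively refined meshes the same cell-average formula applies cell by cell, which is the mechanism behind the Cauchy-sequence argument in the proof above: on elements that are never refined again, $P_k f$ no longer changes.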

The limiting problem
In order to find the limit point of the discrete triplets $\{(y^*_k, p^*_k, u^*_k)\}_{k\ge 0}$, we first define the limiting spaces as the closures of the unions of the discrete spaces at each level: $V_\infty := \overline{\bigcup_{k\ge 0} V_k}^{\mathrm{curl}}$ and $U^{ad}_\infty := \overline{\bigcup_{k\ge 0} U^{ad}_k}^{L^2(\Omega)}$, which immediately implies $U^{ad}_\infty = U_\infty \cap U^{ad}$, where the space $U_\infty$ is defined in (4.5). Since $U^{ad}_\infty$ is a convex subset of $L^2(\Omega)$, it is closed in the weak topology if and only if it is closed in the strong topology induced by the norm. We hence have the following key lemma.

Lemma 4.3. Let $u_k \in U^{ad}_k$, $k \ge 0$, be a sequence weakly converging to some $u$ in $L^2(\Omega)$. Then $u \in U^{ad}_\infty$.
We now consider the following limiting problem defined on the limiting spaces. The existence and uniqueness of its minimizer can be obtained by the same standard arguments as in the continuous and discrete cases, and the reduced cost functional for the limiting problem is defined accordingly, where $y_\infty(u_\infty)$ denotes the solution to the equation (4.8) with the source $u_\infty$. The rest of this subsection is devoted to proving that the limit point of $\{(y^*_k, p^*_k, u^*_k)\}_{k\ge 0}$ exists and is exactly the solution to the limiting optimization problem. As pointed out in the introduction, to compensate for the lack of discrete compactness, we shall first investigate the weak limit of the sequence $\{u^*_k\}_{k\ge 0}$, and then improve it to strong convergence; see the next two theorems.
Theorem 4.4. Suppose that $\{(u^*_k, y^*_k)\}_{k\ge 0}$ is the sequence of discrete optimal controls and states generated by Algorithm 1, and $(u^*_\infty, y^*_\infty)$ is the optimal control and state of the limiting optimization problem (4.7)-(4.8). Then we have the following weak convergences as $k \to \infty$: $u^*_k \rightharpoonup u^*_\infty$ in $L^2(\Omega)$ and $y^*_k \rightharpoonup y^*_\infty$ in $H_0(\mathrm{curl}, \Omega)$.
Proof. Our proof starts with a simple but important observation: the sequence $\{u^*_k\}$ is bounded, which is not a trivial fact due to the unboundedness of $U^{ad}$. Indeed, comparing the discrete cost at $u^*_k$ with that at a fixed $u \in U^{ad}$ gives the boundedness of $\{u^*_k\}$ in $L^2(\Omega)$. The boundedness of $\{y^*_k\}$ then follows immediately. Hence, by the Banach-Alaoglu theorem, we can extract a subsequence $\{(u^*_{k_n}, y^*_{k_n})\}$ such that $u^*_{k_n} \rightharpoonup w$ in $L^2(\Omega)$ and $y^*_{k_n} \rightharpoonup y$ in $H(\mathrm{curl}, \Omega)$ as $n$ tends to infinity, which further implies that $\{y^*_{k_n}\}$ and $\{\mathrm{curl}\,y^*_{k_n}\}$ weakly converge to $y$ and $\mathrm{curl}\,y$, respectively, in $L^2(\Omega)$ (since $H(\mathrm{curl}, \Omega)$ is continuously embedded in $L^2(\Omega)$ and $\mathrm{curl}$ is a bounded linear operator from $H(\mathrm{curl}, \Omega)$ to $L^2(\Omega)$). Moreover, Lemma 4.3 yields $w \in U^{ad}_\infty$. Noting $V_l \subset V_{k_n}$ for $l \le k_n$ and that there holds
$$B(y^*_{k_n}, v_l) = (f + u^*_{k_n}, v_l)_{0,\Omega} \quad \text{for } l \le k_n \text{ and } v_l \in V_l, \qquad (4.9)$$
we let $n$ tend to infinity in (4.9) and pass to the limit by the weak convergences of $\{u^*_{k_n}\}$ and $\{y^*_{k_n}\}$. Since $\bigcup_{l\ge 0} V_l$ is dense in $V_\infty$, we readily obtain (4.10), which means that $(y, w)$ satisfies the constraint (4.8) of the limiting problem, that is, $y = y_\infty(w)$. We claim that $w$ is the minimizer $u^*_\infty$ of the cost functional $J_\infty$ over the set $U^{ad}_\infty$, namely (4.11), which, by (4.10), also implies that the weak limit $y$ of the sequence $\{y^*_{k_n}\}$ is $y^*_\infty$. To prove the claim, we first note (4.12) by the uniform boundedness principle and the weak convergences of $\{\mathrm{curl}\,y^*_{k_n}\}$ and $\{u^*_{k_n}\}$ to $\mathrm{curl}\,y_\infty(w)$ and $w$ in $L^2(\Omega)$. Clearly, by (4.12) and the fact that $u^*_{k_n}$ is the minimizer of $J_{k_n}$ over $U^{ad}_{k_n}$, (4.13) holds for any sequence $u_k \in U^{ad}_k$, $k \ge 0$. We next prove an auxiliary fact: for any $u_\infty \in U^{ad}_\infty$ and any sequence $u_k \in U^{ad}_k$, $k \ge 0$, with $\|u_k - u_\infty\|_{0,\Omega} \to 0$ as $k \to \infty$, there holds (4.14), which readily gives
$$\lim_{k\to\infty} J_k(u_k) = J_\infty(u_\infty). \qquad (4.15)$$
This fact (4.15), along with (4.13), completes our proof of the claim (4.11). To prove (4.14), we note, by the assumption, a quasi-optimal bound for the discrete states, where we have used the Galerkin orthogonality and the density of $\bigcup_{k\ge 0} V_k$ in $V_\infty$.
By the above arguments, we can conclude that every subsequence of $\{(u^*_k, y^*_k)\}_{k\ge 0}$ admits a further subsequence weakly converging to $(u^*_\infty, y^*_\infty)$ in $L^2(\Omega) \times H_0(\mathrm{curl}, \Omega)$, which immediately yields the weak convergence of the whole sequence $\{(u^*_k, y^*_k)\}_{k\ge 0}$: $u^*_k \rightharpoonup u^*_\infty$ in $L^2(\Omega)$ and $y^*_k \rightharpoonup y^*_\infty$ in $H_0(\mathrm{curl}, \Omega)$, as $k \to \infty$.
The proof is complete.
Thanks to the above results, we are ready to show the main theorem of this subsection.

Theorem 4.5. Under the assumptions of Theorem 4.4, we have the strong convergences:
$$u^*_k \to u^*_\infty \ \text{ in } L^2(\Omega) \quad \text{and} \quad y^*_k \to y^*_\infty \ \text{ in } H_0(\mathrm{curl}, \Omega), \quad \text{as } k \to \infty. \qquad (4.16)$$

Proof. We note from the claim (4.14) in the proof of Theorem 4.4 that the second convergence in (4.16) is a consequence of the first one. Hence, it suffices to prove the convergence of $\{u^*_k\}_{k\ge 0}$. To this end, a direct calculation, followed by taking the limit superior on both sides and using Theorem 4.4, implies (4.17). It then follows that (4.18) holds. We choose a sequence $u_k \in U^{ad}_k$, $k \ge 0$, such that $\|u_k - u^*_\infty\|_{0,\Omega} \to 0$ as $k \to \infty$, and then derive (4.19) by (4.18) and (4.15). Combining the estimate (4.19) with (4.17), we obtain the strong convergence of $\{u^*_k\}_{k\ge 0}$.
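The upgrade from weak to strong convergence of the controls rests on a standard Hilbert-space fact, recalled here for completeness (with the inner product of $L^2(\Omega)$): weak convergence together with convergence of the norms implies strong convergence, via the expansion

```latex
\|u_k^* - u_\infty^*\|_{0,\Omega}^2
  \;=\; \|u_k^*\|_{0,\Omega}^2 \;-\; 2\,(u_k^*, u_\infty^*)_{0,\Omega} \;+\; \|u_\infty^*\|_{0,\Omega}^2 .
```

Since $u^*_k \rightharpoonup u^*_\infty$ gives $(u^*_k, u^*_\infty)_{0,\Omega} \to \|u^*_\infty\|_{0,\Omega}^2$, it suffices to control $\limsup_{k\to\infty} \|u^*_k\|_{0,\Omega}$ by $\|u^*_\infty\|_{0,\Omega}$, which is what the minimizing property of $u^*_k$ delivers through the estimates (4.17)-(4.19).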
The optimality system for the limiting problem can be written down by a standard argument, and one sees that the sequence of discrete triplets $\{(y^*_k, p^*_k, u^*_k)\}_{k\ge 0}$ converges strongly to the solution $(y^*_\infty, p^*_\infty, u^*_\infty)$ of the limiting optimality system (4.20a)-(4.20c). We remark that the variational inequality (4.20c) will be used in the next subsection.

Proof of convergence
We have proved the strong convergence of the discrete solutions $\{(y^*_k, p^*_k, u^*_k)\}$ in Section 4.1, where we only used the structure of the control problem; the adaptive process itself played no essential role there. In the subsequent analysis, by contrast, the error estimator $\tilde{\eta}_h$ defined in (3.2) and the marking requirement (3.3) in Algorithm 1 play a crucial role. To complete the proof of Theorem 4.1, we start with the following key lemma.

Lemma 4.6. Let $\widetilde{T}_k \in \mathcal{T}_k$ be one of the elements that achieve the maximum value of $\tilde{\eta}_k(T)$ over $T \in \mathcal{T}_k$, i.e., $\tilde{\eta}_k(\widetilde{T}_k) := \max_{T \in \mathcal{T}_k} \tilde{\eta}_k(T)$. Then $\tilde{\eta}_k(\widetilde{T}_k) \to 0$ as $k \to \infty$.

Proof. We start with the estimates for $\eta^{(1)}_{y,T}$ and $\eta^{(1)}_{y,F}$ on a given mesh $\mathcal{T}_k$. A direct application of the inverse estimate and the triangle inequality bounds $\eta^{(1)}_{y,T}$, and, by the assumption on $\mu$ and the trace inequality (3.18), we obtain a bound for $\eta^{(1)}_{y,F}$. We now consider $\eta^{(2)}_{y,T}$ and $\eta^{(2)}_{y,F}$. Similarly, we have a bound by the triangle inequality, and the trace inequality (3.18) gives the corresponding face bound. Hence, the estimates (4.22) and (4.23) follow. The same analysis applies to the other terms in $\eta_k$. A careful but straightforward computation then shows, for $T \in \mathcal{T}_k$, the elementwise bound (4.24). Here $\omega_T$ denotes the union of elements that share a common face with $T$. It then follows from (4.24) that $\eta_k(\widetilde{T}_k)$ satisfies the bound (4.25), where $C(f) := \|f\|_{0,\Omega} + \|\mathrm{div}\,f\|_{0,\Omega} + \|[\gamma_n(f)]_\Gamma\|_{0,\Gamma}$ is well-defined by (2.4). Hence, by the definition of $\tilde{\eta}_h$, we have an estimate for $\tilde{\eta}_k(\widetilde{T}_k)$ in which $\eta_k(\widetilde{T}_k)$ has been bounded by (4.25). Since $\widetilde{T}_k$ will be marked in the $(k+1)$th iteration, (4.26) holds by (3.5) and the uniform convergence (4.4) of the adaptive meshes. Taking advantage of the absolute continuity of the Lebesgue integral and Theorem 4.5, we can get the desired vanishing limit of $\{\tilde{\eta}_k(\widetilde{T}_k)\}$ from (4.25) and (4.26) as $k$ tends to infinity. The proof is complete.
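The marking rule (3.4)-(3.5) is not reproduced here; as a hypothetical illustration consistent with the fact that a maximizer $\widetilde{T}_k$ of the indicator is always marked, a maximum-type strategy can be sketched as follows (the element names and indicator values are made up):

```python
def mark_maximum(indicators, theta=0.5):
    """Maximum-type marking strategy: mark every element whose indicator is
    at least theta times the largest one, so a maximizing element is always
    marked. (Whether this matches the paper's rule (3.4) exactly is an
    assumption of this sketch.)"""
    eta_max = max(indicators.values())
    return {T for T, eta in indicators.items() if eta >= theta * eta_max}

# Hypothetical indicator values eta_k(T) on a four-element mesh
indicators = {"T1": 0.9, "T2": 0.5, "T3": 0.44, "T4": 0.1}
marked = mark_maximum(indicators, theta=0.5)
print(sorted(marked))  # ['T1', 'T2']: T3 just misses the threshold 0.45
```

Whatever the precise rule, the only property used in Lemma 4.6 is that the maximizing element $\widetilde{T}_k$ is marked and hence refined, which forces its local mesh size, and with it the indicator, to shrink.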
We are now well-prepared to show that the limiting state y * ∞ and adjoint state p * ∞ actually satisfy the variational problems (2.7a) and (2.7b), respectively. It is worth emphasizing that in our proof, there is no need to introduce a buffer layer of elements between the meshes at different levels as in [61], by virtue of the generalized convergence result of mesh-size functions (4.4).
Lemma 4.7. Suppose that $(y^*_\infty, p^*_\infty, u^*_\infty)$ is the solution to the limiting optimality system (4.20a)-(4.20c). Then it satisfies the variational problems (2.7a) and (2.7b), namely (4.27) and (4.28).

Proof. We only prove (4.27), since the proof of (4.28) is similar. For this, we first introduce the following residual functionals on $H_0(\mathrm{curl}, \Omega)$: $R(\cdot) := B(y^*_\infty, \cdot) - (f + u^*_\infty, \cdot)_{0,\Omega}$ and $R_k(\cdot) := B(y^*_k, \cdot) - (f + u^*_k, \cdot)_{0,\Omega}$, $k \ge 0$. Note that Theorem 4.5 gives us the operator-norm convergence of $R_k$ to $R$. Therefore, to prove that $R$ is actually the zero functional, it suffices to show that $R_k$ vanishes in the limit, with the involved constant $C$ independent of $k$. Then, by the Cauchy-Schwarz inequality, we obtain a bound which vanishes as $l$ tends to infinity. By the above arguments, we have proven (4.40). In view of (4.37), (4.38) and (4.40), we complete the proof, since the left-hand side of (4.37) is independent of $k$.
We end our theoretical analysis for the adaptive approximation of the control problem with an additional remark.
Remark 4.9. If the function $u_d$ has the regularity $u_d \in H^1(\Omega)$, then using Theorem 4.1, the uniform convergence (4.4) and the Poincaré inequality, we can argue in a manner similar to [61, Theorem 6.2] to conclude the convergence of the error estimators $\{\eta_k\}$. However, for general $L^2$-data $u_d$, the optimal control $u^*$ can only be guaranteed to have $L^2$-regularity, and the data oscillation $\{\mathrm{osc}_k(u_d)\}$ may not converge to zero in the adaptive process.

Numerical experiments
In this section, we provide a detailed documentation of the numerical results to illustrate the performance of our adaptive algorithm for the optimal control problem. All the examples presented below are implemented with the MATLAB package iFEM [18], using MATLAB 2017a on a personal laptop with 8.00 GB RAM and a dual-core 2.4 GHz CPU. When solving the discrete optimization problems (2.12)-(2.13), we use the projected gradient algorithm, in which the HX-preconditioner [32] is adopted for solving the $H(\mathrm{curl})$-elliptic problems. More precisely, we start with an initial guess $u^{(0)}_h \in U^{ad}_h$, and in the $n$th step ($n \ge 1$) we solve two $H(\mathrm{curl})$-elliptic problems to obtain the state $y^{(n)}_h$ and the adjoint state $p^{(n)}_h$ with the help of the HX-preconditioner. Then the control variable $u_h$ is updated explicitly by a projected gradient step, where the step-size parameter $s$ can be computed explicitly by solving a one-dimensional quadratic optimization problem. Compared to the algorithm in [35], where the edge element was used to approximate the control and an additional least squares problem with inequality constraints had to be solved to realize the box constraint (1.2), which is very expensive and time-consuming, our algorithm uses only about half the number of DoFs for computing the control variable, gives equally accurate numerical resolution, and can be trivially implemented.
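The projected gradient step described above can be sketched in a simplified finite-dimensional setting (a hypothetical toy quadratic objective stands in for the reduced cost; the projection onto the box constraints is the componentwise clamp, and a fixed step size replaces the exact one-dimensional minimization used in the paper):

```python
import numpy as np

def project_box(u, a, b):
    """Pointwise projection onto the admissible box {a <= u <= b}."""
    return np.clip(u, a, b)

def projected_gradient(grad, u0, a, b, s=0.5, tol=1e-10, max_iter=500):
    """Projected gradient iteration: u_n = P_[a,b](u_{n-1} - s * grad(u_{n-1}))."""
    u = project_box(np.asarray(u0, dtype=float), a, b)
    for _ in range(max_iter):
        u_new = project_box(u - s * grad(u), a, b)
        if np.linalg.norm(u_new - u) < tol:
            return u_new
        u = u_new
    return u

# Toy reduced cost J(u) = 0.5*||u - g||^2 with a target g partly outside the
# box, so the constrained minimizer is simply the clamped target.
g = np.array([2.0, -1.5, 0.3])
u_star = projected_gradient(lambda u: u - g, np.zeros(3), a=-1.0, b=1.0)
print(u_star)  # approx [1.0, -1.0, 0.3]
```

In the paper's setting the gradient evaluation is the expensive part, since each evaluation of the reduced gradient requires solving the state and adjoint $H(\mathrm{curl})$-elliptic problems, which is where the HX-preconditioner pays off.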
In the following, we carry out numerical experiments for two benchmark problems. For both examples, we first compute the numerical solutions and errors on uniform meshes with five refinement iterations for the purpose of comparison, and then run the new adaptive algorithm, which is terminated when the corresponding number of DoFs is almost the same as that of the finest uniform mesh. The first example is modified from [68] [22] and features a corner singularity in the solution. We consider an optimal control problem on the L-shaped domain $\Omega = [-1,1]^3 \setminus [0,1]\times[0,1]\times[-1,1]$ with both the electric permittivity $\sigma$ and the magnetic permeability $\mu$ taken to be 1. We set the target applied current density (control) $u_d$ and the target magnetic field (state) $y_d$ to zero, and set the parameter $\alpha$ to 0.1. We choose the inhomogeneous Dirichlet boundary condition and the source term $f$ such that the exact solution is given by $u^* = p^* = 0$ and $y^* = \nabla\big(r^{2/3}\sin(\tfrac{2}{3}\theta)\big)$ in cylindrical coordinates.
We set $\theta = 0.5$ in the marking strategy (3.4) and remark that there is no need to consider the data oscillation $\mathrm{osc}_T(u_d)$ in this example, because the desired control $u_d$ vanishes. In our simulations, the computation starts on a very coarse mesh with 480 DoFs. The adaptively refined meshes in the 10th, 14th and 18th iterations are presented in Figure 1, in which we clearly observe that the meshes are locally refined around the $z$-axis, where the singularity of $y^*$ occurs. Note that the non-smooth function $y^* \notin H^1(\Omega)$ and that the lowest-order edge elements of first family are used to discretize the state variable. We therefore cannot expect the optimal convergence order $O(h)$ on a uniformly refined mesh, according to the standard interpolation result (4.30). However, this theoretical order of convergence can be recovered by the adaptive mesh refinement in the sense that $\|y^*_k - y^*\|_{\mathrm{curl},\Omega} \approx O((\#\mathrm{DoFs})^{-1/3})$, which is confirmed by the convergence history plotted in Figure 2 (left) on a double logarithmic scale. We also observe that the errors of the control $\|u^* - u^*_k\|_{0,\Omega}$ and the adjoint state $\|p^* - p^*_k\|_{\mathrm{curl},\Omega}$ decrease with a slope of $-2/3$, twice that of the error associated with the state $y$. The difference in the convergence rates may be explained by the fact that $u^*$ and $p^*$ are zero functions and hence very smooth, while the optimal state $y^*$ is non-smooth and has a strong singularity near the $z$-axis. Moreover, Figure 2 (right) shows the reduction of the total error (almost dominated by the error $\|y^* - y^*_k\|_{\mathrm{curl},\Omega}$) and the convergence behavior of the a posteriori error estimator $\tilde{\eta}_h = \eta_h$, which confirms the efficiency of $\tilde{\eta}_h$ and the superiority of the adaptive mesh refinement over the uniform one: the total error on the adaptively refined mesh decreases with order $-1/3$, roughly double the order ($-0.15$) observed on the uniformly refined mesh.
This is also confirmed by the computing times and computational costs: to achieve the error obtained on the mesh generated by the 5th uniform refinement, the adaptive algorithm takes only about 80 seconds and $10^5$ DoFs, whereas the uniform refinement takes about 580 seconds and 1,410,240 DoFs.

Figure 2: Convergence histories of the control, state and adjoint state (left), and the total errors for the uniform mesh refinement and the adaptive mesh refinement, as well as the error estimator $\tilde{\eta}_h$ (right), for the first example.
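The empirical orders quoted above (e.g. $-1/3$ versus $-0.15$) can be extracted from (#DoFs, error) pairs as the least-squares slope on a double logarithmic scale; a generic sketch with synthetic data of known order $-1/3$ (not the paper's measured values):

```python
import numpy as np

def convergence_order(dofs, errors):
    """Least-squares slope of log(error) against log(#DoFs):
    error ~ C * (#DoFs)^slope, so slope ~ -1/3 corresponds to the optimal
    first-order rate O(h) in 3D, where h ~ (#DoFs)^(-1/3)."""
    slope, _intercept = np.polyfit(np.log(dofs), np.log(errors), 1)
    return slope

# Synthetic data generated with error = 10 * (#DoFs)^(-1/3)
dofs = np.array([480.0, 4000.0, 32000.0, 256000.0])
errors = 10.0 * dofs ** (-1.0 / 3.0)
print(convergence_order(dofs, errors))  # approx -0.3333
```

On real data the slope is only asymptotic, so it is common to fit over the last few refinement levels rather than the full history.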
The second example is chosen to be similar to the one in [35], with non-smooth source terms and large jumps of the physical coefficients across the interface between two different media. To be precise, we consider an optimal control problem on $\Omega = [-1,1]^3$ with a high-contrast inclusion $\Omega_c := \{x \in \mathbb{R}^3;\ x^2 + y^2 + z^2 < 0.6^2\}$, on which the coefficients $\mu$ and $\sigma$ are given by $\sigma = 10$ in $\Omega_c$ and $\sigma = 1$ in $\Omega\setminus\Omega_c$, and $\mu^{-1} = 0.1$ in $\Omega_c$ and $\mu^{-1} = 1$ in $\Omega\setminus\Omega_c$.
The other parameters are chosen to be the same as in the first example. We again set $\theta = 0.5$ in the marking strategy (3.4) and start the computation on a very coarse initial mesh with 604 DoFs. We show the evolution of the adaptive meshes on the $(y,z)$-cross section at $x = 0$ in Figure 3, from which we immediately see that the meshes are strongly concentrated in the high-contrast inclusion $\Omega_c$ and clearly capture the shape of the interface. We also observe, from Figure 4 (left), an optimal convergence order $-1/3$ for the control and state variables, and a slightly faster decrease of the error $\|p^* - p^*_k\|_{\mathrm{curl},\Omega}$ for the adjoint state, possibly because $p^*$ is quite smooth (a zero function) compared to $u^*$ and $y^*$. Figure 4 (right) again verifies the effectiveness of the error estimator $\tilde{\eta}_k$ and the gain in computational efficiency from the adaptive mesh refinement: the total error on the adaptively refined mesh decreases with order $-1/3$, whereas the error on the uniform one decreases only with order $-0.2$.

Figure 4: Convergence histories of the control, state and adjoint state (left), and the total errors for the uniform mesh refinement and the adaptive mesh refinement, as well as the error estimator $\tilde{\eta}_h$ (right), for the second example.

Concluding remarks
In this work, we have studied an electromagnetic optimal control problem, using the lowest-order edge elements to approximate the state and adjoint state and the piecewise constant functions to approximate the control. We have designed an adaptive finite element method with an error indicator involving both a residual-type error estimator and a lower-order data oscillation, and have established the reliability and efficiency of the a posteriori error estimator as well as the strong convergence of the adaptive finite element solutions for both the state and control variables. With very minor modifications, our analysis and arguments apply directly to the more realistic case [63] [61] where the control $u$ acts on a subdomain $\Omega_c$ of $\Omega$, and to the case [28] where the control satisfies a bilateral constraint. It is worth pointing out that our arguments can also be modified to cope with the inhomogeneous Dirichlet boundary condition. To be exact, suppose that the boundary condition is given by $\gamma_t y = \gamma_t g$, with $g \in H(\mathrm{curl}, \Omega)$ a known function matching the given Dirichlet trace data [14]. Then one can check that the optimality systems (2.7a)-(2.7c) and (2.14a)-(2.14c) still hold, except that (2.7a) and (2.14a) are solved on the affine spaces $g + V$ and $\Pi_h g + V_h$, respectively. Hence, up to a possible data oscillation $\|g - \Pi_h g\|_{\mathrm{curl},\Omega}$, our a posteriori error estimates and convergence analysis readily apply.
As mentioned in the Introduction, the model of our interest is connected with the discretization of the control problem of time-dependent eddy current equations [41][46] [47], where an implicit time-stepping scheme is adopted for the sake of stability [31] [10]. In this case, the coefficient $\sigma$ will be scaled by the current time-step size. It is therefore important to design an error estimator that is robust with respect to the scaling of the coefficients, namely, the generic constants involved in the a posteriori error analysis should be independent of the scaling factors of the coefficients. For this, we may measure the errors of the states ($y$ and $p$) by the energy norm on $H_0(\mathrm{curl}, \Omega)$: $\|\cdot\|_B^2 = \|\sqrt{\mu^{-1}}\,\mathrm{curl}\,\cdot\|_{0,\Omega}^2 + \|\sqrt{\sigma}\,\cdot\|_{0,\Omega}^2$, and the error estimators may also need to be scaled correspondingly. If we assume that $\mu$ and $\sigma$ are element-wise constant with respect to the initial mesh $\mathcal{T}_0$ [10], or that there exists a scaled norm $\|\cdot\|_V^2 := \mu_*^{-1}\|\mathrm{curl}\,\cdot\|_{0,\Omega}^2 + \sigma_*\|\cdot\|_{0,\Omega}^2$, with $\mu_*$ and $\sigma_*$ positive constants, such that $B(\cdot,\cdot)$ is continuous and inf-sup stable with respect to $\|\cdot\|_V$ with the involved generic constants independent of $\mu$ and $\sigma$ [53], one can naturally follow the a posteriori error analysis provided in this work to obtain a robust error analysis by using the energy norm and the scaled error estimators (for instance, if the norm $\|\cdot\|_V$ defined above exists, by adding a scaling factor $\sqrt{\mu_*}$ we can modify $\eta^{(1)}_{y,T}$ and $\eta^{(1)}_{y,F}$ as follows: $\eta^{(1)}_{y,T} := h_T\sqrt{\mu_*}\,\|f + u^*_h - \mathrm{curl}\,\mu^{-1}\mathrm{curl}\,y^*_h - \sigma y^*_h\|_{0,T}$ and $\eta^{(1)}_{y,F} := \sqrt{h_F\,\mu_*}\,\|[\gamma_t(\mu^{-1}\mathrm{curl}\,y^*_h)]_F\|_{0,F}$). We refer the readers to [17][48] for related discussions. However, the detailed and rigorous treatment of general coefficients and the time-dependent model is a nontrivial task and needs further investigation.
We finally remark that it is also of great interest to design adaptive algorithms for more complicated models, such as the nonlinear electromagnetic control problem [64] and Maxwell variational inequalities [66] [65].