Efficient MATLAB Codes for the 2D/3D Stokes Equation with the Mini-Element

Koko, Jonas

doi:10.15388/Informatica.2019.205

Informatica

Efficient MATLAB Codes for the 2D/3D Stokes Equation with the Mini-Element

Volume 30, Issue 2 (2019), pp. 243–268

Jonas Koko

https://doi.org/10.15388/Informatica.2019.205

Pub. online: 1 January 2019 Type: Research Article

Open Access

Received
1 February 2018

Accepted
1 December 2018

Published
1 January 2019

Abstract

We propose a fast MATLAB implementation of the mini-element (i.e. $P1$-Bubble/$P1$) for the finite element approximation of the generalized Stokes equation in 2D and 3D. We use cell arrays to derive vectorized assembling functions. We also propose a Uzawa conjugate gradient method as an iterative solver for the global Stokes system. Numerical experiments show that our implementation has an (almost) optimal time-scaling. For 3D problems, the proposed Uzawa conjugate gradient algorithm outperforms MATLAB built-in linear solvers.

1 Introduction

MATLAB is a popular problem solving environment, widely used for general scientific computation in education, engineering and research. MATLAB is nowadays a standard tools in many areas. Thanks to its collection of direct (e.g. $\mathit{LU}$, ${\mathit{LDL}^{\top }}$, Cholesky) and iterative (e.g. conjugate gradient, GMRES, bi-conjugate gradient) solvers, it is tempting to use MATLAB for the numerical approximation of partial differential equations (PDEs). For the finite element method (FEM), the main obstacle for using MATLAB is the assembly of the global matrix and vector. Since MATLAB built-in solvers are optimized, the assembly operations can take up to 99% of whole CPU time, as shown in Koko (2007), using an implementation with standard loop over triangles (Alberty et al., 1999, 2002; Kwon and Bang, 2000) directly derived from compiled languages (FORTRAN and C/C++). Unfortunately, some vectorized FEM codes are less flexible and require a huge amount of memory due to the allocation of auxiliary arrays and the corresponding matrix operations (Koko, 2007; Rahman and Valdman, 2013, 2015). Recently, Koko (2016) proposed a MATLAB implementation close to the standard form by using cell-arrays to store the gradient of the basis functions, for the Poisson equation and linear elasticity in 2D and 3D.

In this paper, we propose a fast MATLAB implementation of the $P1$-Bubble/$P1$ finite element (Mini element, Arnold et al., 1984; Ern and Guermond, 2002; Glowinski, 2003) for the generalized Stokes problem in 2D and 3D. The mini element for spatial discretization of the Stokes problem is easy to use in engineering practice since it allows for the use of equal-order interpolation (the same mesh for velocity and pressure). Equal-order interpolation is very useful in large-scale multi-physics codes. Indeed, a code dealing with several independent variables (e.g. chemical species, velocity components, etc.) requires the transfer of information between its different components at each time-step. Fast implementation means that our code operates on array and does not use for-loops over elements (triangles of tetrahedrons) for the assembling operations. Instead, we use cell-arrays to store element matrices as in Koko (2016). We also propose a solution strategy for the final linear system. Indeed, we propose an efficient (preconditioned) Uzawa conjugate gradient method derived from the one used with $P2/P1$ (or $P1$-iso-$P2/P1$) finite element pair (Cahouet and Chabard, 1988; Glowinski, 2003). The proposed conjugate gradient method operates on the dual (pressure) space and, at each iteration, d independent linear systems are solved ($d=2,3$). Our implementation needs only MATLAB basic distribution functions and can be easily modified and refined.

The paper is organized as follows. The model problem is described in Section 2, followed by a finite element discretization in Section 3. The element matrices in 2D/3D are described in Section 4. In Section 5, we propose our Uzawa conjugate gradient method for solving the Stokes system. MATLAB implementation details are given in Section 6. Numerical experiments are carried out in Section 7. Readers can download and edit the codes from http://www.isima.fr/~jkoko/Codes/KSTOK.tar.gz.

2 The Model Problem

Let Ω be a bounded domain in ${\mathbb{R}^{d}}$ ($d=2,3$) with a Lipschitz-continuous boundary Γ. Consider in Ω the Stokes problem

(2.1)

\[ \alpha \boldsymbol{u}-\nu \Delta \boldsymbol{u}+\nabla p=\boldsymbol{f},\hspace{1em}\text{in}\hspace{2.5pt}\Omega ,\]

(2.2)

\[ \nabla \cdot \boldsymbol{u}=0,\hspace{1em}\text{in}\hspace{2.5pt}\Omega ,\]

(2.3)

\[ \boldsymbol{u}={\boldsymbol{u}^{D}},\hspace{1em}\text{on}\hspace{2.5pt}\Gamma ,\]

where $\boldsymbol{u}=({u_{1}},\dots ,{u_{d}})\in {\mathbb{R}^{d}}$ is the velocity vector, p, the pressure, and $\boldsymbol{f}=({f_{1}},\dots ,{f_{d}})\in {\mathbb{R}^{d}}$, the field of external forces. In equation (2.1), $\alpha \geqslant 0$ is an arbitrary constant. If $\alpha =0$, then equations (2.1)–(2.3) turn to be the classic Stokes problem. If $\alpha >0$, then equations (2.1)–(2.3) turn to be a generalized Stokes problem encountered in time discretization of Navier-Stokes equations (see e.g. Glowinski, 2003). The constant $\nu >0$ is the kinematic viscosity.

We need the functional spaces $\boldsymbol{V}={H_{0}^{1}}{(\Omega )^{d}}$,

\[ {\boldsymbol{V}^{D}}=\big\{\boldsymbol{v}\in {H^{1}}{(\Omega )^{d}}\hspace{0.1667em}:\hspace{0.1667em}\boldsymbol{v}={\boldsymbol{u}^{D}}\hspace{2.5pt}\text{on}\hspace{2.5pt}\Gamma \big\},\hspace{1em}P=\bigg\{q\in {L^{2}}(\Omega )\hspace{0.1667em}:\hspace{0.1667em}{\int _{\Omega }}q\hspace{0.1667em}\mathrm{d}x=0\bigg\},\]

and bilinear forms

\[\begin{array}{l}\displaystyle {a_{i}}({u_{i}},{v_{i}})=\alpha {({u_{i}},{v_{i}})_{\Omega }}+\nu {(\nabla {u_{i}},\nabla {v_{i}})_{\Omega }},\hspace{1em}i=1,\dots ,d,\\ {} \displaystyle \boldsymbol{a}(\boldsymbol{u},\boldsymbol{v})={\sum \limits_{i=1}^{d}}{a_{i}}({u_{i}},{v_{i}}),\end{array}\]

where ${(\cdot \hspace{0.1667em},\cdot )_{\Omega }}$ stands for the standard ${L^{2}}(\Omega )$ scalar product. The variational formulation of the Stokes problem (2.1)–(2.3) is as follows:

Find $(\boldsymbol{u},p)\in {\boldsymbol{V}^{D}}\times P$ such that:

(2.4)

\[ a(\boldsymbol{u},\boldsymbol{v})-{(p,\nabla \cdot \boldsymbol{v})_{\Omega }}={(\boldsymbol{f},\boldsymbol{v})_{\Omega }},\hspace{1em}\forall \boldsymbol{v}\in \boldsymbol{V},\]

(2.5)

\[ -{(q,\nabla \cdot \boldsymbol{u})_{\Omega }}=0,\hspace{1em}\forall q\in P.\]

3 Finite Element Discretization with $P1$-Bubble-$P1$

For the finite element discretization of (2.4)–(2.5), we have to chose a finite element pair for the velocity field and the pressure. This choice cannot be arbitrary but must satisfy the inf-sup condition (Babuska, 1971; Brezzi, 1974).

3.1 Mini Element

In this paper we study the discretization of the Stokes problem (2.1)–(2.3) by the finite element pair $P1$-bubble/$P1$ (the so-called mini-element), introduced by Arnold et al. (1984). This element leads to a relatively low number of degrees of freedom with a good approximate solution.

Let ${\mathcal{T}_{h}}$ be a triangulation of Ω and T a triangle of ${\mathcal{T}_{h}}$. We define the space associated with the bubble by

\[ {\mathbb{B}_{h}}=\big\{{\boldsymbol{v}_{h}}\in {\mathcal{C}^{0}}(\bar{\Omega });\hspace{2.5pt}\forall T\in {\mathcal{T}_{h}},\hspace{2.5pt}{v_{h|T}}=\boldsymbol{x}{b^{(T)}}\big\}.\]

We also defined the discrete function spaces

\[\begin{array}{l}\displaystyle {V_{ih}}=\big\{{\boldsymbol{v}_{h}}\in {\mathcal{C}^{0}}(\bar{\Omega });\hspace{0.2778em}{\boldsymbol{v}_{h|T}}\in {P^{1}},\hspace{2.5pt}\forall T\in {\mathcal{T}_{h}};\hspace{2.5pt}{\boldsymbol{v}_{h|\Gamma }}=0\big\},\hspace{1em}i=1,\dots ,d,\\ {} \displaystyle {P_{h}}=\bigg\{{q_{h}}\in {\mathcal{C}^{0}}(\bar{\Omega });\hspace{0.2778em}{q_{h|T}}\in {P^{1}},\hspace{2.5pt}\forall T\in {\mathcal{T}_{h}};\hspace{0.1667em}:\hspace{0.1667em}{\int _{\Omega }}{q_{h}}\hspace{0.1667em}\mathrm{d}x=0\bigg\},\end{array}\]

and we set ${X_{ih}}={V_{ih}}\oplus {\mathbb{B}_{h}}$ and ${\boldsymbol{X}_{h}}={X_{1h}}\times \cdots \times {X_{dh}}$. With the above preparations, the discrete variational problem reads as follows.

Find $({\boldsymbol{u}_{h}},{p_{h}})\in {\boldsymbol{X}_{h}}\times {P_{h}}$ such that

(3.1)

\[ \boldsymbol{a}({\boldsymbol{u}_{h}},{\boldsymbol{v}_{h}})-{({p_{h}},\nabla \cdot {\boldsymbol{v}_{h}})_{\Omega }}={(\boldsymbol{f},{\boldsymbol{v}_{h}})_{\Omega }}\hspace{1em}\forall {v_{h}}\in {\boldsymbol{X}_{h}},\]

(3.2)

\[ -{({q_{h}},\nabla \cdot {\boldsymbol{u}_{h}})_{\Omega }}=0\hspace{1em}\forall {q_{h}}\in {P_{h}}.\]

For a given triangle T, the velocity field ${\boldsymbol{u}_{h}}$ and the pressure ${p_{h}}$ are approximated by linear combinations of the basis functions in the form

\[ {\boldsymbol{u}_{h}}(\boldsymbol{x})={\sum \limits_{i=1}^{d+1}}{\phi _{i}}(\boldsymbol{x}){\mathbf{u}_{i}}+{\mathbf{u}_{b}}{\phi _{b}}(\boldsymbol{x}),\hspace{2em}{p_{h}}(\boldsymbol{x})={\sum \limits_{i=1}^{d+1}}{\phi _{i}}(\boldsymbol{x}){\mathrm{p}_{i}},\]

where ${\mathbf{u}_{i}}$ and ${\mathrm{p}_{i}}$ are nodal values of ${\boldsymbol{u}_{h}}$ and ${p_{h}}$ while ${\mathbf{u}_{b}}$ is the bubble value. The basis functions are defined by

\[ {\phi _{1}}(\boldsymbol{x})=1-x-y,\hspace{2em}{\phi _{2}}(\boldsymbol{x})=x,\hspace{2em}{\phi _{3}}(\boldsymbol{x})=y,\hspace{2em}{\phi _{b}}(\boldsymbol{x})=27{\prod \limits_{i=1}^{3}}{\phi _{i}}(\boldsymbol{x})\]

in 2D, and

\[\begin{array}{l}\displaystyle {\phi _{1}}(\boldsymbol{x})=1-x-y-z,\hspace{2em}{\phi _{2}}(\boldsymbol{x})=x,\hspace{2em}{\phi _{3}}(\boldsymbol{x})=y,\\ {} \displaystyle {\phi _{4}}(\boldsymbol{x})=z,\hspace{2em}{\phi _{b}}(\boldsymbol{x})=256{\prod \limits_{i=1}^{4}}{\phi _{i}}(\boldsymbol{x})\end{array}\]

in 3D.

To construct the coefficient matrices, a number of integrals involving powers of the basis functions will be computed. Integrals over a triangle T can be evaluated directly by the following formulas

(3.3)

\[ {\int _{T}}{\phi _{1}^{{\alpha _{1}}}}{\phi _{2}^{{\alpha _{2}}}}{\phi _{3}^{{\alpha _{3}}}}\mathrm{d}\boldsymbol{x}=2|T|\frac{{\alpha _{1}}!\hspace{0.1667em}{\alpha _{2}}!\hspace{0.1667em}{\alpha _{3}}!}{({\alpha _{1}}+{\alpha _{2}}+{\alpha _{3}}+2)!},\]

(3.4)

\[ {\int _{T}}{\phi _{1}^{{\alpha _{1}}}}{\phi _{2}^{{\alpha _{2}}}}{\phi _{3}^{{\alpha _{3}}}}{\phi _{4}^{{\alpha _{4}}}}\mathrm{d}\boldsymbol{x}=6|T|\frac{{\alpha _{1}}!\hspace{0.1667em}{\alpha _{2}}!\hspace{0.1667em}{\alpha _{3}}!\hspace{0.1667em}{\alpha _{4}}\hspace{-0.1667em}}{({\alpha _{1}}+{\alpha _{2}}+{\alpha _{3}}+{\alpha _{4}}+3)!},\]

where $|T|$ stands for the triangle area (in 2D), or the tetrahedron volume (in 3D). A useful property for the basis functions is

(3.5)

\[ {\sum \limits_{i=1}^{d+1}}\nabla {\phi _{i}}=0.\]

3.2 Algebraic Formulation

We use the following notations for the discrete velocity/pressure nodal values

(3.6)

\[ {\bar{u}_{i}}=\left[\substack{{u_{i}}\\ {} {u_{ib}}}\right],\hspace{2em}{\bar{f}_{i}}=\left[\substack{{f_{i}}\\ {} {f_{ib}}}\right],\hspace{1em}i=1,\dots ,d.\]

Systems (3.1)–(3.2) lead to the following algebraic form, using notations (3.6)

(3.7)

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}\bar{A}\hspace{1em}& 0\hspace{1em}& -{\bar{B}_{1}^{\top }}\\ {} 0\hspace{1em}& \bar{A}\hspace{1em}& -{\bar{B}_{2}^{\top }}\\ {} -{\bar{B}_{1}}\hspace{1em}& -{\bar{B}_{2}}\hspace{1em}& 0\end{array}\right]\left[\substack{{\bar{u}_{1}}\\ {} {\bar{u}_{2}}\\ {} \mathrm{p}}\right]=\left[\substack{{\bar{f}_{1}}\\ {} {\bar{f}_{2}}\\ {} 0}\right]\]

in 2D, or

(3.8)

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}\bar{A}\hspace{1em}& 0\hspace{1em}& 0\hspace{1em}& -{\bar{B}_{1}^{\top }}\\ {} 0\hspace{1em}& \bar{A}\hspace{1em}& 0\hspace{1em}& -{\bar{B}_{2}^{\top }}\\ {} 0\hspace{1em}& 0\hspace{1em}& \bar{A}\hspace{1em}& -{\bar{B}_{3}^{\top }}\\ {} -{\bar{B}_{1}}\hspace{1em}& -{\bar{B}_{2}}\hspace{1em}& -{\bar{B}_{3}}\hspace{1em}& 0\end{array}\right]\left[\substack{{\bar{u}_{1}}\\ {} {\bar{u}_{2}}\\ {} {\bar{u}_{3}}\\ {} p}\right]=\left[\substack{{\bar{f}_{1}}\\ {} {\bar{f}_{2}}\\ {} {\bar{f}_{3}}\\ {} 0}\right]\]

in 3D. In (3.7) and (3.8) $\bar{A}=\bar{M}+\bar{R}$, with $\bar{M}$ as the mass matrix and $\bar{R}$ as the stiffness matrix. ${\bar{B}_{i}}$ is the divergence submatrix associated with the i-th partial derivative, i.e.

\[ {\bar{B}_{i}}\equiv {({q_{h}},{\partial _{i}}{u_{ih}})_{\Omega }},\hspace{1em}i=1,\dots ,d.\]

To create the algebraic system (3.7) or (3.8), the discrete system (3.1)–(3.2) is evaluated over each triangle T to obtain the element matrices and vectors

\[\begin{array}{l}\displaystyle {\bar{M}_{ij}^{(T)}}={\int _{T}}\alpha {\phi _{i}}{\phi _{j}}\mathrm{d}x,\hspace{2em}{\bar{R}_{ij}^{(T)}}={\int _{T}}\nu \nabla {\phi _{i}}\cdot \nabla {\phi _{j}}\mathrm{d}x,\\ {} \displaystyle {\bar{B}_{ij}^{(T)}}={\int _{T}}{\partial _{1}}{\phi _{i}}{\phi _{j}}\mathrm{d}x+{\int _{T}}{\partial _{2}}{\phi _{i}}{\phi _{j}}\mathrm{d}x,\hspace{2em}{\bar{f}_{i}^{(T)}}={\int _{T}}f{\phi _{i}}\mathrm{d}x.\end{array}\]

Then assembling operations consist of direct-summing the element matrices over the triangulation ${\mathcal{T}_{h}}$ to obtain the global matrices $\bar{M}=({\bar{M}_{ij}})$, $\bar{R}=({\bar{R}_{ij}})$, $\bar{B}=({\bar{B}_{ij}})$ and $\bar{f}=({\bar{f}_{i}})$

\[\begin{array}{l}\displaystyle {\bar{M}_{ij}}=\sum \limits_{T\in {\mathcal{T}_{h}}}{\bar{M}_{ij}^{(T)}},\hspace{2em}{\bar{R}_{ij}}=\sum \limits_{T\in {\mathcal{T}_{h}}}{\bar{R}_{ij}^{(T)}},\\ {} \displaystyle {\bar{B}_{ij}}=\sum \limits_{T\in {\mathcal{T}_{h}}}{\bar{B}_{ij}^{(T)}},\hspace{2em}{\bar{f}_{i}}=\sum \limits_{T\in {\mathcal{T}_{h}}}{\bar{f}_{i}^{(T)}}.\end{array}\]

In the next sections we detail the element matrices $\bar{M}$, $\bar{R}$, ${\bar{B}_{i}}$ and the right-hand side $\bar{f}$.

4 Element Matrices

For $P1$ finite element, we need only the element area and the gradient of the basis functions for the element matrices and vectors. To derive vectorized MATLAB codes, we need analytical expressions for all element matrices and vectors. This section is devoted to this task. For the sake of the presentation we drop the $(T)$-superscript introduced in the previous section to distinguish element matrices from global ones.

4.1 Two-Dimensional Case

For a triangle T, let ${\{({x_{i}},{y_{i}})\}_{i=1,2,3}}$ be the vertices and ${\{\phi \}_{i=1,2,3}}$ the corresponding basis functions. The gradient of ${\phi _{i}}$ is given by

(4.1)

\[ \left[\substack{\nabla {\phi _{1}^{t}}\\ {} \nabla {\phi _{2}^{t}}\\ {} \nabla {\phi _{3}^{t}}}\right]={\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}1\hspace{1em}& 1\hspace{1em}& 1\\ {} {x_{1}}\hspace{1em}& {x_{2}}\hspace{1em}& {x_{3}}\\ {} {y_{1}}\hspace{1em}& {y_{2}}\hspace{1em}& {y_{3}}\end{array}\right]^{-1}}\left[\begin{array}{c@{\hskip4.0pt}c}0\hspace{1em}& 0\\ {} 1\hspace{1em}& 0\\ {} 0\hspace{1em}& 1\end{array}\right]=\frac{1}{2|T|}\left[\begin{array}{c@{\hskip4.0pt}c}{y_{2}}-{y_{3}}\hspace{1em}& {x_{3}}-{x_{2}}\\ {} {y_{3}}-{y_{1}}\hspace{1em}& {x_{1}}-{x_{3}}\\ {} {y_{1}}-{y_{2}}\hspace{1em}& {x_{2}}-{x_{1}}\end{array}\right],\]

where $|T|$ is the area of T given by

\[ 2|T|=\mathrm{det}\left[\begin{array}{c@{\hskip4.0pt}c}{x_{2}}-{x_{1}}\hspace{1em}& {x_{3}}-{x_{1}}\\ {} {y_{2}}-{y_{1}}\hspace{1em}& {y_{3}}-{y_{1}}\end{array}\right]=({x_{2}}-{x_{1}})({y_{3}}-{y_{1}})-({x_{3}}-{x_{1}})({y_{2}}-{y_{1}}).\]

A nonbubble entry of the element stiffness matrix $\bar{R}$ is given by

(4.2)

\[ {\bar{R}_{ij}}={\int _{T}}\nu \nabla {\phi _{i}}\cdot \nabla {\phi _{j}}\mathrm{d}x=|T|\nu \nabla {\phi _{i}}\cdot \nabla {\phi _{j}},\hspace{1em}i,j=1,2,3.\]

For the bubble entries ${\bar{R}_{bj}}$, for $j=1,2,3$. A straightforward calculation yields to (using (3.5))

\[ {\bar{R}_{bj}}=\frac{9}{4}|T|{\sum \limits_{i=1}^{3}}\nabla {\phi _{i}}=0,\hspace{1em}j=1,2,3.\]

For the diagonal entry corresponding to the bubble (i.e. $i=j=b$) we have

(4.3)

\[\begin{aligned}{}{\bar{R}_{bb}}=& \nu {\int _{T}}{27^{2}}\nabla ({\phi _{1}}{\phi _{2}}{\phi _{3}})\cdot \nabla ({\phi _{1}}{\phi _{2}}{\phi _{3}})\mathrm{d}x\\ {} =& \frac{81}{10}\nu |T|\left(|\nabla {\phi _{1}}{|^{2}}+|\nabla {\phi _{2}}{|^{2}}+|\nabla {\phi _{3}}{|^{2}}+\nabla {\phi _{1}}\nabla {\phi _{2}}+\nabla {\phi _{1}}\nabla {\phi _{3}}+\nabla {\phi _{2}}\nabla {\phi _{3}}\right)\\ {} =& \frac{81}{10}\nu |T|\left(|\nabla {\phi _{1}}{|^{2}}+|\nabla {\phi _{2}}{|^{2}}+\nabla {\phi _{1}}\cdot \nabla {\phi _{2}}\right)=:{\omega _{R}},\end{aligned}\]

using (3.5).

With the above results, the element stiffness matrix is therefore

\[ \bar{R}=\left(\begin{array}{c@{\hskip4.0pt}c}R\hspace{1em}& 0\\ {} 0\hspace{1em}& {\omega _{R}}\end{array}\right).\]

As for the stiffness matrix, we set $M={({\bar{M}_{ij}})_{i,j=1,\dots ,3}}$ as the nonbubble part of the mass matrix. A direct calculation yields

(4.4)

\[ {M_{ij}}=\left\{\begin{array}{l@{\hskip4.0pt}l}\frac{\alpha }{6}|T|\hspace{1em}& \text{if}\hspace{2.5pt}i=j,\\ {} \frac{\alpha }{12}|T|\hspace{1em}& \text{elsewhere}.\end{array}\right.\]

The bubble part of the mass matrix is given by

(4.5)

\[\begin{aligned}{}{\bar{M}_{bj}}=& \frac{3\alpha }{20}|T|,\hspace{1em}j=1,2,3,\\ {} {\bar{M}_{bb}}=& \frac{81\alpha }{280}|T|=:{\omega _{M}}.\end{aligned}\]

The element mass matrix is therefore

\[ \bar{M}=\left(\begin{array}{c@{\hskip4.0pt}c}M\hspace{1em}& z\\ {} {z^{\top }}\hspace{1em}& {\omega _{M}}\end{array}\right),\]

where

\[ {z^{\top }}=\frac{3}{20}\alpha |T|\left(1\hspace{2.5pt}1\hspace{2.5pt}1\right).\]

Finally, the element stiffness/mass matrix $\bar{A}$ is

\[ \bar{A}=\left(\begin{array}{c@{\hskip4.0pt}c}A\hspace{1em}& z\\ {} {z^{\top }}\hspace{1em}& \omega \end{array}\right),\]

where we have set $A=R+M$ and $\omega ={\omega _{R}}+{\omega _{M}}$.

A direct integration yields the element divergence matrix $-\bar{B}=[-{\bar{B}_{1}}\hspace{2.5pt}-{\bar{B}_{2}}]$, where ${\bar{B}_{i}}=[{B_{i}}\hspace{2.5pt}-{B_{ib}}]$ with

(4.6)

\[ {B_{i}}=\frac{|T|}{3}\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}{\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\\ {} {\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\\ {} {\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\end{array}\right],\hspace{1em}i=1,2,\]

and

(4.7)

\[ {B_{ib}}=\frac{9|T|}{20}\left[\substack{{\partial _{i}}{\phi _{1}}\\ {} {\partial _{i}}{\phi _{2}}\\ {} {\partial _{i}}{\phi _{3}}}\right],\hspace{1em}i=1,2.\]

The contribution of the right-hand side component ${f_{i}}$, in nonbubble terms, is given by

(4.8)

\[ {f_{i}^{(T)}}=\frac{|T|}{3}{f_{iT}},\hspace{1em}i=1,2,3,\]

where ${f_{iT}}$ is a mean value of ${f_{i}}$ on T. The bubble terms of the right-hand side are

(4.9)

\[ {f_{ib}^{(T)}}={\int _{T}}{f_{i}}{\phi _{b}}\mathrm{d}x=\frac{9}{20}|T|{f_{iT}},\hspace{1em}i=1,2.\]

With the above element matrices and vectors the $11\times 11$ element system corresponding to (3.7) is

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}A\hspace{1em}& z\hspace{1em}& 0\hspace{1em}& 0\hspace{1em}& -{B_{1}^{\top }}\\ {} {z^{\top }}\hspace{1em}& \omega \hspace{1em}& 0\hspace{1em}& 0\hspace{1em}& {B_{1b}^{\top }}\\ {} 0\hspace{1em}& 0\hspace{1em}& A\hspace{1em}& z\hspace{1em}& -{B_{2}^{\top }}\\ {} 0\hspace{1em}& 0\hspace{1em}& {z^{\top }}\hspace{1em}& \omega \hspace{1em}& {B_{2b}^{\top }}\\ {} -{B_{1}}\hspace{1em}& {B_{1b}}\hspace{1em}& -{B_{2}}\hspace{1em}& {B_{2b}}\hspace{1em}& 0\end{array}\right]\left[\substack{{u_{1}}\\ {} {u_{1b}}\\ {} {u_{2}}\\ {} {u_{2b}}\\ {} p}\right]=\left[\substack{{f_{1}}\\ {} {f_{1b}}\\ {} {f_{2}}\\ {} {f_{2b}}\\ {} 0}\right].\]

To reveal diagonal blocks, the system above can be rearranged as follows

(4.10)

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}A\hspace{1em}& 0\hspace{1em}& z\hspace{1em}& 0\hspace{1em}& -{B_{1}^{\top }}\\ {} 0\hspace{1em}& A\hspace{1em}& 0\hspace{1em}& z\hspace{1em}& -{B_{2}^{\top }}\\ {} {z^{\top }}\hspace{1em}& 0\hspace{1em}& \omega \hspace{1em}& 0\hspace{1em}& {B_{1b}^{\top }}\\ {} 0\hspace{1em}& {z^{\top }}\hspace{1em}& 0\hspace{1em}& \omega \hspace{1em}& {B_{2b}^{\top }}\\ {} -{B_{1}}\hspace{1em}& -{B_{2}}\hspace{1em}& {B_{1b}}\hspace{1em}& {B_{2b}}\hspace{1em}& 0\end{array}\right]\left[\substack{{u_{1}}\\ {} {u_{2}}\\ {} {u_{1b}}\\ {} {u_{2b}}\\ {} p}\right]=\left[\substack{{f_{1}}\\ {} {f_{2}}\\ {} {f_{1b}}\\ {} {f_{2b}}\\ {} 0}\right].\]

We can now eliminate the bubble unknowns ${u_{1b}}$ and ${u_{2b}}$ since they correspond to diagonal blocks (the ω blocks). From (4.10)${_{3}}$ and (4.10)${_{4}}$, we deduce that

(4.11)

\[ {u_{ib}}=({f_{ib}}-{B_{ib}^{\top }}p-{z^{\top }}{u_{i}})/\omega ,\hspace{1em}i=1,2.\]

Substituting (4.11) into (4.10)${_{1}}$, (4.10)${_{2}}$ and (4.10)${_{5}}$ we obtain a linear system in ${({u_{1}}\hspace{2.5pt}{u_{2}}\hspace{2.5pt}p)^{\top }}$ whose matrix is

(4.12)

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}A-{\omega ^{-1}}z{z^{\top }}\hspace{1em}& 0\hspace{1em}& -{B_{1}^{\top }}-{\omega ^{-1}}z{B_{1b}^{\top }}\\ {} 0\hspace{1em}& A-{\omega ^{-1}}z{z^{\top }}\hspace{1em}& -{B_{2}^{\top }}-{\omega ^{-1}}z{B_{2b}^{\top }}\\ {} -{B_{1}}-{\omega ^{-1}}{B_{1b}}{z^{\top }}\hspace{1em}& -{B_{2}}-{\omega ^{-1}}{B_{2b}}{z^{\top }}\hspace{1em}& -{\omega ^{-1}}({B_{1b}}{B_{1b}^{\top }}+{B_{2b}}{B_{2b}^{\top }})\end{array}\right]\]

and the right-hand side

(4.13)

\[ \left[\substack{{f_{1}}-{\omega ^{-1}}z{f_{1b}}\\ {} {f_{2}}-{\omega ^{-1}}z{f_{2b}}\\ {} -{\omega ^{-1}}({B_{1b}}{f_{1b}}+{B_{2b}}{f_{2b}})}\right].\]

4.2 Three-Dimensional Case

Let ${\{{\boldsymbol{x}_{i}}=({x_{i}},{y_{i}},{z_{i}})\}_{i=1,\dots ,4}}$ be the vertices of a tetrahedron T and ${\{{\phi _{i}}\}_{i=1,\dots ,4}}$ the corresponding basis functions. The gradient ${\nabla _{\boldsymbol{x}}}{\phi _{i}}$ over T are given by

\[ \left[\substack{\nabla {\phi _{1}^{\top }}\\ {} \nabla {\phi _{2}^{\top }}\\ {} \nabla {\phi _{3}^{\top }}\\ {} \nabla {\phi _{4}^{\top }}}\right]={\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}1\hspace{1em}& 1\hspace{1em}& 1\hspace{1em}& 1\\ {} {x_{1}}\hspace{1em}& {x_{2}}\hspace{1em}& {x_{3}}\hspace{1em}& {x_{4}}\\ {} {y_{1}}\hspace{1em}& {y_{2}}\hspace{1em}& {y_{3}}\hspace{1em}& {y_{4}}\\ {} {z_{1}}\hspace{1em}& {z_{2}}\hspace{1em}& {z_{3}}\hspace{1em}& {z_{4}}\end{array}\right]^{-1}}\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}0\hspace{1em}& 0\hspace{1em}& 0\\ {} 1\hspace{1em}& 0\hspace{1em}& 0\\ {} 0\hspace{1em}& 1\hspace{1em}& 0\\ {} 0\hspace{1em}& 0\hspace{1em}& 1\end{array}\right].\]

An alternative formula for computing $\nabla {\phi _{i}}$ is

\[ {\nabla _{\boldsymbol{x}}}{\phi _{i}}={J^{-1}}{\nabla _{\boldsymbol{\xi }}}{\phi _{i}}(\boldsymbol{\xi }),\]

where J is the Jacobean matrix of the mapping $\boldsymbol{\xi }\mapsto \boldsymbol{x}(\boldsymbol{\xi })$, that is,

(4.14)

\[ J=\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c}{x_{2}}-{x_{1}}\hspace{1em}& {y_{2}}-{y_{1}}\hspace{1em}& {z_{2}}-{z_{1}}\\ {} {x_{3}}-{x_{1}}\hspace{1em}& {y_{3}}-{y_{1}}\hspace{1em}& {z_{3}}-{z_{1}}\\ {} {x_{4}}-{x_{1}}\hspace{1em}& {y_{4}}-{y_{1}}\hspace{1em}& {z_{4}}-{z_{1}}\end{array}\right].\]

The volume of tetrahedron T is given, from (4.14), by $|T|=\mathrm{det}(J)/6$.

As for 2D case, the nonbubble entries of the element stiffness matrix are given by

(4.15)

\[ {\bar{R}_{bj}}=\frac{9}{4}|T|{\sum \limits_{i=1}^{3}}\nabla {\phi _{i}}=0,\hspace{1em}j=1,2,3,4,\]

while ${\bar{R}_{bj}}=0$, for all $j=1,\dots ,4$. For the diagonal entry, using (3.4)–(3.5), we obtain

(4.16)

\[\begin{aligned}{}{\bar{R}_{bb}}=& \nu {\int _{T}}{256^{2}}\nabla ({\phi _{1}}{\phi _{2}}{\phi _{3}}{\phi _{4}})\cdot \nabla ({\phi _{1}}{\phi _{2}}{\phi _{3}}{\phi _{4}})\mathrm{d}x\\ {} =& \frac{8192}{845}\nu |T|\left[{\sum \limits_{\ell =1}^{3}}|\nabla {\phi _{\ell }}{|^{2}}+\nabla {\phi _{1}}\cdot \nabla {\phi _{2}}+\nabla {\phi _{1}}\cdot \nabla {\phi _{3}}+\nabla {\phi _{2}}\cdot \nabla {\phi _{3}}\right]\\ {} =:& {\omega _{R}}.\end{aligned}\]

For the 3D mass matrix, a straightforward calculation with a linear tetrahedron shows that (for nonbubble entries)

(4.17)

\[ {\bar{M}_{ij}}=\left\{\begin{array}{l@{\hskip4.0pt}l}{\alpha _{T}}\frac{|T|}{10}\hspace{1em}& \text{if}\hspace{2.5pt}i=j,\\ {} {\alpha _{T}}\frac{|T|}{20}\hspace{1em}& \text{if}\hspace{2.5pt}i\ne j,\end{array}\right.\]

where ${\alpha _{T}}$ is a mean value of α on T. The bubble part of the mass matrix is given by

(4.18)

\[\begin{aligned}{}{\bar{M}_{bj}}=& \frac{8\alpha }{105}|T|,\hspace{1em}j=1,2,3,\\ {} {\bar{M}_{bb}}=& \frac{8192\alpha }{51975}|T|=:{\omega _{M}}.\end{aligned}\]

The element mass matrix is therefore

\[ \bar{M}=\left(\begin{array}{c@{\hskip4.0pt}c}M\hspace{1em}& z\\ {} {z^{\top }}\hspace{1em}& {\omega _{M}}\end{array}\right),\]

where

\[ {z^{\top }}=\frac{8\alpha }{105}|T|\left(\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}1\hspace{1em}& 1\hspace{1em}& 1\hspace{1em}& 1\end{array}\right).\]

If we set $-\bar{B}=[-{\bar{B}_{1}}\hspace{2.5pt}-{\bar{B}_{2}}\hspace{2.5pt}-{\bar{B}_{3}}]$ and ${\bar{B}_{i}}=[{B_{i}}\hspace{2.5pt}-{B_{ib}}]$, a straightforward calculation yields to

(4.19)

\[ {B_{i}}=\frac{|T|}{4}\left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}{\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\hspace{1em}& {\partial _{i}}{\phi _{4}}\\ {} {\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\hspace{1em}& {\partial _{i}}{\phi _{4}}\\ {} {\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\hspace{1em}& {\partial _{i}}{\phi _{4}}\\ {} {\partial _{i}}{\phi _{1}}\hspace{1em}& {\partial _{i}}{\phi _{2}}\hspace{1em}& {\partial _{i}}{\phi _{3}}\hspace{1em}& {\partial _{i}}{\phi _{4}}\end{array}\right],\hspace{1em}i=1,\dots ,4,\]

and

(4.20)

\[ {B_{ib}}=\frac{32}{105}|T|\left[\substack{{\partial _{i}}{\phi _{1}},\\ {} {\partial _{i}}{\phi _{2}}\\ {} {\partial _{i}}{\phi _{3}}\\ {} {\partial _{i}}{\phi _{4}}}\right].\]

The contribution of the right-hand side component ${f_{i}}$, in nonbubble terms, is given by

(4.21)

\[ {f_{i}^{(T)}}=\frac{|T|}{4}{f_{iT}},\hspace{1em}i=1,2,3,4,\]

where ${f_{iT}}$ is a mean value of ${f_{i}}$ on T. The bubble terms of the right-hand side are

(4.22)

\[ {f_{ib}^{(T)}}={\int _{T}}{f_{i}}{\phi _{b}}\mathrm{d}x=\frac{32}{105}|T|{f_{iT}},\hspace{1em}i=1,2,3.\]

As in 2D, we rearrange the Stokes system and we get

(4.23)

\[ {u_{ib}}=\big({f_{ib}}-{B_{ib}^{\top }}p-{z^{\top }}{u_{i}}\big)\big/\omega ,\hspace{1em}i=1,\dots ,4.\]

Substituting (4.23) into the rest of the system, we obtain a linear system whose matrix is

(4.24)

\[ \left[\begin{array}{c@{\hskip4.0pt}c@{\hskip4.0pt}c@{\hskip4.0pt}c}A-{\omega ^{-1}}z{z^{\top }}\hspace{1em}& 0\hspace{1em}& 0\hspace{1em}& -{B_{1}^{\top }}-{\omega ^{-1}}z{B_{1b}^{\top }}\\ {} 0\hspace{1em}& A-{\omega ^{-1}}z{z^{\top }}\hspace{1em}& 0\hspace{1em}& -{B_{2}^{\top }}-{\omega ^{-1}}z{B_{2b}^{\top }}\\ {} 0\hspace{1em}& 0\hspace{1em}& A-{\omega ^{-1}}z{z^{\top }}\hspace{1em}& -{B_{3}^{\top }}-{\omega ^{-1}}z{B_{3b}^{\top }}\\ {} -{B_{1}}-{\omega ^{-1}}{B_{1b}}{z^{\top }}\hspace{1em}& -{B_{2}}-{\omega ^{-1}}{B_{2b}}{z^{\top }}\hspace{1em}& -{B_{3}}-{\omega ^{-1}}{B_{3b}}{z^{\top }}\hspace{1em}& -{\omega ^{-1}}{\textstyle\textstyle\sum _{i=1}^{3}}{B_{ib}}{B_{ib}^{\top }}\end{array}\right]\]

and the right-hand side

(4.25)

\[ \left[\substack{{f_{1}}-{\omega ^{-1}}z{f_{1b}}\\ {} {f_{2}}-{\omega ^{-1}}z{f_{2b}}\\ {} -{\omega ^{-1}}{\textstyle\textstyle\sum _{i=1}^{3}}{B_{ib}}{B_{ib}^{\top }}}\right].\]

5 Uzawa Conjugate Gradient Algorithm

We propose in this section a preconditioned conjugate gradient method for solving the Stokes system after the assembly of the element systems (4.12)–(4.13) and (4.24)–(4.25). For a finite element pair of the form ${P_{k+1}}/{P_{k}}$ (e.g. $P2/P1$ or $P1$-iso-$P2$/$P1$), the preconditioner advocated by Cahouet and Chabard (1988) (see also Fortin and Glowinski, 1983; Glowinski, 2003; Glowinski and Le Tallec, 1989) is efficient and widely used. In the author’s knowledge, there is no equivalent preconditioner for the pair $P1$-Bubble/$P1$ (or $P1/P1$ with stabilization). The algorithm presented in his section is therefore an original contribution.

The Stokes system can be rewritten in a compact form

(5.1)

\[ \left[\begin{array}{c@{\hskip4.0pt}c}\mathbf{A}\hspace{1em}& -{\mathbf{B}^{\top }}\\ {} -\mathbf{B}\hspace{1em}& -\mathrm{C}\end{array}\right]\left[\substack{\mathbf{u}\\ {} \mathrm{p}}\right]=\left[\substack{\mathbf{f}\\ {} {\mathrm{f}_{p}}}\right],\]

where A is (symmetric) positive definite block diagonal matrix, C is positive semi-definite matrix.

Let us introduce the generalized Lagrangian function

(5.2)

\[ \mathcal{L}(\mathbf{u},\mathbf{p})=\frac{1}{2}{\mathbf{u}^{\top }}\mathbf{A}\mathbf{u}-{\mathbf{f}^{\top }}\mathbf{u}-{\mathrm{p}^{\top }}\mathbf{B}\mathbf{u}-\frac{1}{2}{\mathrm{p}^{\top }}\mathrm{C}\mathrm{p}-{\mathrm{f}_{p}^{\top }}\mathrm{p}.\]

Due to the properties of A and C, the saddle-point for (5.2) exists. Then it follows that (5.1) is the saddle-point equation for (5.2), that is, (5.1) characterizes the solution of the saddle-point problem

(5.3)

\[ \underset{\mathbf{u}}{\min }\underset{\mathrm{p}}{\max }\mathcal{L}(\mathbf{u},\mathrm{p})=\underset{\mathrm{p}}{\max }\underset{\mathbf{u}}{\min }\mathcal{L}(\mathbf{u},\mathrm{p}).\]

5.1 Uzawa Conjuage Gradient Algoritm

To derive a dual (Uzawa) algorithm for (5.1), we assume that $\mathbf{u}=\mathbf{u}(\mathrm{p})$ is the solution of Poisson equation

(5.4)

\[ \mathbf{A}\mathbf{u}=\mathbf{f}+{\mathbf{B}^{\top }}\mathrm{p},\]

that is, in the decomposed useful form

\[ {\mathrm{A}_{i}}{\mathrm{u}_{i}}={\mathrm{f}_{i}}+{\mathrm{B}_{i}^{\top }}\mathrm{p},\hspace{1em}i=1,\dots ,d.\]

Then multiplying (5.4) by u and substituting the result in (5.2), we obtain

(5.5)

\[ \mathcal{L}(\mathbf{u}(\mathrm{p}),\mathrm{p})=-\frac{1}{2}{\mathbf{u}^{\top }}\mathbf{A}\mathbf{u}-\frac{1}{2}{\mathrm{p}^{\top }}\mathrm{C}\mathrm{p}-{\mathrm{f}_{p}^{\top }}\mathrm{p}.\]

If we introduce the dual functional

\[ {J^{\ast }}(\mathrm{p})=-\underset{\mathbf{u}}{\min }\mathcal{L}(\mathbf{u}(\mathrm{p}),\mathrm{p})\]

with $\mathbf{u}(\mathrm{p})$ solution of (5.4), the saddle-point problem (5.3) becomes.

Find p such that

(5.6)

\[ {J^{\ast }}(\mathrm{p})\leqslant {J^{\ast }}(\mathrm{q}),\hspace{1em}\forall q.\]

From (5.5), ${J^{\ast }}$ is quadratic and coercive. From (5.4), we deduce that the mapping $\mathrm{p}\mapsto \mathbf{u}(\mathrm{p})$ is linear and $\mathbf{u}(\mathrm{p}+t\mathrm{d})=\mathbf{u}(\mathrm{p})+t\mathbf{w}$ where w is the solution of the sensitivity problem

(5.7)

\[ \mathbf{A}\mathbf{w}={\mathbf{B}^{\top }}\mathrm{d},\]

\[ {\mathrm{A}_{i}}{\mathrm{w}_{i}}={\mathrm{B}_{i}^{\top }}\mathrm{d},\hspace{1em}i=1,\dots ,d.\]

It follows that the derivative of ${J^{\ast }}$ is

(5.8)

\[ \mathrm{r}:=\nabla {J^{\ast }}(\mathrm{p})=\mathbf{B}\mathbf{u}+\mathrm{C}\mathrm{p}+{\mathrm{f}_{p}}.\]

With a search direction d, we compute an optimal stepsize ρ by solving

\[ \nabla {J^{\ast }}{(\mathrm{p}+\rho \mathrm{d})^{\top }}\mathrm{d}=0,\]

that is,

\[ \rho =-{\mathrm{d}^{\top }}(\mathbf{B}\mathbf{w}+\mathrm{C}\mathrm{d})/({\mathrm{r}^{\top }}\mathrm{d}),\]

where w is the solution of the sensitivity equation (5.7).

Since ${J^{\ast }}$ is a quadratic function, the best search direction is a conjugate gradient direction. As a consequence, the best algorithm for (5.6) is a conjugate gradient algorithm. At each iteration k, the Fletcher–Reeves conjugate gradient direction is given by

\[\begin{array}{l}\displaystyle {\mathrm{d}_{k}}={\mathrm{r}_{k+1}}+{\beta _{k}}{\mathrm{d}_{k}},\\ {} \displaystyle {\beta _{k}}={\left\| {\mathrm{r}_{k+1}}\right\| ^{2}}{\left\| {\mathrm{r}_{k}}\right\| ^{-2}}.\end{array}\]

A dual (Uzawa) conjugate gradient algorithm for solving the saddle-point problem (5.1) is outlined in Algorithm 1. Theoretically, Algorithm 1 converges in at most ${n_{B}}=\mathrm{rank}(\mathbf{B})$ iterations. Obviously, for large scale problems, ${n_{B}}$ is very large and it is preferable that convergence should be obtained in a number of iterations considerably less than ${n_{B}}$.

Algorithm 1

Uzawa conjugate gradient algorithm for (5.1).

5.2 Preconditioned Uzawa Conjugate Gradient Algorithm

A practical implementation of a conjugate gradient method for solving (5.1) requires a preconditioner, that is, a suitable scalar product for computing $\nabla {J^{\ast }}(\mathrm{p})$ instead of the standard one used in (5.8). The convergence properties of the conjugate gradient method for the generalized Stokes problem are deteriorated for large values of the ratio $\alpha /\nu $, see e.g. (Glowinski, 2003; Glowinski and Le Tallec, 1989).

To derive a preconditioner for (5.1) following the idea of Cahouet and Chabard (1988), Glowinski and Guidoboni (2009), we need to simplify the continuous problem. We first notice that the equivalence between $P1$-Bubble/$P1$ element and the stabilized formulation has been proved (Pierre, 1987, 1989; 1995; Matsumoto, 2005; Baiocchi et al., 1993). Then, if we neglect the bubble contribution in the stiffness and divergence matrices, (5.1) can be expressed in the strong form as

(5.9)

\[ \alpha \boldsymbol{u}-\nu \Delta \boldsymbol{u}+\nabla p=f,\]

(5.10)

\[ \nabla \cdot \boldsymbol{u}-\nabla \cdot ({\nu _{h}}\nabla \boldsymbol{u})=0,\]

where ${\nu _{h}}$ is an element dependent constant.

As in Glowinski and Guidoboni (2009) we define the linear operator from ${L^{2}}(\Omega )$ into ${L^{2}}(\Omega )$ (neglecting the constant term in (5.8))

(5.11)

\[ \phi :=\mathcal{A}q=\nabla \cdot {\boldsymbol{u}_{q}}-\nabla \cdot ({\nu _{h}}\nabla q),\]

where ${\boldsymbol{u}_{q}}$ is the solution of

(5.12)

\[ \alpha {\boldsymbol{u}_{q}}-\nu \Delta {\boldsymbol{u}_{q}}=-\nabla q.\]

Note that (5.12) is the (strong) sensitivity system. The idea behind preconditioning is to find a linear operator $\mathcal{B}$ such that $\mathcal{B}\mathcal{A}q=q$. Applying the divergence operator in (5.12), we obtain

\[ \alpha \nabla \cdot {\boldsymbol{u}_{q}}-\nu \nabla \cdot \Delta {\boldsymbol{u}_{q}}=-\Delta q\]

(5.13)

\[ \alpha \nabla \cdot {\boldsymbol{u}_{q}}-\nu \Delta (\nabla \cdot {\boldsymbol{u}_{q}})=-\Delta q.\]

Using (5.11) in (5.13), we obtain

(5.14)

\[ -\Delta q-\alpha \nabla \cdot ({\nu _{h}}\nabla q)+\nu \Delta \big(\nabla \cdot ({\nu _{h}}\nabla q)\big)=\alpha \phi -\nu \Delta \phi .\]

Then, in practice, at each step of the preconditioned conjugate gradient algorithm, the gradient of ${J^{\ast }}$ is computed as an approximate solution of the linear system

(5.15)

\[ (K+\alpha C+\nu K{M^{-1}}C)\mathrm{g}=(\alpha M+\nu K)\mathrm{r},\]

where K and M are the stiffness and the mass matrices, respectively. Let us introduce the mesh Reynolds number

\[ R{e_{h}}=\frac{\alpha }{\nu }{h^{2}},\]

where h is the mesh size. Taking into account the CPU time and the storage requirement for computing the last term of the matrix involved in (5.15), we consider the following preconditioning system instead

(5.16)

\[ (K+\alpha C)\mathrm{g}=\mathrm{H}\mathrm{r},\]

where

(5.17)

\[ \mathrm{H}=\left\{\begin{array}{l@{\hskip4.0pt}l}\nu K+\alpha \mathrm{diag}(M)\hspace{1em}& \text{if}\hspace{2.5pt}R{e_{h}}\leqslant 1,\\ {} \nu K+\alpha \mathbb{I}\hspace{1em}& \text{if}\hspace{2.5pt}R{e_{h}}>1.\end{array}\right.\]

In (5.16) and (5.17), K and M are $P1$ stiffness and mass matrix, respectively, and C, the bubble matrix from Functions 6.1–6.3.

Algorithm 2

Preconditioned Uzawa conjugate gradient algorithm for (5.1).

From the theory of preconditioned conjugate gradient methods for the Stokes problem (see e.g. Glowinski, 2003), if Dirichlet conditions are imposed for the velocity, then for g in (5.14) we must impose $\partial \mathrm{g}/\partial n=0$ (homogeneous Neumann boundary conditions). On the other hand, where a stress condition is prescribed for the fluid, we must impose $\mathrm{g}=0$ in (5.14).

Remark 1.

The preconditioning system (5.15), that is,

\[ {(\alpha M+\nu K)^{-1}}(K+\alpha C+\nu KC)\mathrm{g}=\mathrm{r},\]

induces (over the discrete pressure space) the norm $|\mathrm{g}{|_{P}}={\mathrm{g}^{\top }}\mathrm{r}$.

Remark 2.

If $C=0$ in (5.15), we recover the preconditioning system of Cahouet and Chabard (1988) widely used for $P2/P1$ or $P1$-iso-$P2/P1$ finite elements.

With the preparation given in the previous section, the preconditioned Uzawa conjugate gradient algorithm for solving the Stokes system (5.1) is detailed in Algorithm 2. Algorithm 2 inherits the main properties from the conjugate gradient algorithm of Cahouet and Chabard (1988):

• At each step, we solve d independent linear systems for the velocity field. The matrices involved are sparse and can be factorized or preconditioned once and for all.
• The same algorithm is able to deal with 2D or 3D systems without any complication.

6 MATLAB Implementation

We know in detail the assembly of the Stokes systems (4.12)–(4.13) and (4.24)–(4.25). For the computational efficiency, the MATLAB codes must be vectorized (i.e. without long for-loops). We then use arrays, cell-arrays, and MATLAB vectorized operators and functions

.*,   ./,  .^,  sum, sparse

6.1 Mesh Representation and KPDE Package

We assume that the triangulation of Ω consists of np nodes and nt elements (triangles or tetrahedrons). We adopt the mesh representation by arrays used in Koko (2007; 2016), Persson and Strang (2004). The nodes coordinates are stored in an array p(1:np,1:2) (in 2D) or p(1:np,1:3) (in 3D). The element nodes are stores in an array t(1:nt,1:3) (in 2D) or t(1:nt,1:4) (in 3D). Dirichlet boundary conditions are provided by a list of nodes and the corresponding prescribed values.

As shown in Koko (2016), using cell-array in FEM allows to have implementation close to the standard form used in classical languages (C/C++, FORTRAN) FEM codes, while being efficient. The idea is to compute and store, for all triangles, the element matrix entry ${A_{ij}^{(T)}}$ in the cell-array At{i,j}. Then we use MATLAB function sparse to assemble A with finite small for loops

for i=1:nd
for j=1:nd
   A=A+sparse(t(:,i),t(:,j),At{i,j},np,np);
end
end

where nd=3 (in 2D) or nd=4 (in 3D). This approach is used in KPDE package (Koko, 2016) for assembling matrices and vectors from Poisson and linear elasticity equations in 2D and 3D.

We need two key functions from KPDE package, kpde2dgphi and kpde3dgphi, which compute the gradient of the basis functions and the elements volume as follows

[ar,g]=kpde2dgphi(p,t);     %2D
[vol,g]=kpde3dgphi(p,t);    %3D

g is 3-by-1 or 4-by-1 cell-array such that g{i}(:,k) is ${\partial _{k}}{\phi _{i}}(x)$ for all elements.

6.2 2D Case

Using the cell array g and the array ar, the nonbubble entry ${\bar{R}_{ij}}$, (4.2), of the element stiffness matrix is then given (for all triangles) by

Rij=nu*ar.*sum(g{i}.*g{j},2).

The bubble diagonal entry ${\omega _{R}}$, (4.3), is computed as follows for all triangles

omega_r=(81/10)*nu.*ar.*(sum(g1.^2,2)+sum(g2.^2,2)
         +sum(g1.*g2,2)).

The nonbubble entry ${M_{ij}}$, (4.4), of the element mass matrix is,

Mii=alpha.*ar/6 ;       % diagonal
Mij=alpha.*ar/12;       % off-diagonal.

The bubble entries (4.5) of the element mass matrix are given by

omega_m=(81/280)*alpha.*ar;    % diagonal
Mbj=(3/20)*alpha.*ar};         % off-diagonal.

The entries $(i,j)$ of the element divergence matrix (4.6) are vectorized as follows

B1ij=ar.*g{i}(:,1)/3; B2ij=ar.*g{i}(:,2)/3;

while the bubble entries (4.7) are

B1ib=(9/10)*ar.*g{i}(:,1); B2ib=(9/10)*ar.*g{i}(:,2).

The vectorized element contribution of the right-hand side (4.8) is

f1=(1/3)*ar.*f1t; f2=(1/3)*ar.*f2t;

while the bubble entries (4.9) are

f1b=(9/20)*ar.*f1t; f2b=(9/20)*ar.*f2t.

Function 6.1

Assembly of the 2D Stokes matrix (4.12).

Function 6.2

Assembly of the Stokes right-hand side (4.13).

MATLAB Functions 6.1–6.2 assemble the Stokes system. To make the assembling functions self-contained, all calculations are integrated except elements area and the gradient of basis functions which can be computed outside and passed as argument. Note that in 2D, the time for computing the triangles area and the gradient of the basis functions is not significant. Functions 6.1–6.2 can be called without the last two arguments without a significant computational overcost.

Instead of matrix (4.12) and vector (4.13), Functions 6.1–6.2 can return submatrices and subvectors of (4.12)–(4.13) if called with more than one output arguments. The statement

[A,B,C]=kstok2dp1bmat(p,t,nu,alpha)

returns the velocity stiffness matrix A, the divergence matrix B=[B1 B2] and the pressure (stiffness) matrix C. Similarly,

[b,bp]=kstok2dp1brhs(p,t,nu,alpha)

returns the velocity right-hand side b=[b1; b2] and the pressure right-hand side bp. These submatrices are used in the preconditioned Uzawa conjugate gradient method presented in Section 5.

6.3 3D Case

Using the cell-array g and the element volume vol, computed by the KPDE function kpde3dgphi, the element stiffness matrices are computed simultaneously on all elements by

Rij=nu*vol.*sum(g{i}.*g{j},2);

while bubble diagonal entry (4.16) is given by

omega_R=(8192/945)*nu.*vol.*(sum(g{1}.^2,2)+sum(g{2}.^2,2)
        +sum(g{3}.^2,2)...+sum(g{1}.*g{2},2)
        +sum(g{1}.*g{3},2)+sum(g{2}.*g{3},2)).

The nonbubble entries of element mass matrix (4.17) are computed by

Mii=alpha.*vol/10;      % diagonal
Mij=alpha.*vol/20;      % off diagonal.

The bubble part of the mass matrix (4.18) is given by

omega_m=(8192/51975)*alpha.*vol;    % diagonal
Mbj=(8/105)*alpha.*vol;             % off-diagonal.

The entries $(i,j)$ of the element divergence matrices (4.19) are computed simultaneously as follows

B1ij=vol.*g{i}(:,1)/4;  B2ij=vol.*g{i}(:,2)/4;
B3ij=vol.*g{i}(:,3)/4;

while the bubble entries are

B1ib=(32/105)*vol.*g{i}(:,1); B2ib=(32/105)*vol.*g{i}(:,2);
B3ib=(32/105)*vol.*g{i}(:,3).

The element contributions of the right-hand side (4.21)–(4.22) are given by

f1=vol.*f1t/4; f2=vol.*f2t/4; f3=vol.*f3t/4;
f1b=(32/105)*vol*f1t;   f2b=(32/105)*vol*f2t;
f3b=(32/105)*vol*f3t.

Function 6.3

Assembly of the 3D Stokes matrix (4.24).

Function 6.4

Assembly of the 3D Stokes right-hand side (4.25).

MATLAB Functions 6.3–6.4 assemble the three-dimensional Stokes system. For computational efficiency, the elements volume vol and the gradient of the basis functions g can be computed once and for all, and passed to Functions 6.3–6.4. For more flexibility, vol and g can also be computed inside Functions 6.3–6.4 if they do not appear in the list of input arguments. As in 2D, Functions 6.3–6.4 can return submatrices and subvectors used in the preconditioned Uzawa conjugate gradient algorithm, if called with more than one output argument.

6.4 Preconditioned Uzawa Conjugate Gradient Algorithm

In our MATLAB implementation the same function (i.e. kstokcg) is used for 2D and 3D problems. For this, we use cell-arrays to store the component system informations: Cholesky factors, permutation vectors, right-hand sides. For instance, for the 3D Stokes problem we form

R={R1 R2 R3};   % Cholesky factors
s={s1 s2 s3};    % Permutations vectors
b={b(1:np) b(np+1:2*np) b3(2*np+1:3*np)}; %Right-hand sides.

Then in the conjugate gradient function, the velocity systems are solved using the for-loop

for i=1:nd
     w{i}(s{i})=R{i}’\(R{i}\b{i}(s{i}));
end

where nd=3. An alternative implementation is to form the block diagonal matrix R=blkdiag(R1, R2, R3), the permutation vector s=[s1 s2 s3] such that e.g. in Step k.1, we solve

w(s)=R’\(R\b(s)).

But for large scale 3D problems, computing R’\(R\b(s)) requires a large amount of memory and can fail.

7 Numerical Experiments

We now propose some numerical experiments to demonstrate the performances of our implementation. The computations have been carried out on a Dell Precision T3610 work station equipped with Intel Xeon 3.0GHz processor with 32GB RAM. The MATLAB version is 9 (R2016a).

7.1 Scalability

We first study the scalability of our MATLAB codes: We consider the discretization of a unit cube ${(0,\hspace{2.5pt}1)^{d}}$ ($d=2,3$) with a uniform mesh of size h, with nt triangles (or tetrahedrons) and np nodes. This initial mesh is successively uniformly refined to produce meshes of size $h/2$, $h/4$, $h/8$, etc. After each refinement the number of triangles is multiplied by 4 (2D) and the number of tetrahedraons by 8 (3D). Since the assembly process is essentially based on the number of elements, we expect that the time to assemble the matrices increases by approximately the same factor, i.e. 5–6 in 2D and 8–10 in 3D as observed in Koko (2016) for Poisson equation and linear elasticity. Tables 1–2 show the assembly CPU times (in Seconds) for the Stokes system in 2D and 3D, respectively. We can notice an almost linear optimal time-scaling for our implementation.

Table 1

CPU times (in seconds) for assembling the Stokes matrix system of size N in 2D.

h	1/32	1/64	1/128	1/256	1/512	1/1024
N	3*1089	3*4225	3*16641	3*66049	3*263169	3*1050625
A	0.018	0.071	0.325	1.516	7.242	36.390
b	0.002	0.011	0.050	0.251	1.214	5.995

Table 2

CPU times (in seconds) for assembling the Stokes matrix system of size N in 3D.

h	1/4	1/8	1/16	1/32	1/64	1/128
N	4*125	4*729	4*4913	4*35937	4*274625	4*2146689
A	0.075	0.057	0.397	3.438	34.538	348.249
b	0.027	0.018	0.051	0.465	4.544	46.394

Table 3

Comparative performances of MATLAB direct solvers and Algorithm 2 for the 2D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Mesh size h	1/32	1/64	1/128	1/256	1/512
Gaussian elim. CPU	0.02	0.07	0.40	2.11	10.88
${\mathrm{LDL}^{\top }}$ CPU	0.02	0.17	1.22	7.13	46.11
Algorithm 2 CPU	0.02	0.08	0.35	1.94	12.01

7.2 Factorization Versus Uzawa Conjugate Gradient

We now consider a Stokes flow in a driven cavity $\Omega ={(0,\hspace{2.5pt}1)^{d}}$ with $\boldsymbol{f}=0$ in (2.1) and

2D:

${\Gamma _{1}}=(0,\hspace{2.5pt}1)\times \{1\}$, $\boldsymbol{u}={(1,\hspace{2.5pt}0)^{\top }}$ on ${\Gamma _{1}}$, and $\boldsymbol{u}=0$ on $\partial \Omega \setminus {\Gamma _{1}}$;

3D:

${\Gamma _{1}}=(0,\hspace{2.5pt}1)\times (0,\hspace{2.5pt}1)\times \{1\}$, $\boldsymbol{u}={(1,\hspace{2.5pt}0,\hspace{2.5pt}0)^{\top }}$ on ${\Gamma _{1}}$, and $\boldsymbol{u}=0$ on $\partial \Omega \setminus {\Gamma _{1}}$.

Ω is discretized by uniform meshes of size $1/16$, $1/32$, $1/64$, $1/128$, $1/256$ and $1/512$ in 2D, and $1/4$, $1/8$, $1/16$, $1/32$ and $1/64$ in 3D. Since there are powerful linear (direct) solvers in MATLAB, we first compare our conjugate gradient algorithm with

• \ (backslash) the standard MATLAB solver for general matrix based on Gaussian elimination;
• ldl the block ${\mathit{LDL}^{\top }}$ factorization for symmetric indefinite systems.

Table 3 shows the comparative performances for the 2D Stokes problem with $\nu =1/50$ and $\alpha /\nu ={10^{3}}$. We can notice that the proposed Uzawa conjugate gradient (Algorithm 2) and the MATLAB built-in Gaussian elimination are almost equivalent even though, in Algorithm 2, the component systems are uncoupled and can be solved in parallel. The CPU times for ${\mathit{LDL}^{\top }}$ include CPU time for the factorization and columns and rows permutation to reduce fill-in that represents up to 90% of the whole CPU time.

If the Stokes problem is used in an iterative process (e.g. time stepping or linearization), then Algorithm 2 or ${\mathit{LDL}^{\top }}$ factorization are preferable. Indeed, if a ${\mathit{LDL}^{\top }}$ is carried out (once and for all) in the initialization step, then the solution of linear system reduces to forward/backward substitutions in the rest of the iterative process. The computational cost of Algorithm 2 can be reduced by using, as initial solution at the current step, the solution of the previous step.

For the 3D Stokes problem the proposed Algorithm 2 outperforms the MATLAB direct solvers, Table 4. For the largest problem ($4\ast {65^{3}}=1\hspace{0.1667em}098\hspace{0.1667em}500$ unknowns) the direct solvers fail because of lack of memory due to fill-in during factorization. Table 5 shows the performances of the MATLAB gmres iterative solver using incomplete $LU$ factorization as preconditioner (MATLAB function ilu with ${10^{-3}}$ as drop tolerance). The value of restart parameter is 10. GMRES algorithm outperforms Algorithm 2 up to $h=1/32$. But for the largest problem, Algorithm 2 is more than four times faster. It is clear that for large scale 3D problems, the proposed Uzawa algorithm is preferable. Table 6 shows the good convergence properties of the proposed Uzawa conjugate gradient algorithm when $\alpha /\nu \gg 1$.

Table 4

Comparative CPU times (in sec.) of MATLAB direct solvers and Algorithm 2 for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$, (OoM = Out of Memory).

Mesh size h	1/4	1/8	1/16	1/32	1/64
Gaussian elim.	0.01	0.05	1.73	145.44	OoM
${\mathrm{LDL}^{\top }}$	0.02	0.08	6.29	128.54	OoM
Algorithm 2	0.04	0.89	1.70	78.08	1894.59

Table 5

Performances of the gmres iterative solver for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Mesh size h	1/4	1/8	1/16	1/32	1/64
CPU Times (sec.)	0.04	0.02	1.66	39.61	8230.59

Table 6

Number of iterations versus $\alpha /\nu $ for the 3D driven cavity problem for $h=1/32$.

$\alpha /\nu $	${10^{1}}$	${10^{2}}$	${10^{3}}$	${10^{4}}$	${10^{5}}$	${10^{6}}$	${10^{7}}$
$\nu =1/50$	44	39	41	28	13	8	7
$\nu =1/200$	37	34	33	26	13	8	7
$\nu =1/1000$	31	28	27	24	18	8	7

7.3 2D Visualization

In two-dimensional incompressible fluid problems, it is usual to display the stream-lines. If the domain Ω is bounded and simply connected, in order to compute the stream-function ψ, we have to solve the Poisson–Neumann problem

(7.1)

\[\begin{aligned}{}-\Delta \psi =& \omega ,\hspace{2.5pt}\text{in}\hspace{2.5pt}\Omega ,\end{aligned}\]

(7.2)

\[\begin{aligned}{}{\partial _{n}}\psi =& -\boldsymbol{u}\cdot \tau ,\end{aligned}\]

where $\omega ={\partial _{1}}{u_{2}}-{\partial _{2}}{u_{1}}$ is the vorticity and τ the counter-clockwise oriented unit tangent vector at Γ. Problem (7.1)–(7.2) has a unique solution in ${H^{1}}(\Omega )/\mathbb{R}$. The variational formulation of (7.1)–(7.2) is

Find $\psi \in {H^{1}}(\Omega )$

(7.3)

\[ {(\nabla \psi ,\nabla \varphi )_{\Omega }}={({u_{1}},{\partial _{2}}\varphi )_{\Omega }}-{({u_{2}},{\partial _{1}}\varphi )_{\Omega }},\hspace{1em}\forall \varphi \in {H^{1}}(\Omega ).\]

This leads to the following algebraic system using $P1$ finite element

\[ R\psi ={B_{2}^{\top }}{\mathrm{u}_{1}}-{B_{1}^{\top }}{\mathrm{u}_{2}},\]

where R is the 2D Laplacian matrix. We impose $\psi =0$ at an arbitrary node to ensure the uniqueness.

Fig. 1

Mesh sample of the flow around cylinder problem.

We now consider a test problem derived from a benchmark problem described in Schäfer and Turek (1996). A mesh sample is shown in Fig. 1. The inflow and outflow conditions (on left/right boundaries) are

\[\begin{array}{l}\displaystyle {u_{1}}=\frac{0.3}{{0.41^{2}}}\times 4y(0.41-y),\hspace{2.5pt}{u_{2}}=0\hspace{1em}\text{on}\hspace{2.5pt}{\Gamma _{in}}=\{0\}\times (0,0.41),\\ {} \displaystyle {u_{1}}=\frac{0.3}{{0.41^{2}}}\times 4y(0.41-y),\hspace{2.5pt}{u_{2}}=0\hspace{1em}\text{on}\hspace{2.5pt}{\Gamma _{out}}=\{2.2\}\times (0,0.41).\end{array}\]

On the other parts of the boundary of Ω, homogeneous boundary conditions are prescribed (i.e. $\boldsymbol{u}=0$). The parameter α in (2.1) is set to 0. The centre of the internal cylinder is $(0.25,0.2)$ and the diameter is 0.1. The kinematic viscosity is $\nu ={10^{-3}}$. This gives a Reynolds number of $Re=30$ based on the diameter of the cylinder and the maximum of the inflow velocity. The domain is discretized by a non uniform mesh consisting of 1730 nodes and 3280 triangles, Fig. 1. The velocity field obtained with MATLAB command

quiver(p(:,1),p(:,2),u1,u2)

is shown in Fig. 2. Isobar lines of Fig. 3 are obtained using the contour plotting function kpde2dcont from KPDE package (Koko, 2016). The streamlines of Fig. 4 are obtained by plotting the solution of (7.3) with kpde2dcont.

Unfortunately, for 3D flows, there is no simple tool for graphics output. quiver3 allows for visualization of 3D velocity fields but the result is often unsatisfactory. Plotting 3D functions (or their contours) is a non trivial problem. There is no simple subproblem like (7.3) for streamlines in 3D.

Fig. 2

Velocity field of the flow around cylinder problem.

Fig. 3

Isobar lines for the flow around a cylinder problem.

Fig. 4

Streamlines for the flow around a cylinder problem.

8 Conclusion

We have proposed a fast MATLAB package for the numerical approximation of the generalized Stokes problem with the mini-element. Numerical experiments show that the proposed assembling functions have an optimal linear time-scaling. The proposed Uzawa conjugate gradient algorithm outperforms the MATLAB built-in solvers for 3D problems.

References

Alberty, J., Carstensen, C., Funken, S.A. (1999). Remarks around 50 lines of matlab: short finite element implementation. Numerical Algorithms, 20, 117–137.

Alberty, J., Carstensen, C., Funken, S.A., Klose, R. (2002). Matlab implementation of the finite element method in elasticity. Computing, 69, 239–263.

Arnold, D., Brezzi, Fortin M, F. (1984). A stable finite element for the Stokes equations. Calcolo, 21, 337–344.

Babuska, I. (1971). Error bounds for finite element method. Numerische Mathematik, 16, 322–333.

Baiocchi, C., Brezzi, F., Franca, L.P. (1993). Virtual bubbles and Galerkin-Least-Squares methods. Computer Methods in Applied Mechanics and Engineering, 105, 125–141.

Brezzi, F. (1974). On the existence, uniqueness and approximation of saddle-point problems arising from Lagrange multipliers. RAIRO, 8, 129–151.

Cahouet, J., Chabard, J.P. (1988). Some fast 3-D solvers for the generalized Stokes problem. International Journal for Numerical Methods in Fluids, 8, 269–295.

Ern, A., Guermond, J.-L. (2002). Eléments finis: théorie, applications, mise en œuvre. SMAI Mathématiques et Applications, Vol. 36. Springer.

Fortin, M., Glowinski, R. (1983). Augmented Lagrangian Methods: Application to the Numerical Solution of Boundary-Value Problems. North-Holland, Amsterdam.

Glowinski, R. (2003). Numerical methods for fluids (part 3). In: Ciarlet, P.G., Lions, J.L. (Eds.), Numerical Methods for Fluids (Part 3), Handbook of Numerical Analysis, Vol. IX. North-Holland, Amsterdam, pp. 3–1074.

Glowinski, R., Guidoboni, G. (2009). On the predoncitioned conjugate gradient solution of a Stokes problem with Robin-type boundary conditions. Comptes Rendus de l’Académie des Sciences Paris, 347(15), 903–908.

Glowinski, R., Le Tallec, P. (1989). Augmented Lagrangian and Operator-splitting Methods in Nonlinear Mechanics. Studies in Applied Mathematics. SIAM, Philadelphia.

Koko, J. (2007). Vectorized MATLAB codes for two-dimensional linear elasticity. Scientific Programming, 15, 157–172.

Koko, J. (2016). Fast MATLAB assembly of fem matrices in 2d and 3d using cell array approach. International Journal of Modelling and Simulation, 7.

Kwon, Y.W., Bang, H. (2000). The Finite Element Method Using MATLAB. CRC Press, New York.

Matsumoto, J. (2005). A relationship between stabilized FEM and bubble functions element stabilization method with orthogonal basis for incompressible flows. Journal of Applied Mechanics, 8, 233–242.

Persson, P.-O., Strang, G. (2004). A simple mesh generator in Matlab. SIAM Review, 42, 329–345.

Pierre, R. (1987). Regularization procedures of mixed finite element approximation of the Stokes problem. Research-Report RR-0673, INRIA.

Pierre, R. (1989). Regularization procedures of mixed finite element approximation of the Stokes problem. Numerical Methods for Partial Differential Equations, 5, 241–258.

Pierre, R. (1995). Optimal selection of the bubble function in the stabilization of P1-P1 element for the Stokes problem. Numerical Methods for Partial Differential Equations, 32, 1210–1224.

Rahman, T., Valdman, J. (2013). Fast MATLAB assembly of FEM matrices in 2D and 3D: nodal elements. Applied Mathematics and Computation, 219, 7151–7158.

Rahman, T., Valdman, J. (2015). Fast MATLAB assembly of FEM matrices in 2D and 3D: edge elements. Applied Mathematics and Computation, 267, 252–263.

Schäfer, M., Turek, S. (1996). Benchmark computations of laminar flow around a cylinder. Notes on Numerical Fluid Mechanics, 52, 547–566.

Biographies

Koko Jonas

jonas.koko@uca.fr

J. Koko is an associate professor in applied mathematics at the Computer Science School at the University Clermont-Auvergne. His research interests include numerical optimization with applications to Partial Differential Equations (PDE) and parallel computing, vectorized MATLAB codes for the numerical approximation of PDEs.

Reading mode

Table of contents

1 Introduction
2 The Model Problem
3 Finite Element Discretization with $P1$-Bubble-$P1$
4 Element Matrices
5 Uzawa Conjugate Gradient Algorithm
6 MATLAB Implementation
7 Numerical Experiments
8 Conclusion
References
Biographies

Open access article under the CC BY license.

Keywords

finite element method Stokes problem Uzawa conjugate gradient MATLAB

Metrics

since January 2020

2001

Article info
views

1915

Full article
views

1096

PDF
downloads

265

XML
downloads

RSS

Figures
10
Tables
6

Algorithm 1

Uzawa conjugate gradient algorithm for (5.1).

Algorithm 2

Preconditioned Uzawa conjugate gradient algorithm for (5.1).

Function 6.1

Assembly of the 2D Stokes matrix (4.12).

Function 6.2

Assembly of the Stokes right-hand side (4.13).

Function 6.3

Assembly of the 3D Stokes matrix (4.24).

Function 6.4

Assembly of the 3D Stokes right-hand side (4.25).

Fig. 1

Mesh sample of the flow around cylinder problem.

Fig. 2

Velocity field of the flow around cylinder problem.

Fig. 3

Isobar lines for the flow around a cylinder problem.

Fig. 4

Streamlines for the flow around a cylinder problem.

Table 1

CPU times (in seconds) for assembling the Stokes matrix system of size N in 2D.

Table 2

CPU times (in seconds) for assembling the Stokes matrix system of size N in 3D.

Table 3

Comparative performances of MATLAB direct solvers and Algorithm 2 for the 2D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Table 4

Comparative CPU times (in sec.) of MATLAB direct solvers and Algorithm 2 for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$, (OoM = Out of Memory).

Table 5

Performances of the gmres iterative solver for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Table 6

Number of iterations versus $\alpha /\nu $ for the 3D driven cavity problem for $h=1/32$.

Algorithm 1

Uzawa conjugate gradient algorithm for (5.1).

Algorithm 2

Preconditioned Uzawa conjugate gradient algorithm for (5.1).

Function 6.1

Assembly of the 2D Stokes matrix (4.12).

Function 6.2

Assembly of the Stokes right-hand side (4.13).

Function 6.3

Assembly of the 3D Stokes matrix (4.24).

Function 6.4

Assembly of the 3D Stokes right-hand side (4.25).

Fig. 1

Mesh sample of the flow around cylinder problem.

Fig. 2

Velocity field of the flow around cylinder problem.

Fig. 3

Isobar lines for the flow around a cylinder problem.

Fig. 4

Streamlines for the flow around a cylinder problem.

Table 1

CPU times (in seconds) for assembling the Stokes matrix system of size N in 2D.

h	1/32	1/64	1/128	1/256	1/512	1/1024
N	3*1089	3*4225	3*16641	3*66049	3*263169	3*1050625
A	0.018	0.071	0.325	1.516	7.242	36.390
b	0.002	0.011	0.050	0.251	1.214	5.995

Table 2

CPU times (in seconds) for assembling the Stokes matrix system of size N in 3D.

h	1/4	1/8	1/16	1/32	1/64	1/128
N	4*125	4*729	4*4913	4*35937	4*274625	4*2146689
A	0.075	0.057	0.397	3.438	34.538	348.249
b	0.027	0.018	0.051	0.465	4.544	46.394

Table 3

Comparative performances of MATLAB direct solvers and Algorithm 2 for the 2D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Mesh size h	1/32	1/64	1/128	1/256	1/512
Gaussian elim. CPU	0.02	0.07	0.40	2.11	10.88
${\mathrm{LDL}^{\top }}$ CPU	0.02	0.17	1.22	7.13	46.11
Algorithm 2 CPU	0.02	0.08	0.35	1.94	12.01

Table 4

Comparative CPU times (in sec.) of MATLAB direct solvers and Algorithm 2 for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$, (OoM = Out of Memory).

Mesh size h	1/4	1/8	1/16	1/32	1/64
Gaussian elim.	0.01	0.05	1.73	145.44	OoM
${\mathrm{LDL}^{\top }}$	0.02	0.08	6.29	128.54	OoM
Algorithm 2	0.04	0.89	1.70	78.08	1894.59

Table 5

Performances of the gmres iterative solver for the 3D Stokes system, $\nu =1/50$ and $\alpha /\nu ={10^{3}}$.

Mesh size h	1/4	1/8	1/16	1/32	1/64
CPU Times (sec.)	0.04	0.02	1.66	39.61	8230.59

Table 6

Number of iterations versus $\alpha /\nu $ for the 3D driven cavity problem for $h=1/32$.

$\alpha /\nu $	${10^{1}}$	${10^{2}}$	${10^{3}}$	${10^{4}}$	${10^{5}}$	${10^{6}}$	${10^{7}}$
$\nu =1/50$	44	39	41	28	13	8	7
$\nu =1/200$	37	34	33	26	13	8	7
$\nu =1/1000$	31	28	27	24	18	8	7

Authors

Abstract

1 Introduction

2 The Model Problem

(2.1)

(2.2)

(2.3)

(2.4)

(2.5)

3 Finite Element Discretization with $P1$-Bubble-$P1$

3.1 Mini Element

(3.1)

(3.2)

(3.3)

(3.4)

(3.5)

3.2 Algebraic Formulation

(3.6)

(3.7)

(3.8)

4 Element Matrices

4.1 Two-Dimensional Case

(4.1)

(4.2)

(4.3)

(4.4)

(4.5)

(4.6)

(4.7)

(4.8)

(4.9)

(4.10)

(4.11)

(4.12)

(4.13)

4.2 Three-Dimensional Case

(4.14)

(4.15)

(4.16)

(4.17)

(4.18)

(4.19)

(4.20)

(4.21)

(4.22)

(4.23)

(4.24)

(4.25)

5 Uzawa Conjugate Gradient Algorithm

(5.1)

(5.2)

(5.3)

5.1 Uzawa Conjuage Gradient Algoritm

(5.4)

(5.5)

(5.6)

(5.7)

(5.8)

Algorithm 1

5.2 Preconditioned Uzawa Conjugate Gradient Algorithm

(5.9)

(5.10)

(5.11)

(5.12)

(5.13)

(5.14)

(5.15)

(5.16)

(5.17)

Algorithm 2

Remark 1.

Remark 2.

6 MATLAB Implementation

6.1 Mesh Representation and KPDE Package

6.2 2D Case

Function 6.1

Function 6.2

6.3 3D Case

Function 6.3

Function 6.4