Introduction to Elementary FEM


I have recently been working on a project that involves using the Finite Element Method. Here is a sneak peek.

To begin what is hopefully going to be a series of blog posts, I would like to introduce the basic concept of the Finite Element Method. This note is aimed at roughly undergraduate level in vaguely STEM-ish fields, assuming a solid mathematical background. I tried to be a bit less formal than I would usually write these things; hopefully that makes this note less intimidating and more approachable.

To begin, let's consider solving a simple PDE in 1D. Let $f : (0, 1) \to \mathbb{R}$ be a continuous function. We are looking for a sufficiently differentiable function $u : [0, 1] \to \mathbb{R}$ such that the following holds ($\partial$ denotes the derivative operator; that is, $\partial u$ is the derivative of $u$ and $\partial^2 u$ is the second derivative of $u$):

$$-\partial^2 u = f \qquad\text{and}\qquad u(0) = u(1) = 0.$$

This is nothing special, just finding the second antiderivative of $-f$. The boundary conditions are somewhat unimportant in this case, since one may always add a linear polynomial to the solution to satisfy them. This becomes more interesting in higher dimensions.
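For concreteness, here is one example we can solve by hand (and will reuse later to check the numerics): for $f(x) = \sin(\pi x)$, the function

$$u(x) = \frac{\sin(\pi x)}{\pi^2}$$

satisfies both $-\partial^2 u = \sin(\pi x) = f$ and $u(0) = u(1) = 0$.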

As one may expect, computing the function $u$ may be difficult, entirely depending on the form of $f$. This means we need to look for an approximation.

To start, let's play a game:

First, let's introduce some notation. We are going to define the inner product of two reasonably nice functions $u$ and $v$ on the interval $[0, 1]$ by:

$$\langle u \mid v \rangle = \intop_0^1 u v \, \text{d}x.$$
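In code, this inner product is just a one-dimensional integral, so any quadrature routine does the job. A minimal sketch using scipy (the helper name `inner` is mine):

```python
from scipy.integrate import quad

def inner(u, v):
    """Compute <u | v>, the integral of u(x) * v(x) over [0, 1]."""
    value, _error = quad(lambda x: u(x) * v(x), 0.0, 1.0)
    return value
```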

Let's say that $u$ solves the PDE. Then, let's grab any other twice differentiable function $v$ that is also zero on the boundary, that is, $v(0) = v(1) = 0$ (meaning it could plausibly be a solution of the PDE), and compute the following integral using integration by parts:

$$\begin{align*} 0 = \langle -\partial^2 u - f \mid v \rangle &= \intop_0^1 (-\partial^2 u - f) v \, \text{d}x \\ &= -\intop_0^1 (\partial^2 u) v \, \text{d}x - \intop_0^1 f v \, \text{d}x \\ &= -\underbrace{\left[ (\partial u) v \right]_0^1}_{0} + \intop_0^1 (\partial u) (\partial v) \, \text{d}x - \intop_0^1 f v \, \text{d}x. \end{align*}$$

Rearranging, we get the equality

$$\langle \partial u \mid \partial v \rangle = \intop_0^1 (\partial u) (\partial v) \, \text{d}x = \intop_0^1 f v \, \text{d}x = \langle f \mid v \rangle.$$

It is an exercise in analysis to show that this condition is in fact sufficient: if $u$ satisfies it for every twice differentiable function $v$ that is zero on the boundary, then $u$ is a solution of the PDE. This is called the weak form of the PDE.
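As a quick sanity check, the weak form can be verified numerically for the example from above, $u(x) = \sin(\pi x)/\pi^2$ and $f(x) = \sin(\pi x)$, against an arbitrary test function. A sketch (both sides work out to $4/\pi^3 \approx 0.129$):

```python
import numpy as np
from scipy.integrate import quad

def inner(u, v):
    value, _error = quad(lambda x: u(x) * v(x), 0.0, 1.0)
    return value

du = lambda x: np.cos(np.pi * x) / np.pi  # derivative of the exact solution
f = lambda x: np.sin(np.pi * x)

# Any differentiable test function vanishing at 0 and 1 will do.
v = lambda x: x * (1 - x)
dv = lambda x: 1 - 2 * x

print(inner(du, dv))  # 0.1290...
print(inner(f, v))    # 0.1290...
```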

It is not immediately obvious how this is useful. At first glance, we have converted a problem that requires us to find a function with a certain property into a problem that requires us to find a function satisfying a certain integral equation for every function $v$. This does not seem to be easier at all!

Now comes the main trick: instead of considering the entire space of twice differentiable functions that are zero on the boundary, we are going to use a different, finite dimensional space of functions. This space does not even need to be a subspace of the space of smooth functions; in fact, it usually isn't in practical implementations.

Definition. For the remainder of this post, let

$$0 = x_0 < x_1 < \cdots < x_{n+1} = 1$$

be a partition of the unit interval. Furthermore, let the solution space, denoted by $V$, be the vector space of functions on the unit interval that are

  1. continuous,
  2. piecewise linear, that is, a linear polynomial on every interval $[x_i, x_{i+1}]$,
  3. zero on the boundary, that is, $u(0) = u(1) = 0$ for every $u \in V$.
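A function in $V$ is determined entirely by its values at the interior nodes $x_1, \ldots, x_n$ (the endpoint values are pinned to zero), and evaluating it is plain piecewise linear interpolation, which numpy already provides. A small sketch with a made-up partition and nodal values:

```python
import numpy as np

# A partition of [0, 1] with n = 4 interior nodes.
x = np.linspace(0.0, 1.0, 6)
# Nodal values of some member of V; note the zeros at both ends.
values = np.array([0.0, 0.3, -0.1, 0.5, 0.2, 0.0])

# np.interp evaluates the continuous piecewise linear interpolant.
print(np.interp(0.35, x, values))
```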

A typical function in this space is a continuous zigzag: straight segments joined at the nodes $x_i$, pinned to zero at both endpoints.

In more theoretical texts, the first and second conditions are often abstracted, for instance by considering polynomials of arbitrary degree.

It is worth observing that in practice, we are the ones who choose the partition $(x_1, \ldots, x_n)$. This means that if we have some knowledge about the function $f$, or if we demand better accuracy in some regions, we can increase the density of the partition in those regions.
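For example, grading the mesh toward one endpoint is a one-liner; the exponent here is an arbitrary knob:

```python
import numpy as np

n = 8
x_uniform = np.linspace(0.0, 1.0, n + 2)      # evenly spaced nodes
x_graded = np.linspace(0.0, 1.0, n + 2) ** 2  # nodes clustered near x = 0
```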

Definition. A weak solution is a function $u \in V$ such that, for every $v \in V$, the following holds:

$$\langle \partial u \mid \partial v \rangle = \langle f \mid v \rangle$$

I have smuggled in another trick here: notice that the weak form only involves first derivatives. This is crucial, because our piecewise linear functions are not twice differentiable at the nodes, and wherever their second derivative does exist it is zero, so the strong form could never hold for them. Their first derivatives, on the other hand, are perfectly usable piecewise constant functions, which simplifies the computation later.

Now for the final trick: notice that $V$ is a finite dimensional vector space. This means that we can choose a basis of $V$, say $(v_1, \ldots, v_n)$. We use this basis to write $u$ as a linear combination of the basis functions:

$$u = \alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n.$$

It is now an easy exercise in linear algebra to show that it is sufficient for the weak solution to satisfy the weak form for every basis function $v_i$ of $V$ instead of every function in $V$: both sides of the weak form are linear in $v$, so if the equality holds on a basis, it holds on all of $V$. That is, we assemble a system of equations:

$$\begin{align*} \alpha_1 \langle \partial v_1 \mid \partial v_1 \rangle + \alpha_2 \langle \partial v_2 \mid \partial v_1 \rangle + \cdots + \alpha_n \langle \partial v_n \mid \partial v_1 \rangle &= \langle f \mid v_1 \rangle, \\ &\;\;\vdots \\ \alpha_1 \langle \partial v_1 \mid \partial v_n \rangle + \alpha_2 \langle \partial v_2 \mid \partial v_n \rangle + \cdots + \alpha_n \langle \partial v_n \mid \partial v_n \rangle &= \langle f \mid v_n \rangle. \end{align*}$$

Or in matrix form:

$$\begin{pmatrix} \langle \partial v_1 \mid \partial v_1 \rangle & \langle \partial v_2 \mid \partial v_1 \rangle & \cdots & \langle \partial v_n \mid \partial v_1 \rangle \\ \langle \partial v_1 \mid \partial v_2 \rangle & \langle \partial v_2 \mid \partial v_2 \rangle & \cdots & \langle \partial v_n \mid \partial v_2 \rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle \partial v_1 \mid \partial v_n \rangle & \langle \partial v_2 \mid \partial v_n \rangle & \cdots & \langle \partial v_n \mid \partial v_n \rangle \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} \langle f \mid v_1 \rangle \\ \langle f \mid v_2 \rangle \\ \vdots \\ \langle f \mid v_n \rangle \end{pmatrix}$$

Using fancy terminology, the matrix on the left is the Gram matrix of the inner product $\langle \partial - \mid \partial - \rangle$ on the space $V$ (verifying that this is indeed an inner product is left as an exercise to the reader; more advanced treatments tend to allow more general bilinear forms here). Engineers usually call this matrix the stiffness matrix. Being the Gram matrix of a basis with respect to an inner product, it is symmetric and positive definite, so this system of equations always has a unique solution.

What remains is picking the basis functions. Let's consider several demands we would like to impose on the basis:

  1. The stiffness matrix should be reasonably sparse.
  2. The right hand side terms $\langle f \mid v_i \rangle$ should be easy to compute or at least approximate.
  3. Once we get the solution $(\alpha_1, \ldots, \alpha_n)$, we should be able to compute the function $u$ easily from the coefficients.

The usual basis functions are the hat functions: piecewise linear functions, each equal to one at its own node $x_i$ and zero at every other node.

That is, an increasing linear slope between $x_{i-1}$ and $x_i$ and a decreasing linear slope between $x_i$ and $x_{i+1}$. In more precise terms:

$$v_i(x) = \begin{cases} \frac{x - x_{i-1}}{x_i - x_{i-1}} & \text{if}\quad x_{i-1} \leq x < x_i, \\ \frac{x_{i+1} - x}{x_{i+1} - x_i} & \text{if}\quad x_i \leq x < x_{i+1}, \\ 0 & \text{otherwise}. \end{cases}$$
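Translated directly into code (vectorized over x; the helper name `hat` is mine):

```python
import numpy as np

def hat(x, nodes, i):
    """Evaluate the i-th hat function (1 <= i <= n) on the given partition."""
    left, center, right = nodes[i - 1], nodes[i], nodes[i + 1]
    rising = (x - left) / (center - left)     # slope up on [x_{i-1}, x_i)
    falling = (right - x) / (right - center)  # slope down on [x_i, x_{i+1})
    return np.where((x >= left) & (x < center), rising,
                    np.where((x >= center) & (x < right), falling, 0.0))
```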

These functions have some desirable properties. First, note that if $\left|i - j\right| \geq 2$, then $\langle \partial v_i \mid \partial v_j \rangle = 0$, since the intervals where the two functions are non-zero overlap in at most a single point. This means that the stiffness matrix is tridiagonal, which satisfies our sparsity requirement. There are even specialized algorithms for solving systems of equations with a tridiagonal matrix. It is not that easy in higher dimensions, but even there, methods that work well with sparse matrices are available.
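Better yet, for the hat basis the surviving entries have closed forms. Writing $h_i = x_{i+1} - x_i$, the derivative $\partial v_i$ equals $1/h_{i-1}$ on $(x_{i-1}, x_i)$ and $-1/h_i$ on $(x_i, x_{i+1})$, which gives $\langle \partial v_i \mid \partial v_i \rangle = 1/h_{i-1} + 1/h_i$ and $\langle \partial v_i \mid \partial v_{i+1} \rangle = -1/h_i$. A sketch of the assembly (dense for clarity; a real implementation would use a banded or sparse format):

```python
import numpy as np

def stiffness_matrix(nodes):
    """Assemble the tridiagonal stiffness matrix for the hat basis.

    nodes holds x_0, ..., x_{n+1}; the result is n x n, one row per
    interior node."""
    h = np.diff(nodes)  # element lengths h_i = x_{i+1} - x_i
    n = len(nodes) - 2
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 1.0 / h[i] + 1.0 / h[i + 1]
        if i + 1 < n:
            A[i, i + 1] = A[i + 1, i] = -1.0 / h[i + 1]
    return A
```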

Computing the right hand side terms may be a bit more difficult, depending on the form of $f$. We are looking to evaluate the integral

$$\langle f \mid v_i \rangle = \intop_0^1 f(x) v_i(x) \, \text{d}x = \intop_{x_{i-1}}^{x_{i+1}} f(x) v_i(x) \, \text{d}x.$$

In practice, we use a numerical integration scheme here. An easy approach is to assume that $f \in V$, that is, to linearly interpolate $f$ from its values at $(x_1, \ldots, x_n)$; then we can simply compute the integral directly. An even simpler approach is to exploit the locality of the basis functions: assume that $f$ is constant on $[x_{i-1}, x_{i+1}]$ and compute

$$\langle f \mid v_i \rangle \approx f(x_i) \intop_{x_{i-1}}^{x_{i+1}} v_i(x) \, \text{d}x = f(x_i) \, \frac{x_{i+1} - x_{i-1}}{2},$$

where the remaining integral is simply the area of a triangle with base $x_{i+1} - x_{i-1}$ and height one.

Do note that this, again, splits our right hand side into a part that can be pre-computed for a given partition and a part that depends on the function $f$.
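In code, this approximation of the load vector is a one-liner over the interior nodes (assuming f is vectorized):

```python
import numpy as np

def load_vector(nodes, f):
    """Approximate <f | v_i> by f(x_i) * (x_{i+1} - x_{i-1}) / 2."""
    return f(nodes[1:-1]) * (nodes[2:] - nodes[:-2]) / 2.0
```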

And finally, since the basis functions are set up in a way that only $v_i$ is non-zero at $x_i$, computing the value of the solution at a node is simply:

$$u(x_i) = (\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_n v_n)(x_i) = \alpha_i.$$
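Putting the pieces together, here is a minimal end-to-end sketch, using the helpers sketched above, that solves the example $f(x) = \sin(\pi x)$ from the beginning and compares the coefficients against the exact solution $u(x) = \sin(\pi x)/\pi^2$:

```python
import numpy as np

def stiffness_matrix(nodes):
    h = np.diff(nodes)
    n = len(nodes) - 2
    A = np.zeros((n, n))
    for i in range(n):
        A[i, i] = 1.0 / h[i] + 1.0 / h[i + 1]
        if i + 1 < n:
            A[i, i + 1] = A[i + 1, i] = -1.0 / h[i + 1]
    return A

def load_vector(nodes, f):
    return f(nodes[1:-1]) * (nodes[2:] - nodes[:-2]) / 2.0

f = lambda x: np.sin(np.pi * x)
exact = lambda x: np.sin(np.pi * x) / np.pi**2

nodes = np.linspace(0.0, 1.0, 34)  # 32 interior nodes
alpha = np.linalg.solve(stiffness_matrix(nodes), load_vector(nodes, f))

# By the identity above, alpha_i approximates u(x_i) at the interior nodes.
print(np.max(np.abs(alpha - exact(nodes[1:-1]))))  # shrinks as the mesh refines
```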