Matrix free solvers#

Elliptic equation#

Here we study the general two-dimensional elliptic equation

(1)#\[\begin{align} -\nabla\cdot( \chi \nabla_\perp \phi) = \rho \end{align}\]

which in two-dimensional Cartesian coordinates reads

(2)#\[\begin{align} -\frac{\partial}{\partial x} \left( \chi(x,y) \frac{\partial}{\partial x} \phi(x,y)\right) - \frac{\partial}{\partial y} \left( \chi(x,y) \frac{\partial}{\partial y} \phi(x,y)\right) = \rho(x,y) \label{eq:elliptic2d} \end{align}\]

The task is to find a solution for \(\phi\) for given \(\rho\) and \(\chi\).

Discretization#

We use discontinuous Galerkin (dG) methods to discretize the derivatives

(3)#\[\begin{align} \partial_x \rightarrow D_x \end{align}\]

where \(D_x\) is a block-sparse matrix (and analogously \(D_y\) for \(\partial_y\)). We then have

(4)#\[\begin{align} M \phi &= \rho \label{eq:matrix} \\ M &= D_x^T \chi D_x + D_y^T \chi D_y + J \end{align}\]

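Here \(J\) contains the jump terms of the dG discretization. We see that \(M\) is self-adjoint, which (together with positive definiteness, which CG also requires) means that we can use a conjugate gradient (CG) solver.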
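The symmetry can be checked term by term. The following is only a sketch, assuming that \(\chi\) enters as a diagonal matrix (as in the pseudo-code of the next section) and that the jump matrix \(J\) is itself symmetric, which is not stated explicitly here:

\[\begin{align} M^T &= (D_x^T \chi D_x)^T + (D_y^T \chi D_y)^T + J^T = D_x^T \chi^T D_x + D_y^T \chi^T D_y + J^T = M \\ v^T D_x^T \chi D_x v &= (D_x v)^T \chi (D_x v) = \sum_i \chi_i (D_x v)_i^2 \geq 0 \quad \text{for } \chi_i > 0 \end{align}\]

The second line shows that each derivative term is positive semi-definite whenever \(\chi\) is positive; whether \(M\) is strictly positive definite additionally depends on the boundary conditions and on \(J\).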

Problem#

// Pseudo-code:
Grid g;
Matrix dx = create_dx (g, bcx), dy = create_dy(g, bcy), j = create_jump();
DiagMatrix diag_chi = create_from_given_vector(chi);
// assembly of M requires four matrix-matrix multiplications and 2 additions
Matrix M = dx.transpose()*diag_chi*dx + dy.transpose()*diag_chi*dy + j;
// Now solve with CG
CG cg;
Vector phi = discretize_phi(g), rho = discretize_rho(g);
cg.solve( M, phi, rho, eps = 1e-8);

Assembling \(M\) requires four matrix-matrix multiplications.
In our initial tests (admittedly 10 years ago) this took longer than the entire CG solve.

Solution: matrix-free solvers#

Matrix-free

A matrix-free solver is any solver for \(M x = b\) that does not require access to the individual matrix elements \(M_{ij}\); it only needs the result of applying \(M\) to a vector.

Matrix-free solvers are thus a subclass of the available solvers.

Examples of solvers that are matrix-free:

All Krylov-subspace solvers, e.g. conjugate gradient (CG), LGMRES, BICG, etc.
Fixed point iterations
Chebyshev iteration

Examples of solvers that are not matrix-free:

Direct solvers; need to access \(M_{ij}\) directly
Jacobi iteration; because it needs to decompose \(M = D + L + U\)
Gauss-Seidel iteration; needs to decompose \(M = L_* + U\)
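To make the distinction concrete, here is a minimal sketch of a fixed-point (Richardson) iteration \(x \leftarrow x + \omega (b - M x)\) in plain Python. The operator `apply_M` (a small 1D tridiagonal stencil), the relaxation factor `omega` and the iteration count are illustrative assumptions and are not taken from FELTOR; the point is only that the solver receives a function, never a matrix.

```python
def apply_M(x):
    """Apply the 1D stencil (-1, 2, -1) to x without storing a matrix.

    Illustrative stand-in for an elliptic operator; values outside the
    vector are treated as zero (homogeneous Dirichlet boundaries).
    """
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def richardson(apply, b, omega, iterations):
    """Fixed-point iteration x <- x + omega * (b - M x); only calls apply()."""
    x = [0.0] * len(b)
    for _ in range(iterations):
        Mx = apply(x)
        x = [xi + omega * (bi - mi) for xi, bi, mi in zip(x, b, Mx)]
    return x


expected = [1.0, 2.0, 3.0, 4.0]
b = apply_M(expected)  # manufacture a right-hand side with known solution
x = richardson(apply_M, b, omega=0.5, iterations=500)
```

A Jacobi iteration on the same problem would additionally need the diagonal entries \(M_{ii}\), which a function like `apply_M` alone does not expose.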

Example: Main loop of CG algorithm#

To solve

(5)#\[\begin{align} Mx = b \end{align}\]

the main loop of CG reads (see https://en.wikipedia.org/wiki/Conjugate_gradient_method)

(6)#\[\begin{align} \alpha_k =& \frac{r_k^T \cdot r_k}{p_k^T \cdot ( \color{red}{M \cdot p_k})} \\ x_{k+1} = & x_k + \alpha_k p_k \\ r_{k+1} = & r_k - \alpha_k \color{red}{M \cdot p_k} \\ \beta_k = & \frac{r_{k+1}^T \cdot r_{k+1}}{r_k^T \cdot r_k} \\ p_{k+1} = & r_{k+1} + \beta_k p_k \end{align}\]

To implement CG you only need the application of \(M\) to a vector (the terms marked in red), not \(M\) itself.

Grid g;
Matrix dx = create_dx (g, bcx), dy = create_dy(g, bcy), j = create_jump();
DiagMatrix diag_chi = create_from_given_vector(chi);

// Implement the effect of the matrix without ever assembling it
Vector matrix_vector_product_with_elliptic_matrix( Vector phi)
{
    Vector dxP = dx*phi, dyP = dy*phi, JP = j*phi;
    Vector tempX = diag_chi*dxP, tempY = diag_chi*dyP;
    dxP = dx.transpose()*tempX, dyP = dy.transpose()*tempY;
    return dxP + dyP + JP;
}

// In main CG loop (r*r and p*Ap denote scalar products,
// r_old holds r*r from the previous iteration)
Vector Ap = matrix_vector_product_with_elliptic_matrix( p);
double alpha = r_old / (p*Ap);
x = x + alpha*p;
r = r - alpha*Ap;
double r_new = r*r;
double beta = r_new/r_old;
p = r + beta*p;
r_old = r_new;
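The same idea can be tried out in a few lines of plain Python. The sketch below is not FELTOR code: it assumes a 1D analogue \(M = D^T \chi D\) with a simple forward-difference \(D\) (no jump terms), and the names `apply_elliptic_1d`, `cg` and the chosen `chi` values are hypothetical. The CG routine only ever calls the operator as a function.

```python
import math


def dot(u, v):
    """Scalar product of two vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def apply_elliptic_1d(phi, chi):
    """Apply M = D^T chi D without assembling M.

    D is a forward difference mapping n values to n+1 differences,
    with zero (Dirichlet) values assumed outside the vector;
    chi holds n+1 positive coefficients.
    """
    n = len(phi)
    d = [(phi[i] if i < n else 0.0) - (phi[i - 1] if i > 0 else 0.0)
         for i in range(n + 1)]                      # D phi
    flux = [c * di for c, di in zip(chi, d)]         # chi D phi
    return [flux[i] - flux[i + 1] for i in range(n)]  # D^T chi D phi


def cg(apply, b, eps=1e-10, max_iter=1000):
    """Unpreconditioned CG with zero initial guess; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)  # r_0 = b - M x_0 = b because x_0 = 0
    p = list(r)
    r_old = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for k in range(max_iter):
        if math.sqrt(r_old) <= eps * b_norm:
            return x, k
        Ap = apply(p)  # the only place where M enters
        alpha = r_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        r_new = dot(r, r)
        beta = r_new / r_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        r_old = r_new
    return x, max_iter


n = 8
chi = [1.0 + 0.9 * math.sin(0.5 * i) for i in range(n + 1)]  # stays positive
expected = [math.sin(0.3 * (i + 1)) for i in range(n)]
rhs = apply_elliptic_1d(expected, chi)
solution, iterations = cg(lambda v: apply_elliptic_1d(v, chi), rhs)
```

Passing the operator as a callable keeps the solver independent of how \(M\) is represented, which is the property the pseudo-code above relies on.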

A manufactured example problem#

We manufacture the solution

(7)#\[\begin{align} \chi &= 1 + A \sin (x)\sin(y) \\ \rho &= 2 \sin(x) \sin(y) (A\sin(x)\sin(y) +1)- A\sin^2(x)\cos^2(y) - A\cos^2(x)\sin^2(y) \\ \phi &= \sin(x)\sin(y) \end{align}\]

for \(A\in ]-1,1[\), which guarantees \(\chi > 0\), and solve on the domain \([0,\pi]\times [0,2\pi]\) with Dirichlet boundary conditions in \(x\) and periodic boundary conditions in \(y\). The initial guess is zero.
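As a quick plausibility check that \(\rho\) indeed matches \(-\nabla\cdot(\chi\nabla\phi)\), one can compare both sides numerically at a few points. The sketch below uses a simple central finite-difference approximation in plain Python; the step size `h`, the sample points and the helper name `minus_div_chi_grad_phi` are arbitrary choices for illustration and have nothing to do with the dG discretization used by FELTOR.

```python
import math

A = 0.9  # amplitude; must lie in ]-1, 1[ so that chi stays positive


def chi(x, y):
    return 1.0 + A * math.sin(x) * math.sin(y)


def phi(x, y):
    return math.sin(x) * math.sin(y)


def rho(x, y):
    sx, sy, cx, cy = math.sin(x), math.sin(y), math.cos(x), math.cos(y)
    return (2.0 * sx * sy * (A * sx * sy + 1.0)
            - A * sx**2 * cy**2 - A * cx**2 * sy**2)


def minus_div_chi_grad_phi(x, y, h=1e-3):
    """Central finite-difference approximation of -d_x(chi d_x phi) - d_y(chi d_y phi)."""
    flux_xp = chi(x + h / 2, y) * (phi(x + h, y) - phi(x, y)) / h
    flux_xm = chi(x - h / 2, y) * (phi(x, y) - phi(x - h, y)) / h
    flux_yp = chi(x, y + h / 2) * (phi(x, y + h) - phi(x, y)) / h
    flux_ym = chi(x, y - h / 2) * (phi(x, y) - phi(x, y - h)) / h
    return -(flux_xp - flux_xm) / h - (flux_yp - flux_ym) / h


# Arbitrary sample points inside [0, pi] x [0, 2 pi]
sample_points = [(0.3, 1.1), (1.5, 4.0), (2.9, 5.5)]
max_residual = max(abs(minus_div_chi_grad_phi(x, y) - rho(x, y))
                   for x, y in sample_points)
```

The residual should shrink roughly quadratically with `h` until rounding errors take over, which is the expected behaviour of a second-order difference formula.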

Interface to FELTOR from Python#

We have a C++ program “solvers.cpp” written using the FELTOR library.

Takes a JSON file as input
Writes its output as YAML

import simplesimdb  # A lightweight data creator and database manager
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yaml # for output files

pd.set_option('display.float_format', lambda x: '%.2e' % x)
# Run `make solvers` in the repository to build the executable
database = simplesimdb.Manager( executable='./solvers', directory='data', filetype = 'yaml')

Create some default input parameters#

def create_solver() :
    return {
        "type" : "CG", # CG, LGMRES, BICGSTABl, Multigrid
        "eps" : 1e-6,
        "preconditioner" :{
            "type" : "none" # none or diagonal
        },
        "max_iter" : 50000, # maximum number of iterations
        # for CG
        "check-every" : 1, # error check every .. iterations
        # for LGMRES
        "inner_m" : 30,
        "outer_k" : 3,
        # for BICGSTABL
        "l_input" : 3
    }

def create_inputfile ():
    '''Create an input file for the solvers FELTOR code'''
    return {
        "grid" : {
            "n" : 3,
            "Nx" : 128, 
            "Ny" : 256,
        },
        "solver" : create_solver(),
        "equations":
        {
            "amp" : 0.9, # The closer to 1 the more difficult!   
        },
        "elliptic" :
        {
            "jfactor" : 1,
            "direction" : "forward" # forward or centered

        }    
    }

Some basic setup#

# Clean out all existing simulations
database.delete_all()
database.directory='data'

def make_dataframe():
    '''Make a pandas Dataframe of all simulations in the database
    
    Concatenates all inputs and outputs as serialized dicts
    '''
    inputs  = list()
    outputs =list()
    for data in database.table() :
        inputs.append( data)
        with open( database.outfile( data)) as f:
            output = yaml.full_load(f)
            outputs.append( output)
    dfi = pd.json_normalize(inputs)        
    dfo = pd.json_normalize(outputs)
    return pd.concat([dfi, dfo], axis=1).sort_values( 'time', ascending=False)

Our first solver test: Unpreconditioned CG on an Nvidia Titan Xp#

# Create an inputfile
inputfile = create_inputfile()

# Run a simulation
outputfile=database.create( inputfile)

# Read in the outputfile (yaml in this case)
with open(outputfile) as f:
    output = yaml.full_load(f)

# Print
print(inputfile)
print(output)

Running simulation 2c3fd2...d15a.yaml

{'grid': {'n': 3, 'Nx': 128, 'Ny': 256}, 'solver': {'type': 'CG', 'eps': 1e-06, 'preconditioner': {'type': 'none'}, 'max_iter': 50000, 'check-every': 1, 'inner_m': 30, 'outer_k': 3, 'l_input': 3}, 'equations': {'amp': 0.9}, 'elliptic': {'jfactor': 1, 'direction': 'forward'}}
{'time': 1.43784, 'iter': 4133, 'error': 8.07962e-08, 'error_abs': 1.79484e-07}

Let us print this in a nicer way#

# Use Pandas to analyse the data in a nice way
df = make_dataframe()  
df[['solver.type','solver.preconditioner.type','grid.n','grid.Nx','grid.Ny','iter','time']]

	solver.type	solver.preconditioner.type	grid.n	grid.Nx	grid.Ny	iter	time
0	CG	none	3	128	256	4133	1.44e+00
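From the two output numbers one can also derive a cost per iteration, which makes runs with different iteration counts easier to compare. A minimal sketch, reusing the values printed above (the unit of `time` is not stated in the output; seconds is an assumption):

```python
# Values copied from the printed output of the CG run above
output = {'time': 1.43784, 'iter': 4133}

# Average cost of one CG iteration, in the same unit as 'time'
time_per_iter = output['time'] / output['iter']

# Same float format as set via pd.set_option above
formatted = '%.2e' % time_per_iter
print(formatted)
```

With a pandas DataFrame like the one above, the same quantity could be added as a column via `df['time_per_iter'] = df['time'] / df['iter']`.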

	solver.type	solver.preconditioner.type	grid.n	grid.Nx	grid.Ny	iter	time	time_per_iter
0	LGMRES	diagonal	3	128	256	5229	7.92e+00	1.51e-03
2	CG	none	3	128	256	4133	1.44e+00	3.48e-04
1	BICGSTABl	diagonal	3	128	256	1374	9.38e-01	6.82e-04
3	CG	diagonal	3	128	256	2097	7.41e-01	3.53e-04