# Iterative Refinement¶

Iterative refinement is equivalent to a linear iteration in second normal form as described in [1], i.e. for an equation system \(Ax=b\) the solution \(x\) is computed iteratively via

\(x_{i+1} := x_i - W ( A \cdot x_i - b )\)

with some start value \(x_0\) and an matrix \(W\), which might be considered as a preconditioner.

The iteration converges if

\(||I - W \cdot A ||_2 < 1\)

In the context of H-matrices, a reasonable choice for \(W\) is given by either the approximate inverse of \(A\) or its LU factorization, i.e., \(W = (LU)^{-1}\) and \(A = LU\). The latter is to be preferred as its computation is much faster compared to the inverse of \(A\).

Due to the approximate nature of low-rank arithmetic, a mixed precision representation of the LU factors does not change the error, assuming low-rank error and unit roundoff of the chosen precision are compatible. However, arithmetic in different precisions may still affect the result, especially when the condition number \(\kappa(A)\) is large.

In the program, \(W\) is computed and represented in FP64 or FP32 while for the product \(s = W \cdot t\) the vectors \(s,t\) are always in FP64.

## Program¶

The corresponding program is *iterrefine* and build as

```
scons programs=iterrefine frameworks=seq,tbb
```

## Examples¶

For the following examples the matrix is computed in FP64 and H-LU factorization used for \(W\) either in FP64 or in FP32. Furthermore, a normalized random solution \(\tilde x\) was chosen and a corresponding RHS \(b\) computed. Reported are the errors \(||\tilde x - x_i||_2\).

### Laplace SLP¶

For \(\epsilon = 10^{-4}\) and \(n = 32.768\) (grid: *sphere-6*) and inversion error via H-LU of
\(7 \cdot 10^{-4}\) (FP64 and FP32) was obtained.

step |
FP64 |
FP32 |
---|---|---|

0 |
2.00e+00 |
2.00e+00 |

1 |
8.50e-05 |
8.55e-05 |

2 |
1.50e-08 |
1.50e-08 |

3 |
4.10e-12 |
4.13e-12 |

### Matérn Covariance¶

For \(\epsilon = 10^{-6}\) and \(n = 32.768\) (grid: *randcube-32768*) and inversion error via H-LU of
\(5 \cdot 10^{-3}\) (FP64) and \(4 \cdot 10^{-2}\) (FP32) was obtained.

step |
FP64 |
FP32 |
---|---|---|

0 |
2.00e+00 |
2.00e+00 |

1 |
9.10e-04 |
1.64e-03 |

2 |
1.12e-06 |
3.51e-06 |

3 |
2.04e-09 |
1.37e-08 |

4 |
4.87e-11 |

## References¶

Wolfgang Hackbusch:

*“Iterative Solution of Large Sparse Systems of Equations”*, Springer, 2016