Commit c0150a3

committed
document remainder of LU solver
Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
1 parent 3db4a60 commit c0150a3

1 file changed: +168 −27 lines changed

docs/advanced_documentation/algorithms/lu-solver.md

Lines changed: 168 additions & 27 deletions
@@ -120,14 +120,15 @@ The power grid model uses a modified version of the

The Gaussian elimination process itself is as usual. Let
$M_p\equiv\begin{bmatrix}m_p && \vec{r}_p^T \\ \vec{q}_p && \hat{M}_p\end{bmatrix}$, where $p$ is
the index of the current pivot element at the top left of the matrix, $m_p$ its value, $\vec{q}_p$
and $\vec{r}_p^T$ are the associated column and row vectors containing the rest of the pivot column
and row, and $\hat{M}_p$ is the bottom-right block of the matrix. Gaussian elimination constructs
the matrices

$$
\begin{align*}
L_p &= \begin{bmatrix} 1 && \vec{0}^T \\ m_p^{-1} \vec{q}_p && \hat{D}_p\end{bmatrix} \\
Q_p &= \begin{bmatrix} m_p && \vec{r}_p^T \\ \vec{0} && \hat{M}_p - m_p^{-1} \vec{q}_p \vec{r}_p^T\end{bmatrix} \equiv \begin{bmatrix} 1 && m_p^{-1} \vec{r}_p^T \\ \vec{0} && M_{p+1} \end{bmatrix}
\end{align*}
$$
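
As a toy illustration of a single elimination step (a hypothetical pure-Python sketch with made-up numbers, not the power-grid-model implementation):

```python
# One Gaussian elimination step: split M into the pivot m_p, the pivot row
# r_p, the pivot column q_p and the remaining block, then form the Schur
# complement M_{p+1} = M_hat - m_p^{-1} q_p r_p^T.
def eliminate_step(M):
    m_p = M[0][0]
    r_p = M[0][1:]                      # rest of the pivot row
    q_p = [row[0] for row in M[1:]]     # rest of the pivot column
    l_col = [q / m_p for q in q_p]      # the column that ends up in L_p
    M_next = [
        [M[1 + i][1 + j] - l_col[i] * r_p[j] for j in range(len(r_p))]
        for i in range(len(q_p))
    ]
    return l_col, M_next

l_col, M_next = eliminate_step([[2.0, 4.0], [6.0, 13.0]])
# l_col = [6/2] = [3.0]; M_next = [[13 - 3*4]] = [[1.0]]
```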

@@ -152,14 +153,23 @@ m_0 && \vec{r}_0^T && && \\
$$

The process described in the above assumes no pivot permutations were necessary. If permutations
are required, they are tracked in separate row-permutation and column-permutation matrices $P$
and $Q$, such that $PAQ = LU$, which can be rewritten as

$$
A = P^{-1}LUQ^{-1}
$$

#### Dense LU factorization algorithm

The power grid model uses an in-place approach. Permutations are accumulated in separate
permutation matrices $P$ and $Q$.
Let $M\equiv M\left[0:N, 0:N\right]$ be the $N\times N$-matrix and $M\left[i,j\right]$ its element
at (0-based) indices $(i,j)$, where $i,j = 0..(N-1)$.

1. Initialize the permutations $P$ and $Q$ to the identity permutation.
2. Initialize fill-in elements to $0$.
3. Loop over all rows: $p = 0..(N-1)$:
    1. Set the remaining matrix: $M_p \gets M\left[p:N,p:N\right]$ with size $N_p\times N_p$.
    2. Find the largest element $M_p\left[i_p,j_p\right]$ in $M_p$ by magnitude. This is the pivot
       element.
@@ -180,7 +190,15 @@ at (0-based) indices $(i,j)$, where $i,j = 0..(N-1)$.
    6. Apply Gaussian elimination for the current pivot element:
        1. $M_p\left[0,0:N_p\right] \gets \frac{1}{M_p[0,0]}M_p\left[0,0:N_p\right]$
        2. $M_p\left[1:N_p,0:N_p\right] \gets M_p\left[1:N_p,0:N_p\right] - M_p\left[1:N_p,0\right] \otimes M_p\left[0,0:N_p\right]$
    7. Accumulate the permutation matrices:
        1. In $P$: swap $p \leftrightarrow p + i_p$
        2. In $Q$: swap $p \leftrightarrow p + j_p$
    8. Continue with the next $p$ to factorize the bottom-right block.

$L$ is now the matrix composed of ones on the diagonal and the bottom-left off-diagonal triangle of
$M$ and zeros in the upper-right off-diagonal triangle. $U$ is the matrix composed of the
upper-right triangle of $M$ including the diagonal elements and zeros in the lower-left off-diagonal
triangle.
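
The algorithm above can be sketched in pure Python (a hypothetical illustration, not the actual in-place implementation; $P$ and $Q$ are returned as index lists, such that the unit-lower and upper triangles of the result factorize $A\left[P[i],Q[j]\right]$):

```python
# Hypothetical sketch of dense LU factorization with complete pivoting.
# The returned M stores L (unit lower triangle, below the diagonal) and
# U (upper triangle including the diagonal) in one matrix.
def lu_full_pivot(A):
    N = len(A)
    M = [row[:] for row in A]   # work on a copy, factorized in place
    P = list(range(N))          # row permutation, initialized to identity
    Q = list(range(N))          # column permutation, initialized to identity
    for p in range(N):
        # find the largest remaining element by magnitude: the pivot
        i_p, j_p = max(
            ((i, j) for i in range(p, N) for j in range(p, N)),
            key=lambda ij: abs(M[ij[0]][ij[1]]),
        )
        # swap the pivot row and column into place; accumulate permutations
        M[p], M[i_p] = M[i_p], M[p]
        P[p], P[i_p] = P[i_p], P[p]
        for row in M:
            row[p], row[j_p] = row[j_p], row[p]
        Q[p], Q[j_p] = Q[j_p], Q[p]
        # Gaussian elimination for the current pivot, storing L in place
        for i in range(p + 1, N):
            factor = M[i][p] / M[p][p]
            M[i][p] = factor
            for j in range(p + 1, N):
                M[i][j] -= factor * M[p][j]
    return M, P, Q
```

Row operations commute with column permutations, so swapping whole columns mid-factorization is equivalent to permuting the columns of $A$ up front.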

```{note}
In the actual implementation, we use a slightly different order of operations: instead of raising
@@ -190,7 +208,57 @@ change the functional behavior.

### Block-sparse LU factorization

The LU factorization process for block-sparse matrices is similar to that for
[dense matrices](#dense-lu-factorization), but in this case, $m_p$ is a block element, and
$\vec{q}_p$, $\vec{r}_p^T$ and $\hat{M}_p$ consist of block elements as well. Notice that the
inverse $m_p^{-1}$ can be calculated from its LU decomposition, which can be obtained from the
[dense LU factorization process](#dense-lu-factorization-process).

#### Block-sparse LU factorization process

The Gaussian elimination process itself is as usual. Completely analogously to and following the
same conventions as [before](#dense-lu-factorization-process), let
$\underline{M}_p\equiv\begin{bmatrix}\underline{m}_p && \vec{\underline{r}}_p^T \\ \vec{\underline{q}}_p && \hat{\underline{M}}_p\end{bmatrix}$
be the block-sparse matrix to decompose. $\underline{m}_p$ is a dense block that can be
[LU-factorized](#dense-lu-factorization-process):
$\underline{m}_p = \underline{p}_p^{-1} \underline{l}_p \underline{u}_p \underline{q}_p^{-1}$, where
the lower-case notation helps avoid confusion with the block-sparse matrix components. Gaussian
elimination constructs the matrices

$$
\begin{align*}
\underline{L}_p &= \begin{bmatrix} 1 && \vec{\underline{0}}^T \\ \overrightarrow{\underline{m}_p^{-1}\underline{q}_p} && \hat{\underline{D}}_p\end{bmatrix} \\
\underline{Q}_p &= \begin{bmatrix} \underline{m}_p && \vec{\underline{r}}_p^T \\ \vec{\underline{0}} && \hat{\underline{M}}_p - \widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}\end{bmatrix} \equiv \begin{bmatrix} \underline{1} && \vec{\underline{r}}_p^T \\ \vec{\underline{0}} && \underline{M}_{p+1} \end{bmatrix}
\end{align*}
$$

Here, $\overrightarrow{\underline{m}_p^{-1}\underline{q}_p}$ is symbolic notation for the
block-vector of solutions of the equation $\underline{m}_p x_{k;p} = \underline{q}_{k;p}$, where
$k = 0..(p-1)$. Similarly, $\widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}$ is
symbolic notation for the block-matrix of solutions of the equation
$\underline{m}_p x_{k,l;p} = \underline{q}_{k,p} \underline{r}_{p,l}^T$, where
$k,l = 0..(p-1)$. That is:

$$
\begin{align*}
\overrightarrow{\underline{m}_p^{-1}\underline{q}_p}
&= \begin{bmatrix}\underline{m}_p^{-1}\underline{q}_{0,p} \\
\vdots \\
\underline{m}_p^{-1} \underline{q}_{N-1,p} \end{bmatrix} \\
\widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}
&= \begin{bmatrix}\underline{m}_p^{-1} \underline{q}_{0,p} \underline{r}_{p,0}^T && \cdots &&
\underline{m}_p^{-1} \underline{q}_{0,p} \underline{r}_{p,N-1}^T \\
\vdots && \ddots && \vdots \\
\underline{m}_p^{-1} \underline{q}_{N-1,p} \underline{r}_{p,0}^T && \cdots &&
\underline{m}_p^{-1} \underline{q}_{N-1,p} \underline{r}_{p,N-1}^T \end{bmatrix}
\end{align*}
$$

Instead of calculating $\underline{m}_p^{-1}$ explicitly to obtain $\underline{m}_p^{-1} a_p$, the
power grid model first solves the matrix equation $\underline{l}_p y_p = \underline{p}_p a_p$,
followed by solving $\underline{u}_p z_p = y_p$. The end result is then
$\underline{m}_p^{-1} a_p = \underline{q}_p z_p$.

Iteratively applying the above factorization process yields $L$ and $U$, as well as $P$ and $Q$.
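
The two triangular solves can be sketched as follows (hypothetical pure-Python helper; `l`, `u`, `p` and `q` stand for the dense-block factors, with the permutations as index lists and `l` having a unit diagonal):

```python
# Hypothetical sketch: compute m^{-1} a via two triangular solves instead of
# forming m^{-1}, given m = p^{-1} l u q^{-1}.
def solve_with_factors(l, u, p, q, a):
    n = len(a)
    b = [a[p[i]] for i in range(n)]        # apply the row permutation: p a
    y = [0.0] * n
    for i in range(n):                     # forward substitution: l y = p a
        y[i] = b[i] - sum(l[i][j] * y[j] for j in range(i))
    z = [0.0] * n
    for i in reversed(range(n)):           # backward substitution: u z = y
        z[i] = (y[i] - sum(u[i][j] * z[j] for j in range(i + 1, n))) / u[i][i]
    x = [0.0] * n
    for i in range(n):                     # apply the column permutation: q z
        x[q[i]] = z[i]
    return x
```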

#### Block-sparse indices

@@ -249,21 +317,89 @@ M_{0,0} && && && M_{0,3} \\
M_{3,0} && && M_{3,2} && M_{3,3}
\end{pmatrix} &\equiv
\begin{bmatrix}
M_{0,0} && M_{0,3} && M_{1,1} && M_{1,2} && M_{2,1} && M_{2,2} && M_{2,3} && M_{3,0} && M_{3,2} && M_{3,3} \\
[[0 && 3] && [1 && 2] && [1 && 2 && 3] && [0 && 2 && 3]] \\
\end{bmatrix} \\
&\equiv
\begin{bmatrix}
M_{0,0} && M_{0,3} && M_{1,1} && M_{1,2} && M_{2,1} && M_{2,2} && M_{2,3} && M_{3,0} && M_{3,2} && M_{3,3} && \\
[0 && 3 && 1 && 2 && 1 && 2 && 3 && 0 && 2 && 3] && \\
[0 && && 2 && && 4 && && && 7 && && && && 10]
\end{bmatrix}
\end{align*}
$$

In the first equation, the upper row contains the present block entries and the bottom row their
column indices per row, yielding a flattened representation of the matrix. In the last equivalence,
the column indices, in turn, are also flattened into a separate flattened representation of the
column indices plus the start indices of each row - the index pointer (`indptr`). Note that the
size of `indptr` is 1 greater than the number of rows. This enables describing the number of
elements in each row as the difference between its element in `indptr` and the next element.

Looping over the rows and columns becomes trivial. Let $\text{indptr}$ be the index pointer. The
double loop becomes:

1. Loop over all rows: $i=0..(N-1)$:
    1. The column indices for this row start at $\text{indptr}\left[i\right]$.
    2. The column indices for this row stop at $\text{indptr}\left[i+1\right]$.
    3. Loop over all columns:
       $j=\left(\text{indptr}\left[i\right]\right)..\left(\text{indptr}\left[i+1\right]\right)$:
        1. The current row index is $i$.
        2. The current column index is the element at position $j$ in the flattened column indices.
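
For the example above, the flattened representation and the row loop can be sketched as (pure-Python illustration; the names `col_indices` and `row_columns` are made up for this sketch):

```python
# CSR-style structure of the example matrix: flattened column indices plus
# the index pointer; indptr has one more entry than there are rows.
col_indices = [0, 3, 1, 2, 1, 2, 3, 0, 2, 3]
indptr = [0, 2, 4, 7, 10]

def row_columns(i):
    """Column indices of the entries present in row i."""
    return col_indices[indptr[i]:indptr[i + 1]]

# the number of entries per row is the difference of consecutive indptr values
entries_per_row = [indptr[i + 1] - indptr[i] for i in range(len(indptr) - 1)]
```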

### Block-sparse LU solving

Solving an equation $M x = b$ using a
[pre-factored LU-factorization](#block-sparse-lu-factorization) is done using the regular forward
and backward substitutions. It starts by permuting the rows: $b^{\prime} = Pb$. After that, the
forward substitution step essentially amounts to solving the matrix equation $Ly = b^{\prime}$,
followed by the backward substitution by solving $Uz = y$. The final result is then obtained by
applying the column permutation: $x = Qz$.

#### Forward substitution

The row permutation is applied as follows.

1. Loop over all block-rows: $i=0..(N-1)$:
    1. If the matrix is a block matrix:
        1. Apply the current row's block permutation: $b[i] \gets P[i] b[i]$.
        2. Proceed.
    2. Else:
        1. Proceed.

The equation $Ly = Pb$ is solved as follows.

1. Loop over all block-rows: $i=0..(N-1)$:
    1. Loop over all lower-triangle off-diagonal columns (beware of sparsity): $j=0..(i-1)$:
        1. $b[i] \gets b[i] - L[i,j] \cdot b[j]$.
        2. Continue with the next block-column.
    2. If the matrix is a block matrix:
        1. Follow the same steps within the block.
        2. Proceed.
    3. Else:
        1. Proceed.

The equation $Uz = y$ is solved as follows.

1. Loop over all block-rows in reverse order: $i=(N-1)..0$:
    1. Loop over all upper-triangle off-diagonal columns (beware of sparsity): $j=(i+1)..(N-1)$:
        1. $b[i] \gets b[i] - U[i,j] \cdot b[j]$.
        2. Continue with the next block-column.
    2. Handle the diagonal element:
        1. If the matrix is a block matrix:
            1. Follow the same steps within the block.
            2. Proceed.
        2. Else:
            1. $b[i] \gets b[i] / U[i,i]$.
            2. Proceed.

Apply the column permutation as follows.

1. Loop over all block-rows: $i=0..(N-1)$:
    1. If the matrix is a block matrix:
        1. Apply the current row's block permutation: $b[i] \gets Q[i] b[i]$.
        2. Proceed.
    2. Else:
        1. Proceed.
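
For the scalar (non-block) case, the three stages can be sketched in pure Python (hypothetical helper; $L$ and $U$ are stored together in one CSR matrix, with the unit diagonal of $L$ implied, and the permutations given as index lists):

```python
# Hypothetical sketch of LU solving with a combined CSR-stored L and U:
# row permutation, forward substitution, backward substitution with the
# diagonal, and finally the column permutation.
def lu_solve_csr(data, col_indices, indptr, perm_p, perm_q, b):
    n = len(indptr) - 1
    x = [b[perm_p[i]] for i in range(n)]       # row permutation: b' = P b
    # forward substitution: only lower-triangle entries (j < i)
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = col_indices[k]
            if j < i:
                x[i] -= data[k] * x[j]
    # backward substitution: upper-triangle entries (j > i), then the diagonal
    for i in reversed(range(n)):
        diag = None
        for k in range(indptr[i], indptr[i + 1]):
            j = col_indices[k]
            if j > i:
                x[i] -= data[k] * x[j]
            elif j == i:
                diag = data[k]
        x[i] /= diag
    out = [0.0] * n                            # column permutation: x = Q z
    for i in range(n):
        out[perm_q[i]] = x[i]
    return out
```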

### Pivot perturbation

@@ -300,20 +436,24 @@ be the matrix, $\left\|M\right\|_{\infty ,\text{bwod}}$ the
$\text{direction}$ ensures that the complex phase of the pivot element is preserved, with a
fallback to the positive real axis when the pivot element is identically zero.

### Iterative refinement of LU solvers

This algorithm is heavily inspired by the GESP algorithm described in
[Li99](https://www.semanticscholar.org/paper/A-Scalable-Sparse-Direct-Solver-Using-Static-Li-Demmel/7ea1c3360826ad3996f387eeb6d70815e1eb3761).

[LU solving](#block-sparse-lu-solving) with an [LU-decomposition](#block-sparse-lu-factorization)
obtained using [pivot perturbation](#pivot-perturbation) yields only approximate results, because
the LU-decomposition itself is only approximate. Because of that, an iterative refinement process
is required to improve the solution to the matrix equation $A \cdot x = b$.

The iterative refinement process is as follows: in iteration step $i$, it assumes an existing
approximation $x_i$ for $x$. It then defines the difference between the current best and the actual
solution $\Delta x = x - x_i$. Substituting in the original equation yields
$A \cdot (x_i + \Delta x) = b$, so that $A \cdot \Delta x = b - A \cdot x_i =: r$, where the
residual $r$ can be calculated. An estimation for the left-hand side can be obtained by using the
pivot-perturbed matrix $\tilde{A}$ instead of the original matrix $A$. Convergence can be reached if
$r \to 0$, since then also $\left\|\Delta x\right\| \to 0$. Solving for $\Delta x$ and substituting
back into $x_{i+1} = x_i + \Delta x$ provides the next best approximation $x_{i+1}$ for $x$.
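
The refinement loop can be sketched as a pure-Python toy (hypothetical; a 2×2 system with a deliberately perturbed direct solver standing in for the pivot-perturbed LU solver):

```python
# Iterative refinement: repeatedly solve A~ . Delta x = r with the perturbed
# solver, where r = b - A x_i uses the *original* matrix A, and update
# x_{i+1} = x_i + Delta x.
def iterative_refinement(A, b, solve, max_iter=20, tol=1e-12):
    n = len(b)
    x = solve(b)                           # initial approximation
    for _ in range(max_iter):
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) < tol:   # residual small enough: converged
            break
        dx = solve(r)                      # approximate solve for Delta x
        x = [x[i] + dx[i] for i in range(n)]
    return x

# toy example: exact A, stand-in for the pivot-perturbed matrix A~
A = [[4.0, 1.0], [1.0, 3.0]]
A_pert = [[4.1, 1.0], [1.0, 3.1]]

def perturbed_solve(rhs):
    # direct 2x2 solve with the perturbed matrix (Cramer's rule)
    det = A_pert[0][0] * A_pert[1][1] - A_pert[0][1] * A_pert[1][0]
    return [(rhs[0] * A_pert[1][1] - rhs[1] * A_pert[0][1]) / det,
            (A_pert[0][0] * rhs[1] - A_pert[1][0] * rhs[0]) / det]

x = iterative_refinement(A, [6.0, 5.0], perturbed_solve)
# converges to the solution of A x = [6, 5], i.e. x = [13/11, 14/11]
```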

A measure for the quality of the approximation is given by the $\text{backward_error}$ (see also
[backward error formula](#improved-backward-error-calculation)).
@@ -389,7 +529,8 @@ $\sum_j \left|A[i,j]\right| \left|x[j]\right|$ are small and, therefore, their s
rounding errors, which may be several orders of magnitude larger than machine precision.

[Li99](https://www.semanticscholar.org/paper/A-Scalable-Sparse-Direct-Solver-Using-Static-Li-Demmel/7ea1c3360826ad3996f387eeb6d70815e1eb3761)
uses the following backward error in the
[iterative refinement algorithm](#iterative-refinement-of-lu-solvers):

$$
\begin{align*}
