@@ -120,14 +120,15 @@ The power grid model uses a modified version of the

The Gaussian elimination process itself is as usual. Let
$M_p\equiv\begin{bmatrix}m_p && \vec{r}_p^T \\ \vec{q}_p && \hat{M}_p\end{bmatrix}$, where $p$ is
- the current pivot element at the top left of the matrix, $\vec{q}$ and $\vec{r}_p^T$ are the
- associated column and row vectors containing the rest of the pivot column and row and $\hat{M}_p$ is
- the bottom-right block of the matrix. Gaussian elimination constructs the matrices
+ the current pivot index at the top left of the matrix, $m_p$ its matrix element value, $\vec{q}_p$
+ and $\vec{r}_p^T$ are the associated column and row vectors containing the rest of the pivot column
+ and row, and $\hat{M}_p$ is the bottom-right block of the matrix. Gaussian elimination constructs
+ the matrices

$$
\begin{align*}
- L_p &= \begin{bmatrix} 1 && \vec{0}^T \\ m_p^{-1} \vec{q}_p && \hat{D}_p\end{bmatrix}
- Q_p &= \begin{bmatrix} m_p && \vec{r}_p^T \\ \vec{0} && \hat{M}_p - m_p^{-1} \vec{q}_p \vec{r}_p^T\end{bmatrix} \equiv \begin{bmatrix} 1 && \tilde{\vec{r}}_p^T \\ \vec{0} && M_{p+1} \end{bmatrix}
+ L_p &= \begin{bmatrix} 1 && \vec{0}^T \\ m_p^{-1} \vec{q}_p && \hat{D}_p\end{bmatrix} \\
+ Q_p &= \begin{bmatrix} m_p && \vec{r}_p^T \\ \vec{0} && \hat{M}_p - m_p^{-1} \vec{q}_p \vec{r}_p^T\end{bmatrix} \equiv \begin{bmatrix} 1 && \vec{r}_p^T \\ \vec{0} && M_{p+1} \end{bmatrix}
\end{align*}
$$
@@ -152,14 +153,23 @@ m_0 && \vec{r}_0^T && && \\

$$

The process described above assumes no pivot permutations were necessary. If permutations
- are required, they are kept track of in a separate permution matrix, so that $A = LUP$.
+ are required, they are kept track of in separate row-permutation and column-permutation matrices
+ $P$ and $Q$, such that $PAQ = LU$, which can be rewritten as
+
+ $$
+ A = P^{-1}LUQ^{-1}
+ $$

#### Dense LU factorization algorithm

+ The power grid model uses an in-place approach. Permutations are accumulated in $P$ and $Q$ as the
+ factorization progresses.
+
Let $M\equiv M\left[0:N, 0:N\right]$ be the $N\times N$ matrix and $M\left[i,j\right]$ its element
at (0-based) indices $(i,j)$, where $i,j = 0..(N-1)$.

- 1. Loop over all rows: $p = 0..(N-1)$:
+ 1. Initialize the permutations $P$ and $Q$ to the identity permutation.
+ 2. Initialize fill-in elements to $0$.
+ 3. Loop over all rows: $p = 0..(N-1)$:
    1. Set the remaining matrix: $M_p \gets M\left[p:N, p:N\right]$ with size $N_p\times N_p$.
    2. Find the largest element $M_p\left[i_p,j_p\right]$ in $M_p$ by magnitude. This is the pivot
       element.
@@ -180,7 +190,15 @@ at (0-based) indices $(i,j)$, where $i,j = 0..(N-1)$.
    6. Apply Gaussian elimination for the current pivot element:
       1. $M_p\left[0, 0:N_p\right] \gets \frac{1}{M_p[0,0]} M_p\left[0, 0:N_p\right]$
       2. $M_p\left[1:N_p, 0:N_p\right] \gets M_p\left[1:N_p, 0:N_p\right] - M_p\left[1:N_p, 0\right] \otimes M_p\left[0, 0:N_p\right]$
-    7. Continue with the next $p$ to factorize the the bottom-right block.
+    7. Accumulate the permutation matrices:
+       1. In $P$: swap $p \leftrightarrow p + i_p$
+       2. In $Q$: swap $p \leftrightarrow p + j_p$
+    8. Continue with the next $p$ to factorize the bottom-right block.
+
+ $L$ is now the matrix composed of ones on the diagonal and the bottom-left off-diagonal triangle of
+ $M$ and zeros in the upper-right off-diagonal triangle. $U$ is the matrix composed of the
+ upper-right triangle of $M$ including the diagonal elements and zeros in the lower-left off-diagonal
+ triangle.
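As an illustration, the steps above can be sketched in a few lines of plain Python. This is a toy sketch, not the power-grid-model implementation; the function name and the list-of-lists matrix representation are purely illustrative.

```python
# Toy sketch of in-place dense LU factorization with complete pivoting.
def lu_factorize_inplace(M):
    """Factorize M in place so that A[P[i]][Q[j]] == (L U)[i][j], with L
    unit-lower and U upper triangular, both stored in M. Returns the row
    and column permutations as index lists."""
    N = len(M)
    P = list(range(N))  # row permutation, initialized to the identity
    Q = list(range(N))  # column permutation, initialized to the identity
    for p in range(N):
        # find the largest element of the remaining block by magnitude
        i_p, j_p = max(
            ((i, j) for i in range(p, N) for j in range(p, N)),
            key=lambda ij: abs(M[ij[0]][ij[1]]),
        )
        # swap the pivot row and column to the top-left of the remaining block
        M[p], M[i_p] = M[i_p], M[p]
        for row in M:
            row[p], row[j_p] = row[j_p], row[p]
        # accumulate the permutations
        P[p], P[i_p] = P[i_p], P[p]
        Q[p], Q[j_p] = Q[j_p], Q[p]
        # Gaussian elimination for the current pivot element
        for i in range(p + 1, N):
            M[i][p] /= M[p][p]  # multiplier, stored in the L part
            for j in range(p + 1, N):
                M[i][j] -= M[i][p] * M[p][j]  # update the bottom-right block
    return P, Q
```

After the call, `M` holds the multipliers below the diagonal ($L$, with an implicit unit diagonal) and $U$ on and above the diagonal, matching the description above.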

``` {note}
In the actual implementation, we use a slightly different order of operations: instead of raising
@@ -190,7 +208,57 @@ change the functional behavior.

### Block-sparse LU factorization

- The LU factorization process for block-sparse matrices is similar to that for dense matrices.
+ The LU factorization process for block-sparse matrices is similar to that for
+ [dense matrices](#dense-lu-factorization), but in this case, $m_p$ is a block element, and
+ $\vec{q}_p$, $\vec{r}_p^T$ and $\hat{M}_p$ consist of block elements as well. Notice that the
+ inverse $m_p^{-1}$ can be calculated from its LU decomposition, which can be obtained from the
+ [dense LU factorization process](#dense-lu-factorization-process).
+
+ #### Block-sparse LU factorization process
+
+ The Gaussian elimination process itself is as usual. Completely analogously to, and following the
+ same conventions as, [before](#dense-lu-factorization-process), let
+ $\underline{M}_p\equiv\begin{bmatrix}\underline{m}_p && \vec{\underline{r}}_p^T \\ \vec{\underline{q}}_p && \hat{\underline{M}}_p\end{bmatrix}$
+ be the block-sparse matrix to decompose. $\underline{m}_p$ is a dense block that can be
+ [LU-factorized](#dense-lu-factorization-process):
+ $\underline{m}_p = \underline{p}_p^{-1} \underline{l}_p \underline{u}_p \underline{q}_p^{-1}$, where
+ the lower-case notation helps avoid confusion with the block-sparse matrix components. Gaussian
+ elimination constructs the matrices
+
+ $$
+ \begin{align*}
+ \underline{L}_p &= \begin{bmatrix} 1 && \vec{\underline{0}}^T \\ \overrightarrow{\underline{m}_p^{-1}\underline{q}_p} && \hat{\underline{D}}_p\end{bmatrix} \\
+ \underline{Q}_p &= \begin{bmatrix} \underline{m}_p && \vec{\underline{r}}_p^T \\ \vec{\underline{0}} && \hat{\underline{M}}_p - \widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}\end{bmatrix} \equiv \begin{bmatrix} \underline{1} && \vec{\underline{r}}_p^T \\ \vec{\underline{0}} && \underline{M}_{p+1} \end{bmatrix}
+ \end{align*}
+ $$
+
+ Here, $\overrightarrow{\underline{m}_p^{-1}\underline{q}_p}$ is symbolic notation for the
+ block-vector of solutions of the equation $\underline{m}_p x_{k;p} = \underline{q}_{k;p}$, where
+ $k = 0..(N-1)$. Similarly, $\widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}$ is
+ symbolic notation for the block-matrix of solutions of the equation
+ $\underline{m}_p x_{k,l;p} = \underline{q}_{k,p} \underline{r}_{p,l}^T$, where
+ $k,l = 0..(N-1)$. That is:
+
+ $$
+ \begin{align*}
+ \overrightarrow{\underline{m}_p^{-1}\underline{q}_p}
+ &= \begin{bmatrix}\underline{m}_p^{-1}\underline{q}_{0,p} \\
+ \vdots \\
+ \underline{m}_p^{-1} \underline{q}_{N-1,p} \end{bmatrix} \\
+ \widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}
+ &= \begin{bmatrix}\underline{m}_p^{-1} \underline{q}_{0,p} \underline{r}_{p,0}^T && \cdots &&
+ \underline{m}_p^{-1} \underline{q}_{0,p} \underline{r}_{p,N-1}^T \\
+ \vdots && \ddots && \vdots \\
+ \underline{m}_p^{-1} \underline{q}_{N-1,p} \underline{r}_{p,0}^T && \cdots &&
+ \underline{m}_p^{-1} \underline{q}_{N-1,p} \underline{r}_{p,N-1}^T \end{bmatrix}
+ \end{align*}
+ $$
+
+ Instead of calculating $\underline{m}_p^{-1}$ to obtain $\underline{m}_p^{-1} a_p$, the power grid
+ model first solves the matrix equation $\underline{l}_p y_p = \underline{p}_p a_p$, followed by
+ solving $\underline{u}_p z_p = y_p$. The end-result is then
+ $\underline{m}_p^{-1} a_p = \underline{q}_p z_p$.
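This substitution trick can be sketched in plain Python. The names are illustrative (not from the power-grid-model source); `perm_p` and `perm_q` stand for the dense factor's row and column permutations as index lists, and the factors come from a dense factorization of the block.

```python
# Toy sketch: apply m^{-1} to a vector using the factorization
# m = p^{-1} l u q^{-1} instead of forming the inverse m^{-1}.
def lu_apply_inverse(l, u, perm_p, perm_q, a):
    n = len(a)
    b = [a[perm_p[i]] for i in range(n)]  # row permutation: p a
    y = [0.0] * n
    for i in range(n):  # forward substitution: l y = p a
        y[i] = b[i] - sum(l[i][j] * y[j] for j in range(i))
    z = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: u z = y
        z[i] = (y[i] - sum(u[i][j] * z[j] for j in range(i + 1, n))) / u[i][i]
    x = [0.0] * n
    for i in range(n):  # column permutation: x = q z
        x[perm_q[i]] = z[i]
    return x
```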
+
+ Iteratively applying the above factorization process yields $L$ and $U$, as well as $P$ and $Q$.

#### Block-sparse indices

@@ -249,21 +317,89 @@ M_{0,0} && && && M_{0,3} \\
M_{3,0} && && M_{3,2} && M_{3,3}
\end{pmatrix} &\equiv
\begin{bmatrix}
+ M_{0,0} && M_{0,3} && M_{1,1} && M_{1,2} && M_{2,1} && M_{2,2} && M_{2,3} && M_{3,0} && M_{3,2} && M_{3,3} \\
+ [[0 && 3] && [1 && 2] && [1 && 2 && 3] && [0 && 2 && 3]] \\
+ \end{bmatrix} \\
+ &\equiv
+ \begin{bmatrix}
M_{0,0} && M_{0,3} && M_{1,1} && M_{1,2} && M_{2,1} && M_{2,2} && M_{2,3} && M_{3,0} && M_{3,2} && M_{3,3} && \\
- [[ 0 && 3] && [ 1 && 2] && [ 1 && 2 && 3] && [ 0 && 2 && 3] ] && \\
- [0 && && 2 && && 4 && && 7 && && 10]
+ [0 && 3 && 1 && 2 && 1 && 2 && 3 && 0 && 2 && 3] && \\
+ [0 && && 2 && && 4 && && && 7 && && && && 10]
\end{bmatrix}
\end{align*}
$$

- In the right-hand side, the upper row contains the present block entries and the bottom row their
- column indices. The row indices are implied by the index of the inner vector in the outer one.
+ In the first equivalence, the upper row contains the present block entries and the bottom row their
+ column indices per row, which yields a flattened representation of the matrix. In the last
+ equivalence, the column indices, in turn, are also flattened into a single list, accompanied by the
+ start index of each row - the index pointer (`indptr`). Note that the size of `indptr` is 1 greater
+ than the number of rows. This enables describing the number of elements in each row as the
+ difference between its element in `indptr` and the next element.
- Looping over the rows and columns becomes trivial. Let $\text{indptr}$ be the nested list of column
- indices. The double loop becomes:
+ Looping over the rows and columns becomes trivial. Let $\text{indptr}$ be the index pointer. The
+ double loop becomes:

1. Loop all rows: $i=0..(N-1)$:
-    1. The list of indices for this row is
+    1. The column indices for this row start at $\text{indptr}\left[i\right]$.
+    2. The column indices for this row stop at $\text{indptr}\left[i+1\right]$.
+    3. Loop all columns:
+       $j=\left(\text{indptr}\left[i\right]\right)..\left(\text{indptr}\left[i+1\right]\right)$:
+       1. The current row index is $i$.
+       2. The current column index is the $j$-th element of the flattened column indices.
+
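The double loop can be illustrated with the example sparsity pattern above. The variable names (`indices`, `entries`) are illustrative; `indices` holds the flattened column indices and `indptr` the start of each row within it.

```python
# Traverse the flattened block-sparse index structure of the example above.
indices = [0, 3, 1, 2, 1, 2, 3, 0, 2, 3]
indptr = [0, 2, 4, 7, 10]  # one entry more than the number of rows

entries = []
for i in range(len(indptr) - 1):  # loop over all rows
    for j in range(indptr[i], indptr[i + 1]):  # entries of row i
        entries.append((i, indices[j]))  # (row index, column index)
```

Running this reproduces exactly the block positions present in the example matrix.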
+ ### Block-sparse LU solving
+
+ Solving an equation $M x = b$ using a
+ [pre-factored LU-factorization](#block-sparse-lu-factorization) is done using the regular forward
+ and backward substitutions. It starts by permuting the rows: $b^{\prime} = Pb$. After that, the
+ forward substitution step essentially amounts to solving the matrix equation $Ly = b^{\prime}$,
+ followed by the backward substitution, solving $Uz = y$. The final result is then obtained by
+ applying the column permutation: $x = Qz$.
+
+ #### Forward substitution
+
+ The row permutation is applied as follows.
+
+ 1. Loop over all block-rows: $i=0..(N-1)$:
+    1. If the matrix is a block matrix:
+       1. Apply the current row's block permutation: $b[i] \gets P[i] b[i]$.
+       2. Proceed.
+    2. Else:
+       1. Proceed.
+
+ The equation $Ly = Pb$ is solved as follows.
+
+ 1. Loop over all block-rows: $i=0..(N-1)$:
+    1. Loop over all lower-triangle off-diagonal columns (beware of sparsity): $j=0..(i-1)$:
+       1. $b[i] \gets b[i] - L[i,j] \cdot b[j]$.
+       2. Continue with the next block-column.
+    2. If the matrix is a block matrix:
+       1. Follow the same steps within the block.
+       2. Proceed.
+    3. Else:
+       1. Proceed.
+
+ The equation $Uz = y$ is solved as follows.
+
+ 1. Loop over all block-rows in reverse order: $i=(N-1)..0$:
+    1. Loop over all upper-triangle off-diagonal columns (beware of sparsity): $j=(i+1)..(N-1)$:
+       1. $b[i] \gets b[i] - U[i,j] \cdot b[j]$.
+       2. Continue with the next block-column.
+    2. Handle the diagonal element:
+       1. If the matrix is a block matrix:
+          1. Follow the same steps within the block.
+          2. Proceed.
+       2. Else:
+          1. $b[i] \gets b[i] / U[i,i]$.
+          2. Proceed.
+
+ Apply the column permutation as follows.
+
+ 1. Loop over all block-rows: $i=0..(N-1)$:
+    1. If the matrix is a block matrix:
+       1. Apply the current row's block permutation: $b[i] \gets Q[i] b[i]$.
+       2. Proceed.
+    2. Else:
+       1. Proceed.
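The substitution steps above can be sketched in plain Python using the sparse per-row storage from the previous section. This is an illustrative scalar (non-block) sketch, not the actual implementation: each row of $L$ and $U$ is a list of `(column index, value)` pairs, and $L$ has an implicit unit diagonal.

```python
# Toy sketch of sparse forward and backward substitution, L U x = b.
def sparse_lu_solve(L_rows, U_rows, rhs):
    n = len(rhs)
    b = list(rhs)
    for i in range(n):  # forward substitution: solve L y = b in place
        for j, v in L_rows[i]:  # lower-triangle entries only (j < i)
            b[i] -= v * b[j]
    for i in reversed(range(n)):  # backward substitution: solve U z = y
        diag = None
        for j, v in U_rows[i]:  # upper-triangle entries only (j >= i)
            if j == i:
                diag = v  # remember the diagonal element
            else:
                b[i] -= v * b[j]
        b[i] /= diag  # handle the diagonal element
    return b
```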

### Pivot perturbation

@@ -300,20 +436,24 @@ be the matrix, $\left\|M\right\|_{\infty ,\text{bwod}}$ the
$\text{direction}$ ensures that the complex phase of the pivot element is preserved, with a fallback
to the positive real axis when the pivot element is identically zero.
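The direction handling can be sketched as follows. This is a hedged illustration only: the function name and the way the threshold `epsilon` is chosen are assumptions, not taken from the power-grid-model source.

```python
# Toy sketch of pivot perturbation: pivots smaller in magnitude than an
# assumed threshold epsilon are pushed up to that magnitude along their
# own complex phase ("direction"), falling back to the positive real
# axis when the pivot element is identically zero.
def perturb_pivot(pivot: complex, epsilon: float) -> complex:
    direction = pivot / abs(pivot) if pivot != 0 else 1.0
    if abs(pivot) < epsilon:
        return epsilon * direction
    return pivot
```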
- #### Iterative refinement
+ ### Iterative refinement of LU solvers

This algorithm is heavily inspired by the GESP algorithm described in
[Li99](https://www.semanticscholar.org/paper/A-Scalable-Sparse-Direct-Solver-Using-Static-Li-Demmel/7ea1c3360826ad3996f387eeb6d70815e1eb3761).

- The refinement process improves the solution to the matrix equation $A \cdot x = b$ as follows:
- In iteration step $i$, it assumes an existing approximation $x_i$ for $x$. It then defines
- the difference between the current best and the actual solution $\Delta x = x - x_i$.
- Substiting in the original equation yields $A \cdot (x_i + \Delta x) = b$, so that
- $A \cdot \Delta x = b - A \cdot x_i =: r$, where the residual $r$ can be calculated.
- An estimation for the left-hand side can be obtained by using the pivot-perturbed matrix
- $\tilde{A}$ instead of the original matrix A. Convergence can be reached if $r \to 0$, since then
- also $\left\| \Delta x\right\| \to 0$. Solving for $\Delta x$ and substituting back into
- $x_{i+1} = x_i + \Delta x$ provides the next best approximation $x_{i+1}$ for $x$.
+ [LU solving](#block-sparse-lu-solving) with an [LU-decomposition](#block-sparse-lu-factorization)
+ obtained using [pivot perturbation](#pivot-perturbation) yields only approximate results, because
+ the LU-decomposition itself is only approximate. Because of that, an iterative refinement process
+ is required to improve the solution to the matrix equation $A \cdot x = b$.
+
+ The iterative refinement process is as follows: in iteration step $i$, it assumes an existing
+ approximation $x_i$ for $x$. It then defines the difference between the current best and the actual
+ solution $\Delta x = x - x_i$. Substituting in the original equation yields
+ $A \cdot (x_i + \Delta x) = b$, so that $A \cdot \Delta x = b - A \cdot x_i =: r$, where the
+ residual $r$ can be calculated. An estimation for the left-hand side can be obtained by using the
+ pivot-perturbed matrix $\tilde{A}$ instead of the original matrix $A$. Convergence can be reached if
+ $r \to 0$, since then also $\left\| \Delta x\right\| \to 0$. Solving for $\Delta x$ and substituting
+ back into $x_{i+1} = x_i + \Delta x$ provides the next best approximation $x_{i+1}$ for $x$.
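The refinement loop can be sketched in a few lines of plain Python. The names are illustrative: `solve_approx` stands in for LU solving with the pivot-perturbed factorization $\tilde{A}$, and `A` is the original matrix used for the residual; the stopping criterion here is a simple residual norm, not the backward error used in the actual algorithm.

```python
# Toy sketch of iterative refinement for A x = b with an approximate solver.
def iterative_refinement(A, b, solve_approx, max_iter=20, tol=1e-12):
    n = len(b)
    x = solve_approx(b)  # initial approximation x_0
    for _ in range(max_iter):
        # residual r = b - A x_i, computed with the original matrix A
        r = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) < tol:
            break  # r -> 0 implies |dx| -> 0
        dx = solve_approx(r)  # estimate dx from A~ dx = r
        x = [x[i] + dx[i] for i in range(n)]  # x_{i+1} = x_i + dx
    return x
```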
A measure for the quality of the approximation is given by the $\text{backward_error}$ (see also
the [backward error formula](#improved-backward-error-calculation)).
@@ -389,7 +529,8 @@ $\sum_j \left|A[i,j]\right| \left|x[j]\right|$ are small and, therefore, their s
rounding errors, which may be several orders larger than machine precision.

[Li99](https://www.semanticscholar.org/paper/A-Scalable-Sparse-Direct-Solver-Using-Static-Li-Demmel/7ea1c3360826ad3996f387eeb6d70815e1eb3761)
- uses the following backward error in the [iterative refinement algorithm](#iterative-refinement):
+ uses the following backward error in the
+ [iterative refinement algorithm](#iterative-refinement-of-lu-solvers):

$$
\begin{align*}