矩阵增量式求逆

math

发布日期: 2020-03-15

更新日期: 2020-03-29

文章字数: 1.8k

阅读时长: 8 分

阅读次数:

本文只讨论狭义的可逆矩阵，首先必须是方阵。

背景&动机

暴力求一个矩阵的逆，时间复杂度是\(O(n^3)\)，太高了。如果这个矩阵会随着时间越长越大，每长一次就要求一次逆，如何降低计算开销？再如果这个矩阵不光变大，还有可能缩小，又如何处理？

一个典型的场景是在线计算核矩阵的逆(Wu et al. 2017; Cauwenberghs and Poggio 2001)。

背景知识

假设某可逆矩阵，分为\(A,B,C,D\)四块，则有公式可以计算该分块矩阵的逆。按照子矩阵形状的不同，可分为如下两种情况：

~~公式有点多，不要怕，真的~~

\(A, D\)为方阵：

\[ \begin{align} \begin{split} \left[ \begin{array}{cc} A & B \\ C & D \\ \end{array} \right]^{-1} &= \left[ \begin{array}{cc} A^{-1}+A^{-1}BFCA^{-1} & -A^{-1}BF \\ -FCA^{-1} & F \\ \end{array} \right]\\ \text{where}\qquad F&=(D-CA^{-1}B)^{-1} \end{split} \end{align} \qquad(1)\]

公式 1 等价于：

\[ \begin{align} \begin{split} \left[ \begin{array}{cc} A & B \\ C & D \\ \end{array} \right]^{-1} &= \left[ \begin{array}{cc} E & -EBD^{-1} \\ -D^{-1}CE & D^{-1}+D^{-1}CEBD^{-1} \\ \end{array} \right]\\ \text{where}\qquad E&=(A-BD^{-1}C)^{-1} \end{split} \end{align} \qquad(2)\]

\(B, C\)为方阵：

\[ \begin{align} \begin{split} \left[ \begin{array}{cc} A & B \\ C & D \\ \end{array} \right]^{-1} &= \left[ \begin{array}{cc} -GDB^{-1} & G \\ B^{-1}+B^{-1}AGDB^{-1} & -BAG \\ \end{array} \right]\\ \text{where}\qquad G&=(C-DB^{-1}A)^{-1} \end{split} \end{align} \qquad(3)\]

公式 3 也有等价形式：

\[ \begin{align} \begin{split} \left[ \begin{array}{cc} A & B \\ C & D \\ \end{array} \right]^{-1} &= \left[ \begin{array}{cc} -C^{-1}DH & C^{-1}+C^{-1}DHAC^{-1} \\ H & -HAC^{-1} \\ \end{array} \right]\\ \text{where}\qquad H&=(B-AC^{-1}D)^{-1} \end{split} \end{align} \qquad(4)\]

如何做

有了上述四个公式，即可增量计算矩阵逆。举例如下：

假设现在有一矩阵 \(A\in\mathbb{R}^{n\times n}\)，已知其逆矩阵 \(A^{-1}\)，如果：在\(A\)右侧添加子矩阵\(B\in\mathbb{R}^{n\times m}\)，下方添加子矩阵\(C\in\mathbb{R}^{m\times n}\)，右下方添加子矩阵\(D\in\mathbb{R}^{m\times m}\)，构成大矩阵 \(A'\in\mathbb{R}^{(n+m)\times (n+m)}\)，求\(A'\)的逆。

这个问题是矩阵扩张的情形，可以直接应用公式 1 解决。那么问题反过来，缩小如何解决？

已知矩阵\(A'\in\mathbb{R}^{n\times n}\)是矩阵\(A\)的逆，若在\(A\)（以及\(A'\)）的左上两边各舍弃\(m\)列，求\(A\)剩下的子矩阵的逆。

稍微多了点步骤，这相当于：已知公式 2 的右边，删去了左上角的\(m\)行/列，求\(D^{-1}\)。还可以注意到：

\(A'\) 剩下的东西就相当于是公式 2 右边的 \(D^{-1}+D^{-1}CEBD^{-1}\)
左边删去的相当于是\(-D^{-1}CE\)
上边删去的相当于是 \(-EBD^{-1}\)
左上角删去的是 \(E\)

另外把 \(D^{-1}C\) 与 \(BD^{-1}\) 分别看成整体，可以建立方程组求出\(D^{-1}\)来：

\[ \begin{split} \text{左上删去的部分} = E \\ -[D^{-1}C]E = \text{左边删去的} \\ -E[BD^{-1}] = \text{上边删去的} \\ D^{-1} = \text{剩下的} - [D^{-1}C]E[BD^{-1}] \end{split} \] 把\(D^{-1}C\)、\(BD^{-1}\)当整体解出来，带到第4个公式里，即可得到\(D^{-1}\)。

总结

前面举例介绍了在已知矩阵的右下角新增行列时如何求逆，以及从已知矩阵的左上角删除行列时如何求逆。分别是公式 1 正用，公式 2 反用的情况。

不难发现，公式 1-4 正着用就分别适用于已知 \(A^{-1}, D^{-1}, B^{-1}, C^{-1}\)，求大矩阵的逆；而反着用就分别适用于已知大矩阵的逆，求对应剩余部分的逆的情况。当然了，这样做省时间的前提是要\(m\ll n\)，不然求\(m\times m\)大小的矩阵的逆也是很花时间的，得不偿失了。

def inverse_enlarge(iA, B, C, D, where='TL'):
    '''
    已知iA为某方阵A的逆，在A周围新添加BCD三块，组成新方阵。返回该新方阵的逆。
    where: 指示iA的位置，即在大方阵的哪个角。
    BCD的位置按照如下原则：A与B在同一行。A的对角是D。剩余的是C。A与D始终是方阵。
    '''
    assert(iA.shape[0]==B.shape[0])
    assert(iA.shape[1]==C.shape[1])
    assert(D.shape[0]==C.shape[0])
    assert(D.shape[1]==B.shape[1])
    assert(iA.shape[0]==iA.shape[1] and D.shape[0]==D.shape[1])
    F = np.linalg.inv(D-C@iA@B)
    Ap = iA+iA@B@F@C@iA
    Bp = -iA@B@F
    Cp = -F@C@iA
    if where == 'BR':
        return np.asarray(np.bmat([[F, Cp],[Bp, Ap]]))
    elif where == 'TL':
        return np.asarray(np.bmat([[Ap, Bp],[Cp, F]]))
    elif where == 'TR':
        return np.asarray(np.bmat([[Cp, F],[Ap, Bp]]))
    else:
        assert(where == 'BL')
        return np.asarray(np.bmat([[Bp, Ap],[F, Cp]]))

def inverse_shrink(V, m, where='BR'):
    '''
    已知V为某分块矩阵的逆，求某个子方阵的逆。限制：V与要求的子矩阵都必须是方阵。
    V: 已知的逆矩阵
    m: 返回值大小为<m x m>
    where: 返回值在大矩阵的哪个角。
    '''
    assert(V.shape[0]==V.shape[1] and V.shape[0]>m>0)
    if where=='BR':
        F, Cp, Bp, X = V[:-m, :-m], V[:-m, -m:], V[-m:, :-m], V[-m:, -m:]
    elif where=='TL':
        X, Bp, Cp, F = V[:m, :m], V[:m, m:], V[m:, :m], V[m:, m:]
    elif where=='TR':
        m = V.shape[0]-m
        Cp, F, X, Bp = V[:m, :-m], V[:m, -m:], V[m:, :-m], V[m:, -m:]
    else:
        assert(where=='BL')
        m = V.shape[0]-m
        Bp, X, F, Cp = V[:-m, :m], V[:-m, m:], V[-m:, :m], V[-m:, m:]
    tmp = np.linalg.inv(F)
    CiA = tmp@Cp
    iAB = Bp@tmp
    iA = X-iAB@F@CiA
    return iA

测试代码

A = np.random.randn(5,5)
B = np.random.randn(5,1)
C = np.random.randn(1,5)
D = np.random.randn(1,1)
iA = np.linalg.inv(A)

Q = np.asarray(np.bmat( [[D, C],[B,A]] ))
iQ = inverse_enlarge(iA, B, C, D, 'BR')
print(np.allclose(iQ, np.linalg.inv(Q) ))

Q = np.asarray(np.bmat( [[A, B],[C,D]] ))
iQ = inverse_enlarge(iA, B, C, D, 'TL')
print(np.allclose(iQ, np.linalg.inv(Q) ))

Q = np.asarray(np.bmat( [[B, A],[D,C]] ))
iQ = inverse_enlarge(iA, B, C, D, 'TR')
print(np.allclose(iQ, np.linalg.inv(Q) ))

Q = np.asarray(np.bmat( [[C, D],[A,B]] ))
iQ = inverse_enlarge(iA, B, C, D, 'BL')
print(np.allclose(iQ, np.linalg.inv(Q) ))


A = np.random.randn(1,1)
B = np.random.randn(1,5)
C = np.random.randn(5,1)
D = np.random.randn(5,5)

iQ = np.linalg.inv(np.asarray(np.bmat( [[A, B],[C,D]] )))
iD = inverse_shrink(iQ, 5, 'BR')
print(np.allclose(iD, np.linalg.inv(D)))

iQ = np.linalg.inv(np.asarray(np.bmat( [[D, C],[B,A]] )))
iD = inverse_shrink(iQ, 5, 'TL')
print(np.allclose(iD, np.linalg.inv(D)))

iQ = np.linalg.inv(np.asarray(np.bmat( [[C, D],[A,B]] )))
iD = inverse_shrink(iQ, 5, 'TR')
print(np.allclose(iD, np.linalg.inv(D)))

iQ = np.linalg.inv(np.asarray(np.bmat( [[B, A],[D,C]] )))
iD = inverse_shrink(iQ, 5, 'BL')
print(np.allclose(iD, np.linalg.inv(D)))

参考文献

知乎：如何求一个矩阵的逆

http://www.cs.ubbcluj.ro/~csatol/SOGP/thesis/Iterative_computation.html

Cauwenberghs, Gert, and Tomaso Poggio. 2001. “Incremental and Decremental Support Vector Machine Learning.” In Advances in Neural Information Processing Systems, 409–15.

Wu, Chung-Hao, Wei-Chen His, Henry Horng-Shing Lu, and Hsueh-Ming Hang. 2017. “Online Multiclass Passive-Aggressive Learning on a Fixed Budget.” In 2017 Ieee International Symposium on Circuits and Systems (Iscas), 1–4. IEEE.

Legend94rz

https://legend94rz.github.io