Derivatives of Scalar and Matrix Functions with Respect to Vectors and Matrices


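Contents

1. Derivative of a Scalar Function with Respect to a Vector

2. Derivative of a Vector Function with Respect to a Vector

3. Derivative of a Real-Valued Scalar Function with Respect to a Matrix

4. Derivative of a Matrix Function with Respect to a Matrix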


1. Derivative of a Scalar Function with Respect to a Vector

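Let $f(\mathbf{x})$ be a scalar function of the independent variable $\mathbf{x}$, where $\mathbf{x}^{m \times 1} = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{m}\end{bmatrix}$ is a column vector of dimension $m \times 1$. Throughout, a superscript such as $m \times 1$ annotates the dimension of the quantity it is attached to.

Derivative of $f(\mathbf{x})$ with respect to $\mathbf{x}$: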

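  • Denominator layout (commonly used): the derivative has the same number of rows as the independent variable, so its dimension matches that of $\mathbf{x}$.

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}^{m \times 1}} = \begin{bmatrix} \frac{\partial f(\mathbf{x})}{\partial x_1} \\ \frac{\partial f(\mathbf{x})}{\partial x_2} \\ \vdots \\ \frac{\partial f(\mathbf{x})}{\partial x_m} \end{bmatrix}^{m \times 1}$$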

Note: unless stated otherwise, the denominator layout is used in everything that follows.

  • Numerator layout: the derivative has the same number of rows as the numerator (here the scalar $f$, so a single row).

$$\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}^{m \times 1}} = \begin{bmatrix} \frac{\partial f(\mathbf{x})}{\partial x_1} & \frac{\partial f(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial f(\mathbf{x})}{\partial x_m} \end{bmatrix}^{1 \times m}$$
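As a rough numerical illustration of the two layouts, the sketch below approximates the partial derivatives with central differences and arranges them either as an $m \times 1$ column or as a $1 \times m$ row. The function `f`, the point `x0`, and the helper names are hypothetical choices made for this example.

```python
def grad_denominator_layout(f, x, h=1e-6):
    """Central-difference gradient arranged as an m x 1 column (list of 1-element rows)."""
    column = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        column.append([(f(x_plus) - f(x_minus)) / (2 * h)])
    return column


def grad_numerator_layout(f, x, h=1e-6):
    """The same partial derivatives arranged as a 1 x m row."""
    return [[row[0] for row in grad_denominator_layout(f, x, h)]]


def f(x):
    # Hypothetical example function: f(x) = x1^2 + 3 * x2
    return x[0] ** 2 + 3 * x[1]


x0 = [2.0, 5.0]
col = grad_denominator_layout(f, x0)  # 2 x 1, approximately [[4], [3]]
row = grad_numerator_layout(f, x0)    # 1 x 2, approximately [[4, 3]]
```

Both results hold the same numbers; only the arrangement differs.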

2. Derivative of a Vector Function with Respect to a Vector

Let $f(\mathbf{x})$ be a vector function, where $f(\mathbf{x})^{n \times 1} = \begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \\ \vdots \\ f_{n}(\mathbf{x})\end{bmatrix}$ is a column vector of dimension $n \times 1$, and the independent variable $\mathbf{x}^{m \times 1} = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{m}\end{bmatrix}$ is a column vector of dimension $m \times 1$.

Derivative of $f(\mathbf{x})$ with respect to $\mathbf{x}$:

  • Denominator layout

$$\frac{\partial f(\mathbf{x})^{n \times 1}}{\partial \mathbf{x}^{m \times 1}} = \begin{bmatrix} \frac{\partial f(\mathbf{x})}{\partial x_1} \\ \frac{\partial f(\mathbf{x})}{\partial x_2} \\ \vdots \\ \frac{\partial f(\mathbf{x})}{\partial x_m} \end{bmatrix}^{m \times 1} = \begin{bmatrix} \frac{\partial f_1(\mathbf{x})}{\partial x_1} & \frac{\partial f_2(\mathbf{x})}{\partial x_1} & \cdots & \frac{\partial f_n(\mathbf{x})}{\partial x_1} \\ \frac{\partial f_1(\mathbf{x})}{\partial x_2} & \frac{\partial f_2(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial f_n(\mathbf{x})}{\partial x_2} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_1(\mathbf{x})}{\partial x_m} & \frac{\partial f_2(\mathbf{x})}{\partial x_m} & \cdots & \frac{\partial f_n(\mathbf{x})}{\partial x_m} \end{bmatrix}^{m \times n}$$

  • Numerator layout

$$\frac{\partial f(\mathbf{x})^{n \times 1}}{\partial \mathbf{x}^{m \times 1}} = \begin{bmatrix} \frac{\partial f_1(\mathbf{x})}{\partial \mathbf{x}} \\ \frac{\partial f_2(\mathbf{x})}{\partial \mathbf{x}} \\ \vdots \\ \frac{\partial f_n(\mathbf{x})}{\partial \mathbf{x}} \end{bmatrix}^{n \times 1} = \begin{bmatrix} \frac{\partial f_1(\mathbf{x})}{\partial x_1} & \frac{\partial f_1(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial f_1(\mathbf{x})}{\partial x_m} \\ \frac{\partial f_2(\mathbf{x})}{\partial x_1} & \frac{\partial f_2(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial f_2(\mathbf{x})}{\partial x_m} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_n(\mathbf{x})}{\partial x_1} & \frac{\partial f_n(\mathbf{x})}{\partial x_2} & \cdots & \frac{\partial f_n(\mathbf{x})}{\partial x_m} \end{bmatrix}^{n \times m}$$
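To make the two layouts concrete, here is a small worked example. The particular function is an assumption chosen only for illustration: take $m = 2$, $n = 3$, and $f(\mathbf{x}) = [x_1 x_2,\; x_1 + x_2,\; x_2^2]^{\mathrm{T}}$.

```latex
% Denominator layout: one row per x_i, one column per f_j (m x n = 2 x 3)
\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} =
\begin{bmatrix}
  x_2 & 1 & 0 \\
  x_1 & 1 & 2x_2
\end{bmatrix}^{2 \times 3}

% Numerator layout: one row per f_j, one column per x_i (n x m = 3 x 2)
\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} =
\begin{bmatrix}
  x_2 & x_1 \\
  1   & 1   \\
  0   & 2x_2
\end{bmatrix}^{3 \times 2}
```

The two results are transposes of each other, which is the general relationship between the layouts.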

Examples:

$$\frac{\partial (\mathbf{A}^{m \times m} \mathbf{y}^{m \times 1})}{\partial \mathbf{y}^{m \times 1}} = [\mathbf{A}^T]^{m \times m}$$

$$\frac{\partial ([\mathbf{y}^{m \times 1}]^T \mathbf{A}^{m \times m} \mathbf{y}^{m \times 1})}{\partial \mathbf{y}^{m \times 1}} = [\mathbf{A}\mathbf{y} + \mathbf{A}^T\mathbf{y}]^{m \times 1}$$
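The quadratic-form result $\mathbf{A}\mathbf{y} + \mathbf{A}^T\mathbf{y}$ can be checked numerically. The following is a minimal sketch in plain Python; the matrix `A`, the vector `y`, and all function names are hypothetical values picked for the check.

```python
def quad_form(A, y):
    """Return the scalar y^T A y for a square matrix A (list of rows)."""
    m = len(y)
    return sum(y[i] * A[i][j] * y[j] for i in range(m) for j in range(m))


def analytic_grad(A, y):
    """Return A y + A^T y as a flat list of length m."""
    m = len(y)
    return [sum((A[i][j] + A[j][i]) * y[j] for j in range(m)) for i in range(m)]


def numeric_grad(A, y, h=1e-6):
    """Central-difference approximation of d(y^T A y)/dy."""
    grad = []
    for i in range(len(y)):
        y_plus, y_minus = list(y), list(y)
        y_plus[i] += h
        y_minus[i] -= h
        grad.append((quad_form(A, y_plus) - quad_form(A, y_minus)) / (2 * h))
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]  # deliberately non-symmetric
y = [1.0, -1.0]
exact = analytic_grad(A, y)   # A y + A^T y
approx = numeric_grad(A, y)   # should agree up to rounding error
```

A non-symmetric `A` is used on purpose: for symmetric `A` the formula collapses to $2\mathbf{A}\mathbf{y}$, which would hide the role of the transpose term.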

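Chain rule:

Let $g(f(\mathbf{x}))$ be a scalar function and $f(\mathbf{x})$ a vector function, where $f(\mathbf{x})^{n \times 1} = \begin{bmatrix} f_{1}(\mathbf{x}) \\ f_{2}(\mathbf{x}) \\ \vdots \\ f_{n}(\mathbf{x})\end{bmatrix}$ is a column vector of dimension $n \times 1$, and the independent variable $\mathbf{x}^{m \times 1} = \begin{bmatrix} x_{1} \\ x_{2} \\ \vdots \\ x_{m}\end{bmatrix}$ is a column vector of dimension $m \times 1$.

Derivative of $g(f(\mathbf{x}))$ with respect to $\mathbf{x}$:

$$\left[\frac{\partial g(f(\mathbf{x}))}{\partial \mathbf{x}}\right]^{m \times 1} = \left[\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}}\right]^{m \times n} \cdot \left[\frac{\partial g(f(\mathbf{x}))}{\partial f(\mathbf{x})}\right]^{n \times 1}$$

Note that the order of the factors is reversed relative to the usual chain rule: in the denominator layout the derivative of the inner function goes on the left, which is also the only order in which the dimensions match.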
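A short worked example may help show how the factors line up. The choice of functions is an assumption made for illustration: let $f(\mathbf{x}) = \mathbf{B}\mathbf{x}$ with a hypothetical constant matrix $\mathbf{B}^{n \times m}$, and let $g(f) = f^T f$.

```latex
% Inner derivative in denominator layout (m x n)
\frac{\partial f(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{B}^T

% Outer derivative with respect to f (n x 1)
\frac{\partial g(f)}{\partial f} = 2 f = 2 \mathbf{B}\mathbf{x}

% Chain rule: inner derivative on the left
\frac{\partial g(f(\mathbf{x}))}{\partial \mathbf{x}}
  = [\mathbf{B}^T]^{m \times n} \cdot [2 \mathbf{B}\mathbf{x}]^{n \times 1}
  = [2 \mathbf{B}^T \mathbf{B} \mathbf{x}]^{m \times 1}

% Cross-check: g = x^T (B^T B) x is a quadratic form, so
\frac{\partial (\mathbf{x}^T \mathbf{B}^T \mathbf{B} \mathbf{x})}{\partial \mathbf{x}}
  = \mathbf{B}^T\mathbf{B}\mathbf{x} + (\mathbf{B}^T\mathbf{B})^T \mathbf{x}
  = 2 \mathbf{B}^T \mathbf{B} \mathbf{x}
```

Writing the factors in the opposite order would multiply an $n \times 1$ vector by an $m \times n$ matrix, which is not defined for $m \neq 1$.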

3. Derivative of a Real-Valued Scalar Function with Respect to a Matrix

Let $f(\mathbf{X})$ be a real-valued scalar function of the independent variable $\mathbf{X}$, where $\mathbf{X}^{m \times n} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$ is a matrix of dimension $m \times n$.

Derivative of $f(\mathbf{X})$ with respect to $\mathbf{X}^{m \times n}$:

$$\frac{\partial f(\mathbf{X})}{\partial \mathbf{X}^{m \times n}} = \begin{bmatrix} \frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{12}} & \cdots & \frac{\partial f}{\partial x_{1n}} \\ \frac{\partial f}{\partial x_{21}} & \frac{\partial f}{\partial x_{22}} & \cdots & \frac{\partial f}{\partial x_{2n}} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f}{\partial x_{m1}} & \frac{\partial f}{\partial x_{m2}} & \cdots & \frac{\partial f}{\partial x_{mn}} \end{bmatrix}^{m \times n}$$

Product rule: $$\frac{\partial [f(\mathbf{X}) g(\mathbf{X})]}{\partial \mathbf{X}} = \left[\frac{\partial f(\mathbf{X})}{\partial \mathbf{X}}\right]^{m \times n} g(\mathbf{X}) + f(\mathbf{X}) \left[\frac{\partial g(\mathbf{X})}{\partial \mathbf{X}}\right]^{m \times n}$$

Quotient rule (for $g(\mathbf{X}) \neq 0$): $$\frac{\partial \left[ \frac{f(\mathbf{X})}{g(\mathbf{X})} \right]}{\partial \mathbf{X}} = \frac{1}{g^2(\mathbf{X})} \left[ \left[\frac{\partial f(\mathbf{X})}{\partial \mathbf{X}}\right]^{m \times n} g(\mathbf{X}) - f(\mathbf{X}) \left[\frac{\partial g(\mathbf{X})}{\partial \mathbf{X}}\right]^{m \times n} \right]$$

Examples:

$$\frac{\partial ([\mathbf{a}^{m \times 1}]^\top \mathbf{X} [\mathbf{b}^{n \times 1}])}{\partial \mathbf{X}} = [\mathbf{a}\mathbf{b}^\top]^{m \times n}$$

$$\frac{\partial ([\mathbf{a}^{n \times 1}]^\top \mathbf{X}^\top [\mathbf{b}^{m \times 1}])}{\partial \mathbf{X}} = [\mathbf{b}\mathbf{a}^\top]^{m \times n}$$

$$\frac{\partial ([\mathbf{a}^{m \times 1}]^\top \mathbf{X} \mathbf{X}^\top [\mathbf{b}^{m \times 1}])}{\partial \mathbf{X}} = [\mathbf{a}\mathbf{b}^\top \mathbf{X} + \mathbf{b}\mathbf{a}^\top \mathbf{X}]^{m \times n}$$

$$\frac{\partial ([\mathbf{a}^{n \times 1}]^\top \mathbf{X}^\top \mathbf{X} [\mathbf{b}^{n \times 1}])}{\partial \mathbf{X}} = [\mathbf{X}\mathbf{b}\mathbf{a}^\top + \mathbf{X}\mathbf{a}\mathbf{b}^\top]^{m \times n}$$
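Identities of this kind are easy to get wrong by a transpose, so a finite-difference check is a useful habit. The sketch below verifies the first and third identities in plain Python. The matrices `X`, `a`, `b`, `c` and every helper name are hypothetical; `c` plays the role of the second $m \times 1$ vector in the third identity.

```python
def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


def add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def numeric_grad(f, X, h=1e-6):
    """Central-difference estimate of df/dX, with the same shape as X."""
    G = [[0.0] * len(X[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(X[0])):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += h
            X_minus[i][j] -= h
            G[i][j] = (f(X_plus) - f(X_minus)) / (2 * h)
    return G


def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # m = 2, n = 3
a = [[1.0], [2.0]]                       # m x 1
b = [[1.0], [0.0], [-1.0]]               # n x 1
c = [[1.0], [-1.0]]                      # m x 1


def f1(M):
    # Scalar a^T M b
    return matmul(matmul(transpose(a), M), b)[0][0]


def f2(M):
    # Scalar a^T M M^T c
    return matmul(matmul(matmul(transpose(a), M), transpose(M)), c)[0][0]


# Claimed gradients: a b^T and a c^T X + c a^T X
g1 = matmul(a, transpose(b))
g2 = add(matmul(matmul(a, transpose(c)), X), matmul(matmul(c, transpose(a)), X))

err1 = max_abs_diff(numeric_grad(f1, X), g1)
err2 = max_abs_diff(numeric_grad(f2, X), g2)
```

Both errors should be at the level of floating-point rounding, since central differences are exact for linear and quadratic functions apart from round-off.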

4. Derivative of a Matrix Function with Respect to a Matrix

Let $f(\mathbf{X})$ be a matrix function, where $f(\mathbf{X})^{a \times b} = \begin{bmatrix} f_{11}(\mathbf{X}) & f_{12}(\mathbf{X}) & \cdots & f_{1b}(\mathbf{X}) \\ f_{21}(\mathbf{X}) & f_{22}(\mathbf{X}) & \cdots & f_{2b}(\mathbf{X}) \\ \vdots & \vdots & \ddots & \vdots \\ f_{a1}(\mathbf{X}) & f_{a2}(\mathbf{X}) & \cdots & f_{ab}(\mathbf{X}) \end{bmatrix}$ is a matrix of dimension $a \times b$, and the independent variable $\mathbf{X}^{m \times n} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{bmatrix}$ is a matrix of dimension $m \times n$.

Derivative of $f(\mathbf{X})^{a \times b}$ with respect to $\mathbf{X}^{m \times n}$:

$$\frac{\partial f(\mathbf{X})^{a \times b}}{\partial \mathbf{X}^{m \times n}} = \begin{bmatrix} \frac{\partial f(\mathbf{X})}{\partial x_{11}} & \cdots & \frac{\partial f(\mathbf{X})}{\partial x_{1n}} \\ \frac{\partial f(\mathbf{X})}{\partial x_{21}} & \cdots & \frac{\partial f(\mathbf{X})}{\partial x_{2n}} \\ \vdots & \ddots & \vdots \\ \frac{\partial f(\mathbf{X})}{\partial x_{m1}} & \cdots & \frac{\partial f(\mathbf{X})}{\partial x_{mn}} \end{bmatrix} = \begin{bmatrix} \frac{\partial f_{11}}{\partial x_{11}} & \cdots & \frac{\partial f_{1b}}{\partial x_{11}} & \cdots \\ \vdots & \ddots & \vdots & \\ \frac{\partial f_{a1}}{\partial x_{11}} & \cdots & \frac{\partial f_{ab}}{\partial x_{11}} & \cdots \\ \vdots & & \vdots & \ddots \end{bmatrix}^{(a \cdot m) \times (b \cdot n)}$$

Each entry $\frac{\partial f(\mathbf{X})}{\partial x_{ij}}$ of the first matrix is itself an $a \times b$ block, which is why the full result has dimension $(a \cdot m) \times (b \cdot n)$.
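A minimal worked example of this block structure, under the block arrangement defined above (other conventions, such as vectorization-based ones, arrange the same partial derivatives differently): take the hypothetical case $f(\mathbf{X}) = \mathbf{X}$ with $a = b = m = n = 2$. The block for $x_{ij}$ is then the $2 \times 2$ matrix with a single 1 in position $(i, j)$.

```latex
\frac{\partial \mathbf{X}^{2 \times 2}}{\partial \mathbf{X}^{2 \times 2}}
= \begin{bmatrix}
    \frac{\partial \mathbf{X}}{\partial x_{11}} & \frac{\partial \mathbf{X}}{\partial x_{12}} \\
    \frac{\partial \mathbf{X}}{\partial x_{21}} & \frac{\partial \mathbf{X}}{\partial x_{22}}
  \end{bmatrix}
= \left[\begin{array}{cc|cc}
    1 & 0 & 0 & 1 \\
    0 & 0 & 0 & 0 \\ \hline
    0 & 0 & 0 & 0 \\
    1 & 0 & 0 & 1
  \end{array}\right]^{4 \times 4}
```

The result is $(a \cdot m) \times (b \cdot n) = 4 \times 4$, as the dimension formula predicts.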

Reference: "Mathematical Derivation of Matrix Derivative Formulas (Matrix Derivatives: Fundamentals)", Zhihu
