Vector (Matrix) Derivatives

A summary of the rules for differentiating vectors and matrices.

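1 Derivative of a Vector (Matrix) with Respect to a Scalar

1.1 Derivative of a Row Vector with Respect to a Scalar

Let $\boldsymbol{y}^T = [y_1, \cdots, y_n]$ be an $n$-dimensional row vector and $x$ a scalar. Then

$$\frac{\partial \boldsymbol{y}^T}{\partial x} = \left[ \frac{\partial y_1}{\partial x}, \cdots, \frac{\partial y_n}{\partial x} \right]$$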

1.2 Derivative of a Column Vector with Respect to a Scalar

Let $\boldsymbol{y} = [y_1, \cdots, y_m]^T$ be an $m$-dimensional column vector and $x$ a scalar. Then

$$\frac{\partial \boldsymbol{y}}{\partial x} = \left[ \begin{matrix} \frac{\partial y_1}{\partial x} \\ \vdots \\ \frac{\partial y_m}{\partial x} \end{matrix} \right]$$

1.3 Derivative of a Matrix with Respect to a Scalar

Let $Y = [y_{ij}]$ be an $m \times n$ matrix and $x$ a scalar. Then

$$\frac{\partial Y}{\partial x} = \left[ \begin{matrix} \frac{\partial y_{11}}{\partial x} & \cdots & \frac{\partial y_{1n}}{\partial x} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial x} & \cdots & \frac{\partial y_{mn}}{\partial x} \end{matrix} \right]$$
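As a small worked instance of the definition above, consider a $2 \times 2$ matrix whose entries are chosen purely for illustration. Each entry is differentiated in place, so the result has the same shape as $Y$.

```latex
Y = \left[ \begin{matrix} x & x^2 \\ \sin x & e^{x} \end{matrix} \right]
\quad\Longrightarrow\quad
\frac{\partial Y}{\partial x}
= \left[ \begin{matrix} 1 & 2x \\ \cos x & e^{x} \end{matrix} \right]
```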

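2 Derivative of a Scalar with Respect to a Vector (Matrix)

2.1 Derivative of a Scalar with Respect to a Row Vector

Let $y$ be a scalar and $\boldsymbol{x}^T = [x_1, \cdots, x_q]$ a $q$-dimensional row vector. Then

$$\frac{\partial y}{\partial \boldsymbol{x}^T} = \left[ \frac{\partial y}{\partial x_1}, \cdots, \frac{\partial y}{\partial x_q} \right]$$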

2.2 Derivative of a Scalar with Respect to a Column Vector

Let $y$ be a scalar and $\boldsymbol{x} = [x_1, \cdots, x_p]^T$ a $p$-dimensional column vector. Then

$$\frac{\partial y}{\partial \boldsymbol{x}} = \left[ \begin{matrix} \frac{\partial y}{\partial x_1} \\ \vdots \\ \frac{\partial y}{\partial x_p} \end{matrix} \right]$$

2.3 Derivative of a Scalar with Respect to a Matrix

Let $y$ be a scalar and $X = [x_{ij}]$ a $p \times q$ matrix. Then

$$\frac{\partial y}{\partial X} = \left[ \begin{matrix} \frac{\partial y}{\partial x_{11}} & \cdots & \frac{\partial y}{\partial x_{1q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y}{\partial x_{p1}} & \cdots & \frac{\partial y}{\partial x_{pq}} \end{matrix} \right]$$
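The scalar-by-matrix rule can be checked numerically. The following is a minimal sketch in plain Python, assuming central finite differences as the approximation; the helper `grad_wrt_matrix` and the test function `sum_of_squares` are hypothetical names introduced here for illustration. For $y = \sum_{i,j} x_{ij}^2$ the result should be close to $2X$ and have the same $p \times q$ shape as $X$.

```python
def grad_wrt_matrix(f, X, h=1e-6):
    """Approximate dy/dX by central differences.

    f maps a p x q matrix (list of lists) to a scalar; the result has the
    same p x q shape as X, with entry (i, j) approximating dy/dx_ij.
    """
    p, q = len(X), len(X[0])
    G = [[0.0] * q for _ in range(p)]
    for i in range(p):
        for j in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[i][j] += h
            X_minus[i][j] -= h
            G[i][j] = (f(X_plus) - f(X_minus)) / (2 * h)
    return G


def sum_of_squares(X):
    """Illustrative scalar function: y = sum of x_ij^2, so dy/dX = 2X."""
    return sum(v * v for row in X for v in row)


X = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
G = grad_wrt_matrix(sum_of_squares, X)
for row in G:
    print([round(v, 4) for v in row])
```

Finite differences are used only to keep the sketch dependency-free; an automatic-differentiation library would serve the same purpose.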

3 Derivative of a Row (Column) Vector with Respect to a Column (Row) Vector

3.1 Derivative of a Row Vector with Respect to a Column Vector

Let $\boldsymbol{y}^T = [y_1, \cdots, y_n]$ be an $n$-dimensional row vector and $\boldsymbol{x} = [x_1, \cdots, x_p]^T$ a $p$-dimensional column vector. Then

$$\frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}} = \left[ \begin{matrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_n}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_p} & \cdots & \frac{\partial y_n}{\partial x_p} \end{matrix} \right]$$

3.2 Derivative of a Column Vector with Respect to a Row Vector

Let $\boldsymbol{y} = [y_1, \cdots, y_m]^T$ be an $m$-dimensional column vector and $\boldsymbol{x}^T = [x_1, \cdots, x_q]$ a $q$-dimensional row vector. Then

$$\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}^T} = \left[ \begin{matrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_q} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_q} \end{matrix} \right]$$
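To see how the two layouts in this section relate, here is a worked example with two components and two variables (the functions $y_1, y_2$ are chosen only for illustration). The row-by-column result indexes rows by $x_k$, the column-by-row result indexes rows by $y_i$, so the two matrices are transposes of each other.

```latex
y_1 = x_1 x_2, \qquad y_2 = x_1 + x_2^2, \qquad \boldsymbol{x} = [x_1, x_2]^T

\frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}}
= \left[ \begin{matrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_2}{\partial x_1} \\
\frac{\partial y_1}{\partial x_2} & \frac{\partial y_2}{\partial x_2}
\end{matrix} \right]
= \left[ \begin{matrix} x_2 & 1 \\ x_1 & 2 x_2 \end{matrix} \right]

\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}^T}
= \left[ \begin{matrix}
\frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} \\
\frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2}
\end{matrix} \right]
= \left[ \begin{matrix} x_2 & x_1 \\ 1 & 2 x_2 \end{matrix} \right]
= \left( \frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}} \right)^T
```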

4 Derivative of a Row (Column) Vector with Respect to a Row (Column) Vector

4.1 Derivative of a Row Vector with Respect to a Row Vector

Let $\boldsymbol{y}^T = [y_1, \cdots, y_n]$ be an $n$-dimensional row vector and $\boldsymbol{x}^T = [x_1, \cdots, x_q]$ a $q$-dimensional row vector. Then

$$\frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}^T} = \left[ \frac{\partial \boldsymbol{y}^T}{\partial x_1}, \cdots, \frac{\partial \boldsymbol{y}^T}{\partial x_q} \right]$$

4.2 Derivative of a Column Vector with Respect to a Column Vector

Let $\boldsymbol{y} = [y_1, \cdots, y_m]^T$ be an $m$-dimensional column vector and $\boldsymbol{x} = [x_1, \cdots, x_p]^T$ a $p$-dimensional column vector. Then

$$\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}} = \left[ \begin{matrix} \frac{\partial y_1}{\partial \boldsymbol{x}} \\ \vdots \\ \frac{\partial y_m}{\partial \boldsymbol{x}} \end{matrix} \right]$$
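Under these same-orientation definitions the result stays a vector rather than becoming a matrix: the row-by-row case concatenates the blocks horizontally into a $1 \times nq$ row, and the column-by-column case stacks them vertically into an $mp \times 1$ column. A worked example, with functions chosen only for illustration:

```latex
y_1 = x_1 x_2, \qquad y_2 = x_1 + x_2^2

\frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}^T}
= \left[ \frac{\partial \boldsymbol{y}^T}{\partial x_1},
         \frac{\partial \boldsymbol{y}^T}{\partial x_2} \right]
= \left[ x_2,\; 1,\; x_1,\; 2 x_2 \right] \in \mathbb{R}^{1 \times 4}

\frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}}
= \left[ \begin{matrix}
\frac{\partial y_1}{\partial \boldsymbol{x}} \\
\frac{\partial y_2}{\partial \boldsymbol{x}}
\end{matrix} \right]
= \left[ \begin{matrix} x_2 \\ x_1 \\ 1 \\ 2 x_2 \end{matrix} \right]
\in \mathbb{R}^{4 \times 1}
```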

5 Derivative of a Matrix with Respect to a Row (Column) Vector

5.1 Derivative of a Matrix with Respect to a Row Vector

Let $Y = [y_{ij}]$ be an $m \times n$ matrix and $\boldsymbol{x}^T = [x_1, \cdots, x_q]$ a $q$-dimensional row vector. Then

$$\frac{\partial Y}{\partial \boldsymbol{x}^T} = \left[ \frac{\partial Y}{\partial x_1}, \cdots, \frac{\partial Y}{\partial x_q} \right] \in \mathbb{R}^{m \times nq}$$

Writing $Y$ in terms of its rows, $Y = \left[ \begin{matrix} \boldsymbol{y}_1^T \\ \vdots \\ \boldsymbol{y}_m^T \end{matrix} \right]$, this can be viewed as differentiating several row vectors with respect to a single row vector:

$$\frac{\partial Y}{\partial \boldsymbol{x}^T} = \left[ \begin{matrix} \frac{\partial \boldsymbol{y}_1^T}{\partial \boldsymbol{x}^T} \\ \vdots \\ \frac{\partial \boldsymbol{y}_m^T}{\partial \boldsymbol{x}^T} \end{matrix} \right] = \left[ \begin{matrix} \frac{\partial \boldsymbol{y}_1^T}{\partial x_1} & \cdots & \frac{\partial \boldsymbol{y}_1^T}{\partial x_q} \\ \vdots & \ddots & \vdots \\ \frac{\partial \boldsymbol{y}_m^T}{\partial x_1} & \cdots & \frac{\partial \boldsymbol{y}_m^T}{\partial x_q} \end{matrix} \right] \in \mathbb{R}^{m \times nq}$$

5.2 Derivative of a Matrix with Respect to a Column Vector

Let $Y = [y_{ij}]$ be an $m \times n$ matrix and $\boldsymbol{x} = [x_1, \cdots, x_p]^T$ a $p$-dimensional column vector. Then

$$\frac{\partial Y}{\partial \boldsymbol{x}} = \left[ \begin{matrix} \frac{\partial y_{11}}{\partial \boldsymbol{x}} & \cdots & \frac{\partial y_{1n}}{\partial \boldsymbol{x}} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_{m1}}{\partial \boldsymbol{x}} & \cdots & \frac{\partial y_{mn}}{\partial \boldsymbol{x}} \end{matrix} \right] \in \mathbb{R}^{mp \times n}$$

Writing $Y$ in terms of its columns, $Y = [\boldsymbol{y}_1, \cdots, \boldsymbol{y}_n]$, this can be viewed as differentiating several column vectors with respect to a single column vector:

$$\frac{\partial Y}{\partial \boldsymbol{x}} = \left[ \frac{\partial \boldsymbol{y}_1}{\partial \boldsymbol{x}}, \cdots, \frac{\partial \boldsymbol{y}_n}{\partial \boldsymbol{x}} \right] \in \mathbb{R}^{mp \times n}$$
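The two block layouts of this section differ only in how the partial-derivative matrices $\partial Y / \partial x_k$ are arranged. The sketch below assembles both in plain Python, assuming central finite differences; the helpers `partial`, `d_matrix_d_row`, `d_matrix_d_col`, and the sample function `F` are hypothetical names introduced for illustration.

```python
def partial(F, x, k, h=1e-6):
    """Central-difference approximation of dY/dx_k, an m x n matrix."""
    x_plus, x_minus = x[:], x[:]
    x_plus[k] += h
    x_minus[k] -= h
    Y_plus, Y_minus = F(x_plus), F(x_minus)
    return [[(a - b) / (2 * h) for a, b in zip(row_p, row_m)]
            for row_p, row_m in zip(Y_plus, Y_minus)]


def d_matrix_d_row(F, x):
    """dY/dx^T: place dY/dx_1, ..., dY/dx_q side by side, giving m x (n*q)."""
    parts = [partial(F, x, k) for k in range(len(x))]
    m = len(parts[0])
    return [[v for P in parts for v in P[i]] for i in range(m)]


def d_matrix_d_col(F, x):
    """dY/dx: replace each y_ij by the column dy_ij/dx, giving (m*p) x n."""
    parts = [partial(F, x, k) for k in range(len(x))]
    m = len(parts[0])
    # Row (i, k) of the result holds dy_i1/dx_k, ..., dy_in/dx_k.
    return [parts[k][i][:] for i in range(m) for k in range(len(x))]


def F(x):
    """Illustrative 2 x 3 matrix-valued function of a 2-vector."""
    x0, x1 = x
    return [[x0, x1, x0 * x1],
            [x0 * x0, x1 * x1, x0 + x1]]


x = [2.0, 3.0]
D_row = d_matrix_d_row(F, x)   # shape 2 x 6
D_col = d_matrix_d_col(F, x)   # shape 4 x 3
print(len(D_row), len(D_row[0]))
print(len(D_col), len(D_col[0]))
```

Here the same list `x` plays the role of $\boldsymbol{x}^T$ in the first function and of $\boldsymbol{x}$ in the second, since only the arrangement of the output depends on the orientation.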

6 Derivative of a Vector (Matrix) with Respect to a Matrix

6.1 Derivative of a Row Vector with Respect to a Matrix

Let $\boldsymbol{y}^T = [y_1, \cdots, y_n]$ be an $n$-dimensional row vector and $X = [x_{ij}]$ a $p \times q$ matrix. Then

$$\frac{\partial \boldsymbol{y}^T}{\partial X} = \left[ \begin{matrix} \frac{\partial \boldsymbol{y}^T}{\partial x_{11}} & \cdots & \frac{\partial \boldsymbol{y}^T}{\partial x_{1q}} \\ \vdots & \ddots & \vdots \\ \frac{\partial \boldsymbol{y}^T}{\partial x_{p1}} & \cdots & \frac{\partial \boldsymbol{y}^T}{\partial x_{pq}} \end{matrix} \right] \in \mathbb{R}^{p \times nq}$$

Writing $X$ in terms of its rows, $X = \left[ \begin{matrix} \boldsymbol{x}_1^T \\ \vdots \\ \boldsymbol{x}_p^T \end{matrix} \right]$, this can be viewed as differentiating a single row vector with respect to several row vectors:

$$\frac{\partial \boldsymbol{y}^T}{\partial X} = \left[ \begin{matrix} \frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}_1^T} \\ \vdots \\ \frac{\partial \boldsymbol{y}^T}{\partial \boldsymbol{x}_p^T} \end{matrix} \right] \in \mathbb{R}^{p \times nq}$$

6.2 Derivative of a Column Vector with Respect to a Matrix

Let $\boldsymbol{y} = [y_1, \cdots, y_m]^T$ be an $m$-dimensional column vector and $X = [x_{ij}]$ a $p \times q$ matrix. Then

$$\frac{\partial \boldsymbol{y}}{\partial X} = \left[ \begin{matrix} \frac{\partial y_1}{\partial X} \\ \vdots \\ \frac{\partial y_m}{\partial X} \end{matrix} \right] \in \mathbb{R}^{mp \times q}$$

Writing $X$ in terms of its columns, $X = [\boldsymbol{x}_1, \cdots, \boldsymbol{x}_q]$, this can be viewed as differentiating a single column vector with respect to several column vectors:

$$\frac{\partial \boldsymbol{y}}{\partial X} = \left[ \frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}_1}, \cdots, \frac{\partial \boldsymbol{y}}{\partial \boldsymbol{x}_q} \right] = \left[ \begin{matrix} \frac{\partial y_1}{\partial \boldsymbol{x}_1} & \cdots & \frac{\partial y_1}{\partial \boldsymbol{x}_q} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial \boldsymbol{x}_1} & \cdots & \frac{\partial y_m}{\partial \boldsymbol{x}_q} \end{matrix} \right] \in \mathbb{R}^{mp \times q}$$

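6.3 Derivative of a Matrix with Respect to a Matrix

Let $Y = [y_{ij}]$ be an $m \times n$ matrix and $X = [x_{ij}]$ a $p \times q$ matrix. Write $\boldsymbol{y}_i^T$ for the $i$-th row of $Y$ and $\boldsymbol{x}_j$ for the $j$-th column of $X$. Then

$$\frac{\partial Y}{\partial X} = \left[ \begin{matrix} \frac{\partial \boldsymbol{y}_1^T}{\partial \boldsymbol{x}_1} & \cdots & \frac{\partial \boldsymbol{y}_1^T}{\partial \boldsymbol{x}_q} \\ \vdots & \ddots & \vdots \\ \frac{\partial \boldsymbol{y}_m^T}{\partial \boldsymbol{x}_1} & \cdots & \frac{\partial \boldsymbol{y}_m^T}{\partial \boldsymbol{x}_q} \end{matrix} \right] \in \mathbb{R}^{mp \times nq}$$

That is, first treat $X$ as a collection of column vectors, then treat $Y$ as a collection of row vectors. Each block $\frac{\partial \boldsymbol{y}_i^T}{\partial \boldsymbol{x}_j}$ is the $p \times n$ row-by-column derivative of Section 3.1.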
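The block structure of the matrix-by-matrix derivative can be made concrete by computing where each scalar partial $\partial y_{il} / \partial x_{kj}$ lands: block $(i, j)$ has size $p \times n$, and inside it the entry sits at row $k$, column $l$. The following is a minimal sketch in plain Python using central finite differences; the helper `d_matrix_d_matrix` and the identity map used as the test function are assumptions made for illustration.

```python
def d_matrix_d_matrix(F, X, h=1e-6):
    """Approximate dY/dX as an (m*p) x (n*q) matrix.

    F maps a p x q matrix to an m x n matrix. Block (i, j) of the result is
    the p x n derivative of row i of Y with respect to column j of X.
    """
    p, q = len(X), len(X[0])
    Y0 = F(X)
    m, n = len(Y0), len(Y0[0])
    D = [[0.0] * (n * q) for _ in range(m * p)]
    for k in range(p):
        for j in range(q):
            X_plus = [row[:] for row in X]
            X_minus = [row[:] for row in X]
            X_plus[k][j] += h
            X_minus[k][j] -= h
            Y_plus, Y_minus = F(X_plus), F(X_minus)
            for i in range(m):
                for l in range(n):
                    # dy_il/dx_kj goes to block (i, j), local position (k, l).
                    D[i * p + k][j * n + l] = (Y_plus[i][l] - Y_minus[i][l]) / (2 * h)
    return D


def identity_map(X):
    """Illustrative choice Y = X, so dy_il/dx_kj is 1 only when i == k and l == j."""
    return [row[:] for row in X]


X = [[1.0, 2.0], [3.0, 4.0]]
D = d_matrix_d_matrix(identity_map, X)
for row in D:
    print([round(v) for v in row])
```

For $Y = X$ with $2 \times 2$ matrices, the nonzero entries fall at the four corners of the $4 \times 4$ result, which is a direct consequence of this particular block layout rather than of any general identity.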

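7 A Few Extended Formulas

For a constant matrix $A$ (independent of $\boldsymbol{x}$) and a column vector $\boldsymbol{x}$:

$$\frac{\partial A\boldsymbol{x}}{\partial \boldsymbol{x}^T} = A, \qquad \frac{\partial (A\boldsymbol{x})^T}{\partial \boldsymbol{x}} = A^T$$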
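Both identities follow from the layouts in Sections 3.1 and 3.2: entry $(i, j)$ of $\partial (A\boldsymbol{x}) / \partial \boldsymbol{x}^T$ is $\partial (A\boldsymbol{x})_i / \partial x_j = a_{ij}$, and the row-by-column layout transposes the indices. A quick numerical check in plain Python, assuming central finite differences; `matvec`, `d_col_d_row`, and `d_row_d_col` are hypothetical helper names, and the matrix `A` is an arbitrary example.

```python
def matvec(A, x):
    """Compute the matrix-vector product A x as a flat list."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def d_col_d_row(f, x, h=1e-6):
    """dy/dx^T (Section 3.2 layout): entry (i, j) approximates dy_i/dx_j."""
    columns = []
    for j in range(len(x)):
        x_plus, x_minus = x[:], x[:]
        x_plus[j] += h
        x_minus[j] -= h
        columns.append([(a - b) / (2 * h) for a, b in zip(f(x_plus), f(x_minus))])
    # columns[j][i] holds dy_i/dx_j, so transpose to get rows indexed by i.
    return [list(row) for row in zip(*columns)]


def d_row_d_col(f, x, h=1e-6):
    """dy^T/dx (Section 3.1 layout): entry (k, l) approximates dy_l/dx_k."""
    return [list(row) for row in zip(*d_col_d_row(f, x, h))]


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [0.5, -1.0, 2.0]

J = d_col_d_row(lambda v: matvec(A, v), x)    # should be close to A
Jt = d_row_d_col(lambda v: matvec(A, v), x)   # should be close to A^T
print([[round(v, 4) for v in row] for row in J])
print([[round(v, 4) for v in row] for row in Jt])
```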

References