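Differentials, Differential Operators, and Taylor Expansion
While taking SI152 I found I could not do the homework, so I am relearning the math.
Scalar-valued functions of one variable
For \(f: \mathbb{R} \to \mathbb{R}\), with \(y = f(x)\)
Derivative: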
\[f'(x) = \dfrac{dy}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
\]
Differential (the linear approximation of a nonlinear function near a point):
\[df = f'(x) dx
\]
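The linear-approximation reading of \(df = f'(x)\,dx\) can be checked numerically. The sketch below is only an illustration: the helper names `derivative` and `linear_approx`, the test function `exp`, and the step size `h` are all assumptions, not part of the notes.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x); h is an assumed step size."""
    return (f(x + h) - f(x - h)) / (2 * h)

def linear_approx(f, x, dx):
    """f(x) + df, where df = f'(x) * dx."""
    return f(x) + derivative(f, x) * dx

x0 = 1.0
for dx in (1e-1, 1e-2, 1e-3):
    exact = math.exp(x0 + dx)
    approx = linear_approx(math.exp, x0, dx)
    # The gap between f(x0 + dx) and the linear approximation shrinks
    # roughly like dx**2, i.e. faster than dx itself.
    print(dx, exact - approx)
```

Shrinking `dx` by a factor of 10 should shrink the printed error by roughly a factor of 100.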
Invariance of the first-order differential form:
\[dz = \dfrac{dz}{dy} dy = \dfrac{dz}{dy} \left(\dfrac{dy}{dx} dx\right) = \left(\dfrac{dz}{dy} \dfrac{dy}{dx}\right) dx = \dfrac{dz}{dx} dx
\]
(This implies the chain rule.)
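As a quick sanity check of the chain rule that this invariance implies, the sketch below compares \(\frac{dz}{dy}\frac{dy}{dx}\) with a direct estimate of \(\frac{dz}{dx}\). The concrete choices \(y = x^2\), \(z = \sin y\), and the point `x0` are assumptions made for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def y(x):
    return x * x          # assumed inner function

def z(y_val):
    return math.sin(y_val)  # assumed outer function

x0 = 0.7
dz_dy = derivative(z, y(x0))               # dz/dy evaluated at y(x0)
dy_dx = derivative(y, x0)                  # dy/dx evaluated at x0
dz_dx = derivative(lambda x: z(y(x)), x0)  # derivative of the composite

# Both numbers should agree with cos(x0**2) * 2 * x0.
print(dz_dy * dy_dx, dz_dx)
```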
Lagrange mean value theorem:
\[f(x)-f(x^{(0)}) = f'(\theta) (x-x^{(0)}), \quad \theta \text{ strictly between } x^{(0)} \text{ and } x
\]
Taylor expansion (the second line is written out for \(p = 3\)):
\[\begin{aligned}
f(x^{(0)}+x) &= \sum_{i=0}^{p} \frac{f^{(i)}(x^{(0)})}{i!} x^i + o(x^{p}) \\
&= f(x^{(0)}) + f'(x^{(0)})x + \frac{f''(x^{(0)})}{2!}x^2 + \frac{f'''(x^{(0)})}{3!}x^3 + o(x^3)
\end{aligned}
\]
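To see the little-o remainder at work, the following sketch evaluates the degree-\(p\) Taylor polynomial of \(e^x\) and prints the error divided by \(x^p\), which should tend to zero. Choosing `exp` (whose derivatives all equal itself) and the expansion point `x0 = 0.5` are assumptions for illustration only.

```python
import math

def taylor_exp(x0, x, p):
    """Degree-p Taylor polynomial of exp about x0, evaluated at x0 + x.

    Every derivative of exp at x0 equals exp(x0), so each coefficient
    is exp(x0) / i!.
    """
    return sum(math.exp(x0) * x**i / math.factorial(i) for i in range(p + 1))

x0, p = 0.5, 3
for x in (0.1, 0.05, 0.025):
    err = math.exp(x0 + x) - taylor_exp(x0, x, p)
    # err / x**p shrinks as x -> 0, which is what err = o(x**p) means.
    print(x, err, err / x**p)
```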
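Proof sketch: induct on the degree-\(n\) Taylor polynomial \(T_n\); for the \(k\)-th term (\(k \le n\)), applying L'Hôpital's rule \(k\) times gives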
\[0=\lim_{x\to 0} \dfrac{f(x^{(0)}+x)-T_n(x^{(0)}+x)}{x^k} = \lim_{x\to 0} \dfrac{f^{(k)}(x^{(0)}+x)-T_n^{(k)}(x^{(0)}+x)}{k!}
\]
Maclaurin expansion: the Taylor expansion about \(x^{(0)}=0\).
Remainder terms
- Peano remainder:
\[ R_n(x) = o(x^n)
\]
- Lagrange remainder:
\[ R_n(x) = \dfrac{f^{(n+1)}(\xi)}{(n+1)!} x^{n+1}, \quad \xi \text{ between } x^{(0)} \text{ and } x^{(0)}+x
\]
- Integral remainder:
\[ R_n(x) = \dfrac{1}{n!} \int_{x^{(0)}}^{x^{(0)}+x} f^{(n+1)}(t) (x^{(0)}+x-t)^n dt
\]
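The Lagrange form gives a computable error bound whenever \(|f^{(n+1)}|\) can be bounded. A minimal sketch, assuming \(f = \sin\) expanded about \(0\) (so every derivative is bounded by 1); the helper names are hypothetical:

```python
import math

def maclaurin_sin(x, n):
    """Maclaurin polynomial of sin containing all terms of degree <= n."""
    total = 0.0
    for k in range(n + 1):
        if k % 2 == 1:  # only odd powers appear in the sine series
            total += (-1) ** (k // 2) * x**k / math.factorial(k)
    return total

def lagrange_bound(x, n):
    """Bound on |R_n(x)| using |sin^(n+1)(xi)| <= 1."""
    return abs(x) ** (n + 1) / math.factorial(n + 1)

x = 0.8
for n in (1, 3, 5):
    actual = abs(math.sin(x) - maclaurin_sin(x, n))
    # The actual error should never exceed the Lagrange bound.
    print(n, actual, lagrange_bound(x, n))
```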
Scalar-valued functions of several variables
For \(f: \mathbb{R}^n \to \mathbb{R}\)
Partial derivatives:
\[f'_{x_i} = \dfrac{\partial f}{\partial x_i} = \lim_{h \to 0} \frac{f(\dots,x_i+h,\dots) - f(\dots,x_i,\dots)}{h}, \quad i=1,2,\dots,n
\]
Gradient:
\[\nabla f = \begin{pmatrix}
\frac{\partial f}{\partial x_1} \\
\vdots \\
\frac{\partial f}{\partial x_n}
\end{pmatrix}
\]
Total differential:
\[df = \sum_{i=1}^{n} \dfrac{\partial f}{\partial x_i} dx_i = \nabla f^T \cdot dx
\]
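The total differential \(df = \nabla f^T \cdot dx\) predicts the change in \(f\) to first order. The sketch below checks this with a finite-difference gradient; the function \(f(x_1, x_2) = x_1^2 x_2 + \sin x_2\), the point `x0`, and the displacement `dx` are assumptions chosen for illustration.

```python
import math

def f(x):
    return x[0] ** 2 * x[1] + math.sin(x[1])

def gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x (x is a list of floats)."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

x0 = [1.0, 2.0]
dx = [1e-3, -2e-3]
g = gradient(f, x0)
df = sum(gi * di for gi, di in zip(g, dx))             # grad(f)^T dx
actual = f([a + b for a, b in zip(x0, dx)]) - f(x0)    # true change in f

# df and actual differ only by terms of second order in dx.
print(g, df, actual)
```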
Invariance of the first-order differential form:
\[\begin{aligned}
dz &= \nabla_y z^T \cdot dy = \nabla_y z^T \cdot (\nabla_x y^T \cdot dx)
\\
&= (\nabla_y z^T \cdot \nabla_x y^T) \cdot dx = \nabla_x z^T \cdot dx
\end{aligned}
\]
Mean value theorem:
\[f(x)-f(x^{(0)}) = \nabla f^T(\theta) (x-x^{(0)}), \quad \theta = \lambda x+(1-\lambda)x^{(0)}, \quad \lambda\in(0,1)
\]
Taylor expansion: for \(y = (y_1,\dots,y_n)^T\) and \(\rho = |y|\) (the second line is written out for \(p = 2\)),
\[\begin{aligned}
f(y^{(0)}+y) &= \sum_{i=0}^{p} \frac{1}{i!}\left(\sum_{j=1}^{n} y_j \dfrac{\partial}{\partial x_j} \right)^i f \bigg|_{x=y^{(0)}} + o(\rho^{p}) \\
&= f(y^{(0)}) + \nabla f^T(y^{(0)})\cdot y + \dfrac{1}{2} y^T \nabla^2 f(y^{(0)}) y + o(|y|^2)
\end{aligned}
\]
where \(\nabla^2 f\) is the Hessian matrix:
\[\nabla^2 f = \bm{H}(f) = \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \\
\end{pmatrix}
\]
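In particular, for \(f: \mathbb{R}^2 \to \mathbb{R}\), with \(\rho = \sqrt{x_1^2+x_2^2}\) and all derivatives evaluated at \((x_1^{(0)}, x_2^{(0)})\):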
\[\begin{aligned}
f(x_1^{(0)}+x_1, x_2^{(0)}+x_2) &= f(x_1^{(0)}, x_2^{(0)}) + \frac{\partial f}{\partial x_1} \bigg|_{x_1^{(0)},x_2^{(0)}} x_1 + \frac{\partial f}{\partial x_2} \bigg|_{x_1^{(0)},x_2^{(0)}} x_2 \\
&\quad +\dfrac{1}{2} \left[ \frac{\partial^2 f}{\partial x_1^2} \bigg|_{x_1^{(0)},x_2^{(0)}} x_1^2 + 2 \frac{\partial^2 f}{\partial x_1 \partial x_2} \bigg|_{x_1^{(0)},x_2^{(0)}} x_1 x_2 + \frac{\partial^2 f}{\partial x_2^2} \bigg|_{x_1^{(0)},x_2^{(0)}} x_2^2 \right] + o(\rho^2) \\
&= f(x_1^{(0)}, x_2^{(0)}) + \begin{pmatrix}
\frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2}
\end{pmatrix} \begin{pmatrix}
x_1 \\ x_2
\end{pmatrix} +\dfrac{1}{2} \begin{pmatrix}
x_1 & x_2
\end{pmatrix}
\begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2}
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix} + o(\rho^2)
\end{aligned}
\]
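Proof: for \(f(x+k, y+h)\), define \(F(t) = f(x+kt, y+ht)\). This reduces the problem to a function of one variable; expanding \(F\) about \(t=0\) and evaluating at \(t=1\) gives the result.
Classifying extrema with the Hessian: definiteness at a critical point (\(\nabla f = 0\))
- Positive definite: strict local minimum
- Negative definite: strict local maximum
- Indefinite: not an extremum (saddle point)
Proof: by the second-order Taylor expansion.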
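For a \(2\times 2\) symmetric Hessian, definiteness can be read off from the leading principal minors (Sylvester's criterion, which is a swapped-in test not stated in these notes). The sketch below is a hypothetical helper; the "inconclusive" branch covers the semidefinite case, where the second-order test does not decide.

```python
def classify_2x2(h11, h12, h22):
    """Classify a critical point from the Hessian [[h11, h12], [h12, h22]]."""
    det = h11 * h22 - h12 * h12
    if det > 0 and h11 > 0:
        return "strict local minimum"   # positive definite
    if det > 0 and h11 < 0:
        return "strict local maximum"   # negative definite
    if det < 0:
        return "saddle point"           # indefinite
    return "inconclusive"               # det == 0: semidefinite

# Hessians at the origin of a few assumed example functions:
print(classify_2x2(2, 1, 2))    # f = x^2 + x*y + y^2
print(classify_2x2(-2, 0, -2))  # f = -x^2 - y^2
print(classify_2x2(2, 0, -2))   # f = x^2 - y^2
print(classify_2x2(2, 0, 0))    # f = x^2 + y^4
```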
Vector-valued functions of several variables
For \(f: \mathbb{R}^n \to \mathbb{R}^m\)
Differential:
\[df = \begin{pmatrix}
df_1 \\
\vdots \\
df_m
\end{pmatrix} =
\begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_n} \\
\end{pmatrix} \begin{pmatrix}
dx_1 \\
\vdots \\
dx_n
\end{pmatrix} = \nabla f^T \cdot dx
\]
where the Jacobian matrix is
\[\nabla f^T = \bm{J}_x(f) = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \frac{\partial f_m}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_n} \\
\end{pmatrix}
\]
which satisfies
\[dy = \bm{J}_x(f) \cdot dx = \nabla_x f^T \cdot dx
\]
Hence the Jacobian matrix is regarded as the derivative of a vector-valued function.
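A finite-difference Jacobian makes the layout \(J_{ij} = \partial f_i / \partial x_j\) concrete. The sketch below assumes the polar-to-Cartesian map \(f(r, \theta) = (r\cos\theta, r\sin\theta)\) as an example, whose Jacobian determinant is \(r\); the helper name `jacobian` is hypothetical.

```python
import math

def f(x):
    r, theta = x
    return [r * math.cos(theta), r * math.sin(theta)]

def jacobian(f, x, h=1e-6):
    """Central-difference Jacobian with J[i][j] = d f_i / d x_j."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        xm = list(x)
        xm[j] -= h
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

x0 = [2.0, math.pi / 6]
J = jacobian(f, x0)
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J, det)  # det should be close to r = 2
```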
(Generalized) gradient:
\[\nabla f = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_1} \\
\frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_m}{\partial x_2} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_1}{\partial x_n} & \frac{\partial f_2}{\partial x_n} & \cdots & \frac{\partial f_m}{\partial x_n} \\
\end{pmatrix}
\]
Invariance of the first-order differential form:
\[\begin{aligned}
dz &= \nabla_y z^T \cdot dy = \nabla_y z^T \cdot (\nabla_x y^T \cdot dx)
\\
&= (\nabla_y z^T \cdot \nabla_x y^T) \cdot dx = \nabla_x z^T \cdot dx
\end{aligned}
\]
"Differential-like" operators
Reference: https://www.zhihu.com/question/22455493
- \(\mathrm{d}(X+Y) = \mathrm{d}X + \mathrm{d}Y\)
- \(\mathrm{d}(AX) = A\,\mathrm{d}X\), where \(A\) does not depend on the variable.
- \(\mathrm{d}(XY) = (\mathrm{d}X)Y + X\,\mathrm{d}Y\) (for matrices the order of the factors must be preserved)
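A numerical check of the product rule for \(2\times 2\) matrices, keeping each factor in its original left-to-right position. The matrices `X`, `Y` and the perturbations `dX`, `dY` below are arbitrary assumed values; the residual that remains is the second-order term \(\mathrm{d}X\,\mathrm{d}Y\).

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def max_abs(A):
    return max(abs(v) for row in A for v in row)

X = [[1.0, 2.0], [3.0, 4.0]]
Y = [[0.0, 1.0], [-1.0, 2.0]]
eps = 1e-5
dX = [[eps, 2 * eps], [0.0, -eps]]
dY = [[-eps, 0.0], [3 * eps, eps]]

# Exact change of the product versus the first-order prediction.
actual = sub(matmul(add(X, dX), add(Y, dY)), matmul(X, Y))
first_order = add(matmul(dX, Y), matmul(X, dY))
residual = max_abs(sub(actual, first_order))
print(residual)  # second order in eps
```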
The nabla operator
The real-valued function case
Reference: https://zhuanlan.zhihu.com/p/52834609
\[\begin{aligned}
\nabla (\varphi\psi) &=\nabla (\varphi\psi_{c})+\nabla (\varphi_{c}\psi)=\psi\nabla\varphi+\varphi\nabla\psi, \\
\nabla\cdot\left(\varphi\bm{f}\right) &=\nabla\cdot\left(\varphi\bm{f}_{c}\right)+\nabla\cdot\left(\varphi_{c}\bm{f}\right)=\nabla\varphi\cdot\bm{f}+\varphi\nabla\cdot\bm{f}, \\
\nabla\times\left(\varphi\bm{f}\right) &=\nabla\times\left(\varphi\bm{f}_{c}\right)+\nabla\times\left(\varphi_{c}\bm{f}\right)=\nabla\varphi\times\bm{f}+\varphi\nabla\times\bm{f}, \\
\nabla\cdot\left(\bm{f}\times\bm{g}\right) &=\nabla\cdot\left(\bm{f}\times\bm{g}_{c}\right)+\nabla\cdot\left(\bm{f}_{c}\times\bm{g}\right) \\
&=\left(\nabla\times\bm{f}\right)\cdot\bm{g}-(\nabla\times\bm{g})\cdot\bm{f} , \\
\nabla\times\left(\bm{f}\times\bm{g}\right) &=\nabla\times\left(\bm{f}\times\bm{g}_{c}\right)+\nabla\times\left(\bm{f}_{c}\times\bm{g}\right) \\
&=\left(\bm{g}\cdot\nabla\right)\bm{f}-\left(\nabla\cdot\bm{f}\right)\bm{g}+\left(\nabla\cdot\bm{g}\right)\bm{f}-\left(\bm{f}\cdot\nabla\right)\bm{g}, \\
\nabla\left(\bm{f}\cdot\bm{g}\right) &=\bm{f}\times(\nabla\times\bm{g})+\left(\bm{f}\cdot\nabla\right)\bm{g}+\bm{g}\times\left(\nabla\times\bm{f}\right)+(\bm{g}\cdot\nabla) \bm{f} , \\
\nabla\cdot\nabla\varphi &\equiv \nabla^{2}\varphi, \\
\nabla\times\left(\nabla\times\bm{f}\right) &=\nabla\left(\nabla\cdot\bm{f}\right)-\nabla^{2}\bm{f} , \\
\nabla\cdot\left(\bm{f} \bm{g}\right) &=\nabla\cdot\left(\bm{f} \bm{g}_{c}\right)+\nabla\cdot\left(\bm{f}_{c} \bm{g}\right)=\left(\nabla\cdot\bm{f}\right)\bm{g}+\left(\bm{f}\cdot\nabla\right)\bm{g},
\end{aligned}
\]
\[\begin{aligned}
&\nabla\times(\nabla\varphi)\equiv0, \\
&\nabla\cdot\left(\nabla\times\bm{A}\right)\equiv0, \\
&\nabla f\left(u\right)=\left(\nabla u\right){\frac{\mathrm{d}f}{\mathrm{d}u}}, \\
&\nabla\cdot\bm{A}\left(u\right)=\left(\nabla u\right)\cdot\frac{\mathrm{d}\bm{A}}{\mathrm{d}u}, \\
&\nabla\times{\bm{A}}\left(u\right)=\left(\nabla u\right)\times{\frac{\mathrm{d}{\bm{A}}}{\mathrm{d}u}},
\end{aligned}\]
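Any of these identities can be spot-checked with finite differences. The sketch below tests \(\nabla\cdot(\varphi\bm{f}) = \nabla\varphi\cdot\bm{f} + \varphi\nabla\cdot\bm{f}\) at one point; the scalar field `phi`, the vector field `field`, and the point `p0` are arbitrary assumptions chosen for illustration.

```python
import math

def partial(g, p, i, h=1e-5):
    """Central-difference partial derivative of scalar g at p along axis i."""
    pp = list(p)
    pp[i] += h
    pm = list(p)
    pm[i] -= h
    return (g(pp) - g(pm)) / (2 * h)

def phi(p):
    x, y, z = p
    return x * y + z * z

def field(p):
    x, y, z = p
    return [y * z, x * x, math.sin(x) + z]

def divergence(F, p):
    """Sum of d F_i / d x_i over the three axes."""
    return sum(partial(lambda q, i=i: F(q)[i], p, i) for i in range(3))

p0 = [0.3, -0.7, 1.1]
lhs = divergence(lambda q: [phi(q) * c for c in field(q)], p0)
grad_phi = [partial(phi, p0, i) for i in range(3)]
rhs = (sum(g * c for g, c in zip(grad_phi, field(p0)))
       + phi(p0) * divergence(field, p0))
print(lhs, rhs)  # the two sides should agree up to discretization error
```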
The vector-valued function case (using the convention \(\nabla f = \bm{J}_x(f)^T\) from above)
\[\begin{aligned}
& \nabla( f^T(x)) = (\nabla f(x))^T \\
& \nabla( b^T x) = b \\
& \nabla( x^T A x) = (A+A^T) x \\
& \nabla( x^T x) = 2 x \\
& \nabla(A x) = A^T \\
& \nabla_x( x^T A y) = A y \\
& \nabla_x( y^T A x) = A^T y \\
& \nabla(f(Ax)) = A^T \nabla f(Ax) \\
\end{aligned}\]
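The identity \(\nabla(x^T A x) = (A + A^T)x\) is easy to confirm numerically, and a non-symmetric \(A\) shows why the answer is not simply \(2Ax\). The matrix `A` and point `x0` below are assumed example values.

```python
def quad(A, x):
    """Quadratic form x^T A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def gradient(f, x, h=1e-6):
    """Central-difference gradient of f at x."""
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        xm = list(x)
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad

A = [[1.0, 2.0], [0.0, 3.0]]   # deliberately not symmetric
x0 = [0.5, -1.5]

numeric = gradient(lambda x: quad(A, x), x0)
formula = [sum((A[i][j] + A[j][i]) * x0[j] for j in range(2)) for i in range(2)]
two_Ax = [2 * sum(A[i][j] * x0[j] for j in range(2)) for i in range(2)]

print(numeric, formula, two_Ax)  # numeric matches formula, not two_Ax
```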
Matrix functions
(To be added.)