Machine Learning Notes on the Gaussian Distribution (1): Computing the Optimal Parameters with Maximum Likelihood Estimation

Article link: https://blog.csdn.net/qq_34758157/article/details/125620336

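The Gaussian distribution, also called the normal distribution, plays an important role in mathematics, physics, engineering and other fields. Depending on the dimension of the random variable $\mathcal X$, it is divided into the univariate (one-dimensional) normal distribution and the multivariate normal distribution.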

Univariate normal distribution

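Suppose a one-dimensional random variable $\mathcal X$ follows a Gaussian distribution with expectation $\mu$ and variance $\sigma^2$. We write:
$\mathcal X \sim \mathcal N(\mu,\sigma^2)$
The corresponding probability density function (PDF) is:
$f(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp \left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$
Here $\mu$ and $\sigma$ (the standard deviation) are the parameters of $\mathcal N(\mu,\sigma^2)$, and the sample mean and sample variance are sufficient statistics for them: if a dataset is known to follow a Gaussian distribution, knowing its sample mean and variance is enough to pin down the distribution and to generate arbitrary samples from it.
Sufficient statistics are covered in detail in the notes on the exponential family of distributions.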
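
To make the remark about sufficient statistics concrete, here is a minimal sketch, assuming NumPy is available. The helper names `summarize` and `resample`, the seed, the sample size and the "true" parameters are all illustration choices, not part of the original notes. The sketch draws data from a Gaussian, keeps only the sample mean and sample variance, and then generates fresh samples from those two numbers alone.

```python
import numpy as np


def summarize(data):
    """Return the sample mean and the (1/N) sample variance of the data."""
    data = np.asarray(data, dtype=float)
    return data.mean(), data.var()


def resample(mean, var, size, rng):
    """Draw new samples from N(mean, var) using only the two summaries."""
    return rng.normal(loc=mean, scale=np.sqrt(var), size=size)


# Hypothetical ground truth used only to create example data.
rng = np.random.default_rng(0)
data = rng.normal(loc=1.5, scale=2.0, size=100_000)

mean_hat, var_hat = summarize(data)
new_samples = resample(mean_hat, var_hat, size=5, rng=rng)

print("estimated mean:", mean_hat)
print("estimated variance:", var_hat)
print("fresh samples:", new_samples)
```

The estimates should land close to the chosen values (mean 1.5, variance 4), and the raw data is no longer needed once the two summaries are stored.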

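The expectation $\mu$ determines where the density is centered; the variance $\sigma^2$ determines how widely it spreads. The distribution with $\mu=0,\sigma=1$ is called the standard normal distribution. The following code plots the density for several parameter choices: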

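import math

import matplotlib.pyplot as plt
import numpy as np


def norm(x, mu, sigma):
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    coeff = 1 / (sigma * math.sqrt(2 * math.pi))
    # The exponent is -(x - mu)^2 / (2 * sigma^2), matching the PDF formula.
    return coeff * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))


if __name__ == '__main__':
    x = np.linspace(-10, 10, 500)
    y = [norm(i, 0, 1) for i in x]       # standard normal
    y_1 = [norm(i, 0, 4.0) for i in x]   # same mean, larger spread
    y_2 = [norm(i, -2, 4.0) for i in x]  # same spread, shifted mean
    plt.plot(x, y)
    plt.plot(x, y_1)
    plt.plot(x, y_2)
    plt.show()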

The resulting plot:
[Figure: the three density curves produced by the code above]
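
As a sanity check on the density formula, the sketch below evaluates it on a wide grid, verifies that the area under each curve is close to 1, and compares the values with `scipy.stats.norm.pdf`. It assumes NumPy and SciPy are installed; the function name `gaussian_pdf`, the grid limits and the grid size are arbitrary illustration choices.

```python
import numpy as np
from scipy.stats import norm as scipy_norm


def gaussian_pdf(x, mu, sigma):
    """Vectorized density of N(mu, sigma^2); sigma is the standard deviation."""
    x = np.asarray(x, dtype=float)
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))


# A grid wide enough that the tails outside it are negligible.
x = np.linspace(-40, 40, 20001)
dx = x[1] - x[0]

checks = []
for mu, sigma in [(0, 1), (0, 4), (-2, 4)]:
    y = gaussian_pdf(x, mu, sigma)
    area = np.sum(y) * dx  # simple Riemann sum approximating the integral
    max_diff = np.max(np.abs(y - scipy_norm.pdf(x, loc=mu, scale=sigma)))
    checks.append((mu, sigma, area, max_diff))
    print(f"mu={mu}, sigma={sigma}: area={area:.6f}, max diff vs SciPy={max_diff:.2e}")
```

A version of the formula that drops the factor 2 in the exponent would fail both checks, which makes this a cheap guard when writing the density by hand.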

Multivariate normal distribution

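The multivariate normal distribution generalizes the univariate normal distribution to several dimensions. It is characterized by the expectation vector and the covariance matrix, written $\mu,\Sigma$. If $\Sigma$ is nonsingular, the probability density function is:
$f_{\mathbf x}(x_1,x_2,...,x_p) = \frac{1}{\sqrt{(2\pi)^p|\Sigma|}}\exp \left[-\frac{1}{2}(\mathbf x - \mu)^{\top}\Sigma^{-1}(\mathbf x - \mu) \right]$
where:
$\mathbf x \in \mathcal X$ is one sample from the dataset $\mathcal X$ with $p$ dimensions, $\mathbf x = (x_1,x_2,...,x_p)^{\top}$, and $|\Sigma|$ denotes the determinant of the covariance matrix $\Sigma$. The following code plots the density for $p=2$: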

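import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import multivariate_normal


def draw_pic(mu, sigma):
    # Build a 50 x 50 grid over [-2, 2] x [-2, 2].
    x = np.linspace(-2, 2, 50)
    y = np.linspace(-2, 2, 50)
    X, Y = np.meshgrid(x, y)
    # Stack the grid into shape (50, 50, 2): one (x, y) point per cell.
    pos = np.empty(X.shape + (2,))
    pos[:, :, 0] = X
    pos[:, :, 1] = Y
    # Here sigma is the covariance matrix, not a standard deviation.
    gaussian = multivariate_normal(mu, sigma)

    fig = plt.figure()
    # add_subplot creates the 3D axes; recent Matplotlib versions
    # no longer accept fig.gca(projection='3d').
    ax = fig.add_subplot(projection='3d')
    ax.plot_surface(X, Y, gaussian.pdf(pos), cmap='viridis', linewidth=0)
    plt.show()


if __name__ == '__main__':
    mu = [0, 0]
    sigma = [[0.5, 0], [0, 0.5]]
    draw_pic(mu, sigma)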

The resulting figure:
[Figure: 3D surface of the bivariate Gaussian density produced by the code above]
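
The multivariate density formula can also be evaluated directly. The following is a minimal sketch, assuming NumPy and SciPy are installed; the function name `mvn_pdf` and the test point are hypothetical choices made for illustration. It computes the density term by term and compares it with `scipy.stats.multivariate_normal`.

```python
import numpy as np
from scipy.stats import multivariate_normal


def mvn_pdf(x, mu, cov):
    """Density of N(mu, cov) at a single point x, for a nonsingular cov."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    p = mu.shape[0]
    diff = x - mu
    # Quadratic form (x - mu)^T cov^{-1} (x - mu); solving a linear system
    # avoids forming the inverse explicitly.
    quad = diff @ np.linalg.solve(cov, diff)
    norm_const = np.sqrt((2 * np.pi) ** p * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm_const


mu = [0.0, 0.0]
cov = [[0.5, 0.0], [0.0, 0.5]]
point = [0.3, -0.2]  # arbitrary evaluation point

ours = mvn_pdf(point, mu, cov)
reference = multivariate_normal(mean=mu, cov=cov).pdf(point)
print("formula:", ours)
print("SciPy:  ", reference)
```

With this covariance matrix the density at the mean is $1/\sqrt{(2\pi)^2 \cdot 0.25} = 1/\pi$, which gives a second, hand-checkable reference value.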

Review: datasets and probabilistic models

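The previous section introduced a dataset as a collection of samples produced by a large number of repeated trials, and the dimension of a vector as the number of values used to describe a sample's features.
Suppose we are given a dataset $\mathcal X$ containing $N$ samples, each with $p$ dimensions. In mathematical notation: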
$\mathcal X = \begin{pmatrix} x^{(1)} \\ x^{(2)} \\ x^{(3)}\\ \vdots \\ x^{(N)} \end{pmatrix} = \begin{pmatrix} x_1^{(1)} & x_2^{(1)} & \cdots & x_p^{(1)} \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_p^{(2)} \\ x_1^{(3)} & x_2^{(3)} & \cdots & x_p^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{(N)} & x_2^{(N)} & \cdots & x_p^{(N)} \end{pmatrix}_{N\times p}$
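
In code, this $N \times p$ layout maps naturally onto a two-dimensional array. The sketch below assumes NumPy; the values of `N` and `p` and the random placeholder entries are illustration choices only. Each row is one sample $x^{(i)}$ and each column collects one feature across all samples.

```python
import numpy as np

N, p = 5, 3  # hypothetical sizes: 5 samples, 3 features each
rng = np.random.default_rng(0)

# Placeholder data: one row per sample, one column per feature.
X = rng.normal(size=(N, p))

first_sample = X[0]       # x^{(1)}, a vector with p entries
second_feature = X[:, 1]  # x_2^{(i)} for i = 1..N, a vector with N entries

print("dataset shape:", X.shape)
print("first sample:", first_sample)
print("second feature of every sample:", second_feature)
```

Note that the array indices start at 0 while the mathematical indices start at 1, so `X[i - 1, j - 1]` corresponds to $x_j^{(i)}$.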