ML Series: Part 29, Continuous Probability Distributions (Laplace Distribution)

Original article, published 2024-11-23 11:45:53


License: CC 4.0 BY-SA


Contents

I. Introduction
II. Properties of the Laplace distribution
III. Example
IV. Interpreting the density function

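I. Introduction

The Laplace distribution, also known as the double exponential distribution, is a probability distribution often used in statistics to model data with a sharper peak and heavier tails than the normal distribution. It is named after the French mathematician Pierre-Simon Laplace, who introduced it in 1774.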

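II. Properties of the Laplace distribution

Symmetry and tails: The Laplace distribution is symmetric about its mean, but unlike the normal distribution it has heavier tails. This makes it better suited to modeling data that contain outliers or extreme values.
Probability density function (PDF): The probability density function of the Laplace distribution is given by:
$$f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$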

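μ is the location parameter, which determines the center of the distribution.
b is the scale parameter (b > 0), which controls the spread, or width, of the distribution.
Mean:
The mean of the Laplace distribution is given by the location parameter: for a Laplace distribution with parameters μ and b, the mean is μ.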

Variance:
The variance of the Laplace distribution is a function of the scale parameter b:
$$\operatorname{Var}(X) = 2b^2$$

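Standard deviation:
The standard deviation σ of the Laplace distribution is the square root of the variance:
$$\sigma = \sqrt{2b^2} = \sqrt{2}\,b$$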
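For readers who want to see where the variance 2b² comes from, here is a short derivation sketch. It substitutes t = x − μ and uses the symmetry of the density about μ to fold the integral onto the positive half-line; the last step uses the standard result that the integral of t² e^{−t/b} over [0, ∞) equals 2b³.

$$
\begin{aligned}
\operatorname{Var}(X) &= \int_{-\infty}^{\infty} (x - \mu)^2 \, \frac{1}{2b} \, e^{-|x - \mu|/b} \, dx \\
&= \frac{1}{b} \int_{0}^{\infty} t^2 e^{-t/b} \, dt \\
&= \frac{1}{b} \cdot 2b^3 = 2b^2
\end{aligned}
$$

Taking the square root gives the standard deviation √2 · b.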

III. Example

Suppose we have a Laplace distribution with parameters μ = 0 and b = 1. Then:

Mean (μ) = 0
Variance = 2 × 1² = 2
Standard deviation (σ) = √2 × 1 ≈ 1.414
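As a rough sanity check on these numbers, one can draw random samples and compare the sample statistics with the theoretical values. The sketch below uses NumPy's `Generator.laplace`; the seed and the sample size are arbitrary choices made for illustration, not part of the original example.

```python
import numpy as np

# Parameters from the worked example: mu = 0, b = 1
mu, b = 0.0, 1.0

# Fixed seed (arbitrary) so that repeated runs give the same samples
rng = np.random.default_rng(seed=0)
samples = rng.laplace(loc=mu, scale=b, size=1_000_000)

# Sample statistics to compare against the theoretical values
sample_mean = samples.mean()
sample_var = samples.var()
sample_std = samples.std()

print(f"sample mean     = {sample_mean:.3f}  (theory: {mu})")
print(f"sample variance = {sample_var:.3f}  (theory: {2 * b**2})")
print(f"sample std      = {sample_std:.3f}  (theory: {np.sqrt(2) * b:.3f})")
```

With a million samples the estimates should land close to 0, 2 and 1.414, although the exact digits depend on the seed.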
Examples in Python
1. Laplace distribution with varying μ
The first script below plots the Laplace distribution for several values of the location parameter μ.

import numpy as np
import matplotlib.pyplot as plt

# Define the Laplacian distribution function
def laplacian_pdf(x, mu, b):
    return (1 / (2 * b)) * np.exp(-np.abs(x - mu) / b)

# Values of the location parameter 'mu'
mu_values = [-2, 0, 2]
b = 1  # Fixed scale parameter

# Create a range of x values
x = np.linspace(-10, 10, 1000)

# Plot the Laplacian distributions for varying mu
plt.figure(figsize=(12, 6))

for mu in mu_values:
    y = laplacian_pdf(x, mu, b)
    plt.plot(x, y, label=f'μ = {mu}')

plt.title('Laplacian Distribution with Varying μ')
plt.xlabel('x')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()

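The code above produces Figure 1.

Figure 1. Laplace distribution with varying μ (location parameter). The script plots the Laplace distribution for different values of the location parameter μ, with the scale parameter fixed at b = 1. Changing μ shifts the curve along the x-axis without changing its shape.

The laplacian_pdf function computes the probability density for each value of μ in mu_values.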
2. Laplace distribution with varying b
The second script below plots the Laplace distribution for several values of the scale parameter b.

import numpy as np
import matplotlib.pyplot as plt

# Define the Laplacian distribution function
def laplacian_pdf(x, mu, b):
    return (1 / (2 * b)) * np.exp(-np.abs(x - mu) / b)

# Values of the scale parameter 'b'
b_values = [0.5, 1, 2]
mu = 0  # Fixed location parameter

# Create a range of x values
x = np.linspace(-10, 10, 1000)

# Plot the Laplacian distributions for varying b
plt.figure(figsize=(12, 6))

for b in b_values:
    y = laplacian_pdf(x, mu, b)
    plt.plot(x, y, label=f'b = {b}, σ = {np.sqrt(2)*b:.2f}')

plt.title('Laplacian Distribution with Varying b')
plt.xlabel('x')
plt.ylabel('Probability Density')
plt.legend()
plt.grid(True)
plt.show()

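The code above produces Figure 2.

Figure 2. Laplace distribution with varying b (scale parameter).
The script plots the Laplace distribution for different values of the scale parameter b, with the location parameter fixed at μ = 0. The peak height is 1/(2b), so a smaller b gives a taller, narrower curve and a larger b gives a lower, wider one.
The laplacian_pdf function computes the probability density for each value of b in b_values.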
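One way to check that laplacian_pdf really is a density is to confirm numerically that the area under each curve is 1. The sketch below reuses the same function and b values. The integration range [-60, 60] and the grid spacing are assumptions chosen for illustration, and the trapezoidal rule is written out by hand rather than taken from a library.

```python
import numpy as np

def laplacian_pdf(x, mu, b):
    return (1 / (2 * b)) * np.exp(-np.abs(x - mu) / b)

mu = 0
# Wide grid (an assumption for this check) so almost no mass is cut off
x = np.linspace(-60, 60, 120_001)
dx = x[1] - x[0]

areas = {}
peaks = {}
for b in [0.5, 1, 2]:
    y = laplacian_pdf(x, mu, b)
    # Trapezoidal rule: full weight for interior points, half for the endpoints
    areas[b] = dx * (y.sum() - 0.5 * (y[0] + y[-1]))
    # The maximum of the density sits at x = mu and equals 1 / (2b)
    peaks[b] = laplacian_pdf(mu, mu, b)
    print(f"b = {b}: area = {areas[b]:.6f}, peak = {peaks[b]:.3f}")
```

Note that the plotting range [-10, 10] used in the scripts above is narrower than this: for b = 2 a fraction exp(-5), roughly 0.7% of the probability mass, lies outside it.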
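IV. Interpreting the density function

Question: Why does the Laplace distribution have a sharp peak and heavy tails?
The density depends on the absolute distance |x − μ|, which is not differentiable at x = μ, so the PDF has a sharp corner at its peak instead of the rounded top of a normal distribution. Away from the mean, the Laplace density decays as exp(−|x − μ|/b), that is, exponentially in the distance, whereas the normal density decays as exp(−(x − μ)²/(2σ²)), exponentially in the squared distance. The Laplace tails therefore shrink more slowly and carry more probability at extreme values.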
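To put numbers on the heavier tails, the sketch below compares the probability of falling more than k standard deviations from the mean under a Laplace and a normal distribution with the same variance. It relies on two closed forms: for the Laplace distribution P(|X − μ| > t) = exp(−t/b) with σ = √2 · b, and for the normal distribution P(|X − μ| > kσ) = erfc(k/√2). The function names are hypothetical helpers introduced here for illustration.

```python
import math

def laplace_tail(k):
    """P(|X - mu| > k * sigma) for a Laplace distribution, where sigma = sqrt(2) * b."""
    return math.exp(-math.sqrt(2) * k)

def normal_tail(k):
    """P(|X - mu| > k * sigma) for a normal distribution."""
    return math.erfc(k / math.sqrt(2))

for k in [1, 2, 3, 4, 5]:
    lt = laplace_tail(k)
    nt = normal_tail(k)
    print(f"k = {k}: Laplace = {lt:.2e}, normal = {nt:.2e}, ratio = {lt / nt:.1f}")
```

At equal variance the Laplace distribution places slightly less mass beyond one standard deviation than the normal distribution, because more of its mass is concentrated in the peak. By two standard deviations it places more, and the gap widens quickly as k grows.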