Likelihood Ratio Test (LRT)

License: CC 4.0 BY-SA

Tags:

#likelihood ratio test #似然比检验


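The likelihood ratio test (LRT) is a method for testing whether a constraint on the parameters actually holds (for example, whether the constraint that a parameter $\theta$ of a distribution or model equals $\theta_0$ is true). The idea is: "If the parameter constraint is valid, then imposing it should not cause a large drop in the maximum of the likelihood function. In other words, the likelihood ratio test essentially compares the maximum of the likelihood function under the constraint with its maximum without the constraint." The likelihood ratio test is therefore a general-purpose testing method, applicable in a wider range of settings than the $t$ test, the $\chi ^2$ test, and so on.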

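The following is excerpted from *Applied Multivariate Statistical Analysis*:

Consider the multivariate normal distribution $N_q(\theta ,I)$. To test whether $\theta$ equals $\theta_0$, we set up the testing problem: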

   $H_0:\theta =\theta _0$

   $H_1:$ no constraint on $\theta$

or, equivalently, $\Omega _0=\left \{ \theta _0 \right \}, \Omega _1=R^q$.

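Define $L_j^*=\max_{\theta \in \Omega _j}L(X;\theta )$, $j=0,1$, as the maximum of the likelihood function under each hypothesis. Consider the likelihood ratio (LR):

   $\lambda (X)=\frac{L_0^*}{L_1^*}$ or, in logarithmic form,   $-2\log\lambda =2(l_1^*-l_0^*)$, where $l_j^*=\log L_j^*$.

[ Note: the factor of 2 is conventional because it leads to a chi-square approximation in the subsequent derivation: the asymptotic distribution of $-2\log\lambda$ under $H_0$ is $\chi _{q-r}^2$, where $\Omega _1$ is $q$-dimensional and $\Omega _0\subset \Omega _1$ is $r$-dimensional. ]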
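
To make the definitions concrete, here is a short worked derivation for this normal example. It assumes an i.i.d. sample $x_1,\dots ,x_n$ from $N_q(\theta ,I)$ with sample mean $\bar{x}$; the sample notation is an assumption added for illustration and does not appear in the excerpt above.

$$l(\theta )=-\frac{nq}{2}\log(2\pi )-\frac{1}{2}\sum_{i=1}^{n}\lVert x_i-\theta \rVert ^2$$

$$l_0^*=l(\theta _0),\qquad l_1^*=l(\bar{x})$$

$$-2\log\lambda =2(l_1^*-l_0^*)=\sum_{i=1}^{n}\lVert x_i-\theta _0\rVert ^2-\sum_{i=1}^{n}\lVert x_i-\bar{x}\rVert ^2=n\lVert \bar{x}-\theta _0\rVert ^2$$

Under $H_0$, $\sqrt{n}(\bar{x}-\theta _0)\sim N_q(0,I)$, so $-2\log\lambda \sim \chi _q^2$, which agrees with $\chi _{q-r}^2$ because $\Omega _0$ is a single point ($r=0$).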

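If the value of $\lambda$ is high (close to 1), we tend to accept $H_0$; otherwise we tend to accept $H_1$. Equivalently, $H_0$ is rejected when $-2\log\lambda$ is large.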
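
As a hedged illustration of this decision rule, the R sketch below evaluates $-2\log\lambda$ for the $N_q(\theta ,I)$ case on a small made-up data matrix. The data `X`, the value `theta0`, and the helper `loglik` are assumptions for illustration only, not taken from the book. The statistic is computed directly from the two maximized log-likelihoods and compared with the closed form $n\lVert \bar{x}-\theta _0\rVert ^2$ that holds for this model.

```r
# Made-up sample: n = 4 observations of a q = 2 dimensional vector.
X <- matrix(c( 0.2, -0.1,
               0.5,  0.3,
              -0.4,  0.1,
               0.1,  0.6), ncol = 2, byrow = TRUE)
theta0 <- c(0, 0)            # hypothesised mean under H0 (assumed value)
n <- nrow(X)
q <- ncol(X)

# Log-likelihood of N_q(theta, I): components are independent N(theta_j, 1).
loglik <- function(theta) {
  sum(dnorm(sweep(X, 2, theta), mean = 0, sd = 1, log = TRUE))
}

l0 <- loglik(theta0)         # maximum under H0 (only one point in Omega_0)
l1 <- loglik(colMeans(X))    # unconstrained maximum, attained at the sample mean

lr_stat     <- 2 * (l1 - l0)                        # -2 log(lambda)
closed_form <- n * sum((colMeans(X) - theta0)^2)    # n * ||xbar - theta0||^2

# Under H0 the statistic is chi-square with q degrees of freedom.
p_value <- pchisq(lr_stat, df = q, lower.tail = FALSE)
```

A large `lr_stat` (small `p_value`) corresponds to a small $\lambda$ and therefore speaks against $H_0$.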

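For background on the likelihood function, see:

Likelihood and Maximum Likelihood Estimation

Likelihood Function, Maximum Likelihood Estimation, and the Likelihood Ratio Test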

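The LRT is widely applied, for example to comparisons of means (including mean vectors), repeated measurements, profile analysis (comparison of trends), and model goodness of fit.

Comparison of mean vectors: take two-dimensional vectors as an example. To test simultaneously whether the height $x_1$ and weight $x_2$ of two groups of people, A and B, come from the same population, combine the mean height and mean weight into a vector: the mean vector of group A is $\vec{\mu}_A=(\bar{x}_{1A},\bar{x}_{2A})$ and that of group B is $\vec{\mu}_B=(\bar{x}_{1B},\bar{x}_{2B})$, and test the two mean vectors against each other (here the LRT is in fact equivalent to Hotelling's $T^2$ test). Problems about mean vectors are essentially linear hypothesis (linear constraint) problems, and the same approach can also be used to test hypotheses about regression coefficients.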
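
The two-group comparison can be sketched in base R as follows. This is a minimal sketch under assumptions: the height and weight values in `A` and `B` are invented for illustration, and equal covariance matrices in the two groups are assumed (the usual setting for the two-sample Hotelling's $T^2$ test). The last lines compute Wilks' Lambda, a monotone function of the likelihood ratio ($\Lambda =\lambda ^{2/n}$ with $n=n_1+n_2$), to show numerically that it is determined by $T^2$.

```r
# Invented data: columns are height (cm) and weight (kg).
A <- matrix(c(170, 65,
              175, 72,
              168, 60,
              180, 78,
              172, 68), ncol = 2, byrow = TRUE)
B <- matrix(c(165, 58,
              160, 55,
              170, 66,
              162, 57,
              168, 61,
              158, 52), ncol = 2, byrow = TRUE)

n1 <- nrow(A); n2 <- nrow(B); p <- ncol(A)
d  <- colMeans(A) - colMeans(B)                 # difference of mean vectors

# Pooled covariance matrix (assumes a common covariance in both groups).
S_pooled <- ((n1 - 1) * cov(A) + (n2 - 1) * cov(B)) / (n1 + n2 - 2)

# Hotelling's two-sample T^2 and its exact F transformation.
T2      <- (n1 * n2 / (n1 + n2)) * sum(d * solve(S_pooled, d))
F_stat  <- (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * T2
p_value <- pf(F_stat, df1 = p, df2 = n1 + n2 - p - 1, lower.tail = FALSE)

# Likelihood-ratio form: Wilks' Lambda = |W| / |T|.
W     <- (n1 - 1) * cov(A) + (n2 - 1) * cov(B)      # within-group SSCP
Tot   <- (n1 + n2 - 1) * cov(rbind(A, B))           # total SSCP
wilks <- det(W) / det(Tot)

# Identity linking the two statistics.
wilks_from_T2 <- 1 / (1 + T2 / (n1 + n2 - 2))
```

Because `wilks` and `wilks_from_T2` coincide, rejecting for small $\lambda$ is the same as rejecting for large $T^2$.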

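Repeated measurements: the same variable is measured several times on the same subject, and we test whether the repeated measurements differ (for example, whether there is a time effect or a treatment effect).

Profile analysis: take the profiles of two groups as an example. When repeated measurements are taken on subjects in two groups, we want to know whether the two groups follow the same trend over the repeated measurements. This can be approached through the following three questions, considered in order: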

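Are the profiles similar in the sense of being parallel (parallel here means there is no interaction between treatment and group, not merely that the profiles do not cross)?
If the profiles are parallel, are they at the same level (are the two profiles in fact one and the same)?
If the profiles are parallel but not at the same level, is there a treatment effect (do the profiles stay the same whatever treatment is received, i.e., do they follow the same trend)?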

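These questions can be recast as linear constraints on the means and solved accordingly.

For details on these topics, see *Applied Multivariate Statistical Analysis*.

Below we only describe using the likelihood ratio test to assess model fit.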
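
As one hedged example of such a linear constraint, the first question (parallelism) can be written as $H_0: C\mu _1=C\mu _2$, where $C$ is a matrix of successive differences. The R sketch below tests it with a Hotelling-type statistic on the differenced means. The data in `g1` and `g2` are invented, a common covariance matrix is assumed, and the names `C`, `g1`, `g2` are illustrative choices rather than notation from the book.

```r
# Invented data: 5 subjects per group, p = 3 repeated measurements each.
g1 <- matrix(c(10, 12, 15,
               11, 14, 16,
                9, 12, 13,
               12, 13, 17,
               10, 15, 16), ncol = 3, byrow = TRUE)
g2 <- matrix(c( 8, 11, 13,
                9, 12, 15,
                7,  9, 12,
               10, 12, 14,
                8, 12, 13), ncol = 3, byrow = TRUE)

n1 <- nrow(g1); n2 <- nrow(g2); p <- ncol(g1)

# (p-1) x p contrast matrix of successive differences: rows (1,-1,0), (0,1,-1).
C <- cbind(diag(p - 1), 0) - cbind(0, diag(p - 1))

d        <- colMeans(g1) - colMeans(g2)
S_pooled <- ((n1 - 1) * cov(g1) + (n2 - 1) * cov(g2)) / (n1 + n2 - 2)

# Hotelling-type statistic for H0: C (mu1 - mu2) = 0 (parallel profiles).
Cd <- drop(C %*% d)
T2 <- (n1 * n2 / (n1 + n2)) * sum(Cd * solve(C %*% S_pooled %*% t(C), Cd))

# Exact F transformation with (p - 1, n1 + n2 - p) degrees of freedom.
F_stat  <- (n1 + n2 - p) / ((n1 + n2 - 2) * (p - 1)) * T2
p_value <- pf(F_stat, df1 = p - 1, df2 = n1 + n2 - p, lower.tail = FALSE)
```

A small `p_value` would indicate that the profiles are not parallel, in which case the second and third questions are no longer meaningful in this form.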

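Assessing model fit

The likelihood ratio test is used to assess which of two models is better suited to the data at hand. Specifically, a relatively complex model is compared with a simpler one to test whether it fits a particular data set significantly better. If it does, the additional parameters of the complex model are worth keeping in subsequent analysis. A prerequisite of the LRT is that the models being compared are hierarchically nested: relative to the simple model, the complex model differs only by having one or more additional parameters. Adding parameters to a model can never lower the maximized likelihood, so judging fit by the raw likelihood value alone is unreliable. The LRT provides an objective criterion for choosing a suitable model. The LRT statistic is: LR = 2*(lnL1 - lnL2)

Here L1 is the maximized likelihood of the complex model and L2 is that of the simple (baseline) model. LR approximately follows a chi-square distribution. To test whether the difference between the two likelihoods is significant, the degrees of freedom must be taken into account: in the LRT they equal the number of additional parameters in the complex model. Comparing LR with the critical value from a chi-square table then tells us whether the models differ significantly.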
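
The recipe LR = 2*(lnL1 - lnL2) with a chi-square reference distribution can be sketched in base R. The example is illustrative only: it uses the built-in `mtcars` data and two nested linear models chosen arbitrarily as stand-ins for a "simple" and a "complex" model.

```r
# Nested pair: the complex model adds one parameter (the hp coefficient).
simple_model  <- lm(mpg ~ wt,      data = mtcars)
complex_model <- lm(mpg ~ wt + hp, data = mtcars)

# Maximized log-likelihoods (lnL2 for the simple model, lnL1 for the complex one).
lnL_simple  <- as.numeric(logLik(simple_model))
lnL_complex <- as.numeric(logLik(complex_model))

# Likelihood ratio statistic.
LR <- 2 * (lnL_complex - lnL_simple)

# Degrees of freedom = number of additional parameters in the complex model.
df <- attr(logLik(complex_model), "df") - attr(logLik(simple_model), "df")

# Compare with the chi-square distribution: p-value and 5% critical value.
p_value        <- pchisq(LR, df = df, lower.tail = FALSE)
critical_value <- qchisq(0.95, df = df)
reject_simple  <- LR > critical_value
```

If `reject_simple` is `TRUE`, the extra parameter improves the fit by more than chance alone would explain at the 5% level. The chi-square reference is an asymptotic approximation, so it should be read with caution for small samples.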

The following is excerpted from Wikipedia:

In statistics, a likelihood ratio test (LR test) is a statistical test used for comparing the goodness of fit of two statistical models — a null model against an alternative model. The test is based on the likelihood ratio, which expresses how many times more likely the data are under one model than the other. This likelihood ratio, or equivalently its logarithm, can then be used to compute a p-value, or compared to a critical value to decide whether or not to reject the null model.

When the logarithm of the likelihood ratio is used, the statistic is known as a log-likelihood ratio statistic, and the probability distribution of this test statistic, assuming that the null model is true, can be approximated using Wilks' theorem.

In the case of distinguishing between two models, each of which has no unknown parameters, use of the likelihood ratio test can be justified by the Neyman–Pearson lemma, which demonstrates that such a test has the highest power among all competitors.

Being a function of the data x, the likelihood ratio is therefore a statistic. The likelihood ratio test rejects the null hypothesis if the value of this statistic is too small. How small is too small depends on the significance level of the test, i.e., on what probability of Type I error is considered tolerable ("Type I" errors consist of the rejection of a null hypothesis that is true).

The numerator corresponds to the likelihood of an observed outcome under the null hypothesis. The denominator corresponds to the maximum likelihood of an observed outcome varying parameters over the whole parameter space. The numerator of this ratio is less than the denominator. The likelihood ratio hence is between 0 and 1. Low values of the likelihood ratio mean that the observed result was less likely to occur under the null hypothesis as compared to the alternative. High values of the statistic mean that the observed outcome was nearly as likely to occur under the null hypothesis as the alternative, and the null hypothesis cannot be rejected.

The likelihood-ratio test requires nested models – models in which the more complex one can be transformed into the simpler model by imposing a set of constraints on the parameters. If the models are not nested, then a generalization of the likelihood-ratio test can usually be used instead: the relative likelihood.

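In other words, the two models being compared must be nested: the parameters of one model are unconstrained, and the other model is obtained from it by imposing constraints. If the two models are not nested, the LRT cannot be used directly; a generalization of it, the relative likelihood, is used instead.

Many R packages provide a function for this; a commonly used one is lrtest() in the rms package. For example: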
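
For the non-nested case, one common way to compute a relative likelihood between models is through AIC, as $\exp((\mathrm{AIC}_{\min}-\mathrm{AIC}_i)/2)$. The R sketch below applies it to two non-nested linear models on the built-in `mtcars` data. The models and variable choices are illustrative assumptions, and the AIC-based form is named here as one standard definition, not necessarily the one the quoted passage has in mind.

```r
# Two models with different predictors: neither is a constrained version of the other.
model_a <- lm(mpg ~ wt, data = mtcars)
model_b <- lm(mpg ~ hp, data = mtcars)

aic <- c(a = AIC(model_a), b = AIC(model_b))

# Relative likelihood of each model with respect to the best (lowest-AIC) one.
# The best model gets 1; smaller values mean less support from the data.
relative_likelihood <- exp((min(aic) - aic) / 2)
```

Unlike the LRT, this yields a measure of relative support rather than a p-value from a chi-square reference distribution.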

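library(rms)       # cph() and lrtest()
library(survival)  # Surv()
# The data.* vectors, survival.time and survival.status are assumed to exist already;
# survival.time and survival.status are looked up outside all.X.
all.X <- data.frame(x.T=data.T, x.N=data.N, x.S=data.S, x.G=data.G, x.V=data.V, x.P=data.P, x.CEA2=data.CEA2, x.CA1992=data.CA1992)
# Simple model: x.T and x.N only.
TN.model <- cph(Surv(survival.time,survival.status)~ x.T+x.N,
                data=all.X, na.action=na.omit )
# Complex model: adds x.CEA2. With na.omit, rows with missing x.CEA2 are dropped
# only here, so check that both models are fitted to the same observations.
TNC.model <- cph(Surv(survival.time,survival.status)~ x.T+x.N+x.CEA2,
                 data=all.X, na.action=na.omit )
# Likelihood ratio test of the nested pair (chi-square statistic, d.f., p-value).
TN2TNC <- lrtest(TN.model, TNC.model)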

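Besides the likelihood ratio test, the Wald test and the Lagrange multiplier test (score test) are also based on maximum likelihood estimation (MLE). In large samples the three are asymptotically equivalent.

The Neyman–Pearson lemma shows that, for testing one simple hypothesis against another, the likelihood ratio test is the most powerful among all tests with the same significance level.

See also this blog post: Likelihood Function, Maximum Likelihood Estimation, and the Likelihood Ratio Test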
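
To see the three statistics side by side, here is a minimal sketch for a one-parameter problem: testing $H_0:\lambda =\lambda _0$ for a Poisson rate. The counts in `x` and the value `lambda0` are made up for illustration. For this model the Wald and score statistics have the simple closed forms used in the code, and all three are referred to a $\chi _1^2$ distribution.

```r
# Made-up Poisson counts and hypothesised rate.
x       <- c(3, 5, 4, 6, 2, 7, 5, 4, 6, 3)
lambda0 <- 4
n          <- length(x)
lambda_hat <- mean(x)        # MLE of the Poisson rate

loglik <- function(lambda) sum(dpois(x, lambda, log = TRUE))

# LRT: twice the gap between unconstrained and constrained log-likelihoods.
lrt <- 2 * (loglik(lambda_hat) - loglik(lambda0))

# Wald: squared distance of the MLE from lambda0, scaled by the information at the MLE.
wald <- n * (lambda_hat - lambda0)^2 / lambda_hat

# Score (Lagrange multiplier): same distance, scaled by the information at lambda0.
score <- n * (lambda_hat - lambda0)^2 / lambda0

stats    <- c(LRT = lrt, Wald = wald, score = score)
p_values <- pchisq(stats, df = 1, lower.tail = FALSE)
```

The three values differ in finite samples because each evaluates the curvature of the log-likelihood at a different point; they approach one another as the sample grows.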

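References:

*Applied Multivariate Statistical Analysis*, Wolfgang Härdle et al., translated by Chen Shiyi. Peking University Press.

Likelihood Function, Maximum Likelihood Estimation, and the Likelihood Ratio Test

Likelihood and Maximum Likelihood Estimation

Wikipedia - likelihood ratio test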