LASSO by Coordinate Descent Method
Preparation:
from itertools import cycle
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import lasso_path, enet_path
from sklearn import datasets
from copy import deepcopy
# Synthetic data: 100 samples, 10 features, noiseless linear target
X = np.random.randn(100, 10)
y = np.dot(X, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
The code below is a simplified version of `_cd_fast.enet_coordinate_descent()` with `beta=0` and `l1_ratio=1` from scikit-learn (source code: lasso coordinate descent source). The original is implemented in Cython; the code here is pure Python for convenience: easier to understand, but much slower.
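Since `lasso_path` is already imported above, scikit-learn's own solver can serve as a reference to check the pure-Python version against. A minimal sketch (the regularization strength 0.1 is an arbitrary choice for illustration):

# Reference coefficients from scikit-learn's Cython coordinate descent
alphas, coefs, _ = lasso_path(X, y, alphas=[0.1])
print(coefs[:, 0])  # fitted coefficients at alpha = 0.1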
Coordinate Descent Method Framework
- Randomly initialize $\beta^{(0)}$ for iteration $0$
- For the $k$-th iteration:
    - For $j = 1$ to $p$:
        - $\beta^{(k)}_j = \operatorname{argmin}_{\beta_j} \mathcal{L}_{l1}(\beta) = \operatorname{argmin}_{\beta_j} \mathcal{L}_{l1}\left(\beta_1^{(k)}, \beta_2^{(k)}, \ldots, \beta_{j-1}^{(k)}, \beta_j, \beta_{j+1}^{(k-1)}, \ldots, \beta_p^{(k-1)}\right)$
    - Endfor
    - Check convergence: if converged, end the algorithm; otherwise continue updating (see the Python sketch below)
- Endfor
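A pure-Python sketch of this framework follows. The names `soft_threshold` and `coordinate_descent_lasso` are illustrative, not scikit-learn's API; the soft-thresholding form of the per-coordinate argmin is the standard closed-form solution, which the derivation below arrives at.

def soft_threshold(z, t):
    # S(z, t) = sign(z) * max(|z| - t, 0)
    return np.sign(z) * max(abs(z) - t, 0.0)

def coordinate_descent_lasso(X, y, lam, n_iters=100, tol=1e-6):
    N, p = X.shape
    beta = np.zeros(p)  # beta^(0): start from all zeros
    for k in range(n_iters):
        beta_old = beta.copy()
        for j in range(p):
            # residual with the contribution of the j-th coordinate removed
            r_j = y - X @ beta + X[:, j] * beta[j]
            # 1-D argmin of L_l1 over beta_j: a soft-thresholding step
            beta[j] = soft_threshold(X[:, j] @ r_j / N, lam) / (X[:, j] @ X[:, j] / N)
        # convergence check: largest coordinate change below tol
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

On the synthetic data above, `coordinate_descent_lasso(X, y, lam=0.1)` should recover coefficients close to `[1, 2, ..., 10]`, slightly shrunk toward zero by the penalty.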
Here the objective function is
$$\mathcal{L}_{l1} = \frac{1}{2N}(Y - X\beta)^T (Y - X\beta) + \lambda \left\lVert \beta \right\rVert_1$$

where the sizes of $X$, $Y$, $\beta$ are $N \times p$, $N \times 1$, $p \times 1$, which means $N$ samples and $p$ features.
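For reference, the objective maps directly to a few lines of NumPy. This small helper is written here for illustration and is not part of scikit-learn:

def lasso_objective(X, y, beta, lam):
    N = X.shape[0]
    residual = y - X @ beta
    # (1/2N) * ||y - X beta||^2 + lambda * ||beta||_1
    return residual @ residual / (2 * N) + lam * np.sum(np.abs(beta))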
Coordinate Descent Method Update Details
To update $\beta_j$, we need to find $\beta_j^*$ such that $\frac{\partial \mathcal{L}_{l1}(\beta)}{\partial \beta_j} = 0$.
Given