Using least squares in sklearn
1. Overview:
sklearn's least-squares fit models the target as a linear combination of the features:
$$\hat{y}(w, x) = w_0 + w_1 x_1 + ... + w_p x_p$$
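As a small illustration of the formula above, the prediction for a single sample can be evaluated directly with NumPy. The numbers here are hypothetical and chosen only to make the arithmetic easy to follow; they are not fitted values.

```python
import numpy as np

# Hypothetical values chosen only for illustration (p = 2 features).
w0 = 5.0                    # intercept w_0
w = np.array([2.0, 3.0])    # coefficients (w_1, w_2)
x = np.array([1.0, 4.0])    # one sample (x_1, x_2)

# y_hat = w_0 + w_1 * x_1 + w_2 * x_2 = 5 + 2 * 1 + 3 * 4
y_hat = w0 + np.dot(w, x)
print(y_hat)  # 19.0
```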
Definition of the loss function:
$$w = (w_1, ..., w_p)^T$$
$$
Xw = \begin{bmatrix}
x_{11} & x_{12} & \dots & x_{1p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \dots & x_{np}
\end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_p \end{bmatrix}
$$
$$\text{loss function} = \min_{w} \|Xw - y\|_2^2$$
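To make the minimization concrete, here is a minimal sketch that solves the same problem with NumPy's `np.linalg.lstsq` instead of sklearn. The toy matrix `X` and the vector `w_true` are hypothetical, and there is no intercept term, matching the $Xw$ form above.

```python
import numpy as np

# Hypothetical toy data: 5 samples, 2 features (values chosen only for illustration).
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0],
              [5.0, 5.0]])
w_true = np.array([2.0, 3.0])
y = X @ w_true  # noise-free targets, no intercept term

# Solve min_w ||Xw - y||_2^2 with NumPy's least-squares solver.
w, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

# Value of the squared-error loss at the solution.
loss = np.sum((X @ w - y) ** 2)

print(w)     # close to [2, 3]
print(loss)  # close to 0
```

Because the targets are generated without noise and the two columns of `X` are linearly independent, the minimizer coincides with `w_true` and the loss is zero up to floating-point error.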
2. Example usage
❗️ In sklearn, $(w_1, ..., w_p)$ is stored as coef_ and $w_0$ as intercept_
# sklearn's linear regression is fitted by least squares; import the module.
from sklearn import linear_model
reg = linear_model.LinearRegression()
def foo(x1, x2):  # w0 = 5, w1 = 2, w2 = 3
    return 2 * x1 + 3 * x2 + 5
"""Generate test data X, y
X: 10 rows, 2 columns
y: 10 values
"""
X = [[i,(i+1)/2] for i in range(10)]
y = [foo(i,(i+1)/2) for i in range(10)]
# Fit the model to the data
reg.fit(X,y)
# Prints w1, w2 = [2.8, 1.4] (up to floating-point rounding)
print(reg.coef_)
# Prints w0 = 5.8 (up to floating-point rounding)
print(reg.intercept_)
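"""
Fitted line: y = 2.8 * x1 + 1.4 * x2 + 5.8
Not (2, 3, 5): x2 = (x1 + 1) / 2 is collinear with x1, so the fit is not unique.
"""
# Predict with the fitted model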
print(reg.predict(X))
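The fitted coefficients differ from the ones used in `foo` because every sample satisfies x2 = (x1 + 1) / 2, so the two features are perfectly collinear and infinitely many coefficient sets reproduce `y` exactly. The sketch below checks this and then refits on a hypothetical alternative data set (`x2 = i ** 2`, an assumption made only for illustration) where the features are not collinear.

```python
import numpy as np
from sklearn import linear_model

def foo(x1, x2):  # same target as above: w0 = 5, w1 = 2, w2 = 3
    return 2 * x1 + 3 * x2 + 5

# Original design: x2 = (x1 + 1) / 2, so the columns [1, x1, x2] are linearly dependent.
X_collinear = np.array([[i, (i + 1) / 2] for i in range(10)])
design = np.column_stack([np.ones(len(X_collinear)), X_collinear])
print(np.linalg.matrix_rank(design))  # rank 2, fewer than the 3 columns

# Both coefficient sets give the same predictions on this data.
x1, x2 = X_collinear[:, 0], X_collinear[:, 1]
same_predictions = np.allclose(2.8 * x1 + 1.4 * x2 + 5.8, foo(x1, x2))
print(same_predictions)

# Hypothetical alternative: x2 = i ** 2 is not a linear function of x1.
X_indep = np.array([[i, i ** 2] for i in range(10)], dtype=float)
y_indep = foo(X_indep[:, 0], X_indep[:, 1])

reg2 = linear_model.LinearRegression()
reg2.fit(X_indep, y_indep)
print(reg2.coef_)       # close to [2, 3]
print(reg2.intercept_)  # close to 5
```

With linearly independent features the least-squares solution is unique, so the fit recovers the coefficients of `foo` up to floating-point error.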