Linear Regression

linjiet

License: CC 4.0 BY-SA

Categories: Machine Learning, Machine Learning and Statistical Learning Methods. Tags: linear regression

Article link: https://blog.csdn.net/qq_39742013/article/details/89502530

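This article takes an in-depth look at the linear regression model, covering its mathematical principles and the use of gradient descent. It builds an intuitive understanding of the loss function, solves for the parameters both numerically and analytically, and describes the stochastic gradient descent procedure and its parameter optimization techniques.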

This article records my notes on linear regression.

Linear regression can be viewed as a single-layer neural network with one output. Through this model, we can build intuition for the theory of gradient descent.

General model

$f(x)=w^Tx+b=w_1x_1+w_2x_2+...+w_nx_n+b$
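As a minimal sketch of the model above, the prediction is a dot product plus a bias. The function name `predict` and the sample numbers are illustrative assumptions, not part of the original notes.

```python
def predict(w, x, b):
    """Return f(x) = w^T x + b for one sample x (w and x are lists of equal length)."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    return sum(w_j * x_j for w_j, x_j in zip(w, x)) + b


# Illustrative values: 2*1 + (-1)*3 + 0.5 = -0.5
print(predict([2.0, -1.0], [1.0, 3.0], 0.5))
```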

Numerical solution

Loss function

The factor $\frac{1}{2}$ is there for convenience: it cancels the 2 that appears when the square is differentiated. $m$ is the number of samples in a batch.
$l=\frac{1}{m}\sum_{i=1}^m\frac{1}{2}(f(x^i)-y^i)^2\\ =\frac{1}{2m}\sum_{i=1}^m(w^Tx^i+b-y^i)^2$
For a given batch, $l$ is a quadratic function of $w$ and $b$. If we fix $b$, we can compute the gradient with respect to $w$. With a large learning rate, each update can overshoot the minimum of this quadratic, so the parameters jitter around the lowest point instead of settling into it. A good learning rate therefore has to be found by experiment.
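The loss and the learning-rate behaviour described above can be sketched on a one-weight toy problem. The helper names (`half_mse_loss`, `descend_1d`), the toy batch, and the three learning rates are assumptions chosen for illustration; the bias is held at 0 so that the loss is a quadratic in a single weight.

```python
def half_mse_loss(w, b, xs, ys):
    """l = 1/(2m) * sum_i (w^T x^i + b - y^i)^2 over a batch."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        pred = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
        total += (pred - y) ** 2
    return total / (2 * m)


def descend_1d(lr, steps, xs, ys, w0=0.0):
    """Gradient descent on a single weight with b fixed at 0; returns the history of w."""
    m = len(xs)
    w, history = w0, [w0]
    for _ in range(steps):
        # dl/dw = 1/m * sum_i (w * x^i - y^i) * x^i for one feature
        grad = sum((w * x[0] - y) * x[0] for x, y in zip(xs, ys)) / m
        w -= lr * grad
        history.append(w)
    return history


# Toy batch generated from y = 2x, so the minimum of the loss is at w = 2.
xs = [[1.0], [2.0], [3.0]]
ys = [2.0, 4.0, 6.0]

for lr in (0.1, 0.4, 0.5):
    ws = descend_1d(lr, 5, xs, ys)
    print(lr, [round(w, 3) for w in ws], round(half_mse_loss([ws[-1]], 0.0, xs, ys), 4))
```

On this particular batch, the smallest rate approaches $w=2$ from one side, the middle rate overshoots and alternates between the two sides of the minimum, and the largest rate moves further away with every step.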

SGD

First, compute the derivative.
$\frac{\partial l}{\partial w}=\frac{1}{2m}\sum_{i=1}^m(2(w^Tx^i+b-y^i)x^i),\\ \text{where } w \text{ and } x^i \text{ are both vectors.}\\ \text{For a single component } w_j:\\ \frac{\partial l}{\partial w_j}=\frac{1}{2m}\sum_{i=1}^m(2(w^Tx^i+b-y^i)x^i_j)\\ =\frac{1}{m}\sum_{i=1}^m(w^Tx^i+b)x^i_j-\frac{1}{m}\sum_{i=1}^m(x^i_jy^i)$
For a given batch, we use the constants $p$, $q$ and $z$ to replace $\frac{1}{m}\sum_i x^i_jx^i$, $\frac{1}{m}\sum_i x^i_jy^i$ and $\frac{1}{m}\sum_i x^i_j$ respectively ($p$ is a vector; $q$ and $z$ are scalars):
$\frac{\partial l}{\partial w_j}=w^Tp+zb-q\\ \text{where } zb-q \text{ is constant once } b \text{ is fixed.}$
Update the parameters:
$w_j \leftarrow w_j-\eta\frac{\partial l}{\partial w_j}$
The bias is updated in the same way, with $\frac{\partial l}{\partial b}=\frac{1}{m}\sum_{i=1}^m(w^Tx^i+b-y^i)$.
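The gradient and update rule can be sketched as a small mini-batch SGD loop in plain Python. This is only an illustration under stated assumptions: the function names, the hyperparameter defaults (`lr`, `batch_size`, `epochs`), and the noise-free toy data are all made up here, and the bias update uses the standard gradient $\frac{1}{m}\sum_i(w^Tx^i+b-y^i)$.

```python
import random


def batch_gradients(w, b, xs, ys):
    """Return (dl/dw, dl/db) of l = 1/(2m) * sum_i (w^T x^i + b - y^i)^2."""
    m = len(xs)
    grad_w = [0.0] * len(w)
    grad_b = 0.0
    for x, y in zip(xs, ys):
        err = sum(w_j * x_j for w_j, x_j in zip(w, x)) + b - y
        for j, x_j in enumerate(x):
            grad_w[j] += err * x_j / m
        grad_b += err / m
    return grad_w, grad_b


def sgd(xs, ys, lr=0.05, batch_size=4, epochs=200, seed=0):
    """Mini-batch SGD; the defaults are arbitrary choices that suit the toy data below."""
    rng = random.Random(seed)
    w, b = [0.0] * len(xs[0]), 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new order each epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w, grad_b = batch_gradients(
                w, b, [xs[i] for i in batch], [ys[i] for i in batch]
            )
            # w_j <- w_j - eta * dl/dw_j, and the same for b
            w = [w_j - lr * g_j for w_j, g_j in zip(w, grad_w)]
            b -= lr * grad_b
    return w, b


# Noise-free toy data from y = 2*x1 - 3*x2 + 1 (made up for illustration).
data_rng = random.Random(1)
xs = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(32)]
ys = [2 * x1 - 3 * x2 + 1 for x1, x2 in xs]
w, b = sgd(xs, ys)
print([round(v, 3) for v in w], round(b, 3))
```

Because the toy targets contain no noise, the learned parameters should end up close to the generating values (2, -3) and 1; on real data the result would depend on the learning rate and batch size.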

Analytical solution

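We can solve this problem directly with the least squares method, which gives the parameters in closed form.
For the full derivation, see Zhou Zhihua's Machine Learning (the "Watermelon Book").
(PS: for an n*d data matrix $X$ with d>n, r( $X^TX$ )<=r( $X^T$ )<=n<d, so the d*d matrix $X^TX$ is not full rank and cannot be inverted; the least squares solution is then not unique.)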
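As a rough sketch of the analytical route, the least squares parameters can be obtained by solving the normal equations $(X^TX)v=X^Ty$, where each row of $X$ is a sample with a trailing 1 appended so that the bias becomes the last entry of $v$. The helper names and toy data below are assumptions for illustration, and plain lists with a hand-written Gaussian elimination are used only to keep the example dependency-free; a linear algebra library would normally do this step.

```python
def solve_linear_system(a, c):
    """Solve a @ v = c by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | c], copied so the inputs are not modified.
    aug = [row[:] + [c_i] for row, c_i in zip(a, c)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or numerically close to it)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    v = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * v[k] for k in range(r + 1, n))
        v[r] = (aug[r][n] - tail) / aug[r][r]
    return v


def least_squares(xs, ys):
    """Solve the normal equations (X^T X) v = X^T y and return (w, b)."""
    rows = [x + [1.0] for x in xs]  # append 1 so the bias is the last weight
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(d)]
    v = solve_linear_system(xtx, xty)
    return v[:-1], v[-1]


# Made-up noise-free samples from y = 2*x1 - 3*x2 + 1.
xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
ys = [2 * x1 - 3 * x2 + 1 for x1, x2 in xs]
w, b = least_squares(xs, ys)
print([round(v, 6) for v in w], round(b, 6))

# With fewer samples than unknowns, X^T X is rank-deficient and the solve fails.
try:
    least_squares([[1.0, 2.0]], [3.0])
except ValueError as exc:
    print("underdetermined:", exc)
```

The last call illustrates the rank remark above: one sample cannot determine three unknowns, so $X^TX$ has no inverse and the sketch raises an error instead of returning an arbitrary solution.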