Curve Fitting

Regression is used to find the best-fitting curve for a set of data points by minimizing the residuals between the observed data and the fitted curve. Least-squares regression finds the best-fitting straight line by minimizing the sum of the squares of the vertical distances between each data point and the line. The normal equations are solved simultaneously for the parameters that define the line. The coefficient of determination and the correlation coefficient quantify how well the linear model fits the data compared to simply using the mean.


CURVE FITTING

What is Regression?

Given n data points (x1, y1), (x2, y2), ..., (xn, yn), best fit y = f(x) to the data. The best fit is generally based on minimizing the sum of the squares of the residuals, Sr.

The residual at a point is

    ei = yi - f(xi)

The sum of the squares of the residuals is

    Sr = Σ_{i=1}^{n} (yi - f(xi))^2

[Figure: Basic model for regression - a curve y = f(x) fitted through data points from (x1, y1) to (xn, yn).]
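To make these definitions concrete, here is a minimal Python sketch of the residual and Sr computations; the function names (residuals, sum_sq_residuals) and the small dataset are illustrative, not from the slides.

```python
# Minimal sketch of the residual definitions above; names and data
# are illustrative, not from the slides.

def residuals(xs, ys, f):
    """ei = yi - f(xi) for each data point."""
    return [yi - f(xi) for xi, yi in zip(xs, ys)]

def sum_sq_residuals(xs, ys, f):
    """Sr = sum of ei^2 over all n points."""
    return sum(e ** 2 for e in residuals(xs, ys, f))

# Example: Sr for an arbitrary trial curve y = 2x + 1 on made-up data.
xs = [1.0, 2.0, 3.0]
ys = [3.2, 4.8, 7.1]
print(sum_sq_residuals(xs, ys, lambda x: 2 * x + 1))
```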
Least-Squares Regression

When large error is associated with the data, polynomial interpolation is inappropriate, since forcing a curve through every noisy point requires a high-order polynomial that follows the noise rather than the trend. The better approach is to sketch a single best line through the points.

The curve should minimize the discrepancy between the data points and the curve.


Linear Regression

Given n data points (x1, y1), (x2, y2), ..., (xn, yn), best fit the data to the function

    f(x) = y = a0 + a1 x

The "best" line through the points is arbitrary, so a criterion is needed to establish a basis for the fit. One possible criterion is to minimize the sum of the residuals,

    Σ_{i=1}^{n} ei

where

    ei = yi - (a0 + a1 xi)

[Figure: Linear regression of y vs. x data, showing residuals at a typical point, xi.]
For a "Best" Fit:

Minimize the total sum of the squares of the residuals (errors) between the measured y and the y calculated with the linear model:

    Sr = Σ_{i=1}^{n} ei^2 = Σ_{i=1}^{n} (yi,measured - yi,model)^2 = Σ_{i=1}^{n} (yi - a0 - a1 xi)^2

where n = total number of points.


13.2.1 Criteria for best fit

• The sum of the squares of the residuals which should be minimized is:

    Sr = Σ_{i=1}^{n} ei^2 = Σ_{i=1}^{n} (yi - a0 - a1 xi)^2

• Find the constant parameters a0 and a1 that satisfy the above criterion; the error is minimized when

    ∂Sr/∂a0 = 0   and   ∂Sr/∂a1 = 0

[Figure: A straight-line fit f(x) with intercept a0 and slope a1, showing the residual ei at one data point.]
Least Squares Fit of a Straight Line:

To determine the values of a0 and a1:

    Sr = Σ_{i=1}^{n} ei^2 = Σ_{i=1}^{n} (yi - a0 - a1 xi)^2

Differentiating and equating to zero:

    ∂Sr/∂a0 = -2 Σ (yi - a0 - a1 xi) = 0
    ∂Sr/∂a1 = -2 Σ (yi - a0 - a1 xi) xi = 0

so that

    0 = Σ yi - Σ a0 - Σ a1 xi
    0 = Σ xi yi - Σ a0 xi - Σ a1 xi^2
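As a check on the "differentiate and set to zero" step, the short sympy sketch below builds Sr for a small made-up dataset, takes both partial derivatives, and solves the resulting pair of equations; the dataset is purely illustrative.

```python
# Sketch: verify the differentiation step symbolically with sympy.
# The dataset is made up for illustration; any (xs, ys) would do.
import sympy as sp

a0, a1 = sp.symbols('a0 a1', real=True)

xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

# Sr = sum of squared residuals for the linear model a0 + a1*x
Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(xs, ys))

# Setting both partial derivatives to zero yields the normal equations
eqs = [sp.Eq(sp.diff(Sr, a0), 0), sp.Eq(sp.diff(Sr, a1), 0)]
print(sp.solve(eqs, (a0, a1)))  # {a0: ..., a1: ...}
```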


Linear Regression

    0 = Σ yi - Σ a0 - Σ a1 xi
    0 = Σ xi yi - Σ a0 xi - Σ a1 xi^2

Realizing that Σ a0 = n·a0, these can be set as two linear equations with two unknowns:

    n a0 + (Σ xi) a1 = Σ yi
    (Σ xi) a0 + (Σ xi^2) a1 = Σ xi yi

or, in matrix form:

    [ n       Σ xi   ] [ a0 ]   [ Σ yi    ]
    [ Σ xi    Σ xi^2 ] [ a1 ] = [ Σ xi yi ]

These are the normal equations, which can be solved simultaneously for a1 and a0:

    a1 = (n Σ xi yi - Σ xi Σ yi) / (n Σ xi^2 - (Σ xi)^2)

    a0 = Σ yi / n - a1 Σ xi / n = ȳ - a1 x̄
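The closed-form solution above maps directly to code. A minimal sketch, assuming plain Python lists as input (the name fit_line is ours, not from the slides):

```python
# Sketch of the normal-equation solution above; fit_line is an
# illustrative name, not from the slides.

def fit_line(xs, ys):
    """Return (a0, a1) for the least-squares line y = a0 + a1*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a0 = sy / n - a1 * sx / n   # a0 = y-bar - a1 * x-bar
    return a0, a1
```

One design note: for large n or data far from the origin, the subtraction n Σxi^2 - (Σxi)^2 can lose precision, so numerical libraries typically work with centered sums Σ(xi - x̄)(yi - ȳ) and Σ(xi - x̄)^2 instead.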


13.2.3 Error Quantification of Linear Regression

• The square of the vertical distance between the data and the best-fit line:

    Sr = Σ_{i=1}^{n} ei^2 = Σ_{i=1}^{n} (yi - a0 - a1 xi)^2

• The square of the discrepancy between the data and the mean:

    St = Σ (yi - ȳ)^2

Standard error of estimate:

    s_{y/x} = sqrt( Sr / (n - 2) )

Standard deviation:

    s_y = sqrt( St / (n - 1) )
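These two error measures are straightforward to compute once the line is fitted. A sketch reusing the hypothetical fit_line from earlier:

```python
# Sketch of the error measures above (builds on the fit_line sketch).
from math import sqrt

def error_measures(xs, ys):
    """Return (s_y, s_yx): standard deviation and standard error of estimate."""
    n = len(xs)
    a0, a1 = fit_line(xs, ys)
    ybar = sum(ys) / n
    St = sum((y - ybar) ** 2 for y in ys)                      # spread about the mean
    Sr = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))   # spread about the line
    return sqrt(St / (n - 1)), sqrt(Sr / (n - 2))
```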


13.2.3 Error Quantification of Linear Regression

• To measure the improvement achieved by describing the data with a straight line instead of the average:

Coefficient of determination:

    r^2 = (St - Sr) / St

Correlation coefficient:

    r = sqrt( (St - Sr) / St )

or

    r = (n Σ xi yi - (Σ xi)(Σ yi)) / ( sqrt(n Σ xi^2 - (Σ xi)^2) · sqrt(n Σ yi^2 - (Σ yi)^2) )

• Perfect fit: Sr = 0 and r = r^2 = 1
• No improvement: r = 0 and Sr = St
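Both routes to r give the same value for a least-squares straight line. A sketch of the two computations, again with illustrative names and reusing the earlier fit_line:

```python
# Sketch of the goodness-of-fit measures above.
from math import sqrt

def r_squared(xs, ys):
    """r^2 = (St - Sr) / St for the least-squares line."""
    n = len(xs)
    a0, a1 = fit_line(xs, ys)
    ybar = sum(ys) / n
    St = sum((y - ybar) ** 2 for y in ys)
    Sr = sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))
    return (St - Sr) / St

def r_direct(xs, ys):
    """Correlation coefficient from the sums formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / (
        sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2))
```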


 i    xi     yi     xi^2    xi*yi    a0+a1*xi     (yi-ȳ)^2     (yi-a0-a1*xi)^2
 1    10     25      100      250     -39.5833    380534.8         4171.003
 2    20     70      400     1400     155.1191    327041           7245.261
 3    30    380      900    11400     349.8215     68578.52         910.7419
 4    40    550     1600    22000     544.5239      8441.016         29.98767
 5    50    610     2500    30500     739.2263      1016.016      16699.44
 6    60   1220     3600    73200     933.9287    334228.5        81836.79
 7    70    830     4900    58100    1128.6311     35391.02       89180.53
 8    80   1450     6400   116000    1323.3335    653066          16044.4
 Σ   360   5135    20400   312850    5135.0008   1808297         216118.2

    x̄ = 360/8 = 45        ȳ = 5135/8 = 641.875

    a1 = (8(312850) - 360(5135)) / (8(20400) - (360)^2) = 19.47024
    a0 = 641.875 - 19.47024(45) = -234.2857

    s_y = sqrt(1808297/7) = 508.26
    s_{y/x} = sqrt(216118/6) = 189.79
    r^2 = (1808297 - 216118) / 1808297 = 0.8805
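The whole worked example can be reproduced with the earlier sketches. Running the snippet below (with fit_line, error_measures, and r_squared defined as above) should recover a1 ≈ 19.47024, a0 ≈ -234.2857, s_y ≈ 508.26, s_{y/x} ≈ 189.79, and r^2 ≈ 0.8805.

```python
# Reproduce the worked example using the earlier fit_line,
# error_measures, and r_squared sketches.
xs = [10, 20, 30, 40, 50, 60, 70, 80]
ys = [25, 70, 380, 550, 610, 1220, 830, 1450]

a0, a1 = fit_line(xs, ys)
s_y, s_yx = error_measures(xs, ys)
print(f"a1 = {a1:.5f}, a0 = {a0:.4f}")          # 19.47024, -234.2857
print(f"s_y = {s_y:.2f}, s_y/x = {s_yx:.2f}")   # 508.26, 189.79
print(f"r^2 = {r_squared(xs, ys):.4f}")         # 0.8805
```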
