EXPERIMENT 1-Error and Data Analysis: (I) Significant Figures
1.1 Introduction
Our knowledge of the physical world around us is obtained from empirical observations
and careful laboratory experiments in which we measure physical quantities.
It is important to understand how to express the data collected from such measurements and
how to analyze them and draw meaningful conclusions. In this introductory
experiment we examine the types of experimental error and some methods of error and
data analysis that will be used in any experiment in which measurements are
made.
1.3 Theory
(i) Significant Figures
The significant figures of a (measured or calculated) quantity are the meaningful digits
in it. There are conventions which you should learn and follow for how to express
numbers so as to properly indicate their significant figures.
Any digit that is not zero is significant. Thus 459 has three significant figures and
1.961 has four significant figures.
Zeros between nonzero digits are significant. Thus 2006 has four significant
figures.
Zeros to the left of the first nonzero digit are not significant. Thus 0.000074 has
only two significant figures. This is more easily seen if it is written in scientific
notation, i.e., $7.4 \times 10^{-5}$.
For numbers with decimal points, zeros to the right of a nonzero digit are
significant. Thus 3.00 has three significant figures and 0.090 has two significant
figures. For this reason it is important to keep the trailing zeros to indicate the
actual number of significant figures.
For numbers without decimal points, trailing zeros may or may not be significant.
Thus, 400 indicates only one significant figure. To indicate that the trailing zeros
are significant a decimal point must be added. For example, 400. has three
significant figures, and $4 \times 10^{2}$ has one significant figure.
Exact numbers have an infinite number of significant digits. For example, if there
are two oranges on a table, then the number of oranges is 2.000... . Defined
numbers are also like this. For example, the number of centimeters per inch (2.54)
has an infinite number of significant digits, as does the speed of light (299792458
m/s).
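The counting rules above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original exercise: the function name `count_sig_figs` is hypothetical, it accepts plain decimal strings only (no scientific notation), and it follows the convention stated above that trailing zeros without a decimal point are not significant.

```python
def count_sig_figs(text: str) -> int:
    """Count significant figures in a plain decimal string such as '0.090' or '400.'."""
    s = text.strip().lstrip("+-")
    has_point = "." in s
    digits = s.replace(".", "")
    # Zeros to the left of the first nonzero digit are never significant.
    digits = digits.lstrip("0")
    if not has_point:
        # Assumed convention: without a decimal point, trailing zeros do not count.
        digits = digits.rstrip("0")
    return len(digits)


for example in ["459", "1.961", "2006", "0.000074", "3.00", "0.090", "400", "400."]:
    print(example, count_sig_figs(example))
```

The input is a string rather than a float because a float cannot remember trailing zeros: `3.00` and `3` are the same number to the computer.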
There are also specific rules for how to consistently express the uncertainty associated
with a number. In general, the last significant figure in any result should be of the same
order of magnitude (i.e. in the same decimal position) as the uncertainty. Also, the
uncertainty should be rounded to one or two significant figures. Always work out the
uncertainty after finding the number of significant figures for the actual measurement.
Example
9.82 ± 0.02,  10.0 ± 1.5,  4 ± 1
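As a rough sketch of the rule that the last significant figure of a result should sit in the same decimal position as its uncertainty, the following Python helper rounds the uncertainty first and then rounds the value to match. The name `format_with_uncertainty` and the `sig` parameter are illustrative assumptions, and the raw input numbers are made up for the demonstration.

```python
import math


def format_with_uncertainty(value: float, uncertainty: float, sig: int = 1) -> str:
    """Round the uncertainty to `sig` significant figures and the value to the same place."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    # Power of ten of the last digit kept in the uncertainty (e.g. -2 for hundredths).
    exponent = math.floor(math.log10(uncertainty)) - (sig - 1)
    decimals = max(-exponent, 0)
    rounded_unc = round(uncertainty, -exponent)
    rounded_val = round(value, -exponent)
    return f"{rounded_val:.{decimals}f} ± {rounded_unc:.{decimals}f}"


# Hypothetical raw readings, chosen only for illustration.
print(format_with_uncertainty(9.8234, 0.0213))        # one significant figure in the uncertainty
print(format_with_uncertainty(10.03, 1.52, sig=2))    # two significant figures in the uncertainty
```

One known limitation of this sketch: if rounding pushes the uncertainty up to the next power of ten (for example 0.096 becoming 0.10), the result keeps one more decimal place than strictly needed.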
In practice, when doing mathematical calculations, it is a good idea to keep one more
digit than is significant to reduce rounding errors. But in the end, the answer must be
expressed with only the proper number of significant figures.
After addition or subtraction, the result is significant only to the place determined by
the largest last significant place in the original numbers. For example, a sum in which one
term is 1.1 should be rounded to the tenths place, giving 90.4 (the tenths place is the last
significant place in 1.1).
After multiplication or division, the number of significant figures in the result is
determined by the original number with the smallest number of significant figures. For
example, a product in which one factor is 2.80 should be rounded off to 12.6 (three
significant figures like that of 2.80).
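The two rounding rules can be tried out with Python's `decimal` module, which keeps track of how many digits were written. This is only a sketch: the helper names are hypothetical, and the operands 89.332 and 4.5039 are illustrative values chosen so that the results agree with the 90.4 and 12.6 quoted in the text; they are assumptions, not values taken from it.

```python
from decimal import Decimal


def round_sum(a: str, b: str) -> Decimal:
    """Add two decimal strings and round to the coarsest last place of the two terms."""
    da, db = Decimal(a), Decimal(b)
    # The exponent is the power of ten of the last written digit (-1 means tenths).
    last_place = max(da.as_tuple().exponent, db.as_tuple().exponent)
    return (da + db).quantize(Decimal(1).scaleb(last_place))


def round_product(a: str, b: str) -> Decimal:
    """Multiply two decimal strings and keep the smaller number of significant figures."""
    da, db = Decimal(a), Decimal(b)
    # Note: this counts every written digit after leading zeros, so '400' counts as three.
    sig = min(len(da.as_tuple().digits), len(db.as_tuple().digits))
    product = da * db
    # adjusted() is the power of ten of the leading digit of the product.
    last_place = product.adjusted() - (sig - 1)
    return product.quantize(Decimal(1).scaleb(last_place))


print(round_sum("89.332", "1.1"))       # illustrative operands
print(round_product("2.80", "4.5039"))  # illustrative operands
```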
A measurement may be made of a quantity which has an accepted value which can be
looked up in a handbook (e.g. the density of copper). The difference between the
measurement and the accepted value is not what is meant by error. Such accepted values
are not "right" answers. They are just measurements made by other people which have
errors associated with them as well.
Nor does error mean "blunder". Reading a scale backwards, misunderstanding what we
are doing or walloping into a partner's measuring apparatus are blunders which can be
caught and should simply be disregarded.
Obviously, it cannot be determined exactly how far off a measurement is; if this could be
done, it would be possible to just give a more accurate, corrected value.
Error, then, has to do with uncertainty in measurements that nothing can be done about. If
a measurement is repeated, the values obtained will differ and none of the results can be
preferred over the others. Although it is not possible to do anything about such error, it
can be characterized. For instance, the repeated measurements may cluster tightly
together or they may spread widely. This pattern can be analyzed systematically.
Classification of Errors
Generally, errors can be divided into two broad and rough but useful classes: systematic
and random.
Systematic errors are errors which tend to shift all measurements in a systematic way so
their mean value is displaced. This may be due to such things as incorrect calibration of
equipment, consistently improper use of equipment or failure to properly account for
some effect. In a sense, a systematic error is rather like a blunder and large systematic
errors can and must be eliminated in a good experiment. But small systematic errors will
always be present. For instance, no instrument can ever be calibrated perfectly.
Other sources of systematic errors are external effects which can change the results of the
experiment, but for which the corrections are not well known. In science, the reason why
several independent confirmations of experimental results are often required (especially
using different techniques) is that different apparatus at different places may be
affected by different systematic effects. Aside from making mistakes (such as thinking
one is using the ×10 scale, and actually using the ×100 scale), the reason why
experiments sometimes yield results which are far outside the quoted errors is
that systematic effects were not accounted for.
Random errors are errors which fluctuate from one measurement to the next. They yield
results distributed about some mean value. They can occur for a variety of reasons.
They may occur due to lack of sensitivity. For a sufficiently small change, an
instrument may not be able to respond to it or to indicate it, or the observer may
not be able to discern it.
They may occur due to noise. There may be extraneous disturbances which
cannot be taken into account.
They may be due to imprecise definition.
They may also occur due to statistical processes such as the roll of dice.
Many times you will find results quoted with two errors. The first error quoted is usually
the random error, and the second is called the systematic error. If only one error is
quoted, then the errors from all sources are added together.
Percent Error
The purpose of some experiments is to determine the value of a well-known physical
quantity, for instance g, the acceleration due to gravity. The accepted or theoretical
value of such a quantity, found in textbooks and physics handbooks, is the most accurate
value obtained through sophisticated experimental or mathematical methods.
The absolute difference between the experimental value $X_E$ and the theoretical value $X_T$ of
a physical quantity X is given by the relation:
$$\text{Absolute difference} = |X_E - X_T|.$$
It is not always possible to find a theoretical value for the physical quantity being measured.
In such circumstances we resort to comparing the results obtained from two equally
dependable measurements. The comparison is expressed as a percent difference, which is
given by
$$\text{Percent difference} = \frac{\text{absolute difference}}{\text{average}} \times 100\% = \frac{|X_1 - X_2|}{(X_1 + X_2)/2} \times 100\%, \tag{3}$$
where $X_1$ and $X_2$ are the results from the two methods.
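A short Python sketch of both comparisons follows. The percent difference implements relation (3) directly. The percent error assumes the usual definition, the absolute difference divided by the theoretical value and multiplied by 100%; that definition and the function names are assumptions made for illustration, and the sample numbers are made up.

```python
def percent_error(experimental: float, theoretical: float) -> float:
    """Assumed standard definition: |X_E - X_T| / |X_T| * 100%."""
    if theoretical == 0:
        raise ValueError("theoretical value must be nonzero")
    return abs(experimental - theoretical) / abs(theoretical) * 100.0


def percent_difference(x1: float, x2: float) -> float:
    """Relation (3): absolute difference divided by the average, times 100%."""
    average = (x1 + x2) / 2.0
    if average == 0:
        raise ValueError("average of the two results must be nonzero")
    return abs(x1 - x2) / abs(average) * 100.0


# Hypothetical measurements of g in m/s^2.
print(percent_error(9.70, 9.81))
print(percent_difference(9.70, 9.90))
```

Percent error needs a trusted reference value; percent difference treats the two results symmetrically, which is why it divides by their average.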
Statistical Tools
As discussed above, in most cases we do not know the real value of the measured
quantity. If the measurement is repeated a number of times, we might find that the
result is a little different each time, but not in a systematic or predictable way.
These variations in the result are due to random or statistical errors. Often the sources of
random errors cannot be identified, and random errors cannot be predicted. Thus it is
necessary to quantify random errors by means of statistical analysis. Simply by repeating
an experiment or a measurement several times we get an idea of how much the
results vary from one measurement to the next. If there is little variation in the results we
have high precision, whereas large variations in the results indicate low precision. A
way to visualize accuracy and precision is by the example of a dart board.
If we are a poor "dartist" our shots may be all over the board – and the wall; each shot
hitting quite some distance from the other: both our accuracy (i.e. how close we are to the
bull’s eye) and our precision (i.e. the scatter of our shots) are low (below left).
A somewhat better “dartist” will at least consistently hit the board, but still with a wide
scatter: now the accuracy of the throw is high, but the precision remains low (above
right). Once the player gets consistent and there is not much scatter in the shots, the
results may look like this:
The pattern above left is not very accurate; however, the shots have high precision. Once the
systematic "drift to the right" is worked out, the shots will be both very accurate and very
precise (above right).
What should be used to quantify the random error of a measurement? If only a single
measurement X is made, we must decide how closely we can estimate the digit beyond
those which can be read directly, perhaps 0.2 to 0.5 of the distance between the finest
divisions on the measuring instrument.
i. Mean Value
$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{1}{N}\sum_{k=1}^{N} X_k \tag{4}$$
ii. Deviation from the mean
Deviation from the mean or simply, deviation, is a quantitative description of how close
the individual measurements are to each other and is given by
$$d_k = X_k - \bar{X}. \tag{5}$$
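Note that the deviation can be positive or negative, as some of the measured values are
larger than the mean and others smaller. In the ideal case the average of the deviations of a
set of measurements is zero, so the mean deviation is computed from their absolute values:
$$\bar{d} = \frac{1}{N}\sum_{k=1}^{N} |d_k| \tag{6}$$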
This is a measure of the dispersion of our data about the mean (i.e., a measure of
precision). The experimental value $X_v$ of a measured quantity is given in the form
$$X_v = \bar{X} \pm \bar{d} \tag{7}$$
Note that the term $\pm\bar{d}$ gives a measure of the precision of the experimental value. The
accuracy of the mean value of the experimental measurements should be expressed in terms of
the percent error or percent difference.
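The mean, the individual deviations, and the mean deviation can be computed as in the sketch below. It assumes the mean deviation is the average of the absolute deviations; the function names and the four readings are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def mean(values: list[float]) -> float:
    """Arithmetic mean of the measurements."""
    return sum(values) / len(values)


def deviations(values: list[float]) -> list[float]:
    """Signed deviation of each measurement from the mean."""
    m = mean(values)
    return [x - m for x in values]


def mean_deviation(values: list[float]) -> float:
    """Average of the absolute deviations (assumed definition of the mean deviation)."""
    return sum(abs(d) for d in deviations(values)) / len(values)


readings = [9.7, 9.8, 9.9, 10.0]  # hypothetical repeated measurements
print("mean:", mean(readings))
print("deviations:", deviations(readings))
print("sum of signed deviations:", sum(deviations(readings)))
print("mean deviation:", mean_deviation(readings))
```

The signed deviations sum to zero apart from floating-point rounding, which is why absolute values are needed to get a useful measure of scatter.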
iii. Standard Deviation
From statistical analysis we find that, for data to have a Gaussian distribution means
that the probability of obtaining the result X is
$$p(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X - X_0)^2}{2\sigma^2}} \tag{7}$$
where $X_0$ is the most probable value and $\sigma$, which is called the standard deviation, determines
the width of the distribution. Because of the law of large numbers this assumption will
tend to be valid for random errors. And so it is common practice to quote error in terms
of the standard deviation of a Gaussian distribution fit to the observed data distribution.
This is the way we should quote error in our reports.
The mean is the most probable value of a Gaussian distribution. In terms of the mean, the
standard deviation of any distribution is,
$$\sigma_X = \sqrt{\frac{\sum_{k=1}^{N} (X_k - \bar{X})^2}{N}} = \sqrt{\frac{\sum_{k=1}^{N} d_k^2}{N}} \tag{8}$$
The quantity $\sigma^2$, the square of the standard deviation, is called the variance. The best
estimate of the true standard deviation is
$$\sigma_X = \sqrt{\frac{\sum_{k=1}^{N} (X_k - \bar{X})^2}{N - 1}}. \tag{9}$$
The reason why we divide by N to get the best estimate of the mean and only by N−1 for
the best estimate of the standard deviation needs to be explained. The true mean value of
X is not being used to calculate the variance, but only the average of the measurements as
the best estimate of it. Thus, $\sum_k (X_k - \bar{X})^2$ as calculated is always a little bit smaller than
$\sum_k (X_k - X_{\text{true}})^2$, the quantity really wanted. In the theory of probability (that is, using the
assumption that the data have a Gaussian distribution), it can be shown that this
underestimate is corrected by using N−1 instead of N.
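To see the effect of dividing by N versus N−1, the sketch below computes both forms of the standard deviation for a small data set and compares them with Python's built-in `statistics.pstdev` (divide by N) and `statistics.stdev` (divide by N−1). The function names and the data are illustrative assumptions.

```python
import math
import statistics


def std_divide_by_n(values: list[float]) -> float:
    """Standard deviation using N in the denominator."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))


def std_divide_by_n_minus_1(values: list[float]) -> float:
    """Best-estimate standard deviation using N - 1 in the denominator."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
print(std_divide_by_n(data), statistics.pstdev(data))
print(std_divide_by_n_minus_1(data), statistics.stdev(data))
```

The N−1 form is always the larger of the two, and the difference shrinks as the number of measurements grows.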
The experimental value is then quoted as
$$X_v = \bar{X} \pm \sigma_X \tag{10}$$
If one made one more measurement of X then (this is also a property of a Gaussian
distribution) it would have some 68% probability of lying within $\bar{X} \pm \sigma_X$. Note that this
means that about 30% of all experiments will disagree with the accepted value by more
than one standard deviation!
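The 68% figure can be checked numerically. For a Gaussian distribution, the probability of landing within k standard deviations of the mean is erf(k/√2), which Python's `math.erf` evaluates directly. The function name `prob_within` is an illustrative choice.

```python
import math


def prob_within(k: float) -> float:
    """Probability that a Gaussian variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    inside = prob_within(k)
    print(f"within {k} sigma: {inside:.4f}, outside: {1.0 - inside:.4f}")
```

For k = 1 this gives about 0.68 inside and about 0.32 outside, matching the statement that roughly a third of measurements fall more than one standard deviation from the mean.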
However, we are also interested in the error of the mean, which is smaller than $\sigma_X$ if
there were several measurements. An exact calculation yields
$$\sigma_{\bar{X}} = \frac{\sigma_X}{\sqrt{N}} = \sqrt{\frac{\sum_{k=1}^{N} (X_k - \bar{X})^2}{N(N - 1)}} \tag{11}$$
for the standard error of the mean. This means that, for example, if there were 20
measurements, the error on the mean itself would be $\sqrt{20} \approx 4.47$ times smaller than the error
of each measurement. The number to report for this series of N measurements of X is
$\bar{X} \pm \sigma_{\bar{X}}$, where $\sigma_{\bar{X}} = \sigma_X / \sqrt{N}$. The meaning of this is that if the N measurements of X were repeated
there would be a 68% probability that the new mean value would lie within $\bar{X} \pm \sigma_{\bar{X}}$ (that is,
between $\bar{X} - \sigma_{\bar{X}}$ and $\bar{X} + \sigma_{\bar{X}}$). Note that this also means that there is a 32% probability
that it will fall outside of this range. This means that out of 100 experiments of this type,
on the average, 32 experiments will obtain a value which is outside the standard errors.
For a Gaussian distribution there is a 5% probability that the true value is outside of the
range $\bar{X} \pm 2\sigma_{\bar{X}}$, i.e. twice the standard error, and only a 0.3% chance that it is outside the
range of $\bar{X} \pm 3\sigma_{\bar{X}}$.
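A minimal sketch of the standard error of the mean follows, using `statistics.stdev` (which divides by N−1) and then dividing by √N. The function name and the sample data are assumptions for illustration.

```python
import math
import statistics


def standard_error(values: list[float]) -> float:
    """Standard error of the mean: sample standard deviation divided by sqrt(N)."""
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    return statistics.stdev(values) / math.sqrt(len(values))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
print("mean:", statistics.mean(data))
print("standard deviation:", statistics.stdev(data))
print("standard error of the mean:", standard_error(data))
print("reduction factor for 20 measurements:", math.sqrt(20))
```

Because the error of the mean falls only as 1/√N, quadrupling the number of measurements is needed to halve it.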
Frequently, the result of an experiment will not be measured directly. Rather, it will be
calculated from several measured physical quantities (each of which has a mean value
and an error). What is the resulting error in the final result of such an experiment?
For instance, what is the error in Z = A + B where A and B are two measured quantities
with errors $\Delta A$ and $\Delta B$ respectively?
A first thought might be that the error in Z would be just the sum of the errors in A and
B. After all,
$$(A + \Delta A) + (B + \Delta B) = (A + B) + (\Delta A + \Delta B) \tag{12}$$
and
$$(A - \Delta A) + (B - \Delta B) = (A + B) - (\Delta A + \Delta B). \tag{13}$$
But this assumes that, when combined, the errors in A and B have the same sign and
maximum magnitude; that is, that they always combine in the worst possible way. This
could only happen if the errors in the two variables were perfectly correlated (i.e., if the
two variables were not really independent).
If the variables are independent then sometimes the error in one variable will happen to
cancel out some of the error in the other. Therefore, on the average, the error in Z will be
less than the sum of the errors in its parts. A reasonable way to try to take this into
account is to treat the perturbations in Z produced by perturbations in its parts as if they
were "perpendicular" and added according to the Pythagorean theorem,
$$\Delta Z = \sqrt{(\Delta A)^2 + (\Delta B)^2}. \tag{14}$$
The derivation of the general rule for error propagation is beyond the scope of a general
physics course. However, we give results for some common relationships (functional
dependences) between measured quantities. Suppose there are two measurements, A and
B, and the final result is Z = F(A, B) for some function F.
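As a hedged illustration of how independent errors combine, the sketch below adds the uncertainties of a sum or difference in quadrature, as in the Pythagorean rule described above. The product rule shown (relative errors added in quadrature) is the standard small-error result for independent quantities; it is included as an assumption and is not taken from this text. Function and argument names are hypothetical.

```python
import math


def error_of_sum(err_a: float, err_b: float) -> float:
    """Uncertainty of Z = A + B or Z = A - B for independent A and B."""
    return math.hypot(err_a, err_b)


def error_of_product(a: float, err_a: float, b: float, err_b: float) -> float:
    """Uncertainty of Z = A * B (assumed standard rule: relative errors in quadrature)."""
    if a == 0 or b == 0:
        raise ValueError("relative errors are undefined for zero values")
    return abs(a * b) * math.hypot(err_a / a, err_b / b)


# Hypothetical measurements: A = 2.0 ± 0.1 and B = 5.0 ± 0.2.
print("error of A + B:", error_of_sum(0.1, 0.2))
print("worst-case sum of errors:", 0.1 + 0.2)
print("error of A * B:", error_of_product(2.0, 0.1, 5.0, 0.2))
```

The quadrature result is always smaller than the simple sum of the errors, reflecting the partial cancellation expected when the two errors are independent.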
Experimental data are often presented in graphical form, both for reporting and so that
some information can be obtained easily from properties of the graph.
Graphing procedures
In most cases quantities are plotted using a Cartesian coordinate system in which the
horizontal axis, referred to as the abscissa, is labeled as the x-axis and the vertical axis, often
called the ordinate, is labeled as the y-axis. The location of a point on the graph is specified by its
coordinates x and y, given as (x, y) with respect to the origin, say O, the intersection of
the axes.
(b) If data points are plotted by hand, draw a smooth line connecting the points. If
plotting software is used, choose the option that plots with lines.
(c) Include the uncertainties in the experiment as error bars (mean deviation or
standard deviation) on your graph.
Why do we plot data? The main reason for plotting data in a laboratory report is to
explore the relationship between various measured quantities. Some quantities of interest
in this case are
(a) The slope of the graph, if the relationship is linear.
(b) The degree of the polynomial, or the coefficient of the highest-degree term,
for a nonlinear relationship among measured quantities.
Note that these quantities can be determined directly from the graph with the use of
a proper least-squares fit (you may ask your instructor how to do a proper fit to your data).
Data Table 1
t (s)     t^2 (s^2)     y (m)
0.00      0.00          0.00
0.31      0.10          0.50
0.44      0.19          1.00
0.53      0.28          1.50
0.63      0.40          2.00
0.71      0.50          2.50
0.76      0.58          3.00
0.83      0.69          3.50
0.89      0.79          4.00
0.94      0.88          4.50
1.00      1.00          5.00
FIGURE 1: y (m) versus t (s)
1.4 Procedure
Complete the exercise in the data and data analysis section. Show your calculations in
detail and plot graphs as required.
3. If the theoretical value of π is 3.142, what is the fractional error and percent error
of the experimental value found in (2)?
5. The equation of motion for an object in free fall starting from rest is $y = \frac{1}{2} g t^2$,
where g is the acceleration due to gravity. The graph of this equation, and of
the data plotted in Fig. 1, is a parabola, a polynomial of degree 2, which has
the general form $y = at^2 + bt + c$, where a, b, c are constants with values a = g/2, b = 0,
c = 0 respectively for the present case.
For the data given in Table 1, plot y vs $t^2$. This is a plot of y vs x, where $x = t^2$,
which has the general linear form y = ax + b with a = g/2 and b = 0.
(a) Determine the experimental value of g from the slope of your graph.
(b) Compute the percent error of the experimental value of g determined from the
graph in (a).
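One way to cross-check a slope read off a hand-drawn graph is an ordinary least-squares straight-line fit. The sketch below applies the textbook formulas to the y and t² columns of Data Table 1 and doubles the slope to estimate g. The function name `linear_fit` is an illustrative choice, and the fit is unweighted, which assumes every point has the same uncertainty.

```python
def linear_fit(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Unweighted least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Columns t^2 (s^2) and y (m) from Data Table 1.
t_squared = [0.00, 0.10, 0.19, 0.28, 0.40, 0.50, 0.58, 0.69, 0.79, 0.88, 1.00]
y = [0.00, 0.50, 1.00, 1.50, 2.00, 2.50, 3.00, 3.50, 4.00, 4.50, 5.00]

slope, intercept = linear_fit(t_squared, y)
g_estimate = 2.0 * slope  # because the slope equals g/2
print("slope (m/s^2):", slope)
print("intercept (m):", intercept)
print("estimated g (m/s^2):", g_estimate)
```

An intercept close to zero is a useful sanity check, since the model y = ½gt² predicts b = 0.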
2. Explain the difference between measurement precision and accuracy. How are these
related to the types of error you have discussed in question (1)?
5. Explain how one can express experimental result using the quantities discussed in (4).
6. (Optional) Explain how one can account for uncertainty in the calculation of area from
measurements of the length and width of a rectangular block. [Let $\Delta l$ and $\Delta w$ be the
uncertainties in the length and width measurements respectively. Express $\Delta A$ in terms of
the two uncertainties.]