Notes On Forest Mensuration
Notes on
FOREST MENSURATION
I. Statics
Oscar García
September 1995
Copyright © 2004 Oscar García
Contents
1 Introduction
3 Trees
3.1 Diameters
3.2 Heights
3.3 Cubature of trees
3.4 Stem analysis
3.5 Volume functions (tables)
3.6 Form factors and quotients, etc.
3.7 Taper functions (curves)
3.8 Bark functions
4 Stands
4.1 Diameter, basal area
4.2 Heights
4.2.1 Height-diameter curves
4.2.2 Dominant height
4.3 Cubature of stands/plots
4.4 Volume functions
4.5 Stand tables, distributions
A Errors
A.1 Error bounds
A.2 Significant figures
B Regression
B.1 Matrices
B.2 The least-squares method
B.3 Statistical considerations
Chapter 1
Introduction
problems and new situations. It is expected that many practical details on
specific methods and measurements will be learned through reading the lit-
erature, exercises, and subsequent courses. The laboratories and subjects
treated in them will be an important and integral part of the course.
Topics less important or somewhat peripheral to the central subject will
be indicated with smaller font and with a ♥. Items marked with ♥♥ are
included mostly for curiosity value.
General references:
INSTITUTO DE MANEJO FORESTAL, Cátedra de Manejo Forestal. Previous
years' notes. (Much relevant and complementary material, in Spanish).
HUSCH, B., MILLER, C.I., and BEERS, T.W. Forest Mensuration. 3rd Edition.
Wiley. 1982. (Good elementary text, emphasizing measurement principles).
AVERY, T.E. and BURKHART, H.E. Forest Measurements. Fourth edi-
tion. McGraw-Hill. 1994. (Another elementary text, brief explanations and
emphasis on North American practices).
AVERY, T.E. Natural Resources Measurements. Second edition. McGraw-
Hill. 1975. (Earlier edition, available in the Library).
LOETSCH, F., ZOEHRER, F. and HALLER, K. Forest Inventory, [Link].
BLV Verlagsgesellschaft. 1973. (Fairly comprehensive, good reference text).
PARDÉ, J. Dendrométrie. Editions de l’Ecole Nationale des Eaux et Forêts
– Nancy. 1961. (Mensuration text in French. There is an updated edition
by Pardé and Bouchon, but not in the Library).
PRODAN, M. Holzmeßlehre. J.D. Sauerländer’s Verlag, Frankfurt. 1965.
(Classic, in German).
CAB VAN LAAR, A. and AKÇA, ? ?? (To appear; judging by the authors, it
should be excellent).
SPURR, S.H. Forest Inventory. Ronald. 1952. (A classic, still interesting
in its treatment of volume tables, growth fundamentals, etc.).
CAILLIEZ, F. Estimación del Volumen Forestal y Predicción del
Rendimiento. Vol.1 - Estimación del Volumen. Estudio FAO: Montes 22/1.
1980. (Volumes, Spanish translation).
ALDER, D. Estimación del Volumen Forestal y Predicción del Rendimiento.
Vol.2 - Predicción del Rendimiento. Estudio FAO: Montes 22/2. 1980. (One
of the best on growth models. Spanish translation).
BRUCE, D. and SCHUMACHER, F.X. Medición Forestal. Editorial Her-
rero. 1965. (Translation of a fairly old text, although still useful in its
instrument and measurements part).
CLUTTER, J.L., FORTSON, J.C., PIENAAR, L.V., BRISTER, G.H. and
BAILEY, R.L. Timber Management: A Quantitative Approach. Wiley.
1983. (Forest Management text, also good coverage of the most common
methods in growth modelling).
ASSMANN, E. The Principles of Forest Yield Study. Pergamon Press. 1970.
(Good reference, lots on European growth studies. Suffers from a lack of
structure and analysis).
CARRON, L.T. An Outline of Forest Mensuration with Special Reference to
Australia. Australian National University Press. 1968. (Brief and without
much detail, but interesting for mensuration in plantations).
PETERS N., R., JOBET J., J. and AGUIRRE A., S. Compendio de Tablas
Auxiliares Para el Manejo de Plantaciones de Pino Insigne. Instituto Forestal,
Manual No.14. 1985. (Practices, equations, etc., used in Chile).
HAMILTON, G.J. Forest Mensuration Handbook. Forestry Commission
Booklet No.39. Her Majesty’s Stationery Office, London. 1975. (Interesting
manual for its detailed description of standards and procedures. Discusses
appropriate degrees of refinement in measurements according to costs and
benefits).
Chapter 2
2.1 Length
Lengths are usually measured with tapes or graduated rods, and they do
not present major difficulties. It must be remembered nevertheless that the
exact definition of a length is subject to conventions that can vary from
one situation to another. For example, it is common to round down log
lengths to the nearest foot or decimeter. Depending on the application or
established conventions, the length of a curved log could be measured in a
straight line, or following the curvature.
♥ Although measuring lengths may seem simple, some aspects are not completely
obvious. How long is the coast of Chile? How should it be measured? It could be
measured on a map by carefully translating a ruler, walking a compass over it, with a
curvometer, or tracing it with a thread. In any case, the measurement on a
small-scale student map will surely give a smaller value than a measurement on charts
at scale 1:50000. One can imagine that if the process is repeated on aerial
photographs at ever larger scales, ever greater lengths will be obtained. Which is
the “true” length? Is there such a thing?
Questions, exercises
2. Graph the logarithm of the lengths obtained over the logarithm of the
resolution (opening of the compass). See any trend?
3. Repeat the same with a semicircle. Similarities and differences?
4. The limiting slope of the relationship between the logarithm of the
length and the logarithm of the resolution, as the resolution becomes
finer, defines the fractal dimension of a curve: the dimension is one
minus that slope. If this dimension is not a whole number, the curve
is a fractal. A fashionable topic is the use of fractals (curves and
generalizations to surfaces and volumes) as models for natural objects
and natural phenomena. Numerous articles on this have been appearing
in forest research journals.
Comment on the following statements:
(a) How do you think that that value was obtained? Suggest alter-
native methods.
(b) Do you find any usefulness in this information?
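As an illustration of the log-log relationship discussed in these exercises, the following minimal sketch uses the Koch curve, a textbook fractal that is not part of these notes and is chosen here only because its divider lengths are known exactly. It assumes the usual divider-method convention that the dimension is one minus the slope of log length over log resolution; the function names are hypothetical.

```python
import math

def koch_lengths(levels):
    """(resolution, measured length) pairs for a Koch curve of unit base.

    With dividers opened to 3**-k the walk takes 4**k steps, so the
    measured length is (4/3)**k: it grows as the resolution gets finer.
    """
    return [(3.0 ** -k, (4.0 / 3.0) ** k) for k in range(1, levels + 1)]

def loglog_slope(pairs):
    """Least-squares slope of log(length) on log(resolution)."""
    xs = [math.log(r) for r, _ in pairs]
    ys = [math.log(length) for _, length in pairs]
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

slope = loglog_slope(koch_lengths(8))
dimension = 1.0 - slope  # assumed convention: dimension = 1 - slope
print(f"slope = {slope:.4f}, dimension = {dimension:.4f}")
```

For a smooth curve such as a semicircle the measured length converges, the slope tends to zero, and the dimension under this convention is 1.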
In Physics, philosophical problems about the nature or existence of “real” values
have been obviated by taking the operational point of view. In this view, a quantity
is defined through the procedure (operations) used to measure it.
2.2 Diameters
The instruments commonly used are the tape (measuring the circumference)
and the calipers. Diameters at the ends of logs can also be measured directly.
We do not discuss that case further here, but rather measurements “from
outside”, at points far from the log ends or on standing trees.
Calipers
What is of interest generally is not the diameter, but the area of the
cross-section with the purpose of estimating volumes. In the first place, it
is clear that both the tape and the caliper ignore possible concavities in the
section, dealing with the convex closure (convex hull):
The difference (positive) between the area of the convex closure and the
area of the stem cross-section is called the convex deficit.
The tape actually measures the perimeter P of the convex closure. What
is called “diameter” is the diameter that a circle with that perimeter would
have, that is to say, $D = P/\pi$. In fact, the diameter tape used in mensuration
is graduated in units of π, so that it gives that diameter directly. But the
area of the convex closure will always be less than or equal to the area
$\pi D^2/4$ of that circle, since the circle is the figure with the largest area for a
given perimeter. The difference is called isoperimetric deficit. We have
identified then two sources of overestimation when calculating the area of
the section by $S = \pi D^2/4 = P^2/(4\pi)$.
In general, the diameter measured with calipers varies with the direction
from which it is measured. By a theorem of Cauchy (1841), it is possible to
prove that the expected value of a caliper diameter measured in a random
direction (or what is the same, the diameter averaged over all the possible
directions) is equal to the value obtained with tape (we are ignoring possible
errors of measurement). That is to say, $E[D_c] = D_t$.
♥ More precisely, what Cauchy said is that for any convex figure with perimeter
P, the expected value of its projection on a random direction is $P/\pi$. Clearly, the
measurement $D_c$ given by calipers corresponds to a projection, and the “diameter”
given by the tape is $D_t = P/\pi$.
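Cauchy's result can be checked numerically. The sketch below does so for an elliptical cross-section, and also compares the areas implied by tape and caliper readings with the true area. The semi-axes, the number of directions and the function names are illustrative assumptions, not values from the notes.

```python
import math

def caliper_width(a, b, theta):
    """Width of an ellipse (semi-axes a, b) between parallel caliper jaws
    whose normal makes an angle theta with the major axis."""
    return 2.0 * math.sqrt((a * math.cos(theta)) ** 2 + (b * math.sin(theta)) ** 2)

def perimeter(a, b, n=20000):
    """Perimeter approximated by an inscribed polygon with n vertices."""
    pts = [(a * math.cos(2 * math.pi * i / n), b * math.sin(2 * math.pi * i / n))
           for i in range(n + 1)]
    return sum(math.hypot(q[0] - p[0], q[1] - p[1]) for p, q in zip(pts, pts[1:]))

a, b = 15.0, 12.0   # semi-axes in cm (illustrative values)
m = 3600            # number of equally spaced caliper directions
thetas = [math.pi * (i + 0.5) / m for i in range(m)]
widths = [caliper_width(a, b, t) for t in thetas]

tape_diameter = perimeter(a, b) / math.pi   # D_t = P / pi
mean_caliper = sum(widths) / m              # estimate of E[D_c]

true_area = math.pi * a * b
tape_area = math.pi * tape_diameter ** 2 / 4
mean_caliper_area = math.pi * sum(w * w for w in widths) / m / 4

print(f"D_t = {tape_diameter:.4f}  mean D_c = {mean_caliper:.4f}")
print(f"areas: true {true_area:.1f}, tape {tape_area:.1f}, "
      f"caliper (mean) {mean_caliper_area:.1f}")
```

The mean caliper diameter agrees with the tape diameter, while the mean of the caliper-based areas is larger than the tape-based area, which in turn is larger than the true area of the ellipse.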
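Nevertheless, the expected value of the area of the cross-section obtained
by a caliper measurement does not agree with that obtained with the tape.
The caliper diameters have a certain variance, so that
$E[D_c^2] = D_t^2 + \mathrm{Var}[D_c] \ge D_t^2$, and on average the area
$\pi D_c^2/4$ from calipers exceeds the area $\pi D_t^2/4$ from the tape.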
♥ Could we determine the form of the cross-section from several caliper
measurements, and thus devise some type of correction for the isoperimetric deficit? The
following figure has the same projection in all directions, like a circle:
♥♥ The previous figure is called the Reuleaux triangle. Matérn gives other
examples of figures with this property, called orbiforms. Another interesting article
is: Gardner, M. “Curves of constant width, one of which makes it possible to drill
square holes”. Scientific American, February 1963.
Questions, exercises
1. You need to cut a 1.3 m rod. You only have a diameter tape graduated
in centimeters and millimeters (of diameter). What reading of the tape
corresponds to the length of the rod?
3. Note that the Reuleaux triangle shown above is formed by three circular
arcs, centered at the vertices. Calculate, as percentages of
the real area, the areas obtained from measurements with tape and
calipers.
5. The following are some properties of the ellipse with major diameter
2a and minor diameter 2b:
equation: $x^2/a^2 + y^2/b^2 = 1$
area: $\pi a b$
eccentricity: $e = \sqrt{1 - b^2/a^2}$
perimeter: $\pi\,[\tfrac{3}{2}(a + b) - \sqrt{ab}\,]$ (approx.)
(this is an approximation for the perimeter valid for moderate
eccentricities, sufficient for this example; the exact value is
$a \int_0^{2\pi} \sqrt{1 - e^2 \sin^2\theta}\; d\theta$, an elliptic
integral of the second kind whose values can be found in tables).
Take some reasonable eccentricity values for trees with elliptical
cross-section (for example thinking about a major diameter of 30 cm and
several reasonable minor diameters). For these eccentricities:
2.3 Cubature of logs
When calculating volumes of logs and trees we normally assume that the
sections are circular, or at least that diameters are such that the area of the
section is $\pi D^2/4$. This is a simple example of the use of models, idealizations
that are taken as true in later calculations and decisions.
viously isolated bits of knowledge, and to identify gaps where more work is needed.
The benefits come from the development of the model, and not so much from its
later use, if any. I shall focus on models for prediction, intended for management
planning. Typical applications are in forecasting for forest planning purposes, in
the comparison and evaluation of silvicultural (pruning, thinning) regimes, and in
the updating of stand description databases.
For the most part the models are presented as deterministic. In general,
decision-makers use model results as representing a most likely course of events,
and introducing “randomness” in the predictions has not been found very helpful
in practice. A stochastic (random, probabilistic) component is however necessary
for developing rational parameter estimation procedures, and appropriate stochas-
tic structures are discussed later in that context.”
(García, O. “The state-space approach in growth modelling”. Canadian Journal of
Forest Research 24, 1894–1903, 1994).
It is customary in forest mensuration to take the shape of logs and trees
as similar to certain solids of revolution, the cylinder, paraboloid, cone, or
neiloid. The general formula for the variation of the diameter D with the
length x is $D^2 = a x^n$ with n = 0, 1, 2, 3. In terms of the area of the section,
$S = b x^n$.
More generally, different parts of the tree resemble portions of these
solids. The crown part, in conifers, tends to the cone form. The stem central
part approaches a paraboloid. The base of the tree expands in a form similar
to the neiloid, although generally values of n greater than 3 come closer.
The volume of a stem section, for example a log, clearly corresponds to
the area under the curve of S over length. For a log in the central,
paraboloidal, part of the tree, the sectional area varies linearly with length,
and the volume is the area of the following trapezoid.
The volume (area of the trapezoid) can in this case be calculated given the
end sections and the length:
$$V = \frac{S_0 + S_L}{2}\,L = \frac{\pi}{8}\,(D_0^2 + D_L^2)\,L .$$
In Mensuration this is known as Smalian’s formula. The volume/area can
also be calculated based on the midpoint diameter:
$$V = S_m L = \frac{\pi}{4}\,D_m^2\,L ,$$
Huber’s formula. We assume here that we are using consistent units (e.g.,
D and L in metres, V in cubic metres).
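The two formulas can be sketched in a few lines of code. The example below is a minimal illustration with hypothetical function names; it uses a log 3.2 m long with end diameters of 34 and 30 cm and assumes a perfectly paraboloidal shape, so that the mid-length diameter follows from the linearity of $D^2$ over length.

```python
import math

def section_area(d):
    """Area of a circular cross-section of diameter d."""
    return math.pi * d ** 2 / 4.0

def smalian(d0, dl, length):
    """Smalian: mean of the two end areas times the length."""
    return (section_area(d0) + section_area(dl)) / 2.0 * length

def huber(dm, length):
    """Huber: mid-length area times the length."""
    return section_area(dm) * length

# Paraboloidal log: D^2 (and hence S) is linear in length.
d0, dl, length = 0.34, 0.30, 3.2             # metres
dm = math.sqrt((d0 ** 2 + dl ** 2) / 2.0)    # mid diameter under that assumption

v_smalian = smalian(d0, dl, length)
v_huber = huber(dm, length)
print(f"Smalian {v_smalian:.4f} m3, Huber {v_huber:.4f} m3")
```

Under the paraboloid assumption both formulas give the same volume; they differ as soon as the mid-length diameter is measured on a log of another shape.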
Questions, exercises
It can be seen that Smalian gives the area of the upper trapezoid,
overestimating the real volume. Huber gives the area of the lower trapezoid,
producing an underestimate. Comparing the areas between the dotted lines on each
side of the curve, one sees that Huber gets closer to the real value.
Huber’s formula is generally more exact, and requires measuring one
diameter instead of two. In many instances, however, the centre of the log is
not easily accessible, as when logs are stacked up. In addition, if the volume
without bark is needed, it is easier to measure the diameters under bark
at the ends of the log. Because of this, Smalian’s formula, although
producing larger errors, tends to be used most frequently.
If we had three diameters, at the ends and centre, a weighted average of
Huber and Smalian would reduce the errors. It can be shown that the following
formula, which can be seen as such a weighted average (one third Smalian plus
two thirds Huber), gives exact results for polynomials of up to third degree.
That is to say, it is exact for all the solids of revolution considered here.
$$V = \frac{S_0 + 4S_m + S_L}{6}\,L .$$
In Mensuration this is traditionally known as Newton’s formula, although
in mathematics it is the basis of Simpson’s rule. It is little used
in practice, except in the cubature of complete trees, as we will see later.
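The exactness claim can be verified numerically for the four solids $S = b x^n$, n = 0, 1, 2, 3. The sketch below is only a check under assumed values: the coefficient b and the positions of the log ends are hypothetical, and the function names are not from the notes.

```python
def newton(s0, sm, sl, length):
    """Newton's formula from the end areas and the mid-length area."""
    return (s0 + 4.0 * sm + sl) / 6.0 * length

def area_at(x, b, n):
    """Sectional area S = b x^n at distance x from the tip."""
    return b * x ** n

def exact_volume(x0, x1, b, n):
    """Integral of b x^n between x0 and x1."""
    return b * (x1 ** (n + 1) - x0 ** (n + 1)) / (n + 1)

b, x0, x1 = 0.002, 4.0, 8.0   # hypothetical coefficient and log ends (m from tip)
length = x1 - x0
results = {}
for n in (0, 1, 2, 3):        # cylinder, paraboloid, cone, neiloid
    s0 = area_at(x0, b, n)
    sm = area_at((x0 + x1) / 2.0, b, n)
    sl = area_at(x1, b, n)
    results[n] = {
        "exact": exact_volume(x0, x1, b, n),
        "newton": newton(s0, sm, sl, length),
        "smalian": (s0 + sl) / 2.0 * length,
        "huber": sm * length,
    }
    print(n, {k: round(v, 5) for k, v in results[n].items()})
```

Newton reproduces the exact volume in all four cases, while for the cone and the neiloid Smalian overestimates and Huber underestimates.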
Questions, exercises
(a) Use the small and large diameters to calculate the central diam-
eter:
i. Assuming that the log is a truncated cone.
ii. Assuming that the log is a truncated paraboloid.
(b) Calculate the volume in cubic metres by the formulas of
i. Smalian.
ii. Huber.
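3. By geometry and integration it is found that the area under $S = b x^n$
between two points with $S = S_0$ and $S = S_L$, separated by a length
L, is
$$V = \frac{L}{n+1}\,\frac{S_L^{1+1/n} - S_0^{1+1/n}}{S_L^{1/n} - S_0^{1/n}} ,$$
or, dividing numerator and denominator by powers of $S_0$,
$$V = \frac{L S_0}{n+1}\,\frac{r^{1+1/n} - 1}{r^{1/n} - 1} ,$$
where $r = S_L/S_0$.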
4. Take a reasonable range of values for r (think of approximately
paraboloid trees with various diameters and heights). Calculate, for
n = 1, 2, 3, 8, the percentage of error in the formulas of Smalian and
Huber. (Hint: notice that S 1/n as a function of length is a straight
line).
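One way of organizing the calculation in the last two exercises is sketched below. It is a minimal sketch, not a worked solution: the function names are hypothetical, the area ratio r = 0.8 is an arbitrary illustrative value, and the mid-length area relies on the hint that $S^{1/n}$ is linear in length.

```python
def exact_volume(s0, sl, length, n):
    """Area under S = b x^n between sections s0 and sl a distance length apart."""
    if s0 == sl:
        return s0 * length
    p = 1.0 / n
    return length / (n + 1) * (sl ** (1 + p) - s0 ** (1 + p)) / (sl ** p - s0 ** p)

def mid_area(s0, sl, n):
    """Mid-length area, using the fact that S**(1/n) is linear in length."""
    p = 1.0 / n
    return ((s0 ** p + sl ** p) / 2.0) ** n

def percent_errors(r, n):
    """Percentage errors of Smalian and Huber for an area ratio r = S_L / S_0."""
    s0, sl, length = 1.0, r, 1.0   # relative errors depend only on r and n
    exact = exact_volume(s0, sl, length, n)
    smalian = (s0 + sl) / 2.0 * length
    huber = mid_area(s0, sl, n) * length
    return 100 * (smalian - exact) / exact, 100 * (huber - exact) / exact

for n in (1, 2, 3, 8):
    e_smalian, e_huber = percent_errors(0.8, n)
    print(f"n={n}: Smalian {e_smalian:+.3f}%  Huber {e_huber:+.3f}%")
```

Since the relative errors do not depend on the absolute size of the sections, the function works with $S_0 = 1$ and unit length.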
♥ Numerical integration. Calculating the area under a curve $f(x)$ given some
points on it is an important problem in Numerical Analysis: numerical integration
or quadrature. The usual approach is to obtain formulas of the form
$L \sum_i w_i f(x_i)$ (the length multiplied by a weighted average of the heights)
that are exact for polynomials of the highest degree possible, subject to various
constraints on the number of points and on the values of the $x_i$ and/or the $w_i$.
In the case where the $x_i$ are given uniformly spaced one has the Newton-Cotes
formulas, closed if the $x_i$ include the ends of the integration interval, open
otherwise. The formulas of Smalian and Newton used in mensuration correspond to
the closed Newton-Cotes formulas for 2 and 3 points, respectively. They are known
as the trapezoidal and Simpson formulas, and are exact for polynomials of 1st and
3rd degree (the error is a function of the second and fourth derivatives,
respectively). The one of Huber could be seen as the open formula with one point,
exact for polynomials of 1st degree.
If the $x_i$ can be chosen freely, the Gauss formulas are obtained. For example,
with three points,
$L\,[\,5f(L(1 - \sqrt{3/5})/2) + 8f(L/2) + 5f(L(1 + \sqrt{3/5})/2)\,]/18$ is exact
for polynomials of up to the fifth degree. In the formulas of Chebyshev all the
$w_i$ are required to be equal, which reduces the effect of errors in $f(x)$, and
facilitates the calculations since it is sufficient to take the mean of the
$f(x_i)$. For two points Gauss and Chebyshev coincide, and for one we would get
Huber again.
Note that, depending on the irregularity and errors in $f(x)$, it may be preferable
to integrate subintervals separately rather than to use formulas of high degree. With
uncertainty in $f(x)$ it may also be better to use a polynomial approximating its
values instead of one of higher degree that interpolates them exactly. For more
details see Numerical Analysis texts.
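The degrees of exactness mentioned above are easy to check. The following sketch applies Simpson's (Newton's) formula and the standard three-point Gauss-Legendre formula to monomials on the interval [0, L]; the interval length and the function names are illustrative assumptions.

```python
import math

def simpson(f, length):
    """Closed three-point Newton-Cotes (Simpson, Newton's formula) on [0, length]."""
    return length * (f(0.0) + 4.0 * f(length / 2.0) + f(length)) / 6.0

def gauss3(f, length):
    """Three-point Gauss-Legendre formula on [0, length]."""
    h = math.sqrt(3.0 / 5.0)
    x1 = length * (1.0 - h) / 2.0
    x3 = length * (1.0 + h) / 2.0
    return length * (5.0 * f(x1) + 8.0 * f(length / 2.0) + 5.0 * f(x3)) / 18.0

def exact_monomial(k, length):
    """Integral of x**k over [0, length]."""
    return length ** (k + 1) / (k + 1)

length = 2.0
for k in range(7):
    f = lambda x, k=k: x ** k
    print(k,
          f"Simpson error {simpson(f, length) - exact_monomial(k, length):+.2e}",
          f"Gauss error {gauss3(f, length) - exact_monomial(k, length):+.2e}")
```

Simpson's errors vanish (up to rounding) through degree 3 and Gauss's through degree 5, as stated in the text.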
2.4 Stacked wood
Products such as firewood and pulp logs are frequently traded according
to their volume in piles or stacks. A stere is the volume of a
stack of 1 × 1 × 1 metres (a stacked cubic metre), and it is used for firewood.
For pulp the unit most used in Chile is the metro ruma, a section of 1 × 1
metres in a stack of logs 2.44 m long. It may be with bark or without bark.
A unit common in North America is the cord, which corresponds to a stack
of 4 × 8 feet with logs 4 feet long.
Due to edge effects the wood content can vary slightly with the stack
dimensions, and much with the stacking method, so buyers and sellers usually
establish specific norms on dimensions and stacking methods.
Other important factors in the solid content are the irregularity of the
logs, the variability of the diameters, and the bark thickness. Movement
during transport can also introduce important changes.
It is of interest to have conversion factors, for example solid cubic metres
per metro ruma. These may also be expressed as a proportion of solid
volume in the stacked volume. Peters et al. give some conversion factors for
radiata pine in INFOR’s Manual No. 14. Bruce and Schumacher mention a
formula by the second author for the proportion of solid volume over bark:
0.84 − 0.04N, where N is the mean number of logs per square foot on the
face of the stack.
It is possible to obtain more exact conversion factors for specific situa-
tions through sampling. Stacks can be measured, and then taken apart to
determine the volume of the logs using Smalian’s formula, for instance. An
often-mentioned alternative consists of taking photographs and measuring
in them the proportion of wood with a points grid or some other method.
Similar grids have been suggested for direct use in the field. More practical
for field use would be the method proposed by Avery: marking a rod or
tape at regular intervals, and determining the proportion of marks that fall
in wood when placing it on the face of the stack. In any case, sufficient
sampling intensity is needed to obtain the desired precision.
♥ Point grids and systematic sampling. It is easy to obtain unbiased
estimators of known precision using random sampling. Systematic sampling, of which
the point grid is a special case, is generally much more efficient. Its use has been
resisted, however, due to the difficulty in estimating its precision, and to the
possibility of extreme errors in unfavourable cases. These extreme errors, associated
with coincidences of periodicity between the population and the sample, are probably
rare in practice, and can usually be avoided by taking suitable precautions. The
theory behind its precision is complex, and exact results depend on non-observable
population characteristics. However, it has been possible to obtain satisfactory
approximations. Thus, in some disciplines there has been a resurgence of interest in
systematic sampling procedures, especially in Stereology. This deals with the
determination of quantitative characteristics mainly in animal organs and tissues, and
there is a highly developed mathematical theory. See for instance the articles by
Matérn, Kellerer, Mattfeldt and by Cruz-Orive in volume 153, number 3 of the
Journal of Microscopy, 1989.
The variance in a measurement of area with a square point grid is found to be
approximately $0.0728\,P a^3$, where P is the perimeter of the area to be measured and
a is the spacing of the grid. This formula was obtained by Matheron in 1965, and
presented in the forestry literature by J. Bouchon ([Link]. 32, 131-134, 1975)
and R.B. Chevrou (Resource Inventory Notes 20, 3-6, 1979). The more exact value
0.0728 instead of 0.0724 is given by Matérn in the reference already mentioned,
and it is used in the equation below. Gundersen and Jensen ([Link] 147,
229-263, 1987) use this to give the coefficient of variation of the area A based on
the counted number of points N and on the form indicator $P/\sqrt{A}$:
$$\sigma/A = 0.27\,(P/\sqrt{A})^{1/2} / N^{3/4} .$$
The ratio $P/\sqrt{A}$ has a minimum of $2\sqrt{\pi}$ for circles (note the relation with the
isoperimetric deficit). Gundersen and Jensen give examples with ratios of up to 33
for very complicated figures, which gives for the coefficient of variation an
approximate range of $0.5/N^{3/4}$ to $1.5/N^{3/4}$.
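The coefficient of variation formula lends itself to a quick calculation of the precision of a point count, or of the number of points needed for a target precision. The sketch below is only an illustration of the formula quoted above; the function names and the 5% target are assumptions.

```python
import math

def point_count_cv(n_points, shape_ratio):
    """Coefficient of variation sigma/A of an area estimated by counting
    n_points grid points, for a form indicator shape_ratio = P / sqrt(A)."""
    return 0.27 * math.sqrt(shape_ratio) / n_points ** 0.75

def points_needed(target_cv, shape_ratio):
    """Number of counted points giving the target coefficient of variation
    (the formula above solved for N)."""
    return (0.27 * math.sqrt(shape_ratio) / target_cv) ** (4.0 / 3.0)

circle_ratio = 2.0 * math.sqrt(math.pi)   # minimum of P / sqrt(A)
for ratio in (circle_ratio, 33.0):
    print(f"P/sqrt(A) = {ratio:5.2f}: "
          f"CV with 100 points = {point_count_cv(100, ratio):.4f}, "
          f"points for CV 5% = {points_needed(0.05, ratio):.0f}")
```

Setting N = 1 in the first function recovers the coefficients of about 0.5 and 1.5 that bound the range given in the text.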
Questions, exercises
1. How many metros ruma are there in one cord?
2. We have a stack of pulp logs, stacked along a steep slope. Its length,
along the ground, is 12 m. The mean height measured vertically is
1.5 m, and measured perpendicular to the ground is 1.2 m. The logs
are 2.44 m long. What is the content in metros ruma?
3. Calculate the solid proportion in a stack of cylinders of uniform
diameter stacked in a rectangular pattern. Do the same for the more compact
(triangular) arrangement.
2.5 Weight measure
A common wood measurement method for pulp logs, and sometimes for
sawlogs, is through weight. Loaded trucks or railroad cars are weighed, and
the known weight of the empty vehicle is subtracted. For smaller scale
operations and field measurement, a device has been developed in New Zealand
that calculates the weight from the pressure in the hydraulic system of a
front loader.
It is of interest to know the equivalent in volume and/or dry weight.
Factors that affect these conversions are the variations in moisture content
and in wood density. In turn, these vary with the time elapsed since harvest,
tree size, site, weather, and locality of origin. Variables such as the amount
of mud adhered to the logs and the amount of fuel in the trucks also can be
important.
The most important source of variation is generally the moisture content.
It should be mentioned that often paying by weight is advantageous for the
buyer, being an incentive for delivering fresh wood.
Perhaps the most usual practice is to use an average conversion factor.
Sometimes corrections based on moisture content (estimated with electrical
instruments) are made. If P is green weight, p is dry weight, and V is volume,
the required conversion factor to volume V/P depends on the basic
density p/V, characteristic of the species, and the moisture content (%)
100(P − p)/p. The green volume V does not change appreciably as long as the
moisture content stays above the fibre saturation point, around 30%.
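The relationships among these quantities can be written out directly from the definitions: with moisture content h = 100(P − p)/p and basic density d = p/V, the green weight per unit volume is P/V = d(1 + h/100). The sketch below encodes this; the function names and the numerical values (a basic density of 0.40 t/m³ and 120% moisture) are hypothetical, chosen only for illustration.

```python
def green_weight_per_volume(basic_density, moisture_pct):
    """P/V from the basic density p/V and the moisture content 100 (P - p) / p."""
    return basic_density * (1.0 + moisture_pct / 100.0)

def volume_per_green_weight(basic_density, moisture_pct):
    """Conversion factor V/P from green weight to volume."""
    return 1.0 / green_weight_per_volume(basic_density, moisture_pct)

def dry_per_green_weight(moisture_pct):
    """Conversion factor p/P from green weight to dry weight."""
    return 1.0 / (1.0 + moisture_pct / 100.0)

# Hypothetical values: basic density in t/m3 (numerically equal to g/cm3).
d, h = 0.40, 120.0
print(f"V/P = {volume_per_green_weight(d, h):.3f} m3 per green tonne")
print(f"p/P = {dry_per_green_weight(h):.3f} dry tonnes per green tonne")
```

These expressions assume that V is the green volume, which is consistent with the definition of basic density as long as the wood stays above the fibre saturation point.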
Regressions are also used that include the mean log size or the number
of logs per load. Another predictor that has been used is the date (month),
since the variations through the year usually are important. An analysis
of the factors that affect conversion factors, specifically for radiata pine,
is found in: Ellis, J.C. “Weight/volume conversion factors for logs”, New
Zealand Logging Industry Research Association, Technical Release 6 (3),
1984.
Questions, exercises
1. Given the moisture content h and the basic density d, obtain formulas
for the conversion factors to volume and to dry weight.
2. The price of a metro ruma of debarked E. anonimus is $20000. Calculate
an equivalent price to pay for the ton. It is known that: the
proportion of solid wood is 0.72; basic density is 0.74 g/cm³; moisture
content is 80%.
2.6 Sawn timber
In Chile it is customary to express the volume of sawn wood in pulgadas
madereras (“lumber inches”), a unit related to the North American board
foot. Although the metric system is gaining ground, in sawmilling practice
measures in feet and inches are still in current use. One inch (abbreviation 1”)
is 2.54 cm. A foot (abbreviation 1’) is 12 inches, or 30.48 cm.
The board foot is the volume of a square piece 1 foot on a side and
one inch thick. A pulgada maderera is the volume of a board 10”
wide, 12’ long and 1” thick.
Clearly, one pulgada maderera equals 10 board feet. This is commonly used
for native timbers. In plantations, more common is the pulgada
corta or pulgada pinera, which is defined with a length of 10.5’ instead of
12’. These are nominal measures, that is to say, they may include tolerances
and/or planing and shrinkage losses.
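The unit definitions above translate directly into conversion functions. The following sketch uses hypothetical function names and takes nominal thickness and width in inches and length in feet, as in the text.

```python
INCH_M = 0.0254  # metres per inch

def board_feet(thickness_in, width_in, length_ft):
    """Board feet: one board foot is 1 in x 12 in x 1 ft."""
    return thickness_in * width_in * length_ft / 12.0

def pulgadas_madereras(thickness_in, width_in, length_ft):
    """Pulgadas madereras: one unit is 1 in x 10 in x 12 ft."""
    return thickness_in * width_in * length_ft / (10.0 * 12.0)

def pulgadas_cortas(thickness_in, width_in, length_ft):
    """Pulgadas cortas (pineras): one unit is 1 in x 10 in x 10.5 ft."""
    return thickness_in * width_in * length_ft / (10.0 * 10.5)

def cubic_metres(thickness_in, width_in, length_ft):
    """Nominal volume in cubic metres."""
    return thickness_in * width_in * (length_ft * 12.0) * INCH_M ** 3

piece = (2.0, 4.0, 16.0)   # a 2 in x 4 in x 16 ft piece
print(board_feet(*piece), pulgadas_madereras(*piece),
      pulgadas_cortas(*piece), cubic_metres(*piece))
```

All four functions work on nominal dimensions, so tolerances and planing or shrinkage losses are not accounted for.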
Questions, exercises
(a) 2” × 4” × 16’
(b) 3/4” × 3 3/4” × 8’
(c) 4” × 4” × 10’
Log rules are tables or formulas that give sawn volume based on the
small diameter and length of a log. They have been obtained through dia-
grams, or as formulas derived from geometric reasoning.
In the diagram method, circles of various sizes were drawn at scale,
representing the small end of logs. In these, boards that would be obtained
in sawing were indicated, and their volume calculated.
The best known log rule of this kind is the one of Scribner (1846). A
slight variant, the Scribner decimal C, is one of the most used nowadays in
the U.S.A. There is an approximation formula:
$$V = 0.79D^2 - 2D - 4$$
terms are multiplied by the length to obtain the recoverable volume, which
in board feet and for diameter in inches is
$$V = 0.22D^2 - 0.71D .$$
This formula can be modified for sawkerfs different from 1/8”, and the most
used is the one of 1/4”. With this sawkerf the area of the circle should have
been reduced by a factor of 16/21 instead of 16/19, so that the formula
above is adjusted simply by multiplying it by 19/21, obtaining
$$V = 0.199D^2 - 0.642D .$$
Unlike Scribner’s rule, the International takes into account taper,
assuming that boards of 4’ of length are usable, and that the average taper is
1/2” in 4’. The previous formula is therefore applied starting at the small end
of each 4’ section, incrementing the diameter by 1/2” for the following section.
Thus, for logs of 16’ the 1/4” rule would give the following formula:
$$\begin{aligned}
V &= 0.199D^2 - 0.642D \\
  &\quad + 0.199(D + 0.5)^2 - 0.642(D + 0.5) \\
  &\quad + 0.199(D + 1.0)^2 - 0.642(D + 1.0) \\
  &\quad + 0.199(D + 1.5)^2 - 0.642(D + 1.5) \\
  &= 0.796D^2 - 1.374D - 1.23 .
\end{aligned}$$
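The section-by-section application of the 1/4” rule can be sketched as a short loop. This is a simplified illustration with a hypothetical function name: it handles only whole 4’ sections and ignores the rounding and scaling conventions used in commercial practice.

```python
def international_quarter_inch(d_small_in, length_ft):
    """Board feet by the International 1/4-inch rule as described above.

    The per-section formula 0.199 D^2 - 0.642 D is applied to each whole
    4-foot section, with the diameter (inches) increasing by 1/2 inch per
    section from the small end. Partial sections are ignored (an assumption
    of this sketch).
    """
    sections = int(length_ft // 4)
    total = 0.0
    for k in range(sections):
        d = d_small_in + 0.5 * k
        total += 0.199 * d ** 2 - 0.642 * d
    return total

for d in (8, 12, 16, 20):
    print(f"D = {d:2d} in, 16 ft log: {international_quarter_inch(d, 16):6.1f} board feet")
```

Expanding the four terms for a 16’ log gives a quadratic in D with coefficients 0.796, −1.374 and −1.2295, which the loop reproduces for any diameter.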
In their commercial application these log rules are accompanied by
detailed standards for measuring, rounding, and defect allowances. Especially
in over-mature natural forests, the defects (rot, cracks, sweep) are usually
important. The general idea is to enclose the defect in a parallelepiped,
thinking about the way in which the cuts will be made, and deducting its
volume before applying the rule. Practices used in the U.S.A. are described
by Bell, J.F. and Dilworth, J.R. “Log Scaling and Timber Cruising”, O.S.U.
Book Stores, Inc., Corvallis, Oregon, 1988, 1993.
The traditional log rules can only give rough estimates, since the
conversion varies much with the technology of the sawmill, species, product
dimensions, etc. For a given situation it is possible to develop an empirical
log rule or conversion function through sawing studies. A certain number
of logs covering the desired range of diameters is measured and sawn, and the
lumber obtained is measured. With these data it is then possible to fit a
regression equation giving the sawn volume as a function of small diameter,
and possibly of length if this varies. The log rule formulas already seen
suggest that a reasonable function might take the form (for a given length)
$$V = b_0 + b_1 D + b_2 D^2 .$$
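Fitting such a function is an ordinary least-squares problem. The sketch below solves the normal equations directly in plain Python. Since no sawing-study data are given in these notes, it uses synthetic, noise-free "data" generated from the International 1/4” rule for 16’ logs, purely to show that the coefficients are recovered; the function names are hypothetical.

```python
def fit_quadratic(diameters, volumes):
    """Least-squares b0, b1, b2 in V = b0 + b1 D + b2 D^2 (normal equations)."""
    rows = [[1.0, d, d * d] for d in diameters]
    # Augmented 3 x 4 matrix [X'X | X'y].
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * v for r, v in zip(rows, volumes))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(3):
        p = max(range(c, 3), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(3):
            if i != c:
                f = m[i][c] / m[c][c]
                m[i] = [x - f * y for x, y in zip(m[i], m[c])]
    return [m[i][3] / m[i][i] for i in range(3)]

def synthetic_volume(d):
    """Synthetic sawn volume: four 4-ft sections of 0.199 D^2 - 0.642 D,
    with the diameter increasing by 1/2 inch per section."""
    return sum(0.199 * (d + 0.5 * k) ** 2 - 0.642 * (d + 0.5 * k) for k in range(4))

diameters = [float(d) for d in range(8, 31, 2)]       # small diameters, inches
volumes = [synthetic_volume(d) for d in diameters]    # synthetic, not measured
b0, b1, b2 = fit_quadratic(diameters, volumes)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

With real sawing data the residuals would not vanish, and the statistical considerations on regression treated in the appendix would apply.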
♥ F.X. Schumacher and W.C. Jones (Journal of Forestry 38, 889–896, 1940)
proposed an interesting method to obtain empirical log rules without relying on
detailed individual log data. The basic idea is that the previous equation can be
added over all logs processed in a day:
$$\sum V = b_0 N + b_1 \sum D + b_2 \sum D^2 .$$
It is then possible to estimate the coefficients from daily total production $\sum V$,
number of logs N, and sums of small diameters and of their squares. Formulas for
variable-length logs are handled in a similar way.
The method can be useful also for weight/volume conversion factors, and in
other applications. Clearly, to obtain reliable results, long daily production series
with important day to day variations in the characteristics and number of logs are
needed.
In many instances it may be more convenient to express the sawn yield
in relative terms, as a conversion factor of pulgadas or cubic metres sawn
per cubic metre of logs. The cubic volume can be estimated from the small
diameter assuming some value for taper, for instance the 1/2” per 4’ of
length (1:96) of the International rule.
Questions, exercises
1. A log 3.2 m long has small and large diameters (under bark) of 30 and
34 cm, respectively.
(a) Obtain the sawn volume in pulgadas madereras using the Inter-
national 1/8” rule.
(b) Based on Smalian, give the conversion factor (% of recovery).
3. Express the International 1/4” rule for sawn volume in cubic metres,
diameter in centimetres, and length in metres.
5. Obtain a formula for the conversion factor to sawn timber, as % in
function of the small diameter, using the 1/4” International log rule
and its taper assumption. Hint: consider a 4’ section.
$$V = (0.79D^2 - 2D - 4)(L/16)$$
Chapter 3
Trees
3.1 Diameters
The most commonly used diameter is the diameter at breast height (DBH).
It is defined as the diameter, over bark unless stated otherwise, at a height
above the ground that in most of the countries that use the metric system
is 1.3 metres. Some exceptions are New Zealand (1.4 m), and Japan (1.25
m). In the US, 4.5’ is used. These heights are convenient for measurement
with calipers, and are somewhat distant from the influence of the butt-swell
at the base of the tree (although perhaps a greater height might have been
preferable).
When measuring DBH it is desirable to rely on more precise specifica-
tions, which unfortunately are not standardized. For example, on a slope
it is customary to measure the DBH height either from the mean ground
level at the tree base, or from ground level on the upper slope side. In case
of stem deformation at breast height, the measurement may be displaced
upward, downward, or the average of two measurements may be taken.
Questions, exercises
1. Discuss the advantages and disadvantages of the several criteria of
measurement of the DBH just mentioned.
2. How important can the difference be between measuring the DBH at
1.3 or at 1.4 m? A typical stem taper in the first log is on the order
of 1:100.
The most common instruments used for measuring DBH are the caliper
and the diameter tape (graduated in units of π). Upper diameters are
measured by climbing, with instruments mounted on rods, or with various
kinds of optical dendrometers. In all instances, the considerations on
variability and error sources, and the relations with cross-sectional area already
discussed in the section on log measurement are applicable.
To convert diameters over bark to diameters under bark, the bark thickness
can be measured, or estimated from pre-established relationships. The
Swedish bark gauge is most commonly used. Its proper use requires periodic
practice and calibration, and biases can be important. Two readings, on opposite
sides of the stem, are commonly taken and added to obtain the
“double bark thickness”, taken as the difference between the two diameters.
3.2 Heights
Heights of up to 10–15 m are preferably measured with telescopic poles. For
greater heights, clinometers (instruments that measure vertical angles) or
hypsometers (specialized instruments that indicate height) are used. Some
hypsometers (Christen, Merritt) use similarity of triangles, but at present
most are based on trigonometric principles.
all based on the use of a pendulum or counterweight to establish a vertical
reference. Clinometers such as the Abney level, which uses an air bubble as
reference, may also give good results, although they are more cumbersome to use.
For high precision it is necessary to resort to theodolites or tachymeters.
Hypsometers are subject to large errors if they are not used with care.
The height pole is preferable when practical. Several readings must always
be taken; the median of three is recommended. The calibration of
the instrument must also be verified periodically. In particular, the accuracy
of the rangefinder must be checked, since substantial factory variations have
been found.
Questions, exercises
1. Based on the above figure, derive the principles and formulas for the
calculation of heights. The general approach in this type of problem
consists of establishing or identifying right-triangles and applying some
of the following formulas
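$$\sin\alpha = \frac{a}{c}, \qquad \cos\alpha = \frac{b}{c}, \qquad \tan\alpha = \frac{a}{b}, \qquad a^2 + b^2 = c^2$$
(right triangle with hypotenuse c, side a opposite the angle α, and side b adjacent to it).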
2. What happens if the level of the line of vision is below the base of the
tree?
(a) What is the height of the tree?
(b) The angles have an error of ±1◦ . By substitution, calculate limits
of error for the height.
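The substitution approach of part (b) can be sketched in code. This is only an illustration: it assumes a known horizontal distance to the tree and signed angles to the top and to the base (negative when below the horizontal), and the function names and readings are hypothetical, not data from these notes.

```python
import math
from itertools import product

def tree_height(dist, top_deg, base_deg):
    """Height from horizontal distance and signed vertical angles (degrees).

    Angles are positive above the horizontal line of sight and negative below it.
    """
    return dist * (math.tan(math.radians(top_deg)) - math.tan(math.radians(base_deg)))

def height_limits(dist, top_deg, base_deg, angle_err_deg):
    """Error limits obtained by substituting every combination of +/- angle errors."""
    heights = [
        tree_height(dist, top_deg + s_top * angle_err_deg, base_deg + s_base * angle_err_deg)
        for s_top, s_base in product((-1, 1), repeat=2)
    ]
    return min(heights), max(heights)

# Hypothetical readings: 20 m away, top at +45 degrees, base at -5 degrees.
h = tree_height(20.0, 45.0, -5.0)
low, high = height_limits(20.0, 45.0, -5.0, 1.0)
print(f"height = {h:.2f} m, limits = {low:.2f} to {high:.2f} m")
```

With signed angles the same expression also covers the case where the line of vision lies below the base of the tree (both angles positive).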
added. For uniform height intervals, Huber or Newton can also be
used. If they are included, usually the stump section is taken as a
cylinder, and the top section as a cone. This is, perhaps, the most
common procedure. If the sections are sufficiently short, errors and
differences between the various formulas are negligible.
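The procedure just described can be sketched as follows. It is a minimal example assuming circular cross-sections, diameters in cm and heights in m, with Smalian's formula (mean of the end areas times the length) for the intermediate sections; the function names and the tree data are hypothetical.

```python
import math

def cross_section_area(diameter_cm):
    """Cross-sectional area in m2 for a diameter in cm, assuming a circular section."""
    return math.pi * (diameter_cm / 100.0) ** 2 / 4.0

def stem_volume(heights_m, diameters_cm, total_height_m):
    """Stem volume (m3) by sections, from the stump upward.

    Stump = cylinder, intermediate sections = Smalian, top section = cone.
    """
    areas = [cross_section_area(d) for d in diameters_cm]
    volume = areas[0] * heights_m[0]                      # stump as a cylinder
    for i in range(len(areas) - 1):                       # Smalian sections
        length = heights_m[i + 1] - heights_m[i]
        volume += length * (areas[i] + areas[i + 1]) / 2.0
    volume += areas[-1] * (total_height_m - heights_m[-1]) / 3.0   # top as a cone
    return volume

# Hypothetical tree: diameters at 0.3, 5.3, 10.3 and 15.3 m, total height 20 m.
v = stem_volume([0.3, 5.3, 10.3, 15.3], [32.0, 26.0, 19.0, 10.0], 20.0)
print(f"volume = {v:.3f} m3")
```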
in cylindrical wooden cores extracted at right angles to the surface of the
stem with an increment borer :
The visibility of the rings may be improved with colorants and other
treatments. X-ray or gamma-ray densitometers are also used to detect the
changes of density associated with the rings. There are a number of possible
sources of error that need to be kept in mind. The presence of false rings,
produced by abrupt climatic variations or other factors, causes difficulties.
Eccentricity of the rings, and inclination of the borer, can produce serious
errors. There may be compression of the outer rings in the wooden core,
especially if the borer is not sharp enough. Loss of moisture below the fiber
saturation point (approximately 30%) produces contraction of the wood. A
good manual on increment boring is: Jozsa, L. “Increment core sampling
techniques for high quality cores”, Forintek, Spec. Pub. No. SP-30, 1988.
The number of annual rings between the pith and the cambium indicates
the year in which the tree reached the corresponding height. This can be
used to estimate the age of the tree. The time elapsed between two heights
can also be estimated, and from there the height growth rate. Measurement
of the rings provides estimates of growth in diameter and in basal area. For
values over bark, it is necessary to estimate the bark thickness indirectly,
usually with bark–dbh relationships.
Three-dimensionally, what we have is wood layers that form annually
on top of each other. Observing the intersections of these layers with cross
sections of the stem at various heights (the rings), the past dimensions of
the stem can be reconstructed. Thus, data can be obtained for volume
tables, taper curves, and site indices (growth in height). The principles
of this reconstruction are more or less obvious, except possibly for height
estimation.
The problem with height is that the end of each height increment can
happen at any height between the levels of two successive cuts, the exact
height being unknown. Sometimes it is possible to guide oneself by external
indicators (whorls, scars left by the bracts of the apical bud, arrangement of
the foliar primordia), and to make cuts that coincide with the end of each
annual increment. Otherwise, it is necessary to do some kind of interpolation.
Dyer and Bailey (Forest Science 33, 3–13, 1987) found that a simple
method proposed by Carmean gives good results.
Carmean’s method is based on assuming a constant increase in height
for every year between two successive levels, with the cuts occurring at
the middle of an increment. The distance between two successive cuts is
divided by the difference in ring numbers, obtaining a mean increment k.
The heights estimated above the level of the lower cut are then k/2, k/2 +
k, k/2 + 2k, . . . (see the figure).
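Carmean's interpolation, as described above, can be sketched in a few lines. The function name and the section data are hypothetical, and reading each estimate as the height reached at the end of one annual increment is our interpretation of the description.

```python
def carmean_heights(h_lower, h_upper, rings_lower, rings_upper):
    """Estimated heights between two successive cuts (Carmean's method).

    The distance between cuts is divided by the difference in ring counts to
    get a mean annual increment k; the estimated heights are then
    h_lower + k/2, h_lower + k/2 + k, ..., ending at h_upper - k/2.
    """
    n_years = rings_lower - rings_upper
    if n_years <= 0:
        raise ValueError("the lower section must have more rings than the upper one")
    k = (h_upper - h_lower) / n_years
    return [h_lower + k / 2.0 + j * k for j in range(n_years)]

# Hypothetical sections: cuts at 0.3 m and 2.3 m with 20 and 16 rings.
heights = carmean_heights(0.3, 2.3, 20, 16)
print(heights)
```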
Questions, exercises
1. In a stem analysis, a tree with 67 rings in the stump is cut into 5 m
logs. The numbers of rings in the upper ends of the logs are 53, 37,
and 19. Consider the time intervals in which the tree passed from a
cut level to the next. For each one of these intervals, estimate the
average rate of growth in height.
2. In a stem analysis (in 1995), we have obtained diameters for rings 5
and 10, counted from the outside of each section. With these diameters
the following areas were calculated (m²):
Calculate the growth rate in m³/year between 1984 and 1989. Use
Smalian, taking the stump (height 30 cm) as a cylinder.
between the diameter under bark at 5.19 m of height and the DBH over
bark. 5.19 m corresponds to the end of a first 16 feet log.
Some common forms for volume functions are
$$V = a + bD^2 + cH + dD^2 H ,$$
and the variants obtained with various combinations of a, b and c set equal
to zero, and
$$\log V = a + b \log D + c \log H .$$
We are dealing with typical linear regression problems, without major
complications. Nevertheless, three particular aspects may be mentioned.
It often happens that the dispersion of the regression residuals for V
tends to increase with the values of the variable, an instance of heteroscedasticity.
The logarithmic transformation, when it is used, can eliminate or
reduce this effect, because if σ is proportional to V then the deviations of
log V have approximately constant variance.
Another way of dealing with heteroscedasticity is to use weighted least-squares,
applied to volume tables by Cunia in 1964. The assumption is that the
variance of $\varepsilon_i$ is $\sigma^2 w_i$, where the $w_i$ are known. This is a special case
of generalized least-squares, with W diagonal (only its diagonal elements
$w_{ii} \equiv w_i$ are different from zero). The parameters are then estimated by
minimizing the weighted sum of squares $\sum e_i^2 / w_i$. A program for ordinary
linear regression can also be used, noticing that the model $y_i = x_i \beta + \varepsilon_i$
with variance $\sigma^2 w_i$ reduces to one with variance $\sigma^2$ if we divide both sides
by $\sqrt{w_i}$.
A typical example is the volume equation $V = a + bD^2H$. It is often the
case that the residuals suggest a σ proportional to the independent variable
$D^2H$. Better estimates for the parameters a and b are therefore obtained
by fitting the equation $\frac{V}{D^2H} = a \frac{1}{D^2H} + b$.
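The transformation can be tried out with any ordinary regression routine. The sketch below uses a small hand-written simple regression so that it has no dependencies; the function names and the data are hypothetical and serve only as an illustration.

```python
def fit_line(u, y):
    """Ordinary least squares for y = intercept + slope * u."""
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y))
    slope = sxy / sxx
    return mean_y - slope * mean_u, slope

def fit_volume_weighted(d2h, volume):
    """Fit V = a + b*D2H assuming sigma proportional to D2H.

    Dividing both sides by D2H gives V/D2H = a*(1/D2H) + b, an ordinary
    regression in which a is the slope and b is the intercept.
    """
    u = [1.0 / x for x in d2h]
    y = [v / x for v, x in zip(volume, d2h)]
    b, a = fit_line(u, y)
    return a, b

# Hypothetical data (D2H in m3, V in m3), for illustration only.
d2h = [0.5, 1.0, 2.0, 4.0, 8.0]
vol = [0.21, 0.40, 0.79, 1.62, 3.15]
a, b = fit_volume_weighted(d2h, vol)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Note that the roles of slope and intercept are exchanged in the transformed regression.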
Another topic that is often mentioned is the fact that when using log-
arithmic transformations of the dependent variable, such as log V , biased
estimators are obtained for the original variable, in this case V . Corrections
to the results of the linear regression have been proposed in order to reduce
the bias. It is not clear, however, if this is really justified, because gener-
ally the bias is reduced at the cost of increasing the MSE and reducing the
likelihood.
It is also possible to avoid the bias by using nonlinear regression, for
example fitting by least-squares the equation $V = kD^b H^c$ instead of the
logarithmic equation shown above. This has become fashionable with the
advances in computing. Considering the already mentioned stabilization of
the variance that is generally obtained with the logarithms, it seems probable
that in most instances this remedy is worse than the disease.
The third topic is the comparison of models that use different dependent
variable transformations. Obviously in this case it does not make sense to
compare the regression SE, MSE or $R^2$, since these refer to different variables.
Probably the best thing to do is a graphical analysis of residuals,
since the quality of the fit may be different for different values of the
variables. Another possibility is to calculate the SE or MSE for the untransformed
variables, or for the same transformation, preferably separately
for various predictor ranges. In a different approach, Furnival (Forest Science
7, 337–341, 1961) proposed an index that is frequently used for this
purpose. It is essentially an approximation to a transformation of the likeli-
hood function. It must be kept in mind, therefore, that it measures as much
the plausibility of the regression function, as that of the error distribution
implicit in it.
Questions, exercises
4. Verify the values of b equal to 0.20 for the neiloid, 0.26 for the cone,
0.39 for the paraboloid, and 0.79 for the cylinder.
One of the simplest taper functions, recommended by Kozak, Munro and
Smith in Canada, is a regression of the form
$$\frac{d^2}{D^2} = b_1 + b_2 \frac{h}{H} + b_3 \frac{h^2}{H^2} ,$$
where d is the diameter at height h, and D and H are DBH and total height,
respectively.
Note in the first place that it might be desirable to force the function so
that when h = H (at the top) the diameter is zero. Clearly, one must have
b1 + b2 + b3 = 0. Substituting b1 in terms of b2 and b3 , it is seen that this
can be achieved with a regression
$$\frac{d^2}{D^2} = b_2 \left(\frac{h}{H} - 1\right) + b_3 \left(\frac{h^2}{H^2} - 1\right) ,$$
or
$$\frac{d^2/D^2}{h/H - 1} = b_2 + b_3 \left(\frac{h}{H} + 1\right) ,$$
for example. If d is a diameter over bark, one could also make d = D for
h = 1.3 m, leaving a single free parameter.
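As a small check of the constrained form, the sketch below evaluates the diameter along the stem. The function name and the coefficients are hypothetical; b2 = −2 and b3 = 1 are chosen only because they reduce the model to a cone, d/D = 1 − h/H, which is easy to verify by hand.

```python
import math

def kozak_diameter(h, dbh, total_height, b2, b3):
    """Diameter at height h from the constrained quadratic taper model.

    d^2/D^2 = b2*(h/H - 1) + b3*(h^2/H^2 - 1), which is zero at h = H.
    """
    x = h / total_height
    ratio = b2 * (x - 1.0) + b3 * (x * x - 1.0)
    return dbh * math.sqrt(max(ratio, 0.0))

# Hypothetical coefficients giving a cone; D = 30 cm, H = 20 m.
for h in (0.0, 10.0, 20.0):
    print(h, round(kozak_diameter(h, 30.0, 20.0, -2.0, 1.0), 2))
```

This example only enforces the zero diameter at the top; it does not impose d = D at breast height.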
A second observation is that a second degree polynomial in h will not
represent well the form of the stem near the base. This could be improved by
adding a term in $h^3/H^3$ (recall the neiloid $d^2 = kh^3$). In practice it has
been found that to better represent the butt-swell, a term with a quite high
power of h, such as $h^8$ or $h^{40}$, is generally needed.
Another characteristic of this kind of equation is that it implies a “shape”
that does not change with tree size. Graphing d vs h for trees with different
D and H, it is clear that the curves can be matched over all their length by
choosing appropriate scale factors for the axes d and h (the curves of d/D
vs h/H are equal). Often better results are obtained if form is allowed to
vary with size. For this, in equations like the previous one, functions of D
and/or H are substituted for some of the bi . Note that if these functions
are linear in their parameters, the regression is still linear after the substi-
tution. A way of finding appropriate forms for these functions is to fit an
initial regression separately for each tree, and then to graph the bi over D
and H. This is an example of the problems sometimes called, for historical
reasons, “harmonization of curves”, which appear frequently in Forest
Mensuration.
A great variety of models and estimation methods has been used to
obtain taper functions. We will only examine another two examples.
P. L. Real and J. A. Moore (pages 1037–1044 in Forest Growth Modelling
and Prediction, USDA Forest Service, General Technical Report NC-120,
1988), used the following initial model for Douglas-fir, fitting it indepen-
dently to the data of each tree:
$$y = b_1 (x^3 - x^2) + b_2 (x^8 - x^2) + b_3 (x^{40} - x^2) ,$$
where y is $d^2/D^2 - x^2$ and x is $(H - h)/(H - 4.5)$. Note the use of high powers
of h, and the conditioning to ensure a zero diameter at the apex, and d = D
at breast height (d is over bark, heights are in feet, and 4.5 is the breast
height). Then, the $b_1$, $b_2$ and $b_3$ for each tree were fitted to three regressions
containing (not necessarily in linear form) H, D, and crown length.
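The two conditions built into this model can be verified numerically. The sketch below evaluates the model for one tree; the function name and the coefficient values are hypothetical and are not those estimated by Real and Moore.

```python
import math

BREAST_HEIGHT_FT = 4.5

def real_moore_diameter(h, dbh, total_height, b1, b2, b3):
    """Diameter at height h (feet) from y = d^2/D^2 - x^2 with high powers of x."""
    x = (total_height - h) / (total_height - BREAST_HEIGHT_FT)
    y = b1 * (x**3 - x**2) + b2 * (x**8 - x**2) + b3 * (x**40 - x**2)
    return dbh * math.sqrt(max(x**2 + y, 0.0))

# Hypothetical tree: DBH 12 (any unit), height 100 ft, illustrative coefficients.
coefficients = (0.5, -0.1, 0.01)
for h in (0.0, 4.5, 50.0, 100.0):
    print(h, round(real_moore_diameter(h, 12.0, 100.0, *coefficients), 2))
```

At x = 1 (breast height) every term of y vanishes, so d = D, and at x = 0 (the apex) the diameter is zero, whatever the coefficients.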
This model is atypical in including another predictor variable in addition
to D and H, the crown length. This seems a good idea; recall the
differences in stem form within the crown (approx. conical), and below it
(approx. parabolic). In addition, crown length reflects stand characteristics,
being associated with its density, and it is known that two trees with the same
DBH and height, one dominant in a dense stand, and the other suppressed
in a more open stand, would have different forms. On the other hand, it
would probably have been advisable to re-estimate the parameters directly
with the full data set after substituting the expressions for the bi .
The other example is from A. Gordon (New Zealand Journal of Forestry
Science 13, 146–155, 1983), for radiata pine. The function is
$$d^2 = \frac{4V}{\pi H} (b_1 z + b_2 z^2 + b_5 z^5 + b_{16} z^{16}) ,$$
with the constraint $\sum b_i/(i + 1) = 1$ enforced through regression with a
transformed function. The variable z is $1 - h/H$, and V is the volume calculated
with a cubic volume function obtained from the same data. This is
what is called a compatible taper function, a concept developed by Demaerschalk
in Canada. It has the property that integrating $\pi d^2/4$ with respect to
h between 0 and H produces exactly the same V estimated by the volume
function. Although the compatibility between taper and volume functions
is appealing, it is not very clear if the possible sacrifices of flexibility and
precision are worthwhile.
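The compatibility property can be checked numerically. In the sketch below the coefficients are hypothetical (not Gordon's estimates); b16 is computed from the others so that the constraint holds, and the integral is approximated with the midpoint rule.

```python
import math

def gordon_d2(h, total_height, volume, b):
    """Squared diameter (m2) at height h; b maps exponent i to coefficient b_i."""
    z = 1.0 - h / total_height
    poly = sum(bi * z**i for i, bi in b.items())
    return 4.0 * volume / (math.pi * total_height) * poly

def integrated_volume(total_height, volume, b, steps=2000):
    """Integrate pi*d^2/4 from 0 to H with the midpoint rule."""
    dh = total_height / steps
    return sum(
        math.pi / 4.0 * gordon_d2((j + 0.5) * dh, total_height, volume, b) * dh
        for j in range(steps)
    )

# Hypothetical coefficients; b16 is chosen so that sum b_i/(i+1) = 1.
b = {1: 0.8, 2: 1.2, 5: -0.5}
b[16] = 17.0 * (1.0 - sum(bi / (i + 1) for i, bi in b.items()))
v_check = integrated_volume(30.0, 1.5, b)
print(v_check)   # should be close to the input V = 1.5
```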
Functions based on polynomials seem to be most common, but other
forms have also been used, such as rational and trigonometric functions. The
use of splines has also become popular. These are combinations of several
functions, usually cubic polynomials, each one valid over certain range of
the independent variable, and with continuity and smoothness constraints
at the joining points.
From a statistical point of view, we may comment that in taper functions
the assumptions necessary for the optimality of least-squares are far from
being fulfilled. The homoscedasticity assumption is unrealistic, considering
that there are perfectly known points (the apex and possibly the DBH), near
which errors should be smaller. Similarly, the independence assumption is
untenable since, on a given tree, diameters taken close together tend to
deviate from the mean in the same direction.
Questions, exercises
Chapter 4
Stands
The arithmetic mean diameter is less used. Occasionally dominant diam-
eters or top diameters are defined, based on largest trees. These are related
to the top heights described below.
Note that the variance of a sample of diameters is
$$s^2 = \overline{d^2} - \bar{d}^{\,2}$$
(the bar over an expression denotes an average). It is seen then that the
(quadratic) mean diameter is always larger than the arithmetic mean diam-
eter (unless the variance is zero).
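The relation between the two means is easy to verify on a small sample. The sketch below uses hypothetical diameters; note that the variance in the identity uses the divisor n, not n − 1.

```python
import math

def mean_diameters(dbh):
    """Arithmetic mean, quadratic mean, and variance (divisor n) of a DBH sample."""
    n = len(dbh)
    arithmetic = sum(dbh) / n
    quadratic = math.sqrt(sum(d * d for d in dbh) / n)
    variance = quadratic**2 - arithmetic**2
    return arithmetic, quadratic, variance

# Hypothetical DBH sample (cm).
arith, quad, var = mean_diameters([12.0, 18.0, 20.0, 25.0, 31.0])
print(f"arithmetic = {arith:.2f}, quadratic = {quad:.2f}, variance = {var:.2f}")
```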
4.2 Heights
4.2.1 Height-diameter curves
For time and cost reasons, often the heights of all the trees in a stand or
sample plot are not measured. Heights measured on a sub-sample are used
in a regression to estimate heights based on DBH. Simple linear regressions
with transformations of the variables are generally used.
Comparisons of height–DBH curves have been done, for example, by
Curtis (Forest Science 13, 365–375, 1967), A.R. Ek (p. 67–80 in Statistics in
Forest Research, Proc. of meeting of IUFRO Subject Group S6.02, Vancouver,
1973), García (INFOR, Nota Técnica No. 19, 1973), and Arabatzis
and Burkhart (Forest Science 38, 192–198, 1992). Some equations that have
given good results are H = b1 + b2 log D, H = b1 + b2 /(D + 10), and
log H = b1 + b2 /D.
It is a typical regression problem, without major complications. It
is advisable, nevertheless, to make sure that the resulting curve is not
decreasing, something that frequently happens with some functions, with
small samples, and/or with large measurement errors.
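As an illustration, the first of the equations above can be fitted by simple linear regression on the transformed variable, followed by the check that the curve is increasing. The sketch assumes natural logarithms; the function name and the sub-sample are hypothetical.

```python
import math

def fit_height_curve(dbh, height):
    """Least-squares fit of H = b1 + b2*ln(D); returns (b1, b2)."""
    x = [math.log(d) for d in dbh]
    n = len(x)
    mean_x = sum(x) / n
    mean_h = sum(height) / n
    sxy = sum((xi - mean_x) * (hi - mean_h) for xi, hi in zip(x, height))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b2 = sxy / sxx
    return mean_h - b2 * mean_x, b2

# Hypothetical sub-sample of trees (DBH in cm, height in m).
b1, b2 = fit_height_curve([12.0, 18.0, 24.0, 30.0, 38.0], [14.0, 19.0, 22.5, 25.0, 27.0])
if b2 <= 0:
    print("warning: the fitted curve is decreasing in DBH")
print(f"H = {b1:.2f} + {b2:.2f} ln D")
```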
Questions, exercises
the plot area is taken: for the 100 largest per hectare, the 10 largest in a 1000
m² plot are selected, or the 5 largest in a 500 m² plot. There are a number of
variants of this approach. The m tallest, or the m trees of largest diameter
may be chosen. The second way is often easier. When the heights are
estimated from a height–DBH curve, both methods are obviously equivalent.
Once the m trees are selected, the arithmetic mean of the heights, either
measured or estimated, can be calculated. Another alternative is to calculate
the quadratic mean of the DBH for the m trees (called sometimes dominant
DBH or top DBH), and to use the height given by the height–DBH curve
for that mean. The height of the highest or fattest tree in 1/100 ha plots
or sub-plots is also used. In all cases, malformed trees (broken, forked, etc.)
are excluded.
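Two of the alternatives above can be sketched as follows: selecting the m trees of largest DBH in proportion to the plot area, and then computing their mean height and their quadratic mean DBH. The function names and the tree data are hypothetical, and malformed trees are assumed to have been removed beforehand.

```python
import math

def top_trees(trees, plot_area_m2, per_hectare=100):
    """Select the largest-DBH trees in proportion to the plot area.

    trees: list of (dbh_cm, height_m) pairs.
    """
    m = round(per_hectare * plot_area_m2 / 10000.0)
    return sorted(trees, key=lambda t: t[0], reverse=True)[:m]

def dominant_height(trees, plot_area_m2):
    """Arithmetic mean height of the selected trees."""
    selected = top_trees(trees, plot_area_m2)
    return sum(h for _, h in selected) / len(selected)

def top_dbh(trees, plot_area_m2):
    """Quadratic mean DBH of the selected trees."""
    selected = top_trees(trees, plot_area_m2)
    return math.sqrt(sum(d * d for d, _ in selected) / len(selected))

# Hypothetical 200 m2 plot, so m = 2 trees are selected.
plot = [(30, 25), (20, 22), (40, 28), (25, 24), (35, 27), (15, 18)]
print(dominant_height(plot, 200.0), round(top_dbh(plot, 200.0), 1))
```

For very small plots m rounds to zero and the functions fail, so the sketch is only meaningful for plots of about 100 m² or more.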
Questions, exercises
1. The class of 1994 obtained the following data for a 500 m2 Eucalyptus
nitens plot:
DBHs: 30 7 33 22 23 24 26 13 30 25 6 23 19 14 19 29 28 25 26 42 20
21 35 42 14 45 24 14 41 23 13 28 42 26 20 22 25 40 27 21 26 27 38 23
46 21 20 16 22 25 31 29 21 21 22 25 25 22 21 17
Sample trees:
DBH: 30 22 26 30 28 42 20 21 35 42 14 45 27 26 27 31 29 25
HT: 22 24 26 30 23 28 20 20 25 27 14 32 27 25 27 27 28 24
Calculate DBH and dominant height according to the several alterna-
tives described.
and added. Often the data are well fitted by a simple linear regression of
volume on DBH squared (or on tree basal area), the so-called volume line
or volume–basal area line1 .
The first method is most used, and may be somewhat less time con-
suming when the height–diameter curve is needed anyway, for instance for
calculating dominant height. The local volume function may turn out to be
somewhat biased because of the indirect way in which it is obtained. The
second method is more direct and more general (it is not limited to the use
of volume functions). The volume line normally presents a much tighter
relationship between the variables than the height–diameter curve. Usually
the second method gives somewhat better results, although the differences
are not great.
It is customary not to include as sample trees those that display mal-
formations such as forks, twists, or broken leaders. This provides more
consistent measures at the stand level, both for heights and for volumes.
On the other hand, some overestimation takes place. It is important also to
remember that what is calculated is a standing volume that, due to logging
losses and waste, differs from the volume extracted.
As a general comment, it could be said that perhaps the importance
of calculating cubic volumes tends to be overrated. In practice, mean con-
version factors of doubtful accuracy are applied to the cubic volume for
estimating sawn volumes, dollar values, etc. Its direct utility is mostly as a
conventional unit traditionally used for comparative purposes.
Questions, exercises
1. With the data from the E. nitens plot of Section 4.2.2, estimate the
volume by hectare with both methods. Graph the data and relation-
ships.
3. Compare the fit to the data for this height–DBH function with that
for the other functions previously indicated, graphically and through
the MSE.
4. What form of local volume function would imply the best height–DBH
function found in the previous point? Compare with the volume line.
1 Unrelated to the similarly named Gray’s “volume line”, which is the line of squared
diameter over height on the stem for the parabolic part of a tree.
5. In a 500 m2 plot, the following DBH (cm) were measured, sorted in
increasing order:
6.3 7.4 12.9 13.1 13.6 13.7 13.9 16.2 16.7 19.0 19.1 19.9 20.1 20.5 20.6
21.1 21.2 21.2 21.2 21.5 21.7 21.7 21.8 22.0 22.5 22.7 22.7 22.9 23.4
24.3 24.4 24.6 24.9 25.1 25.2 25.4 25.4 25.7 26.0 26.3 26.4 26.9 27.2
28.1 28.1 28.7 29.4 29.8 30.0 31.0 33.0
(n = 51, $\sum D = 1136.5$, $\sum D^2 = 26935.99$). With sample trees,
the height–DBH curve H = −15.1 + 12.0 ln D and the volume line
$V = -0.115 + 0.00082 D^2$ were obtained.
or these may be presented in the form of a histogram or of a frequency table
(a “stand table”).
[Figure: map of a sample plot with each tree labeled by its DBH, the corresponding tree list, and the stand table drawn as a histogram of number of trees by DBH class (10 to 45).]
Tree list:
20,30,35,45,15,15,30,30,35,35,30,20,
20,25,40,40,35,10,25,25,20,25,35
In the stand table, traditionally the DBH are grouped in DBH classes,
and the number of trees in each class is shown in terms of its per hectare
equivalent. It is customary to include also the volume by hectare calculated
for each DBH class, what is called a “stock table”, and sometimes also the
estimated heights and other variables. The grouping into classes is done a
posteriori, or it may result from recording only the diameter class at the
time of measurement.
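Building a stand table from a tree list is a simple counting exercise. In the sketch below the DBH values are those of the tree list above, while the plot area (500 m²), the 5 cm class width, and the function name are assumptions made for illustration.

```python
import math
from collections import Counter

def stand_table(dbh_list, plot_area_m2, class_width=5.0):
    """Trees per hectare by DBH class, keyed by class midpoint."""
    expansion = 10000.0 / plot_area_m2            # plot count -> per hectare
    counts = Counter(
        class_width * math.floor(d / class_width + 0.5) for d in dbh_list
    )
    return {mid: n * expansion for mid, n in sorted(counts.items())}

tree_list = [20, 30, 35, 45, 15, 15, 30, 30, 35, 35, 30, 20,
             20, 25, 40, 40, 35, 10, 25, 25, 20, 25, 35]
table = stand_table(tree_list, plot_area_m2=500.0)
for midpoint, trees_per_ha in table.items():
    print(f"{midpoint:5.0f} {trees_per_ha:6.0f}")
```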
Stand tables were often used to facilitate the calculation of aggregate
variables such as the basal area, mean DBH, and volume per hectare, adding
class values weighted by frequency. With the advances in computing this
application has lost its importance. Nowadays it is not advisable to
group data in this way for calculation purposes, since the loss of precision
is not justified. Stand and stock tables may still be useful as a simple and
convenient summary of stand characteristics.
An alternative to the stand table, which is essentially a histogram of the
diameter distribution, is the approximation of the distribution by continuous
functions. Probability distribution functions are used, making an analogy
between the proportion of trees with diameters in a certain interval, and the
probability that a random variable should take values within the interval.
The distribution function F (x) for a random variable X is the probability
of this having a value less than or equal to x:
F (x) = Pr{X ≤ x} .
If the variable is continuous and the derivative exists, f (x) = dF (x)/dx is the
density function. Given a distribution or density function, the probability
of X having a value between a and b, or analogously the proportion of trees
with DBH between a and b, can be calculated then as
$$\Pr\{a < X \le b\} = F(b) - F(a) = \int_a^b f(x)\,dx .$$
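The equivalence between the two expressions can be checked numerically. The sketch below uses an exponential distribution purely as a convenient example; the rate parameter, the interval, and the function names are hypothetical.

```python
import math

def exponential_cdf(x, k):
    """F(x) = 1 - exp(-k x) for x > 0 (used here only as an example)."""
    return 1.0 - math.exp(-k * x) if x > 0 else 0.0

def proportion_between(cdf, a, b):
    """Pr{a < X <= b} = F(b) - F(a)."""
    return cdf(b) - cdf(a)

def integrate_density(density, a, b, steps=1000):
    """Midpoint-rule approximation of the integral of f(x) from a to b."""
    dx = (b - a) / steps
    return sum(density(a + (j + 0.5) * dx) for j in range(steps)) * dx

k = 0.05   # hypothetical rate parameter
p_cdf = proportion_between(lambda x: exponential_cdf(x, k), 20.0, 30.0)
p_int = integrate_density(lambda x: k * math.exp(-k * x), 20.0, 30.0)
print(p_cdf, p_int)
```

Multiplying the proportion by the number of trees per hectare gives the expected number of trees in the interval.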
(eds) Proc. IUFRO Symp. on Forest Management Planning and Managerial
Economics, U. of Tokyo, 1984), B.E. Borders and W.D. Patterson (Forest
Science 36, 413–424, 1990).
Although undoubtedly distributions (in all their forms, including tree
lists and stand tables) are useful and necessary for estimating product mixes
and sizes, their reliability tends to be overestimated. Aside from the sam-
pling variability already mentioned, the analogy with probability distribu-
tions has been taken too far. The use of sample plots produces an over-
representation of pairs of trees separated by short distances, and the DBH
are not distributed at random on the ground. Competition induces nega-
tive correlations in the DBH of nearby trees, whereas microsite variations
induce positive correlations. Consequently, the distributions derived from
plot data, those that are usually obtained, can be considerably different
from the distributions for a whole stand, those that are usually required in the
applications (O. García, “What is a Diameter Distribution?”, in Minowa, M.
and Tsuyuki, S. (eds) Proc. Symp. Integrated Forest Management Systems,
Japan Soc. of Forest Planning Press, 1992). These models must be used
with caution, appreciating their limitations.
Diverse methods have been used for estimating the parameters of DBH
distributions, the main ones being the maximum likelihood method (ML)
and the method of moments. The latter consists of making the first two or
three moments of the distribution (depending on whether 2 or 3 parameters
need to be estimated) agree with the respective moments of the sample.
That is, with two parameters one takes the values for which the mean and
variance in the sample and in the theoretical distribution are the same.
Although statistically not as efficient as ML, there are certain advantages in
the consistency of the observed quadratic mean DBH with the one calculated
from the distribution (clearly, this happens if either the distribution of the
diameters or that of the basal areas is considered).
µ = bΓ(1 + 1/c) ,
More convenient is to use an approximation formula that gives c as a function
of the coefficient of variation z = σ/µ, sufficiently accurate for practical purposes
(O. García, New Zealand Journal of Forestry Science 11, 304–306, 1981):
Then one obtains b = µ/Γ(1 + 1/c). For tables and approximations to the gamma
function see for example Abramowitz, M. and Stegun, I. A., “Handbook of Math-
ematical Functions”, or the article just cited (in APL the function ! calculates
Γ(z + 1), which for z integer equals the factorial of z). The approximation given is
not valid for c ≤ 1, where the Weibull takes the form of an inverted J instead of
being unimodal.
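As an alternative sketch, c can also be found numerically. The code below does not use the approximation formula cited above; instead it solves for c by bisection, assuming the standard two-parameter Weibull relation $z^2 = \Gamma(1 + 2/c)/\Gamma(1 + 1/c)^2 - 1$ for the coefficient of variation, and then obtains b from the mean. The function names and the stand values are hypothetical.

```python
import math

def weibull_cv(c):
    """Coefficient of variation of a two-parameter Weibull with shape c."""
    g1 = math.gamma(1.0 + 1.0 / c)
    g2 = math.gamma(1.0 + 2.0 / c)
    return math.sqrt(g2 / g1**2 - 1.0)

def weibull_from_moments(mean, sd, c_low=1.0, c_high=100.0):
    """Scale b and shape c matching the given mean and standard deviation.

    Bisection on c, using the fact that the CV decreases as c increases.
    """
    target = sd / mean
    if not weibull_cv(c_high) < target < weibull_cv(c_low):
        raise ValueError("coefficient of variation outside the supported range")
    for _ in range(100):
        c_mid = (c_low + c_high) / 2.0
        if weibull_cv(c_mid) > target:
            c_low = c_mid
        else:
            c_high = c_mid
    c = (c_low + c_high) / 2.0
    return mean / math.gamma(1.0 + 1.0 / c), c

# Hypothetical stand: mean DBH 22 cm, standard deviation 6 cm.
b, c = weibull_from_moments(22.0, 6.0)
print(f"b = {b:.2f}, c = {c:.2f}")
```

The search is restricted to c > 1, the unimodal case.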
Questions, exercises
7. The mean for the exponential with distribution function $F(x) = 1 - e^{-kx}$
is 1/k. Assume that in a stand with 800 trees per hectare and
a basal area of 40 m²/ha, the squared diameters (or tree basal areas)
follow an exponential distribution.
Calculate the number of trees smaller than 25 cm, estimating k by the
method of moments.
Appendix A
Errors
All measurements are subject to error and uncertainty. Error sources are
varied, and could be classified in many ways. For instance, there are what
we might call “mistakes”, due to wrong readings on an instrument scale,
transcription errors, etc. There are instrumental errors, due to defects or improper
use of an instrument, and personal errors, caused by deficiencies in the observer's
senses, or by the subconscious influence of the observer's interests or preferences. Very
important and often ignored are errors due to the model; for instance, in
most calculations with tree diameters and cross-sections it is assumed that
the cross-section is circular. Systematic errors are those that always act in
the same direction.
In relation to an instrument or method that generates a (real or hypo-
thetical) series of measurements, it is useful to distinguish between accu-
racy and precision. Accuracy refers to the closeness between measurements
and the true value. Precision has to do with consistency, closeness of the
measurements among themselves. Measurements can be precise but inaccu-
rate. Some authors understand accuracy as the absence of systematic errors
(“bias”), that is, closeness of the mean of the measurements to the true value.
tities subject to error can be determined by substituting all possible combi-
nations of negative and positive errors, and taking the extreme results (the
combinations to be tried can be reduced if it is clear which are the most
unfavorable situations). It is a good idea to do this in important instances.
The methods described below are more convenient, and can provide useful
relationships between errors and variables.
It is clear that in a sum or difference errors add up, because they are
assumed independent and the direction of their action is unknown (for a
bound, the most unfavorable situation must be taken):
(x ± ∆x) + (y ± ∆y) = (x + y) ± (∆x + ∆y)
(x ± ∆x) − (y ± ∆y) = (x − y) ± (∆x + ∆y) .
Multiplication and division is somewhat more complicated:
(x ± ∆x)(y ± ∆y) = xy ± x∆y ± y∆x ± ∆x∆y .
The last term is small relative to the others, and omitting it we can write
(assuming that x and y are positive)
(x ± ∆x)(y ± ∆y) = xy ± xy(∆x/x + ∆y/y) .
∆x/x is the relative error for x (while ∆x is the absolute error ). It is seen,
then, that the relative error for the product is approximately the sum of the
relative errors for the factors. The same happens with division.
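The size of the omitted term can be seen by comparing the approximation with direct substitution of the extreme values. The measurements and function names below are hypothetical.

```python
from itertools import product

def product_limits(x, dx, y, dy):
    """Extreme values of (x +/- dx)(y +/- dy) by substitution."""
    values = [(x + sx * dx) * (y + sy * dy) for sx, sy in product((-1, 1), repeat=2)]
    return min(values), max(values)

def product_error_approx(x, dx, y, dy):
    """Approximate absolute error: |xy| times the sum of the relative errors."""
    return abs(x * y) * (dx / abs(x) + dy / abs(y))

x, dx, y, dy = 20.0, 0.5, 5.0, 0.1   # hypothetical measurements
low, high = product_limits(x, dx, y, dy)
approx = product_error_approx(x, dx, y, dy)
print(low, high, x * y - approx, x * y + approx)
```

For positive x and y the upper limit exceeds the approximate one by exactly ∆x∆y, the term that was dropped.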
More generally, the error for a function of x and y may be approximated
by the initial terms of its Taylor series:
$$g(x + \Delta x, y + \Delta y) = g(x, y) + \frac{\partial g(x, y)}{\partial x} \Delta x + \frac{\partial g(x, y)}{\partial y} \Delta y + \ldots .$$
The omitted terms contain products of errors and, as in the multiplication,
can be neglected if the errors are not too large. Considering the uncertainty
in the error signs, we find then that in the worst case the error in g is
approximately
$$\Delta g = \left|\frac{\partial g(x, y)}{\partial x}\right| \Delta x + \left|\frac{\partial g(x, y)}{\partial y}\right| \Delta y .$$
The generalization to any number of variables is obvious.
Let us see two simple examples.
(i) Let z = g(x, y) = xy. Then
∆z = |y|∆x + |x|∆y ,
which agrees with the results above.
(ii) The error in the one-variable function g(x) = ln x is
$$\Delta \ln x = \frac{1}{x} \Delta x = \frac{\Delta x}{x} .$$
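When the derivatives are tedious to obtain by hand, the same worst-case bound can be approximated with numerical derivatives. This is a sketch under that assumption (central differences with a hypothetical step size), reproducing the two examples above with made-up values.

```python
import math

def error_bound(g, values, errors, rel_step=1e-6):
    """Worst-case error: sum of |dg/dx_i| * error_i, with numerical derivatives."""
    total = 0.0
    for i, (value, error) in enumerate(zip(values, errors)):
        step = rel_step * max(abs(value), 1.0)
        upper = list(values)
        lower = list(values)
        upper[i] = value + step
        lower[i] = value - step
        derivative = (g(*upper) - g(*lower)) / (2.0 * step)   # central difference
        total += abs(derivative) * error
    return total

bound_product = error_bound(lambda x, y: x * y, [20.0, 5.0], [0.5, 0.1])   # |y|dx + |x|dy
bound_log = error_bound(math.log, [50.0], [1.0])                            # dx / x
print(bound_product, bound_log)
```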
Questions, exercises
2. Calculate the error (bound) for a tree height given the errors in the
distance measurement and in the top and base angle measurements.
3. Assume that the height error is dominated by the error in the angle α
between the top and the horizontal, and that this error is independent
of α (other errors are negligible). Show that the error is a minimum
when α = 45◦ .
by significant figures, or the relative error, are independent of the measure-
ment units: 3.24 m and 324 mm carry the same precision.
These relationships between figures and errors allow us to establish cer-
tain rules about the significant figures to be used in results from arithmetical
operations. The error in a sum or difference is dominated by the largest ab-
solute error in their components (as seen above, maximum errors add up;
other error measures combine with less weight on the smaller errors, as will
be seen below). Therefore, a rule is adopted to give the result with a num-
ber of decimal places equal to the least number of decimal places among the
terms added or subtracted:
123
32.3
+ 0.276
-------
156
In multiplication and division the same happens with the relative errors, so
that in the result the least number of significant figures among the factors
is used:
754.1 x 0.052 = 39
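Both rules are easy to apply in code. The helper below, a hypothetical function not taken from any library, rounds to a given number of significant figures; the sum is rounded to the least number of decimal places among its terms (zero here).

```python
import math

def round_sig(value, figures):
    """Round value to the given number of significant figures."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, figures - 1 - exponent)

product_result = round_sig(754.1 * 0.052, 2)    # 0.052 has two significant figures
sum_result = round(123 + 32.3 + 0.276)          # 123 has no decimal places
print(product_result, sum_result)
```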
Questions, exercises
1. Indicate the number of significant figures in: (a) 1.00025 (b) 0.002710
(c) 10.003 (d) 100000
(a) Assume now an error of about 1%. Repeat the profit calculation
using the appropriate number of significant figures. What can
you say about the profitability?
(b) With the 1% errors, obtain error limits by substituting the most
optimistic and most pessimistic values.
4. A sample variance can be computed as $\frac{1}{n} \sum (x_i - \bar{x})^2$, where $\bar{x}$ is the
mean $\frac{1}{n} \sum x_i$. It is often suggested to simplify calculations by using
the formula $\frac{1}{n} \sum x_i^2 - \bar{x}^2$.
PRACTICAL SITUATION              PROBABILISTIC MODEL
Uncertainty in x              →  X is a random variable
Weighting of possible values  →  density f(x)
Weighted mean of g(x)         →  expected value E[g(X)]
which is zero if X and Y are independent (more precisely, uncorrelated), and can
reach 1 if X and Y tend to vary jointly or −1 if they vary in opposite ways. Finally,
note that if a is not random,
V [X + a] = V [X] .
♥♥ The density f(x) that defines the probability for intervals on the x line
generalizes to higher-dimensional spaces. For instance, the joint density f(x, y)
applies to the plane of points specified by coordinate pairs (x, y). (These pairs and
their analogs in more dimensions can be seen as lists of numbers, or vectors). It is
said that the random variables X and Y are independent if their joint density is
of the form f (x, y) = f1 (x)f2 (y). A consequence that derives from the definition
of expectation as a multiple integral is that if the variables are independent, then
E[XY ] = E[X]E[Y ]. It is easily verified that this implies Cov[X, Y ] = 0. It may be
mentioned that zero covariance (uncorrelated variables) does not necessarily imply
independence, except in the important case of the Normal distribution.
We are ready now to examine error propagation. Let us see first the addition
case.
If errors act independently, it is seen that the standard error for the sum is
$$\sigma_{x+y} = \sqrt{\sigma_x^2 + \sigma_y^2} .$$
Measured this way, the error grows more slowly than the maximum error ∆.
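A small Python sketch may help to see the difference in growth. It assumes that the two-term formula is applied repeatedly to n independent terms, each with an error of size 1; the function names are illustrative only.

```python
from math import sqrt


def standard_error_of_sum(sigmas):
    """Standard error of a sum of independent terms."""
    return sqrt(sum(s * s for s in sigmas))


def maximum_error_of_sum(deltas):
    """Worst-case error bound of a sum: the bounds simply add up."""
    return sum(deltas)


for n in (1, 4, 16):
    errors = [1.0] * n  # n terms, each with an error of size 1
    print(n, maximum_error_of_sum(errors), standard_error_of_sum(errors))
```

Under these assumptions the maximum error grows in proportion to n, and the standard error only in proportion to the square root of n.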
For the general case we use, as before, the Taylor series:
In the derivatives we could have used the means or the observed values, instead of
the actual values $x_0$ and $y_0$. The approximations would still be valid, provided that
the errors are not too large.
Let us use this to calculate the standard error for a logarithm:
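$$\sigma_{\ln x}^2 = (1/x_0)^2 \sigma_x^2 .$$
Using the mean $\mu_x$ in place of $x_0$ and taking square roots,
$$\sigma_{\ln x} = \sigma_x / \mu_x .$$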
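The approximation for the logarithm can be checked with a simulation. The following Python sketch assumes a normally distributed x with mean 100 and standard error 2 (values chosen only for illustration) and compares the observed spread of ln x with the ratio of standard error to mean.

```python
import random
from math import log
from statistics import mean, pstdev

rng = random.Random(1)  # fixed seed so that the run is repeatable
x = [rng.gauss(100.0, 2.0) for _ in range(20000)]

sd_log = pstdev([log(v) for v in x])  # spread of ln x in the sample
approx = pstdev(x) / mean(x)          # sigma_x / mu_x

print(sd_log, approx)
```

With a coefficient of variation of about 2% the two numbers should agree closely; the agreement is expected to deteriorate as the relative error grows.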
Questions, exercises
1. Obtain an expression for the coefficient of variation of the product of two
independent variables X and Y as a function of the coefficients of variation
of the factors.
2. For the previous problem, graph CV(XY )/CV(X) over CV(Y )/CV(X) for
CV(X) > CV(Y ). What effect have the smaller and larger errors on the
error of the result? Implications for model building?
Appendix B
Regression
B.1 Matrices
An order m × n matrix is simply a table of numbers with m rows and n
columns:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = [a_{ij}] .$$
The $a_{ij}$ are the matrix elements. Instead of the square brackets, round
parentheses or double vertical lines are also used: $(a_{ij})$ or $\|a_{ij}\|$.
A vector is a list of numbers. In matrix algebra they are taken as one-
row matrices (row vector) or one-column matrices (column vector). Unless
stated otherwise, we shall assume columns. They are usually represented by
lower-case letters, often underlined or in bold-face:
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = [x_i] .$$
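The sum of two matrices is the matrix of sums of their elements:
$$A + B = [a_{ij}] + [b_{ij}] = [a_{ij} + b_{ij}] .$$
Multiplying a matrix by a scalar k multiplies each of its elements:
$$kA = k[a_{ij}] = [k a_{ij}] .$$
The product of two matrices is defined as
$$AB = [a_{ij}][b_{ij}] = \Big[\sum_k a_{ik} b_{kj}\Big] .$$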
That is, the element ij in the product is the sum of products of the elements
from row i of A and those from column j of B. Clearly, for the product to
be defined the number of columns in the first matrix must equal the number
of rows in the second one.
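The row-by-column rule can be written out directly. This is a minimal Python sketch, with matrices stored as lists of rows; the function name matmul is chosen here for illustration.

```python
def matmul(a, b):
    """Matrix product: element ij is the sum over k of a[i][k] * b[k][j]."""
    if len(a[0]) != len(b):
        raise ValueError("columns of the first must equal rows of the second")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))
```

For the example, the element in row 1 and column 1 is 1·5 + 2·7 = 19.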
Defining the product in this way is useful, for example, in handling
systems of linear equations. The system
where
e = [e1 e2 · · · en ] .
Even if two matrices have the proper dimensions for calculating the
products AB and BA, in general the results are different (the matrix product
is not commutative). Other than this, and that the operations are not always
possible (certain relationships between numbers of rows and columns must
be satisfied), the sum, difference, and product of matrices behave as the
corresponding operations on scalars. For instance,
$$A(B + C) = [a_{ij}][b_{ij} + c_{ij}] = \Big[\sum_k a_{ik}(b_{kj} + c_{kj})\Big] = \Big[\sum_k a_{ik}b_{kj} + \sum_k a_{ik}c_{kj}\Big] = AB + AC .$$
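Both statements, that the product is not commutative and that it distributes over the sum, can be checked numerically. The matrices in this Python sketch are arbitrary examples, and the helper names are assumptions of the illustration.

```python
def matmul(a, b):
    """Matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matadd(a, b):
    """Element-by-element sum of two matrices of the same order."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[1, 0], [2, 1]]

print(matmul(A, B), matmul(B, A))  # different results: AB is not BA
print(matmul(A, matadd(B, C)) == matadd(matmul(A, B), matmul(A, C)))
```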
Questions, exercises
are known as identity. They act as the number 1 among the scalars; mul-
tiplying any matrix and the identity (of the proper order) does not change
it:
IA = AI = A .
Now for an analogue to scalar division. In the same way as subtraction
may be seen as summing a negative, a − b = a + (−b), division may be seen
as multiplication with a reciprocal: $a/b = a(1/b) = ab^{-1}$, where $b^{-1}b = 1$.
With matrices, the analogue of a reciprocal is the inverse,
$$A^{-1}A = AA^{-1} = I .$$
Note that for this to make sense, A must be square (same number of rows
and columns). Even so, not all square matrices have an inverse. Those
that do not have one are called singular.
Using the inverse we could write the solution of the equation system
Ax = b given earlier:
$$x = A^{-1}b .$$
For this solution to exist, A must be square (m = n, that is, the number of
equations must equal the number of unknowns). In addition, for A not to
be singular, the equations must be “linearly independent” (there must be
no redundant equations). There are various methods for inverting matrices,
one of the most common being Gaussian elimination. This may be also used
to solve equation systems without computing the full inverse.
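As an illustration, the following Python sketch solves Ax = b by Gaussian elimination without forming the inverse. Row interchanges (partial pivoting) and the tolerance used to declare a matrix singular are choices made for this example, not something prescribed in these notes.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b], copied so that the inputs are not modified.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the row with the largest entry in this column to the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (or nearly so)")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate the unknown from the rows below.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


# 2x + y = 3 and x + 3y = 5
print(solve_linear_system([[2, 1], [1, 3]], [3, 5]))
```

A system with a redundant equation, such as x + 2y = 1 and 2x + 4y = 2, makes the function raise an error, which corresponds to a singular A.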
It is not difficult to verify the following properties:
$$\frac{d\,Ax}{dt} = A\frac{dx}{dt}$$
$$\frac{d\,a'x}{dx} = a$$
$$\frac{d\,x'x}{dx} = 2x$$
$$\frac{d\,x'Ax}{dx} = (A + A')x$$
etc. In general, the results are similar to those for scalars, taking into
account the non-commutativity of products.
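The last property can be verified numerically by comparing the formula with finite differences. In this Python sketch the matrix, the point x and the step h are arbitrary choices for illustration.

```python
def quadratic_form(a, x):
    """Value of x'Ax."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


def analytic_gradient(a, x):
    """Gradient from the formula (A + A')x."""
    n = len(x)
    return [sum((a[i][j] + a[j][i]) * x[j] for j in range(n)) for i in range(n)]


def numeric_gradient(a, x, h=1e-5):
    """Central finite differences of x'Ax with respect to each x_i."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((quadratic_form(a, up) - quadratic_form(a, down)) / (2 * h))
    return grad


A = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, -2.0]
print(analytic_gradient(A, x), numeric_gradient(A, x))
```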
[Figure: height (m) plotted against dbh (cm), left panel, and against ln(dbh), right panel.]
A curve like the one shown may be used for estimating the heights of trees
in the stand for which only the dbh is known. Clearly, knowing the dbh helps
in estimating the height, that is, contributes to reducing the uncertainty about
its value. The curve is a “model” that provides height values to be used in
place of the unknown ones, or that can serve as a summary description of the
observations. At any rate, it is convenient to have an equation for the curve
to facilitate its use, and the curve should pass “close” to the observations.
In some instances there are theoretical reasons that suggest a specific
kind of equation. In others, as in this example, the equation is purely
empirical, chosen with convenience and data-fitting criteria. In general, there
will be a class of equations or models y = f (x, b), where y is the depen-
dent variable, x is a vector of independent variables, and b is a vector
of parameters whose values will be determined for producing a good fit.
With a two-dimensional x we obtain a surface instead of a curve, and for
higher dimensions a hypersurface. To choose the equation form one may
use experience with similar problems, trial and error, graphs with transfor-
mations producing linear data trends, considerations about the form that
the curve should take for the extremes, etc. In the example we have used
H = f (D, b1 , b2 ) = b1 + b2 ln D, seeing in the right-hand-side scatter dia-
gram that the relationship between H and ln D is roughly linear (note in
passing that extrapolation to small diameters outside the range of the data
eventually produces negative heights). It would always be possible to choose
a curve that passes close to each one of the observations. Although in some
sense this would describe perfectly the observed data, in general much less
irregular curves, with a small number of parameters, will produce better
estimates for future or unobserved values.
Once the form of the equation to be tried is decided, it is necessary
to choose parameter values that result in a good fit. It can be assumed
that, for a given D, the difference between the unknown H and f (D, b1 , b2 )
would tend to be smaller if these differences are small for the observed
values. That is, b should be such that the absolute values of the deviations,
residuals or “errors” ei = Hi − f (Di , b1 , b2 ) are small for all the observations
(Di , Hi ). Obviously, if we try to reduce one ei beyond some point the other
ei will increase, so that we need some criterion that takes into account the
whole set of these. A possible criterion would be to minimize the sum of
absolute values $\sum |e_i|$ (“$L_1$-norm regression”). Another possibility would
be to minimize the largest error ($\min \max |e_i|$, the minimax criterion). The
criterion most commonly used, because of mathematical convenience and
of possessing in some instances certain statistical justifications that we will
examine later, is that of least-squares, which consists of minimizing $\sum e_i^2$.
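The three criteria can be computed side by side for any candidate curve. The Python sketch below uses made-up observed and fitted values; the function and key names are assumptions of the illustration.

```python
def fit_criteria(observed, fitted):
    """Return the three goodness-of-fit criteria for a set of residuals."""
    residuals = [y - f for y, f in zip(observed, fitted)]
    return {
        "sum_abs": sum(abs(e) for e in residuals),  # L1-norm criterion
        "max_abs": max(abs(e) for e in residuals),  # minimax criterion
        "sum_sq": sum(e * e for e in residuals),    # least-squares criterion
    }


# Hypothetical heights (m): observed, and predicted by some curve.
print(fit_criteria([10.0, 12.0, 15.0], [9.0, 14.0, 12.0]))
```

Here the residuals are 1, −2 and 3. The sum of squares gives proportionally more weight to the largest residual than the sum of absolute values does.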
We have then a model y = f(x, b), n observations $(y_i, x_i)$, i = 1, 2, ..., n,
and we look for a b such that it minimizes
$$\sum_{i=1}^n e_i^2 = \sum_{i=1}^n [y_i - f(x_i, b)]^2 .$$
Equivalently, we minimize the root mean square error (RMSE) $\sqrt{\frac{1}{n}\sum e_i^2}$,
which is a useful measure of goodness-of-fit. In general, this optimization
problem cannot be solved analytically, and it is necessary to resort to
iterative numerical optimization methods. An important exception occurs
when the model is a linear function of the parameters b. In this linear re-
gression situation, it is possible to obtain explicit solutions for the optimal
(least-squares) values of the parameters or coefficients.
Our example of H vs D is an instance of linear regression. It can be
written
y = b1 + b2 x ,
with y = H, x = ln D. This is a straight line, taking here the variable x as
predictor. In general, both y and x can be transformations of the original
variables. Ideally, the data would satisfy the n equations system
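$$\begin{aligned} y_1 &= b_1 + b_2 x_1 \\ y_2 &= b_1 + b_2 x_2 \\ &\;\;\vdots \\ y_n &= b_1 + b_2 x_n \end{aligned}$$
or, in matrix notation, with X the n × 2 matrix whose i-th row is $[1 \;\; x_i]$
and $b = [b_1 \;\; b_2]'$,
$$y = Xb .$$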
If we had n = 2, we would have a system of two equations in two unknowns
($b_1$ and $b_2$), usually with a unique solution. In matrix terms, y = Xb with X
square and invertible has the solution $b = X^{-1}y$.
With n > 2, in general not all the observations are collinear, and the
equation system is inconsistent. The objective is to find a b such that the
approximation y ≈ Xb is the best possible, in the sense of minimizing the
length |e| of the vector e = y − Xb computed from a generalization to n
dimensions of Pythagoras' theorem:
$$|e|^2 = \sum_{i=1}^n e_i^2 = e'e .$$
are used, in terms of which the solution can be written as $b = X^{-}y$. The
APL computer language to be used in the laboratories has generalized
inversion and generalized matrix division operators that make the
computation of linear regressions very simple. In APL notation, the matrix
product Xb is X+.×B (indicating that we are dealing with sums of products). The
coefficients can be obtained with the generalized inverse, B←(⌹X)+.×Y
or, preferably, with the generalized matrix division B←Y⌹X.
Before presenting the least-squares solution most commonly used in text-
books and manual calculations, let us examine the more general multiple
linear regression situation, where in contrast to the previous simple linear
regression example in which there was just one predictor x there are now p
predictors. The model is
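$$y = b_1 x_1 + b_2 x_2 + \cdots + b_p x_p = b'x = x'b .$$
For the n observations, with residuals $e_i$, this gives
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$
or, compactly,
$$y = Xb + e .$$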
The matrix equation is the same as before, again we have to minimize
$e'e$, and the direct factorization and APL solutions do not change. Almost
always a constant is included in the model, and then $x_1$ and the $x_{i1}$ equal 1.
The most usual explicit solution form is obtained as follows. To minimize
the sum of squares $Q = e'e$, we make the derivative equal to zero:
$$Q = e'e = (y - Xb)'(y - Xb) = y'y - 2y'Xb + b'X'Xb$$
$$\frac{dQ}{db} = -2X'y + 2X'Xb = 0 ,$$
which gives us the normal equations:
$$X'Xb = X'y .$$
Solving for b (when $X'X$ is non-singular),
$$b = (X'X)^{-1}X'y .$$
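The normal equations can be set up and solved with a few lines of code. The Python sketch below is only an illustration: the dbh and height values are invented, the function names are assumptions, and the system is solved by Gaussian elimination rather than by computing the inverse explicitly. Forming X'X is the textbook route; factorization methods are generally considered numerically safer for ill-conditioned problems.

```python
from math import log


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular (or nearly so)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def least_squares(x_rows, y):
    """Coefficients b that solve the normal equations X'X b = X'y."""
    p = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical sample: dbh in cm and height in m.
dbh = [14.0, 18.0, 23.0, 29.0, 36.0, 44.0]
height = [17.2, 19.5, 21.8, 23.1, 25.4, 26.9]

# Model H = b1 + b2 ln D: a column of ones and a column of ln(dbh).
X = [[1.0, log(d)] for d in dbh]
b1, b2 = least_squares(X, height)
print(b1, b2)
```

The same least_squares function handles any number p of predictors, since it only needs the rows of X.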
Questions, exercises
2. Verify that
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} .$$
Use this to obtain formulas for the two parameters in simple linear
regression (with the original variables).
3. Obtain formulas for simple linear regression using the centered vari-
ables (deviations from the means).
$$y_i = x_i'\beta + \varepsilon_i ,$$
where the $\varepsilon_i$ are uncorrelated random variables with mean 0 and unknown
variance $\sigma^2$. That is,
$$E[\varepsilon] = 0, \quad V[\varepsilon] = \sigma^2 I ,$$
where V [·] is the covariance matrix. The xi are known predictor vectors,
and β is a vector of unknown parameters to be estimated.
We look for an unbiased estimator $\hat\beta = b$, i.e. E[b] = β, with a
variance as small as possible. Let us restrict the search also to estimators
that are linear functions of the observations, b = Ay for some matrix A.
Then, the Gauss-Markov theorem says that for the linear minimum variance
unbiased estimator $A = (X'X)^{-1}X'$. This is the least-squares estimator.
The restriction to estimators that are linear on the observations may
seem somewhat arbitrary. If we add the assumption that the deviations
follow a normal distribution, the least squares criterion is obtained through
a different route. Let the model, not necessarily linear, be
$$y_i = f(x_i, \beta) + \varepsilon_i$$
with the $\varepsilon_i$ normal, with mean 0, variance $\sigma^2$, and independent. That is,
$$y = f(X, \beta) + \varepsilon ,$$
$$\varepsilon \sim N(0, \sigma^2 I) .$$
The likelihood function is the probability of the model generating data
like the observed. The maximum likelihood (ML) estimation method consists
of estimating the unknown parameters as the values that maximize this
function. Besides being intuitively reasonable, the ML estimators have a
number of desirable statistical properties, especially in large samples.
Here the likelihood function equals the joint probability density of the $y_i$,
considered as a function of β and $\sigma^2$. From the independence assumption,
the joint density is the product of the (normal) densities of each $y_i$:
$$L = f_1(y_1) f_2(y_2) \cdots f_n(y_n)$$
with
$$f_i(y_i) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{\{y_i - f(x_i, \beta)\}^2}{2\sigma^2}\right] .$$
The likelihood is then
$$L = \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^n \exp\left[-\frac{\sum \varepsilon_i^2}{2\sigma^2}\right] .$$
Clearly, the β that maximizes L is that which minimizes the sum $\sum \varepsilon_i^2$. We
conclude that, under this model, the ML estimator of β is the least-squares
estimator.
It is also found, taking the derivative of L with respect to $\sigma^2$ and making
it equal to zero, that the ML estimator of $\sigma^2$ is the mean square error (MSE)
$\sum \hat\varepsilon_i^2/n = \sum e_i^2/n$, the square of the RMSE. The expected value of $\sum e_i^2$, for
linear models, turns out to be $(n - p)\sigma^2$, so that the MSE is biased. It is
customary to use the unbiased estimator $SE^2 = \sum e_i^2/(n - p)$ for the residual
variance $\sigma^2$, and the standard error SE as estimator for σ.
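The two variance estimators differ only in the divisor. A minimal Python sketch, with invented residuals and an illustrative function name:

```python
from math import sqrt


def residual_variance_estimates(residuals, p):
    """Return (MSE, SE^2) for residuals from a linear model with p parameters."""
    n = len(residuals)
    if n <= p:
        raise ValueError("need more observations than parameters")
    sum_sq = sum(e * e for e in residuals)
    mse = sum_sq / n        # ML estimator, biased
    se2 = sum_sq / (n - p)  # unbiased estimator
    return mse, se2


# Hypothetical residuals from a straight-line fit (p = 2).
mse, se2 = residual_variance_estimates([1.0, -1.0, 1.0, -1.0], 2)
print(mse, sqrt(mse), se2, sqrt(se2))
```

With few observations the difference is large, as in this example; it becomes negligible when n is much larger than p.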
Another goodness-of-fit indicator often used, incorrectly, is the coefficient
of determination $R^2 = 1 - \mathrm{MSE}/S_y^2$, where $S_y^2 = \sum (y_i - \bar{y})^2/n$ is the
variance of the observations $y_i$ when the predictors are ignored. For comparing
models with the same data, $R^2$ provides the same information as the MSE
or RMSE. With different data sets, however, an $R^2$ close to one does not
necessarily imply a tight relationship or a good model. Among other things,
the total variance depends on how the sample has been selected, and unless
this can be considered as a random sample from a multivariate distribution,
it does not represent a characteristic of the population.
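The dependence of the coefficient of determination on the sample can be seen in a small example. In this Python sketch the two data sets are invented so that the residuals, and hence the MSE, are identical, while the range covered by the observations differs.

```python
def r_squared(observed, fitted):
    """Coefficient of determination, 1 - MSE / S_y^2."""
    n = len(observed)
    mse = sum((y - f) ** 2 for y, f in zip(observed, fitted)) / n
    y_mean = sum(observed) / n
    s2_y = sum((y - y_mean) ** 2 for y in observed) / n
    return 1.0 - mse / s2_y


# Same residuals (-1, 1, -1, 1) in both cases, so the same MSE.
narrow = r_squared([10.0, 12.0, 11.0, 13.0], [11.0, 11.0, 12.0, 12.0])
wide = r_squared([10.0, 22.0, 31.0, 43.0], [11.0, 21.0, 32.0, 42.0])
print(narrow, wide)
```

The fit is equally tight in both cases in terms of MSE, but the sample covering the wider range gives a value much closer to one.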
Questions, exercises
x 1 2 3 4 5 6 7 8 9 10
y 1 4 9 16 25 36 49 64 81 100
2. Compute $R^2$.
so that b is an unbiased estimator. The same happens with any function linear
in the parameters, and, in particular, the expected value of the prediction
$\hat{y}(x) = x'b$ equals $y(x) = x'\beta$ for any x.
Because the covariance matrix V[Az] for a linear transformation is $A V[z] A'$, it
is found that
$$V[b] = \sigma^2 (X'X)^{-1} .$$
If ε is normal, this and the fact that any linear transformation of a normal vector
is normal allow us to obtain confidence intervals and hypothesis tests for linear
functions of b.
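For the straight line y = b1 + b2 x the covariance matrix of b can be written out with the formula for the inverse of a 2 × 2 matrix. The Python sketch below assumes that a value for the residual variance is available (in practice it would typically be replaced by an estimate); the function name and the numbers are illustrative.

```python
def coefficient_covariance(x, sigma2):
    """sigma^2 (X'X)^-1 for the model y = b1 + b2 x."""
    n = len(x)
    sx = sum(x)
    sxx = sum(v * v for v in x)
    # X'X = [[n, sx], [sx, sxx]]; its determinant must not vanish.
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("all x values are equal; X'X is singular")
    return [
        [sigma2 * sxx / det, -sigma2 * sx / det],
        [-sigma2 * sx / det, sigma2 * n / det],
    ]


V = coefficient_covariance([0.0, 1.0, 2.0], 1.0)
print(V)
```

The diagonal elements are the variances of b1 and b2, and the off-diagonal element is their covariance.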
Obviously, in real life these statistical models cannot be expected to be
fulfilled exactly. But it can be expected that the more we approach the
assumptions, the better the estimators will be. For instance, if it is seen
that the scatter of the residuals is not quite uniform (heteroscedasticity), it
would be advisable to employ some transformation that changes this situa-
tion. Another possible problem is the presence of autocorrelation (correla-
tion among consecutive measurements). In particular, hypothesis tests are
subject to the plausibility of the statistical model.
♥ Generalized least squares. Assume that in the linear model the covariance
matrix for ε has the form $\sigma^2 W$, with a known matrix $W \neq I$. Maintaining
the other assumptions, it is then found that both the minimum variance unbiased
and the ML estimator are obtained by minimizing $e'W^{-1}e$. The solution is
$b = (X'W^{-1}X)^{-1}X'W^{-1}y$.
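A common special case is a diagonal W, that is, uncorrelated deviations with unequal variances. The Python sketch below treats only that case, for the straight line y = b1 + b2 x; the function name, data and weights are assumptions made for the illustration.

```python
def weighted_line_fit(x, y, w):
    """Fit y = b1 + b2 x minimizing sum(e_i^2 / w_i), i.e. W = diag(w)."""
    v = [1.0 / wi for wi in w]  # diagonal of the inverse of W
    # Elements of X'W^-1 X and X'W^-1 y for X with rows [1, x_i].
    s0 = sum(v)
    s1 = sum(vi * xi for vi, xi in zip(v, x))
    s2 = sum(vi * xi * xi for vi, xi in zip(v, x))
    t0 = sum(vi * yi for vi, yi in zip(v, y))
    t1 = sum(vi * xi * yi for vi, xi, yi in zip(v, x, y))
    det = s0 * s2 - s1 * s1
    if det == 0:
        raise ValueError("X'W^-1 X is singular")
    b1 = (s2 * t0 - s1 * t1) / det
    b2 = (s0 * t1 - s1 * t0) / det
    return b1, b2


# Variance assumed to increase with x (hypothetical weights).
print(weighted_line_fit([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8],
                        [1.0, 2.0, 3.0, 4.0]))
```

With all the w equal the result reduces to ordinary least squares.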
A good introduction to statistical inference is found in Chapter 2 of
Graybill, for which there is a Spanish translation among the course materials.
A general text with a good treatment of linear regression is Peña Sánchez de
Rivera, D. “Estadı́stica, Modelos y Métodos” (2 Vols.), Alianza Editorial,
Madrid, 1992.