BIOSTATISTICS

The definition of bio-statistics

Uploaded by

tinagold184
Copyright
© © All Rights Reserved

BIOSTATISTICS

Biostatistics is the application of statistical principles to questions and problems in medicine,
public health, or biology. It is the science of collecting and analyzing biological or health data using
statistical methods. Biostatistics may be used to help learn the possible causes of a particular
disease, such as cancer, and how often a disease occurs in a certain group of people. The field is
also called biometry.

The National Cancer Institute defines biostatistics as “the science of collecting and analyzing
biologic or health data using statistical methods.” The use of statistics in health care dates back
more than a century to the earliest application of the scientific method in medical research.
Many health care decisions are based in small or large part on the results of biostatistical
research. The application of statistics to biological and medical data has had a tremendous
impact on the provision of health care and the prevention of disease. The accurate interpretation of
biostatistical data can serve as the foundation for efforts to improve public health and the
quality of patient care. As with many burgeoning technologies, however, there is much
uncertainty among nursing professionals about the role of biostatistics in health care.
Familiarity with statistical principles helps nurses understand and evaluate the results of health
care studies. It also enables them to participate in medical research projects and communicate
the results of biostatistical research to patients and other health care workers in ways that are
easy for them to understand. The analytical and interpretive concepts underlying biostatistics
correspond to many advanced nursing practices, including the creation of more efficient care
delivery systems and the development of individualized care strategies intended to improve
patient outcomes.

Nurses must rely on their training and experience to determine the most effective ways to apply
the knowledge gained from biostatistical research to ensure that it contributes to a cost-effective
and patient-centered solution. Research reported in the Journal of Nursing Education and
Practice found that training nurses in the effective use of biostatistics not only demystifies the
science of statistics, but also makes nurses more efficient and effective by enabling them to apply
research results directly in their practice. In summary, biostatistics is the discipline concerned
with how we ought to make decisions when analyzing biomedical data. It is the evolving
discipline concerned with formulating explicit rules to compensate both for the fallibility of
human intuition in general and for biases in study design in particular.

Glossary of Terms

Statistics - a set of concepts, rules, and procedures that help us to:


o organize numerical information in the form of tables, graphs, and charts;
o understand statistical techniques underlying decisions that affect our lives and
well-being; and
o make informed decisions.

Data - facts, observations, and information that come from investigations.


o Measurement data, sometimes called quantitative data -- the result of using some
instrument to measure something (e.g., test score, weight);
o Categorical data, also referred to as frequency or qualitative data -- things are
grouped according to some common property (or properties) and the number of
members in each group is recorded (e.g., males/females, vehicle type).
Variable - property of an object or event that can take on different values. For example,
college major is a variable that takes on values like mathematics, computer science,
English, psychology, etc.
o Discrete Variable - a variable with a limited number of values (e.g., gender
(male/female), college class (freshman/sophomore/junior/senior)).
o Continuous Variable - a variable that can take on many different values, in
theory, any value between the lowest and highest points on the measurement
scale.
o Independent Variable - a variable that is manipulated, measured, or selected by
the researcher as an antecedent condition to an observed behavior. In a
hypothesized cause-and-effect relationship, the independent variable is the
cause and the dependent variable is the outcome or effect.
o Dependent Variable - a variable that is not under the experimenter's control --
the data. It is the variable that is observed and measured in response to the
independent variable.
o Qualitative Variable - a variable based on categorical data.
o Quantitative Variable - a variable based on quantitative data.

Graphs - visual displays of data used to present frequency distributions so that the
shape of the distribution can easily be seen.

o Bar graph - a form of graph that uses bars separated by an arbitrary amount of
space to represent how often elements within a category occur. The higher the
bar, the higher the frequency of occurrence. The underlying measurement scale
is discrete (nominal or ordinal-scale data), not continuous.
o Histogram - a form of bar graph used with interval or ratio-scaled data. Unlike
the bar graph, bars in a histogram touch, with the width of each bar defined by
the upper and lower limits of its interval. The measurement scale is
continuous, so the lower limit of any one interval is also the upper limit of the
previous interval.
o Boxplot - a graphical representation of dispersion and extreme scores.
Represented in this graphic are the minimum, maximum, and quartile scores in the
form of a box with "whiskers." The box spans the range of scores falling
into the middle 50% of the distribution (interquartile range, IQR = 75th percentile -
25th percentile), and the whiskers are lines extended either to the minimum and
maximum scores in the distribution or to the most extreme scores inside
mathematically defined lower and upper fences (25th percentile - 1.5*IQR and
75th percentile + 1.5*IQR).
o Scatterplot - a form of graph that presents information from a bivariate
distribution. In a scatterplot, each subject in an experimental study is
represented by a single point in two-dimensional space. The underlying scale
of measurement for both variables is continuous (measurement data). This is
one of the most useful techniques for gaining insight into the relationship
between two variables.
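
The boxplot quantities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name boxplot_summary and the sample scores are hypothetical, and the quartiles use the "inclusive" interpolation method of Python's statistics module, which is one of several conventions (other software may report slightly different quartiles). Whiskers are drawn here to the most extreme scores that fall inside the fences.

```python
import statistics

def boxplot_summary(scores):
    """Return the quartiles, IQR, fences, whisker ends, and outliers (hypothetical helper)."""
    ordered = sorted(scores)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1                      # spread of the middle 50% of scores
    lower_fence = q1 - 1.5 * iqr       # scores below this are flagged as extreme
    upper_fence = q3 + 1.5 * iqr       # scores above this are flagged as extreme
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "q1": q1, "median": q2, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whisker_low": min(inside), "whisker_high": max(inside),
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }

# Illustrative scores only; the value 30 lies beyond the upper fence.
summary = boxplot_summary([1, 3, 4, 5, 8, 9, 9, 30])
print(summary)
```

In this made-up sample the score 30 is reported as an outlier, so the upper whisker stops at 9 instead of stretching to the maximum.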

Measures of Center - Plotting data in a frequency distribution shows the general shape
of the distribution and gives a general sense of how the numbers are bunched. Several
statistics can be used to represent the "center" of the distribution. These statistics are
commonly referred to as measures of central tendency.
o Mode - The mode of a distribution is simply defined as the most frequent or
common score in the distribution. The mode is the point or value of X that
corresponds to the highest point on the distribution. If the highest frequency is
shared by more than one value, the distribution is said to be multimodal. It is
not uncommon to see distributions that are bimodal reflecting peaks in scoring
at two different points in the distribution.
o Median - The median is the score that divides the distribution into halves; half of
the scores are above the median and half are below it when the data are
arranged in numerical order. The median is also referred to as the score at the
50th percentile in the distribution. The median location of N numbers can be
found by the formula (N + 1) / 2. When N is an odd number, the formula yields
an integer that gives the position of the median in the numerically ordered
distribution. For example, in the distribution of
numbers (3 1 5 4 9 9 8) the median location is (7 + 1) / 2 = 4. When applied to
the ordered distribution (1 3 4 5 8 9 9), the value 5 is the median; three scores
are above 5 and three are below 5. If there were only 6 values (1 3 4 5 8 9), the
median location would be (6 + 1) / 2 = 3.5. In this case the median is halfway between
the 3rd and 4th scores (4 and 5), or 4.5.
o Mean - The mean is the most common measure of central tendency and the one
that can be mathematically manipulated. It is defined as the average of a
distribution and is equal to ΣX / N. Simply, the mean is computed by summing
all the scores in the distribution (ΣX) and dividing that sum by the total number
of scores (N). The mean is the balance point in a distribution such that if you
subtract the mean from each value in the distribution and sum all of these
deviation scores, the result will be zero.
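
As a rough sketch of the three measures of center, the following Python reuses the example distribution (3 1 5 4 9 9 8) from the median entry. The function names modes, median, and mean are illustrative choices, not part of the original notes, and the median follows the (N + 1) / 2 location rule described above.

```python
from collections import Counter

def modes(scores):
    """Return every value that shares the highest frequency (more than one if multimodal)."""
    counts = Counter(scores)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)

def median(scores):
    """Locate the median with the (N + 1) / 2 rule on the ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    location = (n + 1) / 2             # 1-based position in the ordered list
    if n % 2 == 1:
        return ordered[int(location) - 1]
    # Even N: the location ends in .5, so average the two neighbouring scores.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

scores = [3, 1, 5, 4, 9, 9, 8]
print(modes(scores))                   # [9]
print(median(scores))                  # 5
print(median([1, 3, 4, 5, 8, 9]))      # 4.5
print(mean(scores))

# The deviations from the mean sum to zero, up to floating-point rounding.
deviation_sum = sum(x - mean(scores) for x in scores)
print(abs(deviation_sum) < 1e-9)       # True
```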

Measures of Spread - Although the average value in a distribution is informative about
how scores are centered in the distribution, the mean, median, and mode alone give no
context for interpreting those statistics. Measures of variability provide information about the
degree to which individual scores are clustered about or deviate from the average value
in a distribution.
o Range - The simplest measure of variability to compute and understand is the
range. The range is the difference between the highest and lowest score in a
distribution. Although it is easy to compute, it is not often used as the sole
measure of variability due to its instability. Because it is based solely on the
most extreme scores in the distribution and does not fully reflect the pattern of
variation within a distribution, the range is a very limited measure of variability.
o Interquartile Range (IQR) - Provides a measure of the spread of the middle
50% of the scores. The IQR is defined as the 75th percentile minus the 25th
percentile. The interquartile range plays an important role in the graphical
method known as the boxplot. The advantages of the IQR are that it is easy
to compute and that extreme scores in the distribution have much less impact on it.
Its strength is also a weakness, however: it suffers as a measure of variability because
it discards too much data. Researchers want to study variability while
eliminating scores that are likely to be accidents. The boxplot allows for
this distinction and is an important tool for exploring data.
o Variance - The variance is a measure based on the deviations of individual scores
from the mean. As noted in the definition of the mean, however, simply
summing the deviations will result in a value of 0. To get around this problem
the variance is based on squared deviations of scores about the mean. Squaring
eliminates the negative values, while scores that lie farther from the mean
still contribute larger amounts. Then, to control
for the number of subjects in the distribution, the sum of the squared deviations,
Σ(X - X̄)², where X̄ is the mean, is divided by N (population) or by N - 1 (sample).
The result is the average of the squared deviations, and it is called the variance.
o Standard deviation - The standard deviation (s or σ) is defined as the positive
square root of the variance. The variance is a measure in squared units and has
little meaning with respect to the data. Thus, the standard deviation is a
measure of variability expressed in the same units as the data. The standard
deviation is very much like a mean or an "average" of these deviations. In a
normal (symmetric and mound-shaped) distribution, about two-thirds of the
scores fall between +1 and -1 standard deviations from the mean and the
standard deviation is approximately 1/4 of the range in small samples (N < 30)
and 1/5 to 1/6 of the range in large samples (N > 100).
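
The range, variance, and standard deviation can be computed directly from their definitions. The sketch below is a minimal illustration: the function name spread, the sample flag, and the scores are assumptions chosen so the arithmetic is easy to follow by hand (the mean is 5 and the squared deviations sum to 32).

```python
import math

def spread(scores, sample=True):
    """Return the range, variance, and standard deviation of a list of scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Squaring removes the signs that would otherwise make the deviations sum to zero.
    sum_squared_deviations = sum((x - mean) ** 2 for x in scores)
    divisor = n - 1 if sample else n   # N - 1 for a sample, N for a population
    variance = sum_squared_deviations / divisor
    return {
        "range": max(scores) - min(scores),
        "variance": variance,
        "sd": math.sqrt(variance),     # back in the same units as the data
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]      # illustrative values only
population = spread(scores, sample=False)
sample = spread(scores, sample=True)
print(population)                      # variance 4.0, sd 2.0, range 7
print(sample)                          # variance 32 / 7, slightly larger
```

Dividing by N - 1 gives a slightly larger value than dividing by N, and the difference shrinks as the number of scores grows.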

Measures of Shape - For distributions summarizing data from continuous measurement
scales, statistics can be used to describe how the distribution rises and drops.
o Symmetric - Distributions that have the same shape on both sides of the center
are called symmetric. The normal distribution is the most familiar example of a
symmetric distribution with only one peak, although not every symmetric,
single-peaked distribution is normal.
o Skewness - Refers to the degree of asymmetry in a distribution. Asymmetry
often reflects extreme scores in a distribution.
▪ Positively skewed - A distribution is positively skewed when it has a tail
extending out to the right (larger numbers). When a distribution is
positively skewed, the mean is greater than the median, reflecting the
fact that the mean is sensitive to each score in the distribution and is
subject to large shifts when the sample is small and contains extreme
scores.
▪ Negatively skewed - A negatively skewed distribution has an extended tail
pointing to the left (smaller numbers) and reflects bunching of numbers
in the upper part of the distribution with fewer scores at the lower end of
the measurement scale.
o Kurtosis - Like skewness, kurtosis has a specific mathematical definition, but
generally it refers to how scores are concentrated in the center of the
distribution, the upper and lower tails (ends), and the shoulders (between the
center and tails) of a distribution.
▪ Mesokurtic - A normal distribution is called mesokurtic. The tails of a
mesokurtic distribution are neither too thin nor too thick, and there are
neither too many nor too few scores in the center of the distribution.
▪ Platykurtic - Starting with a mesokurtic distribution and moving scores
from both the center and tails into the shoulders, the distribution flattens
out and is referred to as platykurtic.
▪ Leptokurtic - If you move scores from the shoulders of a mesokurtic
distribution into the center and tails of a distribution, the result is a
peaked distribution with thick tails. This shape is referred to as
leptokurtic.
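
The glossary does not give formulas for skewness or kurtosis, so the sketch below assumes one common moment-based definition: skewness as the third central moment divided by the 1.5 power of the second, and excess kurtosis as the fourth central moment divided by the square of the second, minus 3 (so a normal distribution scores about 0). The function names and data are illustrative, and statistical packages often apply small-sample corrections that give slightly different values.

```python
def central_moment(scores, k):
    """Average of the k-th powers of the deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** k for x in scores) / len(scores)

def skewness(scores):
    """Positive for a tail to the right, negative for a tail to the left."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Roughly 0 for mesokurtic, below 0 for platykurtic, above 0 for leptokurtic shapes."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

right_tailed = [1, 2, 2, 3, 3, 3, 4, 10]   # one extreme high score
symmetric = [1, 2, 3, 4, 5]                 # flat and evenly spaced

print(skewness(right_tailed))               # positive: tail extends to the right
print(sum(right_tailed) / len(right_tailed))  # mean of 3.5 exceeds the median of 3
print(skewness(symmetric))                  # zero for a symmetric set of scores
print(excess_kurtosis(symmetric))           # negative: flatter than a normal curve
```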
