WHAT IS BIOSTATISTICS?
CLASSIFICATION OF DATA
BY PRABHJOT KAUR NAHAR
22011259
FYIP BIOLOGICAL SCIENCES
HUMAN GENETICS (SEM 6)
BIOSTATISTICS
WHAT IS BIOSTATISTICS?
Biostatistics is the branch of statistics that deals with the application of statistical methods to
biological, medical, and health-related research.
Key Points:
• Helps in designing biological experiments.
• Collects, analyzes, and interprets biological data.
• Used in public health, genetics, clinical trials, etc.
Importance of Biostatistics
• Assists in medical decision-making.
• Identifies disease risk factors.
• Evaluates effectiveness of treatments.
• Predicts health outcomes.
DATA
Definition:
A set of values recorded on one or more observational units. Data are the raw materials of statistics.
• Data set: A collection of data.
• Data point: A single observation.
• Raw data: Information before it is arranged and analysed.
Sources of data:
1. Experiments
2. Surveys
3. Records
TYPES OF DATA
1. Qualitative Data (Categorical):
• Describes categories or qualities.
• Examples: Gender, Blood Type, Disease Type.
2. Quantitative Data (Numerical):
• Represents measurable quantities.
• Examples: Height, Weight, Blood Pressure.
CLASSIFICATION OF DATA
Data can be classified based on:
1. Nature
2. Measurement Scale
3. Data Collection Method
Classification by Nature
1. Qualitative (Categorical):
• Nominal: No order (e.g., blood group)
• Ordinal: Ordered categories (e.g., pain severity: mild, moderate, severe)
2. Quantitative (Numerical):
• Discrete: Countable values (e.g., number of children)
• Continuous: Any value within a range (e.g., weight)
Classification by Measurement Scale
1. Nominal Scale
2. Ordinal Scale
3. Interval Scale
4. Ratio Scale
Each scale determines what kind of statistical analysis is appropriate.
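As a rough illustration of how the scale guides the choice of summary statistic, the sketch below maps each scale to the measure of central tendency conventionally reported for it. The lookup table and the function name `suggested_summary` are illustrative assumptions, not part of the original slides, and the mapping is a common rule of thumb rather than a strict rule.

```python
# Hypothetical lookup: the measure of central tendency that is
# conventionally reported for each measurement scale.
CENTRAL_TENDENCY = {
    "nominal": "mode",     # categories can only be counted
    "ordinal": "median",   # categories can be ranked
    "interval": "mean",    # differences are meaningful
    "ratio": "mean",       # differences and ratios are meaningful
}


def suggested_summary(scale: str) -> str:
    """Return the conventional central-tendency measure for a scale."""
    return CENTRAL_TENDENCY[scale.lower()]


for scale in ("Nominal", "Ordinal", "Interval", "Ratio"):
    print(f"{scale}: {suggested_summary(scale)}")
```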
Classification by Data Collection
1. Primary Data: Collected first-hand (e.g., surveys, experiments)
2. Secondary Data: Collected from existing sources (e.g., records, reports)
QUALITATIVE DATA
Qualitative Data (Categorical Data)
Qualitative data represent characteristics or attributes that cannot be measured numerically, but can be categorized or labeled.
Types of Qualitative Data:
1. Nominal Data:
• Simple categories with no natural order.
• Example: Blood group (A, B, AB, O), Gender (Male/Female/Other), Marital Status.
2. Ordinal Data:
• Categories that have a specific order, but differences between them are not measurable.
• Example:
• Pain scale (Mild, Moderate, Severe)
• Education level (Primary, Secondary, Graduate)
Key Features:
• Arithmetic operations cannot be performed.
• Used to label or classify data.
NOMINAL DATA
Nominal data is a type of categorical data where values are used to label or name different categories, but the categories don’t have any meaningful order or ranking.
Key Features of Nominal Data:
• Categories are names or labels
• No natural order (unlike ordinal data)
• Usually used for classification
Examples:
• Colors: Red, Blue, Green
• Gender: Male, Female
• Country: India, USA, Japan
• Yes/No responses: Yes, No
You can count how many times a category occurs, but you can’t add, subtract, or compare the values mathematically.
ORDINAL DATA
Ordinal data is another type of categorical data, but unlike nominal data, it has a clear, meaningful order or ranking between categories, even though the differences between them are not measurable.
Key Features of Ordinal Data:
• Categories have a logical order
• Differences between ranks are not equal or known
• Still not used for mathematical calculations like mean
Examples:
• Education Level: High School < Bachelor’s < Master’s < PhD
• Customer Satisfaction: Very Unsatisfied < Unsatisfied < Neutral < Satisfied < Very Satisfied
• Class Rank: 1st, 2nd, 3rd (but we don’t know by how much they differ)
So, in simple terms:
• Nominal = categories without order
• Ordinal = categories with order but unknown spacing
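The contrast between nominal and ordinal data can be sketched in a few lines of Python. The sample values below are made up for illustration, and the names `PAIN_ORDER`, `most_common_group`, and `median_pain` are hypothetical. Nominal values are only counted, while ordinal values are ranked using an explicitly declared order.

```python
from collections import Counter

# Nominal: blood groups can only be counted; the mode is the usual summary.
blood_groups = ["A", "O", "B", "O", "AB", "O", "A"]  # made-up sample
counts = Counter(blood_groups)
most_common_group = counts.most_common(1)[0][0]

# Ordinal: pain severity has an order, so values can be ranked and a
# middle (median) category reported. The spacing between ranks is unknown,
# so no mean is computed.
PAIN_ORDER = ["Mild", "Moderate", "Severe"]
pain = ["Severe", "Mild", "Moderate", "Mild", "Severe"]  # made-up sample
ranked = sorted(pain, key=PAIN_ORDER.index)
median_pain = ranked[len(ranked) // 2]  # odd-length sample: middle element

print(counts)
print(most_common_group)
print(ranked)
print(median_pain)
```

Declaring the order in a list keeps the ranking explicit instead of relying on alphabetical sorting, which would not match the intended order for most ordinal categories.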
QUANTITATIVE DATA
Quantitative Data (Numerical Data)
Quantitative data consist of numerical values that can be measured and analyzed mathematically.
Types of Quantitative Data:
1. Discrete Data:
• Countable and comes in whole numbers.
• Example: Number of children in a family, Number of patients.
2. Continuous Data:
• Can take any value within a range and may include fractions/decimals.
• Example: Height, Weight, Blood Pressure, Temperature.
Key Features:
• Arithmetic operations are applicable.
• Used for statistical calculations like mean, median, standard deviation, etc.
DISCRETE DATA
Discrete data is a type of data that can only take on specific, separate values. These values are countable and often whole numbers. You can’t have values in between them.
Examples:
• Number of students in a class (you can’t have 25.5 students)
• Number of cars in a parking lot
• Shoe sizes (depending on the scale, these are often discrete like 7, 7.5, 8, etc.)
In contrast, continuous data can take on any value within a range (like height, weight, or temperature).
CONTINUOUS DATA
Continuous data refer to numerical values that can take any value within a given range. This means the data is measurable, not countable, and it can include fractions and decimals.
Examples of Continuous Data:
• Height: 170.5 cm, 161.2 cm
• Weight: 65.3 kg, 72.8 kg
• Time: 1.5 hours, 2.25 minutes
Use in Biostatistics:
• Continuous data is very common in medical research, clinical trials, and health studies.
Used to calculate:
• Mean (Average), Standard Deviation
• Correlation & Regression, Normal Distribution
Visual Representation:
• Histograms, Line graphs, Box plots
Continuous data is of 2 types:
1. Interval
2. Ratio
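A minimal sketch of the discrete versus continuous distinction, using Python's standard `statistics` module. The sample values are illustrative assumptions (the first two heights reuse the figures quoted above; the rest are made up), and the variable names are hypothetical.

```python
import statistics

# Discrete: number of children per family (countable whole numbers).
children_per_family = [0, 2, 1, 3, 2, 2]  # made-up sample

# Continuous: heights in cm (any value in a range, decimals allowed).
heights_cm = [170.5, 161.2, 158.0, 175.3]  # partly made-up sample

# Discrete counts are stored as integers; the mode is a natural summary.
all_whole = all(isinstance(n, int) for n in children_per_family)
typical_family = statistics.mode(children_per_family)

# Continuous measurements support the mean and standard deviation.
mean_height = statistics.mean(heights_cm)
sd_height = statistics.stdev(heights_cm)  # sample standard deviation

print(all_whole, typical_family)
print(round(mean_height, 2), round(sd_height, 2))
```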
INTERVAL DATA
Interval data are numerical data where the difference between values is meaningful, but there is no true zero point.
• Equal Intervals: The difference between values is consistent (e.g., the difference between 10°C and 20°C is the same as between 20°C and 30°C).
• No Absolute Zero: Zero doesn’t mean absence of the quantity. For example, 0°C does not mean ‘no temperature.’
• Addition and subtraction are meaningful, but ratios are not.
Examples:
• Temperature (Celsius or Fahrenheit): 0°C doesn’t mean “no heat.”
• Dates/Calendar Years: Difference between 2000 and 2010 is 10 years, but year 0 doesn’t mean time started.
• IQ scores: You can say one IQ is 20 points higher than another, but not “twice as intelligent.”
Statistical Operations Allowed:
• Mean, Median, Standard Deviation
• Cannot say “twice as much” because no true zero
RATIO DATA
Ratio data are numerical data with equal intervals and a true zero point, meaning zero indicates the absence of the quantity.
Key Features:
• Equal Intervals (like interval scale)
• True Zero Exists (0 = absence)
• All mathematical operations are meaningful: addition, subtraction, multiplication, division
• You can say “twice as much”
Examples:
• Height: 0 cm = no height
• Weight: 0 kg = no weight
• Age: 0 years = no age
• Heart rate, Blood pressure (if measured from 0), Distance, Income
Statistical Operations Allowed:
• Mean, Median, Mode, Range, Standard Deviation
• Ratios are valid (e.g., 60 kg is twice as heavy as 30 kg)
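Why ratios fail on an interval scale but hold on a ratio scale can be checked with a little arithmetic. The sketch below is illustrative only: the helper `celsius_to_kelvin` and the constant `KG_TO_LB` are names introduced here, and Kelvin is used as an example of a temperature scale that does have a true zero.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return c + 273.15


# Interval scale: the naive Celsius ratio suggests "twice as hot" ...
naive_ratio = 20 / 10
# ... but on a scale with a true zero the ratio is only about 1.035.
kelvin_ratio = celsius_to_kelvin(20) / celsius_to_kelvin(10)

# Differences on an interval scale are still meaningful and equal.
diff_a = 20 - 10
diff_b = 30 - 20

# Ratio scale: weight has a true zero, so the ratio survives a unit change.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_kg = 60 / 30
ratio_lb = (60 * KG_TO_LB) / (30 * KG_TO_LB)

print(naive_ratio, round(kelvin_ratio, 3))
print(diff_a == diff_b)
print(ratio_kg, round(ratio_lb, 6))
```

The Celsius ratio changes when the zero point moves, which is why "twice as much" is not a valid statement for interval data, while the weight ratio stays at 2 in either unit.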
ON THE BASIS OF SOURCE OF INFORMATION/DATA
PRIMARY DATA
Primary data refers to the data that is collected firsthand by a researcher for a specific purpose or study. It is original, raw data that has not been previously collected or processed.
Key Features of Primary Data:
• First-hand information
• Collected directly from the source
• Specific to the research objective
• Generally more accurate and up-to-date
Examples of Primary Data Collection Methods:
1. Surveys or questionnaires
2. Interviews
3. Observations
4. Experiments
5. Focus groups
Example: If a student conducts a survey in their college to know how many students prefer online classes, that’s primary data.
SECONDARY DATA
Secondary data is data that has been collected, processed, and published by someone else, and is used for analysis or reference by another person or researcher.
Key Features of Secondary Data:
1. Second-hand information
2. Already collected and available
3. Quicker and cheaper to access
4. Might not be specific to your current research needs
Sources of Secondary Data:
• Government reports or census data
• Research articles or journals
• Books and newspapers
• Company records or websites
• Online databases (like World Bank, NSSO, etc.)
Example: If you use data from a government report on literacy rates for your project, that’s secondary data.