Statistics 01 (2)

The document discusses various research methods in data collection, emphasizing the distinction between primary and secondary data sources. It outlines different data collection approaches, including comprehensive inventory and sampling methods, as well as specific techniques such as surveys, interviews, and observations. Additionally, it details probability and non-probability sampling methods, highlighting their applications and potential biases.

Uploaded by

bboulaaras19

Chapter 02: Research Methods in Data Collection and Sampling Techniques


➢Data Sources: Primary and Secondary.

➢Data Collection Approaches.

➢Data Collection Methods.

➢Sampling Techniques.

➢Exercises.
02
Data Sources: Primary and Secondary
In this section, we distinguish between primary and
secondary data sources, which can be thought of as
two different pathways to information for research
and analysis.

03
Primary Data Sources
Primary data sources refer to firsthand, original data collected
directly from individuals, entities, or observations.
Researchers gather primary data through methods such as surveys,
interviews, experiments, observations, or questionnaires. Since this
data is obtained directly from the source, it is unique to the
researcher's study and tailored to address specific research questions.
Primary data sources are highly valuable for their relevance,
accuracy, and direct applicability to the research objectives, making
them essential in various fields for generating new insights and
drawing specific conclusions.
04
Secondary Data Sources
Secondary data sources comprise data that has been previously
collected, processed, and published by other researchers, organizations,
or entities for purposes other than the current research study. This data is
not obtained directly from the original source or individuals but is rather
sourced from existing databases, publications, reports, or any other form
of pre-recorded information. Researchers use secondary data to analyze
and draw insights without direct involvement in the data collection
process. Secondary data sources are valuable for comparative studies,
historical research, and large-scale analyses, providing a wealth of pre-
existing information for various research purposes.
05
Data Collection Approaches
The comprehensive inventory method and the sampling
method are distinct approaches employed in data collection
and analysis, especially in the realms of research, auditing,
and data management.

06
Comprehensive Inventory Method
The comprehensive inventory method, also known as full or
complete enumeration, entails gathering data from an entire
population or dataset, leaving no item unexamined or unaccounted
for. This approach is chosen when the population is relatively
small and manageable, and the necessary resources for examining
each element are accessible. Its significance lies in its accuracy:
because every element is examined, sampling error is eliminated,
although measurement and recording errors can still occur.

07
Sampling Method
The sampling method entails choosing a representative subset
from a larger population or dataset to draw inferences or make
conclusions about the entire population. This approach becomes
essential when collecting data from every single element in the
population is impractical because of time, cost, or similar constraints.
Sampling methods find extensive application in diverse fields
such as research, market surveys, and quality control, where
gathering data from the entire population is unfeasible.

The sampling method enables researchers to make well-informed
decisions, predictions, and data analyses with reasonable
accuracy, all while minimizing costs and efforts. Properly
executed sampling methods yield valuable insights, making
them an indispensable tool in various research endeavors.
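
The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the population below is made up (10,000 simulated measurements), and the names `population`, `census_mean`, and `sample_mean` are assumptions introduced for the example, not part of the course material.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements (not data from the text).
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Comprehensive inventory: examine every element.
census_mean = statistics.mean(population)

# Sampling method: examine a random subset of 500 elements.
sample = rng.sample(population, 500)
sample_mean = statistics.mean(sample)

# The gap between the two is the sampling error for this particular sample.
sampling_error = sample_mean - census_mean
print(f"census mean    = {census_mean:.2f}")
print(f"sample mean    = {sample_mean:.2f}")
print(f"sampling error = {sampling_error:+.2f}")
```

The census value has no sampling error by construction; the sample value is usually close to it but examines only 5% of the elements.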

09
Data Collection Methods
Data collection methods encompass a wide array of techniques
employed to systematically gather information for research, analysis,
and decision-making purposes. These methods serve as the
foundation upon which meaningful insights and conclusions are
drawn.

10
Questionnaires
A questionnaire is simply a set of questions. These lists of
questions typically aim to gather information about an individual, a
group of individuals, or a corporation.
A questionnaire’s main purpose is to gather information or data
from a target audience. Questionnaires can collect both
quantitative and qualitative data depending on the question types.

11
Surveys
A survey encompasses a wider and more comprehensive scope
compared to a questionnaire. While a questionnaire is a set of
structured questions, a survey not only includes these questions but
also incorporates the quantitative process of collecting, measuring,
and analyzing the responses obtained.

12
Interviews
Interviews serve as a valuable method for gathering rich, nuanced,
and context-specific data. They entail interactive dialogues
between researchers and participants, facilitating the exploration of
intricate subjects, comprehension of diverse viewpoints, and
delving into the experiences and thoughts of those involved.
Effective interviewing skills, active listening, and rapport-building
are essential for ensuring the reliability and validity of the
collected data during the interview process.

13
Observations
Observational methods involve the systematic observation and
recording of behaviors, events, or processes within their natural
context. These observations can take on two primary forms:
participant-based, where the researcher actively participates in the
observed activities, or non-participant, where the researcher takes
on a passive observer role. This method is particularly instrumental
in the examination of behaviors, social interactions, and the study
of natural occurrences.

14
Experiments
Experiments are precisely structured inquiries created to examine
hypotheses and establish causal relationships. Researchers
intentionally alter one or more variables and observe how these
changes impact other variables. By conducting experiments within
controlled environments and following meticulously designed
protocols, researchers ensure the reliability and validity of the
gathered data. This method is extensively used in scientific
research, enabling researchers to draw conclusions about cause and
effect by analyzing the outcomes of their experiments.
15
Collection of Existing Records or Documents
Data collection extends beyond direct participant engagement; it also
encompasses the exploration of pre-existing records, documents,
or archival resources. Researchers delve into historical
manuscripts, official records, published materials, and any written
content pertinent to their research focus. This method serves as a
bountiful wellspring of both qualitative and quantitative data,
bestowing invaluable historical context and supplementary
perspectives.

16
Sampling Techniques
In order to derive meaningful and valid conclusions from your
research findings, it is essential to thoughtfully determine the
methodology for selecting a sample that accurately represents the
entire population. This methodology is known as a "sampling method."
In research, there are two primary categories of sampling methods:
1.Probability Sampling: Probability sampling relies on random
selection techniques, ensuring that each member of the population has
a known, non-zero chance of being included in the sample (an equal
chance, in the simplest designs). This method enables
researchers to make robust and statistically sound inferences about the
entire population based on the characteristics of the selected sample.
2.Non-Probability Sampling: Non-probability sampling, on the
other hand, does not involve random selection. Instead, it relies on
non-random criteria such as convenience, accessibility, or specific
characteristics of the participants. This method allows for the
collection of data with ease and may be more practical in certain
situations, but it typically comes with limitations in terms of
generalizability and statistical inference.
Selecting the most appropriate sampling method for your research
depends on the research goals, available resources, and the level of
precision required for the study.
18
Probability Sampling
Probability sampling ensures that every individual in the
population has a known, non-zero chance of being chosen. It is
predominantly employed in quantitative research. If your objective
is to generate results that accurately reflect the entire population,
employing probability sampling techniques is the most reliable
approach.
There are four primary types of probability samples: simple random,
systematic, stratified, and cluster samples.

19
Figure 01: Types of probability samples
20
Simple random sampling
In a simple random sample, each member of the population holds
an equal probability of being chosen. For this method to be
effective, your sampling frame must encompass the entire
population.
To execute this sampling method, tools such as random number
generators or other entirely chance-based techniques can be
employed. These tools ensure fairness and unbiased selection by
relying solely on randomization principles.

21
Example:
Imagine you're conducting research on student preferences in a
large university with 10,000 enrolled students. You aim to select a
simple random sample of 500 students to participate in your study.
Here's how you can use simple random sampling:
1.Assigning Numbers: Each student in the university is assigned a
unique number from 1 to 10,000.
2.Using a Random Number Generator: Use computer software or a
physical random number generator to generate 500 distinct
random numbers between 1 and 10,000.
3.Selection Process: The random numbers correspond to the
student IDs. For instance, if the random number generator
generates the numbers 253, 765, 1201, 4989, ..., you select the
students with these respective IDs (253rd, 765th, 1201st, 4989th,
etc.) from the university database.
By following this method, you ensure that every student in the
university has an equal chance of being selected, forming a
representative sample for your research on student preferences.
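
The three steps above can be sketched with Python's standard `random` module. The seed and the variable names are assumptions chosen for the illustration; in a real study the numbers would be matched against the university's actual student list.

```python
import random

POPULATION_SIZE = 10_000   # students numbered 1 to 10,000
SAMPLE_SIZE = 500

rng = random.Random(42)    # fixed seed, only so the sketch is repeatable
student_ids = range(1, POPULATION_SIZE + 1)

# random.sample draws without replacement, so no student is picked twice.
selected_ids = sorted(rng.sample(student_ids, SAMPLE_SIZE))

print(len(selected_ids))   # 500
print(selected_ids[:5])    # the five smallest selected student numbers
```

Drawing without replacement matters here: generating 500 independent random numbers could repeat a student and leave the sample short.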

23
Systematic sampling
Systematic sampling is similar to simple random sampling, but it
is usually slightly easier to conduct. Every member of the
population is listed with a number, but instead of randomly
generating numbers, individuals are chosen at regular intervals.

24
Example:
In the university's student database, all students are listed
alphabetically by their last names. To create a systematic sample of
150 students, you proceed as follows:
1.Sort Students Alphabetically: Arrange the list of 10,000
students in alphabetical order based on their last names.
2.Determine the Sampling Interval: Divide the total number of
students (10,000) by the desired sample size (150) to get a
sampling interval of approximately 67 (10,000 / 150 ≈ 67).

3.Select a Random Starting Point: Use a random number
generator to select a random starting point between 1 and 67. Let's
assume the random number generated is 23.
4.Choose Systematic Samples: Starting from the 23rd student on
the list, select every 67th student thereafter (23, 90, 157, and so
on). Continue this pattern until you reach the end of the list.
Because 10,000 is not an exact multiple of 67, this yields 150
students when the starting point is 17 or lower and 149 otherwise;
starting from 23, the last student selected is number 9,939, giving
149 students.
By selecting every 67th student from the sorted list, starting from
a random point, you give every student the same 1-in-67 chance of
being included and obtain a systematic sample of about 150 students
for your research study.
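
A minimal sketch of the systematic selection follows, assuming students are identified by their 1-based position in the sorted list. The function name `systematic_sample` is hypothetical. Note that a start of 23 reaches the end of the list after 149 selections, since 23 + 67 × 149 = 10,006 exceeds 10,000.

```python
import random

POPULATION_SIZE = 10_000
TARGET_SAMPLE_SIZE = 150

def systematic_sample(population_size, interval, start):
    """Return 1-based list positions start, start + interval, ... up to population_size."""
    return list(range(start, population_size + 1, interval))

# 10,000 / 150 = 66.67, rounded to an interval of 67.
interval = round(POPULATION_SIZE / TARGET_SAMPLE_SIZE)

# Random starting point between 1 and the interval (seed fixed for repeatability).
start = random.Random(7).randint(1, interval)
positions = systematic_sample(POPULATION_SIZE, interval, start)

# The worked example from the text: starting point 23.
example_positions = systematic_sample(POPULATION_SIZE, interval, 23)
print(example_positions[:3])   # [23, 90, 157]
print(len(example_positions))  # 149
```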

26
Note:
When employing systematic sampling, it is crucial to verify that the
list being sampled does not contain any hidden patterns that could
bias the sample. For instance, if the student roster is organized by
academic departments and students within each department are
arranged according to their grades, there is a possibility that your
systematic interval might skip over students with lower grades. This
could lead to a sample in which high-achieving students are
over-represented, potentially undermining the validity of your
findings. It's essential to thoroughly understand the data organization
and take steps to mitigate any biases.
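
The risk described in this note can be made concrete with a small simulation. Everything here is hypothetical: 50 departments of exactly 20 students each, grades on a 0 to 20 scale, and a sampling interval that happens to equal the department size. It is a deliberately extreme case, meant only to show the mechanism.

```python
import random
import statistics

rng = random.Random(1)
DEPARTMENTS = 50
STUDENTS_PER_DEPARTMENT = 20   # hypothetical: every department has the same size

# Roster grouped by department; within each department grades run high to low.
roster = []
for _ in range(DEPARTMENTS):
    grades = sorted(
        (rng.uniform(0, 20) for _ in range(STUDENTS_PER_DEPARTMENT)),
        reverse=True,
    )
    roster.extend(grades)

interval = STUDENTS_PER_DEPARTMENT   # interval happens to match the pattern length
start = 2                            # 1-based start near the top of each department
systematic = roster[start - 1::interval]

shuffled = roster[:]
rng.shuffle(shuffled)                # shuffling removes the ordering pattern
systematic_after_shuffle = shuffled[start - 1::interval]

print(f"population mean           = {statistics.mean(roster):.2f}")
print(f"systematic, sorted list   = {statistics.mean(systematic):.2f}")
print(f"systematic, shuffled list = {statistics.mean(systematic_after_shuffle):.2f}")
```

On the sorted roster the interval always lands on the second-best student of each department, so the sample mean sits far above the population mean. Shuffling the list first is one simple way to break the pattern.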
27
Stratified sampling
Stratified sampling involves dividing the population into subpopulations
that may differ in important ways. It allows you to draw more precise
conclusions by ensuring that every subgroup is properly represented in
the sample.
To use this sampling method, you divide the population into subgroups
(called strata) based on the relevant characteristics (e.g., gender identity,
age range, income bracket, job role).
Based on the overall proportions of the population, you calculate how
many people should be sampled from each subgroup. Then you use
random or systematic sampling to select a sample from each subgroup.

28
Example:
Imagine a university with 5,000 undergraduate students and 2,000
graduate students. The university administration wants to conduct
a survey to understand the overall satisfaction levels of students.
To ensure that the sample reflects the undergraduate and graduate
student proportions accurately, the population is divided into two
strata: undergraduate students and graduate students.
In the undergraduate stratum, there are 3,600 female students and
1,400 male students. In the graduate stratum, there are 1,200
female students and 800 male students. The university aims to
survey 500 students for their research.
To create a representative sample, they use stratified random
sampling. From the undergraduate stratum, which holds 5,000 of the
7,000 students, they randomly select 357 students (500 × 5,000 / 7,000
≈ 357): 257 female students and 100 male students, based on
the gender proportions within the undergraduate stratum. Similarly,
from the graduate stratum, they randomly select the remaining 143
students (86 female students and 57 male students) based on the gender
proportions within the graduate stratum.
By using this method, the university ensures that the survey
sample accurately represents the gender balance in both the
undergraduate and graduate student populations, allowing for
meaningful insights into the satisfaction levels of students across
different academic levels and genders.
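
The allocation in this example can be reproduced with a short calculation. The sketch below assumes simple rounding at each stage; it happens to add up to exactly 500 for these numbers, but in general rounded quotas may need a small manual adjustment. The dictionary layout and variable names are choices made for the illustration.

```python
import random

# Stratum sizes taken from the example.
population = {
    "undergraduate": {"female": 3600, "male": 1400},
    "graduate": {"female": 1200, "male": 800},
}
SAMPLE_SIZE = 500

total = sum(sum(groups.values()) for groups in population.values())  # 7,000

allocation = {}
for level, groups in population.items():
    level_total = sum(groups.values())
    # Share of the sample for this academic level, proportional to its size.
    level_quota = round(SAMPLE_SIZE * level_total / total)
    # Split that share by gender, proportional within the level.
    allocation[level] = {
        gender: round(level_quota * count / level_total)
        for gender, count in groups.items()
    }

print(allocation)
# {'undergraduate': {'female': 257, 'male': 100}, 'graduate': {'female': 86, 'male': 57}}

# Draw a simple random sample inside each stratum; students are represented
# here by hypothetical index numbers 0 .. count - 1 within their stratum.
rng = random.Random(0)
sample = {
    (level, gender): rng.sample(range(population[level][gender]), n)
    for level, groups in allocation.items()
    for gender, n in groups.items()
}
```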
30
Cluster sampling
Cluster sampling also involves dividing the population into
subgroups, but each subgroup should have similar characteristics
to the whole population. Instead of sampling individuals from each
subgroup, you randomly select entire subgroups.
If it is practically possible, you might include every individual
from each sampled cluster. If the clusters themselves are large, you
can also sample individuals from within each cluster using one of
the techniques above.

31
Example:
Suppose a national retail chain operates in 10 different regions across the
country, and they want to assess the effectiveness of a new training program
for their employees. Each region has roughly the same number of employees in
similar roles. However, conducting the assessment in all 10 regions would be
too time-consuming and costly. Instead, they decide to use cluster sampling.
They randomly select 3 out of the 10 regions as their clusters. Let's say they
select regions A, D, and G as their clusters.
Within each selected cluster (A, D, and G), the company conducts the training
program assessment on a sample of employees. In Region A, they survey 150
employees, in Region D, they survey 100 employees, and in Region G, they
survey 120 employees.
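
The two stages of this example (pick whole regions at random, then sample employees inside them) might be sketched as follows. The employee lists are invented: 400 placeholder names per region, and a single assumed sample size of 120 per region, whereas the example above uses 150, 100, and 120.

```python
import random

rng = random.Random(3)
regions = list("ABCDEFGHIJ")   # the 10 regions

# Hypothetical head counts; the text only says the regions are roughly equal in size.
employees = {r: [f"{r}-{i:03d}" for i in range(1, 401)] for r in regions}

# Stage 1: randomly choose 3 whole regions (the clusters).
chosen_regions = rng.sample(regions, 3)

# Stage 2: within each chosen region, draw a simple random sample of employees.
per_region_sample_size = 120   # assumed for the sketch
surveyed = {
    r: rng.sample(employees[r], per_region_sample_size) for r in chosen_regions
}

for region, people in surveyed.items():
    print(region, len(people), people[:3])
```

Only the three selected regions ever need to be visited, which is where the cost saving comes from.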
32
Non-probability sampling
In a non-probability sample, individuals are chosen based on
specific, non-random criteria, and not every individual within the
population has an equal chance of being included. Although non-
probability samples are more accessible and cost-effective, they
carry a higher risk of sampling bias. This bias weakens the
inferences about the population when compared to probability
samples, potentially limiting the scope of your conclusions. Even
when using a non-probability sample, it is crucial to strive for
representativeness within the sample, aiming to make it as reflective
of the overall population as possible.
Non-probability sampling techniques are commonly employed in
exploratory and qualitative research where the focus is on
understanding diverse perspectives and contexts rather than
making precise statistical predictions.
There are five main types of non-probability samples: convenience,
voluntary response, purposive, snowball, and quota samples.

35
Convenience sampling
A convenience sample simply includes the individuals who happen to be
most accessible to the researcher.
This is an easy and inexpensive way to gather initial data, but there is no
way to tell if the sample is representative of the population, so it can’t
produce generalizable results. Convenience samples are at risk for both
sampling bias and selection bias.
This sampling approach is frequently employed in pilot studies, early
research stages, or instances where swift insights are essential. Its utility
lies in providing initial data for exploration and hypothesis generation
rather than in producing statistically robust or widely applicable results.

36
Example:
Suppose you are conducting a study on dietary habits among university
students, and you decide to distribute your survey during lunchtime in the
university cafeteria. Since you can only approach students during specific
hours, you end up surveying a group of students who are present in the
cafeteria during that time. While this method allows you to collect responses
conveniently, the sample you gather may not be representative of all students at
the university.
The limitation here is that the sample is biased toward students who eat in the
cafeteria during lunch hours. This may exclude commuter students, students
with different schedules, or those who prefer to eat elsewhere. Consequently,
the survey results might not accurately reflect the dietary habits of the entire
student population, leading to potential biases in your findings.
Voluntary response sampling
Comparable to convenience samples, voluntary response samples
primarily rely on accessibility. In contrast to researchers actively
selecting participants, individuals voluntarily participate, often by
responding to public online surveys or similar outreach methods.
However, voluntary response samples are inherently biased, as
certain individuals are more inclined to volunteer than others,
resulting in self-selection bias. This bias can skew the findings,
making it challenging to draw accurate and unbiased conclusions
about the broader population.
38
Example:
You conduct a survey on social media platforms to gather
opinions about a new mobile phone app that was recently
launched. You share the survey link widely, and many enthusiastic
users of the app respond. While you receive a large number of
responses, it's important to note that these respondents are more
likely to be individuals who are highly engaged with the app and
have strong opinions about it. Therefore, their feedback might be
biased towards positive experiences or, conversely, negative
experiences if they encounter issues.
While these responses provide valuable insights into the opinions
of active users, they may not accurately represent the opinions of
the entire user base. Users who are less engaged or have neutral
opinions might be underrepresented in the survey results, leading
to a potential bias in the feedback received.
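
Self-selection bias of this kind can be illustrated with a small simulation. All numbers are assumptions made up for the sketch: a user base of 20,000 whose ratings are mostly neutral, and response probabilities that rise with the strength of opinion.

```python
import random

rng = random.Random(5)

# Hypothetical user base: app ratings from 1 (hate it) to 5 (love it), mostly neutral.
ratings = rng.choices([1, 2, 3, 4, 5], weights=[10, 15, 50, 15, 10], k=20_000)

# Assumed response behaviour: the stronger the opinion, the likelier a reply.
response_probability = {1: 0.30, 2: 0.10, 3: 0.02, 4: 0.10, 5: 0.30}
volunteers = [r for r in ratings if rng.random() < response_probability[r]]

def share_neutral(values):
    """Fraction of ratings equal to 3 (neutral)."""
    return sum(v == 3 for v in values) / len(values)

print(f"neutral share among all users : {share_neutral(ratings):.0%}")
print(f"neutral share among volunteers: {share_neutral(volunteers):.0%}")
```

Under these assumptions roughly half of all users are neutral, yet neutral users make up only a small fraction of the volunteers, so the survey overstates how polarized opinion is.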

40
Purposive sampling
Purposive sampling, also known as judgment sampling, relies on
the researcher's expertise to deliberately select
a sample that serves the research objectives effectively. It finds
common application in qualitative research contexts, where the
primary goal is to attain an in-depth understanding of a particular
phenomenon, rather than seeking to draw statistical inferences.
This approach is especially pertinent when dealing with small and
highly specific populations.

An efficient purposive sample necessitates well-defined inclusion
and exclusion criteria, backed by a clear rationale for each
selection. It is vital to transparently articulate these criteria to
mitigate potential observer bias that might influence the research
findings. By judiciously applying purposive sampling, researchers
can home in on precisely the information they seek, facilitating a
more nuanced and comprehensive exploration of the subject
matter.

42
Example:
In a research study focusing on entrepreneurs' success stories, a
researcher selects participants based on their exceptional achievements
in starting and growing successful businesses. The researcher identifies
these individuals through industry awards, news articles, and
professional networks. By purposefully selecting highly successful
entrepreneurs, the study aims to gain unique insights into the specific
strategies, challenges, and traits that have contributed to their
remarkable accomplishments. This purposive sampling approach
ensures that the research focuses on individuals who can provide
valuable and in-depth information, offering a rich understanding of
entrepreneurial success factors.
Snowball sampling
Initially, one or a few participants are selected. These participants
then refer the researcher to others, who, in turn, refer more
participants. The sample grows like a snowball rolling downhill.
This technique is commonly used in the social sciences, especially when
researching hidden or hard-to-reach populations, such as drug users or
marginalized communities.

44
Example:
You are conducting a study on the challenges faced by street vendors
in your city, many of whom have undocumented or informal
businesses. Given the lack of an official list of street vendors, it's not
possible to use probability sampling methods. However, you begin
your research by building a network of trust and collaboration.
You start by engaging with one street vendor who agrees to participate
in your study. This initial contact leads to a snowball effect where the
vendor introduces you to other vendors in the area whom they know
personally. These newly introduced vendors, in turn, connect you with
even more participants.
While your sample is not randomly selected and may not
represent the entire population of street vendors in the city, this
snowball sampling technique allows you to access a group that is
often difficult to reach through traditional research methods. By
nurturing trust and leveraging personal connections, you can delve
into the challenges faced by this community, shedding light on
their experiences and perspectives.
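
The referral process can be sketched as a wave-by-wave walk through a network of contacts. The network below is entirely hypothetical (eight invented vendors), and `snowball_sample` is a name introduced for this illustration.

```python
from collections import deque

# Hypothetical referral network: who each vendor is willing to introduce.
referrals = {
    "vendor_1": ["vendor_2", "vendor_3"],
    "vendor_2": ["vendor_4"],
    "vendor_3": ["vendor_4", "vendor_5"],
    "vendor_4": [],
    "vendor_5": ["vendor_6"],
    "vendor_6": [],
    "vendor_7": ["vendor_8"],   # not connected to vendor_1's network
    "vendor_8": [],
}

def snowball_sample(seed, network, max_size):
    """Follow referrals wave by wave from the seed until max_size is reached."""
    recruited = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(recruited) < max_size:
        current = queue.popleft()
        for contact in network.get(current, []):
            if contact not in seen and len(recruited) < max_size:
                seen.add(contact)
                recruited.append(contact)
                queue.append(contact)
    return recruited

sample = snowball_sample("vendor_1", referrals, max_size=10)
print(sample)
# ['vendor_1', 'vendor_2', 'vendor_3', 'vendor_4', 'vendor_5', 'vendor_6']
```

Vendors 7 and 8 are never reached because nobody in the first vendor's circle knows them, which is exactly why a snowball sample may not represent the whole population.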

46
Quota sampling
Quota sampling is founded on the deliberate, non-random
selection of a predetermined number or proportion of units known
as "quotas."
In this approach, the initial step involves partitioning the
population into distinctive, non-overlapping subgroups, commonly
referred to as "strata." Following this, sample units are selectively
recruited until the designated quotas for each stratum are fulfilled.
These quotas are based on characteristics that the researcher
specifies in advance.
The fundamental purpose of quota sampling is to control the
composition of the sample: by allocating quotas to different strata,
researchers ensure that chosen characteristics or attributes are
adequately represented in the study. Because units within each
stratum are not selected at random, this control does not remove the
risk of sampling bias.

48
Example:
A tech company is conducting market research to understand consumer
preferences for a new fitness app. The company wants to ensure that the
app appeals to a diverse range of fitness enthusiasts, including those
interested in cardio exercises, strength training, yoga, and flexibility
workouts. To gather representative data, they divide the target
population into four fitness categories: cardio enthusiasts, strength
trainers, yoga practitioners, and flexibility-focused individuals.
The company sets a total sample size of 800 participants, with a quota of
200 participants for each fitness category. Researchers employ targeted
recruitment strategies to ensure that they reach individuals from each
group.

For instance, they collaborate with local gyms and fitness centers
to find cardio enthusiasts and strength trainers, partner with yoga
studios to connect with yoga practitioners and engage with
wellness communities for individuals focused on flexibility
exercises.
By employing this method, the company ensures that its research
sample is balanced across the four fitness categories, although equal
quotas need not mirror each group's share of the market.
This approach enables them to analyze the unique
needs and interests of different fitness groups, allowing for the
development of a versatile app that caters to a wide range of fitness
enthusiasts.
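
The quota-filling logic of this example can be sketched as a simple loop: volunteers are accepted as they arrive until their category's quota of 200 is full. The stream of volunteers is simulated and its category weights are an assumption made for the illustration.

```python
import random

rng = random.Random(11)
CATEGORIES = ["cardio", "strength", "yoga", "flexibility"]
QUOTA_PER_CATEGORY = 200   # 4 x 200 = 800 participants in total

quotas = {c: QUOTA_PER_CATEGORY for c in CATEGORIES}
recruited = {c: [] for c in CATEGORIES}

def next_volunteer(i):
    """Hypothetical arrival: cardio fans are deliberately over-represented."""
    category = rng.choices(CATEGORIES, weights=[4, 3, 2, 1])[0]
    return f"person_{i}", category

i = 0
while any(len(recruited[c]) < quotas[c] for c in CATEGORIES):
    i += 1
    person, category = next_volunteer(i)
    if len(recruited[category]) < quotas[category]:
        recruited[category].append(person)   # quota still open: accept
    # otherwise the quota is full and the volunteer is turned away

print({c: len(people) for c, people in recruited.items()})
# {'cardio': 200, 'strength': 200, 'yoga': 200, 'flexibility': 200}
print("volunteers screened:", i)
```

The final counts are fixed by the quotas, but who ends up inside each quota depends on who happened to arrive first, which is the non-random element of the method.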

50
Exercise 01:
In each statement, tell whether primary or secondary data sources have been
used:
a.A researcher conducts interviews with employees to gather information about
workplace satisfaction.
b.A student analyzes a dataset published in a scientific journal for a research
paper.
c.A company uses its sales records from the past year to analyze customer
purchasing behavior.
d.A government agency conducts a national census to collect demographic
data.
e.A biologist conducts experiments in a laboratory to study the
effects of a new drug on cells.
f.An economist analyzes the GDP data published by the government
for the past decade.
g.A marketing researcher surveys customers to understand their
preferences for a new product.
h.A meteorologist collects weather data from weather stations across
the country for climate research.

52
Exercise 02:
Determine whether you should use primary or secondary data
sources for the following statements and explain your reasons:
a.Investigating global warming trends over the last century.
b.Analyzing sales data of a retail chain over the past five years.
c.Studying the eating habits of teenagers in urban areas.
d.Comparing crime rates in different neighborhoods within a city.
e.Investigating the impact of social media on mental health among
young adults.
f.Analyzing the effectiveness of a new teaching method in improving
student performance in elementary schools.
g.Assessing the impact of a new government policy on the local
economy.
h.Examining consumer preferences and trends in the smartphone
industry.
i. Investigating the job performance of employees in a specific
department within your company.
j. Studying the history and development of a particular type of
renewable energy technology.

54
Exercise 03:
Determine whether the Comprehensive Inventory Method or the
Sampling Method is being used in each of the following statements:
a.Analyzing the entire population of customer feedback forms to
understand overall satisfaction levels for a specific product.
b.Assessing the quality of a large batch of manufactured goods by
inspecting a random sample of items from the batch.
c.Conducting a census to count the total number of households in a
specific city.
d.Determining the average height of students in a school by measuring
the heights of a representative sample of students.
e.Estimating the total revenue generated by a company in a year by
examining the financial records of a randomly selected group of
months.
f.Checking the accuracy of a book inventory by counting all the
books in the store.
g.Studying the average time customers spend on a website by
analyzing data from a randomly selected group of website visitors.
h.Calculating the percentage of defective items in a factory's
production line by examining a random sample of items from the
production.

56
Exercise 04:
Which data collection method would you choose in each of the following
situations? Justify your choice.
a.You want to understand the impact of online reviews on customer
purchasing decisions for a particular product.
b.You are interested in studying the behavior of shoppers in a retail
store.
c.You need to analyze the impact of a new government policy on
small businesses in a specific region.

d.You want to explore the factors influencing employees' job
satisfaction within your organization.
e.You are researching historical trends in migration patterns in a
specific country over the past century.
f.You are exploring the reasons for the high dropout rate among
students in a specific college program.
g.You are researching the impact of a new drug on patients' blood
pressure.

58
Exercise 05:
Identify which sampling technique is being described in each
statement:
a.A research study divides a population into age groups, and then a
random sample is taken from each group.
b.A teacher selects every 10th student on the class roster for a survey
about study habits.
c.A market researcher divides a city into blocks and randomly selects
specific blocks to survey residents.

d.An opinion pollster selects participants who voluntarily respond to
an advertisement inviting people to participate.
e.The researcher selects teenagers who are actively involved in local
gaming tournaments and events, aiming to capture insights from
dedicated gamers.
f.A scientist randomly selects households from different
neighborhoods for a study on environmental pollution.
g.A market researcher aims to understand the preferences of
smartphone users in a city.
h.The researcher stands outside a popular gaming store and
interviews teenagers who happen to exit the store.

i. You aim to study social media usage across different age groups
(teens, young adults, middle-aged, and seniors) in the city. You
decided to survey 100 participants, with 25 participants from each
age group.
j. You are studying a niche group of teenagers who are part of a
subculture with limited visibility. You start with one participant
from the subculture and ask them to refer you to others within their
group.

61
Exercise 06:
Imagine you are conducting research on a university campus.
Determine the most appropriate sampling technique for each
statement and explain your choice:
a.Surveying students from different majors to understand their career
aspirations.
b.Studying the sleeping patterns of students by selecting every 5th
student who walks out of the campus library in the evening.
c.Analyzing the eating habits of students by randomly selecting
specific dormitory buildings and surveying all students living in
those buildings.
d.Investigating faculty satisfaction by surveying all faculty members
who agree to participate.
e.Conducting a campus-wide health survey by randomly selecting
students from a list of all enrolled students.

63
Exercise 07:
A researcher wishes to estimate the average weight of newborns in
South Algeria in the last five years. He takes a random sample of
235 newborns and obtains an average of 3.27 kilograms.
a.What is the population of interest?
b.What is the parameter of interest?
c.Based on this sample, do we know the average weight of newborns
in South Algeria? Explain fully.

64
Exercise 08:
Suppose a local grocery store wants to understand the shopping habits of its
customers. The store manager decided to conduct a survey to gather information
about customer preferences. Due to budget constraints and time limitations, the
manager opts for convenience sampling. The manager stands near the store
entrance and surveys the first 50 customers who enter the store one Saturday
afternoon.
a.What is the sampling method used in this survey?
b.Discuss the potential biases introduced by using convenience sampling in this
scenario. How might the results of the survey be skewed?
c.Propose an alternative sampling method that could provide more representative
and reliable results for understanding customer preferences in the grocery store.
Explain why the proposed method is an improvement over convenience
sampling.