Random Variables and Probability Distributions
Course Outline
1. Correlation
2. Regression: Linear Regression Model
Nature of Statistics
What is Statistics?
Many people will probably respond that statistics is
numbers, charts, graphs, etc.
For example:
2 types of data:
1. Quantitative data (numeric values)
2. Qualitative data (categorical or attribute)
The use of statistics dates back to census taking in
ancient Babylonia, Egypt, and later in the Roman Empire,
when data were collected about matters concerning the
state, such as births and deaths. In fact, the word statistics is
derived from the Latin word status, meaning “state”.
Statistics is the science of collecting, classifying,
organizing, summarizing, analyzing, and interpreting data in
order to draw conclusions or make decisions.
2 types of statistics:
1. Descriptive statistics (Numerical/Graphical Methods)
2. Inferential Statistics (Generalizing Methods)
Example 1. A large group of men at age 48 was studied for 18
years. Results revealed that 60% to 70% of the unmarried men
were alive at age 65, while 90% of the married men were
alive at age 65. Which part of the study represents the
descriptive branch of statistics? What conclusions might
be drawn from this study using inferential statistics?
(Source: Philippine Daily Inquirer)
Solution:
Descriptive statistics involves statements such as
“60% to 70% of the unmarried men were alive at age 65” and
“90% of the married men were alive at age 65.” A possible
inference drawn from the study is that being married is
associated with a longer life for men.
We use inferential statistics to try to infer from the sample
data what the population might think. Or, we use inferential
statistics to make judgments of the probability that an
observed difference between groups is a dependable one or
one that might have happened by chance in this study.
A population is used to designate the complete set of items
that is of interest in the research.
Population
Unit
Sample
Example 1. In a recent survey, 1 000 high school
students were asked if they read news on the internet at
least once a week. Six hundred of the students said yes.
Identify the population and the sample. Describe the data
set.
Solution:
The population consists of the responses of all high school
students, while the sample consists of the responses of the
1 000 high school students in the survey.
The sample is a subset of the responses of all high school
students.
The data set consists of 600 yeses and 400 nos.
[Figure: responses of all high school students (Population),
containing the responses of students in the survey (Sample)]
Two other important terms are parameter and statistic.
Example 1. The given spinner is divided into four sections. Let
X be the score where the arrow will stop (numbered as 1, 2, 3,
and 4 in the drawing below).
[Figure: spinner with four sections numbered 1, 2, 3, 4]
Let X = score on the spinner
X     1  2  3  4
P(x)
ΣP(x) = P(1) + P(2) + P(3) + P(4)
      = 1
The two requirements for a discrete probability distribution
(each probability lies between 0 and 1, and the probabilities
sum to 1) are satisfied. Therefore, the distribution is a
discrete probability distribution.
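The two requirements can also be checked mechanically. The sketch below is only an illustration: the function name is_discrete_distribution is hypothetical, and the four equal probabilities of 1/4 assume a spinner with four equal sections, which the text does not state.

```python
from fractions import Fraction
import math


def is_discrete_distribution(probabilities, tol=1e-9):
    """Check the two requirements of a discrete probability distribution."""
    # Requirement 1: every probability lies between 0 and 1.
    in_range = all(0 <= p <= 1 for p in probabilities)
    # Requirement 2: the probabilities sum to 1 (within a small tolerance).
    sums_to_one = math.isclose(float(sum(probabilities)), 1.0, abs_tol=tol)
    return in_range and sums_to_one


# Assumption: a spinner with four equal sections, so each score has P(x) = 1/4.
spinner = [Fraction(1, 4)] * 4
print(is_discrete_distribution(spinner))

# A counterexample: the values sum to 1, but one of them is negative.
print(is_discrete_distribution([0.5, 0.6, -0.1]))
```

The first call passes both requirements; the second fails the first requirement even though its values add up to 1.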
Exercise 2. The spinner below is divided into eight sections.
Let X be the score where the arrow will stop (numbered as 1,
2, 3, and 4 in the drawing below).
[Figure: spinner divided into eight sections numbered 1 to 4]
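Example 3. When two fair dice are thrown simultaneously, there
are 36 equally likely outcomes (a, b), where a and b are the
scores on the first and second die. Let X be the sum of the
two scores.
P(2) = P(1, 1) = 1/36
P(12) = P(6, 6) = 1/36
The discrete probability distribution (in tabular form) is
given below:
x     2     3     4     5     6     7     8     9     10    11    12
P(x)  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36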
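The table for two fair dice can be reproduced by listing all 36 equally likely outcomes and counting how many give each sum. This is a minimal sketch; the variable names are illustrative, and X is taken to be the sum of the two scores.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (a, b) of throwing two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

# Count how many outcomes give each sum x = a + b.
counts = Counter(a + b for a, b in outcomes)

# P(x) = (number of outcomes with sum x) / 36, kept as exact fractions.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```

Fraction prints each probability in lowest terms, so a value such as 2/36 appears as 1/18.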
x     0    1    2    3    4
P(x)  0.1  0.2  ?    0.2  0.2
a. Determine P(2).
Solution:
The probabilities must sum to 1:
0.1 + 0.2 + P(2) + 0.2 + 0.2 = 1
0.7 + P(2) = 1
P(2) = 0.3
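The same step, subtracting the known probabilities from 1, can be written as a small helper. This is a sketch only; the name missing_probability and the tolerance parameter are illustrative choices, not part of the lesson.

```python
import math


def missing_probability(known, tol=1e-9):
    """Return the single missing probability, given all the others."""
    # The probabilities of a discrete distribution must sum to 1.
    missing = 1 - math.fsum(known)
    # The result must itself be a valid probability.
    if missing < -tol or missing > 1 + tol:
        raise ValueError("known probabilities cannot belong to a valid distribution")
    return missing


# Known values from the table: P(0), P(1), P(3), P(4).
p2 = missing_probability([0.1, 0.2, 0.2, 0.2])
print(round(p2, 2))
```

The result is rounded before printing because floating-point sums of decimals such as 0.1 and 0.2 are not exact.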
Exercise 1. A random variable X has the following
probability distribution:
x 1 2 3 4
P(x) 0.21 29c 0.29 0.21
a. Determine c.
PROBGroupie!!
x 0 1 2 3 4
P(x) 0.05 0.25 0.4 ? 0.06
a. Find P(3).
4.
x 1 2 3 4 5
P(x) c