The document discusses key concepts in data management and statistics including: 1. Data is unprocessed information that is accepted as input, processed by CPUs, and presented as output or information. 2. Statistics involves collecting, organizing, summarizing, analyzing, and presenting data from samples to make inferences about populations. 3. Common measures of central tendency like mean, median, and mode and measures of variation like range, standard deviation, and interquartile range are used to summarize and analyze sample data.
Lesson #05: Data Management

Data or Datum

 Unprocessed information (very useful and feasible to work with)
 This is what data management is concerned with
Elements of a Computer System
a. Accept
 Input devices
 Data is what is accepted as input.
b. Process
 CPU (Central Processing Unit)
 Data is what the CPU processes.
c. Present
 Output devices
 Data becomes information.

Statistics

 a discipline concerned with the analysis of data and
decision making based upon data
 involves collecting, organizing, and summarizing data
(comparable to input), analyzing data (comparable to
processing by the CPU), and presenting data (comparable
to output)
 a solid edifice of mathematical theorems proven
through unassailable laws of logic

Why do we need to study Statistics?

a. In medical sciences, to determine the efficacy of a
drug.
 “Medical students may not like statistics, but as
doctors they will.” - Martin Bland
b. In business and economics, statistics is used in
forecasting.
c. Even in our daily lives, temperature forecasts use
statistics.
d. In psychology, statistics is also used.
e. Even in sports, statistics is also used.

Types of Statistics
A. Descriptive Statistics
B. Inferential Statistics
C. Mathematical Statistics
Descriptive statistics
 involves methods of organizing, summarizing, and
presenting data
Inferential statistics
 involves methods of using information from a sample to
draw conclusions about the population
Population vs. Sample
A. Population
 refers to all the members of the subject of
interest
 Result: PARAMETER (a characteristic of the population)
B. Sample
 refers to selected members of the subject of
interest
 Result: STATISTIC (computed from the sample with a
statistical tool)
Statistics is needed to justify your sample size.
A STATISTIC is an estimate of the PARAMETER.
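
The difference between a parameter and a statistic can be sketched in Python. The population below is hypothetical (made-up sleep hours, not data from these notes), and the seed, the population size of 500, and the sample size of 30 are all assumptions chosen only for illustration.

```python
import random
import statistics

# Hypothetical population: sleep hours for 500 students (made-up values).
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [round(rng.uniform(2.0, 6.0), 1) for _ in range(500)]

# PARAMETER: computed from every member of the population.
parameter = statistics.mean(population)

# STATISTIC: computed from a randomly selected sample of 30 members.
sample = rng.sample(population, 30)
statistic = statistics.mean(sample)

print(f"Parameter (population mean): {parameter:.2f}")
print(f"Statistic (sample mean):     {statistic:.2f}")
```

The two printed values will usually be close but not equal, which is why the statistic is described as an estimate of the parameter.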
Variables vs. Constants
Variables
 are to be measured
 In the form of letters
Constants
 are fixed
 In the form of numbers
For example: X + 5 = 10
 X is the variable
 10 and 5 are constants

In the given scenarios, identify the following:


a. Population & Samples
b. Parameter & Statistics
c. Variables & Constants

A. When all UST freshmen students were asked, it was
found that, on average, they sleep for only 3.7 hours
per day during exam week (population). But from thirty
(30) randomly selected UST freshmen students, it was
found to be 3.6 hours per day (sample).
Parameter: 3.7 hours
Statistic: 3.6 hours

B. From 100 randomly selected residents of Calabarzon, it
was found that 13% (statistic) of them had dengue
fever in 2016. But according to the DOH National
Epidemiology Center (NEC), 11.9% (parameter) of
Filipinos had dengue fever in 2016.
Parameter: 11.9%
Statistic: 13%

C. 5% (parameter) of Asian men suffer from red-green
color blindness. From 250 randomly selected men in the
Philippines, it was found that 3% (statistic) suffer
from this type of color blindness.
Parameter: 5%
Statistic: 3%

Data Presentation
A. Textual
 Results are presented in declarative form.
B. Tabular
 Results are presented in tables composed of rows and columns.
C. Graphical
 Results are presented in diagrams.
 Diagrams should be simple and easy to understand.

Types of Graphs
1. Line Graphs
 To observe trends
 To observe gaps between categories per unit of
time
2. Pie Graphs
 To describe parts of a whole
3. Scatterplots
 To describe the relationship between two quantitative
variables
4. Statistical Maps
 To present statistical information with respect to
geographical location
5. Other graphs
a. Pictogram
b. Population Pyramid
c. Boxplot
d. Violin plot

Quantitative vs Qualitative variables

A. Quantitative
 are in numerical form
 Level of Measurement: Ratio and Interval
B. Qualitative
 are in textual (categorical) form
 Level of Measurement: Ordinal and Nominal

Level of Measurement
A. Ratio
 Numerical variable with absolute zero
B. Interval
 Numerical variable with relative zero
C. Ordinal
 Categorical variable with order
D. Nominal
 Categorical variable with no order

Lesson #06: Descriptive Statistics


Measures Of Central Tendency

 also known as “average”


• Mean
• Median
• Mode

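A. MEAN (arithmetic mean)

 the sum of the observations divided by the number of
observations.

Population Mean: $\mu = \frac{\sum x}{N}$, where N is the population size
Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where n is the sample size

Example:
A marketing specialist gathered five randomly selected
customers, and their ages (in years) are 19, 25, 32, 27, and 41.
Find the mean age of the customers.
Mean = (19 + 25 + 32 + 27 + 41)/5 = 144/5 = 28.8 years
 If x1, x2, …, xn is a random sample from a population
with mean μ, then the sample mean $\bar{x} = \frac{\sum x}{n}$ is an unbiased
estimate of μ.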
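
As a quick check of the ages example, the mean can be computed directly from its definition. This is a minimal Python sketch; the variable names are illustrative and not part of the notes.

```python
import statistics

ages = [19, 25, 32, 27, 41]  # the five customer ages from the example

# Mean by definition: sum of observations / number of observations.
mean_by_formula = sum(ages) / len(ages)

# The standard library computes the same quantity.
mean_by_library = statistics.mean(ages)

print(mean_by_formula)  # 28.8
```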

Example 2:

B. MEDIAN

 the middle value of the ordered observations

Example 1:
A marketing specialist gathered five randomly selected
customers, and their ages (in years) are 19, 25, 32, 27, and 41.
What is the median age of the customers?
Arranging the observations in ascending order: 19, 25, 27, 32, 41.
The middle value is 27.
Example 2:
A researcher wants to determine the cholesterol level
(mg/dL) of all six residents of Guyan Island.
Observations are as follows: 120, 120, 140, 150, 160, 190.
Find the median.
Arranging the observations in ascending order: 120, 120, 140, 150,
160, 190.
The middle values are 140 and 150, so take their mean.
Median = (140+150)/2 = 145
Given that $x_1 \le x_2 \le \dots \le x_n$, the median is $x_{\frac{1}{2}(n+1)}$,
the observation at position $\frac{1}{2}(n+1)$. When n is even, this
position falls halfway between the two middle observations,
so the median is their mean.
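
The position rule (n + 1)/2 can be sketched in Python as follows. The function name `median_by_position` is hypothetical, and the sketch assumes a non-empty list of numbers.

```python
def median_by_position(observations):
    """Median via the 1-based position (n + 1) / 2 in the ordered data."""
    ordered = sorted(observations)
    n = len(ordered)
    position = (n + 1) / 2       # 1-based position of the median
    lower = int(position)        # whole part of the position
    if position == lower:        # odd n: a single middle value
        return ordered[lower - 1]
    # even n: the position ends in .5, so average the two middle values
    return (ordered[lower - 1] + ordered[lower]) / 2


ages = [19, 25, 32, 27, 41]
cholesterol = [120, 120, 140, 150, 160, 190]
print(median_by_position(ages))         # 27
print(median_by_position(cholesterol))  # 145.0
```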
C. MODE

 the most frequent observation(s)
 A set of observations with one mode is called
unimodal, with two modes bimodal, with three modes
trimodal, and with more than three modes multimodal or
polymodal.
Example 1:
A researcher wants to determine the cholesterol level
(mg/dL) of all six residents of Guyan Island.
Observations are as follows: 120, 120, 140, 150, 160, 190.
What is the mode?
Mode = 120
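
A small Python sketch of finding the mode(s) and naming the result with the labels above. The helper `describe_modes` is hypothetical; it relies on `statistics.multimode` from the standard library (Python 3.8+).

```python
import statistics


def describe_modes(observations):
    """Return the mode(s) and the label used for that number of modes."""
    modes = statistics.multimode(observations)  # all most-frequent values
    labels = {1: "unimodal", 2: "bimodal", 3: "trimodal"}
    label = labels.get(len(modes), "multimodal")
    return modes, label


cholesterol = [120, 120, 140, 150, 160, 190]
print(describe_modes(cholesterol))  # ([120], 'unimodal')
```

One caveat of this sketch: when every value occurs exactly once, `multimode` returns all the values, whereas such data is usually described as having no mode.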

When Do We Use The Different Measures Of Central Tendency?

Other Measures Of Position

 also known as “quantiles”
• Quartiles
• Deciles
• Percentiles

A. Quartiles

Interpolation: $Q_n + (\text{decimal part})(Q_n - Q_x)$

B. Deciles

Interpolation: $D_n + (\text{decimal part})(D_n - D_x)$

C. Percentiles

Interpolation: $P_n + (\text{decimal part})(P_n - P_x)$
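
The position formulas for these quantiles did not survive in the notes, so the Python sketch below makes an assumption: the k-th quantile sits at the 1-based position k(n + 1)/parts in the ordered data, and a fractional position is resolved by linear interpolation between the two neighbouring observations. This rule is chosen only because it reproduces the values Q1 = 22 and Q3 = 36.5 that the notes use for the ages 19, 25, 27, 32, 41. The function name `quantile` is hypothetical.

```python
def quantile(observations, k, parts):
    """k-th quantile when the data is divided into `parts` equal parts.

    Assumes the position rule k * (n + 1) / parts (1-based) and linear
    interpolation between the neighbouring ordered observations.
    """
    ordered = sorted(observations)
    n = len(ordered)
    position = k * (n + 1) / parts
    whole = int(position)          # whole part of the position
    decimal = position - whole     # decimal part of the position
    # Positions that fall outside the data are clamped to the end values.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower, upper = ordered[whole - 1], ordered[whole]
    return lower + decimal * (upper - lower)


ages = [19, 25, 32, 27, 41]
q1 = quantile(ages, 1, 4)       # first quartile
q3 = quantile(ages, 3, 4)       # third quartile
d5 = quantile(ages, 5, 10)      # fifth decile
p50 = quantile(ages, 50, 100)   # fiftieth percentile
print(q1, q3, d5, p50)  # 22.0 36.5 27.0 27.0
```

Quartiles, deciles, and percentiles differ only in the number of parts (4, 10, and 100), which is why one function can cover all three.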

Measures Of Variation
• Range
• Interquartile Range (IQR)
• Mean Absolute Deviation
• Variance
• Standard Deviation
• Coefficient Of Variation

A. Range

 the difference between the highest and lowest
observations
Example: 19, 25, 27, 32, 41.
Range = 41 - 19 = 22
B. Interquartile Range (IQR)

 the difference between Q3 and Q1
IQR = Q3 - Q1 = 36.5 - 22 = 14.5
Additionally, the lower bound (LB) and upper bound (UB) are:
LB = Q1 - 1.5(IQR) = 22 - 1.5(14.5) = 0.25
UB = Q3 + 1.5(IQR) = 36.5 + 1.5(14.5) = 58.25
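
The range, IQR, and the two bounds can be reproduced with Python's standard library. This sketch assumes the quartiles come from the ages 19, 25, 27, 32, 41, and it uses the `"exclusive"` method of `statistics.quantiles`, which follows an (n + 1) position rule and matches Q1 = 22 and Q3 = 36.5.

```python
import statistics

ages = [19, 25, 27, 32, 41]

# With n=4 the function returns the three cut points [Q1, Q2, Q3].
q1, q2, q3 = statistics.quantiles(ages, n=4, method="exclusive")

data_range = max(ages) - min(ages)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# Observations outside the bounds are commonly flagged as potential outliers.
outliers = [x for x in ages if x < lower_bound or x > upper_bound]

print(data_range, iqr, lower_bound, upper_bound)  # 22 14.5 0.25 58.25
print(outliers)  # []
```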

The Boxplot

 also known as the Box and Whiskers plot

C. Mean Absolute Deviation

 the average distance of each observation from the mean
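
The formula itself is missing from the notes, so the Python sketch below works from the definition alone: take the absolute distance of each observation from the mean, then average those distances. Dividing by n (rather than n - 1) is an assumption that follows from the word "average".

```python
ages = [19, 25, 27, 32, 41]

mean = sum(ages) / len(ages)  # 28.8

# Average of the absolute distances from the mean.
mad = sum(abs(x - mean) for x in ages) / len(ages)

print(round(mad, 2))  # 6.16
```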

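D. Variance
Population Variance: $\sigma^2 = \frac{\sum (x-\mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$

If x1, x2, …, xn is a random sample from a population
with variance $\sigma^2$, then $s^2 = \frac{\sum (x-\bar{x})^2}{n-1}$ is an unbiased
estimate of $\sigma^2$.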

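E. Standard Deviation
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x-\mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}}$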
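
The variance and standard deviation can be sketched in Python using the usual definitions: the population versions divide the sum of squared deviations by N, the sample versions divide by n - 1, and the standard deviation is the square root of the variance. The ages are treated here as both a population and a sample purely to show the two divisors side by side.

```python
import math
import statistics

ages = [19, 25, 27, 32, 41]
n = len(ages)
mean = sum(ages) / n
squared_deviations = sum((x - mean) ** 2 for x in ages)

population_variance = squared_deviations / n     # divide by N
sample_variance = squared_deviations / (n - 1)   # divide by n - 1
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library.
assert math.isclose(sample_variance, statistics.variance(ages))
assert math.isclose(population_sd, statistics.pstdev(ages))

print(round(population_variance, 2), round(sample_variance, 2))  # 54.56 68.2
print(round(population_sd, 2), round(sample_sd, 2))              # 7.39 8.26
```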

F. Coefficient Of Variation

 Used to compare the variability of two or more
variables with different means.
 Used to compare the variability of two or more
variables with different units of measurement.

Population CV: $CV = \frac{\sigma}{\mu} \times 100\%$
Sample CV: $CV = \frac{s}{\bar{x}} \times 100\%$
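
A Python sketch of comparing variability across different units, assuming the usual definition of the coefficient of variation (standard deviation divided by the mean, times 100). For simplicity both data sets from the earlier examples are treated as samples, which is an assumption; the helper name `coefficient_of_variation` is hypothetical.

```python
import statistics


def coefficient_of_variation(sample):
    """Sample CV in percent: (s / mean) * 100."""
    return statistics.stdev(sample) / statistics.mean(sample) * 100


ages_years = [19, 25, 27, 32, 41]
cholesterol_mg_dl = [120, 120, 140, 150, 160, 190]

cv_ages = coefficient_of_variation(ages_years)
cv_chol = coefficient_of_variation(cholesterol_mg_dl)

print(round(cv_ages, 1), round(cv_chol, 1))  # 28.7 18.1
```

Because the CV is unit-free, the two percentages can be compared directly: relative to their means, the ages vary more than the cholesterol levels.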

Skewness

 Normal distribution: SK = 0
 Describes the shape (asymmetry) of the distribution
curve relative to the normal curve
Types of Skewness
1. Normal Curve
 Mean = Median = Mode
 SK = 0
2. Positively-Skewed
 Mean > Median > Mode
 SK > 0
 Long positive side (Right)
3. Negatively-Skewed
 Mean < Median < Mode
 SK < 0
 Long negative side (Left)
Formula:

$SK = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$
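
The skewness formula, SK = 3(Mean - Median) / Standard Deviation, can be sketched in Python. The notes do not say which standard deviation to use, so the sample standard deviation is assumed here; the function names are hypothetical.

```python
import statistics


def pearson_skewness(observations):
    """SK = 3 * (mean - median) / standard deviation (sample SD assumed)."""
    mean = statistics.mean(observations)
    median = statistics.median(observations)
    sd = statistics.stdev(observations)
    return 3 * (mean - median) / sd


def skewness_type(sk):
    """Classify SK using the three cases listed in the notes."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "normal (SK = 0)"


ages = [19, 25, 27, 32, 41]
sk = pearson_skewness(ages)
print(round(sk, 2), skewness_type(sk))  # 0.65 positively skewed
```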

Kurtosis

 Normal distribution: K = 3
 Describes the height (peakedness) of the distribution
curve relative to the normal curve
Types of Kurtosis
1. Mesokurtic Curve
 K = 3
2. Leptokurtic Curve
 K > 3
 Highest peak
3. Platykurtic Curve
 K < 3
 Lowest peak
Formula:

$K = \frac{\sum (x - \text{mean})^4}{n \, (\text{standard deviation})^4}$
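
The kurtosis formula, K = ∑(x - mean)^4 / (n × (standard deviation)^4), can be sketched in Python. The notes do not specify the standard deviation, so the population version (dividing by n) is assumed here; the function names are hypothetical.

```python
def kurtosis(observations):
    """K = sum((x - mean)^4) / (n * sd^4), with the population SD assumed."""
    n = len(observations)
    mean = sum(observations) / n
    variance = sum((x - mean) ** 2 for x in observations) / n
    fourth_power_sum = sum((x - mean) ** 4 for x in observations)
    return fourth_power_sum / (n * variance ** 2)  # sd^4 equals variance^2


def kurtosis_type(k):
    """Classify K using the three cases listed in the notes."""
    if k > 3:
        return "leptokurtic"
    if k < 3:
        return "platykurtic"
    return "mesokurtic"


ages = [19, 25, 27, 32, 41]
k = kurtosis(ages)
print(round(k, 2), kurtosis_type(k))  # 2.13 platykurtic
```

With only five observations the result says little about the true shape of the distribution; the example only shows the arithmetic.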
