Welcome to
Quantitative
Methods Class
Instructor: Ms. Ayesha N. Rao
Instructor's Profile
Has completed several time-series projects with the Pakistan
Institute of Development Economics (PIDE), the Central Board
of Revenue (CBR) and the Federal Bureau of Statistics (FBS).
PhD in progress
Holds an M.Phil in Statistics
Holds an M.Sc in Statistics
Publications
1. Zahid Asghar and Ayesha Nazuk. Iran-Pakistan-India
Gas Pipeline: An Economic Analysis in a Game
Theoretic Framework. The Pakistan Development
Review, Vol. 46, No. 4, Part II (2007), pp. 537-550.
2. Ayesha Nazuk and Javid Shabbir. A New Mixed
Randomized Response Model, Proceedings of the
European Conference on Quality in Official
Statistics (Q2010), held in Finlandia Hall, Helsinki,
Finland (4th-6th May 2010).
3. Estimating the proportion of liars in NUST, NUST
Journal of Business and Finance.
For details, search for AYESHA NAZUK on Google.
More Publications
4. Ayesha Nazuk, Fiza Amer, Quratulain Tanvir, Saba Nawaz, Sahar Zahid
Siddiqui, Shahwaiz Alvi (2013) Entrepreneurial Education in Public
Sector Institutes of Rawalpindi/Islamabad, International Journal of
Management Sciences and Business Research, Vol. 2, Issue 2, pp. 56-72.
5. Ayesha Nazuk, Sadia Nadir and Javid Shabbir, (2013), Adjustment of
the auxiliary variable for estimation of a finite population mean, article
accepted for publication in Lahore Journal of Operations Research and
Statistics.
6. Ayesha Nazuk, Yusra Siddiquii, Maha Gul, Rana Iradat Shareef, Meraj
Murtaza and Raza Abbas Rajput, Analysis of Cheating disorder among
university students through Randomized Response Technique,
International Journal of Business and Behavioral Sciences Vol. 3, No.3;
2013, pp. 15-22.
7. Book review of "Bio-statistical Analysis" by Jerrold H. Zar, NUST Journal
of Business and Economics, Vol 2 No. 2, pp. 98-99.
Workshops Conducted
Registered with PDC NUST and has organized several
trainings on statistics and software.
Trained the faculty of NUST Business School in the tools of
econometrics.
Trained the faculty of AIOU in SPSS.
Has delivered guest lectures at various
universities/organizations.
Student Consultation
Students are expected to go through the class
lectures and notes on a continuous basis. In
case of a problem, you are welcome to
contact me.
Appointment hours are posted on LMS. Any
other time is subject to prior appointment
via the office phone, i.e. 90853560.
Marking Scheme
Mid-Term = 30
Terminal Exam = 40
Assignments = 15
Quizzes = 15
In case of a term paper, marks for assignments will be scaled down.
Contact Details
Ms. Ayesha N. Rao
Office: Room 310, NBS Faculty
Block.
Phone: +92-51-9085-3560
E-mail:
ayesha.nazuk@nbs.edu.pk or
ayesha.nazuk@s3h.nust.edu.pk
COURSE OBJECTIVES
This course provides an introduction to theoretical and
applied statistics and mathematics for business and
economics. The main objective is to stress the
importance of applying statistical analysis to the
solution of common business problems.
Text Book
- will be uploaded on LMS soon.
Assignments
Students are advised to form study groups
(each consisting of 3 to 4 students) and are strongly
encouraged to study together to solve homework problems.
Submit assignments as a group. NO
INDIVIDUAL ASSIGNMENTS.
Examination and Quizzes
2-3 quizzes, a mid-term exam and a
comprehensive final exam will be given in
class during the semester.
Quizzes, of course, will be solved
independently.
Review Worksheets
If required, review worksheets will be posted
on LMS.
You are encouraged to solve and discuss
these mutually and with the instructor.
Make-Up Quiz
As a rule, there will be no make-ups for missed
quizzes, regardless of the reason.
Late assignments will not be accepted.
DO NOT REQUEST THIS.
A make-up quiz may be given only under extreme
circumstances. For such a request, please
submit a written application.
Absentees are expected to cover the previous
lecture and not come to class unprepared.
It has been noticed in the past few semesters that
absentees try to impede the pace of the
lecture.
Such behavior is not at all welcome.
CLASSROOM POLICY
I expect you to conduct yourself with
professional courtesy in the classroom.
You should not talk to other students during
lectures unless directed to do so by the
instructor.
Brief discussions in a low voice, to
ensure understanding, are acceptable.
Please turn off cell phones, beepers, iPods and pagers.
Course Outline
Measures of Central Tendency &
Dispersion
The Arithmetic Mean, The Mode, The Median
Range, Skewness, Kurtosis, Variance &
Standard Deviation
Course Outline
Probability
Basic Definitions: Events, Sample Space &
Probability
Rules of Probability
The Rule of Complements
Addition Law & Mutually Exclusive Events
Conditional Probability
Independence of Events
Product Rules for independent events
Course Outline
Probability Distributions
Normal distribution
Student-t distribution
Course Outline
Hypothesis Testing
The Concept of Hypothesis Testing
Type I & Type II Errors, Computing the p-Value
One-tailed & Two-tailed Tests
Tests of the Mean of a Normal Distribution:
Population variance known
Tests of the Mean of a Normal Distribution:
Population variance unknown
Course Outline
Hypothesis Testing II
Tests of the Difference Between Two
Population Means
Tests of the Difference Between Two
Population Proportions
Tests of the Equality of the Variances Between
Two Normally Distributed Populations
Course Outline
Regression & Correlation Analysis
Regression versus Correlation
Regression versus Causation
Classical Linear Regression Model &
Assumptions
Method of Ordinary Least Squares
Method of Logistic Regression
Course Outline
Differentiation
Concepts of Derivatives
Rules of Derivatives
Examples & Practice
Applications in Business
Course Outline
Optimization
Concavity & Inflection Points
Identification of Maxima & Minima
Business Applications
Course Outline
Depreciation &/or Annuities
Straight-line Method, Sum-of-Years'-Digits
Method, Declining Balance Method, Units of
Production Method & The MACRS Method
Annuities, Sinking Funds
Course Outline
Markup
Markup on Cost
Markup on Selling Price, Relationships
between markups
Markdown and Shrinkage
Course Outline
Discounts
Trade discount, Trade discount series
Cash discounts, Discounts and Freight terms
Scales of Measurement
Nominal data is just for naming, e.g. our
names, names of cities, CNIC numbers, roll
numbers of students (provided that they are
not assigned as per merit). All arithmetic
operations are invalid.
Ordinal data is for naming with a sense of
ranking, e.g. level of management from low to
high, or numbers on the back of cricketers
provided that they are based on ICC
ranking.
Scales of Measurement
Interval data is purely numeric but does not
have a true zero point. The Fahrenheit and
Celsius scales of temperature are both
examples of data at the interval level of
measurement. You can talk about 30 degrees
being 60 degrees less than 90 degrees, so
differences do make sense. However, 0 degrees
(on either scale), cold as it may be, does not
represent the total absence of temperature.
PDC NUST workshop on SPSS-Trainer Ms.
Ayesha Nazuk Rao (Assistant Professor
NUST)-July 29, 2013 to July 31 2013.
Scales of Measurement
Ratio scale data is purely numeric and has
a true zero point. Distances, in any system
of measurement, give us data at the ratio level.
A measurement such as 0 feet does make
sense, as it represents no length. Furthermore,
2 feet is twice as long as 1 foot. So ratios can
be formed between the data.
Types of Averages
- Commercial averages
- Mathematical averages
- Positional averages
A mathematical average is based on an algebraic formula:
- Harmonic Mean
- Arithmetic Mean
- Geometric Mean
Positional averages are based on their relative location:
- Mode
- Median
- Quartiles, Octiles, Deciles, Percentiles
Commercial Averages
- Moving Averages (to know the trend in a time series)
- Progressive Average (used to report average profits/losses etc. in the starting years of a firm)
- Composite Average
Calculation of Progressive Average
Calculate the progressive average of the data.
Solution: Calculation of progressive average
Year | Sale (in lakhs of $) | Progressive total | Progressive average
1972 | 8 | 8 | 8.00
1973 | 9 | 17 | 8.50
1974 | 8 | 25 | 8.33
1975 | 7 | 32 | 8.00
1976 | 8 | 40 | 8.00
1977 | 9 | 49 | 8.17
1978 | 10 | 59 | 8.43
1979 | 11 | 70 | 8.75
1980 | 11 | 81 | 9.00
1981 | 12 | 93 | 9.30
1982 | 10 | 103 | 9.36
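The progressive average is simply a running total divided by the number of years so far. A minimal Python sketch of that calculation, using the sales figures from the table (the variable names are illustrative, not part of the lecture):

```python
from itertools import accumulate

# Yearly sales (in lakhs of $) for 1972..1982, taken from the table above.
sales = [8, 9, 8, 7, 8, 9, 10, 11, 11, 12, 10]
years = range(1972, 1972 + len(sales))

# Progressive total: running sum of sales up to and including each year.
progressive_totals = list(accumulate(sales))

# Progressive average: running total divided by the number of years so far.
progressive_averages = [
    total / count for count, total in enumerate(progressive_totals, start=1)
]

for year, sale, total, avg in zip(years, sales, progressive_totals, progressive_averages):
    print(f"{year}  {sale:>3}  {total:>4}  {avg:.2f}")
```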
Composite Average-Use
The Dow Jones Composite
Average is a stock index from
Dow Jones Indexes that tracks 65
prominent companies. The
average's components are every
stock from the Dow Jones
Industrial Average, the Dow Jones
Transportation Average, and the
Dow Jones Utility Average.
C.A = Sum of all averages / number of averages
Arithmetic Mean
- Unbiased
- Scale data only
- Testing of hypothesis possible
- Uses complete data
- Further treatment possible
- Outliers need to be deleted
- Not robust
The Arithmetic Mean is not
independent of origin and scale
Let Y = aX + b
Then mean of Y = a (Mean of X) + b
Let Y = aX - b
Then mean of Y = a (Mean of X) - b
Addition/subtraction changes the origin and
multiplication/division changes the scale.
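The effect of a change of origin and scale can be checked numerically: if Y = aX + b, the mean of Y equals a times the mean of X, plus b. A small sketch, where the data and the constants a and b are made up purely for illustration:

```python
from statistics import mean

x = [2, 4, 6, 8]   # illustrative data, not from the lecture
a, b = 3, 5        # illustrative scale (a) and origin shift (b)

# Apply the linear transformation Y = aX + b to every observation.
y = [a * xi + b for xi in x]

mean_x = mean(x)
mean_y = mean(y)

# The mean follows the same transformation as the data.
print(mean_y, a * mean_x + b)
```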
Mode
- Not unbiased
- Relies on only one value
- Testing of hypothesis not possible
- No further treatment
- Can be found in any scale
- Robust
Median
- Not unbiased
- Testing of hypothesis possible through nonparametric test
- Relies on one or at most two values
- No further treatment
- Ordinal scale or higher
- Robust
Geometric Mean
- Not unbiased
- Relies on complete data
- No further treatment
- Testing of hypothesis possible
- Scale data only
- Robust
Mode is valid for use in both symmetric and skewed data sets.
A data set can have no mode or more than one mode.
Median is also valid for use in both symmetric and skewed data
sets. It can be used in ordinal data sets as well.
Mean, or Arithmetic Mean, is valid for use in symmetric data
only. It is valid for use in scale data only.
Geometric Mean is a measure that uses complete data
and is yet robust.
Trimmed means are used when we want a picture of the data
free from extreme values.
One can use a 2.5%, 5% or 10% trimmed mean.
For a 10% trimmed mean we ignore the smallest 10%
and the largest 10% of observations, then calculate the
mean of the truncated data set.
The Winsorized mean is also used in the presence of
outliers. It replaces outliers with the most extreme
values in the remaining data set.
See the example that follows.
For a sample of 10 numbers (from x1, the smallest, to x10, the largest),
the 20% Winsorized mean (10% replaced in each tail) is
$\frac{x_2 + x_2 + x_3 + x_4 + x_5 + x_6 + x_7 + x_8 + x_9 + x_9}{10}$
The key is in the repetition of x2 and
x9: the extras substitute for the
original values x1 and x10, which have
been discarded and replaced.
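The trimmed and Winsorized means described above can be sketched as follows. This is a minimal illustration, assuming the proportion is the share cut from each tail (below 0.5) and that the number of observations affected is rounded down. The function names and the sample are hypothetical.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from each tail."""
    data = sorted(values)
    k = int(len(data) * proportion)      # observations removed at each end
    kept = data[k:len(data) - k]
    return sum(kept) / len(kept)


def winsorized_mean(values, proportion):
    """Mean after replacing `proportion` of each tail with the nearest kept value."""
    data = sorted(values)
    k = int(len(data) * proportion)      # observations replaced at each end
    if k:
        data[:k] = [data[k]] * k         # low outliers -> smallest remaining value
        data[-k:] = [data[-k - 1]] * k   # high outliers -> largest remaining value
    return sum(data) / len(data)


# Illustrative sample of 10 numbers with one low and one high outlier.
sample = [1, 12, 13, 14, 15, 16, 17, 18, 20, 95]

print(sum(sample) / len(sample))      # ordinary mean, pulled up by 95
print(trimmed_mean(sample, 0.10))     # drops 1 and 95
print(winsorized_mean(sample, 0.10))  # replaces 1 with 12 and 95 with 20
```

With 10 observations and a proportion of 0.10, exactly one value is affected in each tail, which matches the x2 and x9 repetition in the example above.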
Quadratic Mean is the SQUARE ROOT of
(the sum of squares divided by the no. of values).
Q.M >= A.M
Q.M performs better in the
presence of negative values.
Example: Q.M
X | X**2
-100 | 10000
0 | 0
100 | 10000
A.M = 0
Q.M = 81.64966
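The quadratic mean example can be reproduced with a few lines of Python. This is only a sketch of the definition given above (square root of the mean of the squares); the variable names are illustrative.

```python
import math

x = [-100, 0, 100]   # data from the example above

# Arithmetic mean: positive and negative values cancel out.
arithmetic_mean = sum(x) / len(x)

# Quadratic mean: square root of (sum of squares / number of values).
quadratic_mean = math.sqrt(sum(v * v for v in x) / len(x))

print(arithmetic_mean, round(quadratic_mean, 5))
```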
Data Type and Average Used
- Ratios of change, proportions, percentages etc.: G.M
- Rate of change per unit of time such as speed, number of items produced per day etc.: H.M
- Well behaved (that is, outlier free) data which is purely quantitative: A.M
- Nominal: Mode
- Ordinal: Median
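To see why the type of data drives the choice of average, here is a short sketch using Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or later). The growth factors and speeds are made-up illustrations, not lecture data.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Ratios of change: sales grow by 10% and then by 20%.
growth_factors = [1.10, 1.20]
average_growth = geometric_mean(growth_factors)   # factor that compounds to 1.32

# Rates per unit of time: the same distance driven at 40 km/h and at 60 km/h.
speeds = [40, 60]
average_speed = harmonic_mean(speeds)             # true average speed for the trip

print(round(average_growth, 4), average_speed, mean(speeds))
```

The arithmetic mean of the speeds (50) overstates the true average speed (48), because more time is spent travelling at the slower speed.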
Median
The middle observation of an arrayed data set
(that is, one arranged in either ascending or
descending order) is called the median.
For ungrouped data it is item
number (n+1)/2.
E.g. if the data set is 4, 6, 1, 5, 4, 8, we
array it as 1, 4, 4, 5, 6, 8; median = item at
position (6+1)/2 = 3.5th item = (3rd item + 4th
item)/2 = (4+5)/2 = 4.5. So the median is 4.5.
If the data is 1, 3, 7, the median is the (3+1)/2 = 2nd
item = 3.
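The (n+1)/2 position rule can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes a non-empty list of numbers.

```python
def median_by_position(values):
    """Median of ungrouped data using the (n + 1) / 2 position rule."""
    data = sorted(values)                 # array the data in ascending order
    position = (len(data) + 1) / 2        # 1-based position of the median
    lower = data[int(position) - 1]       # item at the whole-number position
    if position.is_integer():
        return lower                      # odd n: a single middle item
    upper = data[int(position)]           # even n: average the two middle items
    return (lower + upper) / 2


print(median_by_position([4, 6, 1, 5, 4, 8]))   # 3.5th item -> (4 + 5) / 2
print(median_by_position([1, 3, 7]))            # 2nd item
```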
Median for grouped data
$\text{Median} = l + \frac{h}{f}\left(\frac{n}{2} - C\right)$
l = LCB (lower class boundary) of median class
h = width of median class
f = frequency of median class
C = cumulative frequency of class before median class
n = total frequency
Median can be found for ordinal data
For example, in a shop there are
clothes in different shades of a
color. One can arrange the
clothes from lighter shades to
darker as follows.
S1, S2, S2, S3, S4, S5, S6
Median shade is at the (7+1)/2 = 4th
item = S3
Grouped Data Median Calculations
Interval | f | C.F | C.B
0-1 | 5 | 5 | -0.5 to 1.5
2-3 | 6 | 11 | 1.5 to 3.5 (median class, as 6.5 lies here)
4-5 | 2 | 13 | 3.5 to 5.5
Total | 13 | ------- |
Locate the class whose C.F covers
n/2 = sum(f)/2 = 13/2 = 6.5
Median = 1.5 + (2/6)(6.5 - 5) = 2.0
So, the value 2.0
cuts the previous data into
two equal halves.
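The grouped-data median can be sketched in Python with the formula Median = l + (h/f)(n/2 - C). The function name and argument layout are assumptions made for illustration; the class boundaries and frequencies come from the table above, where n/2 = 13/2 = 6.5.

```python
def grouped_median(lower_boundaries, frequencies, width):
    """Median for grouped data: l + (h / f) * (n / 2 - C)."""
    half = sum(frequencies) / 2           # n / 2
    cumulative = 0                        # C: cumulative frequency before the class
    for l, f in zip(lower_boundaries, frequencies):
        if cumulative + f >= half:        # first class whose C.F covers n / 2
            return l + (width / f) * (half - cumulative)
        cumulative += f
    raise ValueError("no median class found")


# Class boundaries -0.5 to 1.5, 1.5 to 3.5, 3.5 to 5.5 with frequencies 5, 6, 2.
median = grouped_median([-0.5, 1.5, 3.5], [5, 6, 2], width=2)
print(median)
```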
Mode for grouped data
Mode = l + [(f1 - f0) / (2 x f1 - f0 - f2)] x h
l = the lower limit of modal class = 15
f1 = frequency of modal class = 7
f0 = frequency of class preceding the modal class = 5
f2 = frequency of class succeeding the modal class = 2
h = size of class intervals = 5
Example Continued
Mode = 15 + [(7 - 5) / (2 x 7 - 5 - 2)] x 5
Mode = 15 + [2 / (14 - 7)] x 5
Mode = 15 + (2 / 7) x 5
Mode = 15 + (10 / 7)
Mode = 15 + 1.43
Mode = 16.43
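The grouped-data mode calculation can be checked with a short sketch. The function name is hypothetical; the arguments follow the symbols l, f1, f0, f2 and h defined above. Carried to full precision, 10/7 is about 1.4286, so the mode rounds to 16.43.

```python
def grouped_mode(l, f1, f0, f2, h):
    """Mode for grouped data: l + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    return l + (f1 - f0) / (2 * f1 - f0 - f2) * h


mode = grouped_mode(l=15, f1=7, f0=5, f2=2, h=5)
print(round(mode, 2))
```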
Summary of Central Tendency Measures
Measure | Equation | Description
Mean | $\sum X_i / n$ | Balance Point
Median | $(n+1)/2$ Position | Middle Value When Ordered
Mode | none | Most Frequent
Quantiles: Quartiles, Deciles and
Percentiles
Quantiles are break points that divide an
arrayed data set into equal parts.
Quartiles divide data into four equal parts.
There are three quartiles: Q1, Q2 and Q3.
Q1 = [(n+1)/4]th item; 25% of the data lies
before it.
Q2 = [2(n+1)/4]th item = [(n+1)/2]th item,
which is the median. 50% of the data lies before Q2.
Q3 = [3(n+1)/4]th item; 75% of the data lies before
it.
Quartiles
1. Measure of Non-central Tendency
2. Split Ordered Data into 4 Quarters
25% | 25% | 25% | 25%, separated by Q1, Q2 and Q3
3. Position of i-th Quartile
Positioning Point of $Q_i = \frac{i(n+1)}{4}$
Deciles
Deciles are 9 break points that divide an
arrayed data set into 10 equal parts.
D1 = [(n+1)/10]th item; 10% of the data lies before
it.
D2 = [2(n+1)/10]th item; 20% of the data lies before it.
D9 = [9(n+1)/10]th item; 90% of the data lies before
it.
Note that D5=Q2=Median
Percentiles
Percentiles are 99 break points that divide an
arrayed data set into 100 equal parts.
P25=Q1
P75=Q3
P10=D1
Other relationships may easily be seen.
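Quartiles, deciles and percentiles all use the same position rule, i(n+1)/k, with k = 4, 10 or 100. The sketch below assumes linear interpolation when the position is fractional, which generalizes the averaging of the 3rd and 4th items used for a 3.5th item. The function name and the data are illustrative.

```python
def quantile(values, i, parts):
    """i-th break point when arrayed data are split into `parts` equal parts.

    Uses the position i * (n + 1) / parts; fractional positions are handled
    by linear interpolation between neighbouring items (an assumption here).
    """
    data = sorted(values)
    position = i * (len(data) + 1) / parts    # 1-based position
    whole = int(position)
    fraction = position - whole
    if whole < 1:
        return data[0]                        # position falls before the first item
    if whole >= len(data):
        return data[-1]                       # position falls at or past the last item
    return data[whole - 1] + fraction * (data[whole] - data[whole - 1])


data = [2, 4, 5, 7, 8, 10, 12, 13, 15, 18, 20]   # illustrative, n = 11

q1, q2, q3 = (quantile(data, i, 4) for i in (1, 2, 3))
print(q1, q2, q3)
print(quantile(data, 5, 10), quantile(data, 25, 100))   # D5 and P25
```

Here D5 equals Q2 (the median) and P25 equals Q1, in line with the relationships listed above.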
Measures of
Variation
Data Summary: A Glance
To recall the data compaction process:
1. To summarize the data we use graphs and charts.
2. For more technical analysis, a frequency distribution
is made.
3. To report a summary value that may represent the
data, we find a measure of central tendency.
4. BUT there may be data sets that have the same value of
central tendency but differ in terms of
variation/scatter around the central value.
Illustrative example
On average, about 31 patients get satisfactory
treatment from each of D1 and D2. However, the
data (the number of patients who come to D1 or
D2) are very different.
Doctor | Mean | Values (No. of patients)
D1 | 30.75 | 12, 35, 36, 40
D2 | 30.75 | 1, 4, 3, 115
Measures of Central Tendency are
Insufficient
Measures of central tendency cannot convey the
full picture of data.
Specifically they cannot tell us the
amount of scatter in the data.
If a measure of scatter ( variation)
accompanies a measure of central
tendency, then data can be more
efficiently described.
Measure of variation
Definition of Measure of Variation
A measure of variation is a measure that
describes how spread out or scattered a set of
data is. It is also known as a measure of
dispersion or a measure of spread.
Examples of Measure of Variation
Some Important measures of variation:
The range, the variance, and the standard
deviation.
Range
The range is the distance between the lowest
data point and the highest data point.
Range can be misleading since it does not
take into consideration every value. Consider
each of the following data sets:
1,10,10,10,10 and 1,2,5,8,10. Both have a
range of 9, yet the first data set is clearly not
as dispersed as the second.
Variance &
Standard Deviation
1. Measures of Dispersion
2. Most Common Measures
3. Consider How Data Are Distributed
4. Show Variation About Mean ($\bar{X}$ or $\mu$)
[Figure: data points on a number line from 4 to 12, with mean $\bar{X}$ = 8.3]
Sample Variance Formula
$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
n - 1 in denominator! (Use N if Population Variance)
Standard Deviation
The standard deviation of a set of scores
is a measure of variation of scores about
the mean. It is calculated by
$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
Procedure for finding the standard
deviation (ungrouped data)
1) Find the mean of the scores
2) Subtract the mean from each individual score
3) Square each of the values in step 2
4) Add up all the squares obtained in step 3
5) Divide the total in step 4 by n-1
6) Find the square root of step 5.
Ungrouped Sample Data: S.D
Find the standard deviation of the
data 1, 2, 12, 3, 6 and 11.
X | X - mean | (X - mean)^2
1 | -4.83 | 23.36
2 | -3.83 | 14.69
12 | 6.17 | 38.03
3 | -2.83 | 8.03
6 | 0.17 | 0.03
11 | 5.17 | 26.69
Sum | ----- | 110.833
The mean of X is 5.83.
Variance is (110.833/5) = 22.167
and S.D is 4.708.
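The six-step procedure for the sample standard deviation can be sketched as follows, using the data from the example. The variable names are illustrative, and `statistics.stdev` from the standard library is used only as a cross-check.

```python
import math
from statistics import stdev

data = [1, 2, 12, 3, 6, 11]
n = len(data)

mean = sum(data) / n                          # step 1: mean of the scores
deviations = [x - mean for x in data]         # step 2: subtract the mean
squared = [d ** 2 for d in deviations]        # step 3: square each deviation
sum_of_squares = sum(squared)                 # step 4: add up the squares
variance = sum_of_squares / (n - 1)           # step 5: divide by n - 1
sd = math.sqrt(variance)                      # step 6: take the square root

print(round(mean, 2), round(sum_of_squares, 3), round(variance, 3), round(sd, 3))
print(round(stdev(data), 3))                  # library result for comparison
```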
Standard Deviation: Grouped Sample Data
Interval | f | Midpoint (X) | X-A.M | (X-A.M)^2 | f(X-A.M)^2
0-1 | 5 | 0.5 | -1.538 | 2.3654 | 11.83
2-3 | 6 | 2.5 | 0.462 | 0.2134 | 1.28
4-5 | 2 | 4.5 | 2.462 | 6.0614 | 12.12
Total | 13 | ------ | | | 25.23
A.M = sum of f*X / sum of f = 26.50/13 = 2.038
S.D = square root of (25.23/(13-1)) = 1.45
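A sketch of the grouped-data calculation, assuming the sample (n - 1) divisor since the slide is titled as sample data. The class midpoints and frequencies are those of the table; the variable names are illustrative.

```python
import math

midpoints = [0.5, 2.5, 4.5]     # class midpoints X for intervals 0-1, 2-3, 4-5
frequencies = [5, 6, 2]

n = sum(frequencies)

# Weighted mean: sum of f * X divided by sum of f.
mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Weighted sum of squared deviations: sum of f * (X - mean)^2.
sum_of_squares = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))

# Sample standard deviation (n - 1 in the denominator).
sd = math.sqrt(sum_of_squares / (n - 1))

print(round(mean, 3), round(sum_of_squares, 2), round(sd, 2))
```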
Variance
Variance is the square of S.D
Because the differences are squared, the
units of variance are not the same as the units
of the data. Therefore, the standard deviation
is reported as the square root of the variance
and the units then correspond to those of the
data set.
Interpretation of Standard Deviation:
Some ideas to remember about
standard deviation and variance:
A small standard deviation means the data is
close together; a large standard deviation means the
data is widely spread.
At least 75% of all scores fall within 2
standard deviations of the mean and at
least 89% fall within 3 standard
deviations of the mean.
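The 75% and 89% figures are the values given by Chebyshev's inequality, which guarantees that at least 1 - 1/k^2 of any data set lies within k standard deviations of the mean (for k > 1). A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of data within k standard deviations of the mean (k > 1)."""
    return 1 - 1 / k ** 2


for k in (2, 3):
    print(k, f"{chebyshev_lower_bound(k):.1%}")
```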
Inter-quartile Range
When there are extreme values in a
distribution or when the distribution is
skewed, variance and standard deviation
are not true measures of spread. In these
situations the inter-quartile range or the
semi-inter-quartile range is the preferred
measure of spread.
The inter-quartile range is the difference
between Q3 and Q1, that is Q3 - Q1. The
semi-inter-quartile range is half of that
difference, (Q3 - Q1)/2.
Summary of Variation Measures
Measure | Equation | Description
Range | $X_{largest} - X_{smallest}$ | Total Spread
Interquartile Range | $Q_3 - Q_1$ | Spread of Middle 50%
Standard Deviation (Sample) | $\sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$ | Dispersion about Sample Mean
Standard Deviation (Population) | $\sqrt{\frac{\sum (X_i - \mu)^2}{N}}$ | Dispersion about Population Mean
Variance (Sample) | $\frac{\sum (X_i - \bar{X})^2}{n - 1}$ | Squared Dispersion about Sample Mean
Relative Measure of
Dispersion
Comparison of data sets
Up till now we have been analyzing a
single data set.
Direct comparison of variances/standard
deviations is not valid, because they
depend on the unit of measurement.
For example, suppose we have data on weights of
potatoes and other data on weights of milk
cartons. A variance of 0.1 kg may be
considered large for potatoes but small for
milk cartons.
Relative measures of variation are those
that help in comparing two or more data
sets, as to which data set is more variable.
Coefficient of Variation
It is defined as the ratio of the standard
deviation to the mean;
C.V=S.D/Mean
This is only defined for non-zero mean, and is
most useful for variables that are always
positive.
It does not have any meaning for data on an
interval scale.
S.D Vs C.V
For example, the value of the standard
deviation of a set of weights will be different
depending on whether they are measured in
kilograms or pounds. The coefficient of
variation, however, will be the same in both
cases as it does not depend on the unit of
measurement.
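The unit-independence of the coefficient of variation can be illustrated with a short sketch. The weights are made-up values, and the conversion factor 2.20462 lb per kg is the usual approximate figure.

```python
from statistics import mean, stdev

KG_TO_LB = 2.20462                       # approximate pounds per kilogram

weights_kg = [50.0, 60.0, 70.0, 80.0]    # illustrative data, not from the lecture
weights_lb = [w * KG_TO_LB for w in weights_kg]


def coefficient_of_variation(values):
    """C.V = S.D / Mean (defined only for a non-zero mean)."""
    return stdev(values) / mean(values)


# The standard deviation changes with the unit of measurement ...
print(round(stdev(weights_kg), 3), round(stdev(weights_lb), 3))
# ... but the coefficient of variation does not.
print(round(coefficient_of_variation(weights_kg), 4),
      round(coefficient_of_variation(weights_lb), 4))
```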
C.V interpretation
The smaller the C.V, the smaller the
variability in the data.
C.V Pros and Cons.
Advantages
The coefficient of variation is a
dimensionless number. So when
comparing between data sets with
different units or wildly different means,
one should use the coefficient of variation
for comparison instead of the standard
deviation.
Disadvantages
When the mean value is near zero, the
coefficient of variation is sensitive to small
changes in the mean, limiting its
usefulness.
Z-Scores
Z-scores are a means of answering the
question "how many standard deviations
away from the mean is this observation?" If
our observation X is from a population with
mean $\mu$ and standard deviation $\sigma$, then
$Z = \frac{X - \mu}{\sigma}$
Z Score for Sample
On the other hand, if the observation X is
from a sample with mean $\bar{X}$ and standard
deviation s, then
$Z = \frac{X - \bar{X}}{s}$
Z Score Interpretation
A positive (negative) Z-
score indicates that the
observation is greater than
(less than) the mean.
Example
In a certain city the mean price of a quart of
milk is 63 cents and the standard deviation is
8 cents. The average price of a package of
bacon is $1.80 and the standard deviation is
15 cents. If we pay $0.89 for a quart of milk
and $2.19 for a package of bacon at a 24-hour
convenience store, which is relatively more
expensive? To answer this, we compute Z-scores for each:
Solution
Z (Milk)=(0.89-0.63)/0.08=3.25
Z (Bacon)= (2.19-1.80)/0.15=2.60
Our Z-scores show us that we are overpaying
quite a bit more for the milk than we are for
the bacon.
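The milk and bacon comparison can be reproduced with a small sketch. The function name is hypothetical; all prices are expressed in dollars so that each observation and its mean and standard deviation share the same unit.

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_milk = z_score(0.89, mean=0.63, sd=0.08)    # quart of milk
z_bacon = z_score(2.19, mean=1.80, sd=0.15)   # package of bacon

print(round(z_milk, 2), round(z_bacon, 2))
print("milk" if z_milk > z_bacon else "bacon", "is relatively more expensive")
```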