Intro To Statistics

• Statistics has been defined in two ways.
• Some writers define it as 'statistical data', i.e., numerical
statements of facts, while others define it as 'statistical
methods', i.e., the complete body of principles and techniques
used in collecting and analysing such data.
Data Types

• Data: Data are a systematic record of the values taken by a
variable or a number of variables at a particular point of time
or over different points of time.
• Data collected at a single point of time over different
sections (which may be classified on demographic, geographic
or other considerations) are called cross-section data, whereas
data collected over a period of time are called time series data.
• Data may be quantitative or qualitative in nature. For
example, the heights of 50 students of FYBA, SCAC are
quantitative, whereas their religion is qualitative.
• Data of a quantitative nature are technically called
variables, whereas data of a qualitative nature are called
attributes.
• Variables may be discrete or continuous. If a variable can
take any value within its range, it is called a continuous
variable; otherwise it is called a discrete variable.
• For example, the height of students of FYBA, SCAC is a
continuous variable, whereas the number of students in
FYBA/FYB.Com in SCAC is a discrete variable.
Collection of Data

• Primary data and Secondary data
• Primary data are those which are collected for a
specific purpose directly from the field of enquiry and
hence are original in nature.
• On the other hand, data collected by someone but used
by another, or collected for one purpose and used for
another, are called secondary data.
• A better way to present raw figures is to arrange them in
ascending or descending order of magnitude; such an
arrangement is commonly termed an array.
• A bar ( | ), called a tally mark, is put against a number
each time it occurs. After a number has occurred four times,
the fifth occurrence is represented by putting a cross tally (/)
across the first four tallies. This technique facilitates the
counting of the tally marks at the end.
• The frequency of a value of a variable is the number of times
it occurs in the given data. The word 'frequency' is derived
from 'how frequently' a value occurs.
Solve
1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9.

Marks (X axis)   Tally     Frequency (Y axis)
0
1
2
3
4
5
6
7
8
9
Solution
Marks (X axis)   Tally     Frequency (Y axis)
0                |         1
1                |||       3
2                |         1
3                |         1
4                |         1
5                |         1
6                ||        2
7                          0
8                |         1
9                ||||/     5
Total                      16
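
The tallying above can also be checked mechanically. The following is a minimal Python sketch, assuming the sixteen marks listed in the exercise; the names `marks`, `array` and `frequency` are illustrative and not part of the original notes, and the text rendering of a crossed group of five tallies as `||||/` is likewise only a convention chosen here.

```python
from collections import Counter

# The sixteen marks from the exercise (copied from the notes).
marks = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

# An "array" in the statistical sense: the figures in ascending order.
array = sorted(marks)

# Count how many times each mark occurs.
counts = Counter(marks)

# Report every possible mark from 0 to 9, including marks that never occur.
frequency = {mark: counts.get(mark, 0) for mark in range(10)}

for mark, freq in frequency.items():
    # Tallies in groups of five: four bars plus a cross stroke, then the rest.
    tally = ("||||/ " * (freq // 5) + "|" * (freq % 5)).strip()
    print(f"{mark:>5}  {tally:<8}  {freq}")

print("Total", sum(frequency.values()))
```

Using `range(10)` rather than the observed values keeps the mark 7 in the table with frequency 0, as in the solution above.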
• There are two types of frequency distribution, namely,
simple frequency distribution and grouped frequency
distribution.
• A simple frequency distribution shows the values of the
variable individually.
• A grouped frequency distribution shows the values of
the variable in groups or intervals.
Grouped Frequency
• Class: When a large number of observations varying over a wide range
are available, they are usually classified into several groups
according to the size of the values. Each of these groups, defined by
an interval, is called a class interval or simply a class.
• Inclusive Class: Classes of the type 15-19, 20-24, 25-29, in which both
the upper and lower limits are included, are called inclusive classes.
• Exclusive Class: In classes of the following type, the upper limit of
each class is excluded from the respective class:
  • 0-5
  • 5-10
  • 10-15
  • 15-20
  • 20-25
• Class Frequency: The number of observations falling under each
class is called its class frequency or simply frequency.
• Class Limits: The two numbers used to specify the limits of a class
interval for tallying the original observations are called the class
limits.
Solution
Class Interval   Frequency (Y axis)
0-2
2-4
4-6
6-8
8-10
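
A grouped frequency table of this kind can be built programmatically. The sketch below rests on two assumptions that the notes do not state explicitly: that the table refers to the sixteen marks from the earlier exercise, and that the classes are exclusive, so a value equal to an upper limit is counted in the next class. `group_frequencies` is a hypothetical helper name.

```python
def group_frequencies(values, lower, upper, width):
    """Count values falling in the exclusive classes [low, high)."""
    classes = [(start, start + width) for start in range(lower, upper, width)]
    table = {cls: 0 for cls in classes}
    for value in values:
        for low, high in classes:
            # Exclusive class: the lower limit is included, the upper is not.
            if low <= value < high:
                table[(low, high)] += 1
                break
    return table


# Assumed data: the sixteen marks from the earlier exercise.
marks = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

grouped = group_frequencies(marks, lower=0, upper=10, width=2)
for (low, high), freq in grouped.items():
    print(f"{low}-{high}: {freq}")
```

Under those assumptions the class frequencies still add up to 16, the number of observations.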
• Class Boundaries: The extreme values (observations) of
a variable which could ever be included in a class
interval are called class boundaries.
• Mid-Point of Class Interval: The value exactly at the
middle of a class interval is called the class mark or
mid-value. It is used as the representative value of the class
interval. Thus, Mid-point of class interval = (Lower class
boundary + Upper class boundary)/2.
• Width of a Class: The width of a class is defined as the
difference between the upper and lower class
boundaries. Thus, Width of a class = (Upper class
boundary - Lower class boundary).
Make the Class Interval Exclusive
• If d is the gap between the upper limit of any class and
the lower limit of the succeeding class, the class
boundaries for any class are then given by:
  Upper class boundary = Upper class limit + d/2
  Lower class boundary = Lower class limit - d/2
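
As a hedged illustration, the conversion can be sketched in Python using the usual textbook rule of moving each lower limit down by d/2 and each upper limit up by d/2. The sketch assumes the gap d is the same between every pair of adjacent classes; `class_boundaries` is a hypothetical helper name, and the classes 15-19, 20-24, 25-29 are the inclusive classes mentioned earlier in these notes.

```python
def class_boundaries(classes):
    """Convert inclusive class limits into class boundaries."""
    # d: gap between one class's upper limit and the next class's lower limit
    # (assumed to be the same for all adjacent classes).
    d = classes[1][0] - classes[0][1]
    return [(lower - d / 2, upper + d / 2) for lower, upper in classes]


inclusive = [(15, 19), (20, 24), (25, 29)]
boundaries = class_boundaries(inclusive)

for (lower, upper), (lb, ub) in zip(inclusive, boundaries):
    mid_point = (lb + ub) / 2  # (lower boundary + upper boundary) / 2
    width = ub - lb            # upper boundary - lower boundary
    print(f"{lower}-{upper}: boundaries {lb}-{ub}, "
          f"mid-point {mid_point}, width {width}")
```

After the conversion the upper boundary of each class coincides with the lower boundary of the next, which is what makes the distribution continuous.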
Graphic Representation of a Frequency Distribution
• A histogram is the most common form of diagrammatic
presentation of grouped frequency data. It is a set of
adjacent rectangles on a common base line. The base of
each rectangle measures the class width, whereas the
height measures the frequency density.
To Draw a Histogram

• Step 1: Choose a suitable scale to represent weights on the
horizontal axis.
• Step 2: Choose a suitable scale to represent the frequencies
on the vertical axis.
• Step 3: Then draw the bars corresponding to each of the given
weights using their frequencies.

• If the grouped frequency distribution is not continuous,
it must first be converted into a continuous distribution,
and then the histogram is drawn.
Draw a Histogram
To draw a histogram, one first has to
make the distribution continuous.
• Frequency Polygon: A frequency polygon is a visual
representation of a distribution, used to understand
its shape. Essentially, the frequency polygon indicates the
number of occurrences for each distinct class in the dataset.

• For an ungrouped distribution, the frequency polygon is
obtained by plotting points with the variate values as
abscissae and the corresponding frequencies as ordinates,
and joining the plotted points by straight lines.
For a grouped frequency distribution, the abscissae of the
points are the mid-values of the class intervals.
Where should the line touch?
• The frequency polygon of a frequency distribution can be
obtained by joining the midpoints of the tops of the
consecutive rectangles of the histogram. The two end points
of the frequency polygon are joined to the base line at the
mid-values of the empty classes at either end of the
frequency distribution.
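
The points of a frequency polygon, including the two end points on the base line, can be listed with a short Python sketch. It assumes equal class widths and borrows the class intervals and frequencies from the 59-student marks example in these notes; `polygon_points` is a hypothetical helper name.

```python
def polygon_points(classes, frequencies):
    """Return (mid-value, frequency) points, closed on the base line."""
    width = classes[0][1] - classes[0][0]  # assumes equal class widths
    points = [((low + high) / 2, freq)
              for (low, high), freq in zip(classes, frequencies)]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    # The empty classes before the first class and after the last class
    # have frequency 0, so the polygon meets the base line at their mid-values.
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
frequencies = [4, 8, 11, 15, 12, 6, 3]

points = polygon_points(classes, frequencies)
for x, y in points:
    print(x, y)
```

Here the polygon touches the base line at -5 and 75, the mid-values of the empty classes (-10)-0 and 70-80.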
Draw a Histogram and Frequency Polygon
Cumulative Frequency Curves or Ogives
• Ogives are the graphical representation of the cumulative
frequency distribution. Plotting the cumulative frequencies
against the class limits (upper limits for the 'less than'
type, lower limits for the 'more than' type) and joining the
points, we obtain ogives.
Less than Ogive

• Take the class intervals along the X-axis and the less than
C.f. along the Y-axis.
• Plot the cumulative frequencies 4, 12, 23, ..., 59 against
the upper limits of the corresponding intervals, i.e.
10, 20, ..., 70 respectively.
• The smooth curve obtained on joining these points is called
an ogive or, more particularly, a 'less than' ogive.

Class Interval   Frequency   Less than C.f.
0-10             4           4
10-20            8           8 + 4 = 12
20-30            11          12 + 11 = 23
30-40            15          23 + 15 = 38
40-50            12          38 + 12 = 50
50-60            6           50 + 6 = 56
60-70            3           56 + 3 = 59
More than Ogive

• Plot the cumulative frequencies 59, 55, ..., 3 against the
lower limits of the corresponding intervals, i.e.
0, 10, 20, ..., 60 respectively.
• The smooth curve obtained on joining these points is called
an ogive or, more particularly, a 'more than' ogive.

Class Interval   Frequency   More than C.f.
0-10             4           59
10-20            8           59 - 4 = 55
20-30            11          55 - 8 = 47
30-40            15          47 - 11 = 36
40-50            12          36 - 15 = 21
50-60            6           21 - 12 = 9
60-70            3           9 - 6 = 3
Draw the cumulative frequency curves for the following
distribution showing the marks of 59 students in Statistics.

Class Interval   Frequency   Less than C.f.   More than C.f.
0-10             4           4                59
10-20            8           12               55
20-30            11          23               47
30-40            15          38               36
40-50            12          50               21
50-60            6           56               9
60-70            3           59               3
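
Both cumulative frequency columns, and the points to plot for each ogive, can be reproduced with a few lines of Python. This is a minimal sketch using the 59-student table above; the variable names are illustrative only.

```python
from itertools import accumulate

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
frequencies = [4, 8, 11, 15, 12, 6, 3]

# Less than C.f.: running total of the frequencies.
less_than_cf = list(accumulate(frequencies))

# More than C.f.: total minus the frequencies of all earlier classes.
total = sum(frequencies)
more_than_cf = [total - before for before in [0] + less_than_cf[:-1]]

# 'Less than' ogive: plotted against the upper limits.
less_than_points = [(high, cf) for (_, high), cf in zip(classes, less_than_cf)]
# 'More than' ogive: plotted against the lower limits.
more_than_points = [(low, cf) for (low, _), cf in zip(classes, more_than_cf)]

print(less_than_points)
print(more_than_points)
```

The same steps apply unchanged to any other grouped table, such as the salary distribution, by replacing `classes` and `frequencies`.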
The frequency distribution table of the salaries of 55 workers
is given below. Plot the ogives.

Monthly Salary (in $)   Number of Workers   Less than C.f.   More than C.f.
0-2000                  10                  10               55
2000-4000               25                  35               55 - 10 = 45
4000-6000               5                   40               45 - 25 = 20
6000-8000               5                   45               20 - 5 = 15
8000-10000              10                  55               15 - 5 = 10
Solve
Make a Histogram, Frequency Polygon and Ogives (more than
and less than)
• The following table shows the time taken (in minutes)
by 100 students to travel to school on a particular day.
Draw the histogram.

Time     Number of students
0-5      5
5-10     25
10-15    40
15-20    17
20-25    13
Histogram
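
As a rough check before drawing, the bar heights can be computed and previewed in text form. The sketch below takes the travel-time table above and uses frequency density (frequency divided by class width) as the height, following the definition of a histogram given earlier; the `#` bars and their scale factor are an arbitrary choice made here for display.

```python
classes = [(0, 5), (5, 10), (10, 15), (15, 20), (20, 25)]
students = [5, 25, 40, 17, 13]

bars = []
for (low, high), freq in zip(classes, students):
    width = high - low      # base of the rectangle
    density = freq / width  # height of the rectangle
    bars.append((low, high, width, density))

for low, high, width, density in bars:
    # Two '#' characters per unit of density, purely for display.
    print(f"{low:>2}-{high:<2} | {'#' * round(density * 2)} ({density})")
```

Because all classes here have the same width of 5, plotting the raw frequencies instead of the densities would give a histogram of the same shape.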
