Statistics

Statistics is the mathematical study of data collection, analysis, interpretation, and presentation, utilizing measures such as mean, median, and mode. It is divided into descriptive statistics, which summarizes data, and inferential statistics, which makes predictions about a population based on a sample. Key concepts include measures of central tendency, variability, and the importance of understanding population and samples.

Uploaded by

yogitas804

Statistics

Definition of Statistics
Raw data that has been collected and organized in numerical or tabular
form is known as statistics. Statistics is also the mathematical study of
the probability of events occurring, based on known quantitative data or
a collection of data.
Statistics attempts to infer the properties of a large collection of data
from inspection of a sample of that collection, which allows educated
guesses to be made at minimal expense. Three kinds of averages are
commonly used in statistics: (i) mean, (ii) median, and (iii) mode.
Statistics is the study of the collection, analysis, interpretation,
presentation, and organization of data. The mathematical methods it
draws on include mathematical analysis, linear algebra, stochastic
analysis, measure-theoretic probability theory, and differential
equations. Statistics is concerned with collecting, classifying,
organizing, and displaying numerical data, which helps one understand
the outcomes the data shows and anticipate the likelihood of various
events. Statistics deals with information and observations expressed as
numerical data.
Types of Statistics
There are two kinds of statistics: descriptive statistics and inferential
statistics. Descriptive statistics describes the collected data in a
summarized way, whereas inferential statistics uses those summaries to
draw conclusions that go beyond the data at hand. Both are used on a
large scale, and in practice an analysis often moves from description to
inference.
Statistics is mainly divided into the following two categories.
1. Descriptive Statistics
2. Inferential Statistics
Descriptive Statistics
Descriptive statistics describes data in a summarized way. The summary
is computed from a sample of the population using measures such as the
mean or the standard deviation. Descriptive statistics uses charts,
graphs, and summary measures to organize, represent, and explain a set
of data.
• Data is typically arranged and displayed in tables or graphs such as
histograms, pie charts, bar charts, or scatter plots.
• Descriptive statistics only describe the data and thus do not require
generalization beyond the data collected.
A) Types of Measures
Descriptive statistics are classified into two types:

1. Measure of central tendency

A measure of central tendency is a single value that seeks to describe
the entire set of data. The three main measures of central tendency
are as follows:

a. Mean
It is calculated by dividing the sum of the observations by the total
number of observations. It can also be described as the sum divided by
the count.

b. Median
It is the middle value of the data set once the values are sorted, and it
divides the data into two halves. If the number of items n is odd, the
median is the center element, at position (n+1)/2; otherwise, the median
is the average of the two center elements.

c. Mode
It is the most frequently occurring value in the given data set. If every
data point occurs equally often, the data set may not have a mode. A data
set can also have several modes if two or more values share the highest
frequency.
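
The case of several modes can be sketched with `multimode()` from Python's standard `statistics` module (available from Python 3.8). The two lists below are made-up values used only for illustration.

```python
from statistics import multimode

# Hypothetical values, chosen only for illustration.
two_modes = [3, 5, 5, 7, 7, 9]   # 5 and 7 both occur twice
all_distinct = [1, 2, 3, 4]      # every value occurs once

# multimode() returns every value tied for the highest frequency,
# in the order each value first appears in the data.
modes_tied = multimode(two_modes)
modes_distinct = multimode(all_distinct)

print(modes_tied)      # [5, 7]
print(modes_distinct)  # [1, 2, 3, 4]
```

When all frequencies are equal, `multimode()` returns every value, which matches the "no single mode" situation described above.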
2. Measure of variability
A measure of variability describes the spread of the data, that is, how
widely the data points are dispersed. The most common measures of
variability are:

a. Standard deviation
It is calculated by taking the square root of the variance. To compute it,
first determine the mean (the average), then subtract the mean from each
data point and square the result. Add the squared values, divide by the
number of data points, and finally take the square root.

b. Range
The range is the difference between the largest and smallest data points
in the data set. It is a rough indicator of spread: the wider the range,
the wider the spread of the data, and vice versa.
Range = Largest data value – smallest data value

c. Variance
It is defined as the average of the squared deviations from the mean. It
is determined by squaring the difference between each data point and the
mean, adding all the squares, and then dividing by the number of data
points in the data set. Dividing by the number of data points gives the
population variance; the sample variance divides by one less than that
number.
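
The steps for variance and standard deviation described above can be sketched by hand and compared with the standard `statistics` module. The data values are illustrative assumptions, not taken from the source. Note that `statistics.variance()` divides by n - 1 (sample variance), while `statistics.pvariance()` divides by n (population variance).

```python
from math import sqrt
from statistics import pstdev, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the source

n = len(data)
mean = sum(data) / n                                   # 40 / 8 = 5.0
squared_deviations = [(x - mean) ** 2 for x in data]   # (x - mean)^2 per point

population_variance = sum(squared_deviations) / n      # divide by n
population_sd = sqrt(population_variance)              # square root of variance
sample_variance = sum(squared_deviations) / (n - 1)    # divide by n - 1

print(population_variance, population_sd)  # 4.0 2.0
print(pvariance(data), pstdev(data))       # same values from the library
print(sample_variance, variance(data))     # both use the n - 1 divisor
```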
B) Population and Samples
In statistics, the population is the group of all the elements or things
you are interested in. Populations are frequently so large that collecting
and analyzing data for every element is impractical. That is why
statisticians typically attempt to draw conclusions about a population by
selecting and analyzing a representative subset of that group.
This subset of a population is referred to as a sample. Ideally, the
sample should preserve the population's key statistical traits to a
reasonable degree. You can then draw conclusions about the population
based on the sample.
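
As a rough sketch of this idea, the snippet below builds a hypothetical population of exam marks, draws a simple random sample from it, and compares the two means. The population, its size, the sample size, and the seed are all assumptions made for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 exam marks between 0 and 100.
population = [random.randint(0, 100) for _ in range(10_000)]

# Simple random sample of 200 marks, drawn without replacement.
sample = random.sample(population, 200)

population_mean = mean(population)
sample_mean = mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

With a reasonably sized random sample, the sample mean usually lands close to the population mean, which is what makes the sample useful for drawing conclusions.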

C) Outliers
A data point that deviates significantly from the rest of the data in a
sample or population is referred to as an outlier.
Outliers can have a variety of causes. Here are a few common ones:
• Natural data variation
• Changes in the observed system's behavior
• Data-gathering errors, which are a frequent cause of outliers
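
The text does not prescribe a detection method. One common rule of thumb, used here purely as an assumption, flags any point lying more than 1.5 times the interquartile range below the first quartile or above the third quartile. The sketch uses `statistics.quantiles()` (Python 3.8+) on made-up readings.

```python
from statistics import quantiles

# Hypothetical readings; 95 is deliberately far from the rest.
readings = [10, 12, 11, 13, 12, 11, 10, 95, 12, 13]

# quantiles(..., n=4) returns the three quartile cut points.
q1, q2, q3 = quantiles(readings, n=4)
iqr = q3 - q1

# 1.5 * IQR fences: an assumed convention, not taken from the source.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in readings if x < lower_fence or x > upper_fence]
print(outliers)  # [95]
```

Whether a flagged point is an error or genuine variation still has to be judged from how the data was gathered.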
Inferential Statistics
In inferential statistics, we try to interpret the meaning of descriptive
statistics. After the data has been collected, analyzed, and summarized,
we use inferential statistics to describe what the collected data means.
• Inferential statistics uses probability theory to assess whether
trends found in the research sample can be generalized to the
larger population from which the sample was drawn.
• Inferential statistics is intended to test hypotheses and
investigate relationships between variables, and can be used to
make predictions about the population.
• Inferential statistics is used to draw conclusions and inferences,
i.e., to make valid generalizations from samples.

Example
In a class, the data is the set of marks obtained by 50 students. When
we take the average of this data, the result is the average of the 50
students’ marks. If the average mark is 88 out of 100, we draw a
conclusion about the class on the basis of that outcome.
Mean, Median and Mode in Statistics
Mean: The mean is the arithmetic average of a data set, found by adding
the numbers in the set and dividing by the number of observations in the
data set.
Median: The median is the middle number of the data set when it is
listed in ascending or descending order.
Mode: The mode is the number that occurs most often in the data set. It
always lies between the lowest and highest values.

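For n observations, we have

Mean = (Sum of all observations) / n

Median = The ((n+1)/2)th observation of the sorted data, if n is odd
Median = The average of the (n/2)th and (n/2 + 1)th observations of the sorted data, if n is even

Mode = The value which occurs most frequently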
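
The three definitions can be written out directly as a minimal sketch. The function names `mean_of`, `median_of`, and `mode_of` and the marks are hypothetical, introduced only for illustration; in practice the `statistics` module provides equivalents.

```python
from collections import Counter


def mean_of(values):
    """Sum of the observations divided by their count."""
    return sum(values) / len(values)


def median_of(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # the ((n + 1) / 2)-th value, counting from 1
    return (ordered[mid - 1] + ordered[mid]) / 2


def mode_of(values):
    """Most frequent value; if several tie, the one seen first is returned."""
    return Counter(values).most_common(1)[0][0]


marks = [70, 85, 85, 90, 60, 75]  # hypothetical marks

print(mean_of(marks))    # 77.5
print(median_of(marks))  # 80.0
print(mode_of(marks))    # 85
```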

Measures of Dispersion in Statistics
Measures of central tendency alone do not suffice to describe a data set
completely. The variability of the data is therefore described by a value
called a measure of dispersion.
The different measures of dispersion include:
1. The range, calculated as the difference between the maximum value and
the minimum value of the data points.
2. The quartile deviation, an absolute measure of dispersion. The sorted
data points are divided into 4 quarters. Find the median of the data
points. The median of the data points to the left of this median is the
lower quartile, and the median of the data points to the right of this
median is the upper quartile. Upper quartile - lower quartile is the
interquartile range. Half of this is the quartile deviation.
3. The mean deviation, the average of the absolute differences between
the items in a distribution and the mean or median of that series.
4. The standard deviation, a measure of the amount of variation of a
set of values.
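
The range, quartile deviation, and mean deviation from the list above can be sketched as follows. The helper `quartiles_by_halves` is a hypothetical name, and it follows the median-of-halves description given here; other quartile conventions exist and can give slightly different values. The data is made up for illustration.

```python
from statistics import mean, median


def quartiles_by_halves(values):
    """Return (lower quartile, upper quartile) as medians of the two halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # For odd n the overall median itself is left out of both halves.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower_half), median(upper_half)


data = [2, 4, 6, 8, 10, 12, 14]  # hypothetical values

data_range = max(data) - min(data)

q1, q3 = quartiles_by_halves(data)
interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2

center = mean(data)
mean_deviation = mean(abs(x - center) for x in data)  # deviation about the mean

print(data_range)                      # 12
print(q1, q3, interquartile_range)     # 4 12 8
print(quartile_deviation)              # 4.0
print(round(mean_deviation, 3))        # 3.429
```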
Stages of Statistics

1. Collection of Data:
This is the first step of statistical analysis, where we collect the data
using different methods depending upon the case.
2. Organizing the Collected Data:
In the next step, we organize the collected data in a meaningful
manner so that it is easier to understand.
3. Presentation of Data:
In the third step, we simplify the data and present it in the
form of tables, graphs, and diagrams.
4. Analysis of the Data:
Analysis is required to get the right results. It is often carried out using
measures of central tendency, measures of dispersion, correlation,
regression, and interpolation.
5. Interpretation of Data:
In this last stage, conclusions are drawn. Comparisons are made, and
forecasts are based on them.
Uses of Statistics

• Statistics helps to obtain appropriate quantitative data.
• Statistics helps to present complex data in a suitable tabular,
diagrammatic, or graphic form for simple and consistent interpretation.
• Statistics helps to explain the nature and pattern of variability through
quantitative observations of a phenomenon.
• Statistics helps to depict the data in tabular or graphical form
in order to understand it properly.
Applications of Statistics

• Statistics is used in machine learning and data mining.
• Statistics is used in mathematics.
• Statistics is used in economics.
Calculating Descriptive Statistics in Python

Python's statistical modules provide simple and effective techniques for
working with data.

Let’s get our hands dirty by implementing these measures in Python.

1. Measures of Central Tendency

a. Mean

import statistics

# initializing list
li = [1, 2, 3, 3, 2, 2, 2, 1]

# using mean() to calculate the average of the list elements
print("The average of list values is : ", end="")
print(statistics.mean(li))

Output:

The average of list values is : 2

b. Median

from statistics import median
from fractions import Fraction as fr

# tuple of positive integers
data1 = (2, 3, 4, 5, 7, 9, 11)

# tuple of floating point values
data2 = (2.4, 5.1, 6.7, 8.9)

# tuple of fractional numbers
data3 = (fr(1, 2), fr(44, 12), fr(10, 3), fr(2, 3))

# tuple of negative integers
data4 = (-5, -1, -12, -19, -3)

# tuple of positive and negative integers
data5 = (-1, -2, -3, -4, 4, 3, 2, 1)

# Printing the median of the above data sets
print(f"Median of data-set 1 is {median(data1)}")
print(f"Median of data-set 2 is {median(data2)}")
print(f"Median of data-set 3 is {median(data3)}")
print(f"Median of data-set 4 is {median(data4)}")
print(f"Median of data-set 5 is {median(data5)}")

Output:

Median of data-set 1 is 5

Median of data-set 2 is 5.9

Median of data-set 3 is 2

Median of data-set 4 is -5

Median of data-set 5 is 0.0

c. Mode

from statistics import mode
from fractions import Fraction as fr

# tuple of positive integers
data1 = (2, 3, 3, 4, 5, 5, 5, 5, 6, 6, 6, 7)

# tuple of floating point values
data2 = (2.4, 1.3, 1.3, 1.3, 2.4, 4.6)

# tuple of fractional numbers
data3 = (fr(1, 2), fr(1, 2), fr(10, 3), fr(2, 3))

# tuple of negative integers
data4 = (-1, -2, -2, -2, -7, -7, -9)

# tuple of strings
data5 = ("red", "blue", "black", "blue", "black",
         "black", "brown")

# Printing the mode of the above data sets
print(f"Mode of data set 1 is {mode(data1)}")
print(f"Mode of data set 2 is {mode(data2)}")
print(f"Mode of data set 3 is {mode(data3)}")
print(f"Mode of data set 4 is {mode(data4)}")
print(f"Mode of data set 5 is {mode(data5)}")

Output:
Mode of data set 1 is 5

Mode of data set 2 is 1.3

Mode of data set 3 is 1/2

Mode of data set 4 is -2

Mode of data set 5 is black

2. Measure of variability

a. Range

# Sample data
arr = [1, 2, 3, 4, 5]

# Finding the maximum
maximum = max(arr)

# Finding the minimum
minimum = min(arr)

# Difference of the maximum and the minimum
data_range = maximum - minimum

print("Maximum = {}, Minimum = {} and Range = {}".format(
    maximum, minimum, data_range))

Output:

Maximum = 5, Minimum = 1 and Range = 4

b. Variance

# Python code to demonstrate the variance() function
# on a varying range of data types.
# Note: variance() computes the sample variance (divides by n - 1).
from statistics import variance
from fractions import Fraction as fr

# tuple of positive integers
# numbers are spread apart but not very much
sample1 = (1, 2, 5, 4, 8, 9, 12)

# tuple of negative integers
sample2 = (-2, -4, -3, -1, -5, -6)

# tuple of positive and negative numbers
# data points are spread apart considerably
sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)

# tuple of fractional numbers
sample4 = (fr(1, 2), fr(2, 3), fr(3, 4),
           fr(5, 6), fr(7, 8))

# tuple of floating point values
sample5 = (1.23, 1.45, 2.1, 2.2, 1.9)

# Print the variance of each sample
print(f"Variance of Sample1 is {variance(sample1)}")
print(f"Variance of Sample2 is {variance(sample2)}")
print(f"Variance of Sample3 is {variance(sample3)}")
print(f"Variance of Sample4 is {variance(sample4)}")
print(f"Variance of Sample5 is {variance(sample5)}")

Output:

Variance of Sample1 is 15.80952380952381

Variance of Sample2 is 3.5

Variance of Sample3 is 61.125

Variance of Sample4 is 1/45

Variance of Sample5 is 0.17613000000000006

c. Standard Deviation

# stdev() computes the sample standard deviation,
# the square root of the sample variance.
from statistics import stdev

# creating a varying range of sample sets
# numbers are spread apart but not very much
sample1 = (1, 2, 5, 4, 8, 9, 12)

# tuple of negative integers
sample2 = (-2, -4, -3, -1, -5, -6)

# tuple of positive and negative numbers
# data points are spread apart considerably
sample3 = (-9, -1, -0, 2, 1, 3, 4, 19)

# tuple of floating point values
sample4 = (1.23, 1.45, 2.1, 2.2, 1.9)

# Print the standard deviation of each sample set
print(f"The Standard Deviation of Sample1 is {stdev(sample1)}")
print(f"The Standard Deviation of Sample2 is {stdev(sample2)}")
print(f"The Standard Deviation of Sample3 is {stdev(sample3)}")
print(f"The Standard Deviation of Sample4 is {stdev(sample4)}")

Output:

The Standard Deviation of Sample1 is 3.9761191895520196

The Standard Deviation of Sample2 is 1.8708286933869707

The Standard Deviation of Sample3 is 7.8182478855559445

The Standard Deviation of Sample4 is 0.4196784483387
