HISTOGRAMS

Uploaded by

Claudia Ferraz

1.2 Histograms

Histograms are density estimates. A density estimate gives a good impression of the
distribution of the data. In contrast to boxplots, density estimates show possible
multimodality of the data. The idea is to locally represent the data density by
counting the number of observations in a sequence of consecutive intervals (bins)
with origin $x_0$. Let $B_j(x_0, h)$ denote the bin of length $h$ which is the element of a
bin grid starting at $x_0$:

$$B_j(x_0, h) = [x_0 + (j-1)h,\; x_0 + jh), \quad j \in \mathbb{Z},$$

where $[\cdot, \cdot)$ denotes a left closed and right open interval. If $\{x_i\}_{i=1}^{n}$ is an i.i.d. sample
with density $f$, the histogram is defined as follows:

$$\hat{f}_h(x) = n^{-1} h^{-1} \sum_{j \in \mathbb{Z}} \sum_{i=1}^{n} I\{x_i \in B_j(x_0, h)\}\, I\{x \in B_j(x_0, h)\}. \tag{1.7}$$

In sum (1.7) the first indicator function $I\{x_i \in B_j(x_0, h)\}$ (see Symbols and
Notation in Chap. 21) counts the number of observations falling into bin $B_j(x_0, h)$.
The second indicator function is responsible for “localising” the counts around $x$.
The parameter h is a smoothing or localising parameter and controls the width of
the histogram bins. An h that is too large leads to very big blocks and thus to a
very unstructured histogram. On the other hand, an h that is too small gives a very
variable estimate with many unimportant peaks.
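
Definition (1.7) can be evaluated directly. The following Python sketch is an
illustration only and is not the code behind the book's figures; the function name
`histogram_density` and the toy sample are assumptions introduced here. It relies on
the fact that a value $v$ lies in bin $B_j(x_0, h)$ exactly when $j = \lfloor (v - x_0)/h \rfloor + 1$.

```python
import numpy as np

def histogram_density(x, data, x0, h):
    """Evaluate the histogram estimate (1.7) at the points x.

    Bin j is [x0 + (j-1)h, x0 + jh), so a value v falls into
    bin j = floor((v - x0) / h) + 1.
    """
    data = np.asarray(data, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data_bins = np.floor((data - x0) / h).astype(int) + 1
    x_bins = np.floor((x - x0) / h).astype(int) + 1
    # For every evaluation point, count the observations that share its bin
    # (the product of the two indicator functions in (1.7)).
    counts = (x_bins[:, None] == data_bins[None, :]).sum(axis=1)
    return counts / (data.size * h)

# Toy data (hypothetical, not the bank note measurements).
sample = [0.05, 0.15, 0.25, 0.95]
# Three of the four points share the bin of 0.1, one shares the bin of 0.7,
# so the estimates are 3 / (4 * 0.5) and 1 / (4 * 0.5).
print(histogram_density([0.1, 0.7], sample, x0=0.0, h=0.5))  # [1.5 0.5]
```

Each observation contributes a block of area $h \cdot (nh)^{-1} = 1/n$, so the estimate
integrates to one for any choice of $h$ and $x_0$.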
The effect of $h$ is given in detail in Fig. 1.6. It contains the histogram (upper
left) for the diagonal of the counterfeit bank notes for $x_0 = 137.8$ (the minimum
of these observations) and $h = 0.1$. Increasing $h$ to $h = 0.2$ and using the same
origin, $x_0 = 137.8$, results in the histogram shown in the lower left of the figure.
This density histogram is somewhat smoother due to the larger $h$. The binwidth is
next set to $h = 0.3$ (upper right). From this histogram, one has the impression that
the distribution of the diagonal is bimodal with peaks at about 138.5 and 139.9.

Fig. 1.6 Diagonal of counterfeit bank notes. Histograms with $x_0 = 137.8$ and $h = 0.1$ (upper
left), $h = 0.2$ (lower left), $h = 0.3$ (upper right), $h = 0.4$ (lower right). MVAhisbank1

The detection of modes requires fine tuning of the binwidth. Using methods from
smoothing methodology (Härdle, Müller, Sperlich, & Werwatz, 2004) one can find
an “optimal” binwidth h for n observations:

$$h_{opt} = \left( \frac{24\sqrt{\pi}}{n} \right)^{1/3}.$$
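
As a quick numerical check, the rule $h_{opt} = (24\sqrt{\pi}/n)^{1/3}$ can be computed in a few
lines. This is a minimal sketch; the function name `optimal_binwidth` is an assumption
introduced here. Note that the formula contains no scale factor: it appears to correspond
to a standard normal reference density, so for data on another scale the result
would presumably need to be multiplied by an estimate of the standard deviation.

```python
import math

def optimal_binwidth(n):
    """Binwidth (24 * sqrt(pi) / n) ** (1/3) for a sample of size n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return (24.0 * math.sqrt(math.pi) / n) ** (1.0 / 3.0)

# The binwidth shrinks slowly with n: eight times as many
# observations are needed to halve it.
for n in (50, 100, 200, 800):
    print(n, round(optimal_binwidth(n), 3))
```

The constant $(24\sqrt{\pi})^{1/3} \approx 3.49$ is the same one that appears in Scott's normal
reference rule.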

Unfortunately, the binwidth $h$ is not the only parameter determining the shape of $\hat{f}$.
In Fig. 1.7, we show histograms with $x_0 = 137.65$ (upper left), $x_0 = 137.75$
(lower left), $x_0 = 137.85$ (upper right), and $x_0 = 137.95$ (lower right). All
the graphs have been scaled equally on the y-axis to allow comparison. One sees
that, despite the fixed binwidth $h$, the interpretation is not facilitated. The shift
of the origin $x_0$ (to four different locations) created four different histograms. This
property of histograms strongly contradicts the goal of presenting data features.

Fig. 1.7 Diagonal of counterfeit bank notes. Histogram with $h = 0.4$ and origins $x_0 = 137.65$
(upper left), $x_0 = 137.75$ (lower left), $x_0 = 137.85$ (upper right), $x_0 = 137.95$ (lower right).
MVAhisbank2
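
The dependence on the origin is easy to reproduce with a handful of numbers. The sketch
below uses a made-up four-point sample (not the bank note data) and a hypothetical helper
`histogram_counts`; it builds the bin grid for a given $x_0$ and $h$ and counts the
observations per bin with `numpy.histogram`.

```python
import numpy as np

def histogram_counts(data, x0, h):
    """Bin edges and counts for the grid that starts at x0 with binwidth h."""
    data = np.asarray(data, dtype=float)
    if x0 > data.min():
        raise ValueError("the origin x0 must not exceed the smallest observation")
    # Enough bins so that the largest observation lies inside the last bin.
    n_bins = int(np.floor((data.max() - x0) / h)) + 1
    edges = x0 + h * np.arange(n_bins + 1)
    counts, _ = np.histogram(data, bins=edges)
    return edges, counts

toy = [0.9, 1.1, 1.9, 2.1]  # hypothetical sample

_, counts_a = histogram_counts(toy, x0=0.0, h=1.0)
_, counts_b = histogram_counts(toy, x0=0.5, h=1.0)
print(counts_a)  # [1 2 1]  -> one central peak
print(counts_b)  # [2 2]    -> flat
```

With the same binwidth, moving the origin by half a bin turns a single peak into a flat
histogram, which is the effect shown for the bank note data in Fig. 1.7.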

Obviously, the same data are represented quite differently by the four histograms. A
remedy has been proposed by Scott (1985): “Average the shifted histograms!”. The
result is presented in Fig. 1.8.
Here all bank note observations (genuine and counterfeit) have been used. The
(so-called) averaged shifted histogram is no longer dependent on the origin and
shows a clear bimodality of the diagonals of the Swiss bank notes.

Fig. 1.8 Averaged shifted histograms based on all (counterfeit and genuine) Swiss bank notes:
there are 2 shifts (upper left), 4 shifts (lower left), 8 shifts (upper right) and 16 shifts (lower right).
MVAashbank
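
The averaging idea can be sketched as follows. This is a minimal illustration, assuming
the $m$ shifted histograms share the binwidth $h$ and use the equally spaced origins
$x_0 + kh/m$, $k = 0, \dots, m-1$; the function names and the toy sample are assumptions
introduced here and this is not the MVAashbank code.

```python
import numpy as np

def histogram_density(x, data, x0, h):
    """Histogram estimate with bins [x0 + (j-1)h, x0 + jh) evaluated at x."""
    data = np.asarray(data, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data_bins = np.floor((data - x0) / h).astype(int)
    x_bins = np.floor((x - x0) / h).astype(int)
    counts = (x_bins[:, None] == data_bins[None, :]).sum(axis=1)
    return counts / (data.size * h)

def averaged_shifted_histogram(x, data, x0, h, n_shifts):
    """Average of n_shifts histograms whose origins are x0 + k * h / n_shifts."""
    if n_shifts < 1:
        raise ValueError("n_shifts must be at least 1")
    origins = x0 + h * np.arange(n_shifts) / n_shifts
    estimates = [histogram_density(x, data, origin, h) for origin in origins]
    return np.mean(estimates, axis=0)

toy = [0.13, 0.37, 0.52, 0.88, 1.41]  # hypothetical sample
points = [0.3, 0.6, 1.1]
print(averaged_shifted_histogram(points, toy, x0=0.0, h=0.5, n_shifts=2))
print(averaged_shifted_histogram(points, toy, x0=0.25, h=0.5, n_shifts=2))
```

Moving $x_0$ by $h/m$ reproduces the same family of grids, so the two calls give the same
estimate. The remaining dependence on the origin is confined to shifts smaller than
$h/m$ and therefore shrinks as the number of shifts grows.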

Summary
↪ Modes of the density are detected with a histogram.
↪ Modes correspond to strong peaks in the histogram.
↪ Histograms with the same $h$ need not be identical. They also
depend on the origin $x_0$ of the grid.
↪ The influence of the origin $x_0$ is drastic. Changing $x_0$ creates
different looking histograms.
↪ The consequence of an $h$ that is too large is an unstructured
histogram that is too flat.
↪ A binwidth $h$ that is too small results in an unstable histogram.
↪ There is an “optimal” $h = (24\sqrt{\pi}/n)^{1/3}$.
↪ It is recommended to use averaged histograms. They are kernel
densities.
