MIDI Toolbox Manual

MIDI Toolbox
MATLAB Tools for Music Research
Tuomas Eerola & Petri Toiviainen
Music Cognition Group
Copyright: Tuomas Eerola & Petri Toiviainen
Cover & layout: Tuomas Eerola & Petri Toiviainen
Publisher: Department of Music, University of Jyväskylä, Finland
Printing and binding: Kopijyvä, Jyväskylä, Finland
ISBN: 951-39-1796-7 (printed version)
ISBN: 951-39-1795-9 (pdf version)
Document data:
Eerola, T. & Toiviainen, P. (2004). MIDI Toolbox: MATLAB Tools for Music
Research. University of Jyväskylä: Kopijyvä, Jyväskylä, Finland.
Electronic version available from: http://www.jyu.fi/musica/miditoolbox/
CONTENTS
CHAPTER 1 INTRODUCTION.................................................................................... 6
CHAPTER 2 INSTALLATION..................................................................................... 8
CHAPTER 3 BASIC OPERATIONS............................................................................ 9
CHAPTER 4 EXAMPLES ........................................................................................... 15
EXAMPLE 1: VISUALIZING MIDI DATA............................................................... 15
EXAMPLE 2: MELODIC CONTOUR........................................................................ 21
EXAMPLE 3: KEY-FINDING................................................................................... 24
EXAMPLE 4: METER-FINDING.............................................................................. 29
EXAMPLE 5: MELODIC SEGMENTATION.............................................................. 32
EXAMPLE 6: MELODIC EXPECTATIONS............................................................... 34
EXAMPLE 7: MELODIC COMPLEXITY.................................................................. 37
EXAMPLE 8: ANALYZING MIDI COLLECTIONS .................................................. 38
EXAMPLE 9: MELODIC SIMILARITY..................................................................... 41
EXAMPLE 10: CREATING SEQUENCES ................................................................. 43
REFERENCES.......................................................................................................... 45
CHAPTER 5 FUNCTION REFERENCE................................................................... 48
ALPHABETICAL INDEX OF FUNCTIONS................................................................ 96
CHAPTER 1 INTRODUCTION
MIDI Toolbox provides a set of Matlab functions, which together have all the
necessary machinery to analyze and visualize MIDI data. The development of the
Toolbox has been part of ongoing research on topics relating to musical data-
mining, modelling music perception, and decomposing data for and from
perceptual experiments. Although MIDI data is not necessarily a good representation
of music in general, it suffices for many research questions dealing with concepts
such as melodic contour, tonality and pulse finding. These concepts are intriguing
from the point of view of music perception and the chosen representation greatly
affects the way these issues can be approached. MIDI is not able to handle the timbre
of music and is therefore an unsuitable representation for a number of research questions
(for a summary, see Hewlett and Selfridge-Field, 1993-94, pp. 11-28). All musical
signals may be processed from the acoustic representation, and suitable tools are
available for these purposes (e.g. the IPEM toolbox; Leman et al., 2000). However, there
is a body of essential questions of music cognition that benefit from a MIDI-based
approach. MIDI does not contain notational information, such as phrase and bar
markings, and neither is that information conveyed in explicit terms to the ears of
music listeners. Consequently, models of music cognition must infer these musical
cues from the pitch, timing and velocity information that MIDI provides. Another
advantage of the MIDI format is that it is extremely widespread among the research
community, as well as having a wider group of users amongst music professionals,
artists and amateur musicians. MIDI is a common file format between many notation,
sequencing and performance programs across a variety of operating systems.
Numerous pieces of hardware exist that collect data from musical performances,
either directly from the instrument (e.g. digital pianos and other MIDI instruments) or
from the movements of the artists (e.g. motion tracking of musicians' gestures, hand
movements etc.). The vast majority of this technology is based on MIDI
representation. However, the analysis of the MIDI data is often developed from
scratch for each research question. The aim of MIDI Toolbox is to provide the core
representation and functions that are needed most often. These basic tools are
designed to be modular to allow easy further development and tailoring for specific
analysis needs. Another aim is to facilitate efficient use and to lower the practical
threshold for getting started. For example, the Toolbox can be used as a teaching aid in
music cognition courses.
This documentation provides a description of the Toolbox (Chapter 1) and covers installation
and system requirements (Chapter 2). Basic operations are explained in Chapter 3. Chapter
4 demonstrates the Toolbox functions using various examples. The User's Guide does
not describe any of the underlying theories in detail. Chapter 5 is the reference
section, describing all functions in the Toolbox.
The online reference documentation provides direct hypertext links to specific
Toolbox functions. This is available at http://www.jyu.fi/musica/miditoolbox/
This User's Guide assumes that the readers are familiar with Matlab. At the moment,
the MIDI Toolbox is a collection of Matlab functions that do not require any extra
toolboxes to run. The Signal Processing and Statistics toolboxes (both available
separately from MathWorks) offer useful extra tools for the analysis of perceptual
experiments.
MIDI Toolbox comes with no warranty. This is free software, and you are welcome to
redistribute it under certain conditions. See License.txt for details of the GNU
General Public License.
We would like to thank the various people who contributed to the toolbox. The conversion to
and from MIDI files is based on the C source code by Piet van Oostrum, which, in
turn, uses the midifile library written by Tim Thompson and updated by Michael
Czeiszperger. Brian Cameron tracked down some sneaky bugs in the aforementioned C
source code. Micah Bregman helped to check parts of the manual and wrote some
new functions.
Comments, suggestions or questions?
Many functions are still not completely tested in MIDI Toolbox version 1.0. Check
the online forum for corrections and revisions:
http://www.jyu.fi/musica/miditoolbox/forum.html
Alternatively, you can report any bugs or problems to:
Tuomas Eerola, Petri Toiviainen
{ptee, ptoiviai}@cc.jyu.fi
Department of Music
University of Jyväskylä
P.O. BOX 35
40014 University of Jyväskylä
Finland
CHAPTER 2 INSTALLATION
Availability
The whole toolbox is available as a zipped package from the internet
(http://www.jyu.fi/musica/miditoolbox/).
Installation
Unpack the MIDI Toolbox file package you have downloaded. For this, use a program
like Winzip for Windows or Stuffit Expander for Macintosh. This will create a
directory called miditoolbox. Secondly, a version of the Matlab program needs to be
installed (see www.mathworks.com). Thirdly, the Toolbox directory needs to be added to
the Matlab path variable.
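For example, the Toolbox directory can be added to the search path from the Matlab prompt; a minimal sketch (the directory location below is only an assumption, use wherever you unpacked the package):
addpath('c:\matlab\toolbox\miditoolbox')  % hypothetical location of the unpacked files
pathtool  % or add the directory with the interactive path editor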
Windows (98, 2000, XP)
The MIDI Toolbox version 1.0 is compatible with Matlab 5.3 and Matlab 6.5.
Macintosh (OS X)
The MIDI Toolbox version 1.0 is compatible with Matlab 6.5 for Macintosh.
Linux
Currently not tested but should be compatible.
CHAPTER 3 BASIC OPERATIONS
Basic issues
In this tutorial, we assume that the reader has basic knowledge of the Matlab
command syntax. Many good tutorials exist on the Internet; see:
http://www.math.ufl.edu/help/matlab-tutorial/
http://www.math.mtu.edu/~msgocken/intro/intro.html
http://www.helsinki.fi/~mjlaine/matlab/index.html (in Finnish)
http://www.csc.fi/oppaat/matlab/matlabohje.pdf (in Finnish)
In the following examples, the commands that are typed at the Matlab command prompt
are written in monospaced font and are preceded by the » sign. Help is also
available within the Matlab session. For example, to understand what a particular
function does, type help and the name of the function at the command prompt. For
example, to obtain information about how the pitch-class distribution function works,
type:
help pcdist1
To see a list of all available functions in the Toolbox, type:
help miditoolbox
Reading MIDI files into Matlab
The basic functions in MIDI Toolbox read and manipulate type 0 and type 1 MIDI
files. The following command reads and parses a MIDI file called laksin.mid and
stores it as a matrix of notes called nmat in Matlab's workspace:
nmat = readmidi('laksin.mid');
This particular MIDI file contains the first two verses of a Finnish folk song called
"Läksin minä kesäyönä" (trad.).
Basic terms
Notematrix (or nmat) refers to a matrix representation of note events in a MIDI file.
We can now type nmat and see what the notematrix of the folk song looks like.
nmat
nmat =
0 0.9000 1.0000 64.0000 82.0000 0 0.5510
1.0000 0.9000 1.0000 71.0000 89.0000 0.6122 0.5510
2.0000 0.4500 1.0000 71.0000 82.0000 1.2245 0.2755
2.5000 0.4500 1.0000 69.0000 70.0000 1.5306 0.2755
3.0000 0.4528 1.0000 67.0000 72.0000 1.8367 0.2772
3.5000 0.4528 1.0000 66.0000 72.0000 2.1429 0.2772
4.0000 0.9000 1.0000 64.0000 70.0000 2.4490 0.5510
5.0000 0.9000 1.0000 66.0000 79.0000 3.0612 0.5510
6.0000 0.9000 1.0000 67.0000 85.0000 3.6735 0.5510
7.0000 1.7500 1.0000 66.0000 72.0000 4.2857 1.0714
We see that the variable nmat contains a 10 x 7 matrix filled with numbers. The
columns refer to various types of information such as MIDI pitch and MIDI channel.
The rows stand for the individual note events (in this case, the melody has 10 notes
and each of them is described in terms of pitch, onset time, duration, velocity and so
forth). The labels of the columns are as follows:
ONSET (BEATS) | DURATION (BEATS) | MIDI CHANNEL | MIDI PITCH | VELOCITY | ONSET (SEC) | DURATION (SEC)
The first column indicates the onset of the notes in beats (based on ticks per quarter
note) and the second column the duration of the notes in these same beat-values. The
third column denotes the MIDI channel (1-16), and the fourth the MIDI pitch, where
middle C (C4) is 60. The fifth column is the velocity describing how fast the key of
the note is pressed, in other words, how loud the note is played (0-127). The last two
columns correspond to the first two (onset in beats, duration in beats) except that
seconds are used instead of beats.
Often one wants to refer only to pitch or duration values in the notematrix. For clarity
and convenience, these columns may be called via a few basic selector functions that
refer to each specific column only. These are onset (either 'beat' or 'sec', the
former is the default), dur (either 'beat' or 'sec'), channel, pitch,
and velocity. For example, pitch(nmat) returns only the MIDI notes values of
the notematrix and onset(nmat) returns only the onset times (in beats) of the events
in the notematrix.
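These selectors combine naturally with ordinary Matlab functions; a minimal sketch (the statistics chosen here are merely illustrative):
mp = mean(pitch(nmat))     % mean MIDI pitch of the melody
td = sum(dur(nmat,'sec'))  % total sounding duration in seconds
iv = diff(pitch(nmat))     % successive melodic intervals in semitones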
Collection format
Large corpora of music are more convenient to process in Matlab if they are stored in
Matlab's own cell structures rather than keeping them as MIDI files that are loaded
separately for the analysis. You can store multiple notematrices in cell structures from
a directory of MIDI files by using the dir2cellmatr function. The function processes
all MIDI files in the current directory and stores the notematrices and the filenames in
the variables of your choice:
[demo_collection,filenames] = dir2cellmatr;
After creating a cell matrix structure of the MIDI files, individual notematrices can be
called by the following convention:
tune1 = demo_collection{1};
With large collections of MIDI files, applying the analyzecoll function to a cell
structure is preferred to analyzing the MIDI files separately (by using the
analyzedir function). This is because in the former case the files need not be
converted into the Matlab format for each analysis, which greatly increases the speed
of the analysis. Example 8 in Chapter 4 illustrates the use of the collection format.
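Since the collection is an ordinary Matlab cell structure, it can also be traversed with a loop; a minimal sketch (the notecount variable is just illustrative):
for k = 1:length(demo_collection)
    notecount(k) = size(demo_collection{k},1);  % number of note events in tune k
end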
Future changes to the notematrix representation
MIDI files often contain a wealth of information other than that pertaining to note
events. Tick type information, tempo, key signature, meter signature, copyright notes,
lyrics, track names, various types of controller data and changes in these across time
are commonly included in MIDI files. Some of these details would be useful for
certain types of analyses. However, at this stage only hold pedal information is
retained in the conversion from MIDI file to notematrix. In the next version of the
Toolbox, we are considering storing these details in Matlab field structures. The
drawback of this improvement is that it will also change the way the existing functions
are called. In a future version we are also planning to include a graphical user interface
for common operations and analyses.
Combining functions
Functions in Matlab can be combined:
plotdist(pcdist1(nmat))
In the example, the function pcdist1 calculates the pitch-class distribution of the
notematrix nmat, and the command plotdist then creates a labeled bar
graph of the distribution.
Saving variables
Variables in Matlab can be saved using the command save filename. This
command saves all variables in the Matlab session to a Matlab matrix file (filename.mat)
on the hard disk. It is often useful to use the clear command to purge unnecessary
variables from the Matlab session before saving the variables.
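A minimal sketch (the variable and file names are only illustrative):
clear temp*            % purge unnecessary temporary variables first
save laksin_analysis   % writes laksin_analysis.mat to the current directory
load laksin_analysis   % restores the saved variables later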
Saving MIDI files
A notematrix can be converted into a MIDI file by using the writemidi command.
The syntax of the command is:
writemidi(nmat,ofname,<tpq>,<tempo>,<tsig1>,<tsig2>)
In the command syntax, nmat refers to the notematrix and ofname to the name of the
generated MIDI file. There are some other parameters that are optional (denoted by
the brackets). For example, suppose you have created a new notematrix called probemelody
that contains a sequence from the probe-tone experiments and want to save the
sequence into a MIDI file named probemelody.mid. The following command writes
the MIDI file with a tempo of 90 beats per minute:
writemidi(probemelody,'probemelody.mid',120, 90);
Playing notematrices
There are two methods of listening to the contents of a notematrix. The first method
involves playing the MIDI file created by the nmat2mf command using the internal MIDI
player of the operating system (such as Quicktime, Mediaplayer or Winamp). This
method uses the following command:
playmidi(nmat)
This function is dependent on the operating system. In Windows, use the
definemidiplayer function and choose a MIDI player of your choice by browsing
and selecting the correct executable from your hard disk (Media Player, Winamp, etc.).
This function writes the path and the filename down to midiplayer.txt in the MIDI
Toolbox directory for future use. In MacOS X, the path is already predefined in the
abovementioned file.
The second method is to synthesize the notematrix into a waveform using the nmat2snd
function. This is computationally more demanding, especially if the notematrix is
large. Matlab can render these results into audible form by using the sound or soundsc
functions, or alternatively the waveform may be written into a file using the wavwrite
function. A simple way to hear the notematrix is to type:
playsound(nmat);
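The waveform can also be handled explicitly; a minimal sketch (the 8192 Hz sampling rate is an assumed default, check help nmat2snd):
fs = 8192;                       % assumed default synthesis rate
w = nmat2snd(laksin);            % render the notematrix into a waveform
soundsc(w,fs);                   % play back, scaled to the full amplitude range
wavwrite(w(:),fs,'laksin.wav');  % write the waveform to disk as a column vector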
Referring to parts of a notematrix
Often one wants to select only a certain part of a notematrix for analysis. For example,
instead of analyzing the whole MIDI sequence, you may wish to examine the first 8
bars or select only the MIDI events in a certain MIDI channel. Basic selection is
accomplished using Matlab's own indexing system, for example:
first_12_notes = laksin(1:12,:);
It is also possible to refer to MIDI Toolbox definitions and functions when selecting
parts of the notematrix. The following examples give a few ideas of how these may be
used (see also the sketch after the list). Many of these functions belong to the filter category (see Chapter 5).
first_4_secs = onsetwindow(laksin,0,4,'sec');
first_meas = onsetwindow(laksin,0,3,'beat');
between_1_and_2_sec = onsetwindow(laksin,1,2,'sec');
only_in_channel1 = getmidich(laksin,1);
remove_channel10 = dropmidich(laksin,10);
no_short_notes = dropshortnotes(laksin,'sec',0.3);
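Because each filter returns an ordinary notematrix, the filters can be nested; a minimal sketch:
% notes on MIDI channel 1 that begin within the first 8 seconds
ch1_first8 = onsetwindow(getmidich(laksin,1),0,8,'sec');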
Manipulating note matrices
Often one wants to find and manipulate the tempo of a notematrix. Here's an example
of how the tempo is obtained and then set to a faster rate.
tempo = gettempo(laksin)
tempo = 98.000
laksin_128bpm = settempo(laksin,128);
To scale any values in the notematrix, use the scale command:
laksin_with_halved_durations = scale(laksin,'dur',0.5);
One can assign any parameter (channel, duration, onset, velocity, pitch) in the
notematrix a certain fixed value. For instance, to set all note velocities to 64, the
setvalues command may be used:
laksin_velocity64 = setvalues(laksin,'vel',64);
Transposing the MIDI file is also a useful operation. This example transposes the folk
tune Läksin a major third down (minus four semitones):
laksin_in_c = shift(laksin,'pitch',-4);
A shift can also be applied to the velocity or channel information of the
notematrix. Here's an example of channel alteration:
laksin_channel2 = shift(laksin,'chan',1);
If you do not know the key of the MIDI file and wish to transpose the file to the C
major or C minor key, this can be done using the transpose2c function. This
method draws on a built-in key-finding algorithm, which is described later.
laksin_in_c = transpose2c(laksin);
It is also possible to combine different notematrices using Matlab's regular command
syntax. To create the Finnish folk tune Läksin in parallel thirds, use:
laksin_parallel_thirds = [laksin; laksin_in_c];
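If the two lines should remain distinguishable in playback or plotting, one option is to move the added line to another channel first; a minimal sketch:
% place the transposed line on MIDI channel 2 before combining
laksin_thirds_2ch = [laksin; shift(laksin_in_c,'chan',1)];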
Sometimes a notematrix might need to be quantized. This is relatively easy to carry
out using the quantize function. In this example, a Bach prelude is quantized using
sixteenth-beat resolution. The first argument quantizes the onsets, the second
argument the durations, and the third argument filters out notes that are shorter than
the criterion (sixteenth notes in this case):
prelude_edited = quantize(prelude, 1/16,1/16,1/16);
In many cases one wishes to eliminate certain aspects of the notematrix. For example,
a simple way to get the upper line of a polyphonic notematrix is to use the extreme
function:
prelude_edited = extreme(prelude_edited,'high');
Also, leading silence in a notematrix is often unnecessary. It can be
removed using the trim function:
prelude_edited = trim(prelude_edited);
Demonstrations
Demonstrations, which are loosely based on the examples described in the next
chapter, are available in the MIDI Toolbox directory. Type mididemo to go
through the demos.
CHAPTER 4 EXAMPLES
Example 1: Visualizing MIDI Data
The pianoroll function displays conventional pianoroll notation, as found in
sequencer programs. The function has the following syntax:
pianoroll(nmat,<varargin>);
The first argument refers to the notematrix and the other arguments are optional. Possible
arguments refer to axis labels (either MIDI note numbers or note names for the Y-axis
and either beats or seconds for the X-axis), colors, or other options. For example, the
following command outputs the pitch and velocity information:
pianoroll(laksin,'name','sec','vel');
Figure 1: Pianoroll notation of the first two phrases of Läksin minä kesäyönä. The lower panel
shows the velocity information.
Figure 2. Notation of the first two verses of the Finnish folk tune "Läksin minä kesäyönä".
Pianoroll output is rather straightforward to interpret. If you compare it with the notation
of the same song (Figure 2), you can easily see the differences and similarities
between pianoroll and traditional notation.
Polyphonic and more elaborate MIDI files can also be visualised using pianoroll
notation. Also, the time axis can be set to display beats rather than seconds, and the
pitch height axis can be set to show MIDI pitch numbers. For example, to plot the first
five measures (i.e., 5 * 4 beats per measure = 20 beats) of Bach's C-major
Prelude:
prelude = readmidi('wtcii01a.mid');
prelude5 = onsetwindow(prelude,0,20,'beat');
pianoroll(prelude5,'num','beat');
Figure 3. First five measures of Bach's C-major Prelude from Wohltemperierte Klavier II
(BWV 870).
In Figure 3, the horizontal lines indicate Cs and the vertical dotted lines correspond to
onset beats, which in this case have their equivalent in the notation (four beats per
measure). The notation of the C-major Prelude is shown in Figure 13. Furthermore,
MIDI files with several channels may be plotted with the pianoroll function, which
highlights the channels by using different colors.
Visualization of distributions
In this example, we have loaded the third movement (Sarabande) from J. S. Bach's
Partita in A minor for Solo Flute (BWV 1013) into a notematrix called sarabande
(see Figure 4).
Figure 4. Bach's Flute Sarabande (BWV 1013).
First, we can examine the note distribution of the Sarabande in order to see whether
the key signature is apparent from the distribution of the pitch-classes. The following
command creates a bar chart of the pitch-class distribution of the Sarabande.
plotdist(pcdist1(sarabande));
Figure 5. Pitch-class distribution in Bach's Flute Sarabande (BWV 1013).
The inner function, pcdist1, calculates the proportion of each pitch-class in the
sequence, and the outer function, plotdist, creates the labeled graph. From the
resulting graph, shown in Figure 5, we can infer that the Sarabande is indeed in the A
minor key, as A, C and E are the most commonly used tones. More about inferring the
key can be found in the separate section on key-finding (Example 3).
Another basic description of musical content is the interval structure. In monophonic
music this is easily computed, as shown below, but detecting successive intervals in
polyphonic music is a difficult perceptual task and will not be covered here. To see
what kinds of intervals are most common in the Sarabande, type:
plotdist(ivdist1(sarabande));
Figure 6. Interval distribution in Bach's Flute Sarabande (BWV 1013).
In the middle, P1 stands for unison (perfect prime), i.e. note repetitions, which are fairly
rare in this work (P8 stands for perfect octave, m3 is the minor third, M3 is a major
third and so on). Let us compare the distribution of interval sizes and directions to the
results obtained from analyses of large musical corpora by Vos and Troost (1989). To
obtain the corresponding distributions for the Sarabande, we use the functions ivdirdist1 and
ivsizedist1 (see Figure 7).
Figure 7. The top left panel shows the distribution of interval sizes in the Sarabande, and the
lower left panel displays the theoretical frequency of occurrence of intervals according to
Dowling and Harwood (1986). The top right panel shows the proportion of ascending
intervals, and the lower right panel displays the same data in a collection of folk music (N=327)
compiled by Vos and Troost (1989).
We see in Figure 7 that in the corpus analyzed by Vos and Troost (1989) the interval
structure is usually asymmetric (lower right panel). This means that large intervals
tend to ascend whereas small intervals tend to descend. This is not evident in
the Sarabande (top right panel), as the fifths tend to descend rather than ascend.
Displaying the distributions of two-tone continuations in the Sarabande is similar to
displaying tone distributions:
Figure 8. The proportion of two-note continuations in Bach's Flute Sarabande (BWV 1013).
The colorbar at the right displays the proportion associated with each colour.
Figure 8 shows the proportion of tone transitions in the Sarabande. The most common
transition is from the dominant (E) to D, and the next most common transition is from F to the
dominant E. Commonly, the diagonal of the tone transition matrix shows a high
proportion of occurrences, but this work clearly avoids unisons. The few unisons
shown in the transition matrix are due to octave displacement. Note that these
statistics are different from the interval distributions. It is also possible to view a
distribution of note durations in a similar manner (functions durdist1 and
durdist2).
In Matlab, there are further visualization techniques that can be used to display the
distributions. Quite often, plotting the data using different colors is especially
informative. In some cases, three-dimensional plots can aid the interpretation of the
data (see Example 7 for a three-dimensional version of note transitions).
Example 2: Melodic Contour
Melodic contour describes the overall shape of the melody. The contour
representation of a melody is usually easier to remember than exact interval
information (Dowling, 1978; Dowling & Fujitani, 1971), and numerous music
information retrieval systems use contour to find specific melodies from large music
databases (e.g., Kim et al., 2000; Lemström et al., 2001). Contour is also central in
explorations of melodic arches, which describe the typical contours found in the
phrases of Western folk songs (Huron, 1996).
Figure 9 below shows two versions of melodic contour using different degrees of
resolution. The degree of resolution depends upon the value of the sampling step,
expressed in MIDI beats. The larger the step, the coarser the contour. The
dashed line represents a detailed contour with the resolution step 0.25. This high
level of detail is not often necessary in dealing with melodic contour. The solid line
represents a coarser melodic contour that might be more useful for finding out the
overall structure of the melody.
plotmelcontour(laksin,0.25,'abs',':r.'); hold on
plotmelcontour(laksin,1,'abs','-bo'); hold off
legend(['resolution in beats=.25'; ...
'resolution in beats=1.0']);
Figure 9. Melodic contour and notation of "Läksin minä kesäyönä".
One application of the melodic contour is finding out whether the sequence contains
repeated melodic phrases. This can be done using an autocorrelation technique
(Eerola et al., submitted). The autocorrelation function of a time series is obtained by
correlating the series with a delayed copy of itself, using delay values ranging from -L
to +L, where L denotes the total length of the time series. A time series is
autocorrelated if it is possible to predict its value at a given point of time by knowing
its value at other points of time. Positive autocorrelation means that points at a certain
distance away from each other have similar values (Box, Jenkins & Reinsel, 1994).
l = reftune('laksin');
c = melcontouracorr(l);
t = [-(length(c)-1)/2:1:(length(c)-1)/2]*.1;
plot(t,c,'k'); md = round(max(onset(l)) + dur(l(end,:)));
axis([-md md -0.4 1]); xlabel('\bfLag (in beats)')
set(gca,'XTick',-md:2:md); ylabel('\bfCorr. coeff.')
Figure 10. A plot of autocorrelation across the melodic contour of "Läksin minä kesäyönä".
Figure 10 shows the autocorrelation function of the contour of Läksin minä kesäyönä.
At the middle of the figure (at the peak, lag 0 beats) the autocorrelation function gives
the result of 1.0, a perfect correlation, because at this point the melodic contour is
compared with itself. The autocorrelation function is always symmetric with respect
to the point corresponding to zero lag. Therefore, only the right half needs to be
regarded to estimate the degree of self-similarity. The shaded area shows the self-
similarity of the melodic contour; only the positive correlations of the autocorrelation
function (half-wave rectification) are observed. This relevant, right portion of the
autocorrelation function may be plotted using the 'ac' parameter in the plotmelcontour
command:
plotmelcontour(l,0.5,'abs','b','ac');
Figure 11. Self-similarity of melodic contour of Läksin minä kesäyönä.
Example 3: Key-Finding
The classic Krumhansl & Schmuckler key-finding algorithm (Krumhansl, 1990) is
based on key profiles obtained from empirical work by Krumhansl & Kessler (1982).
The key profiles were obtained in a series of experiments, where listeners heard a
context sequence, consisting of an incomplete major or minor scale or a chord
cadence, followed by each of the chromatic scale pitches in separate trials. (See
Example 9 for instructions on creating the probe-tone stimuli using the Toolbox).
Figure 12 shows the averaged data from all keys and contexts, called C major and C
minor key profiles.
Figure 12. Probe-tone ratings for the keys of C major and C minor (Krumhansl & Kessler,
1982).
In the K-S key-finding algorithm, the 24 individual key profiles (12 major and 12
minor) are correlated with the pitch-class distribution of the piece, weighted
according to tone durations. This gives a measure of the strength of each key.
Let us take the C major Prelude in J. S. Bach's Wohltemperierte Klavier II (BWV
870). The first page of this Prelude is shown in Figure 13.
We load this file into a variable called prelude and take only the first 10 measures
(first page in Figure 13) of it to find a likely key area:
prelude10=onsetwindow(prelude,0,40);
keyname(kkkey(prelude10))
ans = 'C'
The inner function in the second command line (kkkey) performs the K-S key-
finding algorithm and the outer function changes the numerical output of the inner
function to a letter denoting the key. Not surprisingly, the highest correlation of the
note distribution in the first 10 measures of the Prelude is obtained with the C major
key profile. A closer look at the other candidates the algorithm offers reveals the
strength of all keys:
keystrengths = kkcc(prelude10); % corr. to all keys
plotdist(keystrengths); % plot all corr. coefficients
Figure 14 displays the correlation coefficients for all 24 key profiles. According to the
figure, G major and a minor are also strong candidates for the most likely key.
This is not surprising, considering that these are the dominant and relative minor keys
of C major.
Figure 13. First page of Bach's C major Prelude from Wohltemperierte Klavier II (BWV 870).
Figure 14. Correlation coefficients of the pitch-class distribution in Bach's C-major prelude to
all 24 key profiles.
Another way of exploring key-finding is to look at how tonality changes over time. In
this technique, key-finding is performed within a small window that runs across the
length of the music. This operation uses the movewindow function. Below is an
example of finding the maximal key correlation using a 4-beat window that is moved
by 2 beats at a time.
prelude4=onsetwindow(prelude,0,16,'beat');
keys = movewindow(prelude4,4,2,'beat','maxkkcc');
label=keyname(movewindow(prelude4,4,2,'beat','kkkey'));
time=0:2:16; plot(time,keys,':ko','LineWidth',1.25);
axis([-0.2 16.2 .4 1])
for i=1:length(label)
text(time(i),keys(i)+.025,label(i),...
'HorizontalAlignment','center','FontSize',12);
end
ylabel('\bfMax. key corr. coeff.');
xlabel('\bfTime (beats)')
Figure 15. Maximum key correlation coefficients across time in the beginning of the C-major
Prelude.
Figure 15 displays the key changes over time, showing the movement towards F
major (meas. 4) and further to G major (meas. 6). Although the measure shows the
strength of the key correlation, it gives a rather simplistic view of the tonality, as the
dispersion of the key center between the alternate local keys is not shown. A recent
dynamic model of tonality induction (Toiviainen & Krumhansl, 2003) calculates local
tonality based on key profiles. The results may be projected onto a self-organizing
map (SOM) trained with the 24 key profiles. In the following example, the function
calculates the key strengths and creates the projection. The second argument in the
syntax example defines the colorbar and the third the color.
keysom(prelude10,1); % create a color figure
Figure 16. Self-organizing map (SOM) of the tonality in Bach's C-major Prelude.
The map underlying the tonal strengths in Figure 16 is toroidal in shape, which means
that the opposite edges are attached to each other. The local tonality is strongest in
the area between a minor and C major. This visualization of tonality can be used to
show the fluctuations of the key center and key strength over time. Below is an
example of this using a 4-beat window that steps 2 beats forward each time.
keysomanim(prelude4,4,2); % show animation in Matlab
keysomanim(prelude4,4,2,'beat','strip'); % show strips
Figure 17. First four measures (two frames per measure) of the tonality animation in Bach's
Prelude.
Figure 17 displays the tonality of the first four measures of the Prelude. From the
separate frames one can see how the tonal center is at first firmly in C major and then
moves towards other regions (F, e, etc.).
Figure 18. First four measures of Bach's Prelude corresponding to the tonality animation
of Figure 17.
Another option in the keysomanim function allows saving the animation as a Matlab
movie ('movie'). The saved movie can be played back with the movie command or
written to a file using the avifile command. When playing back the movie, be sure to
synchronize the animations using an equivalent frame rate in order to retain the timing
information. For example, to create an animation using 5 frames per second (fps), the
following syntax may be used:
m=keysomanim(prelude4,2,.2,'sec','movie'); % 5 fps
movie(m,1,5); % last arg. = frames per second (fps)
Matlab movies use extensive amounts of memory. Therefore, with long musical
sequences it is recommended to use the 'frames' option and combine the frames
afterwards with video editing software (such as Adobe Premiere). At the moment,
sound cannot be easily included in the animation file without using an external
utility.
Example 4: Meter-Finding
One way of visualizing the possible meter of a notematrix is to display its note onset
distribution in terms of the beat structure. This can be accomplished using the
onsetdist function. Let us plot the onset distribution of Bach's Prelude,
assuming a four-beat measure:
onsetdist(prelude,4,'fig');
In this function, the second parameter refers to the assumed number of beats per
measure. The onsets are weighted by the durations of tones because the longer the
tone is, the more salient and prominent it is for the listener (Thompson, 1994).
Figure 19. Distribution of note onsets in the Prelude.
Figure 19 shows that the Prelude is clear in terms of the note onset distribution across
a measure. Most onsets occur at the first beat of the measure, the most important
location according to the metrical hierarchy. This onset distribution is similar to that
commonly found in music, for example, in the works of Bach, Mozart, Brahms, and
Shostakovich (Palmer & Krumhansl, 1990). Behavioral experiments conducted by
Palmer and Krumhansl (1990) have also demonstrated that a similar hierarchical grid
may reside in the minds of Western listeners.
Inferring the meter is a challenge that involves finding a regular beat structure from
the notematrix. One technique is to use the autocorrelation function and to seek peaks
from the onset structure corresponding to simple duple (2/4, 2/8, 2/2, 4/4) or simple
triple meter (3/4, 3/2). This technique resembles the method used by Brown (1993) to
estimate meter. Toiviainen and Eerola (2004) tested the effectiveness of the method in
classifying meters as duple or triple using two large collections of melodies
(the Essen collection and Finnish Folk Tunes, N = 12,368). With only durational accents,
the correct classification rate was around 80%. This method is available as the meter
function in the Toolbox:
bestmeter = meter(laksin)
bestmeter = 3
This indicates that the most probable meter is simple triple (probably 3/4). When melodic
accent is incorporated into the inference of meter, the correct classification of meter is
higher (up to 93% of the Essen collection and 95% of the Finnish folk songs were
correctly classified in Toiviainen & Eerola, 2004). This optimized method is
available in the Toolbox via the 'optimal' parameter of the meter function, although its
scope is limited to monophonic melodies (see the sketch below).
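A minimal sketch of the optimized variant (the parameter name is taken from the text above):
bestmeter = meter(laksin,'optimal');  % meter finding with melodic accents included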
Detecting compound meters (6/8, 9/8, 6/4) presents another challenge for meter-
finding that will not be covered here. A plot of the autocorrelation results obtained by
using the onsetacorr function provides a closer look at how the meter is inferred
(Figure 20). In the function, the second parameter refers to divisions per quarter note.
onsetacorr(laksin,4,'fig');
Figure 20. Autocorrelation function of onset times in Läksin minä kesäyönä.
Figure 20 shows that the zero time lag receives a perfect correlation, as the onset
distribution is correlated with itself. Time lags at 1-8 quarter notes are stronger than
the time lags at other positions. Also, there is a difference between the correlations for
the time lags of 2, 3 and 4. The lag of 3 beats (marked with A) is higher (although only
slightly) than the lags of 2 and 4 beats, and therefore it is plausible that the meter is
simple triple.
Even if we now know the likely meter, we cannot be sure that the first event or events in
the notematrix are not pick-up beats. In this dilemma, it is useful to look at the
metrical hierarchy, which stems from the work of Lerdahl and Jackendoff (1983).
They described the rhythmic structure of Western music as consisting of an alternation
of weak and strong beats, which are organized in a hierarchical manner. The positions at
the highest level of this hierarchy correspond to the first beat of the measure and are
assigned the highest values, the second highest level corresponds to the middle of the
measure, and so on, depending on the meter. It is possible to examine the metrical
hierarchy of events in a notematrix by making use of the meter-finding algorithm and
finding the best fit between cyclical permutations of the onset distribution and the
Lerdahl and Jackendoff metrical hierarchies, shown below:
plothierarchy(laksin,'sec');
The dots in Figure 21 represent the metrical hierarchy. High stacks of dots (connected
with a stem) correspond to events that are high in the metrical hierarchy. In this melody,
three levels are in use. The meter-finding algorithm infers the meter of the tune correctly
(3/4), but the algorithm assumes that the first note is a pick-up note. This probably
happens because of the metrical stress caused by the long notes on the second beats of
measures three and six. A listener unfamiliar with the song could easily form this
interpretation of the meter.
Figure 21. Notation of Läksin minä kesäyönä (upper panel) and the inferred metrical
hierarchy for the events (lower panel).
Example 5: Melodic Segmentation
One of the fundamental processes in perceiving music is the segmentation of the
auditory stream into smaller units such as melodic phrases and motifs. Various
computational approaches to segmentation have been taken. With symbolic
representations of music, we can distinguish rule-based and statistical (or memory-
based) approaches. An example of the first category is the algorithm by Tenney and
Polansky (1980), which finds the locations where changes between 'clangs' occur.
These clang boundaries correspond to large pitch intervals and large inter-onset
intervals (IOIs). This idea is partly based on Gestalt psychology. For example, this
algorithm segments Läksin in the following way:
segmentgestalt(laksin,'fig');
Figure 22. Segmented version of Läksin minä kesäyönä. The dotted lines indicate clang
boundaries and the black line indicates the segment boundary, both the result of the Gestalt-
based algorithm (Tenney & Polansky, 1980).
Another segmentation technique uses probabilities derived from the analysis of
melodies (e.g., Bod, 2002). In this technique, demonstrated in Figure 23, the
probabilities of phrase boundaries have been derived from the pitch-class, interval, and
duration distributions at the segment boundaries in the Essen folk song collection.
segmentprob(laksin,.6,'fig');
Figure 23. Segmentation based on the probabilities of tone, interval, and duration distributions
at segment boundaries in the Essen collection. The middle panel shows the probabilities of
segment boundaries given by the algorithm. The tune contains the first two phrases of Läksin
minä kesäyönä.
Both segmentation algorithms produce plausible divisions of the example tune,
although the correct segmentation is more in line with Tenney & Polansky's model.
Finally, the Local Boundary Detection Model by Cambouropoulos (1997) is a recent
variant of the rule-based approach that offers effective segmentation of monophonic
input.
boundary(laksin,'fig');
Figure 24. Segmentation of Läksin minä kesäyönä based on the Local Boundary Detection
Model (Cambouropoulos, 1997).
Example 6: Melodic Expectations
Recent work on melodic expectancy has shown how music draws on common
psychological principles of expectation that have been captured in Narmour's (1990)
cognitively oriented music-theoretic model. The model draws on the Gestalt-based
principles of proximity, similarity, and good continuation and has been found to
predict listeners' melodic expectancies fairly well (Krumhansl, 1995a, b). The model
operates by looking at implicative intervals and realized intervals. The former creates
implications for the melody's continuation, and the following interval carries out its
implications (Figure 25).
Figure 25. An example of an implicative and a realized interval.
The model contains five principles (Registral Direction, Intervallic Difference,
Registral Return, Proximity, and Closure) that are each characterized by a specific
rule describing the registral direction and the distance in pitch between successive
tones. The principle of Registral Return, for example, refers to cases in which the
second tone of the realized interval is within two semitones of the first tone of the
implicative interval. According to the theory, listeners expect skips to return to a
proximate pitch. The combinations of implicative intervals and realized intervals that
satisfy this principle are shown by the shaded area in Figure 26.
Figure 26. Demonstration of Registral Return in Narmour's implication-realization model.
The vertical axis corresponds to the implicative interval, ranging from 0 to 11 semitones. The
horizontal axis corresponds to the realized interval, ranging from 12 semitones in the opposite
direction of the implicative interval to 12 semitones in the same direction as the implicative
interval. The shaded grids indicate the combinations of implied and realized intervals that
fulfil the principle. A small X is displayed where the example fragment from Figure 25 would
be positioned along the grid. According to the principle of Registral Return, the example
fragment (containing intervals of 2 semitones + 2 semitones) would not be predictable
according to the algorithm, as it lies outside the shaded area.
The implication-realization model has been quantified by Krumhansl (1995b), who
also added a new principle, Consonance, to the model. Figure 27 displays the
quantification schemes of all six principles in Narmour's model, available in the
Toolbox as the narmour function (see the sketch below).
Figure 27. Quantification of Narmour's Implication-realization model (Krumhansl,
1995b). The darker areas indicate better realization of implied intervals.
A cursory look at the shaded areas of Figure 27 indicates that proximate pitches
(Proximity), reversals of direction in large intervals (Registral Direction), and unisons,
perfect fourths, fifths and octaves (Consonance) are preferred as realized intervals by
Narmour's model.
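A minimal sketch of calling the function (the principle-selection argument shown here is an assumption; check help narmour for the actual parameter codes):
rr = narmour(laksin,'rr');  % assumed code for Registral Return; one value per note event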
Other factors affect melodic expectations as well. The local key context is also a
strong influence on what listeners expect of melodic continuations. If the key of the
passage is known, tonal stability values (obtained from the experiment by Krumhansl
& Kessler, 1982, shown in Figure 12) can be used to evaluate the degree of fitness of
individual tones to the local key context. The tonality function in the MIDI Toolbox can
be used to assign these values to note events, assuming the key is C major or C
minor. In addition, differences in the tonal strengths of the individual pitch-classes form
asymmetrical relationships between adjacent pitches. Tonally unstable tones tend
to be attracted to tonally stable tones (e.g., in C major, B is pulled towards the tonic C).
In the example shown in Figure 28, the tones G# and Gb receive the
lowest mean scores. The individual predictions of the different models illuminate why
these candidates receive different fitness ratings according to the models. The tone G,
the highest candidate, is appropriate to continue the sequence because of its close
proximity to the previous and median tone of the sequence, the high degree of tonal
stability of the mediant tone (G) in e minor, and because its movement direction can
be predicted from the previous pitch heights. The lowest-rated candidate is also close in
pitch proximity, but it is not tonally stable and it also forms a dissonant interval with the
previous tone. Note that not all principles are commonly needed to estimate the fitness
of a given tone to a sequence, and the exact weights of the principles vary across
musical styles. Furthermore, this method does not explicitly account for longer pitch
patterns, although it is evident that in the example melody, listeners have already come
across a similar melodic phrase in the beginning of the melody. However, these issues
can be examined using contour-related (Example 2) and continuous models (Example
7).
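A minimal sketch of obtaining the tonal stability values (this assumes that tonality returns one stability value per note event and that the melody is first transposed to C):
ts = tonality(transpose2c(laksin));  % Krumhansl & Kessler stability value for each note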
Figure 28. Context and continuations of the example melody.
Example 7: Melodic Complexity
Occasionally, it is interesting to know how complicated, difficult or original a
melody is. For example, Dean Keith Simonton (1984, 1994) analyzed a large number
of classical themes and noticed that the originality of the themes is connected with
their popularity. This relationship has the form of an inverted-U function, where the
most popular themes are of medium originality. As a result, the simplest themes
are not popular (they may be considered banal) and neither are the most complex
ones. There are also other uses for a melodic complexity measure, such as using it as
an aid in the classification of melodic material (Toiviainen & Eerola, 2001). Simonton's
model of melodic originality is based on tone-transition probabilities. The output of
this model (compltrans) is an inverse of the averaged probability, scaled
between 0 and 10, where a higher value indicates higher melodic originality.
Another way of assessing melodic complexity is to focus on tonal and accent
coherence and on the amount of pitch skips and contour self-similarity the melody
exhibits. This model has been coined the expectancy-based model of melodic complexity
(Eerola & North, 2000) because the components of the model are derived from
melodic expectancy theories (available as the complebm function). An alternative
measure of melodic complexity is anchored in continuous measurement of note event
distribution (pitch-class, interval) entropy (use movewindow together with entropy
and the various distribution functions, as in the sketch below). This measure creates
melodic predictability values for each point in the melody (hence the term continuous).
These values have been found to correspond to the predictability ratings given by
listeners in experiments (Eerola et al., 2002). This measure offers a possibility to
observe the moment-by-moment fluctuations in melodic predictability.
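A minimal sketch of the measures mentioned above (the complebm method string and the chaining of function names in movewindow are assumptions; check the respective help texts):
o = compltrans(laksin);    % Simonton's melodic originality, scaled 0-10
c = complebm(laksin,'t');  % expectancy-based complexity; 't' assumed to mean total
e = movewindow(sarabande,8,2,'beat','pcdist1','entropy');  % windowed pitch-class entropy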
Figure 29. Predictability of Bach's Sarabande (first 27 measures).
Figure 29 displays how the predictability fluctuates over time. In the beginning,
predictability increases as the opening melodic motifs are repeated (see Figure 4 for
the notation). At measure 20, the Sarabande takes a new turn; it modulates and contains
large pitch skips, all of which lead to lower predictability values.
Example 8: Analyzing MIDI Collections
In this example, we have a sample of 50 Finnish folk songs from the Suomen Kansan
Sävelmät collection (Eerola & Toiviainen, 2004). First, we load all songs saved in a
Matlab cell matrix (see the first line of commands below and the notes about the
collection format in Chapter 3). Then we can investigate any property of the
collection with a single command (analyzecoll). For example, the following
commands can be used to calculate the pitch-class profile of all songs in the collection
(all songs have been transposed into C major/c minor):
load finfolktunes.mat % we get a variable, nm
pcd = analyzecoll(nm,'pcdist1'); % 50 x 12 matrix
meanpcd = mean(pcd,1); % collapsed into 12-comp. vector
plotdist(meanpcd);
Figure 30. Pitch-class profile of the 50 Finnish folk songs (Eerola & Toiviainen, 2004).
In order to compare the resulting distribution to a suitable existing reference
distribution, one can use the refstat function in the Toolbox. Various reference
statistics in the refstat function can be listed using the help command. To obtain
the mean pitch-class distributions in Essen Folk Song Collection and in Schubert
songs, type:
essen_pc = refstat('pcdist1essen');
schubert_pc = refstat('pcdist1schubert');
The Essen tone profile has been obtained from the digital folk song collection, edited
by Helmut Schaffrath (1995), from which the songs have been converted into **kern
representation (Huron, 1999) and subsequently to MIDI and then analyzed using the
MIDI Toolbox. The Schubert profile is based on the work by Knopoff and Hutchinson
(1983), who tallied the frequency of tones in Schubert songs in major keys. Now
we can plot the pitch-class distributions of these three corpora:
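One way to draw the comparison in Figure 31 is with ordinary Matlab plotting; a minimal sketch using the variables defined above:
profiles = [meanpcd(:) essen_pc(:) schubert_pc(:)];  % 12 x 3 matrix of profiles
plot(profiles); legend('Finnish Collection','Essen Collection','Classical (major)');
set(gca,'XTick',1:12,'XTickLabel',{'C','C#','D','D#','E','F','F#','G','G#','A','A#','B'});
xlabel('Pitch-class'); ylabel('Proportion');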
Figure 31. Pitch-class profiles of Finnish (Eerola & Toiviainen, 2004) and European folk
songs (Essen collection, Schaffrath, 1995) and Schubert songs (Knopoff & Hutchinson,
1983).
The note distributions of the three collections seem to be highly similar. The profile
obtained from the Finnish folk songs displays some differences, mainly concerning a
higher proportion of the subdominant (F) and the lowered leading note (Bb).