Python Programming for Media Computation
Mark Guzdial
I Introduction
2 Introduction to Programming
2.1 Programming is about Naming
2.1.1 Files and their Names
2.2 Programming in Python
2.2.1 Programming in JES
2.2.2 Media Computation in JES
2.2.3 Making a Recipe
II Sounds
4 Creating Sounds
4.1 Creating an Echo
4.1.1 Creating Multiple Echoes
4.1.2 Additive Synthesis
V Files
VI Text
14 Recursion
15 Objects
16 Java
16.1 Java examples
List of Figures
4.1 The top and middle waves are added together to create the bottom wave
4.2 The raw 440 Hz signal on top, then the 440+880+1320 Hz signal on the bottom
4.3 FFT of the 440 Hz sound
4.4 FFT of the combined sound
4.5 The 440 Hz square wave (top) and additive combination of square waves (bottom)
4.6 FFTs of the 440 Hz square wave (top) and additive combination of square waves (bottom)
5.16 Original picture (left) and mirrored along the vertical axis (right)
5.17 Flowers in the mediasources folder
5.18 Collage of flowers
5.19 Increasing reds in the browns
5.20 Increasing reds in the browns, within a certain range
5.21 A picture of a child (Katie), and her background without her
5.22 A new background, the moon
5.23 Katie on the moon
5.24 Mark in front of a blue sheet
5.25 Mark on the moon
5.26 Mark in the jungle
5.27 Color: RGB triplets in a matrix representation
5.28 Color: The original picture (left) and red-reduced version (right)
5.29 Color: Overly blue (left) and red increased by 20% (right)
5.30 Color: Original (left) and blue erased (right)
5.31 Color: Lightening and darkening of original picture
5.32 Color: Negative of the image
5.33 Color: Color picture converted to greyscale
5.34 Color: Increasing reds in the browns
5.35 Color: Increasing reds in the browns, within a certain range
This book is based on the proposition that very few people actually want to
learn to program. However, most educated people want to use a computer,
and the task that they most want to do with a computer is communicate.
Alan Perlis first made the claim in 1961 that computer science, and
programming explicitly, should be part of a liberal education [Greenberger, 1962].
However, what we've learned since then is that one doesn't just "learn to
program." One learns to program something [Adelson and Soloway, 1985,
Harel and Papert, 1990], and the motivation to do that something can make
the difference between learning to program or not [Bruckman, 2000].
The philosophies which drive the structure of this book include:
The computer is the most amazingly creative device that humans have
ever conceived of. It is literally completely made up of mind-stuff. As
the movie says, "Don't just dream it, be it." If you can imagine it,
Typographical notations
Examples of Python code look like this: x = x + 1. Longer examples
look like this:
def helloWorld():
  print "Hello, world!"
When showing something that the user types in with Python's response,
it will have a similar font and style, but the user's typing will appear after
a Python prompt (>>>):
>>> print 3 + 4
7
def helloWorld():
  print "Hello, world!"
End of Recipe 1
Making it Work Tip: An Example How To Make It Work
Best practices or techniques that really help are highlighted like this.
Acknowledgements
Our sincere thanks go out to
Jason Ergle, Claire Bailey, David Raines, and Joshua Sklare, who made
JES a reality with amazing quality for such a short amount of time.
Jason and David took JES the next steps, improving installation,
debugging, and process support.
Adam Wilson built the MediaTools that are so useful for exploring
sounds and images and processing video.
Kurt Eiselt, who worked hard to make this effort real, convincing others
to take it seriously.
Janet Kolodner and Aaron Bobick, who were excited and encouraging
about the idea of media computation for students new to computer
science.
Joan Morton, Chrissy Hendricks, and all the staff of the GVU Center,
who made sure that we had what we needed and that the paperwork
was done to make this class come together.
Introduction
Chapter 1
Introduction to Media Computation
Some computer scientists study how recipes are written: Are there
better or worse ways of doing something? If you've ever had to
separate whites from yolks in eggs, you know that knowing the right way
to do it makes a world of difference. Computer science theoreticians
worry about the fastest and shortest recipes, and the ones that take
up the least amount of space (you can think about it as counter space;
the analogy works). How a recipe works, completely apart from
how it's written, is called the study of algorithms. Software
engineers worry about how large groups can put together recipes that still
work. (The recipe for some programs, like the one that keeps track of
Visa/MasterCard records, has literally millions of steps!)
Can recipes be written for anything? Are there some recipes that
can't be written? Computer scientists actually do know that there are
recipes that can't be written. For example, you can't write a recipe
that can absolutely tell, for any other recipe, if the other recipe will
actually work. How about intelligence? Can we write a recipe that can
think (and how would you tell if you got it right)? Computer scientists
in theory, intelligent systems, artificial intelligence, and systems worry
about things like this.
There are even computer scientists who worry about whether people
like what the recipes produce, like the restaurant critics for the
newspaper. Some of these are human-computer interface specialists
who worry about whether people like how the recipes work (those
"recipes" that produce an interface that people use, like windows,
buttons, scrollbars, and other elements of what we think about as a
running program).
The recipe metaphor also works on another level. Everyone knows that
some things in a recipe can be changed without changing the result
dramatically. You can always increase all the units by a multiplier to make more.
You can always add more garlic or oregano to the spaghetti sauce. But
there are some things that you cannot change in a recipe. If the recipe calls
for baking powder, you may not substitute baking soda. If you're supposed
to boil the dumplings then saute them, the reverse order will probably not
work well.
The same holds for software recipes. There are usually things you can easily
change: The actual names of things (though you should change names
consistently), some of the constants (numbers that appear as plain old numbers,
not as variables), and maybe even some of the data ranges (sections of the
data) being manipulated. The order of the commands to the computer,
however, almost always has to stay exactly as stated. As we go on, you'll
learn what can be changed safely, and what can't.
Computer scientists specify their recipes with programming languages.
Different programming languages are used for different purposes. Some of
them are wildly popular, like Java and C++. Others are more obscure,
like Squeak and T. Others are designed to make computer science ideas
very easy to learn, like Scheme or Python, but the fact that they're easy to
learn doesn't always make them very popular nor the best choice for experts
building larger or more complicated recipes. It's a hard balance in teaching
computer science to pick a language that is easy to learn and is popular and
useful enough that students are motivated to learn it.
Why don't computer scientists just use natural languages, like English
or Spanish? The problem is that natural languages evolved the way that
they did to enhance communications between very smart beings, humans.
As we'll go into more in the next section, computers are exceptionally dumb.
They need a level of specificity that natural language isn't good at. Further,
what we say to one another in natural communication is not exactly what
you're saying in a computational recipe. When was the last time you told
someone how a videogame like Doom or Quake or Super Mario Brothers
worked in such minute detail that they could actually replicate the game
(say, on paper)? English isn't good for that kind of task.
There are so many different kinds of programming languages because
there are so many different kinds of recipes to write. Programs written in
the programming language C tend to be very fast and efficient, but they
also tend to be hard to read, hard to write, and require units that are more
about computers than about bird migrations or DNA or whatever else you
want to write your recipe about. The programming language Lisp (and its
related languages like Scheme, T, and Common Lisp) is very flexible and is
well suited to exploring how to write recipes that have never been written
before, but Lisp looks so strange compared to languages like C that many
people avoid it and there are (natural consequence) few people who know it.
If you want to hire a hundred programmers to work on your project, you're
going to find it easier to find a hundred programmers who know a popular
language than a less popular one, but that doesn't mean that the popular
language is the best one for your task!
The programming language that we're using in this book is Python
(see http://www.python.org for more information on Python). Python is a
fairly popular programming language, used very often for Web and media
programming. The web search engine Google makes extensive use of
Python. The media company Industrial Light & Magic also uses Python.
A list of companies using Python is available at
http://www.python.org/psa/Users.html. Python is known for being easy to
learn, easy to read, and very flexible, but not very efficient. The same
algorithm coded in C and in Python will probably be faster in C. The version
of Python that we're using is called Jython (http://www.jython.org). Python
is normally implemented in the programming language C. Jython is Python
implemented in Java. Jython lets us do multimedia that will work across
multiple computer platforms.
[1] We'll talk more about this level of the computer in Chapter 12.
[Figure: eight wires, each carrying a 0 or a 1, interpreted as the number 74]
Figure 1.1: Eight wires with a pattern of voltages is a byte, which gets
interpreted as a pattern of eight 0s and 1s, which gets interpreted as a
decimal number.
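Here is a small sketch of that last step, going from a pattern of 0s and 1s to a decimal number. It is written for an ordinary Python interpreter, and it makes some assumptions: the names bits and value are made up for illustration, and the pattern is read with the leftmost bit as the most significant one, which is the reading that yields the 74 shown in the figure.

```python
# Eight bits, leftmost bit most significant (an assumed reading order).
bits = "01001010"

# Work left to right: doubling shifts the running total one place over,
# then the next bit (0 or 1) is added in.
value = 0
for bit in bits:
    value = value * 2 + int(bit)

print(value)
```

The same eight wires could just as well be interpreted some other way; the number is only one encoding we choose to layer on top of the voltages.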
study computer science? Why should you learn to program? Isn't it enough
to learn to use all this great software? The following two sections provide
two answers to these questions.
Exercises
Exercise 1: Find an ASCII table on the Web: A table listing every
character and its corresponding numeric representation.
Exercise 2: Find a Unicode table on the Web. What's the difference
between ASCII and Unicode?
Exercise 3: Consider the representation for pictures described in
Section 1.3, where each "dot" (pixel) in the picture is represented by three
bytes, for the red, green, and blue components of the color at that dot. How
many bytes does it take to represent a 640x480 picture, a common picture
size on the Web? How many bytes does it take to represent a 1024x768
picture, a common screen size? (What do you think is meant now by a "3
megapixel" camera?)
Exercise 4: How many different numbers can be represented by one byte?
In other words, eight bits can represent from zero to what number? What
To Dig Deeper
James Gleick's book Chaos describes more on emergent properties.
Mitchel Resnick's book Turtles, Termites, and Traffic Jams: Explorations
in Massively Parallel Microworlds [Resnick, 1997] describes how ants,
termites, and even traffic jams and slime molds can be described pretty
accurately with hundreds or thousands of very small programs running and
interacting all at once.
Beyond the Digital Domain [Abernethy and Allen, 1998] is a wonderful
introductory book to computation with lots of good information about
digital media.
Chapter 2
Introduction to Programming
Obviously, the computer itself doesn't care about names. If the computer
is just a calculator, then remembering words and the words' association with
values is just a waste of the computer's memory. But for humans, it's very
powerful. It allows us to work with the computer in a natural way, even a
way that extends how we think about recipes (processes) entirely.
A programming language is really a set of names that a computer has
encodings for, such that those names make the computer do expected
actions and interpret our data in expected ways. Some of the programming
language's names allow us to define new namings, which allows us to create
our own layers of encoding. Assigning a variable to a value is one way of
defining a name for the computer. Defining a function is giving a name to
a recipe.
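A minimal sketch of both kinds of naming follows. The names greeting and shout are invented for this illustration and are not part of JES or of anything later in the book; the code is plain Python and should behave the same in any Python interpreter.

```python
# Assigning a value to a variable gives the value a name.
greeting = "Hello"

# Defining a function gives a recipe a name.
def shout(text):
    return text.upper() + "!"

# Both names can now be used wherever the value or the recipe is needed.
result = shout(greeting)
print(result)
```

Nothing about the computer changed when these names were introduced; what changed is the vocabulary we can use to talk to it.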
There are good names and less good names. That has nothing to do
with curse words, nor with TLAs (Three Letter Acronyms). A good set of
encodings and names allows one to describe recipes in a way that's natural,
without having to say too much. The variety of different programming
languages can be thought of as a collection of sets of namings-and-encodings.
Some are better for some tasks than others. Some languages require you to
write more to describe the same recipe than others, but sometimes that
"more" leads to a much more (human) readable recipe that helps others to
understand what you're saying.
How the units and values (data) of a recipe can be interpreted is often
also named. Remember how we said in Section 1.2 (page 11) that everything
is in bytes, but we can interpret those bytes as numbers? In some
programming languages, you can say explicitly that some value is a byte, and later
tell the language to treat it as a number, an integer (or sometimes int).
Similarly, you can tell the computer that this series of bytes is a collection
of numbers (an array of integers), or a collection of characters (a string),
or even a more complex encoding of a single floating point number (a
float: any number with a decimal point in it).
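To make the idea of interpreting the same bytes in different ways concrete, here is a rough sketch. It assumes a standard Python 3 interpreter rather than JES (the bytes type and int.from_bytes used here are Python 3 features), and the particular byte values and variable names are chosen only for illustration.

```python
import struct

raw = bytes([72, 105])                     # a series of two bytes

as_numbers = list(raw)                     # a collection of numbers
as_text = raw.decode("ascii")              # a collection of characters
as_integer = int.from_bytes(raw, "big")    # one integer: 72 * 256 + 105

# Four bytes holding a standard 32-bit floating point encoding of 1.5.
float_bytes = struct.pack(">f", 1.5)
as_float = struct.unpack(">f", float_bytes)[0]

print(as_numbers)
print(as_text)
print(as_integer)
print(as_float)
```

The bytes themselves never change in the first three interpretations; only the name we give to the interpretation (list of integers, string, integer) does.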
When you bring things into memory, you will name the value, so that
you can retrieve it and use it later. In that sense, programming is something
like algebra. To write generalizable equations and functions (those that work
for any number or value), you wrote equations and functions with variables,
like PV = nRT or e = Mc^2 or f(x) = sin(x). Those P's, V's, R's, T's,
e's, M's, c's, and x's were names for values. When you evaluated f(30), you
knew that the x was the name for 30 when computing f. We'll be naming
media (as values) in the same way when using them when programming.
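The algebra example carries over to Python almost word for word. The sketch below is an illustration only, and it assumes that the 30 in f(30) is an angle in degrees (Python's math.sin expects radians, so the sketch converts first).

```python
import math

def f(x):
    # x is the name for whatever value is supplied when f is evaluated.
    # Assumption: x is an angle measured in degrees.
    return math.sin(math.radians(x))

answer = f(30)   # while f runs, x is the name for 30
print(answer)
```

The result is one half, up to the tiny rounding error that floating point numbers carry.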
1. Make sure that you have Java installed on your computer. If you don't
have it, you can get it from the Sun site at http://www.java.sun.com.
3. If you don't have a CD, you'll need the individual components. You'll
be able to get Jython from http://www.jython.org and JES from
http://coweb.cc.gatech.edu/mediaComp-plan/Gettingsetup. Be
sure to unzip JES as a directory inside the Jython directory.
[Figure: the JES window, with the program area and the command area labeled]
>>> print a
A local or global name could not be
found.
Another function that JES [1] knows is one that allows you to pick a file
from your disk. Unlike ord, it takes no input, but it does return a string,
which is the name of the file on your disk. The name of the function is
pickAFile. Python is very picky about capitalization: neither pickafile
nor Pickafile will work! Try it like this: print pickAFile(). When you
do, you will get something that looks like Figure 2.3.
You're probably already familiar with how to use a file picker or file
dialog like this:
Double-click on folders/directories to open them.
Click to select and then click Open, or double-click, to select a file.
Once you select a file, what gets returned is the file name as a string
(a sequence of characters). (If you click Cancel, pickAFile returns the
empty string, a string with no characters in it, i.e., "".) Try
it: Do print pickAFile() and Open a file.
What you get when you finally select a file will depend on your operating
system. On Windows, your file name will probably start with C: and will
[1] You might notice that I switched from saying "Python knows" to "JES knows." print
is something that all Python implementations know. pickAFile is something that we built
for JES. In general, you can ignore the difference, but if you try to use another kind of
Python, it'll be important to know what's common and what isn't.
The character between words (e.g., the "/" between "Users" and
"guzdial") is called the path delimiter. Everything from the beginning of
the file name to the last path delimiter is called the path to the file.
That describes exactly where on the hard disk (in which directory) a
file exists.
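The definition of the path can be tried out directly on a string. This is only a sketch: it uses the example file name that appears elsewhere in this chapter, assumes "/" is the path delimiter (on Windows it would be a backslash), and the variable names are made up for illustration.

```python
filename = "/Users/guzdial/mediasources/barbara.jpg"
delimiter = "/"

# Position of the last path delimiter in the file name.
last = filename.rfind(delimiter)

path = filename[:last + 1]       # beginning through the last delimiter
base_name = filename[last + 1:]  # what remains after the path

print(path)
print(base_name)
```

Gluing the two pieces back together (path + base_name) gives the complete file name again.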
Files that have an extension of .jpg are JPEG files. They contain
pictures. JPEG is a standard encoding for many kinds of images. The other
kind of media files that we'll be using a lot are .wav files (Figure 2.4). The
.wav extension means that these are WAV files. They contain sounds.
WAV is a standard encoding for sounds. There are many other kinds of
extensions for files, and there are even many other kinds of media extensions.
For example, there are also GIF (.gif) files for images and AIFF (.aif
or .aiff) files for sounds. We'll stick to JPEG and WAV in this text, just
to avoid too much complexity.
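Since the extension is just the tail end of the file name, a recipe can use it to guess what kind of media a file holds. The sketch below is an assumption-laden illustration, not something JES provides: the table MEDIA_KINDS and the function media_kind are invented names, and the table lists only the extensions mentioned in the paragraph above.

```python
# Extensions named in the text, mapped to the kind of media they hold.
MEDIA_KINDS = {
    ".jpg": "picture (JPEG)",
    ".gif": "picture (GIF)",
    ".wav": "sound (WAV)",
    ".aif": "sound (AIFF)",
    ".aiff": "sound (AIFF)",
}

def media_kind(filename):
    # The extension is everything from the last "." onward.
    dot = filename.rfind(".")
    if dot == -1:
        return "unknown"
    return MEDIA_KINDS.get(filename[dot:].lower(), "unknown")

print(media_kind("/Users/guzdial/mediasources/barbara.jpg"))
print(media_kind("/Users/guzdial/mediasources/hello.wav"))
```

The extension is only a naming convention, of course. Renaming a sound file to end in .jpg does not turn it into a picture.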
[Figure 2.4: a .jpg file and a .wav file]
Showing a Picture
So now we know how to get a complete file name: Path and base name.
This doesn't mean that we have the file itself loaded into memory. To get
the file into memory, we have to tell JES how to interpret this file. We know
that JPEG files are pictures, but we have to tell JES explicitly to read the
file and make a picture from it. There is a function for that, too, named
makePicture.
makePicture does require an argument, some input to the function. It
takes a file name! Lucky us: we know how to get one of those!
The result from print suggests that we did in fact make a picture, from
a given filename and a given height and width. Success! Oh, you wanted
to actually see the picture? We'll need another function! (Did I mention
somewhere that computers are stupid?) The function to show the picture
is named show. show also takes an argument, a Picture! Figure 2.5 is the
result. Ta-dah!
Notice that the output from show is None. Functions in Python don't
have to return a value, unlike real mathematical functions. If a function
does something (like opening up a picture in a window), then it doesn't also
need to return a value. It's still pretty darn useful.
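You can see the same behavior without any pictures at all. In this sketch, announce and double are made-up functions (not part of JES): one only does something, the other returns a value. The print calls are written with parentheses so that the sketch runs in current Python interpreters as well.

```python
def announce(message):
    # Does something (prints) but has no return statement.
    print("Showing: " + message)

def double(number):
    # Computes a value and hands it back.
    return number * 2

nothing = announce("barbara.jpg")
four = double(2)
print(nothing)
print(four)
```

A function without a return statement still hands something back to whoever asked: the special value None.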
Playing a Sound
We can actually replicate this entire process with sounds.
We will use play to play the sound. play takes a sound as input, but
returns None.
None
(We'll explain what the length of the sound means in the next chapter.)
Please do try this on your own, using JPEG files and WAV files that you
have on your own computer, that you make yourself, or that came on your
CD. (We talk more about where to get the media and how to create it in
future chapters.)
Congratulations! You've just worked your first media computation!
Notice that the algebraic notions of substitution and evaluation work here
as well. mypicture = makePicture(myfilename) causes the exact same
picture to be created as if we had executed makePicture(pickAFile()) [2],
because we set myfilename to be equal to the result of pickAFile(). The
values get substituted for the names when the expression is evaluated.
makePicture(myfilename) is an expression which, at evaluation time, gets
expanded into
makePicture("/Users/guzdial/mediasources/barbara.jpg")
because "/Users/guzdial/mediasources/barbara.jpg" is the name of the file
that was picked when pickAFile() was evaluated and the returned value
was named myfilename.
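Here is a sketch of that substitution that can be run outside of JES. The functions pick_a_file and make_picture are hypothetical stand-ins written just for this illustration (the real pickAFile opens a dialog and the real makePicture reads the file); the stand-in always "picks" the same file so that the comparison is fair.

```python
def pick_a_file():
    # Hypothetical stand-in for JES's pickAFile: always "picks" the same file.
    return "/Users/guzdial/mediasources/barbara.jpg"

def make_picture(filename):
    # Hypothetical stand-in for JES's makePicture: just describes the picture.
    return "Picture from " + filename

# Naming the intermediate value...
myfilename = pick_a_file()
mypicture = make_picture(myfilename)

# ...gives the same result as nesting the calls directly.
same_picture = make_picture(pick_a_file())

print(mypicture == same_picture)
```

The name myfilename adds nothing for the computer. It is there so that a human reader can see what the value means.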
[2] Assuming, of course, that you picked the same file.
We actually don't need to use print every time we ask the computer to
do something. If we want to call a function that doesn't return anything
(and so is pretty useless to print), we can just call the function by typing
its name and its input (if any) and hitting return.
>>> show(mypicture)
Whatever inputs that this recipe will take. This recipe can be a func-
tion that takes inputs, like abs or makePicture. The inputs are named
and placed between parentheses separated by commas. If your recipe
takes no inputs, you simply enter () to indicate no inputs.
What comes after that are the commands to be executed, one after the
other, whenever this recipe is told to execute.
At this point, I need you to imagine a bit. Most real programs that
do useful things, especially those that create user interfaces, require the
definition of more than one function. Imagine that in the program area you
have several def commands. How do you think Python will figure out that
one function has ended and a new one begun? (Especially because it is
possible to define functions inside of other functions.) Python needs some
way of figuring out where the function body ends: Which statements are
part of this function and which are part of the next.
The answer is indentation. All the statements that are part of the
definition are slightly indented after the def statement. I recommend using
exactly two spaces: it's enough to see, it's easy to remember, and it's
simple. You'd enter the function in the program area like this (where ␣ indicates
a single space, a single press of the spacebar):
def hello():
␣␣print "Hello!"
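To see indentation doing its job with more than one function, consider this sketch. The function names are invented for illustration, the two-space indentation follows the recommendation above, and the code is plain Python rather than anything specific to JES.

```python
def outer():
  # These indented lines belong to outer.
  def inner():
    # Indented one level further, so this line belongs to inner.
    return "inner ran"
  return inner() + ", then outer finished"

def next_function():
  # Back at the left margin: a new definition has begun.
  return "a separate function"

print(outer())
print(next_function())
```

Python never needs an "end of function" marker. A line that returns to an earlier level of indentation is the signal that the more deeply indented block is over.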
We can now define our first recipe! You type this into the program area
of JES. When you're done, save the file: Use the extension .py to indicate
a Python file. (I saved mine as pickAndShow.py.)
def pickAndShow():
  myfile = pickAFile()          # ask the user to choose a file
  mypict = makePicture(myfile)  # read the file and make a picture from it
  show(mypict)                  # open the picture in a window
End of Recipe 2
Once you've typed in your recipe and saved it, you can load it. Click
the Load button.
Now you can execute your recipe. Click in the command area. Since
you aren't taking any input and not returning any value (i.e., this isn't a
function), simply type the name of your recipe as a command:
>>> pickAndShow()
>>>
You'll be prompted for a file, and then the picture will appear
(Figure 2.6).
We can similarly define our second recipe, to pick and play a sound.
def pickAndPlay():
  myfile = pickAFile()         # ask the user to choose a file
  mysound = makeSound(myfile)  # read the file and make a sound from it
  play(mysound)                # play the sound
End of Recipe 3
Making it Work Tip: Name the names you like
You'll notice that, in the last section, we were using
the names myfilename and mypicture. In this
recipe, I used myfile and mypict. Does it matter?
Absolutely not! Well, to the computer, at any rate.
The computer doesn't care what names you use;
they're entirely for your benefit. Pick names that (a)
are meaningful to you (so that you can read and
understand your program), (b) are meaningful to others
(so that others you show your program to can
understand it), and (c) are easy to type. Very long
names, like
myPictureThatIAmGoingToOpenAfterThis
are meaningful and easy to read, but are a royal pain to
type.
While cool, this probably isn't the most useful thing for you. Having
to pick the file over and over again is just a pain. But now that we have
the power of recipes, we can define new ones however we like! Let's define
one that will just open a specific picture we want, and another that opens
a specific sound we want.
Use pickAFile to get the file name of the sound or picture that you want.
We're going to need that in defining the recipe to play that specific sound
or show that specific picture. We'll just set the value of myfile directly,
instead of as a result of pickAFile, by putting the string between quotes.
Be sure to replace FILENAME below with the complete path to your own
picture file, e.g.,
/Users/guzdial/mediasources/barbara.jpg
def showPicture():
  myfile = "FILENAME"           # replace with the complete path to a picture
  mypict = makePicture(myfile)
  show(mypict)
End of Recipe 4
Be sure to replace FILENAME below with the complete path to your own
sound file, e.g.,
/Users/guzdial/mediasources/hello.wav.
def playSound():
  myfile = "FILENAME"          # replace with the complete path to a sound
  mysound = makeSound(myfile)
  play(mysound)
End of Recipe 5
Making it Work Tip: Copying and pasting
Text can be copied and pasted between the program
and command areas. You can use print
pickAFile() to print a filename, then select it and
copy it (from the Edit menu), then click in the
program area and paste it. Similarly, you can copy
whole commands from the command area up to the
program area: That's an easy way to test the
individual commands, and then put them all in a recipe
once you have the order right and they're working.
You can also copy text within the command area.
Instead of re-typing a command, select it, copy it,
paste it into the bottom line (make sure the cursor
is at the end of the line!), and hit return to execute
it.
How do we create a real function with inputs out of our stored recipe? Why
would you want to?
An important reason for letting the input to a recipe be specified through
a variable is to make the program more general. Consider Recipe 4,
showPicture. That's for a specific file name. Would it be useful to have a
function that could take any file name, then make and show the picture?
That kind of function handles the general case of making and showing
pictures. We call that kind of generalization abstraction. Abstraction leads to
general solutions that work in lots of situations.
Defining a recipe that takes input is very easy. It continues to be a matter
of substitution and evaluation. We'll put a name inside those parentheses
on the def line. That name is sometimes called the parameter; we'll just
call it the input variable for right now.
When you evaluate the function, by specifying its name with an input
value (also called the argument) inside parentheses (like makePicture(myfilename)
or show(mypicture)), the input value is assigned to the input variable.
During the execution of the function (recipe), the input value will be substituted
for the input variable.
Here's what a recipe would look like that takes the file name as an input
variable:
def showNamed(myfile):
  mypict = makePicture(myfile)  # myfile names whatever input was given
  show(mypict)
End of Recipe 6
When I type
showNamed("/Users/guzdial/mediasources/barbara.jpg")
and hit return, the variable myfile takes on the value
"/Users/guzdial/mediasources/barbara.jpg".
mypict will then be assigned to the picture resulting from reading and
interpreting the file at
/Users/guzdial/mediasources/barbara.jpg
Then the picture is shown.
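If you would like to watch the input value travel through the recipe without JES at hand, the sketch below may help. Be warned that makePicture and show here are stand-ins written for this illustration only: they record what they were given instead of reading or displaying anything, and they should not be typed into JES, where the real functions already exist. The second file name, beach.jpg, is hypothetical.

```python
shown = []  # records what the stand-in show() was asked to display

def makePicture(filename):
  # Stand-in for the JES function: returns a description, not a real picture.
  return "Picture, filename " + filename

def show(picture):
  # Stand-in for the JES function: remembers the picture instead of drawing it.
  shown.append(picture)

def showNamed(myfile):
  mypict = makePicture(myfile)
  show(mypict)

# The same recipe works for any file name it is handed.
showNamed("/Users/guzdial/mediasources/barbara.jpg")
showNamed("/Users/guzdial/mediasources/beach.jpg")  # hypothetical file
print(shown)
```

Each call binds myfile to a different input value, which is what makes the one recipe general.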
We can do a sound in the same way.
def playNamed(myfile):
  mysound = makeSound(myfile)
  play(mysound)
End of Recipe 7
We can also write recipes that take pictures or sounds in as the input
values. Here's a recipe that shows a picture but takes the picture object as
the input value, instead of the filename.
def showPicture(mypict):
  show(mypict)
End of Recipe 8
Now, what's the difference between the function showPicture and the
provided function show? Nothing at all. We can certainly create a function
that provides a new name to another function. If that makes your code
easier for you to understand, then it's a great idea.
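The same trick works with any function, including the ones Python already knows. In this sketch, magnitude and size are names invented for illustration, and abs is Python's built-in absolute value function.

```python
def magnitude(number):
  # A new name for the built-in abs: it simply passes its input along.
  return abs(number)

# A function is a value too, so plain assignment also creates a new name.
size = abs

print(magnitude(-7))
print(size(-7))
```

Both approaches give the recipe a second name. The def version adds one extra step when the function is called, while the assignment version makes size and abs two names for the very same function.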
In this chapter, we talk about several kinds of encodings of data (or objects).
Exercises
Exercise 9: Try some other operations with strings in JES. What happens
if you multiply a number by a string, like 3 * "Hello"? What happens if
you try to multiply a string by a string, "a" * "b"?
Exercise 10: You can combine the sound playing and picture showing
commands in the same recipe. Try playing a sound and then showing a
picture while the sound is playing. Try playing a sound and opening several
pictures while the sound is still playing.
Exercise 11: We evaluated the expression pickAFile() when we wanted
to execute the function named pickAFile. But what is the name pickAFile
anyway? What do you get if you print pickAFile? How about print
makePicture? What do you think is going on here?
Exercise 12: Try the playNamed function. You weren't given any
examples of its use, but you should be able to figure it out from showNamed.
To Dig Deeper
The best (deepest, most material, most elegant) computer science textbook
is Structure and Interpretation of Computer Programs [Abelson et al., 1996],
by Abelson, Sussman, and Sussman. It's a hard book to get through, though.
Somewhat easier, but in the same spirit, is the new book How to Design
Programs [Felleisen et al., 2001].
Neither of these books is really aimed at students who want to program
because it's fun or because they have something small that they want to do.
They're really aimed at future professional software developers. The best
books aimed at the less hardcore user are by Brian Harvey. His book Simply
Scheme uses the same programming language as the earlier two, Scheme,
but is more approachable. My favorite of this class of books, though, is
Brian's three volume set Computer Science Logo Style [Harvey, 1997], which
combines good computer science with creative and fun projects.
Part II
Sounds
Chapter 3
Encoding and Manipulating Sounds
Figure 3.1: Raindrops causing ripples in the surface of the water, just as
sound causes ripples in the air
The simplest sound in the world is a sine wave (Figure 3.2). In a sine
wave, the compressions and rarefactions arrive with equal size and regularity.
In a sine wave, one compression plus one rarefaction is called a cycle. At
some point in the cycle, there has to be a point where there is zero pressure,
just between the compression and the rarefaction. The distance from the
zero point to the greatest pressure (or least pressure) is called the amplitude.
Formally, amplitude is measured in newtons per square meter (N/m^2).
That's a rather hard unit to understand in terms of perception, but you can
get a sense of the amazing range of human hearing from this unit. The
smallest sound that humans typically hear is 0.0002 N/m^2, and the point at
which we sense the vibrations in our entire body is 200 N/m^2! In general,
amplitude is the most important factor in our perception of volume: If the
3.1. HOW SOUND IS ENCODED 47
Figure 3.3: The distance between peaks in ripples in a pond is not
constant: some gaps are longer, some shorter
a sense of where music fits into that spectrum, the note A above middle C
is 440 Hz in traditional, equal temperament tuning (Figure 3.4).
Like intensity, our perception of pitch is almost exactly proportional to
the log of the frequency. We really don't perceive absolute differences in
pitch, but the ratio of the frequencies. If you heard a 100 Hz sound followed
by a 200 Hz sound, you'd perceive the same pitch change (or pitch interval)
as a shift from 1000 Hz to 2000 Hz. Obviously, a difference of 100 Hz is a lot
smaller than a change of 1000 Hz, but we perceive it to be the same.
In standard tuning, the ratio in frequency between the same notes in
adjacent octaves is 2 : 1. Frequency doubles each octave. We told you
earlier that A above middle C is 440 Hz. You know then that the next A
up the scale is 880 Hz.
How we think about music depends on our cultural standards,
but there are some universals. Among these
universals are the use of pitch intervals (e.g., the ratio between notes C and D
remains the same in every octave), a constant relationship between octaves,
and the existence of four to seven main pitches (not considering
sharps and flats here) in an octave.
What makes the experience of one sound different from another? Why
Real sounds are almost never single frequency sound waves. Most
natural sounds have several frequencies in them, often at different
amplitudes. When a piano plays the note C, for example, part of the
richness of the tone is that the notes E and G are also in the sound,
but at lower amplitudes.
Not all sound waves are represented very well by sine waves. Real
sounds have funny bumps and sharp edges. Our ears can pick these up,
at least in the first few waves. We can do a reasonable job synthesizing
with sine waves, but synthesizers sometimes also use other kinds of
wave forms to get different kinds of sounds (Figure 3.5).
Figure 3.5: Some synthesizers using triangular (or sawtooth) or square waves.
Using the sound tools, you can actually observe sounds to get a sense of
what louder and softer sounds look like, and what higher and lower pitched
sounds look like.
The basic sound editor looks like Figure 3.6. You can record sounds,
open WAV files on your disk, and view the sounds in a variety of ways. (Of
course, this assumes that you have a microphone on your computer!)
To view sounds, click the View button, then the Record button. (Hit
the Stop button to stop recording.) There are three kinds of views that
you can make of the sound.
The first is the signal view (Figure 3.7). In the signal view, you're looking
at the sound raw: each increase in air pressure results in a rise in the graph,
and each decrease in sound pressure results in a drop in the graph. Note
how rapidly the wave changes! Try some softer and louder sounds so that
you can see how the sound's look changes. You can always get back to the
signal view from another view by clicking the Signal button.
The second view is the spectrum view (Figure 3.8). The spectrum view
is a completely different perspective on the sound. In the previous section,
you read that natural sounds are often actually composed of several different
frequencies at once. The spectrum view shows these individual frequencies.
Making it Work Tip: Explore sounds!
You really should try these different views on real
sounds. You'll get a much better understanding of
sound and of what the manipulations we're doing in this
chapter are doing to the sounds.
This isn't just a theoretical result. The Nyquist theorem influences
applications in our daily life. It turns out that human voices don't typically
get over 4,000 Hz. That's why our telephone system is designed around
capturing 8,000 samples per second. That's why playing music through the
telephone doesn't really work very well. The limits of (most) human hearing
000000000000000
000000000000001
000000000000010
000000000000011
...
111111111111110
111111111111111
That looks forbidding. Let's see if we can figure out a pattern. If we've
got two bits, there are four patterns: 00, 01, 10, 11. If we've got three bits,
there are eight patterns: 000, 001, 010, 011, 100, 101, 110, 111. It turns out
that 2^2 is four, and 2^3 is eight. Play with four bits. How many patterns are
there? 2^4 = 16. It turns out that we can state this as a general principle.
2^15 = 32,768. Why is there one more value in the negative range than
the positive? Zero is neither negative nor positive, but if we want to represent
it as bits, we need to define some pattern as zero. We use one of the
positive range values (where the sign bit is zero) to represent zero, so that
takes up one of the 32,768 patterns.
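The counting argument above can be checked with a few lines of ordinary Python. This is only a sketch: the function name sampleRange is made up for illustration (it is not a JES function), and it assumes one bit is reserved for the sign, with zero taking one of the non-negative patterns.

```python
# Hypothetical helper (not part of JES): count the bit patterns and the
# signed range for a given number of bits, assuming one sign bit.
def sampleRange(bits):
    patterns = 2 ** bits               # total number of bit patterns
    lowest = -(2 ** (bits - 1))        # most negative value
    highest = 2 ** (bits - 1) - 1      # most positive value (zero uses one pattern)
    return patterns, lowest, highest

print(sampleRange(2))    # (4, -2, 1)
print(sampleRange(16))   # (65536, -32768, 32767)
```

With 16 bits, half of the 65,536 patterns (2^15 = 32,768 of them) have the sign bit set, which is where the limits -32,768 and 32,767 come from.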
The sample size is a limitation on the amplitude of the sound that can
be captured. If you have a sound that generates a pressure greater than
32,767 (or a rarefaction below -32,768), you'll only capture up to the
limits of the 16 bits. If you were to look at the wave in the signal view, it
would look like somebody took some scissors and clipped off the peaks of
the waves. We call that effect clipping for that very reason. If you play (or
generate) a sound that's clipped, it sounds bad. It sounds like your speakers
are breaking.
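To see what clipping does to the numbers, here is a small sketch in ordinary Python. The helper name clip and the sample values are invented for illustration; the limits are the 16-bit limits described above.

```python
# Assumed limits for 16-bit samples, as described in the text.
MAX_SAMPLE = 32767
MIN_SAMPLE = -32768

def clip(value):
    # Hypothetical helper: keep a value within the 16-bit limits.
    return max(MIN_SAMPLE, min(MAX_SAMPLE, value))

wave = [10000, 30000, 40000, -20000, -50000]
clipped = [clip(v) for v in wave]
print(clipped)   # [10000, 30000, 32767, -20000, -32768]
```

The two values that were too large in either direction are flattened to the limits, which is the "scissors" effect you would see in the signal view.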
There are other ways of digitizing sound, but this is by far the most
common. The technical term for it is pulse code modulation (PCM). You
may encounter that term if you read further in audio or play with audio
software.
What this means is that a sound in a computer is a long list of numbers,
each of which is a sample in time. There is an ordering in these samples: If
you played the samples out of order, you wouldn't get the same sound at all.
The most efficient way to store an ordered list of data items on a computer
is with an array. An array is literally a sequence of bytes right next to one
another in memory. We call each value in an array an element.
We can easily store the samples that make up a sound into an array.
Think of each two bytes as storing a single sample. The array will be
large: for CD-quality sounds, there will be 44,100 elements for every second
of recording. A minute-long recording will result in an array with 2,646,000
elements.
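The size of the array is simple arithmetic, sketched here in ordinary Python. The helper name arraySize is made up for illustration, and the sketch assumes a single channel of 16-bit (two-byte) samples.

```python
SAMPLES_PER_SECOND = 44100   # CD-quality sampling rate, from the text
BYTES_PER_SAMPLE = 2         # 16-bit samples take two bytes each

def arraySize(seconds):
    # Hypothetical helper: number of elements and bytes for one channel.
    elements = SAMPLES_PER_SECOND * seconds
    return elements, elements * BYTES_PER_SAMPLE

print(arraySize(1))    # (44100, 88200)
print(arraySize(60))   # (2646000, 5292000)
```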
Each array element has a number associated with it called its index. The
index numbers increase sequentially. The first one is 1, the second one is 2,
and so on. You can think about an array as a long line of boxes, each one
holding a value and each box having an index number on it (Figure 3.12).
  value:  59   39   16   10   -1  ...
  index:   1    2    3    4    5
Using the MediaTools, you can graph a sound file (Figure 3.13) and
get a sense of where the sound is quiet (small amplitudes), and loud (large
amplitudes). This is actually important if you want to manipulate the sound.
For example, the gaps between recorded words tend to be quiet, or at least
quieter than the words themselves. You can pick out where words end by
looking for these gaps, as in Figure 3.13.
You will then be shown the file in the sound editor view (Figure 3.16).
The sound editor lets you explore a sound in many ways (Figure 3.17). As
you scroll through the sound and change the sound cursor (the red/blue line
in the graph) position, the index changes to show you which sound array
element you're currently looking at, and the value shows you the value
at that index. You can also fit the whole sound into the graph to get an
overall view (but this currently breaks the index/value displays). You can even
play your recorded sound as if it were an instrument: try pressing the
piano keys across the bottom of the editor. You can also set the cursor (via
the scrollbar or by dragging in the graph window) and play the sound before
the cursor, a good way to hear what part of the sound corresponds to what
index positions. (Clicking the <> button provides a menu of options, which
includes getting an FFT view of the sound.)
1. We'll need to get a filename of a WAV file, and make a sound from it.
You already saw how to do that in a previous chapter.
2. You will often get the samples of the sound. Sample objects are easy to
manipulate, and they know that when you change them, they should
automatically change the original sound. You'll read first about
manipulating the samples to start with, then about how to manipulate
the sound samples from within the sound itself.
3. Whether you get the sample objects out of a sound, or just deal with
the samples in the sound object, you will then want to do something
to the samples.
4. You may then want to write the sound back out to a new file, to
use elsewhere. (Most sound editing programs know how to deal with
audio.)
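For readers curious how these four steps look outside JES, here is a rough sketch using the wave and struct modules from Python's standard library. It is not how JES does it: the helper names writeWav and readWav, the file names, and the 22050 sampling rate are all assumptions made for the example, and it handles only 16-bit, single-channel files.

```python
import os
import struct
import tempfile
import wave

def writeWav(path, samples, rate=22050):
    # Write a list of 16-bit values as a mono WAV file.
    out = wave.open(path, "wb")
    out.setnchannels(1)
    out.setsampwidth(2)          # 2 bytes = 16 bits per sample
    out.setframerate(rate)
    out.writeframes(struct.pack("<%dh" % len(samples), *samples))
    out.close()

def readWav(path):
    # Read a 16-bit mono WAV file back into a list of values.
    src = wave.open(path, "rb")
    frames = src.readframes(src.getnframes())
    src.close()
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))

folder = tempfile.mkdtemp()
original = os.path.join(folder, "original.wav")
changed = os.path.join(folder, "changed.wav")

writeWav(original, [59, 39, 16, 10, -1])     # stand-in for a real recording
samples = readWav(original)                  # steps 1 and 2: file to samples
samples = [value * 2 for value in samples]   # step 3: do something to them
writeWav(changed, samples)                   # step 4: write a new file
print(readWav(changed))                      # [118, 78, 32, 20, -2]
```

In JES, makeSound, getSamples, and writeSoundTo hide all of this bookkeeping, which is why the rest of the chapter uses them instead.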
>>> filename=pickAFile()
>>> print filename
/Users/guzdial/mediasources/preamble.wav
>>> sound=makeSound(filename)
>>> print sound
Sound of length 421109
You can get the samples from a sound using getSamples. The function
getSamples takes a sound as input and returns an array of all the samples
as sample objects. When you execute this function, it may take quite a while
before it finishes: longer for longer sounds, shorter for shorter sounds.
>>> samples=getSamples(sound)
>>> print samples
Samples, length 421109
3.2. MANIPULATING SOUNDS 61
What numbers can we use as index values? Anything between 1 and the
length of the sound in samples. We get that length (the maximum index
value) with getLength. Notice the error that we get below if we try to get
a sample past the end of the array.
>>> writeSndTo(sound,"mysound.wav")
A local or global name could not be
found.
It's no big deal. JES will let you copy the mistyped
command, paste it into the bottommost line of the
command area, then fix it. Be sure to put the cursor
at the end of the line before you press the Return
key.
What do you think would happen if we then played this sound? Would
it really sound different than it did before, now that we've turned the first
sample from the number 36 to the number 12? Not really. To explain why
not, let's find out what the sampling rate is for this sound, by using the
function getSamplingRate, which takes a sound as its input.
>>> writeSoundTo(sound,"new-preamble.wav")
(or iterating). We're mostly going to use the command for. A for loop
executes some commands (that you specify) for an array (that you provide),
where each time that the commands are executed, a particular variable (that
you, again, get to name) will have the value of a different element of the
array.
We are going to use the getSamples function we saw earlier to provide
our array. We will use a for loop that looks like this:
Next comes the variable name that you want to use in your code for
addressing (and manipulating) the elements of the array.
Then, you need the array. We use the function getSamples to generate
an array for us.
What comes next are the commands that you want to execute for each
sample. Each time the commands are executed, the variable (in our example
sample) will be a different element from the array. The commands (called
the body) are specified as a block. This means that they should follow the
for statement, each on their own line, and indented by two more spaces!
For example, here is the for loop that simply sets each sample to its own
value (a particularly useless exercise, but it'll get more interesting in just a
couple of pages).
The first statement says that we're going to have a for loop that will
set the variable sample to each of the elements of the array that is
output from getSamples(sound).
The next statement is indented, so it's part of the body of the for
loop, one of the statements that will get executed each time sample
has a new value. It says to name the value of the sample in the variable
sample. That name is value.
The third statement is still indented, so it's still part of the loop body.
Here we set the value of the sample to the value of the variable value.
Here's the exact same code (it would work exactly the same), but with
different variable names.
for s in getSamples(sound):
  v = getSample(s)
  setSample(s,v)
What's the difference? It's slightly easier to confuse the variable names. It is
not as obvious what s and v are naming as it is with sample and value. Python doesn't
care which we use, and the single character variable names are clearly easier
to type. But the longer variable names make it easier to understand your
code later.
Notice that the earlier paragraph had emphasized "by two more spaces."
Remember that what comes after a function definition def statement is also
a block. If you have a for loop inside a function, then the for statement is
indented two spaces already, so the body of the for loop (the statements to
be executed) must be indented four spaces. The for loop's block is inside the
function's block. That's called a nested block: one block is nested inside the
other. Here's an example of turning our useless loop into an equally useless
function:
def doNothing():
  for sample in getSamples(sound):
    value = getSample(sample)
    setSample(sample,value)
You don't actually have to put loops into functions to use them. You can
type them into the command area of JES. JES is smart enough to figure out
that you need to type more than one command if you're specifying a loop, so
it changes the prompt from >>> to .... Of course, it can't figure out when
you're done, so you'll have to just hit return without typing anything else
to tell JES that you're done with the body of your loop. It looks something
like this:
... setSample(sample,value)
You probably realize that we don't really need the variable value (or v).
We can combine the two statements in the body into one. Here's doing it
at the command line:
... setSample(sample,getSample(sample))
Earlier, we said that the amplitude of a sound is the main factor in the
volume. This means that if we increase the amplitude, we increase the
volume. Or if we decrease the amplitude, we decrease the volume.
Don't get confused here: changing the amplitude doesn't reach out
and twist up the volume knob on your speakers. If your speaker's volume
(or computer's volume) is turned down, the sound will never get very loud.
The point is getting the sound itself louder. Have you ever watched a movie
on TV where, without changing the volume on the TV, sound becomes so
low that you can hardly hear it? (Marlon Brando's dialogue in the movie
The Godfather comes to mind.) That's what we're doing here. We can make
sounds shout or whisper by tweaking the amplitude.
Increasing volume
Here's a function that doubles the amplitude of an input sound.
def increaseVolume(sound):
  for sample in getSamples(sound):
    value = getSample(sample)
    setSample(sample,value * 2)
End of Recipe 9
Go ahead and type the above into your JES Program Area. Click Load
to get Python to process the function and make the name increaseVolume
stand for this function. Follow along the example below to get a better idea
of how this all works.
To use this recipe, you have to create a sound first, then pass it in as
input. In the example below, we get the filename by setting the variable
f explicitly to a string that is a filename, as opposed to using pickAFile.
Don't forget that you can't type this code in and have it work as-is: Your
path names will be different than mine!
>>> f="/Users/guzdial/mediasources/gettysburg10.wav"
>>> s=makeSound(f)
>>> increaseVolume(s)
>>> play(s)
>>> writeSoundTo(s,"/Users/guzdial/mediasources/louder-g10.wav")
Now, is it really louder, or does it just seem that way? We can check it in
several ways. You could always make the sound even louder by evaluating
increaseVolume on our sound a few more times. Eventually, you'll be
totally convinced that the sound is louder. But there are ways to test even
more subtle effects.
If we compared graphs of the two sounds, you'd find that the sound
in the new file (louder-g10.wav in our example) has much bigger
amplitude than the sound in the original file (gettysburg10.wav, which is in the
mediasources directory on your CD). Check it out in Figure 3.18.
Figure 3.18: Comparing the graphs of the original sound (top) and the
louder one (bottom)
Maybe you're unsure that you're really seeing a larger wave in the second
picture. (It's particularly hard to see in the MediaTools, which automatically
scales the graphs so that they look the same. I had to use another sound
tool to generate the pictures in Figure 3.18.) You can use the MediaTools
to check the individual sample values. Open up both WAV files, and open
the sound editor for each. Scroll down into the middle of the sound, then
drag the cursor to any value you want. Now, do the same to the second one.
You'll see that the louder sound (bottom one in Figure 3.19) really does
have double the value of the same sample in the original sound.
Figure 3.19: Comparing specific samples in the original sound (top) and the
louder one (bottom)
Finally, you can always check for yourself from within JES. If you've
been following along with the example[2], then the variable s is the now
louder sound. f should still be the filename of the original sound. Go ahead
and make a new sound object which is the original sound, named
below as soriginal (for sound original). Check any sample value that you
want: it's always true that the louder sound has twice the sample values of
the original sound.
>>> print s
Sound of length 220567
>>> print f
/Users/guzdial/mediasources/gettysburg10.wav
>>> soriginal=makeSound(f)
>>> print getSampleValueAt(s,1)
118
>>> print getSampleValueAt(soriginal,1)
59
>>> print getSampleValueAt(s,2)
78
>>> print getSampleValueAt(soriginal,2)
39
>>> print getSampleValueAt(s,1000)
-80
>>> print getSampleValueAt(soriginal,1000)
-40
[2] What? You haven't? You should! It'll make much more sense if you try it yourself!
That last one is particularly telling. Even negative values become more
negative. That's what's meant by increasing the amplitude. The
amplitude of the wave goes in both directions. We have to make the wave larger
in both the positive and negative dimensions.
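The same check can be written as a short test in ordinary Python. This is only a sketch: the function doubled is a made-up stand-in that works on a plain list of numbers rather than on a JES sound, and the values are the sample values quoted in this chapter.

```python
def doubled(values):
    # Plain-list stand-in for increaseVolume: returns a new list.
    return [value * 2 for value in values]

original = [59, 39, 16, 10, -1, -40]
louder = doubled(original)
print(louder)   # [118, 78, 32, 20, -2, -80]

# Every sample, positive or negative, should be exactly twice the original.
for before, after in zip(original, louder):
    assert after == 2 * before
```

If any sample failed the check, Python would stop with an AssertionError, which is a quick way of finding out that a recipe is not doing what you intended.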
It's important to do what you just read in this chapter: Doubt your
programs. Did that really do what I wanted it to do? The way you check is
by testing. That's what this section is about. You just saw several ways to
test:
def increaseVolume(sound):
  for sample in getSamples(sound):
    value = getSample(sample)
    setSample(sample,value * 2)
Recall our picture of how the samples in a sound array might look.
  value:  59   39   16   10   -1  ...
  index:   1    2    3    4    5
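When the for loop begins, sample will be the name for the first sample.
  getSamples(sound), with sample naming element 1:
  value:  59   39   16   10   -1  ...
  index:   1    2    3    4    5
  getSamples(sound), with sample still naming element 1, after its value is doubled:
  value: 118   39   16   10   -1  ...
  index:   1    2    3    4    5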
That's the end of the first pass through the body of the for loop. Python
will then start the loop over and move sample on to point at the next element
in the array.
  getSamples(sound), with sample naming element 2:
  value: 118   39   16   10   -1  ...
  index:   1    2    3    4    5
Again, the value is set to the value of the sample, then the sample will
be doubled.
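  getSamples(sound), with sample naming element 2, after its value is doubled:
  value: 118   78   16   10   -1  ...
  index:   1    2    3    4    5
  getSamples(sound), after the first five samples are doubled:
  value: 118   78   32   20   -2  ...
  index:   1    2    3    4    5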
But really, the for loop keeps going through all the samples, tens of
thousands of them! Thank goodness it's the computer executing this recipe!
One way to think about what's happening here is that the for loop
doesn't really do anything, in the sense of changing anything in the sound.
Only the body of the loop does work. The for loop tells the computer what
to do. It's a manager. What the computer actually does is something like
this:
sample = sample #1
value = value of the sample, 59
change sample to 118
sample = sample #2
value = 39
change sample to 78
sample = sample #3
...
sample = sample #5
value = -1
change sample to -2
...
The for loop is only saying, "Do all of this for every element in the
array." It's the body of the loop that contains the Python commands that
get executed.
What you have just read in this section is called tracing the program.
We slowly went through how each step in the program was executed. We
drew pictures to describe the data in the program. We used numbers,
arrows, equations, and even plain English to explain what was going on in the
program. This is the single most important technique in programming. It's
part of debugging. Your program will not always work. Absolutely,
guaranteed, without a shadow of a doubt, you will write code that does not do
what you want. But the computer will do SOMETHING. How do you figure
out what it is doing? You debug, and the most significant way to do that
is by tracing the program.
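One way to let the computer help with tracing is to print what happens on every pass through the loop. The sketch below does this in ordinary Python on a plain list of numbers rather than a JES sound; the function name increaseVolumeTraced is made up for illustration.

```python
def increaseVolumeTraced(samples):
    # Same idea as increaseVolume, on a plain list, with a print per step.
    for index in range(len(samples)):
        value = samples[index]
        samples[index] = value * 2
        # Report 1-based positions to match the pictures in the text.
        print("sample #%d: value = %d, changed to %d"
              % (index + 1, value, samples[index]))
    return samples

result = increaseVolumeTraced([59, 39, 16, 10, -1])
```

Each printed line corresponds to one pass through the body of the loop, in the same order as the hand trace above.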
Decreasing volume
Decreasing volume, then, is the reverse of the previous process.
def decreaseVolume(sound):
  for sample in getSamples(sound):
    value = getSample(sample)
    setSample(sample,value * 0.5)
End of Recipe 10
>>> f=pickAFile()
>>> print f
/Users/guzdial/mediasources/louder-g10.wav
>>> sound=makeSound(f)
>>> print sound
Sound of length 220568
>>> play(sound)
>>> decreaseVolume(sound)
>>> play(sound)
>>> decreaseVolume(sound)
>>> play(sound)
Normalizing sounds
If you think about it some, it seems strange that the last two recipes work!
We can just multiply these numbers representing a sound, and the sound
seems (essentially) the same to our ears? The way we experience a sound
depends less on the specific numbers than on the relationship between them.
Remember that the overall shape of the sound waveform is dependent on
many samples. In general, if we multiply all the samples by the same
multiplier, we only affect our sense of volume (intensity), not the sound itself.
(We'll work to change the sound itself in future sections.)
A common operation that people want to do with sounds is to make them
as LOUD AS POSSIBLE. That's called normalizing. It's not really hard
to do, though it takes more lines of Python code than we've used previously
and a few more variables. Here's the recipe, in English,
that we need to tell the computer to do.
We have to figure out what the largest sample in the sound is. If it's
already at the maximum value (32767), then we can't really increase
the volume and still get what seems like the same sound. Remember
that we have to multiply all the samples by the same multiplier.
It's an easy recipe (algorithm) to find the largest value, sort of a
sub-recipe within the overall normalizing recipe. Define a name (say,
largest) and assign it a small value (0 works). Now, check all the
samples. If you find a sample larger than the largest, make that
larger value the new meaning for largest. Keep checking the samples,
now comparing to the new largest. Eventually, the very largest value
in the array will be in the variable largest.
Next, we need to figure out what value to multiply all the samples by.
We want the largest value to become 32767. Thus, we want to figure
out a multiplier such that
(multiplier)(largest) = 32767.
Solve for the multiplier:
multiplier = 32767/largest. The multiplier will need to be a floating
point number (have a decimal component), so we need to convince
Python that not everything here is an integer. Turns out that that's
easy: use 32767.0.
def normalize(sound):
  largest = 0
  for s in getSamples(sound):
    largest = max(largest,getSample(s) )
  multiplier = 32767.0 / largest

  print "Largest sample in original sound was", largest
  print "Multiplier is", multiplier

  for s in getSamples(sound):
    louder = multiplier * getSample(s)
    setSample(s,louder)
End of Recipe 11
There are blank lines in there! Sure, Python doesn't care about those.
Adding blank lines can be useful to break up and improve the
understandability of longer programs.
Some of the statements in this recipe are pretty long, so they wrap
around in the text. Type them as a single line! Python doesn't let you
hit return until the end of the statement, so make sure that your print
statements are all on one line.
>>> normalize(sound)
Largest sample in original sound was 5656
Multiplier is 5.7933168316831685
>>> play(sound)
Exciting, huh? Obviously, the interesting part is hearing the much louder
volume, which is awfully hard to do in a book.
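Here is the same idea sketched in ordinary Python on a plain list of numbers. It is a variation, not the recipe itself: the name normalizeList is made up, it compares absolute values so that a large negative peak also counts, it leaves an all-zero (silent) list alone instead of dividing by zero, and it rounds the results to whole numbers.

```python
MAX_SAMPLE = 32767

def normalizeList(samples):
    # Find the largest magnitude, whether the peak is positive or negative.
    largest = 0
    for value in samples:
        largest = max(largest, abs(value))
    if largest == 0:
        return list(samples)          # silent sound: nothing to scale
    multiplier = float(MAX_SAMPLE) / largest
    # Scale every sample by the same multiplier and round to an integer.
    return [int(round(multiplier * value)) for value in samples]

print(normalizeList([100, -5656, 1000]))   # [579, -32767, 5793]
```

With a peak magnitude of 5656 the multiplier is about 5.79, the same multiplier that appears in the JES session above.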
Recall that each sample has a number, and that we can get each
individual sample with getSampleValueAt (with a sound and an index number
as input). We can set any sample with setSampleValueAt (with inputs of a
sound, an index number, and a new value). That's how we can manipulate
samples without using getSamples and sample objects. But we still don't
want to have to write code like:
setSampleValueAt(sound,1,12)
setSampleValueAt(sound,2,28)
...
You might be wondering what this square bracket stuff is, e.g., [1,2] in
the first example above. That's the notation for an array: it's how Python
prints out a series of numbers to show that this is an array[3]. If we use range
to generate the array for the for loop, our variable will walk through each
of the sequential numbers we generate.
[3] Technically, range returns a sequence, which is a somewhat different ordered collection
of data from an array. But for our purposes, we'll call it an array.
It turns out that range can also take three inputs. If a third input is
provided, it's an increment: the amount to step between generated integers.
Recipe 12: Increase an input sound's volume by doubling the amplitude, using range
def increaseVolumeByRange(sound):
  for sampleIndex in range(1,getLength(sound)+1):
    value = getSampleValueAt(sound,sampleIndex)
    setSampleValueAt(sound,sampleIndex,value * 2)
End of Recipe 12
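Why does the recipe ask for getLength(sound)+1? Because range stops just before its second input. The sketch below shows this in ordinary Python, pretending that the sound has only five samples. (In newer versions of Python, range has to be wrapped in list before it prints as a list of numbers; the length 5 is just an assumption for the example.)

```python
length = 5                                # pretend getLength(sound) returned 5

print(list(range(1, length)))             # [1, 2, 3, 4]  misses the last sample
print(list(range(1, length + 1)))         # [1, 2, 3, 4, 5]
print(list(range(1, length + 1, 2)))      # [1, 3, 5]  third input is the increment
```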
Recipe 13: Increase the volume in the first half of the sound, and decrease in the second
def increaseAndDecrease(sound):
  for sampleIndex in range(1,getLength(sound)/2):
    value = getSampleValueAt(sound,sampleIndex)
    setSampleValueAt(sound,sampleIndex,value * 2)
  for sampleIndex in range(getLength(sound)/2,getLength(sound)+1):
    value = getSampleValueAt(sound,sampleIndex)
    setSampleValueAt(sound,sampleIndex,value * 0.2)
End of Recipe 13
To demonstrate it in ways that you can trust the result (because you
don't really know what's in the sound in the above examples), let's use
range to make an array, then reference it the same way.
1
>>> print myArray[0]
0
>>> print myArray[35]
35
>>> mySecondArray = range(0,100,2)
>>> print mySecondArray[35]
70
Scroll and move the cursor (by dragging in the graph) until you think
that the cursor is before or after a sound of interest.
Check your positioning by playing the sound before and after the
cursor, using the buttons in the sound editor.
Using exactly this process, I found the ending points of the first few
words in preamble10.wav. (I figure that the first word starts at the index
1, though that might not always be true for every sound.)
targetIndex = Where-the-incoming-sound-should-start
for sourceIndex in range(startingPoint,endingPoint):
  setSampleValueAt( target, targetIndex, getSampleValueAt( source, sourceIndex))
  targetIndex = targetIndex + 1
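The copying pattern can be tried out on ordinary Python lists before using it on sounds. This is only a sketch: the function name copyInto and the numbers are made up for illustration, and plain lists count from 0 rather than from 1 as the sound functions in this chapter do.

```python
def copyInto(source, target, startingPoint, endingPoint, targetIndex):
    # Hypothetical plain-list version of the copying loop.
    for sourceIndex in range(startingPoint, endingPoint):
        target[targetIndex] = source[sourceIndex]
        targetIndex = targetIndex + 1
    return target

source = [10, 20, 30, 40, 50, 60]
target = [0, 0, 0, 0, 0, 0]
print(copyInto(source, target, 2, 5, 1))   # [0, 30, 40, 50, 0, 0]
```

Notice that two indices are moving at once: sourceIndex is moved by the for loop, while targetIndex has to be moved by hand at the end of the body.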
Below is the recipe that changes the preamble from "We the people of
the United States" to "We the UNITED people of the United States."
Be sure to change the file variable before trying this on your computer.
# Splicing
# Using the preamble sound, make "We the united people"
def splicePreamble():
  file = "/Users/guzdial/mediasources/preamble10.wav"
  source = makeSound(file)
  target = makeSound(file) # This will be the newly spliced sound
End of Recipe 14
>>> newSound=splicePreamble()
There's a lot going on in this recipe! Let's walk through it, slowly.
Notice that there are lots of lines with # in them. The hash
character signifies that what comes after that character on the line is a note to
the programmer and should be ignored by Python! It's called a comment.
Comments are great ways to explain what you're doing to others, and to
yourself! The reality is that it's hard to remember all the details of a
program, so it's often very useful to leave notes about what you did if you'll
ever play with the program again.
The function splicePreamble takes no parameters. Sure, it would be great to
write a single function that can do any kind of splicing we want, in the same
way as we've done generalized increasing volume and normalization. But
how could you? How do you generalize all the start and end points? It's
easier, at least to start, to create single recipes that handle specific splicing
tasks.
We see here three of those copying loops like we set up earlier. Actually,
there are only two. The first one copies the word "United" into place. The
second one copies the word "people" into place. "But wait," you might be
thinking. "The word 'people' was already in the sound!" That's true, but
when we copy "United" in, we overwrite some of the word "people," so we
copy it in again.
Here's the simpler form. Try it and listen to the result:
def spliceSimpler():
  file = "/Users/guzdial/mediasources/preamble10.wav"
  source = makeSound(file)
  target = makeSound(file) # This will be the newly spliced sound
  targetIndex=17408 # targetIndex starts at just after "We the" in the new sound
  for sourceIndex in range( 33414, 40052): # Where the word "United" is in the sound
    setSampleValueAt( target, targetIndex, getSampleValueAt( source, sourceIndex))
    targetIndex = targetIndex + 1
  play(target) #Let's hear and return the result
  return target
Let's see if we can figure out what's going on mathematically. Recall the
table back on page 82. We're going to start inserting samples at sample index
17408. The word "United" has (40052 - 33414) = 6638 samples. (Exercise for
the reader: How long is that in seconds?) That means that we'll be writing
into the target from 17408 to (17408 + 6638) = sample index 24046. We know
from the table that the word "people" ends at index 26726. If the word
"people" is more than (26726 - 24046) = 2,680 samples, then it will start
earlier than 24046, and our insertion of "United" is going to trample on
part of it. If the word "United" is over 6000 samples, I doubt that the word
"people" is less than 2000. That's why it sounds crunched. Why does it
work with where the "of" is? The speaker must have paused in there. If you
check the table again, you'll see that the word "of" ends at sample index
32131 and the word before it ends at 26726. The word "of" takes fewer than
(32131 - 26726) = 5405 samples, which is why the original recipe works.
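The arithmetic in this paragraph is easy to get wrong by hand, so it is worth letting Python do it. The sketch below uses only the index values quoted in the text; the variable names are made up for illustration.

```python
targetStart = 17408                      # just after "We the"
unitedStart, unitedEnd = 33414, 40052    # where "United" is in the source
peopleEnd = 26726                        # where "people" ends
ofEnd = 32131                            # where "of" ends

unitedLength = unitedEnd - unitedStart            # samples in "United"
lastWritten = targetStart + unitedLength          # where the insertion stops
roomBeforePeopleEnds = peopleEnd - lastWritten    # what is left of "people"
roomForOf = ofEnd - peopleEnd                     # upper bound on "of"

print(unitedLength, lastWritten, roomBeforePeopleEnds, roomForOf)
```

The four numbers printed are 6638, 24046, 2680, and 5405, matching the values worked out above.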
The third loop in the original Recipe 14 (page 83) looks like the same
kind of copy loop, but it's really only putting in a few 0's. As you might
have already guessed, samples with the value 0 are silent. Putting a few in creates
a pause that sounds better. (There's an exercise which suggests pulling it
out and seeing what you hear.)
Finally, at the very end of the recipe, there's a new statement we haven't
seen yet: return. We've now seen many functions in Python that return
values. This is how one does it. It's important for splicePreamble to return the
newly spliced sound. Because of the scope of the function splicePreamble, if the
new sound wasn't returned, it would simply disappear when the function
ended. By returning it, it's possible to give it a name and play it (and even
further manipulate it) after the function stops executing.
Figure 3.20 shows the original preamble10.wav file in the top sound
editor, and the new spliced one (saved with writeSoundTo) on the bottom.
The lines are drawn so that the spliced section lies between them, while the
rest of the sounds are identical.
Figure 3.20: Comparing the original sound (top) to the spliced sound (bottom)
def backwards(filename):
  source = makeSound(filename)
  target = makeSound(filename)
  sourceIndex = getLength(source)
  for targetIndex in range(1,getLength(target)+1):
    sourceValue = getSampleValueAt(source,sourceIndex)
    setSampleValueAt(target,targetIndex,sourceValue)
    sourceIndex = sourceIndex - 1
  return target
End of Recipe 15
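Here's a minimal sketch of the same backwards copy outside of JES, in case you want to poke at the idea without a sound file. It assumes the samples are just a plain Python 3 list (so indices start at 0, not 1), and the name backwards_samples is made up for illustration; it isn't a JES function.

```python
def backwards_samples(samples):
    """Return a new list holding the samples in reverse order."""
    target = [0] * len(samples)
    source_index = len(samples) - 1      # start at the END of the source
    for target_index in range(len(samples)):
        target[target_index] = samples[source_index]
        source_index = source_index - 1  # walk the source backwards
    return target


print(backwards_samples([10, 20, 30, 40]))
```

The target index walks forward while the source index walks backward, which is the whole trick.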
This recipe uses another variant of the array element copying sub-recipe
that we've seen previously.
The recipe starts the sourceIndex at the end of the array, rather than
the front.
The targetIndex moves from 1 to the length, during which time the
recipe:
def double(filename):
  source = makeSound(filename)
  target = makeSound(filename)
  targetIndex = 1
  # Copy every other sample from the source into the target
  for sourceIndex in range(1, getLength(source)+1, 2):
    setSampleValueAt(target, targetIndex, getSampleValueAt(source, sourceIndex))
    targetIndex = targetIndex + 1
  #Clear out the rest of the target sound -- it's only half full!
  for secondHalf in range(getLength(target)/2, getLength(target)):
    setSampleValueAt(target,targetIndex,0)
    targetIndex = targetIndex + 1
  play(target)
  return target
End of Recipe 16
This recipe looks like it's using the array-copying sub-recipe we saw earlier,
but notice that the range uses the third parameter: we're incrementing
by two. If we increment by two, we only fill half the samples in the target,
so the second loop just fills the rest with zeroes.
Try it![4] You'll see that the sound really does double in frequency!
How did that happen? It's not really all that complicated. Think of it
this way. The frequency of the basic file is really the number of cycles that
pass by in a certain amount of time. If you skip every other sample, the new
sound has just as many cycles, but has them in half the amount of time!
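To see the "skip every other sample" idea on something small, here's a hedged sketch that works on a plain Python 3 list instead of a JES sound. The function name double_frequency and the 0-based indexing are my assumptions for illustration only.

```python
def double_frequency(samples):
    """Keep every other sample, then leave the rest of the target silent (0)."""
    target = [0] * len(samples)          # same length as the source, all zeroes
    target_index = 0
    for source_index in range(0, len(samples), 2):   # third input to range: step by 2
        target[target_index] = samples[source_index]
        target_index = target_index + 1
    return target


print(double_frequency([1, 2, 3, 4, 5, 6]))
```

The same cycles are squeezed into the first half of the list, and the second half stays at zero.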
Now let's try the other way: Let's take every sample twice! What happens
then?
To do this, we need to learn a new Python function: int. int returns
the integer portion of the input.
[4] You are now trying this out as you read, aren't you?
Here's the recipe that halves the frequency. We're using the array-copying
sub-recipe again, but we're sort of reversing it. The for loop moves
the targetIndex along the length of the sound. The sourceIndex is now
being incremented, but only by 0.5! The effect is that we'll take every sample
in the source twice. The sourceIndex will be 1, 1.5, 2, 2.5, and so on,
but because we're using the int of that value, we'll take samples 1, 1, 2, 2,
and so on.
def half(filename):
  source = makeSound(filename)
  target = makeSound(filename)
  sourceIndex = 1
  for targetIndex in range(1, getLength(target)+1):
    setSampleValueAt(target, targetIndex, getSampleValueAt(source, int(sourceIndex)))
    sourceIndex = sourceIndex + 0.5
  play(target)
  return target
End of Recipe 17
Think about what we're doing here. Imagine that the number 0.5 above
were actually 0.75. Or 2. Or 3. Would this work? The for loop would have
to change, but essentially the idea is the same in all these cases. We are
sampling the source data to create the target data. Using a sample index
increment of 0.5 slows down the sound and halves the frequency. A sample
index increment larger than one speeds up the sound and increases the frequency.
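If you want to see which source samples actually get picked for a given increment, here's a small sketch in plain Python 3. It only computes indices (1-based, like the recipes); the name visited_indices is hypothetical and nothing here touches a real sound.

```python
def visited_indices(increment, count):
    """Source indices that int() picks for the first `count` target samples."""
    indices = []
    source_index = 1.0                   # 1-based, matching the recipes
    for _ in range(count):
        indices.append(int(source_index))
        source_index = source_index + increment
    return indices


print(visited_indices(0.5, 6))    # every source sample is taken twice
print(visited_indices(0.75, 5))   # some samples repeat, some don't
print(visited_indices(2, 4))      # every other sample is skipped
```

An increment below one repeats samples (lower frequency), and an increment above one skips samples (higher frequency).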
Let's try to generalize this sampling with the below recipe. (Note that
this one won't work right!)
def shift(filename,factor):
  source = makeSound(filename)
  target = makeSound(filename)
  sourceIndex = 1
  for targetIndex in range(1, getLength(target)+1):
    setSampleValueAt(target, targetIndex, getSampleValueAt(source, int(sourceIndex)))
    sourceIndex = sourceIndex + factor
  play(target)
  return target
End of Recipe 18
>>> hello=pickAFile()
>>> print hello
/Users/guzdial/mediasources/hello.wav
>>> lowerhello=shift(hello,0.75)
That will work really well! But what if the factor for sampling is MORE
than 1.0?
>>> higherhello=shift(hello,1.5)
I wasn't able to do what you wanted.
The error java.lang.ArrayIndexOutOfBoundsException has occured
Please check line 7 of /Users/guzdial/shift-broken.py
Why? What's happening? Here's how you could see it: Print out
the sourceIndex just before the setSampleValueAt. You'd see that the
sourceIndex becomes larger than the source sound! Of course, that makes
def shift(filename,factor):
  source = makeSound(filename)
  target = makeSound(filename)
  sourceIndex = 1
  for targetIndex in range(1, getLength(target)+1):
    setSampleValueAt(target, targetIndex, getSampleValueAt(source, int(sourceIndex)))
    sourceIndex = sourceIndex + factor
    if sourceIndex > getLength(source):
      sourceIndex = 1
  play(target)
  return target
End of Recipe 19
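Here's a sketch of the same wrap-around idea on a plain Python 3 list, so you can watch what happens with a factor bigger than one. The function name shift_samples and the 0-based indices are assumptions for illustration; the reset to the start mirrors the if block in the recipe above.

```python
def shift_samples(samples, factor):
    """Resample a list by `factor`, wrapping back to the start when we run off the end."""
    target = [0] * len(samples)
    source_index = 0.0
    for target_index in range(len(samples)):
        target[target_index] = samples[int(source_index)]
        source_index = source_index + factor
        if source_index >= len(samples):   # ran past the end of the source
            source_index = 0.0             # start over at the beginning
    return target


print(shift_samples([1, 2, 3, 4], 0.5))
print(shift_samples([1, 2, 3, 4], 2))
```

With a factor of 2 the short "sound" plays through twice, which is a hint for one of the exercises below.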
Exercises
Exercise 13: Open up the Sonogram view and say some vowel sounds.
Is there a distinctive pattern? Do "Oh's" always sound the same? Do
"Ah's"? Does it matter if you switch speakers: are the patterns the same?
Exercise 14: Get a couple of different instruments and play the same note
on them into MediaTools' sound editor with the sonogram view open. Are
all "C's" made equal? Can you see some of why one sound is different than
another?
Exercise 15: Try out a variety of WAV files as instruments, using the
piano keyboard in the MediaTools sound editor. What kinds of recordings
work best as instruments?
Exercise 16: Recipe 9 (page 68) takes a sound as input. Write a function
increaseVolumeNamed that takes a file name as input and then plays the louder
sound.
Exercise 17: Rewrite Recipe 9 (page 68) so that it takes two inputs: The
sound to increase in volume, and a filename where the newly louder sound
should be stored. Then, increase the volume, and write the sound out to the
named file. Also, try doing it taking an input filename instead of the sound,
so that inputs are both filenames.
Exercise 18: Rewrite Recipe 9 (page 68) so that it takes two inputs: A
sound to increase in volume, and a multiplier. Use the multiplier as how
much to increase the amplitude of the sound samples. Can we use this same
function to both increase and decrease the volume? Demonstrate commands
that you would execute to do each.
Exercise 19: In section 3.2.3, we walked through how Recipe 9 (page 68)
worked. Draw the pictures to show how Recipe 10 (page 74) works, in the
same way.
Exercise 20: What happens if you increase a volume too far? Do it
once, and again, and again. Does it always keep getting louder? Or does
something else happen? Can you explain why?
Exercise 21: Try sprinkling in some specific values into your sounds.
What happens if you put a few hundred 32767 samples into the middle of a
sound? Or a few hundred -32768? Or a bunch of zeroes? What happens to
the sound?
Exercise 22: In Recipe 12 (page 79), we add one to getLength(sound)
in the range function. Why'd we do that?
Exercise 23: Rewrite Recipe 13 (page 79) so that two input values are
provided to the function: The sound, and a percentage of how far into the
sound to go before dropping the volume.
Exercise 24: Rewrite Recipe 13 (page 79) so that you normalize the first
second of a sound, then slowly decrease the sound in steps of 1/5 for each
following second. (How many samples are in a second? getSamplingRate
is the number of samples per second for the given sound.)
Exercise 25: Try rewriting Recipe 13 (page 79) so that you have a linear
increase in volume to halfway through the sound, then linearly decrease the
volume down to zero in the second half.
Exercise 26: What happens if you take out the bit of silence added in
to the target sound in Recipe 14 (page 83)? Try it out. Can you hear any
difference?
Exercise 27: I think that if we're going to say "We the UNITED people"
in Recipe 14 (page 83), the "UNITED" should be really emphasized: really
loud. Change the recipe so that the word "united" is louder in the phrase
"united people."
Exercise 28: How long is a sound compared to the original when it's been
doubled by Recipe 16 (page 87)?
Exercise 29: Hip-hop DJs move turntables so that sections of sound are
moved forwards and backwards quickly. Try combining Recipe 15 (page 86)
and Recipe 16 (page 87) to get the same effect. Play a second of a sound
quickly forward, then quickly backward, two or three times. (You might
have to move faster than just double the speed.)
Exercise 30: Try using a stopwatch to time the execution of the recipes
in this chapter. Time from hitting return on the command, until the next
prompt appears. What is the relationship between execution time and the
length of the sound? Is it a linear relationship, i.e., longer sounds take longer
to process and shorter sounds take less time to process? Or is it something
else? Compare the individual recipes. Does normalizing a sound take longer
than raising (or lowering) the amplitude a constant amount? How much
longer? Does it matter if the sound is longer or shorter?
Exercise 31: Consider changing the if block in Recipe 19 (page 91) to
sourceIndex = sourceIndex - getLength(source). What's the difference
from just setting the sourceIndex to 1? Is this better or worse? Why?
Exercise 32: If you use Recipe 19 (page 91) with a factor of 2.0 or 3.0,
you'll get the sound repeated or even triplicated. Why? Can you fix it?
Write shiftDur that takes a number of samples (or even seconds) to play
the sound.
Exercise 33: Change the shift function in Recipe 19 (page 91) to shiftFreq
which takes a frequency instead of a factor, then plays the given sound at
the desired frequency.
To Dig Deeper
There are many wonderful books on psychoacoustics and computer music.
One of my favorites for understandability is Computer Music: Synthesis,
Composition, and Performance by Dodge and Jerse [Dodge and Jerse, 1997].
The bible of computer music is Curtis Roads' massive The Computer Music
Tutorial [Roads, 1996].
When you are using MediaTools, you are actually using a programming
language called Squeak, developed initially and primarily by Alan Kay, Dan
Ingalls, Ted Kaehler, John Maloney, and Scott Wallace [Ingalls et al., 1997].
Squeak is now open-source[5], and is an excellent cross-platform multimedia
tool. There is a good book introducing Squeak, including the sound capabilities
[Guzdial, 2001], and another book on Squeak [Guzdial and Rose, 2001]
that includes a chapter on Siren, an off-shoot of Squeak by Stephen Pope
especially designed for computer music exploration and composition.
[5] http://www.squeak.org
Chapter 4
Creating Sounds
>>> setMediaFolder()
New media folder: /Users/guzdial/mediasources/
def echo(delay):
  f = pickAFile()
  s1 = makeSound(f)
  s2 = makeSound(f)
  for p in range(delay+1, getLength(s1)):
    # set delay to original value + delayed value * .6
    setSampleValueAt(s1, p, getSampleValueAt(s1,p) + .6*getSampleValueAt(s2, p-delay))
  play(s1)
End of Recipe 20
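Here's a minimal sketch of the echo idea on a plain Python 3 list, in case you'd like to see the arithmetic without a sound file. The name echo_samples, the strength parameter, and the 0-based indices are my assumptions for illustration; the 0.6 default comes from the recipe above.

```python
def echo_samples(samples, delay, strength=0.6):
    """Add a quieter copy of the sound back onto itself, `delay` samples later."""
    original = list(samples)     # untouched copy, like s2 in the recipe
    result = list(samples)       # the copy we change, like s1 in the recipe
    for p in range(delay, len(samples)):
        # this sample = its own value + a fraction of the sample `delay` positions earlier
        result[p] = original[p] + strength * original[p - delay]
    return result


print(echo_samples([100, 0, 0, 0], 2))
```

A single loud sample at the start shows up again two positions later at roughly 0.6 of its original value.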
def echoes(delay,echoes):
  f = pickAFile()
  s1 = makeSound(f)
  s2 = makeSound(f)
  endCurrentSound = getLength(s1)
  newLength = endCurrentSound+(echoes * delay) # get ultimate length of sound
  echoAmplitude = 1
  for echoCount in range(1, echoes+1): # for each echo
    # decrement amplitude to .6 of current volume
    echoAmplitude = echoAmplitude * 0.6
    # loop through the entire sound
    for e in range(1,endCurrentSound+1):
      # increment position by one
      position = e+delay*echoCount
      # Set this sample's value to the original value plus the amplitude * the original sample value
      setSampleValueAt(s1,position, getSampleValueAt(s1, position) + echoAmplitude * getSampleValueAt(s2, position-(delay*echoCount)))
  play(s1)
End of Recipe 21
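Here's a hedged sketch of multiple echoes on a plain Python 3 list. One assumption to flag loudly: this sketch pads the result with delay × count zeroes first, so the later echoes have somewhere to land. The name echoes_samples and its parameters are made up for illustration.

```python
def echoes_samples(samples, delay, count, strength=0.6):
    """Layer `count` echoes, each `delay` samples later and quieter than the last."""
    # Assumption: make room for the echoes that run past the original ending
    result = list(samples) + [0] * (delay * count)
    amplitude = 1.0
    for echo_number in range(1, count + 1):
        amplitude = amplitude * strength          # each echo is quieter
        offset = delay * echo_number              # and starts later
        for i, value in enumerate(samples):
            result[offset + i] += amplitude * value
    return result


print(echoes_samples([100], 1, 2))
```

Each echo is 0.6 times as loud as the one before, so the second echo is at 0.36 of the original.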
Now, we want to create a sound at a given frequency, say 440 Hz. This
means that we have to fit an entire cycle like the above into 1/440 of a
second. (440 cycles per second means that each cycle fits into 1/440 second,
or 0.00227 seconds.) I made the above picture using 20 values. Call it 20
samples. How many samples do I have to chop up the 440 Hz cycle into?
That's the same question as: How many samples must go by in 0.00227
Multiply the result by the desired amplitude, and put that in the
sampleIndex.
To build sounds, there are some silent sounds in the media sources. Our
sine wave generator will use one second of silence to build a sine wave of one
second. We'll provide an amplitude as input: that will be the maximum
amplitude of the sound. (Since sine generates between -1 and 1, the range
of amplitudes will be between -amplitude and amplitude.)
Common Bug: Set the media folder first!
If you're going to use code that uses getMediaPath,
you'll need to execute setMediaFolder first.
def sineWave(freq,amplitude):
  buildSin = makeSound(mySound)
  return (buildSin)
End of Recipe 22
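Here's a self-contained sketch of a sine wave generator in plain Python 3, building a list of samples instead of a JES sound. The sampling rate of 22050 samples per second is an assumption on my part, as are the name sine_wave and its parameters. At that assumed rate, one 440 Hz cycle takes about 22050/440, or roughly 50, samples.

```python
import math


def sine_wave(freq, amplitude, sampling_rate=22050, seconds=1.0):
    """Build a list of samples for a sine wave at `freq` Hz."""
    samples_per_cycle = sampling_rate / freq      # floating point on purpose
    total_samples = int(sampling_rate * seconds)
    samples = []
    for pos in range(total_samples):
        # how far around the cycle (0 to 2*pi) this sample is
        raw = math.sin((pos / samples_per_cycle) * 2 * math.pi)
        samples.append(int(amplitude * raw))      # scale -1..1 up to the amplitude
    return samples


f880 = sine_wave(880, 4000)
print(len(f880), min(f880), max(f880))
```

Because sine stays between -1 and 1, the samples never get bigger than the amplitude you ask for.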
>>> f880=sineWave(880,4000)
>>> play(f880)
Now, let's add sine waves together. Like we said at the beginning of the
chapter, that's pretty easy: Just add the samples at the same indices together.
Here's a function that adds one sound into a second sound.
def addSounds(sound1,sound2):
  for index in range(1,getLength(sound1)+1):
    s1Sample = getSampleValueAt(sound1,index)
    s2Sample = getSampleValueAt(sound2,index)
    setSampleValueAt(sound2,index,s1Sample+s2Sample)
End of Recipe 23
How are we going to use this function to add together sine waves? We
need both functions at once. Turns out that it's easy:
Making it Work Tip: You can put more than
one function in the same file!
It's perfectly okay to have more than one function
in the same file. Just type them all in, in any order.
Python will figure it out.
My file additive.py looks like this:
def sineWave(freq,amplitude):
  mySound = getMediaPath("sec1silence.wav")
  buildSin = makeSound(mySound)
  interval = 1.0/freq
  samplesPerCycle = interval * sr # samples per cycle: make sure floating point
  maxCycle = 2 * pi
  return (buildSin)

def addSounds(sound1,sound2):
  for index in range(1,getLength(sound1)+1):
    s1Sample = getSampleValueAt(sound1,index)
    s2Sample = getSampleValueAt(sound2,index)
    setSampleValueAt(sound2,index,s1Sample+s2Sample)
Let's add together 440 Hz, 880 Hz (twice 440), and 1320 Hz (880+440),
but we'll have the amplitudes increase. We'll double the amplitude each
time: 2000, then 4000, then 8000. We'll add them all up into the name
f440. At the end, I generate a 440 Hz sound so that I can listen to them
both and compare.
>>> f440=sineWave(440,2000)
>>> f880=sineWave(880,4000)
>>> f1320=sineWave(1320,8000)
>>> addSounds(f880,f440)
>>> addSounds(f1320,f440)
>>> play(f440)
>>> just440=sineWave(440,2000)
>>> play(just440)
Common Bug: Beware of adding amplitudes
past 32767
When you add sounds, you add their amplitudes, too.
A maximum of 2000+4000+8000 will never be greater
than 32767, but do worry about that. Remember
what happened when the amplitude got too high last
chapter...
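One way to guard against that bug is to clamp the sum to the range a sample can hold. Here's a sketch on plain Python 3 lists; clamping is my suggestion rather than something the recipe does, and the name add_samples is hypothetical. The limits 32767 and -32768 are the largest and smallest sample values mentioned in the exercises of the last chapter.

```python
MAX_SAMPLE = 32767
MIN_SAMPLE = -32768


def add_samples(sound1, sound2):
    """Add two lists of samples index by index, clamping sums that are out of range."""
    mixed = []
    for s1, s2 in zip(sound1, sound2):
        total = s1 + s2
        # Assumption: clip instead of letting the value go out of range
        total = max(MIN_SAMPLE, min(MAX_SAMPLE, total))
        mixed.append(total)
    return mixed


print(add_samples([2000, 30000, -30000], [4000, 30000, -30000]))
print(2000 + 4000 + 8000)   # the worst case for the three sine waves above
```

The first pair adds up normally; the other two would overshoot, so they get pinned at the limits.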
Square waves
We don't have to just add sine waves. We can also add square waves. These
are literally square-shaped waves, moving between +1 and -1. The FFT
will look very different, and the sound will be very different. It can actually
be a much richer sound.
Try swapping this recipe in for the sine wave generator and see what you
think. Note the use of an if statement to swap between the positive and
negative sides of the wave half-way through a cycle.
Recipe 24: Square wave generator for given frequency and amplitude
def squareWave(freq,amplitude):
i = 0
setSampleValueAt(square,s,sampleVal)
i = i + 1
return(square)
End of Recipe 24
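Here's a self-contained sketch of a square wave generator in plain Python 3, so you can see the if statement doing the flip. This is my own sketch under stated assumptions, not the recipe itself: it builds a list rather than a JES sound, assumes a sampling rate of 22050 samples per second, and the name square_wave is made up.

```python
def square_wave(freq, amplitude, sampling_rate=22050, seconds=1.0):
    """Build a list of samples that flips between +amplitude and -amplitude."""
    samples_per_half_cycle = int((sampling_rate / freq) / 2)
    sample_value = amplitude
    i = 0                                  # counts samples within the current half-cycle
    samples = []
    for _ in range(int(sampling_rate * seconds)):
        if i >= samples_per_half_cycle:    # end of a half-cycle
            sample_value = -sample_value   # swap to the other side of the wave
            i = 0
        samples.append(sample_value)
        i = i + 1
    return samples


sq440 = square_wave(440, 4000)
print(sq440[:3], sq440[25:28])
```

At the assumed rate, a 440 Hz half-cycle is 25 samples, so the first 25 samples are positive and the next 25 are negative.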
>>> sq440=squareWave(440,4000)
>>> play(sq440)
>>> sq880=squareWave(880,8000)
>>> sq1320=squareWave(1320,10000)
>>> writeSoundTo(sq440,getMediaPath("square440.wav"))
Note: There is no file at /Users/guzdial/mediasources/square440.wav
>>> addSounds(sq880,sq440)
>>> addSounds(sq1320,sq440)
>>> play(sq440)
>>> writeSoundTo(sq440,getMediaPath("squarecombined440.wav"))
Note: There is no file at /Users/guzdial/mediasources/squarecombined440.wav
You'll find that the waves (in the wave editor of MediaTools) really do
look square (Figure 4.5), but the most amazing thing is all the additional
spikes in FFT (Figure 4.6). Square waves really do result in a much more
complex sound.
Triangle waves
Try triangle waves instead of square waves with this recipe.
def triangleWav(freq):
# if end of a half-cycle
if (i > samplesPerHalfCycle):
# reverse the increment every half-cycle
increment = increment * -1
# and reinit the half-cycle counter
i = 0
play(triangle)
End of Recipe 25
Exercises
Exercise 34: Using the sound tools, figure out the characteristic pattern
of different instruments. For example, pianos tend to have a pattern the
opposite of what we created: the amplitudes decrease as we get to higher
sine waves. Try creating a variety of patterns and see how they sound and
how they look.
Exercise 35: When musicians work with additive synthesis, they will often
wrap envelopes around the sounds, and even around each added sine
wave. An envelope changes the amplitude over time: It might start out
small, then grow (rapidly or slowly), then hold at a certain value during the
sound, and then drop before the sound ends. That kind of pattern is sometimes
called the attack-sustain-decay (ASD) envelope. Try implementing
that for the sine and square wave generators.
To Dig Deeper
Good books on computer music will talk a lot about creating sounds from
scratch like in this chapter. One of my favorites for understandability is
Computer Music: Synthesis, Composition, and Performance by Dodge and
Jerse [Dodge and Jerse, 1997]. The bible of computer music is Curtis Roads'
massive The Computer Music Tutorial [Roads, 1996].
One of the most powerful tools for playing with this level of computer
music is CSound. It's a software music synthesis system, free, and totally
cross-platform. The book by Richard Boulanger [Boulanger, 2000] has everything
you need for playing with CSound.
Figure 4.1: The top and middle waves are added together to create the
bottom wave
Figure 4.2: The raw 440 Hz signal on top, then the 440+880+1320 Hz signal
on the bottom
Figure 4.5: The 440 Hz square wave (top) and additive combination of
square waves (bottom)
Figure 4.6: FFTs of the 440 Hz square wave (top) and additive combination
of square waves (bottom)
Part III
Pictures
Chapter 5
Encoding and Manipulating Pictures
[Figure: an example matrix of numbers, with numbered rows and columns]
What's stored at each element in the picture is a pixel. The word "pixel"
is short for "picture element." It's literally a dot, and the overall picture
is made up of lots of these dots. Have you ever taken a magnifying glass
to pictures in the newspaper or magazines, or to a television or even your
own monitor? (Figure 5.2 was generated by taking an Intel microscope and
pointing it at the screen at 60x magnification.) It's made up of many, many
dots. When you look at the picture in the magazine or on the television, it
doesn't look like it's broken up into millions of discrete spots, but it is.
Just like the samples that make up a sound, our human sensor apparatus
can't distinguish (without magnification or other special equipment) the
small bits in the whole. That's what makes it possible to digitize pictures.
We break up the picture into smaller elements (pixels), but enough of them
that the picture doesn't look choppy when looked at as a whole. If you
can see the effects of the digitization (e.g., lines have sharp edges, you see
little rectangles in some spots), we call that pixelization: the effect when
the digitization process becomes obvious.
Picture encoding is only the next step in complexity after sound encoding.
A sound is inherently linear: it progresses forward in time. A picture
has two dimensions, a width and a height. But other than that, it's quite
similar.
We will encode each pixel as a triplet of numbers. The first number
represents the amount of red in the pixel. The second is the amount of
green, and the third is the amount of blue. It turns out that we can actually
make up any color by combining red, green, and blue light (Figure 5.3).
Combining all three gives us pure white. Turning off all three gives us
black. We call this the RGB model.
Figure 5.2: Cursor and icon at regular magnification on top, and close-up
views of the cursor (left) and the line below the cursor (right)
There are other models for defining and encoding colors besides the RGB
color model. There's the HSV color model which encodes Hue, Saturation,
and Value. The nice thing about the HSV model is that some notions, like
making a color "lighter" or "darker," map cleanly to it (e.g., you simply
change the saturation). Another model is the CMYK color model, which
encodes Cyan, Magenta, Yellow, and blacK ("B" could be confused with
Blue). The CMYK model is what printers use: those are the inks they
combine to make colors. However, the four elements mean more to encode
on a computer, so it's less popular for media computation. RGB is probably
the most popular model on computers.
Each color component in a pixel is typically represented with a single
byte, eight bits. If you recall our earlier discussion, eight bits can represent
256 values ($2^8$), which we typically use to represent the values 0 to 255.
Each pixel, then, uses 24 bits to represent colors. Using our same formula
($2^{24}$), we know that the standard encoding for color using the RGB model
can represent 16,777,216 colors. There are certainly more than 16 million
colors in all of creation, but it would take a very discerning eye to pick out
any missing in this model.
Figure 5.3: Merging red, green, and blue to make new colors
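If you'd like to check that arithmetic yourself, here's a tiny sketch in plain Python 3. The variable names are made up for illustration.

```python
values_per_component = 2 ** 8        # one byte (eight bits) per color component
bits_per_pixel = 3 * 8               # red, green, and blue: one byte each
total_colors = 2 ** bits_per_pixel   # every combination of those 24 bits

print(values_per_component)   # 256
print(bits_per_pixel)         # 24
print(total_colors)           # 16777216
```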
Most facilities for allowing users to pick out colors let the users specify
the color as RGB components. The Macintosh offers RGB sliders in its basic
color picker (Figure 5.4). The color chooser in JES (which is the standard
Java Swing color chooser) offers a similar set of sliders (Figure 5.5).
less intense. (0, 100, 0) is a dark green, and (0, 0, 100) is a dark blue.
When the red component is the same as the green and as the blue,
the resultant color is gray. (50, 50, 50) would be a fairly dark gray, and
(100, 100, 100) is lighter.
Figure 5.6 (replicated at Figure 5.27 (page 159) in the color pages
at the end of this chapter) is a representation of pixel RGB triplets in a
matrix representation. Thus, the pixel at (2, 1) has color (5, 10, 100) which
means that it has a red value of 5, a green value of 10, and a blue value of
100: it's a mostly blue color, but not pure blue. The pixel at (4, 1) has a pure
green color ((0, 100, 0)), but only 100 (out of a possible 255), so it's a fairly
dark green.
>>> file=pickAFile()
>>> print file
/Users/guzdial/mediasources/barbara.jpg
>>> picture=makePicture(file)
>>> show(picture)
>>> print picture
Picture, filename /Users/guzdial/mediasources/barbara.jpg height 294 width 222
Common Bug: When the picture can't be read
You might sometimes get an error like this:
>>> p=makePicture(file)
java.lang.IllegalArgumentException
java.lang.IllegalArgumentException:
Width (-1) and height (0) must be > 0
We can get any particular pixel from a picture using getPixel with the
picture, and the coordinates of the pixel desired. We can also get all the
pixels with getPixels.
>>> pixel=getPixel(picture,1,1)
>>> print pixel
Pixel, color=color r=168 g=131 b=105
>>> pixels=getPixels(picture)
>>> print pixels[0]
Pixel, color=color r=168 g=131 b=105
Common Bug: Don't try printing the pixels:
Way too big!
getPixels literally returns an array of all the pixels
(as opposed to a samples object, like getSamples
returns). If you try to print the return value from
getPixels, you'll get the printout of each pixel, like
you see above. How many pixels are there? Well, this
small sample picture has a width of 222 and a height
of 294. 222 x 294 = 65,268. 65 thousand lines like the
above is a big printout. You probably don't want to
wait for it to finish. If you do this accidentally, just
quit JES and re-start it.
Pixels know where they came from. You can ask them their x and y
coordinates with getX and getY.
Each pixel knows how to getRed and setRed. (Green and blue work
similarly.)
You can also ask a pixel for its color with getColor, and you can also
set the color with setColor. Color objects know their red, green, and blue
components. You can also make new colors with makeColor.
>>> color=getColor(pixel)
>>> print color
color r=255 g=131 b=105
>>> setColor(pixel,color)
>>> newColor=makeColor(0,100,0)
>>> print newColor
color r=0 g=100 b=0
>>> setColor(pixel,newColor)
>>> print getColor(pixel)
color r=0 g=100 b=0
If you change the color of a pixel, the picture that the pixel is from does
get changed.
So, if color1 has red, green, and blue components $(r_1, g_1, b_1)$, and color2 has
$(r_2, g_2, b_2)$, then color1 - color2 creates a new color $(r_1 - r_2, g_1 - g_2, b_1 - b_2)$.
We can also use <, >, and == (test for equality) to compare colors.
>>> print c1
color r=10 g=10 b=10
>>> print c2
color r=20 g=20 b=20
>>> print c2-c1
color r=10 g=10 b=10
>>> print c2 > c1
1
>>> print c2 < c1
0
The distance between two colors is the Cartesian distance between the
colors as points in a three-dimensional space, where red, green, and blue are
the three dimensions. Recall that the distance between two points $(x_1, y_1)$
and $(x_2, y_2)$ is:
$$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$$
The similar measure for two colors $(red_1, green_1, blue_1)$ and $(red_2, green_2, blue_2)$
is:
$$\sqrt{(red_1 - red_2)^2 + (green_1 - green_2)^2 + (blue_1 - blue_2)^2}$$
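Here's that color distance worked out as a small sketch in plain Python 3. It assumes each color is just a (red, green, blue) tuple rather than a JES color object, and the name color_distance is made up for illustration.

```python
import math


def color_distance(color1, color2):
    """Cartesian distance between two (red, green, blue) tuples."""
    red1, green1, blue1 = color1
    red2, green2, blue2 = color2
    return math.sqrt((red1 - red2) ** 2 +
                     (green1 - green2) ** 2 +
                     (blue1 - blue2) ** 2)


print(color_distance((0, 0, 0), (3, 4, 0)))
print(color_distance((10, 10, 10), (20, 20, 20)))
```

Two identical colors have a distance of zero, and the further apart the components are, the bigger the number gets.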
You can automatically get the darker or lighter versions of colors with
makeDarker or makeLighter. (Remember that this was easy in HSV, but
not so easy in RGB. These functions do it for you.)
You can also make colors from pickAColor, which gives you a variety of
ways of picking a color.
>>> newcolor=pickAColor()
>>> print newcolor
color r=255 g=51 b=51
Once you have a color, you can get lighter or darker versions of the same
color with makeLighter and makeDarker.
>>> print c
color r=10 g=100 b=200
>>> print makeLighter(c)
color r=10 g=100 b=200
>>> print c
color r=14 g=142 b=255
>>> print makeDarker(c)
color r=9 g=99 b=178
>>> print c
color r=9 g=99 b=178
When you have finished manipulating a picture, you can write it out
with writePictureTo.
>>> writePictureTo(picture,"/Users/guzdial/newpicture.jpg")
Common Bug: End with .jpg
Be sure to end your filename with ".jpg" in order to
get your operating system to recognize it as a JPEG
file.
Of course, we don't have to write new functions to manipulate pictures.
We can do it from the command area using the functions just described.
>>> file="/Users/guzdial/mediasources/barbara.jpg"
>>> pict=makePicture(file)
>>> show(pict)
>>> setColor(getPixel(pict,10,100),yellow)
>>> setColor(getPixel(pict,11,100),yellow)
>>> setColor(getPixel(pict,12,100),yellow)
>>> setColor(getPixel(pict,13,100),yellow)
>>> repaint(pict)
The result showing a small yellow line on the left side appears in Fig-
ure 5.7. This is 100 pixels down, and the pixels 10, 11, 12, and 13 from the
left edge.
Figure 5.7: Directly modifying the pixel colors via commands: Note the
small yellow line on the left
The red, green, and blue values will be displayed for the pixel you're
pointing at. This is useful when you want to get a sense of how the
colors in your picture map to numeric red, green, and blue values. It's
also helpful if you're going to be doing some computation on the pixels
and want to check the values.
The x and y position will be displayed for the pixel you're pointing at.
This is useful when you want to figure out regions of the screen, e.g.,
if you want to process only part of the picture. If you know the range
of x and y coordinates where you want to process, you can tune your
for loop to reach just those sections.
Finally, a magnifier is available to let you see the pixels blown up.
(The magnifier can be clicked and dragged around.)
def decreaseRed(picture):
  for p in getPixels(picture):
    value=getRed(p)
    setRed(p,value*0.5)
End of Recipe 26
Figure 5.9: The original picture (left) and red-reduced version (right)
(and at Figure 5.28 on page 160). 50% is obviously a lot of red to reduce!
The picture looks like it was taken through a blue filter.
Let's increase the red in the picture now. If multiplying the red component
by 0.5 reduced it, multiplying it by something over 1.0 should increase
it. I'm going to apply the increase to the exact same picture, to see if we
can even it out some (Figure 5.10 and Figure 5.29).
def increaseRed(picture):
  for p in getPixels(picture):
    value=getRed(p)
    setRed(p,value*1.2)
End of Recipe 27
We can even get rid of a color completely. The below recipe erases the
Figure 5.10: Overly blue (left) and red increased by 20% (right)
def clearBlue(picture):
  for p in getPixels(picture):
    setBlue(p,0)
End of Recipe 28
seen earlier.
def lighten(picture):
  for px in getPixels(picture):
    color = getColor(px)
    makeLighter(color)
    setColor(px,color)
End of Recipe 29
def darken(picture):
  for px in getPixels(picture):
    color = getColor(px)
    makeDarker(color)
    setColor(px,color)
End of Recipe 30
Creating a negative
Creating a negative image of a picture is much easier than you might think
at first. Let's think it through. What we want is the opposite of each of
the current values for red, green, and blue. It's easiest to understand at the
extremes. If we have a red component of 0, we want 255 instead. If we have
255, we want the negative to have a zero.
Now let's consider the middle ground. If the red component is slightly
red (say, 50), we want something that is almost completely red, where the
"almost" is the same amount of redness in the original picture. We want
the maximum red (255), but 50 less than that. We want a red component of
255 - 50 = 205. In general, the negative should be 255 - original. We need
to compute the negative of each of the red, green, and blue components,
then create a new negative color, and set the pixel to the negative color.
Here's the recipe that does it, and you can see even from the grayscale
image that it really does work (Figure 5.13 and Figure 5.32).
def negative(picture):
  for px in getPixels(picture):
    red=getRed(px)
    green=getGreen(px)
    blue=getBlue(px)
    negColor=makeColor(255-red, 255-green, 255-blue)
    setColor(px,negColor)
End of Recipe 31
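Here's the 255 - original arithmetic as a tiny sketch in plain Python 3. It assumes a color is just a (red, green, blue) tuple instead of a JES pixel, and the name negative_color is made up for illustration.

```python
def negative_color(color):
    """Return the negative of a (red, green, blue) tuple."""
    red, green, blue = color
    return (255 - red, 255 - green, 255 - blue)


print(negative_color((50, 0, 255)))
print(negative_color(negative_color((50, 0, 255))))
```

Taking the negative twice gets you back to the color you started with, which is a handy sanity check.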
Converting to greyscale
Converting to greyscale is a fun recipe. It's short, not hard to understand,
and yet has such a nice visual effect. It's a really nice example of what one
def greyScale(picture):
  for p in getPixels(picture):
    intensity = (getRed(p)+getGreen(p)+getBlue(p))/3
    setColor(p,makeColor(intensity,intensity,intensity))
End of Recipe 32
def greyScaleNew(picture):
  for px in getPixels(picture):
    newRed = getRed(px) * 0.299
    newGreen = getGreen(px) * 0.587
    newBlue = getBlue(px) * 0.114
    luminance = newRed+newGreen+newBlue
    setColor(px,makeColor(luminance,luminance,luminance))
End of Recipe 33
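To compare the two greyscale recipes on a single color, here's a sketch in plain Python 3. It assumes a color is a (red, green, blue) tuple, and the names average_grey and luminance_grey are made up; the weights 0.299, 0.587, and 0.114 are the ones from the recipe above. Truncating to a whole number with int is my choice here.

```python
def average_grey(color):
    """Grey level from a plain average of the three components."""
    red, green, blue = color
    return (red + green + blue) // 3


def luminance_grey(color):
    """Grey level from the weighted sum used in greyScaleNew."""
    red, green, blue = color
    return int(red * 0.299 + green * 0.587 + blue * 0.114)


pure_green = (0, 255, 0)
pure_blue = (0, 0, 255)
print(average_grey(pure_green), average_grey(pure_blue))
print(luminance_grey(pure_green), luminance_grey(pure_blue))
```

The plain average treats pure green and pure blue the same, while the weighted version makes green come out much brighter than blue.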
need to use two loops, one for each dimension of the picture. The inner loop
will be nested inside the outer loop, literally, inside its block. At this point,
you're going to have to be careful in how you space your code to make sure
that your blocks line up right.
Your loops will look something like this:
for x in range(1,getWidth(picture)):
  for y in range(1,getHeight(picture)):
    pixel=getPixel(picture,x,y)
For example, here's Recipe 29 (page 130), but using explicit pixel references.
def lighten(picture):
  for x in range(1,getWidth(picture)):
    for y in range(1,getHeight(picture)):
      px = getPixel(picture,x,y)
      color = getColor(px)
      makeLighter(color)
      setColor(px,color)
End of Recipe 34
Mirroring a picture
Let's start out with an interesting effect that isn't particularly useful, but it
is fun. Let's mirror a picture along its vertical axis. In other words, imagine
that you have a mirror, and you place it on a picture so that the left side
of the picture shows up in the mirror. That's the effect that we're going to
implement. We'll do it in a couple of different ways.
First, let's think through what we're going to do. We'll pick a horizontal
mirrorpoint halfway across the picture, getWidth(picture)/2. (We
want this to be an integer, a whole number, so we'll apply int to it.) We'll
have the x coordinate move from 1 to the mirrorpoint. At each value
of x, we want to copy the color at the pixel x pixels to the left of the
mirrorpoint to the pixel x pixels to the right of the mirrorpoint. The left
Figure 5.15: Once we pick a mirrorpoint, we can just walk x halfway and
subtract/add to mirrorpoint (the diagram shows a row of pixels a b c d e,
with mirrorpoint-1 and mirrorpoint+1 on either side of the mirrorpoint)
def mirrorVertical(source):
  mirrorpoint = int(getWidth(source)/2)
  for y in range(1,getHeight(source)):
    for x in range(1,mirrorpoint):
      p = getPixel(source, x+mirrorpoint,y)
      p2 = getPixel(source, mirrorpoint-x,y)
      setColor(p,makeColor(getRed(p2), getGreen(p2), getBlue(p2)))
End of Recipe 35
We'd use it like this, and the result appears in Figure 5.16.
>>> file="/Users/guzdial/mediasources/santa.jpg"
>>> print file
/Users/guzdial/mediasources/santa.jpg
>>> picture=makePicture(file)
>>> mirrorVertical(picture)
>>> show(picture)
Figure 5.16: Original picture (left) and mirrored along the vertical axis
(right)
it as we move along. Here's the same recipe, shorter but more complex.
def mirrorVertical(source):
  for y in range(1, getHeight(source)):
    for x in range((getWidth(source)/2)+1, getWidth(source)):
      p = getPixel(source,x,y)
      p2 = getPixel(source,((getWidth(source)/2)-(x-(getWidth(source)/2))),y)
      setColor(p,makeColor(getRed(p2), getGreen(p2), getBlue(p2)))
End of Recipe 36
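The mirroring idea can be tried without a picture at all. The following is a minimal sketch, assuming one row of pixels is an ordinary Python list (indexed from 0, unlike the pixel coordinates in the recipes); mirror_row is an illustrative name, not a JES function.

```python
# Sketch: mirror the left half of one row onto its right half.
# A row is a plain list here, so indices start at 0.

def mirror_row(row):
    mirrored = list(row)           # work on a copy
    width = len(mirrored)
    for left in range(width // 2):
        right = width - 1 - left   # same distance from the other edge
        mirrored[right] = mirrored[left]
    return mirrored

print(mirror_row(["a", "b", "c", "d", "e"]))
print(mirror_row([1, 2, 3, 4]))
```

With an odd number of pixels the middle one is left alone, which is one of the end conditions worth checking in the picture versions too.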
Scaling a picture
A very common thing to do with pictures is to scale them. Scaling up
means to make them larger, and scaling them down makes them smaller.
It's common to scale a 1-megapixel or 3-megapixel picture down to a smaller
size to make it easier to place on the Web. Smaller pictures require less disk
space, and thus less network bandwidth, and thus are easier and faster to
download.
Scaling a picture requires the use of the sampling sub-recipe that we saw
earlier. But instead of taking double-samples (to halve the frequency) or
every-other-sample (to double the frequency), we'll be taking double-pixels
(to double the size, that is, scale up) or every-other-pixel (to shrink the
picture, that is, scale down).
XXX NEED TO EXPLAIN THIS BETTER
Our target will be the paper-sized JPEG file in the mediasources directory,
which is 7x9.5 inches, which will fit on an 8.5x11 inch letter-size piece of
paper with three-quarter-inch margins.
>>> paperfile=getMediaPath("7inx95in.jpg")
>>> paperpicture=makePicture(paperfile)
>>> print getWidth(paperpicture)
504
>>> print getHeight(paperpicture)
684
We set the sourceX index to start at one, and then start the targetX
for loop.
We then copy the pixel color from the source to the target.
def resize(source,increment):
  # increment > 1 shrinks the picture, increment < 1 enlarges it
  newWidth = int(getWidth(source) * (1.0/increment))
  newHeight = int(getHeight(source) * (1.0/increment))
  target = makePicture(getMediaPath("7inx95in.jpg"))
  sourceX = 1
  for targetX in range(1,newWidth):
    sourceY = 1
    for targetY in range(1,newHeight):
      targetPixel=getPixel(target, targetX, targetY)
      sourcePixel=getPixel(source, int(sourceX), int(sourceY))
      setColor(targetPixel,getColor(sourcePixel))
      sourceY = sourceY + increment
      if sourceY > getHeight(source):
        sourceY = sourceY - getHeight(source)
    sourceX = sourceX + increment
    if sourceX > getWidth(source):
      sourceX = sourceX - getWidth(source)
  return target
End of Recipe 37
End of Recipe 38
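The sampling idea behind scaling is easier to follow in one dimension. Here is a minimal sketch, assuming a row of pixels is a plain Python list (indexed from 0); resample is an illustrative name, and the guard against running past the end is an addition of this sketch rather than something taken from the recipe.

```python
# Sketch: scale one row of values by stepping through it in increments.
# increment = 2 takes every other value (scale down);
# increment = 0.5 takes each value twice (scale up).

def resample(values, increment):
    new_length = int(len(values) * (1.0 / increment))
    result = []
    source_index = 0.0
    for _ in range(new_length):
        # Guard so rounding can never step past the last value.
        index = min(int(source_index), len(values) - 1)
        result.append(values[index])
        source_index = source_index + increment
    return result

row = [10, 20, 30, 40]
print(resample(row, 2))
print(resample(row, 0.5))
```

Scaling a whole picture does the same thing twice, once across the columns and once down the rows.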
Creating a collage
In the mediasources folder are a couple of images of flowers (Figure 5.17),
each 100 pixels wide. Let's make a collage of them, by combining several
of our effects to create different flowers. We'll copy them all into the blank
image 640x480.jpg. All we really have to do is to copy the pixel colors to
the right places.
Here's how we run the collage (Figure 5.18):
>>> flowers=createCollage()
Picture, filename /Users/guzdial/mediasources/flower1.jpg height
138 width 100
Picture, filename /Users/guzdial/mediasources/flower2.jpg height
227 width 100
Picture, filename /Users/guzdial/mediasources/640x480.jpg height
480 width 640
def createCollage():
  flower1=makePicture(getMediaPath("flower1.jpg"))
  print flower1
  flower2=makePicture(getMediaPath("flower2.jpg"))
  print flower2
  canvas=makePicture(getMediaPath("640x480.jpg"))
  print canvas
  #First picture, at left edge
  targetX=1
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Second picture, 100 pixels over
  targetX=100
  for sourceX in range(1,getWidth(flower2)):
    targetY=getHeight(canvas)-getHeight(flower2)-5
    for sourceY in range(1,getHeight(flower2)):
      px=getPixel(flower2,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Third picture, flower1 negated
  negative(flower1)
  targetX=200
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Fourth picture, flower2 with no blue
  clearBlue(flower2)
  targetX=300
  for sourceX in range(1,getWidth(flower2)):
    targetY=getHeight(canvas)-getHeight(flower2)-5
    for sourceY in range(1,getHeight(flower2)):
      px=getPixel(flower2,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Fifth picture, flower1, negated with decreased red
  decreaseRed(flower1)
  targetX=400
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  show(canvas)
  return(canvas)
End of Recipe 39
As long as this is, it would be even longer if we actually put all the
effects in the same function. Instead, I copied the functions we did earlier.
My whole program area looks like this:
def createCollage():
  flower1=makePicture(getMediaPath("flower1.jpg"))
  print flower1
  flower2=makePicture(getMediaPath("flower2.jpg"))
  print flower2
  canvas=makePicture(getMediaPath("640x480.jpg"))
  print canvas
  #First picture, at left edge
  targetX=1
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Second picture, 100 pixels over
  targetX=100
  for sourceX in range(1,getWidth(flower2)):
    targetY=getHeight(canvas)-getHeight(flower2)-5
    for sourceY in range(1,getHeight(flower2)):
      px=getPixel(flower2,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Third picture, flower1 negated
  negative(flower1)
  targetX=200
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Fourth picture, flower2 with no blue
  clearBlue(flower2)
  targetX=300
  for sourceX in range(1,getWidth(flower2)):
    targetY=getHeight(canvas)-getHeight(flower2)-5
    for sourceY in range(1,getHeight(flower2)):
      px=getPixel(flower2,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  #Fifth picture, flower1, negated with decreased red
  decreaseRed(flower1)
  targetX=400
  for sourceX in range(1,getWidth(flower1)):
    targetY=getHeight(canvas)-getHeight(flower1)-5
    for sourceY in range(1,getHeight(flower1)):
      px=getPixel(flower1,sourceX,sourceY)
      cx=getPixel(canvas,targetX,targetY)
      setColor(cx,getColor(px))
      targetY=targetY + 1
    targetX=targetX + 1
  show(canvas)
  return(canvas)

def clearBlue(picture):
  for p in getPixels(picture):
    setBlue(p,0)

def negative(picture):
  for px in getPixels(picture):
    red=getRed(px)
    green=getGreen(px)
    blue=getBlue(px)
    negColor=makeColor(255-red, 255-green, 255-blue)
    setColor(px,negColor)

def decreaseRed(picture):
  for p in getPixels(picture):
    value=getRed(p)
    setRed(p,value*0.5)
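The five copying loops in the collage differ only in which picture they copy and where it lands, so the copying itself could be pulled out into one function. Here is a minimal sketch of that idea, assuming pictures are plain lists of rows (indexed from 0) rather than JES pictures; copy_into is an illustrative name, not something the recipe defines.

```python
# Sketch: one reusable copy loop instead of five nearly identical ones.
# Pictures are lists of rows of values, indexed from 0.

def copy_into(canvas, source, start_x, start_y):
    for source_y in range(len(source)):
        for source_x in range(len(source[0])):
            value = source[source_y][source_x]
            canvas[start_y + source_y][start_x + source_x] = value

canvas = [[0] * 6 for _ in range(3)]   # blank 6-wide, 3-high canvas
flower = [[1, 2],
          [3, 4]]
copy_into(canvas, flower, 0, 1)        # first copy at the left edge
copy_into(canvas, flower, 3, 1)        # second copy three columns over
for row in canvas:
    print(row)
```

With a helper like this, each step of the collage becomes one effect followed by one call.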
lot with the value that I used for distance (here, 50.0) and the amount of
redness increase (here, 50% increase). The result is that the wood behind
her gets increased, too (Figure 5.19 and Figure 5.34).
def turnRed():
  brown = makeColor(57,16,8)
  file="/Users/guzdial/mediasources/barbara.jpg"
  picture=makePicture(file)
  for px in getPixels(picture):
    color = getColor(px)
    if distance(color,brown)<50.0:
      redness=int(getRed(px)*1.5)
      blueness=getBlue(px)
      greenness=getGreen(px)
      # makeColor takes red, green, blue, in that order
      setColor(px,makeColor(redness,greenness,blueness))
  show(picture)
  return(picture)
End of Recipe 40
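The test for "close enough to brown" depends on a distance between two colors. A minimal sketch of one such measure follows, assuming that distance means the straight-line (Euclidean) distance between the red, green, and blue values and that a color is a plain tuple; color_distance and is_brownish are illustrative names.

```python
# Sketch: straight-line distance between two (red, green, blue) colors.
# This is an assumption about what a color distance computes, written
# with plain tuples instead of JES color objects.
import math

def color_distance(color1, color2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(color1, color2)))

def is_brownish(color, threshold=50.0):
    brown = (57, 16, 8)
    return color_distance(color, brown) < threshold

print(color_distance((57, 16, 8), (60, 20, 8)))
print(is_brownish((60, 20, 8)))
print(is_brownish((200, 200, 200)))
```

Raising the threshold catches more shades of brown but also more of everything else, which is why the value takes some experimenting.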
With the MediaTools, we can also figure out the coordinates just around
Barb's face, and then just do the browns near her face. The effect isn't too
good, though it's clear that it worked. The line of redness is too sharp and
rectangular (Figure 5.20 and Figure 5.35).
def turnRedInRange():
  brown = makeColor(57,16,8)
  file="/Users/guzdial/mediasources/barbara.jpg"
  picture=makePicture(file)
  for x in range(70,168):
    for y in range(56,190):
      px=getPixel(picture,x,y)
      color = getColor(px)
      if distance(color,brown)<50.0:
        redness=int(getRed(px)*1.5)
        blueness=getBlue(px)
        greenness=getGreen(px)
        # makeColor takes red, green, blue, in that order
        setColor(px,makeColor(redness,greenness,blueness))
  show(picture)
  return(picture)
End of Recipe 41
Background subtraction
Let's imagine that you have a picture of someone, and a picture of where they
stood without them there (Figure 5.21). Could you subtract the background
of the person (i.e., figure out where the colors are exactly the same), and
then substitute another background? Say, of the moon (Figure 5.22)?
Recipe 42: Subtract the background and replace it with a new one
Figure 5.21: A picture of a child (Katie), and her background without her
bgpx = getPixel(bg,x,y)
if (distance(getColor(p1px),getColor(bgpx)) < 15.0):
setColor(p1px,getColor(getPixel(newbg,x,y)))
return pic1
End of Recipe 42
You can, but the effect isn't as good as you'd like (Figure 5.23). My
daughter's shirt color was too close to the color of the wall. And though the
light was dim, a shadow is definitely having an effect here.
Chromakey
Recipe 43: Chromakey: Replace all blue with the new background
def chromakey(source,bg):
  # source should have something in front of blue, bg is the new background
  for x in range(1,getWidth(source)):
    for y in range(1,getHeight(source)):
      p = getPixel(source,x,y)
      # My definition of blue: If the redness + greenness < blueness
      if (getRed(p) + getGreen(p) < getBlue(p)):
        #Then, grab the color at the same spot from the new background
        setColor(p,getColor(getPixel(bg,x,y)))
  return source
End of Recipe 43
The effect is really quite striking (Figure 5.25). Do note the folds in
the lunar surface, though. The really cool thing is that this recipe works for
any background that's the same size as the image (Figure 5.26).
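The whole recipe hinges on the test for "blue". A minimal sketch of that test on plain (red, green, blue) tuples follows; looks_blue and chromakey_row are illustrative names, and a row of pixels is assumed to be an ordinary list.

```python
# Sketch: the chromakey test applied to one row of plain color tuples.

def looks_blue(color):
    # Same definition as the recipe: redness + greenness < blueness.
    red, green, blue = color
    return red + green < blue

def chromakey_row(source_row, background_row):
    # Keep the source color unless it looks blue; then use the background.
    return [bg if looks_blue(src) else src
            for src, bg in zip(source_row, background_row)]

source_row = [(10, 10, 200), (200, 180, 150), (100, 100, 150)]
moon_row = [(1, 1, 1), (2, 2, 2), (3, 3, 3)]
print(chromakey_row(source_row, moon_row))
```

Note that the pale blue (100, 100, 150) is not replaced, because its red and green together outweigh its blue. A lighter backdrop might call for a different definition of blue.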
There's another way of writing this code, which is shorter but does the
same thing.
def chromakey2(source,bg):
  for p in pixels(source):
    if (getRed(p)+getGreen(p) < getBlue(p)):
      setColor(p,getColor(getPixel(bg,x(p),y(p))))
  return source
End of Recipe 44
Some of our best effects come from combining pixels: using pixels from
other sides of a target pixel to inform what we do.
Blurring
def blurExample(size):
  pic = makePicture(pickAFile())
  newPic = blur(pic,size)
  show(newPic)
  show(pic)

# To blur an image, we take each pixel and set its color to the average
# of all the pixels around it (within a certain distance). This gives the
# effect of blending things together, as in a blur, where you lose detail.
#
# pic is the image, and size is how big an area to average: 1 = a 3x3
# pixel area with the current pixel in the center, 2 = 5x5, 3 = 7x7...
# Positive numbers only; 0 will do nothing and return the original image.
def blur(pic,size):
  new = makePicture(getMediaPath("640x480.jpg"))
  for x in range(1,getWidth(pic)):
    print "On x> ", x
    for y in range(1,getHeight(pic)):
      newClr = blurHelper(pic,size,x-size,y-size)
      setColor(getPixel(new,x,y),newClr)
  return new

# Starting at a given x,y in the picture pic (the top-left corner of the
# area), this sums up the area of pixels indicated by size. Coordinates
# outside the picture (below 1 or beyond the width or height) are skipped.
#
# Returns a color representing the average of the surrounding pixels.
def blurHelper(pic,size,x,y):
  red,green,blue = 0,0,0
  cnt = 0
  for x2 in range(0,(1+(size*2))):
    if(x+x2 >= 1):
      if(x+x2 <= getWidth(pic)):
        for y2 in range(0,(1+(size*2))):
          if(y+y2 >= 1):
            if(y+y2 <= getHeight(pic)):
              p = getPixel(pic,(x+x2),(y+y2))
              blue = blue + getBlue(p)
              red = red + getRed(p)
              green = green + getGreen(p)
              cnt = cnt + 1
  return makeColor(red/cnt,green/cnt,blue/cnt)
End of Recipe 45
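The averaging at the heart of the blur can be checked on a tiny example. The sketch below assumes a greyscale picture stored as a list of rows of numbers (indexed from 0); blur_value and blur_grid are illustrative names, and the neighborhood is centered on the pixel directly rather than passed in as a corner.

```python
# Sketch: box blur on a small grid of grey values (lists indexed from 0).

def blur_value(grid, x, y, size):
    total = 0
    count = 0
    for ny in range(y - size, y + size + 1):
        for nx in range(x - size, x + size + 1):
            # Skip neighbors that fall off the edge of the grid.
            if 0 <= ny < len(grid) and 0 <= nx < len(grid[0]):
                total = total + grid[ny][nx]
                count = count + 1
    return total // count

def blur_grid(grid, size):
    # Build a new grid so already-blurred values never feed later averages.
    return [[blur_value(grid, x, y, size) for x in range(len(grid[0]))]
            for y in range(len(grid))]

grid = [[0, 0, 0],
        [0, 90, 0],
        [0, 0, 0]]
for row in blur_grid(grid, 1):
    print(row)
```

The single bright value in the middle gets spread over its neighbors, and pixels at the edges average over fewer neighbors than pixels in the middle.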
Exercises
Exercise 36: Recipe 26 (page 127) is obviously too much color reduction.
Write a version that only reduces the red by 10%, then one by 20%. Which
seems to be more useful? Note that you can always repeatedly reduce the
redness in a picture, but you don't want to have to do it too many times,
either.
Exercise 37: Write the blue and green versions of Recipe 26 (page 127).
Exercise 38: Each of the below is equivalent to Recipe 27 (page 128).
Test them and convince yourself. Which do you prefer and why?
def increaseRed2(picture):
  for p in getPixels(picture):
    setRed(p,getRed(p)*1.2)

def increaseRed3(picture):
  for p in getPixels(picture):
    redComponent = getRed(p)
    greenComponent = getGreen(p)
    blueComponent = getBlue(p)
    newRed=int(redComponent*1.2)
    newColor = makeColor(newRed,greenComponent,blueComponent)
    setColor(p,newColor)
Exercise 39: If you keep increasing the red, eventually the red looks like
it disappears, and you eventually get errors about illegal arguments. What
do you think is going on?
Exercise 40: Rewrite Recipe 28 (page 129), which clears blue, to clear red
and green instead. For each of these, which would be the most useful in
actual practice? How about combinations of these?
Exercise 41: Rewrite Recipe 28 (page 129) to maximize blue (i.e., setting
it to 255) instead of clearing it. Is this useful? Would the red or green
versions be useful?
Exercise 42: There is more than one way to compute the right greyscale
value for a color value. The simple recipe that we use in Recipe 32 (page 133)
may not be what your greyscale printer uses when printing a color picture.
Compare the color (relatively unconverted by the printer) greyscale image
using our simple algorithm in Figure 5.33 with the original color picture
that the printer has converted to greyscale (left of Figure 5.9). How do the
two pictures differ?
Exercise 43: Are Recipe 35 (page 136) and Recipe 36 (page 137) really
the same? Look at them carefully and consider the end conditions: The
points when x is at the beginning and end of its range, for example. It's
easy in loops to be off-by-one.
Exercise 44: Can you rewrite the vertical mirroring function (Recipe 35
(page 136)) to do horizontal mirroring? How about mirroring along the
diagonal (from (1, 1) to (width, height))?
Exercise 45: Think about how the greyscale algorithm works. Basically,
if you know the luminance of anything visual (e.g., a small image, a letter),
you can replace a pixel with that visual element in a similar way to create
a collage image. Try implementing that. You'll need 256 visual elements of
increasing lightness, all of the same size. You'll create a collage by replacing
each pixel in the original image with one of these visual elements.
To Dig Deeper
The bible of computer graphics is Introduction to Computer Graphics [Foley et al., 1993].
It's highly recommended.
5.3. COLOR FIGURES 159
Figure 5.28: Color: The original picture (left) and red-reduced version
(right)
Figure 5.29: Color: Overly blue (left) and red increased by 20% (right)
Figure 5.35: Color: Increasing reds in the browns, within a certain range
Chapter 6
Creating Pictures
Sometimes you want to create your own images, or add things to images
other than other images. That's what this chapter is about.
def lineExample():
  img = makePicture(pickAFile())
  new = verticalLines(img)
  new2 = horizontalLines(img)
  show(new2)
  return new2

def horizontalLines(src):
  for x in range(1,getHeight(src),5):
    for y in range(1,getWidth(src)):
      setColor(getPixel(src,y,x),black)
  return src

def verticalLines(src):
  for x in range(1,getWidth(src),5):
    for y in range(1,getHeight(src)):
      setColor(getPixel(src,x,y),black)
  return src
End of Recipe 46
def littlepicture():
  canvas=makePicture(getMediaPath("640x480.jpg"))
  addText(canvas,10,50,"This is not a picture")
  addLine(canvas,10,20,300,50)
  addRectFilled(canvas,0,200,300,500,yellow)
  addRect(canvas,10,210,290,490)
  return canvas
End of Recipe 47
6.2. DRAWING WITH DRAWING COMMANDS 167
The recipe above draws the picture in Figure 6.2. These are examples
of the drawing commands available in JES.
Here's a thought: Which of these is smaller? The picture, on my disk, is
about 15 kilobytes (a kilobyte is a thousand bytes). The recipe is less than
100 bytes. But they are equivalent. What if you just saved the program and
not the pixels? That's what a vector representation for graphics is about.
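To make the comparison concrete, here is a rough sketch that stores the drawing as a short list of commands and compares its size with an uncompressed 640x480 picture. The command tuples are a made-up format for illustration, and three bytes per pixel is an assumption about how raw pixels would be stored.

```python
# Sketch: a picture kept as drawing commands versus kept as raw pixels.
# The command tuples are a made-up format, not something JES reads.
commands = [
    ("text", 10, 50, "This is not a picture"),
    ("line", 10, 20, 300, 50),
    ("rect_filled", 0, 200, 300, 500, "yellow"),
    ("rect", 10, 210, 290, 490),
]

program_size = len(repr(commands))   # characters needed to write the commands
raw_bytes = 640 * 480 * 3            # one byte each for red, green, blue

print(program_size)
print(raw_bytes)
```

Even before any compression is considered, the list of commands is thousands of times smaller than the grid of pixels it describes.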
Part IV
Meta-Issues: How we do
what we do
Chapter 7
7.1.2 Bottom-up
Which parts of your program do you already know how to do? Does it say
that you have to manipulate sound? Try a couple of the sound recipes in the
book to remember how to do that. Does it say that you have to change
red levels? Can you find a recipe that does that and try it?
Now, can you add some of these together? Can you put together a
couple of recipes that do part of what you want?
Keep growing the program. Is it closer to what you need? What else
do you need to add?
172 CHAPTER 7. DESIGN AND DEBUGGING
Run your program often. Make sure it works, and that you understand
what you have so far.
Tracing Code
Sit down with pencil and paper and figure out the variable values and what's
happening.
Print statements really are very useful in tracing code. You
can also use the printNow function, which forces the print to occur right away,
rather than at the end of the execution.
Exercises
To Dig Deeper
7.2. TECHNIQUES OF DEBUGGING 173
Files
Chapter 8
This chapter will eventually talk about directories and manipulating
directories, and how to read and write files.
178CHAPTER 8. ENCODING, CREATING, AND MANIPULATING FILES
def copyFile(path1,path2):
  inFile = open(path1,"r")
  outFile = open(path2,"w")
  temp = inFile.read()
  outFile.write(temp)
  inFile.close()
  outFile.close()
  return
End of Recipe 48
What if we can't fit the whole file into memory? What if we run out
of memory?
def copyFile(path1,path2):
  inFile = open(path1,"r")
  outFile = open(path2,"w")
  outFile.close()
  return
End of Recipe 49
def copyFile(path1,path2):
  inFile = open(path1,"rb")
  outFile = open(path2,"wb")
  outFile.close()
  return
End of Recipe 50
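One way to copy a file that is too big for memory is to read and write it a piece at a time. The following is a minimal sketch of that approach, not a reconstruction of the recipes above; copy_in_chunks, the chunk size, and the temporary file names are all choices made for this example.

```python
# Sketch: copy a file in fixed-size pieces so only one piece is in memory.
import os
import tempfile

def copy_in_chunks(path1, path2, chunk_size=1024):
    in_file = open(path1, "rb")      # "rb"/"wb": treat the file as raw bytes
    out_file = open(path2, "wb")
    chunk = in_file.read(chunk_size)
    while chunk:                     # read() returns nothing at end of file
        out_file.write(chunk)
        chunk = in_file.read(chunk_size)
    in_file.close()
    out_file.close()

# Try it on a small file made up for the occasion.
folder = tempfile.mkdtemp()
original = os.path.join(folder, "original.bin")
duplicate = os.path.join(folder, "duplicate.bin")
handle = open(original, "wb")
handle.write(b"media " * 1000)
handle.close()

copy_in_chunks(original, duplicate, chunk_size=256)
print(os.path.getsize(duplicate))
```

Opening the files in binary mode matters for media files such as JPEG or WAV, where the bytes are not text.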
Part VI
Text
Chapter 9
Manipulating text is what this chapter is about. It's very important for
us because a very common form of text for us today is HyperText Markup
Language (HTML).
184 CHAPTER 9. ENCODING AND MANIPULATION OF TEXT
def html():
  # To use this routine, the media folder must be used for your
  # output and picture files
  myHTML = getMediaPath("myHTML.html")
  pictureFile = getMediaPath("barbara.jpg")
  eol=chr(10) #End-of-line character
  # The following line builds a literal string that includes
  # both single and double quotes
  buildSpecial = '<IMG SRC="' + pictureFile + '" ALT="I am one heck of a programmer!">' + eol
  outFile = open(myHTML,"w")
  outFile.write("<HTML>"+eol)
  outFile.write("<HEAD>"+eol)
  outFile.write("<TITLE>Homepage of Georgia P. Burdell</TITLE>"+eol)
  outFile.write('<LINK REL=STYLESHEET TYPE="text/css" HREF="style.css">')
  outFile.write("</HEAD>"+eol)
  outFile.write("<BODY>"+eol)
  outFile.write("<CENTER><H2>Welcome to the home page of Georgia P. Burdell!</H2></CENTER>"+eol)
  outFile.write("<BR>"+eol)
  outFile.write("<P> Hello, and welcome to my home page! As you should have already"+eol)
  outFile.write(" guessed, my name is Georgia, and I am a <A HREF=http://www.cc.gatech.edu><B>"+eol)
outFile.write(Computer Science</B></A> major at <A HREF=http://www.gatec
  outFile.write("Georgia Tech</B> </A>"+eol)
  outFile.write("<BR>"+eol)
  outFile.write("Here is a picture of me in case you were wondering what I looked like."+eol)
  outFile.write("</P>"+eol)
  outFile.write(buildSpecial) # Write the special line we built up near the top
  outFile.write("<P><H4> Well, welcome to my web page. The majority of it is still under construction, so I don't have a lot to show you right now. "+eol)
  outFile.write("I am in my 75th year at Georgia Tech but am taking CS 1315 so I don't have a lot of spare time to update the page."+eol)
  outFile.write("I promise to start real soon!"+eol)
  outFile.write("--Georgia P. Burdell</P></H4>"+eol)
  outFile.write("<HR>"+eol)
  outFile.write('<P>If you want to send me e-mail, click <A HREF = "mailto:[email protected]">[email protected]</A>'+eol)
  outFile.write("<HR></P>"+eol)
  outFile.write('<CENTER><A HREF="http://www.cc.gatech.edu/">'+eol)
outFile.write(<IMG SRC="http://www.cc.gatech.edu/newhome_images/CoC_logo
ALT= "To my school"></CENTER>+eol)
  outFile.write("</A>"+eol)
  outFile.write("</BODY>"+eol)
  outFile.write("</HTML>"+eol)
  outFile.close()
End of Recipe 51
9.2. CONVERTING FROM SOUND TO TEXT TO GRAPHICS 185
Recipe 52: Convert a sound into a text file that Excel can read
def writeSampleValue():
  f = pickAFile() # File where the original sound resides
  source = makeSound(f)
  eol = chr(10) # End-of-line character
  endCurrentSound = getLength(source)
  outFile.close()
End of Recipe 52
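The middle of the recipe above is missing, so the following is only a sketch of the general idea: write one sample per line so that a spreadsheet can read the file as rows. It assumes the samples are already in a plain Python list; write_sample_values, the tab-separated layout, and the file name are choices made for this example, not taken from the recipe.

```python
# Sketch: write sample numbers and values to a text file, one per line.
# Samples are a plain list here rather than a JES sound.
import os
import tempfile

def write_sample_values(samples, path):
    eol = chr(10)    # newline
    tab = chr(9)     # tab between the two columns
    out_file = open(path, "w")
    for index in range(len(samples)):
        out_file.write(str(index + 1) + tab + str(samples[index]) + eol)
    out_file.close()

samples = [0, 1200, -800, 32767]
path = os.path.join(tempfile.mkdtemp(), "samples.txt")
write_sample_values(samples, path)

in_file = open(path)
text = in_file.read()
in_file.close()
print(text)
```

Spreadsheets generally treat each line as a row and each tab as a column break, so the file opens as a two-column table of sample number and value.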
Part VII
Movies
Chapter 10
Movies (video) are actually very simple to manipulate. They are arrays
of pictures (frames). You need to be concerned with the frame rate (the
number of frames per second), but it's mostly just things you've seen before.
It just takes a long time to process...
190CHAPTER 10. ENCODING, MANIPULATION AND CREATING MOVIES
into a folder in the mediasources directory. You'll find that these are very
dark frames (Figure 10.2). Can we lighten them (Figure 10.3)? Well, maybe
a little.
Common Bug: If you see getMediaPath, then setMediaFolder
Whenever you see getMediaPath in a recipe, you
know that you have to setMediaFolder before using
that recipe.
You'd run this like lightenMovie("/Users/guzdial/mediasources/dark-bladerunner/").
Be sure to include the final directory delimiter!
def lightenMovie(folder):
  import os
  for file in os.listdir(folder):
    picture=makePicture(folder+file)
    for px in getPixels(picture):
      color=getColor(px)
      makeLighter(color)
      setColor(px,color)
    writePictureTo(picture,folder+"l"+file)
End of Recipe 53
10.2. COMPOSITING TO CREATE NEW MOVIES 191
def santaMovie(folder):
  santafile="/Users/guzdial/Work/mediasources/santa.jpg"
  santa=makePicture(santafile)
  startXPos = 10
  startYPos = 100
  import os
  for file in os.listdir(folder):
    frame=makePicture(folder+file)
    xmax=min(startXPos+getWidth(santa),getWidth(frame))
    ymax=min(startYPos+getHeight(santa),getHeight(frame))
    santaX = 1
    for x in range(startXPos,xmax):
      santaY = 1
      for y in range(startYPos,ymax):
        px=getPixel(frame,x,y)
        santaPixel=getPixel(santa,santaX,santaY)
        setColor(px,getColor(santaPixel))
        santaY = santaY + 1
      santaX = santaX + 1
    writePictureTo(frame,folder+"s"+file)
    # move Santa one pixel to the right each frame
    startXPos = startXPos + 1
End of Recipe 54
def AnimationSimple():
  frames = 50
  rx = 0
  ry = 0
  rw = 50
  rh = 50
  for f in range(1,frames+1):
    pic = BlankPicture(500,500)
    pic.addOvalFilled(Color(255,0,0),rx,ry,rw,rh)
    # pad single-digit frame numbers so every name has two digits
    if(f < 10):
      pic.writeTo("test0%d.jpg"%(f))
    else:
      pic.writeTo("test%d.jpg"%(f))
    rx = 5 + rx
    ry = 5 + ry
10.3. ANIMATION: CREATING MOVIES FROM SCRATCH 193
def AnimationSimple2():
  frames = 100
  w = 50
  h = 50
  # ball positions: x and y positions of the red, blue, green, and yellow balls
  rx = 0
  ry = 0
  bx = 275
  by = 225
  gx = 275
  gy = 275
  yx = 225
  yy = 275
  for f in range(1,frames+1):
    pic = BlankPicture(500,500)
    if(f < 50):
      rx += 5
      ry += 5
    else:
      rx -= 2
      ry -= 2
    bx += 5
    by -= 5
    gx += 5
    gy += 5
    yx -= 5
    yy += 5
    pic.addOvalFilled(red,rx,ry,w,h)
    pic.addOvalFilled(blue,bx,by,w,h)
    pic.addOvalFilled(green,gx,gy,w,h)
    pic.addOvalFilled(yellow,yx,yy,w,h)
    if(f < 10):
      pic.writeTo("test00%d.jpg"%(f))
    elif(f < 100):
      pic.writeTo("test0%d.jpg"%(f))
    else:
      pic.writeTo("test%d.jpg"%(f))
End of Recipe 55
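The if/elif chain at the end of the animation recipe exists only to pad the frame number with zeros. A minimal sketch of a shorter way to get the same names uses a width in the format string; frame_name is an illustrative helper, not part of the recipe.

```python
# Sketch: zero-padded frame names with a single format string.
# "%03d" means: an integer, at least 3 characters wide, padded with zeros.

def frame_name(frame_number):
    return "test%03d.jpg" % frame_number

print(frame_name(7))
print(frame_name(42))
print(frame_name(100))

# Padded names sort in frame order, which matters when a movie tool
# puts the frames together alphabetically.
names = [frame_name(f) for f in (100, 7, 42)]
print(sorted(names))
```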
Exercises
Exercise 46: How would we lighten the Bladerunner frame more?
Exercise 47: Under what conditions would it not be worth anything to
do so?
Exercise 48: Try applying a different manipulation, besides lightening,
to frames of a movie.
Chapter 11
Storing Media
Part VIII
Chapter 12
When are programs fast, and when are they slow? And why?
12.1 Complexity
Why are the movie programs so slow, while others are so fast?
Here's a rough way of figuring it out: Count the loops.
Look at the normalization recipe (Recipe 11 (page 76)). Do you see two
loops? Each of them goes through n samples. We'd say that the order of
complexity of this recipe is O(2n). As you'll see, the real speed differences
have little to do with the constants, so we'll often call this O(n).
Now look at the picture processing code. You'll often see two loops, one
working across m pixels and the other up and down n pixels. We'd call that
O(mn). If m and n are near one another, we'd say O(n^2).
Now look at movie processing: l frames, each m by n. That's O(lmn).
And if these are close to one another, O(n^3).
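Counting the loops can be done quite literally. The sketch below does no media processing at all; it only counts how many times the innermost line would run for a sound, a picture, and a movie. The function names and the sizes are made up for illustration.

```python
# Sketch: count how many steps each kind of recipe takes.

def count_sound_steps(n):
    steps = 0
    for sample in range(n):      # first pass, e.g. find the loudest sample
        steps = steps + 1
    for sample in range(n):      # second pass, e.g. scale every sample
        steps = steps + 1
    return steps                 # 2n

def count_picture_steps(m, n):
    steps = 0
    for x in range(m):
        for y in range(n):
            steps = steps + 1
    return steps                 # m * n

def count_movie_steps(frames, m, n):
    steps = 0
    for frame in range(frames):
        steps = steps + count_picture_steps(m, n)
    return steps                 # frames * m * n

print(count_sound_steps(100))
print(count_picture_steps(100, 100))
print(count_movie_steps(100, 100, 100))
```

Going from 100 samples to 100 frames of 100 by 100 pixels takes the count from hundreds to a million, which is why the constant 2 in O(2n) hardly matters by comparison.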
200 CHAPTER 12. HOW FAST CAN WE GET?
With Moore's law, in two years, you can do that in only 1,000 years!
Can we do better? Maybe. Can you be satisfied with less than perfect?
Can we be smarter than checking EVERY combination? That's part of
heuristics, a part of what artificial intelligence researchers do.
Chapter 13
Functional Decomposition
def increaseOneSample(sample):
  setSample(sample,getSample(sample)*2)
We can use that function and apply it to the whole sequence of samples
like this:
>>> file="/Users/guzdial/mediasources/hello.wav"
>>> sound=makeSound(file)
>>> result=map(increaseOneSample,getSamples(sound))
>>> play(sound)
But it turns out that we don't even have to create that extra function.
lambda allows us to create a function without even naming it!
>>> file="/Users/guzdial/mediasources/hello.wav"
>>> sound=makeSound(file)
>>> result=map(lambda s:setSample(s,getSample(s)*2),getSamples(sound))
>>> play(sound)
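The same two styles can be tried on an ordinary list of numbers, with no sound file needed. This is a minimal sketch in which the samples are plain integers and the functions return new values instead of changing a sound in place; double_sample is an illustrative name.

```python
# Sketch: map with a named function, then with a lambda, on plain numbers.

def double_sample(value):
    return value * 2

samples = [1, -2, 3]

# list(...) is needed in Python 3, where map hands back a lazy iterator;
# in Jython and Python 2, map already returns a list.
named_result = list(map(double_sample, samples))
lambda_result = list(map(lambda s: s * 2, samples))

print(named_result)
print(lambda_result)
```

One thing to watch if these recipes are ever moved to Python 3: because map is lazy there, a call that is made only for its side effects (such as changing each sample) does nothing until the result is actually walked through.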
Chapter 14
206 CHAPTER 14. RECURSION
Chapter 15
Remember the modules that we saw? Those are akin to objects. We'll use
dot notation.
sound.writeTo and picture.writeTo are easier than remembering writeSoundTo
and writePictureTo. It's always writeTo.
Both colors and pixels understand getRed. That makes them easier to work with.
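Here is a small sketch of why one shared method name is convenient. SketchSound and SketchPicture are made-up classes for this example, not the classes JES uses, and their writeTo methods only return a description instead of writing a file.

```python
# Sketch: two different kinds of objects that answer the same message.

class SketchSound:
    def __init__(self, name):
        self.name = name

    def writeTo(self, filename):
        return "wrote sound " + self.name + " to " + filename

class SketchPicture:
    def __init__(self, name):
        self.name = name

    def writeTo(self, filename):
        return "wrote picture " + self.name + " to " + filename

# The loop does not need to know which kind of media it is holding.
for media in [SketchSound("hello"), SketchPicture("santa")]:
    print(media.writeTo("copy-of-" + media.name))
```

Each object decides for itself what writeTo means, so the code that calls it stays the same for sounds and pictures.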
208 CHAPTER 15. OBJECTS
Part X
Other Languages
Chapter 16
212 CHAPTER 16. JAVA
asample = s.getSample(i);
if (asample > loudest)
loudest = asample;
if (multiplier == 1)
return;
System.exit(0);
End of Recipe 56
JavaPixel p = src.getPixel(y,x);
//System.out.println("x = " + x );
//System.out.println("y = " + y ); // 149
//System.out.println("red = " + p.getRed() );
int nC = (p.getRed()+p.getGreen()+p.getBlue() )/ 3;
p.setBlue(nC);
p.setRed(nC);
p.setGreen(nC);
src.saveImage("grey.jpg");
End of Recipe 57
Bibliography
[Abelson et al., 1996] Abelson, H., Sussman, G. J., and Sussman, J. (1996).
Structure and Interpretation of Computer Programs, 2nd Edition. MIT
Press, Cambridge, MA.
[Adelson and Soloway, 1985] Adelson, B. and Soloway, E. (1985). The role
of domain experience in software design. IEEE Transactions on Software
Engineering, SE-11(11):1351-1360.
[Boulanger, 2000] Boulanger, R., editor (2000). The CSound Book:
Perspectives in Synthesis, Sound Design, Signal Processing, and Programming.
MIT Press, Cambridge, MA.
[Felleisen et al., 2001] Felleisen, M., Findler, R. B., Flatt, M., and
Krishnamurthi, S. (2001). How to Design Programs: An Introduction to
Programming and Computing. MIT Press, Cambridge, MA.
[Foley et al., 1993] Foley, J. D., Van Dam, A., and Feiner, S. K. (1993).
Introduction to Computer Graphics. Addison Wesley, Reading, MA.
[Guzdial and Rose, 2001] Guzdial, M. and Rose, K., editors (2001). Squeak,
Open Personal Computing for Multimedia. Prentice-Hall, Englewood, NJ.
[Harel and Papert, 1990] Harel, I. and Papert, S. (1990). Software design
as a learning environment. Interactive Learning Environments, 1(1):1-32.
[Harvey, 1997] Harvey, B. (1997). Computer Science Logo Style 2/e Vol. 1:
Symbolic Computing. MIT Press, Cambridge, MA.
[Ingalls et al., 1997] Ingalls, D., Kaehler, T., Maloney, J., Wallace, S., and
Kay, A. (1997). Back to the future: The story of Squeak, a practical
Smalltalk written in itself. In OOPSLA'97 Conference Proceedings, pages
318-326. ACM, Atlanta, GA.
[Roads, 1996] Roads, C. (1996). The Computer Music Tutorial. MIT Press,
Cambridge, MA.
Index
WAV, 28
weights, 133
window (of time), 51
writePictureTo, 124