Preface
All of Programming
Andrew Hilton and Anne Bracy
June 27, 2019
Edition 1
Copyright © 2015–2019 Andrew Hilton and Anne Bracy
All rights reserved. If you have purchased this book from
an authorized source, you may download it to a device that you
own for your own personal use. However, you may not
distribute this book in part or in whole to others (including,
but not limited to, making it available for download via a
website, emailing it to others, or distributing it via physical
media) without the express written permission of the authors.
If you did not purchase this book from an authorized
source, please do so before continuing to use this book.
Cover photograph by Margaret J. Foster, used with permission.
Website: http://aop.cs.cornell.edu/
E-mail: [email protected]
ISBN: 978-0-9967182-1-9
Table of Contents
Preface
I Introduction to Programming in C
1 Introduction
1.1 How to Write a Program
1.2 Algorithms
1.3 Step 1: Work an Example Yourself
1.4 Step 2: Write Down What You Just Did
1.5 Step 3: Generalize Your Steps
1.6 Step 4: Test Your Algorithm
1.7 Some Examples
1.8 Next Steps
1.9 Practice Exercises
2 Reading Code
2.1 Variables
2.2 Expressions
2.3 Functions
2.4 Conditional Statements
2.5 Shorthand
2.6 Loops
2.7 Higher-level Meaning
2.8 Practice Exercises
3 Types
3.1 Hardware Representations
3.2 Basic Data Types
3.3 Printing Redux
3.4 Expressions Have Types
3.5 “Non-Numbers”
3.6 Complex, Custom Data Types
3.7 Practice Exercises
4 Writing Code
4.1 Step 1: Work an Example Yourself
4.2 Step 2: Write Down What You Just Did
4.3 Step 3: Generalize Your Steps
4.4 Step 4: Test Your Algorithm
4.5 Step 5: Translate Your Algorithm to Code
4.6 A Complete Example
4.7 Next Steps: Compiling, Running, Testing
and Debugging
4.8 Practice Exercises
5 Compiling and Running
5.1 The Compilation Process
5.2 Running Your Program
5.3 Compiler Options
5.4 Other Possibilities
5.5 Practice Exercises
6 Fixing Your Code: Testing and Debugging
6.1 Step 6: Test Your Code
6.2 Step 7: Debug Your Code
6.3 Practice Exercises
7 Recursion
7.1 Reading Recursive Code
7.2 Writing Recursive Functions
7.3 Tail Recursion
7.4 Functional Programming
7.5 Mutual Recursion
7.6 Theory
7.7 Practice Exercises
8 Pointers
8.1 Pointer Basics
8.2 A Picture Is Worth a Thousand Words
8.3 swap, Revisited
8.4 Pointers Under the Hood
8.5 Pointers to Sophisticated Types
8.6 Aliasing: Multiple Names for a Box
8.7 Pointer Arithmetic
8.8 Use Memory Checker Tools
8.9 Practice Exercises
9 Arrays
9.1 Motivating Example
9.2 Array Declaration
9.3 Accessing an Array
9.4 Passing Arrays as Parameters
9.5 Writing Code with Arrays
9.6 Sizes of Arrays: size_t
9.7 Practice Exercises
10 Uses of Pointers
10.1 Strings
10.2 Multidimensional Arrays
10.3 Function Pointers
10.4 Security Hazards
10.5 Practice Exercises
11 Interacting with the User and System
11.1 The Operating System
11.2 Command Line Arguments
11.3 Files
11.4 Other Interactions
11.5 Practice Exercises
12 Dynamic Allocation
12.1 malloc
12.2 free
12.3 realloc
12.4 getline
12.5 Practice Exercises
13 Programming in the Large
13.1 Abstraction
13.2 Readability
13.3 Working in Teams
13.4 A Modestly Sized Example
13.5 Even Larger Programs
13.6 Practice Exercises
II C++
14 Transition to C++
14.1 Object-Oriented Programming
14.2 References
14.3 Namespaces
14.4 Function Overloading
14.5 Operator Overloading
14.6 Other Aspects of Switching to C++
14.7 Practice Exercises
15 Object Creation and Destruction
15.1 Object Construction
15.2 Object Destruction
15.3 Object Copying
15.4 Unnamed Temporaries
15.5 Practice Exercises
16 Strings and IO Revisited
16.1 Strings
16.2 Output
16.3 Input
16.4 Other Streams
16.5 Practice Exercises
17 Templates
17.1 Templated Functions
17.2 Templated Classes
17.3 Template Rules
17.4 The Standard Template Library
17.5 Practice Exercises
18 Inheritance
18.1 Another Conceptual Example
18.2 Writing Classes with Inheritance
18.3 Construction and Destruction
18.4 Subtype Polymorphism
18.5 Method Overriding
18.6 Abstract Methods and Classes
18.7 Inheritance and Templates
18.8 Planning Your Inheritance Hierarchy
18.9 Practice Exercises
19 Error Handling and Exceptions
19.1 C-Style Error Handling
19.2 C++-style: Exceptions
19.3 Executing Code with Exceptions
19.4 Exceptions as Part of a Function’s Interface
19.5 Exception Corner Cases
19.6 Using Exceptions Properly
19.7 Exception Safety
19.8 Real C++
19.9 Practice Exercises
III Data Structures and Algorithms
20 Introduction To Algorithms and Data Structures
20.1 Big-Oh Notation
20.2 Abstract Data Types
20.3 Queues
20.4 Stacks
20.5 Sets
20.6 Maps
20.7 ADTs and Abstract Classes
20.8 Practice Exercises
21 Linked Lists
21.1 Linked List Basic Operations
21.2 Insert in Sorted Order
21.3 Removing from a List
21.4 Iterators
21.5 Uses for ADTs
21.6 STL List
21.7 Practice Exercises
22 Binary Search Trees
22.1 Binary Search Trees Concepts
22.2 Adding to a Binary Search Tree
22.3 Searching a Binary Tree
22.4 Removing From a Binary Search Tree
22.5 Tree Traversals
22.6 Practice Exercises
23 Hash Tables
23.1 Hash Table Basics
23.2 Collision Resolution
23.3 Hashing Functions
23.4 Rehashing
23.5 Practice Exercises
24 Heaps and Priority Queues
24.1 Heap Concepts
24.2 Heap Array Implementation
24.3 STL Priority Queue
24.4 Priority Queues Use: Compression
24.5 Practice Exercises
25 Graphs
25.1 Graph Applications
25.2 Graph Implementations
25.3 Graph Searches
25.4 Minimum Spanning Trees
25.5 Other Algorithms
25.6 Practice Exercises
26 Sorting
26.1 Θ(n²) Sorts
26.2 Θ(n lg n) Sorts
26.3 Sorting Tradeoffs
26.4 Sorting Libraries
26.5 Practice Exercises
IV Other Topics
27 Balanced BSTs
27.1 AVL Insertion
27.2 AVL Delete
27.3 Red-Black Insert
27.4 Red-Black Delete
28 Concurrency
28.1 Processes
28.2 Threads
28.3 Synchronization
28.4 Atomic Primitives
28.5 Lock Free Data Structures
28.6 Parallel Programming Idioms
28.7 Amdahl’s Law
28.8 Much More…
29 Advanced Topics in Inheritance
29.1 Object Layout
29.2 Multiple Inheritance
29.3 Virtual Multiple Inheritance
29.4 Mixins
30 Other Languages
30.1 Garbage Collection
30.2 Language Paradigms
30.3 Type Systems
30.4 Parameter Passing
30.5 And More…
31 Other Languages: Java
31.1 Getting Started
31.2 Primitives and Objects
31.3 Object Creation and Destruction
31.4 Inheritance
31.5 Arrays
31.6 Java API
31.7 Exceptions
31.8 Generics
31.9 Other Features
32 Other Languages: Python
32.1 Getting Started
32.2 Basic Types
32.3 Data Structures
32.4 Control: Loops and Whitespace
32.5 Example: Reading from a File
33 Other Languages: SML
33.1 Getting Started
33.2 Basics: Math, Conditionals, Functions, and
Tuples
33.3 Data Types and Data Structures
33.4 Interlude: Compiler Errors
33.5 Higher Order Functions
33.6 Side-effects
33.7 The Module System
33.8 Common Mistakes
33.9 Practice Exercises
34 … And Beyond
V Appendices
A Why Expert Tools
B UNIX Basics
B.1 In the Beginning Was the Command Line
B.2 Getting Help: man and help
B.3 Directories
B.4 Displaying Files
B.5 Moving, Copying, and Deleting
B.6 Pattern Expansion: Globbing and Braces
B.7 Redirection and Pipes
B.8 Searching
B.9 Command Shells
B.10 Scripting
B.11 Environment Variables
B.12 Remote Login: ssh
C Editing: Emacs
C.1 Emacs vocabulary
C.2 Running Emacs
C.3 Files and Buffers
C.4 Cancel and Undo
C.5 Cut, Copy and Paste
C.6 Multiple Buffers
C.7 Search and Replace
C.8 Programming Commands
C.9 Advanced Movement
C.10 Keyboard Macros
C.11 Writing Text or LaTeX
C.12 Configuring and Customizing Emacs
D Other Important Tools
D.1 Build Tool: make
D.2 Debugger: GDB
D.3 Memory Checker: Valgrind’s Memcheck
D.4 Revision Control: Git
D.5 Tools: A Good Investment
E Miscellaneous C and C++ Topics
E.1 Ternary If
E.2 Unions
E.3 The C Preprocessor
E.4 C++ Casting
E.5 Boost Libraries
E.6 C++11 Features
E.7 goto
F Compiler Errors Explained
G Answers to Selected Exercises
G.1 Answers for Chapter 1
G.2 Answers for Chapter 2
G.3 Answers for Chapter 3
G.4 Answers for Chapter 4
G.5 Answers for Chapter 5
G.6 Answers for Chapter 6
G.7 Answers for Chapter 7
G.8 Answers for Chapter 8
G.9 Answers for Chapter 9
G.10 Answers for Chapter 10
G.11 Answers for Chapter 11
G.12 Answers for Chapter 12
G.13 Answers for Chapter 13
G.14 Answers for Chapter 14
G.15 Answers for Chapter 15
G.16 Answers for Chapter 16
G.17 Answers for Chapter 17
G.18 Answers for Chapter 18
G.19 Answers for Chapter 19
G.20 Answers for Chapter 20
G.21 Answers for Chapter 21
G.22 Answers for Chapter 22
G.23 Answers for Chapter 23
G.24 Answers for Chapter 24
G.25 Answers for Chapter 25
G.26 Answers for Chapter 26
G.27 Answers for Chapter 33
List of Figures
1.1 A high level view of writing a program
1.2 The first five steps of writing a program.
2.1 A variable declaration.
2.2 An assignment statement.
2.3 A function declaration.
2.4 Variables organized into frames.
2.5 Depiction of the scope of parameters and variables.
2.6 Scope Rules
2.7 Syntax of if/else.
2.8 Syntax of switch/case.
2.9 Syntax of a while loop.
2.10 Syntax of a do-while loop.
2.11 Syntax of a for loop
3.1 Decimal and binary interpretation of the same pattern
of digits.
3.2 Code, conceptual representation, and actual hardware
representation.
3.3 Basic data types supported in C
3.4 A subset of ASCII number-to-character mappings.
3.5 Examples of chars and ints.
3.6 Floating Point Representation
3.7 Precision
3.8 Printing various data types
3.9 Printing in C.
3.10 How to encode a string as a series of ASCII
characters.
3.11 Visualization, code, and representation of a rectangle
3.12 Various syntactic options for creating struct tags,
types, and instantiating variables.
3.13 Use of Typedef
3.14 Enumerated Types
4.1 Working an example of the rectangle intersection
problem.
4.2 Another instance of the rectangle problem, with the
Cartesian grid removed.
4.3 Rectangles demonstrating problem with algorithm
5.1 A high-level view of the process that GCC takes to
compile a program
6.1 The control flow graph for our example code.
6.2 Four paths through a CFG
6.3 The scientific method.
8.1 Pointer Basics
8.2 Data with Addresses
8.3 Data with Addresses
8.4 Program Memory.
8.5 Depiction of NULL
8.6 Structs and Pointers
8.7 Pointers To Pointers
8.8 Aliasing With Different Types
9.1 Depiction of an Array
10.1 A variable pointing at a string literal.
10.2 String literals are typically placed in a read-only
portion of the static data section.
10.3 String literals versus arrays of chars
10.4 Applying the == operator to two strings just compares
the pointers for equality.
10.5 Assigning pointers does not copy strings
10.6 Illustration of the difference between 12345 and
"12345".
10.7 Layout of a 2D array
10.8 Layout of an array of pointers
10.9 Two declarations of multidimensional arrays of
characters, initialized with strings.
10.10 An array of strings
10.11 Motivation for function pointers
(a) A function that increments all elements of an array.
(b) A function that squares all elements of an array.
(c) A function that takes the absolute value of all elements
of an array.
(d) A function that doubles all elements of an array.
11.1 Conceptual diagram of interaction with OS and
hardware.
11.2 Arguments to main
11.3 Conceptual representation of the result of opening a
file.
12.1 Highlighting the heap.
12.2 The signature of malloc.
12.3 Copying only a pointer.
12.4 A shallow copy.
12.5 A deep copy.
12.6 The signature of free.
12.7 The signature of realloc.
12.8 The signature of getline.
13.1 Curly brace placement
(a) Java style.
(b) One True Brace Style (1TBS).
(c) Allman style.
(d) GNU style.
14.1 Summary of the syntax for declaring a class.
14.2 Illustration of member visibility for classes and
structs.
18.1 A diagram of the example BankAccount inheritance
hierarchy.
18.2 Depiction of the contents of main’s frame when we
call the sayHi methods.
18.3 Naïve class relationships for our hypothetical game.
18.4 An improved version of our class hierarchy, using
inheritance.
18.5 An even better inheritance hierarchy.
19.1 Overview of exception guarantees.
20.1 Some examples of functions and Big Oh.
20.2 A FIFO queue
20.3 A LIFO stack
21.1 Example linked lists.
22.1 Binary trees vs Binary search trees
22.2 Three graphs that are not trees.
22.3 Three trees with different properties.
22.4 Binary search tree properties
22.5 Abstract syntax tree
22.6 Working an example ourselves of how to add to a
binary search tree.
22.7 Considerations of other potential places to add 15.
22.8 Our example binary search tree with 15 added.
22.9 An example binary tree to use for discussing deletion.
22.10 Two options for removing 19 from the tree.
24.1 Two heaps with the same data but opposite ordering
rules.
25.1 A task dependency graph
25.2 Graph coloring
25.3 Adjacency matrix representation of a graph
25.4 Adjacency list representation of a graph
25.5 Illustration of Dijkstra’s Algorithm.
26.1 Runtime comparison of various sorts.
26.2 Runtime of sorts on larger data
sizes.
27.1 An example red-black tree.
27.2 Red-black tree rotation and recoloring.
28.1 Speedups for our image smoothing algorithm.
28.2 An example of pipeline parallelism.
29.1 Example object layouts with inheritance.
29.2 Example object layouts with vtables.
29.3 The conundrum with a first attempt at multiple
inheritance. We make two attempts to lay out class C but in
both cases fail to respect the subobject rule for one parent.
29.4 Correct layout with multiple inheritance. The A
subobject is shown in blue and the B subobject is shown in
green.
29.5 Diagram showing call to an overridden function via a
non-primary parent.
29.6 Object layout for ImageButton. Observe that it has
two distinct GuiComponent subobjects: one inherited from
Button (shaded blue), and one inherited from
ImageDisplay (shaded green).
29.7 The inheritance hierarchy we have made (left) and
the inheritance hierarchy that we want (right).
29.8 Object layout with virtual inheritance. Classes that
inherit virtually have the offset from the start of the object
to their parent in their vtable. Classes that multiply inherit
can then have one subobject of the virtually inherited
ancestor.
29.9 A complex inheritance hierarchy to demonstrate the
rules for creating and destroying virtually inherited
objects. Blue edges indicate virtual inheritance.
A.1 Tools for novices vs tools for experts
B.1 The command prompt.
B.2 Display of man page for printf.
B.3 Examples of the cd and ls commands.
C.1 Sample .emacs file.
E.1 An illustration of the overlap of fields in a union.
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Preface
Programming is an increasingly popular skill—whether for
those who want to be professional software developers, or
those who want to write programs to analyze and manipulate
data or automate tasks in some other field. Programming
course enrollment is soaring, and a plethora of online options
are springing up to provide instruction in the field. However,
experience shows that many courses (of either form) which aim
to teach introductory programming do not actually teach how to
program.
In writing this book, we set out to provide a platform for
instructors to design courses which properly place their focus
on the core fundamentals of programming, or to let a motivated
student learn these skills independently. We do not waste any
time or space on the specifics of the buzzword-of-the-day
technology. Instead, we dedicate all of our efforts to core
concepts and techniques of programming. A student who
masters the material in this book will not just be a competent C
programmer, but also a competent programmer, as the title of
the book would suggest.1
Some people may question the language choice of this
book: “Why C?” “Isn’t C hard for beginners?” “Everyone loves
language X, why not do this in X?” At some level, the answer
is “it does not matter.” We are teaching programming, not a
particular language. We just need a language so that students
can implement their programs in a way that a computer can
understand. We note that we do briefly discuss other languages
in four chapters, beginning with Chapter 30.
On another level, C and C++ are excellent choices for a
variety of reasons. Perhaps most importantly, we can introduce
ideas in a natural and logical fashion without “just do this
because you have to, but cannot understand it yet”—such a
practice is harmful to teaching any programmer, who should
fully understand any code she writes. Furthermore, C and C++
provide a more complete picture of programming concepts.
Many other language choices would require omitting some
core concept which that language does not have. Such an
omission would require the student to learn an entirely new
concept to switch languages. As the “icing on the cake,” C and
C++ have a long history, and still have a widespread (and
well-paid!) presence in industry.
We note that this book is quite large: over 30 chapters, and
six appendices, spanning over 700 pages and 7.5 hours of
video. Covering all of this material in a single semester is quite
an aggressive pace—approximately one chapter per class day.
Such a pace is possible, but requires heavily motivated students
who are willing to put in significant effort. Generally that pace
would only be appropriate for a Master’s-level “ramp-up” course
for students switching disciplines from one with no
programming background into one where many other classes
expect near-professional level programming.
For an undergraduate course, a more appropriate pace
would be to use Part I (Introduction to Programming in C) as a
“CS 1” course (likely with heavy reference to the appendices
on programmers’ tools and editors). Such a pace would result
in approximately one chapter per week. A “CS 2” course could
then be constructed from Parts II (C++) and III (Data
Structures and Algorithms) also at approximately one chapter
per week. Part IV’s material could be placed in later courses
that are intended only for more serious programmers.
We further recommend using this book in a “flipped
classroom” model—in which students’ primary intake of
material is done out of class (i.e., by reading this book), and in
class time is spent on activities. These activities should
primarily be formed from programming, or programming-
related (e.g., executing code by hand) topics. Students can then
perform the most important tasks—doing programming—with
expert help and guidance available.
We provide some questions and problems at the end of
each chapter (in Parts I, II, and III) to help you check your
understanding of the material. Some of these problems ask you
to explain the basic concepts in the chapter. Others ask you to
perform the skills you should be learning (reading and writing
code). If you are teaching a class with this book, we encourage
you to create some larger, more sophisticated problems for
students to do in class—possibly providing some infrastructure
to allow students to write “cool and exciting” programs.
Some practice problems have sample answers in the back of
the book. Such problems have hyperlinks to the answer, and the
answer has a hyperlink back to the problem.
We will also note that this book has embedded videos,
which are an integral part of its design. You should watch the
videos as you work your way through this book, as they convey
important material—a lot of things in programming happen
actively, and are much better conveyed to you, the learner,
through animations rather than static figures. Videos should
look generally like this:
You will notice that the video has relatively standard play
controls. You can click the video to play/pause it, as well as use
the time-position slider at the bottom to jump backwards or
forwards in the video. If you do not understand something, you
may want to jump back and rewatch it!
Finally, we will note that this is the second version
(edition 1) of the book. We have worked to remove a variety of
typos, and make other improvements relative to the first
version (edition 0). However, we would be surprised if there
are not other typos or issues lurking somewhere in the book. If
you discover a problem, please check our website
http://aop.cs.cornell.edu/ to see if we are already aware
of it. If not, please report the problem to us there. We will post
a correction and fix it in the next edition. If you need to contact
us, you can email us at
[email protected].
We hope you enjoy the book and learn a lot—happy
hacking!
Acknowledgements
We would like to take a moment to thank the many people who
made this book possible. We are both deeply grateful to the
many wonderful teachers of computer science—from high
school through graduate school—who both educated us and
inspired us. It is one thing to convey knowledge. It is quite
another thing to ignite a love of computer science and teaching
that motivates one to write a textbook. We have both also
enjoyed teaching bright and motivated students. They have also
played an important motivating and refining role in this
undertaking.
We also want to thank our respective spouses, Margaret
and Kilian for their love and support during the lengthy and
arduous process of writing a book of this size. It is hard to
imagine finishing this book without them.
A special thank you goes to Genevieve Lipp, for many of
the revisions made between Edition 0 and Edition 1.
Finally, we would like to give a huge thank you to the
authors of LaTeXML, without which we would not have been
able to convert this book to EPUB format.
We would also like to thank those who submitted
suggestions for changes from Edition 0: John Caughie, Angel
Perez, Wanxin Yuan, Yuchen Zhou, Kevin Liang, Leo Fang,
Tai-Lin Wu, Faustine Li, Margaret Foster, Jeff Vaughan, Ming-
Tso Wei, Andrew Bihl, Xi (Ronnie) Chen, Huimin He, Wiwi
Samsul, and Arvind Roshaan.
Part I Introduction to
Programming in C
1 Introduction
2 Reading Code
3 Types
4 Writing Code
5 Compiling and Running
6 Fixing Your Code: Testing and Debugging
7 Recursion
8 Pointers
9 Arrays
10 Uses of Pointers
11 Interacting with the User and System
12 Dynamic Allocation
13 Programming in the Large
Chapter 1
Introduction
Programming is an increasingly important skill, whether you
aspire to a career in software development, or in other fields.
By reading this text and practicing these skills, you will learn
how to program, starting from no prior knowledge. Even if you
have some prior knowledge, you may wish to start at the
beginning of the book, as the approach here may be quite
different than what you have seen before.
Many new programmers (and some courses or texts) place
undue focus on language syntax and language features—
aiming to become experts in whatever language they have
heard is popular on the job market. While syntax is important
—the computer cannot understand your program if you do not
write it properly in a programming language—it is not the heart
of programming. In fact, the key aspect of programming is
metacognition—thinking about how you think. Specifically,
programming is fundamentally about figuring out how to solve
a class of problems, and writing down the algorithm—a set of
steps to solve any problem in that class—in a clear and
unambiguous manner. Programming languages (such as C,
C++, Java, Scheme, or SML) figure into this equation primarily
as a means to provide a clearly defined manner to write down
the algorithm. Natural language, such as English, is too
ambiguous and complicated for this purpose. A good
programmer should be able to pick up a new language quite
quickly. The key skills of programming are universal—learning
a new language is largely just a matter of learning its syntax.
A natural consequence of being overly syntax-focused is
that many novice programmers attempt to dive right into
writing the code (in the programming language) as the first
step. However, writing the code is actually a much later step in
the process. A good programmer will plan first and write
second, possibly breaking down a large programming task into
several smaller tasks in the process. Even when cautioned to
plan first and code second, many programming students ignore
the advice—after all, why “waste” 30 minutes planning when
you are time-crunched from all the work you have to do? This
tradeoff, however, presents a false economy—30 minutes
planning could save hours of trying to make the code work
properly. Well-planned code is not only more likely to be
correct (or at least closer to correct), but is also easier to
understand—and thus to fix.
To try to better understand the importance of planning
before you write, imagine an analogy to building a house or
skyscraper. If you were tasked with building a skyscraper,
would you break ground and start building right away, figuring
out how to design the building as you go? Hopefully not.
Instead, you (or an architect) would design blueprints for the
building first. These blueprints would be iteratively refined
until they meet everyone’s specifications—they must meet the
requirements of the building’s owner, as well as be possible to
build reasonably. Once the blueprints are completed, they must
be approved by the local government. Actual construction only
begins once the plans are fully completed. Programming should
be done in a similar manner—come up with a complete plan
(algorithm) first and build (implement in code) second.
We said that the heart of programming is to figure out how
to solve a class of problems—not just one particular problem.
The distinction here is best explained by an example. Consider
the task of figuring out if a particular number (e.g., 7) is prime.
With sufficient knowledge of math (i.e., the definition of a
prime number and the rules of division), one can solve this
problem—determining that 7 is in fact prime. However, a
programming problem typically looks at a more general class
of problems. We would typically not write a program to
determine if 7 is prime, but rather a program that, given a
number N, determines if N is prime. Once we have an
algorithm for this general class of problems, we can have the
computer solve any particular instance of the problem for us.
When we examine a class of problems, we have
parameters, which tell us which particular problem in the class
we are solving. In the previous example, the class of problems
is parameterized by N—the number we want to test for
primality. To develop an algorithm for this class of problems,
we must account for all possible legal values of the parameters.
As we will see later (in Chapter 3), programming languages let
us restrict what type of information a parameter can represent,
to limit the legal values to those that make sense in the context
of the problem. For primality testing, we would want our
parameter to be restricted such that it can only hold integer
numbers. It would not make any sense to check if letters,
words, or files are prime.
To write a program that takes any number N and
determines if N is prime, we must first figure out the
algorithm for this class of problems. As we said before, if we
attack the problem by blindly writing code, we will end up with
a mess—much like constructing a skyscraper with no plan.
Coming up with the appropriate algorithm for a class of
problems is a challenging task, and typically requires
significant work and thought.
1.1 How to Write a Program
Figure 1.1: A high level view of writing a program
Figure 1.1 shows a high-level overview of the
programming process. A programmer starts by devising the
algorithm for the task she is trying to solve. We will split this
planning phase into four steps in the process of writing a
program, which we will discuss in more detail shortly. At the
end of these four steps, the programmer should have a
complete plan for the task at hand—and be convinced that the
plan is a good one.
After devising a proper algorithm, she is ready for Step 5
of the programming process: translating her plan into code in
the programming language she is using for her current project.
Initially, translation to code will go slowly, as you will be
unfamiliar with the syntax, likely needing to look up the
specific details often. However, even if slow, it should be fairly
straightforward. You already devised the plan, so you should
have done all the actual problem-solving tasks already. Your
algorithm may have some complex steps, but that is fine. As we
will see later, whenever your algorithm calls for a step that is
too complicated to be simply translated into a few lines of
code, you should turn that step into its own separate
programming task and repeat the programming process on it.
We will discuss translation to code in much more detail
in Chapter 4, as well as how to turn the code into something
that the computer can run in Chapter 5.
Once the algorithm is implemented in code, the
programmer must test her code, which is Step 6 of the
programming process. By testing the program, the programmer
tries to uncover errors in her algorithm or implementation. If
the programmer finds errors in her program, she debugs the
program (Step 7)—finding out the cause of the error, and fixing
it. The programmer may need to return to the algorithm design
steps (if the error lies in the algorithm) or to translation to code
(if the error lies in the implementation) to correct the error. The
programmer then repeats all of the later steps.
At some point, the programmer completes enough test
cases with no errors to become convinced that her program is
correct. Note that we said that the programmer becomes
convinced that the program is correct. No amount of testing can
guarantee that the program is correct. Instead, more testing
increases the programmer’s confidence that the code is correct.
When the programmer is convinced her code is correct, she has
successfully completed the task at hand. We will discuss testing
and debugging in much more detail in Chapter 6.
1.2 Algorithms
As we discussed earlier, an algorithm is a clear set of steps to
solve any problem in a particular class. Typically, algorithms
have at least one parameter; however, algorithms with no
parameters exist—they are simply restricted to one specific
problem, rather than a more general class. We can discuss and
think about algorithms in the absence of any particular
knowledge of computers—a good algorithm can not only be
translated into code, but could also be executed by a person
with no particular knowledge of the problem at hand.
Algorithms that computers work on deal with numbers—
in fact Chapter 3 will discuss the concept of “Everything Is a
Number,” which is a key principle in programming. Computers
can only compute on numbers; however, Chapter 3 will also
illustrate how we can represent a variety of useful things
(letters, words, images, videos, sound, etc.) as numbers so that
computers can compute on them. As a simple example of an
algorithm that works with numbers, we might consider the
following algorithm (which takes one parameter N, a non-
negative integer):
Given a non-negative integer N:
  Make a variable called x, and set it equal to (N + 2).
  Count from 0 to N (include both ends), and for each
  number (call it i) that you count:
    Write down the value of (x * i).
    Update x to be equal to (x + i * N).
  When you finish counting, write down the value of x.
For any non-negative integer N that I give you, you
should be able to execute these steps. If you do these steps for
N = 2, you should come up with the sequence of numbers 0
4 12 10. These steps are unambiguous as to what should
happen. It is possible that you get the wrong answer if you
misunderstand the directions or make arithmetic mistakes, but
otherwise, everyone who does them for a particular value of
should get the same answer. We will also note that this
algorithm can be converted into any programming language
quite easily—all that is needed is to know the basic syntax of
the particular language you want.
You may wonder why we would want an algorithm that
generates this particular sequence of numbers. In this case, it is
just a contrived algorithm to show as a simple introductory
example. In reality, we are going to devise algorithms that
solve some particular problem. However, devising the
algorithm for a problem takes some significant work, and this
will be the focus of discussion for the rest of the chapter.
Even though computers can only work with numbers, we
can envision algorithms that might be executed by humans who
can work on a variety of things. For example, we might write
algorithms that operate on physical objects, such as LEGO
bricks or food. Even though such things would be difficult to
implement on a computer (we would need the computer to
control a robot to actually interact with the physical world),
they are still instructive, as the fundamental algorithmic design
principles are the same.
One exercise done at the start of some introductory
programming courses is to have the students write down
directions to make a peanut butter and jelly sandwich. The
instructor then executes the algorithms, which are often
imprecise and ambiguous. The instructor takes the most
comical interpretation of the instructions to illustrate that what
the students wrote did not actually describe what they meant.
This exercise underscores an important point—you must
specify exactly what you want the computer to do. The
computer does not “know what you mean” when you write
something vague, nor can it figure out an “etc.” Instead, you
must be able to describe exactly what you want to do in a step-
by-step fashion. Precisely describing the exact steps to perform
a specific task is somewhat tricky, as we are used to people
implicitly understanding details we omit. The computer will
not do that for you (in any programming language).
Even though the “sandwich algorithm” exercise makes an
important point about precisely describing the steps you want
the computer to perform, it falls short of truly illustrating the
hardest part of designing an algorithm. This algorithm has no
parameters, so it just describes how to solve one particular
problem (making a peanut butter and jelly sandwich). Real
programming problems (typically) involve algorithms that take
parameters. A more appropriate problem might be “Write an
algorithm that takes a list of things you want in a sandwich and
describes how to make the sandwich.”
Such a problem is much more complex but illustrates
many concepts involved in devising a real algorithm. First, our
algorithm cannot take a list of just anything to include in the
sandwich—it really will only work with certain types of things,
namely food. We would not expect our algorithm to be able to
make us a “car, skyscraper, airplane” sandwich. These items
are all the wrong type. We will learn more about types in
programming in Chapter 3.
Our algorithm may also have to deal with error cases.
Even if we specify the correct type of inputs, the particular
values may be impossible to operate on correctly. For example,
“chicken breast” is food, but if the chicken breast has not been
cooked yet, we should not try to make a sandwich out of it.
Another error case in our sandwich creation algorithm might be
if we specify too much food to go inside the sandwich (how do
you make a sandwich with an entire turkey, 40 pounds of
carrots, and 3 gallons of ice cream?). Of course, if we were
writing this sandwich algorithm for humans, we could ignore
this craziness because humans have “common sense”—
however, computers do not.
Even if we ignore all of the error cases, our general
algorithm is not as simple as just stacking up the ingredients on
top of bread in the order they appear in the input. For example,
we might have an input of “chicken, mustard, spinach,
tomatoes.” Here, we probably want to spread the mustard on
the bread first, then place the other ingredients on it (hopefully
in an order that makes the most stable sandwich).
It would seem that writing a correct algorithm to make a
sandwich from an arbitrary list of ingredients is quite a
complex task. Even if we did not want to implement that
algorithm in code, but rather have it be properly executed by a
person with no common sense (or a professor with a comedic
disregard for common sense), this task is quite challenging to
do correctly. How could we go about this task and hope to get a
good algorithm?
The wrong way to write an algorithm is to just throw some
stuff on the page, and then try to straighten it out later. Imagine
if we approached our sandwich example by writing down some
steps and having someone (with no common sense) try them
out. After the kitchen catches on fire, we try to go in and figure
out what went wrong. We then tweak the steps, and try again.
This time, the kitchen explodes instead. We repeat this process
until we finally get something that resembles a sandwich, and
the house did not burn down.
The previous paragraph may sound silly, but this is
exactly how many novice (and intermediate) programmers
approach programming tasks. They jump right into writing
code (No time to plan! Busy schedule!), and it inevitably does
not work. They then pour countless hours into trying to fix the
code, even though they do not have a clear plan for what it is
supposed to do. As they “fix” the code, it becomes a larger,
more tangled mess. Eventually, the program kind-of-sort-of
works, and they call it good enough.
Figure 1.2: The first five steps of writing a program.
Instead, you should devise an algorithm in a disciplined
fashion. Figure 1.2 shows how you should approach designing
your algorithm. We will spend the next few sections discussing
each of these steps in detail. However, note that “translate to
code” comes only after you have an algorithm that you have
tested by hand—giving you some confidence that your plan is
solid before you build on it.
If you plan well enough and translate it correctly, your
code will just work the first time. If it does not work the first
time, you at least have a solid plan of what the code should be
doing to guide your debugging.
1.3 Step 1: Work an Example Yourself
The first step in trying to design an algorithm is to work at least
one instance of the problem—picking specific values for each
parameter—yourself (by hand). Often this step will involve
drawing a diagram of the problem at hand, in order to work it
precisely. The more precisely you can perform this problem
(including the more precisely you can draw a diagram of the
situation if applicable), the easier the remainder of our steps
will be. A good example of the sort of picture you might draw
would be the diagrams drawn in many science classes
(especially physics classes). Figure 1.2 shows multiple copies
of the box for this step layered one on top of the other, as you
may need to perform this step multiple times to generalize the
algorithm properly.
One of the examples of an algorithm that we mentioned
early in this chapter was determining if a number is prime. If
you were trying to write a function to determine if a number is
prime, your first step would be to pick a number and figure out
if it is prime. Just saying “ok, I know 7 is prime,” is not of
much use—you just used a fact you know and did not actually
work out the problem. For a problem such as this one, which
has a “yes or no” answer, we probably want to work at least
one example that comes up with a “yes” answer, and one that
comes up with a “no” answer.
Another example would be if we wanted to write a
program to compute x raised to the power y (x^y). To do
Step 1, we would pick particular values for x and y, and work
them by hand. We might try x = 3 and y = 4, getting an
answer of 81.
If you get stuck at this step, it typically means one of two
things. The first case is that the problem is ill-specified—it is
not clear what you are supposed to do. In such a situation, you
must resolve how the problem should be solved before
proceeding. In the case of a classroom setting, this resolution
may require asking your professor or TA for more details. In an
industrial setting, asking your technical lead or customer may
be required. If you are solving a problem of your own creation,
you may need to think harder about what the right answers
should be and refine your definition of the problem.
The second case where Step 1 is difficult is when you lack
domain knowledge—the knowledge of the particular field or
discipline the problem deals with. In our primality example, if
you did not remember the definition of a prime number, that
would be an example of lacking domain knowledge—the
problem domain is mathematics, and you are lacking in math
knowledge. No amount of programming expertise nor effort
(“working harder”) will overcome this lack of domain
knowledge. Instead, you must consult a source of domain
expertise—a math textbook, website, or expert. Once you have
the correct domain knowledge, you can proceed with solving
your instance of the problem. Note that domain knowledge
may come from domains other than math. It can come from
any field, as programming is useful for processing any sort of
information.
Sometimes, domain knowledge may come from particular
fields of computer science or engineering. For example, if you
intend to write a program that determines the meaning of
English text, the relevant domain field is actually a sub-field of
computer science, called Natural Language Processing. Here
the domain knowledge would be the specific techniques
developed to write programs that deal with natural language. A
source of domain knowledge on English (an English professor
or textbook) is unlikely to contain such information.
1.4 Step 2: Write Down What You Just
Did
For this step, you must think about what you did to solve the
problem and write down the steps to solve that particular
instance. Another way to think about this step is to write down
a clear set of instructions that anyone else could follow to
reproduce your answer for the particular problem instance that
you just solved. If you do multiple instances in Step 1, you will
repeat Step 2 multiple times as well, once for each instance you
did in Step 1. If an instruction is somewhat complex, that is all
right, as long as the instruction has a clear meaning—later, we
will turn these complex steps into their own programming
problems, which we will solve separately.
The difficult part of Step 2 is thinking about exactly what
you did to accomplish the problem. The difficulty here is that it
is very easy to mentally gloss over small details, “easy” steps,
or things you do implicitly. This difficulty is best illustrated by
the peanut butter and jelly exercise we mentioned earlier.
Implicit assumptions about what to do or relying on common
sense lead to imprecise or omitted steps. The computer will not
fill in any steps you omit, thus you must be careful to think
through all the details.
Returning to our example of computing x to the y, we
might write down the following steps for x = 3 and y = 4:
I multiplied 3 by 3.
I got 9.
I multiplied 3 by 9.
I got 27.
I multiplied 3 by 27.
I got 81.
81 was my answer.
The steps are very precise and leave nothing to
guesswork. Anyone who can perform basic arithmetic can follow
these steps to get the right answer. Computers are very good at
arithmetic, so none of these steps is even complex enough to
require splitting into a sub-problem.
1.5 Step 3: Generalize Your Steps
Having solved one or more problems from the class we are
interested in and written down the particular steps we executed
to solve them, we are ready to try to generalize those steps into
an algorithm. In our Step 2 steps, we solve particular instances,
but now we need to find the pattern that allows us to solve the
whole class. This generalization typically requires two
activities. First, we must take particular values we used and
replace them with mathematical expressions of the parameters.
Looking at our Step 2 steps for computing 3 to the 4th, we would see
that we are always multiplying 3 by something in each step. In
the more general case, we will not always use 3—we are using
3 specifically because it is the value that we picked for x. We
can start to generalize this by replacing this occurrence of 3
with x (note that we change to the imperative mood for Step 3,
since we are moving from a description to instructions):
Multiply x by 3.
You get 9.
Multiply x by 9.
You get 27.
Multiply x by 27.
You get 81.
81 is your answer.
The second common way to generalize steps is to find
repetition—the same step repeated over and over. Often the
number of times that the pattern repeats will depend on the
parameters. We must generalize how many times to do the
steps, as well as what the steps are. Sometimes, we may find
steps that are almost repetitive, in which case we may need to
adjust our steps to make them exactly repetitive. In our
example, our multiplication steps are almost repetitive—both
multiply by “something,” but that “something” changes (3
then 9 then 27). Examining the steps in more detail, we will see
that the “something” we multiply is the answer from the
previous step. We can then give it a name (and an initial value)
to make all of these steps the same:
Start with n = 3.
n = Multiply x by n.
n = Multiply x by n.
n = Multiply x by n.
n is your answer.
Now, we have the same exact step repeated three times.
We can now contemplate how many times this step repeats as a
function of x and/or y. We must be careful not to jump to the
conclusion that it repeats x times because x = 3—that is
just a coincidence in this case. In this case, it repeats y - 1
times. The reason for this is that we need to multiply together
y copies of x, and we already have one in n at the start, so we need
y - 1 more. This would lead to the following generalized
steps:
Start with n = 3.
Count up from 1 to y-1 (inclusive), and for each
number you count,
n = Multiply x by n.
n is your answer.
We need to make one more generalization of a specific
value to a function of the parameters. We start with n = 3;
however, we would not always want to start with 3. In the
general case, we would want to start with x:
Start with n = x.
Count up from 1 to y-1 (inclusive), and for each
number you count,
n = Multiply x by n.
n is your answer.
Sometimes you may find it difficult to see the pattern,
making it hard to generalize the steps. When this happens,
returning to Steps 1 and 2 may help. Doing more instances of
the problem will provide more information for you to consider,
possibly giving you insight into the patterns of your algorithm.
1.6 Step 4: Test Your Algorithm
After Step 3, we have an algorithm that we think is right.
However, it is entirely possible that we have messed up along
the way. The primary purpose of Step 4 is to ensure our steps
are actually right before we proceed. To accomplish this, we
test our algorithm with different values of the parameters than
the ones we used to design our algorithm. We execute our
algorithm by hand and compare the answer it obtains to the
right answer. If they differ, then we know our algorithm is
wrong. The more test cases (values of parameters) we use, the
more confident we can become that our algorithm is correct.
Unfortunately, it is impossible to ensure that our algorithm is
correct by testing. The only way to be completely sure that
your algorithm is correct is to formally prove its correctness
(using a mathematical proof), which is beyond the scope of this
textbook.
One common type of mistake is misgeneralizing in Step 3.
As we just discussed, one might think that the steps repeated x
times because x was 3 and the steps repeated 3 times. If we
had written that down in Step 3, our algorithm would only
work when x = y - 1; otherwise we would count the
wrong number of times and get the wrong answer. If that were
the case, we would hopefully detect the problem by testing our
algorithm by hand in Step 4. When we detect such a problem,
we must go back and re-examine the generalizations we made
in Step 3. Often, this is best accomplished by returning to Steps
1 and 2 for whatever test case exposed the problem. Redoing
Steps 1 and 2 will give you a concrete set of steps to generalize
differently. You can then find where the generalization you
came up with before is wrong, and revise it accordingly.
Another common type of mistake is that there are cases
we did not consider in designing our algorithm. In fact, in our
example, we did not consider what happens when y = 0,
and our algorithm handles this case incorrectly. If you execute
the algorithm by hand with x = 2, y = 0, you should get
2^0 = 1; however, you will get an answer of 2. Specifically,
you will start with n = x = 2. We would then try to count
up from 1 to y - 1 = -1, of which there are no numbers,
so we would be done counting right away. We would then give
back n (which is 2) as our answer.
To fix our algorithm, we would go back and revisit Steps
1 and 2 for the case that failed (x = 2, y = 0). This case is
a bit tricky since we just know that the answer is 1 without
doing any work (x^0 = 1 for any x). The fact that the answer
requires no work makes Step 2 a little different—we just give
an answer of 1. While this simplicity may seem nice, it actually
makes it a little more difficult to incorporate it into our
generalized steps. We might be tempted to write generalized
steps like these:
If y is 0, then:
1 is your answer.
Otherwise:
Start with n = x.
Count up from 1 to y-1 (inclusive), and for
each number you count,
n = Multiply x by n.
n is your answer.
These steps check explicitly for the case that gave us a
problem (y = 0), give the right answer for that case, then
perform the more general algorithm. For some problems, there
may be corner cases that require this sort of special attention.
However, for this problem, we can do better. Note that if you
were unable to see the better solution and were to take the
above approach, it is not wrong per se, but it is not the best
solution.
Instead, a better approach would be to realize that if we
count no times, we need an answer of 1, so we should start n
at 1 instead of at x. In doing so, we need to count 1 more time
(to y instead of to y - 1)—to multiply by x one more time:
Start with n = 1.
Count up from 1 to y (inclusive), and for each
number you count,
n = Multiply x by n.
n is your answer.
Whenever we detect problems with our algorithm in Step
4, we typically want to return to Steps 1 and 2 to get more
information to generalize from. Sometimes, we may see the
problem right away (e.g., if we made a trivial arithmetic
mistake, or if executing the problematic test case by hand gives
us insight into the correct generalization). If we see how to fix
the problem, it is fine to fix it right away without redoing Steps
1 and 2, but if you are stuck, you should redo those steps until
you find a solution. Whatever approach you take to fixing your
algorithm, you should re-test it with all the test cases you have
already used, as well as some new ones.
Determining good test cases is an important skill that
improves with practice. For testing in Step 4, you will want to
test with cases that at least produce a few different answers
(e.g., if your algorithm has a “yes” or “no” answer, you should
test with parameters that produce both “yes” and “no”). You
should also test any corner cases—cases where the behavior
may be different from the more general cases. Whenever you
have conditional decisions (including limits on where to count),
you should test potential corner cases right around the
boundaries of these conditions. For example, if your algorithm
makes a decision based on whether or not a value is less than
some bound, you might want to test with a value just below the
bound, the bound itself, and a value just above it. You can limit
your “pencil and paper” testing somewhat, since you will do
more testing on the actual code once you have written it.
1.7 Some Examples
Having learned the basic steps of designing an algorithm, it is
useful to see several examples of them in action. We are going
to work through four examples in a variety of forms. For two of
the examples, we will work from a problem statement (that is, a
description of what the algorithm should do). For the other two
examples, we will work from a set of examples that illustrate
the pattern. Writing algorithms from either starting point is a
useful skill for programmers, and the two skills are tightly
interlinked—in both cases, we are trying to find the general
solution or pattern, and write it down clearly.
1.7.1 A Numerical Sequence
For our first example, we will find the pattern from a set of
given examples:
The numbers in the table below are the result of executing
an algorithm, which has one parameter N, which must be a
non-negative integer, and produces sequences of integers as
outputs. For values of N from 0 to 5, the algorithm produces
the following sequences of numbers as outputs:
N   Output
0   0
1   1 -1
2   4 2 0
3   9 7 5 3
4   16 14 12 10 8
5   25 23 21 19 17 15
We start with this Numerical Sequence example, since it is
the simplest of the four, and the key skill—being able to
analyze sequences of numbers to find patterns—is required for
pretty much any other algorithm. In many algorithms you will
design, you will find a sequence of numbers (after all:
Everything Is a Number), in which you need to find the pattern
to describe the generalization.
Step 1: Work An Example Yourself. Unlike working a
problem from a description, we have several examples already
given to us—6 examples are already worked, so we have 6
“Step 1”s already done for us. However, it is still useful to walk
through a couple in detail to have them in your head. Even just
copying the numbers down directly helps bring them into your
active memory.
Step 2: Write Down What You Just Did. In this
particular case, there really isn’t anything more to what you did
than just writing down the sequence of numbers. That is, for
N = 1, N = 2, and N = 3, we might write down:
For N = 1:
I wrote down 1.
I wrote down -1.
For N = 2:
I wrote down 4.
I wrote down 2.
I wrote down 0.
For N = 3:
I wrote down 9.
I wrote down 7.
I wrote down 5.
I wrote down 3.
Step 3: Generalize Your Steps. We clearly have
repetition in these steps (we are, after all, writing down a
sequence of numbers), but a lot of things depend on N. For
each value of N, we are writing down a different number of
numbers in the sequence. We might first figure out how many
numbers we will write down for a given value of N. If the
pattern is not immediately obvious, it may help to make a table
like this one:
N How many numbers in the sequence?
0 1
1 2
2 3
3 4
4 5
5 6
Now we can see that each sequence has N + 1 numbers
in it. We therefore want to repeat a step of the form “Write
down (some number)” N + 1 times, where we need to figure
out the formula for “(some number).” Since we want to
repeatedly do a similar step, it is natural to “count them out”—
that is, count from 0 to N (inclusive; or from 0 to N + 1 if we
want to exclude the upper bound), name the number we are
counting on, and then say what to do for each number we
count. Let us count from 0 to N (inclusive), and call each
number that we count i.
Now, we need to figure out the formula for each number
that we need to write in terms of N and i. For example, if we
start by looking at N = 4:
16 14 12 10 8
We will see that the numbers decrease by 2 each time. We
go from 16 to 14 (2 less), then to 12 (again, 2 less), and so on.
For N = 4 in particular, we can come up with a formula of
16 - 2 * i. If we go back to the entire problem, and look at
each N we were given, we might come up with the following
table of formulas:
N   Formula for the i-th number
0   0
1   1 - 2 * i
2   4 - 2 * i
3   9 - 2 * i
4   16 - 2 * i
5   25 - 2 * i
These formulas are pretty similar, except for two things.
First, N = 0 is a bit of an odd case—it only has one number (0),
and thus we did not naturally come up with a - 2 * i term on it,
which the other formulas have. However, since we start
counting at i = 0, 2 * 0 = 0, and anything minus 0 is
itself, we can just add the missing - 2 * i to this formula
without changing anything, giving us 0 - 2 * i as the
formula for N = 0.
Now, all of the formulas look the same, except that they
start with a different number. That is, they are all (something)
- 2 * i, but that (something) differs. If we can figure out the
pattern in the (something), we have a general formula, and just
need to write down the steps. Scrutinizing the relationship
between these numbers (0, 1, 4, 9, 16, 25), we can see that the
(something) is N squared, so the formula is N squared - 2 * i.
Count from 0 to N (inclusive), and for each
number (i) you count,
Write down the number (N squared - 2 * i).
Video 1.1: Testing our algorithm
Step 4: Test Your Algorithm. Now that we have written
this algorithm, we should test it. It is entirely possible that we
made a mistake in generalizing the patterns. If we did, we
would prefer to catch the problem now. If we write the code,
then discover the problem, we have wasted work. We test the
algorithm by picking a particular N and following the steps. If
we used particular values of N to generalize from, we want to
pick other values to test with. Video 1.1 demonstrates testing of
these steps for a new value of N.
After testing our algorithm by executing it step-by-step as
in the video, we are more confident in its correctness. We know
that our algorithm produces the right answer on at least one
input. Of course, we cannot be completely sure our
algorithm is right from testing (in general, no number of tests
can ensure correctness). However, the more we test our
algorithm, the more confident we can become in it. Only once
we are completely happy with our algorithm should we then
proceed to translate it to code.
1.7.2 A Pattern of Squares
For our second example, we will look at a pattern of squares
drawn on a grid. You may wonder why a programmer would be
interested in drawing squares on a grid. Beyond this example
serving us well for analyzing patterns in general, computer
graphics ultimately boil down to drawing colored pixels on a
2D grid (the screen). In this particular example, we have an
algorithm that is parameterized over one integer N and
produces a pattern of red and blue squares on a grid that starts
all white. The output of the algorithm for a range of values of N
is as follows:
To devise the algorithm, we should work through Steps 1–
4 (as we should with all problems). Video 1.2 walks through
these steps to illustrate how we could come up with this
algorithm.
Video 1.2: Devising the algorithm for the
pattern on the grids.
We note that there are many correct algorithms for this
problem. Even if we restrict ourselves to the ones we can come
up with naturally (that is, as a result of working through steps
1–4, rather than trying something bizarre), there are still many
choices that are equivalent and correct. Which algorithm you
come up with would be determined by how you approach Step
1.
The algorithm in the video works from left to right, filling
in each column from bottom to top. If we had worked Step 1 by
working from the top down, filling in each row from left to
right, we might have ended up with the following slightly
different algorithm instead:
Count down from N to 0 (inclusive), and for each
number (y) you count,
Count from 0 to y (inclusive), and for each
number (x) you count,
If (x + y is a multiple of 3), then:
Place a blue square at (x,y).
Otherwise:
Place a red square at (x,y).
Of course, those are not the only two ways. You could
have worked across the rows from the bottom up going right to
left and come up with a slightly different (but also equivalent)
algorithm. Or even an entirely different approach, such as
filling in the entire “triangle” with red squares, then going back
to fill in the blue squares.
We emphasize this point because it is important for you to
understand that there is always more than one right answer to a
programming problem. You might work a problem and come
up with a correct solution but find that it looks completely
different from some other solution you know to be correct (e.g.,
the ones provided for some of the problems in this book, or a
teacher’s solutions to an exam). Understanding this possibility
is important so that you will not incorrectly think that a right
answer is wrong because you have seen a different right
answer. Not only is that experience frustrating, but it hinders
your learning.
1.7.3 Drawing a Rectangle
For our third example, we will work from the following
problem statement:
Given x, y, width, and height, draw a blue filled-
in rectangle of the given width and height whose lower
left corner is at (x, y) on a grid.
Like the last problem, this one involves drawing
something on a 2D plane. As with the last one, this problem is
one we might want to do in graphics—however, here, we are
working the problem from a description of what we want to do,
rather than from some examples. We might end up working a
problem that is conceptually quite similar in a non-graphical
context—if we represent data in a 2D grid (e.g., a table or
matrix), we might want to set a rectangular range of data values
to something in particular. Even though this second problem
“looks” different, the algorithm would be quite similar.
Video 1.3 walks through this problem.
Video 1.3: Devising the algorithm for filling in a
rectangle.
1.7.4 Closest Point
For our fourth example, we will work from the following
problem statement:
Given a set of points P, and another point p,
select the point from P that is closest to p.
We might want to write a program to solve this problem
for a variety of reasons. Some of the most natural reasons
would be that we actually want to apply this to physical
locations. For example, if we are writing a program that deals
with maps and directions, we might want to locate the gas
station nearest to the user. The software could then find all of
the gas stations in the area (the set P), query the GPS for the user’s
location (the point p), and then apply this algorithm to determine which
gas station is closest to the user.
However, this problem also has a variety of applications
that are slightly more subtle. For example, suppose we have
some information describing the characteristics of an item, and we
want to classify it based on the similarity of those
characteristics to preexisting categories of items. If we have
two characteristics, then the first is the x-coordinate, and the
second is the y-coordinate. If we have more than two, we solve
the problem in a higher-dimensional space (which is still the
same problem— we just compute distance differently). We then
use the points describing the categories we desire as our set P,
and the characteristics of the item we are classifying as our point p.
The closest item in P is the best match for p. For example,
suppose the key features in determining the breed of a dog
were its adult height (the x-coordinate) and its adult weight (the
y-coordinate). If you have a set P of representative dog
breeds, and you’d like to know the breed of your dog p, you
simply determine which member of P your dog is closest to.
Video 1.4: Devising an algorithm to find the
closest point in a set P to another point p.
Video 1.4 walks through the algorithmic design for this
problem. We will revisit this problem much later in Video 9.4,
where we will turn this algorithm into C code.
1.8 Next Steps
At this point, you should have a basic grasp of the idea of
developing simple algorithms. This skill is one that you will
practice as you go through the rest of this book, as the key
component of every programming problem is figuring out the
correct algorithm. We cannot overemphasize the importance of
working through problems in a step-by-step fashion. Many
novice programmers try to skip over the first several steps and
plunge right into writing code. The result is frequently a
disaster, which they end up spending orders of magnitude more
time trying to fix than they would have spent planning
correctly in the first place.
The reasons that novice programmers give for skipping
straight to step 5 vary, but one common one is “Step 3 (writing
a generalized algorithm) seemed too hard.” This reason is quite
possibly the worst reason to skip over Step 3—if making a
correct plan is proving hard, how can you possibly hope to
write correct code without the plan? Better is to repeat Steps 1
and 2 on more examples until you can find the pattern and
write down the algorithm. Another common reason that novice
programmers give for skipping the first steps is “to save
time”—however, they often then report spending countless
hours trying to debug the resulting code. It is well worth 10 or
even 30 minutes of planning to avoid trying to debug a
hopeless mess for multiple hours!
As you become more and more practiced at this process,
you may find that steps 1–4 come naturally, and you can do
them in your head without writing them down—much as
happens with mathematical skills. When these improvements in
your programming skills happen, then there is nothing wrong
with doing the easier steps in your head, as long as you are sure
that you are doing them correctly. However, whenever you are
programming at the boundaries of your abilities, you will need
to go through these steps—so it is quite important to remember
how the full process works even as you become more skilled.
In the next chapter, we will continue from here by first
learning a bit about reading code in C, before we continue on
to more about writing code. By reading code, we mean being
able to understand exactly what a piece of code does, executing
it step-by-step by hand. This skill is important for three
reasons. First, it is very difficult to write when you cannot read.
Reading the code will be a matter of drawing and updating
diagrams that reflect the state of the program as the code
executes. Writing code will be a matter of writing the syntax to
effect the appropriate transformations—as spelled out in the
algorithm—to the program’s state. Second, being able to read
your code is crucial for being able to debug your code. Third,
you may end up in a variety of settings (e.g., group class
projects, coding teams in industry) where you must read and
understand what other people’s code does so that you can work
on it.
1.9 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 1.1 : What is an algorithm?
• Question 1.2 : What is a parameter?
• Question 1.3 : How many tests are needed to ensure
that an algorithm is correct?
• Question 1.4 : The numbers in the table below are the
result of executing an algorithm that has one parameter
N, which must be a non-negative integer, and
produces sequences of integers as outputs. For values of
N from 0 to 5, the algorithm produces the following
sequences of numbers as outputs:
N output
0 0 2
1 3 5 7 9
2 6 8 10 12 14 16
3 9 11 13 15 17 19 21 23
4 12 14 16 18 20 22 24 26 28 30
5 15 17 19 21 23 25 27 29 31 33 35 37
Determine the algorithm that was used to generate the
numbers in this table and
1. Write it down
2. Execute it for N=6, and write down your result.
3. Give your description of the algorithm to a
friend who is not a programmer, and ask
him/her to execute it for N=6. Compare your
results to his/hers.
• Question 1.5 : The diagrams shown below are
the result of executing an algorithm that has one
parameter N, which must be a non-negative integer,
and colors boxes on a 10x10 grid. For values of N
from 0 to 5, the algorithm produces the following
patterns:
Determine the algorithm that was used to draw these
patterns and
1. Write it down
2. Execute it for N=6, and write down your result
(possibly on graph paper).
3. Give your description of the algorithm to a
friend who is not a programmer, and ask
him/her to execute it for N=6. Compare your
results to his/hers.
• Question 1.6 : The numbers in the table below are the
result of executing an algorithm that has one parameter
N, which must be a non-negative integer, and
produces sequences of integers as outputs. For values of
N from 0 to 5, the algorithm produces the following
sequences of numbers as outputs:
N output
0
1 0 1
2 0 2 2 3
3 0 2 4 3 4 5
4 0 2 4 6 4 5 6 7
5 0 2 4 6 8 5 6 7 8 9
Determine the algorithm that was used to generate the
numbers in this table and
1. Write it down
2. Execute it for N=6, and write down your result.
3. Give your description of the algorithm to a
friend who is not a programmer, and ask
him/her to execute it for N=6. Compare your
results to his/hers.
• Question 1.7 : The diagrams shown below are
the result of executing an algorithm that has one
parameter N, which must be a non-negative integer,
and colors boxes on a 10x10 grid. For values of N
from 0 to 5, the algorithm produces the following
patterns:
Determine the algorithm that was used to draw these
patterns and
1. Write it down
2. Execute it for N=6, and write down your result
(possibly on graph paper).
3. Give your description of the algorithm to a
friend who is not a programmer, and ask
him/her to execute it for N=6. Compare your
results to his/hers.
• Question 1.8 : The numbers in the table below are the
result of executing an algorithm that has one parameter
N, which must be a non-negative integer, and
produces sequences of integers as outputs. For values of
N from 0 to 5, the algorithm produces the following
sequences of numbers as outputs:
N output
0
1 -1 0 3
2 -4 -3 0 5 12 21
3 -9 -8 -5 0 7 16 27 40 55
4 -16 -15 -12 -7 0 9 20 33 48 65 84 105
5 -25 -24 -21 -16 -9 0 11 24 39 56 75 96 119 144 171
Determine the algorithm that was used to generate the
numbers in this table and
1. Write it down
2. Execute it for N=6, and write down your result.
3. Give your description of the algorithm to a
friend who is not a programmer, and ask
him/her to execute it for N=6. Compare your
results to his/hers.
Chapter 2
Reading Code
Before you can learn to write, you must learn to read. In the
case of code, learning to read means being able to take a piece
of code and execute it by hand in a step-by-step fashion.
Correctly executing a piece of code by hand will allow you to
determine the effects that the code will produce. If you cannot
understand what code does, you cannot possibly write it.
There are two important parts to executing code by hand.
The first is understanding what each statement does. The rest of
this chapter will cover the basic statements of C, and we will
see more advanced topics later on. As we introduce these basic
C constructs, we will explain their syntax—the rules for how to
write them according to the grammatical rules of C—and their
semantics—what they mean (i.e., what they make the program
do). We suggest that you make yourself a “quick reference”
sheet, where you write down the syntactic rules and effects of
each construct as you learn it. You will want to refer to these
later as you begin to write code (although, after some practice,
they will start to come naturally to you).
The second important part is keeping track of the state of
the program in a correct fashion—that is, what part of the code
is currently being executed, as well as the values of the
variables involved. As we will see shortly, tracking the values
of variables is not quite as simple as listing their names and
values (as we will sometimes make new variables and destroy
old ones). Accordingly, we will keep a diagram in a particular
way that allows us to track this state correctly and easily. We
will track our current location in the code with an arrow (→),
which will typically rest between lines of code. It will be after
the last line we have executed, and before the next line we are
going to execute. Our basic process will be to evaluate the line
of code after the execution arrow (according to the rules we
will learn here) and update our diagram appropriately. We will
then advance the arrow to the next line of code and repeat the
process until the program exits.
As you work through this chapter, you will hopefully
notice some similarities between our executing code by hand
and the way we tested algorithms (Step 4) in the previous
chapter. This similarity is not a coincidence, as code is just a
formal way to express an algorithm such that the computer can
understand how to follow the steps. Whenever you execute
algorithms by hand—no matter what language they are in:
English, C, C++, Java, SML, Python, etc.—you will want to
follow a very similar procedure so that you capture the effects
precisely.
2.1 Variables
Programs track most of their state in variables, each of which
you can think of as a box that stores a value. In order to use a
variable, the programmer must declare it, specifying its type
and name. The type specifies what kind of value can be held in
a variable’s box (for example, whether it is a number, a letter,
or text). We will learn about types in Chapter 3, but for now, we
will use variables whose types are all int, short for “integer”—
meaning that the value in their box is a number.
2.1.1 Declaration
The name of a variable may be any valid identifier—the
formal programming term for a word that can be used to name
something. In C, identifiers may be any length and can contain
letters, numbers, and underscores (_). They may only start with
a letter or an underscore (not a number) and are case-sensitive
(meaning that abc is different from Abc, and ABC is different
from both of them). The variable declaration ends with a
semicolon, which is used to end many statements in C.
Figure 2.1: A variable declaration.
A statement in a programming
language is roughly analogous to a sentence in English—it is a
complete line of code, which can be executed for an effect.
Figure 2.1 shows a variable declaration and identifies each of
the pieces.
When executing code by hand, the effect of a variable
declaration is to create a new box, labeled with the name of the
variable. In C, a newly declared variable is uninitialized,
meaning that its value is undefined. When the computer
actually executes the program, it has a finite (but quite large)
number of “boxes” (memory locations), and the variable will be
given one that is currently not in use. The value of the variable
will be whatever value happened to be in the location
previously, until a new value is assigned to the variable (which
we will see shortly). Correspondingly, when we execute a
variable declaration by hand, we will draw a box and place a ?
in it for its value, indicating that it is unknown. If we ever use
an unknown value as we execute our program, it indicates a
problem with our program, since its behavior is undefined—its
behavior will change based on whatever the value actually is,
which we cannot predict.
Video 2.1: Variable declarations.
Video 2.1 shows the execution of code containing two variable
declarations—x and
y. At the start, the execution arrow is at the beginning of the
code. The area on the right, which represents the state of the
program, is empty. As the execution arrow advances across
these statements, we execute their effects: drawing a box for
each variable with a ? in it, indicating that the variable is
uninitialized.
2.1.2 Assignment
For variables to be useful, we must be able to change their
values. To accomplish this, we use assignment statements—
statements that change the value contained in a box. An
assignment statement starts with an lvalue on the left. An lvalue
(pronounced “el-value”) must be something that “names a
box”—indicating which box the assignment statement will
change. The simplest lvalue is a variable, which names the
variable’s own box. (Later we shall see how to name boxes in
other ways, but for now, we will only consider variable names.)
After the lvalue comes a single equals sign (called the
assignment operator), followed by an rvalue on the right, then a
semicolon at the end. The rvalue (pronounced “are-value”)
must be an expression, and its value will be placed in the box.
Figure 2.2: An assignment statement.
An expression is a combination of values and operations
that evaluates to a value. For the moment, we will just consider
numeric constants (such as 3), which evaluate simply to
themselves (that is, 3 evaluates to the number 3). We will
discuss more expressions shortly. Evaluating any assignment
statement is a matter of figuring out what box the left side
names, evaluating the right side to a value (e.g., a number), and
then changing the value in the box named on the left side to the
value from the right side.
Figure 2.2 shows an example of an assignment statement
and identifies the individual pieces. This assignment statement
assigns the value 3 to the variable myVariable. Its effect is to
change the value in the box named myVariable to be 3.
Video 2.2: Declarations and assignments.
The declaration and initialization—the first assignment—
of a variable may be combined into a single statement, such as
int x = 3;, which has the same effect as the two individual
statements int x; x = 3;. Video 2.2 shows the execution of a
combination of variable declarations and assignment
statements.
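The statements described so far can be collected into a short fragment (wrapped here in a hypothetical helper function, purely so the result can be checked; it is a sketch, not code from the videos):

```c
/* Hypothetical sketch tracing declaration and assignment:
 * each declaration creates a box; each assignment fills one. */
int declareAndAssign(void) {
    int x;          /* declaration: a box named x, value unknown (?) */
    x = 3;          /* assignment: the box named x now holds 3       */
    int y = x + 1;  /* declaration + initialization in one statement */
    return y;       /* y's box holds 4 */
}
```

Executing this by hand, you would draw a box for x with a ?, replace the ? with 3, then draw a box for y holding 4.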
2.2 Expressions
As we mentioned previously, an expression is a combination of
values and operations that evaluates to a value. We have
already seen the simplest expressions—numerical constants,
which evaluate to themselves. We can also use mathematical
operators, such as +, -, *, and /, to carry out arithmetic
operations. For example, 7 + 3 evaluates to 10 and
4 * 6 + 9 * 3 evaluates to 51. These operators have the
standard rules of precedence—multiplication and division
occur before addition and subtraction—and associativity:
4 - 3 - 1 means (4 - 3) - 1, not 4 - (3 - 1). Parentheses
may be used to enforce a specific order of operations—
4 * (6 + 9) * 3 evaluates to 180.
Another common operator, which you may not be as
familiar with, is the modulus operator, %. The modulus operator
evaluates to the remainder when dividing the first operand by
the second. That is, a % b (read “a modulus b”, or “a mod b” for
short) is the remainder when a is divided by b. For example,
19 % 5 = 4 because 19 divided by 5 is 3 with a remainder of 4—
3 × 5 = 15, and 15 + 4 = 19. In C-style languages, the
modulus operator has the same precedence as multiplication
and division.
Variables may also appear in expressions. When a variable
appears in an expression, it is evaluated to a value by reading
the current value out of its box. It is important to note that
assignment statements involving variables on the right side are
not algebraic equations to be solved—we cannot write x-y = z
* q. Note that here, the left side of this statement does not
“name a box”. If you want to solve an algebraic equation, you
must do so in a step-by-step fashion.
We can, however, write perfectly meaningful assignment
statements that are not valid in algebra. For example, a
statement such as x = x + 1; is quite common in
programming but has no solution if you think of it as an
algebraic equation. In programming, this statement means to
take the current value of x, add 1 to it, and update x’s value to that
result.
Video 2.3: Assignment statements with more
complex right-hand sides.
Video 2.3 shows the execution of some assignment
statements with more complex expressions on their right-hand
sides than in previous examples.
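The expression rules above can be summarized in one small fragment (the helper function and its particular numbers are hypothetical, chosen only to exercise precedence, parentheses, modulus, and x = x + 1):

```c
/* Hypothetical sketch of expression evaluation rules. */
int expressionExamples(void) {
    int a = 4 * 6 + 9 * 3;    /* * before +: 24 + 27 = 51          */
    int b = 4 * (6 + 9) * 3;  /* parentheses first: 4 * 15 * 3     */
    int r = 19 % 5;           /* remainder of 19 divided by 5: 4   */
    int x = 10;
    x = x + 1;                /* read x (10), add 1, store 11 in x */
    return a + b + r + x;     /* 51 + 180 + 4 + 11 = 246           */
}
```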
2.3 Functions
Figure 2.3: A function declaration.
A function gives a name to a parameterized computation—
it is the implementation in code of a specific algorithm. All of
the code that you will read or write (in this book) will be inside
of functions. There are two aspects to using functions in your
programming: declaring a function—which provides the
definition for how a function behaves—and calling a function
—which executes the definition of the function for specific
values of the parameters.
Figure 2.3 shows a function declaration. The function’s
name may be any valid identifier, just like a variable’s name. In
this particular example, the function’s name is myFunction.
Immediately before the function’s name is its return type—the
type of value that this function will compute. As mentioned
earlier, we will learn more about types later. For now, we will
just work with ints, which are numbers. The fact that this
function returns an int means that its “answer” is an int. After
the function’s name comes a set of parentheses, with the
parameter list inside. The parameter list looks like a comma-
separated list of variable declarations. Here, the function takes
two parameters, x and y, both of which are ints. The similarity
between parameters and variable declarations is not a
coincidence—the parameters behave much like variables, but
they are initialized by the function call (which we will discuss
shortly).
The body of the function then comes between a set of
curly braces and consists of zero or more statements. The
body of this function has two statements. The first statement in
this function’s body is the now-familiar declaration and
initialization of a variable: z is declared as a variable of type
int and initialized to the value of the expression x - 2 * y.
The second statement within the body of this function is a
new type of statement, which we have not seen before: a return
statement. A return statement starts with the keyword return,
which is then followed by an expression. The effect of this
statement is to say what the “answer” is for the current
function, leaving its computation and returning to the code that
called it.
To understand this last concept completely, we must first
see the other aspect of using a function—calling the function. A
function call is another kind of expression, whose value is
whatever “answer” the called function comes up with when it is
executed with the specified arguments—values of its
parameters. This “answer” is more formally called the
function’s return value.
Evaluating a function call is more complex than evaluating
the other kinds of expressions that we have seen so far—it may
take many steps of executing the code in the function to
determine its answer. In fact, code may call one function, which
itself may call other functions before finally coming up with an
answer. While this may seem daunting, we can do it properly
by following a few rules for executing function calls by hand.
As a first step towards reading code with function calls,
we must first group together the variables belonging to one
particular function into a larger box, labeled with the function’s
name, which is called a frame (or stack frame, since they are
located on the call stack). Figure 2.4 shows an example of this
organization.
Figure 2.4: Variables organized into frames.
Notice that in the example of Figure 2.4, one of the
functions is named main. The function named main is special—
execution of a program starts at the start of main. We start by
drawing an empty frame for main, and putting the execution
arrow right before the first line of code in main. We then
execute statements of the code until main returns, which ends
the program.
Calls to functions may appear in expressions, in which
case we must evaluate the function to determine its return
result. To do this evaluation, we take the following steps:
1. Draw a frame for the function being called. Place a box
in that frame for each parameter that this function takes.
2. Initialize the parameters by evaluating the
corresponding expressions in the function call and
copying the resulting values into the parameter’s box in
the called function’s frame.
3. Mark the location of the function call, and note that
location in the corner of the function’s frame.
4. Move the execution arrow immediately before the first
line of code in the called function.
5. Evaluate the lines of code inside the called function.
6. When you reach a return statement, evaluate its
argument to a value. Note down this return value.
7. Return the execution arrow back to where the function
was called—you know this location because you noted
it in the corner of the frame. You will return the arrow to
the middle of the line of code (rather than the typical
“between them”) because that line of code is part-way
done.
8. Erase the frame for the called function.
9. Use the return value of the function as the value of the
function call in the expression in which it appears.
A function call may also be used as a statement by itself,
in which case, it is evaluated the same as above, except that its
return value is not used for anything.
Video 2.4: Execution of function calls.
Video 2.4 demonstrates the execution of code with
function calls.
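Based on the description of Figure 2.3, the function itself might be written as follows (the figure is not reproduced here, so treat this as a sketch of its described contents):

```c
/* Sketch of the function described for Figure 2.3: its body
 * declares z, initializes it to x - 2 * y, and returns z. */
int myFunction(int x, int y) {
    int z = x - 2 * y;  /* first statement of the body        */
    return z;           /* the "answer" of this call is z     */
}
```

Evaluating the call myFunction(10, 3) by the steps above: draw a frame with parameter boxes x = 10 and y = 3, initialize z to 10 - 2 × 3 = 4, and the return statement makes 4 the value of the call.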
2.3.1 Scope
So far, all of our code examples have had only one variable
with a particular name. However, in real programs—which may
be quite large and developed by multiple people—we may have
many different variables with the same name. This possibility
means that we need rules to determine which variable a
particular name refers to. These rules are based on the notion of
scope.
The scope of a variable is the region of code in which it is
visible. Within a variable’s scope, its name may refer to it.
Outside of a variable’s scope, nothing can refer to it directly.
Most variables that you will use will be local variables—
variables declared inside of a function—and function
parameters. In C, the scope of a local variable begins with its
declaration and ends at the closing curly brace (}) that closes
the block of code—the code between matching open and close
curly braces—that the variable was declared in. Function
parameters have a scope of the entire function they belong to.
Figure 2.5: Depiction of the scope of parameters and variables.
Figure 2.5 shows a snippet of code (we have not learned
the details of what most of this code does, but that is not
important—we are just interested in the scope of the variables).
The figure shows the same piece of code three times, with
different scopes highlighted. The leftmost portion of the figure
shows the scope of the parameters (x and y)—which is the
entire function—in a blue box. The middle portion shows the
scope of the variable n—which starts at its declaration and
continues to the close curly brace, which ends the function—in
a red box. The right portion shows the scope of the variable q—
which starts at its declaration and ends at the next curly brace—
in a green box.
To determine which variable a name refers to, we must
first determine which variable(s) with that name are in scope at
the reference. If no variables of that name are in scope, then the
reference is illegal. If exactly one variable is in scope, then the
name refers to that variable. If multiple variables are in scope,
we select the one whose declaration is in the innermost
enclosing block. That is, if you went backwards out of blocks,
through open curly braces, the variable that would go out of
scope first is the one to use.
Figure 2.6 shows a code fragment with four different xs in
it. (As the actual behavior of the code is irrelevant to this
example, much of it is replaced with an ellipsis (…).) The first
x in the figure is declared outside of any of the functions—it is
a global variable. The “box” for a global variable exists outside
of any frames, and is created when the program starts. If the
global variable is initialized in its declaration, the value is also
placed in the box before the program starts. The areas where x
references this variable are colored purple.
Figure 2.6: Code with four different xs, color-coded by which one you get when
you use the variable name x.
We note that there is a time and place to use global
variables, but their use should be rare. When novice
programmers learn about global variables, they often want to
use them for all sorts of inappropriate purposes. Typically these
uses reflect a lack of understanding of parameter passing or
how functions return values. We recommend against using
global variables for any problem in this book, and more
generally unless it is truly the correct design approach.
The next x in our example is the parameter to the function
f. The scope for this x begins at the open curly brace ({) of f’s
body, and ends at the matching close curly brace (}). The
region of the program where x references the parameter to f is
shown in red. Observe that the red begins and ends with the
curly braces surrounding the body of f but has a “hole” where
there is a different x in a smaller scope in the middle.
The “hole” in the red region corresponds to the portion of
the code (shown in blue) where x references the local variable
declared inside of the while loop’s body. After this local
variable x goes out of scope at the closing curly brace of the
block it was declared in, we return to the “red region” where
the parameter of f is what we reference with the name x.
Between the end of f and the declaration of a local
variable named x inside of function g, the global variable is
what the name x references—shown in the figure by coloring
this region of code purple. When there is a local variable named
x declared inside of g, then the name x references it (this area is
shown in green) until it goes out of scope, at which point the
name x again references the global variable.
If all of that seems complicated, you will be comforted by
the fact that thinking through such issues should not come up in
well-written code. Ideally, you should write your code such that
you have at most one variable by any particular name in scope
at a time (related to this point: you should name your variables
meaningfully—x is seldom a good name for a variable, unless
of course it represents the x coordinate of a point or something
similar). However, you should still know what the rule is, as it
is common to many programming languages. You may come
across code with multiple variables of the same name in scope
at some point and need to understand how to read it.
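A compact, entirely hypothetical example of these rules (the names and values are invented for illustration; as noted above, real code should avoid reusing names this way):

```c
int shadowed = 100;  /* a global variable named shadowed */

/* Anywhere no other "shadowed" is in scope, the name
 * refers to the global. */
int readGlobal(void) {
    return shadowed;  /* the global: 100 */
}

/* The innermost in-scope declaration of a name is the one used. */
int scopeDemo(int shadowed) {   /* the parameter hides the global */
    int sum = shadowed;         /* refers to the parameter        */
    {
        int shadowed = 7;       /* hides the parameter until the  */
        sum = sum + shadowed;   /* closing brace: adds 7          */
    }                           /* inner shadowed out of scope    */
    sum = sum + shadowed;       /* the parameter is visible again */
    return sum;
}
```

Here scopeDemo(1) computes 1 + 7 + 1 = 9: the inner block's shadowed wins while it is in scope, then the parameter takes over again.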
2.3.2 Printing
Our example programs so far have computed results but had no
way to communicate them to the user. Such programs would be
useless in practice. Real programs have means to communicate
with their user, both to read input and to provide output. Many
programs that you are accustomed to have graphical user
interfaces (GUIs); however, we will work primarily with
programs that use a command line interface. Writing GUIs is a
more complex task and requires a variety of additional
concepts.
Command line programs provide output to their user by
printing it out on the terminal. In C, printing output is
accomplished by calling the printf function, which takes a
string specifying what to print. We will learn more about strings
later (in Section 10.1), as they require knowledge of pointers to
understand. For now, you can think of them as being text—
words or sentences. In much the same way that we can write
down numerical literals (such as 3 or 42), we can write
down string literals by placing the string we want inside of
quotation marks, e.g., "This is a string". If we wanted to
print out the string, “Hello World,” we would type
printf("Hello World");.
The f in printf stands for “formatted,” meaning printf
does not just print literal strings but can take multiple
arguments (of various types), format the output as a string, and
print the result. To format output in this way, the string
argument of printf (which is called the “format string”)
includes special format specifiers, which start with a percent
sign (%). For now, we will only concern ourselves with %d,
which specifies that an integer should be formatted as a decimal
(base 10) number. For example, if we wrote the following code
fragment:
1 int x = 3;
2 int y = 4;
3 printf("x + y = %d", x + y);
it would print x + y = 7 because it would evaluate the
expression x + y to get 3 + 4, which is 7, and format the
number 7 as a decimal number in place of the %d specifier. The
rest of the string is printed literally.
Another type of special information we can include in the
string is escape sequences. Escape sequences are two (or more)
characters, the first of which is a backslash (\), which gives the
remaining characters special meaning. The most common
escape sequence you will encounter is \n, which means
“newline.” If you want your print statement to print a newline
character (which makes the next output begin at the start of the
next line), then you do so with \n. If you want a literal
backslash (that is, you actually want to print a backslash), \\ is
the escape sequence for that purpose. We will note that you
generally will want to print a newline in your output, not only
so that it looks nice, but also because printf does not actually
print the output to the screen until it encounters a newline
character or is otherwise forced to do so.
We will discuss the various format specifiers printf
accepts, as well as the escape sequences that C understands, in
Chapter 3.
Video 2.5: An example with printed output.
Video 2.5 demonstrates the execution of code that prints
output.
2.4 Conditional Statements
In addition to computing arithmetic combinations of their
variables, programs often make decisions based on the values
of their variables—executing different statements based on the
values of expressions. In C, an if/else statement specifies that
one block of code should be executed if a condition is true, and
another block should be executed if that condition is false.
To write meaningful if/else statements, we need to
introduce operators that allow us to compare two expressions to
produce true or false outcomes. In C, there are no distinct
values for true or false; instead, false is 0, and true is anything
non-zero. We will refer to true and false because they make
more sense conceptually; the distinction should not make a
practical difference in most cases.
expr1 == expr2 tests if expr1 is equal to expr2
expr1 != expr2 tests if expr1 is not equal to expr2
expr1 < expr2 tests if expr1 is less than expr2
expr1 <= expr2 tests if expr1 is less than or equal to expr2
expr1 > expr2 tests if expr1 is greater than expr2
expr1 >= expr2 tests if expr1 is greater than or equal to expr2
!expr computes the logical NOT of expr
expr1 && expr2 computes the logical AND of expr1 and expr2
expr1 || expr2 computes the logical OR of expr1 and expr2
Table 2.1: Logical operators.
Table 2.1 shows the C operators for conditional
expressions. The first six (==, !=, <, <=, >, and >=) are
relational operators—they compare two expressions for
equality or inequality. For any of these operators, both operands
(the expressions on the left and right) are evaluated to a value
then compared appropriately. The operator then produces a true
or false value.
The last three operators in the table (!, &&, and ||) are
boolean operators—they operate on true/false values. The first
of these, !, performs the boolean NOT operation. It is a unary
operator—meaning it has one operand—which evaluates to true
if its operand is false and evaluates to false if its operand is
true.
The && and || operators perform the logical AND and
logical OR operations respectively. The logical AND of two
values is true if and only if both values are true; otherwise it is
false. The logical OR of two values is true if and only if either
of the values are true; otherwise it is false.
Unlike previous operators we have seen, && and || may
know their answer from only one argument. In the case of &&, if
either operand is false, then the result is false, regardless of the
other value. Similarly for ||, if either operand is true, then the
result is true regardless of the other value. C exploits this fact in
the way it evaluates && and || by making them short circuit—
they may only evaluate one operand. Specifically, the first
operand is always evaluated to a value; however, if the value of
that operand determines the result of the entire && or ||
expression—false for && or true for ||—then the second
operand is not evaluated at all.
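Short circuiting can be observed by counting calls to a helper with a side effect. This sketch is hypothetical (and uses a global counter purely for illustration, despite the earlier advice against globals):

```c
int calls = 0;  /* counts how many times sideEffect has run */

/* Hypothetical helper whose evaluations we can count. */
int sideEffect(int value) {
    calls = calls + 1;
    return value;
}

/* The second operand of && or || is skipped whenever the
 * first operand already decides the result. */
int shortCircuitDemo(void) {
    calls = 0;
    int a = (3 < 2) && sideEffect(1);  /* left false: right skipped   */
    int b = (3 > 2) || sideEffect(1);  /* left true: right skipped    */
    int c = (3 > 2) && sideEffect(1);  /* left true: right IS run     */
    return calls * 1000 + a * 100 + b * 10 + c;  /* 1000 + 0 + 10 + 1 */
}
```

After the first two statements, calls is still 0; only the third statement evaluates sideEffect, so the function returns 1011.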
2.4.1 if/else
Figure 2.7: Syntax of if/else.
Now that we understand comparison operators and can
compare expressions, we can discuss the evaluation of if/else
statements. The syntax for an if/else statement is shown in
Figure 2.7. The keyword if is followed by an expression in
parentheses. This expression is evaluated to a value to
determine whether the “then” block or the “else” block is
executed. The “then” block of code comes immediately after
the expression. C does not have a then keyword (although
some languages do); however, this block of code serves the
same purpose regardless of the syntactic particulars of the
language—it is executed if the conditional expression evaluates
to true. After the “then” block, we have the keyword else,
followed by the “else” block. This block of code is executed if
the conditional expression evaluates to false.
When your execution arrow reaches an if statement,
evaluate the conditional expression. Evaluating this expression
proceeds just like evaluating any other expression. If the result
is true, move the execution arrow immediately inside the “then”
block and continue executing statements as usual. When your
execution arrow reaches the close curly brace that ends the “then”
block, skip over the “else” block, placing your execution arrow
immediately after the close curly brace of the “else” block, and
continue executing statements from there.
If the result of the conditional expression is false, you
should instead skip the “then” block and execute the “else”
block. Move your execution arrow into the start of the “else”
block, and continue executing statements from there. When
your execution arrow reaches the close curly brace that ends the
“else” block, simply move it past that curly brace (which has no
effect—it just denotes the end of the block), and continue
executing statements normally.
Video 2.6: Execution of if/else.
C permits if with no else, which is equivalent to an
empty “else” block (as if the programmer had written else {}).
If you execute an if with no else, then simply imagine the
empty “else” block. If the conditional expression evaluates to
true, you should execute the “then” block as previously
described; however, there is no “else” block to skip. Instead,
continue executing statements immediately after the end of the
“then” block (skipping over the non-existent “else” block). If
the conditional expression evaluates to false, then skip the
“then” block, and execute whatever statements follow it (doing
nothing for the “else” block).
if/else statements may be nested—one (or more) may
occur in the “then” or “else” block of another if/else
statement. When you encounter nested statements, the same
rules apply. The inner statement is just one of the (possibly)
many statements in the block, and is executed according to its
rules—the condition is evaluated, whichever of the “then” or
“else” blocks is appropriate is executed, and then execution
continues after the end of the “else” block. When the execution
arrow reaches the end of the outer “then” or “else” block, it
behaves no differently than if there were no inner if statement.
Video 2.6 demonstrates the execution of some if/else
statements.
Video 2.7: Execution of switch/case.
2.4.2 switch/case
Another way programs can make decisions is to use
switch/case. The syntax of switch/case is shown in
Figure 2.8. Here, when the execution arrow reaches the switch
statement, the selection expression—in parenthesis after the
keyword switch—is evaluated to a value. This value is then
used to determine which case to enter. The execution arrow
then jumps to the corresponding case—the one whose label
(the constant immediately after the keyword case) matches the
selection expression’s value. If no label matches, then the
execution arrow jumps to the default case if there is one, and
to the closing curly brace of the switch if not.
Figure 2.8: Syntax of switch/case.
Once the execution arrow has jumped into a particular
case, execution continues as normal until it encounters the
keyword break. When the execution arrow reaches the break
keyword, it jumps to the close curly brace that ends the switch
statement. Note that reaching another case label does not end
the current case. Unless the execution arrow encounters break,
execution continues from one statement to the next. When the
execution arrow passes from one case into the next like this, it
is called “falling through” into the next case.
For example, if we were executing the code in Figure 2.8
and reached the switch statement with x having a value of 17
and y having a value of 16, then we would first evaluate the
selection expression (x - y) and get a value of 1. The
execution arrow would then jump to case 1: and begin
executing statements after it. We would execute y = 9;. Then
we would fall through the next case label—our execution arrow
would move past it into the next case (the label itself has no
effect). Then we would execute z = 42;. Next, we would
execute the break; statement, causing our execution arrow to
jump to the close curly brace of the switch, after which we
would continue executing whatever other statements are there.
Video 2.7 shows the execution of a switch/case statement
under a few different conditions.
2.5 Shorthand
C (and many other programming languages) has shorthand—also
called syntactic sugar—for a variety of common operations. These
shorthands do not introduce any new behaviors; instead, they just
provide a shorter way to write common patterns of existing things
we have seen. Table 2.2 shows the most common shorthand
notations in C. These shorthands have exactly the same effect as
their expanded meanings. Consequently, when you encounter a
shorthand statement while executing code, you can execute it by
considering what its fully written out form is and performing the
effects of that statement.
Shorthand    Meaning
x += y;      x = x + y;
x -= y;      x = x - y;
x *= y;      x = x * y;
x /= y;      x = x / y;
x++;         x = x + 1;
++x;         x = x + 1;
x--;         x = x - 1;
--x;         x = x - 1;
Table 2.2: Shorthands.
Another possible shorthand is to omit the curly braces
around single-statement blocks of code in certain cases. For
example, if the “then” and/or “else” clause of an if statement is
one single statement, the curly braces are not required. While
you may encounter code written this way by other people, it is
highly inadvisable to write code this way. Omitting the curly
braces presents a danger if you modify the code in the future. If
you add another statement to the clause, but forget to add curly
braces, then the statement will not actually be part of the clause
but rather the first statement after the if. Such errors have
produced high-profile security vulnerabilities recently.
2.6 Loops
Programs often repeat the same block of code multiple times
using a loop. As you may recall from the examples in Section
1.7, algorithms often have repetitive behavior. Finding
repetitive patterns is crucial to generalizing over inputs, as your
program may need to perform similar work multiple times for
different pieces of the input—and the number of repetitions will
change with the characteristics of the inputs. In addition to
loops, there is another way to express repetition, called
recursion, which you will learn about in Chapter 7.
2.6.1 while Loops
Figure 2.9: Syntax of a while loop.
There are three kinds of loops in C. The first of these is the
while loop. The syntax for a while loop is shown in Figure 2.9.
The keyword while is followed by an expression in
parenthesis. Much like an if statement, this expression is
evaluated to determine whether or not to enter the block of code
immediately following it, which is known as the body of the
loop. If the conditional expression evaluates to true, the
execution arrow moves inside the body of the loop, and its
statements are executed normally. The while loop differs from
the if statement in what happens when the execution arrow
reaches the closing curly brace. In the case of a while loop, it
jumps up to the top of the loop, immediately before the while
keyword. The conditional expression is then re-evaluated, and
if it is still true, execution re-enters the loop body. If the
conditional expression evaluates to false, then the execution
arrow skips to immediately after the closing curly brace of the
loop body and proceeds from there.
Video 2.8: Execution of a while loop.
Video 2.8 shows the execution of a while loop.
2.6.2 do-while Loops
Figure 2.10: Syntax of a do-while loop.
Another type of loop in C is the do-while loop. Unlike a
while loop, which checks its conditional expression at the top
of the loop, the do-while loop checks its conditional expression
at the bottom of the loop—after it has executed the body. While
this distinction may seem contrived—either way the condition
is checked between iterations—it is important at the start of the
loop. A while loop may execute its body zero times, skipping
the entire loop, if the condition is false initially. By contrast, a
do-while loop is guaranteed to execute its body at least once
because it executes the loop body before ever checking the
condition.
Figure 2.10 shows the syntax of a do-while loop. The
keyword do is followed by the loop body. After the loop body,
the keyword while is followed by the conditional expression
and a semicolon.
Execution of a do-while loop proceeds by first entering
the loop body and executing all of the statements contained in
it. When the execution arrow reaches the while at the end of
the loop body, its conditional expression is evaluated. If the
expression evaluates to true, then the execution arrow jumps
back to the start of the loop body. If the expression evaluates to
false, then it moves past the end of the loop, and execution
continues with the next statement after the loop.
2.6.3 for Loops
The third type of loop in C is a for loop. The for loop is
syntactic sugar—it does not introduce any new behavior but
instead provides a more convenient syntax for a common
programming idiom. In the case of for loops, the common
idiom is counting from one number to another. Figure 2.11
shows the syntax of a for loop and how it is de-sugared into a
while loop—that is, how we could write the for loop in terms
of the already familiar while loop. Knowing how the for loop
de-sugars to a while loop tells us how to execute it. We can
imagine the equivalent while loop and follow the execution
rules we have already learned for it.
Figure 2.11: Syntax of a for loop (left), and how you can understand it in terms
an equivalent while loop (right).
The for keyword is followed by three pieces, separated by
semicolons, inside of parenthesis. The first of these is the
“initialization statement.” It happens once before the first time
the loop’s condition is checked. In the de-sugaring, this
statement appears right before the while loop. The second
piece is not a statement (even though it is followed by a
semicolon) but rather the conditional expression for the loop. In
the de-sugaring, this expression is the conditional expression of
the while loop. The third piece is the “increment
statement.” In the de-sugaring, it appears immediately before
the close curly brace of the loop body. After all of these is the
loop body, which (except for the addition of the “increment
statement” at the end) is the loop body of the while loop in the
de-sugared version.
If you examine Figure 2.11 carefully, you will notice that
there is a set of curly braces around the entire piece of while-
based code. These curly braces are there for a subtle but
important reason. Any variables declared in the
“initialization statement” of the for loop have a scope that is
limited to the for loop. Recall that a variable typically has a
scope that is limited to the curly braces that enclose its
declaration. For a variable declared in the start of the for loop,
the scope appears to be an exception to this rule; however, it is
not if we think of it in terms of the de-sugaring shown above
with the curly braces surrounding the declaration.
2.6.4 Nesting
Just as if/else statements may be nested, loops may also be
nested. Similarly, loops follow exactly the same rules no matter
how they are nested. In fact, if/else statements and loops may
be nested within each other in any combination. The rules are
always the same regardless of any combination or depth of
nesting.
Video 2.9: Execution of nested loops and if
statements.
Video 2.9 shows the execution of some code with nested
loops and if statements.
2.6.5 continue and break
Sometimes a programmer wants to leave the loop body early,
rather than finishing all of the statements inside of it. There are
two possible behaviors a programmer might want when leaving
the loop body early.
One behavior would be to exit the loop completely,
making the execution arrow jump to immediately after the close
curly brace that ends the loop (the same place it goes when the
loop’s condition evaluates to false). This behavior is obtained
by using the break; statement—which we have already seen in
the context of switch/case. Whenever the execution arrow
encounters a break statement, it executes the statement by
jumping out of the innermost enclosing loop (whether it is a
while, do-while, or for loop) or switch statement. If the
break statement is inside multiple of these that are nested
together (e.g. a loop inside a case of a switch statement), then
it exits only the most immediately enclosing one. If a break
statement occurs and is not inside one of these loops or a
switch statement, it is an error in the program.
The other possible behavior that the programmer might
want to have is for the execution arrow to jump back to the top
of the loop. This behavior is accomplished with the continue;
statement. Executing the continue statement jumps to the top
of the innermost enclosing loop (if it is not in a loop, it is an
error). In the case of a for loop, the “increment statement” in
the for loop is executed immediately before the jump. This fact
complicates the de-sugaring of a for loop into a while loop
slightly relative to the explanation given above. If the for loop
contains any continue statements, then the “increment
statement” is written not only before the close curly brace of
the loop, but also before any continue statements.
Video 2.10: Execution of continue in a for loop.
Video 2.10 shows how to transform a for loop with a
continue statement inside of it into an equivalent while loop
then execute the resulting code. We note that in this example,
simply using an if/else statement would be better—however,
a good example of the use of continue is not easy to come by
until we learn some more advanced concepts.
2.7 Higher-level Meaning
So far, we have discussed reading code in terms of step-by-step
execution to determine its effects. While this skill is crucial for
programmers, another useful skill is to be able to understand
the meaning of a piece of code—what algorithm it implements
and how it does so. This skill is useful to programmers, as they
may need to understand—and possibly modify—code they did
not write.
The skill of reading code and ascertaining its higher level
meaning is in some ways more a matter of reversing the process
of writing code—you translate the code into the algorithmic
steps it represents, then you figure out what the purpose of that
general algorithm is. Sometimes you accomplish this by
working examples from the algorithm to see what it does.
Sometimes the original programmer was helpful and wrote
documentation explaining what it does and how it accomplishes
that task.
2.8 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 2.1 : What does the following code print
when it is executed?
1 int f(int x, int y) {
2 if (x < y) {
3 return y - x;
4 }
5 return x + 5 - y;
6 }
7
8 int main(void) {
9 int a = 3;
10 int b = 4;
11 int c = f(b, a);
12 printf("c = %d\n", c);
13 a = f(a, c);
14 printf("a = %d\n", a);
15 b = f(c, f(a, b));
16 printf("b = %d\n", b);
17 return 0;
18 }
• Question 2.2 : What does the following code print
when it is executed?
1 int main(void) {
2 for (int x = 0; x < 3; x++) {
3 for (int y = 0; y < 3; y++) {
4 if (x - y % 2 == 0) {
5 printf(" O ");
6 }
7 else if (x <= y) {
8 printf(" X ");
9 }
10 else {
11 printf(" ");
12 }
13 }
14 printf("\n");
15 }
16 return EXIT_SUCCESS;
17 }
• Question 2.3 : What does the following code print
when it is executed?
1 int f(int x, int y) {
2 printf("In f(%d, %d)\n", x, y);
3 if (x + 2 < y) {
4 x += 3;
5 return y * x;
6 }
7 else {
8 return x + y + 2;
9 }
10 }
11
12 int main(void) {
13 int answer = 0;
14 for (int i = 0; i < 4; i++) {
15 answer += f(i, answer);
16 printf("i = %d, answer = %d\n", i, answer);
17 }
18 return EXIT_SUCCESS;
19 }
• Question 2.4 : Given the following code
1 int g(int x, int y) {
2 switch(x - y) {
3 case 0:
4 return x;
5 case 4:
6 y++;
7 break;
8 case 7:
9 x--;
10 case 9:
11 return x * y;
12 case 3:
13 y = x + 9;
14 default:
15 return y - x;
16 }
17 return y;
18 }
What does each of the following expressions
evaluate to?
1. g(14, 7)
2. g(9, 5)
3. g(3, 0)
4. g(2, 9)
5. g(5, 5)
6. g(27, 18)
• Question 2.5 : Consider the following code, and think
about the differences between the functions f and g.
1 void f(int x, int y) {
2 while (x < y) {
3 printf("%d", y - x);
4 x++;
5 y--;
6 }
7 }
8 void g(int x, int y) {
9 do {
10 printf("%d", y - x);
11 x++;
12 y--;
13 } while (x < y);
14 }
Come up with values of the parameters (x and y),
where the two functions produce different output. What
does each function produce for output on the parameter
values you picked? Check your answer by swapping
parameter values with a friend. Execute f(x, y) and
g(x, y) by hand for your friend’s values of x and y, and
see if you came up with the same answer. Have your
friend do the same for your values of x and y.
• Question 2.6 : What does the following code print
when it is executed?
1 int min(int a, int b) {
2 if (a < b) {
3 return a;
4 }
5 return b;
6 }
7 int max(int a, int b) {
8 if (a > b) {
9 return a;
10 }
11 return b;
12 }
13 int euclid(int a, int b) {
14 printf("euclid(%d, %d)\n", a, b);
15 int larger = max(a, b);
16 int smaller = min(a, b);
17 if (smaller == 0) {
18 return larger;
19 }
20 return euclid(smaller, larger % smaller);
21 }
22 int main(void) {
23 int x = euclid(9135, 426);
24 printf("x = %d\n", x);
25 return EXIT_SUCCESS;
26 }
(Hint: nothing unusual happens if a function calls
itself. You just follow the same rules we have learned.
We will talk a lot more about functions that call
themselves in Chapter 7)
• Question 2.7 : Trace the creation and destruction of
stack frames for the following code. What stack frames
exist after each line of code is executed?
1 void aFinalFn() {
2 }
3
4 void sillyFunction() {
5 aFinalFn();
6 }
7
8 void someFn() {
9 sillyFunction();
10 sillyFunction();
11 }
12
13 int main(void) {
14 someFn();
15 aFinalFn();
16 return EXIT_SUCCESS;
17 }
Chapter 3
Types
Until this point, our programs have been declaring and
manipulating integers only. We have declared variables such as x
and y and given them values such as 3 or 4. What happens when
we want to move past integers? What happens when we want to
move past numbers altogether?
3.1 Hardware Representations
First and foremost, as far as the computer is concerned, there is no
way to move “past numbers” because to the computer, Everything
Is a Number. A computer stores everything as a series of 0’s and
1’s. Each 0 or 1 is called a bit, and there are many ways to interpret
these bits. This is where types come in. A type is a programming
language construct that specifies both a size and an interpretation
of a series of bits stored in a computer. For example, the type for
working with integers is an int, whose size is typically 32 bits and
whose interpretation is an integer number directly represented in
binary.
Figure 3.1: Decimal and binary interpretation of the same pattern of digits.
3.1.1 Binary Numbers
Before we delve into how to represent numbers in binary, let us
briefly discuss the decimal system, which should be familiar to all
of us. A decimal number is a number represented in base 10, in
which there are 10 possible values for each digit (0–9). When these
digits are concatenated to make strings of numbers, they are
interpreted column by column. Beginning at the far right and
moving to the left, we have the 1’s column, the 10’s column, the
100’s column, and so forth. The number 348, for example, has 8
ones, 4 tens, and 3 hundreds. The value of each column is formed
by taking the number 10 and raising it to increasing exponents.
The ones column is actually 10^0 = 1, the tens column is
10^1 = 10, the hundreds column is 10^2 = 100, and so forth.
When we see a number in base 10, we automatically interpret it
using the process shown in Figure 3.1(a), without giving it much
thought.
A binary number is a number represented in base 2, in which
there are only two possible values for each digit (0 and 1). The 0
and 1 correspond to low and high voltage values stored in your
computer. Although it might be possible for a computer to store
more than two voltage values and therefore support a base larger
than 2, it would be extremely difficult to support the 10 voltage
values that would be required to support a base 10 number system
in hardware. A familiarity with base 2 is helpful in understanding
how your computer stores and interprets data.
Binary numbers are interpreted such that each bit (the name
for a binary digit) holds the value 2 raised to an increasing
exponent, as shown in Figure 3.1(b). We begin with the rightmost
bit (also called the least significant bit), which holds the value
2^0 = 1, or the ones column. The next bit holds the value
2^1 = 2, or the twos column. In base 10, each column is ten
times larger than the one before it. In base 2, each column’s value
grows by a multiple of 2. The number 10₂ (the subscript indicates
the base) has one two and no ones. It corresponds to the value 2 in
base 10. Congratulations! You are now technically equipped to
understand the age-old joke: “There are 10 types of people in the
world. Those who understand binary and those who do not.”
3.1.2 Looking Under the Hood
When you are driving a car in traffic, it is probably not a good idea
to think too much about what the engine is doing—in fact, you
really do not need to know how it works in order to drive. This
example illustrates an important concept in programming:
abstraction—the separation of interface (what something does or
how you use it) from implementation (how something works).
Abstraction often comes in multiple levels. Driving a car, the
level of abstraction you care about is that the steering wheel turns
the car, the gas pedal makes it go faster, and the brake makes it
slow down. Your mechanic’s level of abstraction is how the pieces
of the engine fit together, what level is appropriate for the brake
fluid, and whether your oil filter is screwed on tightly enough. The
engineers who designed the car thought about the physics to make
it all work efficiently. At each deeper level, you can think about
details that were not important at higher levels, but are still crucial
to making the system work. We could continue to lower and lower
levels of abstraction until we start thinking about quantum
interactions of atoms—fortunately you don’t need to worry about
that to merge onto the interstate!
There are times, however, when it is a good idea to take a
look “under the hood”—to go deeper than the abstraction levels
you typically care about. At the very least, you might want to
know whether the car has a diesel engine before filling up the tank,
or to be aware that your car has oil, and you should get it changed
sometimes.
Similarly, you need not constantly consider the inner
workings of your CPU in order to write good code. Thinking about
variables as boxes that store values is a good level of abstraction,
but having some knowledge of what goes on under the hood can
be important. When you first declare your variables and assign
them a type, it is a good idea to pause and consider what this
actually means at the hardware level.
As mentioned earlier, a type indicates both a size and an
interpretation. Figure 3.2 shows you the now-familiar figure with
code and its conceptual representation. For this chapter, we will
add a third column, showing you the underlying representation at
the hardware level. When you declare a variable x of type int, you
should think about this conceptually as a box called x with a value
42 inside. But at a hardware level, the type int means that you
have allocated 32 bits dedicated to this variable, and you have
chosen for these bits to be interpreted as an integer number in
order to yield the value 42.
Figure 3.2: Code, conceptual representation, and actual hardware representation.
Hex. As you may well imagine, reading and writing out a series
of 32 1’s and 0’s is tedious at best and error-prone at worst.
Because of this, many computer scientists choose to write out the
values of numbers they are thinking about in binary using an
encoding called hexadecimal, or hex for short. Hex is base 16,
meaning it represents a number with a 1’s column, a 16’s column,
a 256’s column, and so on. As a hex digit can have 16 possible
values (0–15), but our decimal number system only has 10
possible symbols (0–9), we use the letters A–F to represent the
values 10–15 in a single digit. The number 11, for example, is
represented by the single digit B in hex. Numbers represented in
binary can easily be converted to hex by simply grouping them
into four-digit clusters, each of which can be represented by a
single hex digit. For example, the four rightmost bits in Figure 3.2
(colored blue) are 1010, which has the decimal value 10 and the
hex value A. The next four bits in Figure 3.2 (colored green) are
0010, which has the decimal value 2 and the hex value 2. The
remaining 24 bits in the number are all zeroes. Instead of writing
out the entire 32-bit binary sequence, we can use eight digits of
hex (0x0000002A) or the shorthand 0x2A. (In both cases, the
leading 0x (interchangeable with just x) indicates that the number
is in hex.)
3.2 Basic Data Types
C supports a very small number of data types, each with a unique
combination of size and interpretation. They are shown in
Figure 3.3. As the caption of this figure notes, the sizes listed are
common, and what we will use in general discussion in this book,
but not guaranteed. In particular, it depends on the hardware and
the compiler—the program that turns your code into instructions
that the computer can actually execute (more on this in Chapter 5).
Figure 3.3: Basic data types supported in C. Note: sizes shown are typical but can
vary with compiler and hardware.
3.2.1 char
A char (pronounced either “car” or “char”) is the smallest data
type—a mere eight bits long—and is used to encode characters.
With only eight bits, there are only 2^8 = 256 possible values
for a char (from 00000000₂ to 11111111₂). On most machines
you will use, these eight bits are interpreted via the American
Standard Code for Information Interchange (ASCII) character-
encoding scheme, which maps 128 number sequences to letters,
basic punctuation, and upper- and lower-case letters. A subset of
this mapping is shown in Figure 3.4; please don’t try to memorize
it. Another, much more expressive character-encoding scheme you
may encounter (particularly when needing to encode non-English
characters) is Unicode (which requires more than one byte).
If you look at the first line of code in Figure 3.5, you can see
the char c declared and initialized to the value ’A’. Notice that we
wrote A in single quotation marks—these indicate a character
literal. In the same way that we could write down a string literal in
Section 2.3.2, we can also write down a character literal—the
specific constant character value we want to use. Writing down
this literal gives us the numerical value for A without us having to
know that it is 65. If we did need to know, we could consult an
ASCII table like the one in Figure 3.4. Being able to write ’A’
instead of 65 is another example of abstraction—we do not need to
know the ASCII encoding; we can just write down the character
we want.
Figure 3.4: A subset of ASCII number-to-character mappings.
3.2.2 int
We have said that an int is a 32-bit value interpreted as an integer
directly from its binary representation. As it turns out, this is only
half of the story—the positive half of the story. If we dedicate all
32 bits to expressing positive numbers, we can express 2^32 =
4,294,967,296 values, from 0 up to 4,294,967,295. We request this interpretation
by using the qualifier unsigned in the declaration, as shown in the
second line of Figure 3.5.
What about negative numbers? ints are actually represented
using an encoding called two’s complement, in which half of the
2^32 possible bit patterns are used to express negative numbers
and the other half to express positive ones. Specifically, all
numbers with the most significant bit equal to 1 are negative
numbers. A 32-bit int is inherently signed (i.e., can have both
positive and negative values) and can express values from
-2^31 (-2,147,483,648) to 2^31 - 1 (2,147,483,647). Note that both
unsigned and signed ints have 2^32 possible values. For the
unsigned int, they are all positive; for the signed int, half are
positive and half are negative.
Another pair of qualifiers you may run into are short and
long, which decrease or increase the total number of bits dedicated
to a particular variable, respectively. For example, a short int
(also referred to and declared in C simply as a short) is often only
16 bits long. Technically, the only requirement that the C language
standard imposes is that a short int has at most as many bits as an
int and that a long int has at least as many bits as an int.
Figure 3.5: Examples of chars and ints. At first glance, c and x appear identical
since they both have the binary value 65. However, they differ in both size (c has only
eight bits, whereas x has 32) and interpretation (c’s value is interpreted using ASCII
encoding whereas x’s value is interpreted as a signed integer). Similarly, y and z are
identical in hardware but have differing interpretations because y is unsigned and z is
not.
3.2.3 float and double
The final two basic data types in C allow the programmer to
express real numbers. Since there are an infinite number of real
numbers, the computer cannot express them all (that would require
an infinite number of bits!). Instead, for values that cannot be
represented exactly, an approximation of the value is stored.
If you think about the fact that computers can only store
values as 0s and 1s, you may wonder how it is possible to store a
real number, which has a fractional part. In much the same way
that decimal representations of a number can have a fractional
portion with places to the right of a decimal point (the tenth’s,
hundredth’s, thousandth’s, etc. places), binary representations of
numbers can have fractional portions after the binary point. The
places to the right of the binary point are the half’s, quarter’s,
eighth’s, etc. places.
One way we could (but often do not) choose to represent real
numbers is by fixed point. We could take 32 bits and interpret them
as having the binary point in the middle. That is, the most
significant 16 bits would be the “integer” part, and the least 16 bits
would be the “fractional” part. While this representation would be
conceptually simple, it is also rather constrained—we could not
represent very large numbers, nor could we represent very small
numbers precisely.
Instead, the most common choice is similar to scientific
notation. Recall that in decimal scientific notation, the number 403 can
be expressed as 4.03 × 10^2. Computers use floating point
notation, the same notation but implicitly in base 2: m × 2^e,
where m is called the mantissa (though you may also hear it
referred to as the significand), and e is the exponent.
Figure 3.6: Floating point representation. A float has 32 bits, and a double has 64
bits to express the three necessary fields to represent a floating point number.
A float has 32 bits to represent a floating point number.
These 32 bits are divided into three fields. The lowest 23 bits
encode the mantissa, and the next eight bits encode the exponent.
The most significant bit is the sign bit s, which augments our
formula as follows: (-1)^s × m × 2^e. (When s = 1, the
number is negative. When s = 0, the number is positive.) A
double has 64 bits and uses them by extending the mantissa to 52
bits and the exponent to 11 bits. Examples of both a float and a
double are shown in Figure 3.6.
Standards. There would be many possible ways to divide a
given number of bits into the mantissa and exponent fields. The
arrangement here is part of the Institute of Electrical and
Electronics Engineers (IEEE) Standard. Industry standards like
these make it possible for engineers from a variety of companies to
agree upon a single encoding by which floating point numbers can
be represented and subsequently interpreted across all languages,
platforms, and hardware products. Part of the IEEE Standard for
floating point notation involves two adjustments to the bitwise
representations of a float and a double. These adjustments
(normalization and adding a bias) make the actual binary
representation of these numbers less accessible to a first time
observer. We encourage the interested reader to read the actual
IEEE floating point Standard and allow the less curious reader
simply to trust that there is a bitwise encoding for the numbers in
Figure 3.6, which is just outside the scope of this textbook.
Precision. There are an infinite number of values between the
numbers 0 and 1. It should be unsurprising, then, that when we use
a finite number of bits to represent all possible floating point
values, some precision will be lost. A float is said to represent
single-precision floating point whereas a double is said to
represent double-precision floating point. (Since a double has 64
bits, it can dedicate more bits to both the mantissa and exponent
fields, allowing for more precision.)
Figure 3.7: Precision. In practice, the imprecision associated with floating point
arithmetic can produce surprising results. A double offers more precision than a
float, but neither is immune to the problem of representing an infinite number of
values with a finite number of bits.
How does precision play out in practice? Figure 3.7 shows
how unexpected (or at least unintuitive) things can happen due to
imprecision. If you take the square root of 2.0 and store it in the
float fRoot you get a particular value. If you store the number in a
double, the value is the same number to the eighth decimal place;
then the two values diverge. Interestingly, in neither case does
taking the square root of 2.0 and then squaring it yield exactly
2.0. Notice how the code in Figure 3.7 tests to see whether
the root of 2.0 squared yields 2.0, and it does not in either case. As
we will discuss in the next section, the default print setting for
floats and doubles is to print up to six decimal places (see
Figure 3.8). As a consequence, the user has no reason to think that
these numbers are not exactly 2.0 and the fact that neither test for
equality to 2.0 passes is simply confusing.
It is important for programmers to understand precision when
they choose types for their variables and when they perform tests
on variables whose values are assumed to be known. Some
programs will need more precision in order to run correctly. Some
programs will have to allow for a small degree of imprecision in
order to run correctly. Understanding exactly the level of precision
required for your code is critical to writing correct code. For every
project you begin (or join) it is definitely worth taking a minute to
think about the code and how important precision might be in that
particular domain. This is true particularly for programs that will
ultimately be used to make life-and-death decisions for those who
have no say over the precision decisions you are making for them.
It is also important to understand the cost. A double takes up
twice as much space as a float. This may not matter for a single
variable, but some programs declare thousands or even millions of
variables at a time. If these variables do not require the precision
of a double, choosing a float can make your code run faster and
use less memory with no loss of correctness.
Figure 3.8: How to print various data types in C. Also shown: decimal formats that
allow you to custom print each number and escape sequences allowing you to specify
white space.
3.3 Printing Redux
As we learned in Section 2.3.2, C supports printing formatted
information via the function printf. Now that we have multiple
types, we can explore the various format specifiers, which allow us
to print variables of a variety of types. Figure 3.8 shows the most
common specifiers. You should not try to memorize these (nor
really anything in learning to program)—rather, you should know
how/where to look up what you need. As you use the most
common ones often, they will come to you naturally. You can find
more format specifiers, as well as more information about these
format specifiers in the man page for printf (see Section B.2 for
more information about man pages).
Figure 3.9 shows some examples of these format specifiers
being used. Here, the code (shown on the left) declares a few
variables and prints them out using the format specifiers described
in Figure 3.8. Note that while we have already discussed
hexadecimal (base 16), this example also makes reference to octal,
which is base 8.
Figure 3.9: Printing in C. The above figure shows a few printing examples,
incorporating a variety of format specifiers, decimal formats, and escape sequences.
3.4 Expressions Have Types
In Section 2.2, we learned that expressions are evaluated to values
—if you have a + b * 2, the current value of b is read out of its
box, multiplied by 2, then the value of a is read out of its box and
added to the product of b * 2. The expression evaluates to the
resulting sum.
Expressions also have types, which are determined by the
types of the sub-expressions that make them up. The simplest
expressions are constants, which have type int if they are integer
constants (e.g., 2 or 46), or type double if they are real constants
(e.g., 3.14, or -8.19). The types of constants can be modified by
applying a letter suffix if needed (U for unsigned, L for long, and f
for float): 3.14f is a constant with type float, and
999999999999L is a constant with type long int. The next
simplest type of expression is a variable, which has whatever type
it was declared to have.
Most (but not all) expressions with binary operators—
e1 op e2 (e.g., a + b or c * 4)—have the same type as their
operands. If a and b are doubles, then a + b is a double as well.
Likewise, if c is an int, then c * 4 is also an int (note that 4 is an
int).
The type of a function is its declared return type. That is, if
you have
1 int f (int x, int y) { ... }
2 int g (double d, char c) {...}
then the expression f(3, 4) + g(42.6, ’a’) has type int. We
can see this from the fact that f(3, 4) has type int (f is declared
to return an int), as does g(42.6, ’a’). As we just discussed,
adding two ints results in an int.
3.4.1 Type Conversion
The next natural question is “what happens if you have a binary
operator, and its operands have different types?” For example, if a
is an int and b is a double, then what type is a + b? The answer
to this question depends on the types involved.
Fundamentally, the first thing that must happen is that the two
operands must be converted to the same type. Most operations
can only be performed on operands of the same type. The
processor has one instruction to add two 32-bit integers, a different
instruction to add two 16-bit integers, a third one to add two 32-bit
floats, a fourth to add two 64-bit doubles, and so on. The compiler
must translate your code into one of these instructions, so it must
pick one of them and arrange to have the inputs in a proper format
in order to be able to perform the math.
When the two operands have different types, the compiler
attempts to add a type conversion (sometimes called a type
promotion) to make the types the same. If no type conversion is
possible, the compiler will issue an error message and refuse to
compile the code. When the compiler inserts a type conversion, it
typically must add instructions to the program that cause the
processor to explicitly change the bit representation from the size
and representation used by the original type to the size and
representation used by the new type. The compiler chooses which
operand to convert based on what gives the “best” answer. In our
int + double example, the compiler will convert the int to a
double to avoid losing the fractional part of the number.
There are four common ways that the bit representations must
be changed to convert from one type to another during a type
promotion. When converting from a smaller signed integer type to
a longer signed integer, the number must be sign extended—the
sign bit (most significant bit) must be copied an appropriate
number of times to fill in the additional bits. When converting
from a smaller unsigned integer type to a longer unsigned integer
type, the number must be zero extended—the additional bits are
filled in with all zeros. The third common way that the bit
representation can be changed during an automatic conversion
happens when a longer integer type is converted to a shorter
integer type. Here, the bit pattern is truncated to fit—the most
significant bits are thrown away, and only the least significant bits
are retained.
The fourth way that the bit representation may need to be
changed is to fully calculate what the representation of the value is
in the new type. For example, when converting from an integer
type to a real type, the compiler must insert an instruction that
requests that the CPU compute the floating point (binary scientific
notation) representation of that integer.
There are other cases where a type conversion does not need
to alter the bit pattern; instead, it changes the interpretation. For
example, converting from a signed int to an unsigned int
leaves the bit pattern unchanged. However, if the value was
originally negative, it will now be interpreted as a large positive
number. Consider the following code:
1 unsigned int bigNum = 100;
2 int littleNum = -100;
3 if (bigNum > littleNum)
4   printf("Obviously, 100 is bigger than -100!\n");
5 else
6   printf("Something unexpected has happened!\n");
When this code runs, it prints “Something unexpected has
happened!” The bit pattern of littleNum (which has a leading 1
because it is negative) is preserved; the value is changed to a
number larger than 100 (because under an unsigned interpretation,
this leading 1 indicates a very large number). We will note that the
compiler produces a warning (an indication that you probably did
something bad—which means you should go fix your code!) for
this behavior, as comparing signed integers to unsigned integers is
typically a bad idea for exactly this reason.
When you declare a variable and assign it a particular type,
you specify how you would like the data associated with that
variable—the bit pattern “in the box”—to be interpreted for the
entirety of its life span. There are some occasions, however, when
you or the compiler may have to temporarily treat the variable as
though it were of another type. When a programmer does this, it is
called casting, and when a compiler does it, it is called type
conversion or type promotion. It is extremely important to
understand when to do the former and when the compiler is doing
the latter because it can often be the cause of confusion and
consequently errors. We will note that while understanding when
to cast is important, understanding that you should generally not
cast is even more important—sprinkling casts into your code to
make errors go away indicates that you are not actually fixing your
code but rather hiding the problems with it.
3.4.2 Casting
Sometimes, the programmer wants to explicitly request a
conversion from one type to another—either because the compiler
has no reason to insert it automatically (the types are already the
same, but a different type of operation is desired), or because the
compiler does not consider the conversion “safe” enough to do
automatically. This explicitly requested conversion is called
casting and is written in the code by placing the desired type in
parenthesis before the expression whose value should be
converted. For example, (double)x evaluates x (by reading the
value in its box), then converts it to a double.
To see why we would want this capability, let us begin with a
seemingly benign example. We want to write a program that
calculates how many hours you would work per day if you
stretched the 40-hour work week across seven days instead of five.
A naïve implementation of the code (shown in Video 3.1) might
begin with two ints, nHours and nDays. Here, int is a perfectly
reasonable type, as we are working only in integer numbers of
hours (40) and days (7). This code then divides the number of
hours by the number of days and stores the result in the float
avgWorkDay. If you execute this code carefully by hand, you will
find that when it prints the answer out, it will print 5.0. Somehow
our work week just got shortened to 35 hours!
In this case, the problem lies in the fact that we divided two
ints, and integer division always produces an integer result—in
this case 5. When the compiler looks at this expression, there are
only integers involved, so it sees no need to convert either operand
to any other type. It therefore generates instructions that request
the CPU to perform integer division, producing an integer result.
However, when the compiler examines the assignment, it sees
that the type on the left (the type of the box it will store the value
in) is float, while the type of the expression on the right (the type
of the value that the division results in) is int. It then inserts the
type conversion instruction at the point of the assignment:
converting the integer division result to a floating point number as
it puts it in the box. Video 3.1 illustrates this execution.
Video 3.1: Execution of our naïve code for
computing the hours in a seven-day work week.
Here, what we really wanted to do was to convert both
operands to a real type (float or double) before the division
occurs, then perform the division on real numbers. We can achieve
this goal by introducing an explicit cast—requesting a type
conversion. We could explicitly cast both operands; however,
casting either one is sufficient to achieve our goal. Once one
operand is converted to a real type, the compiler is forced to
automatically convert the other. We prefer writing
a / (double) b over (double)a / b even though they are the
same, as the former does not require the reader of the code to
remember the relative precedence (“order of operations”) between
a cast and the mathematical operator. However, we note that
casting has very high operator precedence—it happens quite early
in the order of operations.
Video 3.2: Execution of our fixed code for
computing the hours in a seven-day work week.
Video 3.2 illustrates the execution of the modified code with
the cast added. Observe how the cast does not affect the boxes,
only the intermediate numbers that are being worked with in the
computation.
3.4.3 Overflow and Underflow
Figure 3.3 showed us some of the basic types supported in C and
their sizes. The fact that each type has a set size creates a limit on
the smallest and largest possible number that can be stored in a
variable of a particular type. For example, a short is typically 16
bits, meaning it can express exactly 2^16 = 65,536 possible values. If these
values are split between positive and negative numbers, then the
largest possible number that can be stored in a short has a bit
pattern 0111111111111111, which is 32767 in decimal.
What happens if you try to add 1 to this number? Adding 1
yields the unsurprising bit pattern
1000000000000000. The bit pattern is expected, but the
interpretation of a signed short with this bit pattern is -32768,
which could be surprising (the very popular xkcd comic illustrates
this principle nicely: http://xkcd.com/571/). If the short were
unsigned, the same bit pattern 1000000000000000 would be
interpreted as an unsurprising 32768.
This odd behavior is an example of overflow: an operation
results in a number that is too large to be represented by the result
type of the operation. The opposite effect is called underflow, in
which an operation results in a number that is too small to be
represented by the result type of the operation. Overflow is a
natural consequence of the size limitations of types.
Note that overflow (and underflow) are actions that occur
during a specific operation. It is correct to say “Performing a 16-
bit signed addition of 32767 + 1 results in overflow.” It is not
correct to say “32768 overflowed.” The number 32768
by itself is perfectly fine. The problem of overflow (or underflow)
happens when you get -32768 as your answer for
32767 + 1. The operation does not have to be a “math”
operation to exhibit overflow. Assignment of a larger type to a
smaller type can result in overflow as well. Consider the following
code:
1 short int s;
2 int x = 99999;
3 s = x;
4 printf("%d\n", s);
In this code, the assignment of x (which is a 32-bit int) to s
(which is a 16-bit short int) overflows—the truncation
performed in the type conversion discards non-zero bits. This code
will print out -31073, which would be quite unexpected to a person
who does not understand overflow.
Whether overflow is a problem for the correctness of your
program is context-specific. Clocks, for example, experience
overflow twice a day without problems. (That 12:59 is followed by
1:00 is the intended behavior). As a programmer, realize that your
choice of type determines the upper and lower limits of each
variable, and you are responsible for knowing the possibility and
impact of overflow for each of these choices.
3.5 “Non-Numbers”
It is worth restating: Everything Is a Number. This rule is
fundamental to understanding how computers work and is one of
the most important concepts in programming. For every variable
you create in any programming language, the value of that variable
—the data that you place “in the box” of every conceptual diagram
you draw—is stored in your computer as a series of zeros and
ones. This fact is easy to accept for a positive integer, whose base
10 representation is simply converted to base 2 and then stored in a
series of bits. Understanding how negative numbers and floating
point numbers are also represented as a series of zeros and ones
may be a little less straightforward, but it still appeals to our
general intuition about numbers.
Extending this rule to things that do not seem like numbers—
words, colors, pictures, songs, movies—may seem like a much
harder conceptual leap. However, with our newfound
understanding that computers can only operate on numbers, we
must realize that all of these things must be numbers too—after all,
our computers operate on them regularly.
Finding a way to encode these “non-number” data types is
simply a matter of coming up with a new convention for encoding
the information as bits and interpreting the bits to mean the
original information. These new conventions are no longer
included as basic data types of the C programming language
(though some of them are basic data types in languages other than
C). Instead, new types are formed by combining the basic types to
achieve the programmer’s goals. These more complex types may
be widely accepted programming conventions (like the
representation of strings), or may be something done by one single
programmer specific to their programming task.
3.5.1 Strings
A string is a sequence of characters that ends with a special
character called the null terminator, which can be written with the
character literal ’\0’ (pronounced “backslash zero”), which
signals the end of the string. A string is referred to by the location
of the first character in memory and each eight-bit character is read
until the ’\0’ is detected. A simple drawing of this concept is
shown in Figure 3.10.
Figure 3.10: How to encode a string as a series of ASCII characters.
Strings are not a basic data type in C, meaning you cannot
simply declare and use them as you would an int or a double. To
give you a tiny glimpse into the complexity of the matter, consider
how large a string should be. Is there a pre-defined number of bits
that should correspond to a string data type? Since each string has
a unique number of characters, this does not seem like a choice
that can be made up front. In fact, the size of a string will need to
be dynamically determined on a per-string basis. To truly
understand how to create and use strings, an understanding of
pointers (the topic of Chapter 8) is required. This is one reason
why Figure 3.10 is deliberately lacking in details—because we
haven’t yet explained the concepts necessary to show you how to
declare and instantiate them. We will delay further discussion of
strings to Section 10.1.
3.5.2 Images
Your computer frequently displays images—whether it’s the
windows and icons on your screen, or the lolcats you view in your
web-browser. These may seem like they are not numbers; however,
they are actually just many numbers put together. The first step to
representing an image as a number is to represent a color as a
number.
While there are many ways to represent a color as a number,
the most common is RGB encoding, which encodes each color by
specifying how much red, green, and blue they contain. Typically,
this encoding is done with each component being represented on a
scale from 0 to 255. The RGB values for the color red are:
R = 255, G = 0, B = 0. Orange is R = 255,
G = 165, B = 0. If you search the Internet, you will find
many online tools that will let you select a color and then tell you
its corresponding RGB encoding.
Once we can encode a single color numerically, an image is
encoded as a 2D grid of colors. Each “dot” in this grid is called a
pixel. As with strings, understanding how to store a 2D sequence
requires an understanding of pointers, which will come later.
However, for now it suffices to understand that an image can be
encoded as many numbers organized in a logical format.
You may have noticed that computers typically have a variety
of image formats, such as JPG, BMP, PNG, TIFF, and many
others. Each of these encodes the image numerically; however, the
specific details differ between the formats. Some image formats
compress the image data—performing math on the colors (after all,
the colors are just numbers!) to encode the image data in fewer
bits, reducing the size of the data that must be stored on disk
and/or transferred across the Internet.
3.5.3 Sound
Sound is another common aspect of computer use that seems like it
is not a number. However, sound is naturally a waveform, which
can easily be represented as a sequence of numbers. The most
direct numeric representation of a sound wave is to record the
“height” of the wave at periodic intervals, forming a sequence of
numbers. The frequency of these intervals is called the sampling
rate (higher sampling rates result in better quality of the sound),
and is typically 22 kHz or 44 kHz—that is, 22,000 or 44,000
samples per second. Stereo sound simply has two sequences of
numbers—one for the left channel and one for the right channel.
As with images, there are many typical formats (e.g., WAV, AIFF,
AAC, etc.), some of which are compressed (again, the sound is just
numbers, so we can do math on it).
3.5.4 Videos
Videos (including those found in this book) again seem to defy the
“Everything Is a Number” rule—however, by now, you should see
the path to numeric encoding. A video is a sequence of images
(called “frames”) and the corresponding sound. We have already
seen how to encode images and sound as numbers. The simplest
approach would be to encode the video as the sequence of images
plus the sound. While this approach gives us a bunch of numbers,
it would be huge—one minute of a 512 pixel x 256 pixel video at
32 frames per second with a stereo sound track at 44 kHz would
require about 725 Megabytes (almost 1 GB). Correspondingly, all
common movie formats (e.g., MP4, MOV, etc.) apply
compression, not only to the images and sound themselves, but
also in terms of not storing the entire image for all frames, but
rather storing a way to compute the next frame’s image based on
the changes from the previous frame.
3.6 Complex, Custom Data Types
You may be starting to notice that the definitions of many data
types are essentially a set of agreed-upon conventions. One of the
great things about rich programming languages like C is that it
gives a programmer the power to create new data types and
associated conventions. Some conventions, like the IEEE floating
point standard, are agreed upon across multiple programming
languages, compilers, machine languages, and the architecture of
the processors they run on. This requires the coordination of
hundreds of companies and tens of thousands of engineers. Other
conventions can be more local, existing only in a particular code
base, or a collection of files that all use a common library. This
may require the coordination of multiple people (who are usually
working together already) or may only affect a single person who
simply wishes to produce clean, modifiable, and debuggable
programs.
Suppose you are designing a program that regularly draws
and computes various properties of rectangles. It would be very
convenient to have a data type that captures the basic properties of
a rectangle. In C, creating a custom data type is accomplished via
the keyword struct.
Figure 3.11: Visualization of a rectangle (left), code fragment of a complex, custom
rectangle datatype (center), and conceptual representation of a conglomerate variable
myRect with components left, bottom, right, and top (right).
3.6.1 struct
A struct allows a programmer to bundle multiple variables into a
single entity. For example, if we wish to define a rectangle via its
four coordinates on an x-y plane as shown in Figure 3.11 (left),
these four coordinates can be bundled into a single, conglomerate
data structure, whose internal structure will look like the code in
Figure 3.11 (center). Structs are represented conceptually with a
single box in which all the component fields reside, each with their
own box. Figure 3.11 (right) shows a variable myRect with its four
fields.
Figure 3.12: Various syntactic options for creating struct tags, types, and instantiating
variables.
Syntactically, there are multiple ways to declare, define, and
use structs. Figure 3.12 shows four different syntactic options that
all create the same conceptual struct. Regardless of which syntactic
option you choose, the drawing of your conceptual representation
will be the same. It is not important for you to be “fluent” in all
four options. You may choose a single approach and stick with it.
However, it is important for you to know about all four options
because others contributing to the same code base as you may have
a different style, and internet searches will also result in many
versions of effectively the same code. You need to be aware of
these differences so that you can correctly understand and extend
code whose syntax differs from your preferred style.
Struct declarations do not go inside functions; they live in the
global scope of the program, next to function declarations and
definitions. All of them use the keyword struct. Option 1 in
Figure 3.12 begins with the keyword struct, followed by the tag
of our choosing. In this case, we use the tag rect_t. Ending the tag
in “_t” is a convention that makes it easier to recognize the name
as identifying a type throughout your code. A name such as rect
would be acceptable, just a little less reader-friendly. Everything
inside the braces belongs to the definition of the newly defined
struct named rect_t. The semi-colon indicates the completion of
the definition.
The far right column of Figure 3.12 shows how to instantiate
a variable for each syntactic option. For Option 1, the type of the
variable is struct rect_t, and the name of the variable is myRect.
Once you declare the variable, you can access the fields of the
struct using a dot (period): myRect.top gives you access to the
field top of type int inside the myRect variable. Note: when you
instantiate a variable of type struct rect_t, you choose a top
level name (myRect) only. The names of the fields are determined
in the definition of the structure and cannot be customized on a
per-instance basis.
Video 3.3: Declaration and use of a struct for a
rectangle.
Video 3.3 shows an example of a declaration of a struct for a
rectangle, as well as its initialization and use. Note that the
assignment statements follow the same basic rules we have seen so
far: you find the box named by the left side, evaluate the right side
to a value, and store that value into the box named by the left side.
The dot operator just gives you a different way to name a box—
you can name a “box inside a box.”
A key part of good programming is using good abstractions.
Structs are another form of abstraction. Once we have a rectangle
struct, other pieces of code can operate on rectangles without
looking at the implementation. We could write many functions to
manipulate rectangles, and those functions could be the only
pieces of code that know the internal details of rectangles—the rest
of the code could just call those functions.
However, part of using good abstractions is using them
correctly. In the case of structs, remember that their primary
purpose is to group together data that belongs together logically. In
this example, we use a struct for a rectangle—something that
logically makes sense as a combination of other pieces of data. In
Figure 3.11 we illustrate the connection between the conceptual
idea (the visualization on the left) and the declaration (in the
middle). We can think about operations on rectangles and
understand what they are conceptually, without looking at the
implementation details.
While it may seem silly to say: do not just group data together
into structs without a logical purpose. Sometimes novice
programmers are tempted to just put a bunch of things in a struct
because they get used in the same parts of a program (to pass one
parameter around instead of a few). However, if you cannot
articulate why those data make sense together, they do not belong
in a struct together.
3.6.2 typedef
Many consider Option 1 in Figure 3.12 to be somewhat unwieldy
because the type of the variable includes the word struct in it. For
example, suppose you wanted a function called shrinkRect that
takes a rectangle as its input and returns a smaller rectangle as its
output. Using the syntax of Option 1, the function would have the
signature
struct rect_t shrinkRect(struct rect_t shrinkThisRectangle).
Depending on how often you need to write out the type of the
structure, this syntax can become cumbersome and make your
code appear cluttered.
The solution to needing to type out “struct rect_t” every
time you want to declare, pass, or use your new struct is to create a
new data type that is explicitly of type struct. We do this using the
keyword typedef. The exact syntax is shown in Option 2 of
Figure 3.12. The first lines declare the rect_tag struct in the same
way as before. However, after this struct definition, the last line
(typedef struct rect_tag rect_t;) is the declaration of the
type rect_t, which is defined as having the type struct
rect_tag. Options 3 and 4 also “typedef” a new type; however,
they both combine the typedef into a single statement with the
structure declaration.
Figure 3.13: Use of typedef. Left: code that defines and uses a new data type, rgb_t
to store color values. Right: a change to the definition of the type requires no
subsequent code changes.
Although typedefs can simplify the use of structs, that is far
from their only use. Any time that you are writing code in a
specific context, typedefs can help you make your code more
readable by naming a type according to its meaning and use. For
example, suppose you are writing a program that deals with colors.
In the context of programming color characteristics, you might
want to define a new data type for the colors in an RGB value. For
example, you could create a new data type called rgb_t (which
represents one of the red, green, or blue components of the color)
of type unsigned int (because we know the values should be
positive integers) and then declare variables red, green, and blue
of type rgb_t. An example of this is shown on the left side of
Figure 3.13. typedefs provide a helpful abstraction for
programmers. Instead of having to write “unsigned int”
throughout her code, or frankly even think about the range of
acceptable values in RGB representations, the programmer simply
uses the custom type rgb_t and gives it no further thought.
typedefs have another nice property of limiting the definition
of a particular type to a single place in the code base. Suppose a
programmer wished to conserve the space dedicated to variables
and therefore wished to use an unsigned char instead of an
unsigned int (after all, the values from 0 to 255 all fit within the
eight bits of an unsigned char). Without a typedef, this change
would require a tedious and error-prone search of many (but by no
means all—it may be used for variables unrelated to colors)
instances of unsigned int throughout the code, changing these
types to unsigned char. With a typedef, the programmer simply
changes the single line of code in which rgb_t was defined (see
the right side of Figure 3.13). No other code changes are required.
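A sketch of what Figure 3.13 describes (the variable names come from the text; the exact code in the figure may differ):

```c
/* One component of an RGB color; the values 0-255 are positive,
   so an unsigned type fits (left side of Figure 3.13). */
typedef unsigned int rgb_t;

rgb_t red   = 255;
rgb_t green = 64;
rgb_t blue  = 192;

/* Right side of Figure 3.13: to store each component in one byte,
   only the typedef line changes:
       typedef unsigned char rgb_t;
   The declarations of red, green, and blue need no edits. */
```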
Heads up about typedef. The use of typedefs is somewhat
controversial in some programming circles. In the context of
structs, there are those who believe it is important not to abstract
the struct away from a type. They believe that programmers should
always know when a particular variable is a struct and when it is
not. Similarly, they believe that programmers should always be
aware of the actual types of the data they use, lest they fall prey to
typing errors that could have been otherwise avoided. Use
typedefs when the abstraction simplifies rather than obfuscates
your code.
3.6.3 Enumerated Types
Figure 3.14: Enumerated types. Above: Declaration of an enumerated type. Below:
Code that uses this type.
The last form of custom type that a programmer can create is
called an enumerated type. Enumerated types are named constants
that can increase the readability and the correctness of your code.
They are most useful when you have a type of data with a set of
values that you would like to label by their conceptual name
(rather than using a raw number), and either the particular
numerical values do not matter (as long as they are distinct), or
they occur naturally in a sequential progression. For example,
until 2011 the United States’ Homeland Security maintained a
color-coded terrorism threat advisory scale, which it used to
maintain heightened or more relaxed security in various locations,
including major airports. There were five threat levels from green
to red in ascending order of severity.
These five threat levels could be recorded in an enumerated
type, which we can create ourselves as shown in Figure 3.14. We
begin with the keyword enum, followed by the name of the new
enumerated type, in this case threat_level_t. The various threat
levels are placed in curly braces, as shown. Each level is assigned
a constant value, starting with 0. The enumerated names are
constant—they are not assignable variables. Their values cannot
change throughout the program. The convention for indicating that
a name denotes a constant is to write the name in all uppercase.
However, variables of the enumerated type can be created and
assigned to normally.
Because enumerated types have integer values, they can be
used in constructs such as simple value comparisons, switch
statements, and for loops. Figure 3.14 shows an example of the
first two. Video 3.4 illustrates the execution of this code.
Video 3.4: Executing code that uses an enumerated
type.
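A sketch of what Figure 3.14 might contain (the level names follow the DHS color scale described above; the levelColor helper is our own illustration, not from the figure):

```c
/* The five threat levels, in ascending order of severity.
   LOW is assigned 0, GUARDED 1, and so on up to SEVERE = 4. */
enum threat_level_t { LOW, GUARDED, ELEVATED, HIGH, SEVERE };

/* Illustrative use in a switch statement: map a level to its color. */
const char * levelColor(enum threat_level_t lvl) {
  switch (lvl) {
    case LOW:      return "green";
    case GUARDED:  return "blue";
    case ELEVATED: return "yellow";
    case HIGH:     return "orange";
    case SEVERE:   return "red";
    default:       return "unknown";
  }
}
```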
Another example of enumerated types would be if we wanted
to make a program that regularly refers to a small set of fruits:
grapes, apples, oranges, bananas, and pears. Suppose we want to
represent each of these as a number (because we regularly use
constructs like switch statements on the fruits themselves), but we
do not really care which number each is represented as. We can
make an enumerated type, enum fruit_t { GRAPE, APPLE,...};
and then use these constants throughout our code.
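Completing that sketch with the fruits named above (the isCitrus helper is purely illustrative):

```c
/* The fruits from the text; their particular values (0-4) do not matter. */
enum fruit_t { GRAPE, APPLE, ORANGE, BANANA, PEAR };

/* An illustrative switch statement on a fruit_t value. */
int isCitrus(enum fruit_t f) {
  switch (f) {
    case ORANGE: return 1;  /* the only citrus fruit in our set */
    default:     return 0;
  }
}
```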
3.7 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 3.1 : Why is “Everything Is a Number” an
important rule of programming?
• Question 3.2 : What is a char?
• Question 3.3 : What are floats and doubles? How are
they similar? How are they different?
• Question 3.4 : What two things do a variable’s type tell the
compiler about that variable?
• Question 3.5 : In the table below there are four columns:
one for binary, one for hexadecimal (hex), and two for
decimal (base 10). The decimal column on the left
interprets the number as eight-bit unsigned binary, and the
decimal column on the right interprets the number as
eight-bit signed two’s complement binary. In each row, you
are given one of the numbers and should fill in the other
three columns by performing the appropriate conversions:

Binary      Hex     Decimal (Unsigned)   Decimal (Signed)
00000000
            0x3C
                    200
                                         -42
0110101
            0x7F
                    100
                                         87
• Question 3.6 : What is type promotion? What is casting?
How are they similar? How are they different?
• Question 3.7 : Assume that we have executed int a = 4;
and int b = 5;. What are the type and value of each of the
following expressions?
1. a / b
2. (double)(a / b)
3. a / (double)b
4. (double)a / b
5. a - b / 2
6. a - b / 2.
• Question 3.8 : What happens if integer arithmetic results in
a number too large to represent?
• Question 3.9 : How are colors represented as numbers?
• Question 3.10 : What is a string? How does it relate to the
“Everything Is a Number” principle?
• Question 3.11 : What does typedef do?
• Question 3.12 : Suppose you are writing software in which
you need a unique sequence of numbers, and you decide
that unsigned long is a sufficiently large type to work with
them. How could you give this type a name (e.g., seq_t) so
that you can use that name throughout your program? Write
the C statement that would accomplish this goal.
Chapter 4
Writing Code
Writing code is (or at least, should be) 90% planning. Investing
an extra 10 minutes in carefully planning out a piece of code
can save hours of debugging a snarled mess later on. Many
novice programmers mistakenly jump right into writing code
without a plan, only to end up pouring hours into what should
be a relatively short task.
Planning first is not only the best approach for novices,
but also for skilled programmers. However, if you see a highly
experienced programmer in action, you may not see her
planning when working on a relatively easy problem. Not
seeing the planning does not mean she is not doing it, but that
she is capable of doing all of the planning in her head. As you
advance in programming skill, this will eventually happen for
you as well—there will be certain problems that you can just
solve in your head and write down the solution. Of course,
continued practice on harder problems is what builds this
skill, so you will benefit most from working on problems at
the difficult end of your capabilities.
As we discussed in Chapter 1, planning for programming
primarily consists of developing the algorithm to solve the
relevant problem. Once the algorithm is devised (and tested),
translating it to code becomes relatively straightforward. Once
you have implemented your program in code, you will need to
test—and likely debug—that implementation. Having a clear
plan of what the program should do at each step makes the
debugging process significantly easier.
Even though Chapter 1 explained Steps 1–4 and worked
some examples, we are going to revisit them now. One reason
for revisiting these steps is that they are crucial to
programming, and it is likely that they have faded somewhat
from your mind, as we last saw them three chapters ago.
However, we are also going to revisit them now, as you have
been introduced to the material in Chapter 2 and Chapter 3,
which you did not know in Chapter 1. Accordingly, we can
now talk about types and representing everything as numbers.
After we revisit Steps 1–4, we will continue on to Step 5,
translating our algorithms to code. At the end of this chapter,
Video 4.1 will work an entire problem from Step 1 to Step 5.
4.1 Step 1: Work an Example Yourself
The first step to devising an algorithm is to work an instance of
the problem yourself. As we discussed earlier, if you cannot do
the problem, you cannot hope to write an algorithm to do it—
that is like trying to explain to someone how to do something
you yourself do not understand how to do. However, you have
to not only be able to do the problem, but also do it
methodically enough that you can analyze what you did and
generalize it.
Often, a key part of working the problem yourself is
drawing a picture of the problem and its solution. Drawing a
clear and precise picture allows you to visualize the state of the
problem as you manipulate it. Having a clear idea of the state
of the problem and how you are manipulating it will help you
with the next step, in which you write down precisely what you
did on this instance of the problem.
We will use the following problem as an example to work
from for the rest of this chapter:
Given two rectangles, compute the rectangle that
represents their
intersection. You may assume the rectangles are
vertical or horizontal.
Figure 4.1: Working an example of the rectangle intersection problem.
The first thing we should do here is work at least one
instance of this problem (we may want to work more). In order
to do this, we need a bit of domain knowledge—what a
rectangle is (a shape with four sides, such that adjacent sides
are at right angles) and what their intersection is (the area that
is within both of them).
What instance we pick is really up to us. For some
problems, some instances will be more insightful than others,
and some will expose corner cases—inputs where our
algorithm needs to behave specially. The most important rule in
picking a particular instance of the problem is to pick one that
you can work completely and precisely by hand.
Figure 4.1 shows the results of Step 1 for the rectangle
intersection problem. We picked an instance of the problem—
here the yellow-shaded rectangle intersecting with the
blue-shaded rectangle shown in Figure 4.1. The resulting
intersection is the green-shaded rectangle.
Figure 4.2: Another instance of the rectangle problem, with the Cartesian grid
removed.
You should note a few things about this example. First,
while the yellow/blue/green coloring is not truly a part of the
problem, there is nothing wrong with adding extra information
to your diagram to help you understand what is going on.
Second, note that the diagram is done precisely—we drew a
Cartesian coordinate grid and placed the rectangles at their
correct coordinates. This precision ensured that any
information we obtained from analyzing our diagram was
correct and not an artifact of sloppy drawing (though even a
careful drawing cannot guarantee that a relationship we observe
is generally true rather than a consequence of the specific case we chose).
You typically do not need to draw things with the precision of a
drafting technician, but the more precise you can be, the better.
In this case, we can tell the answer just by looking at the
picture and seeing where the green region is. However, to write
a program to do this, we need to figure out the math behind it
—we need to be able to work the problem in some way other
than just looking at the diagram and seeing the answer.
Sometimes trying to work things out mathematically is hard when
you can just see the answer. Learning to put the obvious aside
and think about what is going on is a key programming skill to
learn, but this takes some time.
If you struggle with it, it may be useful to work another
instance of the problem but eliminate extra information that lets
you jump straight to the answer without understanding.
Figure 4.2 shows a different instance of the rectangle problem
with the Cartesian grid removed (note that it was still drawn
such that the rectangles are the right size and in the correct
relative positions). We can still precisely work the problem
from this diagram, but it is a little harder to just look at the
Cartesian grid and see the answer. Take a second to work out
the answer before you continue.
Note that there is nothing wrong with working a few
instances of the problem, taking different approaches as you do
it, and including/excluding various extra information as you do
so. In general, it is better to spend extra time in an earlier step
of programming than getting stuck in a later step (if you do get
stuck, you might want to go back to an earlier step and redo it
with another instance of the problem). For Step 1, doing a few
different instances of the problem is preferable to moving into
Step 2 and only being able to come up with “I just did it—it
was obvious.”
4.2 Step 2: Write Down What You Just
Did
Now you are ready to think about what it was exactly that you
just did and write down a step-by-step approach to solve one
specific instance of the problem. Using our example from
Figure 4.2, this would basically be a set of steps anyone could
follow to find the intersection of the rectangle from (-2, 1)
to (6, 3) with the rectangle from (-1, -1) to (4, 4).
Note that you are not trying to generalize to any rectangles
here; rather, you are writing down what you did for this
particular pair of rectangles.
There are actually two important pieces to think about
here. The first is how you represented the problem with
numbers. Remember the key rule of programming: “Everything
Is a Number.” Since a rectangle is not a number (in the way, for
example, the price of bread is), we will have to find a way to
properly represent a rectangle using a number (or several). If
you go back and read the descriptions of the instances of the
problems we worked, you will find that we already have been
representing each rectangle with four numbers—two for the
bottom left corner and two for the top right corner. Now that
we have assured ourselves rectangles are numbers, we know
we can happily compute on them—we also have an idea of
what information we should think of a rectangle as having
(each of which is just a number): a bottom, a left, a top, and a
right. This analysis leads us to make the same definition of a
rectangle as in Section 3.6.1, but we underscore it here as it is
an important part of the programming process.
The second thing we need to think about is what exactly it
was that we did to come up with our answer, and write it down.
These steps can be anywhere in the spectrum of pseudocode—
notation that looks like programming but does not obey any
formal rules of syntax—to pure natural language (e.g., English)
that you are comfortable with. The important thing here is not
any particular notation but to have a clear idea of what you did
in a step-by-step fashion before you try to generalize the steps.
For example, we might write the following:
I found the intersection of
left: -2
bottom: 1
right: 6
top: 3
and
left: -1
bottom: -1
right: 4
top: 4
by making a rectangle with
left: -1
bottom: 1
right: 4
top: 3
In this case, we do not have many steps, but it is still crucial for
us to write them down.
4.3 Step 3: Generalize Your Steps
Now that we know what we did for this particular instance, we
need to generalize to all instances of the problem. We will note
that this step is often the most difficult (you have to think about
why you did what you did, recognize patterns, and figure out
how to deal with any possible inputs) and the most mistake
prone (which is why we test the algorithm in Step 4 before we
proceed).
4.3.1 Generalizing Values
One aspect of generalizing your algorithm is to scrutinize each
value you used and contemplate what it is in the general case.
Is it a constant that does not change depending on the inputs?
Does it depend on one (or more) of the parameters? If it does
depend on some of the parameters, what is the relationship
between them?
Going back to the rectangle example on which we did
Step 2, we came up with -1 for the left value of the answer
rectangle. We can quickly rule out the idea that this is a
constant—surely not all rectangles have -1 as the left side of
their intersection (counterexamples would be easy to come by
if we needed to convince ourselves).
Now we are left figuring out how -1 relates to the input
parameters. It could be that the left value of the answer
rectangle matches one of the values of the input rectangles—
both the left and the bottom of the second rectangle are -1. It
could be the case that it has some mathematical relationship to
another value—maybe the left of the first rectangle divided by
2, or plus 1, or maybe the negative of the bottom of the first
rectangle. Any of these would yield -1 and work in this case,
but we need to think about why the answer is -1 to figure out
the correct generalization.
Sometimes this analysis is quite difficult. Whenever you
get stuck on generalization, it can help to repeat Steps 1 and 2,
to give us more information to work with and more insight. For
example, looking back at the other example we worked first in
Step 1, we can rule out some of the ideas we pondered in the
prior paragraph. From these two examples, we might draw the
conclusion that the left value of the intersection is the left value
of the second rectangle. We might proceed similarly to generate
the following generalized algorithm (as with Step 2, notational
specifics do not matter as long as you are precise enough that
each step has a clear meaning):
To find the intersection of two rectangles, r1
and r2:
Your answer is a rectangle with
left: r2’s left
bottom: r1’s bottom
right: r2’s right
top: r1’s top
While these generalized steps accurately describe the two
examples we did, they are in fact not a correct generalization.
Figure 4.3: A pair of rectangles, where our algorithm gives the wrong answer
(red dashed rectangle).
Figure 4.3 shows a pair of rectangles where our algorithm
gives the wrong answer—shown with a red dashed rectangle. If
we make an incorrect generalization such as this, we should
catch it in Step 4 (or if not, then when we test the code at the
end of Step 5). In such a case, we must return to Step 3 before
proceeding and fix our algorithm.
When you detect a mis-generalization of your algorithm,
you have the advantage that you have already worked at least
one example that highlights a case you need to analyze
carefully. In this case, we can see that we want r1’s right (not
r2’s right) for the right side of the answer, and r2’s bottom (not
r1’s bottom) for the bottom side of the answer. Note that r1’s
right and r2’s bottom did not work for the earlier cases, so we
cannot simply change our algorithm to use those in all cases.
Instead, we must think carefully about when we need which
one and why.
Careful scrutiny will lead us to conclude that we need the
minimum of r1’s right and r2’s right, and the maximum of r1’s
bottom and r2’s bottom. We may also realize that we should do
something similar for the left and top (if not, we should find
that out when repeating Step 4). We could then come up with
the following correctly generalized steps:
To find the intersection of two rectangles, r1
and r2:
Make a rectangle (called ans) with
left: maximum of r1’s left and r2’s left
bottom: maximum of r1’s bottom and r2’s bottom
right: minimum of r1’s right and r2’s right
top: minimum of r1’s top and r2’s top
That rectangle called ans is your answer.
We will note that in the case of rectangles that do not intersect,
this algorithm will produce an illogical rectangle as the answer
(its top will be less than its bottom and/or its left will be greater
than its right). For the purpose of this problem, we will say that
giving such an invalid rectangle in these cases is the intended
behavior of the algorithm—in part because we have not learned
how to represent “no such thing” easily.
4.3.2 Generalizing Repetitions
Another important part of generalizing an algorithm is to look
for repetitions of the same (or similar) steps. When similar
steps repeat, you will want to generalize your algorithm in
terms of how many times the steps repeat (or until what
condition is met). To examine this aspect of generalizing, we
will deviate from our rectangle example (which does not have
this type of repetition) and consider a slightly different problem
for a moment:
Given an integer N (> 0), print a right triangle
of *s, with height N
and base N.
For example, if N = 4, you would print:
*
**
***
****
We might work an example with N = 5 and end up with the
following result from Step 2:
Print 1 star
Print a newline
Print 2 stars
Print a newline
Print 3 stars
Print a newline
Print 4 stars
Print a newline
Print 5 stars
Print a newline
Here, we are doing almost the same thing (Print stars; print a
newline.) five times. Once we observe the repetition, we can
take one step towards generalizing the algorithm by re-writing
the algorithm like this:
Count (call it i) from 1 to 5 (inclusive)
Print i stars
Print a newline
Notice that the way we have re-written the algorithm here gives
us two new constants to scrutinize: the 1 and the 5 in the range
that we count from/to. Careful consideration of these would
show that 1 is truly a constant (we always start counting at 1
for this algorithm), but 5 should be generalized to N:
Count (call it i) from 1 to N (inclusive)
Print i stars
Print a newline
This algorithm is correct for the triangle-of-stars problem.
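The generalized steps above translate naturally into a counting loop. Here is one possible sketch in C (returning the total star count is our addition to make the function easy to check; it is not part of the problem):

```c
#include <stdio.h>

/* Prints a right triangle of *s with height N and base N.
   Returns the total number of stars printed. */
int printTriangle(int N) {
  int total = 0;
  for (int i = 1; i <= N; i++) {   /* count i from 1 to N (inclusive) */
    for (int j = 0; j < i; j++) {  /* print i stars */
      printf("*");
      total++;
    }
    printf("\n");                  /* print a newline */
  }
  return total;
}
```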
Sometimes it takes a little more work to make the steps of
your algorithm match up so that you can describe them in terms
of repetition. For example, consider the following problem:
Given a list of numbers, find their sum.
We might work this problem on the list of numbers 3, 5,
42, and 11 and end up with the following result from
Step 2:
Add 3 + 5 (= 8)
Add 8 + 42 (= 50)
Add 50 + 11 (= 61)
Your answer is 61.
Scrutinizing each of these constants might lead us to the
following more general steps:
Add (the 1st number) + (the 2nd number)
Add (the previous total) + (the 3rd number)
Add (the previous total) + (the 4th number)
Your answer is (the previous total).
Here, we almost, but not quite, have a nice repetitive pattern.
We can, however, make the steps match up:
previous_total = 0
previous_total = Add previous_total + (the 1st
number)
previous_total = Add previous_total + (the 2nd
number)
previous_total = Add previous_total + (the 3rd
number)
previous_total = Add previous_total + (the 4th
number)
Your answer is previous_total.
Note that mathematically speaking, what we did was
exploit the fact that 0 is the additive identity—0 + x = x
for any number x. We will also note that starting with the
identity element as our answer before doing math to the items
in a list is typically a good idea, since the list may be empty.
Often, the correct answer when performing math on an empty
list is the identity element of the operation you are performing.
That is, the sum of an empty list of numbers is 0; the product of
an empty list of numbers is 1 (the multiplicative identity). Now
that we have re-arranged our steps, we can generalize nicely:
previous_total = 0
Count (call it i) from 1 to how many numbers you
have
previous_total = Add previous_total + (the ith
number)
Your answer is previous_total.
In this example, we also did something that will make Step 5
(translating to code) a bit easier—naming values that we want
to manipulate. In particular, we gave a name to the running
total we compute, which means that not only is it clear exactly
what we are referencing when we say previous_total, but
also that when we reach Step 5, this will translate directly into
a variable.
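The generalized steps above translate directly into a loop. A sketch in C (representing "a list of numbers" as an array plus a count is our assumption about the interface):

```c
/* Sums a list of numbers, translating the steps above directly. */
int sum(const int * numbers, int howMany) {
  int previous_total = 0;                /* start from the additive identity */
  for (int i = 0; i < howMany; i++) {    /* count over every number */
    previous_total = previous_total + numbers[i];
  }
  return previous_total;                 /* the answer; 0 for an empty list */
}
```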
4.3.3 Generalizing Conditional Behavior
Sometimes when we are generalizing, we will have steps that
appear sometimes but not others. We may perform a step for
some parameter values but not for others, or we may have steps
that are almost repetitive, but some actions appear in some
repetitions and not in others. In either case, we need to figure
out under what conditions we should do those steps.
It may take some work and thinking to determine these
patterns, where some conditions indicate we perform certain
steps, and some conditions indicate we do not. As with many
things in generalizing, if it is not immediately apparent, it can
be quite useful to work more examples—giving you more
information to generalize from. You might also find it
informative to make a table of the circumstances (parameter
values, information specific to each repetition, etc.) and
whether or not the steps are done under those circumstances.
Once you have figured out the pattern, you can express the
step in the algorithm more generally by describing the
condition that should be determined, as well as what to do if
that condition is true and what to do if it is false. Doing so
makes your algorithm a little bit more general and may help
you express a large sequence of steps as repetition, since they
will now be more uniform.
4.3.4 Generalization: An Iterative Process
Generalization is an iterative process—you take what you have,
generalize (or rewrite it) a bit, and then try to generalize that
result more. Sometimes one step of generalization opens up
new avenues of generalization that were not visible before. We
have already seen how recognizing repetitive patterns can lead
to the opportunity to generalize in terms of how many times
you do the repeated steps. You may also end up exposing the
repetitive pattern of some steps only once you have figured out
what the generalization of the values in those steps is.
4.4 Step 4: Test Your Algorithm
Once you have generalized your algorithm, it is time to test it
out. To test it out, you should work it for different instances of
the problem than the one(s) you used to come up with it. The
goal here is to find out if you mis-generalized before you
proceed. We have already seen one instance of mis-
generalization in our rectangle problem, in which our algorithm
was too specific to the examples from which we built it (always
using r1’s bottom, r2’s left, etc.). Testing on these same
examples would not have revealed any problems.
In doing this testing, you want to strike a balance—
enough testing to give you confidence that your algorithm is
correct before you proceed, but not an excessive amount of
testing. Note that in this testing, you perform your steps by
hand, so it may be somewhat slow for a long or complex
algorithm. You can do more extensive testing after you
translate your algorithm to code. The tradeoff there is that the
computer will execute your test cases (which is fast), but if
your algorithm is not correct, you have spent time
implementing the wrong algorithm.
Here are some guidelines to help you devise a good set of
test cases to use in Step 4:
• Try test cases that are qualitatively different from what
you used to design your algorithm. In the case of our
rectangle example, the two examples we used to build
the algorithm were both fairly similar, but the third
example (which we used to show the flaw) was
noticeably different—the rectangles overlapped in a
different way.
• Try to find corner cases—inputs where your algorithm
behaves differently. If your algorithm takes a list of
things as an input, try it with an empty list. If you count
from 1 to N (inclusive), try N = 0 (you will count
no times) and N = 1 (you will count only one time).
• Try to obtain statement coverage—that is, between all
of your test cases, each line in the algorithm should be
executed at least once. We will discuss various forms of
test case coverage later in Chapter 6.
• Examine your algorithm and see if there are any
apparent oddities in its behavior (it always answers
“true,” it never answers “0” even though that seems like
a plausible answer, etc.), and think about whether or not
you can get a test case where the right answer is
something that your algorithm cannot give as its
answer.
4.5 Step 5: Translate Your Algorithm to
Code
Now that you are confident in your algorithm, it is time to
translate it into code. This task is something that you can do
with pencil and paper (e.g., as you often will need to do in a
programming class on exams), but most of the time, you will
want to actually type your code into an editor so that you can
compile and run your program. Here, we will primarily focus
on the mechanics of the translation from algorithmic steps to
code. We strongly recommend that you acquaint yourself with a
programmer’s editor (Emacs or Vim) and use it whenever you
program. We cover Emacs in Appendix C, if you need an
introduction.
We should start Step 5 by writing down the declaration of
the function we are writing, with its body (the code inside of it)
replaced by the generalized algorithm from Step 3, written as
comments. Comments are lines in a program that have a
syntactic indication they are for humans only (to make notes on
how things work and help people read and understand your
code), and not an actual part of the behavior of the program.
When you execute code by hand, you should simply skip over
comments, as they have no effect. In C, there are two forms of
comments: // comments to the end of the line and /*...*/,
which makes everything between the slash-star and the star-
slash into a comment.
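For example, both comment forms might appear in a function like this (addOne is a made-up name for illustration only):

```c
/* An illustrative function showing both comment forms. */
int addOne(int x) {
  // this comment runs to the end of the line
  /* this comment runs until the
     closing star-slash, even across lines */
  return x + 1;
}
```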
One thing we may need to do in writing down the function
declaration is figure out its parameter types and return type.
These may be given to us—in a class programming problem,
you may be told as part of the assignment description; in a
professional setting, it may be part of an interface agreed upon
between members of the project team—however, if you do not
know, you need to figure this out before proceeding.
Returning to our rectangle intersection example, we know
that the function we are writing takes two rectangles and
returns a rectangle. Earlier, we decided that a rectangle could
be represented as four numbers—suggesting a struct. However,
as you learned earlier, there are a variety of different types of
numbers—should these numbers be ints? floats? doubles?
Or some other type of number?
The answer to the question about which type of number
we need is “It depends.” You may be surprised to learn that “It
depends” is often a perfectly valid answer to many questions
related to programming; however, if you give this answer, you
should describe what it depends on, and what the answer is
under various circumstances.
For our rectangle example, the type we need depends on
what we are doing with the rectangles. One of the real number
types (float or double) would make sense if we are writing a
math-related program, where our rectangles can have fractional
coordinates. Choosing between float and double is a matter of
what precision and what range we need for our rectangles. If
we are doing computer graphics and working in the coordinates
of the screen (which come in discrete pixels), then int makes
the most sense, as you cannot have fractional pieces. For this
example, we will assume that we want to use floats.
With this decision made, we would start our translation to
code by declaring the function and writing the algorithm in
comments. We then go through and translate each of the steps
into code, line by line. If you have written good (i.e., clear and
precise) steps in Step 3, this translation should be fairly
straightforward—most steps you will want to implement naturally
translate into the syntax we have already learned:
Repetition Whenever you have discovered repetition
while generalizing your algorithm, it translates into
a loop. Typically, if your repetition involves
counting, you will use a for loop. Otherwise, if
you are sure you always want to do the body at
least once, a do-while is the most appropriate
type. In other cases (which typically align with
steps like “as long as (something)…”), while loops
are generally your best bet. If your algorithm calls
for you to “stop repeating things” or “stop
counting” you will want to translate that idea into a
break statement. Meanwhile, if your algorithm
calls for you to skip the rest of the steps in the
current repetition and go back to the start of the loop,
that translates into a continue statement.
Decision Making Whenever your algorithm calls for you
to make a decision, that will translate into either
if/else or switch/case. You will typically only
want switch/case when you are making a
decision based on many possible numerical values
of one expression. Otherwise, you will want
if/else.
Math Generally, when your algorithm calls for
mathematical computations, these translate directly
into expressions in your program that compute that
math.
Names When your algorithm names a value and
manipulates it, that translates into a variable in
your program. You need to think about what type
the variable has and declare it before you use it. Be
sure to initialize your variable by assigning to it
before you use it—which your algorithm should do
anyways (if not, what value did you use when
testing it in Step 4?).
Altering Values Whenever your algorithm manipulates
the values it works with, these translate into
assignment statements—you are changing the
value of the corresponding variable.
Giving an Answer When your algorithm knows the
answer and has no more work to do, you should
write a return statement, which returns the answer
you have computed.
Complicated Steps Whenever you have a complex line
in your algorithm—something that you cannot
translate directly into a few lines of code—you
should call another function to perform the work
of that step. In some cases, this function will
already exist—either because you (or some
member of your programming team) has already
written it, or because it exists in the standard C
library (or another library you are using). In this
case, you can call the existing function (possibly
reading its documentation to find its exact
parameters), and move on to translating the next
line of your algorithm.
In other cases, there will not already be a
function to do what you need. In these cases, you
should decide what parameters the function takes,
what its exact behavior is, and what you want to
call it. Write this information down (either on
paper, or in comments elsewhere in your source
code), but do not worry about defining the function
yet. Instead, just call the function you will write in
the future and move on to translating the next line
of your algorithm. When you finish writing the
code for this algorithm, you will go implement the
function you just called—this is a programming
problem all of its own, so you will go through all
of the Seven Steps for it.
Abstracting code out into a separate function
has another advantage—you can reuse that
function to solve other problems later. As you
write other code, you may find that you need to
perform the same tasks that you already did in
earlier programming problems. If you pulled the
code for these tasks into their own functions, you
can simply call those functions. Copy/pasting code
is generally a terrible idea—whenever you find
yourself inclined to do so, you should instead find
a logical way to abstract it out into a function and
call that function from the places where you need
that functionality.
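These translation patterns can be seen together in one small sketch. The algorithm here (summing the digits of a number) and all of its names are our own illustration, not part of the rectangle or triangle examples; its comments are the algorithm steps, in the style used throughout this chapter.

```c
//A made-up algorithm: "sum the digits of n, treating negative
//numbers as if they were positive."

int absoluteValue(int x); //a complicated step: a helper we will write

int digitSum(int n) {
  //Make n non-negative (a complicated step -> a function call)
  n = absoluteValue(n);
  //Name a value (call it sum), starting at 0 (a name -> a variable)
  int sum = 0;
  //As long as n is not 0... (repetition -> a while loop)
  while (n != 0) {
    //...add n's last digit to sum (altering a value -> assignment)
    sum = sum + (n % 10);
    //...and drop n's last digit
    n = n / 10;
  }
  //sum is your answer (giving an answer -> a return statement)
  return sum;
}

//The helper we promised to write afterwards; its decision
//translates into if/else.
int absoluteValue(int x) {
  if (x < 0) {
    return -x;
  }
  else {
    return x;
  }
}
```

Note how the "make n non-negative" step was just written as a call to absoluteValue, which we then implemented afterwards as its own small problem.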
With a clearly defined algorithm, the translation to code
should proceed in a fairly straightforward manner. Initially, you
may need to look up the syntax of various statements (you did
make that quick reference sheet we recommended in Chapter 2,
right?), but you should quickly become familiar with them. If
you find yourself struggling with this translation, it likely either
means that your description of your algorithm is too vague (in
which case, you need to go back to it, think about what
precisely you meant, and refine it), or that the pieces of your
algorithm are complex, and you are getting hung up on them,
rather than calling a function (as described above) to do that
piece, which you will write afterwards.
The process of taking large, complex pieces and
separating them out into their own functions—known as top-
down design—is crucial as you write larger and larger
programs. Initially, we will write individual functions to serve a
small, simple purpose—we might write one or two additional
functions to implement a complex step. However, as your
programming skill expands, you will write larger, more
complex programs. Here, you may end up writing dozens of
functions—solving progressively smaller problems until you
reach a piece small enough that you do not need to break it
down any further. While it may seem advantageous to just
write everything in one giant function, such an approach not
only makes the programming more difficult, but also tends to
result in a complex mess that is difficult to test and debug.
Whenever you have a chance to pull a well-defined logical
piece of your program out into its own function, you should
consider this an opportunity, not a burden. We will talk much
more about writing larger programs in Chapter 13 after you
master the basic programming concepts and are ready to write
significant pieces of code.
4.5.1 Composability
When you are translating your code from your algorithmic
description to C (or whatever other language you want), you
can translate an instruction into code in the same way, no
matter what other steps it is near or what conditions or
repetitions it is inside of. That is, you do not have to do
anything special to write a loop inside of another loop, nor to
write a conditional statement inside of a loop—you can just put
the pieces together and they work as expected.
The ability to put things together and have them work as
expected is called composability and is important to building
not only programs, but other complex systems. If you put a for
loop inside of an if statement, you do not need to worry about
any special rules or odd behaviors: you only need to know how
a for loop and an if statement work, and you can reason about
the behavior of their combination.
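For instance, here is a small made-up function that puts a for loop inside an if statement. Nothing special is required: each construct follows its own rules, and the combination just works.

```c
//Sums the even numbers from 2 up to n (our own illustrative example).
int sumEvensUpTo(int n) {
  int sum = 0;
  if (n >= 2) {                       //an if statement...
    for (int i = 2; i <= n; i += 2) { //...with a for loop inside it
      sum = sum + i;
    }
  }
  return sum;
}
```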
In general, modern programming languages are designed
so that features and language constructs can be composed and
work as expected. C (and later C++) follows this principle
pretty well, so you can compose pretty much anything you
learn from this book with pretty much anything else (with one
notable exception that we will discuss in Section 18.7).
4.5.2 Finishing Our Earlier Examples
Now that we have seen how to translate our generalized steps
into code, we can finish our earlier examples. We will start with
the simpler one, the triangle of stars. Here, we see the code as
the translation of our generalized steps from earlier (which are
placed in the code as comments—each line corresponds to the
directions from the algorithm on the line right before it):
1 //prints i stars
2 void printIStars(int i) {
3 //Count (call it j) from 1 to i (inclusive)
4 for (int j = 1; j <= i; j++) {
5 // Print a star
6 printf("*");
7 }
8 }
9 //prints a triangle of n stars
10 void printStarTriangle(int n) {
11 //Count (call it i) from 1 to n (inclusive)
12 for (int i = 1; i <= n; i++) {
13 //Print i stars
14 printIStars(i);
15 //Print a newline
16 printf("\n");
17 }
18 }
Note how we abstracted out printIStars. The resulting
function is small enough and simple enough that we would
have been justified in writing it inline if we saw exactly how to
do it right away. However, there is nothing wrong with pulling
it out into its own function and solving it separately. In fact, if
it is not immediately obvious what to write to translate it,
abstracting it out is exactly what you should do.
The other example we saw earlier was our rectangle
intersection problem. We can translate these steps into code as
well, using the same approach. Again, we include the
generalized steps as comments and abstract out the maximum
and minimum functions—here, doing so allows us to avoid
writing the code for this functionality twice, as we need each in
two places. Here is the resulting code:
1 //a rectangle with left, bottom, top, and right
2 struct rect_tag {
3 float left;
4 float bottom;
5 float top;
6 float right;
7 };
8 typedef struct rect_tag rect_t;
9
10 float minimum(float f1, float f2) {
11 //compare f1 to f2
12 if (f1 < f2) {
13 //if f1 is smaller than f2, then f1 is your answer
14 return f1;
15 }
16 else {
17 //otherwise, f2 is your answer
18 return f2;
19 }
20 }
21 float maximum(float f1, float f2) {
22 //compare f1 to f2
23 if (f1 > f2) {
24 //if f1 is larger than f2, then f1 is your answer
25 return f1;
26 }
27 else {
28 //otherwise, f2 is your answer
29 return f2;
30 }
31 }
32 //To find the intersection of two rectangles, r1 and r2
33 rect_t intersection(rect_t r1, rect_t r2) {
34 //Make a rectangle (called ans) with
35 rect_t ans;
36 //left: maximum of r1’s left and r2’s left
37 ans.left = maximum(r1.left, r2.left);
38 //bottom: maximum of r1’s bottom and r2’s bottom
39 ans.bottom = maximum(r1.bottom, r2.bottom);
40 //right: minimum of r1’s right and r2’s right
41 ans.right = minimum(r1.right, r2.right);
42 //top: minimum of r1’s top and r2’s top
43 ans.top = minimum(r1.top, r2.top);
44 //The rectangle called ans is your answer
45 return ans;
46 }
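As a quick sanity check of this logic, here is a compact, self-contained restatement (the helpers are condensed into one line each, and the test rectangles in the comments are values of our own choosing, not from the text):

```c
//A rectangle with left, bottom, top, and right edges.
struct rect_tag { float left; float bottom; float top; float right; };
typedef struct rect_tag rect_t;

static float minimum(float f1, float f2) { return (f1 < f2) ? f1 : f2; }
static float maximum(float f1, float f2) { return (f1 > f2) ? f1 : f2; }

//Intersecting (0,0)-(4,4) with (2,1)-(6,5) should give (2,1)-(4,4):
//the larger lefts/bottoms and the smaller rights/tops.
rect_t intersection(rect_t r1, rect_t r2) {
  rect_t ans;
  ans.left = maximum(r1.left, r2.left);
  ans.bottom = maximum(r1.bottom, r2.bottom);
  ans.right = minimum(r1.right, r2.right);
  ans.top = minimum(r1.top, r2.top);
  return ans;
}
```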
4.6 A Complete Example
Video 4.1: Writing the isPrime function.
Video 4.1 walks through a complete example—writing the
isPrime function, which takes one integer (N) and determines if
N is prime (in which case, it returns 1) or not (in which case it
returns 0).
4.7 Next Steps: Compiling, Running,
Testing and Debugging
Steps 6 (testing your program) and 7 (debugging it) require you
to be able to run your code. The computer’s processor does not
actually understand the source code directly, so the source code
must first be translated to the numerically encoded (Everything
Is a Number) instructions that the processor can understand. In
the next chapter, we will discuss this process, as well as some
of the details of what you need to put in your source file to
make a complete program. We will then discuss testing and
debugging in Chapter 6.
4.8 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 4.1 : Write a function myAbs, which takes an
integer and returns an integer that is the absolute
value of that integer.
• Question 4.2 : Write a function avg3, which takes three
integers and returns the floating point number that is
their average. Be careful—you may need to think back to
what you learned in Section 3.4.2 to be sure you get the
right answer (in particular, if the average has a fractional
part, that part should not be truncated away by integer
division).
• Question 4.3 : Write a function myRound, which takes a
double and returns an integer. This function should
round its argument to the nearest integer, and return the
rounded result. (Hint: to get the fractional portion of the
number (the part after the decimal), think about how you
can get the integral portion (the part before the decimal)
using what you learned in Chapter 3—then think about
what mathematical operation you can use to compute the
fractional portion from the information you have.)
• Question 4.4 : Write a function factorial, which takes
an integer n and returns an int that is the factorial of n
(n! in math notation).
• Question 4.5 : Write a function isPow2, which takes an
integer n and returns an int that is 1 (“true”) if n is a
power of 2 and 0 (“false”) if it is not. Note that 1 is a
power of 2 (it is 2 to the 0th power), and 0 is not a power
of 2. Note: some approaches to this problem involve
computing powers of 2. In C, if you write 2^i it will NOT
compute 2 to the ith power—instead, it will compute the
bitwise exclusive-or (XOR) of 2 and i. If you want to
compute 2 to the ith power easily, you can write
1 << i (where << is the binary left shift operator—so it
takes the number 1 and puts i 0s after it in the binary
representation).
• Question 4.6 : Write a function printBinary, which
takes an integer n and returns void. This function
should print the binary representation of the number n.
You may assume that the number has at most 20 bits.
• Question 4.7 : Write a function printFactors, which
takes an integer n, and returns void. This function
should print the prime factorization of n—that is, it
should print a multiplication expression composed of
prime numbers, which if evaluated would result in n.
For example, given the number 132, your function
should print 2 * 2 * 3 * 11. If your function is given
a number that is 1 or less, your function should print
nothing. Hint: you should end up with at least two steps
that are too complex to translate directly into a single
line—these should result in additional functions you
write.
• Question 4.8 : Write a function isPerfect, which takes
an integer n and determines if n is perfect (hint: the
definition of “perfect number” is domain knowledge in
math—if you do not know it, look it up before you
attempt Step 1). If n is perfect, your function should
return 1, otherwise it should return 0.
• Question 4.9 : Write a function power, which takes two
unsigned integers, x and y, and returns an unsigned
integer. This function should compute and return x to
the yth power.
• Question 4.10 : For any of the previous questions, take
your code, and trade it with a friend. First, you should
examine the syntax of each other’s code—does it follow
the rules we specified in Chapter 2? If not, explain to
each other what you think is not correct. Next, have her
execute your code by hand (for some reasonable
parameter values) while you execute her code by hand.
Determine if each other’s code is correct. If not, help
each other understand what might be wrong, and figure
out how to fix it.
Chapter 5
Compiling and Running
Once you have written your code, you need to compile it in order
to be able to run it. Compiling a program is the act of translating
the human-readable code that a programmer wrote (called “source
code”) into a machine-executable format. The compiler is the
program that performs this process for you: it takes your source
code as input and writes out the actual executable file, which you
can then run.
At its simplest, compiling your program is a matter of running
the compiler and giving it a command line argument specifying the
.c file with the source code for your program. There are many
different C compilers, but we will be using GCC, which stands for
“GNU Compiler Collection.” If you wrote your program in a file
called myProgram.c, you could execute the command
gcc myProgram.c to compile your code. Assuming there are no
errors in your code, the compiler would produce an executable
program called a.out, which you could then execute by typing the
command ./a.out. However, if you try this on any of the C code
you wrote in the previous chapters, you will encounter errors—we
need to add a bit more to our code (which we will see shortly) to
make it compile.
If you are not familiar with using a UNIX/Linux command-
line interface, you might want to read Appendix B for more
information about using the command line and command-line
arguments. Typically, you will invoke the compiler with more
arguments than just the name of one source file. We will discuss
those options after we cover some more about the compilation
process.
5.1 The Compilation Process
When you are programming, you will use the compiler all the time,
so it is useful to have a bit of an understanding of what it does at a
high level. The inner workings and details of the compiler are quite
complex and require an in-depth understanding of many areas of
computer science and engineering, so we will not go into those.
Instead, we will just cover the parts relevant to day-to-day program
writing.
Figure 5.1: A high-level view of the process that GCC takes to compile a program.
Light blue boxes represent code you have written; light green boxes represent built-in
parts of C.
Figure 5.1 shows a high-level overview of the process that
GCC goes through to compile the code. In this picture, the light
blue boxes represent code you have written, while the light green
boxes represent built-in parts of C. The orange clouds indicate
steps of this process (each is a separate program, but GCC invokes
these programs for you), and the white boxes represent
intermediate files that GCC generates to pass information from one
stage to the next. The dark blue box in the upper right represents
the final executable—the program that you can run to make your
computer do whatever the program tells it to do.
5.1.1 The Preprocessor: Header Files and Defines
The first step (in the upper left) is the preprocessor, which takes
your C source file and combines it with any header files that it
includes, as well as expanding any macros that you might have
used. To help understand this process, we will look at our first
complete C program, which is algorithmically simple (all it does is
print “Hello World”), but useful for explaining the compilation
process.
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 printf("Hello World\n");
6 return EXIT_SUCCESS;
7 }
While the main function’s contents should be mostly familiar
—a call to printf and a return statement—there are several new
concepts in this program. The first two lines are #include
directives (pronounced “pound include” or “hash include”). These
lines of code are not actually statements that are executed when the
program is run, but rather directives to the preprocessor portion of
the compiler. In particular, these directives tell the preprocessor to
literally include the contents of the named file at that point in the
program source, before passing it on to the later steps of the
compilation process. These #include directives name the file to be
included in angle brackets (<>) because that file is one of the
standard C header files. If you wrote your own header file, you
would include it by placing its name in quotation marks (e.g.,
#include "myHeader.h"). (This is not a formal rule but a very
common convention.) Preprocessor directives begin with a pound
sign (#) and have their own syntax.
In this particular program, there are two #include directives.
The first of these directs the preprocessor to include the file
stdio.h and the second directs it to include stdlib.h. These
header files—and header files in general—primarily contain three
things: function prototypes, macro definitions, and type
declarations.
A function prototype looks much like a function definition,
except that it has a semicolon in place of the body. The prototype
tells the compiler that the function exists somewhere in the
program, as well as the return and parameter types of the function.
Providing the prototype allows the compiler to check that the
correct number and type of arguments are passed to the function
and that the return value is used correctly, without having the entire
function definition available. In the case of printf, stdio.h
provides the prototype. The actual implementation of printf is
inside the C standard library, which we will discuss further when
we learn about the linker in Section 5.1.4.
Header files may also contain macro definitions. The simplest
use of a macro definition is to define a constant, such as
1 #define EXIT_SUCCESS 0
This directive (from stdlib.h) tells the preprocessor to define the
symbol EXIT_SUCCESS to be 0. Whenever the preprocessor
encounters the symbol EXIT_SUCCESS, it sees that it is defined as a
macro and expands the macro to its definition. In this case, the
definition is just 0, so the preprocessor replaces EXIT_SUCCESS with
0 in the source it passes on to the later stages of compilation. Note
that the preprocessor splits the input into identifiers, and checks
each identifier to see if it is a defined macro, so EXIT_SUCCESS_42
will not expand to 0_42, since it is not defined as a macro, and the
preprocessor will leave it alone (unless it is defined elsewhere).
Using macro definitions for constants provides a variety of
advantages to the programmer over writing the numerical constant
directly. For one, if the programmer ever needs to change the
constant, only the macro definition must be changed, rather than
all of the places where the constant is used. Another advantage is
that naming the constant makes the code more readable. The
naming of the constant in return EXIT_SUCCESS gives you a clue
that the return value here indicates that the program succeeded. In
fact, this is exactly what this statement does. The return value from
main indicates the success or failure of your program to whatever
program ran it.
A third advantage of using macro-defined constants is
portability. While 0 may indicate success on your platform—the
combination of the type of hardware and the operating system you
have—it may mean failure on some other platform. If you
hardcode the constant 0 into your code, then it may be correct on
your platform, but it may need to be rewritten to work correctly on
another platform. By contrast, if you use the constants defined in
the standard header files, then when you recompile your program
on the new platform, it will just work correctly—the header files
on that platform will have those constants defined to the correct
values.
The standard header file limits.h contains constants
specifically for portability. These constants describe the maximum
and minimum value of various types on the current platform. For
example, if a program needs to know the maximum and minimum
values for an int, it can use INT_MAX and INT_MIN, respectively.
On platforms where an int is 32 bits, these would be defined like
this:
1 #define INT_MAX 2147483647
2 #define INT_MIN -2147483648
On a platform with a different size for the int type, these
would be defined to whatever value is appropriate for the size of
the int on that platform.
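For example, a function of our own devising can report the platform's int range without hardcoding any values—the constants come from limits.h, so the same source is correct everywhere:

```c
#include <limits.h>
#include <stdio.h>

//Prints this platform's int range using the constants from limits.h
//rather than hardcoded numbers.
void printIntRange(void) {
  printf("int ranges from %d to %d\n", INT_MIN, INT_MAX);
}
```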
Macros can also take arguments; however, these arguments
behave differently from function arguments. Recall that function
calls are evaluated when the program runs, and the values of the
arguments are copied into the function’s newly created frame.
Macros are expanded by the preprocessor (while the program is
being compiled, before it has even started running), and the
arguments are just expanded textually. In fact, the macro
arguments do not have any declared types, and do not even need to
be valid C expressions—only the text resulting from the expansion
needs to be valid (and well typed).
We could (though as we shall see shortly, we shouldn’t)
define a SQUARE macro as follows:
1 #define SQUARE(x) x * x
The preprocessor would then expand SQUARE(3) to 3 * 3, or
SQUARE(45.9) to 45.9 * 45.9. Note that here, we are using the
fact that macro arguments do not have types to pass it an int in the
first case and a double in the second case. However, what happens
if we attempt SQUARE(z - y)? If this were a function, we would
evaluate z - y to a value and copy it into the stack frame for a call
to SQUARE; however, this is a macro expansion, so the preprocessor
works only with text. It expands the macro by replacing x in the
macro definition with the text z - y, resulting in z - y * z - y.
Note that this will compute z - (y * z) - y, which is not z - y
squared.
We could improve on this macro by defining it like this:
1 #define SQUARE(x) ((x) * (x))
Now, SQUARE(z - y) will expand to ((z - y) * (z - y)),
giving the correct result irrespective of the arguments and location
of the macro. We can still run into problems if the macro argument
has side effects. For example, if we type SQUARE(f(i)), then the
function f will be called twice. If f prints something, it will be
printed twice, once for each call to f in the macro expansion.
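Both macros can be demonstrated side by side. We wrap each expansion in a small function of our own so the results are easy to check; with z = 5 and y = 2, the naive macro computes 5 - (2 * 5) - 2 = -7 instead of 9.

```c
//The naive macro, kept here to show the problem:
#define BAD_SQUARE(x) x * x
//The parenthesized version, which expands safely:
#define SQUARE(x) ((x) * (x))

//BAD_SQUARE(z - y) expands to z - y * z - y, i.e. z - (y*z) - y.
int badSquareOf(int z, int y) {
  return BAD_SQUARE(z - y);
}

//SQUARE(z - y) expands to ((z - y) * (z - y)), the intended square.
int goodSquareOf(int z, int y) {
  return SQUARE(z - y);
}
```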
The SQUARE macro is a bit contrived—we would do better to
write the multiplication down where we need it—but it highlights
the nature (and dangers) of textual expansion of macros. Section E.3
contains much more information about the C preprocessor and
macros. They are quite powerful and can be used for some rather
complex things, but you will not need them for most things you
write in this book.
Header files may also contain type declarations. For example,
stdio.h contains a type declaration for a FILE type, which is used
by a variety of functions that manipulate files. The functions also
have their prototypes in stdio.h, and we will discuss them in
Chapter 11 when we learn about accessing files and reading input.
Another example of type declarations in standard header files
is the integer types in stdint.h. As mentioned in Chapter 3,
integers come in different sizes, and the size of an int varies from
platform to platform. Often programmers do not care too much
about the size of an int, but sometimes using a specifically-sized
int is important. stdint.h defines types such as int32_t (which
is guaranteed to be a 32-bit signed int on any platform) or
uint64_t (which is always a 64-bit unsigned int).
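For example (a two-line sketch of our own), declaring variables with these types guarantees their sizes regardless of platform:

```c
#include <stdint.h>

//These widths hold on any platform (a property plain int lacks):
int32_t always32bits = 42;   //exactly 4 bytes everywhere
uint64_t always64bits = 42;  //exactly 8 bytes everywhere
```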
With all of that in mind, we can revisit our Hello World
program from earlier:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 printf("Hello World\n");
6 return EXIT_SUCCESS;
7 }
The preprocessor would take this code and basically
transform it into this:
1 int printf(const char *, ...);
2
3 int main(void) {
4 printf("Hello World\n");
5 return 0;
6 }
In actuality, the result of the preprocessor is a few thousand
lines long, since it includes the entire contents of stdio.h and
stdlib.h, which have many other function prototypes and type
declarations. However, these do not affect our code (the compiler
will know those types and functions exist, but since we do not use
them, they do not matter). We will note that the prototype for
printf contains a few features that we have not seen yet. You can
think of const char * as being the type for a literal string, and the
... means that printf takes a variable number of arguments—a
feature of C that we will not delve into beyond using it to call
printf and similar functions.
5.1.2 The Actual Compiler
The output of the preprocessor is stored in a temporary file and
passed to the actual compiler. At first, it may seem a bit confusing
that one of the steps of the compilation process is the actual
compiler. However, we often refer to complex processes by their
eponymous step. For example, if you say “I am going to bake a
cake,” even though only part of the process involves actually
baking the cake (cooking the cake batter in the oven), we still
understand that you will go through the entire process: finding a
recipe, getting the ingredients, mixing the batter, baking the cake,
letting it cool, and icing it.
The compiler reads the pre-processed source code—which
has all the specified files included and all macro definitions
expanded—and translates it into assembly. Assembly is the lowest-
level type of human-readable code. In assembly, each statement
corresponds to one machine instruction. Machine instructions
correspond to very simple operations, such as adding two numbers
or moving a value to or from memory. While humans can program
in assembly (sometimes with good reason), that is not our focus
here—we just want to understand enough of the compilation
process to know what is going on when we compile our programs.
The compiler is a rather complex program. Its first task is to
read in your program source and “understand” your program
according to the rules of C—a task called parsing the program. In
order for the compiler to parse your program, the source code must
have the correct syntax (i.e., as we described in Chapter 2). If your
code is not syntactically correct, the compiler will print an error
message and attempt to continue parsing but may be confused by
the earlier errors. For example, the following code has one error—
a missing semicolon after int x = 3 on line 5:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int x = 3
6 int y = 4;
7 printf ("%d\n", x + y);
8 return EXIT_SUCCESS;
9 }
However, compiling the code results in two errors:
err1.c: In function ’main’:
err1.c:6:5: error: expected ’,’ or ’;’ before ’int’
int y = 4;
^
err1.c:7:25: error: ’y’ undeclared (first use in
this function)
printf ("%d\n", x + y);
^
err1.c:7:25: note: each undeclared identifier is
reported only once
for each function it appears in
The first error (expected ’,’ or ’;’ before ’int’) correctly
identifies the problem with our code—we are missing a semicolon
before the compiler sees int at the start of line 6 (it should go at
the end of line 5, but the compiler does not know anything is
wrong until it sees the next word). The next error (’y’
undeclared) is not actually another problem with the code, but
rather a result of the compiler being confused by the first error—
the missing semicolon caused it to misunderstand the declaration
of y as a part of the previous statement, so it does not recognize y
later in the code.
Compiler errors can be a source of confusion and frustration
for novice programmers. One key rule in dealing with them is to
try to fix them in order, from first to last. If you find an error
message (other than the first one) confusing, you should fix the
first error, then retry compiling the code to see if the error goes
away—it may just be a result of the compiler being confused by
earlier errors.
Dealing with Compilation Errors:
Tip 1
Remember that the compiler can get confused
by earlier errors. If later errors are confusing,
fix the first error, then try to recompile before
you attempt to fix them.
Another source of confusion in dealing with compiler errors is
that the error message may not do a good job of describing the
problem. The compiler tries its best to describe the problem but
may guess incorrectly about what you were trying to do or refer to
unfamiliar possibilities. For example, if we have a slightly
different variation on the missing semicolon problem from the
previous example:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5 int x
6 int y;
7 x = 3;
8 y = 4;
9 printf("%d\n", x + y);
10 return EXIT_SUCCESS;
11 }
The error here is still a missing semicolon after the
declaration of x on line 5; however, the compiler gives the
following error messages:
err2.c: In function ’main’:
err2.c:5:5: error: ISO C forbids nested functions [-
Werror=pedantic]
int x
^
err2.c:6:5: error: expected ’=’, ’,’, ’;’, ’asm’ or
’__attribute__’
before ’int’
int y;
^
err2.c:7:5: error: ’x’ undeclared (first use in this
function)
x = 3;
^
err2.c:7:5: note: each undeclared identifier is
reported only once for
each function it appears in
cc1: all warnings being treated as errors
These error messages may appear quite intimidating, since they
refer to several language features we have not seen (and which are
all irrelevant to the problem). The first error message here is
actually completely irrelevant, other than that it is on the right line
number. The compiler first guesses that we might be trying to use a
non-standard language extension (extra features that one particular
compiler allows in a language) that allows us to write one function
inside of another. This extension is forbidden by this version of the
language standard. The compiler is trying to be helpful, even
though it is wrong. In general, if an error message is referencing
something completely unfamiliar, it is likely that the compiler is
confused or offering suggestions that are not relevant to what you
are trying to do.
Dealing with Compilation Errors:
Tip 2
If parts of an error message are completely
unfamiliar, try to ignore them and see if the rest
of the error message(s) make sense. If so, try to
use the part that makes sense to understand and
fix your error. If not, search for the confusing
parts on Google and see if any of them are
relevant.
Looking at the next error message, we see that it is on line 6, immediately after the line with the actual problem, so the compiler may be trying to give us more or different information about the same problem. Here, while there are some unfamiliar things (such as asm or __attribute__), the rest of the error message makes sense if we ignore these. That is, we could understand the error message if it were:
err2.c:6:5: error: expected ’=’, ’,’, or ’;’ before ’int’
The compiler has told us it expects an equals sign, comma, or
semicolon before we get to the word int on line 6. In this case, we
are missing a semicolon, but the compiler is suggesting other
possibilities (e.g., we could have had an = had we been writing
int x = 3;). The asm and __attribute__ suggestions it provided
could also have been options, but these are very advanced
language features, which we will not cover.
Another source of confusion in dealing with error messages
can arise from the fact that the compiler may not notice the error
immediately. As long as the source code seems to obey the rules of
C’s syntax, the compiler assumes everything is correct, even if its
parse does not agree with what you meant. In the following code,
the actual error is omission of a close curly brace on line 9:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5   for (int i = 0; i < 10; i++) {
6     for (int j = 0; j < 10; j++) {
7       printf("%d\n", (i + 3) * (j + 4));
8     }
9
10   return EXIT_SUCCESS;
11 }
However, the compiler does not recognize the error immediately. Having a return statement inside the for loop is syntactically valid. The closing brace after it on line 11—which the programmer intended to be the end of main—is also legal, but the compiler parses it as the end of the first for loop. The compiler
then discovers that something is not right when it reaches the end
of the file and the body of main has not ended, reporting the error
as:
err3.c: In function ’main’:
err3.c:11:1: error: expected declaration or
statement at end of input
}
^
Dealing with Compilation Errors:
Tip 3
Programmer’s editors are very good at helping
you find mismatched braces and parentheses.
Most will indent the code according to the
nesting level of the braces it is inside and will
show you the matching brace/parenthesis when
your cursor is on one of the pair. Using these
sorts of features can be the easiest way to find
errors from mismatched braces and parentheses.
Once the compiler has parsed your program, it type-checks
the program. Type-checking the program involves determining the
type of every expression (including the sub-expressions inside of
it) and making sure that they are compatible with the ways they are
being used. It also involves checking that functions have the right
number and type of arguments passed to them. The compiler will
produce error messages for any problems it discovers during this
process.
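To make this concrete, here is a small sketch of the kind of mistake type-checking catches: calling a function with the wrong number of arguments. The file and function names here are our own invention, not from the text.

```shell
# Hypothetical example: a call that fails type-checking.
cat > typeerr.c <<'EOF'
#include <stdio.h>

int add(int a, int b) { return a + b; }

int main(void) {
  printf("%d\n", add(3));  /* too few arguments: rejected by type-checking */
  return 0;
}
EOF
# The compile step fails; GCC reports an error along the lines of
# "too few arguments to function 'add'".
if gcc -c typeerr.c 2>/dev/null; then
  echo "accepted"
else
  echo "rejected"
fi
```

Because the prototype of add is visible at the call site, the compiler can check the argument count and refuses to produce an object file.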
Another important note about fixing compiler errors is that
sometimes fixing one error will let the compiler progress further
along, resulting in more errors. Novice programmers who are not
confident in their correction of their first error may undo the fix,
thinking they made things worse, even though the fix may have
been correct—the extra errors just came from progressing further
through the process.
Dealing with Compilation Errors:
Tip 4
Be confident in your fix for an error. If you do
not understand what is wrong and how to fix it,
find out and be sure rather than randomly
changing things.
Both Appendix F and the web page
http://aop.cs.cornell.edu/errors/index.html provide more
details on a variety of error messages and will likely prove useful
in helping you diagnose errors you may encounter.
Once the compiler finishes checking your code for errors, it
will translate it into assembly—the individual machine instructions
required to do what your program says. You can ask the compiler
to optimize your code—transform it to make the resulting assembly
run faster—by specifying the -O option followed by the level of
optimization that you want. Programs are usually compiled with no
optimizations for testing and debugging (the code transformation
makes the program very difficult to debug) and then re-compiled
with optimizations once they are ready for release/use. Typically,
real programs compiled with GCC are compiled with -O3 when
they are optimized. GCC provides a variety of other options to
control optimization at a finer level of detail; however, they are
beyond the scope of this book. We are strictly concerned with
writing code that works, so you do not need to bother with the -O
flag for any of the exercises you do in this book.
5.1.3 Assembling
The next step is to take the assembly that the compiler generated
and assemble it into an object file. GCC invokes the assembler to
translate the assembly instructions from the textual/human-
readable format into their numerical encodings that the processor
can understand and execute. This translation is another example of
“Everything Is a Number”—even the instructions that the
computer executes are numbers.
While it is possible to get error messages at this stage, it
should not happen for anything you will try to do in this book.
Generally errors here are limited to cases in which you explicitly
write the specific assembly-level instructions that you want into
your program (which is possible in C, but limited to very advanced
situations) and make errors in those.
The important thing to understand about this step is that it
results in an object file. The object file contains the machine-
executable instructions for the source file that you compiled but is
not yet a complete program. The object file may reference
functions that it does not define (such as those in the C library or
those written in other files). You can request that GCC stop after it
assembles an object file by specifying the -c option. By default,
the name of the object file will be the name of the .c file with the
.c replaced by .o. For example, gcc -c xyz.c will compile xyz.c
into xyz.o. If you wish to provide a different name for the object
file, use the -o option followed by the name you want. For
example, gcc -c xyz.c -o awesomeName.o will produce an object
file called awesomeName.o.
This ability to stop is important for large programs where you
will split the code into multiple different C files. Splitting the program into multiple files is primarily useful to help you keep the code in manageable pieces. However, each source file can
be individually compiled to an object file, and those object files
can then be linked together (as we will discuss in the next section).
If you change code in one file, you can recompile only that file (to
generate a new object file for it) and re-link the program without
recompiling any of the other source files. For programs you write
in this class, this will not make a difference. However, when you
write real programs, you may have tens or hundreds of thousands
of lines of code split across dozens or hundreds of files. In this
case, the difference between recompiling one file and recompiling
all files (especially if optimizations are enabled) may be the
difference between tens of seconds and tens of minutes.
5.1.4 Linking
The final step of the process is to link the program. Linking the
program takes one or more object files and combines them
together with various libraries, as well as some startup code, and
produces the actual executable binary. The object files refer to
functions by name, and the linker’s job is to resolve these
references—finding the matching definition. If the linker cannot
find the definition for a name (called a “symbol”) that is used, or if
the same symbol is defined multiple times, it will report an error.
Linker errors—indicated by the fact that they are reported by
ld (the linker’s name)—are typically less common than other
compiler errors. If you encounter an unresolved symbol error, it
means that you either did not define the symbol, did not include
the object file that defines the symbol in the link, or that the
symbol was specified as only visible inside that object file (which
can be done by using the static keyword—a feature we will not
delve into). If you encounter errors from duplicate definitions of a
symbol, first make sure you did not try to name two different
functions with the same name. Next, make sure you did not include
any files twice on the compilation command line. Finally, make
sure you do not #include a .c file—only header files—and that
you only include function prototypes in the header file, not the
function’s definition (there are some advanced exceptions to these
rules, but if you are at that stage, you should understand the linking
process and static keyword well enough to fix any problems).
Sometimes you may wish to use an external library—a
collection of functions that are already written and packaged up to
use for some purpose (e.g., drawing graphics, playing sound,
advanced math, or many other purposes). The C library is one such
example, which is linked in by default. For other libraries, you must specifically request that the linker link with the library via the -l command-line option, specifying the name of the library right after the l. If all goes well, the linker will resolve all the symbol references and combine the various object files and libraries together into an executable binary.
5.1.5 Building Large Programs: make
The programs you will write as exercises in this book will be
relatively small—at most a hundred lines and a couple of files. For
programs of this size, recompiling the entire program takes a few
seconds. Real programs tend to be quite a bit larger. As the
program size increases, so does the compilation time. As an
example, one program changed, compiled, and run frequently by
one of the authors has about 40,000 lines of code, 90 files, and
takes about 75 seconds to compile completely.
While 75 seconds may not sound long, it is practically an
eternity in the typical development cycle. Programmers recompile
their code dozens to hundreds of times per day, meaning that all of
those minutes add up—waiting 75 seconds 50 times in a day adds
up to a bit more than an hour, or an eighth of a standard work day.
Such a large overhead in the development cycle would be prohibitive (despite its ability to provide convenient excuses for slacking off: e.g., http://xkcd.com/303/).
Fortunately, most of the time, one does not need to recompile
all of the source code if the object (.o) files from previous
compilations are kept. Instead, only the file (or files) that were
changed need to be recompiled, then the program needs to be
linked again. As mentioned in the previous sections, compiling
each source file to an object file is accomplished with the -c
option, and the various object files can then be listed on the
command line to GCC in order to pass them to the linker. For
comparison, recompiling one source file then relinking the same
program described above takes about one second—much faster
than recompiling the whole thing.
However, orchestrating this process manually is tedious and
error prone. If the programmer forgets to recompile a file, then the
program will not work correctly, possibly in strange ways. Doing
this manually not only requires the programmer to remember all
files that were changed, but also which files included with
#include were changed. For example, if the programmer changes
myHeader.h, then any file that #includes myHeader.h must also be
recompiled, as the source files’ contents have effectively changed.
Long ago, programmers developed the Make utility to not
only automate this process, but also to simplify compiling
programs in general. The make command reads a file called
Makefile (though you can ask it to read an input file by a different
name), which specifies how to compile your program. Specifically,
it names the targets that can be made, their dependencies, and the
rules to make the target.
When make is run, it starts from a particular target—
something that it can build out of other things. A common starting
target might be your entire program. Make first checks if the target
is up-to-date. Checking that a target is up-to-date requires checking
that all of the files the target depends on are themselves up-to-date
and that none of them are newer than the target. For example, the
target for the whole program may depend on many object files.
If any such file is not itself up-to-date (for example, the object file
may depend on a .c file that just got changed), it is rebuilt first.
Make then follows whatever rule was specified for the target to
build it from its dependencies. If the target is already up to date,
then it does nothing.
Using Make to compile the program is also useful when the
command line to compile the program becomes complex,
especially when other people may need to do the compilation.
Large complicated programs may require linking with several
libraries, as well as a variety of other command-line options.
Trying to remember all of these and type them in every time you
compile is painful. Even worse, many real programs need to be
compiled by other people—whether it is multiple members of a
large development team or people who you distribute the program
to. Expecting these other people to figure out the commands
required to make your program leads to much frustration for them.
Providing a Makefile allows anyone to build the program simply
by typing make.
Appendix D explains Make in more detail: how to write a
Makefile and a few advanced features.
5.2 Running Your Program
Now that you have compiled your program, you are ready to run it.
You can run your program much like any other program—by
typing its name at the command prompt. However, unlike many
programs you have used so far, the directory in which your
program is located is not part of the PATH—the environment
variable that specifies which directories to look in to run a
program. Therefore, you have to specify which directory to find
the program in as part of the name. Since the program is in the
current directory, you can just put ./ before the program’s name
(for example, ./myProgram)—telling the command shell to run the
myProgram program in the current directory. If you are unfamiliar
with the command shell, directories, path names, or environment
variables, you can read more about them in Appendix B.
You can also run your program from inside various tools that
are intended to help you test and debug the program. Two
incredibly useful tools are GDB (the GNU Debugger) and Valgrind
(pronounced “val-grinned” with a short ‘i’). For more information
about GDB, we will discuss debugging in Chapter 6 and GDB in
particular in Section D.2. Valgrind emulates the computer but
tracks more information about what your program is doing to
report errors that may otherwise go undetected (see Section D.3).
Valgrind is particularly good at finding errors in your program
that did not manifest simply because you got lucky when you ran
it. For example, recall from Chapter 2 that a variable does not have
a value until you assign it one—we just draw a ? in its box to
indicate the value is unknown. If your program has a variable that
is used before it is assigned any value—called an uninitialized
variable—the variable will evaluate to whatever value was
previously stored into the location that the variable resides in. You
may get “lucky” on what value the uninitialized variable ends up
holding. In fact, you may get “lucky” thousands of times and not
see any problem but then get unlucky when it actually matters.
This may seem highly improbable—after all, with billions of
possible values that the variable could end up being, what are your
chances of getting lucky? However, the value that you end up with
is not random; it is whatever was in that storage location
previously—for example, the value that some other variable had
when it went out of scope.
When you run your program directly on the computer, it does
not explicitly track whether a variable has been initialized—the
compiler generates instructions that do exactly what your program
specifies, and the computer does exactly what those instructions
tell it to. When you run your program in Valgrind, however,
Valgrind explicitly tracks which variables are initialized and which
are not. If your program uses the uninitialized value in certain
ways, Valgrind will report an error to you (it does not do this in all
cases for a variety of reasons beyond the scope of this discussion—
however, suffice it to say that it will catch the cases where it really
matters).
Valgrind is useful for detecting many other problems with
your program that you might not otherwise discover easily. We
highly recommend running your program in Valgrind whenever
you are testing your program. We will talk about testing in much
more detail in Chapter 6, but for now, you will want to run your
program to see if it works. When we learn how to make programs
that work with various forms of input, you will want to test your
program on a variety of inputs to become more and more confident
that it works.
Video 5.1: Writing, compiling, and running the
“Hello World” program.
Video 5.1 ties together everything we have talked about so far in this
chapter with the “Hello World” example. The video shows writing
the code, compiling it, and then running it directly as well as in
Valgrind. In the video, we use some compiler options, which we
recommend you use in general, and will discuss momentarily.
5.3 Compiler Options
Usually, you will want to name the resulting program something
meaningful, rather than the default a.out. To change the output file
name, use the -o option, and specify the output file name after it.
For example, gcc -o myProgram myProgram.c would compile
myProgram.c (as above), but instead of producing a.out, it would
name the program myProgram.
Another option you will commonly want to use is --std=gnu99. This option specifies that the compiler should use the
C99 standard with GNU extensions. There are actually a few
different versions of C (referred to as “standards” because they
reflect a standardization of features). C99 will match what we
describe in this book, and is generally a reasonable standard to
program in.
Another useful pair of options is -Wall and -Werror. The
first of these requests that the compiler issue warnings for a wide
range of questionable behavior. As we discussed in Section 5.1.2,
the compiler checks your program for certain kinds of errors. If it
detects errors, it reports them to you and requires you to fix them
before proceeding. Warnings are like errors in that the compiler
will report a problem that it has detected; however, unlike errors,
the compiler will continue and produce a program even if it
warned you about something. The -Werror option tells the
compiler to treat all warnings as errors—making it refuse to
compile the program until the programmer fixes all the warnings.
Novice programmers often wonder why one would want to
enable warnings to begin with, and especially to ask the compiler
to convert warnings to errors. Errors prevent your program from
compiling, so it seems like any way to avoid them and get to the
compiled binary is a good thing. However, a programmer’s goal is
not just to produce any executable program, but one that correctly performs the task at hand. Any time the compiler can alert you to potential
problems, you can fix them much more easily than if they lead to
problems you have to debug. A good analogy is purchasing a
house. Think of the compiler as your inspector for the house.
While you are eager to purchase and move into your new house,
you would much rather have the inspector discover a potential
problem before you buy the house and move in.
We strongly recommend compiling with at least the following
warning options:
-Wall -Wsign-compare -Wwrite-strings -Wtype-limits -
Werror
These options will help catch a lot of mistakes, and should not
pose an undue burden on correctly written code.
Recent versions of GCC also support an option -fsanitize=address that will generate code that includes extra
checking to help detect a variety of problems at runtime. Using this
option is also strongly recommended during your development
cycle. However, we will note that you typically cannot run a
program built with this option in Valgrind (see Section D.3 for
information about Valgrind—we strongly encourage you use it).
The two tools detect different, but overlapping sets of problems, so
use of both is a good idea—they just have to be used separately.
5.4 Other Possibilities
The process of compiling and running a program that we have
described here is not the only way to execute a program, although
it is fairly typical for C and C++ programs. An alternative to
compilation is to interpret the program. Instead of translating the
program into machine instructions, the source code is given to an
interpreter—a program that reads the source of another program
and performs the actions specified in it. Interpreters and compilers
have much in common—they must both read and analyze the
source code of a program—the key difference is what they do with
the program. A compiler translates it into an executable that runs
directly on the computer. An interpreter executes the effects of the
program. Some interpreters allow for interaction via a read-eval-
print-loop (REPL)—that is, you can type a bit of source code,
which the interpreter reads as input (read); it then evaluates it
(eval), prints the result (print), and repeats the process (loop).
Many people will say that a language is compiled or
interpreted (e.g., C is compiled, but Python is interpreted), which is
rather imprecise. Any programming language could have a
compiler and/or interpreter written for it. A more precise version of
this statement would be that a particular language is typically
compiled, or another particular language is typically interpreted. C
is typically compiled; however, there exist interpreters for it.
Likewise, as we discuss in Chapter 32, Python is usually
interpreted, although there are also compilers for it.
In some cases, the two can be mixed. Typically Java is
compiled into instructions for the Java Virtual Machine (JVM)—an
instruction set typically executed by an interpreter, not directly by
a computer. Most JVMs perform just-in-time (JIT) compilation,
meaning that as they interpret the program, they also compile the
program into directly executable code in their own memory
(remember: “Everything Is a Number,” including computer
instructions), which they can then execute the next time they encounter
the same piece of code. JIT provides speed advantages over purely
interpreting the code.
5.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 5.1 : Write, compile, and run a program that
prints out your name (followed by a newline).
• Question 5.2 : Write, compile, and run a program that
prints out all of the numbers from 1 to 1000 (inclusive,
each on their own line).
• Question 5.3 : Write, compile, and run a program that, for
each number from 1 to 1000 (inclusive), prints out that
number, a colon, a space, then the square of that number,
one per line. The first line of your output should be 1: 1, the second line should be 2: 4, and so on through 1000: 1000000.
• Question 5.4 : Take your isPow2 function from Question
4.8 in Chapter 4, and write it in a file called powers.c.
Write a main function that prints out all of the powers of 2
less than 20,000,000 (20 million). Compile and run your
program.
• Question 5.5 : Take your printBinary function to print a
number in binary from Question 4.8 in Chapter 4, and write
it in a file called printBinary.c. Write a main function that
uses the printBinary function to print the binary
representation of all of the numbers from 0 to 1000
(inclusive). Compile and run your program.
• Question 5.6 : Take your printFactors function to print
the prime factors from Question 4.8 in Chapter 4, and write
it in a file called printFactors.c. Write a main function
that uses the printFactors function to print the prime
factors of all of the numbers from 2 to 1000 (inclusive).
Compile and run your program.
• Question 5.7 : Take your power function to compute x^y from Question 4.8 in Chapter 4, and write it in a file called power.c. Write a main function that uses the power function to print a table of x^y for all x between 0 and 5 (inclusive) and all y between 0 and 16 (inclusive). Why do the results for 4^16 and 5^16 look strange (hint: think back to Chapter 3)?
• Question 5.8 : Find a type that would allow your power function to work correctly on 4^16 and 5^16, and change
your code for the previous problem to use this type. You
will probably need to look at the man page for printf to
find the correct format specifier to print the result.
• Question 5.9 : Run Valgrind on all of the programs you
have written for the other questions in this chapter. If it
identifies any problems with your code, fix them.
Chapter 6
Fixing Your Code: Testing
and Debugging
Once you have written and compiled your code, you are ready
to test and debug it. These two steps are closely related, but
distinct. Testing is the process of trying to find problems
(“bugs”) in your code, and debugging is the process of fixing
those bugs.
Both of these skills are crucial to good programming—
what good is writing code if it does not work correctly? Testing
your code identifies the problems you need to fix and then
gives you increasingly more confidence that the code works as
you expect it to. Once you have found a problem, debugging is
the key skill to let you fix the problem in a disciplined manner,
rather than just being frustrated by randomly changing things to
no avail.
Many novice programmers (and even many experienced
programmers) underestimate the time required for testing.
Proper testing requires not only testing on a variety of cases,
but also analyzing your code and thinking carefully about what
test cases you need to write. Unfortunately, testing is easy to
shirk—when you are under time pressure, it may seem
worthwhile to just dash together a couple test cases, then move
on. However, in large software projects (or even modest-sized
ones), “moving on” will typically involve writing code that
uses the functions you just wrote. If the functions you wrote are
broken, then your next work starts from a poor foundation, and
you will just spend longer trying to make it work.
Debugging is another area where many programmers
underestimate the time required. It is often hard to say how
long it will take to find one particular bug, as you often do not
know how complex the problem is until you have solved it. To
make matters worse, inexperienced programmers will often
rush ahead—trying to debug quickly rather than in a disciplined
manner. Such activities follow the old saying “haste makes
waste.” A programmer may try to make changes in the hopes
that it will fix a problem, without investing the time to fully
understand the problem and devise a correct solution.
6.1 Step 6: Test Your Code
Testing is all about finding bugs in your code. A good test case
is one that identifies a problem. While this may seem
unintuitive, consider the analogy of going to the doctor: while it
may be great to be told you are healthy, if there is something
wrong with you, the test that identifies that problem (so that the
doctor can fix it) is the most important test. Similarly, with your
program, you want to identify and fix the problem rather than
think it works when it does not. In the case of a class
assignment, you can think of this as preferring to find and fix a
problem before you turn the assignment in, rather than think it
works and receive a poor grade.
In “the real world,” the consequences become more
serious: bad code kills people. As our world becomes more
and more automated—software controls critical flight systems
on aircraft, cars are beginning to drive themselves, etc.—it is
easy to imagine how software bugs can cause fatal
catastrophes, but you do not even need to imagine these
possibilities, as there have already been people killed by
software bugs. Examples include the Therac-25 (a radiation
therapy machine whose software bugs overdosed and killed
patients), the Patriot Missile defense system (a software bug
caused the system to fail to intercept a Scud missile in the Gulf
War: 28 people died, and many more were injured), and the
crash of a Spanish military transport plane (the software to
control the engines malfunctioned and prevented three of the
four engines from delivering enough power to keep the plane in
the air; four people died, and two were injured).
Even when software bugs are not fatal to humans, they can
cause other types of catastrophes: a $400 million trading error,
exploding rockets, a phone system crash, a blackout of most of
the east coast, and the Mars Climate Orbiter crash were all the
result of bugs in software. The Heartbleed bug presented the
possibility of hackers gaining information from and access to
affected systems (https://xkcd.com/1354/ provides a nice
high-level explanation of the problem. Once you finish the next
few chapters, you could understand a more technical
explanation).
One of the keys to testing your code well—both
thoroughly and in a manageable fashion—is to test
incrementally. Write one function, then test it well before
moving on. Ideally, you should be completely confident that a
function you wrote works before you use it.
Note that while we discussed a top-down design
methodology in Chapter 4, incremental testing is naturally a
bottom-up approach. These two techniques do not conflict with
each other—if you write code for function A, and end up
needing to write functions B and C as part of it, you should test
B and C before testing A.
6.1.1 Black Box Testing
The testing methodology that most people think of first is black
box testing. In black box testing, the tester considers only the
expected behavior of the function—not any implementation
details—to devise test cases. The lack of access to the
implementation details is where this testing methodology gets
its name—the function’s implementation is treated as a “black
box,” which the tester cannot look inside.
Black box testing does not mean that the tester thinks of a
few cases in an ad hoc manner and calls it a day. Instead, the
tester—whose goal is to craft test cases that are likely to expose
errors—should contemplate what cases are likely to be error-
prone from the way the function behaves. For example, suppose
you needed to test a function to sum a list of integers. Without
seeing the code, what test cases might you come up with?
We might start with a simple test case to test the basic
functionality of the code—for example, seeing that the function
gives an answer of 15 when given an input of {1, 2, 3, 4,
5}. However, we should also devise more difficult test cases,
particularly those that test corner cases—inputs that require
specific behavior unlike that of other cases. In this example, we
might test with the empty list (that is, the list with no numbers
in it) and see that we get 0 as our answer. We might make sure
to test with a list that has negative numbers in it or one with
many very large numbers in it. Note that Practice Exercise 6.3
asks you to contemplate what sorts of errors these test cases
might expose.
Observe that we were able to contemplate good test cases
for our hypothetical problem without going through Steps 1–5.
You can actually come up with a set of black box tests for a
problem before you start on it. Some programmers advocate a
test-first development approach. One advantage is that if you
have a comprehensive test suite written before you start, you
are unlikely to skimp on testing after you implement your code.
Another advantage is that by thinking about your corner cases
in advance, you are less likely to make mistakes in developing
and implementing your algorithm.
Writing good test cases requires a lot of thought: you
need to think about a wide variety of things that can go wrong.
Here are some suggestions to guide your thinking, as well as
important lessons. First, some thoughts for testing error cases:
• Make sure your tests cover every error case. Think
about all the inputs the program cannot handle, i.e., ones
where you would expect to receive an error message. If
the program requires a number, give it xyz, which is not
a valid number. An even better test case might be
123xyz, which starts out as a number but isn’t entirely a
number. How the program should handle this depends
on the requirements of the program, but you want to
make sure it handles it correctly.
• Be sure to test “too many” as well as “too few.” If a
program requires exactly N things, test it with at least
one case with more than N and at least one case with
fewer than N. For example, if a program requires a
line of input with exactly 10 characters, test it with 9
and with 11.
• Any given test case can only test one “error message
and exit” condition. This means that if you want to test
two different error conditions, you need two different
test cases: one for each error condition. Suppose that
our program requires a three-letter string of lowercase
letters. We might be tempted to test with aBcD to test
two things at once, but this will only test one of them
(and believing you tested both is problematic!). To see
why this rule exists, think about the algorithm to check
for these errors:
Check if the input string has exactly three letters.
If it does not, then:
  print "Error: the input needs to be three letters"
  exit the program
Check if the input string is made up entirely of lowercase letters.
If it is not, then:
  print "Error: the input needs to be all lowercase letters"
  exit the program
Violating one input condition will cause the program to
exit, so the second condition is never even checked! This
is also true of other situations where the program rejects
the input, even if it does not exit.
• Test exactly at the boundary of validity. If a program
requires between 7 and 18 things, you should have test
cases with 6, 7, 8, 17, 18, and 19 things. You need to
make sure that 6 and 19 are rejected while 7, 8, 17, and
18 are accepted. Testing exactly at the boundaries is
important because of the common “off by one” mistake
—maybe the programmer wrote < when she should
have written <=, or >= when she should have written >,
or something similar. If you test with values that are
right at the boundary, you will find these mistakes.
However, testing is not just about checking error handling.
You want to make sure the algorithm correctly handles valid
inputs too. Here are some suggestions:
• Think carefully about whether or not there are any
special cases where one particular input value (or set of
values) has to be treated unusually. For example, in
poker, an Ace is usually ranked the highest; however, it
can have the lowest ranking in an “Ace low straight” (5
4 3 2 A). If you are testing code related to poker hands,
you would want to explicitly test this case, since it
requires treating an input value differently from normal.
• Think carefully about the requirements, and consider
whether something could be misinterpreted, easily mis-
implemented, or have variations that could seem
correct. Suppose your algorithm works with sequences
of decreasing numbers. You should test with a sequence
like 7 6 6 5 4, which has two equal numbers in it.
Checking equal numbers is a good idea here, since
people might have misunderstood whether the sequence
is strictly decreasing (equal numbers don’t count as
continuing to decrease) or non-increasing (equal
numbers do count).
• Think about types. What would happen if the
programmer used the wrong type in a particular place?
This could mean that the programmer used a type that
was too small to hold the required answer (such as a 32-
bit integer when a 64-bit integer is required), used an
integer type when a floating point type is required, or
used a type of the wrong signedness (signed when
unsigned is required or vice versa).
• Consider any kind of off-by-one error that the
programmer might have been able to make. Does the
algorithm seem like it could involve counting? What if
the programmer was off by one at either end of the
range she counted over? Does it involve numerical
comparison? What if < and <= (or > and >=) were mixed
up?
• Whenever you have a particular type of problem in
mind, think about how that mistake would affect the
answer relative to the correct behavior, and make sure
they are different. For example, suppose you are writing
a program that takes two sequences of integers and
determines which one has more even numbers in it. You
have considered that the programmer might have an off-
by-one error where he accidentally misses the last
element of the sequence. Would this be a good test
case?
Sequence 1: 2 3 5 6 9 8
Sequence 2: 1 4 2 8 7 6
This would not be a good test case for this particular
problem. If the program is correct, it will answer
“Sequence 2” (which has four, compared to three).
However, if the algorithm mistakenly skips the last
element, it will still answer “Sequence 2” (because it
will count three elements in Sequence 2, and two
elements in Sequence 1). A good test case to cover this
off-by-one error would be:
Sequence 1: 2 3 5 6 9 8
Sequence 2: 1 4 2 8 7 3
Now a correct program will report a tie (three vs. three)
and a program with this particular bug will report
Sequence 2 as having more even numbers.
• Consider all major categories of inputs, and be sure you
cover them.
For numerical inputs, these would generally be
negative, zero, and positive. One is also usually a good
number to be sure you cover.
For sequences of data, your tests should cover an
empty sequence, a single element sequence, and a
sequence with many elements.
For characters, use lowercase letters, uppercase
letters, digits, punctuation, spaces, and non-printable
characters in your test cases.
For many algorithms, there are problem-specific
categories you should consider. For example, if you are
testing a function related to prime numbers (e.g.,
isPrime), then you should consider prime and
composite (not prime) numbers as input categories to
cover.
When you combine two ways to categorize data,
cover all the combinations. For example, if you have a
sequence of numbers, you should test with an empty
list, a one-element sequence with zero, a one-element
sequence with a negative number, a one-element
sequence with a positive number, and have each of
negative, zero, and positive numbers appearing in your
many-element sequences.
• An important corollary of the previous rules is that if
your algorithm gives a set of answers where you can list
all possible ones (true/false, values from an enum, a
string from a particular set, etc.), then your test cases
should ensure that you get every answer at least once.
Furthermore, if there are other conditions you think are
important, you should be sure that you get all possible
answers for each of these conditions. For example, if
you are getting a yes/no answer, for a numeric input,
you should test with a negative number that gives yes, a
negative number that gives no, a positive number that
gives yes, a positive number that gives no, and zero
(zero, being only one input, will have one answer).
All of this advice is a great starting point, but the most
important thing for testing is to think carefully—imagine all the
things that could go wrong, think carefully about how to test
them, and make sure your test cases are actually testing what
you think they are testing.
6.1.2 White Box Testing
Another testing methodology is white box testing. Unlike black
box testing, white box testing involves examining the code to
devise test cases. One consideration in white box tests is test
coverage—a description of how well your test cases cover the
different behaviors of your code.
Note that while white box and black box testing are
different, they are not mutually exclusive, but rather
complementary. One might start by forming a set of black box
test cases, implement the algorithm, then create more test cases
to reach a desired level of test coverage.
We will discuss three levels of test coverage: statement
coverage, decision coverage, and path coverage. To illustrate
each of these types of coverage, we will use the (contrived)
code example:
1 int aFunction(int a, int b, int c) {
2 int answer = 0;
3 if (a < b) {
4 answer = b - a;
5 }
6 if (b < c) {
7 answer = answer * (b - c);
8 }
9 else {
10 answer = answer + 42;
11 }
12 return answer;
13 }
Statement coverage means that every statement in the
function is executed. In our example, if we were to use only
one test with a = 1, b = 2, and c = 3, we would not have
statement coverage. Specifically, this test case (by itself) does
not execute line 10 of the code (the body of the else). Our lack
of coverage indicates we have not adequately tested our
function—it has behaviors our test case has not exercised.
We can then figure out more test cases to add to our test
suite to improve our testing until the desired coverage level is
met. For example, we might add the test case a = 1, b = 3, and
c = 2 to our test suite to test the else case. The two test cases
together provide statement coverage of this function.
Statement coverage is a minimal starting point, so if we
want to test our code well, we likely want a stronger coverage
metric. To see how statement coverage falls short, notice that
we have not tested any cases where the a < b test evaluates to
false—that is, a case where we do not enter the body of the if
statement (line 4).
A stronger level of coverage is decision coverage—in
which all possible outcomes of decisions are exercised. For
boolean tests, this means we construct a test case where the
expression evaluates to true and another where it evaluates to
false. If we have a switch/case statement, we must construct at
least one test case per choice in the case statement, plus one for
the default case, even if it is implicit—that is, if we do not
write it down, it behaves as if we wrote default: break;.
To understand decision coverage more fully, it helps to
visualize it. In order to visualize decision coverage, we need to
understand the concept of a control flow graph (CFG). A
control flow graph is a directed graph (in the mathematical
sense) whose nodes are basic blocks (boxes) and whose edges
(arrows) represent the possible ways the control flow arrow can
move from one basic block to the next. A basic block is a
contiguous sequence of statements that must be executed in an
“all-or-nothing” manner; the execution arrow cannot jump into
or out of the middle of a basic block. (We discuss graphs in
much more detail in Chapter 25.)
Figure 6.1: The control flow graph for our example code.
Figure 6.1 shows the control flow graph for our example
function (aFunction) from above. The very first basic block
has a lone arrow coming into it, indicating that it is the start—
the execution arrow can only reach it by calling the function.
This first basic block has two statements: the
declaration/initialization of answer and the if statement, which
tests if a < b. The basic block ends here because the execution
arrow could go two places—indicated by the two edges leaving
this block. The final block has no arrows going out of it; when
the execution arrow moves to the end of this block, there is
nowhere else to go within this function. At this point, the
function is done and returns its answer. Note that we draw the
CFG for one function at a time. We could also draw how the
execution arrow moves between functions, which is called a
call graph.
Decision coverage corresponds to having a suite of test
cases that covers every edge in the graph. In our graph, if a < b
is true, we will take the left edge coming out of the first block
(corresponding to going into the “then” clause of the if/else).
If it is false, we will take the right edge, which skips over this
clause, and as there is no else clause, goes to the next if
statement. This second edge—representing the path of our
execution arrow when a < b is false—is highlighted in red in
Figure 6.1, since it represents the shortcomings of our test cases
so far with respect to decision coverage.
We can obtain decision coverage on this function by
adding a test case where the execution arrow traverses this edge
of the control flow graph—or put a different way, where the
a < b decision evaluates to false. An example of this test case
would be a = 5, b = 2, and c = 1.
Figure 6.2: The four possible paths through our example CFG. Our tests cover
three out of the four, but the red path has not been tested.
An even stronger type of test coverage is path coverage.
To achieve path coverage, our test cases must span all possible
valid paths through the control flow graph (following the
direction of the arrows). Figure 6.2 shows the four paths
through our example control flow graph and color codes them
based on which of our test cases cover them. Note that there is
one path (shown in red) that we have not yet covered. This path
corresponds to the case where a < b is false, but b < c is true.
Our test cases that gave us decision coverage tested each of
these conditions separately, but none of the tests tested both of
these together at the same time—where our execution would
follow the red path. We can add another test case (e.g., a = 5,
b = 2, and c = 3) to gain path coverage.
At this point you may be wondering “Why do we have all
of these levels of test coverage?” and “How do I pick the right
level of coverage for what I am doing?” Software engineers
define and discuss multiple levels of test coverage because the
higher levels of coverage give us more confidence in the
correctness of our code but at the cost of more test cases. While
our example only requires two, three, and four test cases for the
three levels of coverage we discussed, it is a simple (and
contrived) piece of code. The number of paths through the
control flow graph is exponential in the number of conditional
choices—if we have six if/else statements one after the other,
then there are 64 possible paths through the control flow graph.
If we increase the number of if/else statements from six to 16,
then the number of paths grows to 65536—quite a lot!
So how do you pick the right level of test coverage? As
with many things, the answer is “it depends.”1 The first aspect
of this decision is “how confident do you need to be in your
code?” If you are doing preliminary testing of your algorithm
by hand in Step 4 (and will test the implemented algorithm
once you have translated it to code), then statement coverage is
a reasonable choice. Here, testing more cases is quite time
consuming (you are doing them by hand), and you will do more
test cases later (once it is translated to code and compiled,
where you can have the computer run the tests).
By contrast, if you are testing a piece of critical software
that will be deployed when you finish testing, only achieving
statement coverage would be woefully insufficient. Here, you
would likely want to aim for more than just the minimum to
achieve path coverage. Unfortunately, no amount of testing can
ensure the code is correct—it can only increase our confidence
that it is. If you require absolute certainty that the code is
correct, you must formally prove it (which is beyond the scope
of this book).
6.1.3 Generating Tests
One difficulty with testing arises from the fact that you want to
test the cases you did not think of—but if you do not think of
these cases, then how do you know to write tests for them? One
approach to such a problem is to generate the test cases
according to some algorithm. If the function we are testing
takes a single integer as input, we might iterate over some large
range (say -1,000,000 to 1,000,000) and use each integer in that
range as a test case.
Another possibility is to generate the test cases
(pseudo-)randomly (called, unsurprisingly, random testing).
Note that pseudo-random2 means the numbers look random (no
“obvious” pattern) to a human but are generated by an
algorithm that will produce the same answer each time if they
start from the same initial state (called a “seed”). Random
testing can be appealing, as it can hit a wide range of cases
quickly that you might not think of at all. For example, if your
algorithm has six parameters, and you decide you want to test
100,000 possibilities for each parameter in all combinations,
you will need 100,000^6 = 10^30 test cases—even if you can
do 10 trillion test cases per second (which would be beyond
“fast” by modern standards), they will take about three billion
years to run! With random testing, you could run a few
thousand or million cases and rely on the “Law of Large
Numbers” to make it likely that you encounter a lot of varieties
of relationships between the parameters.
One tricky part about generating test cases algorithmically
is that we need some way to verify the answer—and the
function we are trying to test is what computes that answer. We
obviously cannot trust our function to check itself, leaving us a
seeming “chicken and egg” problem. In a very few cases, it
may be appealing to write two versions of the function that can
be used to check each other. This approach is appropriate when
you are writing a complex implementation in order to achieve
high performance, but you could also write a simpler (but
slower) implementation whose correctness you would more
readily be sure of. Here, it makes sense to implement both and
test the complex/optimized algorithm against the
simpler/slower algorithm on many test cases.
We often test properties of the answer to see that it is correct.
For example, if we are testing an algorithm to compute the
square root of a number, we can check that the answer times
itself produces the original input (that is, that sqrt(x) * sqrt(x) = x—or
is within the expected error given the precision of the numbers
involved.3) Testing this property of the answer assures us that it
was correct without requiring us to know (or be able to
compute by other means) what it is. The previous technique
works well for invertible functions (that is, where we can apply
some inverse operation that should result in the original input),
but many programming problems do not fit into this case.
We may, however, be able to test other properties of the
system to increase our confidence that it is correct. For
example, imagine that we are writing software to simulate a
network, which routes messages from sources to destinations
(the details of how to implement this are beyond the skills we
have learned so far, but that is not important for the example).
Even without knowing the right answer, we can still test that
certain properties of the system are obeyed: every message we
send should be delivered to the proper destination, that delivery
should happen exactly one time, no improper destinations
should receive the messages, the delivery should occur in some
reasonable time bound, and so on.
Checking these properties does not check that the program
gave the right answer (e.g., it may have routed the message by
an incorrect but workable path), but it checks for certain classes
of errors on those test cases. As with all test cases, this
increases our confidence that the program is right but does not
prove that it is right. Of course, we would need to test the other
aspects of the answer in other ways—which may involve
looking at the details of fewer answers by hand to ensure all
details are right.
This approach requires development of code that is not
part of the main application—a program that not only sends the
requests into the network model, but also tracks the status of
those requests and checks for the proper outcomes. This
additional program is called a test harness—it is a program you
write to run and test the main parts of your code. Developing
such infrastructure can be time-consuming but is often a good
investment of time, especially for large projects.
6.1.4 Asserts
In any form of testing, it can be useful to not only check the end
result, but also that the invariants of the system are maintained
in the middle of its execution. An invariant is a property that is
(or should be) always true at a given point in the program.
When you know an invariant that should be true at a given
point in the code, you can write an assert statement, which
checks that it is true. assert(expr); checks that expr is true.4
If it is true, then nothing happens, and execution continues as
normal. However, if expr is false, then it prints a message
stating that an assertion failed, and aborts the program—
terminating it immediately wherever it is.
As an example, recall the printFactors function from one
of the practice problems for Chapter 4. We could add some
assert statements to express invariants of our algorithm:
1 void printFactors(int n) {
2 if (n <= 1) {
3 return;
4 }
5 int p = 2;
6 while (!isPrime(n)) {
7 assert(isPrime(p)); //p should always be prime
8     assert(p < n);      //p should always be less than n
9 if (n % p == 0) {
10 printf("%d * ", p);
11 n = n / p;
12 }
13 else {
14      p = nextPrimeAfter(p); //helper function to get the next prime
15 }
16 }
17 printf("%d\n", n);
18 }
Here, we have added two assert statements. The first, on
line 7, checks that p is prime (using our isPrime function). This
fact should always be true here—if it is not, our algorithm is
broken (we may end up printing non-prime factors of the
numbers, which is incorrect). How could such an error occur?
One possibility would be a bug in nextPrimeAfter—the helper
function we have written to find the next prime after a
particular prime. Another possibility would be if we
accidentally modify p in a way we do not intend to. Of course,
if our code is correct, neither of these will happen, and the
assert will do nothing, but the point is to help us if we do
make a mistake.
The second assert checks another invariant of our
algorithm: p should always be less than n inside this loop (think
about why this is). As with the first assert, we hope it has no
effect—just checking that the expression is always true—but if
something is wrong, it will help us detect the problem.
Note that assert statements are an example of the
principle that if our program has an error we want it to fail fast
—that is, we would rather the program crash as soon after the
error occurs as possible. The longer the program runs after an
error occurs, the more likely it is to give incorrect answers and
the more difficult it is to debug. Ideally, when our program has
an error, we will have an assert fail immediately after it,
pointing out exactly what the problem is and where it lies.
In almost all cases, giving the wrong answer (due to an
undetected error) is significantly worse than the program
detecting the situation and crashing. To see this tradeoff,
consider the case of software to compute a missile launch
trajectory. If there is an error in such software, would you rather
it give incorrect launching instructions or print a message that
an error occurred and abort?
Many novice and intermediate programmers worry that
asserts will slow their program down. In general, the
slowdown is negligible, especially on fast modern computers.
In many situations, a 1–2% performance difference just does
not matter—do you really care if you get your answer in 0.1
seconds versus 0.11 seconds? However, there are performance-critical
situations where every bit of speed matters. For these situations,
you can pass the -DNDEBUG option to the compiler to turn off the
asserts in your optimized code. For all other situations, keeping
them active is generally advisable. Note that performance-critical
code is the domain of expert programmers who also
have a deep understanding of the underlying hardware—such
work is well beyond the scope of this book.
6.1.5 Regression Testing
Another important consideration in real software is that your
program changes over time. In “the real world,” your
requirements may evolve (as your customers request new
features or you release one version and move on to the next).
You may also be fixing bugs or going back to add more
complex functionality that you originally skipped in the interest
of getting simpler parts working. Whatever the reason for your
changes, you want to be sure that your modifications have not
broken existing functionality—this is the job of regression
testing.
Regression testing is the process of running a set of test
cases—your regression test suite—that have worked in the past
to ensure they still work on your current code. Regression tests
are usually automated (e.g., via a shell script) and run
frequently to detect newly introduced problems regularly. On
large software projects with many developers, a common
practice is to run “nightly regressions”—run the regression test
suite on the code each night (when development is presumably
less active) to detect bugs introduced during the day. Other
development practices may involve running a regression test
suite before checking the code into the revision control system
(e.g., Git, Subversion, or Mercurial).
6.1.6 Code Review
Another technique to increase your confidence in the
correctness of your program is a code review. In a code review,
another skilled programmer reviews the code you have written,
looking for mistakes you might have made. Such a review is
another way to address the “how to test for problems you did
not think of” problem—the reviewer brings a fresh perspective
to the code and may identify areas of concern you did not think
about. Code reviews have one nice advantage over other forms
of testing: often when your reviewer identifies a problem, she
can propose steps towards fixing it.
One form of code review is a code walk-through in which
the programmer explains the algorithm and code to the
reviewer. Typically, the reviewer is familiar with the problem
and algorithm required to solve it, so the walk-through focuses
on the code itself. The programmer goes line-by-line through
the code, explaining to the reviewer what each line does, and
why it is there. If the reviewer is not familiar with the
problem/algorithm, the walk-through process may start with the
programmer walking the reviewer through earlier steps of the
programming process—possibly even starting from Step 1:
Work an Example Yourself.
6.2 Step 7: Debug Your Code
Once you have found a problem in your code, you need to fix it
—this process is called debugging. Many novice programmers
(and even some moderately experienced programmers) debug
in an ad hoc fashion—trying to change something in their code
and hoping that it will fix their problem. Such an approach is
seldom effective and often leads to much frustration.
Returning to our doctor analogy from earlier, suppose you
were sick and went to the doctor. Does your doctor say “Oh I
don’t know what is wrong, but try this medicine. If it doesn’t
fix things, come back tomorrow and we’ll try another
medicine.” If your doctor does treat you this way, it might be
time to find a new doctor! Trying random “fixes” in the hope
that you will get lucky on one is not a good way to diagnose
and fix problems. Even worse, if your symptoms change during
this process, you have no idea if one of the random “fixes” you
tried made things worse, or if your untreated condition is
progressing. Similar analogies can be made to any profession
that diagnoses and fixes problems (such as mechanics).
Hopefully, your doctor (and mechanic) follow a more
scientific approach to diagnosing and fixing problems. As
they do, so should you when diagnosing and fixing your
programs.
fact, debugging should be an application of the scientific
method, which you may have learned in a science class.
Figure 6.3: The scientific method.
6.2.1 Observe a Phenomenon
Figure 6.3 shows a flowchart of the scientific method. All
science starts with observing some phenomenon. In the natural
sciences this might be noticing that objects fall towards the
ground when not supported by anything, that finches in certain
of the Galápagos Islands have different characteristics from
other finches, or that the water level rises when you get into the
bathtub.
In programming, our observation of phenomena relates to
the behavior of our programs on certain inputs (“My program
gives the wrong answer when I give it an input of 42!”). These
observations typically arise from our test cases but may happen
under other circumstances (for example, the end user reports a
problem we did not discover in testing).
6.2.2 Ask a Question
Once you have observed a phenomenon, the next step in the
scientific method is to ask a question. Asking a good question
here is crucial to the success of the rest of our scientific
endeavor. While a broad question such as “What is wrong with
my program and how do I fix it?” may seem appealing, it may
be quite difficult to answer. Instead, we should aim for more
focused questions: “On which line of code does my program
crash?” or “Why does my program call myFunction with x = 3
and y = -42?”.
Answering one question often leads to an observation that
leads to another question—restarting the scientific process all
over again. Discovering what is wrong in this iterative fashion
is perfectly fine, and is in fact a great way to proceed. You start
by asking “Which line does my program crash on?” then when
you answer that, you ask “Why does it crash on this line?” the
answer to that then leads you to ask “How is x = 3 and
y = -42?” which in turn leads you to ask another question, and
so on. Eventually, your chain of questions and answers leads
you to the discovery of the problem, even if it is somewhat far
removed from the visible symptom.
6.2.3 Gather Information and Apply Expert
Knowledge
Many people will say that forming a hypothesis is the next step
of the scientific method. If you can form a hypothesis
immediately, that is great. However, forming a good hypothesis
is difficult, and forming one right away is often not possible.
The next step of the scientific method is actually to gather
information and combine it with your expert knowledge. Going
back to our example of visiting the doctor, the doctor gathers
information by examining the patient and combines it with her
expert knowledge—years of training on symptoms of diseases
and how the body works—to form a hypothesis.
In the case of debugging, you need to gather information
about what is happening in your program and combine this with
your own expert knowledge on programming. Your expert
knowledge comes in two parts here. One is your knowledge of
programming in general—the rules for how to execute code by
hand that we learned in Chapter 2 (and will continue learning as
we introduce more topics), and the other is your domain
knowledge of the particular program you are writing—the
expected behaviors of each part of it.
Your expert knowledge will grow with practice in
programming and the domain for which you are writing
programs. However, gathering information effectively is a skill
of its own. The information-gathering aspect of debugging is
often conflated with the entirety of debugging—if you ask
someone how they debug, they will often explain to you what
techniques they use to gather information.
The simplest way to gather information is to insert print
statements (in C, calls to printf) to display the values of
various variables at various points in the program. The resulting
output can give you information about the control flow of the
program (which statements were executed, and in what order—
as shown by what order your print statements print their
output), and of course the values of the variables you print.
Gathering information by printing has the advantages that
it is simple and requires no other knowledge or experience.
However, it has several disadvantages as well. One is that
changing what you print out requires recompiling and re-
running your program. While this disadvantage may seem
small, if your bug takes 15 minutes to manifest, restarting the
program for each new piece of information you discover you need
can be quite time-consuming. Another disadvantage is
that the output may be overwhelming (i.e., thousands of lines of
output to sift through) if your program executes for even a
modest time before experiencing the problem. A third
disadvantage is that it cannot replicate or replace many features
that debuggers offer.
Another approach to information gathering is to use a
debugger—a tool specifically designed to aid programmers in
the debugging process. The debugger is in fact primarily aimed
at this piece of the debugging process—gathering information
(sadly, it does not offer you hypotheses or expert knowledge).
One widely-used debugger is GDB, which we cover in detail in
Section D.2. We strongly recommend that you learn it, and if
you intend to become a serious programmer, become an expert
in it. We will mention generally what you can do with it here,
but leave the details to Section D.2, as GDB has features that
we relate to topics we have not learned yet.
For now, we will discuss the high-level points of a
debugger. When you run your program inside the debugger, you
can give the debugger a variety of commands to control the
execution of your program and get information from it. Note
that Emacs understands how to interact with GDB, and using
them together makes the entire process go much more
smoothly.
When you run your program inside of a debugger, it will
run as normal until one of the following happens: (a) it exits,
(b) it crashes, or (c) it encounters a breakpoint (or watchpoint)
you have set. A breakpoint is set on a particular line of code and
instructs the debugger to stop the execution of your program
whenever the execution arrow is on it. Breakpoints can be
conditional—meaning you can specify that you only want to
stop on some particular line when a conditional expression you
specify is true. Watchpoints specify that you want to stop the
program when a particular “box” changes.
Once your program is stopped, you can examine the state
of the program by printing the values of expressions. The
debugger will evaluate the expression and print the result for
you—giving you information about the state of the program.
Most often, you will want to print the values of variables to see
what is going on, though you may print much more complex
expressions if you wish.
After printing some information, you will often want to
continue executing in some fashion—either running until the
debugger would stop naturally (as described above), or maybe
just executing one more statement, then stopping. The debugger
gives you the ability to choose either one. If you want to
execute one statement at a time, and the current statement
involves a function call, you have two options. You can either
step over the call (asking the debugger to evaluate the entire
function call and stop on the next line of the current function),
or you can step into the function call (asking the debugger to
follow the execution arrow inside the function and let you
explore what is happening inside it).
This approach to gathering information is more flexible
than print statements—if you encounter one oddity, which
suggests other things you need to explore, you can print them
immediately. By contrast, if you print the value of a variable
with a print statement, adding more print statements to
investigate other variables requires recompiling and rerunning
the program.
The previous paragraph alludes to a common occurrence
in the debugging process: recursive observations—gaining
some information (e.g., seeing the value you printed for one
variable) leads you to want some other information (e.g., to
print some other variable). Often when you are investigating
one phenomenon (“My program crashes when I enter 3…”),
you observe some other phenomenon (“y is 0 on line 42…”),
which itself leads to a question meriting investigation (“How
did that happen?”). The investigation of this second
phenomenon proceeds according to the scientific method. In
such a case, you are recursively applying the scientific method.
We will learn about recursion as a programming technique in
Chapter 7, but for now it suffices to say that recursion is when
an algorithm (in this case, the scientific method) has a step that
calls for you to apply the same algorithm to “smaller” (by some
metric) inputs. Here, investigating the second observation may
immediately solve your problem, may give you useful
information to allow you to proceed, or may prove to be a red
herring (something that was actually fine, but just surprised you
—in which case, you just continue gathering information for
your original question).
6.2.4 Form a Hypothesis
The whole point of gathering all of this information is to help
you form a hypothesis. Sometimes, you may be able to form a
hypothesis right away—typically for problems that are simple
relative to your experience level. However, forming a good
hypothesis is generally hard and requires significant
information gathering.
Forming a good hypothesis is the key to proceeding
effectively. A vague hypothesis is hard to test and not so useful
in identifying the problem. As an extreme example, the
hypothesis “My program is broken” is easily verified, but rather
useless. The hypothesis “My program is dividing by 0 on line
47 for certain inputs” is more useful, but could be improved.
Even better would be “My program is dividing by 0 on line 47
if y is odd and z is a perfect square.” This (contrived)
hypothesis is specific and clear—giving it two important
characteristics for debugging.
The first characteristic of a good hypothesis that this
exhibits is that it is testable. For a hypothesis to be testable, it
must make specific predictions about the behavior of the
program: when I give the program inputs that meet
(condition), I will observe (behavior). For such a
hypothesis, you can execute test cases to either refute this
hypothesis (e.g., if the program’s behavior does not match the
predictions that the hypothesis makes) or to become confident
enough in your hypothesis that you accept it. The contrived
hypothesis we presented at the end of the previous paragraph is
quite testable: we specify a certain category of inputs (y is odd
and z is a perfect square) and exactly what behavior we expect
to observe (division by 0 on line 47).
The second characteristic of a good hypothesis for
debugging is that it is actionable—if we convince ourselves that
it is true, it provides us with an indication of either how to fix
the error in our code, or what the next steps towards it are. In
the case of our contrived hypothesis, confirmation would likely
suggest a special case of the algorithm we did not consider. The
fact that our hypothesis is specific (with regards to what types
of inputs trigger the error) identifies the corner cases for us,
guiding us to the path to fixing the problem.
6.2.5 Test the Hypothesis
Once we have formed our hypothesis, we want to test it. In the
simplest case, this may be trivial, as the evidence in front of us
immediately convinces us beyond any doubt. If our hypothesis
is “When x is 0, line 57: y / x crashes the program.” and we
just formed this hypothesis by printing x in our debugger when
the program crashed on line 57, then we are already convinced
the hypothesis is true. In such a case, we accept the hypothesis
immediately and move on with no further testing.
However, do not be lulled into a false sense that you
should often or eagerly accept your hypothesis without any
further testing. Ten or twenty minutes spent testing a hypothesis
you are “pretty sure of” is a much better investment of your
time than wasting five to ten hours making a larger mess of your
code because you are acting on incorrect information.
To illustrate the benefits of such a tradeoff, suppose that
your notion of “pretty sure” corresponds to your hypothesis
being correct 95% of the time—meaning your hypothesis is
wrong one time in 20. On the one hand, if you spend (on
average) 10 minutes testing each of 20 hypotheses, that
amounts to 200 minutes (three hours, 20 minutes) of testing. On
the other hand, if you blindly proceed, assuming your
hypotheses are correct with no further testing—in 19 out of 20
cases, you will save yourself 10 minutes. However, in the case
where you are incorrect, you might waste five to ten hours
“fixing” your code based on an incorrect hypothesis about its
broken behavior! This tradeoff represents a false economy—
you think you are saving time by skipping the testing of the
hypothesis, but you are actually wasting more time than you
have saved when you are incorrect.
You may think those time ratios sound a bit exaggerated,
but once you convince yourself of incorrect information, it is
quite easy to “go down the rabbit hole.” You make one
seemingly logical conclusion from your false information after
another. Once you change your code based on incorrect
information, you are breaking your code even more—
introducing new errors rather than fixing the existing one(s).
You now have to debug all of these problems, which is even
harder because you are convinced of a false premise.
Debugging when you are convinced of a false premise is
especially frustrating because you start to reach a point where
nothing makes sense. This frustration makes such situations
especially important to avoid.
Testing our hypothesis may proceed in a variety of ways—
sometimes by a combination of them—such as:
Constructing test cases. Sometimes our hypothesis will
suggest a new test case (in the sense of the testing
we discussed in Section 6.1), that we can use to
produce the error under a variety of circumstances.
Generally, these follow from the specific cases that
your hypothesis makes predictions about: “My
program will (something) with inputs that
(something).” When your hypothesis suggests this
sort of result, construct a test case with the values
you were thinking of, and see if your program
exhibits that behavior. Also, construct other test
cases that do not meet the conditions to see if you
were too specific.
Inspecting the internal state of the program. We can
use the same tools we used for gathering more
information (print statements or the debugger) to
inspect the internals of the program. This sort of
testing is particularly useful when our hypothesis
suggests something about the behavior or our
program in its past—before it reaches the point
where we observe the symptom. This past may be
recent (“…but that means in the previous iteration
of the loop…” or “then x had to be odd on the last
line….”) or the distant past (“…but for that to be
the case, I had to have deleted (something) from
(some data structure)…” or “…then y’s value has
to have changed in a way I did not expect…”).
Here we want to use the debugger to inspect the
state we are interested in and see if it agrees with
our hypothesis. Frequently, when we take this
approach, discovering the surprising change in
state will give us a clue to the problem.
Adding asserts.
Earlier, we discussed asserts as a testing
tool; however, they are also useful for debugging.
If our hypothesis suggests we are violating an
invariant of our algorithm, we can often write an
assert statement to check this invariant. If the
assert does not fail, we refute the hypothesis. If
the assert fails, it not only gives us confidence in
our hypothesis, but also makes our program fail
faster—the closer it fails to the actual source of the
problem, the easier it is to fix.
Code inspection. Another method to test your hypothesis
is to inspect and analyze your code. Sometimes a
simple hypothesis can be tested with a quick
inspection. For example, if you hypothesize that
you forgot to initialize a particular variable, you
can frequently just look at the relevant portion of
the code and see whether or not you have an
initialization statement. While this technique
is generally the best for “easy” hypotheses (such as
the one just described), it typically becomes much
harder as you deal with more complex hypotheses.
Of course, here “difficult” is relative to the
programmer’s skill level. Novice programmers
who seek the help of a much more experienced
programmer may get the wrong impression about
the debugging process by seeing their problems
debugged by inspection. While the problems that
the novice programmer faces seem complex and
daunting, the experienced programmer—for whom
the problems are easy and familiar—may debug them
by inspection in a matter of seconds.
Unfortunately, many novice programmers faced
with such an experience take away the wrong
lesson: debugging is magic or lucky guesswork.
As we test, we will either convince ourselves that our
hypothesis is correct and accept it, or we will find that it is not
true and reject it. In the former case, we now know what is
wrong with our program and are ready to correct it. Typically,
identifying the precise problem and cause is 90% of the battle
—thus if our hypothesis was a good one, we are most of the
way there. Of course, sometimes our problem is severe and
requires significant modifications to our program. In the worst
cases, it may require a significant redesign and reimplementation
of large portions of the code from scratch.
The decision to throw away large portions of code and
redo them from scratch is not one to be taken lightly, nor an
easy one to make. In making such a decision, the programmer
should be wary of The Poker Player’s Fallacy—the temptation
to make a decision based on prior investments rather than
future outcomes. This term comes from a fallacy that many
novice poker players succumb to: betting based on how much
they have already put into the pot, rather than how likely they
are to win the hand (“I’m already in for $200, so I may as well
bet another $10 on the off chance I win.”). If you are not likely
to win the pot, betting another $10 is just throwing that money
away. The smart poker player will only bet on her current hand
if she thinks she can win (whether by a better hand or a bluff).
Similarly, when evaluating whether to modify the current
code or throw it away and start fresh, you should not consider
how much time you have already put into it, but rather how
much time it will take to fix the current code versus redesigning
it from scratch. Note that this is not intended to suggest you
should throw away your code and redesign it from scratch
every time there is an error. Instead, you should contemplate
the time required for both options and make a rational
tradeoff.
If instead of accepting your hypothesis, you find that you
must reject it, do not despair. In investigating this hypothesis,
you have gained valuable information that will inform your
next hypothesis. You may find a new hypothesis immediately
after you reject your current one (“Aha! the problem is not if z
is even, but rather if it is a multiple of 8!”). If not, return to the
information gathering stage and repeat the hypothesis formation
process.
One thing to be wary of when rejecting a hypothesis: there
may be multiple errors in your code. Do not be misled by
symptoms of other errors masking your current problem. For
example, suppose that the program you are testing and
debugging has two errors in it. One of these errors causes the
program to crash on line 99 if x is a multiple of 17. Another
error causes the program to crash before it reaches line 99 if x
is greater than 10,000. You have formed a hypothesis that
accurately describes the first error and are testing it with
x=17,000 (which is a multiple of 17 and greater than 10,000).
The fact that the program crashes sooner than we expect
should not cause us to reject the hypothesis immediately. We
must instead consider the possibility that there is another error
that is triggered by overlapping conditions. When faced with
such a possibility, we have two options.
One option we might take is to defer our investigation of
the first error while we try to debug the second. If we can fix
this second error, we can rerun the test case, which should then
no longer contradict our original hypothesis about the first error.
The other option we have is to confirm our suspicion that the
difference in behavior (between what we observed and what we
hypothesized) is in fact a symptom of a different problem, then
proceed with testing the current hypothesis. Here, we must
proceed with caution—we do not want to reject a valid
hypothesis, but at the same time we must be careful not to
accept a false one. We should confirm that the test case in
question is actually not triggering the situation we intended to
test—perhaps it is not reaching that point in the code, or not
exhibiting the intended circumstances when it does reach that
point. Once we have confirmed that the test case is not actually
testing the hypothesis, we can continue with other cases. Of
course, after we fix the current error, we should return to this
case and find out what the other error is.
6.3 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 6.1 : Why do we consider a test case that fails
to be a good test case?
• Question 6.2 : What is an assert statement?
• Question 6.3 : Why do we want our code to “fail fast”?
• Question 6.4 : In our discussion of black box testing,
we considered an example of a function that sums a list
of integers. We suggested that one might want to test on
a list with negative numbers and on a list with many
very large integers. What errors in the code might these
cases expose? (Hint: think about what you learned in
Chapter 3)
• Question 6.5 : The C library has a function
int abs(int i). Devise a set of black box test cases
for this function. As with all testing, you should try hard
to think of the difficult cases. Can you think of a case
that will cause it to give the wrong answer? Hint: think
about what you learned in Chapter 3 about how
numbers are represented. In particular, you should try to
come up with a number where abs(n) results in a
negative number—e.g., one that makes this function
print its message:
1 #include <stdlib.h>
2 #include <stdio.h>
3
4 void f(int x) {
5 int y = abs(x);
6 if (y < 0) {
7 printf("abs(%d) = %d. That can’t be right!\n", x, y);
8 }
9 }
• Question 6.6 : Suppose I am testing the following
function:
1 int something(int a, int b) {
2 int x = b - a;
3 int y = 1;
4 if (x > 0) {
5 y = x * (x + 1);
6 }
7 if (y > a) {
8 return 42 - y;
9 }
10 return y - b;
11 }
and I use the following test cases (in this order):
1.
2.
3.
4.
5.
6.
At what point do I have statement coverage of this
function with my test cases? How about decision
coverage? Path coverage?
• Question 6.7 : Use the debugger to win the guessing
game, whose code is shown below, in one guess.
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <sys/time.h>
4
5 int main(void) {
6 int guessesMade = 0;
7 int yourGuess;
8 char buffer[1024];
9 struct timeval now;
10 //using the time to seed the random number g
11 //is not very secure, but that doesn’t matter
12 gettimeofday(&now, NULL);
13 srandom(now.tv_usec);
14
15 int myNumber = random();
16
17 printf("I’m thinking of a number.\n");
18
19 do {
20 if (guessesMade > 0) {
21 printf("Sorry that is not right.\n");
22 }
23 printf("What number do you guess? ");
24 if (fgets(buffer, 1024, stdin)
25 printf("Oh no, you are giving up?\n");
26 return EXIT_FAILURE;
27 }
28 yourGuess = atoi(buffer);
29 guessesMade++;
30 } while (yourGuess != myNumber);
31
32 printf("That is correct, you won in %d guesses!\n", guessesMade);
33 return EXIT_SUCCESS;
34 }
Chapter 7
Recursion
The ability to repeat the same computation (with different values for the
variables) is key to many algorithms. So far, the algorithms we have seen have
used iteration to express this repetition—they have used loops to repeat the
steps until the appropriate conditions are met. However, there is another
approach called recursion, in which the algorithm calls itself with different
parameter values.
Recursive functions—those that call themselves—are important to
understand because many algorithms are much easier to write recursively. We
will note that any function you can write recursively you can write iteratively
and vice versa. However, for some problems, one style of algorithm may be
simple, while the other may be quite complex. Mastering both is crucial to
becoming a skilled programmer.
7.1 Reading Recursive Code
As we typically do, we will start by learning to read before we learn to write.
However, unlike most things we will encounter, there is actually not anything
new to learn—we just follow the same rules that we have already learned. In
fact, one of Chapter 2’s practice questions even had a recursive function for you
to execute by hand. However, we will still remind you of the rules for executing
function calls and underscore how they are exactly the same for a recursive
function.
We will start by working with the factorial function as an example.
Recall that in math, the factorial of a number n (written n! in math notation) is

n! = 1 if n = 0
n! = n × (n − 1)! if n > 0

One thing you may notice here is that the mathematical definition of the
function is recursive—the general case of n! is expressed in terms of the
factorial of a smaller number ((n − 1)!). Since the problem itself is recursive,
a recursive algorithm is a natural choice. We will see how to go about devising
and implementing an algorithm in Section 7.2, but for now, we will just work
from the following code, which computes factorial:
1 int factorial(int n) {
2 if (n <= 0) {
3 return 1;
4 }
5 return n * factorial(n - 1);
6 }
If you take a second to examine this code, you will see that everything in it is
something we have seen. Line 1 tells us that this function is called factorial,
takes an int n, and returns an int. Line 2 has an if statement with a
conditional expression (n <= 0), which should be eminently familiar by now.
Line 3 contains the “then” clause of the if statement, which just returns 1. Line
5 returns an expression that is the result of multiplying n by the result of calling
the factorial function with the argument n - 1.
Video 7.1: Executing the recursive factorial function by hand.
Whenever our execution arrow reaches Line 5, we do everything exactly as
we learned in Chapter 2. We multiply n (which we would read out of its box) by
the value returned by factorial(n - 1). To compute factorial(n - 1)—
which is itself a function call—we would need to draw a stack frame for
factorial (with a box for the parameter n), compute n - 1 and copy it into the
box for the parameter, note down where to return to, and move our execution
arrow into the start of the factorial function.
Video 7.1 walks through the execution of factorial(3).
7.2 Writing Recursive Functions
Recall that in Section 4.5 we learned that when translating your steps to code, a
complex step should be translated into a call to a function. If you do not already
have a function that performs the task you require, you go and write that
function once you finish the current one. Writing a recursive function is actually
just a special case of this principle: the function we need to call just happens to
be the one we are writing. If we are willing to accept that this function will
work correctly when we finish it, then it is exactly what we need to use.
At this point, many novice programmers are confused by the seeming
“magic” of recursion: if you can just call the function you are writing, why not
just write
1 int factorial_broken(int n) {
2 return factorial_broken(n);
3 }
Surely, if the computer can figure out the “magic” of a function calling itself in
our first example, it can do the same magic here.
Of course, recursion is not actually magic. In our first example of factorial,
we could execute factorial(3) by hand and come up with the correct answer.
If we try to execute factorial_broken(3), we will actually never come up
with an answer. Instead, we will recurse infinitely—that is
factorial_broken(3) will just call factorial_broken(3), which will just call
factorial_broken(3), and so on, forever. Executing this code by hand, we
would quickly realize the problem, and stop. The computer, on the other hand,
does not reason, and would continue following our instructions, until stopped
by something—when it either runs out of memory to make frames, or when the
user kills the program.
Instead, let us better understand recursion by examining some key
differences between the two functions. One key difference is that the first
function was obtained by following the Seven Steps to write a program, and the
recursive call corresponded to something we actually did in trying to solve the
problem. By contrast, the factorial_broken function corresponds to saying
“Look, the way you compute factorial is to just compute the factorial.” Such a
statement is pretty useless in telling you how to perform the required task—
although it does lend itself to the joke, “All you have to do to write a correct
recursive function is write a correct recursive function.”
The second difference is that the factorial function has a base case—a
condition in which it can give an answer without calling itself—in addition to
its recursive case—when the function calls itself. In this particular function, the
base case is when n <= 0—the function just returns 1 without further work.
Another important aspect of the correct factorial function is that the
recursive calls always make progress toward the base case. In this particular
case, whenever factorial calls itself, it does so with a smaller parameter value
than its own (factorial(3) calls factorial(2), which calls factorial(1),
which calls factorial(0)).
Video 7.2: Writing factorial recursively.
Now that we have some general principles in mind, it is time to see how
we wrote the factorial function we saw earlier. Video 7.2 walks through Steps
1–5 of the programming process for the factorial function.
Another mathematical function that lends itself naturally to recursion is the
Fibonacci function. The Fibonacci function is defined as follows:

fib(0) = 0
fib(1) = 1
fib(n) = fib(n − 1) + fib(n − 2) for n ≥ 2
Video 7.3: Writing Fibonacci recursively.
Video 7.3 shows Steps 1–5 of the programming process for the Fibonacci
function. Observe that the resulting function has two base cases—if n is 0 or 1
—and that all recursive calls make progress towards the base cases. If n is
negative, the function negates it and recurses—this is “progress” since we then
always work with positive numbers. If n is greater than 1, the function recurses
on n - 1 and n - 2, which means that n is strictly decreasing as long as it is
positive. The fact that we do n - 2 makes it important that 0 and 1 are both
base cases. If 1 were not a base case, then fib(1) recursing on fib(n - 2)
would call fib(-1), which would then call fib(1), which would then call
fib(-1), and we would recurse infinitely.
We will note that this implementation of the Fibonacci function is quite
slow, even for moderately sized n. On my computer, computing fib(46) takes
about 8 seconds—quite a long time for the computer. One incorrect conclusion
that perpetuates as a bit of an “urban myth” is that recursion is slow. What is
actually happening here is that the way that we have written our fib algorithm
duplicates work—our implementation will compute fib(44) twice, fib(43)
three times, fib(42) five times, fib(41) eight times, …, fib(1) 1,836,311,903
times, and fib(0) 1,134,903,170 times! In total, evaluating fib(46) will
require 5,942,430,145 total calls to the Fibonacci function. Video 7.4 illustrates
this duplication of work for fib(5).
Video 7.4: Duplication of computation in Fibonacci.
The problem here is not that recursion is slow. It is that algorithms that
recompute the same thing millions of times are slow. For the moment, we are
not terribly concerned with performance—we want to focus on learning to write
correctly working code. However, there are times when performance matters. In
some cases, a horribly inefficient/slow algorithm such as this would be
unacceptable, but anything reasonable works fine. In other cases, every last bit
of performance counts, and highly skilled (and well-paid!) programmers spend
hours refining their code for speed.
What could we do if we had an algorithm such as this, but the slow
performance were unacceptable? We need to find some way to express our
algorithm such that it does not repeat work it has already done. In the case of
Fibonacci, we could rethink our algorithm to work up from 0 and 1 (computing
fib(2), then fib(3), then fib(4), and so on)—at any step, we just add together
the previous two values (we compute fib(5) by adding together whatever we
just computed for fib(3) and fib(4)). However, rethinking the algorithm may
be tricky—to come up with a different approach, you need to convince yourself
to think about the problem differently.
Another way to fix the performance problem without changing the
underlying algorithm is memoization—keeping a table of values that we have
already computed and checking that table to see if we already know the answer
before we recompute something. We will learn later how to efficiently store and
look up data, which would make a memoized version of the function quite fast.
7.3 Tail Recursion
The recursive functions we have seen so far use head recursion—they perform
a computation after they make a recursive call. In the factorial example, after
the recursive call to factorial, the returned result is multiplied by n before that
answer is returned. There is another form of recursion called tail recursion. In
tail recursion, the recursive call is the last thing the function does before
returning. That is, for a tail-recursive function f, the only recursive call will be
found in the context of return f(...);—as soon as the recursive call to f
returns, this function immediately returns its result without any further
computation.
A generalization of this idea (separate from recursion) is a tail call—a
function call is a tail call if the caller returns immediately after the called
function returns, without further computation. Put another way: if function f
calls function g, then the call to g is a tail call if the only thing f does after g
returns is immediately return g’s return value (or if f’s return type is void, just
return). These two concepts tie together in that a recursive function is tail-
recursive if and only if its recursive call is a tail call. To understand this
definition, let us look at a tail-recursive implementation of the factorial
function:
1 int factorial_helper(int n, int ans) {
2 //base case
3 if (n <= 0) {
4 return ans;
5 }
6 //recursive call is a tail call
7 //after recursive call returns, just return its answer
8 return factorial_helper(n - 1, ans * n);
9 }
10 int factorial(int n) {
11 return factorial_helper(n, 1); //tail call
12 }
Here we have a tail-recursive helper function—a function our primary
function calls to do much of the work. Helper functions are quite common with
tail recursion but may appear in other contexts as well. The helper function
takes an additional parameter for the answer, which it builds up as it recurses.
The primary function just calls this function, passing in n and 1. Notice how the
recursive call is a tail call. All that happens after that call returns is that its result
is returned immediately. Note: the tail-recursive version of factorial does no less
work than the original recursive version. In the original recursive version, the
factorial of a number is created by a series of multiplications that occur after
each recursive call. In the tail-recursive version, this multiplication takes place
prior to the recursive call. The running product is stored in the second
parameter to the tail-recursive function.
Video 7.5: Execution of the tail-recursive implementation of
factorial.
Video 7.5 shows the execution of factorial(3) using this tail-recursive
implementation. We will note that in this video (and many videos in the future),
we elide the main calling factorial—even though all programs start with main,
we will sometimes just show a program starting in the middle for the sake of
brevity. In such videos, we will just start with the execution arrow inside the
function and a frame set up for it with whatever arguments we want to
demonstrate it with. You can imagine that this function was called by main (or
some other function, which itself was called by main).
After you watch this video, take a moment to write factorial using
iteration (loops), and execute it by hand for n = 3. Do you notice any
similarities between the iterative execution and the tail recursive execution?
What about differences? Take a minute to try this out (possibly re-watching the
video after you do the iterative implementation/execution of factorial) before
you proceed.
To be concrete, let us consider the following iterative implementation of
the factorial function:
int factorial(int n) {
  int ans = 1;
  while (n > 0) {
    ans = ans * n;
    n--;
  }
  return ans;
}
Executing both of these implementations of the factorial function by
hand shows that they behave pretty much the same way. The main difference is
that the recursive function creates multiple frames, whereas the iterative
function does not. However, an optimizing compiler will perform tail-recursion
elimination and reuse the current frame on a tail-recursive call. Because the
function returns immediately after the tail call completes, the compiler
recognizes that the values in the frame will never be used again, so the frame can be
overwritten.
In fact, tail recursion and iteration are equivalent. Any algorithm we can
write with iteration, we can trivially transform into tail recursion, and vice
versa. A smart compiler will compile tail-recursive code and iterative code into
the exact same instructions.
In general, if we have a loop that looks like this (note that this is not truly
C, but rather pseudo-code—it is code-like, but not exactly code):
t1 var1 = expr1; //t1..tN are types (like int)
t2 var2 = expr2;
...
tN varN = exprN;

while (condition) { //whatever conditional expression
  someCode; //we might do some other code here, e.g. printing things
  var1 = update1;
  var2 = update2;
  ...
  varN = updateN;
}
return ansExpr; //some expression like var1 + var2 * var3
We can transform it into a tail recursion that looks like this3:
appropriateType helper(t1 var1, t2 var2, ..., tN varN) {
  if (!condition) {
    return ansExpr;
  }
  someCode;
  return helper(update1, update2, ..., updateN);
}
The inverse is also true—we can convert tail recursion to a while loop by
doing the reverse.
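As a concrete instance of this transformation (our own illustrative example, not from the text), consider summing the integers from 1 to n, first as a loop and then as a tail-recursive helper. The loop variables become the helper's parameters, and the negated loop condition becomes the base case:

```c
// Iterative version: the loop pattern above with one accumulator.
unsigned sum_to_iter(unsigned n) {
  unsigned total = 0;
  while (n > 0) {
    total = total + n;
    n--;
  }
  return total;
}

// Mechanical translation into tail recursion: the loop variables
// become parameters, and the negated loop condition is the base case.
unsigned sum_helper(unsigned n, unsigned total) {
  if (!(n > 0)) {
    return total;
  }
  return sum_helper(n - 1, total + n); // tail call
}

unsigned sum_to_rec(unsigned n) {
  return sum_helper(n, 0);
}
```

A compiler performing tail-recursion elimination would compile sum_to_rec into essentially the same instructions as sum_to_iter.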
7.4 Functional Programming
The equivalence of tail recursion and iteration is especially important for
functional programming languages. In a purely functional language, you cannot
actually modify a value once you create it (at least not from the standpoint of
the language: the compiler is free to reuse the same “box” if it can conclude you
will never use it again). As such, there are no loops (which typically require
modifying variables to change the conditions), but rather only recursive
functions. What you would typically write as a loop, you instead just write as a
tail-recursive function.
Functional programming is a beautiful thing, and we highly recommend
that any serious aspiring programmer become fluent in at least one functional
language. One of the authors particularly loves SML (see Chapter 33) and
highly recommends it. Even if your main line of programming work is in some
other language (such as C, Java, C++, or Python), having experience in a
functional language helps you think about problems in different ways and
makes you a better programmer all around. Of course, you may also end up
using a functional language for your primary work.
7.5 Mutual Recursion
We may also find it useful to write some functions using mutual recursion—two
or more functions that call each other. Recall that recursive functions come up
when our generalized steps call for us to solve the same problem on parameter
values closer to the base case(s). Mutually recursive functions occur when we
write one function, find a complex step we want to abstract out into a second
function, then go to write that second function, only to find a complex step that
exactly matches the first function. Here, we again need to be careful to make
sure the mutually recursive pair makes progress toward a base case, which does
not require recursion—otherwise, we will recurse infinitely, and our program
will not terminate.
As a (somewhat contrived) example, suppose we did not have the modulus
(%) operator available and wanted to write a function to figure out if a positive
number is even. We might start from the fact that 0 (is even) and 1 (is not even)
are easy cases4 and use the fact that n is even if (and only if) n - 1 is odd. Such
reasoning would lead us to the following code:
int isEven(unsigned int n) {
  if (n == 0) {
    return 1;
  }
  if (n == 1) {
    return 0;
  }
  return isOdd(n - 1); //complicated step: abstract into a function
}
We would now need to proceed by writing the isOdd function, which we
relied on in implementing isEven. In writing that, we might start from the fact
that 0 (is not odd) and 1 (is odd) are easy cases and use the fact that n is odd if
(and only if) n - 1 is even. We would then write:
int isOdd(unsigned int n) {
  if (n == 0) {
    return 0;
  }
  if (n == 1) {
    return 1;
  }
  return isEven(n - 1); //already have a function to do this step
}
These two functions are mutually recursive—they call each other. Note
that we will need to write the prototype (recall that the prototype for a function
tells the name, return type, and argument types without providing the body) for
the second function before the first to let the compiler know about the existence
of the second function. The resulting code would look like this:
int isOdd(unsigned int n); //prototype for isOdd
int isEven(unsigned int n) {
  if (n == 0) {
    return 1;
  }
  if (n == 1) {
    return 0;
  }
  return isOdd(n - 1); //complicated step: abstract into a function
}
int isOdd(unsigned int n) {
  if (n == 0) {
    return 0;
  }
  if (n == 1) {
    return 1;
  }
  return isEven(n - 1); //already have a function to do this step
}
Video 7.6 shows the execution of these mutually recursive functions. Note
that the calls are tail calls, so the compiler may optimize the frame usage.
Video 7.6: Execution of the mutually recursive isOdd and
isEven functions.
While our example here may be rather contrived (if you want to know if a
number is odd or even, it is much more efficient to just test whether n % 2 == 0 or
not), there are many important uses of mutually recursive functions. One
common use is recursive descent parsing. Parsing is the process of analyzing
input text to determine its meaning. A recursive descent parser is typically
written by writing many functions (each of which parses a specific part of the
input), which then mutually recurse to accomplish their jobs. We will not go
into the details here, but you can imagine the mutually recursive nature by
thinking about C functions and their arguments—for example, if you were
writing the parser for a C compiler.
At a high level, to parse a function call (f(arg1, arg2,...)) you would
write a function parseCall. The parseCall function would read the function name and
the open parenthesis, then repeatedly call another function, parseExpression, to
parse each argument (which must be an expression), checking whether each
argument is followed by a comma or a close parenthesis. The parseExpression
function itself may encounter a function call, in which case it would need to
call parseCall. Such a situation would occur if the text being parsed looked
like this: f(3, g(42), h(x, y, z(1)))—some of the arguments to f are
themselves function calls, and in fact, one of the arguments to h is also a
function call.
Such a parse often results in a mutually recursively defined data structure
(we will learn more about recursive data structures in Part III)—meaning you
have two (or more) types that recursively reference each other. Algorithms to
operate on mutually recursive data structures typically lend themselves quite
naturally to mutually recursive implementations. Of course, mutually recursive
data structures may come up in a variety of other contexts as well.
7.6 Theory
Recursion has a strong relationship to the mathematical proof technique of
induction. If you need a quick refresher on induction, it is a technique that lets
us prove ∀n ≥ 1. P(n), where P(n) is some proposition about n.
(Translation: we would like to prove that P(n) is true for all positive, whole
numbers, n.) Proof by induction starts by proving the base case, P(1). The
proof then proceeds by showing the inductive case—either proving
P(n) → P(n + 1) (weak induction) or (∀k ≤ n. P(k)) → P(n + 1)
(strong induction).
The similarities between the two—having a base case and a
recursive/inductive case working on smaller values—are not a random
coincidence. Our recursive function computes some answer (with certain
properties) for all possible inputs.5 The recursive function works by assuming
the recursive case works correctly and using that fact to make the current case
work correctly—much like the inductive case assumes that the proposition
holds for smaller numbers and uses that fact to prove it for the “current”
number. In fact, if we wanted to prove that a recursive function is correct, we
would proceed by induction—and the structure of the inductive proof would
mirror the structure of the recursive function.
Recursion is not just limited to the natural numbers (unsigned ints). We
can recurse with arguments of different types or on the natural numbers with a
different ordering. In general, we can recurse on any set with a well-ordering
(that is, a total ordering where every subset of the set has a least element) on it.
We have seen this principle in action (though not explicitly discussed the
theoretical implications) in our Fibonacci example. Here, our function operated
on all integers (positive and negative). Proof by induction over all integers is a
bit trickier, since they do not have a smallest value (thus base case) under their
standard ordering.
Our Fibonacci example was, however, theoretically sound, as we can well-
order the integers by using a different ordering (which we will call ⊑).
Specifically, for our Fibonacci function, we could order the integers by
absolute value: 0 ⊑ 1 ⊑ -1 ⊑ 2 ⊑ -2 ⊑ 3 ⊑ -3 ⊑ ⋯.
Now, whenever fib(x) recurses to fib(y), we can see that y ⊑ x—it is
“less” on this well-ordering. We can then prove the function correct by
induction using this same ordering.
We can use this principle to perform recursion (and correspondingly,
induction) over types that are not directly numbers (of course, to a computer,
everything is a number, as we learned earlier—which we will come back to in a
second). Later, when we discuss recursively-defined data structures, such as
lists and trees, this principle will be quite useful—we will want to recurse over
the structure of the data. Fortunately, these structures have well-orderings, so
we can recurse on them soundly.
The “Everything Is a Number” principle actually appears in the
mathematical theory related to well-ordered sets. We can take our well-ordered
sets and “number” the elements, then just consider the ordering of those
numbers. Technically, we may need the ordinal numbers to perform this
numbering, but if you are not familiar with them, you can just imagine the
natural numbers as being sufficient.
All of this discussion of math is not just a theoretical curiosity. Instead, it
gives us a formal way of understanding when our recursion is well-founded,
versus when we may recurse infinitely. If we can construct a well-ordering of
our parameters and show that every time we recurse we are recursing with
values that are “less than” (under this well-ordering) what was passed in, we
can conclude that our recursion will always terminate—that is, we will never
recurse infinitely. Observe that this property implies we have a base case, as the
well-ordering has a smallest element, so we are not allowed to recurse on that
element (it is impossible to have anything smaller, so we cannot obey the rule of
recursing on only smaller elements).
This property may sound both wondrous (“Great! I can guarantee my
recursions will never be infinite.”) and possibly daunting (“This sounds like a
lot of math…My knowledge of well-ordered sets and the ordinals is kind of
rusty.”). Fear not—for many recursive tasks, this ordering is much simpler than
it sounds. We can make a measure function, which maps the parameter values to
the naturals (or ordinals if needed) and convince ourselves that the measure
decreases with every recursive call. For lists, we might measure them with their
length, for trees we might measure them with their height, and so on. For most
programming tasks, you will not actually need to formally prove any of this—
you will just want to convince yourself that it is true to ensure you do not
recurse infinitely.
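As a small concrete illustration (ours, using string length as the measure), consider a recursive function over C strings. The measure maps the argument to the number of characters remaining; every recursive call strictly decreases it, so the recursion must eventually reach the base case:

```c
// The measure of the argument s is the number of characters
// before its terminating '\0'. Each recursive call advances one
// character, so the measure strictly decreases toward 0.
unsigned my_strlen(const char *s) {
  if (*s == '\0') {
    return 0; // measure 0: the base case, where no recursion is allowed
  }
  return 1 + my_strlen(s + 1); // recurse on a strictly smaller measure
}
```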
7.7 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 7.1 : Use recursion to write the function
int pascal(int i, int j), which returns the number in the i-th row
and j-th column of Pascal’s triangle (we will consider the first row and
first column to be numbered i = 0 and j = 0). If i and j lie outside
the triangle, your function should return 0. Recall that Pascal’s triangle
looks like this:
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
....continues infinitely.....
Write a main function that tests your code with several values, and
convince yourself it is correct.
• Question 7.2 : The following code approximates the square root of a
double and returns that approximation (Note: if you actually need the
square root of something, just use the built-in sqrt function (found in
math.h)—which is faster and more accurate).
#include <math.h> //for fabs

double mySqrt(double d) {
  double lo = 0;
  double hi = d;
  double guess = d / 2;
  while (fabs(guess * guess - d) > 0.000001) {
    if (guess * guess > d) {
      //too high:
      // reduce guess to midway between guess and lo
      // reduce hi to previous guess
      double temp = (guess - lo) / 2 + lo;
      hi = guess;
      guess = temp;
    }
    else {
      //too low:
      // increase guess to midway between guess and hi
      // increase lo to previous guess
      double temp = (hi - guess) / 2 + guess;
      lo = guess;
      guess = temp;
    }
  }
  return guess;
}
Convert the iterative solution presented above into a tail-recursive
implementation that performs the same task. Hint: write a helper
function.
• Question 7.3 : Execute the iterative and tail-recursive square root code
(from the previous problem) by hand for d = 64. Convince yourself
that they are doing the same thing, and make sure you understand why
the tail-call optimization (reusing the frames on a tail call) is correct.
• Question 7.4 : What is the output when the following code is executed?
#include <stdio.h>
#include <stdlib.h>

void g(int x) {
  if (x != 0) {
    g(x / 10);
    printf("%d", x % 10);
  }
}
void f(int x) {
  if (x < 0) {
    printf("-");
    g(-x);
  }
  else if (x == 0) {
    printf("0");
  }
  else {
    g(x);
  }
  printf("\n");
}
int main(void) {
  f(42);
  f(-913);
  return EXIT_SUCCESS;
}
• Question 7.5 : Is the function g in the previous problem head- or tail-
recursive? Why?
• Question 7.6 : Write a recursive function
unsigned power(unsigned x, unsigned y), which computes x^y.
Write a main function that tests it with several values, and convince
yourself it is correct.
• Question 7.7 : Write a recursive function void printHex(unsigned x),
which takes an unsigned int and prints it out in hexadecimal. Recall
from Section 3.1.2 that hexadecimal is base 16 and has digits with
values 0–F. As always, write a main that tests your code for several
values, and convince yourself it is correct.
Chapter 8
Pointers
One of the most important and powerful aspects of the C
programming language is its use of pointers. Pointers give a
programmer a significant amount of control and flexibility when
programming, enabling solutions that are clean and efficient.
However, they can also be a common source of confusion and
bugs—primarily when people use them without understanding
them or fail to plan properly. This chapter will introduce the
concept of pointers and provide the reader with an in-depth
understanding of what they are and how to use them correctly.
Let us begin with a simple example to motivate the use of
pointers. Suppose we want to write a swap function, which will
take two integers and swap their values. With the programming
tools we have so far, our function might look something like
this:
#include <stdio.h>
#include <stdlib.h>

void naive_swap(int x, int y) { // note: this does not work as intended
  int temp = x;
  x = y;
  y = temp;
}
int main(void) {
  int a = 3;
  int b = 4;
  naive_swap(a, b);
  printf("a = %d, b = %d\n", a, b);
  return EXIT_SUCCESS;
}
Unfortunately, this code is not going to have the
functionality we intended. When we pass the arguments a and b
to the function naive_swap, the function is given its own local
copy of these variables. Although the function successfully
swaps its copies of a and b, the original variables are left
unchanged. You should be able to see this behavior yourself by
applying the rules we have seen so far; however, to make sure
you understand clearly, Video 8.1 walks through the execution
of this flawed code.
Video 8.1: Stepping through a naïve (pointer-less)
implementation of swap.
In order to write our function, we need a way for a function
to refer to its caller’s variables (in this case, a and b). Pointers
provide exactly this functionality—the ability to refer to another
location in the program—although they also have many other
uses beyond just swapping data.
8.1 Pointer Basics
Pointers are a way of referring to the location of a variable.
Instead of storing a value like 5 or the character ’c’, a pointer’s
value is the location of another variable. Later in this chapter we
will discuss how this location is stored in hardware (i.e., as a
number). Conceptually, you can think of the location as an arrow
that points to another variable.
Just like we can make variables that are integers, doubles,
etc., we can make variables that are pointers. Such variables
have a size (a number of bytes in memory), a name, and a value.
They must be declared and initialized.
Declaring a pointer In C, pointer (by itself) is not a type. It
is a type constructor—a language construct that, when applied to
another type, gives us a new type. In particular, adding a star (*)
after any type name gives the type that is a pointer to that type. For
example, the code char *my_char_pointer; (pronounced “char
star my char pointer”) declares a variable with the name
my_char_pointer with the type pointer to a char (a variable that
points to a character). The declaration of the pointer tells you the
name of the variable and type of variable that this pointer will be
pointing to.
Figure 8.1: Pointer Basics. Code (left) and conceptual representation (right) of a
variable x with value 5 and a variable xPtr, which points to x.
Assigning to a Pointer As with all other variables, we can
assign to pointers, changing their values. In the case of a pointer,
changing its value means changing where it points—what its
arrow points at. Just like other variables, we will want to
initialize a pointer (giving it something to point at) before we
use it for anything else. If we do not initialize a pointer before
we use it, we have an arrow pointing at some random location in
our program, and we will get bad behavior from our program—
if we are lucky, it will crash.
Assignment statements involving pointers follow the same
rules that we have seen so far. However, we have not yet seen
any way to get an arrow pointing at a box—which is what we
need to assign to a pointer. To get an arrow pointing at a box
(technically speaking, the address of that box in memory), we
need a way to name that box, and then we need to use the &
operator. Conceptually, the & operator (the symbol is called an
“ampersand,” and the operator is named the “address-of”
operator) gives us an arrow pointing at its operand. The code
xPtr = &x;, for example, sets the value of the variable xPtr to
the address of x. After it is initialized, xPtr points to the variable
x. On the left of Figure 8.1, you can see code that declares an
integer x and a pointer to an integer xPtr. In line 2, the value of
x is initialized to 5 on the same line in which it is declared. In
line 5, the value of xPtr is initialized to the location of x. Once
initialized, xPtr points to x.
The address-of operator can be applied to any lvalue (recall
that an lvalue is an expression that “names a box”) and evaluates
to an arrow pointing at the box named by the lvalue. The only
kind of lvalue we have seen so far is a variable, which names its
own box. Accordingly, for a variable x, the expression &x gives
an arrow pointing at x’s box. It is important to note that the
address of a variable is itself not an lvalue and thus not
something that can be changed by the programmer. The code “&x
= 5;” will not compile. A programmer can access the location of
a variable, but it is not possible to change the location of a
variable.
Note that in the context of pointers, the & symbol is a unary
operator—an operator that takes only one operand. It is used
before the lvalue whose address should be taken. This operator
is not to be confused with the binary operator—an operator that
takes two operands. The binary operator & performs a bitwise
AND (performing a boolean AND on each binary bit of the two
operands).1
Dereferencing a pointer Once we have arrows pointing at
boxes, we want to make use of them by “following the arrow”
and operating on the box it points at—either reading or changing
the value in the box at the end of the arrow. Following the arrow
is accomplished using the star symbol (*), a unary operator that
dereferences a pointer.2 An example of the use of the
dereference operator is shown in line 6 of Figure 8.1. *xPtr =
6; changes the value that xPtr points to (i.e., the value in the
box at the end of the arrow—namely, x’s box)—to 6. Notice that
the green arrow indicates this line has not yet been executed,
hence x still has the value 5 in the conceptual representation.
Once line 6 is executed, however, the value will be 6.
Do not be confused by the two contexts in which you will
see the star (*) symbol. In a variable declaration, such as int
*p;, the star is part of the type name and tells us that we want a
pointer to some other type (in our example, int * is the type of
p). In an expression, the star is the dereference operator. For
example, the code r = *p; gives the variable r a new value,
namely the value inside the box p (which is an arrow) points to.
The code *p = r; changes the value inside the box p points at to
be a new value, namely the value of the variable r. As always,
remember that you can think about declaration with initialization
as two statements: int * q = &y; is the same as the two
statements int *q; q = &y;. Generally when you work with
pointers, you will use the star first to declare the variable and
then later to dereference it. Only variables that are of a pointer
type may be dereferenced.
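Putting declaration, assignment, and dereferencing together, a minimal sketch (ours, echoing the code in Figure 8.1) looks like this:

```c
#include <stdio.h>

// Declare a pointer, point it at a variable, then write
// through it: the write lands in x's box.
int pointerDemo(void) {
  int x = 5;   // an ordinary int in its own box
  int *xPtr;   // declares a pointer to an int (uninitialized so far)
  xPtr = &x;   // initialize: xPtr now holds an arrow to x's box
  *xPtr = 6;   // follow the arrow: writes 6 into x's box
  printf("x = %d, *xPtr = %d\n", x, *xPtr); // both show 6
  return x;
}
```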
8.2 A Picture Is Worth a Thousand Words
When working with pointers, always draw pictures. Although
there are only three basic pointer actions—declaring, assigning
(including initializing), and dereferencing—trying to read code
that does all of these actions—often multiple times within a
single line—can be very confusing. Until you can successfully
read code that uses pointers, it will be almost impossible to write
code that does so. Furthermore, when you are writing code with
pointers, careful planning—by drawing pictures—is even more
important in order to write correct code.
Pointers are variables and should be drawn like any other
variable at the time of declaration. Draw a box with the name of
the variable on the left and the value of the variable on the right.
When a pointer has been declared but not yet initialized, it is not
known where the pointer points. The uninitialized value can be
represented with a question mark just as it is for uninitialized
variables of other types.
Video 8.2: Stepping through a simple series of
pointer manipulations.
As you learned in Section 2.1, assignment statements in C
(x = y;) can be broken down into two parts. The left-hand side
(here, x), which is the lvalue, is the box that will have a new
value placed inside it. Before this chapter, lvalues were simply
variables (e.g., x or y). With the introduction of pointers, we can
also use a dereferenced pointer as an lvalue—if p is a pointer,
then *p is an lvalue, as it names the box at the end of p’s arrow.
For example, line 6 of Figure 8.1 assigns a value to the variable
xPtr and line 7 assigns a value to the location that xPtr points
to.
The right-hand side (the y in the statement x = y;), which
is the rvalue, is the value that will be placed inside the
box/lvalue. With the introduction of pointers, we add two new
types of expressions that can appear in rvalues: the address of an
lvalue (&x) and the dereferencing of a pointer (*p).
Video 8.2 shows an example of some basic pointer code
and the drawings that help us to correctly follow its behavior
line by line.
8.3 swap, Revisited
Now that we know about pointers, we can write functions that
are able to access and manipulate the variables of another
function—we can pass pointers to the locations we want the
called function to operate on. We now have the tools to write a
correct version of swap. The key is to pass the swap function
pointers to the original variables a and b, as seen here:
#include <stdio.h>
#include <stdlib.h>

void swap(int *x, int *y) {
  int temp = *x;
  *x = *y;
  *y = temp;
}
int main(void) {
  int a = 3;
  int b = 4;
  swap(&a, &b);
  printf("a = %d, b = %d\n", a, b);
  return EXIT_SUCCESS;
}
Because the variables x and y hold arrows pointing to the
original variables a and b, the function can swap the caller
function’s variables and not simply its own local copies. The
parameters now tell us the locations of the data to manipulate rather
than the values themselves; the function manipulates the
values at those locations by dereferencing the pointers.
Video 8.3 steps through the execution of this correct swap
function.
In this video (and many others with pointers, which will
become quite common through the rest of the book), we may use
different colors to draw different pointers. These colors have no
significance in terms of the program behavior—there is nothing
special about one pointer being dark blue and one being light
blue. Instead, we just use different colors for visual clarity when
the diagrams would be confusing if the pointers were all drawn
in the same color.
Video 8.3: Stepping through a correct
implementation of swap with pointers.
8.4 Pointers Under the Hood
In a conceptual drawing, representing a pointer with an arrow is
an effective way of showing what the pointer points to. From a
technical perspective, however, an arrow does not make much
sense. (How exactly does one represent an arrow in hardware?
After all, “Everything Is a Number”—and arrows do not seem
like numbers.)
Figure 8.2: Data with Addresses.
The mechanics of pointers make a little more sense when
we look under the hood at the hardware representation. By now
you are familiar with the idea that data is stored in your
computer in bytes. Some data types, like characters, require one
byte (eight bits), and some data types, like integers, require four
bytes3 (32 bits). When we draw boxes for our variables, we do
not necessarily think about how big the box is, but that
information is implicit in the type of the variable.
Addressing A computer keeps track of all of the data by
storing it in memory. The hardware names each location in
memory by a numeric address. As each different address refers
to one byte of data, this type of storage is called byte-
addressable memory. A visualization of what this looks like is
shown in Figure 8.2. Each box represents a byte of memory, and
each byte has an address shown immediately to the left of it. For
example, the address 208 contains one byte of data with the
value 00000000.
Figure 8.3: Generic drawing of bytes in memory and their addresses.
Figure 8.3 (an expansion of Figure 3.10, which depicted the
ASCII encoding of a string) shows the code, conceptual
representation, and hardware representation of the declaration
and initialization of one four-byte integer, four one-byte
characters, and finally two four-byte pointers.4 Each variable has
a base address. For example, the address of x is 104, and the
address of c3 is 101. The variable x is small enough that it can
be expressed within the first byte of its allocated space in
memory. If it were larger (or simply negative), however, the
bytes associated with addresses 105–107 would be non-zero.5
On a 32-bit machine, addresses are 32 bits in size. Therefore,
pointers are always four bytes in size, regardless of the size of
the data they point to.
With this more concrete understanding of memory and
addresses, the hardware representation of pointers becomes
clear: pointers store the addresses of the variable they point to.
The final variables, y and z, in Figure 8.3 show just that. The
variable y is an integer pointer initialized to point to x. The
conceptual drawing shows this as an arrow pointing to the box
labeled x. The hardware drawing shows this as a variable that
has the value 104, the base address of x. The variable z is
declared as a pointer to a char. Although a character is only one
byte, an address is 32 bits, and so z is four bytes in size and
contains the value 101, the location of c3. (If these variables
were located in memory locations with higher addresses, the
numerical values of the addresses would be larger, and there
would be non-zero values across all four bytes of the pointers y
and z.)
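A short check of this claim (our own sketch; on a modern 64-bit machine each pointer will print as 8 bytes rather than the 4 described for a 32-bit machine):

```c
#include <stdio.h>

// Pointers all hold addresses, so they share one size regardless
// of the size of the type they point to.
void showPointerSizes(void) {
  char c = 'A';
  int x = 5;
  char *cp = &c; // points at a 1-byte box
  int *xp = &x;  // points at a 4-byte box
  printf("sizeof(char) = %zu, sizeof(int) = %zu\n",
         sizeof(char), sizeof(int));
  printf("sizeof(cp) = %zu, sizeof(xp) = %zu\n",
         sizeof cp, sizeof xp);
}
```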
Pointers are variables whose values are addresses. An
important consequence of this fact is that a pointer can only
point to addressable data. Expressions that do not correspond to
a location in memory cannot be pointed to, and therefore a
program cannot attempt to use the ampersand (address of)
operator for these expressions. For example, neither 3 nor (x+y)
are expressions that represent an addressable “box” in memory.
Consequently, lines like these are illegal: int *ptr = &3; and
int *ptr = &(x + y);. Note that 3 and (x+y) are not lvalues—
they do not name boxes, which is why they cannot have an
address. The compiler would reject this code, and we would
need to correct it—primarily by thinking carefully about what
we were trying to do (drawing a picture). A corollary to this rule
is that an assignment statement can only assign a variable that
corresponds to a location in memory. So statements such as 3 =
4; or x + y + z = 2; will also trigger a compiler error.
8.4.1 A Program’s View of Memory
On a 32-bit machine, where addresses are 32 bits in size, the
entire memory space begins at 0x00000000 (in hex, each 0
represents 4 bits of 0) and ends at 0xFFFFFFFF (recall that 0x
indicates hexadecimal, and that each F represents four binary
1’s). Every program has this entire address space6 at its disposal,
and there is a convention for how a program uses this range of
addresses. Figure 8.4 depicts where in memory a program’s
various components are stored.
Figure 8.4: High-level depiction of a program’s memory layout.
Code Given the repeated claims that “Everything Is a
Number,” it should come as no surprise that a program itself can
be represented by a series of numbers. Producing these numbers
is the job of a compiler, which takes a program in a language
like C and converts it into a bunch of numbers (called object
code), which is readable by the computer. An instruction in C
that adds two numbers together, for example, might be encoded
as a 32-bit instruction in object code. Some of the 32 bits will
tell the machine to perform addition, some of the bits will
encode which two numbers to add, and some of the bits will tell
the processor where it should store the computed sum. The
compiler converts the C code into object code and also assigns
each encoded instruction a location in memory. These encoded
program instructions live in the Code portion of memory, shown
in yellow in Figure 8.4.
Static data The static data area contains variables that are
accessible for the entire run of the program (e.g., global
variables). Unlike a variable that is declared inside a function
and is no longer accessible when the function returns, static
variables are accessible until the entire program terminates
(hence, the term static). Conceptually, a static variable’s box
remains “in the picture” for the whole program, whereas other
variables are usable for only a subset of the program’s lifetime.
These variables are placed in their own location in memory, just
past the Code portion of the program, shown as Static Data in
green in Figure 8.4.
The final two sections of memory are for two different
types of program data that are available at specific times during
a program’s execution. The Heap (in purple) stores dynamically
allocated data, which we will discuss in Chapter 12. The Stack
(in orange) stores the local variables declared by each function.
The stack is divided into stack frames (introduced in Section
2.3), which start when the function is called and last until it
returns. The conceptual drawings throughout this book have
primarily shown pictures of stack frames as boxes (one for each
function) with local variables. In actuality, the stack is one
contiguous piece of memory; each stack frame sits below the
stack frame of the function that called it.
Video 8.4: Stepping through our swap function
with attention to what happens in hardware.
With these details about how pointers work and how code
is placed in memory, we can show a more detailed view of what
happens during the execution of our swap function. Video 8.4
shows the same swap function’s execution, but with a contiguous
view of the stack and memory addresses as the values of
pointers. It also shows the code of the program stored in
memory in four-byte boxes. The return address field of the
stack—previously depicted in our conceptual drawings as a blue
call site location—is also made explicit in this video. The return
address is the address of the instruction that should be executed
next after the function being called completes and returns.
The calling convention—the specific details of how
arguments are passed to and values are returned from functions
—depicted in Video 8.4 resembles an x86 machine; the
arguments to a function reside in the stack frame of the function
that called it. This is slightly different from the conceptual
drawings we have shown previously, in which the arguments
were placed in the stack frame of the function being called. For a
conceptual drawing, this is both sufficient and easy to
understand. Hardware details will differ slightly for every target
architecture.
8.4.2 NULL: A Pointer to Nothing
At the bottom of Figure 8.4, there is a blank space below the
code segment, which is an invalid area of the program’s
memory. You might wonder why the code does not start at
address 0, rather than wasting this space. The reason is that
programs use a pointer with the numeric value of 0—which has
the special name NULL—to mean “does not point at anything.” By
not placing any valid portion of the program at (or near) address
0, we can be sure that no properly initialized pointer that
actually points to something will ever have the value NULL.
Knowing that a pointer does not point at an actual thing is
useful for a variety of reasons. For one, we may sometimes have
algorithms that need to answer “there is no answer.” For
example, if we think about our closest point example from
Video 1.4, when the set of points is empty, we must return “there
is no answer”—with pointers, we can return a pointer to a point,
and return NULL in the case where “there is no answer.” We may
also have functions, whose job it is to create something, that
return NULL if they fail to do so—in Chapter 11, we will see
functions that open files (so we can read data from the disk),
which fail in this manner, and in Chapter 12 we will see
functions that dynamically allocate memory, which also return
NULL if the memory cannot be allocated. In Chapter 21, we will
begin to learn about linked structures, which store data with
pointers from one item to the next and use NULL to indicate the
end of the sequence.
Figure 8.5: Conceptual depiction of NULL: a flat-headed arrow.
When we use NULL, we will represent it as an arrow with a
flat head (indicating that it does not point at anything), as shown
in Figure 8.5. Whenever we have NULL, we can use the pointer
itself (which just has a numeric value of zero); however, we
cannot follow the arrow (as it does not point at anything). If we
attempt to follow the arrow (i.e., dereference the pointer), then
our program will crash with a segmentation fault—an error
indicating we attempted to access memory in an invalid way (in
fact, if our program tries to access any region of memory that is
“blank” in Figure 8.4, it will crash with a segmentation fault).
The NULL pointer has a special type that we have not seen
yet—void *. A void pointer indicates a pointer to any type, and
is compatible with any other pointer type—we can assign it to an
int *, a double *, or any other pointer type we want. Likewise,
if we have a variable of type void *, we can assign any other
pointer type to it. Because we do not know what a void *
actually points at, we can neither dereference it (the compiler
has no idea what type of thing it should find at the end of the
arrow), nor do pointer arithmetic on it (see Section 8.7).
8.5 Pointers to Sophisticated Types
So far, we have primarily talked about pointers to the most basic
types, such as int or double. However, we can have a pointer to
any type—including to a struct, or even to another pointer. We
can even have a pointer to a pointer to a pointer to a struct
(which may be useful for certain things).
8.5.1 Structs
When we have pointers to structs, we can just use the * and .
operators that we have seen so far; however, the order of
operations means that . happens first. If we write *a.b, it means
*(a.b)—a should be a struct, which we look inside to find a
field named b (which should be a pointer), and we dereference
that pointer. If we have a pointer to a struct, c, and we want to
refer to the field d in the struct at the end of the arrow, we
would need parentheses and write (*c).d (or use the -> operator,
which we will learn about momentarily).
Figure 8.6: Structs and Pointers: *a.b dereferences the b field inside of struct a.
Dereferencing q, then selecting field x requires either parentheses or the ->
operator.
Figure 8.6 illustrates. Here, we have a struct that has a field
p (which is a pointer to an int), and a field x, which is an int.
We then declare y (an int), a (a struct), and q (a pointer to a
struct), and initialize them. When we write *a.p, the order of
operations is to evaluate a.p (which is an arrow pointing at y),
then dereference that arrow. If we wrote *q.x, we would receive
a compiler error, as q is not a struct, and the order of operations
would say to do q.x first (which is not possible, since q is not a
struct). We could write parentheses, as in the figure ((*q).x).
However, pointers to structs are incredibly common, and
the above syntax gets quite cumbersome, especially with
pointers to structs that have pointers to structs, and so on. For
(*q).x, it may not be so bad, but if we have (*(*(*q).r).s).t,
it becomes incredibly ugly and confusing. Instead, we should
use the -> operator, which is shorthand for dereferencing a
pointer to a struct and selecting a field—that is, we could write
q->x (which means exactly the same thing as (*q).x). For our
more complex example, we could instead write q->r->s->t
(which is easier to read and modify).
8.5.2 Pointers to Pointers
We can have pointers to pointers (or pointers to pointers to
pointers etc.). For example, an int** is a pointer to a pointer to
an int. An int*** is a pointer to a pointer to a pointer to an int.
We can have as many levels of “pointer” as we want (or need);
however, the usefulness drops off quite quickly (int** is quite
common, int*** moderately common, but neither author has
ever had a use for an int**********). The rules for pointers to
pointers are no different from anything else we have seen so far:
the * operator dereferences a pointer (follows an arrow), and the
& operator takes the address of something (gives an arrow
pointing at that thing).
Figure 8.7: An illustration of pointers to pointers.
Figure 8.7 illustrates pointers to pointers. Here we have
four variables, a (which is an int), p (which is an int*), q
(which is an int **), and r (which is an int***). If we were to
write *r, it would refer to q’s box (because we would follow the
arrow from r’s box and end up at q’s box). We could write **r,
which would refer to p’s box—because *r is an arrow pointing
at p, and the second * dereferences that pointer. Likewise, ***r
would refer to a’s box. It would be a compiler error to write
****r, because that would attempt to follow the arrow in a’s
box, which is not an arrow, but rather a number (sure, everything
is a number—but our types have told the compiler that a is just a
plain number, not a number that means an arrow).
You may wonder why we might want pointers to pointers.
One answer to this question is “for all the same reasons we want
pointers”—a pointer gives us the ability to refer to the location
of a thing, rather than to have a copy of that thing. Anytime we
have a variable that tells us the location of a thing, we can
change the original thing through the pointer. Just as we might
want to write swap for integers, we might also want to write
swap for int *s (in which case, our swap function would take
int **s as parameters). In Section 10.2, we will see that we
might want pointers to pointers to store data in two (or more)
dimensional structures (e.g., a grid-like fashion). When we get to
Chapter 21 and Chapter 22 and begin manipulating linked data
structures, we will see that pointers to pointers give us elegant
and efficient ways to accomplish a variety of tasks.
Notice how the types work out (i.e., match up so that the
compiler can type check the program). Whenever we take the
address of an lvalue of type T, we end up with a pointer of type
T * (e.g., p has type int *, so &p has type int **). Whenever
we take the address of something, we “add a star” to the type.
This rule makes intuitive sense because we have a pointer to
whatever we had before. Whenever we dereference an
expression of type T*, we end up with a value of type T—we
“take a star off the type” because we followed the arrow.
For the expression…              *e                  &e
e must be…                       a pointer           an lvalue
if e’s type is…                  T*                  T
…then the resulting type is      T                   T*
conceptually, this means:        follow the arrow    give me an arrow
                                 that is e’s value   pointing at e
Table 8.1: Rules for pointers.
Table 8.1 summarizes these rules about pointers. Note that
* and & are inverse operations—if we write *&e or &*e, they both
just result in e (whenever they are legal). The first would mean
“give me an arrow pointing at e, then follow it back (to e),”
while the second would mean “follow the arrow that is e’s value,
then give me an arrow pointing at wherever you ended up.”
8.5.3 const
As we discuss pointers to various types, it is a good time to
introduce the notion of const data—data that we tell the
compiler we are not allowed to change. When we declare a
variable, we can add const to its type to specify that the
compiler should not allow us to change the data:
1 const int x = 3; //assigning to x is illegal
If we try to change the value of x, the compiler will produce an
error. Declaring data as const can be useful, as it removes a
certain class of mistakes that we can make in our program:
changing things we should not.
When we have pointers, there are two different things that
can be const: the data that the pointer points at (what is in the
box at the end of the arrow) or the pointer itself (where it
points). If we write:
1 const int * p = &x;
we have declared p as a pointer to a const int—that is, p points
at an int, and we are not allowed to change that int. We can
change where p points (e.g., p = &y; is legal—if y is an int).
However, changing the value in the box that p points at (e.g.,
*p = 4;) is illegal—we have said that the int p points at is
const.
If we do try to write something like *p = 4;, we will get a
compiler error like this:
assignment of read-only location ’*p’
We can achieve exactly the same effect by writing:
1 int const * p = &x; //same as const int * p
If we want to specify that we can change *p, but not p
itself, we would write:
1 int * const p = &x;
This declaration says that p is a const pointer to a (modifiable)
int. Writing *p = 4; would be legal, but writing p = &y; would
be illegal.
If we so desire, we can combine both to prevent changing
either where the pointer points or the value it points at:
1 const int * const p = &x;
The same principle applies to pointers to pointers (to
pointers to pointers …). For example, with an int **, we have
the following combinations:
                                Can we change…
                                **p    *p     p
int ** p                        Yes    Yes    Yes
const int ** p                  No     Yes    Yes
int * const * p                 Yes    No     Yes
int ** const p                  Yes    Yes    No
const int * const * p           No     No     Yes
const int ** const p            No     Yes    No
int * const * const p           Yes    No     No
const int * const * const p     No     No     No
Note that a declaration of const tells the compiler to give
us an error only if we try to change the data through the variable
declared as const or perform an operation where the const-ness
gets dropped. For example, the following is legal:
1 int x = 3;
2 const int * p = &x;
3 x = 4;
Here, we are not allowed to change *p, but the value we find at
*p can still be changed by assigning to x (since x is not const, it
is not an error to assign to it). However, if we write:
1 const int y = 3;
2 int * q = &y;
3 *q = 4;
then we will receive a compiler warning (which we should treat
as an error):
initialization discards ’const’ qualifier from
pointer target type
[enabled by default]
The error is on line 2, in which we assign &y (which has type
const int *) to q (which has type int *)—discarding the
const qualifier (const is called a qualifier because it modifies a
type). This snippet of code is an error because *q = 4; (on line
3) would be perfectly legal (q is not declared with the const
qualifier on the type it points to) but would do something we
have explicitly said we do not want to do: modify the value of y.
Novice programmers often express some confusion at the
fact that the first example is legal and the second is not—in both
cases, we have tried to declare a variable and a pointer (to that
variable), with one const and the other not. We have then tried
to modify the value in that box through whichever is not const
—but one is ok, and the other is not. These rules do actually
make sense: in the second case, we have said “y cannot be
modified,” then we try to say “q is a pointer (which I can use to
modify or read a value) to y”—that clearly violates what we said
about y (that it cannot be modified). In the first case, however,
we are saying “x is a variable that can be modified,” and then “p
is a pointer, which we can only use to read the value it points at,
not modify it”—this does not impose new (nor violate existing)
restrictions on x; it only tells us what we can and cannot do with
p.
8.6 Aliasing: Multiple Names for a Box
In our discussion of pointers, we have alluded to the fact that we
may now have multiple names for the same box; however, we
have not explicitly discussed this fact. For example, in
Figure 8.7, we have four names for a’s box: a, *p, **q, and ***r.
Whenever we have more than one name for a box, we say that
the names alias each other—that is, we might say *p aliases **q.
Aliasing plays a couple of important roles in our
programming. First, when we are working through Step 2 of our
programming process, we may find that we changed a value, but
there are multiple ways we can name what value we changed.
When we write down precisely what we did, it is crucial to think
about which name we used to find the box when we worked the
problem by hand in Step 1. If we are not sure, we should make
note of the other names we might have used—then if we have
trouble generalizing, we can consider whether we should have
been using some other name instead.
When we are debugging (or executing code by hand), it is
also important to understand aliasing. Novice C programmers
often express surprise and confusion at the fact that a variable
changes its value without being directly assigned to: “I wrote
x = 4;, then look I don’t assign to x anywhere in this code, but
now it is 47!” Generally, such behavior indicates that you have an
alias for the variable in question (although you may not have meant to).
If you have problems of this type, using watch commands in
GDB (see Section D.2 for more about GDB) can be incredibly
helpful.
If we want to live dangerously, we can even have aliases
with different types. Consider the following code:
1 float f = 3.14;
2 int * p = (int *) &f; //generally a bad idea!
3 printf("%d\n", *p);
Here, f is an alias for *p, although f is declared to be a float,
while p is declared to be a pointer to an int (so we have told the
compiler that *p is an int). What will happen when you run this
code?
Your first reaction might be to say that it prints 3—after all,
from what we learned in Chapter 3, the following (similar-ish)
code would print 3:
1 float f = 3.14;
2 int x = (int) f;
3 printf("%d\n", x);
Figure 8.8: Aliasing with different types. Storing the floating-point bit encoding
for 3.14 in f, then reading it out as though it represented an integer.
However, if we run the first code snippet on the computer,
we will get 1078523331! This result may seem like the computer
is just giving us random nonsense; however, what happened is
perfectly logical (if a bit low-level). When we cast a float to an
int (in the second snippet of code), we ask the compiler to insert
instructions that convert from a float to an int. However, when
we dereference a pointer to an int, the compiler just generates
instructions to read the bit pattern at that location in memory as
something that is already an int. In the first snippet of code,
initializing float f = 3.14; writes the bit pattern of the
floating point encoding of 3.14 into f’s box. Without going into
the details (which are discussed in Section 3.2.3), the floating
point encoding of 3.14 works out to 0100 0000 0100 1000
1111 0101 1100 0011 (=0x4048F5C3 in hex, 1078523331 in
decimal). When we dereference the pointer as an int, the
program reads out that bit pattern, interprets it as though it were
an integer, and prints 1078523331 as output. Figure 8.8
illustrates.
This last example is not something you need to use in
programs you write, but rather a caution against abusing
pointers. Unless/until you understand exactly what is happening
here and have a good reason to do it, you should not cast
between pointer types.
8.7 Pointer Arithmetic
Like all types in C, pointers are variables with numerical values.
The precocious programming student may wonder if the value of
variables can be manipulated the way one can manipulate the
value of integers and floating point numbers. The answer is a
resounding “Yes!”
Consider the following code:
1 int x = 4;
2 float y = 4.0;
3 int *ptr = &x;
4
5 x = x + 1;
6 y = y + 1;
7 ptr = ptr + 1;
Lines 1–3 initialize the values of three variables of various types
(integer, floating-point number, and pointer to an integer). Lines
5–7 add 1 to each of these variables. For each type, adding 1 has
a different meaning. For x, adding 1 performs integer addition,
resulting in the value 5. For y, adding 1 requires converting the 1
into a floating point number (1.0) and then performing floating
point addition, resulting in 5.0. For both integers and floating-
point numbers, adding 1 has the basic semantics of “one larger”.
For the integer pointer ptr (which initially points to x), adding 1
has the semantics of “one integer later in memory”.
Incrementing the pointer should have it point to the “next”
integer in memory. In order to do this, the compiler emits
instructions that actually add the number of bytes that an integer
occupies in memory (e.g., +1 means to change the numerical
value of the pointer by +4).
Likewise, when adding a number n to a pointer to any type T,
the compiler generates instructions that add (n times the number
of bytes for values of type T) to the numeric value of the pointer
—causing it to point n Ts further in memory. This is why
pointer arithmetic does not work with pointers to void; since the
compiler has no idea how big a “thing” is, it does not know how
to do the math to move by n “things”.
The code we have written here is legal as far as the
compiler is concerned; however, our use of pointer arithmetic is
rather nonsensical in this context. In particular, we have no idea
what box ptr actually points at when this snippet of code
finishes. If we had another line of code that did *ptr = 3;, the
code would still compile but would have undefined behavior—
we could not execute it by hand and say with certainty what
happens. Specifically, when ptr = &x, it is pointing at one box
(for an integer) that is all by itself—it is not part of some
sequence of multiple integer boxes (which we will see shortly).
Incrementing the pointer will point it at some location in
memory, we just do not know what. It could be the box for y, the
return address of the function, or even the box for ptr itself.
The fact that we do not know what happens here is not
simply a matter of the fact that we have not learned what
happens—it is a matter of the fact that the rules of C give the
compiler a lot of freedom in how it lays out the variables in
memory. One compiler may place one piece of data after x,
while another may place some other data after x. In fact, the
same compiler may change how it arranges variables in memory
when given different command line options, changing its
optimization levels.
We will consider all code with undefined behavior, such as
this, to be erroneous. Accordingly, if you were to execute this
code by hand, when you perform ptr = ptr + 1;, you should
change the value of ptr to just be a note that it is &x + 1, and
not any valid arrow. If you then dereference a pointer that does
not validly point at a box, you should stop, declare your code
broken, and go fix it. We will note that simply performing
arithmetic on pointers such that they do not point to a valid box
is not, by itself, an error—only dereferencing the pointer while
we do not know what it points at is the problem. We could
perform ptr = ptr - 1; right after this code, and know with
certainty that ptr points at x again. We might also just go past a
valid sequence of boxes at the end of a loop, but not use the
invalid pointer value. We will generally not do these things
(although we will see an example of the latter in Video 9.1), and
you should know the difference between what is and is not
acceptable coding practice.
8.8 Use Memory Checker Tools
Now that you are starting to use pointers, it is crucial to use
memory checker tools, such as Valgrind and/or -
fsanitize=address (see Section D.3). These will help you find
erroneous behavior and make fixing your program easier. Use
them all throughout the testing process.
8.9 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 8.1 : What is a pointer? How should you think
about one conceptually? How are they actually
implemented?
• Question 8.2 : Is “pointer” a type?
• Question 8.3 : What is NULL? Why is it useful? What is
the type of NULL? What does this type mean?
• Question 8.4 : What does the -> operator mean?
Specifically, if a program has a->b, then (1) what must
be true of the type of a for this code to be legal? (2) How
do you find the box named by a->b? (3) How could you
write a->b without the arrow operator (using other
operators you have learned before to get the same
effects)?
• Question 8.5 : What does const mean?
• Question 8.6 : What does aliasing mean?
• Question 8.7 : Suppose I told you that *p is an alias for
*q before executing the lines *p = 5; and then
*q = 42;. After executing these lines, what is the value
of *p? Why?
• Question 8.8 : Consider the following code (which does
not compile for many reasons):
1 int f(int a, int * b, int * c) {
2 a = &b;
3 &b = &c;
4 *b = a;
5 &c = b;
6 c = a;
7 return b;
8 }
For each line in the table below, determine if that
line is legal or not. If it is legal, write “legal” in the table.
If not, describe why the line is illegal.
2: a = &b;
3: &b = &c;
4: *b = a;
5: &c = b;
6: c = a;
7: return b;
• Question 8.9 : Suppose that you have three int *s: p, q,
and r. Of these, p and q are NULL, and r points at a valid
box belonging to an int. Which of the following lines of
code will cause the program to segfault (consider each
line separately: not the cumulative effects of executing
them all sequentially)?
1. p = q;
2. *p = 3;
3. *r = *q;
4. r = q;
5. p = r;
• Question 8.10 : What is the output when the following
code is run?
1 void g(int x, int * y) {
2 printf("In g, x = %d, *y = %d\n", x, *y);
3 x++;
4 *y = *y - x;
5 y = &x;
6 }
7 void f(int * a, int b) {
8 printf("In f, *a = %d, b = %d\n", *a, b);
9 *a += b;
10 b *= 2;
11 g(*a, &b);
12 printf("Back in f, *a = %d, b = %d\n", *a, b);
13 }
14 int main(void) {
15 int x = 3;
16 int y = 4;
17 f(&x, y);
18 printf("In main: x = %d, y = %d\n", x, y);
19 return EXIT_SUCCESS;
20 }
• Question 8.11 : What is the output when the following
code is run?
1 int main(void) {
2 int a = 3;
3 int b = 4;
4 int c = 5;
5 int * p = &a;
6 int * q = &b;
7 int * r = &c;
8 int ** s = &p;
9 int ** t = &q;
10 printf("**s = %d\n", **s);
11 printf("**t = %d\n", **t);
12 *s = r;
13 r = *t;
14 **s = 55;
15 **t = 99;
16 printf("a = %d\n", a);
17 printf("b = %d\n", b);
18 printf("c = %d\n", c);
19 printf("*p = %d\n", *p);
20 printf("*q = %d\n", *q);
21 printf("*r = %d\n", *r);
22 printf("**s = %d\n", **s);
23 printf("**t = %d\n", **t);
24 return EXIT_SUCCESS;
25 }
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Chapter 9
Arrays
Section 8.7 implied that pointer arithmetic is most useful when
we have sequences of boxes that are guaranteed to be one after
the other. Such sequences, called arrays, are incredibly useful.
Arrays allow us to solve a wide variety of useful and interesting
problems—we frequently want to write programs that operate
on a lot of data of the same type. Arrays let us bundle this data
together, refer to it with a single variable, and easily traverse
each element in our bundle. An understanding of arrays also
enables us to understand the details of strings (which we will
learn about in Section 10.1).
9.1 Motivating Example
As a motivating example, suppose we wanted to write a
program to break a simple cryptographic system, such as a
Caesar cipher or a Vigenère cipher. The first of these, the
Caesar cipher, named after the ancient Roman emperor Julius
Caesar, encrypts a message by adding a fixed shift (e.g., add
three) to each letter. For example, encrypting Hello would
yield Khoor, as K is three letters after H, h is three letters after e,
and so on. The Vigenère cipher applies the same concept of
shifting each letter but uses a key with more than one amount to
shift by. For example, we might have a key of (3, 5, 7, 2, 9), in
which case we would add three to the first, sixth, eleventh,
sixteenth, and so on letters of the message; we would add five
to the second, seventh, twelfth, seventeenth, and so on letters;
etc. We will note that for a long time, Vigenère was regarded as
unbreakable and that many novice programmers think of
variants of it when trying to devise their own encryption
schemes. However, it is easily broken.
Breaking both of these ciphers relies on frequency
counting—determining how often each letter appears in the
encrypted message—and then using the fact that the frequency
distribution of letters in natural language is highly uneven. In
English, e is by far the most common letter, while letters such
as q, x, z are quite uncommon (if you are curious, we give the
frequency distribution of letters in this book in Table 24.1). If
we were trying to break a Caesar cipher, once we have
frequency counted the message, we can guess that the most
frequently occurring letter is e, and then try to decrypt the
message. If all characters (not just letters) are encrypted, we
would guess that the most frequently occurring symbol is space
and try to decrypt the message. If the decrypted message does
not make sense, we can try again with the second most
common letter/symbol. Although it might take a few tries, we
would typically succeed on the first attempt (as long as the
message is long enough to give reasonable frequency counts).
If we actually wanted to break such messages, we would
want to write a program to do it—specifically, an algorithm that
would take an encrypted message as input and output the
decrypted message (if we are worried about needing multiple
tries, we could have it output the decrypted message for the top
few possibilities). Writing this program requires a few pieces.
First, we need to frequency count the encrypted message, and
then we need to find out which letter’s count is the largest.
Since we do not know how to work with strings yet (our
message would be a string—and we cannot learn about strings
until we know about arrays), let us just look at the problem of
finding the largest of the frequency counts—of which there
would be 26, one for each letter.
You could write a function to implement this algorithm
right now; however, it would be horrendously ugly:
1 char highestFrequency(unsigned numAs, unsigned numBs, unsigned numCs,
2                       unsigned numDs, unsigned numEs, unsigned numFs,
3                       ... //many parameters omitted
4                       unsigned numXs, unsigned numYs, unsigned numZs) {
5 char bestLetter = ’a’;
6 unsigned bestCount = numAs;
7 if (numBs > bestCount) {
8 bestLetter = ’b’;
9 bestCount = numBs;
10 }
11 if (numCs > bestCount) {
12 bestLetter = ’c’;
13 bestCount = numCs;
14 }
15 if (numDs > bestCount) {
16 bestLetter = ’d’;
17 bestCount = numDs;
18 }
19 ... //many lines of code omitted
20 if (numZs > bestCount) {
21 bestLetter = ’z’;
22 bestCount = numZs;
23 }
24 return bestLetter;
25 }
We sketch this code out but leave out large parts of it—in
particular, the 17 other parameters (never write a function
that takes 26 parameters!) and the 84 lines of code for the
cases ’e’–’y’ (also, never write a function that is 100+ lines
long) are missing. This approach would work, but it
is tedious and error-prone to write (how likely are you to miss
changing something as you copy/paste 24 cases?). Matters
would get even worse if we were dealing with 1000 items
instead of 26—could you imagine trying to write this function
to find the max of 1000 things? Looking at it, we can see that
we are repeating almost the same thing over and over—
suggesting there should be a way to generalize this code into an
algorithm where the program can do the repetitive work,
instead of you doing it. It seems like there has to be a better
way, and fortunately there is—we can use an array!
9.2 Array Declaration
An array is a sequence of items of the same type (e.g., an array
of ints is a sequence of ints). When an array is created, the
programmer specifies its size (i.e., the number of items in the
sequence)—so we might make an array of 26 ints for our
frequency counting example. If we wanted to declare an array
of four ints called myArray, we would write:
int myArray[4];
Note that this line of code looks just like any other
variable declaration, except that it has [4] after the variable
name—that specifies we are declaring an array of size four (i.e.,
four ints in a sequence, rather than just one int). When we
make this declaration, we end up with a slightly different
semantics than we are used to. The variable name (in this case,
myArray) is a pointer to the four boxes that make up the array.
Unlike other variables, myArray is not a modifiable lvalue—we
cannot change where it points. Instead, it just names a pointer to the
first box in the array. Starting with C99 (the “version” of C
defined in 1999), arrays may be declared with a non-constant
expression as the size, i.e., we can write int myArray[x];,
where x is a previously declared (and initialized!) integer
variable. As we mentioned in Chapter 5, you will generally
want to give GCC the command line option -std=gnu99 to
enable features defined in C99—this is one such feature.
Figure 9.1: Depiction of an array of four ints, resulting from the declaration of
myArray.
Figure 9.1 illustrates the effects of this declaration. We
have created the array (which has four uninitialized boxes for
ints), and myArray names a pointer to it. Unlike other
variables, we draw myArray without a box of its own, to
underscore the fact that the value of myArray is not necessarily
stored in the program’s memory, and cannot be changed. While
not storing the value explicitly may sound like it is difficult to
use, it is not—the compiler just translates the array name into a
fixed offset from the start of the function’s stack frame. That is,
the compiler treats all uses of myArray as if they were
frame + offset, where frame is the start of the stack frame,
and offset is the particular location in the frame the compiler
chooses.
The C standard says that myArray is an lvalue, but not a
modifiable lvalue. This distinction may sound a bit strange,
especially as myArray does not have a box of its own. However,
&myArray is legal (and evaluates to the same numerical value as
myArray, although of a slightly different type, as we shall see
shortly).
As with other variables, we can initialize an array in the
same line that we declare it. However, as the array holds
multiple values, we write the initialization data inside of curly
braces. For example, we might write:
int myArray[4] = {42, 39, 16, 7};
You might think that writing the wrong number of
elements in the initializer (e.g., 5 or 7 in the above example)
would result in an error from the compiler; however, you will
only receive a warning if you write too many elements in the
initializer. If you write too few elements, the compiler will
silently fill the remaining ones in with 0. This behavior can be a
useful feature for zero-initializing the entire array—we can write
just a single 0 in the curly braces and the compiler will fill in as
many zeros as there are elements of the array:
int myArray[4] = {0}; //initialize all elements to 0
Note that most compilers will also accept
int myArray[4] = {}; however, GCC will warn you about it if
you compile with -pedantic, as it's not strictly allowed by the
language standard.
If we provide an initializer, we can also omit the array size
and instead write empty square brackets—the compiler will
count the elements of the initializer and fill in the array size:
int myArray[] = {42, 39, 16, 7}; //compiler figures out the size
We’ll also note that you can use a similar syntax to
initialize structs. For example, the following will initialize the
first field of p to 3 and the second field to 4:
point p = {3, 4};
However, for structs, this form of initialization is very brittle—if
you add another field to the struct before or in between these
two, you will no longer be initializing the fields the way you
intend. In C99 (e.g., what you get when you compile with
-std=gnu99), you can designate which element you are
initializing:
point p = {.x = 3, .y = 4};
Now, if another field is added to the struct, it will be zero-
initialized, and the x and y fields will still be initialized
properly.
If you are initializing an array of structs, these two
techniques work particularly well together (an example of
composability—which you learned about in Section 4.5.1). For
example, we could declare and initialize an array of three
points:
point myPoints[] = { {.x = 3, .y = 4},
                     {.x = 5, .y = 7},
                     {.x = 9, .y = 2} };
9.2.1 What Is the Type of an Array?
Suppose you have declared myArray as follows:
int myArray[4] = {42, 39, 16, 7};
A natural question is “what is the type of myArray?” As we told
you earlier, myArray names a pointer to the first element of the
array, so int * is a natural choice, and for most practical
purposes this is effectively its type. The type is actually int [4].
However, almost anything you do with an array will cause the
type to decay into int *. That is, the compiler will effectively
“forget” that the type is specifically an array of ints, and just
treat it as a generic int pointer.
In fact, there are three ways that an array can be used
without causing its type to decay to a pointer. However, to
discuss two of these, we will need to see some more concepts
later on. When we reach the appropriate times, we will mention
these situations. In all other cases, the array gets used like a
pointer.
The third operation that does not cause decay, which we
can discuss now, is &. As we said earlier, myArray and &myArray
evaluate to the same numerical value, but with slightly different
types. When used as a pointer, myArray will evaluate to a
pointer to an int. However, &myArray will evaluate to a pointer
to 4 ints. This difference is primarily significant if you attempt
to do pointer arithmetic: if ints are 4 bytes, then myArray+1 will
be numerically 4 larger than myArray, while &myArray+1 will
be numerically 16 larger than myArray.
9.3 Accessing an Array
We can access the elements of an array in a couple different
ways (which are actually the same under the hood!). We have
already learned that the name of the array is a pointer to its first
element, that we can do arithmetic on pointers, and that we can
dereference pointers to access the data at the end of the arrow.
We can put these concepts together to see one way we could
access elements in the array. Video 9.1 illustrates.
Video 9.1: Array access with pointer arithmetic.
Accessing an array element using pointer arithmetic works
fine and sometimes is the natural way to access elements.
However, sometimes we just want the ith element of an array,
and it would be cumbersome to declare a pointer, add i to it,
then dereference it. We can accomplish this goal more
succinctly by indexing the array. When we index an array, we
write the name of the array, followed by square brackets
containing the number of the element we want to refer to: e.g.,
myArray[3]. Indexing an array names the specific box within
the sequence and can be used as either an lvalue or an rvalue. It
is important to note that in C (and C++ and Java), array indexes
are zero-based—the first element of the array is myArray[0].
Some programming languages have one-based array indexing,
but zero-based is generally more common. Video 9.2 illustrates
the execution of code (which has the same behavior as the code
in Video 9.1) using array indexing.
Video 9.2: Array access with indexing.
We will note that accessing an array out of bounds (at any
element that does not exist) is an error that the compiler cannot
detect. If you write such code, your program will access some
box, but you do not know what box it actually is. This behavior
is much the same as the erroneous behavior we discussed when
we talked about pointer arithmetic. In fact, pointer arithmetic
and array indexing are exactly the same under the hood: the
compiler turns myArray[i] into *(myArray + i).
Note that a consequence of this rule is that if we take
&myArray[i], it is equivalent to &*(myArray + i), and the &
and * cancel (as we learned previously, they are inverse
operators), so it is just myArray + i. This result is fortunate, as
it lines up with what we would hope for: &myArray[i] says
“give me an arrow pointing at the ith box of myArray,” while
myArray + i says “give me a pointer i boxes after where
myArray points”—these are two different ways to describe the
same thing.
9.4 Passing Arrays as Parameters
In general, when we want to pass an array as a parameter, we
will want to pass a pointer to the array, as well as an integer
specifying how many elements are in the array, such as this:
int myFunction(int * myArray, int size) {
  //whatever code...
}
There is no way to get the size of an array in C, other than
passing along that information explicitly, and we often want to
make functions that are generic in the size of the array they can
operate on (i.e., we do not want to hardcode a function to only
work on an array of a particular size). If we wanted, we could
make a struct that puts the array and its size together as one
piece of data—then pass that struct around.
When we pass a pointer that actually points at an array, we
can index it like an array (because it is an array—remember the
name of an array variable is just a pointer) and perform pointer
arithmetic on it. Such pointer arithmetic will be well-defined, as
long as the resulting pointer remains within the bounds of the
array, as we are guaranteed that the array elements are
sequential in memory.
We can also pass an array as a parameter with the square
bracket syntax:
int myFunction(int myArray[], int size) {
  //whatever code...
}
This definition is functionally equivalent to the one we
saw before with the pointer syntax. Some people prefer it, as it
indicates more explicitly that myArray is intended to be an
array. We can also write a size in the []; however, the compiler
will not make any attempt to check if that size is actually
correct, thus it is easy to write something incorrect there (or
something that becomes incorrect as you change your code),
and may confuse a reader.
When you call a function that takes an array, you can pass any
expression that evaluates to a pointer to a sequence of elements.
Typically, this expression is just the name of the array (which is
a pointer to the first element). However, we could perform
pointer arithmetic on an array (to get a pointer to an element in
the middle of it—which is a valid, but shorter, array), or we
might retrieve the pointer out of some other variable—it might
be a field in a struct, or even an element in another array (we
will see how to do this in Section 10.2).
9.5 Writing Code with Arrays
When we write code with arrays, we need to look for patterns
in where we access the array. A common pattern is to access
each element of the array in order (the 0th, then the 1st, etc.).
Such a pattern generally lends itself naturally to counting over
the elements of the array with a for loop. However, we might
have other patterns—as always, we should work the problem
ourselves, write down what we did, and look for the patterns.
If we have complicated data (e.g., arrays of structs that
have arrays inside them) it is very important to think carefully
about how we can name the particular box we want to
manipulate. For problems with complex data structures,
drawing a diagram with the appropriate pointers and arrays can
be an important aspect of working an example of the problem
ourselves (Step 1). Then, when we write down exactly what we
did (Step 2), we can think carefully about how we can name
each box that we manipulate. As we mentioned in Section 8.6,
sometimes a box will have multiple names, in which case we
need to think about how we picked that box—what “route” we
took to get to it, to figure out what name is most appropriate. If
we are unsure, we should make a note of the other names for
the box that we can think of; then in Step 3, we may realize that
we prefer another way to name the same box, as it creates a
more consistent pattern with other “almost repetitive” parts of
our algorithm.
9.5.1 Index of the Largest Element
Video 9.3: Devising the code to find the index of
the largest element of an array.
In our motivating example (breaking simple cryptographic
systems), we discussed solving the problem of finding the
largest of 26 numbers. Now that we know about arrays, we can
implement a much better version of this function. We can also
make our function slightly more general—instead of only
operating on 26 pieces of data, we can make it work on an array
of any size. Video 9.3 walks through the creation of such a
function.
9.5.2 Closest Point
As another example, we can return to the closest point
algorithm we considered in Video 1.4. When we first
considered this example back in Section 1.7.4, we worked
through the design of the algorithm for this problem. However,
at that time, we did not know how to write any C code, so we
stopped there. Now, however, we are ready to complete that
example. Video 9.4 walks through the translation of this
algorithm to code.
Video 9.4: Translating our closest point
algorithm to code.
9.5.3 Dangling Pointers
When you write code with arrays, you may be tempted to return
an array from a function (after all, it is natural to solve
problems where an array is your answer). However, we must be
careful, because the storage space for the arrays we have
created in this chapter lives in the stack frame and thus is
deallocated after the function returns. The value of the
expression that names the array is just a pointer to that space, so
all that gets copied to the calling function is an arrow pointing
at something that no longer exists. Whenever you have a
pointer to something whose memory has been deallocated, it is
called a dangling pointer. Dereferencing a dangling pointer
results in undefined behavior (and thus represents a serious
problem with your code) because you have no idea what values
are at the end of the arrow.
Video 9.5: Illustration of a dangling pointer
caused by attempting to return an array.
Video 9.5 illustrates this problematic behavior. If we were
to try to compile the code in the video, the compiler would
warn us that our code is broken:
warning: function returns address of local
variable [-Wreturn-local-addr]
return myArray;
^
However, you should understand why the compiler is giving
this warning and never write code that exhibits this bad
behavior, rather than relying on the compiler to find it. The
compiler’s ability to detect this sort of problem is rather
limited. Consider the following code, which is still broken,
and has the exact same effect as the code in the video:
//Still broken
int * initArray(int howLarge) {
  int myArray[howLarge];
  for (int i = 0; i < howLarge; i++) {
    myArray[i] = i;
  }
  int * p = myArray;
  return p;
}
GCC produces no warning, even though the code is
equally broken. If you execute a call to this function by hand
(recommended), you will find that its return value is dangling.
One thing to be particularly wary of with respect to
dangling pointers is that sometimes you can get away with
using them without observing any problematic effects—your
code is still broken, it just does not look (or even behave)
broken. Having the code seem fine is particularly dangerous for
two reasons. First, when there is a problem in our code, we
want to find it. We want to fix it, so that our code is correct, and
we do not have danger lurking inside. Second, the “but I did
that before and it was fine” effect is dangerous to novices—if
you learn that something is ok, you keep doing it. You do not
want to form bad habits.
To understand why you only see problems sometimes,
remember that a value stored in memory will remain there until
changed by the program—and a deallocated memory location
may not be reused immediately. Once a function returns and its
frame is destroyed, the memory locations that were part of that
frame will not be reused until more values must be placed on
the stack. If we call another function, its frame will be placed in
the next available stack slots, overwriting the recently
deallocated memory. Only at this point will the values
associated with the deallocated stack frame change (due to
assignments to variables in the new frame). Now writing to
memory through the dangling pointer will change variables in
that frame in unpredictable ways. Note that it is not safe to use
a dangling pointer into a deallocated frame even when you have
not called another function—even though most stack
allocations will come from calling a function, there is nothing
to guarantee that those are the only ways that the stack is used.
In Chapter 12, we will learn how to allocate memory in
the heap, rather than in a stack frame. Memory in the heap will
remain allocated until we explicitly deallocate it—persisting
beyond the lifetime of the function that performed the
allocation.
9.6 Sizes of Arrays: size_t
In various examples so far, we have represented the size and
indices of an array with an int—a signed integer, meaning it
can hold positive or negative values. If we think very carefully
about this situation, we might wonder why we use a signed int
—a negative array index would not be legal, as there is no
myArray[-1] (this would be out of bounds). This analysis
suggests that we should use an unsigned int. However, while
we are thinking very carefully about what type we might use,
we may as well ask “what is the most correct type to use?” In
particular, there are a variety of sizes of unsigned integers—do
we want eight bits? 16 bits? 32 bits? 64 bits? For a four-
element array, any of these will work just fine. However, we
might want to write a more general array manipulation function
that works with any size of array. In that case, how large of an
unsigned int would we want to use to describe the size of the
array and/or index it?
In C, the number of bits we would need varies from one
platform to another—on a 32-bit platform (meaning memory
addresses are 32 bits), we would want a 32-bit unsigned int;
on a 64-bit platform, we would want a 64 bit unsigned int.
Fortunately, the designers of C realized this possibility and
decided to make a type name for “unsigned integers that
describe the size of things”—size_t. Whenever you see
size_t, you should think “unsigned int with the right number
of bits to describe the size or index of an array.” For example,
our closestPoint function would be slightly more correct if
we wrote (note the code is the same as in Video 9.4, except that
we changed the type of n and i to be size_t instead of int):
point * closestPoint(point * s, size_t n, point p) {
  if (n == 0) {
    return NULL;
  }
  double bestDistance = computeDistance(s[0], p);
  point * bestChoice = &s[0];
  for (size_t i = 1; i < n; i++) {
    double currentDistance = computeDistance(s[i], p);
    if (currentDistance < bestDistance) {
      bestChoice = &s[i];
      bestDistance = currentDistance;
    }
  }
  return bestChoice;
}
What does “slightly more correct” mean? Practically
speaking, it would only make a difference for very large arrays
—ones whose size can be represented by a size_t but not an
int (e.g., typically more than two billion elements). Such
situations will not come up in any of the problems you will do
in this book, nor many situations you are likely to encounter as
a novice programmer. However, if you become a professional
programmer working at a company that routinely deals with
huge data sets, it may matter—being in the habit of being
precisely correct will then be an asset to you. You will also see
size_t frequently in the C library, so you should know what it
means.
While we are discussing the sizes of data, it is a good time
to introduce the sizeof operator. Recall that you do not
actually know how many bytes an int or a pointer is, as their
actual sizes can vary from one platform to another. Instead of
writing a numerical constant that represents the size on one
platform, you should let the C compiler calculate the size of a
type for you with the sizeof operator. The sizeof operator
takes one operand, which can either be a type name (e.g.,
sizeof(double)) or an expression (e.g., sizeof(*p)). If the
operand is an expression, the compiler figures out the type of
that expression (remember expressions have types) and
evaluates the size of that type. In either case, sizeof evaluates
the number of bytes the type requires. The type of this number
of bytes is size_t.
Note that sizeof is an operator and not a function. While
you will often see it written with parentheses (e.g.,
sizeof(expr)) for clarity, it is legal to write it without (e.g.,
sizeof expr).
Recall that earlier, we said that there are certain operations
that can be performed on arrays without causing array-to-
pointer decay. The sizeof operator is the second such operation
that we have seen. If myArray is an array of 4 ints, then
sizeof(myArray) will be 4 * sizeof(int), the number of
bytes in memory that are allocated in the stack frame due to the
declaration of myArray.
However, you should NOT use sizeof to attempt to
determine the number of elements in an array. You may even
see such a practice recommended on certain websites that make
suggestions to novice programmers, in the form
sizeof(myArray)/sizeof(myArray[0]). Even though it may
“seem to work” in some cases, it is a bad idea to use it even in
those cases. In particular, it will work in exactly the situations
where no array-to-pointer decay has happened, and the
compiler “remembers” the original size of the array. This
restriction basically means it will have to be in the same
function (if you pass the array to another function, it undergoes
array-to-pointer decay) and you have the size handy from
wherever you declared it. Even if you think that it would be
convenient to do so in the same function, you are making your
code incredibly brittle—if you later decide to pull that code out
of the function, you will break it.
For example, if you write
int myFunction(int x) {
  int myArray[x+1];
  //...some code that fills in myArray
  size_t size = sizeof(myArray)/
                sizeof(myArray[0]);
  for (size_t i = 0; i < size; i++) {
    //code using myArray
  }
  //more code
}
everything may seem fine. However, if you later decide
that the function is growing too large, and you want to abstract
it out, you will break your code unless you remember to fix the
array size calculation:
int myOtherFunction(int * myArray) {
  //now will compute sizeof(int*)/sizeof(int)
  //probably size=1 or 2 depending on platform
  size_t size = sizeof(myArray)/
                sizeof(myArray[0]);
  for (size_t i = 0; i < size; i++) {
    //code using myArray
  }
}
int myFunction(int x) {
  int myArray[x+1];
  //...some code that fills in myArray
  int result = myOtherFunction(myArray);
  //more code
}
However, if we just wrote this code cleanly to begin with:
int myFunction(int x) {
  size_t size = x+1;
  int myArray[size];
  //...some code that fills in myArray
  for (size_t i = 0; i < size; i++) {
    //code using myArray
  }
  //more code
}
Then we would not have any problems—we would be
forced to pass size in to our new function, and if we forgot, the
compiler would remind us.
The last thing we will note about sizeof is that it is
evaluated at compile time. That is, when the compiler translates
your code into assembly, it evaluates all sizeof operators.
When the program runs, there is no “measurement” of objects
at runtime, just use of whatever the compiler determined the
size was.
9.7 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 9.1 : What is an array? How are arrays stored
in C? When you write the name of an array in C, what
does it mean conceptually?
• Question 9.2 : If you write int x[5];, in the array
named x, how many elements are there that are
initialized, and what are the values of the ones that are
initialized?
• Question 9.3 : If you write int x[5] = {4, 6};, in
the array named x, how many elements are there that are
initialized, and what are the values of the ones that are
initialized?
• Question 9.4 : If you write int x[] = {4, 6};, in
the array named x, how many elements are there that are
initialized, and what are the values of the ones that are
initialized?
• Question 9.5 : What is a “dangling pointer”? What
happens if you try to dereference one?
• Question 9.6 : Write a main function that calls the
closestPoint function we wrote in Video 9.4 and
prints out its answer. Put the code for the closestPoint
function, as well as an appropriate definition for the
point struct and the computeDistance function into your file
(as well as any #includes you need) so you can run
your code.
• Question 9.7 : Rewrite the closestPoint function to
access the elements of the array with pointer arithmetic,
instead of indexing (the resulting code should not use
the [] operator). Use the main you wrote in the previous
question to test the code out.
• Question 9.8 : What is size_t? Why is it more correct
to use size_t instead of int to describe the size or
index of an array?
• Question 9.9 : Write the function:
int sumArray(int * array, size_t n);, which
returns the sum of the elements in the array passed in
(whose length is n). Of course, you should also write a
main function that tests your code.
• Question 9.10 : Write the function:
int arrayContains(int * array, size_t n, int toFind);,
which returns 1 if array
(which has n elements) contains a value equal to
toFind, and 0 if it does not.
• Question 9.11 : Write the function:
size_t maxSeq(int * array, size_t n);, which
returns the length of the maximum increasing
contiguous subsequence in the array. The parameter n
specifies the length of the array. For example, if the
array passed in were
{1, 2, 1, 3, 6, 7, 2, 4, 6, 9}, this function
would return 4 because the longest sequence of (strictly)
increasing numbers in that array is {1, 3, 6, 7},
which has length 4. Note that {1, 3, 6, 7, 9} is an
increasing subsequence but is not contiguous (finding
non-contiguous ones efficiently takes techniques we
haven’t learned yet). Of course, you should also write a
main function that tests your code.
• Question 9.12 : Given the following snippet of code:
int array[3];
int a;
int * p = &array[1];
int * q = &a;
int ** r = &p;
Group these names a, p, *p, p[1], array[0],
array[1], array[2], q, *q, **r, and *r into sets based
on the box they name—that is, divide the names in
groups such that all names in one group alias each other
but do not alias any of the names in other groups.
Chapter 10
Uses of Pointers
Now that we have learned about pointers and arrays, we are ready to see a variety of applications of
them: strings, multidimensional arrays, and function pointers. We will also briefly discuss security
hazards that can arise in programs with errors related to these topics at the end of this chapter.
10.1 Strings
Now that we have learned about pointers and memory, we are finally ready to dive into strings. A
string is a sequence of characters, terminated by the null terminator character, '\0' (it has numerical
value 0, but the character literal for it is written with a backslash, since you cannot type that character
normally. Do not confuse it with '0', the digit zero). Since the string is a sequence of characters, it is
stored in memory as an array and referenced via a pointer to its first character. The array of characters
can be accessed and manipulated in the same ways as any other array.
10.1.1 String Literals
We have seen string literals so far—a sequence of characters written down in quotation marks, such as
"Hello World\n". Now that we understand pointers, we can understand their type: const char *,
that is, a pointer to characters that cannot be modified (recall that here, the const modifies the chars
that are pointed to).
That is, if we wanted to store a string literal in a variable, we might write:
const char * str = "Hello World\n";
Figure 10.1: A variable pointing at a string literal.
Figure 10.1 illustrates the effect of such a statement and the layout of the string in memory. str is
a pointer, pointing at an array of characters. These characters appear in the order of the string and are
followed by the null terminator character '\0' (note that we do not need to write this character down in
the string literal—the compiler adds it for us for literals only). Below the conceptual representation of
the string, Figure 10.1 shows its numeric representation—the string is just a sequence of bytes (eight-
bit numbers) in memory, the last of which has a numeric value of 0 (do not forget: Everything Is a
Number!).
Notice that we used const indicating that we cannot modify the characters pointed to by str (that
is, assignment to str[i] will result in a compiler error). If we forget the const modifier,
unfortunately, the code will still compile (we can receive a warning for this type of mistake with
-Wwrite-strings, which is not enabled by default with -Wall because many programmers are sloppy
about const anyway—which is not to say that you should be). However, if we omit const and try to
modify the string, the program will crash with a segmentation fault.
Figure 10.2: String literals are typically placed in a read-only portion of the static data section.
The reason the program will crash if you attempt to modify a string literal is that the data for the
string literal is stored into a read-only portion of the static data section. Figure 10.2 shows the variable
pointing at the string literal from Figure 10.1 in the context of the “picture of memory” you learned
about in Chapter 8.
The data for the string literal (the actual bytes that make up the string) reside in the read-only
portion of the static data section for the entire lifetime of the program—from the time it is loaded into
memory until it exits. This data is placed into memory by the loader—the portion of the operating
system that reads the executable file from the disk and initializes its memory appropriately. The loader
knows what to write for the string literals (and where they should go in memory) because the compiler
writes information into the executable file describing the contents of the data section. After the loader
finishes initializing memory, it marks the read-only portions of the static data section as non-writeable
in the page table—the structure that the operating system maintains to describe the program’s memory
to the hardware.
Attempting to write to a read-only portion of memory will behave much like writing to an invalid
region of memory—it will cause the hardware to trap into the operating system—transferring
execution control from your program to the OS kernel (conceptually, the hardware takes the execution
arrow out of your program and puts it into a particular function in the OS, noting where the execution
arrow was in case the OS wants to return control to your program). The OS then sees that the program
was attempting to access memory in invalid ways and kills it with a segmentation fault.
The compiler puts the string literals into a read-only region of memory because the literal may
get reused and thus should not be changed.
Consider the following code:
1 char * str1 = "Hello";
2 str1[0] = 'J'; //this would crash, but suppose it did not
3 //...
4 char * str2 = "Hello";
5 printf("%s\n", str2);
Both occurrences of the literal "Hello" evaluate to pointers to the location where the characters
of that string are stored. The compiler is free to put the two identical string literals in one location,
meaning str1 and str2 would point at the same memory. If modifying this memory were allowed,
printing str2 would print "Jello", which would be confusing. Worse, modifying string
literals could pose a wide range of issues, from strange behaviors to security problems. Note that even
if the literal appears in only one place in the program, it may get reused multiple times (inside a loop,
in a function that is called more than once, etc.)—in such a case, our expectation as programmers is
that the literal will always be what we wrote, not something changed by previously executed code.
10.1.2 Mutable Strings
When we want to modify a string, we need the string to reside in writeable memory, such as the frame
of a function (or memory that is dynamically allocated by malloc, which we will learn about in
Chapter 12). To make space for a string in a function’s frame, we need to declare an array of chars
with sufficient space to hold all of its characters plus its null terminator.
One way we can declare and initialize our array of characters is like this:
1 char str[] = "Hello World\n";
This code behaves exactly as if we wrote:
1 char str[] = {'H', 'e', 'l', 'l', 'o', ' ',
2               'W', 'o', 'r', 'l', 'd', '\n', '\0'};
That is, it declares a variable str, which is an array of 13 characters (remember that the size of an
array may be implicit if we provide an initializer from which the compiler can determine the size), and
initializes it by copying the characters of the string "Hello World\n" (including the null terminator)
into that array. Being slightly more explicit, one could think of this code as doing:
1 char str[13];
2 str[0] = 'H';
3 str[1] = 'e';
4 str[2] = 'l';
5 str[3] = 'l';
6 str[4] = 'o';
7 str[5] = ' ';
8 str[6] = 'W';
9 str[7] = 'o';
10 str[8] = 'r';
11 str[9] = 'l';
12 str[10] = 'd';
13 str[11] = '\n';
14 str[12] = '\0';
We will note that this type of initialization of an array is the third case in which an array can be
used without array-to-pointer decay occurring. The string literal on the right side is treated as an array of
characters, not just a pointer to its first character.
Figure 10.3: The difference between a string declared as a pointer to a literal (left) and as an array initialized by a literal (right).
Figure 10.3 illustrates the difference between declaring str as const char * str versus
char str[].
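To see the practical consequence of this difference, consider a short sketch (the helper name replace_first is our own, not a library function). Modifying a char array initialized from a literal is legal, because the array holds a private, writable copy of the characters:

```c
#include <string.h>

/* Replaces the first character of a string, assuming s points at
   writable memory. Calling this on a char array initialized from a
   literal is fine; calling it on a pointer to a string literal
   (char * s = "Hello";) would crash as described in Section 10.1.1. */
void replace_first(char * s, char c) {
    if (s[0] != '\0') {
        s[0] = c;
    }
}
```

With char str[] = "Hello"; the call replace_first(str, 'J'); leaves str holding "Jello", since str is a writable copy of the literal in the function's frame.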
We can declare the array str with an explicit size, but we must be careful—if we do not include
enough space for the null terminator (i.e., we declare it char str[12] = "Hello World\n";), the
compiler will not complain. Instead, it will initialize the character array exactly as we have requested,
but there will be no ’\0’ placed at the end. The compiler allows this behavior since it makes for a
perfectly valid array of characters, even though it is not a valid string. If we use the array only in
ways that access those 12 characters, our program is fine. However, if we use that
array for anything (e.g., pass it to printf or any of the string library functions we will learn about
soon) that expects an actual string (i.e., one with a null terminator on the end), then the array will be
accessed past its bounds.
Failing to terminate the string may not always appear in testing—you might “get lucky” and have
the next byte of memory already be 0 anyway. While this may seem nice—your program “works”—
it is actually quite a dangerous sort of problem. You may test your program a thousand times and not
see any errors, then deploy it and have it crash or produce incorrect results. We strongly recommend
the use of tools such as Valgrind, which are capable of detecting this sort of error.
It is perfectly fine, however, to request more space than is required for your string. For example,
char str[100] = "Hello World\n"; is entirely legitimate. We may wish to request extra space in
this fashion if we plan to add to the string, making it longer. Of course, whenever we do so, we must
be sure that we have enough space for whatever we may want our string to hold. (Remember that the
programmer is responsible for keeping track of the size of her arrays. There is no way to inspect an
array and derive its size.)
10.1.3 String Equality
Figure 10.4: Applying the == operator to two strings just compares the pointers for equality.
Often, programmers want to compare two strings to see if they are equal. Our first inclination
might be to use the == operator, which we have already seen. However, the == operator will compare
pointer equality. That is, if we write str1 == str2, it will check if str1 and str2 are arrows pointing
at the same place. Sometimes pointer equality is what we mean, but more often, we want to check to
see if the two strings have the same sequence of characters, even if they are in different locations in
memory. This concept is illustrated in Figure 10.4.
Video 10.1: Writing a function to compare two strings.
Video 10.1 shows an example of how we might write a function ourselves to test if two strings
have the same contents—that is, if they contain exactly the same sequence of characters. Of course,
this task is common enough that there is already a function to do it—called strcmp—in string.h of
the C library. That function behaves slightly differently than the one we wrote in the example—it
returns 0 if the strings are equal and non-zero if they are different. In fact, it returns a positive number
if the first string is “greater than” the second string and a negative number if the first string is “less
than” the second string. Here “greater than” and “less than” refer to lexicographic order—what you
would think of as “alphabetical ordering,” but extended to encompass the fact that strings may have
non-letters. The comparison is case-sensitive (so abc is different from Abc), but there is another
function strcasecmp, which performs case-insensitive comparison.
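The return-value convention of strcmp is a common stumbling block: it returns 0 (which is “false” in C) precisely when the strings are equal. A small sketch of a wrapper (same_contents is our own name) makes the convention explicit:

```c
#include <string.h>

/* Returns 1 if a and b contain exactly the same sequence of
   characters, and 0 otherwise. Note that strcmp returns 0 on
   equality, so we must compare its result against 0 rather than
   treating it as a boolean directly. */
int same_contents(const char * a, const char * b) {
    return strcmp(a, b) == 0;
}
```

Writing if (strcmp(a, b)) is a classic bug: that condition is true exactly when the strings differ.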
10.1.4 String Copying
Figure 10.5: Assigning one pointer to another follows the same rules we have always seen. You identify the box named by the left
side (in this case str1), evaluate the right side to a value (in this case, an arrow pointing at the B in Blueberry), and copy that value
(arrow) into the box named by the left side.
Take a moment to look at the situation depicted in Figure 10.5. In this figure, the execution arrow
is immediately before the assignment statement str1 = str2;. What do you think will happen when
this assignment statement is executed?
The short answer is: we will follow the same rules we always have. The right side of the
assignment statement (str2) evaluates to an arrow pointing at the second string ("Blueberry"). We
will take that value (the arrow) and copy it into the box named by the left side of the assignment
statement (in this case, str1). The result is that both str1 and str2 point at the same memory
location. If we change the contents of one (e.g., we execute str1[0] = 'x';) then if we “look at” that
memory location through its other name (str2[0]), we will “see” the change (remember aliasing from
Chapter 8).
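A minimal sketch of this aliasing behavior (the variable names are ours, and a writable array stands in for the strings in Figure 10.5):

```c
/* After the assignment str1 = str2, both pointers name the same
   characters, so a write through one is visible through the other. */
char first_after_alias(void) {
    char apple[] = "Apple";      /* writable, unlike string literals */
    char fruit[] = "Blueberry";
    char * str1 = apple;
    char * str2 = fruit;
    str1 = str2;      /* copies the arrow, not the characters */
    str1[0] = 'x';    /* write through one name...            */
    return str2[0];   /* ...is seen through the other: 'x'    */
}
```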
We may, however, want to actually copy the contents of the string from one location to another.
As with comparing for equality, doing this copy yourself requires iterating through the characters of
the string and copying them one by one to the destination. In doing so, we must be careful that the
destination has sufficient space to receive the string being copied into it. Video 10.2 walks through
creating a function to perform this task.
Video 10.2: Writing a function to deep copy a string.
The C library has a function strncpy, which performs this task for us—it copies a string from
one location to another and takes a parameter (conventionally called n) telling it the maximum number of characters it is
allowed to copy. If the length of the source string is greater than or equal to n, then the destination is
not null-terminated—a situation the programmer must typically rectify before using the string for any
significant purpose.
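One common idiom for rectifying the missing terminator is to copy one byte fewer than the destination holds and then terminate explicitly. A sketch (copy_bounded is our own name, not a library function):

```c
#include <string.h>

/* Copies src into dst, which has room for dst_size characters,
   truncating if necessary and always leaving dst null-terminated. */
void copy_bounded(char * dst, size_t dst_size, const char * src) {
    strncpy(dst, src, dst_size - 1); /* may omit the terminator... */
    dst[dst_size - 1] = '\0';        /* ...so we always write one  */
}
```

Note that dst_size must be at least 1 for this sketch to be safe.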
Note that there is a similarly named function, strcpy (the previous one had an “n” in the middle
of its name—this one does not). The strcpy function is more dangerous, as there is no way to tell it
how much space is available in the destination. If insufficient space is available, then strcpy will
simply overwrite whatever follows it in memory, creating a variety of problems. Some of these
problems may result in security hazards, as we will discuss in Section 10.4. There is another function
strdup, which allocates space for a copy of the string and copies it into that space. However, to
understand how strdup works, we need to discuss dynamic allocation in Chapter 12.
10.1.5 Converting from Strings to ints
One important thing to remember when using strings is that they cannot be meaningfully converted to
integers (or floating point types) by a cast—whether implicit or explicit. Consider the following code
fragment:
1 const char * str = "12345";
2 int x = str;
Attempting to compile this piece of code results in the error message:
initialization makes integer from pointer without a cast
This error arises because the assignment does not convert the number the string represents textually
into an integer (that is, it does not result in x = 12345). Instead, it would take the numerical value of
str (which is a pointer, thus its numerical value is the address in memory where the sequence of
characters 12345 is stored) and assign it to x. This behavior follows exactly the rules we have learned
for assignment statements: evaluate the right side to a value (which is an arrow, meaning its numerical
value is an address), and write it in the box named by the left side.
Figure 10.6: Illustration of the difference between 12345 and "12345".
To help understand why simple assignment does not work, Figure 10.6 gives a peek “under the
hood” for code with a char * pointing at the string "12345" and an int with the value 12345. The left
side of the figure shows the conceptual representation, which we typically work with. On the right
side, the figure shows the same state of the program, but with addresses and the numeric values
contained in those addresses. The variable str (whose bytes are colored in red) is a pointer, which is
eight bytes on this particular system. Its numeric value is the address in memory of the bytes of the
string literal "12345", which is 0x100000f40. The contents of memory locations 0x100000f40–
0x100000f45 are the characters of that string—the numeric values for the characters '1' (0x31), '2'
(0x32), '3' (0x33), '4' (0x34), '5' (0x35), and '\0' (0x00) in that order. The variable x, which is an
int, occupies four bytes that hold the value 0x00003039, which is 12345 in decimal.
If we were to convince the compiler to allow us to assign x = str, we would copy the value
from str (which is 0x100000f40) into x. Of course, since x cannot hold this entire value (remember
that on this particular system, pointers are eight bytes, but ints are four bytes), the value will be
truncated. x would end up being assigned the value 0x00000f40 (the lowest four bytes of
0x100000f40), which is 3904 in decimal—still not what we want.
Another incorrect (at least for this task) approach would be to write x = *str, dereferencing str
to get the value at the end of the arrow, rather than the pointer itself. Here, we would read one
character out of the string (*str evaluates to '1', which is 0x31). We would then assign this value
(0x31) to x. Now, we would end up with x being 0x31 (which is 49 in decimal)—also not what we
desire!
While the previous example may seem a bit contrived due to the use of a literal string (why not
just write int x = 12345;?), consider a more useful example:
1 printf("Please enter a number: ");
2 char * str = readAStringFromTheUser(); //we'll learn how later
3 int x = str; //they will enter a string.
4              //we'd like to store it as an int...
This example not only illustrates why we might want to perform this sort of operation but also leads us
into understanding one of the complexities in such a task. What if the user enters xyz? How do we
then convert that to a number? For that matter, what if our code instead read (note the “in
hexadecimal”):
1 printf("Please enter a number in hexadecimal: ");
2 char * str = readAStringFromTheUser(); //we'll learn how later
3 int x = str; //they will enter a string.
4              //we'd like to store it as an int...
Now, if the user enters the sequence of characters 12345, they do not mean the number 12345 (twelve
thousand, three hundred, forty-five), since we told them we would interpret it as hexadecimal. Instead
it is 74565 in decimal (you can work out this conversion yourself for practice).
If we wanted to perform such a conversion ourselves by hand, we would need to iterate over the
characters in the string and perform math. However, as this type of conversion is a common task, there
are C library functions that perform it for us. The atoi function is the simplest of these—it converts a
string to an integer by interpreting the sequence of characters as a decimal number. If there is no valid
number at the start, it returns 0. A slightly more complex function is strtol, which lets you specify
the base (decimal, hexadecimal, etc.), as well as to pass in the address of a char *, which it will fill in
with a pointer to the first character after the number. That is, if you give it the string "123xyz", it will
set this pointer to point at the x (you can also pass in NULL if you do not need this extra information, in
which case it skips this part).
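A short sketch of both library functions in use (the values follow the hexadecimal example above):

```c
#include <assert.h>
#include <stdlib.h>

void conversion_demo(void) {
    /* atoi interprets the characters as a decimal number. */
    assert(atoi("12345") == 12345);
    assert(atoi("xyz") == 0);        /* no valid number: atoi gives 0 */

    /* strtol can use other bases; here, base 16 (hexadecimal). */
    assert(strtol("12345", NULL, 16) == 74565);

    /* strtol can also report where the number ended. */
    char * end;
    long v = strtol("123xyz", &end, 10);
    assert(v == 123 && *end == 'x'); /* end points at the x */
}
```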
This concept can be a little bit confusing at first, as you are used to seeing the textual
representation of a number (e.g., its decimal form written on paper) and thinking of it as the number
itself. The best way to understand this concept is to write these functions that convert from strings to
ints yourself (in general, it is best to learn by doing). See the practice questions at the end of this
chapter.
10.1.6 Standard Library Functions
We have already seen a few examples of string-related functions from the C library. However, there
are many more available, as string manipulation is a common task in programming. As a general rule
of thumb, if you think “I bet all programmers want to do (something) to strings regularly,” then it is
quite likely that the C library has a function to do that thing to strings—concatenate them, find a
character in them, find their length, look for one string inside another, etc.
You can read about all of these functions in their man pages (if you are not familiar with man
pages, see Section B.2). You can also consult man string for a list of all of the string-related functions
and their respective prototypes to help you find what you are looking for if you do not know its name.
10.2 Multidimensional Arrays
In Chapter 9, we learned about arrays, which let us store a sequence of elements of the same type and
access a particular element by indexing into the array. We can expand on this concept with
multidimensional arrays. For example, we might want to represent a mathematical matrix (which is
conceptually rectangular, rather than linear) as a two-dimensional array of numbers or an image with a
two-dimensional array of colors. If we wanted to have an array of strings, we would actually end up
with a two-dimensional array of characters, as strings are themselves arrays of characters.
As always, deciding how to represent your data is part of Step 2 of your algorithmic design
process, in which you figure out exactly what it was that you did in Step 1. In Chapter 8, you learned
that data that is represented as a sequence of elements of the same type is naturally represented as an
array. This concept extends to multidimensional arrays whenever your data naturally occurs in a
higher dimensional organization. You can create multidimensional arrays with any number of
dimensions, so you can represent data of any number of dimensions that you want. For example, if
your program works with data that is organized by day of the year, within each day by hour of
the day, and within each hour by room number within a building, then a natural representation would
be a three-dimensional array.
10.2.1 Declaration
In C, multidimensional arrays are arrays of arrays—a two-dimensional array is an array whose
elements are one-dimensional arrays; a three-dimensional array is an array whose elements are two-
dimensional arrays; and so on. Accordingly, we declare them with multiple sets of square brackets,
each indicating the size of the corresponding dimension. For example, we might declare a two-
dimensional array of doubles that is four elements by three elements (e.g., to use as a
mathematical matrix) like this:
1 double myMatrix[4][3];
Figure 10.7: Left: conceptual layout of a matrix. Right: in-memory layout of myMatrix[4][3].
Declaring myMatrix in this fashion results in an array with four elements. Each of the elements of
myMatrix is an array of three doubles. Accordingly, myMatrix occupies (4 * 3 * sizeof(double))
bytes of memory, with the three elements of myMatrix[0] appearing together, followed by the three
elements of myMatrix[1], and so on. Figure 10.7 depicts this layout on the right, as well as the
conceptual (i.e., rectangular) layout of the matrix on the left. In both sides of the figure, the zeroth
element (“row”) of myMatrix is colored green, the first is colored pink, the second blue, and the third
orange, so that you can easily see how the data is laid out. The particular addresses are not important
(and are just examples—they would change from program to program), but are intended to show how
the elements are all consecutive in memory.
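We can confirm this contiguous layout directly with sizeof and pointer arithmetic (the checking function is our own sketch):

```c
#include <stddef.h>

/* Returns 1 if a local 4x3 array of doubles is laid out as one
   contiguous block: 12 doubles total, with row 1 beginning exactly
   where row 0's three elements end. */
int rows_are_contiguous(void) {
    double myMatrix[4][3];
    size_t row_gap = (size_t)((char *)myMatrix[1] - (char *)myMatrix[0]);
    return sizeof(myMatrix) == 4 * 3 * sizeof(double)
        && row_gap == 3 * sizeof(double);
}
```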
10.2.2 Indexing Multidimensional Arrays
If we were to write myMatrix[i] (where i is some integer type), then we would expect that expression
to evaluate to the ith element of myMatrix according to the rules that we learned in Section 8.7. For
example, if we wrote myMatrix[2], we would expect that to evaluate to the second element of
myMatrix, which is the three blue boxes in Figure 10.7. This element is an array of three doubles, so
we would expect the type to be double *, and as the first element of that array is at 0x7fff5c346b30
(in this particular example), we would expect that to be the value of the expression (as we represent
arrays by a pointer to their first element). If you expect all of these things (based on what you have
already learned), you would be correct.
We may wish to index the two-dimensional array twice, such as myMatrix[2][1]. When the
program evaluates this expression, it will first evaluate myMatrix[2], obtaining a pointer to the three-
element array that is the second element of myMatrix. Then, it will index that array (which is an array
of doubles), and evaluate to a double. Of course, myMatrix[2][1] is an lvalue, as it names a
particular box, so we can use it on the left side of an assignment, e.g., myMatrix[2][1] = 3.14;.
However, we should note that myMatrix[2] behaves just like any other array. While the C standard
calls it an lvalue, it is not a modifiable lvalue, and it does not have a box of its own (as you should recall
from Chapter 8). The pointer that myMatrix[2] evaluates to is not actually stored anywhere; it is just
calculated by pointer arithmetic from myMatrix.
10.2.3 Multidimensional Array Initializers
We can initialize a multidimensional array in the same line that we declare it by using a braced
initializer, as we can for a one-dimensional array. In the case of a multidimensional array, we should
remember that each element of the array is itself an array, and write a braced initializer for it:1
1 double myMatrix[4][3] = { {1.0, 2.5, 3.2}, //elements of myMatrix[0]
2 {7.9, 1.2, 9.9}, //elements of myMatrix[1]
3 {8.8, 3.4, 0.0}, //elements of myMatrix[2]
4 {4.5, 9.2, 1.6} }; //elements of myMatrix[3]
When we initialize an array in this fashion, we can leave off the first dimension, as the compiler
can determine how many elements there are from the initializer:
1 //also legal: removed the 4 from the []
2 double myMatrix[][3] = { {1.0, 2.5, 3.2}, //elements of myMatrix[0]
3 {7.9, 1.2, 9.9}, //elements of myMatrix[1]
4 {8.8, 3.4, 0.0}, //elements of myMatrix[2]
5 {4.5, 9.2, 1.6} }; //elements of myMatrix[3]
You may not elide the second dimension’s size specification, even when you provide a complete
initializer. You may also elide the first dimension’s size when you are declaring a parameter for a
function, but may not elide any other dimension’s size.
A multidimensional array is not limited to two dimensions. For more dimensions, you can write
additional []s specifying the size of each additional dimension:
1 int x[4][2][7]; //x is a 3D array, with 4 elements, each of which is
2 //an array with 2 elements
3 // (whose elements are 7-element arrays of ints)
4 char s[88][99][122][44]; //s is a 4D array of chars: 88 x 99 x 122 x 44.
All of the same rules apply to these arrays with more dimensions. If we write x[1], we get a
pointer to the two-dimensional array that is the first element of x. If we write x[1][1], we get a
pointer to the one-dimensional array of ints that is the first element of that array, and if we write x[1]
[1][4], we get the int that is the fourth element of that array. We can also initialize these arrays with
braced initializers if we want to (although writing the initializer for a large array with many
dimensions will likely take a significant amount of time).
10.2.4 Array of Pointers (to Arrays)
We can also represent multidimensional data with arrays that explicitly hold pointers to other arrays.
For example, we might write the following:
1 double row0[3];
2 double row1[3];
3 double row2[3];
4 double row3[3];
5 double * myMatrix[4] = {row0, row1, row2, row3};
Here, we again have a matrix; however, this matrix is represented in a rather different fashion
in memory. Here, myMatrix is an array of four pointers, each of which explicitly points at an array that
represents a row of the matrix.
Figure 10.8: Left: conceptual layout of a matrix as an array of pointers. Right: in-memory layout of this array of pointers.
Figure 10.8 illustrates the layout of this data structure. On the left, this figure depicts the
conceptual representation of this data structure: myMatrix is an array of four pointers, and each of
these pointers points at one of the arrays row0–row3. The right side of this figure depicts the in-
memory layout of this data structure. Here, each of the row arrays may not be next to each other in
memory (they might be, but do not have to be). The four entries of myMatrix now hold pointers to (the
addresses of) the four row arrays.
Elements of the arrays are accessed in similar ways for both representations. For either
representation, myMatrix[2] evaluates to a pointer to the array that is the second row of the
Likewise, myMatrix[2][1] evaluates to the double in the first column of the second row of the
matrix.
However, there are some significant differences. First, in this array of pointers representation, the
pointers to the rows are explicitly stored in memory. Accordingly, evaluating myMatrix[i] actually
involves reading a value from memory, not just computing an offset. This difference has performance
implications, which we will not go into here, as we are not prepared to discuss performance (such a
discussion requires a detailed knowledge of hardware).
Explicitly storing the pointers to the rows of the matrix allows us to do some things with this
representation we cannot do with the first representation. First, we are not constrained to having each
row be the same size as the other rows. Second, in the array of pointers representation, myMatrix[i] is
a modifiable lvalue (recall that it is not if we just declare an array with multiple dimensions), and has a
“box” (memory location) of its own. Accordingly, we can change where the pointers point if we so
desire. Third, we can have two rows point at the exact same array (aliasing each other). While these
abilities may not be terribly useful for a mathematical matrix, they can be incredibly useful for a
variety of other tasks that have data with multiple dimensions. This array of pointers representation
will also prove quite useful when we learn about dynamic allocation in Chapter 12.
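As a sketch of that extra flexibility (the names are ours): because each row pointer is stored explicitly, rows can have different lengths, provided we track each length separately.

```c
#include <stddef.h>

/* Sums a "ragged" matrix: rows[i] points at a row of lengths[i]
   ints. This shape is impossible with one rectangular array, whose
   rows must all be the same size. */
int sum_ragged(const int * const rows[], const size_t lengths[],
               size_t nrows) {
    int total = 0;
    for (size_t i = 0; i < nrows; i++) {
        for (size_t j = 0; j < lengths[i]; j++) {
            total += rows[i][j];
        }
    }
    return total;
}
```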
10.2.5 Incompatibility of Representation
One aspect of multidimensional arrays that often confuses novice C programmers is that these two
ways to represent multidimensional data are not compatible with each other—they are different types
and cannot be implicitly converted from one to the other. In fact, if you try to explicitly convert from
one to the other (via a cast), you will get results ranging from nonsensical answers to your program
crashing. This common problem underscores the importance of knowing the exact meaning of the
types you declare and fully understanding the semantics of every line of code that you write.
Video 10.3: An illustration of why casting between incompatible representations can
cause your program to crash.
Video 10.3 illustrates an example of what can go wrong when a programmer naïvely inserts a
cast to “fix” a compiler error without understanding the implications of what he is doing. In this
example, the program crashes, although far worse consequences are possible. Recall from Section
6.1.4 that a program that gives the wrong answer (with no indication that something went wrong) is
often far worse than a program that crashes. It is possible that we could instead read or write values
we did not intend to and produce bogus results.
10.2.6 Arrays of Strings
As we mentioned earlier, an array of strings is inherently a multidimensional array of characters, as
strings themselves are really just arrays of characters. Accordingly, all the same rules apply to arrays
of strings, and we can use either representation that we want. However, as arrays of strings are fairly
important (among other things as we shall see in Section 11.2, the program can access its command
line arguments via an array of strings), it is worth discussing them explicitly.
Consider the following two statements, each of which declares a multidimensional array of
chars, and initializes it with a braced array of string literals:
1 char strs[3][4] = {"Abc", "def", "ghi"};
2 char chrs[3][3] = {"Abc", "def", "ghi"};
Observe that the difference between the two declarations is the size of the second dimension of the
array—which is four in the first statement and three in the second. The first statement (which declares
strs) includes space for the null terminator, which is required to make the sequence of characters a
valid string. The second statement, which declares chrs, does not include such space and only stores
the characters that were written (with no null terminator). Figure 10.9 illustrates the effects of the two
statements.
Figure 10.9: Two declarations of multidimensional arrays of characters, initialized with strings.
This second statement is correct if (and only if) we intend to use chrs only as a multidimensional
array of characters and not use its elements for anything that expects a null-terminated string. As the
second makes a valid multidimensional array of chars, it is not illegal and will not produce an error or
a warning. This behavior is much the same as we discussed in Section 10.1.2 for just declaring arrays
of characters and initializing them from string literals. However, a significant difference is that in the
multidimensional case, we cannot omit the size from the second dimension (which is the number of
characters in each string), as C allows us to omit only the first dimension of a multidimensional array.
If you declare a multidimensional array of chars to hold strings of different lengths, then you
must size the second dimension according to the length of the longest string. For example, we might
declare the following array:
1 char words[4][10] = {"A", "cat", "likes", "sleeping."};
In this example, words requires 40 characters of storage despite the fact that the strings used to
initialize it only occupy 22 characters. This representation wastes some space. While that waste may
not be significant in this example, if we were instead looking at millions of strings with lengths that
vary greatly, we might be wasting megabytes.
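The wasted space is easy to see with sizeof (the function name is our own):

```c
/* The rectangular representation reserves the longest-string size
   (10 bytes) for every row, so the array occupies 4 * 10 = 40 bytes
   even though the four strings need only 22 characters in total. */
int rectangular_storage_bytes(void) {
    char words[4][10] = {"A", "cat", "likes", "sleeping."};
    return (int)sizeof(words);
}
```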
Figure 10.10: An array of strings (i.e., an array of pointers to sequences of chars ending in a null terminator).
We might instead use the array-of-pointers representation for an array of strings. As we
previously discussed, representing multidimensional data with an array of pointers allows us to have
items of different lengths, which naturally solves the problem of wasted space. To represent our array
of strings in this fashion, we might declare and initialize words as follows:
1 const char * words[] = {"A", "cat", "likes", "sleeping."};
Observe that here, we declare words as an array of const char *s—the elements of the array are
pointers to const chars, and thus the chars they point to may not be modified. We should include the
const (and must include it to be const-correct) as we have indicated that words should be initialized
to pointers to string literals, which are in read-only memory (as discussed in Section 10.1.1).
Figure 10.10 illustrates the layout of words.
We will note that it is common to end an array of strings with a NULL pointer, such as this:
1 const char * words2[] = {"A", "cat", "likes", "sleeping.", NULL};
This convention is common, as it allows for one to write loops that iterate over the array without
knowing a priori how many elements are in the array. Instead, the loop can have a condition that
checks for NULL, such as this:
1 const char ** ptr = words2;
2 while (*ptr != NULL) {
3 printf("%s ", *ptr);
4 ptr++;
5 }
6 printf("\n");
It would be a beneficial exercise to execute this code by hand to make sure you understand all of the
concepts involved. You can then put it into a source file (with appropriate #includes, and inside of
main) to make sure you derived the correct answer.
10.3 Function Pointers
The actual instructions your program executes are (of course) numbers, and they are stored in the
computer’s memory, just like the program’s data is. Consequently, each instruction has an address, just
like each piece of data does. As these instructions have addresses, we can have pointers to them. It is
not generally useful to have a pointer to an arbitrary instruction, but it can be quite useful to have a
pointer to the first instruction in a function—which we typically just think of as a pointer to the
function itself and call a function pointer.
Technically speaking, the name of any function is a pointer to that function (that is, “printf” is a
pointer to the printf function); however, we do not typically think of them in this way. Instead, when
we refer to a function pointer, we typically mean a variable or parameter that points at a function.
However, the fact that a function’s name is a pointer to it is useful to initialize such variables and/or
parameters.
1 void squareAll(int * data, int n) {
2   for (int i = 0; i < n; i++) {
3     data[i] = data[i] * data[i];
4   }
5 }
(b) A function that squares all elements of an array.

1 void absAll(int * data, int n) {
2   for (int i = 0; i < n; i++) {
3     data[i] = abs(data[i]);
4   }
5 }
(c) A function that takes the absolute value of all elements of an array.
Figure 10.11: Motivation for function pointers: four very similar pieces of code.
The most useful application of function pointers arises from the ability to make a function pointer
a parameter to a function we are writing (or that is provided by a library). To motivate this
functionality, consider the four very similar pieces of code in Figure 10.11. Each of these functions
does something to every element of an array (of ints)—the only difference between them is what they
do to each element.
Instead of duplicating the code—rewriting the entire function each time—it would be nicer if we
could write one function that takes a parameter specifying “what to do to each item.” Then, we could
simply call that function with an appropriate function for each task. While avoiding this duplication of
code may not seem so important here (the function is only a few lines long), this concern can become
much more significant as you write more complex functions that operate over more complex data
structures.
We can achieve this behavior by passing in a function pointer for the parameter that specifies
“what to do to each item.”
1 void doToAll(int * data, int n, int (*f)(int)) {
2 for (int i = 0; i < n; i++) {
3 data[i] = f(data[i]);
4 }
5 }
Most of the code in this example should seem quite familiar, except the somewhat odd looking
parameter declaration: int (*f) (int), which declares a parameter (called f) whose type is “a
pointer to a function that takes an int as a parameter and returns an int.” Function pointer
declarations are a bit unusual in that the name of the parameter (or variable—the declarations have the
same syntax) is in the middle of the declaration. However, this syntax makes sense, as it looks a lot
like the normal declaration of a function—the return type comes first, followed by the name, followed
by the parameters in parentheses. Here, however, we only need to specify the parameter types; we do
not name them. Note that the parentheses around *f are important—without them, the * becomes part
of the return type (that is, the * is read as part of int*), and the declaration appears to be describing a
function that returns an int*. There are times when both the parentheses and the * can be omitted
(writing int f(int) ); however, it is generally best to be consistent (and avoid trying to remember
when this is permissible; we mention it in case you see it).
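To make the declaration syntax concrete, here is a small sketch (the names triple and fp are our own invention): a function pointer variable declared with the same syntax as the parameter above, initialized from a function's name:

```c
#include <stdio.h>

int triple(int x) {
  return 3 * x;
}

//declares fp: a pointer to a function that takes an int and returns an int,
//initialized (since a function's name is a pointer to it) to triple
int (*fp)(int) = triple;

//inside some function, we could then call through the pointer:
//  printf("%d\n", fp(4));   //prints 12
```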
As with other types, we can use typedef with function pointers. The syntax is again more similar
to function declarations than to other forms of typedef. We might re-write our previous example to
use typedef, so that it is easier to read:
1 typedef int (*int_function_t) (int);
2
3 void doToAll(int * data, int n, int_function_t f) {
4 for (int i = 0; i < n; i++) {
5 data[i] = f(data[i]);
6 }
7 }
Once we have this doToAll function defined, we can use it by passing in a pointer to any function
of the appropriate type (i.e., one that takes an int and returns an int). Since the name of a function is
a pointer to it, we can just write the name of the function we want to use as the value to pass in for that
argument:
1 int inc(int x) {
2 return x + 1;
3 }
4 int square(int x) {
5 return x * x;
6 }
7
8 // ...
9 doToAll(array1, n1, inc);
10 // ...
11 doToAll(array2, n2, square);
12 // ...
We will note that you may see such things written with the address-of operator, such as
doToAll(array1, n1, &inc). This syntax is legal, but the & is superfluous, just as it is with the name
of an array—the name of the function is already a pointer. Note that if we have a function pointer
other than the name of a function (i.e., a variable or a parameter), then we could take the address of
that variable, giving us a pointer to a pointer to a function. It is best to use the address-of operator
only in this latter case, which comes up rather infrequently.
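A brief sketch of these equivalences (the names are ours): a function's name, the address-of operator applied to it, and an explicit dereference of the resulting pointer all behave the same:

```c
int negate(int x) {
  return -x;
}

int (*fp1)(int) = negate;  //the name decays to a pointer
int (*fp2)(int) = &negate; //explicit &: the same pointer value

//inside a function, all three of these calls are equivalent and return -7:
//  negate(7);   fp1(7);   (*fp1)(7);
```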
Another example of using function pointers as parameters is a generic sorting function—one that
can sort any type of data. We will discuss sorting in more detail in Chapter 26. For now, it suffices to
know that sorting an array is the process of arranging the elements of that array into increasing (or
decreasing) order. Sorting an array is a common task in programs, as sorted data can be accessed more
efficiently than unsorted data. We will formalize this notion of efficiency later, but for now, imagine
trying to find a book in a library where the books are arranged alphabetically (i.e., sorted) versus in
one where they are stored in no particular order.
As we will see when we learn more about sorting, there are many different sorting algorithms,
but none of them care about the specific type of data, just whether one piece of data is “less than,”
“equal to,” or “greater than” another piece of data. Correspondingly, we could make a generic sorting
function—one that can sort an array of any type of data—by having it take a parameter that is a
pointer to a function that compares two elements of the array. In fact, the C library provides such a
function (which sorts in ascending order—smallest to largest):
1 void qsort(void *base,
2 size_t nmemb,
3 size_t size,
4 int (*compar)(const void *, const void *));
The first parameter to this function, void * base, is the array to sort. Recall that void * is “a pointer
to an unspecified type of data”—allowing qsort to take an array of any type. The second parameter,
size_t nmemb specifies the number of elements (or members) in the array (recall that size_t is an
unsigned integer type appropriate to use for the size of things). The third parameter, size_t size
specifies the size of each element of the array—that is, how many bytes each element takes in
memory. This information is required because otherwise qsort has no way to tell where one element
of the array ends and the next begins. The final parameter is the one we are most interested in for this
discussion—compar is a pointer to a function that takes two const void *s and returns an int. Here,
the const void *s point at the two elements to be compared (they are const since the comparison
function should not modify the array). The function returns a positive number if the first pointer points
at something greater than what the second pointer points at, 0 if they point at equal things, and a
negative number for less than.
This description of qsort may seem like a lot to take in, but it is more easily understood by seeing
a couple of examples. These two examples will both be wrapper functions around qsort—small
functions that do little or no real computation, but provide a simpler interface. The first example is a
wrapper to sort arrays of ints:
1 int compareInts(const void * n1vp, const void * n2vp) {
2 const int * n1ptr = n1vp; //convert back to int* so we can dereference
3 const int * n2ptr = n2vp;
4 return *n1ptr - *n2ptr; //subtracting the two numbers compares them
5 }
6
7 void sortIntArray(int * array, size_t nelements) {
8 qsort(array, nelements, sizeof(int), compareInts);
9 }
First, we write a comparison function, compareInts, whose behavior is compatible with the
interface of qsort—it takes pointers, which are declared to be const void *s, and returns an int.
Since this function is intended to be used only when sorting arrays of ints, it converts const void *s
to const int *s, and dereferences them to get the actual ints in the array. Subtracting these two ints
gives a result that conforms to the expectations of the qsort function (it will be positive if the first is
greater, 0 if they are equal, or negative if the first is less).
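One caveat worth noting (our own aside, not a requirement imposed by qsort): subtracting two ints can overflow when the values are far apart (e.g., a large negative and a large positive number), producing a result with the wrong sign. If that is a concern, a comparison can be written without subtraction:

```c
//Overflow-safe alternative to subtraction-based comparison
int compareIntsSafe(const void * n1vp, const void * n2vp) {
  const int * n1ptr = n1vp;
  const int * n2ptr = n2vp;
  //each comparison yields 1 or 0, so the result is +1, 0, or -1,
  //with no possibility of overflow
  return (*n1ptr > *n2ptr) - (*n1ptr < *n2ptr);
}
```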
Once this function is written, we can write the sortIntArray function, which wraps the qsort
function. Observe how sortIntArray does no real computation (it just calls qsort to do all the work),
but provides a much simpler interface (you pass it an array of ints and the number of elements in the
array; you should be able to use this function to sort an array without any explanation). The
sortIntArray function passes its arguments as the first two arguments to qsort, and then passes
sizeof(int) as the third argument, since each element of the array will be sizeof(int) bytes large
(probably four, but the correct way to write it is with the sizeof operator, in case you ever compile it
somewhere where it is not four). For the fourth argument, the function passes a pointer to
compareInts—recall that the name of the function is a pointer to that function. The qsort function
will then call compareInts to determine the relative ordering of the elements in the array.
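Putting the pieces together, a usage sketch might look like the following (the array contents are our own example):

```c
#include <stdio.h>
#include <stdlib.h>

int compareInts(const void * n1vp, const void * n2vp) {
  const int * n1ptr = n1vp; //convert back to int* so we can dereference
  const int * n2ptr = n2vp;
  return *n1ptr - *n2ptr;
}

void sortIntArray(int * array, size_t nelements) {
  qsort(array, nelements, sizeof(int), compareInts);
}

void demo(void) {
  int data[] = {42, 7, 19, 3};
  sortIntArray(data, 4);
  for (int i = 0; i < 4; i++) {
    printf("%d ", data[i]); //prints: 3 7 19 42
  }
  printf("\n");
}
```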
We can make use of qsort in similar ways for other types. For example, we could write some
similar functions to sort an array of strings (an array of const char *s):
1 int compareStrings(const void * s1vp, const void * s2vp) {
2 //first const: s1vp actually points at (const char *)
3 //second const: cannot change *s1vp (is a const void *)
4 const char * const * s1ptr = s1vp;
5 const char * const * s2ptr = s2vp;
6 return strcmp(*s1ptr, *s2ptr);
7 }
8
9 void sortStringArray(const char ** array, size_t nelements) {
10 qsort(array, nelements, sizeof(const char *), compareStrings);
11 }
We again start by writing a function to compare strings that conforms to the interface required by
qsort. This function, compareStrings, looks much the same as compareInts. The main difference is
that we use strcmp to perform the string comparison. We saw strcmp earlier (in Section 10.1.3) to test
if two strings are equal; however, it returns a positive/zero/negative value based on the ordering of the
strings, as qsort expects.
Note that the pointers passed in are pointers to the elements in the array (that is, they point at the
boxes in the array), even though those elements are themselves pointers (since they are strings). When
we convert them from void *s, we must take care to convert them to the correct type—here,
const char * const *—and use them appropriately, or our function will be broken in some way. For
example, consider the following broken code:
1 //BROKEN DO NOT DO THIS!
2 int compareStrings(const void * s1vp, const void * s2vp) {
3 const char * s1 = s1vp;
4 const char * s2 = s2vp;
5 return strcmp(s1, s2);
6 }
This code will actually compile without any errors or warnings, but will not work correctly. This
is a danger anytime you use void *—the flexibility gives you “enough rope to hang yourself” because
you have no guarantees that you will use the pointer in the correct way. As we will see later, C++ has
features in its type system that allow us to have generic functions in a much safer way.
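Returning to the correct version, here is a sketch of how the string-sorting wrapper might be used (the array contents are our own example):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int compareStrings(const void * s1vp, const void * s2vp) {
  const char * const * s1ptr = s1vp; //pointers to the array's elements
  const char * const * s2ptr = s2vp;
  return strcmp(*s1ptr, *s2ptr);
}

void sortStringArray(const char ** array, size_t nelements) {
  qsort(array, nelements, sizeof(const char *), compareStrings);
}

void demo(void) {
  const char * words[] = {"pear", "apple", "banana"};
  sortStringArray(words, 3);
  for (int i = 0; i < 3; i++) {
    printf("%s\n", words[i]); //prints apple, then banana, then pear
  }
}
```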
We can use function pointers pretty much anywhere we can use any other type—not only as the
types of parameters, but also as the types of variables, the type of elements in an array, or fields in
structs. In fact, as we will see later, function pointers in structures lie at the heart of object-oriented
programming, although object-oriented languages hide this implementation detail from the casual
programmer.
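As a tiny sketch of that idea (the types and names here are our own invention): a struct can hold a function pointer field, so different instances of the same struct type can carry different behavior:

```c
struct counter_tag {
  int value;
  int (*step)(int); //each counter can advance its value differently
};
typedef struct counter_tag counter;

int byOne(int x) { return x + 1; }
int byTen(int x) { return x + 10; }

//advance moves a counter forward using whatever step function it holds
void advance(counter * c) {
  c->value = c->step(c->value);
}
```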
10.4 Security Hazards
Improper use of strings and related functions frequently leads to security vulnerabilities in software—
an opportunity for a malicious user to abuse the software and compromise the functionality of the
system in some way. Depending on the type of vulnerability, the attacker may be able to cause the
system to leak information, manipulate data within the program, or even execute arbitrary code. An
error that allows any of these behaviors is considered very serious.
One common form of security vulnerability is a buffer overflow, in which the code makes it
possible to write a string into an array that is too small for it (typically when a string is read from the
user)—overflowing the array and writing over other data. If the array resides in the frame, the data
overwritten may include the return address (which we draw as a numbered circle, but is actually stored
as a pointer in the frame when the computer executes the code). Overwriting the return address
changes where the function returns, allowing the attacker to trick the program into executing different
code at that time.
To take advantage of the ability to change where the function returns, the attacker carefully crafts
her input string so that it contains the machine instructions she wishes to execute. Remember,
“Everything Is a Number”: letters and machine instructions are just bytes of data—as is the new return
address where an attacker expects her code will end up. She then enters this input, and the program
does whatever her instructions tell it to do. The ability to execute even a few instructions is sufficient
to execute the command shell and run arbitrary commands.
Video 10.4: An example of how a buffer overflow exploit works. The (ridiculously
unsafe) gets function is used to read a string. A malicious user can craft an input string
that causes the attacker’s own code to be executed by the program.
Video 10.4 shows vulnerable code being exploited. In this example, a programmer has carelessly
used the unsafe gets function to read a string from the user into an array. However, the gets function
stops only when it reads a newline character ('\n'), regardless of how much space is in the array it is
supposed to read into—it will overwrite any other data after the array until a newline character is
encountered. In this video, the attacker carefully crafts an input much longer than the intended array,
which contains the instruction bytes to execute the command shell (/bin/bash). Although this may
not seem terribly bad if the user is running the program herself at the shell (after all, when the program
exits, she will return to the command shell anyways), if the program has higher privileges or is
running on a remote system and reading input across the network, the seriousness of the security
vulnerability becomes readily apparent.
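The usual remedy (gets was in fact removed from the C11 standard for exactly this reason) is to use a function that takes the buffer's size, such as fgets. A minimal sketch (the helper name is our own):

```c
#include <stdio.h>
#include <string.h>

//Reads one line from stdin into buffer (of size sz), stripping the newline.
//Returns 1 on success, 0 on EOF/error. Unlike gets, fgets never writes more
//than sz - 1 characters plus the terminator, so it cannot overflow buffer.
int readLineSafely(char * buffer, size_t sz) {
  if (fgets(buffer, sz, stdin) == NULL) {
    return 0;
  }
  buffer[strcspn(buffer, "\n")] = '\0'; //strip the newline, if present
  return 1;
}
```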
Another string error that can lead to security vulnerabilities is format string attacks. Recall that
printf takes a format string (a string with % signs to indicate format conversions, such as %d to
convert a decimal integer), and then an appropriate number of other arguments for the values to
convert. A format string vulnerability arises whenever there is a possibility that the user may affect the
format string in such a way that they can introduce extra format conversions.
As a simple example, imagine that there were a readAString function, which reads a string from
the user (we will learn how to read input from the user in Chapter 11). Consider the following
vulnerable code, which attempts to read a string then print it back:
1 //BAD CODE: DO NOT DO!
2 char * input = readAString();
3 printf(input);
If an attacker inputs a string with % signs in it, it can cause the program to behave in ways it
should not. Notice that the call to printf in the above code uses the input read from the user as the
format string. Even though there are no arguments for format specifiers to convert, printf is not
deterred. If the user input contains %-conversions, printf will take the data from the locations where
these arguments should be, format it as directed, and print it. At a minimum, the attacker can cause the program
to reveal data by placing %d or %s format specifiers in his input.
What information this attack reveals depends on what information resides in the place that
printf looks for those extra arguments. To make this vulnerability even worse, printf has a format
specifier (%n) that writes the number of characters printed so far to a memory location specified by an
int * passed as the appropriate argument. A clever attacker can use this format conversion to modify
the memory of the program in malicious ways, changing its behavior, and possibly executing arbitrary
code.
The correct way to use printf with a string input by the user (or potentially affected by the user) is
to use the %s conversion:
1 //CORRECT CODE
2 char * input = readAString();
3 printf("%s", input);
Note that GCC will give you a warning if your format string is not a literal and you have no
format arguments (i.e., the format string is the only argument to printf); however, it will not warn
you if there are other arguments. This behavior may seem odd but exists for a good reason. If you
have nothing to convert, you should use "%s" for the string. However, there may be times when you
have an argument and want to compute the format string. For example, the following code is fine:
1 const char * fmt = "%d\n";
2 if (printInHex) {
3 fmt = "%x\n";
4 }
5 printf(fmt, someNumber);
Here, we are sure the format string is either %d\n (print a number as decimal) or %x\n (print a
number as hex), either of which is fine to print someNumber (which we assume is an int). However,
we must be cautious whenever we write code that computes a format string to ensure that the user may
not affect it in malicious ways.
Format string vulnerabilities fall into a larger category of security flaws where a program uses
unsanitized inputs. More generally, if a program uses strings in a way that certain characters are
special, it must take care to remove or escape those characters in input read from the user. In the case
of format strings for printf, these special characters are % signs. If we wanted to let the user control
the format string, we could do so safely if (and only if), we took care to sanitize the string first—
iterating over it and modifying % signs to remove their special meaning (i.e., by removing them or
converting them to %%—the format specifier that prints a literal percent sign). However, in the case of
printf there is no reason to take this approach—it is simpler (and thus less error prone) to simply use
the %s specifier to print the string literally. If we need format specifiers in a user-dependent way, our
code should build the format string itself.
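If we ever did need to sanitize a string for use as a format string, a sketch might look like this (the helper name and interface are our own invention, not a library function):

```c
//Copy src into dst, doubling every % so the result is safe to use as a
//printf format string. dst must have room for up to twice the length of
//src, plus one byte for the terminator.
void sanitizePercents(char * dst, const char * src) {
  while (*src != '\0') {
    if (*src == '%') {
      *dst++ = '%'; //emit an extra % so the pair reads as a literal %%
    }
    *dst++ = *src++;
  }
  *dst = '\0';
}
```

After sanitizePercents(safe, "100% done"), safe holds "100%% done", which printf would print as the literal text "100% done" with no conversions performed.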
There are, however, other situations where we may wish to read a string from the user and
include it in some context where characters have special meanings. Two of the most common cases
are commands that are passed to the command shell, and information passed to databases.
The command shell considers many characters to be special, but one particularly dangerous one
is the back-tick (`)—text enclosed in back-ticks is executed as its own command (see Chapter B). Suppose our
program reads some input from the user and passes it as an argument to a shell command—that is, it
executes someCommand stringFromUser. If a malicious user enters `rm -rf *`, then the command
shell will perform back-tick expansion, and run the command rm -rf *, which will erase all files in
the current directory. While this command is destructive, a more insidious user could find far better
commands to execute—ones that give them access to the system to gain and/or modify information.
A similar problem can occur with improper use of databases, where the program passes SQL
commands to the database as a string. We will not go into the details of SQL, but we can
illustrate the point without a full understanding of it. Suppose the program wants to run the
command SELECT * from Users WHERE name=’strFromUser’, where strFromUser is a string read
from the user (e.g., you have asked them for their user name, and they have entered it). If we are not
careful, the user may type a ' (terminating the literal that the name is matched against) and a ; to end the
current command, followed by an arbitrary command of his choosing. Such a vulnerability allows the
attacker to modify information in the database however he wants. The web-comic xkcd has a nice
cartoon on this topic: http://xkcd.com/327/.
Note that sanitizing inputs is a task that must be performed with care. A sanitization function that
catches half of the cases is barely better than not sanitizing at all—a clever attacker will try all the
possible special characters and eventually find the one allowing her to compromise your system.
Whenever you need to sanitize your inputs, the best thing to do first is to check if there is already a
function available to you, written by experts. For example, some database interfaces have prepared
statements, which allow you to write the SQL query with ?s in the place of various inputs, bind values
to those inputs, and then the database library ensures that there are no input sanitization issues.
10.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 10.1 : Use the man pages to find a function that:
1. Takes a string and returns its length.
2. Takes a string and a character and returns a pointer to the first occurrence of the
character inside the string (or NULL if not found).
3. Takes two strings and returns a pointer to the first occurrence of the second string
inside the first string (or NULL if not found).
• Question 10.2 : Write the function int myatoi(const char * str), which behaves like the
atoi function, except that you write it yourself (without using atoi or strtol).
• Question 10.3 : Write the function int myatoiHex(const char * str), which behaves like
the atoi function, except (a) it interprets the string as a hexadecimal (base 16) number rather
than decimal, and (b) you write it yourself (without using atoi or strtol).
• Question 10.4 : Write the function
long mystrtol(const char * str, char ** endptr, int base), which behaves like
strtol, except that you write it yourself (without using atoi or strtol).
• Question 10.5 : In chess, an FEN string describes the position of the board. The string has six
parts, each separated by a space, which we will describe shortly. Your job for this exercise is to
write the function void printFENBoard(const char * fen), which takes an FEN string as
input and prints the resulting board (along with the other information encoded in the string).
We will only concern ourselves with the position information and “draw” the board it
describes.
The first part of the FEN string describes the layout of the pieces on the board, which you
should “draw” textually. This portion of the string describes each row (from “top” to “bottom”
in our perspective), separated by a slash. The contents of the row description are either letters
(one of pnbrqkPNBRQK, denoting a piece: pawn, knight, bishop, rook, queen, or king) or
numbers (denoting that many empty squares). You should draw the board by printing the letter
for each piece and a space for each blank square. You should print a newline at the end of each
row, so that the board has a square configuration.
You may assume the FEN string you are given is valid.
• Question 10.6 : Consider the description of FEN strings from the previous problem. Define a
structure pieces, which has a count for pawns, rooks, bishops, queens, knights, and kings.
Then write the function
void countPieces(const char * fen, pieces * white, pieces * black), which takes
an FEN string as its first argument and two pointers to pieces structs as its second and third
arguments respectively. The function should then fill in these structs with the count of how
many pieces are on the board for each side. Note that capital letters (PNBRQK) indicate a
white piece, while lower case letters (pnbrqk) indicate a black piece. All of the pieces are
denoted by the first letter of their name, except knights which are denoted by N/n (to avoid
confusion with the king). You may assume the FEN string you are given is valid.
• Question 10.7 : Write the function void fenToBoard(const char * fen, char board[8][8]),
which takes an FEN string and an 8×8 two-dimensional array of characters as input
and fills in the board with the letters representing the pieces of the FEN string. This problem is
similar to the previous problem where you were asked to print the board, but instead of
printing the board, you are writing the letters into the array. You may assume the FEN
string is valid.
Note that you should not assume anything about the initial contents of the board array—
you should fill each square with the correct character (which might be a space).
• Question 10.8 : Write the function
void addMatricies(double ** ans, double ** a, double ** b, int w, int h),
which takes three two-dimensional arrays—one to write the answer to (ans) and two input
matrices (a and b), as well as the width (w) and height (h) of all three matrices. Your function
should compute the matrix sum of a + b and store the result in ans. You should assume that
the matrix is laid out so that it is first indexed by the row, then by the column.
• Question 10.9 : In some popular games, the board consists of pieces in a variety of colors
(e.g., red, blue, green, etc.), and the player attempts to make matches by swapping them such
that three (or more) of the same color are in a row or column. Imagine you are writing such a
game and have an enum for the colors:
1 enum color_t {
2 RED,
3 BLUE,
4 GREEN,
5 YELLOW,
6 ORANGE,
7 PURPLE
8 };
You have a representation of your board as a 10×10 array of enum color_ts. Write
the function int containsMatch(const int board[10][10]), which takes in a board and
determines if there is any match (three in a row—vertically, or horizontally of the same color)
on it. If there is a match, this function returns 1; otherwise, it returns 0. As always, test your
code extensively.
• Question 10.10 : (Requires Calculus) Recall that the derivative of a function f at a point x is

f'(x) = lim_{h→0} (f(x + h) − f(x)) / h

Given a particular function, you can often take the derivative symbolically, writing down the
algebraic expression for the resulting function (as you likely learned how to do in calculus
class). However, you can also numerically approximate the derivative of a function by
evaluating (f(x + h) − f(x)) / h for a very small value of h—the smaller the value of h, the better
the approximation.
For this problem, you should write a function derivative, which takes two parameters
and returns a double. The first parameter is a pointer to the function to approximate the
derivative of—which should take a double and return a double. The second parameter should
be the value at which it should approximate the derivative. The function should return the
numerical approximation it comes up with and should use 0.000000000001 as the value for h.
• Question 10.11 : (Requires Calculus) In the previous problem, you saw how a function pointer
can be useful for writing a function that can numerically approximate the derivative of any
other function (of the right type). Another important technique in calculus is integration—
finding the area under a curve. As you may recall from calculus, the definite integral is defined
as:

∫_a^b f(x) dx = lim_{Δx→0} Σ_i f(x_i) Δx

We can numerically approximate this integral for a function f by iterating over the range from
a to b, computing the area of a rectangle of width Δx and height f(x), and adding up all of
these areas (we could also use trapezoids). The smaller our value of Δx is, the better our
approximation is.
Write the function
double integrate(double a, double b, double (*f)(double)), which numerically
approximates ∫_a^b f(x) dx using a suitably small Δx.
• Question 10.12 : (Requires Calculus) The gradient of a mathematical function that takes
multiple inputs is the generalization of the derivative into multiple dimensions. We will
consider the two-dimensional case here (although the concept generalizes to any number of
dimensions). For a function f(x, y) that takes two inputs x and y, the gradient ∇f is a
function whose output is a two-dimensional vector. That is, ∇f(x, y) is a vector pointing in
the direction in which f has the greatest rate of increase at the point (x, y), with a
magnitude that is the slope of the graph in that direction. The gradient of the function can be
computed by taking the partial derivative of the function with respect to each component. That
is, the x-component of the vector is ∂f/∂x and the y-component is ∂f/∂y.
In much the same way that we can numerically approximate the derivative of an arbitrary
function at a particular point, we can numerically approximate the gradient of an arbitrary
function at a particular point. Our result will, however, be a struct with two components to
represent a vector. Fill in the gradient function shown below (consult the internet or a math
textbook if you need more domain knowledge):
1 struct vect2d_tag {
2 double dx;
3 double dy;
4 };
5 typedef struct vect2d_tag vect2d;
6
7 vect2d gradient(double (*f)(double, double), double x, double y) {
8 //WRITE THIS FUNCTION
9 }
• Question 10.13 : The gradient (which we discussed and you implemented in the previous
question) can be useful to numerically find the local maximum or minimum of a function. We
can accomplish this by using gradient ascent (to find the maximum) or gradient descent (to
find the minimum). Gradient ascent works by starting at a particular point and iteratively
improving it by moving along the vector of the gradient at the current point. That is, if we are
currently at (x_n, y_n), we select the next point by

(x_{n+1}, y_{n+1}) = (x_n, y_n) + γ ∇f(x_n, y_n)

where γ is some factor that we scale the gradient by. In the simplest form of gradient ascent,
γ is constant, although using adaptive values of γ can improve the rate at which the
algorithm converges. Note that gradient descent works in much the same way but goes against
the gradient by replacing the + with a − in the above equation. The process ends when it
converges, that is, when (x_{n+1}, y_{n+1}) is sufficiently close to (x_n, y_n) that our answer is good enough
(they will be closer as the function “levels out” as we near the local maximum, where the
gradient will be 0).
For this problem, you will write the gradientAscent function shown below:
1 struct point2d_tag {
2 double x;
3 double y;
4 };
5 typedef struct point2d_tag point2d;
6
7 point2d gradientAscent(double (*f)(double, double),
8 point2d startPoint,
9 double gamma,
10 double convergedDistance ) {
11 //WRITE THIS FUNCTION
12 }
Here, f is the two-dimensional function you want to maximize, startPoint has the
coordinates from which you should start your ascent, gamma (γ) is the constant to
scale the gradient by when updating your current point, and convergedDistance is the
distance between (x_n, y_n) and (x_{n+1}, y_{n+1}) at which we consider the algorithm to have converged (that
is, when the distance between them is less than convergedDistance, we consider (x_{n+1}, y_{n+1}) to be
“close enough” to the maximum to be the right answer).
Note: you will want to use your gradient function from the previous problem.
Chapter 11
Interacting with the User and
System
So far, our programs have had a rather limited interaction with the
user or rest of the system. They have printed some results to
standard output (typically to the terminal, but sometimes you have
redirected it to a file) and have not taken any input from the user,
taken arguments on the command line, accessed files, nor many
other things we typically think of real programs as doing. Now that
we have learned about topics such as strings and arrays, we are
ready to learn how to do these things.
11.1 The Operating System
Figure 11.1: Conceptual diagram of interaction with OS and hardware.

Most interesting interactions with “the world”—reading input from
the user, writing a file on disk, sending data over a network,
etc.—require access to
hardware devices (the keyboard, the disk drive, the network card) in
ways that normal programs cannot perform themselves. One key
aspect of this issue is that “normal” programs cannot be trusted to
access hardware directly. If your program could read the disk
directly, then it could ignore the permissions system in place to
protect different users’ files and read or write any data it wanted.
Furthermore, an error in a program that can access the disk directly
could corrupt the entire file system, destroying all data on the
system.
Instead, the program asks the operating system (OS)—low-
level software responsible for managing all of the resources on the
system for all programs—to access hardware on its behalf. The
program makes this request via a system call—a special type of
function call that transfers control from the program into the
operating system. The OS checks that the program’s request is
within the bounds of its permissions before performing it, which is
how the system enforces security rules, such as file permissions. If
the request is not permissible, the OS can return an error to the
program. If, on the other hand, the request is fine, then the OS
performs the underlying hardware operations to make it happen and
then returns the result to the program.
Figure 11.1 depicts the conceptual relationship between the
program, C library, operating system, and hardware. While your C
code can make system calls directly, it is more common to use
functions in the C library. The C library functions then make system
calls as they need. The OS then interacts with the hardware
appropriately.
For example, suppose you call printf to write a string to
standard output. The printf function is part of the C library and
involves significant code that does not require system calls: it must
scan the format string for % signs and perform the appropriate
format conversions, building up the actual string to print. At some
point, however, printf must actually write the resulting string to
standard output, which requires a system call. To perform this task,
printf uses the write system call.
Novice programmers are often imprecise about the difference
between a system call (which transfers control into the OS,
requesting it to perform a task) and a library call (which calls a
function found in a library, such as the C standard library). For
example, calling printf a “system call” is technically incorrect,
though most people will understand what you mean. Being precise
with this distinction is useful for two reasons. First, the more precise
you are with your terminology, the more knowledgeable you come
across as during interviews (if you are interviewing for a
programming-related job). Second, if you need to look a function up
in the man pages, knowing whether it is a system call or part of the
C library tells you which section to look in—system calls are found
in section 2, while C library functions are found in section 3.
11.1.1 Errors from System Calls
System calls can fail in a variety of ways. For example, you may try
to read a file that does not exist or that you lack permission to access.
You might also try to connect to a remote computer across the
network at an address that is invalid or unreachable (the network or
remote computer is down). Whenever these system calls fail in C,
they (or technically their wrapper in the C library) set a global
variable1 called errno (which stands for “error number”).
We have not worked with global variables—and their use is
typically discouraged—but they are simply variables whose scope is
the entire program. They are declared outside of any function and
have a “box” that is not part of any frame. You can read (or write)
them just like any other variable.
In the particular case of errno, it is set by a failing call and
read if your program wants more information about why the call
failed. If you want to check if a specific error occurred, you can
compare it against the various constants, which are defined in
errno.h.
You may also wish to print out a message describing the error
for the user. Just printing the numeric value of errno is not usually
useful (do you know what error 2 means?). Fortunately, the C
library has a function perror, which prints a descriptive error
message based on the current value of errno. The perror function
takes one argument: a string it prints before its descriptive message.
Note that since errno is global (there is one errno for the
entire program), you must take care not to call anything that might
change it before you test it or call perror. For example,
1 //BROKEN CODE
2 int x = someSystemCall();
3 if (x != 0) {
4 printf("someSystemCall() failed!\n"); //may change errno
5 perror("The error was: ");
6 }
Here, printf might change errno (it makes system calls), so we
may not have a correct description of the error from perror. The
possibility of unexpected changes from other parts of the code is
one of the hazards of global variables and part of the reason to avoid
them in general.
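To make the correct ordering concrete, here is a small sketch (the helper name and filename are ours, not the book's): report the error with perror immediately after the failing call, before anything else can change errno.

```c
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>

// Try to open a file for reading; on failure, report the error
// immediately (before any other call can change errno) and return NULL.
FILE * open_or_report(const char * filename) {
  FILE * f = fopen(filename, "r");
  if (f == NULL) {
    perror(filename);        // uses errno set by the failing fopen
    if (errno == ENOENT) {   // constants like ENOENT come from errno.h
      fprintf(stderr, "  (no file with that name exists)\n");
    }
    return NULL;
  }
  return f;
}
```

Note that the perror call comes first; only afterwards do we do anything else (such as the extra fprintf), once errno has already been reported.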
11.1.2 Further Learning
There is much more to learn about how the OS works, its
interactions with the program and hardware, and various related
topics. However, those are beyond the scope of this book. We
strongly recommend that the serious programmer take at least one
hardware class and one operating systems class.
11.2 Command Line Arguments
One form of input that our programs can take is command line
arguments. As you have seen from many programs that you have
used by now, when you run a command from the command prompt,
you can specify arguments on the command line by writing them
after the name of the program. For example, when you run gcc -o
hello hello.c, you are passing three arguments to GCC (-o,
hello, and hello.c).2 These arguments tell GCC—which is
basically the implementation of a big algorithm to compile a C
program—what specific instance of the problem we want it to solve
(in this case, what C source code file to read, and where to put the
resulting program).
We can write our programs so that they can examine their
command line arguments. To do so, we must write main with a
slightly different signature:
1 int main (int argc, char ** argv) {
2 //...whatever code you want goes here...
3 }
Here we see that main now takes two arguments. The first,
int argc, is the count of how many command line arguments were
passed in (it stands for argument count). The second, char ** argv, is an array of
strings that contains the arguments that were passed in (it stands for
argument vector). The zeroth element of argv (that is, argv[0])
contains the name of the program,3 as it was invoked on the
command line (so if you wrote ./a.out, then argv[0] is the
sequence of characters ./a.out\0).
We can see this behavior in a toy example:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int argc, char ** argv) {
5 printf("Hello, my name is %s\n", argv[0]);
6 return EXIT_SUCCESS;
7 }
This program prints out the name it was invoked with and then
exits.
Figure 11.2: Frame of main, for a program run with ./myProgram input.txt -n 42.
argv[0] is not typically useful, but we must remember it is
there if we access the other arguments so that we look for them at
the correct indices. Whatever other arguments were passed to our
program appear in argv[1] through argv[argc - 1].4 The
arguments appear in the array in the order that they were written on
the command line (so the first one is in argv[1], the second in
argv[2], and so on). The exact division of the text of the command
line into discrete arguments is up to the command shell but is
typically done by separating the text at whitespace (note that the
program could be invoked by something other than the shell, in
which case, the invoking program can pass it whatever arguments it
wants). The arguments are passed in as text, so if the program
intends to interpret one of its arguments numerically, it must convert
that text into a number (as discussed in Section 10.1.5). Figure 11.2
depicts the frame at the start of main for a program that is run with
the command line ./myProgram input.txt -n 42.
We access the elements of argv as we would any other array of
strings. We will note that programs expecting a particular
number of arguments should check argc first to make sure that
the user actually supplied the right number of arguments before
accessing the elements of argv—failure to do so can result in the
program segfaulting when the user does not provide enough
arguments.
The simplest access patterns to the command line arguments
are iterating over the elements of argv (for programs that do the
same thing to all arguments), extracting particular information from
specific indices (for example, a program may always expect an
input file name as argv[1] and an output file name as argv[2]), or
a mix of the two (reading specific information from the first
elements, then iterating over the rest).
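As a sketch of the iteration pattern (the helper name sum_args is ours, and we assume the arguments are numeric), the following function sums all of the command line arguments after the program name. A main written with the extended signature could simply call sum_args(argc, argv) and print the result.

```c
#include <stdlib.h>

// Sum the numeric command line arguments, argv[1] .. argv[argc - 1].
// We skip argv[0], which holds the program name, and convert each
// argument from text to a number with strtol (see Section 10.1.5).
long sum_args(int argc, char ** argv) {
  long total = 0;
  for (int i = 1; i < argc; i++) {
    total += strtol(argv[i], NULL, 10);
  }
  return total;
}
```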
11.2.1 Complex Option Processing
Some programs, however, perform more complex option processing
—taking a variety of flags and options, some of which require
arguments (and in some cases, optional arguments). Going back to
our GCC example: when we write gcc -o myProgram
myProgram.c, the -o option itself takes an argument—the next
command line argument after it specifies the output file exactly
because -o came right before it. However, GCC does not require -o
to come in this position; we could write other arguments first.
For more complex option processing (such as would be
required by a program like GCC), the getopt function is quite
popular. getopt is part of the C library and parses the command line
arguments, accounting for such considerations as options that have
(potentially optional) arguments, as well as short and long name
versions (for example, some programs may accept a short name,
like -o and a long name like --output with the same meaning). We
will not go into the details of getopt here, but should you need it,
you can, of course, read about it in its man page (be sure to specify
man -S3 getopt, as there is a program by the same name in section
1).
11.2.2 The Environment Pointer
While much less commonly used than the command line arguments,
main can potentially take a third argument: char ** envp, a pointer
to an array of strings containing the values of environment
variables. If your program needs to inspect its environment
variables (see Section B.11 if you are not familiar with environment
variables), you can include this third parameter and access this
array. If you do so, the elements of the array are strings of the form
variable=value (e.g., PATH=/bin:/usr/bin). You can also access
the environment variables with the functions getenv, setenv,
putenv, and unsetenv. See their man pages for details.
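For example (the variable and helper names here are hypothetical), a small wrapper around getenv might supply a default when a variable is not set:

```c
#include <stdlib.h>

// Look up an environment variable, falling back to a default value
// if it is not set. getenv returns NULL when the name is absent.
const char * env_or_default(const char * name, const char * fallback) {
  const char * value = getenv(name);
  return (value != NULL) ? value : fallback;
}
```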
11.2.3 Process Creation
While we do not want to delve into too many details of process
creation, we do want to briefly mention a few points to assure you
that there is no magic involved in making new programs or getting
the command line arguments to main. When the command shell (or
any other program—the command shell is just a normal program)
wants to run another program, it makes a couple of system calls.
First, it makes a call to create a new process (fork). This new
process (which is an identical copy of the original, distinguishable
only by the return value of the fork system call) then makes another
system call (execve) to replace its running program with the
requested program. The execve system call takes an argument
specifying the file with the binary (or script) to run, a second
argument specifying the values to pass the new program as argv
(which must end with a NULL), and a third argument specifying the
values to pass for envp (even if main ignores envp, these are still
passed to the new program so they can be accessed by the various
environment manipulation functions mentioned in the previous
subsection).
When the OS executes the execve system call, it destroys the
currently running program (the system call only returns on an error)
and loads the specified executable binary into memory. It writes the
values of argv and envp into memory in a pre-agreed upon format
(part of the ABI—application binary interface: the contract between
the OS and programs about how things work). The kernel then sets
the execution arrow to a starting location specified in the executable
binary (on Linux with GCC, the entry point is a symbol called
_start—but the details are platform-specific).
This startup code (which resides in an object file that is linked
with any C program you compile—unless you request explicitly for
it not to be) then calls various functions that initialize the C library.
This startup code also counts the elements of argv to compute the
value of argc and eventually calls main. Regardless of how main is
declared, it always passes in argc, argv, and envp—if main is
declared with fewer arguments, it still receives them but simply
ignores them.5
When main returns, it—like all other functions—returns to the
function that called it. In the case of main, the caller is this startup
code. This code then performs any cleanup required by the C library
and calls exit (which quits the program), passing in the return value
of main as the argument of exit—which specifies the exit status of
the program.
The shell (or other program that ran the program in question)
can make a system call that waits for its “child” process(es) to exit
and collects their return values.
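The fork/exec/wait sequence can be sketched as follows. This is a simplified illustration, not how a real shell is implemented: we use execv (a C library wrapper that ends up in the execve system call, passing along the current environment) and waitpid to collect the child's exit status. The helper name is ours, and the path /bin/true used in any example invocation is assumed to exist on the system.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// A sketch of what a shell does to run another program: fork a copy
// of ourselves, replace the child's program with execv, and wait for
// the child to exit, collecting its exit status.
int run_program(char * path, char ** child_argv) {
  pid_t pid = fork();
  if (pid == -1) {            // fork failed
    perror("fork");
    return -1;
  }
  if (pid == 0) {             // child: replace the running program
    execv(path, child_argv);  // only returns on error
    perror("execv");
    exit(EXIT_FAILURE);
  }
  int status;                 // parent: wait for the child to exit
  if (waitpid(pid, &status, 0) == -1) {
    perror("waitpid");
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```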
11.3 Files
Another form of interaction with the outside world that we may
wish to have our program perform is accessing files. We will
assume that you are familiar with the basics of what a file is
and the relevant concepts surrounding it (such as a path name,
access permissions, etc.)—if you do not, you should familiarize
yourself with this information in Appendix B—and focus on how
we can write programs that access files.
11.3.1 Opening Files
The first thing we must do to access a file (whether for reading or
writing) is open it. Opening a file results in a stream associated with
that file. A stream (which is represented by a FILE * in C) is a
sequence of data (in this case chars) that can be read and/or written.
The stream has a current position, which typically advances
whenever a read or write operation is performed on the stream (and
may or may not be arbitrarily movable by the program, depending
on what the stream is associated with).
Typically, you will open a file with the fopen function, which
has the following prototype:
1 FILE * fopen(const char * filename, const char * mode);
This function takes two arguments, the first of which is the name of
the file to open. This filename is a string (which must be null-
terminated and is not modified by fopen), which is the pathname of
the file to open. The pathname can either be an absolute path
(starting with a /), or a path relative to the current working directory
of the program. The second argument is the mode of the file—
whether the file should be opened for reading and/or writing,
whether to create the file if it does not exist, whether or not existing
content is cleared out, and from what position accesses to the file
start. Typically, a string literal (such as "r") is passed for the
mode parameter (though any expression that evaluates to a valid
string is, of course, legal). We will discuss the mode in more detail
momentarily.
Figure 11.3: Conceptual representation of the result of opening a file.
Before going into more details about the mode, it is useful to
see the effects of opening a file depicted. Figure 11.3 shows what
happens when fopen is used to open a file and the resulting FILE *
is assigned to a variable called f. The depiction of the FILE is
conceptual here, as the actual struct is much more complex and does
not directly contain the data from the file. However, we do not need
to know what the actual FILE struct contains, only how to operate
on it. In fact, even if we do know what is in a FILE struct on one
particular system, we should not attempt to use such information to
directly access the fields inside of it, as the implementation details
are system-dependent, and thus our code would be non-portable.
Our conceptual representation of the FILE struct does,
however, show the relevant pieces for typical use—we can use this
abstraction to think about what a program will do, and thus how to
execute it by hand. There is some information about the state of the
file (which file is opened, whether it can be read or written, and
whether or not the program has attempted to read past the end of the
file) shown in the top portion. The bottom portion shows the data in
the file, along with the position of the stream (indicated by a blue
cursor). Read and write operations will occur at the current position
(i.e., this blue cursor) and will advance it.
We will note that real FILE structs do not contain the name of
the file, but rather a file descriptor—a numeric identifier returned to
the program by the operating system when the program makes the
system call to open the file (namely, the open system call). The C
library functions that operate on the stream pass this descriptor to
the relevant system calls, which perform the underlying input/output
operations for them.
It is possible for fopen to fail—for example, the file you
requested may not exist, or you may not have the permissions
required to access it. Whenever fopen fails, it returns NULL (and sets
errno appropriately). You should always check the return value of
fopen to see if it is NULL or a valid stream before you attempt to use
the stream.
Mode  Read and/or write  Does not exist?  Truncate?  Position
r     read only          fails            no         beginning
r+    read/write         fails            no         beginning
w     write only         created          yes        beginning
w+    read/write         created          yes        beginning
a     write only         created          no         end
a+    read/write         created          no         end
Table 11.1: Summary of modes for fopen.
The possible modes for fopen are summarized in Table 11.1 but can
be found in more detail in the man page for fopen. The first column
shows the possible legal values for the mode string.6 The second
column shows whether the opened file can be read, written, or both.
The third column shows what happens if the requested file does not
exist—for some modes, it is created (if possible), and for others, the
call to fopen fails. The fourth column shows whether the contents of
the file are truncated to zero-length if the file already exists—that is,
whether or not the existing contents of the file are discarded. The
final column shows where the accesses start—at the beginning or
end of the contents of the file. We note that for the a modes (which
stands for append), all writes will write their data to the end of the
file, even if the program explicitly moves the current position
elsewhere.
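To illustrate the difference between the "w" and "r" rows of the table, here is a sketch (the helper name is ours; it uses fprintf and fgets, which are covered in the following sections) that creates a file in mode "w", writes one line, then reopens the same file in mode "r" to read the line back:

```c
#include <stdio.h>

// Open a file with mode "w" (creating it if needed and discarding
// any existing contents), write a line, close it, then reopen it with
// mode "r" (which fails if the file does not exist) and read the line
// back into buffer. Returns 0 on success, -1 on failure.
int write_then_read(const char * filename, char * buffer, int bufsize) {
  FILE * f = fopen(filename, "w");
  if (f == NULL) {
    return -1;
  }
  fprintf(f, "hello\n");
  if (fclose(f) != 0) {
    return -1;
  }
  f = fopen(filename, "r");
  if (f == NULL) {
    return -1;
  }
  if (fgets(buffer, bufsize, f) == NULL) {
    fclose(f);
    return -1;
  }
  return fclose(f);
}
```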
11.3.2 Reading Files
Once we have our file open, we might want to read input from it.
We typically will use one of three functions to do so: fgetc, fgets,
or fread. Of these, fgetc is useful when you want to read one
character (e.g., letter) at a time. This function has the following
prototype:
1 int fgetc(FILE * stream);
While you might expect this function to return a char, it
returns an int so that it can return all possible chars, plus a distinct
value to indicate that there are no more characters available in the
stream—that the end of the file (EOF) has been reached. The value
for end-of-file is #defined as the constant EOF in stdio.h. Note that
reading the character advances the current position in the stream.
The fact that functions that read from streams advance the
position of the stream poses a minor annoyance when writing a
loop. Consider the following (broken) code, which attempts to print
every character from an input file:
1 //broken
2 FILE * f = fopen(inputfilename, "r");
3 if (f == NULL) { /* error handling code omitted */ }
4 while (fgetc(f) != EOF) {
5 char c = fgetc(f);
6 printf("%c", c);
7 }
8 //...other code...
This code will read one character, check if it is the end of the file,
then read a different character, and print it. This code will actually
print every other character from the input file, plus possibly
something spurious at the end (if there are an odd number of
characters in the file).
The cleanest way to restructure the loop is to exploit the fact
that an assignment is also an expression that evaluates to the value
that is assigned. While that may sound like a technically-complex
mouthful, what it means is that x = 3 is not only an assignment of 3
to x, but also an expression that evaluates to 3. We could therefore
write y = x = 3 to assign 3 to both x and y—however, we typically
do not do so, as it makes the code less clear than writing two
assignment statements. In this particular case, however, it is OK to
exploit this property of assignments, and it is in fact a common
idiom. The following code correctly prints every character from the
input file:
1 //fixed
2 FILE * f = fopen(inputfilename, "r");
3 if (f == NULL) { /* error handling code omitted */ }
4 int c;
5 while ( (c = fgetc(f)) != EOF ) {
6 printf("%c", c);
7 }
8 //...other code...
Observe how we assign the result of fgetc to the variable c in the
while loop’s conditional expression. We then wrap that assignment
statement in parentheses to ensure the correct order of operations,7
and compare the value that was assigned (whatever fgetc returned)
to EOF.
You may have noticed that the type of c is int, not char. If we
declared c as a char, our program would have a rather subtle bug.
Can you spot it? Remember that we said fgetc returns an int so
that it can return any possible character value read from the file,
plus some distinct value for EOF. Assigning this return value to a
char then inherently discards information—we are taking 257
possible values and assigning them to something that can hold only 256
different bit patterns (recall from Chapter 3 that a char has 8 bits,
and 2^8 = 256). On most systems EOF is -1, so in this particular case,
we would not be able to distinguish between reading character
number 255 and the end of the file8 —if our input had character
number 255 in it, our program would prematurely exit the loop and
ignore the rest of the input! You should aim to think of these sorts of
corner cases when you write test cases.
Video 11.1: Reading a file with fgetc.
Video 11.1 shows an example of some code that uses fgetc. In
particular, this code reads a file (whose name is specified on the
command line) and counts the number of letters (as determined by
isalpha, from ctype.h) in that file.
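A sketch in the spirit of the video's code (the helper name is ours, and it operates on an already opened stream rather than opening the file itself):

```c
#include <stdio.h>
#include <ctype.h>

// Count the letters (as determined by isalpha) in a stream,
// reading one character at a time with fgetc until EOF.
long count_letters(FILE * f) {
  long count = 0;
  int c;                           // int, not char: must hold EOF too
  while ((c = fgetc(f)) != EOF) {
    if (isalpha(c)) {
      count++;
    }
  }
  return count;
}
```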
The fgets function is useful when you want to read one line
(with a maximum length9) at a time. This function has the following
prototype:
1 char * fgets(char * str, int size, FILE * stream);
This function takes three arguments. The first is a pointer to an array
in which to store the characters read from the file. That is, fgets
will write the data into str[0], str[1], str[2], and so on. The
second argument specifies how much space is available for it to
write data into. That is, size specifies the size of the array str. The
final argument specifies from what stream to read the data.
This function returns str if it succeeds (reads data without
error), in which case, the data in str is null-terminated. It returns
NULL if it fails—either if it encounters the end of the file before
reading any data or encounters some other error. If you need to
distinguish between an error and end-of-file, you should use the
feof and/or ferror functions, which specify whether something
attempted to read past end-of-file, or whether some other error
occurred, respectively (see their man pages for details).
Video 11.2 shows the use of fgets to read a file and perform
some simple calculations with the data it reads—in this case,
converting the textual representation of numbers into integers (using
atoi) and summing them. Observe how the code must take care that
it may not actually read an entire line (if the line is too long for the
array passed to fgets). If we were to actually write this program,
we would want a longer line size than five—we just use five to
illustrate what happens when fgets cannot read the whole line.
Video 11.2: Reading from a file with fgets.
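A sketch of the same idea (with a helper name of our choosing, and a more realistically sized buffer than the five-character one the video uses for illustration):

```c
#include <stdio.h>
#include <stdlib.h>

// Read a stream one line at a time with fgets and sum the numbers on
// those lines (assuming one number per line), converting the text of
// each line to an integer with atoi.
long sum_number_lines(FILE * f) {
  char line[256];
  long total = 0;
  while (fgets(line, 256, f) != NULL) {
    total += atoi(line);    // atoi stops at the trailing newline
  }
  return total;
}
```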
Now is a good time to re-mention that you should NEVER use
the gets function. This function behaves somewhat similarly to
fgets, but does not take an argument specifying the size of the
array it reads into. This oversight means that it will continue to read
data until it reaches a newline, even if it writes past the bounds of
the array (it has no way to tell how big it is). The gets function
therefore poses a significant security vulnerability, as it is
susceptible to buffer overflows (which we discussed in Section
10.4).
We may also want to read non-textual data from a file. For
example, we might have an image, video, sound, or other file,
where we write data in a binary format. In such a file, rather than
writing the textual representation of an integer, the actual bytes for
the integer are written to the file. The specific size of integer used
for each piece of data is part of the file format specification. When
we want to read data in this fashion, the most appropriate function is
fread, which has the following prototype:
1 size_t fread(void * ptr, size_t size, size_t nitems, FILE * stream);
The first argument is a pointer to the memory into which to read the
data—it is a void *, since it could be any type of data. The next
argument specifies the size of each item. That is, if we are reading
an int or an array of ints, we would pass in sizeof(int). However, if we are reading
and writing files, we probably want to work with the int types
defined in stdint.h, which have their sizes fixed across systems
(such as int32_t or uint64_t). The third argument specifies how
many such items should be read from the stream, and the final
argument specifies from which stream to read. The fread function
returns how many items were successfully read. As with fgets, you
should employ feof and/or ferror to obtain more information if
fread returns fewer items than you requested.
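As a sketch (the helper name is ours; it assumes the file holds a sequence of uint32_t values written in binary, e.g. by fwrite, which is covered next):

```c
#include <stdio.h>
#include <stdint.h>

// Read up to n fixed-size integers (uint32_t, from stdint.h) from a
// binary file into the array pointed to by values. Returns how many
// items fread actually delivered, which may be fewer than requested.
size_t read_u32s(const char * filename, uint32_t * values, size_t n) {
  FILE * f = fopen(filename, "r");
  if (f == NULL) {
    return 0;
  }
  size_t got = fread(values, sizeof(uint32_t), n, f);
  fclose(f);
  return got;
}
```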
We will note that there is also a function fscanf, which reads
formatted input and performs conversions in the reverse fashion of
what printf does. If you need this functionality, it is often easier to
deal with errors if you use fgets (or, as we will see later, getline)
and then use sscanf on the resulting string. fgets (and later
getline) will read a line at a time, and then you can attempt to
convert the results out, which may or may not have the desired
format. In such a case, you can easily continue reading the next line.
By contrast, fscanf will stop reading input as soon as it encounters
something that does not match the requested format. If you want to
continue reading from the next line, you must then explicitly read
out the rest of the line before proceeding. If you want to know more
about fscanf or sscanf, see their man pages.
11.3.4 Writing Files
The other operation we may wish to perform is to write to a file. As
with reading from a file, there are a variety of options to write to a
file. One option—useful if we are printing formatted text—is the
fprintf function. This function behaves the same as the familiar
printf function, except that it takes an additional argument (before
its other arguments): a FILE * specifying where to write the output.
You can also use fputc to write a single character at a time or
fputs to write a string without any format conversions. That is, if
you call fputs("%d", f), it will just print %d to the file directly, rather
than attempting to convert an integer and print the result.
Finally, if you want to print non-textual data, your best choice
is fwrite, which has the prototype:
1 size_t fwrite(const void * ptr,
2 size_t size,
3 size_t nitems,
4 FILE * stream);
The arguments are much the same as those given to fread, except
that the data is read from the buffer pointed to by ptr and written to
the stream (whereas fread reads from the stream and writes into the
buffer pointed to by ptr).
All of these methods write at the current position in the file,
and advance it accordingly. Furthermore, any of these methods of
writing to a file may fail, some of which are detected immediately,
and others of which are detected later. See the relevant man pages
for the functions you are using to see how they might fail and what
values they return to indicate failures.
The reason that the failures may be detected later is that the C
library functions may buffer the data and not immediately request
that the OS write it. Even once the application makes the requisite
system calls to write the data, the OS may buffer it internally for a
while before actually writing it out to the underlying hardware
device. The motivation for not writing immediately is performance:
making a system call is a bit of a slow process, and writing to a hard
disk is quite slow. Furthermore, writing to a disk becomes less
efficient as you write smaller quantities of data to it—there are fixed
overheads to find the location on the disk, which can be amortized
over large writes—so the OS tries to buffer up writes until there is
significant data that can be written at once (we will see this in a bit
more detail later in Video 11.4, when we learn more about closing in
the next section).
Video 11.3: Writing to a file.
Video 11.3 shows an example of using fprintf to write data to
a file. In particular, this piece of code takes three command line
arguments: a start and end number, and an output file name. It then
prints the squares of each number in the range from start to end
(inclusive) to the specified file.
11.3.5 Closing Files
After you are finished with a file, you should close it with the
fclose function, which has the following prototype:
1 int fclose(FILE * stream);
This function takes one argument, specifying which stream to close.
Closing the stream sends any buffered write data to the OS and then
asks the OS to close the associated file descriptor. After calling
fclose on a stream, you may no longer access it (a call to fgetc,
fprintf, etc. is erroneous, though exactly what will happen is
undefined).
Observe that fclose returns an int. This return value indicates
the success (0 is returned) or failure (EOF is returned, and errno is
set accordingly) of the fclose operation. The fclose operation can
fail for a variety of reasons, the most serious of which arise in
circumstances where the data cannot actually be written to the
underlying hardware device (e.g., disk drive)—for example, the
disk is full, or the file resides on a remote file system and network
connectivity has been lost. Failure of fclose for a file you have
been writing to is a serious situation—it means that the data your
program has tried to write may be lost. You should therefore always
check the return value of fclose to see if it failed.
However, what you do in response to the failure of fclose is a
bit of a difficult question. You cannot try again, as performing any
operation on the stream you attempted to close—including another
call to fclose—is erroneous (and results in undefined behavior).
What you do, however, is highly situation-dependent.
In an interactive program, you may wish to inform the user of
the problem, and she may be able to take corrective action before
proceeding. For example, suppose you are writing an editor (like
Emacs). If the user attempts to save their work but the disk is full,
she would much rather be told that the save failed (and why). The
user could then proceed to free up disk space and save again (which
would involve fopening the file again, fwriteing all the data, then
fcloseing that stream—not retrying to fclose the original stream).
By contrast, if the editor ignored the return value of fclose and
failed silently (not informing the user of the problem), she may quit,
losing all of her work.
In other situations, you may not be directly interacting with the
user (or a user capable of remedying the situation: imagine if you
are writing some web service—the user you are interacting with
typically has no ability to administer the system). You still will want
to detect the problem and take some sort of corrective action.
For exercises in this book, we will not concern ourselves with
complex corrective actions when fclose fails—printing an error
message suffices. However, you should get in the habit of checking
its return value. This way, when you are working on real programs,
you will check the return value by habit, and at least think about
what you should do if it fails.
Video 11.4: Closing a file and writing buffered data
down to the OS and disk.
Video 11.4 shows some more details of what happens when we
make the call to fclose at the end of the previous video example.
Here, we show the data movement to the OS kernel as the C library
makes the underlying write system call before closing the file.
Then we see how the OS writes the data to the underlying
hardware (e.g., a hard disk drive) when the file is closed. Note that
any of these pieces may write the data out sooner but are not
(typically) required to. We also note that the kernel structures and
the kernel's interactions with the disk drive are shown at a high
level of abstraction; they are actually quite complex, but we are
not concerned with those details here.
11.4 Other Interactions
Sometimes programs interact with the rest of the system or the
outside world in ways other than reading from and writing to files.
UNIX tries to make as many of these forms of access as is
reasonable look like accessing files, presenting the same interface
and thus allowing the use of familiar functions to perform these
operations. For operations conforming to this model, the underlying
system call to initiate access returns a file descriptor, which is then
passed to other appropriate system calls (such as read or write). If
the user wants to use the IO functions from stdio (fprintf, fgets,
fgetc,…), she can use the fdopen library call to get a FILE *
corresponding to the file descriptor.
One example of an interaction that “looks like a file” is reading
input from the user and printing it to the terminal. You have already
printed things to the terminal with printf, which prints the output
to stdout, a FILE * that is connected to “standard output.” By
default, standard output is the terminal; however, as you learned in
Appendix B, you can redirect the output so that it goes to an actual
file instead. You can also use fprintf to print to stderr, which is
another FILE * that also goes to the terminal by default. While
stdout and stderr both print to the terminal by default, they serve
different purposes: one is for output and the other is for errors.11
You can also read from the terminal to get input from a user by
reading from stdin. Each of these FILE *s is declared in stdio.h.
They are all open when your program starts (the C library opens
them before main), and they correspond to file descriptors 0
(stdin), 1 (stdout), and 2 (stderr). You should not ever close
these FILE *s, as the C library is responsible for them, not your
code.
Another example of an interaction that “looks like a file” is
transferring data across the network. The program obtains a file
descriptor for a socket—the logical abstraction over which data is
transferred—via the socket system call. Depending on the network
protocol the program desires, additional setup may then be required
to specify the destination, establish the connection, etc. However,
once the socket is set up, data can be sent across the socket by using
the write system call or read from the network by using the read
system call (which, if no data has been received, will block—cause
the program to wait—until data arrives). Of course, if the fdopen
system call has been used to set up a FILE *, then library functions
like fprintf or fgets can be used (which make write and read
system calls on the underlying file descriptor).
Another form of interaction that looks like a file is a pipe. The
name “pipe” may sound familiar from the similarly named shell
construct (as in cmd1 | cmd2 at the shell), which uses this type of
communication. A pipe is a one-way communication channel
between two processes. One process writes data into one “end” of
the pipe, and the other process reads from the pipe, obtaining the
data that was written in by the first process (or blocking if no data is
available). To each process, this communication looks exactly like
reading or writing a file (again, the processes read and write file
descriptors, or use library functions that do so). The shell uses this
mechanism when you use the pipe operator: it sets up a pipe and
arranges for the standard output file descriptor of the first process
to be the writing end of the pipe and the standard input file
descriptor of the second process to be the reading end of the pipe.
Another way that UNIX provides access to things that are not
traditional files in a way that looks like files is through device
special files. These appear as files in the file system (you can see
them with ls, for example) but have a special file type indicating
that the OS should not attempt to read/write data from/to the disk
(or other media), but should instead perform special functionality.
These files are typically found in the /dev directory.
For example, on Linux, the “file” /dev/random provides access
to the secure pseudorandom number generator provided by the
kernel. You can open this file for reading with fopen (or open) and
read from it with whatever method you prefer. However, when you
perform a read on this file, the kernel recognizes that it should
perform a random number generation routine to supply the data,
rather than reading the disk. The kernel will generate random
numbers according to its algorithms and return that as the data read
by the system call. The read operation may block if the kernel’s
entropy model indicates it needs more entropy to generate numbers
securely.12 The /dev/urandom device can be used instead to
generate numbers with the same algorithm but without regard to
whether there is sufficient entropy available for security-sensitive
purposes.
There are, however, things that do not fit into the “everything
is a file” model. These are typically handled by system calls, but
may not involve file descriptors. For example, if you need your
program to determine the current time, you can use the
gettimeofday system call, which simply returns the time of day.
There are also system calls to create a new process (fork), replace
the currently running program with another (execve), and exit the
current program (exit is the library call typically used for this
purpose; _exit provides access to the underlying system call, which
is rarely needed), among many others. In fact, there are a few hundred
system calls. As you gain experience programming, they will
become more familiar to you.
One other form of interaction is when the OS needs to inform
the program of something asynchronously—not at a time that the
program explicitly asks for it, but rather at any time during its
execution. Here, unlike with a system call, the OS is initiating the
communication with the program. UNIX-style OSes support many
signals (each has a number and a symbolic name, indicating what it
represents).
Most signals are fatal to the program by default—if the OS
needs to deliver a fatal signal to the program, it will kill the program
in question. We have already seen an example of one fatal signal—
although we have not discussed it as a signal. When your program
segfaults (which happens when your program accesses memory in
certain invalid ways), the OS sends it SIGSEGV, which kills the
program. Some other signals have a default action of being ignored,
in which case nothing happens if the OS delivers that signal.
Programs can change what happens for each particular signal
(except for signal 9, which is SIGKILL, which is always fatal). One
option for the new behavior of the signal is to have the OS cause the
program to run a particular function (the program specifies which
function when it makes the system call asking the OS to modify its
behavior for that signal). We are not going to go into the details of
signals, signal handling, and related topics here, but you should at
least know they exist and anticipate learning about them in the
future. We will note that setting your program to just ignore
SIGSEGV because you cannot get it to work (and it keeps
segfaulting) is a Bad Idea—remember you never want to hide an
error—you always want to fix it.
11.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 11.1 : Write a program called myEcho, which prints
all of its command line arguments, each separated by a
space. After all of these, it should print a newline. That is,
your program is essentially a simplified version of the echo
program.
• Question 11.2 : Expand on your myEcho program by making
it accept the optional -n and/or -e options that the real echo
program accepts. If your program is given the -n option, it
should not print the trailing newline character after printing
its arguments. If your program is given the -e option, it
should interpret backslashed escape sequences, as described
in the echo man page.
• Question 11.3 : Write a program called myCat, which treats
each of its command line arguments as an input file name,
reads the input files in the order they appear, and prints them
all to standard output. If no command line arguments are
specified, myCat should read standard input and print it to
standard output. This program effectively behaves like a
simplified version of the cat program.
• Question 11.4 : Expand on your myCat program so that it
accepts the -n option, causing it to number the output lines.
• Question 11.5 : Write the program myCp, which takes two
command line arguments and copies the file named by the
first to the second file name. Note that this is effectively a
simple version of the cp program.
• Question 11.6 : Expand your myCp program to allow an
arbitrary number of arguments, as long as the last one names
an existing directory, in which case, all but the last argument
name files to copy into the directory (named by the last
argument).
Consult man -S2 stat for information about how to
determine if a file is a directory or not.
• Question 11.7 : Expand your myCp program further to accept
the -r option, which causes it to recursively copy the
contents of a directory. That is, if one of the source files is a
directory, it will copy the directory, and all of its contents—
if some of that contents is also a directory, it will copy that
recursively as well.
Consult man opendir to see how to find out what is in a
directory. Note that there will always be entries for “.” and
“..”, which you need to treat specially.
• Question 11.8 : Write a program called fileSum, which
takes 0 or 1 command line arguments. If 0 arguments are
specified, this program reads from stdin. If one command
line argument is specified, it reads from the file named on
the command line. This program then reads lines of input,
each of which should contain an integer number, and sums
them. At end of input, your program should print the
resulting sum. If the input contains anything that is not a
valid number, your program should print an error message to
stderr, then exit.
• Question 11.9 : Write a function that reads a text file which
is an “image” of a chess board (i.e., the output of your code
in Question 10.5) and prints out the position information for
the corresponding FEN string. The file should have exactly
eight lines of text, each of which should have exactly nine
characters (eight “board symbols” and one newline). If the
file does not match this format, you should print an error and
exit.
Chapter 12
Dynamic Allocation
Recall from Section 9.5.3 the task of writing a function that takes
an integer, creates an array of the size specified by the integer,
initializes each field, and returns the array back to the caller.
Given the tools we had thus far, our code (which did not work!)
looked like this:
//broken code: do not do
int * initArray(int howLarge) {
  int myArray[howLarge];
  for (int i = 0; i < howLarge; i++) {
    myArray[i] = i;
  }
  return myArray;
}
The reason this code will not work is that the array is created
on the stack. Variables on the stack exist only until the function
ends, at which point, the stack frame essentially disappears.1
While it may not seem that important from this example to
be able to create an array and return it, programmers often want
to write functions that perform more complex creation tasks (and
have the results persist beyond the function that created them).
Suppose you needed to read information from a file (possibly
with a complex format). You might want to write a function to
read the information, allocate memory to hold the results in some
combination of arrays and structs, and return the result to the
caller. Fortunately, there is a way to do exactly this: dynamic
memory allocation.
Figure 12.1: Highlighting the heap: where dynamic memory is allocated in program
memory.
Dynamic memory allocation allows a programmer to request
a specific amount of memory to be allocated on the heap
(highlighted in purple in Figure 12.1), not the stack. Because the
memory is not in the stack frame, it is not freed when the function
returns. Instead, the programmer must explicitly free the memory
when she is done using it.
When looking at Figure 12.1, notice how the heap has an
arrow indicating that it grows (in fact, it grows upwards, while
the stack grows downwards). Unlike the code and static data
segments, the heap changes size as the program runs, hence the
term “dynamic memory allocation.” The memory allocation
library manages the free memory in the heap, and when an
allocation request cannot be satisfied from the existing free
blocks of memory, the allocation library requests (by making a
system call) that the upper boundary of the heap be increased—
causing the heap to grow. The allocation library can then use the
newly allocated region of the heap to satisfy the allocation
request.
As we alluded to, dynamic memory allocation involves both
allocating memory (with the malloc function—discussed in
Section 12.1) and freeing that memory when you no longer need
it (with the free function—discussed in Section 12.2).
Programmers may also wish to reallocate a block of memory at a
different size (for example, you may think an array of eight
elements is sufficient then read some input and find that you need
16). The realloc function (the topic of Section 12.3) allows a
programmer to do exactly this—asking the standard library to
allocate a new (larger or smaller) block of memory, copy the
contents of the original to the new one, and free the old one
(although the library may optimize this procedure if it can expand
the block in place). For all three of these functions,
#include <stdlib.h>. We will also see a wonderful function,
getline, for reading strings of arbitrary length using dynamic
allocation.
12.1 malloc
Figure 12.2: The signature of malloc.
The most basic function for dynamic memory allocation is
called malloc (which performs memory allocation). Calling this
function is how you allocate memory dynamically. The malloc
function, shown in Figure 12.2, takes one argument telling it how
much memory is needed, and it returns a pointer to that allocated
memory in the form of a void *. Many beginning programmers
are intimidated by the concept of a void *, but you should not
be! Recall that a void * just means a pointer, but we do not know
what type of thing it points to. If malloc instead returned
something more specific (for example, an int *), we would need
a new version of malloc for every data type. This would be both
unwieldy and (in the context of user-defined data types)
impossible. Just remember that you can assign a void * to any
other pointer type—so just assign the return result of malloc to
whatever pointer you want to initialize.
12.1.1 How Many Bytes?
malloc requires that you tell it how much memory you need in
bytes—but how do you know how many bytes something is? You
may think you know that an int is four bytes (or eight bytes, or
whatever) and that a particular struct you just wrote is 100 bytes
—but writing a specific number of bytes is incorrect in almost all
cases (it is never correct for anything you will do in this book).
There are two reasons why writing a specific number of
bytes is never correct. The first is portability—the ability to
compile your code on a different system and still have it work
correctly. Even if you are absolutely sure that an int is four bytes
on your computer (with the particular compiler you are using),
you may want to compile and run your code on a different
computer (or with a different compiler) where an int may be a
different number of bytes. The second reason is maintainability—
the ease with which you can make changes to your code and still
have it work correctly. Even if your struct is 100 bytes now, what
happens if you add a field to it, and now it is 104 bytes? You
would have to go find and change every place in the code where
you assumed the size is 100 bytes.
Instead, you should let the C compiler calculate the size of
the type for you with the sizeof operator. Recall from Section
9.6, that the sizeof operator takes one operand (either a type or
an expression) and evaluates (at compile time) to the number of
bytes that type (or the type of that expression) requires. If you
want to malloc space for an int, you could write
malloc(sizeof(int)). If you want to allocate an array of things,
you can determine the size of the entire array by multiplying the
number of elements you need by the size of each element. So, for
example, to allocate enough memory for six integers, you could
call malloc(6 * sizeof(int)).
Sometimes novice programmers (even after having been
warned that they should use sizeof) think “I know that integers
are four bytes, and I don’t ever need to run this on any other
machine… I bet if I simplify that to
int *myArray = malloc(24) my code will be faster, since it
won’t be computing sizeof nor multiplying…” This line of
reasoning is flawed in two ways. First, you should never sacrifice
correctness in an attempt to improve speed. Second, no
performance improvement is actually made by this
“simplification.” The sizeof operator is evaluated at compile
time, not run-time, and any decent compiler will, as part of its
optimization process, evaluate arithmetic expressions that have
constant operands. You should write correct code and let the
compiler make it fast.2
However, we can do even better in terms of writing good
code—specifically, more maintainable code. The very best way to
call malloc would be
int * myArray = malloc(6 * sizeof(*myArray)). This way, if
someone decides later that the array should actually be of type
char or double, the call to malloc would continue to be correct
with no additional changes—the compiler will evaluate the type
of *myArray as part of evaluating the sizeof expression and
come up with the right type, and thus right size for the array
elements.
12.1.2 Video Example
Video 12.1: Stepping through a simple call to
malloc.
A simple example of how to call malloc is shown in
Video 12.1. Notice how the memory allocated by malloc is
outside of the stack—it does not belong to any frame.
12.1.3 Failure
It is possible that the heap could run out of space and there is not
enough memory to fulfill the current request. In such cases,
malloc will return NULL instead of a valid pointer to a location on
the heap. As such, it is a good idea to check the value returned by
malloc and make sure it is not NULL before trying to use the
pointer. If the value is NULL, the program should gracefully abort
with an error message explaining that a call to malloc failed (or if
it can recover from the situation and continue—that is even
better). This is preferable to the alternative: a mysterious
segmentation fault.
12.1.4 Fixing initArray
With malloc at our disposal, we are finally able to correctly
implement the task introduced at the beginning of this chapter:
// this code does work!
int * initArray(int howLarge) {
  int * array = malloc(howLarge * sizeof(*array));
  if (array != NULL) {
    for (int i = 0; i < howLarge; i++) {
      array[i] = i;
    }
  }
  return array;
}
Note that in this function, if malloc fails (i.e., returns NULL),
then the function returns NULL—this pushes the task of handling
the error up to whoever called this function. Whenever you write
a function where you would not know how the error should be
handled, making the caller handle the error is a good strategy.
12.1.5 mallocing More Complex Structures
Even though our examples so far have shown mallocing arrays of
ints, we can, of course, malloc any type we want. We can
malloc a single thing if that is all we need (rather than an array) or
any number of things (as memory permits). We can form complex
data structures in the heap by mallocing structs that have
pointers, and then setting those to point at other locations in the
heap (which themselves point at blocks of memory allocated by
malloc). We state this explicitly because it is important; however,
you could also realize that all of this is possible due to the
principle of composability (recall from Section 4.5.1)—we can
put all these different pieces together to make larger things.
For example, we might write:
struct point_tag {
  int x;
  int y;
};
typedef struct point_tag point_t;
struct polygon_tag {
  size_t num_points;
  point_t * points;
};
typedef struct polygon_tag polygon_t;

polygon_t * makeRectangle(point_t c1, point_t c2) {
  polygon_t * answer = malloc(sizeof(*answer));
  answer->num_points = 4;
  answer->points = malloc(answer->num_points * sizeof(*answer->points));
  answer->points[0] = c1;
  answer->points[1].x = c1.x;
  answer->points[1].y = c2.y;
  answer->points[2] = c2;
  answer->points[3].x = c2.x;
  answer->points[3].y = c1.y;
  return answer;
}
Here, we have a function that mallocs space for a polygon
(one struct), which itself has a pointer to an array of points. This
particular function makes a rectangle, so it mallocs space for four
points and fills them in before returning its answer to the caller.
12.1.6 Shallow vs. Deep Copying
Suppose we had a polygon_t * p1 pointing at a polygon we
created (e.g., by calling makeRectangle), and we wanted to make
a copy of the polygon it points to. If we just wrote the following,
we would only copy the pointer—we would not actually copy the
object it points to:
polygon_t * p2 = p1; //just copy pointer
Figure 12.3: Copying only a pointer.
Figure 12.3 illustrates the (hopefully now familiar) effects of
this statement. After assigning the pointer p1 to the pointer p2,
both point at the exact same memory location. The only new box
that was created was the box for the pointer p2, and if we change
anything about the polygon through one pointer, we will see the
change if we examine the values through the other pointer—since
they point at the same values.
If we were to use malloc, we could create a copy. For
example, we might write:
polygon_t * p2 = malloc(sizeof(*p2));
*p2 = *p1;
Figure 12.4: A shallow copy.
Figure 12.4 illustrates the effects of this piece of code. Now,
we have two polygons, but only one array of points. We have
created a shallow copy of the polygon—we have made a copy of
the polygon, by exactly copying the fields inside of it. Each
pointer in the shallow copy points at exactly the same location in
memory as the corresponding pointer in the original. In some
cases, a shallow copy may be what we want. However, if we did
p1->points[0].x = 42;, we would change the x of p2’s zeroth
point, since p1->points points at the same memory as
p2->points. Notice that we must also be careful when freeing an
object that has had a shallow copy made of it—we need to take
care to only free the array of points when we are done with both
shallow copies. As they share the memory, if we free the points
array when we are done with one then try to use the other copy, it
will have a dangling pointer.
If we want two completely distinct polygon objects, we want
to make a deep copy—in which we do not just copy pointers, but
instead allocate new memory for and make deep copies of
whatever the pointers point to. To make a deep copy of our
polygon_t, we would write:
polygon_t * p2 = malloc(sizeof(*p2));
p2->num_points = p1->num_points;
p2->points = malloc(p2->num_points * sizeof(*p2->points));
for (size_t i = 0; i < p2->num_points; i++) {
  p2->points[i] = p1->points[i];
}
Figure 12.5: A deep copy.
Figure 12.5 illustrates the effects of the deep copy from this
fragment of code (which again, should not be surprising—as
always, it follows the same rules you have learned). Here we have
created two completely distinct polygons, which have the same
values for their points but their own memory. If we change the x
or y of one polygon’s points, the other remains unaffected, as they
are two completely distinct data structures. Similarly, we can now
completely free one polygon when we are done with it without
worrying about anything else sharing its internal structures.
12.2 free
Figure 12.6: The signature of free.
Unlike memory allocated on the stack (which is freed as
soon as the function associated with that stack frame returns),
memory on the heap must be explicitly freed by the programmer.
For this, the C Standard Library provides the function free. The
free function, whose signature is shown in Figure 12.6, takes one
argument: the starting address of the memory that needs to be
freed. This starting address should match the address that was
returned by malloc. As a good rule of thumb, every block of
memory allocated by malloc should be freed by a corresponding
call to free somewhere later in the execution of the program.
Video 12.2: Mechanics of free(p).
When you free memory,
the block of memory you freed is deallocated, and you may no
longer use it. Any attempt to dereference pointers that point into
that block of memory is an error—those pointers are dangling. Of
course, as with all dangling pointers, exactly what will happen is
undefined. When you execute code by hand, and you need to free
memory, you should erase the block of memory that is being
freed. Specifically, if the code says free(p), you should follow
the arrow for p (which should reach a block of memory in the
heap, or be NULL) and then erase that entire block of memory. If p
is NULL, nothing happens. If p points at something other than the
start of a block in the heap, it is an error and bad things will
happen (your program will likely crash, but maybe something
worse happens).
Video 12.2 illustrates these basic mechanics. Note that in the
video, p and q both point into the same block of memory (even
though not at the same exact box). After free(p), q is also
dangling—any attempt to dereference q would be an error.
Note that free only frees the memory block you ask it to. If
that memory block contains other pointers to other blocks in the
heap, and you are done with that memory too, you should free
those memory blocks before you free the memory block
containing those pointers. In our polygon example, we would free
a polygon like this (although better would be to write a
freePolygon function with the two calls to free in it):
polygon_t * p = makeRectangle(c1, c2);
//stuff that uses p
//...
//done with p and its points
free(p->points);
free(p);
Note that doing free(p) followed by free(p->points)
would be wrong: p would be dangling at the second call, and thus
we should not dereference it.
12.2.1 Memory Leaks
When you lose all references to a block of memory (that is, no
pointers point at it), and the memory is still allocated, you have
leaked memory (or you might say your program has a memory
leak). For the small programs you write in this book, a memory
leak may not seem like a big deal—all of the memory allocated to
your program will be released by the operating system when your
program exits, so who cares if you have a few hundred bytes
lying around?
In small programs that run for at most a few seconds, a
memory leak may not have much impact. However, in real
programs, memory leaks present significant performance issues.
Do you ever find that certain programs get slower and slower as
they are open longer? Maybe you have a program that, if open for
a day or two, you have to restart because it is annoyingly
sluggish. A good guess would be that the programmers who wrote
it were sloppy, and it is leaking memory.
You, however, are not going to be a sloppy programmer. You
are going to free all your memory. When you write a program,
you should run it in Valgrind (see Section D.3—and be sure to
use it!), and be sure you get the following message at the end:
All heap blocks were freed -- no leaks are possible
Video 12.3 shows code that allocates memory but does not
free it—leaking the memory. Notice how, when p goes out of
scope, there are no more references to that block of memory, but
it remains allocated. The memory cannot be reused for future
allocation requests (as it has not been freed) but is not needed by
the program any longer (so it should have been freed). The video
concludes by fixing the code by adding a call to free.
Video 12.3: Code that leaks memory and how to
fix it with free.
12.2.2 A Dynamic Memory Allocation Analogy
malloc and free can be confusing to novice programmers, but
they are crucial to producing correct, efficient, and clean code. To
help you understand them, consider the following analogy: You
are at the bus depot and there are 50 lockers available for rent. To
request a locker, you simply tell the worker at the window how
many lockers you need. You will be given a contiguous stretch of
lockers, and you will be told which is the first locker that is yours.
For example, you might say “Five lockers, please,” and you
might be told “Your starting locker is number 37.” At this point,
you have been given lockers 37 through 41. You might also be
told “No, sorry,” because there are not five contiguous lockers
available for rent.3 In response to your request, the worker at the
window records that five lockers are in use, starting at locker 37.
When you are done using your lockers, you return to the
window and tell the worker “I’m done with the lockers starting at
37.” and those five lockers now become available to someone
else behind you in line. When you return your lockers, you free
all or none of them. You can’t return a subset of your request.
Also, you need to keep track of your starting locker (37) because
that is how the worker has recorded your booking. In technical
terms, you may call free only with a pointer that was returned by
malloc. Furthermore, you cannot return your lockers twice. Even
if you and your partner are both using the lockers, only one of
you should return them. Again, in technical speak, you may call
free only once, even if there are multiple pointers to that location
in memory. Freeing the same location in memory multiple times
is illegal.
12.2.3 Common Problems with free
Our locker analogy made references to two common errors that
programmers make when using malloc and free. Trying to “free”
the same lockers twice is a problem. In the case of memory
allocation, trying to free the same block of memory more than
one time is called double freeing. Generally, your program will
crash (segfault), although other, more sinister behaviors can
occur. A segfault on the line where the double free happened is
nice—it makes debugging easier. However, you may get stranger
symptoms, including your program crashing the next time you
call malloc. In general, if malloc crashes, an earlier error in your
code has corrupted its bookkeeping structures, and you have just
now exposed the problem. Run your code in Valgrind, and it is
quite likely to help you expose the error sooner.
Another common problem, also alluded to in the locker
analogy, is freeing something that is not at the start of the block
returned by malloc. If the locker attendant gave you lockers 37–
41, you cannot go back and say “I’m done with 38.” Why can’t the locker attendant just figure out that when you say you are done with 38, you mean the block from 37–41? For a human tracking lockers, this may seem like a silly rule; however, it makes much more sense for malloc and free.
Neither of these functions is magical (nothing in your
computer is—as you should have learned by now). They need to
do their own bookkeeping to track which parts of memory are
free and which are in use, as well as how big each block that is in
use is. Bookkeeping requires memory: they must store their own
data structures to track the information—but where do they get
the memory to track what memory is in use? The answer is that
malloc actually allocates more memory than you ask for, and
keeps a bit for itself, right before the start of what it gives you.
You might ask for 16 bytes, and malloc gives you 32—the first
16 contain its information about the block, and the next 16 it
gives you to use. When you free the block, the free function
calculates the address of the metadata from the pointer you give it
(e.g., subtract 16). If you give it a pointer in the middle of the
block, it looks for the metadata in the wrong place.
Going back to the locker analogy, this would be as if the
locker attendant gives you lockers 37–41 but then puts a note in
locker 36 that says “this block of lockers is five long.” When you
return locker 37, he looks in locker 36 and finds the note. If you
instead tried to give back locker 38, he would look in locker 37
and become very confused.
A third common mistake is freeing memory that is not on the
heap. If you try to free a variable that is on the stack (or global),
something bad will happen—most likely, your program will
crash.
Video 12.4: Three common problems when using
free.
Video 12.4 illustrates these problems.
12.3 realloc
Suppose a program initially asks for n bytes on the heap but later
discovers it needs more than n bytes. In this case, it is not
acceptable to simply reference past the size of the initial malloc
request. For example, if an array has four elements that are
indexed via array[0] to array[3], it would not be acceptable to
simply write into array[4] just because the program’s space
requirements for the array have changed. In locker-speak, this is equivalent to realizing you need a sixth locker and taking locker 42 even though it was not given to you to use. (Locker 42 might be in use by someone else, or it might be given to another person at a later time.) The proper way to respond to this increased space need is to use realloc.
Figure 12.7: The signature of realloc.
realloc effectively resizes a malloced region of memory.
Its signature is shown in Figure 12.7. The arguments to realloc are
the pointer to the region in memory that you wish to resize4 and
the new size you wish the region in memory to have. If
successful, realloc will return a pointer to the new, larger
location in the heap that is now at your disposal. If no area in the
heap of the requested size is available, realloc returns NULL.
Video 12.5 steps through a simple example.
Keep in mind that the new location in memory does not need
to be anywhere near the original location in the heap. In terms of
our locker analogy, if you return to the window and say “I have
the lockers starting at locker 37, but I need six lockers now,” the
worker may not be able to give you locker 42. Instead the worker
may respond “Okay, your new starting locker is 12.”
Conveniently, the worker will move the contents of lockers 37–41
into lockers 12–16 for you.
Video 12.5: Stepping through a call to realloc.
Note that even though we wrote
p = realloc(p, 14 * sizeof(*p)); in the video (which is a
simple example of the mechanics), doing so in real code can
result in a memory leak if realloc fails. If realloc fails, it
returns NULL but does not free the memory that was passed in to it
(so that the program has not lost that data—in case it can recover
from the situation and continue on).
Beyond malloc, free, and realloc, there are a few more
standard functions available for memory allocation, such as
calloc (which zeroes out the region in memory for you—by
contrast, malloc does nothing to initialize the memory). The man pages for all of these functions are, as always, a great reference.
12.4 getline
You have already seen fgets, which lets you read a string into a
buffer you have preallocated, specifying the maximum size for
the string to read (i.e., how much space is in the buffer).
However, how could you write code that would read a string of
any length? Before this chapter, we have not had the tools to think
about such a function—it clearly requires dynamic allocation, as
we would need that function to allocate memory that lasts after
the function returns. Now, we can learn about getline.
Figure 12.8: The signature of getline.
The function getline is available in the C Standard IO
Library (#include <stdio.h>). Its signature (shown in
Figure 12.8) looks a bit intimidating, but it is worth
understanding. getline reads a single line from the file specified
in its third argument, stream. It does this by reading characters
from the file repeatedly until it sees the character '\n', which
indicates the end of a line. As it reads each character, it copies the
characters into a buffer in memory. After reading the newline
character, getline places a '\0' character, which indicates the
end of the string.
So far, this behavior sounds much like fgets; however, the difference is that getline allocates space for the string as needed, using malloc and realloc to allocate/enlarge the buffer. The linep
parameter points at a pointer. If *linep is NULL, getline mallocs
a new buffer. If *linep is not NULL, getline uses that buffer
(which is *linecapp bytes long) to start with. If the initial buffer
is not long enough, getline reallocs the buffer as needed.
Whenever getline mallocs or reallocs the buffer, it updates
*linecapp to reflect the number of bytes allocated. It also
updates *linep to point at the new buffer. When the getline
function returns, *linep is a pointer to the string read from the
file.
The getline function returns -1 on an error (including end of file), and the number of bytes read (not counting the null terminator byte) on success. Note that the return type is ssize_t, which stands for “signed size_t”—that is, the signed integer type, which is the same number of bytes as size_t (so it can return -1).
Video 12.6: Using getline to read lines from a
file and print them out.
The getline function can be used in two ways. First, the
user can provide getline with a pointer to a malloced buffer
(*linep) whose size is pointed to by linecapp. If the line being
read is larger than *linecapp, getline will perform a realloc
for the user and modify the values of *linep (to point to the
newly realloced region) and *linecapp (to be the size of the
newly reallocated buffer) accordingly. Second, the user can provide getline no buffer, indicated by linep pointing at a NULL pointer (linep itself cannot be NULL, but *linep may be NULL). In
this case, getline will perform a malloc for the user and modify
the values of *linep (to point to the newly malloced region) and
*linecapp (to be the size of the newly allocated buffer)
accordingly. These two ways can be used together—i.e., one can
write a loop where no buffer is provided the first time, and the
same buffer is reused on subsequent iterations. Video 12.6 shows
an example of using getline to read lines from a file and print
them out.
Video 12.7: An example that combines getline
and realloc.
We conclude this chapter with Video 12.7, which shows a
slightly larger example that combines getline and realloc to
read all the lines from a file into an array. The example then sorts
them (using qsort, which you saw in Section 10.3), prints out the
results, and frees the memory appropriately.
12.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 12.1 : What does the malloc function do?
• Question 12.2 : What does the sizeof operator do? Why
is it important?
• Question 12.3 : What is a memory leak?
• Question 12.4 : What does the free function do? What is
OK to do with a block of memory after you free it?
• Question 12.5 : Suppose you call realloc, and your
program segfaults inside the C library code for realloc.
What kind of problem would you suspect in your code?
How would you go about fixing it?
• Question 12.6 : Read the man page for strdup. What does
this function do? What does it mean that its argument is a
const char *? How do you need to deallocate the
memory it allocates for you?
• Question 12.7 : Write
char * myStrdup(const char * str). This function
should behave exactly like strdup (whose man page you
read in the previous question). You should not make use
of the strdup function in writing this function.
• Question 12.8 : Consider the following code (which has
memory leaks):
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int f(int n) {
5 int * p = malloc(2 * sizeof(*p));
6 p[0] = n;
7 p[1] = n+2;
8 int ans = p[0] * p[1];
9 return ans;
10 }
11 int main(void) {
12 int * p = malloc(4 * sizeof(*p));
13 int * q = p;
14 int ** r = &q;
15 p[0] = f(1);
16 *r = NULL;
17 q = malloc(2 * sizeof(*q));
18 p = q;
19 q = NULL;
20 return EXIT_SUCCESS;
21 }
Execute this code by hand, and identify the lines on
which memory is leaked (that is, the line on which the last
reference to a block is lost). Next, modify the code by
inserting appropriate calls to free (do not make any other
changes to the code) so that the code no longer leaks
memory. Use Valgrind to make sure you have properly
eliminated all memory leaks without introducing any
other bugs.
• Question 12.9 : Video 12.7 walked through a program
that read lines from stdin, sorted them, and printed the
results. Write a similar program (or modify the code in the
video) so that the program takes one or more command
line arguments (each of which specifies a file name), reads
all of the lines in all of the specified files, sorts the data,
and prints the results. Note that this program should read
all lines from all files, then sort all of them together (so
lines from different files may be interleaved in the output),
then print the results. Test your program, and be sure that
it Valgrinds cleanly and does not leak memory.
• Question 12.10 : Consider the following code, which has
errors in it:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 struct my_struct_tag {
5 int nNums;
6 int * nums;
7 };
8 typedef struct my_struct_tag my_struct;
9
10 void f(my_struct * ptr) {
11 for (int i = 0; i <= ptr->nNums; i++) {
12 printf("%d\n", ptr->nums[i]);
13 }
14 free(ptr);
15 }
16 int main(void) {
17 my_struct * s = malloc(sizeof(s));
18 s->nNums = 5;
19 s->nums = malloc(5 * sizeof(*s->nums));
20 for (int i = 0; i < s->nNums; i++)
21 s->nums[i] = i + 4;
22 }
23 f(s);
24 free(s);
25 return EXIT_SUCCESS;
26 }
Identify and fix four major errors in the code. Check
your work with Valgrind. After you have made the code
work with no Valgrind errors, is there a way it can be
further improved by using a more correct type in some
place?
• Question 12.11 : For this problem, you will write a
program that reads two matrices from files (whose names
are specified as command line arguments), multiplies the two matrices (if you don’t know or don’t remember how to multiply matrices, find that domain knowledge on the internet). If the two matrices cannot be multiplied, your program should print an error and exit. Otherwise, it should print the result then free any memory it allocated. When your program reads the input matrices, it should
read a file where the first line contains an unsigned integer
(specifying the width), the next line contains an unsigned
integer (specifying the height), then each other line
contains a double (which is a value in the matrix). The
matrix values will be ordered such that you read across
the zeroth row (starting from column 0, and ending at the
max column), then across the first row, and so on. If there
are any errors reading the file (e.g., not enough values,
values that are not doubles, etc.), your program should
print an error and exit.
Hint: you should use malloc to allocate your
matrices. Hint 2: we provide the struct for a matrix, as
well as the functions (though you are encouraged to
abstract things out into more functions as you see fit) you
should write to accomplish this task:
1 struct matrix_tag {
2 double ** values;
3 size_t rows;
4 size_t columns;
5 };
6 typedef struct matrix_tag matrix_t;
7
8 matrix_t * multiply(matrix_t * left, matrix_t * right) {
9 //write this
10 }
11 matrix_t * readMatrix(const char * fileName) {
12 //write this
13 }
14 void printMatrix(matrix_t * matrix) {
15 //write this
16 }
17 void freeMatrix(matrix_t * matrix) {
18 //write this
19 }
20 int main(int argc, char ** argv) {
21 //write this
22 }
Chapter 13
Programming in the Large
So far, we have focused exclusively on programming in the small—designing the algorithm for a
small-sized task (i.e., one or a few functions), implementing it, testing it, and debugging it. However,
most “real” programs have significant differences from the tiny ones we have developed so far. One
key difference is that they tend to be much larger than those we have written (e.g., the Linux kernel
has about 16 million lines of code in about 30 thousand source files). Another difference is that they
have more than one person working on them, sometimes teams of hundreds to thousands. A third key
difference is that real software has a long life span during which it must be maintained.
Now that you have an understanding of the basics of programming in the small, we are ready to
begin learning about programming in the large—the aspects of designing and implementing programs
that arise from having significantly-sized programs with long life spans. A key aspect of programming
in the large is the higher-level design of the program. Instead of just writing one (or a few) functions
to perform a function-sized task, the programmer must now design the code into multiple
appropriately sized modules with clearly defined interfaces between them. Of course, programming in
the small still comes into play, as the design will ultimately boil down to many function-sized tasks,
which must be implemented.
A good analogy to think about programming in the small and in the large is writing (as in,
English text). “Writing in the small” would be the task of crafting a sentence or paragraph. Here, you
are concerned with issues like grammar (syntax) and word choice. You have one particular point you
want to make and must make it clearly. This is about the size of programming tasks you have
accomplished so far.
Continuing our analogy, “writing in the large” would come into play for larger documents—
whether they are dozen-page papers or many-hundred page books—which may also be written by
multiperson teams. Now, we must be concerned with splitting the task into logical pieces (chapters
and sections) and forming an outline. We must concern ourselves with the “interface” (in the case of
writing: logical flow of ideas) between pieces at all granularities—chapter-to-chapter, section-to-
section, and paragraph-to-paragraph. At some point, our “writing in the large” will produce a design
that dictates many “writing in the small” tasks—we then apply our skill in that area to “implement”
each of these paragraphs.
This analogy is particularly apt because the “in the large” and “in the small” skills are distinct but
go hand-in-hand in both cases. In both cases, you need both skills to write a complete
document/program. You also need to keep the “in the small” in mind as you do the “in the large”—
thinking about what makes for an appropriately sized paragraph/function that you can implement.
We are going to spend the rest of the chapter introducing programming in the large. Our coverage
here will be sufficient for you to begin writing modest-sized programs, which we will start to work
toward through the rest of this book. However, we will note that if you plan to work on very large
software projects and/or want to become an expert in programming in the large, there is an entire
subfield of computer science, called Software Engineering, dedicated to this topic. In such a case, you
should take classes in that area once you have mastered the basics of programming.
13.1 Abstraction
One of the key techniques for designing any large system is abstraction. As we previously discussed,
abstraction is the separation of interface from implementation. The interface of something is what it
does, while the implementation is how it does it.
Recall our example of abstraction from Section 3.1.2: driving a car versus knowing how it works
under the hood. The car provides a simple interface (turning the steering wheel turns the car, pushing
the accelerator makes the car go faster, pushing the brake slows it down, etc.); however the
implementation (how everything works under the hood) is quite complex. Of course, you do not
actually have to know how the car works to drive it—the implementation is hidden from you. You
only need to know the interface.
A similar principle applies in designing programs—you should be able to use a function if you
know what it does, without having to know how it does it. In fact, in your programming so far, you
have routinely used a wide variety of C library functions without knowing their implementation
details (even ones that you may have written an equivalent for as a practice exercise may be
implemented in a much more optimized manner in the real C library).
13.1.1 The Seven-Item Limit
One of the reasons abstraction is crucial to large systems is that it breaks them down into pieces we
can think about at one time. Psychologists have found1 that the human brain is generally limited to
thinking about approximately seven things at any given time. However, the “things” in this limitation
are in terms of pieces of information with semantic meaning to the person thinking about them. For
example, it is difficult to remember the sequence of letters zyqmpwntyorprs but easy to remember the
word unsurmountable, even though they both have the same number of letters. The word
unsurmountable is one logical piece of information and thus requires you to remember only one
thing, while the gibberish zyqmpwntyorprs requires you to remember each individual letter, as it is not
a meaningful word. Meanwhile, even though dzqf is also gibberish, you can easily remember it
because it has few letters.
Note that being able to think about a function is important in a variety of contexts. If a function’s
complexity exceeds your cognitive capacity, it becomes very difficult to write—there are just too
many things going on to keep straight in your head as you try to generalize, and patterns become
difficult to find. However, this complexity is also a problem when you want to understand what code
that is already written does. If you write the code, and it does not work, understanding it is crucial to
effective debugging. You may also need to understand code to modify it. Remember that real software
has a long life span—you might need to go back to code you (or someone else) wrote years ago to
change it in support of a new feature.
Abstracting code out into functions (or other larger modules) provides a way to wrap the entire
behavior up into one logically meaningful thing—the resulting function has a well-defined task, and
you can think about it in terms of what it does (its interface) rather than thinking about the details of
how it does it. Having the function as one logical unit you can think about means that it only counts
for one in the limit of the seven things you can think about at once.
To see this principle in action in a programming context, consider the following code (do not
invest too much effort in trying to puzzle out what it does):
1 roster_t * readInput(const char * fname) {
2 FILE * f = fopen(fname, "r");
3 size_t sz = 0;
4 char * ln = NULL;
5 char * endp = NULL;
6 if (f == NULL) {
7 return NULL; //Could not open file->indicate failure
8 }
9 roster_t * answer = malloc(sizeof(*answer));
10 answer->numStudents = readAnInteger(f);
11 if (getline(&ln, &sz, f) == -1) {
12 fprintf(stderr,"Could not read a line (expecting a number)\n");
13 exit(EXIT_FAILURE);
14 }
15 answer->numStudents = strtol(ln, &endp, 10);
16 //skip whitespace
17 while (isspace(*endp)) {
18 endp++;
19 }
20 if (*endp != '\0') {
21 fprintf(stderr, "Input line was not a number\n");
22 exit(EXIT_FAILURE);
23 }
24 answer->students = malloc(answer->numStudents * sizeof(*answer->students));
25 if (answer->students == NULL) {
26 free(answer);
27 return NULL;
28 }
29 for (int i = 0; i < answer->numStudents; i++) {
30 student_t * s = malloc(sizeof(*s));
31 s->name = NULL;
32 getline(&s->name, &sz, f);
33 char * p = strchr(s->name, '\n');
34 if (p != NULL) {
35 *p = '\0';
36 }
37 if (getline(&ln, &sz, f) == -1) {
38 fprintf(stderr,"Could not read a line (expecting a number)\n");
39 exit(EXIT_FAILURE);
40 }
41 s->numClasses = strtol(ln, &endp, 10);
42 //skip whitespace
43 while (isspace(*endp)) {
44 endp++;
45 }
46 if (*endp != '\0') {
47 fprintf(stderr, "Input line was not a number\n");
48 exit(EXIT_FAILURE);
49 }
50 s->classes = malloc(s->numClasses * sizeof(*s->classes));
51 for (int i = 0; i < s->numClasses; i++) {
52 s->classes[i] = NULL;
53 getline(&s->classes[i], &sz, f);
54 p = strchr(s->classes[i], '\n');
55 if (p != NULL) {
56 *p = '\0';
57 }
58 }
59 answer->students[i] = s;
60 }
61 int result = fclose(f);
62 assert(result == 0);
63 free(ln);
64 return answer;
65 }
This giant function is a huge mess! Not only is it ridiculously difficult to understand what it does,
but it also duplicates code—performing the same task in multiple places. Duplication of code is
generally bad for many reasons. The most straightforward reason not to duplicate code is that you
duplicate the effort to write it. Perhaps more importantly, it is more difficult to maintain—if you need
to change something, you have to remember (or your development teammate needs to figure out) to
change it in two places. Finally, if you have an error in your code you have to fix it in multiple places.
This giant mess of code really should be about four functions. In fact, this comes from combining four functions from the moderately-sized example we work through in Section 13.4. Keep this mess in mind as
you read that section, and contrast the difficulty in understanding this piece of code with
understanding the corresponding four functions in that example.
13.1.2 Hierarchical Abstraction
Abstraction works hierarchically—we can combine the smallest “building blocks” of our system into
larger logical units. We can then combine those larger “blocks” into larger units and repeat this
process until we have a large, complex system. Of course, ideally, at any granularity where we inspect
the system (e.g., read the code), we have something small enough to fit into our cognitive capabilities
—something we can analyze and understand.
Returning to our example of letters and words, words get combined into sentences, sentences get
combined into paragraphs, paragraphs get combined into sections, and so forth. You can see the same
effects of thinking about things in logical units at any of these granularities: think about the sentence
“I am learning about why abstraction is important in programming,” and the gibberish phrase: “A in
tricycle grail air discontinue my flammable to imprecision.” As with the letters/words example, you
can think about the first more easily, as it has a logical meaning, even though they both have the same
number of words (and letters in each word).
Designing software with hierarchical abstraction can primarily happen in two ways: bottom-up,
or top-down. In a bottom-up design, you start with the smallest building blocks first, and build
successively larger components from them. This approach lends itself well to incremental testing
(build a piece, test it, build a piece, test it, etc.). However, the downside is that you have to be sure you
are building the right blocks, and they will all fit together in the end.
The other design philosophy is top-down. In top-down, you start by designing the highest-level
algorithm and determine what other functions you need in support of it. You then proceed to design
these functions until you reach small enough functions that you do not need to abstract out any more
pieces. This design approach should sound rather familiar, as it is exactly what we have described to
you when discussing how to translate your generalized steps into code. The advantage here is that you
know exactly what pieces you need at every step (they are required by the higher-level algorithms that
you have already designed) and how they fit together.
The downside to top-down design can arise in testing. If you try to write the whole thing then test
it, you are asking for trouble (as we discussed in Chapter 6). However, if you implement your
algorithm in a top-down fashion, you may have high-level algorithms that rely on lower-level pieces
that do not exist. This problem can be overcome in a couple of ways.
First, you can still test what you have every time you build a complete piece. That is, when you
finish a “small block,” you can test it. Then once you have built (and tested) all the “small blocks” for
a medium-sized piece, you can test it. Effectively, you are building and testing your code in a bottom-
up fashion, even though you have done the design in a top-down fashion (and may have some partially
implemented/untested higher-level algorithms).
The second way you can address this problem is to write “test stubs”—simple implementations
of the smaller pieces that do not actually work in general but exhibit the right behaviors for the test
cases you will use to test your higher-level algorithms. Such test stubs may be as simple as hard-coded
responses (e.g., if (input == 3) {return 42;}). You can then test your higher-level functions
assuming the lower-level pieces work correctly, before proceeding to implement those pieces.
13.2 Readability
As we mentioned before, writing code that can be read and understood—both by yourself and by
others—is a crucial skill as you develop code in larger projects with a longer life span. Many aspects of a program’s design and implementation contribute to its readability—or lack thereof. Many novice
programmers get completely focused on getting their code to work and ignore its readability. While
correct code is critical, it is harder to make unreadable code work correctly. Additionally, you are
likely to develop bad habits—which become harder to break the longer you have them. You do not
want to end up like this guy http://xkcd.com/1513/.
13.2.1 Function Size
As we just discussed, smaller functions are easier to read and understand (as well as easier to write)
than larger functions. A general rule of thumb is that your function should fit comfortably into one
screen—that is, you should be able to read the entire thing all at once without scrolling. Of course,
exactly what constitutes “one screen” somewhat depends on your terminal window size and how you
format your code, so remember that this is a guideline.2 If you end up with a function that makes
logical sense and is over by three lines, do not sweat it too much. Likewise, if you have a function that
performs eight different logical tasks, you should not say “well that is fine, I managed to cram it all on
one screen.”
13.2.2 Naming
The names you use for identifiers can contribute significantly to (or detract significantly from) the
readability of your code. Name your variables, functions, and types to indicate what they mean and/or
do. If you can tell what something does from reading its name, you do not have to work (as hard) to
figure it out. Of course, at the same time, you should not name your variables in overly long ways that
become cumbersome to type.
A good rule of thumb here is that the length of a variable’s name should be proportional to the
size of its scope, and the complexity of its use. It is reasonable to name the counter variable of a for
loop i because it has a relatively small scope (one loop) and a simple use (it just counts: you can tell
that from reading the for loop where it is declared). Functions and types should therefore generally
have relatively descriptive names. They (usually) exist at quite large scopes (all of the functions we
have seen so far have had a scope of the entire program) and perform complex tasks.
Some people like naming conventions. We have seen a few naming conventions so far. One is
placing a _t suffix on the type name (e.g., color_t). Another is writing the names of constants in all
capitals. Some programmers like the “Hungarian notation” scheme, where variable names are prefixed
with a sequence of letters indicating their types (e.g., chInput starts with a “ch” indicating its type is char, while iLength starts with an “i” indicating its type is int).
Another set of conventions arise in how you “glue together” multiple words, as spaces are not
allowed in variable names. There are two major ways: one is to use underscores (_) wherever you would want spaces (e.g., num_letters_skipped); the other is to capitalize the first letter of words other than the
first (e.g., numLettersSkipped)—this approach is called “inner caps” (also called “camel case”).
Either of these methods is fine, though many programmers have a strong preference for one over the
other. We note that “inner caps” can also be applied to names that start with a capital letter on the first
word by convention—such as class names, which we will see when we discuss C++ in Chapter 14.
13.2.3 Formatting
The formatting of code can also have a large impact on its readability. Programmers largely agree on
some basic rules. One common rule is that code should be indented to reflect the block structure—i.e.,
in C, code inside more curly braces should be indented more than code inside fewer curly braces.
Another is that one should generally place a newline at the end of a statement, so that consecutive
statements appear on different lines.
1 int sumNums (int n)
2 {
3   int total = 0;
4   for (int i = 0; i < n; i++) {
5     total = total + n;
6   }
7   return total;
8 }
(b) One True Brace Style (1TBS).
1 int sumNums (int n)
2 {
3   int total = 0;
4   for (int i = 0; i < n; i++)
5   {
6     total = total + n;
7   }
8   return total;
9 }
(c) Allman style.
Figure 13.1: Four commonly used conventions for curly brace placement.
However, programmers typically have much dissension about the specific details of code
formatting. Probably the most contentious topic is the placement of curly braces. Figure 13.1 shows
four different brace-placement conventions: Java Style, 1TBS, Allman Style, and GNU Style. We note
that there are other variations on many of these, typically involving whether the else “cuddles” the
close curly brace of the corresponding if (that is, the else is on the same line as the closing brace),
and/or whether curly braces are used around single-line blocks. There are, of course, other styles not shown here as well.
We strongly prefer the so-called “Java” style (so named because Sun’s style guidelines used it,
and thus much of the code for Java uses it) for a couple of reasons. First, placing the open brace on the
same line as the if/else/for/… that it belongs to helps to avoid errors where a programmer
accidentally adds a line of code to the body but mistakenly places it before the open brace.
To see this problem, consider the following code:
1 if (hasPrivileges(user))
2 logActivity(user,action); //accidentally added in wrong place
3 {
4 someSecureActivity(action);
5 }
Here someone has added code—a call to a function to log some activity—intending to write it in the
body of the if but has separated the curly braces from the if. In this case, the logActivity call is the
entirety of the body of the if, and the remaining code is legal, but is not conditionally protected by the
if. That is, the code behaves as if it were:
1 if (hasPrivileges(user)) {
2 logActivity(user,action);
3 }
4 someSecureActivity(action);
Such mistakes can contribute to serious errors in the software. In this hypothetical example, the
mistake would almost certainly prove to be a security vulnerability, allowing any user to perform the
privileged action (and to make matters worse, not logging the activity if the user does not have
privileges to perform that action).
We prefer Java style over 1TBS primarily from a consistency perspective (the curly brace is
always in the same place relative to the rest of the code). Perhaps most importantly, we recommend a
brace style in which you always use curly braces, even when they can be omitted around single-line
blocks. Omitting them can make introducing errors easier.
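To see how easily omitted braces invite errors, consider the following small illustration (a contrived example of our own, not from any real codebase):

```c
#include <assert.h>

/* The second assignment is indented as if it were inside the if's
 * body, but with the braces omitted, only the first statement is
 * actually guarded by the if. */
int misleading(int x) {
  int sign = 0;
  if (x > 0)
    sign = 1;
    sign = sign + 10;  /* looks guarded, but always executes */
  return sign;
}
```

With braces always present around the body, adding a second line could never silently change which statements the if controls.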
Ultimately, what brace style you use is going to either be dictated by the style guidelines
wherever you work or left up to your personal choice.
13.2.4 Commenting and Documentation
Well-documented code is significantly easier to read than poorly-documented code. Good documentation
provides insights into the algorithm being used, explains why things that may seem surprising are
correct, and generally allows the reader to understand how and why the code works the way it does.
Of course, the key here is in writing good documentation—not just writing a lot of documentation.
Most documentation is written as comments in the code. We have seen comments before and
already learned that they indicate text meant for humans and thus ignored by the compiler. In C and
C++, comments can be written either with //, which comments out everything to the end of the line, or
with /* */, which comments out everything enclosed between the slash-star and the star-slash.
Understanding how to write good documentation can be quite tricky for novice (and even
moderately experienced) programmers. Skill in this area generally increases as you read other people’s
code (or your own code that you have not seen in a long time) and find comments particularly useful
or lacking—things you wish the comments explained. Here are some good rules of thumb to help
write better documentation:
Document Large-Scale Design As you design larger programs, you will have multiple pieces
that fit together. Describe how they fit together. What are the interfaces between modules?
What are the invariants of the whole system?
Describe Each Component For each component (function, object—when we get to C++—or
file) of your system, write a comment at the start describing it. Start by describing the
interface for anyone who just needs to use the component and does not need to know
about its implementation details. For functions, describe what each parameter means as
well as any restrictions on their values. Describe what the function returns. If it has any
side effects, explain them.
Continue by describing implementation details for anyone who needs to modify it. If
it uses a commonly known algorithm, say so. If it uses an algorithm you designed
yourself, explain how the algorithm works.
Do Not Comment The Obvious A common mistake is to believe that just writing a lot of
comments makes for good documentation. However, comments that describe the obvious
are counter-productive—they clutter up the code while providing no useful information.
Consider the following example:
1 int sumNums(int n) {
2 int total = 0; //declare total, set to 0
3 for (int i = 0; i < n; i++) { //for loop
4 total = total + n; //add n to the total
5 }
6 return total; //return the total
7 }
Here, the programmer has written four comments, but none of them contribute
anything useful to understanding the code. Anyone who understands the basics of how C
works (i.e., what we covered in Chapter 2) can get the same information instantly just by
looking at the code.
Explain The Unexpected If your code contains something unusual, such as a statement that is
correct but could appear to be erroneous, you should document why that statement is
correct. When someone unfamiliar with the code reads it, this documentation will save
them from wondering whether or not the unusual aspect of the code is an error—they can
simply read the explanation, rather than trying to figure out if it is broken. In general, if
you think someone would ask “Why does it do that?” or “Is that right?” you should
preemptively answer that question with a comment.
Consider the following example of code with useful documentation in it:
1 /* This function takes two parameters:
2 * array (which is an array of ints)
3 * n (the number of items in 'array')
4 * and sorts the integers found in 'array' into ascending order.
5 * The sorting algorithm used here is the widely-known "quick sort."
6 * For details, see
7 * - "All of Programming" (Hilton + Bracy), Chapter 26, or
8 * - "Introduction to Algorithms" (CLRS), Chapter 7.
9 */
10 void sort(int * array, size_t n) {
11 if (n <= 1) {
12 return; //arrays of size 1 or smaller are trivially sorted
13 }
14 //i starts at -1: it is incremented before it is used to index the array
15 int i = -1;
16 //j starts at n - 1 even though it is decremented before it
17 //is used to index the array, since array[n - 1] is our pivot.
18 int j = n - 1;
19 //we just use array[n - 1] here for simplicity. In a real implementation,
20 //we might want more sophisticated pivot selection (see CLRS)
21 int pivot = array[n - 1];
22 while (1) {
23 do { //scan from left for value >= pivot
24 i = i + 1;
25 } while (array[i] < pivot);
26 do { //scan from right for value < pivot
27 j = j - 1;
28 } while (j >= 0 && array[j] >= pivot);
29 if (i >= j) { //if i and j have crossed, data is partitioned.
30 break;
31 }
32 //swap array[i] with array[j]
33 int temp = array[i];
34 array[i] = array[j];
35 array[j] = temp;
36 }
37 //swap array[i] with the pivot (array[n - 1])
38 array[n - 1] = array[i];
39 array[i] = pivot;
40 //recursively sort the partitioned halves: [0, i) and [i + 1, n)
41 sort(array, i);
42 sort(&array[i + 1], n - i - 1);
43 }
The code begins with a description of the function from a high level. It describes the interface to
the function; if you only need to use the function, you can tell how to do so from just reading the
comment. It also describes the algorithm used in this function. In this particular case, since the
algorithm is well known, it provides a brief description, and a couple of references in case the reader
is unfamiliar with it. Note that this function is a bit long (it does not quite fit in a terminal window), so
ideally we should abstract part of it out into its own separate function. Lines 14–36 are a great
candidate for pulling out into their own function, as they perform a specific logical task (called
“partitioning” the data—which you will learn about in Section 26.2.3).
Inside the code, the comments describe why or how things happen, as well as the rationale
behind things that seem unusual. For example, the comment describing the initialization of i explains
why it is initialized to -1, which may be confusing (or even seem like a mistake) to the unfamiliar
reader.
13.3 Working in Teams
Another consideration in larger programming tasks is working in teams. Depending on the size of the
programming task, teams may range from two or three members to hundreds. Working with teams
magnifies the importance of the considerations we have discussed so far. However, it also introduces
its own challenges and benefits.
One issue that becomes much more important when programming in a team is the use of a good
revision control system (e.g., Git, Subversion, or Mercurial). These systems facilitate multiple people
editing the same code base at once, and they also make it possible to find out who made which changes
to the code and to track older versions. We discuss Git in Section D.4.
Another significant issue in programming with larger teams is integration—putting the pieces
together. Programmers may write and test their own modules, determining that they work in isolation,
only to discover that they do not work together with the other programmers’ modules when they
attempt to integrate. Problems can arise from misunderstood (or imprecisely defined) interfaces and
differing expectations. Many programmers underestimate the time required for integration, which can
actually be quite significant.
Of course, programming in teams has benefits too—primarily that the team can accomplish more
in the same time than any individual can. The simplest way this works is that
multiple people are programming at the same time. However, there are other benefits to working in
teams, arising from the old principle that “two heads are better than one.”
One nice way to work in partnerships is pair programming. In pair programming, two
programmers share a computer with one “driving” and the other “navigating.” The driver controls the
keyboard and actually writes the code. Meanwhile, the navigator watches the code being written and
thinks about what is happening. The navigator is responsible for checking for problems with code—
whether they are just local mistakes or failure to respect global invariants. The programmers may trade
roles whenever they feel it is appropriate.
13.4 A Modestly Sized Example
To see all of these concepts in action, we are going to consider a modestly sized programming
example in detail (we note that this is still a rather small program in the grand scheme of things).
Write a program roster that reads in a file with the following format:
Number of students
Student Name
Number of classes
Classname 1
Classname 2
...
Classname N
Student Name
Number of classes
Classname 1
Classname 2
...
Classname N
That is, the first line contains a single integer that is the number of students described in the file.
After that, there are student descriptions, which each start with a line containing the student’s
name. The next line is the number of classes that the student is in. After that, there are the
specified number of lines, each of which contains the name of one class.
Your program should read this file in and then output the class roster for each class. Specifically,
for each class, it should create a file called classname.roster.txt (where classname is the
name of the class) and print each of the students who are in that course (one per line) into that
file.
For example, if your input file were
4
Orpheus
3
Mus304
Lit322
Bio419
Hercules
2
Gym399
Lit322
Perseus
3
Lit322
Phys511
Bio419
Bellerophon
2
Gym399
Bio419
Then your program would create the following five output files (each student name on its own
line, in order of appearance in the input):

Mus304.roster.txt: Orpheus
Lit322.roster.txt: Orpheus, Hercules, Perseus
Bio419.roster.txt: Orpheus, Perseus, Bellerophon
Gym399.roster.txt: Hercules, Bellerophon
Phys511.roster.txt: Perseus

Note that if the file does not obey this format, your program should print a meaningful error
message to stderr, and exit with a failure status.
This problem is far too large to write as a single function. Doing this task well involves breaking
it down into about a dozen functions. We can intuitively see the need to break things down just by
observing that there are several discrete tasks we will need to perform: reading our input (which itself
has smaller subtasks, such as reading a number with error checking, and reading the information for
one student), “rearranging” the information to the format we need, writing the output (which itself has
smaller subtasks, such as computing the filename for a class and writing out the information for a
class) and freeing the memory we allocated when we read the input.
As always, we should begin with Step 1 and work through the programming process we have
learned all throughout this book. Of course, for a task this big, we should recognize that we are
seeking to break the task down into many functions, so we should be content with relatively complex
steps—which then dictate what functions we need. Video 13.1 shows the initial steps for attacking this
problem.
Video 13.1: Initial steps for this example.
From what we have done so far in Video 13.1, we can take a couple of important steps toward
solving this problem. The first is that we can define several of the types involved in the problem. The
video worked through how we see our student_t and roster_t types. We can similarly work through
what we need to represent our list of all the classes for a classes_t type. We would then come up
with the following declarations:
1 // Each student has a name and an array of classes
2 struct student_tag {
3 char * name;
4 char ** classes;
5 int numClasses;
6 };
7 typedef struct student_tag student_t;
8
9 //A roster is an array of student records (with the size)
10 struct roster_tag {
11 student_t ** students;
12 int numStudents;
13 };
14 typedef struct roster_tag roster_t;
15
16 //An array of class names (with the size)
17 struct classes_tag {
18 char ** classNames;
19 int numClasses;
20 };
21 typedef struct classes_tag classes_t;
Having these type declarations in mind (as well as the pictures we drew of them in Video 13.1)
will be helpful as we work through the subproblems that we are about to create. These types will be
the input and/or output types of many of these other algorithms. Knowing exactly what they look like
allows us to draw precise pictures in Step 1 for each of those problems, coming up with the right
algorithms.
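To make those pictures concrete, here is a tiny roster built by hand (an illustration of our own—in the real program, these structures are filled in by readInput from the input file):

```c
#include <stdlib.h>

// The declarations from above, repeated so this sketch is self-contained.
struct student_tag {
  char * name;
  char ** classes;
  int numClasses;
};
typedef struct student_tag student_t;

struct roster_tag {
  student_t ** students;
  int numStudents;
};
typedef struct roster_tag roster_t;

// Build a one-student roster by hand: Orpheus, taking two classes
// (names taken from the example input above).
roster_t * makeTinyRoster(void) {
  student_t * s = malloc(sizeof(*s));
  s->name = "Orpheus";
  s->numClasses = 2;
  s->classes = malloc(s->numClasses * sizeof(*s->classes));
  s->classes[0] = "Mus304";
  s->classes[1] = "Lit322";
  roster_t * r = malloc(sizeof(*r));
  r->numStudents = 1;
  r->students = malloc(r->numStudents * sizeof(*r->students));
  r->students[0] = s;
  return r;
}
```

Note that this sketch points at string literals for simplicity; the real program mallocs its strings (via getline), which is why it needs the freeing functions.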
The other important step we took in the video was coming up with the general algorithm for
main:
Read the input from the file named by argv[1] (call this r)
Make a list of all of the (unique) class names (call this result c)
Write one output file per class (from c and r)
We will want to make a few alterations to this algorithm. First, we will notice that we did not
write down anything about returning our answer—however, main always returns an int. In this case,
we should return EXIT_SUCCESS. As we think carefully about what our other functions (that we are
about to write) will do, we may realize that they will need to malloc some memory—so we should
free that memory when we are done with it. If we do not realize this fact yet, we may need to come
back later and add the freeing. Furthermore, if we test carefully (with corner cases) in Step 4, we
might realize we need to add some error checking.
Ultimately, this results in a nice generalized algorithm for main:
Check that the argc == 2 (fail if not)
Read the input from the file named by argv[1] (call this the_roster)
Check that the_roster != NULL (fail if not)
Create the list of all of the class names (call this result unique_class_list)
Write all the class roster files from unique_class_list and the_roster
Free memory held by unique_class_list
Free memory held by the_roster
return EXIT_SUCCESS
As long as we abstract each of the complex steps out into its own function, this results in a nice,
short, easy-to-understand main function:
1 int main(int argc, char ** argv) {
2 if (argc != 2) {
3 fprintf(stderr, "Usage: roster inputname\n");
4 return EXIT_FAILURE;
5 }
6 roster_t * the_roster = readInput(argv[1]);
7 if (the_roster == NULL) {
8 fprintf(stderr, "Could not read the roster information\n");
9 return EXIT_FAILURE;
10 }
11 classes_t * unique_class_list = getClassList(the_roster);
12 writeAllFiles(unique_class_list, the_roster);
13 freeClasses(unique_class_list);
14 freeRoster(the_roster);
15 return EXIT_SUCCESS;
16 }
Having written main, we now have five new programming tasks: readInput, getClassList,
writeAllFiles, freeClasses, and freeRoster.
We will leave working through the details (Steps 1–4) of readInput to the reader, but one
probably comes up with code that looks like this:
1 roster_t * readInput(const char * fname) {
2 FILE * f = fopen(fname, "r");
3 if (f == NULL) {
4 return NULL; //Could not open file->indicate failure
5 }
6 roster_t * answer = malloc(sizeof(*answer));
7 answer->numStudents = readAnInteger(f);
8 answer->students = malloc(answer->numStudents *
9 sizeof(*answer->students));
10 if (answer->students == NULL) {
11 free(answer);
12 return NULL;
13 }
14 for (int i = 0; i < answer->numStudents; i++) {
15 answer->students[i] = readAStudent(f);
16 }
17 //we could check if we have reached EOF here--
18 //depends on what we consider "correct" for the format
19 //Directions don't specify if this is an error.
20 int result = fclose(f);
21 assert(result == 0);
22 return answer;
23 }
Note that again, we have formed two new programming tasks to do—writing readAnInteger and
readAStudent. While this may seem counterproductive—we have gone from one programming task
(solving this entire problem), to five programming tasks (as listed above), to six programming tasks
(the remaining four from above plus the new two)—we have actually made significant progress. Each
of our remaining tasks is far smaller and simpler than the ones that motivated their creation.
Furthermore, by writing functions for each logical task, we will reduce code duplication. We will
find that the same task comes up in multiple contexts and be able to just call a function we already
wrote. For example, we will find that readAnInteger is useful in readAStudent, which might look
like (again, we leave Steps 1–4 to the reader):
1 void stripNewline(char * str) {
2 char * p = strchr(str, '\n');
3 if (p != NULL) {
4 *p = '\0';
5 }
6 }
7 student_t * readAStudent(FILE * f) {
8 student_t * s = malloc(sizeof(*s));
9 size_t sz = 0;
10 s->name = NULL;
11 getline(&s->name, &sz, f);
12 stripNewline(s->name);
13 s->numClasses = readAnInteger(f);
14 s->classes = malloc(s->numClasses * sizeof(*s->classes));
15 for (int i = 0; i < s->numClasses; i++) {
16 s->classes[i] = NULL;
17 getline(&s->classes[i], &sz, f);
18 stripNewline(s->classes[i]);
19 }
20 return s;
21 }
At this point, we have introduced only one additional (rather simple) function,
stripNewline (which simply removes the \n from the end of a string), which we also show above.
When we write the readAnInteger function (which we will not show here, but you should be able to
write it), we will not need any other functions at all—the only complex steps in readAnInteger can be
implemented by calling functions in the C library. At that point, we will have completed everything
we needed for readInput.
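For the curious, one possible readAnInteger is sketched below. Treat it as our sketch rather than the definitive solution—there are many reasonable variants, and you should still try writing your own:

```c
#define _GNU_SOURCE  /* for getline on some platforms */
#include <stdio.h>
#include <stdlib.h>

/* Read one line from f and parse it as an int, exiting with an error
 * message if the line is missing or is not a number. The only complex
 * steps are handled by C library calls (getline and strtol). */
int readAnInteger(FILE * f) {
  char * line = NULL;
  size_t sz = 0;
  if (getline(&line, &sz, f) < 0) {
    fprintf(stderr, "Unexpected end of input\n");
    exit(EXIT_FAILURE);
  }
  char * end = NULL;
  long n = strtol(line, &end, 10);
  if (end == line || (*end != '\n' && *end != '\0')) {
    fprintf(stderr, "Input was not an integer: %s", line);
    free(line);
    exit(EXIT_FAILURE);
  }
  free(line);
  return (int) n;
}
```

How to respond to malformed input (exit, or return an error code to the caller) is a design choice; this sketch simply exits with a message, matching the problem's requirement to fail with a meaningful error.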
Contrast these four functions with what we wrote in Section 13.1.1. Even though these four
functions perform the same behavior as the giant function we saw earlier, they are much more
readable, debuggable, and maintainable.
As you write each of these functions, you would want to test them. Testing each portion of the
code in isolation gives us many benefits, as we discussed in Section 6.1. You can debug any problems
you have here now, and be confident that each works before you proceed (which involves making
functions that rely on these smaller functions).
Testing the smaller functions is relatively straightforward (you will want to write test harnesses
for them, with main functions that are separate from the rest of your program). However, readInput is
a bit larger and more complex—to test it, you will need an easy way to see whether it read the input
correctly. To perform this testing, we could write another function that we do not really need for our
main task (void printClassRoster(roster_t * r)), which prints out the class roster. We then change
main (or write a separate main in another file) to temporarily look like this:
1 int main(int argc, char ** argv) {
2 if (argc != 2) {
3 fprintf(stderr, "Usage: roster inputname\n");
4 return EXIT_FAILURE;
5 }
6 roster_t * the_roster = readInput(argv[1]);
7 if (the_roster == NULL) {
8 fprintf(stderr,"Could not read the roster information\n");
9 return EXIT_FAILURE;
10 }
11 printClassRoster(the_roster); //print what we read.
12 //skip the rest: it isn't written yet anyway
13 /* classes_t * unique_class_list = getClassList(the_roster); */
14 /* writeAllFiles (unique_class_list, the_roster); */
15 /* freeClasses(unique_class_list); */
16 /* freeRoster(the_roster); */
17 return EXIT_SUCCESS;
18 }
We would then proceed with getClassList in much the same fashion—working Steps 1–5 and
abstracting complex steps out into their own functions. In this case, we will likely want to abstract out
a function to check if an array of strings contains a particular string (this function will also prove
useful when we go to write our output files) and may or may not want to abstract out the function to
add a string to the list of classes (which requires reallocing the array). We might end up with a
function like this:
1 classes_t * getClassList(roster_t * the_roster) {
2 classes_t * ans = malloc(sizeof(*ans));
3 assert(ans != NULL);
4 ans->numClasses = 0;
5 ans->classNames = NULL;
6 for (int i = 0; i < the_roster->numStudents; i++) {
7 student_t * current_student = the_roster->students[i];
8 for (int j = 0; j < current_student->numClasses; j++) {
9 if (!contains(ans->classNames,
10 current_student->classes[j],
11 ans->numClasses)) {
12 addClassToList(ans, current_student->classes[j]);
13 }
14 }
15 }
16 return ans;
17 }
The two pieces we abstracted out are contains:
1 int contains(char ** array, const char * str, int n) {
2 for (int i = 0; i < n; i++) {
3 if (!strcmp(array[i], str)) {
4 return 1;
5 }
6 }
7 return 0;
8 }
and addClassToList:
1 void addClassToList(classes_t * class_list, char * name) {
2 class_list->numClasses++;
3 class_list->classNames = realloc(class_list->classNames,
4 class_list->numClasses *
5 sizeof(*class_list->classNames));
6 class_list->classNames[class_list->numClasses - 1] = name;
7 }
After writing these functions, we should again test everything together we have so far by writing
a function to print the list of classes and putting a call to it in main.
Our last programming task is to write our output files. We might end up with something like this:
1 void writeOneFile(const char * cName, roster_t * r) {
2 char * fname = makeRosterFileName(cName);
3 FILE * output_file = fopen(fname, "w");
4 if (output_file == NULL) {
5 perror("fopen");
6 fprintf(stderr,"Trying to open %s\n", fname);
7 abort();
8 }
9 free(fname);
10 for (int i = 0; i < r->numStudents; i++) {
11 student_t * s = r->students[i];
12 if (contains(s->classes, cName, s->numClasses)) {
13 fprintf(output_file, "%s\n", s->name);
14 }
15 }
16 int result = fclose(output_file);
17 assert(result == 0);
18 }
19 void writeAllFiles(classes_t * unique_class_list, roster_t * the_roster) {
20 for (int i = 0; i < unique_class_list->numClasses; i++) {
21 writeOneFile(unique_class_list->classNames[i], the_roster);
22 }
23 }
This code abstracts out the task of writing one file into its own function, leaving the
writeAllFiles to simply iterate over the class list, and let writeOneFile do the major work. The
writeOneFile function abstracts out makeRosterFileName, as that was itself a somewhat complex
step in translating the generalized algorithm into code. We can then write this function:
1 char * makeRosterFileName(const char * cName) {
2 const char * suffix = ".roster.txt";
3 unsigned len = strlen(cName) + strlen(suffix) + 1;
4 char * ans = malloc(len * sizeof(*ans));
5 snprintf(ans, len, "%s%s", cName, suffix);
6 return ans;
7 }
At this point, we would have completed the programming task. We wrote 14 functions, but each
function is a reasonable size (all have fewer than 25 lines) with a clearly defined purpose. Notice how
this abstraction contributes to the readability of the code. You can look at a function, and it is small
enough that you can think about what it does. You can then understand the higher-level functions in
terms of the lower-level functions they call—knowing what they do, without having to worry about
how they do it at that point. Such is the benefit of good abstraction.
Additionally, our code is easy to change. For example, suppose we decide to change the format of
the input file so that each student’s entry does not start with the number of classes that they have, but
instead, their entry ends with a blank line. We would only need to change readAStudent, as that code
is the only part that handles reading a student’s information. Everything else is isolated from the
details of how to read a student by abstracting that task into the readAStudent function. The rest of the
code (in particular, readInput) only needs to know that readAStudent will read one student entry and
return a pointer to it. If we made more drastic changes to the input format, then the impact on our code
changes would be limited to readInput and the functions it calls. Nothing else cares about the details
of how the input is read.
13.5 Even Larger Programs
As the size of our programming problems continues to grow, so does the need to break each task down
into manageable pieces. Writing small pieces of code and then testing them is the only way to make solid
progress. The opposite approach (which is a poor choice) is the “big bang” approach, in which the
programmer tries to write all of the code and only then starts testing and debugging.
One important way to split larger problems into smaller problems is to start with a minimal set of
features—implement, test, and debug those—then add new features one at a time. After you add each
new feature, retest the entire system, and make sure that everything still works. This model of
development requires more discipline (resisting the temptation to write everything all at once—and
hope it will all just work) but pays off in terms of increased overall productivity and software quality.
Discipline (especially under pressure) is one of the key qualities of good programmers—a good
programmer will work (whether implementing, debugging, or testing) in a disciplined manner,
especially under time pressure. Remember: “haste makes waste.”
As an example of how we might apply this principle, consider writing a program to let someone
play chess. The first thing we might do is just write the program so that it displays the board but not
even any pieces. This may not sound like much, but it is the smallest piece we can make and thus
something we can and should stop and test before we proceed. Once we have this part working, we
might add displaying the pieces on the board and test again. A next step might be to allow the user to
move pieces around but without worrying about the legality of the moves—just letting the user pick
up any piece and put it down anywhere else.
At this point, it is likely quite useful to add code to save and load a board position. Whether or
not we intend to have this feature in the final game, it will prove incredibly useful for testing—letting
us set up and use test cases in an efficient manner. As with all other functionality, we should test this
out thoroughly—we would not want our later testing confused and complicated by poor infrastructure.
After testing this functionality, our next step would be to add the rules of the game one by one.
We can add the basic rules for each piece moving and after each one, test the code again. After all the
basic moves, we can add castling and other complex rules (and test each as we add them). We can then
add code to check for winning (and test) and drawing (and test). If we want our program to have other
features (e.g., playing across a network), we can then add them one at a time (and for large features,
break them down into smaller pieces) and test as we go.
13.6 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 13.1 : How many pieces of information can the human brain work with at a time?
• Question 13.2 : What is abstraction? Why is it important?
• Question 13.3 : Why is it important to write readable code? How does good documentation
improve the readability of code?
• Question 13.4 : What is a reasonable size for a function?
• Question 13.5 : What is the “big bang” approach to developing a large piece of software? Why
is it bad? How can you avoid it?
• Question 13.6 : Do the example problem from Section 13.4 (without referencing the pieces
of the solution provided in the text). You do not need to come up with the exact same answer,
but you should be able to work through the programming process and break it down into
reasonably sized functions.
• Question 13.7 : Change your solution to the previous question to adapt to a new input format,
where each student’s information is terminated by a blank line instead of having a count of
how many classes the student is in.
• Question 13.8 : Write a program that takes two command line arguments, both of which name
files. The first file contains a “story” with some words replaced by the underscore (’_’)
character. The second contains a list of words that can be used to fill in the blanks to complete
the story. The second file contains one word per line. Your program should read the first file,
and print its contents, except every time it encounters an underscore, it should pick a random
word from the second file, and print it instead.
For example, the first input file might contain the following “story”:
Once upon a _ there was a _. The _ lived in
a very scary _. One day, the _ left his _
and went in search of a _. This _ is
getting pretty ridiculous, so we are
going to _ it right here. Have a nice _.
and the second input file might contain the following word list:
dragon
sword
forest
house
cave
cake
fire
cellphone
knight
horse
tree
gold
ship
One possible output of the random story program might then be
Once upon a dragon there was a horse. The gold lived in
a very scary cave. One day, the sword left his sword
and went in search of a ship. This house is
getting pretty ridiculous, so we are
going to sword it right here. Have a nice cellphone.
Note: we will improve on our random story program in Question 20.8, after we have
learned about some data structures.
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Part II C++
14 Transition to C++
15 Object Creation and Destruction
16 Strings and IO Revisited
17 Templates
18 Inheritance
19 Error Handling and Exceptions
Chapter 14
Transition to C++
Now that we have learned the basics of programming and are
starting to concern ourselves with larger problems, we are
going to switch from C to C++. C++ builds on C (it is mostly a
superset of C) and introduces a variety of features that are
useful for programming in the large. We make this transition
before we delve into data structures and algorithms (in Part III),
as one of the major advantages of C++ arises from its ability to
better express generic data structures and algorithms—where
one implementation can operate on different types of data
(which we will see in Chapter 17).
We note that we did not start with C++, as it is difficult to
find a reasonable starting point without an understanding of C
(we would need to leave important things unexplained). With
our understanding of C, we can transition gradually into C++,
though many of the examples as we transition will “look like a
C programmer wrote them”—as they will do some things using
C constructs that most C++ programmers eschew.
Before we proceed, we would like to note that C++ is
actually quite a complex language and thus a challenge to
explain to novices. Often a concept in C++ will have an easily
understood basic principle and general use but a variety of
complex underlying technical details, especially for obscure
corner cases. In writing about C++ for a novice programmer,
this complexity often presents a difficult tradeoff between
complete technical accuracy and something a novice (or even
relatively experienced) reader can grasp. When faced with this
tradeoff, we will aim for understandability, avoiding delving
into details that are unlikely to come up in most programs you
will write in the near future (or possibly ever—we have been
doing C++ for decades and never had use for some of the odder
aspects of C++).
To further complicate matters, there are three versions of
the C++ standard (the formal rules of the language) in use in
practice today (C++03, C++11, and C++14), as well as the
most recent standard (C++17); these versions differ in subtle technical ways.
We will simplify the discussion by primarily explaining things
in terms of C++03. However, we will briefly present some of
the new features in C++11 in Section E.6 (the changes in
C++14 are relatively minor)—these are mostly features aimed
at experts and thus not important to get started in C++. At the
same time, sometimes the standards are unclear and/or disagree
with what is commonly done. In such cases, we may present the
way things are typically done without delving into the subtle
technical details.
C++ also has its own terminology for many concepts,
which differs from the terminology used in other similar
languages. This terminology difference arises from C++’s
heritage of being originally created as an improvement to C
(which was initially compiled by translation into C). When
such a situation arises, we will aim to present both sets of
terminology. We will then use the terms interchangeably
throughout the rest of the text. While this choice may offend
C++ purists, we expect it to prove useful to anyone who goes
on to program in other languages (e.g., Java).
Remember: this book is about how to program and is not
intended as a detailed dive into the dark corners of C++. If you
want to learn all the nitty-gritty corner cases and subtleties of
C++, you should find a book specifically focused on the topic.
If you truly want to become an expert in the exact details of the
formal rules of the language, you might then proceed to read
the language standard (which itself is longer than this entire
book).
14.1 Object-Oriented Programming
One core difference between C and C++ is that C++ supports
object-oriented programming (OOP). In OOP, we think about
our program in terms of objects, which have a state inside of
them, and we ask the objects to perform actions. The actual
implementation of how to perform that action is hidden from us
(inside the object) and may vary from one type of object to
another. As we will see in Chapter 18, we can make objects of
different—but related—types (which may have different
implementations of the same behavior) and store them in the
same variables or arrays.
Two of the primary concepts in OOP are classes and
objects. Technically the only difference in C++ between a class
and a struct is the default access control of its members (which
we will discuss shortly). Objects are instances of classes—that
is, when you “draw a box” for something of a class type, you
have made an object. You may have multiple boxes for the
same class (or struct) type—meaning you may have multiple
objects of a given type. As you are familiar with from your
experience with structs, each of these boxes has its own sub-
boxes for the fields declared in the class (or struct).
In C++, however, classes (and also structs) can have
functions declared inside of them, which are key to the object-
oriented nature. These functions are called either member
functions or methods,1 and they have special behavior related to
being inside of the objects, which we will discuss shortly. We
will note that while methods can be declared inside of structs or
classes in C++, many programmers adhere to the convention of
using structs for types that only contain data, and classes for
types that have methods.
The syntax for declaring a class is quite similar to the
syntax for declaring a struct. However, C++ does away with the
requirement to use the struct keyword when using the struct
tag to name the type. That is, if we declare a struct in C++, we
can just use its tag as a type name, so we can write the
following code in C++ (but not in C):
1 struct myStruct {
2 int a;
3 int b;
4 };
5 int main(void) {
6 myStruct x; //not legal in C, legal in C++.
7 return 0;
8 }
The same is true of classes—the following code is also
legal:
1 class myClass {
2 int a;
3 int b;
4 };
5 int main(void) {
6 myClass x;
7 return 0;
8 }
In fact, you would seldom see a C++ programmer write
class myClass x; in this code—the class keyword before the
type name is legal but unnecessary.
Figure 14.1: Summary of the syntax for declaring a class.
Figure 14.1 summarizes the syntax for declaring a class—
the keyword class is followed by the name of the class. The
members (fields and methods) of the class are contained inside
of the curly braces, which are opened after the name of the
class. Finally, the class declaration ends with a semicolon.
14.1.1 Access Control
In C++, members (fields or methods) of a struct or a class can
have their access restricted so that only code within the class
can directly access them. There are three levels of access (also
called visibility) that each member can have: public, private,
or protected. If a member is public, it can be accessed by any
piece of code. If a member is private, it can only be accessed
by code within that class. We will discuss protected once we
discuss inheritance in Chapter 18. In a struct, the default access
is public, and in a class, the default access is private.
At this point, a natural question for you to be asking is:
why would we want to restrict access to the members of our
class? Remember that abstraction is key to designing large
systems. We want to present an interface any code can use but
hide the implementation details—making these implementation
details private lets us do so in a way that can be enforced by the
compiler. To see an example of why we might want to do this,
consider the following code, which declares a class for a bank
account (we would want to add a few more things to this, which
we will see later):
1 class BankAccount {
2 private:
3 double balance;
4 public:
5 void deposit(double amount) {
6 balance += amount;
7 }
8 double withdraw(double desiredAmount) {
9 if (desiredAmount <= balance) {
10 balance -= desiredAmount;
11 return desiredAmount;
12 }
13 else {
14 double actualAmount = balance;
15 balance = 0;
16 return actualAmount;
17 }
18 }
19 double getBalance() {
20 return balance;
21 }
22 void initAccount() {
23 balance = 0; //we will see a better way to do this
24 }
25 };
Here, we declare the balance field to be private—only
code inside of the BankAccount class can access this field. We
note that line 2 is not strictly needed—if we had omitted it,
balance would still be private, as that is the default in a class.
On line 4, we request public access control to all members that
follow (until some other access control is requested). We then
declare four methods: deposit, withdraw, getBalance, and
initAccount (which initializes the balance to zero—we will
see a better way in Chapter 15). The withdraw function takes a
desired amount but returns at most the available balance. Note
that by making the balance private, we can ensure other code
cannot modify balance in inappropriate ways,2 e.g.,
withdrawing more money than is available. Furthermore, if we
decided we wanted to log all changes to the account balance,
we know we only need to modify the deposit and withdraw
functions—no other code changes balance.
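To see the compiler do this enforcement, consider a trimmed sketch of the class (the function useAccount is our own illustration, not part of the example above); the commented-out line is exactly what the access restriction forbids:

```cpp
class BankAccount {
private:
  double balance;
public:
  void deposit(double amount) { balance += amount; }
  double getBalance() const { return balance; }
  void initAccount() { balance = 0; }
};

double useAccount(void) {
  BankAccount acct;
  acct.initAccount();     // legal: initAccount is public
  acct.deposit(100.0);    // legal: deposit is public
  // acct.balance = -50;  // would not compile: balance is private
  return acct.getBalance();
}
```

Any attempt by outside code to touch balance directly is rejected at compile time, so the only ways to change a balance are the methods the class chooses to expose.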
Figure 14.2: Illustration of member visibility for classes and structs.
Figure 14.2 illustrates the visibility of fields in a class and
a struct. The regions shown in green contain public members of
the class/struct, while regions shown in pink contain private
members. The default access (before we explicitly write an
access specifier) depends on how we declare the class/struct—
and in fact, that is the only difference between declaring a class
and declaring a struct in C++. For a class (on the left), the
default access is private, while for a struct (on the right), it is
public. In either case, once we explicitly write an access
specifier, that level of visibility persists until we write another
—fields c and d are public because they come after we have
requested public visibility, and before we request private
visibility. Once we write another access specifier, that level of
visibility persists until we write another (or the class/struct
ends).
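A minimal sketch of this default-access difference (the type and member names here are our own; the commented-out line would not compile):

```cpp
class Defaults {
  int a;                 // private: no access specifier yet in a class
public:
  void setA(int v) { a = v; }
  int getA() const { return a; }
};

struct DefaultsS {
  int a;                 // public: no access specifier yet in a struct
};

void demo(void) {
  DefaultsS s;
  s.a = 42;              // legal: a is public in the struct
  Defaults c;
  c.setA(42);            // legal: setA is public
  // c.a = 42;           // would not compile: a is private in the class
}
```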
14.1.2 Encapsulation: Methods Act on Objects
We said earlier that C++ adds the ability to put methods inside
of classes (or structs), which is a key aspect of the OOP
paradigm. Looking at our BankAccount example, we see that
there are four methods inside the BankAccount class, but why is
this important? Classes encapsulate—combine into a single
logical unit—the data and methods that act on them. That is, we
can think of these methods as being inside each instance of the
object. Calling them causes them to act on the data inside that
particular instance of the object.
For example, notice how these methods reference the
balance field. When we call deposit on a particular bank
account object, it will adjust the balance field inside that
particular object. Notice that we talk about calling a method on
a particular object. We can access these methods in the same
way that we can access fields of a struct (which is the same as
for fields of a class)—with either the dot (.) operator if we
have the object directly, or with the arrow (->) operator if we
have a pointer to it.
For example, we might do
1 BankAccount myAccount; //declare a variable of type BankAccount
2 myAccount.initAccount(); //call initAccount() on myAccount
3 myAccount.deposit(42.50); //call deposit(42.50) on myAccount
Let us notice that in this example (as there is only one
BankAccount), you could probably execute this code by hand
and get the correct result, but you would be missing a key detail
to actually do it correctly. Notice that the code inside the
methods references the balance field. With only one
BankAccount object, there is only one balance field, so you can
just guess that it is the only one. But what would happen if you
had many instances of BankAccount (each of which would have
its own balance field), and more complex code? How would
you know which balance field (that is, which BankAccount
box’s sub-box named balance) to reference?
Last time we needed to resolve confusion about multiple
boxes of the same name, we discussed the notion of scope—the
range in which a variable can be seen. Here, our confusion is
not a matter of scope—the name of the field is in scope for any
of the methods inside the class3 —and that name refers to the
field in the same object as the method. We could imagine each
object as having its own copy of the code for each method in
the class—and the code references the field in the object it is in.
While the “code inside the object” approach works
logically, it would be quite inefficient to implement—and to
execute by hand (imagine copying all of the code for an object
into it each time you make a new one!). Instead, C++ (and
many other OO languages) adds an extra parameter to each
method, called this, which is a pointer to the object the method
is inside of. For a method inside a class of type T, you can think
of the this pointer as if it were a parameter declared as
T * const this (that is, a pointer to a T, the contents of which
can be changed, but the pointer itself cannot be made to point at
anything else). All references to fields in the object are
implicitly this->field.
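For example, a trimmed version of our BankAccount class behaves identically whether we write balance or this->balance; here is a sketch with the implicit this-> written out explicitly:

```cpp
class BankAccount {
private:
  double balance;
public:
  void initAccount() { this->balance = 0; }  // same as: balance = 0;
  void deposit(double amount) {
    this->balance += amount;                 // same as: balance += amount;
  }
  double getBalance() const { return this->balance; }
};
```

Each method receives its this pointer implicitly when called on an object, so writing this-> out changes nothing about the behavior; it just makes the hidden parameter visible.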
Understanding of this is key to our correctly executing
object-oriented code by hand. When you set up the frame for a
method (called on an object), draw another box for the implicit
argument, this. The value you should pass for this is a pointer
to the object the method is being called on. If the method
invocation looks like ptr->method(args), then copy ptr. If the
method invocation looks like object.method(args), then take
the address of object (that is, pass an arrow pointing at object,
as if you passed &object). As you execute the code inside of
the method, if you encounter a reference to a member of the
class, remember that it acts as if it has this-> before it. This
rule applies to both fields and methods—that is, if you call a
method inside an object, it is implicitly invoked on this.
Video 14.1: Executing code with methods.
Video 14.1 walks through an example of the execution of
code with method calls on objects.
14.1.3 const Methods
There are times when we want to specify that the implicit this
argument is a const pointer. That is, that the compiler should
consider it to be a const T * const this rather than just
T * const this. As we do not explicitly declare this
ourselves, C++ provides a special place to put the const: after
the close parenthesis of the parameter list, but before the open
curly brace of the function body. For example, we might write a
Point class like this:
1 class Point {
2 private:
3 int x;
4 int y;
5 public:
6 void setX(int new_x) {
7 x = new_x;
8 }
9 void setY(int new_y) {
10 y = new_y;
11 }
12 int getX() const {
13 return x;
14 }
15 int getY() const {
16 return y;
17 }
18 };
Here, the getX() and getY() methods are declared with
const on the end, meaning that their implicit this argument is
a const pointer—we cannot modify anything in the object it
points at (which is the object the method was invoked upon).
Declaring methods in a const-correct fashion (meaning, we put
const on them exactly when it is correct to do so) is important,
as we can invoke a const method upon a const object but
cannot invoke a non-const method upon a const object. While
that last sentence sounds technically complicated, it is easy to
see with an example:
1 void someMethod(const Point * p) {
2 int x = p->getX(); //legal: getX() is const
3 p->setX(42); //illegal: setX is not const
4 //but *p is const!
5 }
Here, we have a method that takes a const Point * as a
parameter. Calling a const method (such as getX()) on p is
legal, as the const declaration ensures we will not modify the
contents of the object (which is what the const on the
declaration of p promises to anyone who calls someMethod).
However, we cannot invoke a non-const method (such as setX)
on p, as it may (and in this case, certainly does) modify the
object.
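The same rules in a compilable sketch (the free function readOnly is our own illustration):

```cpp
class Point {
private:
  int x;
public:
  void setX(int new_x) { x = new_x; }
  int getX() const { return x; }
};

int readOnly(const Point * p) {
  // p->setX(42);    // would not compile: setX is not const, but *p is const
  return p->getX();  // fine: getX() is const, so callable on a const Point
}
```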
14.1.4 Plain Old Data
C++ makes a distinction between classes (or structs) that are
plain old data (POD) classes and classes that are not. At a high
level, all POD types have direct C analogs—they are just
sequences of fields in memory that you could copy around
freely. POD types require no special setup or destruction. You
could allocate them with malloc, copy them with memcpy, and
deallocate them with free.
Unfortunately, discussing exactly what makes a class non-
POD is quite complex, relies on concepts we have not
introduced yet, and varies between C++03 and C++11. The
simplest rule is that if you can write it in C, it is a POD class. If
you cannot write it in C, it is probably not POD, with one
significant exception: if you declare functions (a.k.a. methods)
inside of a class or struct, they do not make the type non-POD
(unless they are virtual functions, which we will learn about
later).
This last exception may seem quite surprising, as adding
functions “inside” of a class is a significant feature of OOP.
However, even though the functions are conceptually “inside
the object,” they are actually not contained in the object in
memory. Instead, the code for these functions is just placed
with the rest of the code in memory and just takes the extra
this parameter to tell which object it is currently operating on.
The fact that the function is not actually inside the object means
that the objects as they are in memory still have a direct C
analog. That is, if we had a C++ class declaration of:
1 class MyClass {
2 public:
3 int x;
4 int y;
5 int getSum() const { return x + y; }
6 };
We could think of it as the C code:
1 typedef struct {
2 int x;
3 int y;
4 } MyClass;
5
6 int getSum (const MyClass * const this) {
7 return this->x + this->y;
8 }
The class in the C++ code and the struct in the C code
would have the same layout in memory. That is, both of them
would have the same size (sizeof(MyClass) would be the
same whether we did the C++ version or the C version), and the
offset of each field relative to the start of the object would be
the same.
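We can check these layout claims mechanically. The following sketch (our own, using offsetof from <cstddef>) asserts that the C++ class and the C-style struct have the same size and the same field offsets—the non-virtual method adds nothing to the object in memory:

```cpp
#include <cassert>
#include <cstddef>   // for offsetof

class MyClass {
public:
  int x;
  int y;
  int getSum() const { return x + y; }
};

typedef struct {
  int x;
  int y;
} MyClassC;

void checkLayout(void) {
  // the method lives with the rest of the code, not in the object
  assert(sizeof(MyClass) == sizeof(MyClassC));
  assert(offsetof(MyClass, x) == offsetof(MyClassC, x));
  assert(offsetof(MyClass, y) == offsetof(MyClassC, y));
}
```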
So far, the only C++ feature that we have seen that can
make a type non-POD is access restrictions (declaring fields
private). Whether declaring any fields private, or mixing
multiple visibilities (some private fields and some public fields)
in a class makes it non-POD depends on whether you are using
C++03 or C++11. Additionally, any class that has members of a
non-POD type is itself non-POD (although it may have pointers
to non-POD types, since the pointers themselves are all that is
held in the object in memory, and they are POD).
As we delve more into C++, we will learn more features
that make types non-POD. Knowing whether a type is POD or
not is important for knowing what you can and cannot do to it.
With any non-POD type, you cannot work on the “raw”
memory directly in a safe way, whether it is allocating,
copying, or deallocating that memory. Instead, you have to use
the C++ operators that know how to work with non-POD types
properly.
14.1.5 Static Members
Often we want the fields and methods of a class to “belong to”
a specific instance of the class. In our BankAccount example,
we want every BankAccount object we create to have a
different balance field—the balance of one account is a
completely different “box” from the balance of any other
account. Whenever we create a new BankAccount object, it
should have a box inside of it for the balance. Likewise, when
we deposit to a BankAccount (via the deposit method), it
should affect the balance in the BankAccount object we are
invoking deposit on (and no other).
However, there are times when we want all instances of a
class to share the same “box” for a particular field or to have a
method that acts on no particular instance of that class. In our
BankAccount example, we might want each BankAccount to
have an account number that is uniquely assigned when the
account is created. While each BankAccount’s particular
account number should be “one box per instance” as we have
seen so far, we also need a “box” shared by all BankAccounts to
track the next account number to assign to a newly opened
account.
We want the next account number to be a static field of the
BankAccount class. In this context, the keyword static means
that there is one “box” shared by all instances of the
BankAccount class, not one box per instance. We can modify
our BankAccount class as follows:
1 class BankAccount {
2 private:
3 static unsigned long nextAccountNumber;
4 unsigned long accountNumber;
5 double balance;
6 public:
7 unsigned long getAccountNumber() const {
8 return accountNumber;
9 }
10 void deposit(double amount) {
11 balance += amount;
12 }
13 double withdraw(double desiredAmount) {
14 if (desiredAmount <= balance) {
15 balance -= desiredAmount;
16 return desiredAmount;
17 }
18 else {
19 double actualAmount = balance;
20 balance = 0;
21 return actualAmount;
22 }
23 }
24 double getBalance() const {
25 return balance;
26 }
27 void initAccount() {
28 accountNumber = nextAccountNumber;
29 nextAccountNumber++;
30 balance = 0; //we will see a better way to do this
31 }
32 };
33
34 unsigned long BankAccount::nextAccountNumber = 0;
There are a few things to notice about the declaration of
the nextAccountNumber. First, inside the class declaration, we
declare static unsigned long nextAccountNumber.
Declaring this field as static requests that there not be one box
per instance of the class. Methods inside the class can still
access this one “shared” box (as seen in initAccount).
However, in C++, this declaration does not actually create the
“box”. Instead, we need another line of code outside of the
class definition:
1 unsigned long BankAccount::nextAccountNumber = 0;
This line of code actually creates the box and must be
placed at the global scope—outside of any functions or classes.
It is just like any other variable declaration, except that the
name of the variable being declared looks a bit odd. The name
of the variable, BankAccount::nextAccountNumber, specifies
the nextAccountNumber “inside of” the BankAccount class. The
:: is the scope resolution operator, which allows us to specify a
name inside of another named scope, such as a class. We will
note that this statement executes before main begins execution
—that is, the box is created and initialized before we begin our
normal execution of code at the start of main.
You can also write static methods in a class. These
methods cannot access non-static members of the class because
unlike non-static methods, they do not have a this pointer
passed to them—they operate on no particular object.
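A sketch of a static method (the name peekNextAccountNumber is our own invention, not part of the earlier example); note which members it can and cannot reach:

```cpp
class BankAccount {
private:
  static unsigned long nextAccountNumber;
  double balance;
public:
  static unsigned long peekNextAccountNumber() {
    // return balance;        // would not compile: no this pointer here
    return nextAccountNumber; // fine: static members need no object
  }
};

unsigned long BankAccount::nextAccountNumber = 0;
```

A static method is invoked through the class name rather than an object—BankAccount::peekNextAccountNumber()—which is exactly why it has no this pointer.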
We are not going to have much use for static members in
anything we do in this book. However, it is useful to have seen
the concept (especially if you go on to program in Java) and
know what the term means, as it may appear in compiler errors
(especially “non-static”).
14.1.6 Classes Can Contain Other Types
In C++ (and many other OO languages), classes can contain
other types inside of them. Such a declaration can either be a
typedef of an existing type or even the declaration of an entire
other class (in which case, the resulting class is called an inner
class). Like fields and methods, the current access specifier
affects the visibility of the type declaration.
For example, we could add some types to our
BankAccount class:
1 class BankAccount {
2 public:
3 typedef double money_t;
4 private:
5 class Transaction {
6 public:
7 money_t amount;
8 timeval when;
9 };
10 money_t balance;
11 Transaction * transactions;
12 //other stuff elided...
13 };
Line 3 makes money_t a typedef for double inside of
BankAccount. This type declaration is public, so it can be
referenced freely outside of the class; however, since the name
is inside of the class, the scope resolution operator (::) must be
used—the type name would be written BankAccount::money_t.
Inside the class, it can be referenced simply as money_t (as
done when declaring the balance field in this example).
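For example, code outside the class must qualify the nested type name with the class name (the function addFee is a hypothetical illustration of ours):

```cpp
class BankAccount {
public:
  typedef double money_t;   // public nested type
private:
  money_t balance;          // inside the class: no qualification needed
public:
  money_t getBalance() const { return balance; }
};

// outside the class, the nested name must be qualified:
BankAccount::money_t addFee(BankAccount::money_t amount) {
  return amount + 1.50;    // charge an assumed fixed fee
}
```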
This example also declares an inner class, Transaction,
which is private. Code outside of the BankAccount class cannot
make use of this type at all. The Transaction class has its own
fields (and could have methods, and anything else a class can
have) with their own visibility.
You may wonder what happens when you have a private
inner class (such as the Transaction class here) with public
members (as this class has). Having a private class prevents
code outside of the class from naming the type (that is, it cannot
make use of the type name BankAccount::Transaction). Such
a restriction typically means that we design our code such that
the private inner class never “escapes” the outer class—that is,
we never return instances of the class (or pointers to them) to
the code in the outside world, and have no public fields whose
type is the inner class. If we do not let instances of this private
class escape, then there is nothing complex to thinking about
how the access restrictions work: code in the outer class can
access members of the inner class according to their visibility
modifiers, and code outside of the outer class cannot even “see”
the inner class at all. If we do allow the inner class to escape the
outer class, then even though that code cannot name the type, it
can still access public fields in the object.
Unlike most things “inside” of a class, methods of an inner
class do not have direct access to non-static members of the
outer class. This restriction is not an issue of access restrictions,
but rather that the methods inside the inner class have a this
pointer that points at the object that is an instance of that inner
class (e.g., it is a pointer to a Transaction), and does not have
a pointer to the outer-class instance (e.g., BankAccount) that
created them (in fact, there might not be one). If we want to let
the inner class reference the outer class, we can do so easily by
having a field in the inner class that is a pointer to the outer
class type and initializing it to point at the appropriate outer-
class instance. We can then access fields and invoke methods
through that pointer.
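A sketch of that back-pointer pattern (the member names here are our own, not from the earlier example): the inner Transaction holds a pointer to the BankAccount that created it, and that pointer is the only route from the inner object back to the outer one.

```cpp
class BankAccount {
public:
  typedef double money_t;
private:
  class Transaction {
  public:
    money_t amount;
    BankAccount * account;   // back-pointer to the outer-class instance
    // the inner class reaches the outer object only through this pointer:
    money_t balanceNow() const { return account->getBalance(); }
  };
  money_t balance;
  Transaction last;
public:
  void initAccount() {
    balance = 0;
    last.amount = 0;
    last.account = this;     // point the inner object back at its creator
  }
  void deposit(money_t amount) {
    balance += amount;
    last.amount = amount;    // record the deposit in the inner object
  }
  money_t getBalance() const { return balance; }
  money_t lastAmount() const { return last.amount; }
};
```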
14.1.7 The Basics of Good OO Design
Object-oriented programming is a nice tool to help with
programming in the large; however, as with most tools, it must
be used properly to have benefits. While good OO design is
quite a large topic (as we mention in Chapter 34, an aspiring
professional programmer should take a software engineering
course), we cover some key principles here to help guide you as
you begin to build software using objects.
Classes are nouns. Methods are verbs. When you make
a class, its name should be a noun. When you make
instances of that class, you are making “things,”
which correspond naturally to nouns. By contrast,
methods should have verb names—they do things.
If you find yourself naming a class with a verb,
you are likely describing something that should be
a method within a particular class (maybe an
existing class or maybe a new one).
Keep classes small and to their purpose. As a class
should be named for a noun, its implementation
should reflect that noun. If there are subpieces that
can logically be split out into their own class,
doing so is often a good idea—much like we often
want to abstract out tasks into functions as we
write other functions.
Be as general as you can be, but no more.
As we will see later, C++ provides a variety
of ways to make your classes general. For
example, we will see in Chapter 17 how we can
make classes and methods that operate on any type
of data, as long as the operations are the same
regardless of type. When designing a class, if you
have a choice between making the class more
general (whether in terms of types, having more
parameters versus hard coding, or other ways to
generalize your code), or limiting yourself to what
you need specifically at this moment, it is typically
better to write the more general class. Such an
approach saves you from reimplementing the class
to accommodate other types or variations later. Of
course, if making the class more general presents
significant difficulties, you may be trying to make
it too generalized, which can be even worse.
Avoid “Manager” classes. One common hallmark of
poor OO design is classes with “Manager” in their
name. For example, if one is writing a game, and
writes a GameManager class, this class reflects a
poor design choice. Here, the programmer has
basically said “I have a bunch of stuff I need to do
but cannot come up with a good OO design, so I
will throw it all into a class that manages
everything.” Here, the programmer could improve
his design by separating out the functionality of
this manager into logical units that map onto nouns
—for example, we might have a Timer class (to
handle events that happen after a certain amount of
time), a Set class (from which we make a set of
timers to represent all the active timers), a
GameBoard class (to represent the board), and so
on. Basically, unless you are writing a class to
represent an employee who oversees other
employees, you should never name a class with
manager in its name.4
14.2 References
In C, we can refer to a “box” either directly by using its name
or by dereferencing pointers that point at its box. C++ provides
another way to refer to a box: via a reference. A reference is
similar to a pointer in that it provides access to a “box” through
a level of indirection; however, there are many differences
between references and pointers. A reference is intended to
conceptually be “another name for a box.” Reference types are
declared with an & in the same way that pointers are declared
with a * (e.g., int & x = y; declares a reference x, which has
type “int reference” and references y—it is conceptually
another name for y’s box).
The first difference between references and pointers is that
once a reference is initialized (to refer to some box), it cannot
be changed to refer to a different box. By contrast, pointers can
have what they point at changed any time (unless they are
declared with appropriate const modifiers to prevent such
modifications). A consequence of this rule is that any type that
has a member whose type is a reference is not a POD type—
because the reference requires “special” initialization (you
cannot just assign to it later).
The second difference is that references are automatically
dereferenced, whereas pointers must be explicitly dereferenced.
The third difference is that a reference must be initialized
when it is declared. The initialization is treated specially—it
sets what the reference refers to—while any other use of the
reference implicitly refers to whatever is referenced.
The fourth difference, which is a corollary of these first
two differences, is that we cannot have NULL references.
The fifth difference is that we cannot have a “reference to
a reference” (int &&x is not a reference to a reference to an
int5). Similarly, we cannot have a pointer to a reference (so
int&* x is not legal); however, we can have a reference to a
pointer (so int *&x is legal).
The sixth difference is that we cannot perform “reference
arithmetic,” even though we can perform pointer arithmetic.
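The following sketch (our own) exercises these differences in one function; each commented-out line is something the compiler would reject:

```cpp
int referenceRules(void) {
  int a = 1;
  int b = 2;

  int & r = a;   // a reference must be initialized when declared
  r = b;         // copies b's value into a (a is now 2); does NOT re-seat r

  int * p = &a;  // pointers, by contrast, can be re-aimed freely
  p = &b;
  *p = 7;        // b is now 7 (pointers need an explicit dereference)

  int *& rp = p; // legal: a reference to a pointer
  // int &* q = &r;   // illegal: cannot have a pointer to a reference
  // int & n = NULL;  // illegal: references cannot be NULL
  rp = &a;       // assigns to p through the reference; p points at a again
  *p = 9;        // a is now 9

  return a + b;  // 9 + 7 = 16
}
```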
For the most part, we can think of the behavior of
references in terms of imagining a translation of the code with
references to equivalent code with pointers. In the code below,
the first piece of code shows two functions that use references
(they do not really accomplish useful tasks nor use the
references in a useful way—they are just simple functions for
the purposes of example). The second shows code with
equivalent behavior that uses pointers instead of references.
Code with References
1 int f(int & x, int y) {
2 int z = x + y;
3 x = z;
4 return z - 2;
5 }
6 int g(int x) {
7 int z = f (x, 3);
8 int & r = z;
9 r++;
10 return z * x;
11 }
Equivalent Code With Pointers
1 int f(int * const x, int y) {
2 int z = (*x) + y;
3 (*x) = z;
4 return z - 2;
5 }
6 int g(int x) {
7 int z = f (&x, 3);
8 int * const r = &z;
9 (*r)++;
10 return z * x;
11 }
More generally, we can (for the most part) translate
reference-based code to pointer-based according to the
following rules.
Declaration If we have a declaration of a variable or
parameter of type T & (where T is some type), we
can translate it to a declaration of a pointer of type
T * const (that is, a pointer to a T where the
pointer itself cannot be changed, but what it points
to can be). In the example code above, this rule is
applied to the declaration of the parameter x in line
1 and to the variable r in line 8.
Initialization When we initialize the reference, we
implicitly take the address of the value we are
initializing from. This initialization must happen in
the same statement as the declaration for a variable
and is thus different from any other assignment to
the reference. We can translate the initialization by
imagining a & before the initializing expression. In
the example code above, we see this rule applied to
the initialization of the variable r in line 8, where
we convert the right side of the assignment
statement from z to &z (the address of z, just like in
C).
For a parameter, initialization happens when
we pass the parameter value into the function. This
rule means that instead of passing a copy of the
value of the expression passed in, we pass an arrow
pointing to it. We can see this rule applied to
parameter passing in line 7, where we translate
passing x in the reference-based code into passing
&x in the pointer-based code.
Uses Any other time we use the reference (whether
to name a box on the left side of an assignment
statement or in an expression), we implicitly
dereference the pointer. We can therefore translate
any other use of the reference to a dereference of
the pointer that we translated the reference into.
We can see this rule applied in the example code
above in lines 2, 3, and 9.
Note that sometimes these rules lead to pairs of operations
that “cancel out” (such as &*x, which is just x), when one
reference is initialized from another. This behavior is exactly
correct, as the pointer is just used directly in these cases—the
reference being initialized is another name for whatever “box”
the other reference names. Another important consequence of
the way references work is that a call to a function that returns a
(non-const) reference may be used as the left side of an
assignment statement. To see an example of this principle,
consider the following code (which is a simple example of this
concept, not at all a good OO design):
Code with References
1 class Point {
2 private:
3 int x;
4 public:
5 int & getX() {
6 return x;
7 }
8 };
9 //...
10 //...
11 myPoint.getX() = 3;
Equivalent Code With Pointers
1 class Point {
2 private:
3 int x;
4 public:
5 int * const getX() {
6 return &x;
7 }
8 };
9 //...
10 //...
11 *(myPoint.getX()) = 3;
In the first piece of code (with references),
myPoint.getX() is a valid lvalue, as it names the x box inside
of myPoint. We can see how this behaves by looking at the
equivalent pointer-based code on the right, in which *
(myPoint.getX()) would name the same box in the same way
—by following an arrow to it.
Video 14.2: Executing code with references.
Video 14.2 shows execution of C++ code with references
by thinking about how they translate into pointer-based code.
Video 14.3: Executing swap, written with
references.
As another basic example of references, we can revisit the
swap function we saw in Video 8.3. Recall that in this video, we
saw how to execute a function that swaps two ints by taking
pointers to the boxes that should be swapped. We could instead
accomplish the same task by passing references to our swap
function. Video 14.3 illustrates such code and its execution.
Observe how similar the behavior is to the pointer-based swap
function from Video 8.3.
In describing these rules, we put a caveat of “for the most
part” on them. There are two reasons for this caveat. The first
reason is that a const reference may be initialized from
something that is not an lvalue. Remember that lvalue is the
technical term for “something that names a box.” When
something is not an lvalue, we cannot take its address (as there
is no box to get an arrow pointing to), thus we cannot apply the
translation rules above. To see a concrete example of this,
consider the following C++ code fragment (where most of the
bodies of the functions are elided, as indicated by the ellipses):
1 void someFunction(const int & x) {
2 //...
3 }
4 void anotherFunction(void) {
5 //...
6 someFunction(3);
7 //...
8 }
Passing 3 to someFunction is legal because x is a const int &
(a const int reference)—if it were just int &, and thus not
const, this call would be illegal. However, our translation rules
say that this would be equivalent to the following (illegal) code
with pointers:
1 void someFunction(const int * const x) {
2 //...
3 }
4 void anotherFunction(void) {
5 //...
6 someFunction(&3);
7 //...
8 }
Attempting to compile this pointer-based code results in an
error, as 3 is not an lvalue, thus we cannot take its address.
What actually happens is that the compiler creates a temporary
const int variable and passes the address of that variable,
such as this (note that the temporary variable does not really
have a name, even though we show it with one below):
1 void someFunction(const int * const x) {
2 //...
3 }
4 void anotherFunction(void) {
5 //...
6 {
7 const int tempArg = 3;
8 someFunction(&tempArg);
9 }
10 //...
11 }
Initializing a reference from something that is not an
lvalue is only legal if the reference is constant. Much like a
const pointer, a const reference cannot be used to modify the
box it refers to. Even though we could imagine letting such
code be legal—the temporarily created variable would be
modified, then discarded—such a language design choice is
more likely to lead to mistakes, as it leads to nonsensical code,
such as
1 //this code is not legal
2 void swap(int & x, int & y) {
3 int temp = x;
4 x = y;
5 y = temp;
6 }
7 //....
8 swap(3, 4); //what is this even supposed to mean?
Of course, we can always declare a temporary variable
ourselves, initialize it with whatever value we want, and then
use that variable to initialize a non-const reference.
The other reason for our “for the most part” caveat is that
pointers and references are distinct types. While these rules
give us a translation semantics to understand how references
behave (meaning you know how to execute the code by
applying familiar rules), actually translating to use pointers
instead of references changes the types that variables (and thus
expressions that use them) have in the program. While this
distinction may seem to be subtle and insignificant, it is
actually quite important. As we will see shortly in Section 14.4
and Section 14.5, C++ permits us to declare functions with the
same name but different parameter types, as well as to provide
definitions of many operators (such as +, *, =, ==, etc.) where
user-defined types (or references to user-defined types) are
involved. We can define an operator that operates on two
references but not on two pointers.
14.3 Namespaces
In C, functions and type names (as well as global variables, if
you use them) reside in a global scope, which is visible
throughout the entire program. The only way to restrict the
visibility of a function’s name is to declare it as static,6 which
restricts its visibility to the compilation unit it is declared in. If
a function is not declared static, its name must be unique in
the entire program. If it is static, it may not be used in any
other compilation unit. Such a design is not ideal for large
pieces of software with many developers, as it introduces the
problem of name collisions—developers attempting to name
different functions with the same name. This problem can be
especially bad if developers want to use multiple libraries
(which may be pre-compiled) that have a name collision.
C++ introduces a way to create named scopes, called
namespaces, which can be used from anywhere in the program.
Declarations can be placed inside of a namespace (e.g., one
called somename) by wrapping namespace somename { .... }
around the declarations. For example, we might write:
1 namespace dataAnalysis {
2 class DataPoint { ... };
3 class DataSet { ... };
4 DataSet combineDataSets(DataSet * array, int n);
5 }
While we will not be writing programs large enough that
we need to declare multiple namespaces in this book, knowing
how to use them is important, especially since the C++ standard
library declares its functions and types in the std namespace.
Additionally, many popular C++ libraries declare the functions
that they provide inside of a namespace, exactly to avoid name
collisions between libraries.
There are two ways to reference a name declared inside of
a namespace. The first is to use the scope resolution operator
::. For example, if we want to use the vector class in the C++
standard library, which is in the std namespace, we can
reference it by its fully qualified name, std::vector.
The second way to reference names inside of a namespace
is to open the namespace with the using keyword. The using
keyword instructs the compiler to bring the names from the
requested namespace into the current scope. If we were to write
using namespace std; we would open the entire namespace,
and could just write vector to refer to std::vector.
We can also use using to bring particular items from a
namespace into the current scope. For example, we can write
using std::vector; to bring only the name vector from the
std namespace into scope.
We can open multiple namespaces into the same scope with
using, but if any of the namespaces have functions of the same
name,7 then we must explicitly specify which function we want
when we use it. Note that the compiler only requires you to
explicitly specify which function you want if the “best choice”
of functions with the same name is ambiguous. As we will
discuss shortly, there may be one particular best choice based
on the parameter types of the functions in question.
Opening namespaces is generally regarded as something to
be done sparingly and possibly avoided entirely in large scopes.
As a general rule, the larger the scope, the more wary you
should be of opening an entire namespace. Furthermore, the
more namespaces you already have open in a scope, the more
wary you should be of opening another. Instead, opening only
the particular names you desire (e.g., using std::vector;
instead of just using namespace std;) is typically preferable.
14.4 Function Overloading
In C, we can only have one function of a given name visible at
any time. If we want to write a max function that takes two
doubles and another max function that takes two ints, we must
give them different names. If we try to give them both the same
name—e.g., we write the following code:
1 int max(int x, int y) {
2 if (x > y) {
3 return x;
4 }
5 return y;
6 }
7 double max(double x, double y) {
8 if (x > y) {
9 return x;
10 }
11 return y;
12 }
Then the compiler will give us the error:
error: conflicting types for ’max’
However, in C++ this code is perfectly legal—multiple
functions of the same name are legal if they have different
parameter types. This concept of allowing multiple functions of
the same name is called function overloading. Note that an
overloading is legal if (and only if) the functions can be
distinguished by their parameter types, read as an ordered list.
That is, the following functions all represent valid overloadings
(actual contents of the function omitted, as they are irrelevant,
and of course, f is generally a terrible name for a function):
1 int f(int x, double d) {...}
2 double f(double d, int x) {...}
3 void f(int y) {...}
4 int f(double d) {...}
5 double f(void) {...}
6 int f(int * p) {...}
7 int f(const int * p) {...}
Every one of these functions has a different parameter list
than the others (a const int * is a different type from an
int *). Note that functions that only differ in their return types
and/or parameter names are not valid overloadings, as the
compiler determines which function you are referencing by
looking at the types of the parameters you pass in. Overloading
of member functions (a.k.a. methods) is also legal—a class may
have multiple functions of the same name, as long as they have
different parameter types. Methods that differ only in the
const-ness of this are legal, as that constitutes a different
parameter type.
The compiler must also be able to determine an
unambiguous best choice of which overloaded function you
want from the parameters you pass in. Given the previous
declarations, a call to f(3,3) is illegal, and results in the
following error messages:
error: call of overloaded ’f(int, int)’ is
ambiguous
note: candidates are:
note: int f(int, double)
note: double f(double, int)
Here, the compiler considers converting f(3, 3) to
f(3.0, 3) (which would be a call to double f(double, int))
to be “just as good as” converting it to f(3, 3.0) (which
would be a call to int f(int, double)). The compiler does
not look at how the return type is used in making this
determination either. However, if we had another overloading
of f that took (int, int) as its parameters, that would be
unambiguously the best choice, and the call would be legal.
If that last paragraph seemed a bit complex, let it be a
warning to use function overloading sparingly, if at all. Many
programmers view function overloading as a horrible idea for
several reasons. First, it can confuse the reader of the code, as
you have to find all the possible functions of a given name then
know the rules for determining what is the “best match” to
know what happens in the code. Second, it provides a source of
potential errors in writing the code. The programmer may think
she is calling one function, when, in fact, she is calling another.
Third, it introduces the possibility of very surprising errors
when working code is modified—introducing another function
of the same name may change what the “best choice” is for
some other part of the code, making it so that a piece of code
that was working and has not been modified is no longer
correct.
Note that this issue is closely related to one of the reasons
why opening entire namespaces (especially multiple entire
namespaces) is generally considered a bad idea. Consider the
following code fragment:
1 namespace libraryX {
2 double aFunction(double x) { ... }
3 };
4 namespace moduleY {
5 int somethingElse(int y) { ... }
6 };
7 ...
8 using namespace libraryX;
9 using namespace moduleY;
10 double x = aFunction(2);
11 int z = somethingElse(42);
Here, we open two namespaces, and all of the code is
completely legal. The call to aFunction references
libraryX::aFunction, and the compiler converts 2 to 2.0.
Now, however, suppose that a developer working on moduleY
writes a function of the same name, but it takes an int as a
parameter. This developer is unaware of the names used in
libraryX, which is not a part of the code she is concerned with
(which is part of the advantage of properly used namespaces).
Now, the code reads as follows:
1 namespace libraryX {
2 double aFunction(double x) { ... }
3 };
4 namespace moduleY {
5 int somethingElse(int y) { ... }
6 int aFunction(int x) { ... }
7 };
8 ...
9 using namespace libraryX;
10 using namespace moduleY;
11 double x = aFunction(2);
12 int z = somethingElse(42);
Here, there are two aFunctions in scope; however,
moduleY::aFunction is unambiguously a better choice than
libraryX::aFunction, so the compiler will select
moduleY::aFunction as the target of the call (and then convert
the returned result from an int to a double). Calling this other
function (which likely behaves quite differently) breaks the
code in a surprising, and thus hard-to-debug way.
If you are going to use function overloading, you should
follow a few simple guidelines. First, you should only overload
functions when they perform the same task but on different
types. For example, our max functions that we described earlier
do the same computation but on different types. Second, you
should only overload functions in such a way that
understanding what the “best choice” is for a particular call is
straightforward.
14.4.1 Name Mangling
When you compile a C++ program, the C++ compiler generates
assembly, which is then assembled into an object file. As with
C programs, the linker then links together the object files,
resolving symbol references. The linker does not understand
function overloading, nor does it have type information
available to it. Instead, the C++ compiler must ensure that the
names of the symbols the linker sees are unique. To accomplish
this goal, the C++ compiler performs name mangling—
adjusting the function names to encode the parameter type
information, as well as what class and namespace the function
resides inside of—so that each name is unique.
While you generally do not need to know the specific
details of name mangling, it is useful to understand that it
happens, and that C does not mangle names. If you mix C and
C++ code, the C++ compiler must be explicitly informed of
functions that were compiled with a C compiler by declaring
them extern "C", such as
1 extern "C" {
2 void someFunction(int x);
3 int anotherFunction(const char * str);
4 }
Note that main is treated specially (e.g., generally as if it
were declared extern "C"), as it may be declared with or
without parameters for the command line arguments, and is
called by the startup library, which is frequently written in C.
14.5 Operator Overloading
C++ takes function overloading one step further, allowing
operator overloading. You can write “function declarations”
that define the behavior of the operators (such as +, -, *, ++, =,
and many others) when at least one user-defined type (e.g.
class) is involved. For example, we might write:
1 class Point {
2 private:
3 int x;
4 int y;
5 public:
6 Point operator+(const Point & rhs) const {
7 Point ans;
8 ans.x = x + rhs.x;
9 ans.y = y + rhs.y;
10 return ans;
11 }
12 };
Here, we have defined an overloading of the + operator
inside of the Point class, which takes two points as arguments.
The first Point is this (the implicit argument that points at the
object a method is invoked upon), which points at the left-hand
operand of the + operator. The second operand is a const
reference to the right-hand operand of the + operator (named
rhs, since it is the right-hand-side operand of the operator). We
pass a const reference rather than the value so that we avoid
copying the entire object as we pass the parameter (which is
generally the correct way to pass the right-operand argument of
an overloaded operator).
Operator overloading is quite common in C++; however,
as with function overloading, it should only be used when it is
appropriate (not just any time you can). C++ does not enforce
any rules about what the overloaded operators do, but they
should obey common sense. If you overload +, it should
correspond in some way to addition. For example, if you have a
class for a matrix (in the mathematical sense), it is logical to
overload the + operator to add two matrices. Such
an overloaded operator would be declared inside of the Matrix
class as
1 Matrix operator+(const Matrix & rhs) {
2 //implementation goes here
3 }
Note that the first (left) operand is the this object.
Likewise, if we overloaded the += operator on a Matrix, we
would expect it to add another Matrix to the current one,
updating the current matrix with the sum. Such an overloaded
operator would be declared inside of the Matrix class as:
1 Matrix & operator+=(const Matrix & rhs) {
2 //implementation goes here
3 return *this;
4 }
Note that in the case of operators that modify the object
they are inside of, such as +=, they return a reference to that
object. That is, their return type is a reference to their own
class type, and they return *this. Note that when returning
*this, a reference (the return value) is being initialized, thus
the address is implicitly taken; that is, the pointer is &*this,
which is just this. The reason why such operators return
a reference to the object (rather than void) is that a = b += c;
is legal, even if we strongly discourage writing such code (it
performs b += c; then a = b; and is better written as two
statements).
As we mentioned earlier, const versus non-const
functions constitute valid overloadings. For example, in our
Matrix class, we might overload the indexing operator ([]) to
give us a row of the matrix (assume we have some class,
MatrixRow which represents one row of the matrix). We might
wish to provide two versions of this operator, one that is non-
const and returns a non-const MatrixRow reference, and one
that is const and returns a const MatrixRow reference:
1 MatrixRow & operator[](int index) {
2 //code here
3 }
4 const MatrixRow & operator[](int index) const {
5 //implementation here
6 }
Such an implementation allows us to get a const (i.e.,
read-only) row from a const Matrix, allowing us to read the
row (and thus presumably the elements) out of it but not modify
them. However, if we have a non-const Matrix, we can get a
non-const MatrixRow, allowing us to read or write the
elements of the Matrix.
Video 14.4: Executing code with an overloaded
operator.
Executing code with overloading operators is primarily a
matter of thinking of the overloaded operators as functions and
using the rules you are well familiar with (with the additional
rule about passing this, which we just learned for operators
that are members of classes). Video 14.4 illustrates.
14.6 Other Aspects of Switching to C++
There are a variety of small topics that are all important to
know as you get started in C++ programming. We cover several
of these here.
14.6.1 Compilation
Instead of compiling with GCC, C++ programs are compiled
with G++ (g++ on the command line). For the most part, the
options we have discussed for GCC work for G++; however,
the option for the language standard is different. For everything
we are going to do in this book, -std=gnu++98 will work fine,
specifying the GNU extensions to the C++98 standard, which is
the default. However, if you want to use any of the new features
in C++11, you should specify -std=gnu++11. For a list of
C++11 features and their availability in various versions of
G++, consult the documentation page:
https://gcc.gnu.org/projects/cxx0x.html. We also give an
overview in Section E.6.
14.6.2 The bool Type
C++ has an actual bool type with values true and false (C99
has _Bool and stdbool.h typedefs it to bool, as well as
defining true and false; however, bool, true, and false are
actually part of the language in C++). C++ will still respect C-
style use of an integer as meaning false if the integer is 0 and
true otherwise; however, you should generally use bool for the
type of parameters and variables that represent a boolean value.
14.6.3 void Pointers
In C, we can assign any pointer type to a void pointer and
assign a void pointer to any pointer type without a cast. C++
removes this flexibility and requires an explicit cast. While this
change may seem annoying, other new features that C++
provides make it less cumbersome than it may seem, and it is in
fact eminently reasonable. As we will see in Chapter 15, C++
provides a new mechanism for dynamic allocation, which
returns a correctly typed pointer instead of a void *. C++ also
provides a nicer mechanism for referencing data of any type,
via templates (which we will discuss in Chapter 17), which
allow classes and functions to be parameterized over one or
more types (e.g., what type of data they hold or operate on).
Being able to write classes that can hold any type using
templates (rather than void *s) is one of the compelling
reasons to switch to C++ before learning about data structures
in Part III.
14.6.4 Standard Headers
In C, the standard header files end in .h (e.g., stdio.h,
stdlib.h). In C++, the standard headers do not have any dot
suffix. For example, C++ has a header file for the std::vector
class (which we have mentioned, but not delved into).
However, this header file is not vector.h; it is simply vector.
That is, you would include it with #include <vector>.
You can still include the C standard header files if you
want. However, as these header files put names into the global
scope (rather than the std namespace), they are not the preferred
way to include the C standard library. Instead, C++ provides a
header file that has the same name as the C header file but starts
with a c and does not end with .h (e.g., cstdlib for stdlib.h
and cstdio for stdio.h). These versions of the header files
provide similar definitions, except they are placed in the std
namespace (so cstdio would provide std::printf).
14.6.5 Code Organization
In C, we had the organization that a header file declared the
interface, and the corresponding C source file defined the
implementation. In C++, a similar arrangement occurs with
header files and C++ source files (which end with a .cpp
extension). In C++, class declarations typically occur in header
files (as they describe the interface of the class). It is generally
considered to be fine to write the implementation of very short
methods directly inside of the class declaration in the header
file. For example, if you want to have a private field that can be
read outside the class, you would write a public accessor (a.k.a.
“getter”) method. Such a method might look like
int getX() const {return x;}. Such a method is short
enough that it is fine to write it inside the class declaration in
the header file.
The remaining methods need their implementations
written in the .cpp file. However, when we write these
methods, we must specify which class they belong in. We do so
with the scope resolution operator (::) to give the fully
qualified name (Classname::methodName) in the declaration.
For example, if we wanted to write the implementations of our
BankAccount class in a separate .cpp file, we would write the
following in the header file:
1 class BankAccount {
2 private:
3 double balance;
4 public:
5 void deposit(double amount);
6 double withdraw(double desiredAmount);
7 double getBalance() const;
8 void initAccount();
9 };
Then, in the .cpp file, we would write:
1 #include "bank.h"
2
3 void BankAccount::deposit(double amount) {
4 balance += amount;
5 }
6
7 double BankAccount::withdraw(double desiredAmount) {
8 if (desiredAmount <= balance) {
9 balance -= desiredAmount;
10 return desiredAmount;
11 }
12 else {
13 double actualAmount = balance;
14 balance = 0;
15 return actualAmount;
16 }
17 }
18
19 double BankAccount::getBalance() const {
20 return balance;
21 }
22
23 void BankAccount::initAccount() {
24 balance = 0; //we will see a better way to do this
25 }
We will note that there is no rule enforced by the compiler
about how long of a method is acceptable inside the class
declaration. Instead, such a decision is governed by coding
standards in whatever organization you work in or your own
personal preference. However, the guiding principle in making
such a decision is that someone reading the header file should
easily be able to see the interface for the class (i.e., how they
can use it) without the declaration being cluttered up by method
implementations.
14.6.6 Default Values
C++ allows default values to be specified for some or all of the
parameters of a function (or method). These default values
provide the caller of the function in question with the ability to
omit the arguments for certain parameters if the default values
are desired. The arguments that are omitted must be the
rightmost arguments of the call. For example, suppose we write
the following function prototype with default parameter values:
1 int f(int x, int y = 3, int z = 4, bool b = false);
Then the following calls are legal and interpreted as noted
in the comments:
1 f(9); //x = 9, y = 3, z = 4, b = false
2 f(9, 8); //x = 9, y = 8, z = 4, b = false
3 f(9, 8, 7); //x = 9, y = 8, z = 7, b = false
4 f(9, 8, 7, true); //x = 9, y = 8, z = 7, b = true
When reading code with default parameters, you make the
function call the same way as normal but copy the default
values into the frame as if they were passed normally (that is, as
if all the arguments were specified).
We will note that default parameter values are often
abused by novice programmers (save some typing! put
whatever I use most often as a default!). You should only use
them when you really mean for that to be the default behavior
of a function. For example, a reasonable use would be:
1 int doSomething(int arg1, int arg2, bool verboseDebug = false);
In this example, not having verbose debugging is a good
default behavior. By providing this parameter, we could turn on
verbose debugging on a call-by-call basis, by passing true for
that parameter.
We will also note that the values of default parameters
used at a call site are based on what is “seen” by the compiler at
that point in the code. That is, if you declare the function in the
header file that was included without default values, then you
cannot make use of them, even if the implementation declares
them. Consequently, if you are going to use default parameters,
you should specify them in the prototype in the header file, and
only in the prototype in the header file.
14.6.7 Reference Material
The man pages are the best reference material for C. However,
for C++, you often know what class you want to work with, and
wish to look at the methods inside of it.
http://www.cplusplus.com/ provides an excellent reference
for the C++ library in this format—you can look up the page
for a particular class and then examine a list of the methods in
that class to find what you need.
14.7 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 14.1 : What is an object? What is a class?
How are the two related?
• Question 14.2 : What is the difference between a struct
and a class?
• Question 14.3 : What is the this pointer? How do you
determine what it points at?
• Question 14.4 : What does const mean when it appears
after the parameter list but before the open curly brace
of a function body, as in void getX() const {...}?
• Question 14.5 : What is a reference? How is it like a
pointer? How is it different?
• Question 14.6 : What is operator overloading?
• Question 14.7 : When you overload an operator that
modifies the object (such as +=), what type and value
should your operator typically return?
• Question 14.8 : What is the output when the following
code is executed?
1 #include <cstdio>
2 #include <cstdlib>
3
4 class Point {
5 private:
6 int x;
7 int y;
8 public:
9 void setLocation(int newX, int newY) {
10 x = newX;
11 y = newY;
12 }
13 int getX() const {
14 return x;
15 }
16 int getY() const {
17 return y;
18 }
19 };
20 void printPoint(const char * name, const Point & p) {
21 printf("%s: (%d,%d)\n", name, p.getX(), p.getY());
22 }
23 void f(Point & p) {
24 printPoint("p", p);
25 p.setLocation(p.getX() + 2, p.get
26 }
27
28 int main(void) {
29 Point p1;
30 Point p2;
31 p1.setLocation(2,4);
32 p2.setLocation(3,5);
33 f(p1);
34 f(p2);
35 printPoint("p1", p1);
36 printPoint("p2", p2);
37 return EXIT_SUCCESS;
38 }
• Question 14.9 : Write a class for a Square, which has
the following members:
– A private double for the edge length
– A public method void setEdgeLength(double)
that uses assert to check that the passed-in edge
length is non-negative (and aborts the program if
not). If the edge length is non-negative, it sets
the edge length field to the passed-in value.
– A public method
double getEdgeLength() const, which returns
the edge length.
– A public method double getArea() const,
which returns the area of the square.
– A public method
double getPerimeter() const, which returns
the perimeter of the square.
Test your code with the following main:
1 int main(void) {
2 Square squares[4];
3 for (int i = 0; i < 4; i++) {
4 squares[i].setEdgeLength(2 * i + 2);
5 }
6 for (int i = 0; i < 4; i++) {
7 printf("Square %d has edge length %f\n", i, squares[i].getEdgeLength());
8 printf(" and area %f\n", squares[i].getArea());
9 printf(" and perimeter %f\n", squares[i].getPerimeter());
10 }
11 printf("Trying to set a negative edge length (should abort)\n");
12 squares[0].setEdgeLength(-1);
13 return EXIT_FAILURE;
14 }
Which should output the following (to stdout):
Square 0 has edge length 2.000000
and area 4.000000
and perimeter 8.000000
Square 1 has edge length 4.000000
and area 16.000000
and perimeter 16.000000
Square 2 has edge length 6.000000
and area 36.000000
and perimeter 24.000000
Square 3 has edge length 8.000000
and area 64.000000
and perimeter 32.000000
Trying to set a negative edge length
(should abort)
then print an assertion failure message and abort.
• Question 14.10 : Take the Point class from Question
14.7 and add three overloaded operators to it:
– A += operator, which takes a const Point &
and increases this Point’s x by the passed-in
Point’s x and this Point’s y by the passed-in
Point’s y. It should then return a reference to
this object.
– A == operator, which takes a const Point &
and determines if it has the same coordinates as
this Point.
– A *= operator, which takes an int and scales
(multiplies) this Point’s x and y by the passed-in
integer. This operator should return a reference
to this Point.
Write a main to test your code.
• Question 14.11 : What is the output when the following
code is executed?
1 #include <cstdio>
2 #include <cstdlib>
3
4 void f(int & y, int * z) {
5 printf("y = %d, *z = %d\n", y, *z);
6 y += *z;
7 *z = 42;
8 }
9
10 int main(void) {
11 int a = 3;
12 int b = 4;
13 int & x = a;
14 x = b;
15 printf("a = %d, b = %d, x = %d\n", a, b, x);
16 f(b, &x);
17 printf("a = %d, b = %d, x = %d\n", a, b, x);
18 return EXIT_SUCCESS;
19 }
Chapter 15
Object Creation and
Destruction
One of the benefits of object-oriented languages (such as C++)
is the ability to design classes such that the privacy of their data
can ensure their invariants are always maintained. By keeping
fields private, the only way they can be manipulated is through
the public interface of the class, which only changes the fields
in ways that respect the invariants of the objects. Of course, the
implementations of these methods must be written correctly to
achieve these goals; however, the designer of the class does not
need to worry about unrelated code modifying the object in
unexpected ways.
While the access restrictions help designers maintain
invariants, in order to be truly useful, objects must be able to
initialize their state properly as well—the invariants must be
initially established before they can be maintained. An idea that
goes nowhere would be to have the code creating the object
initialize its fields directly. Such an approach would not only
require relaxation of the visibility restrictions to allow the code
direct access to the private fields but would also require the
class designer to trust external code to set up the initial state
correctly.
A slightly better approach would be to have each class
provide a public method (e.g., called initialize) that code should
call immediately after allocating an object. This approach,
which is what we showed in our example in the previous
chapter (because we will not learn the right way until this
chapter), would mean that we would need to do something like
this:
1 BankAccount * account = malloc(sizeof(*account));
2 account->initialize();
However, this approach has several problems. First, it is
easy for a programmer to forget to call the initialize method
when creating an object. Such forgetfulness would lead to the
fields of the object being used uninitialized, which you are
already aware results in the worst sort of errors—those that
only show up sometimes. The second problem with this
approach is that we cannot enforce that initialize is only
called on a newly created object. As the method is public, it
could be called by any other piece of code at any time.
We also have a similar problem when an object is about to
be destroyed. Instead of initializing the state, we want to clean
up any resources in use by the object (e.g., freeing any memory
that only it has references to, closing files that it has open, etc.).
As with initialization, we might imagine each class providing a
public cleanup method, but this approach suffers similar
problems to its initialization counterpart. As with initialization,
a programmer may forget to call cleanup or call it at an
inappropriate time.
15.1 Object Construction
The approach of having a particular method to initialize the
object could be fixed if we could make this method “special” in
such a way that (1) it is always called when you create an
object, (2) it cannot be called directly (at any time) by the
programmer but instead can only be called during object
creation. C++ (and many other object-oriented languages) take
exactly this approach—these “special methods” are called
constructors. In C++, a constructor has no return type (not even
void) and the same name as the class it is inside of. For
example, if we wanted to change our BankAccount class from
the last chapter to use a constructor, it would look like this:
1 class BankAccount {
2 private:
3 double balance;
4 public:
5 BankAccount() {
6 balance = 0;
7 }
8 //other methods remain the same
9 };
Of course, we could also write the BankAccount
constructor outside of the class declaration, in which case, we
would write:
1 BankAccount::BankAccount() {
2 balance = 0;
3 }
Video 15.1: Executing code that creates an
object with a constructor.
Now, if we declare a variable of type BankAccount, C++
will automatically call the constructor for that variable when
the “box” for the variable is created. As the constructor is
“inside of” an object, it has an implicit this parameter, which
points at the newly created object. Video 15.1 shows the
execution of code in which an object with a constructor is
created.
Note that the constructor only happens when a value
whose type is a class is created. If you create a pointer (or
reference) to an object, then no new object is created, just a
pointer, so no constructor is run.
15.1.1 Overloading
As with all other functions in C++, constructors can be
overloaded. We might want to overload constructors so that we
can allow different ways to initialize an object, based on
different information. For example, in our BankAccount class,
we might want to write a second constructor that takes in an
initial balance and initializes the balance field appropriately:
1 BankAccount::BankAccount() {
2 balance = 0;
3 }
4 BankAccount::BankAccount(double initialBalance) {
5 balance = initialBalance;
6 }
The constructor we have seen so far that takes no
parameters has a special name: the default constructor. If you
do not write any constructors in a class, the C++ compiler will
provide a default constructor that basically behaves as if you
declared it like this:
1 class MyClass {
2 public:
3 MyClass() {}
4 };
Note that if you write any other constructor, the C++ compiler
does not provide this constructor for you.
If you write any constructors, the class is non-POD (it
requires “special” initialization—not just allocation of a chunk
of bytes) by virtue of having a constructor. If you do not write a
constructor and the compiler provides an implicit default
constructor, the class is non-POD if that constructor turns out to
be nontrivial. If the constructor is trivial, then it does not
automatically make the class non-POD, but other aspects of the
class may, of course, make it non-POD. The non-technical
description of a constructor being “nontrivial” is that it “does
something.” The provided constructor looks like it obviously
does nothing; however, as we will discuss in Section 15.1.4,
there may be some implicit initializations that happen in such a
constructor.
A class type that has a public default constructor—whether
explicitly declared or automatically provided—is called a
default constructible class. Many pieces of the C++ library only
work with classes that are default constructible—if you try to
use them with a class that is not default constructible, you will
get a compiler error. Generally, you want your class to be
default constructible unless you have a compelling reason not
to.
To make use of a constructor other than the default
constructor when initializing an object, you must specify the
arguments you wish to pass to the constructor. For local
variables, you do so by placing parentheses with the argument
after the name of the new variable, for example:
1 BankAccount myAccount(42.3); //pass 42.3 as initialBalance
Note that while it may be tempting (for consistency) to
write empty parentheses after a variable we wish to construct
via the default constructor, this approach unfortunately does not
work: BankAccount x(); is interpreted as a function named x
that takes no parameters and returns a BankAccount.
15.1.2 Dynamic Allocation
Suppose that instead of wanting to create a local variable for a
BankAccount, we wanted to dynamically allocate a
BankAccount object. In C, we would use malloc to allocate the
proper number of bytes (i.e. malloc(sizeof(BankAccount)),
and everything would work fine. However, now that our
BankAccount class has a constructor, it is not a POD class, so
using malloc will not work properly. Recall from Section
14.1.4 that (basically) a class is only POD (“plain old data”) if
it has a direct C analog. C does not have anything resembling
constructors—there is no way to have structs initialized
automatically whenever they are allocated.
The underlying reason why malloc will not work properly
(in this case) is that it will not run the constructor. There is no
way for malloc to actually know what type of object it is
allocating space for and thus no way for it to call the proper
constructor—recall that the compiler will just evaluate
sizeof(BankAccount) to a size_t, and malloc will be called
with that integer as a parameter. malloc will simply allocate a
contiguous sequence of the number of bytes you have requested
of it.
However, the whole point of having the constructor in the
BankAccount class was to guarantee that it would be properly
initialized. By virtue of having a constructor (or possibly
multiple constructors), a BankAccount is no longer just “a
bunch of fields”—it also includes a procedure for setting up the
initial state of those fields. It does not behave like a C struct
anymore, therefore using C’s tools to just allocate “a chunk of
memory” is inappropriate.
In C++, the proper way to allocate memory dynamically is
to use the new operator. For example, we might write:
1 BankAccount * accountPtr = new BankAccount();
Evaluating the new BankAccount() expression allocates
memory for one object of type
BankAccount and calls the default (no argument) constructor to
initialize the newly created object. This expression evaluates to
a pointer to the newly allocated object and has type
BankAccount * (contrast this type, which is completely
accurate, to void *, which is what malloc returns). If we need
to pass arguments to the constructor for BankAccount, we can
do so by placing them in the parentheses:
1 BankAccount * accountPtr = new BankAccount(initialBalance);
We can also use the new[] operator to allocate space for an
array. For example, if we wanted an array of 42 BankAccounts,
we could do:
1 BankAccount * accountArray = new BankAccount[42];
This code would allocate space for 42 consecutive
BankAccount objects and invoke the default constructor on each
of them in ascending order of index. If you create an array of
objects, you can only have them constructed with their default
constructor. If the class does not have a default constructor or
you want to initialize the elements of the array with some other
constructor, you need to create an array of pointers and then
write a loop to create each object and put the pointer to it into
the array:
1 BankAccount ** accountPointerArray = new BankAccount *[42];
2 for (int i = 0; i < 42; i++) {
3 accountPointerArray[i] = new BankAccount(initialBalance);
4 }
Video 15.2: Execution of code with new and
new[].
Video 15.2 illustrates the execution of code with new and
new[].
15.1.3 Types of Initialization
One aspect of C++ object creation that is often misunderstood
is the difference between this line of code:
1 BankAccount * accountPtr = new BankAccount(); //parentheses
and this line of code:
1 BankAccount * accountPtr = new BankAccount; //no parentheses
The two appear quite similar, but one has parentheses after
BankAccount and the other does not. Odds are good that if you
ask 20 different C++ programmers what the difference is, you
will get 15 different answers, and none will be correct. We will
explain the difference, not so much because you need to
remember it to write good programs, but rather because it
strongly motivates doing what you should do anyways.
These two uses of new make use of two different types of
initialization. The first (with the parentheses) uses value
initialization, and the second uses default initialization. Each of
these has different behavior, and the specifics depend on
whether the type being initialized is POD or non-POD.
When value initialization is used, a class with a default
constructor is initialized by its default constructor. A class
without any constructor has every field value initialized. Non-
class types are zero initialized. Arrays have their elements value
initialized.
When default initialization is used, non-POD types are
initialized by their default constructor. POD types are left
uninitialized. Arrays have their elements default initialized.
Believe it or not, that is actually a simplification of the
rules. However, most C++ programmers get by just fine
without understanding them. How can this be? The best
approach is to just always include a default constructor in your
classes. Notice that the main similarity between the two is that
classes with a default constructor that you wrote (remember:
having a programmer-defined constructor makes the class a
non-POD type) will be initialized by their default constructor
under either scheme. If you write a default constructor, you do
not need to remember the distinctions, since both will do the
same thing—using that constructor.
15.1.4 Initializer Lists
While our constructor for the BankAccount works fine, it makes
use of traditional assignment statements, rather than the
preferred C++ approach, which is to use an initializer list. An
initializer list is a list of initializations written by placing a
colon after the close parenthesis of the constructor’s parameter
list and writing a sequence of initializations before the open
curly brace of the function’s body. Each initialization takes the
form name(values) that we just saw for constructing objects
with parameters to their constructors. For example, our
BankAccount’s constructors would be rewritten as follows:
1 BankAccount::BankAccount() : balance(0) {}
2 BankAccount::BankAccount(double initBal) : balance(initBal) {}
Video 15.3: Creating objects whose constructors
use initializer lists.
Video 15.3 shows the execution of code where objects
whose constructors make use of initializer lists are created. This
video also shows the behavior of a field declared as static,
which we discussed briefly in Section 14.1.5.
A natural question for a C programmer learning C++ is
“Why should I bother with initialization lists? Assignment
statements seem to work just fine…” However, there are
several reasons why initialization lists are preferable. Most of
the reasons are based on the fact that C++ makes a
distinction between initialization and assignment. Any
assignment statements in the constructor are treated as regular
assignment statements, while initializers in the initialization list
are treated as initializations.
One important way that this distinction between
initialization and assignment matters is that C++ ensures all
fields have some form of initialization before the open curly
brace of the constructor. If you specify the initialization you
want in the initialization list, then you get exactly what you
want. Otherwise, the field is default initialized. However,
remember that default initialization for POD types leaves them
with unspecified values.
The distinction between assignment statements and
initialization is critical here, as an assignment statement in the
constructor does not specify the initialization behavior. Instead,
if you place an assignment statement that assigns to the field in
the body of the constructor, it will assign to the already created
object, changing its fields. Depending on the exact details, you
may get the behavior you want (e.g., if the field has a POD
type). You may get the behavior you want, except that it is
slower (from creating, then overwriting the object). You may
also get undesired behavior, if constructing and/or destroying
the objects has some noticeable effects.
Another case where the distinction between initializing
and assigning to a field matters is for references. Recall that
initializing a reference sets what the reference refers to (points
at), while assigning to the reference implicitly dereferences the
reference and assigns to whatever it refers to. If your class has
fields of a reference type, you must initialize them in the
initializer list. Otherwise, you will receive an error message
like:
uninitialized reference member
’ClassName::fieldName’
A third case where this distinction matters is if you have a
const field (i.e. one whose type includes a const modifier,
which does not allow its value to be changed). A const field
must be initialized in the initializer list and may not be
assigned to anywhere. This requirement exists so that the
compiler may ensure the const field will be initialized but
never changed. If you fail to initialize a const field in an
initializer list, you will get an error message like this:
uninitialized member ’ClassName::fieldName’ with
’const’ type ’const int’
The best practice for C++ is to use the initializer list to
initialize the fields of the object. You may still need to write
code in the body of the constructor if more complex setup is
needed after the fields are initialized.
There is one final detail about initializer lists you should
understand, as not knowing about it often leads to confusing
warnings or errors. The order in which fields are initialized by
the initializer list is the order in which they are declared in the
class, not the order their initializers are written in the initializer
list. For example, if we wrote the following code:
1 class Something {
2 private:
3 int x;
4 int y;
5 int z;
6 public:
7 Something(int a, int b, int c) : z(c), y(b), x(a) {}
8 };
The order of initialization would be x(a), y(b), z(c) because
that is the order the fields are declared in, even though it is not
the order the initializations appear in the list. Although this may
seem like an insignificant technicality, it matters in cases where
one field is used to initialize another.
Consider the slightly different code below:
1 class Something {
2 private:
3 int x;
4 int y;
5 int z;
6 public:
7 Something(int a, int b) : z(a), y(b), x(z + y) {}
8 };
Here, the fact that x is initialized first is quite significant. The
expression from which we initialize x is the sum of z + y,
neither of which have been initialized yet, so x gets initialized
to some unknown value! Note that if you compile with
warnings turned off, this code will compile just fine (despite the
serious problem lurking in its initializer list). If you compile
with -Wall -Werror, then the compiler will produce an error
message so you know to fix your problem:
In constructor ’Something::Something(int, int)’:
’Something::z’ will be initialized after
’int Something::y’
when initialized here
’Something::y’ will be initialized after
’int Something::x’
when initialized here
Whenever you encounter this type of error message, fix
the ordering of your initializer list, and make sure you are not
using uninitialized fields to initialize other fields.
Video 15.4: Initialization list ordering and
implicit initialization of non-POD fields.
Video 15.4 shows the execution of some more code with
initializer lists. This video shows both initializer lists whose
elements are in a different order from their declarations and the
implicit initialization of non-POD types when they are omitted
from the initializer list.
15.1.5 What to Do
Object construction in C++ is a fairly complex topic. This
section has a significant amount of information, including more
deep technical details than we typically like to give in this
book. However, these technical details provide the motivation
for the why of several things that are generally important to do.
Without this information, telling you what to do would seem
like an arbitrary set of rules without any explanation for the
reasons.
Although there are exceptions to the following rules, you
should understand much more about the details of object
construction, POD versus non-POD types, and how objects are
implemented before you break them (complete mastery of all
the material in this entire book is a good start, but you need to
know a bit more about the internals of C++ than we will cover
to truly know when it is safe to break some of these rules):
Make your classes default constructible. Write a public
default constructor in every class that you write.
Make that constructor initialize the object in a sane
way.
Use initializer lists to initialize your class’s fields. Initia
lize the fields of your class in the initializer list. Do
not try to initialize them in the constructor’s body
with assignment statements. The two are different.
In the initializer list, explicitly initialize every field. Do
not rely on the implicit default initialization for any
field. Explicitly initialize it to the value you want.
If you think you do not care if it is uninitialized,
pick some value to initialize it to—it will make
testing and debugging that much easier to
guarantee the field has one particular value.
Initialize the fields in the order they are declared. You
should be compiling with -Wall -Werror
anyways, so doing anything else should produce an
error.
Use new and new[], not malloc. Using malloc on a type
that is not POD will result in problems. Depending
on why the type is not a POD type, you may
experience a variety of strange and difficult-to-
debug behaviors (understanding what is going on
and why requires understanding of the internals of
C++, which is what you are violating by using
malloc with a non-POD type). In C++, just always
use new and new[] (there is never a need to use malloc,
only danger in doing so).
15.2 Object Destruction
In much the same way that we would like to have our classes be
able to specify an initialization procedure that is guaranteed to
happen (and guaranteed to happen only when an object is being
created), we would like to be able to specify a cleanup
procedure for object destruction (and only for object
destruction). Such a procedure is called a destructor. A
destructor is named the same as the class it appears in, except
with a tilde (~) (which is typically shift plus the key left of the 1
key). Like a constructor, a destructor has no return type, not
even void. Unlike constructors, destructors may not be
overloaded. A class may only have one destructor, and it must
take no parameters. As the destructor is inside the class, it
receives an implicit this parameter, just like any other member
function. Destructors are typically public but may be private—
however, if they are, only that class can destroy objects of that
type (thus, the usefulness of private destructors is quite
limited).
At present, it would not make sense to add a destructor to
our BankAccount class—there is nothing that it would need to
do. However, if we imagined our class having some
dynamically allocated memory associated with it, such as a
transaction history stored as a dynamically allocated array, then
it would make sense:
1 class BankAccount {
2 private:
3 Transaction * transactionHistory; //array of transactions
4 int numTransactions;
5 double balance;
6 void addTransaction(double amount, const char * message) {
7 Transaction * temp = new Transaction[numTransactions + 1];
8 for (int i = 0; i < numTransactions; i++) {
9 temp[i] = transactionHistory[i];
10 }
11 temp[numTransactions].message = message;
12 temp[numTransactions].amount = amount;
13 gettimeofday(&temp[numTransactions].when, NULL);
14 Transaction * old = transactionHistor
15 transactionHistory = temp;
16 numTransactions++;
17 delete[] old;
18 }
19 public:
20 BankAccount() : transactionHistory(NULL), numTransactions(0),
21 balance(0) {}
22 ~BankAccount() {
23 delete[] transactionHistory;
24 }
25 void deposit(double amount) {
26 addTransaction(amount, "Deposit");
27 balance += amount;
28 }
29 double withdraw(double desiredAmount) {
30 if (desiredAmount <= balance) {
31 addTransaction(-desiredAmount, "Withdraw");
32 balance -= desiredAmount;
33 return desiredAmount;
34 }
35 else {
36 double actualAmount = balance;
37 addTransaction(-actualAmount, "Withdraw");
38 balance = 0;
39 return actualAmount;
40 }
41 }
42 double getBalance() const {
43 return balance;
44 }
45 };
Here, our BankAccount class tracks an array of all
transactions it has performed. Every time the withdraw or
deposit methods are called, they call the private
addTransaction method to record the transaction in this
history. This method then reallocates the array to be larger,
adds the new entry to the end, and uses the delete[] operator
to free the memory from the old array. delete and delete[]
free the memory allocated by new and new[] respectively. We
will discuss them more in a moment.
Before we proceed, we should briefly note that if we
wanted to do this in a real C++ program, we would want to use the
built-in vector class, which provides an array-like interface,
but also has the ability to add more elements to the end
(causing it to grow). However, we are not yet ready to discuss
vector, as it is a templated class, and we will not learn about
templates until Chapter 17. We will note that, unfortunately,
new[] does not have a realloc analog (for good reason:
remember we can move POD around at will, but not non-POD
types).
We should also take a moment to note that this example
illustrates the use of visibility restrictions to enforce invariants.
This BankAccount class has the invariant that the current
balance must be equal to the sum of the amounts in the
transaction history (since the history is a log of the changes to
the balance). No external code can modify the balance or
transactionHistory, nor call addTransaction to violate this
invariant. Consequently, we need only convince ourselves that
the class maintains this invariant properly (through a
combination of code inspection, testing, and debugging) to
know that the invariant will hold in the entire program for any
program we write with this class.
15.2.1 When Are Destructors Invoked
A destructor is invoked whenever the “box” for an object is
about to be destroyed. This destruction can happen either due to
dynamic deallocation through delete or delete[], by a local
variable going out of scope, or by one object that contains
another (as a field) being destroyed. Whenever a destructor is
called, the object it is invoked on (i.e., where the this pointer
points when the destructor is called) is the object that is about
to be destroyed. The “box” is only actually destroyed after the
destructor completes.
In the first of these cases, delete and delete[] free the
memory allocated by new and new[] (respectively), just like
free frees the memory allocated by malloc. It is only correct to
use delete to free memory allocated by new and delete[] to
free memory allocated by new[]. Mixing these up can lead
to memory leaks or program crashes. If you deallocate memory
for an array with delete[], the elements of the array have their
destructor (if any) invoked in decreasing order of index—the
opposite order from which they were constructed. In fact, as a
general rule, whenever construction and destruction occur in a
group, the order of destruction is the opposite of the order of
construction.
For local variables (and parameters), their “box” is
destroyed whenever they go out of scope. If the variable’s type
is a class with a destructor (note: not a pointer to a class with a
destructor, in which case, only the pointer is being destroyed,
not the object), then the destructor must be invoked before the
box is destroyed. If multiple variables go out of scope at the
same point, their destructors are invoked in the opposite order
from the order in which they were constructed. We will note
that this rule (combined with the fact that destructors may have
arbitrary effects) means you must be careful about when
variables go out of scope when executing code by hand (or,
more generally, understanding what it is doing).
Consider:
1 #include <cstdio>
2 #include <cstdlib>
3 class Example {
4 int x;
5 public:
6 Example(int i) : x(i) {
std::printf("Created example %d\n", x);
8 }
9 public:
10 ~Example() {
std::printf("Destroyed example %d\n", x);
12 }
13 };
14 int main(void) {
15 Example e1(42);
16 Example e2(99);
17 for (int i = 0; i < 4; i++) {
18 Example eloop(i);
19 }
20 return EXIT_SUCCESS;
21 }
This code does not make a particularly good use of its
constructor or destructor, but having them print a message is
informative for illustrating this point about how object creation
and destruction behaves. The variable eloop goes out of scope
at the close curly brace that ends the for loop, and another
object by the same name is initialized the next time around the
loop. Therefore, this code would print:
Created example 42
Created example 99
Created example 0
Destroyed example 0
Created example 1
Destroyed example 1
Created example 2
Destroyed example 2
Created example 3
Destroyed example 3
Destroyed example 99
Destroyed example 42
Video 15.5: Execution of code with object
destruction.
Video 15.5 illustrates the execution of code with object
destruction, making better use of the destructor. Here, the
Polygon class has a pointer to an array of Points, which needs
to be deleted when the object that owns the array is destroyed.
Fields inside of a class are also destroyed as part of the
destruction of the enclosing object. After the class’s destructor
completes, each of the fields is destroyed in the reverse of the
order in which they were initialized. For any field whose type is
a class with a destructor, that destructor is invoked before the
field is destroyed. Note that, if you have a field that is a pointer,
then the box being destroyed only contains a pointer, not an
object, so no destructor is invoked. If you want to destroy the
object that the pointer points at, you must explicitly delete it in
the class’s destructor.
If you do not explicitly declare a destructor for a class, the
compiler implicitly provides one that looks like:
1 class MyClass {
2 public:
3 ~MyClass() {}
4 };
The automatically supplied destructor is a trivial
destructor if the class has no fields that have nontrivial
destructors. If you explicitly write a destructor (even if it has
nothing in it), that destructor is considered nontrivial. Put
another way, a destructor is trivial if (a) the compiler
automatically supplies it, and (b) it actually does nothing. Any
class with a nontrivial destructor is not a POD type.
Video 15.6: Execution of more complex code
with destructors.
Video 15.6 shows the execution of code with destructors,
where objects have fields that need to be destroyed along with
the object they belong to.
15.3 Object Copying
There are many ways in which programs copy values, such as
when a parameter is passed to a function (its value is copied
into the called function’s frame), when a value is assigned to a
variable (its value is copied into the destination “box”), or when
a value is returned (it is copied out of the returning function’s
frame to be returned to the caller). In C, all of these copies
occur in a simple fashion. All of C’s types are plain old data
(POD), so the underlying numeric representation is directly
copied from one location to another.
Video 15.7: Naïvely copying an object via
parameter passing.
Video 15.8: Naïvely copying an object via
assignment.
In C++, however, there are types that are not POD.
Objects that are not POD cannot simply be copied from one
location to another. Instead, the class may wish to specify a way
that the object should be copied. To see why, Video 15.7 and
Video 15.8 illustrate what happens when we have a non-POD
type and naïvely copy its fields when copying during parameter
passing and when copying by assignment.
In both videos, the fundamental problem is the same: we
cannot just naïvely copy the fields from one object to another.
When we simply copy the points field, both the original object
and the copy point at the same memory. When one object is
destroyed, it frees this memory, leaving the other’s pointer
dangling. In the case of assigning to an existing object, there is
an additional problem, which is not present when creating a
new object as a copy—as we saw in Video 15.8, assigning to p1
leaks the memory it previously pointed to, as we have
overwritten the pointer without freeing that memory.
Instead of always simply making shallow copies, we need
to allow our classes to specify how their objects should be
copied. A shallow copy may suffice for some types (even if
they are not POD), but clearly does not suffice for all types.
C++ distinguishes between two types of copying: copying
during initialization (the copy constructor) and copying during
assignment (the copy assignment operator). This distinction
allows us to free existing resources when assigning to an
existing object. We must do so during assignment to avoid
memory leaks; however, we must not attempt it during object
creation, as the new object's fields do not yet hold valid values
to free.
15.3.1 Copy Constructor
Copying during initialization occurs when a new object is
created as a copy of an old one. This form of copying occurs
when objects are passed to functions by value (as opposed to
passing a reference or pointer), when an object is returned from
a function by value, or explicitly when the programmer writes
another object of the same type as the initializer for a newly
declared object. Whatever the reason for initializing an object
from a copy of another object, the copy constructor is invoked
to perform the copying in the fashion the class defines.
We might modify the Polygon class from Video 15.7 to
have a copy constructor as follows:
class Polygon {
  Point * points;
  size_t numPoints;
public:
  Polygon(size_t n) : points(new Point[n]), numPoints(n) {}
  //copy constructor: makes a deep copy
  Polygon(const Polygon & rhs) : points(new Point[rhs.numPoints]),
                                 numPoints(rhs.numPoints) {
    for (size_t i = 0; i < numPoints; i++) {
      points[i] = rhs.points[i];
    }
  }
  ~Polygon() {
    delete[] points;
  }
};
As with all constructors, the copy constructor is named the
same as the class in which it resides, and it has no return type
(not even void). However, the copy constructor is a very
specific overloading of the constructor. It takes a reference
(generally a const reference) to its own type (in this case, as
const Polygon &). Passing the argument by reference means
that the argument points at the original object, rather than being
a copy of it. If we were to pass the argument as the value of the
object, we would have to make a copy of it just to get it into the
copy constructor! We typically pass a const reference as we
typically should not modify the object that we are making a
copy of. However, we can write a copy constructor that takes a
non-const reference if we have a good reason to do so. If we
desire, we can have multiple copy constructors: one that takes a
const reference, and one that takes a non-const reference.
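To make the copying contexts concrete, here is a minimal sketch using a hypothetical Counter class (not from the text) whose copy constructor tallies every copy, so each of the initialization-by-copy situations described above becomes visible:

```cpp
#include <cassert>

// Hypothetical Counter class: its copy constructor increments a
// static tally, so we can see exactly when copies happen.
class Counter {
public:
  static int copies;
  int value;
  explicit Counter(int v) : value(v) {}
  Counter(const Counter & rhs) : value(rhs.value) { copies++; }
};
int Counter::copies = 0;

// Pass-by-value parameter: copying the argument into this
// function's frame uses the copy constructor.
static int readValue(Counter c) { return c.value; }

// Exercises three initialization-by-copy contexts.
int countCopies() {
  Counter a(5);
  Counter b = a;   // 1: initialization written with an equals sign
  Counter c(a);    // 2: direct initialization from another object
  readValue(a);    // 3: pass by value
  return Counter::copies + (b.value - c.value); // b, c used; difference is 0
}
```

Calling countCopies() performs three copies of a, one for each context; none of them uses the assignment operator, because each copy initializes a brand-new object.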
If we do not explicitly specify a copy constructor, then
C++ automatically provides one. The provided copy
constructor performs as if it contains an initializer list that
initializes every field from the corresponding field in the
argument. That is, in general, it looks like this:
class SomeClass {
  Type1 field1;
  Type2 field2;
  ...
  TypeN fieldN;
public:
  //this is what the provided copy constructor would
  //look like if you did not write any copy constructor
  SomeClass(const SomeClass & rhs) : field1(rhs.field1),
                                     field2(rhs.field2),
                                     ...
                                     fieldN(rhs.fieldN) {}
};
Video 15.9: Executing code with a copy
constructor: parameter-passing example
revisited.
For fields of non-class types, this initialization simply
copies the value. For fields of class types, this initialization
invokes the corresponding copy constructor (which may either
be written by the user or automatically provided). The
automatically generated copy constructor will take a const
reference as an argument if possible (i.e., if all fields have copy
constructors which take const references), otherwise, it will
have a non-const reference as an argument.
As with default constructors, a copy constructor may be
classified as trivial. In order for a copy constructor to be trivial,
it must be automatically provided by the compiler (no user-
defined copy constructors are trivial), and the fields must all
have trivial copy constructors. There are also some other
conditions, which are related to topics we will learn about later.
A trivial copy constructor simply copies the bytes in
memory from one object to another, much as we would copy a
struct in C.
Video 15.9 revisits the earlier example with our updated
Polygon class that contains a copy constructor.
15.3.2 Assignment Operator
The other form of copying that can occur is copying during
assignment. Unlike copying during initialization, copying
during an assignment changes the value of an object that
already exists (and is already initialized) to be a copy of another
object. Copying by assignment makes use of the assignment
operator, operator=. Classes may overload the assignment
operator to specify how their objects should be copied during
assignment.
class Polygon {
  Point * points;
  size_t numPoints;
public:
  Polygon(size_t n) : points(new Point[n]), numPoints(n) {}
  //copy constructor: makes a deep copy
  Polygon(const Polygon & rhs) : points(new Point[rhs.numPoints]),
                                 numPoints(rhs.numPoints) {
    for (size_t i = 0; i < numPoints; i++) {
      points[i] = rhs.points[i];
    }
  }
  Polygon & operator=(const Polygon & rhs) {
    if (this != &rhs) {
      Point * temp = new Point[rhs.numPoints];
      for (size_t i = 0; i < rhs.numPoints; i++) {
        temp[i] = rhs.points[i];
      }
      delete[] points;
      numPoints = rhs.numPoints;
      points = temp;
    }
    return *this;
  }
  ~Polygon() {
    delete[] points;
  }
};
Like the copy constructor, the copy assignment operator
takes a constant reference to its own type. We could overload
the assignment operator on other types. However, we should
only do so when it makes sense. For example, if we were
writing a class for arbitrary size integers (typically called
“BigInt” or “BigNum”), we might overload the assignment
operator to take a normal int. Such an overloading would make
sense (as it allows us to write myNum = 3;); however, it is not
the copy assignment operator that we are concerned with here.
The copy assignment operator also has rather similar
internal behavior to the copy constructor (which makes sense,
as both specify how to copy the object). However, there are
some important distinctions between the two. First, the
assignment operator returns a value. Specifically, it returns a
reference to this object (as do most operators that modify the
current object). Remember that when we initialize a reference
(in this case, the return value) from an object, we implicitly
take the object’s address to get the underlying pointer.
Therefore, the pointer that is the value of the reference returned
is &*this, which is just this.
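Returning a reference to the current object is what makes chained assignment work. The following is a minimal sketch with a hypothetical Cell class (any name would do); a = b = c parses as a = (b = c), so the reference returned by the inner assignment becomes the right-hand side of the outer one:

```cpp
#include <cassert>

// Hypothetical Cell class: operator= returns *this so
// assignments can be chained, just like with built-in types.
class Cell {
public:
  int v;
  Cell(int x) : v(x) {}
  Cell & operator=(const Cell & rhs) {
    v = rhs.v;
    return *this;   // &*this is just this: a reference to this object
  }
};

int chainDemo() {
  Cell a(1), b(2), c(3);
  a = b = c;        // parses as a = (b = c); each = returns Cell &
  return a.v * 100 + b.v * 10 + c.v;  // all three now hold 3
}
```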
The second difference between the assignment operator
and the copy constructor is that the operator begins by checking
if this != &rhs. That is, if the object being assigned to is
distinct from the object copied. As a general rule, we should
check for this condition in writing an assignment operator, as
we might otherwise run into problems (we may delete a
pointer in this object, then use it dangling in rhs). Note that
we perform this check by comparing pointers (i.e., we do
this != &rhs), not values (i.e., do not do *this != &rhs).
Comparing pointers tells us what we want to know—are this
and rhs referencing exactly the same object in memory (also,
comparing pointers is one or two instructions, while calling the
!= operator may involve a complex comparison via an
overloaded != operator).
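The danger the check guards against can be seen in a small sketch. In the hypothetical IntBox class below (an assumption for illustration, not from the text), removing the if (this != &rhs) guard would let the delete[] free the very array the loop is about to read from during a self-assignment like b = b:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical IntBox class: owns a heap array, so self-assignment
// without the pointer check would delete the source data.
class IntBox {
  int * data;
  size_t n;
public:
  IntBox(size_t count, int fill) : data(new int[count]), n(count) {
    for (size_t i = 0; i < n; i++) { data[i] = fill; }
  }
  IntBox & operator=(const IntBox & rhs) {
    if (this != &rhs) {           // compare pointers, not values
      int * temp = new int[rhs.n];
      for (size_t i = 0; i < rhs.n; i++) { temp[i] = rhs.data[i]; }
      delete[] data;
      data = temp;
      n = rhs.n;
    }
    return *this;
  }
  ~IntBox() { delete[] data; }
  int at(size_t i) const { return data[i]; }
private:
  IntBox(const IntBox &); // copying disabled: not needed in this sketch
};
```

With the check in place, a self-assignment is simply a no-op and the data survives intact.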
The third difference relative to the copy constructor is that
the assignment operator must clean up the existing object
before we assign the copied values to it. In the copy
constructor, there is nothing to clean up at the start: the object is
uninitialized. However, in the assignment operator, we are
overwriting an existing object with a copy.
Much like the copy constructor, if the user does not write
an assignment operator, one is automatically provided. The
automatically provided assignment operator generally looks
like this:
class SomeClass {
  Type1 field1;
  Type2 field2;
  ...
  TypeN fieldN;
public:
  //this is what the automatically provided assignment
  //operator would look like if you did not write one
  SomeClass & operator=(const SomeClass & rhs) {
    field1 = rhs.field1;
    field2 = rhs.field2;
    ...
    fieldN = rhs.fieldN;
    return *this;
  }
};
The automatically provided assignment operator does not
check for self-assignment before copying the fields. All it does
is copy each field in the class (using the appropriate assignment
operator for each field’s type) then return a reference to the
object that was assigned to. If you want any other behavior, you
must define the operator yourself. The provided assignment
operator is considered “trivial” under similar circumstances to
when the copy constructor is considered trivial (it is
automatically provided, each field in the class has a trivial
assignment operator, and some other conditions related to
topics we will learn about later).
We might be tempted to write our assignment operator in a
slightly simpler (although, as we shall see, less correct) fashion:
Polygon & operator=(const Polygon & rhs) {
  if (this != &rhs) {
    delete[] points;
    points = new Point[rhs.numPoints];
    for (size_t i = 0; i < rhs.numPoints; i++) {
      points[i] = rhs.points[i];
    }
    numPoints = rhs.numPoints;
  }
  return *this;
}
The two implementations are fairly similar; however, the
second one deletes the existing data first to avoid the use of a
temporary pointer. It then allocates the memory, assigning the
result of new directly to this->points, copies the values from
the source array, and then updates numPoints.
This second implementation will work just fine under
most conditions; however, it is technically less correct because
of its behavior if new fails (e.g., no more memory is available).
We cannot discuss the exact behavior that occurs when new fails
until Chapter 19, when we will learn about exceptions.
However, the important distinction is that if new fails, the first
implementation of the assignment operator leaves the object in
a valid state, but the second implementation leaves the object
with a dangling pointer. We will discuss these issues in much
more detail in Chapter 19.
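One well-known way to get the safe "allocate first, free second" ordering without a temporary pointer in every operator is the copy-and-swap idiom. This is a sketch, not the text's implementation, using a hypothetical Buf class: the copy constructor performs the allocation, so if new fails there, the object being assigned to is never touched:

```cpp
#include <algorithm>  // std::swap
#include <cassert>
#include <cstddef>

// Hypothetical Buf class demonstrating copy-and-swap assignment.
class Buf {
  int * data;
  size_t n;
public:
  Buf(size_t count) : data(new int[count]()), n(count) {}
  Buf(const Buf & rhs) : data(new int[rhs.n]), n(rhs.n) {
    for (size_t i = 0; i < n; i++) { data[i] = rhs.data[i]; }
  }
  Buf & operator=(const Buf & rhs) {
    Buf temp(rhs);               // may throw: *this is still intact
    std::swap(data, temp.data);  // take temp's new array
    std::swap(n, temp.n);
    return *this;                // temp's destructor frees our old array
  }
  ~Buf() { delete[] data; }
  size_t size() const { return n; }
  int & operator[](size_t i) { return data[i]; }
};
```

Note that copy-and-swap does not even need the self-assignment check for correctness (a self-assignment just makes a redundant copy), though adding the check avoids the wasted work.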
15.3.3 Executing Code with Copying
C++’s copy constructors and assignment operators change the
rules of executing code by hand. Whenever you are going to
copy an object, you must first determine whether to use the
copy constructor or assignment operator and then determine
what that constructor/operator does (which may be the
automatically provided constructor/operator discussed
previously). The first decision—whether to use the copy
constructor or assignment operator—is strictly a matter of
whether you are initializing an object that you are newly
creating or changing the values in an existing object.
A seemingly appealing approach would be to just look for
the equals sign, which signifies assignment. However, there is a
problem with this approach—sometimes the equals sign can be
used as part of the initialization of a new object. C++ considers
the following to be initialization, not assignment:
MyClass something = anotherThing;
As this statement is initialization, many programmers
consider it preferable to write the following instead:
MyClass something(anotherThing);
The two have the same exact behavior (the copy
constructor will be used if anotherThing is a MyClass object).
However, the second way of writing this piece of code avoids
confusing the inexperienced C++ programmer with respect to
what happens.
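The distinction can be checked directly. The hypothetical Tracked class below (assumed for illustration) records whether the copy constructor or the copy assignment operator ran, showing that the equals sign in a declaration performs initialization, while the same sign on an existing object performs assignment:

```cpp
#include <cassert>

// Hypothetical Tracked class: counts copy-constructor and
// copy-assignment invocations separately.
class Tracked {
public:
  static int ctorCalls, assignCalls;
  Tracked() {}
  Tracked(const Tracked &) { ctorCalls++; }
  Tracked & operator=(const Tracked &) { assignCalls++; return *this; }
};
int Tracked::ctorCalls = 0;
int Tracked::assignCalls = 0;
```

Declaring Tracked b = a; bumps only ctorCalls; a later b = a; (where b already exists) bumps only assignCalls.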
Once you have determined whether you are using the copy
constructor or the assignment operator, you must determine if
that constructor/operator is trivial or not. Recall that (for now at
least) the constructor/operator is trivial if it was automatically
provided by the compiler, and all constructors/operators that it
makes use of are also trivial. If the constructor/operator is
trivial, then you can simply copy the values directly as you
have been doing in C since Chapter 2.
If the constructor/operator is non-trivial, then you must
call it like a function (passing in the object being
initialized/assigned to as this) and step through its code line-
by-line to execute its effects.
Video 15.10: Executing code with an overloaded
copy assignment operator.
15.3.4 Rule of Three
If you have fully internalized the lessons of this chapter, you
may have noticed that there seems to be a relationship between
needing a destructor, needing a copy constructor, and needing
an assignment operator in a class. If a class needs custom
behavior to destroy an object, then that class needs custom
behavior to copy the object—performing a deep copy, so that
the destruction of an object does not leave dangling pointers in
other objects. Likewise, if a class needs special copying
behavior for initialization (the copy constructor), that class
needs special copying behavior for assignment, and vice versa.
Completing the relationship, if a class needs special copying
behavior, it almost certainly has resources that need to be
cleaned up by a destructor.
Not only is this observation true, it is important enough to
have a name. This principle is called the rule of three. The rule
of three states that if you write a destructor, copy constructor, or
assignment operator in a class, you must write all three of them.
15.4 Unnamed Temporaries
While we are discussing object creation, it is useful to discuss
the concept of unnamed temporaries—values that are created
as the result of an expression but not given a name (i.e., no
variable is declared for them, so they do not have a name). We
have actually seen unnamed temporaries in practice in C;
however, there is much less to say about them. For example,
when we compute 4 + 3 * 5, the computation of 3 * 5 occurs
first, producing 15. The value 15 is an unnamed temporary—it
is the result of an expression that is not given a name. The
program must store this value somewhere to be able to perform
the computation 4 + 15. However, that storage is quite short-
lived. As soon as the program computes 4 + 15, the value of
this temporary is no longer needed and may be discarded. The
compiler orchestrates all of this behind the scenes.
In C, there is not much interesting about unnamed
temporaries. They exist, but we do not need to think carefully
about them when executing code by hand. In the example
above, we would likely perform the math in our heads without
even writing down the 15 anywhere—that is fine, as long as we
do it correctly. If we had a longer-lived temporary (such as if
we did f(3) + g(4) and had to remember the result of f(3)
while executing g(4)), we might write it down over the
expression that generated it. We did this before without
explicitly talking about it because it is fairly simple.
In C++, however, we must be more careful about unnamed
temporaries when they are objects, as they can have more
complex behavior. Like any other objects, unnamed
temporaries are initialized by constructors and destroyed by
destructors, which we need to account for when we consider the
behavior of the program. For example, if we write a + b * c,
and a, b, and c are objects (whose types have overloaded the +
and * operators), then the result of b * c is an unnamed
temporary with an object type. The creation and destruction of
this object may be nontrivial, thus we must account for their
effects in executing the code.
There are a variety of ways that unnamed temporaries are
created. The simplest way is to declare an object with no name
by writing the class name, followed by parentheses and the
initializers of the object. For example:
MyClass(42); //create an unnamed object
Such a statement will allocate space for an object of type
MyClass in the current stack frame. Unlike most objects we
create, the “box” will not have any name. The next effect of the
statement will be to initialize it by passing 42 to its constructor.
The this argument of the constructor will be an arrow pointing
at the unnamed box. In this particular example, the object will
then be immediately destroyed. The destructor will be invoked
to clean up the object (again, this will point at the unnamed
box), and then the temporarily allocated storage for it will be
deallocated, removing the unnamed box from the current stack
frame.
More generally, an unnamed temporary object is
deallocated at the end of the full expression (that is, an
expression that is not a piece of a larger expression) containing
its creation. For example, suppose MyClass overloads the + and
* operators (to also return MyClass objects), and consider the
following code:
x = MyClass(42) + MyClass(17) * MyClass(3);
This fragment of code creates five unnamed temporaries
(MyClass(42), MyClass(17), MyClass(3), the result of the
multiplication, and the result of the addition). These
temporaries are destroyed (in reverse order of their creation) at
the end of the entire assignment expression—that is, after the
result of the addition is assigned to x. You may note that we
wrote assignment expression—recall that in C and C++
“lvalue = expression” is itself an expression, thus the entire
full expression here is
x = MyClass(42) + MyClass(17) * MyClass(3)
This rule is another great example of a rule where
remembering its details is not that important, as long as you
write code where it does not matter. If you do not care about the
precise places where these temporaries are destroyed (just the
fact that they are destroyed properly), you do not need to
remember the specific details. However, if you write code
where the specifics of object destruction (or creation order)
matter, you should fully understand the rules.
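The lifetime rule can be observed by counting live objects. The hypothetical Tmp class below (an illustration, not from the text) overloads + and * and tracks how many of its objects currently exist; by the end of the full expression, every temporary has been destroyed and only the named object remains:

```cpp
#include <cassert>

// Hypothetical Tmp class: a static counter tracks live objects,
// making the destruction of temporaries visible.
class Tmp {
public:
  static int live;
  int v;
  Tmp(int x) : v(x) { live++; }
  Tmp(const Tmp & rhs) : v(rhs.v) { live++; }
  ~Tmp() { live--; }
  Tmp & operator=(const Tmp & rhs) { v = rhs.v; return *this; }
  Tmp operator+(const Tmp & rhs) const { return Tmp(v + rhs.v); }
  Tmp operator*(const Tmp & rhs) const { return Tmp(v * rhs.v); }
};
int Tmp::live = 0;

int fullExpressionDemo() {
  int result;
  {
    Tmp x(0);
    x = Tmp(42) + Tmp(17) * Tmp(3);  // several temporaries exist here
    // By this line, the full expression has ended: every temporary
    // is gone, and only x is still alive.
    result = (Tmp::live == 1) ? x.v : -1;
  }
  return result;  // 42 + 17 * 3 = 93 if the temporaries were cleaned up
}
```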
15.4.1 Parameter Passing
Unnamed temporaries can be useful in passing parameters
to functions. For example, suppose we have a function that
takes a MyClass object:
int someFunction(MyClass something) { ... }
We can pass an unnamed temporary as a parameter to the
function:
someFunction(MyClass(42));
This call is legal and constructs an unnamed temporary (as
we discussed before), copies that unnamed temporary object
into the stack frame for someFunction, and then destroys the
temporary after someFunction returns (as the full expression
containing its creation ends there). We could instead write:
MyClass temp(42);
someFunction(temp);
However, there is a subtle difference. In the first version
(where we construct an unnamed temporary), the compiler is
allowed to optimize away the copy. That is, rather than
constructing an unnamed temporary and copying it into
someFunction’s frame, the compiler is allowed to create the
object directly in someFunction’s frame to avoid the extra copy
operation. The object that it creates is still destroyed at the
proper time (after evaluation of the full expression where it was
created).
The alert reader may be a bit perplexed as to how this
destruction happens, as we have shown the called function’s
frame being destroyed, along with all objects in it before
execution returns to the caller. However, the called function
cannot be responsible for destroying these objects, as execution
must return to the caller first. Technically speaking, most
implementations will have the called function destroy
everything except the parameters, and the caller destroy the
parameters. However, this level of detail introduces
complexities that are not significant in most cases (anything we
have done or will do in this book). If you write code where this
level of detail of destruction order matters then (a) you should
be an expert in not only these details, but also what precisely
the C++ standard does and does not guarantee, as well as what
your compiler (and any you might port your code to) does
exactly and (b) you should have a really good reason for
writing such code.
We will further note that it is generally preferable to have
functions take a const reference rather than a copy of an object.
That is, we probably should write someFunction like this
instead:
int someFunction(const MyClass & something) { ... }
Now, no copying is involved in any case. As the reference
is a const reference, we can still pass an unnamed temporary to
it. The compiler will create a “box” for the unnamed temporary
and pass the address of that box as the pointer that is the
reference. If the reference is non-const, then we would not be
able to pass an unnamed temporary, as it is not an lvalue.
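A short sketch makes the binding rule concrete. The Widget class and inspect function here are hypothetical: a const reference parameter binds happily to an unnamed temporary, whereas a non-const reference parameter would reject it:

```cpp
#include <cassert>

// Hypothetical Widget class for demonstrating reference binding.
class Widget {
public:
  int id;
  explicit Widget(int i) : id(i) {}
};

int inspect(const Widget & w) { return w.id; }   // no copy is made
// int tweak(Widget & w);  // tweak(Widget(7)) would NOT compile:
//                         // a temporary is not an lvalue

int bindDemo() {
  return inspect(Widget(7));  // temporary bound to the const reference
}
```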
15.4.2 Return Values
Unnamed temporaries are also involved in returning a value
from a function. In our earlier
MyClass(42) + MyClass(17) * MyClass(3) example, the
result of the multiplication (which is an unnamed temporary)
comes from the return value of the multiplication operator
(which is a function). There are, however, two ways in which
the creation of this unnamed temporary can happen.
The most intuitive approach from the perspective of the
rules that we have learned so far is that the function (in this
case, multiplication operator) creates an object, and that object
is explicitly copied to initialize the unnamed temporary. This
copying would make use of the copy constructor (as the
unnamed temporary is a newly created object), after which the
local object inside the function would be destroyed.
However, that approach is inefficient from a perspective of
requiring an explicit copy. C++ allows the compiler to elide the
copy, even if the copy constructor has noticeable effects. If the
compiler elides the copy, then it arranges for the result to be
directly initialized from within the function. This optimization
is called the return value optimization.
If you are executing code by hand, you can typically
choose whether to assume the compiler performs return value
optimization or not. For most code, the optimization does not
change the correctness or behavior of the code (although it is
permitted to perform this optimization even if it does change
the behavior of the code). If you are experiencing strange
problems, you might execute the code both ways to see if there
is a difference in behavior. If you are writing code where the
presence of return value optimization poses problems, you are
almost certainly doing something wrong. If you have a good
reason for whatever you are doing, you should dynamically
allocate the object with new, return a pointer, then explicitly
delete the object when appropriate.
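A sketch of return value optimization in action, using a hypothetical Result class that counts copies. How many copies actually occur depends on the compiler and language standard (under C++17 the copy is elided for this pattern), so the sketch asserts only on the value, which is the same either way:

```cpp
#include <cassert>

// Hypothetical Result class: counts copy-constructor calls so the
// effect of return value optimization can be observed.
class Result {
public:
  static int copies;
  int v;
  explicit Result(int x) : v(x) {}
  Result(const Result & rhs) : v(rhs.v) { copies++; }
};
int Result::copies = 0;

Result makeResult() {
  return Result(99);   // eligible for return value optimization
}

int rvoDemo() {
  Result r = makeResult();
  // Result::copies may be 0 (copy elided) or more, depending on the
  // compiler; r.v is 99 regardless.
  return r.v;
}
```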
15.4.3 Implicit Conversions
We just saw that we can pass an unnamed temporary to
someFunction like this:
int someFunction(const MyClass & something) { ... }
...
...
someFunction(MyClass(42));
However, what is perhaps more surprising is that we can write
the call to someFunction like this (even if someFunction is not
overloaded to take an int):
someFunction(42);
This behavior may seem quite shocking, as 42 has type
int, and someFunction requires an argument of type MyClass.
The fact that the C++ compiler accepts this function call makes
it appear that the rules of type checking as you have come to
know them are being completely disregarded. In actuality, this
behavior is a broad generalization of behavior we have been
familiar with since Section 3.4.1. Recall that we discussed what
happens if you try to add an int and a double. The int is
implicitly converted to a double before the arithmetic happens.
In C++, any constructor that takes one argument is
considered as a way to implicitly convert from its argument
type to the type of the class it resides in, unless that constructor
is declared explicit. In the case of this particular example,
this means that the C++ compiler considers MyClass’s
constructor, which takes an int, as a valid way to implicitly
convert the int (42), which we have passed as an argument,
into an object of type MyClass, which is what the function call
requires. This implicit conversion creates an unnamed temporary
object according to all the rules that we just discussed.
The compiler is only allowed to apply this rule once per
initialization. That is, if class A has a constructor that takes a B,
and B has a constructor that takes a C, then we may not pass a C
where an A is required and expect the compiler to make a
temporary B from which it can then make a temporary A.
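The one-conversion rule can be sketched with hypothetical classes A and B (the names and takeA function are assumptions for illustration). One implicit user-defined conversion is fine; two in a row are not, though we may always write the conversions out explicitly:

```cpp
#include <cassert>

// Hypothetical conversion chain: int -> B (implicit), B -> A (implicit).
class B {
public:
  int v;
  B(int x) : v(x) {}          // not explicit: int converts to B
};
class A {
public:
  int v;
  A(const B & b) : v(b.v) {}  // not explicit: B converts to A
};

int takeA(const A & a) { return a.v; }

int conversionDemo() {
  int one = takeA(B(5));      // OK: one implicit conversion, B -> A
  // takeA(5);                // error: would need int -> B -> A (two hops)
  int two = takeA(A(B(6)));   // OK: both conversions written explicitly
  return one + two;
}
```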
As a general rule, you should declare all of your single-
argument constructors except your copy constructors as
explicit, such as:
class MyClass {
  //other stuff here...
public:
  explicit MyClass(int x) : someField(x) { ... }
  MyClass(const MyClass & rhs) : someField(rhs.someField) { ... }
  //other stuff here...
};
In general, the fact that the compiler can insert “what it
thinks you mean” in place of “what you said” allows for
mistakes to slide by unnoticed. If you meant to construct a
temporary of type MyClass, you should have written that.
Remember the principle that we have mentioned several times
before: you want the compiler to catch as many of your
mistakes as possible. Having implicit conversions available
reduces the compiler’s ability to help you in this fashion—
rather than telling you that you passed the wrong parameter
type, it may convert it to a legal type, which may not even be
the one you wanted.
Another danger lurks in abusing implicit conversions,
especially combined with function overloading. You may write
code that works perfectly fine right now and relies on an
implicit conversion. However, later, another programmer may
add an overloading of the function you are calling that is a
better choice (does not require implicit conversion) but does
something else. As a simple example, imagine if we had an
Image class and a draw function, which draws an Image on the
screen at specified coordinates. The Image class has a
constructor that takes a string (C++ introduces a string class,
but we will not learn about that until Chapter 16, so we will use
const char *s for now):
class Image {
public:
  //should be explicit,
  //but we omit that for the purpose of the example
  Image(const char * filename) { ... }
};

void draw(int x, int y, const Image & image) { ... }
Now, suppose that since the Image constructor is not
explicit, the code makes use of the implicit conversion:
draw(xpos, ypos, "someImage.png");
The compiler sees that it can use the constructor of Image,
which takes a single const char * argument to implicitly
convert, and the code compiles just fine. Let us suppose that
this code works exactly as intended for the moment. The
programmer tests the code and all is well. In fact, perhaps, the
programmer makes use of (or should we say “abuses”) these
implicit conversions in many other places in the program. All
are perfectly legal, and may well work just fine (for now).
Now suppose that at some later date, another programmer
decides to write a function to draw a string of text on the
screen. That is, it takes a string and prints the image of those
letters onto the screen at a particular set of coordinates:
void draw(int x, int y, const char * message) { ... }
The programmer adds this code then recompiles the
program. Now, the calls that abused the implicit conversion:
draw(xpos, ypos, "someImage.png"); actually call the newly
added draw function, as it is a better match (no conversions are
required)! Instead of reading the file and drawing the image,
they draw the file name on the screen. The programmer testing
the new draw function will be quite surprised by the new
behavior of the program and need to invest some debugging
effort to find out what is happening. After the problem is
identified, the programmer will need to go in and add explicit
conversions everywhere that the implicit conversion was
abused.
Declaring your constructor with explicit prevents the
compiler from making use of it unless you explicitly create the
unnamed temporary, like this:
draw(xpos, ypos, Image("someImage.png"));
As we just mentioned, you should generally declare all
single-argument constructors to be explicit, except for the
copy constructor. The copy constructor should never be
declared as explicit. It cannot be used for an implicit
conversion (it would only convert from one type to the same
type) and must be used in a variety of implicit circumstances,
such as copying parameters into a callee’s frame and copying
the return value out of a function.
15.5 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 15.1 : What is a constructor? How are they
declared? Why are they useful?
• Question 15.2 : If you write a constructor in a class, can
that class be POD? Why or why not?
• Question 15.3 : Why can you not use malloc with non-
POD types?
• Question 15.4 : What is an initializer list, and why
should you use them in your constructors?
• Question 15.5 : What is a destructor? How are they
named? Why are they useful? When are they invoked?
• Question 15.6 : Suppose that C is a class, and c1 is
previously declared to be of type C (and initialized). If I
write C c2 = c1;, which gets used: the copy
constructor or the assignment operator?
• Question 15.7 : What is the output when the following
code is run?
#include <cstdio>
#include <cstdlib>

class Example {
private:
  int data;
public:
  Example() : data(0) {
    printf("Example default constructor\n");
  }
  Example(int x) : data(x) {
    printf("Example constructor (%d)\n", x);
  }
  Example(const Example & e) : data(e.data) {
    printf("Example copy constructor\n");
  }
  Example & operator=(const Example & rhs) {
    data = rhs.data;
    printf("Example assign operator\n");
    return *this;
  }
  ~Example() {
    printf("Example destructor [with data=%d]\n", data);
  }
  void add(int x) {
    data += x;
  }
};

void f(Example & a, Example b) {
  printf("Inside f\n");
  a.add(10);
  b.add(20);
}

int main(void) {
  printf("e1:\n");
  Example e1;
  printf("e2:\n");
  Example e2(4);
  printf("e3:\n");
  Example e3 = e2;
  e3.add(5);
  printf("e1 = e2:\n");
  e1 = e2;
  printf("f(e1,e2):\n");
  f(e1, e2);
  printf("back in main\n");
  return EXIT_SUCCESS;
}
(Note that there is really no need to overload the
copy constructor or assignment operator here, except
to have them print messages so you can check your
understanding of their behavior).
• Question 15.8 : What is the rule of three?
• Question 15.9 : Write an IntArray class with the
following behaviors:
– A default constructor, which initializes the array
to an empty (zero-element) array.
– A constructor that takes a size_t, which
initializes the array to have the passed-in number
of elements and initializes each element to 0.
Recall that size_t is an unsigned int type that
is appropriate for the size of an array.
– A copy constructor, which makes a deep copy of
the IntArray.
– A destructor, which properly frees the memory
associated with the IntArray.
– A copy assignment operator, which makes a
deep copy of the assigned IntArray (and frees
up any resources previously held by the
assigned-to IntArray, which are no longer
needed).
– An overloaded int & operator[]
(size_t index), which returns a reference to
the specified item of the array.
– An overloaded const int & operator[]
(size_t index) const, which returns a const
reference to the specified item of the array.
– A size_t size() const function, which
returns the size of the array.
You should also write a main to test your code
extensively.
• Question 15.10 : Write a program that demonstrates the
automatically provided assignment operator (for a class
that does not explicitly define one) does not check for
self assignment. Your program does not need to
accomplish any particular “useful” task but should
convince anyone who sees its code and output of the
behavior of this rule.
Chapter 16
Strings and IO Revisited
C++ introduces new classes for strings and for performing IO.
Although the C approaches are still valid, most C++
programmers use the C++-specific approaches. The advantage
of these new approaches is that they provide a more object-
oriented approach than their C counterparts. However, as with
most things, there are tradeoffs.
16.1 Strings
In C, a string is simply a sequence of characters, terminated by
the special null-terminator character (’\0’). Although this
implementation of strings is sufficient to define and manipulate
them, it is not at all in line with the object-oriented paradigm of
C++. Instead, C++ provides a string class (std::string, for
which you need to #include <string>). Of course, you can
still define and manipulate C-style strings via a char *
whenever it is appropriate (and in fact, argv is still a char **
just as it is in C).
The string class encapsulates the sequence of bytes that
form a string into an object that provides a variety of methods
and operators to operate on that type. For example, strings
have a length() method, which tells how many characters are
in the string. There are also operators such as +=, which
appends one string to the end of another, and [], which indexes
into the string, returning a reference to the requested character.
As the [] operator returns a char & (when applied to a non-const
string; otherwise a const char &), it can be used to
modify the contents of a string (e.g., myString[0] = 'a';).
The string class also provides a variety of constructors,
including one that takes a const char * (i.e., a C-style string)
as an argument. This constructor allows creation of C++ strings
from C strings, including string literals (which still retain the
same const char * type in C++). For example, you can write
std::string s("Hello World"); to create a C++ string object
(locally in the current frame) and initialize it with the (C) string
literal "Hello World".
For full reference on C++’s string, consult
http://www.cplusplus.com/reference/string/string/.
16.1.1 Cautionary Example
Our discussion of C++’s strings provides an excellent
opportunity for a cautionary example. Consider the following
C++ code to reverse a string, using recursion:
#include <string>
//This code is terribly inefficient
std::string reverse(std::string s, std::string ans) {
  if (s.length() == 0) {
    return ans;
  }
  return reverse(s.substr(1), s[0] + ans);
}
//Note that the next function is not recursive.
//It calls the other (overloaded) reverse function, not itself.
std::string reverse(std::string s) {
  return reverse(s, "");
}
At first glance, this code appears to be a reasonable C++
implementation of code to reverse a string using recursion. In
fact, a naïve examination of this code would lead one to believe
that this implementation is tail recursion (recall from Chapter 7
that a tail recursive function immediately returns the recursive
result with no further computation). However, this function is
not actually tail recursive, so the compiler cannot apply tail call
optimization to it. If you fully understood and internalized the
information from Chapter 15, you will realize why this function
is not tail recursive—the frames for reverse have four string
objects (s, ans, as well as temporary objects for s.substr(1)
and s[0] + ans), which have nontrivial destructors. Thus,
these objects must be destructed after the recursive call of
reverse returns.
However, being head recursive is not the only inefficiency
in this function (if it were, the impact would be fairly minor).
The larger inefficiency arises from the fact that we are creating
many new objects and performing significant copying. When
we call s.substr(1), we are creating a new local string object,
which has all but the first character of s and almost certainly
involves copying that data into the new object. We also create a
new string object that is the concatenation of s[0] and ans,
which also requires copying characters from those strings into
the newly created object. We then copy the temporary objects
into the parameters of the recursive call!
You may think “ok, so we make a few copies—how big of
a deal can that be?” If we compare to a more efficient
implementation (which actually is tail recursive and eliminates
the extra copying):
#include <string>
//This code is much more efficient!
std::string reverse(const std::string & s, std::string & ans, size_t ind) {
  if (ind == s.length()) {
    return ans;
  }
  ans[ind] = s[s.length() - ind - 1];
  return reverse(s, ans, ind + 1);
}
//Note that the next function is not recursive.
//It calls the other (overloaded) reverse function, not itself.
std::string reverse(std::string s) {
  std::string ans(s.length(), '\0');
  return reverse(s, ans, 0);
}
We find that this more efficient implementation is about
2000x faster (yes, two thousand times), when reversing large
strings. On the computer we timed these implementations on,
the first reversed a 10,000-character string in 42,500
microseconds. The second implementation reversed the same
10,000-character string in 20 microseconds. The first
implementation could not reverse 100,000 character strings
because the stack overflowed (the head recursive calls
created more frames than there was space for on the stack). We
note that we can also implement the reverse function iteratively,
which has indistinguishable performance from the efficient
recursive implementation:
#include <string>

std::string reverse(std::string s) {
  std::string ans(s.length(), '\0');
  for (size_t i = 0; i < s.length(); i++) {
    ans[i] = s[s.length() - 1 - i];
  }
  return ans;
}
We present this cautionary tale not for the specific details
of implementing string reverse, but to underscore the
importance of understanding the implications of all of the code
you write. In C++, it is very easy to write code that copies
values excessively and creates/destroys many temporary objects.
Such code can be painfully slow. While we are not focusing on
performance optimization in this book, we note that writing
code multiple orders of magnitude slower than it should be is
often unacceptable.
You should think carefully about what exactly your C++
code does and whether it really needs to make copies or create
new temporary objects. Observe that by learning exactly what
C++ code means and how to execute it by hand, you have
exactly the tools required to think through these concerns as
you write code. When you find yourself having such problems,
consider ways to pass parameters by reference (or const
reference) instead of value, as well as where temporary objects
are created, and how they might be eliminated.
16.2 Output
In C, we used FILE *s for output. These represented an open
file, including stdout and stderr. We then used functions such
as fprintf or printf (which is like fprintf but always writes
to stdout) to print output. These two functions provided format
conversions (%d, %s, etc) to allow us to print the values of
expressions of various types. If we wanted to write a function
to print a complex data structure, we would write it to take a
const pointer to that structure as a parameter.
The C-style mechanisms for printing are great for C;
however, they do not align with the object-oriented principles
C++ strives for. The C approach requires us to write a function
that takes in the data to be printed. An OO approach lets a class
define how to print itself with a method encapsulated inside of
it. Reconciling such a design goal with the way printf is
designed is a bit tricky.
Instead, the designers of C++ decided to leverage operator
overloading to devise an entirely different style of printing
anything. In C++, the fundamental type for output operations is
a std::ostream, and the << operator is overloaded to work with
it as the left-hand operand with a variety of possible types for
its right-hand operand. When using the << operator to work
with std::ostreams, it is called the “stream insertion operator”
(otherwise, it is typically called the “left shift operator,” as it
performs bit shift left for integers).
Instead of using stdout (which is a FILE *), C++
introduces std::cout, which is a std::ostream (likewise,
there is std::cerr, which is analogous to stderr). To see this
in action, let us start by rewriting the “Hello World” program in
C++:
#include <iostream>
#include <cstdlib>

int main(void) {
  std::cout << "Hello World\n";
  return EXIT_SUCCESS;
}
Here, we #include <iostream> for the C++ stream
library, including std::cout, the std::ostream type, and the
built-in overloadings of <<. We #include <cstdlib> for the
definition of EXIT_SUCCESS. Inside of main, we use the stream
insertion operator to print the string "Hello World\n" to
std::cout.
The << operator is overloaded to allow its use with a
variety of built-in types as its right operand. For example, we
might write the following code:
1 #include <iostream>
2 #include <cstdlib>
3
4 int main(void) {
5   for (int i = 0; i < 10; i++) {
6     std::cout << "i = " << i << "\n";
7     std::cout << "and i / (i + 1) = " << i / (double)(i + 1) << "\n";
8   }
9   return EXIT_SUCCESS;
10 }
This code has two C++-style print statements. The first, on
line 6, prints the literal string "i = " then prints the decimal
(base 10) representation of the value of integer i, then the
literal string "\n". That is, it behaves basically like
printf("i = %d\n", i). Similarly, the next line behaves like
printf("and i / (i + 1) = %f\n", i / (double)(i + 1))
would in C. Note that while C’s printf requires us to explicitly
write format specifiers (such as %d or %f), C++’s output streams
do not. Instead, the C++ << is overloaded on multiple types.
The << operator applied to an ostream and an int is a different
function from the << operator applied to an ostream and a
double.
16.2.1 Return Type of Stream Insertion
A natural question from the previous example is “How does it
work to have multiple <<s in a single statement?” Each <<
operator evaluates to a value, which in this case (and by
convention, in general for stream operators) is its left operand.
The fact that the operator evaluates to the stream means that
std::cout << "i = " evaluates to std::cout (after printing
the string "i = "), so the next << operator again has std::cout
as its left operand. To make this concept a bit more concrete,
we could imagine the implementation of << with int as the
right operand type as:
std::ostream & operator<<(std::ostream & stream, int num) {
  char asString[16]; //decimal representation of an int fits in 16 characters
  snprintf(asString, 16, "%d", num);
  stream.write(asString, strlen(asString));
  return stream;
}
That is, we might convert the integer to a sequence of
characters then write those characters to the underlying
ostream (the write method of an ostream takes an array of
characters and a count of how many characters to write then
writes them to the stream). After we write the data to the
stream, we then return the original stream as the return result of
the operator, which is what it evaluates to in the expression in
which it was used.
This convention is important to remember if you overload
the << for your own types. If you make the overloaded operator
return the ostream & that was passed in as its left operand, your
<< will be usable in the “normal” way—that is, you can chain
multiple of your << operators and/or mix them with other
overloadings in a single statement. If you make your << return
void (as one might naïvely do), you will not be able to chain
them together.
16.2.2 Writing Stream Insertion for Your Own
Classes
In an ideal world, we would write the overloading of the <<
operator inside of a class that we were writing. Such a design
would fit with the OO design principle of encapsulation: how
we print the class would be contained inside of it. However,
writing it inside of the class would require the object being
printed to be the left operand instead of the right—for operators
that are members of classes, the left operand is this. Instead,
we must write a method outside of the class that looks like this:
std::ostream & operator<<(std::ostream & stream, const MyClass & obj) {
  //whatever it takes to print things (elided)...
  return stream;
}
Besides design idealism, this approach presents a
pragmatic difficulty: our << operator is now outside the class
but may require access to the private internals of the class to
print out an instance. That is, we may wish for our printing
method to have access to fields and methods of our class that
are not exposed in the public interface (especially if we want to
use the << operator to print objects for debugging).
C++ resolves this conundrum by allowing a class to
declare functions or classes as its friends. When a class declares a
function or a class as its friend, that function or class is
allowed access to the private elements of the declaring class.
The friend declaration is placed inside of the class wishing to
grant access to its private members:
class MyClass {
  //declarations of fields, methods, etc.
  friend std::ostream & operator<<(std::ostream & stream,
                                   const MyClass & obj);
};
Observe that for functions, we declare the friendship by
writing the entire prototype of the function. As overloaded
functions are distinct from each other, a function with the
same name but different parameter types would not itself be a
friend (unless also explicitly declared as one).
Friendship declarations are a feature that should be used
quite rarely. Declaring something as a friend inherently violates
the core principles of OO design: it lets something outside of
the class alter its private internal state. Using it for operators
that are overloaded and are, in principle, part of the class but
are just declared outside of the class due to the ordering of the
parameters is a legitimate use of friend. Except for that use,
you can probably avoid the use of friend in any C++ program
you write and not have any problems.
16.2.3 Object Orientation of This Approach
C++ adopts this approach (in favor of C’s printf) as it presents
a more object-oriented way to handle IO operations. By using
objects, the underlying IO functionality is encapsulated into a
class—the << and >> operators ask the stream object to perform
output and input (respectively) with their right operands as
arguments. While this may seem like a subtle distinction, it has
some practical uses. In Section 16.4, we will see that we can
have streams that read/write from/to things that are not files
(such as strings). We could then write code that prints to any
stream (whether that stream actually writes to a file, builds up a
string, or does something else) and not worry about what kind
of stream will be passed in. We will see in Section 18.4 how we
can pass different stream types in to the same function.
16.2.4 Controlling Formatting
One last useful detail we mention in case you need it for C++
output streams is how to control the specifics of formatting. In
C, we can print an integer in decimal with %d and in
hexadecimal with %x; however, with C++’s streams, we do not
have a place to put such a format string. Instead, the stream
(which is an object) has internal state that keeps track of the
current formatting specifications. The current format state is
altered by either inserting special values (called manipulators)
with the << operator or by calling methods on the stream. To
change the base in which integers are printed, the first approach
is used, using std::hex, std::dec, or std::oct accordingly.
For example,
int x = 42;
std::cout << x << "\n";
std::cout << std::hex << x << "\n";
std::cout << std::oct << x << "\n";
std::cout << std::dec << x << "\n";
Here, we print the integer x four times. The first line
prints the number in whatever base the cout stream was
previously left in. Assuming we have not modified its base, it is
in decimal, as the C++ library is guaranteed to start the streams
in decimal mode. The next line first changes the mode of the
stream to hexadecimal then prints x (so it would print out 2a).
The third line repeats this process in octal (base 8, so it prints
52). Finally, the stream is converted back to decimal and prints
42. We note that it is generally good practice to put a stream
back into decimal mode when you are done printing things in
other modes.
Other types of formatting are controlled via methods in the
stream class. For example, std::cout.width(5); would set
the field width of cout to five, meaning that an output
conversion that uses field width (e.g., printing integers) will
print at least five characters. What extra characters get printed
is controlled by calling the fill method and passing in the
desired character. Similarly, the floating point precision can be
controlled with the precision method.
16.3 Input
In addition to introducing new ways to perform output, C++
also introduces new ways to perform input. Here, we again use
stream objects (in this case, istreams), and the stream
extraction operator, >>. Much like stdout has an analog of
std::cout, stdin has an analogous std::cin. For example, we
might do
int x;
std::cin >> x;
to read an integer from standard input. As with output streams,
input streams have a base they work in, which is by default
decimal (base 10), so unless we have changed the base of
std::cin (e.g., with std::cin >> std::hex), the integer will be
converted from the text typed on standard input as a decimal
number.
Just like for output streams and the stream insertion
operator, you can overload the stream extraction operator for
input streams and classes that you write. Many of the lessons
from the stream insertion operator that we discussed earlier still
apply. You should write your stream extraction operator to take
a reference to the input stream (which it returns at the end) and
a reference to the type of data you want to read (of course, this
reference is not const, as you plan to modify it). For example:
std::istream & operator>>(std::istream & stream, MyClass & obj) {
  //whatever it takes to read things (elided)...
  return stream;
}
As with stream insertion, you may wish to make this
overloaded operator a friend of your class so that it can
modify the internal data directly in ways that may not be
possible through the external interface.
16.3.1 Errors
Remember the code fragment we just saw above to read an
integer:
int x;
std::cin >> x;
What happens if you execute this code, but the user enters xyz?
Clearly, it cannot be converted to an integer, since the text the
user entered is not the base 10 representation of any number.
The design of the C++ operator precludes using the return value
(which is already used to return the stream so the >>s can be
chained together) or passing another argument (the >> operator
can only take two arguments) to report the error.
Instead, the input stream object contains an error state
internally. When an operation produces an error, it sets the
stream’s error state. This error state persists, causing future
operations to fail, until it is explicitly cleared. When an error
occurs, the input that could not be converted is left unread, so
future read operations can try to read it (after the error state is
cleared). For example, we might write:
int x;
std::cin >> x;
if (!std::cin.good()) { //check if any errors happened
  std::cin.clear(); //clear the error state
  std::string badinput;
  std::cin >> badinput; //read a string
  if (!std::cin.good()) {
    std::cerr << "Cannot read anything from cin!\n";
  }
  else {
    std::cerr << "Oops! " << badinput << " wasn't an int!\n";
  }
}
In the above example, we try to read an int from
std::cin. After doing so, we check std::cin.good(), which
returns true if no errors have happened and false otherwise. If
an error has happened, we use std::cin.clear() to clear the
error state so that subsequent operations can succeed. We then
read a string, which could also fail (for example, if we have
reached the end of the file). Of course, if the user has entered
input that is just not a valid integer, we can succeed in reading
it as a string but not in converting it to an int. We therefore
check again if we succeeded and print an appropriate error
message about the problem that happened.
C++’s input streams actually distinguish between three
types of failures: end of file (which can be checked with the
.eof() method), logical errors on the operation—such as an
inability to perform the proper conversion (which can be
checked with the .fail() method), and errors with the
underlying I/O operation (which can be checked with the
.bad() method if .fail() returned false). Of course, output
streams can experience errors too and can have similar methods
applied to them as needed.
It is important to know that if you use the >> operator to
read a string, it will only read one “word,” not one line. That
is, it will stop at whitespace for what it converts into the string.
If you want to read an entire line, use the std::getline
function, which takes an input stream reference and a string
reference and fills in the string with the next line of input from
the stream (unless an error occurs). Note that std::getline is
not a member of any class; it is just a function.
16.4 Other Streams
C++’s streams can be used for more things than just printing to
standard output or reading from standard input. In much the
same way that C’s stdio library can be used to work with files,
C++’s stream library has classes that work with files (although,
unlike C’s stdio, the types are slightly different between
different types of streams). C++ also has streams that collect
the data into an internal buffer (or extract it from the buffer),
allowing us to manipulate the data within the program
(analogously to C’s snprintf and sscanf functions).
16.4.1 Files
In C++, files are manipulated using the std::ifstream (for
reading), std::ofstream (for writing), and std::fstream (for
reading and writing) classes. These classes have default
constructors (which create an object with no file associated
with it—you can open one later with the open method) and
constructors that take a const char * specifying the file name
(they also take a second parameter for the mode, but it has an
appropriate default value). An open file can be closed with the
close method. To use any of these classes,
#include <fstream>.
When using these classes, data is written to the file with
the stream insertion operator (<<) and read from the file with
the stream extraction operator (>>). We will note that even
though these are not exactly the same types as the ostream and
istream we saw earlier, they work with the same operators as if
they were the same types. We will understand how exactly this
works when we learn about inheritance and polymorphism in
Chapter 18. The short version for now is that an ofstream is a
special type of ostream and thus is compatible with things that
expect pointers or references to ostreams. An fstream is
compatible with both ostream and istream, so we can use both
<< and >> on it.
16.4.2 String Streams
Sometimes we want to manipulate text in ways for which the
functionality provided by the << or >> operators is convenient,
but we do not want to print the output to an external file (or
read it in from a file). Instead, we may want to perform this
functionality in such a way that we can get the resulting string
into a variable (for <<) or extract the fields from a string
variable (for >>). We can accomplish such behavior with the
std::stringstream class.
If we want to use a string stream to “build up” a string into
a variable, we can default construct one and then use the << to
add elements to the stream. Whenever we use <<, the string
stream accumulates the text that would be printed if we were
using a “normal” stream. Whenever we are done building the
string up, we can use the .str() method to get a std::string
with all the text we have inserted into the stream.
We can also use a string stream to “pull apart” a string as
formatted input; that is, we can use >> to extract pieces from
the string the string stream holds in much the same way that we
would use >> to read them from cin or a file. Typically, when
we do so, we construct the string stream with the constructor
that takes a string, passing in the string we want to pull apart.
Then we use >> to extract fields into variables of appropriate
types. Of course, as with other streams, the extraction could
fail, so we must check for errors. The two behaviors are not
mutually exclusive—we can insert text into a string stream with
<< then extract it back out with >>.
16.5 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 16.1 : Use the C++ documentation to find a
way to extend a string by appending the contents of
another string to the end of it.
• Question 16.2 : How do you convert from a C string
(char *) to a C++ std::string? What about from a
C++ std::string to a C string?
• Question 16.3 : What value does the expression
std::cout << "hi" evaluate to? (Note: this question is
not asking what this prints, but what value it evaluates
to). Why is the value it evaluates to important?
• Question 16.4 : Write the function
int myatoi(const std::string & str), which
behaves like the atoi function, except that it takes a
C++ string (do not use library functions to accomplish
this task for you, such as atoi, strtol, or std::stoi).
• Question 16.5 : Write the function
int myatoiHex(const std::string & str), which
behaves like the atoi function except
that it (a) interprets the string as a hexadecimal (base
16) number rather than decimal and (b) takes a
std::string (again, do not use library functions to do
the work for you.)
• Question 16.6 : Re-write the myEcho, myCat, and
fileSum programs from Question 11.5, Question 11.5,
and Question 11.5.
Chapter 17
Templates
Often, programmers find themselves needing to perform similar
operations on different data types in a similar fashion. For
example, if we want to find the largest item in an array of any
type of data, we use the same basic algorithm—the only real
difference is what the underlying comparison operator does to
compare particular elements. For example, consider the
following two functions that find the maximum element of an
array (and return a pointer to that element). The first operates
on an array of ints, while the second operates on an array of
strings:
int * arrMax(int * array, size_t n) {
  if (n == 0) {
    return NULL;
  }
  int * ans = &array[0];
  for (size_t i = 1; i < n; i++) {
    if (array[i] > *ans) {
      ans = &array[i];
    }
  }
  return ans;
}
string * arrMax(string * array, size_t n) {
  if (n == 0) {
    return NULL;
  }
  string * ans = &array[0];
  for (size_t i = 1; i < n; i++) {
    if (array[i] > *ans) {
      ans = &array[i];
    }
  }
  return ans;
}
These two functions are the same except for the three
places where int has been replaced with string (in the return
type, in the parameter type, and in the declaration of the local
variable ans). In fact, for any type for which the > operator
is defined, we could find the maximum element of an array of
that type with this same algorithm. We could overload this
function for any type we want; however, that approach is
unappealing as it requires significant duplication of code.
Instead, what we desire (and what C++ gives us, but C
does not) is parametric polymorphism. Polymorphism, which
literally means “many shapes” or “many forms,” is the ability
of the same code to operate on different types. This ability to
operate on multiple types reduces code duplication by allowing
the same piece of code to be reused across the different types it
can operate on.
Polymorphism comes in a variety of forms. What we are
interested in at the moment is parametric polymorphism,
meaning that we can write our code so that it is parameterized
over what type it operates on (we will see a different kind of
polymorphism in Chapter 18). That is, we want to declare a
type parameter T and replace int with T in the above code.
Then, when we want to call the function, we can specify the
type for T and get the function we desire. C++ provides
parametric polymorphism through templates.
17.1 Templated Functions
In C++, you can write a templated function by using the
keyword template followed by the template parameters in
angle brackets (<>). Unlike function parameters, template
parameters may be types, which are specified with typename
where the type of the parameter would typically go. The
template parameters have a scope of the entire templated
function. For example, we might write our function that returns
a pointer to the largest element in an array as a template as
follows:
template<typename T>
T * arrMax(T * array, size_t n) {
  if (n == 0) {
    return NULL;
  }
  T * ans = &array[0];
  for (size_t i = 1; i < n; i++) {
    if (array[i] > *ans) {
      ans = &array[i];
    }
  }
  return ans;
}
In this example, the templated function has one parameter
T, which is a type. In the function, we can use T anywhere we
could use any other type. In this particular case, we use T in
three places: the return type, the parameter type for array, and
the type used to declare the local variable ans. Recall from
before that these three places are exactly the places we had to
change to alter the original example function when changing it
to work on strings instead of ints.
Templates may take multiple parameters, and their
parameters may be “normal” types, such as int. For example, it
would be perfectly valid to write a templated function that
looked generally like this:
template<typename T, typename S, int N>
int someFunction(const T & t, S s) {
  ...
}
This templated function takes three parameters, two types
and one integer. We do not really care what it does for the
purposes of this example, just that we could write it.
17.1.1 Instantiating Templated Functions
In much the same way that a function takes parameters and
returns a value, we can think of a template as taking parameters
and creating a function. The templated function itself is not
actually a function. Instead, we must instantiate the template—
giving it “values” for its parameters—to create an actual
function. Note that even though there is similarity between
calling a function (giving it parameters and getting a value in
return) and instantiating a template (giving it parameters and
getting a function), the two have different terminology as well
as other important distinctions, which we will discuss shortly.
To instantiate a template, we write the name of the
templated function, followed by the arguments we wish to pass
inside of angle brackets. The result is a function, which we then
call as we normally would:
int * m1 = arrMax<int>(myIntArray, nIntElements);
string * m2 = arrMax<string>(myStringArray, nStringElements);
int x = someFunction<int *, double, 42>(m1, 3.14);
Whenever you instantiate a template, the C++ compiler
creates a template specialization (or just specialization)—a
“normal” function derived from a template for particular values
of its parameters—if one does not already exist. If the compiler
needs to create the specialization, it does so by taking the
template definition and replacing the template parameters with
the actual arguments passed in the instantiation. In our example
above, the specialization the compiler generates for
arrMax<int> would look exactly like the arrMax function for
ints at the start of the chapter. Similarly, the specialization
created for arrMax<string> would look just like the arrMax for
strings at the start of the chapter.
17.2 Templated Classes
In much the same way that C++ allows a programmer to write
templated functions, it also allows templated classes. Like a
templated function, a templated class has template parameters,
which can be either types or values. The scope of the template
parameter is the entire class declaration. For example, we could
write an Array class that holds an array of any type T (and
keeps the size along with it):
1 template<typename T>
2 class Array {
3 private:
4 T * data;
5 size_t size;
6 public:
7 Array() : data(NULL), size(0) {}
8 Array(size_t n) : data(new T[n]()), siz
9 Array(const Array & rhs) : data(NULL),
10 (*this) = rhs;
11 }
12 ~Array() {
13 delete[] data;
14 }
15 Array & operator=(const Array & rhs) {
16 if (this != &rhs) {
17 T * temp = new T[rhs.size];
18 for (int i = 0; i < rhs.size; i++)
19 temp[i] = rhs.data[i];
20 }
21 delete[] data;
22 data = temp;
23 size = rhs.size;
24 }
25 return *this;
26 }
27 T & operator[](unsigned ind) {
28 return data[ind];
29 }
30 const T & operator[](unsigned ind) const {
31 return data[ind];
32 }
33 size_t getSize() const {
34 return size;
35 }
36 };
As with functions, whenever we instantiate the templated
class (by supplying the actual arguments for the parameters),
the compiler creates a specialization of that template (if one
does not already exist). For example, we could instantiate the
template for ints by writing Array<int> and for strings by
writing Array<std::string>. The specializations created by
(or, if they already exist, referenced by) these two instantiations
are different classes. As these two classes are different, they are
not interchangeable. For example, the following code is illegal:
1 Array<int> intArray(4);
2 Array<std::string> stringArray(intArray); //illegal!
There is no constructor in Array<std::string> that can
take an Array<int>. The copy constructor can only copy from
another Array<std::string>, which is a different type. If you
think about this rule, it makes perfect sense: we can not create
an array of ints by copying an array of strings.
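As a quick sketch of the template in action, we can instantiate Array<int>, fill it, and copy it. The class definition from above is repeated (lightly tidied: a size_t loop index and explicit braces) so the snippet stands alone:

```cpp
#include <cstddef>

template<typename T>
class Array {
private:
  T * data;
  size_t size;
public:
  Array() : data(NULL), size(0) {}
  Array(size_t n) : data(new T[n]()), size(n) {}
  Array(const Array & rhs) : data(NULL), size(0) { (*this) = rhs; }
  ~Array() { delete[] data; }
  Array & operator=(const Array & rhs) {
    if (this != &rhs) {
      T * temp = new T[rhs.size];
      for (size_t i = 0; i < rhs.size; i++) {
        temp[i] = rhs.data[i];
      }
      delete[] data;
      data = temp;
      size = rhs.size;
    }
    return *this;
  }
  T & operator[](unsigned ind) { return data[ind]; }
  const T & operator[](unsigned ind) const { return data[ind]; }
  size_t getSize() const { return size; }
};

// Fill an Array<int>, deep-copy it, and sum the copy's elements.
int makeAndSum() {
  Array<int> a(4);
  for (unsigned i = 0; i < 4; i++) {
    a[i] = i * 10;
  }
  Array<int> b = a;  // deep copy via the copy constructor
  int sum = 0;
  for (unsigned i = 0; i < b.getSize(); i++) {
    sum += b[i];
  }
  return sum;        // 0 + 10 + 20 + 30 = 60
}
```

Because Array<int> and Array<std::string> are distinct specializations, only the int version is generated by this code.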
17.2.1 Templates as Template Parameters
There is a third option for template parameters, which we have
not yet discussed: they can also be another template! For
example, we might write:
1 template<typename T, template<typename> class Container>
2 class Something {
3 private:
4 Container<T> data;
5 //other code in the rest of the class...
6 };
Here, Something is a template whose second parameter is
another template (specifically, one that takes one type as a
parameter—such as the Array template we just made). Inside of
the templated class, we can use the template parameter
(Container) just like we would use any other template that has
the same parameter types. In this particular example, we make
a field called data, whose type is Container<T>.
With the Array template we wrote above, we could
instantiate this templated class like this:
1 Something<int, Array> x;
This instantiation will create a specialization whose data
field is an Array<int>.
Of course, if we always wanted to use an Array to store
our data, we could just write that in our class—so why would
we want to parameterize this template over this templated
class? This is just another example of abstraction—we
decouple the specific implementation for how the Container
stores its data from the interface required to do so. Our
Something template will work with any templated class that
conforms to the interface it expects. We then have greater
flexibility in how we use it later.
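A minimal sketch of this flexibility, assuming a trivial one-type-parameter container we call Wrapper (hypothetical, for illustration only), and with the data field made public so the sketch can poke at it directly:

```cpp
// A trivial one-type-parameter container (hypothetical).
template<typename T>
class Wrapper {
public:
  T value;
  Wrapper() : value() {}
};

// Something's second template parameter is itself a template.
template<typename T, template<typename> class Container>
class Something {
public:
  Container<T> data;  // e.g., Wrapper<int>, or Array<int> with the Array template
};

int wrapperDemo() {
  Something<int, Wrapper> s;  // specialization whose data is a Wrapper<int>
  s.data.value = 42;
  return s.data.value;
}
```

Swapping Wrapper for Array (or any other one-parameter container template) requires changing only the instantiation, not Something itself.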
17.3 Template Rules
There are several rules related to the workings of templates.
Each of these rules has a good reason based on the way
templates work, which we will explain as we present the
corresponding rule. We will note that these rules are all quite
specific to C++’s templates. Other languages provide similar
forms of parametric polymorphism with different rules.
17.3.1 Template Definitions Must Be “Visible” at
Instantiations
By now, we are used to being able to declare a prototype for a
function or the interface for a class in a header file, and place
the implementation in the .c or .cpp file. For templates, this rule
changes. Instead, the actual definition (i.e., implementation) of
the templated function or class must be “visible” to the
compiler at the point where the template is instantiated. The
compiler cannot just generate a call to it and find it in an object
file.
The reason for this rule is that a templated function (or
class) is not actually a function (or class)—just a recipe to
create the specialization when the template is instantiated.
When the compiler finds the instantiation of the template, it
needs the definition (i.e., recipe) in order to create the actual
specialization. It is only at the time when the compiler creates
the specialization that there is actual concrete code that can be
placed in an object file.
Because of this rule, you should write your entire
templated classes and functions in their header files. Although
this prescription goes against our prior advice on how to do
things, it is a special case. The C++ compiler is aware of the
problems that typically arise in placing implementations in
header files and takes special care to make sure the linker does
not encounter problems when the header file is included by
multiple source files, which are then linked together.
17.3.2 Template Arguments Must Be Compile-Time
Constants
The arguments that are passed to a template must be compile-
time constants—they may be an expression, but the compiler
must be able to directly evaluate that expression to a constant.
For example, if f is a function templated over an int, the
following is illegal:
1 for (int i = 0; i < 4; i++) {
2 x += f<i>(x); //illegal: i is not a compile-time constant
3 }
Note that we can achieve a similar effect with legal code:
1 x += f<0>(x);
2 x += f<1>(x);
3 x += f<2>(x);
4 x += f<3>(x);
Even though both do the same basic thing, the second is
legal because the template arguments are constants. The
compiler is not clever enough (in fact, is not allowed to be
clever enough) to convert the former to the latter.
The reason for this rule is that the compiler must create a
specialization for the argument values given. When it sees
f<0>, it has to go create a version of f where the parameter’s
value is 0 and then compile that code. If the compiler cannot
figure out the exact value of the parameter (easily), then it has
no idea what specializations of the template it should create and
thus cannot compile the code.
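The rule can be seen concretely with a sketch of such an f (our own hypothetical definition, consistent with the examples above). Note that the argument need not be a literal, only something the compiler can evaluate to a constant:

```cpp
// A hypothetical f templated over an int value.
template<int N>
int f(int x) {
  return x + N;  // each specialization has N baked in at compile time
}

int valueParamDemo() {
  int x = 0;
  x += f<0>(x);      // legal: 0 is a compile-time constant
  const int k = 2;
  x += f<k + 1>(x);  // legal: the compiler evaluates k + 1 to 3
  return x;          // 0 + 3 = 3
}
```

Here the compiler generates exactly two specializations, f<0> and f<3>.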
17.3.3 Templates Are Only Type Checked When
Specialized
In C++, a template is only type checked when it is specialized.
That is, you can write a template that can only legally be
instantiated with some types, but not others. This rule is
actually quite important, as otherwise we would be heavily
restricted in what templates we can write. To see the
importance of this rule, let us consider our earlier example of
finding the maximum element of an array:
1 template<typename T>
2 T * arrMax(T * array, size_t n) {
3 if (n == 0) {
4 return NULL;
5 }
6 T * ans = &array[0];
7 for (size_t i = 1; i < n; i++) {
8 if (array[i] > *ans) {
9 ans = &array[i];
10 }
11 }
12 return ans;
13 }
If we carefully scrutinize this code, we find that it is legal
only for Ts where the > operator is defined to compare two Ts
and return a bool. Instantiating this template is therefore legal
on int and std::string where this operator is defined.
However, there may be other types for which this operator is
not defined (and possibly even does not make sense).
Fully understanding this rule requires us to be a little bit
more precise about exactly when the compiler does what in
regard to specializing a template. For a function, the details are
fairly straightforward: the first time the compiler encounters an
instantiation of the templated function with a particular set of
arguments, it specializes (and thus type checks) the function.
For a class, however, the specialization occurs in
parts. Whenever an instance of the class is created, the compiler
specializes the part of the class definition that is required to
make the object’s in-memory representation—the fields, as well
as virtual methods (which we will learn about in Chapter 18).
The “normal” methods (a.k.a. member functions) do not
actually affect the in-memory representation of an instance of
the object, as they are not placed in the object but rather the
code segment (we will learn more about the details of how
objects are laid out in memory in Chapter 29).
The (non-virtual) methods of a templated class are only
specialized when the corresponding method is used. Although
this piece-by-piece specialization may seem a bit odd, it
actually proves quite useful in the way that C++ templates are
used. A programmer can write a templated class and provide
methods in that template that only work on types where certain
functions or operators are defined. The programmer can then
use the template on classes that do not have these functions or
operators defined, as long as she does not use the methods that
require them. For example, we might write:
1 template<typename T>
2 class Something {
3 T data;
4 //other stuff...
5 public:
6 bool operator==(const Something & rhs) const {
7 return data == rhs.data;
8 }
9 bool operator<(const Something & rhs) const {
10 return data < rhs.data;
11 }
12 };
If we instantiate the Something template on a type (T) that
admits equality (has an == operator defined) but not ordering
(does not have a < operator defined), that is legal. We can use
the == operator to compare two Something<T>s. However, if we
attempt to compare two Something<T>s with <, then the
compiler will specialize the < operator for Something, try to
type check it, and produce an error as the < operator is not
defined on T.
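A self-contained sketch of this behavior, using a hypothetical Id type that defines == but deliberately not <, and a Holder template that mirrors the Something template above (with a constructor added so we can build instances):

```cpp
// Id defines == but deliberately not <.
class Id {
public:
  int n;
  Id(int n_) : n(n_) {}
  bool operator==(const Id & rhs) const { return n == rhs.n; }
};

// Mirrors the Something template above, with a constructor added.
template<typename T>
class Holder {
  T data;
public:
  Holder(const T & d) : data(d) {}
  bool operator==(const Holder & rhs) const { return data == rhs.data; }
  // Only specialized (and thus type checked) if some code actually uses it:
  bool operator<(const Holder & rhs) const { return data < rhs.data; }
};

bool holderDemo() {
  Holder<Id> a(Id(1));
  Holder<Id> b(Id(1));
  // Legal: only operator== is specialized; operator< is never type checked.
  return a == b;
  // Writing a < b instead would fail to compile: Id has no < operator.
}
```

The instantiation Holder<Id> compiles fine as long as no code uses its < operator.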
17.3.4 Multiple Close Brackets Must Be Separated by
Whitespace
Sometimes you will need to close multiple template angle
brackets in succession. For example, suppose you were to
instantiate the std::vector template (which we will learn
about in Section 17.4.1) with a std::pair (which we will learn
about in Section 17.4.2) with an int and a std::string:
1 std::vector<std::pair<int, std::string> > myVector;
The white space between the two >s is required. If we
wrote this code without the space:
1 //illegal: needs a space between the >>s
2 std::vector<std::pair<int, std::string>> myVector;
Then we will get an error such as:
error: ’>>’ should be ’> >’ within a nested
template argument list
std::vector<std::pair<int,std::string>>
myVector;
17.3.5 Dependent Type Names Require Keyword
typename
In C and C++, the compiler must figure out whether a name
refers to a type in order to determine how to parse the code.
Consider examples such as x & y, x * y, and x(y). Each of
these fragments of code is legal whether x names a type or is a
variable but has a completely different meaning. In the case of
x & y, this code fragment either declares an x reference called
y (if x names a type) or is an expression using the binary & (bit-
wise and) operator on the variables x and y. Likewise, x * y
either declares a pointer to an x or multiplies x by y. The
expression x(y) either creates an unnamed temporary of type x
(passing in y to its constructor) or calls a function named x with
argument y.
While this necessity may be a bit of a pain in the normal
course of C and C++ compilation, the compiler can typically
simply check to see if x names a type or not. However, if T is a
template parameter that names a type (e.g., we are inside the
scope of template<typename T>), then the compiler has a
difficult time determining if T::x names a type or a variable.
Likewise, Something<T>::x is difficult to figure out. In these
two situations, we call x a dependent name, as its interpretation
depends on what T is.
Whenever you use a dependent name for a type, you need
to explicitly tell the compiler that it names a type by placing the
typename keyword before the name of the type. For example:
1 template<typename T>
2 class Something {
3 private:
4 typename T::t oneField;
5 typename anotherTemplate<T>::x twoField;
6 public:
7 typename T::t someFunction(typename T::t x) {
8 typename T::t someVar;
9 //some code
10 }
11 };
This example (which is purely intended to demonstrate the
syntax, not any useful template or class) shows five uses of
dependent type names: declaring the two fields, the return type
of the function, the parameter type of the function, and the local
variable in the function. Each of these requires us to explicitly
add the typename keyword, as shown in the example.
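For a runnable sketch, we can use the dependent name C::value_type, a nested type that STL containers such as std::vector really do provide:

```cpp
#include <vector>

// C::value_type is a dependent name: its meaning depends on what C is,
// so we must tell the compiler it names a type with the typename keyword.
template<typename C>
typename C::value_type firstElement(const C & container) {
  typename C::value_type result = *container.begin();
  return result;
}

int typenameDemo() {
  std::vector<int> v(3, 7);  // three copies of 7
  return firstElement<std::vector<int> >(v);
}
```

Omitting either typename in firstElement makes the template fail to compile.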
We will note that there are a variety of complexities in the
rules related to dependent names, which we are not going to
delve into here. There are also circumstances where a
dependent name has a templated class or function inside of it,
and the compiler must explicitly be told that name refers to a
template (via the template keyword) to disambiguate the < that
encloses the template’s arguments from the less-than operator.
These complexities are not really required for a basic grasp of
C++ programming and template use—the above description of
the rules is sufficient for what we will do in this book.
However, you should know that this description is not the entire
“can of worms” if you plan to proclaim yourself as a “C++
expert.”
17.3.6 You Can Provide an Explicit Specialization
You can explicitly write a specialization for a template—
providing specific behavior for particular values of the template
arguments. Often explicit specialization is performed to provide
a more efficient implementation of the exact same behavior as
the “primary template” (the original template, which you are
explicitly specializing). However, there is no rule to enforce
this behavior. You could explicitly specialize a template to
behave completely differently from the primary template.
Doing so would generally be a bad idea, as it would confuse
anyone (including yourself) using the code.
Explicit specializations may either be partial (you only
specialize with respect to some of the parameters, leaving the
resulting specialization parameterized over the others) or
complete (you specialize all of the parameters). If you use
partial specialization, you should be aware of the rules that C++
uses to match template instantiations to the partial
specializations you have written, which we will not go into
here.
In fact, we will not say much about this topic other than
that it exists. We will mention it briefly again in Section E.6.11
when we discuss C++11’s variadic templates, but otherwise we
will not have much need for it in this book.
17.3.7 Template Parameters for Functions (But Not
Classes) May Be Inferred
When you use a templated function, you can omit the angle
brackets and template arguments in cases where the compiler
can infer them—that is, if the compiler can guess what to use
based on other information, such as the types of the parameters
to the function in question. However, we recommend against
this practice. It is much better to be explicit and make sure you
and the compiler both agree on what types are being used.
Being explicit also helps with readability—someone else (or
yourself later) examining your code knows exactly what is
happening. Furthermore, the compiler may infer a perfectly
legal type that you did not intend. We mostly mention this rule
in case you see code with it, so that you will not be surprised
but instead will understand what is going on.
For templated classes, the arguments must always be
explicitly specified. The compiler will never try to infer them.
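A sketch contrasting the two call styles, using a small hypothetical myMax template:

```cpp
// A small hypothetical templated function to illustrate inference.
template<typename T>
T myMax(T a, T b) {
  return (a < b) ? b : a;
}

int inferDemo() {
  int explicitCall = myMax<int>(3, 5);  // explicit: what we recommend
  int inferredCall = myMax(3, 5);       // T inferred as int from the arguments
  return explicitCall + inferredCall;   // both calls return 5
}
```

Both calls instantiate the same myMax<int> specialization; only the syntax differs.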
17.4 The Standard Template Library
C++ has a Standard Template Library (STL), which
provides a variety of useful templated classes and functions. We
will obviously not describe everything in the STL here;
however, we will mention a few important and useful things
from it. We will see some other parts of the STL as we progress
through Part III and learn about data structures and algorithms
that have STL implementations.
17.4.1 The std::vector Templated Class
One useful class in the STL is the std::vector<typename T>
templated class.2 A vector is similar to an array in that it stores
elements of some other type—namely, the type that is its
template parameter T. Like an array, the vector’s elements may
be accessed with the [] operator. You should
#include <vector> if you want to use this class.
Unlike an array, the size of a vector can be changed via
methods such as push_back (which adds an element to the end
of the vector) and insert (which inserts an element at an
arbitrary3 position). The size of the vector can also be reduced
by removing elements from it with methods such as pop_back
(which removes the last element of the vector) and erase
(which removes an arbitrary element).
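A short sketch of these methods in action (insert and erase take a position expressed as an iterator, which we introduce in Section 17.4.3):

```cpp
#include <vector>

int vectorOpsDemo() {
  std::vector<int> v;
  v.push_back(10);          // v = {10}
  v.push_back(20);          // v = {10, 20}
  v.insert(v.begin(), 5);   // v = {5, 10, 20}: insert at the front
  v.pop_back();             // v = {5, 10}
  v.erase(v.begin());       // v = {10}
  return v[0];              // the one remaining element
}
```
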
vectors also provide a variety of features that C’s arrays
(which are just pointers) do not. C++’s vectors provide a copy
constructor and assignment operator, so that a vector can be
copied, making copies of all of its elements (of course, a
destructor is also provided to free up the memory internally
allocated by the vector). Additionally, a vector keeps track of
its size, which can be queried with the size method.
The vector template also provides for overloaded
operators to compare two vectors for equality or ordering. The
== operator checks if the two vectors are the same size and, if
so, compares each element in its left-hand operand to the
corresponding element in the right-hand operand for equality.
Of course, this comparison requires the == operator to be
defined on Ts, which it uses to compare the individual elements
to each other.
The == operator for vectors is a great example of the
template rule that we discussed in Section 17.3.3—that type
checking for templates only happens on the specializations that
are actually used. The STL can define the vector template
despite the fact that not all types have == defined on them. We
can then instantiate the vectors with any type. For any of these
types (T) that do define ==, we can compare vector<T>s for
equality. We cannot compare vectors for equality if the type of
objects they hold does not have == defined on it.
vectors also have a < operator, which orders two vectors
(holding the same type) lexicographically. Recall that in
Section 10.1.3, we described lexicographic ordering as being
“what you would think of as ’alphabetical ordering’ but
extended to encompass the fact that strings may have non-
letters.” Lexicographic ordering is actually a bit more general
than that. It applies to any sequence and is determined by
comparing each element of the sequence in order from the
first towards the last. If two elements are equal, then the
comparison continues to the next pair of elements. When two
differing elements are encountered, their order determines the
lexicographic ordering of the entire sequence. Basically, it is
“how alphabetical ordering works” but generalized to any
sequence of any type of thing, where that type of thing can be
compared for ordering.
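A sketch of both comparisons on small vectors:

```cpp
#include <vector>

bool lexDemo() {
  std::vector<int> a;
  std::vector<int> b;
  a.push_back(1); a.push_back(2);  // a = {1, 2}
  b.push_back(1); b.push_back(3);  // b = {1, 3}
  bool equal = (a == b);  // false: same size, but the second elements differ
  bool less = (a < b);    // true: the first differing pair is 2 < 3
  return !equal && less;
}
```
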
vectors are incredibly useful and typically preferred over
“bare” arrays in C++. You can read more about the particulars
of them in their online reference:
http://www.cplusplus.com/reference/vector/vector/.
17.4.2 The std::pair Templated Class
Another useful templated class is the std::pair class, which is
templated over two types, T1 and T2. All the pair class does is
provide a way to pair two objects—one of type T1 and one of
type T2—together into a single object. The elements of the pair
can then be accessed via the public fields first and second.
Grouping two items together into one is a simple concept
but quite useful. There are many cases in programming
languages in which you are restricted to “one thing” but really
want to have two. For example, we can only return one value
from a function. Sometimes, we find ourselves wanting to
return two values, but we cannot do that in C or C++. However,
we can make the return type of the function a pair, combine
the two values into a pair, and return that pair. Another
example appears in a variety of data structures, such as the
std::vector we just discussed. Each index of a vector (or
array) can only hold one value. However, we might want to
have a vector (or array) of pairs of data.
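For instance, a sketch of a hypothetical divide function that "returns two values" by packaging the quotient and remainder together in a pair:

```cpp
#include <utility>

// Return the quotient and remainder together in a std::pair.
std::pair<int, int> divide(int num, int denom) {
  return std::pair<int, int>(num / denom, num % denom);
}
```

The caller then accesses the two results as divide(17, 5).first (the quotient, 3) and divide(17, 5).second (the remainder, 2).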
17.4.3 Iterators
C++’s STL has a variety of “container” classes (classes whose
job is to hold other data). Many of these container classes are
designed to provide similar interfaces to each other. Among
other reasons, providing a similar interface between these
classes allows code that uses them to be templated over the
container class, allowing it to work flexibly with any of them.
In fact, the STL provides a variety of algorithms (which we will
discuss shortly) designed to work with most of its containers.
Many algorithms need to iterate over the data structure
they work on (as you have likely experienced extensively by
now). For vectors and arrays, accessing a particular element by
numerical index (i.e., myVector[i]) is quite efficient. However,
as we shall see in Part III, there are other data structures where
this style of access is quite inefficient (we will also formalize
what we mean by “efficient” in Chapter 20)—the data structure
has to internally iterate through its elements to find the element
in question. That is, if we wrote:
1 for (int i = 0; i < someThing.size(); i++) {
2 x = someThing[i];
3 }
Then the someThing[i] operation (which would be an
overloaded operator in whatever type someThing is) may take
significant computation to get the proper element.
Often in these situations, going from one element to the
“next” one is still efficient—the inefficiencies arise from
seeking an arbitrary element where the “go to next” operation
must be done several times internally. If we were willing to
make the data structure expose its internal representations to the
external code that needs to iterate over it, we could solve this
inefficiency—our code could keep track of the internal state of
the structure and perform efficient operations to go from one
element to the next. Of course, exposing the internal
representations of a class to the rest of the code goes against the
principles of abstraction and object-oriented programming.
The approach C++ takes to resolve this tension between
good abstraction and high performance is to provide an
abstraction called an iterator. An iterator is a class (typically
internal to a data structure) that encapsulates the state of the
internal traversal while providing an implementation-
independent interface to external code. Put a different way, the
iterator (which is a class inside of the data structure—thus it is
fine for it to know the internal representation details) tracks
where the traversal is in whatever fashion is efficient for the
data structure it belongs to but gives a simple interface to other
code.
It is easiest to see how an iterator works by an example.
We will start with a std::vector, since that is the data
structure we are familiar with:
1 std::vector<int>::iterator it = myVector.begin();
2 while (it != myVector.end()) {
3 std::cout << *it << "\n";
4 ++it;
5 }
The type std::vector<int>::iterator is the iterator
class defined inside of vector. We initialize it with
myVector.begin() (myVector is a std::vector<int> declared
and filled in somewhere before the example begins). The begin
function in vector (and other classes that support iterators)
returns an iterator positioned at the start of the data structure. In
the particular case of a vector, the iterator might be
implemented as simply holding the current index (as indexing
is efficient), in which case the iterator returned by begin would
hold 0.
We then have a loop in which we compare our iterator (it)
against the return result of myVector.end(), which returns an
iterator that is past the last element of the data structure (if we
think of our vector’s iterator as holding an index, then end
would return the size of the vector, as size - 1 is the last
valid index). If it is equal to myVector.end(), then we are
done with the traversal.
If it is not equal to myVector.end(), then we enter the
body of the while loop and print out an element of the vector
(and a newline). We access the particular element of the vector
the iterator refers to by dereferencing the iterator with its
overloaded * operator, which returns a reference to that
element.
At the end of the loop, we increment the iterator with
++it, which moves it to the next element. If we think of the
iterator for a vector as internally holding an index into the
vector, this operation would increment that integer. You may
wonder why we have written ++it rather than the more familiar
it++. They are functionally equivalent, but ++it is more
efficient due to the difference in the semantics of the operators.
The postfix increment operator (it++) evaluates to the value
before the increment was performed, while the prefix increment
operator (++it) evaluates to the value after the increment is
performed. Although this distinction seems minor, the postfix
increment requires the operator to make a copy of the original
object, perform the increment, then return the copy of the
object (which would then be destroyed as we are not using it).
In general, you should use the prefix increment operator
whenever the increment is on an object type to avoid these
unnecessary copies and destructions.
This loop with a vector may seem silly—after all, we can
just write a for loop from 0 to the size of the vector and make
use of the [] operator to get a reference to a particular element.
However, if we were to use some other data structure (such as a
std::set, which we will talk about in Section 20.5), we could
not index it with an integer, but we could use almost identical
code with iterators:
1 std::set<int>::iterator it = mySet.begin();
2 while (it != mySet.end()) {
3 std::cout << *it << "\n";
4 ++it;
5 }
In fact, if you look carefully, you will notice that the only
substantive change (as opposed to just changing the
variable’s name) between the std::set example and the
std::vector example is the type of it. We could even write a
templated function:
1 template<typename T>
2 void printElements(T & container) {
3 typename T::iterator it = container.begin();
4 while (it != container.end()) {
5 std::cout << *it << "\n";
6 ++it;
7 }
8 }
We could then use this templated function to print elements
from any container that supports iterators. Observe how we
have put abstraction to good use: separating the interface for
iterating (efficiently) across a collection of elements from the
implementation of doing so.
Of course, the previous example is a bit unappealing to the
meticulous programmer, as it is not const-correct (that is,
container should be a const T &, as we do not modify its
elements). However, begin returns an iterator that allows the
elements it references to be modified (as it returns a reference,
not a const reference), so we cannot call it on a const object.
We can make this code more correct by using a
const_iterator, which is just like an iterator except that
dereferencing it provides a const reference to the underlying
element:
1 template<typename T>
2 void printElements(const T & container) {
3 typename T::const_iterator it = container.begin();
4 while (it != container.end()) {
5 std::cout << *it << "\n";
6 ++it;
7 }
8 }
Note that there is no mystery or magic in begin returning
an iterator in the first examples and const_iterator in this
most recent example. This difference is just a result of the fact
that begin is overloaded with the following two signatures (in
the STL classes and in other well designed classes):
1 iterator begin();
2 const_iterator begin() const;
As we discussed previously, these differ in the type of
their this argument and thus are a legal overloading. The
second has this as a pointer to a const object. Therefore, when
we invoke begin on a const object, overloading resolution
selects the second, and the return type is a const_iterator.
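As another sketch of this abstraction, we can write a summing variant of printElements (our own hypothetical helper, not part of the STL), which works unchanged on a vector and a set because it relies only on the const_iterator interface:

```cpp
#include <set>
#include <vector>

// A summing variant of printElements: the same template works on any
// container of ints that supports const_iterators.
template<typename T>
int sumElements(const T & container) {
  int total = 0;
  typename T::const_iterator it = container.begin();
  while (it != container.end()) {
    total += *it;
    ++it;
  }
  return total;
}

int iterDemo() {
  std::vector<int> v;
  v.push_back(1);
  v.push_back(2);
  v.push_back(3);
  std::set<int> s(v.begin(), v.end());     // same elements, different structure
  return sumElements(v) + sumElements(s);  // 6 + 6
}
```
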
C++ has a variety of kinds of iterators, which vary in how
they can traverse their data structure. Some iterators are
restricted in what they can do—forward iterators can only move
forward (via ++), and not backwards (via --), while
bidirectional and random access iterators can go both forwards
and backwards. However, random access iterators can move
forwards or backwards by more than one element at a time (via
+= and -=), which the other types cannot. We are typically
going to work only with basic iterator functionality
(comparison for equality, dereference, increment), so we are not
going to be greatly concerned with the fancier features of the
various types. Of course, if you want to know more about them,
you can read plenty online:
http://www.cplusplus.com/reference/iterator/.
One other important consideration when using an iterator
is what happens when the underlying data structure changes by
having elements inserted or removed. The simplest way to see
why such a modification might pose a problem is to consider
the case where the element an iterator refers to is removed
while the iterator still refers to it. Now, the iterator references
something no longer in the data structure—does it even make
sense to dereference this iterator? Can we move from the
position of a nonexistent element in a meaningful way?
Whenever a modification occurs, an iterator may be
invalidated by that modification—use of an invalidated iterator
is incorrect (that is, it is an error in your program—which will
likely crash, but even if it does not, your program is broken).
The specific circumstances under which an iterator is
invalidated vary from data structure to data structure, so you
should consult the documentation for the class and operation
you are using to determine its impact on the validity of your
iterators.
17.4.4 Algorithms from the STL
The STL also provides a variety of algorithms (for most of
these, you should #include <algorithm>). The simplest of
these are the min and max algorithms, which choose the
smallest/largest of their two parameters. There are two versions
of these templates. The first simply compares the two
parameters with the < operator. The second allows for a custom
comparison function to determine the ordering.
The second version (which allows a custom ordering) is
implemented by having a second template parameter that
specifies a class with an overloaded () operator (the “function
call” operator). Overloading the function call operator on a
class allows the programmer to place parentheses with
arguments after an instance of that class in a way that looks like
a function call—resulting in a call to the overloaded operator.
The function call operator can be overloaded with any number
of parameters. An object with “function call” behavior (i.e., an
overloaded function call operator) is referred to as a function
object.4
To see an example of the min algorithm with a custom
comparison function object, we can start with a class that
compares two std::strings in a case-insensitive fashion:
1 class OrderIgnoreCase {
2 public:
3 bool operator()(const std::string & s1, const std::string & s2) const {
4 for (size_t i = 0; i < s1.size() && i < s2.size(); i++) {
5 char c1 = tolower(s1[i]);
6 char c2 = tolower(s2[i]);
7 if (c1 != c2) {
8 return c1 < c2;
9 }
10 }
11 return s1.size() < s2.size();
12 }
13 };
In this example, we overload the function call operator to
take two const std::string &s, which are the strings to
compare for ordering. The syntax may seem a bit odd; however,
nothing is actually unusual—the name of the operator is just
operator(), which is then followed by the parameter list in
parentheses. The function then compares the strings ignoring
the case of each letter (by converting the letters to lower case
before comparing them to each other).
We could then use this class with the min algorithm to
find the minimum of two strings and ignore case:
1 std::min<std::string, OrderIgnoreCase>(str1, str2, OrderIgnoreCase());
Of course, the STL has a variety of other algorithms
beyond just min and max. Many of these make use of iterators
(allowing them to work on a variety of containers) and function
objects (allowing them to abstract out how various operations
are performed, giving them more flexibility). We recommend
consulting their documentation to find out details about any of
them: http://www.cplusplus.com/reference/algorithm/.
17.5 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 17.1 : Why are templates useful?
• Question 17.2 : What is parametric polymorphism?
• Question 17.3 : When is a template type checked?
What are the implications of this for what is and is not
legal in templated code?
• Question 17.4 : What is wrong with writing
std::vector<std::pair<int,int>> ?
• Question 17.5 : Write a templated function that takes in
an array of Ts and an int for the number of items in the
array and returns a count of how many of the items are
“even”. For this function, “even” means that an item
mod 2 is equal to 0. You can assume that this template
will only be applied to types where % is overloaded on
Ts and ints.
• Question 17.6 : What does it mean for a template
parameter to be “inferred”? Under what circumstances
will this happen?
• Question 17.7 : Rewrite your countEven function to
operate on a std::vector instead of an array. Your
function should take its parameter as a const reference
to a vector (i.e., a const std::vector<T> &).
• Question 17.8 : Rewrite your countEven so that it can
operate on the iterators within any type (i.e., its two
parameters are T::iterators).
• Question 17.9 : In the previous question, why did you
have to use typename in the parameter declarations?
• Question 17.10 : Write a function
vector<pair<T2, T1> > flipPairs(const vector<pair<T1, T2> > & v),
which takes a vector of pairs and returns a vector of
pairs in which the first and second elements have been
switched.
Chapter 18
Inheritance
C++, like many object-oriented languages, supports
inheritance. Inheritance is the ability to declare a class (called
the child class or subclass) in such a way that it obtains all of
the fields and methods of another class (called the parent class
or superclass). The child class can have more fields and
methods of its own added to it and can override the behavior of
the methods it inherited.
Inheritance is best used when two classes exhibit an is-a
relationship. For example, if we were designing a set of
components for a graphical interface, we might have a class for
a Button. If we were to want to create a class for a button with
an image on it, (e.g., an ImageButton class), we would want to
use inheritance. An ImageButton is a Button; it just adds more
features (having an image instead of just text) and overrides
some behaviors of the parent class (how it is drawn). In this use
of inheritance, Button is called the parent class, super class or
base class, and ImageButton is called the child class, subclass
or derived class. We would say that ImageButton inherits from
or extends Button.
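To make the is-a relationship concrete, here is a minimal sketch of what the Button/ImageButton pair might look like in C++. The specific fields and methods here (getText, getImageFile, handleClick) are purely illustrative assumptions, not from any real GUI library:

```cpp
#include <cassert>
#include <string>

class Button {
protected:
  std::string text; // the label displayed on the button
public:
  Button(const std::string & t) : text(t) {}
  std::string getText() const { return text; } // inherited by ImageButton
  void handleClick() { /* shared click-handling code, written once */ }
};

class ImageButton : public Button { // an ImageButton is-a Button
  std::string imageFile;            // added field: the image to display
public:
  ImageButton(const std::string & t, const std::string & img)
    : Button(t), imageFile(img) {}
  std::string getImageFile() const { return imageFile; }
};
```

Note that ImageButton declares only what it adds; the text field, getText, and handleClick all come from Button without being rewritten.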
Although the ImageButton in our example will change
some behaviors, much of the code for the original Button class
(e.g., the code to process the button being clicked and execute
some other function in response to it) remains the same. Using
inheritance allows the ImageButton to inherit this functionality
from the Button class, rather than requiring the programmer to
rewrite (or copy and paste) the code in multiple places.
Avoiding this duplication of code makes our program easier to
write correctly and easier to maintain.
Contrast the is-a relationship between an ImageButton and
a Button with a has-a relationship. A Button has a string for
the text that appears on it, but we would not say that “A Button
is a string.” Whenever two types exhibit a has-a relationship,
inheritance is inappropriate. Instead, one should have a field for
the other—the Button class should declare a field of type
string for its text.
18.1 Another Conceptual Example
As another example of inheritance (which we will use in many
of the examples throughout this chapter), consider the
BankAccount class we have used in prior examples. If we were
actually making a program to manage bank accounts, we might
have specific subtypes of bank account, which have additional
properties and slightly different behaviors. For example, we
might have an InvestmentAccount, which has a list of stocks
that are in the account in addition to the cash balance. In this
case, inheritance is again appropriate—an InvestmentAccount
is a BankAccount.
We are not limited to one level of inheritance. When
appropriate, we can inherit again from a class that is itself a
child class, using it as the parent class of a third class. In our
BankAccount/InvestmentAccount class, we might have a
special type of InvestmentAccount that allows people to “trade
on margin” (i.e., buying stocks with borrowed money), called a
MarginAccount. In such a case, MarginAccount could extend
InvestmentAccount.
Figure 18.1: A diagram of the example BankAccount inheritance hierarchy.
We can also make many different child classes with the
same parent class. For example, in addition to having
MarginAccount inherit from InvestmentAccount, we could
also have RetirementAccount inherit from
InvestmentAccount. All of these classes form an inheritance
hierarchy—the collection of all classes that share a common
“ancestor” class (parent class, “grandparent” class, etc.).
Figure 18.1 shows a diagram of the classes in our example
inheritance hierarchy. In this diagram, each class is represented
by three boxes, one with the class’s name, a second with the
class’s fields, and a third with the class’s methods. The arrow
indicates the “inherits from” (“is-a”) relationship between the
classes. For example, the BankAccount class has two fields: a
double for the balance, and an unsigned long for the account
number. It also has three methods, one to deposit money, one to
withdraw money, and one to get the account balance.
The InvestmentAccount class inherits the fields and
methods BankAccount has, so we do not show them again.
However, we add some new fields: a vector of
pair<Stock *, double>s (which would contain each Stock
and how many shares of it the account has) and a count of how
many trades were executed this month (so we can charge different
commission rates based on how frequently the owner trades).
We then add some methods to buy and sell stock as well as
compute the market value (the sum of the values of the stocks
and the cash balance).
The MarginAccount inherits from InvestmentAccount.
This class inherits the balance, acctNumber, stocks, and
tradesThisMonth fields from InvestmentAccount. It also
inherits the deposit, withdraw, sellStock, getBalance, and
getMarketValue methods. However, this class needs to specify
different behavior for the buyStock method rather than using
the behavior it would inherit from its parent class (i.e., rather
than indicating an error if there is not enough cash in the
account, it could borrow against the margin). Accordingly,
MarginAccount would override the buyStock method—
providing its own definition in place of the one it would
normally inherit from its parent.
Similarly, RetirementAccount inherits from
InvestmentAccount but overrides deposit (to check the
contribution against the annual contribution limit) and
withdraw (as withdrawing from a retirement account has
different tax consequences than withdrawing from a regular
account).
18.2 Writing Classes with Inheritance
To use inheritance in C++, the class declaration for the child
class has a colon, followed by an access specifier, and then the
parent class name between the class’s name and the open curly
brace of the class declaration. In our BankAccount example, we
would write:
class BankAccount {
  double balance;
  unsigned long acctNumber;
public:
  void deposit(double amount);
  double withdraw(double amount);
  double getBalance() const;
};
class InvestmentAccount : public BankAccount {
  vector<pair<Stock *, double> > stocks;
  unsigned tradesThisMonth;
public:
  void buyStock(Stock whichStock, double numShares);
  void sellStock(Stock whichStock, double numShares);
  double getMarketValue() const;
};
We would also add constructors and destructors to each
class as needed and write the method bodies (either inline or
separately, as we have seen previously). Observe that we do not
re-declare the fields or methods that the child class is inheriting
from its parent.
In fact, if we do declare fields of the same name as those
in the parent class, we end up with objects that have two
different fields of the same name. They are distinct in that they
have different fully qualified names. In this example, if we
added a duplicated field called balance, code in
InvestmentAccount would reference it by default, and would
refer to the inherited field by BankAccount::balance.
However, creating multiple fields with the same name in this
fashion is generally an indication of poor design or a lack of
understanding of how inheritance works.
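This shadowing can be seen in a minimal (and deliberately poorly designed) sketch; the names Base, Derived, and value are hypothetical:

```cpp
#include <cassert>

class Base {
public:
  int value;
  Base() : value(1) {}
};

class Derived : public Base {
public:
  int value; // shadows (does NOT replace) Base::value -- poor style!
  Derived() : value(2) {}
  // Unqualified 'value' refers to Derived::value by default:
  int ownValue() const { return value; }
  // The inherited field is still there, under its fully qualified name:
  int inheritedValue() const { return Base::value; }
};
```

A Derived object really contains two distinct int fields, which is almost never what the programmer intended.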
The public access specifier in the inheritance declaration
(: public BankAccount) specifies how the access of inherited
members should be changed. Using public inheritance, as we
have done in this example, specifies that the access should be
unchanged from that declared in the parent class: public
members remain public, and private members remain private.
Alternatively, one may use private inheritance (by writing
: private BankAccount), in which all inherited members
become private in the child class.
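The difference between public and private inheritance can be sketched as follows (Parent, PublicChild, and PrivateChild are hypothetical names chosen for illustration):

```cpp
#include <cassert>

class Parent {
public:
  int getTen() const { return 10; }
};

class PublicChild : public Parent {};  // getTen remains public

class PrivateChild : private Parent {  // getTen becomes private
public:
  // Inside the class, the inherited member is still usable:
  int callInside() const { return getTen(); }
};

// Usage:
//   PublicChild pc;  pc.getTen();   // OK: still public
//   PrivateChild qc; qc.getTen();   // compiler error: getTen is
//                                   // private in PrivateChild
//   qc.callInside();                // OK: public wrapper
```

With private inheritance, outside code can no longer treat a PrivateChild as a Parent, which is why private inheritance is rarely used.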
A third option is available, using an access specifier we
have not yet seen: protected. Using protected inheritance
makes public members of the parent class into protected
members of the child class. Members of a class may be
declared protected, which means that they may be accessed by
members of the class or by members of any of its child classes.
That is, if we write:
class A {
protected:
  int x;
};
class B : public A {
  void someFunction() {
    x++;
  }
};
The reference to x inside of class B (which is a child class of A)
is legal. The field x is protected, so the child classes can access
it. However, code outside of classes A and B (and any other
child classes of A) would not be able to access x directly. Of
course, classes can also declare other classes as friends to
grant them access to private/protected members.
C++ has one slightly subtle restriction on the use of
protected members in a child class: they may only be accessed
through a pointer/reference/variable of the child class’s own
type. In the example above, the access to x is performed
through the implicit this pointer, which has type B * const, so
it is legal according to this rule. However, if we wrote:
class A {
protected:
  int x;
};
class B : public A {
  void someFunction(A * anA) {
    anA->x++;
  }
};
We would instead receive the following compiler error:
In member function ’void B::someFunction(A*)’:
error: ’int A::x’ is protected
error: within this context
Here, the problem is that we are accessing the field x in
class B through a pointer that is of type A *.
18.3 Construction and Destruction
When objects that use inheritance are constructed or destroyed,
the constructors and destructors for their parent classes are run
to initialize/cleanup the parent portion of the objects. The first
thing that happens when creating a new object is that the
constructor for its parent-most ancestor is run to initialize that
portion of the object. Then the constructor for the next closest
ancestor is run, and so on. Finally, the constructor for the class
itself is run to initialize that class. To be more concrete,
consider an inheritance hierarchy in which we have class A
(which inherits from nothing), class B, which inherits from A,
and class C, which inherits from B. Whenever we create an
object of type C, the first thing that happens is that we run the
constructor for A, followed by the constructor for B, and finally
the constructor for C. In destroying an object, the destructors
are run in the reverse order: from the class itself, up the chain
of parent classes (in this example: C, B, then A).
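This ordering can be observed directly by logging from each constructor and destructor. The following sketch uses a hypothetical global log string to record the order in which the bodies run:

```cpp
#include <string>

std::string log; // records the order constructors/destructors execute

class A {
public:
  A() { log += "A"; }
  ~A() { log += "~A"; }
};
class B : public A {
public:
  B() { log += "B"; }
  ~B() { log += "~B"; }
};
class C : public B {
public:
  C() { log += "C"; }
  ~C() { log += "~C"; }
};

void demo() {
  C c; // constructors run A, B, C; at scope exit, destructors run ~C, ~B, ~A
}
```

After demo() returns, log holds "ABC~C~B~A": construction proceeds from the parent-most ancestor down, and destruction runs in exactly the reverse order.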
In C++, the type of an object actually changes during this
construction process. In the example above, the new operator
allocates space for an entire C object, but the type of the object
is initially set to A. Only after the constructor for A finishes,
does the type of the object change to B. The type of the object
remains as B while the constructor for B executes. Finally, the
type of the object becomes C after B’s constructor completes,
right before C’s constructor begins. At this point, the actual type
of the object remains the same until it is destroyed. We will
show the type of an object in our diagrams at the top of the
object’s box with blue text on a gray background.
During destruction, the process happens in reverse;
however, it stops if any parent class’s destructor is trivial. When
destroying a C, C’s destructor executes first (assuming it is
nontrivial). The code in C’s destructor executes, followed by the
destructors for any fields in C (in reverse order of construction
as usual). If the parent class’s destructor (in this example, B) is
nontrivial, then it is called. The type of the object changes from
C to B once control enters B’s destructor, right before any of the
code in B’s destructor begins. After B’s destructor completes and
its fields are destructed, A’s destructor must be run (unless it is
trivial). Once control enters A’s destructor, the type of the object
changes to A. After all nontrivial destructors complete, the
memory allocated to the object can be released. While the
actual type of the object during this process may seem a subtle
pedantic distinction, we will see that it can actually make a
difference in how code executes when we learn a bit more
about inheritance.
We will briefly note that C++’s design choice in this
regard differs drastically from Java’s (Java is another object-
oriented language). If we were to examine our example in Java,
we would still run the constructors in the same order (A, then B,
then C); however, the object would have type C as soon as it is
created, before entering A’s constructor. Java does not have
destructors in the same fashion as C++.
All of the above discussion has referenced calling the
constructor of a particular class. However, in C++, it is possible
to overload the constructor, meaning there may be multiple
ones available. This possibility raises the questions of how the
programmer can select a particular constructor for the parent
class and how the programmer can pass in the desired
arguments to the parent class constructors that take them.
If the programmer does not explicitly specify a call to the
parent class’s constructor, then the default constructor is
implicitly used. If there is no default constructor (or if the
default constructor is private), then the compiler will produce
an error. If the programmer wishes to call some other
constructor explicitly, she writes the call to the parent class’s
constructor as the first element of the initializer list, by writing
the parent class’s name, then parentheses with the argument list.
For example, we might write:
class BankAccount {
  double balance;
  unsigned long acctNumber;
  static unsigned long nextAccountNumber;
public:
  BankAccount() : balance(0), acctNumber(nextAccountNumber) {
    nextAccountNumber++;
  }
  BankAccount(double b) : balance(b), acctNumber(nextAccountNumber) {
    nextAccountNumber++;
  }
};
class InvestmentAccount : public BankAccount {
  vector<pair<Stock *, double> > stocks;
  unsigned tradesThisMonth;
public:
  InvestmentAccount() : tradesThisMonth(0) { }
  InvestmentAccount(double balance) : BankAccount(balance),
                                      tradesThisMonth(0) { }
};
Here, the BankAccount class has two constructors: a
default constructor and one that takes the initial balance as a
double. The InvestmentAccount class (which inherits from
BankAccount) also has two constructors. In
InvestmentAccount’s default constructor, the programmer has
not written any explicit call to any of BankAccount’s
constructors, so an implicit call to the default constructor is
used (as if it were the first element of the initializer list). The
second (which takes a double for the initial balance as its own
argument) passes that argument to the BankAccount
constructor that takes a double for the initial balance. Note that
destructors do not take arguments, so there is no issue of
specifying which one to call or what arguments to pass it.
Video 18.1: Constructing and destroying objects
that use inheritance.
Video 18.1 shows what happens during object construction
and destruction when the classes involved make use of
inheritance.
18.4 Subtype Polymorphism
In Chapter 17, we learned that one way programmers can
increase code reuse (and thus reduce code duplication) is
through polymorphism—which allows the same code to operate
on multiple types. The form of polymorphism we saw in
Chapter 17 is parametric polymorphism because the type is a
parameter (to the templated class/function). Another form of
polymorphism, which is related to inheritance, is subtype
polymorphism.
Subtype polymorphism arises when one type
(InvestmentAccount) is a subtype of another type
(BankAccount), meaning that an instance of
InvestmentAccount is substitutable for an instance of
BankAccount. In OO languages such as C++ or Java, a child
class is a subtype of its parent class. By the nature of
inheritance we are guaranteed that anything that is legal to do to
the parent class (access a particular field, call a particular
method, etc.) is also legal to do to the child class. If the parent
class has a field called x, we know that the child class does as
well because it inherited that field. Likewise, if the parent class
has a method f(int), we know the child class has such a
method as well because it inherited that method—although it
may override it providing a different behavior.
Note that in C++, polymorphism is restricted by the access
modifier used in inheriting the parent class. If public
inheritance is used (which is the most common way), then
polymorphism may be freely used anywhere. If private or
protected inheritance is used, then polymorphism is only
permissible where a field with that access restriction could be
used (in the class or its friends for private inheritance, or in the
class, its subclasses, or any of their friends for protected
inheritance). If the preceding portions of this paragraph seem a
bit complex, do not worry too much, as they are only really
relevant if you have a compelling need for non-public
inheritance.
In C++, subtype polymorphism allows the programmer to
treat an instance of a child class as if it were an instance of one
of its parent classes. However, polymorphism is only applicable
when used with pointers or references. Concretely, if class A is
the (public) parent class of class B, then the following code is
legal:
void f(A * a) {
  ...
}
void g(B * b) {
  f(b); //uses polymorphism
}
Here, the function f is declared to take a pointer to an A. In
g, b is a pointer to a B. However, we can pass b as the argument
to f. Likewise, we can place pointers to Bs into an array of
pointers to As or assign a pointer to a B to a variable whose type
is a pointer to an A.
Remember that inheritance reflects an “is-a” relationship.
The fact that a B “is-an” A means that the fact that we can use a
B as an A is quite natural. Returning to our earlier example of an
InvestmentAccount and a BankAccount, it makes sense that we
can use an InvestmentAccount in any context where we need a
BankAccount. If we have an array (or vector) of
BankAccount *s, then placing an InvestmentAccount * into
that array makes perfect sense—it is a BankAccount.
To further see the benefits of subtype polymorphism,
consider if our BankAccount class had a method to accrue
interest:
class BankAccount {
  double interestRate;
  //other fields elided
public:
  //other methods and constructors elided
  void accrueInterest(double fractionOfYear) {
    balance += balance * interestRate * fractionOfYear;
  }
};
Now, suppose that we have a
vector<BankAccount *> allAccounts, which contains all of
the BankAccounts that our bank software tracks. Polymorphism
allows us to place any subclass of BankAccount into
allAccounts. For example, we might have the following code:
class Bank {
  vector<BankAccount *> allAccounts;
public:
  InvestmentAccount * createInvestmentAccount(double initialBalance) {
    InvestmentAccount * newAccount = new InvestmentAccount(initialBalance);
    allAccounts.push_back(newAccount); //uses polymorphism
    return newAccount;
  }
  void accrueInterestOnAllAccounts(double fractionOfYear) {
    vector<BankAccount *>::iterator it = allAccounts.begin();
    while (it != allAccounts.end()) {
      BankAccount * currentAccount = *it;
      currentAccount->accrueInterest(fractionOfYear);
      ++it;
    }
  }
};
In this example, we have a Bank class (which tracks all of
the accounts at one bank). This class has the allAccounts
vector, which contains a pointer to every account at the bank.
We have a method that creates a new investment account and
adds the pointer to that account to the allAccounts vector
before returning it to the caller. Adding this pointer (which is an
InvestmentAccount *) to the vector (which holds
BankAccount *s) is a use of polymorphism. The compiler
allows the implicit conversion from an InvestmentAccount *
to a BankAccount * because it knows that an
InvestmentAccount is a BankAccount.
The second method shows how polymorphism helps us
avoid duplicating code. In this method (which we would call at
the end of each month), we accrue interest for every account at
the bank. Even though the pointers in the vector may actually
point at InvestmentAccounts, RetirementAccounts, or
MarginAccounts (in addition to plain BankAccounts), we can
iterate over them all with one iterator and call accrueInterest
on each of them. This use of polymorphism also provides
benefits if we need to expand our code later. If our bank adds a
new type of account (which also extends BankAccount), we do
not need to change the accrueInterestOnAllAccounts method
at all—it just handles that new account type naturally.
The reason we can only apply polymorphism to pointers
and references is a matter of implementation. An
InvestmentAccount takes more space (in memory) than a
BankAccount. As such, if we have an InvestmentAccount
directly (i.e., not via a pointer or reference), it needs a “bigger
box” than a BankAccount. If a frame or object is laid out for a
BankAccount, it will only have a box of a size appropriate for a
BankAccount, not an InvestmentAccount. However, pointers
are the same size regardless of what they point to. We will learn
more about the implementation details (what happens “under
the hood”) to make inheritance and polymorphism work in
Chapter 29.
When we deal with pointers (or references) to objects in
the presence of polymorphism, it is important to understand the
difference between the static type and the dynamic type of the
object it points at. The static type is the type obtained by the
type checking rules of the compiler, which only uses the
declared types of variables. The dynamic type of the object is
the type of object that is actually pointed at.
For example, consider if we wrote
BankAccount * b = new InvestmentAccount();. Here, the
static type of *b is BankAccount—b is declared as a pointer to a
BankAccount, so no matter what type it actually points at, the
static type of *b is always a BankAccount. However, the
dynamic type of *b is InvestmentAccount. If we drew the
execution of this code by hand, we would see that the arrow in
b’s box points at an InvestmentAccount object.
This distinction is important because the compiler only
works with the static types. When the compiler type checks the
program, it must ensure that method calls (and field accesses)
are legal based only on the static types. If we tried to call
b->buyStock(someStock, amount) in the above example, the
compiler would reject it—as far as the compiler is concerned, b
points at a BankAccount, and BankAccount objects do not have a
buyStock method. Even in
cases where it is “obvious” to a person looking at the code that
the dynamic type will always be some more specific type, the
compiler does not use this fact during type checking.
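A minimal sketch of this rule, with hypothetical stand-in method bodies (the real classes would, of course, have fields and real logic):

```cpp
#include <cassert>

class BankAccount {
public:
  double getBalance() const { return 42.0; } // stand-in body for illustration
};

class InvestmentAccount : public BankAccount {
public:
  void buyStock() { /* ... */ }
};

double example() {
  InvestmentAccount acct;
  BankAccount * b = &acct; // static type of *b: BankAccount
                           // dynamic type of *b: InvestmentAccount
  // b->buyStock();        // compile error: 'BankAccount' has no member
                           // named 'buyStock' -- only the static type counts
  return b->getBalance();  // fine: BankAccount declares getBalance
}
```

Even though a person reading example() can see that b always points at an InvestmentAccount, the commented-out call would not compile.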
18.5 Method Overriding
A child class may override a method it inherits from its parent
class, specifying a new behavior for that method rather than
using the one it inherits. In our BankAccount example, we
might want to override the buyStock method in the
MarginAccount class rather than use the method it would
normally inherit from InvestmentAccount (to allow the
account’s owner to buy stock without enough cash in the
account).
We might try to override the method by simply writing
another method by the same name (and with the same
parameter list) in the child class. In some OO languages (such
as Java), this approach would work perfectly. However, in C++,
this approach does not override the behavior when we use
instances of the class with the new behavior polymorphically.
We will illustrate this concept with a simpler example:
#include <iostream>
#include <cstdlib>

class A {
public:
  void sayHi() {
    std::cout << "Hello from class A\n";
  }
};

class B : public A {
public:
  void sayHi() {
    std::cout << "Hello from class B\n";
  }
};

int main(void) {
  A anA;
  B aB;
  A * ptr = &aB;
  anA.sayHi();
  aB.sayHi();
  ptr->sayHi();
  return EXIT_SUCCESS;
}
The output would be:
Hello from class A
Hello from class B
Hello from class A
Figure 18.2: Depiction of the contents of main’s frame when we call the sayHi
methods.
Observe that the first two calls to the sayHi method have
straightforward behavior: anA is an A, so it uses the sayHi
method in class A, and aB is a B, so it uses the sayHi method in
class B. However, when we do ptr->sayHi(), we have slightly
more interesting behavior.
For ptr, the static type (recall: the type the compiler is
aware of) of the object it points at is an A. We can see this fact
just by looking at the code (which is really all the compiler
sees)—ptr was declared as a pointer to an A, so that is the type
the compiler knows. However, the dynamic type (recall: the
type of object it is actually pointing at) is a B. To see this fact,
we need to execute the code by hand. Doing so would yield a
state as shown in Figure 18.2 when we reach the calls to the
sayHi methods. Here, we can see that ptr is actually pointing
at a B object.
With the static and dynamic types being different, a
natural question is “which one dictates what method to call?”
As you can see from the output we gave earlier, the static type
was used to determine the method to call—A’s sayHi was
invoked for the ptr->sayHi call. The approach of having the
static type determine which method to call is called static
dispatch and is the default behavior in C++ unless we request
otherwise.
However, static dispatch disagrees with what we typically
would want in the way we would use inheritance and
polymorphism. Returning to our earlier example with
MarginAccount’s buyStock method, we certainly want the
method call to dispatch to the overridden method any time the
owner attempts to buy stock.
The behavior we desire in this case (and most cases) is
dynamic dispatch, in which the dynamic type of the object
determines which method to invoke. If we want a method to be
dynamically dispatched, we have to declare it as virtual. If we
changed our simpler example to use dynamically dispatched
methods, by declaring them virtual:
#include <iostream>
#include <cstdlib>

class A {
public:
  virtual void sayHi() { //note the "virtual" keyword
    std::cout << "Hello from class A\n";
  }
};

class B : public A {
public:
  virtual void sayHi() {
    std::cout << "Hello from class B\n";
  }
};

int main(void) {
  A anA;
  B aB;
  A * ptr = &aB;
  anA.sayHi();
  aB.sayHi();
  ptr->sayHi();
  return EXIT_SUCCESS;
}
The output would become:
Hello from class A
Hello from class B
Hello from class B
Notice how the last line has changed. Now, the call to
sayHi is dynamically dispatched. When the static and dynamic
types are the same (the first two calls), this change does not
make a difference in which method is called. However, when
the static and dynamic types differ (as in the ptr->sayHi()
call), the result changes. We now use dynamic dispatch, and
call the method corresponding to the dynamic type of the object
—what type of object the pointer actually points to.
Note that the declaration of the method as virtual must
appear in the parent class. The reason for this requirement is
that when the compiler compiles the call to ptr->sayHi(), it
only knows the static type of ptr. In this case, the static type of
the object that ptr points to is an A, so the compiler looks in the
definition of class A to see whether sayHi should be statically or
dynamically dispatched. The compiler then generates different
code based on whether the function is not virtual (in which
case, it generates a direct call to A’s sayHi), or virtual (in which
case, it generates code to dynamically dispatch the call, which
is a bit more complex—we will discuss how this works in
Chapter 29). Note that once a method is declared virtual, it
remains virtual in all child classes (and their children, and so
on), even if not explicitly declared so. However, it is good to
explicitly declare them virtual to make the behavior clearer to
anyone reading the code.
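For example, in the following sketch, B and C omit the virtual keyword (legal, though not recommended), yet calls through a base reference still dispatch dynamically because A declared the method virtual. The who() method is a hypothetical name for illustration:

```cpp
#include <cassert>
#include <string>

class A {
public:
  virtual std::string who() const { return "A"; }
  virtual ~A() {}
};
class B : public A {
public:
  std::string who() const { return "B"; } // still virtual: inherited from A
};
class C : public B {
public:
  std::string who() const { return "C"; } // virtual-ness flows all the way down
};

// Dispatches on the dynamic type of 'a', even though B::who and
// C::who were not explicitly marked virtual:
std::string callThroughBase(const A & a) { return a.who(); }
```

Passing a B or a C to callThroughBase yields "B" or "C" respectively, exactly as if every declaration had repeated the virtual keyword.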
Video 18.2: Method dispatch in C++.
Having static dispatch as the default behavior is a design
choice not seen in most OO languages other than C++. The
motivation for this choice in C++ is performance. Dynamic
dispatch has a slight runtime cost, and the designers of C++
decided that a programmer should only pay that cost if they
need it. Most other OO languages decide to make dynamic
dispatch the default (or only) option, as it more naturally
corresponds to what a programmer expects in an OO paradigm.
Classes that contain virtual methods are never POD (plain
old data) types, as they contain extra information to allow
dynamic dispatch. For the rest of this book, we will draw
objects with their type as part of their “box” if they contain
virtual methods. While it may seem better to draw the type on
all objects, only doing so when the object’s class contains at
least one virtual method is the most accurate reflection of
reality. In particular, an object with at least one virtual method
has an extra field that is particular to its type (again, we’ll learn
about the details in Chapter 29). Objects without virtual
methods have no extra information and do not have an extra
field, so it is a bit misleading to draw an extra “subbox” to
indicate their type.
Video 18.3: Dynamically dispatching methods
during object construction and destruction.
As you may recall from Section 18.3, in C++, the dynamic
type of an object changes during object construction and
destruction (in Java, the dynamic type remains constant). Now
that we understand dynamic dispatch, we can see the reason
such a rule may have meaningful impact on the behavior of our
program. If we call virtual (dynamically dispatched) methods
during the construction or destruction of an object (i.e., from its
constructor or destructor), then the rules for what dynamic type
an object has at each point in its construction (or destruction)
govern what method is actually called. Video 18.3 illustrates.
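A minimal sketch of this behavior (Base, Derived, and the trace string are hypothetical names): when a Derived is constructed, the call to describe() inside Base's constructor dispatches to Base::describe, because at that moment the dynamic type of the object is still Base.

```cpp
#include <string>

std::string trace; // records which version of describe() runs

class Base {
public:
  Base() { describe(); }  // runs while the object's dynamic type is Base
  virtual void describe() { trace += "Base"; }
  virtual ~Base() {}
};

class Derived : public Base {
public:
  Derived() { describe(); } // by now the dynamic type is Derived
  virtual void describe() { trace += "Derived"; }
};
```

Constructing a Derived leaves trace holding "BaseDerived"; the Derived override is never reached from Base's constructor, even though the object being built is "really" a Derived.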
Video 18.4: The pitfalls of non-virtual
destructors when objects are used
polymorphically.
In C++, whenever you use a class that may participate in
polymorphism, its destructor should be declared virtual.
Declaring destructors as virtual whenever you use classes
polymorphically is important to avoid issues with improperly
destroying objects. When you delete an object, the destructor
call is dispatched according to the same rules as method calls.
If the destructor in the static type of the object being destroyed
is not virtual, then the destructor call is statically dispatched. If
the destructor in that class is virtual, then the destructor call is
dynamically dispatched. Video 18.4 illustrates the pitfalls of a
non-virtual destructor when objects are used polymorphically.
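As a sketch of the correct pattern (Resource and FileResource are hypothetical names): because ~Resource is declared virtual, deleting a FileResource through a Resource * dispatches dynamically and runs both destructors, in derived-to-base order. Had the destructor not been virtual, the delete would be statically dispatched (formally, undefined behavior), typically skipping ~FileResource entirely.

```cpp
#include <string>

std::string destroyed; // records which destructors actually ran

class Resource {
public:
  virtual ~Resource() { destroyed += "Resource"; } // virtual: safe to
                                                   // delete polymorphically
};

class FileResource : public Resource {
public:
  ~FileResource() { destroyed += "FileResource~"; } // e.g., would close
                                                    // the file here
};

void demo() {
  Resource * r = new FileResource();
  delete r; // dynamically dispatched: ~FileResource runs, then ~Resource
}
```

After demo() runs, destroyed holds "FileResource~Resource", confirming that the derived class's cleanup code was not skipped.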
There are a couple of other useful things to know about
overriding methods. First, if you want to call the parent class’s
version of a method, you can do so by explicitly requesting it
with the fully qualified name. For example, if our
MarginAccount class’s buyStock method wanted to call the
inherited buyStock method (e.g., after borrowing money as
needed), it could do so by using the fully qualified name of the
method. For example, MarginAccount’s buyStock might have
the following code:
class MarginAccount : public InvestmentAccount {
  //other things here.
  virtual bool buyStock(Stock * s, double numShares) {
    double cost = getCost(s) * numShares;
    double borrowAmount = 0;
    if (balance < cost) {
      borrowAmount = cost - balance;
      if (marginUsed + borrowAmount < marginLimit) {
        balance += borrowAmount;
        marginUsed += borrowAmount;
      }
      else {
        return false;
      }
    }
    if (InvestmentAccount::buyStock(s, numShares)) {
      return true;
    }
    balance -= borrowAmount;
    marginUsed -= borrowAmount;
    return false;
  }
};
Another useful thing to know is that an overridden method
may have a more permissive access restriction (e.g., if the
parent declares the method as private, the child could declare its
overridden version as public). However, it cannot become more
restrictive (you cannot override a public method with a private
one). Additionally, an overridden method may change the
return type in a covariant fashion—meaning that the return type
in the subclass is a subtype of the return type in the superclass.
For example:
class Animal {
public:
  virtual Animal * getFather() {
    //code here
  }
  virtual Animal * getMother() {
    //code here
  }
};
class Cat : public Animal {
public:
  virtual Cat * getFather() {
    //code here
  }
  virtual Cat * getMother() {
    //code here
  }
};
Here, the Animal class has two methods (getFather and
getMother), which each return an Animal *. This declaration
makes sense, as an animal’s father or mother would be an
animal. We then declare Cat as a subclass of Animal. Here we
override these two methods but change their return type to
Cat *. This overriding is legal, as Cat * is a subtype of
Animal *—that is, we can use polymorphism to assign a Cat *
to an Animal *. This change of return type makes sense in this
example, as the cat’s father and mother will be cats, not just any
type of animal. Making the return type more specific in this
fashion may be useful in code that uses the Cat class in a non-
polymorphic fashion, as the compiler will know that the return
value is a Cat * (allowing us to use methods specific to Cats).
Note that if the methods returned Animal and Cat (by value, not
by pointers), then this overriding would be illegal, as
polymorphism only works on pointers or references.
Attempting to do so would result in error messages such as:
invalid covariant return type for ’virtual Cat Cat::getFather()’
overriding ’virtual Animal Animal::getFather()’
18.6 Abstract Methods and Classes
Suppose you were working on a program that dealt with shapes
(e.g., some sort of graphics program). You might have a variety
of classes for different shapes, such as a class for a Circle
(which might have a radius and a center), a class for a
Rectangle (which might have an x, a y, a width, and a
height), and a class for a Triangle (which might have three
points). Each of these classes exhibits an “is-a” relationship
with a Shape class (a circle is a shape; a rectangle is a shape; a
triangle is a shape), so we might want to make use of
inheritance. We could declare a Shape class and then make
Circle, Rectangle, and Triangle subclasses of Shape.
Making use of inheritance would confer several
advantages to our software design. We could make use of
polymorphism, allowing us to track all of the shapes in our
system as an array of Shape *s. We could then make use of
dynamic dispatch to have method invocations result in the
correct code being executed based on the actual type of shape
that was created. For example, we might have a containsPoint
method, which takes a Point and tests if the point is inside of
the shape. Each class could implement this method in the way
appropriate to its own type of shape, and when we invoke
containsPoint on a Shape *, dynamic dispatch would mean
we call the right method for the type of shape we actually have.
However, if we carefully scrutinize this design, we will
notice something different from our previous examples.
Although Shape is a perfectly valid class to make, there is
nothing that is “just a shape” and not something more specific.1
Because there is nothing that is “just a shape,” we run into a
small snag: we would like to declare the containsPoint
method in the Shape class; however, there is no way to
implement it. That is, we cannot write a correct containsPoint
method in the Shape class because there is no correct way to do
so. Writing a “dummy” implementation that always returns true
(or always returns false) is unappealing, as it is an ugly hack,
which is likely to lead to us forgetting to implement this
method in a subclass and having an annoying error in our
program.
Instead, what we would like to do is declare the
containsPoint method in the Shape class in such a way that
we tell the compiler “there is no way I can define this method
in this class, but any child class of mine must override this
method with a real implementation.” Such a method is called an
abstract method or a pure virtual member function.2 We declare
a virtual method as abstract by placing = 0 at the end of its
declaration:
class Shape {
public:
  virtual bool containsPoint(const Point & p) const = 0;
};
Note that abstract methods must be virtual, as it only
makes sense to use them with dynamic dispatch. The whole
point is that we can have a Shape * (or Shape &), and call
containsPoint on it without specifying how we would do
containsPoint on a generic Shape. Now, each of our
subclasses of Shape (Circle, Rectangle, and Triangle) will
override the containsPoint method as appropriate to their
respective types of shapes.
When a class has an abstract method in it, that class
becomes an abstract class. There are a few special rules that go
along with abstract classes. The first is that an abstract class
cannot be instantiated. That is, you cannot do new Shape, nor
can you declare a variable to have type Shape. However, you
can declare a variable (or parameter) to have type Shape * or
Shape &. A Shape * or a Shape & can be used
polymorphically to reference an instance of a concrete subclass
of Shape—one that has provided actual implementations for all
of its abstract methods, such as Circle, Rectangle, or
Triangle.
The second rule is that any subclass of an abstract class is
also abstract unless it defines concrete implementations for all
abstract methods in its parents. If our design called for it, we
could make an abstract subclass of Shape that does not define
containsPoint and then make concrete subclasses of that
class. Of course, we could also make a subclass that did define
containsPoint but also declared new abstract methods of its
own, and such a class would also be abstract.
These two rules work together to make an important
guarantee to the compiler. Any object you actually instantiate
will have an implementation for all of the methods declared in
it. This rule is crucial to the usefulness of abstract classes. It
means that whenever we have a Shape *, we can call
containsPoint on it (or more generally, whenever we have a
pointer or reference to an abstract class, we can call any of the
methods that we have declared as abstract). Even though
containsPoint is not implemented in Shape, the compiler can
be assured that whatever subclass of Shape the pointer points at
will have an implementation of this method. Recall that the
compiler cannot figure out the dynamic type of the object that
the pointer points at—it must only rely on the static type.
Therefore, the Shape class must “promise” that any subclass
has this method for the call to be legal.
Video 18.5: Executing code with abstract
classes.
Video 18.5 illustrates the behavior of code with abstract
classes.
We will note that there is one “hole” in the guarantees
made about having an implementation of the method available.
Recall that in C++ (but not most other OO languages), the type
of the object changes during the object construction (and
destruction) process. As with any other class, abstract classes
can have constructors, and the constructors for abstract parent
classes are executed in the same way as the constructors for any
other parent classes. If the code in the constructor of an abstract
class is such that it calls an abstract method, there is a problem
—the dynamic type of the object is the abstract class and no
implementation is available.
For example, if Shape had a constructor that called an
abstract method, then during the Shape constructor, the
dynamic type of the object being constructed is just Shape. No
implementation of the abstract method is available, so no legal
call may be made. If the call appears directly in the constructor,
the compiler will produce an error message. However, the
compiler is easily fooled into not producing an error if the
constructor calls some other method, which in turn calls the
abstract method. In such a case, the program will crash when
the abstract method is called while the dynamic type does not
provide an implementation. Note that this issue is particular to
the C++ rules, which change the object type during
construction. In Java, the object type is immediately set at the
type being constructed, and the method call is dynamically
dispatched to that type’s implementation.
18.7 Inheritance and Templates
As we discussed earlier, most features we see in programming
languages are composable—we can mix them together, and
they work exactly the way we would expect. For example,
function parameters and references exhibit this property. If we
know how to declare a function parameter and we know how to
declare a reference, we can combine the two and declare a
function parameter that is a reference—and it works exactly the
way we expect. Unlike most pairs of features, templates are not
fully composable with inheritance, mostly with respect to the
rules that relate to virtual methods. While this delves a little
more into odd corners of the language than we typically like to
go, we mention it to help you avoid surprises (and the
frustration that goes along with them).
18.7.1 Aspects That Are Composable
First, we will start with some aspects that are composable. It is
perfectly fine (and works “as expected”) to have a templated
class inherit from another class, to inherit from an instantiation
of a templated class, or to mix the two (having a templated
class inherit from an instantiation of another templated class).
Often when we want to have a templated class inherit from a
templated parent class, we want to keep the generality of the
parent class—we can achieve this behavior by instantiating the
parent class with the template parameter of the child:
template<typename T>
class MyFancyVector : public std::vector<T> {
  //whatever we want
};
Here we are still instantiating std::vector, we just
happen to be instantiating it with T, which is the template
parameter of MyFancyVector. Whenever we instantiate
MyFancyVector, the resulting class will inherit from
std::vector instantiated with the same argument as
MyFancyVector (that is, MyFancyVector<int> will inherit from
std::vector<int>).
We will note that you can even parameterize a class in
terms of what its parent is:
template<typename T>
class MyClass : public T {
  //code here...
};
This design is called a mixin, and we will discuss it in more
detail in Chapter 29.
It is also perfectly fine for a templated class to have virtual
methods:
template<typename T>
class MyClass {
public:
  //perfectly fine
  virtual int computeSomething(int x) {
    //some code
  }
  //also fine
  virtual void someFunction() = 0;
  //still fine, good idea if used polymorphically
  virtual ~MyClass() {}
};
18.7.2 Aspects That Are Not Composable
Generally speaking, virtual methods and templates interact in
complex ways. Understanding why these interactions occur
requires understanding the material in Chapter 29, so we will
not cover it now. Instead, we will just give you some rules to be
wary of:
A templated method cannot be virtual. You cannot
declare a templated method to be virtual (do not
confuse this with a method inside of a templated
class, which can be virtual as we discussed above):
class MyClass {
public:
  //illegal: virtual templated function
  template<typename X> virtual
  int doSomething(const X & arg) {
    //some code here...
  }
};
Attempting to do so will result in an error message
such as:
error: templates may not be ’virtual’
template<typename X> virtual
If you want to have a variety of virtual
methods with similar functionality in the base
class, you can instead make a protected (or private)
non-virtual template, and have non-templated
methods call it:
class MyClass {
protected:
  //templated, non-virtual: legal
  template<typename X>
  int doSomething_implementation(const X & arg) {
    //code here.
  }
public:
  //virtual, non-templated: legal
  virtual int doSomething(const int & arg) {
    return doSomething_implementation<int>(arg);
  }
  //virtual, non-templated: legal
  virtual int doSomething(const double & arg) {
    return doSomething_implementation<double>(arg);
  }
  //etc.
};
A templated function cannot override an inherited
method.
Suppose we have a parent class:
class Parent {
public:
  virtual void something() {
    std::cout << "Parent::something\n";
  }
};
Now, suppose we write a child class with a
template method of the same name (it does not matter
whether or not template specialization would give it the
same parameter list):
class Child : public Parent {
public:
  template<typename T> void something() {
    std::cout << "Child::something<T>\n";
  }
};
This Child class does not actually override
the method of the same name from the Parent
class. Instead, we have a non-virtual template
method in the child class and inherit the virtual
method from the parent class. If we execute the
following code:
Parent * p = new Child();
p->something();
then it will print "Parent::something\n". This
rule is something of a corollary of the previous
rule, as the method would have to be virtual to
override a virtual method—however, the language
designers decided to make this method legal as a
non-virtual method that does not override the
parent, rather than illegal under the previous rule.
Virtual methods are specialized when an instance is made.
In Section 17.3.3 we discussed how a
templated function, class, or method is only type
checked when it is specialized. In that section, we
also discussed how one may specialize a class, and
create instances of the specialized class, without
the compiler specializing (and thus type checking)
all of the methods inside of that templated class.
However, this rule only applies to non-virtual
methods. If a method is virtual, then the compiler
must specialize (and thus type check) it whenever
it must create an instance of the class.
18.8 Planning Your Inheritance Hierarchy
So far, we have covered a lot of important concepts in terms of
what inheritance is and how it works. However, as with most
programming tools, you need to know how to use inheritance
properly in order for it to work well for you. As always,
planning is the key to using it well. Here is a good general,
high-level approach to planning your inheritance relationships:
1. Determine what classes you need and what members
they have.
2. Look for similarities between classes: do multiple
classes have the same members (even if the methods
would behave differently)? If you find similarities,
determine if you can pull the similar members out into
another class that exhibits an “is-a” relationship with the
classes in question. If so, this new class represents a
strong possibility for a class that is a parent of the other
classes in question.
3. Look to see if there are any classes with natural “is-a”
relationships that are not related by inheritance. If so,
consider making one a subclass of the other.
Contemplate what behaviors and fields can be moved
into the parent classes.
4. Repeat steps 2 and 3 until you run out of opportunities
for good uses of inheritance.
5. Determine which classes should be abstract. These are
the classes for which you cannot actually have “just” that type of
thing without being more specific. Typically this
constraint goes hand-in-hand with having a method
where all things of that type are certain to have that
method, but you cannot define it for the type in
question.
There are a few other general guidelines to think about in
designing your inheritance relationships:
• In general, you want a common member as far “up” the
inheritance hierarchy as possible (meaning in a parent
class rather than a child class). Doing so avoids
duplicating code. Of course, you should only put the
field or method in the parent class if the parent type
actually has that field or (possibly abstract) method.
• Make plentiful use of dynamic dispatch. Good object-
oriented programming favors letting dynamic dispatch
“make the decision” of what to do based on what type
of object you have over explicit conditionals (e.g.,
if/else).
Figure 18.3: Naïve class relationships for our hypothetical game.
As an example of designing our inheritance hierarchy, let
us suppose we were writing some sort of game. In this
hypothetical game, there is a hero (controlled by the player),
who has some hit points (“life”), magic points, and can gain
levels (become stronger). The hero has some position on the
screen, which is a point, and a collection of images (so she can
be drawn in a variety of poses and animations). The game also
has enemies (a variety of monsters, villains, or whatever, which
are controlled by the computer), which also have hit points,
magic points, a position on the screen, and a collection of
images. Unlike the hero, the enemies do not gain levels, but
instead have a method to strategize—which implements the
game’s decision-making process for that particular enemy. The
game also has “power-ups” (items that help out the hero or an
enemy if they “pick it up”), as well as projectiles, which are for
fireballs, bullets, or whatever our hero and villains attack each
other with. The power-ups and projectiles each also have a point
for their position on screen and a collection of images. The hero,
enemies, power-ups, and projectiles all have a method to draw
themselves on the screen, and each has a way to test for
collisions with the other types.
Instead of writing a description of what the classes have, it
is quite useful to draw out their relationships (especially as they
become more complex). Figure 18.3 shows a diagram of a
naïve design of the class relationships for this hypothetical
game. In this diagram, each box represents a class and has three
parts. In the top part of the box is the name of the class. The
middle part contains the fields of the class, and the bottom part
contains the methods of the class. We draw an arrow from one
class to another to illustrate a “has-a” relationship. The has-a
relationship would be implemented with a field in the class at
the source of the arrow. The arrows are labeled with the name
of the field (on the top) and how many the class has on the
bottom. For example, the hero has one point for her position.
However, she has one or more (indicated by 1..*) images. The
bottom portion of the box lists the methods of the class. In
designing a real game, these classes may be somewhat more
complex, but this simplification works for the purpose of our
example.
Drawing these class relationships is not only a useful tool
for novice programmers but also an important piece of what
professional software engineers do in designing large systems.
In fact, what we have drawn here is basically a UML class
diagram though with some details omitted.3 Software engineers
use UML diagrams not only to plan their class hierarchies but
to describe them in an unambiguous way to the other members
of their software development team. UML is an example of the
old saying “a picture is worth a thousand words.” We will not
cover UML in any significant detail, but you should be aware it
exists. If you plan on a professional career in software
development, you will likely take a software engineering class
and learn much more about it.
Figure 18.4: An improved version of our class hierarchy, using inheritance.
Looking at Figure 18.3, we can see that many of these
classes have much in common. All of them have a Point for
their position and a collection of Images, as well as methods to
draw themselves, check for collisions, and update their
positions. From examining these similarities, we should realize
the opportunity for inheritance. All of these classes not only
have these same fields and behaviors, but they all have them
because they are different specific types of the same more
general type—”things that we draw in our game.” Of course,
ThingThatWeDrawInTheGame is not a great name for a class, so
we should pick a better one for the parent class we are
considering. Fortunately, there is a technical term for this
concept, sprite—a 2D image that is drawn onto the screen but
is not the background image.
With this insight, we might revise our inheritance
hierarchy as shown in Figure 18.4. This new diagram
introduces the Sprite class as the parent class of the Hero,
Enemy, Projectile, and PowerUp classes. Note that the arrow
with the unfilled triangular arrow head indicates the subclass
relationship. All of the common functionality is pulled up into
the Sprite class, avoiding duplication of code. Notice how now
that all of these types are subclasses of the Sprite class, we no
longer need 11 different checkCollision methods. Instead, we
can just write one checkCollision method in Sprite, which
checks for a collision with some other Sprite. Polymorphism
lets us pass in any subtype of Sprite we need to.
Even though our second iteration of this design is a
significant improvement over the first, we still have some
duplication of code. If we look at the Hero and Enemy classes,
we will see that both of them have hitPoints and
magicPoints. Additionally, the PowerUp class has two different
methods, one to apply its benefits to the Hero and one to apply
its benefits to an Enemy. Presumably, these two methods have
significant duplication in what they do—for example, if the
PowerUp is a healing potion, they both add some value to the
Hero’s or Enemy’s hitPoints.
Observing this duplication, we should then contemplate
whether there is some other class C we should write, such that a
Hero is a C and an Enemy is a C. Is there a C, such that it has
hitPoints and magicPoints, it makes sense to apply a PowerUp
to a C, and a C is a Sprite? We consider these criteria because
we want to place the class in the class hierarchy as the parent
class of Hero and Enemy (thus the first two constraints), we
want to bring the duplicated functionality into this class (thus
the second two constraints), and the class would need to be a
child class of Sprite (thus the last constraint). In this particular
case, creating a Creature class makes perfect sense, as it
satisfies all of these criteria.
Figure 18.5: An even better inheritance hierarchy.
Based on this analysis, we would then proceed to revise
our class hierarchy as shown in Figure 18.5. Here, we have
introduced the abstract Creature class (in UML, an italicized
name indicates that the class or method is abstract) as the parent
class of Hero and Enemy. Now, our PowerUp class has a single
method allowing it to be applied to a Creature.
While this example has been a great illustration of how to
find opportunities to use inheritance, we would like to note that
this example hierarchy is by no means complete. Beyond the
fact that many of these classes would likely have other methods
and fields, we would probably have many more classes. We
would likely want a variety of subclasses of Enemy, PowerUp,
and Projectile to implement different behaviors for different
types of each of those. In fact, these three classes would likely
be abstract, as each actual Enemy would be some subtype of
Enemy—you would not have something that is “just an enemy.”
With these classes made abstract, some of their methods would
also be abstract. Hero could have subclasses too, if your game
has a variety of heroes with different behaviors.
Additionally, while this design could certainly be
implemented (and is excellent for demonstrating inheritance
concepts), it does not adhere to some other software
engineering principles that we might like in a real system. Most
notably, it combines the user interface (everything related to
drawing and user interaction) together with the model (the data
and state of the game world). In engineering programs that
interact with end users, the MVC (Model, View, Controller)
paradigm is quite popular. This paradigm splits the design into
three pieces: the model (which holds all the data and state), the
view (which handles drawing that state), and the controller
(which handles receiving input and updating the model
accordingly). Each of these pieces would have many classes,
but no class would be in both parts. Instead, the view queries
the model for information and draws accordingly, and the
controller calls methods on the model to update its state. A
variant of this paradigm that combines the view and controller
(which can be somewhat hard to separate sometimes) is called
UI Delegate. We are not going to cover them in detail, but you
will learn about them if you take a software engineering class.
18.9 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 18.1 : What is inheritance? When is it
appropriate? What is the benefit?
• Question 18.2 : What is subtype polymorphism?
• Question 18.3 : Given the following classes:
class A {
  int x;
public:
  void something() { ... }
};

class B : public A {
  int y;
public:
  void anotherFunction() { ... }
};
If you try to write this code:
A * ptr = new B();
ptr->anotherFunction();
You will receive a compiler error. Why?
• Question 18.4 : In the previous code, we declared
class B : public A. What does public mean here?
• Question 18.5 : What does protected mean?
• Question 18.6 : What does it mean to override a
method? How is overriding different from overloading a
method?
• Question 18.7 : What is the output when the following
C++ code is run?
#include <iostream>
#include <cstdlib>

class A {
protected:
  int x;
public:
  A(): x(0) { std::cout << "A()\n"; }
  A(int _x): x(_x) { std::cout << "A(" << x << ")\n"; }
  virtual ~A() { std::cout << "~A()\n"; }
  int myNum() const { return x; }
  virtual void setNum(int n) { x = n; }
};

class B : public A {
protected:
  int y;
public:
  B(): y(0) { std::cout << "B()\n"; }
  B(int _x, int _y): A(_x), y(_y) {
    std::cout << "B(" << x << "," << y << ")\n";
  }
  virtual ~B() { std::cout << "~B()\n"; }
  virtual int myNum() const { return y; }
  virtual void setNum(int n) { y = n; }
};

int main(void) {
  B * b1 = new B();
  B * b2 = new B(3, 8);
  A * a1 = b1;
  A * a2 = b2;
  b1->setNum(99);
  a1->setNum(42);
  std::cout << "a1->myNum() = " << a1->myNum() << "\n";
  std::cout << "a2->myNum() = " << a2->myNum() << "\n";
  std::cout << "b1->myNum() = " << b1->myNum() << "\n";
  std::cout << "b2->myNum() = " << b2->myNum() << "\n";
  delete b1;
  delete a2;
  return EXIT_SUCCESS;
}
• Question 18.8 : If a child class needs to pass parameters
to its parent class’s constructor, how would you specify
what parameter values to pass?
• Question 18.9 : Why is it important for polymorphic
classes to have virtual destructors?
• Question 18.10 : If a class has a virtual method, can it
be a POD type?
• Question 18.11 : What is an abstract method (also
called a “pure virtual function”)? How do you declare
one? What is it useful for?
Chapter 19
Error Handling and
Exceptions
In the real world, things do not always go as expected.
Accordingly, programs deployed in the real world must check
for and handle situations when problems occur. A problematic
situation for a program might be improperly formatted input
from the user (e.g., the program wants the user to input an
integer, but the user types xyz), inability to open a file, failure
of a network connection, or a wide variety of other issues.
Similarly, the problem may come from a variety of sources: the
user of the program, a logical error in the program itself, a
problem with the computer’s hardware, an unplugged network
cable, or any number of other issues.
Whatever the type and source of the problem are, the
program must deal with the situation in some fashion. The
worst possible way to deal with any problem is for the program
to produce the wrong answer (or action, or lack of action)
without informing the user of the problem—a silent failure.
Consider the case in which the software to control an airplane’s
flight systems cannot lower the landing gear properly. Failing
to do so without notifying the user (i.e., the pilot) of the situation is
a disastrous course of action.
A course of action that is slightly better in most cases is to
abort the program (or at least the operation) and inform the user
of the problem. Producing no answer at all is preferable to
producing an incorrect answer—imagine software to target a
missile: we would prefer the missile to self-destruct rather than
hit the wrong target. When the program deals with an error by
aborting, it should do so with the explicit intention of the
programmer (i.e., the programmer should call assert or check
a condition and then call abort), and the program should give an
error message to the user. Simply segfaulting due to an invalid
memory access is never appropriate. Of course, in some
situations, aborting the program may be an unacceptable course
of action—in our landing gear example, the flight control
software’s abrupt termination would make the problem worse
and therefore be a terrible decision.
We would prefer to handle the error more gracefully.
Graceful error handling requires making the user aware of the
problem (unless the program can fix it), as well as allowing the
program to continue to function normally in all other regards.
Ideally, we would present the user with some options to remedy
the situation (an opportunity to enter a different input, select a
different file, retry a failed operation, or engage in some course
of remedial/corrective action). Whatever the course of action,
the programmer should have explicitly thought of and
accounted for the potentially problematic conditions in the
code’s design and written her code accordingly.
As your programming skills develop, you should get into
the practice of writing bulletproof code—code that can
gracefully handle any problem that can happen. You should get
in the habit of contemplating every possible failure mode as
you write your code and determining how you can handle it.
When you write a line of code, you should ask “what can go
wrong with that line?” and come up with plans for those cases.
In thinking through how to handle a problematic
condition, the first question a programmer should consider is
whether or not the erroneous situation can be handled directly
in the function she is writing or if the function should notify its
caller of the failure and allow the caller to handle the problem.
This decision should be guided by considering whether or not
the function in question can properly handle the error in all
cases. If it can handle the error properly, it should. If it cannot,
it should inform its caller and allow the caller to handle the
error (or inform its caller, and so on).
How a function propagates an error to its caller in C++
will be the subject of the rest of this chapter. In C, functions
return a value indicating the error, which the caller must check.
However, C++ provides a better approach (called exceptions).
Before we discuss how exceptions work, let us examine the C-
style approach and understand why we would prefer a better
option.
19.1 C-Style Error Handling
In C, error handling involves checking the return value of a
function that you call and possibly returning an error to your
function’s caller as needed. For example, suppose we were to
write some code that reads input from a file. It might look
something like this:
1 someStruct * readInput(const char * filename) {
2 FILE * f = fopen(filename, "r");
3 if (f == NULL) {
4 return NULL;
5 }
6 someStruct * ans = malloc(sizeof(*ans));
7 if (ans == NULL) {
8 fclose(f); //maybe check this too
9 return NULL;
10 }
11 while (...) {
12 ...
13 someType * ptr = malloc(sizeof(*ptr));
14 if (ptr == NULL) {
15 //code to free up all memory we have allocated so far
16 fclose(f); //maybe check this too
17 return NULL;
18 }
19 ...
20 }
21 fclose(f);//maybe check this too
22 return ans;
23 }
24 void someOtherFunction() {
25 ...
26 someStruct * input = readInput(inputFileName);
27 if (input == NULL) {
28 //do whatever we need to do here
29 }
30 ...
31 }
Here, we have elided most of the body of the code, and just
written a few select parts that merit some error checking—file
opening and memory allocation. We generally want to check
fclose but might get away with not doing so in this case, as we
are only reading from the file.
Looking at this code for a bit highlights several problems
with C-style error handling. First and foremost, it is easy to
forget (or be lazy about). If we fail to check the return value of
any particular function, we might never notice the problem
during testing. In the case of fopen, we will discover the error
during testing if we test our program with the name of a file
that cannot be opened, which should be high on the list of
things we try during the testing process. However, testing cases
such as the failure of malloc or fclose is a bit more difficult.
Forgetting error checks may seem minor; however, doing so may
result in silent failures, crashes, or even security vulnerabilities.
This problem is exacerbated by the fact that there are a
wide variety of functions that we expect to always succeed, and
we do not really even think about the
possibility of failure. For example, printf can fail, but most
programmers do not even know this fact, much less think about
it when they write a call to printf (it returns the number of
characters printed or a negative value on an error). Does the
fact that printf can fail mean you always need to check its
return value? That really depends on the consequences if the
call to printf fails. However, you should always think about
the possibility of failure.
A second problem with this style of error handling is that
it “clutters up” the code. Even though rigorous error handling is
critical to writing bulletproof (and thus correct) code, it can feel
cumbersome to write as well as “in the way” to read through.
This downside is particularly problematic, as it makes it easier
for lazy programmers to feel justified in omitting error checks
that really should be present.
A third problem with the error handling in the above
example is that we only know that an error has occurred and
have no additional information about what went wrong. In this
example, when someOtherFunction checks the return value of
readInput, it can only test if the return value is NULL
(indicating an error) or not. In the case of an error,
someOtherFunction cannot determine why readInput failed—
the input file could not be opened, a memory allocation failed,
or some other reason. Not knowing the reason for the error
makes it more difficult to take corrective action.
The typical approach to this third problem is to set errno
(recall from Section 11.1.1 that errno is a global integer) to an
error code indicating what went wrong. The header file
errno.h not only defines errno but also the various standard
error codes, such as ENOMEM (insufficient memory), ENOENT (the
file was not found), and about a hundred others. Most C library
functions will set errno appropriately when they fail, so if the
failure stems from the library call immediately before, we do
not need to set errno ourselves. However, as errno is global,
we must be careful of other library calls that might change its
value if we plan to use it. There are a handful of functions that
work with errno, such as perror (which prints a description of
the error the current value of errno represents) and strerror
(which returns a char * pointing at such a string). Of course,
we generally want to avoid global variables, so such an
approach is nonideal. However, if we are programming in C, it
is what we have to work with.
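As a minimal sketch of this C-style convention (the function name and test filename here are our own, not from the C library), a helper can report failure through its return value and rely on errno for the reason:

```cpp
#include <cstdio>
#include <cstring>
#include <cerrno>

// A sketch of C-style error handling: tryOpen reports failure through
// its return value, and errno (set by the failing fopen) records why.
FILE * tryOpen(const char * filename) {
  FILE * f = fopen(filename, "r");
  if (f == NULL) {
    // strerror converts the current errno value into a description
    fprintf(stderr, "cannot open %s: %s\n", filename, strerror(errno));
  }
  return f;
}
```

Note that nothing forces the caller to check the return value of tryOpen—which is exactly the first weakness described above.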
19.2 C++-style: Exceptions
From the previous discussion, we can observe three key
characteristics we desire in an error handling mechanism. First,
we want to remove the possibility that an error can be silently
ignored. If a particular piece of code does not handle an error,
the error should propagate outwards to that function’s caller.
This process should continue until a piece of code capable of
explicitly handling the error is found.
Second, we would like error handling to be as unobtrusive
as possible in our code—reducing the “cluttered” feeling given
by C-style error handling. If a function is going to allow an
error to propagate to its caller, we should generally be able to
write no extra code for that to happen. However, our error
handling mechanism should work in such a way that the error
propagation does not cause us to leak resources, nor leave
objects in invalid states. Of course, just because we do not have
to write anything does not mean that we do not have to think
about what happens.
The third aspect we desire in our error handling
mechanism is the ability to convey extra information about the
error. In particular, we would like to be able to convey arbitrary
information about the problem. Beyond just informing the
caller of the type of problem, we might want to be more specific
and include detailed information about the problem.
To achieve all of these goals, C++ introduces exceptions,
which involve two concepts: throw and try/catch. A
programmer places code where an error might occur (that she
has a plan to handle) in a try block. The try block is
immediately followed by one or more catch blocks. Each
catch block specifies how to handle a particular type of
exception. Code that detects an error it cannot handle throws an
exception to indicate the problem. This exception can be any
C++ type, including an object type. The function that has
detected the problem can include any additional information it
wants to communicate to the error handling code by including
it in the object it throws.
Once an exception is thrown, it propagates up the call
stack, forcing each function to return (such that each of their
stack frames is destroyed) until a suitable exception handler is
found, i.e. until it is caught with a try/catch block. Each time a
stack frame is destroyed, destructors are invoked as usual to
clean up objects; however, if one of these destructors throws an
exception that propagates out of the destructor, the program
crashes. Therefore you should generally design your destructors
to ensure this does not happen.
This style of error handling conforms to the design goals
we laid out at the start of this section. Code that can handle an
error describes how to deal with that error with try/catch
blocks. Code that cannot handle an error can simply do nothing,
and the error will propagate to the caller—it cannot be ignored.
As the exception is an object, it can inform the exception
handler of the type of problem (via what type of object it is)
and can encapsulate other information about the situation as
needed.
19.3 Executing Code with Exceptions
Having learned the fundamental concepts of exception
handling, we now turn our attention to understanding in more
detail what happens when code throws an exception. Throwing
an exception is accomplished by executing a throw statement,
which is the keyword throw, followed by an expression that
evaluates to the exception to throw. For example:
1 throw std::exception();
This expression constructs an unnamed temporary (of type
std::exception via its default constructor) then throws the
resulting object. The exception then propagates up the call
stack, forcing functions to return and destroying their frames,
until it finds a suitable exception handler in the form of a try
block with a following catch block that can catch an exception
of a compatible type.
In C++, we can throw any type of object, but should
generally only throw subtypes of std::exception. To use
std::exception, you should #include <exception>. There
are also a variety of built-in subtypes of std::exception,
which typically require you to #include <stdexcept>. You can
read about them in the documentation for std::exception
(http://www.cplusplus.com/reference/exception/exception/).
Of course, you can also create your own subtypes of
std::exception by declaring a class that inherits (publicly)
from std::exception or any of its existing subtypes.
Once an exception is thrown, control is transferred to the
nearest suitable exception handler, possibly unwinding the stack
(forcing functions to return, destroying their frames, and
executing the destructors for objects in those frames).
Exception handlers are written with try and catch. When code
might throw an exception and the program knows how to
handle the situation, the programmer writes the code within a
try block. She then writes one or more handlers, each of which
specifies a type of exception to catch. For example:
1 try {
2 //code that might throw
3 }
4 catch(std::exception & e) {
5 //code to handle a generic exception
6 }
If an exception is thrown within the try block, including
the functions it calls (unless they catch it), control transfers into
the exception handler (catch block), to allow the program to
deal with the situation.
More specifically, when an exception is thrown, the
following steps occur:
1. The exception object is potentially copied out of the
frame into some location that will persist through
handling. The “potentially” qualifier appears here as the
compiler may eliminate the copy as long as it does not
change the behavior of the program (other than
removing the extra copy construction and destruction).
The compiler may, for example, arrange for the
unnamed temporary to be directly allocated into some
other part of memory (which it will handle).
2. If the execution arrow is currently inside of a try
block, then it jumps to the close curly brace of the try
block (if any variables go out of scope, they are
destructed appropriately) and begins trying to match the
exception type against the handler types in the order
they appear (if it is not inside a try block, go to step 3).
If the execution arrow encounters a catch block capable
of catching the exception type that was thrown, the
exception is considered caught, and the process
continues in step 4. If no handler is found, step 2 repeats
(note that the execution arrow is now outside of the try
block where it started—it may be inside another if they
are nested—or may not be, in which case, step 2 will
direct you to go to step 3).
3. If the execution arrow was not currently inside of a try
block in step 2, then the exception propagates out of the
function it is in. Propagating the exception out of the
function destroys the function’s frame, including any
objects inside it. In the process of destroying the frame,
the destructors for the objects are invoked as if the
function had returned normally. The execution arrow
then returns to wherever the function was called (again,
much like the function had returned), and step 2 repeats.
4. Once an exception is caught, it is handled by the code
within the catch block. The first step of handling the
exception is to bind (a reference to) it to the variable
name declared in the () of the catch block. Then the
code in the catch block is executed according to the
normal rules, with one exception. This one exception is
that if a statement of the form throw; (throw with no
expression following it—just a semicolon after it) is
encountered inside of the catch block, then the
exception being handled is re-thrown (the exception
handling process starts again from step 2—no extra
copy is made).
5. If/when the execution arrow reaches the close curly
brace of the handler, then the program is finished
handling the exception. The exception object is
deallocated (including invoking its destructor), and
execution continues normally at the statement
immediately following the close curly brace.
Video 19.1: Executing code with exceptions.
Video 19.1 demonstrates the behavior of a program when an
exception is thrown and caught.
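To make this control flow concrete, here is a small sketch (the function names are hypothetical): parsePositive throws when it cannot produce an answer, and its caller catches the exception and substitutes a default value.

```cpp
#include <stdexcept>
#include <string>

// Throws instead of returning an error code: the caller decides
// how (and whether) to handle the failure.
int parsePositive(const std::string & s) {
  if (s.empty()) {
    throw std::invalid_argument("empty input");
  }
  int value = 0;
  for (size_t i = 0; i < s.size(); i++) {
    if (s[i] < '0' || s[i] > '9') {
      throw std::invalid_argument("non-digit character");
    }
    value = value * 10 + (s[i] - '0');
  }
  return value;
}

int parseOrDefault(const std::string & s, int dflt) {
  try {
    return parsePositive(s);  // may throw; if so, control jumps below
  }
  catch (std::invalid_argument & e) {
    return dflt;  // handle the error: fall back to the default value
  }
}
```

If parsePositive throws, the throw statement in its frame transfers control out of that frame and into the catch block in parseOrDefault, exactly following the steps enumerated above.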
There are slight variations on the try and catch blocks
that are worth mentioning. For the catch block, one can specify
that it will catch any type, but in doing so, it cannot bind the
exception to a variable. This generic catch is accomplished by
placing three dots (...) in the parentheses where one would
normally write the type of exception to catch and the name to
bind it to. That is, we can write:
1 try {
2 //code that might throw
3 }
4 catch(...) {
5 //code to handle any exception
6 }
Since we are catching an exception of an unknown type, we
cannot bind it to any variable, as there is nothing we can do
with that variable.
For a try block, there is a variation called a function try
block, which is primarily of use in a constructor, where the
programmer wants to catch exceptions that may occur in the
initializer list. In a function try block, the keyword try appears
before the function body (and before the initializer list, if any)
and the handlers appear after the close curly brace of the
function’s body. For example, to use a function try block on a
constructor, we might write:
1 class X {
2 Y aY;
3 Z aZ;
4 public:
5 X() try : aY(42), aZ(0) {
6 //constructor’s body here as normal
7 }
8 catch(std::exception & e) {
9 //code to handle the exception
10 }
11 //other members here as normal.
12 };
If an exception occurs in the initializer list or the body of the
constructor, it will be caught by the handler after the function
(assuming it matches). However, a function try block on a
constructor has a special behavior: since the object cannot be
properly initialized, the exception is automatically re-thrown
(as if throw; were the last line of the function try block).
Anything that was successfully constructed (including the
parent-class portion of the object) is destroyed before entering
the handler.
For a “normal” function, a function try block covers the
entire body of the function and may return as normal. In fact, if
the function returns a value, the function try block should return
a value (else you will be warned for “control reaches end of
non-void function”).
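The automatic re-throw behavior can be seen in a small sketch (the class names here are our own): Throws's constructor always fails, so Widget's initializer list throws, the handler runs, and the exception then propagates to Widget's caller anyway.

```cpp
#include <stdexcept>

// A member type whose constructor always throws, to force a failure
// in the initializer list of Widget below.
struct Throws {
  Throws() { throw std::runtime_error("member initialization failed"); }
};

class Widget {
  Throws t;
public:
  Widget() try : t() {
    // never reached: t's constructor throws first
  }
  catch (std::exception & e) {
    // we could log here, but we cannot suppress the exception:
    // it is automatically re-thrown when this handler finishes
  }
};
```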
19.4 Exceptions as Part of a Function’s
Interface
Functions may include an exception specification—a list of the
types of exception it may throw. Such a declaration is added to
a function by writing throw() with the exception types listed in
the parentheses. An empty list of types in the parentheses
specifies that the function may not throw any type of exception,
whereas completely omitting the exception specification means
that the function may throw any type of exception.
For example:
1 int f(int x) throw(std::bad_alloc, std::invalid_argument);
2 int g(int * p, int n) throw();
3 int h(int z);
The first of the functions in the example, f, is declared to
throw two possible exception types, std::bad_alloc and
std::invalid_argument. The second function is declared to
throw no exceptions. The third does not include an exception
specification, so it may throw any type of exception.
The exception specification is part of the interface of the
function. It tells the code that uses the function what types of
error conditions might result from the use of that function.
Code that uses functions with exception specifications knows to
either handle those errors (if appropriate) or declare them in its
own exception specification (as the un-handled exception
would propagate out of the caller to its own caller).
When overriding a method that provides an exception
specification, the overriding method (in the child class) must
have an exception specification that is the same or more
restrictive than the exception specification of the inherited
method. That is, if f above were a method in a parent class, a
child class could override it with any of the following four
exception specifications:
1 //Option 1: same as the parent class
2 int f(int x) throw(std::bad_alloc, std::invalid_argument);
3 //Option 2: more restrictive: cannot throw std::invalid_argument
4 int f(int x) throw(std::bad_alloc);
5 //Option 3: more restrictive: cannot throw std::bad_alloc anymore
6 int f(int x) throw(std::invalid_argument);
7 //Option 4: more restrictive: cannot throw any exception
8 int f(int x) throw();
The reason for this restriction arises from polymorphism.
If the child class’s method can throw exceptions the interface of
the parent class’s method did not allow, then the child class is
no longer a suitable substitute for the parent class. That is, we
may have a pointer that is statically declared to be a pointer to
the parent class but actually points at an instance of the child
class. The code where this pointer exists may then invoke the
overridden method, expecting the exception behavior promised
by the interface in the parent class. However, if the child class’s
method throws unexpected types, the calling code is
unequipped to handle those properly.
From a software engineering standpoint, providing a
correct exception specification for every function you write
makes for better code. In doing so, you are more clearly
specifying the interface of the function, providing better
information to anyone (including yourself in the future) who
uses the function. As we will see in Section 19.7, thinking
carefully about exception behavior is a crucial part of writing
professional-quality code. Including a correct exception
specification for a function is a key part of this process in two
ways. First, you should think through the exception behavior of
your code fully enough that writing a correct exception
specification is easy. Second, reasoning about the exception
behavior of your code requires knowledge of the exception
specifications of the functions it calls—if they do not clearly
specify this behavior, you must delve into their implementation
to find details that should be provided in the interface.
Unfortunately, C++’s exception specifications were
defined in a way that makes them orders of magnitude less
useful than they could and should be. Not only do some experts
advise against writing such specifications, but also C++11 has
deprecated them. The reason for this gap between the potential
and actual usefulness arises from the fact that the exception
specification is not checked by the compiler. Instead, the
compiler must generate code that enforces the guarantees at
runtime (imposing an overhead). As we will momentarily
discuss, violating the exception specification (by throwing an
unexpected type of exception) is generally handled in a very
heavyweight fashion: the program is killed.
19.5 Exception Corner Cases
If you have fully grokked the preceding information on
exceptions, you should have a variety of questions about corner
cases: “What if an exception propagates outside of main?”
“What if a destructor throws an exception during stack
unwinding?” “What if a function throws an exception that is
not allowed by its exception specification?” “What if my code
executes throw; (with no operands) while no exception is being
handled?” C++ has two special functions, unexpected and
terminate, to handle such situations.
The first of these, unexpected(), is called when a function
throws an exception that is not allowed by its exception
specification. The default behavior of unexpected() depends
on whether the exception specification allows
std::bad_exception. If so, then the unexpected() function
throws a std::bad_exception and exception handling
continues normally. If std::bad_exception is not permitted,
then unexpected() calls terminate().
As its name suggests, the default behavior of the
terminate() function is to terminate the program by calling
abort(). This function is called in the other situations
described above (an exception that propagates outside of main,
an exception that propagates out of a destructor during stack
unwinding, throw; when no exception is being handled), as
well as by the default implementation of unexpected().
We have referenced the default behavior of these
functions; however, the programmer may supply her own
behavior if desired by calling set_unexpected or
set_terminate, passing in a function pointer to specify the
new behavior of calls to unexpected() or terminate()
respectively. The user-supplied functions for either of these
may not return. In the case of a terminate handler, the supplied
function must terminate the program. In the case of an
unexpected handler, the function may either throw an
appropriate exception or terminate the program.
Another set of corner cases arise when an exception is
thrown during object construction. When such a situation
occurs, the pieces of the object that are already initialized are
destroyed, and the memory for the object is freed (even if it was
allocated with new). In the case of an array of objects (e.g.,
allocated with new[]), if the object’s constructor throws
an exception, then the objects from indices down to
will be destructed in reverse order of their construction.
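This array corner case can be observed directly. In the sketch below (the counters exist only for illustration), the third construction in a new[] of five elements fails, and exactly the two fully constructed elements are destroyed before the exception propagates:

```cpp
#include <stdexcept>

// Counters to observe construction/destruction during a failed new[].
static int constructed = 0;
static int destroyed = 0;

struct Counted {
  Counted() {
    if (constructed == 2) {  // the third construction fails
      throw std::runtime_error("construction failed");
    }
    constructed++;
  }
  ~Counted() { destroyed++; }
};
```

After `new Counted[5]` throws, constructed and destroyed are both 2: the elements at indices 1 and 0 were destructed in reverse order, and the memory for the array was freed.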
19.6 Using Exceptions Properly
When used properly, exceptions simplify the task of writing
code that is capable of recovering gracefully from error
conditions. However, when used improperly, exceptions can
make the code significantly worse. We provide the following
guidelines to help you understand how and when to use
exceptions.
Exceptions are for error conditions. A program should
throw an exception to indicate that an error
condition has occurred that must be handled
by the caller (or its caller, or its caller, etc.). If the
function can handle the situation properly itself, it
should do so. Likewise, you should not use
exceptions for the normal execution path of a piece
of code.
Throw an unnamed temporary, created locally. The
only way you should throw an exception is
throw exn_type(args);. Note that there is no new
in that statement (you should not do
throw new exn_type(args)).
Re-throw an exception only with throw; If your
exception handler must re-throw an exception, then
you should do throw; (with no arguments). Even
though you have the exception object bound to a
name (e.g., if you wrote
catch(exn_type_name & e) { ... }), you should
not do throw e; to re-throw the exception. Doing
so makes an extra copy of the exception object.
Catch by (possibly constant) reference. You should
always catch exceptions by reference, or by const
reference. That is, your handlers should look like
catch(exn_type_name & e) or
catch(const exn_type_name & e). They should
not look like catch(exn_type_name e) (catching
by value) or catch(exn_type_name * e) (catching
by pointer).
Declare handlers from most to least specific. If you
declare multiple handlers (catch blocks) for the
same try block, you should declare them in order
from most specific type to least specific type. The
handlers are matched in their order of appearance,
and if the thrown exception type is
polymorphically compatible with the type the
handler declares, then that handler is used. If a
handler specifying a child class appears after a
handler specifying a parent class, the child class
handler will never be used, as the exceptions of
that type will be caught by the handler for the
parent class. That is, if child_exn_type inherits
from parent_exn_type, then the following code is
poorly written:
1 try {
2 //code
3 }
4 catch (parent_exn_type & e) {
5 //handler code
6 }
7 catch (child_exn_type & e) { //bad, will never be used
8 //handler code
9 }
Destructors should never throw exceptions.
Never. If
they perform any operation that could throw, you
must find a way to handle it appropriately (with
try/catch).
Exception types should inherit from std::exception.
If you write your own exception class, it should
inherit (publicly) from std::exception or one of
its subclasses.
Keep exception types simple. If you write your own
exception class, it should be quite simple. Most
specifically, it should not have any behavior that
can throw an exception. Typically, you will want
the exception class to have no dynamic allocation
at all (new can throw std::bad_alloc).
Override the what method in your own exception types.
The std::exception class declares the
method:
1 virtual const char * what() const throw();
This method provides a description of the
exception that happened. Note that it returns a C-
style string (just a const char *, not a
std::string), as those are simpler, and
constructing a std::string may throw an
exception. Often, you will override this method to
just return a string literal.
Be aware of the exception behavior of all code you work
with.
Whenever you call another function, you
should be aware of what exceptions it might throw.
Knowing this aspect of the code’s behavior is
crucial to understanding your own code’s
exception behavior, as well as to writing exception
safe code, which is the topic of the next section.
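Several of these guidelines come together in a short sketch of a custom exception type (the class name is our own). It inherits publicly from std::exception, allocates nothing, and overrides what() to return a string literal. (The throw() specification matches the pre-C++11 style used in this book; modern code would write noexcept instead.)

```cpp
#include <exception>
#include <cstring>

// A simple custom exception type: public inheritance from
// std::exception, no dynamic allocation, what() returns a literal.
class parse_error : public std::exception {
public:
  virtual const char * what() const throw() {
    return "input could not be parsed";
  }
};
```

Because parse_error inherits from std::exception, a handler written as catch(std::exception & e) will catch it by reference, as the guidelines above recommend.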
19.7 Exception Safety
One of the aspects that distinguishes amateur code that
generally works from well-written C++ code suitable to a
professional programmer is exception safety—what guarantees
the code makes in exceptional circumstances. We alluded to
this briefly earlier in Section 15.3.2 when we presented two
different versions of an assignment operator:
Strong Exception Guarantee
1 IntArray & operator=(const IntArray & rhs) {
2 if (this != &rhs) {
3 int * newData = new int[rhs.size];
4 for (int i = 0; i < rhs.size; i++) {
5 newData[i] = rhs.data[i];
6 }
7 delete[] data;
8 data = newData;
9 size = rhs.size;
10 }
11 return *this;
12 }
No Exception Guarantees
1 IntArray & operator=(const IntArray & rhs) {
2 if (this != &rhs) {
3 delete[] data;
4 data = new int[rhs.size];
5 for (int i = 0; i < rhs.size; i++) {
6 data[i] = rhs.data[i];
7 }
8
9 size = rhs.size;
10 }
11 return *this;
12 }
The second piece of code makes no exception guarantees.
The new operator may throw an exception (if the memory
allocation request cannot be satisfied). If this exception is
thrown, it will propagate outside of the operator= (which is not
equipped to handle this circumstance). However, by the time
the exception is thrown, data has been deleted, and the state of
this object is invalid. Even if the code that has called the
assignment operator can handle a memory allocation failure
exception, the object it attempted to assign to is in a corrupted
state (its data pointer is dangling) and will cause the program
to crash when it is used or destroyed (the destructor will double
free the data).
While such code would get by fine in an introductory
programming course, it is not really correct—in the real world,
problems happen, and the program must be able to deal with
them. Perhaps more importantly, when other parts of the
program try to deal with the problem, their job becomes
impossible when code makes no exception guarantees and
leaves objects in corrupted states.
Figure 19.1: Overview of exception guarantees.
At a minimum, professional code should provide basic
exception guarantees—it should ensure that if an exception
happens, the object’s invariants are maintained, and no memory
is leaked. The first piece of code in the assignment operator
example above provides a strong exception guarantee—beyond
just promising to respect invariants and not leak memory, this
guarantee means that if an exception happens, no side effects
will be visible. In our example, the object being assigned to will
either be properly assigned to (if no exception happens) or will
be completely unchanged (if new fails). An even stronger
exception guarantee is the no-throw guarantee (also called the
no-fail guarantee), which promises that the code will never
throw an exception—if any of the operations it performs fail, it
will handle the resulting exception(s). Destructors should
always provide a no-throw guarantee. If a class’s destructor
does not provide a no-throw guarantee, then any code that
creates instances of that class cannot provide any guarantee, as
it cannot ensure that the object it created will be properly
destroyed. Figure 19.1 summarizes the four standard levels of
exception guarantees.
The first step in providing any of these guarantees is to
understand the exception behavior of the operations that your
code uses. In our assignment operator example, the new
operation may throw an exception (specifically,
std::bad_alloc); however, the other operations involved
cannot. As there is only one possible source of failure, we can
provide a strong guarantee relatively easily—we perform the
operation that might fail before we make any changes to the
object.
More generally, however, we may need to go to greater
lengths to provide exception safety. Consider the following:
1 template<typename T>
2 class MyArray {
3 T * data;
4 size_t size;
5 public:
6 //other methods elided
7 MyArray<T> & operator=(const MyArray<T> & rhs) {
8 if (this != &rhs) {
9 T * newData = new T[rhs.size];
10 for (int i = 0; i < rhs.size; i++) {
11 newData[i] = rhs.data[i];
12 }
13 delete[] data;
14 data = newData;
15 size = rhs.size;
16 }
17 return *this;
18 }
19 };
This assignment operator looks quite similar to our strong
guarantee assignment operator for IntArrays; however, here
we have a templated class that holds any type T. Now, the
assignment statement on line 11 might throw an exception—for
example T might be IntArray, in which case the assignment
uses the operator=, which we just discussed. If the assignment
statement on line 11 throws an exception, this assignment
operator will leak memory—meaning it provides no exception
guarantees.
One way we could fix this code would be to catch the
exception, clean up our memory allocations, then re-throw the
exception:
1 template<typename T>
2 class MyArray {
3 T * data;
4 size_t size;
5 public:
6 //other methods elided
7 MyArray<T> & operator=(const MyArray<T> & rhs) {
8 if (this != &rhs) {
9 T * newData = new T[rhs.size];
10 try {
11 for (int i = 0; i < rhs.size; i++) {
12 newData[i] = rhs.data[i];
13 }
14 }
15 catch(std::exception & e) {
16 delete[] newData;
17 throw;
18 }
19 delete[] data;
20 data = newData;
21 size = rhs.size;
22 }
23 return *this;
24 }
25 };
This implementation of the assignment operator provides a
strong guarantee if and only if the copy assignment operator for
type T provides at least a basic exception guarantee (basic,
strong, or no-throw) and if the destructor for type T provides a
no-throw guarantee. Otherwise, this code provides no exception
guarantees. If the rest of our code is written correctly, these
requirements will be met—all code should provide at least a
basic guarantee, and destructors should always provide a no-
throw guarantee.
This approach—inserting try/catch to clean up whenever
we might have an exception—works but is not ideal. In fact,
cluttering up our code with all of the requisite try/catches goes
against one of the reasons we wanted exceptions in the first
place: to reduce the clutter from handling errors properly and
thoroughly. Fortunately, there are better approaches.
19.7.1 Resource Acquisition Is Initialization
In this particular case—where we want a sequence of elements
of type T—we could just use a std::vector anyways instead of
a dynamically allocated array. The vector will internally have a
dynamically allocated array, but the code for it is already
written in ways that provide proper exception guarantees. If we
have the object directly in the frame (as opposed to a pointer in
the frame to a dynamically allocated object in the heap), then
the object’s destructor will be invoked when an exception
propagates out of the function, destroying its frame.
We can use the fact that destructors are invoked on objects
in the frame when the exception destroys the frame to simplify
exception safe resource management (whether memory
allocation/deallocation or other resources) in the general case.
What we need is an object in the local frame that is constructed
when the resource is allocated and whose destructor frees that
resource (unless we explicitly remove the resource from that
object’s control). This design principle is called Resource
Acquisition is Initialization (or RAII for short).
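As a concrete illustration of this idea, here is a sketch (ours, not code from the text) of a MyArray-like class whose storage is a std::vector rather than a raw pointer. Because the vector owns the dynamically allocated array, the compiler-generated copy constructor, assignment operator, and destructor are already correct and exception safe:

```cpp
#include <cstddef>
#include <vector>

//Hypothetical rewrite of MyArray with a std::vector as its storage.
//The vector manages the underlying dynamically allocated array, so
//no Rule of Three methods need to be written by hand.
template<typename T>
class MyArray {
  std::vector<T> data;
public:
  MyArray() {}
  explicit MyArray(size_t n) : data(n) {}
  T & operator[](size_t i) { return data[i]; }
  size_t getSize() const { return data.size(); }
  //No operator=, copy constructor, or destructor: std::vector's own
  //versions copy the elements and provide proper exception guarantees.
};
```

Since no member directly holds a raw pointer, there is nothing to clean up if an exception propagates out: the vector's destructor runs when the frame is destroyed.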
C++’s STL provides a templated class,
std::auto_ptr<T>, which is designed to help write exception
safe code with the RAII paradigm. The basic use of the
std::auto_ptr<T> template is to initialize it by passing the
result of new to its constructor, then make use of the get method to
obtain the underlying pointer or the overloaded * operator to
dereference that pointer. If the auto_ptr is destroyed while it
still “owns” the pointer, it will delete the pointer it owns. You
can release the pointer from its ownership with the release
method, after which it will not destroy it (and get will return
NULL). For example (suppose A, B, X, and Y are classes):
1 //an example that does not do anything particularly useful
2 X * someFunction(A & anA, B & aB) {
3 std::auto_ptr<X> myX(new X());
4 std::auto_ptr<Y> myY(new Y());
5 aB.someMethod(myX.get()); //someMethod takes an X
6 *myY = anA.somethingElse(); //dereference pointer owned by myY
7 return myX.release(); //remove ownership of pointer, and return it
8 } //myY will delete the Y pointer it owns
This code provides a basic exception guarantee (as long as
everything it calls does too). On line 3, we allocate memory
(which might fail—but that is fine), but then we give ownership
of that memory to an auto_ptr. On line 4, we allocate more
memory (which also might fail—in which case, the exception
destroys the first auto_ptr, freeing the memory we allocated).
On the next line of code, we call someMethod on aB,
passing in the X * that new X() evaluated to (which is what the
auto_ptr is holding). On the next line, we use the overloaded *
operator to dereference the pointer owned by the auto_ptr
(equivalent to *myY.get()) and store the result of
anA.somethingElse() in that box. If either of these methods
throws an exception, both auto_ptrs will be destroyed, freeing
the corresponding memory.
Finally, we release ownership of the pointer in myX and
return it. The return value of release is the pointer that was
owned by the auto_ptr, but it also makes it so that the
auto_ptr owns no pointers. Now when myX is destroyed, it will
do nothing, which is good because if it destroyed the pointer it
previously owned, the return value of this function would be a
dangling pointer. When the function returns, its frame is
destroyed, including both auto_ptrs. As we just discussed, myX
no longer owns any pointer, so its destructor does nothing. myY
still owns a pointer, so it deletes that pointer.
We will note that std::auto_ptr does not work with
arrays (as it uses delete and not delete[]). The Boost Library
(see http://www.boost.org/) provides a
boost::interprocess::unique_ptr<T,D> templated class.
The second template argument specifies how to delete the
owned pointer, making it possible to use it properly with arrays.
C++11 adapts this template into std::unique_ptr and
deprecates std::auto_ptr (see Section E.6.9).
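Under C++11, the earlier someFunction example could be written with std::unique_ptr instead. This is a sketch with minimal stand-in definitions for the classes A, B, X, and Y (the member names here are our own invention, chosen only so the example is self-contained):

```cpp
#include <memory>

//Hypothetical stand-ins for the classes A, B, X, and Y from the text.
class X { public: int val; X() : val(0) {} };
class Y { public: int val; Y() : val(0) {} };
class A { public: Y somethingElse() { Y y; y.val = 7; return y; } };
class B { public: void someMethod(X * x) { x->val = 3; } };

//The auto_ptr example rewritten for C++11: a unique_ptr deletes the
//object it owns unless ownership is released first.
X * someFunction(A & anA, B & aB) {
  std::unique_ptr<X> myX(new X());
  std::unique_ptr<Y> myY(new Y());
  aB.someMethod(myX.get());     //someMethod takes an X *
  *myY = anA.somethingElse();   //dereference the owned pointer
  return myX.release();         //give up ownership so the X survives
}                               //myY still owns its Y and deletes it
```

As with auto_ptr, if either method call throws, both smart pointers are destroyed as the frame is destroyed, freeing the memory they own.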
19.7.2 Exception Safety Idiom: Temp-and-Swap
One exception safety idiom is to modify an object by creating a
temporary object (of the same type) and then swapping the
contents of the newly created temporary object with the original
object. This idiom provides a strong exception guarantee (if the
swap operation provides at least a strong exception guarantee—
preferably, it will provide a no-throw guarantee) as the
modifications all occur on the newly created temp object. If
anything goes wrong, temp will be destroyed during the
destruction of the stack frame, and the original object will
remain unchanged. If the modifications complete successfully,
then the contents of the modified objects are swapped with the
original, which updates the state of the original object and
leaves its old state in the temporary object (which will be
destroyed at the end of the function).
In this idiom, our code might look generally like this:
1 class SomeClass {
2 void makeSomeModifications() {
3 SomeClass temp(*this);
4 //make changes to temp
5 //...
6 std::swap(*this, temp);
7 }
8 };
Note that std::swap is a templated function that performs
the swap operation using the copy constructor and copy
assignment operator of the class involved.5 Accordingly, you
cannot use the general std::swap to implement the assignment
operator. We can, however, define our own swap operation
(which we could probably do in a more efficient manner) and,
if we wanted to, provide it as an explicit specialization6 of the
std::swap template.
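Here is a sketch of temp-and-swap applied to the assignment operator of a MyArray-like class (our own illustration, not code from the text), using a hand-written swap member that provides a no-throw guarantee because it only exchanges a pointer and a size:

```cpp
#include <algorithm>
#include <cstddef>

//Sketch: temp-and-swap applied to MyArray's assignment operator.
//Assumes a correct copy constructor and destructor (Rule of Three).
template<typename T>
class MyArray {
  T * data;
  size_t size;
public:
  MyArray() : data(NULL), size(0) {}
  explicit MyArray(size_t n) : data(new T[n]), size(n) {}
  MyArray(const MyArray & rhs) : data(new T[rhs.size]), size(rhs.size) {
    for (size_t i = 0; i < size; i++) { data[i] = rhs.data[i]; }
  }
  ~MyArray() { delete[] data; }
  void swap(MyArray & other) {        //no-throw: exchanges raw fields
    std::swap(data, other.data);
    std::swap(size, other.size);
  }
  MyArray & operator=(const MyArray & rhs) {
    MyArray temp(rhs);  //all work that might throw happens here
    swap(temp);         //no-throw commit point
    return *this;       //temp's destructor frees our old array
  }
  T & operator[](size_t i) { return data[i]; }
  size_t getSize() const { return size; }
};
```

Note that this version also handles self-assignment correctly without an explicit check: assigning an object to itself copies and then swaps, which is wasteful but safe, and if the copy throws, the original object is untouched (a strong guarantee).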
We will note that temp-and-swap is a conceptually simple
idiom for providing strong exception guarantees, but it comes
at a performance cost—making extra copies and moving data
around takes time. These hidden performance costs are a great
example of why you should fully understand the entire behavior
of every line of code you write. When using temp-and-swap,
you should think through how much extra copying and data
movement is involved and keep its performance cost in mind as
you continue to develop.
As with many C++ topics, we could write an entire
volume on exception safety, but our purpose here is not to delve
deeply into the depths of this topic. Rather, we want you to
understand that it is a significant concern in writing truly
correct code, give you some basic understanding of the issues
involved, and illustrate a few of the basic principles in writing
exception-safe code in a reasonable way. If you go on to
become a serious C++ programmer, this is a topic you should
learn more about from experience and an in-depth C++ book.
19.8 Real C++
As we finish out this part of the book, we want to take a
moment to note that C++ is a lot more than just C with classes
(or templates). RAII, which we have discussed in this chapter, is
a very significant part of that difference. An experienced C++
programmer seldom has pointers to dynamically allocated
objects, as they violate this principle. Instead, they use RAII to
manage their objects. A nice consequence of this design choice
is that if your objects do not have pointers to dynamically
allocated objects (or other data that requires special handling
for copy/destruction), then you do not have to write any Rule of
Three methods (remember: Ro3 does not say you have to
always write them—it says if you have to write any you have to
write them all)! We will also note that we told you to avoid
casting in C, but urge this even more strongly in C++. If you
feel like you need to cast in C++, you should know that “C
style” casts are legal, but generally not what you want as the
other casts let you be more specific about what exactly you are
trying to convert. See Section E.4.2 for more on C++’s casting
options.
As we enter the part of the book on data structures, we will,
however, manipulate dynamically allocated pointers directly.
Part of the reason for this is that we are not going to deeply
cover smart pointers7, which would give us a good solution to
handling our pointers with RAII. However, you will learn
everything important about these data structures, and write
acceptable C++ code by obeying the Rule of Three.
If you plan to go on to a career involving professional C++
programming, we recommend that you first read Section E.6 to
learn about C++11, which has some major changes over
C++03. You should then go on to learn about C++14 and
C++17. After that, you should read books specifically about
C++ (e.g., “Effective C++” and “Effective Modern C++” by
Scott Meyers).
19.9 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 19.1 : What is “silent failure”? Why is it the
worst way for a program to fail?
• Question 19.2 : What is “bulletproof code”?
• Question 19.3 : What is errno?
• Question 19.4 : What is a C++ exception? What does it
mean to “throw” one? What does it mean to “catch”
one?
• Question 19.5 : When an exception propagates out of a
function, what happens to the objects in that function’s
frame?
• Question 19.6 : When you throw an exception, should
you use new to create the exception object? If not, what
should you do?
• Question 19.7 : When you catch an exception, should
you catch a pointer to the exception type, a reference to
the exception type, or the value of the exception type?
• Question 19.8 : What happens if an exception
propagates out of main?
• Question 19.9 : What are the four levels of exception
guarantees that a function can make? What do each of
these exception guarantees mean? What is the minimum
level appropriate for professional code?
• Question 19.10 : What is RAII?
• Question 19.11 : Consider the following code, in
particular the function f:
1 template<typename T>
2 class SomeClass {
3 T a;
4 T b;
5 int * ptr;
6 //other methods elided
7 int f(int x) {
8 int * temp = new int[x];
9 int ans = 0;
10 for (int i = 0; i < x; i++) {
11 T z = a + b;
12 int num = g(z);
13 temp[i] = num;
14 ans += num;
15 }
16 std::swap(temp,ptr);
17 delete[] temp;
18 return ans;
19 }
20 };
Under what conditions (exception guarantees
provided by various functions and operators that f calls)
would f provide a no-throw exception guarantee? What
about a strong exception guarantee? What about a basic
guarantee?
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Part III Data Structures and
Algorithms
20 Introduction To Algorithms and Data Structures
21 Linked Lists
22 Binary Search Trees
23 Hash Tables
24 Heaps and Priority Queues
25 Graphs
26 Sorting
Chapter 20
Introduction To Algorithms and Data
Structures
At this point, you should have a pretty good idea how to write programs that are about
one to a dozen functions in size. You have practiced devising algorithms, translating them
to code, testing them, and debugging them—in both C and C++. However, our structures
for storing data so far have been rather simplistic—mostly arrays, or simple structs,
which we have manipulated directly everywhere in our program.
As our programs and the size of the inputs they work with grow, there are two
important concerns we need to focus on more. The first, which we have not discussed at
all so far, is writing efficient algorithms—ones that run fast. For our purposes so far,
inefficiency has not mattered—we have worked with small inputs and had no time
constraints. However, real programs often work on large data sets, and speed matters.
There are really two aspects to high performance programming. The first (which we
will focus on) is using an efficient algorithm. The second (which we will not focus on, as
it requires a detailed understanding of the underlying hardware) is a high-performance
implementation—optimizing the code for a particular algorithm.
One key aspect of an efficient algorithm is the choice of the correct data structure—
how we store the data makes access to its elements more or less efficient depending on
the operations required. So far, we have just stored our data in arrays, which provide fast
access to a given element, but we must examine each element if we want to search for a
particular item, find the max/min, or a variety of other operations. Examining every
element is fine if the array has size 10, but it may be slow if the array has size
10,000,000.
The second concern we must increasingly focus on is abstraction, which we
discussed previously in Chapter 13. However, in that context, we were primarily focused
on abstracting complex steps out into functions. Here, we will concern ourselves with
separating the interface—what a data structure can do—from the implementation—how
it does it—in the data structures that we design.
20.1 Big-Oh Notation
Before we begin our study of data structures in this part, we need to formalize our notion
of efficiency of an algorithm—that is, give a mathematical way we can clearly say
whether one algorithm is more efficient than another. The formalization we use is Big-Oh
notation, which considers the number of steps our algorithm must execute as a function
of its input size. Big Oh considers the asymptotic behavior—what happens for large input
sizes only—of that function and ignores constant factors—it considers, for example,
n and 2n to be “the same.”
You may wonder why we would want such a notion of efficiency—especially why
we would want to ignore constant factors. Since we are counting algorithmic steps, and
not all steps represent the same time cost on a real computer (the absolute and relative
time costs vary from one system to another), we are already dealing with an
abstraction of the steps that wipes away their relative costs, so we do not care exactly how
many there are—three times as many steps that are “cheaper” (take less time) may
actually be better than fewer steps that are more “expensive.”
However, what we do care about here is how the number of steps scales as the input
size grows large. From this theoretical perspective, we care primarily about this
asymptotic behavior for two reasons. First, computers are really fast—so on a small
input, the difference does not matter much. If your computer finishes a task in 100ns
versus 150ns, how much difference does that make? Such a difference only matters if you
repeat the task millions to billions of times, in which case you are dealing with a large n
anyways.
Second, if one function grows faster than another by more than a constant factor
(e.g., n^2 versus n), then for sufficiently large inputs, the faster growing function will
exceed the more slowly growing function—and the gap between them will continue to
increase as the input grows. This asymptotic dominance will occur no matter what the
constant factors are—the only difference they will make is where the faster growing
function overtakes the more slowly growing one.
Figure 20.1: Some examples of functions and Big Oh.
20.1.1 Definition
As Big-Oh notation is a mathematical formalism, it has a mathematical definition:1

f(n) ∈ O(g(n)) if and only if there exist c > 0 and n0 such that for all n > n0, f(n) <= c * g(n)

Read: f(n) is Big Oh of g(n) if and only if there exists c > 0 and n0, such that for all
n > n0, f(n) <= c * g(n). This definition may seem a bit intimidating, but it is much
easier to understand with a picture. Figure 20.1(a) shows two functions, f(n)
(in red) and g(n) (in blue). With these two functions,
f(n) ∈ O(g(n)). At the left side of the graph (i.e., for small numbers),
f(n) > g(n)—but we do not care about the behavior for small numbers; we are only
interested in the asymptotic behavior, for large numbers. We formalize this notion of
“only caring about large numbers” by picking some number (n0) that divides the line
between “small” (do not care about) and “large” (do care about). In Figure 20.1(a), n0 is
drawn as a green dashed line. For numbers greater than n0 (to the right of the green
dashed line), f(n) <= c * g(n), for some constant c (here c = 1 works fine, since
f(n) <= g(n) to the right of that line).
Note that the reverse is not true—g(n) is not in O(f(n)). No matter what we
pick for c and n0, we can always find an n > n0 where g(n) > c * f(n). Whatever
c and n0 we pick, we would need to show that g(n) <= c * f(n) for all n > n0. However,
doing the math, we would find some n > n0 for which
this statement is in fact false. No matter how large of a c we try, we will fail because
g(n) grows more quickly than f(n) by more than a constant factor—exactly the
behavior that Big-Oh notation is intended to capture.
Figure 20.1(b) illustrates why c is part of the definition of Big Oh. Here, we have
two functions where f(n) > g(n) everywhere. Suppose we want to show that
f(n) ∈ O(g(n)). We cannot pick an n0 such that f(n) <= g(n) for all n > n0;
however, part of the motivation for Big Oh is that we ignore constant factors—which is
why the definition multiplies g(n) by some constant c. Choosing a large enough c (and
an appropriate n0), we can easily show that f(n) <= c * g(n) for all n > n0. You should also
observe that in the case of these two functions, not only is f(n) ∈ O(g(n)), but also
the reverse: g(n) ∈ O(f(n)).
lim f(n)/g(n)      Is f(n) in O(g(n))?   Is g(n) in O(f(n))?
0                  Yes                   No
non-zero, finite   Yes                   Yes
∞                  No                    Yes
undefined          Unknown               Unknown
Table 20.1: Summary of Big-Oh relationships of functions by looking at limit behavior.
We will note that you can evaluate the Big-Oh relationship between two functions
by taking the limit of f(n)/g(n) as n goes to infinity. If this limit is 0, then f(n) is in O(g(n)), but the
reverse is not true. If this limit is a finite non-zero constant, both functions are Big Oh of
each other. If the limit is ∞, then g(n) is in O(f(n)). A fourth possibility exists: the
limit is undefined (that is, the ratio neither converges nor grows infinitely), in which case
we do not know. An undefined limit is what you will get if neither function is Big Oh of
the other (as with sin and cos); however, you can also see this result in cases where one
function is Big Oh of the other. These possibilities are summarized in Table 20.1.
20.1.2 Practical Big Ohs for Algorithms
We are mostly concerned with Big Oh as it pertains to algorithm runtimes (or space
requirements, as we will discuss shortly). In a theory or algorithms class, we would
formally prove that algorithms exhibit a particular runtime,2 and that the runtime is Big
Oh of whatever we claim it is. We are not going to delve into the formal proofs here, but
a serious computer scientist should learn how to do so as part of her education.
While it is possible to theoretically compare any two functions to see if one is Big
Oh of the other, the practical application is to consider which algorithms are more
efficient on large inputs than others. There are many functions that are rather illogical to
think of as describing the runtime of an algorithm—functions that shrink or oscillate as
the input grows—so we are not interested in them. Instead, we will focus on the categories that most runtimes
fall into. For each of these categories, we take the “simplest” function that describes the
class. For example, if a function is 3 * n + 2, it is also O(n), so we will
just talk about the entire class as O(n). We will also note that while a function that is
O(n) is technically also O(n^2), we generally pick the description that conveys the most
information. These classes are summarized in Table 20.2 and are not exhaustive—it is
possible to have reasonable algorithms in classes that we do not describe here, but they
come up less often.
O(1). From a practical perspective, the best we can hope to do is to devise an
algorithm with a runtime that has constant time, O(1)—an algorithm that can give an
answer in the same number of steps no matter what size of input it needs to operate on.
While it may seem like such algorithms are limited to trivial tasks, we will see that with
careful setup of our data structures, we will be able to perform meaningful tasks in
O(1) time, though it may require more work to set up the data structure in the first
place.
As a simple example, consider finding the smallest element in a sorted array. It takes
O(n * log(n)) work to sort the array to begin with, but once it is sorted, we can determine its
smallest (or largest) element in O(1) time—the smallest element is in index 0, the next
smallest in index 1, etc.
O(log*(n)). The next complexity class that comes up in practice in algorithms is
log-star time—those with runtime that is O(log*(n)), where log*(n) means “how many times
you have to take the log of n to get a number that is at most 1.” At the risk of making
theoreticians cringe, log*(n) is at most 5 for any practical purpose:
log*(2^65536) = 5. O(log*(n)) and O(1) differ in theory (math
can consider numbers larger than the number of atoms in the universe), but for practical
purposes, we need to have an input that can be stored somewhere in the physical
universe, so it is pretty safe to assume our input is smaller than the universe, and thus
log*(n) <= 5.
O(log(n)). The “next best” runtime that comes up often in practice is logarithmic
time, O(log(n)). O(log(n)) algorithms are actually quite common, as they arise
whenever our algorithm can split its input in half at each step. You can execute an
O(log(n)) algorithm yourself (called binary search—which we will learn more about
later) to find a word in the dictionary in few steps. Start in the middle, and see if the word
is before or after the first word on that page. If it is before, you can discard the entire last
half of the dictionary; if after, then discard the front half. Now repeat the process with the
remaining half (dividing it into half again). If your dictionary is 10,000 pages, you can
find the right page in about 14 tries (maybe a few more if you do not split exactly in half
each time). Anytime you can design an algorithm with O(log(n)) runtime, that is
generally great—for most practical purposes on a computer, log(n) <= 64, as most
computers can only address 2^64 pieces of data anyways (though we know of none that
actually have that much storage—one estimate3 suggests that Google’s total storage
capacity across all of its data centers is about 10^19 bytes).
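The dictionary procedure just described is binary search; a minimal sketch of it on a sorted array of ints (our own illustration) might look like this:

```cpp
#include <cstddef>

//Binary search on a sorted array: each step halves the remaining
//range, so the search takes O(log(n)) steps. Returns the index of
//target, or -1 if it is not present.
int binarySearch(const int * arr, size_t n, int target) {
  size_t lo = 0;
  size_t hi = n;             //search the range [lo, hi)
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (arr[mid] == target) {
      return (int)mid;
    }
    else if (arr[mid] < target) {
      lo = mid + 1;          //discard the front half
    }
    else {
      hi = mid;              //discard the back half
    }
  }
  return -1;
}
```

Each iteration halves the range, so a 10,000-element array is searched in at most about 14 comparisons, just like the dictionary example.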
Constant Time          O(1)
Practically Constant   O(log*(n))
Logarithmic Time       O(log(n))
Linear Time            O(n)
Linearithmic Time      O(n * log(n))
Quadratic Time         O(n^2)
Cubic Time             O(n^3)
…                      …
Polynomial Time        O(n^k)
Exponential Time       O(2^n)
Factorial Time         O(n!)
Table 20.2: Summary of Big-Oh classes most commonly found in algorithmic runtimes.
O(n). The next common class of algorithms is linear time algorithms, those
whose runtime is O(n). You have already seen and implemented many linear time
algorithms, such as finding the maximum element of an unsorted array (which must
examine every element in the array), converting a string to an integer (which must
examine every character in the string), and computing the factorial of a number (which
computes the factorial of every smaller positive number), just to name a few. For some
problems, a linear time algorithm is the best you can do (which is not too bad), but if you
need to work with a large n, the difference between logarithmic and linear algorithms
can be huge—just think about the difference between 1 billion and
log(1 billion), which is about 30.
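The first of those linear time examples can be sketched in a few lines (our own illustration); every element must be examined, since any element we skip might be the maximum:

```cpp
#include <cstddef>

//Finding the maximum of an unsorted array is O(n): the loop visits
//each of the n elements exactly once. Assumes n >= 1.
int findMax(const int * arr, size_t n) {
  int best = arr[0];
  for (size_t i = 1; i < n; i++) {
    if (arr[i] > best) {
      best = arr[i];     //found a new maximum so far
    }
  }
  return best;
}
```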
O(n * log(n)). The next common class of algorithms is linearithmic time
algorithms—those with a running time that is O(n * log(n)). These algorithms
typically arise from doing one or more operations that take O(log(n)) time to each
item in the data set. This class is also common in sorting, where it represents the
(provably) best possible runtime for a general comparison-based sorting algorithm.
O(n^2). The next common class of algorithms is the quadratic time algorithms—
those with a running time that is O(n^2). These algorithms arise frequently when the
algorithm must examine all pairings of the input data (of which there are O(n^2)) or
perform some O(n) operation for each item of the input. There are many sorting
algorithms that take O(n^2) time and are frequently used because they are
conceptually simpler than the O(n * log(n)) algorithms—if you were to think up a
sorting algorithm off the top of your head with no prior knowledge, you would almost
certainly think of one of the O(n^2) ones. Sometimes, quadratic is the best you can do,
but if you have to be efficient on large inputs, you might think hard about whether you
can come up with an O(n * log(n)) algorithm instead—for n = 1 billion, the difference between
n^2 = 10^18 and n * log(n), which is roughly 3 * 10^10,
is quite noticeable. If your computer performs 1 billion operations per second, the former
finishes in 1 billion seconds (about 31 years), while the latter finishes in 30 seconds.
O(n^3), O(n^4),… We can continue to think about larger (but
constant) exponents—O(n^3) is called cubic time, O(n^4) is called quartic time, and
then we generally do not have fancy names for them, especially as they become
increasingly less common in practice. However, we refer to all of the runtimes of the
form O(n^k) (where k is some constant and does not depend on n) as polynomial
time4—the runtime can be written as a polynomial in n. This class of algorithms is
important because they are considered to be tractable—solvable in a reasonable time for
reasonably sized inputs. For sufficiently large inputs, we may not be able to solve them
in a reasonable time—as with our example of an O(n^2) algorithm on an input of size
1 billion. However, even in that example, we could make the computation finish in a
reasonable time by throwing more computational resources at it—if we can parallelize
the algorithm enough to get a 1000x speedup, it will finish in 11 days. Such a speedup
and runtime may sound large, but if we are reasonably dealing with that size of input, we
probably have a data center or supercomputer available to us. An important point is that
anything that is not polynomial time is certainly intractable, as we will discuss shortly.
With exponential time—algorithms that run in O(2^n) time—we enter
the territory of intractable algorithms. Here, adding one to the size of the input doubles
the runtime. While this may not sound so bad, observe that for n = 60, we run into
the problem that we have about 10^18 steps—the same number as our O(n^2)
algorithm requires to work on an input of size 1 billion, which we estimated as
taking about 30 years. We could attack this problem with parallelism, but that will only
get us so far—we might be able to do n = 70 in a reasonable time with serious
computational resources, but n = 100 is pretty much impossible. To illustrate this
point, we show a rough approximation of the computing resources required to do various
problem sizes of an O(2^n) algorithm in about two weeks in Table 20.3.5 These
resources are described in terms of how many computers you would need if you were
using (a) a typical desktop, (b) the world’s fastest supercomputer, or (c) the combination
of the top 500 supercomputers.
Number of computers required if using…
Problem Size   Desktop Computer   World’s Top Super Computer   Combined Top 500 SCs
n = 50         1                  —                            —
n = 60         500                —                            —
n = 70         500,000            0.01                         —
n = 80         500,000,000        10                           1
n = 90         500 billion        10,000                       1,000
n = 100        500 trillion       10,000,000                   1,000,000
Table 20.3: Approximate compute power required to finish various sizes of an O(2^n) problem in two weeks.
— indicates the problem is too small to use that much compute power.
Notice how increasing the input size by 10 increases the computational power
required by about three orders of magnitude (500 desktops to 500,000 desktops, etc.).
Such an increase puts you in an entirely different league of computational power
requirements. Going from 50 to 60 in our example takes us from a personal computer
(many people have them) to a small data center of 500 computers (which a moderately
sized corporation could reasonably have). The next three-orders-of-magnitude step
takes you to where you would need half a million desktop computers to finish in two
weeks.
At this point, you are probably more likely to want to use a super computer. If your
algorithm has a constant factor around 10, this machine will do the job for you in about
two weeks. Even if your algorithm has lower constant factors, it is now not ridiculous to
use it for the task at hand (it would be silly to use a super computer for a task that a
desktop could handle reasonably).
Beyond this point (at n = 80 in our example), we enter the realm where no
single entity on the planet has that much compute power. You need either half a billion
desktops, or a combination of the world’s top super computers working together—
roughly the compute capacity of the top 500 combined.
By the time we reach the bottom of the table (n = 100), we have exceeded the
computational power of the entire world. You simply cannot compute a problem of this
size in any reasonable amount of time, no matter how hard you try, or what resources you
have—even for the largest/most powerful corporations and/or governments. You might
wonder about the possibility of a large government amassing that much
computational power in secret. However, the power (i.e., electricity) needed to perform
that compute likely exceeds world-wide total power consumption, which seems rather
difficult to keep secret.
The purpose of this discussion was not for you to try to remember what compute
power is required for various problem sizes, but to illustrate the meaning of intractable.
For an input of size 100 (which is really quite small), solving the problem in a reasonable
amount of time is impossible. For such a problem, it is not just a matter of getting more
computational resources to speed things up—if you could dedicate every computer on the
planet to your task, it would still take hundreds of years.
Having an exponential time algorithm to solve your problem is clearly bad—you
just cannot use it on any significant problem size. Whenever possible, you should find a
way to improve your algorithm to get a polynomial time solution. However, such
improvements are not always possible (as far as the best computer scientists and
mathematicians in the world know).
There is a special class of problems, called NP-complete problems, where the best
known algorithm for any of these problems requires exponential time. Furthermore, a
polynomial time solution to any NP-complete problem can be transformed into a
polynomial time solution to any other NP-complete problem! Whether such a
polynomial time algorithm exists is an open question, and proving the answer one way
or the other is worth a million dollars. This class of problems is incredibly interesting,
especially as it includes
many useful problems in a wide variety of fields. Note that this is a rather informal
description of an important class of problems with a formal, mathematical definition.
However, the theory/math/formalism is beyond the scope of this book. We encourage all
serious programmers to take a class on complexity theory during their studies.
Exponential time is bad enough, but factorial time—algorithms whose
runtimes are O(n!)—is even worse (and thus, also intractable). To see what we mean
by “even worse,” observe the relationship between 2^n and n!:
2^n = 2 * 2 * 2 * … * 2 * 2 (n factors, each equal to 2), while
n! = n * (n - 1) * (n - 2) * … * 2 * 1 (n factors, almost all of which are larger than 2).
From this relationship, we can see that we hit the limits of feasibility much more
quickly. While the O(2^n) problems become literally impossible around n = 100, the
O(n!) problems become literally impossible around n = 30.
20.1.3 Typical, Worst, or Amortized
When we consider the Big Oh of the runtime of an algorithm, we may consider a few
different situations, which might present different answers. One way that we might
analyze the algorithm is to consider the typical case—what happens for “most” inputs.
Another way that we might analyze the algorithm is to consider what the worst case is.
For some algorithms, the typical behavior may be much better than the worst case
behavior. For example, we will see in Section 26.2.3 that the typical behavior of the
quick sort algorithm is O(n * log(n)), while the worst case behavior is O(n^2).
Another way that we might analyze the runtime of an algorithm is the amortized
behavior—if we do the operation n times, what is the total runtime of all of those
operations, divided by n. A great example of amortized behavior is adding to a vector
(i.e., std::vector). The vector maintains an array with a certain capacity (which may be
larger than the number of elements in the vector). When a new item is added, if there is
sufficient capacity in the array, it can be added in O(1) time. However, if there is not
sufficient capacity, the array must be resized, which is an O(n) operation (as each
element must be copied to the new array). By doubling the capacity instead of adding
space for one element, the cost of resizing can be amortized over the next n additions,
resulting in the add operation having an amortized runtime of O(1).
To see how this amortized analysis works, imagine that the vector has n elements
in it already, and its capacity is full. Now, consider the next n additions to the vector. If
we increase the capacity by one for each addition, then we will do n copies, each of
which takes O(n) time. The result will be O(n^2) work, for n operations—
resulting in amortized behavior of O(n^2) / n = O(n). However, if we double the
capacity, then we do one O(n) copy, followed by n - 1 inserts that have adequate space.
We therefore do O(n) work for n operations—resulting in amortized behavior of
O(n) / n = O(1).
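You can observe this behavior with std::vector itself. The sketch below (the helper name is our own) counts how many reallocations occur while appending n elements; with geometric growth the count is logarithmic in n, not linear:

```cpp
#include <cstddef>
#include <vector>

//Count how many times std::vector reallocates its internal array while
//push_back-ing n elements. With geometric (roughly doubling) growth,
//this count is O(log(n)), which is how push_back stays amortized O(1).
size_t countGrowths(size_t n) {
  std::vector<int> v;
  size_t growths = 0;
  size_t lastCap = v.capacity();
  for (size_t i = 0; i < n; i++) {
    v.push_back((int)i);
    if (v.capacity() != lastCap) {  //a reallocation (copy) happened
      growths++;
      lastCap = v.capacity();
    }
  }
  return growths;
}
```

The exact growth factor is implementation defined (commonly 1.5x or 2x), but any geometric growth yields the amortized O(1) behavior described above.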
20.1.4 Space
We can use Big-Oh notation to describe space requirements as well as time requirements.
Generally if people say “Algorithm A is O(f(n)),” they mean “Algorithm A’s runtime
is O(f(n)).” However, we can also discuss the space required by a particular
algorithm/data structure to store things. It may seem a bit odd to have storage
requirements that are not O(n), since an initial intuition might be that you need to
store exactly the data that is your input. However, we may need to store less—consider a
program that reads integers from the network interface and prints the current maximum
value whenever a new high is encountered. Such a program can use an algorithm that
requires O(1) space—it only needs to store the old maximum, not all of the input data.
On the flip side, we might want more than O(n) space, typically to allow for
algorithms with more efficient runtimes. We might, for example, build a table with
information about all pairings of inputs (requiring O(n²) space), so that we can look
up the information about any given pair in O(1) time without having to compute
anything.
20.1.5 Limitations
We close our discussion of Big-Oh by noting that it is not a perfect measure of efficiency:
constant factors do matter. If algorithm A executes 1,000,000 · N steps and
algorithm B executes 2 · N · log₂ N steps, then which is more efficient? Algorithm A’s runtime is
O(N), while B’s is O(N log N), so we generally prefer A. However, Algorithm A’s
asymptotic advantage only manifests for large N—in this case, once
2 · log₂ N > 1,000,000 (i.e., N > 2^500,000). For small N, Algorithm B is more efficient. For example, if
N = 1000, then Algorithm A will require 1,000,000,000 steps (1 billion), while
Algorithm B will only require about 20,000 (20 thousand).
20.2 Abstract Data Types
As our programs grow in complexity, we must be more and more careful about how we
design them to make their correct implementation and future maintenance a manageable
task. One key tool in accomplishing this goal is abstraction—the separation of interface
from implementation. In the context of data structures, this abstraction arises from
separating the abstract data type (abbreviated ADT) from its concrete implementation.
An abstract data type defines the interface—and only the interface—for a data
structure. It specifies what the data structure does but says nothing about how it
accomplishes those tasks. This separation provides a level of abstraction.
As we saw earlier, abstraction provides a nice way to break a larger problem down
into multiple smaller problems, making the entire problem easier to think about and thus
solve. The same principle applies to ADTs: you can define the interface to your data type
and think about what it needs to do as you write the algorithms that use it, without
worrying about how it accomplishes those tasks. You still need to implement the data
structure at some point, but that becomes a separate (and smaller) task.
In larger settings, abstraction boundaries provide more advantages. As your
programs grow large enough that they are done by teams of more than one person,
abstraction is crucial to division of work. No matter how you divide up the work, there
will be some point where one person’s code uses the interface provided by another
person’s code. In the case of ADTs, one person could write the algorithms that use the
data structures, while another may write the data structures themselves. As we will see,
some common ADTs have implementations in the C++ STL—which is an extension of
this principle. You do not need to know the details of how the authors of the STL
wrote their data structures; you just need to know what they can do for you.
Good abstraction makes it easy to change implementations of the data structure. For
example, suppose you are developing a piece of software and define an ADT. You respect
your abstraction boundaries—using only the operations in the interface—and start with a
simple implementation of the ADT that is correct but slow. This setup is great for getting
something working and small-scale testing. As you test, you encounter bugs and write a
debug implementation of your data structure that logs what operations you perform—you
can now change to using your debug class without changing any of the code that uses the
data structure. After you debug your code, you find that the simple/slow implementation
of this data structure is a performance bottleneck and invest the time in a more
sophisticated and efficient implementation. Again, your abstractions guarantee that you
can switch the implementation details without affecting the code that uses the data
structure.
While there are many useful ADTs, we will start our study with a focus on four
common ones: queues, stacks, sets, and maps. For each of these ADTs, we will talk about
some uses of the ADT, how we might define the ADT in C++, and how we might
implement the ADT using an array or vector. Of course, as they are ADTs, we might
implement them in a variety of ways. However, arrays are the only data structure we have
seen so far, so for now, we will limit our discussion to array-based implementations. In
the following chapters, we will learn about other data structures and see that we can
implement these ADTs more efficiently with these new structures.
20.3 Queues
A queue is a first-in first-out (FIFO) sequence of items. The primary operations of a
queue are enqueue and dequeue, where enqueue inserts an item into the queue, and
dequeue retrieves the next item from the queue (removing it in the process). The FIFO
nature of the queue means that the items returned by dequeue operations are in the same
order as they were placed in the queue by enqueue operations. A queue ADT may support
other operations (such as testing if the queue is empty, obtaining a count of how many
elements are in the queue, or peeking at the next element—seeing what it is without
removing it); however, the defining aspect of a queue is the FIFO relationship between
enqueue and dequeue operations. Conceptually, a queue is much like “standing in line”—
the first person to get in line is the first person to get served.
Figure 20.2: A queue is conceptually like standing in line. The first item in is the first item out (FIFO).
Figure 20.2 illustrates the conceptual principle of a queue with the analogy to
standing in line. Note that the FIFO nature of the queue means that we place items into
one end of the queue (i.e., the tail) and remove items from the other end (i.e., the head).
20.3.1 Uses of Queues
Queues show up in a wide variety of circumstances in computer programs. Many times,
requests of some sort must “wait in line” until they can be processed. Sometimes, these
requests are for a hardware resource, where only one operation can proceed at a time. If
multiple requests are made for the hardware resource close together in time, some of
them must wait. A queue is a natural choice in many cases, as servicing the requests in
the order that they were made is “fair” and ensures that no request will wait forever.
Queues also appear in a variety of algorithms. Frequently, we find that we may wish
to process one piece of data, and in processing that, it generates several “next” pieces of
data. We wish to immediately process all of those pieces of data before processing any of
the “next” data that they generate. That is, if item 1 creates items 2, 3, and 4, we wish to
process 2, 3, and 4 next. Even if item 2 creates items 5 and 6 to process, we still wish to
process 3 before 5. A queue is the natural choice for an ADT for such an algorithm, as we
can enqueue each item into the queue as it is generated, then dequeue each item in the
proper order.
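A minimal sketch of this pattern using the STL’s std::queue (the item numbers match the example above, where item 1 generates 2, 3, and 4, and item 2 generates 5 and 6):

```cpp
#include <queue>
#include <vector>

// Process items in FIFO order: each processed item may generate more
// work, which waits its turn behind items already in the queue.
std::vector<int> processingOrder() {
  std::queue<int> work;
  std::vector<int> order; // record of the order items were processed
  work.push(1);
  while (!work.empty()) {
    int item = work.front();
    work.pop();
    order.push_back(item);
    if (item == 1) { work.push(2); work.push(3); work.push(4); }
    if (item == 2) { work.push(5); work.push(6); }
  }
  return order;
}
```

Because the queue is FIFO, items 3 and 4 are processed before 5 and 6, even though 5 and 6 were generated while 3 and 4 still waited.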
20.3.2 ADT Definition
We might make an ADT for a generic Queue class (a queue that can hold any type of
data), as follows:
1 template<typename T>
2 class Queue {
3 public:
4 void enqueue(const T & item);
5 T dequeue(); //might choose to return void instead
6 T & peek();
7 const T & peek() const;
8 int numberOfItems() const;
9 };
Observe how this ADT defines an interface but says nothing about the
implementation—those details would be private to the class and are not needed to use the
class. We can tell exactly what operations are available (and what types they operate on),
but nothing is specified about how they are implemented. By nature of this ADT being a
queue, we mean for enqueue and dequeue to behave in a FIFO fashion; however, C++
(and most other programming languages) do not have a way to specify this constraint in
the interface description. Allowing arbitrary behavioral constraints to be specified in an
interface description and checked in an implementation is (provably) something that
cannot be done automatically. A language that supported this feature would require the
programmer to help formally prove the expressed invariants in many cases.
C++’s STL (Standard Template Library) has a std::queue class, which implements
a generic queue. The names for the methods in the STL implementation of the queue are
different from what is listed above: enqueue is called push and dequeue is called pop.
The STL implementation uses these nonstandard names for the queue operations so that
the queue class has the same interface as other classes that contain sequences of elements
(such as the stack class, which we will talk about shortly). Having the same interface
means that code can be templated over the type of class it operates on, so the same code
can take a queue or a stack (or another class with the same interface).
Another difference between the queue ADT we presented here, and that found in the
C++ STL is that the STL implementation of the queue has its pop method return void (it
only removes the element, it does not return it). The reason for this implementation
choice is that it avoids copying the item out of the queue to return it when it is removed.
Thus, a programmer needs to peek at the item in the queue, use it, and then pop it. Note
that since peek returns a reference to the item inside the queue, that reference is not valid
after the item is popped. If the programmer wishes to use the item after it is popped from the
queue, she must copy the value first.
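A short sketch of the resulting idiom (assuming a queue of strings):

```cpp
#include <queue>
#include <string>

// The STL queue's pop() returns void, so the idiom is: look at the
// front, copy it if needed, then pop. The reference returned by
// front() is not valid once the element is popped.
std::string takeOne(std::queue<std::string> & q) {
  std::string item = q.front(); // copy the value out first
  q.pop();                      // now remove it from the queue
  return item;
}
```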
You can read all about the std::queue class in C++’s STL online:
http://www.cplusplus.com/reference/queue/queue/.
20.3.3 An Array or Vector Implementation
If we want to implement a queue in an array or vector, we typically need two indices,
which indicate the head and tail of the queue. When we enqueue an item, we use the tail
index to determine where to place that item and then increment the tail index modulo the
size of the queue. Dequeuing an item behaves similarly, except we take from (and
increment) the head index.
If our queue is a fixed size (that is, it has a certain maximum number of elements,
and can never hold more than that), then we can just create an array (or vector) of that
size. However, we often may want to have a queue that can grow arbitrarily large—
holding however many elements we need it to. If we need this functionality with an
array- or vector-based queue, we can accomplish it by allocating more space, but we have
to do a little bit of work in the process. The issue we have to be careful of is that the head
may be in the middle of the queue, with the tail physically behind it (that is, numerically
tail < head). In such a case, adding more elements at the end of the array/vector would
place them in the middle of the queue (logically between the head and the tail). We
therefore need to explicitly copy them from the smaller array/vector to the larger
array/vector in a way that respects the logical ordering of the queue. Video 20.1
illustrates.
Video 20.1: The operation of an array/vector-based queue, including
resizing to add more elements.
With an array- or vector-based implementation, we can pop from a queue in O(1)
time. When we enqueue, our worst case is O(n), as we may have to resize the queue
(copying all the elements). However, as long as we double the capacity each time that we
need to resize, we can achieve amortized O(1) behavior.
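As a rough sketch of this design (ours, simplified to a queue of ints rather than a template), a head index, a count, and modular arithmetic give the circular behavior, and the resize copies elements in logical order so a wrapped tail is "un-wrapped":

```cpp
#include <cassert>
#include <cstddef>

// Minimal circular-buffer queue of ints with capacity doubling.
class IntQueue {
  int * data;
  size_t head;     // index of the next item to dequeue
  size_t count;    // number of items currently in the queue
  size_t capacity;
public:
  IntQueue() : data(new int[4]), head(0), count(0), capacity(4) {}
  ~IntQueue() { delete[] data; }
  size_t numberOfItems() const { return count; }
  void enqueue(int item) {
    if (count == capacity) {
      // Full: copy into a larger array in logical (head-first) order,
      // which handles a tail that sits physically before the head.
      int * bigger = new int[capacity * 2];
      for (size_t i = 0; i < count; i++) {
        bigger[i] = data[(head + i) % capacity];
      }
      delete[] data;
      data = bigger;
      head = 0;
      capacity *= 2;
    }
    data[(head + count) % capacity] = item;
    count++;
  }
  int dequeue() {
    assert(count > 0);
    int item = data[head];
    head = (head + 1) % capacity;
    count--;
    return item;
  }
};
```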
20.3.4 Deques
Sometimes programmers want the ability to add or remove from both ends of the queue.
Such a data structure is called a deque (pronounced like “deck”), which stands for
“double ended queue.” We are not going to focus on deques much, but you should have
at least heard of them. The most common use of deques is in work stealing scheduling
algorithms. Since we will not learn about concurrency until Chapter 28, we will not
discuss this example until that chapter.
You can read all about the std::deque class in C++’s STL online:
http://www.cplusplus.com/reference/deque/deque/.
20.4 Stacks
A stack is a last-in first-out (LIFO) sequence of items. The primary operations on a stack
are push (which places an item onto the stack) and pop (which obtains and removes an
item from the stack). Unlike a queue, in which items are returned in order, pop returns the
most recently pushed item that has not been popped. That is, if we push 1, then 2, then 3,
and then perform a pop operation, we will get 3. If we pop again, we will get 2. If we then
push 4, the next pop operation will return 4. As with a queue, a stack ADT might support
other operations as well if we want it to. Conceptually, a stack functions like a stack of
plates, where plates are added and removed at the top of the stack.
Figure 20.3: A stack is conceptually like a stack of plates. The last item added is the first item out (LIFO).
Figure 20.3 illustrates the analogy of a stack of plates. Items are both added to and
removed from the same end of the stack (i.e., the top). Accordingly, the last item in is the
first item out—whatever plate you set on the stack most recently is what you would take
back off the stack next.
20.4.1 Uses of Stacks
Stacks are also ubiquitous in programming. In fact, we have been working with one stack
since Chapter 2—the call stack is the stack of frames the computer maintains. Even
though we do not write an explicit data structure for this stack (nor push/pop it directly),
it is still a stack. Instead, the compiler generates code to push a new frame when we call a
function, and to pop the current frame when we return from a function. The behavior of
these frames exhibits the LIFO behavior characteristic of stacks—the most recent
function to be called is the first one to return.
Stacks have a variety of other applications as well. If you want to reverse any
sequence of items, you can do so by pushing the items onto a stack in the order you
encounter them, then pop them off in reverse order. Stacks are also quite useful for nested
matching, such as with parentheses and braces in a programming language. For example,
if we have (4 + (3 * 2) - (8 * 9) + 1), the first close parenthesis matches the
second open parenthesis, not the first. These parentheses match in a LIFO fashion: a
close parenthesis matches the most recent open parenthesis that has not yet been closed.
If we wanted to write a program to match these, we would want to use a stack.
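For example, a sketch of such a checker using the STL’s std::stack might look like this (only ( and ) are considered):

```cpp
#include <stack>
#include <string>

// Returns true if every ')' matches the most recent unmatched '('
// and no '(' is left over at the end (LIFO matching).
bool parensBalanced(const std::string & s) {
  std::stack<char> open;
  for (char c : s) {
    if (c == '(') {
      open.push(c);
    }
    else if (c == ')') {
      if (open.empty()) { return false; } // ')' with nothing to match
      open.pop();                         // matches the latest '('
    }
  }
  return open.empty(); // any leftover '(' is unmatched
}
```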
Video 20.2: Using a stack to keep track of which HTML tags are open.
The parentheses matching problem above is a specific case of a more general (and
potentially complex) useful task: parsing. Parsing refers generally to the task of having a
computer “understand” an input file. A parsing task may be quite simple, such as reading
a sequence of lines of the form key=value and splitting it up into key/value pairs.
However, parsing tasks may also be quite complex. One of the first things the C compiler
must do is parse the code you have written, building a data structure it can use for later
tasks. Another parsing task might be reading an HTML file and figuring out how to
format the document—determining which tags are open at each point in the input.
Video 20.2 illustrates how a stack is a natural match for this task. We will not go into the
details of parsing here but will note that parsing algorithms that can deal with arbitrary
nesting basically have to use a stack.
Another common application of stacks is the “undo” feature found in many end-user
programs. Here, the program maintains a stack of the states of the document (or image, or
whatever) or the differences between states. Whenever the user changes the document, a
new state is pushed onto the stack. If the user wants to undo a change, then the last
change is the one that should be undone, so the stack is popped. A “redo” operation can
be supported by adding a second stack. Whenever a state is popped from the undo stack,
it is pushed onto the redo stack.
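A minimal sketch of this two-stack design (a hypothetical interface, storing whole states as strings rather than the diffs a real program might use):

```cpp
#include <stack>
#include <string>

// Undo/redo with two stacks of document states.
class History {
  std::string current;
  std::stack<std::string> undoStack;
  std::stack<std::string> redoStack;
public:
  History(const std::string & initial) : current(initial) {}
  const std::string & get() const { return current; }
  void change(const std::string & next) {
    undoStack.push(current);       // remember the old state
    current = next;
    while (!redoStack.empty()) {   // a new change invalidates redo
      redoStack.pop();
    }
  }
  void undo() {
    if (undoStack.empty()) { return; }
    redoStack.push(current);       // undone state becomes redo-able
    current = undoStack.top();
    undoStack.pop();
  }
  void redo() {
    if (redoStack.empty()) { return; }
    undoStack.push(current);
    current = redoStack.top();
    redoStack.pop();
  }
};
```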
20.4.2 ADT Definition
We might make an ADT for a generic Stack class (a stack that can hold any type of data),
as follows:
1 template<typename T>
2 class Stack {
3 public:
4 void push(const T & item);
5 T pop(); //might choose to return void instead
6 T & peek();
7 const T & peek() const;
8 int numberOfItems() const;
9 };
As with our Queue ADT, our Stack ADT specifies the interface but not the
implementation details (which would be private). Again, we cannot express the intended
LIFO behavior of these operations in a compiler-checkable way. As such, our Stack
looks much like our Queue, except that the function names have been changed to match
the traditional push/pop names for a stack. C++’s STL also has a std::stack class with a
similar interface to the std::queue class (but LIFO instead of FIFO behavior).
Documentation for the STL std::stack can be found online:
http://www.cplusplus.com/reference/stack/stack/.
20.4.3 An Array or Vector Implementation
Implementing a stack as an array is more straightforward than implementing a queue
with an array, as elements are only added to and removed from one end. Using a vector
further simplifies the task, as the vector can resize as needed to accommodate more items
in the stack. When using a vector, an item can be pushed onto the stack just by using the
push_back operation to place the item at the end of the vector. Likewise, an item can be
popped from the stack by using the vector’s pop_back to remove the last item from the
vector.
Using a vector gives amortized O(1) push and pop behavior for a stack, as those
are the runtimes of the underlying vector operations. However, any particular push
operation might take O(n) time if the vector must be resized (recall that the vector
doubles its capacity whenever it must resize so that the cost of copying all of the
elements is amortized over many additional operations). Using an array gives the same
behavior as long as you double the capacity whenever it must be resized to obtain the
same amortized costs.
20.5 Sets
A set is simply a collection of elements, much like those found in mathematics. As such,
a set ADT supports operations similar to those found on a mathematical set, such as
adding items, testing if an item is in the set, checking if the set is empty, taking the union
of two sets, and intersecting two sets. Set ADTs frequently support operations we do not
commonly think of in math, such as iterating through all of the elements. A variant of this
ADT is a multiset (also called a bag), which allows the same element to appear multiple
times, whereas a true set either has an object or does not.
Note that while mathematical sets may easily be infinite (e.g., a mathematician can
simply write “consider the set of all natural numbers…”), computers must have a finite
representation of anything they will compute on. This constraint means that, while you
can represent an infinite set, you must implement the set in such a way as to represent the
set in finite memory. If you ever find yourself working with infinite sets, you will need to
think carefully about how to represent them and how to implement the operations that
you require on them. We will limit ourselves to finite sets here. A consequence of this
restriction is that we cannot take the complement of a set (the complement of a finite set
of natural numbers is an infinite set).
20.5.1 Uses of Sets
Sets have many uses in programming, as tracking what items belong to a particular group
of things is quite a common task. For example, if we were making some word-based
game and we needed the program to track the set of valid words (e.g., so it could
determine if a word a user played is legal), then a set ADT is natural. Not only does it
correspond intuitively to what we described wanting (i.e., we want the set of valid
words), but the main operation we want is to test if a particular item (namely the word
that was played) is in the set of words.
As another example, suppose we wish to write a program for task scheduling. In
such a program, each task may need some resources, which would be naturally
represented with a Set ADT. Our program might then analyze whether or not two tasks
could be performed at the same time—which would require testing if the intersection of
their resource sets is empty (they cannot use the same resources at the same time). If they
can be scheduled together, the resources in use by the paired task would be computed by
taking the union of their resource sets.
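A sketch of this schedulability test using the STL’s std::set and std::set_intersection (the resource names in the test are hypothetical):

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Two tasks can run at the same time only if their resource sets do
// not intersect. std::set iterates in sorted order, which is what
// std::set_intersection requires of its input ranges.
bool canSchedule(const std::set<std::string> & a,
                 const std::set<std::string> & b) {
  std::set<std::string> both;
  std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                        std::inserter(both, both.begin()));
  return both.empty(); // empty intersection: no shared resources
}
```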
In Chapter 25, we will also see that sets are quite useful for algorithms that need to
track which items in a data structure they have already worked on.
20.5.2 ADT Definition
We might define our Set ADT like this:
1 template<typename T>
2 class Set {
3 public:
4 void add(const T & item);
5 bool contains(const T & item) const;
6 int numItems() const;
7 void remove(const T & item);
8 Set<T> intersect(const Set<T> & s) const;
9 Set<T> unionSets(const Set<T> & s) const;
10 };
This ADT contains the basic features we might want in a set, which we discussed
above. We might also want to add iterators (e.g., iterator begin(), iterator end()).
In such a case, the type for the iterator and its interface become part of the interface of
our Set (much as they are in the STL’s vector).
Documentation for the STL std::set can be found online here:
http://www.cplusplus.com/reference/set/set.
20.5.3 An Array or Vector Implementation
We could implement a Set ADT with an array or vector (although, as we will see, this
approach is relatively inefficient). To do so, we would store each element in the array (or
vector) and add an element by first checking if it is already in the set, and if not, placing
it at the end of the array (reallocating storage as needed). In an array-based
implementation, we would need to keep an explicit count of how many items are in the
array. Because a vector abstracts away the details of reallocation and tracks the number
of elements, it is often a more logical choice.
To test if our array/vector-based Set contains a particular element, we would just
iterate over all of the elements in the array/vector and see if they are the item we are
looking for. Intersection can be implemented by iterating over one set, and for each item,
checking if the other contains it or not. If so, we add the item to our answer. Similarly,
union can be implemented by iterating over both sets and adding the items from each set
to the answer.
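A partial sketch of this approach (ints only, and just add and contains; Question 20.2 asks you to write the full version):

```cpp
#include <vector>
#include <cstddef>

// Vector-based set of ints: elements are stored unsorted, and
// uniqueness is enforced by checking membership before adding.
class IntSet {
  std::vector<int> items;
public:
  bool contains(int x) const {
    for (size_t i = 0; i < items.size(); i++) {
      if (items[i] == x) { return true; } // linear scan
    }
    return false;
  }
  void add(int x) {
    if (!contains(x)) {      // keep elements unique
      items.push_back(x);
    }
  }
  size_t numItems() const { return items.size(); }
};
```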
We will leave analyzing the runtime of a vector-based implementation as an exercise
for you.
20.6 Maps
A map tracks a mapping from keys to values. Adding to a map requires both the key and
the value—updating the mapping so that the specified key is then mapped to the
requested value. To look up an item in a map, one provides the desired key, and the map
returns the corresponding value (or some indication that the key is invalid if it is not
currently in the map).
20.6.1 Uses of Maps
Maps are one of the most ubiquitous ADTs in programming—programmers often want to
associate one piece of information (a key) with some other information (the value that
goes with it) and then later look up that information based on the key. As an example,
consider a social networking site. We might want to maintain a map where the key is a
user ID, and the value is a list of user IDs representing the user’s friends. We might also
map a user ID to the user’s profile information. Note that in a real, large-scale social
networking site, this map would be implemented with a database for scalability, but the
principle is the same.
In the example from Chapter 13, we were effectively working with maps
implemented as simple arrays, although we did not use that abstraction around them. If
you recall, in that example, we read in a mapping from students to the list of classes they
were taking, converted it to a mapping from classes to the list of students in each class,
and printed the results back out.
20.6.2 ADT Definition
We might define our Map ADT like this (note that the ADT is generic in terms of both the
type for the keys—K—and the type for the values—V—which may be different):
1 template<typename K, typename V>
2 class Map {
3 public:
4 void add(const K & key, const V & value);
5 const V& lookup(const K & key) const;
6 V & lookup(const K & key);
7 int numItems() const;
8 void remove(const K & key);
9 };
This ADT contains the basic features we might want in a Map, which we discussed
above. We might also want to add iterators (e.g., iterator begin(), iterator end()).
In such a case, the type for the iterator and its interface become part of the interface of
our Map (as we discussed in Set). If we have an iterator for our Map, we might want to
have it return a std::pair with the key and value for each element. We could also add
other features to our Map ADT—such as a method that returns a Set of the keys.
Documentation for the STL std::map can be found online here:
http://www.cplusplus.com/reference/map/map.
20.6.3 An Array or Vector Implementation
Implementing a Map with an array or vector is fairly similar to implementing a Set in the
same fashion; however, the array/vector would hold pairs of key/values. Looking up an
item would iterate over the array/vector, searching for a matching key, and then return the
corresponding value. However, this raises the issue of what to do if the corresponding
key is not found. In a Set, the operation to search the structure just returns true or false—
indicating if the item is there. However, for a Map, the searching operation returns the
value—leaving the problem of what to return if no such item is there.
The approach that C++’s STL takes is to provide two different ways to find an
element, with two different behaviors. The [] operator is overloaded to take a key and
return a reference to the corresponding value. If the key is not in the map already, then
the [] operator will add the key, with a default-constructed value (that is, V()) to the map
and return a reference to the newly constructed value. Note there is no const overloading,
as the operator modifies the map if the key is not found.
The other way that the STL map provides is to use find, which returns an iterator
(for this, there are const and non-const overloadings). The programmer should then
compare the result to the map’s end, to see if the key was not found before dereferencing
the iterator. A third design approach one could consider would be to throw an exception if
the key is not found.
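The following snippet demonstrates both behaviors (the hasKey helper is our own wrapper around find):

```cpp
#include <map>
#include <string>

// find does not modify the map and signals "not found" by returning
// the end iterator; the caller must check before dereferencing.
bool hasKey(const std::map<std::string, int> & m,
            const std::string & k) {
  return m.find(k) != m.end();
}
```

In contrast, m["bob"] on a missing key inserts "bob" with the default-constructed value int() (that is, 0) and returns a reference to it, which is why operator[] has no const overloading.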
20.7 ADTs and Abstract Classes
We may be tempted to declare our ADTs with purely abstract classes (i.e., classes where
all of the methods are abstract). For example, we might declare an AbstractSet like this:
1 template<typename T>
2 class AbstractSet {
3 public:
4 virtual void add(const T & item) = 0;
5 virtual bool contains(const T & item) const = 0;
6 virtual int numItems() const = 0;
7 virtual void remove(const T & item) = 0;
8 virtual AbstractSet<T> * intersect(const AbstractSet<T> & s) const = 0;
9 virtual AbstractSet<T> * unionSets(const AbstractSet<T> & s) const = 0;
10 };
Then we might declare one or more concrete sub-classes of AbstractSet, which provide
the implementation(s) of this ADT. From a software engineering perspective, this
approach is quite appealing. We can specify the ADT and make use of polymorphism to
use a concrete implementation (i.e. a subclass). Accordingly, we can interchange
implementations easily if we are always using a pointer to the AbstractSet class and
simply changing what type we create when we first initialize that pointer.
One downside to this approach, however, is that we pay the performance cost of
dynamic dispatch (which we will learn about in Chapter 29). Depending on how critical
performance is for our application, this downside may or may not matter much. The other
downside is that if we want our Set ADT to have an iterator, implementing that feature in
this approach is a bit cumbersome. The difficulty here arises from the fact that the iterator
we declare in our ADT is itself abstract, and thus we need to return concrete subclasses of
it from our methods. However, we can only use polymorphism with pointers or
references. To see this difficulty, consider the following hypothetical declaration:
1 template<typename T>
2 class AbstractSet {
3 public:
4 //others omitted
5 class iterator {
6 public:
7 virtual T & operator*() = 0;
8 virtual iterator operator++() = 0;
9 };
10 virtual iterator begin() = 0;
11 virtual iterator end() = 0;
12 };
Here, we run into the problem that AbstractSet::iterator is abstract, but we are
trying to have operator++, begin, and end return it by value. This approach is not legal,
as we cannot have an instance of AbstractSet::iterator and can only use
polymorphism if we have a pointer (AbstractSet::iterator *) or reference
(AbstractSet::iterator &).
We could make this general idea work, but the resulting design and implementation
are a bit nastier—we would only want to do something like this if we had a good reason to.
We are not going to focus too much on this approach, but it does not hurt for you to see
what you would need to do to make something like this work—you might want to
consider the same general technique in other circumstances where you have a stronger
need to separate interface from implementation. To make this work, we would write
something like this:
1 template<typename T> class AbstractSet {
2 protected:
3 class iterator_impl {
4 public:
5 virtual iterator_impl * copy() const = 0;
6 virtual T & dereference() = 0;
7 virtual void increment() = 0;
8 virtual bool equals(const iterator_impl * rhs) const = 0;
9 virtual ~iterator_impl() {}
10 };
11 public:
12 class iterator {
13 iterator_impl * impl;
14 public:
15 iterator(iterator_impl * i): impl(i) { }
16 iterator(const iterator & rhs): impl(rhs.impl->copy()) {}
17 ~iterator() {delete impl;}
18 iterator & operator=(const iterator & rhs) {
19 if (this != &rhs) {
20 iterator_impl * temp = rhs.impl->copy();
21 delete impl;
22 impl = temp;
23 }
24 return *this;
25 }
26 T & operator*() {
27 return impl->dereference();
28 }
29 iterator & operator++() {
30 impl->increment();
31 return *this;
32 }
33 bool operator!=(const iterator & rhs) const {
34 return !impl->equals(rhs.impl);
35 }
36 bool operator==(const iterator & rhs) const {
37 return impl->equals(rhs.impl);
38 }
39 };
40 virtual iterator begin() = 0;
41 virtual iterator end() = 0;
42 //everything else omitted
43 };
That is, we have a concrete iterator class in the interface for the ADT, but it holds
a pointer to an abstract implementation. Each subclass of the ADT would then implement
a concrete subclass of iterator_impl as well as implementing begin and end to return
iterators whose implementation points to the appropriate concrete iterator_impl. We will note that this
approach has some impacts on performance: primarily from allocating/copying/deleting
objects, as well as from dynamic dispatch.
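To see the pattern compile and run, here is a compact standalone variant of the code above, simplified to ints and stripped of the Set methods, plus one hypothetical concrete implementation that walks a std::vector:

```cpp
#include <cstddef>
#include <vector>

// Abstract iterator implementation: concrete subclasses supply the
// actual traversal behavior.
class iterator_impl {
public:
  virtual iterator_impl * copy() const = 0;
  virtual int & dereference() = 0;
  virtual void increment() = 0;
  virtual bool equals(const iterator_impl * rhs) const = 0;
  virtual ~iterator_impl() {}
};

// Concrete iterator the user holds by value; it owns a pointer to an
// abstract impl and forwards every operation through it.
class iterator {
  iterator_impl * impl;
public:
  iterator(iterator_impl * i) : impl(i) {}
  iterator(const iterator & rhs) : impl(rhs.impl->copy()) {}
  ~iterator() { delete impl; }
  iterator & operator=(const iterator & rhs) {
    if (this != &rhs) {
      iterator_impl * temp = rhs.impl->copy();
      delete impl;
      impl = temp;
    }
    return *this;
  }
  int & operator*() { return impl->dereference(); }
  iterator & operator++() { impl->increment(); return *this; }
  bool operator!=(const iterator & rhs) const {
    return !impl->equals(rhs.impl);
  }
};

// One concrete implementation: walks a std::vector<int>.
class vec_iterator_impl : public iterator_impl {
  std::vector<int>::iterator it;
public:
  vec_iterator_impl(std::vector<int>::iterator i) : it(i) {}
  virtual iterator_impl * copy() const {
    return new vec_iterator_impl(it);
  }
  virtual int & dereference() { return *it; }
  virtual void increment() { ++it; }
  virtual bool equals(const iterator_impl * rhs) const {
    const vec_iterator_impl * r =
      dynamic_cast<const vec_iterator_impl *>(rhs);
    return r != NULL && it == r->it;
  }
};
```

A container's begin would then return iterator(new vec_iterator_impl(v.begin())), and likewise for end.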
Whether or not such an approach is beneficial is ultimately a question of the design
complexity and performance trade-offs. If you have significant benefits from the ability
to use different implementations through a pointer to the abstract type, it may be useful.
However, if every last bit of performance matters, you will not want this layer of
abstraction. C++ tends towards the latter in its design decisions, and accordingly, the STL
classes do not have an abstract class to define a pure interface.
20.8 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 20.1 : Consider and . Determine
whether or not and justify your answer.
• Question 20.2 : Write a Set using a std::vector. Your Set should support the
operations in Section 20.5.2.
• Question 20.3 : What is the runtime for each of the operations in your Set (hint:
consult the documentation for std::vector to find out the runtimes of the various
vector operations you used)? For operations that operate on two input Sets,
assume one has n elements and the other has m elements.
• Question 20.4 : Repeat the example of Chapter 13 using C++, std::map, and
std::vector.
• Question 20.5 : Recall our “random story” program from Question 13.6. For this
exercise, you are going to make the program more sophisticated by placing the
word list into categories. Each line of the word list will now have the form
category:word. For example,
animal:dragon
animal:cat
animal:fly
verb:fly
verb:swim
place:cave
place:city
time:day
time:week
time:hour
thing:sword
thing:textbook
Note that a word may appear in multiple categories.
The blanks in our story will now start with an _, but then will be followed by
either (a) the name of a category (which may only be made up of letters) or (b) a
positive integer. In the first case, your program should pick a random word from
the appropriate category (if the category does not exist, print an error). In the
second case, your program should repeat the word that was used that many blanks
ago. That is, if the number is 1, you should repeat the word from the previous
blank. If it is 2, the word from two blanks prior, etc. If the number is not valid
(negative, 0, or larger than the number of blanks so far), your program should
print an error and exit.
Hint: you will want to use std::map, std::set and std::vector.
• Question 20.6 : Write a program that takes two file names as command line
arguments and prints out the list of words that both files have in common. For
simplicity, we will consider consecutive sequences of alphabetic characters (as
determined by isalpha) to be a “word” (so x-ray would be two “words”: x and
ray, even though it is actually one word). Hint: think about what ADT you want to
use.
Chapter 21
Linked Lists
So far, we have used arrays as our primary means of storing
multiple elements. However, we can create other data structures
with different trade-offs. The first new data structure we will
learn about is a linked list—a data structure that is a linear
sequence of nodes connected by pointers.
When we make a linked list, we will need a LinkedList
class, as well as a Node class, which can be a private inner class
of LinkedList (recall from Section 14.1.6 that we can declare
one class inside another), as the Nodes are conceptually internal
to the list, and nothing outside the list should know about them.
We can make our LinkedList generic in terms of what type of
data it holds by using a template. The first part of our class
declaration might look something like this (we would want to
add other methods, constructors, and a destructor):
template<typename T>
class LinkedList {
  class Node {
  public:
    T data;
    Node * next;
  };
  Node * head;
public:
  LinkedList() : head(NULL) {}
};
There are a few interesting things to note about this class
declaration. First, the inner class Node has a field next, which is
a pointer to a Node—the type is recursively defined: a Node has
a pointer to a Node. As with recursive algorithms, recursive data
types are fine as long as they are well-founded. For types in C
and C++, the compiler requires that until a type is fully
declared (i.e., it reaches the close curly brace of the struct or
class declaration), only pointers to that type may be used. This
requirement ensures that all recursively defined types are well-
founded and take finite space (note that if a Node could hold a T
and a Node, then sizeof(Node) would be sizeof(T) +
sizeof(Node), which requires Nodes to be infinitely large). We
make the fields of the Node class public, so that the LinkedList
can access them directly—the Node is private to the
LinkedList, so we do not need to worry about anything else
operating directly on Nodes.
Figure 21.1: Example linked lists.
The LinkedList class itself has a pointer to a Node, called
the head pointer, which will point at the first Node in the list (or
be NULL if the list is empty). If the list contains more than one
element, then the next pointer of the first element points at the
second, whose next pointer points at the third, and so on.
Figure 21.1(a) illustrates a linked list with three elements.
Notice how this structure means we can add more elements (at
any position) by changing where pointers point, rather than by
moving elements around.
There are some common variations on the basic linked list
structure. The variation we saw is a singly linked list, meaning
each node points only at the next element. An alternative is a
doubly linked list, in which each node points not only at the
next element, but also at the previous element. The advantage
of a doubly linked list is that it makes traversing the list in the
reverse direction much easier. A linked list can also have a tail
pointer in addition to its head pointer. The tail pointer always
points at the last element in the list (while the head pointer
always points at the first element of the list). Having a tail
pointer gives O(1) access to the end of the list (without one,
finding the end takes O(n) time, as the entire list must be
traversed from the start). We might also add a field to
remember how many elements are in the list, so we can tell the
size of the list in O(1) time (otherwise, we would need to
count the elements by traversing the list, which takes O(n)
time).
Figure 21.1(b) illustrates a doubly linked list with a tail
pointer and a field containing the number of elements in the
list. The class declaration for this linked list would look like the
following (plus, of course, various other methods not pictured):
template<typename T>
class LinkedList {
  class Node {
  public:
    T data;
    Node * next;
    Node * prev;
  };
  Node * head;
  Node * tail;
  size_t size;
public:
  LinkedList() : head(NULL), tail(NULL), size(0) {}
};
It is also possible to make a circular list, in which there is
no last node, but instead, the nodes form a circle. Circular lists
can be either singly or doubly linked—with backwards links
making it easy to traverse the circle in the opposite direction.
Circular lists have rather different uses (they are great for
scheduling tasks, where you want to cycle through the tasks)
from linear linked lists, and we will not explore them in great
detail here. If you master the pointer manipulations and
concepts of “regular” linked lists, circular linked lists should be
no problem for you.
21.1 Linked List Basic Operations
Now that we understand the basic structure of a linked list, we
will examine some basic operations on a list.
21.1.1 Add to Front
Video 21.1: Adding to the front of a singly
linked list.
If we want to add an element to a linked list, adding to the
front is the easiest place to do so (and we can always do it in
O(1) time). Whether we are using a singly or doubly linked
list, we start by making a new node that holds the data we want
to add and set its next field to the current head of the list. If our
list has a tail pointer, we may need to update the tail pointer as
well—if the newly added element is also the last element (i.e.,
it is now the only element after adding to an empty list), then
the tail pointer should also point at it. If our list maintains an
explicit count of the items in it, we need to increment that
count. Video 21.1 demonstrates addition to the front of a singly
linked list.
If our list is doubly linked and was not empty before we
added to it, then we need to update the second node (which was
the head before we began adding). Its previous field should
point at our newly added node. If the list was previously empty,
there is no other node. We also need to set the newly created
node’s previous field to NULL (although we should make our
Node constructor do this automatically).
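The steps just described can be sketched as code. This is a minimal illustration, not the book's implementation: the members are public and the class is pared down purely so the sketch stays short.

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node * prev;
    Node(const T & d) : data(d), next(NULL), prev(NULL) {}
  };
  Node * head;
  Node * tail;
  size_t size;
  LinkedList() : head(NULL), tail(NULL), size(0) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addFront(const T & d) {
    Node * added = new Node(d);  //constructor sets next and prev to NULL
    added->next = head;          //new node comes before the old first node
    if (head != NULL) {
      head->prev = added;        //old first node points back at the new one
    } else {
      tail = added;              //list was empty: new node is also the last
    }
    head = added;
    size++;                      //we maintain an explicit count
  }
};
```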
Video 21.2: Adding to the front of a doubly
linked list.
Video 21.2 demonstrates adding to the front of a doubly
linked list. Notice how the line head->next->prev names a box
in a somewhat complicated way—this statement is a great
example of a point we have made before: the importance of
thinking about how to name a box. If you were devising the
algorithm (rather than seeing it executed after it is already
written), thinking about how to name the boxes you want to
change would be a critical task in Step 2.
21.1.2 Add to Back
If we want to add to the back of a list, the first thing we need to
do is find the last node in the list. If our list has a tail pointer,
we can just use the tail pointer to find the last node (in O(1)
time). Otherwise, we must iterate through the nodes in the list
(which takes O(n) time), looking for one whose next field
is NULL. Once we have identified the last node in the list, we
need to set its next field to a newly created node holding the
data we want to add. If our list is doubly linked, we need to
update the new node’s previous pointer to point at the node
before it in the list (which is either the tail pointer or whatever
variable we used to find the last node in the list). If we have a
tail pointer, we need to update it to point at our newly created
node. As with adding to the front, we would update the element
count if we maintain one.
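Here is a sketch of these steps for a doubly linked list with a tail pointer. As before, this is a simplified illustration with public members, not the book's class:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node * prev;
    Node(const T & d) : data(d), next(NULL), prev(NULL) {}
  };
  Node * head;
  Node * tail;
  size_t size;
  LinkedList() : head(NULL), tail(NULL), size(0) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addBack(const T & d) {
    Node * added = new Node(d);  //constructor sets next and prev to NULL
    added->prev = tail;          //new node comes after the old last node
    if (tail != NULL) {
      tail->next = added;
    } else {
      head = added;              //special case: the list was empty
    }
    tail = added;                //tail always points at the new last node
    size++;
  }
};
```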
Video 21.3: Adding to the back of a doubly
linked list.
When adding to the back of a linked list, adding to an
empty list is a special case—there is no last node to begin with.
We need to set the head pointer to point at the new node,
instead of setting a node’s next field. We then need to perform
all the other steps (setting the previous field, updating the tail
pointer, and incrementing the element count).
Video 21.3 demonstrates adding to the back of a doubly
linked list.
21.1.3 Searching
If we want to search the list for a particular element—whether
we look to see if it contains an element with a specific value, or
we want the ith element of the list—we typically start at the
head node and iterate through each node, following the next
pointer from one node to the next. Note that unlike an array,
accessing the ith element of a list takes O(n) time (recall
that we can access a particular index of an array in O(1)
time). This difference is an important distinction between lists
and arrays—we trade the ability to easily access a particular
index for the ability to modify the structure without copying
elements around.
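A search following this pattern might look like the sketch below. The class is simplified (public members, a two-argument Node constructor, and an addFront helper) for illustration:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
  LinkedList() : head(NULL) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addFront(const T & d) { head = new Node(d, head); }
  //Walk the list one node at a time, following next pointers: O(n).
  bool contains(const T & d) const {
    for (Node * cur = head; cur != NULL; cur = cur->next) {
      if (cur->data == d) {
        return true;
      }
    }
    return false;
  }
};
```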
21.1.4 Copying
If we want to make a deep copy of a linked list (e.g., in the
copy constructor or the copying assignment operator), we need
to iterate through its nodes, creating new nodes with the same
data and linking them together so that the items appear in the
same order. One way to do this would be to iterate through our
source list in reverse order and add each item to the front of the
destination list (i.e., we could call an addToFront method we
already wrote). If our list supports easy reverse iteration (i.e., is
doubly linked) this option is easy and efficient.
We could also iterate through our source list and add each
item to the back of our destination list. If our list supports
addToBack in O(1) time (i.e., has a tail pointer), then this
option is easy and efficient as well. If our list does not have a
tail pointer, we could perform this efficiently by keeping an
explicit pointer to the end of the list as we perform the copy—
effectively a temporary tail pointer local to the copying method.
Of course, we could also just redesign our list to include a tail
pointer, unless we have a good reason not to maintain one.
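The "temporary tail pointer" idea can be sketched as a copy constructor for a singly linked list with no tail pointer. This is an illustrative simplification (public members, two-argument Node constructor), not the book's code:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
  LinkedList() : head(NULL) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addFront(const T & d) { head = new Node(d, head); }
  //Deep copy using a temporary tail pointer local to the method,
  //so each append to the destination list is O(1).
  LinkedList(const LinkedList & rhs) : head(NULL) {
    Node * last = NULL;  //end of the destination list so far
    for (Node * cur = rhs.head; cur != NULL; cur = cur->next) {
      Node * added = new Node(cur->data, NULL);
      if (last == NULL) {
        head = added;    //first node copied becomes the head
      } else {
        last->next = added;
      }
      last = added;      //remember the new end of the list
    }
  }
};
```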
21.1.5 Destruction
Video 21.4: Destroying a linked list.
As our linked list has dynamically allocated data (the
nodes), we should write a destructor for it. The destructor
should deallocate all of the nodes in the list that is being
destroyed. Destroying a linked list is relatively straightforward:
we iterate over each node and delete it. However, one common
novice mistake is to delete a node and then attempt to use its
next pointer to go to the next node in the list—doing so uses
memory after it has been freed, which, as we already learned, is
erroneous. Instead, we need to use a temporary variable to
remember the next node in the list, delete the current node, then
go to the next node (using the pointer stored in the temporary
variable). Video 21.4 illustrates both this common mistake, as
well as the correct way to destroy a list.
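The correct pattern looks like the sketch below. The static Node::count field is an instrumentation assumption (not part of the book's class) added purely so we can observe that every node gets freed:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    static int count;    //live-node counter, for illustration only
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) { count++; }
    ~Node() { count--; }
  };
  Node * head;
  LinkedList() : head(NULL) {}
  void addFront(const T & d) { head = new Node(d, head); }
  ~LinkedList() {
    Node * current = head;
    while (current != NULL) {
      Node * next = current->next; //remember where to go BEFORE deleting
      delete current;              //current->next is unusable after this
      current = next;
    }
  }
};
template<typename T> int LinkedList<T>::Node::count = 0;
```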
21.2 Insert in Sorted Order
We may wish to insert into a list in sorted order (i.e., we keep
the items of the list in ascending or descending order, and when
we add a new element, we wish to place it in the correct
position for this ordering). We are going to explore this method
in great detail—seeing three different ways that we can do it—
for a couple of reasons. First, it is an excellent example in
linked list manipulation. In fact, these same general approaches
will serve us well if we want to make other modifications to the
list’s structure (e.g., if we wanted to add in the middle of the list
based on some other criteria, or if we wanted to remove from
the middle of the list). Second, it is great to see the same
problem solved with a few different approaches. Third, we can
compare and contrast the approaches.
21.2.1 Pointer to Node Before
One common approach to inserting in a particular position in a
linked list is to search through the list for the node before where
we want to add. That is, we have a Node * current and iterate
through our list looking for the situation where we want to set
current’s next to the node we want to add. We then perform
the pointer manipulations to insert the new node and are done.
Such an approach has a corner case when we want to add
to the front of the list (as we want to change the head pointer,
and not the next field of some node). Sometimes corner cases
are inevitable, but in this case, the need to separately handle
adding as the first element makes this approach rather inelegant
—especially as we duplicate some code. As we will see shortly,
there are two other ways we can solve this problem where we
do not need any special cases.
Video 21.5: Adding in sorted order to a linked
list by searching for the node before where we
want to add.
Video 21.5 illustrates this method.
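A sketch of this approach follows, including the corner case it forces on us. The simplified class (public members, two-argument Node constructor) is an illustration, not the book's code:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
  LinkedList() : head(NULL) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addSorted(const T & d) {
    //corner case: new node goes first, so we change head, not a next field
    if (head == NULL || d < head->data) {
      head = new Node(d, head);
      return;
    }
    //otherwise, find the node AFTER which the new node belongs
    Node * current = head;
    while (current->next != NULL && !(d < current->next->data)) {
      current = current->next;
    }
    current->next = new Node(d, current->next); //note the duplicated creation code
  }
};
```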
21.2.2 Recursion
A different way we could add in sorted order is to use
recursion. As linked lists have a recursive structure (nodes have
pointers to nodes), they lend themselves quite naturally to a
recursive algorithm. Such algorithms typically take a Node *
(e.g., Node * current) as a parameter and have a recursive case
on current->next. For operations that modify the structure of
the list (such as adding or removing), these algorithms typically
return a Node * that points at the resulting sequence of nodes—
the method then updates current based on this (often doing
something like setting current->next to the return result).
In this particular case, we have two base cases: if current
is NULL, or if the data we want to add is less (for ascending
order) than current->data, then we have found the correct
position to add into the list—the newly added item goes before
current (in fact, current should be the next of the newly
added item). If current is NULL, the resulting sequence of nodes
has only the new item. If current is not NULL, the resulting
sequence has the new item first, followed by the sequence of
nodes starting at current.
For the recursive case, we can observe that if we do not
want to add before the current node, we want to add the
requested item (in a sorted fashion) to the sequence of nodes
that come after current—that is, we would like to
addSorted(data, current->next) (where data is the data we
are supposed to add to the list). Even though we are writing
addSorted, we must trust that it works—that is the key to
recursion. It then returns a sequence of nodes that is just like
the sequence starting at current->next, except that data has
been inserted into it in a sorted fashion. We then need to update
current->next to point at this new sequence and return
current (which is now the start of a sequence that is just like
current was when the method was called, except that the
requested data has been added in a sorted fashion).
This approach leads to the elegant recursive algorithm:
template<typename T>
class LinkedList {
  //other parts elided
private:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d) : data(d), next(NULL) {}
    Node(const T & d, Node * n): data(d), next(n) {}
  };
  //private recursive helper method
  Node * addSorted(const T & data, Node * current) {
    //base case: insert before current
    if (current == NULL || data < current->data) {
      return new Node(data, current);
    }
    //recursive case: add to the rest of list
    //then update current’s next
    current->next = addSorted(data, current->next);
    return current;
  }
public:
  //public addSorted method: just takes data to add,
  //has private recursive helper do the work
  void addSorted(const T & data) {
    head = addSorted(data, head);
  }
};
Note that the main portion of the work is done by a private
helper method, which takes the current node as a parameter.
The public method only takes the data that we want to add,
calls our helper, passing in the head, and updates the head to
the returned result.
Video 21.6: Adding in sorted order to a linked list recursively.
Video 21.6 shows the execution of this method.
21.2.3 Pointer to a Pointer
If we return to our iterative method, we can observe that the
ugliness arises from the fact that we had a special case for
adding as the first element (where we had to change head)
versus adding in any other position (where we had to change
the next field of a node). We can have a much more elegant
iterative approach by recognizing the similarity between the
two—they are both Node *s—and finding a way to deal with
them uniformly. In particular, we can have a Node ** that
points at “the box we might want to change”—and then we can
point it at either head or the next field of any node (as they are
both Node *s, so we can point a Node ** at them). This
approach results in an elegant iterative algorithm, but has a
little bit more pointer sophistication (of course, by this point,
pointers are second nature—right?)
Video 21.7: Adding in sorted order to a linked list using a pointer to a pointer.
Video 21.7 illustrates this approach.
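A sketch of the pointer-to-a-pointer version is shown below. As with the earlier sketches, the simplified class (public members, two-argument Node constructor) is illustrative, not the book's code:

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
  LinkedList() : head(NULL) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addSorted(const T & d) {
    Node ** box = &head;             //"the box we might change":
                                     //head, or some node's next field
    while (*box != NULL && !(d < (*box)->data)) {
      box = &(*box)->next;           //move on to the next box, uniformly
    }
    *box = new Node(d, *box);        //no special case for the front of the list
  }
};
```

Notice there is exactly one place where a node is created and linked in, with no separate handling for an empty list or for insertion at the head.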
21.3 Removing from a List
We may also want to be able to remove an item from a list
(either by specifying the position or value we want to remove).
In either case, we remove from the list by modifying the
pointers of the previous node (and, if doubly linked, next node
as well) to take the node we are removing out of the list, then
we delete that node.
21.3.1 Remove from Front or Back
Removing from the head of a linked list can always be
done in O(1) time. After checking that the list is not empty,
we just need to store the current value of the head pointer in a
temporary variable, update the head pointer to head->next,
then delete the original head node (which we stored in a
temporary variable for exactly this purpose). Note that this
procedure is exactly what we were doing (repeatedly) in
Video 21.4 when we examined how to destroy a linked list.
If our list is doubly linked, we need to update the
previous field of the new head node if the list is not empty.
Likewise, if our list has a tail pointer, we need to check to see if
it is now empty, and if so, update the tail pointer to NULL.
Removing from the back of a linked list can be done in O(1)
time if our list is doubly linked and has a tail pointer. In
such a case, we follow a very similar procedure to removing
from the front, except we operate at the tail instead of the head,
and with the previous field instead of the next field.
Continuing the mirror-image behavior, we would update the
next field of the new tail (if non-empty) and set head to NULL if
the list is empty.
If our list is singly linked (or does not have a tail pointer),
we would need to follow one of the algorithms for
removing a general element (which we will see momentarily).
Of course, if we remove from the back of the list frequently, we
would be better off making our list doubly linked and
maintaining a tail pointer (so we can perform the operation in
O(1) time), unless we have a very good reason not to.
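Removing from the front of a doubly linked list with a tail pointer might be sketched as follows (simplified class with public members, for illustration only):

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node * prev;
    Node(const T & d) : data(d), next(NULL), prev(NULL) {}
  };
  Node * head;
  Node * tail;
  size_t size;
  LinkedList() : head(NULL), tail(NULL), size(0) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addFront(const T & d) {
    Node * added = new Node(d);
    added->next = head;
    if (head != NULL) { head->prev = added; }
    else { tail = added; }
    head = added;
    size++;
  }
  void popFront() {
    if (head == NULL) {
      return;               //nothing to remove from an empty list
    }
    Node * old = head;      //remember the old head so we can delete it
    head = head->next;
    if (head != NULL) {
      head->prev = NULL;    //new first node has nothing before it
    } else {
      tail = NULL;          //list is now empty: fix the tail pointer too
    }
    delete old;
    size--;
  }
};
```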
21.3.2 Other Removals: Pointer to Node Before
As with inserting in the middle of the list, we can remove from
the middle of the list by finding the node before the one we
want to remove (that is, if we wanted to remove the element
whose data is 42, we would want to find a node, current such
that current->next->data is 42). If our list is doubly linked,
we could find the node that we want, then go backwards one
node (by following the previous pointer); however, we still
would have a corner case for removing the first node in the list
—we need to update the head pointer rather than the next field
of any node.
Once we have a pointer to the node before the one we
want to remove, we first need to store the pointer to the node
we want to delete (i.e., current->next) in a temporary
variable, so we can delete it later. We then need to update
current->next to be current->next->next, which removes
the desired node from the list. If we have a doubly linked list,
we then need to update current->next->prev (if current->next
is not NULL). If we maintain a tail pointer, we need to
update it if we removed the tail node of the list.
As with insertion, we prefer the other two approaches, as
they are much more elegant.
21.3.3 Other Removals: Recursion
We can use recursion to remove a particular element from a list
following very similar principles to adding in sorted order (or
any other particular position). Here, we have two base cases.
The first base case is if the current node is NULL. When the
current node is NULL, the list does not contain the item we are
trying to delete (it does not contain any items, so it obviously
does not contain the item we are trying to delete), so we can
just return the list unchanged—i.e., return NULL.
The second base case is if the current node meets our
criteria for deletion (e.g., it has the particular data we want to
delete, is in the specified position, etc.). In this case, we can
remove the item from the list by returning everything after it
(i.e., return its next), although we need to first delete the node
we are removing.
In any other case, we recursively remove the node from
the rest of the list and then update the current node’s next field
to the recursive result, before returning the current node:
template<typename T>
class LinkedList {
  //other parts elided
private:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d) : data(d), next(NULL) {}
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  //private recursive helper
  Node * remove(const T & data, Node * current) {
    if (current == NULL) {        //base case: empty list
      return NULL;                //answer is empty list
    }
    if (data == current->data) {  //base case: node to remove
      Node * ans = current->next; //answer will be rest of list
      delete current;             //delete node we are removing
      return ans;                 //return our answer
    }
    //recursive case: remove from rest of list
    //then update current’s next
    current->next = remove(data, current->next);
    return current;
  }
public:
  //public remove method: just takes data to remove,
  //has private recursive helper do the work
  void remove(const T & data) {
    head = remove(data, head);
  }
};
You should take a minute to execute this code by hand on
a few lists to fully understand what is going on. You should also
observe how the basic structure of the algorithm is quite similar
to our recursive insertion algorithm. Finally, take a minute to
figure out how you would modify this algorithm to handle a list
with a tail pointer, as well as how to handle a doubly linked list.
21.3.4 Other Removals: Pointer to a Pointer
We can also use the pointer-to-a-pointer approach to remove
from a list. As with inserting, this gives us an elegant iterative
approach because we no longer have a corner case for
removing the head. We can keep a Node ** that starts at &head
and represents the “box we might want to change,” then iterate
along our list, trying to find the proper place from which to
remove. Once we find it, we perform the appropriate pointer
manipulations (changing the box we are pointing at to be the
next node after the one to remove).
We leave this approach as practice for you. You should
first try it for a singly linked list with no tail pointer (and test
it!), then adapt it to a doubly linked list with a tail pointer (and
test it!). As always, the biggest hint we can give is to work
through the problem by following the Seven Steps we have
taught you all along. As you manipulate the values in boxes,
think carefully about how you reached that box—i.e., how you
would name the box you are manipulating.
21.3.5 Remove All Occurrences
So far, our discussion of removal has implicitly assumed that
we mean to remove one occurrence of a value from the list—
either because we only want to remove one of potentially many
copies or because we are sure that there is only one copy of that
value. Sometimes, we might want to remove all occurrences of
a particular value instead.
Before we examine a good way to do this problem, let us
look at a bad way, which is common among novice programmers,
working from our recursive approach:
//BAD APPROACH: DO NOT DO
template<typename T>
class LinkedList { //other parts elided
  bool removedSomething; //BAD
  Node * remove(const T & data, Node * current) {
    if (current == NULL) {        //base case: empty list
      return NULL;                //answer is empty list
    }
    if (data == current->data) {  //base case: node to remove
      Node * ans = current->next; //answer will be rest of list
      delete current;             //delete node we are removing
      removedSomething = true;    //we removed something
      return ans;                 //return our answer
    }
    //recursive case: remove from rest of list then update current’s next
    current->next = remove(data, current->next);
    return current;
  }
public:
  void remove(const T & data) {
    head = remove(data, head);
  }
  void removeAll(const T & data) {
    do {
      removedSomething = false;
      remove(data); //just keep removing until all gone
    } while(removedSomething);
  }
};
Here, we have decided to implement removeAll by
repeatedly calling remove (which seems like a good thing—we
are reusing code we have already written). However, we have
done a few bad things. The first is that we made a field
(removedSomething) that should not be part of the state of the
object—a linked list does not have a “removedSomething.” We
just hacked this field in since it seemed hard to tell when to stop
removing otherwise. While this may not seem so bad, we want
to avoid ugly hacks—they lead to more and more ugly hacks,
and code we cannot work with.
The second bad thing that we did is make our algorithm
relatively inefficient—in the worst case (every other element
matches our criteria for removal), our removeAll has O(n²)
running time.
We could write a much cleaner approach (which has an O(n)
runtime), but that duplicates a lot of code relative to
the remove method we already wrote:
//Decent approach, but duplicates a lot of code:
template<typename T>
class LinkedList {
  //other parts elided
private:
  //private recursive helper
  Node * removeAll(const T & data, Node * current) {
    if (current == NULL) {        //base case: empty list
      return NULL;                //answer is empty list
    }
    if (data == current->data) {  //now a different recursive call
      Node * ans = removeAll(data, current->next);
      delete current;
      return ans;
    }
    //recursive case: remove from rest of list
    //then update current’s next
    current->next = removeAll(data, current->next);
    return current;
  }
public:
  //public remove method: just takes data to remove,
  //has private recursive helper do the work
  void removeAll(const T & data) {
    head = removeAll(data, head);
  }
};
What we would really like to do is have an efficient
implementation but not have to write two copies of essentially
the same code. If we look at the two, we will see that the only
difference is whether we recursively call our removal method
on current->next in the case where we match the data we
want to remove—this change is what makes the code remove
all occurrences rather than just one (as the recursive call here
will remove all occurrences from the rest of the list before
returning it as the answer from this call).
Whenever we realize that code is quite similar except for a
small difference, we should think about the tools our
programming language provides us to generalize code (e.g.,
functions, templates, polymorphism, etc.). In this particular
case, templates provide a nice solution:
//GOOD APPROACH
template<typename T>
class LinkedList {
  //other parts elided
private:
  //private recursive helper
  //templated over whether to remove all or one
  template<bool removeAll>
  Node * remove(const T & data, Node * current) {
    if (current == NULL) {
      return NULL;
    }
    if (data == current->data) {
      Node * ans;
      if (removeAll) {
        ans = remove<removeAll>(data, current->next);
      }
      else {
        ans = current->next;
      }
      delete current;
      return ans;
    }
    current->next = remove<removeAll>(data, current->next);
    return current;
  }
public:
  //public remove methods: just takes data to remove,
  //has private recursive helper do the work
  void remove(const T & data) {
    head = remove<false>(data, head);
  }
  void removeAll(const T & data) {
    head = remove<true>(data, head);
  }
};
Here, we have templated our private remove helper
method over whether it should remove all elements (true)
or just one (false). Recall that we can template over values as
well as over types. Now, we have one copy of the code to do
the work for removal, with a single if statement that chooses
whether to recurse or not. Our public methods then simply
instantiate the remove template with either false or true as
appropriate and call it.
21.4 Iterators
In Section 17.4.3, we introduced the idea of iterators—objects
that encapsulate a position in a data structure and give us a way
to access and move through the elements without knowing the
internals of that structure. For linked lists, the use of an iterator
becomes even more important from an efficiency standpoint.
To see why, let us start with the skeleton of an algorithm that
operates on each element in a list that does not use iterators:
for (size_t i = 0; i < myList.getSize(); i++) {
  T & current = myList.getElement(i);
  //do stuff with current
}
Take a second to think about the runtime of this
algorithmic skeleton (assume that getSize takes constant time
by storing the size in a field in the list). This code takes
O(n²) time—the call getElement(i) takes O(i) time,
and we call it with values of i from 0 to
n − 1. Summing these gives
us a total running time of O(n²). This may be a bit
disconcerting as we have worked primarily with arrays and
vectors so far, where an algorithm with this structure would
have O(n) runtime (as accessing the ith element of an
array or vector would take O(1) time).
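To make that cost concrete, here is what a hypothetical getElement on a singly linked list would have to do (a sketch with simplified public members, not the book's code): reaching index i requires following i next pointers, so each call is O(i).

```cpp
#include <cassert>
#include <cstddef>

template<typename T>
class LinkedList {
public:
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
  LinkedList() : head(NULL) {}
  ~LinkedList() {
    while (head != NULL) { Node * t = head->next; delete head; head = t; }
  }
  void addFront(const T & d) { head = new Node(d, head); }
  //No random access: reaching index i means following i next pointers.
  T & getElement(size_t i) {
    Node * current = head;
    while (i > 0) {
      current = current->next;
      i--;
    }
    return current->data;
  }
};
```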
However, if our list provides an iterator—which
encapsulates the position in the list—we can return to an O(n)
algorithm:
1 LinkedList<T>::iterator it = myList.begin();
2 while (it != myList.end()) {
3 T & current = *it;
4 //do stuff with current
5 ++it;
6 }
Now, all of the operations inside of our loop are (or at
least, should be) constant time. Doing O(1) work n
times results in an O(n) algorithm, so we are again happy
with the efficiency of this code. However, we should look at
how an iterator for a list works to demystify what is going on
here.
21.4.1 Implementation
We can implement the iterator by writing an inner class inside
of our LinkedList class. As an inner class of LinkedList, the
iterator is a part of the class, and therefore it is fine for it to
know about the implementation details of the list (and to access
private members). That is, we might start with something that
looks like this:
1 template<typename T>
2 class LinkedList {
3 class Node {
4 //contents of Node elided
5 };
6 public:
7 class iterator {
8 //contents of iterator elided
9 };
10 //other things in LinkedList elided
11 };
At this point, we need to figure out what goes inside of the
iterator—i.e., what state it should encapsulate. In the case of a
linked list, the state we need for a position in the list is a pointer
to a node, so our iterator should hold a Node * (which should
be private). We likely also want to write a constructor for
iterator, which takes a Node *. Once we have these, we can
write begin and end in our LinkedList, which return iterators
that encapsulate the start of the list and just past the end of the
list respectively:
1 template<typename T>
2 class LinkedList {
3 class Node { /*contents of Node elided */ };
4 public:
5 class iterator {
6 Node * current;
7 public:
8 iterator() : current(NULL) {}
9 explicit iterator(Node * c) : current(c) {}
10 //other things in iterator
11 };
12 iterator begin() {
13 return iterator(head);
14 }
15 iterator end() {
16 return iterator(NULL);
17 }
18 //other things in LinkedList elided
19 };
Now, we just need to write the overloaded operators inside
of iterator to implement the various functionality it provides.
For example, we might have:
1 //still inside LinkedList (not shown)
2 class iterator {
3 Node * current;
4 public:
5 iterator() : current(NULL) {}
6 explicit iterator(Node * c) : current(c) {}
7 iterator & operator++() {
8 current = current->next;
9 return *this;
10 }
11 iterator operator++(int) {
12 iterator ans(current);
13 current = current->next;
14 return ans;
15 }
16 T & operator*() const {
17 return current->data;
18 }
19 T * operator->() const {
20 return &current->data;
21 }
22 bool operator!=(const iterator & rhs) const {
23 return current != rhs.current;
24 }
25 bool operator==(const iterator & rhs) const {
26 return current == rhs.current;
27 }
28 //possibly other methods
29 };
Each of these operators works with the state of the iterator
—moving it to the next node (for the ++ operators), returning a
reference/pointer to the current node’s data (for the * and ->
operators), and comparing two iterators to see if they refer to
different or the same nodes (for the == and != operators). This
iterator, in fact, gives us the capabilities for a forward iterator,
meaning we can iterate through the structure in the forward
direction only. If we have a doubly linked list, we would likely
want to make our iterator a bidirectional iterator by adding the
-- operators, which would go backwards through the list along
the previous links.
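As a sketch of what such bidirectional support might look like, here are the -- operators, assuming a doubly linked Node with a prev pointer (all names here are illustrative, not from the listing above):

```cpp
#include <cassert>
#include <cstddef>

// Assumed doubly linked node: data plus next and prev pointers.
class Node {
public:
  int data;
  Node * next;
  Node * prev;
  Node(int d) : data(d), next(NULL), prev(NULL) {}
};

class iterator {
  Node * current;
public:
  explicit iterator(Node * c) : current(c) {}
  iterator & operator--() {    // pre-decrement: move back, return ourselves
    current = current->prev;
    return *this;
  }
  iterator operator--(int) {   // post-decrement: return the old position by value
    iterator ans(current);
    current = current->prev;
    return ans;
  }
  int & operator*() const { return current->data; }
};
```

Note that the post-decrement returns the old iterator by value, since returning a reference to a local variable would be an error.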
We would also likely want to write a const_iterator,
which provides read-only access to the list. We would then
have const overloadings of begin and end (so they would be
used if the list is const), which return const_iterators:
1 template<typename T>
2 class LinkedList {
3 class Node {
4 //contents of Node elided
5 };
6 public:
7 class iterator {
8 //as above
9 };
10 class const_iterator {
11 //similar, but const Node *
12 //returns const &s/const *s
13 };
14 iterator begin() { return iterator(head); }
15 iterator end() { return iterator(NULL); }
16 const_iterator begin() const { return const_iterator(head); }
17 const_iterator end() const { return const_iterator(NULL); }
18 //other things in LinkedList elided
19 };
21.5 Uses for ADTs
We can use linked lists to implement the ADTs we discussed in
Chapter 20. The advantage over arrays arises from the fact that
we can add elements without copying the existing elements
around. For stacks and queues, linked lists work quite well. For
maps and sets, we will see (in the next few chapters) that we
can do better with other data structures. However, it is still
important to understand how to use linked lists for maps and
sets—first, they will give us some important fundamentals to
work from, and second, we may want to use linked lists as part
of hash tables (which we will see in Chapter 23).
21.5.1 Stacks
Recall that a stack has LIFO behavior—the last element in is
the first element out. We can implement this behavior with a
linked list by adding to and removing from the same end of the
list (adding to the front and removing from the front is
generally the easiest). We can both add to and remove from the
front in O(1) time, which is the best we can hope for. One
downside of a linked list implementation is that we have a bit
of space overhead (for every element, we store the element, as
well as a next pointer). If we are very concerned about
performance, we would want to consider the performance
impact of allocating/deallocating nodes for every access (new
and delete take a little bit of work).
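A minimal sketch of such a linked-list stack (the names are our own, and a real implementation would also need the Rule of Three):

```cpp
#include <cassert>
#include <cstddef>

// Stack as a singly linked list: push and pop both work at the head,
// so each takes O(1) time.
template<typename T>
class ListStack {
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d, Node * n) : data(d), next(n) {}
  };
  Node * head;
public:
  ListStack() : head(NULL) {}
  ~ListStack() {
    while (head != NULL) {
      Node * next = head->next;
      delete head;
      head = next;
    }
  }
  void push(const T & item) { head = new Node(item, head); } // O(1)
  T pop() {                                                  // O(1)
    assert(head != NULL);    // popping an empty stack is an error
    T ans = head->data;
    Node * next = head->next;
    delete head;
    head = next;
    return ans;
  }
  bool empty() const { return head == NULL; }
};
```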
21.5.2 Queues
A queue has FIFO behavior—the first element in is the first
element out. We can implement this behavior with a linked list
by adding to and removing from opposite ends—if we add to
the back of the list (i.e., at the tail), then we remove from the
front (i.e., the head). We can implement both of these
operations in O(1) time if we maintain a tail pointer.
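A sketch of such a queue (again with illustrative names, and omitting the Rule of Three for brevity):

```cpp
#include <cassert>
#include <cstddef>

// Queue as a singly linked list with a tail pointer: enqueue at the tail
// and dequeue at the head are both O(1).
template<typename T>
class ListQueue {
  class Node {
  public:
    T data;
    Node * next;
    Node(const T & d) : data(d), next(NULL) {}
  };
  Node * head;
  Node * tail;
public:
  ListQueue() : head(NULL), tail(NULL) {}
  ~ListQueue() {
    while (head != NULL) {
      Node * next = head->next;
      delete head;
      head = next;
    }
  }
  void enqueue(const T & item) { // add at the tail: O(1) with tail pointer
    Node * n = new Node(item);
    if (tail == NULL) {
      head = tail = n;           // first element: head and tail both point at it
    } else {
      tail->next = n;
      tail = n;
    }
  }
  T dequeue() {                  // remove at the head: O(1)
    assert(head != NULL);
    T ans = head->data;
    Node * next = head->next;
    delete head;
    head = next;
    if (head == NULL) { tail = NULL; } // list became empty: fix tail
    return ans;
  }
};
```

Note the care required to keep the tail pointer correct when the queue becomes empty or was empty before an enqueue.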
21.5.3 Sets
We can implement a set with a linked list by just keeping each
element in the list. Many of the operations then proceed in a
straightforward fashion: we add an element by adding it to the
list; check if the set contains an element by traversing the
elements in the list, testing if each one matches (and returning
true upon any match and false if none match); and remove
elements by removing from the list.
We need to be a bit careful about what happens if we add
an element twice and then remove it (recall that in a set, the
correct behavior is for the element to no longer be in the set).
We can address this in a couple ways: either check for
duplicates when we add, or remove all occurrences when we
remove. The advantage of the former is that our list stays
smaller if we would add many copies of the same element, but
the disadvantage is that adding becomes an O(n) operation
(as we must search the list)—if we do not search for duplicates,
we can just add to the front in O(1) time.
This tradeoff is an example of a case where we need to be
careful of what n means when we consider O(n)—if we think
about searching our list (to see if it contains an item), then that
is an O(n) operation. However, if we allow our set to
contain duplicated copies of items, n may be much larger
than if our list only contains unique items.
As an example, suppose we wanted to check whether
certain words appeared in the text of this book. We might build
a set of the words by reading the LaTeX source, splitting it into
words, and placing them into a set. There are approximately
300,000 words in this book, but only about 10,000 unique
words, as many words appear multiple times throughout the
book (e.g., the word “the” appears about 15,000 times). If we
let our set have every occurrence of the words, then n will be
300,000. If we enforce uniqueness, then n will only be
10,000—about a 30× difference.
Although, as we will see shortly, if we want a set that is
efficient for large n, we can use either a binary search tree
(Chapter 22; revisited in Chapter 27), or a hash table (Chapter
23).
21.5.4 Maps
We can implement a map with a linked list by storing the
key/value pair in each node. Again, most of the operations
proceed in a straightforward fashion: we add a key/value pair
by inserting it into the list (probably removing the old mapping
first); we look up the value for a key by iterating through the
list (checking each node for a match, and returning the
matching value); we remove a key from the list by finding the
matching node and removing it.
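A minimal sketch of this approach (illustrative names; as described above, add removes any old mapping for the key before inserting the new one):

```cpp
#include <cassert>
#include <cstddef>

// Map as a linked list of key/value pairs. All operations walk the list,
// so they are O(n) in the number of entries.
template<typename K, typename V>
class ListMap {
  class Node {
  public:
    K key;
    V value;
    Node * next;
    Node(const K & k, const V & v, Node * n) : key(k), value(v), next(n) {}
  };
  Node * head;
public:
  ListMap() : head(NULL) {}
  ~ListMap() {
    while (head != NULL) {
      Node * next = head->next;
      delete head;
      head = next;
    }
  }
  void add(const K & key, const V & value) {
    remove(key);                        // drop any old mapping first
    head = new Node(key, value, head);  // then add at the front
  }
  V * lookup(const K & key) {           // NULL if the key is absent
    for (Node * c = head; c != NULL; c = c->next) {
      if (c->key == key) { return &c->value; }
    }
    return NULL;
  }
  void remove(const K & key) {          // remove the matching node, if any
    Node ** p = &head;
    while (*p != NULL) {
      if ((*p)->key == key) {
        Node * dead = *p;
        *p = dead->next;
        delete dead;
        return;
      }
      p = &(*p)->next;
    }
  }
};
```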
As with sets, we can be much more efficient with the data
structures that we are going to learn about shortly; however,
linked lists will build important fundamentals before we
proceed to them.
21.6 STL List
The STL has a built-in std::list class, which implements a
doubly linked list, with iterators. See
http://www.cplusplus.com/reference/list/list/ for
details.
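A short example of using std::list with its const iterators (the sumList function here is our own, just for illustration):

```cpp
#include <cassert>
#include <list>

// Sum the elements of a std::list by iterating over it. std::list is a
// doubly linked list, so its iterators are bidirectional.
int sumList(const std::list<int> & l) {
  int total = 0;
  for (std::list<int>::const_iterator it = l.begin(); it != l.end(); ++it) {
    total += *it;
  }
  return total;
}
```

Note that push_front and push_back are both O(1) on std::list, as it maintains both head and tail pointers internally.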
21.7 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 21.1 : What is the difference between a singly
linked list and a doubly linked list? What is an
advantage of making a list doubly linked?
• Question 21.2 : What is a “tail pointer”? What is an
advantage of having a tail pointer in a linked list?
• Question 21.3 : Draw a singly linked list that contains
9, 7, and 5 (in that order)
• Question 21.4 : Why is this destructor incorrect?
1 ~LinkedList() {
2 while(head != NULL) {
3 delete head;
4 head = head->next;
5 }
6 }
• Question 21.5 : What would happen if we tried to write
this destructor in Node?
1 ~Node() {
2 delete next;
3 }
• Question 21.6 : Write a doubly linked list class
(templated over what type of data it holds) that has the
following:
– A private inner class Node for its nodes, which
should have data, next, and previous. You may
wish to write some constructors for Node as well.
– A head pointer and a tail pointer (both should be
private).
– A default constructor that makes the head and
tail both NULL.
– The Rule of Three methods (destructor, copy
constructor, copying assignment operator).
– A void addFront(const T &item) method,
which adds the item to the front of the list
– A size_t getSize() const function, which
returns the number of items in the list.
– A T& operator[](size_t index), which
returns a reference to the data in the index
element (starting from 0). You should make an
exception class (that extends std::exception)
and throw an instance of it when the requested
item does not exist. You should write a const
and a non-const version of this.
Test your list extensively before proceeding.
• Question 21.7 : Add more functionality to your list
from the previous problem. As you add each method,
test it until you are confident it is correct before moving
on to the next method. Add the following functionality:
– A void addBack(const T &item) method,
which adds the item to the back of the list
– A bool remove(const T &item) method,
which removes the specified item from the list
(assume == is overloaded on Ts, and use it to tell
if you have the item you want). This should
return true if an item was actually removed and
false if no such item existed. You should only
remove the first (starting from the head) if there
are multiple items that match.
– int find(const T &item) const, which
returns the index of the item in the list or -1 if no
such item exists
• Question 21.8 : Add an iterator class to list (from the
previous problems), with the functionality required for a
bidirectional iterator. Give your list begin and end
methods that return the correct iterators
• Question 21.9 : Add a const_iterator class to your
list, which gives you a bidirectional const iterator.
Overload begin and end with
const_iterator begin() const and
const_iterator end() const versions.
• Question 21.10 : Overload the << operator so that you
can print a std::list<T>. Your overloaded operator
should be templated over the type of items in the list
and should take a std::ostream & and a
const std::list<T> &. It should then print (to the
passed-in stream) an open square bracket, the elements
of the list (comma-separated), and then a close square
bracket. For example, printing the empty list should
result in [], and printing the list with 1, 2, and 3 in it
should print [1,2,3].
• Question 21.11 : Write a function that takes a
const std::list<T> & and returns a std::list<T>
that is the reverse of the list passed in (that is, it has the
same elements, but in backwards order). (Hint: you may
find it useful to use your overloaded << operator when
you test this function).
• Question 21.12 : Write a templated function that takes
in const std::list<T> & and returns a count of how
many of the items are “even”. For this function, “even”
means that an item mod 2 is equal to 0. You can assume
that this template will only be applied to types where %
is overloaded on Ts and ints.
• Question 21.13 : Could you reuse a function you wrote
in a previous exercise to make the preceding problem
very easy?
Chapter 22
Binary Search Trees
Linked lists allow us to perform a lot of operations in O(n)
time, which sounds like the best we might hope to do. At a first
glance, we might think that if we want to find something, we
might potentially have to look through all of the items we have to
see if it is there. However, we can actually do much better if we
use more sophisticated data structures.
One way we can access things in sub-linear time (less than
O(n)) is to use binary search. The idea of binary search is to
start with ordered data, then split the problem in half at each step.
Because we are splitting the problem in half each time, we can
find the item we are looking for in O(log n) time. The
difference between O(n) and O(log n) time can be quite
significant—for example, the log₂ of 1 billion is about 30.
To see how binary search works conceptually, imagine you
wanted to find a word (e.g., “game”) in a dictionary. You could
start by opening the dictionary to the middle, and seeing what
word is on that page (e.g., “macaroni”). By comparing the two
words, you can tell which half of the dictionary your target word
is in. In this case, since “game” comes before “macaroni,” you
have narrowed your search down to the first half of the
dictionary. Now, you repeat the process with the first half of the
dictionary—taking the word on the middle page of that half (e.g.,
“frog”), comparing and reducing the problem by half again. In
this case, “frog” comes before “game,” so we know that “game”
is in the second quarter (the second half of the first half) of the
dictionary. We then repeat the process again, narrowing down the
range to search to one particular eighth, sixteenth, thirty-second,
etc. of the dictionary. If we can choose exactly the middle page
each time, we are guaranteed to find the right page in a number of
steps equal to the log₂ of the number of pages. Even if our
dictionary had one million pages, we would finish in 20 steps!
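The page-halving process above is exactly binary search on an array; a sketch might look like this (the function name is our own):

```cpp
#include <cassert>
#include <cstddef>

// Binary search on a sorted array: halve the search range [lo, hi) each
// step, so the loop runs O(log n) times. Returns the index of the target,
// or -1 if it is not present.
int binarySearch(const int * arr, size_t n, int target) {
  size_t lo = 0;
  size_t hi = n;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (arr[mid] == target) {
      return (int)mid;
    } else if (arr[mid] < target) {
      lo = mid + 1;   // target must be in the right half
    } else {
      hi = mid;       // target must be in the left half
    }
  }
  return -1;          // range is empty: not found
}
```

Note that this relies entirely on the array being sorted, just as finding "game" relied on the dictionary's alphabetical order.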
Observe that this technique works exactly because our input
data is in sorted order. When we compare “game” to “macaroni,”
we know “game” comes on an earlier page than “macaroni”
because “game” < “macaroni.” If the data were not sorted (e.g.,
if we were just trying to find an occurrence of a word in a novel),
we could not use binary search.
We can use binary search on a sorted array (and in fact, you
will do that as a practice exercise). However, inserting into a
sorted array is an O(n) operation. To see this fact, consider the
case where we want to insert an element that needs to go
somewhere near the start of the array. To insert that element into
the proper location, we must first move all of the later elements in
the array down one spot—copying arr[i] to arr[i+1] for each i
that comes after the place we want to insert. As this operation
requires copying potentially every element in the array, it is an
O(n) operation.
22.1 Binary Search Trees Concepts
We can combine the concepts of a linked dynamic data structure
with the idea of binary search to come up with the idea of a
binary search tree (BST for short). A binary search tree is a data
structure where the nodes have two pointers to other nodes
(typically called left and right), as well as whatever data they
need to hold. The tree maintains the invariant that, for any node,
all data to the left of that node (i.e., transitively reachable via its
left pointer) is less than the data in the current node, and all data
to the right is greater than (or equal, if duplicates are allowable)
to that node. The tree itself holds one pointer to a node (typically
called root).
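As a sketch of this structure in code, here is a node with the left and right pointers described above, plus a recursive check of the ordering invariant (the helper name is our own, and this sketch disallows duplicates for simplicity):

```cpp
#include <cassert>
#include <cstddef>
#include <climits>

// A binary search tree node: data plus left and right child pointers.
class Node {
public:
  int data;
  Node * left;
  Node * right;
  Node(int d) : data(d), left(NULL), right(NULL) {}
};

// Check the BST invariant: every node's data must lie strictly within
// (lo, hi), and the bounds narrow as we descend. Note that checking only
// parent/child pairs would not be enough; the bounds enforce the rule
// for the whole subtree.
bool obeysOrdering(const Node * n, long lo, long hi) {
  if (n == NULL) { return true; }
  if (n->data <= lo || n->data >= hi) { return false; }
  return obeysOrdering(n->left, lo, n->data) &&
         obeysOrdering(n->right, n->data, hi);
}
```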
Figure 22.1: Two binary trees. The one on the left obeys binary search tree ordering.
The one on the right does not.
We generally draw trees conceptually without explicitly
drawing the boxes for the left and right pointers; however, we
understand they would be there if we were to actually examine
the in-memory structures of a tree. Figure 22.1(a) illustrates a
conceptual drawing of a valid binary search tree. Observe that for
each node in the tree, every node to its left is smaller, and every
node to its right is greater. Note that computer scientists draw
trees “upside down” from what you would expect in nature. The
root node is at the top of the tree, and it “grows” down—the
opposite of a real tree. This seemingly illogical choice most likely
stems from the fact that people tend to prefer to draw things from
the top down.
By contrast, Figure 22.1(b) shows a binary tree that does not
respect the ordering rules for a binary search tree. Here, the
problem is that 56 is to the left of 55, even though 56 is larger
than 55. Notice that the ordering requirements apply to the whole
tree—not just to the nodes immediately to the left or right of
any particular node. If we only look at pairs of nodes along the
red path, everything seems fine: 20, which is left of 55, is less
than 55; 43 is greater than/right of 20; and 56 is greater than/right
of 43. However, the rules for a binary search tree require that
every node to the left of 55 be less than it, and 56 is not. This
strong rule is required so that we can search the tree efficiently. If
we were looking for 56 in this tree, we would compare 56 to 55,
determine that 56 is greater, and look only at the nodes to the
right of 55 (which only include 74 and 99).
22.1.1 Terminology
Before we delve too much deeper into binary search trees, it will
be useful for us to cover some specific terminology, so that we
can discuss trees precisely. We will first present several terms,
then see several examples. As trees are a specific type of graph,
we will have to start by defining some terminology about graphs
—which will then let us give a precise definition of trees, binary
trees, and binary search trees.
Node As with a linked list, a node in a graph
(including a tree) is the structure that holds one item
in the tree. That item may comprise multiple values
(e.g., key and value) depending on how we want to
use the graph. In our examples, we draw the nodes as
circles with one value (which represents the key on
which the tree is ordered) inside them.
Edge An edge represents a connection between two
nodes. As we will see in Chapter 25, edges in a more
general graph can be directed (meaning they go one
way), undirected (meaning they go either way), and
may have other attributes. For now, however, we will
only consider directed edges—meaning that they
point from one node to another. In our conceptual
drawings, edges will be drawn as arrows. In a binary
tree, the edges will be implemented as the left and
right pointers in the node that they originate from
(pointing at the nodes they point to). We will note
that sometimes people explicitly draw NULL pointers
(as in Figure 22.1), and other times people omit them
(as in Figure 22.3). We will generally omit them
from conceptual drawings, and include them when
that level of detail is required.
Graph A graph is a collection of nodes and edges.
(Directed) path A path is a sequence of nodes that “follows
the edges” of the graph but does not repeat an edge.
That is, there is an edge from the first node to the
second, an edge from the second to the third, and so
on. We may explicitly say directed path to
distinguish it from an undirected path; however,
when just “path” is used, people generally mean a
directed path.
Undirected path An undirected path is a path in which we
ignore the direction of the edges (i.e., ignore the
direction of the arrows)—that is, our sequence of
nodes can have a “from” node following the node it
has an edge “to.” As with a directed path, no edge
may be repeated in an undirected path.
Cycle A cycle is a path that starts and ends at the same
node—logically a “loop” in the graph. Trees do not
have any cycles.
Undirected cycle An undirected cycle is an undirected path
that starts and ends at the same node—logically a
“loop” where we might ignore the direction of the
arrows. Trees also do not have any undirected cycles.
Connected A graph is connected when there exists at least
one undirected path between any pair of nodes.
Tree A tree is a data structure that comprises nodes
and edges and is a connected graph with no
undirected cycles. Note that by virtue of this
definition, there will be exactly one undirected path
between any pair of nodes.
Rooted tree A rooted tree is a tree in which one particular
node is the root node. The root node is special in that
there exists a directed path from it to every other
node in the tree.
Binary tree
A binary tree is a rooted tree where each node
has at most two outgoing edges (other nodes that it
points at).
Binary search tree A binary search tree is a binary tree
that obeys the binary search tree ordering rules:
everything to the left of a given node must be smaller
than that node, and everything to the right must be
greater than (or equal to) that node.
Parent (of a node) In a binary tree, the parent of a node is
the node that points at it.
Children (of a node) The children of a node are the nodes
it directly points to.
Ancestors (of a node) The ancestors of a node are the set
of nodes from which there exists a directed path from
the ancestor to that node.
Descendants (of a node) The descendants of a node are the
set of nodes from which there exists a directed path
from the node to its descendant.
Depth (of a node) The depth of a node is the length of the
path from the root node to that node. Different
people use different conventions as to whether or not
the root is depth 0 or depth 1. We will use the
convention that the root is at depth 0.
Leaf nodes A leaf node is a node with no children.
Sub-tree A sub-tree is a tree formed by taking the
descendants of a particular node in a tree, along with
the edges that connect them, and forming a tree from
those.
Height (of a node) The height of a node is the maximum
length path from it to a leaf node. Different people
use different conventions of exactly how to count the
height (e.g., whether a leaf node has height 0 or
height 1). We will use the convention that a leaf node
has height 1, and the height of NULL is 0.
Height (of a tree) The height of a tree is the height of its
root.
Full A binary tree is full if every node either has zero
children (i.e., is a leaf node) or two children.
Balanced A binary tree is balanced when, for every node
in the tree, the height of its children differ by at most
1.
Complete A binary tree is complete when every level,
except possibly the last, has as many nodes as it
possibly can have, and the last level is filled in from
left to right.
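To make the height convention above concrete, here is a sketch of computing a node's height (NULL has height 0, a leaf has height 1; the names are our own):

```cpp
#include <cassert>
#include <cstddef>
#include <algorithm>

// A binary tree node with left and right child pointers.
class Node {
public:
  int data;
  Node * left;
  Node * right;
  Node(int d) : data(d), left(NULL), right(NULL) {}
};

// Height under this book's convention: height(NULL) = 0, so a leaf
// (with two NULL children) has height 1.
int height(const Node * n) {
  if (n == NULL) { return 0; }
  return 1 + std::max(height(n->left), height(n->right));
}
```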
Figure 22.2: Three graphs that are not trees.
Figure 22.2 illustrates three graphs that are not trees. The
leftmost graph, in Figure 22.2(a) is not a tree because it has a
directed cycle. Specifically, there is a path (we follow the edges,
and do not repeat any edges) from the node with 2, to the node
with 6, to the node with 8, to the node with 2. As this path starts
and ends at the same node, it is a cycle—therefore this graph is
not a tree. We will note that saying “the node with 6” is
cumbersome, so we will frequently just refer to the node by the
value it holds (e.g., the path from 2 to 6 to 8 to 2).
The middle graph, Figure 22.2(b) is not a tree because it has
an undirected cycle. There are no directed cycles, so this
particular graph is a directed acyclic graph (DAG). We will see a
bit more about DAGs in Chapter 25.
The right graph, Figure 22.2(c) is not a tree because it is
disconnected—there is no path between any of the left four
nodes (86, 42, 91, and 54) and any of the right two nodes (19 and
27). As this graph is composed of multiple trees, it is called a
forest.
Figure 22.3 shows three trees that exhibit different
properties (although none is a binary search tree). The first,
shown on the left in Figure 22.3(a) shows a tree in which each
node has at most two children. However, this tree is not a rooted
tree, so we do not consider it a binary tree. Observe that even
though this graph is connected, there is no root—we cannot pick
one single node that is the ancestor of all other nodes in the tree.
We can see this fact most clearly by examining 7 and 12. There is
no path from 7 to 12, nor is there a path from 12 to 7.
Figure 22.3(b) shows an example of a rooted tree (1 is the
root) that is not a binary tree. The root node has four children
(specifically, 7, 43, 99, and 74), and 7 has three children. We
might also observe that the height of this tree is 3, and that it has
many leaf nodes (5, 12, 32, 49, 99, 33, and 12).
Figure 22.3: Three trees with different properties.
Figure 22.3(c) shows another example of a tree. This tree is
a binary tree (all nodes have at most two children, and 7 is the
root); however, this tree is not a binary search tree. We can see
many examples of pairs of nodes that violate the binary search
tree ordering rules: e.g., 7’s left child (12) is greater than it, and
74’s right child (44) is less than it. This binary tree has height 5—
the path 7 → 12 → 64 → 49 → 85 has five nodes on it and is the
longest path from the root to a leaf in the tree.
Figure 22.4 shows three binary search trees—all of the trees
are binary trees (rooted trees where each node has at most two
children) that obey the binary search tree ordering rules. We
explicitly note the height of each node next to it to help us discuss
whether or not a tree is balanced, and we explicitly show NULL
nodes (whose heights are 0), as they affect whether the tree is
balanced. Each of these trees shows examples of different
properties.
The tree in Figure 22.4(a)(top left) is balanced. If you
examine each node, you will see that its two children’s height
differ by at most 1. For example, the root node’s two children
have height 3 (for the left child, which is 20), and height 2 (for
the right child, which is 74). This tree is not full (5 and 43 each
have one child, but a full tree requires all nodes to have either
zero or two children). The tree is also not complete—even though
the first three levels are completely filled up, the nodes on the last
level are not all on the far left. In particular, 5 does not have a left
child. If we were to add a node to the left of 5 (e.g., 1) and not
make any other change to the tree, then it would be complete.
Figure 22.4: Three binary search trees with different properties. The height of each
node is indicated in dark blue next to the node.
The tree in Figure 22.4(b)(top right) is not balanced. In
particular, the children of the root differ in height by 2. The root’s
left child has height 3, and its right child has height 1. This tree
is, however, full—each node has two children, except for the leaf
nodes (1, 11, 21, 35, and 93) which have zero children. This tree
is not complete, as the level with 4 and 25 is not filled up—93
would need both a left and a right child for this tree to be
complete. However, the sub-tree rooted at 19 (i.e. the sub-tree
formed by 19 and all of its descendants) is full, balanced, and
complete.
The tree in Figure 22.4(c) (bottom) is complete and
balanced, but it is not full. This tree is complete because all levels
except the bottom are filled completely, and the bottom has all of
its nodes on the left side. This tree is balanced, as all nodes have
children whose heights differ by at most 1. However, this tree is
not full, as 60 has one child.
22.1.2 Uses
Before we delve into the details of how to implement binary
search trees, it is useful to spend some time discussing what they
are useful for. One use of binary search trees is to implement map
and set ADTs with O(log n) access time for addition, look up
(either checking if a set contains an item, or finding the value for
a given key in a map), and removal. We will note that the
“vanilla” binary search trees we will learn about in this chapter
will typically have access time; however, their
worst case behavior is . We can ensure
access time in all cases by ensuring that the tree remains
balanced. We will see how we can accomplish this goal in
Chapter 27.
A map or set implemented with a binary search tree requires
that the keys be a totally ordered type—that is, a type where we
can compare any two elements (a and b) and conclude that either
a < b, a = b, or a > b. This restriction means we would not be
able to use a binary tree for keys whose type cannot be compared
to form an ordering. However, the fact that this restriction is the
only restriction on the type of keys we use means that we may be
able to use a binary search tree when we cannot use other types.
For example, in Chapter 23, we will learn about hash tables,
which will let us implement maps and sets with an expected
O(1) access time; however, there we will have a different
restriction upon the keys.
For either a map or a set, a binary search tree’s nodes will
each hold the data for one entry in the map/set—for a map, this
would be both the key and the value, and for a set, this would just
be the item. We are going to show examples using trees with ints
and primarily demonstrate the operations as they would be
implemented for a set. Using a binary search tree for a map
results in much the same basic algorithms. Additionally, making a
generic binary search tree (using templates) is a straightforward
generalization of the algorithms we present here.
Binary search trees are also useful for operations that are not
part of a typical map or set. For example, if we wanted to
efficiently find all keys within a given range (e.g., between 5000
and 30,000) we can do so quite efficiently. Another operation we
can perform efficiently with a binary search tree is to find the
smallest key greater than or equal to a particular value. Such an
operation might be useful in resource management, where we
want to find a resource that satisfies a request (it is greater than or
equal to the requested value) but wastes as little as possible (it is
the smallest such item).
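A sketch of that last operation on a BST (the function name is our own): walk down from the root, remembering the best candidate seen so far, which takes time proportional to the height of the tree:

```cpp
#include <cassert>
#include <cstddef>

// A binary search tree node, as in the rest of this chapter.
class Node {
public:
  int data;
  Node * left;
  Node * right;
  Node(int d) : data(d), left(NULL), right(NULL) {}
};

// Find the smallest key >= x. Whenever we see a key that is large enough,
// we record it as a candidate and look left for something smaller; when a
// key is too small, we look right.
const Node * smallestAtLeast(const Node * root, int x) {
  const Node * best = NULL;
  const Node * current = root;
  while (current != NULL) {
    if (current->data >= x) {
      best = current;            // candidate; a smaller one may be to the left
      current = current->left;
    } else {
      current = current->right;  // too small; anything >= x is to the right
    }
  }
  return best;                   // NULL if no key is >= x
}
```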
Figure 22.5: An example of a use of non-binary search trees: the abstract syntax
tree for
if (x < 3) { z = a + b * 2; } else { y = 0; }.
Even though the focus of this chapter is on binary search
trees, we will also briefly note that trees other than binary search
trees see significant uses in programming. One useful example of
trees (that are not binary search trees) is abstract syntax trees.
When a program parses input—analyzes it for its grammatical
structure—it is often useful to build a tree that represents the
meaning derived from that parse. For example, Figure 22.5 shows
the abstract syntax tree that a C compiler might produce from
parsing
1 if (x < 3) {
2 z = a + b * 2;
3 }
4 else {
5 y = 0;
6 }
The root of the abstract syntax tree is the top-level syntactic
construct that we have parsed (in this case, an if), and each
node’s children are the expressions or statements contained inside
of it. For example, the if node has three children. The leftmost
child represents the conditional expression, the middle child
represents the then-clause, and the rightmost child represents the
else-clause. Looking at the middle node, which is an assignment
statement, its left child represents the left operand (z), and its
right child represents the right operand (a + b * 2).
A tree structure is well suited to representing the syntactic
structure of a program (or many other forms of input—including
html and xml), because a tree is a recursively defined data
structure, and many forms of input have recursively defined
structure. In a program, expressions contain sub-expressions,
which can be arbitrarily complex—representing them with a tree
works naturally, as the tree can be however deep is required to
represent the complexity of the sub-expressions. We will not
delve into the details of parsing or abstract syntax trees here, as
our focus in this chapter is primarily on binary search trees.
However, it is useful for you to know that trees come in a wide
variety of forms and have many uses.
22.2 Adding to a Binary Search Tree
For binary trees to be useful (i.e., more efficient than other data
structures), we need to be able to add to them in O(log n)
time, while still preserving the binary search tree ordering. As
always, we should begin by working an example of the problem
ourselves: taking a binary search tree and trying to add a piece of
data to it. However, to work this example, we need some domain
knowledge. In this particular case, the domain is data structures,
and the particular knowledge we need is where we should add a
node to a binary search tree.
Figure 22.6: Working an example ourselves of how to add to a binary search tree.
To gain this domain knowledge, we will consider the
example of adding 15 to the binary search tree shown in
Figure 22.6(a). Our first inclination might be to try to add the
new node as the root (after all, adding to the front was the
simplest way to add to a linked list). However, this approach will
not work. If we were to attempt this approach for the current
example, we would need to put the old root (55) to the right of
the new root, as 55 is greater than 15. Such an attempt would
result in the tree shown in Figure 22.6(b), which does not obey
the binary search tree rules. Even though 55 is on the correct side
of 15, there are other nodes in the wrong place. Specifically, 5
and 12 are to the right of 15, but should be to its left.
We might imagine “fixing” this approach by examining the
tree, finding nodes that are newly out of place, and putting them
in the correct place (e.g., moving 5 and 12 to the left of 7).
However, such an approach would require us to examine and
move many nodes in the tree and would require us to have a way
to add those nodes into the correct place—the exact problem we
are trying to solve. While that may sound like recursion, it is not
a very good use of it, as we are not solving a smaller instance of
the same problem, and we will rearrange the tree much more than
required.
Figure 22.7: Considerations of other potential places to add 15.
Instead of trying to hack together a solution where we add to
the root, let us go back and think about the problem a little bit
more completely. Figure 22.7(a) shows our example tree,
highlighting the path on which we would search for 15. This path
also represents the places we could consider adding 15—
anywhere else, and it would not be found if we then looked for it.
However, not all of these places are viable. For each place along
this path, we have put either a green circle (with a letter in it, so
we can talk about each place) if the tree would continue to follow
the binary search tree rules after adding 15 there, or a red circle if
it would not.
Location A would be a matter of adding 15 as the root,
which we just saw would not work. Trying to add 15 at location
B would also not work, as 20 would end up to the left of 15. If we
were to try to put 20 as the right child of 15, then we would end
up with 5 to the right of 15, also violating the rules. Notice that
this problem is very similar to what went wrong when we thought
about trying to add 15 as the root of the tree. Adding at any of C,
D, or E could work for this tree; however, only one of them will
work in every case!
To see why these do not all work in every case, consider the
slightly different tree shown in Figure 22.7(b). In this tree,
everything is the same except that 5’s right child is 19, not 12.
Now, location C would not respect the rules, as adding 15 at this
location would result in 19 being to the left of 15, which is not
allowed. Notice that when we are examining the data in the nodes
at 20 and/or 5, we would have no idea about the contents of 5’s
descendants, unless we had our algorithm explicitly examine
them all.
Figure 22.8: Our example binary search tree with 15 added.
Location E therefore has a wonderful benefit over every
other location (in either tree, but we will talk about the original
tree in Figure 22.7(a)). If we add 15 as the right child of 12, then
15 is a leaf node. As our newly added node has no descendants,
there is no possibility that it has descendants on the wrong side.
Likewise, we know that it is in the right position relative to its
ancestors, as we found the location to add it by following the path
formed by going in the directions indicated by the binary search
tree ordering rules at every step. The resulting tree (with 15
correctly added) is shown in Figure 22.8.
Now that we understand how (conceptually) to add to a
binary search tree, we are ready to work out the details of an
algorithm. We can take three approaches to this problem,
similarly to the three approaches we could take adding to a linked
list.
22.2.1 Recursion
As binary search trees are recursively defined structures—the left
and right sub-trees of any node are themselves binary search trees
—recursion is a natural approach to manipulating them. The base
case is adding to an empty tree (NULL), which is trivial—the
resulting tree is a single node with the key (and any other data)
that we want. In the recursive case, we compare the current
node's key to the key we want to add, then recursively add to the
appropriate sub-tree. The recursion returns the updated sub-tree,
and we set the left or right pointer of the current node
(whichever direction we recursed) to the recursive return result.
This update is only strictly needed for the node whose pointer
changes from NULL to the new node; however, our code is much
simpler (and still correct) if we do it in all cases. Video 22.1
walks through devising a recursive algorithm to add to a binary
search tree.
Video 22.1: Devising a recursive algorithm to add
to a binary search tree.
22.2.2 Find the Parent of the Node to Add
We could instead add to a binary search tree by keeping a pointer
to a node and iteratively seeking out the node whose left or
right pointer we need to update. As with linked lists, such an
approach must stop at the node that will be the parent of the
newly added node, and adding to the empty tree is a special case.
This approach requires the most code and is the least elegant;
however, it is still useful to understand how it works. You may
see it written by others, or you might end up in a situation where
you need to add to a tree, but there is something that prevents you
from using the other approaches.
22.2.3 Pointer to a Pointer to a Node
We could also think about this algorithm in terms of a “pointer to
the box to change”—that is, a pointer to a pointer to a node. This
approach looks very similar to the pointer to a pointer to a node
approach for adding to a linked list; the main difference is that we
have to determine whether to update the pointer with the address
of the left field or the address of the right field of the current
node. Video 22.2 shows this approach.
Video 22.2: Devising an algorithm to add to a
binary search tree by keeping a pointer to the box
that we might want to change.
Observe that the running time of any of these algorithms is
linear in the length of the path from the root to the insertion point
—i.e., if d is the length of that path, then the algorithm is
O(d). If our tree is balanced (or close to balanced), then
d ≈ log(n), so the insertion algorithm is O(log n).
However, if the tree is very imbalanced, we may degenerate to a
case where d ≈ n, and we end up with an insertion
following a path that is linear in n, at which point our algorithm
has running time linear in the number of nodes in the tree.
Chapter 27 will show you how to ensure the tree is balanced (or
at least balanced enough) to guarantee O(log n) insertions
(as well as searches and deletions).
22.3 Searching a Binary Tree
Once we have data in a binary search tree, we might want to
search the tree (try to find a particular piece of data). For a set
ADT, searching the tree would be the key to implementing the
contains operation. For a map ADT, we would search for a
particular key, and return the value in the node where we found it
(if any).
Searching a binary tree lends itself quite naturally to a
recursive algorithm, as do many other algorithms related to trees.
However, as the algorithm is tail recursive, it can easily be
written with iteration instead (recall we discussed tail recursion in
Section 7.3). Video 22.3 walks through devising and
implementing the algorithm to search a binary search tree using
iteration.
Video 22.3: Devising an algorithm to search a
binary search tree using iteration.
Observe that at each step of the algorithm, we narrow down
where to look to either the left or the right sub-tree. If the tree is
balanced (or close to balanced), then selecting one sub-tree (and
not examining anything in the other) means that we are cutting
our problem size roughly in half each time, and therefore we have
O(log n) search time. As with addition to the tree, we must
take care to ensure this O(log n) behavior, which we will
discuss later.
22.4 Removing From a Binary Search Tree
Removing from a binary search tree is a matter of first finding the
node you want to remove (via a similar technique to finding it for
any other purpose—however, if we approach the problem
iteratively, we likely want to use a pointer to a pointer as we
search the tree, so that we end up with a pointer to the box we
will want to change). After we find the node we want to remove,
we need to manipulate the tree pointers to actually remove that
node. For the cases where the node to remove has either zero
children or one child (i.e., at least one of its pointers is NULL) this
removal is a process almost exactly like linked list removal. We
update the parent’s left or right pointer (whichever is pointing
at the node we want to remove) to point at the to-be-removed
node’s single child (or set it to NULL if that node has no children),
delete the node we removed, and then we are done. In the case of
deleting the root node, we would need to update the root pointer
itself, instead of a parent’s left or right pointer (as the root
node has no parent).
Figure 22.9: An example binary tree to use for discussing deletion.
For example, if we were to remove 70 (which has no
children) from the binary search tree shown in Figure 22.9, we
could set 84’s left pointer to NULL and delete the node that held
70. Likewise, if we wanted to remove 93 (which has one child)
from the tree, we could set 60’s right pointer to point at 84 (93’s
only non-NULL child) and delete the node that held 93. In both
cases the resulting tree would be correct in that (a) we only
removed the node we intended to remove and (b) binary search
tree ordering is still preserved. We recommend you draw these
trees out yourself and attempt other removal operations of nodes
that have zero or one child before proceeding.
Things get a bit more complex when we want to remove a
node that has two children (both its left and its right pointers
are not NULL). For example, consider the case where we want to
remove 19 from the tree pictured in Figure 22.9. If we simply
point 60’s left pointer at 4, then we have “lost” the sub-tree
rooted at 25—inadvertently removing them from the tree (and
leaking their memory). Similarly if we point 60’s left pointer at
25, then we lose the sub-tree rooted at 4. We could imagine
choosing one of these solutions (e.g., point 60’s left at 4), and
reattaching the other sub-tree (e.g., the one rooted at 25) to the
bottom of the sub-tree that remains (e.g., setting 11’s right
pointer to point at 25). This approach could work but is not as
good as the standard approach. Not only will this approach result
in significant imbalance in the tree upon removals, it will result in
imbalance that is harder to repair when we learn how to keep a
tree balanced in Chapter 27.
Instead, the standard approach to removing from a tree in the
case where the node to be removed has two children is to find the
most similar node in the tree that has zero or one child, put its
data into the node we want to remove, then remove that node
from the tree. While this approach may sound complex, finding
the “most similar” node (that is, the immediately smaller or
immediately greater in the ordering of the tree) is quite easy. We
can either pick the largest item smaller than the current node by
going left once, then all the way right; or we can pick the smallest
item larger than the current node by going right once, then all the
way left. In our example of removing 19, we would either select
11 (which is the largest value in the tree that is smaller than 19)
by going left once (to 4), then as far right as we can—that is,
following right pointers as long as they are not NULL and
stopping when we reach a node whose right pointer is NULL. We
could instead choose 21 by going right once, then all the way left
to get the smallest item larger than 19.
Figure 22.10: Two options for removing 19 from the tree.
Figure 22.10 shows these two options for removing 19 from
the tree we showed in Figure 22.9.
We will leave the implementation of this algorithm up to
you. However, we will briefly discuss one aspect of the
implementation, as it underscores a broader point. A novice
programmer may think that this algorithm has four cases when
the target node is found: it has zero children, it has only a left
child, it has only a right child, or it has two children. However, a
more experienced programmer will see that there are, in fact, only
three cases: the left child is NULL, the right child is NULL, or
neither child is NULL. That is, assuming we are implementing this
using a recursive approach (where each recursive call returns the
node that its parent should set its appropriate child to), we might
write:
1 if (node->left == NULL) {
2 temp = node->right;
3 delete node;
4 return temp;
5 }
6 else if (node->right == NULL) {
7 ...
8 }
9 else {
10 ...
11 }
Here, the first case covers both the situation in which there
are zero children and the one in which there is only a right child.
This works just fine
because node->right will be NULL, which is exactly what we
need to return in that case.
We discuss this point explicitly not only as it makes the
implementation of this particular algorithm cleaner and simpler,
but because it is part of a larger skill. Programmers should
recognize similarity wherever possible and use that recognition to
reduce the number of cases in the code. Having more cases makes
your code more error prone, harder to test, and harder to
maintain.
22.5 Tree Traversals
When we traverse (e.g., to print out, sum up, deallocate memory
for, etc.) all items in a linear data structure (such as an array or a
linked list), we generally do it from the start to the end (or
sometimes in the reverse order if we have a good reason).
However, trees are not linear—there is no single successor to a
given node, but rather two children. Accordingly, we can traverse
trees in a variety of ways. These different ways result in different
orders in which we visit the nodes of the tree. For some purposes,
we may not care about the order (e.g., if for some reason we want
to sum the data in a tree of integers), but for other purposes the
ordering may be very important.
22.5.1 Inorder Traversal
One of the most natural traversal orders for a binary search tree is
an inorder traversal. An inorder traversal prints the data of the
tree in ascending order (smallest first to largest last). At first
glance, printing the elements of the tree in order may seem tricky.
However, the ordering properties of a binary search tree actually
make it quite simple. For any given node, all items smaller than it
are to its left, and all items larger than it are to its right.
Accordingly, if we want to print all items in the sub-tree
rooted at node in ascending order, we need to (1) print all
items in the left sub-tree in ascending order, (2) print N’s data (3)
print all items in the right sub-tree in ascending order. Once we
observe that (1) and (3) are exactly the problem we are trying to
solve, we should begin thinking about a recursive approach.
In thinking about a recursive approach, we will want to think
about a few important considerations that we learned about in
Chapter 7. First, we need a base case. For trees, a very natural
base case is the empty tree (i.e., NULL). We can print the elements
in the empty tree quite easily—we just do nothing.
We also want to ensure that our algorithm always terminates.
As you may recall from Chapter 7, we can ensure termination by
coming up with a measure function that strictly decreases
whenever we recurse. In the case of our traversal (and many
recursive tree algorithms), we can use the height of the tree as the
measure function, as each sub-tree has strictly smaller height than
its parent node (by the definition of the height).
What we described above (1–3 in our recursive steps, as
well as the base case) lend themselves to a short recursive
implementation (as a helper that would be called with the root of
the tree):
1 void printInorder(Node * current) {
2 if (current != NULL) {
3 printInorder(current->left);
4 std::cout << current->data << " ";
5 printInorder(current->right);
6 }
7 }
Before proceeding, we recommend that you take a moment
to execute this algorithm by hand on a small tree (either one that
we have shown in a figure in this chapter, or one of your own
creation). Note that you should get the data that is in the tree in
ascending order, for any valid binary search tree.
22.5.2 Preorder Traversals
We can traverse the tree in a different order by moving the line
where we “do something” to the current node (e.g., print it—
although we could do anything else) to a different position
relative to the recursive calls. For example, we could print our
nodes in a preorder traversal by putting the print statement first:
1 void printPreorder(Node * current) {
2 if (current != NULL) {
3 std::cout << current->data << " ";
4 printPreorder(current->left);
5 printPreorder(current->right);
6 }
7 }
Before you proceed, try executing this code by hand on the
binary search tree in Figure 22.10(a). You should end up with the
following output:
60 11 4 1 25 21 35 93 84 70 86
At this point, you may wonder what purpose such a traversal
serves (why would we want the contents of this tree in that
order?). While it does not produce a natural ordering for humans
to read, a preorder traversal produces an ordering that will
reconstruct the tree with the exact same structure if you were to
have a program read the items and add them to an empty tree.
If, for example, we wanted to save the tree to disk and then
read it back in later, we could perform a preorder traversal to save
the contents. Then, when we want to load the tree back in, we can
just read the items back from the file and add them to a tree
(which starts empty) in the order they appear. We will get the
same tree structure due to the “insert at the bottom” nature of the
algorithm to add to a binary search tree. Convince yourself of this
fact by taking the output of the above preorder traversal and
adding it to an empty tree.
Contrast this result with what would happen if you used an
inorder traversal to save the data from the tree. If you used the
output of an inorder traversal to add the items to an empty tree (in
ascending order), you would get a tree in which every node has a
NULL left child—effectively it would look like a linked list (and
we would not have O(log n) access time).
We will note that for a binary search tree, we would likely
want to use a balancing tree insertion algorithm anyways (see
Chapter 27). However, we may have other tree structures in
which we want to print out the data in a way that easily preserves
the structure. Likewise, we may wish to carry out other
operations on the tree besides printing in this fashion.
22.5.3 Postorder Traversal
The other place that we could put what we do to each node
relative to the recursive calls is at the end. Such a traversal is
called a postorder traversal. For example, if we wanted to print
the tree using a postorder traversal, we might write:
1 void printPostorder(Node * current) {
2 if (current != NULL) {
3 printPostorder(current->left);
4 printPostorder(current->right);
5 std::cout << current->data << " ";
6 }
7 }
Again, you might try executing this algorithm on an example tree
before you proceed. If you perform this traversal on the tree in
Figure 22.10(a), you should end up with the following answer:
1 4 21 35 25 11 70 86 84 93 60
Again, this type of traversal may seem nonintuitive. However, we
can see a very natural use of traversing the tree in a postorder
fashion if we consider freeing the memory for a node rather than
printing it. That is, if we want to delete all of the nodes in the
tree (e.g., in a function we might call from the tree’s destructor),
we need to recursively destroy both children, then delete the node
itself:
1 void destroy(Node * current) {
2 if (current != NULL) {
3 destroy(current->left);
4 destroy(current->right);
5 delete current;
6 }
7 }
This code is exactly a postorder traversal where what we do to
each node is delete it (as opposed to printing its data). Note that
if we tried to use a preorder or inorder traversal to destroy the
tree, we would dereference a dangling pointer when we make the
recursive call(s) after deleting the current node (as we would
try to read left and/or right out of memory that we just freed).
22.5.4 Reverse Traversals
We could also perform any of the above traversals with the
recursion to the right occurring before the recursion to the left.
Such a traversal would have similar properties to the above but
reverse the order of the left and right sub-trees. These are
generally not that interesting, as they do not typically solve some
problem that the forward ordered traversals do not solve.
22.6 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 22.1 : Write a function that will perform binary
search on an array of integers. Your function should take
an int toFind (specifying what data to find), an
int * array (specifying the array to look in), and a
size_t n (specifying how many elements are in the
array). Your function should then perform binary search to
find the index of the requested element. If the element is
found, your function should return its index. If it is not
found, it should return -1.
• Question 22.2 : In the previous problem, you did binary
search on an array of integers (e.g., as you would do in
C). In this problem, you will write a generic binary search
in C++ using templates and iterators. Your new binary
search function will have two template parameters: T (the
type of data being searched for) and Container (the type
that holds the data—e.g., a vector). Your function should
have the following signature:
1 template<typename T, typename Container>
2 typename Container::const_iterator
3 binarySearch(const T & toFind,
4              typename Container::const_iterator begin,
5              typename Container::const_iterator end);
• Question 22.3 : Draw the binary search tree that results
from adding (in this order) the following items to the
empty tree: 100, 50, 30, 200, 300, 250, 40, 1, 999
• Question 22.4 : What is the height of each node in the
tree you drew in the previous question? Is the tree
balanced? Why or why not?
• Question 22.5 : In what order are the nodes of the tree
from the previous questions visited if you perform a
preorder traversal? What about an inorder traversal? What
about a postorder traversal?
• Question 22.6 : Write a binary search tree class
(templated over what type of data it holds), which has the
following:
– A private inner class Node for its nodes, which
should have data, left, and right. You may wish
to write some constructors for Node as well.
– A root pointer (which should be private).
– A default constructor that makes the root NULL.
– The Rule of Three methods (destructor, copy
constructor, copying assignment operator).
– A void add(const T & item) method, which
adds the item to the tree (use any approach we
discussed).
– A void printTree() const method that prints
the tree to stdout with sufficient detail that you
can test your code extensively.
• Question 22.7 : We already saw how to search a binary
search tree using iteration. Now, write a function that
accomplishes the same task (checks if an item is in a tree)
using recursion—include it to the tree you have been
writing and testing.
• Question 22.8 : In Section 22.4, we described the
algorithm for removing from a binary search tree. For this
problem, you are going to write the code to implement
this algorithm using recursion. You should write the
recursive function void remove(int toRemove), which is
a member of the binary search tree class, and removes the
specified item (from a tree of ints). If the tree does not
contain the requested item, then the tree should not be
changed. Include this method in the tree you have been
writing and testing.
• Question 22.9 : In this problem, you will write a function
that performs the same task as in the previous question
(remove from a tree), but you should use iteration with a
pointer to a pointer instead of recursion. Include this
method in the tree you have been writing and testing (but
give it a different name).
• Question 22.10 : Write the function
std::list<T> allInRange(const T & low, const T &
high), which creates a list of all items in your tree that are
between low (inclusive) and high (exclusive). The items
in the list should be in order. Your algorithm should be
efficient—if the ordering properties of a binary search tree
guarantee you do not need to examine a particular subtree,
you should not examine that subtree. Include this method
in the tree you have been developing over the last several
problems.
• Question 22.11 : Write the three common traversals
(inorder, preorder, and postorder). You should make these
generic in what they do with each piece of data by
templating the traversal function over a type F which is a
type that is expected to overload the function call
operator, void operator()(const T&) to specify what to
do with each item during the traversal. Your inorder
traversal function should have the following signature.
template<typename F> void inorderTraverse(F & funcobj)
const (and your other functions should be
similar), and should invoke funcobj on each piece of data
at the proper time in the traversal. You should write a
class that overloads operator() such that it prints the
item passed in and use it to test these methods.
• Question 22.12 : Write
vector<T> allElements() const, which puts all
elements into a vector. Hint: you should be able to use one
of the generic traversals that you just wrote to do most of
the work—you just need to create a class that overloads its
operator() to place the items in a vector.
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Chapter 23
Hash Tables
Implementing maps and sets with O(log n) access for
insertion and lookup operations gives us fairly efficient data
structures. However, as always, we would like to ask “can we
do better?” Trying to do better than O(log n) seems to be
hard—O(log log n) is quite nice but does not come up often,
and O(1) is great but seems difficult. However, expected O(1)
behavior is exactly what we will achieve in this chapter,
using a data structure known as a hash table.
To see how we can accomplish this goal, let us return to a
familiar data structure with O(1) access: the array (or vector).
If we want to access a particular element of an array (by index),
we can do so in O(1) time. From this point, let us make an
observation: If we wanted to construct a map whose keys could
only be integers in a certain range (e.g., 0–1000) we could do
so by making an array of the appropriate size and storing the
values in the index corresponding to the key. Likewise, we
could implement a set of integers over a certain range with an
array of booleans.
23.1 Hash Table Basics
That approach sounds great for the limited set of applications
where we want our keys to be numbers in a small range, but
can we adapt it to the more general case? For example, could
we use keys that are strings or some other object of our own
devising? What if we need a range larger than 0–1000?
The first of these apparent difficulties can be solved by
remembering Everything Is a Number (hopefully you recall this
principle from Chapter 3). Whatever types we are working with
are fundamentally numbers—we may just have to explicitly
expose the underlying numeric properties to use them to index
an array. The second of these apparent difficulties can be solved
by performing some math to keep the numeric value that we
use to index the array in its range. If we have a 1000-element
array, then computing x mod 1000 will yield a valid index
(0–999) for any non-negative integer x.
From these two solutions, we can build the general
“formula” for a hash table and see how it gives us expected O(1)
behavior:
1. Apply a hashing function to the key, which converts it
to an unsigned integer (likely doing math on the
underlying numeric structure of the data).
2. Compute h mod n, where h is the result of applying
the hashing function to the key, and n is the size of
the array.
3. Use the resulting value to index into the array.
Observe that none of these operations have running times that
depend on the number of elements in the data structure—the
time to hash the key may be proportional to the size of the key,
the time to compute the mod operation is constant, and
indexing an array is also constant time.
We will note that if you want to make a generic hash table,
there are several ways that we can handle the hashing function.
One is to simply rely on the existence of a function with a
particular name (e.g., hash), which takes a key (likely by const
reference) and returns an unsigned int (or size_t). Such an
approach is discouraged as it is inflexible and restrictive.
Another approach would be to require the keys to have a hash
method. This approach is a bit cumbersome, as one has to write
a wrapper class around the types you actually want to use. This
wrapper class would provide the hash function. In general, we
do not suggest these approaches, but you may see them.
A better approach is to pass a template parameter for how
to hash the keys. This template parameter can name a class
whose function call operator (operator()) is overloaded to
take a const reference to the key and return an unsigned int,
and it does so by performing the desired hash operation. The
hash table can then make an instance of this object and use it as
if it were a function to hash keys as needed:
1 template<typename Key,
2 typename Value,
3 typename Hasher>
4 class HashMap {
5 private:
6 Hasher hash;
7 public:
8 void add(const Key & k, const Value & v) {
9 unsigned int h = hash(k);
10 //...everything else elided
11 }
12 };
23.2 Collision Resolution
This approach sounds great, but there is one slight problem:
what happens if we try to put two keys in the same index at the
same time? That is, suppose we have a 1000-element array, and
we add the key "apple" by hashing it to get 9845 (nothing
special about this number—just as an example), which means
that it goes in index 9845 mod 1000 = 845. Then we add
"grape" and hash it to get 4845, which also goes in index
4845 mod 1000 = 845. Our hash table needs to be able to
deal with this situation—which is called a collision—but in such
a way as to maintain its expected O(1) behavior.
23.2.1 Chaining
One way we can deal with collisions is to use chaining—we
maintain an array of linked lists. Each node in the linked list
holds a key/value pair (for a map, or just the item for a set). For
example, the start of implementing a map might look like this:
1 template<typename Key,
2 typename Value,
3 typename Hasher>
4 class HashMap {
5 private:
6 Hasher hash;
7 vector<list<pair<Key, Value> > > table;
8 public:
9 void add(const Key & k, const Value & v) {
10 unsigned int h = hash(k);
11 h = h % table.size();
12 table[h].push_front(pair<Key, Value>(k, v));
13 }
14 };
Note that we might want to first remove any existing
mapping for k (not shown) before adding it. Most other
operations we might want to do (finding an item, removing an
item, etc.) basically work out to hashing the key, then
performing the corresponding item on the appropriate linked
list. Video 23.1 illustrates the behavior of a hash table with
external chaining.
Video 23.1: A hash table with external chaining.
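As a sketch of how such an operation might look in code, here is a lookup for a chained table (the lookup method, its return convention, and the hasher class here are our own assumptions, not the book's code):

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// An illustrative hasher so this example is self-contained.
class SimpleStringHasher {
public:
  unsigned operator()(const std::string & str) const {
    unsigned ans = 0;
    for (size_t i = 0; i < str.size(); i++) {
      ans = ans * 29 + str[i];
    }
    return ans;
  }
};

template<typename Key, typename Value, typename Hasher>
class HashMap {
private:
  Hasher hash;
  std::vector<std::list<std::pair<Key, Value> > > table;
public:
  HashMap() : table(1000) {}
  void add(const Key & k, const Value & v) {
    unsigned int h = hash(k) % table.size();
    table[h].push_front(std::make_pair(k, v));
  }
  // Hash the key, then scan only that bucket's list.
  // Returns a pointer to the value, or NULL if the key is absent.
  const Value * lookup(const Key & k) const {
    unsigned int h = hash(k) % table.size();
    typename std::list<std::pair<Key, Value> >::const_iterator it;
    for (it = table[h].begin(); it != table[h].end(); ++it) {
      if (it->first == k) {
        return &it->second;
      }
    }
    return NULL;
  }
};
```

Note that only the one bucket's list is searched; if the items are well spread, that list is short.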
At this point, you should be thinking that we just
destroyed the O(1) behavior of our hash table—when we go
to look up an item, we are going to search a linked list, which
you know is O(n). The fact that we have a vector of 1000
such lists makes its behavior O(n/1000), which is still O(n).
However, the trick is that we do not fix the size of the
vector/array at 1000 elements—if the number of vector/array
elements we have (typically called buckets) is itself O(n)
(and we divide the items evenly among them), then the access
time is O(n/n) = O(1). We will see how and when
to increase the number of buckets in Section 23.4.
While that may seem like mathematical magic, the
intuition is that as long as we have O(n) buckets, each chain
will be quite short (i.e., have O(1) length). If we have a
million elements spread across a million buckets, each bucket
has, on average, a chain of length 1. Because we may not
spread the elements out perfectly, some buckets may be empty,
and some may have more than one element; however, if we do
a decent job of spreading elements out, we would not expect to
find long chains.
Sometimes when people learn about hash tables, they get
an idea in the hopes of improving efficiency: what if we chain
with binary search trees instead of linked lists? This idea
sounds nice at first glance—we just learned how binary trees
typically3 give better O(log n) behavior than linked lists, so why not
make a hash table as an array of such trees? The problem with
this idea is that trees have better asymptotic behavior—
meaning, they will perform better on large enough data sets.
However, the whole point of a hash table is to have each chain
be quite small—if your chain size is large enough for a binary
search tree to be better than a linked list, your hash function is
doing a poor job of spreading the data between buckets. Linked
lists should generally have much better constant factors (which
do not appear in the Big-Oh analysis) than trees and thus be
better when there are few elements.
We will note that chaining is most common for general
purpose use, especially as most of the implementation work is
already done if you have a linked list already written.
23.2.2 Open Addressing
Another collision resolution strategy is called open addressing
—seeking a nearby array index that is not used. The simplest
form of open addressing is linear probing, in which indices are
tried sequentially until an open one is found. In our earlier
example where "grape" collided with "apple" on index 845,
we would try to insert "grape" at index 846. If that failed, we
would then try 847, 848, etc. If we reach 999, we wrap around
to 0 and continue. We could perform linear probing with a
different step size—e.g., adding three each time—if we so
desired. The defining characteristic of linear probing is that the
step size is constant.
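A minimal sketch of linear probing for insertion (the class name, its fixed size of 1000 buckets, and the use of an empty string as the "unused" marker are all assumptions of this sketch):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Linear probing insertion: start at hash(k) mod the table size and
// step forward by 1 until an unused bucket is found, wrapping around
// at the end. An empty string marks an unused bucket in this sketch.
class ProbingSet {
private:
  std::vector<std::string> table;
  unsigned hash(const std::string & s) const {
    unsigned a = 0;
    for (size_t i = 0; i < s.size(); i++) {
      a = a * 29 + s[i];
    }
    return a;
  }
public:
  ProbingSet() : table(1000) {}
  // Returns the bucket index where the key was placed.
  size_t add(const std::string & k) {
    size_t i = hash(k) % table.size();
    while (!table[i].empty()) {    // collision: probe the next bucket
      i = (i + 1) % table.size();  // step size 1, wrapping 999 -> 0
    }
    table[i] = k;
    return i;
  }
};
```

Note that this sketch does not check for duplicate keys, nor does it handle a completely full table; a real implementation would need both.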
A different way to do open addressing is to use quadratic
probing. The general principle is the same, but instead of going
by a constant step size each time, we increase the step each
time. In our "grape" example, after we collide on index 845,
we would try 846 (845 + 1), 848 (846 + 2), 851
(848 + 3), 855 (851 + 4), etc. The motivation behind
quadratic probing (instead of linear probing) is that, if the data
is clustering up in one area, the algorithm will more quickly
find its way out of that cluster.
In either of these approaches, when we go to look up (or
remove) an item, we need to take into account the possibility
that it experienced a collision when it was added. If our table
does not support a “remove” operation, we can search
(following the same probing scheme) until we either find the
required key, or we find an empty bucket.
If we support a “remove” operation, an empty bucket is
not sufficient to indicate that the desired key is not located past
that point—some other item could have been present in that
bucket when we added the key we are looking for but then
subsequently been deleted. We could continue to search until
we have scanned the entire table; however, for obvious reasons,
this approach is inefficient. Instead, we can distinguish between
a “truly empty” bucket and one that has had data, but that data
was deleted—then we can stop when a truly empty bucket is
encountered. As you might suspect, with many insertions and
removals, the behavior of our table may degrade, as “truly
empty” buckets disappear—we can fix this by periodically
cleaning up the table: we reinsert each item into a new table,
and start fresh.
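One common way to implement this distinction is a per-bucket state marker (the "deleted" marker is often called a tombstone); the following sketch is our own illustration, not the book's code:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Each bucket is EMPTY (never used), FULL, or DELETED (a tombstone).
// Lookups probe past DELETED buckets and stop only at a truly EMPTY one.
class TombstoneSet {
private:
  enum State { EMPTY, FULL, DELETED };
  std::vector<std::string> keys;
  std::vector<State> states;
  unsigned hash(const std::string & s) const {
    unsigned a = 0;
    for (size_t i = 0; i < s.size(); i++) {
      a = a * 29 + s[i];
    }
    return a;
  }
public:
  TombstoneSet() : keys(1000), states(1000, EMPTY) {}
  void add(const std::string & k) {
    size_t i = hash(k) % keys.size();
    while (states[i] == FULL) {    // a DELETED bucket can be reused
      i = (i + 1) % keys.size();
    }
    keys[i] = k;
    states[i] = FULL;
  }
  bool contains(const std::string & k) const {
    size_t i = hash(k) % keys.size();
    while (states[i] != EMPTY) {   // keep searching past tombstones
      if (states[i] == FULL && keys[i] == k) {
        return true;
      }
      i = (i + 1) % keys.size();
    }
    return false;                  // truly empty bucket: key is absent
  }
  void remove(const std::string & k) {
    size_t i = hash(k) % keys.size();
    while (states[i] != EMPTY) {
      if (states[i] == FULL && keys[i] == k) {
        states[i] = DELETED;       // mark as a tombstone, not EMPTY
        return;
      }
      i = (i + 1) % keys.size();
    }
  }
};
```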
23.3 Hashing Functions
One important aspect of an efficient hash table is a good hash
function. Of course, the natural question you should ask is
“what does ’good’ mean in this context?” We will provide two
different criteria for a hash function. The first is whether or not
a hash function is valid (if it is not valid, we cannot
meaningfully use it). We will say that a hash function is valid if
(1) it is purely a function of its input—meaning that if we call
hash(x) we are guaranteed to get the same value on subsequent
calls to hash(x) unless we modify x, and (2) for any two objects
a and b that we consider equivalent (that is a == b evaluates to
true), then hash(a) == hash(b) also evaluates to true.
The second criterion we will consider is how “good” a valid
hash function is. The “goodness” of a hash function is how
likely we are to get different hash values for objects that we
consider different. That is, if a == b evaluates to false, how
likely are we to have hash(a) give a different value from
hash(b)? A hash function that always returns 0 is valid (we will
never get the wrong answer by looking in the wrong place), but
is terrible—it will not spread our objects out in the hash table,
and it will give us terrible performance. A hash function for
strings that returns the numerical value of the first character of
the string (or 0 if the string is empty) is also very bad—all
strings that start with “a” will map to the same bucket. If we
make the array in our hash table quite large, we will not be able
to make use of most of the buckets, as our hash function will
only return values in the range 0–255.
23.3.1 A Bad Hash Function for Strings
We might think the following function (which adds the
numerical values of each character in the string) is decently
good:
1 unsigned hash(const std::string & str) {
2 unsigned ans = 0;
3 for (std::string::const_iterator it = str.begin();
4 it != str.end(); ++it) {
5 ans += *it;
6 }
7 return ans;
8 }
However, this hash function will return the same value for
strings that are permutations of each other (e.g., "bat" and
"tab") or which happen to add to the same values (e.g., "bat",
"elf", "ago", and "aim", "end", "keg", and "odd"—all add to
311). This problem is not limited to short words either—if we
use this function to hash the words in the system dictionary, we
end up with 623 different words colliding in the worst case
(which is the words that hash to 973—including such diverse
words as "ambitious", "grounding", and "wondering"). In
fact, of the 235,886 words in the dictionary, only 198 of them
hash to a unique value (one not shared by any other word), and
only 1821 of them (less than 1%) have fewer than 10 words
that hash to the same value. This hash function is not good.
23.3.2 A Better Hash Function for Strings
We can, however, do much better if we make our hash function
slightly more sophisticated:
1 unsigned hash(const std::string & str) {
2 unsigned ans = 0;
3 for (std::string::const_iterator it = str.begin();
4 it != str.end(); ++it) {
5 ans = ans * 29 + *it;
6 }
7 return ans;
8 }
Notice that the only difference between our improved hash
function and our original hash function is that we multiply the
existing answer by 29 before we add the current character. With
this hash function, only four pairs of words hash to the same
value from the entire system dictionary. Clearly this hash
function is much better than our prior attempt.
At this point, you are probably wondering why
multiplying by 29 makes such a big difference in how good our
hash function is, and why we picked 29 in particular. First,
multiplying by 29 makes a huge difference because
permutations and other combinations of letters that add to the
same value no longer hash to the same values—instead of
"bat" hashing to 98 + 97 + 116 = 311 and "tab" hashing to
116 + 97 + 98 = 311, which are both the same, now "bat"
hashes to (98 * 29 + 97) * 29 + 116 = 85,347 while "tab"
hashes to (116 * 29 + 97) * 29 + 98 = 100,467, which are
different values. The particular value of 29 works well because
it is prime and larger than 26—the strings we are hashing are
primarily composed of lower case letters, so each character
typically takes one of 26 possible values. To a first
approximation, we are making a base-29 number with each
letter as a “digit.” This description is only an approximation of
what is happening, as we have some capital letters, and the
numerical values of the lower case letters range from 97 to 122
(not 0 to 25), but it is close enough to understand why it works
much better.
There is also another, subtle reason why the second hash
function works better—it has a larger range. The first (really
bad) hash function produces values in the range 0–2621, while
the second (better) hash function produces values in the range
0–4,294,962,258.4 The fact that the first function hashes
200,000+ words into a range of 0—2621 means that there will
definitely be many collisions. In fact, we are guaranteed that,
on average, each word will collide with 100 others. By contrast,
the second function spreads the 200,000+ words out over a
range of 4 billion hash values—giving the possibility that each
value is distinct.
At this point, we have seen that our second function is
better than the first, and for this particular set of input data is
pretty good. However, we may need to hash other strings
besides just words from the dictionary—and those might
exhibit different properties. For example, if we were to hash
URLs, they would contain characters that do not appear in our
words (e.g., slashes, dots, colons, and numbers), and they may
be much longer than any of the words in the dictionary. Our
second hash function is still likely to be pretty good, although
we should be sure to test it out on representative data to make
sure.
If you need to design hash functions for your own objects,
there are a few basic principles that make for a good starting
point. First, your hash function should incorporate all parts of
the object—do not try to make it faster by examining only a
small piece of the object. While such an approach may make
the hash function itself a bit faster, it is likely to degrade the
overall performance of your data structure—a few extra
additions are faster than searching through hundreds of
colliding pieces of data. Second, try to combine the data in
ways that permutations and alterations create different results.
We saw an example of this principle in our two functions:
multiplying by a prime and adding produced much better
results than simply adding. Third, whatever you do, be sure to
test it and see if it is actually good—you would be surprised
how many seemingly good ideas turn out to be bad in practice.
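For instance, applying these principles to a hypothetical user-defined type might look like this (the struct, its fields, and the prime 31 are illustrative assumptions):

```cpp
#include <cstddef>
#include <string>

// A hypothetical user-defined type.
struct Point {
  int x;
  int y;
  std::string label;
};

// Incorporate every field, multiplying by a prime (31 here) between
// contributions so that swapped field values hash differently.
unsigned hashPoint(const Point & p) {
  unsigned ans = 0;
  ans = ans * 31 + p.x;
  ans = ans * 31 + p.y;
  for (size_t i = 0; i < p.label.size(); i++) {
    ans = ans * 31 + p.label[i];  // include all of the string, too
  }
  return ans;
}
```

Because of the multiplication, a Point (1, 2) and a Point (2, 1) with the same label hash to different values, which simple addition of the fields would not achieve.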
23.3.3 Cryptographic Hash Functions
The hash functions we have been discussing so far have been
adequate for hash tables, in which collisions degrade
performance, but do not have serious security consequences.
There is an entirely different class of hash functions that we
might use for security-sensitive purposes. The important
difference in these hash functions is that they are based on
cryptographic principles (i.e., made by cryptography experts),
and finding values that experience collisions is quite difficult.
We will note that such functions generate hash values much
larger than 32 bits—generally at least 128 bits.
One use of cryptographically strong hash functions is in
the storage of login information. Suppose you are creating a
website (or other system where users log in). The worst practice
would be to store their passwords in cleartext—meaning to just
store them directly. If you store passwords in cleartext and your
system is compromised, the attacker can obtain all of the
passwords of every user trivially. Note that such a weakness not
only grants the attacker the ability to easily gain the user’s
passwords to their site, but frequently to many others: often
people will use the same password for multiple sites.
A better approach (although still not correct) would be to
store hash(password) for each user, using a cryptographically
strong hash function. Under such an approach, when the user
sets up her account (or changes her password), the system
computes hash(password) and stores it in the password
database. When the user logs in, she provides the password,
which the system hashes, and then can compare the computed
hash value to the stored hash value—if they match, the user
either provided the same password, or one that experienced a
hash collision. As cryptographically strong hash functions have
very few collisions (and they are hard to produce), we can be
quite sure that the user provided the correct password. The
advantage here is that the system never stores the passwords
directly—only the hashes. Thus an attacker would have to find
the password that goes with each hash.
The attacker can try to find the password that goes with
each hash by a variety of means. The simplest are dictionary
attacks and brute force attacks. For a dictionary attack, the
attacker takes a list of words that are likely passwords, and
hashes each of them. For each hashed potential word, the
attacker checks if the hash value is in the stolen password file.
If so, the attacker has found a user’s password information. A
brute force attack is similar, except the attacker tries all
possible passwords in a given range (e.g., all 8 to 10 character
passwords). The values can also be pre-computed and stored,
speeding up the process, but requiring significant storage space.
The problem with just storing hashes is that an attacker
can try to break all passwords in parallel, giving him an
advantage proportional to the number of users of the system.
For each hash that the attacker computes, he succeeds in
gaining a password if any user on the system chose that
password. If your system has 100,000 users, the attacker gets
the advantage of 100,000-way parallelism in checking each
password.
The correct approach is to give each user randomly
generated (per-user) salt—a random string—when the
password is created or changed. The authentication database
then stores the salt and hash(password + salt) for each user.
Now, when the user logs in, she supplies her password, the
system looks up her account’s salt in the authentication
database, appends it to the entered password, hashes the
resulting string, and compares that value to the hash stored in
the database.
Cryptographic hash functions have a variety of other
security-sensitive applications. We will not delve into them
here, but in general if you want to be able to verify information
(possibly without exposing the underlying information), they
are a good approach. Devising hash functions that are strong
enough for cryptographic applications is quite difficult, and
thus not something you should try to do on your own.
At the time of this writing, SHA-3 is an accepted standard,
which is still considered secure. If you need a cryptographic
hash algorithm, you should see what the current state of the art
is and use it.
23.3.4 Hash Function Objects
At the end of Section 23.1, we discussed making a generic hash
table and suggested the approach of including a template
parameter that is a type whose operator() is expected to be
overloaded to perform the hash function. If we wanted to write
our own, we could do so like this:
1 class StringHasher {
2 private:
3 const int multiplyBy;
4 public:
5 StringHasher() : multiplyBy(29) {}
6 StringHasher(int m) : multiplyBy(m) {}
7 unsigned operator() (const std::string
8 unsigned ans = 0;
9 for (std::string::const_iterator it =
10 it!=str.end(); ++it) {
11 ans = ans * multiplyBy + *it;
12 }
13 return ans;
14 }
15 };
Note that we have generalized our previous approach
slightly—the amount to multiply by is now a parameter to the
constructor (and the default constructor chooses 29). The
overloaded operator() carries out the same hash algorithm we
explored earlier.
For std::string (and some other types), we do not even
have to write our own—the C++ STL has a built in template:
std::hash<T>. If you only need to hash strings, using the built-
in one is a great option (see
http://www.cplusplus.com/reference/functional/hash/
for more details). However, the lessons we have presented here
will serve as a good basis if you need to hash data types that
you define yourself.
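A brief sketch of using the built-in functor (std::hash requires C++11; it is itself a class whose operator() does the hashing, so it fits the Hasher template parameter pattern directly):

```cpp
#include <cstddef>
#include <functional>
#include <string>

// std::hash<T> (available since C++11) is a class whose operator()
// maps a value to a std::size_t, so an instance of it can be used
// exactly like the hand-written hasher classes above.
std::size_t hashString(const std::string & s) {
  std::hash<std::string> hasher;  // default-constructed functor
  return hasher(s);               // called as if it were a function
}
```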
23.4 Rehashing
Hash tables only work efficiently if there are more buckets than
elements stored in the table—otherwise, the probability of
collisions increases and performance degrades. If one knows
exactly how many elements will be placed in a table a priori,
then one can size the table appropriately. However, it is much
more common to need to resize the table as the number of
elements in it grows—a process called rehashing.
The first aspect of rehashing a hash table is when to do so.
The metric of interest for this aspect of designing a hash table is
the load factor—the ratio of the number of elements actually
stored in the table to the number of buckets. If we have 30
elements stored in a hash table whose array has 100 spaces,
then the load factor is 30% (or 0.3). Computing the load factor
requires a count of how many elements are in the hash table
(which should be maintained in a field that is incremented on
insertions and decremented on deletions), and dividing by the
size of the array (or vector). After each insertion, the hash table
should check the load factor and see if it exceeds some
threshold (generally in the 0.5–0.8 range). If so, the table needs
to be rehashed (which will decrease the load factor
significantly, as the array size will increase greatly).
The second aspect of rehashing a hash table is how to do
so. The first step is to allocate a larger array (or vector)—even
if we are using a vector, we cannot just reuse/grow the existing
one. We will want the new array to be roughly twice as
large as the existing one, so that we can amortize the cost of the
rehash operation over many additional insertions. We could
exactly double the size, although we may want to use prime
numbers of buckets (in which case, we would keep an array of
primes to use for sizes, where each is roughly double the
previous one, and go to the next size in our list).
Video 23.2: Rehashing a hash table with external
chaining.
We then iterate over all items from the old table (e.g., with
an externally chained table, iterating over all buckets, and for
each bucket, iterating over all items in the list) and place them
into the new table. Adding each item to the new table requires
recomputing its hash and computing the mod of that value by
the new table size. Once all items are moved to the new table,
the old table is destroyed, and the pointer to the array/vector is
updated. Video 23.2 illustrates.
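Putting the trigger and the rehash together, a sketch for an externally chained set might look like this (the initial size, growth factor, and 0.75 threshold are illustrative choices, and the hash function is our own):

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

class ChainedSet {
private:
  std::vector<std::list<std::string> > table;
  size_t numItems;
  unsigned hash(const std::string & s) const {
    unsigned a = 0;
    for (size_t i = 0; i < s.size(); i++) {
      a = a * 29 + s[i];
    }
    return a;
  }
  void rehash() {
    // Allocate a roughly-double-sized table; we cannot grow in place.
    std::vector<std::list<std::string> > bigger(table.size() * 2);
    for (size_t b = 0; b < table.size(); b++) {
      std::list<std::string>::iterator it;
      for (it = table[b].begin(); it != table[b].end(); ++it) {
        // Recompute each item's bucket mod the NEW size.
        bigger[hash(*it) % bigger.size()].push_front(*it);
      }
    }
    table.swap(bigger);  // old table is destroyed when bigger goes away
  }
public:
  ChainedSet() : table(4), numItems(0) {}
  size_t buckets() const { return table.size(); }
  void add(const std::string & s) {
    table[hash(s) % table.size()].push_front(s);
    numItems++;
    // Rehash once the load factor exceeds 0.75.
    if (numItems > table.size() * 3 / 4) {
      rehash();
    }
  }
  bool contains(const std::string & s) const {
    const std::list<std::string> & l = table[hash(s) % table.size()];
    std::list<std::string>::const_iterator it;
    for (it = l.begin(); it != l.end(); ++it) {
      if (*it == s) {
        return true;
      }
    }
    return false;
  }
};
```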
23.4.1 Hash Table Sizing
The choice of doubling exactly versus a prime size presents a
bit of a tradeoff. If we keep our hash table’s size as a power of
2, we can perform the mod operation much more efficiently.
Specifically, you can compute x mod 2^k by computing
x & ((1 << k) - 1), where & performs a bit-wise AND, and
<< shifts the bits left. Most computers can compute & in a single
cycle but take tens of cycles to compute mod. However, if we
use a prime size, then we reduce the chances of introducing
extra collisions when we perform the modulus operation if our
hash function creates patterns in the numbers. We will note that
sometimes there are pressing considerations for one design
choice (e.g., if we are designing a hardware structure, we will
typically opt for a single-cycle operation); however, it is good
to understand the tradeoffs involved in making such a choice.
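A quick sketch of the bit trick (the function name is our own): for a table of 4096 = 2^12 buckets, masking with 4095 keeps exactly the low 12 bits of x, which are its remainder mod 4096:

```cpp
// For a power-of-two table size 2^k, x % (2^k) equals
// x & ((1 << k) - 1): the mask keeps the low k bits of x,
// which are exactly its remainder mod 2^k.
unsigned fastMod4096(unsigned x) {
  return x & ((1u << 12) - 1);  // 4096 == 1 << 12, mask == 4095
}
```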
As an example, suppose we choose between a table of size
4096 (which is 2^12) and 4093 (which is prime). For either
size, there are approximately 2^20 different 32-bit integers that
mod to the same number. If we pick a particular bucket (e.g., 0),
we can describe the pattern of numbers that mod to it:
multiples of 4096 (i.e., 0, 4096, 8192, 12288, …) for a table of size
4096, and multiples of 4093 (i.e., 0, 4093, 8186, 12279, …) for a table
of size 4093. If our hash function maps inputs to hash values
that follow one of these sorts of patterns, then even if the hash
function produces distinct un-moded values, they will collide
modulo the table size (e.g., 4101, and 8197 are distinct but are
both 5 mod 4096).
Of course, we could end up with a hash function that
incidentally maps inputs to patterns based on 4093—so why is
the prime size better? The answer is that we are less likely to
end up with patterns based on 4093, because they can only arise
from structure in the output that is a multiple of 4093, not a
factor of it (as 4093 is prime). For a power of 2, a pattern based
on any smaller power of 2 will cancel out mod that power of 2.
If we were to have a pattern that strides by 2048 (e.g., 0, 2048,
4096, 6144, 8192) and take its elements mod 4096, they would
map to only two values (i.e., 0, 2048, 0, 2048, and 0
respectively).
Perhaps the most important concern in such a design is to
be sure that we do not pick a number (nor multiple of a
number) that is used in our hash function itself. If our hash
function multiplies its result by 4093 at each step before adding
the next element, then a table of size 4093 is a far worse choice
than 4096, as the mod will discard all information other than
the last element (every earlier character's contribution is a
multiple of 4093, so the hash mod 4093 is just the value of
the last character mod 4093). We will
note that a deeper treatment of these issues is beyond the scope
of this book, but it is an excellent example of why computer
scientists need mastery of discrete math.
23.5 Practice Exercises
Selected questions have links to answers in the back of the
book.
• Question 23.1 : What is external chaining?
• Question 23.2 : What is linear probing?
• Question 23.3 : Why should passwords be salted and
hashed?
• Question 23.4 : What is the load factor of a hash table?
Why is it important that the load factor remain low?
What should we do if the load factor gets too high?
How does doing this action reduce the load factor?
• Question 23.5 : In Section 23.3.4, we made a hashing
object that generalized the family of simple string hash
functions we examined earlier. As we saw, changing
what we multiply by in this function has a significant
impact on how good (or bad) the function is. For this
problem, you will write a program to empirically test
different values of this factor by counting collisions.
Specifically, you should first write a program that takes
one command line argument (which is an integer that
you will use as the multiplication factor in the
generalized hash function). This program should then
read from standard input until it reaches end of input.
For each line that it reads, it should hash the input, and
keep track of which input lines experience hash
collisions (hint: use a map). The program should then
print the cumulative distribution of how many collisions
were experienced. Once you have tested your program,
give it the system dictionary (/usr/share/dict/words)
as input, and graph the results (as a cumulative
distribution function, called a “cdf” for short—that is,
your x-axis should be collisions, and the y-axis should
be the cumulative fraction of words with that many or
fewer collisions) for a variety of multiplication factors.
Your resulting graph should look something like this
(although the results may be slightly different
depending on differences in what words are in your
system dictionary):
• Question 23.6 : Suppose we have a hash function that
maps strings to ints as shown in the following table:
String Hash Value
red 147
blue 301
white 237
green 335
black 370
Draw the resulting data structure when "red", "blue",
"white", "green", and "black" are inserted (in that
order) into an externally chained hash table with 11
buckets. Assume that you add to the front of a list on a
collision.
• Question 23.7 : Repeat the previous problem with
linear probing with a step size of 1. How many
collisions would you encounter if you inserted another
item that hashed into bucket 5?
• Question 23.8 : Repeat the previous problem with
quadratic probing (+1, +2, +3,…). How many collisions
would you encounter if you inserted another item that
hashed into bucket 5?
• Question 23.9 : Implement a set with an externally
chained hash table. Your hash table should be generic in
(i.e., templated over) the type of items it contains, as
well as how to hash them (i.e., the template parameter
should be a class whose operator() is overloaded to
perform the hash function). Your table should rehash
whenever the load factor exceeds 0.75.
• Question 23.10 : Write a program that tests the
performance of your hash table, and confirm it has
O(1) behavior. To accomplish this goal, you should
write a program that takes one command line argument
N, creates N distinct elements, and times adding
them to the hash table as well as checking if the hash
table contains them (you should time adding separately
from checking if it contains them, but should measure
the aggregate time to add/check all elements). Your
program should then print the average time per element
for an add operation as well as the average time per
element for a contains operation. Run your program for
N = 10,000, N = 100,000, N = 1,000,000, and
N = 10,000,000, and see if your data confirms O(1)
behavior.
Chapter 24
Heaps and Priority Queues
In Section 20.3, we learned about the queue abstract data type,
which manages the items added to it in a FIFO fashion. Strictly
FIFO queues have many important uses; however, sometimes we
would prefer to prioritize the elements in our queue. A priority
queue is a queue where each item has an associated priority, and
the next item returned from the queue is the one with the highest
priority.1
We might want this prioritization for a few reasons. One
reason might be that we are scheduling tasks that have explicit
priorities associated with them. In such a scheduler, we would
want to run our highest priority task at any time. However, as we
will see at the end of the chapter, priority queues are not limited
to applications where we are using a traditional notion of
priority. One common example, which we will see in Section
24.4, is a data compression algorithm known as Huffman coding.
We could implement a priority queue using the data
structures we have already seen. Of these, the best choice so far
would be a binary search tree (although, we would want to use
the balancing algorithms we shall see in Chapter 27 to ensure
O(log n) access time). As long as our binary search tree
remains balanced, it would give us O(log n) runtimes for
the three operations we would want to support in a priority
queue: enqueue, dequeue, and peek. However, we can do better
not only in a theoretical sense (i.e., better Big-Oh behavior), but
also in a practical sense (i.e., better constant factors, which do
not show up in Big-Oh analysis) by using a heap2 —a data
structure that gives efficient access to its largest (or smallest)
element. In particular, we will be able to implement enqueue and
dequeue with O(log n) runtime, and peek with O(1)
runtime.
24.1 Heap Concepts
Conceptually, a heap is a complete binary tree that obeys the
heap ordering rule. Recall from Section 22.1.1 that a complete
binary tree is one in which every level, except possibly the last
one, has as many nodes as possible, and the last level is filled in
from left to right. There are two ways we can define the heap
ordering rule, based on whether we want efficient access to the
largest element (called a max-heap) or the smallest element
(called a min-heap). In a max-heap, the heap ordering rule says
that every node is larger than its children. Similarly, in a min-
heap, the heap ordering rule says that every node is smaller than
its children. Note that this rule is quite different from the binary
search tree ordering rule. In a min-heap, the parent node is
smaller than both of its children, while in a binary search tree,
the left child is smaller than the parent and the right child is
larger than the parent.
Figure 24.1: Two heaps with the same data but opposite ordering rules.
Figure 24.1 illustrates heaps conceptually. On the left, we
show a max-heap holding the nine integers. On the right, we
show a min-heap holding the same nine integers. Observe how
with either ordering, the tree is a complete tree—the first three
levels are completely filled with one, two, and four nodes
respectively, and the nodes on the last level are in the left-most
positions. For the max-heap, each node is larger than its children
(87 is greater than both 80 and 63—and also larger than their
children). However, there is no ordering between sibling nodes
—we could still have a valid heap with 63 as 87’s left child, and
80 as 87’s right child, as long as the rest of the heap followed the
rules (we could not have 74 as 63’s child, so we cannot just swap
80 and 63 in this heap). The min-heap follows similar rules; we
just require the opposite ordering: parents must be smaller than
their children.
Using a heap to implement a priority queue, we would want
to order the heap by priority (and either use a max-heap or a
min-heap based on whether the largest or smallest priority is
logically highest). We can then easily implement the peek
operation by just looking at the root node—it will always be the
highest priority item in the heap. However, we must still be able
to implement enqueue or dequeue efficiently (i.e., in
O(log n) time), while still maintaining the invariants of the
heap.
24.1.1 Insertion
To enqueue an item, we must insert it into the heap. We insert by
first placing the new item at the next available place in the tree
(the leftmost open slot on the bottom level, or the first slot of a
new level if the bottom level is full). Placing the new item in this
location ensures the tree remains complete (which is one of the
rules of the heap) but may cause the heap ordering rules to be
violated. We then fix the heap ordering (while maintaining heap
completeness) by “bubbling up” the node. We compare the node
to its parent and, if the heap ordering is violated, we swap the
two. We then repeat the process with the node in its new position
and its new parent. This process continues until either the node
is correctly ordered with respect to its parent (which ensures that
the entire heap is correctly ordered), or the node reaches the root.
Video 24.1: Adding items to a max-heap.
Video 24.1 demonstrates the conceptual process of heap
insertion. Observe that this operation is guaranteed to have
O(log n) runtime because we only swap along the path from
the initial insertion point to the root, whose length must be
O(log n). This length guarantee arises from the fact that the
tree must be complete, which puts a tighter bound on the height
than simply being balanced (i.e., we know that the height of the
tree is exactly floor(log2(n))). We can also observe that the
bubble up process is guaranteed to maintain the completeness of
the tree (as it only swaps nodes, thus leaving the structure of the
tree unchanged), and it will ensure heap ordering when it
completes (primarily due to the transitivity of and ).
24.1.2 Deletion
If we want to remove the root element from a heap (as we would
need to do to implement the dequeue operation of a priority
queue), we can do so efficiently (removing an arbitrary element
is much less efficient, but we generally only use a heap when we
remove only the root element). The first step for removal is to
swap the position of the root with the rightmost element on the
last row. Now, we can remove the element that was the root
while still maintaining the completeness of the tree. However,
we have violated the heap ordering rule by swapping an element
from the bottom of the heap to the top. We must therefore repair
the heap ordering by bubbling that node down the heap until it is
in a correct position.
To bubble an element down, we compare it to its two
children and determine which is largest (for a max-heap) or
smallest (for a min-heap). That element should be the parent of
the other two, so if it is not, the current parent must be swapped
with it, and the bubble down process continues from that point.
If the elements are already correctly ordered, then the heap is
correctly ordered, so no other steps are required. Note that
sometimes when we want to compare a node to its children,
those children might not exist (we might be at the bottom of the
heap). In such a case, the missing children do not participate in
the comparison. If a node has only one child, it is compared
against that child. If it has no children, then there is nothing else
to do.
Video 24.2: Removing the largest item from a
max-heap.
Video 24.2 demonstrates the conceptual process of
removing the largest element from a max-heap. As with
insertion, we can observe that removing from a heap has
O(log n) runtime because in the worst case, the item
swapped into the root must be bubbled all the way down the
heap, following a path of length O(log n).
24.2 Heap Array Implementation
Even though heaps are conceptually trees, they are actually
implemented in arrays (or vectors, or some similar structure).
Doing so requires us to set up an indexing scheme where we can
compute the index of an item’s parent, left child, and right child
from the array index storing that item. We can use one of two
indexing schemes, based on whether we want to store the
root of the heap at index 0 or at index 1 of the array.
While storing the root at index 0 seems like a natural choice
in languages (such as C and C++) that use 0-based array
indexing, we may want to instead use index 0 for a sentinel. A
sentinel is a special item that is not actually part of the heap’s
data, which is ordered so that it stops the bubble up process
without a special case. In a min-heap, the sentinel node would be
the smallest possible item of a given type, while it would be the
largest possible item in a max-heap. Use of a sentinel in index 0
(which is “right before” the heap) means that you do not need to
explicitly check if the inserted node has become the root, as you
will then compare it with the sentinel and find it is always
correctly ordered.
For a type such as int, we can easily find a smallest or
largest item of that type by using either INT_MIN or INT_MAX
(defined in <limits.h>). However, for some types, we might not
be able to define a maximum and/or minimum. For example, for
strings, there is a minimum element of the type (the empty string
is less than all other strings for lexicographical ordering);
however, there is no maximum element. If you claim to have a
string that is the largest possible string, I can make a string that
will be larger than it by adding another letter to its end.
Once we have picked whether to use a sentinel or not (and
thus whether we want to place the root at index 0 or at index 1),
we can determine the arithmetic required to find the children or
parents of a given index. Both schemes are quite similar—
requiring doubling the index (plus a small constant) to get to the
children, and the inverse operation to get to the parent. The
following table shows the required math in each case:

              Root at 0      Root at 1
Parent        (i - 1) / 2    i / 2
Left child    2i + 1         2i
Right child   2i + 2         2i + 1
Note that both of these schemes arrange the “tree” into the
array by placing one level of the tree first, then the next level
after it, and so on. That is, with the root at index 0, the first level
(the root) is at index 0, the second level (the root’s children) is at
indices 1–2, the third level (their children) is at indices 3–6, the
fourth level at indices 7–14, and so on. The doubling operation
works with this arrangement as each level has twice as many nodes as the
previous level. We then compute the parent’s index by using the
inverse operation—subtracting 1, then dividing by 2. Note that
(i - 1) / 2 is the inverse of both 2i + 1 and 2i + 2
because we are doing integer division—((2i + 1) - 1) / 2 = i and
((2i + 2) - 1) / 2 = i. The arithmetic when the root is placed in index 1
follows similar principles, but with a change of 1 in each
calculation to reflect the difference in the root’s index.
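The arithmetic above can be expressed directly as code. Here is a minimal sketch (the function names are ours, not the book's):

```cpp
#include <cstddef>

// Root stored at index 0:
size_t parentOf(size_t i) { return (i - 1) / 2; }  // only valid for i > 0
size_t leftOf(size_t i)   { return 2 * i + 1; }
size_t rightOf(size_t i)  { return 2 * i + 2; }

// Root stored at index 1 (index 0 left free, e.g., for a sentinel):
size_t parentOf1(size_t i) { return i / 2; }
size_t leftOf1(size_t i)   { return 2 * i; }
size_t rightOf1(size_t i)  { return 2 * i + 1; }
```

Because the division is integer division, parentOf is the inverse of both leftOf and rightOf, as discussed above.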
We will leave the implementation of a heap as an exercise
for you. Both the bubble up and bubble down algorithms should
be well within your skill set by this point (do not forget to work
some examples yourself with the data laid out in an array!);
however, we want to make one note about bubble down. The
bubble down implementation is one in which novices tend to
write code with horrific “case bloat”—writing out many more
cases than are actually needed. Specifically, when the current
“node” has two children, it may seem like one must write out all
possible cases of the ordering of that item and its children (of
which there are six: left < right < current, left < current < right,
current < left < right, current < right < left, right < left < current,
and right < current < left). If we were doing a min-heap, we would swap
with the left child, then recurse in the first two cases, do nothing
in the second two cases, and swap with the right child, then
recurse in the last two cases.
However, we can greatly reduce the number of cases (and
thus the code duplication). First, we should notice that we really
only do three things in these six cases, and that it does not matter
whether we have left < right or right < left. What we
want to do is pick the smaller of the left and right child first, then
compare that to the current data. We do not care at all how the
current data is ordered relative to its other child—only how it
compares to the smaller child. That is, we might write
something that looks like this:
int minIndex = right;
if (data[left] < data[right]) {
  minIndex = left;
}
if (data[minIndex] < data[current]) {
  //swap with minIndex then recurse
}
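Fleshing that idea out, a full bubble-down for a min-heap stored with its root at index 0 might look like the following sketch. The book leaves the real implementation as an exercise, so treat this as one possible approach (the function name is ours):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Restore min-heap ordering by "bubbling down" the item at index
// 'current': repeatedly swap it with its smaller child until it is
// correctly ordered or has no children.
void bubbleDown(std::vector<int> & data, size_t current) {
  size_t n = data.size();
  while (true) {
    size_t left = 2 * current + 1;
    size_t right = 2 * current + 2;
    if (left >= n) {
      return;                 // no children: nothing to do
    }
    size_t minIndex = left;   // if there is no right child, the
    if (right < n && data[right] < data[left]) {  // left child is compared alone
      minIndex = right;
    }
    if (data[minIndex] < data[current]) {
      std::swap(data[minIndex], data[current]);
      current = minIndex;     // continue from the swapped position
    }
    else {
      return;                 // correctly ordered: done
    }
  }
}
```

Note how the missing-children cases from the text fall out naturally: a leaf returns immediately, and a node with only a left child is compared against that child alone.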
24.3 STL Priority Queue
C++’s STL has a built-in priority queue, std::priority_queue,
which you can read about online at
http://www.cplusplus.com/reference/queue/priority_queue/. If
you consult this documentation, you will notice that the
priority queue is templated over what type of elements it holds,
as well as two other parameters that have default values:
Container = vector<T> and
Compare = less<typename Container::value_type>.
We very briefly mentioned templates with parameters that
have default values in Section 17.4.1 (vector has a second
parameter, but the default value is fine); however, we have not
revisited that topic since then as it was not important. Now,
however, we have a case of a template where we are potentially
more interested in explicitly specifying these optional
parameters rather than just always using the defaults.
In particular, the third parameter, Compare, allows us to
specify the ordering we want to use in the priority queue. The
default is less, which just uses the < operator on whatever types
it is operating on (it is an error to use it if < is not defined on
those types). However, we might want a different ordering for a
variety of reasons.
First, STL’s priority queue always uses a max-heap (that is,
when we peek or pop from it, we always get the largest
element). However, we might want to get the smallest element
instead. We can still use STL’s priority queue, we just have to
reverse the ordering. That is, we could give it std::greater for
Compare:
std::priority_queue<int, vector<int>, std::greater<int> > pq;
Second, we might just want a different ordering than < (or
>) gives. For example, if we want to put pointers to some type
into our priority queue, then we almost certainly do not want the
ordering defined by < on pointers (which orders them by the
numerical value of the pointers)—we probably want an ordering
that examines the appropriate data within the objects that the
pointers point to. In this case, we can write our own comparison
class (which needs to overload operator(), taking in two
const references to the types to compare and returning a bool).
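For instance, a comparison class for a priority queue of pointers might look like this sketch (Task and TaskPtrLess are hypothetical names of our own, not from the book):

```cpp
#include <queue>
#include <string>
#include <vector>

// A hypothetical element type: we want to order Task pointers by
// their priority field, not by the numerical value of the pointers.
struct Task {
  std::string name;
  int priority;
  Task(const std::string & n, int p) : name(n), priority(p) {}
};

// The comparison class overloads operator(), taking two const
// references to the element type (here, Task *) and returning a bool.
class TaskPtrLess {
public:
  bool operator()(Task * const & a, Task * const & b) const {
    return a->priority < b->priority;   // compare the pointed-to data
  }
};
```

We would then declare the queue as std::priority_queue<Task *, std::vector<Task *>, TaskPtrLess>, and top() would yield the pointer to the highest-priority Task.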
24.4 Priority Queues Use: Compression
As we mentioned at the start of this chapter, one use of priority
queues is Huffman coding, which is a compression algorithm. A
compression algorithm takes input data and re-encodes it in a
particular way in the hopes of reducing how many bytes are
required to represent the data. Reducing the number of bytes
needed to encode the data is helpful in terms of making it take
less storage space or requiring less time to transmit it across a
network.
One of the key principles behind how Huffman coding
works is that the frequency of symbols in the input will typically
not be uniform. In the case of English text, this principle
basically boils down to the fact that not all characters are equally
common. If we were to look at the frequency of letters in
writing, we would find some characters are incredibly common,
while others are incredibly uncommon. As an example of the
uneven distribution, Table 24.1 shows the frequency of each
character that appears in the LaTeX source of this book at the
time we started writing this section.
(space) 216098 p 26553 ( 4634 N 1220 L 709 6 262
e 136232 f 23437 ) 4597 j 1212 P 676 | 254
t 108288 g 20338 : 3601 2 1197 R 661 G 252
i 86176 w 18950 T 3058 = 1194 4 609 9 226
n 79760 y 17496 ’ 2526 $ 1181 [ 603 ! 223
a 77971 b 16495 I 2173 O 1150 ] 599 8 201
o 77099 \ 13037 A 1880 z 1075 & 587 X 197
s 69647 , 11935 C 1875 ; 994 D 549 7 183
r 67007 v 11419 / 1629 E 979 5 508 # 170
l 51829 . 11190 ‘ 1618 M 964 > 499 ? 162
h 47753 + 10354 0 1586 % 964 V 438 ^ 138
c 39847 - 7224 S 1568 _ 924 Y 405 Q 126
d 32436 { 7194 1 1496 B 822 U 397 J 109
u 30664 } 7183 F 1489 H 743 ~ 382 K 69
\n 29161 x 5845 W 1324 3 739 < 368 @ 23
m 27910 k 5249 q 1302 * 720 " 361 Z 20
Table 24.1: Example frequency distribution from English text. This uneven
distribution makes text easy to compress and (as we discussed in Section 9.1)
makes simple cryptographic systems easy to break.
Notice how the most common characters are quite common.
The five most frequent characters (space, ‘e’, ‘t’, ‘i’ and ‘n’)
account for 43% of all characters, while the 32 least common
characters (the last two columns) account for less than 1%
combined. Without compression, each of these characters
occupies exactly 1 byte (8 bits). However, we could do better if
we encoded the more common characters with fewer bits at the
expense of encoding the less common characters with more bits.
Of course, we need to figure out exactly what bit patterns
we want to use to encode each symbol—and that choice is going
to depend on the input we are compressing. While the characters
of this text may obey this distribution, text in another language
will have another distribution, and non-text files will have
completely different distributions. We also need to be able to
decompress the data. If we choose to represent ‘t’ with 10, ‘e’
with 01, and ‘m’ with 1001, we will be unable to tell the
difference between “te” and “m”.
Huffman coding gives an algorithm to find the optimal
encoding where no symbol’s encoding is a prefix of another’s—
i.e. the encoding of one symbol never appears as the first part of
any other symbol’s encoding. Because no symbol is a prefix of
another, we can decode by just reading the input, and every time
we see the encoding of a symbol, we output that symbol.
The algorithm centers around building a binary tree (not a
binary search tree) whose leaf nodes correspond to input
symbols. Once the tree is built (which is where our priority
queue comes into play), we can determine the encoding for each
symbol by examining the path from the root of the tree to the
leaf node with that symbol. As we follow the path from the root
to the leaf, we write down a 0 each time we go left and a 1 each
time we go right. Notice how this scheme ensures that the
encoding is guaranteed to not generate the prefix of any other
symbol’s encoding (that symbol would need to follow the same
path, but then go further—however, the symbol itself is a leaf, so
there is nothing further). Of course, re-finding the path for each
symbol as we read the input is inefficient, so we would want to
create a map from symbols to their encoding one time (right
after building the tree), then look up each symbol in the map as
we read the input symbols and write out the compressed data.
The tree is built by creating a priority queue filled with tree
nodes, ordered by the frequency of each node, such that the
smallest frequency is the highest priority. Initially, we add one
leaf node per input symbol. We then dequeue the two highest
priority nodes (which have the lowest frequency) and make them
the children of a new node. We set the frequency of this new
node to be the sum of the frequencies of its children and then
enqueue it into the priority queue. This process repeats until
there is a single node in the priority queue. At that point, the
remaining node is dequeued and is the root of the encoding tree.
Video 24.3: An example of Huffman coding with
five symbols.
Video 24.3 demonstrates the process with 5 input symbols.
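The tree-building step described above can be sketched in code. This is one possible implementation under our own naming (HuffNode, FreqGreater, buildHuffmanTree are not the book's names), using the STL priority queue with a reversed comparison so the smallest frequency is the highest priority:

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Leaves hold a symbol; internal nodes hold only a combined frequency.
struct HuffNode {
  char symbol;   // meaningful only for leaf nodes
  size_t freq;
  HuffNode * left;
  HuffNode * right;
  HuffNode(char s, size_t f) : symbol(s), freq(f), left(NULL), right(NULL) {}
  HuffNode(HuffNode * l, HuffNode * r)
    : symbol('\0'), freq(l->freq + r->freq), left(l), right(r) {}
};

// Reverse the comparison: priority_queue is a max-heap, but we want
// the SMALLEST frequency to be the highest priority.
class FreqGreater {
public:
  bool operator()(HuffNode * const & a, HuffNode * const & b) const {
    return a->freq > b->freq;
  }
};

// Repeatedly combine the two lowest-frequency nodes until one
// node remains: the root of the encoding tree. (Freeing the tree
// is omitted in this sketch.)
HuffNode * buildHuffmanTree(const std::vector<HuffNode *> & leaves) {
  std::priority_queue<HuffNode *, std::vector<HuffNode *>, FreqGreater> pq;
  for (size_t i = 0; i < leaves.size(); i++) {
    pq.push(leaves[i]);
  }
  while (pq.size() > 1) {
    HuffNode * a = pq.top(); pq.pop();  // two lowest-frequency nodes
    HuffNode * b = pq.top(); pq.pop();
    pq.push(new HuffNode(a, b));        // parent with summed frequency
  }
  HuffNode * root = pq.top();
  pq.pop();
  return root;
}
```

After building the tree, one would walk it once (0 for left, 1 for right) to build the symbol-to-encoding map described earlier.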
24.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 24.1 : Add the following items to an empty
min-heap (in this order) and draw the resulting heap as a
tree and as an array: 100, 50, 30, 200, 300, 250, 40, 1,
999
• Question 24.2 : Remove the minimum element twice
from the min-heap you created in the previous problem.
Draw the heap after both removals as a tree and as an
array
• Question 24.3 : Add the following items to an empty
max-heap (in this order) and draw the resulting heap as a
tree and as an array: 100, 50, 30, 200, 300, 250, 40, 1,
999
• Question 24.4 : Remove the maximum element twice
from the max-heap you created in the previous problem.
Draw the heap after both removals as a tree and as an
array
• Question 24.5 : Implement a min-heap with insert,
removeMin and peek operations. Use a std::vector to
store the elements.
Chapter 25
Graphs
A graph—which, you may recall from Section 22.1.1, is a collection of
nodes (also called vertices) and edges—is an incredibly versatile data
structure. We previously introduced the concept of a graph in the context of
trees, which are a specific type of graph. However, we are now going to
look at the broader class of all graphs and see how they can be put to use in
a variety of applications, as well as some common algorithms you should
be familiar with. We note that while we are going to cover several major
topics, there is a lot more that you can (and should, if you want to be a
serious programmer) learn about graphs.
As we examine more general graphs, there are a few variations we can
use, depending on our purposes. One variation is that we might put weights
on the edges—indicating some cost or other metric of the connection
between the nodes (we could attach other information to the edges, but
weights are most common). For some purposes, we may use undirected
edges, while other purposes may require directed edges.
25.1 Graph Applications
We start our study of graphs by examining a few of their uses.
25.1.1 Task Scheduling
Suppose we have tasks that we want to schedule with a variety of resources
that can execute certain types of tasks. These tasks may have dependencies
between them (we cannot start task B until task A completes), although
some tasks are independent of each other (we can do task C in parallel
with task D). We can assign a latency to each task, specifying how long we expect it to
take. Our goal in scheduling these tasks would then be to complete them all
in the smallest amount of time possible.
Figure 25.1: Left: a task dependency graph. Edge weights (numbers on the edges) indicate the
length of each task. An edge from one task to another indicates a dependence—the first task must
be completed before the second task may begin. Right: a valid schedule of the tasks.
As an example, consider the task dependency graph shown on the left
of Figure 25.1. In this example, the nodes represent tasks (which are part of
preparing a meal with dessert), and the edges represent dependencies
between tasks—the source of an edge must be completed before the
destination of the edge may begin. The edges are labeled with weights,
which indicate how long (in minutes) each task takes. Each node is colored
according to what resource is required—blue for the chef, pink for the
oven, and gray if no resource constraints are imposed.
We can then use the dependency graph to generate a valid schedule—
one in which no tasks are done out of order, and all tasks have the
resources they require—such as the one shown on the right of Figure 25.1.
We may wonder whether or not we can do better—meaning finish all tasks
in less total time—than this schedule. We can easily see that we cannot by
observing that the path through the graph on the “bread making” nodes has
length (total edge weight) 165 minutes, so we cannot possibly do better
than 165 minutes, which is the length of the schedule we have come up
with.
Because the “bread making” path constrains how quickly we can
complete our entire work, we call it the critical path. If this path were
made longer (e.g., our bread takes longer to bake than we expected), then
our entire set of jobs would take that much longer. If this path were made
shorter (e.g., we had a friend help make the bread dough, or we rushed the
rising time), then we might reduce the total time by the same amount. If we
reduce the length of this particular path enough, then it becomes non-
critical, and some other path becomes critical. We should also note that
tasks that are not on the critical path have some slack—we could start them
later without affecting the overall completion time. If we wanted to take an
hour break before sauteing the chicken, we could do so without finishing
the overall meal preparation later.
This form of task scheduling is not limited to meal planning. If we
were managing a large project or the logistics of a significant operation, we
may want to perform a similar scheduling analysis. If we were writing a
compiler, part of the optimizer (which transforms the generated instructions
to make them faster) would be the instruction scheduler, which would
compute a dependency graph of the instructions, weight their edges with
each instruction’s latency (different instructions take different amounts of
time), and then schedule the instructions to minimize the time they take for
the execution resources in the target hardware.
In these more general cases, our scheduling graph may be a directed
acyclic graph (DAG)—meaning that it contains no directed cycles,
although it may contain undirected cycles. A directed cycle in a task graph
would make it impossible to schedule, as it would indicate circular
dependencies in the tasks—we would need to complete task A before B,
B before C, and C before A—none could ever be scheduled first.
However, an undirected cycle (meaning that there is a cycle if we ignore
the edge directions, but not if we pay attention to them) is permitted. It
would be perfectly fine for B and C to depend on A and also for D to
depend on both B and C.
25.1.2 Resource Allocation
Another use of graphs is in resource allocation problems. Suppose we have
some set of resources (e.g., parking spaces) and some set of users of those
resources (e.g., employees). The same resource cannot be used in two
different ways at the same time; however, if two resource users need the
resource at disjoint times, they can be assigned the same resource (e.g., if
two employees work non-overlapping schedules, they could be assigned
the same parking space). When faced with such a resource allocation
problem, we can solve it—meaning determine an assignment of users to
resources such that there are no conflicts—using graph coloring.
Graph coloring is the process of taking a graph and assigning colors to
each node such that no two adjacent nodes have the same color assigned to
them. In the case of resource allocation, we create an interference graph—
a graph in which two nodes are connected by an edge if they conflict with
each other (i.e., require a resource at the same time). In such a graph, a
valid coloring represents a valid allocation of resources, in which each
color represents a particular resource.
Figure 25.2: Left: Times at which each person needs a room for devious plotting. Right: The
colored interference graph.
As an example of graph coloring, eight characters from Hamlet each
need a room for devious plotting (since that is what Shakespearean
characters do). However, the castle only has four rooms available—the
blue room, the green room, the orange room, and the purple room. Each
character has a very specific schedule in which they need rooms for
plotting, shown on the left of Figure 25.2, and needs to be assigned the
same room for all of their plotting. Of course, no two characters can plot in
the same room at the same time. How can these eight characters have
rooms assigned to them?
We can construct the interference graph (shown on the right, with
each node labeled by the first letter of the character’s name) by drawing
edges between the nodes of characters who want to plot at the same time.
For example, Hamlet has conflicts with Ophelia (both want to plot at 3:00
and 4:00), Rosencrantz (both want to plot at 3:00), Polonious (both want to
plot at 4:00), Guildenstern (both want to plot at 4:00), and Yorick (both
want to plot at 7:00). As Hamlet conflicts with each of these other
characters, his node in the graph has an edge connecting it to their nodes.
However, Hamlet’s plotting schedule does not overlap with Fortinbras or
Laertes, so they could be assigned to the same room—thus they are not
adjacent in the graph.
The interference graph on the right of Figure 25.2 is colored such that
no two adjacent nodes have the same color (as the rules of graph coloring
call for). We can use these colors as the room assignment, and no two
characters will have conflicts: Hamlet and Fortinbras are assigned the
green room, Yorick and Polonious are assigned the orange room,
Rosencrantz and Guildenstern are assigned the blue room, and Ophelia and
Laertes are assigned the purple room. We will note that this graph is
planar, meaning we can draw it on a piece of paper with no edges crossing
each other. There is a famous theorem, the four color theorem, which states
that every planar graph can be colored with at most four colors. We will
also note that this graph cannot possibly be colored with three colors—
Hamlet, Polonious, Guildenstern, and Ophelia all require distinct rooms.
Graph coloring is not limited to toy examples; rather, it is useful for
solving a variety of real resource allocation problems. For example,
compilers need to assign variables to machine registers (which are
basically the hardware equivalent of variables, but there are a limited
number of them). This problem is also often solved with graph coloring, in
which the variables are the nodes, the registers are the colors, and two
nodes conflict if their variables simultaneously have a value that might be
needed (as opposed to a value that will never be used again, in which case
it may be safely overwritten).
Note that graph coloring (and many other interesting graph problems)
is NP-complete. Recall from Section 20.1.2 that NP-complete problems are
ones where the best known solution has exponential time (O(2^n)), and
that solving any NP-complete problem in polynomial time would provide a
polynomial time solution to all NP-complete problems. There are, however,
polynomial time approximations—algorithms that give reasonably good
answers. In the case of graph coloring, an approximation would mean we
color the graph according to the rules, but may use more colors than would
be ideally needed (maybe we 27-color a graph that could be 25-colored).
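One common polynomial-time approximation (greedy coloring, which the text does not name specifically) assigns each node the lowest-numbered color not already used by its neighbors. A minimal sketch, with our own function name:

```cpp
#include <cstddef>
#include <vector>

// Greedy graph coloring: 'adj' is an adjacency list where adj[i]
// holds the neighbors of node i. Returns one color (0, 1, 2, ...)
// per node such that adjacent nodes differ. May use more colors
// than the true minimum, but runs in polynomial time.
std::vector<int> greedyColor(const std::vector<std::vector<size_t> > & adj) {
  std::vector<int> color(adj.size(), -1);   // -1 means "not yet colored"
  for (size_t n = 0; n < adj.size(); n++) {
    std::vector<bool> used(adj.size(), false);
    for (size_t j = 0; j < adj[n].size(); j++) {
      int c = color[adj[n][j]];
      if (c >= 0) {
        used[c] = true;   // a neighbor already has this color
      }
    }
    int c = 0;
    while (used[c]) {     // lowest color no neighbor uses
      c++;
    }
    color[n] = c;
  }
  return color;
}
```

On the Hamlet example above, a greedy pass would produce a valid room assignment, though the number of rooms it uses can depend on the order in which nodes are visited.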
25.1.3 Path Planning
Another use of graphs is in finding a path from one location to another.
One ubiquitous instance of this application is in map software, where a
user asks for directions from one place to another. The map software tracks
locations as nodes and roads as edges, with weights indicating how long it
takes to traverse that segment of road. One-way streets can be represented
with directed edges, while two-way streets can be represented as
undirected edges. The map software then runs an algorithm to find a path
from the origin location to the destination location. In Section 25.3.4, we
will see Dijkstra’s shortest path algorithm, which the mapping software
might use to find the path that minimizes travel time.
We could use similar algorithms for other path planning applications.
If we wanted to program something to move in an intelligent way
from one place to another—whether a robot, or a character in a game—we
would want to use graph algorithms. In such situations, we could represent
information about how difficult terrain is to traverse with edge weights, so
that we can find the best path under whatever criteria are important.
25.1.4 Social Networks
Another popular use of graphs (which is relatively recent—having emerged
in roughly the past decade) is social networks. In these applications of
graphs, people are nodes, and edges represent a “friendship” relationship.
Various algorithms can be applied to the graph data structure, either to
provide features to the users, or to analyze the data to enhance advertising
revenue. For example, the system might suggest friends for a user by
finding the set of people who are not already friends but are reachable by
traversing one or two other nodes. If there are multiple distinct paths
between a pair, the chances of them knowing each other go up, as they
have more friends in common.
25.2 Graph Implementations
The basic functionality that we want in a graph is to add or remove nodes
(likely with some information associated with them), add or remove edges
between nodes (possibly with weight information), and query information
about the graph (e.g., get the set of all nodes, get the set of nodes adjacent
to a particular node, test if two nodes are adjacent). If we have weighted
edges, we would also want the ability to set (and possibly later adjust) the
weights. We would also want to not just be able to find out whether two
nodes are connected, but also what the weight of the edge between them is
if they are connected. If we are doing a directed graph, then an edge from
A to B is distinct from an edge from B to A.
We could define our graph’s interface in a variety of ways. The
simplest would be to just have each node identified by an unsigned integer
and to use a double for the weight (or an integer, or have no weight
information). The interface for such a graph might look like this:
class Graph {
public:
  unsigned addNode();
  void removeNode(unsigned whichNode);
  void addEdge(unsigned from, unsigned to, double weight);
  void removeEdge(unsigned from, unsigned to);
  unsigned getNodeCount() const;
  set<pair<unsigned, double> > getAdjacencies(unsigned whichNode) const;
  double getEdge(unsigned from, unsigned to) const;
  bool isAdjacent(unsigned from, unsigned to) const;
};
We might, however, want to associate more information with each
node and/or with each edge—possibly, even making the graph generic in
what that information is. We could also make our graph generic in whether
it is directed or not (as the differences in the implementation are rather
minor). Our more general interface would look fairly similar:
template<typename N, typename E, bool directed>
class Graph {
public:
  void addNode(const N & nodeInfo);
  void removeNode(const N & nodeInfo);
  void addEdge(const N & fromNode, const N & toNode, const E & edgeInfo);
  void removeEdge(const N & fromNode, const N & toNode);
  set<const N &> getNodes() const;
  set<pair<const N &, E &> > getAdjacencies(const N & whichNode) const;
  const E & getEdge(const N & fromNode, const N & toNode) const;
  bool isAdjacent(const N & fromNode, const N & toNode) const;
};
Note that most of the changes replaced the unsigned ints that were
previously used to identify nodes with const N &s and the doubles that
previously gave edge weights with E &s. We also replaced the
getNodeCount method with one that returns a set of all nodes.
We may also prefer to make our graph have a more “OO” interface.
We could make separate classes for Nodes and Edges and operate on those
objects. In such a design, functionality such as getAdjacencies would be
placed inside the Node class (and operate on this Node, instead of taking a
Node as a parameter). Ultimately, how you design the interface—including
how generic you make it, and whether you create explicit Node/Edge
objects—depends on what you plan to use the graph for.
We now consider two ways we can implement the graph.
25.2.1 Adjacency Matrix
One way that we can implement a graph is an adjacency matrix—a two-
dimensional array (or vector of vectors) in which there is a row and a
column for each node. The information stored in each element of the
matrix indicates whether or not there is an edge (as well as any information
pertaining to that edge, if one exists) from the “row” node (i.e. the node
corresponding to the row that the information is in) to the “column” node.
Tracking edge information in a generic way is a bit annoying, as there
is no value of type E that we always know can be used for “no edge.”
However, we have a couple options—we could keep one matrix of bools
that indicates if an edge exists and a separate matrix that indicates the edge
information (or a matrix of pairs of bools and Es). We could then hold
default-constructed Es in the information matrix, even when an edge does
not exist, and simply ignore them. Another option would be to hold
pointers to Es, and use NULL to indicate that no such item exists.
If we do not need to implement a graph that is generic in E but rather
can just have a specific type for edges (e.g., double), we can pick a
particular value (e.g., positive infinity) to indicate that no edge exists.
Once we have ironed out that implementation detail, implementing the
operations is rather straightforward. Adding a node is a matter of
expanding the matrix in both directions. If we use a vector of vectors, we
can add a new “row” by creating a vector filled to the appropriate size with
“no edge” information and adding it to our vector. We can then iterate over
all the “rows” and add a “column” by adding “no edge” to the end of the
vector. Removing a node is a matter of removing both the row and column
corresponding to it by removing from the right place in the vectors. We do,
however, need a way to map Ns to their indices in the vector. If N is just an
unsigned int (and the node IDs are contiguous), then this is trivial.
Otherwise, we might keep a map from Ns to unsigned ints.
Figure 25.3: Left: adjacency matrix representation of a graph. Right: conceptual representation
of the same graph.
Figure 25.3 shows an adjacency matrix representation (left) of a
graph (conceptually depicted on the right). Here, our nodes are identified
by integers (0–5), and the weights on the edges are numbers, with ∞
indicating no connection. If you look at a particular edge in the conceptual
depiction (for example the edge of weight 9 from node 3 to node 4), you
can find it in the adjacency matrix by examining the row corresponding to
the source node (e.g., row 3), then looking in the index corresponding to
the destination node (e.g., column 4), where you will find the weight of
that edge (e.g., 9).
The parts of the diagram drawn with blue dashed lines show what
happens when another node (node 5) is added to the graph. Here, we need
to add a new row (the fifth row, shown at the bottom of the figure), which
involves creating the vector (the actual items of that row), filling them with
∞, and then adding it to the vector of vectors (putting it in the box with the
arrow in the lower left corner). We also must add a new column to each
existing row (the blue boxes down the right side).
Adding and removing edges is a matter of adjusting the edge
information in the right row/column corresponding to the nodes passed in.
Likewise, checking if two nodes are adjacent and getting the edge
information (if they are) is a matter of indexing the vectors at the
appropriate place and returning that information. Getting all the nodes is a
matter of taking the keys from the Node to int map and putting them into a
set. Likewise, getting the adjacencies for a particular node is a matter of
iterating across the proper vector and putting the information into a set.
If our graph is undirected, we would make sure to add/remove the
opposite direction edge every time we manipulate the edge information. If
the graph is directed, we do not have to do anything special.
25.2.2 Adjacency List
Another way to implement the graph is to keep a map from each node to
the information about what nodes are adjacent to it—called an adjacency
list. This technique gets its name from an implementation in which that
information is stored as a linked list (e.g., a linked list of pair<N, E>s).
Such a design would mean our graph is implemented primarily as a
map<N, list<pair<N, E> > >. However, we could use a more efficient
implementation than a linked list—such as a map<N, map<N, E> >. Here,
we map each node (i.e., the source node) to a mapping from nodes (i.e., the
destination node) to edge information. If we can find edge information in
the second map, then the nodes are adjacent (with the information we
found describing the edge); otherwise, they are not adjacent (at least not in
that direction).
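A minimal sketch of this map-of-maps design might look like the following; the string node names and double edge weights are illustrative choices, not fixed by the discussion above:

```cpp
#include <cassert>
#include <map>
#include <string>

// A sketch of the map-of-maps adjacency list: each source node maps to a
// map from destination nodes to edge information.
class ListGraph {
  std::map<std::string, std::map<std::string, double>> adj;

public:
  void addNode(const std::string &n) { adj[n]; } // empty destination map
  void addEdge(const std::string &from, const std::string &to, double w) {
    adj[from][to] = w;
    adj[to]; // ensure the destination exists as a node too
  }
  bool isAdjacent(const std::string &from, const std::string &to) const {
    auto it = adj.find(from);
    return it != adj.end() && it->second.count(to) > 0;
  }
  // Removing a node: erase its "source" map, then erase it from every
  // other node's "destination" map.
  void removeNode(const std::string &n) {
    adj.erase(n);
    for (auto &p : adj) {
      p.second.erase(n);
    }
  }
};
```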
Figure 25.4: Left: adjacency list representation of a graph. Right: conceptual representation of
the same graph.
To illustrate this implementation of a graph, Figure 25.4 shows the
same graph as in Figure 25.3 (redrawn on the right) but with a map of maps
representation on the left. These maps could be implemented with lists,
trees, or hash tables, as needed—the basic ideas are the same. Accordingly,
we draw them as generic/conceptual key/value mappings. In this
representation, finding an edge involves looking up the source vertex in the
map shown on the left—which yields another map; then we would look
up the destination node in this second map. For example, if we wanted to
find the edge from 3 to 4, we would look in this map for key 3 and find a
value that is itself a map (the one with two keys: 1 and 4). We would then
look in this second map using the destination node as the key. If that key is
present, the edge exists, and the value we obtain has the edge information
(in the case of node 4, we find a weight of 9).
As with Figure 25.3, we show the changes required to add node 5 to
the graph. Here, we just add the key 5 with a value of an empty map to the
left map in the figure—5 is a valid node, but it is not the source of any
edges. Removing a node may be a bit complex, as we need to not only
remove it from the “source” map, but also remove any occurrences of it
from the “destination” maps of all other nodes (e.g., to remove 2 from the
graph, we would need to remove it from the destination maps of 0, 1, and 4).
Manipulating edges is quite straightforward and efficient as we only need
to perform the corresponding map operations.
25.2.3 Efficiency
Adjacency Adjacency List with…
Matrix Linked List Balanced Tree Hash table
space
addNode
removeNode
addEdge
removeEdge
getNodes
getAdjacencies
getEdge
isAdjacent
Table 25.1: Asymptotic (Big-Oh) space and time efficiency of adjacency matrix and adjacency
list implementations. |V| is the number of vertices (nodes), and |E| is the number of edges.
The getNodes and getAdjacencies rows assume that you have to explicitly create a new set for the
answer (you do not just have an existing one to return).
With multiple implementation options, a natural question would be
what the efficiency tradeoffs are in choosing one implementation over
another. As we already have familiarity with the efficiencies of all the
structures involved, we can analyze this problem in terms of what
operations we need to conduct on vectors, lists, trees, or hash tables. In
performing this analysis, we will consider two aspects of the problem size:
the number of vertices (|V|) and the number of edges (|E|). We note that
|E| ≤ |V| · (|V| − 1)—that is, we will have at most |V| · (|V| − 1) edges to
represent, as that would be an edge in each direction between every pair of
vertices. However, most graphs will have significantly fewer edges, as
most pairs of vertices will not be adjacent to each other—thus considering
|E| separately (rather than as just O(|V|²)) gives us a more descriptive bound.
Table 25.1 shows four implementations of the graph: an adjacency
matrix and three adjacency list implementations (linked lists, balanced
binary search trees, and hash tables). For the adjacency list
implementations, we assume all maps involved use the same
implementation. For the adjacency matrix representation, we assume that
the vectors or arrays have their capacities doubled whenever they are
resized, so that they have amortized O(1) behavior when adding
elements (and we present the amortized behavior in the table). We note that
for the binary search tree and hash table implementations, we present the
complexity of getAdjacencies in terms of |A|—meaning the number
of elements in the answer set. It is clear that |A| ≤ |V|; however,
on large graphs, it is likely to be much smaller in practice.
The hash table implementation has the best asymptotic behavior (Big
Oh) in each category, so why might we want to use anything else?
Remember that Big Oh does not tell the whole story. The constant factors
(which Big Oh discards) are likely much better for an adjacency matrix. As
always, we should think carefully about what we are doing in choosing our
design. If our program will create a graph once, then never modify it again,
while performing billions of isAdjacent operations, we may see benefits
from the lower constant factors of an adjacency matrix.
25.3 Graph Searches
One common problem in a graph is to search for a path from one node to
another. In such a problem, we may either want to just know whether or not
a path exists (i.e., have a true/false answer), or actually return the path (i.e.,
the answer is a data structure that is a list of nodes that can be followed
from the source to the destination). In either case, the same basic
algorithms can be used. However, there are a variety of ways we can search
the graphs, which may give different paths. If we care about the properties
of the path (and not just that it exists), we need to choose our algorithm
carefully.
25.3.1 Depth-First Search: Recursively
One way we can search a graph, called a depth-first search (DFS), is
analogous to being in a maze. Suppose you were in a maze, trying to find
your way from where you are to an exit. You might try to find your way out
of the maze using an algorithm that looks like this:
searchForExit(room) {
if (room is the exit) {
return the path [room]
}
for each passageway leaving the room {
go down the passageway to a new room R
if (searchForExit(R) yields a path P) {
return the path room + P
}
}
return no valid path
}
Note that this algorithm is recursive: we can find our way out of the
maze by going down a hallway, then finding a path to the exit from
whatever room we end up in. If that room is the exit, then we have
succeeded (base case) and we return a path that is just the exit room (or, if
we are really lost in a maze, we just leave the maze). If we exhaust all
passageways leaving the room without success, there is no valid path from
this room to the exit.
Video 25.1: Testing our algorithm for finding our way out of
a maze (and finding that it does not work).
This approach sounds nice in theory, but it has a significant problem,
which would come up in testing our steps (Step 4 of our programming
process). Video 25.1 shows what happens when we test our algorithm. Our
algorithm ends up in infinite recursion, exploring the passageway between
two rooms repeatedly—in our maze conceptualization, we walk into a
room, then try the passageway we just came out of. We need to make our
algorithm “smarter” by keeping track of which rooms we have already
visited and not re-exploring from them.
In particular, we can adjust our algorithm to keep track of a set of
rooms we have already visited. If we have already visited a room, then we
return “no path” without further exploration. If we are in the middle of
exploring paths from that room (e.g., we went from A to B to C and back to
A), but we need some other passageway leading from the room (e.g., A
to D), then we will explore that passageway when we return back to room
A after fully exploring B and C. Our modified algorithm might look
like this:
searchForExit(room, visited) {
if (room is the exit) {
return the path [room]
}
if (room is in visited) {
return no valid path
}
visited = visited + room
for each passageway leaving the room {
go down the passageway to a new room R
if (searchForExit(R, visited) yields a path P) {
return the path room + P
}
}
return no valid path
}
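This pseudocode might translate to C++ roughly as follows. This is a sketch under assumed types (an adjacency list of int room IDs); note that it builds the path in reverse (exit first), which the caller can reverse afterwards:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Illustrative graph type: each room maps to the set of rooms its
// passageways lead to.
typedef std::map<int, std::set<int>> Graph;

// Recursive DFS following the pseudocode above. Returns true if a path was
// found; 'path' then holds the nodes in reverse order (exit first).
bool searchForExit(const Graph &g, int room, int exit, std::set<int> &visited,
                   std::vector<int> &path) {
  if (room == exit) {
    path.push_back(room); // base case: the path is just [room]
    return true;
  }
  if (visited.count(room) > 0) {
    return false; // already explored from here: no valid path this way
  }
  visited.insert(room);
  auto it = g.find(room);
  if (it != g.end()) {
    for (int r : it->second) { // each passageway leaving the room
      if (searchForExit(g, r, exit, visited, path)) {
        path.push_back(room); // append room: path is built exit-to-start
        return true;
      }
    }
  }
  return false; // exhausted all passageways: no valid path
}
```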
If you recall from Section 7.6, to ensure recursion terminates, we need
to be able to “measure” our function—compute some metric on its
arguments that decreases with every recursive call. Our original algorithm
clearly has no such measure (as it does not terminate). At first glance, our
modified algorithm may seem to suffer from lack of a measure, as the
visited set is growing, but we need a shrinking measure. However, if we
consider the set of nodes in the graph but not in the visited set (i.e., the set
of nodes yet-to-be explored), we find that it decreases in size with each
recursive call.
This algorithm is called “depth-first search” because we fully explore
one path before exploring any other path. A consequence of this approach
is that you may end up with a much longer path than you need—the DFS
may find a long, meandering path to get from A to B, when B may be
directly adjacent to A (but just not the first adjacency explored). However,
DFS has a variety of other applications, such as finding strongly connected
components and topological sort (both of which we mention briefly at the
end of this chapter).
25.3.2 Depth-First Search: Explicit Stack
Instead of using recursion, we could use iteration with an explicit stack. In
this approach, we build up paths and push them onto a stack. The LIFO
nature of the stack means we will continue exploring fully down one path
before considering different options closer to the start. Such an approach
looks generally like this:
dfs(from, to) {
Stack todo;
Set visited
todo.push(the path [from])
as long as todo is not empty {
currentPath = todo.pop()
currentNode = currentPath.lastNode()
if (currentNode == to) {
return currentPath
}
if (currentNode is not in visited) {
visited.add(currentNode)
for each (x adjacent to currentNode) {
todo.push(currentPath.addNode(x))
}
}
}
return no valid path exists
}
This approach to DFS merits a bit of discussion for two reasons. First,
as we shall see shortly, the same algorithmic structure can be used for other
searching techniques by replacing the stack with a queue or priority queue.
Second, this algorithm is an example of a larger class of algorithms, known
as worklist algorithms, which keep a list of items to work on, take an item
from the list, process it, and in the process possibly generate new work
(which is then added to the worklist).
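In C++, this worklist DFS might be sketched as follows, again assuming an adjacency list of int node IDs (an illustrative choice):

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stack>
#include <vector>

typedef std::map<int, std::set<int>> Graph;

// Worklist DFS from the pseudocode above. Each worklist entry is a whole
// path (a vector of node IDs); its last element is the node to explore next.
std::vector<int> dfs(const Graph &g, int from, int to) {
  std::stack<std::vector<int>> todo;
  std::set<int> visited;
  todo.push(std::vector<int>(1, from)); // the path [from]
  while (!todo.empty()) {
    std::vector<int> currentPath = todo.top();
    todo.pop();
    int currentNode = currentPath.back();
    if (currentNode == to) {
      return currentPath;
    }
    if (visited.insert(currentNode).second) { // not visited before
      auto it = g.find(currentNode);
      if (it != g.end()) {
        for (int x : it->second) { // generate new work: one longer path each
          std::vector<int> next = currentPath;
          next.push_back(x);
          todo.push(next);
        }
      }
    }
  }
  return std::vector<int>(); // empty path: no valid path exists
}
```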
Video 25.2: Execution of the DFS algorithm.
25.3.3 Breadth-First Search
While depth-first search explores all the way down one path before trying
another, breadth-first search (BFS) explores outwards from the source. A
good way to conceptualize breadth-first search is to imagine pouring water
into the graph at the starting node—it will spread out evenly along all
paths, reaching all nodes that are distance 1 away, then all nodes that are
distance 2 away, and so on. Breadth-first search will result in the path with
the fewest “hops” (that is, the one that traverses the smallest number of
other nodes) but not necessarily the shortest total path weight.
We can modify our DFS algorithm to do BFS instead by simply
changing the worklist from a stack to a queue:
bfs(from, to) {
Queue todo; //only change Stack -> Queue
Set visited
todo.push(the path [from])
as long as todo is not empty {
currentPath = todo.pop()
currentNode = currentPath.lastNode()
if (currentNode == to) {
return currentPath
}
if (currentNode is not in visited) {
visited.add(currentNode)
for each (x adjacent to currentNode) {
todo.push(currentPath.addNode(x))
}
}
}
return no valid path exists
}
If our queue’s enqueue and dequeue operations are named differently
than the stack’s push and pop operations, we would need to modify the
names of those functions as well. As you may recall from Section 20.3,
STL names the operations the same (they are push and pop for both stacks
and queues), so that the two structures have the same interface. This
similarity in the interfaces means that we could write one implementation
for either search and template it over what type of data structure we want to
use as our work list. That is, we might adjust our pseudo-code to be:
template<typename WorkList>
search(from, to) {
WorkList todo; //generic in WorkList type
Set visited
todo.push(the path [from])
as long as todo is not empty {
currentPath = todo.pop()
currentNode = currentPath.lastNode()
if (currentNode == to) {
return currentPath
}
if (currentNode is not in visited) {
visited.add(currentNode)
for each (x adjacent to currentNode) {
todo.push(currentPath.addNode(x))
}
}
}
return no valid path exists
}
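One wrinkle in translating this to real C++ is that while std::stack and std::queue share push and pop, std::stack exposes its next element via top() while std::queue uses front(). A sketch can paper over that difference with a small overloaded helper; the Graph and Path types below are assumptions for illustration:

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <stack>
#include <vector>

typedef std::map<int, std::set<int>> Graph;
typedef std::vector<int> Path;

// Remove and return the next worklist entry (top() for stacks, front()
// for queues).
Path takeNext(std::stack<Path> &w) { Path p = w.top(); w.pop(); return p; }
Path takeNext(std::queue<Path> &w) { Path p = w.front(); w.pop(); return p; }

// One search templated over the worklist type: a stack gives DFS, a queue
// gives BFS.
template <typename WorkList>
Path search(const Graph &g, int from, int to) {
  WorkList todo;
  std::set<int> visited;
  todo.push(Path(1, from)); // the path [from]
  while (!todo.empty()) {
    Path currentPath = takeNext(todo);
    int currentNode = currentPath.back();
    if (currentNode == to) {
      return currentPath;
    }
    if (visited.insert(currentNode).second) {
      auto it = g.find(currentNode);
      if (it != g.end()) {
        for (int x : it->second) {
          Path next = currentPath;
          next.push_back(x);
          todo.push(next);
        }
      }
    }
  }
  return Path(); // no valid path
}
```

Calling search<std::stack<Path>>(g, from, to) performs DFS, and search<std::queue<Path>>(g, from, to) performs BFS.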
25.3.4 Dijkstra’s Shortest Path Algorithm
In many uses of a graph, we would prefer to not just find any path, but to
find the shortest possible path in particular. If we return to the map
example, our software’s users would be most annoyed if it used DFS to
find directions from one place to another—the resulting path would be
valid but might involve driving all over town. Fortunately, there are
algorithms that find the shortest path between two nodes, whenever it
exists. We are going to discuss Dijkstra’s shortest path algorithm, which
works whenever there are no edges of negative weight. If you have a
directed graph with negative weight edges, the Bellman-Ford algorithm
(which we will not go into here) can be used, as long as there are no cycles
with a net negative cost (in such a case, there is no shortest path, as any
path can be made shorter by traversing the negative cost cycle additional
times).
The original formulation of Dijkstra’s algorithm works by keeping a
table of the best distance discovered so far from the source node to each
possible destination. This table is initialized with all entries having ∞
(meaning there is no known path so far) except for the source node, which
is initialized to have distance 0 (as it can be reached from itself with no
cost). The algorithm then proceeds by selecting the node that is not yet
marked completed and has the smallest distance (choosing from not-yet-
completed nodes) in the table. The algorithm then updates the table based
on the adjacencies of this current node, as a new better distance may have
been discovered. In particular, if the current node is c, and the distance in
the table to it is d, then for each node n adjacent to c (with edge weight
w), the algorithm computes d + w (the distance to reach c plus the
distance to go from c to n), and if d + w is less than the current
value in the table for n, updates the table entry for n to be d + w.
Once all adjacencies are processed, the current node (c) is marked
completed.
The algorithm can either be used to find the shortest path from one
source to one destination—in which case it ends when the destination node
is selected as the current node—or to find the shortest paths from one
source to every possible destination, in which case it ends after all nodes
have been marked completed.
Figure 25.5: Illustration of Dijkstra’s Algorithm. The table of best distances is shown for each
step on the left as the algorithm searches for the shortest paths to each node, starting at the source node.
Figure 25.5 illustrates the behavior of Dijkstra’s algorithm for the
graph on the right side of the figure, starting from the source node. The left side of the
figure shows the table of best distances in each step of the algorithm (the
first state is pictured on the left, and the final state on the right). In this
table, the current node is shown with bold blue text, and completed nodes
are shown with light blue backgrounds. In the first time step (left), the best
distance (so far) to every node except the source is ∞, and the distance to the
source is 0. As 0 is smaller than ∞, the source is picked as the current node,
and its adjacencies are explored.
The second column shows the state after the source node is completed (notice its
background is light blue). Notice that its adjacencies have had their distances
updated to finite values. The not-yet-completed node with the smallest value in
the table is then selected as the current node, and its adjacencies are
explored. Note that an entry is updated only when the newly computed distance
is better than the current one; in particular, nothing can improve on the
source's distance of 0. In the third step, we discover that one node is
reachable in at most 9 (we will find that we can do better later). In a later
step, we improve that best distance to 8 by taking an indirect, two-edge path
instead, and later still we find an even better three-edge path to it.
Finally, we pick the last remaining node and have nothing left to update. At
this point, we have a table of the distance along the shortest path from the
source to each node in the graph.
An easier way to do Dijkstra’s algorithm is to use our generic search
algorithm from the previous section but use a priority queue for the
worklist. Here, the priority queue must be ordered by total path length,
such that the path with the shortest length is the highest priority.
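A sketch of this priority-queue formulation in C++ might look like the following. It computes best distances from one source to all reachable nodes; as a simplification, the worklist holds (distance, node) pairs rather than whole paths, and the weighted-adjacency-list type is an illustrative choice:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <utility>
#include <vector>

// Illustrative type: source node -> (destination node -> edge weight).
typedef std::map<int, std::map<int, double>> WGraph;

// Dijkstra's algorithm as a worklist search. std::priority_queue is a
// max-heap, so std::greater makes the smallest distance highest priority.
std::map<int, double> dijkstra(const WGraph &g, int from) {
  std::map<int, double> best; // completed nodes and their best distances
  std::priority_queue<std::pair<double, int>,
                      std::vector<std::pair<double, int>>,
                      std::greater<std::pair<double, int>>> todo;
  todo.push(std::make_pair(0.0, from)); // source reachable at cost 0
  while (!todo.empty()) {
    std::pair<double, int> cur = todo.top();
    todo.pop();
    if (best.count(cur.second) > 0) {
      continue; // already completed with a distance <= cur.first
    }
    best[cur.second] = cur.first; // mark completed
    auto it = g.find(cur.second);
    if (it != g.end()) {
      for (const auto &edge : it->second) {
        if (best.count(edge.first) == 0) {
          todo.push(std::make_pair(cur.first + edge.second, edge.first));
        }
      }
    }
  }
  return best;
}
```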
25.4 Minimum Spanning Trees
Suppose we have several buildings, and we wish to connect them to each
other with power lines. We want to connect the buildings with power lines
such that they all have power but minimize the cost of the power lines we
use to do so. We do not, however, have to directly connect every building
to every other building—if building A is connected to building B and B is
connected to building C, we do not need to connect A to C, as power can be
transferred through B.
We can solve this problem by constructing a graph in which each
building is a node and the paths between buildings are edges. The weights
on each edge represent the cost of laying power lines along that path. In
such a formulation, we want to select the subset of the edges that connect
all of the nodes together with the minimum total cost (sum of edge
weights) to do so—called the minimum spanning tree (it is called a
“spanning tree” because it is a tree that spans the nodes of the graph; of the
possible spanning trees, we want the one with minimum cost).
There are two efficient algorithms to find minimum spanning trees—
Prim’s and Kruskal’s—we will cover them both.
25.4.1 Prim’s Algorithm
Prim’s algorithm works on the principle of starting from any node (it does
not matter which one) and building out a connected tree from it. The
algorithm maintains a connected tree at all times and “grows” it one node
at a time by picking the node that can be added with the least cost (lowest
edge weight from any of the nodes currently in the tree). The algorithm
terminates when all nodes are included in the minimum spanning tree.
One way to think about this algorithm is to imagine a priority queue
of edges, ordered by each edge’s weight, with the lowest weight being the
highest priority. The algorithm starts by picking one node in the graph and
adding its edges to the priority queue. The algorithm then proceeds by
taking the lowest weight edge from the queue, checking if it forms a cycle
—which is a matter of seeing if both “ends” of the edge are already in the
tree, which is easily accomplished by keeping a set of nodes that are in the
tree. If the edge would form a cycle, it is removed from the priority queue
with no other action. If it would not form a cycle, then it adds a new node
to the tree, so the edges leading from that node are added to the priority
queue (and the edge that was just used is removed from the priority queue).
Video 25.3: Prim’s algorithm to find a minimum spanning
tree.
Prim’s algorithm can also be formulated in terms of keeping the
vertices in the priority queue (instead of the edges). In this formulation, the
vertices must be ordered by their best distance from the current tree—that
is, the lowest edge weight from any of the nodes that have already been put
into the tree. At each step, the highest priority node is selected, removed
from the priority queue, and added to the current tree. When the node is
added to the tree, each of its adjacencies is examined to see if its best
distance has improved—that is, if the edge from the newly selected node is
better than the previous best edge. If so, the adjacent node needs to have its
priority changed to reflect this better weight. This operation can be
accomplished easily in a heap by changing the priority value and bubbling
the node up the heap until it reaches the correct place (just as is done when
adding the node). Video 25.3 shows Prim’s algorithm being used to find the
minimum spanning tree of a graph.
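The edge-based formulation described above might be sketched in C++ as follows, assuming an undirected graph stored with both directions of each edge present; for brevity, this sketch returns only the total MST weight (a real implementation would likely also record the chosen edges):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Illustrative type: node -> (neighbor -> edge weight), both directions.
typedef std::map<int, std::map<int, double>> WGraph;
// (weight, (from, to)): comparing pairs puts the lowest weight first.
typedef std::pair<double, std::pair<int, int>> Edge;

double primTotalWeight(const WGraph &g, int start) {
  std::set<int> inTree; // nodes already in the spanning tree
  std::priority_queue<Edge, std::vector<Edge>, std::greater<Edge>> pq;
  double total = 0;
  inTree.insert(start);
  for (const auto &e : g.at(start)) { // add the start node's edges
    pq.push(Edge(e.second, std::make_pair(start, e.first)));
  }
  while (!pq.empty() && inTree.size() < g.size()) {
    Edge e = pq.top();
    pq.pop();
    int to = e.second.second;
    if (inTree.count(to) > 0) {
      continue; // both ends already in the tree: this edge would form a cycle
    }
    inTree.insert(to); // grow the tree by one node
    total += e.first;
    for (const auto &next : g.at(to)) { // add the new node's edges
      pq.push(Edge(next.second, std::make_pair(to, next.first)));
    }
  }
  return total;
}
```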
25.4.2 Kruskal’s Algorithm
Another way to compute the minimum spanning tree of a graph is
Kruskal’s algorithm. In this approach, all of the edges in the graph are
placed into a priority queue (ordered such that the lowest weight edge has
the highest priority). The algorithm proceeds by taking the highest priority
edge from the priority queue and testing if it forms a cycle. If the edge does
form a cycle, it is ignored. If it does not form a cycle, it is included in the
minimum spanning tree. As with Prim’s algorithm, Kruskal’s algorithm
terminates when all nodes are combined into a single tree. Video 25.4
demonstrates this algorithm.
Video 25.4: Kruskal’s algorithm to find a minimum
spanning tree.
Testing for a cycle in Kruskal’s algorithm is a bit trickier than testing
for a cycle in Prim’s algorithm. In Prim’s algorithm, we have exactly one
set of nodes that belongs to the tree (as we have one tree that we build
from), so we can just keep a set data structure, test if a node is a member of
that set, and add it to the set. For Kruskal’s algorithm, however, we may
have many smaller trees at any given time—in fact, we can think of the
starting state of the algorithm as many one-node trees. Each edge that we
add joins together two small trees into one larger tree. To test for cycles, we
need to know if the two nodes of an edge belong to different trees (in
which case no cycle is formed) or the same tree (in which case a cycle
would be formed). If the edge is added, we need to combine the two trees
into one.
We could implement this functionality—called union-find (because
we need to be able to union two sets together and find which set an element
belongs to)—in a variety of ways; however, there are very efficient ways to
do so. We are not going to delve into the details of these algorithms here,
but you can expect to learn about them in an algorithms theory class.
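To make the idea concrete, here is a minimal union-find sketch (with path compression but without union by rank, which a more efficient implementation would add). Kruskal's algorithm would call unite on each edge's endpoints, including the edge in the tree exactly when unite returns true:

```cpp
#include <cassert>
#include <map>

// Minimal union-find: enough to support the cycle test in Kruskal's
// algorithm. Elements not yet seen start as their own one-node sets.
class UnionFind {
  std::map<int, int> parent;

public:
  int find(int x) {
    if (parent.count(x) == 0) {
      parent[x] = x; // first time we see x: its own one-node tree
    }
    if (parent[x] != x) {
      parent[x] = find(parent[x]); // path compression
    }
    return parent[x];
  }
  // Join the sets containing x and y. Returns false if they were already
  // in the same set (i.e., adding this edge would form a cycle).
  bool unite(int x, int y) {
    int rx = find(x);
    int ry = find(y);
    if (rx == ry) {
      return false;
    }
    parent[rx] = ry;
    return true;
  }
};
```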
25.5 Other Algorithms
There are a variety of other graph algorithms and problems that
programmers should be familiar with. We are not going to delve into the
details of them here; however, we will briefly mention a few important
ones here so that you know what things you might want to learn more
about in the future (of course, if you take an algorithm theory course, you
should learn all about these):
Clique and independent set A clique in a graph is a group of nodes
that are all directly connected to each other (an edge exists
between each pair of nodes in the set). We may wish to find a
clique of at least a particular size, or the largest clique in a
graph. The dual of this problem is to find an independent set—
a set of nodes with no direct edges between any pair of them.
Both of these problems are NP-Complete.
Isomorphism Two graphs are isomorphic if they have the same
structure. More formally, G1 and G2 are isomorphic if and
only if we can find a mapping f from the nodes in G1 to the
nodes in G2 such that for all pairs of nodes u and v, there
is an edge between u and v if and only if there is an
edge between f(u) and f(v). At present, it is unknown whether
graph isomorphism is NP-Complete or not.
Max flow/min cut In a single source/single sink (one node with no
incoming edges, one node with no outgoing edges), weighted
DAG, we might want to find the maximum flow through the
graph—that is, if we consider each edge weight to represent
the bandwidth (amount that can pass through per unit time),
we might want to figure out the most flow we can send
through the graph edges.
Strongly connected components In a directed graph, a strongly
connected component (SCC) is a set of nodes in which every
node is reachable from every other node (that is, you can
follow the directed paths to/from each node in the component).
There is an efficient algorithm to find the SCCs of a graph
using DFS.
Topological sort Topological sort takes a directed acyclic graph
(DAG) and produces an ordering of its nodes such that each
node in the output is before all nodes that are reachable from it
(on directed paths). This problem is useful when we consider
the edges of the DAG to represent a “depends on” relationship
and we want to find a valid schedule (in which each task
appears before all tasks that depend on it). This is exactly what
we were doing in our cooking example earlier. This algorithm
has efficient solutions—one based on DFS and one based on
removing source nodes (ones with no incoming edges) from
the graph and adding them to the list. The latter algorithm is
nice for scheduling, as it is easy to apply prioritization
heuristics when there are multiple source nodes to choose from
(e.g., select the source node that is at the head of the longest
dependence chain).
Traveling salesperson
The traveling salesperson problem (TSP) asks
for the minimum cost trip between a set of “cities” (nodes)
where each edge is weighted with the cost to travel between
that pair of cities. This problem is NP-complete.
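The source-removal approach to topological sort mentioned above (often called Kahn's algorithm) might be sketched like this, assuming a DAG stored as an adjacency list of int node IDs (an illustrative choice):

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <vector>

typedef std::map<int, std::set<int>> Graph;

std::vector<int> topoSort(const Graph &g) {
  // Count incoming edges for every node.
  std::map<int, int> indegree;
  for (const auto &p : g) {
    indegree[p.first]; // ensure every node has an entry (default 0)
    for (int dst : p.second) {
      indegree[dst]++;
    }
  }
  // Collect the initial source nodes (no incoming edges).
  std::queue<int> sources;
  for (const auto &p : indegree) {
    if (p.second == 0) {
      sources.push(p.first);
    }
  }
  std::vector<int> order;
  while (!sources.empty()) {
    int n = sources.front();
    sources.pop();
    order.push_back(n);
    auto it = g.find(n);
    if (it != g.end()) {
      for (int dst : it->second) { // "remove" n's outgoing edges
        if (--indegree[dst] == 0) {
          sources.push(dst); // dst became a source: schedule it
        }
      }
    }
  }
  return order; // shorter than indegree.size() if the graph had a cycle
}
```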
25.6 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 25.1 : What is a DAG? How is it different from a tree?
Give one example of an important type of problem where DAGs
are useful.
• Question 25.2 : What does it mean for a graph to be “planar”?
• Question 25.3 : Perform Prim’s algorithm to find the minimum
spanning tree on the following graph, starting at node A.
Show the resulting MST, and list the order in which nodes are
added to it.
• Question 25.4 : Perform Kruskal’s algorithm to find the minimum
spanning tree on the same graph as the previous problem. List the
order in which edges are added to the tree.
• Question 25.5 : Using the same graph as the previous two
problems, perform Dijkstra’s algorithm to find the length of the
shortest path from A to each other vertex in the graph.
• Question 25.6 : Do you notice any similarities between Prim’s
Algorithm and Dijkstra’s Algorithm? What is the difference
between them? What do these observations suggest about
implementation?
• Question 25.7 : Implement a graph, including DFS, BFS,
Dijkstra’s, and one of the MST algorithms. As always, implement a
small piece of the problem, then test extensively before proceeding.
Chapter 26
Sorting
Sorting data is the process of organizing the data into either ascending
order (smallest to largest) or descending order (largest to smallest).
Programmers often find it useful to sort data as part of solving some
other problem. For example, we have seen binary search, which
requires the input array to be in sorted order but allows us to find a
particular element (or determine it is not in our input array/vector) in
O(log n) time, rather than O(n) time. Likewise, if we had a
1,000,000 element array, and we wanted to know the largest 5,000
elements, an easy and relatively efficient approach would be to sort
the data into descending order, then look at the first 5,000 elements.
We will note that you can sort data of any type for which you can
define a total ordering on the type—i.e., you can compare any two
elements of the type (a and b) and either have a < b, a = b, or a > b.
Any total ordering will work. For this chapter, we will work with
integers (ordered with their “normal” ordering), but the same concepts
and algorithms apply to sorting any data with any ordering.
Similarly, we can sort data in any linear data structure—e.g.,
arrays or linked lists. Here, the primary difference is what operations
can be performed in O(1) time. For example, accessing a particular
element of an array by index takes O(1) time, but the same
operation takes O(n) time in a linked list. We will primarily focus
on sorting arrays in our examples, but we will discuss sorting linked
lists as well.
26.1 O(n²) Sorts
If you were to work through the process of devising an algorithm to
sort data, you would likely come up with an algorithm that has a
running time of O(n²). The most natural and intuitive ways of
sorting data fall into this running time. These lend themselves to the
quickest/easiest implementations, but are poor choices if you must
sort a large data set. We will cover three major examples here: bubble
sort, insertion sort, and selection sort.
26.1.1 Bubble Sort
The basic idea of bubble sort is to iterate across the array, comparing
each element to the one next to it. If the two elements are out of order,
bubble sort swaps them. Iterating over the array brings each element
closer to its correct position, but one pass over the array is insufficient
to guarantee that the array is completely sorted. Therefore, bubble sort
repeats the process until it does not swap any elements in a pass over
the array.
Video 26.1: Execution of bubble sort.
Video 26.1 walks through the execution of bubble sort. For this
video (and others in this chapter), we will show the data being sorted
conceptually as bars whose height corresponds to the data values. We
deviate from our usual number-in-box representation, as this approach
gives you a better view of what is happening in the array. We also
color the bars red (wrong order) or green (right order) when they are
compared to each other, and purple when they are swapped. We will
note that by this point, you should be able to trace the execution of
this code by hand with numbers in boxes.
Video 26.1 has slower animation at the start and end and much
faster animation in the middle. The first pass through the array is
animated more slowly so that you can see more detail of what is
happening (of course, you can pause/rewind the video as you want
any time). The middle is shown quickly, as it is repetition of what has
already been done. We animate the last pass over the loop slowly to
illustrate that the final pass effectively verifies the ordering without
doing any swaps.
Video 26.2: Fast execution of a larger bubble sort.
Video 26.2 shows bubble sort sorting a slightly larger array (24
elements instead of nine). Here, we show the sort much faster, and
without the code, to illustrate the bigger picture behavior of the sort.
In this video, you can see a few things about the behavior of bubble
sort. The largest elements “bubble up” quickly—the first pass through
the array moves the largest element to the last position, then each
subsequent pass ensures that at least the next largest element is moved
to its correct position. You can also observe that elements that “bubble
down” (they start to the right of their correct position) may take many
passes, as they might only move one place closer each time—the end
of the sort shown in Video 26.2 spends four iterations moving 28
down one place at a time to its correct position after everything else is
correctly ordered relative to all the other elements.
We can (informally) observe the O(n²) behavior of this
algorithm from what this video showed us. Each pass is guaranteed to
put one element into the correct place—it may end up putting more
where they belong, but only one is guaranteed per pass. Each pass
requires O(n) time (as it examines each element of the array), and
we need O(n) passes (as we may need one pass per element to
move that element to the right place). Because we need O(n)
passes, which each do O(n) work, we have an O(n²)
algorithm.
We might be tempted to try to improve on bubble sort by
observing that in Video 26.2 (a) our algorithm spends significant time
comparing the elements at the end of the array that are already
guaranteed to be sorted and (b) that we bubble elements much more
quickly in the direction that we sort. If we were then to modify our
algorithm to remember where the array is sorted, and to alternate
directions, we would end up reinventing shaker sort. Shaker sort also
has O(n²) performance, so it is still not efficient on large data sets.
We mostly illustrate it to show that tweaking a simple sort may result
in slight improvements without improving the Big Oh of the
algorithm.
Video 26.3: Illustration of the execution of shaker sort.
We will note that bubble sort is relatively easy to implement on a
linked list (one would swap the data in the nodes, not the nodes
themselves). It is also no less efficient on linked lists than it is on
arrays—the sort iterates over the elements of the array in sequence
and compares one element to the immediately subsequent element.
Shaker sort would only be useful on a doubly-linked list, so that the
reverse passes could follow previous pointers.
26.1.2 Insertion Sort
Another O(n²) sort that is commonly used when efficiency is not
a concern is insertion sort. The basic idea of insertion sort is to divide
the array into two portions—one that is sorted, and one that is not
sorted. The first element of the unsorted portion is then inserted into
the sorted portion at the correct position relative to the other already
sorted elements. This insertion grows the sorted portion of the array
by one element, while simultaneously shrinking the unsorted portion
of the array. When the unsorted portion of the array shrinks away
completely, the entire array is sorted. The sort starts with the division
between the sorted portion and the unsorted portion such that the
sorted portion has one element, and the unsorted portion has n - 1
elements. This division works because any one-element sequence is
always sorted.
Video 26.4: Execution of insertion sort.
Video 26.4 illustrates the execution of the insertion sort
algorithm. Notice that while bubble sort is based on comparing and
swapping elements, insertion sort is based on finding a position,
shifting elements to the right to make space, and inserting the element
into a position that may be far away from where it started. When the
algorithm copies elements right to prepare for insertion, a duplicate of
the item being moved exists briefly. We show this in the video by
drawing the newly copied item in black, and the original (now
duplicate) copy in grey. On the next step of the algorithm (either
shifting the next element right or inserting the current element), the
duplicated copy will be overwritten.
As with bubble sort, we can informally observe that insertion
sort is an O(n²) algorithm. Each element of the array must be
inserted into the sorted portion of the array (which requires n
insertions), and each insertion requires searching for the correct space
(which is an O(n) operation) and shifting the elements to the right
(which is also an O(n) operation)—which makes each insertion
O(n) + O(n), which is still O(n). Again, doing n
insertions that each take O(n) time results in an O(n²)
algorithm.
Insertion sort works very nicely for linked lists. Instead of having
a sorted and unsorted portion of the linked list, we simply build a new
sorted linked list. At each step of the algorithm, we remove the head
of the original list and perform a sorted insertion into the new list. We
can reuse the original nodes, so we do not need any extra allocation or
copying. Accordingly, insertion sort on a linked list may be slightly
more efficient than insertion sort on an array; however, it is still
O(n²).
Video 26.5: Fast execution of a larger insertion sort.
Video 26.5 shows a faster animation of a larger execution of
insertion sort. As with bubble sort, the primary purpose of this larger
animation is to show you the bigger picture/conceptual behavior of
the algorithm. Here, you can see how for each insertion the algorithm
scans from the left looking for the insertion point, then copies
elements to the right to “make space” for the newly inserted item.
26.1.3 Selection Sort
Video 26.6: Execution of selection sort.
Another O(n²) algorithm that is popular for its simplicity
when efficiency is not a concern is selection sort. As with insertion
sort, selection sort conceptually divides the array into a sorted and an
unsorted region, then works to expand the sorted region. The
difference between selection and insertion sort is that selection sort
imposes the additional invariant on the sorted region that it contains
the smallest elements of the array in sorted fashion—thus placing
them in their correct position the first time. To maintain this invariant,
selection sort must start with an empty sorted region (compared to
insertion sort, which can start with the first element in the sorted
region), and selection sort grows the sorted region in a different way
from insertion sort.
Selection sort examines the unsorted region for its smallest
element, then swaps that element with the first element of the
unsorted region (unless it is already in that position). As the sorted
region has the smallest elements of the array already, this swap
operation places the next larger element into the correct position. At
this point, the sorted region grows (and the unsorted region shrinks)
and the process repeats until the entire array is sorted.
Video 26.6 shows the execution of selection sort with the
corresponding code. After watching this video and Video 26.4, you
should be able to see the differences in how selection and insertion
sort operate. You should also be able to observe the O(n²)
behavior of selection sort. There are n iterations of the outer
loop, and each of these iterations calls findMinIdx to find the index of
the smallest item in the unsorted region. The findMinIdx function
also has O(n) running time, so we have O(n) × O(n) = O(n²).
Video 26.7: Execution of a larger selection sort.
Selection sort is no worse on linked lists than it is on arrays, as it
iterates sequentially through the data structure, both in terms of where
the boundary between the unsorted region and the sorted region is, as
well as for finding the minimum item in the unsorted region.
However, insertion sort tends to be quite popular when sorting linked
lists, as most of the work is already done if you already have a
function to perform sorted insertion on the list.
As we have done before, we show a faster animation of selection
sort on a larger data set in Video 26.7.
26.2 O(n log n) Sorts
The O(n²) sorts are conceptually simple (and relatively easy to
implement) but are inefficient for large data sets. If you just need to
write something to sort 50 pieces of data, any of the O(n²) sorts
we saw will work just fine. However, if you need to sort 50 million
pieces of data, an O(n²) sort will be rather slow (e.g., multiple
days on today’s computers). If we need efficiency on larger input
sizes, we need to use an O(n log n) sort, such as heap sort,
merge sort, or quick sort.
Figure 26.1: Runtime comparison of various sorts.
To see the performance difference between all of these sorts,
Figure 26.1 plots the runtime (in seconds) on the y-axis versus the
number of items in the array to be sorted on the x-axis. The four
O(n²) sorts that we have already seen appear on the left side of the
graph and quickly take longer than 10 seconds (the maximum that is
shown here) at input sizes of less than 500,000 elements. Meanwhile,
the O(n log n) sorts we are about to learn about are barely
visible at the bottom of the graph, as they do not even take 1 second at
2 million elements.
If we were to extrapolate from the data for insertion sort in
Figure 26.1 to 5 million elements, we would estimate that it would
take about an hour; 50 million elements would take about 4.5 days;
500 million elements would take about 1 year. However, we can just
run the O(n log n) sorts on 500 million element input sets, and
they take between 50 seconds and 3 minutes (depending on which
sort) on a regular desktop computer.
26.2.1 Heap Sort
One way we could achieve O(n log n) sorting would be to
revisit selection sort but find a more efficient (i.e., O(log n)) way
to find the minimum element to swap into the next place. We have
already seen ways to organize data so that we could find the minimum
(or maximum) element efficiently—binary search trees (Chapter 22)
and heaps (Chapter 24). From a conceptual level, the idea of heap sort
would be to place all of the items from the array into a heap, then
repeatedly remove the maximum element from the heap and place it
into the correct location in the array. As we will see shortly, we are
going to put in our maximum elements from right to left, so we use a
max-heap for ascending order, and a min-heap for descending order.
However, if you recall from Chapter 24, heaps are actually
implemented as arrays. Accordingly, we can partition our array into a
part that is unordered and a part that is heap-ordered. We can then
follow a similar approach, moving the boundary between the regions
by inserting the next element into the heap. While this may sound a
bit like insertion sort, we are inserting into a heap—which is an
O(log n) operation, rather than an O(n) operation. After we
finish this pass (which is typically called heapify), our array is not
sorted—it is a heap. We note that the heapify pass takes
O(n log n) time, as it performs n heap insertions (each of
which is O(log n)).
Now that we have a valid heap, we can perform a much more
efficient version of selection sort. However, since we are using the
same array for the heap and the sorted array, we have to partition our
array so that the heap is to the left (lower indices) and the sorted array
is to the right. The largest element of the heap is in index 0 of the
array, and we want to put it at the right end of the array—in the space
right at the border between the heap and the sorted array. Fortunately,
we need to move the element that is in that position to the top of the
heap as part of the heap remove operation anyways. We therefore
swap the first element of the array (which is the top of the heap)
the last element in the heap. This swap operation extends the sorted
region by one element and starts the removal operation from the heap
in the correct way—we just need to bubble down the node to its
correct position (which is also an O(log n) operation). After we
finish bubbling the element down, we repeat the process until the heap
is gone and the entire array is sorted.
Video 26.8: Execution of heap sort.
Video 26.8 shows the execution of heap sort. Notice how in the
first portion of the video, the array is “heapified”—turned into a heap
ordering. In the second portion, elements are taken from the heap (at
index 0—the largest element), and placed into the sorted portion of
the array.
26.2.2 Merge Sort
Another O(n log n) approach to sorting is merge sort. Merge
sort is a divide and conquer algorithm—one that approaches the
problem by recursing on smaller pieces of the problem (divide), then
combining the results of the recursion (conquer). In particular, merge sort
splits and recursively sorts the left half of the array, then recursively
sorts the right half of the array, then performs a merge operation,
where the results of the two sorted arrays are combined.
The merge operation iterates over the two sorted regions of the
array, taking the element from whichever half has the smallest next
element. The selected element is then written into a temporary array
(“in place” merge, while possible, is quite complicated). The process
repeats, taking from whichever sorted half has the smallest next
element, until one half is empty. At this point, the remaining elements
from the other half are copied to the temporary space, then the entire
temporary space (which is now sorted) is written back into the array.
One possible base case for the recursion of merge sort is when
there is one element in the array, as a single element array is always
sorted. Often, however, people will resort to a simple sort for a small
number of elements. The motivation for this approach is that simple
sorts may be more efficient for small data sets. For example, we might
have a base case where we use selectionSort when n < 16. (We
performed a quick experiment and found this was the best choice for
the particular system we were experimenting on; it resulted in about a
15% speedup over a base case of doing nothing when n < 2.)
Video 26.9: Execution of merge sort.
Video 26.9 shows the execution of merge sort on a 16-element
array. In this video, we use n <= 2 as the base case—sorting two
elements requires only comparing the two and swapping their order or
leaving them in the same order as required. While using a larger n as
the base case is more efficient in practice, our focus here is on
showing you how the algorithm operates. Throughout the video, some
elements are drawn in blue (meaning they are in the range that the
algorithm is currently working on), and other elements are drawn in
gray (meaning that they are outside of the range that the algorithm is
currently working on). The top left corner of the video describes
where the animations are in the merge sort algorithm at each step.
During merge phases, two colored boxes (one purple and one pink)
appear around the two parts of the array that are being merged
together. A purple and a pink arrow keep track of the next element to
merge from each portion, and the elements are moved out to
temporary space (above the array) then copied back into the array.
26.2.3 Quick Sort
Another sort that splits the problem into smaller pieces and solves
them using recursion is quick sort. Quick sort typically has a running
time of O(n log n), but in the worst case it can have a running
time of O(n²). The reason for this variation in running times has
to do with the fact that quick sort may not partition the input array
into exactly half each time (as we will see shortly). If the splits end up
being roughly half-and-half most of the time, the run time will be
O(n log n); however, if the splits end up being very imbalanced
most of the time, the running time will be O(n²). The results in
Figure 26.1 were generated with random data, which quick sort
performs well on.
The idea behind quick sort is to partition the array into smaller
and larger elements first, then recursively sort those pieces. The
partitioning is done with respect to one particular element, called the
pivot. The basic steps of the algorithm are to pick the pivot, partition
the array so that all elements smaller than the pivot are to the left of
where it will end up and all larger elements are to the right, swap the
pivot into its proper position, then recursively quick sort the two
partitioned halves. Note that the partitioning algorithm makes no
attempt to order the elements in each half correctly with respect to
each other—it only tries to get elements smaller than the pivot on the
left and larger than the pivot on the right.
Selecting the pivot is a bit of a complex topic. Ideally, we would
like to pick the median element of the input data (i.e., the one that will
be in the middle once we sort). Unfortunately, finding the median is
fairly complex (the easiest way to do so is to sort the data first…), so
we must use some other approach. If you know that your data will be
fairly random, you can simply use the element at index n - 1 as the
pivot. However, if you use the element at n - 1 as the pivot and your
input is already close to sorted (in either order), your quick sort will
degenerate into O(n²) running time.
Another possibility is to choose a random array index and use
that element as the pivot. Such a choice gives you the expectation of
picking a good (i.e. near the median) value sometimes, and a poor
value other times. Unless we get really unlucky, this mix of good and
bad pivots should be sufficient to get O(n log n) runtimes. One
might improve on the odds of good pivots by randomly choosing
three indices and picking the median of those three elements.
However we pick the pivot, we swap it with the last element of the
array before we commence partitioning. Basically, this swap-to-end
puts the pivot in an “out of the way” place where we don’t need to
treat it specially in the rest of the algorithm (we just start our right-to-
left scan at the index before the pivot).
Once a pivot is selected and swapped to the end, the partitioning
step proceeds by first scanning from left to right, looking for an
element of the array that is larger than the pivot (i.e. belongs on the
right side). Then the algorithm scans from right to left looking for an
element that is smaller than the pivot (i.e. belongs on the left side).
Once these two elements are found, they are swapped with each other,
putting them both on the correct side of the array. The process then
repeats with another left scan/right scan, and ends when the two scans
cross each other (which is checked after the right-to-left scan stops).
At the time when the scan ends, the left-to-right scan index is in
the correct place for the pivot element to be placed, so the element it
points at is swapped with the pivot. It may be a little bit difficult to
see that the left-to-right scan index is positioned where the pivot
belongs, but it follows from the invariants of the algorithm. We can
see this property by remembering that (a) everything to the left of the
scan index is smaller than the pivot (otherwise, we would have
already swapped it to the right) (b) the scan index is currently on an
element that belongs on the right side of the array (because the left-to-
right scan stopped there) and (c) all elements to the right of the scan
index belong on the right side of the array (because the right-to-left
scan has already passed them, and thus they would have been
swapped to the left if they did not belong). If we swap the current
item with the pivot (which we know is safe because of (b)), then (a)
and (c) together tell us that we are in the right place for the pivot
element.
After the pivot is swapped into position, the left side and the
right side are recursively sorted. The recursive sort calls can exclude
the pivot element itself, as we already know that it is in the correct
position. As with merge sort, we may wish to have a base case that
calls a simple sort to finish the last handful of elements.
Video 26.10: Execution of quick sort.
Video 26.10 shows the execution of quick sort. The upper right
corner displays the call stack, in terms of the range where quick sort is
operating—elements outside of the current range are shaded gray.
This implementation of quick sort treats two or fewer elements as a
base case, although a real implementation would resort to a simple
sort at more elements (e.g., 16). When you watch the video, you will
see two indices scanning from the left (green box) and right (red box),
comparing each element to the pivot (purple box). When an element
that belongs on the right is found on the left, the scan from right to left
begins. When the right-to-left scan finds an element that belongs on
the left side, it is swapped, and scanning proceeds from the left.
26.3 Sorting Tradeoffs
When choosing a sorting algorithm, a natural question is “which one
is best”? The answer (as with many things) is “it depends.” If we
want simplicity and do not care about speed (e.g., we need to write
something quickly to sort 100 numbers), then we might choose
whatever seems most natural to us.
However, if we care about speed on large inputs, we know that
we must pick an O(n log n) algorithm. Recall that
Figure 26.1 showed how much the O(n log n) algorithms
outperformed the O(n²) algorithms for modestly large input sizes.
However, in that graph, the three O(n log n) algorithms all
appeared to be roughly the same (as we will see shortly, this similarity
is a function of the graph’s scale). If they are the same, how would we
choose between them?
Each of the O(n log n) algorithms has some
disadvantage over the others. For quick sort, its main disadvantage is
that it is not guaranteed to be O(n log n)—we just expect it
to run in that time in the general case. For merge sort, the main
disadvantage is the extra space required to merge (which is O(n)
space). By contrast, heap sort can be done completely in place (as all
the recursive operations are tail recursive), and quick sort only needs
O(log n) space for the stack frames (assuming it runs in
O(n log n) time—if it degrades into O(n²) time, it also
requires O(n) frames).
                        Heap sort   Merge sort   Quick sort
Guaranteed O(n log n)?  Yes         Yes          No
Extra space needed      O(1)        O(n)         O(log n)
Good locality           No          Yes          Yes
Table 26.1: Summary of tradeoffs of sorting algorithms.
The main disadvantage of heap sort is poor locality—meaning
that it accesses array elements that are far apart from each other,
rather than close together. For example, when heap sort accesses
element 100,001, the next element that it accesses is either the parent
(50,000—if we are bubbling up as we make the heap) or the left and
right children (200,002 and 200,003—if we are bubbling down as we
remove from the heap). From a theoretical standpoint, these distant
accesses do not matter. However, from a practical standpoint, they
cause the algorithm to have higher constant factors, due to the way
hardware works. In particular, hardware has caches that are much
faster to access than main memory. The cache hierarchy is designed to
work best when programs exhibit good locality (meaning it generally
accesses data with memory addresses that are close to each other—
most programs typically exhibit good locality), so poor locality results
in slower access to each element of the array. Table 26.1 summarizes
these tradeoffs.
Figure 26.2: Runtime of sorts on larger data sizes.
While these differences are important considerations, we have
not yet answered the question you are most likely hoping for:
“Which sorting algorithm is fastest on arrays?” Even though they look
fairly similar in Figure 26.1, the scale of that graph is set to show the
O(n²) algorithms, and thus the O(n log n) algorithms
cannot be easily distinguished. Figure 26.2 shows the runtime of these
three algorithms for much larger inputs—up to 500 million elements.
The previous figure only showed up to 2 million (which is a tiny
region in the lower left corner of this graph).
In this experiment, heap sort is significantly slower than merge
sort or quick sort. This result is what we would generally expect, as
heap sort’s poor locality will lead to long latency accesses to the array.
Merge sort and quick sort have similar performance, but quick sort
still has about an 18% advantage on the 500 million element array.
Even though quick sort appears to be a clear winner, we must be
cautious. This experiment was run on random data, which is good for
quick sort—we are likely to pick pivots in the middle of the array
often enough to have O(n log n) behavior. In the worst case,
quick sort could be much slower. This combination seems to pose a
bit of a conundrum: we want to use quick sort for its typical high
performance, but worry about the risk of an incredibly slow sort.
One way to resolve this tension is to hybridize quick sort with
heap sort. This approach, which is called introspection sort, starts
with quick sort, but switches to heap sort if quick sort’s recursion
depth exceeds a limit proportional to log n (e.g., 2 × log₂ n). To
implement this limit, quick sort takes an extra parameter, which
specifies how many recursive call depths it may make before
switching to heap sort. When this parameter reaches 0, the modified
quick sort calls heap sort and returns. When quick sort recurses, it
passes its current limit minus 1. This hybridization guarantees
O(n log n) runtime (there are at most O(log n) levels of recursive
calls before switching to heap sort), but still gives the performance
benefits of quick sort in the general case.
26.4 Sorting Libraries
Sorting is such a common task that most languages provide some
form of sorting in their standard libraries. In C, the standard library
provides qsort, which has the following prototype:
1 void qsort(void * base,
2 size_t nmemb,
3 size_t size,
4 int (*compar)(const void *, const void *));
Recall from Section 10.3 that qsort takes a void * (and the size
of each element) so that it can sort any type of data, and that it takes a
function pointer to specify how to compare each element. As you may
have now guessed, qsort implements the quick sort algorithm. Now
that you know about sorting, you know what exactly it does and what
the advantages and disadvantages of that approach are. See man qsort
for more details.
In C++, the STL provides std::sort, which sorts the elements
of any container that supports random access iterators (e.g., a vector).
By default, std::sort sorts into ascending order according to the <
operator defined for whatever type it is sorting. However, you can
pass it an optional extra argument to define a custom ordering if you
need. See http://www.cplusplus.com/reference/algorithm/sort/
for more details.
C++’s STL also provides another sorting algorithm that is stable
—meaning that if the input contains items that are equal according to
the ordering, they will remain in the same relative order once the
array is sorted. While it may seem odd to worry about the ordering of
two equal items, stability is only concerned with the equality of the
items with regards to the ordering operator used to sort—they may
still be distinguishable. For example, if we were working with a
vector of student records, we might sort by GPA. If we have multiple
students with the same GPA, we can still distinguish them (e.g., they
have different names, student numbers, etc.). Whether or not we care
about the stability of a sort depends on what we are using it for. See
http://www.cplusplus.com/reference/algorithm/stable_sort/
for more details about std::stable_sort.
To wrap up the sorting chapter, we refer you to xkcd for a few
other sorting algorithms that every aspiring programmer should be
familiar with: https://xkcd.com/1185/.
26.5 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 26.1 : For which of the following problems would
sorting the data (which is otherwise in an unknown order at
the start) as a first step be an efficient approach?
Minimum Find the smallest item in the data
Maximum Find the largest item in the data
Check if any item… Check to see if any item in the input has
a given property (e.g., is even, is greater than 4, is
equal to 42, is a perfect square, etc.).
Check if all items… Check if all items in the input have a
given property.
Intersect Given two arrays, produce an array with the items
that are in both input arrays.
• Question 26.2 : What does it mean for a sort to be “stable”?
• Question 26.3 : Implement each of the eight sorts described in
this chapter (bubble, shaker, insertion, selection, quick, heap,
merge, and introspection). Time your sorts for various input
sizes, and create graphs similar to those shown in Figure 26.1
and Figure 26.2. Hint: make your program take two command
line arguments: one for the size of array to sort, and one for
the name of the sort to perform. Note that your results should
have similar trends but may not have the exact same values
due to differences in the performance characteristics of the
hardware on your system.
• Question 26.4 : In the previous problem, how could you (or
did you) use inheritance to reduce code duplication in your
timing/testing code?
• Question 26.5 : Add support to your timing program for the
three sorts from the C and C++ standard libraries: qsort,
std::sort, and std::stable_sort. Include these on your
graphs. How do they compare to your sorts? How do they
compare to each other?
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Part IV Other Topics
27 Balanced BSTs
28 Concurrency
29 Advanced Topics in Inheritance
30 Other Languages
31 Other Languages: Java
32 Other Languages: Python
33 Other Languages: SML
34 … And Beyond
Chapter 27
Balanced BSTs
In Chapter 22, we learned about binary search trees, and how
they can give us O(log n) access time (for insert, delete,
and searching) as long as they are balanced. If we just use a
“plain” binary search tree—with the algorithms that we already
learned about—then the tree can become imbalanced as we add
elements. In particular, if we add elements in increasing or
decreasing order, the tree will completely degenerate into a
linked list, as the elements will form a single chain going to the
right (increasing order) or left (decreasing order). In this
chapter, we are going to learn how we can fix this problem by
adjusting our insertion and deletion algorithms to ensure the
height of the trees remains O(log n). We will see two
common techniques: AVL trees and red-black trees. The two
algorithms both maintain O(log n) behavior but with
different rules and invariants.
27.1 AVL Insertion
Video 27.1: Adding 3 creates imbalance. Left
rotation re-balances the tree.
The first balanced tree we will explore is the AVL tree,
which maintains a strictly balanced tree. Recall from Chapter
22 that a tree is balanced when, for every node, the height of its
children differ by at most 1. AVL insertion starts as normal
BST insertion using the recursive approach that we saw before;
however, after each recursive call returns, the algorithm
updates the height information for each node (we want to store
the height in a field in each node rather than recalculate it each
time we need it) and checks for an imbalance (children whose
heights differ by more than 1). If a node is imbalanced, the
algorithm performs a rotation to restore the balance. For
insertion, we are guaranteed that one rotation operation is
sufficient to restore balance.
Video 27.1 shows how insertion into an AVL tree can
create imbalance and how that imbalance can be repaired by a
rotation. In this case, the root node (1) becomes imbalanced as
its left child has height 0, and its right child has height 2. After
rotating, the new root of this sub-tree (2) is balanced, as both
children have height 1. Notice that the height of the new root is
the same as the height of the original root.
This example is not the only case for rotations. If we were
to add to the left side of a mirror-image situation, we would
solve the problem with a right rotation—which is basically the
same, but in the opposite direction. We also may not rotate
from the root (we rotate from whatever node is imbalanced—
where its children’s heights differ by more than one). The
imbalanced node may also not be so close to the point at which
we added—we may add deep in a sub-tree, and create
imbalance several levels up (which is why we want to balance
as we return up our recursions—it allows us to fix the problem
at any level).
Video 27.2: Generalized single rotations for an
AVL tree.
Video 27.2 shows the generalized cases for both of these
rotations, which are called single rotations. We show colored
triangles to represent generic sub-trees (they could be empty, or
very large). The color does not matter, except to help you keep
track of which sub-tree is which as they move around.
These two situations allow us to fix the problem with one
rotation (thus the name “single rotation”), as the imbalance
either arises from the right child’s right child growing, or the
left child’s left child growing. In either of these situations,
rotating once from the imbalanced node repairs the tree as the
imbalanced node can be made the parent of the two shorter
sub-trees (its original left child and the left child of its right
child—or in the other case, its original right child and the right
child of its left child), and the resulting sub-tree will match the
height of the sub-tree that grew larger.
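In code, each single rotation is just a couple of pointer updates. The
sketch below uses a simplified node struct of our own, with the height
bookkeeping omitted for clarity (a real AVL implementation would update the
stored heights of the two nodes involved after rotating):

```c
typedef struct node {
  int key;
  struct node * left;
  struct node * right;
} node_t;

/* Left rotation: n's right child becomes the new root of this sub-tree,
 * and the new root's old left child (the "middle" sub-tree) becomes n's
 * new right child. */
node_t * rotate_left(node_t * n) {
  node_t * new_root = n->right;
  n->right = new_root->left;   /* the middle sub-tree changes parents */
  new_root->left = n;
  return new_root;             /* caller re-attaches this to the tree */
}

/* Right rotation: the exact mirror image. */
node_t * rotate_right(node_t * n) {
  node_t * new_root = n->left;
  n->left = new_root->right;
  new_root->right = n;
  return new_root;
}
```

Because the rotation returns the new sub-tree root, the caller (the
recursive insert) simply assigns the result back into the parent's child
pointer.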
However, there are two other situations (one is a mirror
image of the other) in which one rotation will not fix the
problem. If the imbalance arises from the right child’s left sub-
tree growing (or the left child’s right sub-tree), a single rotation
just shifts around the location of the imbalance. Instead, a
double rotation is required. A double rotation involves rotation
in one direction from the child of the imbalanced node, then in
the opposite direction from the imbalanced node. The first
rotation at the child places the imbalance into the correct
position for a single rotation to correct it.
Video 27.3: Generalized double rotations for an
AVL tree.
Video 27.3 illustrates these double rotations. Note that the
pointer manipulations involved are exactly what we have
already learned for single rotations; we just do two rotations in
opposite directions, first from the child, then from the
imbalanced node.
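Expressed in code, a double rotation is literally two single rotations: the
first at the child, the second (in the opposite direction) at the
imbalanced node. This sketch again uses our own simplified node struct,
with height bookkeeping omitted:

```c
typedef struct node {
  int key;
  struct node * left;
  struct node * right;
} node_t;

static node_t * rotate_left(node_t * n) {
  node_t * r = n->right;
  n->right = r->left;
  r->left = n;
  return r;
}

static node_t * rotate_right(node_t * n) {
  node_t * r = n->left;
  n->left = r->right;
  r->right = n;
  return r;
}

/* Right-left case (the right child's left sub-tree grew): rotate right
 * at the child, shifting the imbalance into the right-right position,
 * then rotate left at the imbalanced node itself. */
node_t * double_rotate_right_left(node_t * n) {
  n->right = rotate_right(n->right);
  return rotate_left(n);
}

/* Left-right case: the mirror image. */
node_t * double_rotate_left_right(node_t * n) {
  n->left = rotate_left(n->left);
  return rotate_right(n);
}
```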
Having seen the basic mechanics of the rotations, it is
useful to see them concretely applied to a tree in the context of
adding nodes. Video 27.4 shows the addition of two nodes (3
and then 1) to an AVL tree. Each of these additions causes an
imbalance, which must be fixed, although the two additions
cause the imbalance in different places. Note that although we
show (and focus on) additions that cause imbalance, it is
possible that an addition might not cause any imbalance, and
thus no rotations are needed. For example, if we were to add 58
(as the right child of 56) to the tree in the video, no rotations
would be required.
Video 27.4: Example of adding to an AVL tree
and applying rotations to correct imbalances.
27.2 AVL Delete
Deletion from an AVL tree starts the same as regular binary
search tree deletion: we recursively find the desired node, then
either (if it has zero children or one child) remove it from the
tree or (if it has two children) swap its data with an appropriate
node and delete that one (which is guaranteed to have either
zero children or one child).
Re-balancing the tree after removing the correct node
follows a principle similar to what we did for adding—we must
detect imbalances along the path from the point where we
modified the structure of the tree (in the case of two-child
removal, this is the node we actually deleted from the tree—
which is further down than where we found the data we wanted
to remove) back up to the root and correct them by rotations.
The rotation operations are fundamentally the same; however,
we may have to perform multiple rotations as we progress up
the tree.
When we need to rotate, the question arises as to which
rotation we need to apply—a single rotation or a double
rotation. The first thing we need to know is whether the left
child or right child is the taller one. Then, we look at the taller
child’s children’s height (that is, the grandchildren of the
imbalanced node on the side of the taller child). If the left child
is taller, and the left-left grandchild is taller than or equal to the
left-right grandchild, then a single right rotation will correct the
problem. Likewise, if the right child is taller, and the right-right
grandchild is taller than or equal to the right-left grandchild,
then a single left rotation will correct the problem. Otherwise, a
double rotation is required.
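This rule translates fairly directly into code. The sketch below (names
are our own; it assumes each node stores its height, with height(NULL)
being -1) only decides which rotation to perform; actually performing it
uses the single and double rotations shown in Section 27.1:

```c
#include <stddef.h>

typedef struct node {
  int key;
  int height;
  struct node * left;
  struct node * right;
} node_t;

static int height(node_t * n) { return n == NULL ? -1 : n->height; }

typedef enum {
  SINGLE_RIGHT,        /* rotate right at the imbalanced node */
  SINGLE_LEFT,         /* rotate left at the imbalanced node */
  DOUBLE_LEFT_RIGHT,   /* rotate left at the left child, then right */
  DOUBLE_RIGHT_LEFT    /* rotate right at the right child, then left */
} rotation_t;

/* n is known to be imbalanced: decide which rotation repairs it by
 * comparing the grandchildren on the taller child's side. */
rotation_t choose_rotation(node_t * n) {
  if (height(n->left) > height(n->right)) {
    /* left child taller: compare left-left vs. left-right grandchild */
    if (height(n->left->left) >= height(n->left->right)) {
      return SINGLE_RIGHT;
    }
    return DOUBLE_LEFT_RIGHT;
  } else {
    /* right child taller: compare right-right vs. right-left grandchild */
    if (height(n->right->right) >= height(n->right->left)) {
      return SINGLE_LEFT;
    }
    return DOUBLE_RIGHT_LEFT;
  }
}
```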
While this rule may sound complicated, you can think of it
simply in terms of where it “looks like” you added to the tree—
if the tallest path goes left then left, it “looks like” you just added
to the tree down that path, so a similar rotation (single right)
will fix the problem. If the tallest path goes left then right, it
“looks like” you added along that path, so you need a double
rotation (left rotate, then right rotate).
Another important difference between adding and deleting
is that we have modified the tree by making it shorter, and a
rotation makes a sub-tree shorter as it restores balance.
Accordingly, we may fix a sub-tree but still have a balance
problem further up the tree. Fortunately, we can correct all
problems by checking and rotating as we return up from our
recursive calls.
Video 27.5: Deletion from an AVL tree.
Video 27.5 demonstrates deletion from an AVL tree.
27.3 Red-Black Insert
Another way to ensure O(log n) behavior is to use a red-
black tree. The rules for a red-black tree do not guarantee
balance in the same sense as an AVL tree—we may have nodes
whose heights differ by more than 1—however, they do ensure
that the maximum length of a path from the root to a leaf is
2·log₂(n + 1). These slightly looser rules mean that the paths
may be longer; however, what is important is that they
maintain a logarithmic bound.
You may wonder why we would want a red-black tree if
the path length guarantees are not as good. The answer is that
red-black trees are amenable to rotating as we follow the path
down the tree to find the insertion point. As we do not need to
perform work “on the way back up” the path (which makes
recursion quite handy), the red-black algorithm is quite
amenable to iterative implementations.
A red-black tree must maintain four invariants:
1. Every node is colored either red or black.
2. The root is black.
3. If a node is red, its children must be black.
4. Every path from the root to NULL must have the same
number of black nodes.
Notice that an important consequence of these rules is that
the longest path is at most twice as long as the shortest path. In
the worst case, the shortest path is all black nodes (let us
suppose there are k such nodes). The longest path must also
have exactly k black nodes on it (rule 4), but the number of
red nodes is limited by rule 3—we may have at most one red
node per black node along that path (if a node is red, its child
must be black, so we cannot have two reds next to each other
along the path). Therefore, we know that we have at most 2k
nodes along the longest path. This guarantee is clearly weaker
than AVL (where we know the path lengths differ by at most
one), but it is sufficient to guarantee O(log n) path length
(proving this fact is beyond the scope of this book and left to an
algorithm theory class).
We will note that we generally consider NULL nodes to be
black (as opposed to being a separate “not a node” entity), as it
simplifies reasoning about the trees—we can just talk about the
case of something being a black node versus “a black node or
NULL.”
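One way to internalize these rules is to write a checker for them. The
sketch below (our own types, not from any library) computes the black
count from a node down to NULL, treating NULL as black per the convention
just described, and reports a violation of rule 3 or rule 4 as -1:

```c
#include <stddef.h>

typedef enum { RED, BLACK } color_t;

typedef struct rb_node {
  int key;
  color_t color;
  struct rb_node * left;
  struct rb_node * right;
} rb_node_t;

/* Returns the number of black nodes on every path from n down to NULL
 * (counting NULL itself as one black node), or -1 if rule 3 or rule 4
 * is violated anywhere in this sub-tree. */
int black_height(rb_node_t * n) {
  if (n == NULL) {
    return 1;  /* NULL nodes are considered black */
  }
  if (n->color == RED) {
    /* rule 3: a red node's children must be black (NULL is black) */
    if ((n->left != NULL && n->left->color == RED) ||
        (n->right != NULL && n->right->color == RED)) {
      return -1;
    }
  }
  int lh = black_height(n->left);
  int rh = black_height(n->right);
  if (lh == -1 || rh == -1 || lh != rh) {
    return -1;  /* rule 4: black counts must match on every path */
  }
  return lh + (n->color == BLACK ? 1 : 0);
}
```

Running this checker on the root of the tree in Figure 27.1 (after also
confirming the root is black, rule 2) would verify all four invariants.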
Figure 27.1: An example red-black tree. Notice that all paths from the root to
NULL have three black nodes and no red node has a red child.
Figure 27.1 shows an example red-black tree. Take a
moment to look at the tree and convince yourself that it follows
all four rules. In particular, note how every possible path from
the root to NULL has exactly three black nodes (four if you
count the NULLs). You should also notice that this tree is not a
valid AVL tree.
When we add to a red-black tree, the node we add must be
red. If we were to add a black node, then we would have one
more black node along the path where we added the node than
along all other paths, violating rule 4. However, the tricky part
of adding is that the new node’s parent must be black—if it is
red, then we violate rule 3. To make sure we meet these
conditions, we rotate and recolor nodes as we traverse down
the tree looking for the insertion point and ensure that we will
be able to add a red node when we reach the right place.
As we traverse down the tree, we check if our current
node is a black node with two red children (of course, any node
with red children must be black). If both children are red, we
recolor the current node red, and the children black. Observe
that this transformation maintains rule 4 (as we keep the same
number of black nodes along the path; we just move where
they are), but may introduce violations of rule 3. Specifically, if
the current node’s parent is red, then we have introduced two
red nodes in a row. When such a problem happens, we fix it by
rotating (as we shall see shortly) and recoloring nodes.
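As a sketch in C (with a hypothetical rb_node_t of our own; real
implementations differ in their details), the recoloring step performed on
the way down looks like this:

```c
#include <stddef.h>

typedef enum { RED, BLACK } color_t;

typedef struct rb_node {
  int key;
  color_t color;
  struct rb_node * left;
  struct rb_node * right;
} rb_node_t;

/* If n is a black node with two red children, flip the colors: n becomes
 * red and its children become black. The black count of every path
 * through n is unchanged (one black node either way), but the caller must
 * check whether n's parent is also red, which would violate rule 3 and
 * require a rotation at n's grandparent. */
void maybe_color_flip(rb_node_t * n) {
  if (n != NULL && n->color == BLACK &&
      n->left != NULL && n->left->color == RED &&
      n->right != NULL && n->right->color == RED) {
    n->color = RED;
    n->left->color = BLACK;
    n->right->color = BLACK;
  }
}
```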
These transformations may also result in the root being
colored red, violating rule 2. However, this violation is the
easiest to fix: after we finish our insertion, we simply recolor
the root to black. We can always recolor the root from red to
black, as it sits on all possible paths from the root to NULL and
thus increases the black count of all paths equally. In fact,
recoloring the root from red to black is the only way that we
can increase the black count of the paths, as it is the only way
to increase them all at once.
Figure 27.2: Red-black tree rotation and recoloring.
When we transform a black node with two red children
into a red node with two black children, we might violate rule 3
of the red-black rules. Whenever this problem occurs, we fix it
by rotating using the same four rotations we used for AVL
trees. However, unlike an AVL tree, we are not shortening a
path where we have already added—we are preemptively
shortening a path where we are about to add, and in the process
we make sure we can add a red node.
Figure 27.2 illustrates this process. As we traverse down
the tree, we find that 70 has two red children (left side of the
figure). We recolor 70 to be red and its children to be black.
Observe that this transformation preserves the black count on
each path through the sub-tree we have shown—there are two
black nodes on each path plus however many are in each
triangle (which again represents an abstracted-out sub-tree).
Note that the black count for all paths through all four triangles
must be the same when we start for our tree to obey rule 4. In
this particular case, our recoloring (shown in the middle of the
figure) has introduced two red nodes in a row, violating rule 3
(it might not introduce such a violation, but it can). We
therefore need to perform a rotation at 70’s grandparent (10) to
correct the violation (right side of the figure). When we
perform this rotation, we again adjust the colors; however, now
we have ensured that all rules are obeyed.
There are a few important things to note about the setup in
Figure 27.2. First, it is important that 10’s left child be black. If
10’s left child were red, then the final picture would have two
red nodes in a row (10 and its left child). Fortunately, we never
have to deal with the case where 10’s left child is red. While
this may seem surprising, remember that any time we see a
black node with two red children, we recolor—if 10’s left child
had been red, we would have recolored its children to be black
and 10 to be red. Avoiding this problematic situation is exactly
why we perform this recoloring step as we go down the tree—it
ensures that when we need to perform such a rotation, the
resulting coloring will be legal. Next, notice that we need to
rotate from 70’s grandparent. A natural question would be what
we do if 70 does not have a grandparent—i.e., if 20 were the
root. Fortunately, this situation never comes up either, due to
rule 2—the root must always be black. Because the root is
always black, we never introduce a red-red violation if we turn
one of its children red.
Video 27.6: Insertion into a red-black tree.
Video 27.6 demonstrates the entire insertion process into a
red-black tree for a few items.
27.4 Red-Black Delete
Deleting from a red-black tree is easy if the node that we
remove from the tree is red—we can simply take it out of the
tree, and we are sure no rules were violated. We know we have
not affected the black count, as we have only deleted a red
node. We also know we have not introduced any red-red
violations, as the parent, as well as any children the node has,
must both be black, and it is perfectly fine for a black node to
have black children. However, the situation is more complex if
the node we want to delete is black—taking it out of the tree
reduces the black count along that particular path and requires
recoloring and rotating to restore the rules.
One good way to think about red-black deletion, which gives a
clear way of maintaining the rules, is Matt Might’s approach
of adding two additional temporary colors: “double black” and
“negative black” (temporary meaning they can only exist while
fixing the tree after a deletion, and must be removed before
the operation is complete). In this approach, a “double black”
node counts as +2 black nodes, while a “negative black” node
counts as -1 black node. The process begins by temporarily
coloring the NULL node at the deletion site “double black”
(recall: NULL nodes are normally treated as black), which
maintains the black count, although we obviously have to
remove the “double black” node before we can call the process
complete.
The “double black” node is pushed up the tree (towards
the root) by “increasing” the blackness of its parent, and
“decreasing” its own and its sibling’s blackness. Increasing the
blackness of a node transforms negative black (-1) into red (0),
red (0) into black (1), and black (1) into double black (2).
Decreasing acts exactly the opposite. Note that this transform
preserves the black count along the path (which we keep
correct at all times); however, it may introduce red-red
violations, or negative black nodes. These problems need to be
eliminated by rotation/recoloring appropriately. Note that if we
push the double black node up to the root, then we can simply
recolor the root from double black to black, decreasing the
black count of all paths as the tree shrinks.
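The “increase”/“decrease” operations are easy to express if we treat
blackness as a signed count, a representation we adopt here purely for
illustration (Might’s article uses explicit colors):

```c
/* Blackness as a signed count of black nodes:
 *   -1 = negative black, 0 = red, 1 = black, 2 = double black.
 * Pushing a double black node up then means making the parent blacker
 * while making the node and its sibling redder, which preserves the
 * total black count along each path. */
typedef int blackness_t;

blackness_t blacker(blackness_t b) { return b + 1; }  /* e.g., red -> black */
blackness_t redder(blackness_t b)  { return b - 1; }  /* e.g., black -> red */
```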
For full details, see Matt Might’s blog article on the
subject: http://matt.might.net/articles/red-black-
delete/.
Chapter 28
Concurrency
So far, we have learned about sequential programming—
programming in which execution proceeds from one statement
to the next in a sequential fashion, in a “one thing at a time”
way. However, an increasingly important skill is parallel
programming—writing programs where multiple things are
happening at the same time. Parallel programming is primarily
important from a performance standpoint—it offers no
functional benefits over sequential programming. That is, there
is no problem that we can solve with parallel programming that
we cannot solve with sequential programming. However, a
parallel program may solve that problem much more quickly.
Parallel programming is especially important today
because of the current trends in hardware design. For a long
time (basically until the early 2000s) the speed at which
commercial processors could execute sequential programs
increased exponentially. However, in the mid-2000s, processor
design began to shift towards multi-core chips for a variety of
reasons.1 These multi-core chips favor the ability to do multiple
“things” at the same time—either by executing parallel
programs or executing multiple processes. Future trends appear
to favor even more emphasis on parallelism.
Because of this emphasis on parallelism, aspiring software
developers should learn parallel programming. If you write a
sequential program and run it on a 4-core system or an 8-core
system, you will not get any performance benefits from the
additional cores on the larger system. To take advantage of the
extra performance potential of the more parallel system, the
programmer must explicitly parallelize her code.
Even with a single-core system, understanding
concurrency is important. Modern OSes run multiple programs
at “the same time”—meaning they actually switch between
them at a speed faster than humans can perceive. Furthermore,
if we delve into lower-level system programming (e.g.,
modifying the OS kernel), we will find that we need to be
aware of many similar issues not only due to the fact that the
OS handles multiple cores, but also because it must deal with
other parts of the system operating in parallel with the
processor.
As with many of the chapters in this book, we could write
an entire book on parallel programming, as it is a very large
topic. Likewise, it takes significant practice and effort to truly
become an expert at parallel programming. However, we will
introduce you to the basic concepts, so that you have at least
heard of them if you do not take a parallel programming course,
and are well positioned to dive right in to it if you do take such
a course.
28.1 Processes
One way to achieve parallelism is to run multiple processes at
the same time. A process is a running instance of a program.
This subtle distinction in terminology is important to precisely
discuss executing a program. As an example, consider emacs,
which is a program. When you type emacs (with whatever
arguments you need) at the command line, a new process gets
created to run the program emacs. At the same time, you could
open another terminal, and run another copy of emacs—
creating a second process (which is running the same program).
Each of these processes has its own execution arrow and
memory space (stack, heap, data segment, and code). If we
were to execute them by hand, we would execute each
independently. Each would start with its own execution arrow
at the start of its own copy of main, and with its own frame.
The two might even have different values of argc and argv, as
we might run them with different arguments. If we continue
execution of these programs, we might observe rather different
states at any time, as they are completely unrelated to each
other except that their code is the same. Neither of them would
ever be able to affect the state of the other.2
Each process also has its own process ID (called PID for
short), which is a number that identifies the process. Each PID
is unique at any given time, but they may be “recycled” after a
process exits. The PID gets used to identify which process you
are referring to when you make system calls that manipulate
processes.
Running multiple processes (whether for the same
program or different programs) is one way to exploit the
parallelism available in a multi-core system. However, for this
purpose, an interactive program such as emacs is a poor
example. Interactive programs spend most of their time waiting
for the user to type input, and thus do not need much processing
power. You could run multiple copies of emacs on a single core
system without noticing any slowdown.
28.1.1 Performance Benefits
A better example from a performance standpoint of a program
that you are familiar with would be gcc (especially if it is
compiling a program with significant optimization flags
enabled—and therefore executes a lot of code to analyze and
transform the input source code). This example is better, as gcc
does not interact with the user, and performs a lot of
computation. gcc is also a great example of how we can
exploit parallelism for performance with multiple processes. If
we were working on a large program with several source files,
we would set up our Makefile to compile3 each source file to an
object file, then link the object files. If we had 50 source files
(i.e. a modestly sized project), we would have 51 executions of
gcc to compile the entire program (50 compilations of the
source into objects, plus one to link).
If we were to look at the compilation tasks described
above, we would find that the 50 compilations of source files
into object files are all independent—they operate on
completely different inputs and outputs from each other. The
51st task (linking them all together) depends on the first 50
tasks. Because the first 50 tasks are independent, we could run
them in parallel. For example, if we have a system with 8 cores,
we might run 8 different gcc tasks at a time, one on each core.
In fact, make has an option specifically to run independent tasks
in parallel: the -j option asks make to run jobs in parallel. The
-j option can either be given with no argument, in which case
make will run all independent jobs in parallel, or with a number
specifying the maximum number of jobs to run at one time.
These 50 tasks would then take significantly less time to
run (as we are doing 8 at a time), and thus our compilation
would be much faster. The linking task has to wait for all of the
other compilation tasks to finish, and thus will not be executed
in parallel with any of these other tasks. We will note that when
we say “significantly less time” it is generally not “8x less
time” for a variety of reasons.
First, the tasks may take different amounts of time to
finish, and this may cause some load imbalance—we might
spend some time waiting for the last few tasks to run while we
have no other tasks we can run. Second, the parallel tasks may
contend for hardware resources. This contention may make
each task take longer to run in parallel than it would running
alone. However, even though each individual task is slower, the
fact we are accomplishing multiple of them at one time makes
the entire set of tasks complete more quickly. We will note that
discussing the details of hardware resource contention is
beyond the scope of this book, as you must learn a lot about
computer architecture to understand the details.
As an example, consider if we ran 50 tasks, which each
took 1 second by themselves, but take 2 seconds when running
8 of them in parallel and 1.2 seconds when running 2 in parallel.
If we run the 50 tasks sequentially, the job would take 50
seconds. If we run them with 8 in parallel, we would have 6
batches of 8 (48 jobs), which would take 12 seconds, followed
by 1 batch of 2 jobs, which would take 1.2 seconds. The total
execution time would then be 13.2 seconds, which is only 3.78x
faster than the sequential execution. The last two jobs running
by themselves in this example are an example of load
imbalance—we only get to run two jobs at a time, as opposed
to our desired eight.
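We can double-check the arithmetic of this example with a few lines of C:

```c
/* The worked example above: 50 one-second jobs, where a batch of 8
 * jobs takes 2.0 seconds and a batch of 2 jobs takes 1.2 seconds. */
double parallel_time(void) {
  double t = (48 / 8) * 2.0;  /* 6 full batches of 8 jobs: 12.0 seconds */
  t += 1.2;                   /* the final batch of 2 jobs */
  return t;                   /* 13.2 seconds total */
}

double speedup(void) {
  return 50.0 / parallel_time();  /* sequential / parallel: about 3.78x */
}
```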
28.1.2 fork, execve, and waitpid
As with all things in programming, there is nothing magical (or
that you cannot do yourself) about the way that make runs
multiple jobs in parallel. It creates new processes via the fork
system call, and then has each new process execute the
appropriate task (via the execve system call), and then waits for
jobs to complete (via the waitpid system call). Most of the rest
of what make does is a matter of parsing the input file, building
a dependency graph for the tasks that need to be performed, and
using the relevant graph algorithms to schedule the tasks.
The fork system call creates a new process (called the
child process) that is exactly like the original (called the parent
process) except for the return value of fork (0 in the child; the
child’s PID in the parent4), its process id (which is newly
allocated to it), its parent process id (which is the PID of the
process that created it), and resetting some OS-level resource
accounting. Memory5 and file descriptors are copied. The
copying of memory means that if one process writes to
memory, its own memory will be updated, but the other copy
will not. Likewise, for file descriptors, they will refer to the
same files, but changes to the descriptor state (e.g., seeking to a
different position in the stream) will not be reflected in the
other copy.
The execve system call replaces the program in the
currently running process with a newly loaded program
(specified by passing the path to the executable binary as the
first argument of the execve system call). If the child calls
execve right after forking, then the second copy of the parent’s
memory is destroyed and replaced with a memory image loaded
from the requested binary. Note that the other arguments of
execve specify the arguments (which end up being argv of the
new program) and the contents of the environment variables for
the new program.
The other system call we mentioned is waitpid, which
allows a parent process to wait for one or more of its child
processes to terminate. We note that the combination of these
three system calls is used by your shell whenever you run a
program to create a new process, execute the program you
requested in that process, and wait for that process to exit.
We are not going to delve too deeply into use of these
system calls. Instead, we provide you a brief introduction to
them, refer you to the man pages for details, and suggest a good
systems programming course.
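As a brief concrete sketch, here is the fork/execve/waitpid pattern in one
function (the name run_and_wait is our own). It does what a shell does for
each command you type: create a child process, replace the child’s
program, and wait for the child to exit:

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char ** environ;   /* current environment, passed along to execve */

/* Run a program in a child process and wait for it, returning the
 * child's exit status (or -1 on error). argv must be NULL-terminated,
 * with argv[0] holding the path to the executable binary. */
int run_and_wait(char * const argv[]) {
  pid_t pid = fork();
  if (pid == -1) {
    return -1;                       /* fork failed */
  }
  if (pid == 0) {
    /* child: replace this process's program with the requested one */
    execve(argv[0], argv, environ);
    perror("execve");                /* execve only returns on error */
    exit(EXIT_FAILURE);
  }
  /* parent: block until this particular child terminates */
  int status;
  if (waitpid(pid, &status, 0) == -1) {
    return -1;
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_and_wait on {"/bin/echo", "hello", NULL} would print
hello and return 0. See the man pages for fork(2), execve(2), and
waitpid(2) for the full details.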
28.1.3 Other Uses of fork
We can use fork to create multiple processes without using
execve to run a different program. If we are writing a server
that processes incoming requests of some sort (e.g. a web-
server), we could achieve parallelism between independent
requests in a relatively simple fashion by having the server
fork a new copy of itself to process a request. That copy would
then exit after processing the request. For example, our code
might look generally like this:
while (!exiting) {
  Request * r = getIncomingRequest();
  pid_t fork_result = fork();
  if (fork_result == -1) {
    //handle the error
  }
  else if (fork_result == 0) {
    handleRequest(r);
    exit(EXIT_SUCCESS);
  }
  else {
    //do whatever else the parent needs to do
    ....
  }
}
....
The fork call creates two independent programs. One of
these (specifically, the child process) calls handleRequest then
exits. The other enters the else at the bottom of the code
fragment, and does whatever other tasks this server needs to do.
These tasks can be accomplished in parallel with the child’s
processing of the request. In fact, the parent may go through the
entire loop and accept another request and fork another process
before the child finishes. For a real program, we would need to
be careful to limit the number of children we have at any time.
28.2 Threads
Parallelism from creating separate processes is largely limited
to the case where the parallel tasks do not need to
communicate. Some communication is possible by setting up
shared memory segments (via explicit system calls), but when
the parallel tasks need to communicate a non-trivial amount,
using multiple threads is a better choice than using separate
processes.
Threads are much like processes except that they share an
address space. If we create two threads (within the same
process), they have separate stacks and separate execution
arrows, but they share a heap, code segment, and global data
segment. Additionally, even though they have separate stacks,
because they are in the same address space, one thread can
access data in the other’s stack via pointers.
Notice that in the previous paragraph we described the two
threads sharing an address space as “within the same
process”—a process can have one or more threads within it.
When a process initially starts, it contains one thread, which
begins execution at the start of main as we have seen all along.
If the program is designed to use multiple threads, then the
code will execute library calls to spawn additional threads
within the same process—meaning they share the address
space.
There are a variety of thread libraries, but we will talk
specifically about pthreads—the POSIX thread library.
Pthreads are rather portable, and thus a good way to write
threaded code. If you use some other threading library, a lot of
the higher-level concepts will remain the same, but the
particular details will change. Whenever you write programs
using pthreads, you need to #include<pthread.h> and link
with the pthreads library by specifying -lpthread at the end of
your compilation command line (e.g., gcc -o myProgram
myProgram.c -lpthread).
We note that C++11 introduces std::thread along with
various other primitives required for parallel code. If you are
writing C++11, you should use these instead of the pthreads
library. However, we discuss threading in terms of pthreads, as
C++11 is not included in the main flow of the text (we discuss
it in Section E.6). Of course, if you learn the concepts of multi-
threaded code well, moving between pthreads and C++11’s
std::thread is quite easy.
28.2.1 Creating a Thread
In pthreads, we create a new thread with the pthread_create
function. This function takes four parameters. The first
parameter is a pointer to a pthread_t, which describes a thread.
The pthread_create function fills in this structure with
information about the newly created thread. We can use the
pthread_t to refer to the thread in other library calls. The
second argument lets you specify particular attributes of the
thread. You can pass in NULL for default attributes, which we
will do for these examples (we are not going to go into thread
attributes here).
The third argument to pthread_create specifies the entry
point for the new thread. Unlike the entire program, which
starts at main, there is no pre-defined entry point for other
threads. Instead, the thread that creates the new thread must
specify the function in which the new thread should start.
Whenever the entry function returns, the new thread exits. This
function is specified by passing in a function pointer, which
must have type void* (*) (void *)—that is, a pointer to a
function that takes a void * and returns a void *. These types
were chosen as they allow the thread to take and return pointers
to any type. The fourth argument of pthread_create specifies
what to pass into this function as its argument when that
function is called on the newly spawned thread.
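As a concrete sketch (function names are our own), here is pthread_create
in action; it also uses pthread_join, discussed in Section 28.2.3, to wait
for the new thread and collect its return value:

```c
#include <pthread.h>
#include <stdlib.h>

/* The entry point must have type void * (*)(void *). Here it sums
 * 1..n, where n arrives through the void * argument, and returns a
 * heap-allocated answer. (Returning a pointer into the thread's own
 * stack would be a bug: that stack is gone once the thread exits.) */
static void * sum_to(void * arg) {
  int n = *(int *)arg;
  long * answer = malloc(sizeof(*answer));
  *answer = 0;
  for (int i = 1; i <= n; i++) {
    *answer += i;
  }
  return answer;
}

/* Spawn a thread to compute the sum, wait for it, and return the result. */
long threaded_sum(int n) {
  pthread_t thr;
  void * result;
  /* NULL = default attributes; &n is the void * argument sum_to receives */
  if (pthread_create(&thr, NULL, sum_to, &n) != 0) {
    return -1;
  }
  pthread_join(thr, &result);   /* wait for the thread; collect its return */
  long answer = *(long *)result;
  free(result);
  return answer;
}
```

Remember to link with -lpthread when compiling this example.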
When you call pthread_create, a few things happen that
are quite different from anything we have seen before. First, a
new stack is created that is independent from the caller’s stack.6
Frames can be created and destroyed on one stack in an order
unrelated to the creation and destruction of frames on the other
stack. The new stack is created with a frame for the function
requested as the third argument of pthread_create, passing in
the argument as specified in the pthread_create call. We will
note that behind the scenes there are actually one or more other
frames that exist above this frame, which are part of the pthread
library—in much the same way that there are actually some
frames above main at the start of the program that are part of
the C library. These frames correspond to functions that allocate
space for the entire stack, call the requested function, and cause
the thread to exit after it returns.
The second unusual thing is that a second execution arrow
is created, at the start of the function that we specify as the
entry point of the new thread. The pthread library creates this
second execution arrow by making a system call to spawn a
new thread. The execution arrow actually originates somewhere
in the pthread library, which calls the requested function, as we
discussed above—however, for our purposes, we can reason
about the execution of threads by thinking of the execution
arrow appearing at the start of the requested function.
28.2.2 Parallel Execution
Once we have two execution arrows, we have to deal with the
question of which execution arrow we should move next.
Unfortunately, the answer is “we don’t actually know”—we
could move either one. This may seem incredibly odd, but is a
result of the fact that multithreaded programs are non-
deterministic—there is a valid set of behaviors rather than one
particular behavior we will observe for a specific input.
This non-determinism makes testing and debugging
multithreaded code much more difficult than testing and
debugging single-threaded code. The set of behaviors that we
can observe for a particular input might be quite large, and we
might test the code 1000 times for the same input, and see 1000
different behaviors, yet we might not have seen all possibilities.
This non-determinism arises from the way that the hardware executes the multiple threads.7 If the
threads are executing at the same time on different cores, the
cores are executing the instructions at the same time. However,
each logical step (e.g., line of code) may correspond to a
different number of instructions, and the cores may execute
those instructions at different rates for a variety of reasons that
you will understand once you take a hardware class.
Furthermore, the OS scheduler may de-schedule one thread (so
that some other program can use the core it was running on)
while leaving the other thread running. Accordingly, one thread
may make millions of steps while the other is not running at all.
28.2.3 Thread Exit
When a thread exits—either by returning from the function it
started in, or by calling pthread_exit—it stops running. Its
return value (either what the function returned, or the argument
passed to pthread_exit) must be kept available for another
thread to retrieve. Waiting for another thread to exit and
obtaining its return value is done by calling pthread_join,
passing in the pthread_t identifying the thread to wait for, and
a pointer to a void* to fill in with the return value. A thread can
call pthread_join to just wait for a thread to finish and ignore
its return value by passing in NULL for the second argument.
Note that the operation of waiting for a parallel task to
complete is called joining the other task, thus the name of the
library function.
When one thread calls pthread_join, it blocks—its
execution arrow stops advancing—until the thread it is joining
terminates. Joining a thread also allows for the pthreads library
to release resources associated with the thread. A thread can
also tell the pthread library that it will never join another thread
with pthread_detach, which allows the thread library to
release resources as soon as the thread exits.
Note that the main thread (the original one, whose entry
point is the main function) behaves differently from the other
threads in these regards. If main returns, the entire process
exits, terminating all of the threads inside of it. This difference
arises from the fact that main was not called by the pthreads
library (within a function that calls pthread_exit with the
return value of the thread’s initial function) but rather by the
part of the C library that calls exit with the return value of
main.
28.2.4 Video Example
Video 28.1: Basics of multithreaded execution.
As with all aspects of programming, it helps us to see the
execution of code involving multiple threads. Video 28.1 shows
the execution of a contrived program with pthreads. Note that
this program does not make good use of concurrency, but rather
is just intended to demonstrate the basic mechanics of creating
a thread, having parallel execution, and joining a thread.
It is also important to note that unlike examples we have
previously seen, there is no longer one single correct output (for
a given input) for the code in Video 28.1. As we discussed
above, multithreaded code is non-deterministic—we can choose
to advance either execution arrow at any given step. For this
particular program, there are 20 possible valid outputs that we
can obtain.
However, it is important to note that just because
behaviors are possible does not mean they are likely on a
particular system. When I test the code in Video 28.1 on my
laptop, and run it several times, I get the following output every
time:
main: 0
f: 1
main: 1
f: 1
main: 2
f: 2
However, I cannot rely on this order of execution. I could
run this code on my laptop and get completely different
behavior. Furthermore, if I take this code and run it on my
desktop (which has a different OS), I get the following output:
main: 0
main: 1
main: 2
f: 1
f: 1
f: 2
Of course, within each thread, the order of operations is
constrained by the normal rules of program execution—we
would never see main 1 before main 0 because they occur one
after the other in the same thread. For more sophisticated
programs, we would not only see more variation in ordering
between executions on the same computer, but also, the
complexities added to our programming process would be more
significant. If we had data structures shared between the
multiple threads, we would need to ensure that our code works
correctly when the threads access them at the same time.
28.2.5 Another Example
As a more useful example, imagine if you wanted to smooth an
image. We will not delve into the details here, but smoothing an
image basically involves iterating over the pixels, and, for each
pixel computing a color value for it that is some weighted
average of the color values of its neighboring pixels. That is, it
might generally look like this:
1 void smooth(image_t * dst, image_t * src) {
2 for (int y = 0; y < src->height; y++) {
3 for (int x = 0; x < src->width; x++) {
4 dst->data[y][x] = wavg_pixel(src, x, y);
5 }
6 }
7 }
The wavg_pixel function computes the weighted average of the
pixels centered around x,y in the source image, for whatever
weighting we want to use.
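This chapter does not give wavg_pixel's implementation; purely as an illustration, here is one plausible version, an equal-weight 3x3 average clamped at the image edges, using a simplified stand-in image_t whose pixels are single ints (the real pixel type and weighting may well differ):

```c
#include <stdlib.h>

// Simplified stand-in for the book's image type (an assumption for
// this sketch): one int per pixel, stored as data[y][x].
typedef struct {
  int width;
  int height;
  int ** data;
} image_t;

// One plausible wavg_pixel: the equal-weight average of the 3x3
// neighborhood around (x, y), skipping neighbors outside the image.
int wavg_pixel(image_t * src, int x, int y) {
  int sum = 0;
  int count = 0;
  for (int dy = -1; dy <= 1; dy++) {
    for (int dx = -1; dx <= 1; dx++) {
      int nx = x + dx;
      int ny = y + dy;
      if (nx >= 0 && nx < src->width && ny >= 0 && ny < src->height) {
        sum += src->data[ny][nx];
        count++;
      }
    }
  }
  return sum / count;
}
```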
This algorithm is a natural candidate for parallelization if
we need to improve its performance. The computations for each
pixel are independent of one another—we can do them in any
order. Even though multiple computations may read the same
values from the source image, all of the writes are to the
destination image. Accordingly, the non-determinism inherent
in multiple threads will not affect our final answer.
We can divide the work between threads in a variety of
ways (by rows, by columns, alternating one pixel then the
next); however, the best approach is to divide the work by
rows.8 If we have 2 threads and 1000 rows, we have one thread
process the first 500 rows, and the second thread process the
other 500 rows.
To parallelize this code, we need to restructure it slightly.
First, we need to rewrite the smoothing function so that it
accepts a void * argument, and can go from a start row to an
end row:
1 typedef struct _thr_arg {
2   image_t * src;
3   image_t * dst;
4   int startY;
5   int endY;
6 } thr_arg;
7
8 void * smoothThread(void * varg) {
9   thr_arg * arg = varg;
10   for (int y = arg->startY; y < arg->endY; y++) {
11     for (int x = 0; x < arg->src->width; x++) {
12       arg->dst->data[y][x] = wavg_pixel(arg->src, x, y);
13     }
14   }
15   free(arg);
16   return NULL;
17 }
Notice that now, smoothThread has the correct prototype to
serve as the entry point for a pthread. We pass all the
information we would normally pass as an argument to
smoothThread inside of a single struct, so that we can pass one
pointer to it. You should also notice that this function frees the
memory associated with its argument right before it returns.
This memory will be allocated by the main thread (to pass the
information in). However, the memory cannot be freed until the
“worker” thread finishes using it, making it much simpler to
have the worker thread free it.
We can then write a function that takes a source and
destination image, as well as a number of threads, spawns that
many threads to work on the pieces of the image, and then joins
each of the threads.
1 void smoothParallel(image_t * src, image_t * dst, int nThreads) {
2   int perThread = src->height / nThreads + 1;
3   int extras = src->height % nThreads;
4   int curr = 0;
5   pthread_t * threads = malloc(nThreads * sizeof(*threads));
6   for (int i = 0; i < nThreads; i++) {
7     if (i == extras) {
8       perThread--;
9     }
10     thr_arg * arg = malloc(sizeof(*arg));
11     arg->src = src;
12     arg->dst = dst;
13     arg->startY = curr;
14     arg->endY = curr + perThread;
15     curr += perThread;
16     pthread_create(&threads[i], NULL, smoothThread, arg);
17   }
18   for (int i = 0; i < nThreads; i++) {
19     pthread_join(threads[i], NULL);
20   }
21   free(threads);
22 }
This function loops, counting up to the number of requested
threads, allocates space for the information to pass to a thread,
fills in that information, then spawns a thread. Note that the
computation of src->height % nThreads and the if statement
at the start of the loop spread out the “leftover” lines after
division evenly between the first threads (e.g. if there were
1500 lines and 8 threads, the first four threads would do 188
lines, and the next four threads would do 187 lines).
After all the threads are spawned, this function joins each
thread, waiting for it to complete. We’ll note that technically
there are N+1 threads, one of which is the main thread and the
other N are the worker threads. However, the main thread just
spawns the threads and then waits for the workers, so it does
not do anything during the computation time. This approach
often makes the code a bit cleaner.
Figure 28.1: Speedups for our image smoothing algorithm.
Figure 28.1 shows the speedups obtained by 1, 2, 4, 8, and
16 threads for three sizes of images (256x256, 1024x1024, and
2048x2048). Each speedup is how much faster that particular
number of threads was compared to 1 thread on the same
problem size (so a y value of 4 means that it took one quarter
the time). In general, the best we expect to do is to achieve an
Nx speedup for N threads. 9
For few threads (i.e., fewer than 4) we obtain near ideal
speedups for all the problem sizes we explore: about 1.9x for 2
threads, and about 3.6x for 4 threads. However, at 8 threads, we
only continue to see significant performance improvements on
the larger problem sizes. The 256x256 problem size only sees
3.8x speedup for 8 threads—barely more than the 3.6x we saw
with 4 threads, but using twice the hardware resources. Here,
there is much less work to divide up, so the overhead of starting
each thread begins to be more significant. As we go to 16
threads, we continue to see almost no performance
improvement relative to 4 threads for the small problem size
(only a 4.6x speedup). The largest problem size (2048x2048) is
still doing quite well at 14x speedup. The medium problem size
has begun to flatten off, but is still receiving significant benefits
(12.5x speedups).
We stop this experiment at 16 threads, as that is what the
hardware supports (we would not expect to see any more
speedups past that, as the hardware cannot run any more
threads in parallel). However, if we had a machine with more
cores, we would expect to see the 256x256 problem size to
actually start having a lower speedup (as overheads dominate
and no more gains are possible). We would expect the
1024x1024 problem size to continue to flatten out before
eventually declining, and the 2048x2048 to continue to scale
almost ideal to at least 64 cores (as it is 4x larger than the
1024x1024 problem, which starts to flatten out at 16 cores)
before beginning to flatten out and eventually decline.
28.3 Synchronization
Our parallel image smoothing problem is a bit of an unusual
case in parallel programming, in that it is embarrassingly
parallel (often abbreviated “EP”)—we can parallelize it
relatively easily as there are readily apparent ways that we can
form completely independent tasks. However, many problems
are a bit trickier to parallelize, as the tasks are not completely
independent. Here, the tasks communicate with each other in
some way (i.e., one writes data that another reads or writes).
When there is communication between threads, the
programmer must (typically) use some form of synchronization
—a technique in which threads are forced to wait before
performing specific operations.
To see the need for synchronization, consider a program
designed to read multiple text files and count the total number
of occurrences of each word10 in all of the files. Here is a
sequential version of this program, which has a function that
uses a std::map to count how many occurrences of the string
are in the file:
1 typedef std::map<std::string, unsigned long> strCountMap;
2
3 void countStringsIn(std::istream & input,
4                     strCountMap & counts) {
5   std::string word;
6   while (input >> word) {
7     counts[word]++;
8   }
9 }
We might then write another function that loops over a
vector of input streams, and calls our countStringsIn function
for each of them, using the same map each time (so that the
counts are summed over all files):
1 strCountMap readFiles(std::vector<std::istream*> & inputs) {
2   std::vector<std::istream*>::iterator it = inputs.begin();
3   strCountMap answer;
4   while (it != inputs.end()) {
5     countStringsIn(**it, answer);
6     ++it;
7   }
8   return answer;
9 }
We note that this sequential version of the code works just fine. Now, we might try to parallelize this code by splitting the
input vector between multiple threads, much like we did with
the image smoothing example earlier. We might write a
structure (or class) for a thread argument:
1 class ThreadArg {
2 public:
3   const std::vector<std::istream*> * const vector;
4   strCountMap * const counts;
5   const size_t start;
6   const size_t end;
7   ThreadArg(const std::vector<std::istream*> * v,
8             strCountMap * c,
9             size_t s,
10             size_t e) : vector(v),
11                         counts(c),
12                         start(s),
13                         end(e) {}
14 };
Then we might write a function that works as a pthread11
entry point:
1 void * readFileThread(void * varg) {
2   ThreadArg * arg = (ThreadArg *) varg;
3   const std::vector<std::istream*> & v = *(arg->vector);
4   for (size_t i = arg->start; i < arg->end; i++) {
5     countStringsIn(*v[i], *arg->counts);
6   }
7   delete arg;
8   return NULL;
9 }
Then a function that spawns the threads:
1 strCountMap readFiles(std::vector<std::istream*> & inputs, int nThreads) {
2   strCountMap answer;
3   int perThread = inputs.size() / nThreads + 1;
4   int extras = inputs.size() % nThreads;
5   pthread_t * threads = new pthread_t[nThreads];
6   int curr = 0;
7   for (int i = 0; i < nThreads; i++) {
8     if (i == extras) {
9       perThread--;
10     }
11     ThreadArg * arg = new ThreadArg(&inputs,
12                                     &answer,
13                                     curr,
14                                     curr + perThread);
15     curr += perThread;
16     pthread_create(&threads[i], NULL, readFileThread, arg);
17   }
18   for (int i = 0; i < nThreads; i++) {
19     pthread_join(threads[i], NULL);
20   }
21   delete[] threads;
22   return answer;
23 }
If we write a main to test this code, we would find that we
get different answers when we run it on the same input—
meaning that at least some of the time we are miscounting.
Even worse, we might end up corrupting our map. In one test
execution, we obtained the following result from a main that
iterates over the map (using its iterator) and prints the
key/value pairs:
a = 218188
b = 203830
b = 1
c = 112603
d = 112045
d = 1
e = 48853
f = 48546
Notice that the map has two distinct b and two distinct d
keys—it certainly should not do that! The problem here is that
our program has data races—situations in which multiple
threads are accessing the same data, and the specific order in
which we advance the execution arrows of each thread affects
the results. To see the problem clearly, we will start with a
simpler function:
1 void * incrThread(void * varg) {
2 int * arg = varg;
3 for (int i = 0; i < 5000; i++) {
4 int temp = *arg;
5 temp++;
6 *arg = temp;
7 }
8 return NULL;
9 }
10
11 int main(void) {
12 int x = 0;
13 pthread_t thr;
14   pthread_create(&thr, NULL, incrThread, &x);
15 incrThread(&x);
16 pthread_join(thr, NULL);
17 printf("%d\n", x);
18 return EXIT_SUCCESS;
19 }
This code has a function that takes a pointer (as a void*,
but expects it to point to an int), and increments the integer
that it points to 5000 times. If we did not know about data
races, two threads executing this function each one time, with
the argument pointing at the same integer would result in
incrementing that variable 10,000 times (thus, we might expect
the output to always be 10000). However, if I run this program
three times, I get outputs of 6124, 6513, and 6814!
The problem here is the data race on reading *arg. If one
thread executes line 4 (reading *arg into temp) then the other
thread executes line 4 before the first thread executes line 6
(updating *arg with the incremented value), then both will read
and store the same value in that iteration of the loop. We say
that lines 4–6 form a critical section—a region of the program
that we must ensure that at most one thread’s execution arrow is
inside of at any given time.
Video 28.2 illustrates the data race in this code.12 We note
(*arg)++ does not solve the problem—it compiles into the
same instructions as what we wrote here (we just wrote it out in
three lines to make it clearer in the video).
Video 28.2: Illustration of data races in
multithreaded code.
Our earlier example with maps exhibited a similar
problem (with the increments), as well as with the map’s
internal check if an item exists before adding it in operator[].
Now that we understand data races and critical sections, we are
ready to see how synchronization can solve this problem.
28.3.1 Mutexes (Locks)
A mutual exclusion lock (commonly called a mutex, or just
lock) is a synchronization construct that is used to guard a
critical section. A mutex has two states: either locked or
unlocked, and supports two main operations, lock (also called
“acquire”) and unlock (also called “release”). If a thread
attempts to lock a mutex that is already locked, then it blocks
(recall: “blocks” means the execution arrow stops advancing)
until the mutex is unlocked by another thread. If the lock is
already unlocked, then the mutex switches to the locked state,
and the thread continues execution. A thread can unlock a
mutex it has already locked (unlocking a mutex the thread did
not lock results in undefined behavior).
We generally use one mutex to guard a particular piece of
data—i.e., we must lock the same mutex to enter all critical
sections that access that piece of data (we may use one mutex
to guard multiple pieces of data). Accordingly, programmers
will often idiomatically refer to locking the piece of data to
mean locking the mutex that they used to guard that piece of
data. For example, one might say “I had a bug in this function
because I forgot to lock x before I modified it!” when more
precisely one might have spoken about locking the mutex
guarding x. Programmers will also often refer to holding a lock
on a piece of data to mean having the mutex that guards that
piece of data locked. For example, one might say (or write in
documentation): “Be sure that you are holding a lock on foo
before you call bar.”
In pthreads, we use mutexes by first declaring a variable of
type pthread_mutex_t, and then initializing it with
pthread_mutex_init, which takes a pointer to the mutex
variable, and a pointer to the attributes of the mutex (NULL gives
you default attributes—we are not going to delve into the
various attributes you can use). The mutex is initialized to the
unlocked state.
Threads can then call pthread_mutex_lock to lock the
mutex, and pthread_mutex_unlock to unlock it. Both of these
functions take a pointer to the mutex that you want to
lock/unlock. Once all threads are done using a mutex, the
pthread_mutex_destroy function should be called to cleanup
the mutex.
We could fix our previous example by using a mutex to
guard the critical section:
1 pthread_mutex_t lock;
2
3 void * incrThread(void * varg) {
4 int * arg = varg;
5 for (int i = 0; i < 5000; i++) {
6 pthread_mutex_lock(&lock);
7 int temp = *arg;
8 temp++;
9 *arg = temp;
10 pthread_mutex_unlock(&lock);
11 }
12 return NULL;
13 }
14
15 int main(void) {
16 int x = 0;
17 pthread_t thr;
18 pthread_mutex_init(&lock, NULL);
19   pthread_create(&thr, NULL, incrThread, &x);
20 incrThread(&x);
21 pthread_join(thr, NULL);
22 pthread_mutex_destroy(&lock);
23 printf("%d\n", x);
24 return EXIT_SUCCESS;
25 }
This code is now correct (it will print 10000 every time it
runs). We can see how the mutex fixed this code by executing it
by hand, and remembering that when a thread’s execution
arrow reaches a pthread_mutex_lock call, it will block if the
mutex is already locked. Video 28.3 illustrates.
Video 28.3: Illustration of mutexes in
multithreaded code.
Even though our code is now correct, it is quite slow—on
my computer, it is roughly 200x slower13 than a sequential
version of the code (without any parallel threads at all)!
Remember that the whole point of using multiple threads in the
first place is to increase performance.
There are a couple problems here, performance-wise.
First, this contrived example does not actually have any parallel
work—remember that the way that threads obtain performance
is by allowing multiple cores to work on different parts of the
problem in parallel. Here, the entire computation is serially
dependent, so using threads is a terrible idea to start with.
You might think that lack of parallel work would leave us
no worse off (even if no better) than a sequential version of the
code; however, locking a mutex has a non-trivial overhead.
That overhead can be fairly significant if a thread running on
another core had the mutex locked last. The technical details of
why require a moderately sophisticated knowledge of hardware
(namely: cache coherence—the protocol that keeps data
consistent across multiple cores), but the simplified version is
that the thread must request the mutex’s data from the other
core and wait for it to be sent across the chip.
If we were to return to our less contrived problem of
reading multiple files in parallel and counting the sum of the
words in them, we might be able to get speedups from
parallelism. However, we must be careful. If we just lock a
mutex before counts[word]++ and unlock it after, like this:
1 typedef std::map<std::string, unsigned long> strCountMap;
2
3 pthread_mutex_t counts_lock; //initialized elsewhere
4
5 void countStringsIn(std::istream & input,
6                     strCountMap & counts) {
7   std::string word;
8   while (input >> word) {
9     pthread_mutex_lock (&counts_lock);
10     counts[word]++;
11     pthread_mutex_unlock (&counts_lock);
12   }
13 }
then we will not observe much speedup. Even though the
problem itself has parallelism, our implementation will
significantly serialize it—the threads will spend most of their
time waiting to lock the mutex. We will note that in a situation
such as this, we say that the mutex is heavily contended—many
threads are trying to acquire it at once. For a situation such as
this one, we can obtain speedups, but we need to make some
changes to the way we lock our data structures.
28.3.2 Locking Granularity
One way we can improve the performance of our parallel string
counting is to change our locking granularity—how large of a
piece of data we protect with a single lock. In the previous
example, we have one lock for the entire map of counts.
However, we could imagine using a data structure with finer-
grained locking. That is, we might lock pieces of the data
structure at a time, as long as we can do so safely.
The easiest data structure to think about fine-grained
locking on is a hashtable. We could imagine a hashtable with
one mutex per bucket. We would implement such a hashtable
with an array of mutexes (of the same size as the array of linked
lists), and then whenever we perform an operation, we would
lock the appropriate mutex before manipulating the
corresponding linked list. Operations that affect multiple rows
(such as rehashing) require a thread to acquire multiple locks
before they proceed.
This design allows for multiple operations to proceed in
parallel as long as they are to distinct buckets in the table.
Thread 0 can insert into bucket 12 while thread 1 looks up a
piece of data in bucket 47. The threads will only be serialized if
they access the same bucket, which is hopefully rare (especially
if the table is quite large). We could use this approach for our
map from strings to counts in the example we have been
discussing, but we must take care to design our data structure
with a good interface. If we just look up the item and return a
reference, there is nothing to prevent races on the modifications
to the value. That is, even if counts is a hashtable with proper
locking, this code still has a race:
1 int & countRef = counts[word];
2 countRef++;
Specifically, two threads may have a reference to the same
word’s count at the same time and have the problem from our
counting example.
The easiest way to support the required functionality with
a clean interface would be for the map’s interface to have a
function that takes a parameter of “what to do to the value”
while it holds the lock (e.g., by a function pointer or function
object).
We will note that finer-grained locking leads to better
scalability (more performance from more threads), but requires
more care from the programmer.
28.3.3 Reader/Writer Locks
Another way to improve the scalability of code while retaining
correctness is to use reader/writer locks. A reader/writer lock is
a special kind of mutex that a thread can lock for reading or
lock for writing. The reader/writer lock allows multiple threads
to simultaneously lock it for reading (also called “acquiring a
read lock”). However, only one thread can acquire a write lock
at any given time, and may not do so as long as any thread
holds a read lock. The idea behind this kind of lock is that it is
safe for multiple threads to access a piece of data as long as all
of them are reading it. It is only when a thread is writing the
data that it truly needs exclusive access.
To see a motivation for a reader/writer lock, let us return
to our hashtable that uses fine-grained locking. Suppose we
write the add method like this:
1 void add(K & key, V & value) {
2 int ind = hash(key);
3 ind = ind % numBuckets;
4 pthread_mutex_lock (&locks[ind]);
5 buckets[ind]->add (key, value);
6 pthread_mutex_unlock (&locks[ind]);
7 }
This add method looks fine (we hash the key, mod by the
number of buckets, lock the correct bucket, insert into that list,
then unlock that bucket); however, we need to be careful. In
particular, consider what happens if our hashtable also supports
a rehash operation. The rehash operation will change
numBuckets, and could be done by another thread. We therefore
must be careful with our access to numBuckets to ensure that
we do not have a data race. We could have a mutex that protects
numBuckets:
1 void add(K & key, V & value) {
2 int ind = hash(key);
3 pthread_mutex_lock(&bucketLock);
4 ind = ind % numBuckets;
5 pthread_mutex_lock (&locks[ind]);
6 pthread_mutex_unlock (&bucketLock);
7 buckets[ind]->add (key, value);
8 pthread_mutex_unlock (&locks[ind]);
9 }
However, now we will serialize all adds (and lookups, and
removes) on the bucketLock, which is likely to be highly
contended if we access the hashtable often. We will note that
unlocking the bucketLock after we acquire locks[ind] (but
before the next line) is correct as our rehash will need to
acquire locks[ind] before it begins modifying the table.
This situation is a great example of where we might want
to use a reader/writer lock, as our common operations (add,
lookup, remove) need to only read numBuckets, and our rehash
operation (which writes it) is rather uncommon.
1 void add(K & key, V & value) {
2 int ind = hash(key);
3   pthread_rwlock_rdlock(&bucketLock); //read lock
4 ind = ind % numBuckets;
5 pthread_mutex_lock (&locks[ind]);
6 buckets[ind]->add (key, value);
7 pthread_mutex_unlock (&locks[ind]);
8 pthread_rwlock_unlock (&bucketLock); //unlock
9 }
We will note that we moved the unlock of bucketLock
below the linked list add operation, as the read lock will not
block add, lookup, or remove operations, and it will simplify
our implementation of rehash (which can ensure no other
accesses to any rows by obtaining a write lock on bucketLock).
If we perform lookups much more frequently than table
modifications (which is often the case), then we might want to
use reader/writer locks for each bucket (that is, make locks[ind] into reader/writer locks).
If you need to use reader/writer locks, declare your lock
variable as a pthread_rwlock_t, and read the man pages for
the following functions:
1 pthread_rwlock_init
2 pthread_rwlock_rdlock
3 pthread_rwlock_wrlock
4 pthread_rwlock_unlock
5 pthread_rwlock_destroy
28.3.4 Deadlock
One issue that we must be careful of when we write a parallel
program is deadlock—when one or more threads cannot
advance at all because they are waiting for a locked mutex that
will never be unlocked. The simplest way to see how deadlock
can arise is to consider thread 0 executing the following code:
1 pthread_mutex_lock (&lockA);
2 pthread_mutex_lock (&lockB);
3 ... //whatever code is in our critical section
4 pthread_mutex_unlock (&lockA);
5 pthread_mutex_unlock (&lockB);
while thread 1 is executing the following code (in some other
function):
1 pthread_mutex_lock (&lockB);
2 pthread_mutex_lock (&lockA);
3 ... //whatever code is in our critical section
4 pthread_mutex_unlock (&lockB);
5 pthread_mutex_unlock (&lockA);
Suppose that each thread successfully acquires a lock in
line 1 of their respective pieces of code (thread 0 acquires
lockA and thread 1 acquires lockB). Both threads then get
“stuck” at line 2. Thread 0 blocks waiting for thread 1 to release
lockB. However, thread 1 will never release that lock, as it is
blocked waiting for thread 0 to release lockA. Both threads will
then wait forever for something that will never happen, and our
program will get stuck.
Deadlock often plagues programmers who are new to
parallel programming; however, careful discipline can help you
avoid this problem. One crucial step to avoiding deadlock is to
ensure that you always acquire locks in the same order. If one
function locks lockA then lockB, then every other function in
the program that requires both locks must lock lockA first and
lockB second. Most programmers follow a convention to
acquire locks in ascending order when there is a natural
ordering to them (e.g., increasing index in an array, alphabetical
order by variable name, etc).
Sometimes it may be appropriate to use
pthread_mutex_trylock—a function that is much like
pthread_mutex_lock, but which does not block if the mutex is
already locked. Instead, it returns 0 if it successfully locked the
lock, and non-zero otherwise. The program can then check this
return value to determine if the lock was acquired and proceed
accordingly. We will note that there are also pthread_rwlock_tryrdlock and pthread_rwlock_trywrlock for reader/writer locks. We will not go into any more detail here,
but you should at least be aware of them so you know what to
look for if you need more information in the future.
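For illustration (this helper function is our own, not part of pthreads), trylock can be used to acquire two locks without risking the deadlock above: if the second lock is unavailable, release the first and start over, so a thread never blocks while holding one of them.

```c
#include <pthread.h>

pthread_mutex_t lockA = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t lockB = PTHREAD_MUTEX_INITIALIZER;

// Acquire both locks without the hold-and-wait pattern that causes
// deadlock: if lockB is unavailable, release lockA and retry.
void lockBoth(void) {
  for (;;) {
    pthread_mutex_lock(&lockA);
    if (pthread_mutex_trylock(&lockB) == 0) {
      return;  // 0 means trylock succeeded: we now hold both locks
    }
    pthread_mutex_unlock(&lockA);  // back off and try again
  }
}

void unlockBoth(void) {
  pthread_mutex_unlock(&lockB);
  pthread_mutex_unlock(&lockA);
}
```

Note that this approach trades deadlock for possible livelock under heavy contention; acquiring locks in a fixed order, as described above, is usually the simpler discipline.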
28.3.5 Condition Variables
Sometimes a programmer wants a thread to wait for a particular
condition to become true, but cannot do so while holding the
mutex that protects that particular piece of data. For example,
suppose we have a Queue that threads use to communicate (one
places work into the Queue, and another dequeues work from
the Queue). A naïve programmer might write:
1 //broken
2 WorkItem * getWork(Queue * myQueue) {
3 pthread_mutex_lock(&myQueue->lock);
4 while (myQueue->isEmpty()) {
5 ; //busy wait
6 }
7 WorkItem * answer = myQueue->dequeue();
8 pthread_mutex_unlock(&myQueue->lock);
9 return answer;
10 }
At first glance, this function may seem fine; however, let us
consider what happens when the queue is empty. The thread
trying to obtain work will lock the mutex for the queue, then
busy wait—executing a loop that does nothing until some
condition is met—until the queue is not empty. However, if we
think about how the queue will become non-empty, we will see
the problem. Some other thread will call a function that looks
like the following:
1 void addWork(Queue * myQueue, WorkItem * w) {
2 pthread_mutex_lock(&myQueue->lock);
3 myQueue->enqueue(w);
4 pthread_mutex_unlock(&myQueue->lock);
5 }
This function will enqueue a piece of work into the queue;
however, it must acquire the mutex lock for the queue in order
to do so. Meanwhile, the other thread is holding that lock already,
waiting for the queue to become non-empty, which won’t
happen until this thread can acquire the lock! This example
leads us to another important consideration in avoiding
deadlock: we cannot busy wait while we hold a mutex if some
other thread must acquire that mutex to satisfy the condition we
are busy waiting on.
We could “fix” this by unlocking the mutex, doing
something that takes some time, then re-locking the mutex, like
this:
1 //not a good solution
2 WorkItem * getWork(Queue * myQueue) {
3 pthread_mutex_lock(&myQueue->lock);
4 while (myQueue->isEmpty()) {
5 pthread_mutex_unlock(&myQueue->lock);
6 //do something that takes some time?
7 pthread_mutex_lock(&myQueue->lock);
8 }
9 WorkItem * answer = myQueue->dequeue();
10 pthread_mutex_unlock(&myQueue->lock);
11 return answer;
12 }
However, this solution is not good from a performance
standpoint. Our thread that is waiting for the condition is
locking and unlocking the mutex repeatedly. If multiple threads
are waiting for the queue to become non-empty, they will be
contending for the mutex (moving its data around the system,
degrading performance) even though they are doing nothing
with it.
Here, what we want is a condition variable—a
synchronization construct that supports the operations wait,
signal, and broadcast. The wait operation blocks the thread
until some other thread does a signal or broadcast. A signal
operation unblocks one thread that is waiting, while a
broadcast unblocks all threads that are waiting. One key
aspect of the wait operation is that it takes a mutex and releases
it in such a way that there is no race with the thread starting to
wait (there is no risk of the mutex being released, another
thread signaling, and then the thread starting to wait).
A correct implementation of our getWork function would
use a condition variable like this:
1 WorkItem * getWork(Queue * myQueue) {
2 pthread_mutex_lock(&myQueue->lock);
3 while (myQueue->isEmpty()) {
4 pthread_cond_wait(&myQueue->cv,
5 &myQueue->lock);
6 }
7 WorkItem * answer = myQueue->dequeue();
8 pthread_mutex_unlock(&myQueue->lock);
9 return answer;
10 }
Notice that we must wrap the wait operation
(pthread_cond_wait in pthreads) in a while loop that rechecks
the condition. Even though there is no race between the release
of the mutex and the thread starting to wait (meaning no
signals get lost), another thread may acquire the mutex before
this thread reacquires it (which is part of the wait operation).
That is, thread 0 may wait, thread 1 may enqueue work (and
signal thread 0, waking it up), and thread 2 may dequeue the
work before thread 0 reacquires the mutex and dequeues the
work. When such a situation happens, thread 0 must wait
again.
We then modify our addWork so that it signals
(pthread_cond_signal in pthreads) after it enqueues work.
This change will cause one thread that is waiting to be “woken
up” (unblocked):
1 void addWork(Queue * myQueue, WorkItem * w) {
2 pthread_mutex_lock(&myQueue->lock);
3 myQueue->enqueue(w);
4 pthread_cond_signal(&myQueue->cv);
5 pthread_mutex_unlock(&myQueue->lock);
6 }
We note that addWork uses signal, as it should wake up
at most one waiting thread. If we needed to wake up all
waiting threads, we would use broadcast (in pthreads,
pthread_cond_broadcast) instead.
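For completeness, here is one possible layout of the Queue assumed in these examples. This is a sketch with hypothetical field names (only lock and cv appear in the code above), including initialization of the mutex and condition variable:

```cpp
#include <pthread.h>

// A sketch of the Queue assumed above: a fixed-size ring buffer of
// work items. The lock and cv fields match the code in the text; the
// other names are our own. For brevity, this sketch assumes the
// queue never holds more than 16 items.
struct WorkItem { int id; };

struct Queue {
  pthread_mutex_t lock;
  pthread_cond_t cv;
  WorkItem * items[16];
  int head, tail, count;
  bool isEmpty() { return count == 0; }
  void enqueue(WorkItem * w) {
    items[tail] = w;
    tail = (tail + 1) % 16;
    count++;
  }
  WorkItem * dequeue() {
    WorkItem * w = items[head];
    head = (head + 1) % 16;
    count--;
    return w;
  }
};

void queue_init(Queue * q) {
  pthread_mutex_init(&q->lock, NULL);
  pthread_cond_init(&q->cv, NULL);
  q->head = q->tail = q->count = 0;
}
```

With this initialization in place, the getWork and addWork functions above can operate on the Queue as written.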
28.3.6 Barriers
Another common synchronization construct is a barrier—a
construct that requires a certain number of threads to reach it
before any may proceed. To use a barrier, a programmer first
initializes the barrier, indicating how many threads will
participate in the barrier (in pthreads, pthread_barrier_init).
Then the threads perform some computation, and when they
finish and must synchronize with each other, they wait on the
barrier (pthread_barrier_wait). As threads wait on the
barrier, they will block until the required number of threads
reach the barrier. At that point, all threads will proceed past the
barrier. In pthreads, the pthread_barrier_wait call returns a
value that distinguishes one thread from all of the others, in
case the programmer needs to have one thread take special
actions.
Barriers see significant use in scientific computing, where
one phase of computation must be completed before another
phase begins. For example, many scientific simulations model
physical systems (weather, molecular dynamics, …) in time
steps. Such a system might have parallel computation within a
time step, then have a barrier at the end of each time step. The
barrier requires all threads to finish time step N before any
thread begins the computations for time step N+1.
We note that barriers are an optional part of pthreads, so
not all implementations will support them.
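As a sketch of the time-step pattern just described (assuming a pthreads implementation that does support barriers; the worker body is a placeholder):

```cpp
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
pthread_barrier_t step_barrier;

// Each worker computes its share of a time step, then waits at the
// barrier so no thread starts step N+1 before all finish step N.
void * worker(void * arg) {
  (void) arg;
  for (int step = 0; step < 3; step++) {
    // ... this thread's share of the computation for this step ...
    int rc = pthread_barrier_wait(&step_barrier);
    if (rc == PTHREAD_BARRIER_SERIAL_THREAD) {
      // Exactly one thread receives this return value, in case
      // some action should happen once per step.
      printf("all threads finished step %d\n", step);
    }
  }
  return NULL;
}

int run_simulation(void) {
  pthread_t threads[NTHREADS];
  pthread_barrier_init(&step_barrier, NULL, NTHREADS);
  for (int i = 0; i < NTHREADS; i++) {
    pthread_create(&threads[i], NULL, worker, NULL);
  }
  for (int i = 0; i < NTHREADS; i++) {
    pthread_join(threads[i], NULL);
  }
  pthread_barrier_destroy(&step_barrier);
  return 0;
}
```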
28.4 Atomic Primitives
If you think carefully for a moment about how synchronization
constructs are implemented, you will realize that there is a
tricky difficulty. To put that another way, suppose you
were implementing the pthreads library: how would you
write pthread_mutex_lock? Our first attempt might be
something like this:
1 //This code does not work!
2 typedef int mutex_t;
3
4 void mutex_lock(mutex_t * lock) {
5 while(*lock != 0) {
6 ; //busy wait
7 }
8 *lock = 1;
9 }
At first glance, this code has the right general idea. We can
use an int for the state of the mutex with 0 meaning unlocked
and 1 meaning locked. We then check if the lock is unlocked,
and if not, wait until it is (the while loop on lines 5–7). Once
the lock is unlocked, we lock it by setting its value to 1. We
note that the actual pthread_mutex_lock function returns an
int as it checks for error conditions.
However, we now have a data race on the mutex itself!
Two threads could check the status of the same mutex (on line
5), see that it is unlocked, and proceed to line 8. Both threads
would then execute line 8 (setting the value of the mutex to 1
twice), and enter the critical section at the same time. Our first
reaction might be to try to have some mutex lock to protect the
critical section within our locking code (lines 5–8); however,
this approach fails. We cannot use recursion here, as we would
not be trying to solve a smaller instance of the problem.
To actually solve this problem, we need to make use of
primitives that will read and update data atomically—in such a
way that the hardware guarantees there are no races between
the read and the write. There are many different types of atomic
primitives that can be used to implement higher-level
synchronization constructs. We will cover a few of the most
common ones, as well as see how atomic operations can be
useful for other purposes.
28.4.1 Test-and-set
One atomic operation that we can use to implement a mutex is
atomic test-and-set (TAS). As its name suggests, test-and-set
allows us to atomically test the value in a memory location and
set it to 1. That is, it has the same effect as the following code,
except that the read of *ptr on line 2 and the write of *ptr on
line 3 are guaranteed to be atomic (no other thread can change
the value of *ptr in between them).
1 int test_and_set(int * ptr) {
2 int x = *ptr;
3 *ptr = 1;
4 return x;
5 }
If we have an atomic test-and-set operation, we can use it
to (maybe) correctly implement the lock operation for a mutex:
1 //This code works, but is not good...
2 typedef int mutex_t;
3
4 void mutex_lock(mutex_t * lock) {
5 while(test_and_set(lock) != 0) {
6 ; //busy wait
7 }
8 }
The reason we say “(maybe)” correctly is that the
correctness of this code actually depends on some subtle details
of the test_and_set, which are not apparent in our high-level
description. These details revolve around a fairly complex
topic, which requires significant hardware understanding, but
roughly boils down to the fact that the hardware is generally
allowed to reorder memory operations (loads and stores) to
improve performance (within some certain rules, called the
memory consistency model). If our hardware is allowed to
reorder stores relative to the test-and-set operation (i.e., take
stores that appear in the code after the TAS and execute them
before the TAS completes), then our mutex lock is not correct,
as the hardware may perform stores from inside the critical
section before the lock is acquired! For the rest of this
discussion, we are going to assume that we are talking about
atomic operations from the x86 ISA, which prevent the
hardware from reordering around them. If that all sounds a bit
complex, let it underscore our point that you really need to
understand hardware to do parallel programming well.
We will also note that gcc has an
__atomic_test_and_set; however, we are not using it in our
examples as it requires a parameter related to the required
memory model of the operation, and we do not intend to delve
into that topic here.
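If we are willing to rely on gcc/clang's older __sync builtins (which take no memory-model parameter and include the needed ordering guarantees), one way to realize test_and_set and the corresponding lock is:

```cpp
// Sketch using gcc/clang's legacy __sync builtins, which do not take
// a memory-model parameter. __sync_lock_test_and_set atomically
// stores 1 and returns the old value, with acquire semantics
// (sufficient for taking a lock).
int test_and_set(int * ptr) {
  return __sync_lock_test_and_set(ptr, 1);
}

typedef int mutex_t;

void mutex_lock(mutex_t * lock) {
  while (test_and_set(lock) != 0) {
    ; // busy wait
  }
}

void mutex_unlock(mutex_t * lock) {
  // The matching release builtin writes 0 with release semantics.
  __sync_lock_release(lock);
}
```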
We can improve further on our test-and-set
implementation by using an idiom called test-and-test-and-set.
The idea here is to test the lock with a non-atomic operation,
then only use the atomic operation when there is a possibility
we might actually acquire the lock. The motivation behind this
design arises from the fact that test-and-set requires the
processor to obtain exclusive permission to the data—meaning
no other processor may have a copy, so that the processor that
does have a copy may safely modify it—so the data must be removed
from all other processors. This operation is a bit expensive, and
system performance will degrade if many threads contend on
the lock. We can write a test-and-test-and-set lock like this:
1 //Better
2 typedef int mutex_t;
3
4 void mutex_lock(mutex_t * lock) {
5 do{
6 while (*lock != 0) {
7 ;
8 }
9 } while(test_and_set(lock) != 0);
10 }
We note that we could potentially make one more
improvement to this code by not having the processor work so
hard in the inner loop. Some hardware supports a notion of
asking the processor to wait just a little bit via a special
instruction (in x86, this instruction is pause). If we do not care
about portability, we might improve on this code further in
this way:
1 //Best (if we only need x86)
2 typedef int mutex_t;
3
4 void mutex_lock(mutex_t * lock) {
5 do{
6 while (*lock != 0) {
7 //tell the compiler to produce
8 //a specific assembly instruction here
9 asm volatile("pause\n");
10 }
11 } while(test_and_set(lock) != 0);
12 }
We will note that the actual code in pthreads is rather more
complex than this example, but this implementation of a lock
would be functionally correct and perform reasonably well.
28.4.2 Compare-and-swap
A more general atomic primitive is compare-and-swap (CAS),
which compares the value in a memory location to an expected
value. If the value in the memory location matches, then the
memory location is updated with a new value. If the values do
not match, then no update is made to memory. The entire CAS
operation evaluates to the value that was read from memory.
That is, compare and swap does the following atomically:
1 int compare_and_swap(int * location, int expected_old, int new_val) {
2 int temp = *location;
3 if (temp == expected_old) {
4 *location = new_val;
5 }
6 return temp;
7 }
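This pseudocode corresponds to real atomic operations; for instance, on gcc/clang, the __sync_val_compare_and_swap builtin performs exactly this operation atomically (and acts as a full memory barrier):

```cpp
// compare_and_swap realized with gcc/clang's builtin, which
// atomically performs the compare-and-update and returns the value
// the location held before the operation.
int compare_and_swap(int * location, int expected_old, int new_val) {
  return __sync_val_compare_and_swap(location, expected_old, new_val);
}
```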
We can use this primitive to implement the lock operation
for a mutex in much the same way as we did with TAS.
Similarly, we can make the same improvements to our CAS-
based lock that we made to our TAS-based lock. We show
only the best version here, with the TAS replaced by CAS:
1 typedef int mutex_t;
2
3 void mutex_lock(mutex_t * lock) {
4 do{
5 while (*lock != 0) {
6 //tell the compiler to produce
7 //a specific assembly instruction here
8 asm volatile("pause\n");
9 }
10 } while(compare_and_swap(lock,0,1) != 0);
11 }
28.4.3 Load-linked/Store-conditional
Some instruction sets are based on the idea of having only
simple operations, and the idea of having atomic operations
(which both read and write memory) goes against their design
philosophy. However, these instruction sets still need to support
parallel programming, and thus must have some primitives that
mutexes can be built from.
Instead of atomic operations, these ISAs support the
concept of load-linked and store-conditional. These two
instructions work together to build operations that behave
atomically. The load-linked instruction reads a value from
memory and asks the hardware to watch that memory address.
If, subsequently, another thread stores to that memory location,
the hardware will remember that fact. The store-conditional
instruction then writes to that memory location only if it has not
been changed (in which case, we say that store-conditional
succeeds). If the memory location has been modified, the store-
conditional fails. The program can check the success or failure
of the store-conditional operation, and retry the operation if
needed.
For example, if our hardware supported load-linked/store-
conditional, and we wanted to implement test-and-set, we could
do so as follows (load_linked and store_conditional would
be instructions, not functions—but we are not delving into
assembly programming, nor tying ourselves to a specific ISA):
1 int test_and_set(int * ptr) {
2 int temp;
3 do {
4 temp = load_linked(ptr);
5 } while(store_conditional(ptr, 1) != SUCCESS);
6 return temp;
7 }
28.4.4 Atomic Increment
Some instruction sets support operations such as atomic
increment—which atomically reads a value from memory, adds
1, and stores it back to the same memory location. If
incrementing a variable is all that you need, the atomic
increment operation is generally much more efficient than
acquiring a lock, performing the increment, and releasing the
lock (on my computer, about 5x difference in performance).
There may also be “atomic fetch-and-XXX” operations, where
XXX is some operation such as add, subtract, etc. Exactly what
is supported depends on the instruction set.
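On gcc/clang, the fetch-and-add flavor is available as the __sync_fetch_and_add builtin; a minimal sketch (the function name is ours):

```cpp
// Atomic increment via gcc/clang's __sync_fetch_and_add, which
// atomically adds the second argument to *ptr and returns the value
// *ptr held before the addition.
long atomic_increment(long * ptr) {
  return __sync_fetch_and_add(ptr, 1);
}
```

Many threads can call atomic_increment on the same counter with no lock; each call observes a distinct old value.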
Atomic increment operations still have some serializing
effect on the code. Although we can just execute them as one
atomic step when executing by hand, in a real system the data
must move from one core to another. This movement takes
time, and thus degrades the performance of the threads waiting
for it to move.
28.5 Lock Free Data Structures
We have already discussed how locks are an impediment to
performance in parallel code—if threads spend significant time
waiting for contended locks, then they do not provide as much
performance benefit as we would hope for. This problem
typically gets worse as you attempt to parallelize over more
threads, as the expected time waiting for locks increases.
Consequently, parallel programmers have contemplated lock
free data structures—data structures that are capable of
operating correctly when multiple threads access them
simultaneously, but which do not need locks to provide that
correctness.
A variety of data structures can be implemented in a lock
free fashion. Most rely on atomic primitives (most frequently
atomic compare-and-swap) to manipulate the data structure.
Generally, these data structures work best when there are not
many racing writes, as they frequently are structured so that the
writer that loses the race must retry its write operation (after it
detects that its CAS has failed).
For example, if we wanted to make a lock free stack, we
might write a push operation that looks something like this:
1 //maybe right, likely an issue
2 void push(const T & x) {
3 Node * n = new Node(x);
4 do {
5 n->next = top;
6 }
7 while(compare_and_swap(&top, n->next, n) != n->next);
8 }
Notice that we create a node with the data that we want,
and then repeatedly try to make that node the top of the stack
until we succeed. If there are no other racing updates, the CAS
will succeed, and set top = n. However, if the CAS fails (that
is, if top != n->next—which can only happen if another
thread changed top in between the statements), then the code
will go through the do-while loop another time, retrying the
insertion.
This implementation has one subtle issue: the new operator
itself is very likely implemented with a lock internally. That is,
when new attempts to allocate memory, it almost certainly does
so by locking a list of free blocks in the heap, allocating from
that list, then unlocking the free list. There are thread-friendly
memory allocation libraries (which keep pools of memory local
to each thread, and allow them to allocate from that pool
without a lock—then lock a global free list only if the local
pool runs out), so the above implementation could work if we
can be sure that new is lock free at least most of the time.
However, if we cannot be sure of that, we may need to
keep our own pool of allocated nodes, and access it in a lock-
free way. We may wish to do this anyway, as we do not need
the general purpose behavior of new, and thus can reduce the
performance overheads of allocation if we implement it well.
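As a point of comparison (not the book's implementation), C++11's std::atomic provides CAS directly; its compare_exchange_weak writes the freshly observed value of top back into its first argument on failure, which makes the retry loop compact. A sketch of push in that style (the peek helper is hypothetical, added only so the sketch can be exercised):

```cpp
#include <atomic>

// A sketch of the lock free push using C++11 std::atomic. Note that
// new may still take a lock internally, as discussed above.
template<typename T>
class LockFreeStack {
  struct Node {
    T data;
    Node * next;
    Node(const T & x) : data(x), next(nullptr) {}
  };
  std::atomic<Node *> top;
public:
  LockFreeStack() : top(nullptr) {}
  void push(const T & x) {
    Node * n = new Node(x);
    n->next = top.load();
    // On failure, compare_exchange_weak stores the current value of
    // top into n->next, so the loop body is just the retry itself.
    while (!top.compare_exchange_weak(n->next, n)) {
      ; // retry
    }
  }
  // Hypothetical helper for illustration: read the top element.
  bool peek(T & out) const {
    Node * t = top.load();
    if (t == nullptr) { return false; }
    out = t->data;
    return true;
  }
};
```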
We run into a slightly more difficult problem when we try
to pop from the stack. To see why, let us consider a reasonable
looking solution:
1 //broken
2 T pop() {
3 Node * n;
4 do {
5 n = top;
6 if (n == NULL) {
7 throw empty();
8 }
9 } while(compare_and_swap(&top, n, n->next) != n);
10 T ans = n->data;
11 delete n;
12 return ans;
13 }
Here, we try to get the top of the stack (and if the stack
becomes empty, we throw an exception). We again use CAS to
swap that node out of the stack (setting top = n->next if-and-
only-if top is still equal to n), then we extract the data, free the
memory for the node, and return our answer.
At first glance, this seems fine (except that delete likely
has a lock too)—however, we have missed the fact that we can
have a race condition with respect to freeing the memory, then
using a dangling pointer on another thread! That is, suppose
that two threads both enter the pop method at the same time.
Both of them reach line 5 and set n to point at the particular
node that is the top of the stack right now (let us call it node1).
Thread 0 then executes the if, CAS, assignment to answer, and
delete operation (freeing the memory for node1) before thread 1
does anything else. Now when thread 1 tries to perform the CAS
operation, its access to n->next is not valid, as memory for
node1 (which is what n points at) has been freed.
If you are paying extremely careful attention, you might
now think “but the CAS would fail, and we would not actually
use the value in n->next, so it does not matter if it is garbage.”
While that is true, it is not sufficient for our data structure to be
correct. For one thing, the access to n->next itself might
segfault. However, we also run into other ways that this race
could corrupt our program. Suppose that when thread 0 and
thread 1 race to pop node1, node1’s next is node2. Suppose that
thread 1 has read n->next, and has passed a pointer to node2 as
the third argument of CAS. While thread 1 is calling CAS,
thread 2 pushes an item on the stack, and the memory allocator
returns the same memory that was previously used by node1
(which has now been freed). That node is then pushed onto the
stack, with node3 as its next (maybe node3 was pushed in the
interim, or node2 was popped). Now, the CAS will succeed, but
the state of the list will be incorrect—we will have either
dropped some items that should be included, or re-inserted
nodes that have been popped (and freed—and possibly
reallocated).
Correctly freeing memory in a lock free data structure is
generally a bit tricky. We do not want to lock the data structure
to do so (as that defeats the point), but we also cannot afford to
have a race in which we corrupt the data structure as we just
described. We basically have to build a list of to-be-freed nodes
(using a similar technique as our “push” operation above), then
we can remove and re-use (or delete) them when we can be
sure that no threads are in the middle of any operations that
might reference them. We could accomplish this by ensuring
that every other thread has completed at least one operation
since we removed a particular node; by periodically locking
the data structure to clean up the memory (e.g., we might use
reader/writer locks, typically taking the read lock, but taking
the write lock to clean up); or by having explicit cleanup
operations, which we use at places in the program where we
know it is safe to do so.
Writing lock free data structures is (obviously) a rather
complex topic, and, as you may have guessed by now, requires
some understanding of hardware. However, you should be
aware of what they are, generally how they work, and that there
exist libraries that implement them. You may be able to
accomplish what you need with such libraries.
28.6 Parallel Programming Idioms
There are some common ways in which programs are
parallelized. We briefly introduce a few of these common
parallel programming idioms here.
28.6.1 Data Parallelism
Perhaps the most straightforward way to parallelize a program
occurs when different data elements can be processed in an
independent fashion. We already saw this approach in our
image smoothing example in Section 28.2.5. There, we could
compute each output pixel independently of every other output
pixel. Accordingly, we could divide the work up along those
data elements, and have quite scalable code.
We will note that data parallelism can also be exploited
with vector instructions—instructions that perform the same
operation on multiple pieces of data at one time. You can use
vector instructions with or without threading—they are
basically orthogonal to how you would design your algorithm.
Performance interactions go beyond the scope of this book, as
they require significant understanding of hardware.
Vector instructions are available on most modern
processors. On x86, they are part of the SSE instruction set
extensions, which are common on most commercial x86 cores.
gcc will try to use vector instructions in the cases where it can
determine that it is safe to do so, if you enable optimizations at
level -O3 (or explicitly turn on the specific optimizations). You
can also write explicit vector data types with extensions (see
https://gcc.gnu.org/onlinedocs/gcc/Vector-
Extensions.html).
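A brief sketch of those gcc vector extensions (the type name v4si is conventional in gcc's documentation): arithmetic on a vector type applies element-wise, and compiles to a vector instruction when one is available:

```cpp
// gcc/clang vector extension: v4si is a vector of four 32-bit ints.
typedef int v4si __attribute__ ((vector_size (16)));

// The + operator applies to all four lanes at once.
v4si add4(v4si a, v4si b) {
  return a + b;
}
```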
28.6.2 Pipeline Parallelism
Another way that we might parallelize a program is to use
pipeline parallelism, in which we divide our program into a
series of tasks that are carried out in an assembly-line-like
fashion. As an example, imagine we are writing a program that
receives encrypted video over a network connection. The
program needs to receive the data, decrypt it, decode the video
(that is, run the decoder of the video format to determine the
raw pixels to draw), and then draw those pixels to the screen at
the right time. We could design this program as a pipeline-parallel
task with four stages: receive, decrypt, decode, and draw.
In such a design, we could have 4 threads—one for each
stage—which would communicate by passing data along the
pipeline. The receive thread would write data into a buffer that
the decrypt thread would read from; the decrypt thread would
write decrypted data into a buffer that the decode thread would
read from; the decode thread would write raw pixel data into a
buffer that the draw thread would read from. Each thread would
operate on a different frame (time slice of video) at a time—
while the draw thread draws frame 0, the decode thread would
decode frame 1, the decrypt thread would decrypt frame 2, and
the receive thread would receive frame 3.
Of course, it may not work out that ideally, as the stages
may take different amounts of time, so we might end up with
some frames buffered in between stages. If a thread’s input
buffer is empty, or output buffer is full, then it must stall—
which hinders performance. Accordingly, we must take care to
balance our pipeline—making sure that the total throughput of
each stage is roughly the same. If decoding the video takes
twice as long as any other step, then we need to dedicate two
threads to decoding. In such a design, one thread would decode
one frame, while the other decodes the next.
Figure 28.2: An example of pipeline parallelism.
Figure 28.2 illustrates this design (with two “decode”
threads). The data moves along the pipeline from left (“Read
Data From Network”) to right (“Draw”), and there are buffers
(e.g., queues designed for concurrent access, such as those we
discussed in Section 28.3.5) in between the threads. In this
depiction, we have shown the data frames that might be in each
stage/buffer at a particular point in time, each as a blue box
labeled with the frame number. The “draw” thread is drawing frame 0; frame
1 is buffered up, ready for it to read when it needs it. The
decode threads are working on frames 2 and 3. Frame 4 is
buffered for the next decode thread that is ready, and so on.
28.6.3 Task Parallelism
Another common way to parallelize an algorithm is to use task
parallelism—in which tasks (an invocation of a function with a
particular argument) spawn more tasks, and then wait for their
child tasks’ results (called “sync” or “join”) when they need
them. Task parallelism is quite natural for divide-and-conquer
algorithms, (such as quicksort or mergesort), where the
“divide” phase is implemented in terms of spawning tasks, and
the conquer phase is implemented in terms of waiting for those
children to finish, then processing their results appropriately.
Efficient task management in a task parallel algorithm is a
bit complex, and it is important to achieving high performance.
Each thread typically has a task queue. When a task spawns
new work, it places it into its own task queue. However, the
management is not quite that simple.
Another important consideration is load balancing—
making sure that one thread does not finish its work first and sit
idle while other threads continue to do work. To achieve load
balancing, many task parallel systems use work stealing—a
thread whose task queue runs empty will “steal” work from the
head of another thread’s task queue. This stealing involves
picking a victim queue to steal from, then dequeueing a task
from the head of its task queue. The “thief” then begins work
on its stolen task (which may spawn new tasks). Note that this
scheme also addresses the question of how work initially gets
distributed to other threads—the first thread starts spawning
tasks, and then the other threads start stealing those newly
spawned tasks.
As other threads may steal from a task queue, the queues
must be protected from data races. However, we would not
want to use a lock for every access, as a thread will frequently
enqueue and dequeue work from its own task queue (even
uncontended locks have fairly noticeable overhead). We can
design the task queues for high performance by having threads
access their own task queues at the tail (whereas thieves steal
from the head) for both adding and removing work. Notice that
as the thread both inserts and removes work at the tail, this data
structure is not a true queue—the thread gets work in a LIFO
fashion. However, it is not a pure stack either, as thieves obtain
work in a FIFO fashion. This data structure is actually a deque
(pronounced “deck” and not to be confused with the verb
“dequeue”). Recall that we briefly mentioned deques in Section
20.3.4.
With the thread accessing its own deque only at the tail,
and thieves accessing only at the head, we can design the data
structure for efficiency by making the common case fast. In
particular, most accesses will be done at the tail by the thread
that owns that deque, and typically with more than one element
in the deque (the head and tail are distinct). Accordingly, we
design to make those cases fast, and all cases correct. We can
do this by having thieves lock the deque, and the owning thread
only lock the deque in cases where it is needed. We are not
going to go into the details here, but the curious reader can find
them in
http://supertech.csail.mit.edu/papers/cilk5.pdf.
The under-the-hood details of task management for task
parallelism are quite tricky. Fortunately, however, they are not
something that most programmers need to worry about
implementing when doing task parallel programming. Instead,
the programmer makes use of a task parallel runtime (written
by parallel programming experts), which provides the task
management for them. One popular task parallel system for C
and/or C++ is Cilk (which is described in the paper linked
above). Intel has since expanded on Cilk, making “Cilk Plus,”
which you can find out more about here
https://www.cilkplus.org/cilk-plus-tutorial.
To see an example, consider our quicksort algorithm from
Section 26.2.3. We can parallelize this algorithm by performing
the recursive calls in parallel. If we were to write our quicksort
in Cilk Plus, it might look something like this:
1 void quicksort(int * array, int n) {
2 if (n <= 16) {
3 selectionSort(array, n);
4 return;
5 }
6 int pivotIndex = partition(array, n);
7 cilk_spawn quicksort(array, pivotIndex);
8 quicksort(&array[pivotIndex+1], n - pivotIndex - 1);
9 cilk_sync; //not needed: implicit at end of function.
10 }
When we start to sort an array, only one thread will be
running. It will partition the array, then put a task to recursively
quicksort the first part of the array into its task queue (via the
cilk_spawn). It will then recurse on the second half of the
array. Meanwhile, another thread will steal the task to sort the
first half of the array. At this point, two threads are partitioning
the halves of the array in parallel. Once they finish that
partitioning, they will spawn new tasks (which will then get
stolen by two other threads), and recurse.
If we only have 4 threads, then at this point, all of them are
active, and work will not be stolen again until near the end of
the computation. Now, each thread is going to end up putting a
task into its own queue (for the first half of the array segment it
is currently sorting) and recursing on the second half. However,
all threads are busy with their own tasks, and are also pushing
new tasks into their queues. Consequently, the size of each
task queue will grow until a thread reaches a base case.
Once a thread reaches a base case, it will complete its
recursive call to quicksort without spawning any new tasks. In
the case of a directly called quicksort (as opposed to one called
via cilk_spawn), the execution will then encounter a
cilk_sync, which makes the current task wait until its children
have finished (there is actually an implicit cilk_sync at the end
of each function, so we did not need to explicitly write it here—
we just included it explicitly to make the discussion of what
happens clearer). The scheduler will then run a different task on
that thread.
That other task will be pulled from the task queue. It may
be a leaf task (with no children of its own), in which case, the
task queue will shrink. Alternatively, it could spawn new
children tasks, but we know that eventually we will be done
because the recursion always generates smaller tasks at every
step. Eventually, a thread will remove all of the tasks from its
deque. At this point, the thread whose task deque has become
empty will try to steal work from other threads. Work is likely
to be available for stealing as it is quite unlikely that we divided
the work perfectly evenly between threads. However, note that
stealing is a relatively rare case.
We are not going to delve deeply into the details of Cilk,
but rather show it as an example of a task-parallel system. From
this example, you can see the main appeal of this form of
parallel programming (our recursive quicksort looks much like
regular quicksort), and that most of the complex details are
already taken care of in the implementation of Cilk.
28.7 Amdahl’s Law
Whenever you try to speed up code (or really, anything)—
whether by optimizing sequential code, or parallelizing a
program—it is important to understand where your program is
spending most of its time, and focus your efforts there. Suppose
your program spends 10% of its time doing task A, and 90%
doing task B. Would you be better off speeding up task A by
99%, or speeding up task B by 20%? If you chose to focus
your efforts on task A, you would speed up the whole program
by 9.9%; however, if you instead focused on task B, you
would see a speedup of 18%—even though you could speed
up task A by a lot more, it comprises a lot less of the total
execution time. In fact, if we were to speed up task A in the
above example infinitely (so that it takes no time at all), we
would only speed up the entire program by 10% (which makes
sense: A only took 10% of the time to begin with).
When parallelizing an algorithm, this analysis is most
relevant to considering how much of a program is
serial and how much is parallel (you will not be able to
parallelize the entire program). If only 10% of your program’s
execution time can be parallelized, then the largest speedup you
could hope for (if you could perfectly parallelize it over an
infinite number of threads) is 10%. Of course, you will neither
have infinite threads, nor perfect parallelism, but understanding
this upper bound is important. By contrast, if 90% of the
program is parallel, decently parallelizing it over a few threads
to speed up the parallel region by 50% reduces the total
execution time of the program by 45%.
This analysis is formalized in Amdahl’s Law—a formula
for the maximum theoretical speedup of a task given a certain
number of threads. If we let t_s be the time of the serial portion,
t_p be the time for the parallel portion, and N be the number
of threads, then Amdahl’s Law says we expect a speedup of:

speedup = (t_s + t_p) / (t_s + t_p / N)
Note that this is true of any system where we improve one
component and ignore another. If we are optimizing sequential
code and making an improvement to one function, the above
law describes our speedup if we let t_s be the time for the rest
of the program (the portion unaffected by the changes), t_p be
the time for the function we improve (the portion affected by
the changes), and N be the factor by which we speed that up
(which might be a non-integer, e.g., 1.05 if we speed it up by
5%).
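The law above can be checked with a minimal sketch (the function name amdahl_speedup and the parameter names are our own, not from the book's code):

```cpp
#include <cassert>
#include <cmath>

// Amdahl's Law: overall speedup when a portion taking time t_par is
// spread over n threads while the portion taking t_ser stays serial.
// (n can equally be read as the factor by which we speed up one part.)
double amdahl_speedup(double t_ser, double t_par, double n) {
    return (t_ser + t_par) / (t_ser + t_par / n);
}
```

For the example above, amdahl_speedup(10, 90, 1.25) models speeding up task B (90% of the time) by 20% (a factor of 90/72 = 1.25), and yields a total runtime of 82% of the original.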
A corollary of Amdahl’s Law is to make the common case
fast. If we improve the common case (what happens most of the
time), we will see the most speedup—even if we slightly slow
down the uncommon case. We saw this principle in the work
stealing scheduler: more than 99% of the time, a thread works
from its own task deque, so the scheduler is designed to make
that case fast. The design makes the uncommon case (stealing
work) slow—possibly slower than if a simpler design were
used. However, making the common case fast improves
performance overall.
Amdahl’s Law applies to anything where you seek to
improve one portion of a system with a change that does
nothing for the rest of it. This could be in terms of time, or
other resources such as money. As an example of the application
of Amdahl’s Law to a non-programming time-based
optimization: do not try to make tooth brushing faster. Most
people spend 2–5 minutes per day (about 0.3% of the time in
the day) brushing their teeth. If you were to cut the time it takes
people to brush their teeth by a third, that optimization would
give them 0.1% of their day back. Likewise, if you are
considering trimming a $100,000 budget, examining the $100
items may not be the best, as cutting them in half only yields a
0.05% savings. Of course, sometimes the only way to improve
is to find many small improvements in a lot of places; however,
you should always focus your efforts on the largest components
first.
28.8 Much More…
Even though this chapter has covered a lot of material, it has
barely scratched the surface of parallel programming. For one,
we have given only a brief introduction to multithreaded
programming—just enough information to let you
start thinking about parallelizing code, and to know
where you need to go to learn more. Most of that has centered
around “basic” pthread operations—however, you could use
other thread libraries (such as Boost in C++). We will also note
that if you have not read Section D.3.6, which introduces
the Helgrind tool, you may wish to do so (and read more about
it in the online documentation linked from that section) before
you attempt any significant multithreaded programming.
In fact, a serious introduction to parallel programming
would be a semester-long course of its own, with its own
sizable book. There are many topics we have not
remotely touched on here. Of the topics we have touched on,
we have typically given the briefest of introductions. For one
thing, there is just too much material to go into depth on any of
it. For another, you really need a deep understanding of the
hardware before you can seriously study these topics
(you may have noticed that this problem kept coming up in the
material presented here). If you think parallel programming
sounds like an area you want to explore in depth, you should
become an expert sequential programmer (you cannot parallel
program if you cannot sequentially program), you should learn
a LOT about hardware (generally, at least an undergraduate-
level computer organization class plus a graduate-level
computer architecture class), and then you should take a course
(or two) dedicated to parallel programming.
Chapter 29
Advanced Topics in Inheritance
C++ introduces a variety of nice features, such as inheritance, that have complex
implementation details behind the scenes. From one perspective, the advantage
of a language such as C++ is that the programmer does not need to concern
herself with such details—she can write in terms of higher-level abstractions,
and let the compiler sort out the details. From another perspective, it is
important to always understand what your code is doing—if your code is slow, or
exhibits behavior that is not what you intended, complete understanding is
critical to diagnosing the situation.
In this chapter, we are going to delve into the details of how objects are laid
out in memory, how inheritance (including dynamic dispatch) works, and how
multiple inheritance works in C++. As we learn about multiple inheritance, we
will see how it is implemented, so that you understand the overheads associated
with its use.
29.1 Object Layout
Objects must be laid out in memory in such a way that the compiler can generate
instructions to access their fields knowing only their static type. That is, all
objects of type A must have their fields in the same position relative to the start
of the object. In C, structs (and POD types in C++) are laid out in a
straightforward fashion: each field is placed one after the other in the order they
appear (possibly with some blank space to ensure the fields remain properly
aligned).
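The straightforward layout (including padding) can be observed with offsetof. Here is a small sketch (the struct name Example is our own); the exact offsets assume a typical platform where a 4-byte integer must be 4-byte aligned:

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>

// A plain struct (a POD type): fields are laid out in declaration
// order, with padding inserted so each field stays properly aligned.
struct Example {
    char    c;  // offset 0
    int32_t i;  // offset 4 on typical platforms (3 bytes of padding after c)
    int32_t j;  // offset 8
};
```

On such a platform, sizeof(Example) is 12 even though only 9 bytes hold data.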
We note that we have only discussed the fields of an object, and not its methods.
The code for the methods does not actually get stored inside the objects.
There is one copy of the code for each method in the code segment. In fact, non-
virtual methods do not even affect the object layout at all. The compiler can
generate instructions to directly call the appropriate function. For virtual
methods, there is an impact on object layout, as we will see shortly.
29.1.1 Inheritance
In C++, we have an additional constraint on object layout when we use
inheritance, so that we can use objects polymorphically. If class B extends class
A, then we may end up with an A* pointing at an instance of B. As we said
before, the compiler must be able to generate instructions to access the fields of
this object knowing only its static type. This requirement gives us the subobject
rule: the way in which we lay out a child class must contain a complete
subobject of its parent class, laid out in exactly the same fashion as the parent.
Figure 29.1: Example object layouts with inheritance.
Figure 29.1 illustrates this principle. On the left, we have a class A with two
fields of type int (a and b). The layout of an object of type A is to place the a
field at offset 0 (the start of the object) and the b field at offset 4. Every object
of type A will have this exact same layout.
In the middle of the figure, we declare a class B, which extends A and adds
two fields: d (a double) and z (an int). The layout for an object of type B has a at
offset 0, b at offset 4, d at offset 8, and z at offset 16. Observe how this layout
obeys the subobject rule: the layout of B contains a piece that is laid out exactly
according to the rules of A’s layout (shaded in light blue in the figure). We refer
to this piece of the object as the “A subobject”.
On the right side of the figure, we declare a class C, which extends B, and
adds another field q (of type int). We derive the layout of a C in much the same
way that we derived it for a B—we start with the layout of the parent class (in
this case B), and then add the new fields to the end. This design obeys the
subobject rule quite naturally. Here, we have shaded the B subobject in a light
blue. Observe that because B’s layout obeys the subobject rule, we are
guaranteed to have an A subobject in C’s layout.
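A sketch of the classes from this figure can check the subobject rule directly: with single inheritance, the parent subobject sits at the very start of the object, so upcasting a pointer does not change the address it holds:

```cpp
#include <cassert>

// The classes of Figure 29.1: B's layout begins with a complete A
// subobject, and C's layout begins with a complete B subobject.
struct A { int a; int b; };
struct B : A { double d; int z; };
struct C : B { int q; };

// Converting C* to B* or A* leaves the address unchanged, because
// each parent subobject is at offset 0 of the child.
bool subobjects_at_start(C* c) {
    return static_cast<void*>(c) == static_cast<void*>(static_cast<B*>(c))
        && static_cast<void*>(c) == static_cast<void*>(static_cast<A*>(c));
}
```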
29.1.2 Dynamic Dispatch
As you hopefully recall from Section 18.5, virtual methods are dynamically
dispatched—the method that is called depends on the dynamic type of the
object. However, the compiler must generate instructions without knowing the
dynamic type of the object, as it may differ across times that the call site is
encountered, or be impossible to determine. Accordingly, when a class has
virtual methods, its objects must be laid out in such a way as to contain the
information to dispatch the method call. The compiler can then generate
instructions that read this information out of the object, and call the appropriate
method.
A naïve approach would be to store a unique integer in each object
indicating its type, and then have the compiler generate the equivalent of a
switch-case statement at the call site. In this approach, the code would switch
based on the type identifier read from the object, and each case would call the
appropriate method for that type. Such an approach would work, but would be a
bit slower than the way that most compilers actually implement dynamic
dispatch.
Instead, we can exploit the fact that the instructions that we want to execute
are stored in memory, thus they have an address, and we can have a pointer to
them. This principle should not sound surprising, as we discussed function
pointers in Section 10.3. We could therefore store a pointer to each dynamically
dispatched method in the object (and ensure the same layout rules we need for
fields). However, this approach suffers from the fact that it could increase the
size of each object significantly (one pointer per virtual method). This extra
space is rather wasteful as all objects of the same dynamic type will hold the
same values in these fields.
What actually gets done is that the object holds one single pointer to a table
of function pointers. This table of function pointers is called the vtable (short for
“virtual function table”) and contains one pointer per virtual function. All
objects of the same dynamic type will have their vtable pointers point at the
same vtable.
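To see the mechanism, here is a hand-rolled sketch of the scheme (this is an illustration in plain structs, not the compiler's actual ABI; all names here are our own): each object carries a single pointer to a per-class table of function pointers, and dispatch is two loads followed by an indirect call:

```cpp
#include <cassert>
#include <string>

struct Obj;  // forward declaration so the table can mention it

// One table of function pointers per class (the "vtable"), shared by
// all objects of that dynamic type.
struct VTable {
    std::string (*speak)(const Obj*);
};

struct Obj {
    const VTable* vptr;  // a single pointer per object
    int data;
};

std::string dog_speak(const Obj*) { return "woof"; }
std::string cat_speak(const Obj*) { return "meow"; }

const VTable dog_vtable = { dog_speak };  // one table per "class"
const VTable cat_vtable = { cat_speak };

// Dynamic dispatch: load vptr, load the slot, call through the pointer.
std::string dispatch_speak(const Obj* o) { return o->vptr->speak(o); }
```

Every "dog" object points at the same dog_vtable, so the per-object cost is one pointer regardless of how many virtual methods there are.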
As you also learned earlier, the dynamic type of the object changes during
construction and destruction (in C++; not in all languages). Now that we know a
little bit about object layout, we can understand how this change happens. The
first thing each constructor does (after the parent constructor returns) is to set the
object’s vtable pointer to point at the vtable appropriate to its class. The
destructors also update the vtable pointer appropriately.
The contents of the vtable (i.e., the pointers to the methods) must obey the
subobject rule just as the fields of an object do. The reasons for this requirement
are pretty similar—the compiler must be able to generate the instructions to
dispatch a method based only on the static type. Therefore, the pointer to any
particular method must be at the same offset in the vtable no matter what the
dynamic type of the object is.
Figure 29.2: Example object layouts with vtables.
Figure 29.2 illustrates object layout with vtables. On the left, class X has
two fields (a and b) and two virtual methods (f and g). Accordingly, objects of
type X are laid out with a pointer to the vtable at offset 0, field a at offset 4, and
field b at offset 8. The vtable is laid out with a pointer to f (which points to
X::f) at offset 0, and a pointer to g (which points to X::g) at offset 4.
On the right, class Y extends X. It adds two fields, overrides f, and adds a
new virtual method, h. We can see that the layout for objects of type Y obeys the
subobject rule (the subobject is shaded blue). The vtable pointer is still at offset
0 (although it points at the vtable for a Y), fields a and b are at offsets 4 and 8
respectively, and the pointers to f and g reside in the vtable at offsets 0 and 4
respectively. As Y overrides f, the pointer for f in Y’s vtable will point at Y::f.
By contrast, since it just inherited g from X, the pointer for g in Y’s vtable will
point at X::g. We add the new fields at the bottom of the object, and the new
virtual methods at the bottom of the vtable.
If we were to have a virtual destructor (which we learned is good practice,
and crucial if you will destroy polymorphic objects with non-trivial destructors),
there would be one vtable entry for the destructor. This same entry (at the same
offset) would be used for the destructor in all children, even though the
destructors have different names in the program source—because each one is the
function to call when an object is being destroyed.
Now that we have seen the object layout required for dynamic dispatch, we can
understand the performance costs associated with it. For static dispatch, all that
is required is a function call instruction to a target that is known at compile time.
For dynamic dispatch, the call requires (1) a read from memory (called a “load”)
to get the vtable pointer from the object, (2) a load from the vtable to get the
actual function pointer to call, and (3) an instruction that actually calls the
function via that pointer.
On most modern high-end processors, these two loads are likely a relatively
small performance cost—the processor can typically reorder instructions, and do
them in parallel with other instructions. A quick test on my desktop showed no
measurable difference in the execution time for a loop that made 1 million
dynamically dispatched method calls versus 1 million statically dispatched calls.
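A sketch of such a comparison follows (the class and function names are our own; actual timings vary by processor and compiler, and an optimizing compiler may devirtualize the "dynamic" loop if it can prove the type):

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual long f(long x) { return x + 1; }
};
struct Derived : Base {
    long f(long x) override { return x + 2; }
};

// Statically dispatched: the qualified call fixes the target at
// compile time, so no vtable lookup is needed.
long sum_static(long n) {
    Derived d;
    long s = 0;
    for (long i = 0; i < n; ++i) s += d.Derived::f(i);
    return s;
}

// Dynamically dispatched: each call goes through the vtable of *b.
long sum_dynamic(Base* b, long n) {
    long s = 0;
    for (long i = 0; i < n; ++i) s += b->f(i);
    return s;
}
```

Wrapping each loop in std::chrono::steady_clock timing calls with n set to one million is enough to repeat the comparison on your own machine.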
Of course, you may write code for other processors (e.g., less aggressive cores
in the mobile or embedded space) where this performance difference may be
larger, or you may be concerned about other metrics such as energy (performing
a load consumes significant energy, even if it is overlapped with other
instructions).
We will also note that when C++ was designed (around 1983), processors
were not only much slower (around 10MHz, compared to 3.4 GHz), but much
less advanced in their design. Processors of that time did not reorder
instructions to find multiple instructions to execute at once. In
fact, they did not even really execute multiple instructions at one time, but
rather took multiple cycles for most instructions! Accordingly, the performance
cost of dynamic dispatch was a rather significant concern at the time, which is
why it is not the default in C++.
There is, however, another, more subtle performance impact of a dynamically
dispatched method: it prevents the compiler from inlining the call. Inlining a
function call is an optimization that a compiler performs when the programmer
calls a small function, where the compiler puts the instructions for the called
function directly into the caller, avoiding the instructions required to set up a
stack frame, call the function, return from the function, and clean up the stack
frame. Inlining also opens up other opportunities for optimization. Again, this
overhead is not catastrophic, but may matter in performance critical situations.
We will also note that if you are actually using dynamic dispatch to call
different methods from the same call site, there may be performance costs
arising from the processor having difficulty predicting the target of the function
call in advance. However, when this situation happens, you are actually using
the dynamic dispatch to accomplish the goals of your program.
In general, dynamic dispatch is a good tool, as it lets programmers write
code in a more understandable and maintainable fashion. You may write a lot of
code without the performance implications ever being significant. However, if
you ever need to optimize your code, understanding what is happening might
matter a lot.
29.2 Multiple Inheritance
C++ allows multiple inheritance—a class may inherit from more than one parent
class. In such a situation, the child class inherits all of the fields and methods of
all of its parents. Additionally (as long as the inheritance is public), the child
class may be treated polymorphically as any of its parents. However, multiple
inheritance introduces a variety of complexities, both for the programmer, and
for object layout.
An example of multiple inheritance from the C++ STL is the
std::iostream class (a stream for both reading and writing), which inherits
from both std::istream (a stream for reading) and std::ostream (a stream for
writing). Here, the STL designers chose multiple inheritance because an
iostream is-an istream and is-an ostream. We might have an iostream and
want to pass it to a function that expects an istream& (i.e., a function designed
to read some input) or pass it to a function that expects an ostream& (i.e., a
function designed to write some output). Because the iostream class inherits
from both parent classes, it can be used polymorphically with either parent.
Furthermore, the multiple inheritance is useful, as the STL implementers could
make use of the inherited functionality to avoid rewriting the code to read and/or
write.
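For instance, a std::stringstream (which is-an iostream) can be handed first to a function that needs only an ostream& and then to one that needs only an istream&. A brief sketch (the function names here are our own):

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Accepts any output stream: a file stream, std::cout, a stringstream...
void write_greeting(std::ostream& out) {
    out << "hello";
}

// Accepts any input stream, and reads a single word from it.
std::string read_word(std::istream& in) {
    std::string w;
    in >> w;
    return w;
}
```

A std::stringstream can be passed to both functions because it inherits (through std::iostream) from both std::istream and std::ostream.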
29.2.1 Design Choices
As another potential example, we could revisit our ImageButton example from
Chapter 18. In that chapter, we described ImageButton as exhibiting an is-a
relationship with Button and a has-a relationship with Image. Could we instead
have ImageButton use multiple inheritance, inheriting from both Button and
Image? We could—but that is the wrong question. The right question is whether
or not multiple inheritance is the best design choice.
To assess the correctness of this design choice, we would first need to ask
ourselves if the ImageButton that we are designing exhibits an is-a relationship
with both Button (which we already assumed it does) and Image. In particular,
we would want to know if the ImageButton is substitutable for an Image—that
is, could we pass an ImageButton to any piece of code that expects an Image.
The answer here is “probably not”. We might have functions that operate
on Images such as scaling, smoothing, or recoloring the image, that do not make
sense for an ImageButton. You may think “but what if I want to smooth the
image on the button!”—however, that is a great argument for composition (has-
a). The image is a part of the button, and we may wish to operate on it, but we
would be scaling/smoothing/recoloring the Image that the Button has.
If we answered “no” to the “is-a” relationship question, then we should not
use multiple inheritance. However, we might find ourselves second guessing this
choice based on the fear of duplicating some code. In particular, we may think
that we will need to write code in this class to display the image, resize the
image when the button changes size (e.g., if the user resizes the window), etc.,
and then find ourselves wanting to write the same code in other components that
display images. We certainly do not want to duplicate that code, but we just
decided that multiple inheritance is inappropriate.
If you re-read the last paragraph carefully, you will find the key. We said
that we want to reuse the code in other components that display images. What
we really want is a class that is a component in our graphical interface toolkit
that displays an image. That component (which we will call ImageDisplay) has-
an Image but is not itself an image—it is a component responsible for
displaying an image. If we were then to revisit the is-a question (“Is an
ImageButton an ImageDisplay”—or put a different way, “Is a component that is
a button with an image on it a component that displays an image?”) then the
answer is yes, and we are justified in using multiple inheritance.
29.2.2 Syntax
The syntax of declaring a class that inherits from multiple parents is to separate
the parent classes with a comma:
1 class ImageButton : public Button, public ImageDisplay {
2 //contents of the class
3 };
Note that the order of the parent classes is significant. The first parent
named is the primary parent and, as we shall see once we learn about object
layout, can be used polymorphically more efficiently. As we will also discuss
shortly, this order affects the order of construction and destruction. We will note
that inheriting from the same class twice is illegal (you cannot declare
class X: public Y, public Y)—it generally does not make sense from a
design standpoint, and creates a host of problems.
Another bit of syntax that we may need when using multiple inheritance is
the ability to distinguish between multiple inherited methods or fields of the
same name. For example, suppose we have two classes that each declare a field
and a method with the same names (in this case, a field x and a method describe):
1 class A {
2 public:
3   int x;
4   void describe() {
5     std::cout << "A with x = "
6               << x << "\n";
7   }
8 };

1 class C {
2 public:
3   int x;
4   void describe() {
5     std::cout << "C with x = "
6               << x << "\n";
7   }
8 };
Now, suppose we declare a class (B) that inherits from both A and C:
1 class B: public A, public C {
2
3 };
Observe that B has two distinct fields named x as well as two distinct
methods called describe (in each case, one inherited from A and one inherited
from C). Therefore, if we try to access x or call describe, we run into the
problem that the compiler does not know which one we mean. That is, if we
write:
1 B b;
2 b.x = 3;
3 b.describe();
then lines 2 and 3 both result in errors that look like request for member
’x’ is ambiguous, and then show the possible candidates (the things we might
have meant). We can resolve this ambiguity by specifying exactly which one we
want with the scope resolution operator:
1 B b;
2 b.A::x = 3;
3 b.C::x = 4;
4 b.A::describe();
5 b.C::describe();
This code will print:
A with x = 3
C with x = 4
as it sets the two different x fields to 3 and 4 respectively, then calls the
describe inherited from A, then the describe inherited from C.
29.2.3 Construction/Destruction
When a class uses multiple inheritance, the construction of an object involves
the construction of all parent classes. As you hopefully recall from the single
inheritance case, the first thing that a constructor does is to call the constructor
for the parent class (either by an implicit call to the default constructor, or by an
explicit call made in the initializer list). Multiple inheritance has similar rules,
except that every parent class’s constructor is called in the order that the
inheritance is declared (regardless of the order where explicit constructor calls
are written in the initializer list).
In our ImageButton example, the first thing that would happen in the
constructor for an ImageButton would be a call to the constructor for Button
(the primary parent). If the initializer list for ImageButton’s constructor contains
an explicit call to a Button constructor, that constructor is used. Otherwise, there
is an implicit call to the default constructor for Button. After the Button
constructor returns, a constructor for ImageDisplay (the second parent) will be
called. If ImageButton inherited from other classes, their constructors would
continue to be called in order.
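The order can be observed with a trimmed-down sketch of these classes (the constructor bodies here just record who ran; they are our own illustration, not the toolkit's real code). Note that the initializer list below names ImageDisplay first, yet Button still runs first because of the declaration order:

```cpp
#include <cassert>
#include <string>

std::string construction_log;

struct Button {
    Button() { construction_log += "Button;"; }
};
struct ImageDisplay {
    ImageDisplay() { construction_log += "ImageDisplay;"; }
};

// Parents are constructed in the order declared after the colon,
// regardless of the order written in the initializer list below.
struct ImageButton : public Button, public ImageDisplay {
    ImageButton() : ImageDisplay(), Button() {
        construction_log += "ImageButton;";
    }
};
```

Constructing an ImageButton appends "Button;ImageDisplay;ImageButton;" to the log, in that order (many compilers will warn about the out-of-order initializer list, precisely because it does not control the actual order).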
Note that the dynamic type of the object that we are creating changes
during the creation process according to the same rules that we have learned all
along; we just need to know that the type becomes ImageButton only after all
parent constructors finish. That is, Button’s constructor sets the dynamic type to
Button (i.e., sets the vtable pointer to point at the vtable for Button). After that,
the constructor for ImageDisplay is invoked, and it sets the vtable pointer to
point at the vtable for ImageDisplay, as it always does (however, as we shall
see shortly, this is actually a different vtable pointer than the one Button’s
constructor set, as we will need two of them in the ImageButton object!). Once
ImageDisplay’s constructor returns, the constructor for ImageButton sets the
vtable pointers to point at vtables for ImageButton.
As always, destruction proceeds in the reverse order of construction. After
ImageButton’s destructor completes, it implicitly calls the destructor for
ImageDisplay and then the destructor for Button. The vtable pointers are
manipulated during this process to update the dynamic types appropriately.
29.2.4 Layout
Figure 29.3: The conundrum with a first attempt at multiple inheritance. We make two attempts to
lay out class C but in both cases fail to respect the subobject rule for one parent.
A first attempt at laying out an object that uses multiple inheritance runs
into a bit of a conundrum: it appears that we cannot follow the subobject rule
with regards to both parents at the same time. Figure 29.3 illustrates this
problem. We have two classes (A and B) shown at the top of the figure. We then
have another class (C) that multiply inherits from A and B. We make two attempts
to lay out C, but fail at both. On the left, we put everything that C inherits from A
first, then follow it with the members inherited from B, then the items new to C.
This layout has an A subobject (shown in blue), but does not have a B subobject.
If we try to put the members inherited from B first, then the members inherited
from A (pictured on the right), we run into the opposite problem—now we have
a valid B subobject (shown in green), but do not have a valid A subobject.
Figure 29.4: Correct layout with multiple inheritance. The A subobject is shown in blue and the B
subobject is shown in green.
Clearly we cannot satisfy the constraints if we try to make both subobjects
start at the beginning of the object. However, the subobject rule only requires
that we have a subobject with the same layout, not that that subobject is at the
start of the entire object. Accordingly, we can lay out objects of type C by
placing the entire A subobject first, then the entire B subobject (including a
vtable) after it. We then place any new fields (in this case, z) after the end of the
B subobject.
Notice that we now have two different vtables. The one at the start of the
object (called the primary vtable) contains pointers to all of the methods. The
secondary vtable (in the B subobject) only contains pointers to the methods that
a B has. We will only use the secondary vtable when we are polymorphically
treating our object as a B (or more generally, as the second parent).
In particular, when we convert a C pointer to a B pointer, the compiler
must adjust the numerical value of the pointer behind the scenes to point at the B
subobject. More generally, this adjustment occurs whenever we treat a multiply-
inherited object as a parent other than its primary parent. Note that this conversion
occurs implicitly whenever we call a method inherited (but not overridden) from
a non-primary parent, as that method expects a this pointer of the parent type
(in this example, if we called g, which expects this to point at a B).
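We can observe this adjustment directly with a sketch (class names as in the figure, with one virtual method and one field each; the exact offset depends on the platform's ABI): converting to the primary parent leaves the address alone, while converting to the non-primary parent moves it past the first subobject:

```cpp
#include <cassert>
#include <cstddef>

struct A { virtual void f() {} int x; };
struct B { virtual void g() {} int y; };
struct C : A, B { int z; };

// The offset (in bytes) that the compiler silently adds when converting
// C* to B*. Converting C* to A* (the primary parent) adds nothing.
std::ptrdiff_t b_adjustment(C* c) {
    return reinterpret_cast<char*>(static_cast<B*>(c))
         - reinterpret_cast<char*>(c);
}
```

On a typical 64-bit implementation the B subobject starts after A's vtable pointer and field, so the adjustment is nonzero.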
This conversion also occurs during object construction and destruction,
when the constructor(s) and destructor(s) for the non-primary parent(s) are
invoked. When constructing a C, the B constructor is passed a this pointer that
points at the B subobject. It then sets the vtable pointer in that object (which is
the secondary vtable of a larger object—but B’s constructor does not know that,
nor does it care) to point to the vtable for a B. Once that constructor finishes, C’s
constructor updates both of its vtable pointers appropriately.
If class C overrides g (which appears in its non-primary parent, B), then we
run into another issue, which is fixed by more behind-the-scenes pointer
manipulation by the compiler. First, let us look at what might happen in C’s
overriding of g:
1 class C: public A, public B {
2 int z;
3 public:
4 virtual void h() {...}
5
6 virtual void g() { //the this pointer points at a C
7 h(); //implicitly this->h();
8 z++; //implicitly this->z++;
9 }
10
11 };
Inside of C’s overriding of g, the this pointer is a pointer to a C, and can
therefore be used to access fields and methods inside of a C (which might not
exist in an A or in a B). Accordingly, those field accesses and method dispatches
are compiled assuming that this points at an entire C.
Now, let us suppose that we polymorphically use a C as a B, and call g().
1 void someFunction(B * aB) {
2 aB->g();
3 //other code...
4 }
5 void anotherFunction(C* aC) {
6 someFunction(aC);
7 //other code...
8 }
Here, someFunction takes a pointer to a B, and calls g on it. We also have
anotherFunction, which takes a pointer to a C, and passes it to someFunction.
Passing aC as the argument to someFunction implicitly converts the C* to a B*.
As we just learned, this conversion requires the compiler to perform a numerical
adjustment of the pointer value behind the scenes (in this example, adding 8).
Figure 29.5: Diagram showing call to an overridden function via a non-primary parent.
Figure 29.5 illustrates the situation as execution enters someFunction. This
figure shows two stack frames, with the appropriate pointers, drawn pointing at
the part of the object that they would really point at. We have shaded the A and B
subobjects in this diagram for clarity. The call aB->g() will then be dynamically
dispatched by reading the vtable according to the layout of B objects, and will
pass aB as the value of this. However, aB points at the B subobject, not the
entire C object. We need to adjust the this pointer (by subtracting 8) to get a
pointer to the entire C object.
Note that we cannot perform this adjustment at the call site (that is, inside
of someFunction), as that would cause the program to function incorrectly when
we pass in an object that is just a B (and not a subobject of a C). Instead, the
compiler arranges for the non-primary vtable to point at a few instructions that
precede the code of g and adjust the this pointer appropriately. The primary
vtable points at the actual entry of g, while the secondary vtable points at this
fixup code. This solution is shown in the figure with the pointers in the vtables
pointing at the appropriate places (the code for g, or this -= 8).
These pointer adjustments require an extra instruction each. As with
dynamic dispatch, there may be some performance overhead, but we expect it to
be small on a modern processor. As long as you are not pushing for the last bit of
performance, the overhead should not be that big of a deal, but you should be
aware of it in case it ever matters.
29.3 Virtual Multiple Inheritance
One common problem in using multiple inheritance arises when a class inherits
two copies of a grandparent class, but it is only logical to inherit one. In our
ImageButton example, suppose that both Button and ImageDisplay inherit from
GuiComponent. This GuiComponent class might be the ancestor of all
components in our toolkit, and would contain fields such as the x, y coordinates
of the component, the width, and the height that all components have. That is,
we might have:
class GuiComponent {
protected:
  int x;
  int y;
  int width;
  int height;
public:
  virtual void draw();
  //other fields and methods not shown
};
and then our Button and ImageDisplay classes inherit from GuiComponent:
class Button : public GuiComponent {
  string text;
  //other fields and methods not shown
};

class ImageDisplay : public GuiComponent {
protected:
  Image * image;
  //other fields and methods not shown
};
Figure 29.6: Object layout for ImageButton. Observe that it has two distinct GuiComponent subobjects:
one inherited from Button (shaded blue), and one inherited from ImageDisplay (shaded green).
Having learned how to lay out objects using inheritance and multiple
inheritance, we can lay out our ImageButton objects. Figure 29.6 shows the
layout of Button (left), ImageDisplay (center), and ImageButton (right). The
first two classes just use the rules for inheritance (their GuiComponent subobjects
are shaded blue and green respectively). The ImageButton object just uses the
rules for multiple inheritance, putting the two subobjects together (if there were
new fields, they would be added at the end). Notice that the resulting object has
two distinct GuiComponent subobjects: one inherited from Button (shaded blue),
and one inherited from ImageDisplay (shaded green). Accordingly, the object
has two (distinct) fields for each of x, y, height, and width.
This problem is quite serious (not just a matter of wasted space in the
object), as some pieces of code will use one x while others will use the other x.
In particular, any method inherited from either parent (but not overridden) will
use the field inherited from the same parent (e.g., a method inherited from
Button will use the “blue x” while a method inherited from ImageDisplay will
use the “green x”). Any new code we write in ImageButton will have to
explicitly specify which x it refers to as either Button::x or ImageDisplay::x.
If we had this inheritance hierarchy and wanted our object to behave in a
sane manner (to have only one x coordinate that it uses consistently), we would
have to override every method that we inherited from one parent class—
rewriting it to use the set of fields we have chosen from the other parent. Not
only does this duplication of code defeat the purpose of using multiple
inheritance to begin with, it also violates a variety of other principles of good
design, and will make our code impossible to maintain. If someone (you or
another programmer) adds functionality to that parent class, the child class has
to be updated with an additional copy of the method. This approach would be
horrific—thus we need to rethink how we want to structure our inheritance
hierarchy.
Figure 29.7: The inheritance hierarchy we have made (left) and the inheritance hierarchy that we want
(right).
Figure 29.7 shows the inheritance hierarchy that we have implemented
(left), in which Button and ImageDisplay inherit from GuiComponent, but do so
in a distinct fashion. What we want to design is an inheritance hierarchy like the
one pictured on the right, where Button and ImageDisplay both inherit from
GuiComponent, but do so in a way that they share one subobject between the two
of them.
We can obtain this inheritance hierarchy via virtual inheritance—which
tells C++ to inherit from a parent class in such a way that only one subobject of
that parent class appears in any future descendants. In doing so, we must specify
that Button and ImageDisplay inherit virtually from GuiComponent, by adding
the virtual keyword to the inheritance specification of both of their class
declarations:
class Button : public virtual GuiComponent {
  string text;
  //other fields and methods not shown
};

class ImageDisplay : public virtual GuiComponent {
protected:
  Image * image;
  //other fields and methods not shown
};
Note that the virtual keyword has to be used for these classes, not for the
ImageButton—as we will see shortly, Button and ImageDisplay have to be laid
out differently to accommodate this design requirement. The fact that these classes
have to be laid out in a more complex fashion is the reason why virtual
inheritance is not the default behavior, even though it seems like what we would
want most of the time from a perspective of designing our inheritance
hierarchies.
29.3.1 Layout
Laying out objects that use virtual inheritance appears to pose a bit of a
conundrum with regards to having only one copy of the GuiComponent
subobject while ensuring that it is in a consistent location in Button,
ImageDisplay, and ImageButton. However, we can solve this problem by
adding a layer of indirection to the layout of the object—rather than having the
subobject at a fixed offset, the compiler will generate instructions to read the
offset of the subobject out of the vtable (hence the name "virtual"
inheritance) and compute the addresses of fields using that offset.
Figure 29.8: Object layout with virtual inheritance. Classes that inherit virtually have the offset from the
start of the object to their parent in their vtable. Classes that multiply inherit can then have one
subobject of the virtually inherited ancestor.
Figure 29.8 shows the object layout when virtual inheritance is used. The
left and center portions of the figure show the layout of the Button and
ImageDisplay classes, which virtually inherit from GuiComponent. In both of
these, the GuiComponent subobject is shaded green. Notice that the parent
subobject is now at the end, not the start, of the object. The vtables for Button and
ImageDisplay both contain a new field—they start with the offset to find their
parent (in this case, GuiComponent) subobject. This offset is the difference in the
memory address of the start of the object and the start of the parent subobject.
When the program accesses fields in the virtually inherited class, the
memory address of that field is not at a fixed offset from the start of the object.
If we have a Button* that points at a Button, then y is at offset 20; however, if
that Button* points at an ImageButton, then y is at offset 28. Accordingly, the
compiler must generate instructions that read the offset to the virtually inherited
subobject out of the vtable, add that offset to the start of the object, then add the
offset of the field within the virtually inherited subobject, and then read or write
memory at that address.
Note that this level of indirection comes with a performance cost. This
performance cost may be a bit more significant than the other ones we have seen
in this chapter, as it will incur extra memory operations on every access to a
field in a virtually inherited ancestor class. The compiler may be able to improve
on this overhead some if, for example, many fields in the same object are
accessed at once (in which case, it could compute the start of the subobject once,
and re-use that calculation for other fields). As with all things, you should
understand what is happening, especially if performance is ever critical to your
code.
29.3.2 Construction/Destruction
Constructing and destructing objects that use virtual inheritance and multiple
inheritance requires us to have slightly different rules than before. In the
ImageButton example, we cannot have Button’s constructor call
GuiComponent’s constructor and also have ImageDisplay’s constructor call
GuiComponent’s constructor again, as that would initialize the object twice
(possibly leading to memory leaks and inconsistent behavior). Similarly, we
cannot have the object's GuiComponent subobject destroyed twice, as that would
potentially cause double-free errors.
To ensure the correct (and logical) behavior, virtually inherited classes have
a special rule for construction. The constructors for virtually inherited classes
are invoked as the first step of the constructor of the class actually being created.
This initialization happens regardless of where the virtually inherited classes
appear in the inheritance hierarchy. However, this initialization occurs in a very
specific order.
The order in which virtually inherited subobjects are initialized is defined
in terms of the inheritance hierarchy DAG. If we draw the inheritance hierarchy
as a graph (as in Figure 29.7), it will form a DAG (we may have undirected
cycles, whenever virtually inherited ancestors are inherited along multiple paths,
but will never have directed cycles). The rules for initialization order require
that this graph is drawn such that, at every level, the parents of a class are drawn
from left to right in the order that they appear in the class declaration’s
inheritance specification. Once you have this DAG, you can determine the
required order of initialization by performing a DFS (recall: depth first search)
starting at the class you are actually creating, and whenever you consider the
successors of that node in the graph (which are really its parents in the
inheritance hierarchy), traverse them from left to right. Whenever you finish all
of the successors of a virtually inherited class in this traversal, write it down.
The order you write those classes down is the order in which they are initialized.
We note that another way to describe this algorithm would be to say that we
record the post-order number of each class as we perform the DFS, and write
them down in increasing order of their post-order numbers. The post-order
number of a node during a DFS (or other traversal) is the numbering that
corresponds to the sequence in which you are completely done with each node
(i.e., have assigned post-order numbers to all of its successors).
Once we understand the order of construction, the order of destruction is
simple. As always, the order of destruction is the exact opposite of the order of
construction. Accordingly, we destroy the virtually inherited subobject last, after
all other destructors. These objects are destroyed in the opposite order of their
construction.
Figure 29.9: A complex inheritance hierarchy to demonstrate the rules for creating and destroying
virtually inherited objects. Blue edges indicate virtual inheritance.
To understand these rules, it helps to see an example. Suppose we have the
following complex (contrived) inheritance hierarchy:
class A {};
class B {};
class C : public virtual A {};
class D : public virtual A, public virtual B {};
class E : public virtual B {};
class F : public C {};
class G : public virtual C, public virtual D {};
class H : public virtual D, public virtual E {};
class I : public F, public G, public virtual H {};
This class hierarchy is pictured in Figure 29.9, with virtual inheritance
edges shown in blue, and non-virtual inheritance edges shown in black. The
figure also shows the post-order number for each node (in white on blue for
classes that are virtually inherited, and in black on gray for classes that are not).
If we write down the virtually inherited classes in increasing order of their post-
order numbers (A, C, B, D, E, H), then we get the order in which the virtually
inherited classes are initialized when we create an I. After these constructors are
executed (as if they were called in that order from I’s constructor), the non-
virtual classes are initialized in the usual order: C, F, G, then I. Note that there
are two C subobjects, one virtually inherited, and one non-virtually inherited.
Destruction proceeds in the reverse order.
As with many things, this is not a rule you need to memorize. However, if
you find yourself making complex inheritance hierarchies using virtual
inheritance, you should fully understand everything you are doing. Knowing
what order your classes are created and destroyed is an important aspect of that
complete understanding. If nothing else, you should remember that there are
special rules, and look them up if you ever need them.
29.4 Mixins
Suppose that, in our hypothetical GUI toolkit, we wanted to add fancy borders to
a component. If we wanted to do this for a single component (e.g., Buttons),
we might write a subclass (e.g., FancyBordersButton) that adds and overrides
methods to provide this functionality (e.g., it might override the draw() method
to draw the desired borders). However, if we wanted this capability for all
components, we would find ourselves writing a subclass for each component,
and duplicating the code to draw the borders. We might look to inheritance, but
find that we cannot add this capability at the top of the inheritance hierarchy, as
it is not common to all components (we want Buttons with regular borders still).
What we really desire here is a way to add common functionality at the
bottom of the inheritance hierarchy—that is, we’d like to write a subclass that
can extend from a variety of super classes. Such a design is called a mixin.3 We
can implement mixins in C++ with templates, templating a class over its parent:
template<typename Parent>
class FancyBorderedComponent : public Parent {
public:
  virtual void draw() {
    Parent::draw(); //use inherited draw method first
    //(some code to draw borders)
  }
};
Now, we can write the FancyBorderedComponent mixin once, and apply
that template to any other class (e.g., FancyBorderedComponent<Button>). The
result will be a class that extends its template parameter, and overrides the draw
method. We could also write additional mixins and layer them together—
effectively stacking functionality onto the bottom of our inheritance hierarchy.
We are not going to delve deeply into mixins, but want to mention them in
case you ever find yourself in a situation where they improve your design. The
concept is one that people may not realize is possible until they see it.
However, once you have seen it, the basic idea is fairly intuitive, and situations
may present themselves where the design idea is quite useful. For those who are
interested in more details and considerations, we recommend reading these
papers about mixins by Yannis Smaragdakis:
http://yanniss.github.io/templates.pdf
and
http://yanniss.github.io/practical-fmtd.pdf.
Chapter 30
Other Languages
At this point, you should have learned a lot about
programming, as well as some specifics of C and C++. If you
have deeply internalized the lessons we presented about the
programming process, reading code, and data structures, you
can pick up new programming languages with relative ease.
The core of the programming process will be the same no
matter what language you use, as data structures and algorithms
are language independent. Most of what you have to learn are
the syntactic details and what is provided by the standard
libraries of a new language.
There are, however, a variety of language design choices
that are good to know about. We will talk about them briefly
here, then give a brief introduction to a few specific (useful)
languages over the next few chapters.
30.1 Garbage Collection
In C and C++, a programmer must manually free memory when
it is about to be no longer referenced. However, as you likely
have noticed by now, programmer errors related to freeing
memory can lead to a variety of bad program behavior. Instead,
many languages include garbage collection—the language
runtime automatically frees heap-allocated memory that is no
longer reachable.
The upsides of garbage collection are that it makes it
easier for the programmer to write correct code, by eliminating
whole classes of errors that the programmer can make. Double
freeing a piece of memory is not possible, as you cannot
manually free any pointers. Likewise, you cannot free an
improper pointer, nor free memory then use it. Memory leaks
are no longer possible either, as the garbage collector will
collect the memory.
The downside of garbage collection is that it may cause
the program to “pause” when the garbage collector runs. Most
garbage collection strategies involve stopping the entire
program and running the garbage collection algorithm when the
heap is full (the program would need to request more memory
from the OS to satisfy a memory allocation request). Garbage
collection algorithms typically have to examine at least all live
objects (those that are still reachable), and some examine all
objects in the heap. Accordingly, the garbage collector may take
a significant amount of time (on program execution scales) to
run.
Improving the performance of garbage collection has been
a subject of much research over the past few decades.
Therefore, while there is some performance overhead, it is
much better than it used to be. One important technique that
improves performance significantly is generational garbage
collection—in which the heap is divided into multiple
generations (which may each use different garbage collection
algorithms).
In generational garbage collection, objects are allocated in
the youngest generation, which is also the smallest. As the
youngest generation is small, collections proceed much more
quickly than they would for the entire heap. When objects
survive some number of collections in the youngest generation
(called “minor collections”), they are moved to the next older
generation. Each generation is only collected when it fills up,
and objects that survive a few collections are promoted to the
next older generation. Accordingly, collections in the oldest
generation (“major collections”—which take the longest) are
quite infrequent.
The combination of speed of minor collections and
infrequency of major collections gives generational garbage
collection relatively good amortized performance overheads—if
we look at the impact on runtime over a long running program,
we would expect it to be fairly low. Accordingly, for many
programs, modern garbage collection does not pose a
significant performance impediment. However, if the “long
pause” of a major collection cannot be tolerated—such as in
any real-time system—then garbage collection is not feasible.
We are not going to go into the details of garbage
collection algorithms here (nor the technical details required to
make generational garbage collection work). However, you
should be aware that it exists in many languages, know
generally what the performance implications are, and plan to
learn the details of how it works (e.g., in a compilers class) if
your future programming plans include situations where they
are relevant.
30.2 Language Paradigms
So far, we have seen two language paradigms. C is an
imperative language—we specify how the program
accomplishes its tasks by a series of statements that alter the
state of variables in the program. C++ is an object oriented
language—we specify how the program accomplishes its tasks
by invoking methods on objects (which might then refer to
other objects, and invoke methods on them). Of course, in C++,
we can perform imperative-style programming as well (we
could just write a C-like program in it).
There are, however, other language paradigms. One such
paradigm is functional programming—in which we express our
computation in terms of functions being applied to values.
While this may not sound that different from C at first (where
we have functions too), there are a few key features of
functional programming languages that are significantly
different. First, in a purely functional language, you cannot
change anything once you create it. That is, if x = 3, then we
cannot change x—it will always be 3 for as long as it exists.
This may sound incredibly limiting; however, it actually is
not. We can make another x with a different value; we just
cannot change the existing one. The most notable way in which
we might do so is by recursion—if we have f(int x), and we
call f(3), then x = 3 and we cannot change that. However, we
can recursively call f(2) and then have a different x (i.e., in a
different frame) that is equal to 2. Of particular note, we can
write “loops” with tail recursion (recall that they are equivalent
from Section 7.3), and thus can write any algorithm we could
write in an imperative language.
In a functional language, we also have functional data
structures—we do not modify the data structure with operations
such as add or remove (after all, we cannot modify anything
once it exists), but instead we create a new data structure that is
just like it, except that it has whatever changes are required by
the operation we are performing. Functional data structures
often share as much common substructure as possible for
efficiency.
30.3 Type Systems
Both C and C++ are languages in which we declare the types of
our variables, function return types, and parameters (e.g., we
say int x;), and the compiler checks the types. C and C++ also
allow some conversions between types (either implicitly or
explicitly as we have learned). Other languages differ in these
design decisions, or other aspects of their type system—the set
of rules governing type checking (or the lack thereof).
Some languages do not even have a static type system,
meaning that the compiler does not do any type checking at
compile time. Such languages are also called untyped
languages, although the programmer must often think about
what type of data she is working with. The compiler just does
not check the types (thus they are not declared). Some untyped
languages have runtime type checking—if you try to multiply a
string by a float, the compiler will not give you an error, but the
program will crash at runtime. Such languages are called
dynamically typed languages, as they check the types at
runtime (but not compile time).
Other untyped languages do no type checking at all (e.g.,
assembly—where the programmer specifies the program in
terms of individual machine instructions). An attempt to
multiply a string by a float will result in whatever computation
the programmer has requested in terms of the raw bits in
memory and registers, and the instructions that the programmer
wrote.
Advocates of untyped languages claim that type systems
are constraining, and that they can write more flexible code.
Such claims are based on the idea that you can write code that
does not type check in the type system of some statically typed
language, but the code is perfectly safe—it will never crash.
For example, suppose + is defined on ints (to do integer addition),
floats (to do floating point addition), and strings (to do
concatenation). Then code that adds each element of the array
{3, 4.5, "Hello"} to the corresponding element of the array
{9, 3.14, "World"} will work perfectly fine, but does not
type check in a static type system.
The opposing view is that static type systems are better,
and generally that the stronger the type system (meaning the
more errors it can catch), the better. This view holds that
stronger type systems are preferable as they eliminate classes of
errors that the programmer might make. By allowing the
compiler to detect problems, the programmer can fix the error
at compile time, and thus not have to find it in a test case, then
debug it. If the type-related problem does not come up in a
programmer’s test cases, he may not find it until the product is
released, and the bug causes significant problems. However, if
the compiler catches the bug due to the type system, no test
cases are needed.
Advocates of stronger typing would argue that a strong
type system need not interfere with writing good code. Many
strong type systems include features such as parametric
polymorphism, which allow the same code to be used with
different types, but in a type-safe way. A language could have a
type system that allows an array of “ints or floats or strings” in
a type safe way—in particular, the typing construct that allows
this behavior is called a disjoint union, which means that it can
hold one of a variety of types, but we can tell which variety
is there. SML (which is a very strongly typed language) has this
feature, and we will see it in Section 33.3.4 when we discuss
SML.
The type systems for languages can vary greatly from one
language to another. The most important point here is to not be
surprised if you pick up a new language, and its typing rules are
significantly different from C’s—they will be an important part
of learning that language. The topic of type systems is rather
large and complicated, and there is an entire sub-field of
Computer Science devoted to type theory. For readers who are
interested in a deep study of type theory, we recommend
“Types and Programming Languages” by Benjamin C. Pierce
for further reading (commonly abbreviated TaPL).
30.4 Parameter Passing
Another design choice in programming languages is the rules
for parameter passing. In C, we always use a parameter passing
rule called pass-by-value,1 which is the rule we have learned
since Chapter 2—where we make a copy of the value to pass as
the parameter to a function. This rule is common in
programming languages, but is not the only possibility. Some
languages may even support multiple different rules with bits of
syntax allowing the programmer to specify which parameter
passing rule she wants to use for a specific argument.
Another parameter passing rule that a language might use
is pass-by-reference, in which the language actually passes a
pointer to the argument, and then the body of the function
automatically dereferences it anytime it is used. This behavior is
what you get in C++ when an argument is a reference type
(e.g., you have an argument of type int &). Some people say
that such arguments pass an int by reference, while others
would argue that you are actually passing an int & by value.
Such a distinction is largely semantic-hair-splitting, and arises
from the fact that int & is a valid type for a variable in C++.2
In particular, the following code illustrates the
pass-a-reference-by-value interpretation:
void someFunction(int & x) {
  //whatever code
}
...
int a = 3;
int & b = a;
...
someFunction(b);
Here, we are clearly passing b by value at the call site, as we
make an exact copy of it. That value just happens to be an
int &. Note that there are languages that support pass-by-
reference, in which you cannot declare a variable of reference
type.
Note, however, that in C, if we pass a pointer (e.g., an
int *), there is no ambiguity: we are passing a pointer by
value. There is no such thing as pass-by-reference in C.
Likewise, in C++, if we pass an explicit pointer, we are passing
that pointer by value. We could pass a reference to a pointer
(e.g., an int *&), in which case we could argue that we are
passing a pointer by reference, or passing a pointer reference by
value.
Another parameter passing rule is pass-by-name. In pass-
by-name, the entire argument is logically passed as an
expression, which is re-evaluated every time a use of the
argument is encountered. If the expression passed as an
argument has side-effects, these effects occur each time the
expression is re-evaluated. This parameter passing rule is
typically implemented by encapsulating the argument
expression into an anonymous function that takes no
arguments, and passing a pointer to that function. Each use of
the parameter in the function’s body then calls the function via
the passed pointer to (re-)evaluate the argument expression.
Scala has syntax to pass arguments by name.
The last parameter passing rule we will discuss is pass-by-
need. In pass-by-need, the argument expression is not evaluated
to a value until it is used; however, once it is evaluated, it is not
re-evaluated for subsequent uses. Accordingly, the argument
expression is evaluated either once (if it is actually used one or
more times during the execution of the function) or zero times
(if the function body does not actually use the parameter value).
Because these semantics call for an argument to not be
evaluated until/unless it is needed, they are also called lazy
evaluation. An important consequence of lazy evaluation is that
we might pass in an expression that, if evaluated, would crash
the program (segfault, divide by zero, etc) or go into an infinite
loop; however, as long as the called function does not use the
argument (e.g., due to the control flow), those bad effects will
never occur.
Lazy evaluation is closely related to the short-circuit
evaluation of the || and && operators in C—if we wanted to be
able to implement such behavior in our own functions (or
overloaded operators), we would need a way to declare
parameters as being evaluated lazily.
30.5 And More…
There are a variety of other aspects that differ from one
language to another. Our goal here is not to delve into a
comprehensive study of programming language design choices
(that would be a semester-long course of its own), but rather to
make you aware that things are not always quite the same as
they are in C or C++.
To continue our brief study of programming in other
languages, the next few chapters present a brief introduction to
a few notable languages. These next chapters are by no means
meant to be complete guides to transitioning to those
languages; however, they should provide you with sufficient
information to serve as a starting point to learning those
languages. They will also serve as concrete examples of some
of the language features we described above.
Chapter 31
Other Languages: Java
Java is a natural next language for a student who has learned to
program in C and C++. Java’s basic syntax is quite similar to C
and C++, so you could probably read most Java code at this
point. There are, however, several significant differences.
31.1 Getting Started
Java has a compiler, called javac, which compiles java source
(files ending in .java). Unlike GCC, javac does not produce
instructions that are directly executable by the processor.
Instead, it generates Java Bytecode (which it places in .class
files), which is then interpreted by the Java Virtual Machine
(JVM). You therefore run a Java program with java ClassName,
where ClassName is the name of the Java class that contains
main.
You can install the Java Compiler and Java Virtual
Machine together by installing a Java Development Kit (JDK).
On Linux, you can install OpenJDK through your package
manager (sudo apt-get install openjdk-7-jdk). If you
have Mac OS X, you can install OpenJDK through MacPorts.
You can also download Oracle’s JDK directly from them.
Once you have JDK installed, the first thing you might
want to try to do is compile and run a program that prints
“Hello World”. This program is a canonical first program to
write in a new language because it does something obvious and
is simple. It basically gives you a starting point to make sure
you know how to use the toolchain (compiler, etc) and run the
resulting program. If we were to put the following code into a
file called Hello.java:
public class Hello {
  public static void main(String[] args) {
    System.out.println("Hello World");
  }
}
Then we could compile the code with javac Hello.java.
Assuming that the Java compiler is installed correctly, no error
messages will be printed and the compiler will produce
Hello.class. You can then run this class with java Hello and
it should print Hello World.
If we look at the code from Hello.java we can see many
familiar constructs, and some slightly odd looking things.
Starting on the first line, we can see that we are declaring a
class called Hello, and that there are curly braces enclosing the
members of that class. Unlike C++, we see that the class is
declared with public right before it. This declaration shows
two differences between Java and C++. First, visibility
modifiers (such as public) are written directly before each
member in Java, whereas in C++ we write them with a colon,
followed by several members that have the specified visibility.
The second difference is that we are declaring the top-
level class itself to be public. The reason for this difference is
that in Java, code can be divided into packages. Packages act
much like libraries in C, and can have classes that are not
visible to code outside of the package. The public modifier on
the class declaration says that this class is visible to any other
class regardless of package (this code is also not placed in any
particular package).
31.1.1 Differences in main
On the second line of Hello.java, which says
public static void main(String[] args) { we see several
things that look familiar, but that are subtly different from what
we expect. The similarities with C lead us to believe that we are
declaring a static method called main, which returns void and
takes one argument called args of type String[]. From what
we just learned, we should not be surprised to see the public
modifier written directly in front of the declaration.
However, it may seem a bit odd that main is declared
inside of a class. In C++ we could declare a method called main
in a class, but it would not be the main function that serves as
the entry point to the program. In Java, however, all code goes
inside of a class, including main. As main is not particular to
any specific instance of the class (it will be executed before we
have a chance to make any instances), it is declared static. We
will note that it is perfectly fine to have main inside each of
many classes in a program if you want—the one that starts the
program is the one inside of the class that you tell java to run.
Next, it may seem a bit odd that main returns void (rather
than int). In Java, main returns void as it assumes the program
succeeded unless it specifically exits another way (via
System.exit(status)).
It may also seem a bit odd that main takes one argument:
String[] args. As with C/C++, the argument to main
represents the command line arguments to the program.
However, Java has a built-in String type, and has arrays that
are objects that have a field for their length. Accordingly, what
you are used to as argc can be obtained by args.length, and
the elements of args in Java are similar to the elements of argv
in C/C++.
However, unlike argv in C/C++, the strings passed to main in
Java do not include the name of the program (which would be
java) nor the name of the class being run: the command line
arguments start after the class name. Unlike C/C++,
where you can omit the arguments to main if your program
does not need them, Java requires that main take exactly one
argument of type String[]. Of course, the name of this
argument does not matter (though args or argv are
conventionally the most common).
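To make these differences concrete, here is a small sketch (the class name Args and its helper method are our own invention, not from the text) showing main's required signature and how args replaces argc/argv:

```java
// Args.java: a sketch of how Java's main receives command line arguments.
public class Args {
    // args.length plays the role of C's argc (minus the program name)
    static String summarize(String[] args) {
        return "got " + args.length + " argument(s)";
    }

    public static void main(String[] args) {
        System.out.println(summarize(args));
        for (String a : args) { // each element is like argv[i] in C
            System.out.println(a);
        }
    }
}
```

Running java Args one two would print got 2 argument(s), followed by each argument on its own line.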
31.1.2 Differences in Printing
Inside the body of main, we see another difference. Instead of
using printf to print to stdout, we use System.out.println.
Here, System is a class (remember, all things are inside of
classes in Java), and out is a public static member inside of it,
whose type is PrintStream, which is Java’s class
for printing various types of data to a stream. System has two
public fields of this type, out (for stdout) and err (for stderr), as
well as one field (in, for stdin) of type InputStream.
The println method of a PrintStream prints out the data
passed to it, followed by a newline. There are many
overloadings of it to print various types of data. There is also a
print method, which prints the argument without adding a
newline. Both println and print only take one argument
(unlike printf, which takes an arbitrary number of arguments).
However, Java overloads the + operator to concatenate Strings.
If the left operand of + is a String and the right is not, the right
operand will be converted to a String before concatenation.
Accordingly, printf-style conversion specifiers are not needed
in Java. Instead, one simply concatenates the desired data into a
String:
1 int x = 3;
2 System.out.println("x = " + x);
If you just want to print an integer by itself, you do not
need to concatenate it with a String as println is overloaded
to take an int.
31.2 Primitives and Objects
Java has two primary categories of types: primitive types and
objects. The eight primitive types in Java are boolean (true or
false), byte (8-bit signed integer), char (16-bit Unicode
character), double (64-bit IEEE floating point), float (32-bit
IEEE floating point), int (32-bit signed integer), long (64-bit
signed integer), short (16-bit signed integer). Java also has
void, which is much like void in C/C++. However, void is not
listed as a primitive type in the Java Language Specification, as
it does not behave like the other types (you cannot make
variables nor arrays of type void).
The primitive types work much like you expect from
C/C++, except that the language specification says exactly how
large they are and what representation they have—in Java, an
int will always be a signed 32-bit 2’s complement integer. Java
also has somewhat more restrictive rules about when explicit
casts are required to convert between primitive types than C
does.
Every other type in Java is a class type. Java does not have
structs nor unions (Java does have enums, but unlike C’s, they
are themselves class types). In Java, variables of class type are
always references, although those references are not quite the
same as C++’s references—they are more like C++’s pointers
(i.e., assigning to them changes what object they point at and
they can be null). Passing an expression with class type as the
argument to a method call passes a copy of the pointer. The
method can modify the object through that pointer, but if the
method changes where the pointer points, the value in the
calling method is unchanged (exactly as would happen if we
passed a pointer to an object to a function in C++).
In Java, all class types implicitly inherit from the built-in
Object class, unless they explicitly inherit from some other
class (in which case, Object is still a super-class, as their
parent, grandparent, or some other ancestor eventually inherits
from Object). The Object class has a handful of important
methods. For example, Object has a method
public String toString(), which converts that object to a
String representation (each class should override it to specify
its own way to convert to a String).
Another important method in Object is the equals
method, which compares two objects for equality. In Java, the
== operator, when used on objects, compares the pointers, to
see if they are referencing the exact same object (just as would
happen if we used == to compare pointers in C++). If we want
to perform a comparison of the contents of the object (e.g., to
see if two Strings are the same sequence of characters), then
we have to use the equals method. When you write a class, you
should generally override the equals method to specify how to
compare your objects for equality.
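As a sketch of this distinction, consider the following class (Point is our own invention, not from the Java API):

```java
// Point.java: illustrates == (reference identity) versus equals (contents).
public class Point {
    private final int x, y;

    public Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) { return false; }
        Point p = (Point) o;
        return x == p.x && y == p.y; // compare contents, not pointers
    }

    @Override
    public int hashCode() { return 31 * x + y; } // kept consistent with equals

    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same contents
    }
}
```

Overriding equals should generally be accompanied by a consistent hashCode, as hash-based containers rely on that agreement.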
31.3 Object Creation and Destruction
In Java, all objects reside in the heap—it is not possible to
create an object that resides directly in a stack frame. Object
creation is done with the new operator, which is quite familiar
by now, and has very similar syntax. However, there are some
important differences between C++ and Java in terms of object
creation and destruction.
First, Java does not have delete. Instead, Java uses
garbage collection to automatically reclaim heap space that is
no longer reachable. Note that by not having delete, and by
having no ability to get a pointer to anything on the stack (there
is also no address-of operator, nor pointer arithmetic), Java
avoids all problems with dangling pointers by making them
impossible to create. Java classes also do not have destructors.1
Java classes do have constructors, which look fairly
similar to C++’s, except that there is no such thing as an
initializer list. Instead, the constructor simply initializes the
fields directly with assignment statements. Unlike C++, there is
no default initialization of class-type fields with their default
constructor—as they are all effectively pointers, they are
simply initialized to null, and the constructor can then
properly initialize them as needed. Java constructors can
specify arguments to parent constructors by having a call to
super() as their first line, passing in the desired arguments to
that call—super() is a call to the super constructor, and
implicitly is done with no arguments if not explicitly requested,
unless the first line of the constructor is this() (with some
number of arguments in the parenthesis), in which case, the
constructor “chains” to another overloading of the constructor,
by calling the matching one with the specified arguments. That
constructor then either chains to another, or calls super()
(implicitly or explicitly).
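A brief sketch of this chaining (the Animal and Dog classes here are hypothetical, chosen only for illustration):

```java
// Demonstrates super(...) and this(...) in Java constructors.
class Animal {
    protected final String name;
    Animal(String name) { this.name = name; }
}

public class Dog extends Animal {
    private final String breed;

    public Dog(String name, String breed) {
        super(name);        // first line: explicit call to the parent constructor
        this.breed = breed; // fields are initialized by plain assignment
    }

    public Dog(String name) {
        this(name, "unknown"); // chains to the two-argument constructor
    }

    public String describe() { return name + " (" + breed + ")"; }

    public static void main(String[] args) {
        System.out.println(new Dog("Rex").describe()); // Rex (unknown)
    }
}
```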
Note that Java does not have a “copy constructor” nor a
“copy assignment operator”. Copying objects is much less
common in Java, as we work with pointers to the objects, and
we do not have to worry about dangling pointers or when to
free an object (because Java has garbage collection). If we do
need to make deep copies of our objects, we can override the
clone() method (which is inherited from Object, but the
inherited version just throws an exception indicating that the
class cannot be cloned). Likewise, if we wanted to modify an
object to update its state to be a copy of another object (like
operator= in C++), we would write a method to do so.
31.4 Inheritance
Java has inheritance, which behaves similarly to C++’s,
although there are some significant differences. The first major
difference is that dynamic dispatch is the only option for
overridable methods. There is no way to mark a method for
static dispatch in Java the way a non-virtual method is in C++
(only static, private, and final methods, which cannot be
overridden, are dispatched statically). The second major
difference is that an object’s dynamic type does not change
during object construction. As soon as the memory is allocated
for the object, its dynamic type becomes whatever type was
newed, and that type never changes. The important
consequence of this difference is that if method calls are made
by parent constructors, they will always be dynamically
dispatched to the methods written in the class actually being
created. Consequently, there is no possibility of accidentally
calling an abstract method with no implementation during
object construction (see Section 18.6 for a refresher on C++’s
abstract classes).
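The following sketch (classes of our own invention) shows a method call in a parent constructor dispatching to the subclass's override:

```java
// A parent constructor's method call dispatches to the subclass override.
class Base {
    final String seen;
    Base() {
        seen = label(); // dispatches dynamically, even mid-construction
    }
    String label() { return "base"; }
}

public class Derived extends Base {
    @Override
    String label() { return "derived"; }

    public static void main(String[] args) {
        // The override runs even though Derived's constructor has not finished.
        System.out.println(new Derived().seen);
    }
}
```

One caveat: if the override had read a field initialized by Derived's constructor, it would have seen that field's default value, so calling overridable methods from constructors is still best avoided.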
Another important difference is that Java does not have
multiple inheritance. As you may recall (from Section 29.2),
C++’s multiple inheritance results in complicated object
layouts, the need to resolve ambiguities in things inherited from
multiple parents, and the potential for multiple copies of
grandparents (unless virtual multiple inheritance is used). Java
avoids this entire set of issues by not having multiple
inheritance at all. However, Java recognizes the benefits of
being able to polymorphically treat one type as many others
(which may not be related to each other).
Java gives the ability to treat a class as multiple different
types by introducing the concept of interfaces. An interface
specifies a set of methods that must be implemented (or the
class must be declared abstract), and may not contain any fields
except for ones that are static and final (final being roughly
Java’s analogue of C++’s const). A class that implements an
interface (Java uses
implements for interfaces, and extends for classes) can be
polymorphically treated as the interface type. However, as the
interface has no fields, it does not complicate object layout in
the same ways that multiple inheritance does. Dynamic
dispatch through an interface is a bit more complex, however.
Java also provides the ability to declare a class as final,
meaning that it cannot be used as the parent of any other class.
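A minimal sketch of these two features together (the interface and class names here are our own):

```java
// A class may implement an interface and be declared final.
interface Measurable {
    double size(); // interface methods are implicitly public and abstract
}

public final class Square implements Measurable { // final: cannot be subclassed
    private final double side;

    public Square(double side) { this.side = side; }

    @Override
    public double size() { return side * side; }

    public static void main(String[] args) {
        Measurable m = new Square(3); // polymorphically treated as the interface type
        System.out.println(m.size());
    }
}
```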
31.5 Arrays
In Java, arrays are first-class objects. They are created with
the new operator:
1 int[] x = new int[3];
In the case of an array of objects (including an array of
arrays), the elements of the array are initialized to null (not
default constructed as they are in C++). Indexing an array in
Java is much the same as in C or C++: the index is placed in
square brackets after the name of the array (or expression that
evaluates to the array). However, unlike C or C++, arrays know
their length (they have a field called length), and perform
bounds checking on all array accesses. Attempting to access an
invalid array index results in an
ArrayIndexOutOfBoundsException.
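For example (a small sketch of our own; the helper method is not from the text):

```java
// Arrays carry their length and are bounds-checked on every access.
public class Bounds {
    static String tryStore(int[] a, int i) {
        try {
            a[i] = 42;
            return "ok";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "out of bounds";
        }
    }

    public static void main(String[] args) {
        int[] x = new int[3];
        System.out.println(x.length);       // 3: the array knows its length
        System.out.println(tryStore(x, 0)); // ok
        System.out.println(tryStore(x, 3)); // out of bounds: valid indices are 0..2
    }
}
```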
Java’s arrays are covariant, meaning that if Child is a
subtype of Parent, then Child[] is a subtype of Parent[].
Accordingly, it is possible to assign a Child[] to a Parent[]
variable (or pass it as a parameter). This covariance is safe as
long as the array is only being read. However, if the array is
modified, there is a risk of placing an object that is not the
correct type into the array. Java addresses this issue by
performing a runtime check on array stores, and throwing an
ArrayStoreException if the item stored does not have the
correct dynamic type for the dynamic type of the array.
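A sketch of this runtime check (an example of our own devising):

```java
// Array covariance compiles, but an ill-typed store fails at runtime.
public class Covariance {
    static boolean storeSucceeds(Object[] a, Object value) {
        try {
            a[0] = value;
            return true;
        } catch (ArrayStoreException e) {
            return false; // value's dynamic type did not match the array's
        }
    }

    public static void main(String[] args) {
        String[] strings = new String[2];
        Object[] objects = strings; // legal: String[] is a subtype of Object[]
        System.out.println(storeSucceeds(objects, "hello"));            // true
        System.out.println(storeSucceeds(objects, Integer.valueOf(7))); // false
    }
}
```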
31.6 Java API
Java has a rich built-in library (called the Java API). Its
documentation can be found online (for Java 8) at
https://docs.oracle.com/javase/8/docs/api/. You will
notice that the documentation is divided into packages, which
is the way that Java groups code together. The built-in packages
have names that start with java. or javax.. Programmers can
write their own packages too (in which case they should follow
the standard naming convention, described at
https://docs.oracle.com/javase/tutorial/java/package/
namingpkgs.html).
If you want to use classes from any package other than
java.lang, you need to import it—which behaves somewhat
like opening a namespace in C++—or use its fully qualified
name. You can either import all classes in a package (e.g.,
import java.util.*;), or you can import an individual class
(e.g., import javax.swing.JFrame).
As you get started in Java, you will most likely want to
become familiar with the classes in java.lang, java.io, and
java.util first. The java.lang package provides the classes
fundamental to the language—such as String, and System. The
java.io package provides classes for IO operations (opening,
reading, writing files, etc). The java.util provides various
data structures—including ArrayList (which is similar to
C++’s vector), HashMap/HashSet/TreeMap/TreeSet (Maps and
Sets implemented with hashtables/trees), Stack, and
LinkedList.
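A short sketch using two of these classes (the class ApiDemo and its data are our own):

```java
import java.util.ArrayList;
import java.util.HashMap;

// A quick tour of ArrayList (like C++'s vector) and HashMap (a hashtable Map).
public class ApiDemo {
    static int lookupAge(String name) {
        HashMap<String, Integer> ages = new HashMap<String, Integer>();
        ages.put("Alice", 30);
        ages.put("Bob", 25);
        return ages.get(name); // auto-unboxes the stored Integer
    }

    public static void main(String[] args) {
        ArrayList<String> names = new ArrayList<String>();
        names.add("Alice");
        names.add("Bob");
        System.out.println(names.size() + " people");
        System.out.println("Alice is " + lookupAge("Alice"));
    }
}
```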
31.7 Exceptions
Like C++, Java has exceptions; however, there are some
significant differences. The first difference is that Java restricts
what types of objects can be thrown to those that are subclasses
of Throwable. The Throwable class has two direct subclasses,
Error, and Exception. Of these, the Error class is the parent
class for exceptions that indicate situations that are quite
difficult to handle, and should generally not happen—such as
failure to allocate memory, or an inability to load some of the
application’s bytecode.
The Exception class has many subclasses (both direct
children, as well as grandchildren, etc) indicating a variety of
problematic situations. One direct subclass is
RuntimeException, whose children are situations where
requiring the programmer to handle the exception every time it
could possibly occur would be too cumbersome. Subclasses of
RuntimeException include ArrayIndexOutOfBoundsException
(as we mentioned earlier, this exception is thrown when an
array index is out of bounds), NullPointerException (thrown
when the program attempts to dereference a null pointer), and a
variety of others. As you can see from these examples, it would
be a pain if you had to wrap every array index and every object
access in a try/catch.
The remaining exceptions are checked exceptions—they
must either be explicitly handled with try/catch, or the method
must declare them in its throws clause, indicating that it might
throw them. Any calling method must then either explicitly
handle these exceptions with try/catch, or must itself declare
them in its throws clause. Unlike C++, the compiler checks that
these exception specifications are accurate.
Java also introduces finally, which can go with a try,
either after the last catch, or right after the try with no catch
blocks. The finally block is executed whether or not an
exception occurs, and is Java’s way of dealing with non-
memory resources that must be released by a block of code
(whether or not an exception occurs).
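The pieces above fit together as in this sketch (the method and message names are our own, chosen for illustration):

```java
// Checked exceptions: declared with throws, handled with try/catch/finally.
public class Checked {
    static void risky(boolean fail) throws Exception { // throws clause required
        if (fail) { throw new Exception("boom"); }
    }

    static String attempt(boolean fail) {
        try {
            risky(fail);
            return "ok";
        } catch (Exception exn) {
            return "caught " + exn.getMessage();
        } finally {
            // runs whether or not an exception occurred
            System.out.println("cleanup");
        }
    }

    public static void main(String[] args) {
        System.out.println(attempt(false)); // prints cleanup, then ok
        System.out.println(attempt(true));  // prints cleanup, then caught boom
    }
}
```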
The last major difference with Java’s exceptions is that
they are thrown with throw new WhateverExceptionType();
and caught with catch (WhateverExceptionType exn). Note
that Java does not have C++’s references, but that exn would be
a Java-style reference (aka, pointer) to the exception object.
Note that the exception objects in Java are allocated on the
heap, then thrown, as Java does not allocate objects anywhere
else. The exception object will be deallocated at some point
after it is no longer needed—whenever the garbage collector
gets to it. In Java, rethrowing an exception in the catch block is
done by just throw exn; (assuming exn is the name that the
exception was bound to when it was caught, as in our example
catch above).
31.8 Generics
Java does not have templates, but does have generics. Generics
provide similar features to templates (e.g., the ability to write a
LinkedList that can hold any type of data), but are done in a
drastically different way. One of the major differences is that
generics are typechecked when they are declared, not when
they are instantiated. The consequence of this difference is that
the body of the generic must be valid in all cases, not only for
the classes that it is actually applied to. While this may seem to
be overly constraining, Java provides bounded type parameters,
meaning that you can specify that the type parameter must be a
subclass of a particular class (or implement one or more
particular interfaces). For example, you could write a class that
is generic in type T, but type T must be a subclass of
InputStream:
1 class MyClass<T extends InputStream> {
2 //body of class
3 }
In such a class, you can use a T just like you could use any
InputStream, but you cannot expect any other specific behavior.
Note that if you have methods that return a T out of the class,
the type system will still recognize that if you have
MyClass<FileInputStream>, those methods return a
FileInputStream.
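As a sketch (using a Number bound instead of InputStream so that the example is self-contained; the Box class is our own):

```java
// A bounded generic: T must be a Number, and get() returns the precise T.
class Box<T extends Number> {
    private final T value;
    Box(T value) { this.value = value; }
    T get() { return value; }                         // the exact type T is preserved
    double asDouble() { return value.doubleValue(); } // allowed: every T is a Number
}

public class BoxDemo {
    public static void main(String[] args) {
        Box<Integer> b = new Box<Integer>(42);
        Integer i = b.get(); // no cast needed: get() returns Integer here
        System.out.println(i + " " + b.asDouble());
    }
}
```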
Another major difference between C++’s templates and
Java’s generics is that Java’s generics compile into one class,
which is just used for all instantiations. By contrast, C++
generates a different specialization of each template for each
different set of parameter values passed to it. One pragmatic
consequence is that you cannot create a new T() in a class that
is generic in T. That is, the following code is illegal:
1 //will not compile
2 class MyClass<T> {
3 void someMethod() {
4 new T(); //illegal
5 }
6 }
31.9 Other Features
Java has a variety of other interesting features. One of them is
class loading—a class can be dynamically loaded from any
valid sequence of bytes (e.g., read from a file, read across the
network), and then used. If a programmer so desires, she can
write custom class loaders for whatever purposes she might
need.
Another interesting feature is reflection, which allows a
program to access the members of a class in a way that does not
require them to be known at compile time. For example, you
can iterate over the declared methods, fields, or even inner
classes of a particular class, looking for one that meets specific
characteristics. In combination with class loading, reflection
might be useful for loading plugins. It can also be useful in
code analysis tools, where you might want to find the properties
of a class without parsing it.
Another feature is native methods—sometimes you cannot
write code in Java because it interacts with the system at a low
level, or you wish to use a library that is written in C. Java
provides an interface (the JNI—Java Native Interface) to allow
Java code to call code written in C, C++, assembly, or
potentially other languages.
The last feature we will mention is that Java has two
toolkits for graphical user interfaces (GUIs). These provide a
way to make applications with windows, buttons, text fields,
etc.
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Chapter 32
Other Languages: Python
Python—named after Monty Python and not the snake—is a
high-level programming language known for its readability.1
Unlike the other languages we have (or will) see, Python is
dynamically typed. Like Java and SML, Python is garbage
collected. Although you may find yourself in a situation in
which you are working with a large, complex code base written
in Python, most people’s first interaction with Python is as a
tool for writing quick scripts that involve file and string
manipulation (where Python truly excels). In contrast to the
low-level C, which can often be cumbersome to write but
produces a very fast executable, Python supports quick
implementation at the expense of slower runtimes.
Python code is usually interpreted rather than compiled. A
C programmer will first write code, then compile the code into
a binary, and finally execute the binary. A Python programmer
will write code and then have it interpreted—read and executed
in a single, final step (recall our discussion of this in Section
5.4). Python compilers that create standalone executables from
Python source code do exist, but they are not commonly used.
32.1 Getting Started
Python interpreters are widely available, first and foremost at
python.org. This is also a great place to find helpful tutorials
and FAQs.
Once a Python interpreter is installed, it can be invoked
via the command line in two modes of use: normal or
interactive. In normal mode, you provide Python with the
source code it should interpret. For example, if we were to put
the following code into a file called hello.py:
1 print "hello world!"
Then you could have that code interpreted by the Python
interpreter by typing “python hello.py” on the command line
from within the directory that contains your code. At this point,
you will see the output “hello world!” on the screen.
Notice how extremely simple and readable this program is.
Compare this to the Hello World code we saw in the previous
chapter for Java (which is also a higher-level language than C)
and you can see how efficient coding in Python can be.
An alternative approach to this version of Hello World is
to put the following code into a file called hello2.py:
1 #!/usr/bin/python
2
3 print "hello world!"
In this version, the first line of the file indicates where the
interpreter is that should be invoked for the file. The #!
specifies the interpreter for the script, as discussed in Section
B.10. The path that follows (/usr/bin/python) is specific to a
particular machine. To determine the right path for your
machine, simply type which python on the command line. If a
Python interpreter is installed on that machine, this command
will output the correct path. If you use the approach of
hello2.py, you need to change the permissions of your source
code and make it executable. You can do this by typing chmod
+x hello2.py on the command line. Now, technically, you
have created a Python executable. In reality, it is a script that
invokes the Python interpreter and gives it the remainder of the
file to interpret. You can then “run” your script by typing
“./hello2.py” on the command line.
Note that it is a convention but not a requirement to give
files that contain Python code a .py suffix. It would be
semantically equivalent (but hardly user-friendly!) to name a
file of Python code hello.blah.
For interactive mode, you invoke the Python interpreter on
the command line and then enter Python code directly rather
than storing it into a file. To leave the interpreter
at any time, simply type Control-D. The following is a
transcript of an interactive version of the same Hello World
program:
1 % python
2 Python 2.7.6 (default, Sep 9 2014, 15:04
3 [GCC 4.2.1 Compatible Apple LLVM 6.0 (cla
4 Type "help", "copyright", "credits" or "l
5 >>> print "hello world!"
6 hello world!
7 >>> ^D
8 %
Line 2 indicates the version and date of the Python
interpreter that was invoked. Line 3 indicates the version of
GCC that was used to compile the interpreter and on what
platform that took place. The same transcript on a different
machine might look like this:
1 $ python
2 Python 2.3.4 (#1, Apr 15 2011, 17:40:06)
3 [GCC 3.4.6 20060404 (Red Hat 3.4.6-11)] o
4 Type "help", "copyright", "credits" or "l
5 >>> print "hello world!"
6 hello world!
7 >>>
8 $
Notice in this version that the command line prompt ($
versus %) and the version of Python both differ. Also, the
Control-D to exit Python was not printed to the screen. The
interaction with the Python interpreter, however, remains the
same.
The remainder of this chapter will briefly introduce you to
some of the most important concepts in Python. The intent is
not to teach you all of Python in a few short pages but to give
you a feel for the language, show you a few key examples that
would enable you to start writing your own Python programs,
and to serve as a starting point should you wish to delve further
using other resources.
32.2 Basic Types
Python is a dynamically typed language, so we do not declare
variables with types. For example, we can just type the
following lines of code (either into a file or line by line into the
interpreter):
1 a = 3
2 b = 3.0
3 c = "three"
Python will track the value and type of the variables, and
know that a holds the int 3, b holds the floating point number
3.0, and c holds the string "three".
Python can also evaluate mathematical expressions
without storing them to variables. For example, one could
simply type:
1 >>> 8 * 5 + 14
2 54
3 >>>
For this reason, many people use Python as a command
line calculator. One could write a program in C to perform this
calculation, but it would take significantly longer than it does in
Python.
Python supports a variety of convenient operators for
strings. Strings can be concatenated using the + operator.
Substrings can also be easily accessed using an array-like
notation, as shown in the following example:
1 >>> a = "hello"
2 >>> b = "world"
3 >>> c = a + b
4 >>> print c[0]
5 h
6 >>> print c[0:8]
7 hellowor
8 >>> print c[2:4]
9 ll
A lone subscript will access a single character. The x:y
notation will access characters at indices x up to (but not
including) y.
Python also has the basic boolean values True and False.
Here is a very simple example of a boolean variable in action:
1 >>> answer = 2 > 8
2 >>> print answer
3 False
4 >>>
32.3 Data Structures
Python offers a rich supply of data structures. Very briefly, they
are as follows:
List a sequence of items. The sequence, which is
surrounded by square brackets, can be changed
once it is created. For example:
1 >>> myShoes = ['sandals', 'sneakers', 'stilettos']
2 >>> yourShoes = ['sandals', 'flats', 'moccasins']
3 >>> myShoes[1] = 'boots'
4 >>> print myShoes
5 ['sandals', 'boots', 'stilettos']
6 >>> all = myShoes + yourShoes
7 >>> print all
8 ['sandals', 'boots', 'stilettos', 'sandals', 'flats', 'moccasins']
9 >>>
In this example, we create two lists, modify
one of them, then concatenate them into a third list.
Tuple
a sequence of items. The sequence, which is
surrounded by round brackets, cannot be changed
once it is created. Notice that in the following
code:
1 >>> myShoes = ('sandals', 'sneakers', 'stilettos')
2 >>> yourShoes = ('sandals', 'flats', 'moccasins')
3 >>> myShoes[1] = 'boots'
4 Traceback (most recent call last):
5   File "<stdin>", line 1, in <module>
6 TypeError: 'tuple' object does not support item assignment
7 >>> all = myShoes + yourShoes
8 >>> print all
9 ('sandals', 'sneakers', 'stilettos', 'sandals', 'flats', 'moccasins')
10 >>>
Line 3 produces a type error because we
attempt to perform an assignment statement with
an l-value that is not mutable.
Set a sequence of unordered items with no
duplicate items (exactly the Set ADT we learned
about in Section 20.5). A quick example of some
set operations are as follows:
1 >>> myShoes = ['sandals', 'sneakers', 'stilettos']
2 >>> yourShoes = ['sandals', 'flats', 'moccasins']
3 >>> mySet = set(myShoes)
4 >>> yourSet = set(yourShoes)
5 >>> shoesWeBothHave = mySet & yourSet
6 >>> print shoesWeBothHave
7 set(['sandals'])
8 >>> allTheShoes = mySet | yourSet
9 >>> print allTheShoes
10 set(['flats', 'moccasins', 'sandals', 'sneakers', 'stilettos'])
11 >>> shoesOnlyIHave = mySet - yourSet
12 >>> print shoesOnlyIHave
13 set(['stilettos', 'sneakers'])
14 >>>
Notice that a set is created by explicitly
making a set out of a list. Lines 5, 8, and 11 show
how to perform set intersection, union, and
difference, respectively.
Dictionary
is Python’s implementation of a Map ADT
(which we learned about in Section 20.6)—so it
maps keys to values. This is perhaps the most
beloved of all Python data structures because, as
we have mentioned before, maps are powerful and
ubiquitous in programming. The Python dictionary
provides syntactic convenience for writing map
manipulations easily. The following code shows
some basic dictionary operations:
1 >>> shoePrices = {}
2 >>> shoePrices['boots'] = 100
3 >>> shoePrices['sandals'] = 20
4 >>> shoePrices['moccasins'] = 40
5 >>> print shoePrices
6 {'boots': 100, 'moccasins': 40, 'sandals': 20}
7 >>> print shoePrices.keys()
8 ['boots', 'moccasins', 'sandals']
9 >>> print shoePrices.values()
10 [100, 40, 20]
11 >>>
32.4 Control: Loops and Whitespace
Python has a very simple construct for loops. Below is an
example of a while loop:
1 i = 0
2 while (i < 5):
3     print i
4     i = i + 1
When run, this code produces the following output:
1 0
2 1
3 2
4 3
5 4
Notice that there are no braces to demarcate the body of
the while loop. They are not necessary because whitespace is
meaningful in Python. Consider what would happen if the user
failed to type in the spaces on line 4, like this:
1 i = 0
2 while (i < 5):
3     print i
4 i = i + 1
The result would be an infinite series of 0’s, one per line.
Here is an example of a simple for loop:
1 myShoes = ['sandals', 'sneakers', 'stilettos']
2 for shoe in myShoes:
3     print "I like " + shoe
which produces the following output:
1 I like sandals
2 I like sneakers
3 I like stilettos
The range function is also popular when writing for loops.
It takes a lower and an upper bound and can be used within for
loops as in the following example:
1 for i in range(0, 5):
2     print "I have " + str(i) + " pair of shoes!"
which produces the following output:
1 I have 0 pair of shoes!
2 I have 1 pair of shoes!
3 I have 2 pair of shoes!
4 I have 3 pair of shoes!
5 I have 4 pair of shoes!
Notice that the indices of the range function work
identically to the indexing for lists, shown earlier. Also, in order
to concatenate the string and the integers, the integer must first
be converted to a string, as shown on line 2.
32.5 Example: Reading from a File
As a final example, we will revisit the task from Video 12.6 in
which we were charged with the task of reading lines from a
file and printing them out. Recall that our C implementation
looked like this:
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(void) {
5   char * line = NULL;
6   size_t sz = 0;
7   ssize_t linelen;
8   FILE * f = fopen("names", "r");
9   while ((linelen = getline(&line, &sz, f)) >= 0) {
10     printf("linelen = %zd, line = %s, sz = %zu\n", linelen, line, sz);
11     printf("%s", line);
12   }
13   free(line);
14   fclose(f);
15   return EXIT_SUCCESS;
16 }
In Python, we can write this program with much less code.
We do not need to include headers, write main, declare
variables (Python is dynamically typed), or free memory
(Python is garbage collected). The code in Python would look
like this:
1 with open('names', 'r') as f:
2     for line in f:
3         print line
If the above example doesn’t motivate you to learn more
about Python whenever you must do anything with files and
strings, nothing will! To delve deeper into Python, we
recommend next learning more about its string processing
capabilities (especially the strip function), its use of functions
(including lambda functions), and how it supports objects.
Chapter 33
Other Languages: SML
In this chapter, we will introduce you to SML—a strongly typed
functional programming language with parametric polymorphism.
The specific variant of SML we will be using is SML-NJ
(Standard ML of New Jersey). If you are using Linux, you can
install smlnj through your package manager (sudo apt-get
install smlnj). If you are running Mac OSX, you can install it
either by downloading a disk image (dmg) or via Mac Ports. As
always, you should be editing your code in Emacs or vim. If you
are using Emacs, sml-mode is highly recommended (on Linux,
sudo apt-get install sml-mode).
33.1 Getting Started
SML is a strongly typed functional language, with parametric
polymorphism. However, unlike many strongly typed languages
that you might be familiar with, SML infers the types of all
variables—you do not (typically) have to write them down
yourself. The type it infers is the most general type—that is,
whenever polymorphism is safe, SML will infer a polymorphic
type for any variable or function.
SML is a functional language—computation (typically)
consists of functions applied to values. These functions return
new values, which are then passed to other functions. Some of
these values might themselves be functions, which can be passed
as arguments to or returned from other functions. However, SML
is not purely functional: it provides ways for side-effecting
computations if you so desire.
SML-NJ has a “read-eval-print-loop” (REPL). When you
start SML (either by running sml at the command line, or by
doing M-x run-sml in Emacs), you will get a banner stating the
version, and then a prompt that is a single dash (-). If you type a
valid SML expression at this prompt, SML will evaluate it and
print out the value.
As we have mentioned before, the “canonical” first program
to write in any language is “Hello World.” At the SML prompt,
type
1 "Hello World";
and hit enter. SML will evaluate this expression, and print the
result:
1 val it = "Hello World" : string
The string literal Hello World is already a value, so SML simply
evaluates it to itself. SML then binds that value to the variable it,
since you did not ask for it to be bound to anything else (that is
what the val it = part is). It also reports the type of its answer,
in this case a string.
Note that the semi-colon at the end of the line is here to tell
the REPL that we are done typing input—it is not used to end
every statement. If you forget to put the semicolon when writing
at the REPL, you will get an = prompt, which indicates that SML
is expecting you to continue what you were typing. If you simply
forgot the semi-colon, you can just type it and hit enter.
At this point you may argue that the canonical first
programming exercise is to print Hello World, not to simply
evaluate the expression. We can do that in SML as well, by using
the print function. Type the following at your SML prompt:
1 print "Hello World\n";
Note that we added a \n to the end of the string—as with C (and
many other programming languages), \n is the escape sequence
for a newline character. When you evaluate this expression—
which passes the string "Hello World\n" to the print function—
SML will print:
1 Hello World
2 val it = () : unit
The first line of this output is the result of the call to print. The
second line of the output is the value that the expression evaluated
to. In this case, the value is () (which is pronounced unit) of type
unit. The unit type—which has only one value: unit—is the type
used to indicate no meaningful value, much like void in C or Java.
It is typically used in side-effecting computations (print is a side
effecting computation: it has the side effect of outputting a string,
but computes no meaningful value).
Note one subtle detail, which may confuse novice SML
programmers: it did not change types or values in the above
example—there are two different variables named it. The first
one has type string and value "Hello World", the second one has
type unit and value unit. This may seem strange and subtle, but it
is important. Once you bind a value to a variable, that binding
never changes. You can have other variables later of the same
name, but cannot change what has already been bound. If you re-
bind a name to a new value, subsequent uses of that name will
reference the new binding. However, anything between the first
and second binding will continue to refer to the original binding.
When writing larger pieces of code, you will not want to
type it all in at the REPL—any mistake or changes will be a pain
to edit. You can tell SML to load a file with the use command:
use "myfile.sml". Later on, we will see how to use the
Compilation Manager to compile and load a large set of files, but
for now, either enter short pieces of code at the REPL, or type
longer pieces of code into a file, and use use.
33.2 Basic: Math, Conditionals, Functions,
and Tuples
Printing strings is useful, but most computation revolves around
math. SML supports integers (of type int) as well as floating point
numbers (of type real). If you type 3 at your SML REPL, SML
will report that it evaluates to 3, and is of type int. Likewise, if
you evaluate 3.14, SML will evaluate it to 3.14, and tell you that
it is of type real.
SML has infix math operators for addition (+), subtraction
(-), and multiplication (*) for both integers and reals. You can
experiment with expressions like 2+4, 3.14 * 6.2, etc., and it
will evaluate them. These operators all have the standard
precedence and associativity that you expect. SML also has
division; however, for real numbers it is the / operator, but for
integers, the operator is called div (it is still infix: 99 div 11).
Integers also support mod, which is another infix operator.
Note that the +, -, and * operators are overloaded to either
take two ints and return an int, or to take two reals and return a
real. This overloading is a special case that is built into the
language for a handful of mathematical operators. You cannot, in
general, do your own operator overloading like this. Also, note
that you cannot “mix and match” ints and reals with these
operators: 3 * 6.2 is ill-typed—you must either explicitly
convert the int to a real, or the real to an int before operating on
them. We will see how to do such a conversion later.
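Although conversions come later, here is a small preview, using two functions from the SML Basis Library (Real.fromInt and Real.round):

```sml
(* Convert the int to a real before multiplying *)
val x = Real.fromInt 3 * 6.2;

(* Or round the real to the nearest int and use integer math *)
val y = 3 * Real.round 6.2;    (* val y = 18 : int *)
```

Note that function application binds more tightly than *, so Real.fromInt 3 * 6.2 converts 3 first and then multiplies.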
33.2.1 Conditionals
SML has a type bool, with two values: true and false. It also
has the standard comparison operators: <, <=, >, >=, = (equal to),
and <> (not equals). There is no need to have different operators
(e.g. = vs ==) to distinguish assignment from comparison, as they
cannot appear in the same grammatical contexts. SML also has
not (which negates a boolean), andalso (which performs a
logical and), and orelse (which performs a logical or).
SML has two ways to make conditional decisions (although,
as we shall see shortly, one is actually just shorthand for the
other). Both styles are conditional expressions, not statements—
they evaluate to a value. Relative to C (and C++ and Java), they
are most like the cond ? expr1 : expr2 construct (which we
mention in Section E.1).
The first way to write a conditional expression is with if-
then-else. In SML, since this is a conditional expression, there is
always an else—omitting it is a parse error. The syntax of if-
then-else is
1 if expr then expr else expr
where the first expression (conditional test) must have type bool,
and the other two expressions must have the same type. For
example:
1 - if 3 < 4 then "hello" else "goodbye";
2 val it = "hello" : string
The second way to write a conditional is the case expression.
The case expression is much more general than if-then-else, and
much more powerful than the switch/case construct of C or Java.
The SML case expression has the syntax
1 case expr of
2 pattern => expr
3 | pattern => expr
4 | pattern => expr
5 ...
The case expression matches the first expression against each
pattern, and evaluates to the expression on the right-hand side of
the matching pattern. We will learn more about pattern matching
as we introduce other data types/structures throughout the
document. For now, suffice it to say that any constant can
be used as a pattern, and that _ can be used to indicate “match
anything.” For example, if x is an int, then we could have
1 case x of
2 0 => "its zero"
3 | 1 => "its one"
4 | _ => "something else"
Note that all cases must result in the same type. A bit of trivia: if-
then-else is typically converted to case inside the compiler:
if e1 then e2 else e3 is simply
1 case e1 of
2 true => e2
3 | false => e3
You can also write a variable name in a pattern, in which
case it matches any value, and that value is bound to the
variable. For example:
1 case x * y of
2 0 => 0
3 | z => z - 1
In this expression, we compute x * y and pattern match its
value. If x * y evaluated to 0, the whole expression evaluates to
0. Otherwise, we bind the result to z (as if we had written
int z = x * y; in C), and the expression evaluates to z-1.
33.2.2 Functions
Now that we have some basic building blocks, we are ready to
start writing some functions. We will start with a very simple
example function:
1 fun add3 x = x + 3
The keyword fun tells SML that we are defining a function. After
that comes the name of the function we are defining (add3)
followed by the name of its argument (x), then an equal sign, and
an expression for the function’s body. Note that in SML all
functions take exactly one argument and return exactly one value
(this may sound restrictive, but we will see shortly that it’s really
no problem at all).
If we evaluate the above function at the REPL, it will report:
1 val add3 = fn : int -> int
Here, SML is telling us that it has bound a value to the name
add3. The value is represented as simply fn, indicating that the
value is a function. The type is reported as int -> int (read “int
to int”). The arrow indicates a function type. The type on the left
side of the arrow is the argument type, and the type on the right
side of the arrow is the return type.
SML inferred the types of add3 and x for us. It sees that x is
being added to 3, so x must be an int. Knowing that x must be an
int, and that x is the argument to add3, SML figures out that add3
must take an int as an argument. It figures out that add3 must also
return an int because the body (x+3) has type int.
If we want to use the add3 function with an argument of 4,
we write:
1 add3 4
To a C or Java programmer, this may seem odd: the function call
lacks parentheses. In SML, writing two expressions next to each
other means “function call”. Parentheses are only needed for order
of operations/binding, or for tuples (which we will discuss
momentarily).
We can write more sophisticated and interesting functions
using recursion. For example we can write the factorial function:
1 fun fact x =
2 if x = 0
3 then 1
4 else x * fact (x - 1)
Try entering the factorial function at your REPL, and then
try calling it with a few values.
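For example, a short REPL session (the function is re-entered so the snippet is self-contained):

```sml
fun fact x =
  if x = 0
  then 1
  else x * fact (x - 1);

fact 5;    (* val it = 120 : int *)
fact 0;    (* val it = 1 : int *)
```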
33.2.3 Tuples and Pattern Matching
An alternative way to write factorial is with tail recursion;
however, at first glance, this seems problematic: tail recursive
factorial appears to require two function arguments, but SML only
allows one argument for any function. This restriction is not
problematic, as the argument (and return value) may be tuples—
pairs, triplets, quadruplets, etc., of data.
A tuple is written in parenthesis with commas separating its
elements, for example (3,5) or (7,true,"hello"). The first of
these has type int * int (i.e., a pair of integers), and the second
of these has type int * bool * string—a three tuple with an
int, a bool, and a string. In types * indicates a tuple, in
expressions, * represents multiplication. Note that () is actually a
tuple with zero elements, and as mentioned before has a special
name: unit.
There are functions to access the individual elements of
tuples, but they are hardly ever used. Instead, tuples are frequently
deconstructed by pattern matching—one of the nicest features of
SML. The easiest way to explain a pattern match is to start with
an example:
1 fun fact_tail t = case t of
2 (0,ans) => ans
3 | (x,ans) => fact_tail (x-1, ans * x)
Here, fact_tail takes a single argument (t), which is a tuple
(specifically, int * int). The body of the function uses case to
match on t and pull it apart. The first case has a pattern that
matches a tuple where the first element is 0. In the spot for the
second element, there is a variable name, so this variable is bound
to the second element of the tuple. The right hand side of the => is
the expression to evaluate to when this case is matched. The
second case, started by the |, matches any two element tuple—
both elements have variables for their patterns. When this pattern
is matched, x is bound to the first element of the tuple, and ans is
bound to the second element of the tuple.
Functions with pattern matching are so common that SML
allows a shorthand, where the pattern match is written directly as
the function argument:
1 fun fact_tail (0,ans) = ans
2 | fact_tail (x,ans) = fact_tail(x-1, ans * x);
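To compute 5 factorial with this version, pass 1 as the initial accumulator:

```sml
fun fact_tail (0,ans) = ans
  | fact_tail (x,ans) = fact_tail(x-1, ans * x);

fact_tail (5,1);    (* val it = 120 : int *)
```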
Once you are used to the pattern matching syntax, you will
love it. It is incredibly readable, and easy to write.
33.2.4 let
One annoyance remains with our tail-recursive
implementation of factorial: what we wrote is really a “helper”
function. We want to compute the factorial of x, but need to pass
in (x,1). It would be very nice if we could package up the helper
inside a function that just takes an int, and hide the helper from
the rest of the program. Fortunately, we can do this with let, an
expression that allows us to make local declarations. For our
factorial function, we can write this:
1 fun fact x =
2 let fun help (0,ans) = ans
3 | help (x,ans) = help(x-1, x * ans)
4 in
5 help(x,1)
6 end
Here the body of the fact function is a let expression. The let
expression allows us to bind one or more names (functions with
fun or values with val), and then evaluate an expression in the
scope of those bindings. The scope of all of those name bindings
is limited to the let expression. Multiple bindings are possible,
each of which can see all preceding bindings in the same let:
1 fun complexFunction (x,y,z) =
2 let val a = f(x,y,32)
3 val b = g(z,12,false)
4 fun help1(0,ans) = ans
5 | help1(n,ans) = help1(n-1, if x mod b = 0
6 then ans + a
7 else ans * 2)
8 val c = help1 (y*z, b)
9 fun test n = x < a andalso y > z
10 in
11 if test b
12 then c + 1
13 else c - 1
14 end
The above (contrived) example assumes there are previously
defined functions f and g, which it uses to compute the values for
a and b respectively.
Earlier, we discussed that names can be re-bound, but not
changed. Now that we have introduced let, we can see a concrete
example of this in action:
1 fun silly x =
2 let val a = x
3 fun addA b = a + b
4 val a = x + 1
5 in
6 addA 3
7 end
While this is a rather contrived example, you can play with it at
the REPL to see that the addA function sees the original binding
to a (that is, the value of x), not the subsequent re-binding of a (to
x+1). Let also allows for single-case pattern matching with vals.
One use of this would be to pull apart tuples returned by functions
(or evaluated from other expressions).
1 fun f x = (x+1, x*x)
2
3 fun g y =
4 let val a = y * y + 3
5 val (b,c) = f a
6 in
7 b+c
8 end
Becoming familiar with let will likely help you acclimate to
SML much more quickly. It is very convenient for writing large,
complex functions, and lets you write in a slightly more familiar
“assign a value to a variable, then use it to assign a value to the
next variable” style. Just remember that once you bind the
variables, they never change.
33.2.5 Mutual Recursion
Mutually recursive functions are quite handy in a variety of
circumstances. In SML, they can be defined with the and keyword
(not to be confused with andalso). For mutually recursive
functions the functions must be defined consecutively (no other
definitions in between them, and the second and later functions in
a mutually recursive group use and instead of fun):
1 fun isEven 0 = true
2 | isEven x = isOdd (x-1)
3 and isOdd 0 = false
4 | isOdd x = isEven (x-1)
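For example, trying them at the REPL (restated here so the snippet stands alone):

```sml
fun isEven 0 = true
  | isEven x = isOdd (x-1)
and isOdd 0 = false
  | isOdd x = isEven (x-1);

isEven 4;   (* val it = true : bool *)
isOdd 4;    (* val it = false : bool *)
```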
33.3 Data Types and Data Structures
We have already seen one simple way to aggregate data: tuples.
Tuples let us combine multiple pieces of data together into one
piece of data, with an appropriate type: an int and a string are
paired together to form an int * string tuple. SML provides
other ways to aggregate data, including the ability to create your
own (potentially recursive) data types.
33.3.1 Records
Tuples are great for a few items (2–4), or maybe a few more if
they are just being used as function parameters/return values.
With very large numbers of fields, tuples can become
cumbersome, as it may become difficult to remember which item
is which.
SML also supports records with named fields. The syntax for
a record is {name=value, name=value, ...}. For example:
1 {x=3, y=17, z=4}
is an expression that creates a record with three fields (x,y, and z)
having values 3,17, and 4 (respectively). The type of this record is
1 {x:int, y:int, z:int}
Records can be pulled apart either with pattern matching, or
with field selection functions. Field selection functions are named
with a # and the field name, for example #x, #y, etc. For example:
1 let val r = {x=3, y=17, z=4}
2 in
3 #x r
4 end
Pattern matching with records can be done in a couple ways.
In one way, the field names are used as the variables to be bound
to:
1 fun f {x,y,z} = x + y * z
However, this does not work if one wishes to match on two
records with the same field name, or to specify constraints on the
field values. One can also write the pattern for each field after an
equals sign:
1 fun f {x=0, y=1, z} = z
2 | f {x=0, y , z} = y+z
3 | f {x , y , z} = x * (y+z)
4
5 fun g ({x=a,y=b}, {x=c,y=d}) = a*c + b *d
Sometimes for records with many fields, it becomes
cumbersome to write down all of the field names in a pattern
match, especially if you only care about a couple. SML has the
concept of flex records, where you are allowed to write … to
mean “there are more fields I am not naming”; however, SML
must still be able to figure out exactly what type the record is
when it runs its type inference algorithm. Typically this means
that either another case fully specifies the record fields, or the
programmer explicitly annotates the type (in at least one case).
For example, in the following function, the first two cases
leave out z and q with …, but the last case specifies all of the
record fields, allowing SML to infer the correct type:
1 fun f {x=0,y ,...} = y
2 | f {x ,y=0,...} = x
3 | f {x ,y ,z,q} = x*z + y*q
For the other approach (naming the record type) to be useful,
we need to introduce the fact that you can give types names:
1 type myRec = {x:string, y:int, z:bool, a:int, b:int}
2 fun f ({z=false,...}:myRec) = 0
3 | f {a,y, ...} = a + y
There are two things to note from the previous example. First,
SML does not care about the order of record fields, only their
names. a,b is the same as b,a. Second, parenthesis are required
around ({z=false,...}:myRec), to cause the type annotation to
bind to the record, and not the function result type. If you write
f (a:t) you are saying that a has type t. If you say f a:t
you are saying that f returns type t.
Tuples are actually a special case of records—they have
fields named 1, 2, 3… You can see this with the following code
fragment (which appears ill-typed, unless you know that tuples are
actually records):
1 let val {1=x,2=y} = (3,4)
2 in
3 x + y
4 end
Using this fact in the above ways is not terribly useful, and is
discouraged (as it is counterintuitive, and thus confusing).
However, there is one useful application of this fact: the functions
#1, #2, etc can be used to pull individual fields out of tuples when
desired. As mentioned earlier, pattern matching is much more
common and typically preferred; however, this is possible when
desired.
33.3.2 Lists
Tuples and records are great if you want to combine a fixed
number of elements. However, frequently, programmers need
dynamic data structures to accommodate varying sized storage.
The most prevalent data structure in SML is the list. List is a
“type constructor”—that is, “list” is not a type by itself, but given
another type it constructs a type. For example, “int list” or “string
list” are types. Since “int list” is a type, and “list” is a type
constructor, “int list list” is also a valid type—a list of lists of ints.
In SML, we can make lists in two ways. First, we can write
list literals with square brackets: [1,2,3] is a list of three ints (1,
2, and 3 in that order). An example of a function that makes a list
using this syntax would be:
1 fun f x =
2 let val a = x * 2
3 val b = x * 3
4 in
5 [0, x, a, b, x * 4]
6 end
This function has type int -> int list (that is, it takes an int,
and returns a list of ints). Think for a second about what it does,
and then try it out at the REPL and see if you understand it.
The other way to construct lists is with the :: operator
(which is read “cons”—the name comes from LISP and Scheme).
You can think of this as doing “add to front” (which we saw in
Section 21.1.1), but instead of modifying the existing list, it
makes a new list (although it does not entirely copy the list—the
new list will have a pointer into the old list to share as much
structure as possible, a trick that can be done because nothing can
ever be changed). We could re-write our previous example using
:: (in fact, this is what the compiler translates our list literal into):
1 fun f x =
2 let val a = x * 2
3 val b = x * 3
4 in
5 0::x::a::b::(x*4)::[]
6 end
SML allows for pattern matching on lists using either syntax.
For example, the following function computes the length of a list.
The first pattern match uses the [] syntax for the empty list, and
the second uses :: to match a non-empty list whose first item
we do not care about (remember that _ means “don’t care”) and
whose rest is bound to r.
1 fun len [] = 0
2 | len (_::r) = 1 + len r
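Trying len at the REPL:

```sml
fun len [] = 0
  | len (_::r) = 1 + len r;

len [10,20,30];    (* val it = 3 : int *)
len [];            (* val it = 0 : int *)
```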
You can also write things like the following function, which
have pattern matches for the empty list, a one item list (binding
the single item to a), a two item list (binding the two items to a
and b), and any longer list (binding the first to a and the rest to r).
1 fun f [] = 0
2 | f [a] = a+1
3 | f [a,b] = a * b
4 | f (a::r) = a + f r
33.3.3 Polymorphism: Flexibility in Your Types
Consider the following code to operate on a list:
1 fun insLst (x,[]) = [x]
2 | insLst (x,a::l) = a::x::l
This function takes a pair with a “thing” and a list of
“things”—but what type of thing? We could do
1 insLst (4,[1,2,3])
or
1 insLst ("x",["y","z","q"])
or
1 insLst ([1,2,3],[[4,5,6],[],[9,7,3]])
In the first, insLst has type int * int list -> int list, in
the second string * string list -> string list, in the third,
int list * int list list -> int list list. This seems
problematic since SML must assign insLst exactly one type
when it type checks the function, but it seems to have three
different types.
What actually happens here is that SML uses
parametric polymorphism—specifically it infers the type of
insLst to be ’a * ’a list -> ’a list. Here, ’a is a type variable—a
placeholder that can be instantiated with an actual type. This
polymorphic type can be instantiated differently at each use of
insLst—replacing ’a with any valid type (int, string, and int list
above). Type variables are somewhat like typename template
parameters (which should be quite familiar by now). However,
unlike C++’s templates, SML only allows parametric
polymorphism when a function truly works for any type.
The types of functions may have multiple type variables
(similarly to how we might have multiple template parameters),
for example:
1 fun f (a,b) = (b,a)
Here, f has type ’a * ’b -> ’b * ’a. It can be instantiated as
int * string -> string * int, int * int -> int * int,
int list * int -> int * int list etc. Note that when SML
infers the type of a function, the algorithm it uses is guaranteed to
find the most general type of that function.
Polymorphic functions and data-types are quite powerful, as we
will see shortly. SML provides one special case of polymorphism:
any type, but where equality comparisons are possible. Such types
are called “eqtypes” and are represented with two ticks instead of
one: ’’a. For example in the contains function:
1 fun contains (_,[]) = false
2 | contains (x,a::l) = x = a orelse contains(x,l)
It works on any type of list, as long as we can compare an element
of that list to something of the same type and see if they are equal.
This function has type ’’a * ’’a list -> bool (SML/NJ will
also warn you that you are “calling polyEqual,” which is not a big
deal, and you can safely ignore the warning).
This function will work on int * int list, since ints can
be compared for equality, but will not work on
(int -> int) * (int -> int) list, since functions cannot be
compared for equality.
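Trying contains at the REPL:

```sml
fun contains (_,[]) = false
  | contains (x,a::l) = x = a orelse contains(x,l);

contains (3, [1,2,3]);       (* val it = true : bool *)
contains ("q", ["a","b"]);   (* val it = false : bool *)
```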
33.3.4 Datatypes : Defining Your Own
SML allows the programmer to define her own datatypes, which
may be recursive (or mutually recursive) and/or polymorphic.
We have actually already seen one datatype that is built-in:
the list. Datatypes are defined in terms of their constructors
(which differ from C++’s constructors), for example:
1 datatype foo =
2 X of int
3 | Y of string * bool
4 | Z
Here, we define a new datatype (foo), with three constructors
(X, Y, and Z). X has type int -> foo, Y has type
string * bool -> foo and Z has type foo. foo is now a type just
like any other: we can have foo lists, foo * int pairs, etc.
This means that X 3, Y("hello",false), and Z are all
expressions of type foo. Just like any other type, we can pattern
match on foos, which is useful to pull them apart:
1 fun myFun (X x) = x
2 | myFun (Y(s,b)) = if b then size s else 42
3 | myFun Z = 0
When we define the datatype this way, SML knows not only
that these constructors are the ways to create a foo, it knows that
these are the only ways to create a foo. This lets SML determine if
the match is exhaustive and/or redundant.
We can define the datatypes to be recursive by using the
name of the newly defined type in the argument lists of one or
more of the constructors:
1 datatype file =
2 Directory of string * file list
3 | NormalFile of string * int * string
Just like functions, datatypes can be made mutually recursive
using and:
1 datatype a =
2 A of int * string
3 | B of bool * b
4 and b =
5 C of int * int list
6 | D of a list
Datatypes may be made polymorphic like this:
1 datatype ’a tree =
2 NODE of ’a * ’a tree * ’a tree
3 | LEAF
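As a sketch of how such a datatype is used (the numNodes function here is our own illustration, not from the text), pattern matching handles both constructors:

```sml
datatype 'a tree =
    NODE of 'a * 'a tree * 'a tree
  | LEAF

(* Count how many values a tree holds *)
fun numNodes LEAF = 0
  | numNodes (NODE(_, l, r)) = 1 + numNodes l + numNodes r

val t = NODE(5, NODE(3, LEAF, LEAF), LEAF);
(* numNodes t evaluates to 2 *)
```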
You can specify two ticks to require an eqtype:
1 datatype ’’a tree =
2 NODE of ’’a * ’’a tree * ’’a tree
3 | LEAF
Note that lists are basically defined as
1 datatype ’a list =
2 :: of ’a * ’a list
3 | nil
except that :: is made into an infix operator (there is also
special syntax for list literals that you cannot create for your own
data types).
Another really useful data type is the option type:
1 datatype ’a option =
2 SOME of ’a
3 | NONE
An int option, for example, lets you return SOME(an int) to
indicate a valid int answer, or NONE to indicate “no answer”.
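For example, here is a sketch of a “safe head” function (the name safeHd is our own, not from the Basis Library) that uses option instead of failing on an empty list:

```sml
(* Return SOME of the first element, or NONE for an empty list *)
fun safeHd []     = NONE
  | safeHd (x::_) = SOME x

(* safeHd [1,2,3] evaluates to SOME 1
   safeHd []      evaluates to NONE   *)

(* Callers must pattern match on the result: *)
fun describe lst =
  case safeHd lst of
      SOME x => "first element is " ^ Int.toString x
    | NONE   => "empty list"
```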
33.4 Interlude: Compiler Errors
One major difficulty that novice SML programmers often
face is understanding and correcting compiler errors. This
difficulty tends to arise from the fact that the compiler works
in many ways differently from typical compilers for other
languages like C or Java.
33.4.1 Parse Errors
One major difference between SML and compilers that you might
be used to is the way that SML handles syntax errors. SML’s
parser uses an error correction algorithm where it attempts to
insert, remove, or change lexical tokens to arrive at a
grammatically correct parse. This scheme results in SML
reporting parse error messages in terms of what it ended up
changing to get a correct parse. For example, suppose you write
this:
1 let val x = [1,2,3
2 in
3 x
4 end
SML will report the following error message:
temp.sml:2.2 Error: syntax error: inserting RBRACKET
This indicates that the best solution that the parser was able to
come up with was to insert a “RBRACKET” token (a right square
bracket) to get a correct parse. Note that this is the error message
you get if you load the file with use. If you type the same code in
at the REPL, you get a confusing error message:
stdIn:34.2-36.5 Error: syntax error: deleting IN ID
END
where the parser decides to drop in x end in the hopes of
finding more correct tokens later (e.g., if you continue typing ,4]). If
you are getting errors with your code, it is often better to save it in
a file and use that file, to get better error messages.
Another thing to understand about the error correction
algorithm is that the parser chooses a fix that matches the
grammar. There may be many options, and it has no way of trying
to pick the “right one,” even when only one makes sense for the
program. Consider the following program, which has a mismatch
between a left square bracket and a right parenthesis:
1 let val x = [1,2,3)
2 in
3 4::x
4 end
SML’s parser will report the following correction:
temp.sml:1.13 Error: syntax error: replacing LBRACKET
with LPAREN
That is, it decided to try replacing the left square bracket with a
left parenthesis:
1 let val x = (1,2,3)
2 in
3 4::x
4 end
This program is grammatically correct, but ill-typed. The parser is
not smart enough to realize that this is the wrong choice, and it
should replace the RPAREN with an RBRACKET.
When you first start having parse errors, the references to
specific token names may not be terribly informative. However,
as you write and compile more code, you will become
accustomed to them. Here are some common token names to get
you started out:
Token Name               Meaning
(any keyword)            (That keyword)
LPAREN/RPAREN            Left and right parentheses: ( )
LBRACKET/RBRACKET        Left and right square brackets: [ ]
LBRACE/RBRACE            Left and right curly braces: { }
ARROW                    An arrow, as in a function type: ->
DARROW                   A “double” arrow, as in a case: =>
ID                       An identifier: x, factorial, etc.
WILD                     The “wildcard” or “don’t care” pattern: _
VECTORSTART              The start of a vector literal: #[
(other punctuation name) Self-explanatory by name
EOF                      End of file
33.4.2 Type Errors
There are two aspects of SML’s type errors that are confusing to
new SML programmers. First, SML may infer the incorrect type
for a variable from an erroneous use, then report type errors on
the correct use. Consider this function:
1 fun f x =
2 let val y = "My input value is " ^ x ^ "\n"
3 val () = print y
4 val z = x * 3
5 in
6 z + (x div 4)
7 end
The programmer likely intended for x to be an int, and the
actual error is that x is concatenated with strings without a
conversion (i.e., Int.toString x). However, SML infers the type
of x as string, and then reports errors where use of a string is
not valid. In more complex code, it may help to explicitly
annotate a type, e.g.:
1 fun f (x:int) =
2 let val y = "My input value is " ^ x ^ "\n"
3 val () = print y
4 val z = x * 3
5 in
6 z + (x div 4)
7 end
to help find the actual error. This annotation forces SML to use
int as the type for x, causing it to report the error at its true source.
Note that f x : int annotates the return type of f as an int,
and says nothing about x. The other confusing aspect of type
errors is the terminology that SML uses. For example, the
following error:
temp.sml:3.6-3.14 Error: operator and operand don’t
agree [literal]
operator domain: string list
operand: int
in expression:
someFun x
sounds somewhat complex, but is really saying “I expected a list
of strings (string list), but have a thing that is an int—that is not
right”.
Here, “operator” refers to the function someFun, and the
operand is x. The operator domain is the type of argument that the
operator (someFun) expects (a list of strings), while the operand (x) has type int.
Another term that new SML programmers are unfamiliar
with is “tycon mismatch”—this stands for “type constructor
mismatch” and happens when SML’s type inference algorithm
needs a type made from two incompatible type constructors (that
is, one thing is used in incompatible ways, which would require a
type constructed from two different things). For example, this
code:
1 fun f x =
2 let val (a,b) = x
3 val q = a + b + 3
4 val [c,d,e,4] = x
5 in
6 q + c + d - e
7 end
produces the error:
temp.sml:4.6-4.19 Error: pattern and expression in val
dec don’t agree [tycon mismatch]
pattern: int list
expression: int * int
in declaration:
c :: d :: e :: <pat> :: <pat> = x
SML infers the type of x as int * int—a pair of ints—but
runs into problems when trying to reconcile that with pulling x
apart by a pattern match into a list of ints. Here, SML is
complaining that the type for x would have to be made from two
different (and incompatible) type constructors: on the one hand, it
needs to be made from the “*” constructor (int * int), and on the
other, from the “list” constructor (int list).
33.4.3 Other Errors
For help on other errors, see
http://www.smlnj.org/doc/errors.html
33.5 Higher Order Functions
SML allows functions to be treated just like any other value: they
can be passed as arguments to other functions, returned from
other functions, put in lists, trees, or other data structures. Pretty
much the only thing you cannot do with functions is test them for
equality (such a test is undecidable).
Why would we want to do such a thing? One thing we might
want to do is have a generic “max” function that allows us to pass
in different ways to compare things (especially so that it can take
different types):
1 fun max gt =
2 let fun lp curr [] = curr
3 | lp curr (a::l) = if gt(a,curr)
4 then lp a l
5 else lp curr l
6 in
7 lp
8 end
This max function exhibits both taking a function as a
parameter (gt) and returning another function (lp) as its return
value. If we want to do a “normal” max on int lists, we could do
val maxInt = max (op >). Note that op > is the syntax to take
an infix operator (>) and get the function value associated with it.
Here, maxInt has type int -> int list -> int. That means it is
a function that takes an int and returns an int list -> int
function. Suppose we know that all our int lists are non-empty
and contain positive values. We could do val f = maxInt 0 to
get an int list -> int function that will use the normal
“greater than” ordering on lists. We can use f on a bunch of
different int lists, and it will work fine.
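As a concrete check, here is a small REPL sketch using the max definition above (the values are illustrative, assuming the usual integer ordering):

```sml
val maxInt = max (op >)            (* int -> int list -> int *)
val m = maxInt 0 [3, 1, 4, 1, 5]   (* 5 *)
```

Because maxInt 0 is itself a function of type int list -> int, it can be bound to a name and reused on many different lists.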
We could also pass our max function different orderings on
ints, or even comparison functions for different types. We might
improve our max function slightly to use options, so that we don’t
have to pass the start value in from outside:
1 fun max gt [] = NONE
2 | max gt (a::l) =
3 let fun lp curr [] = curr
4 | lp curr (a::l) = if gt(a,curr)
5 then lp a l
6 else lp curr l
7 in
8 SOME(lp a l)
9 end
This function is not kludgy, since it handles the corner case
of the empty list correctly, no matter what the calling code
does (and by returning an option, it forces that code to deal with
the answer of “there was nothing there”).
The function has the type:
1 (’a * ’a -> bool) -> ’a list -> ’a option
That is, you give it a comparison function, and it gives you back a
function that takes a list and returns an option with SOME of the
largest element, or NONE for the empty list.
Observe that this function declaration’s syntax may make it
appear to take multiple arguments: we wrote max gt [].
However, the max function only takes one argument (of type
’a * ’a -> bool). That function then returns a function that
also takes one argument (of type ’a list). This style of function
is said to have curried arguments.
So now, val maxInt = max (op >) gives us a maxInt
function that has type int list -> int option. Note that we
can write anonymous functions with the keyword fn. Suppose we
want to examine a list in such a way that all even numbers are
treated as greater than all odd numbers. We could do
1 val weirdMaxInt =
2 max (fn (x,y) =>
3 if x mod 2 <> y mod 2
4 then x mod 2 = 0
5 else x > y)
Here we make an anonymous comparison function (we don’t
give it a name) that takes a pair of integers (x,y) and returns the
boolean we want.
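As a quick sanity check of this ordering (a sketch, assuming the option-returning max above):

```sml
val r = weirdMaxInt [3, 5, 8, 7]   (* SOME 8: 8 is even, so it beats 3, 5, and 7 *)
```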
Higher order functions and lists are so ubiquitous in SML
that it comes with three higher order functions to process lists that
you should learn to love: fold, map, and filter. fold comes in two
flavors foldl and foldr.
33.5.1 map
We’ll start with map, which has type
(’a -> ’b) -> ’a list -> ’b list. What this says is “you give
me a function that transforms ’a s into ’b s, and I will give you a
function that transforms lists of ’a s into lists of ’b s”.
For example, map (fn x => x - 1) is a function that takes a
list of ints and returns a list of ints where each value is one
smaller. map is basically written like this:
1 fun map f [] = []
2 | map f (a::l) = (f a)::(map f l)
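For instance, here is a small sketch using the built-in map (the values are illustrative):

```sml
val doubled = map (fn x => x * 2) [1, 2, 3]    (* [2, 4, 6] *)
val lens = map String.size ["a", "bc", ""]     (* [1, 2, 0] *)
```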
33.5.2 filter
filter has type (’a -> bool) -> ’a list -> ’a list. This
basically says “give me a function that says to keep something or
throw it away, and I’ll give you a function that filters the list
according to that rule”.
For example, filter (fn x => x mod 2 = 0) is a function
that filters lists keeping all even numbers and discarding all odd
numbers. filter basically looks like this:
1 fun filter f [] = []
2 | filter f (a::l) = if f a
3 then a::(filter f l)
4 else filter f l
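For instance, using the built-in version (accessible as List.filter, as noted later in this section):

```sml
val evens = List.filter (fn x => x mod 2 = 0) [1, 2, 3, 4]   (* [2, 4] *)
```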
33.5.3 fold
Both versions of fold (foldl and foldr) have type:
1 (’a * ’b -> ’b) -> ’b -> ’a list -> ’b
This type is the most complex, but these functions are also
the most powerful. Here, ’a is the type of things in the list, and ’b
is the type of the answer at each step. You pass fold a function
that takes an item from the list (a ’a) and the current answer-so-far
(a ’b) and the function returns the newest answer-so-far. You then
pass fold the starting answer-so-far. It returns a function that
performs this operation on a list and gives you your answers.
For example, here is a way to write the sum function to sum a
list of integers: val sum = foldl (op +) 0. The function applied
at each step is + (add: type int * int -> int), and we start with 0
as our answer-so-far. The resulting function will sum any list of
integers.
We can also re-write our earlier max function:
1 fun max gt [] = NONE
2 | max gt (a::l) =
3 let fun pick(a,b) = if gt(a,b)
4 then a
5 else b
6 in
7 SOME(foldl pick a l)
8 end
Note that once you are accustomed to these functions, using
them makes your code eminently readable: you don’t have to
slog through the basic list processing; you just look at “what are
you doing at each step” and “where are you starting”.
The two variants of fold (foldl vs foldr) pick the order in
which the list is processed. For commutative side-effect-free
operations like addition, either one works; however, for some
things, order matters.
foldl works left-to-right. That is, it applies the function to the
starting answer and the first element of the list, then takes the
resulting answer and uses it with the second element of the list,
and so on. foldl is basically:
1 fun foldl f ans [] = ans
2 | foldl f ans (a::l) = foldl f (f(a,ans)) l
foldr works right-to-left. It applies the function to the starting
answer and the last element of the list, then works backwards.
That is, foldr is basically
1 fun foldr f ans [] = ans
2 | foldr f ans (a::l) = f(a,foldr f ans l)
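The difference in processing order is easy to see by folding with :: (a small sketch; using cons as the step function makes the traversal order visible in the result):

```sml
val l = foldl (op ::) [] [1, 2, 3]   (* [3, 2, 1]: elements consed first-to-last *)
val r = foldr (op ::) [] [1, 2, 3]   (* [1, 2, 3]: elements consed last-to-first *)
```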
For those of you who like to think in terms of imperative
algorithms, you should think of fold as being a “for loop” over all
elements of the list. That is, if you want to write
1 currentState = init;
2 for ( it = list.begin(); it != list.end(); ++it ) {
3 currentState = f (*it, currentState);
4 }
5 return currentState;
You should just write foldl f init list. If you want to iterate
in reverse, use foldr f init list.
Note that foldl, foldr, and map are all available directly in the
top-level environment (you can just use their names and get the
right function). filter is not, but it is accessible as List.filter (more
on structures later).
33.6 Side-effects
With the exception of printing, the code we have seen so far has
been purely functional—nothing is ever modified, instead a new
different thing is created. Applying map to a list does not modify
the list—it makes a new list with the mapped elements.
33.6.1 Type constructors: ref + array
There are, however, two type constructors that create side-
effectable types: ref and array. The following code shows an
example of an int ref in action:
1 let val r = ref 0
2 val x = r
3 val () = r := 42
4 in
5 !x
6 end
Here, r and x both have type int ref. They both reference the
same int (much like two pointers to the same integer). r := 42
changes the value referenced by r to be 42. This is an expression
that has unit type (recall that unit is the name for the zero-element
tuple and is the type for things that produce no meaningful value
but are evaluated for their side effects). The body of the let uses the !
operator to dereference the ref x, getting the value it references,
which is 42. Arrays work somewhat similarly, and primarily use
the following functions:
1 Array.sub : ’a array * int -> ’a
2 Array.update : ’a array * int * ’a -> unit
See http://www.standardml.org/Basis/array.html for
more details on arrays if you need them.
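A minimal sketch of arrays in action (Array.array creates an array of a given length, with every element set to an initial value):

```sml
val a = Array.array (3, 0)        (* an int array holding [0, 0, 0] *)
val () = Array.update (a, 1, 42)  (* a now holds [0, 42, 0] *)
val x = Array.sub (a, 1)          (* 42 *)
```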
33.6.2 Applications
Note that we can do interesting things with refs and higher-order
functions:
1 fun makeCounter initVal =
2 let val r = ref initVal
3 in
4 fn () => let val ans = !r
5 val () = r := ans + 1
6 in
7 ans
8 end
9 end
This function has type int -> unit -> int. You give it an
initial value, and it returns a function that will count up every time
you apply it to unit. The code below will result in a being 3, b
being 4, and c being 5.
1 val f = makeCounter 3
2 val a = f ()
3 val b = f ()
4 val c = f ()
We will note that you can emulate objects (in the OOP style)
with refs and records:
1 type pt = { getX : unit -> int,
2 getY : unit -> int,
3 setX : int -> unit,
4 setY : int -> unit}
5 fun makePoint (x,y) =
6 let val xr = ref x
7 val yr = ref y
8 in
9 {getX = fn () => !xr,
10 getY = fn () => !yr,
11 setX = fn x’ => xr := x’,
12 setY = fn y’ => yr := y’}
13 end
Here, makePoint acts as a constructor (in the OOP sense) for
“objects” of type pt, which encapsulate the data and the functions
that act on them. We actually hide the data here, and nobody can
directly access the refs.
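Using these “objects” looks like calling methods through record selectors (a sketch, assuming the makePoint definition above):

```sml
val p = makePoint (1, 2)
val () = #setX p 10    (* select the setX "method" and apply it *)
val x = #getX p ()     (* 10 *)
val y = #getY p ()     (* 2 *)
```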
33.6.3 The Value Restriction
One caveat of side-effects is that, if done naively, they make
polymorphic types unsafe (see
http://mlton.org/ValueRestriction for an example). The
important consequence of this is that ML has a rule called the
“value restriction.” This may come up sometimes when you use
polymorphic functions, especially partial applications of fold (that
is, you give fold a few arguments to get the function you want,
which is then “waiting” for the rest of its arguments).
For example, consider the following use of foldl to reverse a list:
1 val reverse = foldl (op ::) []
Evaluating this results in
stdIn:109.5-109.31 Warning: type vars not generalized
because of
value restriction are instantiated to dummy types
(X1,X2,...)
val reverse = fn : ?.X1 list -> ?.X1 list
Attempting to apply this reverse function to any list will result in a
type error: to be safe (to ensure nobody did anything bad with refs),
SML had to make it non-polymorphic, and gave it a monomorphic
type built on “dummy types” (types that are not meaningful). This
makes the reverse function, as written here, pretty useless, since we
cannot apply it to much.
However, all is not lost. If we wanted a monomorphic
function, we could simply specify the type:
1 val reverse: int list -> int list = foldl (op ::)
but this is not the best option—we can actually get a
polymorphic function this way, we just have to write it a bit
differently:
1 fun reverse x = foldl (op ::) [] x
This seems almost exactly the same as (and works the same as)
the original (potentially problematic) solution. The differences
are subtle, and why this version is guaranteed safe is too complex
to go into here. The important thing to know is what to do if you
get that sort of error message.
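With the fun-based definition, reverse generalizes as expected (a quick sketch):

```sml
fun reverse x = foldl (op ::) [] x
val r1 = reverse [1, 2, 3]        (* [3, 2, 1] *)
val r2 = reverse ["a", "b", "c"]  (* ["c", "b", "a"] *)
```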
33.7 The Module System
So far, our SML discussion has covered programming in the small
—writing individual functions, or a few functions to do relatively
minor tasks. SML has a rich module system to support
programming in the large. There are three pieces to this module
system—structures, signatures, and functors—which we will
discuss over the next three sections.
33.7.1 Structures
An SML structure forms a module, grouping together related
types, constants, and functions into a coherent package. Consider
the following example of the MyBST structure, which defines a
datatype and two operations (add and find) for binary search
trees:
1 structure MyBST =
2 struct
3 datatype ’a bst =
4 NODE of int * ’a * ’a bst * ’a bst
5 | NIL
6 fun add (k,v,NIL) = NODE(k,v,NIL,NIL)
7 | add (k,v,NODE (k’,v’,l,r)) = if (k < k’)
8                                then NODE(k’,v’,add(k,v,l),r)
9                                else NODE(k’,v’,l,add(k,v,r))
10 fun find (k,NIL) = NONE
11 | find (k,NODE(k’,v,l,r)) = case Int.compare (k,k’) of
12                               EQUAL => SOME v
13                             | LESS => find(k,l)
14                             | GREATER => find(k,r)
15 end
Most of the code in this example should be familiar from
what we have seen so far: a polymorphic datatype declaration,
two functions written with pattern matches, an if expression, a
case expression, and a few function calls. There are, however, two
new things. The first is the declaration of the structure itself.
structure MyBST = tells SML that we are declaring a structure
(called MyBST). After this we can either follow it with a new
structure definition (struct .... end) or with the name of an
existing structure.
The other new construct that we see here is the use of the
Int.compare function. This shows how to use something found in
another structure. Here, Int is the name of an existing structure (it
is built into SML, and provides a variety of functions to do things
to integers). The dot gives us access to something inside the Int
structure, namely the compare function. This feature is useful, as
there may be many compare functions in a large program. In fact,
just in the built-in SML library, there are several compare
functions, for example: String.compare, Int.compare,
Real.compare.
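For instance, at the REPL (a small sketch of two of the built-in compare functions):

```sml
val c1 = Int.compare (3, 5)         (* LESS *)
val c2 = String.compare ("b", "a")  (* GREATER *)
```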
As we mentioned above, you can declare a structure to be
equal to an existing structure. This is particularly useful to give a
structure a shorter name if you frequently reference it. For
example, if you are writing another structure that uses MyBST
frequently, you might say structure B=MyBST at the start of that
module. You can then say B.find and B.add to save typing.
You can also open a module—e.g., open MyBST, which lets
you reference the contents of the module without prefixing it with
the name (i.e., just find instead of MyBST.find). While this also
saves typing, it can lead to less readable code: if you open many
structures, someone reading the code may have to search
through all of them to find where a function is declared.
So far, SML’s structures sound a bit like C++’s namespaces
(in that they give us nameable scopes). In that regard, they are in
fact similar; however, as we shall see shortly, the rest of the
module system (signatures and functors) gives them powerful
features not available in namespaces.
33.7.2 Signatures
Our binary search tree example from the previous section is a
specific case of a more general construct: the Map abstract data
type (where we look up values given a key). We could also
envision other implementations of the Map abstract data type:
lists, balanced BSTs, hash tables, etc. Signatures provide a way
to define an interface without defining an implementation. They
force structures to conform to that interface, and force users of
that structure to use only those types and values (including
functions) specified in the interface. This is particularly useful for
code maintainability, as you are always guaranteed that you can
swap out one implementation of a structure for another
conforming to the same signature with no other code changes.
Consider the following example signature:
1 signature MyMap =
2 sig
3 type ’a map
4 val empty: ’a map
5 val add: int * ’a * ’a map -> ’a map
6 val find: int * ’a map -> ’a option
7 end
This signature specifies that a structure conforming to the MyMap
signature must contain one type and three values (two of which
are functions because they have an arrow type). However, it says
nothing about how these functions are implemented. With a few
slight modifications, we can specify that our MyBST structure
conforms to the MyMap signature (called ascribing the signature
to the structure):
1 structure MyBST : MyMap =
2 struct
3 datatype ’a map =
4 NODE of int * ’a * ’a map * ’a map
5 | NIL
6 val empty = NIL
7
8 fun add (k,v,NIL) = NODE(k,v,NIL,NIL)
9 | add (k,v,NODE (k’,v’,l,r)) = if (k < k’)
10                               then NODE(k’,v’,add(k,v,l),r)
11                               else NODE(k’,v’,l,add(k,v,r))
12 fun find (k,NIL) = NONE
13 | find (k,NODE(k’,v,l,r)) = case Int.compare(k,k’) of
14                               EQUAL => SOME v
15                             | LESS => find(k,l)
16                             | GREATER => find(k,r)
17 end
17 end
Note that we have made three changes here:
1. The declaration of MyBST now has : MyMap, saying that
it conforms to the MyMap signature
2. We have changed the type name from bst to map, since
the signature requires a type called map (we could also
keep the datatype declaration the same, and then add this
declaration type ’a map = ’a bst).
3. We have added the declaration val empty = NIL so
that we have the empty value required by the signature.
The compiler will check that the MyBST structure has all the
names required by the signature, and that they all have the proper
types. The structure may define additional names (types or
values), but these effectively become private when the signature is
ascribed.
There are actually two types of ascription. The one we just
saw (using the : syntax) is called transparent. The other type
(using :> instead of :) is called opaque. The difference between
the two is whether or not the actual types used in the structure are
externally visible. Both types of ascription hide all names that are
not declared in the signature.
The difference between the two types of ascription is hard
to see with the MyBST example, so we will consider the
following different implementation of the MyMap signature:
1 structure MyLstMap : MyMap =
2 struct
3 type ’a map = (int * ’a) list
4 val empty = []
5 fun add(k,v,lst) = (k,v)::lst
6 fun find(k,[]) = NONE
7 | find(k,(k’,v)::l) = if (k=k’)
8 then SOME v
9 else find (k,l)
10 end
This implementation uses lists. With transparent ascription,
however, our information hiding is incomplete: the fact that ’a
map is an (int * ’a) list is externally visible. This fact allows an
external module to write something that “cheats” on the interface
to the module like this:
1 (1,2)::MyLstMap.empty : int MyLstMap.map
If we do such a thing, then we cannot seamlessly swap out one
map implementation for another. If we use opaque ascription
(shown below), then the fact that ’a map is a list is not externally
visible, and the information is completely hidden.
1 structure MyLstMap :> MyMap =
2 struct
3 type ’a map = (int * ’a) list
4 val empty = []
5 fun add(k,v,lst) = (k,v)::lst
6 fun find(k,[]) = NONE
7 | find(k,(k’,v)::l) = if (k=k’)
8 then SOME v
9 else find (k,l)
10 end
It is generally best to use opaque ascription, though there are
times when transparent is appropriate.
33.7.3 Built-in Structures
SML comes with a wide set of built-in structures called the “Basis
Library.” You can find documentation on them here:
http://www.standardml.org/Basis/overview.html
Ones of particular note are List (various useful things to do
on/with lists), ListPair (things to do to/with pairs of lists), Int
(functions to work with ints), String (for string
operations/manipulations), Array (as mentioned above: mutable
arrays), and Vector (which are similar to arrays, but are immutable
—you cannot change their contents once you create them; unlike
C++’s vectors, you cannot resize them). You can also find things
like TextIO (reading/writing text to/from files), OS (and its sub-
structures, for a wide variety of system calls), and many more
things.
33.7.4 Using the Compilation Manager
SML/NJ comes with a utility that is similar to make, but
customized for SML. This utility is called the “Compilation
Manager,” or CM. When working on any program of significant
size, use of the compilation manager is highly recommended.
Using the compilation manager most efficiently goes hand-
in-hand with organizing your source code properly. Place each
structure definition in its own .sml file (do the same for each
functor definition—discussed in the next section). Place each
signature definition in its own .sig file. Then write a .cm file for
your program. This .cm file should start with the line Group is,
followed by a blank line. It should then list all of the source files
(one per line). Order does not matter. It should also list
$/basis.cm to include the basis library. Frequently, you will also
want $/smlnj-lib.cm for the utility library (which contains a lot
of algorithms and data structures). An example .cm file might be
Group is
$/basis.cm
$/smlnj-lib.cm
foo.sig
foo.sml
bar.sig
bar.sml
main.sml
Once you have your .cm file created, you can compile your
program, and (if compilation was successful) load the name
bindings it introduces into your current REPL with CM.make
"filename.cm" where filename.cm is replaced with whatever
you named the file. If the compilation was successful the last lines
of output (before the new REPL prompt) should be
[New bindings added.]
val it = true : bool
If CM.make returns false, it indicates that a compilation error
occurred, and you should be able to find that error in the output. If
you are using ml-lex (a lexical analysis tool for SML), you should
be able to just list .lex files in your .cm file. CM will know to run
ml-lex on these to get SML source, and then compile the resulting
SML file. If you are using ml-yacc (a parser generator for SML),
you can list .grm files in your .cm file, but you also need to list $/ml-
yacc-lib.cm to include the ml-yacc library. CM will know to run
ml-yacc on the .grm file to generate SML, and compile the
resulting file.
33.7.5 Functors
Our previous Map design is a good step in the right direction: we
make good use of abstraction (signatures provide an interface that
hides the implementation), and our Map is polymorphic in the
values it stores. However, our Map has one major downside: the
keys must be ints.
We can fix this, but we need some way to parameterize the
data structure over the comparison function. This sounds simple
enough, as we have already learned that we can pass functions to
other functions. A straw-man approach would be to use this
interface:
1 signature MyMap =
2 sig
3 type (’a,’b) map
4 val empty: (’a,’b) map
5 val add: (’b * ’b -> order) -> ’b * ’a * (’a,’b) map -> (’a,’b) map
6 val find: (’b * ’b -> order) -> ’b * (’a,’b) map -> ’a option
7 end
Note that this signature makes use of the built-in type order,
which is defined as:
1 datatype order =
2 GREATER
3 | EQUAL
4 | LESS
This interface works, but exhibits a design flaw that can
come back to bite the programmer. A programming mistake could
cause the programmer to add with one ordering function and find
with another, causing mysterious errors where something should
be in the map, but is not found. A better option might be:
1 signature MyMap =
2 sig
3 type (’a,’b) map
4 val empty: (’b * ’b -> order) -> (’a,’b) map
5 val add: ’b * ’a * (’a,’b) map -> (’a,’b) map
6 val find: ’b * (’a,’b) map -> ’a option
7 end
In this interface, we provide the ordering function when an
empty map is created, and encapsulate that ordering function in
the map data structure (so the map type might be a pair of a
function and a tree). This approach solves the main problem of
the straw-man interface: a programmer can no longer erroneously
try to find with a different function than was used to add. However,
we can still do better. Suppose we wanted to add a merge function
to our interface:
1 val merge: (’a,’b) map * (’a,’b) map -> (’a,’b) map
When we add this function, we would like implementations to be
able to support their inherent orderings for an efficient merge
operation where possible. In the setup we have, this might result
in something like
1 fun merge (m1, m2) =
2 if (sameOrdering(m1,m2))
3 then fastMerge (m1,m2)
4 else slowMerge (m1,m2)
However, we have a problem for almost any practical
implementation: functions cannot be compared for equality, so we
cannot check if the orderings are the same! One can imagine a
variety of kludges, but functors provide an elegant solution: we
parameterize the structure as a whole over another structure
(containing types and functions). Instantiating the functor with a
particular structure results in a structure that the type system
recognizes as distinct from other instantiations of the same
functor. To see this in action, let us first look at the built-in
signature ORD_KEY:
1 signature ORD_KEY=
2 sig
3 type ord_key
4 val compare : (ord_key * ord_key) -> order
5 end
This does not seem terribly exciting—it is just a type and a
comparison function over that type—however, this is exactly
what we need to parameterize our MyBST Map implementation
over. To do this, let us first revisit the MyMap signature:
1 signature MyMap =
2 sig
3 type key
4 type ’a map
5 val empty: ’a map
6 val add: key * ’a * ’a map -> ’a map
7 val find: key * ’a map -> ’a option
8 end
Now our signature says that maps will define two types: (1)
the type of key they use, and (2) the type for a polymorphic map
(which has the key fixed, but can be instantiated for any data
type). The signature then provides values/functions similar to the
original—it just uses key as the type of its key rather than int.
Now, we can make a generic BST map functor, which is
parameterized over a structure (K) conforming to the ORD_KEY
signature. The resulting structure (which is a BST-based map for a
particular key type) will conform to MyMap:
1 functor MyBST (K: ORD_KEY) : MyMap =
2 struct
3 type key = K.ord_key
4 datatype ’a map =
5 NODE of key * ’a * ’a map * ’a map
6 | NIL
7 val empty = NIL
8 fun add (k,v,NIL) = NODE(k,v,NIL, NIL)
9 | add (k,v,NODE (k’,v’,l,r)) = case K.compare(k,k’) of
10                                 EQUAL => NODE(k,v,l,r)
11                               | LESS => NODE(k’,v’,add(k,v,l),r)
12                               | GREATER => NODE(k’,v’,l,add(k,v,r))
13 fun find (k,NIL) = NONE
14 | find (k,NODE(k’,v,l,r)) = case K.compare(k,k’) of
15                               EQUAL => SOME v
16                             | LESS => find(k,l)
17                             | GREATER => find(k,r)
18 end
Once we have this, we can do things like
1 structure IntBST = MyBST(struct type ord_key = int
2                                 val compare = Int.compare
3                          end)
4 structure StrBST = MyBST(struct type ord_key = string
5                                 val compare = String.compare
6                          end)
A few things to note about this example: (1) IntBST.map and
StrBST.map are two distinct type constructors. The type
bool IntBST.map is completely different from bool StrBST.map.
In fact, if we do this:
1 structure IntBST = MyBST(struct type ord_key = int
2                                 val compare = Int.compare
3                          end)
4 structure IntBST2 = MyBST(struct type ord_key = int
5                                  val compare = Int.compare
6                           end)
7 structure IntBST3 = IntBST
The type constructors IntBST.map and IntBST2.map are distinct.
Note that IntBST3.map is the same as IntBST.map, because
IntBST3 is exactly the same structure as IntBST.
If you were paying very careful attention, you will notice that
I used transparent ascription of the MyMap signature rather than
opaque (even though I said opaque is generally better).
1 functor MyBST (K: ORD_KEY) : MyMap =
If we change this to opaque ascription:
1 functor MyBST (K: ORD_KEY) :> MyMap =
but leave everything else the same, then this works great until we
actually try to use it:
- IntBST.add(1,"hello",IntBST.empty);
stdIn:42.1-42.35 Error: operator and operand don’t
agree [literal]
operator domain: IntBST.key * ’Z * ’Z IntBST.map
operand: int * string * ’Y IntBST.map
in expression:
IntBST.add (1,"hello",IntBST.empty)
The problem here is that opaque ascription has not only hidden
the information we want (the inner workings of our tree data
type), but has also hidden information we need: that key is int!
More generally, what we want our functor to reveal (as part
of its interface) is that its key type will always be the same as K’s
ord_key type, and that it is OK for the outside world to know this.
We accomplish this with the where keyword:
1 functor MyBST (K: ORD_KEY) :> MyMap where type key = K.ord_key =
2 struct
3 ...
4 end
With this definition of the MyBST functor (as well as the
definition of the IntBST structure above), we get an excellent
design for our data structure: the BST implementation of the map
can be instantiated for any type we can give an ordering function
for. The details of the implementation are neatly hidden behind
the abstraction boundary enforced by opaque ascription, and we
can still write any other MyMap implementation we want. (Note
that these do not have to be functors at all: we can make our
original int list-based map work by just adding type key = int to
it.)
There are a few more things you should know about writing
functors. First, if you want a functor to be parameterized over
multiple structures (that is, “take multiple arguments”) you can,
but you do it in a way that looks like curried arguments:
1 functor F (X: Sig1) (Y: Sig2) =
2 struct
3 ...
4 end
Second, you can use where in the functor’s argument
declaration to constrain the signature a bit more. For example, in
the practice exercises you will do shortly, you will want to have a
functor parameterized over a structure conforming to the ORD_MAP
signature, but will want to constrain the key type to strings:
1 functor F(M: ORD_MAP where type Key.ord_key = string) =
Note that Key is a sub-structure of ORD_MAP, so you are requiring
that M.Key.ord_key be a string.
33.7.6 Built-in Maps and Sets
SML’s utility library provides ORD_MAP and ORD_SET signatures, as
well as some efficient implementations of them. You can read
about the interfaces here: http://www.smlnj.org/doc/smlnj-
lib/Manual/ord-map.html#ORD_MAP:SIG:SPEC and here:
http://www.smlnj.org/doc/smlnj-lib/Manual/ord-
set.html#ORD_SET:SIG:SPEC
The utility library provides three implementations of each of these:
BinaryMapFn/BinarySetFn, SplayMapFn/SplaySetFn, and
ListMapFn/ListSetFn.
33.8 Common Mistakes
Here are a few common mistakes for new SML programmers:
1. and vs andalso: and is for mutually recursive declarations.
andalso is for the boolean and operation.
2. if then else is an expression, not a statement. The else is
required, and both the then and the else must have the
same type. If you want to do something for side-effect in
one case, but not do something in the other case, use () as
the body of the else.
3. dot is for accessing things within structures—these are
not objects, they are modules. These modules separate
namespaces (that is, String.compare and Int.compare are
two different compares), but there is no encapsulation
here. You do not do value.function.
4. fun f a:t vs fun f (a:t). The first of these says that f
returns type t. The second says that the argument to f is of
type t.
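A short sketch of this last point (both are legal SML, but mean different things):

```sml
fun g x : int = x + 1      (* the ": int" annotates g's return type *)
fun h (x : int) = x + 1    (* the "(x : int)" annotates the argument *)
```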
33.9 Practice Exercises
Selected questions have links to answers in the back of the book.
• Question 33.1 : Write the min function, which has type
int * int -> int and returns the smaller of its two
arguments. Test your function at the REPL.
• Question 33.2 : Write the function sumTo, which has type
int -> int and returns the sum of all integers between 0
and its argument (inclusive).
• Question 33.3 : Write the fib function, which has type
int -> int and computes the nth Fibonacci number. Test
your function at the REPL.
• Question 33.4 : Write the isPrime function which has
type int -> bool and determines whether or not its
argument is prime. Test your function at the REPL.
• Question 33.5 : Write the function sumList which has
type int list -> int and computes the sum of all of the
items in the list. Test your function at the REPL.
• Question 33.6 : Write the function squareList which has
type int list -> int list and returns a list of the same
length as its argument, where each item in the returned list
is the square of the corresponding item in the argument
list. For example, squareList [2,5,1,3] should return
[4,25,1,9]. Test your function at the REPL.
• Question 33.7 : Using the following datatype declaration:
datatype expr =
    NUM of int
  | PLUS of expr * expr
  | MINUS of expr * expr
  | TIMES of expr * expr
  | DIV of expr * expr
  | F of expr list * (int list -> int)
Write the function eval: expr -> int which
evaluates an expression to a value. NUM(x) simply
evaluates to x. PLUS, MINUS, TIMES, and DIV should
recursively evaluate their sub-expressions, then perform
the appropriate math operations. F exprs should
recursively evaluate all the exprs in their expr list, then
apply their function to the resulting integers. (Hint: You
should use map in your F case.)
• Question 33.8 : Use fold to write the
flatten: 'a list list -> 'a list function. The flatten
function merges together a list of lists into a single list
with all of the elements in their original order. For
example: flatten [[1,2,3], [4], [5,6], [], [7]]
should result in the list [1,2,3,4,5,6,7].
• Question 33.9 : Use fold to implement your own version
of map.
• Question 33.10 : Use fold to implement your own version
of filter.
• Question 33.11 : Use fold to write the function
count: ('a -> bool) -> 'a list -> int which returns
a count of how many items in the list the 'a -> bool
function returned true for.
• Question 33.12 : Use fold to write
mapPartial: ('a -> 'b option) -> 'a list -> 'b list. This function is
like a combination of map and filter. For each item of the
input list, if the function applied to that item returns
SOME(b), then b appears in the output list. If NONE is
returned, that item is dropped from the output list.
• Question 33.13 : For this exercise, you will be filling in
the contents of the following functor:
functor F(M: ORD_MAP where type Key.ord_key = string)
         (S: ORD_SET where type Key.ord_key = string) :
sig
  val proc: string list -> S.set M.map
end
=
struct
  ...
end
The proc function takes a list of strings (which are
file names), and builds a map of sets. The map maps
strings (i.e., words) to the set of files names in which they
appear. Words are separated by spaces (or newlines). For
example, if you were passed ["a.txt","b.txt"] and a.txt
had these contents:
Hello World
test
and b.txt had these contents:
a test
input
Your function should return a map that maps “Hello” and
“World” to the set {a.txt}, “test” to the set {a.txt,
b.txt}, and “a” and “input” to the set {b.txt}.
Checking out the String and TextIO structures is highly
recommended for implementing this function. You should
then instantiate your functor on at least one structure
instantiated from one of the built-in ORD_MAP and ORD_SET
functors and test it out.
Chapter 34
… And Beyond
At this point, we have reached the end of our technical content
—if you have mastered all of the material in this book, you
should be a competent programmer. You should be able to take
a programming problem (that you have the domain knowledge
for), devise an appropriate algorithm, and implement it in C
or C++ (and you have had an intro to three other
languages). You should be able
to test your code with a reasonably good set of test cases, and
debug your code when it goes wrong.
However, if you are aspiring to be a professional computer
scientist or engineer, you should understand that this
knowledge does not represent the end of your journey, but
rather the start. Programming skills form the basic foundation
for our craft—and you will continue to hone those skills with
experience and practice—but there is much other knowledge
required to be a top-quality professional in the field. The best
computer scientists/engineers have a broad background across
the wide range of the discipline, and a deep specialization in a
particular area of expertise. Here, we give a quick discussion of
topics that you might explore for breadth or depth:
Theory Every professional programmer should have a
solid command of formal logic (propositions,
proofs, induction, …), discrete mathematics
(probability, combinatorics, graph theory, …),
automata theory (DFAs, NFAs, RegExps, PDAs,
CFGs—and correspondingly, parsing—and Turing
Machines—including undecidability), and
algorithms (complexity classes: P, NP, PSPACE,
…; proving time and space bounds of an
algorithm; proving correctness properties of
algorithms; familiarity with common/well-known
algorithms; …). Advanced topics may include
deeper exploration of these and related topics, as
well as things like programming language/type
theory.
Hardware
The code you write will run on hardware. If
you want that code to be fast, you need to
understand how the hardware executes it. This
knowledge spans topics such as assembly language
(the individual instructions your program gets
compiled into), the memory hierarchy (how/where
data is stored; some levels are faster or slower
than others—data arrangement can have a big
impact on speed), how the processor executes the
instructions, and how data moves from core to core in
a multi-core system. How much you learn here
depends on how much you want to be able to make
your code fast. Of course, being a hardware
designer/developer is an exciting career path in its
own right, but requires even deeper knowledge of
these areas.
Low-level software Most software interacts with two
crucial pieces of software: the operating system,
and the compiler. Understanding them at a basic
level is good breadth for any programmer. A
deeper understanding of these topics is crucial if
you want to work directly on one of these
programs (i.e., work on gcc or the Linux kernel).
Networking With the prevalence of networking—the
Internet, mobile devices, etc…—every
professional programmer should have knowledge
of at least the basics of networking (the 7-layer
model, socket programming, TCP/IP, …). Of
course, deeper specialization is possible in this
field in a variety of ways.
Databases
Databases provide efficient and reliable
storage of large data sets, and are commonly used
in a wide variety of settings—especially
“enterprise web” applications. All programmers
should have a basic understanding of SQL, and be
able to write simple queries in it, as well as
understanding the ACID principle.
Security If you plan to write programs that potentially
interact with untrusted users (e.g., anything on the
Internet), a decent understanding of security is
required to avoid disaster. Security compromises
due to security-related incompetence are common,
and costly. You should understand topics such as
buffer overflows, input sanitization to prevent
command injection (e.g., SQL command injection
or shell command injection), and some basics of
cryptography—most importantly, what
cryptographic algorithms exist to provide various
functionality and that you should NEVER try to
make up your own.1 If you store passwords, you
should understand how and why to salt and hash
them properly (as we discussed in Section 23.3.3)
—as well as what hash algorithms are currently
considered secure, and which are not. If you intend
to write code that performs secure communication,
you should understand topics like key exchange
protocols and man-in-the-middle attacks. You
should also know what methods provide secure
sources of random numbers.
Software Engineering We have discussed the core topics
of programming here, but it takes more to design,
test, and maintain large software systems. On the
design aspect, a basic software engineering course
should cover topics such as UML diagrams, design
patterns, and the open/closed principle. We
covered the basics of testing, but a software
engineering course will dig in deeper, including a
more complete treatment of testing in larger
systems (e.g., unit testing and integration testing).
This field also examines maintaining software—
changing the code as the requirements evolve and
bugs are fixed. This topic is crucial as code in the
“real world” has a long life cycle with significant
changes—design choices affect how difficult these
changes are. Other topics may include examination
of different ways to work in teams (pair
programming, Agile development, etc…) and often
a broader and deeper treatment of programming-
related tools than we have presented here.
Concurrency
We barely scratched the surface of
concurrency in Chapter 28. As hardware
increasingly focuses on multi-core designs for
performance, concurrent programming is
becoming increasingly important whenever speed
matters. Concurrent programming—especially
with high performance—is typically considered a
rather difficult task (and thus often commands high
salaries for those well skilled in it!). Deep
specialization in this area requires a strong
understanding of hardware (after all, there really is
not much point to writing parallel code if you are
not aiming for speed—and understanding the
hardware is critical for performance), as well as
being an expert programmer in general (if you
cannot write sequential programs correctly, you
have no hope of writing concurrent programs
correctly).
Applications There are a variety of applications of
programming that can also be useful to specialize
in (or just have basic breadth in)— graphics,
machine learning, AI, computational sciences, ….
If one of them excites you, dive into it!
Part V Appendices
A Why Expert Tools
B UNIX Basics
C Editing: Emacs
D Other Important Tools
E Miscellaneous C and C++ Topics
F Compiler Errors Explained
G Answers to Selected Exercises
Index
Appendix A
Why Expert Tools
Programming is all about making something. Whenever you
make something, you do well to invest time and effort into
learning the tools that help you with that task. Over the years,
programmers have developed a wide variety of tools that
support development efforts in various ways. If you want to
become a serious programmer, mastering these tools is crucial.
New programmers often wonder why they should invest
the effort into learning these sorts of tools. It is possible to
program with more familiar-seeming environments, which
require less up-front effort to learn the basics. For example, you
could write and compile your programs in a graphical
environment called an “IDE” (which stands for integrated
development environment), which will have a more familiar
“buttons and menus” interface. Choosing tools like these, which
are designed for the ease of novices, represents a short-term
benefit and a long-term loss. If you are studying programming
casually (e.g., just taking one or two required courses), then the
time investment to learn these tools is not likely to be
worthwhile. However, if your goal is to become a professional
programmer, you will want to become well-versed in the tools
of your trade.
Figure A.1 shows the long-term tradeoff of using a tool
designed for novices versus using a tool designed for experts.
The x-axis of this graph represents time spent learning and
using the tool.
The y-axis
represents proficiency
(what you can do) with
the tool. The red line
shows the progression
of proficiency with a
tool designed for a
novice. At time 0— Figure A.1: Tools for novices vs tools for experts
when you first start
using the tool—you have a basic proficiency with it. This basic
proficiency stems from the fact that the tool is set up to be easy
for novices—it is “user friendly.” As you spend more time with
the tool, you learn more features and tricks, but your
proficiency quickly plateaus as you reach the limits of your
tool.
The blue line shows the progression with a tool designed
for expert use. At time 0, the tool is difficult to use. It does not
fit with the paradigms you are used to. As you spend time and
effort learning the tool, your proficiency increases. At some
point, your rate of learning increases too as you become
familiar with the terminology and paradigms of the tool—you
know what to look for, what to ask about, and where to look
when you do not know something. As you continue to learn,
your proficiency progresses past the plateau you could achieve
in the tool designed for novices. You may eventually plateau,
but when you do, that plateau will be much higher with the tool
designed for experts. For some tools, you may never plateau—
one author has used emacs for 15 years and still learns new
things regularly.
This tradeoff is not unique to programming. Professional
photographers, for example, use equipment that gives them full
control over their art. They make decisions about shutter speed,
aperture, and light sensitivity every few minutes. This level of
control is likely overkill for taking casual photos. And a novice
user might find that their first dozen photos taken on
professional equipment are blurry, over-exposed, or that they
missed the shot entirely while fiddling with camera settings. At
the same time, few professional photographers would give up
their professional equipment and replace it with a “point-and-
click” camera for any serious artistic undertaking.
Another reason to invest time and effort into learning
programming tools (if you want to be a professional
programmer) is the perception associated with your tool
choices. Using the tools of an expert programmer (especially if
you use them well) sets up the perception that you are an
expert programmer. Several students have reported that when
interviewing for jobs, the fact they used the tools described in
this appendix was important in interviews. Think of the
photography analogy: would you hire a photographer who only
knows how to use cheap disposable cameras?
Appendix B
UNIX Basics
UNIX is a multitasking, multiuser operating system that is well suited
to programming and programming-related tasks (running servers, etc.).
Technically speaking, UNIX refers to a specific operating system
developed at Bell Labs in the 1970s; however, today it is more
commonly used (slightly imprecisely) to mean “any UNIX-like”
operating system, such as Linux, FreeBSD, Solaris, AIX, and even
Mac OSX.1 Here, we will use the more general term and note that you
are most likely to use Linux or Mac OSX.
UNIX is a great example of the tools-for-experts versus tools-for-novices
tradeoff discussed in the introduction to these appendices. If
you are reading this section, odds are good that you fall into the
relatively large set of people who are “master novices” when it comes
to using a computer—that is, you have mastered all of the skills of a
novice user. You can use a graphical interface to open files, send
email, browse the web, and play music. Maybe you can even fix a few
things when something goes wrong. However, you would be hard
pressed to make your computer perform moderately sophisticated tasks
in an automated fashion.
As a simple example, suppose you had 50 files in a directory
(a.k.a. “folder”) and wanted to rename them all by replacing _ with - in
their names (but otherwise leaving the names unchanged). As a “master
novice” you could perform this task in the graphical interface by hand
—clicking each file, clicking rename, and typing in the new name.
However, such an approach would be incredibly tedious and time
consuming. An expert user would use the command line (which we will
introduce shortly) to rename all 50 files in a single command, taking
only a few seconds of work.
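To make this concrete, here is one way an expert might do that rename. This is a minimal sketch, not the only way: it uses a shell for loop with the tr command, and it creates two example files (the names are made up) in a scratch directory so nothing real is renamed:

```shell
# Work in a fresh scratch directory so no real files are touched.
cd "$(mktemp -d)"
touch report_2019_final.txt notes_draft.txt   # made-up example files

# For each file whose name contains '_', rename it so every '_' becomes '-'.
for f in *_*; do
  mv "$f" "$(printf '%s' "$f" | tr '_' '-')"
done

ls   # now lists: notes-draft.txt  report-2019-final.txt
```

With 50 files the loop is exactly the same: one command, a few seconds of work.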
B.1 In the Beginning Was the Command Line
While UNIX has a graphical interface (GUI), its users often make use
of the command line.2 In its simplest usage, the command line has you
type the name of the program you want to run, whereas a GUI-based
operating system might have you double-click on an icon of the
program you want to run. The command line interface can be
intimidating or frustrating at first, but an expert user will often prefer
the command line to a GUI. Beyond being the natural environment to
program in, it allows for us to perform more sophisticated tasks,
especially automating those that might otherwise be repetitive.
To reach a command line prompt, you will need to use a terminal
emulator (commonly referred to as just a “terminal”), which is a
program that emulates a text-mode terminal. If you are running a
UNIX-based system (Linux or Mac OSX), a terminal is available
natively. In Linux, if you are using the graphical environment, you can
run xterm, or you can switch to an actual text-mode terminal by
pressing Ctrl-Alt-F1 (to switch back to the graphical interface, you can
press Ctrl-Alt-F7). If you are running Mac OSX, you can run the
Terminal application (typically found under Applications →
Utilities).
If you are running Windows, there are some command line options
(typically called cmd or command, depending on the version of Windows);
however, these tend to be quite simplistic by UNIX standards. You
could install a tool called Cygwin, which provides the basics of a
UNIX environment if you wanted. However, if you have access to a
UNIX server (e.g., if you are taking a class and your teacher has set one
up for you to work on), it is typically easier to just log into the server
remotely and work there. This is explained in more detail in Section
B.12.
Figure B.1: The command prompt.
Once you have started your terminal, it should display a command
prompt (or just “prompt” for short). Figure B.1 shows a picture of a
typical command prompt. The prompt not only lets you know that the
shell is ready for you to give it a command but also provides some
information. In this case, it gives the current username (drew, displayed
before the @) and the hostname of the system you are on (in this case,
the system is named fenrir, displayed after the @). It then has a : and
the current directory. In this case, the current directory is ~, which is
UNIX shorthand for “your home directory” (which we will elaborate
on momentarily). After that, the $ is the typical symbol for the end of
the prompt for a typical user, indicating that a command can be entered.
The grey box is the cursor, which indicates where you are typing input.
The cursor blinks, which is not shown in the figure.
The prompt displays this information, since it is typically useful to
know immediately without having to run a command to find out. While
it may seem trivial to remember who you are or what computer you are
on, it is quite common to work across multiple computers.3 For
example, a developer may have one terminal open on their local
computer, one logged into a server shared by their development team,
and a third logged into a system for experimentation and testing.
Likewise, one may have multiple usernames on the same system for
different purposes.4 Exactly what information the prompt displays is
configurable, which we will discuss briefly later.
Now You Try: Command Line Basics
Open a UNIX terminal either locally (on a Mac/UNIX
machine) or by logging onto a remote UNIX server (see
Section B.12). What does your prompt look like? Does it
include your username? The hostname? Type “whoami” at the
prompt to allay any existential crisis you may be having.
B.2 Getting Help: man and help
The first commands we will learn are those that provide built-in help.
The first, and most versatile of these is the man command (which is
short for “manual”). This command displays the manual page (“man
page” for short) for whatever you request—commands, library
functions, and a variety of other important topics. For example, if you
type man -S3 printf, your computer will display the man page for the
printf function from the C library.
Before we discuss the details of the man command, we will take a
brief aside to discuss command line arguments. Like many UNIX
commands, man takes arguments on the command line to specify
exactly what it should do. In the case of man, these arguments
(typically) specify which page (or pages) you want it to display for you.
In general, command line arguments are separated from the command
name (and each other) by white space (one or more spaces or tabs). In
the example above, we gave the man command two arguments: -S3 and
printf.
Of these two arguments, the first is an “option.” Options are
arguments that differ from “normal” arguments in that they start with a
- and change the behavior of the command, rather than specifying the
typical details of the program (such as which page to display or what
file to act on). In the particular example above, the -S3 argument tells
man to look in section 3 of the manual, which is dedicated to the C
library.
Figure B.2: Display of man page for printf.
Before we delve into the options and the details of the various
sections of the manual, we will look at what the manual displays in a
bit more detail. Figure B.2 shows the output of man -S3 printf. This
page actually has information not only for printf, but also for a variety
of related functions, which are all listed at the top of the page. The
SYNOPSIS section lists the #include file to use, as well as the functions’
prototypes. At the bottom of the screen is the start of the DESCRIPTION
section, which describes the behavior of the function in detail. This
description runs off the bottom of the screen, but you can scroll up and
down with the arrow keys. You can also use d and u to scroll half a page
at a time. You can also quit by pressing q. These are the most important and
useful keys to know, but there are a variety of other ones you can use,
which you can find out about by pressing h (for help).
If you were to continue scrolling down through the man page for
printf, you would find out everything you could ever want to know
about it (including all the various options and features for the format
string, what values it returns under various conditions, etc.). We are not
interested in the details of printf for this discussion, only that the man
page provides them.
The manual includes pages on topics other than just the C library,
such as commands. For example, in Section B.3, we will introduce the
ls command. If you wanted to know more details of this command,
you could do man ls to read about it. The manual page describes what
command line arguments ls expects, as well as the details of the
various options it accepts.
Now You Try: Man Pages
Read the man page for ls. Find out what options you can give
the ls command to (a) have it list in “long format” (with more
details) and (b) use unit suffixes for Megabytes, Gigabytes,
etc…when it lists the sizes.
Unlike printf, we did not specify a section of the manual for ls.
In fact, not specifying the section explicitly is the common case—man
will look through the sections sequentially trying to find the page we
requested. If there is nothing with the same name in an earlier section
of the manual, then you do not need to specify the section. In the case
of ls, the page we are looking for is in Section 1—which has
information about executable programs and shell commands. In fact,
when you run man ls, you can see that it found the page in section 1 by
looking in the top left corner, where you will see LS(1). The (1)
denotes section 1 of the manual.
If we just type man printf we get the man page for the printf
command from section 1 (“printf(1)”). This page corresponds to the
executable command printf, which lets you print things at your shell.
For example, you could type printf "Hello %d\n" 42 at your shell
and it would print out Hello 42. While this may not seem useful,
Section B.10 introduces “shell scripts,” which can automate various
tasks. When writing a script, it might be useful to print out information
such as this. Since man finds this page first, if we want the C library
function printf (for example, if we are programming and need to look
up a format specifier that we do not remember), we need to explicitly
ask for section 3 with the -S3 option, as section 3 has C library
reference.
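For instance, you can try the section-1 printf right at your prompt:

```shell
# The printf command (section 1 of the manual) formats and prints text.
printf "Hello %d\n" 42          # prints: Hello 42
printf "%s plus %s\n" one two   # prints: one plus two
```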
So far, we have seen two sections of the manual: section 1, which
is for executable programs and shell commands, and section 3, which is
for C library function reference. How would we find these out if we did
not have this book handy? Also, how do we find out about the other
sections of the manual? The man command, like most other commands,
has its own manual page too, so we could just read that. In fact, if we
type man man, the computer will display the manual page for the man
command. Scrolling down a screen or so into the DESCRIPTION section
shows the following table of sections:
The table below shows the section numbers of the manual
followed by the types of pages they contain.
1   Executable programs or shell commands
2   System calls (functions provided by the kernel)
3   Library calls (functions within program libraries)
4   Special files (usually found in /dev)
5   File formats and conventions, e.g. /etc/passwd
6   Games
7   Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
8   System administration commands (usually only for root)
9   Kernel routines [Non standard]
Scrolling down further in the manual will show various examples
of how to use man, as well as the various options it accepts.
New users of the man system often face the conundrum that
reading a man page is great for the details of something if you know
what you need, but how do you find the right page if you do not know
what you are looking for? There are two main ways to find this sort of
information. The first is to use the -k option, which asks man to do a
keyword search. For example, suppose you wanted to find a C function
to compare two strings. Running the command man -k compare lists
about 56 commands and C library functions that have the word
“compare” in their description. You can then look through this list, find
things that look relevant, and read their respective pages to find the
details.
The other way to find things is to look in the SEE ALSO section at
the end of another page if you know something related but not quite
right. This section, which you can find at the end of each man page, lists
the other pages the author thought were relevant to someone reading
the page she wrote.
Now You Try: Searching The Man Pages
Use man -k to find a command which will omit repeated lines
from its input.
B.3 Directories
The discussion of the prompt introduced three important concepts:
directories, the current directory, and the user’s home directory.
Directories are an organizational unit on the filesystem, which contain
files and/or other directories. You may be familiar with the concept
under the name “folder”, which is the graphical metaphor for the
directory. The actual technical term, which is the correct way to refer to
the organizational unit on the filesystem, is “directory”. Folder is really
only appropriate when referring to the iconography used in many
graphical interfaces.
To understand the importance of the “current directory,” we must
first understand the concept of path names—how we specify a
particular file or directory. In UNIX, the filesystem is organized in a
hierarchical structure, starting from the root, which is called / . Inside
the root directory, there are other directories and files. The directories
may themselves contain more directories and files, and so on. Each file
(or directory—directories are actually a special type of file) can be
named with a path. A path is how to locate the file in the system. An
absolute path name specifies all of the directories that must be
traversed, starting at the root. Components of a path name are separated
by a / . For example, /home/drew/myfile.txt is an absolute
pathname, which specifies the myfile.txt inside of the drew directory,
which is itself inside of the home directory, inside the root directory of
the file system.
The “current directory” (also called the “current working
directory” or “working directory”) of a program is the directory a
relative path name starts from. A relative path name is a path name that
does not begin with / (path names that begin with / are absolute path
names). Effectively, a relative path name is turned into an absolute path
name by prepending the path to the current directory to the front of it.
That is, if the current working directory is /home/drew then the relative
path name textbook/chapter4.tex refers to
/home/drew/textbook/chapter4.tex.
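You can watch this resolution happen at the prompt. The following sketch uses a scratch directory from mktemp (so the exact absolute path will vary from system to system) and a made-up subdirectory name:

```shell
base="$(mktemp -d)"   # an absolute path to a fresh scratch directory
cd "$base"            # absolute path: it starts with /
mkdir -p textbook
cd textbook           # relative path: resolved against the current directory
pwd                   # prints the absolute path: $base/textbook
```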
All programs have a current directory, including the command
shell. When you first start your command shell, its current directory is
your home directory. On a UNIX system, each user has a home
directory, which is where they store their files. Typically the name of
a user’s home directory matches their user name. On Linux systems, they
are typically found in /home (so a user named “drew” would have a
home directory of /home/drew). Mac OSX typically places the home
directories in /Users (so “drew” would have /Users/drew). The home
directory is important enough that it has its own abbreviation, ~. Using
~ by itself refers to your own home directory. Using ~ immediately
followed by a user name refers to the home directory of that user (e.g.,
~fred would refer to fred’s home directory).
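You can see the shell expand ~ for you at the prompt (this assumes your HOME environment variable is set, as it normally is):

```shell
echo ~   # the shell replaces ~ with your home directory, e.g. /home/drew
cd ~     # change to your home directory
pwd      # prints the same path that echo ~ printed
```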
Now You Try: Current Directory
Use the pwd command to find out what the current working
directory of your command shell is.
There are a handful of useful directory-related commands that you
should know. The first is cd, which stands for “change directory.” This
command changes the current directory to a different directory that you
specify as its command line argument (recall from earlier that
command line arguments are written on the command line after the
command name and are separated from it by white space). For
example, cd / would change the current directory to / (the root of the
filesystem). Note that without the space (cd/) the command shell
interprets it as a command named “cd/” with no arguments, and gives
an error message that it cannot find the command.
The argument to cd can be the pathname (relative or absolute—as
a general rule, you can use either) for any directory that you have
permission to access. We will discuss permissions in more detail
shortly, but for now, it will suffice to say that if you do not have
permission to access the directory that you request, cd will give you an
error message and not change the directory.
Another useful command is ls, which lists the contents of a
directory—what files and directories are inside of it. With no
arguments, ls lists the contents of the current directory. If you specify
one or more path names as arguments, ls will list information about them.
For path names that specify directories, ls will display the contents of
the directories. For path names that specify regular files, ls will list
information about the files named.
Figure B.3: Examples of the cd and ls commands.
Figure B.3 shows an example of using the cd and ls commands.
The first command in the example is cd examples, which changes the
current directory to the relative path examples. Since the current
directory is /home/drew, this makes an absolute path of
/home/drew/examples (which is called ~/examples for short). On the
second line, you can see that the prompt now shows the current
directory as ~/examples. The second command is ls, which lists the
contents of the examples directory (since there are no arguments, ls
lists the current directory’s contents). In this example, the current
directory has 2 directories (dir1 and dir2) and 2 regular files
(myfile.c and myfile.txt) in it. The default on most systems is for ls
to color code its output: directories are shown in dark blue, while
regular files are shown in plain white. There are other file types, which
are also shown in different colors.
The ls command (like man, and many other UNIX commands)
can also take special arguments called “options”. For example, for ls
the -l option requests that ls print extra information about each file
that it lists. The -a option requests that ls list all files. By contrast, its
default behavior is to skip over files whose names begin with . (i.e., a
dot). While this behavior may seem odd, it arises from the UNIX
convention that files are named with a leading . if and only if you typically do
not want to see them. One common use of these “dot files” is for
configuration files (or directories). For example, a command shell
(which parses and executes the commands you type at the prompt)
maintains a configuration file in each user’s home directory. For the
command shell bash (see Section B.9), this file is called .bashrc. For the
command shell tcsh (see Section B.9), this file is called .cshrc.
The other common files whose names start with . are the special
directory names . and ... In any directory, . refers to that directory
itself (so cd . would do nothing—it would change to the directory you
are already in). This name can be useful when you need to explicitly
specify something in the current directory (./myCommand). The name ..
refers to the parent directory of the current directory—that is, the
directory that this directory is inside of. Using cd .. takes you “one
level up” in the directory hierarchy. The exception to this is the .. in
the root directory, which refers back to the root directory itself, since
you cannot go “up” any higher.
The ls command has many other options, as do many UNIX
commands. Over time, you will become familiar with the options that
you use frequently. However, you may wonder how you find out about
other options that you do not know about. Like most UNIX commands,
ls has a man page (as we discussed in Section B.2) that describes how
to use the command, as well as the various options it takes. You can
read this manual page by typing man ls at the command prompt.
Two other useful directory-related commands are mkdir and
rmdir. The mkdir command takes one argument and creates a directory
by the specified name. The rmdir command takes one argument and
removes (deletes) the specified directory. To delete a directory using
rmdir, the directory must be empty (it must contain no files or
directories, except for . and .., which cannot be deleted).
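A short sketch of these two commands in action (the directory name demo is our own choice for illustration):

```shell
mkdir demo     # create a new, empty directory named demo
ls -d demo     # -d lists the directory entry itself, not its contents
rmdir demo     # succeeds only because demo is empty
```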
Now You Try: Directory Commands
If you are not already in your home directory, cd to it.
• Make a directory called example
• List the contents of your current directory (you should
see the example directory you just made)
• Use cd to change directories into the example
directory
• Use ls to look at the contents of your new current
directory.
• Use cd .. to go back up one level
• Remove the example directory that you created.
B.4 Displaying Files
Now that we have the basics of directories, we will learn some useful
commands to manipulate regular files. We will start with commands to
display the contents of files: cat, more, less, head, and tail.
The first of these, cat, reads one or more files, concatenates them
together (which is where it gets its name), and prints them out. As you
may have guessed by now, cat determines which file(s) to read and
print based on its command line arguments. It will print out each file
you name, in the order that you name them.
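For instance, the following sketch (with file names of our own invention) shows cat concatenating two small files:

```shell
printf 'alpha\n' > f1.txt   # create two small sample files
printf 'beta\n'  > f2.txt
cat f1.txt f2.txt           # prints alpha, then beta, in argument order
rm f1.txt f2.txt            # clean up
```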
If you do not give cat any command line arguments, then it will
read standard input and print it out. Typically, standard input is the
input of the terminal that you run a program from—meaning it is
usually what you type. If you just run cat with no arguments, this
means it will print back what you type in. While that may sound
somewhat useless, it can become more useful when either standard
input or standard output (where it prints: typically the terminal’s
screen) are redirected or piped somewhere else. We will discuss
redirection and pipes in Section B.7.
While you can use cat to display the contents of a file, you
typically want a bit more functionality than just printing the file out.
The more command displays one screenful and then waits until you
press a key before displaying the next screenful. It gets its name from
the fact that it prompts --More-- to indicate that you should press a
key to see more text. The less command supersedes more and provides
more functionality: you can scroll up and down with the arrow keys,
and search for text. Many systems actually run less whenever you ask
for more.
There are also commands to show just the start (head) or just the
end (tail) of a file. Each of these commands can take an argument of
how many lines to display from the requested file. Of course, for full
details on any of these commands, see their man pages.
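As a quick sketch (the sample file is our own), both commands accept the -n option to set the line count:

```shell
printf 'one\ntwo\nthree\nfour\n' > lines.txt   # a small sample file
head -n 2 lines.txt    # prints the first two lines: one, two
tail -n 1 lines.txt    # prints the last line: four
rm lines.txt           # clean up
```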
Note that these commands just let you view the contents of files.
We will discuss editing files in Appendix C.
Now You Try: Looking at Files
UNIX has a system dictionary, in /usr/share/dict/words (which
contains one word per line). Use the head command to print
the first 20 lines of this file. Use the tail command to print the
last 25 lines of this file.
B.5 Moving, Copying, and Deleting
Another task you may wish to perform is to move (mv), copy (cp), or
delete (rm—stands for “remove”) files. The first two take a source and
a destination, in that order. That is where to move (or copy) the file
from, followed by where to move (or copy) it to. If you give either of
these commands more than 2 arguments, they assume that the first N-1
are sources, and the last is the destination, which must be a directory. In
this case, each of the sources is moved (or copied) into that directory,
keeping its original filename.
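A sketch of both behaviors (all file and directory names here are our own):

```shell
touch a.txt b.txt          # create two empty source files
mkdir dest                 # a destination directory
cp a.txt b.txt dest        # last argument is a directory: both files copied in
mv a.txt dest/renamed.txt  # two arguments: move and rename in one step
```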
The rm command takes any number of arguments, and deletes
each file that you specify. If you want to delete a directory, you can use
the rmdir command instead. If you use rmdir, the directory must be
empty—it must contain no files or subdirectories (other than . and ..).
You can also use rm to recursively delete all files and directories
contained within a directory by giving it the -r option. Use rm with
care: once you delete something, it is gone.5
Now You Try: Basic File Movements
• Copy the system dictionary to your home directory.
• Rename (move) the copy you created to have the
name mydictionary (note: you don’t actually need two
separate steps: you can specify this name when you
copy).
• Use ls to look at the contents of your home directory
• Delete mydictionary
B.6 Pattern Expansion: Globbing and Braces
You may (frequently) find yourself wishing to manipulate many files at
once that conform to some pattern—for example, removing all files
whose name ends with ~ (editors typically make backup files while you
edit by appending ~ to the name). You may have many of these files,
and typing in all of their names would be tedious.
Because these names follow a pattern, you can use globbing—
patterns that expand to multiple arguments based on the file names in
the current directory—to describe them succinctly. In this particular
case, you could do rm *~ (note there is no space between the * and the
~; doing rm * ~ would expand the * to all files in the directory, and
then ~ would be a separate argument after all of them). Here, * is a
pattern that means “match anything”. The entire pattern *~ matches any
file name (in the current directory) whose name ends with ~. The shell
expands the glob before passing the command line arguments to rm—
that is, it will replace *~ with the appropriately matching names, and rm
will see all of those names as its command line arguments.
There are some other UNIX globbing patterns besides just *. One
of them is ?, which matches any one character. By contrast, * matches
any number (including 0) of characters. You can also specify a certain
set of characters to match with [...] or to exclude with [!...]. For
example, if you were to use the pattern file0[123].txt, it would match
file01.txt, file02.txt, and file03.txt. If you did
file0[!123].txt, then it would not match those names, but would
match names like file09.txt, file0x.txt, or file0..txt (and many
others).
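The following sketch (with sample files of our own making) demonstrates these patterns:

```shell
touch file01.txt file02.txt file09.txt   # sample files to match against
ls file0[12].txt    # matches file01.txt and file02.txt
ls file0[!12].txt   # matches file09.txt only
ls file0?.txt       # ? matches any one character: all three files
rm file0*.txt       # * cleans them all up
```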
Sometimes, you may wish to use one of these special characters
literally—that is, you might want to use * to mean just the character *.
In this case, you can escape the character to remove its special
meaning. For example, rm \* will remove exactly the file named *,
whereas rm * will remove all files in the current directory.
Another form of pattern expansion that UNIX supports is brace
expansion. Brace expansion takes a list of comma-separated choices in
curly braces, such as {a,b,c}, and replaces the surrounding argument with
one version for each item in the list, using that item in place of the list.
For example rm file{1,a,X}.txt would expand to rm file1.txt
filea.txt fileX.txt. This particular example could be accomplished
with globbing as well (using rm file[1aX].txt), but there are uses for
brace expansion that globbing is ill-suited for.
One major difference between globbing and brace expansion is
that globbing operates on the file names in the local directory. Suppose
you wanted to copy some specific files from a remote computer. As we
will discuss in Section B.12, the scp program lets you securely copy
files from one computer to another. You could do scp
user@computer:~/file{1,2,3}.txt ./ to copy three files
(file1.txt, file2.txt, and file3.txt). Globbing is not appropriate
here, since you don’t want to expand based on the names of local files.
Brace expansion is also useful when the choices are longer than
one character each. For example, rm dir1/dir2/{abc,xyz}.txt. Brace
expansion can also be used multiple times in one argument, in which
case you get all possible pairings of the expansions. For example,
{a,b,c}{1,2,3} expands to 9 arguments (a1, a2, a3, b1, b2, b3, c1, c2,
c3).
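You can see this pairing behavior directly with echo (note that brace expansion is a bash feature, not part of the minimal POSIX shell):

```shell
echo {a,b,c}{1,2,3}
# the shell expands the braces before echo runs, so echo receives
# nine arguments: a1 a2 a3 b1 b2 b3 c1 c2 c3
```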
Now You Try: Expansions
• Use brace expansion and the echo command (which
prints its arguments) to print all 9 combinations of
chicken, turkey, and beef with cheddar, swiss, and
blue.
• List all of the files in /bin whose names start with s.
B.7 Redirection and Pipes
When you run a program under UNIX, it has access to three “files” by
default: stdin, stdout, and stderr. In the typical scheme of things, all
three of these are connected to the terminal in which the program is
running. stdin can be read for input from the user typing at the
terminal, and stderr and stdout can be written to produce output on
the terminal, with the former nominally being for error-related printing,
and the latter for everything else.
However, where these files read and write can be redirected on the
command line where you run the program. Redirecting the input or
output of a program means that instead of the file reading from/writing
to the terminal’s keyboard/screen, it will read/write the file you request
instead. Redirection is accomplished with the < (for input) and/or > (for
output) operators. For example, ./myProgram < file1.txt >
output.txt runs the program myProgram with its input redirected from
file1.txt and its output redirected to output.txt.
You can also redirect stderr by using 2>. The reason for the 2 is
that stderr is file descriptor number 2. In yet another example of how
everything is a number (the key lesson of Chapter 3), programs
communicate with the operating system kernel about files in terms of
file descriptors—numeric handles representing open files. When a
program opens a file, the OS kernel returns a file descriptor the
program uses for all future requests about that file until it closes it.
Note that while in Chapter 11 we discuss IO in terms of FILE*s, these
actually are structures that wrap the file descriptor in more state for the
C library. Standard input, output, and error are just file descriptors (0,
1, and 2 respectively) that are open before the program starts.
You can, in fact, redirect file descriptors other than the
standard three. For example, if you wrote ./cmd 3< f1 4> f2, it would
open the file f1 for reading as file descriptor 3 and f2 for writing as file
descriptor 4 before starting the program. You can also use the <>
operator (possibly with a number before it) to redirect a file descriptor
for both reading and writing. The advanced behaviors described in this
paragraph are relatively uncommon as few programs expect such file
descriptors to be open when the program starts.
Two more commonly used features of redirection are >>, which
redirects the output to a file, but appends to the original contents rather
than erasing it, and 2>&1, which redirects one file descriptor (in this
case 2—stderr) to refer to exactly the same file as another (in this case
1—stdout).
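A brief sketch of both features (file names are illustrative):

```shell
echo first  > log.txt      # > truncates (or creates) the file
echo second >> log.txt     # >> appends: log.txt now holds both lines
( echo out; echo err >&2 ) > both.txt 2>&1   # stderr joins stdout in both.txt
cat both.txt               # shows both the stdout and stderr lines
rm log.txt both.txt        # clean up
```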
UNIX also supports a special form of input redirection called a
“here document.” A here document lets you write a literal multi-line
input for the program, and redirect its input to be what you wrote.
Redirecting input with a here document involves the << operator,
followed by the “here tag”—the word that you will use to indicate
where the here document ends. While this tag can be anything (that
does not appear on a line by itself in the input), it is traditionally EOF
(which stands for “end of file”). For example:
drew@fenrir:~$ cat << EOF
> This is a here document
> which will all serve as the input for the cat
> Until it ends with the here tag on a line by itself
> (which is right below this)
> EOF
The above would run the cat command with its input redirected to
be the multiple lines of text between the two EOF markers. Note: the “>”
characters above are not entered by the user. They appear at the
beginning of each line as a sub-prompt to complete the here document.
In some settings, this prompt might be a “?”. When run with no
arguments, cat reads standard input (in this case, the text of the here
document) and prints it out. Here documents can be quite useful when
writing scripts, which are basically programs in the shell. We will
discuss them in more detail in Section B.10.
Another way that the inputs/outputs of programs can be
manipulated is with pipes. A pipe connects the output of one program
to the input of another program. Using a pipe from the command shell
is a matter of placing the | (read “pipe”) between two commands. The
output of the first command becomes the input of the second command.
For example, diff x.c y.c | less runs the command diff x.c y.c,
which prints the differences between the two files x.c and y.c;
however, since the output is piped to less, it will serve as less’s input.
With no arguments, less reads stdin and lets you scroll around in it.
This entire command line lets you scroll through the differences
between the files, which may be quite useful if there are a large set of
differences.
It would be possible to achieve a similar effect with redirection
and two commands: diff x.c y.c > temp then less temp; however,
there are subtle, yet important differences. With the redirection
approach, the diff command is run completely, writing to a file on
disk, then the less command is run using that file as input. With the
pipe approach, the two programs are run at the same time, with the
output from diff being passed directly to less through the OS kernel’s
memory. This distinction may make a significant difference in speed
and disk-space used if the output of the first command is quite large.
The pipe approach is also more convenient to type.
You can build command pipelines with more than two commands
—connecting the output of the first to the input of the second, the
output of the second to the input of the third, and so on. In fact,
command pipelines with three or four commands are quite common
amongst experienced UNIX users. Part of the UNIX philosophy is to
make commands that perform one task well, and connect them together
as needed.
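In that spirit, here is a sketch of a classic multi-stage pipeline (the sample input is our own) that finds the most frequently repeated line:

```shell
printf 'b\na\nb\nc\nb\na\n' |   # six sample input lines
  sort |                        # group identical lines together
  uniq -c |                     # count each distinct line
  sort -rn |                    # sort the counts, largest first
  head -n 1                     # keep only the most frequent line (b, count 3)
```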
Note that the command shell processes redirections and pipes
before the requested program actually starts. They are not included in
the command line arguments of the program.
Now You Try: Pipes and Redirection
• Use echo and redirection to create a file called
myName.txt with your name in it.
• Use head to print the first 5000 words of the system
dictionary, and then pipe the output to tail so that you
only see the last 300 of those 5000.
• Perform the previous example, but pipe that output to
less, so that you can scroll through the results.
B.8 Searching
One common and important task when using a computer is searching
for things. For example, suppose you have many C source files, and
you want to search through them to see where you called myFunction.
You could use the grep command, which searches one or more files (or
standard input if you do not specify any file names) for a particular
pattern. The simplest of patterns is a literal string: myFunction matches
exactly itself. Therefore, you could do grep myFunction *.c, and it
would search in all files ending with .c in the current directory (recall
that the shell expands the * glob), and print out each matching line as
well as the file in which it occurs.
The previous example is quite useful, but is just a taste of the
power of grep. The patterns that grep can search for are not limited to
just exactly matching one string, but rather support more general
patterns. Grep, and a variety of other tools that use similar patterns,
describe them as “regular expressions” (“regexps” for short), which is
mostly true—technically speaking, grep’s patterns support features that
go beyond the capabilities of true regular expressions. As one contrived
example, suppose you wanted a list of all words in the English
language with any 4 characters, then w, then any 3 characters. You
could use grep to search the system dictionary
(/usr/share/dict/words) for a regexp that matches exactly this
criterion:
grep '^.\{4\}w.\{3\}$' /usr/share/dict/words
This pattern may seem complex, but is really a few simple pieces
strung together. The 's around the outside of the pattern tell the
command shell that we do not want it to interpret special characters in
that argument, but rather pass it as-is to grep. The ^ at the start of the
pattern matches the start of the line. The . matches any character, and is
followed by \{4\}, which specifies 4 repetitions of the prior pattern
(we could have instead written .... if we wanted). The w matches
exactly the letter w. The .\{3\} matches any three characters, in the
same way as the .\{4\} matched any 4 characters. Finally the $ at the
end matches the end of the line. Without the ^ and $ we could match
the rest of the pattern anywhere in a line (which we might want
sometimes).
Our goal here is not to discuss all the intricacies of grep, nor the
possibilities for its patterns, but rather to introduce you to the tool, and
let you know that you can search for rather complex patterns if you
need to. We will note that regexps and the shell use special characters
(*, {}, etc) for different purposes. Often you will want to enclose your
pattern in ’ to prevent the shell from expanding globs and braces, and
applying other special meanings to characters in your patterns.
Another type of searching that you might want to do is to find files
that meet specific criteria. One might be inclined to approach this by
using ls and piping the output to grep. Such an approach is possible
(and looking in the man page for ls shows that the -R option makes it
recursively look through subdirectories). This approach could work, as
long as you only want the criteria to include the name of the file you
are looking for, though even then, it is not the best way.
A better way is to use the find command, which takes the criteria
to look for, and the path to look in. The criteria can be the name of the
file, or other things like “find files newer than some specific file.” The
criteria to look for are specified as options to find—for example, -name
pattern specifies to find files whose name matches pattern. The -name
option is one of the most commonly used, and the pattern
can include shell glob patterns. However, these must be escaped with a
\ to prevent the shell from expanding them before passing the argument
to find. Again, we are not going to go into the details of find here, but
want you to know that it exists, and you can read all about it in its
manpage if you need to.
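A minimal sketch of find with an escaped glob (the directory tree here is our own):

```shell
mkdir -p proj/src                     # build a small tree to search
touch proj/src/main.c proj/notes.txt
find proj -name \*.c                  # \ keeps the shell from expanding *.c;
                                      # prints: proj/src/main.c
rm -r proj                            # clean up
```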
Now You Try: Searching
• Use grep to find all the words in the system dictionary
that have “sho” anywhere in them.
• Use grep to find all the words in the system dictionary
that have “sho” at the start.
• Use grep to find all the words in the system dictionary
that have an “s”, followed by 0 or more characters,
followed by an “h”, 0 or more characters, then an “o”
(Note: the regexp for this pattern is s.*h.*o).
• Use the find command to list all files in /usr with
“net” in their names somewhere.
B.9 Command Shells
We’ve been a little vague about the command line. The truth of the
matter is that when you type a command at a terminal prompt, there is a
program that parses, interprets, and executes these commands for you.
This program is called a command shell. At a minimum, a UNIX
command shell supports all UNIX commands (such as cd, ls, rm, etc.).
However, most UNIX command shells provide more sophisticated
features, effectively forming a programming language of their own.
This programming language allows an experienced user to write “shell
scripts,” which contain algorithms implemented in shell commands to
automate tasks (which in some cases may be quite complex).
There are a variety of command shells. One of the most popular is
bash. Another, slightly older but still rather prevalent is tcsh
(pronounced “tee-see-shell”). We will briefly introduce both to you.
Command shell preferences (much like text editor preferences) can be a
heated topic. A quick internet search will supply you with hours of
arguments about which is better. We recommend being pragmatic in
your choice. If those around you (co-workers, friends, TAs, instructors)
are all gravitating towards a particular shell, this is the one you should
use. It increases the amount of help you can get from and give to
others, and it decreases the number of problems that may arise due to
differences in the shells.
Both Linux and Mac OS X should run bash by default, unless you
have changed your default shell. If they run some other shell, you can
just type bash at the prompt to run a bash shell (it too is a program, just
like any other). If you are running Windows, bash is not built in.
As a final note, bash and tcsh are only two of many command
shells, most ending in sh. To name just a few: sh, csh, zsh, and dash.
Become familiar with one; dabble with the rest on a need to know basis
only.
B.10 Scripting
UNIX command shells are not simply an interface to run programs,
they are a kind of programming language themselves. Programs written
in command shells are called scripts. These scripts contain programs
built from shell commands, often involving running other programs.
The shell scripting language has most of the programming constructs
you would expect from learning to program in C—variables,
conditional statements, loops, and functions. As with many things in
this appendix, our goal is not to provide a comprehensive guide to the
topic, but to introduce you to the idea so that you can seek out more
information when the tool is useful to you. This section will
specifically discuss bash scripts, but the code examples will be given in
both bash (on the left) and tcsh (on the right) in order to give you
some familiarity with the latter and to show you how various scripting
languages differ. Note that we do not expect (or even suggest) that you
learn both of these. Instead, we present both for the eventuality where
you search for how to perform some task and find results in a shell that
is not the one you use—you will have at least seen that there are
different shells, and that they generally provide similar functionality,
even if with slightly different syntax.
As with most programming languages, shell scripts have variables.
Unlike C, bash scripts are untyped. You do not declare the types of
your variables—nor even declare the variables before you use them. To
assign to a variable, you simply write variable=value. Unlike C, bash
does not require a semicolon to end a statement. Instead a statement
may be terminated by either a newline or a semicolon.
Using a variable in bash requires putting a dollar sign ($) before
the variable’s name. For example, we could do the following:
bash:
variable="hello world"
echo $variable

tcsh:
set variable="hello world"
echo $variable
This very simple script assigns the string "hello world" to the
variable variable, and then runs the command echo $variable, which
the shell expands to echo hello world before running the command
(Recall that echo is a program that simply prints out its command line
arguments). So the script's behavior is to print hello world. You can
try this at your command shell, or you can write these commands into a
file, save it and run it.
If you save a script into a file, you need to make the file
executable in order to be able to run it. UNIX tracks permissions for
files, and by default they are not executable (though when you compile
programs with a compiler like gcc, it adds execution permissions at the
end of linking the binary). We will not go into the full details of
permissions here, but just mention that you can run
chmod u+x filename to add execute permissions for the user who
owns the file (typically, the owner is you if you just created it).
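A short sketch of the whole round trip (the script name hello.sh is our own):

```shell
printf '#!/bin/bash\necho hello from a script\n' > hello.sh
chmod u+x hello.sh    # add execute permission for the file's owner
./hello.sh            # prints: hello from a script
rm hello.sh           # clean up
```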
If you create bash scripts, it is convention (though not required) to
name them with .sh at the end of the name. Following this convention
makes it easy for people (including yourself) to realize that the file is
an executable shell script, and can not only be run, but also read by a
human (as compared to a compiled binary program, which is not
human readable).
Additionally, when you save a script in a file, you should start it
with a line indicating what program should interpret the script. Such a
line starts with #! and then has the full path of the program capable of
running the script. Note that the # is read “hash” or “pound” and the !
is read “bang”, so the #! combination is either read “pound-bang” or
“hash-bang”, with the latter sometimes shortened to “shebang”.
For a bash script, this line would read #!/bin/bash. This line lets
the kernel know that the script should be interpreted by bash. You can
write scripts for other shells (which have different syntaxes), or other
scripting languages, such as perl. Note that # is “comment to end of
line” in bash (and most other scripting languages), so the line will have
no effect in the script itself. If no such line is present, then it will be run
by the default shell. In our example, the complete script would look
like this:
bash:
#!/bin/bash

variable="hello world"
echo $variable

tcsh:
#!/bin/tcsh

set variable="hello world"
echo $variable
As with much programming, the greatest usefulness comes when we
can have the computer repeat tasks for us. bash has loops to repeat
tasks with variations. The most common loop in bash is the for loop,
although there is also a while loop. bash’s for loop behaves slightly
differently from C’s. In bash and tcsh, the syntaxes are:
bash:
for variable in alist
do
  commands
done

tcsh:
foreach variable (alist)
  commands
end
Here, variable can be whatever variable name you want. The
loop will iterate once per item in the alist (in order), with the current
item being assigned to the variable before executing the commands that
form the loop body. For example,
bash:
for i in oneFish twoFish redFish
do
  echo "Current fish is $i"
done

tcsh:
foreach i (oneFish twoFish redFish)
  echo "Current fish is $i"
end
will print
Current fish is oneFish
Current fish is twoFish
Current fish is redFish
Note that the list of things can be the value of a variable, a shell glob
(e.g., *.c), or the output of a command—using back-tick expansion.
When you write a command inside of back-ticks (`, the character that
shares a key with the tilde, on the far left of the numbers row on an
American keyboard), bash runs that command, and replaces the back-
tick expression with the output of that command.
The following example uses back-tick expansion to run the
command find . -name \*.c (finding all .c files in the current
directory and its sub-directories). The output of this find command
becomes the list of things that the for loop iterates over:
bash:
for i in `find . -name \*.c`
do
  # whatever commands you want
done

tcsh:
foreach i (`find . -name \*.c`)
  # whatever commands you want
end
To return to our motivating example at the start of this appendix—
renaming all files in the current directory to replace _ with -. If we take
a second to introduce the tr command, we now have the skills to do
this task with a quick for loop. In its simplest usage, the tr command
takes two arguments—the first is a list of characters to replace, and the
second is the list of characters to replace them with. It reads standard
input, and for each character that it reads, it either prints its replacement
(if that character appears in the first list of characters, it prints the
corresponding character from the second list), otherwise it prints the
character unmodified. With this command, we can use a for loop to
iterate over all the files in the current directory, use the mv command to
rename them, and use back-tick expansion and tr to compute the new
name.
bash:
for i in *_*
do
  mv $i `echo $i | tr _ -`
done

tcsh:
foreach i ( *_* )
  mv $i `echo $i | tr _ -`
end
While this loop may seem a bit unfamiliar to you now, as you gain
experience with shell scripting, such a command will come naturally to
you whenever you need to perform a repetitive task. If you want to
count over numbers (as you would with a for loop in C), you can use
the seq command and back-tick expansion to generate the list of
numbers that you want to iterate through.
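A sketch of counting with seq and back-tick expansion:

```shell
for i in `seq 1 3`        # seq prints the numbers 1 through 3, one per line
do
  echo "iteration $i"     # prints: iteration 1, iteration 2, iteration 3
done
```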
We could write an entire book on shell scripting, but that is not the
purpose of this text. Instead, we will suggest that those interested in
reading more about shell scripting consult the wealth of existing
resources available on the Internet. One such resource is the Advanced
Bash Scripting Guide available on the Linux Documentation Project
web-site.
B.11 Environment Variables
Some variables have special meaning to the shell or certain
programs. For example, the PATH variable specifies where the shell
should look for programs to execute. When you type a program name
without any directory (e.g., ssh has no directory in its name, as
opposed to ./myProgram, which names a particular directory), the shell
searches through the components of the PATH in order, looking for a
matching program name. If it finds one, it runs it. Otherwise, it reports
an error. You can see what your current PATH is by echo $PATH (since
PATH is a variable, and echo prints its command line arguments). An
example PATH is
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
Notice that the value of the PATH is a colon-delimited list of directory
names.
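To add a directory to your search path, you can extend the variable yourself; ~/bin is a common but here purely illustrative choice:

```shell
echo $PATH                       # print the current colon-delimited search path
export PATH="$PATH:$HOME/bin"    # append a directory to the end of the path
```

(The export keyword, discussed below, makes the change visible to programs you run from this shell.)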
Another variable that controls the shell’s behavior is IFS—the
internal field separator. This variable controls how the shell divides
input up into fields. Consider the following loop:
1 for i in ‘cat someFile‘
2 do
3 # some commands with i
4 done
When bash goes to execute the loop, it has to split up the output of
cat someFile into fields (what to set i to for each iteration of the
loop). The current value of IFS controls how this splitting is done. The
default value of IFS causes the input to be split into fields at any
whitespace. However, you might want to split the fields differently: at
only newlines (IFS=$'\n'), at commas (IFS=','), or at some other
separator you desire.
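As a self-contained sketch (using a made-up sample string rather than the output of cat), setting IFS to a comma changes how bash splits the expansion into fields:

```shell
data="alpha,beta,gamma"
IFS=','            # split fields at commas instead of whitespace
for field in $data
do
  echo "field: $field"
done
```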
There are a variety of other environment variables, and we will of
course not go into them all here. However, we will mention two useful
things to understand about environment variables. First, most variables
are local to the shell you run them in. By default, the variables will not
be passed down to programs that you run from within the shell. If you
want a variable to appear in the environment of commands you run,
you should export it. Typically this is done when the variable is
assigned (export myVar="hi"), but can be done later (myVar="hi" …
export myVar).
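A minimal sketch of this distinction (assuming myVar is not already set in your environment):

```shell
myVar="hi"
# A child shell does not see the variable yet (assuming myVar was not
# already in the environment): it prints "child sees: " with no value
bash -c 'echo "child sees: $myVar"'
export myVar
# After export, the child inherits the variable and prints "child sees: hi"
bash -c 'echo "child sees: $myVar"'
```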
Second, you can read (and manipulate) environment variables
from programs that you write. One way to do so is with the getenv
function (from stdlib.h), which takes the name of an environment
variable, and gives you its value. You can also declare main to take a
third argument char ** envp, which is a pointer to an array of strings
containing the environment variables (in the form "variable=value").
B.12 Remote Login: ssh
To log into a remote UNIX system, you will need to use an ssh
program. ssh stands for “secure shell” and provides an encrypted
connection to a terminal on a remote computer. When you “ssh into”
another computer, you run an ssh client on your computer, which
connects to an ssh server on the other computer. The client and server
set up an encrypted session, and then you log in with your username and
password. These credentials are sent over the encrypted connection, so
they are protected from attackers who might try to eavesdrop on the
connection. Once you are authenticated, you can type commands in
your local terminal, and the ssh program will encrypt them, and send
them to the server. The server will execute the commands, encrypt the
output, and send it back, where your client will decrypt it and display it.
If you are using a UNIX-based system, sshing to a remote
computer is just a matter of typing ssh username@servername at the
command prompt in your terminal (if your username is the same on
both systems, you can omit the username@ part). The ssh program will
then ask you for your password (unless you have other authentication
methods set up). After successfully authenticating, you will be provided
with a command prompt on the remote system, and can execute
commands on it as you desire.
If you are using a Windows machine, the easiest way to ssh is to
find or download a program that is an ssh client. One example that is
both open-source and commonly available is called PuTTY. Another is
called SSH Secure Shell. There are more options than these. Often
Windows machines maintained by major universities will have at least
one of these installed, possibly residing in a directory called Utilities or
Internet. Simply start the program and enter the appropriate information
about the username and server you are trying to connect to.
A companion of ssh is scp, which allows you to copy files
securely from one computer to another over the same protocol as ssh.
Use of scp is much the same as use of cp except that the source or
destination file (or both) can be on another computer. You specify
copying to/from another computer by prefixing the file name with
user@server:. For example, for user smith123 to copy a local file
called myFile into a directory called myDir on a remote server called
myserver.edu, she would type scp myFile
[email protected]:myDir/. There are many other features
available for ssh. Consult man ssh and man scp for more details.
Appendix C
Editing: Emacs
When you are programming, your editor—the program you use to actually write your
code—is key to your productivity. One of the most important aspects of the editor in
your productivity is not getting in the way of the flow of ideas from your brain through
your keyboard, and into your source code. While this point may seem odd or
unimportant to novice programmers, as you become more experienced, you will start
finding yourself “In The Zone” as you program. When you are “In The Zone,” your
mental focus is fully dedicated to your programming, and you have your mind full of
your plans and designs for the program. Here, disrupting your focus at all can take you
out of The Zone, destroying your mental focus and losing your train of thought.
Programmer’s editors (such as Emacs and Vim) are designed so that you can do
everything you need without taking your hands off the keyboard. The ability to do
everything on the keyboard is crucial to staying in The Zone—pausing to grab the
mouse and stumble through menus looking for what you need can be enough to disrupt
your focus. Instead, programmers using tools like Emacs and Vim have the keyboard
shortcuts for anything they do regularly in muscle memory—they can perform the
activities without consciously thinking about how, leaving their full focus to their
programming tasks.
Besides enabling you to stay In The Zone by not having the distraction of the
mouse, programming editors provide a variety of other productivity enhancing features.
One such feature is syntax highlighting—coloring the words that you type based on
how they fit into the syntax of the language (e.g., keywords, type names, the name of a
function being declared, string literals, integer constants, etc…). Syntax highlighting
helps make it easier for you to read the code—for example, you can easily distinguish
what text is inside a string literal. It also helps you identify and correct mistakes
—such as accidentally using a keyword (that you are not familiar with) as a variable
name, mismatching (or improperly escaping) quotation marks in string literals, or
misspelling keywords.
Another feature is automatic indentation according to the structure of the language.
Programmers generally consider indenting code according to its nesting level (how many
{} it is inside of) to be a key element of proper style and formatting. Not only does this
help with readability, but having the editor automatically indent based on the actual
nesting level (as opposed to what the programmer thinks it is) can help identify and
correct errors earlier.
A third important feature of programming editors is their ability to interact with
the other tools used by the programmer—the debugger, the compilation/build tools, and
revision control systems. For example, in Emacs, you can compile from within the
editor, and if the compilation results in errors, jump directly to each of them within the
editor. Likewise, when using the debugger (GDB, which we will discuss in Section
D.2), Emacs understands how to interact with GDB and will display the current
execution point in your code, and let you send commands to GDB about a particular
line of code you are viewing.
We will also note that programming editors typically have a lot of flexibility in
what they are useful for. The authors of this book use Emacs for pretty much everything
that they write, including this book itself! As with programming, staying In The Zone is
crucial to productivity when writing text.
While both Emacs and Vim are appropriate editors for professional programmers,
we will focus on Emacs here. We note that if you use (or want to learn) Vim, that is
great, but we both use and know Emacs. We recommend having a basic familiarity with
both editors (in case you ever need to use a system where one is installed, but not the
other). However, we note that it is virtually impossible to be an expert in both of them.
Remember that you want to train your muscle memory so that you can perform editor
commands by instinct, without thinking about them. Most people cannot do this for two
completely different sets of editor commands.
C.1 Emacs vocabulary
The first step in our introduction to Emacs is the terminology that it uses.
Buffer At a first approximation, you can think of an emacs buffer as being
“an open file”—when you open foo.c, Emacs creates a buffer for foo.c,
and displays it to you. However, you can also have buffers that are not
associated with files—for example *scratch* (a buffer Emacs creates by
default for things you do not want to save), *shell* if you run a shell, and
any number of other things.
Frame What you would think of as a “window.”
Window A frame can be divided into one or more “windows”, each of which
can display a different buffer (although multiple windows can display the
same buffer, possibly at different positions).
Point The Emacs term for where the text cursor is.
Mark One location per buffer, which, when set by the user, Emacs
remembers until changed or cleared by the user. Commands that operate on
regions of the buffer (copy, indent region, etc.) typically act on the region
between the point and the mark.
C-(key) Control-(key), i.e., C-x is Control-x, which means to hold the Control
key while pressing x.
M-(key) Meta-(key), i.e., M-x is Meta-x. Depending on your system and
keyboard layout, ESC (most likely) or ALT may function as Meta for you
—or maybe either. If you use ESC press it first, release it, then hit the other
key. If you use ALT, hold it down while you press the other key.
Kill Cut (as in cut-and-paste).
Yank Paste (as in cut-and-paste).
extended-command Not all commands are bound to key strokes—extended
commands are executed by typing M-x, then the name of the command. For
example, to play Tetris, you do M-x tetris, since Tetris playing is not
bound to any specific key.
RET Return (a.k.a. Enter).
Minibuffer At the bottom of the emacs frame, there is an area big enough for one
line of text below the status bar. This area is called the minibuffer, and is
used when commands need to interact with you. For example, when you
go to open a file, Emacs prompts you for the file name in the minibuffer,
and you type your answer there.
Major Mode Emacs changes its behavior based on what type of contents the
current buffer has (C code, Java code, Scheme code, LaTeX, …). The major
mode defines the buffer's current behavior in these ways. Each buffer can only have
one major mode at a time.
Minor Mode In addition to their major modes, buffers can have minor modes,
which provide additional features or change functionality. For example, a
buffer for editing LaTeX source may be in the LaTeX major mode, but may
have the Flyspell minor mode, to provide spell checking as you type.
Balanced Expression A region of text that typically has balanced delimiters
(parentheses, braces, etc). The exact specification of what constitutes a
balanced expression is dependent on the current major mode.
Modeline The status line at the bottom of the frame, just above the minibuffer.
C.2 Running Emacs
When you run emacs from the command line, you can specify one or more files as
arguments for it to open as it starts. Depending on your system, Emacs may run in a terminal,
or as a graphical window. Some systems (e.g., Linux) support both modes. Mac OSX
has terminal-mode Emacs installed by default, but graphical versions can be
downloaded and installed. If you have a version of Emacs that supports both modes, the
default behavior will be to use the graphical one—you can override this with the -nw
option on the command line (which is particularly useful if you are running Emacs on a
remote computer over a slow Internet connection, as drawing the window may be very
slow).
C.3 Files and Buffers
Open (“visit”) file C-x C-f
Save current buffer C-x C-s
Kill (close) current buffer C-x k
Insert contents of another file C-x i
Save As [another name] C-x C-w
Recover file from autosave information M-x recover-file
Revert to last saved copy M-x revert-buffer
Quit Emacs C-x C-c
Suspend Emacs (to shell) C-z
Resume Emacs after suspend fg [at bash prompt]
Table C.1: Commands to manipulate files or leave Emacs.
Once Emacs opens, you can begin editing in the current buffer. Basic editing
proceeds in a straightforward way: if you type letters, they will be inserted at the point.
If you press the arrow keys, the point will move around in the appropriate direction. Of
course, you may wish to perform other commands besides just writing text. Table C.1
shows a variety of commands related to manipulating files, and leaving Emacs. These
commands—like many others—are done with key combinations involving the control
key. Note that for multiple character sequences (such as C-x C-s to save a file), you do not
need to hold all the keys down at once (you release x, then press s). Also note that if a
command has no C- modifier on the second or later character, you need to release
Control before pressing that character. For example, the C-x i command (listed in
Table C.1) inserts the contents of another file at the point—this command is
accomplished by pressing Control-x, then releasing both control and x, and pressing i.
This command is different from C-x C-i (which has control on both characters, and is
the relatively uncommonly used indent-rigidly command, which indents the region
between the point and the mark by a fixed amount).
When you open a file (C-x C-f), Emacs will prompt you (in the minibuffer) for
the name of the file that you want to open. You can type in the name of the file you
want, which can either exist (in which case it opens the existing file), or not (in which
case, it will create a new file when you save it). If you want to open an existing file, you
can TAB complete the name—type the first few letters and press TAB. If there are
multiple options, Emacs will complete as much as it can, then wait for more input from
you. You can either type more letters (and press TAB again if needed), or hit TAB
immediately to display possible completions. If you display the completions, they will
appear in their own buffer (*Completions*). You can either type the name of one, or go
into the *Completions* buffer, place the point on the one you want, and hit RET to
select it.
Table C.1 also includes the command to quit Emacs (C-x C-c). This command will
prompt you to save any unsaved files before exiting, and return you to the command
prompt. Sometimes, you may wish to return to the command prompt briefly, without
fully quitting Emacs. You can also suspend Emacs by pressing C-z (Control z). Note
that C-z is not Emacs specific, but works for many UNIX programs. You can return to a
suspended program by running the fg command at the command prompt. Note that this is
most useful if you are running terminal-mode Emacs. If you are running it graphically,
you can suspend it with C-z (in the terminal window, not the Emacs window), and then
do bg to let Emacs continue to run in the background while allowing you to interact
with the command shell. In the graphical case, you can just start Emacs in the
background with the & character at the end of the command line—again, this is a feature
of the shell, and not Emacs specific.
C.4 Cancel and Undo
Cancel C-g
C-_
Undo C-x u
C-/
Resume undoing M-x undo-only
Undo in selected region only C-u C-_
Table C.2: Undo and cancel.
Two of the most important commands to learn early on are how to cancel and
undo. The C-g command cancels most anything that can be canceled in Emacs. Most
commonly, if you have partially entered a command incorrectly (including if you have
entered a full command key combination, but it is prompting you for input), you can hit
C-g to cancel it. C-g will also cancel commands that Emacs is in the middle of
executing (if they take long enough that you can press a key while they are still running)
whenever doing so is possible.
Learning how to undo is also important. Emacs supports three different key
combinations to undo (C-_, C-x u, and C-/). They are all the same, and you can learn
whichever you find easiest. You can undo multiple commands in a row by repeatedly
pressing the undo command.
Emacs handles un-undo (often called “redo”) in a different way than you are likely
used to. Once you start undoing, Emacs remembers your undo commands separately
from all the commands that happened before you began undoing (i.e., the commands
you are undoing in reverse order). When you finish undoing—which Emacs considers
to be whenever you do any command other than an undo—Emacs puts all of the undos
that you just did into the command history. Now, if you undo, you will undo those
undos (redoing the original commands). If you accidentally interrupted your undoing and
want to resume without redoing your undos, use the command M-x undo-only to clear
the recent undos from the history of commands that undo works from.
Emacs also allows you to undo within a selected region only. For example,
suppose you edit region A; then go off and edit regions B, C, and D; then you realize
that your edits to region A were incorrect—you would like to undo them, leaving your
changes to B, C, and D intact. You can select the region (in this example, A) you want
to undo changes in (set the mark at the start (C-space), move the point to the end), and
perform the undo. Emacs will say Undo in region! in the minibuffer, and undo your
most recent change in that region. 1
C.5 Cut, Copy and Paste
Kill (cut) to end of line C-k
Set Mark (start selecting) C-Space
Copy selected region M-w
Kill (cut) selected region C-w
Kill to next occurrence of (char) M-z (char)
Kill next balanced expression C-M-k
Kill forwards word M-d
Kill backwards word M-DEL
Append next kill to previous C-M-w
Table C.3: Commands to cut (kill) and copy text.
Another set of commonly useful commands are those for copying and pasting. The
terminology for cutting, copying and pasting in Emacs is a bit different from what you
are likely used to—cutting is called “killing” (because it removes the text from the
buffer). Emacs places the killed text on the “kill ring.” Note that the kill ring is different
(and behaves a bit differently) from the system clipboard, although if you are running
Emacs in graphical mode, it will place killed text there as well. Unlike most copy/paste
systems, the kill ring holds multiple entries (by default 60—when it is full, the oldest is
discarded to make space for the newest). As we will see shortly, you can paste text from
kills other than the most recent.
Emacs has a variety of commands for killing text, several of which are shown in
Table C.3. Two of the most commonly used are “kill to end of line” (C-k)—which will
kill text from the point to the next newline—and “kill selected region” (C-w), which
kills the text in the currently selected region—between the mark and the point. To select
a region, move the point to the start (or end—it does not matter if you select it
backwards) and press C-space to set the mark (Emacs will say “Mark set” in the
minibuffer). Next move the point to the end of the region. The area between the point
and the mark is considered “selected”. Note that if you are using Emacs in graphical
mode, you can also select a region with the mouse. Now you can perform commands
that act on the selected region—such as C-w to kill it, M-w to copy it (put it on the kill
ring without actually killing it), or many other commands. Note that if you undo a kill
command, it will undo the changes to the buffer (restoring the killed text), but leaves
the text on the kill ring.
Note that there are a handful of less common, but quite useful kill commands. The
M-z (“zap to char”) command prompts for a character, then kills from the point to the
next occurrence of that character. For example, if you had the point at the start of a C
statement, and wanted to kill the entire statement (including the semicolon at the end),
you could press M-z ;. Another useful advanced kill command is C-M-k (ESC, then
Control-k), which performs “kill balanced expression.” Suppose you wanted to kill an
entire block of code in C (starting with a { and ending with the matching }). If there are
no other blocks nested inside it, you could do M-z }. However, if there are other blocks
nested inside, you want to kill not to the first }, but to the matching one—this is where
“kill balanced expression” is useful. If you position the point on the open {, and do
C-M-k, it will do exactly what you want. The commands M-d and M-DEL kill one word
forwards or backwards, respectively.
The last command listed in Table C.3 lets you append your next kill to the
previous kill ring entry. Typically each kill command makes a new kill ring entry. As
we will discuss momentarily, pasting will copy one kill ring entry back into the buffer.
If you combine two kills into one kill ring entry, then they will behave as if they were
killed all at once (even if the original text was discontinuous) when you paste them
back. The C-M-w command makes the next kill command you perform have this
“append to previous” behavior (when you do it, Emacs will tell you “If the next
command is a kill, it will append” in the minibuffer). If your next command after a
C-M-w command is not a kill, then nothing special happens.
Yank (paste) previous kill C-y
Replace previous paste with earlier kill M-y
Table C.4: Commands to yank (paste) text.
Pasting is called “yanking” (because it yanks the text back into the buffer from the
kill ring). Table C.4 shows the two main commands for pasting. Basic pasting behavior
is accomplished with the C-y command, which “yanks” (pastes) the most recent entry
on the kill ring, copying its text into the buffer at the point. The kill ring is not modified
by this action, so you can re-paste the same text again by doing C-y as many times as
you want.
Emacs also has the feature of pasting items that are not the most recent. The most
common way to do this is to paste the most recently killed item, then use M-y to replace
it with an earlier kill until you find the one you want. For example, if you made 3 kills
with the text A, B, and C respectively, then C-y would paste the C back in. If you pressed
M-y, Emacs would remove the C and replace it with the B. Pressing M-y again would
replace the B with the A. Note that M-y is only valid immediately following C-y or M-y.
When you use M-y to paste earlier kills, Emacs remembers what you pasted most
recently until the next kill (the “yank pointer”). Subsequent pastes start from this yank
pointer. Killing text resets the yank pointer to the newest kill. If you were to do the
example above, and then paste (C-y) again, the paste would insert A. If you then killed
D, the next paste would insert D, and using M-y would cycle back through C, B, and A in
that order.
You can modify how far (and in which direction) the M-y command skips by
giving it a prefix argument. In Emacs, some commands accept an argument (typically a
number) that is entered before you do the command itself, by doing M-# where # is
whatever number you want (for example M-3). If you do M-3 M-y, Emacs will
replace the most recent paste with the text 3 items back on the kill ring. You can move
forward on the kill ring (towards the most recent kill) by prefixing the M-y command
with a negative argument. The C-y command can also accept a prefix argument, which
makes it directly paste the text found that many items back on the kill ring.
C.6 Multiple Buffers
Change to other (open) buffer C-x b
[then select buffer to change to]
Split window horizontally C-x 2
Split window vertically C-x 3
Move between split windows C-x o
Un-split all windows C-x 1
Remove current split window C-x 0
Make current split window taller C-x ^
Table C.5: Dealing with multiple buffers.
Often you will want to work with multiple buffers at a time. As your projects grow,
you will often have multiple source files, and need to move between them easily. Even
on small projects, you are likely to have a handful of buffers you are working with at a
time—your source code, Makefile, debugger, and possibly compilation errors.
Table C.5 lists commands that are useful in these situations.
The first (and probably most commonly used) lets you switch which buffer a
window displays. Pressing C-x b prompts you (in the minibuffer) for the buffer to
change to, and presents you with a default of the most recent buffer you were working
in that is not currently displayed. If the default is what you want, just press return
(“RET” in Emacs parlance). Otherwise, type in the name of the buffer that you want,
and hit return. You can TAB complete the name of the buffer that you want to switch to
in the same way that you can when opening an existing file.
When you first start Emacs, the frame has one window (recall from the vocabulary
discussion that what you normally think of as a “window” is called a “frame”, and a
“window” is what displays one buffer inside that frame) plus the modeline and
minibuffer. However, you can split any window into smaller pieces. Splitting a window
lets you look at two buffers, or even two different (potentially far away) places in one
buffer at the same time. Looking at two places at once can be incredibly useful when
programming—for example, you can view the declaration of a function in one file at
the same time you view the place you want to call it in another.
You can split the current window either horizontally (C-x 2) or vertically (C-x 3),
and can repeat the process until the resulting windows would fall below a minimum
size that Emacs considers useless. Once you have split the windows, you can move
between them with C-x o (or clicking in them with the mouse, if you are using Emacs
in a graphical mode). You can also un-split a single window (C-x 0—note zero for this
command, and the lowercase letter o to move between windows), or un-split completely
with C-x 1. You can also increase the size of the current window with C-x ^.
Note that sometimes Emacs will automatically split the window in half horizontally
(for example, to display completions or compiler errors) if it needs to display a
buffer for you and you only have one window. This automatic splitting typically
behaves much like any other split (you can use the same commands to move into the
new window, re-split it, etc…). There are two major differences. One is that in cases
where whatever Emacs needed to display becomes irrelevant (e.g., when you finish
selecting a file, the completions become irrelevant), the window displaying that buffer
will automatically disappear. The other is that whenever you are in the middle of a
command that is using the minibuffer (such as selecting a file to open), you cannot use
another command that requires the minibuffer (such as changing the buffer that a
window displays).
C.7 Search and Replace
Incremental Search Forwards C-s
Incremental Search Backwards C-r
To repeat the search (find next/prev), press C-s or C-r again
Search and Replace M-x replace-string
Query Replace M-%
Search and Replace Regexp M-x replace-regexp
Query Replace Regexp C-M-%
Table C.6: Search and replace.
Another set of editing commands you will want to learn are how to search and
replace text. The most common way to search is with the incremental search
commands, C-s (search forwards) and C-r (search backwards). After you enter either of
these commands, the minibuffer will say “I-search:” (or “I-search backwards:”),
prompting you for what to search for. Emacs will search as you type each letter (which
is why it is called incremental search)—it will find the text that matches what you have
typed so far nearest to the point in the appropriate direction. Emacs will also highlight
all text that matches what you have entered so far. The incremental nature of the search
means you only have to type as many letters as it takes to find what you are looking for.
You can also search again by pressing C-s or C-r again, or switch between the two
(searching backwards for the previous instance of whatever you searched forwards for).
If you perform any other command after the search, it will stop the search, but you can
resume it again by pressing C-s or C-r twice (once to start the searching, and again to
tell it to re-search for the last item).
If your search reaches the start (or end if going backwards) of the buffer, Emacs
will say “Failing I-search: (search string)” in the minibuffer. You can press C-s (or C-r)
again to make Emacs wrap the search around to the start (or end) of the buffer, in which
case it will say “Wrapped I-search: (search string)” if it successfully finds something or
“Failing overwrapped I-search: (search string)” if the search string does not exist
anywhere in the buffer.
You can search and replace for a string with M-x replace-string, which will
prompt you in the minibuffer for the string to search for and the string to replace it with.
If you have a region selected, the search/replace will be confined to that region,2
otherwise it will proceed from the point to the end of the buffer.
A slightly different way to replace strings is with the “query replace” command
(M-%, or M-x query-replace). The query replace command prompts you for each
potential replacement as to whether or not you want to replace it. The responses to the
replacement queries are y to replace the current match, n to not replace the current
match and move to the next one, ! to replace all matches without further queries, or ESC
to stop without any more replacements. You can also press ? for help, which will list
those options, as well as some more advanced ones that we will not cover here.
Emacs also has versions of the replace and query replace commands that operate
on regular expressions (regexps). As with the regexps accepted by grep (discussed in
Section B.8), the regexps that Emacs accepts are technically more powerful than a true
regular expression from a formal computer theory sense. We will not delve into the
intricacies of Emacs’s regular expressions, but note that they provide a powerful way to
search and replace complex patterns.
C.8 Programming Commands
Compile M-x compile
[I recommend binding this to C-c C-v]
Go to next error M-g n or C-x `
Go to previous error M-g p
Complete current word M-/
Replace completion with next choice M-/ [again, right after previous M-/]
Show semantic completions C-c , SPACE [Semantic]
Show details of completions C-c , l [Semantic]
Jump to function def (this file) C-c , j [Semantic]
Jump to function def (any file) C-c , J [Semantic]
Goto Line Number M-x goto-line
Indent Selected Region M-x indent-region or C-M-\
Comment DWIM M-;
Table C.7: Programming-related commands.
Emacs, of course, has a variety of commands that are useful for programming,
some of which are shown in Table C.7. One of these is the ability to compile from
within Emacs, via the command M-x compile. 3 This command will prompt you for the
compilation command to use (the default of make -k works fine if you are compiling
with a Makefile), then runs that command. Emacs will display the results of the
compilation (success or any error messages) in a buffer for you. You can then go
through the compilation errors, and Emacs will display the relevant line of code for
each one.
Emacs not only displays the compilation errors, but also lets you jump directly to
them—displaying the message in one window and the relevant source line in another.
Often, you will want to see each error in the order that the compiler found them. You
can go from one error to the next by pressing C-x ` (or M-g n). You can then correct the
error, and press C-x ` again to move to the next, until you have either fixed all the
errors, or decided that you should retry compilation before proceeding. You can also go
backwards to the previous error with M-g p, or move the point onto any error message
in the compilation errors buffer and press RET to jump to it.
There are some options you can set to alter the compilation error behavior
(according to your personal preferences). The most notable two configuration options
are making Emacs automatically jump to the first error when it displays the compilation
results, and automatically jumping to an error when you move the point onto the
message (rather than after you hit RET on the message). We discuss these options later,
in the section on configuring and customizing Emacs.
Another useful feature when programming (or sometimes, in general) is to auto-
complete a word. The simplest way to auto-complete a word is to hit M-/, which
searches backwards through the current buffer for a word that starts with the same
sequence of letters. It then completes the word based on what it finds. If you do not like
this completion, you can press M-/ again, and Emacs will continue searching for another
completion, replacing the one you did not like with the next one it finds. After going
through the entire buffer, Emacs will search other open buffers for possible
completions.
If you enable Semantic mode (M-x semantic-mode to enable for the current
session, or enable it by default by adding (semantic-mode 1) to your configuration file
—see the sample configuration file at the end of this appendix for an example), then
you can use a more sophisticated completion mechanism that understands the syntax of
the programming language. The C-c , SPACE command displays a window with the
context-sensitive completions. For example, if p is a pointer to a structure, and you
write p-> and then request completions, Semantic understands what type p is and what
fields that struct has, so it will list those fields as your completion options.
You can then complete the text in a variety of ways. One option is to use TAB to
complete as much as can be done uniquely (that is, if there are no choices to make at
present, fill in letters until a choice needs to be made). You can then enter one or more
letters to narrow down the choices, and press TAB again to complete more. For
example, if the possible completions are apple, apply and bears, then you would need
to either type a or b. If you picked a, then you could hit TAB to complete up to appl,
and then either type e or y. The other option is to put the point on the completion you
want and hit RET (or click if you are using graphical mode).
Some people prefer to use Semantic’s “idle completion mode,” which displays
possible completions at the point whenever you stop doing anything (i.e., when Emacs
is idle). You can enable this mode with M-x global-semantic-idle-completions-mode (or
by default in your configuration file). When this mode is enabled and you stop doing
anything, Emacs will see if there is anything meaningful to complete. If so, it will
display the first possible completion in highlighted text, and how many possible
completions it found in the minibuffer. You can then hit TAB to cycle through the
possible completions, enter some letters to refine the completions (to those starting with
the letters you have typed), hit RET to accept the completion, or hit C-g to cancel.
If you have enabled Semantic mode, you can also use it to jump to the definition
of a structure or function. If you use C-c , j, Semantic mode will look only in the
current file; if you use C-c , J, it will look in all files that it has analyzed. For either of
these, you will be prompted to enter what you want to jump to (in the minibuffer), and
then Emacs will take you there.
Semantic mode has a variety of other features, which we will not delve into here.
If you are interested in more advanced features, you can look at the Semantic mode
documentation online
(http://www.gnu.org/software/emacs/manual/html_mono/semantic.html).
Table C.7 notes which commands listed there require Semantic mode.
While Emacs automatically indents (according to your programming language),
there are times when you want to explicitly ask it to re-indent a region (for example, if
you move it in or out of a block, and it needs to be adjusted to reflect its new location).
You can ask Emacs to re-indent the selected region with M-x indent-region or C-M-\.
Emacs has a wide variety of comment-related commands. The most versatile of
these is the “Comment Do What I Mean” command, M-;. If a region is selected, the
command comments the region if it is not already all a comment, and uncomments it if
it is. If no region is selected, the command acts on a line, and comments the line out,
indents it appropriately, and moves the point to the start of the comment’s text if it is
not already a comment. If the line is already a comment, it just indents it and moves the
point.
C.9 Advanced Movement
Move to start of line C-a
Move to end of line C-e
Move to start of buffer M-<
Move to end of buffer M->
Forward by a word M-f
Backward by a word M-b
Return to mark (where you set it) C-u C-space
Forward by a balanced expression M-C-f
Backward by a balanced expression M-C-b
Start of line, past indentation M-m
Start of current function M-C-a
End of current function M-C-e
Table C.8: Advanced Emacs movement commands.
Another important set of commands to master as you learn Emacs is those for
advanced movement. While you can use the arrow keys to move around, often you will
want to move in larger steps at a time. Table C.8 shows some of the advanced
movement commands that Emacs supports. It is also useful to note that bash supports
the same key commands when they make sense (e.g., start/end of line;
forward/backward word). The M-m command moves to the start of the line, ignoring any
leading whitespace (which is used to indent the text on that line). What the rest of the
commands do should be fairly straightforward from their descriptions in the table
(recall that we have already discussed what “balanced expressions” are).
We will also note that these commands can be incredibly useful when you want to
define keyboard macros (which we will discuss momentarily), as they let you move
across things by their meaning (a word, function, expression, line) rather than by a
fixed number of characters—thus you can get the desired effect in macros where you
want to operate on a logical unit whose length in characters varies.
C.10 Keyboard Macros
Start Defining Keyboard Macro C-x (
Finish Defining Keyboard Macro C-x )
Execute Macro C-x e
Repeat Execution of Macro e [right after executing it]
Do macro to each line in a region M-x apply-macro-to-region-lines
Name last macro M-x name-last-kbd-macro
Table C.9: Emacs macros.
Emacs lets you record a sequence of commands (called a keyboard macro), which
you can later replay to repeat its effects. When used with features such as search and
advanced movement, these can produce complicated effects allowing you to automate
complicated but repetitive tasks. Recording a keyboard macro starts with C-x ( and ends
with C-x ). The macro encompasses all commands in between the two.
As an example, let us suppose that you are writing C code and have defined an
enum type with many cases. You want to write a function that will convert one of these
enums to a string:
enum my_enum_t {
  MY_ENUM_XYZ,
  MY_ENUM_SOMETHING,
  …
};

const char * my_enum_to_string(enum my_enum_t e) {
  switch (e) {
    case MY_ENUM_XYZ: return "MY_ENUM_XYZ";
    case MY_ENUM_SOMETHING: return "MY_ENUM_SOMETHING";
    …
  }
}
You could instead write the body of the function by copying and pasting the names
of the enumerated values, then making a keyboard macro to convert them.
1. Place the point at the start of the first line, and hit C-x ( (to start the macro).
2. Press M-m (move to start of line skipping indentation) to put the point on the M in
MY_ENUM_XYZ.
3. Type case.
4. Set the mark (C-space)
5. Incremental search (C-s) forward for ,
6. Move backwards one space (left arrow)
7. Copy the selected region (M-w).
8. Move right, delete the comma then insert the text : return "
9. Paste (C-y)
10. Finish the line with ";"
11. Hit down to move to the next line
12. Finish the macro with C-x )
Now you can repeat the macro one time by pressing C-x e. Immediately after C-x e,
you can execute the macro again by just pressing e (and again by continuing to press e).
If you want to apply the macro many times, once per line (as in our example), you can
also use the “apply macro to region lines” feature, which iterates over the lines in the
selected region, and moves the point to the start of each line, then runs the most recent
keyboard macro. (Note that if you took this approach, you could skip the next to last
step in the macro described above, as you no longer need to explicitly move to the next
line as part of the macro). Applying the macro to the lines of a selected region is
accomplished by either the long command name (M-x apply-macro-to-region-lines)
or the shorthand (C-x C-k r).
Note that this example is relatively simple (and could be accomplished with
regexp search and replace, though the macro is simpler than the regexp, or with
X-Macros—as we discuss in Section E.3.6), but you can make quite complex macros to
perform sophisticated tasks. You can also give a macro you have created a name (with
the M-x name-last-kbd-macro command). This command will prompt you for the name
to give the macro, and you can later re-run the macro by doing M-x name, where name is
whatever you named it. You can also save named macros into your configuration file
for use across sessions (which we will discuss later).
C.11 Writing Text or LaTeX
Spell check word M-$
Spell check buffer interactively M-x ispell-buffer
Spell check buffer by coloring words M-x flyspell-buffer
Automatically color mis-spelled words M-x flyspell-mode
Table C.10: Spell checking.
Once you become comfortable with Emacs, you will likely not want to edit
anything in any other program—you will have all of the Emacs commands committed
to muscle memory, and do them without thinking about it. If you use Emacs
keybindings in other programs, you often get interesting, but rather unpleasant results.
Fortunately, Emacs is flexible enough that you can use it for editing pretty much
anything textual. The most common non-programming text you are likely to edit is
natural language (English or other “human” language) in either plain text or LaTeX (we
will not cover LaTeX at all in this book, but it's a great thing to learn).
Most of the editing commands you will want to use when writing text or LaTeX
are the ones we have covered so far—cut, copy, paste, undo, movement, macros, etc.—
however, one thing you will want that you do not need while programming is spell
checking. Table C.10 lists the most commonly used spell checking related commands.
M-$ spell checks the word at the point, providing possible replacements in a small
window at the top of the screen (you can select one with the number or letter assigned
to it) if the word is incorrect. You can also press ? to list some other options, such as
adding the word to your private dictionary (making the spell checker trust it as valid for
everything you do in the future), or accept it for the current session (until you quit
emacs).
You can spell check the entire buffer interactively with the M-x ispell-buffer
command, which will go through the buffer and ask you what you want to do for each
word that it thinks is misspelled (i.e., each word that is not in the system dictionary
and that you have not told it to accept as a valid word).
Many people prefer flyspell mode as their method for spell checking. In flyspell
mode, misspelled words are colored and underlined in the text. A word in red indicates
that the word appears to be misspelled, and it is the first occurrence of that word in the
buffer. A word in yellow indicates that the word appears to be misspelled, but you have
used it before in the buffer. When flyspell indicates a misspelled word, you can use M-$
on it as described above.
C.12 Configuring and Customizing Emacs
You can customize Emacs to your liking by editing the file ~/.emacs (recall from
Section B.3 that ~ is shorthand for your home directory). This file, which is referred to
as your .emacs (read “dot emacs”) file, contains Elisp commands that are executed
when Emacs starts. Note that Elisp is actually a complete programming language, so
you can define rather complicated functions if you want. However, much of the Elisp in
the .emacs file is relatively simple—setting variables, enabling modes, binding keys,
and loading other packages. At the end of this section, we present a sample .emacs file
with comments (in Elisp, ; comments to end of line) describing what the various lines
in it do. We also comment on how strongly we recommend various options.
You can set a variable with the setq function. In Elisp, functions are called with a
parenthesis before their name, followed by their arguments, and then the close
parenthesis. The setq function takes two arguments: the name of the variable to set, and
the value to assign to it. For example, (setq line-number-mode t) sets the variable
line-number-mode (which controls whether or not the current line number shows up in
the modeline at the bottom of the frame) to the value t (which means “true” in Elisp;
nil means “false”).
Some minor modes are enabled by calling a function directly (these are typically
things you can enable by doing M-x something where the something is the same name
as the function you call). Some of these take an argument like “1” or “t” to indicate
that the mode should be enabled (see the documentation for whatever mode you are
looking for if it is not one of the ones shown below).
One particularly useful function to call is global-set-key (you can also set keys
locally with local-set-key), which lets you make your own key bindings. Note that
you can even completely redo the key bindings in Emacs to behave in any way that you
want. global-set-key takes two arguments—the first is a string (in ""s) that specifies
the key sequence to bind, and the second is the command to bind it to. The string
expected by global-set-key is a bit different
from the normal emacs key notation (for example, Control is \C- instead of C-);
however, you can use the kbd function, which converts from the more familiar notation
into the required format. For example the following command binds C-c C-v to the
compilation command:
(global-set-key (kbd "C-c C-v") 'compile)
Note that if you make (and name) a macro that you want to persist across quitting
and restarting Emacs, you can save it into your .emacs file with the
M-x insert-kbd-macro command. Go into your .emacs file, and run the M-x insert-kbd-macro
command, then type the appropriate macro name at the prompt. Emacs will print out the
Elisp code to define the macro. You can then bind the name to a key using global-set-
key. For example, if we took the example macro from earlier and named it case-print,
we could insert it into our .emacs and bind it to C-c 1 (picked for no other reason than
that it is not already used for anything), and we would end up with:
(fset 'case-print
   "\C-[mcase \C-@\C-s,\C-[OD\C-[w\C-[OC\C-?: return \"\C-y\";")

(global-set-key (kbd "C-c 1") 'case-print)
Note that the contents of the fset are difficult to read and understand, but that
does not matter—it is just Emacs’s encoding of our macro, and we do not need to do
anything with it. The next time we start Emacs, we can hit C-c 1 to apply our
macro any time we want (if we want to use the keyboard shortcut in the current
session, we can run M-x global-set-key interactively—the .emacs file is only
processed when Emacs starts). We will also note that you can set a key locally
to a specific mode, but we will not go into that here.
You can also have Emacs perform some function (such as enabling a minor mode)
when it opens a buffer in a particular mode via a mode hook. You can use the add-hook
function to add mode hooks. This function takes two arguments, the mode to hook into
(what mode triggers the activity), and what to do. For example, you might add the
following two hooks to latex mode to enable flyspell mode and immediately spell check
the entire buffer whenever you open a buffer in LaTeX mode:
(add-hook 'latex-mode-hook 'flyspell-mode)
(add-hook 'latex-mode-hook 'flyspell-buffer)
Emacs can also load Elisp packages that can contain arbitrarily complex Elisp.
Typically if you get such a package, it will contain instructions for how to modify your
.emacs file to load it. We will also note that you can write your own Elisp to do pretty
much anything, but that is well beyond the scope of this introduction.
Figure C.1 shows a sample .emacs file to get started from.
; Recommended: show line numbers in modeline
(setq line-number-mode t)

; Personal Preference: show column numbers in modeline
(setq column-number-mode t)

; Definitely want syntax highlighting everywhere
(global-font-lock-mode t)
; Recommended: maximum coloration in syntax highlighting
(setq font-lock-maximum-decoration t)

; Personal preference: jump straight to first compiler error
(setq compilation-auto-jump-to-first-error t)

; Personal Preference: Automatically set compilation mode to
; move to an error when you move the point over it, rather than hitting RET
(add-hook 'compilation-mode-hook 'next-error-follow-minor-mode)

; I don't like it, but some people do: Puts line numbers down the left column
; (global-linum-mode 1)

; Recommended: how else do you know the sun is about to rise?
(display-time)

; Strongly recommended: highlights matching (), {}, and []
(show-paren-mode)

; If you have a version of Emacs older than 23,
; this is not the default, and you probably want it.
; Newer versions of Emacs use transient mark mode by default.
; (transient-mark-mode 1)

; Personal Preference: enables Semantic mode, with the context-
; sensitive completions discussed above; requires parsing the buffer
;(semantic-mode 1)

; Personal Preference: Semantic mode shows completions when you stop typing.
;(global-semantic-idle-completions-mode)

; Recommended if you write LaTeX: Automatically spell check LaTeX buffers
(add-hook 'latex-mode-hook 'flyspell-mode)
(add-hook 'latex-mode-hook 'flyspell-buffer)

; Recommended: Set C-c C-v to compile
(global-set-key (kbd "C-c C-v") 'compile)
Figure C.1: Sample .emacs file.
Appendix D
Other Important Tools
There are many tools every programmer should have a decent
familiarity with. Beyond your editor, you should be familiar
with a build system (for C/C++, this is typically make), a
debugger (e.g., GDB), and a revision control system (there are
many, but we discuss Git as it is quite popular). For C and C++,
you should also be familiar with Valgrind, which checks your
program for a variety of memory errors. Of course, the more
familiar with these tools you are, the better. However, at least
basic competency is an absolute requirement for professional
programming.
D.1 Build Tool: make
For small projects (i.e., those with one or two source files),
manually typing the gcc command to compile the files together
works reasonably well. However, for large projects, we
typically want to recompile only the files that have actually
changed, or those that depend on a file that has changed (e.g.,
those that #include a header file that has changed). We could
try to track this manually and only recompile those files;
however, this is tedious and error prone. Instead, we use make
to manage the task for us.
The input to make is a Makefile, which contains one or
more rules that specify how to produce a target from its
prerequisites (the files it depends on). A rule consists of
the target specification, followed by a colon and then a list of
the prerequisite files. After the list of prerequisites, there is a
newline, and then any commands required to rebuild that target
from the prerequisites. The commands may appear over
multiple lines; however, each line must begin with a TAB
character (multiple spaces will not work, and accidentally using
them instead of a TAB is often the source of problems with a
Makefile).
When you run make, you can specify a particular target to
build (if you do not specify one, make uses the first target in the
Makefile as the default target). To build the target, make will
first check if it is up to date. Checking if a target is up to date
first requires ensuring that each prerequisite is up to date by
potentially rebuilding it. This process bottoms out when make
encounters a file that exists, which is not itself the target of any
rules. Such a file is up to date.
Once all files that a target depends on are ensured to be up
to date, make checks if the target itself needs to be (re)built.
First, make checks if the target file exists. If not, it must be built.
If the target file already exists, make compares its last-modified
time (which is tracked by all major filesystems) to the last-
modified times of each of the prerequisites specified in the rule.
If any dependency is newer than the target file, then the target
is out of date, and must be rebuilt. Note that if any of the
prerequisites were rebuilt in this process, then that file will have
been modified more recently than the target, thus forcing the
target to be rebuilt.
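To see this procedure concretely, here is a small sketch you can run in a shell (assuming make is installed; cp stands in for a compiler, and all file names here are invented for the demo):

```shell
# Demonstrate make's timestamp-based out-of-date check.
set -e
dir=$(mktemp -d)
cd "$dir"
printf 'hello\n' > input.txt
# The recipe line must begin with a TAB, hence the \t.
printf 'output.txt: input.txt\n\tcp input.txt output.txt\n' > Makefile
make    # output.txt does not exist, so the cp runs
make    # target is newer than its prerequisite: "is up to date"
sleep 1           # ensure the next timestamp is strictly newer
touch input.txt   # input.txt is now newer than output.txt...
make              # ...so make runs the cp again
```

Deleting output.txt or touching input.txt both force the rule to run again; touching output.txt instead would make the target look up to date.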
To be a bit more concrete, let us look at a specific example
of a Makefile:
myProgram: oneFile.o anotherFile.o
	gcc -o myProgram oneFile.o anotherFile.o
oneFile.o: oneFile.c oneHeader.h someHeader.h
	gcc -std=gnu99 -pedantic -Wall -c oneFile.c
anotherFile.o: anotherFile.c anotherHeader.h someHeader.h
	gcc -std=gnu99 -pedantic -Wall -c anotherFile.c
In this Makefile there are three targets: myProgram,
oneFile.o, and anotherFile.o. If we just type make, then make
will attempt to (re)build myProgram, as that is the first target in
the file, and thus the default. This target depends on oneFile.o
and anotherFile.o, so the first thing make will do is make the
oneFile.o target (much as if we had typed make oneFile.o).
oneFile.o depends on one .c and two .h files, none of
which are targets of other rules. Therefore, make does not need
to rebuild them. If they do not already exist, make will give an
error like this:
make: *** No rule to make target ’oneHeader.h’,
needed by ’oneFile.o’. Stop.
Assuming that all three of these .c/.h files exist, make will see
if oneFile.o does not exist, or if any of those three files are
newer than it. If so, then make will rebuild the file by running
the specified GCC command. If oneFile.o already exists and
is newer than the relevant source files, then nothing needs to be
done to build it.
After processing oneFile.o, make does a similar process
for anotherFile.o. After that completes, it checks if it needs to
build myProgram (that is, if either myProgram does not exist, or
either of the object files that it depends on are newer than it). If
so, it runs the specified GCC command (which will link the
object files and produce the binary called myProgram). If not, it
will produce the message:
make: ’myProgram’ is up to date.
Observe that, because of the way this procedure works, if
you were to change code in oneFile.c, then only one of the
two object files would be recompiled (oneFile.o), and then the
program would be re-linked. The other object file
(anotherFile.o) would not need to be recompiled. While this
may seem like an insignificant difference for two files, if the
project had 50 files, compiled with heavy optimizations, the
difference between recompiling all 50, and only recompiling
one would be a noticeable amount of time.
D.1.1 Variables
The way we have written our example Makefile has a lot of
copying and pasting—something we want to avoid in anything
we do. In particular, we might notice that we have the same
compiler options in many places. If we wanted to change these
options (e.g., to turn on optimizations, or add a debugging
flag), we would have to do it in every place. Instead, we would
prefer to put the compiler options in a variable, and use that
variable in each of the relevant rules. For example, we might
change our previous example to the following:
CFLAGS=-std=gnu99 -pedantic -Wall
myProgram: oneFile.o anotherFile.o
	gcc -o myProgram oneFile.o anotherFile.o
oneFile.o: oneFile.c oneHeader.h someHeader.h
	gcc $(CFLAGS) -c oneFile.c
anotherFile.o: anotherFile.c anotherHeader.h someHeader.h
	gcc $(CFLAGS) -c anotherFile.c
Here, we define a variable CFLAGS, which we set equal to
our desired compilation flags. We then use that variable by
putting its name inside of $() in the rules. Note that changes to
the Makefile do not automatically outdate targets that use
these variables, so those targets will not necessarily be rebuilt
with the new flags if you just type make after making the change
(although you could make them all depend on the Makefile).
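A quick way to convince yourself of how variable substitution behaves is the following runnable sketch (assuming make is installed; the variable and target names are made up for the demo):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
# GREETING is expanded by make before the recipe runs;
# the leading @ keeps make from echoing the command itself.
printf 'GREETING=hello from make\nshow:\n\t@echo $(GREETING)\n' > Makefile
make show                      # prints: hello from make
make show GREETING=overridden  # prints: overridden
```

The second invocation shows that a variable defined on the command line overrides the definition in the Makefile, which is a handy way to try out different flags without editing the file.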
D.1.2 Clean
A common target to put in a Makefile is a clean target. The
clean target is a bit different in that it does not actually create a
file called clean (it is therefore called a “phony” target).
Instead, it is a target intended to remove the compiled program,
all object files, all editor backups (*.c~ *.h~), and any other
files that you might consider to be clutter. This target gets used
to either force the entire program to be rebuilt (e.g., after you
change various compilation flags in the Makefile), or if you
just need to clean up the directory, leaving only the source files
(e.g., if you are going to zip or tar up the source files to
distribute them to someone).
We might add a clean target to our Makefile as follows:
.PHONY: clean
clean:
	rm -f myProgram *.o *.c~ *.h~
Note that the .PHONY: clean tells make that clean is a
phony target—we do not actually expect it to create a file called
“clean”, nor would we want to consider it up to date and skip
its commands if a file called “clean” already existed (as there
are no prerequisites, it would be considered up to date if it
existed). If we wanted other phony targets, we would list them
all as if they were prerequisites of the .PHONY target (e.g.
.PHONY: clean depend).
In general, you should add a clean target to your
Makefiles, as most people will expect one to be present, and it
can be quite useful.
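The effect of marking clean as phony can be checked directly (a sketch assuming make is installed; the stray file is created just for the demonstration):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
printf '.PHONY: clean\nclean:\n\trm -f *.o\n' > Makefile
touch a.o b.o
touch clean   # a stray file literally named "clean"
make clean    # still runs: .PHONY keeps make from treating the file as the target
```

Without the .PHONY line, make would report that ‘clean’ is up to date and skip the rm entirely, because a file named clean exists and the target has no prerequisites.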
D.1.3 Generic Rules
Our example Makefile improved slightly when we used a
variable to hold the compilation flags. However, our Makefile
still suffers from a lot of repetition, and would be a pain to
maintain if we had more than a few source files. If you look at
what we wrote, we are doing pretty much the same thing to
compile each of our .c source files into an object file.
Whenever we find ourselves repeating the same steps, there
should be a better way.
In make, we can write generic rules. A generic rule lets us
specify that we want to be able to build (something).o from
(something).c, where we represent the something with a
percent-sign (%). As a first step, we might write (note that #
indicates a comment to the end of the line):
# A good start, but we lost the dependencies on the header files
CFLAGS=-std=gnu99 -pedantic -Wall
myProgram: oneFile.o anotherFile.o
	gcc -o myProgram oneFile.o anotherFile.o
%.o: %.c
	gcc $(CFLAGS) -c $<
.PHONY: clean
clean:
	rm -f myProgram *.o *.c~ *.h~
Here, we have replaced the two rules we had for each
object file with one generic rule. It specifies how to make a file
ending with .o from a file of the same name, except with .c
instead of .o. In this rule, we cannot write the literal name of
the source file, as it changes for each instance of the rule.
Instead, we have to use the special built-in variable $<, which
make will set to the name of the first prerequisite of the rule (in
this case, the name of the .c file).
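Here is a runnable sketch of such a generic rule (assuming make is installed; cp stands in for the gcc command so no compiler is needed, and the file names are invented):

```shell
set -e
dir=$(mktemp -d)
cd "$dir"
printf 'int x;\n' > one.c
printf 'int y;\n' > two.c
# %% in the printf format yields a literal % in the Makefile.
# In the generic rule, $< is the first prerequisite and $@ is the target.
printf 'prog: one.o two.o\n\tcat one.o two.o > prog\n%%.o: %%.c\n\tcp $< $@\n' > Makefile
make   # the single %.o: %.c rule builds both one.o and two.o, then prog
```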
However, we have introduced a significant problem now.
We have made it so that our object files no longer depend on
the relevant header files. If we were to change a header file,
then make might not rebuild all of the relevant object files. Such
a mistake can cause strange and confusing bugs, as one object
file may expect data in an old layout but the code will now be
passed data in a different layout. We could make every object
file depend on every header file (by writing %.o : %.c *.h);
however, this approach is overkill—we would certainly rebuild
everything we need to when we change a header file, but only
because we would rebuild every object file, even when just a
few actually need it.
We can fix this in a better way by adding the extra
dependency information to our Makefile:
# This fixes the problem
CFLAGS=-std=gnu99 -pedantic -Wall
myProgram: oneFile.o anotherFile.o
	gcc -o myProgram oneFile.o anotherFile.o
%.o: %.c
	gcc $(CFLAGS) -c $<
.PHONY: clean
clean:
	rm -f myProgram *.o *.c~ *.h~
oneFile.o: oneHeader.h someHeader.h
anotherFile.o: anotherHeader.h someHeader.h
Here, we still have the generic rule, but we also have
specified the additional dependencies separately. Even though it
looks like we have two rules, make understands that we are just
providing additional dependence information because we have
not specified any commands. If we did specify commands in
them, they would supersede the generic rules for those targets.
Managing all of this dependency information by hand
would, of course, be tedious and error-prone. The programmer
would have to figure out every file that is transitively included
by each source file, and keep the information up to date as the
code changes. Instead, there is a tool called makedepend, which
will edit the Makefile to put all of this information at the end.
In its simplest usage, makedepend takes as arguments all of the
source files (i.e., all of the .c and/or .cpp files), and edits the
Makefile. It can also be given a variety of options, such as -I
path to tell it to look for include files in the specified path. See
man makedepend for more details.
D.1.4 Built-in Generic Rules
Some generic rules are so common that they are built into make,
and we do not even have to write them. As you may suspect,
building a .o file from a similarly named .c file is quite
common, as it is what C programmers do most often.
Accordingly, we do not even need to explicitly write our %.o:
%.c rule if we are happy with the built-in generic rule for this
pattern.
We can see all of the rules (including both those that
are built-in and those that are specified by the Makefile) by
using make -p. Doing so also builds the default target as
usual—if we want to avoid building anything, we can do
make -p -f /dev/null to use the special file /dev/null as our Makefile
(reading from /dev/null results in end-of-file right away, so
the result will be a Makefile with no rules, thus it will not do
anything).
If we use make -p to explore the built-in rules for building
.o files from .c files, we will find:
%.o: %.c
#  commands to execute (built-in):
	$(COMPILE.c) $(OUTPUT_OPTION) $<
Understanding this rule requires us to look at the definitions of
COMPILE.c and OUTPUT_OPTION, which are also included in the
output of make -p:
COMPILE.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
OUTPUT_OPTION = -o $@
By default, CFLAGS (flags for the C-compiler) and
CPPFLAGS (flags for the C preprocessor), as well as
TARGET_ARCH (flags to specify what architecture to target) are
empty. By default CC (the C-compiler command) is cc (which
may or may not be GCC depending on how our system is
configured). The defaults for any of these variables (or any
other variables) can be overridden by specifying their values in
our Makefile. Note that $@ in OUTPUT_OPTION is a special
variable holding the name of the current target (much like $<
holds the name of the first prerequisite).
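You can confirm these definitions on your own system (assuming GNU make) by filtering its rule database; /dev/null serves as an empty Makefile so nothing gets built:

```shell
# Dump make's internal database and pull out the two variables.
# make itself exits nonzero ("No targets"), which the pipeline ignores.
make -p -f /dev/null 2>/dev/null | grep '^COMPILE.c'
make -p -f /dev/null 2>/dev/null | grep '^OUTPUT_OPTION'
```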
All of that may sound a bit complex but it basically boils
down to the fact that the default rule is
cc -c -o something.o something.c
and we can override the specifics to get the behavior we want,
while still using the default rule. That is, we might use the
following Makefile:
CC = gcc
CFLAGS = -std=gnu99 -pedantic -Wall
myProgram: oneFile.o anotherFile.o
	gcc -o myProgram oneFile.o anotherFile.o
.PHONY: clean depend
clean:
	rm -f myProgram *.o *.c~ *.h~
depend:
	makedepend anotherFile.c oneFile.c
# DO NOT DELETE

anotherFile.o: anotherHeader.h someHeader.h
oneFile.o: oneHeader.h someHeader.h
Here, we have specified that we want to use GCC as the
C-compiler (CC), and specified the CFLAGS that we want. Now,
when we try to compile an object file from a C file, the default
rule will result in
gcc -std=gnu99 -pedantic -Wall -c -o something.o
something.c
You should also note that we have added another phony
target, depend, which runs makedepend with the two C source
files that we are working with. The # DO NOT DELETE line and everything below it are what makedepend added to our Makefile when we ran make depend. Note that if we re-run makedepend (preferably via make depend), it will look for this line to tell where to delete the old dependency information and add its new information.
D.1.5 Built-in Functions
Our Makefile is looking more like something we could use in a
large project, but we have still manually listed our source and
object files in a couple places. If we were to add a new source
file, but forget to update the makedepend command line, we
would not end up with the right dependencies for that file when
we run make depend. Likewise, we might forget to add object
files in the correct places (e.g., if we add it to the compilation
command line, but not the dependencies for the entire program,
we may not rebuild that object file when needed).
We can fix these problems by using some of make’s built-
in functions to automatically compute the set of .c files in the
current directory, and then to generate the list of target object
files from that list. The syntax of function calls in make is
$(functionName arg1, arg2, arg3). We can use the
$(wildcard pattern) function to generate the list of .c files in
the current directory: SRCS = $(wildcard *.c). Then we can
use the $(patsubst pattern, replacement, text) function
to replace the .c endings with .o endings: OBJS = $(patsubst
%.c, %.o, $(SRCS)). Once we have done this, we can use
$(SRCS) and $(OBJS) in our Makefile:
1 CC = gcc
2 CFLAGS = -std=gnu99 -pedantic -Wall
3 SRCS=$(wildcard *.c)
4 OBJS=$(patsubst %.c,%.o,$(SRCS))
5 myProgram: $(OBJS)
6 gcc -o $@ $(OBJS)
7 .PHONY: clean depend
8 clean:
9 rm -f myProgram *.o *.c~ *.h~
10 depend:
11 makedepend $(SRCS)
12 # DO NOT DELETE
13
14 anotherFile.o: anotherHeader.h someHeader.h
15 oneFile.o: oneHeader.h someHeader.h
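To sanity-check what these variables expand to, GNU make's built-in $(info ...) function prints its argument when the Makefile is read, before any target is built. A minimal sketch (the $(info ...) lines are purely diagnostic and can be deleted once you are satisfied):

```make
SRCS = $(wildcard *.c)
OBJS = $(patsubst %.c,%.o,$(SRCS))
# Printed when make reads the Makefile, before building anything:
$(info SRCS is $(SRCS))
$(info OBJS is $(OBJS))
```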
At this point, we have a Makefile that we could use on a
large-scale project. The only thing we need to do when we add
source files or include new header files in existing source files
is run make depend to update the dependency information.
Other than that, we can build our project with make, which will
only recompile the required files.
We could, however, be a little bit fancier. In a real project,
we likely want to build a debug version of our code (with no
optimizations, and -ggdb3 to turn on debugging information—
see Section D.2 for more info about debugging), and an
optimized version of our code that will run faster (where the
compiler works hard to improve the instructions that it generates, but those transformations generally make debugging quite difficult). We could change our CFLAGS back and forth
between flags for debugging and flags for optimization, and
remember to make clean each time we switch. However, we
can also just set our Makefile up to build both debug and
optimized object files and binaries with different names:
1 CC = gcc
2 CFLAGS = -std=gnu99 -pedantic -Wall -O3
3 DBGFLAGS = -std=gnu99 -pedantic -Wall -ggdb3 -DDEBUG
4 SRCS=$(wildcard *.c)
5 OBJS=$(patsubst %.c,%.o,$(SRCS))
6 DBGOBJS=$(patsubst %.c,%.dbg.o,$(SRCS))
7
8 .PHONY: clean depend all
9
10 all: myProgram myProgram-debug
11
12 myProgram: $(OBJS)
13 gcc -o $@ -O3 $(OBJS)
14 myProgram-debug: $(DBGOBJS)
15 gcc -o $@ -ggdb3 $(DBGOBJS)
16 %.dbg.o: %.c
17 gcc $(DBGFLAGS) -c -o $@ $<
18
19 clean:
20 rm -f myProgram myProgram-debug *.o *.c~ *.h~
21 depend:
22 makedepend $(SRCS)
23 makedepend -a -o .dbg.o $(SRCS)
24 # DO NOT DELETE
25
26 anotherFile.o: anotherHeader.h someHeader.h
27 oneFile.o: oneHeader.h someHeader.h
Now, if we make, we get both myProgram (the optimized
version), and myProgram-debug (which is compiled for
debugging).
D.1.6 Parallelizing Compilation
One useful feature of make, especially on modern multi-core systems, is the ability to have it run independent tasks in parallel. If you give make the -j option, it runs as many tasks in parallel as it can. You may wish to ask it to limit
the number of parallel tasks to a particular number at any given
time, which you can do by specifying that number as an
argument to the -j option (e.g., make -j8 runs up to 8 tasks in
parallel). On large projects, this may make a significant
difference in how long a build takes.
D.1.7 …And Much More
You can use make for much more beyond just compiling C
programs. In fact, you can use make for pretty much any task
that you can describe in terms of creating targets from the
prerequisites that they depend on. For most such tasks, you can
put the parallelization capabilities of make to good use to speed
up the task.
We have given you enough of an introduction to make to
write a basic Makefile. However, as with many topics, we have
barely scratched the surface. You can find out a lot more about
make by reading the online manual:
https://www.gnu.org/software/make/manual/.
D.2 Debugger: GDB
Once you have implemented your algorithm in code, you must
test it. Testing often reveals that your code has mistakes
(“bugs”), which you must then fix. As we discuss in Section
6.2, debugging is the process of identifying precisely what is
wrong with your code and fixing it. This process should be
done scientifically (rather than by changing things in an ad hoc
fashion and hoping something works), and a key component of
that scientific process is gathering more information.
A key tool in the debugging process is a piece of software
called a debugger, although this name is a bit misleading, as the
program does not actually debug your code for you, it just helps
you gather information about what is going on in your code.
Here, we will introduce you to a particular debugger, GDB, the GNU DeBugger. Becoming proficient with the debugger is
highly recommended, as it will drastically reduce the amount of
time you spend trying to fix bugs—often one of the largest
sources of effort and frustration for novices, as they tend to
both make a lot of mistakes and struggle to fix them.
D.2.1 Getting Started
The first step in using GDB (or most any other debugging tool)
is to compile the code with debugging symbols—extra
information to help a debugging tool understand what the
layout of the code and data in memory is—included in the
binary. The -g option to GCC requests that it include this
debugging information, but if you are using GDB in particular,
you should use -ggdb3, which requests the maximum amount
of debug information (e.g., it will include information about
preprocessor macro definitions) in a GDB-specific format. Note
that if you compile your program in multiple steps (object files,
then linking), you should include -ggdb3 at all steps.
Once you have your program compiled with debugging
symbols, you need to run GDB. You can run GDB directly from
the command line; however, it is much better to run it from
inside of Emacs.4 To run GDB inside Emacs, use the command
M-x gdb (that is either ESC x or ALT-x depending on your
keyboard setup, then type gdb, and hit enter). At this point,
Emacs should prompt you for how you want to run GDB (“Run
gdb (like this):”), and provide a proposed command line.
Typically the options that emacs proposes are what you want;
however, you may want to change the name of the program to
debug (specified as the last argument on the command line).
Once you are happy with the command line, hit enter to start
GDB.
At this point, you will end up with a buffer titled *gdb*, or
*gdb-prog* (where prog is the name of the program). The stars
in the buffer name indicate that the buffer corresponds to
interaction with a process, not a file on disk. This buffer should
contain some output from GDB, which tells you its version,
some information about where to find the manual, and a
message about the “help” command. The last lines of output
should be (replace “yourProgram” with the name of the
program you are debugging):
Reading symbols from yourProgram...done.
(gdb)
Note that if you instead get the following, it indicates that you
did not compile with debugging symbols (in which case, you
should recompile with debugging symbols):
Reading symbols from yourProgram...(no debugging
symbols found)...done.
(gdb)
Also, this message indicates that your requested program does
not exist in the current directory:
yourProgram: No such file or directory.
(gdb)
Note that the (gdb) at the start of the last line of the output
is GDB’s command prompt. Whenever it displays this prompt,
it is ready to accept a command from you. The first commands
we are going to learn are:
start Begin (or restart) the program’s execution, and stop it (to accept more commands) as soon as execution enters main.
run This command runs the program (possibly
restarting it). It will not stop unless it encounters
some other condition that causes it to stop (we will
learn about these later).
step Advance the program one “step”, in much the
same way that we would advance the execution
arrow when executing code by hand. More
specifically, GDB will execute until the execution
arrow moves to a different line of source code,
whether that is by going to the next line, jumping
in response to control flow, or some other reason.
In particular, step will go into a function called by
the current line. This command can be abbreviated
s.
next Advance the program one line of code.
However, unlike step, if the current line of code is
a function call, GDB will execute the entire called
function without stopping. This command can be
abbreviated n.
print The print command takes an expression as
an argument, evaluates that expression, and prints
the results. Note that if the expression has side-
effects, they will happen, and will affect the state
of the program (e.g., if you do print x = 3, it will
set x to 3, then print 3). You can put /x after print
to get the result printed in hexadecimal format.
This command can be abbreviated p (or p/x to
print in hex). Every time you print the value of an
expression, GDB will remember the value in its
own internal variables, which are named $1, $2,
etc. (you can tell which one it used, because it will
say which one it assigned to when it prints the
value—e.g., $1 = 42). You can use these $
variables in other expressions if you want to make
use of these values later. GDB also has a feature to
let you print multiple elements from an array—if
you put @number after an lvalue, GDB will print
number values starting at the location you named.
This feature is most useful with arrays—for
example, if a is an array, you can do p a[0]@5 to
print the first 5 elements of a.
display
The display command takes an expression as
an argument, and displays its value every time
GDB stops and displays a prompt. For example,
display i will evaluate and print i before each
(gdb) prompt. You can abbreviate this command
disp.
If you hit enter without entering any command, GDB will
repeat the last command you entered. This feature is most
useful when you want to use step or next multiple times in a
row.
Note that if you need to pass command line arguments to
your program, you can either write them after the start or run
command (e.g., run someArg anotherArg), or you can use set
args to set the command line arguments.
Video D.1: Example of basic GDB commands.
Video D.1 illustrates the use of the basic GDB commands
that we have discussed here.
D.2.2 Investigating the State of Your Program
One of the most useful features of a debugger is the ability to
investigate the state of your program. The print and display
commands that we have discussed so far provide a start, as they
allow you to see what value an expression evaluates to;
however, there is much more that you can do.
One useful feature is the ability to inspect the current set
of stack frames, and move up and down within them. The
backtrace command lists all of the stack frames (with the
current one on top, and main on the bottom), so that you can see
what function calls got you to the current place. The backtrace
also lists the line where each call was made (or where the
execution arrow is, for the current frame).
When you print expressions, GDB uses the variables in the
current scope. However, sometimes, you might want to inspect
variables in other frames further up the stack. You can instruct
GDB to select different frames with up and down, which move up and down the stack, respectively.
One particularly common use of up is when your program
stops in a failed assert. When this happens, GDB will stop
deep inside the C library, in the code that handles assert.
However, you will want to get back to your own code, which is
a few stack frames up. You can use up a few times until GDB
returns to a frame corresponding to your code.
You can also get information about various aspects of the
program with the info command, which has various
subcommands. For example, info frame will describe the
memory layout of the current frame, info types will describe
the types that are in the current program. There are a variety of info subcommands, but most of them are for more advanced uses—you can use help info to see them all.
D.2.3 Controlling Execution
The next and step commands give you the basic ability to
advance the state of the program; however, there are also more
advanced commands for controlling the execution. If we are
debugging a large, complex program, we may not want to step
through every line one-by-one to reach the point in the program
where we want to gather information.
One of the most useful ways to control the execution of
our program is to set a breakpoint on a particular line. A
breakpoint instructs GDB to stop execution whenever the
program reaches that particular line. You can set a breakpoint
with the break command, followed by either a line number or a
function name (meaning to set the breakpoint at the start of that
function). In emacs, you can also press C-x C-a C-b to set a
breakpoint at the point. It is also possible to set a breakpoint at
a particular memory address, although that is a more advanced
feature. When we set a breakpoint, GDB will assign it a
number, which we can use to identify it to other breakpoint-
related commands.
Once we have a breakpoint set, we can run the program
(or continue, if it is already started), and it will execute until
the breakpoint is encountered (or some other condition that
causes execution to stop). When the breakpoint is encountered,
GDB will return control to us at a (gdb) prompt, allowing us to
give it other commands—we might inspect the state of the
program, set more breakpoints, and continue.
By default, breakpoints are unconditional breakpoints—
GDB will stop the program and give you control any time it
reaches the appropriate line. Sometimes, however, we may
want to stop under a particular condition. For example, we may
have a for loop that executes 1,000,000 times, and we need
information from the iteration where i is 250,000. With an
unconditional breakpoint, the program would stop, and we
would need to continue many times before we got the
information we wanted. We can instead use a conditional breakpoint—one where we give GDB a C expression that it evaluates to determine whether it should give us control or let the program continue to run.
We can put a condition on a breakpoint when we create it
with the break command by writing if after the location,
followed by the conditional expression. We can also add a
condition later (or change an existing condition) with the cond
command. For example, if we want to make a breakpoint on line 7 with the condition i==250000, we could tell GDB:
(gdb) break 7 if i==250000
Alternatively, if the breakpoint already existed, for
example, as breakpoint 1, we could write
cond 1 i==250000
If we write a cond command with no expression, then it makes that breakpoint unconditional. We can also enable or
disable breakpoints (by their numeric id). A disabled
breakpoint still exists (and can be re-enabled later), but has no
effect—it will not cause the program to stop. We can also
delete a breakpoint by its numeric id. You can use the info
breakpoints command (which can be abbreviated i b) to see
the status of current breakpoints.
Two other useful commands to control the execution of the
program are until, which causes a loop to execute until it
finishes (GDB stops at the first line after the loop), and finish
(which can be abbreviated fin), which finishes the current
function—i.e., causes execution until the current function
returns.
D.2.4 Watchpoints
Another incredibly useful feature of GDB is a watchpoint—the
ability to have GDB stop when the value of a particular
expression changes. For example, we can write watch i, which
will cause GDB to stop whenever the value of i changes. When
GDB stops in response to a watchpoint, it will print the old
value of the expression and the new value.
Watchpoints can be a particularly powerful tool when you
have pointer-related problems, and values of variables are
changing through aliases. However, sometimes, the alias we
have when we set up the watchpoint may go out of scope before
the change we are interested in happens. For example, we may
want to watch *p, but p is a local variable, whose scope ends
before the value changes. Whenever we face such a problem,
we can print p, which will give us the pointer in a GDB
variable (e.g., $6), and then we can use that $-variable (which
never goes out of scope—it lives until we restart GDB) to set
our watchpoint: watch *$6.
D.2.5 Signals
Whenever your program receives a signal (recall from Section
11.4—a notification from the OS in response to certain kinds of
events), GDB will stop the program and give you control.
There are three particularly common signals that come up
during debugging. The first is SIGSEGV, which indicates a
segmentation fault. If your program is segfaulting, then just
running it in GDB can help you gather a lot of information
about what is happening. When the segfault happens, GDB will
stop, and your program will be on the line where the segfault
happened. You can then begin inspecting the state of the
program (printing out variables) to see what went wrong.
Another common signal is SIGABRT, which happens
when your program calls abort() or fails an assert. As with
segfaults, if your code is failing asserts, then running it in GDB
can be incredibly useful—you will get control of the program at
the point where assert causes the program to abort, and (after
going up a few frames back into your own code), see exactly
what was going on when the problem happened.
The other signal that is useful is SIGINT, which happens
when the program is interrupted—e.g., by the user pressing
Control-c (inside emacs, you have to press C-c C-c: Control-C
twice). If your program is getting stuck in an infinite loop, you
can run it in GDB, and then after a while, press Control-c. You
can then see where the program is, and what it is doing. You are
not guaranteed to be in the right place (you may interrupt the
program before it gets into the infinite loop), but if you wait
sufficiently long, you will typically end up where you want.
You can then see what is happening, and why you are not
getting the behavior you want.
Video D.2: Example of some more GDB
commands.
Video D.2 illustrates more GDB commands and concepts
that we have discussed, including interrupting the execution of
the program with C-c C-c, and failing an assert.
D.3 Memory Checker: Valgrind’s
Memcheck
The C compiler will give you warnings and errors for things that
it can tell are problematic. However, what the compiler can do
is limited to static analysis of the code (that is, things it can do
without running the code or knowing the inputs to the
program). Consequently, there are many types of problems that
the compiler cannot detect. These types of problems are
typically found by running the code on a test case where the
problem occurs.
However, just because the problem occurs does not mean
that it produces a useful symptom for the tester. Sometimes a
problem can occur with no observable result,5 with an
observable result that only manifests much later, or with an
observable result that is hard to trace to the actual cause. All of
these possibilities make testing and debugging more difficult.
Ideally, we would like to have any problem be
immediately detected and reported, with useful information
about what happened. When we just run a C program directly,
such things do not occur as the compiler does not insert any
extra checking or reporting for us. However, there are tools that
can perform additional checking as we run our program to help
us test and debug the program. One such tool is Valgrind, in
particular, its Memcheck tool.
Valgrind is actually a collection of tools, which are
designed so that more can be added if desired. However, we are
primarily interested in Memcheck, which is the default tool and
will do the checking that we require. Whenever you are testing
and debugging your code, you should run it in Valgrind (called
“valgrinding your program”). While valgrinding your program
is much slower than running the program directly, it will help
your testing and debugging immensely. You should fix any
errors that Valgrind reports, even if they do not seem to be a
problem.
To “valgrind your program”, run the valgrind command
and give it your program name as an argument (Memcheck is
the default tool). If your program takes command line
arguments, simply pass them as additional arguments after your
program’s name. For example, if you would normally run
./myProgram hello 42 instead run valgrind ./myProgram
hello 42. For the most benefits, you should compile
debugging information in your program (pass the -g or -ggdb3
options to GCC).
When you run Valgrind, you will get some output that
looks generally like this:
==11907== Memcheck, a memory error detector
==11907== Copyright (C) 2002-2013, and GNU GPL’d, by
Julian Seward et al.
==11907== Using Valgrind-3.10.0.SVN and LibVEX; rerun
with -h for copyright info
==11907== Command: ./myProgram hello 42
==11907==
==11907==
==11907== HEAP SUMMARY:
==11907== in use at exit: 0 bytes in 0 blocks
==11907== total heap usage: 2 allocs, 2 frees, 128
bytes allocated
==11907==
==11907== All heap blocks were freed -- no leaks are
possible
==11907==
==11907== For counts of detected and suppressed
errors, rerun with: -v
==11907== ERROR SUMMARY: 0 errors from 0 contexts
(suppressed: 0 from 0)
Each line that starts with ==11907== here is part of the
output of Valgrind. Note that when you run it, the number you
get will vary (it is the process id, so will change each time).
Valgrind prints a message telling you what it is doing
(including the command it is running). Then it runs your
program. In this case, the program did not produce any output
on stdout; however, if it did, that output would be interspersed
with Valgrind’s output. There are also no Valgrind errors—if
there were, they would be printed as they happen. At the end,
Valgrind gives a summary of our dynamic allocations (see
Chapter 12), and errors. The last line shows that we “valgrinded
cleanly”—we had no errors that Valgrind could detect. If we
did not valgrind cleanly, we should fix our program even if the
output appears to be correct.
Note that if our program has errors, Valgrind will report
them and keep running (until a certain error limit is reached).
Whenever you have multiple errors, you should start with the
first one, fix it, and then move to the next one. Much like
compiler errors, later problems may be the result of an earlier
error.
One common misconception among novice programmers is that they should run Valgrind only after they have debugged their program and otherwise think it works.6
However, you will find the debugging process much easier if
you use Valgrind’s Memcheck throughout your testing and
debugging process. You may find that the odd/confusing bug
you have not been able to figure out for hours is actually caused
by a subtle problem that you did not notice earlier, which
Valgrind could have found for you. Note that Valgrind does not
“play nice” with code that has been compiled with -
fsanitize=address, so you should compile without that option
to valgrind your code. Ensuring that both tools cannot find any
problems with your code is likely a great idea, as they can
detect different problems.
We will introduce the basics here, but recommend the Valgrind user’s manual for further information:
http://valgrind.org/docs/manual/manual.html.
D.3.1 Uninitialized Values
When we execute code by hand, we put a ? in a box if that
location has not been initialized. Similarly, when we execute by
hand, we consider it an error to use an uninitialized value. It is
indicative of a problem in our program, and we have no idea
what will actually happen when we run our code.
However, when you just run your program, there is always some value there (whatever happened to be in that memory location before), and there is no checking that the value is initialized. We will not know that we have used an uninitialized value, and if we get unlucky, our program will print the correct output anyway.
Use of uninitialized values is a great example of why
“silent” bugs are dangerous, and need to be fixed. We might
think our program is right when we test and debug it because
the value in that location is coincidentally correct. However,
some change in the execution of the program can cause that
“luck” to end, resulting in the program exhibiting an error.
The change that disrupts your “luck” could be due to a
variety of circumstances. For example, changing the input to the
program could cause a different execution path leading up to
the use of the uninitialized value, thus causing it to have a
different value in a variety of different possible ways. Another
possibility is if you compile your code with optimizations
versus for debugging. When the compiler optimizes your code,
it may place different variables in different locations relative to
the debugging version. Such a bug can therefore lead to the
annoying situation where the debug version of your program
appears to work, but the optimized version does not (even on
the same inputs). We have even seen students experience problems where an uninitialized location read different values depending on whether the output was redirected to a file or printed directly to the terminal.
Valgrind’s Memcheck tool explicitly tracks the validity of
every bit in the program and can tell us about the use of
uninitialized values. By default, Memcheck will tell you when
you use an uninitialized value; however, these errors are limited
to certain uses. If x is uninitialized, and you do y = x,
Memcheck will not report an error, but rather just note that y
now holds an uninitialized value. In fact, even if we compute
y = x + 3, we will still not get an error immediately.
However, if we use an uninitialized location (or one
holding a value that was computed from an uninitialized
location) in certain ways that Memcheck considers to affect the
behavior of the program, it will give us an error. One such case
is when the control flow of the program depends on the
uninitialized value—when it appears in the conditional
expression of an if, or a loop, or in the selection expression of
a switch statement. In such a situation, Memcheck will
produce the error:
Conditional jump or move depends on uninitialised
value(s)
and produce a call-stack trace showing where that use occurred.
For example, if we write the function:
1 void f(int x) {
2 int y;
3 int z = x + y;
4 printf("%d\n", z);
5 }
Valgrind’s Memcheck will report the following error (this
code is inside of uninit.c, which has other lines before and
after those shown above):
==12241== Conditional jump or move depends on
uninitialised value(s)
==12241== at 0x4E8158E: vfprintf (vfprintf.c:1660)
==12241== by 0x4E8B498: printf (printf.c:33)
==12241== by 0x400556: f (uninit.c:7)
==12241== by 0x400580: main (uninit.c:15)
This error indicates that the uninitialized value was used
inside of vfprintf, which was called by printf, which was
called by f (on line 7), which was called by main (on line 15).
We may be able to fix the program directly by observing the
call to printf on line 7 and seeing what we did wrong.
However, if we do not see the problem right off (which
could be likely if the uninitialized value has been passed
through a variety of function parameters and data structures
before Memcheck reports the error), we need more help from
Memcheck. In fact, when Memcheck reports such errors, it will
helpfully suggest the option we need to get more information
from it at the end of its output:
==12241== Use --track-origins=yes to see where
uninitialised values come from
If we run Valgrind again passing in this option (valgrind
--track-origins=yes ./myProgram), it will report where the
uninitialized value was created when it reports the error:
==12260== Conditional jump or move depends on
uninitialised value(s)
==12260== at 0x4E8158E: vfprintf (vfprintf.c:1660)
==12260== by 0x4E8B498: printf (printf.c:33)
==12260== by 0x400556: f (uninit.c:7)
==12260== by 0x400580: main (uninit.c:15)
==12260== Uninitialised value was created by a stack
allocation
==12260== at 0x40052D: f (uninit.c:4)
We can now see that the value was created by stack
allocation (meaning allocating a frame for a function—see
Section 8.4.1 for more about the layout of memory), and we
have the particular line of code that caused that creation
( int y; inside of f).
There are other cases where a use of an uninitialized value
will result in a message such as:
==12235== Use of uninitialised value of size 8
Here Memcheck is telling us that we used an uninitialized
value in a way that it considered problematic, and that the value
we used was 8 bytes in size (that is, how many bytes of
memory it was accessing). If our uninitialized value is passed to
a system call (see Section 11.1), we will get an error message
that looks like this:
==12362== Syscall param write(fd) contains
uninitialised byte(s)
All of these indicate the same fundamental problem: we
have a value that we did not initialize. We need to find it
(probably by using --track-origins=yes) and properly
initialize it.
D.3.2 Invalid Reads/Writes
Valgrind’s Memcheck tool will also perform stricter checking
of memory accesses (e.g. via pointers) than normally occurs
when you run your program.7 In particular, Memcheck will
track whether or not every address is valid at any particular
time (as well as whether or not that address contains initialized
data). Any access to an invalid address will result in an error for
Memcheck.
For example, suppose we wrote the following (broken)
code:
1 //horribly broken---returns a dangling pointer!
2 int * makeArray(size_t sz) {
3 int data[sz];
4 for (size_t i = 0; i < sz; i++) {
5 data[i] = i;
6 }
7 return data;
8 }
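One way to fix makeArray (a sketch; there are other designs) is to heap-allocate the array so that it outlives the call. The caller now owns the memory and must free() it.

```c
#include <stdlib.h>

/* Fixed: the array lives on the heap, so the returned pointer is not
   dangling.  Ownership transfers to the caller, who must free() it. */
int * makeArray(size_t sz) {
    int * data = malloc(sz * sizeof(*data));
    if (data == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < sz; i++) {
        data[i] = i;
    }
    return data;
}
```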
Depending on what we do with the return result of this
function, our code might either appear to work, or give us
rather strange errors. If we read the array in the calling function
(e.g., main), then we might not immediately observe anything
bad (however, if we call other functions, the values in the array
might “mysteriously” change). If we ran such code in Valgrind, we might get an error such as this:
==24640== Invalid read of size 4
==24640== at 0x40060C: main (dangling.c:16)
==24640== Address 0xfff000340 is just below the
stack ptr.
Here, Memcheck is telling us that we tried to read 4 bytes
from an invalid (currently unallocated) memory location. It
gives us a call stack trace for where the invalid read occurred
(in this case, it was in main on line 16 of the code, which is not
shown here). Memcheck tells us what address experienced the
problem, and gives us the most information it can about where
that address is relative to valid regions of memory. In this
particular case, the address is just below the stack pointer,
meaning that it is in the frame of a function that recently
returned. If we had written to the address instead, we would get
a message about an “Invalid write of size X.”
Note that Memcheck cannot detect all memory related
errors (even though it can detect many that will slip through
otherwise). If another function were called (which would
allocate a frame in the same address range), the memory would
again become valid, and Memcheck would be unable to tell that
accesses to it through this pointer are not correct. Likewise,
Memcheck may not be able to detect an array-out-of-bounds
error because the memory location that is improperly accessed
may still be a valid address for the program to access (e.g., part
of some other variable).
Note that using -fsanitize=address can find a lot of
problems of this type that Memcheck cannot. The reason is that
-fsanitize=address forces extra unused locations between
variables, and marks them unreadable with the validity bits that
it uses. Because there is now invalid space between the
variables, the checks inserted by -fsanitize=address will
detect accesses in between them, such as going out of the
bounds of one array.
D.3.3 Valgrind with GDB
We may have a problem that appears on a particular line of
code, but only under specific circumstances that do not
manifest the first several times that line of code is encountered.
When such a situation occurs, we may find it more difficult to
debug the code, as simply placing a breakpoint on the offending
line may not give us enough information—we might not know
when we reach that breakpoint under the right conditions. What
we would like is the ability to run GDB and Valgrind together,
and have Valgrind tell GDB when it encounters an error, giving
control to GDB.
Fortunately, we can do exactly that. If we run Valgrind
with the options --vgdb=full --vgdb-error=0, then Valgrind
will stop on the first error that it encounters and give control to
GDB. Some coordination is required to get GDB connected to
Valgrind (they run as separate processes); however, when run
with those options, Valgrind will give us the information we
need to pass to GDB to make this happen:
==24099== (action at startup) vgdb me ...
==24099==
==24099== TO DEBUG THIS PROCESS USING GDB: start GDB
like this
==24099== /path/to/gdb ./a.out
==24099== and then give GDB the following command
==24099== target remote |
/usr/lib/valgrind/../../bin/vgdb --pid=24099
==24099== --pid is optional if only one valgrind
process is running
At this point, Valgrind has started the program, but not yet
entered main—it is waiting for you to start GDB and connect it
to Valgrind. We can do so by running GDB (in a separate
terminal, or emacs buffer) and then copying and pasting the
target command that Valgrind gave us into GDB’s command
prompt:
(gdb) target remote |
/usr/lib/valgrind/../../bin/vgdb --pid=24099
Remote debugging using |
/usr/lib/valgrind/../../bin/vgdb --pid=24099
relaying data between gdb and process 24099
Reading symbols from /lib64/ld-linux-x86-
64.so.2...Reading symbols from ...
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
0x00000000040012d0 in _start () from /lib64/ld-linux-
x86-64.so.2
(gdb)
At this point, we can give GDB any commands we want.
Most often, we will want to give GDB the command continue,
which will let it run until Valgrind encounters an error. At that
point, Valgrind will interrupt the program and return control to
GDB. You can now give GDB whatever commands you want
so that you can investigate the state of your program.
The combination of Valgrind and GDB is quite powerful,
and gives you the ability to run some new commands, via the
monitor command. For example, if we are trying to debug
pointer-related errors and want to know what variables still
point at a particular memory location, we can use the
monitor who_points_at command:
(gdb) monitor who_points_at 0x51fc040
==24303== Searching for pointers to 0x51fc040
==24303== *0xfff000450 points at 0x51fc040
==24303== Location 0xfff000450 is 0 bytes inside
local var "p"
==24303== declared at example.c:6, in frame #0 of
thread 1
There are many other monitor commands available for
Memcheck. See http://valgrind.org/docs/manual/mc-
manual.html#mc-manual.monitor-commands for more
information about available monitor commands, and their
arguments.
D.3.4 Dynamic Allocation Issues
Valgrind’s Memcheck tool is quite useful for finding problems
related to dynamic allocation—whether malloc and free in C,
or new/new[], and delete/delete[] in C++. As with other
regions of memory, Memcheck will explicitly track which
addresses are valid and which are not. It also tracks exactly
what pointers were returned by malloc, new, and new[], and
places some invalid space on each side of the allocated block.
From all this information, Memcheck can report a wide
variety of problems. First, if an access goes just past the end of
a dynamically allocated array, Memcheck can detect this
problem. Second, Memcheck can detect double freeing
pointers, freeing the incorrect pointer, and mismatches between
allocations and deallocations (e.g., deleting memory
allocated with malloc, or mixing up delete[] with delete).
For example, suppose we wrote the following (obviously
buggy) code:
1 int * ptr = malloc(sizeof(int));
2 ptr[1] = 3;
If we run this inside of Memcheck, it reports the following
error:
==5465== Invalid write of size 4
==5465== at 0x40054B: main (outOfBounds.c:8)
==5465== Address 0x51fc044 is 0 bytes after a block
of size 4 alloc’d
==5465== at 0x4C2AB80: malloc (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5465== by 0x40053E: main (outOfBounds.c:7)
The error first tells us the problem (we made an invalid
write of 4 bytes), with a stack trace indicating where that
invalid write happened (in main, on line 8 of the file
outOfBounds.c). The second part tells us what invalid address
our program tried to access, and the nearest valid location. In
this case, it reports it as “0 bytes after” (meaning in the first
invalid byte past a valid region) “a block of size 4” (meaning
how much space was allocated into that valid region).
Memcheck then reports where that valid region of memory was
allocated (in main on line 7, by calling malloc).
Note that if we recently freed a block of memory,
Memcheck will report proximity to that block of memory, even
though it is no longer valid. For example, if we write:
1 int * ptr = malloc(sizeof(int));
2 free(ptr);
3 ptr[0] = 3;
Then Memcheck will report the invalid write as being
inside of the freed block (and tell us where we freed the block):
==5486== Invalid write of size 4
==5486== at 0x4005A3: main (outOfBounds2.c:9)
==5486== Address 0x51fc040 is 0 bytes inside a block
of size 4 free’d
==5486== at 0x4C2BDEC: free (in
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==5486== by 0x40059E: main (outOfBounds2.c:8)
Valgrind’s Memcheck will also check for memory leaks.
However, by default it only reports a summary of the leaks,
which is not useful for finding and fixing the problems. If you
have memory leaks, you will want to run with the --leak-
check=full option. When you do so, Memcheck will report the
location of each allocation that was not freed. You can then use
this information to figure out where you should free that
memory.
Note that when running Valgrind’s Memcheck with GDB,
you can run the leak checker at any time with the monitor
command monitor leak_check full reachable any.
D.3.5 memcheck.h
Sometimes we may want to interact with Valgrind’s tools
directly in our program. For example, we might want to
explicitly check if a value is initialized at a certain point in the
program (e.g., as part of debugging an error about uninitialized
values). Valgrind provides header files, such as memcheck.h,
which contains a variety of macros for exactly this purpose. For
example, we could change the function we were using earlier as
an example of uninitialized values to
1 void f(int x) {
2 int y;
3 int z = x + y;
4 VALGRIND_CHECK_MEM_IS_DEFINED(&z,sizeof(z));
5 printf("%d\n", z);
6 }
Now, when we run this program in Valgrind, we get the error
message more immediately:
==12425== Uninitialised byte(s) found during client
check request
==12425== at 0x4007C9: f (uninit4.c:8)
==12425== by 0x400811: main (uninit4.c:17)
==12425== Address 0xfff000410 is on thread 1’s stack
==12425== Uninitialised value was created by a stack
allocation
==12425== at 0x400765: f (uninit4.c:5)
Many of Memcheck’s features are available through these
macros. Most other tools have similar header files for programs
to interact directly with them. See
http://valgrind.org/docs/manual/mc-manual.html#mc-
manual.clientreqs for more details.
D.3.6 Other Valgrind Tools
Memcheck is not the only tool in Valgrind—although it is one
of the most commonly used ones, and is what many people
think of when they hear Valgrind. As you become a more
advanced programmer, you may find it useful to put some of
the other tools to use. We will not delve into them too deeply,
but will note a couple that exist, so that you can know to
explore them further as you need to.
If you have read Chapter 28, you will know that there are a
variety of issues that arise in multi-threaded programming, such
as deadlocks due to improper ordering of acquiring locks, data
races, or improper use of synchronization primitives. You
should also know that these can be difficult to find due to the
non-deterministic behavior of multi-threaded executions—a
bug may manifest one time, but then not show up in the next
hundred times you run the program. The Valgrind tool Helgrind
is designed to check for a variety of errors related to multi-
threaded programming. See
http://valgrind.org/docs/manual/hg-manual.html for
more details on Helgrind.
Valgrind’s tools are not limited to helping you find
correctness bugs in your program, but also can be helpful in
understanding performance and memory usage issues. For example,
the Callgrind tool gives information about the performance
characteristics of a program based on Valgrind’s simulation of
hardware resources as it executes the program. Another tool is
Massif, which profiles the dynamic memory allocations in the
heap, and gives information about how much memory is
allocated at any given time and where in the code the memory
was allocated.
See http://valgrind.org/info/tools.html for an
overview of the tools. Also note that many of these other tools
have header files that you can include with macros that allow
your program to interact directly with Valgrind (similarly to the
functionality in memcheck.h).
D.4 Revision Control: Git
While it may not seem important in the early stages of
programming, revision control—using software to track
multiple versions of your code, and let multiple developers
collaborate easily—is a crucial part of professional
development (or even larger-scale academic development).
There are many different popular revision control systems, but
most of them share some common principles. We are going to
focus on Git, which is currently the most popular tool, and is
quite featureful. We strongly recommend that you learn
revision control early, and make it part of your standard
practice for software development (and most other things that
you do).
We are going to introduce some high-level features, and
motivations for using revision control here, but refer you to the
Git book for details: http://git-scm.com/book/en/v2.
D.4.1 Past Versions
The basic principle underlying most revision control systems is
that you commit your work into the revision control system,
and it then keeps a snapshot of that version of your work. When
you commit, you write a log entry, which describes what you
have done. Later, you can review the log, and return to an older
version of your work if you need to. For example, if you decide
to rewrite a piece of code to improve it, but find out that you
have instead broken your program, you can return to a prior
working version.
A “novice tools” approach to this problem would be to
keep copies of your work, and manage them by hand. You
might think “well I could just copy all my code before I start,
then use that copy later if I need to.” If you are consciously
aware that you are about to undertake a risky code rewrite, you
might take this course of action. However, what if you start
modifying your code, and only later realize the trouble? If you
use a revision control system, you should commit regularly, and
thus will have good revisions in the past. Furthermore, if you
make regular copies manually, you will find it hard to manage
them all by hand.
Revision systems such as Git also have excellent tools for
working with past versions. For example, Git has a command
called bisect, which lets you search for where in the past you
broke something. Suppose that I know that a feature was
working six months ago; however, today, I ran some tests and
realized the feature was broken. I would like to go back and
find exactly which version broke the feature, see what changed
in that version, and then correct the code. The bisect command
asks Git to help me search for the first broken version.
In particular, bisect binary searches (see Chapter 22)
through the revisions (thus the name), to find where the
problem occurred. You can either guide the search manually by
telling Git whether each revision that it visits is good or bad, or
you can have Git perform the search fully automatically by
providing a shell script that determines if the revision is correct
or broken.
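A fully automated run might look like the sketch below; v1.0 and check_feature.sh are hypothetical names of ours, where check_feature.sh is any script that exits with status 0 when the feature works and non-zero when it is broken:

```shell
git bisect start
git bisect bad                    # the current version is broken
git bisect good v1.0              # this tag was known to work
git bisect run ./check_feature.sh # let Git binary search automatically
git bisect reset                  # return to where we started
```

Git checks out a revision, runs the script, marks the revision good or bad based on the exit status, and repeats until it announces the first bad commit.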
Revision control is not limited to code, and, in fact, is
incredibly useful for any large, collaborative project (or even
for just managing your own data on a small scale). To provide a
concrete example of bisect, we use Git to revision control all
of the materials for this book—LaTeX source, code examples,
animations, and video recordings. One of the animations got
inadvertently deleted, and we needed to restore it from the most
recent version. Writing a shell script to test if that file existed,
then running git bisect found the revision where the file was
deleted in a matter of seconds, and let us restore it from the
previous revision (and be sure we had the most up-to-date
version!).
D.4.2 Collaboration
When you work on a software project with hundreds of source
files and dozens of developers, how do you keep everyone up
to date on the latest source? E-mailing files back and forth
would be a nightmare. Even if you only have two people working
on a project, managing changes between the developers is a
first-order concern.
Revision control systems such as Git support notions of
pushing —sending your changes to another repository (located
on another computer)—and pulling changes—getting the most
recent version from another repository. If you try to push your
own changes, but are not up to date with the other repository,
Git will require you to pull first, so that you cannot
unexpectedly overwrite someone else’s changes. When you
pull, Git will take care to make sure your changes are
integrated with the remote changes.
If you pull and have not made any changes to your own
repository (since the last time you pushed to that repository),
Git will simply update you to the most recent version of the
repository. If you have made changes, Git will try to merge
your changes with the remote changes. If you have changed
different files, or different parts of the same file, Git will
typically handle the merge automatically. However, if Git
cannot merge automatically, it will indicate what problems need
your attention, and require you to fix them before you push
your changes back.
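In Git, that day-to-day cycle looks roughly like this sketch (the file name is a placeholder of ours):

```shell
git pull                        # merge the latest remote changes first
# ...edit files...
git add list.c
git commit -m "describe what changed"
git push                        # rejected if the remote moved on: pull again
```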
D.4.3 Multiple Versions of the Present
Revision control tools, such as Git, not only let you return to
past versions, but also let you keep multiple different “present”
versions, via a feature called branches. To see why you might
want multiple “present” versions, imagine that you are
developing a large software project, and have decided to begin
adding a new, complex feature. Adding this feature will take
you and your team of five developers four weeks to complete.
One week into the effort, a critical problem is found in the
currently released version of your software, which must be
fixed as soon as possible.
Such a situation is a great use of branches. The
development team might maintain one branch, production,
which contains code that is ready for release. The only changes
that may be made to the production branch are those that are
ready for deployment. Another development branch can be
used to work on active development—adding new features,
testing them, etc. In the situation described above, the
development branch is where the team would be adding the
new complex feature.
How do these branches help? The developer assigned to
fix the bug in the production code can check out that branch
(working on that version of the “present”), while the other
developers can continue working on the development branch.
In fact, this developer could (and should) make a new branch
(let us call it bugfix) to work on the bug fix. The developer can
then commit changes to bugfix, and other developers can
switch to this branch if they need to work on that fix as well.
When the bug fix is complete, it can be merged back into the
production branch, incorporating the changes into the
production software.
At the same time, development of new features continues
on the development branch. When the bug fix is completed, it
can also be merged into the development branch. Likewise,
when new features are completed and ready for release, they
can be merged back into the production
branch. Whenever these merges happen, Git will perform
similar actions as in the case of merging in changes pulled from
another repository—either handling it automatically if it can, or
requiring the developer to figure out how to resolve conflicts.
Note that this is a bit of a small-scale motivational
example. If we were really developing a large piece of
software, we would almost certainly want more branches than
those described above. We may have different developers
working on different features in parallel, and want to manage
the combination of those changes with branches. We might also
want branches for different stages of testing, or other purposes.
D.4.4 Read More!
Having, hopefully, shown you some of the great benefits of
revision control, we recommend further reading from the online
book about Git: http://git-scm.com/book/en/v2/
In our opinion, reading (trying out, and understanding) the
first two chapters is a “must” even for those who are only
casually interested in learning to program. We recommend that
any class using this book (or really, any software development
class) operate through revision control—distribute assignments
by putting them in a repository students pull from, then have
students submit assignments by committing and pushing their
changes.
If you are interested in serious software development, you
should master chapters 3–6 quickly, and then continue to hone
your Git skills as you develop your other programming skills.
Over time, you should learn about the remaining material, and
work to use Git fluently.
D.5 Tools: A Good Investment
In this appendix, we have introduced you to the basics of four
crucial tools. What we have described here is the minimal level
of competency for a programmer moving beyond the novice
stage into the intermediate stage (i.e., where you will be when
you master the material in this book). However, if you plan to
engage in a serious/professional programming career, you will
find that time spent mastering these (or similar) tools is a great
investment. Increasing your proficiency with these tools will
make you a much more efficient programmer. Much of this
proficiency will come with practice and reading manuals.
Appendix E
Miscellaneous C and C++
Topics
This appendix presents some C and/or C++ related information,
which may be useful if you intend to do significant C and/or
C++ programming. However, it is not critical to learning to
program, thus we place it out of the main portion of the book.
E.1 Ternary If
In C (and C++), the ? : operator lets you write an expression
that conditionally selects between its second and third operand.
This operator is a ternary operator—meaning it takes three
operands (recall that unary operators take one operand and
binary operators take two operands). To evaluate a ? b : c,
first a is evaluated to a value. If that value is true (non-zero),
then b is evaluated to a value and the whole expression
evaluates to that value. Otherwise, c is evaluated to a value and
the whole expression evaluates to that value.
The ? : operator is easy to abuse, so novices are generally
warned against it. There is a time and place for it (namely, if it
makes your code more readable), however, abuse of it can
make code needlessly complex and difficult to read. You can
write any program you want and never use ? :—you can just
use if statements instead and be fine. However, sometimes it is
nice. First, consider the max function on two ints:
1 int max(int a, int b) {
2 if (a > b) {
3 return a;
4 }
5 else {
6 return b;
7 }
8 }
This function is a great example of where the ? : operator
can make the code more readable. If we can write an expression
(instead of a statement) to choose between two things, we can
just return that expression:
1 int max(int a, int b) {
2 return (a > b) ? a : b;
3 }
E.2 Unions
C and C++ both include the notion of a union—a type that
contains multiple other types that overlap each other in
memory. The declaration of a union looks very much like the
declaration of a struct (except with the union keyword instead
of the struct keyword). The main difference is that where a
struct places each element one after the other, a union places all
of the elements “on top of” each other. Accordingly, the size of
a union is the size of its largest member (whereas the size of a
struct is the sum of the sizes of its members, plus possibly some
extra space to align them properly). For example, we might
write:
1 union nums {
2 uint64_t qword;
3 uint32_t dwords[2];
4 uint16_t words[4];
5 uint8_t bytes[8];
6 };
A variable of type nums will consist of one box that we can
“look at” multiple ways by accessing its different fields. For
example, we could write the following (C) code:
1 #include <stdio.h>
2 #include <stdint.h>
3 #include <stdlib.h>
4 union nums {
5 uint64_t qword;
6 uint32_t dwords[2];
7 uint16_t words[4];
8 uint8_t bytes[8];
9 };
10
11 int main(void) {
12 union nums n;
13 n.qword = 0x0123456789ABCDEF;
14 for (int i = 0; i < 2; i++) {
15 printf("n.dwords[%d] = 0x%X\n", i, n.dwords[i]);
16 }
17 for (int i = 0; i < 4; i++) {
18 printf("n.words[%d] = 0x%X\n", i, n.words[i]);
19 }
20 for (int i = 0; i < 8; i++) {
21 printf("n.bytes[%d] = 0x%X\n", i, n.bytes[i]);
22 }
23 return EXIT_SUCCESS;
24 }
The output of this program on my computer is
n.dwords[0] = 0x89ABCDEF
n.dwords[1] = 0x1234567
n.words[0] = 0xCDEF
n.words[1] = 0x89AB
n.words[2] = 0x4567
n.words[3] = 0x123
n.bytes[0] = 0xEF
n.bytes[1] = 0xCD
n.bytes[2] = 0xAB
n.bytes[3] = 0x89
n.bytes[4] = 0x67
n.bytes[5] = 0x45
n.bytes[6] = 0x23
n.bytes[7] = 0x1
Figure E.1: An illustration of the overlap of fields in a union.
This output arises because the dwords, words, and bytes
fields of the n variable overlap with the qword field. Figure E.1
illustrates the overlap of the fields in memory, by showing the
box for n, and highlighting where the “box” for some of its
fields lie. Observe how these boxes overlap—that is the essence
of a union. However, it is important to note that this output is
dependent on the in-memory representation of the underlying
data. In this particular case, it depends on the endianness of the
machine—which order it writes the bytes of a multi-byte
integer into memory.
We note that in C++ there are a variety of special
restrictions on what can go into unions due to their nature. They
cannot have virtual methods or members that are references,
cannot participate in inheritance (they can neither inherit from
another class nor serve as the parent class of any other class),
and cannot have members with non-trivial constructors,
copy-assignment operators, or destructors.
E.3 The C Preprocessor
The C preprocessor has a variety of features that make it quite
powerful. We have previously only introduced its simpler
features: #include and #define. These two are the basic
features that you need to include header files and define
constants. However, if you do serious C programming, you may
find it useful to know a bit more about the preprocessor—either
to use those features yourself, or because you may need to read
code written by someone else that uses them.
E.3.1 Conditional Compilation
The preprocessor supports conditional compilation—either
including or excluding a certain region of code depending on
some condition. There are a couple of ways that we can test
conditions, but all of them form a conditional group, which
starts with the condition directive, and ends with an #endif
directive. The text between the conditional directive and the
#endif directive is called the controlled text.
Most commonly, the condition is whether or not a certain
macro is defined. In such a case, our conditional directive has
the form #ifdef MACRO, which checks if MACRO is defined. If
MACRO is defined, then the controlled text is kept in the program,
and the preprocessor continues to process it. Otherwise, the
controlled text is discarded and has no effect on the final
program. We can also check if a macro is not defined with
#ifndef MACRO (which stands for “if not defined”).
An example of this use is to allow us to write print
statements for debugging in our code, and enable them when
needed, but leave them disabled otherwise. We might write:
1 int myFunction(int a, int b) {
2 #ifdef DEBUG
3 printf("Entering myFunction(%d,%d)\n", a, b);
4 #endif
5 //whatever other code
6 }
Now, if we define a macro called DEBUG, our debugging
print statement will be included in the source code that is
ultimately sent to the compiler, and thus will print its message
whenever we call this function. However, if DEBUG is undefined,
the preprocessor will drop the controlled text (in this case, the
print statement), and it will not be included in the compiled
program.
To use conditional compilation for this purpose, we
could add or remove a #define statement at the top of our
code; however, this is not a great solution. If we have a large
number of source files, updating the #define in all of them
becomes unmanageable. A better approach is to use the fact that
gcc accepts a command line argument -D, which lets you define
a macro on the command line. We can either define it without a
value (-DDEBUG), in which case it is defined as 1, or with a
value (-DDEBUG=3), in which case it is defined to whatever we
write.
Another common use of conditional compilation is to
support code portability. In some cases, we may need to write
platform specific code, however, we may still want our code to
build and run on a wide variety of platforms. In such cases, we
may wish to use conditional compilation to incorporate the
proper code for the platform on which we are building, and
elide all other code.
We may wish to use these features to conditionally define
other macros. For example, if we wish to annotate our functions
with attributes (which are a GNU-C extension—meaning gcc
supports them, but other compilers may not), we could do so in
a portable way:
1 #ifdef __GNUC__
2 #define NO_RETURN __attribute__ ((noreturn))
3 #else
4 #define NO_RETURN
5 #endif
6
7 //later in the code
8 void fatalError(const char * string) NO_RETURN;
In this example, we check if the macro __GNUC__ is
defined. If so, we define the NO_RETURN macro as
__attribute__ ((noreturn))—a GNU extension that tells the
compiler that a function will never return (which helps the
compiler better analyze the code). We then use the #else
directive, which (as you may expect) incorporates or elides its
controlled text by the opposite of the condition it is paired with.
If __GNUC__ is not defined, we just define NO_RETURN to nothing
(that is, it is a valid macro, but expands to no text).
Now, later in our code, we can write NO_RETURN on a
function declaration. If the compiler does not support GNU
extensions, this expands into nothing, and the code is still
correct. If the compiler does support GNU extensions, we give
the compiler the extra bit of information through the attribute
system.
Another important use of conditional compilation is to
protect header files from multiple inclusion in large projects. If
we have a large project with many source files, we may end up
with the same header file included into a source file in a variety
of ways. For example, suppose we write point.h, which
defines a type for a point. We then write rectangle.h, which
must #include "point.h" for the definition of a point, which
the rectangle type uses. We might also then write circle.h,
which also must #include "point.h" to make use of the
definition of the point type. However, if we then have another
file that includes both rectangle.h and circle.h, it will
effectively include point.h twice, resulting in two definitions
of the point type inside of that file. This redefinition will result
in errors.
To protect against this situation, you should use an include
guard—an #ifndef test to see if a macro is not defined,
followed by a definition of that macro, then the contents of the
header file, and then the #endif. That is, our point.h file
should look something like this:
1 #ifndef INCLUDED_POINT_H
2 #define INCLUDED_POINT_H
3 struct _point_t {
4 int x;
5 int y;
6 };
7
8 typedef struct _point_t point_t;
9
10 #endif
Now, the first time the point.h file is encountered,
INCLUDED_POINT_H will not yet be defined, and the
preprocessor will process the controlled text—namely the
contents of the header file. The next thing that will happen is
that INCLUDED_POINT_H will be defined. Now, if this same file
is included again, INCLUDED_POINT_H will already be defined,
so the #ifndef INCLUDED_POINT_H condition will be false, and
the preprocessor will omit the second copy of the contents of
the header file.
We will note that exactly what you name the include guard
does not matter, as long as it is legal, unique per header file,
and does not conflict with any other macro definition in the
entire program (including those found in header files you
include). Generally when organizations have coding style
guidelines, they incorporate rules about the naming convention
used for include guards. Typically these include the full path
name of the header file within the code base, with slashes and
dots replaced by underscores (so if our point.h file were in
geometry/primitives/point.h, we would use an include
guard that has GEOMETRY_PRIMITIVES_POINT_H in its name).
Some conventions put INCLUDED on one end or the other, some
do not.
Conditional compilation can also be done with the #if
directive, which evaluates a numerical expression, and includes
the controlled text if that expression evaluates to non-zero. This
expression can include integer constants, defined macros, and
basic mathematical operations (comparison, and, or, etc), as
well as the special defined operator (which tests if a macro is
defined). Note that the expression is evaluated by the
preprocessor, so it can only evaluate the values that macros
expand to. It cannot compute sizeof(a_type) for example, nor
can it use the value of any variables.
We might use this feature if, for example, we want
different DEBUG levels:
1 #if DEBUG > 3
2 //print some very verbose debugging
3 #endif
4 //...
5 #if DEBUG > 2
6 //print some less verbose debugging
7 #endif
We can then define DEBUG to different values when we compile
to control how much/which debug output we get. If a macro is
undefined, it is treated as 0 in evaluating the expression.
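The DEBUG idea above can be condensed into a compilable sketch. The function debug_level is a hypothetical stand-in for the actual printing; here DEBUG defaults to 0, and compiling with something like -DDEBUG=4 would select the more verbose branch.

```cpp
#include <cassert>

// default: no debug output; override on the command line, e.g. -DDEBUG=4
#ifndef DEBUG
#define DEBUG 0
#endif

// the preprocessor picks exactly one branch at compile time
int debug_level() {
#if DEBUG > 3
  return 2;  // very verbose debugging
#elif DEBUG > 2
  return 1;  // less verbose debugging
#else
  return 0;  // no debugging
#endif
}
```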
This directive is also often used to remove large chunks of
code from the program while still leaving them in the source
file:
1 #if 0
2 //code that we want to not have take effect
3 #endif
E.3.2 Multi-line Macro Definitions
If we wish to write a complex macro, we might want its
definition to span multiple lines. However, a newline ends the
macro definition. We can request that the definition be allowed
to continue to the next line by placing a backslash (\) at the end
of the line. Note that the backslash must be the absolute last
character on the line—even whitespace or comments cannot
technically be placed after it. If you do place any other
character after the backslash, the macro definition will end at
the newline, and the next line of code will be passed directly to
the compiler.
E.3.3 Built-in Macros about Location/Time
Sometimes it is useful for our program to have information
about where something is in the code, or when the compilation
took place. For example, suppose we want to have our program
print the date and time at which it was built (such information
might be useful for finding the correct version to fix bug
reports). We can use the built-in macros __DATE__ and
__TIME__, which expand to the date and time (respectively) at
which the preprocessor was run as string literals.
For example, we could write:
1 printf("Built on " __DATE__ " at " __TIME__ "\n");
and the preprocessor would expand this to a line of code like
this (the exact result depends on the date and time, of course):
1 printf("Built on " "Apr 12 2015" " at " "15:10:35" "\n");
This line of code may look a little strange, as it has five string
literals one after another—however, the rules of C say that
consecutive string literals are concatenated to form a single
literal, thus the compiler will turn the above into this:
1 printf("Built on Apr 12 2015 at 15:10:35\n");
We can also obtain the current file name and line number
using the __FILE__ and __LINE__ macros. These are especially
useful for debugging messages. For example, we might write the
following macro:
1 #define DBGPRINT(mesg) fprintf(stderr, "[%s:%d] %s\n", __FILE__, __LINE__, mesg)
Note that the __FILE__ and __LINE__ macros contained
inside of the DBGPRINT macro will not be expanded at the point
of definition, but rather at the point where the DBGPRINT macro
itself is expanded. Accordingly, it will print the file name and
line number of wherever you write
DBGPRINT("Some Message").
We may also wish to know what function a piece of code
occurs in. We can use __func__ (in C99 or later), or
__FUNCTION__ (as a GNU extension) to do so. However, we
should note that neither of these is a macro—the preprocessor
has no idea what function a given macro expansion takes place
in. Instead, these are special names that the compiler itself turns
into a string.
E.3.4 Stringification
Sometimes, we may wish to turn a macro argument into a string
literal—a process called stringification. For example, suppose
we wanted to define a macro that behaves much like assert,
but allows us to print a custom message. Like assert, we want
our macro to print the condition that failed. We can accomplish
this with the stringification operator, #, which we place
immediately before the name of a macro argument to request
that the preprocessor emit a string literal with the text of that
argument. We could do so with a macro such as this:
1 #define CHECK(x,mesg) if(!(x)) { \
2   fprintf(stderr, "[%s:%d] " #x " failed: %s\n", __FILE__, __LINE__, mesg); \
3   abort(); \
4 }
Then if we did the following:
1 CHECK(lst != NULL, "This function does not work on an empty list");
We would get output that looks like this:
[myFile.c:7] lst != NULL failed: This function does
not work on an empty list
This output occurs because the preprocessor expands #x to
"lst != NULL" as that is the stringification of the first
argument of the macro.
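The stringification operator can be seen in isolation with a tiny macro. STRINGIFY and stringify_works are hypothetical names, used here just so the expansion can be checked; note that the preprocessor normalizes whitespace between tokens to single spaces when stringifying.

```cpp
#include <cassert>
#include <cstring>

// #x emits the text of the argument x as a string literal
#define STRINGIFY(x) #x

// STRINGIFY(lst != NULL) expands to the string literal "lst != NULL"
bool stringify_works() {
  return std::strcmp(STRINGIFY(lst != NULL), "lst != NULL") == 0;
}
```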
E.3.5 Variadic Macros
C and C++ support4 variadic macros—macros that support a
variable number of arguments. This feature is most useful when
you want to define a “printf-like” macro. For example, suppose
we wanted to adapt our CHECK macro to allow us to include
other information in a printf-like fashion.
We can write a variadic macro by putting ... at the end of
the macro argument list, then writing __VA_ARGS__ where we
want to use all of the arguments passed in as the variable
arguments. For example, we might write:
1 #define CHECK(x,mesg,...) if(!(x)) { \
2   fprintf(stderr, "[%s:%d] " #x " failed: " mesg "\n", \
3           __FILE__, __LINE__, __VA_ARGS__); \
4   abort(); \
5 }
Now, what we pass as mesg can be any valid format string
for fprintf (including format conversions like %s and %d), and
we can pass an appropriate number of extra arguments to the
function. For example, we could write:
1 CHECK(lst != NULL, "The %s function does not work on an empty list", __func__);
or
1 CHECK(x + y > z - q, "x = %d, y = %d, z = %d, q = %d", x, y, z, q);
However, we run into a problem if we pass no extra
arguments—the macro expands to a call to fprintf that has an
extra comma at the end— that is, __VA_ARGS__ expands to
nothing, so the end of the fprintf call looks like
"myFile.c", 7, );, which is not syntactically legal.
Some compilers will silently drop the extra comma and
allow the code to compile (which is a rather poor idea, as it
means that one can write incorrect code that appears to work,
without even thinking about the problem). GNU C supports an
extension, ##__VA_ARGS__, which will remove the preceding
extra comma if __VA_ARGS__ is empty. However, this approach
is generally not portable (although you can use it inside of an
#ifdef __GNUC__ safely).
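A common portable workaround is to fold the format string into the variable arguments, so the argument list handed to fprintf is never empty and no trailing comma can arise. This is a sketch of that technique (the names CHECK and checked_positive are as in the running example and a hypothetical caller, respectively); variadic macros require C99 or C++11.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// mesg is folded into __VA_ARGS__: the macro requires at least a format
// string, so __VA_ARGS__ can never be empty and no stray comma appears
#define CHECK(x, ...) \
  do { \
    if (!(x)) { \
      std::fprintf(stderr, "[%s:%d] " #x " failed: ", __FILE__, __LINE__); \
      std::fprintf(stderr, __VA_ARGS__); \
      std::fprintf(stderr, "\n"); \
      std::abort(); \
    } \
  } while (0)

// hypothetical caller: aborts with a message if v is not positive
int checked_positive(int v) {
  CHECK(v > 0, "v = %d", v);
  return v;
}
```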
E.3.6 X-Macros
Let us suppose that we have some enumeration:
1 enum myEnum{
2 ONE_THING,
3 TWO_THING,
4 RED_THING,
5 BLUE_THING,
6 NUM_MY_ENUM_ITEMS
7 };
and we have some function to convert this enum to a string:
1 const char * myEnumToString(enum myEnum e) {
2 switch(e) {
3 case ONE_THING: return "ONE_THING";
4 case TWO_THING: return "TWO_THING";
5 case RED_THING: return "RED_THING";
6 case BLUE_THING: return "BLUE_THING";
7 default: return "<invalid myEnum>";
8 }
9 }
If this is all we ever do with that enum, this code is fine.
However, if we add another case to the enumeration, we have
to remember to go change myEnumToString to add another case
to the switch (e.g., GREEN_THING) statement (where we will
likely copy one case, paste it, and edit the label and string
literal—and thus be prone to copy/paste mistakes). If we forget
to do so, our code remains legal, but we may end up with
confusing results that cause us to waste time. For example,
while debugging, we might have a GREEN_THING, and try to
print it out by converting it to a string using myEnumToString,
but think we have an invalid value due to forgetting to update
this function. Remember that anytime code requires someone to
update multiple places to make one change, you are asking for
people to introduce errors later on!
We can solve this problem using X-macros—an advanced
macro technique in which we (1) make use of a yet-to-be-
defined macro (X) in such a way that we can cause its
expansion in multiple places (either with another macro
definition, or with a file we will #include), (2) define X,
expand the definitions that use X, then undefine X, and
(3) repeat step 2 as many times as we need, with possibly
different definitions of X each time.
While this may sound rather complex, an example
clarifies. Suppose we write the following as the contents of a
file called things.c:
1 DEF_THING(ONE_THING)
2 DEF_THING(TWO_THING)
3 DEF_THING(RED_THING)
4 DEF_THING(BLUE_THING)
Notice that this C file does not stand alone. It relies on a
macro DEF_THING, which is not defined anywhere inside it. (It
also does not really define any code).5 However, we can now
make use of this file by defining the DEF_THING macro, then
#includeing the file. For example, we can define our
enumeration as follows:
1 #define DEF_THING(x) x,
2 enum myEnum {
3 #include "things.c"
4 NUM_MY_ENUM_ITEMS
5 };
6 #undef DEF_THING
This definition will expand to the enumeration definition
we wrote earlier. Notice that we use the #undef directive at the
end of this piece of code, which undefines the macro, so that
we can redefine it in the future. We might then write our
function to convert to a string as follows:
1 #define DEF_THING(x) case x: return #x;
2 const char * myEnumToString(enum myEnum e) {
3 switch(e) {
4 #include "things.c"
5 default: return "<unknown myEnum>";
6 }
7 }
8 #undef DEF_THING
This fragment of code might look quite odd; however, it
does exactly what we want. It will define the DEF_THING macro
to expand DEF_THING(ONE_THING) into
case ONE_THING: return "ONE_THING"; (remember, that #x
stringifies x). Accordingly, when we #include "things.c", the
preprocessor writes the cases of our switch statement for us.
This approach solves our maintainability problem, as we
now only have to change one place to add items to the
enumeration definition and the cases to print them at the same
time. If we want to add GREEN_THING, we just add
DEF_THING(GREEN_THING) to things.c, and recompile. The
preprocessor will expand the (different) DEF_THING macros on
GREEN_THING in the relevant places, and everything will work
correctly.
This approach can also be taken without the separate file
by defining another macro (recall that the backslash at the end
of each line just lets the macro definition continue to the next
line, as a newline otherwise ends the macro definition):
1 #define THING_LIST \
2 DEF_THING(ONE_THING) \
3 DEF_THING(TWO_THING) \
4 DEF_THING(RED_THING) \
5 DEF_THING(BLUE_THING)
If we take this approach, we can just expand the
THING_LIST macro in each place that we #included things.c
in the previous approach:
1 #define THING_LIST \
2 DEF_THING(ONE_THING) \
3 DEF_THING(TWO_THING) \
4 DEF_THING(RED_THING) \
5 DEF_THING(BLUE_THING)
6
7 #define DEF_THING(x) x,
8 enum myEnum {
9 THING_LIST
10 NUM_MY_ENUM_ITEMS
11 };
12 #undef DEF_THING
13
14 #define DEF_THING(x) case x: return #x;
15 const char * myEnumToString(enum myEnum e) {
16 switch(e) {
17 THING_LIST
18 default: return "<unknown myEnum>";
19 }
20 }
21 #undef DEF_THING
Both approaches work equally well. Choosing between the
two is generally a matter of convention and circumstance. I
tend to prefer including a separate file for long and/or complex
lists, and defining another macro for shorter, simple lists.
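The single-file approach can be condensed into a compilable sketch; this variant shortens THING_LIST to two items so the full expansion is easy to follow, but is otherwise the pattern shown above.

```cpp
#include <cassert>
#include <cstring>

#define THING_LIST \
  DEF_THING(ONE_THING) \
  DEF_THING(TWO_THING)

// first expansion: each DEF_THING(x) becomes "x," in the enum body
#define DEF_THING(x) x,
enum myEnum {
  THING_LIST
  NUM_MY_ENUM_ITEMS
};
#undef DEF_THING

// second expansion: each DEF_THING(x) becomes a case that returns #x
#define DEF_THING(x) case x: return #x;
const char * myEnumToString(enum myEnum e) {
  switch (e) {
    THING_LIST
    default: return "<unknown myEnum>";
  }
}
#undef DEF_THING
```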
E.3.7 Token Pasting
Suppose that we wanted to use X-macros for the previous
example, but wanted to print ONE, TWO, RED, and BLUE, while still
having the constants named ONE_THING, TWO_THING, RED_THING,
and BLUE_THING. We can achieve this goal, but we need a way
to glue two identifiers together into one identifier. That is, we
need our macro to take ONE as an argument, and glue it together
with _THING to form the identifier ONE_THING. What we want to
do is called token pasting and is achieved by writing ##
between the two identifiers.
We start by writing our list of macros to expand with just
ONE, TWO, RED, and BLUE:
1 DEF_THING(ONE)
2 DEF_THING(TWO)
3 DEF_THING(RED)
4 DEF_THING(BLUE)
Then, when we #include the file, we define the
DEF_THING macros to use token pasting:
1 #define DEF_THING(x) x##_THING,
2 enum myEnum {
3 #include "things2.c"
4 NUM_MY_ENUM_ITEMS
5 };
6 #undef DEF_THING
Notice how the definition of DEF_THING now produces a
single token, which it constructs by gluing together the text of
the macro argument and _THING. We likewise modify the code
for our printing function to use token pasting in its creation of
the case labels:
1 #define DEF_THING(x) case x##_THING: return #x;
2 const char * myEnumToString(enum myEnum e) {
3 switch(e) {
4 #include "things2.c"
5 default: return "<unknown myEnum>";
6 }
7 }
8 #undef DEF_THING
Token pasting is probably a feature you will not use very
often; however, it can be incredibly useful in the right
circumstances.
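Token pasting can also be seen in isolation. CONCAT is a hypothetical helper macro, not a standard one; ## glues its two arguments into a single token, so the declaration below creates one identifier, myCounter.

```cpp
#include <cassert>

// a##b pastes the two argument tokens into one token
#define CONCAT(a, b) a##b

int CONCAT(my, Counter) = 42;  // declares an int named myCounter
```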
E.3.8 Custom Errors and Warnings
There may be times when you want to generate your own errors
or warnings. You can do this with the #error and #warning
directives respectively. These are most useful in combination
with conditional compilation. For example, if you know that
your code uses GNU extensions and will not otherwise compile,
you might write:
1 #ifndef __GNUC__
2 #error "This code requires a compiler that supports GNU extensions"
3 #endif
The code will still not compile with a non-GNU compiler,
but the person attempting to build your program will at least
have a meaningful error message that makes it much easier to
diagnose the problem.
E.4 C++ Casting
As we discussed in Section 3.4.1, sometimes programs need to
convert a value from one type to another. Except for the
implicit void*-to-any-pointer-type conversion, C++ supports
the same implicit and explicit conversions as C. However, it
also introduces a variety of other ways to convert.
E.4.1 Implicit Conversions
As we discussed in Section 15.4.3, a one argument constructor
may be used to implicitly convert from the type of the
constructor’s argument to the type of object the constructor is
inside of, unless the constructor is declared explicit. There are
two other ways to add implicit conversions for classes that you
write. One is to overload the assignment operator with a
parameter of the type to implicitly convert from. The other
way is to define a type-conversion operator, which lets a
class specify how to implicitly convert its objects to some other
type. We recommend avoiding all of these approaches unless
you have a good reason—requiring the programmer to
explicitly request conversion is well worth it. The handful of
extra keystrokes is more than made up for by avoiding strange
bugs from unexpected implicit conversions.
E.4.2 Explicit Conversions
In C, we can explicitly cast from one type to another by writing
the desired type in parentheses (which we saw in Section 3.4.2).
C++ adds a variety of casting operators to give the programmer
more control over what may and may not change during the
cast operation.
Const Cast. One casting operator allows the programmer
to change the “cv-qualifiers” on a type—meaning its const-
ness, or volatility6 of a type. This casting operator, which is
written const_cast<type>(expr) casts expr to the type written
in the angle brackets. If the type in the angle brackets differs
from the actual type of the expression in more ways than just its
cv-qualifiers, the compiler produces an error. Accordingly, it is
useful if you expect to only be changing the cv-qualifiers, and
nothing else—the compiler will tell you if you are incorrect,
and thus you can fix your code.
Note that such a cast does not actually change the
properties of the data involved. Consider the following (broken)
code:
1 //broken
2 const char * s = "Hello";
3 char * p = const_cast<char *>(s);
4 strcpy(p, "Bye");
This code will compile correctly, as it type checks.
However, the code will crash when we run it, as we attempt to
modify read only memory—removing the const qualifier from
the pointer does not make the memory that it points to
writeable. See
http://en.cppreference.com/w/cpp/language/const_cast
for more details.
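A legitimate use of const_cast is calling an old API that takes a non-const pointer but never writes through it. This is a sketch under that assumption—legacy_length and safe_length are hypothetical names, and the cast is safe only because we know the callee does not modify the string.

```cpp
#include <cassert>
#include <cstring>

// hypothetical legacy API: takes char * but never writes through it
std::size_t legacy_length(char * s) {
  return std::strlen(s);
}

// const_cast adjusts only the cv-qualifier; the memory itself is unchanged,
// so this is safe only because legacy_length does not modify the string
std::size_t safe_length(const char * s) {
  return legacy_length(const_cast<char *>(s));
}
```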
Dynamic Cast. Another casting operator, dynamic_cast,
casts between polymorphic pointer (or reference) types with
runtime type checking to ensure that the cast is correct (i.e., that
result pointer actually points at a valid object of the requested
type). If we write dynamic_cast<T*>(expr), then the compiler
will first check that expr has a type that is a pointer to a class
with at least one virtual function.7 The compiler will then
generate code that performs a runtime check that the pointer
actually points at an object of type T. If so, the entire
dynamic_cast expression evaluates to that pointer, otherwise, it
evaluates to NULL. Note that in the case of a successful dynamic
cast between a non-primary parent class and a child class, the
pointer will be appropriately adjusted to point at the start of the
child object (see Section 29.2).
The dynamic_cast operator can also be used with
reference types. As with pointers, the type of the expression to
cast must be a reference to a class type with at least one virtual
function. On success, the expression evaluates to a reference to
the object of the requested type. On failure, std::bad_cast is
thrown (see Chapter 19 if you have not yet read about
exceptions).
This operator is the safe way to cast from a pointer-to-a-
parent class to a pointer-to-a-child class, often called down-
casting. It can also be used to convert a pointer-to-a-child to a
pointer-to-a-parent (called up-casting), but that can also be
done implicitly. When multiple inheritance is used, it is
possible to side-cast—converting from one parent to another
within the same object. That is, if class C inherits from A and B,
and we have a B * that actually points at a C, we can
dynamic_cast it to an A*, which will point at the A sub-object
within the same C object.
You can use dynamic_cast<void*>(expr) to cast to a
void *. In such a case, the resulting pointer will point at the
entire object, as it was actually created (i.e., if you are using
multiple inheritance, it will give you a pointer to the entire
object of the most derived class). See
http://en.cppreference.com/w/cpp/language/dynamic_cas
t for more details.
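A small sketch of a checked down-cast: the hierarchy (Animal, Dog, Cat) and the helper try_fetch are hypothetical. The virtual destructor makes Animal polymorphic, which dynamic_cast requires.

```cpp
#include <cassert>
#include <cstddef>

class Animal {
public:
  virtual ~Animal() {}  // at least one virtual function: required for dynamic_cast
};
class Dog : public Animal {
public:
  int fetch() const { return 1; }
};
class Cat : public Animal {};

// down-cast with a runtime check: yields NULL when a does not
// actually point at a Dog
int try_fetch(Animal * a) {
  Dog * d = dynamic_cast<Dog *>(a);
  if (d == NULL) {
    return 0;
  }
  return d->fetch();
}
```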
Static Cast. The third casting operator asks the compiler
to find a logical way to convert from one type to another. This
conversion, which is written static_cast<T>(expr) can be
used to perform the same conversions as dynamic_cast, but
without the runtime checks. In the case of a downcast,
static_cast is unsafe unless the programmer has other means
of ensuring that the object is actually of the requested type.
Static cast can also be used to convert between types using
user defined conversions—one argument constructors, or the
overloaded conversion operator. If a one argument constructor
is available to perform the conversion, it will be used, even if
declared explicit (as writing the static_cast is considered
an explicit request for that conversion). The static_cast
operator may also make use of implicit conversions. See
http://en.cppreference.com/w/cpp/language/static_cast
for more details.
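For a basic static_cast, here is a sketch of the common numeric case (truncate_to_int is a hypothetical helper): the conversion is resolved at compile time, with no runtime check, and truncates toward zero.

```cpp
#include <cassert>

// static_cast makes the narrowing conversion explicit; the fractional
// part is discarded (truncation toward zero)
int truncate_to_int(double d) {
  return static_cast<int>(d);
}
```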
Reinterpret Cast. The fourth casting operator tells the
compiler to treat the same underlying numeric values as another
type. If we write reinterpret_cast<T>(expr), then the
compiler will generate code to evaluate expr, and then will just
treat the resulting bit pattern as type T. There are a lot of rules
about when you can use reinterpret casts, see
http://en.cppreference.com/w/cpp/language/reinterpret
_cast for more details.
C-style Casting. At this point, you may be wondering
what happens if you write a C-style cast (of the form (T)expr).
The compiler will pick between const_cast, static_cast, and
reinterpret_cast (with the last two possibly combined with
const_cast, as they do not adjust the cv-qualifiers). See
http://en.cppreference.com/w/cpp/language/explicit_ca
st for more details.
E.5 Boost Libraries
Boost provides a wide variety of libraries for C++
development. These libraries include a wide variety of
functionality, such as threading, asynchronous IO, data
structures, and algorithms. We are not going to delve into the
details of them here, as they are documented online. However,
you should be aware that Boost libraries exist, and look to see if
they can help you with whatever you need to do. See
http://www.boost.org/ for details.
E.6 C++11 Features
We have primarily discussed C++03 (which is sometimes
referred to as C++98), however, in 2011, a newer version of the
standard was introduced with a large variety of features. Even
more recently, C++14 was introduced, however, it provides a
much smaller set of changes to the rules than C++11 did.8 With
g++, these features can be used with the -std=gnu++11
command line option. We want to briefly mention some of the
larger features in C++11, however, our goal is not to delve into
every corner of C++11, nor to go into these in detail. We do not
aim to cover all of the new features here, nor to provide a
complete in-depth coverage of the features we do cover.
Instead, we want to describe the big idea behind some of the
changes, so that you can be sure to find out more details before
attempting to engage in professional C++11 programming. As
with our broader discussion of C++, if you want to know all
about every feature and detail of the language, you should get a
book focused explicitly on such topics.
E.6.1 nullptr
In C, and C++03, we have become used to writing NULL to
represent the null pointer. However, the way that NULL was
defined (in C, as ((void*)0), and in C++ as simply 0) leaves a
lot to be desired in terms of type safety. C++11 improves on the
definition of NULL, by providing nullptr, which is of type
nullptr_t (which is also new in C++11). The nullptr_t type
is compatible with any other pointer type (we can assign
nullptr to any pointer, or compare a pointer of type nullptr_t
to any other pointer). However, values of type nullptr_t are
not compatible with ints. Expressions of type nullptr_t may
still be implicitly converted to booleans, however.
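A minimal sketch of nullptr in use (value_or_zero is a hypothetical helper; requires C++11 or later): nullptr assigns to and compares with any pointer type, but unlike 0 it cannot be mistaken for an int.

```cpp
#include <cassert>

// nullptr compares against any pointer type; the implicit conversion to
// bool also makes "if (p)" work as expected
int value_or_zero(const int * p) {
  if (p == nullptr) {
    return 0;
  }
  return *p;
}
```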
E.6.2 Exception Specifications Revisited: noexcept
C++11 deprecates the throw (exceptionNames) style
exception specifications. Instead, the consensus was that what
is really important is whether or not a function can throw any
exception or none at all. As such, the preferred way to describe
exception information is to write noexcept to specify that a
function cannot throw any exception.
The C++11 noexcept declaration is a bit more
sophisticated than the C++03 throw specification in that the
C++11-style can take a constant boolean expression to
determine if the function is noexcept (if the expression is true)
or may throw (otherwise). The word noexcept can also be used
as an operator to determine if an expression may not throw an
exception. The combination of these two uses of noexcept is
useful for templates, where the exception behavior of the
templated function/method depends on the exception behavior
of some aspect of the template parameter:
1 template<typename T>
2 int f(const T & x) noexcept(noexcept(x.f()) &&
3 noexcept(x.g(42))){
4 return x.f() + x.g(42);
5 }
Here, whether or not f can throw any exceptions
depends on the behavior of the methods f() and g(int) inside
of x (which is a T).
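The two uses of noexcept can be shown without templates as well. This sketch (safe_add and add_cannot_throw are hypothetical names; requires C++11) uses noexcept first as a specifier, then as an operator querying whether a call can throw.

```cpp
#include <cassert>

// noexcept as a specifier: promises this function throws no exceptions
int safe_add(int a, int b) noexcept {
  return a + b;
}

// noexcept as an operator: evaluates (at compile time, without calling
// safe_add) whether the expression can throw
bool add_cannot_throw() {
  return noexcept(safe_add(1, 2));
}
```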
E.6.3 Working with Types
C++11 introduces some features to make working with types
(especially in the context of templates) easier. We will mention
two such features here (although there are others). The first is
the re-purposing of the auto keyword9 to ask the compiler to
automatically determine the type of a variable. For example,
suppose we wanted to write a templated function, like this:
1 template<typename T1, typename T2>
2 int f(const T1& x, const T2& y) {
3 ??? something = x.f() + y.g();
4 //more code here that uses ’something’
5 }
Determining what type we should write for the variable
something may be problematic, and in fact, writing any one
particular type might limit the utility of our template (restricting
it to work only on a particular subset of types for T1 and T2
that it might otherwise be fine on). In C++11, we can write
auto as the type of the variable to mean “figure out the type
from whatever I initialize it with”, e.g.,
1 template<typename T1, typename T2>
2 int f(const T1& x, const T2& y) {
3 auto something = x.f() + y.g();
4 //more code here that uses ’something’
5 }
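Outside of templates, auto is useful whenever the type is verbose. This sketch (total_length is a hypothetical helper; requires C++11) deduces both a counter type and an iterator type from their initializers.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::size_t total_length(const std::vector<std::string> & v) {
  auto total = std::string::size_type(0);  // deduced from the initializer
  // 'it' is deduced as std::vector<std::string>::const_iterator
  for (auto it = v.begin(); it != v.end(); ++it) {
    total += it->size();
  }
  return total;
}
```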
The second is the new decltype keyword, which takes
one operand, and determines its type, in much the same way
that sizeof determines its operand’s size. Like sizeof,
decltype does not actually evaluate its argument to a value.
Suppose that in our previous example, the next thing we wanted
to do was to declare a std::set whose type is appropriate to
put something into it. That is, we want to appropriately fill in
the question marks here:
1 template<typename T1, typename T2>
2 int f(const T1& x, const T2& y) {
3 auto something = x.f() + y.g();
4 std::set<???> s;
5 //more code that uses ’s’ and ’something’
6 }
We cannot simply write auto s; because the compiler has
no way to figure out what type we mean. Instead, we can use
decltype(something) to name the type of something and use
it as the template argument of std::set
1 template<typename T1, typename T2>
2 int f(const T1& x, const T2& y) {
3 auto something = x.f() + y.g();
4 std::set<decltype(something)> s;
5 //more code that uses ’s’ and ’something’
6 }
There is a special syntax for declaring functions whose
return types depends on the types of the parameters. That is, we
might like to write a different function that looks like this:
1 //not legal
2 template<typename T1, typename T2>
3 decltype(x.f() + y.g()) g(const T1& x, const T2& y) {
4 return x.f() + y.g();
5 }
However, this code is not legal due to the fact that x and y
are not yet in scope at the point where we try to write the return
type. C++11 solves this problem by introducing the concept of
a trailing return type—where we write auto in the place where
we would normally write the return type, and then after the
parameter list, we write -> followed by the actual return type
(where we can use decltype on expressions involving the
parameters).
1 template<typename T1, typename T2>
2 auto g(const T1& x, const T2& y) -> decltype(x.f() + y.g()){
3 return x.f() + y.g();
4 }
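A concrete instance of this pattern (the template add is a hypothetical example; requires C++11): the trailing return type lets decltype mention the parameters, which are in scope once the parameter list has been seen.

```cpp
#include <cassert>

// the return type is whatever a + b yields for the deduced A and B
template<typename A, typename B>
auto add(const A & a, const B & b) -> decltype(a + b) {
  return a + b;
}
```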
E.6.4 Move Semantics and Rvalue References
One tension between performance and readable code in C++03
arises from copying of temporary objects. From a performance
standpoint, needless copies must be avoided (as copying objects
takes time). However, writing code in such a way as to avoid
these copies often makes it much less natural. Consider the
following code fragment:
1 IntArray a(1000000); //create an IntArray with 1,000,000 elements
2 IntArray b(1000000); //and another
3 //...code that fills in a and b with data...
4 std::vector<IntArray> v;
5 v.push_back(a+b);
Let us suppose that operator+ is defined on IntArray in
the natural way—it adds the arrays together element-wise, and
returns the resulting newly made IntArray (which presumably
has some dynamic allocation inside it). Adding this IntArray to
our vector invokes the copy constructor (as the vector must
copy the object into itself), which involves allocating space for
1 million elements, then copying them all. After the call to
v.push_back(a+b) completes, the unnamed temporary is
destroyed (deleting the memory that the values were copied
from). Such copying is painful for performance, but in C++03,
there is no immediately obvious “nice” way to rewrite this code
to avoid such a problem.
Ideally, what we would like to do in this case is to have the
object inside of the vector constructed in such a way that it
“steals” the memory from the unnamed temporary object,
avoiding the extra allocation and copy. That is, we would like
to copy only the pointer to the memory, and then change the
pointer in the temporary so that the destructor will not free the
memory we just stole. However, such stealing is only a valid
strategy in the case where we know we have an object whose
destruction is imminent.
C++11 introduces rvalue references, move constructors,
and move assignment operators to solve such a problem. The
idea behind an rvalue reference (which is declared with two
ampersands) is that they are only bound to rvalues (i.e.,
unnamed temporaries). Correspondingly, the move constructor
and move assignment operators take rvalue references (instead
of “normal” references—called lvalue references to distinguish
them). Since the move constructor (and move assignment
operator) know that the object they reference must be an
unnamed temporary, which will be destroyed soon, they can
perform exactly the stealing we desire. That is, we might write
our IntArray class as follows:
#include <cassert>
#include <cstddef>
#include <utility>

class IntArray {
  int * data;
  size_t numElements;
public:
  IntArray(size_t n) : data(new int[n]()), numElements(n) {}
  ~IntArray() noexcept {
    delete[] data;
  }
  //move constructor
  IntArray(IntArray && rhs) noexcept : data(nullptr), numElements(0) {
    std::swap(data, rhs.data);
    std::swap(numElements, rhs.numElements);
  }
  //move assignment operator
  IntArray & operator=(IntArray && rhs) noexcept {
    if (this != &rhs) {
      std::swap(data, rhs.data);
      std::swap(numElements, rhs.numElements);
    }
    return *this;
  }
  //copy constructor
  IntArray(const IntArray & rhs) : data(new int[rhs.numElements]),
                                   numElements(rhs.numElements) {
    for (size_t i = 0; i < numElements; i++) {
      data[i] = rhs.data[i];
    }
  }
  //copy assignment operator
  IntArray & operator=(const IntArray & rhs) {
    if (this != &rhs) {
      if (numElements != rhs.numElements) {
        int * temp = new int[rhs.numElements];
        delete[] data;
        data = temp;
        numElements = rhs.numElements;
      }
      for (size_t i = 0; i < rhs.numElements; i++) {
        data[i] = rhs.data[i];
      }
    }
    return *this;
  }
  IntArray operator+(const IntArray & rhs) {
    assert(rhs.numElements == numElements);
    IntArray answer(numElements);
    for (size_t i = 0; i < numElements; i++) {
      answer.data[i] = data[i] + rhs.data[i];
    }
    return answer;
  }
  //other methods omitted
};
Observe how the move assignment operator and move
constructor now swap the fields of this object with the
temporary we are moving from. The destruction of the
temporary will now destroy any memory that this object
previously held, and this object will point at the memory
previously held by the temporary. Also notice that the move
assignment operator and move constructor are noexcept—
unlike their copying cousins, they do not need to allocate
memory, and thus can provide a no-throw guarantee.
The vector<T> class makes use of the move constructor
by providing an overloading of push_back, which takes an
rvalue reference:
1 void push_back (T&& val);
This method will make use of the move constructor to move the
temporary into the vector without copying. We note that there is
little point in writing the IntArray class, as we can just use
vector<int>.
There may be times when the programmer wishes to have
move semantics for an object that is not an unnamed temporary.
In such cases, she may use std::move to obtain an rvalue
reference to the object in question. Of course, such a request
must be used with care.
The introduction of rvalue references changes the Rule of
Three (which we discussed in Section 15.3.4) into the Rule of
Five, adding the move constructor and move assignment
operator to the list of behaviors that must all be written if any of
the five are written.
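The standard library types already implement move semantics, so the effect of std::move can be sketched without writing a Rule-of-Five class by hand (move_demo is a hypothetical helper; requires C++11):

```cpp
#include <cassert>
#include <string>
#include <utility>

std::size_t move_demo() {
  std::string a(1000, 'x');
  // std::move casts 'a' to an rvalue reference, so the move constructor
  // steals its buffer instead of copying 1000 characters; 'a' is left in
  // a valid but unspecified state
  std::string b = std::move(a);
  return b.size();
}
```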
E.6.5 Constructors
C++11 introduces a handful of features related to constructors.
The first is to allow the explicit introduction of a default
version of a constructor. For example, we might write:
1 class X {
2 //some fields
3 public:
4 X() = default;
5 X(int n) {
6 //whatever code
7 }
8 };
By writing any constructor (e.g., the one that takes an
int), we lose the compiler-provided default constructor. In
C++03, we would have to write an equivalent constructor by
hand. However, in C++11, we can
write X() = default; to have the compiler provide it.
We can also remove a constructor that would otherwise be
provided automatically by using = delete. For example, if we
wanted there to be no copy constructor (to prevent copying of
the objects at all), we could write:
1 class X {
2 //some fields
3 public:
4 X() = default;
5 X(int n) {
6 //whatever code
7 }
8 X(const X& rhs) = delete; //no copy constructor
9 };
The = delete can also be used to remove the compiler-provided
assignment operator or destructor. C++11 also
introduces the idea of delegating constructors—to reduce code
duplication, one constructor may express itself in terms of a call
to another constructor of the same class. This delegation is
accomplished by placing a call to the delegatee in the
delegating constructor’s initializer list:
1 class Y {
2 public:
3 Y(): Y(0, nullptr) { /* delegate the work to the other constructor */ }
4 Y(size_t sz, int * data) {
5 //code
6 }
7 };
E.6.6 Range-based for Loops
With the prevalence of iterators in C++, C++11 introduces
a shorthand syntax for them in the form of a range-based
for loop. A range-based for loop has different syntax from a
regular for loop and iterates over a structure for which begin()
and end() are defined and return iterators. It can also be used
on a sequence of elements in braces (i.e., with
{elem0, elem1, ..., elemN}). For example:
1 #include <cstdlib>
2 #include <iostream>
3 #include <set>
4 #include <string>
5 #include <vector>
6
7 int main(void) {
8 std::vector<std::string> aVector;
9 aVector.push_back("Hello");
10 aVector.push_back("World");
11 for (const std::string & s : aVector) {
12 std::cout << s << std::endl;
13 }
14 std::set<int> aSet;
15 for (int i : {0,1,4,8,99,123}) {
16 aSet.insert(i);
17 }
18 for (int i : aSet) {
19 std::cout << i << std::endl;
20 }
21 return EXIT_SUCCESS;
22 }
Range-based for loops are just syntactic sugar (recall that
syntactic sugar is shorthand that does not
introduce any new functionality; it just makes for less writing)
for a regular loop with an explicit iterator. The last loop in the
example is equivalent to:
1 {
2 std::set<int>::iterator it = aSet.begin();
3 std::set<int>::iterator itEnd = aSet.end();
4 while (it != itEnd) {
5 int i = *it;
6 std::cout << i << std::endl;
7 ++it;
8 }
9 }
E.6.7 Lambdas
C++11 introduces support for lambda expressions—anonymous
functions, which can capture variables from their surrounding
environment (forming a closure—a function with a captured
variable environment). This construct gets its name from the
lambda calculus (λ-calculus), which is the theoretical
foundation for functional programming, where such anonymous
functions are common.
In C++, a lambda expression begins with its capture list
inside of square brackets (which we will discuss shortly),
followed by its parameter list in parentheses (just like a normal
function’s parameter list), a few optional attributes, its return
type (which may or may not be required, as we will see
shortly), and then its body in curly braces (just like a normal
function). The expression has a unique anonymous type, which
has operator() overloaded, as well as a few other members.
Invoking operator() executes the function.
Let us see an example to make things more concrete:
1 int main(void) {
2 auto f = [] (int a, int b) { return a + b; };
3
4 std::cout << f(3,4) << "\n";
5
6 return EXIT_SUCCESS;
7 }
The first line of main declares a variable f, whose
type is auto, meaning we have requested the compiler to
determine it from the initializing expression (see Section E.6.3).
We then initialize f with a lambda expression. The capture list
is empty (it is []), and we have two parameters (the parameter
list is (int a, int b)). The body of the lambda is then
{ return a + b; }. Note that the semicolon at the end of the
line ends the declaration/initialization statement.
At this point, f is a function that takes two ints and
returns their sum. The next line of main prints the result of
f(3,4), which invokes the anonymous function, and computes
7.
This use of lambdas shows the basic syntax and behavior,
but does not show why we might want them. To see them in
action, remember that we might want to parameterize an
algorithm over some piece of what it does. In C, we use
function pointers (see Section 10.3) to accomplish such
parameterization. In C++, we generally prefer to create objects
that declare an operator()—note that this is exactly what
lambdas do, just with shorter syntax.
To see a more useful example of a lambda, suppose we want
to write a function that takes a vector of strings, and checks if
every element of the vector has the letter ’a’ in it. We could
just write a loop by hand, or we could use std::all_of, which
checks if all items between two iterators meet some condition
—where that condition is expressed by a function passed in.
This function takes an item from the collection and returns
true (if it meets the desired condition) or false (if it does not).
We can write this code using a lambda like this:
1 bool allHaveA(std::vector<std::string> & v) {
2
3 return std::all_of(v.begin(),
4 v.end(),
5 [] (const std::string & s) {
6 return s.find('a') != std::string::npos;
7 });
8 }
Observe that the third argument to all_of is a lambda
expression, which takes one argument (a const string reference)
and returns a bool (indicating whether or not ’a’ was found in
the passed-in string).
We could write this code without a lambda, by declaring
an explicit class with an overloaded operator(). Such an
implementation might look like this:
1 class CheckForA {
2 public:
3 bool operator() (const std::string & s) const {
4 return s.find('a') != std::string::npos;
5 }
6 };
7 bool allHaveA(std::vector<std::string> & v) {
8
9 return std::all_of(v.begin(),
10 v.end(),
11 CheckForA());
12 }
The two pieces of code do basically the same thing—the
latter just explicitly declares the class and gives it a name, and is
rather less convenient to write.
Now, suppose that we wanted to make our function a bit
more general—so that it checks if all elements of the vector
contain any letter, not just ’a’. We would need to pass a
parameter to it, and use that parameter inside the lambda’s
body:
1 bool allHaveLetter(std::vector<std::string> & v, char c) {
2
3 return std::all_of(v.begin(),
4 v.end(),
5 [c] (const std::string & s) {
6 return s.find(c) != std::string::npos;
7 });
8 }
Note that in this modified function, the lambda has a non-
empty capture list—we wrote [c], telling the compiler that the
lambda must “capture” the value of c when it is created.
Capturing the value means that it is copied (by value) into the
closure when the closure is created. Now, our code is basically
equivalent to this code:
1 class CheckFor {
2 const char c;
3 public:
4 CheckFor(char x): c(x) {}
5 bool operator() (const std::string & s) const {
6 return s.find(c) != std::string::npos;
7 }
8 };
9 bool allHaveLetter(std::vector<std::string> & v, char c) {
10
11 return std::all_of(v.begin(),
12 v.end(),
13 CheckFor(c));
14 }
Note how the constructor now takes a parameter, and
stores that value in an instance variable, which is then used in
the operator(). We can also capture by reference ([&c]), and
can capture multiple different values, in different ways (e.g.,
[a, &b, c] captures a and c by value, and b by reference). We
can also specify default capture—meaning that all variables in
the body of the lambda that reference declarations outside of
the lambda are captured in the default fashion we
request—either by value (by placing a = sign at the start of the
capture list), or by reference (by placing a & at the start of the
capture list).
We have not explicitly written the return types for our
lambdas, as all of them have had bodies for which the compiler
can deduce the return type. If we need to explicitly write down
the return type, then we do so by placing -> type between the
parameter list and the body. For example, we could write down
the return type of our first example like this:
1 int main(void) {
2 auto f = [] (int a, int b) -> int { return a + b;};
3
4 std::cout << f(3,4) << "\n";
5
6 return EXIT_SUCCESS;
7 }
You can also place mutable after the parameter list to
specify that the lambda’s body is allowed to modify the
captured information—otherwise it behaves as if all captured
values are const, and the operator() behaves as if it were
declared with const on the end (as in the examples above
where we showed the equivalent explicit class declaration). You
can also place an exception specification after mutable (if you
use mutable, or after the parameter list, if you do not).
E.6.8 Overriding Control
C++11 gives special meaning to two identifiers when they are
placed in particular locations. The first, override, is used with
virtual functions (methods) and tells the compiler that the
function being declared is intended to override an inherited
function. If that function is not actually overriding something,
an error is produced. To see why this is useful, consider the
following legal,10 but probably erroneous example:
1 class Parent {
2 public:
3 virtual void f() const {
4 //code
5 };
6 };
7
8 class Child: public Parent {
9 public:
10 //does not actually override f() const
11 //instead is an overloading
12 virtual void f() {
13 //code
14 }
15 };
However, if we make use of C++11’s override
declaration, we can explicitly tell the compiler that we are
expecting the child’s method to override something from the
parent, like this:
1 class Parent {
2 public:
3 virtual void f() const {
4 //code
5 };
6 };
7
8 class Child: public Parent {
9 public:
10 //does not actually override f() const
11 //instead is an overloading
12 virtual void f() override {
13 //code
14 }
15 };
Now, the compiler will check and find we did not actually
override anything, and produce an error:
error: ’virtual void Child::f()’ marked override, but
does not override
virtual void f() override {
Explicitly specifying that a method is expected to override
an inherited method is not only useful in catching initial errors,
but avoiding future errors when the code changes. For example,
the code above may have been initially written with
Parent::f() not being declared const, and Child::f()
correctly overriding it. However, at some later point in time,
another programmer added the const modifier to Parent::f().
Such a change would silently cause strange behaviors if Child
objects were being used polymorphically, and f() called on
them.
The other special identifier is final, which may be used in
either a class declaration, or a virtual function declaration.
Declaring a class as final tells the compiler that no other class
may inherit from it. Declaring a virtual method as final declares
that it may not be overridden in any child classes. For example:
1 class A {
2 public:
3 virtual void f() {
4 //code
5 }
6 };
7 class B : public A {
8 public:
9 //children of B cannot override f() further
10 //also note that we expect f() to override something
11 virtual void f() final override {
12 //other code
13 }
14 };
15 class C final : public A {
16 //members
17 };
18 class D final {
19 //members
20 };
In this example, B declares f as final. Other classes may
still inherit from B, but none may override f if they do so (we
also declare f as override because it is correct to do so).
Classes C and D are declared final—no other class can inherit
from them at all.
E.6.9 Smart Pointers
C++03’s std::auto_ptr provides some basic RAII11
capabilities for data that is referenced by pointers. The
auto_ptr is initialized by passing a pointer into its constructor,
which it then deletes in its destructor. The benefit of such a
design is that the auto_ptr (which is either in a stack frame or
inside of an object—by value) will be automatically destroyed
when the containing object or frame is destroyed, causing its
destructor to deallocate the corresponding memory.
The difficulty with std::auto_ptr is that its copy
constructor and copy assignment operator do not actually make
copies—instead, they transfer ownership of the pointer from
one auto_ptr to the other, leaving the original one “empty” (its
underlying pointer is set to NULL). The consequence of this
design is that what you would expect to make a copy does not
actually do so—instead, it modifies the original source.
C++11 deprecates12 std::auto_ptr, and adds three new
smart pointers: std::unique_ptr, std::shared_ptr, and
std::weak_ptr.
The first of these, std::unique_ptr, is a smart pointer that
retains sole ownership of a pointer—when it is destroyed, it
deletes the underlying pointer. Unlike std::auto_ptr,
however, unique pointers cannot be copied (the copy
constructor and copy assignment operator are deleted—a new
feature of C++11, which removes the default ones and does not
replace them with anything). They can, however, be explicitly
moved (with std::move) to transfer ownership. The new
unique pointers also add support for arrays, which was missing
in the old auto pointers. See
http://en.cppreference.com/w/cpp/memory/unique_ptr for
more details.
The second of these, std::shared_ptr, allows for multiple
instances to share an underlying pointer, and tracks how many
such instances there are (called reference counting). When one
individual copy is destroyed (or has its underlying pointer
changed), the reference count is decremented. The underlying
pointer is only deleted if the reference count reaches zero.
These can be copied (which increments the reference count, as
there is now another shared pointer managing the pointer). See
http://en.cppreference.com/w/cpp/memory/shared_ptr for
more details.
The third of these, std::weak_ptr, holds a reference to
memory managed by std::shared_ptr, but in such a way that
it does not increment the reference count—that is, it does not
count as a reference. Because the weak pointer does not count
as a reference, the memory it refers to may be deallocated while
the weak pointer still exists, so the weak pointer class does not
allow direct access to this memory. Instead, the weak pointer
must be “upgraded” to a shared pointer, using its lock method
—which either returns a std::shared_ptr, which references
the same memory (if that memory has not been deallocated), or
an empty shared pointer (if the memory has been deallocated
already). Weak pointers are useful for more complex resource
management situations, which we will not delve into here;
however, they are good to know about in case you ever find
yourself needing them. See
http://en.cppreference.com/w/cpp/memory/weak_ptr for
more details.
E.6.10 Static Assertions
C++11 introduces static_assert, which is an assert checked
at compile time (by contrast, the regular assert is checked at
runtime). static_assert takes a boolean to check (which must
be a compile time constant) and a string literal, for the error
message to print out (which also must be a compile time
constant). This construct can be quite useful for templates,
where the programmer may wish to check that the template is
only used with arguments that meet certain requirements.
E.6.11 Variadic Templates
C++11 introduces variadic templates—that is, templates that
can take variable length argument lists. The simplest such use
of these templates is the std::tuple class, which generalizes
the std::pair class to a tuple with any number of elements.
The class declaration for std::tuple looks generally like this:
1 template <typename... Types> class tuple; //... indicates 0 or more type arguments
The ... introduces the variable length argument list. Use
of the tuple template is relatively straightforward:
1 #include <tuple>
2 #include <iostream>
3 #include <string>
4 #include <cstdlib>
5
6 int main (void) {
7 std::tuple<int, double, std::string> t1(42, 3.14, "Hello World");
8 std::cout << "element 0 is " << std::get<0>(t1) << "\n";
9 std::cout << "element 1 is " << std::get<1>(t1) << "\n";
10 std::cout << "element 2 is " << std::get<2>(t1) << "\n";
11 return EXIT_SUCCESS;
12 }
The above example constructs a 3-tuple (with an int, a
double, and a string), and then uses std::get to
extract the elements. The integer template
argument to get specifies which element of the tuple should be
obtained. The output is intuitive:
element 0 is 42
element 1 is 3.14
element 2 is Hello World
Writing your own variadic template is a bit more complex
due to the way that parameter packs (the group of template
parameters in Types above) are handled. Pretty much the only
thing that can be done with them is to expand them into all of
their constituent typenames to pass them as arguments to
another template (there are other things, but they do not help
with the “basics” of writing a variadic template). To make use of
them, we write recursive template specializations (remember:
the compiler will expand template specializations at compile
time). We can see the basics of this concept with an extremely
simplified version of the tuple template class:
1 template <typename... Types> class tuple;
2
3 template <> class tuple<> {
4 };
5
6 template <typename First, typename... Rest>
7 class tuple<First, Rest...> : public tuple<Rest...> {
8 First data;
9 public:
10 tuple(const First & first, const Rest &... rest) : tuple<Rest...>(rest...), data(first) {}
11 };
The first line declares the primary template, as an
incomplete type. The declaration on lines 3 and 4 is the base
case of our template recursion. This specialization is used in the
case where the parameter pack is empty, and declares an empty
class. The remainder declares the recursive case of the
template. It matches a template instantiation with at least one
argument, the first of which is bound to First, and the
remainder of which are bound to Rest. It then recursively
instantiates tuple with the expansion of the parameter pack
(Rest...) as the parent class. This class has one field for the
corresponding data, but also inherits fields from its parents for
the other data elements. The constructor takes a number of
parameters that matches the number of types the tuple is
instantiated with. It uses the parent constructor
(tuple<Rest...>(rest...)) to initialize all of the rest of the
fields, and then initializes its own field.
E.7 goto
C and C++ both include the ability for the programmer to write
goto label; (where label is an identifier specifying a label
declared elsewhere), which causes the execution arrow to
immediately jump from that statement to the label. You should
pretty much always avoid goto—in fact, the renowned computer
scientist and mathematician Edsger Dijkstra wrote a famous letter
called “Go To Statement Considered Harmful,”13 which begins:
For a number of years I have been familiar with the
observation that the quality of programmers is a
decreasing function of the density of go to statements in
the programs they produce. More recently I discovered
why the use of the go to statement has such disastrous
effects, and I became convinced that the go to statement
should be abolished from all “higher level” programming
languages (i.e. everything except, perhaps, plain machine
code).
We will note that goto is prevalent in the Linux kernel,
primarily to remove code duplication for cleanup when exiting
a function from multiple places (e.g., errors detected midway
through a function that require releasing resources already
allocated). By the time you are an advanced enough
programmer to write kernel code, you will be able to
understand the tradeoffs involved in such a decision and likely
have significant experience with assembly (where there are no
higher level control constructs like loops and if statements—
only conditional and unconditional branches). Until such time,
you should not use goto in any program you write.
Appendix F
Compiler Errors Explained
This Appendix is intended to help you understand and fix
common compiler errors that you may encounter. The
messages are ordered alphabetically. Since many messages can
occur on various identifiers (e.g., different variables may be
undefined), ’id’ is used whenever an identifier appears in the
message, and that message is alphabetized accordingly (so if
you have an error about an identifier named x, look in
alphabetical order under ’id’, i.e., under i, not under x).
These error messages come from gcc—you will get
different wording from other compilers (including LLVM,
which is what you generally get if you type gcc on Mac OSX).
This list is by no means complete, but we have included many
common ones. If you come across an error that is not listed
here, check http://aop.cs.cornell.edu/ to see if we have
already posted it, and if not, let us know what it is.
• #include expects "FILENAME" or <FILENAME>
You wrote a #include directive, but the filename
following it had neither <> nor "" around it. It should
look like #include <stdio.h> for standard include
files, or #include "myHeader.h" for ones you wrote
yourself.
• C++’>>’ should be ’> >’ within a nested
template argument list
Whitespace is required between two close angle
brackets for nested template arguments. The designers
of C++ made this rule to simplify the problems of
distinguishing the two close brackets for nested
templates from the >> operator during parsing. See
Section 17.3.4.
• a label can only be part of a statement and a
declaration is not a statement
Due to a quirk of the C language, you cannot place the
declaration of a variable immediately after a label (such
as a case label). Instead of writing
1 case 1:
2 int y;
3 …
4 break;
either write:
1 case 1: {
2 int y;
3 …
4 break;
5 }
or declare y outside of the switch statement.
• array subscript is not an integer
This error means that you are trying to index into an
array with a type which is not an integer—such as
float, double, a pointer, a struct, etc…—which is
illegal. Remember that the array subscript (the number
inside the square brackets) specifies which element of
the array you want. Requesting element 3.47 of
an array does not make sense.
– If the subscript is a real type (float, or
double), then maybe you meant to convert it to
an int first. You can do so either by explicitly
casting, or making a new int variable and
assigning to it (which implicitly casts). If you
want to round in a different way than truncation,
you should use the appropriate function for the
rounding you want.
– If the subscript is a pointer to an integer type,
you may have meant to dereference it.
– If the subscript is a struct, you may have
meant to select a particular member out of it
with the dot (.) operator.
– If it is a pointer to a struct, you may have
meant to select a particular member out of it
with the arrow (->) operator.
• assignment discards qualifiers from pointer
target type
This error message means that you are assigning a
pointer declared with modifiers (also called
“qualifiers”) on its type (such as const) to a pointer that
does not have these modifiers. Such a situation is an
error because it allows you to create an alias for a box
which accesses it in a different way than it was
originally declared. For example, if you write
1 const char * str = "something";
2 …
3 …
4 char * modifiable = str; //illegal!
5 modifiable[0] = 'X'; //would be legal, but bad.
The code exhibits improper behavior (modifying read-
only locations), but the only place that the error can be
noticed by the compiler is on line 4, where the
assignment converts a const char * to a char *. The
modification on line 5 is legal since modifiable points
at (mutable) chars, not const chars.
• assignment makes integer from pointer without
a cast
This error means that you have an assignment statement
where the right side has a pointer type (such as int *,
char *, int **, etc…), and the left side names a box
which has an integer type (such as int, unsigned int,
long, …). Typically this problem arises either (1)
because you forgot to dereference the pointer on the
right side:
1 int * p = …;
2 int x = …;
3 …
4 x = p; //maybe you meant x = *p;
or (2) because you have a string that you expect to have
the textual representation of a number, and have
forgotten that you cannot perform the conversion with a
cast (either implicit or explicit). See Section 10.1.5 for a
detailed discussion of (2).
Note that inserting an explicit cast is almost never
the solution you want. Explicitly casting from the
pointer type to int is only the right answer if you plan
to explicitly manipulate the address numerically, in
ways other than standard pointer arithmetic. In such a
case (which would generally mean you are sufficiently
advanced in your programming skill that you no longer
need to consult this reference for common errors), your
integer type should be intptr_t or uintptr_t.
• assignment makes pointer from integer without
a cast
This error means that you have an assignment statement
where the right side has an integer type (such as int,
unsigned int, long, …), and the left side names a box
which has a pointer type (such as int *, char *,
int **, etc…). This error typically either means that
(1) you forgot to dereference the left side:
1 int * p = …;
2 int x = …;
3 …
4 p = x; //maybe you meant *p = x;
or (2) that you forgot to take the address of the right
side:
1 int * p = …;
2 int x = …;
3 …
4 p = x; //maybe you meant p = &x;
Draw a picture of what you are trying to do, and see if it
corresponds with either of these changes. If it is one of
these two, then correct it appropriately. If it is not,
carefully think about how your manipulations of the
picture correspond to pointer operations, and correct
that part of the code accordingly. See the note in the
previous error about why a cast is not the correct
solution for anything you are likely to be doing.
• assignment of read-only variable ’id’
OR
assignment of read-only location
Either of these two error message means that you are
trying to modify a location (“box”) that you have said
cannot/should not be modified—typically because the
way you are trying to name the box has been declared
with the const modifier. The first message is what you
get if you have a variable declared const and try to
assign to it. The second message arises when you refer
to the box indirectly (e.g., through a const pointer). If
the pointer is const because it points at a string literal,
and you need to modify that string, you need to copy
the string into a modifiable area of memory (see Section
10.1.1 and Section 10.1.2).
Otherwise, if you are sure that you should be
modifying this box, think carefully about why it was
declared const in the first place, and if you are
convinced that that modifier can be removed, then you
can fix this error by doing so. Note that this approach
may cause other errors if this name is an alias for
another name which was also declared const.
You may also find that this box really should be
const, and thus you should not be modifying it. In this
case, you should rethink your algorithm and decide
what you should be modifying (or if you need to make a
copy of this box into another box, that you can modify).
• break statement not within loop or switch
You wrote break, but it is not inside of a loop (for,
while, or do-while) nor a switch, so there is nothing to
break out of. Often this error happens when curly
braces do not match as you expect them to, thus what
you expect to be inside a loop or switch is actually not.
• both ’long’ and ’short’ in declaration
specifiers
You declared a variable with both long and short
which does not make sense (e.g., you wrote
long short int x;). Pick the one that you meant, and
delete the other.
• C++cannot dynamic_cast ’expr’ (of type
’type1’) to type ’type 2’ (source type is not
something)
You can only use dynamic_cast to cast between
pointers or references to class types that have at least
one virtual function (and thus have a vtable in which to
store the type information required to perform the
runtime checks required by dynamic_cast). First, be
sure your source type is a pointer or reference to a class
type. If so, check if you have virtual functions in the
base type. Remember that you should always have a
virtual destructor for any type that you are using
polymorphically.
• C++cannot return from a handler of a
function-try-block of a constructor
If you write a function-try-block in a constructor to
handle exceptions, the handlers may not return—they
can only throw an exception (whether by implicitly
rethrowing the original exception, or by explicitly
throwing a different exception). This requirement exists
because an exception caught during the initializer list
means the object could not be properly initialized, and
thus is incomplete—it cannot be used, so the exception
must propagate to wherever the object is being created.
See Section 19.3.
• comparison between signed and unsigned integer
expressions
A comparison (such as <, <=, etc) is being made where
one operand is a signed integer type and the other is an
unsigned integer type. Comparing signed and unsigned
values results in (typically) undesirable behavior due
to the implicit conversion as discussed in Section 3.4.1.
• continue statement not within a loop
You wrote continue, but it is not inside of a loop (for,
while, or do-while) so it does not make sense—there is
no loop to go to the next iteration of. Often this error
happens when curly braces do not match as you expect
them to, thus what you expect to be inside a loop is
actually not.
• control reaches end of non-void function
This error message means that the execution arrow can
reach the close curly brace of a function (whose return
type is something other than void) without
encountering a return statement. This situation is
problematic because the function has promised to return
an answer, but can finish without saying what the
answer is. Typically, the programmer has either
forgotten to write a return statement at the end of the
function, or has return statements inside of conditional
statements where the compiler cannot convince itself
that execution will encounter a return statement in all
cases. Note that the compiler does not perform
sophisticated algebraic reasoning about the
relationships between the conditional expressions of if
statements, so you may encounter this error in cases
where you can convince yourself that all cases are
covered, but the compiler cannot. If this happens, add
some code to print an error message and call abort();
to the end of the function, which will abort the program
if the execution arrow reaches it (which would indicate
your reasoning about the function was incorrect, and
something needs to be fixed).
• crosses initialization of ’type id’
This is typically the second line of a multi-line error
message, such as jump to case label crosses
initialization of ’type id’. Look at the first line of the
message, and find what to fix from that.
• declaration for parameter ’id’ but no such
parameter
This error message typically means that you forgot the
open curly brace ({) at the start of the body of a
function. The error message is phrased this way, as
ancient versions of C declared parameters differently,
and the first variable you declare inside the function
that is missing its { appears to be from this style of
parameter declaration.
• C++duplicate base type ’id’ invalid
This error indicates that you tried to multiply inherit
from the same parent class multiple times. For example
(where ’id’ would be A):
class B: public A, public A
If you actually meant
to try to multiply inherit from the same class multiple
times, your inheritance hierarchy is likely quite flawed,
and you should redesign it.
• C++dynamic_cast of ’something’ to
’otherthing’ can never succeed
You are trying to cast between types which are
unrelated by inheritance, and therefore cannot possibly
satisfy the runtime checks of dynamic_cast. You may
want to read Section E.4.2 and Chapter 18, then rethink
what you are doing. If you are sure you want to convert
between unrelated types, you will need to use a
different cast operator; however, odds are good that you
will not like the results.
• ’else’ without a previous ’if’
You have an else that does not come immediately after
the “then clause” of an if. Often this error happens
when your curly braces do not match as you expect
them to.
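As a hypothetical illustration, a statement slipped between the "then clause" and the else detaches them; moving it after the whole if/else fixes the error:

```c
/* Broken version (does not compile):
     if (x > 0) { return 1; }
     x = x + 1;            <-- this line detaches the else from the if
     else { return 0; }
   Corrected version: */
int isPositive(int x) {
  if (x > 0) {
    return 1;
  }
  else {
    return 0;
  }
}
```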
• expected ’{’ at end of input
You are missing an open curly brace, and the compiler
noticed this problem when it got to the end of your
source code file.
• fatal error: no input files
You did not specify the .c (or .cpp) file(s) that you
wanted to compile. It is also possible that the compiler
does not think you specified an input file because you
wrote the input file name in a position that is the
argument of some other option. For example, if you
wrote gcc -o myprogram.c, then gcc interprets
myprogram.c as the argument to the option -o (i.e., the
name of the output file to produce).
• ’for’ loop initial declaration used outside
C99 mode
Declaring a variable inside a for loop (for (int i =
0; ...)) is a newer feature of C. Use --std=gnu99 in
your compilation command line.
• format not a string literal and no format
arguments
This error means that you have passed a format string
(i.e., the first argument of printf or a similar function)
as a variable (not a string literal) and it is the only
argument to printf—that is, you wrote printf(str);.
You should do printf("%s", str); instead. See
Section 10.4 to understand why.
• ’id1’ is an inaccessible base of ’id2’
This error means you are trying to use polymorphism
with a class that inherited with non-public (protected or
private) inheritance, in a circumstance not permitted by
the access modifier you used. For example, if you have
the following two classes:
1 class A {
2 //...
3 };
4 class B : private A {
5 //...
6 };
7 //...
then you cannot use polymorphism to assign a B pointer
(or reference) to an A pointer or reference, unless you
are inside of the B class (or one of its friends). Most
likely, you want to just use public inheritance, which
will fix this problem.
• ’id’ redeclared as different kind of symbol
This error means that you have redeclared the same
variable (or function name) twice in incompatible ways.
For example, if you write
1 int myFunction(int x) {
2 int x;
3 …
4 …
5 }
then you have two declarations of x—one as a
parameter, and one as a local variable. If you actually
want two different boxes, then give them different
names. If you meant them to be the same, remove the
extra declaration.
• ’id’ undeclared (first use in this function)
The variable id has not been declared at this point in
the code. The most likely reasons are:
1. You may have misspelled the variable’s name
(either at this point or when you declared it).
2. You forgot to declare the variable at all.
3. The variable’s declaration was not in this scope.
4. The compiler was confused by another error
when it was trying to process the declaration, so
it skipped over it.
Note that this is often followed by the messages
(Each undeclared identifier is reported only
once
for each function it appears in.)
which are not truly errors, but the compiler telling you
that it will not report errors for future uses of this
variable in the same function.
• implicit declaration of function ’id’
The compiler has found a call to a function which you
have not declared (not provided a prototype for). If you
encounter this error message for a function which is
part of the C library (e.g., printf, strcmp, exit, etc.), it
almost always means that you either forgot to #include
the correct header (.h) file, or that you mis-spelled the
function’s name.
• (C++) In constructor ’Something::Something(int,
int)’:
’Something::z’ will be initialized after
’int Something::y’
when initialized here
The order that fields are initialized in your initializer list
does not match the order in which they were declared in
your class. This mismatch can result in problems as the
fields are initialized by their declaration order, not the
order that they appear in the initializer list. Please see
Section 15.1.4 for more details.
• initialization discards qualifiers from
pointer target type
See “assignment discards qualifiers from pointer target
type” and note that this message only differs in that it
appears when the assignment is the initialization of a
variable in the same statement that the variable is
declared in.
• initialization makes integer from pointer
without a cast
See “assignment makes integer from pointer without a
cast” and note that this message only differs in that it
appears when the assignment is the initialization of a
variable in the same statement that the variable is
declared in.
• initialization makes pointer from integer
without a cast
See “assignment makes pointer from integer without a
cast” and note that this message only differs in that it
appears when the assignment is the initialization of a
variable in the same statement that the variable is
declared in.
• integer constant is too large for its type
You wrote a constant that is too big for the type it is
being used as. First, make sure there is a type for which
you can write a literal constant that is compatible with
the numeric value you wrote (they generally cannot be
more than 64 bits for integers). If so, write the
appropriate suffix on the constant (e.g., ULL for
unsigned long long).
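For instance (assuming a platform where int is 32 bits), this constant needs a 64-bit type, and the ULL suffix makes that explicit:

```c
/* 10000000000 exceeds the 32-bit int range (about 2.1 billion), so
   without a suffix the compiler may complain; ULL requests
   unsigned long long, which can hold the value. */
const unsigned long long big = 10000000000ULL;
```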
• invalid digit "8" in octal constant
invalid digit "9" in octal constant
A number that starts with 0 (such as 0123) is an octal
(base 8) constant. You cannot use 8 or 9 in such a
constant.
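A short sketch of the difference; the variant described in the comment would trigger this error:

```c
/* A leading 0 makes a constant octal: 0123 is 1*64 + 2*8 + 3 = 83.
   Writing 0129 would not compile, because 9 is not an octal digit.
   If the leading zero was only meant for visual alignment, drop it. */
const int octalValue = 0123;   /* 83 in decimal */
const int decimalValue = 123;
```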
• invalid operands to binaryop (have ’type1’ and
’type2’)
You are trying to use a binary operator (op is something
like +, -, *, /, %, …) with types that are incompatible.
For example,
1 int * p = …;
2 float f = …;
3 p = p + f; //error on this line
Here, p is a pointer to an int, and f is a float. Trying to
add them together is not permitted, since there is no
meaningful interpretation of + when the operands are a
pointer and float—you cannot perform pointer
arithmetic in that way. Think through what you are
trying to accomplish here, and what it is you want to
add. If you are trying to add two structs, remember
that it must be done element by element.
Note that C++ will give you a slightly different
error message, and has slightly different considerations.
• (C++) invalid operands of types ’type1’ and
’type2’ to binary ’operatorop’
This error is very similar to the slightly different C error
—which you should read. The major difference when
fixing this type of error in C++ is to consider operator
overloading. It is possible that you may have tried to
overload the operator in question for the types you
wanted to use, but had some problem with that
declaration. If so, you should find and fix the
declaration of the overloaded operator you think should
be used. If not, you may wish to fix this error by
overloading the operator in question for those types, if
that makes sense.
• invalid suffix "X" on floating constant
invalid suffix "X" on integer constant
You wrote a constant with an invalid suffix letter, such
as 3X or 4.2a. For integers, legal suffixes are typically U
(unsigned), L (long), and LL (long long). The L and LL
suffixes can be combined with U (UL or ULL). For
floating point constants, the suffixes can typically be F
(float) or L (long double). If you meant to write a
hexadecimal (hex) integer, be sure to start it with 0x
(such as 0x123A).
• ISO C forbids empty initializer braces
You wrote something like int x[6] = {};, presumably
to try to initialize all elements of the array to 0.
Technically, the C rules do not allow the empty braces
(though many compilers will accept it). You should
instead write int x[6]={0};
• jump to case label case XX: crosses
initialization of ’type id’
You have a variable whose scope is the entire switch
statement, but whose declaration and initialization
appears inside one particular case. Either move the
declaration outside of the switch statement, or wrap the
case in {}s so that the scope of the variable is limited to
that case.
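A sketch of the second fix, wrapping the case body in braces (the function and values here are hypothetical):

```c
/* Without the inner { }, "int n" would be in scope for the whole
   switch, and jumping to default: would cross its initialization.
   The braces limit n's scope to this one case. */
int describe(int code) {
  switch (code) {
    case 1: {
      int n = code * 10;
      return n;
    }
    default:
      return -1;
  }
}
```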
• large integer implicitly truncated to unsigned
type
You have a very large integer constant, which was not
suffixed to indicate an unsigned type, but cannot fit into
the signed type that the compiler tried to give it. First
make sure that it can fit into a type which it is legal to
write literal values for (typically at most 64-bits), then
apply an appropriate suffix (e.g., ULL for unsigned long
long) to the constant.
• lvalue required as left operand of assignment
lvalue is the technical term for “something that names a
box.” If the left operand of an assignment (i.e., what is
supposed to name the box you want to change) is not an
lvalue—it does not name a box—then the assignment
statement makes no sense. For example, if you write
3=4, you cannot change the value of 3 to be 4 (3 is not
an lvalue: it does not name a box). This error can also
come up when you write something like x+y=3. x+y
does not name a box—it is an expression which
evaluates to a value. Novice programmers often write
something like this when they want x+y to be 3, but this
is not specific enough. You need to explicitly say which
variable to change and how to make this happen. For
example, x=y-3 (if you want to change x) or y=x-3 (if
you want to change y) might be two ways to accomplish
this goal. You need to think about how you want to
change things in a step-by-step fashion, and correct
your code.
• missing terminating " character
You are missing the " character which ends a string
literal, or it appears on the next line (you may not have
a newline inside a string literal. If you want the string to
contain a newline, use the \n escape sequence). You
may have also forgotten to escape a " that you wanted
to appear literally in the string (e.g.,
"Quotation marks look like \".").
• no input files
See “fatal error: no input files”.
• old-style parameter declarations in prototyped
function definition
This error message typically means that you forgot the
open curly brace ({) at the start of the body of a
function. The error message is phrased this way, as
ancient versions of C declared parameters differently,
and the first variable you declare inside the function
that is missing its { appears to be from this style of
parameter declaration.
• operation on ’id’ may be undefined
You have a statement which both uses and modifies ’id’
in a way that is undefined. For example, if you write
a[x++] = x; then it is undefined whether the change to x
caused by x++ happens before or after the value of x is
read to evaluate the expression on the right side of the
statement. You should break your code down into
multiple statements, each of which changes exactly one
’box’.
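For example, the undefined a[x++] = x; can be rewritten so that each statement changes exactly one box. This split is one possible intended meaning; pick the order that matches what you actually want:

```c
/* One well-defined rewrite: read the old index, advance x, then
   store x's new value. Each statement changes exactly one box. */
void storeAndAdvance(int * a, int * x) {
  int i = *x;    /* remember the index to store into */
  *x = *x + 1;   /* now update x */
  a[i] = *x;     /* store the updated value of x */
}
```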
• overflow in implicit constant conversion
You are assigning a constant value to a variable whose
type cannot hold the value that you wrote. The compiler
is warning you that overflow (see Section 3.4.3) will
occur in the assignment, and you will not actually end
up with the variable holding the value that you wrote.
• parameter ’id’ is initialized
This error message typically means that you forgot the
open curly brace ({) at the start of the body of a
function. The error message is phrased this way, as
ancient versions of C declared parameters differently,
and the first variable you declare and initialize inside
the function that is missing its { appears to be from this
style of parameter declaration.
• (C++) request for member ’id’ is ambiguous
The compiler has multiple possible members (fields or
methods) named ’id’ that it could choose from here, and
all are equally “good,” so it cannot decide. This error
might occur if you are using multiple inheritance, and
inherit from two classes with identically named
members and then try to use that member in the child
class (see Section 29.2.2).
• return type of ’main’ is not ’int’
You have declared the main function with a return type
other than int, which is not correct. The main function
should always return an int, which should generally be
either EXIT_SUCCESS or EXIT_FAILURE, as it indicates
the success or failure status of the program.
• stray ’\’ in program
You have a \ somewhere in your program that it does
not belong.
• stray ’\##’ in program (where ## is a number)
You have an unusual character in your program. This
error most often comes up when you copy from
something that writes “fancy” quotation marks, and
paste it into your editor.
• suggest parentheses around assignment used as
truth value [-Werror=parentheses]
This error happens when you use an assignment (e.g.,
x=y) as a conditional expression—where the value of
the assignment expression is evaluated for “truth”. For
example, if you write if(x=y) you will get this error.
99.99% of the time, what you meant is if(x==y) (with
two equals signs, meaning to compare x and y for
equality). The compiler “suggests
parentheses” because if you did mean to use the
assignment expression as the conditional expression,
you should write if( (x=y) != 0) (or
if ( (x=y) != NULL)) to indicate that you explicitly
meant to do this.
• templates may not be ’virtual’
You cannot declare a templated method as virtual
(though you can declare virtual methods inside a
templated class). See Section 18.7.2 for more details,
and Chapter 29 for an understanding of why.
• too many decimal points in number
You wrote a floating point literal with more than one
decimal point, such as 3.4.5, which does not make
sense. Correct the literal to the number that you
intended.
• type of ’id’ defaults to ’int’
You declared ’id’ without a type, so the compiler
assumed ’int’. This warning is most common if you
declare a parameter without a type name, such as
int f(x) {...}. The compiler assumes that x is an
int, but gives you a warning since it is not sure that is
what you meant.
• (C++) uninitialized member
’ClassName::fieldName’ with ’const’ type
’const ty’
Your class has a field (fieldName) which has a const
type (which is either not a class, or is a POD type),
which was not initialized in the class’s initializer list.
const fields must be initialized in the initializer list. If
your class is a non-POD class, then a call to the default
constructor will be automatically inserted; however, we
recommend that you include an explicit initializer in the
list anyways.
• (C++) uninitialized reference member
’ClassName::fieldName’
Your class has a field (fieldName) which has a
reference type, but it was not initialized in the initializer
list of one or more constructors. As references must be
initialized, C++ requires that fields whose types are
references be initialized in the initializer list. Assigning
to them in the constructor is treated as a normal
assignment (not an initialization), and is not sufficient
to meet this requirement. See Section 15.1.4 for more
details.
• unterminated comment
You have a comment started by /* with no matching
*/ to close it.
• unused variable ’id’
You have declared the specified variable, but have not
used it anywhere in its scope (you may or may not have
assigned to it, but have not read its value anywhere).
The three most common causes of this error are (1) the
variable is no longer needed—you have removed code
that previously used it, but forgot to delete the
declaration. In this case, just delete the declaration. (2)
you may have mis-spelled the use(s) of it relative to the
declaration (possibly meaning you mis-spelled the
declaration, but spelled the uses correctly). This
situation is often accompanied by error messages about
undeclared variables, but may not be if you have
closely named variables such that the mis-spellings
appear correct (which is a bad idea—your variable
names should ideally be rather different from each other
within a scope). The solution here is to fix the spellings.
(3) you may have two variables of the same name with
overlapping scopes. In such a case, you may be
referencing the wrong variable (the other one in scope
with the same name) where you mean to use the one
that is being noted as un-used. You can fix this problem
by renaming your variables so they have unique names
within a scope (which is a good idea anyways).
Appendix G
Answers to Selected Exercises
G.1 Answers for Chapter 1
• Answer to Question 1.1 : An algorithm is a clear step-by-step set of instructions
to solve any problem in a particular class of problems.
• Answer to Question 1.2 : A parameter is an input to an algorithm that defines
which problem within the class should be solved.
• Answer to Question 1.3 : No amount of testing can ensure correctness. A test-
case can only find that something is wrong, or make us more confident that the
algorithm is correct—it cannot prove that it is correct. Proving an algorithm is
correct requires a mathematical proof.
• Answer to Question 1.4 :
1.
Count from 0 to 2*N+1 (inclusive).
For each number "i" that you count,
Write down 2*i + 3*N
2. 18 20 22 24 26 28 30 32 34 36 38 40 42 44
3. You should get the same answer as your friend. If not, find out why your
steps were ambiguous.
• Answer to Question 1.5 :
1.
(Note: coordinates start from (0,0) in the bottom left)
Count from 0 to N (inclusive)
For each number "i" that you count,
Draw a red box at (i,i)
Draw a red box at (i,i+1)
After you finish counting,
If N is even
Draw a green box at (N+1, N+1)
Otherwise
Draw a blue box at (N+2, N+1)
Draw a blue box at (N+2, N)
2. The output for N=6 is:
3. You should get the same answer as your friend. If not, find out why your
steps were ambiguous.
G.2 Answers for Chapter 2
• Answer to Question 2.1 : The output is:
c = 6
a = 3
b = 10
• Answer to Question 2.2 : The output is:
O X O
O X
X
• Answer to Question 2.3 : The output is:
In f(0, 0)
i = 0, answer = 2
In f(1, 2)
i = 1, answer = 7
In f(2, 7)
i = 2, answer = 42
In f(3, 42)
i = 3, answer = 294
• Answer to Question 2.4 : They evaluate to the following:
1. g(14, 7) evaluates to 91
2. g(9, 5) evaluates to 6
3. g(3, 0) evaluates to 9
4. g(2, 9) evaluates to 7
5. g(5, 5) evaluates to 5
6. g(27, 18) evaluates to 486
• Answer to Question 2.6 : The output is:
euclid(9135, 426)
euclid(426, 189)
euclid(189, 48)
euclid(48, 45)
euclid(45, 3)
euclid(3, 0)
x = 3
• Answer to Question 2.7 : The creation and destruction of stack frames is shown
in this video:
G.3 Answers for Chapter 3
• Answer to Question 3.1 : “Everything Is a Number” is an important rule for
programming because computers can only operate on binary integers.
Consequently, for anything that does not seem numeric, one has to find a way to
numerically represent it to write programs that deal with it.
• Answer to Question 3.2 : char is the type for a character: a symbol, such as a
letter, space, punctuation, or digits.
• Answer to Question 3.3 : floats and doubles are the types that can be used for
real numbers—numbers with significant digits after the decimal point. While
floats and doubles approximate real numbers, they are (unlike true real
numbers) finite in both range and precision. A double has more bits than a float,
and thus can store a larger range of values with more precision.
• Answer to Question 3.4 : A variable’s type tells the compiler (1) how much space
the variable needs in memory and (2) how to interpret the number stored in the
variable’s memory.
• Answer to Question 3.5 :
Binary   Hex  Decimal (Unsigned) Decimal (Signed)
00000000 0x0  0                  0
00111100 0x3C 60                 60
11001000 0xC8 200                -56
11010110 0xD6 214                -42
00110101 0x35 53                 53
01111111 0x7F 127                127
01100100 0x64 100                100
01010111 0x57 87                 87
• Answer to Question 3.7 :
1. a / b: type is int, value is 0
2. (double)(a / b): type is double, value is 1.0
3. a / (double)b: type is double, value is 1.25
4. (double)a / b, type is double, value is 1.25
5. a - b / 2: type is int, value is 2
6. a - b / 2.: type is double, value is 1.5
• Answer to Question 3.8 : If integer arithmetic results in a number too large to
fit in the integer type being computed on, then overflow occurs—the answer is
something mathematically illogical such as a negative number (for signed
integers) or a small positive integer (for unsigned numbers).
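A small demonstration with unsigned arithmetic (signed overflow is undefined behavior in C, so this sketch shows the well-defined unsigned wraparound instead):

```c
#include <limits.h>

/* Unsigned arithmetic wraps modulo 2^N: adding 1 to the largest
   unsigned int yields 0—a small result that is mathematically
   illogical as the sum of a huge number and 1. */
unsigned int wrapAround(void) {
  unsigned int biggest = UINT_MAX;
  return biggest + 1u;  /* wraps to 0 */
}
```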
• Answer to Question 3.12 : typedef unsigned long seq_t;
G.4 Answers for Chapter 4
• Answer to Question 4.1 :
1 int abs(int n) {
2 if (n < 0) {
3 return -n;
4 }
5 else {
6 return n;
7 }
8 }
• Answer to Question 4.5 :
1 int isPow2(int n) {
2 if (n < 1) {
3 return 0;
4 }
5 while (n != 1) {
6 if (n % 2 != 0) {
7 return 0;
8 }
9 n = n / 2;
10 }
11 return 1;
12 }
G.5 Answers for Chapter 5
• Answer to Question 5.2 :
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main (void) {
5 for (int i = 1; i <= 1000; i++) {
6 printf("%d\n", i);
7 }
8 return EXIT_SUCCESS;
9 }
G.6 Answers for Chapter 6
• Answer to Question 6.1 : The purpose of testing is to find errors (“bugs”) in our
code. Accordingly, if we can write a test case that fails, we have found an error in
our code, and thus written a successful test case.
• Answer to Question 6.4 : One possible error that the list with negative numbers
in it might expose would be if an unsigned type were used in a place where a
signed type should have been used. Many very large integers might expose
problems related to overflowing a variable.
• Answer to Question 6.6 : I have:
Statement coverage after test cases 1, 2, and 3.
Decision coverage after test cases 1, 2, 3, and 4.
Path coverage after test cases 1, 2, 3, 4, and 5.
G.7 Answers for Chapter 7
• Answer to Question 7.2 :
1 //helper with parameters for all the variables we use in the loop
2 double mySqrt_helper(double d, double lo, double guess, double hi) {
3 //this is the opposite of the loop condition
4 if (fabs(guess * guess - d) <= 0.000001) {
5 return guess; //return the answer (what was after the loop)
6 }
7 //this if statement was in the original code
8 if (guess * guess > d) {
9 //so was the computation of temp
10 double temp = (guess - lo) / 2 + lo;
11 //update of variables for next iteration
12 //d: unchanged (=d), lo: unchanged (=lo), guess = temp, hi = guess
13 return mySqrt_helper(d, lo, temp, guess);
14 }
15 else {
16 //temp computation from original code
17 double temp = (hi - guess) / 2 + guess;
18 //update variables for next iteration
19 //d: unchanged (=d), lo = guess, guess = temp, hi: unchanged (=hi)
20 return mySqrt_helper(d, guess, temp, hi);
21 }
22 }
23 double mySqrt(double d) {
24 //call helper with initial values at start of loop
25 return mySqrt_helper(d, 0, d / 2, d);
26 }
• Answer to Question 7.4 : The output is
42
-913
• Answer to Question 7.5 : The function g in the previous question is head-
recursive because it does more computation after its recursive call returns. In
particular, it calls printf after calling itself, so its stack frame cannot be reused.
G.8 Answers for Chapter 8
• Answer to Question 8.1 : A pointer is a value whose meaning is the location of
some other value in the program’s memory. Conceptually, a pointer is an arrow
(which points at another box). Pointers are implemented by using the numerical
value of the memory address of the lvalue that the pointer points at.
• Answer to Question 8.2 : No, “pointer” is a type constructor—“pointer to an int”
is a type, as is “pointer to a pointer to a char”, but by itself “pointer” is not a
complete type.
• Answer to Question 8.4 : The -> operator means “follow a pointer and select a
field”. (1) The type of a must be a pointer to a struct with a field named b. (2) To
find the box named by a->b, first look in the box named by a, which should hold
an arrow. Follow that arrow to a box for a struct. Look inside that struct’s box for
the sub-box for the field b. That box is the one named by a->b. (3) a->b is exactly
the same as (*a).b.
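A minimal sketch of that equivalence (the struct and field names here are hypothetical):

```c
typedef struct {
  int b;
} thing_t;

/* a->b and (*a).b name exactly the same box, so this returns
   twice the value of the field. */
int readBoth(thing_t * a) {
  return a->b + (*a).b;
}
```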
• Answer to Question 8.7 : The answer is 42. Because we said that *p and *q are
aliases, we have said that p and q point at the same box. The first statement
changes this box to 5 (at which point, if we evaluated *p or *q, we would get 5).
The second statement changes that box to 42.
• Answer to Question 8.8 :
2: a = &b;    Incompatible types (left: int, right: int **)
3: &b = &c;   Left side is not an lvalue
4: *b = a;    Legal
5: &c = b;    Left side is not an lvalue; Incompatible types (left: int **, right: int *)
6: c = a;     Incompatible types (left: int *, right: int)
7: return b;  Incompatible types (declared: int, actual: int *)
• Answer to Question 8.9 :
1. p = q;
2. *p = 3; Segfaults
3. *r = *q; Segfaults
4. r = q;
5. p = r;
• Answer to Question 8.10 :
In f, *a = 3, b = 4
In g, x = 7, *y = 8
Back in f, *a = 7, b = 0
In main: x = 7, y = 4
• Answer to Question 8.11 :
**s = 3
**t = 4
a = 3
b = 99
c = 55
*p = 55
*q = 99
*r = 99
**s = 55
**t = 99
G.9 Answers for Chapter 9
• Answer to Question 9.1 : An array is an indexable sequence of values. In C,
arrays are stored in consecutive addresses in memory. The name of the array is a
pointer to the first element of the array.
• Answer to Question 9.2 : There are five elements, none of which are initialized.
• Answer to Question 9.3 : There are five elements, all of which are initialized.
Element 0 has value 4. Element 1 has value 6. Elements 2, 3, and 4 have value 0.
• Answer to Question 9.4 : There are two elements, both of which are initialized.
Element 0 has value 4. Element 1 has value 6.
• Answer to Question 9.6 :
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <math.h>
4
5 typedef struct {
6 int x;
7 int y;
8 } point;
9
10 double computeDist(point p1, point p2) {
11 double dx = p2.x - p1.x;
12 double dy = p2.y - p1.y;
13 return sqrt(dx * dx + dy * dy);
14 }
15
16 point * closestPoint(point * s, size_t n, point p) {
17 if (n == 0) { return NULL; }
18 double bestDistance = computeDist(s[0], p);
19 point * bestChoice = &s[0];
20 for (size_t i = 1; i < n; i++) {
21 double currDist = computeDist(s[i], p);
22 if (currDist < bestDistance) {
23 bestChoice = &s[i];
24 bestDistance = currDist;
25 }
26 }
27 return bestChoice;
28 }
29
30 int main(void) {
31 point s[7] = { {.x = 2, .y = 7}, {.x = 10, .y = 5}, {.x = 8, .y = -2},
32 {.x = 7, .y = -6}, {.x = -3, .y = -5}, {.x = -8, .y = 0},
33 {.x = -5, .y = 6}};
34 point target = {-1,-1};
35 point * p = closestPoint(s, 7,target);
36 printf("Closest to (-1,-1) is (%d,%d)\n", p->x, p->y);
37 return EXIT_SUCCESS;
38 }
• Answer to Question 9.12 :
– a and *q
– array[0]
– array[1], *p, and **r
– array[2] and p[1]
–q
– p and *r
G.10 Answers for Chapter 10
• Answer to Question 10.5 :
1 void printFENBoard(const char * fen) {
2 while (*fen != ’ ’ && *fen != ’\0’) {
3 if (*fen == ’/’) {
4 // the / character ends the row
5 printf("\n");
6 }
7 else if (isdigit(*fen)) {
8 int n = *fen - ’0’;
9 //error checking omitted since we
10 //are allowed to assume the FEN string is valid
11 for (int i = 0; i < n; i++) {
12 printf(" ");
13 }
14 }
15 else {
16 //otherwise, we print the letter
17 //error checking omitted since we
18 //are allowed to assume the FEN string is valid
19 printf("%c", *fen);
20 }
21 fen++;
22 }
23 printf("\n");
24 }
• Answer to Question 10.8 : One way to write the code for addMatrix is:
1 void addMatrix(double ** ans, double ** a, double ** b, int w, int h) {
2 for (int i = 0; i < h; i++) {
3 for (int j = 0; j < w; j++) {
4 ans[i][j] = a[i][j] + b[i][j];
5 }
6 }
7 }
• Answer to Question 10.10 :
1 double derivative( double (*f) (double), double x) {
2 //approximate limit h->0 of (f(x+h) - f(x)) / h
3 const double h = 0.000000000001;
4 double y1 = f(x);
5 double y2 = f(x+h);
6 return (y2-y1)/h;
7 }
G.11 Answers for Chapter 11
• Answer to Question 11.1 :
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 int main(int argc, char ** argv) {
5 for (int i = 1; i < argc; i++) {
6 if (i > 1) {
7 printf(" ");
8 }
9 printf("%s", argv[i]);
10 }
11 printf("\n");
12 return EXIT_SUCCESS;
13 }
• Answer to Question 11.3 :
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 //this size is arbitrary
5 #define LINE_SIZE 512
6 void readAndPrint(FILE * f) {
7 char line[LINE_SIZE];
8 while (fgets(line, LINE_SIZE, f) != NULL) {
9 //could use printf("%s", line) as well.
10 if (fputs(line,stdout) < 0) {
11 perror("fputs failed");
12 }
13 }
14 }
15 int main(int argc, char ** argv) {
16 if (argc == 1) {
17 readAndPrint(stdin);
18 }
19 else {
20 for (int i = 1; i < argc; i++) {
21 FILE * f = fopen(argv[i], "r");
22 if (f == NULL) {
23 perror("fopen failed");
24 fprintf(stderr,"while trying to read %s!\n", argv[i]);
25 continue;
26 }
27 readAndPrint(f);
28 if(fclose(f) != 0) {
29 perror("fclose failed");
30 fprintf(stderr,"while trying to close %s\n", argv[i]);
31 }
32 }
33 }
34 return EXIT_SUCCESS;
35 }
G.12 Answers for Chapter 12
• Answer to Question 12.1 : The malloc function allocates memory on the heap,
and returns a pointer to the memory it allocated. If it fails, it returns NULL.
• Answer to Question 12.4 : The free function releases memory (that was
allocated by malloc) after you are done using it. After you free a block of
memory you may not use it for anything—all pointers to it are dangling, and
dereferencing them is an error.
• Answer to Question 12.5 : You have probably corrupted the bookkeeping
structures used by the memory allocator—possibly by writing into memory that
you free, double freeing, freeing something not on the heap, or freeing the middle
of a block. The best way to go about fixing this problem is to run your code in
Valgrind, which generally will identify the problem more directly.
• Answer to Question 12.7 :
1 char * myStrdup(const char * str){
2 size_t len = strlen(str) + 1;
3 char * ans = malloc(len * sizeof(*ans));
4 strncpy(ans, str, len);
5 return ans;
6 }
• Answer to Question 12.10 :
1 #include <stdio.h>
2 #include <stdlib.h>
3
4 struct _my_struct {
5 size_t nNums; //size_t is more correct
6 int * nums;
7 };
8 typedef struct _my_struct my_struct;
9
10 void f(my_struct * ptr) {
11 // < not <= [out of bounds]
12 for (size_t i = 0; i < ptr->nNums; i++) {
13 printf("%d\n", ptr->nums[i]);
14 }
15 //don’t free ptr here: double frees with free(s)
16 }
17 int main(void) {
18 //sizeof(*s), not sizeof(s)
19 my_struct * s = malloc(sizeof(*s));
20 s->nNums = 5;
21 s->nums = malloc(5 * sizeof(*s->nums));
22 for (size_t i = 0; i < s->nNums; i++) {
23 s->nums[i] = i + 4;
24 }
25 f(s);
26 free(s->nums);//forgot to free s->nums
27 free(s);
28 return EXIT_SUCCESS;
29 }
Note that the double free error could have been fixed either as we did here,
or by removing free(s) from main (in which case, the free(s->nums) would
have to be done as free(ptr->nums) inside f). We prefer this solution, as it
makes more sense for f to be a function that just prints data, rather than prints and
frees data (we often want to print data without freeing it!).
G.13 Answers for Chapter 13
• Answer to Question 13.2 : Abstraction is the separation of interface (what
something does) from implementation (how it does it). Abstraction is crucial to
designing large systems (code or other systems), as the human mind can only
work with about seven items at a time. By packing up a complex implementation
behind a simple interface, we can make the entire complex thing into one “item”
that our mind can work with.
• Answer to Question 13.4 : A function should fit comfortably in one terminal
window—i.e., you should be able to fit it all on screen at once.
• Answer to Question 13.8 : In our solution, we declare the following struct to hold
our words:
1 struct _words_t {
2 size_t n_words;
3 char ** words;
4 };
5 typedef struct _words_t words_t;
Then we can start by writing our main:
int main(int argc, char ** argv) {
  if (argc != 3) {
    fprintf(stderr, "Usage: randStory story words\n");
    return EXIT_FAILURE;
  }
  words_t * words = readWords(argv[2]);
  if (words == NULL) {
    return EXIT_FAILURE;
  }
  int returnStatus = printStory(argv[1], words);
  freeWords(words);
  return returnStatus;
}
Having done that, we have three smaller programming tasks. We will do
readWords first:
words_t * readWords(const char * fname) {
  FILE * f = fopen(fname, "r");
  if (f == NULL) {
    fprintf(stderr, "%s cannot be opened\n", fname);
    return NULL;
  }
  words_t * ans = malloc(sizeof(*ans));
  if (ans == NULL) {
    fclose(f);
    return NULL;
  }
  ans->n_words = 0;
  ans->words = NULL;
  char * line = NULL;
  size_t sz = 0;
  while (getline(&line, &sz, f) >= 0) {
    trimNewLine(line);
    insertWord(ans, line);
    line = NULL;
  }
  free(line);
  fclose(f);
  return ans;
}
In that function, we abstracted out two pieces into smaller tasks, so we do
those next:
void insertWord(words_t * words, char * line) {
  words->n_words++;
  char ** temp = realloc(words->words,
                         words->n_words * sizeof(*temp));
  assert(temp != NULL);
  temp[words->n_words - 1] = line;
  words->words = temp;
}
void trimNewLine(char * s) {
  char * p = strchr(s, '\n');
  if (p != NULL) {
    *p = '\0';
  }
}
We then do printStory (and abstract out picking a random word):
const char * randomWord(words_t * words) {
  size_t i = random();
  i = i % words->n_words;
  return words->words[i];
}
int printStory(const char * fname, words_t * words) {
  FILE * f = fopen(fname, "r");
  if (f == NULL) {
    fprintf(stderr, "%s cannot be opened\n", fname);
    return EXIT_FAILURE;
  }
  int c;
  while ( (c = fgetc(f)) != EOF) {
    if (c == '_') {
      printf("%s", randomWord(words));
    }
    else {
      fputc(c, stdout);
    }
  }
  fclose(f);
  return EXIT_SUCCESS;
}
Finally, we write freeWords:
void freeWords(words_t * words) {
  for (size_t i = 0; i < words->n_words; i++) {
    free(words->words[i]);
  }
  free(words->words);
  free(words);
}
G.14 Answers for Chapter 14
• Answer to Question 14.3 : The this pointer is a pointer to the object which the
method is “inside.” You determine what it points at by keeping track of it like a
parameter (it is an implicit parameter) when a method is called—as you pass
parameters to the method call, you should draw a box for this, and initialize it to
point at the object on which the method is being invoked.
• Answer to Question 14.7 : Operators that modify the current object, such as +=,
should typically have a return type that is a reference to the class they are inside
of and should have a return value of *this.
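As a concrete illustration, here is a minimal sketch (the Counter class is invented for this example) of why returning *this by reference lets calls chain:

```cpp
#include <cassert>

// Hypothetical Counter class: += mutates the object and returns *this
// by reference, following the convention described above.
class Counter {
    int value;
public:
    Counter(int v) : value(v) {}
    Counter & operator+=(int n) {
        value += n;
        return *this;  // the object itself, by reference, so calls chain
    }
    int get() const { return value; }
};
```

Because += returns a reference to the left operand, an expression like (c += 2) += 3 modifies the same object twice.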
• Answer to Question 14.8 : The output is
p: (2,4)
p: (3,5)
p1: (4,3)
p2: (5,4)
• Answer to Question 14.9 : We can implement the Square class as follows:
class Square {
private:
  double edgeLength;
public:
  void setEdgeLength(double newLength) {
    assert(newLength >= 0);
    edgeLength = newLength;
  }
  double getEdgeLength() const {
    return edgeLength;
  }
  double getArea() const {
    return edgeLength * edgeLength;
  }
  double getPerimeter() const {
    return 4 * edgeLength;
  }
};
• Answer to Question 14.10 : The modified Point class should look like this:
class Point {
private:
  int x;
  int y;
public:
  void setLocation(int newX, int newY) {
    x = newX;
    y = newY;
  }
  int getX() const {
    return x;
  }
  int getY() const {
    return y;
  }
  Point & operator+=(const Point & rhs) {
    x += rhs.x;
    y += rhs.y;
    return *this;
  }
  bool operator==(const Point & rhs) const {
    return x == rhs.x && y == rhs.y;
  }
  Point & operator*=(int scale) {
    x *= scale;
    y *= scale;
    return *this;
  }
};
G.15 Answers for Chapter 15
• Answer to Question 15.1 : A constructor is a special method (named the same as
the class it resides in but having no return type—not even void) that initializes
objects of that class when they are created. The constructor is automatically
called when an object is created (either directly in a frame or with the new
operator). They are useful because they ensure objects will always be properly
initialized.
• Answer to Question 15.2 : No, having any nontrivial constructor makes a class
not a POD type, as it may only be properly created by means which will invoke
the constructor to initialize it.
• Answer to Question 15.6 : Because the statement C c2 = c1; initializes a new
object (rather than changing an existing object), the copy constructor is used.
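A small sketch distinguishing the two cases (the class is named C as in the question; the copies field is invented here just to observe which special member runs):

```cpp
#include <cassert>

// C counts how many copy-constructions produced this object, so we can
// see which special member function ran.
class C {
public:
    int copies;
    C() : copies(0) {}
    C(const C & rhs) : copies(rhs.copies + 1) {}  // copy constructor
    C & operator=(const C & rhs) {                // copy assignment
        copies = rhs.copies;
        return *this;
    }
};
```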
G.16 Answers for Chapter 16
• Answer to Question 16.1 : Either the append method
(http://www.cplusplus.com/reference/string/string/append/) or the +=
(http://www.cplusplus.com/reference/string/string/operator+=/)
operator will perform this task.
• Answer to Question 16.2 : To convert a C string to a C++ string, you can use the
constructor in std::string which takes a char *. To convert the other way, you
can use the c_str method in std::string.
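Sketching both directions (the function names here are invented for illustration):

```cpp
#include <cstring>
#include <string>

// C string -> C++ string: std::string has a constructor taking a char *.
std::string fromCString(const char * s) {
    return std::string(s);
}
// C++ string -> C string: the c_str method yields a const char *.
bool sameContents(const std::string & s, const char * c) {
    return std::strcmp(s.c_str(), c) == 0;
}
```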
• Answer to Question 16.3 : It evaluates to std::cout. Evaluating to std::cout is
important because it is what lets us “chain” together multiple << operations, such
as std::cout << "x = " << x << "\n".
G.17 Answers for Chapter 17
• Answer to Question 17.1 : Templates are useful because they reduce duplication
of code. Whenever an algorithm (or class) works in the same way for many data
types, we can write it once with a template then instantiate the template multiple
times for the different types of data we need to use it for.
• Answer to Question 17.3 : A template is type checked when it is instantiated:
when it is actually used. An important implication of this fact is that the templated
code need only be legal for the types it is actually used on.
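For example, the countEven template from Question 17.5 uses %, which is not legal for double, yet a program that never instantiates it for double still compiles. A minimal sketch of the same effect (isEven is invented for this example):

```cpp
#include <cassert>

// isEven uses %, which only works for integer types - but the compiler
// does not complain until isEven is instantiated for a particular T.
template<typename T>
bool isEven(T x) {
    return x % 2 == 0;
}
// isEven(4) instantiates isEven<int> and type checks fine; writing
// isEven(4.0) would be a compile error, but only at that point of use.
```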
• Answer to Question 17.4 : C++ (prior to C++11) requires that the two >s be
separated by a space: std::vector<std::pair<int,int> >
• Answer to Question 17.5 :
template<typename T>
unsigned countEven(T * array, size_t n) {
  unsigned count = 0;
  for (size_t i = 0; i < n; i++) {
    if (array[i] % 2 == 0) {
      count++;
    }
  }
  return count;
}
• Answer to Question 17.8 :
template<typename T>
unsigned countEven(typename T::const_iterator start,
                   typename T::const_iterator end) {
  unsigned count = 0;
  while (start != end) {
    if (*start % 2 == 0) {
      count++;
    }
    ++start;
  }
  return count;
}
G.18 Answers for Chapter 18
• Answer to Question 18.1 : Inheritance is when you make a parent class (also
called “base class”) and then extend it into one or more children classes (also
called “subclasses”). It is appropriate when you have an “is-a” relationship, and is
beneficial because it reduces code duplication: the child class automatically has
everything defined in the parent class.
• Answer to Question 18.2 : Subtype polymorphism is the ability to treat a pointer
to (or a reference to) an instance of a child class as a pointer whose type is
declared as the parent class. For example, if class Cat extends class Animal, it is
legal to write Animal * a = new Cat(); because of subtype polymorphism.
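A runnable sketch of that Cat/Animal example (the speak method is invented here, and also shows dynamic dispatch choosing the child's version):

```cpp
#include <string>

class Animal {
public:
    virtual std::string speak() const { return "..."; }
    virtual ~Animal() {}
};
class Cat : public Animal {
public:
    virtual std::string speak() const { return "meow"; }
};
```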
• Answer to Question 18.3 : The second line is illegal because the compiler only
knows the static type of ptr (A *) when it compiles the code—it cannot tell that
ptr actually points at a B (the dynamic type). It must type check based on the
static type and cannot find anotherFunction in class A, so it gives an error.
• Answer to Question 18.7 : The output is
A()
B()
A(3)
B(3,8)
a1->myNum() = 0
a2->myNum() = 3
b1->myNum() = 42
b2->myNum() = 8
~B()
~A()
~B()
~A()
• Answer to Question 18.10 : No, classes that have virtual methods are never plain
old data (POD) types.
G.19 Answers for Chapter 19
• Answer to Question 19.5 : They are destroyed, running their destructors in the
opposite order of their creation.
• Answer to Question 19.6 : No, you should not use new; instead, you should throw
an unnamed temporary (e.g., throw someExceptionType();).
• Answer to Question 19.7 : You should catch a reference to the exception type.
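Putting the last two answers together in a sketch (std::runtime_error stands in for someExceptionType here):

```cpp
#include <stdexcept>
#include <string>

std::string tryIt() {
    try {
        throw std::runtime_error("oops");  // unnamed temporary, no new
    }
    catch (std::exception & e) {           // caught by reference
        return e.what();
    }
}
```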
• Answer to Question 19.11 : This function provides either no exception guarantee
or a strong exception guarantee. It cannot provide the no-throw guarantee, since
new might throw. If every operation that it uses provides a no-throw guarantee
(T's operator+ and the function g being the questionable ones), then f provides a
strong guarantee. If these operations provide any weaker guarantee, then f
provides no exception guarantee, as it would leak memory if any of those
operations threw an exception.
G.20 Answers for Chapter 20
• Answer to Question 20.1 : No. The easiest way to see this is to work the
algebra through and simplify. (The expressions in this answer were typeset as
math and did not survive conversion to this format.)
• Answer to Question 20.3 : The table gives the runtime of each operation (the
runtime values were typeset as math and did not survive conversion to this
format):

Operation Runtime
add
contains
numItems
remove
intersect
unionSets
G.21 Answers for Chapter 21
• Answer to Question 21.2 : A tail pointer points at the last element of the list. The
advantage of a tail pointer is that it gives O(1) access time to the end of the list
(i.e., we can perform operations like “add to back” in O(1) time).
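A minimal singly linked list sketch (invented for this example) showing why a tail pointer makes "add to back" constant time: no traversal to the end is needed.

```cpp
#include <cstddef>

struct Node {
    int data;
    Node * next;
};
class IntList {
    Node * head;
    Node * tail;  // always points at the last node
public:
    IntList() : head(NULL), tail(NULL) {}
    void addToBack(int v) {  // O(1): go straight to the tail
        Node * n = new Node;
        n->data = v;
        n->next = NULL;
        if (tail == NULL) {
            head = tail = n;
        }
        else {
            tail->next = n;
            tail = n;
        }
    }
    int front() const { return head->data; }
    int back() const { return tail->data; }
    ~IntList() {
        while (head != NULL) {
            Node * t = head;
            head = head->next;
            delete t;
        }
    }
};
```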
• Answer to Question 21.3 :
• Answer to Question 21.5 : If we wrote that destructor in a Node, then destroying
any Node would destroy all Nodes that come after it. We might just want to
destroy one particular Node (e.g., removing one item from the list) and not the
entire list.
• Answer to Question 21.10 :
template<typename T>
std::ostream & operator<<(std::ostream & stream, const std::list<T> & list) {
  stream << "[";
  std::string before = "";
  for (typename std::list<T>::const_iterator it = list.begin();
       it != list.end();
       ++it) {
    stream << before << *it;
    before = ", ";
  }
  stream << "]";
  return stream;
}
• Answer to Question 21.11 :
template<typename T>
std::list<T> reverse(const std::list<T> & inputList) {
  std::list<T> answer;
  for (typename std::list<T>::const_iterator it = inputList.begin();
       it != inputList.end();
       ++it) {
    answer.push_front(*it);
  }
  return answer;
}
• Answer to Question 21.13 : Yes: in Chapter 17’s Question 17.8, you wrote a
countEven that operates on the iterators of any structure. You could just call it,
passing in the iterators returned by begin and end from the list.
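For example (repeating the iterator-based countEven from Question 17.8 so the sketch is self-contained; note that T must be given explicitly, since it cannot be deduced from a nested iterator type):

```cpp
#include <list>

template<typename T>
unsigned countEven(typename T::const_iterator start,
                   typename T::const_iterator end) {
    unsigned count = 0;
    while (start != end) {
        if (*start % 2 == 0) {
            count++;
        }
        ++start;
    }
    return count;
}
```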
G.22 Answers for Chapter 22
• Answer to Question 22.1 :
size_t binarySearch(int toFind, int * array, size_t n) {
  size_t lo = 0;
  size_t hi = n;
  while (lo < hi) {
    size_t mid = (hi + lo) / 2;
    if (array[mid] == toFind) {
      return mid;
    }
    if (toFind < array[mid]) {
      hi = mid;
    }
    else {
      lo = mid + 1;
    }
  }
  return -1; // converts to SIZE_MAX, serving as "not found"
}
• Answer to Question 22.2 :
template<typename T, typename Container>
typename Container::const_iterator
binarySearch(const T & toFind,
             typename Container::const_iterator start,
             typename Container::const_iterator end) {
  typename Container::const_iterator hi = end;
  while (start < hi) {
    typename Container::const_iterator mid = start + (hi - start) / 2;
    const T & current = *mid;
    if (current == toFind) {
      return mid;
    }
    if (toFind < current) {
      hi = mid;
    }
    else {
      start = mid + 1;
    }
  }
  return end;
}
• Answer to Question 22.3 :
• Answer to Question 22.4 : The diagram of the tree below has each node labeled
with its height (in blue). The tree is not balanced—the two nodes shaded in red
have children whose heights differ by more than one (one child has height 2, and
the other child is NULL, which has height 0).
• Answer to Question 22.5 :
Preorder 100, 50, 30, 1, 40, 200, 300, 250, 999
Inorder 1, 30, 40, 50, 100, 200, 250, 300, 999
Postorder 1, 40, 30, 50, 250, 999, 300, 200, 100
• Answer to Question 22.11 : For inorder:
private:
  template<typename F> void inorderTraverse(F & funcobj, Node * curr) const {
    if (curr != NULL) {
      inorderTraverse(funcobj, curr->left);
      funcobj(curr->data);
      inorderTraverse(funcobj, curr->right);
    }
  }
public:
  template<typename F> void inorderTraverse(F & funcobj) const {
    inorderTraverse(funcobj, root);
  }
For preorder:
private:
  template<typename F> void preorderTraverse(F & funcobj, Node * curr) const {
    if (curr != NULL) {
      funcobj(curr->data);
      preorderTraverse(funcobj, curr->left);
      preorderTraverse(funcobj, curr->right);
    }
  }
public:
  template<typename F> void preorderTraverse(F & funcobj) const {
    preorderTraverse(funcobj, root);
  }
For postorder:
private:
  template<typename F> void postorderTraverse(F & funcobj, Node * curr) const {
    if (curr != NULL) {
      postorderTraverse(funcobj, curr->left);
      postorderTraverse(funcobj, curr->right);
      funcobj(curr->data);
    }
  }
public:
  template<typename F> void postorderTraverse(F & funcobj) const {
    postorderTraverse(funcobj, root);
  }
The printing class we used to test is:
template<typename T>
class Printer {
public:
  void operator()(const T & x) {
    std::cout << x << " ";
  }
};
G.23 Answers for Chapter 23
• Answer to Question 23.4 : The load factor of a hash table is the ratio of the
number of items in the table to the number of buckets. As the load factor
increases, the chances of collisions increases, degrading performance. When the
load factor gets high, the table should be rehashed. Rehashing the table creates
more buckets, so the load factor decreases.
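These ideas appear directly in std::unordered_map (C++11): load_factor() is size() divided by bucket_count(), and the table rehashes itself whenever an insertion would push the load factor past max_load_factor(). A small sketch:

```cpp
#include <unordered_map>

// The load factor as defined above: items divided by buckets.
float currentLoadFactor(const std::unordered_map<int, int> & m) {
    return static_cast<float>(m.size()) / m.bucket_count();
}
```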
• Answer to Question 23.6 : First note that 147 % 11 = 4, 301 % 11 = 4, 237 %
11 = 6, 335 % 11 = 5, and 370 % 11 = 7, then we can draw the following table:
• Answer to Question 23.7 :
Inserting another element that hashes into bucket 5 would collide with four other
elements (blue, white, green, black–in that order).
• Answer to Question 23.8 :
Inserting another element that hashes into bucket 5 would only collide with blue
G.24 Answers for Chapter 24
• Answer to Question 24.1 :
• Answer to Question 24.2 :
• Answer to Question 24.3 :
• Answer to Question 24.4 :
G.25 Answers for Chapter 25
• Answer to Question 25.3 : The MST is colored blue on the graph to the left.
The table to the right shows the “best distance to the tree” at each step (left to
right), with the node currently being added shown in bold blue text. (The node
names giving the order of addition were typeset as math and did not survive
conversion to this format.)
• Answer to Question 25.4 : You obtain the same MST as in the previous problem.
(The edge names giving the order of addition were typeset as math and did not
survive conversion to this format.) It is correct to end up with the two edges of
length 2 swapped, and likewise the two edges of length 3.
• Answer to Question 25.5 :
Note that C and E could be transposed (performed in either order), as they are
both at distance 8 from A.
• Answer to Question 25.6 : They both operate in a similar fashion—keeping track
of distances to each node, and working next on the node with the smallest
distance, then updating the distances based on the edges leaving that node.
However, Prim’s considers the “distance” to be only edge length, while Dijkstra’s
works with the total path length from the starting node. These observations
suggest that if we need both algorithms, we can avoid code duplication by writing
one piece of code that can do both, but has some generalization allowing us to
choose which (e.g., a template parameter).
G.26 Answers for Chapter 26
• Answer to Question 26.1 :
Minimum Less efficient to sort the data first: we can find the minimum in
O(n) time on an unsorted array or list. Sorting the data takes O(n log n)
time, after which we can find the minimum in O(1) time.
Maximum Less efficient for the same reasons as finding the minimum element.
Check if any item… Less efficient. We can check if any item has a given
property in O(n) time on an unsorted array or list. We may be
able to check for some properties more efficiently on sorted data
(e.g., we can check if a particular value is in the data in
O(log n) time); however, it takes us O(n log n) time to sort the data
first.
Check if all items… As with checking if any item has a property, we can solve
this in O(n) time on unsorted data, while sorting the data requires
O(n log n) time.
Intersect More efficient: without sorting, this algorithm requires O(nm)
time (where n and m are the sizes of the two arrays). If we sort
one array first, we can solve the problem in O((n + m) log n) time
(by iterating over the unsorted array and binary searching the sorted
array to see if it contains each element). If we sort both arrays first,
we can solve the problem in O(n log n + m log m) time, by scanning
both arrays in a fashion similar to the “merge” step of merge sort.
• Answer to Question 26.4 : You could define an abstract parent class,
AbstractSort, which has an abstract method sort. You could then write one
subclass of AbstractSort for each sort that you implemented. You could then
write the method AbstractSort * createSort(const string& sortName),
which returns a new sort of the appropriate type based on the name. You can then
just work with the AbstractSort object in the timing code. Note that this
approach actually has a name—it is called the “abstract factory” pattern (we do
not cover design patterns, but you will learn about them in a software engineering
course).
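A sketch of that design (only one concrete sort is shown, delegating to std::sort as a stand-in for a real implementation; the names follow the answer's suggestion):

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class AbstractSort {
public:
    virtual void sort(std::vector<int> & v) = 0;  // abstract method
    virtual ~AbstractSort() {}
};
class StdSort : public AbstractSort {
public:
    virtual void sort(std::vector<int> & v) {
        std::sort(v.begin(), v.end());
    }
};
// The "abstract factory": picks a concrete sort by name.
AbstractSort * createSort(const std::string & sortName) {
    if (sortName == "std") {
        return new StdSort();
    }
    return NULL;  // unknown sort name
}
```

The timing code then works only with AbstractSort pointers, so adding a new sort means adding a subclass and one branch in createSort.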
G.27 Answers for Chapter 33
• Answer to Question 33.2 : We might use head recursion:
fun sumTo 0 = 0
  | sumTo n = if (n < 0)
              then ~(sumTo(~n))
              else n + sumTo(n-1)
Or with tail recursion:
fun sumTo n =
  let fun helper ans 0 = ans
        | helper ans n = helper (ans+n) (n-1)
  in
    helper 0 (if (n < 0)
              then ~n
              else n)
  end
F Compiler Errors Explained
Generated on Thu Jun 27 15:08:37 2019 by LaTeXML
Index
#define §5.1.1, §5.1.1
#endif §E.3.1
#include §5.1.1
ABI §11.2.3
abstract class §18.6
abstract data type §20.2
abstract method §18.6
abstract syntax trees §22.1.2
abstraction §11.3.1, §13.1, §13.4, §14.1.1, Chapter 20,
§3.1.2, §3.6.1, item Complicated Steps
access §14.1.1
access control §14.1
accessor §14.6.5
address §8.1
adjacency matrix(graphs) §25.2.1
adjacency list(graphs) §25.2.2
ADT §20.2
algorithm Chapter 1
alias §8.6
aliasing 6th item
Amdahl’s Law §28.7
amortized behavior §20.1.3
ancestors(of a tree node) item Ancestors (of a node)
arguments §2.3
array §9.2, 1st item
arrays Chapter 9
ascending order Chapter 26
assembly §5.1.2, §5.1.3
assert statement §6.1.4
assignment statements §2.1.2
asymptotic behavior §20.1
asynchronously §11.4
atomic increment §28.4.4
atomically §28.4
AVL trees Chapter 27
bag §20.5
balanced(binary tree) item Balanced
barrier §28.3.6
base case §7.2
base class Chapter 18
basic block §6.1.2
basic exception guarantees §19.7
bidirectional iterator §21.4.1
big endian footnote 5
big bang approach §13.5
Big-Oh notation §20.1
binary search tree §22.1
binary point §3.2.3
binary search Chapter 22
binary search tree item Binary search tree
binary tree item Binary tree
bit §3.1
black box testing §6.1.1
block §11.4, §2.3.1
blocks §28.2.3
body §2.6.1
branches(git) §D.4.3
breadth-first search §25.3.3
break §2.6.5
breakpoint §D.2.3, §6.2.3
brute force attacks(password cracking) §23.3.3
BST §22.1
bubble sort §26.1.1
buckets(hash table) §23.2.1
buffer overflow §10.4
bullet proof code Chapter 19
busy wait §28.3.5
byte-addressable memory §8.4
Caesar cipher §9.1
call stack §20.4.1
call graph §6.1.2
call stack §2.3
Callgrind(valgrind tool) §D.3.6
calling §2.3
calling convention §8.4.1
case, see switch-case
casting §3.4.1, §3.4.2
catch §19.3
caught item 2
chaining(hash table) §23.2.1
character literal §3.2.1
characters §3.2.1
checked exceptions(Java) §31.7
child class Chapter 18, Chapter 18
child process §28.1.2
children(of a tree node) item Children (of a node)
Cilk §28.6.3
circular list Chapter 21
clique(graph) item Clique and independent set
closure §E.6.7
code review §6.1.6
code walk-through §6.1.6
collision(hash table) §23.2
command line arguments §11.2
command shell §B.9
comments §4.5
compare-and-swap §28.4.2
compile Chapter 5
compile-time constants §17.3.2
compiler §3.2, Chapter 5
Compiler Error §5.1.2
in Emacs §C.8
compiling
preprocessor §5.1.1
complete(binary tree) item Complete
composability §4.5.1, §9.2
composable §18.7
concrete subclass §18.6
condition variable §28.3.5
conditional breakpoint §D.2.3
conditional compilation §E.3.1
conditional expressions §2.4
conditional group §E.3.1
connected(graph) item Connected
const §8.5.3
constant time §20.1.2
constructors §15.1
contended(mutex) §28.3.1
continue §2.6.5
control flow graph §6.1.2
controlled text §E.3.1
copy assignment operator §15.3
copy constructor §15.3, §15.3.1
corner cases §4.1, 2nd item, §6.1.1
covariant §18.5
critical path(task scheduling) §25.1.1
critical section §28.3
cubic time §20.1.2
curried(functional programming) §33.5
cycle(in a graph) item Cycle
dangling pointer §9.5.3
data races §28.3
data structure Chapter 20
deadlock §28.3.4
debugger §D.2, §6.2.3
debugging §6.2
debugging symbols §D.2.1
decision coverage §6.1.2, §6.1.2
deep copy §12.1.6
default constructible §15.1.1
default constructor §15.1.1
default initialization §15.1.3
default values §14.6.6
delegating constructors §E.6.5
dependent name §17.3.5
depth-first search §25.3.1
depth(of a binary tree node) item Depth (of a node)
deque §20.3.4
derived class Chapter 18
descendants(of a tree node) item Descendants (of a node)
descending order Chapter 26
destructor §15.2
device special files §11.4
dictionary attacks(password cracking) §23.3.3
directed acyclic graph §22.1.1, §25.1.1
directed path item (Directed) path
directives §5.1.1
disjoint union §30.3
divide and conquer §26.2.2
do-while loop §2.6.2
double freeing §12.2.3
double rotation §27.1
doubly linked list Chapter 21
down-casting §E.4.2
dynamic dispatch §18.5
dynamic memory allocation Chapter 12
dynamic type §18.4
dynamically typed languages §30.3
edge(in a graph) item Edge
else, see if-else
Emacs
advanced movement §C.9—§C.9
auto-complete §C.8
buffer item Buffer, §C.6—§C.6
cancel §C.4—§C.4
comment §C.8
compiling §C.8
configuring §C.12—§C.12
copy §C.5
cut §C.5—§C.5
frame item Frame
indent §C.8
kill item Kill, §C.5—§C.5
mark item Mark
minibuffer item Minibuffer
modeline item Modeline
open file §C.3
paste §C.5—§C.5
point item Point
quit §C.3
replace §C.7—§C.7
search §C.7—§C.7
spell check §C.11—§C.11
tab complete §C.3—§C.3
undo §C.4—§C.4
window item Window, §C.6—§C.6
yank item Yank, §C.5—§C.5
embarrassingly parallel §28.3
encapsulate §14.1.2
endianness §E.2
enumerated type §3.6.3
EOF §11.3.2
errno §11.1.1, §19.1
escape sequences §2.3.2
exception safety §19.7
exception specification §19.4
exceptions Chapter 19, §19.2
explicit §15.4.3
exponential time §20.1.2
expression §2.1.2, §2.2
extends Chapter 18
factorial time §20.1.2
fail fast §6.1.4
FIFO §20.3
file
open §11.3.1
file descriptor §11.3.1
files §11.3
fixed point §3.2.3
for loop §2.6.3
forest(graphs) §22.1.1
fork §28.1.2
format string attacks §10.4
forward iterator §21.4.1
four color theorem §25.1.2
frame §2.3
frequency counting §9.1
friend §16.2.2
full(binary tree) item Full
fully qualified name §14.3
function §2.3
function object §17.4.4
function overloading §14.4
function pointer §10.3
function prototypes §5.1.1
function try block §19.3
functional data structures §30.2
functional programming §30.2
functional programming languages §7.4
garbage collection §30.1
gcc Chapter 5
GDB §D.2
gdb §5.2
generational garbage collection §30.1
generics(Java) §31.8
getline §12.4
Git §D.4
global variable §2.3.1
goto §E.7
gradient 12th item
gradient descent 13th item
gradient ascent 13th item
graph item Graph, Chapter 25
graph coloring §25.1.2
GUI §2.3.2
handled item 4
handlers §19.3
has-a Chapter 18
hash table Chapter 23
hashing function item 1
head recursion §7.3
head pointer(linked list) Chapter 21
header files §5.1.1
heap Chapter 24
heap sort §26.2.1
heapify §26.2.1
height(of a tree node) item Height (of a node)
Helgrind(valgrind tool) §D.3.6, §28.8
helper function §7.3
hex §3.1.2
hexadecimal §3.1.2
holding a lock §28.3.1
Huffman coding §24.4
identifier §2.1.1
if-else §2.4.1—§2.4.1
imperative language §30.2
implementation §13.1
include guard §E.3.1
incremental search §C.7
incremental testing §6.1
indentation Appendix C
independent set(graph) item Clique and independent set
indexing(an array) §9.3
induction §7.6
inheritance Chapter 18
inheritance hierarchy §18.1
inherits from Chapter 18
initialization §2.1.2
initializer list §15.1.4
inner class §14.1.6
inorder traversal §22.5.1
insertion sort §26.1.2
instantiate §17.1.1
integration §13.3
interface §13.1, Chapter 20
interfaces(Java) §31.4
interference graph §25.1.2
interpreter §5.4
intractable §20.1.2
introspection sort §26.3
invalidated §17.4.3
invariants §6.1.4
is-a Chapter 18
isomorphic(graph) item Isomorphism
iteration Chapter 7
iterator §17.4.3
iterators §21.4
Java §5.4
JIT §5.4
joining(a thread) §28.2.3
JVM §5.4
keyboard macro §C.10
Kruskal’s algorithm §25.4.2
label §2.4.2
lambda expressions(C++11) §E.6.7
lazy evaluation §30.4
leaf node item Leaf nodes
leaked memory §12.2.1
least significant bit §3.1.1
lexicographic order §10.1.3
lexicographically §17.4.1
libraries §5.1.4
LIFO §20.4
linear probing(hash table) §23.2.2
linear time §20.1.2
linearithmic time §20.1.2
link §5.1.4
linked list Chapter 21
little endian footnote 5
load balancing §28.6.3
load factor §23.4
load imbalance §28.1.1
load-linked §28.4.3
loader §10.1.1
local variables §2.3.1
locality §26.3
lock §28.3.1
lock free data structures §28.5
locking granularity §28.3.2
log-star time §20.1.2
logarithmic time §20.1.2
loop §2.6
lvalue 41st item, §2.1.2
macro expansion §5.1.1
macros §5.1.1
main (function) §2.3
maintainability §12.1.1
make §5.1.5
make (command) §D.1
make the common case fast §28.7
manipulators §16.2.4
map §20.6
Massif(valgrind tool) §D.3.6
max-heap §24.1
measure function §7.6
member functions §14.1
Memcheck §D.3
memoization §7.2
memory consistency model §28.4.1
memory leak §12.2.1
merge sort §26.2.2
methods §14.1
min-heap §24.1
minimum spanning tree §25.4
mixin §18.7.1, §29.4
Model, View, Controller §18.8
modulus §2.2
move assignment operators §E.6.4
move constructors §E.6.4
multi-core Chapter 28
multidimensional arrays §10.2
multiple inheritance §29.2
multiset §20.5
mutex §28.3.1
mutual recursion §7.5
mutual exclusion lock §28.3.1
MVC §18.8
name collisions §14.3
name mangling §14.4.1
namespace §14.3
naming conventions §13.2.2
native methods(Java) §31.9
new §15.1.2
no exception guarantees §19.7
no-fail guarantee §19.7
no-throw guarantee §19.7
node item Node
non-deterministic §28.2.2
NP-complete §20.1.2
null terminator §10.1, §3.5.1
object file §5.1.3
object oriented language §30.2
object-oriented programming §14.1
octal §3.3
opaque ascription (SML) §33.7.2
open
file §11.3.1
open addressing(hash table) §23.2.2
operating system §11.1
operator overloading §14.5
overflow §3.4.3
override §18.1, §18.5
packages(Java) §31.1, §31.6
page table §10.1.1
pair programming §13.3
parallel programming Chapter 28
parameter list §2.3
parameter packs §E.6.11
parameters Chapter 1
parametric polymorphism §33.3.3
parametric polymorphism Chapter 17, §18.4
parent class Chapter 18, Chapter 18
parent process §28.1.2
parent(of a tree node) item Parent (of a node)
parsing §20.4.1, §5.1.2
pass-by-name §30.4
pass-by-need §30.4
pass-by-reference §30.4
pass-by-value §30.4
path coverage §6.1.2, §6.1.2
path(in a graph) item (Directed) path
peeking §20.3
pipe §11.4
pipeline parallelism §28.6.2
pivot
quick sort §26.2.3
pixel §3.5.2
plain old data §14.1.4
planar(graph) §25.1.2
POD §14.1.4
pointer 1st item
pointers Chapter 8
Polymorphism Chapter 17
polymorphism §18.4
polynomial time §20.1.2
portability §12.1.1, §5.1.1
post-order number §29.3.2
postorder §22.5.3
preorder §22.5.2
preprocessor §5.1.1
prerequisites(Makefile) §D.1
Prim’s algorithm §25.4.1
primary parent(class) §29.2.2
primary vtable §29.2.4
primitive types(Java) §31.2
printing §2.3.2
priority queue Chapter 24
process ID §28.1
processes §28.1
programming in the large Chapter 13
programming in the small Chapter 13
prototypes §5.1.4
pseudo-random §6.1.3
pthreads §28.2
pure virtual member function §18.6
quadratic probing(hash table) §23.2.2
quadratic time §20.1.2
qualifier §8.5.3
quartic time §20.1.2
queue §20.3
quick sort §26.2.3
RAII §19.7.1
random testing §6.1.3
range-based for loop §E.6.6
read-only §10.1.1
reader/writer locks §28.3.3
recursion §2.6, Chapter 7
recursive descent parsing §7.5
recursive case §7.2
red-black tree §27.3
red-black trees Chapter 27
reference §14.2
reference counting §E.6.9
reflection(Java) §31.9
regexp §C.7
regression testing §6.1.5
rehashing §23.4
Resource Acquisition is Initialization §19.7.1
return statement §2.3
return type(of a function) §2.3
return value §2.3
return value optimization §15.4.2
revision control §D.4
RGB encoding §3.5.2
root node item Rooted tree
rooted tree item Rooted tree
Rule of Five §E.6.4
rule of three §15.3.4
running your program §5.2
rvalue §2.1.2
rvalue references §E.6.4
salt(password hashing) §23.3.3
sanitize §10.4
scientific method §6.2
scope §14.1.2, §2.3.1
scope resolution §14.1.5, §14.1.6
scope resolution operator §14.3
security vulnerabilities §10.4
segmentation fault §8.4.2
segmentation fault §10.1.1
selection sort §26.1.3
selection expression §2.4.2
semantics Chapter 2
sentinel(heap) §24.2
sequential programming Chapter 28
set §20.5
shaker sort §26.1.1
shallow copy §12.1.6
side-cast §E.4.2
sign extended §3.4.1
silent failure Chapter 19
single rotations(balanced BST) §27.1
singly linked list Chapter 21
slack(task scheduling) §25.1.1
socket §11.4
sort Chapter 26
bubble §26.1.1
heap §26.2.1
insertion §26.1.2
merge §26.2.2
quick §26.2.3
selection §26.1.3
shaker §26.1.1
specialization §17.1.1
sprite §18.8
stable(sort) §26.4
stack §20.4
stack frame §2.3
stack frames §8.4.1
state Chapter 2
statement coverage §6.1.2
statement coverage 3rd item
static §14.1.5
static dispatch §18.5
static type §18.4
static type system §30.3
store-conditional §28.4.3
stream §11.3.1
stream extraction §16.3
stringification §E.3.4
strings §10.1
strong exception guarantee §19.7
strongly connected component item Strongly connected
components
struct §3.6.1
sub-tree item Sub-tree
subclass Chapter 18, Chapter 18
subobject rule §29.1.1
subtype polymorphism §18.4
super class Chapter 18
superclass Chapter 18
suspend §C.3
switch-case §2.4.2—§2.4.2
synchronization §28.3
syntactic sugar §2.5
syntax Chapter 2
syntax highlighting Appendix C
system call §11.1
tag §3.6.1
tail call §7.3
tail pointer(linked list) Chapter 21
tail recursion §7.3
target(Makefile) §D.1
task parallelism §28.6.3
template specification
complete §17.3.6
partial §17.3.6
template parameters §17.1
template specialization §17.1.1
templated classes §17.2
templates Chapter 17
ternary operator §E.1
test coverage §6.1.2
test harness §6.1.3
test-and-set §28.4.1
testing §1.6
The Poker Player’s Fallacy §6.2.5
this pointer §14.1.2
threads §28.2
throw §19.2
token pasting §E.3.7
top-down design §4.5
Topological sort item Topological sort
total order §22.1.2, Chapter 26, §7.6
tractable §20.1.2
trailing return type §E.6.3
transparent ascription (SML) §33.7.2
trap §10.1.1
traveling salesperson problem item Traveling salesperson
tree item Tree
trivial §15.3.1
trivial destructor §15.2.1
try §19.3
try/catch §19.2
type §2.1, §3.1, §3.1.2
type promotion §3.4.1, §3.4.1
type conversion §3.4.1, §3.4.1
type inference §33.1
type parameter Chapter 17
type system §30.3
type variable (SML) §33.3.3
type-conversion §E.4.1
UI Delegate §18.8
UML class diagram §18.8
unconditional breakpoints §D.2.3
undefined behavior §8.7
underflow §3.4.3
undirected cycle(in a graph) item Undirected cycle
undirected path(in a graph) item Undirected path
union §E.2
union-find §25.4.2
unnamed temporaries §15.4
unsanitized inputs §10.4
untyped languages §30.3
unwinding the stack §19.3
up-casting §E.4.2
valgrind §D.3, §5.2
value §2.1.2
value initialization §15.1.3
variable
declaration §2.1
in expressions §2.2
initialization §2.1.2
uninitialized §2.1.1, §5.2
variables §2.1
variadic macros §E.3.5
variadic templates §E.6.11
vector §17.4.1
vector instructions §28.6.1
vertices(graph) Chapter 25
Vigenère cipher §9.1
virtual inheritance §29.3
visibility §14.1.1
void pointer §8.4.2
vtable §29.1.2
watchpoint §D.2.4, §6.2.3
weights(edges in graphs) Chapter 25
while loop §2.6.1
white box testing §6.1.2
work stealing §28.6.3
worklist algorithms §25.3.2
wrapper functions §10.3
X-macros §E.3.6
zero extended §3.4.1
zero-based(array indexing) §9.3