Unit 4
In the previous chapters, we studied several complex abstract data types that
required the use of a data structure for their implementation. In this chapter, we
continue exploring abstract data types with a focus on several common containers.
Two of these are provided by Python as part of the language itself: sets and
dictionaries. Nevertheless, it’s still important to understand how they work and
some of the common ways in which they are implemented.
Your experience in programming will likely not be limited to the Python
language. At some point in the future, you may use one if not several other common
programming languages. While some of these do provide a wide range of abstract
data types as part of the language itself or included in their standard library,
others, like C, do not. Thus, it’s important that you know how to implement a set or
dictionary ADT if necessary, when one is not available as part of the language.
Further, both the set and dictionary types provide excellent examples of
abstract data types that can be implemented using different data structures. As you
learned in Chapter 1, there may be multiple data structures and ways to organize
the data in those structures that are suitable for implementing an abstract data
type. Thus, it’s not uncommon for language libraries to provide multiple
implementations of an abstract data type, which allows the programmer to choose the
best option for a given problem. Your ability to choose from among these various
implementations will depend not only on your knowledge of the abstract data type
itself, but also on understanding the pros and cons of the various implementations.
3.1 Sets
The Set ADT is a common container used in computer science. But unlike the
Bag ADT introduced in Chapter 1, a set stores unique values and represents the
same structure found in mathematics. It is commonly used when you need to store
a collection of unique values without regard to how they are stored or when you
need to perform various mathematical set operations on collections.
70 CHAPTER 3 Sets and Maps
A set is a container that stores a collection of unique values over a given comparable
domain in which the stored values have no particular ordering.
Set(): Creates a new set initialized to the empty set.
length(): Returns the number of elements in the set, also known as the
cardinality. Accessed using the len() function.
add( element ): Modifies the set by adding the given value or element to the
set if the element is not already a member. If the element is not unique, no
action is taken and the operation is skipped.
remove( element ): Removes the given value from the set if the value is
contained in the set and raises an exception otherwise.
equals( setB ): Determines if the set is equal to another set and returns a
boolean value. For two sets, A and B, to be equal, both A and B must contain
the same number of elements and all elements in A must also be elements in
B. If both sets are empty, the sets are equal. Access with == or !=.
isSubsetOf( setB ): Determines if the set is a subset of another set and
returns a boolean value. For set A to be a subset of B, all elements in A must
also be elements in B.
union( setB ): Creates and returns a new set that is the union of this set and
setB. The new set created from the union of two sets, A and B, contains all
elements in A plus those elements in B that are not in A. Neither set A nor
set B is modified by this operation.
intersect( setB ): Creates and returns a new set that is the intersection
of this set and setB. The intersection of sets A and B contains only those
elements that are in both A and B. Neither set A nor set B is modified by
this operation.
difference( setB ): Creates and returns a new set that is the difference of
this set and setB. The set difference, A − B, contains only those elements that
are in A but not in B. Neither set A nor set B is modified by this operation.
iterator(): Creates and returns an iterator that can be used to iterate over
the collection of items.
Example Use
To illustrate the use of the Set ADT, we create and use sets containing the courses
currently being taken by two students. In the following code segment, we create
two sets and add elements to each. The results are illustrated in Figure 3.1.
smith = Set()
smith.add( "CSCI-112" )
smith.add( "MATH-121" )
smith.add( "HIST-340" )
smith.add( "ECON-101" )
roberts = Set()
roberts.add( "POL-101" )
roberts.add( "ANTH-230" )
roberts.add( "CSCI-112" )
roberts.add( "ECON-101" )
[Figure 3.1: the smith set holds “CSCI-112”, “MATH-121”, “HIST-340”, and “ECON-101”;
the roberts set holds “POL-101”, “ANTH-230”, “CSCI-112”, and “ECON-101”.]
Next, we determine if the two students are taking the exact same courses. If
not, then we want to know if they are taking any of the same courses. We can do
this by computing the intersection between the two sets.
if smith == roberts :
    print( "Smith and Roberts are taking the same courses." )
else :
    # intersect() is the operation named in the Set ADT definition.
    sameCourses = smith.intersect( roberts )
    # The ADT defines no isEmpty(); test the cardinality with len().
    if len( sameCourses ) == 0 :
        print( "Smith and Roberts are not taking any of "\
               + "the same courses." )
    else :
        print( "Smith and Roberts are taking some of the "\
               + "same courses:" )
        for course in sameCourses :
            print( course )
In this case, the two students are both taking CSCI-112 and ECON-101. Thus,
the results of executing the previous code segment will be
Smith and Roberts are taking some of the same courses:
CSCI-112
ECON-101
Suppose we want to know which courses Smith is taking that Roberts is not
taking. We can determine this using the set difference operation:
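The difference computation can be sketched as follows. The small list-based Set class here is only an illustrative stand-in written for this example, not the chapter's listing, and the name uniqueCourses is an assumption.

```python
# Minimal list-based stand-in for the Set ADT (illustrative only).
class Set:
    def __init__(self):
        self._theElements = []

    def __len__(self):
        return len(self._theElements)

    def __contains__(self, element):
        return element in self._theElements

    def __iter__(self):
        return iter(self._theElements)

    def add(self, element):
        # Adding a duplicate is a no-op.
        if element not in self:
            self._theElements.append(element)

    def difference(self, setB):
        # Build a new set from the elements of self that are not in setB.
        newSet = Set()
        for element in self:
            if element not in setB:
                newSet.add(element)
        return newSet


smith = Set()
for course in ("CSCI-112", "MATH-121", "HIST-340", "ECON-101"):
    smith.add(course)

roberts = Set()
for course in ("POL-101", "ANTH-230", "CSCI-112", "ECON-101"):
    roberts.add(course)

# Courses Smith is taking that Roberts is not taking.
uniqueCourses = smith.difference(roberts)
for course in uniqueCourses:
    print(course)
```

With this stand-in, neither smith nor roberts is modified; the result is a separate set.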
[Figure: the smith and roberts Set objects as implemented with a list. Each object's
theElements field refers to a list: smith holds “CSCI-112”, “MATH-121”, “HIST-340”,
“ECON-101” at indices 0–3, and roberts holds “POL-101”, “ANTH-230”, “CSCI-112”,
“ECON-101” at indices 0–3.]
Adding Elements
As indicated earlier, we must ensure that duplicate values are not added to the set
since the list structure does not handle this for us. When implementing the add
method, shown in lines 16–18, we must first determine if the supplied element is
already in the list or not. If the element is not a duplicate, we can simply append
the value to the end of the list; if the element is a duplicate, we do nothing. The
reason for this is that the definition of the add() operation indicates no action
is taken when an attempt is made to add a duplicate value. This is known as a
noop, which is short for no operation and indicates no action is taken. Noops are
appropriate in some cases, which will be stated implicitly in the definition of an
abstract data type by indicating no action is to be taken when the precondition
fails as we did with the add() operation.
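As a minimal sketch of the no-op behavior described above, assuming a list-backed class (ListSet and its field name are hypothetical and written only for this example):

```python
class ListSet:
    """Illustrative list-backed set; only add() and len() are shown."""

    def __init__(self):
        self._theElements = []

    def __len__(self):
        return len(self._theElements)

    def add(self, element):
        # Append only when the element is not already stored;
        # a duplicate leaves the list untouched (a no-op).
        if element not in self._theElements:
            self._theElements.append(element)


courses = ListSet()
courses.add("CSCI-112")
courses.add("MATH-121")
courses.add("CSCI-112")   # duplicate: silently ignored
print(len(courses))       # prints 2
```

The membership test is a linear search over the list, so each add() in this sketch costs time proportional to the number of stored elements.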
TIP
vantage of the abstraction and avoid “reinventing the wheel” by duplicating
code in several places.
sure the two sets contain the same number of elements; otherwise, they cannot be
equal. It would be inefficient to compare the individual elements since we already
know the two sets cannot be equal. After verifying the size of the lists, we can test
to see if the self set is a subset of setB by calling self.isSubsetOf(setB). This
is a valid test since two equal sets are subsets of each other and we already know
they are of the same size.
To determine if one set is the subset of another, we can iterate over the list
of elements in the self set and make sure each is contained in setB. If just one
element in the self set is not in setB, then it is not a subset. The implementation
of the isSubsetOf() method is shown in lines 33–37.
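The size check followed by a subset test can be sketched like this. The ListSet class below is a hypothetical stand-in, and its constructor accepting initial values is an assumption made to keep the example short.

```python
class ListSet:
    """Illustrative list-backed set with equality and subset tests."""

    def __init__(self, *initial):
        self._theElements = []
        for element in initial:
            self.add(element)

    def __len__(self):
        return len(self._theElements)

    def __contains__(self, element):
        return element in self._theElements

    def add(self, element):
        if element not in self:
            self._theElements.append(element)

    def isSubsetOf(self, setB):
        # A single element missing from setB is enough to fail.
        for element in self._theElements:
            if element not in setB:
                return False
        return True

    def __eq__(self, setB):
        # Sets of different sizes cannot be equal, so skip the
        # element-by-element comparison in that case.
        if len(self) != len(setB):
            return False
        return self.isSubsetOf(setB)


a = ListSet(1, 2, 3)
b = ListSet(3, 2, 1)
c = ListSet(1, 2)
print(a == b, c.isSubsetOf(a), a == c)   # prints: True True False
```

Because the sizes already match when isSubsetOf() is called from __eq__, a successful subset test implies the two sets hold exactly the same elements.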
3.2 Maps
Searching for data items based on unique key values is a very common application
in computer science. An abstract data type that provides this type of search
capability is often referred to as a map or dictionary since it maps a key to
a corresponding value. Consider the problem of a university registrar having to
manage and process large volumes of data related to students. To keep track of the
information or records of data, the registrar assigns a unique student identification
number to each individual student as illustrated in Figure 3.3. Later, when the
registrar needs to search for a student’s information, the identification number is
used. Using this keyed approach allows access to a specific student record. If
the names were used to identify the records instead, then what happens when
multiple students have the same name? Or, what happens if the name was entered
incorrectly when the record was initially created?
[Figure 3.3: sample student records, each labeled with its unique student
identification number: 10015, 10142, 10175, and 10210.]
In this section, we define our own Map ADT and then provide an implementation
using a list. In later chapters, we will implement and evaluate the map using
a variety of data structures. We use the term map to distinguish our ADT from
the dictionary provided by Python. The Python dictionary is implemented using a
hash table, which requires the key objects to contain the __hash__ method for
generating a hash code. This can limit the type of problems with which a dictionary
can be used. We define our Map ADT with the minimum requirement that the
keys are comparable, which will allow it to be used in a wider range of problems.
It’s not uncommon to provide multiple implementations of an ADT as is done with
many language libraries. We will explore the implementation details of Python’s
dictionary later in Chapter 11 when we discuss hash tables and the design of hash
functions.
A map is a container for storing a collection of data records in which each record
is associated with a unique key. The key components must be comparable.
Map(): Creates a new empty map.
Instead of using two lists to store the key/value entries in the map, we can use
a single list. The individual keys and corresponding values can both be saved in a
single object, with that object then stored in the list. A sample instance illustrating
the data organization required for this approach is shown in Figure 3.4.
[Figure 3.4: a Map object whose entryList field refers to a list of MapEntry objects
at indices 0–3. Each entry pairs a key (10015, 10142, 10210, 10175) with the
corresponding student record.]
The implementation of the Map ADT using a single list is provided in
Listing 3.2. As we indicated earlier in Chapter 1, we want to avoid the use of tuples
when storing structured data since it’s better practice to use classes with named
fields. The _MapEntry storage class, defined in lines 56–59, will be used to store the
individual key/value pairs. Note this storage class is defined to be private since it’s
only intended for use by the Map class that provides the single list implementation
of the Map ADT.
17     # new value replaces the current value associated with the key.
18     def add( self, key, value ):
19         ndx = self._findPosition( key )
20         if ndx is not None :   # if the key was found
21             self._entryList[ndx].value = value
22             return False
23         else :   # otherwise add a new entry
24             entry = _MapEntry( key, value )
25             self._entryList.append( entry )
26             return True
27
28     # Returns the value associated with the key.
29     def valueOf( self, key ):
30         ndx = self._findPosition( key )
31         assert ndx is not None, "Invalid map key."
32         return self._entryList[ndx].value
33
34     # Removes the entry associated with the key.
35     def remove( self, key ):
36         ndx = self._findPosition( key )
37         assert ndx is not None, "Invalid map key."
38         self._entryList.pop( ndx )
39
40     # Returns an iterator for traversing the keys in the map.
41     def __iter__( self ):
42         return _MapIterator( self._entryList )
43
44     # Helper method used to find the index position of a key. If the
45     # key is not found, None is returned.
46     def _findPosition( self, key ):
47         # Iterate through each entry in the list.
48         for i in range( len(self) ) :
49             # Is the key stored in the ith entry?
50             if self._entryList[i].key == key :
51                 return i
52         # When not found, return None.
53         return None
54
55 # Storage class for holding the key/value pairs.
56 class _MapEntry :
57     def __init__( self, key, value ):
58         self.key = key
59         self.value = value
Many of the methods require a search to determine if the map contains a given
key. In this implementation, the standard in operator cannot be used since the list
contains _MapEntry objects and not simply key entries. Instead, we have to search
the list ourselves and examine the key field of each _MapEntry object. Likewise, we
routinely have to locate within the list the position containing a specific key/value
entry. Since these operations will be needed in several methods, we can create a
helper method that combines the two searches and use it where needed.
The _findPosition() helper method searches the list for the given key. If
the key is found, the index of its location is returned; otherwise, the function
returns None to indicate the key is not contained in the map. When used by
the other methods, the value returned can be evaluated to determine both the
existence of the key and the location of the corresponding entry if the key is in
the map. By combining the two searches into a single operation, we eliminate the
need to first determine if the map contains the key and then search again for
its location. Given the helper method, the implementation of the various methods
is straightforward. Implementation of the iterator method is left as an exercise.
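The way a caller uses the single search result can be sketched outside the class. Here findPosition is written as a hypothetical standalone function that mirrors the helper method, and the keys and values are sample data chosen for this example.

```python
class _MapEntry:
    def __init__(self, key, value):
        self.key = key
        self.value = value


def findPosition(entryList, key):
    # Return the index of the entry holding the key, or None.
    for i in range(len(entryList)):
        if entryList[i].key == key:
            return i
    return None


entryList = [_MapEntry(10015, "Smith"), _MapEntry(10142, "Roberts")]

# One search answers both questions: is the key present, and where?
ndx = findPosition(entryList, 10142)
if ndx is not None:
    entryList[ndx].value = "Roberts, Susan"

missing = findPosition(entryList, 99999)
print(ndx, missing)   # prints: 1 None
```

Comparing against None with "is not None" matters here, since a valid index of 0 would be treated as false by a plain truth test.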
Figure 3.5: Sample multi-dimensional arrays: (left) a 2-D array viewed as a rectangular
table and (right) a 3-D array viewed as a box of tables. [Axes shown: rows, columns,
tables.]
length( dim ): Returns the length of the given array dimension. The individual
dimensions are numbered starting from 1, where 1 represents the first, or
highest, dimension possible in the array. Thus, in an array with three dimensions,
1 indicates the number of tables in the box, 2 is the number of rows,
and 3 is the number of columns.
clear( value ): Clears the array by setting each element to the given value.
appropriate syntax to make use of a 1-D array. Multi-dimensional arrays are not
handled at the hardware level. Instead, the programming language typically
provides its own mechanism for creating and managing multi-dimensional arrays.
As we saw earlier, a one-dimensional array is composed of a group of sequential
elements stored in successive memory locations. The index used to reference a
particular element is simply the offset from the first element in the array. In most
programming languages, a multi-dimensional array is actually created and stored
in memory as a one-dimensional array. With this organization, a multi-dimensional
array is simply an abstract view of a physical one-dimensional data structure.
Array Storage
A one-dimensional array is commonly used to physically store arrays of higher
dimensions. Consider a two-dimensional array divided into a table of rows and
columns as illustrated in Figure 3.6. How can the individual elements of the table
be stored in the one-dimensional structure while maintaining direct access to the
individual table elements? There are two common approaches. The elements
can be stored in row-major order or column-major order. Most high-level
programming languages use row-major order, with FORTRAN being one of the
few languages that uses column-major ordering to store and manage 2-D arrays.
[Figure 3.6: a sample 2-D array with 3 rows and 5 columns.]
        0    1    2    3    4
  0     2   15   45   13   78
  1    40   12   52   91   86
  2    59   25   33   41   66
In row-major order, the individual rows are stored sequentially, one at a time,
as illustrated in Figure 3.7. The first row of 5 elements is stored in the first 5
sequential elements of the 1-D array, the second row of 5 elements is stored in
the next 5 sequential elements, and so forth.
In column-major order, the 2-D array is stored sequentially, one entire column
at a time, as illustrated in Figure 3.8. The first column of 3 elements is stored in
the first 3 sequential elements of the 1-D array, followed by the 3 elements of the
second column, and so on.
For larger dimensions, a similar approach can be used. With a three-dimensional
array, the individual tables can be stored contiguously using either row-major or
column-major ordering. As the number of dimensions grows, all elements within
a single instance of each dimension are stored contiguously before the next
instance. For example, given a four-dimensional array, which can be thought of as
an array of boxes, all elements of an individual box (3-D array) are stored before
the next box.
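The two orderings can be compared with a short sketch. It assumes the 2-D table is held as a Python list of rows, which is a convenience for this example rather than how a language stores arrays internally, and uses the sample values from the chapter's 3 × 5 array.

```python
# The sample 3 x 5 table, written as a list of rows.
table = [
    [2, 15, 45, 13, 78],
    [40, 12, 52, 91, 86],
    [59, 25, 33, 41, 66],
]
numRows = len(table)
numCols = len(table[0])

# Row-major: walk each row completely before moving to the next row.
rowMajor = [table[i][j] for i in range(numRows) for j in range(numCols)]

# Column-major: walk each column completely before moving to the next.
colMajor = [table[i][j] for j in range(numCols) for i in range(numRows)]

print(rowMajor)
print(colMajor)
```

Both lists hold the same 15 values; only the order in which the rows and columns are traversed differs.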
3.3 Multi-Dimensional Arrays 83
        0    1    2    3    4
  0     2   15   45   13   78
  1    40   12   52   91   86
  2    59   25   33   41   66

value:   2  15  45  13  78  40  12  52  91  86  59  25  33  41  66
index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Figure 3.7: Physical storage of a sample 2-D array (top) in a 1-D array using row-major
order (bottom).
Index Computation
Since multi-dimensional arrays are created and managed by instructions in the
programming language, accessing an individual element must also be handled by
the language. When an individual element of a 2-D array is accessed, the compiler
must include additional instructions to calculate the offset of the specific element
within the 1-D array. Given a 2-D array of size m×n and using row-major ordering,
an equation can be derived to compute this offset.
To derive the formula, consider the 2-D array illustrated in Figure 3.7 and
observe the physical storage location within the 1-D array for the first element in
several of the rows. Element (0, 0) maps to position 0 since it is the first element
in both the abstract 2-D and physical 1-D arrays. The first entry of the second
row (1, 0) maps to position n since it follows the first n elements of the first row.
Likewise, element (2, 0) maps to position 2n since it follows the first 2n elements
in the first two rows. We could continue in the same fashion through all of the
rows, but you would soon notice the position for the first element of the ith row is
n ∗ i. Since the subscripts start from zero, the ith subscript not only represents a
specific row but also indicates the number of complete rows skipped to reach the
ith row.
        0    1    2    3    4
  0     2   15   45   13   78
  1    40   12   52   91   86
  2    59   25   33   41   66

value:   2  40  59  15  12  25  45  52  33  13  91  41  78  86  66
index:   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14
Figure 3.8: Physical storage of a sample 2-D array (top) in a 1-D array using
column-major order (bottom).
Knowing the position of the first element of each row, the position for any
element within a 2-D array can be determined. Given an element (i, j) of a 2-D
array, the storage location of that element in the 1-D array is computed as
\[ \mathrm{index}_2(i, j) = n \ast i + j \]
The column index, j, is not only the offset within the given row but also the
number of elements that must be skipped in the ith row to reach the jth column.
To see this formula in action, again consider the 2-D array from Figure 3.7 and
assume we want to access element (2, 3). Finding the target element within the
1-D array requires skipping over the first 2 complete rows of elements:
row 0 (skipped):    2   15   45   13   78
row 1 (skipped):   40   12   52   91   86
row 2 (i = 2):     59   25   33  [41]  66
Plugging the indices into the equation from above results in an index position of
13, which corresponds to the position of element (2, 3) within the 1-D array used
to physically store the 2-D array.
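The arithmetic can be checked with a few lines of code. The function name index2 and the storage list are assumptions made for this sketch; the values are the row-major contents of the sample array.

```python
def index2(i, j, numCols):
    # Skip i complete rows of numCols elements, then j elements in row i.
    return i * numCols + j


# Row-major storage of the sample 3 x 5 array.
storage = [2, 15, 45, 13, 78, 40, 12, 52, 91, 86, 59, 25, 33, 41, 66]

offset = index2(2, 3, 5)
print(offset, storage[offset])   # prints: 13 41
```

Here 2 ∗ 5 accounts for the two skipped rows and the remaining 3 is the offset within row 2.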
Similar equations can be derived for arrays of higher dimensions. Given a 3-D
array of size d1 × d2 × d3 , the 1-D array offset of element (i1 , i2 , i3 ) stored using
row-major order will be
\[ \mathrm{index}_3(i_1, i_2, i_3) = i_1 \ast (d_2 \ast d_3) + i_2 \ast d_3 + i_3 \]
For each component (i) in the subscript, the equation computes the number of
elements that must be skipped within the corresponding dimension. For example,
the factor (d2 ∗ d3 ) indicates the number of elements in a single table of the cube.
When it’s multiplied by i1 we get the number of complete tables to skip and in turn
the number of elements to skip in order to arrive at the first element of table i1 .
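A quick way to convince yourself of the 3-D computation is to enumerate every subscript in row-major order and confirm that the offsets come out consecutive. The function name index3 and the sample sizes are assumptions for this sketch.

```python
def index3(i1, i2, i3, d2, d3):
    # Skip i1 complete tables, then i2 complete rows, then i3 elements.
    return i1 * (d2 * d3) + i2 * d3 + i3


d1, d2, d3 = 2, 3, 4

# Visit the subscripts in row-major order; each one should land on
# the next consecutive offset of the 1-D array.
expected = 0
for i1 in range(d1):
    for i2 in range(d2):
        for i3 in range(d3):
            assert index3(i1, i2, i3, d2, d3) == expected
            expected += 1

print(index3(1, 2, 3, d2, d3))   # prints 23
```

Note that d1 never appears in the offset computation; it only bounds the valid range of i1.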
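In general, for an array of n dimensions, the 1-D array offset of element
(i1 , i2 , . . . , in ) can be computed as
\[ \mathrm{index}(i_1, i_2, \ldots, i_n) = i_1 \ast f_1 + i_2 \ast f_2 + \cdots + i_n \ast f_n \tag{3.4} \]
where the fj values are the factors representing the number of elements to be
skipped within the corresponding dimension and are computed using
\[ f_n = 1 \quad \text{and} \quad f_j = \prod_{k=j+1}^{n} d_k \quad \forall\, 0 < j < n \tag{3.5} \]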
The size of a multi-dimensional array is fixed at the time it’s created and cannot
change during execution. Likewise, the several fj products used in the equation
above will not change once the size of the array is set. This can be used to our
advantage to reduce the number of multiplications required to compute the element
offsets. Instead of computing the products every time an element is accessed, we
can compute and store the factor values and simply plug them into the equation
when needed.
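One possible way to precompute the factors is sketched below. The function names compute_factors and compute_offset are hypothetical, plain Python lists stand in for arrays, and this is only one of several reasonable approaches.

```python
def compute_factors(dims):
    # The last factor is 1; each earlier factor is the next factor
    # multiplied by the length of the next dimension.
    n = len(dims)
    factors = [1] * n
    for j in range(n - 2, -1, -1):
        factors[j] = factors[j + 1] * dims[j + 1]
    return factors


def compute_offset(subscripts, factors):
    # Sum of each subscript times its precomputed factor.
    offset = 0
    for i, f in zip(subscripts, factors):
        offset += i * f
    return offset


factors2 = compute_factors((3, 5))      # 2-D array of 3 rows, 5 columns
factors3 = compute_factors((2, 3, 4))   # 3-D array of size 2 x 3 x 4
print(factors2, factors3)
print(compute_offset((2, 3), factors2))
print(compute_offset((1, 2, 3), factors3))
```

Working from the last dimension backward means each factor costs a single multiplication, and the stored values are then reused on every element access.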
When using the function, we can pass a variable number of arguments for each
invocation. For example, all of the following are valid function calls:
func( 12 )
func( 5, 8, 2 )
func( 18, -2, 50, 21, 6 )
The asterisk next to the argument name (*args) tells Python to accept any
number of arguments and to combine them into a tuple. The tuple is then passed
to the function and assigned to the formal argument marked with the asterisk.
Note the asterisk is only used in the argument list to indicate that the function
or method can accept any number of arguments. It is not part of the argument
name. The len() operation can be applied to the tuple to determine the number
of actual arguments passed to the function. The individual arguments, which
are elements in the tuple, can be accessed by using the subscript notation or by
iterating the collection.
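A small function illustrates the behavior described above. The body of func is an assumption made for this example; it simply reports how many arguments arrived and returns their sum.

```python
def func(*args):
    # args is a tuple holding every positional argument passed in.
    print("Number of arguments:", len(args))
    total = 0
    for value in args:       # iterate over the tuple
        total += value
    return total


a = func(12)
b = func(5, 8, 2)
c = func(18, -2, 50, 21, 6)
print(a, b, c)   # prints: 12 15 93
```

Inside the function the name is simply args; the asterisk appears only in the parameter list.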
Constructor
The constructor, which is shown in lines 4–19, defines three data fields: dims
stores the sizes of the individual dimensions; factors stores the factor values
used in the index equation; and elements is used to store the 1-D array used as
the physical storage for the multi-dimensional array.
The constructor is defined to accept a variable-length argument as required in
the ADT definition. The resulting tuple will contain the sizes of the individual
dimensions and is assigned to the dims field. The dimensionality of the array
must be verified at the beginning of the constructor as the MultiArray ADT is
meant for use with arrays of two dimensions or more.
The elements of the multi-dimensional array will be stored in a 1-D array. The
fixed size of the array can be computed as the product of the dimension lengths
by traversing over the tuple containing the variable-length argument. During the
traversal, the precondition requiring all dimension lengths be greater than zero is
also evaluated. The Array class defined earlier in the chapter is used to create the
storage array.
Finally, a 1-D array is created and assigned to the factors field. The size of
the array is equal to the number of dimensions in the multi-dimensional array. This
array will be initialized to the factor values used in Equation 3.4 for computing
the element offsets. The actual computation and initialization is performed by the
computeFactors() helper method, which is left as an exercise. A sample instance
of the MultiArray class is illustrated in Figure 3.9.
dims:      3  5
factors:   5  1
elements:  2  15  45  13  78  40  12  52  91  86  59  25  33  41  66
Figure 3.9: A sample MultiArray object for the 2-D array from Figure 3.6.
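The constructor steps described above can be sketched as follows. The class name MultiArraySketch is hypothetical, a Python list stands in for the Array class, and the factors are computed inline here as one possible approach rather than through a separate helper.

```python
class MultiArraySketch:
    def __init__(self, *dimensions):
        # The ADT is meant for arrays of two or more dimensions.
        assert len(dimensions) > 1, "The array must have 2 or more dimensions."
        self._dims = dimensions

        # Total size is the product of the dimension lengths; each
        # length must be greater than zero.
        size = 1
        for d in dimensions:
            assert d > 0, "Dimensions must be > 0."
            size *= d
        self._elements = [None] * size   # stand-in for the Array class

        # One factor per dimension, filled in from the last one backward.
        self._factors = [1] * len(dimensions)
        for j in range(len(dimensions) - 2, -1, -1):
            self._factors[j] = self._factors[j + 1] * dimensions[j + 1]

    def numDims(self):
        return len(self._dims)

    def length(self, dim):
        # Dimensions are numbered starting from 1.
        assert 1 <= dim <= len(self._dims), "Dimension component out of range."
        return self._dims[dim - 1]


grid = MultiArraySketch(3, 5)
print(grid.numDims(), grid.length(1), grid.length(2))   # prints: 2 3 5
```

For the 3 × 5 case this produces the same dims and factors values as the sample object in the figure.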
numDims() method returns the dimensionality of the array, which can be obtained
from the number of elements in the dims tuple.
Element Access
Access to individual elements within an n-D array requires an n-tuple or multi-
component subscript, one for each dimension. As indicated in Section 2.3.2, when
a multi-component subscript is specified (i.e., y = x[i,j]), Python automatically
stores the components in a tuple in the order listed within the brackets and passes
the tuple to the ndxTuple argument.
The contents of the ndxTuple are passed to the computeIndex() helper method
to compute the index offset within the 1-D storage array. The use of the helper
method reduces the need for duplicate code that otherwise would be required in
both element access methods. The __setitem__ operator method can be
implemented in a similar fashion. The major difference is that this method requires a
second argument to receive the value to which an element is set and modifies the
indicated element with the new value instead of returning a value.
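The element access methods can be sketched as below, assuming a hypothetical class (MultiArraySketch) that stores its elements in a Python list. It shows how a subscript such as x[2, 3] arrives as a tuple and is turned into a 1-D offset by a shared helper.

```python
class MultiArraySketch:
    def __init__(self, *dimensions):
        self._dims = dimensions
        self._factors = [1] * len(dimensions)
        for j in range(len(dimensions) - 2, -1, -1):
            self._factors[j] = self._factors[j + 1] * dimensions[j + 1]
        self._elements = [None] * (self._factors[0] * dimensions[0])

    def _computeIndex(self, ndxTuple):
        # Shared by both access methods to avoid duplicate code.
        assert len(ndxTuple) == len(self._dims), "Invalid # of array subscripts."
        offset = 0
        for j, ndx in enumerate(ndxTuple):
            assert 0 <= ndx < self._dims[j], "Array subscript out of range."
            offset += ndx * self._factors[j]
        return offset

    def __getitem__(self, ndxTuple):
        return self._elements[self._computeIndex(ndxTuple)]

    def __setitem__(self, ndxTuple, value):
        self._elements[self._computeIndex(ndxTuple)] = value


x = MultiArraySketch(3, 5)
x[2, 3] = 41          # Python passes (2, 3) as ndxTuple
y = x[2, 3]
print(y)              # prints 41
```

The only difference between the two methods is the extra value argument and the assignment in place of the return.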
: : : : ... : :
where the first line indicates the number of stores; the second line indicates the
number of individual items (both of which are integers); and the remaining lines
contain the sales data. Each line of the sales data consists of four pieces of
information: the store number, the month number, the item number, and the sales
amount for the given item in the given store during the given month. For
simplicity, the store and item numbers will consist of consecutive integer values in the
range [1 . . . max], where max is the number of stores or items as extracted from
the first two lines of the file. The month is indicated by an integer in the range
[1 . . . 12] and the sales amount is a floating-point value.
Data Organization
While some reports, like the student report from Chapter 1, are easy to produce
by simply extracting the data and writing it to the report, others require that we
first organize the data in some meaningful way in order to extract the information
needed. That is definitely the case for this problem, where we may need to produce
many different reports from the same collection of data. The ideal structure for
storing the sales data is a 3-D array, as shown in Figure 3.11, in which one
dimension represents the stores, another represents the items sold in the stores, and the
last dimension represents each of the 12 months in the calendar year. The 3-D
array can be viewed as a collection of spreadsheets, as illustrated in Figure 3.12.
[Figures 3.11 and 3.12: the sales data drawn as a 3-D array with axes stores (0–7),
items (0–99), and months (0–11), and the same array viewed as a stack of
spreadsheets, one per store, with items as rows and months as columns.]
Each spreadsheet contains the sales for a specific store and is divided into rows and
columns where each row contains the sales for one item and the columns contain
the sales for each month.
Since the store, item, and month numbers are all composed of consecutive
integer values starting from 1, we can easily represent each by a unique index
that is one less than the given number. For example, the data for January will be
stored in column 0, the data for February will be stored in column 1, and so on.
Likewise, the data for item number 1 will be stored in row 0, the data for item
number 2 will be stored in row 1, and so on. We leave the actual extraction of
the data from a text file as an exercise. But for illustration purposes, we assume
this step has been completed resulting in the creation and initialization of the 3-D
array as shown here:
# Compute the total sales of all items for all months in a given store.
def totalSalesByStore( salesData, store ):
    # Subtract 1 from the store # since the array indices are 1 less
    # than the given store #.
    s = store-1
    # Accumulate the total sales for the given store.
    total = 0.0
    # Sum every item row (dimension 2) and month column (dimension 3).
    for i in range( salesData.length(2) ):
        for m in range( salesData.length(3) ):
            total += salesData[s, i, m]
    return total
Assuming our view of the data as a collection of spreadsheets, this requires
traversing over every element in the spreadsheet containing the data for the given store.
If store equals 1, this is equivalent to processing every element in the spreadsheet
shown at the front of Figure 3.12. Two nested loops are required since we must sum
the values from each row and column contained in the given store spreadsheet.
The number of rows (dimension number 2) and columns (dimension number 3) can
be obtained using the length() array method.
return total
This time, the two nested loops have to iterate over every row of every
spreadsheet for the single column representing the given month. If we use this function
to compute the total sales for the month of January, the elements of the 3-D array
that will be accessed are shown by the shaded area in Figure 3.13(a).
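The traversal described above can be sketched as follows. SalesArray is a hypothetical stand-in that supports only length(dim) and three-component subscripts, and the function body is a reconstruction from the description rather than the original listing.

```python
class SalesArray:
    """Tiny stand-in for the 3-D array: length(dim) and [s, i, m] only."""

    def __init__(self, numStores, numItems, numMonths):
        self._dims = (numStores, numItems, numMonths)
        self._data = [0.0] * (numStores * numItems * numMonths)

    def length(self, dim):
        return self._dims[dim - 1]

    def _offset(self, ndx):
        s, i, m = ndx
        _, numItems, numMonths = self._dims
        return (s * numItems + i) * numMonths + m

    def __getitem__(self, ndx):
        return self._data[self._offset(ndx)]

    def __setitem__(self, ndx, value):
        self._data[self._offset(ndx)] = value


def totalSalesByMonth(salesData, month):
    # The month number must be offset by 1.
    m = month - 1
    total = 0.0
    # Visit every store, and every item row within each store,
    # for the single column of the given month.
    for s in range(salesData.length(1)):
        for i in range(salesData.length(2)):
            total += salesData[s, i, m]
    return total


sales = SalesArray(2, 3, 12)
sales[0, 0, 0] = 10.0    # store 1, item 1, January
sales[1, 2, 0] = 5.5     # store 2, item 3, January
sales[1, 2, 1] = 99.0    # February; not part of the January total
print(totalSalesByMonth(sales, 1))   # prints 15.5
```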
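# Compute the total sales of a single item in all stores over all months.
def totalSalesByItem( salesData, item ):
    # The item number must be offset by 1.
    i = item - 1
    total = 0.0
    # Sum the given item's row across every store and every month.
    for s in range( salesData.length(1) ):
        for m in range( salesData.length(3) ):
            total += salesData[s, i, m]
    return total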
The cells of the array that would be accessed when using this function to
compute the total sales for item number 5 are shown by the shaded area in
Figure 3.13(b). Remember, the sales for each item are stored in a specific row of the
array and the index of that row is one less than the item number since the indices
start at 0.
Figure 3.13: The elements of the 3-D array that must be accessed to compute the total
sales: (a) for the month of January and (b) for item number 5. [Axes shown in both
panels: stores, items, months.]
# Compute the total sales per month for a given store. A 1-D array is
# returned that contains the totals for each month.
# Iterate over the sales of each item sold during the m month.
for i in range( salesData.length(2) ):
    sum += salesData[s, i, m]
Figure 3.14 illustrates the use of the 1-D array for storing the individual
monthly totals. The shaded area shows the elements of the 3-D array that are
accessed when computing the total sales for the month of April at store number 1.
The monthly total will be stored at index position 3 within the 1-D array since
that is the corresponding column in the 3-D array for the month of April.
Figure 3.14: The elements of the 3-D array that must be accessed to compute the monthly
sales for store number 1. [The figure pairs the spreadsheet for store index 0 (items
by months) with the 1-D totals array indexed 0–11.]
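A complete version of the per-month computation might look like the sketch below. SalesArray is a hypothetical stand-in built on nested lists, a plain Python list is returned in place of a 1-D array, and the function is a reconstruction from the surrounding description.

```python
class SalesArray:
    """Hypothetical stand-in for the 3-D sales array."""

    def __init__(self, numStores, numItems, numMonths):
        self._dims = (numStores, numItems, numMonths)
        self._data = [[[0.0] * numMonths for _ in range(numItems)]
                      for _ in range(numStores)]

    def length(self, dim):
        return self._dims[dim - 1]

    def __getitem__(self, ndx):
        s, i, m = ndx
        return self._data[s][i][m]

    def __setitem__(self, ndx, value):
        s, i, m = ndx
        self._data[s][i][m] = value


def totalSalesPerMonth(salesData, store):
    # The store number must be offset by 1.
    s = store - 1
    # One total per month, stored at the month's column index.
    totals = [0.0] * salesData.length(3)
    for m in range(salesData.length(3)):
        monthTotal = 0.0
        # Iterate over the sales of each item sold during month m.
        for i in range(salesData.length(2)):
            monthTotal += salesData[s, i, m]
        totals[m] = monthTotal
    return totals


sales = SalesArray(2, 3, 12)
sales[0, 0, 3] = 20.0     # store 1, item 1, April
sales[0, 2, 3] = 2.5      # store 1, item 3, April
sales[1, 1, 3] = 100.0    # store 2; must not be counted for store 1
totals = totalSalesPerMonth(sales, 1)
print(totals[3])          # prints 22.5
```

The accumulator is named monthTotal rather than sum so that Python's built-in sum() is not shadowed.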
Exercises
3.1 Complete the Set ADT by implementing intersect() and difference().
It can then be used as shown here to create a set initialized with the given
values:
3.3 Add a new operation to the Set ADT to test for a proper subset. Given two
sets, A and B, A is a proper subset of B, if A is a subset of B and A does not
equal B.
3.4 Add the __str__() method to the Set implementation to allow a user to print
the contents of the set. The resulting string should look similar to that of a
list, except you are to use curly braces to surround the elements.
3.5 Add Python operator methods to the Set class that can be used to perform
similar operations to those already defined by named methods:
3.6 Add a new operation keyArray() to the Map class that returns an array
containing all of the keys stored in the map. The array of keys should be in no
particular ordering.
3.7 Add Python operators to the Map class that can be used to perform similar
operations to those already defined by named methods:
3.8 Design and implement the iterator class SetIterator for use with the Set
ADT implemented using a list.
3.9 Design and implement the iterator class MapIterator for use with the Map
ADT implemented using a list.
3.10 Develop the index equation that computes the location within a 1-D array for
element (i, j) of a 2-D array stored in column-major order.
3.11 The 2-D array described in Chapter 2 is a simple rectangular structure
consisting of the same number of elements in each row. Other layouts are possible
and sometimes required by problems in computer science. For example, the lower
triangular array (pictured in the original figure with rows 0–4 and columns 0–4)
is organized such that the rows are staggered with each successive row
consisting of one more element than the previous row.
(a) Derive an equation that computes the total number of elements in the
lower triangular table for a table of size m × n.
(b) Derive an index equation that maps an element of the lower triangular
table onto a one-dimensional array stored in row-major order.
Programming Projects
3.1 In this chapter, we implemented the Set ADT using a list. Implement the Set
ADT using a bag created from the Bag class. In your opinion, which is the
better implementation? Explain your answer.
3.2 Define a new class named TriangleArray to implement the lower triangular
table described in Exercise 3.11.
3.3 Given a collection of items stored in a bag, design a linear time algorithm that
determines the number of unique items in the collection.
3.4 Write a function that extracts the sales data from a text file and builds the
3-D array used to produce the various reports in Section 3.4. Assume the data
file has the format as described in the chapter.
3.5 Write a menu-driven program that uses your function from the previous
question to extract the sales data and can produce any of the following reports: