BCS401-Module-3
LECTURE 23:
There are three major variations of this idea that differ by what we transform a given instance to: a simpler or more convenient instance of the same problem (instance simplification), a different representation of the same instance (representation change), or an instance of a different problem for which an algorithm is already available (problem reduction).
AVL Trees:
AVL trees were invented in 1962 by two Russian scientists, G. M. Adelson-Velsky and E. M.
Landis [Ade62], after whom this data structure is named.
An AVL tree is a binary search tree in which the balance factor of every node, which is defined
as the difference between the heights of the node’s left and right subtrees, is either 0 or +1 or −1.
The height of the empty tree is defined as −1.
For example, the binary search tree in Figure 6.2a is an AVL tree but the one in Figure 6.2b is
not.
If an insertion of a new node makes an AVL tree unbalanced, we transform the tree by a rotation.
A rotation in an AVL tree is a local transformation of its subtree rooted at a node whose balance
factor has become either +2 or −2. If there are several such nodes, we rotate the tree rooted at the
unbalanced node that is the closest to the newly inserted leaf. There are only four types of
rotations: the single right rotation (R-rotation), the single left rotation (L-rotation), the double
left-right rotation (LR-rotation), and the double right-left rotation (RL-rotation); in fact, two of
them are mirror images of the other two.
An example of constructing an AVL tree for a given list of numbers is shown in the above figure.
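To make the rotation idea concrete, here is a minimal Python sketch of the single R-rotation; the Node class and the helper names are assumptions for this illustration, with node heights stored explicitly so balance factors can be computed.

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None
            self.height = 0          # a single node has height 0; the empty tree has height -1

    def height(node):
        return node.height if node else -1

    def update_height(node):
        node.height = 1 + max(height(node.left), height(node.right))

    def balance_factor(node):
        # difference between the heights of the node's left and right subtrees
        return height(node.left) - height(node.right)

    def rotate_right(r):
        # single R-rotation: the left child c becomes the new subtree root,
        # r becomes its right child, and c's old right subtree moves under r
        c = r.left
        r.left = c.right
        c.right = r
        update_height(r)
        update_height(c)
        return c                     # the caller re-attaches the returned root

    # Example: keys 3, 2, 1 inserted in that order produce a left-leaning chain
    # with balance factor +2 at the root; one R-rotation rebalances it.
    r = Node(3); r.left = Node(2); r.left.left = Node(1)
    r.left.height = 1; r.height = 2
    print(balance_factor(r))         # 2 -> unbalanced
    r = rotate_right(r)
    print(r.key, balance_factor(r))  # 2 0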
As with any search tree, the critical characteristic is the tree’s height. It turns out that it is
bounded both above and below by logarithmic functions.
⌊log₂ n⌋ ≤ h < 1.4405 log₂(n + 2) − 1.3277.
Review Questions:
1. How do you construct an AVL tree?
2. What is the drawback of AVL trees?
3. What is the advantage of AVL trees?
Review Questions:
1. What is a 2-node?
2. Define a 3-node.
3. What is the upper bound on the height of a 2-3 tree?
4. What is the lower bound on the height of a 2-3 tree?
5. What is the efficiency of insertion in a 2-3 tree?
A heap can be defined as a binary tree with keys assigned to its nodes, one key per node, provided the following two conditions are met:
1. The shape property—the binary tree is essentially complete (or simply complete), i.e., all its
levels are full except possibly the last level, where only some rightmost leaves may be missing.
2. The parental dominance or heap property—the key in each node is greater than or equal to the
keys in its children.
The first tree is a heap. The second one is not a heap, because the tree’s shape property is violated.
The third one is not a heap, because the parental dominance fails for the node with key 5.
Key values in a heap are ordered top down; i.e., a sequence of values on any path from the root to
a leaf is decreasing. However, there is no left-to-right order in key values; i.e., there is no
relationship among key values for nodes either on the same level of the tree or, more generally, in
the left and right subtrees of the same node.
1. There exists exactly one essentially complete binary tree with n nodes. Its height is equal to ⌊log₂ n⌋.
2. The root of a heap always contains its largest element.
3. A node of a heap considered with all its descendants is also a heap.
4. A heap can be implemented as an array by recording its elements in the top-down, left-to-right
fashion. It is convenient to store the heap's elements in positions 1 through n of such an array,
leaving H[0] unused. In such a representation,
a. the parental node keys will be in the first ⌊n/2⌋ positions of the array, while the leaf
keys will occupy the last ⌈n/2⌉ positions;
b. the children of a key in the array's parental position i (1 ≤ i ≤ ⌊n/2⌋) will be in positions
2i and 2i + 1, and, correspondingly, the parent of a key in position i (2 ≤ i ≤ n) will be in
position ⌊i/2⌋.
A heap can be defined as an array H[1..n] in which every element in position i in the first half of
the array is greater than or equal to the elements in positions 2i and 2i + 1, i.e.,
H[i] ≥ max{H[2i], H[2i + 1]} for i = 1, . . . , ⌊n/2⌋.
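This index arithmetic translates directly into code. Below is a minimal Python sketch of a check of the heap condition for an array H with positions 1 through n (H[0] unused, as above); the function name is chosen for this example.

    def is_heap(H, n):
        # H[1..n] holds the keys; H[0] is unused, as in the representation above
        for i in range(1, n // 2 + 1):                   # positions 1..n/2 are the parents
            if H[i] < H[2 * i]:                          # a parent always has a left child
                return False
            if 2 * i + 1 <= n and H[i] < H[2 * i + 1]:   # the right child may be missing
                return False
        return True

    print(is_heap([None, 10, 5, 7, 4, 2, 1], 6))   # True: parental dominance holds everywhere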
Review Questions:
1. What is a priority queue?
2. Define Heap.
3. What element does the root of heap contain?
4. How is heap represented using an array?
There are two principal alternatives for constructing a heap for a given list of keys.
The first is the bottom-up heap construction algorithm, as illustrated in the figure. It initializes the essentially complete binary tree with n nodes by placing keys in the order given and then "heapifies" the tree: starting with the last parental node and ending with the root, it checks whether the key K in the current node is greater than or equal to the keys in its children; if it is not, it exchanges K with the larger of the two children's keys and checks the condition again for K in its new position, continuing until parental dominance holds for K.
How efficient is this algorithm in the worst case? Assume, for simplicity, that n = 2^k − 1, so that
a heap's tree is full, i.e., the largest possible number of nodes occurs on each level. Let h be the
height of the tree. According to the first property of heaps in the list at the beginning of the
section, h = ⌊log₂ n⌋ = k − 1 for such trees. In the worst case, each key on level i of the tree will
travel down to the leaf level h, and since moving one level down requires two comparisons (one to
find the larger child and one to decide whether the exchange is required), the total number of key
comparisons is C_worst(n) = 2(n − log₂(n + 1)), i.e., bottom-up heap construction requires only
linear time.
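A minimal Python sketch of the bottom-up construction just described, under the same 1-based array conventions (H[0] unused); the function name is an assumption for this example.

    def heap_bottom_up(H, n):
        # Heapify every parental node, from the last parent up to the root
        for i in range(n // 2, 0, -1):
            k, v = i, H[i]
            heap = False
            while not heap and 2 * k <= n:
                j = 2 * k
                if j < n and H[j] < H[j + 1]:   # pick the larger of the two children
                    j += 1
                if v >= H[j]:
                    heap = True                 # v dominates both children: done
                else:
                    H[k] = H[j]                 # move the larger child's key up
                    k = j
            H[k] = v                            # place v in its final position

    H = [None, 2, 9, 7, 6, 5, 8]
    heap_bottom_up(H, 6)
    print(H[1:])                                # [9, 6, 8, 2, 5, 7]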
The alternative algorithm constructs a heap by successive insertions of a new key into a previously
constructed heap. To insert a new key K into a heap, first attach a new node with key K in it after
the last leaf of the existing heap. Then sift K up to its appropriate place in the new heap as
follows. Compare K with its parent's key: if the latter is greater than or equal to K, stop (the
structure is a heap); otherwise, swap these two keys and compare K with its new parent. This
swapping continues until K is not greater than its current parent or it reaches the root.
Since the height of a heap with n nodes is about log₂ n, the time efficiency of insertion is in
O(log n).
(Figure: inserting a new key into a heap.)
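A minimal Python sketch of this sift-up insertion, again with 1-based positions; the function name is an assumption for this example.

    def heap_insert(H, key):
        # Attach the new key after the last leaf, then sift it up
        H.append(key)
        i = len(H) - 1                          # position of the new key
        while i > 1 and H[i // 2] < H[i]:
            H[i // 2], H[i] = H[i], H[i // 2]   # swap the key with its parent
            i //= 2

    H = [None, 9, 6, 8, 2, 5, 7]
    heap_insert(H, 10)                          # 10 sifts up all the way to the root
    print(H[1:])                                # [10, 6, 9, 2, 5, 7, 8]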
Deleting the root's key from a heap (maximum key deletion) can be done with the following
algorithm, illustrated in the figure.
Step 1: Exchange the root's key with the last key K of the heap.
Step 2: Decrease the heap's size by 1.
Step 3: "Heapify" the smaller tree by sifting K down the tree exactly in the same way as in the
bottom-up heap construction algorithm: verify the parental dominance for K and, if it does not
hold, exchange K with the larger of its two children, repeating until the heap condition is satisfied.
The efficiency of deletion is determined by the number of key comparisons needed to "heapify"
the tree after the swap, which cannot exceed twice the heap's height; hence deletion is in O(log n).
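The three steps can be sketched in Python as follows, under the same array conventions; the function name is an assumption, and the sift-down loop repeats the inner loop of the bottom-up construction.

    def heap_delete_max(H, n):
        # Step 1: exchange the root's key with the last key
        H[1], H[n] = H[n], H[1]
        # Step 2: decrease the heap's size by 1
        n -= 1
        # Step 3: sift the new root's key down until parental dominance holds
        k, v = 1, H[1]
        while 2 * k <= n:
            j = 2 * k
            if j < n and H[j] < H[j + 1]:   # larger of the two children
                j += 1
            if v >= H[j]:
                break
            H[k] = H[j]
            k = j
        H[k] = v
        return n                            # new heap size; the deleted max sits in H[n + 1]

    H = [None, 9, 8, 6, 2, 5, 1]
    n = heap_delete_max(H, 6)
    print(H[1:n + 1])                       # [8, 5, 6, 2, 1]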
Review Questions
1. What is bottom-up heap construction?
2. What is the total number of key comparisons in the worst case?
3. What is top-down heap construction?
4. What is the time efficiency for insertion?
5. What is the time efficiency for deletion?
HEAPSORT
Heapsort is a two-stage algorithm that works as follows.
Stage 1 (heap construction): Construct a heap for a given array.
Stage 2 (maximum deletions): Apply the root-deletion operation n − 1 times to the remaining
heap.
As a result, the array elements are eliminated in decreasing order. But since under the array
implementation of heaps an element being deleted is placed last, the resulting array will be
exactly the original array sorted in increasing order. Heapsort is traced on a specific input in the
figure.
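Combining the two stages gives the following minimal Python sketch of heapsort, with the sift-down step factored out as a helper; both function names are assumptions for this example.

    def sift_down(H, k, n):
        # Restore parental dominance for the key at position k in H[1..n]
        v = H[k]
        while 2 * k <= n:
            j = 2 * k
            if j < n and H[j] < H[j + 1]:   # larger of the two children
                j += 1
            if v >= H[j]:
                break
            H[k] = H[j]
            k = j
        H[k] = v

    def heapsort(H, n):
        # Stage 1 (heap construction): heapify every parental node
        for i in range(n // 2, 0, -1):
            sift_down(H, i, n)
        # Stage 2 (maximum deletions): move the root's key to the end n - 1 times
        for last in range(n, 1, -1):
            H[1], H[last] = H[last], H[1]
            sift_down(H, 1, last - 1)

    H = [None, 2, 9, 7, 6, 5, 8]
    heapsort(H, 6)
    print(H[1:])                            # [2, 5, 6, 7, 8, 9]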
The number of key comparisons C(n) needed for eliminating the root keys from the heaps of
diminishing sizes is bounded by C(n) ≤ 2⌊log₂(n − 1)⌋ + 2⌊log₂(n − 2)⌋ + . . . + 2⌊log₂ 1⌋ ≤
2(n − 1) log₂(n − 1) ≤ 2n log₂ n. This means that C(n) ∈ O(n log n) for the second stage of
heapsort. For both stages, we get O(n) + O(n log n) = O(n log n). The time efficiency of heapsort
is, in fact, in Θ(n log n) in both the worst and average cases.
Review Questions
The other type of technique that exploits space-for-time trade-offs simply uses extra space to
facilitate faster and/or more flexible access to the data. This approach is called prestructuring.
This name highlights two facets of this variation of the space-for-time trade-off: some processing
is done before a problem in question is actually solved but, unlike the input-enhancement variety,
it deals with access structuring.
There is one more algorithm design technique related to the space-for-time trade-off idea:
dynamic programming. This strategy is based on recording solutions to overlapping subproblems
of a given problem in a table from which a solution to the problem in question is then obtained.
Two final comments about the interplay between time and space in algorithm design need to be
made. First, the two resources—time and space—do not have to compete with each other in all
design situations. In fact, they can align to bring an algorithmic solution that minimizes both the
running time and the space consumed. Such a situation arises, in particular, when an algorithm
uses a space-efficient data structure to represent a problem’s input, which leads, in turn, to a
faster algorithm. Second, one cannot discuss space-time trade-offs without mentioning the
hugely important area of data compression. Note, however, that in data compression, size
reduction is the goal rather than a technique for solving another problem.
SORTING BY COUNTING
As a first example of applying the input-enhancement technique, we discuss its application to
the sorting problem. For each element of a list to be sorted, count the total number of elements
smaller than this element and record the results in a table. These numbers will indicate the
positions of the elements in the sorted list: e.g., if the count is 10 for some element, it should be
in the 11th position (with index 10, if counting starts with 0) in the sorted array. This algorithm
is called comparison-counting sort.
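A minimal Python sketch of comparison-counting sort as just described; the function name is an assumption, and the count array plays the role of the table mentioned above.

    def comparison_counting_sort(A):
        n = len(A)
        count = [0] * n                 # count[i] = number of elements smaller than A[i]
        for i in range(n - 1):
            for j in range(i + 1, n):
                if A[i] < A[j]:
                    count[j] += 1       # A[j] is larger, so it will come after A[i]
                else:
                    count[i] += 1
        S = [None] * n
        for i in range(n):
            S[count[i]] = A[i]          # the count is the element's final index
        return S

    print(comparison_counting_sort([62, 31, 84, 96, 19, 47]))   # [19, 31, 47, 62, 84, 96]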
The time efficiency of this algorithm should be quadratic because the algorithm considers all the
different pairs of an n-element array. More formally, the number of times its basic operation, the
comparison A[i] < A[j], is executed is equal to the sum encountered several times already:
C(n) = Σ_{i=0..n−2} Σ_{j=i+1..n−1} 1 = n(n − 1)/2.
Thus, the algorithm makes the same number of key comparisons as selection sort and, in addition,
uses a linear amount of extra space. On the positive side, the algorithm makes the minimum
number of key moves possible, placing each element directly in its final position in the sorted
array.
Review Questions
The input-enhancement idea: preprocess the pattern to get some information about it, store this
information in a table, and then use this information during an actual search for the pattern in a
given text. This is exactly the idea behind the two best-known algorithms of this type: the
Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm.
The principal difference between these two algorithms lies in the way they compare characters
of a pattern with their counterparts in a text: the Knuth- Morris-Pratt algorithm does it left to
right, whereas the Boyer-Moore algorithm does it right to left.
Consider, as an example, searching for the pattern BARBER in some text, with the pattern
positioned so that its last character is aligned against some character c of the text:
s0 . . . c . . . sn−1
B A R B E R
(the last R of the pattern stands against the text's character c).
Starting with the last R of the pattern and moving right to left, compare the corresponding pairs
of characters in the pattern and the text. If all the pattern’s characters match successfully, a
matching substring is found. Then the search can be either stopped altogether or continued if
another occurrence of the same pattern is desired. If a mismatch occurs, shift the pattern to the
right. Horspool's algorithm determines the size of such a shift by looking at the character c of
the text that is aligned against the last character of the pattern; this is the case even if character
c itself matches its counterpart in the pattern. In general, the following four possibilities can occur.
Case 1 If there are no c's in the pattern (e.g., c is letter S in the example), we can safely shift the
pattern by its entire length m; a smaller shift would align some character of the pattern against
the text's c, which is known not to be in the pattern.
Case 2 If there are occurrences of character c in the pattern but it is not the last one there—e.g.,
c is letter B in the example—the shift should align the rightmost occurrence of c in the pattern
with the c in the text:
Case 3 If c happens to be the last character in the pattern but there are no c’s among its other
m−1 characters—e.g., c is letter R in the example—the situation is similar to that of Case 1 and
the pattern should be shifted by the entire pattern’s length m:
Case 4 Finally, if c happens to be the last character in the pattern and there are other c's among
its first m−1 characters—e.g., c is letter R in the example—the situation is similar to that of Case
2, and the rightmost occurrence of c among the first m − 1 characters in the pattern should be
aligned with the text's c.
These examples demonstrate that right-to-left character comparisons can lead to farther shifts of
the pattern than the shifts by only one position always made by the brute-force algorithm.
However, if such an algorithm had to check all the characters of the pattern on every trial, it
would lose much of this superiority. Fortunately, the idea of input enhancement makes repetitive
comparisons unnecessary. Shift sizes can be precomputed and stored in a table.
For example, for the pattern BARBER, all the table’s entries will be equal to 6, except for the
entries for E, B, R, and A, which will be 1, 2, 3, and 4, respectively.
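One way to precompute this table, sketched minimally in Python; a dictionary stands in for a table indexed by the whole alphabet, with absent characters getting the default shift m at lookup time (Cases 1 and 3).

    def shift_table(pattern):
        m = len(pattern)
        table = {}
        for j in range(m - 1):
            # distance from the rightmost occurrence of pattern[j] among the
            # first m - 1 characters to the pattern's last character (Cases 2 and 4)
            table[pattern[j]] = m - 1 - j
        return table

    print(shift_table("BARBER"))   # {'B': 2, 'A': 4, 'R': 3, 'E': 1}; all other characters shift 6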
Review Questions
Horspool’s algorithm:
Step 1: For a given pattern of length m and the alphabet used in both the pattern and the text,
construct the shift table as described above.
Step 2: Align the pattern against the beginning of the text.
Step 3: Repeat the following until either a matching substring is found or the pattern reaches
beyond the last character of the text. Starting with the last character in the pattern, compare the
corresponding characters in the pattern and text until either all m characters are matched (then
stop) or a mismatching pair is encountered. In the latter case, retrieve entry t(c) from the c's
column of the shift table, where c is the text's character currently aligned against the last
character of the pattern, and shift the pattern by t(c) characters to the right along the text.
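Putting the steps together, a minimal Python sketch of the full search; shift_table is repeated from the sketch above so the example is self-contained, and the function returns the index of the leftmost matching substring or −1 for an unsuccessful search.

    def shift_table(pattern):
        m = len(pattern)
        return {pattern[j]: m - 1 - j for j in range(m - 1)}

    def horspool_matching(pattern, text):
        table = shift_table(pattern)                 # Step 1: construct the shift table
        m, n = len(pattern), len(text)
        i = m - 1                                    # Step 2: align the pattern at the start;
        while i <= n - 1:                            # i indexes the text character against
            k = 0                                    # the pattern's last character
            while k <= m - 1 and pattern[m - 1 - k] == text[i - k]:
                k += 1                               # Step 3: compare right to left
            if k == m:
                return i - m + 1                     # all m characters matched
            i += table.get(text[i], m)               # absent characters get the full shift m
        return -1

    print(horspool_matching("BARBER", "JIM_SAW_ME_IN_A_BARBERSHOP"))   # 16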
The worst-case efficiency of Horspool's algorithm is in O(nm). For random texts, however, it is
in Θ(n), and, although it is in the same average-case efficiency class as the brute-force algorithm,
Horspool's algorithm is obviously faster on average.
Review Questions