0% found this document useful (0 votes)
14 views50 pages

Lecture 6 - Algorithm trees II

Lecture 6 focuses on balanced trees, particularly AVL trees, which maintain a balance condition to ensure O(log n) search time. The lecture discusses the properties of AVL trees, their implementation, and the process of rotations to maintain balance during insertions. Additionally, it covers double rotations for specific imbalances and exercises to reinforce understanding of AVL tree operations.

Uploaded by

slimbunny006
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views50 pages

Lecture 6 - Algorithm trees II

Lecture 6 focuses on balanced trees, particularly AVL trees, which maintain a balance condition to ensure O(log n) search time. The lecture discusses the properties of AVL trees, their implementation, and the process of rotations to maintain balance during insertions. Additionally, it covers double rotations for specific imbalances and exercises to reinforce understanding of AVL tree operations.

Uploaded by

slimbunny006
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 50

Department of

Computer Science

COMP-370
Algorithms
Lecture 6
Trees II

Spring Semester
2022
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 1 1
[email protected]
Department of
Computer Science

Lecture Overview
• Balanced Trees

• AVL Trees
• Properties

• Maintaining Balance

• Data on Disk

• M-ary and B-trees

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 2 2
[email protected]
Department of
Computer Science

Previous Lecture: Binary Search Trees


• Search running is 𝑂(ℎ).

• But… height is at best log 2 𝑛.

• and… height is at worst 𝑛 − 1.

• This means to perform search on BST, best


case is 𝑂(log 2 𝑛) and worst case 𝑂(𝑛).

• Can we tighten bounds and guarantee search is 𝑂(log 2 𝑛)?


02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 3 3
[email protected]
Department of
Computer Science

Idea: Reduce height of a Tree


• Fill internal nodes:
Can chain looking trees become perfect trees?

• Increase number of children, why have 2 when you could


have 4, 6, 8, …
• Yes, but NOT by too much because then trees end up with large
child lists (same problem… different place).
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 4 4
[email protected]
Department of
Computer Science

Balanced Tree
• A tree where its leafs are ALL at the same height.

Not full but Full but not Full, balanced


balanced balanced and perfect

Remember (Lecture 5): Full is a tree where every internal node


has exactly two non-empty children.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 5 5
[email protected]
Department of
Computer Science

AVL Trees (Adelson‐Velskii and Landis)


• An AVL tree is a BST with a balance condition.

• Balance condition must be easily maintained and must


guarantee tree depth is O(logn).

• An AVL tree is identical to a BST, except that for every


node in the tree, the height of the left and right
subtrees can differ by at most 1.

• A full tree is also a AVL tree.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 6 6
[email protected]
Department of
Computer Science

Exercise
• Which one is an AVL tree an why?

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 7 7
[email protected]
Department of
Computer Science

Exercise
• Is the following an AVL tree?

Requiring balance at the root is not enough.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 8 8
[email protected]
Department of
Computer Science

The Height of an AVL Tree


• Suppose 𝑁(ℎ) is the minimum number of nodes of an
AVL tree with height ℎ.

• We know: 𝑁(0) = 1 and 𝑁(1) = 2.

• For ℎ ≥ 2 an AVL tree must:


• Have a root,

• One of the two subtrees must have height ℎ − 1.

• The height of the two subtrees can differ at most by 1.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 9 9
[email protected]
Department of
Computer Science

The Height of an AVL Tree


• Thus:
𝑁(ℎ) = 𝑁(ℎ − 1) + 𝑁(ℎ − 2) + 1

• This recursive relationship closely resembles Fibonacci


numbers, from which the bound claimed above on the
height of an AVL tree follows.

• So… the height of an AVL tree with n nodes is O(logn).

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 10 10
[email protected]
Department of
Computer Science

AVL Trees
• All operations are now guaranteed to be of O(logn).

• Yes… but insertion?


Inserting key = 6 would destroy
the balance condition at node
with key = 8.

-> left subtree has height 2 while


right is 0.

To preserve balance condition we


6 will be performing rotation.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 11 11
[email protected]
Department of
Computer Science

AVL Tree Implementation


• The coding of an AVL tree is similar to a BST (because AVLs
are also BSTs). But… we add 1 more field:
• The height of the tree from each node.

• The AVLNode:
• key
• height
• Left subtree pointer
• Right subtree pointer

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 12 12
[email protected]
Department of
Computer Science

Node Insertion
• The insertion of a node happens as in BSTs but
we keep trace of the path followed.

• Afterwards we must follow the reverse path to


update the height of the nodes with possible
new values.
• If AVL tree has become imbalanced, meaning there is a
node with subtrees that have a height difference >=1 then
we must rebalance the tree The process is called rotation.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 13 13
[email protected]
Department of
Computer Science

Node Insertion
• Insert the values 72, 26, 9 to an empty AVL tree:

To avoid violation,
ideally, we would like
to have this.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 14 14
[email protected]
Department of
Computer Science

Left Rotation
• Insertion into the left subtree of the left child.

• Before insertion: Subtrees R, S and T have same height h.

• After insertion: Suppose node is inserted to subtree R, meaning


its height becomes h+1.
Node added Left Rotation
and heights
updated

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 15 15
[email protected]
Department of
Computer Science

Left Rotation Process


• Left rotation of A and B trees means:
1. A.left = B.right
2. B.right = A
3. A.height = C.height + 1
4. B.height = C.height + 2

• Before the rotation, A was parent to B, afterwards, B becomes


parent to A.
• After left rotation, the tree remains an AVL tree with:
A.height = h + 1 (height of R)

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 16 16
[email protected]
Department of
Computer Science

Exercise
• Insert 6 to the following AVL tree and make sure
balance condition is preserved:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 17 17
[email protected]
Department of
Computer Science

Exercise
• The following tree is imbalanced. Find where the misbalancing
occurs and fix the problem.
A
Imbalanced
B ℎ𝑙𝑒𝑓𝑡 = 3
C
ℎ𝑟𝑖𝑔ℎ𝑡 = 1

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 18 18
[email protected]
Department of
Computer Science

Right Rotation (Symmetric to LR)


• Insertion into the right subtree of the right child.

• Before insertion: Subtrees R, S and T have same height h.

• After insertion: Suppose node is inserted to subtree T, meaning


its height becomes h+1.
Right Rotation

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 19 19
[email protected]
Department of
Computer Science

Right Rotation Process


• Right rotation of A and B trees means:
1. A.right = C.left
2. C.left = A
3. A.height = B.height + 1
4. C.height = B.height + 2

• Before the rotation, A was parent to C, afterwards, C becomes


parent to A.
• After right rotation, the tree remains an AVL tree with:
C.height = h + 1 (height of T)

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 20 20
[email protected]
Department of
Computer Science

Exercise
• Starting from an initial empty AVL tree, insert the items
3, 2, 1, 4, 5, 6, 7:

First issues comes


when we insert 3

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 21 21
[email protected]
Department of
Computer Science

Exercise
Next issue comes when we insert 5 with node 3 right height being 2 and left 0.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 22 22
[email protected]
Department of
Computer Science

Exercise
Next issue comes at the root when we insert 6.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 23 23
[email protected]
Department of
Computer Science

Exercise
Finally the insertion of 7 causes another issue:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 24 24
[email protected]
Department of
Computer Science

Double Rotations
• An insertion to right subtree of left child and an insertion
to left subtree of the right child require two rotations.

• These are the two cases where the insertion occurs on the
“inside” (i.e., left–right or right–left).
Still… imbalanced

Adding to Y cannot be
fixed with single rotation.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 25 25
[email protected]
Department of
Computer Science

Left-Right Double Rotation


• Insertion into the right subtree of the left child.

• Any insertion to the right subtree of 𝑘1 , meaning to


either B or C, will cause imbalance, which requires
double rotation to fix.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 26 26
[email protected]
Department of
Computer Science

Left-Right Double Rotation Process


• Left-Right rotation involving 𝑘1 , 𝑘2 , 𝑘3 means:
1. k1.right = k2.left

2. k3.left = k2.right

3. k2.left = k1

4. k2.right = k3

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 27 27
[email protected]
Department of
Computer Science

Right-Left Double Rotation


• Insertion into the left subtree of the right child.

• Any insertion to the left subtree of 𝑘2 , meaning to


either B or C, will cause imbalance, which requires
double rotation to fix.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 28 28
[email protected]
Department of
Computer Science

Right-Left Double Rotation Process


• Right-Left double rotation involving 𝑘1 , 𝑘2 , 𝑘3 means:
1. k3.left = k2.right

2. k1.right = k2.left

3. k2.left = k1

4. k2.right = k3

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 29 29
[email protected]
Department of
Computer Science

Exercise
• Insert in the following AVL tree the values 16, 15, 14,
13, 12, 11, 10 , 8 and 9:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 30 30
[email protected]
Department of
Computer Science

Exercise
• Inserting 15 causes a height imbalance at node 7:

Adding to left of the


right subtree.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 31 31
[email protected]
Department of
Computer Science

Exercise
• Inserting 14 causes a double rotation:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 32 32
[email protected]
Department of
Computer Science

Exercise
• Inserting 13 causes a imbalance at the root but single
rotation is enough:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 33 33
[email protected]
Department of
Computer Science

Exercise
• Inserting 12 causes imbalance but single rotation is
enough:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 34 34
[email protected]
Department of
Computer Science

Exercise
• To insert 11, a single rotation needs to be performed, and the
same is true for insertion of 10. We insert 8 without a rotation
creating an almost perfectly balanced tree:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 35 35
[email protected]
Department of
Computer Science

Data on Disk
• Up to now, we have assumed entire data (structure) can fit
in memory -> unrealistic.

• In reality we must store data on disk and only parts of the


dataset resides in memory.

• But… disk I/O magnitudes of scale slower than processor


(e.g., 120 disk I/O equal billions of ops by processor).
• So… Big-Oh analysis fails as not all operations are equal.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 36 36
[email protected]
Department of
Computer Science

Data on Disk
• Since disk I/Os are slow… we must reduce disk I/Os.

• Example… Florida citizen driving records – 10M records


• Driving records (256 bytes) with driving ID the key (32 bytes).
• 10M x 256 = 2.5GB (assume… data cannot fit in memory)
• 20 employees accessing system… meaning each employee
has roughly access to 1/20th of server resources.
• So… in 1 second we can execute billions of ops or perform 6
disk I/Os.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 37 37
[email protected]
Department of
Computer Science

Trees for Indexes


• If we consider BST… it may be unbalanced meaning worst
case linear search -> 10M disk accesses needed.

• If we consider AVL tree which guarantees height O(logN)


then logN = 25 disk access and thus 4s.
• So… reducing the tree height helps!
• Also.. If we have more branching, we have less height.
• Perfect binary tree with 31 nodes has height of 5 while a 5-ary
tree has height of 3… remember M-ary tree height 𝑂(log 𝑀 𝑁).

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 38 38
[email protected]
Department of
Computer Science

M-ary Trees
• For a binary tree we need 1 key to decide which of two
branches to follow.

• For a M-ary tree we need M-1 key to decide.

5-ary tree with 31 nodes

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 39 39
[email protected]
Department of
Computer Science

B-trees
• A B-tree of order M is an M-ary tree with these properties:
• The data items are stored at leaves.
• The non-leaf nodes store up to 𝑀 − 1 keys to guide the searching
and key 𝑖 represents the smallest key in subtree 𝑖 + 1.
• The root is either a leaf or has between two and M children.
• All non-leaf nodes (except the root) have between ⌈𝑀/2⌉ and 𝑀
children.
• All leaves are at the same depth and have between ⌈𝐿/2⌉ and 𝐿
data items.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 40 40
[email protected]
Department of
Computer Science

B-tree of order 5

• Non-leaf nodes have between 3 and 5 children and thus…


between 2 and 4 keys.
• The root has 3 children but it can possibly have two children.
• Here… L = 5 meaning each leaf has between 3 to 5 data items.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 41 41
[email protected]
Department of
Computer Science

B-tree
• Usually, each node represents one disk block that needs 1
disk I/O to be accessed.

• M and L are chosen based on items that can fit disk block.

• Suppose block is 8192 bytes and key (e.g., driver ID) is 32


bytes. For a M-ary B-tree we have M-1 keys, for a total of
(M-1)*32 bytes plus M*4 bytes for branch pointers.

• So the largest value of M for which 36*M - 32 fits in 8192 is


M = 228. Also, L = 8192/256 = 32 data records.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 42 42
[email protected]
Department of
Computer Science

B-tree Insertion
• Insert 57 to B-tree:

57

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 43 43
[email protected]
Department of
Computer Science

B-tree Insertion
• Insert 55 to B-tree

But… leaf is already full.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 44 44
[email protected]
Department of
Computer Science

B-tree Leaf Splitting


• When a leaf is at L+1 then it is split into 2 leafs both
guaranteed to have the min number of data records.

• The parent must also be updated and the mid of the


previous leaf is used as key.

• So… two disk accesses are required to write these leafs


and 1 more for updating parent.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 45 45
[email protected]
Department of
Computer Science

B-tree Insertion
• Insert 55 to B-tree:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 46 46
[email protected]
Department of
Computer Science

B-tree Insertion
• Insert 40 to B-tree:

But… non-leaf is also


already full.

Leaf is already full.

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 47 47
[email protected]
Department of
Computer Science

B-tree Non-Leaf Splitting


• When a parent is split, we must update the values of the
keys and also the parent’s parent.
• So… this costs 5 disk I/Os.

• When a non-leaf node is split, its parent always gains a


child.
• What if the parent already has reached its limit of children?
• In that case, we continue splitting nodes up the tree until either
we find a parent that does not need to be split or we reach the
root.
02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 48 48
[email protected]
Department of
Computer Science

B-tree Insertion
• Insert 40 to B-tree:

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 49 49
[email protected]
Department of
Computer Science

COMP-370
Algorithms

Questions?

02/03/2022
Demetris Trihinas
Lecture 6 | Trees II 50 50
[email protected]

You might also like