
L-2005-08-Advance Data Structure Part 1-HS

The document discusses different data structures for storing and retrieving data efficiently, specifically hash tables. It covers the basic concepts of hash tables including hash functions and different strategies for resolving collisions. The key benefits of hash tables are constant-time retrieval of data on average through hashing the key, regardless of the total number of elements.

Uploaded by

22004788
WIA2005: Algorithm Design and Analysis


Attendance Password: zp6e8x
Student: K1
Lecturer: Dr. Hazrina Sofian (HS)
In computer programming, create/insert, read, update, and delete (CRUD), plus search/select, are the basic operations.

How can a data structure help reduce the time complexity of these operations?

A hash table allows high-speed retrieval of data no matter how much data there is. Concept: if we know the index of an element, we can access it in O(1) time, independent of the size of the array and of the element's position in the array.

Thus, hash tables are widely used in database indexing, caching, program compilation, error checking, etc.

But how can you know which index contains the value that you are looking for? Each index is calculated from the value of the element itself.
https://youtu.be/KyUTuwz_b7Q

Lecture 8: Advanced Data Structure (Part 1)

Learning Objectives
At the end of this lecture, you will know:

PART 1
A. Types of tables in data structure
   1. Direct access table
   2. Hash table
B. Hash Functions
C. Four collision strategies
A: Types of tables in data structure

Introduction

Data structure in Computer Science

Searching for your name in a stack of papers is like searching from array[0] to array[25]: you need to check the entries one by one.

Even if the names are arranged in alphabetical order, would it be easy for everyone in this class to search for their own name?
1. Direct Address Tables
Search for an element in a direct address table

• A direct address table is applicable when we can afford to allocate one array slot to each key.
• It gives the ability to find a position in an array in O(1).
• Use array indices A to Z instead of 0 to 25.
• A to Z is the key (e.g. John is stored under J, Kathy under K).

What if more than one key maps to the same array index? Example: John and Julia.

Collision!
1. Direct Address Tables

• Direct addressing is a simple technique that works well when the universe U of
keys is reasonably small.

• Suppose that an application needs a dynamic set in which each element has a
key drawn from the universe U = {0, 1, 2,.., m-1}, where m is not too large.

• We shall assume that no two elements have the same key.


1. Direct Address Tables
Illustration

To represent the dynamic set, we can use a direct-address table, denoted by T[0..m-1], in which each slot corresponds to a key in the universe U.

Each key in the universe U = {0,1,..,9} corresponds to an index in the table. The set K = {2,3,5,8} of actual keys determines the slots in the table that contain pointers to elements. The other slots, heavily shaded, contain NIL.

Each operation on the table takes O(1) time.
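
The illustration above can be sketched in a few lines of Python. This is only a minimal sketch: the class name `DirectAddressTable` and its method names are hypothetical, and Python's `None` is assumed to stand in for NIL.

```python
class DirectAddressTable:
    """Direct-address table over the key universe U = {0, 1, ..., m-1}."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * m

    def insert(self, key, element):
        self.slots[key] = element   # O(1): the key is the index

    def search(self, key):
        return self.slots[key]      # O(1)

    def delete(self, key):
        self.slots[key] = None      # O(1)


# U = {0, ..., 9}, actual keys K = {2, 3, 5, 8}
table = DirectAddressTable(10)
for k in (2, 3, 5, 8):
    table.insert(k, f"element-{k}")
```

Only 4 of the 10 slots end up holding an element; the other 6 stay NIL.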
1. Direct Address Tables
Disadvantages

• If the universe U is large, storing a table T of size |U| may be impractical, or even impossible, given the memory available on a typical computer.
• The set K of keys actually stored may be so small relative to U that most of the space allocated for T would be wasted.

[Figure: a table in which only the slots for John and Kathy are used; the space allocated for the rest of T would be wasted.]

How can we reduce the range of the array that needs to be handled?

Hash Tables
With direct addressing, an element with key k is stored in slot k.

With hashing, this element is stored in slot h(k); that is, we use a hash function h to compute the slot from the key k.
2. Hash Tables
A hash table is a generalization of the simpler notion of an ordinary array.

• The aim of a hash table is to reduce the range of array indices that need to be handled.
• Instead of |U| values, we need to handle only m values.
• A hash function h maps the universe U of keys into the slots of a hash table T[0..m-1]:
h : U -> {0,1,..,m-1}
  where the size m of the hash table is typically much less than |U|.

T(n) = O(n) – Worst case

T(n) = O(1) – Average case

What if more than one key hashes to the same slot? Example: k2 and k5.

Collision!
B: Hash Functions

What makes a good hash function?

A good hash function satisfies (approximately) the assumption of uniform hashing: each key is equally likely to hash to any of the m slots.

This comes closest to having no collisions, which is the best case.

Unfortunately, it is not possible to guarantee that all keys will be distributed evenly across the slots.
Hash Functions
Example

There are 2 hash functions to be learned in this course:

1. Division Method
2. Multiplication Method

• Most hash functions assume that the universe of keys is the set of natural numbers.

• Thus, if the keys are not natural numbers, we find a way to interpret them as natural numbers. For example, for alphanumeric keys, divide the sum of the ASCII codes in the key by the number of available addresses, m, and take the remainder.
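
The alphanumeric example above might look as follows in Python. This is a sketch under assumptions: the function name `string_to_slot` is hypothetical, and `ord` is assumed to return the ASCII code, which holds for plain ASCII characters.

```python
def string_to_slot(key, m):
    """Interpret a string key as a natural number, then reduce it to a slot."""
    # Sum of the character codes (equal to ASCII codes for ASCII text).
    total = sum(ord(ch) for ch in key)
    # Remainder after dividing by the number of available addresses m.
    return total % m


# "John" -> 74 + 111 + 104 + 110 = 399, and 399 mod 26 = 9
slot = string_to_slot("John", 26)
```

Note that a sum of codes ignores character order, so anagrams such as "ab" and "ba" always collide.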
Hash Functions
Division Method

In the division method for creating hash functions, we map a key k into one of m slots by taking the remainder of k divided by m.
h(k) = k mod m

For example, if the hash table has size m = 12 and the key is k = 100, then h(k) = 4.

Slot:  0    1    2    3    4
Key:                       100
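
As a quick check of the arithmetic, the division method is a one-liner in Python. The function name `h_division` is hypothetical.

```python
def h_division(k, m):
    """Division method: the remainder of k divided by m."""
    return k % m


slot = h_division(100, 12)   # 100 = 8 * 12 + 4, so the slot is 4
```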
Hash Functions
Multiplication Method

The multiplication method for creating hash functions operates in two steps.
Step 1: Multiply the key k by a constant A, where 0 < A < 1, and extract the fractional part of kA.
Step 2: Multiply this value by m, and take the floor of the result.

In short, the hash function is:

h(k) = ⌊m (kA mod 1)⌋
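
The two steps can be sketched in Python as below. The slides do not fix a value for A; the default used here, (√5 − 1)/2 ≈ 0.618, is an assumption (a commonly suggested choice), and the function name `h_multiplication` is hypothetical.

```python
import math


def h_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * (k*A mod 1))."""
    # Step 1: multiply the key by A and keep only the fractional part.
    fractional = (k * A) % 1.0
    # Step 2: scale by the table size m and take the floor.
    return math.floor(m * fractional)


slot = h_multiplication(123456, 16384)
```

Because the fractional part always lies in [0, 1), the result is always a valid slot index between 0 and m-1. Floating-point arithmetic is assumed to be precise enough for keys of this size.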

C: Four collision strategies

There are 4 strategies to resolve collisions in hash tables.
At this stage, we observe that both kinds of table are prone to collisions.

[Figure: a direct-address table (where John and Julia collide) next to a hash table.]

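Hash Tables
Resolving collision by chaining

- Each time a collision occurs, the element is added to the beginning or the end of the linked list for that slot.

[Figure: a hash table with an h(k) column and a Keys column; the slots hold chains of the keys a to i, with / marking the end of a list.]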
Hash Tables
Resolving collision by chaining

[Figure: two operations on the chained table, each labelled O(1).]

How long does it take to search for an element with a given key?
Calculate the load factor.
Hash Tables
Analysis CHAIN HASH SEARCH

Worst-case: all n keys hash to the same slot, creating a list of length n. The worst-case time for searching is thus O(n) plus the time to compute the hash function.

Best-case: No collision at all.

Average-case performance of hashing depends on how well the hash function h distributes the set of keys to be stored among the m slots, on the average.
Load factor is the average number of elements stored in a chain.

Load factor α = n / m, where n is the number of elements and m is the number of slots in the hash table.
Hash Tables
Practice Question: By using the chaining method, insert the keys 6, 28, 19, 16, 20, 33, 12, 17, 10 into a hash table. The table has 7 slots and the hash function is h(k) = k mod 7.

Step 1: Calculate h(k) to place the key k using the given hash function. h(k) is like the address of the key.

k mod 7    h(k)
6 % 7      6
28 % 7     0
19 % 7     5
16 % 7     2
20 % 7     6
33 % 7     5
12 % 7     5
17 % 7     3
10 % 7     3

Step 2: Each time a collision occurs, the element is added to the beginning or the end of the list.

h(k)    Keys
0       28
1
2       16
3       17 10
4
5       19 33 12
6       6 20

Disadvantage of chaining: if too many keys collide, one array index becomes a long list, and searching becomes slow again.
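
The practice question can be reproduced with a short Python sketch. It rests on assumptions: Python lists stand in for the linked lists, new keys are added to the end of a chain, and the function names are hypothetical.

```python
def build_chained_table(keys, m):
    """Insert keys into an m-slot table, resolving collisions by chaining."""
    table = [[] for _ in range(m)]      # one (initially empty) chain per slot
    for k in keys:
        table[k % m].append(k)          # h(k) = k mod m; add to end of chain
    return table


def chained_search(table, k):
    """Search only the chain that k hashes to."""
    return k in table[k % len(table)]


keys = [6, 28, 19, 16, 20, 33, 12, 17, 10]
table = build_chained_table(keys, 7)
# table == [[28], [], [16], [17, 10], [], [19, 33, 12], [6, 20]]
load_factor = len(keys) / 7             # n / m = 9 / 7
```

The search cost depends on the length of one chain, which is why the load factor n / m matters for the average case.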
Hash Tables
Practice Question: What is the load factor after inserting the keys 6, 28, 19, 15, 20, 33, 12, 17, 10 into a hash table? The table has 7 slots and the hash function is h(k) = k mod 7.

Load factor is the average number of elements stored in a chain.

Load factor α = n / m, where n is the number of elements and m is the number of slots in the hash table.

Load factor α = 9 / 7 ≈ 1.29
Concept of Open addressing
In open addressing, no elements are stored outside the table.
That is, each array index contains ONE element or NIL.

Many applications require only the operations INSERT, SEARCH, and DELETE.

When deleting a key, store the special value DELETED in its slot instead of NIL.
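
Why DELETED rather than NIL? A rough Python sketch, assuming the simplest probe order (start at k mod m and move to the next slot, wrapping around) and hypothetical function names: a search stops at the first NIL it meets, so overwriting a deleted key with NIL could hide keys that were placed after it.

```python
NIL = None
DELETED = object()   # special marker, distinct from every key


def probe(k, i, m):
    # Assumed probe order: h(k), h(k)+1, h(k)+2, ... wrapping around.
    return (k % m + i) % m


def insert(table, k):
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is NIL or table[j] is DELETED:   # both count as free
            table[j] = k
            return j
    raise OverflowError("hash table is full")


def search(table, k):
    m = len(table)
    for i in range(m):
        j = probe(k, i, m)
        if table[j] is NIL:      # a truly empty slot ends the search
            return None
        if table[j] == k:        # DELETED never equals a key, so we move on
            return j
    return None


def delete(table, k):
    j = search(table, k)
    if j is not None:
        table[j] = DELETED       # not NIL, so later probes keep going


table = [NIL] * 8
insert(table, 4)      # goes to slot 4
insert(table, 12)     # slot 4 is taken, so it goes to slot 5
delete(table, 4)
found = search(table, 12)
```

Had `delete` stored NIL in slot 4, the search for 12 would have stopped there and wrongly reported that 12 is absent.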

Concept of Probing
• To perform insertion using open addressing, we successively examine, or probe, the hash table until we find an empty slot in which to put the key.
• Instead of being fixed in the order 0, 1,.., m-1 (which requires O(n) search time), the sequence of positions probed depends upon the key being inserted.
• To determine which slots to probe, we extend the hash function to include the probe number (starting from 0) as a second input:
h: U x {0, 1,.., m-1} -> {0, 1,.., m-1}
With open addressing, we require that for every key k, the probe sequence
<h(k, 0), h(k, 1),.., h(k,m-1)>
be a permutation of <0, 1,.., m-1>, so that every hash-table position is eventually considered as a slot for a new key as the table fills up.

Concept of Probing
• We will examine three commonly used techniques to compute the probe
sequences required for open addressing:
• Linear probing.
• Quadratic probing.
• Double hashing.
Open addressing
Resolving collision by open addressing with linear probing
PRACTICE QUESTION
Let the table have 8 slots, and let the hash function be h(k) = k mod 8.
11, 4, 5, 12, 25, 18
Insert each key above, from left to right, into the hash table below using linear probing to resolve collisions.

Step 1: Calculate h(k) to place the key k using the given hash function. The first probe slot is T[h'(k)]. h(k) is like the address of the key.
Step 2: If that slot is full, probe slot T[h'(k)+1]. Still full? Probe slot T[h'(k)+2]. Still full? Probe slot T[h'(k)+3], and so on, wrapping around to slot 0 after the last slot.

11 % 8 = 3
4 % 8 = 4
5 % 8 = 5
12 % 8 = 4
25 % 8 = 1
18 % 8 = 2

After inserting 11:
Slot:  0   1   2   3   4   5   6   7
Key:               11
Result after all insertions: the keys 11, 4, 5, 25 and 18 go directly into slots 3, 4, 5, 1 and 2. Key 12 hashes to slot 4 (Collision!), probes slot 5 (also full), and is placed in slot 6.

Slot:  0   1   2   3   4   5   6   7
Key:       25  18  11  4   5   12
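
The linear probing exercise can be checked with a short Python sketch. The function name `linear_probe_insert` is hypothetical, and `None` is assumed to mark an empty slot.

```python
def linear_probe_insert(table, k):
    """Insert k using linear probing: try h(k), h(k)+1, ... modulo m."""
    m = len(table)
    for i in range(m):
        j = (k % m + i) % m        # i-th probe, wrapping around the table
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("hash table is full")


table = [None] * 8
for key in (11, 4, 5, 12, 25, 18):
    linear_probe_insert(table, key)
# table == [None, 25, 18, 11, 4, 5, 12, None]
```

Only key 12 needs extra probes here: slots 4 and 5 are already taken, so it lands in slot 6.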
Open addressing
Resolving collision by open addressing with quadratic probing

1. Set counter i = 0.
2. Get the hash value, h(k, i) = (k + i²) mod m, where m is the number of slots.
3. If hash table[h(k, i)] is empty, insert the key. DONE!
   Else we must find an available slot:
   3.1 Increment i by 1.
   3.2 Compute a new hash value, h(k, i) = (k + i²) mod m.
   3.3 Repeat step 3 until i equals the number of slots in the hash table.

Advantage of quadratic probing:

https://youtu.be/tfXPEgYDQgI
https://www.youtube.com/watch?v=BoZbu1cR0no
Open addressing
Resolving collision by open addressing with quadratic probing

PRACTICE QUESTION
Let the table have 7 slots, and let the hash function be h(k, i) = (k + i²) mod 7.
76, 40, 48, 5, 20
Insert each key above, from left to right, into the hash table below using quadratic probing to resolve collisions.

Step 1: Calculate h(k) to place the key k using the given hash function. h(k) is like the address of the key.

76 % 7 = 6
40 % 7 = 5
48 % 7 = 6
5 % 7 = 5
20 % 7 = 6

After inserting 76 and 40:
Slot:  0   1   2   3   4   5   6
Key:                       40  76

https://www.youtube.com/watch?v=BoZbu1cR0no
Inserting 48: Collision!
i = 0: h(48, 0) = (48 + 0²) % 7 = 6 (occupied by 76)
i = 1: h(48, 1) = (48 + 1²) % 7 = 49 % 7 = 0 (empty, so 48 goes in slot 0)

Slot:  0   1   2   3   4   5   6
Key:   48                  40  76
Inserting 5: Collision! Collision!
i = 0: h(5, 0) = (5 + 0²) % 7 = 5 (occupied by 40)
i = 1: h(5, 1) = (5 + 1²) % 7 = 6 % 7 = 6 (occupied by 76)
i = 2: h(5, 2) = (5 + 2²) % 7 = 9 % 7 = 2 (empty, so 5 goes in slot 2)

Slot:  0   1   2   3   4   5   6
Key:   48      5           40  76
Inserting 20: Collision! Collision!
i = 0: h(20, 0) = 20 % 7 = 6 (occupied by 76)
i = 1: h(20, 1) = (20 + 1²) % 7 = 21 % 7 = 0 (occupied by 48)
i = 2: h(20, 2) = (20 + 2²) % 7 = 24 % 7 = 3 (empty, so 20 goes in slot 3)

Since 20 % 7 = 6, the same slots result from (6 + 1²) % 7 = 0 and (6 + 2²) % 7 = 10 % 7 = 3.

Final table:
Slot:  0   1   2   3   4   5   6
Key:   48      5   20      40  76
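
The quadratic probing steps can be sketched in Python as follows. The function name `quadratic_probe_insert` is hypothetical, and `None` is assumed to mark an empty slot; the probe formula is the one used in the exercise, h(k, i) = (k + i²) mod m.

```python
def quadratic_probe_insert(table, k):
    """Insert k using the probe sequence h(k, i) = (k + i*i) mod m."""
    m = len(table)
    for i in range(m):             # give up once i reaches the slot count
        j = (k + i * i) % m
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("no free slot found within m probes")


table = [None] * 7
for key in (76, 40, 48, 5, 20):
    quadratic_probe_insert(table, key)
# table == [48, None, 5, 20, None, 40, 76]
```

Unlike linear probing, this sequence is not guaranteed to visit every slot, so the insert can fail even when the table still has free slots.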
Open addressing
Resolving collision by open addressing with double hashing

• Double hashing offers one of the best methods available for open
addressing because the permutations produced have many of the
characteristics of randomly chosen permutations.
• Double hashing uses a hash function of the form
h(k,i) = (h1(k) + i h2(k)) mod m
where both h1 and h2 are auxiliary hash functions.

https://www.youtube.com/watch?v=BoZbu1cR0no
Open addressing
Resolving collision by open addressing with double hashing

PRACTICE QUESTION
Let the table have 7 slots, and let the first hash function be h1(k) = k mod 7.
19, 26, 13, 47, 17
Insert each key above, from left to right, into the hash table below using double hashing, with second hash function h2(k) = 5 - (k mod 5), to resolve collisions.

Step 1: Calculate h1(k) to place the key k using the given hash function. h1(k) is like the address of the key.

19 % 7 = 5
26 % 7 = 5
13 % 7 = 6
47 % 7 = 5
17 % 7 = 3

After inserting 19:
Slot:  0   1   2   3   4   5   6
Key:                       19
Inserting 26: Collision!
i = 0: 26 % 7 = 5 (occupied by 19)
i = 1: h2(26) = 5 - (26 mod 5) = 5 - 1 = 4, so the next probe is (5 + 1 × 4) mod 7 = 9 mod 7 = 2 (empty, so 26 goes in slot 2)

Slot:  0   1   2   3   4   5   6
Key:           26          19
Inserting 13: 13 % 7 = 6. Slot 6 is empty, so there is no collision.

Slot:  0   1   2   3   4   5   6
Key:           26          19  13
Inserting 47: Collision!
i = 0: 47 % 7 = 5 (occupied by 19)
i = 1: h2(47) = 5 - (47 mod 5) = 5 - 2 = 3, so the next probe is (5 + 1 × 3) mod 7 = 8 mod 7 = 1 (empty, so 47 goes in slot 1)

Slot:  0   1   2   3   4   5   6
Key:       47  26          19  13
Inserting 17: 17 % 7 = 3. Slot 3 is empty, so there is no collision.

Final table:
Slot:  0   1   2   3   4   5   6
Key:       47  26  17      19  13
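
The double hashing exercise can be reproduced with the sketch below. It assumes the second hash function is h2(k) = 5 - (k mod 5), as used in the worked calculations, combined through h(k, i) = (h1(k) + i·h2(k)) mod m. The function name `double_hash_insert` is hypothetical.

```python
def double_hash_insert(table, k):
    """Insert k using h(k, i) = (h1(k) + i * h2(k)) mod m."""
    m = len(table)
    h1 = k % m                 # first hash: k mod 7 in the exercise
    h2 = 5 - (k % 5)           # second hash (assumed form); always 1..5
    for i in range(m):
        j = (h1 + i * h2) % m
        if table[j] is None:
            table[j] = k
            return j
    raise OverflowError("no free slot found within m probes")


table = [None] * 7
for key in (19, 26, 13, 47, 17):
    double_hash_insert(table, key)
# table == [None, 47, 26, 17, None, 19, 13]
```

Keys 19, 26 and 47 all start at slot 5, but their step sizes differ (4 for 26, 3 for 47), so they do not follow the same probe path.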
Open addressing
Analysis of open addressing
• With open addressing, at most one element occupies each slot, and thus n ≤ m, which
implies α ≤ 1.
• We assume that we are using uniform hashing.
• In this idealized scheme, the probe sequence <h(k, 0), h(k, 1),.., h(k,m-1)> used to insert or
search for each key k is equally likely to be any permutation of <0, 1,.., m-1>.

Example:
Slot:  0   1   2   3   4   5   6
Key:       47  26  17      19  13

n = number of elements = 5
m = number of slots = 7
Load factor α = n / m = 5 / 7 ≈ 0.71
Any questions?