Lecture 22-Hashing - II

The document discusses different techniques for resolving collisions in hashing including quadratic probing, chaining and rehashing. It also covers topics like secondary clustering in quadratic probing and designing good hash functions.


FAST - National University of Computer and Emerging Sciences
CFD Campus
Data Structures
Hashing-II

Muhammad Usman Joyia

Lecture No. 22
Quadratic Probing

• Resolves a hash collision with the rehashing formula (HashValue ± I^2) % array_size, where I is the number of times the rehash function has been applied
• It distributes keys over a wide range of the hash table, so quadratic probing reduces clustering

Quadratic Probing

f(i) = i^2 (less likely to encounter primary clustering)

• Probe sequence:
0th probe = h(k) mod TableSize
1st probe = (h(k) + 1) mod TableSize
2nd probe = (h(k) + 4) mod TableSize
3rd probe = (h(k) + 9) mod TableSize
...
ith probe = (h(k) + i^2) mod TableSize
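The probe sequence above can be sketched in Python as follows. This is a minimal illustration rather than part of the original slides; the function name `quadratic_probes` and its parameters are assumptions chosen for the example.

```python
def quadratic_probes(home, table_size, max_probes):
    """Return the first max_probes slots visited by quadratic probing.

    home is h(k); the i-th probe is (home + i*i) mod table_size.
    """
    return [(home + i * i) % table_size for i in range(max_probes)]


# Probes for a key whose home slot is 3 in a table of size 11.
print(quadratic_probes(3, 11, 5))  # [3, 4, 7, 1, 8]
```

The offsets 0, 1, 4, 9, 16 grow quickly, which is why consecutive probes land far apart instead of filling neighbouring slots.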

Quadratic Probing (cont’d)
• Example: Load the keys 23, 13, 21, 14, 7, 8, and 15, in this order, into a hash table of size 7 using quadratic probing with c(i) = ±i^2 and the hash function h(key) = key % 7
• The required probe sequences are given by:
h_i(key) = (h(key) ± i^2) % 7, i = 0, 1, 2, 3

Quadratic Probing (cont’d)
h_i(key) = (h(key) ± i^2) % 7, i = 0, 1, 2, 3

h_0(23) = 23 % 7 = 2
h_0(13) = 13 % 7 = 6
h_0(21) = 21 % 7 = 0
h_0(14) = 14 % 7 = 0 collision
h_1(14) = (0 + 1^2) % 7 = 1
h_0(7) = 7 % 7 = 0 collision
h_1(7) = (0 + 1^2) % 7 = 1 collision
h_{-1}(7) = (0 - 1^2) % 7 = -1
NORMALIZE: (-1 + 7) % 7 = 6 collision
h_2(7) = (0 + 2^2) % 7 = 4
h_0(8) = 8 % 7 = 1 collision
h_1(8) = (1 + 1^2) % 7 = 2 collision
h_{-1}(8) = (1 - 1^2) % 7 = 0 collision
h_2(8) = (1 + 2^2) % 7 = 5
h_0(15) = 15 % 7 = 1 collision
h_1(15) = (1 + 1^2) % 7 = 2 collision
h_{-1}(15) = (1 - 1^2) % 7 = 0 collision
h_2(15) = (1 + 2^2) % 7 = 5 collision
h_{-2}(15) = (1 - 2^2) % 7 = -3
NORMALIZE: (-3 + 7) % 7 = 4 collision
h_3(15) = (1 + 3^2) % 7 = 3

Resulting table (O = occupied):
Index  Status  Key
0      O       21
1      O       14
2      O       23
3      O       15
4      O       7
5      O       8
6      O       13
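The worked example can be replayed with a short Python sketch. The function name `insert_quadratic` is hypothetical, and the probe order (h, h+1, h-1, h+4, h-4, ...) and the limit of i up to half the table size are assumptions read off the example, not a definitive implementation.

```python
def insert_quadratic(table, key):
    """Insert key using probes h, h+1, h-1, h+4, h-4, ... (mod table size).

    Returns the index used, or None if no free slot was found.
    """
    size = len(table)
    home = key % size
    offsets = [0]
    for i in range(1, size // 2 + 1):
        offsets += [i * i, -(i * i)]
    for offset in offsets:
        # Python's % already returns a non-negative result for a positive
        # modulus, so the NORMALIZE step needed in C or C++ is implicit here.
        index = (home + offset) % size
        if table[index] is None:
            table[index] = key
            return index
    return None


table = [None] * 7
for key in [23, 13, 21, 14, 7, 8, 15]:
    insert_quadratic(table, key)
print(table)  # [21, 14, 23, 15, 7, 8, 13]
```

In C or C++ the expression -1 % 7 evaluates to -1, which is why the slides add the table size back before using the result as an index.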
Secondary Clusters
• Quadratic probing is better than linear probing because it eliminates primary
clustering.
• However, it may result in secondary clustering: if h(k1) = h(k2) the probing
sequences for k1 and k2 are exactly the same. This sequence of locations is called a
secondary cluster.
• Secondary clustering is less harmful than primary clustering because secondary
clusters do not combine to form large clusters.
• Example of secondary clustering: Suppose keys k0, k1, k2, k3, and k4 are inserted in the given order into an originally empty hash table using quadratic probing with c(i) = i^2. Assuming that each of the keys hashes to the same array index x, a secondary cluster will develop and grow in size.

• Quadratic probing reduces clustering, but it does not necessarily examine every slot.
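Both points can be checked with a small Python sketch. The helper `probe_slots` and the sample keys 10 and 17 are assumptions made up for this illustration; the hash function is taken to be h(k) = k % 7 with c(i) = i^2.

```python
def probe_slots(home, table_size):
    """Slots visited by (home + i*i) % table_size for i = 0 .. table_size - 1."""
    return [(home + i * i) % table_size for i in range(table_size)]


# Two different keys with the same home slot follow the same probe sequence.
k1, k2, size = 10, 17, 7            # both hash to 3 with h(k) = k % 7
assert probe_slots(k1 % size, size) == probe_slots(k2 % size, size)

# Only some slots are ever reached from a given home slot.
print(sorted(set(probe_slots(3, size))))  # [0, 3, 4, 5]
```

Here only 4 of the 7 slots are reachable from home slot 3, so an insertion can fail even though the table still has empty slots.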
Task
• Insert the keys 2, 12, 14, 18, 20, 24, 32, 22, 144, 55, 66, 45, 49 into a hash table of size 13

Overflow
• Hash table may get full
– No more insertions possible

• Hash table may get too full
– Insertions, deletions, and searches take longer

• Solution: Rehash
– Build another table that is twice as big and has a new hash function
– Move all elements from the smaller table to the bigger table

• Cost of rehashing = O(N)
– But it happens only when the table is close to full
– Close to full = table is X percent full, where X is a tunable parameter
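The rehashing steps above might look like the following Python sketch. It assumes linear probing for collisions, a new hash function of key % new_size, and a threshold of 70 percent for X; the names `insert`, `rehash`, `insert_with_rehash`, and `max_load` are hypothetical.

```python
def insert(table, key):
    """Insert key with linear probing; assumes the table has a free slot."""
    index = key % len(table)
    while table[index] is not None:
        index = (index + 1) % len(table)
    table[index] = key


def rehash(old_table):
    """Build a table twice as big and reinsert every key: O(N) work."""
    new_table = [None] * (2 * len(old_table))
    for key in old_table:
        if key is not None:
            insert(new_table, key)   # the new size changes key % size
    return new_table


def insert_with_rehash(table, key, max_load=0.7):
    """Insert key, rehashing first if the load factor would exceed max_load."""
    used = sum(slot is not None for slot in table)
    if (used + 1) / len(table) > max_load:
        table = rehash(table)
    insert(table, key)
    return table


table = [None] * 7
for key in [13, 15, 24, 6, 23]:
    table = insert_with_rehash(table, key)
print(len(table))  # 14
```

The fifth insertion would push the load to 5/7, above the assumed threshold, so the table is rebuilt at size 14 before 23 is placed.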
Rehashing Example
[Figure: three hash tables labeled "Original Hash Table", "After Inserting 23", and "After Rehashing"]
Note
• In number theory, two integers a and b are said to be relatively prime, mutually prime, or coprime (also spelled co-prime) if the only positive integer that evenly divides both of them is 1. That is, the only common positive factor of the two numbers is 1. This is equivalent to their greatest common divisor being 1.
e.g., 14 and 15
e.g., 2 and 9
e.g., 3 and 8
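The definition can be checked directly with the greatest common divisor. The sketch below uses Python's standard `math.gcd`; the helper name `are_coprime` is an assumption for illustration.

```python
from math import gcd


def are_coprime(a, b):
    """Two integers are coprime when their greatest common divisor is 1."""
    return gcd(a, b) == 1


print(are_coprime(14, 15))  # True
print(are_coprime(2, 9))    # True
print(are_coprime(3, 8))    # True
print(are_coprime(14, 21))  # False, both are divisible by 7
```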

Bucket

• When the bucket becomes full, we must again deal with the problem of handling collisions
Chain

• A linked list of elements that share the same hash location
• Use the hash value not as the actual location of the element, but rather as the index into an array of pointers
• Each pointer accesses a chain of elements that share the same hash location
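A minimal Python sketch of chaining follows. It assumes integer keys with h(key) = key % size, and it uses a Python list in place of the linked list described above; the class name `ChainedHashTable` and its methods are hypothetical.

```python
class ChainedHashTable:
    """Hash table with separate chaining; each slot holds a list of keys."""

    def __init__(self, size):
        # A Python list stands in for the linked list described on the slide.
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return key % len(self.slots)

    def insert(self, key):
        chain = self.slots[self._index(key)]
        if key not in chain:
            chain.append(key)

    def contains(self, key):
        return key in self.slots[self._index(key)]


table = ChainedHashTable(7)
for key in [21, 14, 7, 8, 15]:
    table.insert(key)
print(table.slots[0])  # [21, 14, 7]
print(table.slots[1])  # [8, 15]
```

Keys that collide simply extend the chain at their slot, so no probing is needed and the table never becomes full in the open-addressing sense.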
Comparison – Linear Probing & Chaining
Designing a good Hash Function

• A good hash function minimizes collisions

• One solution
– Use a data structure that has more space for keys

• Another solution
– Design the hash function to minimize collisions
– Produce unique hash values as far as possible
Some common hash functions

• Division Method
– The most common hash functions use the division method (%) to generate the hash value
– Key % TableSize
• What if the key is a string? (See Rabin-Karp)
• Folding
– A hash method that breaks the key into several pieces and concatenates or exclusive-ORs some of the pieces to form the hash value
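The three ideas above can be sketched in Python. All function names are hypothetical, and several details are assumptions: the string hash is a polynomial hash of the kind used in Rabin-Karp with an arbitrarily chosen base of 31, and the folding hash splits the decimal digits into two-digit pieces and XORs them.

```python
def division_hash(key, table_size):
    """Division method: key % table_size."""
    return key % table_size


def string_hash(text, table_size, base=31):
    """Polynomial hash of a string, reduced with the division method."""
    value = 0
    for ch in text:
        value = (value * base + ord(ch)) % table_size
    return value


def folding_hash(key, table_size, piece_len=2):
    """Folding: split the decimal digits into pieces and XOR the pieces."""
    digits = str(key)
    value = 0
    for start in range(0, len(digits), piece_len):
        value ^= int(digits[start:start + piece_len])
    return value % table_size


print(division_hash(144, 13))      # 1
print(folding_hash(123456, 13))    # 12 ^ 34 ^ 56 = 22, and 22 % 13 = 9
```

Reducing the string hash modulo the table size at every step keeps the intermediate value small; the final result is the same as reducing once at the end.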
