HASHING
Hashing
• Mathematical concept
– Maps any number into a set of numbers in a
given interval
– Cuts a number down to a part of itself
– Used in discrete maths, e.g. graph theory, set
theory
– Used in searching techniques
– Used in encryption methods
Hash Functions and Hash Tables
• Hashing has 2 major components
– Hash function h
– Hash table data structure of size N
• A hash function h maps keys (a key is the identifying element of
a record) to a hash value, or hash key, which refers to a
specific location in the hash table
• Example:
h(x) = x mod N
is a hash function for integer keys
• The integer h(x) is called the hash value of key x
• A hash table data structure is an array, or array-type
ADT, of some fixed size, containing the keys.
• It is an array in which records are not stored
consecutively - their place of storage is
calculated using the key and a hash function
Key → hash function → array index
• Hashed key: the result of applying a hash function to a
key
• Keys and entries are scattered throughout the array
• Combines the main advantages of both arrays and trees
• Hashing depends mainly on two
factors / parts
(a) Hash Function (b) Collision Resolution
• Table size is also a (minor) factor in hashing; table indices run from
0 to tablesize-1.
Table Size
• Hash table size
– Should be appropriate for the hash function used
– Too big will waste memory; too small will
increase collisions and may eventually force
rehashing (copying into a larger table)
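The trade-off above is often expressed through the load factor (stored items divided by table size). The following is a minimal sketch and is not taken from the slides: the list-of-buckets layout, the function names `load_factor` and `rehash`, and the `key % size` hash are all assumptions made for illustration.

```python
def load_factor(num_items, table_size):
    """Fraction of the table in use: stored items / table size."""
    return num_items / table_size


def rehash(buckets, new_size):
    """Copy every key into a new list of buckets of the given size.

    Assumed layout: each bucket is a list of integer keys.
    """
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key in bucket:
            # Each key is hashed again because its index depends on the table size.
            new_buckets[key % new_size].append(key)
    return new_buckets


# A table of size 5 in which four keys all collide at index 1.
small = [[], [1, 6, 11, 16], [], [], []]
large = rehash(small, 11)
print(load_factor(4, 5), load_factor(4, 11))
print(large)
```

In this small case the four colliding keys end up in four different buckets of the larger table, at the cost of more unused slots.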
Example
• We design a hash table for a dictionary storing items
(SSN, Name), where SSN (social security number) is a
nine-digit positive integer
• The actual data is not stored in the hash table
• The table pinpoints the location of the actual data or set of data
• Our hash table uses an array of size N = 10,000 and
the hash function
h(x) = last four digits of x

Index   Entry
0       (empty)
1       025-612-0001
2       981-101-0002
3       (empty)
4       451-229-0004
…       …
9997    (empty)
9998    200-751-9998
9999    (empty)
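The example above can be sketched in a few lines. This is only an illustration under stated assumptions: keys are kept as strings so that leading zeros survive, the table stores the key itself where a real table would hold a reference to the record, and the names `h` and `table` are chosen here rather than taken from the slides.

```python
N = 10_000  # table size from the example


def h(ssn: str) -> int:
    """Hash a key written like '451-229-0004' to its last four digits."""
    digits = ssn.replace("-", "")
    return int(digits) % N


# The four keys shown in the slide's figure.
keys = ["451-229-0004", "981-101-0002", "200-751-9998", "025-612-0001"]

table = [None] * N
for key in keys:
    table[h(key)] = key  # no collisions occur among these four keys

for index in (0, 1, 2, 3, 4, 9998):
    print(index, table[index])
```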
Hash Function
• The mapping of keys into the table is called the hash
function
• A hash function:
– Ideally, should distribute keys and entries evenly
throughout the table
– Should be easy and quick to compute
– Should minimize collisions, where the position
given by the hash function is already occupied
– Should be applicable to all objects
• Different types of hash functions are used for the
mapping of keys into tables.
(a) Division Method
(b) Mid-square Method
(c) Folding Method
1. Division Method
• Choose a number m larger than the number n of keys
in K, the set of keys.
• The number m is usually chosen to be a prime number.
• The hash function H is defined as
H(k) = k (mod m) or H(k) = k (mod m) + 1
• Here k (mod m) denotes the remainder when k is divided by m
• The 2nd formula is used when the range is from 1 to m.
• Example:
Elements are: 3205, 7148, 2345
Table addresses: 0 – 99
m = 97 (the largest prime below 100)
H(3205) = 4, H(7148) = 67, H(2345) = 17
• For the 2nd formula, add 1 to the remainders: 5, 68, 18.
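The division method is short enough to check directly. A minimal sketch, where the function names are assumptions made for illustration:

```python
M = 97  # prime modulus used in the example


def division_hash(k: int, m: int = M) -> int:
    """First formula: addresses run from 0 to m - 1."""
    return k % m


def division_hash_from_one(k: int, m: int = M) -> int:
    """Second formula: addresses run from 1 to m."""
    return k % m + 1


keys = [3205, 7148, 2345]
print([division_hash(k) for k in keys])           # [4, 67, 17]
print([division_hash_from_one(k) for k in keys])  # [5, 68, 18]
```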
2. Folding Method
• The key k is partitioned into a number of parts k1, k2, …, kn
• The parts are then added together, ignoring the
last carry.
• One can also reverse one of the parts before
adding (the left or the right part; mostly the right)
H(k) = k1 + k2 + … + kn
• Example:
H(3205) = 32 + 05 = 37, or with the right part reversed, 32 + 50 = 82
H(7148) = 71 + 48 = 119 → 19, or 71 + 84 = 155 → 55
H(2345) = 23 + 45 = 68, or 23 + 54 = 77
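A minimal sketch of folding with two-digit parts follows. The function name, the `reverse_even` flag (reverse every second part) and the left-padding of short keys are assumptions made for illustration, not details given in the slides.

```python
def fold_hash(k: int, part_len: int = 2, reverse_even: bool = False) -> int:
    """Split the decimal digits of k into parts, add them, drop the last carry."""
    digits = str(k)
    # Pad on the left so the digits divide evenly into parts (an assumption).
    if len(digits) % part_len:
        digits = digits.zfill(len(digits) + part_len - len(digits) % part_len)
    parts = [digits[i:i + part_len] for i in range(0, len(digits), part_len)]

    total = 0
    for position, part in enumerate(parts, start=1):
        if reverse_even and position % 2 == 0:
            part = part[::-1]  # reverse the 2nd, 4th, ... part
        total += int(part)
    # Keeping only part_len digits is what "ignoring the last carry" means.
    return total % 10 ** part_len


for key in (3205, 7148, 2345):
    print(key, fold_hash(key), fold_hash(key, reverse_even=True))
# 3205 37 82
# 7148 19 55
# 2345 68 77
```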
3. Mid-Square Method
• The key k is squared. Then the hash function H is
defined as
H(k) = l
• Here l is obtained by deleting digits from both
ends of k².
• The same digit positions must be used for all the keys.
• Example:
k:     3205      7148      2345
k²:    10272025  51093904  5499025
H(k):  72        93        99
• The 4th and 5th digits, counted from the
right, have been selected.
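The digit selection in this example can be written with integer arithmetic. A minimal sketch, assuming the same positions as the example (4th and 5th digits from the right); the function name is chosen here for illustration.

```python
def mid_square_hash(k: int) -> int:
    """Square k and keep the 4th and 5th digits counted from the right."""
    squared = k * k
    # Drop the three rightmost digits, then keep the next two.
    return (squared // 1000) % 100


for key in (3205, 7148, 2345):
    print(key, key * key, mid_square_hash(key))
# 3205 10272025 72
# 7148 51093904 93
# 2345 5499025 99
```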
Collision Resolution Strategies
• If two keys map to the same hash table index, then we
have a collision.
• As the number of elements in the table increases, the
likelihood of a collision increases, so make the table
as large as practical
• Collisions may still happen, so we need a collision
resolution strategy
• Two approaches are used to resolve collisions.
(a) Separate chaining: chain together several keys/entries
in each position.
(b) Open addressing: store the key/entry in a different
position.
• Probing: If the table position given by the hashed
key is already occupied, increase the position by
some amount, until an empty position is found
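Separate chaining, approach (a), can be sketched as follows. This is an illustration only: the class name, the default table size of 7 and the `key % size` hash for integer keys are assumptions, and each chain is a plain Python list rather than a linked list.

```python
class ChainedHashTable:
    """Hash table in which each position holds a chain of (key, value) pairs."""

    def __init__(self, size: int = 7):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        return key % self.size

    def put(self, key: int, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))  # new key: extend the chain

    def get(self, key: int):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None  # key not in the table


table = ChainedHashTable()
table.put(3, "a")
table.put(10, "b")  # 10 % 7 == 3, so this collides with key 3
print(table.buckets[3])
print(table.get(10), table.get(17))
```

Both colliding entries live in the same chain, so a lookup only has to scan that one chain.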
Open Addressing
• Types of open addressing are
1. Linear Probing
2. Quadratic Probing
3. Double Hashing.
1. Linear Probing
• Locations are checked from the hash location h = H(k) to the
end of the table, and the element is placed in the first
empty slot
• If the bottom of the table is reached, checking “wraps
around” to the start of the table. The modulus operation is used for
this purpose
• Thus, if linear probing is used, the search and insert routines must
continue down the table until a match or an empty location
is found
• Linear probing is guaranteed to find a slot for the
insertion if there is still an empty slot in the table.
• A prime table size does not by itself make
the size appropriate; the size should also be at
least 30% larger than the maximum number of elements
ever to be stored in the table.
• If the load factor is greater than 50% - 70% then the
time to search or to add a record will increase.
• Probe sequence: h, h+1, h+2, h+3, …, h+i (each taken mod the table size)
• However, linear probing also tends to promote
clustering within the table.
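A minimal sketch of linear probing with wrap-around follows. The function names, the table size of 7, the use of `None` for an empty slot and the `key % n` hash are assumptions made for illustration; deletion is left out.

```python
def linear_probe_insert(table, key):
    """Place key in the first free slot at or after its hash location."""
    n = len(table)
    home = key % n
    for i in range(n):
        slot = (home + i) % n  # the modulus wraps around to the start
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise RuntimeError("table is full")


def linear_probe_search(table, key):
    """Return the slot holding key, or None if an empty slot is reached first."""
    n = len(table)
    home = key % n
    for i in range(n):
        slot = (home + i) % n
        if table[slot] is None:
            return None
        if table[slot] == key:
            return slot
    return None


table = [None] * 7
slots = [linear_probe_insert(table, k) for k in (10, 17, 24, 3, 13)]
print(slots)  # [3, 4, 5, 6, 0]
print(table)
```

The first four keys all hash to slot 3 and pile up in slots 3 to 6, which is the clustering effect; key 13 hashes to the occupied slot 6 and wraps around to slot 0.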
2. Quadratic Probing
• Quadratic probing is a solution to the clustering
problem
– Linear probing adds 1, 2, 3, etc. to the original
hashed key
– Quadratic probing adds 1², 2², 3², etc. to the original
hashed key
• However, whereas linear probing guarantees that all
empty positions will be examined if necessary,
quadratic probing does not
• If the table size is prime, this will try approximately
half the table slots.
• More generally, with quadratic probing, insertion may
be impossible if the table is more than half full.
• Probe sequence: h, h+1, h+4, h+9, …, h+i² (each taken mod the table size)
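The claim about a prime table size can be checked by listing the distinct slots a quadratic probe sequence reaches. A minimal sketch, assuming the sequence (h + i²) mod n and a table size of 13 chosen for illustration; the function name is hypothetical.

```python
def quadratic_probe_slots(home: int, n: int) -> list:
    """Distinct slots visited by (home + i*i) % n for i = 0 .. n-1, in order."""
    seen = []
    for i in range(n):
        slot = (home + i * i) % n
        if slot not in seen:
            seen.append(slot)
    return seen


slots = quadratic_probe_slots(0, 13)
print(slots)       # [0, 1, 4, 9, 3, 12, 10]
print(len(slots))  # 7 of the 13 slots
```

Only 7 of the 13 slots are ever probed from a given start, so an insertion can fail even though the table still has free positions.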
3. Double Hashing
• A 2nd hash function H’ is used to resolve the collision.
• Suppose H(k) = h and H’(k) = h’ ≠ m
• Then we search the locations with addresses
h, h+h’, h+2h’, h+3h’, … (each taken mod m)
• If m is prime, then this sequence accesses all the
locations.
Double Hashing
• Double hashing uses a secondary hash function d(k) and handles
collisions by placing an item in the first available
cell of the series
(h(k) + j·d(k)) mod N
for j = 0, 1, … , N - 1
• The secondary hash function d(k) cannot
have zero values
• The table size N must be a prime to allow probing
of all the cells
• Common choice of compression map for the
secondary hash function:
d(k) = q - (k mod q)
where
– q < N
– q is a prime
• The possible values for d(k) are
1, 2, … , q
Example of Double Hashing
• Consider a hash table storing integer keys that handles
collisions with double hashing
– N = 13
– h(k) = k mod 13
– d(k) = k mod 7 (nonzero for every key used here)
• Insert keys 18, 41, 22, 44, 59, 32, 31, 73, in this order

k    h(k)  d(k)  Probes
18   5     4     5
41   2     6     2
22   9     1     9
44   5     2     5 7
59   7     3     7 10
32   6     4     6
31   5     3     5 8
73   8     3     8 11

Resulting table:
Index:  0  1  2   3  4  5   6   7   8   9   10  11  12
Key:    -  -  41  -  -  18  32  44  31  22  59  73  -
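The insertion sequence can be traced with a short script. This is a sketch under the definitions stated in the example (N = 13, h(k) = k mod 13, d(k) = k mod 7); the function name `insert` and the explicit zero-step check are assumptions added here, since d(k) = k mod 7 would be zero for a multiple of 7.

```python
N = 13


def h(k: int) -> int:
    return k % N


def d(k: int) -> int:
    return k % 7  # as stated in the example; zero for multiples of 7


def insert(table, k):
    """Insert k with double hashing and return the list of slots probed."""
    step = d(k)
    if step == 0:
        raise ValueError("secondary hash must not be zero")
    probes = []
    for j in range(N):
        slot = (h(k) + j * step) % N
        probes.append(slot)
        if table[slot] is None:
            table[slot] = k
            return probes
    raise RuntimeError("no free slot found")


table = [None] * N
trace = {k: insert(table, k) for k in (18, 41, 22, 44, 59, 32, 31, 73)}
for k, probes in trace.items():
    print(k, h(k), d(k), probes)
print(table)
```

Keys 44, 59, 31 and 73 each need a second probe because their home slot is already taken; the other four keys land in their home slot directly.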
Applications of Hashing
• Compilers use hash tables to keep track of declared
variables
• A hash table can be used for on-line spelling checkers
— if misspelling detection (rather than correction) is
important, an entire dictionary can be hashed and
words checked in constant time
• Game playing programs use hash tables to store seen
positions, thereby saving computation time if the
position is encountered again
• Hash functions can be used to quickly check for
inequality — if two elements hash to different values
they must be different
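The inequality check in the last point can be sketched with a standard digest. The choice of SHA-256 from Python's `hashlib` and the helper names are assumptions made for illustration; any deterministic hash function gives the same one-way guarantee.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex digest of the data (SHA-256 is an arbitrary choice here)."""
    return hashlib.sha256(data).hexdigest()


def definitely_different(digest_a: str, digest_b: str) -> bool:
    """Different digests prove the inputs differ; equal digests prove nothing."""
    return digest_a != digest_b


a = digest(b"hash table")
b = digest(b"hash tables")
print(definitely_different(a, b))  # True: the inputs cannot be equal
print(definitely_different(a, a))  # False: a full comparison is still needed
```

The test only works in one direction: two different elements can share a hash value, so equal hashes call for a direct comparison of the elements.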
