Digital Search Structures
• Insertion
• Searching
• Deletion
Insertion
• Example: Delete 1001 from a DST in which the node holding 1001 has only one child, a right child. The node is removed by simply replacing the deleted node 1001 with its child node 1011.
• Example: Consider the DST shown and delete 1001.
Note: Here 1001 has two children.
• If the bit is zero, the search moves to the left subtree; otherwise it
moves to the right subtree.
• Once an element node is reached, the key in this node is compared with
the key we are searching for. This is the only key comparison that takes
place.
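The bit-guided descent described above can be sketched in Python. This is a minimal illustration, assuming a binary trie with separate branch and element nodes and keys given as equal-length strings of '0' and '1'. The names Branch, Element and trie_search, and the hand-built example tree, are hypothetical and not taken from the slides.

```python
class Branch:
    """Branch node: holds only two links and no key."""
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


class Element:
    """Element node: holds one key, a string of '0' and '1' characters."""
    def __init__(self, key):
        self.key = key


def trie_search(root, key):
    """Follow one bit of key per level, then compare keys exactly once."""
    node, depth = root, 0
    while isinstance(node, Branch):
        if depth >= len(key):
            return False          # ran out of bits before reaching an element
        # Bit 0 goes left, bit 1 goes right.
        node = node.left if key[depth] == "0" else node.right
        depth += 1
    # The single key comparison; node is None when we fell off the trie.
    return node is not None and node.key == key


# Hand-built trie holding 0010, 1000 and 1001 (hypothetical example data).
root = Branch(
    left=Element("0010"),
    right=Branch(
        left=Branch(left=Branch(left=Element("1000"), right=Element("1001"))),
    ),
)
```

For instance, trie_search(root, "1001") follows the bits 1, 0, 0, 1 and succeeds, while trie_search(root, "0000") reaches the element 0010 after one bit and fails on the final comparison.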
Operations on Patricia:
1) Search
2) Insert
3) Delete
Searching Patricia
• To search for a key in Patricia, we start at the root and
proceed down the tree, using the BitNumber in each node to
tell us which bit to examine in the search key.
• We proceed left if the bit is 0 and right if it is 1.
• The keys in the nodes are not examined at all on the way
down the tree.
• Eventually, when an element pointer is encountered, i.e., a pointer
to a node whose BitNumber is not greater than that of the current
node, the key in the node it points to is compared with the search key.
• Thus, if the key at the node pointed to by this element pointer
is equal to the search key, then the search is successful;
otherwise, it is unsuccessful.
• Example: Search for 1001 in the Patricia
Algorithm for Searching Patricia
Patricia* patricia_search(Patricia *t, key)
Patricia *p, *y;
Step 1: IF t == NULL
            RETURN NULL
        [END of IF]
Step 2: SET y = t->left_child;
        SET p = t;
Step 3: WHILE y->bit_number > p->bit_number
            SET p = y;
            IF key[y->bit_number] == 0
                SET y = y->left_child;
            ELSE
                SET y = y->right_child;
            [END of IF]
        [END of WHILE]
Step 4: RETURN y
Inserting into Patricia
• Let key be the element to be inserted. First, search for key in the Patricia.
Let q be the node at which the patricia_search algorithm terminates. Find the
first bit position j at which the key in q and key differ. The BitNumber of the
new node is j. It remains to decide the position of the new node: it is placed
between two nodes A and B, which are obtained as follows.
• Initially A points to the header node and B points to its left child. As long as
the condition A->BitNumber < B->BitNumber < j holds, move A and B down the
Patricia: set A to B, then update B based on the bit of key at the position
given by that node's BitNumber. If the bit is 0, B becomes its left child;
otherwise B becomes its right child.
• Once the condition fails, the positions of A and B are fixed and the new node is
placed between A and B. If B is the left child of A, the new node is inserted
as the left child of A; otherwise it is inserted as the right child of A.
• If the jth bit of key is 0, the left link of the new node points to the new node
itself and its right link points to node B. Otherwise, the right link of the new
node points to itself and its left link points to node B.
Example 7: Construct a Patricia by inserting elements in the order
1000, 0010, 1001, 1100, 0000 and 0001
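The search and insertion steps for Patricia can be sketched in Python as follows. This is an illustrative sketch, not a definitive implementation. It assumes keys are equal-length bit strings, bits are numbered from 1 starting at the leftmost position, and the header node has BitNumber 0 and uses only its left link. The names PatriciaNode, bit and contains are hypothetical helpers introduced for the example.

```python
class PatriciaNode:
    def __init__(self, key, bit_number):
        self.key = key
        self.bit_number = bit_number
        self.left = None
        self.right = None


def bit(key, i):
    """Return bit i of key, counting from 1 at the leftmost position."""
    return int(key[i - 1])


def patricia_search(t, key):
    """Return the node where the search ends; the caller compares keys."""
    if t is None:
        return None
    p, y = t, t.left
    while y.bit_number > p.bit_number:      # still following downward links
        p = y
        y = y.right if bit(key, y.bit_number) else y.left
    return y


def patricia_insert(t, key):
    """Insert key and return the header node."""
    if t is None:                           # empty Patricia: create the header
        header = PatriciaNode(key, 0)
        header.left = header
        return header
    q = patricia_search(t, key)
    if q.key == key:
        return t                            # key already present
    j = 1                                   # first bit where q.key and key differ
    while bit(key, j) == bit(q.key, j):
        j += 1
    a, b = t, t.left                        # find A and B around the new node
    while a.bit_number < b.bit_number < j:
        a = b
        b = b.right if bit(key, b.bit_number) else b.left
    node = PatriciaNode(key, j)
    if bit(key, j):                         # bit j is 1: right link is the self-pointer
        node.left, node.right = b, node
    else:                                   # bit j is 0: left link is the self-pointer
        node.left, node.right = node, b
    if b is a.left:
        a.left = node
    else:
        a.right = node
    return t


def contains(t, key):
    found = patricia_search(t, key)
    return found is not None and found.key == key


root = None
for k in ["1000", "0010", "1001", "1100", "0000", "0001"]:
    root = patricia_insert(root, k)
```

After the six insertions, contains(root, k) holds for every inserted key, and a key such as 0111 is rejected by the single comparison at the end of the search.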
Delete from Patricia
• Example: Delete 0010 from the Patricia.
[Figure: Patricia after deleting 0010]
• Example: Delete 1100 from the Patricia.
[Figure: Patricia after deleting 1100]
MULTI-WAY TRIES
• A multi-way trie (or simply trie) is a tree data structure used to store strings of
varying length.
• The word ‘TRIE’ is extracted from the word ‘RETRIEVAL’.
• A trie is used for efficient retrieval of the data, i.e., for performing efficient
search on the data.
• A trie is a tree of degree m ≥ 2 in which the branching at any level of the tree
is determined not by the entire key value, but by only a portion of it.
• The trie consists of two types of nodes, i.e., element nodes and branch nodes.
• The element node has only a data field which consists of the key which is
being stored in the trie.
• The branch node consists of the pointers to other sub-trees which may again
contain pointers to other sub-trees or pointers to element nodes.
• The elements or keys are stored in the leaf nodes.
• The main advantage of the trie data structure is that strings with a common
prefix share the nodes for that prefix, so only their differing tails are
stored separately.
• Example: Consider a trie which stores English words of different
lengths. In this trie, each branch node contains 27 pointers: 26
pointers, one for each letter of the English alphabet, and an extra
pointer for the blank character that is used to terminate the keys.
• To access a key in the trie, we need to move down in a series of
branch nodes following the appropriate branches based on the
alphabetical characters forming the key.
• All pointer fields that point to neither a branch node nor an element node
are set to NULL.
• Thus the depth of an information node (element node) depends on
how many leading characters (prefix) its key shares with the other
keys.
• Operations:
• Searching
• Insertion
• Deletion
• Example: Consider a set of records consisting of the names, AadharID, date of joining,
and department name of the employees of an organization. Construct a trie using
AadharID as the key field.
• Solution:
• Let us consider radix 10, so that each branch node has 10 pointers,
one for each digit from 0 to 9. Examine the digits of the key AadharID from left to right. Using the
first digit of AadharID, partition the records into three groups: the first group, whose
AadharID begins with 5 (i.e., Rajani, Arun, and Sushrut), the second group, whose
AadharID begins with 2 (i.e., Nirmal and Anshul), and the third group, whose AadharID begins
with 9 (i.e., Ram). Groups having more than one element are partitioned further with the
help of the next digit in the key. This process of partitioning is continued until
every group has exactly one element in it.
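The partitioning procedure can be sketched in Python. This is an illustration under stated assumptions: the 4-digit IDs below are hypothetical stand-ins, since the actual AadharID values are not given here; keys are assumed distinct and of equal length. A branch node is modelled as a dictionary from digit to subtrie rather than an array of 10 pointers, and build_trie is a hypothetical name.

```python
def build_trie(records, depth=0):
    """Partition records on the digit at position depth until each group has one record.

    records is a list of (key, name) pairs. The result is either a single
    (key, name) pair (an element node) or a dict mapping a digit to a
    subtrie (a branch node).
    """
    if len(records) == 1:
        return records[0]                     # group of one: element node
    groups = {}
    for key, name in records:
        groups.setdefault(key[depth], []).append((key, name))
    # Each group is partitioned further using the next digit.
    return {digit: build_trie(group, depth + 1) for digit, group in groups.items()}


# Hypothetical IDs chosen only to reproduce the grouping described above.
employees = [
    ("5120", "Rajani"),
    ("5384", "Arun"),
    ("5127", "Sushrut"),
    ("2046", "Nirmal"),
    ("2719", "Anshul"),
    ("9001", "Ram"),
]
trie = build_trie(employees)
```

With these made-up IDs, the root has three non-empty branches (5, 2 and 9). Ram is reached after one digit, while Rajani and Sushrut are separated only at the fourth digit because their IDs share the prefix 512.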
Searching a Trie
• To search for an element in the trie, we start at the root node,
which is a branch node.
• Let us suppose the key k is made up of the characters k1, k2, k3, ..., kn.
• The first character of the key, k1, is extracted and the corresponding
child pointer in the root node is identified.
• If it points to an element node, that node's value is compared with the key;
if it points to a branch node, the pointer for the next character, k2, is
considered.
• If this pointer points to an element node, the key is compared with
the element node's value; if it points to a branch node, the
pointer for the next character, k3, is considered.
• The process is repeated until we reach an element node, whose value is
compared with the key we are searching for, or a NULL pointer, in which
case the search is unsuccessful.
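The search procedure, together with a simple insertion to build a trie to search in, can be sketched in Python. This is a minimal sketch assuming lowercase English keys and the 27-pointer branch node described earlier (26 letters plus a blank that terminates keys). The names Branch, Element, slot_of, char_at, trie_insert and trie_search are hypothetical, and the insertion shown is just one straightforward way to build the structure.

```python
BLANK = " "
ALPHABET = "abcdefghijklmnopqrstuvwxyz" + BLANK   # 27 symbols


class Branch:
    """Branch node: 27 child pointers, initially all None (NULL)."""
    def __init__(self):
        self.child = [None] * len(ALPHABET)


class Element:
    """Element node: stores one key."""
    def __init__(self, key):
        self.key = key


def slot_of(ch):
    return ALPHABET.index(ch)


def char_at(key, i):
    """Character i of key (0-based); blank once the key is exhausted."""
    return key[i] if i < len(key) else BLANK


def trie_insert(root, key):
    node, depth = root, 0
    while True:
        slot = slot_of(char_at(key, depth))
        nxt = node.child[slot]
        if nxt is None:                        # free slot: store the key here
            node.child[slot] = Element(key)
            return
        if isinstance(nxt, Element):
            if nxt.key == key:
                return                         # key already present
            # Replace the element by a branch and push the element one level down.
            branch = Branch()
            branch.child[slot_of(char_at(nxt.key, depth + 1))] = nxt
            node.child[slot] = branch
            nxt = branch
        node, depth = nxt, depth + 1


def trie_search(root, key):
    node, depth = root, 0
    while isinstance(node, Branch):            # follow k1, k2, ... through branch nodes
        node = node.child[slot_of(char_at(key, depth))]
        depth += 1
    # Either a NULL pointer (unsuccessful) or an element node to compare with.
    return node is not None and node.key == key


root = Branch()
for word in ["bat", "bad", "ba", "cat"]:
    trie_insert(root, word)
```

Here cat is found after examining one character, whereas bat, bad and ba share the prefix ba and are separated only at the third position, where ba uses the blank slot.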
Sampling Strategies
• F1: $sample(key, i) = key_i$
• [Branching at level i is based on the i-th character of the key.]
• F2: $sample(key, i) = key_{n-i+1}$, where n is the number of characters in key.
• [Branching at level i is based on the (n-i+1)-th character of the key.]
• F3: $sample(key, i) = key_{r(key,i)}$, where r(key, i) is a randomization function.
• [Branching at level i is based on the randomization function. The
randomization function yields a value from 1 to n, where n is the
number of characters in key.]
• F4: $sample(key, i) = key_{i/2}$ if i is even, and $key_{n-(i-1)/2}$ if i is odd.
• [Branching at level i is based on the (i/2)-th character if i is even
and on the (n-(i-1)/2)-th character if i is odd.]
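The deterministic sampling functions can be written out directly in Python. This is a small sketch assuming levels and character positions are numbered from 1, as in the definitions above. F3 is left out because the randomization function r is not specified. The function names are hypothetical.

```python
def sample_f1(key, i):
    """F1: the i-th character of key (1-based)."""
    return key[i - 1]


def sample_f2(key, i):
    """F2: the (n-i+1)-th character, i.e. characters taken from the right end."""
    n = len(key)
    return key[(n - i + 1) - 1]


def sample_f4(key, i):
    """F4: alternate between the right end (odd i) and the left end (even i)."""
    n = len(key)
    if i % 2 == 0:
        return key[i // 2 - 1]
    return key[(n - (i - 1) // 2) - 1]


key = "abcde"
levels = range(1, len(key) + 1)
f1_order = "".join(sample_f1(key, i) for i in levels)
f2_order = "".join(sample_f2(key, i) for i in levels)
f4_order = "".join(sample_f4(key, i) for i in levels)
```

For the key abcde, F1 examines the characters in the order a, b, c, d, e; F2 in the order e, d, c, b, a; and F4 in the order e, a, d, b, c.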
[Figure: Trie obtained by applying sampling function F1]
[Figure: Trie obtained by applying sampling function F2]
[Figure: Trie obtained by applying sampling function F3]