Data Structures : B-tree
Jibrael Jos : Sep 2009
Agenda
  Introduction
  Multiway Trees
  B Tree
  Application
  Structure
  Algo : Insert / Delete
  Avoid Taking Printout : Use RTF Outline in case needed
Data Structures
  AVL Trees
  Red Black
  B-tree
  Hashing / Indexing Techniques
  Graphs
Path Has to be enjoyed
  Walking
  Walking in Rain !!
  Certification
  Effort ~ Satisfaction
Research
  Shoulders of Giants
  Research on an area to reach a level of expertise
  Mindmap and Research Path
B Tree
Methodology
  One Book to Another
  One Link to Another
Binary Search Tree
  What happens if data is loaded in a binary search tree in this order?
    23, 32, 45, 11, 43, 41
    1, 2, 3, 4, 5, 6, 7, 8
  What is an AVL tree?
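The question on this slide can be tried out directly. The sketch below is a plain, unbalanced binary search tree (the class and function names are illustrative, not from the deck) that loads both key sequences and reports how many levels each tree ends up with.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into an unbalanced BST and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def height(root):
    """Number of levels in the tree (0 for an empty tree)."""
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

mixed = build([23, 32, 45, 11, 43, 41])
sorted_run = build([1, 2, 3, 4, 5, 6, 7, 8])
print(height(mixed), height(sorted_run))  # 5 8
```

The sorted sequence degenerates into a chain with one level per key, which is the problem balanced trees such as AVL trees are meant to avoid.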
Multiway Trees
  Node with keys K1 and K2 has three subtrees:
    keys < K1
    keys >= K1 and < K2
    keys >= K2
m-way trees
  Reduce the depth of the tree to O(log_m n) with m-way trees
  m children, m-1 keys per node
  m = 10 : 10^6 keys in 6 levels vs 20 for a binary tree
  but ........
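The 6-versus-20 figure can be checked with a few lines of arithmetic. This is a rough sketch that treats the number of levels as the smallest h with fanout^h >= n; the function name is made up for illustration.

```python
def levels_needed(n_keys, fanout):
    """Smallest number of levels h such that fanout**h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= fanout
        levels += 1
    return levels

print(levels_needed(10**6, 10))  # 6
print(levels_needed(10**6, 2))   # 20
```

Integer multiplication is used instead of a floating-point logarithm so the result is exact at powers of the fanout.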
m-way trees
  But you have to search through the keys in each node (up to m-1 of them)!
  Reduces your gain from having fewer levels!
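Since the keys inside a node are kept sorted, the per-node search does not have to be linear. A minimal sketch, assuming a node's keys are held in a sorted Python list and that subtree i holds the keys between key i-1 and key i (the helper name `child_index` is hypothetical):

```python
from bisect import bisect_right

def child_index(keys, target):
    """Binary-search one node.

    Returns (found, i): found is True when target is one of the node's keys;
    otherwise i is the index of the subtree to descend into.
    """
    i = bisect_right(keys, target)
    found = i > 0 and keys[i - 1] == target
    return found, i

print(child_index([10, 20, 30], 25))  # (False, 2)
print(child_index([10, 20, 30], 20))  # (True, 2)
```

This brings the in-node cost down to about log2(m) comparisons, so the total number of comparisons stays close to that of a binary tree while the number of nodes visited drops.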
m-way trees
Anand B
B-trees
  All leaves are on the same level
  All nodes except for the root and the leaves have
    at least m/2 children (rounded up)
    at most m children
  Each node is at least half full of keys
BTREE
Disk
  1 track = 5000 chars
  1 cylinder = 20 tracks
  1 disk unit = 200 cylinders
Time Taken
  Seek Time
  Latency Time
  Transmission Time
  Overcoming Latency Time ??
  72.5 + 0.05n millisec to read n chars
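The read-time formula shows why large blocks pay off: the constant term is paid once per request, however many characters follow. The sketch below just evaluates the formula; reading the constant 72.5 as the combined seek and latency cost is an assumption, and the names are illustrative.

```python
FIXED_OVERHEAD_MS = 72.5   # constant term from the slide (assumed: seek + latency)
MS_PER_CHAR = 0.05         # per-character transmission cost from the slide

def read_time_ms(n_chars):
    """Time in milliseconds to read n_chars in one request."""
    return FIXED_OVERHEAD_MS + MS_PER_CHAR * n_chars

for n in (100, 1000, 5000):            # 5000 chars = one full track
    total = read_time_ms(n)
    print(n, round(total, 2), round(total / n, 4))
```

Reading a full track costs roughly 322.5 ms, only about four times the cost of reading 100 characters, which is the motivation for making one tree node as large as one disk block.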
3 level
Multiway Tree
  M-ary tree
  3 levels : Cylinder, Track, Record : Index Seq (RDBMS)
  Tables with less change
BTree
  If level is 3, m = 199 then what is N?
  How many splits per insertion?
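One way to approach the first question, assuming "level" counts the levels of nodes and m is the order (maximum number of children), is to bound N from both sides. Both formulas below follow from the B-tree rules (at most m children per node; at least ceil(m/2) children for non-root internal nodes); the function names are illustrative.

```python
def max_keys(order, levels):
    """Every node full: order**levels - 1 keys."""
    return order ** levels - 1

def min_keys(order, levels):
    """Sparsest legal tree: root has 1 key, every other node ceil(order/2) - 1."""
    if levels == 0:
        return 0
    half = -(-order // 2)              # ceil(order / 2)
    return 2 * half ** (levels - 1) - 1

print(min_keys(199, 3), max_keys(199, 3))  # 19999 7880598
```

Under these assumptions a 3-level tree of order 199 holds between 19,999 and 7,880,598 keys.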
Multiway Trees : Application
  NDPL, Delhi: Electricity Billing
    3 lakh consumers
    Table indexed as BTREE
  UCO Bank, Jaipur
    One DD takes 10 minutes to print
    Saviour : BTREE
B-trees - Insertion
  B-tree property : block is at least half-full of keys
  Insertion into block with m keys
    block overflows
    split block
    promote one key
    split parent if necessary
    if root is split, tree becomes one level deeper
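The insertion steps listed above can be sketched as follows. This is a minimal illustration, not the deck's own implementation: nodes hold a sorted key list plus a child list, `order` is the maximum number of children, and a node is treated as overflowing once it holds `order` keys. All names are hypothetical.

```python
from bisect import bisect_right, insort

class BTreeNode:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []       # empty list means leaf

def _insert(node, key, order):
    """Insert key below node. Return None, or (median, right_node) if node split."""
    if not node.children:                    # leaf: place the key here
        insort(node.keys, key)
    else:
        i = bisect_right(node.keys, key)
        promoted = _insert(node.children[i], key, order)
        if promoted:                         # child split: take the promoted key
            median, right = promoted
            node.keys.insert(i, median)
            node.children.insert(i + 1, right)
    if len(node.keys) < order:               # at most order-1 keys: no overflow
        return None
    mid = len(node.keys) // 2                # block overflows: split at the median
    median = node.keys[mid]
    right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
    node.keys = node.keys[:mid]
    node.children = node.children[:mid + 1]
    return median, right

def insert(root, key, order):
    """Insert key and return the root, which changes when the root splits."""
    promoted = _insert(root, key, order)
    if promoted:                             # root split: tree is one level deeper
        median, right = promoted
        root = BTreeNode([median], [root, right])
    return root

def inorder(node):
    """All keys in ascending order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out

root = BTreeNode()
for k in [1, 2, 3, 4, 5]:
    root = insert(root, k, order=5)
print(root.keys, [c.keys for c in root.children])  # [3] [[1, 2], [4, 5]]
```

With order 5, the fifth key overflows the single leaf, the median 3 is promoted, and the tree grows from one level to two.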
Insert Node 63
After Insert 63
Insert Node 99
After Insert 99
Split Node
Structure of Btree
  node
    firstPtr
    numEntries
    Entries[1 .. M-1]
  End
  Entry
    key
    rightPtr
  End Entry
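The node layout above maps naturally onto two small record types. The sketch below is one possible rendering, with a few assumptions: entries are 0-based rather than 1-based, `numEntries` is derived from the list length, and `search_node` is only a guess at what the `searchNode` routine used later in the delete pseudocode might do.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Entry:
    key: int
    right_ptr: Optional["Node"] = None       # subtree with keys >= key

@dataclass
class Node:
    first_ptr: Optional["Node"] = None       # subtree with keys < entries[0].key
    entries: List[Entry] = field(default_factory=list)

    @property
    def num_entries(self):
        return len(self.entries)

def search_node(node, target):
    """Index of the last entry with key <= target, or -1 if target is below all keys."""
    ndx = -1
    for i, entry in enumerate(node.entries):
        if entry.key > target:
            break
        ndx = i
    return ndx

def search(root, target):
    """Walk down from root; return True if target is stored in the tree."""
    node = root
    while node is not None:
        ndx = search_node(node, target)
        if ndx >= 0 and node.entries[ndx].key == target:
            return True
        node = node.first_ptr if ndx < 0 else node.entries[ndx].right_ptr
    return False

left = Node(entries=[Entry(10), Entry(20)])
right = Node(entries=[Entry(40), Entry(50)])
root = Node(first_ptr=left, entries=[Entry(30, right)])
print(search(root, 40), search(root, 35))  # True False
```

Keeping the leftmost subtree in `first_ptr` and every other subtree in the entry to its left is what lets each entry carry exactly one pointer.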
Split Node : Final
  (figure; labels: median entry, fromNdx, toNdx, node, rightPtr)
Traversal
Agenda
  Delete
    Delete Walk Through
    Reflow
    Borrow Left
    Borrow Right
    Combine
    Delete Mid
Delete : For 78
  Btree Delete
    Delete()
      Delete()
        Delete Mid()
        Reflow()
      Reflow()
    If shorter delete root
Btree Delete (Target = 78; call stack: B)
  if (root null)
      print ("Attempt to delete from null tree")
  else
      shorter = delete (root, target)
      if shorter
          delete root
  return root
Delete(root, deleteKey)
  if (root null)
      data does not exist
  else
      entryNdx = searchNode(root, deleteKey)
      if found entry to be deleted
          if leaf node
              underflow = deleteEntry()
          else
              underflow = deleteMid(root, entryNdx, left)
              if underflow
                  underflow = reflow(root, entryNdx)
      else
          if deleteKey less than first entry
              subtree = firstPtr
          else
              subtree = rightPtr
          underflow = delete(subtree, deleteKey)
          if underflow
              underflow = reflow(root, entryNdx)
  return underflow
Delete : walk through for Target = 78 (call stack: B = Btree Delete, D = Delete, DM = Delete Mid)
  B D : Delete(root, deleteKey)
  B D : Delete else part
  B D D DM : Delete(root, deleteKey), Delete Mid is called
  B D D : 74 replaces 78
  B D D : after reflow
  B D : Delete else part, before reflow
  B D : Delete else part, after reflow
BTREE Delete (call stack: B)
  Control is back in Btree Delete: if shorter, delete root; return root
Templates
Delete
Delete : Reflow
  1: Try to borrow right
  2: If 1 failed, try to borrow from left
  3: Cannot borrow (1, 2 failed) : Combine
Delete Reflow
  underflow = false
  if RT->no > min entries
      borrowRight(root, entryNdx, LT, RT)
  else
      if LT->no > min entries
          borrowLeft(root, entryNdx, LT, RT)
      else
          combine(root, entryNdx, LT, RT)
          if root->no < min entries
              underflow = true
  return underflow
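The reflow logic can be sketched in runnable form as below. This is an illustration under assumptions, not the deck's code: nodes keep a key list and a child list, `e` is the index of the parent key that separates the left subtree (LT) from the right subtree (RT), and the three cases are written inline instead of as separate borrowRight, borrowLeft and combine routines.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []       # empty list means leaf

def reflow(parent, e, min_keys):
    """Rebalance the subtrees around parent.keys[e]; True if parent now underflows."""
    left, right = parent.children[e], parent.children[e + 1]
    if len(right.keys) > min_keys:           # 1: borrow right
        left.keys.append(parent.keys[e])     # separator moves down to the left
        parent.keys[e] = right.keys.pop(0)   # right's first key moves up
        if right.children:
            left.children.append(right.children.pop(0))
        return False
    if len(left.keys) > min_keys:            # 2: borrow left
        right.keys.insert(0, parent.keys[e])
        parent.keys[e] = left.keys.pop()
        if left.children:
            right.children.insert(0, left.children.pop())
        return False
    # 3: cannot borrow, so combine left + separator + right into left
    left.keys += [parent.keys.pop(e)] + right.keys
    left.children += right.children
    del parent.children[e + 1]
    return len(parent.keys) < min_keys

parent = BTreeNode([30], [BTreeNode([10]), BTreeNode([40, 50, 60])])
print(reflow(parent, 0, min_keys=2), parent.keys)  # False [40]
```

Only the combine case can remove a key from the parent, which is why it is the only branch that can report underflow back to the caller.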
Borrow Left
  (figure; node labels: "Node >= 74 < 78" and "Node >= 78 < 85")
Combine
  (figure, shown in four steps)
Delete Mid
  if leaf
      exchange data and delete leaf entry
  else
      traverse right to locate predecessor
      deleteMid(right)
      if underflow
          reflow
Delete Mid
  Case 1: To delete 78 we replace with 74
Delete Mid
  Case 2: To delete 78 we replace with 76
  Hence recursive call of Delete Mid to locate predecessor
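Both cases come down to the same move: follow the rightmost pointers of the left subtree until a leaf is reached, then swap that leaf's last key into the slot being deleted. The sketch below shows only that step and leaves out the underflow check and reflow; the two small trees are hypothetical stand-ins for the slide figures, and all names are illustrative.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []       # empty list means leaf

def delete_mid(node, target_node, entry_ndx):
    """Replace target_node.keys[entry_ndx] with its predecessor found under node."""
    if not node.children:                    # leaf: its last key is the predecessor
        target_node.keys[entry_ndx] = node.keys.pop()
        return
    delete_mid(node.children[-1], target_node, entry_ndx)   # keep going right

# Case 1: the left subtree of 78 is already a leaf.
root1 = BTreeNode([78], [BTreeNode([70, 74]), BTreeNode([80, 85])])
delete_mid(root1.children[0], root1, 0)
print(root1.keys)  # [74]

# Case 2: the predecessor sits one level further down.
left = BTreeNode([60], [BTreeNode([50, 55]), BTreeNode([70, 76])])
right = BTreeNode([90], [BTreeNode([80, 85]), BTreeNode([95, 99])])
root2 = BTreeNode([78], [left, right])
delete_mid(root2.children[0], root2, 0)
print(root2.keys)  # [76]
```

A full version would return an underflow flag from the leaf and reflow on the way back up, as the pseudocode indicates.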
Order
Get the Order Right
  Keys are 4
  Subtrees max is 5 = Order is 5
  Minimum = 3 (which is subtrees)
  Min keys is 2
2-3 Tree
  Order 3 ….. So how many keys in a node?
  This rule is valid for non root leaf
  Root can have 0, 2, 3 subtrees
2-3 Tree
2-3-4 Tree
  Order 4 ….. So how many keys in a node?
  This rule is valid for non root leaf
  Root can have 0, 2, 3 or 4 subtrees
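The key and subtree limits for orders 5, 3 and 4 all follow from the same rule, so they can be tabulated with one helper. A small sketch, assuming "order" means the maximum number of subtrees per node (the function name is illustrative):

```python
def btree_bounds(order):
    """Key and subtree limits for a non-root node of a B-tree of the given order."""
    min_children = -(-order // 2)            # ceil(order / 2)
    return {
        "max_children": order,
        "max_keys": order - 1,
        "min_children": min_children,
        "min_keys": min_children - 1,
    }

for order in (5, 3, 4):                      # order 5, 2-3 tree, 2-3-4 tree
    print(order, btree_bounds(order))
```

For order 5 this gives at most 4 keys, at least 3 subtrees and at least 2 keys; a 2-3 tree node holds 1 or 2 keys and a 2-3-4 tree node holds 1 to 3 keys.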
Structure of B + tree
  Non leaf node
    firstPtr
    numEntries
    Entries[1 .. M-1]
  End
  Entry
    key
    rightPtr
  End Entry
  Leaf node
    firstPtr
    numEntries

    Entries[1 .. M-1]
    Next Leaf Node
  End
B + Tree
  Implies there are more nodes (figure annotation)
B * Tree
  Space Usage
    BTREE nodes can be 50% empty (1/2)
    So the rule is modified to two thirds (2/3) full
    Also, when a node overflows, instead of being split immediately its entries are redistributed with siblings
    And even when a split happens, entries are distributed equally among the siblings (pg 462)
B+-trees
  All the keys in the nodes are dummies
  Only the keys in the leaves point to "real" data
  Linking the leaves
    Ability to scan the collection in order without passing through the higher nodes
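The benefit of linking the leaves is easiest to see in a range scan. The sketch below models only the leaf level, with each leaf holding sorted keys and a pointer to the next leaf; the class and function names are illustrative, and a real scan would first descend the index to find the starting leaf.

```python
class Leaf:
    def __init__(self, keys, next_leaf=None):
        self.keys = keys                     # sorted keys stored in this leaf
        self.next_leaf = next_leaf           # link to the following leaf, or None

def scan(first_leaf, low, high):
    """Yield every key k with low <= k <= high by walking the leaf chain."""
    leaf = first_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return                       # keys are sorted: nothing further matches
            if k >= low:
                yield k
        leaf = leaf.next_leaf

c = Leaf([50, 60])
b = Leaf([30, 40], c)
a = Leaf([10, 20], b)
print(list(scan(a, 20, 50)))  # [20, 30, 40, 50]
```

The scan moves from leaf to leaf through `next_leaf` and never revisits the higher nodes.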
Reference
  My Course
  Furzon Chapter 10
  Volume 3 Knuth : 5.4.9 (Disks), 6.2.4 (Multiway)
Action Item
  Do research on BTREE, AVL, Red Black