DBMS5
UNIT
CLOUD SERVICE
PROVIDERS
Transaction Management
- Transactions

Overview of storage and indexing
- Data on external storage
- File organization and indexing
- Index data structures
- Comparison of file organizations

Tree-structured indexing
- Intuition for tree indexes
- Indexed Sequential Access Method (ISAM)
- B+ trees: dynamic index structure, search, insert, delete

Hash-based indexing
- Static hashing
- Extendible hashing
- Linear hashing
- Extendible vs linear hashing

(Important Formulae)

(1) Index: In a database, an index is a data structure that improves the speed of data retrieval operations on a database table.

(2) Types of indexes: There are 3 types of indexes:
(i) Primary index
(ii) Secondary index
(iii) Clustering index

(3) Indexing: Indexing refers to the process of finding a particular record in a file using one or more indexes, or storing a record in any order (randomly on the disk). Indexing is either hash based or tree based.

(4) ISAM: ISAM stands for Indexed Sequential Access Method.

(5) B+ trees: A B+ tree index structure represents a balanced tree satisfying the following properties:
(i) All paths from the root to a leaf are of the same length.
(ii) Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
(iii) A leaf node has between ⌈(n-1)/2⌉ and n-1 values.
Warning : Xerox/Photocopying of this book is a CRIMINAL Act. Anyone found guilty is LIABLE to face LEGAL proceedings
Database Management Systems
Short Questions with Answers

Q1. Write a short note on data on external storage ?
Ans :
A DBMS stores vast quantities of data, and the data must persist across program executions. Therefore, data is stored on external storage devices such as disks and tapes, and fetched into main memory as needed for processing. The unit of information read from or written to disk is a page. The size of a page is a DBMS parameter, and typical values are 4 KB or 8 KB. The cost of page I/O (input from disk to main memory and output from memory to disk) dominates the cost of typical database operations, and database systems are carefully optimized to minimize this cost. The following points are important to keep in mind.
(1) Disks are the most important external storage devices.
(2) Tapes are sequential access devices and force us to read data one page after the other. They are mostly used to archive data that is not needed on a regular basis.
(3) Each record in a file has a unique identifier called a record id, or rid for short.

Q2. What are file organization and indexing ?
Ans :
File Organization :
File organization is a mechanism of physically arranging or organizing the records of a file onto secondary storage devices such as magnetic disks, tapes or CD-ROM. A file can be created, destroyed, and have records inserted into and deleted from it.
Indexing :
An index is a data structure that organizes data records on disk to optimize certain kinds of retrieval operations. An index allows us to efficiently retrieve all records that satisfy search conditions on the search key fields of the index. We can also create additional indexes on a given collection of data records, each with a different search key, to speed up search operations that are not efficiently supported by the file organization used to store the data records.

Q3. What is a data entry, and what are the alternatives for what to store as a data entry in an index ?
Ans :
We use the term data entry to refer to the records stored in an index file. A data entry with search key value k, denoted as k*, contains enough information to locate one or more data records with search key value k. We can efficiently search an index to find the desired data entries, and then use these to obtain data records.
There are three main alternatives for what to store as a data entry in an index :
(1) A data entry k* is an actual data record (with search key value k).
(2) A data entry is a <k, rid> pair, where rid is the record id of a data record with search key value k.
(3) A data entry is a <k, rid-list> pair, where rid-list is a list of record ids of data records with search key value k.

Q4. Define an index. What are the different kinds of indexes ?
Ans :
In a database, an index is a data structure that improves the speed of data retrieval operations on a database table. Indexes in databases are analogous to indexes in a textbook. The index is usually specified on one field of the file, called the indexing field. One form of an index is a file of entries, each containing a field value and a pointer to a record, which is ordered by field value. The term index file is used to describe this file of index records.
Types of indexes :
In DBMS, there are three types of indexes :
(i) Primary index
(ii) Secondary index
(iii) Clustering index
(i) Primary index :
An index that is defined on the ordering key field of an ordered file is called a primary index. Primary indices are of two types : (1) dense index (2) sparse index.
(ii) Secondary index :
An index that is defined on a non-ordering field of the data file is called a secondary index.
(iii) Clustering index :
An index defined on the ordering field of an ordered file, where that field is not a key, is called a clustering index.

Q5. Give the properties of indexes ?
Ans :
The properties of indexes are as follows :
(1) Indexes enhance the performance of the database.
(2) They can retrieve the records in a particular sequence order.
(3) They are capable of addressing the requirements of the application program.
(4) They consume less time to locate the file records.
(5) They eliminate the need to analyze each entry during query execution.
(6) They increase the speed of accessing records.
(7) They can perform binary search on variable length file records.
Q6. What is the difference between a primary index and a secondary index ?
Ans :

Comparison of ISAM and B+ tree :

No. | ISAM | B+ tree
----|------|--------
1 | ISAM (Indexed Sequential Access Method) is a static indexing structure. | B+ tree is a dynamic indexing structure.
2 | Applicable for static files. | Applicable for dynamic files.
3 | The leaf pages are allocated sequentially. | The leaf pages are allocated dynamically, not in sequence.
4 | Because the structure is static, overflow chains build up as records are inserted. | Because the structure adjusts dynamically, overflow chains rarely occur.
5 | Only leaf pages can be modified. | Leaf as well as index level pages can be modified.
6 | Scanning is done more efficiently. | Scanning is done less efficiently.
7 | Insertions lead to long overflow chains. | Insertions are handled elegantly without overflow chains.
8 | The number of nodes to be examined is equal to the height of the tree plus the number of overflow pages. | The number of nodes to be examined is equal to the height of the tree.
9 | The performance of ISAM is less efficient. | The performance of B+ trees is more efficient.
10 | Locking overhead of ISAM is less. | Locking overhead of B+ trees is more.
Fig : ISAM Index Structure (leaf pages consisting of primary pages, with overflow pages attached)
Q15. Write a short note on B+ trees ?
Ans :
A B+ tree index structure represents a balanced tree satisfying the following properties :
(i) All paths from the root to a leaf are of the same length.
(ii) Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
(iii) A leaf node has between ⌈(n-1)/2⌉ and n-1 values.
Fig : Static Hashing (the key is passed through h, and h(key) mod N selects one of the primary bucket pages 0 to N-1; each bucket may have a chain of overflow pages)
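The static hashing scheme in the figure can be sketched in a few lines of Python. This is only an illustrative sketch: the class name StaticHashFile, the page capacity, and the use of Python's built-in hash as h are assumptions, not part of the text.

```python
class StaticHashFile:
    """Static hashing: N primary bucket pages, fixed when the file is created."""

    def __init__(self, num_buckets, page_capacity):
        self.num_buckets = num_buckets
        self.page_capacity = page_capacity
        # Each bucket is a chain of pages: the primary page, then overflow pages.
        self.buckets = [[[]] for _ in range(num_buckets)]

    def _bucket_no(self, key):
        # h(key) mod N, with Python's built-in hash standing in for h (assumption).
        return hash(key) % self.num_buckets

    def insert(self, key, rid):
        chain = self.buckets[self._bucket_no(key)]
        if len(chain[-1]) >= self.page_capacity:
            chain.append([])  # last page is full, so add an overflow page
        chain[-1].append((key, rid))

    def search(self, key):
        # Only one bucket (primary page plus its overflow chain) is examined.
        chain = self.buckets[self._bucket_no(key)]
        return [rid for page in chain for (k, rid) in page if k == key]

    def overflow_pages(self, bucket_no):
        return len(self.buckets[bucket_no]) - 1


f = StaticHashFile(num_buckets=4, page_capacity=2)
for key in [1, 5, 9, 13, 2]:
    f.insert(key, rid=f"rid-{key}")
print(f.search(9))
print(f.overflow_pages(1))
```

Because the number of buckets never changes, a bucket that receives many keys (here 1, 5, 9 and 13 all map to bucket 1) can only grow by chaining overflow pages.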
Q18. Write a short note on extendible hashing ?
Ans :
Extendible hashing is a dynamic hashing technique that is capable of dividing the buckets as the database grows and rejoining them as it shrinks. This requires reorganization of buckets in the hash table.
An important feature of extendible hashing is that it reduces the performance overhead of hashing, because it performs reorganization on only one bucket at a time.
The figure below shows the general structure of extendible hashing.
Fig : General structure of extendible hashing (a bucket address table whose entries 00.., 01.., 10.., 11.. point to Bucket 1, Bucket 2, Bucket 3, ..., Bucket 'n')
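A minimal sketch of how the bucket address table can grow while only one bucket is split at a time is given below. It assumes a fixed 6-bit hash value, uses the leading bits of the hash as the table index (as the 00.., 01.. labels in the figure suggest), and assumes a bucket capacity of one entry. The class and method names are hypothetical.

```python
BITS = 6  # assumed width of a hash value, e.g. 100100


class Bucket:
    def __init__(self, local_depth):
        self.local_depth = local_depth
        self.items = []


class ExtendibleHash:
    def __init__(self, bucket_capacity=1):
        self.global_depth = 0
        self.capacity = bucket_capacity
        self.directory = [Bucket(0)]  # the bucket address table

    def _index(self, h):
        # The first `global_depth` bits of the hash value select the table entry.
        return h >> (BITS - self.global_depth)

    def search(self, h):
        return h in self.directory[self._index(h)].items

    def insert(self, h):
        bucket = self.directory[self._index(h)]
        if len(bucket.items) < self.capacity:
            bucket.items.append(h)
            return
        if bucket.local_depth == BITS:
            raise ValueError("bucket cannot be split any further")
        if bucket.local_depth == self.global_depth:
            # Double the address table: every old entry is repeated twice.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.global_depth += 1
        # Split only the overflowing bucket.
        bucket.local_depth += 1
        sibling = Bucket(bucket.local_depth)
        shift = self.global_depth - bucket.local_depth
        for i, b in enumerate(self.directory):
            if b is bucket and (i >> shift) & 1:
                self.directory[i] = sibling
        old_items, bucket.items = bucket.items, []
        for item in old_items + [h]:
            self.insert(item)  # redistribute between bucket and sibling


eh = ExtendibleHash(bucket_capacity=1)
for h in (0b100100, 0b010110, 0b110110):
    eh.insert(h)
print(eh.global_depth, len(eh.directory))
```

Note that two table entries may point to the same bucket when that bucket has not needed to split; this is why a split usually touches one bucket and only occasionally doubles the table.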
Essay Questions with Answers

5.1 DATA ON EXTERNAL STORAGE

Q1. Explain data on external storage ?
Ans :
A DBMS stores vast quantities of data, and the data must persist across program executions. Therefore, data is stored on external storage devices such as disks and tapes, and fetched into main memory as needed for processing. The unit of information read from or written to disk is a page. The size of a page is a DBMS parameter, and typical values are 4 KB or 8 KB. The cost of page I/O (input from disk to main memory and output from memory to disk) dominates the cost of typical database operations, and database systems are carefully optimized to minimize this cost. While the details of how files of records are physically stored on disk and how main memory is utilized are covered separately, the following points are important to keep in mind.
(1) Disks are the most important external storage devices. They allow us to retrieve any page at a fixed cost per page. However, if we read several pages in the order that they are stored physically, the cost can be much less than the cost of reading the same pages in a random order.
(2) Tapes are sequential access devices and force us to read data one page after the other. They are mostly used to archive data that is not needed on a regular basis.
(3) Each record in a file has a unique identifier called a record id, or rid for short. An rid has the property that we can identify the disk address of the page containing the record by using the rid.
Data is read into memory for processing, and written to disk for persistent storage, by a layer of software called the buffer manager. When the files and access methods layer needs to process a page, it asks the buffer manager to fetch the page, specifying the page's rid. The buffer manager fetches the page from disk if it is not already in memory.
Q2. Explain the memory hierarchy ?
Ans :
Memory in a computer system is arranged in a hierarchy. At the top, we have primary storage, which consists of cache and main memory and provides very fast access to data. Then comes secondary storage, which consists of slower devices such as magnetic disks. Tertiary storage is the slowest class of storage devices.
Fig : The Memory Hierarchy (CPU at the top; primary storage: cache and main memory; secondary storage: magnetic disk; tertiary storage: tape; requests for data travel down the hierarchy and data satisfying the request travels back up)
Magnetic Disks :
Magnetic disks support direct access to a desired location and are widely used for database applications. A DBMS provides seamless access to data on disk; applications need not worry about whether data is in main memory or on disk.
Data is stored on disk in units called disk blocks. A disk block is a contiguous sequence of bytes and is the unit in which data is written to a disk and read from a disk. Blocks are arranged in concentric rings called tracks, on one or more platters. The set of all tracks with the same diameter is called a cylinder. Each track is divided into arcs called sectors. An array of disk heads, one per recorded surface, is moved as a unit. A disk controller interfaces a disk drive to the computer. It implements commands to read or write a sector by moving the arm assembly and transferring data to and from the disk surfaces. A checksum is computed when data is written to a sector and is stored with the sector.
While direct access to any desired location in main memory takes approximately the same time, determining the time to access a location on disk is more complicated. The time to access a disk block has several components. Seek time is the time taken to move the disk heads to the track on which a desired block is located. Rotational delay is the waiting time for the desired block to rotate under the disk head. Transfer time is the time to actually read or write the data in the block once the head is positioned.
Access time = seek time + rotational delay + transfer time
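As a worked example of the access time formula, the short Python sketch below adds up the three components. The drive figures (9 ms seek, 7200 rpm, 100 MB/s transfer rate, 8 KB block) are hypothetical, and the average rotational delay is assumed to be half a revolution.

```python
def access_time_ms(seek_ms, rpm, transfer_mb_per_s, block_kb):
    """Access time = seek time + rotational delay + transfer time (in ms)."""
    # Assumption: on average the block is half a revolution away from the head.
    rotational_delay_ms = 0.5 * 60_000 / rpm
    # Time to move one block at the given transfer rate.
    transfer_ms = (block_kb / 1024) / transfer_mb_per_s * 1000
    return seek_ms + rotational_delay_ms + transfer_ms


t = access_time_ms(seek_ms=9.0, rpm=7200, transfer_mb_per_s=100.0, block_kb=8)
print(round(t, 3))
```

With figures like these, seek time and rotational delay dominate and the transfer time is a small fraction of the total, which is why reading pages in their physical order is so much cheaper than reading them in random order.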
Q3. Write a detailed note on disk space management ?
Ans :
Disk Space Management :
The disk space manager manages space on disk. The disk space manager is the component of Minibase that takes care of the allocation and deallocation of pages within a database. It also performs reads and writes of pages to and from disk, and provides a logical file layer within the context of a database management system.
A sequence of pages is stored as contiguous blocks to hold data that is frequently accessed in sequential order. This is advantageous for sequentially accessing disk blocks. This capability must also be provided to the higher layers of the DBMS by the disk space manager.
The disk space manager hides all the underlying hardware details and allows the higher layers to think of data as a collection of pages.
Handling of free blocks :
The disk space manager keeps track of the space on the disk. The database may grow or shrink when insertion or deletion operations are performed on it. To manage the disk space, the disk space manager keeps track of the used disk blocks as well as which pages are on which disk blocks. Deletion operations on the disk may create 'holes'.
There are two ways to determine block usage :
(1) Using a list of free blocks
(2) Using a bitmap
(1) Using a list of free blocks :
In this method, whenever blocks are deallocated they are added to the free list for future reference. A pointer to the first block on the free list is stored in a known location on the disk.
(2) Using a bitmap :
A bitmap maintains one bit for each disk block. This bit indicates whether the block is free or not, which allows a sequence of contiguous free blocks on disk to be identified and allocated very fast. This is very difficult to implement with the linked list.
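The bitmap approach can be sketched as follows. This is an illustrative in-memory sketch, not a real disk space manager: the class name BitmapAllocator and the first-fit search for a contiguous run are assumptions.

```python
class BitmapAllocator:
    """One bit per disk block: 0 means free, 1 means in use."""

    def __init__(self, num_blocks):
        self.bits = [0] * num_blocks

    def allocate(self, count):
        """Mark `count` contiguous free blocks as used.

        Returns the number of the first block, or None if no run is long enough.
        """
        run_start, run_len = 0, 0
        for i, bit in enumerate(self.bits):
            if bit == 0:
                if run_len == 0:
                    run_start = i
                run_len += 1
                if run_len == count:
                    for j in range(run_start, run_start + count):
                        self.bits[j] = 1
                    return run_start
            else:
                run_len = 0  # a used block breaks the current run
        return None

    def free(self, start, count):
        for j in range(start, start + count):
            self.bits[j] = 0


disk = BitmapAllocator(8)
a = disk.allocate(3)  # takes blocks 0 to 2
b = disk.allocate(2)  # takes blocks 3 to 4
disk.free(a, 3)       # deallocation leaves a 'hole' of three blocks
c = disk.allocate(4)  # neither the hole nor the tail has four blocks in a row
print(a, b, c)
```

Finding a contiguous run only requires scanning adjacent bits, whereas a free list would have to be searched and sorted by block number to discover the same run.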
FILE ORGANIZATION AND INDEXING - CLUSTER INDEXES, PRIMARY & SECONDARY INDEXES
Q4. Explain about fixed length records ?
Ans :
If all records on the page are guaranteed to be of the same length, record slots are uniform and can be arranged consecutively within a page. At any instant, some slots are occupied by records and others are unoccupied.
Fig : Alternative Page Organizations for fixed length records (a page header holding the number of records N or the number of slots M, the record slots, and the free space)
The slotted page organization is described for variable length records. It can also be used for fixed length records. It becomes attractive if we need to move records around on a page for reasons other than keeping track of the space freed by deletions.
Q5. Explain about variable length records ?
Ans :
Variable length file organization is a way of arranging variable length records within a file. Basically, variable length records are records of multiple sizes. In contrast to fixed length records, variable length records incur some overhead while performing insertion and deletion operations. This is because of the difference between the space created after deleting a record and the space required for inserting a record. Variable length file organization is used for organizing databases that store data whose size is greater than the disk block. In general, variable length records arise with :
(i) Multiple record types in a file
(ii) Record types in which it is possible to define variable length fields
(iii) Record types in which it is possible to repeat the same field multiple times
Slotted-page structure is a technique employed for implementing variable length records. This structure is basically used for organizing records in a block. A slotted page consists of a header which is placed at the beginning of every individual block. This header stores information regarding :
(i) The total number of record entries present in the header
(ii) The end of free space within the block
(iii) An array that contains entries specifying the location and size of each record
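The three header fields listed above can be illustrated with a small sketch. It is a simplification under stated assumptions: the class name SlottedPage is hypothetical, the header is kept as Python attributes rather than inside the block bytes, and deletion and compaction are left out.

```python
class SlottedPage:
    def __init__(self, size=64):
        self.data = bytearray(size)
        self.slots = []        # header array: (offset, length) of each record
        self.free_end = size   # header field: end of free space in the block

    @property
    def num_entries(self):
        # Header field: total number of record entries.
        return len(self.slots)

    def insert(self, record):
        # Records are packed from the end of the block toward the header.
        # Space taken by the header itself is ignored in this sketch.
        if len(record) > self.free_end:
            raise ValueError("not enough free space in the block")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1  # slot number of the new record

    def read(self, slot_no):
        offset, length = self.slots[slot_no]
        return bytes(self.data[offset:offset + length])


page = SlottedPage()
s0 = page.insert(b"alice")
s1 = page.insert(b"bob")
print(page.read(s0), page.read(s1), page.free_end)
```

Since a record is always reached through its slot entry, it can be moved within the block by updating only the offset stored in the header array.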
Types of indices :
In DBMS, there are three types of indices :
(i) Primary index
(ii) Secondary index
(iii) Clustering index
(i) Primary index :
An index that is defined on the ordering key field of an ordered file is called a primary index. It follows the same ordering as that of the file. It is an ordered file consisting of two fields. Primary indices are of two types : (a) dense index (b) sparse index.
(a) Dense index : A dense index has an index record for every search key value in the file. The index record contains the search key value and a pointer to the first data record with that search key.
(b) Sparse index : A sparse index has an index record for only some of the search key values in the file. It is used when records are arranged sequentially according to search key value.
(ii) Secondary index :
An index that is defined on a non-ordering field of the data file is called a secondary index. A secondary index has a different ordering than that of the file. When it is defined on a key field, that field is sometimes called a secondary key. A key field is guaranteed to have a unique value for each record in the data file.
(iii) Clustering index :
An index defined on the ordering field of an ordered file is called a clustering index. A clustering index has the same ordering as that of the file. In a clustering index, the ordering field of the data file can have the same value for several records in the file.
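The difference between a dense and a sparse primary index can be sketched as follows. The data values, block layout and function names are hypothetical. The sparse index is assumed to keep one entry per block, holding the first search key in that block.

```python
from bisect import bisect_right

# Data file: blocks of (search key, record), sorted on the search key.
blocks = [
    [(10, "r10"), (20, "r20")],
    [(30, "r30"), (40, "r40")],
    [(50, "r50"), (60, "r60")],
]

# Dense index: one index record for every search key value.
dense = {
    key: (b, i)
    for b, block in enumerate(blocks)
    for i, (key, _) in enumerate(block)
}

# Sparse index: one index record per block (the first key in the block).
sparse_keys = [block[0][0] for block in blocks]


def dense_lookup(key):
    if key not in dense:
        return None
    b, i = dense[key]
    return blocks[b][i][1]


def sparse_lookup(key):
    # Find the last index record whose key is <= the search key,
    # then scan that block sequentially.
    b = bisect_right(sparse_keys, key) - 1
    if b < 0:
        return None
    for k, record in blocks[b]:
        if k == key:
            return record
    return None


print(dense_lookup(40), sparse_lookup(40), len(dense), len(sparse_keys))
```

The sparse index is smaller (3 entries against 6 here) but works only because the records are stored in search key order, which is why it suits a primary index and not a secondary one.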
What are primary and secondary indexes ? Explain them with suitable examples ?
Ans :
Primary Index :
For the topic primary index, refer to Q. No. 6. Consider the example of a primary index shown in the table. This index is specified on the ordering key attribute of the file, and it contains two fields : an index entry and a pointer to the primary key field of the file.
Index :

Greatest Record Number | Block Number
-----------------------|-------------
40 | A
80 | B
160 | C
250 | D

Data file :

A-217 | Brighton | 750
A-101 | Downtown | 500
A-110 | Downtown | 600
A-215 | Mianus | 700
A-102 | Perryridge | 400
A-201 | Perryridge | 900
A-218 | Perryridge | 700
A-222 | Redwood | 700
A-305 | Round Hill | 350

In order to search for a row with key '50', we will refer to the first entry in the given table, which is ...
Fig : Bucket address table (entries 11.., 111.. pointing to Bucket 2, Bucket 3, ..., Bucket 'n')

H(R1) = 100100
H(R2) = 010110
H(R3) = 110110

The bucket address table is updated to have 2-bit addresses, and the existing data is updated to have a 2-bit address. Then it tries to accommodate R3.
Now we can see that the addresses of R1 and R2 are changed to reflect the new address, and R3 is also inserted. As the size of the data increases, it tries to insert in the existing buckets, if no buckets ...

LINEAR HASHING SEARCH :
To find the bucket for data entry k, find h_Level(k) :
(1) If h_Level(k) is in the range 'Next to N_R', k belongs here.
(2) Else, k could belong to bucket h_Level(k) or to bucket h_Level(k) + N_R; we must apply h_Level+1(k) to find out.
Here N_R is the number of buckets at the start of the round. The 'split image' buckets are those created (through splitting of the buckets) in this round.

Simple formulation :
(1) Next ...
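The linear hashing search rule can be sketched in Python as below. The sketch assumes the family of hash functions h_i(k) = k mod (2^i * N0), where N0 is the number of buckets at level 0. The function name and parameter names are hypothetical.

```python
def linear_hash_bucket(key, level, next_ptr, n0):
    """Return the bucket number that holds `key` in a linear-hashed file.

    level    : current round number
    next_ptr : the Next pointer (buckets below it were already split this round)
    n0       : number of buckets at level 0 (assumption: h_i(k) = k mod (2**i * n0))
    """
    n_level = n0 * 2 ** level          # buckets at the start of this round
    b = key % n_level                  # h_Level(key)
    if b >= next_ptr:
        return b                       # in the range Next to N_R: not split yet
    # Bucket b was already split, so h_Level+1 decides between
    # bucket b and its split image b + N_R.
    return key % (2 * n_level)


# Level 0 with 4 buckets, after bucket 0 has been split (Next = 1).
for k in (32, 44, 9):
    print(k, linear_hash_bucket(k, level=0, next_ptr=1, n0=4))
```

In this run, 32 and 44 both hash to bucket 0 under h_0, but since bucket 0 has been split, h_1 sends 32 to bucket 0 and 44 to its split image, bucket 4. Key 9 falls in an unsplit bucket and needs only h_0.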
Fig : Linear hashing example (primary pages and overflow pages addressed by h1 and h0, shown at Level 0 after successive insertions and at Level 1 after 'Add 50')
Fig : Alternative Page Organizations for fixed length records
If a new page is required, it is obtained by making a request to the disk space manager and then added to the list of pages in the file. If a page is to be deleted from the heap file, it is removed from the list and the disk space manager is told to deallocate it.
Q15. Explain a heap file with an unclustered hash index ?
Ans :
Heap file with unclustered hash index : As for unclustered tree indexes, we assume that each data entry is one tenth the size of a data record. We consider only static hashing in our analysis, and for simplicity we assume that there are no overflow chains.
In a static hashed file, pages are kept at about 80 percent occupancy. This is achieved by adding a new page to a bucket when each existing page is 80 percent full, when records are initially loaded into a hashed file structure. The number of pages required to store data entries is therefore 1.25 times the number of pages when the entries are densely packed, that is, 1.25(0.10B) = 0.125B. The number of data entries that fit on a page is 10(0.80R) = 8R, taking into account the relative size and occupancy.
Scan :
As for an unclustered tree index, all data entries can be retrieved inexpensively, at a cost of 0.125B(D + 8RC) I/Os. However, for each data entry we must also fetch the corresponding data record; the cost of this step is BR(D + C). This is prohibitively expensive, and further, the results are unordered. So no one ever scans a hash index.
Search with equality selection :
This operation is supported very efficiently for matching selections, that is, equality conditions are specified for each field in the composite key <age, sal>. The cost of identifying the page that contains qualifying data entries is H. Assuming that this bucket consists of just one page, retrieving it costs D. If we assume that we find the data entry after scanning half the records on the page, the cost of scanning the page is 0.5(8R)C = 4RC. Fetching the data record from the heap file costs another D. The total cost is therefore H + 2D + 4RC, which is even lower than the cost for a tree index.
Search with range selection :
The hash structure offers no help, and the entire heap file of employee records must be scanned, at a cost of B(D + RC).
Insert :
We must first insert the record in the employee heap file, at a cost of 2D + C. In addition, the appropriate page in the index must be located, modified and written back. The additional cost is H + 2D + C.
Delete :
We need to locate the data record in the employee file and the data entry in the index; this search step costs H + 2D + 4RC. Now, we need to write out the modified pages in the index and the data file, at a cost of 2D.
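The cost expressions in this answer can be collected into one function to see how they compare for concrete numbers. The meaning of the symbols is an assumption based on the usual cost model, since the text uses them without defining them: B is the number of data pages, R the number of records per page, D the time for one page I/O, C the time to process one record, and H the time to apply the hash function. The function name and sample values are hypothetical.

```python
def hash_index_costs(B, R, D, C, H):
    """Costs for a heap file with an unclustered static hash index."""
    search = H + 2 * D + 4 * R * C
    return {
        # Pages of data entries at 80 percent occupancy: 1.25(0.10B) = 0.125B
        "index_pages": 0.125 * B,
        # Data entries per page: 10(0.80R) = 8R
        "entries_per_page": 8 * R,
        # Reading all data entries, then one record fetch per entry
        "scan_entries": 0.125 * B * (D + 8 * R * C),
        "scan_fetch_records": B * R * (D + C),
        "equality_search": search,
        "range_search": B * (D + R * C),
        "insert": (2 * D + C) + (H + 2 * D + C),
        "delete": search + 2 * D,
    }


costs = hash_index_costs(B=100, R=10, D=10.0, C=1.0, H=1.0)
for name, value in costs.items():
    print(name, value)
```

For these sample values the record-fetching step of a scan is far larger than any other entry, which matches the remark that a hash index is never scanned.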
5.4 COMPARISON OF FILE ORGANIZATIONS
Q16. Compare various file organizations ?
Ans :
The comparison of heap, sorted and hashed files is done based on the definitions and the operations performed on those files.

No. | Heap Files | Sorted Files | Hashed Files
----|------------|--------------|-------------
1 | Records can be placed anywhere in the file. | Records are stored in sequential order. | Records are placed according to a hash function.
2 | The cost of scanning is B(D+RC). | The cost of scanning is B(D+RC). | The cost of scanning is 1.25B(D+RC).
3 | Selection is specified on a candidate key. | Equality selection is specified on the sorted field. | Selection is based on the search key.
4 | The cost of searching with equality selection is 0.5B(D+RC). | The cost of searching with equality selection is D log2 B + C log2 R. | The cost of searching with equality selection is H + D + 0.5RC.
5 | The entire file must be scanned for a search with range selection. The cost is B(D+RC). | Range selection is on the sort field. The cost = cost of search + cost of retrieving the satisfied set of records. | The entire file must be scanned, and the range selection is on the search key. The cost is 1.25B(D+RC).
6 | Records are inserted at the end of the file. The cost is 2D+C. | Find the correct position using the sorted field and insert the record at the correct position. Cost = cost of search + 2(0.5B(D+RC)) = cost of search + B(D+RC). | The appropriate page must be located, modified and then written back. Cost = cost of search + C + D.
7 | Search for the record, then remove it from the page and write the modified page back. For simplicity, cost = cost of search + C + D. | Search for the record, remove that record and write the modified page back. Cost = cost of search + B(D+RC). | First the record is searched, removed from the page, and then the modified page is written back. Cost = cost of search + C + D.
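To compare the three organizations numerically, the scan and search costs from the table can be evaluated side by side. The sketch below is only illustrative: the function names and sample values are hypothetical, and the symbols are assumed to mean B data pages, R records per page, D time per page I/O, C time to process a record, and H time to apply the hash function.

```python
import math


def file_costs(B, R, D, C, H):
    """Scan, equality-search and range-search costs from the comparison table."""
    scan = B * (D + R * C)
    return {
        "heap": {"scan": scan, "equality": 0.5 * scan, "range": scan},
        "sorted": {
            "scan": scan,
            "equality": D * math.log2(B) + C * math.log2(R),
        },
        "hashed": {
            "scan": 1.25 * scan,
            "equality": H + D + 0.5 * R * C,
            "range": 1.25 * scan,
        },
    }


def cheapest(costs, operation):
    """Name of the organization with the lowest cost for one operation."""
    candidates = {
        name: ops[operation] for name, ops in costs.items() if operation in ops
    }
    return min(candidates, key=candidates.get)


costs = file_costs(B=1024, R=64, D=10.0, C=1.0, H=1.0)
print(costs["heap"]["equality"], costs["sorted"]["equality"], costs["hashed"]["equality"])
print(cheapest(costs, "equality"))
```

For equality search the hashed file is cheapest and the heap file is by far the most expensive, while for a full scan the hashed file pays a 25 percent penalty because its pages are only 80 percent full.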
Q17. Differentiate between sequential, direct and indexed sequential file organization ?
Ans :

No. | Sequential File Organization | Direct File Organization | Indexed Sequential File Organization
----|------------------------------|--------------------------|-------------------------------------
1 | In sequential file organization, records are stored in sequential access storage devices. Example : magnetic tapes (audio cassettes). | In direct file organization, records are stored in direct access storage devices (DASD). Example : magnetic disks (hard disks). | In indexed sequential file organization, records are often stored in direct access devices. Example : magnetic disks (hard disks).
2 | Records are accessed by searching sequentially from the beginning of the file till the end of the file, or until the record is found. | Required records are searched randomly using searching keys. | Desired records are searched either sequentially or randomly.
3 | Before processing the transactions, records must be sorted in either ascending or descending order. | Before processing the transactions, it is not necessary to sort the records stored in memory. | Before processing the transactions, even though sequential access is used, there is no need to sort the records.
4 | Accessing speed is very low compared to both direct and indexed sequential file organization. | Accessing speed is higher compared to sequential access and lower compared to indexed sequential access. | Accessing speed is higher compared with both sequential access and direct access, since an index is used.
5 | This organization is economical compared to both direct access and indexed sequential access. | This organization is more expensive compared to sequential access and less expensive compared to indexed sequential access. | This organization is very expensive compared to both sequential access and direct access, since it requires special software.
6 | Time consumption is more compared to both direct and indexed sequential file organization. | Time consumption is less compared with sequential file organization and more compared to indexed sequential file organization. | Time consumption is very low compared to both sequential file organization and direct file organization.
5.5 INDEXES & PERFORMANCE TUNING; INTUITIONS FOR TREE INDEXES
Q18. What is the intuition behind tree structured indexes ?
Ans :
Consider a file of student records sorted by gpa. To answer a range selection such as "Find all students with a gpa higher than 3.0", we must identify the first such student by doing a binary search of the file and then scan the file from that point on. If the file is large, the initial binary search can be quite expensive, since the cost is proportional to the number of pages fetched.
As for any index, there are 3 alternatives for data entries k* :
(i) Data record with key value k
(ii) <k, rid of data record with search key value k>
(iii) <k, list of rids of data records with search key k>
The choice is orthogonal to the indexing technique used to locate data entries k*.
Tree structured indexing techniques support both range searches and equality searches.
Fig : Index entry (P0, K1, P1, K2, P2, ..., Km, Pm)
We refer to pairs of the form <key, pointer> as index entries, or just entries when the context is clear. Note that each index page contains one pointer more than the number of keys; each key serves as a separator for the contents of the pages pointed to by the pointers to its left and right.
Fig : The simple index file structure (an index file with keys K1, K2, ..., KN above a data file of pages Page 1 to Page N)
Fig : ISAM index structure (leaf pages consisting of primary pages, with overflow pages attached)
Each tree node is a disk page, and all the data resides in the leaf pages. This corresponds to an index that uses Alternative (1) for data entries. In terms of the alternatives, we can also create an index with Alternative (2) by storing the data records in a separate file and storing <key, rid> pairs in the leaf pages of the ISAM index.
The non-leaf level pages are then allocated. If there are several inserts to the file subsequently, so that more entries are inserted into a leaf than will fit onto a single page, additional pages are needed because the index structure is static. These additional pages are allocated from an overflow area. The allocation of pages is shown below.
Fig : Page Allocation in ISAM (data pages, followed by index pages, followed by overflow pages)
The basic operations of insertion, deletion and search are all quite straightforward. For an equality selection search, we start at the root node and determine which subtree to search by comparing the value in the search field of the given record with the key values in the node. For a range query, the starting point in the data level is determined similarly, and data pages are then retrieved sequentially. For inserts and deletes, the appropriate page is determined as for a search, and the record is inserted or deleted, with overflow pages added if necessary.
5.7 B+ TREES : A DYNAMIC INDEX STRUCTURE
Q20. Explain B+ trees ?
Ans :
A B+ tree is an n-ary tree with a variable, but often large, number of children per node. A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or a node with two or more children. The B+ tree search structure, which is widely used, is a balanced tree in which the internal nodes direct the search and the leaf nodes contain the data entries. Since the tree structure grows and shrinks dynamically, it is not feasible to allocate the leaf pages sequentially as in ISAM, where the set of primary leaf pages was static. To retrieve all leaf pages efficiently, we have to link them using page pointers. By organizing them into a doubly linked list, we can easily traverse the sequence of leaf pages in either direction.
Fig : Structure of a B+ tree (index entries in the index file direct the search; data entries at the leaf level form the "sequence set")
The following are some of the main characteristics of a B+ tree :
(i) All paths from the root to a leaf are of the same length.
(ii) Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
(iii) A leaf node has between ⌈(n-1)/2⌉ and n-1 values.
Searching a record in B+ tree :
Suppose we want to search for 65 in the below B+ tree structure. First we fetch the intermediary node, which will direct us to the leaf node that can contain the record for 65. So we find the branch between the 50 and 75 nodes in the intermediary node. Then we are redirected to the 3rd leaf node at the end. Here the DBMS will perform a sequential search to find 65. Suppose instead of 65, we have to search for 60. The same path is followed, and the sequential search of the 3rd leaf node finds no such record.
Fig : B+ tree (intermediary node : 25 50 75; leaf nodes : 5 10 15 20 | 25 30 35 40 | 50 55 65 70 | 75 80 90 95)
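The search just described can be sketched for this two-level tree. The sketch assumes the intermediary node holds the keys 25, 50 and 75 and that the four leaf nodes hold the values shown in the figure. The list-based representation and the function name are hypothetical simplifications of real disk pages.

```python
from bisect import bisect_right

intermediary = [25, 50, 75]
leaves = [
    [5, 10, 15, 20],
    [25, 30, 35, 40],
    [50, 55, 65, 70],
    [75, 80, 90, 95],
]


def search(key):
    """Return (leaf number counted from 1, whether the key was found)."""
    # The intermediary node directs the search: keys below 25 go to leaf 1,
    # keys from 25 up to 49 go to leaf 2, and so on.
    leaf_no = bisect_right(intermediary, key)
    # Sequential search inside the chosen leaf node.
    found = any(value == key for value in leaves[leaf_no])
    return leaf_no + 1, found


print(search(65))
print(search(60))
```

Both searches are routed to the 3rd leaf node, since 60 and 65 both lie between 50 and 75; only the sequential search inside the leaf tells them apart.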
Insertion in B+ Tree :
Suppose we have to insert a record 60 in the below structure. It will go to the 3rd leaf node, after 55. Since it is a balanced tree and that leaf node is already full, we cannot insert the record there. But it should be inserted there without affecting the fill factor, balance and order. So the only option here is to split the leaf node. But how do we split the nodes ?
Fig : B+ tree before the insertion (leaf nodes : 5 10 15 20 | 25 30 35 40 | 50 55 65 70 | 75 80 90 95)
The 3rd leaf node should have the values (50, 55, 60, 65, 70), and its current root node is 50. We will split the leaf node in the middle so that its balance is not altered. So we can group (50, 55) and (60, 65, 70) into 2 leaf nodes. If these two have to be leaf nodes, the intermediary node cannot branch from 50 alone. It should have 60 added to it, and then we can have a pointer to the new leaf node.
Fig : B+ tree after the split (intermediary node : 25 50 60 75; new leaf node : 60 65 70)
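The leaf split in this example can be sketched as follows. It is a minimal sketch of the single step described above, not a full B+ tree insertion: the function name is hypothetical, and the overflowing leaf is assumed to be split in the middle with the first value of the new right-hand leaf copied up to the intermediary node.

```python
from bisect import insort


def split_leaf(leaf, new_key):
    """Insert new_key into a full leaf and split it in the middle.

    Returns (left leaf, right leaf, key to add to the intermediary node).
    """
    values = sorted(leaf + [new_key])
    mid = len(values) // 2
    left, right = values[:mid], values[mid:]
    return left, right, right[0]


intermediary = [25, 50, 75]
left, right, separator = split_leaf([50, 55, 65, 70], 60)
insort(intermediary, separator)  # keep the intermediary keys in order

print(left, right)
print(intermediary)
```

The result matches the grouping in the text: (50, 55) and (60, 65, 70) become two leaf nodes, and 60 joins 25, 50 and 75 in the intermediary node so that a pointer to the new leaf can be added.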
DELETE IN B+ TREE :
Suppose we have to delete 60 from the above example. We have to remove 60 from the 4th leaf node as well as from the intermediary node. If we remove it from the intermediary node, the tree will not satisfy the B+ tree rules. So we need to modify it to have a balanced tree. After deleting 60 from the above B+ tree and re-arranging the nodes, it will appear as below.
Fig : B+ tree after deleting 60 (intermediary node : 25 50 75)
Suppose we have to delete 15 from the above tree. We will traverse to the 1st leaf node and simply delete 15 from that node. There is no need for any re-arrangement, as the tree is balanced and 15 does not appear in the intermediary node.
Fig : B+ tree after deleting 15 (intermediary node : 25 50 75; the 1st leaf node now holds 5 10 20)