Indexes and Fragmentation in SQL Server 2000 Part 1
-- This script creates our test database and creates a table within it
-- You will have to modify the 2 filename parameters to values of where you want the DB created
CREATE DATABASE [myIndexes] ON (NAME = N'myIndexes_Data',
FILENAME = N'C:\myIndexes_Data.MDF' , SIZE = 1000, FILEGROWTH = 10%)
LOG ON (NAME = N'myIndexes_Log', FILENAME = N'C:\myIndexes_Log.LDF' , SIZE = 30, FILEGROWTH = 10%)
COLLATE Latin1_General_CI_AS
GO
USE [myIndexes]
GO
CREATE TABLE [dbo].[myTable] (
[myPK] [uniqueidentifier] NOT NULL ,
[myID] [bigint] IDENTITY (1, 1) NOT NULL ,
[Char1] [varchar] (20) COLLATE Latin1_General_CI_AS NOT NULL ,
[Char2] [char] (200) COLLATE Latin1_General_CI_AS NOT NULL ,
[Char3] [varchar] (2000) COLLATE Latin1_General_CI_AS NOT NULL ,
[Num1] [int] NOT NULL ,
[Num2] [money] NOT NULL ,
[Date1] [datetime] NOT NULL
) ON [PRIMARY]
GO
ALTER DATABASE myIndexes SET RECOVERY SIMPLE
GO
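As a quick aside (not part of the original article), sp_spaceused will confirm the table exists and show its initial, empty allocation:

-- Show space reserved and used by our new (empty) table
EXEC sp_spaceused 'myTable'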
Column  Data type         Length (bytes)
------  ----------------  --------------
myPK    uniqueidentifier  16
myID    bigint            8
Char1   varchar           20
Char2   char              200
Char3   varchar           2000
Num1    int               4
Num2    money             8
Date1   datetime          8
                          ----
Total                     2264
You can see that we have used a total of 2,264 bytes. Note that I have used the term bytes rather than characters - that is the official way field length is measured. Generally a text field uses 1 byte per character, but that rule of thumb doesn't translate to a date (8 bytes) or a money field (8 bytes), so from now on we will refer to length in bytes. Other data types take up different amounts of storage (e.g. Unicode characters use 2 bytes each, which would limit you to 4,000 characters per row). I want to keep things relatively simple here, so look them up in BOL if you want to know more.

Note: Certain types can span multiple pages, but we won't be covering this here.
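If you would rather have SQL Server do the arithmetic, the declared lengths can be read straight from the system tables (a minimal sketch, not from the original article):

-- List each column's declared length in bytes, then the total
SELECT c.name, t.name AS [type], c.length
FROM syscolumns c
JOIN systypes t ON c.xusertype = t.xusertype
WHERE c.id = OBJECT_ID('myTable')

SELECT SUM(length) AS total_bytes
FROM syscolumns
WHERE id = OBJECT_ID('myTable')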
In sysindexes, dpages is the number of data pages used for our table data. Now insert a single test row:

INSERT INTO [myTable] VALUES (NEWID(), 'test1', 'test1', 'test1', 123, 321, GETDATE() )
and we now see that these values have been updated. The values you get may differ from mine, so bear this in mind when we use them as parameters for the upcoming exercise. My values are...
first = 0x220000000100
dpages = 1
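If you want to check these values for yourself, they live in sysindexes (a minimal sketch, not spelled out in the article; for a heap, the row with indid = 0 describes the table itself):

-- Read the first-page pointer and data-page count for our heap
SELECT first, dpages, rowcnt
FROM sysindexes
WHERE id = OBJECT_ID('myTable') AND indid = 0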
You will see at the start of the script that it initialises a counter (SET @Loop = 0) and then loops 1,000 times (WHILE @Loop < 1000) - this also determines the number of records that will be added, so ideally we want to change it to nearer 500,000; a minimal sketch of such a loop follows this paragraph. Perhaps you can run it overnight. I've challenged a colleague at work to see which of us can come up with a script that randomly inserts values into the database the fastest - I have some good
ideas but not had the time to try them out. If anyone is
interested in joining in the fun (or has any suggestions) let
me know and I'll send you the contest rules!
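Here is the sketch mentioned above: a minimal version of the insert loop, reusing the sample values from our single-row test (the article's full script, with its randomised values, isn't reproduced here):

-- Minimal insert loop; raise the WHILE limit towards 500000 for a realistic test
DECLARE @Loop int
SET @Loop = 0
WHILE @Loop < 1000
BEGIN
    INSERT INTO [myTable]
    VALUES (NEWID(), 'test1', 'test1', 'test1', 123, 321, GETDATE())
    SET @Loop = @Loop + 1
END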
OK, while the script is running, let's run a check to make sure we can see things happening as we would expect them to. Run the following to check that the number of data pages has been increasing - remember it was at one before...
DBCC UPDATEUSAGE(0)
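The report that follows is produced by DBCC SHOWCONTIG (UPDATEUSAGE itself only corrects the page and row counts held in sysindexes). Run it against our table like this:

-- Report fragmentation statistics for our table
DBCC SHOWCONTIG ('myTable')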
- Pages Scanned................................: 1
- Extents Scanned..............................: 1
- Extent Switches..............................: 0
- Avg. Pages per Extent........................: 1.0
- Scan Density [Best Count:Actual Count].......: 100.00% [1:1]
- Extent Scan Fragmentation ...................: 0.00%
- Avg. Bytes Free per Page.....................: 7827.0
- Avg. Page Density (full).....................: 3.30%
Extent Switches
Ideally the number of extents scanned will be one more than the number of extent switches that take place. This would mean that SQL has started at extent one and read through each of them in turn without being redirected elsewhere.
Avg. Pages per Extent
As we know, we only have one data page, so SQL has only created one extent at this point. As the data starts to grow, SQL will think ahead and create more.
Scan Density
Best count is the ideal number of extent changes if everything is contiguously linked; actual count is the actual number of extent changes. Scan density is a percentage: it is 100 if everything is contiguous, and anything less than 100 means some fragmentation exists. As you can see from our results, we have a perfect 100%!
Extent Scan Fragmentation
Because we're not using an index yet, this value is irrelevant.
Avg. Bytes Free per Page
Average number of free bytes on the pages scanned.
The higher the number, the less full the pages are.
Lower numbers are better.
Avg. Page Density (full)
Average page density (as a percentage). This value
takes into account row size, so it is a more accurate
indication of how full your pages are. The higher the
percentage, the better.
If you run the DBCC SHOWCONTIG command again on your table you will start to see different results on the extents. Because we are working with a heap (a table without a clustered index) the data is in no particular order. It is generally in the order that it was inserted into the table (providing that we are mainly inserting new rows); however, SQL will also insert data into any available space it comes across. If you do a SELECT * FROM [myTable] you will probably find that the [myID] column doesn't increment serially in the early part of the table. This is because, as SQL grows the pages and extents, it will copy approximately half of the contents of one page into the next so that there's always room for quick insertion of additional records later on (known as page splits).

In the early stages of our data generation script SQL will find free space in some of the initially created data pages and insert some records into these pages rather than at the end of the table. I've never actually had confirmation of this, but this is how I understand it - it's as if SQL initially expands quite quickly, leaving a lot of free space in the first extent(s), and then returns shortly after to fill them in.
I hope you have picked up on the point that, because we are using a heap, the data is in no particular order, and any kind of SELECT statement on a heap must scan the entire table. As we have read, if we were to select all the records from the table we wouldn't necessarily get them back in the order they were inserted. Given that - and I hope you are with me at this point - can a heap actually become fragmented? And if so, is there a command to defrag it? Well, I suppose the answer to whether a heap can become fragmented is yes, but would that actually cause us any problems? It's not as if we would be scanning the table without any indexes applied and, as we've not specified any order for the table, we're not losing out. Fragmentation, of a kind, can occur on a heap when forwarding records are created that point to an updated record that was moved because the update made it bigger than its currently allocated space (i.e. there was no room for it in its current slot). There is no command that will defrag a heap (unless you apply a clustered index and remove it again, as this will force an order which will remain after the index is dropped, but will not be maintained).
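A minimal sketch of that workaround (the index name IX_myTable_myID is just an illustrative choice, not from the article):

-- Building a clustered index rewrites the heap in key order...
CREATE CLUSTERED INDEX IX_myTable_myID ON [myTable] ([myID])
GO
-- ...and dropping it leaves the rows in that order, although nothing
-- will maintain the order from this point on
DROP INDEX [myTable].[IX_myTable_myID]
GO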
Conclusion
So we've not really learnt anything new about how to reduce table fragmentation, but we have been through the motions of how data builds up within the database file. We've seen that a table without an index - in particular a clustered index - deserves the name 'heap', because basically that is all it is, and it is of little use to us. Running any form of query on a heap will force a full scan of the table and could take a considerable amount of time. And the results from the DBCC SHOWCONTIG command, which is a tool for measuring fragmentation, are also of little use on a heap.