Techie-Friendly Blog: January 2014

The toughest thing for an amateur DBA is to understand the architecture of RDBMS. Everyone can write queries and get or update data but the one who knows how the data is retrieved and updated, he/she is the one who can master RDBMS.

In this article, I'm gonna speak about how data is retrieved by SQL Server when a user queries it using SELECT statement.

SQL Server architecture and Windows Operating System's architecture have resemblance in most of the cases. In fact some features of SQL Server work based on the functioning of Windows. In Windows, when a user opens a file, it is fetched from Secondary Storage and kept in RAM until the user closes it. The same happens with SQL Server while reading the data from a table.

Data in table is stored in the form of Pages and Extents. The size of a page is 8KB. Eight contiguous pages is called as Extent. They are stored on disk physically. When data is queried from the table, there exists a component called Buffer Manager which checks for the requested data pages on disk. If pages are found then it copies them from disk to Buffer Cache. They are stored in Buffer Cache until there exists a memory deficit in the Buffer Cache. Buffer Manager deletes the pages that are not used frequently. I'll write about Buffer Management in my future posts.

Reading Pages can be of two types namely,

1. Logical Reads.

2. Physical Reads.

Logical Read occurs when SQL Server reads the pages which are present in Buffer Cache. Physical Read means reading pages from the disk. Suppose the requested page doesn't exist in Buffer Cache, it is copied from disk to Buffer Cache and then it is read from Buffer Cache. The read operation that is performed from disk to Buffer Cache is called Physical Read.

For reading pages, SQL Server has a mechanism called Read-Ahead Mechanism. It manages all the read operations. Suppose a query is submitted to SQL Server, Read-Ahead predicts what are the pages required for the query's execution plan and copies them into Buffer Cache from disk. The number of Buffers for the pages is allocated based on requirement. This is done before the query uses the pages actually. While reading the pages, if any page is found existing in Buffer Cache, that page is discarded by Read-Ahead after reading it. It doesn't copy the page again from disk to Buffer Cache. There are two ways in which Read-Ahead mechanism works like,

1. Reading Data Pages.

2. Reading Index Pages.

Reading Data Pages

For reading data pages without an index, SQL Server uses Table Scan operator. It is a very efficient operator. Index Allocation Map (IAM) pages list the extents used by table. Every data row has a memory address in which it is stored. As rows are present in an order on the disk, SQL Server sorts the addresses of the rows which are to be read, in the same order. For this, it reads the IAM and then forms the sorted order of addresses. Now storage engine optimizes its I/O as the addresses are read in a sequential manner.

Reading Index Pages

Reading Index Pages differ with the type of index used. Generally two indexes are created namely Clustered Index and Non-Clustered Index.

Reading Clustered Index

Clustered Index has a three level architecture. The leaf nodes contain a key value and data rows. The Intermediate Level nodes contain the key values of Leaf Level nodes. The root node points to the Intermediate Level nodes. It means Intermediate Level Nodes contain the list of Leaf Level Nodes available.

While reading Clustered Index, the Storage Engine doesn't go to Leaf Nodes directly. Instead it goes to the Intermediate Level nodes and list out what all Leaf Nodes to be read. After making the list, the Read-Ahead Mechanism reads the respective nodes and copies them from disk to Buffer Cache. Those Leaf Nodes are read in the order of their key values. If there are contiguous pages, they are read in a single READ operation. If there are more pages to be read then it schedules a block of reads at a time.

Reading Non-Clustered Index

As Non-Clustered Index stores the addresses of data rows in its leaf nodes unlike Clustered Index, Storage Engine uses Prefetching mechanism to have a speed base table lookups. Storage Engine goes to the leaf nodes of Non-Clustered Index and gets the memory addresses of the data rows. Then it goes to the address and fetches the data rows. This is an asynchronous process. It means, scanning the index and retrieving the data rows from the index values which are already read is done simultaneously. This means, some data rows are retrieved before the Index Scan is completed. This results in faster data retrieval and is independent of presence of Clustered Index in the table.

The Enterprise edition of SQL Server supports more prefetching than other editions. It facilitates more pages to be Read-Ahead. The level of prefetching cannot be configured by user in any edition of SQL Server.

Advanced Scanning

There is one more reading mechanism supported only by the Enterprise Edition of SQL Server. It is an advanced scanning which is also called Merry-Go-Round Scanning.

Imagine, there are 1000 data pages in a table. One transaction TRAN1 is reading them completely. When it completed reading 100 pages, another transaction TRAN2 tries to read the table completely. Now TRAN2 should get all the 1000 pages from table. For this, it needs to wait for buffers occupied by TRAN1 until TRAN1 frees them. This may result to deadlock sometimes. To avoid this conflict, TRAN2 gets rows from 101th page along with TRAN1. The same buffers are used for both the transactions. Every page is read and passed to both of them. This goes on until all the pages are read completely.

Right now, TRAN1 has all 1000 pages while TRAN2 has 900 pages. Now SQL Server scans the table again to get first 100 pages for TRAN2 and stops reading when it gets 100 pages which lets the TRAN2 to have all 1000 pages with it.

The knowledge for writing this article is acquired from the following Microsoft Official Documentation. Please do read it.

http://technet.microsoft.com/en-us/library/ms191475(v=sql.105).aspx

As I said in my earlier post, Index acts like a pointer. In case of Non-Clustered Index, it stores a key value and address of the data row. Index is mainly used to sort the data rows. Apart from Clustered Index, SQL Server allows you to have a Non-Clustered Index on a table.

A table can have only one clustered index in any version of SQL Server but until SQL Server 2008 R2, there can be 249 Non-Clustered Indexes per table. This limit has increased in SQL Server 2012 to 999. Non-Clustered Index doesn't sort the data rows in a physical order they are stored on disk. It does a logical sorting and sorting mechanism is as it is for Clustered Index. It is ASC (Ascending) by default and user can manually specify it as DESC (Descending). Non-Clustered Index can be Unique or Non-Unique.

Unlike Clustered Index, Non-Clustered Index sorts the data rows in a logical order but not in a physical order. Physically rows are present as they are stored on disk but on creating a Non-Clustered Index, it appears to the user as they are sorted by his/her choice.

Architecture

This is the architecture of Non-Clustered Index specified by Microsoft in its official documentation. Like Clustered Index, it is also a 3 level architecture. At the bottom level, the data pages are present. If there is no clustered index created before then these data pages store the addresses of data rows directly. A table without any index is called HEAP. If there is already a clustered index existing on table then data pages of Non-Clustered Index store the key values of Clustered Index. If the clustered index is unique then directly those unique key values are stored in data pages. If the clustered index is non-unique index then SQL Server makes all key values unique by adding a 4 byte number to every key value. User cannot know what is that 4 byte value, it's done internally by SQL Server. This 4 byte number is called UNIQUEIFIER.

Like Clustered Index architecture, Non-Clustered Index architecture also has a doubly linked list. Leaf Nodes contain either addresses of Data Rows or Clustered Index key values. Intermediate Level Nodes point to the leaf nodes while Root Node points to the Intermediate Level Nodes.

In the above specified Microsoft's architecture picture, there is a minor mistake. It specified that index_id > 0. It should be index_id > 1 because the id 1 is always reserved for Clustered Index. Let's prove it.

SELECT * FROM sys.partitions

See the Result Pane, if the column value of index_id is 1 then it specifies Clustered Index. Any index_id value greater than 1 specifies Non-Clustered Index. As I said before, there can be 249 Non-Clustered Indexes per table up to SQL Server 2008 R2 and 999 in SQL Server 2012. If you are using 2008 R2 you can have index_id column value up to 250. If you are using 2012 then you can have it up to 1000. If for a table you create one Non-Clustered Index, index_id has 2 for an object_id. If you create one more Non-Clustered Index on the same table then index_id has 3 for the same object_id. If the index_id column has 0 then that specifies a Heap ( A table without any index).

As usual, to know about the indexes, use the following system view.

SELECT * FROM sys.indexes

From the Result Pane, you can notice what are the Clustered and Non-Clustered indexes present on what tables in your database. Remember that Heap is of type 0, Clustered Index is of type 1 and Non-Clustered Index is of type 2. If it is unique then the column is_unique has a value 1.

As I said in my post on Clustered Index, if you create a Primary Key it automatically creates a unique clustered index. Here a small question may arise in mind.

Can I create a Primary Key which creates a Non-Clustered Index?

The answer is YES. You can create a Non-Clustered Index with a primary key. But you should specify it manually while creating it. If not SQL Server creates Clustered Index by default. Whether it is a Clustered or a Non-Clustered, an index created by Primary Key is always unique. Specify the server to create a Non-Clustered Index with a Primary Key as below:

ALTER TABLE <Table-Name> ADD CONSTRAINT PK_COLUMN1 PRIMARY KEY NONCLUSTERED (<Column-Name>)

Now you can check in Object Explorer that a Unique Non-Clustered index is created with Primary Key.

Usage of Non-Clustered Index

Non-Clustered Index is not used by Optimizer always like it uses Clustered Index. Let's have a small demonstration of it.

Create a sample table and insert some rows into it as follows:

CREATE TABLE SampleTable (ID INT,Name VARCHAR(100))

INSERT SampleTable VALUES(1,'A')

INSERT SampleTable VALUES(2,'B')

INSERT SampleTable VALUES(3,'C')

INSERT SampleTable VALUES(4,'D')

INSERT SampleTable VALUES(5,'E')

Now create a Non-Clustered Index either unique or non-unique. For now I'm creating a non-unique index.

CREATE NONCLUSTERED INDEX NON_IX_ID ON SampleTable(ID)

Note a point here. I specified in the above statement that I'm creating a Non-Clustered Index. If you don't specify that keyword, Optimizer creates Non-Clustered Index by default.

CREATE INDEX NON_IX_ID ON SampleTable(ID)

This statement is equivalent to previous statement but specifying NONCLUSTERED is a good SQL practice.

Now query data from the table by turning ON the Actual Execution Plan.

SELECT * FROM SampleTable

Now see the Actual Execution Plan, Optimizer performs Table scan instead of performing Index Scan.

Let's try for another alternative by including a query predicate.

SELECT ID,Name FROM SampleTable WHERE ID=1

See the Execution Plan, even in this case optimizer performs Table Scan instead of Index Scan.

If you query only indexed column and if you include a predicate for that then Optimizer uses the index. Try running both the below queries.

SELECT ID FROM SampleTable

SELECT ID FROM SampleTable WHERE ID=1

See the Execution plan for both the above queries. Optimizer performs Index Scan and Index Seek operations respectively.

We may get surprised why our index is not being used when full table is queried. For that we can force the Optimizer to use our index using a hint. Let's do that now.

SELECT * FROM SampleTable WITH (INDEX(NON_IX_ID))

See the Execution Plan now:

As we forced, Optimizer used index but as per its wish it also performed RID Lookup operation in the heap. This means it scanned the whole table along with scanning index. It looked every row in table and every row in Index and then performed Inner Join of both. The matching values are returned. This becomes a burden for optimizer every time we use a forcing hint.

Now let's analyze why Optimizer ignored the index and went for Table Scan. We'll analyze the costs incurred for Optimizer for doing both the operations.

In the above figures, the Cost of Nested Loop belongs to the query where we used hint to use our index. Carefully observe the values we got for I/O cost, Operator Cost and CPU Cost. Sum the costs of Heap, Index Scan and Nested loops and then compare the result value with Table Scan. We can notice that for Table Scan, the costs incurred are lesser. However, the job of Query Optimizer is to find the cheapest execution plan among all the available plans. So it chose to perform Table Scan rather than Index Scan. Here our table is a heap, so it performed RID Lookup. If there is a Clustered Index then Optimizer would perform Key Lookup which scans Clustered Index. The major difference in usage of Clustered Index and Non-Clustered Index is that Clustered Index stores the entire data row in its leaf nodes while Non-Clustered Index stores only the addresses of Data Rows in its leaf nodes.

To have our index used for any kind of querying, we should have INCLUDED COLUMNS for it. I'll write about this concept in my future posts.

Disadvantages of Non-Clustered Index

Non-Clustered Index has disadvantages and some limitations.

Disk Space

As I demonstrated above, for querying whole table, Optimizer performs Table Scan instead of Index Scan. In such cases Index doesn't come to work. This also consumes Disk Space. So before creating index data in your table should be analyzed whether index is needed for it.

Fragmentation

Generally in our computer disks, if we do more file deletions and creation there occurs a fragmentation of disk space. The same is the case with index. If more number of INSERT, UPDATE and DELETE operations are performed on the table with index then it results in Fragmentation of index. This fragmentation causes slow execution of query.

Techie-Friendly Blog

Sunday, January 5, 2014

How SQL Server retrieves the data?