This is Part 1 in 3 part series. Read Part 2, Part 3 here.
In an Oracle Database, indexes are optional structures associated with database tables and clusters, used for speeding up the data retrieval. Just like an index in a book helps finding the information inside the book faster, an index in the oracle database provides a faster access path to the table data. Indexes are independent of the data in the associated table and indexes can be created or dropped without affecting the base table and its data. On the other hand, once created oracle automatically maintains the indexes and any changes in table data are reflected in the index automatically with no additional action from the users manipulating the table data. In order to support large data volumes, indexes are stored in disks using data structures that provide efficient retrieval of table data with minimum disk I/O.
There are different types of indexes oracle supports and internal storage wise there are mainly 2 types, the b-tree indexes and bitmap indexes. A b-tree index uses a tree data structure called balanced tree (b-tree) to store the data to support quick retrieval of data. In case of bitmap indexes, oracle stores a bitmap of rowid (unique identifier used to identify a row in Oracle) with distinct values the indexed column has. Here in this 2 blog series I am trying to illustrate the working of B-Tree and Bitmap indexes in Oracle database.
In an Oracle Database, indexes are optional structures associated with database tables and clusters, used for speeding up the data retrieval. Just like an index in a book helps finding the information inside the book faster, an index in the oracle database provides a faster access path to the table data. Indexes are independent of the data in the associated table and indexes can be created or dropped without affecting the base table and its data. On the other hand, once created oracle automatically maintains the indexes and any changes in table data are reflected in the index automatically with no additional action from the users manipulating the table data. In order to support large data volumes, indexes are stored in disks using data structures that provide efficient retrieval of table data with minimum disk I/O.
There are different types of indexes oracle supports and internal storage wise there are mainly 2 types, the b-tree indexes and bitmap indexes. A b-tree index uses a tree data structure called balanced tree (b-tree) to store the data to support quick retrieval of data. In case of bitmap indexes, oracle stores a bitmap of rowid (unique identifier used to identify a row in Oracle) with distinct values the indexed column has. Here in this 2 blog series I am trying to illustrate the working of B-Tree and Bitmap indexes in Oracle database.
B-tree indexes
B-tree indexes are the standard/traditional tree indexes oracle has been using since its earlier versions, to provide improved speed for data retrieval operations from database tables. Oracle stores the indexes as balanced trees (B-Tree) in the disk in order to minimize the disk I/O while retrieving data. For the columns for which index is created, the data is sorted and stored as rowid – index key value pair for each row. In oracle, the b-Tree indexes are stored as an ordered list of values divided into block-wide ranges. The end points of the ranges along with the pointers to the blocks are stored as a search tree where a value from a set of N entries can be found in log (n) time.What is a B-tree ?
Tree structures support various kind of data manipulation operations like search, insert, delete on a set of data in a time proportional to the height of the tree. B-Tree or balanced tree is a variant of the tree data structure which is optimized for situations where it is not possible to keep the entire tree structure in the main memory and require a secondary storage such as disks to maintain the tree. In operations where large volume of data is consumed, systems spend most of the time in waiting for the data to be retrieved from disk which is the most expensive and time consuming of all the operations. A B-Tree stores the data in such way that, the data stored in the tree structure can be retrieved with least number of disk accesses, thus speeding up the data retrieval process.In a b-tree, each node is designed to accommodate more than one keys and the number of sub trees originating from each node depends on the number keys stored in the node. This design characteristic of b-tree to branch out in number of directions and to store large number of keys in each node helps in keeping height of the tree relatively small which in turn reduces the I/O required while doing operations like searching an element or adding or deleting an element. Every B-tree is said to be of some order “n”, meaning each node in the b-tree can contain “n” to “2n” number of keys.
For example, following diagram (Figure 1) shows a b-tree of order 1 storing numbers from 0 - 9. Being a b-tree of order 1, each node in the tree can have 1 to 2 keys and a maximum of 3 pointers pointing to the sub trees. On each node keys are kept in sorted order.
Figure 1 |
In this case any key in the tree can be searched with a maximum of 3 node look up. Here it may not look significant since the example shows a b-tree of order 1 storing 10 keys. In practical scenarios like an oracle index storing hundreds of thousands of keys, b-tree provides large savings in terms of disk I/O, hence speeding up the data retrieval significantly. For example, in a b-tree storing 10,000,000 keys with 50 keys in each node, in order to search a key, a maximum of 4 node look up is good enough.
Oracle’s implementation of b-tree index
The basic storage unit for
an oracle database is a ‘data block’; meaning when oracle database processes
read or write data from disk it is done for a data block. One data block
corresponds to a specific number of bytes of physical database space on disk
depending on the data block size mentioned while creating the database. The
next level of logical database space is an extent. An extent is a specific
number of contiguous data blocks allocated for storing a specific type of
information. The level of logical database storage above an extent is called a
segment and oracle allocates different type of segments to store different type
of objects. When an index is created, index is stored by allocating an index
segment in the tablespace specified in the index creation statement.
For B-tree indexes in
oracle, each block allocated for the index is considered as a node in the
b-tree data structure. B-tree indexes have two types of nodes, leaf nodes and
branch nodes. Leaf nodes store the index keys, which is the basic unit of an
index, in sorted order together with the rowid information which can be used to
locate the row in the database table.
Branch nodes store a reduced list of index keys, and pointers to the
child blocks containing the keys. In
branch nodes, oracle does not always store the complete index key, but stores
only the minimum key prefix that is required to make a branching decision
between two keys.
When an index is created on an empty table, a
node is created which is called the root node (which is actually and empty data
block) and as data gets inserted in the table, oracle stores index keys in the
node in sorted order, along with the rowid information. At this point of time
root node acts as a leaf node. A root node at this stage can be represented as
in figure 2.
Figure 2 |
As more keys get added to the index, the root
node becomes full and oracle splits the root node to create two leaf nodes to
accommodate the additional entries.
While doing so, the root node becomes a branch node storing block
address of 2 of the newly created leaf nodes along with first index key stored
in those blocks. All the index keys and
their rowid information are distributed to the newly created leaf nodes. Figure 3 represents the state of the b-tree index at this point of time.
Figure 3 |
As more and more data gets added to the table,
more and more leaf nodes are created and more entries are made in the root node
to point to the newly created leaf nodes and at one point, the root node may
not have enough space to hold anymore entries to store the block addresses of
the leaf nodes that are getting created. The root node is again split to create
another level of branch nodes in order to accommodate the growing number of
leaf nodes. This process is continued so that all the leaf blocks are
maintained at the same depth, and retrieving any record from any part of the
index will take roughly the same amount of time. Apart from the index key,
rowid pair, leaf nodes also stores block addresses of the adjacent leaf nodes
so that the leaf nodes are cyclically linked to each other.
Illustration from Oracle Database
To have a better understanding at Oracle’s way of storing
indexes, a sample table called EMP was created and an index EMP_COMM_IDX was
created on column COMMENTS which of size 3500 characters. A column of such a
big size was chosen to produce an index layout that is spread across multiple
leaf/branch nodes with minimum number of records. In this case 8KB being the
data block size, Oracle will be able to store only 2 index keys in one data
block and with 10 rows itself index will have 5 blocks associated to it.
Following table shows the content of EMP table, the data set used in this
example. Note that the comments column in the EMP table actually contains
character data of size 3500 characters, and only a portion of the data is
actually shown in the table given below because of space constraints.
ID
|
NAME
|
DEPT
|
COMMENTS
|
2
|
AAA
|
A
|
22222222222222222222…..
|
3
|
AAA
|
B
|
33333333333333333333…..
|
8
|
AAA
|
A
|
44444444444444444444…..
|
9
|
AAA
|
C
|
99999999999999999999…..
|
6
|
AAA
|
A
|
66666666666666666666…..
|
7
|
AAA
|
B
|
77777777777777777777….
|
4
|
AAA
|
A
|
88888888888888888888….
|
5
|
AAA
|
B
|
55555555555555555555….
|
0
|
AAA
|
A
|
00000000000000000000….
|
1
|
AAA
|
B
|
11111111111111111111…..
|
The index was created after populating 10
records in the table with comments column having strings of 3500 characters
length. Shown below is the block dump
for the index EMP_COMM_IDX from the database generated using oracle dump
utility.
----- begin tree dump
branch: 0x3c00b8a 62917514 (0: nrow: 5, level: 1)
leaf: 0x3c00b8b 62917515 (-1: nrow: 2 rrow: 2)
leaf: 0x3c00b8c 62917516 (0: nrow: 2 rrow: 2)
leaf: 0x3c00b8d 62917517 (1: nrow: 2 rrow: 2)
leaf: 0x3c00b8e 62917518 (2: nrow: 2 rrow: 2)
leaf: 0x3c00b8f 62917519 (3: nrow: 2 rrow: 2)
----- end tree dump
The dump shows a total of six blocks allocated
for the index EMP_COMM_IDX, out of which five are leaf nodes and the other one
a branch node. Here each leaf node has 2 index keys stored in it and the branch
node contains prefixes for five index keys and their respective child leaf
block’s addresses. Note that in branch nodes, oracle does not store the entire
index key, but only the prefixes of the index keys that are good enough for
making the branching decision. Looking further into each block using oracle’s
data block dump utility following diagram was developed with the real block
addresses that Oracle has used to represent the index in the data file.
Figure 4 |
Here the branch node block address is 2954 and
leaf nodes starts from block 2955 to 2959. In the branch node Oracle has stored
only one character (2 for a string of 3500 2’s for example) for each index key
which enough for making the branching decision while searching for a key. Apart
from rowid/keys pairs leaf nodes also stores block address for adjacent leaf
nodes.
Read Part 2 here
No comments:
Post a Comment