Indexing columns of bit type

| category: Programming | author: st
Tags:

Clearly, an index build on only one column of the bit type is useless because the selectivity is poor (50% of a table in average). However, when a table has several columns of the bit type, the composite index may be efficient.

For the test example, we'll create a table and fill it with random data.

USE AdventureWorks
GO
CREATE TABLE bit_test (
   id int IDENTITY(1, 1) PRIMARY KEY,
   value nvarchar(20),
   f1 bit,
   f2 bit,
   f3 bit,
   f4 bit
)
GO

DECLARE @row_count int = 1000;
DECLARE @i int = 0;

WHILE @i < @row_count BEGIN
   INSERT INTO bit_test (value, f1, f2, f3, f4)
   SELECT
      convert(nvarchar(10), @i),
      convert(int, rand() + 0.5),
      convert(int, rand() + 0.5),
      convert(int, rand() + 0.5),
      convert(int, rand() + 0.5)
   SET @i = @i + 1
END

Then create the index on all bit columns.

CREATE INDEX IX1_bit_test ON bit_test (f1, f2, f3, f4)

It's time to run the query and see the plan.

SET STATISTICS PROFILE ON
GO
SELECT *
FROM bit_test
WHERE f1 = 1 AND f2 = 1 AND f3 = 1 AND f4 = 1;
GO
SET STATISTICS PROFILE OFF
GO

As we can see, the index is not used!

Rows  Executes StmtText
----- -------- ---------------------------------------------------------------------------------
79    1        SELECT * FROM [bit_test] WHERE [f1]=@1 AND [f2]=@2 AND [f3]=@3 AND [f4]=@4
79    1          |--Filter(WHERE:([AdventureWorks].[dbo].[bit_test].[f1]=(1) AND [AdventureWorks]
0     0               |--Compute Scalar(DEFINE:([AdventureWorks].[dbo].[bit_test].[cf1]=[Adventur
0     0                    |--Compute Scalar(DEFINE:([AdventureWorks].[dbo].[bit_test].[cf1]=(([A
1000  1                         |--Clustered Index Scan(OBJECT:([AdventureWorks].[dbo].[bit_test]

Why is the index not used? Indeed, the writing of queries like SELECT * is not a good practice. So we need to specify the required columns and re-run. Voila!

SELECT id, f1, f2, f3, f4
FROM bit_test
WHERE f1 = 1 AND f2 = 1 AND f3 = 1 AND f4 = 1;
Rows Executes StmtText
---- -------- -----------------------------------------------------------------------
79   1        SELECT [id],[f1],[f2],[f3],[f4] FROM [bit_test] WHERE [f1]=@1 AND [f2]=@
79   1          |--Index Seek(OBJECT:([AdventureWorks].[dbo].[bit_test].[IX1_bit_test])

There are other methods to process bit columns.

You can create one or several calculated columns based on boolean expressions which are frequently used in queries, then indexing the columns.

ALTER TABLE bit_test ADD cf1 AS (f1 & f2 & f3 & f4)
GO
CREATE INDEX IX2_bit_test ON bit_test (cf1);
GO
SELECT id, cf1 FROM bit_test WHERE cf1 = 1;
Rows Executes StmtText
---- -------- ----------------------------------------------------------------------------
79   1        SELECT [id],[cf1] FROM [bit_test] WHERE [cf1]=@1
0    0          |--Compute Scalar(DEFINE:([AdventureWorks].[dbo].[bit_test].[cf1]=[Adventur
79   1               |--Index Seek(OBJECT:([AdventureWorks].[dbo].[bit_test].[IX2_bit_test])

You may also create a calculated column of integer type which contains a combination of values 2N. Indeed, it's a frequent method in C and assembler languages.

ALTER TABLE bit_test ADD cf2
   AS (convert(int, f1) +
       convert(int, f2) * 2 +
       convert(int, f3) * 4 +
       convert(int, f4) * 8 )
GO
CREATE INDEX IX3_bit_test ON bit_test (cf2);
GO
SELECT id, cf2 FROM bit_test WHERE cf2 = 15; -- 15 (decimal) = 1111 (binary)
Rows Executes StmtText
---- -------- ----------------------------------------------------------------------------
79   1        SELECT [id],[cf2] FROM [bit_test] WHERE [cf2]=@1
0    0          |--Compute Scalar(DEFINE:([AdventureWorks].[dbo].[bit_test].[cf2]=[Adventur
79   1               |--Index Seek(OBJECT:([AdventureWorks].[dbo].[bit_test].[IX3_bit_test])