SQL Server: key-value store table and hash index

| category: Testing | author: st
Tags: ,

Hash indexes are used as entry points for memory-optimized tables; both features were introduced in SQL Server 2014. Let's look at index performance in scenario "key-value store".

Scenario and environment

The table stores a significant volume (about 10 millions rows) of pairs key-value. The key is of integer type and the value is a small random string.

The table store is used :

  • random read of a value from a key
  • random update of a value by a key

We will tests the scenario for the following types of table :

  • with a hash primary key (memory optimized table)
  • with a non-clustered primary key (heap table)
  • with a clustered primary key (cluster table)

The test workstation has 16 GB of RAM (8 GB for SQL Server) and small SSD storage.

SELECT @@version;
---
Microsoft SQL Server 2017 (RTM-CU11) (KB4462262) - 14.0.3038.14 (X64)
Developer Edition (64-bit) on Windows 10 Enterprise 10.0 <X64> (Build 17134: ) (Hypervisor)

Preparing database

A memory optimized table requires a specific file group to ensure the data durability. According to the documentation, in most cases the bucket count should be between 1 and 2 times the number of distinct values in the index key. As the table will not grow I select 10M of buckets for the tests of 10M of rows.

USE master;
GO
DROP DATABASE IF EXISTS test;
GO
CREATE DATABASE test
ON PRIMARY
(NAME = N'test_data', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\DATA\test_data.mdf', SIZE = 1024MB),
FILEGROUP fg_test_memory_optimized CONTAINS MEMORY_OPTIMIZED_DATA
(NAME = N'test_memory_optimized', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\DATA\test_memory_optimized')
LOG ON
(NAME = N'test_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\MSSQL\DATA\test_data_log.ldf', SIZE = 2048MB)
GO
ALTER DATABASE [buyer_tests] SET RECOVERY SIMPLE
GO

USE test
GO

CREATE TABLE kv1 (
   k int NOT NULL,
   v nvarchar(100) NOT NULL,
   CONSTRAINT PK_kv1 PRIMARY KEY NONCLUSTERED HASH (k)
      WITH (BUCKET_COUNT = 10000000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

CREATE TABLE kv2 (
   k int NOT NULL,
   v nvarchar(100) NOT NULL,
   CONSTRAINT PK_kv2 PRIMARY KEY NONCLUSTERED (k)
);

CREATE TABLE kv3 (
   k int NOT NULL,
   v nvarchar(100) NOT NULL,
   CONSTRAINT PK_kv3 PRIMARY KEY CLUSTERED (k)
);
GO
CREATE VIEW rand2 AS
   SELECT rand(abs(convert(int, convert(varbinary, newid())))) AS value;

Now fill the data for only one table and then copy them to the other ones. Finally, all three tables contains the same data.

SET NOCOUNT ON;
USE test;

DECLARE @i int;
DECLARE @max_count int = 10000000;
DECLARE @batch_count int = 10000;

DELETE FROM kv1;
TRUNCATE TABLE kv2;
TRUNCATE TABLE kv3;

SET @i = 0;
BEGIN TRANSACTION;
WHILE @i < @max_count BEGIN
    INSERT INTO kv1 (k, v)
    SELECT @i, CAST(rand2.value AS nvarchar(100)) FROM rand2
    SET @i = @i + 1;
    IF @i % @batch_count = 0 BEGIN
        PRINT @i;
        COMMIT TRANSACTION;
        BEGIN TRANSACTION;
    END;
END;
IF @@TRANCOUNT > 0
    COMMIT TRANSACTION;
PRINT 'Finished kv1 data';

INSERT INTO kv2 WITH (TABLOCK)
SELECT k, v FROM kv1;

INSERT INTO kv3 WITH (TABLOCK)
SELECT k, v FROM kv1;

See the hash index property to ensure that average chain length is considerable well.

SELECT
    object_name(hs.object_id) AS table_name,
    i.name as index_name,
    hs.total_bucket_count,
    hs.empty_bucket_count,
    floor((cast(empty_bucket_count as float)/total_bucket_count) * 100) AS empty_bucket_percent,
    hs.avg_chain_length,
    hs.max_chain_length
FROM sys.dm_db_xtp_hash_index_stats hs
    INNER JOIN sys.indexes i ON hs.object_id = i.object_id AND hs.index_id = i.index_id
table_name   index_name  total_bucket_count  empty_bucket_count  empty_bucket_percent  avg_chain_length  max_chain_length
------------ ----------- ------------------- ------------------- --------------------- ----------------- ----------------
kv1          PK_kv1      16777216            9040006             53                    1                 8

Testing

Run the tests sequentially for the tables kv1, kv2 and kv3.

SET NOCOUNT ON;
USE test;
DBCC DROPCLEANBUFFERS;
DBCC FREEPROCCACHE;

DECLARE @i int, @k int, @v nvarchar(100);
DECLARE @curr_time int;
DECLARE @max_key int = 10000000;
DECLARE @run_count int = 100000;
DECLARE @d1 datetime2, @d2 datetime2;
DECLARE @updated bit;
DECLARE @stat TABLE (
    test_case nvarchar(10),
    total_time int,
    min_time int,
    max_time int
);
INSERT INTO @stat VALUES
('read', 0, 1E9, 0),
('upd', 0, 1E9, 0);

-- Random read
SET @i = 0;
WHILE @i < @run_count
BEGIN
    SET @v = NULL;
    WHILE @v IS NULL
    BEGIN
        SELECT @k = CAST((rand2.value * @max_key + 1) AS int) FROM rand2;
        SET @d1 = SYSDATETIME();
        SET @v = (SELECT v FROM kv1 WHERE k = @k);
        SET @d2 = SYSDATETIME();
    END;
    SET @curr_time = DATEDIFF(microsecond, @d1, @d2);
    UPDATE @stat
    SET
        total_time = total_time + @curr_time,
        min_time = CASE WHEN min_time > @curr_time AND @curr_time > 0 THEN @curr_time ELSE min_time END,
        max_time = CASE WHEN max_time < @curr_time THEN @curr_time ELSE max_time END
    WHERE test_case = 'read';
    SET @i = @i  + 1;
END;

-- Random update
SET @i = 0;
WHILE @i < @run_count
BEGIN
    SET @updated = 0;
    WHILE @updated = 0
    BEGIN
        SELECT
            @k = CAST((rand2.value * @max_key + 1) AS int),
            @v = CAST(rand2.value AS nvarchar(100))
        FROM rand2;
        SET @d1 = SYSDATETIME();
        UPDATE kv1 SET v = @v, @updated = 1 WHERE k = @k;
        SET @d2 = SYSDATETIME();
    END;
    SET @curr_time = DATEDIFF(microsecond, @d1, @d2);
    UPDATE @stat
    SET
        total_time = total_time + @curr_time,
        min_time = CASE WHEN min_time > @curr_time AND @curr_time > 0 THEN @curr_time ELSE min_time END,
        max_time = CASE WHEN max_time < @curr_time THEN @curr_time ELSE max_time END
    WHERE test_case = 'upd';
    SET @i = @i  + 1;
END;

SELECT
    test_case,
    total_time / CAST(@run_count AS float) AS avg_time_mksec,
    min_time AS min_time_mksec,
    max_time AS max_time_mksec
FROM @stat;

Results (in microseconds) include the time to call SYSDATETIME() and set the value to the variable @d2 so I will compare only relative values. The maximal values should correspond a "cold" query after the cache and buffers cleanup.

kv1 (hash index, memory optimized)

test_case  avg_time_mksec         min_time_mksec max_time_mksec
---------- ---------------------- -------------- --------------
read       10,56365               954            1997
upd        185,48355              114            29999


kv2 (heap)

test_case  avg_time_mksec         min_time_mksec max_time_mksec
---------- ---------------------- -------------- --------------
read       67,31709               275            31619
upd        176,32221              101            19020

kv3 (cluster)

test_case  avg_time_mksec         min_time_mksec max_time_mksec
---------- ---------------------- -------------- --------------
read       40,39668               129            4949
upd        162,18376              26             50999

Conclusions

One can see that key seek and value gathering from the hash indexed table are about 4 times faster that from the clustered one and 7 times faster that from the heap table. However, you should have a good reasons like a really high workload to improve the performance in several dozens of microseconds only using memory optimized table instead of clustered one.

On other hand, the random update test shows that there is no duration difference. Memory-optimized tables have the versioning organization that explains the time required for modifications. I'm not sure whether some additional tuning may improve the performance.