Optimisation of Transact SQL queries

| category: Testing | author: st
Tags:

As DBAs know, Database Tuning Advisor can help you in some cases.

  • The analyze of the traces will be efficient in OLTP where the most of queries are repeating and could be generalized
  • The analyze of traces will be less efficient in OLAP/reporting where the queries are parameterized by users. The recommendations to create index with included columns let your database grows quickly. Be aware to apply recommendations like those.
  • Finally, the analyze will be useless in case of OLAP/Decision. Every "ad hoc" query produces the recommendation to add some indexes with included columns. Taking into account the size of data warehouses/datamarts and that the kind of queries is hard to generalize, this kind of "optimization" wont be a good solution.

In such case the precise optimization is required. I propose you the following procedure that could be adapted accordingly your needs.

Gather required information

I developed a small utility that gathers static and statistical data about database server. You can download it from sourceforge.net. The sources are open so you can compile it yourself too.

Instructions to start:

  1. Unpack ZIP file and place the utility in some directory (directory on database server is preferred)
  2. Change connection's parameters in configuration file appcfg.xml
  3. Start ssexpertcmd.exe directly or from command line (recommended)

The utility creates the subdirectory corresponding to start time and then stores into all gathering informations as CSV files.

At the same time the log file "ssexpertcmd.exe.log" will be created to store information about any errors. I.e. some database could be off-line.

It is not necessary to place the utility on database server and you can start it from workstation. In this case all storage information will be about your local disks.

Selection of necessary informations

In the directory created by ssexpertcmd.exe you have many CSV files that you can open with Excel or Open/LibreOffice:

Scope view
000_Storage.csv Information about logical an physical storage
010_ServerVersion.csv Information about software versions and editions (see also Latest updates and version history for SQL Server)
020_ServerInfo.csv Some information about server parameters
Databases view
030_Databases.csv Databases list with their parameters
040_DBFiles.csv Information about databases stockages (data files and groups...)
050_TableSizes-.csv The rows count and size of tables and indexes
060_Indexes-.csv Information about indexes
070_IndexesStat-.csv Statistical information about indexes: density, average key length...
Issues and errors
080_IndexesAnomalies-.csv Anomalies found in indexes, i.e. primary key is absent (NO_PK) etc.
Queries statistics
090_QStatTopAVGCPUTime-00-GLOBAL.csv Top of 300 most CPU time consuming queries (global view). Calculated as AvgCPUTime = sum(total_worker_time) / sum(execution_count) where "total_worker_time" is the total amount of CPU time, reported in microseconds (but only accurate to milliseconds), that was consumed by executions of this plan since it was compiled.
090_QStatTopAvgCPUTime-.csv same list specific to databases (attention, ad hoc queries are in global view only).
090_QStatTopFrequent-00-GLOBAL.csv Top 300 of the most frequently used queries (global view)
090_QStatTopFrequent-.csv same list specific to databases
090_QStatTopLogicalReads-00-GLOBAL.csv Top 300 most logical reading consuming queries (global view)
090_QStatTopLogicalReads-.csv same list specific to databases
090_QStatTopMaxWorkerTime-00-GLOBAL.csv Top 300 queries by value of "max_worker_time" (global view). MaxWorkerTime is the maximum CPU time, reported in microseconds (but only accurate to milliseconds), that this plan has ever consumed during a single execution.
090_QStatTopMaxWorkerTime-.csv same list specific to databases

See also MSDN article about sys.dm_exec_query_stats view.

Analyzing queries

You could start from 090_QStatTop\* data files and analyze several queries at top of the list.

It is recommended to install some plug-in to format SQL code in Management Studio. Then you copy the query from file to SSMS and format it.

Add the following instructions before SQL code to show execution time and readings:

SET STATISTICS IO ON
SET STATISTICS TIME ON

If needs, add the instructions of cleaning the cache to simulate "cold mode" (i.e after server restart or query was deleted from cache). Don't use this on production server!.

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

However, on production server you can force query recompilation:

SELECT
...
OPTION (RECOMPILE);

Activate the option "Include actual execution plan" in the toolbar of SSMS or in the menu "Query - Include actual execution plan". Or use SET instruction:

SET STATISTICS XML ON

Start the query and see the results about CPU time/elapsed time, readings and real execution plan.

There are many sources explaining how to improve Transact SQL query. I.e. the good start point is the article "Checklist for Analyzing Slow-Running Queries".

Have a good hunting!