In this post, I am going to talk about Performance Tuning Best Practices for SQL running on an Azure VM with key considerations. Understand the characteristics of running SQL Server on a Windows Azure VM (IaaS). and what impacts performance. Additionally, learn how to monitor and troubleshoot performance of SQL Server running on Windows Azure VM. Let’s start with the best Practices for SQL on Azure VM.
Best Practices for SQL on Azure VM
OLTP systems with transactional workloads
- Large number of small transactions with both Reads as well as Inserts & Updates
- Transactions are mostly similar in pattern
- Significant amount of reads (Varies between 60% to 80%)
- Random Read/Writes between 8k-64k chunks.
- User concurrency is High.
Read my earlier post on SQL Server Performance Tuning for All Editions.
Windows Azure VM Characteristics
- Configuration options
VM Configuration Options
- Virtual Machine size
- Network bandwidth
- Disk types and configurations
- Disk cache settings
VM Disk Options
- Default OS disk with persistent data and caching.
This is a 127GB disk optimized for OS access patterns and Windows (or Linux) boot up time. This is not meant for either transactional or Data Warehouse (Analytical) workloads.
- Attach additional Persistent Data Disks
You can add up to 16 disks with 1 TB in size and 500 IOPS each.
- Temporary local disk drives which are a non-persistent disk
- It is used for temporary data storage & OS page files. This is not meant for data storage or SQL Server files.
- Hosted in attached disks on every physical host. This means that it is not a part of the Azure Blob Storage and does not give you the regular SLA provided by Azure.
- It is cleaned up in the event of a VM Failover from one host system to another or recycling.
- Physical disks are shared across other VMs on the same physical machine. This means that the performance is not guaranteed and can have a large variance.
- *Not recommended for user or system database files
If you agree to all the conditions mentioned above, you can still go ahead and put the TEMP DB system files on the non-persistent data drive for performance reasons.
Windows Azure IO subsystem
The storage disks are implemented as a shared multi-tenant service with built-in triple redundancy, and with optional geo-redundancy across regions. Given below are the main characteristics of Azure Disk subsystem:
- Performance is less predictable than on-premises due to multi-tenancy and other reasons.
- It is a shared infrastructure, where host machines, storage services, network bandwidth are shared between subscribers.
- Performance can depend on which region the VM is provisioned. If your VM is at a far away distance, then it will have a direct impact on network latency.
- Subject to downtime and maintenance operations. Availability still stands at 99.99%.
- Choose between granular control & configuration for on-premise deployment vs. cost, simplicity, out of box redundancy for Azure deployments.
The VM disk is cached in both inside physical host machine and also to an external Azure Page Blobs. This greatly reduces disk I/O latency by reducing the number of transactions hitting against the Azure Storage subsystem.
The VM Cache is a 2-tier cache which stores recently accessed data in host RAM cache – space shared by all VMs on the machine in the form of Page Blobs. The second Tier stores less recently accessed data stored on local hard disks of the physical host machine. There is a reserved cache space for VM “OS Disk” and “Data Disks” based on the VM size.
VM cache settings
Read Only: All requests cached for future reads. All writes persisted directly to Windows Azure Storage Blobs.
Read Write: This is a kind of a “Read Ahead” algorithm. The reads and writes are cached for future reads. Non-write-through writes persisted to local cache first. For SQL Server, writes are persisted to WA storage because it uses Write-through It provides the lowest disk latency for light workloads
None (disabled): This allows you to bypasses the cache. All the data are transferred to disk and persisted to the Windows Azure Storage. This method gives you the highest I/O rate for I/O intensive workloads. You also need to take “Transaction Cost” into consideration.
Caching by default is disabled on the Data Disks. You can enable caching of upto 4 Read only and 4 Read write disks for data drives.
Disk Caching Best Practices for SQL Server
For the OS drive, the default caching is “Read Write” which reduces latency as mentioned before.
For Data Disks, the default is “None (Disabled)” which gives you the highest throughput. For Read intensive workload, Enable “Read Only” caching for better performance.
VM network performance best practices
- Network latency is higher compared to an on-premise deployment. Reducing network round trips can greatly improve performance.
- Consolidate applications which have high transactions or “Chatty” Apps on the same virtual machine to reduce round-trips.
- Host your VM’s on the same V-NET & Cloud Service.
- VM’s with Always On availability groups needs to be in the same Cloud Service. Routing of Read-Only transactions to secondaries is recommended using a listener on the Public IP interface.
- For Databases of <1TB storage, a Single Data Disk can be recommended with acceptable performance.
- For >1TB DB files and Higher IOPS requirement, use multiple data disks with the following configuration:
- Just like SQL Server On-Premises, use Filegroups with multiple database files(MDF* & Ldf*), and place each database file on a separate data disks.
- For more performance, you can create a striped Disk using multiple data disks using the Windows Server storage space.
- Storage spaces are recommended over OS striped volumes.
- Spreading the I/O workload across a number of data disks benefits smaller random access patterns for OLTP operations where IOPS and bandwidth scale in a nearly linear fashion.
- As the I/O block size increases, for reading operations, IOPS or bandwidth doesn’t linearly scale by adding more data disks.
- If your workload is read intensive with analytical queries, adding more disks will not necessarily help.
- For write intensive workload, adding more data disks can increase performance in a nearly linear fashion. This means that you can benefit from placing each transaction log for multiple databases on a separate data disk.
- For large sequential, I/O block sizes write generally scale better than reads.
Use Data Compression to reduce IO
As you can see, there is a significant reduction of logical and physical reads with page compression enabled, and the elapsed time. Unsurprisingly, CPU time of the query does go up with page compression as expected, because SQL Server needs to decompress the data while returning results to the query. Please note that your results will vary, depending on your workload and scenario.
Use these data compression techniques for I/O intensive workloads and then decide which tables and indexes to compress. If your CPU is already highly utilized, do not implement this or upgrade the VM to a higher service tier.This is not enabled by default in Azure VM images. Reduces IO for Creating a DB Restoring, DB Adding files to a DB, Extending file size, Autogrow, etc. Add SQL service account to Perform Volume Maintenance Tasks security policy.
This step will not help, if you have performance issues at the Transaction Log throughput level.
Recommendations for Instant File Initialization
- Ensure that you Pre-size all database files appropriately
- Configure Instant File Initialization before creating database
- Configure Instant File Initialization first and then extend files like TEMPDB.
- Configure Instant File Initialization first before restoring a database on a new server or on a newly create VM.
- Restart SQL Server Service for the configuration manager after configuring Instant File Initialization.
Other Recommendations for Data Disk Performance
Spreading Data files of a single Database into multiple storage accounts is not recommended. This is because Blobs are not written at the same time and could be out of sync. Instead, spread the data files of different databases across multiple disks to achieve higher IOPS / bandwidth performance.
A storage account has a hard limit of 20000 tps (Transactions per second)
Performance Counter to Capture and Analyse Database performance on an Azure VM
LogicalDisk(*)Avg. Disk Bytes/Read
LogicalDisk(*)Avg. Disk Bytes/Write
LogicalDisk(*)Avg. Disk Queue Length
LogicalDisk(*)Avg. Disk sec/Read
LogicalDisk(*)Avg. Disk sec/Write
LogicalDisk(*)Current Disk Queue Length
LogicalDisk(*)Disk Read Bytes/sec
LogicalDisk(*)Disk Write Bytes/sec
MemoryFree System Page Table Entries
Processor(*)% Privileged Time
Processor(*)% Processor Time
SQLServer:Locks(_Total)Average Wait Time (ms)
SystemProcessor Queue Length
Use tools like PAL from Codeplex to analyze the results.
I hope this was helpful. Please comment below to give feedback.
- Performance Guidance for SQL Server in Windows Azure Virtual Machines: Download it here
- Azure Storage Scalability and Performance TargetsAzure
- Azure virtual machine disks and cache settings
- Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server Database Applications
- Best practices and recommendations for optimizing SQL Server performance in Azure VMs
- Analyzing I/O Characteristics and Sizing Storage Systems for SQL Server Database Applications
- Data Compression: Strategy, Capacity Planning and Best Practices
- How and Why to Enable Instant File Initialization
I hope that you like this post on Best Practices for SQL on Azure VM. Any comments or suggestions are welcome in the comment section below. Cheers!