Queues are often derided as the very “bane of our existence,” and yet queues restore some semblance of order to our chaotic lives.
Queue depth is the number of I/O requests (SCSI commands) that can be queued at one time on a storage controller. Each I/O request from the host’s initiator HBA to the storage controller’s target adapter consumes a queue entry. Typically, a higher queue depth equates to better performance.
Overview Of Queues
There are three layers of queue stacks:
At the Virtual Machine level, there are two queues:
- PVSCSI Adapter queue
- Per VMDK queue
You can find more information in VMware KB 2053145.
At the physical server level, there are two queues:
- An HBA (Host Bus Adapter) queue per physical HBA
- A Device/LUN queue (a queue per LUN).
Note that if your servers are in a cluster, it is recommended to use the same HBA model on all of them.
If your hosts have different HBA adapters, you can adjust the HBA queue depth; see VMware KB 1267.
If you have problems with your HBA driver on ESXi 5.5 and above, see VMware KB 2044993.
For any other storage adapter, consult the vendor documentation and apply its recommendations.
- Storage Array Queue
Consult your storage vendor's documentation to find the queue depth of your storage ports.
For reference, this is 2048 for most EMC storage arrays.
Queue Depth Calculation: Physical Server
Number of I/Os that can be generated per LUN with a single queue depth slot = 1000 ms / (average latency between host and storage array)
If latency is 10 ms, the server can generate 100 IOPS per queue slot. With the LUN queue depth set to its maximum (this depends on your HBA; consult the vendor documentation): 100 x 512 (maximum LUN queue depth) = 51,200 IOPS per LUN.
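The per-LUN calculation above can be sketched in a few lines of Python. The 10 ms latency and 512 maximum LUN queue depth are the example values from the text, not universal constants; check your HBA vendor documentation for the real limit.

```python
def iops_per_lun(avg_latency_ms: float, lun_queue_depth: int) -> float:
    """IOPS a host can drive to one LUN: each queue slot completes one
    I/O every avg_latency_ms, and lun_queue_depth slots run in parallel."""
    iops_per_slot = 1000.0 / avg_latency_ms
    return iops_per_slot * lun_queue_depth

# Example values from the text: 10 ms latency, LUN queue depth 512.
print(iops_per_lun(10, 512))  # 51200.0
```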
Number of supported LUNs
Typically, each server has two HBA ports. If each HBA port can generate 8,192 I/Os (to find the maximum I/Os per port, consult your HBA vendor documentation as well):
(HBA1 queue depth + HBA2 queue depth) / (lun_queue_depth per lun)
(8192 + 8192) / 512 = 16384 / 512 = 32 LUNs
The server can therefore have a maximum of 32 LUNs and generate 51,200 x 32 = 1,638,400 IOPS.
Because the server's HBAs can generate such a heavy workload, you also need an estimate from the storage point of view.
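The LUN-count and aggregate-IOPS math above can be sketched as follows, using the example figures from the text (8,192 I/Os per HBA port, LUN queue depth 512, 51,200 IOPS per LUN at 10 ms latency).

```python
def max_luns(hba_port_depths: list, lun_queue_depth: int) -> int:
    """LUNs the host can drive at full LUN queue depth without
    oversubscribing its HBA port queues."""
    return sum(hba_port_depths) // lun_queue_depth

# Example values from the text: two HBA ports of 8192 each, LUN QD 512.
luns = max_luns([8192, 8192], 512)
print(luns)            # 32
print(luns * 51_200)   # 1638400 aggregate IOPS at 10 ms latency
```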
Queue Depth Calculation: Storage SAN
Here is the formula for calculating storage array queue depth:
Port-QD ≥ Host1 (P * L * QD) + Host2 (P * L * QD) + … + Hostn (P * L * QD)
Port-QD = Maximum queue depth of the array target port
P = Number of initiators per Storage Port (number of ESX hosts, plus all other hosts sharing the same SP ports)
L = Number of LUNs presented to the host via the array target port, i.e. sharing the same paths
QD = LUN queue depth / execution throttle (maximum number of simultaneous I/Os for each LUN on any particular path to the SP)
The maximum number of LUNs that can be serviced by both FC ports without flooding the storage port queue, with a LUN queue depth of 512 and heavy I/O to the LUNs:
= (Port-QD of FA port 0 + Port-QD of FA port 1) / lun_queue_depth
= (1600 + 1600) / 512 = 6.25 ≈ 6 LUNs
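The storage-side sizing above can be sketched the same way. The 1,600 per-port queue depth and 512 LUN queue depth are the example values from the text; substitute the figures from your array vendor's documentation.

```python
import math

def port_demand(hosts: list) -> int:
    """Total queue demand on an array target port: sum of P * L * QD
    per host, where P = initiators per storage port, L = LUNs presented,
    QD = LUN queue depth (as defined in the formula above)."""
    return sum(p * l * qd for p, l, qd in hosts)

def max_luns_per_ports(port_depths: list, lun_queue_depth: int) -> int:
    """LUNs both FA ports can service at full LUN queue depth without QFULL."""
    return math.floor(sum(port_depths) / lun_queue_depth)

# Example values from the text: two FA ports of 1600 each, LUN QD 512.
print(max_luns_per_ports([1600, 1600], 512))  # 6  (6.25 rounded down)
# One initiator driving 6 LUNs at QD 512 demands 3072 slots:
print(port_demand([(1, 6, 512)]))             # 3072
```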
Other initiators are likely to be sharing the same SP ports, so these will also need to have their queue depths limited.
Typically a SAN will have at least two storage FA cards, each with two storage ports, which together can support many LUNs, so a QFULL situation may not arise that often if proper queue tuning is done.
Given that the server can send 16,384 I/Os per 10 ms while the storage ports offer only 3,200 slots (both FA ports combined), a QFULL error condition will occur.
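The mismatch described above boils down to a simple comparison, sketched here with the example figures from the text (two HBA ports of 8,192 slots against two FA ports of 1,600 slots).

```python
# Host-side queue capacity vs. array-port queue capacity,
# using the example values from the text.
host_slots = 8192 + 8192   # both HBA ports
port_slots = 1600 + 1600   # both FA ports

if host_slots > port_slots:
    print(f"QFULL risk: host can queue {host_slots} I/Os, "
          f"array ports accept only {port_slots}")
```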
See VMware KB 1008113 to control and prevent queue-full situations.
Queue depth calculation is very important: it lets you keep queues under control and prevent queue-full situations.
Read the links below for more information and examples:
- Using esxtop to identify storage performance issues for ESX / ESXi (multiple versions) (1008205)
- Setting the Maximum Outstanding Disk Requests for virtual machines (1268)