ESXi PCI Passthrough – Large VM Memory (MainHeap) BUG!
ESXi PCI Passthrough
PCI passthrough is a combined hardware and software feature that lets a hypervisor give VMs direct access to PCI functions. In vSphere it is known as VMDirectPath I/O.
VMDirectPath I/O has several requirements. Please read this KB article for more information, as we did:
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2142307
VMDirectPath I/O also has limitations. The following features are unavailable for a virtual machine that uses it:
- Hot adding and removing of virtual devices
- Suspend and resume
- Record and replay
- Fault tolerance
- High availability
- DRS (limited availability. The virtual machine can be part of a cluster, but cannot migrate across hosts)
- Snapshots
I couldn’t find any documented limitation on memory size, so why couldn’t we assign more than roughly 790 GB to 850 GB of our servers’ memory to virtual machines?
Let’s review our test scenario.
Our Test Scenario
We have some Sun X4-8 servers with the below specifications:
- CPU: 8 x E7-8895 v2
- Local Disk: 8 x 600 GB SAS Disk
- Memory: 48 x 16 GB (768 GB in total)
- PCI Devices:
  - 2 x QLogic 2600 16 Gb – 2 Ports (HBA)
  - 2 x Intel 82599EB 10 Gb – 2 Ports (Network)
- Embedded Devices: 2 x Intel I350 1Gb
ESXi 6.x U2 was installed on all the servers, and two virtual machines with 120 CPU cores and 368 GB of memory were created on the servers.
Each virtual machine has one port of an Intel 82599EB and two ports of the HBA cards as PCI passthrough devices.
There was no problem until we added more memory modules to each server and increased the total capacity to 1 TB.
We planned to increase each VM’s memory to at least 512 GB, but after the memory expansion we couldn’t power on both virtual machines with more than 395 GB each, or a single virtual machine with more than 790 GB.
Something was wrong, and we got the error below when powering on a virtual machine:
Failed to register the device pciPassthru0 for x:x.x due to unavailable hardware or software support.
We tried many configurations, but nothing changed. We reduced the virtual machines’ core counts and tried to power them on with different memory sizes, with no effect. We even upgraded to ESXi 6.x U3, but the result was the same.
We also added advanced parameters to the virtual machines’ configuration files to allow them to use 64-bit MMIO addresses, but nothing changed.
We found the lines below in vmkernel.log, logged while the virtual machines were powering on:
2017-05-31T21:12:41.904Z cpu51:38631)VSCSI: 4038: handle 8199(vscsi0:0):Creating Virtual Device for world 38632 (FSS handle 592241) numBlocks=1169920000 (bs=512)
2017-05-31T21:12:41.904Z cpu51:38631)VSCSI: 273: handle 8199(vscsi0:0):Input values: res=0 limit=-2 bw=-1 Shares=1000
2017-05-31T21:12:41.919Z cpu51:38631)WARNING: Heap: 3721: Heap mainHeap (21594976/31462240): Maximum allowed growth (9867264) too small for size (13111296)
2017-05-31T21:12:41.919Z cpu51:38631)WARNING: Heap: 4214: Heap_Align(mainHeap, 13107208/13107208 bytes, 8 align) failed. caller: 0x41800433fbe2
2017-05-31T21:12:41.922Z cpu51:38631)VSCSI: 6726: handle 8199(vscsi0:0):Destroying Device for world 38632 (pendCom 0)
2017-05-31T21:12:41.946Z cpu16:33292)WARNING: SP: 1523: Smashing barrier dealloc-Barrier.
2017-05-31T21:12:42.077Z cpu185:35847)Config: 681: "SIOControlFlag2" = 0, Old Value: 1, (Status: 0x0)
2017-05-31T21:13:23.048Z cpu23:37390 opID=c69de623)World: 15516: VC opID 384E9BA6-0000016B-3934 maps to vmkernel opID c69de623
2017-05-31T21:13:23.048Z cpu23:37390 opID=c69de623)Config: 681: "SIOControlFlag2" = 1, Old Value: 0, (Status: 0x0)
Judging by the warnings below, there seems to be a limit in ESXi vmkernel memory management:
WARNING: Heap: 3721: Heap mainHeap (21594976/31462240): Maximum allowed growth (9867264) too small for size (13111296)
WARNING: Heap: 4214: Heap_Align(mainHeap, 13107208/13107208 bytes, 8 align) failed. caller: 0x41800433fbe2
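The numbers in these two warnings can be checked against each other. The following is only a sketch in Python: it assumes that the pair in parentheses after the heap name is current size / maximum size (the second number does match the "maximum heap size" reported by vsish), and it assumes 4 KiB pages. Neither assumption is confirmed by VMware documentation, and the function names are hypothetical.

```python
import re

# Warning lines copied verbatim from the vmkernel.log excerpt above.
GROWTH_WARNING = (
    "WARNING: Heap: 3721: Heap mainHeap (21594976/31462240): "
    "Maximum allowed growth (9867264) too small for size (13111296)"
)
ALIGN_WARNING = (
    "WARNING: Heap: 4214: Heap_Align(mainHeap, 13107208/13107208 bytes, "
    "8 align) failed. caller: 0x41800433fbe2"
)

# Assumption: 4 KiB pages. The log itself does not state the page size.
PAGE_SIZE = 4096


def parse_growth_warning(line):
    """Extract the numbers from a 'Maximum allowed growth' heap warning."""
    match = re.search(
        r"Heap (\w+) \((\d+)/(\d+)\): Maximum allowed growth \((\d+)\) "
        r"too small for size \((\d+)\)",
        line,
    )
    if match is None:
        raise ValueError("not a heap growth warning")
    current, maximum, growth, requested = (
        int(v) for v in match.group(2, 3, 4, 5)
    )
    return {
        "heap": match.group(1),
        "current": current,      # assumed: current heap size in bytes
        "maximum": maximum,      # assumed: maximum heap size in bytes
        "growth": growth,
        "requested": requested,
    }


def parse_align_warning(line):
    """Extract the requested byte count from a failed Heap_Align warning."""
    match = re.search(
        r"Heap_Align\((\w+), (\d+)/(\d+) bytes, (\d+) align\)", line
    )
    if match is None:
        raise ValueError("not a Heap_Align warning")
    return {
        "heap": match.group(1),
        "bytes": int(match.group(2)),
        "align": int(match.group(4)),
    }


def round_up(value, multiple):
    """Round value up to the next multiple (integer arithmetic only)."""
    return -(-value // multiple) * multiple


growth = parse_growth_warning(GROWTH_WARNING)
align = parse_align_warning(ALIGN_WARNING)

headroom = growth["maximum"] - growth["current"]
shortfall = growth["requested"] - growth["growth"]

print("headroom equals logged growth limit:", headroom == growth["growth"])
print("shortfall in bytes:", shortfall)
print(
    "growth request is the page-rounded allocation:",
    round_up(align["bytes"], PAGE_SIZE) == growth["requested"],
)
```

Under those assumptions the figures are consistent: the heap had 9867264 bytes of headroom left, the failed 13107208-byte allocation rounds up to 13111296 bytes, and the request exceeded the headroom by 3244032 bytes.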
We searched but found nothing about this warning. We only found a way to check the mainHeap status.
The commands below show the mainHeap status:
[root@ESXi-001:~] vsish
/> cat /system/heap
/system/heapMgrVA /system/heaps/
/> cat /system/heaps/mainHeap-0x43004d006000/stats
Heap stats {
Name:mainHeap
dynamically growable:1
physical contiguity:MM_PhysContigType: 1 -> Any Physical Contiguity
lower memory PA limit:0
upper memory PA limit:-1
may use reserved memory:0
memory pool:76
# of ranges allocated:1
dlmalloc overhead:1024
current heap size:1319776
initial heap size:0
current bytes allocated:1161424
current bytes available:158352
current bytes releasable:158048
percent free of current size:11
percent releasable of current size:11
maximum heap size:31462240
maximum bytes available:30300816
percent free of max size:96
lowest percent free of max size ever encountered:96
# of failure messages:0
number of succeeded allocations:36329
number of failed allocations:0
average size of an allocation:380
number of requests we try to satisfy per heap growth:48
number of heap growth operations:2
number of heap shrink operations:0
The “maximum heap size:31462240” value is the same on the different ESXi versions we checked, so upgrading or downgrading couldn’t help us.
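To make the vsish output easier to compare between hosts or ESXi builds, the stats can be parsed and cross-checked. This is a minimal sketch in Python, assuming the output has been saved as text. The function name and the relationships it checks are my own reading of the field names, not something documented by VMware.

```python
# A subset of the stats lines shown above, copied verbatim.
STATS_TEXT = """\
Name:mainHeap
physical contiguity:MM_PhysContigType: 1 -> Any Physical Contiguity
upper memory PA limit:-1
current heap size:1319776
current bytes allocated:1161424
current bytes available:158352
percent free of current size:11
maximum heap size:31462240
maximum bytes available:30300816
percent free of max size:96
"""


def parse_heap_stats(text):
    """Return the integer-valued 'key:value' fields of a vsish stats dump."""
    stats = {}
    for line in text.splitlines():
        # Split on the last colon, since some values contain colons too.
        key, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            stats[key.strip()] = int(value)
    return stats


stats = parse_heap_stats(STATS_TEXT)

size = stats["current heap size"]
allocated = stats["current bytes allocated"]
maximum = stats["maximum heap size"]

# Assumed relationships between the fields; they hold for this dump.
print("available now:", size - allocated)
print("available at max:", maximum - allocated)
print("percent free of current size:", (size - allocated) * 100 // size)
print("percent free of max size:", (maximum - allocated) * 100 // maximum)
print("maximum heap size in MiB:", round(maximum / 2**20, 2))
```

For this dump the derived values agree with the reported ones, and the maximum heap size works out to roughly 30 MiB. Non-numeric fields such as the name and the contiguity type are skipped on purpose.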
Finally, we ran the same tests on an HPE DL580 G8 with 1 TB of memory, and the result was the same as on the Sun X4-8, so this does not appear to be related to hardware.
My guess is that there is a bug in the ESXi mainHeap: ESXi can’t power on virtual machines with more than about 800 GB of memory in total when we are using PCI passthrough.
Examples of memory configurations:
- Two virtual machines with 400 GB of memory or more each: power on fails.
- One virtual machine with 800 GB of memory or more: power on fails.
- Two virtual machines with 395 GB of memory or less each: the virtual machines power on.
- One virtual machine with 790 GB of memory or less: the virtual machine powers on.
I’ve reported this to VMware, but please let me know if anyone has had the same experience and knows the solution.
Maybe we missed some configuration, but I don’t know, because we have this problem only on big servers.
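The four observations above can be summarised as a rule on the total memory of the passthrough virtual machines on one host. The sketch below, in Python, encodes only what we observed on our hosts. Treating the limit as a sum across virtual machines is an assumption that fits these four cases, the function name is hypothetical, and totals between 790 GB and 800 GB were not tested.

```python
# Thresholds taken from the observations listed above (in GB).
POWERS_ON_UP_TO_GB = 790   # largest total that was seen to power on
FAILS_FROM_GB = 800        # smallest total that was seen to fail


def observed_outcome(vm_memory_gb):
    """Classify a set of passthrough VMs by their total memory.

    vm_memory_gb is a list with the memory size of each VM in GB.
    This reflects our test results only; it is not a documented ESXi limit.
    """
    total = sum(vm_memory_gb)
    if total <= POWERS_ON_UP_TO_GB:
        return "powers on"
    if total >= FAILS_FROM_GB:
        return "fails"
    return "not tested"


for config in ([400, 400], [800], [395, 395], [790], [512, 512]):
    print(config, "->", observed_outcome(config))
```

The last configuration, two virtual machines with 512 GB each, is the one we originally planned to run, and it falls in the failing range.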
Did you get any resolution from VMware? This is exactly what we’re seeing. We have 4 VMs that need 1TB of vRAM and a 6.4TB NVMe PCI passthrough.
Actually, I did report the issue to VMware, but they didn’t answer me.
It also seems to depend on the hardware, because I could assign more memory to virtual machines when I was using the HPE DL580 G8.
But I couldn’t assign more than 800GB of memory to virtual machines when they were hosted on the Oracle X4-8.
I suggest you try different solutions without passthrough.
If you find a solution, please share it with me.
Ya, we’ve put in a ticket with VMware. No news yet.