Hi ESXi users,
I've been facing a problem ever since I set up ESXi on an i7-3770 server with 32 GB of RAM.
My ESXi host:
- 3 Windows 2012 VMs
- 3 Ubuntu 12.10 Server VMs
- pfSense as router
- NexentaStor as NAS
- ESXi version 5.1.0, build 799733
- 2 datastores: one 3 TB SATA HDD each
The problem is that my host crashes at least once a day with a (presumed*) purple screen and core dump logs (below).
* It's a hosted server without KVM access, so I can only assume that a purple screen comes with each crash.
I thought the NexentaStor VM might be the cause, but I tried running without it and the host crashed anyway.
(By the way, for NexentaStor I set up the 2 virtual disks in software RAID 1, which was the main goal, and made them independent persistent.)
The hosting company ran a 10-hour hardware test on the CPU, memory and HDDs without finding anything relevant.
I've tried several things:
- Setting ESXi power management to best performance (in other words, disabling power management)
- Running Prime95 on 2 VMs with 10 GB of RAM and 2 vCPUs each
- Probably other things too
I've never tried running just one VM at a time. It might be relevant, but I don't use ESXi to run a single VM anyway.
I just wonder whether it could be a bug between Ivy Bridge and ESXi.
FYI, one of the core dumps (panic portion):
2012-11-14T03:48:01.041Z cpu4:6089)World: 8381: PRDA 0x418041000000 ss 0x0 ds 0x10b es 0x10b fs 0x10b gs 0x0
2012-11-14T03:48:01.041Z cpu4:6089)World: 8383: TR 0x4020 GDT 0x41221f261000 (0x402f) IDT 0x418030113000 (0xfff)
2012-11-14T03:48:01.041Z cpu4:6089)World: 8384: CR0 0x80010031 CR3 0x197d7b000 CR4 0x42768
2012-11-14T03:48:01.045Z cpu4:6089)Backtrace for current CPU #4, worldID=6089, ebp=0x41221f25ba08
2012-11-14T03:48:01.046Z cpu4:6089)0x41221f25bae8:[0x41803007b4a7]Panic@vmkernel#nover+0xae stack: 0x2e067c00000010, 0x0, 0x1f25bb38,
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bc18:[0x4180300a7823]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0xca, 0x0, 0x0, 0x0, 0x0
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bc68:[0x418030489e17]UserMem_CartelFlush@<None>#<None>+0xce stack: 0xcaa0b, 0x0, 0x0, 0x4
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25bd78:[0x41803048ab91]UserMemUnmapStateCleanup@<None>#<None>+0x58 stack: 0x0, 0x41221f25bd
2012-11-14T03:48:01.047Z cpu4:6089)0x41221f25be58:[0x41803048b97d]UserMemUnmap@<None>#<None>+0x104 stack: 0x41221f267000, 0x41221f25bf
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25be98:[0x41803048bf20]UserMem_Unmap@<None>#<None>+0xe3 stack: 0x426, 0x0, 0x41221f25bef8,
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25beb8:[0x4180304a5985]UW64VMKSyscallUnpackReleasePhysMemMap@<None>#<None>+0x18 stack: 0x10
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25bef8:[0x418030476791]User_LinuxSyscallHandler@<None>#<None>+0x17c stack: 0x41803004cc70,
2012-11-14T03:48:01.048Z cpu4:6089)0x41221f25bf18:[0x4180300a82be]User_LinuxSyscallHandler@vmkernel#nover+0x19 stack: 0x3ffe63bed80, 0
2012-11-14T03:48:01.049Z cpu4:6089)0x41221f25bf28:[0x418030110064]gate_entry@vmkernel#nover+0x63 stack: 0x10b, 0x0, 0x0, 0x426, 0xcf76
2012-11-14T03:48:01.049Z cpu4:6089)VMware ESXi 5.1.0 [Releasebuild-799733 x86_64]
PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
2012-11-14T03:48:01.050Z cpu4:6089)cr0=0x80010031 cr2=0xcaa0b750 cr3=0x197d7b000 cr4=0x42768
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:0 world:6111 name:"vmm0:Windows_2012_-3" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:1 world:6032 name:"vmm0:Windows_2012_-2" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:2 world:6098 name:"vmm0:Windows_2012_-1" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:3 world:4099 name:"idle3" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:4 world:6089 name:"vmx-vcpu-0:NexentaStor" (U)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:5 world:6134 name:"vmm0:Ubuntu_1" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:6 world:4102 name:"idle6" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:7 world:4103 name:"idle7" (IS)
2012-11-14T03:48:01.050Z cpu4:6089)@BlueScreen: PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
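To make the dump above easier to read, here is a minimal Python sketch that pulls out which PCPU locked up and which world was running on it at panic time. The sample lines are copied from the excerpt above; the function names (`worlds_by_pcpu`, `locked_pcpu`) are just illustrative helpers I made up, not part of any VMware tool.

```python
import re

# A few lines copied from the panic excerpt above.
LOG = """\
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:0 world:6111 name:"vmm0:Windows_2012_-3" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:1 world:6032 name:"vmm0:Windows_2012_-2" (V)
2012-11-14T03:48:01.050Z cpu4:6089)pcpu:4 world:6089 name:"vmx-vcpu-0:NexentaStor" (U)
2012-11-14T03:48:01.050Z cpu4:6089)@BlueScreen: PCPU 1 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 1).
"""

# Matches the per-PCPU summary lines: pcpu:<n> world:<id> name:"<name>"
PCPU_RE = re.compile(r'pcpu:(\d+) world:(\d+) name:"([^"]*)"')
# Matches the panic message that names the locked-up PCPU.
LOCKED_RE = re.compile(r'@BlueScreen: PCPU (\d+) locked up')


def worlds_by_pcpu(text):
    """Map PCPU number -> (world id, world name) from the summary lines."""
    return {
        int(m.group(1)): (int(m.group(2)), m.group(3))
        for m in PCPU_RE.finditer(text)
    }


def locked_pcpu(text):
    """Return the PCPU number named in the @BlueScreen line, or None."""
    m = LOCKED_RE.search(text)
    return int(m.group(1)) if m else None


worlds = worlds_by_pcpu(LOG)
cpu = locked_pcpu(LOG)
print(cpu, worlds.get(cpu))
```

In this dump the locked-up PCPU 1 was running a Windows 2012 VM, while the panic itself was raised from the NexentaStor world on PCPU 4, so the dump alone does not point at a single VM.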
And a second:
And a last one:
2012-11-13T22:02:27.029Z cpu4:10256)Backtrace for current CPU #4, worldID=10256, ebp=0x41222041bce8
2012-11-13T22:02:27.030Z cpu4:10256)0x41222041bce8:[0x4180266dbd6f]Power_HaltPCPU@vmkernel#nover+0x276 stack: 0x41222041bde8, 0x4122204
2012-11-13T22:02:27.030Z cpu4:10256)0x41222041bde8:[0x4180265bd114]CpuSchedIdleLoopInt@vmkernel#nover+0x873 stack: 0x2e318b84c, 0x41000
2012-11-13T22:02:27.031Z cpu4:10256)0x41222041bed8:[0x4180265c4bef]CpuSchedDispatch@vmkernel#nover+0xabe stack: 0x0, 0x0, 0x0, 0x41001f
2012-11-13T22:02:27.031Z cpu4:10256)0x41222041bf48:[0x4180265c5f1f]CpuSchedWait@vmkernel#nover+0x242 stack: 0x410000000000, 0x418000000
2012-11-13T22:02:27.031Z cpu4:10256)0x41222041bf98:[0x4180265c6140]CpuSched_VcpuHalt@vmkernel#nover+0x14b stack: 0x4180265b84f9, 0x4100
2012-11-13T22:02:27.031Z cpu4:10256)0x41222041bfe8:[0x4180264f3698]VMMVMKCall_Call@vmkernel#nover+0x1af stack: 0x0, 0x0, 0x0, 0x0, 0x0
2012-11-13T22:02:27.032Z cpu4:10256)0x4180264c77d8:[0xfffffffffc223a12]__vmk_versionInfo_str@esx#nover+0xd547b4b1 stack: 0x0, 0x0, 0x0,
2012-11-13T22:02:27.032Z cpu4:10256)VMware ESXi 5.1.0 [Releasebuild-799733 x86_64]
Machine Check Exception: Unknown Intel encoding: 0xbe2000000005110a. PCPU4 in world 10256:vmm0:Nexenta
System has encountered a Hardware Error - Please contact the hardware vendor
2012-11-13T22:02:27.032Z cpu4:10256)cr0=0x8005003b cr2=0xfda09fe0 cr3=0x2d6fa9000 cr4=0x2668
2012-11-13T22:02:27.032Z cpu4:10256)frame=0x41222041bbf0 ip=0x4180266dbd6f err=18 rflags=0x202
2012-11-13T22:02:27.033Z cpu4:10256)rax=0x0 rbx=0x418041000180 rcx=0x0
2012-11-13T22:02:27.033Z cpu4:10256)rdx=0x0 rbp=0x41222041bce8 rsi=0x3124
2012-11-13T22:02:27.033Z cpu4:10256)rdi=0xdf27132c04f6 r8=0x0 r9=0x0
2012-11-13T22:02:27.033Z cpu4:10256)r10=0x1 r11=0x1 r12=0x0
2012-11-13T22:02:27.033Z cpu4:10256)r13=0x0 r14=0x3 r15=0x417fe68f0720
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:0 world:4104 name:"idle0" (IS)
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:1 world:7282 name:"vmm0:Windows_2012_-_SQL" (V)
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:2 world:7161 name:"vmast.7160" ()
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:3 world:7271 name:"vmm1:Windows_2012_-_App" (V)
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:4 world:10256 name:"vmm0:NexentaStor" (V)
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:5 world:10284 name:"vmast.10283" ()
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:6 world:7148 name:"vmm0:pfSense" (V)
2012-11-13T22:02:27.033Z cpu4:10256)pcpu:7 world:7269 name:"vmm0:Windows_2012_-_App" (V)
2012-11-13T22:02:27.033Z cpu4:10256)@BlueScreen: Machine Check Exception: Unknown Intel encoding: 0xbe2000000005110a. PCPU4 in world 10256:vmm0:Nexenta
System has encountered a Hardware Error - Please contact the hardware vendor
2012-11-13T22:02:27.033Z cpu4:10256)Code start: 0x418026400000 VMK uptime: 0:20:02:46.086
2012-11-13T22:02:27.033Z cpu4:10256)0x41222041bce8:[0x4180266dbd6f]Power_HaltPCPU@vmkernel#nover+0x276 stack: 0x41222041bde8
2012-11-13T22:02:27.034Z cpu4:10256)0x41222041bde8:[0x4180265bd114]CpuSchedIdleLoopInt@vmkernel#nover+0x873 stack: 0x2e318b84c
2012-11-13T22:02:27.034Z cpu4:10256)0x41222041bed8:[0x4180265c4bef]CpuSchedDispatch@vmkernel#nover+0xabe stack: 0x0
2012-11-13T22:02:27.034Z cpu4:10256)0x41222041bf48:[0x4180265c5f1f]CpuSchedWait@vmkernel#nover+0x242 stack: 0x410000000000
2012-11-13T22:02:27.035Z cpu4:10256)0x41222041bf98:[0x4180265c6140]CpuSched_VcpuHalt@vmkernel#nover+0x14b stack: 0x4180265b84f9
2012-11-13T22:02:27.035Z cpu4:10256)0x41222041bfe8:[0x4180264f3698]VMMVMKCall_Call@vmkernel#nover+0x1af stack: 0x0
2012-11-13T22:02:27.035Z cpu4:10256)0x4180264c77d8:[0xfffffffffc223a12]__vmk_versionInfo_str@esx#nover+0xd547b4b1 stack: 0x0
2012-11-13T22:02:27.037Z cpu4:10256)base fs=0x0 gs=0x418041000000 Kgs=0x0
2012-11-13T22:02:27.037Z cpu4:10256)MC:PCPU4 B:8 S:0xbe2000000005110a M:0x9082000086 A:0x1e7296640 5
MC:PCPU4: 1 hardware errors seen since boot (0 corrected by hardware)
2012-11-13T22:02:27.037Z cpu4:10256)PCPU fam:6 model:58 step:9 type:2 name:Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
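Since ESXi reports the machine check as "Unknown Intel encoding", here is a rough Python sketch that splits the status value into its fields. It assumes the `S:` value on the `MC:PCPU4 B:8` line is the raw IA32_MCi_STATUS register for that bank, and it uses the bit layout and the cache-hierarchy compound code (`000F 0001 RRRR TTLL`) as I understand them from the machine-check chapter of the Intel SDM. Please check against the manual before trusting it; the function names are my own.

```python
# Status value copied from the "MC:PCPU4 B:8 S:..." line above.
STATUS = 0xBE2000000005110A


def bit(value, n):
    """Return True if bit n of value is set."""
    return bool((value >> n) & 1)


def decode_mci_status(status):
    """Split an IA32_MCi_STATUS value into its architectural flag bits and codes."""
    return {
        "VAL": bit(status, 63),    # register contains valid error info
        "OVER": bit(status, 62),   # an earlier error was overwritten
        "UC": bit(status, 61),     # error was not corrected by hardware
        "EN": bit(status, 60),     # reporting was enabled for this error
        "MISCV": bit(status, 59),  # the MISC register is valid
        "ADDRV": bit(status, 58),  # the ADDR register is valid
        "PCC": bit(status, 57),    # processor context may be corrupted
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }


def decode_cache_hierarchy(mca_code):
    """Decode the compound form 000F 0001 RRRR TTLL; return None if it does not match."""
    if mca_code & 0xEF00 != 0x0100:  # ignore the F bit (bit 12)
        return None
    tt = ["instruction", "data", "generic", "reserved"][(mca_code >> 2) & 0x3]
    ll = ["L0", "L1", "L2", "generic"][mca_code & 0x3]
    requests = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
                5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
    rrrr = requests.get((mca_code >> 4) & 0xF, "unknown")
    return {"type": tt, "level": ll, "request": rrrr}


fields = decode_mci_status(STATUS)
cache = decode_cache_hierarchy(fields["mca_code"])
print(fields)
print(cache)
```

If this decoding is right, the value describes an uncorrected error (UC and PCC set) with a cache-hierarchy code at the L2 level. That would fit the "Hardware Error" message, but it is only my reading of the bits, not a confirmed diagnosis.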
Thanks!