Hi,
I have a problem on a 2-node ESX 4.0 cluster with HA enabled.
ESX patches are up to date.
My hosts disconnect from vCenter periodically (3 times in the last month). They are marked as "not responding", then as "disconnected" if I try to reconnect them manually. I have to reboot the host.
The service console responds to ping, but SSH and console logins fail: I can type my login at the prompt, but no password prompt follows. Connecting to a host directly with the vSphere Client times out, and so does accessing the HP Management Homepage.
Some VMs still respond to ping, but others don't. All VMs show as "disconnected" in vCenter. They are not restarted on Host1.
I use HP BL460c G6 blades for all nodes, in a c7000 chassis, with the Smart Array P410i controller.
vCenter events:
24/10/2009 04:59:32 : HA agent on Host2.test.com in cluster TEST has an error : HA agent on the host failed
24/10/2009 04:59:39 : Host Host2.test.com in TEST is not responding
All VMs hosted by Host2 are marked as disconnected.
Then, 14 hours later, the other host goes down (edit: these two consecutive events don't seem to be linked, maybe I'm just unlucky!):
24/10/2009 19:06:51 : Host Host1.test.com in TEST is not responding
24/10/2009 19:06:51 : Unable to contact a primary HA agent in cluster TEST
The only VM hosted by Host1 was still responding to ping, but when I tried to take remote control of it, the VM went down.
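For anyone comparing their own logs, here is a minimal sketch of how the gap between the two "not responding" events could be computed from the lines pasted above. It assumes the timestamps are in day/month/year format and that each line uses " : " as the separator; the helper names parse_event and not_responding_gap are made up for this illustration and are not part of any VMware tool.

```python
from datetime import datetime

# Event lines copied from the vCenter events quoted above.
# Assumption: dates are day/month/year and " : " separates timestamp and message.
EVENTS = [
    "24/10/2009 04:59:32 : HA agent on Host2.test.com in cluster TEST has an error : HA agent on the host failed",
    "24/10/2009 04:59:39 : Host Host2.test.com in TEST is not responding",
    "24/10/2009 19:06:51 : Host Host1.test.com in TEST is not responding",
    "24/10/2009 19:06:51 : Unable to contact a primary HA agent in cluster TEST",
]


def parse_event(line):
    """Split one event line into (timestamp, message). Hypothetical helper."""
    stamp, _, message = line.partition(" : ")
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), message


def not_responding_gap(lines):
    """Return the time between the first and last 'not responding' events."""
    times = [
        when
        for when, message in map(parse_event, lines)
        if message.endswith("is not responding")
    ]
    return max(times) - min(times)


gap = not_responding_gap(EVENTS)
print(gap)  # 14:07:12
```

With the events above, the two hosts stopped responding 14 hours, 7 minutes and 12 seconds apart.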
*Update: possible answer*
After calls with HP support and VMware support, it seems that a controller issue is causing the service console to crash.
There is a firmware update (v2.50) for the Smart Array P212, P410, P410i, P411, and P712m controllers. Its release notes list:
Fix for a potential controller hang condition (lockup error code 0XBC) seen during heavy I/O.
Fix for a server operating system hang condition encountered during IO stress tests, such as SQLIO.
Fix for a potential controller hang condition (lockup error code 0XAB) seen when controller is configured in Zero Memory Mode (no cache module installed).
Fix for a potential controller hang condition that may be seen when a 2nd SATA drive fails in a RAID 6 configuration.
Thanks for your help.
Message was edited by: ROM13