If a VIP is failed over from one host in a subnet to another by manually running “ifconfig down” on one host and “ifconfig up” on the other, the VIP can become unreachable from outside the subnet. The problem should clear on its own, but that can take up to several hours, depending on the platform.
If you see similar symptoms, run this on the machine the VIP failed over to:
[root@linuxtest ~]# arping -U your_vip_ip
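It sends a Gratuitous ARP, aka Unsolicited ARP: an ARP packet nobody asked for (with iputils arping, -U sends it as an ARP request, while -A sends the same announcement as an ARP reply). It broadcasts the new MAC for the VIP and so updates the ARP caches on machines in the same subnet, including the router that routes traffic into the subnet from the outside world.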
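To make the order of steps concrete, here is a minimal sketch of what the receiving host could run during a manual failover. It assumes the iputils version of arping and uses "ip addr add" from iproute2 in place of "ifconfig up". The address, prefix length, interface name, the DRY_RUN switch and the helper names run and take_over_vip are all placeholders made up for illustration, not something from a real setup.

```shell
#!/bin/sh
# Sketch: bring a VIP up on this host and announce it with gratuitous ARP.
# Assumes iputils arping and iproute2. All values below are hypothetical.

VIP="${VIP:-192.0.2.10}"     # placeholder VIP (documentation address range)
PREFIX="${PREFIX:-24}"       # placeholder prefix length
IFACE="${IFACE:-eth0}"       # placeholder interface name
DRY_RUN="${DRY_RUN:-1}"      # 1 = only print the commands, 0 = execute them

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

take_over_vip() {
    # Add the VIP to the local interface (needs root when executed).
    run ip addr add "$VIP/$PREFIX" dev "$IFACE"
    # Send 3 unsolicited ARP packets so that neighbours and the router
    # replace the old MAC in their ARP caches.
    run arping -U -c 3 -I "$IFACE" "$VIP"
}

take_over_vip
```

With the default DRY_RUN=1 the script only prints the two commands. Setting DRY_RUN=0 and running it as root would actually execute them, so check the placeholder values first.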
If this is not done, the ARP caches will still be updated eventually, but it takes time:
– for some Cisco devices – 4 hours,
– for Brocade – default is 10 minutes,
– for Linux – default is 60 seconds,
– for Windows – up to 10 minutes.
So refreshing the ARP caches this way lets new connections work right away instead of waiting for the ARP timeout.
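To check whether a Linux machine in the same subnet has already picked up the new MAC, you can look at its neighbour table. The sketch below assumes iproute2 is installed. The helper names mac_from_neigh and cached_mac and the sample address are made up for illustration.

```shell
#!/bin/sh
# Sketch: find out which MAC a Linux neighbour currently caches for the VIP.
# Assumes iproute2. Function names and the sample line are hypothetical.

# Extract the MAC (the field after "lladdr") from `ip neigh` output lines.
mac_from_neigh() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "lladdr") print $(i + 1) }'
}

# Print the MAC cached for the given IP; prints nothing if no entry exists.
cached_mac() {
    ip neigh show to "$1" | mac_from_neigh
}

# Demonstration on a canned line in the format `ip neigh show` prints.
echo "192.0.2.10 dev eth0 lladdr 00:11:22:33:44:55 STALE" | mac_from_neigh
```

Running "cached_mac your_vip_ip" before and after the arping should show the MAC change from the old host's to the new host's. If it still shows the old one, that machine has not seen the announcement.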
It also helps clients that had an existing TCP connection before the failover: once their packets reach the new host, which knows nothing about that connection, they immediately get a reset packet back. The clients therefore get errors immediately rather than waiting for the TCP timeout (up to 15 minutes, depending on platform).
By the way, the ability to send Unsolicited ARP is the only reason Oracle RAC needs VIPs.