Well, today was a real trial in the IT world for me. I went on site to a customer to put in a replacement Cisco ASA 5520. This customer has two ASA 5520s in Active/Standby mode. A few weeks back, the primary ASA failed, leaving the secondary to run on its own. So today, I went to put the replacement back in. Here is what I did on the primary replacement unit:
I first put the IP info in for the internal LAN side, and brought the interface up. Then, I typed in these commands to get this sync'ed with the running secondary ASA:
failover
failover lan unit primary
failover lan interface failover GigabitEthernet0/3
failover replication http
failover link failover GigabitEthernet0/3
failover interface ip failover 172.20.20.1 255.255.255.252 standby 172.20.20.2
The new primary ASA sync'ed up and the config looked good.
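To double-check the sync from the CLI, something like the following can be run on either unit. This is only a sketch: these are standard ASA show commands, but they were not part of the steps above, and the exact output varies by software version.

```
! Overall failover status: which unit is active, which is standby,
! and the state of the monitored interfaces
show failover

! The failover lines as stored in the running config
show running-config failover
```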
However, I did run into a problem. When the primary came online and took the standby role, requests from the outside could no longer reach the web servers, mail servers, etc. Everything inbound from the outside stopped working, although I could still get to the Internet from the inside. Powering down the primary and leaving only the secondary in place gave the same result. It was only after I power-cycled the secondary (with the primary still turned off) that things started running again. This was odd to me. I repeated the test several times, and each time the outcome and the fix were the same.
Well, after a call to Cisco TAC, I found that when the primary ASA came online, the secondary ASA started using the primary's MAC address as its Layer 2 identity toward everything it talked to. When I powered down the primary unit, the secondary kept using the new primary ASA's MAC address. When I power-cycled the secondary ASA, it came back up using the OLD primary ASA's MAC address, which is the address the upstream devices still had cached. That's when it started working again.
So, with that said, what I needed to do was boot up both ASAs and then reboot the ISP's router in front of them. This cleared the router's ARP table, and everything started working again with the new primary and the current secondary in place. Failover worked as well.
LESSON: I learned that the production secondary had been using the MAC address of the OLD primary ASA (the one that died), and that once the replacement joined the pair, it started using the replacement ASA's MAC address instead. It all makes sense really, as I've seen similar issues in the past. But I guess it really threw me off this time: since the secondary would work again by itself after a reboot, I didn't think about it falling back to the OLD primary ASA's MAC address. Well, lesson learned.
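One commonly documented way to avoid this kind of MAC flip on an Active/Standby pair is to assign virtual MAC addresses, so the active unit presents the same MAC no matter which physical box is primary. The sketch below is not what was done on this job. The interface names (GigabitEthernet0/0 as outside, GigabitEthernet0/1 as inside) and all MAC values are made-up placeholders, so check the syntax against the command reference for your ASA version.

```
! Entered on the active unit in global configuration mode.
! Syntax: failover mac address <physical interface> <active MAC> <standby MAC>
! Interface names and MAC values below are hypothetical placeholders.
failover mac address GigabitEthernet0/0 0200.0000.0a01 0200.0000.0a02
failover mac address GigabitEthernet0/1 0200.0000.0b01 0200.0000.0b02
```

With virtual MACs in place, the upstream router should keep a valid ARP entry across a failover or a hardware swap, though its ARP cache may still need to be cleared once when the virtual addresses are first applied.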
Also, one other note on redundant ASAs: the LAST ASA that you run the "failover" command on will be the one that pulls its config from the other ASA. The ASA you type "failover" on FIRST will be the one that pushes its config to the ASA you type "failover" on LAST. Just a note.
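To see ahead of time which way the config will flow, the role of the unit already in production can be checked before typing "failover" on the replacement, since the unit that is already active is the one that sends its config. A sketch, using standard ASA commands that were not part of the original steps:

```
! On the unit already in production: confirm it reports itself as Active
show failover state

! On the active unit: push its running config to the standby again if needed
write standby
```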