Tuesday, September 13, 2011

How To Fix SSH On A Cisco ASA When It Quits Working

I went to a customer site today, and one of their complaints was that they could no longer SSH into their ASAs.  I told them I would fix this.  I specifically remember SSH'ing into these ASAs before without issue, so I'm not sure what happened.  I do know that a few days back we replaced the primary ASA (in this Active/Standby configuration).  I suspect that the ASA lost its key when we put the new primary ASA in and it sync'ed with the secondary ASA (it pulled FROM the secondary, to be clear).  As I began to troubleshoot this, here is what fixed the issue:
"crypto key generate rsa modulus 2048"
I typed this in and it resolved my SSH issue.  I am not 100% sure that the reason I gave above is correct, but I do suspect it, since I made that change only a few days ago.  That story is in my September 10 post, "Cisco ASA: Active/Standby ASA issue today. Lesson learned."
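If you want to confirm that a missing key really is the cause before regenerating it, a few show commands can help.  This is only a sketch, assuming a normal privileged EXEC session on the active ASA; the output varies by software version and is not shown here:

```
! Check whether an RSA key pair exists (empty output suggests it is missing)
show crypto key mypubkey rsa
! Check which hosts and interfaces are allowed to SSH in
show running-config ssh
! Regenerate the key from global configuration mode, then save
configure terminal
crypto key generate rsa modulus 2048
end
write memory
```

The "write memory" at the end is there so the new key survives a reload.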


Saturday, September 10, 2011

Cisco ASA: Active/Standby ASA issue today. Lesson learned.

Well, today was a real trial in the IT world for me.  I went on site to a customer to put in a replacement Cisco ASA 5520.  This customer has two ASA 5520s in Active/Standby mode.  A few weeks back, the primary ASA failed, leaving the secondary to run on its own.  So today, I went to put the replacement back in.  Here is what I did on the primary replacement unit:
I first put the IP info in for the internal LAN side, and brought the interface up.  Then, I typed in these commands to get this sync'ed with the running secondary ASA:
failover lan unit primary
failover lan interface failover GigabitEthernet0/3
failover replication http
failover link failover GigabitEthernet0/3
failover interface ip failover standby

The new primary ASA sync'ed up and the config looked good.
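To double check the sync rather than just eyeballing the config, something like the following can be run on either unit.  This is a sketch of standard ASA show commands, not output captured from this customer:

```
! Overall failover status, including which unit is active and which is standby
show failover
! Short summary of the state of this unit and its peer
show failover state
! The failover commands as they ended up in the running config
show running-config failover
```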
However, I did face a problem.  When the primary came online and took the standby role, requests from the outside could no longer reach the web servers, mail servers, etc.  Everything from the outside stopped working, but I could still get on the Internet from the inside.  If I powered down the primary unit, leaving only the secondary unit in place, I got the same result.  It was only after I power cycled the secondary (with the primary still turned off) that I was able to get things running again.  This was odd to me.  I repeated the test several times, and each time those same steps were what got things working again.
Well, after a call to Cisco TAC, I found that when the primary ASA came online, the secondary ASA would use the primary's MAC address as its Layer 2 info for whomever it needed to talk to.  When I powered down the primary unit, the secondary still had the new primary ASA's MAC address.  When I power cycled the secondary ASA, it came back up using the OLD primary ASA's MAC address.  That's when it started working again.
So, with that said, I found that what I needed to do was boot up both ASAs and then reboot the ISP's router in front of the ASAs.  This cleared the router's ARP table and everything started working again, with the new primary and current secondary in place.  Failover worked as well.
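One way to avoid this kind of MAC address change in the future is to give the failover pair virtual MAC addresses, so the active unit presents the same MAC no matter which physical box is the primary.  I did not do this on this job.  A minimal sketch, assuming Active/Standby mode, assuming GigabitEthernet0/0 is the outside interface, and using made-up MAC values (none of these come from the customer's config):

```
! Syntax: failover mac address <physical interface> <active MAC> <standby MAC>
! Interface name and both MAC values below are hypothetical
failover mac address GigabitEthernet0/0 0200.0000.0a01 0200.0000.0a02
```

Applying this changes the MAC address once more, so the upstream router's ARP entry would probably still need to be refreshed one last time.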
LESSON:  I learned that the production secondary was using the MAC address of the OLD primary ASA (the one that died).  And when the replacement came online, it started using the replacement ASA's MAC address.  It all makes sense really, as I've seen similar issues in the past.  But I guess it really threw me off this time, since when the secondary rebooted and worked again by itself, I didn't really think about it trying to use the OLD primary ASA's MAC address.  Well, lesson learned.
Also, one other note that I want to express on redundant ASAs:  The LAST ASA that you run the "failover" command on will be the one that pulls its config from the other ASA.  The ASA you type "failover" on FIRST will be the one that pushes its config to the ASA you typed "failover" on LAST.  Just a note.
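Put as a sequence, the order described in that note would look roughly like this.  It is a sketch of the ordering only, assuming the rest of the failover settings are already in place on both units:

```
! 1. On the ASA whose config you want to KEEP, enable failover first
failover
! 2. On the replacement ASA, enable failover last so it pulls the config
failover
```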