[Openswan Users] Routing doesn't route with Openswan U2.6.09 and Fedora 9

Greg Scott GregScott at InfraSupportEtc.com
Tue Dec 2 01:03:00 EST 2008


I first reported a variation on this problem back in May and June 2008.
The best solution at the time was to update to a newer version of IPSEC.
I finally did it and now I have another problem.  Or it may be the same
problem with different symptoms.  
 
It's a complex situation with multiple Linux IPSEC firewalls and
failover routing and I've spent most of the day trying to identify the
problem and explain it concisely.  So I will purposely leave out lots
and lots of details to hopefully describe the forest without individual
trees getting in the way.
 
Here's the situation:
 
At the HQ site, there are two Linux IPSEC firewall systems, named fw1
and fw2.  They are a failover pair, and are identical in every relevant
way, except the following:
 
fw1 is running Fedora 6 with IPSEC version U2.4.5 - this was the system
Paul W encouraged me to update.  
fw2 is running Fedora 9 with IPSEC version U2.6.09.  This is the new
version.  
 
All IPSEC and iptables and other configuration scripts are identical,
right down to the hostkey files.  I even came up with a way to handle
MAC addresses.  When a primary system goes offline, I have a bunch of
scripting on the backup that makes it assume the identity of the
primary.  This all works and it's running successfully at other sites.  
 
The tunnels to all the branch sites work, except to the branch named
Janesville.  To be more precise, Janesville works as expected with the
older HQ version on fw1, but does not work with the newer version on
fw2.  Janesville is more complex than the other branch sites.
Janesville has 2 LANs and its own set of failover routing back to the HQ
site with another pair of Linux HA firewalls.  
 
The names of Janesville's LANs are PNT and Cheetah.  The PNT LAN
normally connects to the HQ site via an ATT PNT circuit.  The PNT LAN is
not allowed direct access to the Internet.  The Cheetah LAN runs a web
based application named Cheetah and normally connects to the HQ site via
an IPSEC tunnel.  The Janesville firewall keeps an "always-on" tunnel
between the Cheetah LAN and HQ, and a "sometimes-on" tunnel between the
PNT LAN and HQ.  When the firewalls in Janesville and HQ detect that the
PNT circuit is down, they bring up an IPSEC tunnel to the PNT LAN.  When
the PNT circuit comes back online, the Janesville and HQ firewalls take
down the IPSEC tunnel and resume routing via the PNT circuit.  
 
I have a bunch of scripting at both the HQ and Janesville sites to
manage all this.  I've spent many late nites putting all this together
and it all works, for the most part.  I documented an issue I ran across
back in May/June 2008, and maybe the problem I see today is a variation
on that with different symptoms.  
 
This is where it gets weird.  
 
All day today, I could not ping anything in the Janesville PNT LAN from
the HQ site to save my life.  This is important because some
applications at the HQ site need to print on a printer at the Janesville
branch site.  
 
The way this works is, everything in both the HQ and Janesville PNT and
Cheetah branch sites uses my firewall as its default gateway, and then I
make the routing decision whether to forward it through the PNT circuit
or IPSEC tunnel, depending on the state of the PNT circuit.  If the PNT
is online, I just send it back to the PNT router.  If the PNT is
offline, I send it through the IPSEC tunnel.
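
The decision described above can be sketched roughly as follows.  This is
only an illustration, not the actual scripts (which are not included in
this message).  The function name plan_route and the dry-run style are
hypothetical, the addresses and connection name are the ones used in the
manual tests later in this message, and removing the static route when
the circuit goes down is an assumption.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each circuit state instead of
# running them.  plan_route is a hypothetical name, not the real script.

PNT_ROUTER=192.168.3.97     # HQ-side PNT router (from the manual test)
PNT_NET=172.20.2.0/24       # Janesville PNT LAN (from the manual test)
CONN=JanesvillePNT          # Openswan connection name

plan_route() {
    # $1 is the PNT circuit state: "up" or "down"
    case "$1" in
        up)
            # Circuit is back: drop the tunnel, route via the PNT router.
            echo "ipsec auto --down $CONN"
            echo "ipsec auto --delete $CONN"
            echo "ip route replace $PNT_NET via $PNT_ROUTER"
            ;;
        down)
            # Circuit is gone: remove the static route (assumption),
            # then bring up the tunnel.
            echo "ip route del $PNT_NET via $PNT_ROUTER"
            echo "ipsec auto --add $CONN"
            echo "ipsec auto --up $CONN"
            ;;
        *)
            echo "unknown circuit state: $1" >&2
            return 1
            ;;
    esac
}
```

Calling plan_route up or plan_route down only prints the plan, so it can
be reviewed before anything is run as root.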
 
Sounds simple enough....
 
After struggling all day running on the new version at the HQ site, I
cooked up a theory - what if the new version decided not to relinquish
its IPSEC routes to Janesville PNT when taking down the IPSEC tunnel?
Routing to other IPSEC and PNT sites makes sense, only Janesville PNT
behaves badly.  What's different?  Only Janesville PNT has IPSEC tunnels
that go up and down.  
 
Maybe my scripts are messed up.  So I tried a bunch of experiments by
hand.  From both Janesville and HQ sites, I did 
 
ipsec auto --add JanesvillePNT 
ipsec auto --up JanesvillePNT
 
and
 
ipsec auto --down JanesvillePNT
ipsec auto --delete JanesvillePNT
 
and by hand from the HQ site - ip route add 172.20.2.0/24 via
192.168.3.97 to force anything to the Janesville LAN to use the PNT
router.
 
None of this made a difference.  Nothing I tried on the new HQ firewall
would force it to let go of the IPSEC route to Janesville.  
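
One way to check from a script whether the hand-added route is actually
present in the main routing table is to filter the output of
ip route show.  A minimal sketch follows; route_uses_gateway is a
hypothetical helper name.  It only inspects routing table text, so it
says nothing about any IPSEC policy the kernel may hold elsewhere.

```shell
#!/bin/sh
# route_uses_gateway PREFIX GATEWAY
# Reads "ip route show" style text on stdin and succeeds only if some
# line routes PREFIX via GATEWAY.  Hypothetical helper for illustration.
route_uses_gateway() {
    awk -v p="$1" -v g="$2" '
        $1 == p && $2 == "via" && $3 == g { found = 1 }
        END { exit !found }
    '
}

# Example use on a live system (not run here):
#   ip route show | route_uses_gateway 172.20.2.0/24 192.168.3.97 \
#       && echo "route via PNT router is present"
```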
 
I found a way to test all this from a Windows server at the HQ site.
From the Windows server, I can ping the printer in Janesville:
 
    ping 172.20.2.50
 
and then I can do tcpdump traces.  The printer is online and answers
pings sent from the Janesville firewall.  
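
For reference, the kind of trace meant here can be started with
something like the following.  The interface names eth0 and eth1 are
placeholders (the real interface names are not given in this message),
and capture_cmd is a hypothetical helper that only prints the tcpdump
command line to run on the firewall.

```shell
#!/bin/sh
# capture_cmd IFACE HOST
# Prints a tcpdump command that shows ICMP traffic to or from HOST on
# IFACE, without name resolution.  Hypothetical helper for illustration.
capture_cmd() {
    echo "tcpdump -n -i $1 icmp and host $2"
}

# One capture per firewall interface (interface names are placeholders).
capture_cmd eth0 172.20.2.50
capture_cmd eth1 172.20.2.50
```

Running one capture on the inside interface and one on the interface
facing the PNT router shows whether the echo requests leave the firewall
and on which side.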
 
When my HQ Windows server explicitly sets up a route for this printer
through the PNT router, the pings reply.  When I route it through my HQ
Linux/IPSEC firewall, the pings do not reply, even when I explicitly
take down the JanesvillePNT tunnel and add a route by hand on the HQ
IPSEC firewall to that printer via the PNT router.  I can change the
gateway on the Janesville printer to/from the Janesville PNT router and
this makes no difference.  
 
This turned into a novel.  The whole problem boils down to this:  No
matter what I do, my new HQ firewall will not forward packets for the
Janesville PNT LAN via the Janesville PNT router.  
 
So then I did a failover to fw1, the old version of IPSEC, and it all
behaves as expected.
 
After all this, now some questions:
 
1 - How do I look at IPSEC routes these days?  ipsec look on fw1, the
old version, shows me what I expect.  ipsec look on the new version
tells me nothing.  Is there a nifty command or tool that can show me how
packets will really route?
 
2 - How do I manipulate IPSEC routes?  In particular, if I take down an
IPSEC tunnel, how do I make sure the routes from that tunnel are really
gone?  How do I look at routes before and after?  
 
3 - I just downloaded a newer version of IPSEC from Sourceforge.  I think
it's maybe a week old.  Is what I described above a known problem with
the version that came bundled with fc9 and should I remove the fc9 IPSEC
and install this new Sourceforge version?
 
thanks
 
- Greg Scott
 