[Openswan Users] Unstable behavior with 2 tunnels connecting the same sites

Greg Scott GregScott at Infrasupport.com
Wed Jul 14 11:41:49 EDT 2010


Something unhealthy is going on with configs that have multiple tunnels
connecting the same sites.  

 

I know I always end up posting the weird problems and here's another
one.  I have a customer with 2 sites, called HQ and colo.  HQ is on the
right, colo on the left.  The HQ site has 2 LANs - 175.10/16 and
175.7/16.  The colo site also has 2 LANs, 175.8/16 and 175.9/16.  I
supernetted the colo side of the tunnels to 175.8/15 as a
troubleshooting step and also as a way to reduce the number of tunnels
from 4 to 2.  I know this setup is a little off the beaten path, but
this customer needs multiple tunnels connecting the same sites to make
their storage replication work properly.
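The supernet arithmetic above can be sanity-checked with a short Python sketch using the standard ipaddress module.  The variable names are illustrative only, and the shorthand prefixes from the post (175.8/16 and so on) are written out in full:

```python
import ipaddress

# LANs as described in the post, with the shorthand prefixes expanded.
colo_lans = [ipaddress.ip_network("175.8.0.0/16"),
             ipaddress.ip_network("175.9.0.0/16")]
hq_lans = [ipaddress.ip_network("175.10.0.0/16"),
           ipaddress.ip_network("175.7.0.0/16")]

# collapse_addresses merges adjacent, aligned networks into the
# smallest set of networks that covers exactly the same addresses.
colo_supernet = list(ipaddress.collapse_addresses(colo_lans))
print(colo_supernet)  # [IPv4Network('175.8.0.0/15')]

# The two HQ LANs are not adjacent, so they stay separate.
hq_collapsed = list(ipaddress.collapse_addresses(hq_lans))
print(hq_collapsed)

# One tunnel per (colo subnet, HQ subnet) pair.
tunnels_before = len(colo_lans) * len(hq_lans)
tunnels_after = len(colo_supernet) * len(hq_collapsed)
print(tunnels_before, "->", tunnels_after)  # 4 -> 2
```

Because 175.7/16 and 175.10/16 cannot be merged the same way, two tunnels is the floor for this layout unless a much wider HQ prefix is acceptable.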

 

Every once in a while, one or more of these tunnels decides to go out to
lunch.  This usually happens when there's a telecom interruption.  IPsec
is supposed to hook both sites back up after the telecom comes back
online, but this doesn't always work here.  The only fix I've found is
to manually restart ipsec on one side or the other.

 

So this morning, I had an outage and sure enough, half the tunnels
weren't answering.  So I tried service ipsec restart at the HQ site and
. . . it hung.  Yup, it hung.  I would love to prove that it hung, but
the PuTTY output has already scrolled off the top of the window.  But I
was there, I saw it with my own eyes, it hung.  Trust me, it hung.

 

Fwiw, I've seen this hang before with multiple tunnels.  It's been going
on for years in one form or another and I've posted references to it in
this forum.  

 

After pressing Ctrl-C, I tried sh -v /etc/rc.d/init.d/ipsec restart -
this worked properly and now everyone can see everyone else.
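For anyone who wants to script the same recovery, here is a minimal Python sketch of the sequence described above: try the normal restart, give up on it if it hangs, then fall back to running the init script directly.  The two commands are the ones from this post; the function name, the 60-second timeout, and the injectable run parameter are assumptions made for illustration, not anything Openswan provides.

```python
import subprocess

# The two restart commands mentioned in the post, in the order tried.
RESTART_COMMANDS = [
    ["service", "ipsec", "restart"],
    ["sh", "-v", "/etc/rc.d/init.d/ipsec", "restart"],
]


def restart_ipsec(commands, timeout=60, run=subprocess.run):
    """Try each command in order; return the first one that exits 0
    within `timeout` seconds, or None if every attempt hangs or fails.

    `timeout` is an arbitrary illustrative value, and `run` is
    injectable so the logic can be exercised without touching ipsec.
    """
    for cmd in commands:
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # if it has not finished within `timeout` seconds.
            result = run(cmd, timeout=timeout)
        except subprocess.TimeoutExpired:
            continue  # this attempt hung; move on to the next command
        if result.returncode == 0:
            return cmd
    return None
```

Calling restart_ipsec(RESTART_COMMANDS) as root would attempt the real restart.  Note that a timeout only kills the direct child process, so anything the init script itself spawned may be left behind after a hang.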

 

When the problem is happening, I see lots of messages coming into
/var/log/secure.  Here is a sample:

 

[root@stylmark-fw1 ipsec.d]# more greg2.txt

Jul 14 08:00:00 localhost pluto[23465]: initiate on demand from 175.10.0.1:8 to 175.9.1.35:0 proto=1 state: fos_start because: acquire

Jul 14 08:00:00 localhost pluto[23465]: "colo-hqmain" #212624: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#212615 msgid:d98e9c48 proposal=defaults pfsgroup=OAKLEY_GROUP_MODP2048}

Jul 14 08:00:00 localhost pluto[23465]: "colo-hqmain" #212624: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2

Jul 14 08:00:00 localhost pluto[23465]: "colo-hqmain" #212624: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0x86d6e4be <0x68544fa4 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 08:00:03 localhost pluto[23465]: initiate on demand from 175.10.0.1:8 to 175.8.1.101:0 proto=1 state: fos_start because: acquire

Jul 14 08:00:03 localhost pluto[23465]: "colo-hqmain" #212625: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#212615 msgid:d31345ba proposal=defaults pfsgroup=OAKLEY_GROUP_MODP2048}

Jul 14 08:00:03 localhost pluto[23465]: "colo-hqmain" #212625: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2

Jul 14 08:00:03 localhost pluto[23465]: "colo-hqmain" #212625: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0xb35a6fc7 <0xac2386d4 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 08:00:09 localhost pluto[23465]: initiate on demand from 175.10.0.35:8 to 175.9.1.35:0 proto=1 state: fos_start because: acquire

Jul 14 08:00:09 localhost pluto[23465]: "colo-hqmain" #212626: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#212615 msgid:b005937f proposal=defaults pfsgroup=OAKLEY_GROUP_MODP2048}

Jul 14 08:00:09 localhost pluto[23465]: "colo-hqmain" #212626: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2

Jul 14 08:00:09 localhost pluto[23465]: "colo-hqmain" #212626: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0x364780e1 <0x58c0d1e0 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 08:00:28 localhost pluto[23465]: "colo-hqmain" #212615: received Delete SA(0x7c705344) payload: deleting IPSEC State #209204

Jul 14 08:00:28 localhost pluto[23465]: "colo-hqmain" #212615: received and ignored informational message

Jul 14 08:00:31 localhost pluto[23465]: "colo-hqmain" #212615: ignoring Delete SA payload: PROTO_IPSEC_ESP SA(0x8b2781f0) not found (maybe expired)

Jul 14 08:00:31 localhost pluto[23465]: "colo-hqmain" #212615: received and ignored informational message

Jul 14 08:00:34 localhost pluto[23465]: "colo-hqmain" #212615: received Delete SA(0xf8a2d8fb) payload: deleting IPSEC State #209206

Jul 14 08:00:34 localhost pluto[23465]: "colo-hqmain" #212615: received and ignored informational message

Jul 14 08:00:37 localhost pluto[23465]: "colo-hqmain" #212615: received Delete SA(0x14029340) payload: deleting IPSEC State #209207

Jul 14 08:00:37 localhost pluto[23465]: "colo-hqmain" #212615: received and ignored informational message

Jul 14 08:00:38 localhost pluto[23465]: initiate on demand from 175.10.0.1:8 to 175.9.1.1:0 proto=1 state: fos_start because: acquire

Jul 14 08:00:38 localhost pluto[23465]: "colo-hqmain" #212627: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#212615 msgid:3e7351ff proposal=defaults pfsgroup=OAKLEY_GROUP_MODP2048}

Jul 14 08:00:39 localhost pluto[23465]: "colo-hqmain" #212627: transition from state STATE_QUICK_I1 to state STATE_QUICK_I2

Jul 14 08:00:39 localhost pluto[23465]: "colo-hqmain" #212627: STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode {ESP=>0x19427699 <0x043fa1d4 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 08:00:41 localhost pluto[23465]: initiate on demand from 175.10.0.1:8 to 175.8.1.254:0 proto=1 state: fos_start because: acquire

--More--(0%)
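To get a feel for how often pluto is negotiating new SAs during a bad spell, a log excerpt like the one above can be tallied with a few lines of Python.  This is only a sketch: it assumes each log entry sits on a single line (terminal wrapping undone), the regular expression is fitted to the lines shown in this post rather than to any documented pluto log format, and the category names are made up for illustration.

```python
import re
from collections import Counter

# Fitted to the pluto lines shown in this post; the quoted connection
# name and "#state:" prefix are optional because "initiate on demand"
# lines do not carry them.
LINE_RE = re.compile(
    r'pluto\[\d+\]: (?:"(?P<conn>[^"]+)" #(?P<state>\d+): )?(?P<msg>.*)'
)


def summarize(lines):
    """Count on-demand initiations, established IPsec SAs, and
    Delete SA notifications in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a pluto line
        msg = match.group("msg")
        if msg.startswith("initiate on demand"):
            counts["on_demand"] += 1
        elif "IPsec SA established" in msg:
            counts["sa_established"] += 1
        elif "Delete SA" in msg:
            counts["delete_sa"] += 1
    return counts


SAMPLE = [
    'Jul 14 08:00:00 localhost pluto[23465]: initiate on demand from '
    '175.10.0.1:8 to 175.9.1.35:0 proto=1 state: fos_start because: acquire',
    'Jul 14 08:00:00 localhost pluto[23465]: "colo-hqmain" #212624: '
    'STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode '
    '{ESP=>0x86d6e4be <0x68544fa4 xfrm=AES_128-HMAC_SHA1 NATOA=none '
    'NATD=none DPD=none}',
    'Jul 14 08:00:28 localhost pluto[23465]: "colo-hqmain" #212615: '
    'received Delete SA(0x7c705344) payload: deleting IPSEC State #209204',
    'Jul 14 08:00:28 localhost pluto[23465]: "colo-hqmain" #212615: '
    'received and ignored informational message',
]

print(dict(summarize(SAMPLE)))
```

Feeding it the whole of /var/log/secure with something like summarize(open("/var/log/secure")) would show whether each on-demand acquire is producing a brand new Quick Mode SA, which is what the excerpt suggests.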

 

And here is a sample from /var/log/secure when things are working
properly - I dummied up the public IP addresses:

 

[root@stylmark-fw1 ipsec.d]# tail /var/log/secure -f

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #1: the peer
proposed: 175.10.0.0/16:0/0 -> 175.8.0.0/15:0/0

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31: responding to
Quick Mode proposal {msgid:6a8b3c68}

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31:     us:
175.10.0.0/16===1.2.42.85<1.2.42.85>[@hqmain,+S=C]---1.2.42.86

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31:   them:
3.4.64.174---3.4.64.169<3.4.64.169>[@colo,+S=C]===175.8.0.0/15

Jul 14 10:33:34 localhost pluto[3993]: | NAT-OA: 0 tunnel: 0

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31: keeping
refhim=4294901761 during rekey

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2

Jul 14 10:33:34 localhost pluto[3993]: "colo-hqmain" #31:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0x8fd8f76b
<0xaf448d32 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #1: the peer
proposed: 175.10.0.0/16:0/0 -> 175.8.0.0/15:0/0

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32: responding to
Quick Mode proposal {msgid:bcf600d5}

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32:     us:
175.10.0.0/16===1.2.42.85<1.2.42.85>[@hqmain,+S=C]---1.2.42.86

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32:   them:
3.4.64.174---3.4.64.169<3.4.64.169>[@colo,+S=C]===175.8.0.0/15

Jul 14 10:35:34 localhost pluto[3993]: | NAT-OA: 0 tunnel: 0

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32: keeping
refhim=4294901761 during rekey

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2

Jul 14 10:35:34 localhost pluto[3993]: "colo-hqmain" #32:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0x0c7f39bf
<0x2e95afcb xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #1: the peer
proposed: 175.10.0.0/16:0/0 -> 175.8.0.0/15:0/0

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33: responding to
Quick Mode proposal {msgid:521ce545}

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33:     us:
175.10.0.0/16===1.2.42.85<1.2.42.85>[@hqmain,+S=C]---1.2.42.86

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33:   them:
3.4.64.174---3.4.64.169<3.4.64.169>[@colo,+S=C]===175.8.0.0/15

Jul 14 10:35:47 localhost pluto[3993]: | NAT-OA: 0 tunnel: 0

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33: keeping
refhim=4294901761 during rekey

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2

Jul 14 10:35:47 localhost pluto[3993]: "colo-hqmain" #33:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0xc0136c4b
<0x32bf7674 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=none}

 

 

 

This is the version of Openswan running at the HQ site:

 

[root@stylmark-fw1 firewall-scripts]# ipsec version

Linux Openswan U2.6.25/K2.6.32.12-115.fc12.i686.PAE (netkey)

See `ipsec --copyright' for copyright information.

[root@stylmark-fw1 firewall-scripts]#

 

And this is the version running at the colo site:

 

[root@colo-fw firewall-scripts]# ipsec version

Linux Openswan U2.6.25/K2.6.17.2fw21 (netkey)

See `ipsec --copyright' for copyright information.

[root@colo-fw firewall-scripts]#

 

As you can see, the colo site has an older kernel (2.6.17 versus 2.6.32
at HQ) but runs the same recent Openswan release, 2.6.25.
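When comparing more than a couple of gateways, the userland and kernel versions can be pulled out of the ipsec version banner mechanically.  A minimal Python sketch follows; the regular expression is fitted to the two banners shown in this post and is an assumption about the format, not a documented interface.

```python
import re

# Matches banners of the form shown above:
#   Linux Openswan U<userland>/K<kernel> (<stack>)
VERSION_RE = re.compile(
    r"Linux Openswan U(?P<userland>[^/\s]+)/K(?P<kernel>\S+) \((?P<stack>\w+)\)"
)


def parse_ipsec_version(text):
    """Return a dict with 'userland', 'kernel' and 'stack' keys taken
    from the output of `ipsec version`."""
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("unrecognized ipsec version banner")
    return match.groupdict()


hq = parse_ipsec_version(
    "Linux Openswan U2.6.25/K2.6.32.12-115.fc12.i686.PAE (netkey)")
colo = parse_ipsec_version(
    "Linux Openswan U2.6.25/K2.6.17.2fw21 (netkey)")

print("same userland:", hq["userland"] == colo["userland"])
print("same kernel:  ", hq["kernel"] == colo["kernel"])
```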

 

Here are the conn definitions.  First, colo-ipsec.conf at the colo site.
Note the commented out additional tunnels at the bottom.  I supernetted
the conn definitions at the colo site as a troubleshooting step:

 

conn colo-hqmain

        type=tunnel

        #

        # Left security gateway, subnet behind it, next hop toward left.

        #

        also=colo

        #

        # Right security gateway, subnet behind it, next hop toward left.

        #

        also=hqmain

        auto=start

 

conn colo-hqmirror

        type=tunnel

        #

        # Left security gateway, subnet behind it, next hop toward left.

        #

        also=colo

        #

        # Right security gateway, subnet behind it, next hop toward left.

        #

        also=hqmirror

        auto=start

 

##conn colomirror-hqmirror

##      type=tunnel

##      #

##      # Left security gateway, subnet behind it, next hop toward left.

##      #

##      also=colomirror

##      #

##      # Right security gateway, subnet behind it, next hop toward left.

##      #

##      also=hqmirror

##      auto=start

 

##conn colomirror-hqmain

##      type=tunnel

##      #

##      # Left security gateway, subnet behind it, next hop toward left.

##      #

##      also=colomirror

##      #

##      # Right security gateway, subnet behind it, next hop toward left.

##      #

##      also=hqmain

##      auto=start

 

include /etc/ipsec.d/sites.conf

 

 

 

Next are the conn definitions from hq-ipsec.conf:

 

conn colo-hqmain

        type=tunnel

        #

        # Left security gateway, subnet behind it, next hop toward left.

        #

        also=colo

        #

        # Right security gateway, subnet behind it, next hop toward left.

        #

        also=hqmain

        auto=start

 

conn colo-hqmirror

        type=tunnel

        #

        # Left security gateway, subnet behind it, next hop toward left.

        #

        also=colo

        #

        # Right security gateway, subnet behind it, next hop toward left.

        #

        also=hqmirror

        auto=start

 

##conn colomirror-hqmirror

##      type=tunnel

##      #

##      # Left security gateway, subnet behind it, next hop toward left.

##      #

##      also=colomirror

##      #

##      # Right security gateway, subnet behind it, next hop toward left.

##      #

##      also=hqmirror

##      auto=start

 

##conn colomirror-hqmain

##      type=tunnel

##      #

##      # Left security gateway, subnet behind it, next hop toward left.

##      #

##      also=colomirror

##      #

##      # Right security gateway, subnet behind it, next hop toward left.

##      #

##      also=hqmain

##      auto=start

 

include /etc/ipsec.d/sites.conf

 

And finally, sites.conf, which contains the IP addresses of all sites.
Each site has an identical copy of sites.conf.  Public IP addresses are
dummied up and RSA keys truncated.

 

conn hqmain

        right=1.2.42.85

        rightsubnet=175.10.0.0/16

        rightnexthop=1.2.42.86

        rightsourceip=175.10.0.1

        rightid=@hqmain

        ###     rightupdown=/etc/ipsec.d/hq-updown.sh

        # rsakey AQOkh1tMU

        rightrsasigkey=0sAQOkh...

 

conn hqmirror

        right=1.2.42.85

        rightsubnet=175.7.0.0/16

        rightnexthop=1.2.42.86

        rightsourceip=175.7.0.1

        rightid=@hqmirror

        # rsakey AQOkh1tMU

        rightrsasigkey=0sAQOkh1t...

 

conn colo

        left=3.4.64.169

        leftsubnet=175.8.0.0/15

        leftnexthop=3.4.64.174

        leftsourceip=175.9.1.1

        leftid=@colo

        # RSA 2192 bits   colo-fw   Wed Nov 29 19:08:25 2006

        leftrsasigkey=0sAQOSwRcj...

 

##conn colomirror

##      left=3.4.64.169

##      leftsubnet=175.8.0.0/16

##      leftnexthop=3.4.64.174

##      ##leftid=@colomirror

##      # RSA 2192 bits   colo-fw   Wed Nov 29 19:08:25 2006

##      leftrsasigkey=0sAQOSwR...
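To make it easier to see which subnet pairs the active conns actually produce once the also= references are followed, here is a small Python sketch.  It is a deliberately simplified reader, not Openswan's real parser: it ignores include lines, treats any line starting with # as a comment, and lets later settings override earlier ones, which may not match Openswan's own precedence rules.  The sample text is trimmed down from the configs above.

```python
def parse_conns(text):
    """Collect key=value settings per 'conn' section, in file order."""
    conns, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment (including ## conns)
        if line.startswith("conn "):
            current = conns.setdefault(line.split(None, 1)[1], [])
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current.append((key.strip(), value.strip()))
    return conns


def resolve(conns, name):
    """Flatten a conn by pulling in every section named by also=."""
    settings = {}
    for key, value in conns[name]:
        if key == "also":
            settings.update(resolve(conns, value))
        else:
            settings[key] = value
    return settings


SAMPLE = """
conn colo-hqmain
        type=tunnel
        also=colo
        also=hqmain
        auto=start

conn colo-hqmirror
        type=tunnel
        also=colo
        also=hqmirror
        auto=start

conn hqmain
        rightsubnet=175.10.0.0/16

conn hqmirror
        rightsubnet=175.7.0.0/16

conn colo
        leftsubnet=175.8.0.0/15
"""

conns = parse_conns(SAMPLE)
for name in ("colo-hqmain", "colo-hqmirror"):
    flat = resolve(conns, name)
    print(name, flat["leftsubnet"], "<->", flat["rightsubnet"])
# colo-hqmain 175.8.0.0/15 <-> 175.10.0.0/16
# colo-hqmirror 175.8.0.0/15 <-> 175.7.0.0/16
```

Both active tunnels share the same left subnet, left gateway and right gateway, and differ only in the right subnet and right ID.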

 

 
