[Openswan Users] Pluto Segmentation fault in 2.4.7

Pompon pompon2 at gmail.com
Fri Mar 2 10:43:47 EST 2007


Hi list,

We have hit a bug in Openswan 2.4.7, running on Debian with kernel 2.6.8 and
the KLIPS stack, that takes down every tunnel on our production platform!

After investigation, the trigger is a connection we have just declared, which
looks perfectly normal compared to our hundred or so others. When this
connection comes up, pluto dies with a "Segmentation fault". It is then
restarted and crashes again every 30 seconds, in an endless loop, which makes
all the other tunnels unusable!


Here are the pluto logs:

Mar  2 13:08:44 ipsec-platform pluto[21291]: "faulty_tunnel" #76: responding
to Main Mode
Mar  2 13:08:44 ipsec-platform pluto[21291]: "faulty_tunnel" #76:
OAKLEY_DES_CBC is not supported.  Attribute OAKLEY_ENCRYPTION_ALGORITHM
Mar  2 13:08:44 ipsec-platform pluto[21291]: "faulty_tunnel" #76:
OAKLEY_DES_CBC is not supported.  Attribute OAKLEY_ENCRYPTION_ALGORITHM
Mar  2 13:08:44 ipsec-platform pluto[21291]: "faulty_tunnel" #76: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Mar  2 13:08:44 ipsec-platform pluto[21291]: "faulty_tunnel" #76:
STATE_MAIN_R1: sent MR1, expecting MI2
Mar  2 13:08:47 ipsec-platform pluto[21291]: "faulty_tunnel" #76: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Mar  2 13:08:47 ipsec-platform pluto[21291]: "faulty_tunnel" #76:
STATE_MAIN_R2: sent MR2, expecting MI3
Mar  2 13:08:53 ipsec-platform pluto[21291]: "faulty_tunnel" #76: Main mode
peer ID is ID_IPV4_ADDR: '222.126.XXX.YYY'
Mar  2 13:08:53 ipsec-platform pluto[21291]: "faulty_tunnel" #76: I did not
send a certificate because I do not have one.
Mar  2 13:08:53 ipsec-platform pluto[21291]: "faulty_tunnel" #76: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Mar  2 13:08:53 ipsec-platform pluto[21291]: "faulty_tunnel" #76:
STATE_MAIN_R3: sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=oakley_3des_cbc_192 prf=oakley_sha group=modp1024}
Mar  2 13:08:53 ipsec-platform pluto[21291]: "faulty_tunnel" #76: Dead Peer
Detection (RFC 3706): not enabled because peer did not advertise it
Mar  2 13:08:57 ipsec-platform pluto[21291]: "faulty_tunnel" #77: we require
PFS but Quick I1 SA specifies no GROUP_DESCRIPTION
Mar  2 13:08:57 ipsec-platform pluto[21291]: "faulty_tunnel" #77: sending
encrypted notification NO_PROPOSAL_CHOSEN to 222.126.XXX.YYY:500
Mar  2 13:09:08 ipsec-platform ipsec__plutorun: Restarting Pluto
subsystem...


And in the syslog:

Mar  2 13:08:57 ipsec-platform ipsec__plutorun:
/usr/local/lib/ipsec/_plutorun: line 1: 21291 Segmentation fault
/usr/local/libexec/ipsec/pluto
--nofork --secretsfile /etc/ipsec.secrets --ipsecdir /etc/ipsec.d
--debug-none --use-auto --uniqueids --nat_traversal --nhelpers 0
Mar  2 13:08:57 ipsec-platform ipsec__plutorun: !pluto failure!:  exited
with error status 139 (signal 11)
Mar  2 13:08:57 ipsec-platform ipsec__plutorun: restarting IPsec after
pause...
Mar  2 13:09:07 ipsec-platform kernel: IPSEC EVENT: KLIPS device ipsec0 shut
down.
Mar  2 13:09:07 ipsec-platform ipsec_setup: ...Openswan IPsec stopped
Mar  2 13:09:07 ipsec-platform ipsec_setup: Stopping Openswan IPsec...
Mar  2 13:09:07 ipsec-platform ipsec_setup: Removing orphaned
/var/run/pluto/pluto.pid:
Mar  2 13:09:08 ipsec-platform ipsec_setup: KLIPS debug `none'
Mar  2 13:09:08 ipsec-platform ipsec_setup: KLIPS ipsec0 on eth0:1
89.234.ZZZ.AAA/255.255.255.240 broadcast 89.234.ZZZ.EEE
Mar  2 13:09:08 ipsec-platform ipsec_setup: ...Openswan IPsec started
Mar  2 13:09:08 ipsec-platform ipsec_setup: Restarting Openswan IPsec
2.4.7...
Mar  2 13:09:08 ipsec-platform ipsec__plutorun: ipsec_auto: fatal error in
"common": connection has no "right" parameter specified
Mar  2 13:09:17 ipsec-platform kernel: ipsec0: no IPv6 routers present
Mar  2 13:09:36 ipsec-platform ipsec__plutorun:
/usr/local/lib/ipsec/_plutorun: line 1: 10404 Segmentation fault
/usr/local/libexec/ipsec/pluto
--nofork --secretsfile /etc/ipsec.secrets --ipsecdir /etc/ipsec.d
--debug-none --use-auto --uniqueids --nat_traversal --nhelpers 0
Mar  2 13:09:36 ipsec-platform ipsec__plutorun: !pluto failure!:  exited
with error status 139 (signal 11)
Mar  2 13:09:36 ipsec-platform ipsec__plutorun: restarting IPsec after
pause...

[And so on...]
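
For reference, the "error status 139 (signal 11)" reported by _plutorun
follows the usual shell convention of 128 plus the signal number, and signal
11 is SIGSEGV. A minimal sketch of that arithmetic (the helper name
describe_exit_status is only illustrative and is not part of Openswan):

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status (128 + N means killed by signal N)."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with code {status}"


# Status reported by _plutorun in the syslog above.
print(describe_exit_status(139))
```

On Linux this prints "killed by signal 11 (SIGSEGV)", which matches the
"Segmentation fault" message in the syslog.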

The line "we require PFS but Quick I1 SA specifies no GROUP_DESCRIPTION"
appears in the auth log every time the crash occurs, and only at the time of
the crash, so I suspect the two are linked.


Here is the config file for this connection:

version 2.0
config setup
        nat_traversal=yes
        nhelpers=0
        interfaces="ipsec0=eth0:1"
        klipsdebug=none
        plutodebug=none

conn faulty_tunnel
        also=common
        ike=3des-sha-modp1024
        ikelifetime=28800s
        esp=3des-sha1
        keylife=3600s
        pfs=yes
        right=222.126.XXX.XXX
        rightsubnet=222.126.XXX.YYY/32

conn common
        authby=secret
        type=tunnel
        auto=add
        dpddelay=30
        dpdtimeout=10
        # dpdaction=restart
        dpdaction=hold
        compress=yes
        left=89.234.ZZZ.AAA
        leftsubnet=89.234.ZZZ.BBB/32
        leftsourceip=89.234.ZZZ.CCC
        leftnexthop=89.234.ZZZ.DDD


Our partner also uses Openswan, and the bug seems to need both ends of the
tunnel to be set up before it appears, so we will also try to get their exact
configuration file.


In the dev list archives I found a mail from Matthias Haas, dated Jan 15, who
seems to have hit the same bug and suggests this patch:

--- openswan-2.4.7/programs/pluto/demux.c Fri Jan 12 11:35:21 2007
+++ openswan-2.4.7-debug/programs/pluto/demux.c Fri Jan 12 12:16:07 2007
@@ -2411,7 +2411,7 @@
      * we can only be in calculating state if state is ignore,
      * or suspended.
      */
-    passert(result == STF_IGNORE || result == STF_SUSPEND || st->st_calculating==FALSE);
+    passert(result == STF_INLINE || result == STF_IGNORE || result == STF_SUSPEND || st->st_calculating==FALSE);

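To make the effect of this one-line change explicit, the two assertion
conditions can be compared over every combination of inputs. This is only a
model of the boolean expressions in the diff: the Stf enum is an illustrative
stand-in listing a few status names, not pluto's real stf_status definition,
and nothing here says whether the patch is the right fix for the crash.

```python
from enum import Enum, auto


class Stf(Enum):
    """Illustrative stand-in for a few of pluto's status codes."""
    IGNORE = auto()
    INLINE = auto()
    SUSPEND = auto()
    OK = auto()


def holds_before(result, calculating):
    """Condition asserted by the original line."""
    return result in (Stf.IGNORE, Stf.SUSPEND) or not calculating


def holds_after(result, calculating):
    """Condition asserted by the patched line (STF_INLINE also accepted)."""
    return result in (Stf.INLINE, Stf.IGNORE, Stf.SUSPEND) or not calculating


# Collect the input combinations on which the two conditions disagree.
differences = [
    (result, calculating)
    for result in Stf
    for calculating in (False, True)
    if holds_before(result, calculating) != holds_after(result, calculating)
]
print(differences)
```

The only combination that differs is a result of STF_INLINE while
st_calculating is true: the original condition fails there and the patched
one holds.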

The good news is that the problem is perfectly reproducible. However, to avoid
impacting production, we will not reproduce it until we have moved the tunnel
with our partner to a staging platform where we can run more tests and
investigations. That could happen next week.

Do you, developers, need any specific further tests from us, and do you think
we could apply this patch on our production server?


Thanks,
Jean-Michel Bonnefond.