[Openswan Users] Reports of IPSEC tunnels going offline with 2.6.28

Paul Wouters paul at xelerance.com
Thu Sep 9 09:47:23 EDT 2010


On Thu, 9 Sep 2010, Greg Scott wrote:

> Looking at /var/log/secure – it looks like a problem started on Sep 8 around 27 minutes after midnight.  The systems at both sites
> are running ntp, so both clocks are synchronized.  At 00:26:44, both HQ and MN report an SA established.   At 00:27:14, HQ starts
> reporting “Informational Exchange message must be encrypted” and MN reports a malformed payload.   This repeats a few times until
> 00:37:54, when MN reports “too many (17) malformed payloads. Deleting state”.  After that, several messages on both sides with
> initiate on demand errors.  Around 09:26:00, the folks at the MN site rebooted the MN firewall.  After that, the tunnel came back as
> normal.   This customer has reported this happening at least twice in the past few days.  Again, this is with 2.6.28 at both sites. 

This sounds NETKEY-related.

> Could this be a hardware problem?

I doubt it.

> Sep  8 00:26:04 MN-fw1 pluto[2192]: "mn-hq" #441: ERROR: netlink response for Add SA esp.c7d28cc6 at 3.4.177.201 included errno 3: No such process

This is netkey not being happy.

> Sep  8 00:27:14 MN-fw1 pluto[2192]: "mn-hq" #440: byte 2 of ISAKMP Hash Payload must be zero, but is not
> Sep  8 00:27:14 MN-fw1 pluto[2192]: "mn-hq" #440: malformed payload in packet
> Sep  8 00:27:14 MN-fw1 pluto[2192]: | payload malformed after IV
> Sep  8 00:27:14 MN-fw1 pluto[2192]: |   c5 15 27 ac  8e a4 30 b7  af 3f 05 d3  57 e3 9b 0a
> Sep  8 00:27:14 MN-fw1 pluto[2192]: "mn-hq" #440: sending notification PAYLOAD_MALFORMED to 1.2.252.178:500
> Sep  8 00:27:54 MN-fw1 pluto[2192]: "mn-hq" #440: byte 2 of ISAKMP Hash Payload must be zero, but is not
> Sep  8 00:27:54 MN-fw1 pluto[2192]: "mn-hq" #440: malformed payload in packet
> Sep  8 00:27:54 MN-fw1 pluto[2192]: | payload malformed after IV
> Sep  8 00:27:54 MN-fw1 pluto[2192]: |   c5 15 27 ac  8e a4 30 b7  af 3f 05 d3  57 e3 9b 0a
> Sep  8 00:27:54 MN-fw1 pluto[2192]: "mn-hq" #440: sending notification PAYLOAD_MALFORMED to 1.2.252.178:500

I'm not sure how this would start happening all of a sudden, though.

> Sep  8 08:26:44 MN-fw1 pluto[2192]: "mn-hq" #442: IPsec SA expired (LATEST!)
> Sep  8 08:26:44 MN-fw1 pluto[2192]: "mn-hq" #442: down-client output: Running mn-updown
> Sep  8 08:26:44 MN-fw1 pluto[2192]: initiate on demand from 192.168.0.219:49178 to 10.0.0.1:3389 proto=6 state: fos_start because: acquire

This is the netkey bug, unfortunately. At this point you have a bogus %pass route, and you have to restart
openswan. It seems to trigger when the SA expires: the first encrypted packet then hits the %pass
route, causing the bogus acquire.

Paul
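
For anyone who wants to check their own logs for the sequence discussed in this thread, here is a
minimal sketch in Python. It is not an Openswan tool and is not part of the reply above. It assumes
the pluto syslog format shown in the quoted excerpts, with each log entry on a single line. The names
CONN_LINE, ACQUIRE_LINE and scan are made up for this illustration. It counts the netlink "Add SA"
errors and malformed-payload messages, and flags an "initiate on demand" entry that follows an
"IPsec SA expired" entry.

```python
import re
from collections import defaultdict

# Assumed format (taken from the excerpts in this thread), for example:
#   Sep  8 08:26:44 MN-fw1 pluto[2192]: "mn-hq" #442: IPsec SA expired (LATEST!)
PREFIX = r'^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+pluto\[\d+\]:\s+'

# Hypothetical helper patterns, not anything Openswan itself provides.
CONN_LINE = re.compile(PREFIX + r'"(?P<conn>[^"]+)"\s+#(?P<state>\d+):\s+(?P<msg>.*)$')
ACQUIRE_LINE = re.compile(PREFIX + r'initiate on demand from ')


def scan(lines):
    """Return (counts, suspects) for an iterable of pluto log lines.

    counts   -- how often each message type of interest was seen
    suspects -- (connection, expiry timestamp, acquire timestamp) tuples
                where an on-demand acquire followed an SA expiry
    """
    counts = defaultdict(int)
    suspects = []
    last_expired = None  # (timestamp, connection) of the latest SA expiry

    for raw in lines:
        line = raw.strip()
        m = CONN_LINE.match(line)
        if m:
            msg = m.group("msg")
            if "netlink response for Add SA" in msg and "errno" in msg:
                counts["netlink_error"] += 1
            elif msg.startswith("malformed payload"):
                counts["malformed_payload"] += 1
            elif msg.startswith("IPsec SA expired"):
                counts["sa_expired"] += 1
                last_expired = (m.group("ts"), m.group("conn"))
            continue

        a = ACQUIRE_LINE.match(line)
        if a and last_expired is not None:
            # Expiry followed by an on-demand acquire: worth a closer look.
            suspects.append((last_expired[1], last_expired[0], a.group("ts")))
            last_expired = None

    return dict(counts), suspects


# Sample input: entries from the excerpts above, one log entry per line.
SAMPLE = [
    'Sep  8 00:26:04 MN-fw1 pluto[2192]: "mn-hq" #441: ERROR: netlink response '
    'for Add SA esp.c7d28cc6 at 3.4.177.201 included errno 3: No such process',
    'Sep  8 00:27:14 MN-fw1 pluto[2192]: "mn-hq" #440: malformed payload in packet',
    'Sep  8 00:27:54 MN-fw1 pluto[2192]: "mn-hq" #440: malformed payload in packet',
    'Sep  8 08:26:44 MN-fw1 pluto[2192]: "mn-hq" #442: IPsec SA expired (LATEST!)',
    'Sep  8 08:26:44 MN-fw1 pluto[2192]: "mn-hq" #442: down-client output: '
    'Running mn-updown',
    'Sep  8 08:26:44 MN-fw1 pluto[2192]: initiate on demand from '
    '192.168.0.219:49178 to 10.0.0.1:3389 proto=6 state: fos_start because: acquire',
]

if __name__ == "__main__":
    counts, suspects = scan(SAMPLE)
    print(counts)
    for conn, expired_at, acquired_at in suspects:
        print(f"{conn}: SA expired at {expired_at}, acquire at {acquired_at}")
```

To run it against a real log, pass the open file instead of SAMPLE, for example
scan(open("/var/log/secure")). A flagged entry only means the same sequence of messages appeared;
it does not by itself confirm the bogus %pass route described above.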

