[Openswan Users] Dead Peer Detection restart causes tunnel to be established, but afterwards cannot ping from either side

Geekman the1geekman at gmail.com
Mon Oct 3 23:02:23 EDT 2011


Hi All,

In the past few days I've moved OpenSwan onto our production server,
and upon doing some more testing, I've run into an issue which I
didn't notice in testing. I don't think was there, but I have yet to
go back and actually verify this.

The net-to-net tunnels work great. The problem I have found is that
when using DPD, if our OpenSwan server detects the remote peer to be
dead for whatever reason, the tunnel is re-established, but traffic
then ceases to flow - my pings time out on either side. If, however,
the remote end point is the one who decided to initiate a new
connection, the problem does not seem to appear. This is obvious when
I reboot the remote VPN router, but sometimes even if the remote end
has established a working tunnel, if our end detects it as dead, it
will re-establish a new tunnel which does not work correctly.

Here's IPSec verify:
--------------------------

Checking your system to see if IPsec got installed and started correctly:
Version check and ipsec on-path                             	[OK]
Linux Openswan U2.6.23/K2.6.32-28-server (netkey)
Checking for IPsec support in kernel                        	[OK]
NETKEY detected, testing for disabled ICMP send_redirects   	[OK]
NETKEY detected, testing for disabled ICMP accept_redirects 	[OK]
Checking for RSA private key (/etc/ipsec.secrets)           	[OK]
Checking that pluto is running                              	[OK]
Pluto listening for IKE on udp 500                          	[OK]
Pluto listening for NAT-T on udp 4500                       	[OK]
Two or more interfaces found, checking IP forwarding        	[OK]
Checking NAT and MASQUERADEing
Checking for 'ip' command                                   	[OK]
Checking for 'iptables' command                             	[OK]
Opportunistic Encryption Support                            	[DISABLED]

* Neo is the local OpenSwan server.
* The remote end point is an RV042 VPN Router running latest firmware.
* SERVER_IP is the public IP of a server NATd behind Neo.
* REMOTE_IP is the remote end point public IP
* NEO_IP is Neo's public IP.


Here's my connection definition:
------------------------------------------

conn CustomerTunnel
                dpdaction=restart_by_peer
                dpdtimeout=10
		dpddelay=10
                type=tunnel
                pfs=yes
                rekey=yes
                authby=secret
                phase2=esp
                # Left security gateway, subnet behind it, nexthop toward right.
                left=NEO_IP
                leftsubnet=172.16.0.0/24
                leftnexthop=NEO_NEXTHOP
                leftsourceip=172.16.0.1
                # Right security gateway, subnet behind it, nexthop toward left.
                right=REMOTE_IP
                rightsubnet=192.168.1.0/24
                rightnexthop=REMOTE_NEXTHOP
                # Startup automatically
                auto=start


Output of ip xfrm state
------------------------------

src REMOTE_IP dst NEO_IP
	proto esp spi 0x5c99b428 reqid 16389 mode tunnel
	replay-window 32
	auth hmac(sha1) 0x4bc5a9351d11734cdc3f798567f5f699fcc755bc
	enc cbc(aes) 0xcc15a6f25dfa9548ee56413c2df3e8b3
	sel src 0.0.0.0/0 dst 0.0.0.0/0
src NEO_IP dst REMOTE_IP
	proto esp spi 0xd5e538ac reqid 16389 mode tunnel
	replay-window 32
	auth hmac(sha1) 0x112d626c0c41efabd52a1cbd5e064cfbb0420b9f
	enc cbc(aes) 0x663c0af56e6f3549ab84d8fbe5f7cc04
	sel src 0.0.0.0/0 dst 0.0.0.0/0


Output of ip xfrm policy
------------------------------

src 172.16.0.0/24 dst 192.168.1.0/24
	dir out priority 2344
	tmpl src NEO_IP dst REMOTE_IP
		proto esp reqid 16389 mode tunnel
src 192.168.1.0/24 dst 172.16.0.0/24
	dir fwd priority 2344
	tmpl src REMOTE_IP dst NEO_IP
		proto esp reqid 16389 mode tunnel
src 192.168.1.0/24 dst 172.16.0.0/24
	dir in priority 2344
	tmpl src REMOTE_IP dst NEO_IP
		proto esp reqid 16389 mode tunnel

Output of ipsec status:
------------------------------

000 "CustomerTunnel":
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP...REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24;
erouted; eroute owner: #23
000 "CustomerTunnel":     myip=172.16.0.1; hisip=unset;
000 "CustomerTunnel":   ike_life: 3600s; ipsec_life: 28800s;
rekey_margin: 540s; rekey_fuzz: 100%; keyingtries: 0
000 "CustomerTunnel":   policy:
PSK+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW+lKOD+rKOD; prio: 24,24;
interface: eth0;
000 "CustomerTunnel":   dpd: action:restart_by_peer; delay:10; timeout:10;
000 "CustomerTunnel":   newest ISAKMP SA: #22; newest IPsec SA: #23;
000 "CustomerTunnel":   IKE algorithm newest: AES_CBC_128-SHA1-MODP1536
000
000 #23: "CustomerTunnel":500 STATE_QUICK_I2 (sent QI2, IPsec SA
established); EVENT_SA_REPLACE in 27804s; newest IPSEC; eroute owner;
isakmp#22; idle; import:admin initiate
000 #23: "CustomerTunnel" esp.d5e538ac at REMOTE_IP esp.5c99b428 at NEO_IP
tun.0 at REMOTE_IP tun.0 at NEO_IP ref=0 refhim=4294901761
000 #22: "CustomerTunnel":500 STATE_MAIN_I4 (ISAKMP SA established);
EVENT_SA_REPLACE in 2269s; newest ISAKMP; lastdpd=8s(seq in:805
out:0); idle; import:admin initiate



Related IPTables rules:
--------------------------------

-A PREROUTING -d SERVER_IP/32 -j DNAT --to-destination 172.16.0.4
-A POSTROUTING -s 172.16.0.0/24 -d 192.168.1.0/24 -j RETURN
-A POSTROUTING -s 172.16.0.4/32 -j SNAT --to-source SERVER_IP
-A FORWARD -d 172.16.0.0/24 -m policy --dir in --pol ipsec --mode
tunnel --tunnel-dst NEO_IP --tunnel-src REMOTE_IP -j ACCEPT
-A FORWARD -s 172.16.0.0/24 -m policy --dir out --pol ipsec --mode
tunnel --tunnel-dst REMOTE_IP --tunnel-src NEO_IP -j ACCEPT

….

-A FORWARD -s 172.16.0.4/32 -j ACCEPT



Here's relevant bits of the routing table on Neo:
---------------------------------------------------------------

192.168.1.0     UPSTREAM_GW   255.255.255.0   UG        0 0          0 eth0
172.16.0.0      *               255.255.255.0   U         0 0          0 eth1.3
default         UPSTREAM_GW   0.0.0.0         UG        0 0          0 eth0



Here's a number of log entries at various stages throughout the
testing process. I could not find any major differences between a
working tunnel and a non-working tunnel, but hopefully one of you see
something:

* Started IPSec on our end. We bring the tunnel up, it is working:

Oct  4 12:58:26 neo pluto[20127]: "CustomerTunnel" #1: initiating Main Mode
Oct  4 12:58:26 neo pluto[20127]: "CustomerTunnel" #1: received Vendor
ID payload [Dead Peer Detection]
Oct  4 12:58:26 neo pluto[20127]: "CustomerTunnel" #1: transition from
state STATE_MAIN_I1 to state STATE_MAIN_I2
Oct  4 12:58:26 neo pluto[20127]: "CustomerTunnel" #1: STATE_MAIN_I2:
sent MI2, expecting MR2
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: transition from
state STATE_MAIN_I2 to state STATE_MAIN_I3
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: STATE_MAIN_I3:
sent MI3, expecting MR3
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: transition from
state STATE_MAIN_I3 to state STATE_MAIN_I4
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: STATE_MAIN_I4:
ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY cipher=aes_128
prf=oakley_sha group=modp1536}
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: Dead Peer
Detection (RFC 3706): enabled
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #2: initiating
Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#1
msgid:93da8798 proposal=defaults pfsgroup=OAKLEY_GROUP_MODP1536}
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #2: Dead Peer
Detection (RFC 3706): enabled
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #2: transition from
state STATE_QUICK_I1 to state STATE_QUICK_I2
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #2: STATE_QUICK_I2:
sent QI2, IPsec SA established tunnel mode {ESP=>0xd5e538a0
<0xd83cb9e6 xfrm=AES_128-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #1: received Delete
SA payload: deleting ISAKMP State #1
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #3: responding to Main Mode
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #3: transition from
state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #3: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #3: transition from
state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 12:58:27 neo pluto[20127]: "CustomerTunnel" #3: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #3: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #3: transition from
state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #3: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #3: Dead Peer
Detection (RFC 3706): enabled
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #3: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: responding to
Quick Mode proposal {msgid:0280a1bc}
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: keeping
refhim=4294901761 during rekey
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: transition from
state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: STATE_QUICK_R1:
sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: Dead Peer
Detection (RFC 3706): enabled
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: transition from
state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 12:58:28 neo pluto[20127]: "CustomerTunnel" #4: STATE_QUICK_R2:
IPsec SA established tunnel mode {ESP=>0xd5e538a1 <0x0df63e8b
xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}


* Disconnected/Reconnected VPN from remote side:

Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #3: received Delete
SA payload: replace IPSEC State #4 in 10 seconds
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #3: received and
ignored informational message
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #3: received Delete
SA payload: deleting ISAKMP State #3
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #5: responding to Main Mode
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #5: transition from
state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #5: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #5: transition from
state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:00:00 neo pluto[20127]: "CustomerTunnel" #5: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #5: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #5: transition from
state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #5: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #5: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #5: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6: responding to
Quick Mode proposal {msgid:34301160}
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6: keeping
refhim=4294901761 during rekey
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6: transition from
state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:00:01 neo pluto[20127]: "CustomerTunnel" #6: STATE_QUICK_R1:
sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #6: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #6: transition from
state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #6: STATE_QUICK_R2:
IPsec SA established tunnel mode {ESP=>0xd5e538a2 <0x8720abac
xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #5: received Delete
SA payload: replace IPSEC State #6 in 10 seconds
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #5: received and
ignored informational message
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #5: received Delete
SA payload: deleting ISAKMP State #5
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #7: responding to Main Mode
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #7: transition from
state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:00:04 neo pluto[20127]: "CustomerTunnel" #7: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:00:05 neo pluto[20127]: "CustomerTunnel" #7: transition from
state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:00:05 neo pluto[20127]: "CustomerTunnel" #7: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:00:05 neo pluto[20127]: "CustomerTunnel" #8: responding to Main Mode
Oct  4 13:00:05 neo pluto[20127]: "CustomerTunnel" #8: transition from
state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:00:05 neo pluto[20127]: "CustomerTunnel" #8: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #4: DPD Error:
could not find newest phase 1 state
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #2: DPD Error:
could not find newest phase 1 state
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: transition from
state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: transition from
state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:00:07 neo pluto[20127]: "CustomerTunnel" #8: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #8: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: responding to
Quick Mode proposal {msgid:327ff8bb}
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: keeping
refhim=4294901761 during rekey
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: transition from
state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: STATE_QUICK_R1:
sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: transition from
state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:00:08 neo pluto[20127]: "CustomerTunnel" #9: STATE_QUICK_R2:
IPsec SA established tunnel mode {ESP=>0xd5e538a3 <0xbed47ce3
xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}


* Tunnel came back online. Eventually got the following afterwards,
but tunnel still working:

Oct  4 13:01:15 neo pluto[20127]: "CustomerTunnel" #7: max number of
retransmissions (2) reached STATE_MAIN_R2

* Issued --down on local end:

Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel": terminating SAs
using this connection
Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel" #9: deleting state
(STATE_QUICK_R2)
Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel" #8: deleting state
(STATE_MAIN_R3)
Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel" #2: deleting state
(STATE_QUICK_I2)
Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel" #2: netlink
recvfrom() of response to our XFRM_MSG_DELPOLICY message for policy
unk255.10000 at NEO_IP was too long: 168 > 36
Oct  4 13:02:34 neo pluto[20127]: "CustomerTunnel" #2: netlink
recvfrom() of response to our XFRM_MSG_DELPOLICY message for policy
unk255.10000 at NEO_IP was too long: 168 > 36
Oct  4 13:02:50 neo pluto[20127]: "CustomerTunnel" #10: responding to Main Mode
Oct  4 13:02:50 neo pluto[20127]: "CustomerTunnel" #10: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:02:50 neo pluto[20127]: "CustomerTunnel" #10: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:02:50 neo pluto[20127]: "CustomerTunnel" #10: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:02:50 neo pluto[20127]: "CustomerTunnel" #10: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #10: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #10: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #10: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #10: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #10: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11: responding to
Quick Mode proposal {msgid:d714f713}
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:02:51 neo pluto[20127]: "CustomerTunnel" #11:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0xd5e538a4
<0x071ef0b5 xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}

* After about 10 seconds, the remote end decides to reconnect the
tunnel. It is working again, consistently.

* Disconnected/Reconnected from remote side again for another test,
this time it reconnected fine, as show here, but stopped working soon
after, shown further below:

Oct  4 13:04:25 neo pluto[20127]: "CustomerTunnel" #10: received
Delete SA(0xd5e538a4) payload: deleting IPSEC State #11
Oct  4 13:04:25 neo pluto[20127]: "CustomerTunnel" #10: received and
ignored informational message
Oct  4 13:04:25 neo pluto[20127]: "CustomerTunnel" #10: received
Delete SA payload: deleting ISAKMP State #10
Oct  4 13:04:26 neo pluto[20127]: "CustomerTunnel" #12: responding to Main Mode
Oct  4 13:04:26 neo pluto[20127]: "CustomerTunnel" #12: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:04:26 neo pluto[20127]: "CustomerTunnel" #12: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:04:26 neo pluto[20127]: "CustomerTunnel" #12: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:04:26 neo pluto[20127]: "CustomerTunnel" #12: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #12: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #12: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #12: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #12: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #12: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #13: responding to
Quick Mode proposal {msgid:f95e78a3}
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #13:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #13:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #13: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:04:27 neo pluto[20127]: "CustomerTunnel" #13:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #13: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #13: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #13:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0xd5e538a5
<0x0de55b80 xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #12: received
Delete SA(0xd5e538a5) payload: deleting IPSEC State #13
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #12: received and
ignored informational message
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #12: received
Delete SA payload: deleting ISAKMP State #12
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #14: responding to Main Mode
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #14: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:04:28 neo pluto[20127]: "CustomerTunnel" #14: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #14: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #15: responding to Main Mode
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #15: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:04:29 neo pluto[20127]: "CustomerTunnel" #15: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:04:31 neo pluto[20127]: "CustomerTunnel" #15: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:04:31 neo pluto[20127]: "CustomerTunnel" #15: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #15: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #15: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #15: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #15: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #15: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #16: responding to
Quick Mode proposal {msgid:51591e1e}
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #16:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #16:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #16: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:04:32 neo pluto[20127]: "CustomerTunnel" #16:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:04:33 neo pluto[20127]: "CustomerTunnel" #16: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:33 neo pluto[20127]: "CustomerTunnel" #16: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:04:33 neo pluto[20127]: "CustomerTunnel" #16:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0xd5e538a6
<0xb9b3a54b xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}


* After the few seconds (maybe 10), this is what appeared, and at this
time, the pings began to time out:

Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #14: DPD: No
response from peer - declaring peer dead
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #14: DPD:
Restarting all connections that share this peer
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #14: terminating
SAs using this connection
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #16: deleting state
(STATE_QUICK_R2)
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #15: deleting state
(STATE_MAIN_R3)
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #14: deleting state
(STATE_MAIN_R3)
Oct  4 13:04:49 neo pluto[20127]: "CustomerTunnel" #17: initiating Main Mode
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: received
Vendor ID payload [Dead Peer Detection]
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: transition
from state STATE_MAIN_I1 to state STATE_MAIN_I2
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: STATE_MAIN_I2:
sent MI2, expecting MR2
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: transition
from state STATE_MAIN_I2 to state STATE_MAIN_I3
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: STATE_MAIN_I3:
sent MI3, expecting MR3
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: transition
from state STATE_MAIN_I3 to state STATE_MAIN_I4
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: STATE_MAIN_I4:
ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY cipher=aes_128
prf=oakley_sha group=modp1536}
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #17: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:50 neo pluto[20127]: "CustomerTunnel" #18: initiating
Quick Mode PSK+ENCRYPT+TUNNEL+PFS+UP+IKEv2ALLOW {using isakmp#17
msgid:41b8b2c4 proposal=defaults pfsgroup=OAKLEY_GROUP_MODP1536}
Oct  4 13:04:51 neo pluto[20127]: "CustomerTunnel" #18: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:04:51 neo pluto[20127]: "CustomerTunnel" #18: transition
from state STATE_QUICK_I1 to state STATE_QUICK_I2
Oct  4 13:04:51 neo pluto[20127]: "CustomerTunnel" #18:
STATE_QUICK_I2: sent QI2, IPsec SA established tunnel mode
{ESP=>0xd5e538a8 <0x3d19de9e xfrm=AES_128-HMAC_SHA1 NATOA=none
NATD=none DPD=enabled}
Oct  4 13:04:51 neo pluto[20127]: "CustomerTunnel" #17: ignoring
Delete SA payload: PROTO_IPSEC_ESP SA(0xd5e538a6) not found (maybe
expired)
Oct  4 13:04:51 neo pluto[20127]: "CustomerTunnel" #17: received and
ignored informational message

* Not entirely sure why DPD on our end detects a dead peer even though
I could ping the other side at that stage, but I consider this a
somewhat separate issue. I wouldn't really care about it restarting
the tunnel again so soon, as long as when it did, it worked.

* If I execute --replace, the tunnel is recreated, and works OK:

Oct  4 13:12:13 neo pluto[20127]: "CustomerTunnel": deleting connection
Oct  4 13:12:13 neo pluto[20127]: "CustomerTunnel" #18: deleting state
(STATE_QUICK_I2)
Oct  4 13:12:13 neo pluto[20127]: "CustomerTunnel" #17: deleting state
(STATE_MAIN_I4)
Oct  4 13:12:13 neo pluto[20127]: "CustomerTunnel": netlink recvfrom()
of response to our XFRM_MSG_DELPOLICY message for policy
eroute_connection delete inbound was too long: 100 > 36
Oct  4 13:12:13 neo pluto[20127]: "CustomerTunnel": netlink recvfrom()
of response to our XFRM_MSG_DELPOLICY message for policy
eroute_connection delete inbound was too long: 100 > 36
Oct  4 13:12:14 neo pluto[20127]: added connection description "CustomerTunnel"
Oct  4 13:12:24 neo pluto[20127]: "CustomerTunnel" #19: responding to Main Mode
Oct  4 13:12:24 neo pluto[20127]: "CustomerTunnel" #19: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:12:24 neo pluto[20127]: "CustomerTunnel" #19: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:12:25 neo pluto[20127]: "CustomerTunnel" #19: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:12:26 neo pluto[20127]: "CustomerTunnel" #19: received
Delete SA payload: deleting ISAKMP State #19
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: responding to Main Mode
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: transition
from state STATE_MAIN_R0 to state STATE_MAIN_R1
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: STATE_MAIN_R1:
sent MR1, expecting MI2
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: transition
from state STATE_MAIN_R1 to state STATE_MAIN_R2
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: STATE_MAIN_R2:
sent MR2, expecting MI3
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: Main mode peer
ID is ID_IPV4_ADDR: 'REMOTE_IP'
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: transition
from state STATE_MAIN_R2 to state STATE_MAIN_R3
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: STATE_MAIN_R3:
sent MR3, ISAKMP SA established {auth=OAKLEY_PRESHARED_KEY
cipher=aes_128 prf=oakley_sha group=modp1536}
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #20: the peer
proposed: 172.16.0.0/24:0/0 -> 192.168.1.0/24:0/0
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #21: responding to
Quick Mode proposal {msgid:a08efe20}
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #21:     us:
172.16.0.0/24===NEO_IP<NEO_IP>[+S=C]---NEO_NEXTHOP
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #21:   them:
REMOTE_NEXTHOP---REMOTE_IP<REMOTE_IP>[+S=C]===192.168.1.0/24
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #21: transition
from state STATE_QUICK_R0 to state STATE_QUICK_R1
Oct  4 13:12:27 neo pluto[20127]: "CustomerTunnel" #21:
STATE_QUICK_R1: sent QR1, inbound IPsec SA installed, expecting QI2
Oct  4 13:12:28 neo pluto[20127]: "CustomerTunnel" #21: Dead Peer
Detection (RFC 3706): enabled
Oct  4 13:12:28 neo pluto[20127]: "CustomerTunnel" #21: transition
from state STATE_QUICK_R1 to state STATE_QUICK_R2
Oct  4 13:12:28 neo pluto[20127]: "CustomerTunnel" #21:
STATE_QUICK_R2: IPsec SA established tunnel mode {ESP=>0xd5e538ab
<0x872314e8 xfrm=AES_256-HMAC_SHA1 NATOA=none NATD=none DPD=enabled}


* As you can see, once the connection is forcibly deleted, I wait a
few seconds and it happens to time out on the remote end point, who
initiates a new connection.

* The tunnel is always established. If the tunnel is restarted for
whatever reason, and the remote peer initiates a new tunnel, it works
OK. From what I've seen, as soon as our side times out and tries to
initiate the tunnel, it does establish a tunnel, but for some reason
it doesn't work.

* Interestingly, neo established the connection when I first started
IPSec, which makes me think its somehow specific to DPD.

* My current dpd settings are:

-timeout=10
-delay=10
-action=restart_by_peer

(I was previously using timeout=40, delay=5)

* I've tried various timeouts and delays, I also was previously using
restart instead. All this seems to do is alter when / what is needed
to get a DPD restart to kick in. Once DPD issues a restart, it seems
to not work pretty consistently.


I'm at a loss at what else I can do here, or even what other
information I can provide. Any help is much appreciated.

I'm thinking about compiling latest stable from source, but I'd really
prefer to stick with the apt package, as it will make future upgrades
easier to manage.

Thanks!


More information about the Users mailing list