[Openswan Users] Losing VPN after ipsec restart

Roman Serbski mefystofel at gmail.com
Thu Sep 1 13:00:49 EDT 2011


Hi list,

I would appreciate your advice on the following issue.

We have ~90 remote offices, each establishing an IPsec tunnel to the
server at HQ (let's call it the VPN master).

The VPN master is powered by Ubuntu 8.04.2 with Openswan
U2.4.9/K2.6.24-23-server installed from packages.

Here is a typical entry for a remote site in ipsec.conf on the VPN master:

conn L2TP-PSK-noNAT-remote-site-01
       authby=secret
       pfs=no
       auto=start
       keyingtries=3
       rekey=no
       type=tunnel
       left=public.ip.of.remote.side
       leftsubnet=192.168.100.0/24
       leftsourceip=192.168.100.1
       right=public.ip.of.vpn.master
       rightsubnet=10.0.0.0/8
       rightsourceip=private.ip.of.vpn.master

Remote sites are powered by Ubuntu 9.10 with Openswan
U2.6.22/K2.6.31-22-generic with the following ipsec.conf:

conn L2TP-PSK-noNAT-remote-site-01
       authby=secret
       pfs=no
       auto=start
       type=tunnel
       left=public.ip.of.remote.side
       leftsubnet=192.168.100.0/24
       leftsourceip=192.168.100.1
       right=public.ip.of.vpn.master
       rightsubnet=10.0.0.0/8
       rightsourceip=private.ip.of.vpn.master

Everything worked fine, with the IPsec tunnels establishing without
problems; however, we recently started experiencing some issues.

When we modify ipsec.conf (to add a new entry) and restart ipsec on
the VPN master, some offices recover instantly, some take about an
hour, and some never recover.
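
For reference, the restart is only there to pick up the new entry. A
minimal sketch of loading a single conn without a restart, using
"ipsec auto --add" and "ipsec auto --up", could look like the
following. The helper name load_conn and the DRY_RUN switch are made
up for illustration, and I have not verified how this behaves on
U2.4.9:

```shell
#!/bin/sh
# Hypothetical helper (not part of Openswan): load one newly added conn
# into the running pluto instead of restarting ipsec entirely.
# Set DRY_RUN=1 to print the commands instead of executing them.
load_conn() {
    conn="$1"
    if [ -z "$conn" ]; then
        echo "usage: load_conn <conn-name>" >&2
        return 1
    fi
    # --add reads the conn from ipsec.conf, --up initiates negotiation.
    for op in --add --up; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ipsec auto $op $conn"
        else
            ipsec auto "$op" "$conn" || return 1
        fi
    done
}
```

Usage would be something like "load_conn L2TP-PSK-noNAT-remote-site-91"
on the VPN master after editing ipsec.conf (the site number is just an
example).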

If I log in to a remote site whose IPsec tunnel is down and restart
ipsec there, the tunnel is established immediately.

I tried to find a pattern, but in vain.  Some offices with high
latency and packet loss recover immediately, while offices with a
relatively good connection might never recover, and vice versa. We also
monitor all sites by pinging them, so I believe there is always some
traffic traversing the tunnel.
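
To see which tunnels are affected after a restart, a rough sketch like
the one below compares the conn names in ipsec.conf against a saved
copy of the "ipsec auto --status" output. The function name
list_down_conns is made up, and it assumes that status lines for
working tunnels contain the quoted conn name together with the text
"IPsec SA established"; that format is an assumption and may differ
between Openswan versions:

```shell
#!/bin/sh
# Hypothetical helper: print every conn defined in CONF that has no
# "IPsec SA established" line in STATUS (saved "ipsec auto --status" output).
# usage: list_down_conns CONF STATUS
list_down_conns() {
    conf="$1"
    status="$2"
    # Collect conn names, skipping the special %default section.
    awk '$1 == "conn" && $2 != "%default" { print $2 }' "$conf" |
    while read -r name; do
        # Match the quoted name so site-1 does not also match site-10.
        if ! grep -F "\"$name\"" "$status" | grep -q "IPsec SA established"; then
            echo "$name"
        fi
    done
}
```

For example: "ipsec auto --status > /tmp/status" followed by
"list_down_conns /etc/ipsec.conf /tmp/status".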

It's probably worth mentioning that we didn't experience this issue
before (with ~30 remote offices)... I guess that with 90 sites we have
hit some timeout limit.

Any hints would be greatly appreciated.

Thank you for your time.

