[Openswan Users] random l2tp/pppd failure, again
joel at gimps-r-us.com
Sun Mar 5 21:09:33 CET 2006
The thread at http://lists.virus.org/users-openswan-0602/msg00222.html
seems to describe the thing that started biting one of my clients today,
after a DR exercise. Previously, the server had been serving VPN
connections without a hitch for about two months. Hopefully I can throw
a bit more light on the subject.
As part of the DR exercise, the client's VPN server was rebuilt. It
runs Fedora Core 4, uses Openswan for IPsec and l2tpd (from Fedora
Extras) for L2TP. The server was built entirely using Kickstart, with
all packages installed from a custom build CD, and all configuration
done in %post statements. The l2tpd.conf and Openswan configuration did
not change between the time the server was initially installed and the
time it was rebuilt.
What happens is that after a few successful VPN connections from WinXP
clients (various laptops and desktops, some on a wireless network and
some on the Internet, a few of those behind NAT), l2tpd appears to lock
up. When WinXP clients then try to connect, they get connection error
678. The fault can be narrowed down to l2tpd, because running
'service l2tpd restart' brings everything back to life.
Of course, I can't get my client to restart l2tpd every time it dies.
That's just fixing the symptom, and they wouldn't be too happy because
"it used to be stable".
There are only two possible changes that I can see between the old
system and the new system.
The first change is the versions of packages installed - during the
installation, a 'yum update' is performed. This may have updated some
rather critical things that affect the VPN, such as Openswan and the
kernel. I do know that l2tpd itself has not been updated since the
server was originally built.
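
For anyone wanting to check the first possibility, one rough way is to
compare package lists from the two builds. The sketch below assumes a
list from the original build is still available (for example from a
backup) and that each list was saved with 'rpm -qa | sort'. The function
name and file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: show which packages differ between two builds.
# Each input file is assumed to hold one name-version-release per line,
# as produced by:  rpm -qa | sort > pkgs.txt

# Print entries that are in the new list but not in the old one,
# i.e. packages that were added or upgraded on the rebuilt server.
changed_packages() {
    old_list=$1
    new_list=$2
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    grep -Fxv -f "$old_list" "$new_list"
}
```

Something like 'changed_packages pkgs-old.txt pkgs-new.txt' piped through
grep for openswan, l2tpd and kernel would then show whether any of the
VPN-related packages moved between the two installs.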
The second thing that changed is that Hyper-Threading was enabled on the
new server, but was not on the old server. Both servers have a single
processor. The installation detected the "2nd" CPU provided by HT, and
installed the SMP kernel. This is what I suspect is causing the
problem. Since the symptom only seems to occur occasionally, that
points to some kind of SMP race condition.
If this occurs tomorrow during production hours, I will try booting with
a UP kernel, to see if this eliminates the problem.
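
To confirm which kind of kernel a box is actually running, something
like the following should do. It is only a sketch: it relies on Linux
SMP builds putting the token "SMP" in the version string that
'uname -v' prints, and the function name is my own invention.

```shell
#!/bin/sh
# Classify a kernel version banner (as printed by `uname -v`) as SMP or UP.
# Assumption: SMP builds carry the token "SMP" in this string, UP builds do not.
kernel_flavor() {
    case "$1" in
        *SMP*) echo "SMP" ;;
        *)     echo "UP" ;;
    esac
}

# Report the flavour of the running kernel.
kernel_flavor "$(uname -v)"

# Number of logical CPUs the kernel sees; a Hyper-Threading sibling
# shows up as an extra "processor" entry.
if [ -r /proc/cpuinfo ]; then
    grep -c '^processor' /proc/cpuinfo || true
fi
```

On a single-processor machine with HT enabled and the SMP kernel booted,
I would expect this to report SMP and two processors; after booting the
UP kernel it should report UP and one.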
To the others that are having this problem: are you running an SMP or a
UP kernel? Have you been able to resolve the problem by means other
than repeatedly bashing l2tpd on the head?
Thanks for any help that anyone can provide!