[Openswan Users] random l2tp/pppd failure, again

Joel Michael joel at gimps-r-us.com
Sun Mar 5 21:09:33 CET 2006


The thread at http://lists.virus.org/users-openswan-0602/msg00222.html 
seems to describe the thing that started biting one of my clients today, 
after a DR exercise.  Previously, the server had been serving VPN 
connections without a hitch for about two months.  Hopefully I can throw 
a bit more light on the subject.

As part of the DR exercise, the client's VPN server was rebuilt.  It 
runs Fedora Core 4, uses Openswan for IPsec and l2tpd (from Fedora 
Extras) for L2TP.  The server was built entirely using Kickstart, with 
all packages installed from a custom build CD, and all configuration 
done in %post scripts.  The l2tpd.conf and Openswan configuration did 
not change between the time the server was initially installed and the 
time it was rebuilt.

What happens is that after a few successful VPN connections from WinXP 
clients (various laptops and desktops, connecting from a wireless 
network and from the Internet, some of the latter behind NAT), l2tpd 
appears to lock up.  When WinXP clients then try to connect, they get 
error 678.  The problem can be narrowed down to l2tpd, because running 
'service l2tpd restart' brings everything back to life.  Of course, I 
can't ask my client to restart l2tpd every time it dies; that only 
treats the symptom, and they wouldn't be too happy because "it used to 
be stable".

There are only two possible changes that I can see between the old 
system and the new system.

The first change is the versions of packages installed - during the 
installation, a 'yum update' is performed.  This may have updated some 
rather critical things that may affect the VPN, such as Openswan, l2tpd 
and the kernel.  I know that l2tpd has not been updated since the server 
was originally built.
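
If a package listing from the old build is still around (from a backup, 
say), the version question can be settled by comparing it with the new 
one.  Below is a minimal sketch, assuming both listings were saved with 
"rpm -qa --queryformat '%{NAME} %{VERSION}-%{RELEASE}\n'".  The function 
names are made up for illustration and are not part of any existing tool.

```python
def parse_packages(listing):
    """Map package name -> set of 'version-release' strings.

    Expects one 'name version-release' pair per line.  A set is used
    because several kernel packages can be installed side by side.
    """
    packages = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, version = parts
        packages.setdefault(name, set()).add(version)
    return packages


def changed_packages(old_listing, new_listing, names):
    """Return {name: (old_versions, new_versions)} for packages that differ."""
    old = parse_packages(old_listing)
    new = parse_packages(new_listing)
    changes = {}
    for name in names:
        before = old.get(name, set())
        after = new.get(name, set())
        if before != after:
            changes[name] = (sorted(before), sorted(after))
    return changes
```

Calling changed_packages() with the two listings and a list of names 
such as ["openswan", "l2tpd", "kernel", "kernel-smp"] returns only the 
packages whose installed versions differ, so an unchanged l2tpd would 
simply not appear in the result.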

The second thing that changed is that Hyper-Threading was enabled on the 
new server, but was not on the old server.  Both servers have a single 
processor.  The installation detected the "2nd" CPU provided by HT, and 
installed the SMP kernel.  This is what I suspect is causing the 
problem.  Because the symptom occurs only intermittently, it looks like 
some kind of SMP race condition.

If this occurs tomorrow during production hours, I will try booting with 
a UP kernel, to see if this eliminates the problem.
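
For anyone comparing notes, here is a rough way to check which kind of 
kernel a box is actually running.  It is only a sketch: it assumes that 
the build string printed by 'uname -v' carries an "SMP" tag on SMP 
kernels, and that /proc/cpuinfo has one "processor" line per logical 
CPU, as it does on x86 Linux.  The function names are made up for 
illustration.

```python
import platform


def is_smp_build(kernel_version):
    """True if the build string (as printed by 'uname -v') has an SMP tag."""
    return "SMP" in kernel_version.split()


def logical_cpu_count(cpuinfo_text):
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key = line.split(":")[0].strip()
        if key == "processor":
            count += 1
    return count


def describe_running_kernel(cpuinfo_path="/proc/cpuinfo"):
    """Summarise the running kernel; Linux only, reads cpuinfo_path."""
    with open(cpuinfo_path) as handle:
        cpus = logical_cpu_count(handle.read())
    return {
        "release": platform.release(),  # same string as 'uname -r'
        "smp_build": is_smp_build(platform.version()),
        "logical_cpus": cpus,
    }
```

With Hyper-Threading enabled on a single physical processor, 
describe_running_kernel() should report two logical CPUs.  After 
booting a UP kernel, the SMP flag should be false and only one CPU 
should be listed.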

To the others who are having this problem: are you running an SMP or a 
UP kernel?  Have you been able to resolve the problem by means other 
than repeatedly bashing l2tpd on the head?

Thanks for any help that anyone can provide!
-- 
jpm

