[Openswan Users] hang problem with virtualized road warrior on centos

David McCullough david_mccullough at mcafee.com
Wed Mar 10 01:20:52 EST 2010


Jivin Gary Smith lays it down ...
> Hello, 

Hi Gary, sorry for the slow response, been a bit tied up.

> I'm encountering a problem with 2.6.24 where, after it has been running for some time, the CPU and memory max out on the box and networking ceases to operate (presumably because of the maxed-out CPU).  The box in question is a CentOS 5.4 server.  The box had been running for some months without fail until I installed openswan.  I used the CentOS rpm package BUT I had to recompile to remove the nss patch (and any others that were already included upstream).
> 
> The odd thing is that this is only happening on one of three boxes that have the same configuration.  The other two have static IPs; this one is an endpoint on a DHCP line.  The machine did have an issue once when renewing DHCP with openswan running, in which DHCP hung.  I had to stop openswan in order to acquire a new address.
> 
> It doesn't seem time dependent.  It's pretty random.  The machine itself has 512 MB of RAM and is only used for ipsec and iptables.
> 
> I wish I had more detail as to the problem, but the only way I can even access the box when the error occurs is a hard reboot (as the console is locked up as well).  It should also be noted that this is a virtual server running under vmware esxi 4.0 (and always has been -- for 6+ months with no problem).  The vmware server itself is idle (as I have been monitoring that).  
> 
> Has anyone else seen a high CPU/exhausted memory condition on a road warrior running for several days?  Any ideas on how to trap this condition when it does happen?

I have seen similar things, and some fixes have gone into git, but I think
most of them are already in 2.6.24.

When this happens, can you run something like:

	whack --status

If so, then pluto is still alive, just too busy doing something it doesn't
need to do.  You may be able to learn some more by enabling debug at that
point:

	whack --debug-all

If pluto is still doing something then it should now get logged.  Let us
know what that is and we'll see if we can learn something from it.
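
Since the console locks up when this happens, it may help to have something
run those two commands for you before the box becomes unreachable.  Below is
a rough sketch only, not something I have tested against this problem: the
load limit, the log path, the 60-second interval and the function names are
all made-up values for illustration, and it assumes a Linux /proc/loadavg.
On many installs whack is invoked as "ipsec whack", so adjust the command
to suit.

```shell
#!/bin/sh
# Sketch only: LOAD_LIMIT, LOG and the 60-second interval are assumed values.
LOAD_LIMIT=${LOAD_LIMIT:-4.0}
LOG=${LOG:-/var/log/pluto-watch.log}

# Succeeds when load average $1 is at or above limit $2 (awk handles decimals).
load_is_high() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 >= limit + 0) }'
}

# One pass: if the box is busy, record pluto's status and switch on debugging.
check_once() {
    read -r load rest < /proc/loadavg
    load_is_high "$load" "$LOAD_LIMIT" || return 0
    {
        date
        echo "1-minute load: $load"
        if whack --status; then
            # pluto answered, so it is alive; ask it to log what it is doing
            whack --debug-all
        else
            echo "whack --status failed; pluto may not be responding"
        fi
    } >> "$LOG" 2>&1
    sync    # flush the log to disk in case a hard reboot follows
}

# Loop only when explicitly asked, so the functions can also be sourced.
if [ "${1:-}" = "--run" ]; then
    while :; do
        check_once
        sleep 60
    done
fi
```

Start it with "--run" as root and leave it in the background.  After the
next hang and reboot, the log should show whether pluto was still answering
and, if debug got switched on, the pluto log should show what it was busy
with.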

Cheers,
Davidm

-- 
David McCullough,      david_mccullough at mcafee.com,  Ph:+61 734352815
McAfee - SnapGear      http://www.mcafee.com         http://www.uCdot.org

