[Openswan dev] [PATCH] Fix race condition between pluto start and whack

Fri Apr 8 10:38:08 EDT 2011

| From: Mattias Walstrom <lazzer at vmlinux.org>

What is the parent process of Pluto?  Why isn't it doing a "wait" so
that Pluto is reaped?

Guess: if Pluto terminates, the socket will be closed, and whack will
cease to wait.

But #1: if Pluto initialization takes a long time (in my day it
didn't, but perhaps certificate loading or something else new takes
time), whack will have to wait.

Pluto is designed as an event-driven system where events should be
processed quickly; initialization should be processed quickly too.
What has made it slow?

But #2: if the terminated Pluto isn't reaped, perhaps the socket isn't
closed.  Make sure that whatever runs Pluto also reaps it promptly.
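
To illustrate what "reaps it promptly" means, here is a minimal Python
sketch of a launcher that forks, execs a program, and then waits for
it.  It is not the code that actually starts Pluto; the function name
run_and_reap and its argv parameter are made up for the example.

```python
import os


def run_and_reap(argv):
    """Fork, exec argv in the child, and reap the child in the parent.

    Returns the child's exit status, or the negated signal number if
    the child was killed by a signal.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace this process with the requested program.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent: block until the child terminates.  Collecting the status
    # here is what keeps the dead child from lingering as a zombie.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

A real launcher would more likely reap from a SIGCHLD handler or a
supervising loop than block like this, but the essential step is the
same: something has to call wait for the child.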

It is possible to *start* communicating with Pluto before its setup is
complete.  The communication won't finish until setup does.  But,
logically, this is quite reasonable.  This property was required for
the correctness of the scripts we used in my day.
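
The property can be shown with a small Python sketch.  It assumes the
control channel is a Unix-domain stream socket and uses a stand-in
daemon that listens first and only later accepts; none of the names
below (slow_daemon, ask, demo) come from Pluto or whack.

```python
import os
import socket
import tempfile
import threading
import time


def slow_daemon(path, ready, setup_seconds):
    """Stand-in daemon: create the socket early, serve it late."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(5)
    ready.set()                # the socket exists; setup is NOT finished
    time.sleep(setup_seconds)  # pretend initialization is slow
    conn, _ = srv.accept()     # only now start processing requests
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"reply to " + request)
    srv.close()


def ask(path, request):
    """Stand-in client: connect, send, then wait for the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as cli:
        cli.connect(path)      # succeeds as soon as the daemon listens
        cli.sendall(request)   # queued by the kernel
        return cli.recv(1024)  # blocks until the daemon answers


def demo(setup_seconds=0.5):
    """Return (reply, seconds the client spent waiting)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "ctl")
        ready = threading.Event()
        t = threading.Thread(target=slow_daemon,
                             args=(path, ready, setup_seconds))
        t.start()
        ready.wait()
        started = time.monotonic()
        reply = ask(path, b"status")
        waited = time.monotonic() - started
        t.join()
        return reply, waited
```

The client's connect and send succeed immediately; only the final
read waits, and it waits for roughly as long as the daemon's setup
takes.  A script that queues a request right after launching the
daemon therefore gets its answer as soon as the daemon is ready.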

| I use a (slow) embedded system (ARM, 400 MHz), on which this patch
| has another benefit.  I use 'ipsec whack --status' to see if the
| tunnels have come up, but if you do this too soon after boot, 'ipsec
| whack' will not return until pluto has started and processed the
| message (this takes 4-5 seconds).  With this patch, 'ipsec whack'
| will return immediately if the socket does not exist.

I suggest that, if this functionality is useful, it be implemented
another way.  For example, perhaps you could look in /proc.
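
One way to look in /proc, sketched in Python.  It assumes a Linux
kernel that provides /proc/<pid>/comm and that the daemon's command
name is "pluto"; the function name pids_named and its proc_root
parameter are invented for the example.

```python
import os


def pids_named(name, proc_root="/proc"):
    """Return the sorted PIDs whose command name equals `name`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    found.append(int(entry))
        except OSError:
            continue  # no comm file, or the process exited meanwhile
    return sorted(found)
```

A boot script could test "if not pids_named('pluto')" and skip the
status query.  Note that this only says whether such a process
exists, not whether it has finished initializing.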