[Openswan dev] Patch for review

Fri Apr 23 00:03:03 EDT 2010

Jivin Tuomo Soini lays it down ...
> David McCullough wrote:
> > Jivin Tuomo Soini lays it down ...
> >> Does this break klips?
> > 
> > Almost certainly,  but I haven't tried it yet :-)
> > 
> >> It seem to work perfectly on netkey.
> >>
> >> This is not complete fix yet, Real fixing requires removing checking of
> >> initiate_ondemand_body return codes, This just hardcodes return code so
> >> it works and returns code to the state it was before all my experiments.
> > 
> > Just restating that I haven't tried this patch,  but it seems unlikely to
> > work for KLIPS.  The bug we had with klips was that if we didn't do anything
> > with the tunnel in initiate_ondemand,  then the kernel installed bare_shunt
> > would stay and cause problems.  Always removing the bare_shnut is just as
> > likely to cause problems later when we expect it to be there.  The klips ops
> > are very unforgiving,  if you call update,  it has to exist,  if you call
> > create it must not exist,  if you call delete,  it must exist.
> 
> That's the netkey problem. Bare shunt is always left behind on same
> situations and that doesn't work.

The two interface just work differently I guess.

> > Would it be possible for you to explain the bug you are trying to fix
> > when running with netkey ?  Is it for demand tunnels or not ? What is going
> > wrong and when ?
> 
> On netkey every run where initiate_ondemand_body leave eroute into xfrm
> policy causing packets never flow for the conn causing on-demand tunnel
> negotiation.
> 
> This happens for both CK_INSTANCE and CK_PERMANENT state tunnels if
> there is acquire by kernel.
> 
> Even with my fixes there usually is tunnel negotiation and immidiate
> rekey but that can't be avoided without code properly checking if there
> already is ongoing negotiation and ignoring it.
> 
> > Comments below on how the patch can easily be made klips safe if that is the
> > right thing to do.
> > 
> >> commit 42e9ba4cd79187d0fe32ae6c1e29cf7e2ef7f916
> >> Author: Tuomo Soini <tis at foobar.fi>
> >> Date:   Thu Apr 22 12:42:43 2010 +0300
> >>
> >>     This commit reverts changes which did fix spurious tunnel rekeyings
> >>     because of netkey acquires. But unlike those work-arounds this
> >>     change removes reason for those commits by always returning with 0 from
> >>     initiate_ondemand_body. And those reverted changes broke on-demand tunneling
> >>     by %hold state.
> >>     
> >>     If we return with non-0 value bare shunts are left behind and never
> >>     cleaned away causing invalid hold eroutes on netkey.
> >>
> >> diff --git a/programs/pluto/initiate.c b/programs/pluto/initiate.c
> >> index e2e1c96..9d828ac 100644
> >> --- a/programs/pluto/initiate.c
> >> +++ b/programs/pluto/initiate.c
> >> @@ -717,7 +717,7 @@ initiate_ondemand_body(struct find_oppo_bundle *b
> >>      int hisport;
> >>      char demandbuf[256];
> >>      bool loggedit = FALSE;
> >> -    int work = 1; /* assume we did some */
> >> +    int work = 0; /* This is ugly fix, we just always return 0 - Tuomo */
> > 
> > How about:
> > 
> > 	/* on klips/mast assume we will do something something */
> > 	work = (kern_interface == USE_KLIPS || kern_interface == USE_MASTKLIPS);
> 
> That's ok for me.
> 
> >>      
> >>      /* What connection shall we use?
> >>       * First try for one that explicitly handles the clients.
> >> @@ -780,15 +780,19 @@ initiate_ondemand_body(struct find_oppo_bundle *b
> >>  
> >>  	if(c->kind == CK_INSTANCE)
> >>  	{
> >> +	    char cib[CONN_INST_BUF];
> >>  	    /* there is already an instance being negotiated, do nothing */
> >> -	    return 0;
> >> +	    openswan_log("rekeying existing instance \"%s\"%s, due to acquire"
> >> +			 , c->name
> >> +			 , (fmt_conn_instance(c, cib), cib));
> >> +
> >> +	    /*
> >> +	     * we used to return here, but rekeying is a better choice. If we
> >> +	     * got the acquire, it is because something turned stuff into a
> >> +	     * %trap, or something got deleted, perhaps due to an expiry.
> >> +	     */
> > 
> > 		work = 0; /* klips/mast need to know we are doing nothing */
> > 
> >>  	}
> >>  
> >> -	if(c->kind == CK_PERMANENT)
> >> -	{
> >> -	    /* there is already a tunnel, do nothing */
> >> -	    return 0;

Perhaps then:

	if ((kern_interface == USE_KLIPS || kern_interface == USE_MASTKLIPS) &&
			c->kind == CK_PERMANENT)

I am not sure if the alternative is ok,  but I know that leaving this in for
klips is ok.

> > Restore this, with the work = 0 and for netkey ot will be a NOOP.
> 
> We can't restore this. If we return here we never initiate on-demand on
> netkey.
> 
> > 		work = 0; /* klips/mast need to know we are doing nothing */
> 
> You didn't check end of the function. Using return 0 is same as setting
> work = 0;

I just prefer not to mix up my return mechanisms within the same function.
I did realise it was the same :-)

> >> -	}
> >>  	/* otherwise, there is some kind of static conn that can handle
> >>  	 * this connection, so we initiate it */
> 
> If we ever return with non-zero status eg work != 0 netkey leaves hold
> eroute behind and those are not cleaned away.

Thats fine,  but right now I have no known problems with klips and
eroutes/shunts so I'd prefer to reserve the changes only for netkey and
leave it as is for klips.

Cheers,
Davidm

-- 
David McCullough,      david_mccullough at mcafee.com,  Ph:+61 734352815
McAfee - SnapGear      http://www.mcafee.com         http://www.uCdot.org