[Openswan dev] Losing shared phase1

atong at TrustedCS.com
Thu Feb 10 15:56:26 EST 2011


I don't have this exact configuration in use, but in cases where
we share a phase 1 between different connections, we externally
track the shared connections and handle turning them off ourselves.
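
As a rough sketch of that external handling (conn names hypothetical,
and simplified from what we actually run): down the conns riding on
the shared phase 1 first, and the conn that owns the ISAKMP SA last.

    # hq-sub1 owns the ISAKMP SA; hq-sub2..4 share it
    for c in hq-sub2 hq-sub3 hq-sub4; do
        ipsec auto --down $c
    done
    ipsec auto --down hq-sub1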

That said, I do have a situation (though for a slightly different
reason) where a shared phase 1 may go away. As mentioned in this
thread, the delete notification is not considered reliable, so we
can end up with an orphaned phase 2.

I'm not quite sure if the keepalives referred to are only NAT-T
keepalives or DPD as well. DPD-wise, the DPD for that SA will kick
in but not end up doing anything--it will only warn "DPD: Serious:
could not find newest phase 1 state". It leaves the SA intact; the
phase 1 only comes back if it's manually requested.
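
For reference, this is with DPD enabled on the conns, along these
lines in ipsec.conf (values illustrative, not our exact settings):

    conn hq-sub1
        dpddelay=30
        dpdtimeout=120
        dpdaction=clear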

For our environment, this results in a weird connectivity issue when
new phase 1's are re-established. The issue may be more specific to
us because our SAs are labeled with SELinux contexts and are
directional, and we depend on acquire messages from the kernel to
create these SAs. But because one side still has an SA (which the
DPD "Serious" path did not clean up) that the other side no longer
knows about (it initiated the ipsec auto --down), we get an odd
connectivity problem: one side is sending data over a
new/re-established SA, but the other side is replying on an SA its
peer doesn't know about.

This issue is fairly difficult to explain, so for now I will keep it
to the DPD case. For us it makes sense to delete orphaned phase 2
states when DPD encounters them. Does it make sense to do this
generally? (I should re-consult the RFCs.)
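
As a stopgap we can script the cleanup externally. A rough sketch
only--it assumes the usual whack status layout of
000 #N: "conn" ... STATE_QUICK_* lines, and that --deletestate
takes a state number:

    # delete phase 2 states for a conn whose phase 1 is gone
    # (hypothetical conn name; status format assumed)
    conn=hq-sub1
    ipsec whack --status \
        | grep "\"$conn\"" \
        | grep STATE_QUICK \
        | sed 's/^.*#\([0-9][0-9]*\):.*$/\1/' \
        | while read st; do
              ipsec whack --deletestate $st
          done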

I have more thoughts regarding phase 1 and phase 2 delete
notifications, but that may be a separate topic.

-anthony

> -----Original Message-----
> From: dev-bounces at openswan.org 
> [mailto:dev-bounces at openswan.org] On Behalf Of Harald Jenny
> Sent: Friday, October 29, 2010 8:57 AM
> To: dev at openswan.org
> Subject: Re: [Openswan dev] Losing shared phase1
> 
> On Thu, Oct 28, 2010 at 11:36:52AM -0400, Paul Wouters wrote:
> > On Tue, 26 Oct 2010, D. Hugh Redelmeier wrote:
> > 
> > > The IKE and IPSec stuff are all combined in one conn.  So if you
> > > stop the conn, you are in some sense stopping that authentication.
> > >
> > > There could have been two levels of conn, one for IKE and one for
> > > IPSec.  That could more cleanly match the protocols.  It was decided
> > > that that distinction didn't actually help the usage we expected.
> > 
> > I guess that is still true in almost all cases. And even in a case
> > where two legitimate IPsec SAs share an ISAKMP SA, as in Harald's
> > example, pluto can easily set up a new ISAKMP SA.
> 
> Hmmm, but does it really do this in the case of NAT-T? The code
> I've looked at rather suggests that a keepalive is only sent when
> an ISAKMP SA already exists... (please don't beat me in case I've
> read the code wrong).
> 
> > 
> > I guess it is really mostly a matter of when you have very many
> > tunnels between the same two endpoints, which is almost always an
> > artificial testing scenario.
> 
> Well, what if two headquarters with multiple subnets have to be
> interconnected? Apart from that scenario, only NATing the IPsec
> nets comes to my mind, and that can get really ugly!
> 
> > 
> > > You could, of course, recode this to avoid the "problem":
> > >    for i in `seq 2 500`; do
> > >      ipsec auto --up conn$i
> > >    done
> > >    ipsec auto --up conn1
> > 
> > Yes, that is exactly what we ended up doing. Well, not exactly, as
> > your example only works for the down case if you go 1-500 in the up
> > case. Your example would just move the ISAKMP to conn2, which in the
> > down loop would be quickly killed as well.
> > 
> > > or:
> > > have an IKE-only conn, separately created and torn down.
> > > That seems to match this case quite well.
> > 
> > How would you create an "IKE-only" conn? I thought your design
> > didn't allow for that? :) But I know what you mean, pick a
> > host-host one that's outside the loop of 500 tunnels.
> 
> *ggg*
> 
> > 
> > > Delete notifications in IKEv1 are neither mandatory nor reliable.
> > > So they should not matter much.
> > 
> > True, and again mostly relevant in testing and benchmarking.
> > 
> > > You could perhaps create a new --downipsec that left the IKE SA.  
> > > But then when would the IKE SA get deleted?
> > 
> > I don't think this integrates well with real-world uses though. What
> > would you do on receiving a delete/notify? --down or --downipsec?
> 
> What exactly does the delete notification tell us? That the
> ISAKMP SA or the IPsec SA terminates?
> 
> > 
> > > A conn's IKE SA is hard to attach to a different conn.  For one 
> > > thing, the settings might be different.
> > 
> > Agreed, it seems very unwise.
> > 
> > > | There are more reasons this is important too. Imagine NAT-T
> > > | keepalives no longer being sent because there is no phase 1.
> > >
> > > I know little about NAT-T and its requirements.
> > 
> > NAT keepalives are sent to keep the NAT hole open in the absence
> > of traffic.
> > 
> > > Why did you down a conn if you didn't want it to go down?
> > 
> > And if it does go down, it can easily come back up. It's not a
> > scaling issue.
> > 
> > I'll document this case and part of this discussion in the wiki.
> > But I don't think we should do any coding for this corner case.
> > Unless anyone else has a real-world scenario where re-establishing
> > the phase 1 causes packet loss or other issues?
> 
> Sorry if I'm nitpicking, but could anyone (with more experience)
> have a look at the keepalive code?
> 
> > 
> > Thanks for the answers on the design Hugh,
> > 
> > Paul
> 
> Kind regards
> Harald