[Openswan dev]
record_and_initiate_opportunistic, was [Openswan Users] Lots of
%hold connections (fwd)
Paul Wouters
paul at xelerance.com
Tue Sep 20 15:50:54 CEST 2005
---------- Forwarded message ----------
Date: Tue, 20 Sep 2005 02:25:41 -0400 (EDT)
From: Michael Smith <msmith at cbnco.com>
To: users at openswan.org
Subject: [Openswan Users] Lots of %hold connections
Hi,
I have a test setup with 19 clients and a central location all running
kernel 2.6.11.11 with Openswan 2.4.0. They are all just 486-class Soekris
net4501s. There's a server in a /24 subnet behind the central router, and
all the clients have their own /32 subnets on the VPN.
On the clients I have a pretty pathological test application: if it can't
connect, it retries once per second. While the central router is down, this
results in a new bare shunt being added on each client every thirty seconds
or so:
000 w.x.y.16/32:0 -6-> a.b.c.55/32:0 => %hold 0 %acquire-netlink
That "6" is the IP protocol number for TCP. These pile up on the clients;
I've seen up to 500. Each one seems to trigger a quick mode initiation once
the client is able to complete main mode. 19 clients times 500 quick mode
initiations really bog down the central router :)
The narrow bare shunts are supposed to be replaced with broader subnet
shunts from the IPsec policy, e.g. w.x.y.16/32:0 -0-> a.b.c.0/24. The
trouble is that record_and_initiate_opportunistic() puts the transport
protocol - 6 - in the bare shunt, but initiate_opportunistic() sets the
transport protocol to 0 when it creates the broad %hold, so the broad
%hold doesn't replace the narrow one. A workaround is to set
transport_proto to 0 at the top of record_and_initiate_opportunistic():
--- programs/pluto/kernel.c 15 Sep 2005 18:17:21 -0000 1.1.1.2
+++ programs/pluto/kernel.c 20 Sep 2005 06:22:11 -0000
@@ -176,6 +176,13 @@
, int transport_proto
, const char *why)
{
+ /*
+ * initiate_opportunistic() sets its transport proto to 0, so we
+ * must do the same when creating the bare shunt; otherwise the narrow
+ * shunt won't be deleted when a broad hold pops up.
+ */
+ transport_proto = 0;
+
passert(samesubnettype(ours, his));
/* Add to bare shunt list.
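The mismatch described above can be illustrated with a minimal, self-contained sketch. It is not pluto's code: struct shunt_key, ip4(), prefix_mask(), subnet_contains() and broad_replaces_narrow() are hypothetical names, and the replacement rule (the broad entry must cover both subnets and carry the same transport protocol) is an assumption taken from the behaviour reported in this message. The addresses stand in for the redacted w.x.y.16 and a.b.c.55.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified key for a shunt entry (IPv4 only). */
struct shunt_key {
    uint32_t src;         /* source network, host byte order */
    int src_bits;         /* source prefix length */
    uint32_t dst;         /* destination network, host byte order */
    int dst_bits;         /* destination prefix length */
    int transport_proto;  /* 0 = any protocol, 6 = TCP */
};

/* Build an IPv4 address in host byte order from four octets. */
static uint32_t ip4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16)
         | ((uint32_t)c << 8) | (uint32_t)d;
}

/* Netmask for a prefix length; 0 is special-cased to avoid a 32-bit shift. */
static uint32_t prefix_mask(int bits)
{
    assert(bits >= 0 && bits <= 32);
    return bits == 0 ? 0 : ~(uint32_t)0 << (32 - bits);
}

/* True if outer/outer_bits contains the subnet inner/inner_bits. */
static bool subnet_contains(uint32_t outer, int outer_bits,
                            uint32_t inner, int inner_bits)
{
    return inner_bits >= outer_bits
        && ((outer ^ inner) & prefix_mask(outer_bits)) == 0;
}

/*
 * Assumed rule: a broad hold supersedes a narrow bare shunt only when it
 * covers both subnets AND the transport protocols are equal.
 */
static bool broad_replaces_narrow(const struct shunt_key *broad,
                                  const struct shunt_key *narrow)
{
    return subnet_contains(broad->src, broad->src_bits,
                           narrow->src, narrow->src_bits)
        && subnet_contains(broad->dst, broad->dst_bits,
                           narrow->dst, narrow->dst_bits)
        && broad->transport_proto == narrow->transport_proto;
}
```

Under that assumed rule, a narrow shunt recorded with transport_proto 6 is never matched by a /24 hold created with transport_proto 0, even though the subnets cover it; once the narrow shunt is recorded with 0, the same hold matches it.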
With that change, though, I get a lot of "Queuing pending Quick Mode with
..." messages. The pending entries pile up just like the %holds did and
trigger a quick mode flood whenever the central router comes back to life.
So in add_pending(), I had to check for existing identical pending entries
and replace them:
--- programs/pluto/pending.c 30 May 2005 15:19:14 -0000 1.1.1.1
+++ programs/pluto/pending.c 20 Sep 2005 06:22:11 -0000
@@ -69,6 +69,37 @@
struct pending *next;
};
+static void
+delete_pending(struct pending **pp);
+
+static void delete_old_pending(const struct connection *c,
+ const struct pending *match)
+{
+ struct pending *p, **pp;
+
+ pp = host_pair_first_pending(c);
+ if(pp == NULL) return;
+
+ while ((p = *pp) != NULL)
+ {
+ if (p->isakmp_sa == match->isakmp_sa
+ && p->connection == match->connection
+ && p->policy == match->policy)
+ {
+ DBG(DBG_CONTROL, DBG_log("Deleting existing pending state from %ld."
+ , (long)p->pend_time));
+
+ p->connection = NULL;
+ delete_pending(pp);
+ }
+ else
+ {
+ pp = &p->next;
+ }
+ }
+}
+
+
/* queue a Quick Mode negotiation pending completion of a suitable Main Mode */
void
add_pending(int whack_sock
@@ -91,6 +122,8 @@
p->replacing = replacing;
p->pend_time = time(NULL);
+ delete_old_pending(c, p);
+
host_pair_enqueue_pending(c, p, &p->next);
}
I guess I should really be able to break after that call to
delete_pending(), if there is no other way of adding multiple identical
pending entries to the queue.
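A minimal sketch of that early-exit variant, assuming every insertion goes through one de-duplicating function so the queue never holds more than one identical entry. The names here (struct pend, same_request(), remove_first_duplicate(), enqueue_unique(), queue_length()) are hypothetical stand-ins, not pluto's API, and the integer fields replace the pointers that the real struct pending compares.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pending entry; only the compared fields. */
struct pend {
    int isakmp_sa;      /* stand-in for the ISAKMP SA pointer */
    int connection;     /* stand-in for the connection pointer */
    unsigned policy;
    struct pend *next;
};

/* Same comparison as the patch: SA, connection and policy must all match. */
static int same_request(const struct pend *a, const struct pend *b)
{
    return a->isakmp_sa == b->isakmp_sa
        && a->connection == b->connection
        && a->policy == b->policy;
}

/*
 * Unlink and free the first entry equal to match, then stop.
 * Walking with a pointer-to-pointer means the head needs no special case.
 * Returns 1 if an entry was removed, 0 otherwise.
 */
static int remove_first_duplicate(struct pend **pp, const struct pend *match)
{
    struct pend *p;

    assert(pp != NULL && match != NULL);
    while ((p = *pp) != NULL) {
        if (same_request(p, match)) {
            *pp = p->next;  /* unlink */
            free(p);
            return 1;       /* early exit: at most one duplicate can exist */
        }
        pp = &p->next;
    }
    return 0;
}

/* Enqueue a request, first dropping an older identical one if present. */
static struct pend *enqueue_unique(struct pend **head, int isakmp_sa,
                                   int connection, unsigned policy)
{
    struct pend *p = malloc(sizeof *p);

    if (p == NULL)
        return NULL;
    p->isakmp_sa = isakmp_sa;
    p->connection = connection;
    p->policy = policy;
    p->next = NULL;

    remove_first_duplicate(head, p);
    p->next = *head;
    *head = p;
    return p;
}

/* Count the entries in the queue. */
static size_t queue_length(const struct pend *head)
{
    size_t n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```

In this sketch, calling enqueue_unique() 500 times with the same arguments leaves a single entry in the queue, which is the property that makes stopping at the first match safe. If entries can reach the queue by any other path, the full scan in the patch is the safer choice.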
With those patches applied, my clients aren't hammering the central router
anymore, and they can successfully build SAs.
I am including /etc/no_oe.conf, so I don't think this is related to
opportunistic encryption, even though all the functions have
"opportunistic" in their names.
Mike