[Openswan dev] SPI generation by netlink_get_spi()

Andreas Steffen andreas.steffen at strongsec.net
Thu Jul 29 16:04:53 CEST 2004

Hi Herbert,

one of my customers has a problem with Openswan/strongSwan running
on a 2.6.6 kernel connecting to freeswan-2.04 with X.509 patch 1.6.3.
I was able to re-enact the scenario in question on my 2.6.7 test platform.

The problem can occur if a connection is started by auto=start as
in the following example:

conn uwe

auto=start automatically installs a %trap eroute:

: | *received whack message
: | route owner of "uwe" unrouted: NULL; eroute owner: NULL
: | route owner of "uwe" unrouted: NULL; eroute owner: NULL
: | route_and_eroute with c: uwe (next: none) ero:null esr:{(nil)} ro:null
     rosr:{(nil)} and state: 0
: | add eroute -> => int.104 at
: | eroute_connection add eroute -> => %trap:0

next the conn uwe is initiated

: | Queuing pending Quick Mode with "uwe"

: "uwe" #1: initiating Main Mode

starting with Main Mode, with a pending Quick Mode.

Due to a stray ICMP message occurring during the Main Mode negotiation,
the %trap eroute gets triggered and a narrow %hold eroute is installed:

: | *received kernel message
: | netlink_get: XFRM_MSG_ACQUIRE message
: | add bare shunt 0x8d854b0 -> => %hold:1 0    %acquire-netlink
: | initiate on demand from to proto=1 state:
     fos_start because: whack
: | find_connection: looking for policy for connection: ->

Next a search for a matching connection is started and conn uwe is found:

: | find_connection: conn "uwe" has compatible peers:> [pri: 16793612]
: | find_connection: comparing best "uwe" [pri:16793612]{0x8d82b68} (child none)
      to "uwe" [pri:16793612]{0x8d82b68} (child none)
: | find_connection: concluding with "uwe" [pri:16793612]{0x8d82b68}
: | eroute_connection replace %trap with broad %hold eroute
      -> => %hold:0
: | delete narrow %hold eroute -> => %hold:1
: | delete bare shunt 0x8d854b0 -> => %hold:1
     0    %acquire-netlink

Since a Main Mode negotiation for conn uwe is already under way, a second
Quick Mode negotiation is queued in the pending queue:

: | Queuing pending Quick Mode with "uwe"

After the successful establishment of the phase 1 ISAKMP SA, the first
pending Quick Mode is started:

: "uwe" #1: ISAKMP SA established

: | unqueuing pending Quick Mode with "uwe"
: | creating state object #2 at 0x8d86220
: "uwe" #2: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#1}
: |    message ID:  65 7e 1a 0f
: | netlink_get_spi: allocated 0x9f4c9788 for esp.0 at
: | SPI  9f 4c 97 88

The netlink interface of the 2.6 kernel is used to request an SPI for
the IPsec SA.

Immediately after the first Quick Mode message the second pending Quick Mode
is initiated:

: | unqueuing pending Quick Mode with "uwe"
: | creating state object #3 at 0x8d876d0
: "uwe" #3: initiating Quick Mode RSASIG+ENCRYPT+TUNNEL+PFS+UP {using isakmp#1}
: |    message ID:  a1 01 a2 b2
: | netlink_get_spi: allocated 0x9f4c9788 for esp.0 at
: | SPI  9f 4c 97 88

And here the error happens. The two Quick Mode negotiations have different
Message IDs (65 7e 1a 0f versus a1 01 a2 b2), which causes two phase 2
state objects to be created on the peer side, but the generated SPI 9f 4c 97 88
is the same in both. This triggers the assertion passert(0) in
kernel_pfkey.c:finish_pfkey_msg() in freeswan-2.0x because the same
SADB_ADD command is executed twice for the outbound esp. Removing the assertion,
as Openswan does, does not help: several retries still fail to set
up the IPsec SA.
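
The failure mode can be sketched with a toy SA table. An IPsec SA is
identified by the triple (destination, protocol, SPI), so a second add with
the same triple has to be refused. The names sa_add and sa_table and the
address 192.0.2.2 are hypothetical and only serve as illustration; this is
not the KLIPS or PF_KEY code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An SA is identified by (destination, protocol, SPI). */
struct sa_key {
    uint32_t dst;   /* IPv4 destination in host byte order */
    int proto;      /* 50 = ESP */
    uint32_t spi;
};

#define SA_MAX 16
static struct sa_key sa_table[SA_MAX];  /* hypothetical toy SA database */
static size_t sa_count;

/* Returns true if the SA was added, false if the same triple already
 * exists or the table is full. */
static bool sa_add(uint32_t dst, int proto, uint32_t spi)
{
    assert(spi != 0);   /* SPI 0 is reserved */

    for (size_t i = 0; i < sa_count; i++) {
        if (sa_table[i].dst == dst && sa_table[i].proto == proto
            && sa_table[i].spi == spi)
            return false;   /* duplicate: this is the second SADB_ADD */
    }
    if (sa_count == SA_MAX)
        return false;

    sa_table[sa_count].dst = dst;
    sa_table[sa_count].proto = proto;
    sa_table[sa_count].spi = spi;
    sa_count++;
    return true;
}
```

With SPI 0x9f4c9788 the first sa_add() for the outbound esp succeeds and the
second one fails, no matter how often it is retried.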

Looking at kernel.c:get_spi() I see that if KLIPS is used, each call
increases the SPI by one (spi++) so that a unique SPI is always generated
and the problem therefore never occurs. But using the native IPsec stack
of the 2.6 kernel causes netlink_get_spi() to be called instead:

static ipsec_spi_t
netlink_get_spi(const ip_address *src
               , const ip_address *dst
               , int proto
               , bool tunnel_mode
               , unsigned reqid
               , ipsec_spi_t min
               , ipsec_spi_t max
               , const char *text_said)

Because all input parameters are the same (I suspect that both Quick
Modes also use the same reqid, although I couldn't document this yet)
I assume this causes netlink to return the same SPI.
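
To illustrate the difference I suspect, here is a toy model, not the actual
pluto or kernel code. It assumes that the netlink side remembers a pending
request and matches later requests on (src, dst, proto, reqid), whereas the
KLIPS path just increments a counter. The names spi_request, counter_get_spi
and keyed_get_spi, the start values and the 192.0.2.x addresses are all
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parameters that identify an SPI request. */
struct spi_request {
    char src[16];
    char dst[16];
    int proto;
    unsigned reqid;
};

/* KLIPS-style allocation: a counter, so every call yields a new value. */
static uint32_t counter_get_spi(void)
{
    static uint32_t spi = 0x1000;
    return spi++;
}

/* Assumed netlink-style allocation: a pending request is remembered and
 * a later request with identical parameters gets the same SPI back. */
#define MAX_PENDING 8
static struct {
    struct spi_request req;
    uint32_t spi;
} pending[MAX_PENDING];
static int n_pending;

static int same_request(const struct spi_request *a,
                        const struct spi_request *b)
{
    return strcmp(a->src, b->src) == 0 && strcmp(a->dst, b->dst) == 0
        && a->proto == b->proto && a->reqid == b->reqid;
}

/* Returns the SPI for req, or 0 if the toy table is full. */
static uint32_t keyed_get_spi(const struct spi_request *req)
{
    assert(req != NULL);

    for (int i = 0; i < n_pending; i++) {
        if (same_request(&pending[i].req, req))
            return pending[i].spi;   /* same parameters: same SPI */
    }
    if (n_pending == MAX_PENDING)
        return 0;

    pending[n_pending].req = *req;
    pending[n_pending].spi = 0x9f4c9788u + (uint32_t)n_pending;
    return pending[n_pending++].spi;
}
```

In this model two calls to keyed_get_spi() with the same reqid return the
same SPI, while a request that differs only in its reqid gets a new one.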

How can this be fixed? Can netlink be forced to generate unique
SPIs by some other means or must the reqids be different?



Andreas Steffen                   e-mail: andreas.steffen at strongsec.com
strongSec GmbH                    home:   http://www.strongsec.com
Alter Zürichweg 20                phone:  +41 1 730 80 64
CH-8952 Schlieren (Switzerland)   fax:    +41 1 730 80 65
==========================================[strong internet security]===
