[Openswan dev] 0001023: Oops due to improper ipsec_sa destruction

Nick Jones nick.jones at network-box.com
Wed Mar 18 05:10:48 EDT 2009


Please refer also to the openswan mantis issue:
http://bugs.xelerance.com/view.php?id=1023

There is a discrepancy in the way that ipsec_sa objects are initialised 
and destroyed, causing a kernel Oops in ipsec.ko while destroying an 
ipsec_sa that was found to be invalid part way through initialisation.

kernel: 2.6.16.x
openswan: 2.4.7, 2.4.12, 2.6.20

I have seen reports of Oopses with exactly the same stack trace in these 
posts:
http://lists.openswan.org/pipermail/dev/2004-March/000098.html
http://lists.openswan.org/pipermail/dev/2008-January/001786.html
http://lists.openswan.org/pipermail/dev/2008-March/001813.html

As Hiren Joshi pointed out in
http://lists.openswan.org/pipermail/dev/2008-March/001814.html, the Oops 
can be forced by adding an spi with the same id as one already 
established, but we are also seeing it happen at runtime on heavily 
loaded boxes.

Setting nhelpers=0 in ipsec.conf reduces the likelihood of this Oops 
but does not eliminate it: we continue to see it, although much less 
frequently.

As the forced Oops has the same stack trace as the runtime one, I assume 
that both have the same cause and can be solved by my suggested fix.

There are two ways to force the Oops: add a new spi with the same id as 
an outbound entry, or add one with the same id as an inbound entry.

----
ipsec spi
tun0x1004@10.8.16.202 IPIP: dir=out src=10.8.16.200 
life(c,s,h)=addtime(54,0,0) natencap=none natsport=0 natdport=0 
refcount=2 ref=1 refhim=0 reftable=0 refentry=1
tun0x1003@10.8.16.200 IPIP: dir=in src=10.8.16.202 
policy=192.168.1.0/24->192.168.11.0/24 flags=0x8<> 
life(c,s,h)=addtime(54,0,0) natencap=none natsport=0 natdport=0 
refcount=3 ref=5 refhim=1 reftable=0 refentry=5
tun0x1001@10.8.16.202 IPIP: dir=out src=10.8.16.200 
life(c,s,h)=addtime(60,0,0) natencap=none natsport=0 natdport=0 
refcount=1 ref=1 refhim=0 reftable=0 refentry=1
tun0x1002@10.8.16.200 IPIP: dir=in src=10.8.16.202 
policy=192.168.1.0/24->192.168.11.0/24 flags=0x8<> 
life(c,s,h)=addtime(60,0,0) natencap=none natsport=0 natdport=0 
refcount=3 ref=3 refhim=1 reftable=0 refentry=3
esp0x3fcd6dc4@10.8.16.200 ESP_AES_HMAC_SHA1: dir=in src=10.8.16.202 
iv_bits=128bits iv=0xff84ad43ca3cd6e0e8b73448b2f83858 ooowin=64 alen=160 
aklen=160 eklen=128 life(c,s,h)=addtime(54,0,0) natencap=none natsport=0 
natdport=0 refcount=2 ref=6 refhim=1 reftable=0 refentry=6
esp0x3fcd6dc3@10.8.16.200 ESP_AES_HMAC_SHA1: dir=in src=10.8.16.202 
iv_bits=128bits iv=0x25d7b2a4d2f479945952cff75b8ecb3a ooowin=64 alen=160 
aklen=160 eklen=128 life(c,s,h)=addtime(60,0,0) natencap=none natsport=0 
natdport=0 refcount=2 ref=4 refhim=1 reftable=0 refentry=4
esp0x23391498@10.8.16.202 ESP_AES_HMAC_SHA1: dir=out src=10.8.16.200 
iv_bits=128bits iv=0xb2405aaa8634121ae930d53e915bcf2a ooowin=64 alen=160 
aklen=160 eklen=128 life(c,s,h)=addtime(54,0,0) natencap=none natsport=0 
natdport=0 refcount=3 ref=7 refhim=0 reftable=0 refentry=7
esp0x23391497@10.8.16.202 ESP_AES_HMAC_SHA1: dir=out src=10.8.16.200 
iv_bits=128bits iv=0x76707e7f32fd78808691736b1370a748 ooowin=64 alen=160 
aklen=160 eklen=128 life(c,s,h)=addtime(60,0,0) natencap=none natsport=0 
natdport=0 refcount=3 ref=2 refhim=0 reftable=0 refentry=2

Outbound:
ipsec spi --af inet --edst 10.8.16.202 --spi 0x23391498 --proto esp \
  --src 10.8.16.200 --esp 3des-md5-96 --enckey 0xBIGLONGKEY \
  --authkey 0xBIGLONGKEY

Inbound:
ipsec spi --af inet --edst 10.8.16.202 --spi 0x3fcd6dc4 --proto esp \
  --src 10.8.16.200 --esp 3des-md5-96 --enckey 0xBIGLONGKEY \
  --authkey 0xBIGLONGKEY
----

A trace of the issue goes like this:
Both Cases:
- KLIPS cryptoapi interface: alg_type=15 alg_id=12 name=aes 
keyminbits=128 keymaxbits=256, found(0)
- KLIPS cryptoapi interface: alg_type=15 alg_id=3 name=des3_ede 
keyminbits=192 keymaxbits=192, found(0)
   - using cryptoapi for these alg ids

- klips_debug:pfkey_msg_interp: parsing message ver=2, type=3, errno=0, 
satype=3(ESP), len=77, res=0, seq=1, pid=7811
   - ESP

- klips_debug:pfkey_sa_process: .
   - pfkey_v2_ext_process.c:160:pfkey_sa_process
     - ipsec_alg_sa_init is called; this sets the ips_alg_enc field in
       the ipsec_sa (it also sets ips_alg_auth, but that usually ends
       up as NULL). Because we are using cryptoapi, the key operation
       functions are set to _capi_new_key, _capi_destroy_key, etc.

- klips_debug:pfkey_key_process: allocating 256 bytes for enckey
   - pfkey_v2_ext_process.c:516:pfkey_key_process
     - "pfkey_key_process: allocating 256 bytes for enckey"
       ips_key_e is initialised as a kmalloc'd block of sadb_key_bits / 8 bytes

Outbound:
- ipsec_sa_getbyid: linked entry in ipsec_sa table...requested
- klips_debug:pfkey_add_parse: found an old ipsec_sa
- klips_debug:pfkey_msg_interp: message parsing failed with error -17.

Inbound:
- ipsec_sa_getbyid: linked entry in ipsec_sa table...requested
- ipsec_sa_getbyid: no entries in ipsec_sa table
- ipsec_sa_init: calling init routine of ESP_3DES_HMAC_MD5
- klips_debug:ipsec_alg_enc_key_create: incorrect encryption key size 
for id=3: 2048 bits -- must be between 192,192 bits

Both Cases again:
- klips_debug:ipsec_sa_wipe: removing SA
   - ipsec_sa.c:1028:ipsec_sa_wipe
     - ips_key_e != NULL, ok
     - ips->ips_alg_enc && ips->ips_alg_enc->ixt_e_destroy_key != NULL, ok
   - ipsec_alg_cryptoapi.c:301:_capi_destroy_key
     - ipsec_sa->ips_key_e is still a simple block of bytes but gets
       cast to a struct crypto_tfm
   - ipsec_alg_cryptoapi.c:313:_capi_destroy_key
     - crypto_free_tfm is called on that pointer... I needn't go on
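
To make the mismatch concrete, here is a minimal user-space model of the sequence traced above. It is only a sketch: the struct, the enum tag and all model_* names are hypothetical and not taken from the KLIPS sources, and I am assuming that ipsec_alg_enc_key_create is the step that turns the raw key block into an algorithm context. The real ipsec_sa carries no tag saying what ips_key_e points at, which is exactly why the destroy path cannot tell the two cases apart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag used only by this model; the real struct has no such field. */
enum key_kind { KEY_NONE, KEY_RAW_BYTES, KEY_ALG_CONTEXT };

struct sa_model {
    enum key_kind key_e;   /* what ips_key_e currently points at */
    unsigned key_bits;     /* size of the key material in bits */
    bool alg_enc_set;      /* models ips_alg_enc != NULL */
};

/* Models pfkey_sa_process: the algorithm and its destroy hook are attached first. */
static void model_sa_process(struct sa_model *sa)
{
    assert(sa != NULL);
    sa->alg_enc_set = true;
}

/* Models pfkey_key_process: the key is stored as a plain block of bytes. */
static void model_key_process(struct sa_model *sa, unsigned bits)
{
    assert(sa != NULL);
    sa->key_e = KEY_RAW_BYTES;
    sa->key_bits = bits;
}

/* Models ipsec_alg_enc_key_create: only on success does the raw block
 * become an algorithm context (an assumption of this sketch). */
static bool model_enc_key_create(struct sa_model *sa,
                                 unsigned min_bits, unsigned max_bits)
{
    assert(sa != NULL);
    if (!sa->alg_enc_set || sa->key_e != KEY_RAW_BYTES)
        return false;
    if (sa->key_bits < min_bits || sa->key_bits > max_bits)
        return false;      /* "incorrect encryption key size" */
    sa->key_e = KEY_ALG_CONTEXT;
    return true;
}

/* Models the check in ipsec_sa_wipe: both fields non-NULL means the
 * algorithm's destroy hook is invoked on ips_key_e. */
static bool model_wipe_calls_destroy(const struct sa_model *sa)
{
    assert(sa != NULL);
    return sa->key_e != KEY_NONE && sa->alg_enc_set;
}

/* The wipe is only safe if the destroy hook really receives a context. */
static bool model_wipe_is_safe(const struct sa_model *sa)
{
    return !model_wipe_calls_destroy(sa) || sa->key_e == KEY_ALG_CONTEXT;
}
```

In this model both forced cases end in an unsafe wipe. In the outbound case parsing fails with -17 before the key-create step runs at all; in the inbound case key creation is attempted but rejects the 2048-bit key. Either way the destroy hook is still invoked on raw bytes.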

The solution I am suggesting is to make init and destroy mirror each 
other by tying the ips_key_e and ips_alg_enc fields of the ipsec_sa 
structure more closely together; the same should be done for ips_key_a 
and ips_alg_auth. The patch attached to the mantis issue does this by 
removing the call to ipsec_alg_sa_init from pfkey_sa_process and moving 
it (actually the logic therein) into ipsec_alg_enc_key_create, so that 
the initialised key structure is guaranteed to match the algorithm used 
to create (and destroy) it. (I decided to remove the call to 
ipsec_ocf_sa_init as well.)
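
The idea can be sketched in plain user-space C as follows. This is not the attached patch, just an illustration of the invariant it aims for: the algorithm pointer is set only once key creation has succeeded, so a non-NULL algorithm always implies that the key is a context that algorithm knows how to destroy. All sketch_* names, the fake context type and the call counter are hypothetical; the 192-bit limits are taken from the des3_ede line in the trace.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct crypto_tfm. */
struct sketch_ctx { size_t key_bits; };

struct sketch_alg {
    size_t min_bits, max_bits;
    void *(*new_key)(const unsigned char *raw, size_t bits);
    void (*destroy_key)(void *ctx);
};

struct sketch_sa {
    void *key_e;                       /* raw bytes OR an algorithm context */
    size_t key_e_bits;
    const struct sketch_alg *alg_enc;  /* non-NULL only if key_e is a context */
};

static int sketch_destroy_calls;       /* counter for demonstration only */

static void *sketch_new_key(const unsigned char *raw, size_t bits)
{
    struct sketch_ctx *ctx = malloc(sizeof(*ctx));
    (void)raw;                         /* a real algorithm would schedule the key */
    if (ctx != NULL)
        ctx->key_bits = bits;
    return ctx;
}

static void sketch_destroy_key(void *ctx)
{
    sketch_destroy_calls++;
    free(ctx);
}

static const struct sketch_alg sketch_3des = {
    192, 192, sketch_new_key, sketch_destroy_key
};

/* Store the raw key bytes; no algorithm is attached at this point. */
static int sketch_set_raw_key(struct sketch_sa *sa,
                              const unsigned char *raw, size_t bits)
{
    size_t bytes = (bits + 7) / 8;
    void *copy;

    assert(sa != NULL && raw != NULL);
    if (bits == 0)
        return -EINVAL;
    copy = malloc(bytes);
    if (copy == NULL)
        return -ENOMEM;
    memcpy(copy, raw, bytes);
    sa->key_e = copy;
    sa->key_e_bits = bits;
    sa->alg_enc = NULL;
    return 0;
}

/* Attach the algorithm only after its context has been created. */
static int sketch_enc_key_create(struct sketch_sa *sa,
                                 const struct sketch_alg *alg)
{
    void *ctx;

    if (sa->key_e == NULL || sa->alg_enc != NULL)
        return -EINVAL;
    if (sa->key_e_bits < alg->min_bits || sa->key_e_bits > alg->max_bits)
        return -EINVAL;                /* key stays raw, alg_enc stays NULL */
    ctx = alg->new_key(sa->key_e, sa->key_e_bits);
    if (ctx == NULL)
        return -ENOMEM;
    free(sa->key_e);                   /* raw bytes are no longer needed */
    sa->key_e = ctx;
    sa->alg_enc = alg;
    return 0;
}

/* Destroy mirrors init: context goes to the algorithm, raw bytes to free(). */
static void sketch_sa_wipe(struct sketch_sa *sa)
{
    if (sa->key_e != NULL) {
        if (sa->alg_enc != NULL)
            sa->alg_enc->destroy_key(sa->key_e);
        else
            free(sa->key_e);
    }
    sa->key_e = NULL;
    sa->key_e_bits = 0;
    sa->alg_enc = NULL;
}
```

With this arrangement, wiping an SA whose key creation failed (for example a 2048-bit key offered to an algorithm that only accepts 192 bits) releases the raw block with free() and never reaches the algorithm's destroy hook.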

An ipsec.ko module built with this patch applied will not Oops when 
forced from the command line.

I have done my best to ensure that the change is responsible in terms of 
allocating and freeing resources, but I can't be 100% sure that it will 
not cause side effects within the program flow of the ipsec module, as 
I'm not very familiar with the code. I'm hoping that more experienced 
people here can comment on my suggestion.

Thanks

