[Openswan dev]

Ankit Desai ankit at elitecore.com
Tue Feb 14 13:16:48 CET 2006


Hi,
I had a good look at the code and found the following patch useful. I
don't know whether this is the correct solution; it simply removes the
recursive locking from the code.

<---snip--->

--- pfkey_v2.c.org      Tue Feb 14 12:53:56 2006
+++ pfkey_v2.c  Tue Feb 14 12:54:37 2006
@@ -101,7 +101,6 @@
 #ifdef NET_26
 static void pfkey_sock_list_grab(void)
 {
-       write_lock_bh(&pfkey_sock_lock);

        if (atomic_read(&pfkey_sock_users)) {
                DECLARE_WAITQUEUE(wait, current);
@@ -111,9 +110,7 @@
                        set_current_state(TASK_UNINTERRUPTIBLE);
                        if (atomic_read(&pfkey_sock_users) == 0)
                                break;
-                       write_unlock_bh(&pfkey_sock_lock);
                        schedule();
-                       write_lock_bh(&pfkey_sock_lock);
                }

                __set_current_state(TASK_RUNNING);
@@ -123,7 +120,6 @@

 static __inline__ void pfkey_sock_list_ungrab(void)
 {
-       write_unlock_bh(&pfkey_sock_lock);
        wake_up(&pfkey_sock_wait);
 }

@@ -745,7 +741,13 @@
        sk->sk_sleep=sock->wait;
 #endif /* NET_21 */

+#ifdef NET_26
+       write_lock_bh(&pfkey_sock_lock);
+#endif
        pfkey_insert_socket(sk);
+#ifdef NET_26
+       write_unlock_bh(&pfkey_sock_lock);
+#endif
        pfkey_list_insert_socket(sock, &pfkey_open_sockets);

        KLIPS_PRINT(debug_pfkey,

<---snip--->

Explanation of the patch:
For NET_26, "pfkey_sock_list_grab" is called from two places:
1) pfkey_insert_socket
2) pfkey_remove_socket

pfkey_remove_socket is already called with the lock held, so there is no need
to take the lock again in pfkey_sock_list_grab.
Instead, "pfkey_insert_socket" is now called with the lock held from
"pfkey_create". I could not find any other caller of pfkey_insert_socket.
I have not considered the 2.4 kernel in this patch. I have tested it
with a 2.6 kernel only, and only as far as bootup.
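
To make the locking change easier to follow, here is a small userspace sketch
of the same idea. It is not KLIPS code: toy_rwlock, toy_write_lock, grab_old,
grab_new and the *_path_* functions are hypothetical names made up for this
illustration, and the toy lock reports a second acquisition instead of
spinning the way a real rwlock would. It assumes, as described above, that the
remove path already holds the lock when it reaches the grab helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the write side of a kernel rwlock_t (hypothetical type). */
struct toy_rwlock {
    bool write_held;        /* true while a writer owns the lock */
    int recursion_reports;  /* number of second acquisitions caught */
};

static struct toy_rwlock sock_lock;  /* models pfkey_sock_lock */

/* Take the lock; return -1 instead of deadlocking if it is already held. */
static int toy_write_lock(struct toy_rwlock *l)
{
    if (l->write_held) {
        l->recursion_reports++;
        fprintf(stderr, "toy model: rwlock recursion detected\n");
        return -1;
    }
    l->write_held = true;
    return 0;
}

static void toy_write_unlock(struct toy_rwlock *l)
{
    assert(l->write_held);  /* unlocking a free lock would be a bug */
    l->write_held = false;
}

/* Before the patch: the grab helper takes the lock itself. */
static int grab_old(void)
{
    return toy_write_lock(&sock_lock);
}

/* Before: the caller already holds the lock, so grab_old() recurses. */
int remove_path_old(void)
{
    int rc;

    if (toy_write_lock(&sock_lock) != 0)
        return -1;
    rc = grab_old();               /* fails: the lock is already held */
    toy_write_unlock(&sock_lock);  /* release the caller's own hold */
    return rc;
}

/* After the patch: the grab helper no longer touches the lock. */
static void grab_new(void)
{
    /* the real helper would only wait for pfkey_sock_users to reach 0 */
}

/* After: the remove path keeps the single lock its caller took. */
int remove_path_new(void)
{
    if (toy_write_lock(&sock_lock) != 0)
        return -1;
    grab_new();
    toy_write_unlock(&sock_lock);
    return 0;
}

/* After: the create path wraps the insert in its own lock/unlock pair. */
int create_path_new(void)
{
    if (toy_write_lock(&sock_lock) != 0)
        return -1;
    grab_new();                    /* stands in for pfkey_insert_socket */
    toy_write_unlock(&sock_lock);
    return 0;
}
```

In this model remove_path_old() returns -1 and bumps recursion_reports, while
remove_path_new() and create_path_new() both return 0 with the lock taken
exactly once per path. The sketch is single-threaded and says nothing about
whether sleeping in the wait loop with the lock held is safe.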

-- Ankit Desai

----- Original Message ----- 
From: "Paul Wouters" <paul at xelerance.com>
To: "Ankit Desai" <ankit at elitecore.com>
Cc: <dev at openswan.org>
Sent: Tuesday, February 14, 2006 5:00 AM
Subject: Re: [Openswan dev]


> On Sat, 11 Feb 2006, Ankit Desai wrote:
>
> > found the following message at kernel bootup
> > klips_info:ipsec_init: KLIPS startup, Openswan KLIPS IPsec stack version: 2.4.5rc4
>
> > BUG: rwlock recursion on CPU#0, klipsdebug/1379, c8a8c844
> >  [<c01ce285>] _raw_write_lock+0x35/0x60
> >  [<c0308d04>] _write_lock_bh+0x14/0x20
> >  [<c8a5166e>] pfkey_sock_list_grab+0xe/0xf0 [ipsec]
> >  [<c014b2ae>] do_no_page+0x22e/0x250
> >  [<c0159c3a>] do_sync_write+0xba/0x100
> >  [<c8a51a90>] pfkey_remove_socket+0x20/0xa0 [ipsec]
> >  [<c8a51b37>] pfkey_destroy_socket+0x27/0x340 [ipsec]
> >  [<c0308c84>] _spin_lock_bh+0x14/0x20
> >  [<c02a2cff>] release_sock+0xf/0x50
> >  [<c8a52395>] pfkey_release+0xa5/0x100 [ipsec]
> >  [<c029f3d4>] sock_release+0x14/0x70
> >  [<c029fddd>] sock_close+0x2d/0x40
> >  [<c015aa35>] __fput+0xa5/0x140
> >  [<c015934c>] filp_close+0x4c/0x60
> >  [<c01593c4>] sys_close+0x64/0x80
> >  [<c0102ec5>] syscall_call+0x7/0xb
>
> That is our known SMP bug :(
>
> Paul
>


