author: Jay Vosburgh <fubar@us.ibm.com> 2011-10-28 17:42:50 +0200
committer: David S. Miller <davem@davemloft.net> 2011-10-30 08:13:14 +0100
commit: e6d265e8504ab4a3368b8645d318b344ee88b280
tree: 6fd2bc16819bae491e56be2658fc3298b7efa92c /net
parent: qlcnic: fix beacon and LED test.
bonding: eliminate bond_close race conditions
This patch resolves two sets of race conditions.

Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> reported the first, as follows:

    The bond_close() calls cancel_delayed_work() to cancel delayed works.
    It, however, cannot cancel works that were already queued in workqueue.
    The bond_open() initializes work->data, and process_one_work() refers
    to get_work_cwq(work)->wq->flags. The get_work_cwq() returns NULL when
    work->data has been initialized. Thus, a panic occurs.

He included a patch that converted the cancel_delayed_work calls in bond_close to flush_delayed_work_sync, which eliminated the above problem.

His patch is incorporated, at least in principle, into this patch. In this patch, we use cancel_delayed_work_sync in place of flush_delayed_work_sync, and also convert bond_uninit in addition to bond_close.

This conversion to _sync, however, opens new races between bond_close and three periodically executing workqueue functions: bond_mii_monitor, bond_alb_monitor and bond_activebackup_arp_mon.

The race occurs because bond_close and bond_uninit are always called with RTNL held, and these workqueue functions may acquire RTNL to perform failover-related activities. If one of these functions blocks waiting for RTNL while bond_close or bond_uninit is waiting for it in cancel_delayed_work_sync, deadlock occurs.

These deadlocks are resolved by having the workqueue functions acquire RTNL conditionally. If rtnl_trylock() fails, the functions reschedule themselves and return immediately. For the cases that are attempting to perform link failover, a delay of 1 is used; for the other cases, the normal interval is used (as those activities are not as time critical).

Additionally, the bond_mii_monitor function now stores the delay in a variable (mimicking the structure of activebackup_arp_mon).

Lastly, all of the above renders the kill_timers sentinel moot, and therefore it has been removed.

Tested-by: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
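The conditional-locking rule described above can be sketched in userspace C. This is only an illustration, not the bonding driver's code: fake_rtnl, struct monitor, and monitor_run_once are hypothetical names, a C11 atomic_flag stands in for RTNL, and the returned tick count stands in for the delay that the real functions would pass when requeueing their delayed work.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for RTNL: a single try-lockable flag. */
static atomic_flag fake_rtnl = ATOMIC_FLAG_INIT;

/* Returns true if the lock was taken, false if it was already held. */
static bool fake_rtnl_trylock(void)
{
    return !atomic_flag_test_and_set(&fake_rtnl);
}

static void fake_rtnl_unlock(void)
{
    atomic_flag_clear(&fake_rtnl);
}

/* Hypothetical model of one periodic monitor. */
struct monitor {
    unsigned long interval; /* normal polling interval, in ticks */
    bool work_pending;      /* an action that needs the lock is waiting */
    bool time_critical;     /* true for link failover, false otherwise */
    int commits;            /* number of actions actually performed */
};

/*
 * One pass of the monitor. Returns the delay, in ticks, after which the
 * caller should requeue the work. The function never blocks on the lock,
 * so a closer that holds the lock while waiting for this work to finish
 * cannot deadlock against it.
 */
static unsigned long monitor_run_once(struct monitor *m)
{
    if (m->work_pending) {
        if (!fake_rtnl_trylock()) {
            /* Lock is busy: retry almost immediately for failover,
             * or at the normal interval for less urgent work. */
            return m->time_critical ? 1 : m->interval;
        }
        m->commits++;
        m->work_pending = false;
        fake_rtnl_unlock();
    }
    return m->interval;
}
```

The design point is that a failed trylock is treated as "try again later" rather than as an error, so the pending action is kept and retried on the next pass.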
Diffstat (limited to 'net')
0 files changed, 0 insertions, 0 deletions