summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorGang He <ghe@suse.com>2018-05-29 05:09:22 +0200
committerDavid Teigland <teigland@redhat.com>2018-05-29 17:48:35 +0200
commitda3627c30d229fea1e070e984366f80a1c4d9166 (patch)
treece11a5b004fc53bfc049c1f9604f516ed432df2e
parentdlm: make sctp_connect_to_sock() return in specified time (diff)
downloadlinux-da3627c30d229fea1e070e984366f80a1c4d9166.tar.xz
linux-da3627c30d229fea1e070e984366f80a1c4d9166.zip
dlm: remove O_NONBLOCK flag in sctp_connect_to_sock
We should remove O_NONBLOCK flag when calling sock->ops->connect() in sctp_connect_to_sock() function. Why? 1. up to now, sctp socket connect() function ignores the flag argument, that means O_NONBLOCK flag does not take effect, then we should remove it to avoid the confusion (but is not urgent). 2. for the future, there will be a patch to fix this problem, then the flag argument will take effect, the patch has been queued at https://git.kernel.o rg/pub/scm/linux/kernel/git/davem/net.git/commit/net/sctp?id=644fbdeacf1d3ed d366e44b8ba214de9d1dd66a9. But, the O_NONBLOCK flag will make sock->ops->connect() directly return without any wait time, then the connection will not be established, DLM kernel module will call sock->ops->connect() again and again, the bad results are, CPU usage is almost 100%, even trigger soft_lockup problem if the related configurations are enabled, DLM kernel module also prints lots of messages like, [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 [Fri Apr 27 11:23:43 2018] dlm: connecting to 172167592 The upper application (e.g. ocfs2 mount command) is hanged at new_lockspace(), the whole backtrace is as below, tb0307-nd2:~ # cat /proc/2935/stack [<0>] new_lockspace+0x957/0xac0 [dlm] [<0>] dlm_new_lockspace+0xae/0x140 [dlm] [<0>] user_cluster_connect+0xc3/0x3a0 [ocfs2_stack_user] [<0>] ocfs2_cluster_connect+0x144/0x220 [ocfs2_stackglue] [<0>] ocfs2_dlm_init+0x215/0x440 [ocfs2] [<0>] ocfs2_fill_super+0xcb0/0x1290 [ocfs2] [<0>] mount_bdev+0x173/0x1b0 [<0>] mount_fs+0x35/0x150 [<0>] vfs_kern_mount.part.23+0x54/0x100 [<0>] do_mount+0x59a/0xc40 [<0>] SyS_mount+0x80/0xd0 [<0>] do_syscall_64+0x76/0x140 [<0>] entry_SYSCALL_64_after_hwframe+0x42/0xb7 [<0>] 0xffffffffffffffff So, I think we should remove O_NONBLOCK flag here, since DLM kernel module can not handle non-block sockect in connect() properly. Signed-off-by: Gang He <ghe@suse.com> Signed-off-by: David Teigland <teigland@redhat.com>
-rw-r--r--fs/dlm/lowcomms.c2
1 files changed, 1 insertions, 1 deletions
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index d31e9abfb9f1..a5e4a221435c 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1092,7 +1092,7 @@ static void sctp_connect_to_sock(struct connection *con)
kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, (char *)&tv,
sizeof(tv));
result = sock->ops->connect(sock, (struct sockaddr *)&daddr, addr_len,
- O_NONBLOCK);
+ 0);
memset(&tv, 0, sizeof(tv));
kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO, (char *)&tv,
sizeof(tv));