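After installing Ceph, the server died while it was running.
The situation was as follows.
1. Everything works at first, but the problem appears once some I/O has been generated.
2. When it happens, certain operations hang and never return (deadlock).
-> ps aux is one of them.
ps aux prints the process list for a while and then stops; the problematic process is the one that should have been printed next (the one after the last PID shown).
Commands such as cat /proc/(problem pid)/cmdline hang as well.
3. The system cannot be shut down cleanly.
-> This appears to be because the filesystems cannot be unmounted.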
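Since ps aux and cat /proc/(pid)/cmdline hang on the stuck process, another way of finding it can help. The following is a minimal Python sketch, not part of the original troubleshooting, that reads only /proc/<pid>/stat and lists tasks in state D (uninterruptible sleep). It assumes that reading stat does not block the way cmdline does on this kernel; the function names parse_stat and blocked_tasks are made up for illustration.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = text.index("(")
    right = text.rindex(")")
    pid = int(text[:left])
    comm = text[left + 1:right]
    state = text[right + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes whose main thread is in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError):
            continue  # the process exited or the file could not be parsed
        if state == "D":
            found.append((pid, comm))
    return sorted(found)


# Only scan when run as a script on a system that actually has /proc.
if __name__ == "__main__" and os.path.isdir("/proc"):
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

This looks only at each process's main thread. Threads are listed under /proc/<pid>/task/ and would need the same treatment to be caught.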
The kernel log is shown below.
[ 5092.890984] libceph: osd2 up
[ 6646.588531] INFO: task containerd:18390 blocked for more than 120 seconds.
[ 6646.588868] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.589070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.589294] containerd D 0 18390 1 0x00000000
[ 6646.589297] Call Trace:
[ 6646.589306] __schedule+0x24e/0x880
[ 6646.589311] ? __memcg_init_list_lru_node+0x70/0xd0
[ 6646.589313] schedule+0x2c/0x80
[ 6646.589316] rwsem_down_write_failed+0x1ea/0x360
[ 6646.589319] ? ida_get_new_above+0x110/0x320
[ 6646.589325] call_rwsem_down_write_failed+0x17/0x30
[ 6646.589326] ? call_rwsem_down_write_failed+0x17/0x30
[ 6646.589328] down_write+0x2d/0x40
[ 6646.589332] register_shrinker_prepared+0x19/0x50
[ 6646.589336] sget_userns+0x419/0x490
[ 6646.589338] ? get_anon_bdev+0x100/0x100
[ 6646.589340] sget+0x7d/0xa0
[ 6646.589342] ? get_anon_bdev+0x100/0x100
[ 6646.589346] ? ovl_posix_acl_xattr_set+0x300/0x300 [overlay]
[ 6646.589348] mount_nodev+0x30/0xa0
[ 6646.589351] ovl_mount+0x18/0x20 [overlay]
[ 6646.589353] mount_fs+0x37/0x160
[ 6646.589357] vfs_kern_mount.part.24+0x5d/0x110
[ 6646.589359] do_mount+0x5ed/0xce0
[ 6646.589361] ? copy_mount_options+0x2c/0x220
[ 6646.589363] SyS_mount+0x98/0xe0
[ 6646.589367] do_syscall_64+0x73/0x130
[ 6646.589370] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.589371] RIP: 0033:0x55635d40f28a
[ 6646.589372] RSP: 002b:000000c00186e358 EFLAGS: 00000216 ORIG_RAX: 00000000000000a5
[ 6646.589374] RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000055635d40f28a
[ 6646.589375] RDX: 000000c0014d8818 RSI: 000000c002ef45a0 RDI: 000000c0014d8810
[ 6646.589376] RBP: 000000c00186e3f0 R08: 000000c0038e8b00 R09: 0000000000000000
[ 6646.589377] R10: 0000000000000000 R11: 0000000000000216 R12: ffffffffffffffff
[ 6646.589378] R13: 0000000000000010 R14: 000000000000000f R15: 0000000000000055
[ 6646.589400] INFO: task containerd:16952 blocked for more than 120 seconds.
[ 6646.589598] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.589790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.590013] containerd D 0 16952 1 0x00000000
[ 6646.590016] Call Trace:
[ 6646.590020] __schedule+0x24e/0x880
[ 6646.590023] ? __memcg_init_list_lru_node+0x70/0xd0
[ 6646.590025] schedule+0x2c/0x80
[ 6646.590028] rwsem_down_write_failed+0x1ea/0x360
[ 6646.590031] ? ida_get_new_above+0x110/0x320
[ 6646.590035] call_rwsem_down_write_failed+0x17/0x30
[ 6646.590037] ? call_rwsem_down_write_failed+0x17/0x30
[ 6646.590040] down_write+0x2d/0x40
[ 6646.590043] register_shrinker_prepared+0x19/0x50
[ 6646.590046] sget_userns+0x419/0x490
[ 6646.590048] ? get_anon_bdev+0x100/0x100
[ 6646.590052] sget+0x7d/0xa0
[ 6646.590054] ? get_anon_bdev+0x100/0x100
[ 6646.590059] ? ovl_posix_acl_xattr_set+0x300/0x300 [overlay]
[ 6646.590062] mount_nodev+0x30/0xa0
[ 6646.590067] ovl_mount+0x18/0x20 [overlay]
[ 6646.590070] mount_fs+0x37/0x160
[ 6646.590073] vfs_kern_mount.part.24+0x5d/0x110
[ 6646.590076] do_mount+0x5ed/0xce0
[ 6646.590079] ? copy_mount_options+0x2c/0x220
[ 6646.590082] SyS_mount+0x98/0xe0
[ 6646.590085] do_syscall_64+0x73/0x130
[ 6646.590088] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.590090] RIP: 0033:0x55635d40f28a
[ 6646.590091] RSP: 002b:000000c0018725b0 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 6646.590094] RAX: ffffffffffffffda RBX: 000000c000054f00 RCX: 000055635d40f28a
[ 6646.590095] RDX: 000000c0030c7ce8 RSI: 000000c00148bf60 RDI: 000000c0030c7ce0
[ 6646.590096] RBP: 000000c001872648 R08: 000000c00300f680 R09: 0000000000000000
[ 6646.590098] R10: 0000000000000000 R11: 0000000000000202 R12: ffffffffffffffff
[ 6646.590099] R13: 00000000000000fc R14: 00000000000000fb R15: 0000000000000100
[ 6646.590540] INFO: task xfsaild/rbd2:8223 blocked for more than 120 seconds.
[ 6646.590743] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.590939] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.591165] xfsaild/rbd2 D 0 8223 2 0x80000000
[ 6646.591168] Call Trace:
[ 6646.591171] __schedule+0x24e/0x880
[ 6646.591175] ? lock_timer_base+0x6b/0x90
[ 6646.591178] schedule+0x2c/0x80
[ 6646.591245] _xfs_log_force+0x159/0x2a0 [xfs]
[ 6646.591250] ? wake_up_q+0x80/0x80
[ 6646.591301] ? xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.591350] xfs_log_force+0x2c/0x80 [xfs]
[ 6646.591400] xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.591404] ? __schedule+0x256/0x880
[ 6646.591408] kthread+0x121/0x140
[ 6646.591458] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.591460] ? kthread+0x121/0x140
[ 6646.591510] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.591513] ? kthread_create_worker_on_cpu+0x70/0x70
[ 6646.591517] ret_from_fork+0x35/0x40
[ 6646.591539] INFO: task prometheus:10749 blocked for more than 120 seconds.
[ 6646.591740] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.591936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.592162] prometheus D 0 10749 10332 0x00000000
[ 6646.592164] Call Trace:
[ 6646.592168] __schedule+0x24e/0x880
[ 6646.592171] schedule+0x2c/0x80
[ 6646.592173] io_schedule+0x16/0x40
[ 6646.592176] wait_on_page_bit+0xf4/0x130
[ 6646.592182] ? page_cache_tree_insert+0xe0/0xe0
[ 6646.592186] wait_for_stable_page+0x61/0x80
[ 6646.592188] grab_cache_page_write_begin+0x37/0x40
[ 6646.592192] iomap_write_begin.constprop.18+0x5b/0x140
[ 6646.592195] iomap_write_actor+0x92/0x170
[ 6646.592198] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592200] iomap_apply+0xa5/0x120
[ 6646.592203] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592205] iomap_file_buffered_write+0x6e/0xa0
[ 6646.592207] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592256] xfs_file_buffered_aio_write+0xca/0x290 [xfs]
[ 6646.592261] ? sock_read_iter+0x8f/0xf0
[ 6646.592309] xfs_file_write_iter+0xac/0x160 [xfs]
[ 6646.592347] new_sync_write+0xe7/0x140
[ 6646.592350] __vfs_write+0x29/0x40
[ 6646.592353] vfs_write+0xb1/0x1a0
[ 6646.592355] SyS_write+0x5c/0xe0
[ 6646.592359] do_syscall_64+0x73/0x130
[ 6646.592363] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.592365] RIP: 0033:0x47a170
[ 6646.592366] RSP: 002b:000000c006545278 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
[ 6646.592369] RAX: ffffffffffffffda RBX: 000000c00004a000 RCX: 000000000047a170
[ 6646.592370] RDX: 000000000000023f RSI: 000000c000b4b88b RDI: 0000000000000026
[ 6646.592372] RBP: 000000c0065452c8 R08: 0000000000000000 R09: 0000000000000000
[ 6646.592373] R10: 0000000000000000 R11: 0000000000000212 R12: 000000000000477e
[ 6646.592374] R13: 000000000000387b R14: 0000000000004785 R15: 000000c000b4b88b
[ 6646.592397] INFO: task WTJourn.Flusher:13661 blocked for more than 120 seconds.
[ 6646.592611] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.592812] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.593038] WTJourn.Flusher D 0 13661 11503 0x00000000
[ 6646.593040] Call Trace:
[ 6646.593044] __schedule+0x24e/0x880
[ 6646.593045] schedule+0x2c/0x80
[ 6646.593048] io_schedule+0x16/0x40
[ 6646.593049] wait_on_page_bit_common+0xd8/0x160
[ 6646.593053] ? page_cache_tree_insert+0xe0/0xe0
[ 6646.593055] __filemap_fdatawait_range+0xfa/0x160
[ 6646.593057] ? __filemap_fdatawrite_range+0xcf/0x100
[ 6646.593060] ? __filemap_fdatawrite_range+0xdb/0x100
[ 6646.593062] file_write_and_wait_range+0x86/0xb0
[ 6646.593094] xfs_file_fsync+0x5f/0x230 [xfs]
[ 6646.593100] vfs_fsync_range+0x51/0xb0
[ 6646.593104] do_fsync+0x3d/0x70
[ 6646.593107] SyS_fdatasync+0x13/0x20
[ 6646.593109] do_syscall_64+0x73/0x130
[ 6646.593112] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.593113] RIP: 0033:0x7fe3337432e7
[ 6646.593114] RSP: 002b:00007fe32c6c2400 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[ 6646.593116] RAX: ffffffffffffffda RBX: 0000000000000011 RCX: 00007fe3337432e7
[ 6646.593117] RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000011
[ 6646.593118] RBP: 00007fe32c6c2440 R08: 0000000000000000 R09: 0000000000000000
[ 6646.593119] R10: 0000000000000020 R11: 0000000000000293 R12: 0000560acb599f40
[ 6646.593120] R13: 0000560ac884130b R14: 0000000000000000 R15: 0000560acdf52158
[ 6646.593235] INFO: task xfsaild/rbd13:20644 blocked for more than 120 seconds.
[ 6646.593444] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.593640] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.593975] xfsaild/rbd13 D 0 20644 2 0x80000000
[ 6646.593978] Call Trace:
[ 6646.593981] __schedule+0x24e/0x880
[ 6646.593983] ? lock_timer_base+0x6b/0x90
[ 6646.593986] schedule+0x2c/0x80
[ 6646.594020] _xfs_log_force+0x159/0x2a0 [xfs]
[ 6646.594023] ? wake_up_q+0x80/0x80
[ 6646.594055] ? xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.594087] xfs_log_force+0x2c/0x80 [xfs]
[ 6646.594120] xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.594123] kthread+0x121/0x140
[ 6646.594155] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.594157] ? kthread+0x121/0x140
[ 6646.594189] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.594192] ? kthread_create_worker_on_cpu+0x70/0x70
[ 6646.594195] ret_from_fork+0x35/0x40
[ 6646.594233] INFO: task mongod:28103 blocked for more than 120 seconds.
[ 6646.594423] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.594616] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.594839] mongod D 0 28103 27425 0x00000000
[ 6646.594841] Call Trace:
[ 6646.594843] __schedule+0x24e/0x880
[ 6646.594845] schedule+0x2c/0x80
[ 6646.594847] io_schedule+0x16/0x40
[ 6646.594848] wait_on_page_bit+0xf4/0x130
[ 6646.594851] ? page_cache_tree_insert+0xe0/0xe0
[ 6646.594853] wait_for_stable_page+0x61/0x80
[ 6646.594855] grab_cache_page_write_begin+0x37/0x40
[ 6646.594856] iomap_write_begin.constprop.18+0x5b/0x140
[ 6646.594858] iomap_write_actor+0x92/0x170
[ 6646.594860] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594862] iomap_apply+0xa5/0x120
[ 6646.594864] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594865] iomap_file_buffered_write+0x6e/0xa0
[ 6646.594866] ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594898] xfs_file_buffered_aio_write+0xca/0x290 [xfs]
[ 6646.594931] xfs_file_write_iter+0xac/0x160 [xfs]
[ 6646.594933] new_sync_write+0xe7/0x140
[ 6646.594935] __vfs_write+0x29/0x40
[ 6646.594937] vfs_write+0xb1/0x1a0
[ 6646.594940] SyS_pwrite64+0x95/0xb0
[ 6646.594942] do_syscall_64+0x73/0x130
[ 6646.594945] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.594946] RIP: 0033:0x7f9b8d65e963
[ 6646.594947] RSP: 002b:00007f9b8b2a9970 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[ 6646.594949] RAX: ffffffffffffffda RBX: 0000000000000100 RCX: 00007f9b8d65e963
[ 6646.594949] RDX: 0000000000000100 RSI: 000055e0a90fd000 RDI: 000000000000000f
[ 6646.594950] RBP: 00007f9b8b2a99c0 R08: 000055e0a90fd000 R09: 0000000000000100
[ 6646.594951] R10: 0000000000064a80 R11: 0000000000000293 R12: 0000000000000100
[ 6646.594952] R13: 000055e0a90fd000 R14: 000055e0a62f4be0 R15: 0000000000064a80
[ 6646.594956] INFO: task WTJourn.Flusher:28149 blocked for more than 120 seconds.
[ 6646.595166] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.595367] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.595590] WTJourn.Flusher D 0 28149 27425 0x00000000
[ 6646.595592] Call Trace:
[ 6646.595595] __schedule+0x24e/0x880
[ 6646.595597] schedule+0x2c/0x80
[ 6646.595598] io_schedule+0x16/0x40
[ 6646.595600] wait_on_page_bit_common+0xd8/0x160
[ 6646.595602] ? page_cache_tree_insert+0xe0/0xe0
[ 6646.595604] __filemap_fdatawait_range+0xfa/0x160
[ 6646.595606] ? __filemap_fdatawrite_range+0xcf/0x100
[ 6646.595607] ? __filemap_fdatawrite_range+0xdb/0x100
[ 6646.595609] file_write_and_wait_range+0x86/0xb0
[ 6646.595640] xfs_file_fsync+0x5f/0x230 [xfs]
[ 6646.595644] vfs_fsync_range+0x51/0xb0
[ 6646.595646] do_fsync+0x3d/0x70
[ 6646.595649] SyS_fdatasync+0x13/0x20
[ 6646.595651] do_syscall_64+0x73/0x130
[ 6646.595654] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.595655] RIP: 0033:0x7f9b8d39060d
[ 6646.595656] RSP: 002b:00007f9b882a3690 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[ 6646.595657] RAX: ffffffffffffffda RBX: 000055e0a631b840 RCX: 00007f9b8d39060d
[ 6646.595658] RDX: 000055e0a60bc8d0 RSI: 000000000000000f RDI: 000000000000000f
[ 6646.595659] RBP: 00007f9b882a36c0 R08: 0000000000000020 R09: 0000000000000020
[ 6646.595660] R10: 6769546465726957 R11: 0000000000000293 R12: 000055e0a60bc8d0
[ 6646.595661] R13: 000055e0a3172ef1 R14: 000055e0a60cc138 R15: 0000000000000000
[ 6646.595879] INFO: task msgr-worker-1:14875 blocked for more than 120 seconds.
[ 6646.596083] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.596277] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.596631] msgr-worker-1 D 0 14875 14766 0x00000000
[ 6646.596634] Call Trace:
[ 6646.596637] __schedule+0x24e/0x880
[ 6646.596639] schedule+0x2c/0x80
[ 6646.596641] rwsem_down_read_failed+0xf0/0x160
[ 6646.596645] call_rwsem_down_read_failed+0x18/0x30
[ 6646.596647] ? call_rwsem_down_read_failed+0x18/0x30
[ 6646.596649] down_read+0x20/0x40
[ 6646.596654] __do_page_fault+0x40a/0x4b0
[ 6646.596657] ? vfs_read+0x115/0x130
[ 6646.596659] do_page_fault+0x2e/0xe0
[ 6646.596662] ? page_fault+0x2f/0x50
[ 6646.596664] page_fault+0x45/0x50
[ 6646.596665] RIP: 0033:0x7f2fe6e16628
[ 6646.596666] RSP: 002b:00007f2fe0bdf250 EFLAGS: 00010206
[ 6646.596672] RAX: 00005585a80de000 RBX: 00000000000012ef RCX: 00005585a5208e80
[ 6646.596673] RDX: 0000000000002000 RSI: 0000000000000fff RDI: 00005585a5208880
[ 6646.596675] RBP: 0000000000001000 R08: 00007f2fe71e9f60 R09: 0000000000000060
[ 6646.596676] R10: 00007f2fe0bdf420 R11: 000000000000004a R12: 0000000000001000
[ 6646.596678] R13: 0000000000000030 R14: 00000000000012ef R15: 0000000000000000
[ 6646.596681] INFO: task log:14885 blocked for more than 120 seconds.
[ 6646.596865] Tainted: G I 4.15.0-91-generic #92-Ubuntu
[ 6646.597060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.597408] log D 0 14885 14766 0x00000000
[ 6646.597419] Call Trace:
[ 6646.597422] __schedule+0x24e/0x880
[ 6646.597424] schedule+0x2c/0x80
[ 6646.597426] schedule_preempt_disabled+0xe/0x10
[ 6646.597427] __mutex_lock.isra.5+0x276/0x4e0
[ 6646.597430] __mutex_lock_slowpath+0x13/0x20
[ 6646.597431] ? __mutex_lock_slowpath+0x13/0x20
[ 6646.597433] mutex_lock+0x2f/0x40
[ 6646.597467] xfs_reclaim_inodes_ag+0x2b5/0x340 [xfs]
[ 6646.597471] ? shrink_page_list+0x3e4/0xbc0
[ 6646.597474] ? radix_tree_gang_lookup_tag+0xd9/0x160
[ 6646.597478] ? __list_lru_walk_one.isra.5+0x37/0x140
[ 6646.597482] ? iput+0x230/0x230
[ 6646.597513] xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[ 6646.597545] xfs_fs_free_cached_objects+0x19/0x20 [xfs]
[ 6646.597549] super_cache_scan+0x165/0x1b0
[ 6646.597551] shrink_slab.part.51+0x1e7/0x440
[ 6646.597554] shrink_slab+0x29/0x30
[ 6646.597555] shrink_node+0x11e/0x300
[ 6646.597558] do_try_to_free_pages+0xc9/0x330
[ 6646.597560] try_to_free_mem_cgroup_pages+0xfa/0x1e0
[ 6646.597565] try_charge+0x245/0x6a0
[ 6646.597567] mem_cgroup_try_charge+0x93/0x180
[ 6646.597571] __handle_mm_fault+0x8de/0x1290
[ 6646.597573] handle_mm_fault+0xb1/0x210
[ 6646.597576] __do_page_fault+0x281/0x4b0
[ 6646.597578] ? SyS_futex+0x13b/0x180
[ 6646.597580] do_page_fault+0x2e/0xe0
[ 6646.597583] ? page_fault+0x2f/0x50
[ 6646.597585] page_fault+0x45/0x50
[ 6646.597588] RIP: 0033:0x55859c29203e
[ 6646.597589] RSP: 002b:00007f2fdfbdda70 EFLAGS: 00010206
[ 6646.597590] RAX: 000055859cc89e60 RBX: 00005585a5ce6200 RCX: 00005585a5ce6240
[ 6646.597591] RDX: 00005585a59ae800 RSI: 00005585a684404f RDI: f0f0f0f0f0f0f0f1
[ 6646.597592] RBP: 000000000000000f R08: 0000000000000000 R09: 0000000000000000
[ 6646.597593] R10: 0000000000000000 R11: 00007f2fe3a63610 R12: 00005585a6844440
[ 6646.597593] R13: 000000000000000a R14: 00005585a59dc780 R15: 15fde929947b3030
[ 7202.983375] libceph: osd2 down
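The log above repeats the same kind of report for many tasks. As a rough reading aid, the Python sketch below pulls the task name and PID out of each "blocked for more than N seconds" line and counts the reports per task name. It is only an illustration: the function names are hypothetical, and the two sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the hung-task header line printed by the kernel, for example:
# "INFO: task containerd:18390 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)


def hung_tasks(log_text):
    """Return [(comm, pid), ...] for every hung-task report in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")))
        for m in HUNG_RE.finditer(log_text)
    ]


def count_by_name(log_text):
    """Count hung-task reports per task name."""
    return Counter(comm for comm, _ in hung_tasks(log_text))


SAMPLE = (
    "[ 6646.588531] INFO: task containerd:18390 blocked for more than 120 seconds.\n"
    "[ 6646.590540] INFO: task xfsaild/rbd2:8223 blocked for more than 120 seconds.\n"
)

for comm, pid in hung_tasks(SAMPLE):
    print(comm, pid)
```

To run it over a real log, pass the saved dmesg output to hung_tasks or count_by_name instead of SAMPLE.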
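This problem occurs when a CephBlockPool is used and the storage class's filesystem is set to xfs.
It still occurs as of now (2020/03/22) with rook-ceph v1.2 and ceph v14.2.8, even though these are the latest versions.
I am currently running kernel 4.15.0-91-generic, and there are reports that a later kernel version includes a patch.
For now I will use ext4.
See the issue below for details.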
www.github.com/rook/rook/issues/3132#issuecomment-580508760
The following is satoru-takeuchi's comment.
- Feb 4, 2020: Fixed the description based on the comments in #4802 .
I got an answer from a Ceph kernel guy. Here is the summary from the user's point of view.
- When does this problem happen?
  - Making an XFS filesystem on the top of RBD and NDB.
  - Its possibility gets higher when RBD client and OSD daemon are co-located.
- How to bypass this problem?
  - Use ext4 or any other filesystems rather than XFS. Filesystem type can be specified with csi.storage.k8s.io/fstype in StorageClass resource.
- Will this problem be fixed?
  - Yes, with the following two fixes
    - Linux kernel 5.6 (not released yet), that includes the following patch
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d19f1c8e1937baf74e1962aae9f90fa3aeab463
    - Ceph with a fix for this problem. This fix uses a feature that is introduced by the above-mentioned patch. The Ceph community will probably become discuss this fix after releasing Linux v5.6.
- Why does this problem happen?
  - The deadlock of the following logic in the kernel.
    - XFS will start pruning its caches, taking filesystem locks and kicking off I/O on rbd
    - Memory allocation request(s) made by the OSD(s) to service that I/O may recurse back onto the same XFS, needing the same locks.
For more information, please refer to the following URL if you're interested in the detailed kernel logic.
https://marc.info/?l=ceph-devel&m=158029603909623&w=2
@BlaineEXE So, how should we deal with this Rook's issue. My idea is the followings.
- Close this issue since it's not a Rook's problem.
- Tell this information to Ceph's issue that you created, not to forget to fix Ceph side.
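The workaround named in the comment, setting csi.storage.k8s.io/fstype in the StorageClass, can be sketched in Python as below. This is an illustration under assumptions, not the configuration used in this post. The StorageClass name, provisioner, clusterID and pool values are placeholders that should be replaced with the values from your own cluster, and with_fstype is a hypothetical helper. Because the parameters of an existing StorageClass cannot be changed, the sketch produces a copy under a new name rather than editing the original.

```python
import copy
import json

FSTYPE_KEY = "csi.storage.k8s.io/fstype"


def with_fstype(storage_class, fstype, new_name):
    """Return a copy of a StorageClass dict with a new name and fstype."""
    sc = copy.deepcopy(storage_class)
    # Start from clean metadata so server-side fields are not carried over.
    sc["metadata"] = {"name": new_name}
    sc.setdefault("parameters", {})[FSTYPE_KEY] = fstype
    return sc


# Placeholder StorageClass; every value here is an assumption.
example = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "rook-ceph-block"},
    "provisioner": "rook-ceph.rbd.csi.ceph.com",
    "parameters": {
        "clusterID": "rook-ceph",
        "pool": "replicapool",
        FSTYPE_KEY: "xfs",
    },
}

# kubectl accepts JSON as well as YAML, so the output can be applied as-is
# once the placeholder values have been replaced.
print(json.dumps(with_fstype(example, "ext4", "rook-ceph-block-ext4"), indent=2))
```

Volumes that were already provisioned with XFS keep their filesystem; only volumes created from the new StorageClass would be formatted as ext4.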