
Be careful when using XFS with Ceph!!

by Joseph.Lee 2020. 3. 22.

After installing Ceph, I ran into a problem where the server died while it was running.

 

Here is what happened:

 

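1. Everything works right after startup, but the problem appears once some I/O load builds up.

2. When it happens, certain operations hang and make no progress at all (deadlock).

-> The ps aux command is one example.

Running ps aux prints part of the process list and then freezes. The culprit is the process that should have been printed next (the one right after the last PID shown).

Commands such as cat /proc/(problem pid)/cmdline hang as well.

3. The system cannot be shut down cleanly.

-> This appears to be because the filesystems cannot be unmounted.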
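
Since ps aux itself freezes, a small script that only reads /proc/<pid>/stat can help spot the stuck processes. The following is a minimal sketch, not something I ran on the affected server: the helper names parse_stat_line and find_blocked_tasks are made up for illustration, and it assumes that reading the stat file does not block the way cmdline does, which I have not verified in this deadlock. It lists processes in state D (uninterruptible sleep), the same state shown in the kernel log below.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def find_blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no procfs here (not Linux, or not mounted)
    blocked = []
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process exited, or the stat line was unreadable
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in find_blocked_tasks():
        print(pid, comm)
```

The script deliberately avoids cmdline, because that is the file that hung in my case.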

 

The kernel log is shown below.

[ 5092.890984] libceph: osd2 up
[ 6646.588531] INFO: task containerd:18390 blocked for more than 120 seconds.
[ 6646.588868]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.589070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.589294] containerd      D    0 18390      1 0x00000000
[ 6646.589297] Call Trace:
[ 6646.589306]  __schedule+0x24e/0x880
[ 6646.589311]  ? __memcg_init_list_lru_node+0x70/0xd0
[ 6646.589313]  schedule+0x2c/0x80
[ 6646.589316]  rwsem_down_write_failed+0x1ea/0x360
[ 6646.589319]  ? ida_get_new_above+0x110/0x320
[ 6646.589325]  call_rwsem_down_write_failed+0x17/0x30
[ 6646.589326]  ? call_rwsem_down_write_failed+0x17/0x30
[ 6646.589328]  down_write+0x2d/0x40
[ 6646.589332]  register_shrinker_prepared+0x19/0x50
[ 6646.589336]  sget_userns+0x419/0x490
[ 6646.589338]  ? get_anon_bdev+0x100/0x100
[ 6646.589340]  sget+0x7d/0xa0
[ 6646.589342]  ? get_anon_bdev+0x100/0x100
[ 6646.589346]  ? ovl_posix_acl_xattr_set+0x300/0x300 [overlay]
[ 6646.589348]  mount_nodev+0x30/0xa0
[ 6646.589351]  ovl_mount+0x18/0x20 [overlay]
[ 6646.589353]  mount_fs+0x37/0x160
[ 6646.589357]  vfs_kern_mount.part.24+0x5d/0x110
[ 6646.589359]  do_mount+0x5ed/0xce0
[ 6646.589361]  ? copy_mount_options+0x2c/0x220
[ 6646.589363]  SyS_mount+0x98/0xe0
[ 6646.589367]  do_syscall_64+0x73/0x130
[ 6646.589370]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.589371] RIP: 0033:0x55635d40f28a
[ 6646.589372] RSP: 002b:000000c00186e358 EFLAGS: 00000216 ORIG_RAX: 00000000000000a5
[ 6646.589374] RAX: ffffffffffffffda RBX: 000000c00004e000 RCX: 000055635d40f28a
[ 6646.589375] RDX: 000000c0014d8818 RSI: 000000c002ef45a0 RDI: 000000c0014d8810
[ 6646.589376] RBP: 000000c00186e3f0 R08: 000000c0038e8b00 R09: 0000000000000000
[ 6646.589377] R10: 0000000000000000 R11: 0000000000000216 R12: ffffffffffffffff
[ 6646.589378] R13: 0000000000000010 R14: 000000000000000f R15: 0000000000000055
[ 6646.589400] INFO: task containerd:16952 blocked for more than 120 seconds.
[ 6646.589598]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.589790] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.590013] containerd      D    0 16952      1 0x00000000
[ 6646.590016] Call Trace:
[ 6646.590020]  __schedule+0x24e/0x880
[ 6646.590023]  ? __memcg_init_list_lru_node+0x70/0xd0
[ 6646.590025]  schedule+0x2c/0x80
[ 6646.590028]  rwsem_down_write_failed+0x1ea/0x360
[ 6646.590031]  ? ida_get_new_above+0x110/0x320
[ 6646.590035]  call_rwsem_down_write_failed+0x17/0x30
[ 6646.590037]  ? call_rwsem_down_write_failed+0x17/0x30
[ 6646.590040]  down_write+0x2d/0x40
[ 6646.590043]  register_shrinker_prepared+0x19/0x50
[ 6646.590046]  sget_userns+0x419/0x490
[ 6646.590048]  ? get_anon_bdev+0x100/0x100
[ 6646.590052]  sget+0x7d/0xa0
[ 6646.590054]  ? get_anon_bdev+0x100/0x100
[ 6646.590059]  ? ovl_posix_acl_xattr_set+0x300/0x300 [overlay]
[ 6646.590062]  mount_nodev+0x30/0xa0
[ 6646.590067]  ovl_mount+0x18/0x20 [overlay]
[ 6646.590070]  mount_fs+0x37/0x160
[ 6646.590073]  vfs_kern_mount.part.24+0x5d/0x110
[ 6646.590076]  do_mount+0x5ed/0xce0
[ 6646.590079]  ? copy_mount_options+0x2c/0x220
[ 6646.590082]  SyS_mount+0x98/0xe0
[ 6646.590085]  do_syscall_64+0x73/0x130
[ 6646.590088]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.590090] RIP: 0033:0x55635d40f28a
[ 6646.590091] RSP: 002b:000000c0018725b0 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[ 6646.590094] RAX: ffffffffffffffda RBX: 000000c000054f00 RCX: 000055635d40f28a
[ 6646.590095] RDX: 000000c0030c7ce8 RSI: 000000c00148bf60 RDI: 000000c0030c7ce0
[ 6646.590096] RBP: 000000c001872648 R08: 000000c00300f680 R09: 0000000000000000
[ 6646.590098] R10: 0000000000000000 R11: 0000000000000202 R12: ffffffffffffffff
[ 6646.590099] R13: 00000000000000fc R14: 00000000000000fb R15: 0000000000000100
[ 6646.590540] INFO: task xfsaild/rbd2:8223 blocked for more than 120 seconds.
[ 6646.590743]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.590939] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.591165] xfsaild/rbd2    D    0  8223      2 0x80000000
[ 6646.591168] Call Trace:
[ 6646.591171]  __schedule+0x24e/0x880
[ 6646.591175]  ? lock_timer_base+0x6b/0x90
[ 6646.591178]  schedule+0x2c/0x80
[ 6646.591245]  _xfs_log_force+0x159/0x2a0 [xfs]
[ 6646.591250]  ? wake_up_q+0x80/0x80
[ 6646.591301]  ? xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.591350]  xfs_log_force+0x2c/0x80 [xfs]
[ 6646.591400]  xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.591404]  ? __schedule+0x256/0x880
[ 6646.591408]  kthread+0x121/0x140
[ 6646.591458]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.591460]  ? kthread+0x121/0x140
[ 6646.591510]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.591513]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 6646.591517]  ret_from_fork+0x35/0x40
[ 6646.591539] INFO: task prometheus:10749 blocked for more than 120 seconds.
[ 6646.591740]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.591936] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.592162] prometheus      D    0 10749  10332 0x00000000
[ 6646.592164] Call Trace:
[ 6646.592168]  __schedule+0x24e/0x880
[ 6646.592171]  schedule+0x2c/0x80
[ 6646.592173]  io_schedule+0x16/0x40
[ 6646.592176]  wait_on_page_bit+0xf4/0x130
[ 6646.592182]  ? page_cache_tree_insert+0xe0/0xe0
[ 6646.592186]  wait_for_stable_page+0x61/0x80
[ 6646.592188]  grab_cache_page_write_begin+0x37/0x40
[ 6646.592192]  iomap_write_begin.constprop.18+0x5b/0x140
[ 6646.592195]  iomap_write_actor+0x92/0x170
[ 6646.592198]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592200]  iomap_apply+0xa5/0x120
[ 6646.592203]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592205]  iomap_file_buffered_write+0x6e/0xa0
[ 6646.592207]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.592256]  xfs_file_buffered_aio_write+0xca/0x290 [xfs]
[ 6646.592261]  ? sock_read_iter+0x8f/0xf0
[ 6646.592309]  xfs_file_write_iter+0xac/0x160 [xfs]
[ 6646.592347]  new_sync_write+0xe7/0x140
[ 6646.592350]  __vfs_write+0x29/0x40
[ 6646.592353]  vfs_write+0xb1/0x1a0
[ 6646.592355]  SyS_write+0x5c/0xe0
[ 6646.592359]  do_syscall_64+0x73/0x130
[ 6646.592363]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.592365] RIP: 0033:0x47a170
[ 6646.592366] RSP: 002b:000000c006545278 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
[ 6646.592369] RAX: ffffffffffffffda RBX: 000000c00004a000 RCX: 000000000047a170
[ 6646.592370] RDX: 000000000000023f RSI: 000000c000b4b88b RDI: 0000000000000026
[ 6646.592372] RBP: 000000c0065452c8 R08: 0000000000000000 R09: 0000000000000000
[ 6646.592373] R10: 0000000000000000 R11: 0000000000000212 R12: 000000000000477e
[ 6646.592374] R13: 000000000000387b R14: 0000000000004785 R15: 000000c000b4b88b
[ 6646.592397] INFO: task WTJourn.Flusher:13661 blocked for more than 120 seconds.
[ 6646.592611]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.592812] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.593038] WTJourn.Flusher D    0 13661  11503 0x00000000
[ 6646.593040] Call Trace:
[ 6646.593044]  __schedule+0x24e/0x880
[ 6646.593045]  schedule+0x2c/0x80
[ 6646.593048]  io_schedule+0x16/0x40
[ 6646.593049]  wait_on_page_bit_common+0xd8/0x160
[ 6646.593053]  ? page_cache_tree_insert+0xe0/0xe0
[ 6646.593055]  __filemap_fdatawait_range+0xfa/0x160
[ 6646.593057]  ? __filemap_fdatawrite_range+0xcf/0x100
[ 6646.593060]  ? __filemap_fdatawrite_range+0xdb/0x100
[ 6646.593062]  file_write_and_wait_range+0x86/0xb0
[ 6646.593094]  xfs_file_fsync+0x5f/0x230 [xfs]
[ 6646.593100]  vfs_fsync_range+0x51/0xb0
[ 6646.593104]  do_fsync+0x3d/0x70
[ 6646.593107]  SyS_fdatasync+0x13/0x20
[ 6646.593109]  do_syscall_64+0x73/0x130
[ 6646.593112]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.593113] RIP: 0033:0x7fe3337432e7
[ 6646.593114] RSP: 002b:00007fe32c6c2400 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[ 6646.593116] RAX: ffffffffffffffda RBX: 0000000000000011 RCX: 00007fe3337432e7
[ 6646.593117] RDX: 0000000000000000 RSI: 0000000000000011 RDI: 0000000000000011
[ 6646.593118] RBP: 00007fe32c6c2440 R08: 0000000000000000 R09: 0000000000000000
[ 6646.593119] R10: 0000000000000020 R11: 0000000000000293 R12: 0000560acb599f40
[ 6646.593120] R13: 0000560ac884130b R14: 0000000000000000 R15: 0000560acdf52158
[ 6646.593235] INFO: task xfsaild/rbd13:20644 blocked for more than 120 seconds.
[ 6646.593444]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.593640] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.593975] xfsaild/rbd13   D    0 20644      2 0x80000000
[ 6646.593978] Call Trace:
[ 6646.593981]  __schedule+0x24e/0x880
[ 6646.593983]  ? lock_timer_base+0x6b/0x90
[ 6646.593986]  schedule+0x2c/0x80
[ 6646.594020]  _xfs_log_force+0x159/0x2a0 [xfs]
[ 6646.594023]  ? wake_up_q+0x80/0x80
[ 6646.594055]  ? xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.594087]  xfs_log_force+0x2c/0x80 [xfs]
[ 6646.594120]  xfsaild+0x1b6/0x7e0 [xfs]
[ 6646.594123]  kthread+0x121/0x140
[ 6646.594155]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.594157]  ? kthread+0x121/0x140
[ 6646.594189]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[ 6646.594192]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 6646.594195]  ret_from_fork+0x35/0x40
[ 6646.594233] INFO: task mongod:28103 blocked for more than 120 seconds.
[ 6646.594423]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.594616] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.594839] mongod          D    0 28103  27425 0x00000000
[ 6646.594841] Call Trace:
[ 6646.594843]  __schedule+0x24e/0x880
[ 6646.594845]  schedule+0x2c/0x80
[ 6646.594847]  io_schedule+0x16/0x40
[ 6646.594848]  wait_on_page_bit+0xf4/0x130
[ 6646.594851]  ? page_cache_tree_insert+0xe0/0xe0
[ 6646.594853]  wait_for_stable_page+0x61/0x80
[ 6646.594855]  grab_cache_page_write_begin+0x37/0x40
[ 6646.594856]  iomap_write_begin.constprop.18+0x5b/0x140
[ 6646.594858]  iomap_write_actor+0x92/0x170
[ 6646.594860]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594862]  iomap_apply+0xa5/0x120
[ 6646.594864]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594865]  iomap_file_buffered_write+0x6e/0xa0
[ 6646.594866]  ? iomap_write_begin.constprop.18+0x140/0x140
[ 6646.594898]  xfs_file_buffered_aio_write+0xca/0x290 [xfs]
[ 6646.594931]  xfs_file_write_iter+0xac/0x160 [xfs]
[ 6646.594933]  new_sync_write+0xe7/0x140
[ 6646.594935]  __vfs_write+0x29/0x40
[ 6646.594937]  vfs_write+0xb1/0x1a0
[ 6646.594940]  SyS_pwrite64+0x95/0xb0
[ 6646.594942]  do_syscall_64+0x73/0x130
[ 6646.594945]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.594946] RIP: 0033:0x7f9b8d65e963
[ 6646.594947] RSP: 002b:00007f9b8b2a9970 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[ 6646.594949] RAX: ffffffffffffffda RBX: 0000000000000100 RCX: 00007f9b8d65e963
[ 6646.594949] RDX: 0000000000000100 RSI: 000055e0a90fd000 RDI: 000000000000000f
[ 6646.594950] RBP: 00007f9b8b2a99c0 R08: 000055e0a90fd000 R09: 0000000000000100
[ 6646.594951] R10: 0000000000064a80 R11: 0000000000000293 R12: 0000000000000100
[ 6646.594952] R13: 000055e0a90fd000 R14: 000055e0a62f4be0 R15: 0000000000064a80
[ 6646.594956] INFO: task WTJourn.Flusher:28149 blocked for more than 120 seconds.
[ 6646.595166]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.595367] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.595590] WTJourn.Flusher D    0 28149  27425 0x00000000
[ 6646.595592] Call Trace:
[ 6646.595595]  __schedule+0x24e/0x880
[ 6646.595597]  schedule+0x2c/0x80
[ 6646.595598]  io_schedule+0x16/0x40
[ 6646.595600]  wait_on_page_bit_common+0xd8/0x160
[ 6646.595602]  ? page_cache_tree_insert+0xe0/0xe0
[ 6646.595604]  __filemap_fdatawait_range+0xfa/0x160
[ 6646.595606]  ? __filemap_fdatawrite_range+0xcf/0x100
[ 6646.595607]  ? __filemap_fdatawrite_range+0xdb/0x100
[ 6646.595609]  file_write_and_wait_range+0x86/0xb0
[ 6646.595640]  xfs_file_fsync+0x5f/0x230 [xfs]
[ 6646.595644]  vfs_fsync_range+0x51/0xb0
[ 6646.595646]  do_fsync+0x3d/0x70
[ 6646.595649]  SyS_fdatasync+0x13/0x20
[ 6646.595651]  do_syscall_64+0x73/0x130
[ 6646.595654]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 6646.595655] RIP: 0033:0x7f9b8d39060d
[ 6646.595656] RSP: 002b:00007f9b882a3690 EFLAGS: 00000293 ORIG_RAX: 000000000000004b
[ 6646.595657] RAX: ffffffffffffffda RBX: 000055e0a631b840 RCX: 00007f9b8d39060d
[ 6646.595658] RDX: 000055e0a60bc8d0 RSI: 000000000000000f RDI: 000000000000000f
[ 6646.595659] RBP: 00007f9b882a36c0 R08: 0000000000000020 R09: 0000000000000020
[ 6646.595660] R10: 6769546465726957 R11: 0000000000000293 R12: 000055e0a60bc8d0
[ 6646.595661] R13: 000055e0a3172ef1 R14: 000055e0a60cc138 R15: 0000000000000000
[ 6646.595879] INFO: task msgr-worker-1:14875 blocked for more than 120 seconds.
[ 6646.596083]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.596277] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.596631] msgr-worker-1   D    0 14875  14766 0x00000000
[ 6646.596634] Call Trace:
[ 6646.596637]  __schedule+0x24e/0x880
[ 6646.596639]  schedule+0x2c/0x80
[ 6646.596641]  rwsem_down_read_failed+0xf0/0x160
[ 6646.596645]  call_rwsem_down_read_failed+0x18/0x30
[ 6646.596647]  ? call_rwsem_down_read_failed+0x18/0x30
[ 6646.596649]  down_read+0x20/0x40
[ 6646.596654]  __do_page_fault+0x40a/0x4b0
[ 6646.596657]  ? vfs_read+0x115/0x130
[ 6646.596659]  do_page_fault+0x2e/0xe0
[ 6646.596662]  ? page_fault+0x2f/0x50
[ 6646.596664]  page_fault+0x45/0x50
[ 6646.596665] RIP: 0033:0x7f2fe6e16628
[ 6646.596666] RSP: 002b:00007f2fe0bdf250 EFLAGS: 00010206
[ 6646.596672] RAX: 00005585a80de000 RBX: 00000000000012ef RCX: 00005585a5208e80
[ 6646.596673] RDX: 0000000000002000 RSI: 0000000000000fff RDI: 00005585a5208880
[ 6646.596675] RBP: 0000000000001000 R08: 00007f2fe71e9f60 R09: 0000000000000060
[ 6646.596676] R10: 00007f2fe0bdf420 R11: 000000000000004a R12: 0000000000001000
[ 6646.596678] R13: 0000000000000030 R14: 00000000000012ef R15: 0000000000000000
[ 6646.596681] INFO: task log:14885 blocked for more than 120 seconds.
[ 6646.596865]       Tainted: G          I      4.15.0-91-generic #92-Ubuntu
[ 6646.597060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6646.597408] log             D    0 14885  14766 0x00000000
[ 6646.597419] Call Trace:
[ 6646.597422]  __schedule+0x24e/0x880
[ 6646.597424]  schedule+0x2c/0x80
[ 6646.597426]  schedule_preempt_disabled+0xe/0x10
[ 6646.597427]  __mutex_lock.isra.5+0x276/0x4e0
[ 6646.597430]  __mutex_lock_slowpath+0x13/0x20
[ 6646.597431]  ? __mutex_lock_slowpath+0x13/0x20
[ 6646.597433]  mutex_lock+0x2f/0x40
[ 6646.597467]  xfs_reclaim_inodes_ag+0x2b5/0x340 [xfs]
[ 6646.597471]  ? shrink_page_list+0x3e4/0xbc0
[ 6646.597474]  ? radix_tree_gang_lookup_tag+0xd9/0x160
[ 6646.597478]  ? __list_lru_walk_one.isra.5+0x37/0x140
[ 6646.597482]  ? iput+0x230/0x230
[ 6646.597513]  xfs_reclaim_inodes_nr+0x33/0x40 [xfs]
[ 6646.597545]  xfs_fs_free_cached_objects+0x19/0x20 [xfs]
[ 6646.597549]  super_cache_scan+0x165/0x1b0
[ 6646.597551]  shrink_slab.part.51+0x1e7/0x440
[ 6646.597554]  shrink_slab+0x29/0x30
[ 6646.597555]  shrink_node+0x11e/0x300
[ 6646.597558]  do_try_to_free_pages+0xc9/0x330
[ 6646.597560]  try_to_free_mem_cgroup_pages+0xfa/0x1e0
[ 6646.597565]  try_charge+0x245/0x6a0
[ 6646.597567]  mem_cgroup_try_charge+0x93/0x180
[ 6646.597571]  __handle_mm_fault+0x8de/0x1290
[ 6646.597573]  handle_mm_fault+0xb1/0x210
[ 6646.597576]  __do_page_fault+0x281/0x4b0
[ 6646.597578]  ? SyS_futex+0x13b/0x180
[ 6646.597580]  do_page_fault+0x2e/0xe0
[ 6646.597583]  ? page_fault+0x2f/0x50
[ 6646.597585]  page_fault+0x45/0x50
[ 6646.597588] RIP: 0033:0x55859c29203e
[ 6646.597589] RSP: 002b:00007f2fdfbdda70 EFLAGS: 00010206
[ 6646.597590] RAX: 000055859cc89e60 RBX: 00005585a5ce6200 RCX: 00005585a5ce6240
[ 6646.597591] RDX: 00005585a59ae800 RSI: 00005585a684404f RDI: f0f0f0f0f0f0f0f1
[ 6646.597592] RBP: 000000000000000f R08: 0000000000000000 R09: 0000000000000000
[ 6646.597593] R10: 0000000000000000 R11: 00007f2fe3a63610 R12: 00005585a6844440
[ 6646.597593] R13: 000000000000000a R14: 00005585a59dc780 R15: 15fde929947b3030
[ 7202.983375] libceph: osd2 down
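
The log is long, so it helps to pull out just the names of the blocked tasks. Here is a rough sketch that does this with a regular expression. It assumes the "INFO: task NAME:PID blocked for more than N seconds." format seen above, and the function name hung_tasks is hypothetical. The sample lines are copied from the log in this post.

```python
import re

# Matches the hung-task header line printed by the kernel, as seen in the log above.
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(log_text):
    """Return a list of (comm, pid) for every hung-task report in log_text."""
    return [
        (m.group("comm"), int(m.group("pid")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


sample = """\
[ 6646.588531] INFO: task containerd:18390 blocked for more than 120 seconds.
[ 6646.590540] INFO: task xfsaild/rbd2:8223 blocked for more than 120 seconds.
[ 6646.592397] INFO: task WTJourn.Flusher:13661 blocked for more than 120 seconds.
"""

for comm, pid in hung_tasks(sample):
    print(comm, pid)
```

Seeing xfsaild/rbdN among the blocked tasks is what points at XFS on top of RBD rather than at the applications themselves.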

 

This problem occurs when a CephBlockPool is used and the storage class's filesystem is set to xfs.

 

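As of today (2020/03/22), the problem still occurs even with rook-ceph v1.2 and ceph v14.2.8, which are the latest versions.

I am currently running kernel 4.15.0-91-generic, and there are reports that this has been patched in later kernel versions.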

 

For the related issue, see the link below.

 

For now, I will use ext4.

 

www.github.com/rook/rook/issues/3132#issuecomment-580508760

 

Very high CPU usage on Ceph OSDs (v1.0, v1.1) · Issue #3132 · rook/rook


The following is a comment by satoru-takeuchi.

  • Feb 4, 2020: Fixed the description based on the comments in #4802 .

I got an answer from a Ceph kernel guy. Here is the summary from the user's point of view.

  • When does this problem happen?

    • Making an XFS filesystem on the top of RBD and NBD.
    • Its possibility gets higher when RBD client and OSD daemon are co-located.
  • How to bypass this problem?

    • Use ext4 or any other filesystems rather than XFS. Filesystem type can be specified with csi.storage.k8s.io/fstype in StorageClass resource.
  • Will this problem be fixed?

  • Why does this problem happen?

    • The deadlock of the following logic in the kernel.
      • XFS will start pruning its caches, taking filesystem locks and kicking off I/O on rbd
      • Memory allocation request(s) made by the OSD(s) to service that I/O may recurse back onto the same XFS, needing the same locks.

For more information, please refer to the following URL if you're interested in the detailed kernel logic.

https://marc.info/?l=ceph-devel&m=158029603909623&w=2

@BlaineEXE So, how should we deal with this Rook's issue. My idea is the followings.

  • Close this issue since it's not a Rook's problem.
  • Tell this information to Ceph's issue that you created, not to forget to fix Ceph side.
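
The workaround in the comment, setting csi.storage.k8s.io/fstype in the StorageClass, could look roughly like the sketch below. It builds the manifest as a Python dict and prints it as JSON, which kubectl apply -f also accepts. Only the csi.storage.k8s.io/fstype key comes from the comment. The provisioner, clusterID, and pool values are assumptions based on common Rook examples, the function name rbd_storage_class is hypothetical, and the secret-related parameters a real Rook StorageClass needs are left out, so treat it as a starting point rather than a complete manifest.

```python
import json


def rbd_storage_class(name, fstype="ext4"):
    """Build a StorageClass manifest that formats RBD volumes with fstype."""
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # Assumed value: adjust to the provisioner your cluster actually uses.
        "provisioner": "rook-ceph.rbd.csi.ceph.com",
        "parameters": {
            "clusterID": "rook-ceph",  # assumed namespace of the Rook cluster
            "pool": "replicapool",     # assumed CephBlockPool name
            # The setting from the comment: use ext4 instead of xfs.
            "csi.storage.k8s.io/fstype": fstype,
            # Secret-related parameters are omitted in this sketch.
        },
        "reclaimPolicy": "Delete",
    }


manifest = rbd_storage_class("rook-ceph-block-ext4")
print(json.dumps(manifest, indent=2))
```

Changing a StorageClass does not reformat volumes that already exist, so existing XFS volumes would need to be recreated and their data migrated.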

 

 
