1 nearfull osd(s); 2 pool(s) nearfull
[WRN] OSD_NEARFULL: 1 nearfull osd(s)
osd.115 is near full
[WRN] POOL_NEARFULL: 2 pool(s) nearfull
pool 'rbd' is nearfull
pool 'device_health_metrics' is nearfull
root@sds-osd-302-01:~#
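For reference, OSD_NEARFULL fires as soon as any OSD crosses the cluster-wide nearfull ratio (0.85 by default, which matches osd.115 sitting at ~85% in the output below). The configured thresholds can be checked with:

ceph osd dump | grep -E 'full_ratio|backfillfull_ratio|nearfull_ratio'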
I checked the OSD info and saw a huge spread in utilization. For example:
root@sds-osd-302-01:~# ceph osd df |grep -P 'ID|115'
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
115 hdd 5.45799 1.00000 5.5 TiB 4.7 TiB 4.6 TiB 2.7 MiB 8.1 GiB 843 GiB 85.02 1.36 116 up
root@sds-osd-302-01:~# ceph osd df |grep -P 'ID|37'
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
37 hdd 5.45799 1.00000 5.5 TiB 2.5 TiB 2.5 TiB 1.7 MiB 4.3 GiB 3.0 TiB 45.67 0.73 62 up
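To see the whole spread at a glance instead of grepping individual OSDs, a quick one-liner works against the column layout shown above (an assumption: %USE stays the 17th whitespace-separated field of ceph osd df):

# print the least-full and the most-full OSD by %USE
ceph osd df | awk '$1 ~ /^[0-9]+$/ {print $17, "osd."$1}' | sort -n | sed -n '1p;$p'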
Is it possible to "even out" the OSDs by enabling, for example, Autoscaling placement groups?
https://docs.ceph.com/en/latest/rados/operations/placement-groups/#autoscaling-placement-groups
It is currently disabled.
Or is there some other way to do it?
Ceph version:
root@sds-osd-302-01:~# ceph -v
ceph version 15.2.14 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)
The R3 storage policy is used.
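Note that the PG autoscaler only adjusts pg_num per pool; a higher pg_num can indirectly smooth things out by making PGs smaller, but it does not target per-OSD utilization. The usual built-in way to even out OSD fill levels is the mgr balancer module in upmap mode. A minimal sketch (not from this thread; upmap requires all clients to speak at least the luminous protocol):

ceph features                                     # verify no pre-luminous clients are connected
ceph osd set-require-min-compat-client luminous   # required for upmap
ceph balancer mode upmap
ceph balancer on
ceph balancer status                              # data is then moved gradually in the background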
https://github.com/TheJJ/ceph-balancer
And what does it produce as output? A ready-made crushmap?
Here is the full output, I just ran it on my cluster:
# ./placementoptimizer.py -v balance --ensure-optimal-moves --ensure-variance-decrease
[2022-03-23 11:16:11,699] gathering cluster state via ceph api...
[2022-03-23 11:16:19,036] running pg balancer
[2022-03-23 11:16:19,045] current OSD fill rate per crushclasses:
[2022-03-23 11:16:19,046] ssd: average=60.77%, median=61.25%, without_placement_constraints=64.21%
[2022-03-23 11:16:19,047] cluster variance for crushclasses:
[2022-03-23 11:16:19,047] ssd: 1.863
[2022-03-23 11:16:19,047] min osd.24 58.009%
[2022-03-23 11:16:19,048] max osd.12 62.873%
[2022-03-23 11:16:19,052] SAVE move 55.7d osd.12 => osd.24
[2022-03-23 11:16:19,052] props: size=194.0G remapped=False upmaps=0
[2022-03-23 11:16:19,052] => variance new=1.4833433441621613 < 1.8626866905980985=old
[2022-03-23 11:16:19,052] new min osd.17 58.715%
[2022-03-23 11:16:19,053] max osd.11 62.449%
[2022-03-23 11:16:19,053] new cluster variance:
[2022-03-23 11:16:19,053] ssd: 1.483
[2022-03-23 11:16:19,076] in descending full-order, couldn't empty osd.11, so we're done. if you want to try more often, set --max-full-move-attempts=$nr, this may unlock more balancing possibilities.
[2022-03-23 11:16:19,076] --------------------------------------------------------------------------------
[2022-03-23 11:16:19,076] generated 1 remaps.
[2022-03-23 11:16:19,076] total movement size: 194.0G.
[2022-03-23 11:16:19,076] --------------------------------------------------------------------------------
[2022-03-23 11:16:19,077] old cluster variance per crushclass:
[2022-03-23 11:16:19,077] ssd: 1.863
[2022-03-23 11:16:19,077] old min osd.24 58.009%
[2022-03-23 11:16:19,077] old max osd.12 62.873%
[2022-03-23 11:16:19,077] --------------------------------------------------------------------------------
[2022-03-23 11:16:19,077] new min osd.17 58.715%
[2022-03-23 11:16:19,077] new max osd.11 62.449%
[2022-03-23 11:16:19,078] new cluster variance:
[2022-03-23 11:16:19,078] ssd: 1.483
[2022-03-23 11:16:19,078] --------------------------------------------------------------------------------
ceph osd pg-upmap-items 55.7d 12 24
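So it does not generate a crushmap: the tool emits ready-to-run ceph osd pg-upmap-items commands, one per proposed move (here, remap PG 55.7d from osd.12 to osd.24). A rough way to collect and apply them, assuming the timestamped log lines go to stderr and only the generated commands reach stdout (worth verifying on the first run):

./placementoptimizer.py -v balance --ensure-optimal-moves --ensure-variance-decrease > /tmp/balance-upmaps
cat /tmp/balance-upmaps    # review: should contain only "ceph osd pg-upmap-items ..." lines
bash /tmp/balance-upmaps   # apply the proposed remaps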