Setting up CEPH. Part 1
Take one datacenter hall, ten racks, BGP everywhere, a few dozen SSDs and an assortment of SAS disks of all colors and sizes, plus proxmox and the desire to put all of our static data into our own S3 storage. Not that all of this is strictly needed for virtualization, but once you start using opensource, you follow your hobby all the way. The only thing that bothered me was BGP. There is nothing in the world more helpless, irresponsible and immoral than internal BGP routing. And I knew that soon enough we would dive right into it.
The task was trivial — there was a CEPH cluster, and it did not work very well. It had to be made "good".
The cluster I inherited was heterogeneous, hastily tuned and essentially untuned. It consisted of two groups of different nodes, with one shared subnet acting both as the cluster network and as the public network. The nodes were stuffed with four types of disks — two types of SSD, collected into two separate placement rules, and two types of HDD of different sizes, collected into a third group. The problem of the different sizes was solved by different OSD weights.
The setup itself falls into two parts — tuning the operating system, and tuning CEPH itself and its settings.
Tuning the operating system
Network
It is hard to overstate how much the network affects both writes and latency. On a write, the client does not receive an acknowledgement of a successful write until the data replicas on the other placement-group members confirm success. And since the distribution of replicas in the CRUSH map is one replica per host, the network is always used.
Therefore, the first thing I decided to do was slightly tune the current network, while at the same time trying to convince everyone to move to separate networks.
To begin with, I fiddled with the network card settings. I started with the queues.

What we had:
root@ceph01:~# ethtool -l ens1f1
Channel parameters for ens1f1:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 63
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 1
root@ceph01:~# ethtool -g ens1f1
Ring parameters for ens1f1:
Pre-set maximums:
RX: 4096
RX Mini: 0
RX Jumbo: 0
TX: 4096
Current hardware settings:
RX: 256
RX Mini: 0
RX Jumbo: 0
TX: 256
You can see that the current parameters are far from the maximums. We increase them:
root@ceph01:~#ethtool -G ens1f0 rx 4096
root@ceph01:~#ethtool -G ens1f0 tx 4096
root@ceph01:~#ethtool -L ens1f0 combined 63
Then, following the advice of the same article, we increased the length of the transmit queue, txqueuelen, from 1000 to 10000:
root@ceph01:~#ip link set ens1f0 txqueuelen 10000
Well, and following ceph's own documentation, we raised the MTU to 9000:
root@ceph01:~#ip link set dev ens1f0 mtu 9000
And we added all of this to /etc/network/interfaces so that everything above gets applied at boot:
root@ceph01:~# cat /etc/network/interfaces
auto lo
iface lo inet loopback
auto ens1f0
iface ens1f0 inet manual
post-up /sbin/ethtool -G ens1f0 rx 4096
post-up /sbin/ethtool -G ens1f0 tx 4096
post-up /sbin/ethtool -L ens1f0 combined 63
post-up /sbin/ip link set ens1f0 txqueuelen 10000
mtu 9000
auto ens1f1
iface ens1f1 inet manual
post-up /sbin/ethtool -G ens1f1 rx 4096
post-up /sbin/ethtool -G ens1f1 tx 4096
post-up /sbin/ethtool -L ens1f1 combined 63
post-up /sbin/ip link set ens1f1 txqueuelen 10000
mtu 9000
After that, following the same article, I started thoughtfully twisting the knobs of the 4.15 kernel. Considering that the nodes have 128G of RAM, we ended up with this configuration file for sysctl:
cat /etc/sysctl.d/50-ceph.conf
net.core.rmem_max = 56623104
# Maximum receive buffer size for all connections: 54M
net.core.wmem_max = 56623104
# Maximum send buffer size for all connections: 54M
net.core.rmem_default = 56623104
# Default receive buffer size for all connections: 54M
net.core.wmem_default = 56623104
# Default send buffer size for all connections: 54M
# Per socket
net.ipv4.tcp_rmem = 4096 87380 56623104
# Vector of 3 integers (min, default, max) in tcp_rmem
# defining the size of the TCP socket receive buffer.
# Minimum: every TCP socket is entitled to use this much memory from
# the moment it is created. Use of a buffer of this size is guaranteed
# even when the limit threshold is reached (moderate memory pressure).
# The default minimum buffer size is 8 KB (8192).
# Default value: the amount of memory allowed for a TCP socket buffer
# by default. It is applied instead of /proc/sys/net/core/rmem_default,
# which is used by other protocols.
# The default buffer size is usually 87380 bytes. With the default
# tcp_adv_win_scale and tcp_app_win = 0 this defines a window of 65535,
# somewhat smaller than what the default tcp_app_win defines.
# Maximum: the maximum buffer size that can be automatically allocated
# for a TCP socket receive buffer. It does not override the maximum set
# in /proc/sys/net/core/rmem_max. With a "static" allocation of memory
# via SO_RCVBUF this parameter has no effect.
net.ipv4.tcp_wmem = 4096 65536 56623104
net.core.somaxconn = 5000
# Maximum number of open sockets waiting for a connection.
net.ipv4.tcp_timestamps=1
# Enables the use of timestamps, as per RFC 1323.
net.ipv4.tcp_sack=1
# Allow TCP selective acknowledgements.
net.core.netdev_max_backlog=5000
# (default: 1000) Maximum number of packets queued for processing when
# the interface receives packets faster than the kernel can process them.
net.ipv4.tcp_max_tw_buckets=262144
# Maximum number of sockets in the TIME-WAIT state at the same time.
# When this threshold is exceeded, the "extra" socket is destroyed and
# a message is written to the system log.
net.ipv4.tcp_tw_reuse=1
# Allow reuse of TIME-WAIT sockets in cases
# where the protocol considers it safe.
net.core.optmem_max=4194304
# Increase the maximum ancillary (option memory) buffer per socket.
# Measured in units of pages (4096 bytes).
net.ipv4.tcp_low_latency=1
# Tell the TCP/IP stack to prefer low latency
# over higher throughput.
net.ipv4.tcp_adv_win_scale=1
# This variable affects how the memory of the socket buffer is split
# between the TCP window size and the application buffer.
# If tcp_adv_win_scale is negative, the size is computed with the
# following expression:
# bytes - bytes/2^(-tcp_adv_win_scale)
# where bytes is the window size in bytes. If tcp_adv_win_scale is
# positive, the following expression is used instead:
# bytes - bytes/2^tcp_adv_win_scale
# The variable takes an integer value. The default is 2,
# i.e. ¼ of the space defined by the tcp_rmem variable is set aside
# for the application buffer.
net.ipv4.tcp_slow_start_after_idle=0
# Slow-start restart mechanism, which resets the congestion window
# if the connection has not been used for a given period of time.
# It is better to disable SSR on a server to improve the performance
# of long-lived connections.
net.ipv4.tcp_no_metrics_save=1
# Do not cache TCP connection metrics when the connection is closed.
net.ipv4.tcp_syncookies=0
# Disable the syncookie sending mechanism.
net.ipv4.tcp_ecn=0
# Explicit Congestion Notification for TCP connections. Used to signal
# congestion on the route to a given host or network. Can be used to
# tell the sending host that it needs to reduce its packet rate through
# a specific router or firewall.
net.ipv4.conf.all.send_redirects=0
# Disables sending ICMP redirects to other hosts. This option must be
# enabled if the host acts as a router of any kind.
# We have no routing here.
net.ipv4.ip_forward=0
# Related: disable forwarding. We are not a gateway, and we run no
# docker on these machines, so we do not need it.
net.ipv4.icmp_echo_ignore_broadcasts=1
# Do not answer ICMP ECHO requests sent as broadcast packets.
net.ipv4.tcp_fin_timeout=10
# How long to keep a socket in the FIN-WAIT-2 state after it has been
# closed by the local side. Default: 60.
net.core.netdev_budget=600
# (default: 300) If software interrupts do not run for long enough,
# the rate of incoming data can exceed the kernel's ability to drain
# the buffer. As a result the NIC buffers overflow and traffic is lost.
# Sometimes it is necessary to increase how long SoftIRQs (software
# interrupts) are allowed to run on the CPU; netdev_budget controls
# this. The default value is 300: the SoftIRQ handler will process up
# to 300 packets from the NIC before releasing the CPU.
net.ipv4.tcp_fastopen=3
# TFO, TCP Fast Open
# Works if both the client and the server support TFO, which they
# advertise via a special flag in the TCP packet. In our case it is a
# placebo — it just looks nice :)
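As a quick sanity check of the "54M" claims in the comments above (nothing Ceph-specific here, just the arithmetic):

```python
# Sanity check: 56623104 bytes is exactly 54 MiB, as the comments state.
size = 56623104
print(size == 54 * 1024 * 1024)  # True
print(size // (1024 * 1024))     # 54
```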
Meanwhile, we were allocated separate 10Gbps interfaces in a separate flat network. Each machine got mellanox 10/25 Gbps dual-port NICs, plugged into two separate 10Gbps switches. Aggregation was done via OSPF, since bonding with lacp for some reason showed a total throughput of at most 16 Gbps, while ospf successfully utilized both tens in full on each machine. Future plans were to use ROCE on these mellanoxes to reduce latency. How this part of the network was set up:
- Since the machines themselves have external IP addresses on BGP, we need software — (more precisely, at the time of writing it was frr=6.0-1), which was already up.
- In total, the machines had two physical network interfaces, each with two ports — 4 ports altogether. One network card looked at the fabric with its two ports, and BGP was configured on it; the second looked at two different switches with its two ports, and OSPF was configured on it.
More details on the OSPF setup: the main task is to aggregate two links and have fault tolerance.
The two network interfaces are configured into two simple flat networks — 10.10.10.0/24 and 10.10.20.0/24
1: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
inet 10.10.10.2/24 brd 10.10.10.255 scope global ens1f0
2: ens1f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
inet 10.10.20.2/24 brd 10.10.20.255 scope global ens1f1
Over these networks the machines see each other.
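For completeness, a minimal sketch of what the frr side of such an OSPF setup could look like. This config is an assumption, not taken from the article: the interface names and addresses come from the `ip` output above, while the area, router-id and point-to-point mode are illustrative.

```
! /etc/frr/frr.conf — hypothetical sketch (frr 6.x syntax)
interface ens1f0
 ip ospf network point-to-point
interface ens1f1
 ip ospf network point-to-point
router ospf
 ospf router-id 10.10.10.2
 network 10.10.10.0/24 area 0
 network 10.10.20.0/24 area 0
```

With both /24s announced into area 0 from every node, each machine sees two equal-cost paths to its peers, which is what lets OSPF use both 10G links in full.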
Disks
A lot of performance depends on the disk schedulers. For SSDs we set the scheduler to noop; for HDDs, deadline. In short, NOOP works on the principle of "first in, first out", which in English sounds like "FIFO (First In, First Out)": requests are queued in arrival order. DEADLINE is more read-oriented, and in addition the queued process gets almost exclusive access to the disk for the duration of the operation. This suits our system perfectly — after all, only one process works with each disk: the OSD daemon.
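The scheduler choice does not survive a reboot by itself; one common way to persist it (not from the original article — the rule file name and matching are an assumption) is a udev rule keyed on the rotational flag:

```
# /etc/udev/rules.d/60-io-scheduler.rules — hypothetical sketch
# rotational == 0 -> SSD -> noop; rotational == 1 -> HDD -> deadline
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="noop"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="deadline"
```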
(How to change the I/O scheduler on the fly can be looked up separately — material exists in both English and Russian.)
Along with the scheduler, it is also recommended to increase nr_requests.
nr_requests
The value of nr_requests determines the number of I/O requests that get buffered before the I/O scheduler sends/receives data to the block device. If you are using a RAID card or a block device whose queue depth is larger than the default nr_requests, raising nr_requests may help to improve throughput and reduce server load when large volumes of I/O occur on the server. If you are using Deadline or CFQ as the scheduler, it is suggested that you set nr_requests to 2 times the queue depth.
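The "two times the queue depth" rule of thumb can be sketched like this (the device name and the depth of 128 are assumed for illustration; on a real host the depth comes from /sys/block/&lt;dev&gt;/device/queue_depth):

```shell
# Hypothetical example: derive nr_requests from the device queue depth.
queue_depth=128                    # assumed; on a real host: cat /sys/block/sdb/device/queue_depth
nr_requests=$((queue_depth * 2))   # the suggested 2x factor for deadline/CFQ
echo "$nr_requests"                # the value to write into /sys/block/sdb/queue/nr_requests
```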
But! The citizens themselves, the developers of CEPH, assure us that their priority system works better:

WBThrottle and/or nr_requests
The file storage uses buffered I/O for writing; this brings a number of advantages if the file-storage journal is on faster media. Client requests are acknowledged as soon as the data is written to the journal, and are then flushed to the data disk itself at a later time using standard linux functionality. This makes it possible for spindle OSDs to provide write latency similar to SSDs for small bursts of writes. This delayed write-back also lets the kernel itself reorder the disk I/O requests, in the hope of either merging them together or letting the existing disk heads pick a more optimal path across their platters. The net effect is that you can squeeze a bit more I/O out of each disk than would be possible with direct or synchronous I/O.
However, a certain problem arises if the volume of incoming writes exceeds the capabilities of the underlying disks. In this scenario, the total number of pending I/O operations waiting to be written to disk can grow uncontrollably, and the resulting I/O queue can swamp the whole disk queue and Ceph's queues. Read requests suffer particularly badly, because they get stuck between the write requests, which can take several seconds to be flushed to the primary disk.
To defeat this problem, Ceph has a write-back throttling mechanism built into the file storage, called WBThrottle. It is designed to limit the overall amount of pending write I/O that can queue up, and to start its flush earlier than would naturally happen by the kernel's own doing. Unfortunately, testing demonstrates that the default values may still not trim the existing behaviour down to a level that reduces this impact on read latency. Tweaking can change this behaviour, reduce the overall write queue lengths, and make the impact less severe. There is a trade-off, however: by reducing the overall maximum number of entries allowed to be queued, you can reduce the kernel's own ability to maximize its efficiency at ordering incoming requests. It is worth thinking a little about what you need most for your particular use case and workloads, and adjusting to suit.
To control the depth of such a pending-write queue, you can either reduce the overall maximum number of outstanding I/O operations using the WBThrottle settings, or reduce the maximum value for outstanding operations at the block level of your kernel itself. Both can effectively control the same behaviour; your preferences will be the basis for implementing this setting.
It should also be noted that Ceph's operation priority system is more efficient for shorter queues at the disk level. Shrinking the overall queue to a given disk moves the main queueing location up into Ceph, where it has more control over what priority an I/O operation has. Consider the following example:
echo 8 > /sys/block/sda/queue/nr_requests
General

And a few more kernel settings that help make our machines a little softer and gentler, and squeeze a bit more performance out of the hardware:
cat /etc/sysctl.d/60-ceph2.conf
kernel.pid_max = 4194303
# 25 disks in each machine, so we reckoned there would be many processes
kernel.threads-max=2097152
# Threads too, naturally.
vm.max_map_count=524288
# Increased the number of memory map areas a process may have.
# As the kernel variables documentation says,
# memory map areas are used as a side effect of calling
# malloc, directly by mmap, mprotect and madvise, and also when loading
# shared libraries.
fs.aio-max-nr=50000000
# Turn up the input-output knobs
# The Linux kernel provides asynchronous non-blocking I/O (AIO),
# which allows a process to initiate several I/O operations
# simultaneously without waiting for any of them to complete.
# This helps improve performance for applications
# that can overlap processing and I/O.
# The aio-max-nr parameter determines the maximum number of concurrent
# requests allowed.
vm.min_free_kbytes=1048576
# The minimum amount of free memory that must be maintained.
# Set to 1Gb, which is quite enough for the operating system to run,
# and lets us avoid the OOM Killer for OSD processes. There is plenty
# of memory anyway, but a reserve never hurts.
vm.swappiness=10
# Tell the kernel to use swap when only 10% of memory remains free.
# The machines have 128G of RAM, and 10% is 12 gigs. More than enough to work with.
# The stock setting of 60% made the system slow down by crawling into swap
# while there was still a pile of free memory.
vm.vfs_cache_pressure=1000
# Increased from the stock 100. Makes the kernel more actively evict
# unused memory pages from the cache.
vm.zone_reclaim_mode=0
# Allows setting more or less aggressive approaches to
# reclaiming memory when a zone runs out of memory.
# If it is set to zero, no zone reclaim occurs.
# For file servers or workloads
# that benefit from having their data cached, zone_reclaim_mode
# should be left disabled, since the caching effect is
# likely to be more important than data locality.
vm.dirty_ratio=20
# Percentage of RAM that can be allocated to "dirty" pages.
# Worked out from a rough calculation:
# The system has 128 gigs of memory.
# Roughly 20 SSD disks, for which the CEPH settings say to
# allocate 3G of RAM each for caching.
# Roughly 40 HDD disks, for which this parameter is 1G.
# 20% of 128 is 25.6 gigs. So, at maximum memory utilization,
# the system is left with 2.4G of memory. Which should be enough
# for it to survive and hold out until the cavalry arrives — that is,
# until the DevOps who will fix everything shows up.
vm.dirty_background_ratio=3
# Percentage of system memory that can be filled with dirty pages before
# the background processes pdflush/flush/kdmflush write them to disk.
fs.file-max=524288
# Well, and we will probably have far more open files than the default allows.
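The memory budget behind the vm.dirty_ratio comment can be checked with simple arithmetic (the disk counts and per-OSD cache sizes are the ones stated above):

```python
# Rough per-node memory budget behind the vm.dirty_ratio = 20 choice.
total_gib = 128                   # RAM per node
osd_cache_gib = 20 * 3 + 40 * 1   # ~20 SSD OSDs at 3G cache + ~40 HDD OSDs at 1G
dirty_gib = total_gib * 20 / 100  # worst-case dirty pages at dirty_ratio = 20
leftover = total_gib - osd_cache_gib - dirty_gib
print(dirty_gib)                  # 25.6
print(round(leftover, 1))         # 2.4 — GiB left for the OS in the worst case
```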
Settings for CEPH

Settings I would like to dwell on in more detail:
cat /etc/ceph/ceph.conf
osd:
journal_aio: true # Three parameters enabling
journal_block_align: true # direct i/o
journal_dio: true # to the journal
journal_max_write_bytes: 1073714824 # Stretch the maximum size of a
# single write operation to the journal a little
journal_max_write_entries: 10000 # And the number of simultaneous entries
journal_queue_max_bytes: 10485760000
journal_queue_max_ops: 50000
rocksdb_separate_wal_dir: true # Decided to keep the wal separate
# Even tried to get hold of an
# NVMe for the purpose
bluestore_block_db_create: true # And a separate device for the journal
bluestore_block_db_size: '5368709120 #5G'
bluestore_block_wal_create: true
bluestore_block_wal_size: '1073741824 #1G'
bluestore_cache_size_hdd: '3221225472 # 3G'
# a large amount of RAM lets us
# cache fairly large volumes
bluestore_cache_size_ssd: '9663676416 # 9G'
keyring: /var/lib/ceph/osd/ceph-$id/keyring
osd_client_message_size_cap: '1073741824 #1G'
osd_disk_thread_ioprio_class: idle
osd_disk_thread_ioprio_priority: 7
osd_disk_threads: 2 # number of daemon threads per disk
osd_failsafe_full_ratio: 0.95
osd_heartbeat_grace: 5
osd_heartbeat_interval: 3
osd_map_dedup: true
osd_max_backfills: 2 # number of simultaneous backfill operations per OSD
osd_max_write_size: 256
osd_mon_heartbeat_interval: 5
osd_op_threads: 16
osd_op_num_threads_per_shard: 1
osd_op_num_threads_per_shard_hdd: 2
osd_op_num_threads_per_shard_ssd: 2
osd_pool_default_min_size: 1 # The perks of greed. Space started running
osd_pool_default_size: 2 # out very quickly, so as a temporary
# measure we reduced the number of
# data replicas
osd_recovery_delay_start: 10.000000
osd_recovery_max_active: 2
osd_recovery_max_chunk: 1048576
osd_recovery_max_single_start: 3
osd_recovery_op_priority: 1
osd_recovery_priority: 1 # this parameter is tuned as needed along the way
osd_recovery_sleep: 2
osd_scrub_chunk_max: 4
Some of the parameters that were tested for QA on version 12.2.12 are missing in ceph version 12.2.2 — for example, osd_recovery_threads. Therefore, the plans included updating production to 12.2.12. Practice showed that versions 12.2.2 and 12.2.12 are compatible within one cluster, which allows a rolling update.
The test cluster

Naturally, for testing you need a cluster of the same version as in production, but at the time I started working with the cluster only the newer one was available in the repository. Having looked and seen that the difference you can spot between the minor versions is not that big (1393 lines in the configs against 1436 in the new version), we decided to start testing the new one (we were going to update anyway — why run on old junk).
The only thing we tried to keep at the old version was the ceph-deploy package, since part of the utilities (and part of the staff) were tailored to its syntax. The new version was quite different, but had no effect on the operation of the cluster itself, so it was left at version 1.5.39.
Since the ceph-disk command clearly says it is deprecated and tells us to use the ceph-volume command, dear friends — we started creating OSDs with that command, without wasting time on the outdated one.
The plan was to build a mirror of two SSD drives, on which we would place the OSD journals, which in turn serve the OSDs living on the SAS spindles. That way we can protect ourselves from data problems when a journal disk falls over.
We started building the cluster according to the documentation.

root@ceph01-qa:~# cat /etc/ceph/ceph.conf # put the config prepared in advance in place
[client]
rbd_cache = true
rbd_cache_max_dirty = 50331648
rbd_cache_max_dirty_age = 2
rbd_cache_size = 67108864
rbd_cache_target_dirty = 33554432
rbd_cache_writethrough_until_flush = true
rbd_concurrent_management_ops = 10
rbd_default_format = 2
[global]
auth_client_required = cephx
auth_cluster_required = cephx
auth_service_required = cephx
cluster network = 10.10.10.0/24
debug_asok = 0/0
debug_auth = 0/0
debug_buffer = 0/0
debug_client = 0/0
debug_context = 0/0
debug_crush = 0/0
debug_filer = 0/0
debug_filestore = 0/0
debug_finisher = 0/0
debug_heartbeatmap = 0/0
debug_journal = 0/0
debug_journaler = 0/0
debug_lockdep = 0/0
debug_mon = 0/0
debug_monc = 0/0
debug_ms = 0/0
debug_objclass = 0/0
debug_objectcatcher = 0/0
debug_objecter = 0/0
debug_optracker = 0/0
debug_osd = 0/0
debug_paxos = 0/0
debug_perfcounter = 0/0
debug_rados = 0/0
debug_rbd = 0/0
debug_rgw = 0/0
debug_throttle = 0/0
debug_timer = 0/0
debug_tp = 0/0
fsid = d0000000d-4000-4b00-b00b-0123qwe123qwf9
mon_host = ceph01-q, ceph02-q, ceph03-q
mon_initial_members = ceph01-q, ceph02-q, ceph03-q
public network = 8.8.8.8/28 # the address is changed, naturally ))
rgw_dns_name = s3-qa.mycompany.ru # and this address is changed
rgw_host = s3-qa.mycompany.ru # and this one too
[mon]
mon allow pool delete = true
mon_max_pg_per_osd = 300 # we didn't dare go over three hundred
# placement groups per disk
# although the parameter does, of course, depend on the number of pools,
# their sizes and the number of OSDs. Having few but huge
# PGs is not the best choice either — balancing accuracy suffers
mon_osd_backfillfull_ratio = 0.9
mon_osd_down_out_interval = 5
mon_osd_full_ratio = 0.95 # for now, for the SSD disks, the journal lives
# on the same device as the OSD itself;
# we reckoned that 5% of the disk (which is 1.2Tb there)
# should be quite enough, and it correlates with the
# bluestore_block_db_size parameter, plus some slack for large
# placement groups
mon_osd_nearfull_ratio = 0.9
mon_pg_warn_max_per_osd = 520
[osd]
bluestore_block_db_create = true
bluestore_block_db_size = 5368709120 #5G
bluestore_block_wal_create = true
bluestore_block_wal_size = 1073741824 #1G
bluestore_cache_size_hdd = 3221225472 # 3G
bluestore_cache_size_ssd = 9663676416 # 9G
journal_aio = true
journal_block_align = true
journal_dio = true
journal_max_write_bytes = 1073714824
journal_max_write_entries = 10000
journal_queue_max_bytes = 10485760000
journal_queue_max_ops = 50000
keyring = /var/lib/ceph/osd/ceph-$id/keyring
osd_client_message_size_cap = 1073741824 #1G
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7
osd_disk_threads = 2
osd_failsafe_full_ratio = 0.95
osd_heartbeat_grace = 5
osd_heartbeat_interval = 3
osd_map_dedup = true
osd_max_backfills = 4
osd_max_write_size = 256
osd_mon_heartbeat_interval = 5
osd_op_num_threads_per_shard = 1
osd_op_num_threads_per_shard_hdd = 2
osd_op_num_threads_per_shard_ssd = 2
osd_op_threads = 16
osd_pool_default_min_size = 1
osd_pool_default_size = 2
osd_recovery_delay_start = 10.0
osd_recovery_max_active = 1
osd_recovery_max_chunk = 1048576
osd_recovery_max_single_start = 3
osd_recovery_op_priority = 1
osd_recovery_priority = 1
osd_recovery_sleep = 2
osd_scrub_chunk_max = 4
osd_scrub_chunk_min = 2
osd_scrub_sleep = 0.1
rocksdb_separate_wal_dir = true
# create the monitor
root@ceph01-qa:~#ceph-deploy mon create ceph01-q
# generate the keys for authenticating the nodes in the cluster
root@ceph01-qa:~#ceph-deploy gatherkeys ceph01-q
# That is the step-by-step way. If several machines are available — the ones
# listed in the config in the section
# mon_initial_members = ceph01-q, ceph02-q, ceph03-q
# these two commands can be run as one
root@ceph01-qa:~#ceph-deploy mon create-initial
# Put the keys into the places specified in the config
root@ceph01-qa:~#cat ceph.bootstrap-osd.keyring > /var/lib/ceph/bootstrap-osd/ceph.keyring
root@ceph01-qa:~#cat ceph.bootstrap-mgr.keyring > /var/lib/ceph/bootstrap-mgr/ceph.keyring
root@ceph01-qa:~#cat ceph.bootstrap-rgw.keyring > /var/lib/ceph/bootstrap-rgw/ceph.keyring
# create the key for managing the cluster
root@ceph01-qa:~#ceph-deploy admin ceph01-q
# and the manager, to manage plugins
root@ceph01-qa:~#ceph-deploy mgr create ceph01-q
The first snag I hit while working with this version of ceph-deploy against a cluster of version 12.2.12 was an error when trying to create an OSD with its db on a software RAID —

root@ceph01-qa:~#ceph-volume lvm create --bluestore --data /dev/sde --block.db /dev/md0
blkid could not detect a PARTUUID for device: /dev/md1

Indeed, blkid cannot detect a PARTUUID on the md device, so the partitions had to be created manually:

root@ceph01-qa:~#parted /dev/md0 mklabel GPT
# there will be many partitions,
# and without GPT they can't be created
# the partition size was set in the config above = bluestore_block_db_size: '5368709120 #5G'
# I have 20 disks for OSDs, and creating the partitions by hand is tedious,
# so I made a loop
root@ceph01-qa:~#for i in {1..20}; do echo -e "n\n\n\n+5G\nw" | fdisk /dev/md0; done
It was somewhere here, it seems, amid the repeated creation and re-creation of OSDs with these commands, that we ran into an error (one we apparently could not do without).
When creating a bluestore OSD without specifying the path to the WAL, but specifying the db —
root@ceph01-qa:~#ceph-volume lvm create --bluestore --data /dev/sde --block.db /dev/md0
stderr: 2019-04-12 10:39:27.211242 7eff461b6e00 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
stderr: 2019-04-12 10:39:27.213185 7eff461b6e00 -1 bdev(0x55824c273680 /var/lib/ceph/osd/ceph-0//block.wal) open open got: (22) Invalid argument
stderr: 2019-04-12 10:39:27.213201 7eff461b6e00 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _open_db add block device(/var/lib/ceph/osd/ceph-0//block.wal) returned: (22) Invalid argument
stderr: 2019-04-12 10:39:27.999039 7eff461b6e00 -1 bluestore(/var/lib/ceph/osd/ceph-0/) mkfs failed, (22) Invalid argument
stderr: 2019-04-12 10:39:27.999057 7eff461b6e00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (22) Invalid argument
stderr: 2019-04-12 10:39:27.999141 7eff461b6e00 -1 ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-0/: (22) Invalid argumen
It turns out that if, on the same RAID (or somewhere else, of your choice), you create an additional partition for the WAL and specify it when creating the OSD, everything goes smoothly (apart from the appearance of a separate WAL, which you may not have wanted).
However, since moving the WAL to NVMe was anyway still a distant plan, the practice turned out not to be superfluous.
root@ceph01-qa:~#ceph-volume lvm create --bluestore --data /dev/sdf --block.wal /dev/md0p2 --block.db /dev/md1p2
We created monitors, managers and OSDs. Now I would like to pool them in different ways, since I plan to have different types of disks — fast pools on the SSDs and large but slow pools on the SAS pancakes.
Let's assume the servers have 20 disks each; the first ten are one type, the second ten the other.
The initial, default, map looks like this:
root@ceph01-q:~# ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       14.54799 root default
-3        9.09200     host ceph01-q
 0   ssd  1.00000         osd.0      up  1.00000 1.00000
 1   ssd  1.00000         osd.1      up  1.00000 1.00000
 2   ssd  1.00000         osd.2      up  1.00000 1.00000
 3   ssd  1.00000         osd.3      up  1.00000 1.00000
 4   hdd  1.00000         osd.4      up  1.00000 1.00000
 5   hdd  0.27299         osd.5      up  1.00000 1.00000
 6   hdd  0.27299         osd.6      up  1.00000 1.00000
 7   hdd  0.27299         osd.7      up  1.00000 1.00000
 8   hdd  0.27299         osd.8      up  1.00000 1.00000
 9   hdd  0.27299         osd.9      up  1.00000 1.00000
10   hdd  0.27299         osd.10     up  1.00000 1.00000
11   hdd  0.27299         osd.11     up  1.00000 1.00000
12   hdd  0.27299         osd.12     up  1.00000 1.00000
13   hdd  0.27299         osd.13     up  1.00000 1.00000
14   hdd  0.27299         osd.14     up  1.00000 1.00000
15   hdd  0.27299         osd.15     up  1.00000 1.00000
16   hdd  0.27299         osd.16     up  1.00000 1.00000
17   hdd  0.27299         osd.17     up  1.00000 1.00000
18   hdd  0.27299         osd.18     up  1.00000 1.00000
19   hdd  0.27299         osd.19     up  1.00000 1.00000
-5        5.45599     host ceph02-q
20   ssd  0.27299         osd.20     up  1.00000 1.00000
21   ssd  0.27299         osd.21     up  1.00000 1.00000
22   ssd  0.27299         osd.22     up  1.00000 1.00000
23   ssd  0.27299         osd.23     up  1.00000 1.00000
24   hdd  0.27299         osd.24     up  1.00000 1.00000
25   hdd  0.27299         osd.25     up  1.00000 1.00000
26   hdd  0.27299         osd.26     up  1.00000 1.00000
27   hdd  0.27299         osd.27     up  1.00000 1.00000
28   hdd  0.27299         osd.28     up  1.00000 1.00000
29   hdd  0.27299         osd.29     up  1.00000 1.00000
30   hdd  0.27299         osd.30     up  1.00000 1.00000
31   hdd  0.27299         osd.31     up  1.00000 1.00000
32   hdd  0.27299         osd.32     up  1.00000 1.00000
33   hdd  0.27299         osd.33     up  1.00000 1.00000
34   hdd  0.27299         osd.34     up  1.00000 1.00000
35   hdd  0.27299         osd.35     up  1.00000 1.00000
36   hdd  0.27299         osd.36     up  1.00000 1.00000
37   hdd  0.27299         osd.37     up  1.00000 1.00000
38   hdd  0.27299         osd.38     up  1.00000 1.00000
39   hdd  0.27299         osd.39     up  1.00000 1.00000
-7        6.08690     host ceph03-q
40   ssd  0.27299         osd.40     up  1.00000 1.00000
41   ssd  0.27299         osd.41     up  1.00000 1.00000
42   ssd  0.27299         osd.42     up  1.00000 1.00000
43   ssd  0.27299         osd.43     up  1.00000 1.00000
44   hdd  0.27299         osd.44     up  1.00000 1.00000
45   hdd  0.27299         osd.45     up  1.00000 1.00000
46   hdd  0.27299         osd.46     up  1.00000 1.00000
47   hdd  0.27299         osd.47     up  1.00000 1.00000
48   hdd  0.27299         osd.48     up  1.00000 1.00000
49   hdd  0.27299         osd.49     up  1.00000 1.00000
50   hdd  0.27299         osd.50     up  1.00000 1.00000
51   hdd  0.27299         osd.51     up  1.00000 1.00000
52   hdd  0.27299         osd.52     up  1.00000 1.00000
53   hdd  0.27299         osd.53     up  1.00000 1.00000
54   hdd  0.27299         osd.54     up  1.00000 1.00000
55   hdd  0.27299         osd.55     up  1.00000 1.00000
56   hdd  0.27299         osd.56     up  1.00000 1.00000
57   hdd  0.27299         osd.57     up  1.00000 1.00000
58   hdd  0.27299         osd.58     up  1.00000 1.00000
59   hdd  0.89999         osd.59     up  1.00000 1.00000
Let's create our own virtual racks and servers, with blackjack and everything else:

root@ceph01-q:~#ceph osd crush add-bucket rack01 root #created a new root
root@ceph01-q:~#ceph osd crush add-bucket ceph01-q host #created a new host
root@ceph01-q:~#ceph osd crush move ceph01-q root=rack01 #moved the server into another rack
root@ceph01-q:~#ceph osd crush add 28 1.0 host=ceph02-q # Added an OSD to a server
# If we created something crooked, we can remove it
root@ceph01-q:~# ceph osd crush remove osd.4
root@ceph01-q:~# ceph osd crush remove rack01
The problem we ran into on the production cluster: when trying to create a new host and move it into an existing rack, the command ceph osd crush move ceph01-host root=rack01 hung, and the monitors began falling one after another. Aborting the command with a simple CTRL+C returned the cluster to the world of the living.

A search turned up an existing report of this problem.
The solution turned out to be to dump the crushmap and remove the rule replicated_ruleset section from it:

root@ceph01-prod:~#ceph osd getcrushmap -o crushmap.row #Dump the map in raw form
root@ceph01-prod:~#crushtool -d crushmap.row -o crushmap.txt #convert it into readable form
root@ceph01-prod:~#vim crushmap.txt #edit it, removing rule replicated_ruleset
root@ceph01-prod:~#crushtool -c crushmap.txt -o new_crushmap.row #compile it back
root@ceph01-prod:~#ceph osd setcrushmap -i new_crushmap.row #load it into the cluster
Achtung: this operation can trigger a rebalance of placement groups between OSDs. In our case it did, but a very small one.
Another oddity we ran into on the test cluster: after the OSD server rebooted, the OSDs forgot that they had been moved to the new servers and racks, and went back to root default.
As a result, once we had assembled the final layout — a separate root for the ssd drives and a separate one for the spindle drives — we pulled all the OSDs into racks and simply deleted the default root. After a reboot, the OSDs finally stayed put.
Later, digging deeper through the documentation, we found a parameter responsible for this behavior. More on it in part two.
How we created different pools for the different disk types.
To begin with, we created two roots — one for ssd and one for hdd:
root@ceph01-q:~#ceph osd crush add-bucket ssd-root root
root@ceph01-q:~#ceph osd crush add-bucket hdd-root root
Since the servers physically sit in different racks, for convenience we created racks with the servers inside them.
# Racks:
root@ceph01-q:~#ceph osd crush add-bucket ssd-rack01 rack
root@ceph01-q:~#ceph osd crush add-bucket ssd-rack02 rack
root@ceph01-q:~#ceph osd crush add-bucket ssd-rack03 rack
root@ceph01-q:~#ceph osd crush add-bucket hdd-rack01 rack
root@ceph01-q:~#ceph osd crush add-bucket hdd-rack02 rack
root@ceph01-q:~#ceph osd crush add-bucket hdd-rack03 rack
# Servers:
root@ceph01-q:~#ceph osd crush add-bucket ssd-ceph01-q host
root@ceph01-q:~#ceph osd crush add-bucket ssd-ceph02-q host
root@ceph01-q:~#ceph osd crush add-bucket ssd-ceph03-q host
root@ceph01-q:~#ceph osd crush add-bucket hdd-ceph01-q host
root@ceph01-q:~#ceph osd crush add-bucket hdd-ceph02-q host
root@ceph01-q:~#ceph osd crush add-bucket hdd-ceph03-q host
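Typing these bucket commands out by hand is error-prone. A sketch (our addition, following the article's ssd-/hdd- naming scheme) of a dry-run generator that only prints the commands, so you can review them before feeding the output to a real cluster:

```shell
# Print the repetitive add-bucket commands instead of running them (dry run)
gen_buckets() {
  for root in ssd hdd; do
    for i in 1 2 3; do
      echo "ceph osd crush add-bucket ${root}-rack0${i} rack"
      echo "ceph osd crush add-bucket ${root}-ceph0${i}-q host"
    done
  done
}
gen_buckets
```

After eyeballing the output, it could be piped to `sh` on a node with admin keys — but only once you are sure the names match your layout.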
We then laid the disks out by type into the new servers:
root@ceph01-q:~# # Disks 0 through 3 are SSDs; they live in ceph01-q, so we put them into the host ssd-ceph01-q:
root@ceph01-q:~#ceph osd crush add 0 1 host=ssd-ceph01-q
root@ceph01-q:~#ceph osd crush add 1 1 host=ssd-ceph01-q
root@ceph01-q:~#ceph osd crush add 2 1 host=ssd-ceph01-q
root@ceph01-q:~#ceph osd crush add 3 1 host=ssd-ceph01-q
root-ceph01-q:~# likewise for the other servers
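The per-disk add commands can likewise be generated rather than typed. A sketch assuming, as above, that the SSDs are osd.0 through osd.3 with weight 1 (dry run — it only prints):

```shell
# Print the crush add commands for the four SSD OSDs on ceph01-q
gen_osd_adds() {
  for id in 0 1 2 3; do
    echo "ceph osd crush add ${id} 1 host=ssd-ceph01-q"
  done
}
gen_osd_adds
```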
Having scattered the disks across the ssd-root and hdd-root roots, we left root-default empty, so we could delete it:
root-ceph01-q:~#ceph osd crush remove default
After that we need to create placement rules, which we will later bind to the pools we create. In a rule we specify which roots our pool's data may land in and how unique a replica must be — for example, replicas must sit on different servers, or in different racks (they can even go into different roots, if we have such a layout).
Before choosing a type, it is worth reading the documentation:
root-ceph01-q:~#ceph osd crush rule create-simple rule-ssd ssd-root host firstn
root-ceph01-q:~#ceph osd crush rule create-simple rule-hdd hdd-root host firstn
root-ceph01-q:~# We defined two rules in which data is replicated between hosts —
root-ceph01-q:~# that is, each replica must sit on a different host, even if the
root-ceph01-q:~# hosts share a rack.
root-ceph01-q:~# In production, if at all possible, it is better to spread the hosts
root-ceph01-q:~# across racks and have the replicas distributed by rack:
root-ceph01-q:~# ##ceph osd crush rule create-simple rule-ssd ssd-root rack firstn
Finally, we create the pools in which we will later store the disk images of our virtualization platform, PROXMOX:
root-ceph01-q:~# #ceph osd pool create {NAME} {pg_num} {pgp_num}
root-ceph01-q:~# ceph osd pool create ssd_pool 1024 1024
root-ceph01-q:~# ceph osd pool create hdd_pool 1024 1024
And we tell these pools which placement rule to use:
root-ceph01-q:~#ceph osd crush rule ls # list the rules
root-ceph01-q:~#ceph osd crush rule dump rule-ssd | grep rule_id # pick out the ID of the rule we need
root-ceph01-q:~#ceph osd pool set ssd_pool crush_rule 2
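Hard-coding the numeric id (2 above) is fragile; it can be extracted by rule name instead. A sketch — the sample JSON below merely imitates the assumed shape of the `ceph osd crush rule dump rule-ssd` output, so the extraction can be tried without a cluster:

```shell
# Sample standing in for: ceph osd crush rule dump rule-ssd (assumed shape, trimmed)
sample='{"rule_id": 2, "rule_name": "rule-ssd", "type": 1}'
# Pull rule_id out of the JSON; python3 serves as a portable JSON parser
extract_rule_id() { python3 -c 'import json,sys; print(json.load(sys.stdin)["rule_id"])'; }
rule_id=$(echo "$sample" | extract_rule_id)
echo "$rule_id"   # → 2
# Against a real cluster this would become:
# ceph osd pool set ssd_pool crush_rule "$rule_id"
```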
The number of placement groups has to be chosen with a good picture of your cluster in advance — roughly how many OSDs it will have, how much data (as a percentage of the total volume) will land in the pool, and how much data there is overall.
In total it is best not to have more than 300 placement groups per disk, and balancing is easier with small placement groups — that is, if your whole pool holds 10 TB and only 10 PGs, then balancing by throwing terabyte-sized bricks (PGs) around will be painful; sand with a fine grain pours between buckets more easily and evenly.
But we must keep in mind that the larger the number of PGs, the more resources are spent computing their placement — memory and CPU start getting eaten up.
A rough estimate will do here.
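Such a rough estimate can be sketched with a common rule of thumb (our assumption, not a figure from the article): aim for about 100 PGs per OSD, divide by the replica count, and round up to a power of two:

```shell
# Back-of-the-envelope PG estimator (rule of thumb, not an official formula)
pg_estimate() {
  osds=$1
  replicas=$2
  target=$(( osds * 100 / replicas ))   # ~100 PGs per OSD, shared between replicas
  pg=1
  while [ "$pg" -lt "$target" ]; do pg=$(( pg * 2 )); done   # round up to a power of two
  echo "$pg"
}
pg_estimate 30 3   # 30 OSDs, 3 replicas → prints 1024
```

With 30 OSDs and 3 replicas this lands on 1024 — the same pg_num used for the pools above.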
List of sources:
Source: www.habr.com