$ crushtool --test -i crushmap --rule 0 --show-mappings --min-x 0 --max-x 10 --num-rep 2
CRUSH rule 0 x 0 [0,12]
CRUSH rule 0 x 1 [5,24]
CRUSH rule 0 x 2 [9,14]
CRUSH rule 0 x 3 [30,11]
CRUSH rule 0 x 4 [20,10]
CRUSH rule 0 x 5 [28,0]
CRUSH rule 0 x 6 [6,34]
CRUSH rule 0 x 7 [19,31]
CRUSH rule 0 x 8 [17,26]
CRUSH rule 0 x 9 [9,20]
CRUSH rule 0 x 10 [10,33]
$ crushtool --test -i crushmap --rule 0 --show-mappings --min-x 0 --max-x 10 --num-rep 3
CRUSH rule 0 x 0 [0,12,32]
CRUSH rule 0 x 1 [5,24,20]
CRUSH rule 0 x 2 [9,14,28]
CRUSH rule 0 x 3 [30,11,13]
CRUSH rule 0 x 4 [20,10,31]
CRUSH rule 0 x 5 [28,0,12]
CRUSH rule 0 x 6 [6,34,14]
CRUSH rule 0 x 7 [19,31,6]
CRUSH rule 0 x 8 [17,26,5]
CRUSH rule 0 x 9 [9,20,30]
CRUSH rule 0 x 10 [10,33,12]
In general the mappings come out fine, but in some cases it is better to test them first.
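One way to go beyond eyeballing the mappings is to count how often each OSD appears in the `--show-mappings` output. Below is a minimal Python sketch, assuming the output has been captured as text in the format shown above. The function name `count_osd_usage` is hypothetical and not part of any Ceph tool.

```python
import re
from collections import Counter

# Matches lines such as "CRUSH rule 0 x 3 [30,11]" from crushtool --show-mappings.
MAPPING_RE = re.compile(r"^CRUSH rule (\d+) x (\d+) \[([\d,]*)\]$")


def count_osd_usage(output):
    """Count how many times each OSD id appears across all mapping lines."""
    usage = Counter()
    for line in output.splitlines():
        match = MAPPING_RE.match(line.strip())
        if match is None:
            continue  # skip the command prompt and any non-mapping line
        osds = [int(osd) for osd in match.group(3).split(",") if osd]
        usage.update(osds)
    return usage


# A few lines copied from the --num-rep 2 run above.
sample = """\
CRUSH rule 0 x 0 [0,12]
CRUSH rule 0 x 1 [5,24]
CRUSH rule 0 x 5 [28,0]
"""
usage = count_osd_usage(sample)
print(usage.most_common(1))  # [(0, 2)]: osd.0 is used twice in this sample
```

With a much larger `--max-x`, a strongly uneven count for one OSD or host is a hint that the rule or the weights deserve a closer look.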
In some cases, operations may take a little longer to be processed by the OSD. The operation may then fail, or even cause the OSD to commit suicide.
There are many parameters for these timeouts. Some examples:
Thread suicide timed out
heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7f1ee3ca7700' had suicide timed out after 150
common/HeartbeatMap.cc: In function 'bool ceph::HeartbeatMap::_check(ceph::heartbeat_handle_d*, const char*, time_t)' thread 7f1f0c2a3700 time 2017-03-03 11:03:46.550118
common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout")
osd_op_thread_suicide_timeout = 900
Operation thread timeout
heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fd306416700' had timed out after 15
ceph tell osd.XX injectargs --osd-op-thread-timeout 90
(default value is 15s)
Recovery thread timeout
heartbeat_map is_healthy 'OSD::recovery_tp thread 0x7f4c2edab700' had timed out after 30
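To see which thread pool is hitting which timeout, the `heartbeat_map` messages quoted above can be picked out of an OSD log. This is a minimal Python sketch, assuming the log lines follow exactly the format of those examples. The function name `parse_heartbeat_line` and the returned keys are hypothetical.

```python
import re

# Matches messages such as:
# heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fd306416700' had timed out after 15
HEARTBEAT_RE = re.compile(
    r"heartbeat_map is_healthy '(?P<pool>\S+) thread (?P<thread>0x[0-9a-f]+)' "
    r"had (?P<suicide>suicide )?timed out after (?P<seconds>\d+)"
)


def parse_heartbeat_line(line):
    """Return the thread pool, thread id, timeout kind and value, or None."""
    match = HEARTBEAT_RE.search(line)
    if match is None:
        return None
    return {
        "pool": match.group("pool"),
        "thread": match.group("thread"),
        "suicide": match.group("suicide") is not None,
        "seconds": int(match.group("seconds")),
    }


op = parse_heartbeat_line(
    "heartbeat_map is_healthy 'OSD::osd_op_tp thread 0x7fd306416700' had timed out after 15"
)
print(op["pool"], op["suicide"], op["seconds"])  # OSD::osd_op_tp False 15
```

The `pool` value tells you which family of timeout settings to look at (operation thread or recovery thread), and `suicide` distinguishes a warning from the fatal variant.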
Erasure coding is designed for clusters of a sufficient size. However, if you want to use it with a small number of hosts, you can adapt the crushmap so that the distribution better matches your needs.
Here is a first example that distributes data with a fault tolerance of 1 host OR 2 drives, with k=4, m=2, on 3 hosts or more.
rule erasure_ruleset {
  ruleset X
  type erasure
  max_size 6
  step take default
  step choose indep 3 type host
  step choose indep 2 type osd
  step emit
}
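The "1 host OR 2 drives" tolerance of the erasure rule follows from simple counting: 3 hosts with 2 OSDs each hold the k+m = 6 chunks, and data stays readable as long as k = 4 chunks survive. The following Python sketch checks this by brute force. It only models chunk counts under those assumptions and does not run CRUSH; the names `placement` and `readable` are hypothetical.

```python
from itertools import combinations

K, M = 4, 2                  # erasure code profile from the text
HOSTS, OSDS_PER_HOST = 3, 2  # "choose indep 3 type host", "choose indep 2 type osd"

# Each chunk lands on a distinct (host, osd slot) pair: 3 * 2 = 6 = K + M chunks.
placement = [(host, slot) for host in range(HOSTS) for slot in range(OSDS_PER_HOST)]


def readable(failed):
    """Data stays readable while at least K of the K + M chunks survive."""
    return len(placement) - len(failed) >= K


# Losing any single host removes the 2 chunks stored on it.
host_failures_ok = all(
    readable([c for c in placement if c[0] == host]) for host in range(HOSTS)
)
# Losing any 2 drives, on the same host or on different hosts.
two_drive_failures_ok = all(readable(pair) for pair in combinations(placement, 2))
# Losing a whole host AND one more drive elsewhere removes 3 chunks.
host_plus_drive_ok = all(
    readable([c for c in placement if c[0] == host] + [drive])
    for host in range(HOSTS)
    for drive in placement
    if drive[0] != host
)

print(host_failures_ok, two_drive_failures_ok, host_plus_drive_ok)  # True True False
```

The last value shows why the tolerance is an OR and not an AND: a host failure already consumes both parity chunks.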
rule replicated_ruleset {
  ruleset X
  type replicated
  max_size 3
  step take default
  step choose firstn 2 type datacenter
  step chooseleaf firstn -1 type host
  step emit
}
This works well with pool size=2 (not recommended!) or 3.
If you set the pool size to more than 3 (and increase max_size in the CRUSH rule), be careful: you will have n-1 replicas in one datacenter and only one in the other.
If you want to be able to write data even when one of the datacenters is inaccessible, the pool min_size should be set to 1 even if size is set to 3. In this case, pay attention to where the monitors are located, since the surviving side still needs a monitor quorum.
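The replica split and the min_size advice can be checked with a little arithmetic. The Python sketch below models only the counting done by `firstn 2` and `firstn -1` (pool size minus one per datacenter, truncated to the pool size), assuming a pool size of at least 2 and enough hosts in each datacenter. The function names are hypothetical.

```python
def replicas_per_datacenter(pool_size, datacenters=2):
    """Replica count per datacenter for 'choose firstn 2' + 'chooseleaf firstn -1'."""
    per_dc = pool_size - 1  # "firstn -1" means pool size minus one
    remaining = pool_size   # the final result is truncated to the pool size
    counts = []
    for _ in range(datacenters):
        take = max(0, min(per_dc, remaining))
        counts.append(take)
        remaining -= take
    return counts


def writable_after_dc_loss(pool_size, min_size):
    """For each datacenter lost, is the number of surviving replicas >= min_size?"""
    counts = replicas_per_datacenter(pool_size)
    return [sum(counts) - lost >= min_size for lost in counts]


print(replicas_per_datacenter(3))     # [2, 1]
print(replicas_per_datacenter(5))     # [4, 1]
print(writable_after_dc_loss(3, 2))   # [False, True]
print(writable_after_dc_loss(3, 1))   # [True, True]
```

With size=3 and min_size=2, losing the datacenter that holds two replicas blocks writes, which is why min_size=1 is needed to survive the loss of either side.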
Aaahhh full disk this morning.
Sometimes the logs can go crazy, and the files can quickly reach several gigabytes.