@tony-yin · 2017-11-30

Cephx in Practice

Tags: Ceph, cephx

After reading 徐小胖's 《大话Cephx》, I ran a series of hands-on experiments to test my guesses and resolve my doubts, verifying the claims and conclusions of the original article and extending them with my own conjectures and summaries. The payoff was substantial: besides confirming several of the original conclusions, I also found a few problems in them, and above all ran into some fascinating scenarios and discoveries of my own once I got my hands dirty.

The hands-on tasks for this article, and how far each one got, are laid out section by section below:

Deleting client.admin.keyring

The admin node starts out with the keyring in place, so the cluster can be accessed normally:

    [root@node1 ceph]# ls
    ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log rbdmap
    ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
    [root@node1 ceph]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_WARN
    no active mgr
    Reduced data availability: 281 pgs inactive, 65 pgs down, 58 pgs incomplete
    Degraded data redundancy: 311/771 objects degraded (40.337%), 439 pgs unclean, 316 pgs degraded, 316 pgs undersized
    application not enabled on 3 pool(s)
    clock skew detected on mon.node2, mon.node3
    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: no daemons active
    osd: 6 osds: 5 up, 5 in
    rgw: 1 daemon active
    rgw-nfs: 1 daemon active
    data:
    pools: 10 pools, 444 pgs
    objects: 257 objects, 36140 kB
    usage: 6256 MB used, 40645 MB / 46901 MB avail
    pgs: 63.288% pgs not active
    311/771 objects degraded (40.337%)
    158 undersized+degraded+peered
    158 active+undersized+degraded
    65 down
    58 incomplete
    5 active+clean+remapped

Moving the keyring file somewhere else is equivalent to deleting it; accessing the cluster now fails:

    [root@node1 ceph]# mv ceph.client.admin.keyring /tmp/
    [root@node1 ceph]# ls
    ceph.bootstrap-mds.keyring ceph.bootstrap-mgr.keyring ceph.bootstrap-osd.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph-deploy-ceph.log ceph.mon.keyring rbdmap
    [root@node1 ceph]# ceph -s
    2017-11-23 18:07:48.685028 7f63f6935700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
    2017-11-23 18:07:48.685094 7f63f6935700 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication
    2017-11-23 18:07:48.685098 7f63f6935700 0 librados: client.admin initialization error (2) No such file or directory
    [errno 2] error connecting to the cluster

Copying it back restores access to the cluster:

    [root@node1 ceph]# mv /tmp/ceph.client.admin.keyring ./
    [root@node1 ceph]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_WARN
    no active mgr
    Reduced data availability: 281 pgs inactive, 65 pgs down, 58 pgs incomplete
    Degraded data redundancy: 311/771 objects degraded (40.337%), 439 pgs unclean, 316 pgs degraded, 316 pgs undersized
    application not enabled on 3 pool(s)
    clock skew detected on mon.node2, mon.node3

node3 has no keyring file under /etc/ceph/, so it cannot connect to the cluster either:

    [root@node3 ceph]# ls
    ceph.conf ceph-deploy-ceph.log rbdmap
    [root@node3 ceph]# ceph -s
    2017-11-23 17:59:16.659034 7fbe34678700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
    2017-11-23 17:59:16.659085 7fbe34678700 -1 monclient: ERROR: missing keyring, cannot use cephx for authentication
    2017-11-23 17:59:16.659089 7fbe34678700 0 librados: client.admin initialization error (2) No such file or directory
    [errno 2] error connecting to the cluster

Conclusion:

When auth in ceph.conf is set to cephx, a keyring file is required to access the cluster.
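A quick aside: the keyring does not have to live in one of the default search paths, because the ceph CLI accepts an explicit keyring and user name. A minimal sketch, assuming the keyring was parked under /tmp as above:

    # Point the client at an explicit keyring instead of the default /etc/ceph paths
    ceph -s --name client.admin --keyring /tmp/ceph.client.admin.keyring

    # Equivalent short options
    ceph -s -n client.admin -k /tmp/ceph.client.admin.keyring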

Changing the cephx setting

Working in /etc/ceph/ on node3: first delete ceph.client.admin.keyring, then change the auth settings from cephx to none, then restart the monitor followed by the OSDs. At this point the cluster still cannot be accessed, because cephx applies to the whole cluster rather than a single node. Next, make the same change on the other nodes, switching cephx to none and restarting their monitors and OSDs; after that the cluster can be accessed without any keyring file.

    # Delete the keyring file
    [root@node3 ~]# cd /etc/ceph/
    [root@node3 ceph]# ls
    ceph.client.admin.keyring ceph.conf ceph-deploy-ceph.log rbdmap
    [root@node3 ceph]# mv ceph.client.admin.keyring /tmp/
    # Change the cephx settings
    [root@node3 ceph]# cat ceph.conf
    [global]
    fsid = 99480db2-f92f-481f-b958-c03c261918c6
    mon_initial_members = node1, node2, node3
    mon_host = 192.168.1.58,192.168.1.61,192.168.1.62
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    public network = 192.168.1.0/24
    mon clock drift allowed = 2
    mon clock drift warn backoff = 30
    [root@node3 ceph]# vim ceph.conf
    [root@node3 ceph]# cat ceph.conf
    [global]
    fsid = 99480db2-f92f-481f-b958-c03c261918c6
    mon_initial_members = node1, node2, node3
    mon_host = 192.168.1.58,192.168.1.61,192.168.1.62
    auth_cluster_required = none
    auth_service_required = none
    auth_client_required = none
    public network = 192.168.1.0/24
    mon clock drift allowed = 2
    mon clock drift warn backoff = 30
    [root@node3 ceph]# systemctl restart ceph-mon
    ceph-mon@ ceph-mon@node3.service ceph-mon.target
    [root@node3 ceph]# systemctl restart ceph-mon
    ceph-mon@ ceph-mon@node3.service ceph-mon.target
    [root@node3 ceph]# systemctl restart ceph-mon.target
    [root@node3 ceph]# systemctl restart ceph-osd.target
    # Changing a single node is not enough; the cluster still cannot be accessed
    [root@node3 ceph]# ceph -s
    2017-11-27 23:05:23.022571 7f5200c2f700 0 librados: client.admin authentication error (95) Operation not supported
    [errno 95] error connecting to the cluster
    # After making the same change on the other nodes and restarting them, the cluster is accessible again
    [root@node3 ceph]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_WARN
    ...

Conclusion:

When auth is set to cephx, accessing the cluster requires a keyring file; when auth is set to none, the cluster can be accessed without one. (The change only takes effect when it is made on every node in the cluster, not on a single node.)
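Editing ceph.conf on every node by hand is tedious. Since this cluster appears to have been deployed with ceph-deploy (note the ceph-deploy-ceph.log under /etc/ceph/), one way to distribute the change, sketched here under the assumption that a ceph-deploy working directory still exists, is to edit the master copy there and push it out:

    # Push the edited ceph.conf from the ceph-deploy working directory to all nodes
    ceph-deploy --overwrite-conf config push node1 node2 node3

    # Then restart the daemons on each node so the new auth settings take effect
    systemctl restart ceph-mon.target
    systemctl restart ceph-osd.target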

Deleting the monitor keyring

/etc/ceph/ and /var/lib/ceph/mon/ceph-node1/ each contain a mon keyring:

    [root@node1 ceph-node1]# cd /etc/ceph/
    [root@node1 ceph]# ls
    ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log rbdmap
    ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
    [root@node1 ceph]# cd /var/lib/ceph/mon/ceph-node1/
    [root@node1 ceph-node1]# ls
    done keyring kv_backend store.db systemd

First delete /etc/ceph/ceph.mon.keyring; the cluster is still accessible:

    [root@node1 ceph]# rm ceph.mon.keyring
    rm: remove regular file ‘ceph.mon.keyring’? y
    [root@node1 ceph]# systemctl restart ceph-mon@node1.service
    [root@node1 ceph]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_WARN
    no active mgr
    Reduced data availability: 281 pgs inactive, 65 pgs down, 58 pgs incomplete
    Degraded data redundancy: 311/771 objects degraded (40.337%), 439 pgs unclean, 316 pgs degraded, 316 pgs undersized
    application not enabled on 3 pool(s)
    clock skew detected on mon.node2
    ...
    ...

Now delete /var/lib/ceph/mon/ceph-node1/keyring as well:

    [root@node1 ceph-node1]# rm keyring
    rm: remove regular file ‘keyring’? y
    [root@node1 ceph-node1]# systemctl restart ceph-mon@node1.service
    [root@node1 ceph-node1]# ceph -s

Accessing the cluster now just times out, and the log file shows that the mon failed to initialize:

    2017-11-24 00:33:55.812955 7fa16f995e40 -1 auth: error reading file: /var/lib/ceph/mon/ceph-node1/keyring: can't open /var/lib/ceph/mon/ceph-node1/keyring: (2) No such file or directory
    2017-11-24 00:33:55.812991 7fa16f995e40 -1 mon.node1@-1(probing) e1 unable to load initial keyring /etc/ceph/ceph.mon.node1.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
    2017-11-24 00:33:55.812999 7fa16f995e40 -1 failed to initialize

OK, let's try it the other way around: delete /var/lib/ceph/mon/ceph-node1/keyring but copy /etc/ceph/ceph.mon.keyring back. Surprisingly, the mon still fails to initialize.

Conclusion:

The monitor needs a keyring file for key authentication at startup, and that file must be the one under /var/lib/ceph/mon/ceph-node1/; the ceph.mon.keyring under /etc/ceph/ plays no part in it.

    [root@node1 ceph-node1]# rm keyring
    rm: remove regular file ‘keyring’? y
    [root@node1 ceph]# ls
    ceph.bootstrap-mds.keyring ceph.bootstrap-osd.keyring ceph.client.admin.keyring ceph-deploy-ceph.log rbdmap
    ceph.bootstrap-mgr.keyring ceph.bootstrap-rgw.keyring ceph.conf ceph.mon.keyring
    [root@node1 ceph]# ceph -s
    // timeout
    ...

What mon.log shows:

    2017-11-24 00:44:26.534865 7ffaf5117e40 -1 auth: error reading file: /var/lib/ceph/mon/ceph-node1/keyring: can't open /var/lib/ceph/mon/ceph-node1/keyring: (2) No such file or directory
    2017-11-24 00:44:26.534901 7ffaf5117e40 -1 mon.node1@-1(probing) e1 unable to load initial keyring /etc/ceph/ceph.mon.node1.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
    2017-11-24 00:44:26.534916 7ffaf5117e40 -1 failed to initialize

At this point we can conclude that the file the monitor depends on at initialization is /var/lib/ceph/mon/ceph-node1/keyring, not /etc/ceph/ceph.mon.keyring.
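If that file does go missing, one way to get the monitor back, sketched under the assumption that the other monitors are still healthy (as they are here), is to restore it from elsewhere, since the mon. key is shared across the cluster:

    # Option 1: copy the keyring from a still-healthy monitor node (hostnames from this cluster)
    scp node2:/var/lib/ceph/mon/ceph-node2/keyring /var/lib/ceph/mon/ceph-node1/keyring
    chown ceph:ceph /var/lib/ceph/mon/ceph-node1/keyring

    # Option 2: export it from a node that can still reach the cluster
    ceph auth get mon. -o /var/lib/ceph/mon/ceph-node1/keyring

    # Then restart the monitor
    systemctl restart ceph-mon@node1.service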

Modifying the Mon keyring

The original keyring:

    [root@node1 ceph-node1]# cat keyring
    [mon.]
    key = AQCo7fdZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph-node1]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"

Replace the run of five A's in the middle with five C's:

    [root@node1 ceph-node1]# vim keyring
    [root@node1 ceph-node1]# cat keyring
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"

Restart and check the mon keyring.

The expected result:

    [root@node1 ceph-node1]# systemctl restart ceph-mon.target
    [root@node1 ceph-node1]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"

The puzzling reality:

    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZCCCCCBAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    [root@node1 ceph]# ceph auth get mon.
    exported keyring for mon.
    [mon.]
    key = AQCo7fdZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"

As you can see, sometimes we get the keyring from before the change and sometimes the one from after. Faced with this, let's look at the logs to see where the keyring is actually being fetched from.

The log in node1's mon.log:

    2017-11-24 09:30:08.697047 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:08.697106 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/1169357136' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:10.020571 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:10.020641 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/2455152702' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:11.393391 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:11.393452 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/1704778092' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:12.669987 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:12.670049 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/275069695' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:14.113077 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:14.113147 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/3800873459' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:15.742038 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:15.742106 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/1908944728' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:30:17.629681 7f9b73e09700 0 mon.node1@0(leader) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:30:17.629729 7f9b73e09700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/2193002591' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch

The log in node2's mon.log:

    2017-11-24 09:29:23.799402 7fdb3c0ae700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/4284881078' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:29:26.030516 7fdb3c0ae700 0 mon.node2@1(peon) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:29:26.030588 7fdb3c0ae700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/4157525590' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch
    2017-11-24 09:29:38.637677 7fdb3c0ae700 0 mon.node2@1(peon) e1 handle_command mon_command({"prefix": "auth get", "entity": "mon."} v 0) v1
    2017-11-24 09:29:38.637748 7fdb3c0ae700 0 log_channel(audit) log [INF] : from='client.? 192.168.1.58:0/4028820259' entity='client.admin' cmd=[{"prefix": "auth get", "entity": "mon."}]: dispatch

Conclusion:

The logs show that the repeated `ceph auth get mon.` requests are dispatched sometimes to mon.node1 (the leader) and sometimes to mon.node2 (a peon). Each monitor appears to answer for mon. out of its own local keyring file, so node1 returns the modified key while the other monitors still return the original one, which is why the result flip-flops. In other words, the mon. key is not kept in the monitor database, and its content is never cross-checked between monitors.
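If you want a deterministic answer while experimenting, you can pin the client to a single monitor. A small sketch, using the monitor addresses from ceph.conf above:

    # Ask node1's monitor only (it has the modified local keyring)
    ceph -m 192.168.1.58:6789 auth get mon.

    # Ask node2's monitor only (it still has the original keyring)
    ceph -m 192.168.1.61:6789 auth get mon.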

Modifying and repairing the OSD keyring

When an OSD starts, it needs a key to log in to the cluster. This key is stored in the monitor database, so at login the local keyring is checked against the one held by the monitors, and the OSD only starts successfully if they match.

Below we deliberately corrupt the local OSD keyring and then restart the OSD to see what happens:

    # Tamper with the key file
    [root@node3 ceph]# cd /var/lib/ceph/osd/ceph-2
    [root@node3 ceph-2]# ls
    activate.monmap active block bluefs ceph_fsid fsid keyring kv_backend magic mkfs_done ready systemd type whoami
    [root@node3 ceph-2]# cat keyring
    [osd.2]
    key = AQCp8/dZ4BHbHxAA/GXihrjCOB+7kZJfgnSy+Q==
    [root@node3 ceph-2]# vim keyring
    [root@node3 ceph-2]# cat keyring
    [osd.2]
    key = BBBp8/dZ4BHbHxAA/GXihrjCOB+7kZJfgnSy+Q==
    [root@node3 ceph-2]# systemctl restart ceph-osd
    ceph-osd@ ceph-osd@2.service ceph-osd@5.service ceph-osd.target
    [root@node3 ceph-2]# systemctl restart ceph-osd
    ceph-osd@ ceph-osd@2.service ceph-osd@5.service ceph-osd.target
    [root@node3 ceph-2]# systemctl restart ceph-osd@2.service
    # After the restart the OSD is down
    [root@node3 ceph-2]# ceph osd tree | grep osd.2
    2 hdd 0.00980 osd.2 down 1.00000 1.00000

The log shows that init failed because authentication failed:

    2017-11-27 23:52:18.069207 7fae1e8d2d00 -1 auth: error parsing file /var/lib/ceph/osd/ceph-2/keyring
    2017-11-27 23:52:18.069285 7fae1e8d2d00 -1 auth: failed to load /var/lib/ceph/osd/ceph-2/keyring: (5) Input/output error
    ...
    2017-11-27 23:52:41.232803 7f58d15ded00 -1 ** ERROR: osd init failed: (5) Input/output error

We can fetch the correct keyring from the monitor database, fix the broken one, and restart the OSD:

    # Query the osd keyring held in the monitor database
    [root@node3 ceph-2]# ceph auth get osd.2
    exported keyring for osd.2
    [osd.2]
    key = AQCp8/dZ4BHbHxAA/GXihrjCOB+7kZJfgnSy+Q==
    caps mgr = "allow profile osd"
    caps mon = "allow profile osd"
    caps osd = "allow *"
    # Fix the keyring
    [root@node3 ceph-2]# vim keyring
    [root@node3 ceph-2]# cat keyring
    [osd.2]
    key = AQCp8/dZ4BHbHxAA/GXihrjCOB+7kZJfgnSy+Q==
    [root@node3 ceph-2]# systemctl restart ceph-osd@2.service
    # After the restart, osd.2 is back up
    [root@node3 ceph-2]# ceph osd tree | grep osd.2
    2 hdd 0.00980 osd.2 up 1.00000 1.00000

Conclusion:

An OSD needs the correct keyring to start; with a wrong one it will not come up. The correct keyring is stored in the monitor database.
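Rather than pasting the key into vim by hand, you can let `ceph auth get` write the file directly; a sketch of the same repair, with the caveat that the exported file also carries the caps lines, which does no harm in a keyring file:

    # Overwrite the local keyring with the authoritative copy from the monitors
    ceph auth get osd.2 -o /var/lib/ceph/osd/ceph-2/keyring
    systemctl restart ceph-osd@2.service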

Modifying and repairing the client keyring

Earlier, by deleting the client keyring, we verified that with auth = cephx a client needs a keyring to access the cluster. Is the client like the monitor, which doesn't care about the keyring's content, or like the OSD, which needs the keyring to match exactly?

    # Modify ceph.client.admin.keyring
    [root@node3 ceph-2]# cd /etc/ceph/
    [root@node3 ceph]# ls
    ceph.client.admin.keyring ceph.conf ceph-deploy-ceph.log rbdmap
    [root@node3 ceph]# cat ceph.client.admin.keyring
    [client.admin]
    key = AQDL7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    [root@node3 ceph]# vim ceph.client.admin.keyring
    [root@node3 ceph]# cat ceph.client.admin.keyring
    [client.admin]
    key = BBBB7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    # Accessing the cluster now fails
    [root@node3 ceph]# ceph -s
    2017-11-28 00:06:05.771604 7f3a69ccf700 -1 auth: error parsing file /etc/ceph/ceph.client.admin.keyring
    2017-11-28 00:06:05.771622 7f3a69ccf700 -1 auth: failed to load /etc/ceph/ceph.client.admin.keyring: (5) Input/output error
    2017-11-28 00:06:05.771634 7f3a69ccf700 0 librados: client.admin initialization error (5) Input/output error
    [errno 5] error connecting to the cluster

Clearly the cluster can only be accessed with a correct keyring. So how do we repair it? As you might guess, the principle is the same as for the OSD: the correct keyring also lives in the monitor database.

    # Fetching client.admin directly fails
    [root@node3 ceph]# ceph auth get client.admin
    2017-11-28 00:08:19.159073 7fcabb297700 -1 auth: error parsing file /etc/ceph/ceph.client.admin.keyring
    2017-11-28 00:08:19.159079 7fcabb297700 -1 auth: failed to load /etc/ceph/ceph.client.admin.keyring: (5) Input/output error
    2017-11-28 00:08:19.159090 7fcabb297700 0 librados: client.admin initialization error (5) Input/output error
    [errno 5] error connecting to the cluster
    # We have to authenticate with the monitor keyring to fetch client.admin's keyring
    [root@node3 ceph]# ceph auth get client.admin --name mon. --keyring /var/lib/ceph/mon/ceph-node3/keyring
    exported keyring for client.admin
    [client.admin]
    key = AQDL7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
    # Fix the keyring
    [root@node3 ceph]# vim ceph
    ceph.client.admin.keyring ceph.conf ceph-deploy-ceph.log
    [root@node3 ceph]# vim ceph.client.admin.keyring
    [root@node3 ceph]# cat ceph.client.admin.keyring
    [client.admin]
    key = AQDL7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    # The cluster is accessible again
    [root@node3 ceph]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_WARN
    ...

One surprising detail: fetching the OSD keyring with ceph auth above worked directly, while fetching client.admin.keyring required adding the monitor keyring options. The error messages explain why: ceph auth itself has to connect to the cluster as a client first, and the default client.admin keyring it would use is exactly the one we broke.

Conclusion:

Like an OSD, a client needs a keyring that matches the corresponding entry in the monitor database before it can access the cluster; and when client.admin.keyring is wrong, reading the correct keyring back with ceph auth requires passing the monitor keyring options.
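The repair can again be collapsed into a single export, this time authenticating as mon.; a sketch, using node3's monitor keyring path from this cluster:

    # Rewrite the admin keyring straight from the monitor database
    ceph auth get client.admin \
        --name mon. --keyring /var/lib/ceph/mon/ceph-node3/keyring \
        -o /etc/ceph/ceph.client.admin.keyring
    ceph -s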

Mon Caps

The r capability

The r capability on the monitor is read permission. What exactly does that cover? Here, read permission means permission to read the information in the monitor's database. As the keeper of cluster state, the MON stores in its database (/var/lib/ceph/mon/ceph-$hostname/store.db) the set of cluster maps, which include, but are not limited to, the Monitor Map, the OSD Map, the PG Map, the CRUSH Map and the MDS Map.

So next we create a new user that has only read permission and use it to check exactly what the read capability allows:

    ceph auth get-or-create client.mon_r mon 'allow r' >> /root/key
    [root@node3 ceph]# ceph auth get client.mon_r
    exported keyring for client.mon_r
    [client.mon_r]
    key = AQABvRxaBS6BBhAAz9uwjYCT4xKavJhobIK3ig==
    caps mon = "allow r"
    ceph --name client.mon_r --keyring /root/key -s // ok
    ceph --name client.mon_r --keyring /root/key osd crush dump // ok
    ceph --name client.mon_r --keyring /root/key osd getcrushmap -o crushmap.map // ok
    ceph --name client.mon_r --keyring /root/key osd dump // ok
    ceph --name client.mon_r --keyring /root/key osd tree // ok
    ceph --name client.mon_r --keyring /root/key osd stat // ok
    ceph --name client.mon_r --keyring /root/key pg dump // ok
    ceph --name client.mon_r --keyring /root/key pg stat // ok

I then tried two write operations, and both were rejected with permission denied:

    [root@node3 ceph]# rados --name client.mon_r --keyring /root/key -p testpool put crush crushmap.map
    error putting testpool/crush: (1) Operation not permitted
    [root@node3 ceph]# ceph --name client.mon_r --keyring /root/key osd out osd.0
    Error EACCES: access denied

Note:

Although the commands above return osd and pg information, all of it falls under the cluster-map umbrella (CRUSH map, OSD map, PG map and so on), so all of this state data is fetched from the monitor.

Conclusion:

The monitor's read capability corresponds to reading the various maps from the monitor database, as detailed above. It only allows reading state information: it gives no access to actual object data and no write operations against daemons such as OSDs.
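When experimenting with capabilities like this, you don't have to create a fresh user for every combination; existing caps can be updated in place (note that ceph auth caps replaces all of the entity's caps). A small sketch using the test user above:

    # Grant the same user read/write on the monitor instead of read-only
    ceph auth caps client.mon_r mon 'allow rw'

    # Check the result
    ceph auth get client.mon_r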

The w capability

The w capability only has effect in combination with r; with w alone, every command is met with access denied. So to test w we have to grant r alongside it:

    ceph auth get-or-create client.mon_rw mon 'allow rw' >> /root/key

With w we can perform non-read operations on the cluster's components, for example:

    # Mark an OSD out
    ceph osd out <osd-id>
    # Remove an OSD
    ceph osd rm <osd-id>
    # Repair a PG
    ceph pg repair <pg-id>
    # Replace the CRUSH map
    ceph osd setcrushmap -i <crushmap-file>
    # Remove a MON
    ceph mon rm <mon-id>
    ...
    # There are many more; no need to list them all

Conclusion:

The mon r capability lets you read the state of each component in the cluster but not change it; the w capability is what allows those changes.

Note:

The write access that w grants here only covers changing component state; it does not include read or write access to the cluster's objects. Component state lives in the MON while object data lives on the OSDs, and this w is only a write capability on the MON, so this is exactly what you would expect.
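To actually read and write objects, a user needs OSD capabilities in addition to mon ones. A sketch of what such a user might look like; the pool name testpool is taken from the rados test earlier, while the user name is made up here:

    # Monitor read access plus object read/write restricted to one pool
    ceph auth get-or-create client.data_rw \
        mon 'allow r' \
        osd 'allow rw pool=testpool' >> /root/data_rw.key

    # With these caps, object writes to testpool should succeed
    rados --name client.data_rw --keyring /root/data_rw.key -p testpool put hello /etc/hosts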

The x capability

The x capability on the MON is very narrow: it only relates to auth, for commands such as ceph auth list and ceph auth get. Like w, the x capability only takes effect together with r:

    # Accessing auth list with the rw user created above fails
    [root@node3 ~]# ceph --name client.mon_rw --keyring /root/key auth list
    2017-11-28 21:28:10.620537 7f0d15967700 0 librados: client.mon_rw authentication error (22) Invalid argument
    InvalidArgumentError does not take keyword arguments
    # With a user that has rx, auth list succeeds
    [root@node3 ~]# ceph --name client.mon_rx --keyring /root/key auth list
    installed auth entries:
    osd.0
    key: AQDaTgBav2MgDBAALE1GEEfbQN73xh8V7ISvFA==
    caps: [mgr] allow profile osd
    caps: [mon] allow profile osd
    caps: [osd] allow *
    ...
    ...

Note that the original article by 徐小胖 appears to have a typo here: he ran the command as client.mon.rw. This is the kind of thing you only notice by actually trying it, which shows how much practice reveals that reading alone does not.

Conclusion:

The x capability also only works in combination with r, and it only covers auth-related operations.
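The creation of the client.mon_rx user is not shown above; presumably it was created along the same lines as the other test users, something like:

    # Monitor read plus execute (auth-related) capability
    ceph auth get-or-create client.mon_rx mon 'allow rx' >> /root/key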

The * capability

There is not much to say here; as you would guess, * simply means all of r, w and x.

OSD Caps

This chapter still needs some more research before it can be published.

Recovering after losing all keyrings

If all the keys are deleted, can they really be recovered? "All the keys" here covers everything removed below: the mon. keyring, the client.admin keyring, and, for good measure, ceph.conf as well.

    # Delete the mon keyring
    [root@node1 ceph-node1]# mv keyring /root/
    # Delete ceph.conf
    [root@node1 ceph-node1]# mv /etc/ceph/ceph.conf /root/
    # Delete client.admin.keyring
    [root@node1 ceph-node1]# mv /etc/ceph/ceph.client.admin.keyring /root
    # Trying to access the cluster fails
    [root@node1 ceph-node1]# ceph -s
    2017-11-29 23:57:14.195467 7f25dc4cc700 -1 Errors while parsing config file!
    2017-11-29 23:57:14.195571 7f25dc4cc700 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
    2017-11-29 23:57:14.195579 7f25dc4cc700 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
    2017-11-29 23:57:14.195580 7f25dc4cc700 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
    Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)
    # Trying to fetch the auth list also fails
    [root@node1 ceph-node1]# ceph auth list
    2017-11-29 23:57:27.037435 7f162c5a7700 -1 Errors while parsing config file!
    2017-11-29 23:57:27.037450 7f162c5a7700 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
    2017-11-29 23:57:27.037452 7f162c5a7700 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
    2017-11-29 23:57:27.037453 7f162c5a7700 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
    Error initializing cluster client: ObjectNotFound('error calling conf_read_file',)

OK, let's start the repair:

Forging a Mon keyring

In Ceph, the credentials of every account except mon. are stored in the monitors' leveldb database; the mon. account's key is not in the database but is read from the keyring file in the mon directory when the MON starts, which is exactly the conclusion we reached earlier. So we can simply forge a keyring, drop it into the mon directory, sync it to every mon node (a sketch of the sync step follows below), and restart the three mons.

    [root@node1 ceph-node1]# cd /var/lib/ceph/mon/ceph-node1/
    [root@node1 ceph-node1]# ls
    done kv_backend store.db systemd
    [root@node1 ceph-node1]# vim keyring
    # The forged keyring -- note the "tony" in the middle, an obvious giveaway that it is fake
    [root@node1 ceph-node1]# cat keyring
    [mon.]
    key = AQCtonyZAAAAABAAQOysx+Yxbno/2N8W1huZFA==
    caps mon = "allow *"
    # Restart the mon
    [root@node1 ceph-node1]# service ceph-mon@node1 restart
    Redirecting to /bin/systemctl restart ceph-mon@node1.service

And the effect:

    # The monitor log shows mon.node1@0 initializing successfully and being elected monitor leader
    2017-11-30 00:15:04.042157 7f8c4e28a700 0 log_channel(cluster) log [INF] : mon.node1 calling new monitor election
    2017-11-30 00:15:04.042299 7f8c4e28a700 1 mon.node1@0(electing).elector(934) init, last seen epoch 934
    2017-11-30 00:15:04.048498 7f8c4e28a700 0 log_channel(cluster) log [INF] : mon.node1 calling new monitor election
    2017-11-30 00:15:04.048605 7f8c4e28a700 1 mon.node1@0(electing).elector(937) init, last seen epoch 937, mid-election, bumping
    2017-11-30 00:15:04.078454 7f8c4e28a700 0 log_channel(cluster) log [INF] : mon.node1@0 won leader election with quorum 0,1,2
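As the paragraph above says, the forged keyring then needs to reach the other mon nodes as well. A sketch of that sync step, assuming the node2/node3 hostnames and standard mon paths of this cluster:

    # Copy the forged keyring to the other monitors and restart them
    for n in node2 node3; do
        scp /var/lib/ceph/mon/ceph-node1/keyring $n:/var/lib/ceph/mon/ceph-$n/keyring
        ssh $n "systemctl restart ceph-mon@$n.service"
    done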

Note (important):

Although the mon reads its keyring at startup without caring whether the content is "correct", that does not mean the keyring can be scribbled arbitrarily: it still has to follow a certain format. In my experiments the first three characters of the key had to be the uppercase AQC, and there are presumably other requirements too. Must it end with ==? Is the length fixed? There may be many such rules, and I did not have time to brute-force them one by one; they could be worked out later from the source code, and anyone interested should give it a try, you might find something interesting. Does all this mean forging is hard? Not really. The easiest approach is to copy a mon keyring over from another cluster (or generate one with ceph-authtool, as sketched after the log below). A carelessly hand-forged keyring makes the mon fail at startup like this:

    2017-11-29 23:49:50.134137 7fcab3e23700 -1 cephx: cephx_build_service_ticket_blob failed with error invalid key
    2017-11-29 23:49:50.134140 7fcab3e23700 0 mon.node1@0(probing) e1 ms_get_authorizer failed to build service ticket
    2017-11-29 23:49:50.134393 7fcab3e23700 0 -- 192.168.1.58:6789/0 >> 192.168.1.61:6789/0 conn(0x7fcacd15d800 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=0).handle_connect_reply connect got BADAUTHORIZER
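A less fiddly way to get a well-formed key than hand-editing or copying from another cluster is to let ceph-authtool generate one; a minimal sketch:

    # Generate a syntactically valid mon. keyring with a random key
    ceph-authtool --create-keyring /var/lib/ceph/mon/ceph-node1/keyring \
        --gen-key -n mon. --cap mon 'allow *'
    cat /var/lib/ceph/mon/ceph-node1/keyring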

Restoring ceph.conf

Without /etc/ceph/ceph.conf we cannot run any ceph commands, so we need to reconstruct it as faithfully as we can. The fsid can be recovered by reading the ceph_fsid file in any OSD directory (/var/lib/ceph/osd/ceph-$num/), and mon_initial_members and mon_host are just the hostnames and IPs of the cluster nodes, which we already know.

    # Restore ceph.conf
    [root@node1 ceph-node1]# cat /var/lib/ceph/osd/ceph-0/ceph_fsid
    99480db2-f92f-481f-b958-c03c261918c6
    [root@node1 ceph-node1]# vim /etc/ceph/ceph.conf
    [root@node1 ceph-node1]# cat /etc/ceph/ceph.conf
    [global]
    fsid = 99480db2-f92f-481f-b958-c03c261918c6
    mon_initial_members = node1, node2, node3
    mon_host = 192.168.1.58,192.168.1.61,192.168.1.62
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx
    public network = 192.168.1.0/24
    # Accessing the cluster with the mon keyring now works
    [root@node1 ceph-node1]# ceph -s --name mon. --keyring /var/lib/ceph/mon/ceph-node1/keyring
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_OK
    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1_mgr(active)
    osd: 6 osds: 6 up, 6 in

Recovering ceph.client.admin.keyring

With the mon keyring in hand and ceph commands working again, we can fetch any keyring from the monitors' leveldb with ceph auth get:

    # Fetch client.admin.keyring via the mon
    [root@node1 ceph-node1]# ceph --name mon. --keyring /var/lib/ceph/mon/ceph-node1/keyring auth get client.admin
    exported keyring for client.admin
    [client.admin]
    key = AQDL7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
    # Create /etc/ceph/ceph.client.admin.keyring and paste the content above into it
    [root@node1 ceph-node1]# vim /etc/ceph/ceph.client.admin.keyring
    [root@node1 ceph-node1]# cat /etc/ceph/ceph.client.admin.keyring
    [client.admin]
    key = AQDL7fdZWaQkIBAAsFhvFVQYqSeM/FVSY6o8TQ==
    caps mds = "allow *"
    caps mgr = "allow *"
    caps mon = "allow *"
    caps osd = "allow *"
    # A plain ceph -s now works again
    [root@node1 ceph-node1]# ceph -s
    cluster:
    id: 99480db2-f92f-481f-b958-c03c261918c6
    health: HEALTH_OK
    services:
    mon: 3 daemons, quorum node1,node2,node3
    mgr: node1_mgr(active)
    osd: 6 osds: 6 up, 6 in

Summary

First of all, thanks to 徐小胖 for pointing me toward cephx; I hope he keeps writing good articles, and I will keep reading them. This post took me a long time: as the log timestamps show, the work spans several days. Much of this practice simply cannot be done in one sitting; it takes repeated attempts and repeated thinking before it finally works. With Ceph you really have to get your hands dirty. Reading other people's articles is good, but remember to put them into practice, otherwise even the best article is just taken on faith: the author says something and you follow along, never knowing how much probing and experimenting went into one short sentence or conclusion. A command that seems to simply succeed, or the fact that a particular command is run at a particular step, may be the distillation of someone else's countless failures. So verify things yourself: besides confirming whether the original's claims are right, you often pick up other useful knowledge along the way.

This round of work has been very rewarding and has taken my understanding of cephx up another level. The article sorted out the role cephx plays in each component and the dependencies between them, then examined each component's caps, and finally gave detailed, step-by-step guidance on recovering each keyring. Two tasks remain unfinished and will be completed when I find the time.
