@tony-yin 2017-08-20T08:44:58.000000Z

Ptbus disk-mon daemon

Magicloud


Daemon background

Using the MCS3DiskMonitor daemon

  1. PID file path: /var/run/mcs3-smart-monitor.pid
  2. Daemon script location: /etc/init.d/
  3. Log file location: /var/log/mccloudstor/mcs3-disk-mon.log
  4. Operations:

    • service mcs3-smart-monitor start
    • service mcs3-smart-monitor stop
    • service mcs3-smart-monitor restart

After changing any daemon-related code, restart the daemon for the change to take effect.
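After a restart it can be useful to confirm that the daemon actually came back up. The following is a minimal sketch, assuming the PID file listed above contains a single integer PID; the helper names `read_pid` and `is_running` are hypothetical and not part of the daemon's code.

```python
import errno
import os

# Path taken from the list above.
PID_FILE = "/var/run/mcs3-smart-monitor.pid"


def read_pid(pid_file=PID_FILE):
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


def is_running(pid):
    """Return True if a process with this PID currently exists."""
    if pid is None or pid <= 0:
        return False
    try:
        # Signal 0 sends nothing; it only checks that the PID can be signalled.
        os.kill(pid, 0)
    except OSError as e:
        # EPERM means the process exists but is owned by another user.
        return e.errno == errno.EPERM
    return True
```

Calling `is_running(read_pid())` after `service mcs3-smart-monitor restart` gives a quick yes/no answer without parsing the output of `ps`.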

MCS3DiskMonitor Daemon process

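1. Avoid sending duplicate emails within a fixed interval: declare a global variable sent_mail_time, initialized to one hour before the current time. Each time an email is sent successfully, update the variable to the current time.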

    import datetime
    import socket

    # logger and utils are defined elsewhere in the daemon module.
    # Start one hour in the past so the first notification is never suppressed.
    sent_mail_time = datetime.datetime.now() - datetime.timedelta(hours=1)

    def send_disk_status_notification(disk_status):
        global sent_mail_time
        now = datetime.datetime.now()
        if now < sent_mail_time + datetime.timedelta(hours=1):
            logger.info("Notification sent within one hour before. System will not send again.")
            return
        host = socket.gethostname()
        title = "Host {} Disk Health Status Warning!".format(host)
        message = disk_status
        try:
            utils.send_notification(title, message)
            # Update only after a successful send, so a failed send is retried.
            sent_mail_time = datetime.datetime.now()
        except Exception as e:
            logger.error(str(e))

2. An error raised by a shell command can abort the code that follows it. Commands are executed in the following places:

Line 48:

    output = utils.do_cmd("zpool status|grep state", force=True)

Lines 71-75:

    VDSTATE1 = do_cmd(MEGACLI_BIN + " -cfgdsply -aALL -NoLog | grep State")
    VDSTATE2 = do_cmd(MEGACLI_BIN + " -AdpAllInfo -aALL -NoLog | grep Degraded")
    VDSTATE3 = do_cmd(MEGACLI_BIN + " -AdpAllInfo -aALL -NoLog | grep Offline")
    PDSTATE1 = do_cmd(MEGACLI_BIN + " -AdpAllInfo -aALL -NoLog | grep \"Critical Disks\"")
    PDSTATE2 = do_cmd(MEGACLI_BIN + " -AdpAllInfo -aALL -NoLog | grep \"Failed Disks\"")

Temporary workaround: wrap each of these calls in its own try/except block.
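One way to apply this workaround without repeating try/except at every call site is a small wrapper. This is only a sketch: the signature and error behavior of the project's own `do_cmd` are not shown here, so the standard `subprocess` module stands in for it, and the names `safe_cmd` and the logger name `disk-mon` are hypothetical.

```python
import logging
import subprocess

logger = logging.getLogger("disk-mon")  # hypothetical logger name


def safe_cmd(cmd, default=""):
    """Run a shell command and return its output, or `default` if it fails.

    A failure is logged instead of raised, so one broken command cannot
    stop the checks that come after it.
    """
    try:
        out = subprocess.check_output(cmd, shell=True, stderr=subprocess.STDOUT)
        return out.decode("utf-8", "replace")
    except (subprocess.CalledProcessError, OSError) as e:
        logger.error("command failed: %s (%s)", cmd, e)
        return default
```

Note that `grep` exits with status 1 when nothing matches, so a pipeline ending in `grep` counts as a failure here and yields `default`. For checks such as `grep Degraded`, an empty result is therefore the normal, healthy case rather than an error.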

3. Getting SSD disk information

    smartctl -a -d megaraid,{} {}|grep 'Media_Wearout_Indicator'

    smartctl -a -d megaraid,{} {}|grep 'Serial Number'

    cat /sys/block/{}/queue/rotational   # {} is the device name, e.g. sda or sdb; prints 0 for non-rotational (SSD), 1 for rotational
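The output of these commands still has to be parsed. The sketch below shows one possible way, assuming the usual `smartctl -a` layout: a `Serial Number:` line in the information section, and attribute rows whose columns are `ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH ...`. The function names are hypothetical, and the `sys_block` parameter exists only so the function can be pointed at a different directory.

```python
import re


def is_ssd(device, sys_block="/sys/block"):
    """Return True if the kernel reports `device` (e.g. "sda") as non-rotational."""
    path = "{}/{}/queue/rotational".format(sys_block, device)
    with open(path) as f:
        return f.read().strip() == "0"


def parse_wearout(smart_output):
    """Return the normalized Media_Wearout_Indicator value, or None if absent."""
    for line in smart_output.splitlines():
        if "Media_Wearout_Indicator" in line:
            fields = line.split()
            # Columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH ...
            return int(fields[3])
    return None


def parse_serial(smart_output):
    """Return the drive serial number from smartctl output, or None if absent."""
    m = re.search(r"^Serial Number:\s*(\S+)", smart_output, re.MULTILINE)
    return m.group(1) if m else None
```

Both parsers take the full `smartctl -a` text, so a single smartctl call per disk is enough and the `grep` step becomes optional.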