Überwachen von Linux-Server-Metriken in Home Assistant über mqtt

Es war notwendig, einen anderen Server zu Hause zu platzieren, und ich machte mich daran, seine Leistung in einem Smart Home zu Hause zu überwachen, das von Home Assistant verwendet wird. Schnelles und dann nachdenkliches Googeln gab mir keine universellen Lösungen, also baute ich mein eigenes Fahrrad.





Einführung: Wir werden die Prozessorlast und -temperatur, die RAM- und Swap-Last, den freien Speicherplatz, die Betriebsdauer, die Gesamtsystemlast, die Temperatur und den Status von Smart Disks separat sowie den Status des RAIDs (auf einem Server mit Ubuntu-Server 20) überwachen. ein einfacher Software-Raid1 wurde ausgelöst) ... WD Green-Laufwerke, GA-525-Motherboard mit integriertem Atom525.



Der Moskito-Broker wurde bereits auf dem Smart-Home-Server eingerichtet, daher wurde mqtt als Datenübertragungsmethode ausgewählt.



In den ersten Abschnitten dieser Arbeit werden die Prinzipien der angewandten Datenerfassungsmethoden und am Ende die Datenübertragungsskripte und HA-Einstellungen angegeben.



Alle Befehle in den Beispielen werden als root ausgeführt.





Inhaltsverzeichnis

Sammeln von Systemsensoren

Sammeln von Systemlastdaten

Sammeln von Festplattenzustandsdaten

Sammeln von RAID-

Statusdaten Senden von gesammelten Daten

Konfigurieren von Home Assistant





Systemsensorwerte

Um die eingebauten Sensoren zu erhalten, verwenden wir das Dienstprogramm Sensoren





Wenn es nicht installiert ist, setzen Sie es: apt-get install lm-sensors







Zunächst müssen Sie alle verfügbaren Sensoren finden. Wir führen den Befehl aus sensors-detect





und beantworten alle Fragen y . Danach können Sie sehen, was passiert ist:sensors







Es sollte beachtet werden, dass meine Sensoren persönlich alle Sensoren anzeigten, die erst nach einem Neustart gefunden wurden. Vielleicht eine Art Fehler, ich weiß es nicht.





. sensors json, . sensors -A -u -j



json. , .







, . . json - jp. - ubuntu :



apt-get install jq







xpath . , -.





. , , , temp3, :



sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input'

sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input'

sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input'








, , , , .





. - free. , -m, .





, . - , .



free -m | grep "Mem" | awk '{print $2}'







grep , awk - , . , . .





, df. , , , . - , . : df









df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//'

df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//'








/proc/loadavg. , - , . , , / 1, 5 15 . . , ( ) , '? 15 :



cat /proc/loadavg | awk '{print $3}'







uptime:



uptime | awk '{print $3}' | sed 's/,$//'







mpstat. , , . , , . , , , . mpstat , apt install sysstat. ,



mpstat | grep all | awk '{print $13}'







, .



, , . bash . bc



cpuidle=$(mpstat | grep all | awk '{print $13}')

cpuload=$(echo "100-$cpuidle" | bc -l)

echo " : $cpuload"








hddtemp. , :



apt-get install hddtemp







: , -n :





SMART smartmontools



apt-get install smartmontools







, -a, .



smartctl -a /dev/sda







, . , . . :





  • Raw_Read_Error_Rate — . , . , . . , ;





  • Reallocated_Sector_Ct — . ;





  • Seek_Error_Rate — . ;





  • Spin_Retry_Count — . ;





  • Reallocated_Event_Count — ;





  • Offline_Uncorrectable — . .





, - json. -j, :



smartctl -a -j /dev/sda







json, . . , . json xpath .





xpath, jq, ( ):





smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value' #Raw_Read_Error_Rate

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value' #Reallocated_Sector_Ct

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value' #Seek_Error_Rate

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value' #Spin_Retry_Count

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value' #Reallocated_Event_Count

smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value' #Offline_Uncorrectable








, " - " - -H, . -j, json.





json:



smartctl -a /dev/sda -j | jq '.smart_status.passed' #smart_status







, ()

, , , cron . .



smartctl -t short /dev/sda







, 2



smartctl -t long /dev/sda







, 1 .



, , smartd, , . , . smartd .





RAID

raid mdadm. , /var. , mdadm , raid .



, - sys. [1] [2]



- . .





, cat /proc/mdstat





- :









echo 'check' >/sys/block/md126/md/sync_action

echo 'check' >/sys/block/md127/md/sync_action












cat /sys/block/md126/md/mismatch_cnt

cat /sys/block/md127/md/mismatch_cnt








0, .





, .





mosquitto, :



apt-get install mosquitto-clients







- , . - ( ), ( raid ), ( smart):



touch system.sh && touch drives.sh && touch smart.sh

chmod u+x system.sh && chmod u+x drives.sh && chmod u+x smart.sh








:





system.sh
#!/bin/bash
#      
ip=xx.xx.xx.xx
usr="xx"
pass="xx"



tempdrive1=$(hddtemp "/dev/sda" -n)
echo "  1: $tempdrive1"
tempdrive2=$(hddtemp "/dev/sdb" -n)
echo "  2: $tempdrive2"


tempcpu=$(sensors -A -u -j | jq '.["coretemp-isa-0000"]["Core 0"].temp2_input')
echo " : $tempcpu"
fan=$(sensors -A -u -j | jq '.["it8720-isa-0290"].fan1.fan1_input')
echo "  : $fan"
temp3=$(sensors -A -u -j | jq '.["it8720-isa-0290"].temp3.temp3_input')
echo " : $temp3"

totalram=$(free -m | grep "Mem" | awk '{print $2}')
echo " : $totalram"
usedram=$(free -m | grep "Mem" | awk '{print $3}')
echo "  : $usedram"
usedrampercent=$(($usedram * 100 / $totalram))
echo "    : $usedrampercent"

totalswap=$(free -m | grep "Swap" | awk '{print $2}')
echo " : $totalswap"
usedswap=$(free -m | grep "Swap" | awk '{print $3}')
echo "  : $usedswap"
usedswappercent=$(($usedswap * 100 / $totalswap))
echo "    : $usedswappercent"

averageload=$(cat /proc/loadavg | awk '{print $3}')
echo "  : $averageload"

uptimedata=$(uptime | awk '{print $3}' | sed 's/,$//')
echo ": $uptimedata"

cpuidle=$(mpstat | grep all | awk '{print $13}')
cpuload=$(echo "100-$cpuidle" | bc -l) # ,    bash      
echo "  : $cpuload"


echo " "
echo " "

mosquitto_pub -h $ip -t "srv/tempdrive1" -m $tempdrive1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/tempdrive2" -m $tempdrive2 -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/tempcpu" -m $tempcpu -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/fan" -m $fan -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/temp3" -m $temp3 -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/usedrampercent" -m $usedrampercent -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/usedswappercent" -m $usedswappercent -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/averageload" -m $averageload -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/uptimedata" -m $uptimedata -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/cpuload" -m $cpuload -u $usr -P $pass

      
      



drives.sh
#!/bin/bash
#      
ip=xx.xx.xx.xx
usr="xx"
pass="xx"



raid_system_status=$(cat /sys/block/md126/md/mismatch_cnt)
echo " RAID  : $raid_system_status"
raid_var_status=$(cat /sys/block/md127/md/mismatch_cnt)
echo " RAID  : $raid_var_status"

freesystemdisk=$(df | grep "/dev/md127p1" | awk '{print $5}' | sed 's/%$//')
echo "    : $freesystemdisk"
freedatadisk=$(df | grep "/dev/md126p1" | awk '{print $5}' | sed 's/%$//')
echo "    : $freedatadisk"

echo " "
echo " "

mosquitto_pub -h $ip -t "srv/raid_system_status" -m $raid_system_status -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/raid_var_status" -m $raid_var_status -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/freesystemdisk" -m $freesystemdisk -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/freedatadisk" -m $freedatadisk -u $usr -P $pass

      
      



smart.sh
#!/bin/bash
#      
ip=xx.xx.xx.xx
usr="xx"
pass="xx"



Raw_Read_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate  1: $Raw_Read_Error_Rate1"
Reallocated_Sector_Ct1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct  1: $Reallocated_Sector_Ct1"
Seek_Error_Rate1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate  1: $Seek_Error_Rate1"
Spin_Retry_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count  1: $Spin_Retry_Count1"
Reallocated_Event_Count1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count  1: $Reallocated_Event_Count1"
Offline_Uncorrectable1=$(smartctl -a /dev/sda -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable  1: $Offline_Uncorrectable1"

smart_status1=$(smartctl -a /dev/sda -j | jq '.smart_status.passed')
echo "  1: $smart_status1"

Raw_Read_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[0].raw.value')
echo "SMART Raw_Read_Error_Rate  2: $Raw_Read_Error_Rate2"
Reallocated_Sector_Ct2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[3].raw.value')
echo "SMART Reallocated_Sector_Ct  2: $Reallocated_Sector_Ct2"
Seek_Error_Rate2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[4].raw.value')
echo "SMART Seek_Error_Rate  2: $Seek_Error_Rate2"
Spin_Retry_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[6].raw.value')
echo "SMART Spin_Retry_Count  2: $Spin_Retry_Count2"
Reallocated_Event_Count2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[12].raw.value')
echo "SMART Reallocated_Event_Count  2: $Reallocated_Event_Count2"
Offline_Uncorrectable2=$(smartctl -a /dev/sdb -j | jq '.ata_smart_attributes.table[14].raw.value')
echo "SMART Offline_Uncorrectable  2: $Offline_Uncorrectable2"

smart_status2=$(smartctl -a /dev/sdb -j | jq '.smart_status.passed')
echo "  2: $smart_status2"

echo " "
echo " "

mosquitto_pub -h $ip -t "srv/Raw_Read_Error_Rate1" -m $Raw_Read_Error_Rate1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Reallocated_Sector_Ct1" -m $Reallocated_Sector_Ct1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Seek_Error_Rate1" -m $Seek_Error_Rate1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Spin_Retry_Count1" -m $Spin_Retry_Count1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Reallocated_Event_Count1" -m $Reallocated_Event_Count1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Offline_Uncorrectable1" -m $Offline_Uncorrectable1 -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/Raw_Read_Error_Rate2" -m $Raw_Read_Error_Rate2 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Reallocated_Sector_Ct2" -m $Reallocated_Sector_Ct2 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Seek_Error_Rate2" -m $Seek_Error_Rate2 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Spin_Retry_Count2" -m $Spin_Retry_Count2 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Reallocated_Event_Count2" -m $Reallocated_Event_Count2 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/Offline_Uncorrectable2" -m $Offline_Uncorrectable2 -u $usr -P $pass

mosquitto_pub -h $ip -t "srv/smart_status1" -m $smart_status1 -u $usr -P $pass
mosquitto_pub -h $ip -t "srv/smart_status2" -m $smart_status2 -u $usr -P $pass

      
      



, Mosquitto broker Home Assistant





, , , .





Home Assistant

, . Home Assistant .





sensor:
  - platform: mqtt
    state_topic: "srv/tempdrive1"
    name: " nextcloud   1"
    unit_of_measurement: °C
  - platform: mqtt
    state_topic: "srv/tempdrive2"
    name: " nextcloud   2"
    unit_of_measurement: °C
  - platform: mqtt
    state_topic: "srv/tempcpu"
    name: " nextcloud  "
    unit_of_measurement: °C
  - platform: mqtt
    state_topic: "srv/fan"
    name: " nextcloud  "
    unit_of_measurement: ppm
  - platform: mqtt
    state_topic: "srv/temp3"
    name: " nextcloud  "
    unit_of_measurement: °C
  - platform: mqtt
    state_topic: "srv/usedrampercent"
    name: " nextcloud  RAM"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/usedswappercent"
    name: " nextcloud  SWAP"
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freesystemdisk"
    name: " nextcloud     "
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/freedatadisk"
    name: " nextcloud     "
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/averageload"
    name: " nextcloud   "
  - platform: mqtt
    state_topic: "srv/uptimedata"
    name: " nextcloud "
  - platform: mqtt
    state_topic: "srv/cpuload"
    name: " nextcloud   "
    unit_of_measurement: "%"
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate1"
    name: " nextcloud  1 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct1"
    name: " nextcloud  1 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate1"
    name: " nextcloud  1 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count1"
    name: " nextcloud  1 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count1"
    name: " nextcloud  1 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable1"
    name: " nextcloud  1 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status1"
    name: " nextcloud  1 SMART "
  - platform: mqtt
    state_topic: "srv/Raw_Read_Error_Rate2"
    name: " nextcloud  2 SMART Raw_Read_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Reallocated_Sector_Ct2"
    name: " nextcloud  2 SMART Reallocated_Sector_Ct"
  - platform: mqtt
    state_topic: "srv/Seek_Error_Rate2"
    name: " nextcloud  2 SMART Seek_Error_Rate"
  - platform: mqtt
    state_topic: "srv/Spin_Retry_Count2"
    name: " nextcloud  2 SMART Spin_Retry_Count"
  - platform: mqtt
    state_topic: "srv/Reallocated_Event_Count2"
    name: " nextcloud  2 SMART Reallocated_Event_Count"
  - platform: mqtt
    state_topic: "srv/Offline_Uncorrectable2"
    name: " nextcloud  2 SMART Offline_Uncorrectable"
  - platform: mqtt
    state_topic: "srv/smart_status2"
    name: " nextcloud  2 SMART "
  - platform: mqtt
    state_topic: "srv/raid_system_status"
    name: " nextcloud RAID   "
  - platform: mqtt
    state_topic: "srv/raid_var_status"
    name: " nextcloud RAID   "
      
      



, , , ! . , , . :





, . , , smart .





- , . , . → → mqtt.



- linux , , , .





- . , . , .





Der Screenshot zeigt, dass der besprochene Server für nextcloud geplant ist. Die internen Indikatoren können auch perfekt zu HA hinzugefügt werden, dafür gibt es eine wunderbare API. Und HA hat eine integrierte Integration.








All Articles