
If you manage a Dell PowerEdge, there will come a time when you want to know about the health of your machine. Here are a few notes.

The first thing you will probably try is the Dell management server (OMSA). It’s the usual Java monster with a web interface and everything in it. I don’t like it, so I installed OpenIPMI and got the MegaCli binary from the LSI website.

OpenIPMI is easy: apt-get install openipmi ipmitool

Then we need to load the IPMI kernel modules:

ipmi_si
ipmi_devintf
ipmi_msghandler
ipmi_poweroff
ipmi_watchdog
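
A minimal sketch of loading those modules and then asking the BMC for its sensors and event log. The RUN variable is a hypothetical dry-run switch added for illustration, not part of any tool: by default the commands are only printed, so nothing happens until you rerun it as root with RUN set to an empty string.

```shell
#!/bin/sh
# Modules named in the list above.
IPMI_MODULES="ipmi_si ipmi_devintf ipmi_msghandler ipmi_poweroff ipmi_watchdog"

# Hypothetical dry-run switch: RUN defaults to "echo", so commands are only
# printed. Call the script with RUN= (empty) as root to really execute them.
RUN=${RUN-echo}

load_ipmi_modules() {
    for m in $IPMI_MODULES; do
        # modprobe resolves module dependencies by itself, so order is not critical.
        $RUN modprobe "$m"
    done
}

load_ipmi_modules
# With the device interface loaded, ipmitool can talk to the local BMC.
$RUN ipmitool sdr list   # sensor readings (temperatures, fans, voltages)
$RUN ipmitool sel list   # hardware event log
```

On Debian, listing the same module names in /etc/modules, one per line, loads them at boot.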

More info about IPMI on Debian can be found here.

The next thing is to get the LSI RAID monitoring tool, MegaCli. This is a compendium for this tool. The utility is not packaged for Debian and is actually a 32-bit binary. You can download it from here. Then you just need to extract the binary with unzip and rpm.
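
The extraction step could look roughly like the sketch below. Several things here are assumptions: the function name extract_megacli is made up for illustration, the archive is assumed to contain one or more .rpm files directly (a nested zip would need a second unzip), and it uses rpm2cpio piped into cpio instead of the rpm command itself, so that nothing gets installed into the system.

```shell
#!/bin/sh
# Unpack a MegaCli download into DEST without installing the RPM.
# usage: extract_megacli ARCHIVE.zip DEST   (hypothetical helper)
extract_megacli() {
    zip=$1
    dest=$2
    if [ -z "$zip" ] || [ -z "$dest" ]; then
        echo "usage: extract_megacli ARCHIVE.zip DEST" >&2
        return 2
    fi
    if [ ! -f "$zip" ]; then
        echo "no such archive: $zip" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    unzip -o -q "$zip" -d "$dest" || return 1
    # Unpack every RPM payload in place; rpm2cpio writes a cpio stream to stdout.
    (
        cd "$dest" || exit 1
        find . -name '*.rpm' | while IFS= read -r rpm; do
            rpm2cpio "$rpm" | cpio -idm --quiet
        done
    )
    # List candidate binaries so the right one can be copied where you want it.
    find "$dest" -type f -iname 'megacli*' ! -name '*.rpm' ! -name '*.zip'
}
```

Calling it as extract_megacli download.zip /tmp/megacli (both names are placeholders) prints the paths of whatever MegaCli files were unpacked; copy the binary to the location your monitoring script expects.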

If you really want to use OMSA, this page gives you all the info you need.

Monitoring Script

From HERE.

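#!/bin/sh
# Report RAID health for every controller listed in CONT.
# Exit status is 1 if any array is degraded or any disk has failed.
CONT="a0"
STATUS=0
MEGACLI=/root/bin/megacli

printf 'Checking RAID status on %s\n' "$(hostname)"
for a in $CONT
 do

   NAME=$($MEGACLI -AdpAllInfo "-$a" | grep "Product Name" | cut -d: -f2)
   echo "Controller $a: $NAME"
   noonline=$($MEGACLI -PDList "-$a" | grep -c Online)
   echo "No of Physical disks online : $noonline"
   # Expected line: "Degraded : 0". The unquoted echo collapses the padding
   # so that cut sees single spaces between fields.
   DEGRADED=$($MEGACLI -AdpAllInfo "-$a" | grep "Degrade")
   echo $DEGRADED
   NUM_DEGRADED=$(echo $DEGRADED | cut -d" " -f3)
   # A missing value counts as a problem rather than being silently ignored.
   [ "${NUM_DEGRADED:-1}" -ne 0 ] && STATUS=1
   # Expected line: "Failed Disks : 0".
   FAILED=$($MEGACLI -AdpAllInfo "-$a" | grep "Failed Disks")
   echo $FAILED
   NUM_FAILED=$(echo $FAILED | cut -d" " -f4)
   [ "${NUM_FAILED:-1}" -ne 0 ] && STATUS=1

 done

exit $STATUS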
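
The field positions used with cut above depend on how many words the label has. A slightly more forgiving variant, sketched below, splits on the colon instead. The helper names adp_value and check_count are made up for illustration, and the "Label : N" line format is the one the script above already assumes, not something taken from the MegaCli documentation.

```shell
#!/bin/sh
# Print the number from the first "<label> : <n>" line on stdin,
# or nothing if no line has exactly that label.
adp_value() {
    awk -F: -v label="$1" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the label
            if (key == label) {
                val = $2
                gsub(/[^0-9]/, "", val)        # keep digits only
                print val
                exit
            }
        }'
}

# Read controller info on stdin and judge one counter.
# Returns 0 when the count is zero, 1 when it is non-zero, 2 when it is missing.
check_count() {
    n=$(adp_value "$1")
    case "$n" in
        ''|*[!0-9]*) echo "UNKNOWN: $1 not found"; return 2 ;;
        0)           echo "OK: $1 = 0";            return 0 ;;
        *)           echo "CRITICAL: $1 = $n";     return 1 ;;
    esac
}
```

Inside the loop this would be used as $MEGACLI -AdpAllInfo "-$a" | check_count "Failed Disks" || STATUS=1, and the same for "Degraded".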

Open source RAID card monitor

http://megactl.svn.sourceforge.net/viewvc/megactl/trunk/

It would be nice to have a munin plugin for these monitoring tools.