IPMI

IPMI stands for "Intelligent Platform Management Interface". The web server for hardware administration offered by many BMCs (Baseboard Management Controllers) starts as soon as the machine is connected to mains power.

Vendors use different names for their BMCs:

  • Dell: iDRAC (integrated Dell Remote Access Controller; default login: root / calvin)

  • HP/HPE: iLO (integrated Lights Out)

  • Supermicro: simply BMC (default login: ADMIN / ADMIN)

Dell iDRAC

iDRAC stands for "integrated Dell Remote Access Controller". Among other things, it lets you manage RAIDs or monitor a machine's sensors from your desk, down to the charge level of the RAID controller's backup battery. When certain temperature thresholds are reached, iDRAC also throttles CPU performance, or it limits power consumption according to the admin's settings. It can be accessed

  • through the web interface.

  • via Serial over LAN (a serial port emulated through IPMI).

  • via SSH (proprietary shell).

  • via Telnet (disabled by default).

iDRAC is connected to the service network through its own network interface.

HPE iLO

iLO stands for "integrated Lights Out" (versions 2 and 3).

Users

Older IPMI versions up to v1.5 (interface "lan") accept passwords of at most 16 characters. IPMI 2.0 and later (interface "lanplus") raises the maximum password length to 20 characters. User number 1 is the "Anonymous" user.
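The two limits can be checked before a password is set. The following is a minimal sketch: the function names are made up for illustration and are not part of ipmitool, and the only facts encoded are the two limits stated above.

```shell
#!/bin/sh
# Hypothetical helpers; the limits are the ones given in the text.

# Print the maximum password length for an ipmitool interface name.
ipmi_max_password_len() {
    case "$1" in
        lan)     echo 16 ;;   # IPMI up to v1.5
        lanplus) echo 20 ;;   # IPMI v2.0 and later
        *)       echo "unknown interface: $1" >&2; return 1 ;;
    esac
}

# Succeed if the password ($2) fits the limit of the interface ($1).
ipmi_password_fits() {
    max=$(ipmi_max_password_len "$1") || return 2
    [ "${#2}" -le "$max" ]
}

# A 20-character password is fine for lanplus but too long for lan.
if ipmi_password_fits lan '01234567890123456789'; then
    echo "fits"
else
    echo "too long for lan"
fi
```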

Mail notification

This is what an IPMI error message sent by mail looks like when a disk reports SMART errors:

message
IP : 10.80.32.27
Hostname: host.example.com
SEL_TIME: 2017/09/03 21:09:53
SENSOR_NUMBER: ff
SENSOR_TYPE: HDD
SENSOR_NAME: OEM
EVENT_DESCRIPTION: Medium
EVENT_DIRECTION: Assertion
EVENT SEVERITY:"Information"
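If such mails are processed by a script, single fields can be pulled out with awk. This is only a sketch that assumes the "NAME: value" layout of the sample above; other BMC firmware may format its mails differently, and the function name alert_field is hypothetical.

```shell
#!/bin/sh
# Print the value of one field from an alert mail read on stdin.
# $1 = field name, e.g. SENSOR_TYPE
alert_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")          # split at the first colon only
            if (pos == 0) next
            name  = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim blanks
            gsub(/^[ \t"]+|[ \t"]+$/, "", value)   # trim blanks and quotes
            if (name == key) { print value; exit }
        }'
}

# Example with two lines of the sample mail:
alert_field SENSOR_TYPE <<'EOF'
IP : 10.80.32.27
SENSOR_TYPE: HDD
EOF
```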

ipmitool

ipmitool lets you administer IPMI-based system management interfaces from the command line.

Terms:

  • CT, cr: critical

  • na: Not Available

  • NC, nc: non-critical

  • NR, nr: non-recoverable

  • ns: Not Specified

  • SDR: Sensor Data Record, beschreibt einen bestimmten Sensor

  • SEL: System Event Log

Installation and configuration

yum -y install ipmitool

User

List the users, then give user number 2 a name and a password:

ipmitool user list
ipmitool user set name 2 ipmi-admin
ipmitool user set password 2

Sensors

# ipmi v1.5 locally
ipmitool sensor list

# ipmi v1.5 remote; global options go before the subcommand
ipmitool -I lan -H 10.80.32.29 -U 'ilo-viewer' -P '0123456789123456' sensor list

# ipmi v2.0; we just need USER permissions here
ipmitool -I lanplus -H 10.80.32.29 -L user -U 'ilo-viewer' -P '01234567890123456789' sensor list
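A password given with -P is visible in the process list and in the shell history. ipmitool can instead read it from the environment variable IPMI_PASSWORD (option -E) or from a file (option -f). The wrapper below is a sketch only: the names run, remote_sensors and DRY_RUN are invented for this example, and by default it merely prints the command instead of contacting a BMC.

```shell
#!/bin/sh
# DRY_RUN=1 (default here): print commands. DRY_RUN=0: execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = BMC host, $2 = user name.
# -E tells ipmitool to take the password from $IPMI_PASSWORD,
# so it never appears on the command line.
remote_sensors() {
    run ipmitool -I lanplus -H "$1" -L user -U "$2" -E sensor list
}

# For a real query: export IPMI_PASSWORD first and set DRY_RUN=0.
remote_sensors 10.80.32.29 ilo-viewer
```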

The result depends on the machine, of course. Example:

Sensor           | Value      | Unit       | Status | LowerNR   | LowerCT   | LowerNC   | UpperNC   | UpperCT   | UpperNR
1                | 2          | 3          | 4      | 5         | 6         | 7         | 8         | 9         | 10
CPU Temp         | 52.000     | degrees C  | ok     | 0.000     | 0.000     | 0.000     | 96.000    | 101.000   | 101.000
PCH Temp         | 44.000     | degrees C  | ok     | 0.000     | 5.000     | 16.000    | 90.000    | 95.000    | 100.000
System Temp      | 33.000     | degrees C  | ok     | -10.000   | -5.000    | 0.000     | 80.000    | 85.000    | 90.000
Peripheral Temp  | 40.000     | degrees C  | ok     | -10.000   | -5.000    | 0.000     | 80.000    | 85.000    | 90.000
VcpuVRM Temp     | 55.000     | degrees C  | ok     | -5.000    | 0.000     | 5.000     | 95.000    | 100.000   | 105.000
VmemABVRM Temp   | 41.000     | degrees C  | ok     | -5.000    | 0.000     | 5.000     | 95.000    | 100.000   | 105.000
VmemCDVRM Temp   | 35.000     | degrees C  | ok     | -5.000    | 0.000     | 5.000     | 95.000    | 100.000   | 105.000
DIMMA1 Temp      | 39.000     | degrees C  | ok     | -5.000    | 0.000     | 5.000     | 80.000    | 85.000    | 90.000
DIMMA2 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
DIMMB1 Temp      | 41.000     | degrees C  | ok     | -5.000    | 0.000     | 5.000     | 80.000    | 85.000    | 90.000
DIMMB2 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
DIMMC1 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
DIMMC2 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
DIMMD1 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
DIMMD2 Temp      | na         |            | na     | na        | na        | na        | na        | na        | na
FAN1             | 6000.000   | RPM        | ok     | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN2             | 5400.000   | RPM        | ok     | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN3             | 5800.000   | RPM        | ok     | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN4             | 5700.000   | RPM        | ok     | 300.000   | 500.000   | 700.000   | 25300.000 | 25400.000 | 25500.000
FAN5             | na         |            | na     | na        | na        | na        | na        | na        | na
FANA             | na         |            | na     | na        | na        | na        | na        | na        | na
12V              | 12.189     | Volts      | ok     | 10.173    | 10.299    | 10.740    | 12.945    | 13.260    | 13.386
5VCC             | 5.026      | Volts      | ok     | 4.246     | 4.298     | 4.480     | 5.390     | 5.546     | 5.598
3.3VCC           | 3.384      | Volts      | ok     | 2.789     | 2.823     | 2.959     | 3.554     | 3.656     | 3.690
VBAT             | 3.075      | Volts      | ok     | 2.375     | 2.487     | 2.599     | 3.775     | 3.887     | 3.999
Vcpu             | 1.836      | Volts      | ok     | 1.242     | 1.260     | 1.395     | 1.899     | 2.088     | 2.106
VDIMMAB          | 1.218      | Volts      | ok     | 0.948     | 0.975     | 1.047     | 1.344     | 1.425     | 1.443
VDIMMCD          | 1.218      | Volts      | ok     | 0.948     | 0.975     | 1.047     | 1.344     | 1.425     | 1.443
5VSB             | 4.974      | Volts      | ok     | 4.246     | 4.298     | 4.480     | 5.390     | 5.546     | 5.598
3.3VSB           | 3.333      | Volts      | ok     | 2.789     | 2.823     | 2.959     | 3.554     | 3.656     | 3.690
1.5V PCH         | 1.536      | Volts      | ok     | 1.320     | 1.347     | 1.401     | 1.644     | 1.671     | 1.698
1.2V BMC         | 1.227      | Volts      | ok     | 1.020     | 1.047     | 1.092     | 1.344     | 1.371     | 1.398
1.05V PCH        | 1.059      | Volts      | ok     | 0.870     | 0.897     | 0.942     | 1.194     | 1.221     | 1.248
Chassis Intru    | 0x0        | discrete   | 0x0000 | na        | na        | na        | na        | na        | na
PS1 Status       | 0x1        | discrete   | 0x0100 | na        | na        | na        | na        | na        | na
PS2 Status       | 0x1        | discrete   | 0x0100 | na        | na        | na        | na        | na        | na
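On the command line, ipmitool prints these ten columns separated by "|", which makes the output easy to post-process. As an illustration, the following sketch computes how far each reading is below its upper non-critical threshold (column 8, UpperNC). The function name sensor_headroom is hypothetical; sensors without a numeric reading or threshold (na, discrete) are skipped.

```shell
#!/bin/sh
# Read "ipmitool sensor list" output on stdin and print
# "<sensor>: <UpperNC minus current value>" for numeric sensors.
sensor_headroom() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            name = trim($1); value = trim($2); unc = trim($8)
            # Skip "na", hex values of discrete sensors, and short lines.
            if (value !~ /^-?[0-9]+(\.[0-9]+)?$/) next
            if (unc   !~ /^-?[0-9]+(\.[0-9]+)?$/) next
            printf "%s: %.3f\n", name, unc - value
        }'
}

# Example with two rows of the table above:
sensor_headroom <<'EOF'
CPU Temp         | 52.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 96.000    | 101.000   | 101.000
FAN5             | na         |            | na    | na        | na        | na        | na        | na        | na
EOF
```

Against a live system the call would be `ipmitool sensor list | sensor_headroom`.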

Sensor Data Repository

ipmitool sdr list all

The result depends on the machine, of course. Example:

CPU Temp         | 51 degrees C      | ok
PCH Temp         | 44 degrees C      | ok
System Temp      | 33 degrees C      | ok
Peripheral Temp  | 40 degrees C      | ok
VcpuVRM Temp     | 55 degrees C      | ok
VmemABVRM Temp   | 41 degrees C      | ok
VmemCDVRM Temp   | 35 degrees C      | ok
DIMMA1 Temp      | 39 degrees C      | ok
DIMMA2 Temp      | no reading        | ns
DIMMB1 Temp      | 41 degrees C      | ok
DIMMB2 Temp      | no reading        | ns
DIMMC1 Temp      | no reading        | ns
DIMMC2 Temp      | no reading        | ns
DIMMD1 Temp      | no reading        | ns
DIMMD2 Temp      | no reading        | ns
FAN1             | 6000 RPM          | ok
FAN2             | 5400 RPM          | ok
FAN3             | 5800 RPM          | ok
FAN4             | 5700 RPM          | ok
FAN5             | no reading        | ns
FANA             | no reading        | ns
12V              | 12.13 Volts       | ok
5VCC             | 5.03 Volts        | ok
3.3VCC           | 3.38 Volts        | ok
VBAT             | 3.08 Volts        | ok
Vcpu             | 1.84 Volts        | ok
VDIMMAB          | 1.22 Volts        | ok
VDIMMCD          | 1.22 Volts        | ok
5VSB             | 4.97 Volts        | ok
3.3VSB           | 3.33 Volts        | ok
1.5V PCH         | 1.54 Volts        | ok
1.2V BMC         | 1.23 Volts        | ok
1.05V PCH        | 1.06 Volts        | ok
Chassis Intru    | 0x00              | ok
BMC FRU          | Log FRU @00h 00.0 | ok
ATEN BMC         | Dynamic MC @ 20h  | ok
NM exception     | Event-Only        | ns
NM health        | Event-Only        | ns
NM op cap        | Event-Only        | ns
NM alert         | Event-Only        | ns
PS1 Status       | 0x01              | ok
PS2 Status       | 0x01              | ok
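The three-column SDR output can be filtered in the same way. The sketch below lists all sensors that deliver no reading, for example unpopulated DIMM slots or unconnected fan headers. The function name sdr_no_reading is invented for this example.

```shell
#!/bin/sh
# Read "ipmitool sdr list" output on stdin and print the names of
# all sensors whose second column is "no reading".
sdr_no_reading() {
    awk -F'|' '
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        trim($2) == "no reading" { print trim($1) }'
}

# Example with three lines of the output above:
sdr_no_reading <<'EOF'
DIMMA1 Temp      | 39 degrees C      | ok
DIMMA2 Temp      | no reading        | ns
FAN5             | no reading        | ns
EOF
```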

System Event Log (SEL)

Caution: times are given in UTC rather than local time (in the example, the event did not occur at 07:06 but at 08:06 local time, during winter time).

ipmitool sel list

Example:

 1 | 03/12/2019 | 13:10:45 | Physical Security #0xaa | General Chassis intrusion () | Asserted
 2 | 03/13/2019 | 11:15:43 | Physical Security #0xaa | General Chassis intrusion () | Asserted
 3 | 03/13/2019 | 15:15:34 | Physical Security #0xaa | General Chassis intrusion () | Asserted
 4 | 03/16/2019 | 07:06:24 | Physical Security #0xaa | General Chassis intrusion () | Asserted
 5 | 03/16/2019 | 12:10:02 | OS Boot | Installation started () | Asserted
 6 | 03/16/2019 | 12:16:11 | OS Boot | Installation completed () | Asserted
 7 | 04/02/2019 | 14:46:19 | OS Boot | Installation started () | Asserted
 8 | 04/02/2019 | 14:52:04 | OS Boot | Installation completed () | Asserted
 9 | 08/17/2019 | 02:31:44 | Power Supply #0xc8 | Failure detected () | Asserted
 a | 08/17/2019 | 02:31:57 | Power Supply #0xc8 | Failure detected () | Deasserted
 b | 08/17/2019 | 02:33:24 | Power Supply #0xc8 | Failure detected () | Asserted
 c | 08/17/2019 | 02:33:33 | Power Supply #0xc8 | Failure detected () | Deasserted
 d | 08/26/2019 | 13:57:02 | Physical Security #0xaa | General Chassis intrusion () | Deasserted
 e | 09/03/2019 | 21:08:34 | Unknown #0xff |  | Asserted
 f | 09/03/2019 | 21:09:53 | Unknown #0xff |  | Asserted
10 | 09/03/2019 | 21:11:20 | Unknown #0xff |  | Asserted
11 | 09/03/2019 | 21:19:24 | Unknown #0xff |  | Asserted
12 | 09/03/2019 | 21:23:27 | Unknown #0xff |  | Asserted
13 | 09/03/2019 | 21:24:49 | Unknown #0xff |  | Asserted
14 | 09/03/2019 | 21:59:00 | Unknown #0xff |  | Asserted
15 | 09/10/2019 | 20:45:43 | OS Boot | Installation started () | Asserted
16 | 09/10/2019 | 21:00:01 | OS Boot | Installation started () | Asserted
17 | 09/10/2019 | 21:11:27 | OS Boot | Installation completed () | Asserted
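The conversion from UTC to local time can be left to date. This is a sketch under two assumptions: GNU date is available (the -d option is not portable), and the local zone is Europe/Berlin, which is a guess based on the one-hour winter offset mentioned above. The function name sel_local_time is hypothetical.

```shell
#!/bin/sh
# Convert the date and time columns of a SEL line to another time zone.
# $1 = date as MM/DD/YYYY, $2 = time as HH:MM:SS (both taken as UTC),
# $3 = target time zone
sel_local_time() {
    month=${1%%/*}      # text before the first slash
    rest=${1#*/}        # text after the first slash: DD/YYYY
    day=${rest%%/*}
    year=${rest#*/}
    TZ="$3" date -d "${year}-${month}-${day} ${2} UTC" '+%Y-%m-%d %H:%M:%S'
}

# Entry 4 of the example, which the note above places at 08:06 local time:
sel_local_time 03/16/2019 07:06:24 Europe/Berlin
```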

With the sensor name instead of the hardware address:

ipmitool sel elist

Example:

 1 | 03/12/2019 | 13:10:45 | Physical Security Chassis Intru | General Chassis intrusion () | Asserted
 2 | 03/13/2019 | 11:15:43 | Physical Security Chassis Intru | General Chassis intrusion () | Asserted
 3 | 03/13/2019 | 15:15:34 | Physical Security Chassis Intru | General Chassis intrusion () | Asserted
 4 | 03/16/2019 | 07:06:24 | Physical Security Chassis Intru | General Chassis intrusion () | Asserted
 5 | 03/16/2019 | 12:10:02 | OS Boot | Installation started () | Asserted
 6 | 03/16/2019 | 12:16:11 | OS Boot | Installation completed () | Asserted
 7 | 04/02/2019 | 14:46:19 | OS Boot | Installation started () | Asserted
 8 | 04/02/2019 | 14:52:04 | OS Boot | Installation completed () | Asserted
 9 | 08/17/2019 | 02:31:44 | Power Supply PS1 Status | Failure detected () | Asserted
 a | 08/17/2019 | 02:31:57 | Power Supply PS1 Status | Failure detected () | Deasserted
 b | 08/17/2019 | 02:33:24 | Power Supply PS1 Status | Failure detected () | Asserted
 c | 08/17/2019 | 02:33:33 | Power Supply PS1 Status | Failure detected () | Deasserted
 d | 08/26/2019 | 13:57:02 | Physical Security Chassis Intru | General Chassis intrusion () | Deasserted
 e | 09/03/2019 | 21:08:34 | Unknown #0xff |  | Asserted
 f | 09/03/2019 | 21:09:53 | Unknown #0xff |  | Asserted
10 | 09/03/2019 | 21:11:20 | Unknown #0xff |  | Asserted
11 | 09/03/2019 | 21:19:24 | Unknown #0xff |  | Asserted
12 | 09/03/2019 | 21:23:27 | Unknown #0xff |  | Asserted
13 | 09/03/2019 | 21:24:49 | Unknown #0xff |  | Asserted
14 | 09/03/2019 | 21:59:00 | Unknown #0xff |  | Asserted
15 | 09/10/2019 | 20:45:43 | OS Boot | Installation started () | Asserted
16 | 09/10/2019 | 21:00:01 | OS Boot | Installation started () | Asserted
17 | 09/10/2019 | 21:11:27 | OS Boot | Installation completed () | Asserted

Meta information:

ipmitool sel info

Example:

SEL Information
Version          : 1.5 (v1.5, v2 compliant)
Entries          : 2
Free Space       : 10200 bytes
Percent Used     : 0%
Last Add Time    : 02/17/2019 10:29:31
Last Del Time    : 04/09/2018 08:10:06
Overflow         : false
Supported Cmds   : 'Reserve' 'Get Alloc Info'
# of Alloc Units : 512
Alloc Unit Size  : 20
# Free Units     : 510
Largest Free Blk : 510
Max Record Size  : 20
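The "Percent Used" line lends itself to a simple fill-level check, for instance from a cron job. The following is a sketch only; the function names and the default limit of 80 percent are arbitrary choices for this example.

```shell
#!/bin/sh
# Print the "Percent Used" value of "ipmitool sel info" (stdin) as a number.
sel_percent_used() {
    awk -F':' '/^Percent Used/ {
        gsub(/[ \t%]/, "", $2)   # strip blanks and the percent sign
        print $2
        exit
    }'
}

# Succeed if the SEL is at least $1 percent full (default: 80).
# Returns 2 if the input contains no usable number.
sel_nearly_full() {
    used=$(sel_percent_used)
    case "$used" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$used" -ge "${1:-80}" ]
}

# Example with the value from the output above:
printf 'Percent Used     : 0%%\n' | sel_nearly_full 80 || echo "SEL has room"
```

Against a live system the call would be `ipmitool sel info | sel_nearly_full 80`.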

Clear the SEL:

ipmitool sel clear
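Clearing is irreversible, so it can make sense to archive the log first; ipmitool offers "sel save" for that. The wrapper below is a sketch with invented names (run, sel_archive_and_clear, DRY_RUN) and a made-up default file name. By default it only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (default here): print commands. DRY_RUN=0: execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Save the SEL to a file ($1, default: dated file in the current
# directory) and clear it only if saving succeeded.
sel_archive_and_clear() {
    out=${1:-sel-$(date +%Y%m%d).txt}
    run ipmitool sel save "$out" && run ipmitool sel clear
}

sel_archive_and_clear /tmp/sel-backup.txt
```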

Troubleshooting

Could not open device at /dev/ipmi0 or /dev/ipmi/0 or /dev/ipmidev/0: No such file or directory

The kernel modules may be missing; load them:

modprobe ipmi_devintf
modprobe ipmi_si

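To load them at every boot, add them to /etc/modules (Debian-based systems) or to a .conf file under /etc/modules-load.d/ (systemd-based systems, including the RHEL family on which yum is used).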
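Whether the two modules are currently loaded can be read from /proc/modules. The check below is a sketch; the function name is hypothetical, and drivers compiled into the kernel do not show up in /proc/modules, so a "missing" result is only a hint.

```shell
#!/bin/sh
# Print each of the two IPMI modules that is not listed in a
# /proc/modules-style file. $1 = file to inspect (default: /proc/modules).
missing_ipmi_modules() {
    modules_file=${1:-/proc/modules}
    for mod in ipmi_devintf ipmi_si; do
        # The first column of /proc/modules is the module name.
        if ! awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$modules_file"; then
            echo "$mod"
        fi
    done
}

for missing in $(missing_ipmi_modules); do
    echo "not loaded: $missing (try: modprobe $missing)"
done
```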

Built on 2024-07-16