Check redfish-logservices¶
Overview¶
Checks the event log entries exposed under the LogServices of a Redfish-compatible server via the Redfish API and alerts based on the severity of the log entries. By default it reads the System Event Log (SEL).
Important Notes:
Tested on Dell iDRAC and the DMTF simulator.
A check usually completes within a few seconds, but a slow or retried request can take longer. The bundled Director basket allows a 60-second runtime timeout.
This check works over both HTTP and HTTPS. It uses GET requests only.
No additional Python Redfish modules need to be installed.
Data Collection:
Reads the service root to detect the vendor, then queries the Managers collection (or Systems on Supermicro) to locate the log service
Reads the SEL log entries and evaluates each entry’s severity
Uses HTTP Basic authentication if --username and --password are provided
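The walk described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the `fetch` callable, the `MOCK_TREE` dictionary, the rule for recognising the SEL, and the exact paths are assumptions modelled on the usual Redfish layout, where a collection lists its members under `Members` as `@odata.id` links.

```python
import base64


def basic_auth_header(username, password):
    """Build an HTTP Basic Authorization header from the credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def collect_sel_entries(fetch, collection="Managers"):
    """Walk service root -> collection -> LogServices -> SEL entries.

    `fetch(path)` is a hypothetical callable that returns the decoded JSON
    document for a Redfish path (in practice this would be an HTTP GET).
    Pass collection="Systems" for vendors that expose the log there.
    """
    root = fetch("/redfish/v1")
    collection_path = root[collection]["@odata.id"]
    entries = []
    for member in fetch(collection_path)["Members"]:
        resource = fetch(member["@odata.id"])
        log_services = fetch(resource["LogServices"]["@odata.id"])
        for service in log_services["Members"]:
            # Assumption: the SEL is the log service whose path ends in "SEL".
            if not service["@odata.id"].rstrip("/").endswith("SEL"):
                continue
            log = fetch(service["@odata.id"])
            entries.extend(fetch(log["Entries"]["@odata.id"])["Members"])
    return entries


# Tiny in-memory stand-in for a management controller, for illustration only.
BMC = "/redfish/v1/Managers/BMC"
MOCK_TREE = {
    "/redfish/v1": {"Managers": {"@odata.id": "/redfish/v1/Managers"}},
    "/redfish/v1/Managers": {"Members": [{"@odata.id": BMC}]},
    BMC: {"LogServices": {"@odata.id": BMC + "/LogServices"}},
    BMC + "/LogServices": {"Members": [{"@odata.id": BMC + "/LogServices/SEL"}]},
    BMC + "/LogServices/SEL": {
        "Entries": {"@odata.id": BMC + "/LogServices/SEL/Entries"}
    },
    BMC + "/LogServices/SEL/Entries": {
        "Members": [
            {
                "Created": "2012-03-07T14:44:00Z",
                "Message": "System May be Melting",
                "Severity": "Critical",
            }
        ]
    },
}

entries = collect_sel_entries(MOCK_TREE.__getitem__)
```

Passing the lookup function in as `fetch` keeps the walk independent of the transport, so the same logic can be exercised against canned responses.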
Fact Sheet¶
| Fact | Value |
|---|---|
| Check Plugin Download | https://github.com/Linuxfabrik/monitoring-plugins/tree/main/check-plugins/redfish-logservices |
| Nagios/Icinga Check Name | |
| Check Interval Recommendation | Every 5 minutes |
| Can be called without parameters | Yes |
| Runs on | Cross-platform |
| Compiled for Windows | No |
Help¶
usage: redfish-logservices [-h] [-V] [--always-ok]
[--cache-expire CACHE_EXPIRE] [--ignore IGNORE]
[--insecure] [--log-type {sel,mel,both}]
[--match MATCH] [--max-age MAX_AGE] [--no-proxy]
[--password PASSWORD] [--retries RETRIES]
[--test TEST] [--timeout TIMEOUT] [--url URL]
[--username USERNAME]
Checks the event log entries exposed under the LogServices of a Redfish-
compatible server via the Redfish API and alerts based on the severity of the
log entries. By default it reads the System Event Log (SEL); `--log-type`
selects the management controller log (MEL) or both. Entries can be filtered
by regular expression (--match, --ignore), and entries older than --max-age
days can be aged out so a long-since resolved event does not keep the check in
a non-OK state forever.
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
--always-ok Always returns OK.
--cache-expire CACHE_EXPIRE
The amount of time after which the credential/data
cache expires, in minutes. Default: 15
--ignore IGNORE Ignore SEL entries whose message matches this Python
regular expression. Case-sensitive by default; use
`(?i)` for case-insensitive matching. Can be specified
multiple times. Example: `--ignore="Log area
reset/cleared"`.
--insecure This option explicitly allows insecure SSL
connections.
--log-type {sel,mel,both}
Which log to read: `sel` (System Event Log, default),
`mel` (management controller event log) or `both`.
Default: sel
--match MATCH Only consider SEL entries whose message matches this
Python regular expression. Case-sensitive by default;
use `(?i)` for case-insensitive matching. Can be
specified multiple times. Example:
`--match="(?i)temperature"`.
--max-age MAX_AGE Age out SEL entries older than this many days: they
are no longer alerted on, only counted in the summary.
A controller keeps an entry until the log is cleared,
so a long-since resolved event would otherwise keep
the check in a non-OK state forever. Default: 0 (0
disables aging).
--no-proxy Do not use a proxy.
--password PASSWORD Redfish API password.
--retries RETRIES Number of extra attempts if a request to the Redfish
API fails, before the check gives up. Helps against an
occasionally slow or flaky management controller.
Default: 3
--test TEST For unit tests. Needs "path-to-stdout-file,path-to-
stderr-file,expected-retc".
--timeout TIMEOUT Network timeout in seconds. Default: 8 (seconds)
--url URL Redfish API URL. Default: https://localhost:5000
--username USERNAME Redfish API username.
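To illustrate how `--match`, `--ignore` and `--max-age` could interact, here is a minimal sketch. It is not the plugin's implementation: the function name `filter_entries`, the entry keys `Message` and `Created`, the assumption that `Created` is an ISO 8601 timestamp with a timezone, and the choice to OR multiple `--match` patterns together are all assumptions made for this example.

```python
import re
from datetime import datetime, timedelta, timezone


def filter_entries(entries, match=(), ignore=(), max_age=0, now=None):
    """Return (active, aged_out) lists of log entries.

    match / ignore are lists of Python regular expressions applied to the
    entry message with re.search (case-sensitive unless the pattern
    contains "(?i)"). max_age is in days; 0 disables aging.
    """
    now = now or datetime.now(timezone.utc)
    active, aged_out = [], []
    for entry in entries:
        message = entry.get("Message", "")
        # Keep only entries that match at least one --match pattern.
        if match and not any(re.search(p, message) for p in match):
            continue
        # Drop entries that match any --ignore pattern.
        if any(re.search(p, message) for p in ignore):
            continue
        if max_age > 0:
            created = datetime.fromisoformat(
                entry["Created"].replace("Z", "+00:00")
            )
            if now - created > timedelta(days=max_age):
                # Aged-out entries are counted, but no longer alerted on.
                aged_out.append(entry)
                continue
        active.append(entry)
    return active, aged_out
```

With this split, only `active` would feed the state calculation, while `aged_out` would only contribute a count to the summary line.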
Usage Examples¶
./redfish-logservices --url=https://bmc --username=redfish-monitoring --password='linuxfabrik'
Output:
Checked SEL on 1 member. There are critical errors.
/redfish/v1/Managers/BMC
* 2012-03-07T14:44:00Z: System May be Melting [CRITICAL]
States¶
OK if no log entry has a severity above OK.
WARN if a log entry has severity "Warning".
CRIT if a log entry has severity "Critical".
--always-ok suppresses all alerts and always returns OK.
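The rules above amount to taking the worst severity across all entries. A minimal sketch, assuming each entry is a dict with a `Severity` key and that unknown or missing severities count as OK (both are assumptions of this example, as are the names used); the numeric values follow the usual Nagios plugin return codes:

```python
STATE_OK, STATE_WARN, STATE_CRIT = 0, 1, 2

# Redfish severity string -> plugin state.
SEVERITY_TO_STATE = {
    "OK": STATE_OK,
    "Warning": STATE_WARN,
    "Critical": STATE_CRIT,
}


def overall_state(entries, always_ok=False):
    """Return the worst state found in the entries, or OK if always_ok is set."""
    if always_ok:
        return STATE_OK
    return max(
        (SEVERITY_TO_STATE.get(e.get("Severity"), STATE_OK) for e in entries),
        default=STATE_OK,
    )
```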
Perfdata / Metrics¶
This plugin does not provide any performance data.
For Maintainers¶
You don’t need a physical server with a real BMC (the management controller that serves the Redfish API, e.g. HPE iLO or Dell iDRAC) to develop or test this plugin. The official DMTF Redfish mockup server serves a static, read-only Redfish tree (including the manager log service) over plain HTTP, which is exactly what this GET-only plugin needs.
Run the mockup server and point the plugin at it, from the repository root:
podman run \
--detach --rm \
--name lfmp-redfish-mock \
--publish 5000:8000 \
docker.io/dmtf/redfish-mockup-server:latest
sleep 3
check-plugins/redfish-logservices/redfish-logservices --url=http://127.0.0.1:5000 --no-proxy
podman stop lfmp-redfish-mock
Use http://127.0.0.1:5000 rather than http://localhost:5000, because localhost may resolve to IPv6 (::1) while the published container port is bound to IPv4.
The fixtures under unit-test/stdout/ are the raw Redfish responses the plugin walks, one set per scenario named <scenario>-root (the service root), <scenario>-managers (the Managers collection) and <scenario>-sel (the log entries). To simulate an alert, copy a healthy set and add an entry with a Severity of Critical or Warning to the -sel file. The offline test suite is run with ./run from the unit-test directory.
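Simulating an alert as described above could be scripted along these lines. This is only a sketch under assumptions: it presumes the `-sel` fixture is a JSON document with a `Members` array of log entries, and the helper name `add_alert_entry` is made up for this example. It reuses the first existing entry as a template, so fields such as the entry ID are duplicated and may need adjusting by hand.

```python
import json


def add_alert_entry(sel_json, message, severity="Critical"):
    """Return the fixture text with one extra entry of the given severity."""
    doc = json.loads(sel_json)
    members = doc.setdefault("Members", [])
    # Copy an existing entry so the new one has the same set of fields.
    entry = dict(members[0]) if members else {}
    entry.update({"Message": message, "Severity": severity})
    members.append(entry)
    # Keep the member count consistent if the fixture carries one.
    if "Members@odata.count" in doc:
        doc["Members@odata.count"] = len(members)
    return json.dumps(doc, indent=4)
```

The result would be written to the `-sel` file of the copied scenario before running the test suite.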
Credits, License¶
Authors: Linuxfabrik GmbH, Zurich
License: The Unlicense, see LICENSE file.