Check dmesg¶
Overview¶
Checks the kernel ring buffer (dmesg) for messages at severity levels emerg, alert, crit, and err. Known false positives and hardware-specific noise are filtered out by default. To clear reported messages after resolving the underlying issue, run „dmesg –clear“. Requires root or sudo.
Important Notes:
The reported timestamps may be inaccurate. The time source used for dmesg is not updated after system SUSPEND/RESUME. Timestamps are adjusted according to the current delta between boottime and monotonic clocks, which only works for messages printed after the last resume
The kernel ring buffer is a fixed-size circular buffer. Over time, newer messages overwrite older ones, so errors that have been resolved and whose messages have been overwritten will no longer be reported
Data Collection:
Executes
dmesg --level=emerg,alert,crit,err --ctimeto read the kernel ring bufferKnown false positives are filtered out by default, including common harmless messages such as „Assuming drive cache: write through“, „ioctl error in smb2_get_dfs_refer rc=-5“, and various KVM/EFI/SMBus messages
Additional messages can be excluded using the
--ignoreparameter (case-sensitive, repeatable)If more than 10 error lines are found, the output is shortened to the first 5 and last 5 lines
Fact Sheet¶
Fact |
Value |
|---|---|
Check Plugin Download |
https://github.com/Linuxfabrik/monitoring-plugins/tree/main/check-plugins/dmesg |
Nagios/Icinga Check Name |
|
Check Interval Recommendation |
Every minute |
Can be called without parameters |
Yes |
Runs on |
Linux |
Compiled for Windows |
No |
Help¶
usage: dmesg [-h] [-V] [--always-ok] [--ignore IGNORE]
[--severity {warn,crit}] [--test TEST]
Checks the kernel ring buffer (dmesg) for messages at severity levels emerg,
alert, crit, and err. Known false positives and hardware-specific noise are
filtered out by default. To clear reported messages after resolving the
underlying issue, run "dmesg --clear". Requires root or sudo.
options:
-h, --help show this help message and exit
-V, --version show program's version number and exit
--always-ok Always returns OK.
--ignore IGNORE Ignore a kernel message (case-sensitive, repeating).
Default: [' Asking for cache data failed', ' Assuming
drive cache: write through', ' brcmfmac:
brcmf_c_preinit_dcmds: Firmware: BCM4345/6', '
brcmfmac: brcmf_fw_alloc_request: using
brcm/brcmfmac43455-sdio for chip BCM4345/6', ' CIFS
VFS: Free previous auth_key.response = ', ' cpufreq:
__cpufreq_add_dev: ->get() failed', ' EFI MOKvar
config table is not in EFI runtime memory', ' ERST:
Failed to get Error Log Address Range.', ' flip_done
timed out', ' i8042: No controller found', ' Ignoring
unsafe software power cap!', ' integrity: Problem
loading X.509 certificate -126', ' ioctl error in
smb2_get_dfs_refer rc=-5', ' kvm_set_msr_common:
MSR_IA32_DEBUGCTLMSR ', ' mokvar: EFI MOKvar config
table is not in EFI runtime memory', ' No Caching mode
page found', ' SMBus base address uninitialized -
upgrade BIOS or use ', ' SMBus Host Controller not
enabled!', ' tsc: Fast TSC calibration failed', '
unhandled rdmsr: ', ' unhandled wrmsr: ', ' vcpu0
disabled perfctr wrmsr', ' Warning: Deprecated Driver
is detected', ' Warning: Unmaintained driver is
detected']
--severity {warn,crit}
Severity for alerting. Default: crit
--test TEST For unit tests. Needs "path-to-stdout-file,path-to-
stderr-file,expected-retc".
Usage Examples¶
./dmesg --ignore ' unhandled wrmsr: ' --severity crit
Output:
5 errors in Kernel Ring Buffer.
[Mon May 31 18:27:14 2021] x86/cpu: SGX disabled by BIOS
[Sat Jun 5 18:49:50 2021] ACPI Error: Thread 2495397888 cannot release Mutex [ECMX] acquired by thread 1817575424 (20210105/exmutex-378)
[Sat Jun 5 18:49:50 2021] ACPI Error: Aborting method \_SB.PCI0.LPCB.ECDV._Q66 due to previous error (AE_AML_NOT_OWNER) (20210105/psparse-529)
[Tue Jun 8 18:54:41 2021] usb usb2-port1: Cannot enable. Maybe the USB cable is bad?
[Tue Jun 8 18:54:41 2021] usb usb2-port1: unable to enumerate USB device
States¶
OK if no emerg, alert, crit, or err messages are found in the kernel ring buffer (after filtering).
CRIT (or the severity given by
--severity) if any such messages are found.--always-oksuppresses all alerts and always returns OK.
Perfdata / Metrics¶
There is no perfdata.
Credits, License¶
Authors: Linuxfabrik GmbH, Zurich
License: The Unlicense, see LICENSE file.