Provided by: mcelog_100-1fakesync1_amd64

NAME

       mcelog - Decode kernel machine check log on x86 machines

SYNOPSIS

       mcelog [options] [device]
       mcelog [options] --daemon
       mcelog [options] --client
       mcelog [options] --ascii
       mcelog --version

DESCRIPTION

       x86 CPUs report errors that they detect as machine check events (MCEs). These can be
       data corruption detected in the CPU caches or in main memory by an integrated memory
       controller, data transfer errors on the front side bus or CPU interconnect, or other
       internal errors. Possible causes include cosmic radiation, unstable power supplies,
       cooling problems, broken hardware, or bad luck.

       Most errors can be corrected by the CPU's internal error correction mechanisms.
       Uncorrected errors cause machine check exceptions, which may panic the machine.

       When a corrected error happens, the x86 kernel writes a record describing the MCE into an
       internal ring buffer that is available through the /dev/mcelog device. mcelog retrieves
       errors from /dev/mcelog, decodes them into a human-readable format, and prints them on
       standard output or, optionally, into the system log.

       It can also take further actions, such as keeping statistics or triggering shell
       scripts on specific events.

       The normal operating modes for mcelog are running as a regular cron job (the traditional
       way, now deprecated), running as a trigger executed directly by the kernel, or running as
       a daemon with the --daemon option.

       When an uncorrected machine check error occurs that the kernel cannot recover from, the
       kernel will usually panic the system. If a warm reset follows the panic, mcelog should
       pick up the machine check errors after the reboot. This is not possible after a cold
       reset.

       In addition mcelog can be used on the command line to decode the kernel output for a fatal
       machine check panic in text format using the --ascii option. This  is  typically  used  to
       decode  the  panic console output of a fatal machine check, if the system was power cycled
       or mcelog didn't run immediately after reboot.

       When the panic triggers a kdump kexec crash kernel, the boot script of the crash kernel
       should log the machine checks to disk; otherwise they might be lost.

       Note that once mcelog has retrieved an error, the kernel no longer stores it (unlike
       dmesg(1)). The output should therefore always be saved somewhere, and mcelog should not
       be run in uncontrolled ways.
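
       A minimal sketch of a controlled one-shot collection run, such as a cron job might
       perform: the decoded records are appended to a file so that they survive after the
       kernel buffer has been drained. The helper name collect_mce_once and its log file
       argument are illustrative and not part of mcelog; --ignorenodev and --filter are
       described under OPTIONS.

```shell
# Read pending machine check records once and append the decoded text
# to a log file, so nothing is lost once the kernel has handed it over.
# collect_mce_once is a hypothetical helper name, not an mcelog command.
collect_mce_once() {
    logfile=${1:-/var/log/mcelog}
    # --ignorenodev: exit silently if /dev/mcelog cannot be opened
    # --filter:      drop known broken events (this is the default)
    mcelog --ignorenodev --filter >> "$logfile"
}
```

       Calling collect_mce_once without arguments appends to /var/log/mcelog; pass a
       different path as the first argument to write elsewhere.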

OPTIONS

       When the --syslog option is specified, output is redirected to the system log. The
       --syslog-error option causes normal machine checks to be logged as LOG_ERR (implies
       --syslog). Normally only fatal errors or high-level remarks are logged at error level.
       High-level one-line summaries of specific errors are also logged to the syslog by
       default, unless mcelog operates in --ascii mode.

       When the --logfile=file option is specified, log output is appended to the specified
       file. With the --no-syslog option mcelog will never log anything to the syslog.

       The --cpu=cputype option sets the CPU type to decode for to cputype. See mcelog --help
       for a list of valid CPUs. Note that specifying an incorrect CPU can lead to incorrect
       decoding output. The default is either the CPU of the machine that reported the machine
       check (needs a newer kernel version) or the CPU of the machine mcelog is running on, so
       normally this option does not have to be used. Older versions of mcelog had separate
       options for different CPU types. These are still implemented, but are now deprecated
       and undocumented.

       With the --dmi option mcelog will look up the addresses reported in machine checks in  the
       SMBIOS/DMI  tables  of  the  BIOS.   This  can  sometimes  tell  you  which DIMM or memory
       controller has developed a problem. More often the information reported  by  the  BIOS  is
       either  subtly  or  obviously wrong or useless.  This option requires that mcelog has read
       access to /dev/mem (normally requires root) and runs on  the  same  machine  in  the  same
       hardware configuration as when the machine check event happened.

       When --ignorenodev is specified, mcelog will exit silently when the device cannot be
       opened. This is useful in virtualized environments with limited devices.

       When --filter is specified mcelog will  filter  out  known  broken  machine  check  events
       (default on). When the --no-filter option is specified mcelog does not filter events.

       When  --raw  is  specified  mcelog  will not decode, but just dump the mcelog in a raw hex
       format. This can be useful for automatic post processing.

       When a device is specified the machine check logs are read  from  device  instead  of  the
       default /dev/mcelog.

       With the --ascii option mcelog decodes a fatal machine check panic generated by the kernel
       ("CPU n: Machine Check Exception ...") in ASCII from standard input and exits afterwards.
       Note that when the panic comes from a different machine than the one mcelog is running
       on, you might need to specify the correct cputype on older kernels. On newer kernels,
       which output the PROCESSOR field, this is no longer needed.

       When  the  --file  filename option is specified mcelog --ascii will read the ASCII machine
       check record from input file filename instead of standard input.
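
       As a rough illustration of the --ascii workflow, the sketch below decodes panic text
       that was saved to a file. The helper name decode_panic and its arguments are
       hypothetical; only the --cpu, --file and --ascii options come from this page. Other
       options are placed before --ascii, following the SYNOPSIS.

```shell
# Decode a saved machine check panic message.
# decode_panic is a hypothetical helper name.
#   $1  text file holding the panic console output
#   $2  optional CPU type for --cpu (see "mcelog --help" for valid names)
decode_panic() {
    panic_file=$1
    cputype=${2:-}
    if [ -n "$cputype" ]; then
        # Panic came from another machine running an older kernel.
        mcelog --cpu="$cputype" --file "$panic_file" --ascii
    else
        mcelog --file "$panic_file" --ascii
    fi
}
```

       Without --file the same text can be supplied on standard input instead, for example by
       redirecting the saved file into mcelog --ascii.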

       With the --config-file file option mcelog reads the specified config file. The default is
       /etc/mcelog/mcelog.conf. See also CONFIG FILE below.

       With  the  --daemon  option  mcelog  will  run  in  the background. This gives the fastest
       reaction  time  and  is   the   recommended   operating   mode.    This   option   implies
       --logfile=/var/log/mcelog.   Important  messages  will be logged as one-liner summaries to
       syslog unless --no-syslog is given.  The option  --foreground  will  prevent  mcelog  from
       giving up the terminal in daemon mode. This is intended for debugging.

       With the --client option mcelog will query a running daemon for accumulated errors.
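
       The daemon and client modes can be combined roughly as sketched below. The helper
       names start_mcelog_daemon and query_mcelog_daemon are hypothetical; the default paths
       are the ones listed under FILES, and the --pidfile option is described further down in
       this section.

```shell
# Start the mcelog daemon with an explicit pid file and log file.
# start_mcelog_daemon is a hypothetical helper name.
start_mcelog_daemon() {
    pidfile=${1:-/var/run/mcelog.pid}
    logfile=${2:-/var/log/mcelog}
    # Options come first, --daemon last, as in the SYNOPSIS.
    mcelog --pidfile "$pidfile" --logfile="$logfile" --daemon
}

# Ask the running daemon for the errors it has accumulated so far.
# query_mcelog_daemon is a hypothetical helper name.
query_mcelog_daemon() {
    mcelog --client
}
```

       Starting the daemon normally requires sufficient privileges to read /dev/mcelog and
       to write the chosen pid and log files.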

       With the --cpumhz=mhz option mcelog assumes the CPU runs at mhz MHz when decoding the
       time of the event from the CPU time stamp counter. This also forces decoding. Note that
       this can be unreliable on some systems with CPU frequency scaling or deep C states,
       where the CPU time stamp counter does not increase linearly. By default the frequency
       of the current CPU is used when mcelog determines that it is safe to use. Newer kernels
       report the time directly in the event and no longer need this.

       The --pidfile file option writes the process id of the daemon into file file.  Only  valid
       in daemon mode.

       Mcelog  will enable extended error reporting from the memory controller on processors that
       support it unless you tell it not to with the --no-imc-log option.  You  might  need  this
       option when decoding old logs from a system where this mode was not enabled.

       --version displays the version of mcelog and exits.

CONFIG FILE

       mcelog  supports  a  config file to set defaults. Command line options override the config
       file. By default the config file is read from  /etc/mcelog/mcelog.conf  unless  overridden
       with the --config-file option.

       The general format is

              optionname = value

       White space is currently not allowed in value, except at the end, where it is dropped.
       Comments start with #.

       All command line options that are not commands can be specified in the config file. For
       example, to enable the --no-syslog option use no-syslog = yes (or no to disable). When
       the option takes an argument, use the form logfile = /tmp/logfile.
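
       Put together, a small config file using only the two settings mentioned above might
       look like the following fragment. It is an illustrative sketch, not a recommended
       configuration, and the log path is just the example value from this page.

```
# Example mcelog.conf fragment (illustrative only)

# Same effect as passing --no-syslog on the command line
no-syslog = yes

# Same effect as passing --logfile=/tmp/logfile
logfile = /tmp/logfile
```

       Because command line options override the config file, a --logfile option given on
       the command line would take precedence over the logfile line here.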

NOTES

       The kernel prefers old messages over new ones. If the log buffer overflows, only the old
       ones are kept.

       The exact output in the log file depends on the CPU, unless the --raw option is used.

       mcelog will report serious errors to the syslog during decoding.

SIGNALS

       When  mcelog  runs  in daemon mode and receives a SIGUSR1 it will close and reopen the log
       files. This can be used to rotate logs without restarting the daemon.
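
       A minimal log rotation sketch built on this signal is shown below. The helper name
       rotate_mcelog_log is hypothetical, the default paths are those listed under FILES, and
       only a single old generation (logfile.1) is kept.

```shell
# Rotate the mcelog log file, then ask the daemon to reopen it.
# rotate_mcelog_log is a hypothetical helper name.
#   $1  log file (default /var/log/mcelog)
#   $2  pid file (default /var/run/mcelog.pid)
rotate_mcelog_log() {
    logfile=${1:-/var/log/mcelog}
    pidfile=${2:-/var/run/mcelog.pid}

    # Nothing to rotate if the log file does not exist yet.
    [ -f "$logfile" ] || return 0

    # Rename the current log; this overwrites any earlier logfile.1.
    mv "$logfile" "$logfile.1" || return 1

    # SIGUSR1 makes the daemon close and reopen its log files, so it
    # starts writing to a fresh file under the original name.
    if [ -r "$pidfile" ]; then
        kill -USR1 "$(cat "$pidfile")"
    fi
}
```

       If no pid file is readable the function only renames the log, which is harmless when
       the daemon is not running.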

FILES

       /dev/mcelog (char 10, minor 227)

       /etc/mcelog/mcelog.conf

       /var/log/mcelog

       /var/run/mcelog.pid

SEE ALSO

       AMD x86-64 architecture programmer's manual, Volume 2, System programming

       Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3, System Programming
       Guide, Parts 1 and 2. Machine checks are described in Chapter 14 in Part 1 and in
       Appendix E in Part 2.

       Datasheet of your CPU.

                                             May 2009                                   MCELOG(8)