[RFC] Override CPE & CMC thresholding defaults

From: Russ Anderson <rja_at_sgi.com>
Date: 2005-05-14 02:56:36
CPE (Corrected Platform Error) and CMC (Corrected Machine Check)
handling has thresholding to prevent a burst of errors (such as a
solid single-bit error) from overwhelming Linux with interrupt and
logging overhead.  When the threshold is exceeded, Linux switches
from interrupts to polling mode.  The values are currently hardcoded.

#define MAX_CPE_POLL_INTERVAL (15*60*HZ) /* 15 minutes */
#define MIN_CPE_POLL_INTERVAL (2*60*HZ)  /* 2 minutes */
#define CMC_POLL_INTERVAL     (1*60*HZ)  /* 1 minute */

Those values may be appropriate for some configurations, but
not for others.  For example, a CPE threshold of 5 may
be too low for a system with 4 terabytes of memory.  Given the
variety of system sizes, it may be difficult to agree on
hardcoded defaults that fit all systems.
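
To make the thresholding idea concrete, here is a minimal userspace
sketch of one way a burst detector could work.  It is not the code in
mca.c: the names (err_history, record_error, CPE_THRESHOLD) and the
BURST_WINDOW value are assumptions for illustration only.  Only the
threshold of 5 comes from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

#define CPE_THRESHOLD 5     /* hypothetical name; 5 is the default cited above */
#define BURST_WINDOW  1000  /* assumed burst window, in arbitrary ticks */

struct err_history {
	unsigned long stamp[CPE_THRESHOLD]; /* times of the most recent errors */
	size_t next;                        /* slot to overwrite next */
	size_t filled;                      /* number of valid slots */
	bool polling;                       /* true once the threshold was hit */
};

/*
 * Record one corrected error seen at time `now` (assumed monotonic).
 * Returns true if the detector is in polling mode afterwards.
 */
static bool record_error(struct err_history *h, unsigned long now)
{
	size_t i, recent = 0;

	h->stamp[h->next] = now;
	h->next = (h->next + 1) % CPE_THRESHOLD;
	if (h->filled < CPE_THRESHOLD)
		h->filled++;

	/* Count how many remembered errors fall inside the burst window. */
	for (i = 0; i < h->filled; i++)
		if (now - h->stamp[i] <= BURST_WINDOW)
			recent++;

	if (recent >= CPE_THRESHOLD)
		h->polling = true;
	return h->polling;
}
```

With a zero-initialized struct err_history, errors spaced further
apart than BURST_WINDOW never trip the detector, while five errors
inside one window do.  A large-memory system that sees that many
corrected errors in normal operation would want a larger threshold.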

One alternative would be /proc/sys/kernel/ entries, so
that the default values could be overridden.  For example,
/proc/sys/kernel/cpe_threshold would override the default
CPE threshold of 5.
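
The intended semantics (use the override if present, else fall back
to the built-in default) can be sketched from userspace as follows.
This is only an illustration: /proc/sys/kernel/cpe_threshold is the
entry proposed here and does not exist today, and read_threshold and
DEFAULT_CPE_THRESHOLD are hypothetical names.  A real implementation
would register the entry through the kernel's sysctl interface
instead.

```c
#include <stdio.h>

#define DEFAULT_CPE_THRESHOLD 5  /* hypothetical name for the current default */

/*
 * Return the positive integer stored in `path`, or `fallback` if the
 * file is missing, unreadable, or does not hold a positive integer.
 */
static int read_threshold(const char *path, int fallback)
{
	FILE *f = fopen(path, "r");
	int value;

	if (!f)
		return fallback;
	if (fscanf(f, "%d", &value) != 1 || value <= 0)
		value = fallback;
	fclose(f);
	return value;
}
```

A caller would use something like
read_threshold("/proc/sys/kernel/cpe_threshold", DEFAULT_CPE_THRESHOLD),
which returns 5 on any system where the proposed entry is absent.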

Would there be any objection to creating /proc/sys/kernel/
entries to override the hardcoded default values?

Russ Anderson, OS RAS/Partitioning Project Lead  
SGI - Silicon Graphics Inc          rja@sgi.com
Received on Fri May 13 12:57:16 2005