Re: new utility for decoding salinfo records

From: Keith Owens <kaos_at_sgi.com>
Date: 2005-01-12 15:10:15

On Tue, 11 Jan 2005 07:46:28 -0800, 
Ben Woodard <woodard@redhat.com> wrote:
>Here is a new utility for looking into salinfo records. It does several
>things differently than salinfo_decode. We have found that this helps

The design of salinfo_decode2 is completely unacceptable for SGI
hardware, and probably for HP as well.  You have removed all processing
of the oemdata.

SGI hardware decodes the oemdata in SAL records using prom code.  This
decode _must_ be done while the record is still in the prom's memory
space.  The callback into the prom (via the kernel) must be done after
the main part of the record is printed and before the record is cleared
from SAL.  For some error types such as CPE, the SGI oemdata provides
critical information about which DIMM is failing, including its node
and serial number.

AFAIK HP decode their oemdata via a user space program.  Again this is
done after the main part of the record is printed.

To handle both SGI and HP requirements, the existing salinfo_decode
program calls the optional program salinfo_decode_oemdata.  That call
is made at the right point in the read/decode/clear cycle to satisfy
all vendor requirements.  Removing salinfo_decode_oemdata is not an
option.
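
As a rough illustration of the ordering constraint described above, here
is a minimal sketch in Python.  It is not the salinfo_decode source; the
function name and the three callbacks are hypothetical stand-ins for
"print the main part", "run the optional oemdata decoder" and "clear the
record from SAL".

```python
from typing import Callable, Optional


def process_record(
    record: bytes,
    print_main: Callable[[bytes], None],
    clear: Callable[[], None],
    decode_oemdata: Optional[Callable[[bytes], None]] = None,
) -> None:
    """Handle one record in the order: main part, oemdata, clear.

    All three callbacks are hypothetical; they only model the ordering.
    """
    # 1. Print the main (vendor independent) part of the record.
    print_main(record)
    # 2. Run the optional vendor decoder while the record still exists.
    if decode_oemdata is not None:
        decode_oemdata(record)
    # 3. Only now clear the record.
    clear()


if __name__ == "__main__":
    calls = []
    process_record(
        b"record",
        lambda r: calls.append("main"),
        lambda: calls.append("clear"),
        lambda r: calls.append("oemdata"),
    )
    print(calls)
```

The point of the sketch is only that the clear step comes last and the
vendor decode sits between printing and clearing, whether the decoder is
a prom callback or a user space program.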

The existing salinfo_decode program works fine, including decoding oem
data.  I agree that we need a summary tool to merge data from multiple
records together, but there are better ways of doing that.  We do not
need to remove the existing salinfo_decode functionality to get a
summary.

Leave salinfo_decode completely alone, especially the oem decoding.
To get a summary, add a new package that monitors the contents of
/var/log/salinfo/decoded, reads new records and summarizes the
contents.  I am quite happy to add a trigger (pipe or socket) from
salinfo_decode to the summary program to indicate when new records
arrive.
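
A separate summary program could find new records along these lines.
This is only a sketch, assuming each decoded record is a regular file in
the directory; the function name is made up for illustration, and a real
version would wait on the pipe or socket trigger instead of being called
in a polling loop.

```python
import os
from typing import List, Set


def new_records(directory: str, seen: Set[str]) -> List[str]:
    """Return paths of regular files not seen before and remember them.

    'seen' holds the file names already handed to the summarizer.
    """
    fresh = sorted(
        name
        for name in os.listdir(directory)
        if name not in seen and os.path.isfile(os.path.join(directory, name))
    )
    seen.update(fresh)
    return [os.path.join(directory, name) for name in fresh]
```

Calling new_records("/var/log/salinfo/decoded", seen) each time the
trigger fires would return only the records that arrived since the
previous call, leaving salinfo_decode itself untouched.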

Any summary program must be extensible so a vendor can report on data
that is extracted from their oemdata.

BTW, salinfo_decode2 will spin forever on any kernel older than
2.6.9-rc4, including all 2.4 kernels.  Once again, salinfo_decode 0.7
gets this right.

Received on Tue Jan 11 23:10:51 2005