Robin Holt wrote: [Tue Mar 28 2006, 01:43:16PM EST]
> Recently, we ran a large system out of memory and the oom_kill() appeared
> to have frozen up. When we looked at the backtraces, we noticed the cpu
> was making progress, but apparently not fast progress. As a simple test,
> I did a 'echo m >/proc/sysrq-trigger' and that had not completed in more
> than a half-hour.
>
> The system was a fully populated 512 node SGI machine. The way that
> memory is physically laid out results in a single pgdat which covers
> the node with two holes in it. This is new hardware with larger gaps
> between the chunks of memory than earlier versions had. As show_mem()
> is traversing the entire system's memory to print out stats on remaining
> memory, it takes faults while trying to look at holes in the array of
> struct pages.
>
> At this point, I am looking for any sort of direction on what would be
> a reasonable fix. Should show_mem() be made to skip to a page aligned
> point in the array when the fault fails? Should we add the information
> about start and end of hole to the pgdat()? Should we have one pgdat
> per chunk? Are there other better ideas out there? Any direction would
> be greatly appreciated.

This could work, but you need to be cautious because the size of
struct page on ia64 isn't a power of 2, so entries in the array do not
line up with page boundaries. Also, this would have to be done
conditionally, because SPARSEMEM doesn't require it but of course
VIRTUAL_MEM_MAP does.

> Thanks,
> Robin Holt

You're welcome,
bob

Received on Wed Mar 29 06:23:55 2006
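
To illustrate the caution about struct page not being a power of 2, here is a rough user-space sketch of the "skip to a page aligned point" arithmetic. It is not kernel code and not a proposed patch. The function name next_index_after_hole() and both constants are hypothetical, chosen only for illustration (a 16 KB page and a 56 byte entry are assumptions). Because the entry size does not divide the page size, the next usable index has to be rounded up, otherwise the walk would land on an entry that still straddles the unmapped page.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; real ones depend on the kernel config. */
#define SKETCH_PAGE_SIZE   16384u  /* hypothetical page size backing the array */
#define SKETCH_ENTRY_SIZE  56u     /* hypothetical sizeof(struct page), not a power of 2 */

/*
 * base:       address of entry 0 of the virtually mapped array
 * fault_addr: address inside the array whose access faulted (a hole)
 *
 * Returns the index of the first entry that starts at or after the page
 * boundary following fault_addr. An entry that begins inside the faulting
 * page and spills into the next one is skipped, since part of it is unmapped.
 */
static size_t next_index_after_hole(uintptr_t base, uintptr_t fault_addr)
{
    /* Start of the page that follows the faulting page. */
    uintptr_t next_page = (fault_addr / SKETCH_PAGE_SIZE + 1) * SKETCH_PAGE_SIZE;

    /* Byte offset of that boundary from the start of the array. */
    uintptr_t offset = next_page - base;

    /* Round up: entries do not line up with page boundaries. */
    return (size_t)((offset + SKETCH_ENTRY_SIZE - 1) / SKETCH_ENTRY_SIZE);
}
```

With these assumed sizes, a fault in the first page of an array based at address 0 resumes at index 293, not 292, because entry 292 starts at byte 16352 and runs past the 16384 boundary. The next page may itself be a hole, so a caller would still need to handle a fault at the returned index. Under SPARSEMEM none of this would apply, which is why the skip would need to be conditional on VIRTUAL_MEM_MAP.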