RE: 05e0caad3b7bd0d0fbeff980bca22f186241a501 breaks ia64 kdump

From: Mel Gorman <mel_at_csn.ul.ie>
Date: 2006-11-15 10:42:13
On Tue, 14 Nov 2006, Zou, Nanhai wrote:

>> -----Original Message-----
>> From: linux-ia64-owner@vger.kernel.org
>> [mailto:linux-ia64-owner@vger.kernel.org] On Behalf Of Mel Gorman
>> Sent: 10 November 2006 19:47
>> To: Zou, Nanhai
>> Cc: Horms; Andy Whitcroft; Linux-IA64; Bob Picco; Andrew Morton; Dave Hansen;
>> Andi Kleen; Benjamin Herrenschmidt; Paul Mackerras; Keith Mannthey; Luck, Tony;
>> KAMEZAWA Hiroyuki; Yasunori Goto; Khalid Aziz
>> Subject: RE: 05e0caad3b7bd0d0fbeff980bca22f186241a501 breaks ia64 kdump
>>
>> On Fri, 10 Nov 2006, Zou, Nanhai wrote:
>>
>>>> -----Original Message-----
>>>> From: linux-ia64-owner@vger.kernel.org
>>>> [mailto:linux-ia64-owner@vger.kernel.org] On Behalf Of Zou Nan hai
>>>> Sent: 3 November 2006 18:07
>>>> To: Mel Gorman
>>>> Cc: Horms; Andy Whitcroft; Linux-IA64; Bob Picco; Andrew Morton; Dave Hansen;
>>>> Andi Kleen; Benjamin Herrenschmidt; Paul Mackerras; Keith Mannthey; Luck, Tony;
>>>> KAMEZAWA Hiroyuki; Yasunori Goto; Khalid Aziz
>>>> Subject: RE: 05e0caad3b7bd0d0fbeff980bca22f186241a501 breaks ia64 kdump
>>>>
>>>> On Fri, 2006-11-03 at 17:27, Mel Gorman wrote:
>>>>> On Fri, 3 Nov 2006, Zou, Nanhai wrote:
>>>>>
>>>>>> Hi,
>>>>>> 	This patch should fix the issue.
>>>>>>
>>>>>
>>>>> It would appear to fix the issue for IA64, but you are papering over the
>>>>> issue that the map is reporting a one-page hole. On arches with truly
>>>>> adjacent regions that are getting merged, the regions will appear to
>>>>> overlap by one page. What can happen is something like this:
>>>>>
>>>>> PFN ranges for nodes
>>>>> Node 1: 0 -> 1000
>>>>> Node 0: 1000 -> 2000
>>>>>
>>>> Hi,
>>>>  But the patch that you and Andy are commenting on is not my patch; it was
>>>> in the previous thread.
>>>> My patch was in the attachment.
>>>>
>>>>  Sorry for using Outlook to send that patch as an attachment; my Linux box
>>>> was not accessible at the time I was posting the patch.
>>>>  I am posting the patch again, with the description copied from my previous mail.
>>>>
>>>> When the ia64 kernel is configured with the discontiguous memory model,
>>>> active pages are added through efi_memmap_walk(filter_rsvd_memory,
>>>> count_node_pages).
>>>> filter_rsvd_memory will filter out all regions in rsvd_regions, which include:
>>>> - boot param
>>>> - mem map
>>>> - initrd
>>>> - command line
>>>> - **** kernel code and data ***
>>>> - kernel map built from efi memmap
>>>> - crash kernel reserved region
>>>> So the kernel code and data are excluded even without kdump support;
>>>> checking /proc/iomem and the dmesg output for early_node_data can verify that.
>>>> But magically, the first kernel boots happily without any complaint.
>>>> I guess that is related to the initial values in memmap.
>>>>
>>>> This patch uses another filter to add active pages, excluding only the crash
>>>> kernel reserved region if CONFIG_KEXEC is on.
>>>>
>>>> Thanks
>>>> Zou Nan hai
>>>> --- a/arch/ia64/mm/discontig.c	2006-11-02 20:09:47.000000000 -0500
>>>> +++ b/arch/ia64/mm/discontig.c	2006-11-02 19:57:27.000000000 -0500
>>>> @@ -21,6 +21,7 @@
>>>>  #include <linux/acpi.h>
>>>>  #include <linux/efi.h>
>>>>  #include <linux/nodemask.h>
>>>> +#include <linux/kexec.h>
>>>>  #include <asm/pgalloc.h>
>>>>  #include <asm/tlb.h>
>>>>  #include <asm/meminit.h>
>>>> @@ -653,8 +654,6 @@ void call_pernode_memory(unsigned long s
>>>>  static __init int count_node_pages(unsigned long start, unsigned long len, int node)
>>>>  {
>>>>  	unsigned long end = start + len;
>>>> -
>>>> -	add_active_range(node, start >> PAGE_SHIFT, end >> PAGE_SHIFT);
>>>>  	mem_data[node].num_physpages += len >> PAGE_SHIFT;
>>>>  	if (start <= __pa(MAX_DMA_ADDRESS))
>>>>  		mem_data[node].num_dma_physpages +=
>>>> @@ -669,7 +668,31 @@ static __init int count_node_pages(unsig
>>>>
>>>>  	return 0;
>>>>  }
>>>> +static __init int add_active_range_wrapper(unsigned long start,
>>>> +		unsigned long len, int node)
>>>> +{
>>>> +	unsigned long end = start + len;
>>>> +	add_active_range(node, start >> PAGE_SHIFT, end >> PAGE_SHIFT);
>>>> +	return 0;
>>>> +}
>>>>
>>
>> The function name doesn't really tell the reader what it's meant to be
>> doing. Something like register_active_ranges() might be a bit better.
>>
> Ok.
>>>> +static int __init
>>>> +filter_pernode_memory (unsigned long start, unsigned long end, void *arg)
>>>> +{
>>>> +	void (*func)(unsigned long, unsigned long, int);
>>>> +	func = arg;
>>>> +
>>>> +#ifdef CONFIG_KEXEC
>>>> +	if (start > crashk_res.start && start < crashk_res.end)
>>>> +		start = max(start, crashk_res.end);
>>>> +	if (end > crashk_res.start && end < crashk_res.end)
>>>> +		end = min(end, crashk_res.start);
>>
>>
>> These two checks appear to deliberately avoid registering the kernel image
>> as an active range. Was that your intention? If so, will you not hit the
>> same problem with initmem?
>>
>  No, crashk_res.start ~ crashk_res.end is the hole reserved for the 2nd
> kernel.

Then it needs a comment to that effect. It's difficult to see what code is 
executed by the main kernel and what code is executed by the crash kernel.
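
As an illustration of the kind of comment being asked for, here is a minimal
user-space sketch of the clipping that the two quoted checks perform. The
struct region type and the clip_crash_region() helper are hypothetical
stand-ins invented for this example; they are not the kernel's crashk_res
definition or any function from the patch.

```c
/* Hypothetical stand-in for the kernel's crashk_res resource. */
struct region {
	unsigned long start;
	unsigned long end;
};

/*
 * Trim the range [*start, *end) so that it does not reach into the
 * area reserved for the second (crash) kernel. The first kernel does
 * not register this area; the second kernel is expected to handle it.
 *
 * This mirrors the two checks in the quoted patch: a boundary is moved
 * only when it falls strictly inside the reserved region.
 *
 * Returns 1 if a non-empty range remains after trimming, 0 otherwise.
 */
static int clip_crash_region(unsigned long *start, unsigned long *end,
			     const struct region *crash)
{
	/* Range begins inside the reservation: skip past its end. */
	if (*start > crash->start && *start < crash->end)
		*start = crash->end;

	/* Range ends inside the reservation: stop at its start. */
	if (*end > crash->start && *end < crash->end)
		*end = crash->start;

	return *start < *end;
}
```

With a reservation of 100 to 200, a range of 150 to 400 is trimmed to 200 to
400, and a range of 120 to 180 becomes empty. Under the same logic, a range
that starts before the reservation and ends after it (for example 50 to 300)
passes through unchanged, which may or may not be what is intended.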

> The kernel itself does not set up memmap for this area; the
> 2nd kernel will handle it.

Ok, where does that happen?

> As I have mentioned, this bug also exists even
> without the kdump patch. You will see that the first kernel's code and data are not
> covered by add_active_range if the DISCONTIGMEM model is chosen.
>

But is its initmem section covered?

> Thanks
> Zou Nan hai
>
>>>> +#endif
>>>> +	if (start < end)
>>>> +		call_pernode_memory(__pa(start), end - start, func);
>>>> +
>>>> +	return 0;
>>>> +}
>>>>  /**
>>>>   * paging_init - setup page tables
>>>>   *
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
Received on Wed Nov 15 10:43:15 2006