Re: [PATCH: 002/017]Memory hotplug for new nodes v.4.(change name old add_memory() to arch_add_memory())

From: KAMEZAWA Hiroyuki <kamezawa.hiroyu_at_jp.fujitsu.com>
Date: 2006-03-22 11:05:14
On Tue, 21 Mar 2006 10:00:12 -0800
Dave Hansen <haveblue@us.ibm.com> wrote:

> On Sat, 2006-03-18 at 10:26 +0900, KAMEZAWA Hiroyuki wrote:
> > If the *determine node* function is moved into the arch-specific parts,
> > memory hot-add needs more and more code to determine paddr -> nid in
> > arch-specific code. Then we have to add a new paddr->nid function even if the
> > new nid is passed by firmware. We *lose* the useful nid information from
> > firmware if add_memory() has just 2 args, (start, end).
> 
> What I'm saying is that I'd like add_memory() to be just that, for
> adding memory.
> 
> At some point in the process, you need to export the NUMA node layout to
> the rest of the system, to say which pages go in which node.  I'm just
> saying that you should do that _before_ add_memory().
> 

To do so, we would have to maintain a new pfn_to_nid() function.
We would have to maintain a new table/list, and would have to think of a name for it :).
And add_memory() would have to check, again, whether the node the memory belongs to exists or not.
I don't want these kinds of things.
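
For illustration only, the extra bookkeeping described above might look roughly like the user-space sketch below. Every name in it (range_table, register_range(), sketch_pfn_to_nid()) is hypothetical and is not taken from the kernel or from this patch set; it only shows that a separate range table has to be filled in before add_memory() and searched again afterwards.

```c
/* Hypothetical range table: one entry per hot-added physical range. */
struct range_nid {
    unsigned long start_pfn;
    unsigned long end_pfn;      /* exclusive */
    int nid;
};

#define MAX_RANGES 16
static struct range_nid range_table[MAX_RANGES];
static int nr_ranges;

/*
 * In this model the caller must register the range -> nid mapping
 * before it adds the memory. Returns 0 on success, -1 on error.
 */
static int register_range(unsigned long start_pfn, unsigned long end_pfn,
                          int nid)
{
    if (nr_ranges >= MAX_RANGES || start_pfn >= end_pfn)
        return -1;
    range_table[nr_ranges++] =
        (struct range_nid){ start_pfn, end_pfn, nid };
    return 0;
}

/*
 * Hypothetical lookup a 2-argument add_memory(start, end) would need in
 * order to rediscover the nid. Returns -1 if the pfn is not registered.
 */
static int sketch_pfn_to_nid(unsigned long pfn)
{
    for (int i = 0; i < nr_ranges; i++)
        if (pfn >= range_table[i].start_pfn &&
            pfn < range_table[i].end_pfn)
            return range_table[i].nid;
    return -1;
}
```

With a nid argument passed straight through from firmware, neither the table nor the lookup is needed.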

With the current kernel, we have to add a new *pgdat* for the node when adding a new node.
(If we don't, the kernel will panic().) And in the future we will have to allocate the
pgdat/zones on the local node. So adding a node before adding its memory is not good.
(The current code uses kmalloc() just to reduce complexity.)
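
A minimal user-space sketch of that ordering, assuming an add_memory() that receives the nid: the pgdat is created on demand when memory for a not-yet-known node arrives, and only then is the arch hook called. The struct and the sketch_* functions are stand-ins invented for this example, not the kernel's definitions, and calloc() plays the role that kmalloc() plays in the current code.

```c
#include <stdlib.h>

#define MAX_NODES 8

/* Stand-in for the kernel's pgdat; only the fields the sketch needs. */
struct sketch_pgdat {
    int nid;
    unsigned long start_pfn;
    unsigned long spanned_pages;
};

static struct sketch_pgdat *node_data[MAX_NODES];

/* Stand-in for the arch hook; here it only grows the node's span. */
static int sketch_arch_add_memory(int nid, unsigned long start_pfn,
                                  unsigned long nr_pages)
{
    struct sketch_pgdat *pgdat = node_data[nid];

    if (pgdat->spanned_pages == 0)
        pgdat->start_pfn = start_pfn;
    pgdat->spanned_pages += nr_pages;
    return 0;
}

/* Generic entry point: create the pgdat if the node is new, then add. */
static int sketch_add_memory(int nid, unsigned long start_pfn,
                             unsigned long nr_pages)
{
    int new_node = 0;
    int ret;

    if (nid < 0 || nid >= MAX_NODES || nr_pages == 0)
        return -1;

    if (!node_data[nid]) {
        struct sketch_pgdat *pgdat = calloc(1, sizeof(*pgdat));

        if (!pgdat)
            return -1;
        pgdat->nid = nid;
        node_data[nid] = pgdat;
        new_node = 1;
    }

    ret = sketch_arch_add_memory(nid, start_pfn, nr_pages);
    if (ret && new_node) {
        /* Roll back the node we just created. */
        free(node_data[nid]);
        node_data[nid] = NULL;
    }
    return ret;
}
```

For example, sketch_add_memory(1, 0x1000, 256) on an empty system creates the pgdat for node 1 and records 256 pages; a second call for the same nid reuses the existing pgdat.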

> add_memory() should support adding memory to more than one node.  If any
> hypervisor or hardware happens to have memory added in one contiguous
> chunk, it can not simply call add_memory().  _That_ firmware would be
> forced to do the NUMA parsing and figure out how many times to call
> add_memory().  
I don't think firmware adds memory for multiple nodes at once.
It's crazy.

> 
> Let me reiterate: the process of telling the system which pages are in
> which node should be separate from telling the system that there *are*
> currently pages there now.

Considering a "cpu only node", the "check and add new node" function can be separated out,
like add_memory_less_node(). (But the pgdat/zones etc. would then be allocated outside the node.)
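
One way to picture that split, as a rough user-space sketch only: a shared "check and add node" helper is used both by the memory-adding path and by a memoryless-node path. All names and the pgdat_home_nid field are assumptions made for this example. The field records which node's memory would hold the pgdat, so a cpu-only node ends up pointing at some other node.

```c
#define MAX_NODES 8

struct sketch_node {
    int online;
    unsigned long nr_pages;
    int pgdat_home_nid;     /* node whose memory holds this node's pgdat */
};

static struct sketch_node nodes[MAX_NODES];

/* First online node that has memory, or -1 if there is none. */
static int first_node_with_memory(void)
{
    for (int i = 0; i < MAX_NODES; i++)
        if (nodes[i].online && nodes[i].nr_pages > 0)
            return i;
    return -1;
}

/* Returns 1 if the node was created, 0 if it existed, -1 on error. */
static int sketch_check_and_add_node(int nid, int has_local_memory)
{
    int home;

    if (nid < 0 || nid >= MAX_NODES)
        return -1;
    if (nodes[nid].online)
        return 0;

    /* A cpu-only node must borrow another node's memory for its pgdat. */
    home = has_local_memory ? nid : first_node_with_memory();
    if (home < 0)
        return -1;

    nodes[nid].online = 1;
    nodes[nid].nr_pages = 0;
    nodes[nid].pgdat_home_nid = home;
    return 1;
}

/* Memoryless ("cpu only") path: bring the node up, add no pages. */
static int sketch_add_memory_less_node(int nid)
{
    return sketch_check_and_add_node(nid, 0) < 0 ? -1 : 0;
}

/* Memory path: same helper, then account the new pages. */
static int sketch_add_memory(int nid, unsigned long nr_pages)
{
    if (nr_pages == 0 || sketch_check_and_add_node(nid, 1) < 0)
        return -1;
    nodes[nid].nr_pages += nr_pages;
    return 0;
}
```

In this sketch, a node brought up through sketch_add_memory() keeps its pgdat local, while one brought up through sketch_add_memory_less_node() gets pgdat_home_nid set to a different node.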

Bye.
-- Kame

Received on Wed Mar 22 11:06:21 2006
