numad: Set memory policy from numad advisory nodeset

Though numad manages the memory allocation of tasks dynamically,
it expects the management application (libvirt) to pre-set the memory
policy according to the advisory nodeset returned from querying numad
(just as libvirt pre-binds the CPU nodeset for the domain process), so
that performance can benefit more from it.

This patch introduces a new XML attribute 'placement' for <memory>.
The value 'auto' indicates that the memory policy is set with the
advisory nodeset from numad. The attribute defaults to the value of
<vcpu> placement, or to 'static' if 'nodeset' is specified. Example of
the new attribute's usage:

  <numatune>
    <memory placement='auto' mode='interleave'/>
  </numatune>

Like the existing "numatune" support, the 'auto' NUMA memory policy
is set through libnuma's API.

If <vcpu> "placement" is "auto", and <numatune> is not specified
explicitly, a default <numatune> will be added with "placement"
set to "auto" and "mode" set to "strict".

The following XML can now fully drive numad:

1) <vcpu> placement is 'auto', no <numatune> is specified.

   <vcpu placement='auto'>10</vcpu>

2) <vcpu> placement is 'auto', no 'placement' is specified for
   <numatune>.

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='interleave'/>
   </numatune>
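
As a sketch of how the defaulting rules above apply (assuming the
implicit values are filled in exactly as described in this message),
the two configurations are treated as if they had been written out as:

1) Implicit <numatune> added with 'auto' placement and 'strict' mode.

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='strict' placement='auto'/>
   </numatune>

2) 'placement' of <memory> taken from <vcpu>.

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='interleave' placement='auto'/>
   </numatune>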

It is also possible to control the CPU placement and memory policy
independently, e.g.:

1) <vcpu> placement is 'auto', and <numatune> placement is 'static'

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='strict' nodeset='0-10,^7'/>
   </numatune>

2) <vcpu> placement is 'static', and <numatune> placement is 'auto'

   <vcpu placement='static' cpuset='0-24,^12'>10</vcpu>
   <numatune>
     <memory mode='interleave' placement='auto'/>
   </numatune>
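
For comparison, a sketch of case 1) with the default written out,
assuming 'placement' falls back to 'static' because 'nodeset' is
given, as described earlier in this message:

   <vcpu placement='auto'>10</vcpu>
   <numatune>
     <memory mode='strict' placement='static' nodeset='0-10,^7'/>
   </numatune>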

A follow-up patch will change the XML formatting code to always output
'placement' for <vcpu>, even if it's 'static'.
Osier Yang
2012-05-09 00:04:34 +08:00
committed by Eric Blake
parent 8be304ecb9
commit 97010eb1f1
16 changed files with 404 additions and 94 deletions

@@ -362,14 +362,14 @@
 0.9.11 (QEMU and KVM only)</span>, the optional attribute
 <code>placement</code> can be used to indicate the CPU placement
 mode for domain process, its value can be either "static" or
-"auto", defaults to "static" if <code>cpuset</code> is specified,
-"auto" indicates the domain process will be pinned to the advisory
-nodeset from querying numad, and the value of attribute
-<code>cpuset</code> will be ignored if it's specified. If both
-<code>cpuset</code> and <code>placement</code> are not specified,
-or if <code>placement</code> is "static", but no <code>cpuset</code>
-is specified, the domain process will be pinned to all the
-available physical CPUs.
+"auto", defaults to <code>placement</code> of <code>numatune</code>,
+or "static" if <code>cpuset</code> is specified. "auto" indicates
+the domain process will be pinned to the advisory nodeset from querying
+numad, and the value of attribute <code>cpuset</code> will be ignored
+if it's specified. If both <code>cpuset</code> and <code>placement</code>
+are not specified, or if <code>placement</code> is "static", but no
+<code>cpuset</code> is specified, the domain process will be pinned to
+all the available physical CPUs.
 </dd>
 </dl>
@@ -578,11 +578,24 @@
 <dt><code>memory</code></dt>
 <dd>
 The optional <code>memory</code> element specifies how to allocate memory
-for the domain process on a NUMA host. It contains two attributes,
-attribute <code>mode</code> is either 'interleave', 'strict',
-or 'preferred',
-attribute <code>nodeset</code> specifies the NUMA nodes, it leads same
-syntax with attribute <code>cpuset</code> of element <code>vcpu</code>.
+for the domain process on a NUMA host. It contains several optional
+attributes. Attribute <code>mode</code> is either 'interleave',
+'strict', or 'preferred', defaults to 'strict'. Attribute
+<code>nodeset</code> specifies the NUMA nodes, using the same syntax as
+attribute <code>cpuset</code> of element <code>vcpu</code>. Attribute
+<code>placement</code> (<span class='since'>since 0.9.12</span>) can be
+used to indicate the memory placement mode for domain process, its value
+can be either "static" or "auto", defaults to <code>placement</code> of
+<code>vcpu</code>, or "static" if <code>nodeset</code> is specified.
+"auto" indicates the domain process will only allocate memory from the
+advisory nodeset returned from querying numad, and the value of attribute
+<code>nodeset</code> will be ignored if it's specified.
+If <code>placement</code> of <code>vcpu</code> is 'auto', and
+<code>numatune</code> is not specified, a default <code>numatune</code>
+with <code>placement</code> 'auto' and <code>mode</code> 'strict' will
+be added implicitly.
 <span class='since'>Since 0.9.3</span>
 </dd>
 </dl>

@@ -562,16 +562,32 @@
 <element name="numatune">
 <optional>
 <element name="memory">
-<attribute name="mode">
-<choice>
-<value>strict</value>
-<value>preferred</value>
-<value>interleave</value>
-</choice>
-</attribute>
-<attribute name="nodeset">
-<ref name="cpuset"/>
-</attribute>
+<optional>
+<attribute name="mode">
+<choice>
+<value>strict</value>
+<value>preferred</value>
+<value>interleave</value>
+</choice>
+</attribute>
+</optional>
+<choice>
+<group>
+<optional>
+<attribute name='placement'>
+<value>static</value>
+</attribute>
+</optional>
+<optional>
+<attribute name='nodeset'>
+<ref name='cpuset'/>
+</attribute>
+</optional>
+</group>
+<attribute name='placement'>
+<value>auto</value>
+</attribute>
+</choice>
 </element>
 </optional>
 </element>