From e1b022890e0a14e57bf13ae9cbbcfa95e57ad7e4 Mon Sep 17 00:00:00 2001
From: Michal Privoznik
Date: Mon, 3 Jun 2019 10:46:18 +0200
Subject: [PATCH] schemas: Introduce disk type NVMe
There is this class of PCI devices that act like disks: NVMe.
Therefore, they are both PCI devices and disks. While we already
have <hostdev/> (and can assign an NVMe device to a domain
successfully) we don't have a disk representation. There are three
problems with PCI assignment in case of an NVMe device:

1) domains with <hostdev/> can't be migrated

2) the NVMe device is assigned whole, there's no way to assign only a
   namespace

3) Because hypervisors see <hostdev/> they don't put a block layer
   on top of it - users don't get all the fancy features like
   snapshots
NVMe namespaces are a way of splitting one continuous NVDIMM memory
into smaller ones, effectively creating smaller NVMe-s (which can
then be partitioned, LVMed, etc.)
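Because of all of this the following XML was chosen to model an
NVMe device:

  <disk type='nvme' device='disk'>
    <driver name='qemu' type='raw'/>
    <source type='pci' managed='yes' namespace='1'>
      <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </source>
    <target dev='vde' bus='virtio'/>
  </disk>
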
Signed-off-by: Michal Privoznik
Reviewed-by: Cole Robinson
---
docs/formatdomain.html.in | 53 ++++++++++++++++++++++-
docs/schemas/domaincommon.rng | 32 ++++++++++++++
tests/qemuxml2argvdata/disk-nvme.xml | 63 ++++++++++++++++++++++++++++
3 files changed, 147 insertions(+), 1 deletion(-)
create mode 100644 tests/qemuxml2argvdata/disk-nvme.xml
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index bfcdc026e6..e06cf2061b 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -2948,6 +2948,13 @@
</backingStore>
<target dev='vdd' bus='virtio'/>
</disk>
+ <disk type='nvme' device='disk'>
+ <driver name='qemu' type='raw'/>
+ <source type='pci' managed='yes' namespace='1'>
+ <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
+ </source>
+ <target dev='vde' bus='virtio'/>
+ </disk>
</devices>
...
@@ -2961,7 +2968,8 @@
Valid values are "file", "block",
"dir" (since 0.7.5),
"network" (since 0.8.7), or
- "volume" (since 1.0.5)
+ "volume" (since 1.0.5), or
+ "nvme" (since 6.0.0)
and refer to the underlying source for the disk.
Since 0.0.3
@@ -3144,6 +3152,43 @@
Since 1.0.5
+ nvme
+
+ To specify the disk source for an NVMe disk, the source
+ element has the following attributes:
+
+ type
+ The type of address specified in the address
+ sub-element. Currently, only the pci value is
+ accepted.
+
+
+ managed
+ This attribute instructs libvirt to detach the NVMe
+ controller automatically on domain startup (yes)
+ or to expect the controller to be detached by the system
+ administrator (no).
+
+
+ namespace
+ The namespace ID which should be assigned to the domain.
+ According to the NVMe standard, namespace IDs start
+ at 1.
+
+
+
+ The difference between <disk type='nvme'>
+ and <hostdev/> is that the latter is plain
+ host device assignment with all its limitations (e.g. no live
+ migration), while the former makes the hypervisor run the NVMe
+ disk through the hypervisor's block layer, thus enabling all
+ features provided by the layer (e.g. snapshots, domain
+ migration, etc.). Moreover, since the NVMe disk is unbound
+ from its PCI driver, the host kernel storage stack is not
+ involved (compared to passing, say, /dev/nvme0n1 via
+ <disk type='block'>), and therefore lower
+ latencies can be achieved.
+