Commit Graph

1930 Commits

Author SHA1 Message Date
Michal Privoznik
b07640bb43 qemu_domain: Drop unused variables from qemuDomainChrDefDropDefaultPath()
In mu previous commits I've moved internals of
qemuDomainChrDefDropDefaultPath() into a separate function
(qemuDomainChrMatchDefaultPath()) but forgot to remove @buf and
@regexp variables which are now unused.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
2023-08-17 17:43:54 +02:00
Michal Privoznik
8abc979bb0 qemu: Move channelTargetDir into stateDir
For historical reasons (i.e. unknown reason) we put channel
sockets into a path derived from cfg->libDir which is a path that
survives host reboots (e.g. /var/lib/libvirt/...). This is not
necessary and in fact for session daemon creates a longer prefix:

  XDG_CONFIG_HOME -> /home/user/.config
  XDG_RUNTIME_DIR -> /run/user/1000

Worse, if host is rebooted suddenly (e.g. due to power loss) then
we leave files behind and nobody will ever remove them.

Therefore, place the channel target dir into state dir.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2173980
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2023-08-17 17:22:09 +02:00
Michal Privoznik
d3759d3674 qemu: Generate shorter channel target paths
A <channel/> device is basically an UNIX socket into guest.
Whatever is sent from the host, appears in the guest and vice
versa. But because of that, the length of the path to the socket
is important (underscored by fact that we derive the path from
domain short name). But there are still cases where we might not
fit into UNIX_PATH_MAX limit (usually 108 characters), because
the path is derived also from other variables, e.g.
XDG_CONFIG_HOME for session domains.

There are two components though, that are needless: "/target/"
and "domain-" prefix. Drop them. This is safe to do, because
running domains have their path saved in status XML and even
though paths are dropped on migration, they are not part of guest
ABI and thus we are free to change them.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2023-08-17 17:19:01 +02:00
Michal Privoznik
baeefe0327 qemu_domain: Partially validate memory amounts when auto-adding NUMA node
When automatically adding a NUMA node (qemuDomainDefNumaAutoAdd()) the
memory size of the node is computed as:

  total_memory - sum(memory devices)

And we have a nice helper for that: virDomainDefGetMemoryInitial() so
it looks logical to just call it. Except, this code runs in post parse
callback, i.e. memory sizes were not validated and it may happen that
the sum is greater than the total memory. This would be caught by
virDomainDefPostParseMemory() but that runs only after driver specific
callbacks (i.e. after qemuDomainDefNumaAutoAdd()) and because the
domain config was changed and memory was increased to this huge
number no error is caught.

So let's do what virDomainDefGetMemoryInitial() would do, but
with error checking.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2216236
Fixes: f5d4f5c8ee
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
2023-07-25 14:51:35 +02:00
Peter Krempa
c90c97a734 Properly mark auto-added 'terminator' virStorageSource
All backing chain members which were auto-added by image detection,
including the terminating element, should have the 'detected' property
set to true. This is needed to properly strip the detected elements in
some cases, e.g. for the status XML where we could treat some images as
manually terminated even when it was auto-detected.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-07-20 14:58:35 +02:00
Michal Privoznik
f5d4f5c8ee qemu: Add NUMA node automatically for memory hotplug
Up until v2.11.0-rc2~19^2~3 QEMU used to require at least one
NUMA node to be configured when memory hotplug was enabled. After
that commit, QEMU automatically adds a NUMA node if none was
specified on the cmd line. Reflect this in domain XML, i.e.
explicitly add a NUMA node into our domain definition if needed.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2216236
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
2023-07-18 08:42:55 +02:00
Michal Privoznik
851c5f075b qemu_domain: Deduplicate targetNode check in qemuDomainDefValidateMemoryHotplugDevice()
If a domain has NUMA configured, then all <memory/> devices
(except for 'virtio-pmem') need to have targetNode set. There are
two checks inside of qemuDomainDefValidateMemoryHotplugDevice()
for this: one inside of big switch() statement, which only checks
'dimm' and 'nvdimm' cases, and the other at the end of the
function that checks all models (except for 'virtio-pmem'). Let's
keep the latter and remove the former as the latter covers the
former too.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
2023-07-13 16:34:15 +02:00
Jean-Louis Dupond
b855f8ea1e Add discard_no_unref option for qcow2 images
Qemu 8.1.0 will add discard_no_unref option for qcow2 images.
When this option is enabled (default=false), then it will no longer
unreference clusters when guest does a discard, but it will just free
the blocks (useful for incremental backups for example) and pass the
discard to the lower layer.

This was implemented to avoid fragmentation within the qcow2 image.

Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-06-26 13:06:00 +02:00
Peter Krempa
e3ce39195c qemu_domain: Properly validate count of memory slots
Memory slots are required only for DIMM-like devices, while other
devices defined via <memory> such as virtio-mem may use the PCI bus and
thus do not require/consume a memory slot.

Fix the validation code to calculate the required count of memory
devices only for DIMMs and NVDIMMs.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-06-26 12:58:24 +02:00
Michal Privoznik
2c15506254 qemu: Fill virtio-mem/virtio-pmem .memaddr at runtime
After a QEMU domain is started, among other thing we query memory
device information. And while memory address is returned by QEMU
for all models, we store it only for DIMMs and NVDIMMs. Do store
it for VIRTIO_MEM and VIRTIO_PMEM too.

This effectively reports the address the virtio-mem/virtio-pmem
is mapped to in live XML.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-26 16:44:45 +02:00
Michal Privoznik
3ec6d586bc qemu: Start emulator thread with more generous cpuset.mems
Consider a domain with two guest NUMA nodes and the following
<numatune/> setting :

  <numatune>
    <memory mode="strict" nodeset="0"/>
    <memnode cellid="0" mode="strict" nodeset="1"/>
  </numatune>

What this means is the emulator thread is pinned onto host NUMA
node #0 (by setting corresponding cpuset.mems to "0"), and two
memory-backend-* objects are created:

  -object '{"qom-type":"memory-backend-ram","id":"ram-node0", .., "host-nodes":[1],"policy":"bind"}' \
  -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \
  -object '{"qom-type":"memory-backend-ram","id":"ram-node1", .., "host-nodes":[0],"policy":"bind"}' \
  -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 \

Note, the emulator thread is pinned well before QEMU is even
exec()-ed.

Now, the way memory allocation works in QEMU is: the emulator
thread calls mmap() followed by mbind() (which is sane, that's
how everybody should do it). BUT, because the thread is already
restricted by CGroups to just NUMA node #0, calling:

  mbind(host-nodes:[1]); /* made up syntax (TM) */

fails. This is expected though. Kernel was instructed to place
the memory at NUMA node "0" and yet, process is trying to place
it elsewhere.

We used to solve this by not restricting emulator thread at all
initially, and only after it's done initializing (i.e. we got the
QMP greeting) we placed it onto desired nodes. But this had its
own problems (e.g. QEMU might have locked pieces of its memory
which were then unable to migrate onto different NUMA nodes).

Therefore, in v5.1.0-rc1~282 we've changed this and set cgroups
upfront (even before exec()-ing QEMU). And this used to work, but
something has changed (I can't really put my finger on it).

Therefore, for the initialization start the thread with union of
all configured host NUMA nodes ("0-1" in our example) and fix the
placement only after QEMU is started.

NB, the memory hotplug suffers the same problem, but that will
be fixed in the next commit.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2138150
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-23 17:21:16 +02:00
Michal Privoznik
37e41b7f16 qemu: Drop @forceVFIO argument of qemuDomainGetMemLockLimitBytes()
After previous cleanup, there's not a single caller that would
call qemuDomainGetMemLockLimitBytes() with @forceVFIO set. All
callers pass false.

Drop the unneeded argument from the function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:43 +02:00
Michal Privoznik
4f355fa5b7 qemu: Drop @forceVFIO argument of qemuDomainAdjustMaxMemLock()
After previous cleanup, there's not a single caller that would
call qemuDomainAdjustMaxMemLock() with @forceVFIO set. All callers
pass false.

Drop the unneeded argument from the function.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:43 +02:00
Michal Privoznik
c925bb9273 qemu_domin: Account for NVMe disks when calculating memlock limit on hotplug
During hotplug of a NVMe disk we need to adjust the memlock
limit. The computation of the limit is handled by
qemuDomainGetMemLockLimitBytes() which looks at given domain
definition and accounts for various device types (as different
types require different amounts). But during disk hotplug the
disk is not added to domain definition until the very last
moment. Therefore, qemuDomainGetMemLockLimitBytes() has this
@forceVFIO argument which tells it to assume VFIO even if there
are no signs of VFIO in domain definition. And this kind of
works, until the amount needed for NVMe disks changed (in
v9.3.0-rc1~52). What's missing in the commit is making @forceVFIO
behave the same as if there was an NVMe disk present in the
domain definition.

But, we can do even better - just mimic whatever we're doing for
hostdevs. IOW - introduce qemuDomainAdjustMaxMemLockNVMe() that
behaves the same as qemuDomainAdjustMaxMemLockHostdev().

There are subtle differences though:

1) qemuDomainAdjustMaxMemLockHostdev() can afford placing hostdev
   right at the end of vm->def->hostdevs, because the array was
   already reallocated (at the beginning of
   qemuDomainAttachHostPCIDevice()). But
   qemuDomainAdjustMaxMemLockNVMe() doesn't have that luxury.

2) qemuDomainAdjustMaxMemLockHostdev() places a
   virDomainHostdevDef pointer into domain definition, while
   qemuDomainStorageSourceAccessModifyNVMe() (which calls
   qemuDomainAdjustMaxMemLock()) sees a virStorageSource pointer
   but domain definition contains virDomainDiskDef. But that's
   okay, we can create a dummy disk definition and append it into
   the domain definition.

After this, qemuDomainAdjustMaxMemLock() can be called with
@forceVFIO = false, as the disk is now part of domain definition
(when computing the new limit).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2014030#c28
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-05-16 14:43:42 +02:00
Michal Privoznik
fea0d8c40d qemu: Move <hostdev> SCSI path generation into qemuDomainPrepareHostdev()
When preparing a SCSI <hostdev/> with passthrough of a host SCSI
adapter (i.e. no protocol), a virStorageSource structure is
initialized and stored inside virDomainHostdevDef. But the source
structure is filled in many places, with almost the same code.

Firstly, qemuProcessPrepareHostHostdev() and
qemuConnectDomainXMLToNativePrepareHostHostdev() are the same.

Secondly, qemuDomainPrepareHostdev() allocates the src structure,
only to let qemuProcessPrepareHostHostdev() fill src->path later.

Well, src->path can be filled at the same place where the src
structure is allocated (qemuDomainPrepareHostdev()) which renders
the other two functions needless.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-04-25 12:36:30 +02:00
Michal Privoznik
59962b69b5 qemu: Deny all but VFIO PCI backends in hostdev prepare phase
We used to support KVM and VFIO style of PCI assignment. The
former was dropped in v5.7.0-rc1~103 and thus we only support
VFIO. All other backends lead to an error (see
qemuBuildPCIHostdevDevProps(), or qemuBuildPCIHostdevDevStr() as
it used to be called in the era of aforementioned commit).

Might as well report the error in prepare phase and save hassle
of proceeding with device preparation (e.g. in case of hotplug
overriding the device's driver, setting seclabels, etc.).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-04-25 12:36:30 +02:00
Michal Privoznik
3b87709c76 qemu: Move <hostdev/> PCI backend setting into qemuDomainPrepareHostdev()
virsh command domxml-to-native failed with below error but start
command succeed for same domain xml.

  "internal error: invalid PCI passthrough type 'default'"

If a <hostdev> PCI backend is not set in the XML, the supported
one is then chosen in qemuHostdevPreparePCIDevicesCheckSupport().
But this function is not called anywhere from
qemuConnectDomainXMLToNative(). But qemuDomainPrepareHostdev()
is. And it is also called from domain startup/hotplug code.
Therefore, move the backend setting to the common path and drop
qemuHostdevPreparePCIDevicesCheckSupport().

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-04-25 12:36:30 +02:00
Michal Privoznik
6e60e8cb9f qemu_domain: Move internals of qemuDomainPrepareHostdev() into a separate function
So far, qemuDomainPrepareHostdev() is a NOP for anything but a
SCSI hostdev. This will change soon. Therefore, move the SCSI
hostdev preparation into a separate function
(qemuDomainPrepareHostdevSCSI()) and make
qemuDomainPrepareHostdev() call function corresponding to the
hostdev type (or nothing if the type doesn't need any
preparation).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-04-25 12:36:30 +02:00
Michal Privoznik
5670c50ffb qemu_domain: Increase memlock limit for NVMe disks
When starting QEMU, or when hotplugging a PCI device QEMU might
lock some memory. How much? Well, that's an undecidable problem.

But despite that, we try to guess. And it more or less works,
until there's a counter example. This time, it's a guest with
both <hostdev/> and an NVMe <disk/>. I've started a simple guest
with 4GiB of memory:

  # virsh dominfo fedora
  Max memory:     4194304 KiB
  Used memory:    4194304 KiB

And here are the amounts of memory that QEMU tried to lock,
obtained via:

  grep VmLck /proc/$(pgrep qemu-kvm)/status

  1) with just one <hostdev/>
     VmLck:   4194308 kB

  2) with just one NVMe <disk/>
     VmLck:   4328544 kB

  3) with one <hostdev/> and one NVMe <disk/>
     VmLck:   8522852 kB

Now, what's surprising is case 2) where the locked memory exceeds
the VM memory. It almost resembles VDPA. Therefore, treat is as
such.

Unfortunately, I don't have a box with two or more spare NVMe-s
so I can't tell for sure. But setting limit too tight means QEMU
refuses to start.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2014030
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-04-20 08:37:22 +02:00
Jiri Denemark
27ed822d30 qemu/qemu_domain: Update format strings in translated messages
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2023-04-01 11:40:33 +02:00
Andrea Bolognani
420a7a2550 qemu: Default to raw firmware for existing domains
The changes to the output files are the exact opposite of
those from commit 22207713cf: this is proof that the fix is
working as intended, and that existing domains will keep using
raw firmware images regardless of whether or not qcow2 images
are available on the system and have higher priority.

New domains will keep picking whatever firmware is considered
the preferred one according to the order of descriptors, as
evidenced by the fact that the recently introduced
firmware-auto-efi-abi-update-aarch64 test case is unaffected.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-03-28 14:22:34 +02:00
Andrea Bolognani
f099d3fe10 qemu: Move validation check out of postparse
Suggested-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-22 13:49:53 +01:00
Ján Tomko
9dab836721 qemu: use correct formatting string for size_t
Otherwise the build on armv7l breaks:
error: format ‘%lu’ expects argument of type
‘long unsigned int’, but argument 4 has type
‘size_t’ {aka ‘unsigned int’} [-Werror=format=]

Fixes: 1992ae40fa
Fixes: e239f7d0a8

Signed-off-by: Ján Tomko <jtomko@redhat.com>
2023-03-17 15:36:48 +01:00
Or Ozeri
5589a3e1f3 qemu: add luks-any encryption support for RBD images
The newly added luks-any rbd encryption format in qemu
allows for opening both LUKS and LUKS2 encryption formats.
This commit enables libvirt uses to use this wildcard format.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-03-16 15:19:36 +01:00
Or Ozeri
77c9663d72 qemu: add support for librbd layered encryption
This commit enables libvirt users to use layered encryption
of RBD images, using the librbd encryption engine.
This allows opening of an encrypted cloned image
whose parent is encrypted with a possibly different encryption key.
To open such images, multiple encryption secrets are expected
to be defined under the encryption XML tag.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-03-16 15:19:36 +01:00
Or Ozeri
1992ae40fa qemu: add multi-secret support in _qemuDomainStorageSourcePrivate
This commit changes the _qemuDomainStorageSourcePrivate struct
to support multiple secrets (instead of a single one before this commit).
This will useful for storage encryption requiring more than a single secret.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-03-16 15:19:36 +01:00
Or Ozeri
e239f7d0a8 qemu: add support for multiple secret aliases
Change secret aliases from %s-%s-secret0 to %s-%s-secret%lu,
which will later be used for storage encryption requiring more
than a single secret.

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
2023-03-16 15:19:35 +01:00
Michal Privoznik
b4ccb0dc41 qemu: Move cpuset preference evaluation into a separate function
The set of if()-s that determines the preference in cpumask used
for setting things like emulatorpin, vcpupin, etc. is going to be
re-used. Separate it out into a function.

You may think that this changes behaviour, but
qemuProcessPrepareDomainNUMAPlacement() ensures that
priv->autoCpuset is set for VIR_DOMAIN_CPU_PLACEMENT_MODE_AUTO.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Kristina Hanicova <khanicov@redhat.com>
Reviewed-by: Andrea Bolognani <abologna@redhat.com>
2023-03-15 12:46:40 +01:00
Jiri Denemark
b1b037fa5b Introduce VIR_DOMAIN_PAUSED_API_ERROR
Some APIs (migration, save/restore, snapshot, ...) require a domain to
be suspended temporarily. In case resuming the domain fails, the domain
will be unexpectedly left paused when the API finishes. This situation
is reported via VIR_DOMAIN_EVENT_SUSPENDED event with
VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR detail. But we do not have a
corresponding reason for VIR_DOMAIN_PAUSED state and the reason would
remain set to the value used when the domain was paused. So the state
reason would suggest the operation is still running.

This patch changes the state reason to a new VIR_DOMAIN_PAUSED_API_ERROR
to make it clear the API that paused the domain already finished, but
failed to resume the domain.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-03-15 10:52:14 +01:00
Christian Nautze
a9a4421ba8 qemu: implement QEMU NBD source reconnect delay attribute
Currently it's only possible to set this parameter during domain
creation via QEMU commandline passthrough feature.
With the new delay attribute it's also possible to set this
parameter if you want to attach a new NBD disk
using "virsh attach-device domain device.xml" e.g.:

  <disk type='network' device='disk'>
    <driver name='qemu' type='raw'/>
    <source protocol='nbd' name='foo'>
      <host name='example.org' port='6000'/>
      <reconnect delay='10'/>
    </source>
    <target dev='vdb' bus='virtio'/>
  </disk>

Signed-off-by: Christian Nautze <christian.nautze@exoscale.ch>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-10 09:38:05 +01:00
Peter Krempa
f9b97f6b10 conf: cpu: Remove NULL check from virCPUDefCopyWithoutModel
Make all callers always pass a valid pointer which in turn allows us to
remove return value check from the callers.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-03-06 20:55:50 +01:00
Peter Krempa
8432392f51 cpu: Remove return value from virCPUDefCopyModel(Filter)
The functions were always returning 0.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-03-06 20:55:50 +01:00
Peter Krempa
9c627dc762 qemu: domain: Restructure control flow in qemuDomainFixupCPUs
Do the two fixups of CPU as one block and split up the return value
checks to separate conditions. This will make the upcoming refactors
simpler.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-03-06 20:55:50 +01:00
Peter Krempa
e61adbf26b qemu: capabilities: Don't make callers check return of virQEMUCapsNew(Binary)
The allocation of the object itself can't fail. What can fail is the
creation of the class on a programming error. Rather than punting the
error up the stack abort() directly on the first occurence as the error
can't be fixed during runtime.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-03-06 20:55:50 +01:00
Andrea Bolognani
d283e1bd19 qemu: Propagate firmware format
Take the information from the descriptor and store it in the
domain definition. Various things, such as the arguments passed
to -blockdev and the path generated for the NVRAM file, will
then be based on it.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-03 13:51:04 +01:00
Andrea Bolognani
d4383682c4 qemu: Move qemuDomainNVRAMPathFormat() to qemu_firmware
There are no other callers remaining.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-03 13:49:56 +01:00
Andrea Bolognani
9567f3ba1f qemu: Move firmware selection from startup to postparse
Currently, firmware selection is performed as part of the
domain startup process. This mostly works fine, but there's a
significant downside to this approach: since the process is
affected by factors outside of libvirt's control, specifically
the contents of the various JSON firmware descriptors and
their names, it's pretty much impossible to guarantee that the
outcome is always going to be the same. It would only take an
edk2 update, or a change made by the local admin, to render a
domain unbootable or downgrade its boot security.

To avoid this, move firmware selection to the postparse phase.
This way it will only be performed once, when the domain is
first defined; subsequent boots will not need to go through
the process again, as all the paths that were picked during
firmware selection are recorded in the domain XML.

Care is taken to ensure that existing domains are handled
correctly, even if their firmware configuration can't be
successfully resolved. Failure to complete the firmware
selection process is only considered fatal when defining a
new domain; in all other cases the error will be reported
during startup, as is already the case today.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-03 13:49:56 +01:00
Andrea Bolognani
79e7d2c602 qemu: Introduce qemuDomainDefBootPostParse()
Move all the boot related parts of qemuDomainDefPostParse()
to a separate helper.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-03 13:41:04 +01:00
Andrea Bolognani
7e12610387 qemu: Introduce qemuDomainDefMachinePostParse()
Move all the machine type related parts of
qemuDomainDefPostParse() to a separate helper.

Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-03-03 13:40:57 +01:00
Peter Krempa
6ecd218109 qemu: domain: Unexport qemuDomainObjTaintMsg
The function is used only inside qemu_domain.c, unexport it and move it
above its user.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2023-03-02 09:23:33 +01:00
Peter Krempa
9134b40d0b qemu: domain: Fix logic when tainting domain
Originally the code was skipping all repeated taints with the same taint
flag but a logic bug introduced in commit 30626ed15b inverted
the condition. This caused that actually the first occurence was NOT
logged but any subsequent was.

This was noticed when going through oVirt logs as they use custom guest
agent commands and the logs are totally spammed with this message.

Fixes: 30626ed15b
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
2023-03-02 09:23:33 +01:00
Laine Stump
f62ce81b8a qemu: respond to NETDEV_STREAM_DISCONNECTED event
When a QEMU netdev is of type "stream", if the socket it uses for
connectivity to the host network gets closed, then QEMU will send a
NETDEV_STREAM_DISCONNECTED event. We know that any stream netdev we've
created is backed by a passt process, and if the socket was closed,
that means the passt process has disappeared.

When we receive this event, we can respond by starting a new passt
process with the same options (including socket path) we originally
used. If we have previously created the stream netdev device with a
"reconnect" option, then QEMU will automatically reconnect to this new
passt process. (If we hadn't used "reconnect", then QEMU will never
try to reconnect to the new passt process, so there's no point in
starting it.)

Note that NETDEV_STREAM_DISCONNECTED is an event sent for the netdev
(ie "host side") of the network device, and so it sends the
"netdev-id" to specify which device was disconnected. But libvirt's
virDomainNetDef (the object used to keep track of network devices) is
the internal representation of both the host-side "netdev", and the
guest side device, and virDomainNetDef doesn't directly keep track of
the netdev-id, only of the device's "alias" (which is the "id"
parameter of the *guest* side of the device). Fortunately, by convention
libvirt always names the host-side of devices as "host" + alias, so in
order to search for the affected NetDef, all we need to do is trim the
1st 4 characters from the netdev-id and look for the NetDef having
that resulting trimmed string as its alias. (Contrast this to
NIC_RX_FILTER_CHANGED, which is an event received for the guest side
of the device, and so directly contains the device alias.)

Resolves: https://bugzilla.redhat.com/2172098
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-02-22 08:36:13 -05:00
Michal Privoznik
61d1b9e659 qemu: Don't remove macvtaps on failed start
If a domain is configured to create a macvtap/macvlan but the
target link already exists, startup fails (as expected) with:

  error: error creating macvtap interface test@eth0 (52:54:00:d9:0b:db): File exists

Okay, we could make that error message better, but that's not the
point. Since this error originated while generating cmd line
(the caller is qemuProcessStart(), transitively), the cleanup
after failed start is performed (qemuProcessStop()). Here,
virNetDevMacVLanDeleteWithVPortProfile() is called which removes
the macvtap interface we did not create (as it made us fail in
the first place).

Therefore, we need to track which macvtap/macvlan interface was
created successfully and remove only those.

You'll notice that only qemuProcessStop() has the new check. For
the (failed) hotplug case (qemuDomainAttachNetDevice()) this
function is already in place (the @iface_connected variable), or
not needed (qemuDomainRemoveNetDevice() - we're removing an
interface that was already attached to QEMU).

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2166235
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-02-01 15:44:26 +01:00
Peter Krempa
f730b1e4f2 qemu: domain: Store fdset ID for disks passed to qemu via FD
To ensure that we can hot-unplug the disk including the associated fdset
we need to store the fdset ID in the status XML.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-02-01 09:17:41 +01:00
Peter Krempa
531adf3274 qemuStorageSourcePrivateDataFormat: Rename 'tmp' to 'objectsChildBuf'
Be consistent with other children buffer variable naming scheme.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
2023-02-01 09:17:41 +01:00
Martin Kletzander
926594dcc8 qemu: Add implicit watchdog for q35 machine types
The iTCO watchdog is part of the q35 machine type since its inception,
we just did not add it implicitly.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2137346

Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-01-26 16:40:30 +01:00
Michal Privoznik
c3afde9211 qemu_domain: Don't unref NULL hash table in qemuDomainRefreshStatsSchema()
The g_hash_table_unref() function does not accept NULL. Passing
NULL results in a glib warning being triggered. Check whether the
hash table is not NULL and unref it only then.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-01-26 13:48:16 +01:00
zhenwei pi
7ba22d21a1 conf: introduce crypto device
Introduce crypto device like:

  <crypto model='virtio' type='qemu'>
    <backend model='builtin' queues='1'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
  </crypto>

  <crypto model='virtio' type='qemu'>
    <backend model='lkcf'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
  </crypto>

Currently, crypto model supports virtio only, type supports qemu only
(vhost-user in the plan). For the qemu type, backend supports modle
builtin/lkcf, and the queues is optional.

Changes in this commit:
- docs: formatdomain.rst
- schemas: domaincommon.rng
- conf: crypto related domain conf
- qemu: crypto related
- tests: crypto related test

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
2023-01-25 16:00:42 +01:00
Laine Stump
a56f0168d5 qemu: hook up passt config to qemu domains
This consists of (1) adding the necessary args to the qemu commandline
netdev option, and (2) starting a passt process prior to starting
qemu, and making sure that it is terminated when it's no longer
needed. Under normal circumstances, passt will terminate itself as
soon as qemu closes its socket, but in case of some error where qemu
is never started, or fails to startup completely, we need to terminate
passt manually.

Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
2023-01-10 01:19:25 -05:00
Peter Krempa
894fe89484 qemu: Enable support for FD passed disk sources
Assert support for VIR_DOMAIN_DEF_FEATURE_DISK_FD in the qemu driver
now that all code paths are adapted.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
2023-01-09 14:59:43 +01:00