Libvirt creates qcow2 volumes with very old `compat` version `0.10` if
no `compat` version is specified in the xml (see
[here](https://libvirt.org/formatstorage.html#id11)). It seems libvirt
has backwards compatibility reasons to still use `0.10` if nothing is
specified in the xml.
Unfortunately this has the effect that VMs created by vagrant-libvirt
are using the 10 year old `0.10` qcow2 version, instead of the `1.1`
version that has **better performance and stability fixes**.
This is even happening when the backing file of the vagrant box is in
compat `1.1` format.
This is the `qemu-img info` of the backing file provided by the vagrant
box:
```
# qemu-img info /var/lib/libvirt/images/vagrant_box_image_0_box.img
[...]
Format specific information:
compat: 1.1
[...]
```
Although the backing file provided by the box is using compat `1.1`, the
created volume by vagrant-libvirt has only compat `0.1` as shown here:
```
# qemu-img info /var/lib/libvirt/images/vagrant-01-test_default.img
[...]
backing file: /var/lib/libvirt/images/vagrant_box_image_0_box.img
Format specific information:
compat: 0.10
[...]
```
With this MR, the volume of the VM will use the same `compat` version as
provided by the backing image of the vagrant box.
If `qemu-img info` does not return a `compat` flag, it falls back to the
very old `0.10` version.
Allow for iface_name to be set on public_network configurations to
control the name of the interface created by libvirt. This overrides the
default that would be created automatically, but cannot use a reserved
name as it will be ignored.
Closes: #799
Better handle setting the autoport value when the port is explicitly set
to ensure that the XML sent to update the VM is correct and will be the
XML that is reflected in the defined machine.
By prioritizing checking if the port is provided, graphics_autoport =
"yes" is ignored.
Fixes: #1687
This adds websocket functionality for VNC. The websocket attribute may
be used to specify the port to listen on (with -1 meaning
auto-allocation and autoport having no effect due to security reasons).
Secure Encryption Virtualization is supported by libvirt and this
change adds support for vagrant-libvirt to enable it.
It requires a UEFI base box and needs a combination of options to be
configured for it to work.
Co-authored-by: PELLET Norman <norman.pellet@csem.ch>
Co-authored-by: MUNTANÉ CALVO Enric <emc@csem.ch>
Co-authored-by: Darragh Bailey <daragh.bailey@gmail.com>
Closes: #1372
The CPU element to manage the mode, model, features (including nested),
is only available on some architectures. To allow this plugin to
generate XML valid for other architectures such as RISC-V, the CPU
element needs to be optional and only enabled when the architecture
specified supports it.
Include checks in the validation section to help prevent the setting of
an unsupported architecture with any of the CPU features that require
the CPU element to be available.
Fixes: #1538
To allow for a different ssh port to be used when connecting to a
machine for NFS setup, use the port provided in the ssh_info hash
with a fallback to the default ssh port.
This may allow NFS mounting into Windows guests once support is added to
vagrant itself to handle NFS installation.
Fixes: #1640
Move the unit context to a name matching the other contexts. Remove some
unnecessary entries from it that are unused, and remove references to
the old name relying on spec helper to load all contexts.
Allow libvirt to start the domain before reading back the XML to
retrieve the port assigned automatically for subsequent graphics access
when autoport is enabled.
Fixes: #992
Ensure the bootmenu is disabled by default. When not specified it will
default to the hypervisor default behaviour, however this is not
necessarily defined consistently across different distros. Therefore for
a consistent behaviour, simply ensure it is always configured to be off,
unless explicitly required to be enabled when the boot order is
configured.
Fixes: #947
The driver is created with a specific machine instance, use this in
stead of requiring a specific instance be passed in. Apply the same
approach to domain where possible which allows the same get ip address
function to be called with and without a domain being passed in to save
lookup during loops.
Normalise the XML to ensure the attributes for both documents have the
same ordering to prevent excessive noise when differences are detected.
Additionally sort various elements based on attributes that make
ordering irrelevant to allow for simpler comparison using xmlsimple.
Closes: #1583
Updating the behaviour to only warn on the XML differing instead of
erroring missed updating the spec tests to replace the check for the
error message with a warn message.
As this is the first time people may be alerted to an issue applying
changes, given that previously they would be silently discarded, better
to switch to warning the end user rather than blocking the domain start.
This is no worse than the previous position, and allows previous
configurations that were understood to be working, to continue to work
as they are, with hopefully issues logged by users that will allow
more correct XML to be sent before making failure to match the changes
applied fatal.
Remove the unnecessary nesting of begin/rescue entries in start domain
which partially existed due to vagrant-libvirt destroying domain
definitions on error, even for machines that had previously been
created.
Since this is no longer a problem as existing domains will be halted
rather than removed when an exception occurs, it is perfectly fine to
let the exception percolate upwards and be handled by the warden, which
is what the current code is doing, but at the expense of catching and
re-raising multiple times.
Switch to calling the returning the next middleware in the chain as soon
as possible in the set boot order action. Makes the overall remaining
logic tidier.
Include basic tests to check existing behaviour.
Make sure to perform a diff of the actual XML strings and not the object
representations if encountering the bug where the XML failed to be
updated correctly.
The switch to compare-xml exhibits non intuitive behaviour in that
comparing without use of verbose mode does not detect documents that are
identical. The if statement allowed the tests to pass but reads that if
the documents compare as true then emit an error message and raise an
exception. However this is because it returns false even with the docs
are the same content. Unfortunately there was no pre-existing test
case added when
Since there were concerns about equivalent-xml not being active upstream
raised by a distribution maintainer, switching back should be avoided.
Attempted use of nokogiri-diff indicated that whitespace changes would
be treated as differences with it being unclear how to easily exclude.
Moving to xml-simple, which although will treat whitespace as
significant and even with NormaliseSpace set to disregard any whitespace
around content elements, it requires walking the returned structure to
remove the empty content attribute that is left behind to allow a direct
comparison between the data structures.
To ensure the XML comparison is validated add a test where libvirt
returns XML that is different to what was requested, and assert that
the expected error is raised, an error message emitted and that the
domain define would be reverted to the previous state if possible.
Relates-To: #1565Fixes: #1556
Using diffy alone is insufficient to spot when the XML returned by
libvirt for the changes applied is equivalent but formatted differently.
When libvirt returns slightly differently formatted XML corresponding to
what changes were actually applied, it can result in an error being
assumed to have happened when the update actually worked as expected.
Equivalent-xml handles detecting XML documents as being identical,
therefore switching to using it instead of diffy should produce more
reliable results, while retaining diffy for better formatted diffs
should an issue occur is useful.
Fixes: #1556
For testing certain scenarios with vagrant-libvirt, need in the guest system a
value for the systems serial number in the DMI/SMBIOS system information.
The domain https://libvirt.org/formatdomain.html#smbios-system-information
format of libvirt allows to specify those values.
While adding `-smbios type=1,serial=$serial_value` to the `qemuargs` parameter
of the libvirt provider is already able to achieve this, a dedicated provider config
value adds native support from the `Vagrantfile` layering system. For example,
in the .box included Vagrantfile a random serial number can be enforced by
adding the following:
require 'securerandom'
Vagrant.configure("2") do |config|
config.vm.provider :libvirt do |libvirt|
libvirt.dmi_system_serial = SecureRandom.alphanumeric(8).upcase
end
end
Then in an instance specific Vagrantfile this value can be overwritten by adding:
Vagrant.configure("2") do |config|
config.vm.provider :libvirt do |libvirt|
libvirt.dmi_system_serial = "ABCDEFGH"
end
end
Co-authored-by: Nils Ballmann <nils.ballmann.ext@siemens.com>
Co-authored-by: Darragh Bailey <daragh.bailey@gmail.com>
Previous PR added tests but missed that the disk controller was not
being set correctly in the create domain action by default.
Update config documentation to include option.
If a box is added directly without a version, vagrant will default set
it to 0. When this occurs it is necessary to ensure use of mtime to
distinguish when the disk image in the box was last updated to allow
replacement of the box to trigger a fresh upload.
Fixes: #1382
Ensure that when no box is defined for a pxe install, that the domain
name is only populated for subsequent boots and if it is already
provided by a create action, skip attempting to retrieve from a
non-existing domain XML.
Fixes: #1508
Vagrant newer than 2.2.11 reworked how the after hook definition
functions requiring it to be called in a different way to ensure it is
possible to hook the box remove action to perform the expected local
action.
Add a compatibility function to ensure that the plugin works with both
mechanisms and some simple tests to ensure that with the unit tests
validated against the older vagrant versions that it confirms the action
will be called as expected.
As part of fixing the call of the remove_libvirt_image action, ensure it
only executes when the box removed is a libvirt one and ignores all
others.
Fixes: #1196
Calling undefine on a domain and recreating it can result in some edge
case errors where if the current capabilities of libvirt have been
reduced, it may not be possible to restore the old definition.
Instead switch to calling `domain_define` with the new definition and
check that the resulting libvirt domain definition has been updated in
the expected manner, otherwise report an error to the user.
Fixes: #949
Relates-to: #1329
Relates-to: #1027
Relates-to: #1371
With multi volume boxes, need to ensure that disk settings such as the
device assigned are resolved dynamically once it has been established
which devices have already been assigned to the box volumes on either
initial creation or subsequent boots.
Otherwise users are forced to always explicitly define the device for
additional storage instead of having it be automatically assigned the
next available device.
Consequently previous changes have broken the ability for machines
with additional storage to be halted and restarted correctly.
Include an integration test that for additional storage checks that the
machine can be stopped and started again.
Fixes: #1490
The loader tag is required when nvram is being enabled. Ensure it is
marked as a requirement of validation and also support it's
configuration during the start domain action.
Switch to instance doubles for driver where possible and reduce usage of
allow_any_instance_of.
Additionally replace some incorrect allows with expects to validate
calls are not made.
Handle enabling and disabling of NVRAM during start domain.
This patch contains latent support for upstream PR fog-libvirt#102 to support destroy
with NVRAM enabled once that is merged.
Fixes#1027.
Switch to the command as an array format when passing the port
forwarding ssh command to spawn to switch to execution of the ssh
process directly instead of being executed by the default shell.
Fixes: #1467
Libvirt did already change the current 9216kb
to 16384kb when starting a domain so in practice this is a no-op.
The libvirt default was changed in 2014 in 81ba2298b2
Reduce the patching needed should a distro wish to switch the default
from using the system connection by default to using a session
connection by default.
Should now only require patching the default value and a single test
checking the defaults.
Adjust handling to use a method reference instead of attempting to
update the env value as that only appears to be available to actions
further down the chain and any changes are not visible during recovery
when exercised for real.
By passing a function reference can limit the actions that can be taken
to only allow a single call that will set a flag internal to the clean
up action.
Fixes: #1328
Switch to dedicated actions to trigger destroy/halt via the recovery
mechanism if the VM bring up has failed to complete the bring up of the
machine.
Fixes: #1328
Switch to using explicit references to objects to be partially mocked
and remove the need to resolve the string constants as this will catch
more instances of calls to invalid or missing methods.
Rework how the vm is added to the machine for one of the tests as it is
not a method and instead is provided via internal state being exposed
with a helper.
In PrepareNFSSettings.read_host_ip we use a UDPSocket to connect to the host in
question. However, this fails completely if the system executing the tests has
no network adapters enable, like builders in OBS or Koji. If we double the calls
to this, then we no longer face that issue.
To make it easier to see when the XML generated has deviated from
expected, tidy up the emitted XML to use a more consistent formatting
that would be inline with what would be expected to be output by virsh
directly.
Re-enable handling of the disk_device domain volume setting to ensure it
can be overridden from the default of vda to a value chosen.
Provide a disk resolver to resolve devices after the box has been downloaded
so that initial devices can be correctly allocated and avoid conflicts with
additional disks added that would otherwise get assigned the same device.
Removes hack for destroy domain when more than one disk, as now devices
in the config are only present if provided by the configuration.
Fixes: #1353
When destroying domains were there are multiple box volumes to be
removed, need to perform a series of checks to establish that the
correct volumes are being removed.
- check if device aliases are in use
- process the box volume removals first
- detect if disks attached outside of vagrant-libvirt
- prefer aliases for additional disks
- avoid use of devices for multiple disks to detect as currently
assigned incorrectly.
- fallback to detecting number of box volumes to determine
point at which additional disks start
Attempts to flag a number of cases where behaviour might be incorrect to
help users spot when vagrant-libvirt may accidentally remove something
it shouldn't.
Adjust the order of checks around use of qemu sessions to allow use of
the agent as a priority when enabled, which should remove the need to
retrieve the address from the system connection when enabled.
Additionally adjust the call to the agent to ensure it uses the default
connection to retrieve the correct domain, rather than forcing the
system connection, which will fail to find the domain if it was created
via a user session.
Add tests that validate most of this behaviour, as well as resulting in
some minor fixes around downcasing the mac address for comparisons, and
also using instance mocks with rspec instead of pure doubles to help
catch false positives where mocks are allowing calls that done exist.
Related: #1342
Add a unit test for the prepare nfs settings action to act as a
regression test for the recent fix to avoiding modifying a frozen string
literal.
As part of this fix how the communicator is returned via a call to a
machine for testing purposes and remove an obsolete expect from the
wait till up tests as the code that would result in the communicator
being called has been removed.
Ensure that nfsd is not required to run the tests by mocking out the
host capability check.
Using the trailing action for ShutdownDomain results in it being
executed as part of the start up sequence in the reload action.
This appears to happen because the halt action is not executed in
isolation from the start action that is scheduled afterwards and
consequently the chaining behaviour spans both instead of being treated
as the combination of two distinct actions that should complete.
While this fixes the issue, it is important that subsequently call such
actions can be done without allowing the out part of the middleware
behaviour to be applied to subsequent actions.
Partial-Fix: #1376
It is more reliable to identify disk and network devices by use of
aliases, in addition to being able to establish in the absence of
information the purpose of such devices.
There is a possibility that in some cases this will also resolve issues
where the same device attach issued twice with the same details will
fail due to the second request not appearing to be honoured.
Additionally when destroying domains, may not have the relevant details
on how many disks are provided by the box, for those that support
multiple disks. Being able to traverse the domain XML and destroy the
appropriate volumes based on aliases names will remove the need to have
predictable device identifiers during the destroy and allow for an
improved resolver.
Relates: #1342