The function doesn't really need the connect object for anything besides
registering the autodestroy callback for it. If we merge it, certain
callers can be simplified.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
Add possibility for the caller to set the flags for the call to
'virLXCProcessCleanup'.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Tim Wiederhake <twiederh@redhat.com>
The virDomainObj struct has a @pid member where the domain's
hypervisor PID is stored (e.g. QEMU/bhyve/libvirt_lxc/... PID).
However, we are not consistent when it comes to the shutoff state.
Initially, because virDomainObjNew() uses g_new0(), the @pid is
initialized to 0. But when a domain is shut off, some functions set
it to -1 (virBhyveProcessStop, virCHProcessStop, qemuProcessStop,
...).
In some places the @pid is tested for being 0, in others for being
negative, and in the rest for being positive.
To solve this inconsistency we can stick with either value, -1 or
0. I've chosen the latter as it's safer IMO. For instance if by
mistake we'd kill(vm->pid, SIGTERM) we would signal only our own
process group instead of every process we are allowed to signal.
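A minimal sketch of the guard this convention makes possible. The helper
name safe_kill() is hypothetical and is not part of libvirt; it only
illustrates that neither "shut off" marker should ever be handed to
kill():

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libvirt: signal a hypervisor
 * process only when the stored PID denotes a real process. */
static int
safe_kill(pid_t pid, int sig)
{
    assert(sig >= 0);

    /* POSIX: kill(0, sig) targets the caller's own process group and
     * kill(-1, sig) targets every process the caller may signal, so
     * neither value must reach kill(). */
    if (pid <= 0) {
        errno = ESRCH;
        return -1;
    }

    return kill(pid, sig);
}
```

With @pid reset to 0 on shutoff, a caller that forgets the guard still
only reaches its own process group, which is the safer of the two
failure modes.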
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
Remove the argument from the function prototypes and the callback
handler.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Access the 'driver' struct from the private data rather than the passed
opaque pointer in preparation to remove the opaque pointer.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This change was generated using the following spatch:
@ rule1 @
expression a;
identifier f;
@@
<...
- f(*a);
... when != a;
- *a = NULL;
+ g_clear_pointer(a, f);
...>
@ rule2 @
expression a;
identifier f;
@@
<...
- f(a);
... when != a;
- a = NULL;
+ g_clear_pointer(&a, f);
...>
Then, I left some of the changes out, like tools/nss/ (which
doesn't link with glib) and put back a comment in
qemuBlockJobProcessEventCompletedActiveCommit() which coccinelle
decided to remove (I have no idea why).
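For readers unfamiliar with the pattern, here is a sketch of what the
rewrite does to a call site. To keep it compilable without GLib, the
CLEAR_POINTER macro below is a hypothetical stand-in that mimics the
documented behaviour of g_clear_pointer() (do nothing for NULL, otherwise
reset the variable and call the destroy function on the old value);
counted_free() exists only so the effect can be observed:

```c
#include <assert.h>
#include <stdlib.h>

static int free_calls; /* how many times the destroy function ran */

static void
counted_free(void *p)
{
    assert(p != NULL);
    free_calls++;
    free(p);
}

/* Hypothetical stand-in for g_clear_pointer(pp, destroy). */
#define CLEAR_POINTER(pp, destroy) \
    do { \
        void *old_ = *(pp); \
        *(pp) = NULL; \
        if (old_) \
            (destroy)(old_); \
    } while (0)

/* Before the spatch a call site looked like:
 *     counted_free(buf);
 *     buf = NULL;
 * After it, both statements collapse into:
 *     CLEAR_POINTER(&buf, counted_free);
 */
```

The single-statement form cannot leave the pointer dangling between the
free and the reset, and calling it twice is harmless.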
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
After recent cleanups, there are some pointless cleanup sections.
Clean them up.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Convert all the users who unref their virCaps object unconditionally.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Now that the resource structure can have an appid as well, we need to adapt
the code that creates the default resource partition if none is provided by
the user. Otherwise starting a VM with appid defined would fail with the
following error:
error: unsupported configuration: Resource partition '(null)' must start with '/'
Fixes: 38b5f4faab
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
This is the bug I'm facing. I deliberately configured a container
so that the source of a <filesystem/> to passthrough doesn't
exist. The start fails with:
lxcContainerPivotRoot:669 : Failed to create /non-existent/path/.oldroot: Permission denied
which is expected. But what is NOT expected is that CGroup
hierarchy is left behind. This is because the controller sets up
the CGroup hierarchy, user namespace, moves interfaces, etc. and
finally checks whether container setup (done in a separate
process) succeeded. Only after all this is the error propagated
to the LXC driver. The driver aborts the startup and tries to
perform the cleanup, but this is missing CGroups because those
weren't detected yet.
Ideally, whenever a function fails, it rolls back so
that it has no artifacts left behind (look at all those frees/FD
closes/etc. at end of functions). But with CGroups it is
different - the controller process can't clean up after itself,
because it is still running inside that CGroup.
Therefore, what we have to do is to let the driver detect CGroups
as soon as they are created, and proceed with controller
execution only after that.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
Currently, there is only a single pipe passed to lxc_controller
and it is used by lxc_controller to signal to the LXC driver that
the container is set up and ready to run. However, in the next
commit we will need to signal that the LXC driver has done its
part of the startup process and thus the controller can proceed.
Unfortunately, virCommand handshake can't be used for this,
because it's already used to read controller's PID.
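The two-pipe arrangement can be sketched as follows. This is not the
libvirt code; the function name, the byte values '1' and '2', and the use
of a plain fork() in place of spawning lxc_controller are all assumptions
made for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 when both handshake steps completed, -1 otherwise. */
static int
run_two_way_handshake(void)
{
    int to_driver[2];     /* "controller" -> "driver" */
    int to_controller[2]; /* "driver" -> "controller" */
    char c = 0;
    int status = 0;
    int ret = -1;
    pid_t child;

    if (pipe(to_driver) < 0)
        return -1;
    if (pipe(to_controller) < 0) {
        close(to_driver[0]);
        close(to_driver[1]);
        return -1;
    }

    child = fork();
    if (child < 0) {
        close(to_driver[0]);
        close(to_driver[1]);
        close(to_controller[0]);
        close(to_controller[1]);
        return -1;
    }

    if (child == 0) {
        /* Controller side: announce early setup, then wait. */
        close(to_driver[0]);
        close(to_controller[1]);
        if (write(to_driver[1], "1", 1) != 1)
            _exit(1);
        if (read(to_controller[0], &c, 1) != 1 || c != '2')
            _exit(2);
        _exit(0);
    }

    /* Driver side. */
    assert(child > 0);
    close(to_driver[1]);
    close(to_controller[0]);
    if (read(to_driver[0], &c, 1) == 1 && c == '1') {
        /* The driver would do its part of the startup here. */
        if (write(to_controller[1], "2", 1) == 1)
            ret = 0;
    }
    close(to_driver[0]);
    close(to_controller[1]);

    if (waitpid(child, &status, 0) != child ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        ret = -1;
    return ret;
}
```

If the driver side bails out early, closing its write end makes the
controller's read() return 0, so the child never blocks forever.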
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
The command line argument is called --handshakefd (check out
lxc_controller.c:main()). But the command line builder puts only
--handshake. This works, because there is no other argument
sharing the prefix.
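The reason the short spelling works is that getopt_long() accepts any
unambiguous abbreviation of a long option. A small sketch, assuming an
option table with a single "handshakefd" entry (the short letter 's' and
the function name are guesses made for illustration, not a copy of
lxc_controller.c):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Returns the parsed fd, or -1 if the option is missing or unknown. */
static int
parse_handshake_fd(int argc, char **argv)
{
    static const struct option options[] = {
        { "handshakefd", required_argument, NULL, 's' },
        { NULL, 0, NULL, 0 },
    };
    int fd = -1;
    int c;

    assert(argc >= 1 && argv != NULL);

    optind = 1; /* restart scanning so the function can be reused */
    while ((c = getopt_long(argc, argv, "s:", options, NULL)) != -1) {
        if (c == 's')
            fd = atoi(optarg);
        else
            return -1;
    }
    return fd;
}
```

Both "--handshake 5" and "--handshakefd 5" resolve to the same option
here. The abbreviation would stop working the moment a second option
starting with "handshake" were added, which is why spelling the name out
in the builder is the more robust choice.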
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Historically, we declared pointer type to our types:
typedef struct _virXXX virXXX;
typedef virXXX *virXXXPtr;
But usefulness of such declaration is questionable, at best.
Unfortunately, we can't drop every such declaration - we have to
carry some over, because they are part of public API (e.g.
virDomainPtr). But for internal types we can drop them and
use what every other C project uses: 'virXXX *'.
This change was generated by a very ugly shell script that
generated sed script which was then called over each file in the
repository. For the shell script refer to the cover letter:
https://listman.redhat.com/archives/libvir-list/2021-March/msg00537.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Generated by the following spatch:
@@
expression a, b;
@@
+ b = g_steal_pointer(&a);
- b = a;
... when != a
- a = NULL;
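A sketch of the resulting pattern at a call site. steal_pointer() below
is a hypothetical plain C stand-in with the same shape as
g_steal_pointer() (return the old value, leave NULL behind), and struct
holder is invented for the example:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for g_steal_pointer(): hand out *pp and
 * leave NULL in its place. */
static void *
steal_pointer(void *pp)
{
    void **ptr = pp;
    void *ref = *ptr;

    *ptr = NULL;
    return ref;
}

struct holder {
    char *name;
};

static int
holder_set_name(struct holder *h, const char *src)
{
    char *tmp;

    assert(h != NULL && src != NULL);

    tmp = malloc(strlen(src) + 1);
    if (!tmp)
        return -1;
    strcpy(tmp, src);

    free(h->name);
    /* Replaces the two statements "h->name = tmp; tmp = NULL;". */
    h->name = steal_pointer(&tmp);

    free(tmp); /* tmp is NULL by now; mimics an automatic cleanup */
    return 0;
}
```

The transfer of ownership becomes a single expression, which is what
makes it safe to combine with automatic cleanup of the local variable.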
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Generated using the following spatch:
@@
expression path;
@@
- virFileMakePath(path)
+ g_mkdir_with_parents(path, 0777)
However, 14 occurrences were not replaced, e.g. in
virHostdevManagerNew(). I don't really understand why.
Fixed by hand afterwards.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
'virLogGetFilters' doesn't return failure and 'virLogGetOutputs' reports
its own errors.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Now that this function can be called regardless of interface type (and
whether or not we have a conn for the network driver), let's actually
call it for all interface types. This will ensure that we re-connect
any disconnected bridge devices for <interface type='bridge'> as
mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1730084#c26
(until now we've only been reconnecting bridge devices for <interface
type='network'>)
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The two scenarios were found by Coverity after a seemingly-unrelated
change to virLXCProcessSetupInterfaceTap() (in commit ecfc2d5f43), and
explained by John Ferlan here:
https://www.redhat.com/archives/libvir-list/2020-December/msg00810.html
To re-explain:
a) On entry to virLXCProcessSetupInterfaceTap() if net->ifname != NULL
then a copy of net->ifname is made into parentVeth, and a reference
to *that* pointer is sent down to virNetDevVethCreate().
b) If parentVeth (aka net->ifname) is a template name (e.g. "blah%d"),
then virNetDevVethCreate() calls virNetDevGenerateName(), and if
virNetDevGenerateName() successfully generates a usable name
(e.g. "blah27") then it will free the original template string
(which is pointed to by net->ifname and by parentVeth), then
replace the pointer in parentVeth with a pointer to the new
string. Note that net->ifname still points to the now-freed
template string.
c) returning back up to virLXCProcessSetupInterfaceTap(), we check if
net->ifname == NULL - it *isn't* (still contains stale pointer to
template string), so we don't replace it with the pointer to the new
string that is in parentVeth.
d) Result: the new string is leaked once we return from
virLXCProcessSetupInterfaceTap(), while there is a dangling pointer
to the old string in net->ifname.
There is also a leak if there is a failure somewhere between steps (b)
and (c) above - the failure cleanup in virNetDevVethCreate() will only
free the newly-generated parentVeth string if the original pointer was
NULL (narrator: "It wasn't."). But it's a new string allocated by
virNetDevGenerateName(), not the original string from net->ifname, so
it really does need to be freed.
The solution is to make a copy of the entire original string into a
g_autofree pointer, then iff everything is successful we g_free() the
original net->ifname and replace it by stealing the string returned by
virNetDevVethCreate().
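The ownership rule of the fix can be sketched in plain C. Everything
below is illustrative: expand_template() is a hypothetical stand-in for a
name generator that frees and replaces a template string,
setup_ifname() stands in for the caller, and the fail_later parameter
simulates a failure between generating the name and committing it:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *
dup_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy)
        strcpy(copy, s);
    return copy;
}

/* Hypothetical generator: if *name contains "%d", free it and store
 * a newly allocated name with the id filled in. */
static int
expand_template(char **name, int id)
{
    char buf[32];
    const char *pos;
    char *fresh;

    assert(name != NULL && *name != NULL);

    if (!(pos = strstr(*name, "%d")))
        return 0;
    if (snprintf(buf, sizeof(buf), "%.*s%d",
                 (int)(pos - *name), *name, id) >= (int)sizeof(buf))
        return -1;
    if (!(fresh = dup_string(buf)))
        return -1;
    free(*name);
    *name = fresh;
    return 0;
}

static int
setup_ifname(char **ifname, int id, int fail_later)
{
    /* Work on a private copy so the generator can never free the
     * string the caller still points at. */
    char *work = dup_string(*ifname);

    if (!work)
        return -1;
    if (expand_template(&work, id) < 0 || fail_later) {
        free(work); /* *ifname is untouched and still valid */
        return -1;
    }
    free(*ifname); /* commit only once everything succeeded */
    *ifname = work;
    return 0;
}
```

On failure the caller's string is neither freed nor replaced, and on
success exactly one string changes hands, so there is no window with a
stale pointer and nothing is leaked.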
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In all cases *except* when parsing status XML as libvirt is being
restarted, the XML parser will delete any manually specified interface
name (aka "<target dev='blah'/>" aka net->ifname) that could have been
generated by virNetDevGenerateName(). This means that during the setup
when a domain is being started (e.g. during
virLXCProcessSetupInterfaceTap()) it is pointless to call
virNetDevReserveName() with any setting of net->ifname that has come
from the XML parser - it is guaranteed to not fit the pattern of any
auto-generated name, and so the call is just a NOP anyway.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
The lxc driver uses virNetDevGenerateName() for its veth device names
since patch 2dd0fb492, so it should be using virNetDevReserveName()
during daemon restart/reconnect to skip over the device names that are
in use.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
In virLXCProcessSetupInterfaceTap, containerVeth needs to be freed on
failure.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Simplify virNetDevVethCreate by using common GenerateName/ReserveName
functions.
Signed-off-by: Shi Lei <shi_lei@massclouds.com>
Reviewed-by: Laine Stump <laine@redhat.com>
Commit 729a06c41 added code to the LXC driver (patterned after similar
code in the QEMU driver) that called
virNetDevMacVLanReserveName(net->ifname) for all type='direct'
interfaces during a libvirtd restart, to prevent other domains from
attempting to use a macvtap device name that was already in use by a
domain.
But, unlike a QEMU domain, when an LXC domain creates a macvtap
device, that device is almost immediately moved into the namespace of
the container (and it's then renamed, but that part isn't
important). Because of this, the LXC driver doesn't keep track (in
net->ifname) of the name used to create the device (as the QEMU driver
does).
The result of this is that if libvirtd is restarted while there is an
active LXC domain that has <interface type='direct'>, libvirtd will
segfault (since virNetDevMacVLanReserveName() doesn't check for a NULL
pointer).
The fix is to just not call that function in the case of the LXC
driver, since it is pointless anyway.
Fixes: 729a06c41a
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The next objective is to move virDomainDeviceDefValidate() to
domain_validate.c. First let's move all the static helpers.
The net device validation functions are used across multiple
drivers, so let's move them separately first.
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
If starting a container fails, virLXCProcessStop() is
called. But since vm->def->id is not set until libvirt_lxc is
spawned (the domain's ID is PID of that process),
virLXCProcessStop() returns early as virDomainObjIsActive()
returns false. But doing so leaves behind resources reserved for
the containers during the startup process. Most notably, hostdevs
are not re-attached to the host, the domain's transient XML is
not removed, etc.
To resolve this, virLXCProcessCleanup() is called in this case.
However, it is modified to accept @flags which allow the caller to
run only specific cleanups (depending on how far into container
creation the failure occurred). There are plenty of cleanups which
don't need this guard because they either detect a NULL pointer
or try to release a unique resource.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
GLib provides g_auto(GStrv) which is a drop-in replacement for our
VIR_AUTOSTRINGLIST.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This patch adds new schema and adds support for parsing and formatting
domain configurations that include vdpa devices.
vDPA network devices allow high-performance networking in a virtual
machine by providing a wire-speed data path. These devices require a
vendor-specific host driver but the data path follows the virtio
specification.
When a device on the host is bound to an appropriate vendor-specific
driver, it will create a chardev on the host at e.g. /dev/vhost-vdpa-0.
That chardev path can then be used to define a new interface with
type='vdpa'.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Reviewed-by: Laine Stump <laine@redhat.com>
As preparation for g_autoptr() we need to change the function to take
only virCgroupPtr.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com>
Reviewed-by: Jonathon Jongsma <jjongsma@redhat.com>
There have been some reports that, due to libvirt always trying to
assign the lowest numbered macvtap / tap device name possible, a new
guest would sometimes be started using the same tap device name as
previously used by another guest that is in the process of being
destroyed *as* the new guest is starting.
In some cases this has led to, for example, the old guest's
qemuProcessStop() code deleting a port from an OVS switch that had
just been re-added by the new guest (because the port name is based on
only the device name using the port). Similar problems can happen (and
I believe have) with nwfilter rules and bandwidth rules (which are
both instantiated based on the name of the tap device).
A couple patches have been previously proposed to change the ordering
of startup and shutdown processing, or to put a mutex around
everything related to the tap/macvtap device name usage, but in the
end no matter what you do there will still be possible holes, because
the device could be deleted outside libvirt's control (for example,
regular tap devices are automatically deleted when the qemu process
terminates, and that isn't always initiated by libvirt but could
instead happen completely asynchronously - libvirt then has no control
over the ordering of shutdown operations, and no opportunity to
protect it with a mutex.)
But this only happens if a new device is created at the same time as
one is being deleted. We can effectively eliminate the chance of this
happening if we end the practice of always looking for the lowest
numbered available device name, and instead just keep an integer that
is incremented each time we need a new device name. At some point it
will need to wrap back around to 0 (in order to avoid the IFNAMSIZ 15
character limit if nothing else), and we can't guarantee that the new
name really will be the *least* recently used name, but "math"
suggests that it will be *much* less common that we'll try to re-use
the *most* recently used name.
This patch implements such a counter for macvtap/macvlan, replacing
the existing, and much more complicated, "ID reservation" system. The
counter is set according to whatever macvtap/macvlan devices are
already in use by guests when libvirtd is started, incremented each
time a new device name is needed, and wraps back to 0 when either
INT_MAX is reached, or when the resulting device name would be longer
than IFNAMSIZ-1 characters (which actually is what happens when the
template for the device name is "macvtap%d"). The result is that no
macvtap name will be re-used until the host has created (and possibly
destroyed) 99,999,999 devices.
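The wrap-around rule can be sketched as below. This is not the libvirt
implementation: next_ifname() is a hypothetical helper, and IFNAME_BUF is
assumed to equal IFNAMSIZ (16 bytes including the terminating NUL):

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

#define IFNAME_BUF 16 /* assumed equal to IFNAMSIZ */

/* Produce the next "<prefix><id>" name after *last_id, wrapping to 0
 * when the id would overflow or the name would not fit. */
static int
next_ifname(const char *prefix, int *last_id, char out[IFNAME_BUF])
{
    int id;
    int len;

    assert(prefix != NULL && last_id != NULL && out != NULL);
    assert(strlen(prefix) > 0);

    id = (*last_id == INT_MAX) ? 0 : *last_id + 1;
    len = snprintf(out, IFNAME_BUF, "%s%d", prefix, id);
    if (len >= IFNAME_BUF) {
        /* Name too long for the kernel: start over at 0. */
        id = 0;
        len = snprintf(out, IFNAME_BUF, "%s%d", prefix, id);
    }
    if (len < 0 || len >= IFNAME_BUF)
        return -1;

    *last_id = id;
    return 0;
}
```

With the 7 character prefix "macvtap" there is room for 8 digits, so the
length check triggers long before INT_MAX does. A real implementation
would additionally have to skip names that are still in use.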
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
'path' could be accessed uninitialized. Fix it by using g_autofree which
also mandates initialization.
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Include virutil.h in all files that use it,
instead of relying on it being pulled in somehow.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This patch pushes the isolatedPort setting from the <interface> down
all the way to the callers of virNetDevBridgeAddPort(), and sets
BR_ISOLATED on the port (using virNetDevBridgePortSetIsolated()) after
the port has been successfully added to the bridge.
Signed-off-by: Laine Stump <laine@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
This addresses portability to Windows and standardizes
error reporting. This fixes a number of places which
failed to set O_CLOEXEC or failed to report errors.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Now that every use of virAtomic has been replaced with its g_atomic
equivalent, let's remove the module.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Libvirt's original atomic ops impls were largely copied
from GLib's code at the time. The only API difference
was that libvirt's virAtomicIntInc() would return a
value, but g_atomic_int_inc was void. We thus use
g_atomic_int_add(v, 1) instead, though this means
virAtomicIntInc() now returns the original value,
instead of the new value.
This rewrites libvirt's impl in terms of g_atomic_int*
as a short term conversion. The key motivation was to
quickly eliminate use of GNULIB's verify_expr() macro
which is not a direct match for G_STATIC_ASSERT_EXPR.
Long term all the callers should be updated to use
g_atomic_int* directly.
Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
This will be needed in the future for allocating private data.
Signed-off-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>