The capabilities are defined/parsed/formatted/queried from this module,
no reason for 'update' not being part of the module as well. This also
involves some module-specific prefix changes.
This patch also drops the node_device_linux_sysfs module from the repo
since:
a) it only contained the capability handlers we just moved
b) it's only linked with the driver (by design) and thus unreachable to
other modules
c) we touch sysfs across all the src/util modules so the module being
deleted hasn't been serving its original intention for some time already.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Let's move the udevEnumerateDevices into a thread to "speed
up" the initialization process. If the enumeration fails we
can set the Quit flag to ensure that udevEventHandleCallback
will not run.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Reviewed-by: Erik Skultety <eskultet@redhat.com>
Commit id '36555364' removed the setting of the driver->privileged,
which the udevProcessPCI would need in order to read the PCI device
configs.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Right-aligning backslashes when defining macros or using complex
commands in Makefiles looks cute, but as soon as any changes is
required to the code you end up with either distractingly broken
alignment or unnecessarily big diffs where most of the changes
are just pushing all backslashes a few characters to one side.
Generated using
$ git grep -El '[[:blank:]][[:blank:]]\\$' | \
grep -E '*\.([chx]|am|mk)$$' | \
while read f; do \
sed -Ei 's/[[:blank:]]*[[:blank:]]\\$/ \\/g' "$f"; \
done
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
If we find ourselves in the situation that the 'add' uevent has been
fired earlier than the sysfs tree for a device was created, we should
use the best-effort approach and give kernel some predetermined amount
of time, thus waiting for the attributes to be ready rather than
discarding the device from our device list forever. If those don't appear
in the given time frame, we need to move on, since libvirt can't wait
indefinitely.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Adjust udevEventHandleThread to be a proper thread routine running in an
infinite loop handling devices. The handler thread pulls all available
data from the udev monitor and only then waits until a wakeup signal for
new incoming data has been emitted by udevEventHandleCallback.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
This patch splits udevEventHandleCallback in two (introduces
udevEventHandleThread) in order to be later able to refactor the latter
to actually become a normal thread which will wait some time for the
kernel to create the whole sysfs tree for a device as we cannot do that
in the event loop directly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
udevSetupSystemDev only needs the udev data lock to be locked because of
calling udevGetDMIData which accesses some protected structure members,
but it can do that on its own just fine, no need to hold the lock the
whole time.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The driver locks are unnecessary here, since currently the cleanup is
only called from the main daemon thread, so we can't race here. Moreover
@devs and @privateData are self-lockable objects, so no problem there
either.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since there's going to be a worker thread which needs to have some data
protected by a lock, the whole code would just simply get unnecessary
complex, since two sets of locks would be necessary, driver lock (for
udev monitor and event handle) and a mutex protecting thread-local data.
Given the future thread will need to access the udev monitor socket as
well, why not protect everything with a single lock, even better, by
converting the driver's private data to a lockable object, we get the
automatic object disposal feature for free.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
We need to perform a sanity check on the udev monitor before every
use so that we know nothing has changed in the meantime. The reason for
moving the code to a separate helper is to enhance readability and shift
the focus on the important stuff within the udevEventHandleCallback
handler.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Even though hal doesn't make use of it, the privileged flag is related
to the daemon/driver rather than the backend actually used.
While at it, get rid of some tab indentation in the driver state struct.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Let this new method handle the device object we obtained from the
monitor in order to enhance readability.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So we have a sanity check for the udev monitor fd. Theoretically, it
could happen that the udev monitor fd changes (due to our own wrongdoing,
hence the 'sanity' here) and if that happens it means we are handling an
event from a different entity than we think, thus we should remove the
handle if someone somewhere somehow hits this hypothetical case.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
It might happen that virFileResolveLinkHelper fails on the lstat system
call. virFileResolveLink expects the caller to report an error when it
fails, however this wasn't the case for udevProcessMediatedDevice.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Commit @4cb719b2dc moved the driver locks around since these have become
unnecessary at spots where the code handles now self-lockable object
list, but missed the possible double unlock if udevEnumerateDevices
fails, because at that point the driver lock had been already dropped.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since virnodedeviceobj now has a self-lockable hash table, there's no
need to lock the table from the driver for processing. Thus remove the
locks from the driver for NodeDeviceObjList mgmt.
This includes the test driver as well.
Now that we have a bit more control, let's convert our object into
a lockable object and let that magic handle the create and lock/unlock.
This also involves creating a virNodeDeviceEndAPI in order to handle
the object cleanup for API's that use the Add or Find API's in order
to get a locked/reffed object. The EndAPI will unlock and unref the
object returning NULL to indicate to the caller to not use the obj.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Ensure that any function that walks the node device object list is prefixed
by virNodeDeviceObjList.
Also, modify the @filter param name for virNodeDeviceObjListExport to
be @aclfilter.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation to make things private, make the ->devs be pointers to a
virNodeDeviceObjList and then manage everything inside virnodedeviceobj
Signed-off-by: John Ferlan <jferlan@redhat.com>
Rather than passing the object to be removed by reference, pass by value
and then let the caller decide whether or not the object should be free'd
and how to handle the logic afterwards. This includes free'ing the object
and/or setting the local variable to NULL to prevent subsequent unexpected
usage (via something like virNodeDeviceObjRemove in testNodeDeviceDestroy).
For now this function will just handle the remove of the object from the
list for which it was placed during virNodeDeviceObjAssignDef.
This essentially reverts logic from commit id '61148074' that free'd the
device entry on list, set *dev = NULL and returned. Thus fixing a bug in
node_device_hal.c/dev_refresh() which would never call dev_create(udi)
since @dev would have been set to NULL.
Signed-off-by: John Ferlan <jferlan@redhat.com>
In preparation for privatizing the virNodeDeviceObj - create an accessor
for the @def field and then use it for various callers.
Signed-off-by: John Ferlan <jferlan@redhat.com>
Alter the node_device_driver source and prototypes to follow more
recent code style guidelines w/r/t spacing between functions, format
of the function, and the prototype definitions.
While the new names for nodeDeviceUpdateCaps, nodeDeviceUpdateDriverName,
and nodeDeviceGetTime don't follow exactly w/r/t a "vir" prefix, they
do follow other driver nomenclature style.
Signed-off-by: John Ferlan <jferlan@redhat.com>
When a number of SRIOV VFs (up to 128 on Intel XL710) is created:
for i in `seq 0 1`; do
echo 63 > /sys/class/net/<interface>/device/sriov_numvfs
done
libvirtd will then report "udev_monitor_receive_device returned NULL"
error because the netlink socket buffer is not big enough (using GDB on
libudev confirmed this with ENOBUFFS) and thus some udev events were
dropped. This results in some devices being missing in the nodedev-list
output. This patch overrides the system's rmem_max limit but for that,
we need to make sure we've got root privileges.
https://bugzilla.redhat.com/show_bug.cgi?id=1450960
Signed-off-by: ning.bo <ning.bo9@zte.com.cn>
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Similar to scsi_host and fc_host, there is a relation between a
scsi_target and its transport specific fc_remote_port. Let's expose this
relation and relevant information behind it.
An example for a virsh nodedev-dumpxml:
virsh # nodedev-dumpxml scsi_target0_0_0
<device>
<name>scsi_target0_0_0</name>
<path>/sys/devices/[...]/host0/rport-0:0-0/target0:0:0</path>
<parent>scsi_host0</parent>
<capability type='scsi_target'>
<target>target0:0:0</target>
<capability type='fc_remote_port'>
<rport>rport-0:0-0</rport>
<wwpn>0x9d73bc45f0e21a86</wwpn>
</capability>
</capability>
</device>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Make CCW devices available to the node_device driver. The devices are
already seen by udev so let's implement necessary code for detecting
them properly.
Topologically, CCW devices are similar to PCI devices, e.g.:
+- ccw_0_0_1a2b
|
+- scsi_host0
|
+- scsi_target0_0_0
|
+- scsi_0_0_0_0
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Bjoern Walk <bwalk@linux.vnet.ibm.com>
Start discovering the mediated devices on the host system and format the
attributes for the mediated device into the XML. Compared to the parent
device which reports generic information about the abstract mediated
devices types, a child device only reports the type name it has been
instantiated from and the IOMMU group number, since that's device
specific compared to the rest of the info that can be gathered about
mediated devices at the moment.
This patch introduces both the formatting and parsing routines, updates
nodedev.rng schema, adding a testcase as well.
The resulting mdev child device XML:
<device>
<name>mdev_4b20d080_1b54_4048_85b3_a6a62d165c01</name>
<path>/sys/devices/.../4b20d080-1b54-4048-85b3-a6a62d165c01</path>
<parent>pci_0000_06_00_0</parent>
<driver>
<name>vfio_mdev</name>
</driver>
<capability type='mdev'>
<type id='vendor_supplied_type_id'/>
<iommuGroup number='NUM'/>
<capability/>
<device/>
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The parent device needs to report the generic stuff about the supported
mediated devices types, like device API, available instances, type name,
etc. Therefore this patch introduces a new nested capability element of
type 'mdev_types' with the resulting XML of the following format:
<device>
...
<capability type='pci'>
...
<capability type='mdev_types'>
<type id='vendor_supplied_id'>
<name>optional_vendor_supplied_codename</name>
<deviceAPI>vfio-pci</deviceAPI>
<availableInstances>NUM</availableInstances>
</type>
...
<type>
...
</type>
</capability>
</capability>
...
</device>
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
The reason for introducing two capabilities, one for the device itself
(cap 'mdev') and one for the parent device listing the available types
('mdev_types'), is that we should be able to do
'virsh nodedev-list --cap' not only for existing mdev devices but also
for devices that support creation of mdev devices, since one day libvirt
might be actually able to create the mdev devices in an automated way
(just like we do for NPIV/vHBA).
https://bugzilla.redhat.com/show_bug.cgi?id=1452072
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Since we have that information provided by @def which is not a private
object, there is really no need for the variable.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
So udevGetDeviceDetails was one those functions using an enum in a
switch, but since it had a 'default' case, compiler didn't warn about an
unhandled enum. Moreover, the error about an unsupported device type
reported in the default case is unnecessary, since by the time we get
there, udevGetDeviceType (which was called before) already made sure
that any unrecognized device types had been handled properly.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Changes in commit id 'dec6d9df' caused a compilation failure on a RHEL6
CI build environment. So just replace 'system' with 'syscap' as a name.
cc1: warnings being treated as errors
../../src/conf/node_device_conf.c: In function 'virNodeDevCapSystemParseXML':
../../src/conf/node_device_conf.c:1415: error: declaration of 'system' shadows a global declaration [-Wshadow]
The comment was actually wrong as
https://www.freedesktop.org/software/systemd/man/udev_new.html#
mentions that on failure NULL is returned. Also the same return value
is checked in src/interface/interface_backend_udev.c already.
Signed-off-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com>
Add a new 'drm' capability for Direct Rendering Manager (DRM) devices,
providing device type information.
Teach the udev backend to populate those devices.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add new <devnode> top-level <device> element, that list the associated
/dev files. Distinguish the main /dev name from symlinks with a 'type'
attribute of value 'dev' or 'symlink'.
Update a test to check XML schema, and actually add it to the test list
since it was missing.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Test 12 from objecteventtest (createXML add event) segaults on FreeBSD
with bus error.
At some point it calls testNodeDeviceDestroy() from the test driver. And
it fails when it tries to unlock the device in the "out:" label of this
function.
Unlocking fails because the previous step was a call to
virNodeDeviceObjRemove from conf/node_device_conf.c. This function
removes the given device from the device list and cleans up the object,
including destroying of its mutex. However, it does not nullify the pointer
that was given to it.
As a result, we end up in testNodeDeviceDestroy() here:
out:
if (obj)
virNodeDeviceObjUnlock(obj);
And instead of skipping this, we try to do Unlock and fail because of
malformed mutex.
Change virNodeDeviceObjRemove to use double pointer and set pointer to
NULL.
This event is emitted when a nodedev XML definition is updated,
like when cdrom media is changed in a cdrom block device.
Also includes node device update event implementation for udev
backend, virsh nodedev-event support, and event-test support
Use udevHasDeviceProperty instead of udevGetStringProperty.
We do not need to copy the string since we do not need it.
Also add braces around the if body, since the change made
syntax check complain.
Two out of three callers free it right after converting it to a number.
Also change the comment at the beginning of the function, because
the comment inside the function told me to.
The wrapper adds an error message or a debug log.
Since we already log the properties we get from udev as strings,
there is no much use for the debug logs.
Open code the error message and delete the function.
Most of the code paths had to reset it to -1 and returning 0 was
only possible if we made it to the end of the function.
Initialize it to -1 and only set it to 0 if we reach the end, as we do
in most of libvirt code.
If we expose this information, which is one byte in every PCI config
file, we let all mgmt apps know whether the device itself is an endpoint
or not so it's easier for them to decide whether such device can be
passed through into a VM (endpoint) or not (*-bridge).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1317531
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Best viewed with '-w' as this is just an adjustment for future patch to
be readable without that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Both the hal and udev drivers call virPCI*() functions to the the
SRIOV VF/PF info about PCI devices, and the UDEV backend calls
virPCI*() to get IOMMU group info. Since there is now a single
function call in node_device_linux_sysfs.c to do all of this, replace
all that code in the two backends with calls to
nodeDeviceSysfsGetPCIRelatedDevCaps().
Note that this results in the HAL driver (probably) unnecessarily
calling virPCIDevieAddressGetIOMMUGroupNum(), but in the case that the
host doesn't support IOMMU groups, that function turns into a NOP (it
returns -2, which causes the caller to skip the call to
virPCIDeviceAddressGetIOMMUGroupAddresses()). So in the worst case it
is a few extra cycles spent, and in the best case a mythical platform
that supported IOMMU groups but used HAL rather than UDEV would gain
proper reporting of IOMMU group info.
This file contains only a single function, detect_scsi_host_caps(),
which is declared in node_device_driver.h and called from both the hal
and udev backends. Other things common to the hal and udev drivers
can be placed in that file though. As a prelude to adding further
functions, this patch renames the existing function to something
closer in line with other internal libvirt function names
(nodeDeviceSysfsGetSCSIHostCaps()), and puts the declarations into a
separate .h file.
For some reason a union (_virNodeDevCapData) that had only been
declared inside the toplevel struct virNodeDevCapsDef was being used
as an argument to functions all over the place. Since it was only a
union, the "type" attribute wasn't necessarily sent with it. While
this works, it just seems wrong.
This patch creates a toplevel typedef for virNodeDevCapData and
virNodeDevCapDataPtr, making it a struct that has the type attribute
as a member, along with an anonymous union of everything that used to
be in union _virNodeDevCapData. This way we only have to change the
following:
s/union _virNodeDevCapData */virNodeDevCapDataPtr /
and
s/caps->type/caps->data.type/
This will make me feel less guilty when adding functions that need a
pointer to one of these.
Adding functionality to libvirt that will allow it
query the ethtool interface for the availability
of certain NIC HW offload features
Here is an example of the feature XML definition:
<device>
<name>net_eth4_90_e2_ba_5e_a5_45</name>
<path>/sys/devices/pci0000:00/0000:00:03.0/0000:08:00.1/net/eth4</path>
<parent>pci_0000_08_00_1</parent>
<capability type='net'>
<interface>eth4</interface>
<address>90:e2:ba:5e:a5:45</address>
<link speed='10000' state='up'/>
<feature name='rx'/>
<feature name='tx'/>
<feature name='sg'/>
<feature name='tso'/>
<feature name='gso'/>
<feature name='gro'/>
<feature name='rxvlan'/>
<feature name='txvlan'/>
<feature name='rxhash'/>
<capability type='80203'/>
</capability>
</device>
Signed-off-by: Ján Tomko <jtomko@redhat.com>
For stateless, client side drivers, it is never correct to
probe for secondary drivers. It is only ever appropriate to
use the secondary driver that is associated with the
hypervisor in question. As a result the ESX & HyperV drivers
have both been forced to do hacks where they register no-op
drivers for the ones they don't implement.
For stateful, server side drivers, we always just want to
use the same built-in shared driver. The exception is
virtualbox which is really a stateless driver and so wants
to use its own server side secondary drivers. To deal with
this virtualbox has to be built as 3 separate loadable
modules to allow registration to work in the right order.
This can all be simplified by introducing a new struct
recording the precise set of secondary drivers each
hypervisor driver wants
struct _virConnectDriver {
virHypervisorDriverPtr hypervisorDriver;
virInterfaceDriverPtr interfaceDriver;
virNetworkDriverPtr networkDriver;
virNodeDeviceDriverPtr nodeDeviceDriver;
virNWFilterDriverPtr nwfilterDriver;
virSecretDriverPtr secretDriver;
virStorageDriverPtr storageDriver;
};
Instead of registering the hypervisor driver, we now
just register a virConnectDriver instead. This allows
us to remove all probing of secondary drivers. Once we
have chosen the primary driver, we immediately know the
correct secondary drivers to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In systemd >= 218, the udev_set_log_fn method has been marked
deprecated and turned into a no-op. Nothing in the udev client
library will print to stderr by default anymore, so we can
just stop installing a logging hook for new enough udev.
The manufacurer and product from USB device itself are usually not particularly
useful -- they tend to be missing, or ugly (all-uppercase, padded with spaces,
etc.). Prefer what's in the usb id database and fall back to descriptors only
if the device is too new to be in database.
https://bugzilla.redhat.com/show_bug.cgi?id=1138887
Leak introduced in commit 16ebf10f (v1.2.6), detected by valgrind:
==9816== 216 (96 direct, 120 indirect) bytes in 6 blocks are definitely lost in loss record 665 of 821
==9816== at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==9816== by 0x50836FB: virAlloc (viralloc.c:144)
==9816== by 0x1DBDBE27: udevProcessPCI (node_device_udev.c:546)
==9816== by 0x1DBDD79D: udevGetDeviceDetails (node_device_udev.c:1293)
* src/util/virpci.h (virPCIEDeviceInfoFree): New prototype.
* src/util/virpci.c (virPCIEDeviceInfoFree): New function.
* src/conf/node_device_conf.c (virNodeDevCapsDefFree): Clear
pci_express under pci case.
(virNodeDevCapPCIDevParseXML): Avoid leak.
* src/node_device/node_device_udev.c (udevProcessPCI): Likewise.
* src/libvirt_private.syms (virpci.h): Export it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Replace:
if (virBufferError(&buf)) {
virBufferFreeAndReset(&buf);
virReportOOMError();
...
}
with:
if (virBufferCheckError(&buf) < 0)
...
This should not be a functional change (unless some callers
misused the virBuffer APIs - a different error would be reported
then)
This stops the error message spam when running unprivileged
libvirtd:
2014-06-30 12:38:47.990+0000: 631: error : virPCIDeviceConfigOpen:300 :
Failed to open config space file
'/sys/bus/pci/devices/0000:00:00.0/config': Permission denied
Reported by Daniel Berrange:
https://www.redhat.com/archives/libvir-list/2014-June/msg01082.html
This new element is there to represent PCI-Express capabilities
of a PCI devices, like link speed, number of lanes, etc.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
While exposing the info under <interface/> in previous patch works, it
may work only in cases where interface is configured on the host.
However, orchestrating application may want to know the link state and
speed even in that case. That's why we ought to expose this in nodedev
XML too:
virsh # nodedev-dumpxml net_eth0_f0_de_f1_2b_1b_f3
<device>
<name>net_eth0_f0_de_f1_2b_1b_f3</name>
<path>/sys/devices/pci0000:00/0000:00:19.0/net/eth0</path>
<parent>pci_0000_00_19_0</parent>
<capability type='net'>
<interface>eth0</interface>
<address>f0🇩🇪f1:2b:1b:f3</address>
<link speed='1000' state='up'/>
<capability type='80203'/>
</capability>
</device>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
A PCI device can be associated with a specific NUMA node. Later, when
a guest is pinned to one NUMA node the PCI device can be assigned on
different NUMA node. This makes DMA transfers travel across nodes and
thus results in suboptimal performance. We should expose the NUMA node
locality for PCI devices so management applications can make better
decisions.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
In "src/conf/" there are many enumeration (enum) declarations. Similar
to the recent cleanup to "src/util" directory, it's better to use a
typedef for variable types, function types and other usages. Other
enumeration and folders will be changed to typedef's in the future.
Most of the files changed in this commit are reltaed to Node and
Network (node_device_conf.h and nwfilter_params.*) enums.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Some CDROM devices are reported by udev to have an ID_TYPE="generic"
thus it is necessary to check if ID_CDROM is present.
As a side effect, treating ID_TYPE="generic" as a missing ID_TYPE will
enable checks for ID_DRIVE_FLASH_SD and ID_DRIVE_FLOPPY and the
udevKludgeStorageType heuristic.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>