Commit Graph

29 Commits

Author SHA1 Message Date
Stanislav Levin
0e8bde3175 ap: Raise dbus timeout
With some recent changes on Azure Agent the default DBus call
timeout is not good enough. For example, in case of
`InstallDNSSECFirst_1_to_5` job hostnamectl received reply in ~20sec,
but later it increased to ~30sec (more subjobs - more time to reply).
It's good to raise this timeout to be more protected against minimum
performance times.

https://www.freedesktop.org/software/systemd/man/sd_bus_set_method_call_timeout.html#Description

Fixes: https://pagure.io/freeipa/issue/9207
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2022-07-26 12:36:41 -04:00
Stanislav Levin
5a00882eab pylint: Fix useless-suppression
Cleanup up no longer used Pylint's disables where possible.

Fixes: https://pagure.io/freeipa/issue/9117
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2022-03-11 13:37:08 -05:00
Stanislav Levin
a941e8f602 azure: Ignore tar errors
Sometimes tar fails on changed in process files:
```
[2021-09-07 11:03:33] + tar --ignore-failed-read -czf ipaserver_install_logs.tar.gz --warning=no-failed-read /var/log/dirsrv /var/log/httpd2 /var/log/ipa /var/log/ipaclient-install.log /var/log/ipa-custodia.audit.log /var/log/ipaserver-install.log /var/log/krb5kdc.log /var/log/pki /var/log/samba /var/lib/bind/data systemd_journal.log
[2021-09-07 11:03:33] tar: Removing leading `/' from member names
[2021-09-07 11:03:33] tar: Removing leading `/' from hard link targets
[2021-09-07 11:03:33] tar: /var/log/dirsrv/slapd-IPA-TEST/access: file changed as we read it
[2021-09-07 11:03:33] + tests_result=1
```

This is expected failure since processes are not stopped during logs
collection and can flush their logs.

Fixes: https://pagure.io/freeipa/issue/8983
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-09-15 08:48:13 +02:00
Stanislav Levin
10461b7091 azure: Make it possible to adjust Docker resources per test env
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
6c2db326f8 azure: coredump: Wait for systemd fully booted
Otherwise, 'Check for coredumps' task fails with:
```
Verifying        : samba-debugsource-2:4.14.4-0.fc34.x86_64             20/20
[Errno 2] No such file or directory: '/var/lib/dnf/rpmdb_lock.pid'
Finishing: Check for coredumps
```

This is due to systemd-tmpfiles(not ready yet).

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
a893852b4f azure: Warn about extra and missing gating tests compared to PR-CI
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
611b49e42b azure: Collect installed packages
The list of installed packages may be useful for checking the
versions of packages for analysis. Previously, only the newly
installed packages can be observed on Build phase.

This is convenient for experienced users of PR-CI.

Note: the read-only access provided for non-master containers
to be able to execute Azure scripts. The logs are still collected
only on controller.

Only RPM-based collection is implemented for Fedora. By default
nothing is collected.

Users may want to override `installed_packages` function
in the corresponding `ipatests/azure/scripts/variables-DISTRO.sh`.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
afef09ccba ipatests: Ignore warnings on failed to read files on tarring
There are tons of useless warnings about missing files on collecting
logs, such as:

```
tar: /var/log/ipaserver-kra-install.log: Warning: Cannot stat: No such file or directory
tar: /var/log/ipaepn.log: Warning: Cannot stat: No such file or directory
tar: /etc/NetworkManager/NetworkManager.conf: Warning: Cannot stat: No such file or directory
tar: /var/log/ipabackup.log: Warning: Cannot stat: No such file or directory
tar: /var/log/iparestore.log: Warning: Cannot stat: No such file or directory
...

```

Since `--ignore-failed-read` option is passed to tar the caller
doesn't care about not readable(mostly missing) files and these warnings
may be filtered out.

This improves the readability of test logs.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
3889d8654a pytest: Show extra summary information for all except passed tests
By default pytest reports in summary section about tests failures and errors.
It will be helpful to see skipped, xfailed and xpassed tests.

Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
fc0c6b44a8 azure: Run Base and XMLRPC tests is isolated network
The tests in these envs make DNS requests to wild(internet) NSs,
though usually tests assume the opposite making requests to
`test.` zone. This makes CI unstable and dependent on wild
resolvers and logically wrong.

In future there can be tests which may want to check BIND as
resolver(cache) for external networks. In this case such tests
should be placed on not isolated mode.

By default, a test env is not isolated from internet(as it was
before), but it may be a good idea to change this default in
future.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
65700bf743 ipatests: Setup and collect BIND logs
For Base/XMLRPC tests BIND's logs are already collected.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
b5fdba7a72 azure: Warn about memory issues
The nonzero number of memory/memory+Swap usage hits limits may
indicate the possible env instability(crashes, random failures, etc.).

> memory.failcnt		 # show the number of memory usage hits limits
  memory.memsw.failcnt		 # show the number of memory+Swap hits limits

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
1c82895c20 azure: Wait for systemd booted
The calling of systemd's utils during systemd boot may lead to
unpredictable results. For example, if DBus(dbus-broker) service
is not started then DBus request goes nowhere and eventually will
be timeouted. So, it's safer to wait fully booted system.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
be2f659aa7 azure: Collect systemd boot log
If an error occured while containers setup phase then no logs will
be collected and it is hard(impossible?) to debug such issues on
remote Azure host. With this change in case of such error all the
container's journals will be collected in `systemd_boot_logs`.

Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-05-25 10:45:49 +03:00
Stanislav Levin
5afe13798e Azure: Run chronyd in Docker
The syncing time stuff is required by IPA NTP tests.

Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-03-30 09:58:42 +02:00
Stanislav Levin
85c63fbe62 Azure: Show disk usage
Collect disk usage information may be helpful, for example, for
debugging code required free space such as healthcheck tests.

Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-03-30 09:58:42 +02:00
Stanislav Levin
3ac2cdfd43 Azure: Make it possible to pass additional Pytest args
Some tests require its specific Pytest args. With this change
they can be specified in tests definitions.

Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-03-30 09:58:42 +02:00
Stanislav Levin
3e33e546c3 Azure: Populate containers with self-AAAA records
IPA server's AAAA records at embedded DNS mode depend on result of
`get_server_ip_address` function(`ipaserver.install.installutils`),
which in turn, relies on NSS.

In case of Azure Pipelines, there are neither IPv6 records in
'/etc/hosts' nor external DNS, which may provide such. This leads to
the missing AAAA records for master and missing AAAA records for `ipa-ca`
pointing to master in embedded DNS.

In particular, tests `test_ipa_healthcheck_no_errors`,
`test_ipa_dns_systemrecords_check` fail with:
```
[
  {
    "source": "ipahealthcheck.ipa.idns",
    "check": "IPADNSSystemRecordsCheck",
    "result": "WARNING",
    "uuid": "b979a88a-6373-4990-bc83-ce724e9730b4",
    "when": "20210120055054Z",
    "duration": "0.032740",
    "kw": {
      "msg": "Got {count} ipa-ca AAAA records, expected {expected}",
      "count": 1,
      "expected": 2
    }
  }
]
```
where `ipa-ca` record exists only for replica.

Note: since the most of the code in setup_containers was touched it has
been reformatted.

Fixes: https://pagure.io/freeipa/issue/8683
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2021-02-15 09:54:51 +02:00
Alexander Bokovoy
f977629182 Azure CI: mask chronyd in the container
Signed-off-by: Alexander Bokovoy <abokovoy@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2020-11-17 18:48:24 +02:00
Stanislav Levin
a5b23287ae Azure: base: Collect both install and uninstall logs
Some applications remove their logs on uninstallation.
As a result of this, Azure lost `install` logs.

Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-08-31 09:46:03 +03:00
Christian Heimes
82ba4db11e Make api.env.mode consistent
* use "developer" in Azure
* fix man page: "development" to "developer"
* list known modes in API bootstrap methods

Other values for mode are still supported to avoid breaking existing
installations.

Fixes: https://pagure.io/freeipa/issue/8313
Signed-off-by: Christian Heimes <cheimes@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2020-05-14 17:55:59 +02:00
Stanislav Levin
8882fc49d0 Azure: Allow chronyd to sync time
Though time namespace support was added in Linux kernel 5.6, it
is not landed on Azure VM (Ubuntu) yet.

The syncing time stuff is required by IPA NTP tests. it's
acceptable for testing 1 IPA environment on 1 Azure VM for such
tests.

Fixes: https://pagure.io/freeipa/issue/8316
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Christian Heimes <cheimes@redhat.com>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-05-12 09:51:50 +02:00
Stanislav Levin
958e245813 Azure: Add custom seccomp profile
This allows to override the default seccomp profile.
Custom profile was generated from the default one [0] by adding one
allowed system call 'clock_adjtime'. This one is indirectly used by
chronyd with recent glibc2.31.

[0]: https://github.com/containers/libpod/blob/master/seccomp.json

Fixes: https://pagure.io/freeipa/issue/8316
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Christian Heimes <cheimes@redhat.com>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-05-12 09:51:50 +02:00
Alexander Bokovoy
b8a1d130ad Azure Pipelines: Override services known to not work in containers
Chrony daemon tries to use adjtimex() which doesn't work in the
container we run in Docker environment on Azure Pipelines.

nis-domainname also tries to modify kernel-specific parameter that
doesn't really work in runc-based containers.

Use systemd container detection to avoid starting these services in the
containers.

Signed-off-by: Alexander Bokovoy <abokovoy@redhat.com>
Reviewed-By: Christian Heimes <cheimes@redhat.com>
2020-05-06 09:14:29 +02:00
Stanislav Levin
87408ee755 Azure: Increase memory limit
Azure host has 6 GB of physical memory + 7 GB of swap.
FreeIPA CI runs at least 5 masters on each Azure's host.
Thus, swap is intensively used.

Based on the available *physical* memory 389-ds performs db tweaks
and in future may fail to start in case of memory shortage.

Current memory limit for Azure Docker containers(master/replica):
- Physical
$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes
1610612736
- Physical + swap:
$ cat /sys/fs/cgroup/memory/memory.memsw.limit_in_bytes
3221225472

In the meantime, installation of master + ca + kra + dnssec requires:
$ cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes
1856929792

Some test environments require more memory.
For example, 'ipatests.test_integration.test_commands.TestIPACommand':
$ cat /sys/fs/cgroup/memory/memory.memsw.max_usage_in_bytes
2232246272
$ cat /sys/fs/cgroup/memory/memory.max_usage_in_bytes
2232246272

Fixes: https://pagure.io/freeipa/issue/8264
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2020-04-28 17:50:10 +02:00
Stanislav Levin
d1b53ded8b Azure: Gather coredumps
Applications may crash.
If a crash happens on a remote system during CI run it's sometimes
very hard to understand the reason. The most important means to
analyze such is a stack trace. It's also very important to check
whether there was a core dump or not, even a test passed.

For Docker environment, the core dumps are collected by the host's
systemd-coredump, which knows nothing about such containers (for
now). To build an informative thread stack trace debuginfo packages
should be installed. But they can't be installed on the host OS
(ubuntu), That's why after all the tests completed an additional
container should be up and the host's core dumps and host's journal
should be passed into it.

Even if there weren't enough debuginfo packages at CI-runtime, the
core dump could be analyzed locally later.

Fixes: https://pagure.io/freeipa/issue/8251
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
Reviewed-By: Rob Crittenden <rcritten@redhat.com>
2020-04-08 11:27:45 +03:00
Stanislav Levin
e925148ad9 Azure: Free Docker resources after usage
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-02-25 18:02:12 +02:00
Stanislav Levin
1fa033c32d Azure: Preliminary check for provided limits
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-02-25 18:02:12 +02:00
Stanislav Levin
31d05650fb Azure: Add support for testing multi IPA environments
Currently, only one IPA environment is tested within Docker
containers. This is not efficient because Azure's agent gives
6 GB of physical memory and 13 GB of total memory (Feb 2020),
but limits CPU with 2 cores.

Next examples are for 'master-only' topologies.

Let's assume that only one member of github repo simultaneously
run CI. This allows to get the full strength of Azure.

Concurrency results for TestInstallMaster:
------------------------------------------
|    job concurrency      |  time/jobs   |
------------------------------------------
|             5           |     40/5     |
|             4           |     34/4     |
|             3           |     25/3     |
|             2           |     19/2     |
|             1           |     17/1     |
------------------------------------------
Results prove the limitation of 2 cores. So, in case of jobs'
number not exceeds the max capacity for parallel jobs(10) the
proposed method couldn't save time, but it reduces the used
jobs number up to 2 times. In other words, in this case CI
could pass 2 x tests.

But what if CI was triggered by several PRs? or jobs' number is
bigger than 10. For example, there are 20 tests to be run.

Concurrency results for TestInstallMaster and 20 input jobs:
------------------------------------------------------------------
|    job concurrency      |     time     | jobs used | jobs free |
------------------------------------------------------------------
|             5           |      40      |      4    |     6     |
|             4           |      34      |      5    |     5     |
|             3           |      25      |      7    |     3     |
|             2           |      19      |     10    |     0     |
|             1           |      34      |     20    |     0     |
------------------------------------------------------------------
So, in this case the optimal concurrency would be 4 since it
allows to run two CIs simultaneously (20 tasks on board) and get
results in 34 minutes for both. In other words, two people could
trigger CI from PR and don't wait for each other.

New Azure IPA tests workflow:

+ 1) generate-matrix.py script generates JSON from user's YAML [0]
  2) Azure generate jobs using Matrix strategy
  3) each job is run in parallel (up to 10) within its own VM (Ubuntu-18.04):
    a) downloads prepared Docker container image (artifact) from Azure cloud
       (built on Build Job) and loads the received image into local pool
  + b) GNU 'parallel' launch each IPA environment in parallel:
    + 1) docker-compose creates the Docker environment having a required number
         of replicas and/or clients
    + 2) setup_containers.py script does the needed container's changes (DNS,
         SSH, etc.)
    + 3) launch IPA tests on tests' controller
    c) publish tests results in JUnit format to provide a comprehensive test
       reporting and analytics experience via Azure WebUI [1]
    d) publish regular system logs as artifacts

[0]: https://docs.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml

Fixes: https://pagure.io/freeipa/issue/8202
Signed-off-by: Stanislav Levin <slev@altlinux.org>
Reviewed-By: Alexander Bokovoy <abokovoy@redhat.com>
2020-02-25 18:02:12 +02:00