Commit Graph

1107 Commits

Author SHA1 Message Date
Herbert Wolverson
a392112f7a Trying to get the outdated workflow system to work. YAML is not my friend. 2023-02-01 18:49:08 +00:00
Herbert Wolverson
3991aa404a Add "cargo outdated" checks to the GitHub CI workflow.
Part of ISSUE #229
2023-02-01 18:46:09 +00:00
Herbert Wolverson
d198c0feac Update Tokio and all dependency versions based on local run
of cargo audit and cargo outdated.

Part of ISSUE #229
2023-02-01 18:44:38 +00:00
Herbert Wolverson
7b1c285afa Adjust build_rust.sh to build to separate files and then move
into running files to avoid service interruption. The script
detects if you are using systemd (with the default names) and
will restart the services at the end of the process - for a
very brief interruption rather than several minutes.

Also suppresses pushd/popd output.

Related to ISSUE #208
2023-02-01 17:53:15 +00:00
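
The "build to separate files, then move" approach in the commit above can be sketched as follows. This is a minimal Python illustration, not the contents of build_rust.sh; the function name `install_atomically` and the `.new` staging suffix are assumptions made for the example.

```python
import os


def install_atomically(new_bytes: bytes, target: str) -> None:
    """Write the new file next to the target, then swap it in with one rename.

    Hypothetical helper: the staging name (target + ".new") is an assumption.
    """
    staging = target + ".new"
    with open(staging, "wb") as f:
        f.write(new_bytes)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before the swap
    # os.replace is an atomic rename when both paths are on the same filesystem,
    # so readers see either the old file or the new one, never a partial write.
    os.replace(staging, target)
```

The slow part (producing the new file) happens while the old file is still in place; only the final rename and a service restart touch the running system.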
Herbert Wolverson
9ad1de6ef5 Add safeguard against running LibreQoS.py more than once at a time.
ISSUE #52

* Added file locking commands to the Python/Rust library.
* When LibreQoS.py starts, it checks that /var/run/libreqos.lock
  does not exist. If it does, it checks that it contains a PID
  and that PID is not still running with a python process.
* If the lock file exists and is valid, execution aborts.
* If the lock file exists and is invalid, it is cleaned.
* Cleans the lock on termination.
2023-02-01 17:09:42 +00:00
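
The lock-file steps listed in the commit above can be sketched roughly as below. This is an illustration under stated assumptions, not the code added to the Python/Rust library: the function names are hypothetical, the "is it a python process" check reads /proc/<pid>/cmdline (Linux only), and the exists-then-write sequence is not race-free.

```python
import os

LOCK_PATH = "/var/run/libreqos.lock"  # path named in the commit message


def pid_is_running_python(pid: int) -> bool:
    """Assumption: a PID counts as live if /proc/<pid>/cmdline mentions python."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            return b"python" in f.read()
    except OSError:
        return False  # no such process, or no /proc on this system


def lock_is_valid(path: str, is_running=pid_is_running_python) -> bool:
    """A lock is valid if it holds a PID and that PID is still running."""
    try:
        with open(path) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # unreadable, or does not contain a PID
    return is_running(pid)


def acquire_lock(path: str = LOCK_PATH, is_running=pid_is_running_python) -> bool:
    """Return False if another instance holds a valid lock, else take the lock."""
    if os.path.exists(path):
        if lock_is_valid(path, is_running):
            return False  # valid lock: the caller should abort
        os.remove(path)  # invalid (stale) lock: clean it up
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return True


def release_lock(path: str = LOCK_PATH) -> None:
    """Clean the lock on termination; a missing file is not an error."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

A caller would abort when `acquire_lock()` returns False and call `release_lock()` from a `finally` block or exit handler.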
Herbert Wolverson
d71f41033c If ShapedDevices and ispConfig are in good shape, try to run LibreQoS.py when lqosd starts to avoid running with no queues. 2023-02-01 16:26:50 +00:00
Herbert Wolverson
3a9e72901b Correct issues with failing to collect data when started at
boot time. ISSUE #235. Also relevant to ISSUE #209

* Discovered that the BOOT_TIME clock can fail if called
  immediately after boot.
* Refactored time fetching functions into `lqos_utils` with
  proper error wrapping.
* Adjusted unknown IP expiration to issue a bus response of
  "not ready yet" if the boot time clock is not available.
* Adjusted unknown IP expiration to handle 5 minutes in the
  past being a negative number.
* Adjusted queue collection to suggest that you run
  LibreQoS.py if queues don't exist - and fail gracefully,
  without causing a hitch.
2023-02-01 16:08:31 +00:00
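
The two expiration adjustments in the commit above (clock not available yet, and "five minutes ago" falling before boot) can be illustrated with a small Python analogue. The function names are hypothetical and this is not the `lqos_utils` code; it only shows the clamping idea.

```python
import time

FIVE_MINUTES_NS = 5 * 60 * 1_000_000_000


def boot_time_ns():
    """Return nanoseconds since boot, or None if the clock is unavailable.

    Assumption: None stands in for the "not ready yet" response.
    """
    try:
        return time.clock_gettime_ns(time.CLOCK_BOOTTIME)
    except (AttributeError, OSError):
        return None  # clock missing on this platform or not readable yet


def expiry_threshold_ns(now_since_boot_ns: int, window_ns: int = FIVE_MINUTES_NS) -> int:
    """Timestamp before which entries count as expired, clamped at zero.

    Shortly after boot, now - window would be negative; clamping avoids that.
    """
    return max(0, now_since_boot_ns - window_ns)
```

With one minute of uptime the threshold is 0, so nothing is treated as older than boot.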
Herbert Wolverson
d77fffe4f5 ISSUE #234
Remove the unused feature from notify. We actually moved notify
into a single crate (as opposed to all over the place) in
816ca7e651

As of this commit, build_rust runs without warnings.
2023-01-31 22:34:14 +00:00
Herbert Wolverson
dce8d76d9c Applied 'cargo fmt' to all files, forcing equal formatting for all. 2023-01-31 22:23:55 +00:00
Herbert Wolverson
51cf5d51fa ISSUE #209 General soundness check, assisted with clippy (linter).
* Corrected a lot of small issues like passing a string when a char
  will be (marginally) faster.
* Replaced single-arm match statements with the much more
  compiler-friendly `if let`.
* Combined nested if statements.
* Cleaned all remaining unchecked unwrap() calls.
2023-01-31 22:22:05 +00:00
Herbert Wolverson
74101655d8 ISSUE #209 - Full error pass on lqos_queue_tracker module
* Replace every value unwrap with unwrap_or to not panic.
* Replace Anyhow errors with specific errors and log entries.
2023-01-31 21:31:03 +00:00
Herbert Wolverson
982e7314c1 Thorough error-handling pass on lqos_bus crate.
ISSUE #209

Replace "anyhow" with "thiserror". Add logging for all errors,
and only allow pass-through for errors that have already been
converted to a local error type and reported.
2023-01-31 19:33:22 +00:00
Herbert Wolverson
816ca7e651 Refactor the multiple "notify" systems into a single helper
structure.

* Creates FileWatcher, in lqos_utils.
* Removes "notify" dependency from other crates.

FileWatcher is designed to watch a file. If the file doesn't exist,
then an optional callback is called - and the watcher waits,
periodically checking to see if the file has appeared yet. When the
file appears, another optional callback is executed.

Once the file exists, a `notify` system is started (it uses Linux's
`inotify` system internally) for that file. When the file changes,
the process sleeps briefly and then executes an `on_change` callback.
Further messages are then suppressed for a short period to avoid
duplicates.

All uses of notify have been updated to use this system. Errors are
handled cleanly, per ISSUE #209.
2023-01-31 17:52:35 +00:00
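
The FileWatcher behaviour described in the commit above can be sketched as below. This is a Python illustration only: it swaps in mtime polling for `notify`/`inotify` (which the Python standard library does not expose), and the function name, parameter names and the `max_changes` stop condition are assumptions added so the example terminates.

```python
import os
import time


def watch_file(path, on_change, on_missing=None, on_appear=None,
               poll_secs=0.05, settle_secs=0.0, max_changes=1):
    """Wait for a file to exist, then report changes to it.

    Hypothetical sketch: polls st_mtime_ns instead of using inotify.
    """
    # Phase 1: the file may not exist yet. Report that once, then wait for it.
    if not os.path.exists(path):
        if on_missing:
            on_missing()
        while not os.path.exists(path):
            time.sleep(poll_secs)
        if on_appear:
            on_appear()

    # Phase 2: watch for changes.
    last = os.stat(path).st_mtime_ns
    seen = 0
    while seen < max_changes:
        time.sleep(poll_secs)
        if os.stat(path).st_mtime_ns != last:
            time.sleep(settle_secs)  # brief pause so the writer can finish
            on_change()
            # Re-read the timestamp after the callback so that writes which
            # landed during the pause are not reported a second time.
            last = os.stat(path).st_mtime_ns
            seen += 1
```

A real watcher would loop indefinitely and handle the file being deleted again; both are left out here to keep the sketch short.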
Herbert Wolverson
3e4e7ebe64 Another chicken/egg issue:
* ShapedDevices.csv may not exist on first run.
* This previously caused lqos_node_manager to emit a hard error
  and not show any shaped devices at all.

To rectify this:

* ShapedDevices in-memory starts as an empty set.
* When the "watcher" spawns, if the file exists then ShapedDevices
  is loaded.
* If the "watcher" can't find ShapedDevices, it sleeps periodically
  looking for the file to be created. Once created, it loads it
  and starts the change monitor.
2023-01-31 16:26:54 +00:00
Herbert Wolverson
c911d8c190 Fix a chicken & egg problem with queueingStructure.json monitor
ISSUE #209

Using the inode watcher on a file that doesn't exist fails, and
was previously failing silently! This would result in queue
mappings not updating when LibreQoS.py was executed - even though
the queueingStructure.json file became available.

* Replace "anyhow" with specific errors.
* Track and log each step of the file monitor process for
  queueingStructure.json
* If the watcher cannot start because the file doesn't exist,
  the watcher loop sleeps for 30 seconds at a time (to keep
  load very low) and checks if the file exists yet. If it does,
  it loads it and then commences watching.
2023-01-31 15:49:29 +00:00
Herbert Wolverson
6f00386d61 Rename function to match what it does (copy/paste issue) 2023-01-31 15:22:23 +00:00
Herbert "TheBracket
cf33d62d85
Merge pull request #237 from interduo/patch-24
improvement: make it easy to turn on debugging mode
2023-01-31 07:16:39 -08:00
Interduo
5e02885c5b
improvement: make it easy to turn on debugging 2023-01-31 16:12:30 +01:00
Interduo
bd7f9a4a12
improvement: make it easy to turn on debugging mode 2023-01-31 16:11:32 +01:00
Herbert "TheBracket
bb9b911491
Merge pull request #232 from interduo/patch-22
bugfix: proper path and remove many sudo
2023-01-31 06:32:41 -08:00
Herbert "TheBracket
c165d7b9bf
Merge pull request #236 from interduo/patch-23
improvement: lqos_node_manager systemd unit file add require and after lqosd
2023-01-31 06:31:04 -08:00
Interduo
670d3dafe3
improvement: it's better to use one sudo here
(than many times sudo inside script remove_pinned_maps.sh)
2023-01-31 14:46:46 +01:00
Interduo
ebc2162efe
lqos_node_manager systemd unit file add require and after lqosd 2023-01-31 11:48:01 +01:00
Herbert Wolverson
39e2689707 ISSUE #209: CSV ShapedDevices reader now uses thiserror for more specific errors and contains more logging. 2023-01-30 22:44:22 +00:00
Herbert Wolverson
e5500cf528 ISSUE #209: add detailed errors and logging to the program control system in lqos_config. Some errors should be impossible. 2023-01-30 22:10:54 +00:00
Herbert Wolverson
bfe9394faa ISSUE #209: add detailed errors and logging to the authentication manager in lqos_config. 2023-01-30 22:05:12 +00:00
Herbert Wolverson
2735419320 ISSUE #209 : Add full error checking and custom error types to libre_qos_config.rs 2023-01-30 18:28:45 +00:00
Herbert Wolverson
a29391c25c ISSUE #209 : Add full error checking and custom error type to EtcLqos load system. 2023-01-30 17:57:15 +00:00
Herbert Wolverson
ee20023027 Refactor fd timer wait into a "periodic" function in lqos_utils,
only available for non-async use at this time. Adjust the two
non-async usages of timer-fd based timers to use the more
canonical setup.
2023-01-30 17:45:17 +00:00
Herbert Wolverson
7d32c720f0 Rocket web server's ring-buffer update timer is now a Linux
file descriptor.

* Remove the Tokio timer system.
* Replace with Linux's timer fd system.
* Add a watchdog to alert if we've somehow overrun the timer.
2023-01-30 17:22:28 +00:00
Herbert Wolverson
5aa90ee692 Switch bandwidth monitor thread to use Linux timer file descriptors
* Replace Tokio timers in the bandwidth/throughput monitor with
  the Linux timer file descriptor API.
* Instead of spawning a Tokio task, spawn an independent
  thread for the bandwidth monitor.
2023-01-30 17:09:29 +00:00
Herbert Wolverson
cc7f845f82 Remove info log for no queues to track, it's the normal expected behavior. 2023-01-30 17:04:14 +00:00
Herbert Wolverson
6b6bdc1395 Improved queue watching system
* Queue timing is now provided by Linux "timer file descriptors"
  instead of Tokio timers.
* Added an atomic bool to track "we're going faster than we should"
  (it's true while executing), skip cycles if we ran out of time and
  issue a warning.
* Queue tracking is no longer async, but is locked to its very own
  thread.
* Queue watcher is now more verbose about any issues it encounters.
* Queue structures with children will now correctly track all the
  children, avoiding the blank queue data issue.
2023-01-30 16:55:42 +00:00
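
The "skip cycles if we ran out of time" guard from the commit above can be illustrated in Python. The commit uses an atomic bool; Python's standard library has no direct equivalent, so this sketch substitutes a non-blocking lock acquire for the same "busy flag" effect. The class and attribute names are hypothetical.

```python
import threading


class OverrunGuard:
    """Skip a timer cycle if the previous one is still executing."""

    def __init__(self):
        self._busy = threading.Lock()  # held while a cycle is executing
        self.skipped = 0               # number of cycles skipped so far

    def run_cycle(self, work) -> bool:
        """Run work() unless a cycle is already in progress.

        Returns True if the work ran, False if the cycle was skipped.
        """
        if not self._busy.acquire(blocking=False):
            self.skipped += 1  # a real implementation would log a warning here
            return False
        try:
            work()
        finally:
            self._busy.release()
        return True
```

A timer thread would call `run_cycle` on every tick; a tick that arrives while the previous cycle is still running is counted and dropped rather than queued.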
Interduo
2e25acc086
bugfix: proper path and remove many sudo
as it could be replaced by only one command: sudo remove_pinned_maps.sh
2023-01-30 17:54:17 +01:00
Herbert Wolverson
2486355c1d Remove unused variable (and warning) in lqtop 2023-01-30 15:29:10 +00:00
Herbert Wolverson
234697bc29 Fix warning in json.rs benchmark. 2023-01-30 15:28:49 +00:00
Herbert Wolverson
ec5baaf866 Move netlink-testing out of main - it's in a branch, where it belongs. 2023-01-30 15:28:23 +00:00
Robert Chacón
55d85d302b
Update lqos.example 2023-01-29 21:17:05 -07:00
Robert Chacón
0fe7e28ce8
Update TESTING-1.4.md 2023-01-29 20:50:05 -07:00
Dave Taht
4e68ecc205 Some notes regarding good error handling
Bug #209, among others.
2023-01-29 19:45:36 -08:00
Dave Taht
7c18da1953 A few other thoughts 2023-01-29 19:31:47 -08:00
Dave Taht
4b097defa9 More comments wanted on and in lqos.conf.new.discuss 2023-01-29 19:12:47 -08:00
Dave Taht
6c7548c77e More fixes for how the conf file should work.
It was hilarious that I already missed the new "bridge"
section in my first attempt. Imagine what it is like for the
users?

Pithy notes:

I think this is an artifact of history, as a bool.

disable_rxvlan = true
disable_txvlan = true

There are a zillion other options in ethtool -h for
coalescing things, besides this.

disable_offload = [ "gso", "tso", "lro", "sg", "gro" ]
2023-01-29 18:22:42 -08:00
Dave Taht
6c81a2a8c1 Trying to unify configuration variables somewhat
We have a lot of configuration stuff, written in several very
different styles. We have hidden knowledge (like port numbers)
buried elsewhere. We have overly wordy variable names, and no
clear separation of each concept. We have a need to keep some
data secure (passwords to the APIs), and other data needs to be common.

Ideally there would be more of a secrets file for secrets to
point to, on the security case.

Having one file to rule them all is not exactly the right way
forward, but parsing one file *format* might prove simpler.

Please, everyone, think about how best to express this,
I took a stab at it via this commit.
2023-01-29 09:52:58 -08:00
Robert Chacón
8d59d0594d
Solved https://github.com/LibreQoE/LibreQoS/issues/206 2023-01-29 07:44:58 -07:00
Dave Taht
ea40e6293c Put arrows in the summary spans
Still would like a format like

LIBREQOS MONITOR UP OTHER STATUS INFO and the summary down/up stats to
line up with the table below it.
2023-01-29 07:10:11 +00:00
Dave Taht
e175f15845 lqos_utils: Scale bits/s relative to gbit/s etc
Essentially right justify these functions.
2023-01-28 18:33:53 -08:00
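
A rough Python sketch of scaling a bits-per-second value to the nearest unit and right-justifying the result, as the commit above describes. The exact units, precision and column width used in lqos_utils are not shown in the log, so the values here are assumptions.

```python
def scale_bits(bits_per_sec: int, width: int = 14) -> str:
    """Format a rate with a scaled unit, right-justified to a fixed width.

    Assumptions: decimal (1000-based) units, two decimals, 14-character column.
    """
    units = (
        (1_000_000_000_000, "tbit/s"),
        (1_000_000_000, "gbit/s"),
        (1_000_000, "mbit/s"),
        (1_000, "kbit/s"),
    )
    for threshold, unit in units:
        if bits_per_sec >= threshold:
            text = f"{bits_per_sec / threshold:.2f} {unit}"
            break
    else:
        text = f"{bits_per_sec} bit/s"  # below 1 kbit/s: show the raw value
    return text.rjust(width)
```

Right-justifying to a fixed width keeps the numbers aligned when rates of different magnitudes are stacked in a column.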
Herbert Wolverson
0e97e6a868 Attempting to resolve issues with lqos_node_manager not seeing
statistics, while lqtop still works.

1) Add warning and error logging to lqos_node_manager if any
   part of the statistics gathering process fails.
2) (Hopefully temporarily) use the non-persistent bus client,
   again logging any issues.
3) Improve the statistics gathering timer code.
2023-01-27 19:47:19 +00:00
Herbert Wolverson
fcad4fa90a Remove warnings from lqtop 2023-01-26 20:50:12 +00:00
Herbert Wolverson
6e92a07a00 Fix panic in LibreQoS.py on update if an IP address mapping needed
deleting.

* Adjust the Python integration `delete_ip_mapping` function to
  not require a secondary "upload" parameter - because the
  Python code is unaware of whether there needs to be a
  separation of the two at this point.
* Change ENOENT return code in BPF map delete to NOT be an
  error - it indicates that there was nothing to do, rather
  than something not working.
2023-01-25 22:59:06 +00:00
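
The second point of the commit above, treating "there was nothing to delete" as success rather than failure, can be sketched in Python as below. The helper name and the injected `delete_fn` are hypothetical stand-ins for the real BPF map delete call; on Linux the "no such entry" errno is ENOENT.

```python
import errno


def delete_ignoring_missing(delete_fn, key) -> bool:
    """Call delete_fn(key); a missing entry is not treated as an error.

    Returns True if something was deleted, False if there was nothing to do.
    Any other OSError is passed through to the caller.
    """
    try:
        delete_fn(key)
    except OSError as e:
        if e.errno == errno.ENOENT:
            return False  # nothing to delete: not a failure
        raise
    return True
```

Callers that only care whether the mapping is gone can ignore the return value; callers that want to log the no-op case can check for False.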