* Add preliminary support for TTY Serial/UART (#15)
* Use shadow_buffer instead of VGA buffer to get a framebuffer-agnostic TTY supprot
* Added menu browsing & inputs from Serial TTY (#15)
* Add fix for degree symbol on TTY. Correct serial.c & serial.h file created with CRLF (#15)
* Move tty_error_redraw() to insure correct redraw when a error occurs
* Many reindent / cleanup
* Various optimization from @martinwhitaker comments
There is an unavoidable race between one core halting after decrementing
the barrier count and another core sending it the wakeup NMI. This can
only occur if the core sending the wakeup is running at many times the
speed of the core halting, but it has been observed on an Intel Icelake
mobile processor.
This reduces the number of instructions between decrementing the count
and halting in the halt wait case. Use the same code for the spin wait
case for consistency.
The old barrier implementation was very slow when running on a multi-socket
machine (pcmemtest issue 16).
The new implementation provides two options:
- when blocked, spin on a thread-local flag
- when blocked, execute a HLT instruction and wait for a NMI
The first option might be faster, but we need to measure it to find out. A
new boot command line option is provided to select between the two, with a
third setting that uses a mixture of the two.
Now we halt CPU cores that are going to be idle for a lengthy period,
we don't need to try to save power in other ways. And anyway, this was
not very effective.