memtest86plus

mirror of https://github.com/memtest86plus/memtest86plus.git synced 2024-11-23 08:26:23 -06:00

Author	SHA1	Message	Date
Lionel Debroux	375e22a4d7	Significantly optimize the bit fade and own addr tests for size, by folding near-identical switch case bodies together, and removing code duplication by merging pattern_fill() and pattern_check(). Also, add a rep stos[lq] path in the bit fade test. Size before -> after (text data bss dec hex): build32/tests/bit_fade.o: 1830 4 0 1834 72a -> 1191 4 0 1195 4ab; build32/tests/own_addr.o: 1359 0 0 1359 54f -> 959 0 0 959 3bf; build64/tests/bit_fade.o: 1581 4 0 1585 631 -> 1021 4 0 1025 401; build64/tests/own_addr.o: 1236 0 0 1236 4d4 -> 859 0 0 859 35b	2024-03-13 08:36:19 +01:00
Lionel Debroux	53ca89f8ae	Add initial NUMA awareness support (#378) * Add a file containing useful macro definitions, currently a single top-level macro for obtaining the size of an array; use it to replace a sizeof(x) / sizeof(x[0]) construct in system/smbus.c. This requires switching the GCC build mode from C11 to C11 with GCC extensions. * Initial NUMA awareness (#12) support: parse the ACPI SRAT to build up new internal structures related to proximity domains and affinity; use these structures in setup_vm_map() and calculate_chunk() to skip the work on the processors which don't belong to the proximity domain currently being tested. Tested on a number of 1S single-domain, 2S multi-domain and 4S multi-domain platforms. SKIP_RANGE(iterations) trick by Martin Whitaker.	2024-03-13 01:43:26 +01:00
Lionel Debroux	9b9c65b968	Reduce padding and relocations (#355) * Optimize the JEP106 list by using __attribute__((packed)) to remove padding. The x86 & x86_64 series support unaligned accesses just fine, after all, and this is not remotely a hot path. * Optimize several string-related constructs by switching to fixed-length char arrays, which avoids pointers + relocations. * app/interrupt.c: array of different-length strings, but most of those are lengthy enough for this to be a clear win, especially on x86_64; * system/usbhcd.c: array of same-length strings; * tests/tests.h: array of structs containing same-length strings. * Reduce the size of the list of tests by using a narrower type for the cpu mode, which reduces padding.	2023-11-29 12:45:17 +01:00
Lionel Debroux	34eb8186fd	Significantly optimize test_mov_inv_walk1() for size by moving the ternary operators related to inverse and pattern outside hot paths, which prevents generated code duplication and shortens one of the sides of duplicated loops. (#351) Size before -> after (text data bss dec hex): build32/tests/mov_inv_walk1.o: 3019 0 0 3019 bcb -> 1705 0 0 1705 6a9; build64/tests/mov_inv_walk1.o: 2640 0 0 2640 a50 -> 1464 0 0 1464 5b8	2023-11-19 17:14:08 +01:00
martinwhitaker	186ef6e913	Improved own addr test (#219) * For 64-bit images, use the physical address as the test pattern in test 2. This will make it easier to diagnose faults. * Disable test 1 by default (issue #155). Test 2 provides the same test coverage. Test 1 may make it slightly easier to diagnose faults with a 32-bit image, so leave it as an option. * For 32-bit images, use the physical address to generate the offset in test 2. Detecting a stage change and using that to reset the offset counter could fail when the config menu was used to skip to the next test (issue #224).	2023-01-04 23:26:22 +01:00
Martin Whitaker	5a2bc4c960	Skip segments in tests where the calculated chunk size is too small. If the memory map contains very small segments and we have many active CPUs, the tests that split the segments into chunks distributed across the CPUs may end up with chunks that are too small for the test algorithm. With 4K pages and the current limit of 256 active CPUs, this is currently only a problem for the block move and modulo-n tests, but if we ever support more than 512 active CPUs, it could affect the other tests too. For now, just skip segments that are too small in the affected tests. As it only affects the block move and modulo-n tests and only affects very small regions of memory, the loss of test coverage is negligible. This may fix issue #216.	2022-12-10 15:24:26 +00:00
a1346054	9660eead4e	Simple maintenance improvements (#145) * Fix typos * Add missing final newline * Trim trailing whitespace	2022-08-15 17:51:48 +02:00
martinwhitaker	93c9c8ded5	Rework memory mapping to allow for larger program size (#54) * Improve abstraction in vmem.h and limit memory benchmarking to first 2GB. The third GB may get used for remapping memory regions that are only accessed during startup, so it's not safe to use it for the memory speed tests. * Fix calculation of end limit for locating memory benchmark workspace. * Document vmem.h. * Use window number, not current start address, to detect first window. * Increase the program low-load range from 1MB to 4MB and make more robust. If the BIOS has reserved some parts of low memory, there may not be enough contiguous space left to load the program there (issue #49). So increase the low-load range to include the first 3MB of high memory. Also guard against the program being initially loaded straddling the new boundary. Co-authored-by: Martin Whitaker <memtest@martin-whitaker.me.uk>	2022-04-28 23:04:01 +02:00
Martin Whitaker	e92f488753	Improve efficiency of random number generation (discussion #8). Use a more efficient algorithm that can be in-lined, and keep the generator state in a local variable.	2022-03-05 20:04:32 +00:00
Martin Whitaker	4078b7760e	Faster barrier implementation. The old barrier implementation was very slow when running on a multi-socket machine (pcmemtest issue 16). The new implementation provides two options when blocked: spin on a thread-local flag, or execute a HLT instruction and wait for an NMI. The first option might be faster, but we need to measure it to find out. A new boot command line option is provided to select between the two, with a third setting that uses a mixture of the two.	2022-02-28 22:05:21 +00:00
Martin Whitaker	3245b6d916	Don't turn the cache off in test 0 when performing dummy runs. This should fix the slow startup on multi-socket machines (issue #16).	2022-02-19 20:55:41 +00:00
Martin Whitaker	f8b82eb0bd	Exclude copyright notices from Doxygen file descriptions.	2022-02-19 19:56:55 +00:00
Martin Whitaker	76adad2fe6	Add ability to generate internal API documentation using Doxygen.	2022-02-19 16:17:40 +00:00
Martin Whitaker	0e61b1605e	Remove volatile qualifier from testword pointers. Now we use the atomic read/write functions, these are redundant.	2022-02-19 13:01:42 +00:00
Martin Whitaker	1888f5c611	Add change to tests/test.c missed in commit `dcac5270`.	2022-02-02 15:33:25 +00:00
Martin Whitaker	ccab9ab081	Fix operation with a subset of CPU cores enabled. The last commit removed too much - there are a couple of places where we need to use a virtual CPU number rather than a physical CPU number.	2022-02-01 15:38:06 +00:00
Martin Whitaker	16d55b7dad	Remove distinction between physical and virtual CPUs. This is no longer needed, now we can display as many CPUs as we can physically handle.	2022-01-31 22:59:14 +00:00
Martin Whitaker	d9fee4dcbb	Flush caches between writing and verifying test data. Mostly we write and read large chunks of data which will make it likely that the data is no longer in the cache when we come to verify it. But this is not always true, and in any case, we shouldn't rely on it.	2021-12-23 11:00:10 +00:00
Martin Whitaker	11c0c6c2f5	Use atomic memory read/write functions in tests. This ensures compiler optimisations won't interfere with the tests.	2021-12-23 10:07:55 +00:00
Martin Whitaker	8f1d81b65d	Add missing includes of stdbool.h. To ensure we aren't dependent on the order of inclusion.	2021-12-05 13:50:25 +00:00
Martin Whitaker	fbd3376668	Initial commit.	2020-05-24 21:30:55 +01:00
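
Commits 53ca89f8ae and 9b9c65b968 mention an array-size macro, `__attribute__((packed))`, and fixed-length char arrays. The sketch below illustrates those three ideas in isolation. Everything in it is hypothetical: the struct names, field names, vendor strings, and the plain `ARRAY_SIZE` definition are illustrative assumptions, not the project's actual definitions (the log suggests the real macro relies on GCC extensions, which this simple form does not).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: element count of a true array (not a pointer).
 * This is the plain sizeof(x) / sizeof(x[0]) construct wrapped in a macro. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical entry with natural alignment: the compiler inserts padding
 * so that each member, including the pointer, is properly aligned. */
struct entry_padded {
    uint8_t     bank;
    uint16_t    code;
    const char *name;   /* pointer: needs a relocation in a relocatable image */
};

/* Same fields, packed, with the name stored inline as a fixed-length array:
 * no padding bytes and no pointer to relocate. */
struct entry_packed {
    uint8_t  bank;
    uint16_t code;
    char     name[12];
} __attribute__((packed));

/* Hypothetical table contents, for illustration only. */
static const struct entry_packed vendors[] = {
    { 0, 0x0001, "VendorA" },
    { 1, 0x0002, "VendorB" },
    { 2, 0x0003, "VendorC" },
};

size_t vendor_count(void)      { return ARRAY_SIZE(vendors); }
size_t packed_entry_size(void) { return sizeof(struct entry_packed); }
size_t padded_entry_size(void) { return sizeof(struct entry_padded); }

/* Sanity check usable at startup: a packed entry is exactly the sum of
 * its member sizes. */
void check_layout(void)
{
    assert(packed_entry_size() == sizeof(uint8_t) + sizeof(uint16_t) + 12);
}
```

With GCC or Clang, `packed_entry_size()` is the exact sum of the member sizes, while `padded_entry_size()` exceeds the sum of its members because of alignment padding. The trade-off is unaligned member access, which the commit message notes is acceptable on x86 outside hot paths.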

21 Commits
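
Commit 186ef6e913 describes using each location's address as the test pattern in the own addr test. A minimal sketch of that idea follows, under stated assumptions: it operates on an ordinary buffer and therefore uses virtual addresses, whereas the commit refers to physical addresses, and the function names `own_addr_fill` and `own_addr_check` are hypothetical rather than taken from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fill pass: store each word's own address in that word. */
void own_addr_fill(volatile uintptr_t *start, size_t words)
{
    assert(start != NULL);
    for (size_t i = 0; i < words; i++) {
        start[i] = (uintptr_t)&start[i];
    }
}

/* Hypothetical check pass: re-read every word and count the ones that
 * no longer hold their own address. */
size_t own_addr_check(const volatile uintptr_t *start, size_t words)
{
    assert(start != NULL);
    size_t errors = 0;
    for (size_t i = 0; i < words; i++) {
        if (start[i] != (uintptr_t)&start[i]) {
            errors++;
        }
    }
    return errors;
}
```

Because the expected value at every location is the location itself, a mismatching word shows both where the data was read from and which address it was apparently written for, which is one plausible reading of why the commit says this pattern makes faults easier to diagnose.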