These were low value and unfocused tests. We already have coverage of the
important aspects of these tests elsewhere. Additionally, there was really no
need for these to have their own test harness.
This reverts commit fa445d86a1.
Narrator: It did, in fact, make a difference.
For whatever reason, building LLVM against spacemit_x60 or baseline makes no
noticeable difference in terms of performance, but building the Zig compiler
against spacemit_x60 does. Also, the miscompilation that was causing
riscv64-linux-debug to fail was in the LLVM libraries, not in the Zig compiler,
so we may as well take the win here.
The idea is to have 2 runners per machine, since a lot of time is spent building
stage3 and stage4, both of which are largely single-core affairs. This will make
the test steps take longer, however, so the timeouts have been bumped a bit, and
max RSS for the test step has been lowered from 64G to 32G to prevent OOM.
Finally, we now only run a single ReleaseSafe job on PRs; Debug and Release jobs
are limited to pushes.