Client tests, again
SquidDev opened this issue · 4 comments
Back in 0ff6b0c, I introduced a way of doing automated client-side tests for catching issues with, say, monitor rendering. This effectively worked by teleporting a player to specific positions, taking screenshots, and then comparing them against reference images.
Unfortunately, this didn't end up working very well, for (I think) a variety of reasons:
- Comparing screenshots naively (as I did) is incredibly flaky: I don't think pixel-perfect rendering across machines and drivers is ever going to happen. (There's a comparison sketch after this list.)
- These tests took a long time to run. While not a problem in itself, it's annoying.
- Tests were run as part of CI. While notionally a good idea, combined with the above two points it made them very irritating!
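To sketch what I mean by "naive": even a tolerance-based comparison like the one below (entirely hypothetical - the function and its thresholds are made up, using only `ImageIO` from the JDK) tends to be flaky, since anti-aliasing, mipmapping and driver differences shift more pixels than any sensible budget allows.

```kotlin
import java.io.File
import javax.imageio.ImageIO
import kotlin.math.abs

// Hypothetical sketch: per-pixel comparison with a per-channel tolerance and
// a budget of mismatched pixels, rather than requiring exact equality.
fun imagesRoughlyEqual(
    expected: File,
    actual: File,
    channelTolerance: Int = 8,
    maxBadPixels: Int = 100,
): Boolean {
    val a = ImageIO.read(expected)
    val b = ImageIO.read(actual)
    if (a.width != b.width || a.height != b.height) return false

    var badPixels = 0
    for (y in 0 until a.height) {
        for (x in 0 until a.width) {
            val pa = a.getRGB(x, y)
            val pb = b.getRGB(x, y)
            // Largest per-channel difference, ignoring alpha.
            val diff = maxOf(
                abs(((pa shr 16) and 0xFF) - ((pb shr 16) and 0xFF)),
                abs(((pa shr 8) and 0xFF) - ((pb shr 8) and 0xFF)),
                abs((pa and 0xFF) - (pb and 0xFF)),
            )
            if (diff > channelTolerance) badPixels++
        }
    }
    return badPixels <= maxBadPixels
}
```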
However, given the spate of monitor rendering issues, I think it's probably worth revisiting this.
I think I'd propose the following:
- Use the gametest system as before for defining tests (and the structures needed for them). I think this workflow was very effective. (There's a rough sketch of what such a test might look like after this list.)
- Keep a method for taking a screenshot. However, this shouldn't try to compare against a reference screenshot: instead, just save it to a directory (see the second sketch below). It would be nice if we could launch Minecraft with renderdoc and take captures automatically too, but that's definitely not needed for an MVP :).

  It might also be worthwhile looking at modmuss50's work on client-side testing for Fabric. It's quite neat in that the test harness runs outside of Minecraft (it interacts purely through image recognition), though I suspect that won't work for us.
- Run the test suite (or a subset of it) against a set of different configurations (see the build-script sketch below). This should include:
  - Vanilla (using VBOs and TBOs)
  - Sodium (using VBOs and TBOs)
  - Iris+Sodium (with and without shaders)
  - OptiFine (with and without shaders). This may not be possible - it feels like running OptiFine in a deobf environment would be pretty tricky.
- Once the tests have run, it's up to the developer to go through the screenshots and check nothing is amiss.
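For reference, a client render test might look something like this. It's only a sketch assuming Mojang mappings: how tests get registered differs per loader, and `ClientTestHooks` is a made-up stand-in for whatever client hook we end up with.

```kotlin
import net.minecraft.gametest.framework.GameTest
import net.minecraft.gametest.framework.GameTestHelper

// Made-up stand-in for the real client-side hook: in practice this would ask
// the client to grab a frame once rendering has settled (see the next sketch).
object ClientTestHooks {
    fun screenshot(helper: GameTestHelper, name: String) {
        // ...signal the client thread to take a screenshot named [name]...
    }
}

class MonitorRenderTests {
    // Loads the "monitor_renders_text" structure (a monitor displaying some
    // known text), waits a couple of ticks for it to render, then requests a
    // screenshot and passes the test.
    @GameTest(template = "monitor_renders_text")
    fun monitorRendersText(helper: GameTestHelper) {
        helper.runAfterDelay(2) {
            ClientTestHooks.screenshot(helper, "monitor_renders_text")
            helper.succeed()
        }
    }
}
```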
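The screenshot half is mostly already in vanilla: `Screenshot.grab` (Mojang mappings) reads back the main framebuffer and writes a PNG. A sketch of what the hypothetical `ClientTestHooks.screenshot` above might boil down to - with the caveat that it has to run on the render thread, after the frame has drawn:

```kotlin
import java.io.File
import net.minecraft.client.Minecraft
import net.minecraft.client.Screenshot

// Sketch: dump the current frame to disk instead of comparing against a
// reference image. Screenshot.grab writes the PNG into a "screenshots"
// subfolder of the directory it's given.
fun saveScreenshot(outputDir: File, name: String) {
    val minecraft = Minecraft.getInstance()
    Screenshot.grab(outputDir, "$name.png", minecraft.mainRenderTarget) {
        // Ignore the "Saved screenshot as..." chat component.
    }
}
```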
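As for driving the configuration matrix, one option might be a Gradle task per configuration. The sketch below (Kotlin DSL) is purely illustrative: how mods actually get injected into the client run depends on the loader's Gradle plugin, so the launch step is stubbed out.

```kotlin
// Purely illustrative: register one verification task per rendering
// configuration; the actual client launch is a stand-in.
val renderConfigurations = mapOf(
    "vanilla" to listOf<String>(),
    "sodium" to listOf("sodium"),
    "irisSodium" to listOf("sodium", "iris"),
)

renderConfigurations.forEach { (name, mods) ->
    tasks.register("clientTest" + name.replaceFirstChar { it.uppercase() }) {
        group = "verification"
        description = "Run client render tests against the $name configuration."
        doLast {
            // Stand-in: launch the client with these mods, pointing the
            // screenshot directory at a per-configuration folder.
            println("Would run client tests with mods=$mods -> build/testScreenshots/$name")
        }
    }
}
```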
This will obviously take an age to run, so it should definitely not be part of `./gradlew check`! I think we have to accept that this is a manual step, one done after major client-side changes (including updating Minecraft). Even if we can't automate everything (and we can't!), automating some of the boring bits would be nice.
Could CI be split into two jobs? One that tests client-side stuff (as described above) and another that runs the existing, more server-side tests. The jobs could then run in parallel.
Commits could then include a phrase in the commit message to skip specific jobs if the changes shouldn't affect that area of the mod.
I guess it would be better to use a separate pipeline for each configuration: a separate run for vanilla, a separate run for Sodium, and so on.
The problem here is that there's no easy way to automatically check that what's happening is Correct(TM), and so it requires people to manually download the screenshots and verify them locally. At that point CI stops having much value - you're better off running these tests locally.