Vanilla flaw hurting the stability of Fabric worldgen structure mods and causing crashes
TelepathicGrunt opened this issue · 12 comments
THE PROBLEM:
When Mojang made the worldgen code, they did not place any protection against structures being removed (or have their registry names changed) when re-entering a world. This crash will break the world forever until the player either deletes the problematic chunk using chunk editing tools or puts the structure mod back on so the structure exists again with the old registry name.
What is happening is that if a person saves and quits Minecraft while a chunk was currently generating a structure, the structure's registry name is saved into the chunk's data. This is done so that when the person enters the world again, the chunk can resume generating the structure like normal. However, if the person removes a mod that had structures and then attempts to enter the world again, the game will crash with a nullpointererror because the structure no longer exists.
This bug is especially dangerous with the risk that this can also completely BREAK worlds as if a mod removes and adds a different structure at the same time as part of their update, the world could initiate generating the new structure before crashing with the old structure. And trying to revert to the old version of the mod will make the world keep crashing as the newly added structure is now the one that's missing.
Since Vanilla only ever adds structures and never removes old ones (and doesn't do backward compatibility with older MC versions), this bug is not reproducible with Vanilla. However, it is extremely easy to reproduce this crash with worldgen mods that adds/remove structures and so, fixing the bug would have to be done by us instead of Mojang.
THE CODE:
The specific line of code that is causing the crash is line 194 inChunkGenerator.addStructureReferences. The line is the following and what I believe is null is the result of getStructureStarts():
Iterator var12 = world.getChunk(n, o).getStructureStarts().entrySet().iterator();
If Fabric could throw in a null check here, this would greatly benefit all players. It would help Fabric appear more stable to players since players will most likely blame this crash on Fabric when they are unable reproduce the crash with just Vanilla worlds alone.
REPRODUCTION OF ISSUE:
For proof that this issue exist, put on Yung's Better Mineshaft Fabric and and get the Fabric API. Then make a world with the mod on and load several chunks. Exit the world. Remove Yung's. then re-enter the world. It could take a few tries and more exploring as you need to get the Mineshaft to start generating right at the edge of chunk loading but eventually, you'll get the crash.
https://www.curseforge.com/minecraft/mc-mods/yungs-better-mineshafts-fabric
The logs. Since this crash happened during the loading screen while worldgen was threaded, the game gets stuck on the loading screen forever and not show any stacktrace of the crash because it was on a different thread. However, I was able to get a stacktrace of the crash with Forge and that's how I narrowed down the offending code to show above:
https://hatebin.com/tefbvihkgt
A copy of the world with this crash. You should be able to join this world and then get stuck on the loading screen forever or in you do get in, it should crash very soon after. When you put on Yung's, the crash should go away:
New World (1).zip
CONCLUSION:
I hope this helps and I believe this is within the scope of the Fabric modloader instead of Fabric API as structure mods should not rely on a separate API just to prevent a crash that can render worlds stuck with a mod.
So yeah, this fix should be applied to Fabric to guarantee safety and stability of user's world when using Fabric worldgen structure mods.
Update on this issue.
As of 1.16+, the issue no longer crashes the world but will instead lag the world, spam the logs, and prevent chunks from being able to save at all. The world essentially become straight up broken.
A bug report has been made to mojang and has reached 'community consensus' status
https://bugs.mojang.com/browse/MC-194811
As of right now, there's still no fix for this except by using a mixing i509 made.
The mixin looked like this last time I saw it but could be improved I think. Anyone reading this and is making a structure mod, you'll want to add this mixin if you are changing a structure's registry name or removing the structure in an update to your mod.
@Mixin(ChunkSerializer.class)
public abstract class ChunkSerializerMixin {
@Shadow @Final private static Logger LOGGER;
@Redirect(method = "readStructureReferences", at = @At(value = "INVOKE", target = "Lcom/google/common/collect/BiMap;get(Ljava/lang/Object;)Ljava/lang/Object;"))
private static <K, V> V emitMissingStructureMessage(BiMap<K, V> map, K key) {
if (!map.containsKey(key)) {
LOGGER.error("Found missing structure reference {}! Ignoring this structure", key);
}
return map.get(key);
}
@Redirect(method = "readStructureReferences", at = @At(value = "INVOKE", target = "Ljava/util/Map;put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;"))
private static <K, V> V ignoreMissingStructures(Map<K, V> map, K key, V value) {
if (key == null) {
// The key is null, do not add to the map
return null;
}
return map.put(key, value);
}
}
The issue is only one person really can do that. Unless everyone depends on a sort of tiny mod
that provides that fix, anyone who touches world gen will step over each other.
EDIT: I have some slight tweaks to improve performance, but it is generally the same thing still:
https://github.com/i509VCB/fabric/blob/structurefix/fabric-structure-api-v1/src/main/java/net/fabricmc/fabric/mixin/structure/ChunkSerializerMixin.java
Oof. I'll ask Ffrann what they did as Frann borrowed the mixin to be able to do name changes too. Maybe they changed it to not be a redirect
@williewillus That is a good idea except that structure mods can't be the ones to have that mixin as when players completely remove the structure mod for whatever reason, the world gets wrecked. Thus, the fix needs to be in some place that players will always have on no matter what. Thus the push to put this into Fabric API.
Shouldn't this be doable with a single mixin at RETURN of readStructureReferences
? Just inspect the return value and do map.remove(null)
.
This should be idempotent across multiple mixins (though a single mod providing the fix would be better efficiency-wise)
@i509VCB Do you need a broken world to test on? I can make one easily. :-D
I'm also happy to test any solution if needed.
You can download it here: https://cloud.laerad.net/s/nCCCDBRgjcccZjc
The 1.16.2 world was created in SP. No mods (besides fabric api) were present after I removed astromine.
There was an astromine meteor at 360 / 64 / 309
Thus the push to put this into Fabric API.
yup, I think a module in FAPI to fix critical vanilla bugs like this one would be useful. Not anything that would affect gameplay (like forge's sometimes-invasive fixes), but just for the worst unpatched crashes/bugs.
they did not place any protection against structures being removed (or have their registry names changed)
imo this should be covered by a datafixer api in the case of renames.
Removals get a bit sketchy since I think mojang stores the structures in ungenerated chunks ahead of time, but a temporary workaround would be what you mentioned.
EDIT: This sounds like a job for another module since this type of fix doesn't make sense in any of the existing modules. I haven't fully formulated the possible name of said module yet.