Lib Multipart

Lib Multipart

46.7k Downloads

Concurrency issue with caching in PartContainer

Kneelawk opened this issue ยท 5 comments

commented

The Issue

While doing some tests yesterday, I encountered a NullPointerException in PartContainer.getCullingShape(). VoxelShapes.union() was breaking when it tried to call a method on its first parameter.

The first few lines of the stack-trace show this:

java.lang.NullPointerException: Cannot invoke "net.minecraft.util.shape.VoxelShape.isEmpty()" because "one" is null
	at net.minecraft.util.shape.VoxelShapes.combine(VoxelShapes.java:124)
	at net.minecraft.util.shape.VoxelShapes.combineAndSimplify(VoxelShapes.java:112)
	at net.minecraft.util.shape.VoxelShapes.union(VoxelShapes.java:104)
	at alexiil.mc.lib.multipart.impl.PartContainer.getCullingShape(PartContainer.java:616)
	at alexiil.mc.lib.multipart.impl.MultipartBlock.getCullingShape(MultipartBlock.java:154)

If you look at the code for PartContainer.getCullingShape(), you find:

    public VoxelShape getCullingShape() {
        if (cachedCullingShape == null) {
            cachedCullingShape = VoxelShapes.empty();
            for (PartHolder holder : parts) {
                cachedCullingShape = VoxelShapes.union(cachedCullingShape, holder.part.getCullingShape());
            }
        }
        return cachedCullingShape;
    }

The only way for the first parameter of VoxelShapes.union() to be null is if multiple threads were accessing cachedCullingShape at the same time, one thread calling getCullingShape(), and the other thread calling recalculateShape(). The first thread calls getCullingShape(), finding that cachedCullingShape == null, and attempts to rebuild the cache, setting cachedCullingShape = VoxelShapes.empty(). Then the second thread calls recalculateShape(), setting cachedCullingShape = null while the first thread is still assuming that it will not be null.

Full Crash Log

Here is a pastebin link to the full crash log: https://pastebin.com/Q4chsscU

Note that the stack trace does include WiredRedstone code, however, this code is simply supplying a mesh to the renderer and does not do anything with threads, so it is unlikely this concurrency issue was caused by that code.

Possible Fix

One simple fix would be to use AtomicReferences for cached state, and to store the new, incompletely-built, cache value in the get*Shape() methods while it's being built. The AtomicReferences are because in Java, on some platforms, it is possible for a reference-type variable to be only partially-set when it is read from another thread (like on ARM).

commented

This looks good. I'll try and test it the next chance I get. However, that may not be for a couple weeks. I'll re-open if I encounter the same issue again.

commented

Note how in the stack trace, the trace starts in a ForkJoinWorkerThread. From what I can tell, this means terrain is rendered off the Render thread in special dedicated threads for rendering terrain. This means that a part container's cached state is indeed being accessed from multiple threads.

commented

Just ran into this issue again on 0.7.1-pre.8 in my WiredRedstone dev environment, so maybe it's not nearly as hard to reproduce as I thought.

commented

@Kneelawk I believe the issue is that you should never be accessing block entity state from the meshing thread. You should get and store all the data you need in the ModelKey (which is being created in the client main thread).

commented

@Technici4n That is important, but that is not what is causing this specific issue. The issue here is that the base Minecraft renderer is accessing blocks' culling shapes from multiple threads when calculating ambient occlusion in terrain rendering. Because MultipartBlocks get their culling shape from their block entities, multiple threads end up accessing the block entities' cached shapes at the same time.