
[Regression] 1.6.x VRAM leak fix locks OptiFine to single-threaded chunk rendering when enabled
FalsePattern opened this issue · 6 comments
1.6.0's VRAM leak fix causes OptiFine to become a stutterfest. Gotta figure out why this happens and fix it.
The AMD memory leak on Linux happens because the display lists never get deleted, and the Mesa drivers don't deallocate the VBOs they allocated for them in the background. This wasn't an issue while the default buffer size was 256kB per VBO, but it was increased to 20MB a few years ago to reduce allocation spam. See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6140
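For reference, the rough shape of a delete-and-recreate approach looks something like this. This is a minimal sketch, not the actual 1.6.x patch; the class and method names are made up, only the GL11 calls are real:

import org.lwjgl.opengl.GL11;

/**
 * Sketch of the delete-and-recreate idea: instead of compiling new geometry
 * into the same display list IDs forever, free the old lists so the driver
 * can release their backing buffers, then grab fresh IDs.
 */
public final class DisplayListRecycler {
    private DisplayListRecycler() {}

    /**
     * @param oldBaseList base ID of the chunk renderer's two lists, or -1 if none yet
     * @return a freshly generated base ID covering both render passes
     */
    public static int recycle(int oldBaseList) {
        if (oldBaseList >= 0) {
            // Deleting the lists is what actually lets Mesa drop the oversized buffers.
            GL11.glDeleteLists(oldBaseList, 2);
        }
        return GL11.glGenLists(2);
    }
}

Whether this delete/regen pair is cheap enough to run per chunk rebuild is exactly the open question in this regression.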
@FalsePattern This issue intrigues me. In the process of fixing MC-129, I found that once the chunk renderer actually rerenders chunks at the speed it should, there are cases where you see outdated chunk data until a display list is finally updated. E.g., if you move from the overworld to the nether, you will see overworld terrain barely visible through the fog.
My fix for this was simply to blast the display list contents whenever setDontDraw is called. However, I don't actually delete and recreate the display list like you're doing here. I haven't seen any compatibility or framerate issues with FastCraft or OptiFine using this method. Not sure if it fixes the AMD memory leak (since I don't have an AMD card to test with), but I'd be curious to hear what results you get.
Here's the related mixin:
/* MC-129 */
import org.lwjgl.opengl.GL11;
import org.spongepowered.asm.mixin.Mixin;
import org.spongepowered.asm.mixin.Shadow;
import org.spongepowered.asm.mixin.injection.At;
import org.spongepowered.asm.mixin.injection.Inject;
import org.spongepowered.asm.mixin.injection.callback.CallbackInfo;

import net.minecraft.client.renderer.WorldRenderer;

@Mixin(WorldRenderer.class)
public abstract class MixinWorldRenderer implements IWorldRenderer {

    @Shadow public boolean[] skipRenderPass;
    @Shadow private int glRenderList;

    /**
     * Make sure chunks re-render immediately (MC-129).
     */
    @Inject(method = "markDirty", at = @At("TAIL"))
    private void forceRender(CallbackInfo ci) {
        for (int i = 0; i < this.skipRenderPass.length; i++) {
            this.skipRenderPass[i] = false;
        }
    }

    /**
     * When switching worlds/dimensions, clear out the old render lists for old chunks.
     * This prevents old dimension content from being visible in the new world.
     */
    @Inject(method = "setDontDraw", at = @At("TAIL"))
    private void clearOldRenderList(CallbackInfo ci) {
        for (int pass = 0; pass < 2; pass++) {
            // Compile an empty list over the old contents so nothing is drawn.
            GL11.glNewList(this.glRenderList + pass, GL11.GL_COMPILE);
            GL11.glEndList();
        }
    }
}
I wonder if deallocating and reallocating display lists causes a GL flush that forces the client thread to suspend while other commands are executed.
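One way to check that hypothesis would be to time the delete/regen pair on the client thread and watch for spikes. The helper below is hypothetical, just a measurement sketch, and not part of either fix:

import org.lwjgl.opengl.GL11;

public final class ListReallocTimer {
    private ListReallocTimer() {}

    /** Deletes and regenerates a range of display lists, logging slow calls. */
    public static int timedRecycle(int oldBaseList, int range) {
        long start = System.nanoTime();
        GL11.glDeleteLists(oldBaseList, range);
        int newBase = GL11.glGenLists(range);
        long elapsedUs = (System.nanoTime() - start) / 1_000L;
        // A driver-side stall would show up as occasional multi-millisecond calls.
        if (elapsedUs > 1_000L) {
            System.out.println("Display list realloc took " + elapsedUs + " us");
        }
        return newBase;
    }
}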
The issue is that Minecraft preallocates 55k renderlists. Mesa allocates renderlist memory in 20MB chunks, which is usually way oversized. Because these renderlists never get released, memory usage climbs as if the player always had every possible chunk loaded. (When a renderlist gets reused, Mesa DOES reuse the already allocated buffer, but renderlists far outside the render distance still keep their VRAM.)
This wasn't an issue when the buffer size was 256kB, but it became gamebreaking when they increased it to 20MB in this commit: https://gitlab.freedesktop.org/mesa/mesa/-/commit/3253594268028efdca17cb9d2b2e423b353c8aa5
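To spell out the lifecycle being described, here is a simplified sketch (not the exact RenderGlobal code, and the class here is made up for illustration):

import org.lwjgl.opengl.GL11;

/**
 * Simplified illustration of the vanilla pattern: list IDs are handed out once
 * at startup, each chunk renderer recompiles its lists in place whenever the
 * chunk changes, and nothing is ever deleted. With the Mesa change linked
 * above, every list that has been compiled at least once can keep an oversized
 * (~20MB) buffer alive, even for chunks far outside the render distance.
 */
public final class RenderListLifecycle {
    private final int baseList;

    public RenderListLifecycle() {
        // Allocated once, up front, sized for the theoretical maximum render distance.
        this.baseList = GL11.glGenLists(2); // two render passes, as in the mixin above
    }

    /** Called every time the chunk's geometry changes. */
    public void rebuild(int pass) {
        GL11.glNewList(this.baseList + pass, GL11.GL_COMPILE);
        // ... chunk geometry would be emitted here ...
        GL11.glEndList();
        // Note: no glDeleteLists() anywhere, so the driver-side allocation
        // behind this list only goes away when Minecraft.freeMemory() runs.
    }
}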
Based on line 202 of RenderGlobal, vanilla indeed seems to reuse the same set of display lists forever (at least until Minecraft.freeMemory is called), and simply allocates enough to handle the theoretical maximum render distance. I don't see it continuing to allocate any more after that. Are you implying that if a display list gets overwritten with new data without being deleted and reallocated first, none of the memory it previously used will be freed? That sounds like a Mesa issue, not a Minecraft one.
Okay, now I think I understand. With 55k renderlists there could theoretically be 20MB x 55,000 = 1,100GB of buffer memory reserved for them. Clearly the game doesn't crash immediately on launch, so Mesa must be doing the allocation lazily. That was the part I was missing.
I wonder if it's a cache locality issue, where having the render lists fragmented in memory slows down processing. Looking through your fix, you don't seem to reallocate them except when the position changes, so I doubt my theory that the reallocation itself is slow.
EDIT: It's worth noting that I don't see any performance impact with the leak fix forcefully enabled on my Nvidia card.