Fabric API

Fabric API

106M Downloads

Going back and forth between main and outer end island through gateway portal leaks memory on client

cuteBoiButt opened this issue · 17 comments

commented

java 18(ea build 22)
minecraft 1.17.1
fabric loader 0.12.5
fabric api 0.42.1

and nothing else :p

according to VisualVM by the time i took heap dump this HashSet...

private final Set<WorldChunk> loadedChunks = new HashSet<>();

...contained 67140 elements and had retained size of ~1028 megabytes, which doesn't sound right considering that i'm playing on a vanilla'ish(carpetmod) server with view-distance=16

i can share heap dump if it's going to be of any use but this leak seems to be easily reproducible
at least on "my machine™" :D

commented

java 18(ea build 22)
minecraft 1.17.1
fabric loader 0.12.5
fabric api 0.42.1

and nothing else :p

What makes you think carpet is involved?

commented

This seems to suggest there is a path for unloading chunks that we are missing. How many times do you need to go through to reproduce it?

commented

At view distance 16, there should be approx 1024 (32x32) elements in that HashSet.
The actual amount will depend upon when MC decides to run its chunk management code.

67140 elements suggests the unload event is not getting fired from here:
https://github.com/FabricMC/fabric/blob/1.17/fabric-lifecycle-events-v1/src/main/java/net/fabricmc/fabric/mixin/event/lifecycle/ThreadedAnvilChunkStorageMixin.java

This has been an issue before when a mod decides it wants to modify(optimise) the chunk loading.
i.e. it ends up moving the chunk unload and hence setLoadedToWorld() calls to mod code.

Is there a reason why the above unload mixin doesn't run from inside WorldChunk.setLoadedToWorld()
when the loadedToWorld state changes from true -> false?

commented

playing on a vanilla'ish(carpetmod) server

It sounds like they tested with just fabric api but its not super clear. I would just like some confirmation before spending time looking into something that might not exist.

commented

Can confirm that carpet is doing its own chunk unloading:

https://github.com/gnembon/fabric-carpet/blob/b935fadadf11f685d85300d28da3de6ad8ce6113/src/main/java/carpet/mixins/ThreadedAnvilChunkStorage_scarpetChunkCreationMixin.java#L372

This is in some code that regenerates chunks by unloading and then reloading them.

commented

@cuteBoiButt are you able to validate that this doesnt happen with carpet? Thanks.

commented

I have a possible fix here: fabric-api-0.42.1+local-chunk_leak.jar - also see my PR: #1820

leak seems to be gone, but number of elements still doesn't seem right to me with view-distance=16
image

commented

Still happens even with a pure(java -jar server.jar) vanilla server
image

that's after ~5 minutes of mindlessly walking through gateway portal

Just to be clear, leak happens on a client connected to a remote dedicated server

commented

I can't reproduce the issue with fabric API in dev (client) connected to a locally hosted vanilla dedicated server. :/

commented

My current reproduction setup

Client:
image
image
image

Server: locally hosted pure vanilla

Outer end island:
2021-11-11_15 51 58

Main island:
2021-11-11_15 52 03

And.. just press W for ~5min

commented

Oh wait, I was using the nether portal. 🤦

commented

Confirmed, thanks

commented

The issue I suggested before is a red herring, although I still think it is an issue for mods that do what carpet is doing.

I believe, this bug can happen on any long distance teleport? Not just the end gateway one.
That might not be true though since the end gateway uses a different teleport method (shared with spread players).

The actual problem is with this mixin:
https://github.com/FabricMC/fabric/blob/1.17/fabric-lifecycle-events-v1/src/main/java/net/fabricmc/fabric/mixin/event/lifecycle/client/ClientChunkManagerMixin.java#L54

In particular the unload mixin itercepts the compareAndSet() nullification of the client's chunk map.

But this is guarded by a check to see if the chunk is actually relevant (i.e. it is within the player's view distance).
see ClientChunkMap.isInRadius().
(It is also guarded on something else winning a race to replace that chunk element with something else).

What I believe is happening is that the player just teleported, so the previously loaded chunks fail this guard (they are out of range - or a new chunk already replaced it). This means the mixin is never invoked.

The mixin also doesn't trap chunks being discarded when the view distance is changed.
see ClientChunkManager.updateLoadDistance()

The fix should be to move the mixin so it doesn't depend on the isInRadius() and the other check.

I don't know whether this will also fix the view distance change problem.
Does the server send UnloadChunk packets when this happens?

commented

Indeed @warjort you are right. For me, the problem comes from this which unloads the chunk without going through ClientChunkManager#unload:
image

However, I would recommend redesigning the entire block entity unload event. In fact, in 1.17 markRemoved is called for block entities when a chunk is unloaded (this wasn't the case in 1.16) (to be confirmed that it's called in ALL situations), which implies that MC now properly tracks all loaded BEs in an easy to access manner.

To be confirmed (needs testing on integrated & dedicated servers etc), but I think we can just inject into ClientWorld#unloadBlockEntities and ServerWorld#unloadEntities for our BE unload event. We'll have to check that this properly unloads mixins. I've unassigned myself as this will require quite a bit of testing. If someone else wants to have a look, go ahead! 🙂

commented

but number of elements still doesn't seem right to me with view-distance=16

1449 chunks is in the right ballpark, its actually smaller than the potential maximum.

The actual logic for chunks loaded radius used on the client is

  private static int getChunkMapRadius(int loadDistance) {
      return Math.max(2, loadDistance) + 3;
   }

Which for 16 view distance is 19

This is then used to calculate a diameter (the +1 is your current chunk) which is used to allocate a square

         this.diameter = radius * 2 + 1;
         this.chunks = new AtomicReferenceArray(this.diameter * this.diameter);

19*2+1 = 39
That gives approx 39 * 39 = 1521 chunks.

The difference is likely because some of the possible 3 chunks beyond your view distance have not been sent from the server because you haven't moved in that direction.

commented

I have a possible fix here: fabric-api-0.42.1+local-chunk_leak.jar - also see my PR: #1820

Please let me know how you get on with this, there is some more work to be done to clean this up, but we can get a quick fix like this out quicker.

commented

Maybe i'm just too stupid but Mojang's technical decisions doesn't make sense for me.. like 90% of the time x_x
I always thought that(in case of view-distance=16) 35x35(33x33 entity processing centered on player +1 lazy +1 border) should be more than enough for virtually any scenario that i can think of

Serializing BE's, serializing chunk, compressing, encrypting, sending over bandwidth limited channel, decrypting, decompressing, deserializing everything, firing a ton of events isn't a zero-cost operation

But than again, i'm probably just too stupid >.<
Sorry for bothering т.т