WorldEdit

Fabric with worldedit hangs when stopping during shutdown

petersv5 opened this issue · 12 comments

commented

WorldEdit Version

7.3.0-beta-03

Platform Version

Fabric Loader 0.15.3

Confirmations

  • I am using the most recent Minecraft release.
  • I am using a version of WorldEdit compatible with my Minecraft version.
  • I am using the latest or recommended version of my platform software.
  • I am NOT using a hybrid server, e.g. a server that combines Bukkit and Forge. Examples include Arclight, Mohist, and Cardboard.
  • I am NOT using a fork of WorldEdit, such as FastAsyncWorldEdit (FAWE) or AsyncWorldEdit (AWE)

Bug Description

The server hangs during shutdown in some cases.
This was tracked down to the thread "WorldEdit Task Executor - 0" being left running, which in turn is due to WorldEdit.executorService not being stopped. This task executor service does not use daemon threads and thus requires an explicit shutdown to terminate. If it is not stopped, it will prevent the JVM from initiating shutdown.
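
To illustrate the mechanism, here is a minimal sketch (not WorldEdit's actual code). It relies on the fact that the JDK's default thread factory creates non-daemon threads, so an executor that is never shut down keeps an idle worker alive. The class name `ExecutorDaemonCheck` is made up for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; it does not exist in WorldEdit.
final class ExecutorDaemonCheck {
    /** Reports whether the worker of a default single-thread executor is a daemon thread. */
    static boolean workerIsDaemon() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // The probe runs on the pool's worker thread and inspects that thread.
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        try {
            return pool.submit(probe).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Without this call the idle worker would stay alive and block JVM exit.
            pool.shutdown();
        }
    }
}
```

`ExecutorDaemonCheck.workerIsDaemon()` returns `false`, which is why an explicit `shutdown()` is needed.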

Expected Behavior

Sending /stop to the server should terminate the process in a reasonable time.

Reproduction Steps

  1. Make a selection
  2. //copy
  3. //schem save somefilename
  4. /stop

Observe that the Fabric server does not terminate.
Checking the remaining server process with jstack shows that the thread "WorldEdit Task Executor - 0" is still running and is not a daemon thread.
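
The same check can also be made from inside the JVM. The following is a small sketch using only standard `Thread` APIs; the class name `NonDaemonThreads` is illustrative and not part of WorldEdit or Fabric.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper; lists the threads that can keep the JVM from exiting.
final class NonDaemonThreads {
    /** Returns the sorted names of all live non-daemon threads. */
    static List<String> names() {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon())
                .map(Thread::getName)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

Logging `NonDaemonThreads.names()` late in shutdown would show entries such as the executor or timer threads if they are still alive at that point.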

Anything Else?

I am preparing a PR for this, hopefully later today.

commented

This issue has been automatically marked as stale because it has not been fully confirmed. It will be closed if no further activity occurs. Thank you for your contributions.

commented

Will test latest and get back if it is still an issue.

commented

It seems that, in addition, a timer is created during schematic saves that is not cancelled. This leads to a thread "Timer-1" from java.util.TimerThread that also remains live through the shutdown. I'm still tracking this down before filing the PR.

commented

Realistically for safety I believe that we should be properly shutting down the executor service on mod/plugin unload, not just making these daemon threads.

The Timer should automatically be GC'd after the schematic save is complete. If a reference remains, that is what should be fixed. I would be interested in a heap dump to see where this is going wrong.
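
As a rough sketch of what "properly shutting down the executor service on unload" could look like, assuming a plain `ExecutorService` and a made-up helper named `ExecutorShutdown` (how WorldEdit wires this into its unload hook is not shown here):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, intended to be called from a mod/plugin unload hook.
final class ExecutorShutdown {
    /** Shuts the pool down, waiting up to timeoutMillis per phase; true if it terminated. */
    static boolean shutdownGracefully(ExecutorService pool, long timeoutMillis) {
        pool.shutdown(); // reject new tasks, let queued tasks finish
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            pool.shutdownNow(); // interrupt running tasks, drop queued ones
            return pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Waiting for termination gives in-flight work such as a schematic save a chance to complete, which daemon threads alone would not guarantee.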

commented

The PR does shut the executor service down properly; it was not made a daemon thread.

The Timer is a bit funny. The timer that resolved the hang when it was made a daemon thread is only ever used by the FutureProgressListener constructor. I see an equal number of constructor calls and calls to the run() method, which in turn cancels the timer. So in theory there should be no timer references anywhere. At the point of the hang all non-daemon threads are already dead except "Timer-1". Either one of the daemon threads holds the reference to the FutureProgressListener Timer, or it holds it itself.

commented

The Timer timer in FutureProgressListener is a static field. It is not going to go away, I guess. Is it actually correct to use a single Timer shared by all instances of FutureProgressListener? That should make them interfere with each other, I think.

There is another Timer in WorldEdit's SessionManager that also keeps a TimerThread alive.

For the Timers it may actually be a better idea to make the threads daemon threads. It is unlikely that any timer callback can do much good once the server has shut down, which it will have done by the time the daemon-ness of the timer (or lack of it) matters.
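
For reference, `java.util.Timer` has a constructor that takes a name and a daemon flag. The sketch below uses it and checks the flag from inside a timer task; the class name `DaemonTimerSketch` and the thread name are made up for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not WorldEdit code.
final class DaemonTimerSketch {
    /** Creates a daemon Timer and reports whether its task thread is a daemon. */
    static boolean timerThreadIsDaemon() {
        Timer timer = new Timer("Sketch Timer", true); // true = daemon thread
        AtomicBoolean daemon = new AtomicBoolean();
        CountDownLatch ran = new CountDownLatch(1);
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                daemon.set(Thread.currentThread().isDaemon());
                ran.countDown();
            }
        }, 0L);
        try {
            if (!ran.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timer task did not run");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            timer.cancel(); // still good practice, even for a daemon timer
        }
        return daemon.get();
    }
}
```

A daemon timer thread cannot hold the JVM open, at the cost that a pending task may be skipped if the process exits first.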

I am trying to make the timers go away by making some changes:

  • the FutureProgressListener timer was made a non-static field.
  • the SessionManager timer is explicitly set to null on unload.
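
A sketch of the first change, using a made-up class `DelayedNotice` as a stand-in (it is not FutureProgressListener and makes no claim about how that class is written): each instance owns its Timer and cancels it when the tracked work completes.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a progress listener; not WorldEdit's class.
final class DelayedNotice implements Runnable {
    // Instance field: each notice owns its timer, so cancelling one
    // cannot affect another and no static reference outlives the object.
    private final Timer timer = new Timer("Delayed Notice Timer");
    private final AtomicBoolean fired = new AtomicBoolean();

    DelayedNotice(long delayMillis) {
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                fired.set(true); // stand-in for "please wait" messaging
            }
        }, delayMillis);
    }

    /** Called when the tracked work completes: discards pending tasks and stops the timer thread. */
    @Override
    public void run() {
        timer.cancel();
    }

    boolean hasFired() {
        return fired.get();
    }
}
```

Calling `cancel()` is what ends the timer thread promptly. Merely dropping the last reference also lets the thread terminate eventually, but the `java.util.Timer` documentation notes that this "can take arbitrarily long to occur".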

I still sometimes see the TimerThread being kept alive by Timer@ThreadReaper for a long while, but it eventually goes away. Still, this delay can be a problem for servers.

I will look a bit more at this tomorrow.

commented

This issue has been automatically marked as stale because it has not been fully confirmed. It will be closed if no further activity occurs. Thank you for your contributions.

commented

This issue still exists in the latest release. Please review and reopen the issue, and consider making the timer thread a daemon thread.

Environment

  • Minecraft 1.21.1
  • OpenJDK Red_Hat-21.0.4.0.7-1
  • Fabric loader 0.16.7
  • worldedit 7.3.7+6929-c6af3a3 (worldedit-mod-7.3.7.jar)

Steps to reproduce

  1. Start a fabric server with worldedit mod only
  2. Execute /searchitem stone in the console (for versions that require a player to run the command, run it in-game)
  3. Execute /stop in the console
  4. Wait for the server to stop, observe

Expected behavior: the server stops; Actual behavior: the server never stops, and hangs forever

More information

Log: https://pastebin.com/TXvr65Ei

jstack output (look at the Timer-1 thread): https://pastebin.com/jWEh5FE1

Heap dump taken with jmap -dump:format=b,file=heapdump.hprof <pid>. It is split into two files to get around GitHub's attachment size limit. You need to decompress twice:

heapdump.hprof.xz.001.zip
heapdump.hprof.xz.002.zip

commented

It doesn't still exist but rather exists again: the fix was reverted because it caused other issues. Will re-open though.

commented

"It doesn't still exist but rather exists again"

This issue is also reproducible with at least:

  • 7.2.15, mc1.20.1
  • 7.2.14, mc1.19.4
  • 7.2.5, mc1.16.5

If you look into the related commit 5eb9b779d7467ba3c893421a8ed099c47685f01e, you will find that this issue was introduced no later than 7.0.0-beta-05, affecting all MC versions from 1.13 onward.

commented

"the fix was reverted because it caused other issues"

I doubt that the "fix" you refer to here actually fixes this issue.

Look at the FutureProgressListener class: no change has been made to this file since 2020. Evidently the changes before 2020 did not fix this issue either.
