Fabric with worldedit hangs when stopping during shutdown
petersv5 opened this issue · 12 comments
WorldEdit Version
7.3.0-beta-03
Platform Version
Fabric Loader 0.15.3
Confirmations
- I am using the most recent Minecraft release.
- I am using a version of WorldEdit compatible with my Minecraft version.
- I am using the latest or recommended version of my platform software.
- I am NOT using a hybrid server, e.g. a server that combines Bukkit and Forge. Examples include Arclight, Mohist, and Cardboard.
- I am NOT using a fork of WorldEdit, such as FastAsyncWorldEdit (FAWE) or AsyncWorldEdit (AWE)
Bug Description
The server hangs during shutdown in some cases.
This was tracked down to the thread "WorldEdit Task Executor - 0" being left running, which in turn is due to the WorldEdit.executorService not being stopped. This particular task executor service does not use daemon threads and thus requires an explicit shutdown to terminate. If it is not stopped, it prevents the JVM from exiting.
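For illustration, an explicit shutdown of such an executor could look roughly like the sketch below. The class name `ExecutorShutdown` and the method `shutdownAndAwait` are hypothetical and not part of WorldEdit; only the standard `java.util.concurrent` calls are real.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not WorldEdit code: stops an executor whose worker
// threads are non-daemon so that they cannot keep the JVM alive.
final class ExecutorShutdown {
    private ExecutorShutdown() {}

    /** Returns true if the executor terminated within the given timeouts. */
    static boolean shutdownAndAwait(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // stop accepting new tasks, let queued ones finish
        try {
            if (!executor.awaitTermination(timeout, unit)) {
                executor.shutdownNow(); // interrupt tasks that are still running
                return executor.awaitTermination(timeout, unit);
            }
            return true;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

Under these assumptions, a platform's unload or server-stopping hook would call something like `ExecutorShutdown.shutdownAndAwait(executor, 5, TimeUnit.SECONDS)` once.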
Expected Behavior
Sending /stop to the server should terminate the process in a reasonable time.
Reproduction Steps
- Make a selection
- //copy
- //schem save somefilename
- /stop
Observe that the Fabric server does not terminate.
If you inspect the still-running server process with jstack, the thread "WorldEdit Task Executor - 0" is still running and is not a daemon thread.
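The same check can be done from inside the JVM instead of with jstack. The following is a minimal sketch; the class `NonDaemonThreads` is a hypothetical debugging aid, not something WorldEdit ships.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical debugging aid: lists the threads that can keep the JVM alive.
final class NonDaemonThreads {
    private NonDaemonThreads() {}

    /** Returns the names of all live threads that are not daemon threads. */
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }
}
```

Logging this list at the end of server shutdown would show entries such as the executor thread named above if they are still running.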
Anything Else?
I am preparing a PR for this, hopefully later today.
This issue has been automatically marked as stale because it has not been fully confirmed. It will be closed if no further activity occurs. Thank you for your contributions.
It seems that, in addition, a timer is created during schematic saves and is not cancelled. This leads to a thread "Timer-1" from java.util.TimerThread that also remains alive through the shutdown. I'm still tracking this down before filing the PR.
Realistically for safety I believe that we should be properly shutting down the executor service on mod/plugin unload, not just making these daemon threads.
The Timer should automatically be GC'd after the schematic save is complete. If a reference remains, that is what should be fixed. I would be interested in a heap dump to see where this is going wrong.
The executor service should be properly shut down by the PR; it was not made a daemon thread.
The Timer is a bit funny. The actual timer that resolved the hang when it was made a daemon thread is only ever used by the FutureProgressListener constructor. I see an equal number of constructor calls and calls to the run() method, which in turn cancels the timer. So in theory there should be no timer references anywhere. At the point of the hang, all non-daemon threads are already dead except "Timer-1". Either one of the daemon threads holds the reference to the FutureProgressListener Timer, or it holds it itself.
The Timer timer in FutureProgressListener is a static field. It is not going to go away, I guess. Is it actually correct to use a single Timer shared by all instances of FutureProgressListener? That should make them interfere with each other, I think.
There is another Timer in worldedit SessionManager that also keeps the TimerThread alive.
For the Timers it may actually be a better idea to make the threads daemon threads. It is unlikely that any timer callback can do much good once the server has shut down, and the server will already have shut down by the time it matters whether the timer thread is a daemon or not.
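As a rough sketch of the daemon-timer idea: java.util.Timer has a constructor that takes a name and an isDaemon flag. The class `DaemonTimerDemo` and its methods are hypothetical names for illustration only.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CompletableFuture;

// Hypothetical demo, not WorldEdit code.
final class DaemonTimerDemo {
    private DaemonTimerDemo() {}

    /** Creates a timer whose background thread is a daemon thread. */
    static Timer newDaemonTimer(String name) {
        return new Timer(name, true); // second argument: isDaemon
    }

    /** Schedules a one-shot task reporting whether the timer thread is a daemon. */
    static CompletableFuture<Boolean> timerThreadIsDaemon(Timer timer) {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                // Runs on the timer's own background thread.
                result.complete(Thread.currentThread().isDaemon());
            }
        }, 0L);
        return result;
    }
}
```

A daemon timer thread does not block JVM exit, so pending callbacks are simply dropped when the process ends. That is the trade-off discussed in this thread.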
I am trying to make the timers go away by making some changes:
- the FutureProgressListener timer was made a non-static field.
- the SessionManager timer was explicitly nulled on unload.
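One note on nulling: per the java.util.Timer Javadoc, a timer thread whose Timer is merely unreferenced only terminates after garbage collection, which "can take arbitrarily long", whereas cancel() stops it directly. The sketch below assumes a hypothetical holder class `SessionTimerHolder`; it is not the actual SessionManager code.

```java
import java.util.Timer;

// Hypothetical holder, standing in for any component that owns a Timer.
final class SessionTimerHolder {
    private Timer timer = new Timer("session-timer");

    /** Exposes the timer; used here only to observe its state afterwards. */
    Timer timer() {
        return timer;
    }

    /** Cancels the timer explicitly instead of only dropping the reference. */
    void unload() {
        Timer t = timer;
        if (t != null) {
            t.cancel();   // discards scheduled tasks and lets the thread exit
            timer = null; // then drop the reference
        }
    }
}
```

After unload(), any further schedule(...) call on the cancelled Timer throws IllegalStateException, which makes accidental reuse visible.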
I still sometimes see the TimerThread being kept alive by Timer@ThreadReaper for a long while, but it eventually goes away. Still, this delay can be a problem for servers.
I will look a bit more at this tomorrow.
This issue still exists in the latest release. Please review and reopen the issue, and consider making the timer thread a daemon thread.
Environment
- Minecraft 1.21.1
- OpenJDK Red_Hat-21.0.4.0.7-1
- Fabric loader 0.16.7
- worldedit 7.3.7+6929-c6af3a3 (worldedit-mod-7.3.7.jar)
Steps to reproduce
- Start a Fabric server with only the WorldEdit mod installed
- Execute /searchitem stone in the console (for versions that require a player to run the command, run it in game)
- Execute /stop in the console
- Wait for the server to stop, observe
Expected behavior: the server stops; Actual behavior: the server never stops, and hangs forever
More information
Log: https://pastebin.com/TXvr65Ei
jstack output (look at the Timer-1 thread): https://pastebin.com/jWEh5FE1
Heap dump taken with jmap -dump:format=b,file=heapdump.hprof <pid>. It is split into 2 files to bypass GitHub's attachment size limit. You need to decompress twice:
It doesn't still exist but rather exists again - the fix was reverted because it caused other issues. Will re-open though.
> It doesn't still exist but rather exists again
This issue is also reproducible with at least:
- 7.2.15, mc1.20.1
- 7.2.14, mc1.19.4
- 7.2.5, mc1.16.5
If you look at the related commit 5eb9b779d7467ba3c893421a8ed099c47685f01e, you will find that this issue was introduced no later than 7.0.0-beta-05, affecting all MC versions from 1.13 onward.
> the fix was reverted because it caused other issues
I doubt that the "fix" you refer to here actually fixes this issue.
Look at the FutureProgressListener class: no change has been made to this file since 2020. Evidently the changes before 2020 did not fix this issue either.