SevTech: Ages of the Sky

Watchdog execution stall shutting down server 3.1.1

Valeix opened this issue · 11 comments

commented

Bug Report

Server watchdog is detecting a 30000ms stall and shutting down the server as a crash after attempting to send a command in the server console. This happens typically when no players are online and the server has been idle for >1hr. No chunk loaders are active.

Note: the issue started appearing after updating the server from 3.0.8 to 3.1.1. The update was done in accordance with update.txt. I also made a clean install and imported the world without uploading the libraries folder, and I continue to get the stall/crash.

Possible Solution

It seems to happen right after Tombstone finishes its backup: the debug logs show the stall 30 seconds after the backup completes.

Steps to Reproduce (for bugs)

  1. Server is running with no players
  2. Server has been idle for >1hr
  3. Attempt to connect to server, no response
  4. Attempt to enter command in server console
  5. Server watchdog shuts down server

Logs

Client Information

  • Modpack Version: 3.1.1
  • Java Version: 1.8.0.51 (64 Bit)
  • Launcher Used: Twitch
  • Memory Allocated: 8192 MB
  • Server/LAN/Single Player: Server
  • Optifine Installed: False
  • Shaders Enabled: False

World Information

  • Modpack Version world created in: 3.0.8
  • Additional Content Installed: mtqfix 1.12.2

Server Information

  • Java Version: 1.8.0.221
  • Operating System: Windows 10
  • Hoster/Hosting Solution: Myself

commented

What are you doing with your system clock? It's jumping around like a lunatic. Here is a supposedly chronological sequence of events from the 7.2.19 log:

  • At 17:00:00 (all the crashes begin at the turn of the hour) tombstone does a backup, seemingly successfully. (This is a common feature of all the logs, though I don't think it's to blame)
  • At 17:01:00 the server detects a fatal issue - the tick time exceeds the specified maximum of 60 seconds. This should immediately trigger a crash.
  • At 17:00:30 - 30 seconds earlier - Sampler detects the server has stalled for 30 seconds.
  • At 22:09:18 several things happen as the server either awakes from a significant slumber or notices the clocks have changed:
    • the server triggers the crash as a result of a single tick taking more than 60,000ms at 17:01:00.
    • whilst handling the crash the server becomes aware that 18558878ms have passed since the last tick. "Has the system time changed?" it asks. So do I.
  • At 22:09:19 the stall report is saved. A stall that supposedly occurred at 17:00:00 is saved as though it occurred several hours later.
    • The server also creates the crash report referencing the stall of 17:00:00.
    • The server is stopped.

So what is actually happening? Is the time changing or is it waiting more than 3 hours to crash?
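
As a sanity check on the numbers in the timeline above, here is a minimal sketch that converts the reported 18558878ms into hours, minutes and seconds, and adds it to 17:00:00. The class and method names are made up for illustration, and it assumes the last completed tick happened at about 17:00:00.

```java
import java.time.Duration;
import java.time.LocalTime;

public class StallGap {
    // Hypothetical helper: render a millisecond gap as "Hh Mm Ss".
    static String describe(long millis) {
        Duration d = Duration.ofMillis(millis);
        long hours = d.toHours();
        long minutes = d.toMinutes() % 60;
        long seconds = d.getSeconds() % 60;
        return hours + "h " + minutes + "m " + seconds + "s";
    }

    // Hypothetical helper: wall-clock time reached after the gap,
    // truncated to whole seconds.
    static String resumeAt(int hour, int minute, int second, long millis) {
        return LocalTime.of(hour, minute, second)
                .plus(Duration.ofMillis(millis))
                .withNano(0)
                .toString();
    }

    public static void main(String[] args) {
        long gap = 18558878L; // value quoted from the crash handling in the log
        System.out.println(describe(gap));            // 5h 9m 18s
        System.out.println(resumeAt(17, 0, 0, gap));  // 22:09:18
    }
}
```

Under that assumption the gap lands on 22:09:18, which matches the timestamp at which the crash was finally handled. That fits one long pause starting around 17:00:00 better than a series of random clock jumps, although it cannot by itself say what caused the pause.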

commented

Unless Windows is changing the time on its own, which I don't believe it is, the clock isn't changing. The crash doesn't actually happen until I go to the server console and try to send a command. So the 18 million ms it is counting is the time between when the stall actually started and when I tried to input a command. So it is waiting those hours to actually crash.

I tried to think of some things that can be ruled out for this issue, such as the server's power and sleep settings. Those are set to never sleep, and I already had "turn off hard disk" disabled, so I don't believe it's a configuration issue like that.

commented

@snaiperskaya crashing on login is a very different issue to that described here. Please create a separate issue with all the requested information.

As Valeix describes, his server ignores login requests and crashes when something is entered into the server console.

commented

@Valeix
Thread.sleep(Math.max(1L, 50L - i)); where i is the time taken for the game to tick, and is never less than zero.

This is the call to sleep the game is hanging on. This can never be a pause of greater than 50ms and is used to try to keep the minimum tick time to 50ms so the game doesn't run too fast. This is vanilla code.
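
To illustrate why that sleep can never exceed 50ms, here is a simplified sketch of the pacing arithmetic. It is not the actual server loop; the class name, constant and method name are invented for the example, and i is assumed to be non-negative as stated above.

```java
public class TickPacing {
    // Target length of one tick in milliseconds (20 ticks per second).
    static final long TARGET_TICK_MILLIS = 50L;

    // Same expression as the sleep argument quoted above:
    // i is how long the last tick took, in milliseconds.
    static long sleepMillis(long i) {
        return Math.max(1L, TARGET_TICK_MILLIS - i);
    }

    public static void main(String[] args) {
        long[] tickTimes = {0L, 10L, 49L, 50L, 200L};
        for (long i : tickTimes) {
            System.out.println("tick took " + i + " ms -> sleep " + sleepMillis(i) + " ms");
        }
    }
}
```

For any non-negative tick time the result stays between 1 and 50, so a thread that appears stuck in this call for hours is not explained by the requested sleep length.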

I cannot explain why this is happening unless you have modified the minecraft server jar to try to lower the activity while you are offline.

You could try disabling the watchdog, which you can do by setting max-tick-time=-1 in server.properties (Note: that is negative one - others have missed that), or disabling Windows internet time synchronisation. You can find instructions for this online.
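
If you want to double-check the setting after editing the file, a rough sketch like the following reads the text of server.properties and reports whether max-tick-time is -1. The helper name is hypothetical, and the fallback of 60000 is an assumption based on the 60 second limit seen in the log.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class WatchdogSetting {
    // Hypothetical helper: true when the properties text sets max-tick-time
    // to -1, the value described above as disabling the watchdog.
    static boolean watchdogDisabled(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumed fallback of 60000 ms when the key is absent.
        String value = props.getProperty("max-tick-time", "60000");
        return Long.parseLong(value.trim()) == -1L;
    }

    public static void main(String[] args) {
        System.out.println(watchdogDisabled("max-tick-time=-1\n"));    // true
        System.out.println(watchdogDisabled("max-tick-time=60000\n")); // false
    }
}
```

A value of 1 instead of -1 would leave the watchdog active with a very short limit, which is why the minus sign matters.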

Also of note: Java version 1.8.0_221 is not yet released, so either you've made a mistake or it's a beta. Try using the latest released version, 1.8.0_211.

commented

I did mistype the Java version; I am running 1.8.0_211. Also, I have not modified the jar. Are there any drawbacks to disabling the watchdog?

commented

The process will not crash automatically, but it doesn't seem to be doing that anyway. So not really. It will be interesting at least to see if anything changes.

commented

@sk2048 It is a similar issue, at the very least. In my case, the login of a player causes it to occur, but I didn't try a console command first to confirm that would or would not cause the same thing.

That said, I will also try to disable the watchdog, if this continues to happen.

commented

I would like to second this issue.

I've been running a server for about the last month on 3.1.1 and it's been running great until this same issue started yesterday. I've found that if I restore the region folder from a backup, it'll let me log in and be fine, but it seems to repeat this behavior once it's gone idle, as I had the same thing waiting for me this morning. It looks like the server stalls and then finally crashes on the first login attempt after sitting idle.

We're in Age 4, and my initial reaction the first time was that the PneumaticCraft pressure chamber we set up the night before was the issue (we set up redstone to regulate it, but logged off before it was up to pressure). After restoring the first time, we watched it to make sure it wouldn't over-pressurize or explode, and it was fine. But this morning I had to restore the region folder again, so something is getting corrupted.

Once this happens, the server will crash on initial login every time until the region folder is restored and then it seems fine until an extended idle period again. Like I said, this was running great until yesterday, and the pack hasn't changed, so I'm not sure what could suddenly be the issue.

commented

@snaiperskaya Crashing on login is almost certainly just because it's loading chunks with problem tile entities. But no crash reports or any other details mean I can't tell you for certain. Do you even know if the watchdog is triggering the crash?

Please stop hijacking another issue and create your own.

commented

Update:
It's been 3 days since I recreated the server without importing the libraries folder. Instead, I created a new server and let it generate a world, then stopped it and copied over the other server's world/prestige folder. The server had 1 stall shortly after booting up, but it did not shut down or crash. I have restarted it once, but it's been going strong. The only thing I haven't done is visit all the previous dimensions that the stalling/crashing server had.

I'm going to poke around and see if a certain dimension is causing these stalling issues. There's really nothing built in the other dimensions, so I'm not really expecting to find the source of the stalls that way. I think it was most likely caused by some corrupted file in the server. I kept a copy of the stalling server for now, just in case.

commented

Closing; seems like a freak occurrence. Feel free to reopen if you manage to recreate this issue.