Integrated Dynamics

Integrated Dynamics crash - BlockEntityHelpers - NBT Issue?

Gbergz opened this issue · 19 comments

commented

Issue type:

  • 🐛 Bug

Short description:

When restarting the server, it stalls until it eventually crashes (watchdog).
It seems to be tied to the CyclopsCore BlockEntityHelpers, or maybe Cluster.fromNBT? Not entirely sure.

Steps to reproduce the problem:

Not really sure how to reproduce it on a fresh server, but on our existing server a restart triggers it.

Expected behaviour:

That the server can start without a crash.


Versions:

Crash report:

https://gist.github.com/Gbergz/f11378d4499766cd3a97f3fd4b98a7d3

commented

Always. It has crashed at least 20 times overnight while I was sleeping.

You could try enabling safe-mode in the ID config file to see if you can start the server with that

Edit: Will try that.

commented

Tried it, didn't work unfortunately.
https://gist.github.com/Gbergz/82bb36b4d57bcff62f5bd41d36290af7

commented

Thanks for reporting!

commented

At first glance, this doesn't appear to be an ID issue.
ID is just reading vanilla block entities using plain vanilla logic, so it looks like one of the block entities or chunks loads very slowly for some reason.

Does this always happen? Or just sometimes?

You could try enabling safe-mode in the ID config file to see if you can start the server with that.

commented

Not sure what I can do to help you.

You could try making a backup of the server, remove ID, starting the server, and load the chunks in which the ID networks reside. That would allow us to determine if chunk loading also fails without ID being present.

commented

That's what I have tried, using a backup. It works until the server restarts, then it crashes again with the same crash report as above, which is weird. I will try using an even older backup and see if anything changes.

commented

Rolling back further seems to have fixed it, but it's still a worry that it'll happen again..

commented

Based on the current information we have, it seems very unlikely that ID is the cause of the problem.

But in any case, if it occurs again, definitely keep all recent logs. And if ID is mentioned again in the crash, can you share those logs here?

Closing this issue in the meantime until we have more information.

commented

getting the same issue https://gist.github.com/phit/98ca8f3db083aef71924e0e958bff7e2
https://gist.github.com/phit/27ffec72d37a778d852ccbd15db01aa2

rolling back does not sound like a fun solution in our case. rebooting over and over and increasing the watchdog time to ridiculously high numbers does seem to lead to the server loading eventually

I would love to share the save file, but it's multiple GB and I do not know which of my 60 players' bases is responsible

commented

@phit Could you share the full logs up until the moment the crash occurs? (see discussion above)

commented

Should be this,
debug-4.log

If you need all prior logs I have those too, but don't want to post those in public. Just let me know and I can email you those.

commented

Can you share the latest forge logs? Debug does not contain the relevant information.

commented

I've sent you an email with more logs

commented

[image]
issue is happening on basically every restart now and takes 2-3 stalls until it finally manages to start

commented

so I went through all the logs we have and the only thing I saw in them that isn't in every log

[image]

I'll check out those coords and rip down the machines once the server loads up, and maybe that will help.

commented

RAM usage is nowhere near peak, you can see our server stats here https://staff.stonebound.net/grafana/d/server4/direwolf20-1-18-2
CPU usage is at a full 100% on one core during the time it's stuck loading :/
Unfortunately it's really annoying to hook up a sampler to the java process to debug this due to the containers used by our panel, otherwise I would have already hooked up VisualVM to see what the server is doing during those multi-minute freezes.

commented

@phit Have you tried increasing RAM on your server? Because the only abnormal thing I see in the logs you sent is that there are quite a few occurrences of "Can't keep up!", which may be resolved by increasing RAM.

If the problem still persists after that, then it might be possible to implement a system that spreads network loading across different ticks (if networks are very large and span many chunks), which may resolve this problem.
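
To make that idea a bit more concrete, here is a rough sketch of what spreading the loading across ticks could look like. This is not existing ID or CyclopsCore code: `TickSpreadLoader`, `maxTasksPerTick`, and the use of plain `Runnable` tasks are all hypothetical names chosen for illustration, and a real implementation would have to be driven from the server tick and deal with chunks that are not loaded yet.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/**
 * Hypothetical helper (an assumption for illustration, not part of ID):
 * queues small units of loading work and runs a bounded number per tick,
 * so that no single tick has to initialize a whole large network.
 */
class TickSpreadLoader {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private final int maxTasksPerTick;

    TickSpreadLoader(int maxTasksPerTick) {
        if (maxTasksPerTick <= 0) {
            throw new IllegalArgumentException("maxTasksPerTick must be positive");
        }
        this.maxTasksPerTick = maxTasksPerTick;
    }

    /** Schedule one unit of work, e.g. initializing a single network part. */
    void enqueue(Runnable task) {
        pending.add(task);
    }

    /** Call once per server tick; runs at most maxTasksPerTick tasks and returns how many ran. */
    int tick() {
        int ran = 0;
        while (ran < maxTasksPerTick && !pending.isEmpty()) {
            pending.poll().run();
            ran++;
        }
        return ran;
    }

    boolean isDone() {
        return pending.isEmpty();
    }

    public static void main(String[] args) {
        TickSpreadLoader loader = new TickSpreadLoader(2);
        List<Integer> loaded = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            final int part = i; // stand-in for one network part
            loader.enqueue(() -> loaded.add(part));
        }
        int ticks = 0;
        while (!loader.isDone()) {
            loader.tick(); // in a mod this would be called from the server tick
            ticks++;
        }
        System.out.println("Loaded " + loaded.size() + " parts in " + ticks + " ticks");
    }
}
```

With a budget of 2 tasks per tick, the 5 queued parts finish over 3 ticks instead of all in one, which is the kind of behaviour that would keep a single tick from running long enough to trip the watchdog.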

commented

welp i finally managed to hook up visualvm and can basically confirm that your mod is not at fault as far as i can tell
serverstart.zip

commented

Indeed, doesn't look like ID is causing anything here.

In any case, should a similar issue arise in the future, and ID is definitely the cause of it, I described a possible solution that could be implemented in my previous post.

Closing this in the meantime.