Integrated Dynamics


Integrated Dynamics crash - BlockEntityHelpers - NBT Issue?

Gbergz opened this issue · 19 comments

commented

Issue type:

  • ๐Ÿ› Bug

Short description:

When restarting the server, it stalls until it eventually crashes (watchdog).
Something tied to CyclopsCore's BlockEntityHelpers, or possibly Cluster.fromNBT? Not entirely sure.

Steps to reproduce the problem:

Not really sure how to reproduce it on a fresh server, but on our server a restart triggers it.

Expected behaviour:

That the server can start without a crash.


Versions:

Crash report:

https://gist.github.com/Gbergz/f11378d4499766cd3a97f3fd4b98a7d3

commented

Always; it's been crashing overnight while sleeping at least 20 times.

You could try enabling safe-mode in the ID config file to see if you can start the server with that

Edit: Will try that.

commented

Tried it, didn't work unfortunately.
https://gist.github.com/Gbergz/82bb36b4d57bcff62f5bd41d36290af7

commented

Thanks for reporting!

commented

At first glance, this doesn't appear to be an ID issue.
ID is just reading vanilla block entities using plain vanilla logic, so it looks like one of the block entities or chunks loads very slowly for some reason.
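(For illustration only: below is a minimal sketch of what that vanilla lookup path looks like, with a hypothetical timing wrapper around it. It assumes Mojang mappings for Minecraft 1.18; the helper class and threshold are made up and are not ID or CyclopsCore code. On a server, `getBlockEntity` loads the containing chunk synchronously if it is not loaded yet, which is why one slow chunk can stall the whole tick.)

```java
// Hypothetical diagnostic helper -- NOT part of Integrated Dynamics or CyclopsCore.
// Assumes Mojang mappings for Minecraft 1.18 (Level, BlockPos, BlockEntity).
import net.minecraft.core.BlockPos;
import net.minecraft.world.level.Level;
import net.minecraft.world.level.block.entity.BlockEntity;

public final class SlowBlockEntityProbe {

    /** Log lookups that take longer than this many milliseconds. */
    private static final long THRESHOLD_MS = 50;

    private SlowBlockEntityProbe() {
    }

    /**
     * Fetches a block entity the same way vanilla code would, but measures how long
     * the call takes. On a server, getBlockEntity() loads the containing chunk
     * synchronously if it is not loaded yet, so a slow chunk shows up here.
     */
    public static BlockEntity timedLookup(Level level, BlockPos pos) {
        long start = System.nanoTime();
        BlockEntity blockEntity = level.getBlockEntity(pos);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
        if (elapsedMs > THRESHOLD_MS) {
            System.err.println("Slow block entity lookup at " + pos + ": " + elapsedMs + " ms");
        }
        return blockEntity;
    }
}
```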

Does this always happen? Or just sometimes?

You could try enabling safe-mode in the ID config file to see if you can start the server with that.

commented

Not sure what I can do to help you.

You could try making a backup of the server, removing ID, starting the server, and loading the chunks in which the ID networks reside. That would allow us to determine whether chunk loading also fails without ID present.

commented

That's what I've tried. Using a backup works until the server restarts, then it crashes again with the same crash report as above, which is weird. I'll try an even older backup and see if anything changes.

commented

Rolling back further seems to have fixed it, but it's still a worry that it'll happen again.

commented

Based on the current information we have, it seems very unlikely that ID is the cause of the problem.

But in any case, if it occurs again, definitely keep all recent logs. And if ID is mentioned in the crash again, could you share those logs here?

Closing this issue in the meantime until we have more information.

commented

Getting the same issue:
https://gist.github.com/phit/98ca8f3db083aef71924e0e958bff7e2
https://gist.github.com/phit/27ffec72d37a778d852ccbd15db01aa2

Rolling back does not sound like a fun solution in our case. Rebooting over and over and increasing the watchdog time to ridiculously high numbers does seem to get the server to load eventually.

I would love to share the save file, but it's multiple GB and I don't know which of my 60 players' bases is responsible.

commented

@phit Could you share the full logs up until the moment the crash occurs? (see discussion above)

commented

Should be this,
debug-4.log

If you need all prior logs I have those too, but don't want to post those in public. Just let me know and I can email you those.

commented

Can you share the latest forge logs? Debug does not contain the relevant information.

commented

I've sent you an email with more logs

commented

[screenshot: grafik]
The issue is happening basically every restart now and takes 2-3 stalls until it finally manages to start.

commented

So I went through all the logs we have, and the only thing I saw in them that isn't in every log is this:

[screenshot: grafik]

I'll check out those coords and rip down the machines once the server loads up; maybe that will help.

commented

RAM usage is nowhere near peak; you can see our server stats here: https://staff.stonebound.net/grafana/d/server4/direwolf20-1-18-2
CPU usage is at a full 100% on one core during the time it's stuck loading :/
Unfortunately, it's really annoying to hook up a sampler to the Java process to debug this due to the containers used by our panel; otherwise I would have already hooked up VisualVM to see what the server is doing during those multi-minute freezes.
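(If attaching VisualVM through the container stays impractical, one rough alternative is a small in-process helper that writes a thread dump to a file when the server stops making progress. The class below is a made-up sketch, not part of ID or the server; it uses only standard JDK APIs, which are available since Minecraft 1.18 runs on Java 17.)

```java
// Minimal stand-alone sketch using only JDK APIs (java.lang.management).
// Class and method names are hypothetical; not part of Integrated Dynamics or the server.
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public final class StallDumper {

    /** Write a dump if no progress has been reported for this many milliseconds. */
    private static final long STALL_MS = 30_000;

    private static volatile long lastProgress = System.currentTimeMillis();

    private StallDumper() {
    }

    /** Call this once per server tick (e.g. from a tick event handler). */
    public static void reportProgress() {
        lastProgress = System.currentTimeMillis();
    }

    /** Starts a daemon thread that dumps all stacks when a stall is detected. */
    public static void start(Path dumpFile) {
        Thread watcher = new Thread(() -> {
            while (true) {
                try {
                    Thread.sleep(5_000);
                } catch (InterruptedException e) {
                    return;
                }
                if (System.currentTimeMillis() - lastProgress > STALL_MS) {
                    dumpAllThreads(dumpFile);
                    lastProgress = System.currentTimeMillis(); // avoid dumping the same stall repeatedly
                }
            }
        }, "stall-dumper");
        watcher.setDaemon(true);
        watcher.start();
    }

    private static void dumpAllThreads(Path dumpFile) {
        StringBuilder sb = new StringBuilder("=== stall detected ===\n");
        ThreadInfo[] infos = ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);
        for (ThreadInfo info : infos) {
            sb.append(info); // includes a (truncated) stack trace per thread
        }
        try {
            Files.writeString(dumpFile, sb.toString(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
```

The dump then shows exactly what the server thread is doing during a multi-minute freeze, which is the same information a VisualVM sample would give.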

commented

@phit Have you tried increasing RAM on your server? The only abnormal thing I see in the logs you sent is quite a few occurrences of "Can't keep up!", which may be resolved by increasing RAM.

If the problem still persists after that, then it might be possible to implement a system that spreads network loading across different ticks (if networks are very large and span many chunks), which may resolve this problem.
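(Roughly, the spreading described above could look like the generic sketch below: a queue of deferred loading tasks drained each tick only until a small time budget is spent, so one huge network cannot monopolize a single tick. All names are hypothetical; this is not Integrated Dynamics' actual network-loading code.)

```java
// Generic illustration of a per-tick time budget -- names are hypothetical,
// this is not Integrated Dynamics' actual network-loading code.
import java.util.ArrayDeque;
import java.util.Deque;

public final class BudgetedTickQueue {

    /** Maximum time to spend on deferred loading work per tick (in nanoseconds). */
    private static final long BUDGET_NANOS = 2_000_000L; // ~2 ms per tick

    private final Deque<Runnable> pending = new ArrayDeque<>();

    /** Enqueue one unit of network-loading work (e.g. re-initializing one chunk's parts). */
    public void enqueue(Runnable task) {
        pending.addLast(task);
    }

    /** Called once per server tick: run queued work until the budget is exhausted. */
    public void runBudgeted() {
        long start = System.nanoTime();
        while (!pending.isEmpty() && System.nanoTime() - start < BUDGET_NANOS) {
            pending.pollFirst().run();
        }
        // Anything left over simply waits for the next tick instead of stalling this one.
    }
}
```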

commented

Welp, I finally managed to hook up VisualVM and can basically confirm that your mod is not at fault, as far as I can tell.
serverstart.zip

commented

Indeed, doesn't look like ID is causing anything here.

In any case, should a similar issue arise in the future and ID definitely be the cause of it, I described a possible solution in my previous post.

Closing this in the meantime.