Multiplayer server crashing at launch
Lictor000 opened this issue ยท 16 comments
BuildCraft version: 7.99.17
Forge version: 14.23.3.2680
Link to crash report or log: https://pastebin.com/k19WJyZm and https://pastebin.com/0gaeWPk4
Singleplayer or multiplayer: Multiplayer
Steps to reproduce: Launch server
Additional information:
Me and my friends have been running a server for more than a month with alpha buildcraft on it (we upgraded from 7.99.14 to 7.99.16 and to 7.99.17 when it was released), but in the last days the server would start to crash for too long ticks and would not launch at all for the same reason after the first crash. Disabling the "watchdog" that stops the server, it starts but it gets super laggy (some seconds needed to remove/place blocks), with many console lines that says that it can't keep up, falling behind for several seconds and skipping ticks. I tried even downloading the world and try it on single player and gives the same issues. The crash creates both the crashlogs I linked above.
I'm not 100% sure that the crash is due to buildcraft, since we run some other mods (IC2, Galacticraft, Forestry, Ironchests, Endertanks and nature's compass; we did have also Betweenlands but some weeks ago we removed it from the server), BUT I'm 100% sure that if I remove Buildcraft from the mods the server starts up again running smoothly. I tried removing every mod one by one and issues end only with buildcraft being removed.
We tried restoring a backup of the server but after some hours of running smoothly it would start to crash again, apparently with no particular trigger action, with the same crashlogs.
I hope this may be of some help to fix the mod and if not sorry for wasting your time and for any grammar mistake, english is not my native language.
Also I take the occasion to thank the current developers for keeping my favourite minecraft mod alive and also upgrading it.
the 2nd crash (that prevents the world from loading after) is caused by forestry, it seems it is making things crash cause things are in an invalid state (probably from the watchdog killing the server)
the first crash doesn't give any info as so why things lag, just that it was killed because it lagged, can you go ingame and run the /debug start
command, then wait a minute and do /debug stop
that should generate a ticktime report in the debug folder of the server, we can take a look at that to see what exactly is taking up all the time
Something very weird happened. I started the server, and by pure luck it escaped being shutdown by the watchdog for like 300 ms over the 60000 ms cap and started. I was doing other stuff meanwhile and didn't notice it actually went up. For like 30-40 minutes the console was filled with "can't keep up" lines, but then after that the suddenly stopped. I was able just now to actually enter the server in game and it was running smoothly again. In any case, via local backup in singleplayer, I did the debug command in the laggy condition in which we were stuck: https://pastebin.com/dV0y1sGr
This instead was done via console from the server after it started running smoothly again: https://pastebin.com/bPjPa7Gc
During which in the console this lines appeared: https://pastebin.com/ywsCjX5p
i thought i replied yesterday but must not have i guess
but how many (chunkloaded) pumps do you have? the lag seems to be from pumps rebuilding their queues to figure out what fluids to pump up, that's also why it went away after a while, once they all finished that things stabilized again
also if you check those can you see if a 64 block radius around the pumps is loaded (in cases where it's connected to a fluid pool that is that large)?might be slowing things down further by having to load and unload chunks that have not been loaded yet
it's weird that it took that long (unless you really have a ton of them connected to huge lakes) but in the laggy one i don't see anything else that comes even close to having a ticktimes as high that could be part of the problem
the non laggy one shows relatively high time for pipes but if you have a lot of them that makes sense also ticktime is good so don't think that's any problem, no abnormal times i see there really
Ok, just finished investigating after asking my players if and how many pumps they had around the world. One of them placed 4 pumps above a lava sea in the nether, using a IC2 chunkloader and an Endertank to pump lava in the overworld. He managed to reach the bottom of the sea, and two of the pumps were just above some fluid lava (not source lava blocks). Just after removing one the lag sensibly decreased and removing the other one restored the smoothness of the world completely.
Looking the area I noticed that the lava source blocks closer to the pumps responsible were just outside the chunkloaded area (I think), something over 70 blocks away from them.
So yeah, it seems the problem was just a lava-hungry player.
Sorry for wasting your time and thank you for your kindness!
no problem, it was no wasting time at all
there may be ways we can patch this up for the future so pumps don't kill the server on startup (maybe only checking a smaller area at first so things like infinite water pumps for cooling purposes are up and running directly and queueing the full update so they don't all run at the same time)
@AlexIIL any opionions on that?
Ok, I got curious and try experimenting a bit with those guilty pumps using my offline backup world.
I realize now, my bad, that I didn't specify that the server didn't just crash at its start up, but the first crash happened when it was on and running smoothly. After that first crash it couldn't load up unless disabling the watchdog, getting the laggy world.
So I started first to think, like I said in my previous post, that it was caused maybe by lava source blocks outside the chunkloaded area flowing anyway under the pumps (like when a big quarry hole gets entirely flooded with a single block of water; there's a nearby nether subway to travel from different towns we built on the overworld, so it could have flown from outside not constantly loaded chunks gradually when a player passed near by).
But then I thought that if that was the case the lag should have stopped the moment I went to the nether to take them down, since my presence was loading all the chunks in the area.
The pumps where quite high, so I placed them nearer to the lava. I took some screenshots.
This is where the pumps where placed, over that white structure around the pillar of the nether fortress. The nearest lava sources are just in front of me, horizontal distance from the pumps is something more than 70 blocks.
If I place them again in their original position lag starts again, and the pump's pipes don't come down.
But...
... If I place them a bit lower, same xz coordinates, closer to the lava level, the pipes come down! (EDIT: And no lag experienced!)
So by Pythagoras I would assume that the cause that makes go crazy the pump initialization is just a matter of distance from the source blocks.
Buildcraft version: 7.99.20
Forge version: 14.23.5.2768
Hi all, I'm having the same issue, the lag spikes range from 5 to >60 seconds. I pinpointed it to a pumping station in the Nether that's causing it and I believe unloaded chunks around the pump are at least partially causing it. It's 15 pumps spread across 3 platforms that are barely in reach of a chunk loader in the middle loading an area of 5x5 chunks total. I experienced from time to time that the pumps stop pumping for a while like the station on the right in the background.
Beginning of Profile Dump:
[01] | world - 99.98%/96.01%
[02] | | tick - 99.96%/95.97%
[03] | | | entities - 88.74%/85.17%
[04] | | | | blockEntities - 83.34%/70.98%
[05] | | | | | buildcraftfactory:pump - 57.59%/40.87%
[06] | | | | | | build - 99.31%/40.59%
[06] | | | | | | prepare - 0.33%/0.13%
[06] | | | | | | unspecified - 0.30%/0.12%
[06] | | | | | | oil_spring_search - 0.06%/0.03%
[06] | | | | | | checkLight - 0.00%/0.00%
[07] | | | | | | | unspecified - 55.70%/0.00%
[07] | | | | | | | getBrightness - 29.71%/0.00%
[07] | | | | | | | checkedPosition < toCheckCount - 14.59%/0.00%
[05] | | | | | enderio:tile_conduit_bundle - 37.53%/26.64%
[06] | | | | | | tick - 24.79%/6.60%
[06] | | | | | | crazypants_enderio_conduits_conduit_liquid_AdvancedLiquidConduit - 23.33%/6.22%
[07] | | | | | | | updateNetwork - 36.36%/2.26%
[07] | | | | | | | updateConnections - 33.62%/2.09%
[07] | | | | | | | unspecified - 30.02%/1.87%
With the TickProfiler mod active with lag spike detection I get regular logs like:
The server appears to have lag spiked.
Last tick 0.32925773s ago."Server thread" RUNNABLE
at com.google.common.collect.ImmutableCollection$Builder.addAll(ImmutableCollection.java:416)
at com.google.common.collect.ImmutableCollection$ArrayBasedBuilder.addAll(ImmutableCollection.java:500)
at com.google.common.collect.ImmutableList$Builder.addAll(ImmutableList.java:773)
at buildcraft.factory.tile.TilePump.buildQueue(TilePump.java:142)
at buildcraft.factory.tile.TilePump.mine(TilePump.java:283)
at buildcraft.factory.tile.TileMiner.func_73660_a(TileMiner.java:90)
at buildcraft.factory.tile.TilePump.func_73660_a(TilePump.java:235)
at net.minecraft.world.World.func_72939_s(World.java:1835)
at net.minecraft.world.WorldServer.func_72939_s(WorldServer.java:613)
at net.minecraft.server.MinecraftServer.func_71190_q(MinecraftServer.java:767)
at net.minecraft.server.dedicated.DedicatedServer.func_71190_q(DedicatedServer.java:397)
at net.minecraft.server.MinecraftServer.func_71217_p(MinecraftServer.java:668)
at net.minecraft.server.MinecraftServer.run(MinecraftServer.java:526)
at java.lang.Thread.run(Thread.java:748)
And /forge tps
also confirms it's the nether that's lagging:
Dim 0 (overworld) : Mean tick time: 6.082 ms. Mean TPS: 20.000
[20:17:18] [Server thread/INFO] [net.minecraft.server.dedicated.DedicatedServer]:
Dim -1 (Nether) : Mean tick time: 60.958 ms. Mean TPS: 16.405
[20:17:18] [Server thread/INFO] [net.minecraft.server.dedicated.DedicatedServer]:
Overall : Mean tick time: 67.537 ms. Mean TPS: 14.807
Hi, break the pumps and put them somewhere else. Nearer the source blocks of lava for example.
As I understood trying on my server, the problem was the fact that they were trying to pump from too far away lava sources, which are made "available" by that ton of flowing lava.
I suggest you to try to avoid lava seas and use limited lava pools instead to avoid this issue (or building barriers/dams with sand/gravel yourself).
[01] | world - 99.98%/96.01%
[02] | | tick - 99.96%/95.97%
[03] | | | entities - 88.74%/85.17%
[04] | | | | blockEntities - 83.34%/70.98%
[05] | | | | | buildcraftfactory:pump - 57.59%/40.87%
[06] | | | | | | build - 99.31%/40.59%
[06] | | | | | | prepare - 0.33%/0.13%
This bit means that it's definitely the pump causing the issue, not the unloaded chunks as everything in build.
I've had a little look and it seems that the pump handles having multiple pumps in one place really badly because everytime it tries to pump out a source block that a different pump has already pumped it will forcable recalculate the entire queue of blocks to pump all over again. I'm not sure how easy this will be to fix honestly :/
But basically can you try removing every pump except for one of them and then power it with a more powerful engine? (Just FYI redstone engines are being nerfed in the future to not be able to power pumps anyway so you'll have to do that at some point in the future).
I am having the same problem. The server is now crippling because of pumps building their queue. It also forces chunks to load. Pumping lava from the nether works alright until the source blocks are out of the reach of the chunkloader (3x3 chunks).
Running Buildcraft 7.99.24.4. I've tested this extensively.
Can you try updating to 7.99.24.5? I made a few performance improvements there, although they weren't massive so they won't fix this problem.
(And as a side note if you could expand the expensive part of the performance tree, or upload the performance information file for 7.99.24.5 that would be very useful).
I have created a separate server to test this, as the one I posted the report for can't be updated just yet. In the meantime, here is the spark report for 7.99.24.4 if it helps. I've attached a Base64 encoded PNG picture as well, in case the profiler link goes down.
Choose mappings for MCP 1.12.2 in the upper right corner. I am now testing the same thing on 7.99.24.5 and will come back with a report when lag begins.
Just as a sidenote, I see that you are using arraylists for building the queue, and then adding them somewhere with addAll (did not look too in-depth). Wouldn't it be better to use LinkedList, since the underlying array of ArrayList is immutable, and adding something to it (or ensuring capacity) causes the whole queue to be copied over?
This seems to be taking up most ticktime from TimePump.buildQueue():
com.google.common.collect.ImmutableList$Builder.addAll()
https://gist.github.com/KiralyCraft/15a89729091a622fe70496cc166b84de
https://sparkprofiler.github.io/#eqlV5LuDgy
EDIT: I've tried two different locations in the nether on the test server with 24.5, I could not reproduce the issue yet.
Given that I could not reproduce the problem anymore with 7.99.24.5 in a similar scenario, I think it's fixed. I've seen this issue happen multiple times before (and I did not report it, shame on me), and it was always Google's addAll that was taking the ticks away.
Ah, ok, thanks for this - that just confirmed that the issue was with building the queue rather than loading chunks (or anything else). In 7.99.24.5 I optimised this to be node-based (referring to another blockpos object for each pos) rather than a big Map<BlockPos, List>, which probably means it's fixed for you?