Lagspike on changing network.
TheLoneWolfling opened this issue · 10 comments
Every time I place or break a logistics pipe on my network I get a quarter-to-half-second lagspike.
I've tried turning the number of routing table threads down to 1 at priority 1, but it still occurs.
Using FastCraft's profiler, I've dumped a few spike reports. Looking at them, it seems like >60% of the update time of the routing table update thread is spent in Object.clone(), called from EnumSet.clone(), called from EnumSet.copyOf(), called from ExitRoute.getFlags(). The other part is almost entirely self time in ServerRouter.CreateRouteTable. Perhaps it would be worth splitting that method into submethods, if nothing else so that we can actually see where the majority of the time is being spent?
Also, more surprisingly, in the main server thread I'm getting almost the same thing: ~70% (!) of the main server thread time is also spent in ServerRouter.CreateRouteTable - about 3/4 of that spent within checkProvider and subcalls, 1/4 spent within checkCrafting and subcalls. Same issue there too - checkProvider → getProviders → getDistanceTo → ensureRouteTableIsUpToDate → CreateRouteTable as above, and checkCrafting → getCrafters → addCrafting → addCrafting → isSatelliteConnected → getIRoutersByCost → ensureRouteTableIsUpToDate, etc.
Here's the NPS file, and here's an HTML version, if you don't want to use VisualVM.
It may be better to add a non-copying variant of getFlags - at least some of the calls to getFlags don't require the copying behavior.
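Something along these lines is what I have in mind - a minimal sketch with made-up field and flag names (not the actual LP code), assuming ExitRoute keeps its flags in an EnumSet:

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Sketch only; field and enum names are illustrative, not taken from LP.
class ExitRoute {
    enum ConnectionFlag { CAN_ROUTE_TO, CAN_REQUEST_FROM, CAN_POWER_FROM }

    private final EnumSet<ConnectionFlag> flags;

    ExitRoute(EnumSet<ConnectionFlag> flags) {
        this.flags = flags;
    }

    // Current behavior: defensive copy, so every call allocates a new EnumSet.
    EnumSet<ConnectionFlag> getFlags() {
        return EnumSet.copyOf(flags);
    }

    // Non-copying variant for read-only callers on the hot path.
    // Mutating the returned set throws UnsupportedOperationException.
    Set<ConnectionFlag> getFlagsUnmodifiable() {
        return Collections.unmodifiableSet(flags);
    }

    // Cheaper still for simple checks: no Set exposed at all.
    boolean hasFlag(ConnectionFlag flag) {
        return flags.contains(flag);
    }
}
```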
Though, all of this being said, I am well aware that profilers can be misleading. So take this with a grain of salt.
Also: part of this is probably related to my setup. For dust processing I'm using supplier pipes and a firewall pipe, as it's the only way that I know of currently to process dusts that a) doesn't try to pulverize ingots just so they can be smelted again, b) actually processes backlogs of dusts in the system, and c) actually allows crafting with dusts in the main system. Perhaps adding a module that acted as an itemsink, but with additional (slow) supplier-pipe-like functionality would help to mitigate this. (So basically an itemsink, but every 60+ seconds or so checks if there is any of the requested items in the system. If so, it goes into "fast" mode to process the backlog until it runs out of the item.)
Moved to 0.10 target, as I added the TickExecutor there. That can be used to implement work done in other threads. The TickExecutor executes tasks on its todo-list at the end of every server tick. Useful for CompletableFuture related chaining.
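For context, a minimal sketch of the general tick-end executor pattern (illustrative only, not the actual TickExecutor code):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;

// Illustrative sketch of a tick-end executor; not the actual LP implementation.
class TickEndExecutor implements Executor {
    private final Queue<Runnable> todo = new ConcurrentLinkedQueue<>();

    @Override
    public void execute(Runnable task) {
        todo.add(task); // may be called from any thread
    }

    // Called from the server thread at the end of every tick.
    void runEndOfTickTasks() {
        Runnable task;
        while ((task = todo.poll()) != null) {
            task.run();
        }
    }
}

// Usage idea: heavy work runs off-thread, the result is applied on the server thread.
// CompletableFuture.supplyAsync(() -> computeRouteTable(), workerPool)
//                  .thenAcceptAsync(table -> applyRouteTable(table), tickEndExecutor);
```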
The hot path that led to the re-creation of the routing table on every network change was the extractor module. Now that that's asynchronous on bigger networks, this exact lag shouldn't occur anymore. I've done some sampling on our server and the results seemed promising.
That's nice. We've already talked about what to improve. Could you tell us your server set-up in terms of RAM and CPU availability/usage though?
I'm puttering away at this locally, by the way. May put in a pull request later, if you so wish. I've already knocked down the lagspike by ~1/2 with a couple of minor tweaks.
It's not a standalone server - local SSP world. System has 8GB ram total (2 x 4GB DDR3-1600) - 1.5GB allocated to Minecraft as a maximum. Tends to hover around 1GB allocated most of the time, though.
It's an i5-3210M, which is why I also get clientside lag when a background thread is doing something - it's only 2 cores. (4 logical, but hyperthreaded.) Doesn't leave a lot of room in the background for anything else once you take into account the main server and client threads. I'd bump LP down to 0 threads but I would prefer a slower LP route update to the server being unresponsive, in general at least. Although given that it's causing problems on the other threads anyways...
I take it from your response that you wouldn't be interested? It can be difficult to gauge tone from writing, but "That's nice." tends to be sarcastic.
No sarcasm on our side. We always welcome good improvements and are looking forward to your suggestions. Just be warned that it could take us a little longer to review them, because the routing tables are LP's backbone and if there is a problem inside that code, all hell breaks loose.
The thing about the routing table update threads is that when you set them to that low a priority with only one thread, you will get a lot more routing table creation inside the main thread. The way LP handles it is that when it tries to access the routing table and the update hasn't been done inside one of the update threads, that update is done inside the main thread instead - which can then again cause a lag spike if you have lots of pipes. So slowing down the update threads doesn't mean the updates are spread out more; it just means that more updates will be done inside the main thread.
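In pseudo-Java, the fallback behaves roughly like this (a simplified sketch of the idea, not the actual LP code):

```java
// Simplified sketch of the fallback described above; not the actual LP code.
class Router {
    private volatile boolean routeTableNeedsUpdate = true;

    // Called by anything that needs routing data (e.g. getDistanceTo).
    void ensureRouteTableIsUpToDate() {
        if (routeTableNeedsUpdate) {
            // The update threads haven't gotten to this router yet, so whatever
            // thread is asking (often the main server thread) rebuilds the table
            // right here, synchronously.
            createRouteTable();
            routeTableNeedsUpdate = false;
        }
    }

    void createRouteTable() {
        // expensive walk over the whole pipe network
    }
}
```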
Alright. That explains why I was seeing so much routing table creation inside the main thread.
So, in that case: what arrangement of priority / thread count do you suggest for my machine?
I'll whip up a pull request from my local code. It should be fail-fast, in that if I've messed up it'll be instantly obvious ('UnsupportedOperationException'), but I'll have to double-check.
@TheLoneWolfling yeah, I see. I was a bit curt there ;) Don't get me wrong, but I am not a native English speaker, so wording can be a bit different sometimes. As @davboecki said: It was not sarcasm. We really appreciate (and endorse) any kind of contribution!
@davboecki Could that be a possible solution for the thing we discussed about the lagspikes?
Looking forward to getting this into LP
It mitigates it, but it doesn't remove it entirely. With this patch, the remaining bulk of the time is mostly split between self time inside CreateRouteTable (mainly the first step) and EnumSet.containsAll.
I hate to say it, but if you want better performance it may be better to create your own replacement for EnumSet - Java is doing a lot of unnecessary work behind the scenes. (EnumSet.containsAll could be done with a single NOT and bitwise AND. Ditto EnumSet.copyOf, which should only require copying a single integer.)
Java supposedly optimizes this - but in reality good luck with that.
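To illustrate what I mean, here's roughly what a hand-rolled bitmask flag set could look like (names are made up; this is a sketch of the idea, not a proposed LP API):

```java
// Sketch of a bitmask-backed flag set; names are made up, not a proposed LP API.
final class FlagSet {
    private final int bits; // one bit per enum ordinal; immutable

    private FlagSet(int bits) {
        this.bits = bits;
    }

    static FlagSet of(Enum<?>... flags) {
        int bits = 0;
        for (Enum<?> flag : flags) {
            bits |= 1 << flag.ordinal();
        }
        return new FlagSet(bits);
    }

    boolean contains(Enum<?> flag) {
        return (bits & (1 << flag.ordinal())) != 0;
    }

    // The containsAll mentioned above: a single NOT plus a bitwise AND.
    boolean containsAll(FlagSet other) {
        return (~bits & other.bits) == 0;
    }

    // "Copying" an immutable value is free - just hand back the same instance.
    FlagSet copy() {
        return this;
    }
}
```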
Personally, I think precalculating routing information may not be the best approach. An "on-the-fly" approach with caching - perhaps even sacrificing route optimality in the process - may be better.
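Something like this, very roughly (an illustrative sketch of the idea only, not a concrete proposal against LP's code):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of "on-the-fly with caching"; not tied to LP's actual classes.
class LazyRouteCache {
    static final class Route { /* hops, cost, flags... */ }

    private final Map<Integer, Route> cache = new HashMap<>();

    // Compute the route to a destination router only when it is first needed.
    Route getRouteTo(int destinationRouterId) {
        return cache.computeIfAbsent(destinationRouterId, this::computeRoute);
    }

    // On a network change, just invalidate instead of eagerly rebuilding every table.
    void onNetworkChanged() {
        cache.clear();
    }

    private Route computeRoute(int destinationRouterId) {
        // single-destination search (Dijkstra/A*), possibly accepting a slightly
        // suboptimal path to keep each lookup cheap
        return new Route();
    }
}
```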