CC: Tweaked

Scheduler taking CPU load for synchronization in main thread?

LemADEC opened this issue · 9 comments

commented

Useful information to include:

  • Minecraft version 1.12.2
  • CC: Tweaked version 1.82.3
  • Detailed reproduction steps:
    During routine profiling, I noticed that CC is now using locks in the main thread, which didn't show up before. Also, the whole scheduler took 0.82% CPU by itself (measured over 5 minutes with 5 idle computers).
    I'm a bit confused, as I thought computers were running in their own named threads. Is that linked to the CFS introduction in a125a19 (1.82.0)?

Also, in my past experience with synchronization on public Minecraft servers, we only got good results after replacing such locks/sync blocks with CopyOnWriteArray, AtomicBoolean and the like. So I'm a bit worried about how this lock will scale when we have 50+ computers running at the same time on our server.
Did we evaluate how the new system works on a loaded server?

commented

The actual queuing of events is only 0.17% (504ms in Environment.queueEvent()). While most things execute on a separate thread, when queuing events from the main thread (peripheral updates, main-thread tasks, and os.startTimer/os.setAlarm) we need to synchronise somehow.

However, this lock should be very low contention, so I'd be surprised if it has any impact in practice (though 0.12% is pretty negligible). For reference, we have a server with 270 computers switched on (140 of which have run something in the last minute) and have not noticed any performance degradation.
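
To make the locking described above concrete, here is a minimal sketch of an event queue that the main thread pushes to and a computer's worker thread polls. The class and method names are hypothetical and are not CC: Tweaked's actual implementation; the only assumption taken from the discussion is that a single lock guards the queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a computer's event queue (not CC: Tweaked's real class).
final class EventQueueSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Queue<String> events = new ArrayDeque<>();

    // Called from the main (server) thread, e.g. when a timer fires.
    void queueEvent(String name) {
        lock.lock();
        try {
            events.add(name);
        } finally {
            lock.unlock();
        }
    }

    // Called from the computer's worker thread; returns null when nothing is pending.
    String pollEvent() {
        lock.lock();
        try {
            return events.poll();
        } finally {
            lock.unlock();
        }
    }
}
```

Because the critical section is only a single queue operation, the lock is held very briefly, which is why contention would be expected to stay low.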

I'm not sure there's a nice solution here. Ultimately we need to lock at some point, as we're mutating things from multiple threads. I guess one solution would be to build a list of computers which have queued events from the main thread this tick and only add them to the executor at the end of the tick, which means you only need to acquire a lock once per tick. However, I'm not sure if all the additional bookkeeping would overshadow any performance gains.
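
A rough sketch of that batching idea, assuming computers are identified by a string ID and that the pending set is only ever touched from the main thread. All names here are hypothetical, and the `lockAcquisitions` counter exists only to illustrate the "one lock per tick" property.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical batching scheme: collect computers during the tick, hand them over once.
final class TickBatcherSketch {
    // Touched only from the main thread, so it needs no lock.
    private final Set<String> pendingThisTick = new LinkedHashSet<>();

    // Shared with worker threads; guarded by executorLock.
    private final Object executorLock = new Object();
    private final List<String> executorQueue = new ArrayList<>();
    int lockAcquisitions = 0; // illustration only: counts end-of-tick lock acquisitions

    // Main thread: record that this computer queued an event during the current tick.
    void markQueued(String computerId) {
        pendingThisTick.add(computerId); // the set removes duplicates
    }

    // Main thread: called once at the end of each tick.
    void endOfTick() {
        if (pendingThisTick.isEmpty()) return; // nothing to hand over, no lock taken
        synchronized (executorLock) {
            lockAcquisitions++;
            executorQueue.addAll(pendingThisTick);
        }
        pendingThisTick.clear();
    }

    // Worker side: take everything that is ready to run.
    List<String> drainExecutorQueue() {
        synchronized (executorLock) {
            List<String> ready = new ArrayList<>(executorQueue);
            executorQueue.clear();
            return ready;
        }
    }
}
```

The extra set and the end-of-tick pass are the "additional bookkeeping" in question; whether they cost less than the uncontended lock they replace would have to be measured.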

commented

My Lua scripts use os.startTimer to pace screen refreshes. Why does that require the main thread?
Would I have the same limitation with events posted from my block to the Lua script?

commented

Why do peripheral updates have to be done on the main thread?
Are we talking about the methods that blocks expose to Lua? The API for those clearly states that they are called outside the main thread.

commented

By peripheral updates, I'm referring to when a peripheral is attached/detached. This is done on the main thread, as it's a response to block updates - thus we're queuing events from there.

Similarly, when a peripheral needs to interact with the world, one uses ILuaContext.executeMainThreadTask. This runs a task on the main thread which, when finished, queues an event.
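
The flow described here (a worker thread hands a task to the main thread, which runs it and then queues an event) can be sketched as follows. This is a simplified illustration with hypothetical names and a made-up string event format, not the mod's actual API or implementation. It also uses lock-free queues for brevity, whereas the discussion above concerns a lock.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical model of "run on the main thread, then queue an event".
final class MainThreadTaskSketch {
    private final Queue<Runnable> mainThreadTasks = new ConcurrentLinkedQueue<>();
    private final Queue<String> computerEvents = new ConcurrentLinkedQueue<>();
    private final AtomicLong nextTaskId = new AtomicLong();

    // Worker thread: schedule work that must touch the world; returns a task id.
    long issueTask(Supplier<String> worldAccess) {
        long id = nextTaskId.getAndIncrement();
        mainThreadTasks.add(() -> {
            String result = worldAccess.get(); // runs on the main thread
            // Queuing happens from the main thread, as described above.
            computerEvents.add("task_complete:" + id + ":" + result);
        });
        return id;
    }

    // Main thread: called once per server tick.
    void tickMainThread() {
        Runnable task;
        while ((task = mainThreadTasks.poll()) != null) {
            task.run();
        }
    }

    // Worker thread: next event for the computer, or null if none is pending.
    String pollEvent() {
        return computerEvents.poll();
    }
}
```

In this model the computer's own code never runs on the main thread; only the short world-access task and the event hand-off do.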

commented

Sounds like that would typically apply to a turtle breaking or placing a block, for example.
My test setup was just 5 computers, each attached to one of my blocks, which don't use ILuaContext.executeMainThreadTask. My mod already handles asynchronous requests in its own way.
There were no logs of attach/detach events either.

It's weird how we jump from 0.82% to 0.17%.

commented

Yes, turtle commands will also queue events from the main thread. In the above profile snapshot, the queueing is coming from the OS API. This is most likely from os.startTimer or os.setAlarm. It's possible the rednet coroutine or something is starting timers - haven't the foggiest.

commented

Timers and alarms are kept in lock-step with the server's TPS and so are scheduled to run from the main thread. That's how CC has always worked. It doesn't run any computer code on the main thread - it is merely scheduling an event.
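
As an illustration of timers driven by server ticks, here is a minimal sketch. It assumes 20 ticks per second and rounds the requested delay to whole ticks; the class, the rounding rule, and the `timer:<id>` event string are assumptions for the example, not the mod's actual code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical tick-driven timer table; tick() is meant to be called by the main thread.
final class TickTimerSketch {
    private static final int TICKS_PER_SECOND = 20; // assumed server tick rate

    private final Map<Integer, Integer> ticksLeft = new HashMap<>();
    private final List<String> queuedEvents = new ArrayList<>();
    private int nextId = 0;

    // Called on behalf of the computer (any thread): register a timer, return its id.
    synchronized int startTimer(double seconds) {
        int ticks = Math.max(1, (int) Math.round(seconds * TICKS_PER_SECOND));
        int id = nextId++;
        ticksLeft.put(id, ticks);
        return id;
    }

    // Main thread, once per server tick: count down and queue events for expired timers.
    synchronized void tick() {
        Iterator<Map.Entry<Integer, Integer>> it = ticksLeft.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Integer, Integer> entry = it.next();
            int remaining = entry.getValue() - 1;
            if (remaining <= 0) {
                queuedEvents.add("timer:" + entry.getKey()); // only an event, no computer code
                it.remove();
            } else {
                entry.setValue(remaining);
            }
        }
    }

    // Worker thread: take all events queued so far.
    synchronized List<String> drainEvents() {
        List<String> out = new ArrayList<>(queuedEvents);
        queuedEvents.clear();
        return out;
    }
}
```

If the server tick rate drops, timers in this model slow down with it, which is the "lock-step with TPS" behaviour described above.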

I don't know if you'd prefer to pop onto Discord or IRC to discuss things further - it might be easier to have a full conversation about CC internals there :).

commented

We discussed things on Discord and did some profiling. Performance seemed pretty comparable with 1.7.10 (which also performed a lock and a push to a queue). Furthermore, the performance impact is pretty negligible, and I'm unable to create a scenario where it actually impacts performance in any way.