Rate limits, scheduling and preemption
SquidDev opened this issue · 2 comments
This has been kicking around my head for a couple of weeks, especially since 12e82af dropped, and given we just had an Incident™ on SwitchCraft, I thought it might be as good a time as any to write up an issue on this:
Basically, we need a better way to prevent computers running amok. Or rather, when computers run amok, we need to reduce the impact that has on the rest of the server (and ideally on other computers). I think the best way to do this is as follows:
- Introduce global limits for the amount of server time ComputerCraft can consume in a tick.
- Allow pre-empting computers before they actually yield, reducing the risk of people blocking the computer thread.
- Add some form of scheduling, so computers which do little work still get server or computer time, and are not pushed aside by those which do a lot.
As far as work on the computer thread goes, it should be pretty trivial in a conceptual sense. The actual `Computer` implementation is a bit of a concurrency mess right now, so it'd be good to clean that up first. As far as scheduling goes, I think we can get away with something simple like weighted fair queuing (as used by Linux's CFS) - effectively pick whoever has used the least resources so far.
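To make "pick whoever has used the least resources so far" concrete, here is a minimal sketch. It is not ComputerCraft's actual code: `ScheduledComputer` and `FairQueue` are hypothetical names, every computer is assumed to have equal weight, and "resources" is simplified to accumulated run time in nanoseconds.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical scheduling state for one computer; not the real Computer class. */
final class ScheduledComputer {
    final String name;
    long usedNanos; // total run time charged to this computer so far

    ScheduledComputer(String name) {
        this.name = name;
    }
}

/** Hypothetical run queue ordered by how much time each computer has used. */
final class FairQueue {
    private final PriorityQueue<ScheduledComputer> queue = new PriorityQueue<>(
        Comparator.comparingLong((ScheduledComputer c) -> c.usedNanos));

    /** Make a computer eligible to run. */
    void add(ScheduledComputer computer) {
        queue.add(computer);
    }

    /** Remove and return the computer which has used the least time, or null if none is waiting. */
    ScheduledComputer next() {
        return queue.poll();
    }

    /**
     * Charge a computer for the time it just ran and queue it again.
     * The computer must already have been removed via next(), so its
     * ordering key is never changed while it sits inside the heap.
     */
    void finished(ScheduledComputer computer, long elapsedNanos) {
        computer.usedNanos += elapsedNanos;
        queue.add(computer);
    }
}
```

One caveat with this sketch: a computer added late starts at zero and would be favoured until it catches up, so a real scheduler would probably start newcomers at the current minimum instead.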
Server time is a little more complex. Unlike computers, tasks are executed in multiple places - turtles (on TE tick, executes commands for that computer), `MainThread` (in one batch for multiple computers) and on Plethora peripherals (on TE tick, commands tied to multiple computers). We need a way to spread time fairly between computers, while keeping as close to our budget as possible*.
I haven't found a satisfactory solution to this yet, without introducing high latencies or massive unfairness. Though, as hopefully most of the time we'll be well within our budget, I think we mostly need to worry about the case where a couple of computers are hogging all execution time.
* We're always going to go above our budget, as we don't know how long something will take before we do it.
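As a rough illustration of why the budget can only be checked between tasks, here is a sketch of a per-tick budget loop. `TickBudget` is a hypothetical class, not an existing ComputerCraft API, and it deliberately ignores the fairness question above: it just runs queued tasks in order until the budget is spent.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.LongSupplier;

/** Hypothetical per-tick budget for server-thread tasks. */
final class TickBudget {
    private final long budgetNanos;
    private final LongSupplier clock; // e.g. System::nanoTime; injectable for testing
    private final Queue<Runnable> tasks = new ArrayDeque<>();

    TickBudget(long budgetNanos, LongSupplier clock) {
        this.budgetNanos = budgetNanos;
        this.clock = clock;
    }

    /** Queue a task to run on some later tick. */
    void enqueue(Runnable task) {
        tasks.add(task);
    }

    /** Number of tasks still waiting. */
    int pending() {
        return tasks.size();
    }

    /**
     * Run queued tasks until the budget is used up, returning how many ran.
     * The elapsed time is only checked before starting a task, so the final
     * task may push the tick over budget. Leftover tasks wait for the next tick.
     */
    int runTick() {
        long start = clock.getAsLong();
        int ran = 0;
        while (!tasks.isEmpty() && clock.getAsLong() - start < budgetNanos) {
            tasks.poll().run();
            ran++;
        }
        return ran;
    }
}
```

With something like `new TickBudget(5_000_000L, System::nanoTime)`, each call to `runTick()` would stop starting new work after roughly 5ms; the 5ms figure is purely illustrative.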
It should also be possible to extend some of this code to the bandwidth limiting areas of #33, though I haven't fully thought through all of that yet.
We've implemented the computer thread portions of this issue. I need to have a serious think about how best to handle the server tick scheduling - any ideas are welcome!