LuckPerms

getActiveContexts high CPU usage

XakepSDK opened this issue · 13 comments

commented

There's not really anything I can do to speed up these queries.

They're already cached, etc

commented

Can you make test build with cache2k instead of caffeine?

commented

They perform very similarly in benchmarks, I can't see it making any difference.

commented

Do we have any easy ways to evaluate the cache setup? For example, to verify its miss ratio or size, etc.

commented

Yes - it's possible, but recording these stats has a performance impact.

commented

Yes, profiling has an impact, so it should be an optional thing to investigate.
I've seen multiple occasions where the cache is using high CPU; being able to enable profiling in those cases could help understand the issue, even when it's not the cache itself causing it.

commented

It looks like the cache in question is configured as:

private final LoadingCache<T, Contexts> lookupCache = Caffeine.newBuilder()
    .weakKeys()
    .expireAfterWrite(50L, TimeUnit.MILLISECONDS) // expire roughly every tick
    .build(new Loader());

There won't be a benefit to switching libraries, since expiration here is a strict policy (unlike maximumSize, which is probabilistic).

Of note is that weakKeys means that the entry can only be looked up with the same key instance, and will be removed automatically when that key is GC'd. Usually users want weakValues, so this might be an issue.
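To make the identity-lookup behaviour concrete, here's a minimal sketch using `IdentityHashMap`, which compares keys with `==` just as a weakKeys cache does (the `Key` record is hypothetical, standing in for a type like `SubjectProxy` with value-based equals):

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class WeakKeysSemantics {
    // Hypothetical key with value-based equals, standing in for SubjectProxy
    record Key(String name) {}

    public static void main(String[] args) {
        // IdentityHashMap compares keys with ==, just as a weakKeys cache does
        Map<Key, String> cache = new IdentityHashMap<>();

        Key a = new Key("admin");
        Key b = new Key("admin"); // a.equals(b) is true, but a != b

        cache.put(a, "contexts");
        System.out.println(cache.get(a)); // contexts
        System.out.println(cache.get(b)); // null: a different instance is a miss
    }
}
```

So a second, equal-but-distinct key instance silently misses the cache and re-invokes the loader.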

The very short time-to-live will also mean it expires very quickly. For longer durations, you can use refreshAfterWrite to reload requested keys asynchronously, rather than penalize the caller when the entry has expired. But for a very short duration, it's hard to know how this could be optimized.
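As a sketch of that alternative, assuming a read that is stale by a few ticks would be acceptable (`T`, `Contexts`, and `Loader` are the types from the snippet above; the chosen durations are illustrative):

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

// Sketch: serve possibly-stale values for up to four ticks, but kick off
// an asynchronous reload once a value is older than one tick.
private final LoadingCache<T, Contexts> lookupCache = Caffeine.newBuilder()
    .refreshAfterWrite(50L, TimeUnit.MILLISECONDS)  // reload in the background after one tick
    .expireAfterWrite(200L, TimeUnit.MILLISECONDS)  // hard limit; after this, callers block on the loader
    .build(new Loader());
```

Callers that hit a key older than one tick still get the old value immediately while the reload runs on a background executor.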

Changing libraries won't help, and some features aren't supported elsewhere. I think reviewing the configuration is your best option.

commented

Hi Ben - firstly, thanks for taking the time to comment. Your opinion here is appreciated :)

weakKeys shouldn't be an issue in this case - as the .equals() implementation of T uses reference equality.

However, the low expiry time of the cache means that it's probably not necessary to wrap in weak references - would that help with performance?

The short expiry is necessary - the result of the lookup changes frequently. 50 milliseconds is the duration of one "game tick" - after which it's likely that the result will have changed.

So yeah, I'm not sure what can be done here. For what it's worth, this has never really been flagged as an issue before - I'm thinking that the root of the issue might be the efficiency of the load function.

commented

My opinion is still

There's not really anything I can do to speed up these queries.

You're of course more than welcome to try swapping the cache out for another implementation and seeing what difference it makes. I don't think it will be significant.

commented

Well, thanks for using my caching library. 😀

SubjectProxy#equals seems to be checking if it's the same underlying reference. Can there be two SubjectProxy instances for one LPSubjectReference? If so, then weakKeys is harmful because it doesn't invoke your equals and only uses ==. Otherwise, reads are not in the profile data and weakKeys has only a small impact. I'd probably drop it and rely on the low expiration time, merely for simplicity.

If the expectation is to fetch 20x a second then the cache is only absorbing a few calls within the tick. That makes sense, but will mean it can't be very helpful. If that's the expectation, then unfortunately it sounds like the loader is a necessary bottleneck.

commented

Oh, and staticLookupCache could use Guava's Suppliers.memoizeWithExpiration instead of a key-value cache. That doesn't help, but is a nicer utility for that scenario.
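Guava's utility aside, the core idea of memoizing a single value with an expiry can be sketched in plain Java (a simplified, coarsely synchronized version; Guava's real implementation is more careful about contention):

```java
import java.util.function.Supplier;

// Minimal sketch of a memoizing supplier with expiration,
// in the spirit of Guava's Suppliers.memoizeWithExpiration.
public class ExpiringSupplier<T> implements Supplier<T> {
    private final Supplier<T> delegate;
    private final long ttlNanos;
    private T value;
    private long expiresAtNanos;

    public ExpiringSupplier(Supplier<T> delegate, long ttlNanos) {
        this.delegate = delegate;
        this.ttlNanos = ttlNanos;
    }

    @Override
    public synchronized T get() {
        long now = System.nanoTime();
        // Subtraction-based comparison is safe against nanoTime wraparound
        if (value == null || now - expiresAtNanos >= 0) {
            value = delegate.get();          // recompute once the TTL elapses
            expiresAtNanos = now + ttlNanos;
        }
        return value;
    }
}
```

It could then front whatever computation `staticLookupCache` currently wraps, e.g. `new ExpiringSupplier<>(this::calculateStaticContexts, TimeUnit.MILLISECONDS.toNanos(50))` (the method name there is hypothetical).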

If you are willing to have more fuzziness per-tick, you could use refreshAfterWrite with a larger expiration time. Then the cache will asynchronously try to reload the value when accessed. If a stale read is okay for a few ticks, that might be a good option.

Similarly you could have a thread try to reload the cache entries, so that the caller only invokes the loader when absent. Then you might still want some mechanism to ensure unused entries are eventually evicted, e.g. expireAfterCreate (via a custom Expiry). That has the negative of loading too aggressively, but hides it from the user.
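A sketch of that combination, using Caffeine's `Expiry` interface and the `T`/`Contexts`/`Loader` types from the earlier snippet (the scheduler wiring is left out, and the one-second lifetime is illustrative):

```java
import java.util.concurrent.TimeUnit;

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.Expiry;
import com.github.benmanes.caffeine.cache.LoadingCache;

// Entries expire a fixed time after creation; neither the background
// refresh (an update) nor reads extend their lifetime, so unused
// entries still fall out of the cache.
long ttlNanos = TimeUnit.SECONDS.toNanos(1);
LoadingCache<T, Contexts> lookupCache = Caffeine.newBuilder()
    .expireAfter(new Expiry<T, Contexts>() {
        @Override public long expireAfterCreate(T key, Contexts value, long currentTime) {
            return ttlNanos; // fixed lifetime from first insertion
        }
        @Override public long expireAfterUpdate(T key, Contexts value, long currentTime, long currentDuration) {
            return currentDuration; // a refresh doesn't reset the clock
        }
        @Override public long expireAfterRead(T key, Contexts value, long currentTime, long currentDuration) {
            return currentDuration; // reads don't reset it either
        }
    })
    .build(new Loader());

// A repeating task (e.g. once per tick) could then pre-load hot entries:
// lookupCache.asMap().keySet().forEach(lookupCache::refresh);
```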

commented

Uhh yes, you're right. There can be more than one instance per "actual" Subject in that case, but there would never be more than a few in existence, because the creation of subject instances is cached too. For other platforms, it's guaranteed that there's only one - but anyway...

I'd probably drop it and rely on the low expiration time, merely for simplicity.

I definitely agree with that ^.

If the expectation is to fetch 20x a second then the cache is only absorbing a few calls within the tick. That makes sense, but will mean it can't be very helpful. If that's the expectation, then unfortunately it sounds like the loader is a necessary bottleneck.

It varies a lot - but in some cases (especially in the OP's case), there are likely hundreds of hits per second per subject.

In other cases, there could be 10 hits for a subject in one tick, and then nothing for another second. Caching is definitely beneficial here - but yeah, the usefulness is limited by a potentially slow loader.

The context for all of this is a server which runs primarily off a single thread. Every nanosecond counts, a huge amount of work has to happen within a 50ms interval, and if it doesn't, the gameserver lags and players begin to notice. :)

You're absolutely right about staticLookupCache - thank you for pointing it out. :)

Loading entries asynchronously is a bit tricky. The loader potentially calls code which isn't thread safe, and has to be called from the "main server thread" - so I'm not sure if it's something that would necessarily work.

And finally,

Well, thanks for using my caching library. 😀

It's a pleasure to use it, and I really appreciate your input here. :)