LuckPerms

User Login Event following query not async

Slind14 opened this issue · 7 comments

commented

Hi Luck,

When a user logs in, LuckPerms freezes the server until the data has been received from MySQL. Since MySQL doesn't hold this data in shared memory, this causes delays, especially with external databases and database clusters.
Would you consider working towards making all interactions with the database async?

https://gist.github.com/Slind14/f8688b62404a70e8e952beaa82fda198 (this is how we became aware of it. I'm currently researching the reason for this starvation, but even without this issue the slow MySQL operations are unnecessarily blocking the server)

commented

Nothing there suggests that LP is causing the problem.
I need timings ideally.

Regarding proper async operations, pretty much everything within LuckPerms is executed asynchronously. (even commands, and stuff like that.)

It could just be that other plugins using the permissions API are requesting user data on the main thread before LuckPerms has cached it.

I remain hopeful that this will be fixed with SpongePowered/SpongeAPI#1411, but it's been sitting there for 4 weeks now without much progress. I'm not really sure what I can do to change it on my end. I added those spammy messages so that at least the blame wasn't being put on me all the time, but they were removed in commit 8b60fd0.

commented

hmm, ok.
Is there any way we can resolve this?

  • Cache all data
  • Cache all player UUIDs that have records, plus the data of players who have been online during the last 2 weeks. If a new player without a record logs in, it is covered; if an active player logs in, it is covered; if a player who hasn't played in a long time logs in, they get kicked and told to try again.
  • Kick the player if the data has not been cached, with a message such as "There was an issue with your permission data, please try again in 10 seconds."

zPerms did not have this issue as it cached all the data. So if it couldn't load the data, it couldn't load any data and hence restarted the server. (not perfect, but at least it only happened on server start and didn't freeze up the server during gameplay over and over again)

My issue is that even if there are no permission checks before the async load of the player data, it would still block the server on the next permission check if the load isn't done. With a few plugins this is always going to be a problem, no matter the order of the Sponge events.

commented

This is already the current behaviour.

When a user logs into the server:

  1. Their login is paused. (the client hangs on the "logging in" screen)
  2. LuckPerms loads their data on a separate thread. In this process, raw data is collected from the DB, then deserialised and stored in memory. LuckPerms also completes some pre-processing, and in advance, resolves inheritance trees and other lookups, so when plugins eventually request the processed data, it's ready in a cache to simply be returned.
  3. When the data is loaded, their login continues and they are allowed onto the server.

If the data couldn't be loaded for whatever reason, their login gets cancelled.

This all happens off the main thread, so other server activity continues as normal. Only the user's login thread is paused.
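
The login flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not LuckPerms' actual code: `LoginFlowSketch`, `loadFromStorage` and `handleLogin` are hypothetical names, and the database read is simulated with a short sleep.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LoginFlowSketch {
    // Hypothetical in-memory cache of processed permission data, keyed by player UUID.
    static final Map<UUID, Map<String, Boolean>> CACHE = new ConcurrentHashMap<>();

    // Daemon worker pool standing in for the plugin's storage threads.
    static final ExecutorService STORAGE_POOL = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "storage-worker");
        t.setDaemon(true);
        return t;
    });

    // Simulated slow storage read (assumption: a real one would query the database).
    static Map<String, Boolean> loadFromStorage(UUID id, boolean fail) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        if (fail) {
            throw new IllegalStateException("storage unavailable");
        }
        return Map.of("example.permission", true);
    }

    // Runs on the player's login thread. Only this thread waits for the load;
    // the main server thread is never involved.
    // Returns true if the login may continue, false if it should be cancelled.
    static boolean handleLogin(UUID id, boolean simulateFailure) {
        try {
            Map<String, Boolean> data = CompletableFuture
                    .supplyAsync(() -> loadFromStorage(id, simulateFailure), STORAGE_POOL)
                    .get(5, TimeUnit.SECONDS);
            CACHE.put(id, data); // data is ready before the player joins
            return true;
        } catch (Exception e) {
            return false; // load failed or timed out: cancel the login
        }
    }

    public static void main(String[] args) {
        UUID player = UUID.randomUUID();
        System.out.println("login allowed: " + handleLogin(player, false));
        System.out.println("cached: " + CACHE.containsKey(player));
    }
}
```

The point of the sketch is that the wait happens on the per-connection login thread, so a slow database delays only that one player's "logging in" screen.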

This is the same on Bukkit, Sponge and Bungee. However, on Sponge, other plugins (and Sponge internals) can also request that user data be loaded. This is a problem, because currently most of this is done on the main thread.

The problem comes when Sponge internal systems fire events for offline users. Their data is not cached, and has to be loaded (which eats up tick time). For online users, however, this is never a problem: their data is already cached from when they joined the server.
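
To illustrate the difference between the two cases, here is a minimal sketch, again with hypothetical names (`LookupSketch`, `slowLoad`, `hasPermission`) and a simulated load rather than anything from the LuckPerms or Sponge APIs. A lookup for a cached (online) user returns immediately, while a lookup for an uncached (offline) user has to load on whichever thread asked, which is the main thread if that is where the event fired.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class LookupSketch {
    // Hypothetical cache, filled for online users when they log in.
    static final Map<UUID, Map<String, Boolean>> CACHE = new ConcurrentHashMap<>();

    // Counts how many lookups had to fall back to a blocking load.
    static final AtomicInteger BLOCKING_LOADS = new AtomicInteger();

    // Simulated slow storage read, executed on the calling thread.
    static Map<String, Boolean> slowLoad(UUID id) {
        BLOCKING_LOADS.incrementAndGet();
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return Map.of("example.permission", true);
    }

    // Cache hit: returns straight away. Cache miss: the caller waits for slowLoad.
    static boolean hasPermission(UUID id, String node) {
        Map<String, Boolean> data = CACHE.computeIfAbsent(id, LookupSketch::slowLoad);
        return data.getOrDefault(node, false);
    }

    public static void main(String[] args) {
        UUID online = UUID.randomUUID();
        CACHE.put(online, Map.of("example.permission", true)); // cached at login
        UUID offline = UUID.randomUUID();                      // never logged in

        hasPermission(online, "example.permission");
        System.out.println("blocking loads after online lookup: " + BLOCKING_LOADS.get());
        hasPermission(offline, "example.permission");
        System.out.println("blocking loads after offline lookup: " + BLOCKING_LOADS.get());
    }
}
```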

Hopefully that makes sense. I'm unsure where your issue lies, I really need some sort of timings data. The logs you attached don't even indicate a LuckPerms issue. It could be a different plugin entirely.

commented

Here is a thread dump from one of those login lag spikes:
https://gist.github.com/Slind14/77b90c80fa4e04246cb75002510580db

commented

@lucko Looking at that trace, Sponge fires ClientConnectionEvent.Join, which happens after the async event. It also looks like LP is just reading from the cache but seems to take a while to find the data. Overall, LP should definitely have the user already loaded by this point, judging from what I see here:

https://github.com/lucko/LuckPerms/blob/master/sponge/src/main/java/me/lucko/luckperms/sponge/SpongeListener.java#L58

commented

That trace would have been fixed by #90. Completely separate issue - nothing to do with loading async or login events.

Can you test latest and let me know if it's fixed?

commented

looking good so far.