LuckPerms

Reconnect to database

Slind14 opened this issue · 12 comments

commented

Apparently LuckPerms does not reconnect to the database after the connection is lost, leaving the server in a state where nobody can join.
Any chance you could add an auto-reconnect feature to the MySQL driver? Usually this is provided by the library and only needs to be enabled.

commented

Needs logs showing that. I'm using HikariCP to manage MySQL connections - as far as I know, it will continue to attempt to make connections with your server, even when previous attempts have failed.

commented
2016-12-11 23:37:29 [WARNING] [LuckPerms] Processing login for SirWill took 2395ms.
2016-12-11 23:37:29 [INFO] Disconnecting SirWill [/178.63.3.246:57575]: §7§l[§b§lL§3§lP§7§l] §cPermissions data could not be loaded. Please contact an administrator.

Unfortunately this is all it tells us without adding further debugging. This happens when the database goes down: after starting it back up, players still can't connect. But the permission lookup command still works (reading from cache?).

commented

Hmm, that's odd, because the only way to fix it is to restart the server so it re-establishes the connection.
It's running in a cluster; if a node drops, it takes up to 20 seconds for it to sync back in, so that would be below the 30 seconds.

commented

Okay - well, best thing to do is just prevent the DB from going offline in the first place. I know you obviously can't help it, but you'll likely have issues with other plugins too. Not a very helpful reply I know, but I don't really know what else to suggest.

I'm using (pretty much) the default Hikari settings.

commented

It's the first time we've had this issue. zPerms, Prism and all our custom plugins reconnect just fine (on 30+ servers with this setup for 2 years).

commented

^^^ May help.

The default timeout is 30 seconds, which is a bit long. Now, hopefully, when the DB is down, the connection will fail fast instead of waiting for connections to come back to life.
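
To illustrate the fail-fast idea in isolation, here is a minimal sketch using only the Java standard library. It is not LuckPerms or HikariCP code: `FailFastSketch` and `callWithTimeout` are hypothetical names, and the lambdas merely stand in for a connection attempt. In HikariCP itself the equivalent knob is, as far as I know, the `connectionTimeout` setting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration only; not part of LuckPerms or HikariCP.
final class FailFastSketch {

    // Runs the task and waits at most timeoutMillis for its result.
    // Throws TimeoutException if the task has not finished by then.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fail-fast-sketch");
            t.setDaemon(true); // a stuck attempt must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck attempt
            throw e;
        } catch (ExecutionException e) {
            // Unwrap so callers see the task's own failure.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a healthy database: returns immediately.
        System.out.println(callWithTimeout(() -> "connected", 1000));

        // Stand-in for a database that is down: would block for 30 seconds.
        try {
            callWithTimeout(() -> {
                Thread.sleep(30_000);
                return "connected";
            }, 200);
        } catch (TimeoutException e) {
            System.out.println("gave up after 200 ms");
        }
    }
}
```

With a short timeout the caller gets an error quickly and can retry on the next login, rather than holding every login for the full 30 seconds.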

You can see my settings here:
https://github.com/lucko/LuckPerms/blob/master/common/src/main/java/me/lucko/luckperms/common/storage/backing/MySQLBacking.java#L64-L80

Prism uses the same CP.
https://github.com/prism/Prism/blob/master/src/main/java/com/helion3/prism/storage/mysql/MySQLStorageAdapter.java

I'm open to any suggestions you may have. :)

commented

The Prism version we use relies on Tomcat JDBC. I'll test the version with the reduced timeout and let you know my findings.

commented

What's probably happening is that those alive connections are waiting the full extent of their timeout (30 seconds) before being destroyed by the pool. I'm certainly no expert though - I leave all that stuff down to the author of Hikari.

I'm not really sure that I'm doing (or rather not doing) anything special. I did a bit of searching - there's nothing obvious I can enable to fix this issue.

I guess, just make sure your database servers are stable. This is certainly the first time I've seen this sort of problem.

commented

Looks like the timeout setting is not supported :(
https://gist.github.com/Slind14/da63d19bcd15654ab713ca2bb6446427

commented

I'm not getting that error with the latest version.

I tested with an invalid host and, again, it failed to connect. Then I tested with invalid credentials, and the init failed quickly as intended.

Reopening - can you confirm if the PR above fixed this issue for you?

commented

Just updated, I'll keep you updated.

commented

No more issues so far :)