LuckPerms

[1.10.2] Connection leak detection

Eufranio opened this issue · 15 comments

commented

Randomly it spams the console with:

[23:50:00] [luckperms housekeeper/WARN] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask]: Connection leak detection triggered for me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@5fd73187, stack trace follows
java.lang.Exception: Apparent connection leak detected
        at me.lucko.luckperms.common.storage.methods.MySQLDatastore.getConnection(MySQLDatastore.java:139) ~[MySQLDatastore.class:?]
        at me.lucko.luckperms.common.storage.methods.MySQLDatastore.runQuery(MySQLDatastore.java:114) ~[MySQLDatastore.class:?]
        at me.lucko.luckperms.common.storage.methods.SQLDatastore.runQuery(SQLDatastore.java:97) ~[SQLDatastore.class:?]
        at me.lucko.luckperms.common.storage.methods.SQLDatastore.loadAllGroups(SQLDatastore.java:309) ~[SQLDatastore.class:?]
        at me.lucko.luckperms.common.storage.methods.MySQLDatastore.loadAllGroups(MySQLDatastore.java:36) ~[MySQLDatastore.class:?]
        at me.lucko.luckperms.common.storage.SplitDatastore.loadAllGroups(SplitDatastore.java:106) ~[SplitDatastore.class:?]
        at me.lucko.luckperms.common.runnables.UpdateTask.run(UpdateTask.java:44) ~[UpdateTask.class:?]
        at org.spongepowered.api.scheduler.Task$Builder.lambda$execute$0(Task.java:138) ~[Task$Builder.class:1.10.2-2098-5.1.0-BETA-1814]
        at org.spongepowered.common.scheduler.SchedulerBase.lambda$startTask$0(SchedulerBase.java:177) ~[SchedulerBase.class:1.10.2-2098-5.1.0-BETA-1814]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_91]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

Running LuckPermsSponge-2.12.20, on a 1.10.2 Sponge server.
SpongeForge version: 1.10.2-2098-5.1.0-BETA-1814
Forge version: 1.10.2-12.18.2.2107
Another error:

 java.sql.SQLTransientConnectionException: luckperms - Connection is not available, request timed out after 30000ms.
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.lib.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:548)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.lib.hikari.pool.HikariPool.getConnection(HikariPool.java:186)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.lib.hikari.pool.HikariPool.getConnection(HikariPool.java:145)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.lib.hikari.HikariDataSource.getConnection(HikariDataSource.java:83)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.methods.MySQLDatastore.getConnection(MySQLDatastore.java:139)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.methods.MySQLDatastore.runQuery(MySQLDatastore.java:114)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.methods.SQLDatastore.runQuery(SQLDatastore.java:97)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.methods.SQLDatastore.loadAllTracks(SQLDatastore.java:407)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.methods.MySQLDatastore.loadAllTracks(MySQLDatastore.java:36)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.storage.SplitDatastore.loadAllTracks(SplitDatastore.java:131)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at me.lucko.luckperms.common.runnables.UpdateTask.run(UpdateTask.java:51)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at org.spongepowered.api.scheduler.Task$Builder.lambda$execute$0(Task.java:138)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at org.spongepowered.common.scheduler.SchedulerBase.lambda$startTask$0(SchedulerBase.java:177)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[23:51:29] [pool-2-thread-37/INFO] [STDERR]: [me.lucko.luckperms.common.storage.methods.MySQLDatastore:runQuery:125]:   at java.lang.Thread.run(Thread.java:745)
commented

Could you please pastebin your entire latest.log?

commented

It happened when I tried to set up the default permissions of Nucleus using /nucleus setupperms ADMIN admin, which could have overloaded the MySQL connection. You could add a delay and a cache for the permission requests, or only push to the database after a minute or so...

commented

Yeah. The logs would suggest that.

[23:59:25] [pool-2-thread-6/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@53ebefc9 was returned to the pool (unleaked)
[23:59:25] [pool-2-thread-8/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@4dd787d5 was returned to the pool (unleaked)
[23:59:25] [pool-2-thread-4/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@5c3b866d was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-9/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@39a11e78 was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-10/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@59d28cda was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-12/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@487b7cf was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-11/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@2bc1097e was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-17/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@79b04134 was returned to the pool (unleaked)
[23:59:26] [pool-2-thread-15/INFO] [me.lucko.luckperms.lib.hikari.pool.ProxyLeakTask/luckperms]: Previously reported leaked connection me.lucko.luckperms.lib.mysql.jdbc.JDBC4Connection@5fd73187 was returned to the pool (unleaked)

However, they were returned a good 10 minutes later?? (not good)

I have one other user who is also having similar issues. I cannot work out why the hell it's happening.
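
For context, this warning on its own doesn't mean the connection was lost: Hikari's leak detector simply fires when a connection has been checked out of the pool for longer than a configured threshold, which is why the "unleaked" messages appear once they're finally returned. The threshold is set when the pool is built, along these lines (illustrative values only, not LuckPerms' actual settings; LuckPerms shades HikariCP under me.lucko.luckperms.lib.hikari, the upstream package is com.zaxxer.hikari):

    import java.util.concurrent.TimeUnit;

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    public class PoolSetup {
        public static HikariDataSource createPool() {
            HikariConfig config = new HikariConfig();
            config.setJdbcUrl("jdbc:mysql://db.example.com:3306/luckperms"); // placeholder URL
            config.setUsername("user");
            config.setPassword("pass");
            config.setMaximumPoolSize(10);       // illustrative
            config.setConnectionTimeout(30_000); // the 30000ms timeout in the second log
            // fire "Apparent connection leak detected" when a connection is held
            // out of the pool for longer than this:
            config.setLeakDetectionThreshold(TimeUnit.SECONDS.toMillis(10));
            return new HikariDataSource(config);
        }
    }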

Where is your MySQL server located? Same DC as your server, or on the other side of the world? Is it a rubbish(y) shared hosting one, or is it reliable?

commented

Either way, I'll definitely implement your suggestion RE: waiting before saving.

If you're just using commands, there shouldn't be an issue, however I can understand there may be issues when other plugins set things automatically.

commented

I have a dedicated server for my hosting stuff (website, web scripts, MySQL) and a separate dedicated game server to run my Minecraft servers and other heavy stuff. The Sponge server runs on the game server, and I use the external IP to connect to the MySQL server.

I can reproduce it easily. After restarting the server, I can use the normal commands and everything is written to the database, but when I use /nucleus setupperms ADMIN admin, the error happens. In that first log, the connections were returned 10 minutes later because that's how long the server took to create the admin group. Once the group was created and I tried to set up the perms, it threw the errors instantly.

commented

Can I test that new build?

commented

Okay.

Well, I've cleaned up the #loadAllGroups / #loadAllTracks methods. There could have been some issues there. The logs you sent didn't actually suggest any sort of issue with saving groups; it was the #loadAllGroups method that was mentioned in the stacktrace.

(Although that method does actually get called: https://github.com/lucko/LuckPerms/blob/master/sponge/src/main/java/me/lucko/luckperms/sponge/service/LuckPermsSubjectData.java#L62)
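
The general shape of that cleanup is just making sure every connection, statement and result set goes back to the pool promptly, i.e. try-with-resources around the whole query. A rough sketch of the pattern (hypothetical table/column names, not the actual LuckPerms code):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;
    import javax.sql.DataSource;

    public class GroupLoader {
        private static final String QUERY = "SELECT name FROM luckperms_groups"; // hypothetical table
        private final DataSource dataSource;

        public GroupLoader(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        public List<String> loadAllGroupNames() throws SQLException {
            List<String> names = new ArrayList<>();
            // try-with-resources returns the connection to the pool even if the
            // query throws, so the leak detector never has anything to report
            try (Connection c = dataSource.getConnection();
                 PreparedStatement ps = c.prepareStatement(QUERY);
                 ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
            return names;
        }
    }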

Anyway, you've raised some really good points, so thank you for doing so. I'll look into these issues as a matter of priority.

My thoughts are:

  • Whether by command or through the API, whenever a user/group gets saved, they get put into a queue to be saved in 5 seconds' time.
  • If they're already in the queue, the countdown just gets reset to 5 seconds.
  • When the queue time reaches 0, they actually get saved to the datastore.

That should prevent any issues with automated modifications, etc. :)
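
Something along these lines, perhaps (a minimal sketch of the idea, not the actual implementation):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.ScheduledFuture;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Consumer;

    // Debounced save buffer: an object is saved a fixed delay after the *last*
    // change was pushed, rather than on every change.
    public class SaveBuffer<T> {
        private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        private final Map<T, ScheduledFuture<?>> pending = new ConcurrentHashMap<>();
        private final Consumer<T> saveFunction; // actually writes to the datastore
        private final long delaySeconds;

        public SaveBuffer(Consumer<T> saveFunction, long delaySeconds) {
            this.saveFunction = saveFunction;
            this.delaySeconds = delaySeconds;
        }

        // Queue an object for saving; resets the countdown if it's already queued.
        public void push(T object) {
            pending.compute(object, (obj, existing) -> {
                if (existing != null) {
                    // best-effort: a task that has already started can't be
                    // cancelled, so a rare double save is possible
                    existing.cancel(false);
                }
                return scheduler.schedule(() -> {
                    pending.remove(obj);
                    saveFunction.accept(obj); // countdown hit zero: save for real
                }, delaySeconds, TimeUnit.SECONDS);
            });
        }
    }

The compute call keeps the re-scheduling atomic per object, so concurrent pushes for the same user/group can't race each other.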

commented

I have a really crappy MySQL server which I use to test the plugin on. (My thinking being: if this server can handle it, anything can.)

So I'll try to reproduce the issue on that, using the Nucleus auto setup feature.

Anyway, I'll look into this tomorrow. (it's pretty late here.) Thanks again for your help.

commented

Okay, thank you for your help!

commented

I've made a bit of progress. (I'm probably like 70% done.)

The root cause of the issue is actually much harder to solve than you might think. Hopefully I'll have it done at some point tomorrow. 🙂

commented

Ok, I've pretty much finished it.

Seems to be working as intended: https://gist.github.com/lucko/664dfc53868cd65a641991dc1bf7a8d6

Basically: instead of saving right away, the plugin will wait 3 seconds for any other changes to be pushed; as long as no further changes arrive within that window, it will actually save the data to storage after the 3 seconds are up. The wait time resets to 3 seconds each time another change gets pushed.

It's certainly not stable at the moment, but it does seem to fix the issue.
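
In terms of the sketch I posted above, usage would look something like this (hypothetical API, with the delay set to the 3 seconds described here):

    SaveBuffer<String> buffer = new SaveBuffer<>(group -> System.out.println("saving " + group), 3);
    buffer.push("admin");
    buffer.push("admin"); // resets the countdown to 3 seconds
    buffer.push("admin"); // still only one actual save, ~3s after this push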

The remaining issues are:

  • If, for whatever reason, the object gets loaded whilst it's sitting in the buffer waiting to be saved, the changes made will be overridden.
  • Probably a bunch of other stuff I've overlooked. (This change has caused a lot of pretty major internal changes, so there's bound to be breakages.)

Anyway, feel free to try the CI build. Let me know if you have any issues.

(I'll leave this issue open until the remaining issues mentioned above are resolved.)

commented

Seems stable. Feel free to reopen if you have any problems.

commented

It's working flawlessly now, thanks! Tested on a production server.

commented

Excellent! Good to hear. 😄