Moving from 1.17.1 to 1.18.1 and stuck on "Loading and validating shops, please wait..."
TomLewis opened this issue ยท 10 comments
Im trying to move from PurPur 1.17.1 to Purpur 1.18.1 and SignShop gets stuck on " Loading and validating shops, please wait..." in the console, been here about 10 mins now, is there something that happens with this? It does not seem to be progressing at all.
I left it on and it finally died and gave an error in the log.
[17:57:24] [Server thread/INFO]: [SignShop] Enabling SignShop v3.6.2
[17:57:25] [Server thread/INFO]: [SignShop] Checking data version.
[17:57:25] [Server thread/INFO]: [SignShop] Your data is current.
[17:57:25] [Server thread/INFO]: [SignShop] Loading and validating shops, please wait...
[17:57:30] [Thread-34/WARN]: Failed to load player data for 5dc4ffe3-722e-4f44-a41a-f00cd5c212e2
[17:58:10] [Thread-34/WARN]: Failed to load player data for a5c149e2-71bf-4b30-8c38-740fa8ea786e
[17:58:17] [Thread-34/WARN]: Failed to load player data for 57f49c77-157f-432d-9213-b7f6aa06b4a7
[18:05:29] [Server thread/ERROR]: Error occurred while enabling SignShop v3.6.2 (Is it up to date?)
java.lang.OutOfMemoryError: Java heap space
at it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap.rehash(Long2ObjectOpenHashMap.java:1331) ~[fastutil-8.5.6.jar:?]
at it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap.insert(Long2ObjectOpenHashMap.java:279) ~[fastutil-8.5.6.jar:?]
at it.unimi.dsi.fastutil.longs.Long2ObjectOpenHashMap.put(Long2ObjectOpenHashMap.java:287) ~[fastutil-8.5.6.jar:?]
at net.minecraft.world.level.chunk.storage.SectionStorage.readColumn(SectionStorage.java:165) ~[?:?]
at net.minecraft.world.level.chunk.storage.SectionStorage.loadInData(SectionStorage.java:148) ~[?:?]
at net.minecraft.server.level.ChunkMap.lambda$scheduleChunkLoad$27(ChunkMap.java:1127) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.level.PlayerChunkMap$$Lambda$5151/0x0000000801dee8f8.apply(Unknown Source) ~[?:?]
at net.minecraft.server.level.ChunkMap.lambda$scheduleChunkLoad$28(ChunkMap.java:1164) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.level.PlayerChunkMap$$Lambda$5425/0x0000000801e6a6e8.run(Unknown Source) ~[?:?]
at com.destroystokyo.paper.io.chunk.ChunkTaskManager.pollChunkWaitQueue(ChunkTaskManager.java:246) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.level.ServerChunkCache$MainThreadExecutor.pollTask(ServerChunkCache.java:1286) ~[?:?]
at net.minecraft.util.thread.BlockableEventLoop.managedBlock(BlockableEventLoop.java:132) ~[?:?]
at net.minecraft.server.level.ServerChunkCache.getChunk(ServerChunkCache.java:674) ~[?:?]
at net.minecraft.world.level.Level.getChunk(Level.java:680) ~[?:?]
at net.minecraft.world.level.Level.getBlockState(Level.java:1014) ~[?:?]
at org.bukkit.craftbukkit.v1_18_R1.block.CraftBlock.getNMS(CraftBlock.java:73) ~[purpur-1.18.1.jar:git-Purpur-1489]
at org.bukkit.craftbukkit.v1_18_R1.block.CraftBlock.getBlockData(CraftBlock.java:160) ~[purpur-1.18.1.jar:git-Purpur-1489]
at org.wargamer2010.signshop.util.itemUtil.clickedSign(itemUtil.java:403) ~[SignShop-3.6.2.jar:?]
at org.wargamer2010.signshop.configuration.Storage.loadSellerFromSettings(Storage.java:169) ~[SignShop-3.6.2.jar:?]
at org.wargamer2010.signshop.configuration.Storage.Load(Storage.java:243) ~[SignShop-3.6.2.jar:?]
at org.wargamer2010.signshop.configuration.Storage.<init>(Storage.java:61) ~[SignShop-3.6.2.jar:?]
at org.wargamer2010.signshop.configuration.Storage.init(Storage.java:78) ~[SignShop-3.6.2.jar:?]
at org.wargamer2010.signshop.SignShop.onEnable(SignShop.java:148) ~[SignShop-3.6.2.jar:?]
at org.bukkit.plugin.java.JavaPlugin.setEnabled(JavaPlugin.java:264) ~[purpur-api-1.18.1-R0.1-SNAPSHOT.jar:?]
at org.bukkit.plugin.java.JavaPluginLoader.enablePlugin(JavaPluginLoader.java:370) ~[purpur-api-1.18.1-R0.1-SNAPSHOT.jar:?]
at org.bukkit.plugin.SimplePluginManager.enablePlugin(SimplePluginManager.java:500) ~[purpur-api-1.18.1-R0.1-SNAPSHOT.jar:?]
at org.bukkit.craftbukkit.v1_18_R1.CraftServer.enablePlugin(CraftServer.java:564) ~[purpur-1.18.1.jar:git-Purpur-1489]
at org.bukkit.craftbukkit.v1_18_R1.CraftServer.enablePlugins(CraftServer.java:478) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.MinecraftServer.loadWorld0(MinecraftServer.java:732) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.MinecraftServer.loadLevel(MinecraftServer.java:508) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.dedicated.DedicatedServer.initServer(DedicatedServer.java:349) ~[purpur-1.18.1.jar:git-Purpur-1489]
at net.minecraft.server.MinecraftServer.runServer(MinecraftServer.java:1218) ~[purpur-1.18.1.jar:git-Purpur-1489]
[18:05:29] [Server thread/INFO]: [SignShop] Disabling SignShop v3.6.2
[18:05:29] [Server thread/INFO]: [SignShop] Successfully cancelled async sellers.yml save task with ID: 642
[18:05:29] [Server thread/INFO]: [SignShop] Disabled
This is a dedicated machine with 64GB Ram, the server has 10GB set to it, as I've read many times that going over 10gb isn't a good thing to do, this is just me on the server with no players remember while im testing! Theres no way signshop could fill 10gb!
I would think that 10GB would be enough unless you have lots of worlds and/or maybe with that many shops spread out over a large world(s) it could be loading enough chunks to burn through 10GB. Also if you have not done --forceUpgrade
each time a chunk is loaded it is upgraded which I would bet uses more RAM than simply loading it. Maybe --forceUpgrade
your server before loading any/many plugins?
I only have Spark/Timings, but if you can explain to me what to do I can follow along easy enough.
There are some profilers that can look at the VM from the outside such as VisualVM, however if you don't already know the basics I am afraid that my skills with it are too lacking to explain it and will likely be more of a pain than it will be worth.
Yup! After you fixed that bug with permissions everything was A-ok on 1.17! So not sure what the heck happened with 1.18.1.
1.18 is way more resource intensive starting with the fact that chunk height grew by 50%.
Well, you have me to test things, and I have one hell of a data set!
I definitely want a copy of that.
Isn't this what a database is for? MYSQL it up! (After a hotfix to get my 1.18 up haha!)
Although I am not well versed in DBs I am sure that Lists and Maps within a program will always be faster than external I/O especially for servers that use DBs on separate machines. I will admit there are other features of MYSQL that would be beneficial to SignShop.
Kind of.. Ive been on 1.12.2 for year and this Log4j finally made me pull the plug and take it offline to convert to 1.17 and now im testing 1.18, on the live box.. so the servers been whitelsited 3 weeks and I'm basically at the stage ready to open it up again, I just have a few plugins that need fixing, I'm so close I can taste it! I'm hoping when I re-open Ive not lost my entire playerbase of 9 years. Just be better to open with expanded world on 1.18 after 3 weeks downtime with all that fresh new terrain, either way everyones coming from 1.12.2 world so its going to be an upgrade!
Good luck, players can be quite unforgiving.
Correct! I have to kill it! it does eventually crash and restart itself to do it again! Is the heap space set in the java flags? I use aikars. I'm not entirely sure what I need to change, a guess would be -XX:+UseG1GC
-Xms
and -Xmx
are your minimum and maximum heap sizes respectively. As for all of the others I am sure Aikar knows way more about it than I do, just make sure you are using his recommendations for 1.18 servers as I think they have changed over time. I have also heard that you should not set the heap larger than 10GB but I believe that is for when you are not utilizing a significant portion of the heap. You end up with lag spikes due to the GC collecting garbage in blocks. With that being said I think not getting your server to start in the first place is worse than occasional lag spikes. You may try increasing your heap a few GB at a time until it at least starts and you can join. Just make sure to leave RAM for your OS and other applications. Then we can start optimizing.
Another question, when you're checking these signs, you're not loading the chunks are you? Because the conversion process from 1.17 to 1.18 chunks is telling me its going to take 33 hours with chunky!
SignShop is not explicitly loading or generating chunks during this time. But during the validation it does check if a location still has the sign etc. which will need the server to load the chunk in order to get that location's block data. Is Chunky actually converting existing chunks to the new format? Or is it just generating a defined area? Once again --forceUpgrade
may help with this.
All that was added were some logged messages to explain why the server was "hanging". What is done during that part of the loading process has not had a significant change in years. You can see the changes in this 'blame' starting here: https://github.com/wargamer/SignShop/blame/master/src/main/java/org/wargamer2010/signshop/configuration/Storage.java#L220
Example before:
[05.01 11:44:07] [Server] [Server thread/INFO]: [SignShop] Enabling SignShop v3.6.2
[05.01 11:44:08] [Server] [Server thread/INFO]: [SignShop] Checking data version.
[05.01 11:44:08] [Server] [Server thread/INFO]: [SignShop] Your data is current.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Vault's permissions not found, defaulting to OP.
[05.01 11:44:11] [Server] [Server thread/WARN]: [SignShop] No compatible worth plugin found, [Worth] disabled.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Thank you for enabling metrics!
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Debugging enabled.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] v3.6.2 Enabled
Why was there a 3 second pause AFTER the data version was checked?
After:
[05.01 11:44:07] [Server] [Server thread/INFO]: [SignShop] Enabling SignShop v3.6.2
[05.01 11:44:08] [Server] [Server thread/INFO]: [SignShop] Checking data version.
[05.01 11:44:08] [Server] [Server thread/INFO]: [SignShop] Your data is current.
[05.01 11:44:08] [Server] [Server thread/INFO]: [SignShop] Loading and validating shops, please wait...
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Loaded 164 valid shops.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Vault's permissions not found, defaulting to OP.
[05.01 11:44:11] [Server] [Server thread/WARN]: [SignShop] No compatible worth plugin found, [Worth] disabled.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Thank you for enabling metrics!
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] Debugging enabled.
[05.01 11:44:11] [Server] [Server thread/INFO]: [SignShop] v3.6.2 Enabled
Now you can see that there was a pause because it was loading and validating shops, taking 3 seconds to check and load 164 shops. You have many many more shops so it will take way longer but 10 minutes does seem excessive. Do you get the "Your data is current." message? I know you are loading multiple worlds and if you have shops in all of them especially if you have shops that span across multiple worlds you could be loading lots of chunks as well and running out of memory. <--- just finished typing that when you sent the stacktrace.
1.18 uses a TON of memory for worlds.
Signshop has always taken a long time to load signs, its the single plugin that takes the longest at boot to finish, but moving from 1.17 to 1.18 just ends up dying, I edited my message with java.lang.OutOfMemoryError: Java heap space
that it finally outputs.
I had to run the server a few times before it finally gave me that error, so its not consistent.
my sellers.yml
is 16mb, my transaction log is 712mb, to say we have a lot of shop signs is an understatement haha!
But it usually always finishes even if it does take 2-3 mins, were on NVME so it gets there in the end.
I think you may have missed my edit where I added the log, because that includes the "Your data is current."!
I haven't converted the chunks on my 1.18 world yet, Ive just booted up the server to see what plugins break/get stuck with 1.18.
Could the validating be moved to when players interact instead? or even when chunks are loaded by players.
Signshop has always taken a long time to load signs, its the single plugin that takes the longest at boot to finish,
That is why I added the loading messages
I had to run the server a few times before it finally gave me that error, so its not consistent.
The server is consistently hanging though and never fully loads right? OutOfMemoryError will result in a crash unless the GC just happens to free enough memory in time. Either way you should be able to increase the heap space to prevent the OOME. Unless you do not have more RAM.
to say we have a lot of shop signs is an understatement haha!
Ya lol
But it usually always finishes even if it does take 2-3 mins
That is about what I would expect
I think you may have missed my edit where I added the log, because that includes the "Your data is current."!
I saw your edit that is why I typed the following, just didn't bother editing it.
<--- just finished typing that when you sent the stacktrace.
I haven't converted the chunks on my 1.18 world yet, Ive just booted up the server to see what plugins break/get stuck with 1.18.
Please tell me this is a test server and not your live server? Or you at least have backups?
Could the validating be moved to when players interact instead?
Not only would this be a lot of work, I believe it would cause lag. It is not just validating, it is loading the shops into memory so when a player interacts with ANY sign it does not have to search a file to see if it is a shop and if so what kind etc.
or even when chunks are loaded by players.
This may be possible except there may be issues when shops span across multiple chunks or even worlds. This would take lots re-work and testing which won't happen very quickly since I work full time and have a family. If something needs to change with SignShop to make it better for 1.18 on large scale servers I am willing but just not sure of the timeline.
At this stage I believe more information is needed about the problem and possible solutions. First I would like to know how much ram is allocated to the server and does increasing it help even if it takes a ridiculous amount? Do you have Java profiling tools at your disposal? And I assume you got everything working ok with 1.17.1 regarding SignShop?
At this stage I believe more information is needed about the problem and possible solutions. First I would like to know how much ram is allocated to the server and does increasing it help even if it takes a ridiculous amount? Do you have Java profiling tools at your disposal? And I assume you got everything working ok with 1.17.1 regarding SignShop?
This is a dedicated machine with 64GB Ram, the server has 10GB set to it, as I've read many times that going over 10gb isn't a good thing to do, this is just me on the server with no players remember while im testing! Theres no way signshop could fill 10gb!
For some context, Ive used 10gb for years and Ive had up to 100 players on 1.12.2, not sure how 1.17/1.18 will handle, ive already dropped my player count to 50!
Do you have Java profiling tools at your disposal?
I only have Spark/Timings, but if you can explain to me what to do I can follow along easy enough.
And I assume you got everything working ok with 1.17.1 regarding SignShop?
Yup! After you fixed that bug with permissions everything was A-ok on 1.17! So not sure what the heck happened with 1.18.1.
If something needs to change with SignShop to make it better for 1.18 on large scale servers I am willing but just not sure of the timeline.
Well, you have me to test things, and I have one hell of a data set! I don't think its ever been tested with this size of data that I use, my servers 9+ years old and Ive used signshop since the very early days! Since there is no purge of old signs from inactive players, theres bound to be a lot of old signs.
Not only would this be a lot of work, I believe it would cause lag. It is not just validating, it is loading the shops into memory so when a player interacts with ANY sign it does not have to search a file to see if it is a shop and if so what kind etc.
Isn't this what a database is for? MYSQL it up! (After a hotfix to get my 1.18 up haha!)
Please tell me this is a test server and not your live server? Or you at least have backups?
Kind of.. Ive been on 1.12.2 for year and this Log4j finally made me pull the plug and take it offline to convert to 1.17 and now im testing 1.18, on the live box.. so the servers been whitelsited 3 weeks and I'm basically at the stage ready to open it up again, I just have a few plugins that need fixing, I'm so close I can taste it! I'm hoping when I re-open Ive not lost my entire playerbase of 9 years. Just be better to open with expanded world on 1.18 after 3 weeks downtime with all that fresh new terrain, either way everyones coming from 1.12.2 world so its going to be an upgrade!
The server is consistently hanging though and never fully loads right? OutOfMemoryError will result in a crash unless the GC just happens to free enough memory in time. Either way you should be able to increase the heap space to prevent the OOME. Unless you do not have more RAM.
Correct! I have to kill it! it does eventually crash and restart itself to do it again! Is the heap space set in the java flags? I use aikars.
my GB is set in variables auto added by my wrapper
java.cli.X.ms=10G
java.cli.X.mx=10G
Then the rest is:
java.cli_extra=-XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:MaxGCPauseMillis=200 -XX:+UnlockExperimentalVMOptions -XX:+DisableExplicitGC -XX:+AlwaysPreTouch -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4 -XX:G1MixedGCLiveThresholdPercent=90 -XX:G1RSetUpdatingPauseTimePercent=5 -XX:SurvivorRatio=32 -XX:+PerfDisableSharedMem -XX:MaxTenuringThreshold=1 -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=40 -XX:G1HeapRegionSize=8M -XX:G1ReservePercent=20 -XX:InitiatingHeapOccupancyPercent=15 -Dusing.aikars.flags=https://mcflags.emc.gs -Daikars.new.flags=true -jar
I'm not entirely sure what I need to change, a guess would be -XX:+UseG1GC
Another question, when you're checking these signs, you're not loading the chunks are you? Because the conversion process from 1.17 to 1.18 chunks is telling me its going to take 33 hours with chunky!
SignShop is not explicitly loading or generating chunks during this time. But during the validation it does check if a location still has the sign etc. which will need the server to load the chunk in order to get that location's block data. Is Chunky actually converting existing chunks to the new format? Or is it just generating a defined area? Once again --forceUpgrade may help with this.
So it is loading chunks then? well if SignShop is trying to load thousands and thousands of Signs in chunks that are not yet converted, that may be the issue! I hope so because its then simple.
yes chunky loads and converts chunks, its the reason I have it running. You should not use --forceUpgrade
its no longer recommended to do this as the process can be very destructive, including emptying chests. You cant even do this with PurPur because of how dangerous it is now, it was removed. The advice is to use Chunky/WorldBorder, but Chunky is much faster. I say faster, but Its going to take another 18 hours until my world chunks have been converted, from a total of 33 hours. I will 100% be trying Signshop again once its converted all chunks! (Lets hope that's the issue!)
I also had an issue with Brewing plugin breaking the server at boot, I assume because that also loads chunks for its barrels.
make sure you are using his recommendations for 1.18 servers
I didnt even think to check if it changed from 1.17! Good shout!
Although I am not well versed in DBs I am sure that Lists and Maps within a program will always be faster than external I/O especially for servers that use DBs on separate machines. I will admit there are other features of MYSQL that would be beneficial to SignShop.
I have asked in the past to have all transactions in a database, so then they can also be queried in-game at a later date.
Imagine logging off for the night, logging in the next morning and SignShop is able to give you an overview of your sales because its able to do an SQL query from when you were last online sales.
There has to be a better loading method than load the entire lot in one go at boot, can this not be done Async once the servers up? Pass it over to another CPU code or something?
I definitely want a copy of that.
The sellers file? I can send it in a DM on discord or something? I'm GodsDead#4614
if you wanted to send a friend request.
1.18 is way more resource intensive starting with the fact that chunk height grew by 50%.
There's lots, and lots of optimisations to tackle this in PurPur and with separating entity loading per player instead of whole server now, at present there's nobody on the server, I just need it booting with SignShop, I really, really, really do not want to have to find a new Shop plugin and get thousands of players to re-build their shops.
There just has to be a better way than loading every single sign and validating it. What is the validating actually doing? Why is it required to validate? Is there a way we could at the very least have an option to load the server without validating so then we can run a cleanup option that async runs through all signs one by one and checks if a players been offline X time so then its removed to cleanup old signs.
So it is loading chunks then? well if SignShop is trying to load thousands and thousands of Signs in chunks that are not yet converted, that may be the issue! I hope so because its then simple...
I hope so too it would make this much less time sensitive.
...You should not use --forceUpgrade its no longer recommended to do this as the process can be very destructive, including emptying chests. You cant even do this with PurPur because of how dangerous it is now, it was removed. The advice is to use Chunky/WorldBorder, but Chunky is much faster... ... (Lets hope that's the issue!)...
I have not heard about these issues but I am not surprised. It probably also does not play well with Paper/PaperForks that mess with async chunk loading. However --forceUpgrade or equivalent probably should have been done during the 1.12-1.13 upgrade. But too late now of course.
I will 100% be trying Signshop again once its converted all chunks! (Lets hope that's the issue!)
This will greatly determine how we proceed.
I have asked in the past to have all transactions in a database...
I agree that would be great and would like to work with you on that but since that is a separate process it won't help in this situation of course.
There has to be a better loading method than load the entire lot in one go at boot, can this not be done Async once the servers up? Pass it over to another CPU code or something?
I think the reason for this is so that there is no way for a player to interact with a sign before SignShop is ready to determine if it is a shop or not. From what I can tell this is a very core part of the plugin that will require great effort and though to change. Remember I did not write this and it was written at a time when there wasn't so much overhead from the server chunks themselves etc. I was not expecting there to be such a drastic change between versions either and you seem to be the only one who has had this issue, due to the age of your server of course. Running this async could allow players to join and not be able to use shops or they may destroy shops before it is loaded. Or there could be lag when a player interacts with a shop while it loads and checks. Remember not everyone has a dedi with NVME drives. Not to mention this code base is a decade old and was not written with async in mind. Doing it now during a time crunch is less than ideal IF it is even a good idea.
The sellers file? I can send it in a DM on discord or something?
Yes and the other contents of the SignShop folder, I will send you a request.
There's lots, and lots of optimisations to tackle this in PurPur...
Yet they still weren't enough ;)
I really, really, really do not want to have to find a new Shop plugin
I don't want you to have to either. I am also willing to make and give you a dev build that would work in the mean time but I also want to be sure that it is a fix in the right direction.
There just has to be a better way than loading every single sign and validating it. What is the validating actually doing? Why is it required to validate? Is there a way we could at the very least have an option to load the server without validating so then we can run a cleanup option that async runs through all signs one by one and checks if a players been offline X time so then its removed to cleanup old signs.
I think continuing to cache all of the shops during startup is the right move but perhaps validating them later is an option. From what I can tell it validates by checking if the sign, chests, etc, still exist in the world and if not the shop is removed. They may not exist due to Worldedit, Skyblock, Multiverse and other plugins that my remove blocks or worlds at any given time. Cleanup of old shops would be a nice addition but won't help this case, I will keep the idea in mind.
I have already experimented with trying to unload chunks after the validation but I never could get it to actually work. I guess for now I will wait to see what you say about running SignShop after the Chunky upgrades.
Yup I was right! Converting the chunks did fix it! Thank god.
So if this happens to someone else, they will need to convert all their chunks to the latest version of Minecraft, incase anyone finds this thread the way you do this is to download Chunky
Remove signshop, Add chunky, reboot server.
Then use:
/chunky world <worldname>
/chunky shape circle
/chunky center <x> <z>
/chunky radius <radius>
/chunky start
- Do this for all your worlds you have signs used in.
- Wait until its finished, Re-add signshop then reboot.
Awesome! I was hoping this was the case and it would get it working for you. Now I can take my time and focus on a more stable permanent fix.