O(1) set.contains check for new strings are pretty slow
CaliforniaDemise opened this issue ยท 2 comments
Environment Details
Mod Version: From oldest version to latest.
Game Version: From oldest game version to latest.
Loader Type: All
Loader Version: All
Game Type: All
The Issue
In StageData class (or PlayerData#L179 on 1.11.2 gamestages creates lowercase strings everytime any mod or crafttweaker checks for stages, this is a problem because string hash check is a slow progress.
Java caches the hash after hashCode is first called which significantly speeds up process when you do it again but it doesn't speed up in this case because toLowerCase creates a new string which gets discarded after stage is checked.
My suggestions are:
- Handling toLowerCase when the stage is set instead of everytime it has been checked. This for example would speed up the JourneyMap Stages. You can remove toLowerCase entirely for newer game versions though.
- Telling addon creators to add a reference to the stage string (like here) so they won't create new strings and make use of hash cache.
- Limit the character amount of stage names to a power of two. Like for example 16 or 32 characters because strings use too many memory especially if they're not discarded.
- Add localization support. This is kind of like a feature request but honestly it's an important one. You already use regex to limit stage names to ASCII which is a pretty good choice but there're people on overseas that would really like localization support. This would help people to use long stage names too.
One Last Thing
I don't know but if you allow PR's for 1.12.2, I would like to post a PR to fix this issue in 1.12.2 too. Because people in 1.12.2 complains about it. I will release a fork if you entirely stopped supporting 1.12.2.
Screenshots / Media
Hello, it sounds like you may have some misconceptions. In Java String#toLowerCase
will not create a new string if the input does not require any changes. Additionally, the JVM will generally intern newly created strings meaning they are not discarded and the cached state remains in tact. These calls do introduce a minuscule amount of CPU overhead but this has never been visible in profiling reports of benchmarks of GameStages. In the 1.21.x port we are switching from String to a new record type. This will allow us to validate the stage name upfront and make some of these behaviours more apparent in the code.
To address your suggests though
Handling toLowerCase when the stage is set instead of everytime it has been checked. This for example would speed up the JourneyMap Stages. You can remove toLowerCase entirely for newer game versions though.
This is initially how GameStages worked, but several addons were allowing invalid stages to be created. These calls were added to ensure everything could continue to run smoothly and removing them now would be a breaking change. This will be addressed in the 1.21.x update as the new type will enforce validation.
Telling addon creators to add a reference to the stage string (like here) so they won't create new strings and make use of hash cache.
Because strings are generally interned there is no benefit to doing this. They are already the same reference.
Limit the character amount of stage names to a power of two. Like for example 16 or 32 characters because strings use too many memory especially if they're not discarded.
This would be an artificial limitation with no practical benefits. When strings are interned there is only one copy of them in memory and they are not easily discarded. If you want to avoid wasting a few bytes of extra ram just use shorter names.
Add localization support. This is kind of like a feature request but honestly it's an important one. You already use regex to limit stage names to ASCII which is a pretty good choice but there're people on overseas that would really like localization support. This would help people to use long stage names too.
Stage names are limited because they are internal IDs. They are not meant to be displayed to players under normal circumstances but many addons do allow you to customize how they are displayed and localized using the text component system. I can look into including a central API for this so the names are consistent across all addons but this would be a 1.21.x feature.
I don't know but if you allow PR's for 1.12.2, I would like to post a PR to fix this issue in 1.12.2 too. Because people in 1.12.2 complains about it. I will release a fork if you entirely stopped supporting 1.12.2.
Unfortunately I am not accepting new PRs for 1.12.2 at this time. A lot of the build tools for this version have crumbled so our CI/CD can not even build the project for these older versions. Feel free to make a fork if you feel it's still needed, just don't market it as official or send your users to my support channels.
It's not clear what they're talking about, but Game Stages does have a reputation for hurting game performance. This reputation primarily comes from addons and poorly optimized packs. For example packs using ItemStages to create thousands of new restrictions for the same stage rather than one restriction that checks if the item is in a tag. There is definitely room to optimize and improve the mod and its addons but it's worth noting that there are some very large projects that do things correctly and do not have these issues as a result.
Ah didn't look that deep into lowercase and uppercase, thanks for clarification. Good choice to use records too. People do use uppercase letters though, like ROTN, I guess newer versions does the regex check though so it only occurs in 1.12.2. I just thought this was another potential performance problem other than the addons like ItemStages.
Central API for localization would really help people because 3rd-party developers sometimes forget about details like that.
Other than that, It would be great if you can redirect 1.12.2 users to my fork.