[1.16.5+] Client crash on loading due to concurrency issue
hix3r opened this issue ยท 1 comments
Hello,
Mod ver.: MC1.16.1-1.1.2.6
Forge ver.: 1.16.5-36.2.20
Minecraft ver.: 1.16.5
Forge crash log: https://gist.github.com/hix3r/52aceac059fcaeb935b9a07b80e37c68
This is the same issue that was reported in #166 , it happens intermittently, because it is caused by a race condition of multiple threads trying to modify a single Java HashMap. The cause of this issue is present in the current latest branch as well. Looking at the changes, the code logic responsible is the same since at least 5 years ago so all versions of the mod in that period are affected.
In team.chisel.ctm.client.util.TextureMetadataHandler
the onModelBake()
method subscribes to all ModelBakeEvent
events. Events are dispatched asynchronously in multiple worker threads.
onModelBake()
does the following at line number 156 in the most recent branch:
try {
meta = ResourceUtil.getMetadata(ResourceUtil.spriteToAbsolute(tex.texture()));
} catch (IOException e) {}
ResourceUtil.getMetadata()
is a static
method that modifies metadataCache
which is a static final HashMap
via its put()
method on line 64 of the ResourceUtil
class.
This means there is a possible scenario when two event threads will try to put()
something into metadataCache
at the same time. As a result during loading the cache could be: unaffected or its entries silently corrupted or cause a runtime crash depending on thread timing.
Java HashMap by itself is not implemented to handle concurrent access and modification. The implementation of it needs to maintain a consistent internal tree structure of nodes. The strange exception message class java.util.HashMap$Node cannot be cast to class java.util.HashMap$TreeNode
is due to multiple threads trying to act on this internal tree structure at the same time, leaving it in a broken state.
Possible solutions:
- change
metadataCache
inResourceUtil
into aConcurrentHashMap
, as this uses several lock-free optimizations to lessen the burden of synchronization, the retrieval operations likeget()
do not block, plus the internal hash array is divided into segments so blocking only occurs on one segment during a modification - use a
synchronized
block when accessing or modifyingmetadataCache
- synchronize on the
getMetadata()
method - possibly use or implement some other cache data structure that is suitable for concurrent modification and retrieval
Hope this helps.
This bug is also present on 1.19.2-1.1.6+8