Et Futurum Requiem

Et Futurum Requiem

105k Downloads

java.util.ConcurrentModificationException

kotmatross28729 opened this issue ยท 33 comments

commented

Please check any boxes that apply to you and your issue

  • I use a translator application to post this issue.

  • This is a crash. Please upload, Pastebin, Gist or copypaste the whole crash report along with this issue.

  • This is a mod incompatibility. If I do this in vanilla Forge with only Et Futurum Requiem installed, it works normally.

Version number of Et-Futurum-Requiem (IMPORTANT)

cca3146

Describe the issue (IMPORTANT)

Honestly, I have absolutely no idea what it is.
I heard somewhere that this is something with an iterator or something like that.
This crash happened completely by accident, no idea if it should be reported here, or because too many mods are trying to do something at the same time or what, it's just etfuturum in the stacktrace, so I'm reporting it here.
Although I read that it seems to be (possible?) to fix it, so (I hope) that something can be done about it

Crash report

https://pastebin.com/pTqsn2Xs

commented

Likely another mod messing with it. It uses a concurrent hash map AND an iterator, both of which are thread-safe ways to modify maps. Assuming this is OptiFine's fault

commented

Issue likely caused by OptiFine, the functions pointed at in the crash report can't possibly throw a ConcurrentModificationException. There's a lot of OptiFine junk in this crash log though.

commented

Double-checked Google, ConcurrentHashMap is the correct way to avoid this exact error which means another mod is erroneously messing with world generators incorrectly. I've also used TCL mods and not gotten this error. Blaming OptiFine

commented

java.util.ConcurrentModificationException
at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
at java.util.LinkedList$ListItr.next(LinkedList.java:888)
at ganymedes01.etfuturum.world.EtFuturumLateWorldGenerator.generate(EtFuturumLateWorldGenerator.java:64)

@Roadhog360 I am reopening this and removing the wontfix because you really badly misjudged how terrible your code is there:
https://github.com/Roadhog360/Et-Futurum-Requiem/blob/master/src/main/java/ganymedes01/etfuturum/world/EtFuturumLateWorldGenerator.java#L50-L64

This is a PURE Et Futurum Requiem Issue, NO OTHER MODS INVOLVED.

What do you think happens when you redo the Deepslate stuff inside a Chunk and suddenly you trigger some other random adjacent Chunk to be generated WHILE you do that? Yes this can rarely happen even if you confine your code to the chunk boundary.

Edit: Yes I know it says to stop recording chunks there, but are you 100% sure nothing ever adds to your Lists and Stuff? Because I do not think you could possibly blame other Mods for you just using the wrong Iteration Pattern, where you probably need to make sure either absolutely nothing you do ever adds to those mapped lists while you generate chunks, or you iterate with ye olde get(i+1) or get(link element to the right of this one) method instead of using Iterators.

and "Chunk cachedChunk = world.getChunkFromChunkCoords(redoX, redoZ);" may result in adjacent chunks being generated in some cases.

Edit Edit: Also maybe I shouldn't write stuff while im still tired from unusual amounts of physical activity and being awake for 36 hours the day before... I sound quite angry in this one...

commented

Impossible, never encountered this issue after thousands of worlds generated over the year deepslate layers existed, and it points at an iterator that is not modified in an unsafe way. NOTHING else adds ANYTHING to that list.
redoX and redoZ EXCLUSIVELY places blocks ONLY within the selected chunk.
And this runs after the chunk has finished generating, when another chunk starts to generate. There's no way for another item to be added to the list, and there's also no way for another entry to be added to the list. This is, without any shadow of a doubt, a mod conflict.
There are multiple remnants of OptiFine in the stacktrace. If this is an EFR issue, you will have to prove it. Since you seem adamant I will take your word there is a chance that it's my fault, but I will still label this as a mod incompatibility, since there's no other way for it to be anything else at the moment.

I will admit there could be a hole I am not seeing.

commented

While I still am 99% sure there is a mod doing this, Greg, sometimes I appreciate your stern-ness. In my own cod e analasys I realize that even though I am certain there is no way for it to error, I realize I could make the code a little bit faster anyways. Even though I am still confident there's just no way this could be anything in standalone EFR both due to just the way this works, and having generated thousands of worlds since then for unrelated tests with 0 errors, I could probably optimize this a little, and in doing so perhaps resolve this mod conflict.

I'll explain what the "redo cache" is and how it works. So basically sometimes an ore vein generates, it crosses up with a chunk that already generated. This can cause some ores not to be replaced. The solution is my redo cache. It's... a bit hacky. So basically I check isReplaceableOreGen, when an WorldGenMinable overwrites a block, it fires this function on the block it replaced to check if it can replace. (This is how bountiful stones, tuff, and deepslate are replaceable by stone ores)

So if this fires on deepslate, we check to see what triggered it and if it's something deepslate has an alternative block for, it'll add it to the cache. The cache only runs on a per-chunk basis, and it only runs one chunk at a time, and it only runs a list of blocks that were marked as needing a redo in that chunk only, so the possibility of triggering a bordering chunk to generate is simply impossible.
I can look into replacing the check with a while (!list.isEmpty()), and doing the method of list.get(0) and list.remove(0) at the end. and of course making the set into a list since now I realize that's faster anyways. That method of approach is totally 100% absolutely no-way-to-ever error concurrent safe. The thing is, I'm not sure if this specific change would make the generator slower.

commented

P.S. Infinite Googling reveals that iterator.remove() is already concurrent safe so that can't be it either. There is exactly ONE place that could throw a Comod and that's the posSet loop. And again I feel like I'd've gotten this crash numerous times if this wasn't an OptiFine issue. I believe OptiFine had some chunk-related settings.

commented

... Wait, what do you mean "Yes this can rarely happen even if you confine your code to the chunk boundary" I missed this part. What do you mean?? Even if I place a block ONLY in one chunk it'll erroneously generate adjacent chunks? I'm having a hard time believing that. Do you have a source on this? Since when?

commented

I just wanted to chime in about how I was testing the latest release of EFR in my modpack (I was using Et_Futurum_Requiem-2.5.0-SNAPSHOT-ce8ced2+nomixin at the time, from around last month, without a problem) and I have also noticed and EtFuturumLateWorldGenerator crash happening when starting a world. Although in my case, it's not the same exact crash as the issue.

It should be noted that this does not seem to be happening at all in correlation to optifine, I tested an empty instance with both E7 and D6 and just EFR and it both didn't pose any issue.
Quick and dirty Edit: On top of that, starting my modpack without optifine still causes the same crash. So that can be easily ruled out.

I will post the pastebin of the crash here as well, since it seems to be related, or at least adjacent, and I'm open for further, more thorough crashlogs if necessary.

https://pastebin.com/UV65fxST

commented

Do I even need to create a new empty list? What exactly is the purpose of that? Once it's finished iterating the lists is discarded because I only want to ever iterate over its contents once. How much performance overhead will constantly creating new lists cost, on top of the amount I am already creating?

commented

clearing the List one element at a time creates more overhead than recreating the Lists, sooooo why do you want to recycle a dumb object pointer that causes you concurrency issues?

And what I wanted to say in my previous post:
This is why I tell you you have no Idea what a CME actually is, since you tried to google what it is and how it works without reflecting on what it means for the stacktrace and your own Code.

I was just there guessing at first because I knew the Issue must be on your end, especially if you blame optifine for worldgen issues of all things, simply because it was far down the stacktrace, but now i KNOW that this is the Issue.

Edit: What I learned about stacktraces from modding MC for this long is one core thing: only the start of a stacktrace or its causedby section are valuable, everything past that is garbage and should be ignored, because if you go too far down, you will suddenly blame things like optifine or fastcraft even though they had nothing to do with it.

commented

If it's a different error then please open a different bug report. (Not even related to the world generator; the only mention of EFR in the stack trace is onLoadComplete) Thanks for the report
On your way to open an issue I do have a question; did you change any configs? If yes, what specifically was changed?

commented

Back onto the topic at hand, I just spent about 30 minutes flying in a straight line, and then an additional 30 where I turned 45 degrees and continued diagonally, even with OptiFine and could not reproduce the error reported. The world did not crash. I would think if such an error could occur at random from erroneously generating adjacent chunks giving the game a whole hour of nothing but chunk generation with EFR's fast flying would eventually trigger it.

@GregoriusT
You called my code "terrible" but I spent weeks polishing that specific functionality. It looks fine to me, how exactly is it terrible, and what do YOU think I could do to improve it?
Disclaimer: I really do not mind people insulting my code, I sometimes write really cursed shit and would not be surprised if I have done so without meaning to, lmfao please do not take the tone of this message the wrong way

commented

So what all adds to the position set? Os ANY worldgen Code adding to the position set EVER? because if it does then this Code I am pasting here will RARELY and I mean VERY RARELY be able to cause crashes when a Chunk corrupts for no reason:
Chunk cachedChunk = world.getChunkFromChunkCoords(redoX, redoZ);
What do you think happens when the Chunk that was already in the Cache somehow got deleted or corrupted? It will be recreated instantly by this Line, which then causes Worldgen to run on it, which then adds said same chunk to the position set AGAIN, which is the cause of the concurrent modification exception once your iterator runs over the position set.

I genuinely hate how ConcurrentModificationExceptions dont trigger on add, set or remove, and instead trigger on whenever the Iterator does a "get", which blames the Issue on the wrong part of Code and makes it extremely hard to find the actual source. This is a core thing about CMEs that you should know, and that 100% invalidates the ENTIRETY of the stacktrace and the optifine you see in it.

What you need to do is SWAP your deepslateRedoCache for a BLANK NEW deepslateRedoCache, iterate over the previous cache that contains all the stuff, and the error will go away.

And because I worded it a bit badly here is what I mean:
Map<Long, List<Integer>> map = deepslateRedoCache.remove(world.provider.dimensionId);

Yes it is THAT simple, just replace the GET with a REMOVE at this Line and everything should work perfectly. Yes you have to REMOVE the thing, otherwise THIS exact problem arises.

Do I seriously need to argue about this? Do you KNOW what a CME even is and what it means? In this case it literally means "roadhog is adding items to the list or map that roadhog iterates over during the iteration", you literally CANNOT blame other Mods on this one, especially not by looking at the stacktrace. You only have yourself to blame on that one. Sorry I sound angry again dont i...

Edit: I was literally writing my response as you pinged me, lol, and sorry for insulting your code, but you blamed someone else for your own mistakes so I had to be defensive about that. (even though i hate optifine and its toxic dev)

commented

If it's a different error then please open a different bug report. (Not even related to the world generator; the only mention of EFR in the stack trace is onLoadComplete) Thanks for the report On your way to open an issue I do have a question; did you change any configs? If yes, what specifically was changed?

I see, my apolo:cheese:.
As for config changes, I just loaded a fresh instance to the latest commit without deleting the EFR config folder, but now that I did I just noticed an entirely new and exciting can of ๐Ÿชฑ , since deleting the EFR config folder wholesale and letting it regenerate now creates a brand new crash with GT6's gregapi!
Fascinating. It's like a mess of tangled christmas lights in here. Oh dear.
It's way too early for something of this scale. I'll just open an issue later on, most likely.

commented

Looks like Christmas came early in this thread.

commented

So what all adds to the position set? Os ANY worldgen Code adding to the position set EVER? because if it does then this Code I am pasting here will RARELY and I mean VERY RARELY be able to cause crashes when a Chunk corrupts for no reason: Chunk cachedChunk = world.getChunkFromChunkCoords(redoX, redoZ); What do you think happens when the Chunk that was already in the Cache somehow got deleted or corrupted? It will be recreated instantly by this Line, which then causes Worldgen to run on it, which then adds said same chunk to the position set AGAIN, which is the cause of the concurrent modification exception once your iterator runs over the position set.

I genuinely hate how ConcurrentModificationExceptions dont trigger on add, set or remove, and instead trigger on whenever the Iterator does a "get", which blames the Issue on the wrong part of Code and makes it extremely hard to find the actual source. This is a core thing about CMEs that you should know, and that 100% invalidates the ENTIRETY of the stacktrace and the optifine you see in it.

What you need to do is SWAP your deepslateRedoCache for a BLANK NEW deepslateRedoCache, iterate over the previous cache that contains all the stuff, and the error will go away.

And because I worded it a bit badly here is what I mean: Map<Long, List<Integer>> map = deepslateRedoCache.remove(world.provider.dimensionId);

Yes it is THAT simple, just replace the GET with a REMOVE at this Line and everything should work perfectly. Yes you have to REMOVE the thing, otherwise THIS exact problem arises.

Do I seriously need to argue about this? Do you KNOW what a CME even is and what it means? In this case it literally means "roadhog is adding items to the list or map that roadhog iterates over during the iteration", you literally CANNOT blame other Mods on this one, especially not by looking at the stacktrace. You only have yourself to blame on that one. Sorry I sound angry again dont i...

Edit: I was literally writing my response as you pinged me, lol, and sorry for insulting your code, but you blamed someone else for your own mistakes so I had to be defensive about that. (even though i hate optifine and its toxic dev)

A ConcurrentModificationException is when a list is modified while iterating through it, but both Iterator instances and ConcurrentHashMap is supposed to be modifiable during iteration. I looked this up a ton to be sure and that's what all the results said. It's even called a ConcurrentHashMap, what else could that mean? Countless Google results and StackOverflow posts say so.

So you're saying that the head of the StackTrace being posSet is all smoke and mirrors and ConcurrentHashMap is effectively causing the StackTrace to be garbage?

commented

wait a crash with MY mod too? Well I do have an Issue tracker LOL

https://github.com/GregTech6/gregtech6/issues?q=is%3Aissue+is%3Aopen+sort%3Aupdated-desc

commented

You seem worried about sounding mad, but honestly just don't worry about it. Intentional or not, I can sometimes be quite confident in what can and cannot cause an error, I welcome a little bit of pushy-ness.

commented

You havent read half my post again Roadhog...

commented

I read the whole thing, what did I apparently miss?

commented

The part where i fixed your code?

commented

I saw it, I haven't thought of what to say yet

commented

It isnt the MAP that crashes its the LIST inside the Map

commented

you are waving the concurrenthashmap around like a magic wand to fix a nonconcurrent LIST :P

commented

https://docs.oracle.com/javase/8/docs/api/java/util/ConcurrentModificationException.html

The takeaway from this description is that a ConcurrentModificationException is exactly as I described it. You keep insisting my description of it is wrong but not actually telling me what I was wrong about.

Also OptiFine makes changes to the game that are haphazardly made and invasive, and introduces several chunk-related settings, it's reasonable to have initially blamed it due to... just the way OptiFine is. In general. I still believe this is a mod conflict and I'm having a hard time believing that I wouldn't get this whatsoever after so long, because I just so happen to NEVER use OptiFine. I know 1.7.10 worldgen is a bit tacky by default but do you have a source on chunks corrupting for no reason during worldgen? That's new to me, I'd actually like to learn more.

Also... I JUST realized what you were actually getting at with the .remove thing. I was thinking that you wanted me to copy the list first, forgetting that .remove also returns the result I need. I'm a little slow... sorry, heh. Some of what I wrote is written before but to be more transparent I'll leave some of it up there. I still wanna learn about this supposed chunk corrupting on worldgen though. I still somewhat believe OptiFine has a hand in said chunk issues though, sorry, heh. Admittedly you have me slightly doubting that at last though. Although over the past few years OptiFine has managed to annoy me, so I'll just straight-up admit that this opinion may very well be personally biased.

So that being said using .remove to get the result I want should be the perfect way to clear the whole list in one go after getting my result. And as soon as that result is reached nothing else will add to that map because they will see it's gone and create a new one if somehow worldgen re-runs on that chunk. Actually that shouldn't happen because in normal, non-corrupt worlds, this should actually never happen UNLESS a chunk corrupts, in which case my handy dandy replacer will hit it as normal because it'll start fresh. So I don't have to worry about the list inside either. That's such a "hidden in plain sight solution"... using remove as a getter, heh.

Yes, I completely forgot that .remove can be used as a one-time getter and was confused as to how to implement your fix in the way you wanted. I was so confused, because I thought doing remove at the start would totally just make the loops immediately after not run and was devising some way to clone the list to still iterate over something hence my earlier comments about copying and creating lists and such. I sort of went over what this would do and checked it one more time against what you said, to be sure we're on the same page now.

Never thought of using .remove to get entries. That's actually kind of clever. I don't know why I'd doubt the great Greg's smarts for even a moment. I do understand I come across as stubborn but even when I am dead-set on being sure something is one way, I do always like to learn, and will pretty much immediately switch sails when I believe a lesson has been learned. Hell, your simple change might make this compatible with any threaded chunk loader mods should any proper ones come into fruition.

Do you know any ways I could possibly arbitrarily trigger chunk corruptions during worldgen and other issues that may trigger this crash so I can test it, compare results and even hold up other future worldgen to the corruption and jank test?

commented

And thank you for your patience @kotmatross28729 I suppose OptiFine gets off the hook this time, heh. I'll leave this issue open for a while after adding the fix to be safe, unless I can track down the cause and be certain it WILL NOT happen. I just don't feel good about closing an issue as "fixed" when I can't reproduce it, unless I've given it some time.

commented

Will give this a few days. If no crash is reported here I will assume Greg's suggestion worked to fix it. Comment here if the same issue occurs after closure, and this issue will be re-opened.

commented

Yep this should fix multithreaded Chunk generating Mods probably, if that was even possible at all, which it definitely isn't in 1.7.10 though...

And yeah using remove as a getter is a great shorthand for a lot of things like this.

(also i didnt insist on the description being wrong, i insisted on you looking at the map for an issue with the list inside of said map, the map was fine, the list was not, lol)

There is quite a lot of things you can blame Optifine for, like screwing up Mob spawns, which it definitely does do because it somehow just zeroes out any block metadata passed to:
public boolean canCreatureSpawn(EnumCreatureType type, IBlockAccess aWorld, int aX, int aY, int aZ)
resulting in my own blocks being screwed when they have metadata based Mob spawn protection, which suddenly does no longer work since metadata zero is spawnable...

commented

Ah, again misunderstood. You kept asking me if I even knew what concurrentmodification was and I mistook that for me possibly having not known what it is all this time.

commented

Yeah I knew that you knew what it technically is but you didnt understand what it meant since you thought fixing the map would fix all of its contents too, which it didnt. ;)

commented

Ah yeah, see, I had THOUGHT the only possibly concurrency was in the maps themselves, so basically I made them concurrent, but I didn't know chunks would randomly corrupt in the manner you described.

commented

Gonna close this as it's been a few days with no updates. The new code should be completely immune to this error but if the same crash happens again feel free to comment here to get the issue re-opened.