LOD map render tiles gets corrupted/does not generate after a restart
VL4DST3R opened this issue · 25 comments
Issue Description: I've noticed a recurring issue lately on my map renders: if something interrupts the generation, like a server restart, the render resumes on the highest zoom level (say, LOD0) but does not propagate properly to the lower-res LODs (the ones you see when you zoom out).
This results in black lines in the area where the rendering was interrupted, unless you zoom in all the way. This has happened on multiple maps, across the last 2 dynmap versions. While I would obviously like to know why this is happening or if there is a fix, my main gripe is that if I do a dynmap radiusrender for the affected areas, it doesn't even consider refreshing those non-LOD0 tiles, so essentially they can never be regenerated unless I purge the entire map and start over.
- Dynmap Version: Dynmap version: core=3.2-beta-2-483, plugin=3.2-beta-2-483
- Server Version: Paper version git-Paper-642 (MC: 1.16.5) (Implementing API version 1.16.5-R0.1-SNAPSHOT)
- Pastebin of Configuration.txt: https://pastebin.com/Jds2H9rP
- Server Host (if applicable): self-hosted
- Pastebin of crashlogs or other relevant logs: n/a
- Other Relevant Data/Screenshots: This happens regardless of map version and while using no texture pack
- Steps to Replicate: (for simplicity) purge a world and force a fullrender. Stop the server normally midway through rendering, then start again and let it finish. Result: (continued below)
[x] I have looked at all other issues and this is indeed a duplicate of #2485, but I'm posting this since it was never an issue on our server before and I wanted to know if this is
- something new in the last 2 dynmap versions,
- related to my config and the use of cwebp or dwebp,
- or related to using a particular Paper version, since I saw it mentioned in a few threads.
[x] I have been able to replicate this
I mean that sounds almost trivial to fix.
If you're not interested in trying to make a PR, I'll give it a whirl. Using a custom build myself already to fix the BlockPhysicsEvent stuff.
I'm not very familiar with the dynmap code itself. My information is only inferred by observation. I find the dynmap code to be quite disorganized, and I don't really have the time to make enough sense of it to fix the issue.
The work around works well enough for me, that there are no longer any obvious black bars on my server
I can confirm this.
For the record this is called the zoom out processing/rendering. And it's incredibly broken in the current version. For me the rendering doesn't even need to be interrupted for this issue to appear but it certainly amplifies this issue.
Rerendering these regions doesn't fix it.
This has been a bug on Paper since around 1.12.2. I wish I could say this was a new issue or that it has gotten worse, but from my testing it is still just as inconsistent and broken as it was back then.
Well it certainly has gotten worse. Has only been an issue since I upgraded to 3.2-beta-2 and MC 1.17.1.
Like often before the zoomed out tiles wouldn't update, but radiusrendering did the trick. Now these tiles are black and nothing helps.
Really wish they would at least add a way to properly delete and let it regenerate all the low-res/zoomed out renders of a map, since it certainly takes a lot less to generate those from the already existing high-res images instead of reading the actual map file.
Currently the only way to address this is the scorched earth approach (and praying nothing craps out during the render, forcing you to start over)
I can confirm this.
For the record this is called the zoom out processing/rendering. And it's incredibly broken in the current version. For me the rendering doesn't even need to be interrupted for this issue to appear but it certainly amplifies this issue. Rerendering these regions doesn't fix it.
It is not broken in just the "current versions" ...
Another instance reporting this issue is here: #3182
This issue has been around for years. In the beginning the devs claimed it was an issue with the server (e.g. PaperMC vs Spigot vs CraftBukkit), but the server type only changes how reproducible it is.
The best approach ATM is either:
1. Open the database, delete all zoomout (zoom not 0) renders, enable the zoomout check on startup in configuration.txt, reload dynmap.
2. Basically 1, but check the location of the black bars on the map and only delete the zoomouts at that location. This reduces server load, but needs more technical knowledge.
3. Change the update timestamp of all "source" tiles (zoom==0) to the current timestamp, enable the zoomout check on startup in configuration.txt, reload dynmap. (Also basically the same as 1, but this leaves access to the previous zoomout tiles while all zoomouts are being regenerated.)
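A rough sketch of options 1 and 3 as SQL, run here against an in-memory SQLite table so that it is self-contained. The `zoom` and `LastUpdate` column names follow the query posted later in this thread; the table name `Tiles`, the other columns, and the sample rows are assumptions for illustration and may not match a real install (which may also use a table prefix). Back up the database and stop the server before trying anything like this on real data.

```python
import sqlite3
import time

# Toy stand-in for dynmap's tile table. Only `zoom` and `LastUpdate` are taken
# from this thread; the remaining columns and the table name are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tiles (MapID INTEGER, x INTEGER, y INTEGER, zoom INTEGER, "
    "LastUpdate INTEGER, PRIMARY KEY (MapID, x, y, zoom))"
)
conn.executemany(
    "INSERT INTO Tiles VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 0, 0, 1000),  # source tile (zoom 0)
        (1, 1, 0, 0, 1000),  # source tile (zoom 0)
        (1, 0, 0, 1, 2000),  # zoomout tile
        (1, 0, 0, 2, 3000),  # zoomout tile
    ],
)


def touch_source_tiles(db, now_ms):
    """Option 3: mark every zoom-0 tile as freshly updated (epoch milliseconds)."""
    return db.execute(
        "UPDATE Tiles SET LastUpdate = ? WHERE zoom = 0", (now_ms,)
    ).rowcount


def delete_zoomout_tiles(db):
    """Option 1: drop every tile that is not at zoom 0."""
    return db.execute("DELETE FROM Tiles WHERE zoom <> 0").rowcount


now_ms = int(time.time() * 1000)
touched = touch_source_tiles(conn, now_ms)
deleted = delete_zoomout_tiles(conn)
remaining = conn.execute("SELECT COUNT(*) FROM Tiles").fetchone()[0]
print(touched, deleted, remaining)
```

The current time is passed in as a parameter rather than computed in SQL, so the same statements should not depend on a database-specific time function. Either step only matters if the zoomout check on startup is enabled and dynmap is reloaded afterwards, as described in the list.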
I may be getting this wrong, but shouldn't having initial-zoomout-validate: true prevent or at least address this issue in and of itself? Talking about the "validating" part specifically here, because it clearly doesn't... What does it validate for?
I may be getting this wrong, but shouldn't having initial-zoomout-validate: true prevent or at least address this issue in and of itself? Talking about the "validating" part specifically here, because it clearly doesn't... What does it validate for?
Only the timestamps, but those are borked. Dynmap stores the time the zoomout was saved to the DB, instead of the timestamp of the "newest" sub-tile it used to render the zoomout. That opens a race condition in the database: when more than one thread is used for database processing (the JDBC connector in dynmap ALWAYS uses more than one thread), a save operation is sometimes not yet visible to a read operation.
Consider this situation:
We have tiles A, B, C, D and the zoomout Z.
We change a block in D.
D is rendered, which causes an update for Z to be queued.
Z starts to be rendered and gets the current state of A, B, C, D.
We change something in C.
C is rendered. This does not queue Z to be updated, as Z is still being rendered.
Z is finished -> its timestamp is later than C's, so zoomout-validate does not think Z needs to be rerendered ...
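The sequence above can be replayed with a small simulation. This is only a model of the behaviour described in this comment, not dynmap's code: the `Tile` class, the logical clock, and both stamping rules are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    name: str
    last_update: int = 0  # logical clock value, not a real timestamp


def needs_rerender(zoomout, sources):
    """Timestamp-only validation: rerender if any source is newer than the zoomout."""
    return any(s.last_update > zoomout.last_update for s in sources)


def replay(stamp_with_save_time):
    a, b, c, d = (Tile(n) for n in "ABCD")
    sources = [a, b, c, d]
    z = Tile("Z")

    d.last_update = 1  # t=1: D is rendered, Z is queued
    snapshot_newest = max(s.last_update for s in sources)  # t=2: Z reads A..D
    c.last_update = 3  # t=3: C is rendered, Z is not queued again
    save_time = 4      # t=4: Z finishes and is saved

    # Behaviour described above: stamp Z with the time it was saved.
    # Alternative: stamp Z with the newest source tile it actually read.
    z.last_update = save_time if stamp_with_save_time else snapshot_newest
    return needs_rerender(z, sources)


print(replay(stamp_with_save_time=True))   # False: the stale Z passes validation
print(replay(stamp_with_save_time=False))  # True: C (3) is newer than Z (1)
```

In this model, stamping the zoomout with the newest source timestamp it read is enough for a later validation pass to notice that C changed after Z's snapshot was taken.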
Awesome, thank you for clarifying.
when more than one thread for database-processing is used (the jdbc connector in dynmap ALWAYS uses more than 1 thread)
I was about to ask whether forcing it to use only a single thread would prevent this from happening, but you addressed that as well.
Is this race condition something that was ever brought up to the dev team? You seem to have a very good understanding of how this happens, maybe you could give them some pointers about how to handle it better?
3. Change update timestamp of all "source" tiles (zoom==0) to the current timestamp, enable zoomout check on startup in configuration.txt, reload dynmap. (also basically the same as 1, but this leaves access to the previous zoomout tiles, while all zoomouts are being regenerated)
Any chance you could give me a query example that would achieve this? While I understand the procedure you suggested, I lack the know-how to actually implement it. To add insult to injury, for me even opening the Tiles data table freezes my puny db browser :/
The earliest report of this issue seems to be this one: #1666, with the claim that it got fixed. Unfortunately the image referenced there is no longer online.
There we go. I'd provide a build of that version, but the team seems to be against that sort of thing.
If you don't know how to build it, DM me on Discord. I'm in the dynmap Discord with the same username and avatar.
What I find very interesting is that setting the timestamps to the long gone past did absolutely nothing for me.
I'm using file storage but that really shouldn't make a difference.
Did setting those timestamps to the past fix it for you?
Most of the time. But because of the nondeterministic order, you might need to do this zoomout-Level-Count times.
So my PR was merged. Everyone, please try the latest snapshot. Good chance that it's fixed now.
UPDATE `table_prefix_tiles` SET `LastUpdate`=UNIX_TIMESTAMP()*1000 WHERE `zoom`=0
Hey, sorry for taking so long to come back to you.
I did test the command but it had a few issues. First, my data table was simply named tiles. Second, I couldn't get the actual unix timestamp, but judging from what the equation was supposed to do, I figured it was to be set to a very large number, so I did that manually, to 2147483647, which seems to be the largest unix timestamp value it could ever go? The query worked and set the appropriate date, but in-game it doesn't seem to do or trigger any update.
I then finally tried deleting the levels entirely using DELETE FROM 'tiles' WHERE 'zoom'!=0 but now I've got no levels beyond 0 and still no regen. Did I miss something?
I saw that in the meantime Brain pushed an update. Does it just prevent further generation corruption, or does it by chance fix anything retroactively?
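A quick check on the 2147483647 value, assuming `LastUpdate` holds epoch milliseconds as the `*1000` in the suggested query implies. 2147483647 is the largest signed 32-bit value when counted in seconds, but read as milliseconds it falls in January 1970, so the source tiles would look older than every zoomout rather than newer. That may be why nothing was re-rendered, though this is only a guess.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_epoch_ms(ms):
    """Interpret an integer as milliseconds since the Unix epoch (UTC)."""
    return EPOCH + timedelta(milliseconds=ms)


as_millis = from_epoch_ms(2147483647)          # the value read as milliseconds
as_seconds = from_epoch_ms(2147483647 * 1000)  # the same value read as seconds
print(as_millis.date(), as_seconds.date())
```

Under that assumption, a "current" value would be the present time in seconds multiplied by 1000, which is what the suggested query computes.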
It fixes part of the render corruption. I'm still seeing it happen, just significantly less frequently.
I'm going to close this to help track this with the other zoomout issues as I continue cleaning up the stale issues. @BrainStone thanks for your work on fixing this issue.