State persistence and making complex programs possible

Question

SirHall opened this issue 4 years ago · 30 comments

SirHall commented 4 years ago

I felt as though this may have been suggested plenty of times before, but I was unable to find any previous issues mentioning this problem so I'm making it now.

As is very well known, OpenComputers and RetroComputers advertise this feature upfront. However, the fact that neither of them has been updated past 1.12.2 means that they cannot be used in any more recent mod packs.

While I love both ComputerCraft and OpenComputers for many reasons, it is often hard to use ComputerCraft for anything more complicated than directly transforming simple inputs into simple outputs, or simply monitoring the world around it (which is persistent) and immediately acting on it.

Lua, to my knowledge, supports saving its global environment to a file and loading it back up again. Whilst I expect the actual implementation to be quite complex, I find it hard to use ComputerCraft to its fullest without being able to rely on whatever it's doing still being in memory by the time it has loaded again.

The primary workaround is to write each variable change to a file, which is then loaded when the program is run again, usually as a startup script. This is purely a hack, and it fails to fix the issue that the position in the currently running script is also lost. It simply isn't reasonable to write a program in such a way that it can be stopped at any time and then expect it to continue running from any arbitrary point in the code purely by reloading saved variables, especially should recursion be used at any point.

ComputerCraft, in my opinion, is simply wonderful, but since 2013-2014 this has continued to be the number 1 issue that makes this mod difficult to use. Thank you for the great work you have done in continuing to support this wonderful mod. I would argue that this is the last real kink in this mod that needs fixing. I realize that, given the age of this mod, it may be rather difficult, but I feel it would truly take ComputerCraft to a stage where it is nearly complete.

Thanks again for your amazing work!

gnif · Answer 1 · 2021-09-19T22:58:12.000Z

While this does not exactly solve the persistence issue, many problems with a turtle that tracks its state arise when execution is terminated before it has an opportunity to save its state to disk, i.e.:

turtle.forward()
state.x = state.x + 1
-- terminated here, step lost
saveState()

A simple solution to this would be to assign a running count to each action performed that affects the state. Using this, one could use an intent journal to detect these crashes, just like a modern computer does with a filesystem. For example:

function processJournal()
  -- os.getCurrentEventID() is part of the proposal, not an existing CC function
  local lastEvent = os.getCurrentEventID()
  while next(state.journal) do
    -- journal entries are {eventID, action}; Lua tables are 1-indexed
    local entry = table.remove(state.journal, 1)
    local eventID, action = entry[1], entry[2]

    -- only process entries that have not yet happened
    -- note, would require overflow/underflow checking here by the script
    if eventID > lastEvent then
      if     action == 'f' then turtle.forward()
      elseif action == 'b' then turtle.back()
      elseif action == 'u' then turtle.up()
      elseif action == 'd' then turtle.down()
      elseif action == 'l' then turtle.turnLeft()
      elseif action == 'r' then turtle.turnRight()
      end
    end
  end
  saveState()
end

function forward()
  -- write the intent to move forward to the journal and update the position
  local eventID = os.getNextEventID()  
  table.insert(state.journal, {eventID, "f"})
  state.x = state.x + 1
  saveState()
    
  -- perform the action which increments the persistent event ID
  turtle.forward()
  
  -- remove the processed entry from the journal
  table.remove(state.journal, 1)
  saveState()  
end

loadState()
processJournal()
while state.x < 10 do
  forward()
end

As long as the event ID is guaranteed to change after each successful move event a CC program can detect if it was interrupted and where it happened if designed to do so.

For efficiency, provided the event ID is a monotonically increasing value, one could even write a whole batch of actions to the journal at once, save it to disk, then process the journal entries, and finally save the finished state to disk at the end. This way, instead of having to write to disk on every action, it could be once per batch of actions.

For example (using the same processJournal as above):

function forward()
  local nextEventID

  if next(state.journal) then
    -- continue numbering from the last queued entry (the ID is field 1)
    nextEventID = state.journal[#state.journal][1] + 1
  else
    nextEventID = os.getNextEventID()
  end

  -- add the intent to move forward to the journal and update the position
  table.insert(state.journal, {nextEventID, "f"})
  state.x = state.x + 1
end

--startup and recovery
loadState()
processJournal()

-- batch the actions
while state.x < 10 do
  forward()
end
-- commit the journal to disk
saveState()

-- process the actions now committed to the journal
processJournal()

As for the complexity of implementing this feature: it's literally a few lines of code and a globally incrementing unsigned integer for each computer. Additional server overhead would be practically zero, and there is no way this could be abused.
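The snippets above call loadState() and saveState() without showing them. A minimal sketch of what they might look like, assuming the state is a plain table of numbers, strings, booleans and nested tables. The file name, the serializer and the default state are assumptions made for illustration; inside CC, textutils.serialize and textutils.unserialize would do the same job as the hand-rolled serializer here.

```lua
-- Hypothetical helpers: the thread never shows its own saveState/loadState.
local STATE_FILE = "state.lua" -- assumed file name

-- Turn numbers, booleans, strings and nested tables into Lua source text.
local function serialize(value)
  local kind = type(value)
  if kind == "number" or kind == "boolean" then
    return tostring(value)
  elseif kind == "string" then
    return string.format("%q", value)
  elseif kind == "table" then
    local parts = {}
    for k, v in pairs(value) do
      parts[#parts + 1] = "[" .. serialize(k) .. "]=" .. serialize(v)
    end
    return "{" .. table.concat(parts, ",") .. "}"
  end
  error("cannot serialize a " .. kind)
end

-- Open, write and close in one go, with nothing that yields in between.
function saveState(state, path)
  local file = assert(io.open(path or STATE_FILE, "w"))
  file:write("return " .. serialize(state))
  file:close()
end

-- Returns the saved table, or a fresh default when nothing usable is on disk.
function loadState(path)
  local file = io.open(path or STATE_FILE, "r")
  if not file then
    return { x = 0, journal = {} }
  end
  local text = file:read("*a")
  file:close()
  local chunk = (loadstring or load)(text) -- loadstring on 5.1, load on 5.2+
  return chunk and chunk() or { x = 0, journal = {} }
end
```

The journal code above keeps its state in a global, so it would use this as `state = loadState()` at startup and `saveState(state)` wherever it currently calls saveState(). Loading the file as Lua code is only reasonable because the script wrote that file itself.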

BlackAsLight · Answer 2 · 2023-02-27T08:21:40.000Z

I disagree that the lack of state persistence makes complex programs impossible. It is not impossible. It is simply an additional challenge that you as the coder need to overcome. You can make complex programs in CC; you simply need to change your approach to the problem.

What I've come up with is splitting the job of a turtle up into individual tasks. Take my custom quarry script, for example. The turtle has several tasks:

Mining out the hole 3 layers at a time.
Clearing water as it mines.
Emptying inventory.
Fetching more fuel.

Each of these tasks is designed so that, by simply calling the function again, it can figure out where it was up to based on its position in the world and its facing direction (which is calculated automatically with the help of the GPS API every time upon startup). When the task changes, the change is saved to a file, and that file is loaded at startup to call these functions again in reverse, along with a few saved settings such as the parameters that were passed in. In the case of emptying inventory and fetching fuel, the turtle knows the correct position to return to, instead of wherever it woke up, before handing control back to the mining or water-clearing task.

gnif · Answer 3 · 2023-02-27T08:52:52.000Z

Then clearly you are limiting yourself to the one task you have in mind and are closed minded as to other uses turtles can have.

GPS will only give you positional data, if you're writing a script that tracks items/counts or anything else and it crashes/halts when you least expect it you WILL lose state information. CC can stop/halt just before or just after you write your state to a file leaving you out of step.

This is a common problem in computing in general, not just CC, and over the last several decades technologies have been invented to avoid such data losses. Examples include journaled file systems like ext3/4 and NTFS, ACID-compliant databases, and hardware solutions like battery-backed RAID controllers.

And finally, the performance hit to the server to have your turtle continually serialise and write state information to disk on every action is just ridiculous.

BlackAsLight · Answer 4 · 2023-03-08T05:52:35.000Z

I don't know how CC remembering state for you would solve a server crash. I think all bets are off in that situation, but for instances where you know your computer might halt because the chunk is unloading or server shutting down, etc, you can design a system that is resume-able, and in a way that isn't writing to a file for every action. The idea is that you aren't writing a program that can continue exactly where it left off from, but instead it can figure out where it is up to and what the next step is.

Take my turtles that mine out a quarry, chop down trees or harvest my farms. Their resume-ability is based around two factors that I can guarantee. The first is a file write whenever a turtle switches tasks: from mining a hole, to emptying inventory, to fetching fuel, to chopping a tree, to replanting, to harvesting wheat. When it switches from one to the other, it records this to a file, which is then loaded at boot. These writes don't happen for every action, just whenever the turtle needs to switch from one specific job to another. The other factor is physical position. Based on their physical position, they're able to calculate how much hole they have left to dig, how far away they are from the fuel station, how much tree is left, and how much harvest there is left to go over.

With a turtle that moves, you don't need to record much or even have a lot of file writes. You make resume-able programs based on hard facts about the world around you.
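As a rough illustration of deriving progress from position rather than from saved variables, here is a sketch for one quarry layer swept in a serpentine pattern. The layout, the function names and the coordinate convention (x and z measured from the quarry corner) are all assumptions; in CC the coordinates could come from gps.locate() minus a stored origin.

```lua
-- 0-based index of cell (x, z) in a serpentine sweep over a layer that is
-- `width` cells wide. Even rows run from x = 0 upwards, odd rows run back.
function cellIndex(x, z, width)
  if z % 2 == 0 then
    return z * width + x
  end
  return z * width + (width - 1 - x)
end

-- How many cells of the layer are still ahead of the turtle.
function remainingCells(x, z, width, length)
  return width * length - cellIndex(x, z, width) - 1
end

-- Coordinates of the next cell to visit, or nil when the layer is finished.
function nextCell(x, z, width, length)
  local atRowEnd = (z % 2 == 0 and x == width - 1) or (z % 2 == 1 and x == 0)
  if atRowEnd then
    if z == length - 1 then
      return nil
    end
    return x, z + 1
  end
  if z % 2 == 0 then
    return x + 1, z
  end
  return x - 1, z
end
```

Nothing here needs to be written to disk: after a restart the turtle only has to learn where it is, and the remaining work follows from that.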

Now for an instance where it's a normal computer and resume-ability matters. I thought of the idea of a public mass smelter, where several people can place items in their personal input enderchests (from the enderchests mod), and the computer takes them out, smelts them all together, then places them back in their personal output enderchests. Since all the items are being mixed together in the furnace array before being separated again, in this instance it is important to remember what belongs to whom. It took me a while to figure out a way that could reliably remember what belonged to whom.

The solution that I came up with:

if 'file exists for Output needs Emptying' then
    -- Remove File: Recording of Input Control Chest's Items
    -- Move Items from Output Control Chest to Output EnderChest
    -- Remove File: Output needs Emptying
end

if not 'file exists for Input Control Chest' then
    if 'Input EnderChest has Items' then
        -- Move Items to Input Control Chest
    end
    if 'Input Control Chest has Items' then
        -- Save File: Recording of Input Control Chest's Items
    end
end
    
if 'file exists for Input Control Chest' then
    -- Move Input Control Chest's Items to Furnace Array Input
end

if 'Furnace Array Output has desired Items' then
    -- Move desired Item to Output Control Chest
end

if 'Output Control Chest has everything desired' then
    -- Save File: Output needs Emptying
end

This would then be placed in some type of loop, with the actual furnace-ing part being handled by a separate process that doesn't need resume-ability. It would also be expanded to check multiple EnderChests from different people. As you can see, though, this is resume-able and doesn't involve writing to files for every action. Just two actions: saving what the player inputted before it's mixed, and recording that their request is done.

If you have other valid ideas where you think this rebooting makes it impossible to write certain complex programs, I'd be more than willing to have a go at solving them.

gnif · Answer 5 · 2023-03-08T06:56:12.000Z

I fully understand what a state machine is, and how one works.
You are still yet again assuming that the save will happen after the condition.

if 'Some State that is volatile' then
    -- CHUNK UNLOADED HERE
    -- Save File: with new state
end

Do note that I am all for making this the task of the script developer to solve. My suggestion below, which makes it possible to solve this in the Lua script, was rejected in favour of full state persistence (see #926).

BlackAsLight · Answer 6 · 2023-03-08T07:19:03.000Z

2. You are still yet again assuming that the save will happen after the condition.

if 'Some State that is volatile' then
    -- CHUNK UNLOADED HERE
    -- Save File: with new state
end

Unless I've missed something, in the above pseudocode, if it halts before it saves the info, the next time the code runs it will fall through the if statements and come to the same position that it halted at. That means it doesn't matter where it halts in the code; simply calling it from the top again will work perfectly fine. Do note that the order does matter for this to work properly.

gnif · Answer 7 · 2023-03-08T07:33:38.000Z

You are assuming that the conditional will still be valid on restart of the script... let's say it's counting movement, no GPS, and the conditional is if turtle.forward(). When the script starts back up again the turtle may have moved, or it may not have; there is no way to know. Read through the PR I linked for more context.

BlackAsLight · Answer 8 · 2023-03-08T07:57:58.000Z

You are assuming that the conditional will still be valid on restart of the script... let's say it's counting movement, no GPS, and the conditional is if turtle.forward(). When the script starts back up again the turtle may have moved, or it may not have; there is no way to know. Read through the PR I linked for more context.

I personally wouldn't opt for the turtle moving without a GPS as my turtles call their positions on startup and then just keep a memory state of it updated as it moves around.

Anyway, if you didn't want to use GPS then you'd need to fall back on other factors you could rely on, like knowing two points in the system and then moving a little to figure out where you are relative to those points. For example, if the turtle was making pixel art, it would move around a little, looking at the already placed pixels, until it had enough information to know exactly where it is. This is a bit tedious, but it would give the turtle the ability to resume on startup.

gnif · Answer 9 · 2023-03-08T08:00:21.000Z

For example. If the turtle was making pixel art

This is just it: there are scenarios where "just moving a little" is not a viable option. Or what if the turtle is en route in the middle of the air to a new location? There is nothing to probe around for. IMO the option I suggested in #926 would not solve the issue of persistence, but rather provides a means to make it possible to write a script that solves the issue, and is a better solution than full persistence. However, the author of this project seems to disagree.

gnif · Answer 10 · 2023-03-08T08:10:21.000Z

And if you use a GPS then moving a little doesn't matter.

Again, you are assuming that GPS is an acceptable solution.

My suggestion is simply structuring the code in a way where it can halt at any time then simply calling it again will allow it to fall down and figure out where it was at.

Again, this is called a "state machine" and it's how I write most of my scripts for CC. Your idea is nothing new, nor does it solve the issue at hand.

BlackAsLight · Answer 11 · 2023-03-08T08:08:13.000Z

This is just it: there are scenarios where "just moving a little" is not a viable option. Or what if the turtle is en route in the middle of the air to a new location? There is nothing to probe around for. IMO the option I suggested in #926 would not solve the issue of persistence, but rather provides a means to make it possible to write a script that solves the issue, and is a better solution than full persistence. However, the author of this project seems to disagree.

I'm not a fan of your suggestion. It just seems messy to me. And if you use a GPS then moving a little doesn't matter. If they're en route to a new location, at startup they get their current location, load their destination from a file and continue on their way. My suggestion is simply structuring the code in a way where it can halt at any time, and then simply calling it again will allow it to fall through and figure out where it was at.

BlackAsLight · Answer 12 · 2023-03-08T08:18:10.000Z

Again, you are assuming that GPS is an acceptable solution.

It is an acceptable solution. It makes the job a lot easier and a lot more reliable for very little cost.

You need some type of reliable state for any type of resume-ability to be possible. Even in your suggestion, you're just moving what that state is to something else.

gnif · Answer 13 · 2023-03-08T08:23:22.000Z

My proposal simply adds a persistent action counter which can be used to determine if the last action such as a move/dig, etc was successful or not. How the user decides to use it is entirely up to them.

It is an acceptable solution. It makes the job a lot easier and a lot more reliable for very little cost.

Each turtle now needs a modem, and several turtles in the sky, exponentially getting more expensive the larger the radius one wishes to cover based on the RF transmit distance. It has an exponential cost. If it were possible to reliably determine if a move action occurred or not one could instead of needing this external GPS system be able to roll their own navigational system, which is fun in itself.

Sorry but no amount of "GPS solves the lack of persistent state issue" arguments are going to satisfy this requirement.

BlackAsLight · Answer 14 · 2023-03-08T08:49:46.000Z

It is an acceptable solution. It makes the job a lot easier and a lot more reliable for very little cost.

Each turtle now needs a modem, and several turtles in the sky, exponentially getting more expensive the larger the radius one wishes to cover based on the RF transmit distance. It has an exponential cost. If it were possible to reliably determine if a move action occurred or not one could instead of needing this external GPS system be able to roll their own navigational system, which is fun in itself.

Sorry but no amount of "GPS solves the lack of persistent state issue" arguments are going to satisfy this requirement.

With ender modems, not sure if that's an additional mod or in this one, on the GPS computers, the radius is the entire dimension. The turtles can use normal modems and there won't be a range problem. Each turtle needing a modem for travel isn't expensive at all. You can even have the turtle switch it out if you want two peripherals to be used at once.

The GPS doesn't solve a lack of persistence. It makes it a LOT easier for the turtle to calculate its state and resume where it left off. Using reliable factors, in this case one's position, makes it easier to resume. Without the mod remembering the state for you, you'll need to pay some type of cost for the state to be recalculated. A GPS is a very low cost compared to the current state of the mod.

Lupus590 · Answer 15 · 2023-03-08T11:05:40.000Z

You are assuming that the conditional will still be valid on restart of the script... let's say it's counting movement, no GPS, and the conditional is if turtle.forward(). When the script starts back up again the turtle may have moved, or it may not have; there is no way to know. Read through the PR I linked for more context.

Would counting fuel work? Save the current fuel value to the disk, then attempt to move. If the move fails then no fuel will be used.
When starting up, check your actual fuel value against the one saved. If they differ then your position data might be wrong. If you also save the move that you are about to do to disk then you can reconstruct your position.
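A sketch of how the fuel idea might look in code. The `turtle` API is passed in as a parameter so the logic can also be run against a fake; `save` is a hypothetical function that persists the state table, and the `pending` field and both function names are assumptions. It relies on a successful move costing fuel and a failed move costing none, so it refuses to run when turtle.getFuelLevel() does not return a number (fuel disabled).

```lua
-- Record the intent and the fuel level on disk, then try to move.
function trackedForward(turtle, state, save)
  local fuel = turtle.getFuelLevel()
  if type(fuel) ~= "number" then
    error("fuel is disabled, so fuel cannot be used to detect moves")
  end
  state.pending = { fuel = fuel, dx = 1 }
  save(state) -- the intent reaches disk before the move is attempted

  if turtle.forward() then
    state.x = state.x + 1
  end
  state.pending = nil
  save(state)
end

-- Call once at startup: settle a move that was in flight when we stopped.
function recoverMove(turtle, state, save)
  local pending = state.pending
  if not pending then
    return
  end
  if turtle.getFuelLevel() < pending.fuel then
    -- fuel was spent after the intent was saved, so the move did happen
    state.x = state.x + pending.dx
  end
  state.pending = nil
  save(state)
end
```

This assumes nothing else changes the fuel level between saving the intent and restarting, for example a refuel in the same step.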

gnif · Answer 16 · 2023-03-08T21:59:20.000Z

Would counting fuel work? Save the current fuel value to the disk, then attempt to move. If the move fails then no fuel will be used.

This is brilliant! It solves my main issue here. However, it would not work if the server is configured so that turtles do not need fuel (but that's no fun anyway).

Lupus590 · Answer 17 · 2023-03-09T10:59:55.000Z

You can also have "magic markers" in the world: blocks that the turtle recognises at specific locations as the end of long routes (or as turns). On startup the turtle just keeps going forwards until it finds the mark.

You could also have localised GPS zones, and the turtle never turns when moving from one zone to the other. Lining itself up to get caught in the next one before leaving the current one.
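The marker idea could be sketched roughly like this. `seekMarker` and its parameters are hypothetical names; the `turtle` table is passed in so the function can be tried against a fake, and it only uses turtle.inspect() and turtle.forward(). Which block serves as the marker is up to the builder.

```lua
-- Walk forward until the block directly ahead is the marker.
-- Returns true plus the number of steps taken, or false if the turtle was
-- blocked by something else or gave up after maxSteps moves.
function seekMarker(turtle, markerName, maxSteps)
  local steps = 0
  while true do
    local hasBlock, block = turtle.inspect()
    if hasBlock and block.name == markerName then
      return true, steps
    end
    if steps >= maxSteps or not turtle.forward() then
      return false, steps
    end
    steps = steps + 1
  end
end
```

For example, `seekMarker(turtle, "minecraft:lapis_block", 64)` would stop in front of the first lapis block on the route, with the block name here being only an example.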

neumond · Answer 18 · 2023-06-15T01:35:09.000Z

Sometimes I think that Lua is too complex for programmable turtles.

Imagine a RISC microcontroller with a fixed amount of memory, some I/O ports, a kilobyte of video memory and several hardware timers and port watchers. Running an emulator is modifying registers and memory, step by step, with a very predictable time for each step (or probably batch of steps). It's easy to freeze, to serialize, to restore and to run from any possible state. To encourage sleeping (like the current yielding) as much as possible, there could be an overheating mechanic: heavy computing raises the temperature and slows the CPU down. Interaction with a disk is atomic reading/writing of blocks, which, again, happens in predictable time and doesn't require saving file descriptor objects. The HTTP API could be designed to be able to fail at any moment, while adding a task to the request queue could be atomic as well. In other words, it's not our program that fails anymore; it's some external component that fails on every chunk unload, while our program simply pauses.
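To make the freeze-and-restore argument concrete, here is a toy register machine whose entire state is a program counter and two registers. The three-instruction set and all names are invented for illustration; the point is only that such a state can be copied at any step and resumed later with the same result.

```lua
-- The machine state is plain data, so it is trivial to copy or write out.
function newMachine(program)
  return { pc = 1, reg = { 0, 0 }, halted = false, program = program }
end

-- Execute exactly one instruction: {"set", r, v}, {"add", r, v} (immediate)
-- or {"jnz", r, target} (jump when register r is not zero).
function step(m)
  local ins = m.program[m.pc]
  if not ins then
    m.halted = true
    return
  end
  local op, a, b = ins[1], ins[2], ins[3]
  local nextPc = m.pc + 1
  if op == "set" then
    m.reg[a] = b
  elseif op == "add" then
    m.reg[a] = m.reg[a] + b
  elseif op == "jnz" and m.reg[a] ~= 0 then
    nextPc = b
  end
  m.pc = nextPc
end

function run(m)
  while not m.halted do
    step(m)
  end
  return m
end

-- Freeze: copy everything except the (read-only) program.
function snapshot(m)
  return { pc = m.pc, reg = { m.reg[1], m.reg[2] }, halted = m.halted }
end

-- Thaw: rebuild a machine from a snapshot and the same program.
function restore(snap, program)
  return {
    pc = snap.pc,
    reg = { snap.reg[1], snap.reg[2] },
    halted = snap.halted,
    program = program,
  }
end
```

A Lua VM has closures, coroutines and open handles to deal with, which is where the extra work described elsewhere in this thread comes from.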

Lupus590 · Answer 19 · 2020-09-03T14:29:40.000Z

If I recall correctly, SquidDev says that this is surprisingly complex to implement. While getting the Lua VM to save its state is relatively easy, it will change how CC works. Changes which will affect every (or nearly every) mod which uses the CC API (peripheral mods and such) as many of them assume that CC won't have persistence. This has the potential to break those mods, although, I do recall that SquidDev (and/or contributor/s) did make changes to how CC runs threads and coroutines, which I believe SquidDev mentioned that peripheral mods also make assumptions about, and that change has been fine for ages now. That said not many mods updating past 1.12.2 is also true for those peripheral mods, so now may be the best time to make changes like the ones needed for this.

In the meantime, I (and others) have written user code which works around the problem:

checkpoint - this will do the bulk of the state/stage persistence for your program. It does mean that you have to write your functions in a way that lets them be terminated and then run again and be fine, which can take a bit of head-scratching. In case you need a working example, here's a tree farm using a turtle.
LAMA - Location Aware Movement API, original source and docs. This keeps track of turtle position and orientation.
patience - persistent alarms/timers, untested but I have no reason to believe that it won't work

SquidDev · Answer 20 · 2020-09-03T16:23:38.000Z

Lua to my knowledge supports saving it's global environment to a file and loading it back up again

Sadly it doesn't by default. OpenComputers uses a modified version of Lua called Eris, which does. However, our version of Lua, Cobalt, does not support it. I started work a while ago on being able to save Lua state. Honestly, it's not too difficult: the code is tedious to write, but it's not complex. However, as Lupus says, that's not where the problem lies.

ComputerCraft itself is really not designed for persistence and, even if it was, there are a lot of subtle places which make things difficult. For instance:

Open files need to be resurrected.
HTTP connections and websockets need to be closed/failed. Relatedly, the request queue needs to be persisted.

They're all possible to solve, but it's a lot, and I'm somewhat anxious about starting such a daunting task.

SirHall · Answer 21 · 2020-09-04T01:17:57.000Z

Hey thanks for the great and detailed replies. I would make a few extra points.
OpenComputers allows you to advise the running environment on where it may freeze your code during execution if the loop is expected to take a long time. Now it's not quite the same thing, but it would be nice at the very least to be able to have some control over how the program is closed.

At the moment it is impossible to have any say over which line execution is stopped at, and as a result it can make writing any complex program virtually impossible. A comparison would be a multi-threaded application (outside the context of Minecraft or ComputerCraft) where multiple threads must share data, but there is no locking mechanism. Any thread may be altering the data while the other threads are running any other line, and you have race conditions. It's probably not a great comparison, but there are multiple synchronization methods that help alleviate issues when multiple threads need to access the same data at once.

Likewise it would be useful if a program could be notified that it is expected to stop running(the world is closing or the chunk is being unloaded) and must save its work and terminate immediately. While I'm not sure of how much control or work it would take to temporarily halt these while the computer is closing down, it would allow some form of safety where otherwise execution could stop at any point in the program without warning.

Thanks again for the amazing work guys!

SirHall · Answer 22 · 2020-09-04T01:27:35.000Z

An extra comment on HTTP connections.

I would argue that these do not need to be re-opened at all. Should you keep the APIs for using them simple enough, you could instead just throw some error where the server response is expected, and as a result the code might simply try to send the request again. For example, you send a request to a server for a particular file. Now, instead of allowing computers to read directly from this stream as it arrives (I'm not sure how ComputerCraft works internally but this is my assumption), cache the response and hand it back to the computer when it calls the request.getResponse() method (I have not used the http API within CC so I do not know the exact method names).

My suggestion is that should the computer close down between the request being sent, and the response arriving back, you can make a decision.
Did the computer close after the response arrived back to ComputerCraft, but before the computer itself could receive it?
Then simply continue execution and return to the computer the http response when called.
Did the computer close before a response even arrived, or is the response malformed (eg. only half the response arrived)?
Then return some form of error that the programmer could then use to re-attempt the http request cleanly.
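The decision described above can be written down as a small sketch. Everything here is hypothetical: `pending` stands for whatever the host remembered about one in-flight request when the computer stopped, and neither function corresponds to an existing CC http call.

```lua
-- pending = { response = <string or nil>, complete = <boolean> }
-- Deliver a response that arrived in full; otherwise report an error the
-- program can react to by sending the request again.
function resumeRequest(pending)
  if pending.response and pending.complete then
    return true, pending.response
  end
  return false, "interrupted"
end

-- `send` performs one attempt and returns a `pending` table as above.
-- Returns the response body and the attempt number, or nil when every
-- attempt was interrupted.
function fetchWithRetry(send, maxAttempts)
  for attempt = 1, maxAttempts do
    local ok, result = resumeRequest(send())
    if ok then
      return result, attempt
    end
  end
  return nil, maxAttempts
end
```

Blindly retrying is only safe for requests that can be repeated without side effects, such as fetching a file.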

I believe though even the issue with websockets could be mostly resolved if the computer could know ahead of time that it needs to finish up what it's doing.

SquidDev · Answer 23 · 2020-09-04T08:03:00.000Z

OpenComputer's allows you to advise the running environment on where it may freeze your code during execution if the loop is expected to take a long time.

Hrmr, I was not aware of this. Do you have more details - is there a function one calls or something?

At the moment it is impossible to have any say at which line the execution is stopped

Aside from some extenuating circumstances (Minecraft crash, high CPU load), this shouldn't be the case. Computers are only stopped when they yield (via a direct or indirect call to coroutine.yield/os.pullEvent). Opening a file, writing to it and then closing it (without yielding between them) should generally occur uninterrupted.
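Plain Lua coroutines show the same behaviour in miniature. In this sketch (an illustration of cooperative scheduling only, not CC's actual scheduler; `makeWorker` and the log format are made up) the two log writes stand in for "open, write, close", and the host can only ever abandon the worker at its yield.

```lua
-- Build a worker that does three units of work, yielding after each one.
function makeWorker(log)
  return coroutine.create(function()
    for i = 1, 3 do
      log[#log + 1] = "begin " .. i -- these two lines run back to back;
      log[#log + 1] = "end " .. i   -- the host cannot pause between them
      coroutine.yield()             -- the only point where we can be stopped
    end
  end)
end

-- The host runs two units and then never resumes the worker again,
-- much like a computer whose chunk was unloaded.
local log = {}
local worker = makeWorker(log)
coroutine.resume(worker)
coroutine.resume(worker)
```

After this runs, `log` holds complete begin/end pairs for units 1 and 2 and nothing for unit 3. Calls that yield internally, such as sleep or turtle movement, count as yield points too, so a save only stays uninterrupted if none of them sit between its open and close.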

Likewise it would be useful if a program could be notified that it is expected to stop running(the world is closing or the chunk is being unloaded) and must save its work and terminate immediately.

This has been mentioned several times, and I'm of the mind that it's pretty impractical: dan200/ComputerCraft#503.

An extra comment on HTTP connections.

I think that's basically what I was suggesting. As I mention in the previous comment, none of it's difficult, there's just a lot of small things like this which need addressing.

[...] as a result can make writing any complex program virtually impossible

As an aside, I do have to disagree with this. There's a lot of complex programs out there - it's pretty easy to write code which doesn't rely on too much persistent state. For instance, see the many inventory management systems.

prozacgod · Answer 24 · 2020-09-15T16:04:00.000Z

Just throwing in a couple of bits: one of my biggest issues with long-running, state-saving scripts is dealing with server crashes and chunks that aren't saved. If a Lua script updates a state file and the chunk does not get saved during a crash, it doesn't matter how much state you're saving; upon next load, your turtle that was in motion is "broken".

I've always felt a nice alleviation to this particular issue would be allowing some amount of NBT data to be written from the computer. That way, the scripts, when loaded, can always have some amount of data that is correctly synchronized with the state of the game. I think exposing it as a table within the computer would be kinda nice, like a global table called "persistent" that did all the index/newindex capturing of the data and automatically had the data in place when running.
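On the Lua side, the index/newindex capturing could look something like the sketch below. `newPersistent` and `onChange` are hypothetical names; `onChange` stands in for whatever the mod would do to copy the data into NBT, which is not something CC offers today.

```lua
-- Returns a proxy table. Reads come from `data`; every write is stored in
-- `data` and reported through onChange, because the proxy itself stays empty
-- and so __newindex fires on each assignment.
function newPersistent(initial, onChange)
  local data = {}
  for key, value in pairs(initial or {}) do
    data[key] = value
  end
  return setmetatable({}, {
    __index = data,
    __newindex = function(_, key, value)
      data[key] = value
      onChange(data) -- the host would sync `data` with the world save here
    end,
  })
end
```

Usage would be along the lines of `persistent = newPersistent(saved, sync)` followed by `persistent.x = 5`. Writes into nested tables (`persistent.pos.x = 5`) bypass the proxy, so a real version would need to wrap those as well or restrict values to flat data.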

(this would also help with chunk/region rollbacks, as sometimes as an admin I'll do that for my players if nobody else was affected or there's a glitched block etc - although I can't recall ever having done a rollback on a chunk that happened to have a turtle running a state saving script in it, but perhaps that's a symptom of the state of things vs the lack of willingness to create/use such programs)

crabdancing · Answer 25 · 2020-09-30T05:28:14.000Z

@prozacgod Maybe we could just have an API for sending a warning to the bots to reorient themselves in the event of a server crash.

prozacgod · Answer 26 · 2020-10-04T03:43:57.000Z

@alxpettit I could see that working, but I wonder if that would be more effort than just providing the NBT storage solution. Even if the data was just a raw string limited to 200 bytes or so, that would likely be more than enough for a basic "save my quick state for restoration". A script could store data as a Lua blob and eval it during restart, instead of us fleshing out a more "complete" API for serializing properties between Java/NBT/Lua, and it would be limited enough not to have any foreseeable issues like abuse of data size in the NBT.

Lupus590 · Answer 27 · 2020-10-04T10:09:07.000Z

@prozacgod And why is the disk not good enough for that? Your idea sounds like checkpoint too. See here: #535 (comment)

SquidDev · Answer 28 · 2020-10-04T10:24:27.000Z

@Lupus590 I think they mention some issues in a previous comment:

One of my biggest issues with long running state saving scripts is dealing with server crashes, and chunks that aren't saved. If a Lua script updates a state file, and the chunk does not get saved during a crash, [...] next load your turtle that's in motion is "broken".

This has always been an issue. IIRC OpenComputers mitigates this by only flushing to disk when the world saves. It's probably something we should do too, as it's a bit of a pre-requisite for any form of actual persistence.

That said, if your server is crashing so much that this is a major issue, you've got bigger problems :p.

prozacgod · Answer 29 · 2020-10-04T19:03:30.000Z

I've kinda liked that CC doesn't sync - but you're right in that it's probably the "real" solution. I often use external editors with CC, it'd be nice if for say we could turn on/off sync on a floppy, just to enable a place for an external editor to work.

But you're correct in that this is the "best possible solution" all around. I'm sure this is just sorta nitpicking, but I kinda wonder if there are race conditions between the syncing of the chunk and the syncing of the files. Would there still be edge cases? Even if there were, it would be better than what exists now, and you can't go around solving every single weird edge case.

SquidDev · Answer 30 · 2020-10-04T19:10:44.000Z

I often use external editors with CC, it'd be nice if for say we could turn on/off sync on a floppy

It's definitely something I'd allow disabling via the config, but not something I'd want to do in-game. It might be possible to pick up changes to files made outside Minecraft, but doing the inverse is obviously not feasible without disabling sync entirely.

I kinda wonder if there's race conditions for the syncing of the chunk/syncing of the files, would there still be edge cases?

Well, ideally the two would happen at the same time. Obviously there's going to be problems if the game is unexpectedly terminated while saving, but if that happens you're quite possibly going to end up with some world corruption anyway.
