CC: Tweaked

Server crashes due to presumed OOM issues (no crash log); debug log suggests reporting.

LordDarthDan opened this issue · 14 comments

commented

Minecraft Version

1.18.x

Version

1.101.2

Details

https://www.dropbox.com/scl/fi/hrcnx9slrsbwegdw7a25z/debug-1-3-.log.gz?rlkey=xx4jbsjgziu11vep7abxnc1o2
(The log is 18 megabytes, so I can't upload it here.)

The log contains the following:
[ComputerCraft-Computer-Worker-1/ERROR] [computercraft/]: Trying to run computer #47 on thread ComputerCraft-Computer-Worker-1, but already running on ComputerCraft-Computer-Worker-0. This is a SERIOUS bug, please report with your debug.log.

The crash often occurs upon a player joining the server.
The computers in the report run the following code:

id30: runs https://github.com/zyxkad/cc/blob/master/storage/depot.lua executed via startup code of

while true do
 shell.run('depot')
 sleep(1)
end

id47: runs only

local a = peripheral.wrap('tconstruct:smeltery_1')

while true do
 local l = a.list()
 redstone.setOutput('top', l and #l < a.size())
 sleep(1)
end

The details of the server and modpack are:
Hourglass Server Details 31.12.2023.txt
(includes Advanced Peripherals, CCTech and Valkyrien Computers)

commented

Thanks for the report! Would you be able to change the file to be public on Dropbox - I'm unable to read the logs right now!

commented

> Thanks for the report! Would you be able to change the file to be public on Dropbox - I'm unable to read the logs right now!

Oh, sorry! Dropbox link sharing is a little weird. I think I fixed it tho!

commented

I tested this. The code that reliably reproduces the error

[08:07:38] [ComputerCraft-Computer-Monitor-0/WARN]: Terminating computer #0 due to timeout (running for 13.629596041000001 seconds). This is NOT a bug, but may mean a computer is misbehaving.
Thread ComputerCraft-Computer-Worker-0 is currently WAITING
  on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@46953484
  at TRANSFORMER/[email protected]/dan200.computercraft.core.computer.ComputerExecutor.resumeMachine(ComputerExecutor.java:666)
  at TRANSFORMER/[email protected]/dan200.computercraft.core.computer.ComputerExecutor.work(ComputerExecutor.java:628)
  at TRANSFORMER/[email protected]/dan200.computercraft.core.computer.ComputerThread$Worker.runImpl(ComputerThread.java:702)
  at TRANSFORMER/[email protected]/dan200.computercraft.core.computer.ComputerThread$Worker.run(ComputerThread.java:641)
  at [email protected]/java.lang.Thread.run(Thread.java:833)
Enqueued command: ABORT
Enqueued events: 0
CobaltLuaMachine is terminated

is

local threads = {}
while true do
  for i = 1, 100000 do
    local thr = coroutine.create(function() end)
    threads[thr] = 1
  end
  -- clear threads
  for k, _ in pairs(threads) do
    threads[k] = nil
  end
  print(os.clock())
  sleep(0) -- yield
end
commented

It's a coroutine thread leak or maybe a table key leak. Before I ran the code, my lowest memory usage was around 8000 MB; while the game was running it would go up to 9000 MB and drop back to 8000 MB after GC.
After I run the code, the memory usage after GC keeps increasing until it reaches 99%, and then the error above is thrown. @SquidDev
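
One way to check the "memory after GC keeps rising" observation outside the game is a bounded variant of the reproduction that records heap size after a full collection each round. This is only a sketch for a standalone Lua interpreter (5.1 or later); on a CC:T computer `collectgarbage` may not be available. `measure_rounds` and its parameters are made-up names for illustration.

```lua
-- Bounded variant of the reproduction: create coroutines, drop them,
-- force a full GC, and record how much memory is still in use.
-- measure_rounds is a hypothetical helper, not part of any CC:T API.
local function measure_rounds(rounds, per_round)
  local retained = {}
  local threads = {}
  for r = 1, rounds do
    for _ = 1, per_round do
      local thr = coroutine.create(function() end)
      threads[thr] = 1
    end
    -- clear threads, as in the original reproduction
    for k in pairs(threads) do
      threads[k] = nil
    end
    collectgarbage("collect")
    retained[r] = collectgarbage("count") -- kilobytes in use after GC
  end
  return retained
end

for r, kb in ipairs(measure_rounds(5, 10000)) do
  print(string.format("round %d: %.0f KB in use after GC", r, kb))
end
```

On a runtime without the leak the figures should stay roughly flat; steady growth round after round would match the behaviour described above.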

commented

I've tested again: if you make the table weak-keyed with setmetatable(threads, {__mode='k'}), the leak doesn't happen.
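
For anyone unfamiliar with weak tables, here is a small sketch of what `__mode='k'` changes. It is written for a standalone Lua interpreter (it relies on `collectgarbage`, which a CC:T computer may not expose), and `count` and `fill` are helper names invented for this example.

```lua
-- Count the entries still present in a table.
local function count(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

-- Use n fresh coroutines as keys; no other reference to them is kept.
local function fill(t, n)
  for _ = 1, n do
    t[coroutine.create(function() end)] = 1
  end
end

local strong = {}                               -- ordinary table: keys keep coroutines alive
local weak = setmetatable({}, { __mode = "k" }) -- weak keys: entries may be collected

fill(strong, 1000)
fill(weak, 1000)
collectgarbage("collect")

print("strong table entries:", count(strong))
print("weak table entries:", count(weak))
```

With weak keys the table no longer counts as a reference to the coroutines, so the collector is free to drop those entries without the script clearing them by hand.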

commented

Thank you for the additional information, that was very helpful!

commented

Hey @SquidDev thanks for the quick fix.
But did you figure out what causes the following error?
[ComputerCraft-Computer-Worker-1/ERROR] [computercraft/]: Trying to run computer #47 on thread ComputerCraft-Computer-Worker-1, but already running on ComputerCraft-Computer-Worker-0. This is a SERIOUS bug, please report with your debug.log.
IMO even if the table has a memory leak, a computer should not be run on different threads at the same time.

commented

I didn't, no - the relevant code is pretty different on the latest versions of CC:T, and I'm not sure I can face going back and looking at the older version again.

I have a suspicion that the original worker (ComputerCraft-Computer-Worker-0) died or was killed without cleaning up properly (possibly due to the OOM), and then a new worker was spawned the next time we came to run a task.
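
To make that suspicion concrete, here is a toy model of how a worker that exits without releasing a computer could lead to the "already running on" message. This is not CC:T's actual executor code; every name here (`new_computer`, `run_on`, the `skip_cleanup` flag) is hypothetical and exists only for illustration.

```lua
-- Hypothetical model: a computer may only be owned by one worker at a time.
local function new_computer(id)
  return { id = id, owner = nil }
end

-- Run `task` for `computer` on `worker`. Returns false plus a message if the
-- computer is still marked as owned by a different worker.
-- skip_cleanup simulates a worker that dies before releasing ownership.
local function run_on(computer, worker, task, skip_cleanup)
  if computer.owner ~= nil and computer.owner ~= worker then
    return false, string.format(
      "Trying to run computer #%d on %s, but already running on %s",
      computer.id, worker, computer.owner)
  end
  computer.owner = worker
  local ok, err = pcall(task)
  if not skip_cleanup then
    computer.owner = nil -- normal path: release ownership even if the task failed
  end
  return ok, err
end

local c = new_computer(47)
-- Worker-0 fails mid-task and never releases the computer.
run_on(c, "Worker-0", function() error("out of memory") end, true)
-- The next worker then finds a stale owner and reports the conflict.
print(run_on(c, "Worker-1", function() end))
```

In this model the second call fails only because of the stale `owner` field, not because two workers are really running the computer at once, which is the scenario being suggested.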

commented

@SquidDev Sorry to bother you, but will this fix be released any time soon? Or should I try to build this version of the mod myself?

commented

There has been a fix released for Minecraft 1.20.1 (CC:T 1.109.3). I'm afraid I'm no longer providing updates for older versions of Minecraft - you might be able to backport the fixes, but it will be a bit of a slog.

commented

> There has been a fix released for Minecraft 1.20.1 (CC:T 1.109.3). I'm afraid I'm no longer providing updates for older versions of Minecraft - you might be able to backport the fixes, but it will be a bit of a slog.

Judging by the specific fix, it was addressed in Cobalt, which isn't version dependent, as far as I understand - not the actual mod.
The problem was discovered in 1.18.2 and had been seriously killing my server. I would assume it is possible to just build 1.18.2 with the new Cobalt implementation - unless, of course, there's been some breaking changes I do not know of.

commented

The 1.18.x branch of CC:T is still on Cobalt 0.6 - there's been several breaking changes since then (notably the rewrite of coroutines, and the update to Lua 5.2). You're probably better off cherry-picking the fix to the older version of Cobalt.

commented

> The 1.18.x branch of CC:T is still on Cobalt 0.6 - there's been several breaking changes since then (notably the rewrite of coroutines, and the update to Lua 5.2). You're probably better off cherry-picking the fix to the older version of Cobalt.

Thank you. I will try to do exactly that.

commented

If anyone else has run into this problem and wants a backported fix:
https://github.com/zyxkad/CC-Tweaked/releases/tag/v1.18.2-1.101.3%2B1
PS: this update should only be required on the server side.