CC: Tweaked

CC: Tweaked

59M Downloads

Absolute and relative paths

PrincessRTFM opened this issue ยท 8 comments

commented

I'm going to be honest: path handling in CC is kind of atrocious right now.

The fs and io APIs use absolute paths, the shell API handles relative ones, and only functions that are specifically built to handle relative paths differentiate between absolute-style with a leading / and relative-style without. Any functions that don't simply assume that all paths are absolute. If you call shell.resolve(path) from outside the root, it returns what looks for all the world like a relative path, and if you don't keep track of which paths you've resolved and which you haven't, you'll easily end up mangling them. Right now, unless you know how a specific program handles its paths, it's just not safe to call programs from anywhere but the root, since that's the only place where it doesn't matter either way.

I've been writing a custom path handling API, on top of which I've been further building a custom fs API that specifically recognises absolute paths as beginning with a /, but this is really just a personal band-aid solution for a fundamental problem with CC itself: inconsistent path handling makes it easy for things to break.

I know it would be a fairly significant task, but I'm officially submitting a request that path handling be reworked for consistency, making all APIs support both relative and absolute paths, differentiating based on the presence or absence of a leading /.

I'm aware that, if handled incorrectly, this could break existing (poorly-written) programs that rely on fs and io treating even relative-looking paths as absolute. To that end, perhaps a new pair of APIs could be implemented as a sort of compatibility layer, such as absfs and absio. On the other hand, since running programs from the root makes it irrelevant and that's how most users run their programs, it may not be enough of a break to actually matter.

commented

Yeah, the way paths are handled within CC has been a long-standing complaint of mine. It's confusing, and very easy to get wrong (indeed, even CC screws this up). Unfortunately, it's not obvious to me how best to fix this.

I think there's two separate issues here:

  • Everything uses absolute paths: this applies to both fs and io, but also other builtin functions like dofile, loadfile, and a couple of other CC-specific ones like os.run, settings.load.

    I don't think we can realistically change these to work on relative paths. Very few programs bother to put / at the front of paths, so this will break a lot of code. We could add a new module that behaves similarly to the existing fs API, but using relative paths. Though then I'm a little wary that you have some modules using relative paths, and some using absolute ones, which is confusing a different way.

  • Path-manipulating functions "lose" the absoluteness of paths. So for instance, fs.combine("/", "foo") returns "foo" rather than "/foo".

    We probably could change shell.resolve to return a path with a leading / without much repercussion, which would at least allow calling shell.resolve multiple times without issue.

    However, I'm not sure we could change fs.combine/fs.getDir without causing issues. I've definitely written code like while path ~= "" do f(path); path = fs.getDir(path) end in the past, which would no longer work if fs.getDir("/foo") returns "/".

For fs and io apis, relative path would be relative to what? Cause current directory and stuff are shell level concepts and not really applicable to filesystem level.

This is definitely workaroundable. You can do something similar to OC, and have some hidden global state that stores the current working directory, and set/restore that whenever a program is resumed/yielded.

commented

Question for this suggestion. For fs and io apis.. relative path would be relative to.. what? Cause current directory and stuff are shell level concepts and not really applicable to filesystem level.

commented

Which is, honestly, also not a great design choice. If a program is running, it has a current directory. That's an attribute of live processes. The filesystem doesn't exist in a vacuum. A running program's active directory shouldn't be limited to a single API. That's most of my problem, actually - it currently is, but that doesn't make sense.

commented

What about program executed by different program from rednet message? Or program executed by pastebin run or wget run ? Those don't have a directory they exist in and would basically have relative paths of their parent programs? [That is already currently a issue with shell.resolve(path) and those types of programs]

commented

They do, or at least they should. When you run pastebin run or wget run, its current directory is the directory you run it from. It then runs something else, which should inherit the current directory. This is how basically every modern shell works. Everything has a working directory. Any child processes they launch inherit the same working directory. Child processes can't change the working directory of their parents (which is possible in CC, and shouldn't be for the same reason it isn't possible in real shells).

commented

However, I'm not sure we could change fs.combine/fs.getDir without causing issues. I've definitely written code like while path ~= "" do f(path); path = fs.getDir(path) end in the past, which would no longer work if fs.getDir("/foo") returns "/".

I feel like this is just a symptom of the problem rather than a reason not to fix it. Take a gander at code that uses fs.getDir. There seems to be a lot more instances of brute forcing the path to be absolute path = '/'..fs.getDir(...) than code that expects an empty string as root fs.getDir(...) == "". Three I found back to back:
image
1 instance that anticipates either "/" or "" (good job squid)
2 instances that brute force the path to root. (the middle one does so earlier, trust)

commented

Very few programs bother to put / at the front of paths, so this will break a lot of code.

Only if that code is executing from outside of the filesystem root, to be fair. For all programs, running from / itself means that absolute and relative paths will point to the same place. And I think that the consistency and sanity of relative-looking paths all actually being relative is worth the risk that running old programs from outside / might have issues. I doubt it's overly common in the first place in the first place, at least for non-core programs.

The issues with fs.combine and fs.getDir are bigger problems, because I wouldn't be surprised if that's not an uncommon thing people have done, and that would break no matter where the code is running from. I don't think there's a way to fix it and maintain backwards-compatibility, at least not without some really unfortunate kludge to fs.combine... fs.getDir could be given another argument with a boolean indicating whether you want it to return "/" instead of "", but the only solution I can think of for fs.combine is that if the first argument is "/" then it should return a "really" absolute path. Even that much could be an issue for some programs, not to mention being a bit of a hack.

On the other hand, a new API with the same methods that do operate with properly formed paths would mean a whole second place that needs to be kept updated alongside the first, even if it wouldn't require copying all the code. My existing project defines a path API for handling of string paths, and then a replacement fs API that programmatically wraps lists of methods from the underlying core fs API to just process path arguments before passing them through, but if any new methods are added to fs then I'd need to update it appropriately:

-- Most wrapped functions only need to resolve a single given path
-- and then pass it directly to the underlying fs method
for _, func in ipairs(singlePathArgumentMethods) do
	fslib[func] = function(name, ...) return fs[func](path.cwd(name), ...) end
end

It may be time to decide how important backwards-compatibility is for old CC programs.

commented

Sorry if I step on somebody's toes, I just found this topic.

A complete solution should include quite a few moments, if you ask me.
When shell starts a program, it should pass to it a clone of its configured FS object as well as a clone of its configured STDIO object. These objects should inside themselves contain the copies of references to current working directory (FS) and IO streams, if any (STDIO).
The program then could change them as it see fit, but these changes wont affect the parent objects.
Once that's done, you get the rest for free.