Create FileOptions specification for arguments to cmdline scripts
LadyCailin opened this issue · 0 comments
Allowing users to create a prototype for the arguments passed in to cmdline scripts would serve multiple purposes. One, it would allow for self-documenting scripts. Two, it would allow programmatic creation of autocompletion for various shells. Three, it would allow for standard and automatic parsing of arguments. Four, it would allow automatic creation of help files.
Example
Consider for example a program which accepts five options: a verb (verbe), a flag (short option f, long option flage), an optional named argument (arge), a named enum (enume) and some default file arguments. Additionally, there are two parameter sets, parame and paramf. We might invoke this command as:
cmd.ms verbe -f --arge value --enume x file1.txt file2.txt
or
cmd.ms verbf --arge value --enume y file1.txt
The file options for cmd.ms might then look like:
<!
arguments:
verb verbe{parame}/verbf{paramf}/verbg Description,
flag f/flage{parame} Description in plain english\, with support for escaped commas,
[arg] arge{parame/paramf} string Description,
arg enume enum(x/y/z) Description,
default file... Description
>
General Format
The general format is a comma-separated list of specifications at the top level, with zero or one default, and zero or more verbs, flags and args. Each specification is one of: verb, flag, arg, or default. For arg and default, if the argument is optional, it should be specified with square brackets, such as [arg]. In general, parameter sets are specified with curly braces, i.e. {parame}. (More on parameter sets below.) Each of the four specification types has slightly different semantics, though the general format is type parameters description, where the description is a plain text explanation of that parameter. Commas separate the specifications, and semicolons end the file option, so any comma or semicolon inside a description must be escaped with a backslash: \, or \;. In all cases, the description is optional, though highly recommended.
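The escaping rule above can be sketched in a few lines. This is an illustrative Python sketch only, not the MethodScript implementation; the function name `split_specs` is hypothetical, and it assumes that escapes other than \, and \; (such as \/) are passed through untouched for later stages to handle.

```python
def split_specs(body):
    """Split the body of an arguments file option on unescaped commas."""
    specs, current = [], []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            if nxt in (",", ";"):
                current.append(nxt)       # escaped comma/semicolon becomes literal
            else:
                current.append(ch + nxt)  # assumption: leave other escapes alone
        elif ch == ",":
            specs.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        specs.append(tail)
    return specs
```

Each returned string is then one specification of the form type parameters description.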
When it comes to usage, flags and args share the same semantics for short vs long names: short names are single characters and use a single dash, and long names are two or more characters and use two dashes, for instance -f or -a or --long.
verb
A verb is a non-dashed parameter which acts as the primary mode selector for the command. The verb is optional, and when left off, the command will only have one "mode" in which it operates, though parameter sets are still supported even when there is no verb. The general format is 'verb' enumSet description, where enumSet is a slash-separated list of one or more literal values, followed by a plain text description of the parameter. There is no separate description for each verb when using this slash-separated format, so it is suggested to either expound on the overall mode operation in the file description elsewhere in the file options and keep a short description of each verb in the parameter description, or simply list each verb out as its own specification. Unlike the other parameter types, each individual verb in the enum set can have its own parameter set definition. While not necessary, the verb is often the primary way for a user to select the parameter set to use, and so each verb will also have its own unique parameter set selector. Providing multiple verbs in one definition of the verb option is effectively the same as defining each verb in its own prototype with the same description. However, for complex commands, it may be clearer to split them up, e.g.:
verb verbe{parame} description of verbe,
verb verbf{paramf} description of verbf
In any case, the user can provide only one verb, and each verb value is globally unique within the program.
Thus, if the prototype is verb verbe{parame}/verbf{paramf}/verbg then the user can run one of cmd.ms verbe, cmd.ms verbf, or cmd.ms verbg.
Verbs do not have short forms, unlike flags and args.
The parameters value should not have any spaces in it; the first space begins the description.
In order to support parsing of Windows style arguments, it is possible to provide a verb with a slash in it; however, this requires escaping the slash with \, i.e. verb \/verbe/\/verbf. It is not recommended to use this syntax except for backwards compatibility reasons.
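To make the enumSet rules concrete, here is a minimal Python sketch (again illustrative, not the proposed implementation) that maps each verb to its optional parameter set and honors the \/ escape. The name `parse_verb_set` and the returned dictionary shape are assumptions made for the example.

```python
import re

def parse_verb_set(enum_set):
    """Map each verb in a slash-separated enumSet to its parameter set (or None)."""
    verbs = {}
    # Split only on slashes that are not escaped with a backslash.
    for part in re.split(r"(?<!\\)/", enum_set):
        match = re.fullmatch(r"(?P<name>[^{}]+)(?:\{(?P<pset>[^{}]+)\})?", part)
        if match is None:
            raise ValueError(f"malformed verb: {part!r}")
        name = match.group("name").replace("\\/", "/")  # unescape \/ to /
        verbs[name] = match.group("pset")
    return verbs
```

With the prototype used above, this yields verbe and verbf bound to parame and paramf respectively, and verbg bound to no parameter set.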
flag
A flag is an option which is either present or not. It does not have any value attached to it, and is simply a boolean. Flags are defined with the following format: 'flag' name description, where name is the name of the flag. Flags can have one or two names: a short name, a long name, or both. The short name must be a single character, and the long name must be two or more characters. If both are provided, they should be given as a slash-separated value, starting with the short name. That is, all three of the following are accepted: f, flag, or f/flag. If the flag is part of a parameter set, the selector should follow, i.e. f/flag{parame} or f{parame}.
Short flags have an advantage that they can be combined into a single argument when using the short code. For instance, consider the following definition:
flag e/ee,
flag f/ff,
flag g/gg,
flag h/hh
Then the user can set the e, f, and g flags with any of the following combinations: cmd.ms -e -f -g, cmd.ms -efg, cmd.ms -ef --gg, cmd.ms -e --ff --gg, etc.
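The equivalence of those invocations can be sketched as follows. This is an illustrative Python sketch with a hypothetical `parse_flags` helper; it assumes that non-dashed tokens are simply skipped and that an unknown flag is an error.

```python
def parse_flags(argv, prototypes):
    """Return a dict mapping every short and long flag name to a boolean.

    prototypes is a list of (short, long) name pairs, e.g. ("e", "ee").
    """
    aliases = {}
    result = {}
    for pair in prototypes:
        for name in pair:
            aliases[name] = pair
            result[name] = False

    for token in argv:
        is_long = token.startswith("--")
        if is_long:
            names = [token[2:]]      # one long name per token
        elif token.startswith("-"):
            names = list(token[1:])  # "-efg" is shorthand for -e -f -g
        else:
            continue                 # not a flag; ignored in this sketch
        for name in names:
            # Long names need two dashes and two or more characters.
            if name not in aliases or is_long != (len(name) >= 2):
                raise ValueError(f"unknown flag in {token!r}")
            for alias in aliases[name]:
                result[alias] = True  # set both the short and long name
    return result
```

All four command lines listed above produce the same result with this sketch: e, f and g (and their long names) are true, while h and hh stay false.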
Since flags are inherently "optional" (meaning if left off, this is considered a false value), it is an error to specify that the argument is optional (i.e. [flag]).
Version 2.0 feature: Setting a flag like -f or --flag implies -f:true or --flag:true. Flags may be explicitly set as "unset" by using the following format: -f:false. Functionally, this is identical to simply leaving off the argument, but in some cases, particularly if the program is meant to be executed by other programs, it may be useful to always set the flag to some value.
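A minimal Python sketch of the explicit form, for illustration only: the helper name `parse_flag_token` is hypothetical, and rejecting any value other than true or false is an assumption, since the proposal does not say how other values are treated.

```python
def parse_flag_token(token):
    """Return (name, value) for a flag token such as -f, --flag:true or -f:false."""
    if token.startswith("--"):
        body = token[2:]
    elif token.startswith("-"):
        body = token[1:]
    else:
        raise ValueError(f"not a flag: {token!r}")
    name, sep, value = body.partition(":")
    if not sep:
        return name, True  # a bare flag implies :true
    if value not in ("true", "false"):
        raise ValueError(f"flag value must be true or false: {token!r}")
    return name, value == "true"
```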
arg
An arg is essentially a named parameter to the program. The prototype follows the general format: 'arg' name type description, where name follows the same rules as the name in flag, i.e. short, long, or short/long (optionally followed by a parameter set selector), and type is one of the supported argument types: string, file, number, or enum. For string and number, no autocompletion will be provided, though number is error checked to ensure the argument is a number. For file, file completions will be provided, and for enum, completion of the possible values is provided. In all cases, the type may be a multi-value option, in which case the type should be immediately followed by three dots, i.e. string... for a multi-value string. For enums, the list of values should follow within parentheses and be slash separated, i.e. enum(x/y/z).
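The type grammar is small enough to sketch with one regular expression. The following Python is illustrative only; `parse_type` and `check_value` are hypothetical names, and placing the three dots after the closing parenthesis for a multi-value enum is an assumption the proposal does not spell out.

```python
import re

TYPE_RE = re.compile(
    r"(?P<base>string|file|number|enum)"
    r"(?:\((?P<values>[^()]+)\))?"   # slash-separated list, enum only
    r"(?P<multi>\.\.\.)?"            # trailing dots mark a multi-value type
)

def parse_type(spec):
    """Parse a type such as 'string...', 'number' or 'enum(x/y/z)'."""
    match = TYPE_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"unknown type: {spec!r}")
    base, values = match.group("base"), match.group("values")
    if (base == "enum") != (values is not None):
        raise ValueError("a value list is required for enum and only for enum")
    return {
        "base": base,
        "values": values.split("/") if values is not None else None,
        "multi": match.group("multi") is not None,
    }

def check_value(value, parsed):
    """Apply the error checking described for number and enum types."""
    if parsed["base"] == "number":
        float(value)  # raises ValueError if the text is not a number
    elif parsed["base"] == "enum" and value not in parsed["values"]:
        raise ValueError(f"{value!r} is not one of {parsed['values']}")
    return value
```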
The user may provide the arguments thus: cmd.ms --arge a, or in the case of multi-value options: cmd.ms --arge a b.
Version 2.0 feature: For multi-value options, the argument may be repeated anywhere in the parameters: cmd.ms --arge a --arge b.
default
A default argument (or multi-value) is a list of unnamed arguments. The format of the default prototype is: 'default' type description, where type follows the same rules as normal arguments.
The user may provide the arguments thus: cmd.ms a, or in the case of multi-value options: cmd.ms a b c. Additional arguments or flags may come anywhere in the parameter list, though once a multi-value named argument has started, it becomes impossible to add further default arguments after it, since the named argument will take priority and consume the values that follow. Thus, assuming the following prototype:
flag f,
default string...
it can be run with any of the following: cmd.ms -f a b, cmd.ms a -f b, or cmd.ms a b -f.
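As an illustration of why the position of the flag does not matter, here is a deliberately tiny Python sketch hard-coded to this one prototype (flag f plus default string...). The function name is hypothetical, and the result only mimics the 'flag' and 'default' keys of the structure proposed later in this issue.

```python
def parse_flag_and_defaults(argv):
    """Parse argv for the prototype: flag f, default string..."""
    flag_f = False
    defaults = []
    for token in argv:
        if token == "-f":
            flag_f = True           # the flag may appear at any position
        else:
            defaults.append(token)  # everything else is a default value
    return {"flag": {"f": flag_f}, "default": defaults}
```

All three invocations above give f as true and the default values a and b, in that order.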
Parameter Sets
TODO
Autocompletions
Autocompletions are automatically provided for popular shells in cases where it makes sense. For instance, verbs, enums, and file parameters will all be autocompleted where appropriate. This functionality is automatically installed with the install-cmdline option. Support is initially intended for bash (including WSL on Windows), but additional support for other shells can be considered where demanded and possible.
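For bash, the simplest form of completion is a fixed word list registered with the complete builtin (complete -W). The Python sketch below only shows how such a line could be generated from the verbs and long option names of a prototype; the helper name is hypothetical, and how install-cmdline would actually install completions is not specified here.

```python
import shlex

def bash_completion_line(command, words):
    """Build a 'complete -W' line offering a fixed word list for a command."""
    word_list = shlex.quote(" ".join(words))
    return f"complete -W {word_list} {shlex.quote(command)}"
```

A fixed word list covers verbs and enum values; file arguments would need bash's file completion instead, which this sketch does not attempt.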
Usage in code
In general, the core function used to parse the input parameters is the parse_opts() function. This returns a structure which contains the parameters as passed in by the user. At the top level, the structure looks like this:
structure(
'verb': enum,
'flag': map,
'arg': map,
'default': array,
'parameterSet': enum,
'additional': array<string>,
'raw': array<string>
)
Since there is only one verb possible, it is simply an enum value of one of the possible values.
Flag contains all possible flags, each mapped to a boolean: true if it was set, and false otherwise. Note that both the short and long names will be in the array. If for instance the prototype looks like flag f/flage and the flag is set, then the resulting object would look like:
array(
'flag': array(
'f': true,
'flage': true
));
Arg contains all the arguments that were passed in. Like flag, short and long arguments will both contain the input values.
array(
'arg': array(
'a': 'arg',
'arge': 'arg',
'multi': array('a', 'b', 'c')
))
For arguments that are not provided at all, null will be set. If the argument was set, but no value was passed in, the value will be an empty string (in the case of single arguments) or an empty array (in the case of multi values).
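The three cases (not provided, provided without a value, provided with values) can be sketched as below. This is illustrative Python, with None standing in for null; the helper name and the shape of its inputs are assumptions made for the example.

```python
def build_arg_map(prototypes, provided):
    """Build the 'arg' map, filling in every short and long name.

    prototypes: list of (names, multi) where names is a tuple of aliases.
    provided:   maps the name the user typed to the list of values given.
    """
    result = {}
    for names, multi in prototypes:
        # Look up the values under whichever alias the user typed, if any.
        values = next((provided[n] for n in names if n in provided), None)
        if values is None:
            entry = None                         # not provided at all
        elif multi:
            entry = list(values)                 # may be an empty array
        else:
            entry = values[0] if values else ""  # set, but no value given
        for name in names:
            result[name] = entry                 # short and long names agree
    return result
```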
Default is simply an array of the default values.
ParameterSet is the selected parameter set, and will be an enum value of the possible parameter sets defined. The selected parameter set doesn't affect the value of the args.
There are two special values, 'additional' and 'raw'. Additional contains a raw list of unrecognized arguments. This is not all arguments passed in: any arguments that match the prototype will be populated in the normal manner. Raw, on the other hand, contains an array of all arguments, with no processing provided, other than parsing the arguments into an array (per how the shell passed them in to the program).
Version 1.0: This structure will be an associative array, and can be accessed for instance via @args['verb'] or @args['flag']['myFlag'].
Version 2.0: Once objects are added, this will be an automatically generated class based on the prototype, so things like string @verb = @args->verb; and boolean @b = @args->flag->myFlag; will be possible and statically typed.
The function parse_opts with no arguments implies: parse_opts(reflect_pull('argumentsClass', reflect_pull('file')), reflect_pull('command')), where reflect_pull('argumentsClass', reflect_pull('file')) returns the prototype class (a magic value in version 1.0), an automatically generated class which is created when a file has the arguments File Option. However, custom arguments may be passed in to parse_opts to use this functionality outside the normal use case.
Help text generation
One benefit of providing complete information through this mechanism is that help text generation can be provided in a programmatic way. In general, the script-help utility can always be accessed using mscript script-help cmd.ms, though it is recommended for more complex scripts to also provide a help argument, which simply prints out the generated help text and exits. The auto-help file option can be specified to generate these arguments automatically, which implies the following prototypes:
verb help,
verb \/?,
flag h/help
and additionally automatically provides priority handlers to print the help text and exit if the user uses one of these arguments. In any case, help text may be generated on demand using the help_text() function, which can be printed out with an error message should the user provide nonsensical inputs, for instance.
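To show how help text falls out of the prototype, here is a small Python sketch that renders one aligned line per specification. It is not the proposed help_text() function; the name `render_help`, the tuple input format, and the output layout are all assumptions for illustration.

```python
def render_help(command, specs):
    """Render help text from (kind, parameters, description) tuples."""
    lines = [f"Usage: {command} [options]", ""]
    # Pad the left column so that all descriptions line up.
    width = max((len(f"{kind} {params}") for kind, params, _ in specs), default=0)
    for kind, params, description in specs:
        left = f"{kind} {params}"
        lines.append(f"  {left:<{width}}  {description}".rstrip())
    return "\n".join(lines)
```

For example, passing the verb and flag specifications from the example at the top of this issue produces a usage line followed by one line per specification.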