~tpapastylianou/process_optargs

A super-simple sourceable function for processing commandline options and arguments in bash

#process_optargs

A super-simple sourceable function for processing commandline options and arguments in bash

Note: Main development for this project is done on sourcehut. However, given the visibility and social nature of GitHub's starring system, if you appreciate this project and would like to show it or help promote it, please feel free to star it on GitHub.

#Quick usage summary and example

  1. Source the process_optargs file (which contains the definition of the process_optargs function) from the sourceable_functions directory into your script.

    Note: Running the process_optargs function has the side effect of also loading the init_ansicolor_vars, error, warning, and is_in functions; these are used by process_optargs internally, but are also useful functions in their own right, which you might want to use later in your script, when processing the obtained options / input arguments. You can also source these separately if you need to.

  2. Create the following variables/arrays in your function's scope:

    • An empty associative array called OPTIONS, to be filled by process_optargs

    • An empty indexed array called ARGS, to be filled by process_optargs

    • An indexed array called VALID_FLAG_OPTIONS with valid flag-option names (see below for details)

    • An indexed array called VALID_KEYVAL_OPTIONS with valid keyval-option names (see below for details)

    • A variable called COMMAND_NAME, initialized with the caller's name.

  3. Call process_optargs with the relevant arguments to parse (typically just "$@")

  4. Use the now filled OPTIONS and ARGS arrays to perform validation checks on the obtained options/arguments. The error, warning, and is_in functions may be useful to you here.

Example use:

source /path/to/process_optargs

function myfunction () {    # The function whose options/args we want to process

 # Initialise the necessary variables that will be checked / populated by process_optargs
   local -A OPTIONS=()   # Note: -A (i.e. with a capital 'A') makes this an associative array. See `man bash` for details if you're not familiar with associative arrays in bash.
   local -a ARGS=()      # Note: -a (i.e. with a small 'a') explicitly declares this as a normal indexed array.
   local -a VALID_FLAG_OPTIONS=( -h/--help -v --version )    # Note: -h and --help denote the same flag, whereas -v and --version are separate flags here! (e.g. '-v' could be for 'verbose')
   local -a VALID_KEYVAL_OPTIONS=( -r/--repetitions )
   local COMMAND_NAME="myfunction"

 # Call process_optargs to perform the processing and populate the OPTIONS and ARGS arrays.
   process_optargs "$@" || exit 1

 # Validate parsed options and arguments
   if is_in '-h' "${!OPTIONS[@]}" || is_in '--help' "${!OPTIONS[@]}"
   then display_help
   fi

   if   is_in '-r'            "${!OPTIONS[@]}"; then REPS="${OPTIONS[-r]}"
   elif is_in '--repetitions' "${!OPTIONS[@]}"; then REPS="${OPTIONS[--repetitions]}"
   fi

   if   test "${#ARGS[@]}" -lt 1
   then error "myfunction requires at least one non-option argument"
        exit 1
   fi

   # ...etc
}

# Now you can call your function with options / input arguments, e.g.:
myfunction "hello" --repetitions 5

#Detailed description and usage instructions

process_optargs is a function, intended to be used inside a (caller) function (or script) to parse its inputs into options and positional arguments.

The intended way to call process_optargs is with "$@" as the argument. This effectively passes the caller function's inputs down to the process_optargs function for processing. If an error occurs while processing the arguments, process_optargs returns a non-zero status, which the caller can intercept and handle as appropriate.

This function expects the following variables to be declared and initialized locally in the caller function:

  • An associative array called OPTIONS, initialized as empty
  • An indexed array called ARGS, initialized as empty
  • An indexed array called VALID_FLAG_OPTIONS, initialized as described below
  • An indexed array called VALID_KEYVAL_OPTIONS, initialized as described below
  • A variable called COMMAND_NAME, initialized with the caller's name.

The COMMAND_NAME variable should be set to the name of the caller function/command; its purpose is to be used internally to identify the caller by name, in the case of error messages. If it is left empty, a default ("The Caller") is used.

The VALID_FLAG_OPTIONS and VALID_KEYVAL_OPTIONS arrays are where one should specify all flag or keyval options respectively, which are valid options for the caller function. Individual elements of these arrays can be options in either 'short' (e.g. -o), 'long' (e.g. --optionname), or 'short/long' format (e.g. -o/--optionname). These arrays are inspected internally, and serve as a way to confirm whether any flag and key/value options passed to the caller function were valid options for that caller. If the caller takes no flag options, set the VALID_FLAG_OPTIONS array to the empty array (i.e. VALID_FLAG_OPTIONS=() ); and similarly for the VALID_KEYVAL_OPTIONS array.

Note that, since bash arrays delimit array tokens using spaces, when using the short/long format, there should be no spaces surrounding the / as this will cause the names to be interpreted as separate tokens.
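
The effect of the spacing can be checked directly in bash. The following is a small sketch; the array names Joined and Spaced are hypothetical, and -o/--optionname is just the illustrative option used above.

```shell
# Illustrative only: the array names here are hypothetical (bash required).
Joined=( -o/--optionname )      # one element: short and long form of a single option
Spaced=( -o / --optionname )    # three elements: '-o', '/' and '--optionname'

echo "${#Joined[@]}"   # prints 1
echo "${#Spaced[@]}"   # prints 3
```

Only the first form describes a single option with two names; the second would be read as three unrelated entries.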

Also note that you do not specify valid 'values' at all at this point, only 'keys'. Any validation of the parsed flags and key/value pairs obtained as options via process_optargs should be performed by the caller after the call to process_optargs.

The OPTIONS and ARGS arrays should ideally be initialized as empty. If not, process_optargs will give a warning, but continue anyway. This allows the caller to initialize them with predetermined options / args before the call to process_optargs. process_optargs will then simply append any options and arguments detected to these arrays.

After process_optargs has exited, the OPTIONS associative array in the caller function will have been populated with key/value pairs; in the case of key/value style options, the keys of the associative array will correspond to valid 'option keys' (as specified in the VALID_KEYVAL_OPTIONS array) in either 'short' or 'long' form (with dashes included), depending on the particular form that was provided to the caller.

Similarly, in the case of flag options, the keys of the OPTIONS associative array will correspond to valid 'option flags' in either 'short' or 'long' form (with dashes included), depending on the particular form that was provided to the caller, and the value is always set to 'On'. You can therefore check whether a flag was provided simply by checking whether it appears as a key in the associative array. A flag that was not provided will not exist in the array at all; do NOT expect it to exist with a value of 'Off'. The 'On' value itself carries no further information and can be safely ignored.
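
Because an option may arrive under either its short or its long key, the caller usually has to look under both. A minimal sketch of one way to do this follows. The helper name option_value is hypothetical and not part of this package, and OPTIONS is populated by hand here purely as a stand-in for the result of a process_optargs call.

```shell
# Stand-in for what process_optargs would leave behind after a call such as:
#   myfunction -h --repetitions 5
declare -A OPTIONS=( [-h]='On' [--repetitions]='5' )

# Hypothetical helper (not part of this package): prints the value stored under
# the first of the given keys that exists in OPTIONS; fails if none of them exists.
option_value () {
    local Key
    for Key in "$@"; do
        # ${var+word} expands to 'word' only if the key is set, even if its value is empty
        if [[ -n "${OPTIONS[$Key]+set}" ]]; then
            printf '%s\n' "${OPTIONS[$Key]}"
            return 0
        fi
    done
    return 1
}

REPS="$(option_value -r --repetitions)"    # REPS is now '5'
if option_value -h --help >/dev/null; then echo "help requested"; fi
if ! option_value -v >/dev/null; then echo "-v was not given"; fi
```

Testing for the key's existence, rather than for a non-empty value, keeps an option that was explicitly given an empty value distinguishable from one that was never given.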

If an option in either short or long form appears in both the VALID_FLAG_OPTIONS and VALID_KEYVAL_OPTIONS arrays, process_optargs will exit with an error.

Similar to OPTIONS, after process_optargs has exited, the ARGS array in the caller function will have been populated with positional arguments, i.e. the list of "non-option" inputs passed to the function.

Note that, unlike the convention in some other argument-parsing tools, the 0-index element of the ARGS array denotes the first proper argument passed to the function, and NOT the name of the command (or function) used to call process_optargs. If you do want that behaviour for whatever reason, the name is meant to be captured in your manually specified COMMAND_NAME variable before the call to process_optargs, so you can easily prepend it as the first element (i.e. at the 0-index position) of the ARGS array by overwriting the freshly populated ARGS array as follows:

ARGS=( "$COMMAND_NAME" "${ARGS[@]}" )

#Calling your custom function with options and input arguments

At the point of use, when passing inputs to your caller function, you can combine short flags together (e.g. -abc instead of -a -b -c), use short keyval options with or without a space (e.g. -d1 or -d 1 ), and long keyval options using either a space or an equals sign without spaces (i.e. both --name George and --name=George are fine).

Notes:

  • As is typical of many unix tools, the special value -- causes any subsequent inputs to be interpreted explicitly as arguments (i.e. even if they start with dashes and are valid option names).

  • A single dash (i.e. -) by itself is not special in any way, and is treated as a normal argument (this is desired behaviour; many unix programs which expect a filename as an argument, traditionally accept - as a special "filename", denoting that the input is to be read from the stdin instead.)

  • If a keyval option is provided in both short and long form at the same time, process_optargs will exit with an error, to prevent ambiguity.

  • If a flag-based option is provided in both short and long forms, while in principle there is no potential for ambiguity (since they would both simply be set to 'On'), in practice process_optargs will also treat this as an error, to prevent situations where you'd end up with the same option appearing twice in the associative array (and thus requiring validation a second time, which could waste computational time, or even cause unintended bugs).

  • Contrary to the above, if a keyval or flag type option is provided two or more times, but in the same form (i.e. all short, or all long), then this is a valid invocation; the value used for that option is the one given last. This is intentional behaviour, since occasionally it is necessary to override previous options in this manner (e.g. in the case of aliases).

  • Since you cannot have spaces surrounding the equals sign when assigning a value to a key/value style option, specifying --optionname= as an input assigns the empty string as a valid value to that option. If you want to pass the empty string as a value using a key/value style option without an equals sign, then you have to do so explicitly, e.g. --optionname "".

  • Similarly, with short-form options, an empty string must be passed as -i "" explicitly; passing -i"" is invalid and won't work (since -i"" will get simplified to -i by the shell)
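
The last point can be verified in the shell itself, independently of process_optargs. The sketch below uses set -- to simulate a command line; the variable names are hypothetical, and -i is just the illustrative option used above.

```shell
# Simulate the command line:  somecommand -i""
set -- -i""
JoinedCount=$#        # 1: the empty quotes disappear during quote removal
JoinedFirst=$1        # '-i'

# Simulate the command line:  somecommand -i ""
set -- -i ""
SeparateCount=$#      # 2: the empty string survives as its own argument

printf '%s %s %s\n' "$JoinedCount" "$JoinedFirst" "$SeparateCount"   # prints: 1 -i 2
```

In the first case the called function never sees the empty string at all, so no parser could recover it.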

#Additional (useful) functions defined in this package

This section documents some additional functions defined in this package. Some of these (though not all) are used internally by process_optargs and, as long as they reside in the same directory as process_optargs, are sourced automatically by it. They are also useful in their own right, for example when validating inputs in your own script, so they are described briefly below. If you wish to use them outside the context of process_optargs, you will need to source them separately before use.

#is_in

Given N input arguments, this function checks if the 1st input reoccurs in the remaining N-1 arguments.

The intended way to call this function is with an 'exploded' array variable as the second argument, effectively checking if the first input is a member of the array, e.g.:

Elements=( 1 3 5 7 9 )
is_in 5 "${Elements[@]}"   # succeeds
is_in 6 "${Elements[@]}"   # fails
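
For readers curious how such a membership test can be written, here is a minimal sketch. It is not the package's actual implementation; the name is_in_sketch is hypothetical, chosen so that it does not clash with the real is_in.

```shell
# Hypothetical re-implementation for illustration; not the package's own code.
is_in_sketch () {
    local Needle="$1"; shift        # the first input is the value to look for
    local Candidate
    for Candidate in "$@"; do       # the remaining inputs are the candidates
        # Quoting the right-hand side forces a literal (non-pattern) comparison
        if [[ "$Candidate" == "$Needle" ]]; then return 0; fi
    done
    return 1
}

Elements=( 1 3 5 7 9 )
if is_in_sketch 5 "${Elements[@]}"; then echo "5 is present"; fi
if ! is_in_sketch 6 "${Elements[@]}"; then echo "6 is absent"; fi
```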

#find_first

Checks if the first input reoccurs in the remaining list of input arguments, and if so echoes the index on which it first appears (in a zero-indexed manner). If a match is made the function returns with a status of 0, otherwise it returns with a status of 1.

The intended way to call this function is with an 'exploded' array variable as the second argument, effectively returning the array index at which the input element first occurs.

Examples:

Elements=( 1 3 5 7 9 )
find_first 1 "${Elements[@]}"   # echoes '0'
find_first 5 "${Elements[@]}"   # echoes '2'
find_first 6 "${Elements[@]}"   # echoes nothing, returns with non-zero status
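
The described behaviour can be sketched as follows. Again, this is an illustration under assumptions rather than the package's own code, and the name find_first_sketch is hypothetical.

```shell
# Hypothetical re-implementation for illustration; not the package's own code.
find_first_sketch () {
    local Needle="$1"; shift
    local Index=0 Candidate
    for Candidate in "$@"; do
        if [[ "$Candidate" == "$Needle" ]]; then
            echo "$Index"           # zero-based position of the first match
            return 0
        fi
        Index=$(( Index + 1 ))
    done
    return 1                        # no match: echo nothing, fail
}

Elements=( 1 3 5 7 9 )
if Position="$(find_first_sketch 5 "${Elements[@]}")"; then
    echo "found at index $Position"     # prints: found at index 2
fi
```

Capturing the output inside the if condition lets the caller use both the echoed index and the return status in one step.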

#init_ansicolor_vars

Checks for the presence of the NO_COLOR variable, as per https://no-color.org/. If it is empty, then colours are initialised to respective ANSI_* names (e.g. ANSI_BLUE is initialised to the ansi sequence corresponding to blue); if non-empty, the same variables are all initialised to empty strings. This allows these variables to be used for colorisation in scripts, with meaningful names. The error and warning functions require this as a dependency for the coloring functionality.
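
The NO_COLOR check described above might look roughly like this. This is a sketch only: the function name init_colors_sketch is hypothetical, ANSI_RED and ANSI_RESET are assumed names following the ANSI_* pattern, and the real function may define more variables or differ in detail.

```shell
# Hypothetical sketch; not the package's own code.
init_colors_sketch () {
    if [[ -z "${NO_COLOR:-}" ]]; then
        # NO_COLOR is empty (or unset): use real ANSI escape sequences
        ANSI_RED=$'\e[31m'
        ANSI_BLUE=$'\e[34m'
        ANSI_RESET=$'\e[0m'
    else
        # NO_COLOR is non-empty: same names, but all empty strings
        ANSI_RED='' ANSI_BLUE='' ANSI_RESET=''
    fi
}

init_colors_sketch
printf '%s\n' "${ANSI_BLUE}Blue unless NO_COLOR is set${ANSI_RESET}"
```

Because the variables always exist, the calling code can embed them in its messages unconditionally, whether or not colour is enabled.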

#error

A useful, 'libstderred.so' compatible function for echoing error messages to the stderr stream. Errors are coloured red by default, unless the NO_COLOR variable is set. E.g.:

error "This will be printed to the stderr stream in red"

#warning

Same as error, but with blue colorisation instead of red. E.g.:

warning "This will be printed to the stderr stream in blue"
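
As a rough illustration of the idea behind both functions, the sketch below writes a coloured message to stderr and drops the colour when NO_COLOR is non-empty. All function names here are hypothetical and the escape sequences are hard-coded to keep the block self-contained, whereas the real functions rely on init_ansicolor_vars.

```shell
# Hypothetical sketches; not the package's own code.
message_sketch () {     # usage: message_sketch <colour-sequence> <text>...
    local Colour="$1"; shift
    if [[ -n "${NO_COLOR:-}" ]]; then
        printf '%s\n' "$*" >&2                              # plain text to stderr
    else
        printf '%s%s%s\n' "$Colour" "$*" $'\e[0m' >&2       # coloured text to stderr
    fi
}
error_sketch ()   { message_sketch $'\e[31m' "$@"; }       # red
warning_sketch () { message_sketch $'\e[34m' "$@"; }       # blue

error_sketch "This goes to stderr, so it does not pollute captured stdout"
```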

AUTHOR: Tasos Papastylianou (https://tpapastylianou.com)