Write great command line utilities with Python

Brett Weir Feb 13, 2023 18 min read

Have you ever written a Python script to solve a problem or scratch some itch, only to think to yourself, "I wish there was a tool that just did this"?

Well, I got news for you: you just wrote one. No, but seriously. Turning almost any old Python script into a clean, reusable tool is a lot easier than you think.

"My Python script?" you say, "but it won't work on anyone else's computer!" Wrong again! While it's true that shipping a complex application to an end user is an exercise, a simple Python script using only the standard library is about the easiest thing in the world to distribute.

In today's post, we'll write a command line utility from start to finish, and go from making it work, to making it usable, to making it useful. For the sake of this exercise, I define a "command line utility" as a small, self-contained, text-driven tool used from the command line, such as ls and grep and all the other little tools that you can string together to do more interesting things. This is to differentiate from the monstrous, multi-layered command line interfaces of tools like docker and git.

It's the development of simple tools that is at the heart of what makes Python a fun and rewarding language to learn.

Philosophy

To understand what makes a good command line tool, it's important to step back and discuss the Unix philosophy. The following summary is lifted from Wikipedia, whose editors reproduced it from Eric S. Raymond's page, which in turn reproduces it from Peter H. Salus's book, A Quarter-Century of Unix (lol):

  • Write programs that do one thing and do it well.
  • Write programs to work together.
  • Write programs to handle text streams, because that is a universal interface.

While these are good ideas all the time, they are foundational to producing command line utilities that last.

Introducing fizzbuzz

To demonstrate building a killer script, we'll use the fizz buzz script as a starting point.

Here's the code:

# fizzbuzz.py
for x in range(1, 16):
    if not x % 3 and not x % 5:
        print("fizzbuzz")
    elif not x % 3:
        print("fizz")
    elif not x % 5:
        print("buzz")
    else:
        print(x)

Let's add this to a file called fizzbuzz.py. Then we can execute it like so:

python3 fizzbuzz.py

So what's wrong with it so far? It seems to work okay as a fizzbuzzer:

$ python3 fizzbuzz.py
1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizzbuzz

The code itself is fine. The problem is that it's not very convenient for consumers to use because of the following:

  • The number of fizzbuzzy lines is hard-coded at 15. What if I need 100 lines of fizzbuzz for my application?

  • The range is also hard-coded. What if I need the fizzbuzz from 75-90?

  • The output phrases are fixed. What if I need output words other than "fizz" and "buzz"? Like "cats" and "bacon"?

  • The token separator is hard-coded to \n. What if I need space- or comma-separated fizzbuzz?

As ridiculous as these examples are, this is a problem I encounter every single day, where application authors code for a single use case and don't imagine their software ever being used in a different way.

If we wanted to use this code in our own Python application, an obvious optimization would be to package it as a function, and expose some of the easier hooks I mentioned:

>>> def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
...     for x in range(start, end):
...         if not x % 3 and not x % 5:
...             print(f"{fizz}{buzz}")
...         elif not x % 3:
...             print(fizz)
...         elif not x % 5:
...             print(buzz)
...         else:
...             print(x)
...
>>>

Then we can do all kinds of fizzbuzzing:

  • Using the defaults:

    >>> fizzbuzz()
    1
    2
    fizz
    ...
    13
    14
    fizzbuzz
    
  • Using custom start/end points:

    >>> fizzbuzz(start=3, end=8)
    fizz
    4
    buzz
    fizz
    7
    
  • Using custom tokens:

    >>> fizzbuzz(end=10, fizz="cats", buzz="bacon")
    1
    2
    cats
    4
    bacon
    cats
    7
    8
    cats
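One hook from the earlier list is still missing: the separator. Here is a minimal sketch of one way to get there, assuming we're willing to have the function yield its tokens instead of printing them. The name fizzbuzz_tokens is made up for this example and isn't used anywhere else in this post:

```python
def fizzbuzz_tokens(start=1, end=16, fizz="fizz", buzz="buzz"):
    """Yield each fizzbuzz token as a string instead of printing it."""
    for x in range(start, end):
        if not x % 3 and not x % 5:
            yield f"{fizz}{buzz}"
        elif not x % 3:
            yield fizz
        elif not x % 5:
            yield buzz
        else:
            yield str(x)

# The caller now picks the separator, here a comma and a space.
line = ", ".join(fizzbuzz_tokens(end=6))
print(line)  # 1, 2, fizz, 4, buzz
```

Because the tokens come back as strings, "\n".join(...) would reproduce the original one-per-line output.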
    

Well, great, but what if I'm not authoring a Python script? There's still no way for me to customize what this script prints outside of a Python context, which means all consumers of this functionality must write in Python or else.

Command line interfaces

Making interfaces that are accessible across programming language boundaries is an area where the command line is extremely useful. But how should such an interface be built?

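There's an IEEE standard (POSIX, which includes conventions for utility argument syntax) that specifies what this kind of interface should look like, and GNU tools popularized the long-option style on top of it. The basics are likely something you've seen before: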

  • The command goes first: fizzbuzz, foobar, etc.

  • Short options are one character and start with -, like -o and -s.

  • Long options can be multiple characters and start with --, like --output and --save.

  • If options support arguments, these arguments are mandatory. For example, -o myfile.json.

  • You can have positional arguments that are evaluated in the order they appear: fizzbuzz output.txt.

  • Be sure to have -h / --help options!

There's more to it, but that's most of it. This would be a lot to try to parse manually, but you don't have to. Python has a built-in library for this called argparse.

Introducing argparse

argparse is one of my favorite Python "batteries". It's a very simple command-line option parsing module available in the standard library, so it's always available when you want to use it.

The simplest argparse program looks like this:

import argparse
parser = argparse.ArgumentParser()
parser.parse_args()

You can define your command line interface at any point before parser.parse_args() is evaluated. Your main program code goes after that call, once the arguments have been parsed.
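To tie this back to the conventions listed earlier, here is a small sketch of how they map onto argparse. The command name foobar, the --output option, and the input positional are placeholders for illustration only. Passing a list to parse_args is a handy way to experiment without touching the real command line:

```python
import argparse

parser = argparse.ArgumentParser(prog="foobar")
# A short and a long spelling for the same option; it takes a mandatory argument.
parser.add_argument("-o", "--output")
# A positional argument, filled in by order of appearance.
parser.add_argument("input")

# Short form: the option's argument follows as a separate word.
short_form = parser.parse_args(["-o", "myfile.json", "data.txt"])
# Long form: the argument may also be attached with "=".
long_form = parser.parse_args(["--output=myfile.json", "data.txt"])

print(short_form.output, short_form.input)  # myfile.json data.txt
```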

With that in mind, let's start adding argparse to our fizzbuzzer:

# fizzbuzz.py
import argparse  # new

def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
    for x in range(start, end):
        if not x % 3 and not x % 5:
            print(f"{fizz}{buzz}")
        elif not x % 3:
            print(fizz)
        elif not x % 5:
            print(buzz)
        else:
            print(x)

parser = argparse.ArgumentParser()  # new

parser.parse_args()  # new
fizzbuzz()

This is runnable like so:

python3 fizzbuzz.py

If you run it now, it'll print the same output as before. But, try adding a --help parameter:

$ python3 fizzbuzz.py --help
usage: fizzbuzz.py [-h]

options:
  -h, --help  show this help message and exit

What just happened? argparse added a default -h/--help option, generated based on the parameters you configured. But we still need to wire up the available options.

# fizzbuzz.py
import argparse

def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
    for x in range(start, end):
        if not x % 3 and not x % 5:
            print(f"{fizz}{buzz}")
        elif not x % 3:
            print(fizz)
        elif not x % 5:
            print(buzz)
        else:
            print(x)

parser = argparse.ArgumentParser()
parser.add_argument("-s", "--start", type=int, default=1)  # new
parser.add_argument("-e", "--end", type=int, default=16)   #
parser.add_argument("-f", "--fizz", default="fizz")        #
parser.add_argument("-b", "--buzz", default="buzz")        #

args = parser.parse_args()  # changed
fizzbuzz(
    start=args.start,
    end=args.end,
    fizz=args.fizz,
    buzz=args.buzz,
)

It still runs as before:

$ python3 fizzbuzz.py
1
2
fizz
4
buzz
fizz
7
8
fizz
buzz
11
fizz
13
14
fizzbuzz

But check this out:

$ python3 fizzbuzz.py --help
usage: fizzbuzz.py [-h] [-s START] [-e END] [-f FIZZ] [-b BUZZ]

options:
  -h, --help            show this help message and exit
  -s START, --start START
  -e END, --end END
  -f FIZZ, --fizz FIZZ
  -b BUZZ, --buzz BUZZ

More importantly, any options available to the function are now available to the command line interface as well:

  • Using custom start/end points:

    $ python3 fizzbuzz.py --start 3 --end 8
    fizz
    4
    buzz
    fizz
    7
    
  • Using custom tokens:

    $ python3 fizzbuzz.py --fizz cats --buzz bacon --end 10
    1
    2
    cats
    4
    bacon
    cats
    7
    8
    cats
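The type=int setting does a little validation for free: a value that can't be converted makes argparse print a usage error and exit. A rough sketch of that behavior, catching the SystemExit so we can look at the exit code from inside Python (the single-option parser here is a cut-down stand-in for the real one):

```python
import argparse

parser = argparse.ArgumentParser(prog="fizzbuzz.py")
parser.add_argument("-s", "--start", type=int, default=1)

# A valid value is converted to an int for us.
good = parser.parse_args(["--start", "3"])

# An invalid value makes argparse report the problem on stderr and exit.
try:
    parser.parse_args(["--start", "three"])
except SystemExit as exc:
    exit_code = exc.code

print(good.start, exit_code)  # 3 2
```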
    

Improving the help output

We can dress up the --help output further by customizing the metavar and help parameters:

# ...
parser.add_argument(
    "-s", "--start", metavar="INT", type=int, default=1, help="starting count"
)
parser.add_argument(
    "-e", "--end", metavar="INT", type=int, default=16, help="ending count"
)
parser.add_argument(
    "-f", "--fizz", metavar="TEXT", default="fizz", help="fizz token"
)
parser.add_argument(
    "-b", "--buzz", metavar="TEXT", default="buzz", help="buzz token"
)
# ...

Which will result in our tool looking really great:

$ python3 fizzbuzz.py --help
usage: fizzbuzz.py [-h] [-s INT] [-e INT] [-f TEXT] [-b TEXT]

options:
  -h, --help            show this help message and exit
  -s INT, --start INT   starting count
  -e INT, --end INT     ending count
  -f TEXT, --fizz TEXT  fizz token
  -b TEXT, --buzz TEXT  buzz token

Now our tool is self-documenting and looks very professional with minimal effort on our part.
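If you want to go one step further, argparse can also show each option's default in the help text. This is an optional extra rather than something the rest of this post relies on: a sketch using the standard library's ArgumentDefaultsHelpFormatter, with a description string invented for the example:

```python
import argparse

parser = argparse.ArgumentParser(
    prog="fizzbuzz.py",
    description="Print fizzbuzz for a range of integers.",  # example wording
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
parser.add_argument(
    "-s", "--start", metavar="INT", type=int, default=1, help="starting count"
)
parser.add_argument(
    "-e", "--end", metavar="INT", type=int, default=16, help="ending count"
)

# format_help() returns the same text that --help would print.
help_text = parser.format_help()
print(help_text)
```

Each option that has help text now ends with its default, such as "starting count (default: 1)".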

Writing a dual-purpose Python module

Our command line tool is looking awesome so far. However, we've broken our ability to use it as a Python module. Suppose we try to import the fizzbuzz function into another module:

# another.py
from fizzbuzz import fizzbuzz
fizzbuzz()

You'll actually get two fizzbuzz outputs, because the one at the bottom of the module is also executed when it is imported. The calling order looks like this:

fizzbuzz( # this one first!
    start=args.start,
    end=args.end,
    fizz=args.fizz,
    buzz=args.buzz,
)
# then this one
fizzbuzz()

This brings us to a great rule of thumb: don't execute any code outside of a function unless you really, really know what you're doing.

We want the CLI to run if the module is executed directly, but we don't want it to run if it's imported as a module. The typical solution to this problem is to wrap the CLI entrypoint in a function (I call it main()) and call it only if the module is executed directly. This looks like:

def main():
    ...  # do stuff

if __name__ == "__main__":
    main()

Applying this to our module, the code becomes:

# fizzbuzz.py
import argparse

def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
    for x in range(start, end):
        if not x % 3 and not x % 5:
            print(f"{fizz}{buzz}")
        elif not x % 3:
            print(fizz)
        elif not x % 5:
            print(buzz)
        else:
            print(x)

def main():  # new
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-s", "--start", metavar="INT", type=int, default=1, help="starting count"
    )
    parser.add_argument(
        "-e", "--end", metavar="INT", type=int, default=16, help="ending count"
    )
    parser.add_argument(
        "-f", "--fizz", metavar="TEXT", default="fizz", help="fizz token"
    )
    parser.add_argument(
        "-b", "--buzz", metavar="TEXT", default="buzz", help="buzz token"
    )

    args = parser.parse_args()
    fizzbuzz(
        start=args.start,
        end=args.end,
        fizz=args.fizz,
        buzz=args.buzz,
    )

if __name__ == "__main__":  # new
    main()                  #

Now your code is useful whether you're calling it directly from Python or as a standalone command line utility.
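A common refinement, shown here as a sketch rather than as part of the tool we're building, is to let main() accept the argument list as a parameter. That makes the CLI itself callable (and testable) from Python. The build_parser helper and the argv parameter are names chosen for this example:

```python
import argparse

def build_parser():
    """Create the parser (trimmed to two options to keep the sketch short)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-s", "--start", type=int, default=1)
    parser.add_argument("-e", "--end", type=int, default=16)
    return parser

def main(argv=None):
    # When argv is None, argparse falls back to the real command line.
    args = build_parser().parse_args(argv)
    return args.start, args.end

# Drive the CLI from Python, exactly as if these words had been typed.
result = main(["--start", "3", "--end", "8"])
print(result)  # (3, 8)
```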

Argument validation and exit codes

Right now, our code doesn't do any bounds checking. For example, it shouldn't be possible to pass negative start and end arguments, and it definitely shouldn't be possible for start to be greater than end. Our function quietly allows this (range() simply produces nothing when start is not less than end, so the tool prints nothing at all), but we wouldn't want such an error to go unnoticed for long.

Failing as early and loudly as possible is called the fail-fast principle, and is a great idea, so let's make our tool fail immediately when given some invalid arguments. With argparse, the easiest way to throw errors is with the parser.error function:

# fizzbuzz.py
import argparse

def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
    for x in range(start, end):
        if not x % 3 and not x % 5:
            print(f"{fizz}{buzz}")
        elif not x % 3:
            print(fizz)
        elif not x % 5:
            print(buzz)
        else:
            print(x)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-s", "--start", metavar="INT", type=int, default=1, help="starting count"
    )
    parser.add_argument(
        "-e", "--end", metavar="INT", type=int, default=16, help="ending count"
    )
    parser.add_argument(
        "-f", "--fizz", metavar="TEXT", default="fizz", help="fizz token"
    )
    parser.add_argument(
        "-b", "--buzz", metavar="TEXT", default="buzz", help="buzz token"
    )

    args = parser.parse_args()

    if args.start < 0 or args.end < 0:                             # new
        parser.error(                                              #
          "--start and --end must be positive integers"            #
        )                                                          #
    if args.end <= args.start:                                     #
        parser.error(                                              #
          "--end must be greater than --start to return a result"  #
        )                                                          #

    fizzbuzz(
        start=args.start,
        end=args.end,
        fizz=args.fizz,
        buzz=args.buzz,
    )

if __name__ == "__main__":
    main()

With the above changes, your utility will return an error if there are any issues with its arguments:

  • If --start or --end are negative numbers:

    $ python3 fizzbuzz.py --start -1
    usage: fizzbuzz.py [-h] [-s INT] [-e INT] [-f TEXT] [-b TEXT]
    fizzbuzz.py: error: --start and --end must be positive integers
    
  • If --end is not greater than --start:

    $ python3 fizzbuzz.py --start 0 --end 0
    usage: fizzbuzz.py [-h] [-s INT] [-e INT] [-f TEXT] [-b TEXT]
    fizzbuzz.py: error: --end must be greater than --start to return a result
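Checks that concern a single option can also live in a custom type function, which argparse calls on the raw string. This is an alternative sketch, not what the code above does. The function name non_negative_int is invented for the example, while argparse.ArgumentTypeError is the standard way to report a bad value from such a function:

```python
import argparse

def non_negative_int(text):
    """Convert text to an int and reject values below zero."""
    value = int(text)  # a ValueError here is also reported by argparse
    if value < 0:
        raise argparse.ArgumentTypeError(f"{text} is not a non-negative integer")
    return value

parser = argparse.ArgumentParser(prog="fizzbuzz.py")
parser.add_argument("-s", "--start", type=non_negative_int, default=1)

accepted = parser.parse_args(["--start", "5"])

# A negative value triggers the same usage-and-exit behavior as parser.error.
try:
    parser.parse_args(["--start", "-1"])
except SystemExit as exc:
    rejected_code = exc.code
```

Cross-option checks, like comparing --start with --end, still need to happen after parse_args, as in the code above.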
    

Either of the above errors will return a non-zero exit code to the shell. You can see the exit code of the previously-run command with echo $?:

$ echo $?
2

Shells and any other commands that run your utility use the exit code to determine whether the command was successful, with 0 indicating success, and anything else indicating failure. If you ever need to set the exit code, you can use sys.exit and pass the desired exit code:

$ python3 -c 'import sys; sys.exit(99)'
$ echo $?
99

Returning non-zero exit codes when your utility fails is one of the most important things you can do to prevent hard-to-track-down issues when using your own tooling as part of larger automation deployments.
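To see the other side of that contract, here is a small sketch of a Python program acting as the caller. It launches the same one-liner as above in a child process and reads the exit code back. sys.executable is used so the child runs under whatever interpreter is running the example:

```python
import subprocess
import sys

# Run a child Python process that exits with status 99.
result = subprocess.run([sys.executable, "-c", "import sys; sys.exit(99)"])

# The caller decides what to do based on the exit code.
if result.returncode != 0:
    print(f"command failed with exit code {result.returncode}")
```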

Handling interruption gracefully

Sometimes commands can take a long time, like this one (don't wait for it to finish!):

python3 fizzbuzz.py --end 100000000

As long as there are long-running commands, there will be times when you need to interrupt them. What you don't want your users to see when they interrupt a command is this:

1529447
fizz
1529449
b^C
Traceback (most recent call last):
  File "/home/brett/Projects/examples/fizzbuzz/fizzbuzz.py", line 47, in <module>
    main()
  File "/home/brett/Projects/examples/fizzbuzz/fizzbuzz.py", line 38, in main
    fizzbuzz(
  File "/home/brett/Projects/examples/fizzbuzz/fizzbuzz.py", line 13, in fizzbuzz
    print(x)
KeyboardInterrupt

This is a mild case. Depending on how deep your call stack goes, this output could be horrific.

Luckily, there's a better way: you can intercept this to prevent your users from seeing behind the curtain.

The easiest way is with a try / except block:

    try:
        # ... the fizzbuzz() call from main() goes here
    except KeyboardInterrupt:
        sys.exit(0)  # requires "import sys" at the top of the file

Now your users will see this:

...
fizz
126799
buzz
^Cbuzz

The advantage of this is the simplicity. However, it will only catch keyboard interrupts that happen within the try / except block, so depending on the program and your users' luck, they may still get a stack trace if they interrupt the program at the wrong time. Oh well, I can live with that.
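Exiting with 0 reports the interrupted run as a success. If you'd rather have callers be able to tell that the run was cut short, one variation is to exit with 130, the status shells conventionally report for a process stopped by Ctrl+C (128 plus the signal number of SIGINT, which is 2). A sketch of that variation, with a hypothetical run() wrapper and a fake task standing in for the real work:

```python
import sys

def run(task):
    """Call task(), turning Ctrl+C into a quiet exit with status 130."""
    try:
        task()
    except KeyboardInterrupt:
        sys.exit(130)  # 128 + SIGINT (2)

def interrupted_task():
    # Stand-in for real work: pretend the user pressed Ctrl+C.
    raise KeyboardInterrupt

# Catch the SystemExit here only so the example can inspect the status.
try:
    run(interrupted_task)
except SystemExit as exc:
    status = exc.code

print(status)  # 130
```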

Make it executable

I don't like having to call python3 fizzbuzz.py all the time, and I could live without the .py extension too. In this section, we'll modify our script so that it is always run by a Python interpreter, even without a file extension.

The first step is to add a shebang, which tells a Linux system what interpreter to use to run it. This should be the first line in the file and formatted as follows:

#!/path/to/interpreter [args...]

The first hurdle to overcome here is that the Python interpreter could be in any number of filesystem locations.

If you followed the steps in my pyenv post, there's a good chance your Python interpreter is in a path similar to the following:

$ which python3
/home/<user>/.pyenv/shims/python3

If you're in a virtual environment, the Python path might look like this:

$ which python3
/home/<user>/path/to/your/project/venv/bin/python3

Even more Python interpreters are likely available in your system path:

$ ls -1 /usr/bin/python*
/usr/bin/python2
/usr/bin/python2.7
/usr/bin/python3
/usr/bin/python3.8

To avoid hard-coding the Python interpreter path, the env command is commonly used: it searches the directories in your PATH for an executable by name and then runs it. The "canonical" Python shebang is as follows:

#!/usr/bin/env python3
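If you're curious which interpreter that lookup would land on, Python can do a similar PATH search itself with shutil.which. This is just a way to peek at the result and not part of the fizzbuzz tool. The answer depends entirely on your machine, so no expected output is shown:

```python
import shutil

# Search the directories in PATH for an executable named python3.
# Returns the path to the first match as a string, or None if there is none.
found = shutil.which("python3")

if found is None:
    print("no python3 found on PATH")
else:
    print(f"python3 resolves to {found}")
```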

Putting it all together, the final version of fizzbuzz:

#!/usr/bin/env python3
import argparse
import sys

def fizzbuzz(start=1, end=16, fizz="fizz", buzz="buzz"):
    for x in range(start, end):
        if not x % 3 and not x % 5:
            print(f"{fizz}{buzz}")
        elif not x % 3:
            print(fizz)
        elif not x % 5:
            print(buzz)
        else:
            print(x)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-s", "--start", metavar="INT", type=int, default=1, help="starting count"
    )
    parser.add_argument(
        "-e", "--end", metavar="INT", type=int, default=16, help="ending count"
    )
    parser.add_argument(
        "-f", "--fizz", metavar="TEXT", default="fizz", help="fizz token"
    )
    parser.add_argument(
        "-b", "--buzz", metavar="TEXT", default="buzz", help="buzz token"
    )

    args = parser.parse_args()

    if args.start < 0 or args.end < 0:
        parser.error("--start and --end must be positive integers")
    if args.end <= args.start:
        parser.error("--end must be greater than --start to return a result")

    try:
        fizzbuzz(
            start=args.start,
            end=args.end,
            fizz=args.fizz,
            buzz=args.buzz,
        )
    except KeyboardInterrupt:
        sys.exit(0)

if __name__ == "__main__":
    main()

With the shebang in place, we can make the script executable, ditch the .py extension, and stop calling python3 ourselves, because the shebang now tells the program loader that this is, in fact, a Python script.

Mark the script as executable with chmod and then rename with mv:

chmod a+x fizzbuzz.py
mv fizzbuzz.py fizzbuzz

Then check it out:

# We can now invoke the tool directly
./fizzbuzz

# Running like this is still possible
python3 fizzbuzz
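For the curious, here is roughly what chmod a+x changes, sketched in Python with os.chmod on a throwaway file. The file contents and the temporary directory are placeholders, and the sketch assumes a Linux-style filesystem with permission bits:

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "fizzbuzz")
    with open(script, "w") as f:
        f.write("#!/usr/bin/env python3\nprint('fizz')\n")

    # Add the execute bit for user, group, and others (the "a+x" part).
    mode = os.stat(script).st_mode
    os.chmod(script, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

    # Read the permission bits back and keep only the execute bits.
    exec_bits = stat.S_IMODE(os.stat(script).st_mode) & 0o111

print(oct(exec_bits))  # 0o111
```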

Install to the path

Now we want to install this utility onto our system so we can use it like any other command:

fizzbuzz

The ideal location for user-installed utilities like this is /usr/local/bin/. This unambiguously distinguishes it from a tool installed by your system package manager. We can use the install tool to set the correct permissions and ownership, so that we'd be able to use the tool even if we hadn't manually run chmod (note the sudo):

sudo install fizzbuzz /usr/local/bin/

Let's execute it:

$ sudo install fizzbuzz /usr/local/bin/
[sudo] password for brett:

Then check the results:

$ ls -l /usr/local/bin/fizzbuzz
-rwxr-xr-x 1 root root 94 Jan 30 21:52 /usr/local/bin/fizzbuzz

The install command automatically set the correct file attributes, so it's ready to use:

$ fizzbuzz --help
usage: fizzbuzz [-h] [-s INT] [-e INT] [-f TEXT] [-b TEXT]

options:
  -h, --help            show this help message and exit
  -s INT, --start INT   starting count
  -e INT, --end INT     ending count
  -f TEXT, --fizz TEXT  fizz token
  -b TEXT, --buzz TEXT  buzz token

Conclusion

Congratulations! You just made a command line utility! You'll be able to run it just like any other command on your system, and if you ever want to share it with anyone else with a Linux system, the only thing they'll need is a recent Python version installed.

As you just witnessed, with argparse in our toolkit, our little fizzbuzz utility suddenly looks very professional with minimal effort. When we show it off in front of our friends (you all talk to your friends about command line tools, right?), they'll scarcely believe it didn't come with our computer.

No, but seriously. If there's any task that you do on a regular basis that you've been thinking about automating, this is the easiest way to start. Writing a command line utility is more accessible than it ever has been and will provide a clean, reliable interface for your automation to hook into.

Happy coding!

