Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Programs should assume the current directory can be anywhere (i.e. do not assume current directory will be in the same directory as the entry point file)

  • Programs should generally work regardless if given absolute paths or a relative paths (it depends on how the paths are provided)

  • Programs should be OS agnostic and work regardless if dealing with POSIX paths or Windows paths

  • Programs should not assume the filesystem layouts of its environment (e.g. do not assume /tmp exists in the environment; programs should rely on environment variables instead)

Other

Apart from precondition checks at the beginning of a function/method, each function/method should only have one return statement and that should be at the end of the function. This is to prevent unintentional side-effects where a developer adds some code thinking it will be run, but the function exits out before it gets to that piece of code.

...

Code Block
from datetime import datetime
import os
from pathlib import Path

# Good: It is clear that this variable represents a directory containing N log files
log_dirpath = Path("workspace/logs")

# Bad: "logs" can easily be misinterpreted as a list of log objects
logs = Path("workspace/logs")

timestamp_str = datetime.now().strftime("%m-%d-%Y_%H-%M-%S")

# Good: It is clear that this variable represents a file path relative to the current directory
log_filepath = Path("log_{}.txt".format(timestamp_str))

# Good: It is clear that this variable represents a file name and does not represent an addressable path by itself
log_filename = "log_{}.txt".format(timestamp_str)

# Good: The suffix nodepath implicitly denotes that this function takes file or directory paths
def copy(src_nodepath, dest_nodepath):
    """
    Copy a source path to the destination path
    If the given paths are directory paths, perform a recursive copy
    """
    raise NotImplementedError

# Bad: The name "file" can easily be misinterpreted as a file object.
# The function "os.listdir" actually returns a list of node names (names of both directories and files)
for file in os.listdir(log_dirpath):
    print(file)

...

Current Directory

Defining a convention for file layouts is important for consistency. All Python files should have a responsibility, and Python files should not be created simply for the sake of dividing a program into different modulesFrom a program’s perspective, the current directory can be anywhere in the filesystem; it is up to the caller of the program.

Code Block
languagebash
./program.py                     # Current directory is in the same directory as the program file
./directory/program.py           # Current directory is at the parect directory of the program file
/home/root/directory/program.py  # Current directory could be anywhere in the filesystem

It is good practice to ensure your program is agnostic to the location of the current directory. Some good ways to make your program agnostic of the current directory is to:

  1. Derive the parent directory of the program file, then assemble paths relative to the location of the program file itself

  2. Let the caller determine the location of a directory/file by taking path(s) as commend line option(s). This avoids having your program make assumptions of the host’s filesystem layout

Code Block
languagepy
"""
Example, derive the parent directory of this file, then assemble paths relative to the parent directory

This approach is great if this program is packaged with other files that live alongside it (e.g. configuration files)
"""

import os
from pathlib import Path

# Ways to get the parent directory path of this program file
self_dirpath = os.path.dirname(__file__)  # Using traditional os to get a string object
self_dirpath = Path(__file__).parent  # Using modern pathlib to get a Path object

# Ways to assemble a path relative to the parent directory of this file
target_filepath = os.path.join(self_dirpath, "directory", "test.txt")  # Assume self_dirpath is a string object
target_filepath = self_dirpath.joinpath("directory", "test.txt")  # Assume self_dirpath is a Path object
Code Block
languagepy
"""
Example, let the caller of this program provide the path to use

This example uses pathlib Path objects instead of os string objects
"""

from argparse import ArgumentParser
from pathlib import Path
import sys


def get_args():
    argument_parser = ArgumentParser()
    argument_parser.add_argument("-d", "--target-dirpath", type=Path, help="A directory path to reference")
    return argument_parser.parse_args()
    

def main():
    args = get_args()
    target_filepath = args.target_dirpath.joinpath("directory", "test.txt")
    
  
if __name__ == "__main__":
    sys.exit(main())

Changing the Current Directory

Your program (whether it’s Python or a shell script) should avoid changing the caller’s current directory if possible.

Code Block
languagebash
/home/user$ ./some-blackbox-program      # <=== Program changes the current directory during execution
/tmp$                                    # <=== Program exited; current directory changed (Avoid this behavior if possible)

If the current directory needs to be changed for any reason, ensure it gets reverted by the end of the program.

Code Block
languagepy
import os

prev_current_directory = os.getcwd()
try:
    os.chdir("/tmp")
    # Do stuff here
finally:
    os.chdir(prev_current_directory)

...

File layout

Defining a convention for file layouts is important for consistency. All Python files should have a defined purpose, and Python files should not be created simply for the sake of dividing a program into different modules.

For example, if a Python file contains a wide assortment of unrelated functions (excluding a generalized utility module), then that should be an indication that the file is responsible for too many different things.

Entry point

The purpose of an entry point file is to act as the starting point of a program; in other words, this is the file that the user invokes to run a program. By convention, the entry point file should contain the function main and should have a consistent layout.

...