...
Programs should assume the current directory can be anywhere (i.e. do not assume current directory will be in the same directory as the entry point file)
Programs should generally work regardless if given absolute paths or a relative paths (it depends on how the paths are provided)
Programs should be OS agnostic and work regardless if dealing with POSIX paths or Windows paths
Programs should not assume the filesystem layouts of its environment (e.g. do not assume
/tmp
exists in the environment; programs should rely on environment variables instead)
Other
Apart from precondition checks at the beginning of a function/method, each function/method should only have one return statement and that should be at the end of the function. This is to prevent unintentional side-effects where a developer adds some code thinking it will be run, but the function exits out before it gets to that piece of code.
...
Code Block |
---|
from datetime import datetime import os from pathlib import Path # Good: It is clear that this variable represents a directory containing N log files log_dirpath = Path("workspace/logs") # Bad: "logs" can easily be misinterpreted as a list of log objects logs = Path("workspace/logs") timestamp_str = datetime.now().strftime("%m-%d-%Y_%H-%M-%S") # Good: It is clear that this variable represents a file path relative to the current directory log_filepath = Path("log_{}.txt".format(timestamp_str)) # Good: It is clear that this variable represents a file name and does not represent an addressable path by itself log_filename = "log_{}.txt".format(timestamp_str) # Good: The suffix nodepath implicitly denotes that this function takes file or directory paths def copy(src_nodepath, dest_nodepath): """ Copy a source path to the destination path If the given paths are directory paths, perform a recursive copy """ raise NotImplementedError # Bad: The name "file" can easily be misinterpreted as a file object. # The function "os.listdir" actually returns a list of node names (names of both directories and files) for file in os.listdir(log_dirpath): print(file) |
...
Current Directory
Defining a convention for file layouts is important for consistency. All Python files should have a responsibility, and Python files should not be created simply for the sake of dividing a program into different modulesFrom a program’s perspective, the current directory can be anywhere in the filesystem; it is up to the caller of the program.
Code Block | ||
---|---|---|
| ||
./program.py # Current directory is in the same directory as the program file
./directory/program.py # Current directory is at the parect directory of the program file
/home/root/directory/program.py # Current directory could be anywhere in the filesystem |
It is good practice to ensure your program is agnostic to the location of the current directory. Some good ways to make your program agnostic of the current directory is to:
Derive the parent directory of the program file, then assemble paths relative to the location of the program file itself
Let the caller determine the location of a directory/file by taking path(s) as commend line option(s). This avoids having your program make assumptions of the host’s filesystem layout
Code Block | ||
---|---|---|
| ||
"""
Example, derive the parent directory of this file, then assemble paths relative to the parent directory
This approach is great if this program is packaged with other files that live alongside it (e.g. configuration files)
"""
import os
from pathlib import Path
# Ways to get the parent directory path of this program file
self_dirpath = os.path.dirname(__file__) # Using traditional os to get a string object
self_dirpath = Path(__file__).parent # Using modern pathlib to get a Path object
# Ways to assemble a path relative to the parent directory of this file
target_filepath = os.path.join(self_dirpath, "directory", "test.txt") # Assume self_dirpath is a string object
target_filepath = self_dirpath.joinpath("directory", "test.txt") # Assume self_dirpath is a Path object |
Code Block | ||
---|---|---|
| ||
"""
Example, let the caller of this program provide the path to use
This example uses pathlib Path objects instead of os string objects
"""
from argparse import ArgumentParser
from pathlib import Path
import sys
def get_args():
argument_parser = ArgumentParser()
argument_parser.add_argument("-d", "--target-dirpath", type=Path, help="A directory path to reference")
return argument_parser.parse_args()
def main():
args = get_args()
target_filepath = args.target_dirpath.joinpath("directory", "test.txt")
if __name__ == "__main__":
sys.exit(main()) |
Changing the Current Directory
Your program (whether it’s Python or a shell script) should avoid changing the caller’s current directory if possible.
Code Block | ||
---|---|---|
| ||
/home/user$ ./some-blackbox-program # <=== Program changes the current directory during execution
/tmp$ # <=== Program exited; current directory changed (Avoid this behavior if possible) |
If the current directory needs to be changed for any reason, ensure it gets reverted by the end of the program.
Code Block | ||
---|---|---|
| ||
import os
prev_current_directory = os.getcwd()
try:
os.chdir("/tmp")
# Do stuff here
finally:
os.chdir(prev_current_directory) |
...
File layout
Defining a convention for file layouts is important for consistency. All Python files should have a defined purpose, and Python files should not be created simply for the sake of dividing a program into different modules.
For example, if a Python file contains a wide assortment of unrelated functions (excluding a generalized utility module), then that should be an indication that the file is responsible for too many different things.
Entry point
The purpose of an entry point file is to act as the starting point of a program; in other words, this is the file that the user invokes to run a program. By convention, the entry point file should contain the function main
and should have a consistent layout.
...