Table of Contents |
---|
...
Introduction
...
...
...
...
...
...
Object-Oriented Programming (OOP) Coding Standards
...
...
OOP Abstract Base Class (ABC) Coding Standards
...
Introduction
The Sibros Python coding standards serve as a general programming guideline covering style, conventions, and design. The guidelines extend beyond Python's PEP 8 style guide and, just like PEP 8, serve only as recommendations.
Why Python?
Python is an excellent programming language that has multiple advantages over other languages.
Easy to learn
Python was designed to read closer to English by simplifying the overall syntax; it generally takes fewer lines of code to achieve the same functionality than in languages like C or C++.
Great for prototyping
Because of Python's simplicity, programs can be written quickly for a wide range of applications, from internal/external standalone tools to simple automation scripts.
Object-oriented
Python is an object-oriented language, so developers can take advantage of object-oriented design to create complex yet elegant designs.
General Philosophy
Readability vs. performance
Python should be written with readability as the utmost priority; optimization is secondary. Optimization should only be done once a performance problem has been proven to exist. See Premature Optimization for more details.
For example, suppose algorithm A has an execution time of 50 ms, and algorithm B uses some obscure logic to improve the execution time to 1 ms. If algorithm B's obscure logic compromises readability, then algorithm B is not justifiable, because a 50x gain that saves only tens of milliseconds is negligible relative to human perception.
The same argument can be used for memory optimization.
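One way to decide whether an optimization is warranted is to measure first. The sketch below uses the standard-library `timeit` module; the two functions are hypothetical stand-ins for a readable algorithm A and a less obvious algorithm B, and are not taken from any Sibros code.

```python
import timeit


def sum_of_squares_readable(count: int) -> int:
    """Hypothetical algorithm A: straightforward and easy to read."""
    total = 0
    for value in range(count):
        total += value * value
    return total


def sum_of_squares_closed_form(count: int) -> int:
    """Hypothetical algorithm B: faster, but the formula needs explaining."""
    return (count - 1) * count * (2 * count - 1) // 6


def measure_seconds(function, argument: int, repetitions: int = 100) -> float:
    """Return the total time taken to call function(argument) repeatedly."""
    return timeit.timeit(lambda: function(argument), number=repetitions)


if __name__ == "__main__":
    for function in (sum_of_squares_readable, sum_of_squares_closed_form):
        elapsed_seconds = measure_seconds(function, 10_000)
        print("{}: {:.6f} s".format(function.__name__, elapsed_seconds))
```

If the measured difference does not matter to the user of the program, the readable version is the better choice.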
Nested indentations
Try to keep blocks of code to a maximum of three (3) levels of indentation if possible (indentation from functions and classes does not count). If an algorithm requires excessive indentation, consider breaking it up into sequential chunks. For example, instead of having an algorithm that does everything in a single for loop, consider dividing the algorithm into smaller, sequential steps: step 1: do this, step 2: do that, finally step 3: combine results.
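The step-by-step idea can be sketched as follows. The speed-log scenario and all function names are hypothetical and serve only to illustrate the structure.

```python
from typing import List


def parse_speeds(lines: List[str]) -> List[float]:
    """Step 1: convert non-empty text lines into speed values."""
    return [float(line) for line in lines if line.strip()]


def filter_valid_speeds(speeds: List[float]) -> List[float]:
    """Step 2: drop values that cannot be real speeds."""
    return [speed for speed in speeds if speed >= 0.0]


def compute_average_speed(speeds: List[float]) -> float:
    """Step 3: combine the results into a single value."""
    # Precondition check: avoid dividing by zero for an empty list
    if not speeds:
        return 0.0
    return sum(speeds) / len(speeds)


def summarize_speed_log(lines: List[str]) -> float:
    """Run the three steps in order; no step needs deep nesting."""
    speeds = parse_speeds(lines)
    valid_speeds = filter_valid_speeds(speeds)
    return compute_average_speed(valid_speeds)
```

Each step can be read, tested, and changed on its own, which is harder when parsing, filtering, and averaging share one nested loop.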
Modularity
Modularity is a necessary design principle for writing complex yet maintainable programs. For example, if a file contains 500+ lines of code, that is a good sign the file needs to be divided into modules. The same can be said for functions/methods, control flow (e.g. if/elif/else, for loops, etc.), and classes.
An extreme example of unmaintainable code is when a function has many nested if/elif/else blocks of code where each condition contains a significant amount of code (e.g. 100+ lines). At this point, the code can be compared to abstract art; readers will find it to be very difficult to identify the logical boundaries among the many lines of code. This scenario typically happens when modularity was not incorporated into the design and the program has to handle frequent requirement changes or support multiple variations of something.
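One possible way to keep such branching maintainable is to move each branch into its own function and select it through a lookup table. This is a minimal sketch with hypothetical command names and handlers, not a prescribed Sibros pattern.

```python
from typing import Callable, Dict


def handle_upload(filename: str) -> str:
    """Hypothetical handler; a real one would hold the former 'if' branch body."""
    return "uploading {}".format(filename)


def handle_download(filename: str) -> str:
    """Hypothetical handler; a real one would hold the former 'elif' branch body."""
    return "downloading {}".format(filename)


# Each supported variation maps to one small, separately testable function
COMMAND_HANDLERS: Dict[str, Callable[[str], str]] = {
    "upload": handle_upload,
    "download": handle_download,
}


def run_command(command_name: str, filename: str) -> str:
    handler = COMMAND_HANDLERS.get(command_name)
    # Precondition check: reject variations that are not supported
    if handler is None:
        raise ValueError("Unsupported command: {}".format(command_name))
    return handler(filename)
```

Supporting a new variation then means adding one function and one table entry instead of growing an existing block.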
Unit testing
Try to unit test all code. Any untested code might have bugs regardless of how trivial the logic is. The last thing you want is to have customers discover bugs in your program. Overall, unit testing is an essential discipline that has multiple advantages.
Unit testing encourages modular design
Unit testing encourages modular design by pushing developers to divide large programs into small building blocks, which ultimately makes the overall program more testable. It is preferable to test small, focused components (functions, classes, modules, etc.) rather than one large monolithic component full of complexities.
Unit testing encourages developers to design testable APIs
In order to test an API, the API has to be testable. In other words, an API can be designed so poorly that it is not easily testable.
Example
Imagine that you are writing a unit test for a progress bar class that is responsible for displaying progress percentages to a GUI widget.
Code Block |
---|
class ProgressBar:
    def display_progress(self):
        """
        Bad: This method is not easily testable
        """
        # Displays progress percentage to a GUI widget
        # Progress depends on an external source
        progress_percent = self._external_source.progress
        self._view.set_value(progress_percent) |
The problem is that the class is not easily testable. How do you validate the behavior of the class ProgressBar for percentages in the range [0, 100]? How do you simulate the progress percent? What if there is a corner-case bug that happens when the percentage is exactly 100 percent?
To make the class more testable, the user should be able to easily provide inputs (to simulate cases) and read the outputs (for validating the behavior for each case). Mocks and stubs can also help test behaviors that cannot be easily simulated.
Code Block |
---|
class ProgressBar:
    def display_progress(self, progress_percent: int) -> bool:
        """
        Good: This method is easier to test
        :param progress_percent: Progress percent to display; valid values: [0, 100]
        :return: True if successful
        """
        # Displays given progress percentage to a GUI widget
        success = self._view.set_value(progress_percent)
        return success


from dataclasses import dataclass


def test_display_progress_succeeds_within_0_to_100():
    progress_bar = ProgressBar()

    @dataclass
    class TestData:
        progress_percent: int
        expected_result: bool

    test_data = [
        TestData(-1, False),
        TestData(0, True),
        TestData(100, True),
        TestData(101, False),
    ]

    for data in test_data:
        assert data.expected_result == progress_bar.display_progress(data.progress_percent) |
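The test above constructs ProgressBar() without showing where self._view comes from. One hedged way to fill that gap is to inject the view through the constructor and replace it with a mock in the test. The constructor signature below is an assumption made for illustration; only `unittest.mock.Mock` from the standard library is relied on.

```python
from unittest.mock import Mock


class ProgressBar:
    def __init__(self, view):
        # Assumption: the view is injected so that tests can substitute it
        self._view = view

    def display_progress(self, progress_percent: int) -> bool:
        # Forwards the given percentage to the view and reports its result
        return self._view.set_value(progress_percent)


def test_display_progress_forwards_percent_to_view():
    # The mock stands in for the GUI widget, so no real GUI is needed
    view_mock = Mock()
    view_mock.set_value.return_value = True
    progress_bar = ProgressBar(view_mock)

    success = progress_bar.display_progress(100)

    assert success is True
    view_mock.set_value.assert_called_once_with(100)
```

With the view replaced by a mock, the corner case at exactly 100 percent can be exercised without any external progress source.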
Unit testing is automation friendly
By definition, unit tests exercise code modules in isolation. Because mocks and stubs are simple to create, unit tests can avoid hardware dependencies and manual steps. Properly designed unit tests can be executed by a CI/CD system (e.g. Jenkins) on every code change pushed to a source code management server (e.g. GitHub, GitLab, Bitbucket). With this strategy, regression tests run automatically on every code change and catch bugs early.
References
More details on Sibros's philosophy of unit testing can be found in the article Unit Testing for C.
PEP 8 Highlights
In general, developers should follow the PEP 8 style guide. The style guide is not perfect, however, and developers may prefer to slightly deviate from the style guide.
The sections below are intended to highlight and correct common PEP 8 deviations.
Import style
Import statements should be grouped so that standard, third-party, and local modules are easy to tell apart. Import statements should also be alphabetically ordered within each group. These details are further elaborated in PEP 8 import.
Layout
Code Block |
---|
# Standard modules

# Third-party modules

# Local modules |
Example
Code Block |
---|
from argparse import ArgumentParser
import os
import re
from subprocess import Popen
import sys

import requests
import yaml

from httptransport import HTTPTransport |
If there are no third-party modules to import, then omit that group.
Code Block |
---|
import os

from httptransport import HTTPTransport |
Naming convention
Developers should follow the naming convention described in PEP 8 naming convention.
The sections below summarize the details described in the PEP 8 article.
Variables
Variable names should be lower_snake_case and global constants should be UPPER_SNAKE_CASE.
Example
Code Block |
---|
MY_GLOBAL_CONSTANT = 10

for index in range(MY_GLOBAL_CONSTANT):
    message = "Index: {}".format(index)
    print(message) |
Class
Class and type names should be UpperCamelCase. If there is an acronym in the name, then it should be all UPPERCASE. If the UPPERCASE acronym compromises readability, then it is fine to make the acronym UpperCamelCase instead.
Example
Code Block |
---|
class MyCustomClass:
    pass


class HTTPClient:
    pass


class UDSServer:
    pass


class UdsServer:  # <== Maybe more readable to others, but prefer `UDS` over `Uds` if possible
    pass |
Functions / methods
Function and method names should be lower_snake_case and should start with a verb to describe the action performed.
Example
Code Block |
---|
class HTTPClient:
    def upload(self, url: str, filepath: str):
        raise NotImplementedError

    def download(self, url: str, filepath: str):
        raise NotImplementedError

    def _resume_download(self, url: str, offset: int) -> int:
        raise NotImplementedError


def get_current_timestamp() -> float:
    raise NotImplementedError


def compute_next_offset(current_offset: int) -> int:
    raise NotImplementedError |
Packages / modules
Package and module names should be lower_snake_case. Use underscores (_) to separate all words.
Note that our recommendation deviates from PEP 8, which discourages underscores in package names and allows them in module names only when they improve readability. We feel that it's better to be consistent and use underscores unconditionally.
Example
Code Block |
---|
http_package
constants.py
custom_json_parser.py
my_module.py
uds_server.py

import constants
from custom_json_parser import CustomJSONParser
from http_package import HTTPClient
import my_module
from uds_server import UDSServer |
General Coding Standards
The sections above attempt to highlight common mistakes that deviate from the PEP 8 style guide. This section describes guidelines derived from previous experience, continuous improvement, and many iterations of trial and error.
Variable naming and duck typing pitfalls
A common mistake in Python is naming variables ineffectively. The biggest disadvantage of Duck Typing is that it is easy to lose context if data types are not clear.
The problem
Imagine that you are a new developer working on an existing project, and you are tasked to implement the new function display_current_position given the argument coordinate. The problem is that it is not clear how to use the given argument.
Code Block |
---|
def display_current_position(coordinate):
    raise NotImplementedError  # <== Implement this function given a coordinate |
You may have many questions, and unfortunately your best approach to answering them may be reverse engineering. Some questions you may have are:
What is the data type of coordinate? Is it an integer, list, string, or an object?
If it is an object, what type of class is it? What public methods and attributes are accessible?
And to your surprise, it turns out the original developer decided to represent coordinate as a tuple of floats (<float>, <float>). At this point, you may assume the elements represent (latitude, longitude). But of course, your assumption may be wrong.
A solution
One way to avoid the duck typing pitfalls is to follow these naming conventions:
Names should closely resemble the data type (use suffixes if it helps)
Names should be plural to denote an iterable (preferably a list)
Use Type Hinting (Python 3.5+ only) to document parameter and return value types
Example
Code Block |
---|
from coordinate import Coordinate

# Good: Name is derived from the type Coordinate
coordinate = Coordinate()

# Good: Name is suffixed with the type and overall has more context (represents coordinate of a vehicle)
vehicle_coordinate = Coordinate()

# Bad: Name deviates from the type Coordinate; this will confuse other readers
location = Coordinate()

# Good: Name denotes that the variable is a pair (tuple) and is explicit about the data structure
latitude_longitude_pair = (coordinate.latitude, coordinate.longitude)

# Debatable: Sometimes names may be too long or too detailed. It may already be implied that latitude and longitude are best represented as floats
latitude_longitude_float_pair = (coordinate.latitude, coordinate.longitude)

coordinate = Coordinate()

# Good: Plural name clearly implies a list of coordinate objects
coordinates = [
    coordinate,
]

# Somewhat good: Although the suffix "dict" makes it clear that this is a dictionary, readers also need context on the dictionary's schema
coordinate_dict = {
    "current": coordinate
}

# Good: Suffix makes it clear that this is a set (unique set of coordinates)
coordinate_set = {
    coordinate
}

# Good: Suffix makes it clear that this is a string representing a coordinate
coordinate_str = "0.0, 0.0" |
Sometimes it is fine to assume the name is implicitly clear if there is enough context surrounding the names (this is mostly applicable for well-known conventions or is typically the case when the readers have domain knowledge).
Code Block |
---|
from memoryblock import MemoryBlock
# Memory block represents a simple block of memory starting at some address of some size
memory_block = MemoryBlock()
memory_block.name # Assume this attribute is a string representing the name of the memory block
memory_block.address # Assume this attribute is an integer representing the start address
memory_block.size # Assume this attribute is an integer representing the size in bytes |
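The third convention, type hinting, can be sketched as follows for the display_current_position function from the problem statement. The Coordinate dataclass here is a hypothetical stand-in, since the project's real Coordinate class is not shown, and format_coordinate is a helper introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Coordinate:
    """Hypothetical type; the project's actual Coordinate class may differ."""
    latitude: float
    longitude: float


def format_coordinate(coordinate: Coordinate) -> str:
    """The hints state that a Coordinate goes in and a string comes out."""
    return "{:.4f}, {:.4f}".format(coordinate.latitude, coordinate.longitude)


def display_current_position(coordinate: Coordinate) -> None:
    """The parameter type is now visible without reverse engineering."""
    print("Current position: {}".format(format_coordinate(coordinate)))
```

A reader of this signature no longer has to guess whether coordinate is a tuple, a string, or an object, or which element holds the latitude.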
Filesystem
Python developers often deal with the filesystem in some form; applications that rely on it include general automation, file/directory manipulation, and command-line tools.
Fundamental concepts
It is important to understand the fundamentals of the filesystem such as:
Here are some guidelines:
Programs should assume the current directory can be anywhere (i.e. do not assume the current directory is the directory containing the entry point file)
Programs should generally work whether they are given absolute paths or relative paths (it depends on how the paths are provided)
Programs should be OS agnostic and work with both POSIX paths and Windows paths
Programs should not assume the filesystem layout of their environment (e.g. do not assume /tmp exists in the environment; programs should rely on environment variables instead)
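Some of the guidelines above can be sketched with the standard library. The function names are hypothetical; `tempfile.gettempdir()` consults the TMPDIR, TEMP, and TMP environment variables before falling back to platform defaults, and `pathlib.Path` handles both POSIX and Windows path syntax.

```python
from pathlib import Path
import tempfile


def resolve_nodepath(nodepath: Path) -> Path:
    """Accept absolute or relative paths; relative ones resolve against the current directory."""
    return nodepath.resolve()


def get_scratch_dirpath() -> Path:
    """Ask the platform for its temporary directory instead of hard-coding /tmp."""
    return Path(tempfile.gettempdir())


def build_scratch_filepath(filename: str) -> Path:
    """Join path components with Path so the separator matches the host OS."""
    return get_scratch_dirpath().joinpath(filename)
```

Callers can pass either kind of path to resolve_nodepath, and nothing in the sketch depends on a particular directory existing on the host.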
Current Directory
From a program’s perspective, the current directory can be anywhere in the filesystem; it is up to the caller of the program.
Code Block |
---|
./program.py # Current directory is the program's parent directory
./directory/program.py # Current directory is one level above the program's parent directory
/home/root/directory/program.py # Current directory could be anywhere in the filesystem |
It is good practice to ensure your program is agnostic to the location of the current directory. Some good ways to make your program agnostic of the current directory are to:
Derive the parent directory of the program file, then assemble paths relative to the location of the program file itself
Let the caller determine the location of a directory/file by taking path(s) as command-line option(s). This avoids having your program make assumptions about the host’s filesystem layout
Code Block |
---|
"""
Example, derive the parent directory of this file, then assemble paths relative to the parent directory
This approach is great if this program is packaged with other files that live alongside it (e.g. configuration files)
"""
import os
from pathlib import Path

# Ways to get the parent directory path of this program file
self_dirpath = os.path.dirname(__file__)  # Using traditional os to get a string object
self_dirpath = Path(__file__).parent  # Using modern pathlib to get a Path object

# Ways to assemble a path relative to the parent directory of this file
target_filepath = os.path.join(self_dirpath, "directory", "test.txt")  # Assume self_dirpath is a string object
target_filepath = self_dirpath.joinpath("directory", "test.txt")  # Assume self_dirpath is a Path object |
Code Block |
---|
"""
Example, let the caller of this program provide the path to use
This example uses pathlib Path objects instead of os string objects
"""
from argparse import ArgumentParser
from pathlib import Path
import sys


def get_args():
    argument_parser = ArgumentParser()
    # required=True: without it, a missing option would leave target_dirpath as None
    argument_parser.add_argument("-d", "--target-dirpath", type=Path, required=True, help="A directory path to reference")
    return argument_parser.parse_args()


def main():
    args = get_args()
    target_filepath = args.target_dirpath.joinpath("directory", "test.txt")
    print(target_filepath)
    return 0


if __name__ == "__main__":
    sys.exit(main()) |
Changing the Current Directory
Your program (whether it’s Python or a shell script) should avoid changing the caller’s current directory if possible. In Python, the current directory is shared by the whole process, so changing it can affect code elsewhere in the call stack that relies on relative paths.
If the current directory needs to be changed for any reason, ensure it gets reverted by the end of the code block that needs the current directory to be at the specific location.
Code Block |
---|
import os

prev_current_directory = os.getcwd()
try:
    os.chdir("/tmp")
    # Do stuff here
finally:
    os.chdir(prev_current_directory) |
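The same save-and-restore pattern can be wrapped in a reusable context manager so that the revert cannot be forgotten. This is a minimal sketch; the name change_current_directory is hypothetical. Python 3.11 and later also provide `contextlib.chdir` for the same purpose.

```python
from contextlib import contextmanager
import os
from pathlib import Path
from typing import Iterator


@contextmanager
def change_current_directory(target_dirpath: Path) -> Iterator[None]:
    """Temporarily change the current directory, then always restore it."""
    prev_current_directory = os.getcwd()
    os.chdir(target_dirpath)
    try:
        yield
    finally:
        # Runs even if the body of the with block raises an exception
        os.chdir(prev_current_directory)
```

Usage is `with change_current_directory(some_dirpath): ...`, where some_dirpath is whatever directory the block needs; the previous current directory is restored when the block exits.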
Naming convention
Similar to the general naming convention in the earlier section, filesystem paths should have their own naming convention. The name of the variable should indicate the kind of filesystem node (file vs. directory) and should be explicit about the representation (name vs. path vs. name with no extension).
...
Code Block |
---|
from datetime import datetime
import os
from pathlib import Path

# Good: It is clear that this variable represents a directory containing N log files
log_dirpath = Path("workspace/logs")

# Bad: "logs" can easily be misinterpreted as a list of log objects
logs = Path("workspace/logs")

timestamp_str = datetime.now().strftime("%m-%d-%Y_%H-%M-%S")

# Good: It is clear that this variable represents a file path relative to the current directory
log_filepath = Path("log_{}.txt".format(timestamp_str))

# Good: It is clear that this variable represents a file name and does not represent an addressable path by itself
log_filename = "log_{}.txt".format(timestamp_str)


# Good: The suffix nodepath implicitly denotes that this function takes file or directory paths
def copy(src_nodepath, dest_nodepath):
    """
    Copy a source path to the destination path
    If the given paths are directory paths, perform a recursive copy
    """
    raise NotImplementedError


# Bad: The name "file" can easily be misinterpreted as a file object.
# The function "os.listdir" actually returns a list of node names (names of both directories and files)
for file in os.listdir(log_dirpath):
    print(file) |
Other
Apart from precondition checks at the beginning of a function/method, each function/method should only have one return statement and that should be at the end of the function. This is to prevent unintentional side-effects where a developer adds some code thinking it will be run, but the function exits out before it gets to that piece of code.
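A minimal sketch of this rule, using a hypothetical search function: the only early return is the precondition check, and every other path reaches the single return at the end.

```python
from typing import List, Optional


def find_first_negative_index(values: List[int]) -> Optional[int]:
    # Precondition check: an early return is acceptable here
    if not values:
        return None

    found_index = None
    for index, value in enumerate(values):
        if value < 0:
            found_index = index
            break

    # Single exit point: anything added just before this line runs on every
    # path that passed the precondition check
    return found_index
```

Had the loop returned directly on the first match, code appended after the loop (logging, cleanup) would be skipped whenever a negative value is found.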
...
File layout
Defining a convention for file layouts is important for consistency. All Python files should have a defined purpose, and Python files should not be created simply for the sake of dividing a program into different modules.
For example, if a Python file contains a wide assortment of unrelated functions (excluding a generalized utility module), then that should be an indication that the file is responsible for too many different things.
Entry point
The purpose of an entry point file is to act as the starting point of a program; in other words, this is the file that the user invokes to run a program. By convention, the entry point file should contain the function main and should have a consistent layout.
...