Python Coding Standards


Introduction

The Sibros Python coding standards serves as a general programming guideline for a variety of aspects such as style guide, conventions, and design. The guidelines extend beyond Python's PEP 8 style guide, and just like PEP 8, the guidelines only serve as recommendations.


Why Python?

Python is an excellent programming language that has multiple advantages over other languages.

Easy to learn

Python was designed to be written closer to English by simplifying the overall syntax; it generally takes less lines of code to achieve similar functionality in other languages like C or C++.

Great for prototyping

Because of Python's simplistic nature, it has a quick turnaround time to create programs for a wide range of applications ranging from internal/external standalone tools to simple scripts for automation.

Object-oriented

Python is an object-oriented language, so developers can take advantage of object-oriented design to create complex yet elegant designs.


General Philosophy

Readability vs. performance

Python should be written with readability as the utmost priority; optimization is secondary. Optimization should only be done if the problem being solved is proven to be a problem in the first place. See Premature Optimization for more details.

For example, let's say algorithm A has an execution time of 50 ms, and algorithm B uses some obscure logic to improve the execution time to 1 ms. Assuming algorithm B's obscure logic compromises readability, then algorithm B is not justifiable because a 50x performance gain in the order of tens of milliseconds is negligible relative to human perception.

The same argument can be used for memory optimization.

Nested indentations

Try to ensure blocks of code have a maximum of three (3) levels of indentation if possible (indentations from functions and classes do not count). If an algorithm requires an excessive amount of indentations, then consider breaking up the algorithm into sequential chunks. For example, instead of having an algorithm that does everything in a single for loop, consider dividing the algorithm into sequential, smaller steps: step 1: do this, step 2: do that, finally step 3: combine results.

Modularity

Modularity is a necessary design principle to writing complex yet maintainable programs. For example, if a file contains 500+ lines of code, then that's a good sign the file needs to be divided into modules. The same can be said for functions/methods, control flow (e.g. if/elif/else, for loops, etc.), and classes.

An extreme example of unmaintainable code is when a function has many nested if/elif/else blocks of code where each condition contains a significant amount of code (e.g. 100+ lines). At this point, the code can be compared to abstract art; readers will find it to be very difficult to identify the logical boundaries among the many lines of code. This scenario typically happens when modularity was not incorporated into the design and the program has to handle frequent requirement changes or support multiple variations of something.

Unit testing

Try to unit test all code. Any untested code might have bugs regardless of how trivial the logic is. The last thing you want is to have customers discover bugs in your program. Overall, unit testing is an essential discipline that has multiple advantages.

  1. Unit testing encourages modular design

Unit testing encourages modular design by encouraging developers to divide large programs into small building blocks which ultimately makes the overall program more testable. It's more preferable to test small, focused components (functions, classes, modules, etc.) over a large monolithic component full of complexities.

  1. Unit testing encourages developers to design testable APIs

In order to test an API, the API has to be testable. In other words, an API can be wrongfully designed such that it is not easily testable.

Example

Imagine that you are writing a unit test for a progress bar class that is responsible for displaying progress percentages to a GUI widget.

1 2 3 4 5 6 7 8 9 class ProgressBar: def display_progress(self): """ Bad: This method is not easily testable """ # Displays progress percentage to a GUI widget # Progress depends on an external source progress_percent = self._external_source.progress self._view.set_value(progress_percent)

The problem is that the class is not easily testable. How do you validate the behavior of the class ProgressBar for percentages ranging [0, 100]? How do you simulate the progress percent? What if there is a corner case bug that happens when the percentage is exactly 100 percent?

To make the class more testable, the user should be able to easily provide inputs (to simulate cases) and read the outputs (for validating the behavior for each case). Mocks and stubs can also help test behaviors that cannot be easily simulated.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 class ProgressBar: def display_progress(self, progress_percent: int) -> bool: """ Good: This method is easier to test :param progress_percent: Progress percent to display; valid values: [0, 100] :return: True if successful """ # Displays given progress percentage to a GUI widget success = self._view.set_value(progress_percent) return success from dataclasses import dataclass def test_display_progress_succeeds_within_0_to_100(): progress_bar = ProgressBar() @dataclass class TestData: progress_percent: int expected_result: bool test_data = [ TestData(-1, False), TestData(0, True), TestData(100, True), TestData(101, False), ] for data in test_data: assert data.expected_result == progress_bar.display_progress(data.progress_percent)
  1. Unit testing is automation friendly

By definition, unit tests is defined as testing code modules only. And with the simplicity of creating mocks and stubs, it completely eliminates any hardware dependencies or manual steps. Properly designed unit tests can be executed by a CI/CD system (e.g. Jenkins) on every code change pushed to a source code management server (e.g. GitHub, GitLab, Bitbucket). With this strategy, regression testing can be ran on every code change autonomously to catch bugs early.

References

More details on Sibros's philosophy of unit testing can be found in the article Unit Testing for C.


PEP 8 Highlights

In general, developers should follow the PEP 8 style guide. The style guide is not perfect, however, and developers may prefer to slightly deviate from the style guide.

The sections below are intended to highlight and correct common PEP 8 deviations.

Import style

Import statements should be grouped to easily identify modules that are standard, third-party, and local. Import statements should also be alphabetical ordered. These details are further elaborated in PEP 8 import.

Layout

1 2 3 4 5 # Standard modules # Third-party modules # Local modules

Example

1 2 3 4 5 6 7 8 9 10 from argparse import ArgumentParser import os import re from subprocess import Popen import sys import requests import yaml from httptransport import HTTPTransport

If there are no third-party modules to import, then leave it blank.

1 2 3 import os from httptransport import HTTPTransport

Naming convention

Developers should follow the naming convention described in PEP 8 naming convention

The sections below will attempt to summarize the details described in the PEP 8 article.

Variables

Variable names should be lower_snake_case and global constants should be UPPER_SNAKE_CASE

Example

1 2 3 4 5 MY_GLOBAL_CONSTANT = 10 for index in range(MY_GLOBAL_CONSTANT) message = "Index: {}".format(index) print(message)

Class

Class and type names should be UpperCamelCase. If there is an acronym in the name, then it should be all UPPERCASE. If the UPPERCASE acronym compromises readability, then it is fine to make the acronym UpperCamelCase instead.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 class MyCustomClass: pass class HTTPClient: pass class UDSServer: pass class UdsServer: # <== Maybe more readable to others, but prefer `UDS` over `Uds` if possible pass

Functions / methods

Function and method names should be lower_snake_case and should start with a verb to describe the action performed.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 class HTTPClient: def upload(self, url: str, filepath: str): raise NotImplementedError def download(self, url: str, filepath: str): raise NotImplementedError def _resume_download(self, url: str, offset: int) -> int: raise NotImplementedError def get_current_timestamp(): -> float: raise NotImplementedError def compute_next_offset(current_offset: int) -> int: raise NotImplementedError

Packages / modules

Package and module names should be lower_snake_case. Use underscores (_) to separate all words.

Note that our recommendation deviates from PEP 8 where PEP 8 discourages the use of underscores. We feel that it's better to be consistent and use underscores unconditionally.

Example

1 2 3 4 5 6 7 8 9 10 http_package constants.py custom_json_parser.py my_module.py uds_server.py import constants from custom_json_parser import CustomJSONParser from http_package import HTTPClient import my_module from uds_server import UDSServer

General Coding Standards

The sections above attempts to highlight common mistakes that deviate from the PEP 8 style guide. This section will describe guidelines derived from previous experiences of continuous improvements and many iterations of trial and error.

Variable naming and duck typing pitfalls

A common mistake in Python is naming variables ineffectively. The biggest disadvantage of Duck Typing is that it is easy to lose context if data types are not clear.

The problem

Imagine that you are a new developer working on an existing project, and you are tasked to implement the new function display_current_position given the argument coordinate. The problem is that it is not clear on how to use the given argument.

1 2 def display_current_position(coordinate): raise NotImplementedError # <== Implement this function given a coordinate

You may have many questions, and your best approach to answering those questions, unfortunately, maybe through reverse engineering. Some questions you may have are:

  • What is the data type of coordinate? Is it an integer, list, string, or an object?

  • If it is an object, what type of class is it? What public methods and attributes are accessible?

And to your surprise, it turns out the original developer decided to represent coordinate as a tuple of floats (<float>, <float>). At this point, you may assume the elements represents (latitude, longitude). But of course, your assumption may be wrong.

A solution

One solution to avoiding the duck typing pitfalls is to follow these naming conventions:

  • Names should closely resemble the data type (use suffixes if it helps)

  • Names should be plural to denote an iterable (preferably a list)

  • Use Type Hinting (Python 3.5+ only) to document parameter and return value types

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 from coordinate import Coordinate # Good: Name is derived from the type Coordinate coordinate = Coordinate() # Good: Name is suffixed with the type and overall has more context (represents coordinate of a vehicle) vehicle_coordinate = Coordinate() # Bad: Name deviates from the type Coordinate; this will confuse other readers location = Coordinate() # Good: Name denotes that the variable is a pair (tuple) and is explicit about the data structure latitude_longitude_pair = (coordinate.latitude, coordinate.longitude) # Debatable: Sometimes names may be too long or too detailed. It may already be implied that latitude and longitude are best represented as floats latitude_longitude_float_pair = (coordinate.latitude, coordinate.longitude) coordinate = Coordinate() # Good: Plural name clearly implies a list of coordinate objects coordinates = [ coordinate, ] # Somewhat good: Although the suffix "dict" makes it clear that this is a dictionary, readers also need context on the dictionary's schema coordinate_dict = { "current": coordinate } # Good: Suffix makes it clear that this is a set (unique set of coordinates) coordinate_set = { coordinate } # Good: Suffix makes it clear that this is a string representing a coordinate coordinate_str = "0.0, 0.0"

Sometimes it is fine to assume the name is implicitly clear if there is enough context surrounding the names (this is mostly applicable for well-known conventions or is typically the case when the readers have domain knowledge).

1 2 3 4 5 6 7 from memoryblock import MemoryBlock # Memory block represents a simple block of memory starting at some address of some size memory_block = MemoryBlock() memory_block.name # Assume this attribute is a string representing the name of the memory block memory_block.address # Assume this attribute is an integer representing the start address memory_block.size # Assume this attribute is an integer representing the size in bytes

Filesystem

Python developers will often deal with the filesystem in some form; some applications that rely on the filesystem include general automation, file/directory manipulation, and command line tool.

Fundamental concepts

It is important to understand the fundamentals of filesystem such as:

Here are some guidelines:

  • Programs should assume the current directory can be anywhere (i.e. do not assume current directory will be in the same directory as the entry point file)

  • Programs should generally work regardless if given absolute paths or a relative paths (it depends on how the paths are provided)

  • Programs should be OS agnostic and work regardless if dealing with POSIX paths or Windows paths

  • Programs should not assume the filesystem layouts of its environment (e.g. do not assume /tmp exists in the environment; programs should rely on environment variables instead)

Other

Apart from precondition checks at the beginning of a function/method, each function/method should only have one return statement and that should be at the end of the function. This is to prevent unintentional side-effects where a developer adds some code thinking it will be run, but the function exits out before it gets to that piece of code.

Naming convention

Similar to the general naming convention in the earlier section, filesystem paths should have its own naming convention. The name of the variable should represent a filesystem node file vs. directory and should be explicit about the representation name vs. path vs. name with no extension.

Here are some naming guidelines:

  • Variable representing a directory path should be suffixed with dirpath or dirname

  • Variable representing a file path should be suffixed with filepath or filename

  • Variable representing any path should be suffixed with nodepath or nodename

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 from datetime import datetime import os from pathlib import Path # Good: It is clear that this variable represents a directory containing N log files log_dirpath = Path("workspace/logs") # Bad: "logs" can easily be misinterpreted as a list of log objects logs = Path("workspace/logs") timestamp_str = datetime.now().strftime("%m-%d-%Y_%H-%M-%S") # Good: It is clear that this variable represents a file path relative to the current directory log_filepath = Path("log_{}.txt".format(timestamp_str)) # Good: It is clear that this variable represents a file name and does not represent an addressable path by itself log_filename = "log_{}.txt".format(timestamp_str) # Good: The suffix nodepath implicitly denotes that this function takes file or directory paths def copy(src_nodepath, dest_nodepath): """ Copy a source path to the destination path If the given paths are directory paths, perform a recursive copy """ raise NotImplementedError # Bad: The name "file" can easily be misinterpreted as a file object. # The function "os.listdir" actually returns a list of node names (names of both directories and files) for file in os.listdir(log_dirpath): print(file)

File layout

Defining a convention for file layouts is important for consistency. All Python files should have a responsibility, and Python files should not be created simply for the sake of dividing a program into different modules.

Entry point

The purpose of an entry point file is to act as the starting point of a program; in other words, this is the file that the user invokes to run a program. By convention, the entry point file should contain the function main and should have a consistent layout.

First, the command line interface (if any) should be the first thing the reader sees. The command line arguments should be defined in a get_args() function at the top of the file.

Second, the main() function should follow the command line interface definition and contain the application logic such as:

  1. Get command line arguments

  2. Do some application logic (invoke helper functions if any)

  3. Return exit code

Lastly, define helper functions below main(). The reason for having helper functions at the bottom is because readers should not care about the implementation details; the main() function should sufficiently describe what the program does.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 from argparse import ArgumentParser import sys def get_args(): argument_parser = ArgumentParser() return argument_parser.parse_args() def main(): args = get_args() return 0 def helper_function(): raise NotImplementedError if __name__ == "__main__": sys.exit(main()) # <== Return exit code using main() return value as the exit code

Command line interface

Python programs should be configurable through a command line interface. Command line arguments should follow the POSIX utility convention.

  • Single dash for short option (single character): -o, -i, -I

  • Double dash for long options (dashes to separate words): --output, --ignore-all-warnings

In addition to having a command line interface, Python programs should always return an Exit Code at the end of execution where zero (0) represents success and nonzero represents an error.

Module layout

Python modules should be categorized into multiple types:

  • Class definition and constants

  • Functions and constants

The motivation for categorizing modules into different types is to ensure that each module has a specific responsibility. Constants may freely reside in any module type. It's technically possible to mix class definitions and functions all into a single module, but it's better to categorize the types of modules to avoid creating a monolithic module.

Class definition and constants

A module containing a class should ideally contain exactly one class definition, and the file name should reflect that class name. This makes it easier for readers to find the class definition.

It's also fine to have multiple private classes in a class module, but remember that readers should not care about the implementation details, so try to hide private classes at the bottom of the file.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 # Contents of http_client.py class HTTPClient: raise NotImplementedError # Alternative contents of http_client.py HTTP_POST_METHOD = "post" class HTTPClient: raise NotImplementedError class _HTTPTransport: raise NotImplementedError

Example usage

1 2 3 from http_client import HTTPClient http_client = HTTPClient()

Functions and constants

Modules containing functions and constants should ideally be associated to some responsibility. Sometimes it is fine to create a generic helper.py module that contains generic helper functions, but try to avoid that.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 # Contents of http_helper.py from enum import Enum class HTTPStatusCode(Enum): SUCCESS = (200, 299) CLIENT_ERROR = (400, 499) SERVER_ERROR = (500, 599) def is_success(status_code): raise NotImplementedError def is_failure(status_code): raise NotImplementedError

Object-Oriented Programming (OOP) Coding Standards

This section will discuss conventions related to object oriented programming which will go over topic related to class design and layout.

Data model fundamentals

In Python, almost everything is an object. Built-in types like string, integer, list, etc. are all objects. Each of these objects have their own set of methods. For example, two lists can be added together using the + operator list + [0, 1, 2]. You cannot, however, add a list and an integer. list + 0 will result in a type error.

1 2 3 4 5 my_str = "I'm a string object" my_int = 0 # <== integer object my_list = [] # <== list object my_list + my_int # <== TypeError exception

Developers should be aware of Python Data Models which can be used to model custom classes based on built-in types by overriding special methods (a.k.a double underscore methods or dunder methods).

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 from typing import * class Table: def __init__(self, columns: List[str], rows: List[List[str]]): self._columns = columns self._rows = rows def __len__(self): return len(self._rows) def __str__(self): lines = [ str(self._columns), ] lines += [str(row) for row in self._rows] return ("\n".join(lines)) def __iter__(self): for row in self._rows: yield row def __getitem__(self, key): return self._rows[key] table = Table( columns=["c0", "c1"], rows=[ ["r00", "r01"], ["r10", "r11"], ] ) len(table) # Calls __len__ to get length of table print(table) # Calls __str__ to get a string representation for row in table: # Calls __iter__ to get a generator pass table[0] # Calls __getitem__ to retrieve first (0) element

Class responsibilities

Ideally, a class should have a focused responsibility. A class can be an entity that has some functional responsibility, a simple data container, or both. With a focused purpose, it's methods and attributes should also be relevant to a class's responsibility. For example, let's say we have an HTTPClient class which, implied by the name, functions as an HTTP client. The HTTPClient should not have methods to parse and process application-specific response payloads from an HTTP server. Rather, the HTTPClient should pass response payloads to the caller where the caller should have context on how to use the response payload.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 from typing import * from response import Response class HTTPClient: def get(self, url: str, headers: dict = None) -> Response: raise NotImplementedError def get_device_ids_from_response(self, url: str) -> List[int]: """ Bad: What is a "device ID"? HTTP client should not have any application-specific logic As the class name implies, this class should only be responsible for functioning as an HTTP client and therefore should only have methods and attributes relevant to an HTTP client """ raise NotImplementedError """ Good: - HTTPClient is only responsible for performing HTTP requests. It does not care about the response payloads - DeviceInfoReader gives meaning to response payloads; It uses HTTPClient to handle HTTP transactions with the cloud """ from typing import * from cloud_endpoint import DEVICE_INFO_URL from response import Response class HTTPClient: def get(url: str, headers: dict = None) -> Response: raise NotImplementedError class DeviceInfoReader: """ Responsible for getting device information from the cloud """ def __init__(self, http_client: HTTPClient): self._http_client = http_client def get_device_ids(self) -> List[int]: response = self._http_client.get(url=DEVICE_INFO_URL) # Assume response body is a dictionary where "device_ids" is a list of int return response.body["device_ids"]

Public / private / accessors / mutators design

A class design should be simple to read and use. Similar to a restaurant menu, you generally want to avoid overloading customers with unnecessary internal details like who prepares the food or where does the supply come from. And even if the menu exposed internal details, it's not like customers can use that information in any meaningful way. Rather, you want to make the menu straightforward; the customer wants to purchase food and drinks, and the menu should clearly highlight possible options to the customer.

Class design is very similar to the restaurant menu analogy above. A class should make it explicitly clear on what are the methods, accessors, and mutators that are at the user's disposal. A general rule is to put the important stuff at the top of the file and internal details at the bottom of the file. Readers tend to read from top to bottom.

To make classes aesthetically simple, classes should follow this order:

  1. Special methods

  2. Public methods

  3. Private methods

  4. Accessors

  5. Mutators

Notice that private methods should ideally be thrown at the bottom since it's internal details. Instead, it's placed underneath public methods for consistency; methods should be packed together to avoid mixing accessors and mutators in between methods.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 """ Note: The headers (e.g. # Public methods) are optional """ from response import Response class HTTPClient: def __init__(self): self._authentication = {} def __str__(self): return "HTTP client" # # Public methods # def get(url: str, header: dict = None) -> Response: raise NotImplementedError def post(url: str, header: dict = None) -> Response: raise NotImplementedError # # Private methods # def _format_header(header: dict) -> str raise NotImplementedError # # Accessors # @property def authentication(self): -> dict return self._authentication # # Mutators # @authentication.setter def authentication(self, new_authentication: dict): self._authentication = new_authentication http_client = HTTPClient() # Good: User calls public method response = http_client.get(url="url") # Bad: User calls private method; not intended by author of HTTPClient http_client._format_header(header={}) # Good: User uses accessor and mutator http_client.authentication = {} print(http_client.authentication) # Bad: User accesses a private attribute; not intended by author of HTTPClient http_client._authentication = {}

Note that a leading single underscore (_) can be prefixed to identifiers to denote private. Users should not use double underscores (__) since that mechanism is reserved for Name Mangling.

Data flow best practices

A common problem in other languages like C is that global variables can easily be abused. Imagine that you are a C developer and are given a C library that relies heavily on internal global variables.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 /* * Contents of http_client.c * * The problem is that in C, it may be tempting to rely on global variables */ #include <stdint.h> #include "http_client.h" static module_s module; static response_s last_response; static uint32_t last_status_code; void http_client__initialize(void) { // Not implemented } void http_client__get(const char* url) { // Not implemented } void http_client__get_response(response_s* response) { if (NULL != response) { *response = last_response; } } uint32_t http_client__get_status_code(void) { return last_status_code; }

When the API relies too much on global variables and internal states, the application code will begin to look very confusing to other readers.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 #include "http_client.h" int main(int argc, char** argv) { // Bad: HTTP client is not configurable http_client__initialize(); // Bad: No return type; user passes URL in, but nothing comes out http_client__get("url"); // Bad: Library now has a response based on a previous function call // There is hardly an association between "http_client__get_response" and "http_client__get" response_s response = {}; http_client__get_response(&response); return -1; }

The problem in the C code above also exists in class design as well. In class design, attributes act similarly to global variables, that is, all methods can freely access and modify attributes, and therefore, this makes methods have internal states in some form.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 """ The problem is the same as the C library example above Attributes act like global variables; in this case, those attributes unnecessarily retain internal state """ import requests class HTTPClient: def __init__(self): self._last_response = None self._last_status_code = None def get(self, url: str): self._last_response = requests.get(url) self._last_status_code = self._last_response.status_code def get_response(self): -> Response return self._last_response def get_last_status_code(self) -> int: return self._last_status_code http_client = HTTPClient() # Bad: Methods "get()" and "get_response()" have a loose association # Readers will get confused as to where the response came from # In other words, what changed "get_response()" behavior? response = http_client.get_response() # None http_client.get(url="url") # Application logic here response = http_client.get_response() # Response object

A solution to the problem of obscure internal state changes is to ensure that the API has proper data flow. Functions and methods should have inputs and outputs defined, and APIs should avoid retaining unnecessary states. A general rule is that if an API has multiple functions/methods that take zero inputs and zero outputs, then that is a good sign that the data flow will be very obscure to the user.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 # Bad: There's a lot of magic happening internally which makes the application code confusing stateful_class = StatefulClass(result_filepath="results.txt") stateful_class.do_this() stateful_class.do_that() stateful_class.handle_results() stateful_class.write_results_to_file() stateful_class.handle_error() # Good: The solution is to define inputs and outputs for functions and methods if applicable less_stateful_class = LessStatefulClass() output_a = less_stateful_class.do_this(input_a) output_b = less_stateful_class.do_that(input_b) formatted_results = less_stateful_class.handle_results(results=[output_a, output_b]) error = less_stateful_class.write_results_to_file(filepath="results.txt", results=formatted_results) less_stateful_class.handle_error(error)

Package / Module Structuring

Like classes, package and module design follow a similar paradigm.

Namespace

First off for package and module naming, try to avoid redundant names.

Example

  • /can

    • /__init__.py

    • /can_pcan.py

    • /can_value_can.py

    • /can_vector.py

1 2 3 4 import can # Bad: Redundant names in namespace result in repetitive reference to "can" pcan = can.can_pcan.CAN_PCAN()

In the above example, the package name can already makes it explicitly clear that modules in the package are related to CAN, so the module name does not need to be prefixed with that name; the same applies to identifiers within the modules.

1 2 3 4 import can # Good: Package namespace already associates module "pcan" to the context of CAN pcan = can.pcan.PCAN()

Public / private imports

Packages and modules should also follow the same public and private conventions as classes. Users should certainly be discouraged from importing private modules and identifiers. Like classes, the same leading single underscore (_) can be used to denote private.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 # Contents of module.py PUBLIC_CONSTANT = 0 ANOTHER_PUBLIC_CONSTANT = 0 _PRIVATE_CONSTANT = -1 class PublicClass: raise NotImplementedError class _PrivateClass: raise NotImplementedError

To control the public / private scope of packages, use the __init__.py file that resides in the root of the package directory. When a user imports a package, the __init__.py file actually gets executed on import; developers can take advantage of this behavior by placing import statements in the __init__.py file.

Example

1 2 3 4 5 6 # Contents of package/__init__.py # This file gets executed when a user imports this package from .module import PublicClass, PUBLIC_CONSTANT, ANOTHER_PUBLIC_CONSTANT from package import PublicClass, PUBLIC_CONSTANT public_class = PublicClass()

OOP Abstract Base Class (ABC) Coding Standards

One of the challenges of creating large program is that developers will likely encounter the scenario where many variations of something needs to be supported. For example, CAN nodes fundamentally works the same. Functionally, a CAN node can initialize, uninitialize, send message, and receive message. However, the implementation of CAN varies widely across different hardware. A PCAN USB-CAN device works differently than Vector USB-CAN device despite having the same exact behavior from a high level perspective.

To support both devices in code, you certainly want to avoid creating multiple implementations that have different APIs each with slight deviations.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 """ Bad: Both PCAN and VectorCAN have similar functional behavior, and the APIs are similar, but the API variation will force users to write duplicate application code each with slight variations """ from typing import * from pcan_message import PCANMessage class PCAN: def __init__(self, kbaud: int): raise NotImplementedError def send_message_to_channel(self, channel: int, pcan_message: PCANMessage): raise NotImplementedError def receive_message_to_channel(self, channel:int) -> PCANMessage: raise NotImplementedError class VectorCAN: def __init__(self, port: str, baud: int): raise NotImplementedError def send(self, bus: int, id: int, data: bytearray): raise NotImplementedError def receive(self, bus: int) -> Tuple[int, bytearray]: raise NotImplementedError from usbcan import PCAN, VectorCAN def send_can_message(device, bus: int, id: int, data: bytearray): """ Bad: User has to write duplicate code with slight variations that practically do the same thing User will also have to write more "elif" conditions to support more device types. """ if isinstance(device, PCAN): pcan_message = PCANMessage(id=id, data=data) device.send_message_to_channel(channel=bus, pcan_message) elif isinstance(device, VectorCAN): device.send(bus=bus, id=id, data=data) else: raise TypeError(device)

Abstract Base Classes (ABC) are designed to solve the problem of variations. ABC allows developers to define a common interface which acts as a contract that classes must conform to. The USB standard is a perfect example. Think about how there are many USB devices out there: keyboard, mouse, flash drive, and may other types. There isn't a specific connector for keyboard vs. flash drive; you can just plug any USB connector into a USB port and expect it to work. That seamless abstraction is all thanks to the USB standard that all USB devices must conform to. For instance, All USB devices shall support 5 V input voltage, shall communicate using the USB communication protocol, and shall identify itself to the host on connection.

ABC allows developers to define a standard interface by defining common method signatures.

Example

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 # Contents of can_interface_abc.py from abc import ABC, abstractmethod from typing import * from errors import CANInterfaceError from can_message import CANMessage class CANInterfaceABC(ABC): # # Public methods # @abstractmethod def initialize(self, kbaud: float) -> bool: """ This method shall initialize a hardware with the given kbaud Return true if successful """ raise NotImplementedError @abstractmethod def uninitialize(self): """ This method shall uninitialize a hardware; do nothing if never initialized """ raise NotImplementedError @abstractmethod def send(self, can_message: CANMessage) -> bool: """ This method shall send a given CAN message and return true if successful :raise: CANInterfaceError """ raise NotImplementedError def send_multiple(self, can_messages: List[CANMessage]) -> bool: """ Abstract base classes can also have some implementation too that depend on other abstract methods """ ret = False for can_message in can_messages: ret = self.send(can_message) if not ret: break @abstractmethod def receive(self, timeout: float) -> Union[CANMessage, None]: """ This method shall receive a CAN message from a receive buffer (if any) This method shall also block for up to the given timeout in seconds Return a CAN message object if a received message exists else return None :raise: CANInterfaceError """ raise NotImplementedError # # Private method # @abstractmethod def _block_timeout(self, timeout: float): """ Private methods can be abstract methods too """ raise NotImplementedError

In the example above, developers should suffix the class name with ABC to make it explicitly clear that the class is an abstract base class. For each method, developers should clearly document the expected behavior (in a docstring) and type hint the parameters and return value.

The ABC class should document exactly how implementation classes should behave. In other words, other developers who are responsible for writing implementation classes should not have to find the original developer to seek information about the parameters, expected behavior, and the return value of all abstract methods.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 # Contents of pcan.py import time from typing import * from can_interface_abc import CANInterfaceABC class PCAN(CANInterfaceABC): # # Public methods # def initialize(self, kbaud: float) -> bool: # Implementation here return True def uninitialize(self): # Implementation here pass def send(self, can_message: CANMessage) -> bool: # Implementation here return True def receive(self, timeout: float) -> Union[CANMessage, None]: # Implementation here return None # # Private methods # def _block_timeout(self, timeout: float): time.sleep(timeout) """ Assume PCAN and VectorCAN both conform to CANInterfaceABC """ from can import CANInterfaceABC, CANMessage, PCAN, VectorCAN can_device = get_can_device() # Assume this function exists and returns a CANInterface object assert isinstance(can_device, CANInterfaceABC) # This statement is just an example; returns True # Good: Application code can be reused across different CAN device types can_device.initialize(kbaud=500) can_message = CANMessage() can_device.send(can_message) can_message = can_device.receive(timeout=1.0) can_device.uninitialize()

Conclusion

Python is an excellent programming language that is both easy to learn and flexible for a wide range of applications. And with it's object oriented nature, it allows developers to write code ranging from very basic scripts to large, complex yet elegant programs. Sibros's coding standards were derived from PEP 8 and extended based on past experiences. Programming is more than just putting instructions together in sequential order, but is rather an art that should focus on the readers. More time is spent reading code than writing, so it is very important to ensure that readers can easily understand the intentions of the writers.