1 Introduction
2 Why Python?
3 General Philosophy
- 3.1 Readability vs. performance
- 3.2 Nested indentations
- 3.3 Modularity
- 3.4 Unit testing
4 PEP 8 Highlights
- 4.1 Import style
- 4.2 Naming convention
  - 4.2.1 Variables
  - 4.2.2 Class
  - 4.2.3 Functions / methods
  - 4.2.4 Packages / modules
5 General Coding Standards
- 5.1 Variable naming and duck typing pitfalls
- 5.2 Filesystem
  - 5.2.1 Fundamental concepts
  - 5.2.2 Current Directory
    - 5.2.2.1 Changing the Current Directory
- 5.3 Naming convention
- 5.4 Other
6 File layout
- 6.1 Entry point
  - 6.1.1 Command line interface
- 6.2 Module layout
  - 6.2.1 Class definition and constants
  - 6.2.2 Functions and constants
7 Object-Oriented Programming (OOP) Coding Standards
8 Package / Module Structuring
- 8.1 Namespace
- 8.2 Public / private imports
9 OOP Abstract Base Class (ABC) Coding Standards
10 Conclusion

Introduction

The Sibros Python coding standards serves as a general programming guideline for a variety of aspects such as style guide, conventions, and design. The guidelines extend beyond Python's PEP 8 style guide, and just like PEP 8, the guidelines only serve as recommendations.

Why Python?

Python is an excellent programming language that has multiple advantages over other languages.

Easy to learn

Python was designed to be written closer to English by simplifying the overall syntax; it generally takes less lines of code to achieve similar functionality in other languages like C or C++.

Great for prototyping

Because of Python's simplistic nature, it has a quick turnaround time to create programs for a wide range of applications ranging from internal/external standalone tools to simple scripts for automation.

Object-oriented

Python is an object-oriented language, so developers can take advantage of object-oriented design to create complex yet elegant designs.

General Philosophy

Readability vs. performance

Python should be written with readability as the utmost priority; optimization is secondary. Optimization should only be done if the problem being solved is proven to be a problem in the first place. See Premature Optimization for more details.

For example, let's say algorithm A has an execution time of 50 ms, and algorithm B uses some obscure logic to improve the execution time to 1 ms. Assuming algorithm B's obscure logic compromises readability, then algorithm B is not justifiable because a 50x performance gain in the order of tens of milliseconds is negligible relative to human perception.

The same argument can be used for memory optimization.

Nested indentations

Try to ensure blocks of code have a maximum of three (3) levels of indentation if possible (indentations from functions and classes do not count). If an algorithm requires an excessive amount of indentations, then consider breaking up the algorithm into sequential chunks. For example, instead of having an algorithm that does everything in a single for loop, consider dividing the algorithm into sequential, smaller steps: step 1: do this, step 2: do that, finally step 3: combine results.

Modularity

Modularity is a necessary design principle to writing complex yet maintainable programs. For example, if a file contains 500+ lines of code, then that's a good sign the file needs to be divided into modules. The same can be said for functions/methods, control flow (e.g. if/elif/else, for loops, etc.), and classes.

An extreme example of unmaintainable code is when a function has many nested if/elif/else blocks of code where each condition contains a significant amount of code (e.g. 100+ lines). At this point, the code can be compared to abstract art; readers will find it to be very difficult to identify the logical boundaries among the many lines of code. This scenario typically happens when modularity was not incorporated into the design and the program has to handle frequent requirement changes or support multiple variations of something.

Unit testing

Try to unit test all code. Any untested code might have bugs regardless of how trivial the logic is. The last thing you want is to have customers discover bugs in your program. Overall, unit testing is an essential discipline that has multiple advantages.

Unit testing encourages modular design

Unit testing encourages modular design by encouraging developers to divide large programs into small building blocks which ultimately makes the overall program more testable. It's more preferable to test small, focused components (functions, classes, modules, etc.) over a large monolithic component full of complexities.

Unit testing encourages developers to design testable APIs

In order to test an API, the API has to be testable. In other words, an API can be wrongfully designed such that it is not easily testable.

Example

Imagine that you are writing a unit test for a progress bar class that is responsible for displaying progress percentages to a GUI widget.

class ProgressBar:
    def display_progress(self):
        """
        Bad: This method is not easily testable
        """
        # Displays progress percentage to a GUI widget
        # Progress depends on an external source
        progress_percent = self._external_source.progress
        self._view.set_value(progress_percent)

The problem is that the class is not easily testable. How do you validate the behavior of the class ProgressBar for percentages ranging [0, 100]? How do you simulate the progress percent? What if there is a corner case bug that happens when the percentage is exactly 100 percent?

To make the class more testable, the user should be able to easily provide inputs (to simulate cases) and read the outputs (for validating the behavior for each case). Mocks and stubs can also help test behaviors that cannot be easily simulated.

class ProgressBar:
    def display_progress(self, progress_percent: int) -> bool:
        """
        Good: This method is easier to test

        :param progress_percent: Progress percent to display; valid values: [0, 100]
        :return: True if successful
        """
        # Displays given progress percentage to a GUI widget
        success = self._view.set_value(progress_percent)
        return success
from dataclasses import dataclass


def test_display_progress_succeeds_within_0_to_100():
    progress_bar = ProgressBar()

    @dataclass
    class TestData:
        progress_percent: int
        expected_result: bool

    test_data = [
        TestData(-1, False),
        TestData(0, True),
        TestData(100, True),
        TestData(101, False),
    ]

    for data in test_data:
        assert data.expected_result == progress_bar.display_progress(data.progress_percent)

Unit testing is automation friendly

By definition, unit tests is defined as testing code modules only. And with the simplicity of creating mocks and stubs, it completely eliminates any hardware dependencies or manual steps. Properly designed unit tests can be executed by a CI/CD system (e.g. Jenkins) on every code change pushed to a source code management server (e.g. GitHub, GitLab, Bitbucket). With this strategy, regression testing can be ran on every code change autonomously to catch bugs early.

References

More details on Sibros's philosophy of unit testing can be found in the article Unit Testing for C.

PEP 8 Highlights

In general, developers should follow the PEP 8 style guide. The style guide is not perfect, however, and developers may prefer to slightly deviate from the style guide.

The sections below are intended to highlight and correct common PEP 8 deviations.

Import style

Import statements should be grouped to easily identify modules that are standard, third-party, and local. Import statements should also be alphabetical ordered. These details are further elaborated in PEP 8 import.

Layout

# Standard modules

# Third-party modules

# Local modules

Example

from argparse import ArgumentParser
import os
import re
from subprocess import Popen
import sys

import requests
import yaml

from httptransport import HTTPTransport

If there are no third-party modules to import, then leave it blank.

import os

from httptransport import HTTPTransport

Naming convention

Developers should follow the naming convention described in PEP 8 naming convention

The sections below will attempt to summarize the details described in the PEP 8 article.

Variables

Variable names should be lower_snake_case and global constants should be UPPER_SNAKE_CASE

Example

MY_GLOBAL_CONSTANT = 10

for index in range(MY_GLOBAL_CONSTANT)
    message = "Index: {}".format(index)
    print(message)

Class

Class and type names should be UpperCamelCase. If there is an acronym in the name, then it should be all UPPERCASE. If the UPPERCASE acronym compromises readability, then it is fine to make the acronym UpperCamelCase instead.

Example

class MyCustomClass:
    pass


class HTTPClient:
    pass


class UDSServer:
    pass


class UdsServer:  # <== Maybe more readable to others, but prefer `UDS` over `Uds` if possible
    pass

Functions / methods

Function and method names should be lower_snake_case and should start with a verb to describe the action performed.

Example

class HTTPClient:
    def upload(self, url: str, filepath: str):
        raise NotImplementedError

    def download(self, url: str, filepath: str):
        raise NotImplementedError

    def _resume_download(self, url: str, offset: int) -> int:
        raise NotImplementedError
def get_current_timestamp(): -> float:
    raise NotImplementedError


def compute_next_offset(current_offset: int) -> int:
    raise NotImplementedError

Packages / modules

Package and module names should be lower_snake_case. Use underscores (_) to separate all words.

Note that our recommendation deviates from PEP 8 where PEP 8 discourages the use of underscores. We feel that it's better to be consistent and use underscores unconditionally.

Example

http_package
constants.py
custom_json_parser.py
my_module.py
uds_server.py
import constants
from custom_json_parser import CustomJSONParser
from http_package import HTTPClient
import my_module
from uds_server import UDSServer

General Coding Standards

The sections above attempts to highlight common mistakes that deviate from the PEP 8 style guide. This section will describe guidelines derived from previous experiences of continuous improvements and many iterations of trial and error.

Variable naming and duck typing pitfalls

A common mistake in Python is naming variables ineffectively. The biggest disadvantage of Duck Typing is that it is easy to lose context if data types are not clear.

The problem

Imagine that you are a new developer working on an existing project, and you are tasked to implement the new function display_current_position given the argument coordinate. The problem is that it is not clear on how to use the given argument.

def display_current_position(coordinate):
    raise NotImplementedError  # <== Implement this function given a coordinate

You may have many questions, and your best approach to answering those questions, unfortunately, maybe through reverse engineering. Some questions you may have are:

What is the data type of coordinate? Is it an integer, list, string, or an object?
If it is an object, what type of class is it? What public methods and attributes are accessible?

And to your surprise, it turns out the original developer decided to represent coordinate as a tuple of floats (<float>, <float>). At this point, you may assume the elements represents (latitude, longitude). But of course, your assumption may be wrong.

A solution

One solution to avoiding the duck typing pitfalls is to follow these naming conventions:

Names should closely resemble the data type (use suffixes if it helps)
Names should be plural to denote an iterable (preferably a list)
Use Type Hinting (Python 3.5+ only) to document parameter and return value types

Example

from coordinate import Coordinate

# Good: Name is derived from the type Coordinate
coordinate = Coordinate()

# Good: Name is suffixed with the type and overall has more context (represents coordinate of a vehicle)
vehicle_coordinate = Coordinate()

# Bad: Name deviates from the type Coordinate; this will confuse other readers
location = Coordinate()

# Good: Name denotes that the variable is a pair (tuple) and is explicit about the data structure
latitude_longitude_pair = (coordinate.latitude, coordinate.longitude)

# Debatable: Sometimes names may be too long or too detailed. It may already be implied that latitude and longitude are best represented as floats
latitude_longitude_float_pair = (coordinate.latitude, coordinate.longitude)
coordinate = Coordinate()

# Good: Plural name clearly implies a list of coordinate objects
coordinates = [
    coordinate,
]

# Somewhat good: Although the suffix "dict" makes it clear that this is a dictionary, readers also need context on the dictionary's schema
coordinate_dict = {
    "current": coordinate
}

# Good: Suffix makes it clear that this is a set (unique set of coordinates)
coordinate_set = {
    coordinate
}

# Good: Suffix makes it clear that this is a string representing a coordinate
coordinate_str = "0.0, 0.0"

Sometimes it is fine to assume the name is implicitly clear if there is enough context surrounding the names (this is mostly applicable for well-known conventions or is typically the case when the readers have domain knowledge).

from memoryblock import MemoryBlock

# Memory block represents a simple block of memory starting at some address of some size
memory_block = MemoryBlock()
memory_block.name  # Assume this attribute is a string representing the name of the memory block
memory_block.address  # Assume this attribute is an integer representing the start address
memory_block.size  # Assume this attribute is an integer representing the size in bytes

Filesystem

Python developers will often deal with the filesystem in some form; some applications that rely on the filesystem include general automation, file/directory manipulation, and command line tool.

Fundamental concepts

It is important to understand the fundamentals of filesystem such as:

Here are some guidelines:

Programs should assume the current directory can be anywhere (i.e. do not assume current directory will be in the same directory as the entry point file)
Programs should generally work regardless if given absolute paths or a relative paths (it depends on how the paths are provided)
Programs should be OS agnostic and work regardless if dealing with POSIX paths or Windows paths

Current Directory

From a program’s perspective, the current directory can be anywhere in the filesystem; it is up to the caller of the program.

./program.py                     # Current directory is in the same directory (parent directory)
./directory/program.py           # Current directory is 1 directory up from the parent directory
/home/root/directory/program.py  # Current directory could be anywhere in the filesystem

It is good practice to ensure your program is agnostic to the location of the current directory. Some good ways to make your program agnostic of the current directory is to:

Derive the parent directory of the program file, then assemble paths relative to the location of the program file itself
Let the caller determine the location of a directory/file by taking path(s) as commend line option(s). This avoids having your program make assumptions of the host’s filesystem layout

"""
Example, derive the parent directory of this file, then assemble paths relative to the parent directory

This approach is great if this program is packaged with other files that live alongside it (e.g. configuration files)
"""

import os
from pathlib import Path

# Ways to get the parent directory path of this program file
self_dirpath = os.path.dirname(__file__)  # Using traditional os to get a string object
self_dirpath = Path(__file__).parent  # Using modern pathlib to get a Path object

# Ways to assemble a path relative to the parent directory of this file
target_filepath = os.path.join(self_dirpath, "directory", "test.txt")  # Assume self_dirpath is a string object
target_filepath = self_dirpath.joinpath("directory", "test.txt")  # Assume self_dirpath is a Path object

"""
Example, let the caller of this program provide the path to use

This example uses pathlib Path objects instead of os string objects
"""

from argparse import ArgumentParser
from pathlib import Path
import sys


def get_args():
    argument_parser = ArgumentParser()
    argument_parser.add_argument("-d", "--target-dirpath", type=Path, help="A directory path to reference")
    return argument_parser.parse_args()
    

def main():
    args = get_args()
    target_filepath = args.target_dirpath.joinpath("directory", "test.txt")
    
  
if __name__ == "__main__":
    sys.exit(main())

Changing the Current Directory

Your program (whether it’s Python or a shell script) should avoid changing the caller’s current directory if possible. In Python, changing the current directory could potentially affect code further in the code stack.

If the current directory needs to be changed for any reason, ensure it gets reverted by the end of the code block that needs the current directory to be at the specific location.

import os

prev_current_directory = os.getcwd()
try:
    os.chdir("/tmp")
    # Do stuff here
finally:
    os.chdir(prev_current_directory)

Naming convention

Similar to the general naming convention in the earlier section, filesystem paths should have its own naming convention. The name of the variable should represent a filesystem node file vs. directory and should be explicit about the representation name vs. path vs. name with no extension.

Here are some naming guidelines:

Variable representing a directory path should be suffixed with dirpath or dirname
Variable representing a file path should be suffixed with filepath or filename
Variable representing any path should be suffixed with nodepath or nodename

Example

from datetime import datetime
import os
from pathlib import Path

# Good: It is clear that this variable represents a directory containing N log files
log_dirpath = Path("workspace/logs")

# Bad: "logs" can easily be misinterpreted as a list of log objects
logs = Path("workspace/logs")

timestamp_str = datetime.now().strftime("%m-%d-%Y_%H-%M-%S")

# Good: It is clear that this variable represents a file path relative to the current directory
log_filepath = Path("log_{}.txt".format(timestamp_str))

# Good: It is clear that this variable represents a file name and does not represent an addressable path by itself
log_filename = "log_{}.txt".format(timestamp_str)

# Good: The suffix nodepath implicitly denotes that this function takes file or directory paths
def copy(src_nodepath, dest_nodepath):
    """
    Copy a source path to the destination path
    If the given paths are directory paths, perform a recursive copy
    """
    raise NotImplementedError

# Bad: The name "file" can easily be misinterpreted as a file object.
# The function "os.listdir" actually returns a list of node names (names of both directories and files)
for file in os.listdir(log_dirpath):
    print(file)

Other

Apart from precondition checks at the beginning of a function/method, each function/method should only have one return statement and that should be at the end of the function. This is to prevent unintentional side-effects where a developer adds some code thinking it will be run, but the function exits out before it gets to that piece of code.

File layout

Defining a convention for file layouts is important for consistency. All Python files should have a defined purpose, and Python files should not be created simply for the sake of dividing a program into different modules.

For example, if a Python file contains a wide assortment of unrelated functions (excluding a generalized utility module), then that should be an indication that the file is responsible for too many different things.

Entry point

The purpose of an entry point file is to act as the starting point of a program; in other words, this is the file that the user invokes to run a program. By convention, the entry point file should contain the function main and should have a consistent layout.

First, the command line interface (if any) should be the first thing the reader sees. The command line arguments should be defined in a get_args() function at the top of the file.

Second, the main() function should follow the command line interface definition and contain the application logic such as:

Get command line arguments
Do some application logic (invoke helper functions if any)
Return exit code

Lastly, define helper functions below main(). The reason for having helper functions at the bottom is because readers should not care about the implementation details; the main() function should sufficiently describe what the program does.

Example

from argparse import ArgumentParser
import sys


def get_args():
    argument_parser = ArgumentParser()
    return argument_parser.parse_args()


def main():
    args = get_args()
    return 0


def helper_function():
    raise NotImplementedError


if __name__ == "__main__":
    sys.exit(main())  # <== Return exit code using main() return value as the exit code

Command line interface

Python programs should be configurable through a command line interface. Command line arguments should follow the POSIX utility convention.

Single dash for short option (single character): -o, -i, -I
Double dash for long options (dashes to separate words): --output, --ignore-all-warnings

In addition to having a command line interface, Python programs should always return an Exit Code at the end of execution where zero (0) represents success and nonzero represents an error.

Module layout

Python modules should be categorized into multiple types:

Class definition and constants
Functions and constants

The motivation for categorizing modules into different types is to ensure that each module has a specific responsibility. Constants may freely reside in any module type. It's technically possible to mix class definitions and functions all into a single module, but it's better to categorize the types of modules to avoid creating a monolithic module.

Class definition and constants

A module containing a class should ideally contain exactly one class definition, and the file name should reflect that class name. This makes it easier for readers to find the class definition.

It's also fine to have multiple private classes in a class module, but remember that readers should not care about the implementation details, so try to hide private classes at the bottom of the file.

Example

# Contents of http_client.py

class HTTPClient:
    raise NotImplementedError
# Alternative contents of http_client.py
HTTP_POST_METHOD = "post"


class HTTPClient:
    raise NotImplementedError


class _HTTPTransport:
    raise NotImplementedError

Example usage

from http_client import HTTPClient

http_client = HTTPClient()

Functions and constants

Modules containing functions and constants should ideally be associated to some responsibility. Sometimes it is fine to create a generic helper.py module that contains generic helper functions, but try to avoid that.

# Contents of http_helper.py

from enum import Enum


class HTTPStatusCode(Enum):
    SUCCESS = (200, 299)
    CLIENT_ERROR = (400, 499)
    SERVER_ERROR = (500, 599)


def is_success(status_code):
    raise NotImplementedError


def is_failure(status_code):
    raise NotImplementedError

Object-Oriented Programming (OOP) Coding Standards

This section will discuss conventions related to object oriented programming which will go over topic related to class design and layout.

Data model fundamentals

In Python, almost everything is an object. Built-in types like string, integer, list, etc. are all objects. Each of these objects have their own set of methods. For example, two lists can be added together using the + operator list + [0, 1, 2]. You cannot, however, add a list and an integer. list + 0 will result in a type error.

my_str = "I'm a string object"
my_int = 0  # <== integer object
my_list = []  # <== list object

my_list + my_int  # <== TypeError exception

Developers should be aware of Python Data Models which can be used to model custom classes based on built-in types by overriding special methods (a.k.a double underscore methods or dunder methods).

Example

from typing import *


class Table:
    def __init__(self, columns: List[str], rows: List[List[str]]):
        self._columns = columns
        self._rows = rows

    def __len__(self):
        return len(self._rows)

    def __str__(self):
        lines = [
            str(self._columns),
        ]
        lines += [str(row) for row in self._rows]
        return ("\n".join(lines))

    def __iter__(self):
        for row in self._rows:
            yield row

    def __getitem__(self, key):
        return self._rows[key]


table = Table(
    columns=["c0", "c1"],
    rows=[
        ["r00", "r01"],
        ["r10", "r11"],
    ]
)

len(table)  # Calls __len__ to get length of table
print(table)  # Calls __str__ to get a string representation
for row in table:  # Calls __iter__ to get a generator
    pass
table[0]  # Calls __getitem__ to retrieve first (0) element

Class responsibilities

Ideally, a class should have a focused responsibility. A class can be an entity that has some functional responsibility, a simple data container, or both. With a focused purpose, it's methods and attributes should also be relevant to a class's responsibility. For example, let's say we have an HTTPClient class which, implied by the name, functions as an HTTP client. The HTTPClient should not have methods to parse and process application-specific response payloads from an HTTP server. Rather, the HTTPClient should pass response payloads to the caller where the caller should have context on how to use the response payload.

Example

from typing import *

from response import Response


class HTTPClient:
    def get(self, url: str, headers: dict = None) -> Response:
        raise NotImplementedError

    def get_device_ids_from_response(self, url: str) -> List[int]:
        """
        Bad: What is a "device ID"? HTTP client should not have any application-specific logic

        As the class name implies, this class should only be responsible for functioning as an HTTP client and
        therefore should only have methods and attributes relevant to an HTTP client
        """
        raise NotImplementedError
"""
Good:
- HTTPClient is only responsible for performing HTTP requests. It does not care about the response payloads
- DeviceInfoReader gives meaning to response payloads; It uses HTTPClient to handle HTTP transactions
  with the cloud
"""

from typing import *

from cloud_endpoint import DEVICE_INFO_URL
from response import Response


class HTTPClient:
    def get(url: str, headers: dict = None) -> Response:
        raise NotImplementedError


class DeviceInfoReader:
    """
    Responsible for getting device information from the cloud
    """
    def __init__(self, http_client: HTTPClient):
        self._http_client = http_client

    def get_device_ids(self) -> List[int]:
        response = self._http_client.get(url=DEVICE_INFO_URL)
        # Assume response body is a dictionary where "device_ids" is a list of int
        return response.body["device_ids"]

Public / private / accessors / mutators design

A class design should be simple to read and use. Similar to a restaurant menu, you generally want to avoid overloading customers with unnecessary internal details like who prepares the food or where does the supply come from. And even if the menu exposed internal details, it's not like customers can use that information in any meaningful way. Rather, you want to make the menu straightforward; the customer wants to purchase food and drinks, and the menu should clearly highlight possible options to the customer.

Class design is very similar to the restaurant menu analogy above. A class should make it explicitly clear on what are the methods, accessors, and mutators that are at the user's disposal. A general rule is to put the important stuff at the top of the file and internal details at the bottom of the file. Readers tend to read from top to bottom.

To make classes aesthetically simple, classes should follow this order:

Special methods
Public methods
Private methods
Accessors
Mutators

Notice that private methods should ideally be thrown at the bottom since it's internal details. Instead, it's placed underneath public methods for consistency; methods should be packed together to avoid mixing accessors and mutators in between methods.

Example

"""
Note: The headers (e.g. # Public methods) are optional
"""

from response import Response


class HTTPClient:
    def __init__(self):
        self._authentication = {}

    def __str__(self):
        return "HTTP client"

    #
    # Public methods
    #

    def get(url: str, header: dict = None) -> Response:
        raise NotImplementedError

    def post(url: str, header: dict = None) -> Response:
        raise NotImplementedError

    #
    # Private methods
    #

    def _format_header(header: dict) -> str
        raise NotImplementedError

    #
    # Accessors
    #

    @property
    def authentication(self): -> dict
        return self._authentication

    #
    # Mutators
    #

    @authentication.setter
    def authentication(self, new_authentication: dict):
        self._authentication = new_authentication
http_client = HTTPClient()

# Good: User calls public method
response = http_client.get(url="url")

# Bad: User calls private method; not intended by author of HTTPClient
http_client._format_header(header={})

# Good: User uses accessor and mutator
http_client.authentication = {}
print(http_client.authentication)

# Bad: User accesses a private attribute; not intended by author of HTTPClient
http_client._authentication = {}

Note that a leading single underscore (_) can be prefixed to identifiers to denote private. Users should not use double underscores (__) since that mechanism is reserved for Name Mangling.

Data flow best practices

A common problem in other languages like C is that global variables can easily be abused. Imagine that you are a C developer and are given a C library that relies heavily on internal global variables.

/*
 * Contents of http_client.c
 *
 * The problem is that in C, it may be tempting to rely on global variables
 */

#include <stdint.h>

#include "http_client.h"

static module_s module;
static response_s last_response;
static uint32_t last_status_code;

void http_client__initialize(void)
{
    // Not implemented
}

void http_client__get(const char* url)
{
    // Not implemented
}

void http_client__get_response(response_s* response)
{
    if (NULL != response) {
        *response = last_response;
    }
}

uint32_t http_client__get_status_code(void)
{
    return last_status_code;
}

When the API relies too much on global variables and internal states, the application code will begin to look very confusing to other readers.

#include "http_client.h"

int main(int argc, char** argv)
{
    // Bad: HTTP client is not configurable
    http_client__initialize();

    // Bad: No return type; user passes URL in, but nothing comes out
    http_client__get("url");

    // Bad: Library now has a response based on a previous function call
    // There is hardly an association between "http_client__get_response" and "http_client__get"
    response_s response = {};
    http_client__get_response(&response);

    return -1;
}

The problem in the C code above also exists in class design as well. In class design, attributes act similarly to global variables, that is, all methods can freely access and modify attributes, and therefore, this makes methods have internal states in some form.

"""
The problem is the same as the C library example above
Attributes act like global variables; in this case, those attributes unnecessarily retain internal state
"""

import requests


class HTTPClient:
    def __init__(self):
        self._last_response = None
        self._last_status_code = None

    def get(self, url: str):
        self._last_response = requests.get(url)
        self._last_status_code = self._last_response.status_code

    def get_response(self): -> Response
        return self._last_response

    def get_last_status_code(self) -> int:
        return self._last_status_code
http_client = HTTPClient()

# Bad: Methods "get()" and "get_response()" have a loose association
# Readers will get confused as to where the response came from
# In other words, what changed "get_response()" behavior?

response = http_client.get_response()  # None

http_client.get(url="url")

# Application logic here

response = http_client.get_response()  # Response object

A solution to the problem of obscure internal state changes is to ensure that the API has proper data flow. Functions and methods should have inputs and outputs defined, and APIs should avoid retaining unnecessary states. A general rule is that if an API has multiple functions/methods that take zero inputs and zero outputs, then that is a good sign that the data flow will be very obscure to the user.

Example

# Bad: There's a lot of magic happening internally which makes the application code confusing
stateful_class = StatefulClass(result_filepath="results.txt")

stateful_class.do_this()
stateful_class.do_that()
stateful_class.handle_results()
stateful_class.write_results_to_file()
stateful_class.handle_error()

# Good: The solution is to define inputs and outputs for functions and methods if applicable
less_stateful_class = LessStatefulClass()

output_a = less_stateful_class.do_this(input_a)
output_b = less_stateful_class.do_that(input_b)
formatted_results = less_stateful_class.handle_results(results=[output_a, output_b])
error = less_stateful_class.write_results_to_file(filepath="results.txt", results=formatted_results)
less_stateful_class.handle_error(error)

Package / Module Structuring

Like classes, package and module design follow a similar paradigm.

Namespace

First off for package and module naming, try to avoid redundant names.

Example

/can
- /__init__.py
- /can_pcan.py
- /can_value_can.py
- /can_vector.py

import can

# Bad: Redundant names in namespace result in repetitive reference to "can"
pcan = can.can_pcan.CAN_PCAN()

In the above example, the package name can already makes it explicitly clear that modules in the package are related to CAN, so the module name does not need to be prefixed with that name; the same applies to identifiers within the modules.

import can

# Good: Package namespace already associates module "pcan" to the context of CAN
pcan = can.pcan.PCAN()

Public / private imports

Packages and modules should also follow the same public and private conventions as classes. Users should certainly be discouraged from importing private modules and identifiers. Like classes, the same leading single underscore (_) can be used to denote private.

Example

# Contents of module.py

PUBLIC_CONSTANT = 0
ANOTHER_PUBLIC_CONSTANT = 0

_PRIVATE_CONSTANT = -1


class PublicClass:
    raise NotImplementedError


class _PrivateClass:
    raise NotImplementedError

To control the public / private scope of packages, use the __init__.py file that resides in the root of the package directory. When a user imports a package, the __init__.py file actually gets executed on import; developers can take advantage of this behavior by placing import statements in the __init__.py file.

Example

# Contents of package/__init__.py
# This file gets executed when a user imports this package
from .module import PublicClass, PUBLIC_CONSTANT, ANOTHER_PUBLIC_CONSTANT
from package import PublicClass, PUBLIC_CONSTANT

public_class = PublicClass()

OOP Abstract Base Class (ABC) Coding Standards

One of the challenges of creating large program is that developers will likely encounter the scenario where many variations of something needs to be supported. For example, CAN nodes fundamentally works the same. Functionally, a CAN node can initialize, uninitialize, send message, and receive message. However, the implementation of CAN varies widely across different hardware. A PCAN USB-CAN device works differently than Vector USB-CAN device despite having the same exact behavior from a high level perspective.

To support both devices in code, you certainly want to avoid creating multiple implementations that have different APIs each with slight deviations.

Example

"""
Bad: Both PCAN and VectorCAN have similar functional behavior, and the APIs are similar,
but the API variation will force users to write duplicate application code each with slight variations
"""

from typing import *

from pcan_message import PCANMessage


class PCAN:
    def __init__(self, kbaud: int):
        raise NotImplementedError

    def send_message_to_channel(self, channel: int, pcan_message: PCANMessage):
        raise NotImplementedError

    def receive_message_to_channel(self, channel:int) -> PCANMessage:
        raise NotImplementedError


class VectorCAN:
    def __init__(self, port: str, baud: int):
        raise NotImplementedError

    def send(self, bus: int, id: int, data: bytearray):
        raise NotImplementedError

    def receive(self, bus: int) -> Tuple[int, bytearray]:
        raise NotImplementedError
from usbcan import PCAN, VectorCAN


def send_can_message(device, bus: int, id: int, data: bytearray):
    """
    Bad: User has to write duplicate code with slight variations that practically do the same thing
    User will also have to write more "elif" conditions to support more device types.
    """
    if isinstance(device, PCAN):
        pcan_message = PCANMessage(id=id, data=data)
        device.send_message_to_channel(channel=bus, pcan_message)
    elif isinstance(device, VectorCAN):
        device.send(bus=bus, id=id, data=data)
    else:
        raise TypeError(device)

Abstract Base Classes (ABC) are designed to solve the problem of variations. ABC allows developers to define a common interface which acts as a contract that classes must conform to. The USB standard is a perfect example. Think about how there are many USB devices out there: keyboard, mouse, flash drive, and may other types. There isn't a specific connector for keyboard vs. flash drive; you can just plug any USB connector into a USB port and expect it to work. That seamless abstraction is all thanks to the USB standard that all USB devices must conform to. For instance, All USB devices shall support 5 V input voltage, shall communicate using the USB communication protocol, and shall identify itself to the host on connection.

ABC allows developers to define a standard interface by defining common method signatures.

Example

# Contents of can_interface_abc.py

from abc import ABC, abstractmethod
from typing import *

from errors import CANInterfaceError
from can_message import CANMessage


class CANInterfaceABC(ABC):
    #
    # Public methods
    #

    @abstractmethod
    def initialize(self, kbaud: float) -> bool:
        """
        This method shall initialize a hardware with the given kbaud
        Return true if successful
        """
        raise NotImplementedError

    @abstractmethod
    def uninitialize(self):
        """
        This method shall uninitialize a hardware; do nothing if never initialized
        """
        raise NotImplementedError

    @abstractmethod
    def send(self, can_message: CANMessage) -> bool:
        """
        This method shall send a given CAN message and return true if successful
        :raise: CANInterfaceError
        """
        raise NotImplementedError

    def send_multiple(self, can_messages: List[CANMessage]) -> bool:
        """
        Abstract base classes can also have some implementation too that depend on other abstract methods
        """
        ret = False
        for can_message in can_messages:
            ret = self.send(can_message)
            if not ret:
                break

    @abstractmethod
    def receive(self, timeout: float) -> Union[CANMessage, None]:
        """
        This method shall receive a CAN message from a receive buffer (if any)
        This method shall also block for up to the given timeout in seconds
        Return a CAN message object if a received message exists else return None
        :raise: CANInterfaceError
        """
        raise NotImplementedError

    #
    # Private method
    #

    @abstractmethod
    def _block_timeout(self, timeout: float):
        """
        Private methods can be abstract methods too
        """
        raise NotImplementedError

In the example above, developers should suffix the class name with ABC to make it explicitly clear that the class is an abstract base class. For each method, developers should clearly document the expected behavior (in a docstring) and type hint the parameters and return value.

The ABC class should document exactly how implementation classes should behave. In other words, other developers who are responsible for writing implementation classes should not have to find the original developer to seek information about the parameters, expected behavior, and the return value of all abstract methods.

# Contents of pcan.py

import time
from typing import *

from can_interface_abc import CANInterfaceABC


class PCAN(CANInterfaceABC):
    #
    # Public methods
    #

    def initialize(self, kbaud: float) -> bool:
        # Implementation here
        return True

    def uninitialize(self):
        # Implementation here
        pass

    def send(self, can_message: CANMessage) -> bool:
        # Implementation here
        return True

    def receive(self, timeout: float) -> Union[CANMessage, None]:
        # Implementation here
        return None

    #
    # Private methods
    #

    def _block_timeout(self, timeout: float):
        time.sleep(timeout)
"""
Assume PCAN and VectorCAN both conform to CANInterfaceABC
"""

from can import CANInterfaceABC, CANMessage, PCAN, VectorCAN

can_device = get_can_device()  # Assume this function exists and returns a CANInterface object
assert isinstance(can_device, CANInterfaceABC)  # This statement is just an example; returns True

# Good: Application code can be reused across different CAN device types
can_device.initialize(kbaud=500)
can_message = CANMessage()
can_device.send(can_message)
can_message = can_device.receive(timeout=1.0)
can_device.uninitialize()

Conclusion

Python is an excellent programming language that is both easy to learn and flexible for a wide range of applications. And with it's object oriented nature, it allows developers to write code ranging from very basic scripts to large, complex yet elegant programs. Sibros's coding standards were derived from PEP 8 and extended based on past experiences. Programming is more than just putting instructions together in sequential order, but is rather an art that should focus on the readers. More time is spent reading code than writing, so it is very important to ensure that readers can easily understand the intentions of the writers.

Python Coding Standards

Introduction

Why Python?

General Philosophy

Readability vs. performance

Nested indentations

Modularity

Unit testing

PEP 8 Highlights

Import style

Naming convention

Variables

Class

Functions / methods

Packages / modules

General Coding Standards

Variable naming and duck typing pitfalls

Filesystem

Fundamental concepts

Current Directory

Changing the Current Directory

Naming convention

Other

File layout

Entry point

Command line interface

Module layout

Class definition and constants

Functions and constants

Object-Oriented Programming (OOP) Coding Standards

Data model fundamentals

Class responsibilities

Public / private / accessors / mutators design

Data flow best practices

Package / Module Structuring

Namespace

Public / private imports

OOP Abstract Base Class (ABC) Coding Standards

Conclusion

Related content