Guidelines and Best Practices for Writing Library Code

The notes in this section are about writing readable, maintainable Python code that your future self and other people will be able to use, maintain, and improve. While much of what follows is applicable to any Python code that you may write, some of the points are particularly relevant to code going into modules in the tools; e.g. the SalishSeaTools Package and the SalishSeaNowcast Package.

A primary guide for writing Python code is PEP 8 – Style Guide for Python Code.

Installing the flake8 static analysis tool and enabling your editor to use it to highlight problem code will help you to write well-styled code. See Python Source Code Checking via Flake8 for details of how to set that up for emacs.

If you are looking for examples of the coding style preferred in Salish Sea project modules, check out the code in these packages:

Python 3

The Salish Sea project uses Python 3. You should write and test your code using Python 3. See Anaconda Python Distribution for instructions on how to install a Python 3 working environment, or Building a Python 3 Conda Environment if you want to set up a Python 3 environment within your Anaconda Python 2 installation.

Because of the way that the module systems on jasper and orcinus work, the SalishSeaTools Package and Salish Sea NEMO Command Processor (SalishSeaCmd package) must retain backward compatibility with Python 2.7. The primary implication is that modules that use the division operation should have:

from __future__ import division

as their first import so that the / operator performs floating point (true) division under Python 2.7, as it does in Python 3.
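
As a minimal sketch of the effect (the mean_depth() function is a hypothetical example, not part of any Salish Sea package), the code below gives the same results under Python 2.7 and Python 3:

```python
from __future__ import division  # must precede every other import


def mean_depth(depths):
    """Return the mean of a sequence of depths (hypothetical example)."""
    # Because of the __future__ import, / is true division under
    # Python 2.7 as well, so integer inputs are not truncated.
    return sum(depths) / len(depths)


print(mean_depth([1, 2]))  # 1.5
print(7 // 2)              # 3; use // when floor division is intended
```

Without the import, Python 2.7 would evaluate sum([1, 2]) / 2 as integer division and return 1.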

Imports

  • Only import things that you are actually using in your module. flake8 will identify unused imports for you.

  • Never use:

    from something import *
    
  • When you are importing several things from the same place do it like this:

    from salishsea_tools import (
        nc_tools,
        viz_tools,
        stormtools,
        tidetools,
    )
    
  • Imports should be grouped:

    • Python standard library
    • Other installed libraries
    • Other Salish Sea project libraries
    • The library that the module is part of

    The groups should be separated by an empty line, and the imports should be sorted alphabetically within the groups.

    An example from the SalishSeaNowcast.nowcast.workers.get_NeahBay_ssh nowcast system worker module:

    import datetime
    import logging
    import os
    import shutil
    
    from bs4 import BeautifulSoup
    import matplotlib
    import netCDF4 as nc
    import numpy as np
    import pandas as pd
    import pytz
    
    from salishsea_tools import nc_tools
    
    from nowcast import (
        figures,
        lib,
    )
    from nowcast.nowcast_worker import NowcastWorker
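
One way to see why star imports are discouraged is the sketch below. It uses only the standard library; the module_globals dict is a stand-in, assumed for illustration, for the namespace of a module that star-imports from two libraries:

```python
import cmath
import math

# exec() into a dict simulates the global namespace of a module that
# contains two star imports, one after the other.
module_globals = {}
exec("from math import *", module_globals)
exec("from cmath import *", module_globals)

# The second star import silently replaced math.sqrt with cmath.sqrt.
shadowed = module_globals["sqrt"] is cmath.sqrt
print(shadowed)  # True

# Explicit imports keep the origin of every name visible where it is used.
print(math.sqrt(4.0))
print(cmath.sqrt(-4.0))
```

With explicit imports, a reader (and flake8) can tell where each name comes from, and no name is replaced by accident.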
    

Public and Private Objects

Many compiled languages like Java provide statements to mark functions, methods, etc. as private, meaning that they are inaccessible outside of their particular program scope. Dynamic languages like Python have very strong introspection capabilities that make such privacy constraints impossible. Instead, the Python community relies on the social convention that functions, methods, etc. that are spelled with leading underscore characters (_) are considered to be private.

We use that social convention to say,

“I have marked this function as private because I don’t want to guarantee that I won’t change its arguments later and I don’t want other people to rely on its definition.”

or,

“This function just exists to wrap some lines of code so that the function that calls it is more readable, or because I need to use this bit of code in several places in this module. It is not intended to be used outside of this module.”

Here’s an example of private functions from the nowcast.figures.publish.storm_surge_alerts module:

def storm_surge_alerts(
    grids_15m, weather_path, coastline, tidal_predictions,
    figsize=(18, 20),
    theme=nowcast.figures.website_theme,
):
    ...
    plot_data = _prep_plot_data(grids_15m, tidal_predictions, weather_path)
    fig, (ax_map, ax_pa_info, ax_cr_info, ax_vic_info) = _prep_fig_axes(
        figsize, theme)
    _plot_alerts_map(ax_map, coastline, plot_data, theme)
    ...

The storm_surge_alerts() function is public. It is intended to be called by the nowcast.workers.make_plots worker.

The _prep_plot_data(), _prep_fig_axes(), and _plot_alerts_map() functions that storm_surge_alerts() calls are private functions within the nowcast.figures.publish.storm_surge_alerts module. Their purpose is code encapsulation and improved readability, but they are not useful outside of the module, so they are named with a leading underscore to indicate that.

The “leading underscore means private” convention is most commonly used for functions and methods of classes, but it can be used on any Python object (variables, classes, modules, etc.); it is simply a naming convention.

The Sphinx autodoc extension that we use for Automatic Module Documentation Generation respects the leading underscore naming convention and does not generate documentation for objects that are named that way.

Automatic Module Documentation Generation

We use the Sphinx autodoc extension to produce API (Application Programming Interface) documentation like the SalishSeaTools Package API docs. The autodoc extension pulls documentation from docstrings into the documentation tree in a semi-automatic way. When commits are pushed to Bitbucket a signal is sent to readthedocs.org where the changes are pulled in, Sphinx is run to update the HTML rendered docs, and the revised version is published at http://salishsea-meopar-tools.readthedocs.org/en/latest/.

See Documentation with Sphinx for more details.

To add a new module’s docstrings to the auto-generated API docs you need to add a block of reStructuredText to the API docs file for the package in which the module resides. For example, to auto-generate docs for the salishsea_tools.data_tools module, the following block needs to be added to tools/SalishSeaTools/docs/api.rst:

.. _salishsea_tools.data_tools:

:py:mod:`data_tools` Module
===========================

.. automodule:: salishsea_tools.data_tools
    :members:

Line 1 is a cross-reference label for the module docs. It must be unique, so we use the module’s Python namespace expressed in dotted notation. Once the above block of rst has been committed and pushed to Bitbucket, it becomes possible to link to it from the docs in either the tools or docs repo using:

:ref:`salishsea_tools.data_tools`

Within the tools repo only you can also link to the module docs with:

:py:mod:`salishsea_tools.data_tools`

thanks to automatic index generation provided by the autodoc extension.

Lines 3 and 4 are the section heading for the module’s docs. We use the :py:mod: semantic markup to make the module name stand out in the rendered docs, and to provide meaning in the docs source file. The heading underline should be appropriate to the level of the section in the API docs file. In most cases that is ========, which renders as an HTML <h2> heading.

Lines 6 and 7 are the directives that tell the autodoc extension where to find the module’s code and how to process the module’s contents. The example shows the case that we most commonly use: identifying the module by its dotted notation namespace path. The :members: option on line 7 tells autodoc to generate docs for all of the public elements (classes, functions, module-level data structures, etc.) it finds in the module.

See the Sphinx autodoc extension docs for more details.
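
For illustration, here is a sketch of the kind of docstring that autodoc can pull into the API docs. The function names are hypothetical, and the :param:, :type:, :returns:, and :rtype: fields are standard Sphinx info fields:

```python
def celsius_to_kelvin(temperature):
    """Convert a temperature from degrees Celsius to Kelvin.

    :param temperature: Temperature in degrees Celsius.
    :type temperature: float

    :returns: Temperature in Kelvin.
    :rtype: float
    """
    return _add_offset(temperature, 273.15)


def _add_offset(value, offset):
    """Private helper; :members: skips it because of the underscore."""
    return value + offset


print(celsius_to_kelvin(0.0))  # 273.15
```

With the :members: option, only celsius_to_kelvin() would appear in the generated docs; the underscore-prefixed helper is left out.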

Return SimpleNamespace from Functions

If you are writing a function that returns more than one value, consider returning the collection of values as a SimpleNamespace. If your function returns more than 3 values, definitely return them as a SimpleNamespace.

SimpleNamespace objects have fields that are accessible by attribute lookup (dotted notation). They also have a helpful string representation that lists the namespace contents in a name=value format.

>>> from types import SimpleNamespace
>>> p = SimpleNamespace(x=11, y=22)
>>> p.x + p.y               # fields accessible by attribute lookup
33
>>> p                       # readable string representation with a name=value style
namespace(x=11, y=22)

Using the salishsea_tools.data_tools.load_ADCP() function code as an example:

from types import SimpleNamespace


def load_ADCP(
        daterange, station='central',
        adcp_data_dir='/ocean/dlatorne/MEOPAR/ONC_ADCP/',
):
    """
    ...

    :returns: :py:attr:`datetime` attribute holds a :py:class:`numpy.ndarray`
              of data datetime stamps,
              :py:attr:`depth` holds the depth at which the ADCP sensor is
              deployed,
              :py:attr:`u` and :py:attr:`v` hold :py:class:`numpy.ndarray`
              of the zonal and meridional velocity profiles at each datetime.
    :rtype: 4 element :py:class:`types.SimpleNamespace`
    """
    ...
    return SimpleNamespace(datetime=datetime, depth=depth, u=u, v=v)

Returning a SimpleNamespace lets us call load_ADCP() like:

adcp_data = load_ADCP(('2016 05 01', '2016 05 31'))

and we can access the depth that the sensor is located at as:

adcp_data.depth

This is compact and easy to understand when our future selves read our code.
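
The same pattern can be tried out with a small runnable sketch. The tide_range() function and its field names are hypothetical; only types.SimpleNamespace comes from the standard library:

```python
from types import SimpleNamespace


def tide_range(levels):
    """Return the minimum, maximum, and span of a sequence of water levels.

    :returns: :py:attr:`minimum`, :py:attr:`maximum`, and :py:attr:`span`
              attributes hold the lowest level, the highest level,
              and their difference.
    :rtype: 3 element :py:class:`types.SimpleNamespace`
    """
    lowest, highest = min(levels), max(levels)
    return SimpleNamespace(
        minimum=lowest, maximum=highest, span=highest - lowest)


stats = tide_range([0.5, 1.5, 3.0])
print(stats.span)   # 2.5
print(vars(stats))  # the fields as a plain dict, if one is needed
```

Callers refer to results by name rather than by position, so adding another field later does not break existing code.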

Module-Specific Best Practices

salishsea_tools.places

The SalishSeaTools.salishsea_tools.places.PLACES data structure is intended to be the single source of truth for information about geographic places that are used in analysis and presentation of Salish Sea NEMO model results.

It is intended to replace data structures like SalishSeaNowcast.nowcast.figures.SITES, SalishSeaNowcast.nowcast.research_ferries.ferry_stations, etc.

Library code that uses the PLACES data structure should use try...except to catch KeyError exceptions and produce an error message that is more informative than the default, for example:

try:
    max_tide_ssh = max(ttide.pred_all) + PLACES[site_name]['mean sea lvl']
    max_historic_ssh = PLACES[site_name]['hist max sea lvl']
except KeyError as e:
    raise KeyError(
        'place name or info key not found in '
        'salishsea_tools.places.PLACES: {}'.format(e))
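
To see what the more informative message looks like, here is a self-contained sketch. The PLACES dict below is a stand-in with placeholder values, not the real salishsea_tools.places.PLACES contents, and historic_max_ssh() is a hypothetical function:

```python
# Stand-in for salishsea_tools.places.PLACES; the site name and the
# numbers are placeholders, not real data.
PLACES = {
    'Example Site': {'mean sea lvl': 1.0, 'hist max sea lvl': 2.0},
}


def historic_max_ssh(site_name):
    """Return the historic maximum sea level for a place (hypothetical)."""
    try:
        return PLACES[site_name]['hist max sea lvl']
    except KeyError as e:
        # Re-raise with a message that says which data structure was
        # searched, and which key was missing.
        raise KeyError(
            'place name or info key not found in '
            'salishsea_tools.places.PLACES: {}'.format(e))


try:
    historic_max_ssh('Nowhere')
except KeyError as e:
    message = e.args[0]
    print(message)
```

The message includes the missing key ('Nowhere' here), which is usually enough to spot a misspelled place name.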