
Developer Guidelines

jonathanrocher edited this page Feb 20, 2013 · 2 revisions

Guidelines for Developers

Coding style and practices:

Community members are welcome and encouraged to contribute code to this project through pull requests. For code to be accepted and merged, it needs to follow these practices:

  • Module size should be kept below 500 lines if at all possible.
  • Each module should come with a set of unit tests covering the extent of the module (see testing guidelines).
  • Every module, class, and function should have a docstring (see documentation guidelines).
  • PEP 8 coding style is recommended.

Provenance:

All wrapped functions should have an indication of where they came from, perhaps in the doc string, or perhaps an attribute should be created for this purpose. A version number and/or date of the creation or revision of the function should also be included.
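Both options mentioned above (a docstring note and a dedicated attribute) could be combined in a small decorator. The following is only a sketch: the `provenance` decorator, the `example_wrapper` function, and the placeholder version and date values are hypothetical and not part of the project.

```python
def provenance(source, version, date):
    """Decorator that records where a wrapped routine came from (hypothetical helper)."""
    def decorate(func):
        info = {"source": source, "version": version, "date": date}
        # Option 1: a machine-readable attribute on the function.
        func.provenance = info
        # Option 2: a human-readable line appended to the docstring.
        note = "\n\nProvenance: {source}, version {version}, {date}.".format(**info)
        func.__doc__ = (func.__doc__ or "") + note
        return func
    return decorate


# Placeholder values for illustration only; they do not describe a real routine.
@provenance(source="AWIPS", version="x.y", date="YYYY-MM-DD")
def example_wrapper(data):
    """Return the input unchanged (stand-in for a wrapped Fortran call)."""
    return data
```

With this approach, `example_wrapper.provenance` returns the metadata dictionary, and `help(example_wrapper)` shows the same information at the end of the docstring.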

Another possibility is that functions from particular sources should be contained in sub-modules, e.g. aos.awips. However, this option would probably get in the way of grouping the codes that are related in functionality.

Handling the original source:

My preference would be to make as few modifications to the original source as possible, but maybe this is not really a requirement. It just seems safer, in that it reduces the chance of accidentally breaking the original code in some subtle way. I would rather keep all the f2py declarations in the signature files than add them as comments to the Fortran source. However, there are obviously cases where the original source needs to be changed, such as the case noted in the AWIPS code of an extension intrinsic function (JINT) that is not generally available. As long as the original unmodified code is checked in to the repository and explanatory messages accompany the required revisions, the changes should not break the trail of provenance.

Function arguments:

As Johnny originally stated, this library should not depend on any external software other than NumPy (ideally) and possibly SciPy if necessary. So all required data array arguments must accept basic NumPy arrays. A fully developed function interface should also be able to handle masked arrays (or whatever is the current NumPy implementation of support for missing values). It should also accept cdms variables and possibly other types of context-aware data objects.
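One way an interface could accept both plain and masked NumPy arrays is to normalize every input to a masked array at the boundary. This is a minimal sketch under that assumption; the helper name `as_masked` and its `missing_value` argument are hypothetical.

```python
import numpy as np


def as_masked(data, missing_value=None):
    """Return `data` as a masked array (hypothetical helper).

    Plain sequences and ndarrays are wrapped without a mask; masked arrays
    pass through. If `missing_value` is given, matching elements are masked
    in addition to any mask already present.
    """
    arr = np.ma.asanyarray(data)
    if missing_value is not None:
        # masked_equal builds on masked_where, which conserves an existing mask.
        arr = np.ma.masked_equal(arr, missing_value)
    return arr


plain = as_masked([1.0, -999.0, 3.0], missing_value=-999.0)
already_masked = as_masked(np.ma.array([1.0, 2.0], mask=[True, False]))
```

Exact equality is a reasonable test for integer or sentinel fill values; data with floating-point fill values that have been through arithmetic may be better served by `np.ma.masked_values`, which compares with a tolerance.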

A goal should be to simplify the interface as much as possible without sacrificing functionality.

f2py does a good job of 'hiding' arguments such as array sizes that are not needed in the Python/NumPy context, implementing them as optional keyword arguments. However, there may also be other arguments that are not required when the data arguments are passed in using context-aware data objects like cdms. These arguments might represent coordinate arrays or units, for instance. In that case it would be good to make these arguments optional as well. However, it would be important to document clearly that depending on the format of the required data arguments, these arguments may or may not be required.
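The optional-argument rule described above might look like the sketch below. It assumes a context-aware object exposes its latitude values through a `getLatitude()` method returning something array-like; a real cdms adapter may need a different call, and `resolve_latitude` and `FakeVariable` are hypothetical names used only for illustration.

```python
import numpy as np


def resolve_latitude(data, latitude=None):
    """Return latitude values from the argument or from `data` (hypothetical helper).

    `latitude` may be omitted only when `data` carries its own coordinates.
    """
    if latitude is not None:
        # An explicit argument always wins.
        return np.asarray(latitude)
    getter = getattr(data, "getLatitude", None)  # assumed accessor name
    if callable(getter):
        values = getter()
        if values is not None:
            return np.asarray(values)
    raise TypeError(
        "latitude is required when the data argument carries no coordinate "
        "information (for example, a plain NumPy array)"
    )


class FakeVariable:
    """Stand-in for a context-aware data object, for demonstration only."""

    def getLatitude(self):
        return [0.0, 10.0]
```

The error message states when the argument becomes mandatory, which is the same information the docstring should document.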

Interface coding:

Functions built to operate on, for example, a 2-D horizontal array should be able to take additional dimensions on the left side of the input array, that is, as leading dimensions. The interface function can loop over the additional dimensions to operate on all horizontal slices of the multi-dimensional array. This may improve performance and could also make the code easier to parallelize. It can also simplify the Python user code.
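The loop over leading dimensions can be sketched as follows. The helper name `apply_over_leading_dims` is hypothetical, and `func2d` stands in for any wrapped routine that takes a single 2-D array. The sketch handles plain arrays only and leaves out masked-array support.

```python
import numpy as np


def apply_over_leading_dims(func2d, data):
    """Apply `func2d` to every trailing 2-D slice of `data` (hypothetical helper)."""
    data = np.asarray(data)
    if data.ndim < 2:
        raise ValueError("expected at least 2 dimensions, got %d" % data.ndim)
    lead_shape = data.shape[:-2]
    # Collapse all leading dimensions into one so a single loop suffices.
    slices = data.reshape((-1,) + data.shape[-2:])
    results = [np.asarray(func2d(s)) for s in slices]
    out = np.stack(results)
    # Restore the leading dimensions in front of whatever shape func2d returned.
    return out.reshape(lead_shape + out.shape[1:])


# A (time, level, lat, lon) array reduced slice by slice to (time, level).
field = np.arange(120.0).reshape(2, 3, 4, 5)
totals = apply_over_leading_dims(lambda s: s.sum(), field)
```

Because each slice is processed independently, the list comprehension is the natural place to introduce parallelism later.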

If a function cannot handle missing values, it should raise an exception, and where possible the error message should indicate the least disruptive workaround, such as resetting the missing values to 0 or 1. If there is no good workaround, it should simply raise the exception. Ideally, given a masked array, it would check whether any values are actually masked before raising the error.
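A check of this kind might be written as below. This is a sketch only: the helper name `require_no_missing` is hypothetical, and the suggested fill value in the message is an example, since the least disruptive value depends on the routine.

```python
import numpy as np


def require_no_missing(data, name="data"):
    """Return a plain ndarray, or raise if `data` has actual masked values."""
    if np.ma.isMaskedArray(data):
        # A masked array with nothing masked is safe to pass through.
        if np.ma.is_masked(data):
            raise ValueError(
                "%s contains missing values, which this function cannot handle; "
                "consider filling them first, e.g. %s.filled(0)" % (name, name)
            )
        return np.ma.getdata(data)
    return np.asarray(data)


clean = require_no_missing(np.ma.array([1.0, 2.0], mask=[False, False]))
```

Only arrays with at least one masked element trigger the error, so callers who routinely work with masked arrays are not penalized when their data happens to be complete.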
