Python type hints and migration to Python 3

Jerry Morrison edited this page Oct 4, 2019 · 33 revisions

Migration to Python 3

See these Google Slides for an overview of our plan to migrate to Python 3 and to add mypy type hints. The type hints help catch problems in the migration and elsewhere, especially with the change from Python 2 strings to distinct unicode and bytes types.

TODO: Specifically how to use type hints in Python 2 & 3 compatible code to catch problems with bytes vs. unicode strings?
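
One possible starting point for this TODO, offered as a sketch rather than a settled project convention: annotate text with `typing.Text` (which is `unicode` on Python 2 and `str` on Python 3) and binary data with `bytes`, and convert between them only in explicit helper functions. The helper names `to_text` and `to_bytes` below are hypothetical.

```python
from typing import Text


def to_text(data, encoding='utf-8'):
    # type: (bytes, str) -> Text
    """Decode binary data into a unicode text string."""
    return data.decode(encoding)


def to_bytes(text, encoding='utf-8'):
    # type: (Text, str) -> bytes
    """Encode a unicode text string into binary data."""
    return text.encode(encoding)


# Round trip: text -> bytes -> text.
original = u'caf\xe9'
assert to_text(to_bytes(original)) == original
```

With hints like these, a type checker can point out a call that passes bytes where text is expected, or the reverse. How strictly it does so in Python 2 mode depends on the checker and its settings.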

Main References

FYI: Additional References

Strategy

  • Adopt all the __future__ imports. Division is the challenging one.
  • Adopt Python 3 compatible libraries.
  • Convert to Python 3 compatible syntax.
  • Use a checker tool in CI to catch backsliding.
  • Use a tool like "future" to do much of the conversion, incrementally. Let everyone know as we ratchet up the Python 3 compatibility.
  • Drop support for Python 2.
  • Phase out use of the "six" compatibility library.
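
To illustrate the first step in the list above, here is a minimal sketch of a module that adopts `__future__` imports. The three imports shown are an illustrative subset, and the functions `mean` and `middle_index` are hypothetical. Division is the challenging one because `/` changes meaning for integers: each existing `/` has to be reviewed to decide whether it should become `//`.

```python
# __future__ imports must come before any other statement in the module.
from __future__ import absolute_import, division, print_function


def mean(values):
    """Return the arithmetic mean using true division."""
    return sum(values) / len(values)


def middle_index(length):
    """Return the middle index; // keeps the result an integer."""
    return length // 2


print(mean([1, 2]))     # prints 1.5 (Python 2 without the import gives 1)
print(middle_index(7))  # prints 3
```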

Type hints

Type hints look like this:

def emphasize(message):
  # type: (str) -> str
  """Construct an emphatic message."""
  return message + '!'

A few type hints, especially one per function definition, can go a long way toward catching problems and documenting types.
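
As a rough illustration of what one type comment per function can catch, here is a sketch in the same comment syntax. The function `count_words` and its call sites are hypothetical, not code from this project.

```python
from typing import Dict, List


def count_words(lines):
    # type: (List[str]) -> Dict[str, int]
    """Count how often each whitespace-separated word occurs."""
    counts = {}  # type: Dict[str, int]
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts


totals = count_words(['a b', 'b c'])

# A checker such as mypy should flag a call like the one below, because a
# single str is not a List[str]. At run time it would not raise an error;
# it would iterate over the characters and count letters instead of words.
# count_words('a b')
```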

PyCharm checks types interactively, while you edit. You don't need any other tools to check types. See Python Type Checking (Guide).

The batch programs mypy and pytype are other ways to check types, particularly in Continuous Integration builds.

Typeshed is a repository for "stub" files that associate type definitions with existing libraries. It's bundled with PyCharm, mypy, and pytype. It does not have types for Numpy.

Types for Numpy

There are experimental type stubs in the numpy repo numpy-stubs that define types for dtype and ndarray. They're not fancy, but they do catch some mistakes and they improve PyCharm autocompletion. Hopefully the numpy team will improve these stubs, but numpy is more flexible with types than the type system is likely to handle.

With this stub file, you can write type hints like np.ndarray and np.ndarray[int]. It doesn't have a way to express array shape, so that still goes into a docstring.

import numpy as np

def f(a):
    # type: (np.ndarray[float]) -> np.ndarray[int]
    return np.asarray(a, dtype=int)

The wcEcoli project includes numpy-stubs.

To install more stub files:

  1. Copy them into a stubs/ directory in the project.
  2. Mark the stubs/ directory as a source root in PyCharm by choosing Mark Directory as | Sources Root from the directory's context menu.