Python type hints and migration to Python 3

Jerry Morrison edited this page Jun 8, 2020 · 33 revisions

Migration to Python 3

See these Google Slides for an overview of our plan to migrate to Python 3 and add mypy type hints, which help catch problems in the migration and elsewhere (esp. with the change from a single string type to unicode vs. bytes).

String types

| | Python 2 | Python 2+3 | Python 3 |
|---|---|---|---|
| basestring | basestring (superclass of str, unicode) | -- | -- |
| unicode | unicode | -- | -- |
| typing.Text | unicode | typing.Text | str |
| typing.AnyStr (any type of string but not mixed) | typing.AnyStr | typing.AnyStr | typing.AnyStr |
| six.string_types (for isinstance()) | (basestring,) | six.string_types | (str,) |
| six.text_type | unicode | six.text_type | str |
| String = Union[str, Text] (type alias) | unicode or str [or bytes] | a text string | str |
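
As a minimal sketch of how the typing.AnyStr and typing.Text hints from the table might be used, here are two small functions written with comment-style hints so they stay valid on both Python 2 and 3. The function names double and to_text are hypothetical, chosen only for illustration.

```python
from typing import AnyStr, Text, Union


def double(s):
    # type: (AnyStr) -> AnyStr
    """Repeat a string. The result has the same string type as the argument."""
    return s + s


def to_text(s, encoding='utf-8'):
    # type: (Union[bytes, Text], str) -> Text
    """Return a text string, decoding the argument if it is bytes."""
    if isinstance(s, bytes):
        return s.decode(encoding)
    return s
```

AnyStr ties the argument and return types together, so a checker can tell that double(b'ab') yields bytes and double('ab') yields str, while mixing the two in one call is reported as an error.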

Dict access

| | Python 2 | Python 2+3 | Python 3 |
|---|---|---|---|
| test a key | key in d or d.has_key(key) | key in d | key in d |
| snapshot as a list | d.keys() or list(d) | list(d) | list(d.keys()) or list(d) |
| | d.values() or list(d.values()) (uses more CPU & RAM) | list(six.viewvalues(d)) or list(d.values()) (uses more CPU & RAM) | list(d.values()) |
| | d.items() or list(d.items()) (uses more CPU & RAM) | list(six.viewitems(d)) or list(d.items()) (uses more CPU & RAM) | list(d.items()) |
| iterable view | d.viewkeys() | six.viewkeys(d) | d.keys() or d (if the context will call iter() on it) |
| | d.viewvalues() | six.viewvalues(d) | d.values() |
| | d.viewitems() | six.viewitems(d) | d.items() |
| iterator | for key in d: ... or iter(d) or d.iterkeys() | for key in d: ... or iter(d) or six.iterkeys(d) | for key in d: ... or iter(d) or iter(d.keys()) |
| | d.itervalues() | six.itervalues(d) | iter(d.values()) |
| | d.iteritems() | six.iteritems(d) | iter(d.items()) |
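
The following sketch exercises a few idioms from the table that behave the same on Python 2 and 3 without importing six. The dict contents and variable names are made up for illustration.

```python
d = {'a': 1, 'b': 2}

has_a = 'a' in d             # test a key; replaces d.has_key('a')
keys = list(d)               # snapshot of the keys as a list
values = list(d.values())    # snapshot; on Python 2 this copies an extra list
pairs = sorted(d.items())    # sorted() accepts either a list or a view

# Iterating over a snapshot makes it safe to mutate the dict in the loop.
for key in list(d):
    if d[key] > 1:
        del d[key]
```

Deleting entries while iterating over d directly would raise a RuntimeError, which is one reason the "snapshot as a list" rows matter when converting d.keys() call sites.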

Strategy

  • Adopt all the __future__ imports.
    • Division is the challenging one. It's mostly in use already, with the big exception that wholecell/utils/units.py has truediv turned off for its callers due to Issue #433.
  • Adopt Python 3 compatible libraries.
    • The pip packages we depend on should now be Python 3 compatible, but they aren't all clearly marked that way.
    • Finish adopting subprocess32 in place of subprocess. It's a back-port of the Python 3 subprocess with improvements and bug fixes in process launching.
  • Incrementally convert to Python 3 compatible syntax and semantics. Use a tool like "future" to do much of the conversion. As we ratchet up the Python 3 compatibility, let everyone know and update the checker tool configuration.
    • Use a checker tool in CI to catch backsliding on Python 3 compatibility changes.
  • Add type hints, esp. for the str, bytes, unicode, and basestring types and the AnyStr type hint.
    • Add a type checker in CI, most likely pytype (see below).
  • Drop support for Python 2.
  • Phase out use of the "six" compatibility library.
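
To illustrate the first step of the strategy, here is a minimal sketch of a module header that adopts the __future__ imports, followed by the division behavior they select. Which imports a given file needs is an assumption here; the variable names are illustrative only.

```python
from __future__ import absolute_import, division, print_function

# With "division" in effect, / is true division on Python 2 as well as 3.
true_quotient = 7 / 2

# Use // wherever the old integer (floor) division was intended.
floor_quotient = 7 // 2

print(true_quotient, floor_quotient)
```

Code that relied on Python 2's integer division needs an explicit // once this import is added, which is why division is the challenging one to adopt.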

Type hints

Type hints look like this:

def emphasize(message):
  # type: (str) -> str
  """Construct an emphatic message."""
  return message + '!'

A few type hints -- esp. one per function definition -- can go a long way to catching problems and documenting types.
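
As a sketch of the kind of problem one hint per function can catch, the example below adds a second, hypothetical function (count_words) and shows a call that a type checker would be expected to reject. The bad call is left in a comment so the code still runs.

```python
def emphasize(message):
    # type: (str) -> str
    """Construct an emphatic message."""
    return message + '!'


def count_words(text):
    # type: (str) -> int
    """Count the whitespace-separated words in the text."""
    return len(text.split())


shout = emphasize('hello')   # OK: str in, str out
n = count_words(shout)       # OK: str in, int out

# A type checker should flag the next line without running the code,
# because count_words() returns an int where emphasize() expects a str:
#     emphasize(count_words('a b c'))
```

Without the hints, the mistake would only surface at run time as a TypeError when the int is added to '!'.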

PyCharm checks types interactively, while you edit. You don't need any other tools to check types. See Python Type Checking (Guide).

The batch programs mypy and pytype are other ways to check types, particularly in Continuous Integration (CI) builds.

Typeshed is a repository for "stub" files that associate type definitions with existing libraries. It's bundled with PyCharm, mypy, and pytype. It does not have types for Numpy.

Types for Numpy

There are experimental type stubs in the numpy repo numpy-stubs that define types for dtype and ndarray. It's not fancy but it does catch some mistakes and it improves PyCharm autocompletion. The numpy team might improve these stubs but numpy, scipy, and matplotlib use types more flexibly than type checker tools can handle.

With this stub file, you can write type hints like np.ndarray and np.ndarray[int]. It has no way to express the element type or array shape so use docstrings for that.

import numpy as np

def f(a):
    # type: (np.ndarray) -> np.ndarray
    return np.asarray(a, dtype=int)

The wcEcoli project includes numpy-stubs.

To install more stub files:

  1. Copy them into the stubs/ directory in the project.
  2. Mark the stubs/ directory as a source root in PyCharm.
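
For orientation, a stub file might look like the sketch below. The module name fastmath, the path stubs/fastmath.pyi, and everything declared in it are hypothetical. Stub files use Python 3 annotation syntax and replace every body with "...", even when they describe code that runs on Python 2.

```python
# stubs/fastmath.pyi (hypothetical stub for a hypothetical untyped module)
from typing import List, Sequence


def scale(values: Sequence[float], factor: float) -> List[float]: ...


class Accumulator:
    total: float

    def add(self, value: float) -> None: ...
```

The stub holds only signatures. The real implementation stays in the library, and the checker reads the stub in its place.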