IPEP 17: Notebook Format 4

Status	Active
Author	Min RK <[email protected]>
Created	April 29, 2013
Updated	September 27, 2013

There are a few changes we need to make to the notebook format that will not be backward compatible. We do not intend to make these changes for 1.0, because nbformat changes are quite painful. This is a catalog of the changes we intend to make the next time we rev the nbformat.

Remove multiple worksheets

The worksheets field is a list, but we have no UI to support multiple worksheets. Our design has since shifted to a heading-cell-based structure, so we no longer intend to support the multiple-worksheet model. The worksheets list, in which each worksheet holds its own list of cells, shall be replaced with a single top-level list called cells.
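The flattening described above can be sketched in a few lines. This is only an illustration on plain dicts: the function name `flatten_worksheets` is hypothetical, and the sample notebook contains only the keys needed for the example.

```python
def flatten_worksheets(nb):
    """Return a copy of a v3-style notebook dict with one top-level cells list."""
    # Copy everything except the worksheets field, which is being removed.
    flat = {key: value for key, value in nb.items() if key != "worksheets"}
    cells = []
    for worksheet in nb.get("worksheets", []):
        # Concatenate the cells of every worksheet, preserving their order.
        cells.extend(worksheet.get("cells", []))
    flat["cells"] = cells
    return flat


nb_v3 = {
    "nbformat": 3,
    "metadata": {},
    "worksheets": [
        {"cells": [{"cell_type": "code", "input": "a = 1"}]},
        {"cells": [{"cell_type": "markdown", "source": "notes"}]},
    ],
}
nb_flat = flatten_worksheets(nb_v3)
```

Concatenating in worksheet order is an assumption of this sketch; since the UI never exposed more than one worksheet, most notebooks would have a single worksheet anyway.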

Use mime-type output keys

We currently transform the mime-type keys of output data to short names, such as json or png. These should be restored to the proper mime-type values used by the message spec, such as application/json and image/png. The output should be generated by a simple passthrough of the messages, rather than a whitelist transform.
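As a rough illustration of the key change, the sketch below renames short keys back to mime-types on a single output dict. Only the two mappings named in the text are included; the helper name `restore_mime_keys` and the sample output are hypothetical.

```python
# Only the two short names mentioned in the text; a real table would be longer.
SHORT_TO_MIME = {
    "png": "image/png",
    "json": "application/json",
}


def restore_mime_keys(output, mapping=SHORT_TO_MIME):
    """Return a copy of an output dict with short keys replaced by mime-types."""
    restored = {}
    for key, value in output.items():
        # Keys that are not short names (e.g. output_type) pass through as-is.
        restored[mapping.get(key, key)] = value
    return restored


old_output = {"output_type": "display_data", "png": "<base64 data>"}
new_output = restore_mime_keys(old_output)
```

Once outputs are stored by simple passthrough, a mapping like this would only be needed when converting existing notebooks, not when writing new ones.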

Remove Python-centric names

Following IPEP 13, Python-specific keys in the message spec and notebook will be removed. The changes affecting the notebook format are:

pyout will become execute_output
pyerr will become error
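A minimal sketch of the rename on a single output dict, using the names exactly as listed above (they reflect this proposal's text, not necessarily any released format). The helper name `rename_output_type` is hypothetical.

```python
# Renames taken from the list above.
OUTPUT_TYPE_RENAMES = {
    "pyout": "execute_output",
    "pyerr": "error",
}


def rename_output_type(output):
    """Return a copy of an output dict with a language-neutral output_type."""
    renamed = dict(output)
    old_type = renamed.get("output_type")
    if old_type in OUTPUT_TYPE_RENAMES:
        renamed["output_type"] = OUTPUT_TYPE_RENAMES[old_type]
    return renamed


result_output = rename_output_type({"output_type": "pyout", "prompt_number": 1})
error_output = rename_output_type({"output_type": "pyerr", "ename": "NameError"})
stream_output = rename_output_type({"output_type": "stream", "text": "hi"})
```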

Make cell content key uniform

Currently, text cells have a source key, which contains the text, and code cells have an input key. There is no reason for the two cell types to use different names for their content:

CodeCell.input will become CodeCell.source, matching TextCell.source.
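On the JSON representation, this amounts to renaming one key on code cells. The following is a sketch on plain dicts; the helper name `unify_source_key` is hypothetical.

```python
def unify_source_key(cell):
    """Return a copy of a cell dict where code cells use source, not input."""
    unified = dict(cell)
    if unified.get("cell_type") == "code" and "input" in unified:
        # Move the content to the key that text cells already use.
        unified["source"] = unified.pop("input")
    return unified


code_cell = unify_source_key({"cell_type": "code", "input": "print(1)"})
text_cell = unify_source_key({"cell_type": "markdown", "source": "# Title"})
```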

Metadata changes

remove notebook name from metadata
move language key from code cells to top-level notebook metadata
add kernel info to top-level notebook metadata in some form
add format key to raw_cell metadata
add state for show/hide (which we already have) and auto-scroll
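The first two items can be sketched on plain dicts as below. Several details are assumptions not settled by the text: the top-level key is assumed to be called `language`, the first code cell's language is assumed to win, and the notebook is assumed to already have a flat `cells` list. The kernel info, raw-cell format, and scroll-state items are left out because their form is not specified here.

```python
def update_metadata(nb):
    """Return a copy of nb with name dropped and language hoisted to metadata."""
    metadata = dict(nb.get("metadata", {}))
    metadata.pop("name", None)  # the notebook name no longer lives in metadata

    cells = []
    for cell in nb.get("cells", []):
        cell = dict(cell)
        if cell.get("cell_type") == "code" and "language" in cell:
            # Assumption: the first language seen becomes the notebook language.
            metadata.setdefault("language", cell.pop("language"))
        cells.append(cell)

    updated = dict(nb)
    updated["metadata"] = metadata
    updated["cells"] = cells
    return updated


nb_before = {
    "metadata": {"name": "demo"},
    "cells": [
        {"cell_type": "code", "language": "python", "source": "a = 1"},
        {"cell_type": "markdown", "source": "notes"},
    ],
}
nb_after = update_metadata(nb_before)
```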

Implementation and Coordination

Tasks involved in creating nbformat v4:

thoroughly define the v4 spec
update message spec keys (pyout, pyerr, etc.)
mime-type keys for output (affects nbconvert, nbformat, javascript)
remove worksheets, move cells to top-level list
add conversions to nbformat: v3->v2, v4->v3, v3->v4
metadata changes
widget-related changes (TBD)
we may need a v4->v4 conversion to track changes to v4 during development. If so, this should probably not be included in the release, right?
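One way to think about the conversions task is as a table of single-step converters that get chained to cross more than one version. The sketch below is only an illustration of that idea: `convert`, `CONVERTERS`, and `_set_version` are hypothetical names, and the placeholder steps change nothing except the version number.

```python
def _set_version(version):
    """Build a placeholder converter that only rewrites the nbformat number."""
    def step(nb):
        converted = dict(nb)
        converted["nbformat"] = version
        return converted
    return step


# The three conversions named in the task list, keyed by (from, to).
CONVERTERS = {
    (3, 2): _set_version(2),
    (4, 3): _set_version(3),
    (3, 4): _set_version(4),
}


def convert(nb, to_version, converters=CONVERTERS):
    """Chain single-step converters until nb reaches to_version."""
    current = nb["nbformat"]
    while current != to_version:
        target = current + 1 if to_version > current else current - 1
        if (current, target) not in converters:
            raise ValueError("no converter from v%d to v%d" % (current, target))
        nb = converters[(current, target)](nb)
        current = target
    return nb


downgraded = convert({"nbformat": 4, "cells": []}, 2)
```

With this layout, a v4 notebook reaches v2 by passing through v3, so only adjacent-version converters need to be written and tested.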

I think this is the logical order of these tasks:

1. Define v4 in a doc (not just changes, full spec - v3 was never fully defined)
2. add downgrade API to nbformat (or nbconvert, unclear which), and implement v3->v2
3. copy v3 to v4, adding empty v4->v3 and v3->v4, removing the py/json distinction (nbconvert is responsible for .py now)
4. remove worksheet in v4
5. update msg spec keys that are reflected in notebook
6. use mime-type output keys
7. update various metadata keys (this mainly affects javascript code)

The v2<->v3 conversion APIs can be written while v4 is being defined, but no part of v4 should be implemented until the spec is documented. Incremental implementations of v4 features, starting with step 4 of the list above (removing worksheets), can land in discrete PRs, probably on a v4 feature branch. Their order relative to each other isn't critically important.

Each time a change is made to the in-development v4 spec:

update spec doc
update nbformat.v4
update v4->v3 and v3->v4
update v4->v4?
update javascript, if affected
update nbconvert, if affected
TEST EVERY NEW CHANGE
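One kind of test that fits this checklist is a round-trip check: upgrading a v3 structure and then downgrading it should give back the original. The sketch below covers only the input/source rename on single cells; all function names are hypothetical and the cells are bare dicts, not full notebook cells.

```python
def upgrade_cell(cell):
    """v3 -> v4 for one cell (only the input -> source rename is sketched)."""
    cell = dict(cell)
    if cell.get("cell_type") == "code" and "input" in cell:
        cell["source"] = cell.pop("input")
    return cell


def downgrade_cell(cell):
    """v4 -> v3 for one cell (the inverse of upgrade_cell)."""
    cell = dict(cell)
    if cell.get("cell_type") == "code" and "source" in cell:
        cell["input"] = cell.pop("source")
    return cell


def check_round_trip(cells):
    """Raise AssertionError if upgrading then downgrading changes any cell."""
    for cell in cells:
        assert downgrade_cell(upgrade_cell(cell)) == cell, cell


sample_v3_cells = [
    {"cell_type": "code", "input": "a = 1"},
    {"cell_type": "markdown", "source": "notes"},
]
check_round_trip(sample_v3_cells)
```

A check like this would need to grow with every change to the in-development spec, which is the point of updating both conversion directions each time.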
