From a173d0bb746b1afc6a4942a2536c9008da35b572 Mon Sep 17 00:00:00 2001
From: John MacFarlane
Date: Mon, 8 Jun 2015 12:53:31 -0700
Subject: [PATCH] Updated spec.
---
test/spec.txt | 109 ++++++++++++++++++++++++++++++++++++++------------
1 file changed, 84 insertions(+), 25 deletions(-)
diff --git a/test/spec.txt b/test/spec.txt
index ba2c5aa89..8b2c7a30d 100644
--- a/test/spec.txt
+++ b/test/spec.txt
@@ -1,8 +1,8 @@
---
title: CommonMark Spec
author: John MacFarlane
-version: 0.19
-date: 2015-04-27
+version: 0.20
+date: 2015-06-08
license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)'
...
@@ -235,7 +235,10 @@ carriage return (`U+000D`), newline (`U+000A`), or form feed
[Unicode whitespace](@unicode-whitespace) is a sequence of one
or more [unicode whitespace character]s.
-A [non-space character](@non-space-character) is anything but `U+0020`.
+A [space](@space) is `U+0020`.
+
+A [non-space character](@non-space-character) is any character
+that is not a [whitespace character].
An [ASCII punctuation character](@ascii-punctuation-character)
is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
@@ -246,9 +249,10 @@ A [punctuation character](@punctuation-character) is an [ASCII
punctuation character] or anything in
the unicode classes `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
-## Tab expansion
+## Preprocessing
-Tabs in lines are expanded to spaces, with a tab stop of 4 characters:
+Tabs in lines are immediately expanded to [spaces][space], with a tab
+stop of 4 characters:
.
→foo→baz→→bim
@@ -274,11 +278,11 @@ with the replacement character (`U+FFFD`).
# Blocks and inlines
We can think of a document as a sequence of
-[blocks](@block)---structural
-elements like paragraphs, block quotations,
-lists, headers, rules, and code blocks. Blocks can contain other
-blocks, or they can contain [inline](@inline) content:
-words, spaces, links, emphasized text, images, and inline code.
+[blocks](@block)---structural elements like paragraphs, block
+quotations, lists, headers, rules, and code blocks. Some blocks (like
+block quotes and list items) contain other blocks; others (like
+headers and paragraphs) contain [inline](@inline) content---text,
+links, emphasized text, images, code, and so on.
## Precedence
@@ -529,12 +533,12 @@ consists of a string of characters, parsed as inline content, between an
opening sequence of 1--6 unescaped `#` characters and an optional
closing sequence of any number of `#` characters. The opening sequence
of `#` characters cannot be followed directly by a
-[non-space character].
-The optional closing sequence of `#`s must be preceded by a space and may be
-followed by spaces only. The opening `#` character may be indented 0-3
-spaces. The raw contents of the header are stripped of leading and
-trailing spaces before being parsed as inline content. The header level
-is equal to the number of `#` characters in the opening sequence.
+[non-space character]. The optional closing sequence of `#`s must be
+preceded by a [space] and may be followed by spaces only. The opening
+`#` character may be indented 0-3 spaces. The raw contents of the
+header are stripped of leading and trailing spaces before being parsed
+as inline content. The header level is equal to the number of `#`
+characters in the opening sequence.
Simple headers:
@@ -562,11 +566,13 @@ More than six `#` characters is not a header:
####### foo
.
-A space is required between the `#` characters and the header's
-contents. Note that many implementations currently do not require
-the space. However, the space was required by the [original ATX
-implementation](http://www.aaronsw.com/2002/atx/atx.py), and it helps
-prevent things like the following from being parsed as headers:
+At least one space is required between the `#` characters and the
+header's contents, unless the header is empty. Note that many
+implementations currently do not require the space. However, the
+space was required by the
+[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
+and it helps prevent things like the following from being parsed as
+headers:
.
#5 bolt
@@ -1028,7 +1034,41 @@ paragraph.)
.
-The contents are literal text, and do not get parsed as Markdown:
+If there is any ambiguity between an interpretation of indentation
+as a code block and as indicating that material belongs to a [list
+item][list items], the list item interpretation takes precedence:
+
+.
+  - foo
+
+    bar
+.
+<ul>
+<li>
+<p>foo</p>
+<p>bar</p>
+</li>
+</ul>
+.
+
+.
+1.  foo
+
+    - bar
+.
+<ol>
+<li>
+<p>foo</p>
+<ul>
+<li>bar</li>
+</ul>
+</li>
+</ol>
+.
+
+
+The contents of a code block are literal text, and do not get parsed
+as Markdown:
.
@@ -2329,9 +2369,16 @@ foo
.
-Laziness only applies to lines that are continuations of
-paragraphs. Lines containing characters or indentation that indicate
-block structure cannot be lazy.
+Laziness only applies to lines that would have been continuations of
+paragraphs had they been prepended with `>`. For example, the
+`>` cannot be omitted in the second line of
+
+``` markdown
+> foo
+> ---
+```
+
+without changing the meaning:
.
> foo
@@ -2343,6 +2390,15 @@ block structure cannot be lazy.
.
+Similarly, if we omit the `>` in the second line of
+
+``` markdown
+> - foo
+> - bar
+```
+
+then the block quote ends after the first line:
+
.
> - foo
- bar
@@ -2357,6 +2413,9 @@ block structure cannot be lazy.
.
+For the same reason, we can't omit the `>` in front of
+subsequent lines of an indented or fenced code block:
+
.
> foo
bar