make conversion non-destructive to soup; improve div/article/section handling #184
This merge request does the following:

- makes `convert_soup()` non-destructive (soup left as-is)
- improves handling of `<div>`, `<article>`, and `<section>` elements

Unit tests are updated.
Regarding #107, I believe that block-element newline separation, not line continuation, is the correct behavior at `<div>`, `<article>`, and `<section>` elements. These elements are all block elements. The following HTML example shows that in both the `<p>` and `<div>` cases, the separation between "foo" and "bar" uses block-element separation behavior, not `<br />` line-continuation behavior: