Space between item listing/tables #228

Open
botzill opened this issue Feb 7, 2017 · 4 comments

@botzill
Contributor

botzill commented Feb 7, 2017

Currently, if we have a list like this:

[image: screen shot 2017-02-07 at 9 45 51 pm]

it is exported as:

[image: screen shot 2017-02-07 at 9 45 58 pm]

So there is no space between the items in the output.

The same applies to tables.

Input:

[image: screen shot 2017-02-07 at 10 07 40 pm]

Output:

[image: screen shot 2017-02-07 at 10 07 48 pm]

@botzill botzill changed the title Space between item listing Space between item listing/tables Feb 7, 2017
@botzill
Contributor Author

botzill commented Feb 10, 2017

Looking at the code, I see that this was done deliberately:

def export_paragraph(self, paragraph):
    results = super(PyDocXHTMLExporter, self).export_paragraph(paragraph)

    results = is_not_empty_and_not_only_whitespace(results)
    if results is None:
        return

Is there a reason we do that?

I think we need to detect empty paragraphs and convert them into <br/> to get proper output.
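A minimal sketch of that proposal, independent of the PyDocX internals. The function name `render_paragraph` and its string-in, string-out interface are hypothetical and are not part of the library:

```python
def render_paragraph(inner_html):
    """Wrap paragraph content in <p>, or emit <br/> for an empty paragraph.

    Hypothetical helper for illustration only; not a PyDocX API.
    """
    # Treat missing content and whitespace-only content as empty.
    if inner_html is None or not inner_html.strip():
        return '<br/>'
    return '<p>{0}</p>'.format(inner_html)


# Three paragraphs, the middle one empty.
items = ['First', '', 'Second']
html = ''.join(render_paragraph(item) for item in items)
```

Here the empty middle paragraph survives as a `<br/>` instead of being dropped, which is the behaviour the issue asks for.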

@kylegibson
Contributor

If I recall correctly, it's because Word documents can contain these blank p's that don't actually render to anything in the document. Empty p's in OOXML do not necessarily translate to a line break in HTML. If in doubt: 1) check the spec to see how it says empty p's should be handled; 2) construct a Word document with some empty p's, open it in Word, and see what happens.

@botzill
Contributor Author

botzill commented Feb 10, 2017

Yes, I ran some tests: an empty <w:p/> is rendered as a new line. There may of course be different scenarios depending on where the <w:p/> is located. To be honest, I could not find proper documentation about empty paragraphs; I only ran tests with documents.
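For reference, a rough sketch of how empty paragraphs could be spotted in the raw WordprocessingML using only the standard library. The helper name `is_empty_paragraph` and the sample XML are made up for illustration, and the check only looks at text runs, so a paragraph that holds nothing but a picture would also be reported as empty:

```python
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (the "w:" prefix in document.xml).
W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def is_empty_paragraph(p):
    """Return True if a w:p element has no non-whitespace text in its w:t descendants.

    Hypothetical helper; it ignores non-text content such as drawings.
    """
    texts = (t.text or '' for t in p.iter('{%s}t' % W_NS))
    return not ''.join(texts).strip()


# Illustrative body fragment: text, an empty <w:p/>, then text again.
SAMPLE = (
    '<w:body xmlns:w="%s">'
    '<w:p><w:r><w:t>First</w:t></w:r></w:p>'
    '<w:p/>'
    '<w:p><w:r><w:t>Second</w:t></w:r></w:p>'
    '</w:body>' % W_NS
)

body = ET.fromstring(SAMPLE)
flags = [is_empty_paragraph(p) for p in body.findall('{%s}p' % W_NS)]
```

`flags` marks only the middle paragraph as empty, which is the one that the current exporter drops.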

I did some work related to this here: botzill@34ee045.

To render <w:p/> properly, we need to reset the default margins of the HTML p tag and allow empty paragraphs to be processed. An empty paragraph is replaced with <p>&nbsp;</p> so that it also works inside lists.

With this approach we no longer need this method:

def yield_nested_with_line_breaks_between_paragraphs(self, iterable, func):

I'm not sure yet whether this covers all cases, but it has worked in the tests I've run so far.
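The `<p>&nbsp;</p>` approach described above can be sketched roughly as follows. This is not the code from the linked commit: `RESET_CSS`, `export_paragraph_html`, and `export_list_item` are hypothetical names, and the stylesheet rule is just one assumed way to reset the margins:

```python
# Assumed stylesheet fragment: zero the default <p> margins so that the only
# vertical space between items comes from explicitly empty paragraphs.
RESET_CSS = 'p { margin: 0; }'


def export_paragraph_html(inner_html):
    """Render a paragraph, keeping empty ones as a non-breaking-space placeholder."""
    if inner_html is None or not inner_html.strip():
        # An empty <p></p> collapses to zero height; &nbsp; gives it one line.
        return '<p>&nbsp;</p>'
    return '<p>{0}</p>'.format(inner_html)


def export_list_item(paragraphs):
    """Render the paragraphs of a single list item inside one <li>."""
    body = ''.join(export_paragraph_html(p) for p in paragraphs)
    return '<li>{0}</li>'.format(body)


item_html = export_list_item(['a', '', 'b'])
```

Unlike a bare `<br/>`, the placeholder paragraph is still a block element, so it can sit between the other paragraphs of a list item or table cell without special handling.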

@botzill
Contributor Author

botzill commented Feb 14, 2017

Here is the information I found about p: https://msdn.microsoft.com/en-us/library/gg278323.aspx

The most basic unit of block-level content within a WordprocessingML document, paragraphs are stored using the <p> element. A paragraph defines a distinct division of content that begins on a new line. A paragraph can contain three pieces of information: optional paragraph properties, inline content (typically runs), and a set of optional revision IDs used to compare the content of two documents.

There is also this page: https://msdn.microsoft.com/en-us/library/documentformat.openxml.wordprocessing.paragraph.aspx. Neither has any information about empty paragraphs.
