Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for textbox -> txbxContent #203

Merged
merged 21 commits into from
May 16, 2016
Merged

Add support for textbox -> txbxContent #203

merged 21 commits into from
May 16, 2016

Conversation

jlward
Copy link
Contributor

@jlward jlward commented May 11, 2016

      <w:r>
            <w:pict>
                <v:textbox>
                  <w:txbxContent>
                    <w:p>
                      <w:r>
                        <w:t>Foo bar baz</w:t>
                      </w:r>
                    </w:p>
                  </w:txbxContent>
                </v:textbox>
              </v:rect>
            </w:pict>
      </w:r>


class AlternateContent(XmlModel):
XML_TAG = 'AlternateContent'
children = XmlCollection(Fallback)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might make a pass and update all the XmlCollection stuff that I touched to use strings instead of classes.

)


class Textbox(XmlModel):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This guy should really live in vml instead of wordprocessing :( I'll handle this after the code review.

def types(self):
return set(self._set_types(*self._types))

def _set_types(self, *types):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to consider caching this result? Would be expensive to have to run it more than once, if stuff is needing to be imported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had considered it. Do you recall if pydocx has a built in caching tool? Or should I do it the old fashioned way?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A thought. This gets called for each instance. I might need to make types be a class method and cache it there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a class method would actually fix this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My research leads me to believe that no caching is needed (it won't actually change anything).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify what you researched, Jason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I printed the types in this method and could only get that printing the first time it was used in a test. The second test did not show any of the printing.

'''
self.assert_document_generates_html(document, expected_html)

def test_textbox_with_content_outside_of_textbox(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuation of https://github.com/CenterForOpenScience/pydocx/pull/203/files#r63035777 I have the tests written. However I don't have any idea what I need to do to break this up into multiple paragraphs. From what I can tell, the outside opening p tag has already happened. Which means I'll need to close it, do my thing, then re-open it. If this is the case, I have no idea how it can be done inline. I could probably build a post-processor, but it looks like we don't have those anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're doing something like this elsewhere in the code. I'm not certain where.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would almost rather push this live as is, and deal with it later. But that's mostly because I don't have a good solution in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ComplexFields, we had to implement a two-pass approach: https://github.com/CenterForOpenScience/pydocx/blob/master/pydocx/export/base.py#L115

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in the editor if you get nested paragraph tags like this?

Copy link
Contributor Author

@jlward jlward May 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CKEditor actually fixes it for us.

<p>AAA<p>BBB</p></p> --> <p>AAA</p><p>BBB</p>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The legacy editor actually also fixes this auto-magically.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having an editor that fixes it for us is good, but other users of pydocx might not have that luxury. Our goal is to output clean HTML, right? I'm +0 on fixing this in a separate ticket, before #204 or as part of #204, if you think that would be a better way to structure the work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a good way of fixing it. Honestly, I was going to punt on it, get #204 done, then not worry about it until someone finds a good way to fix it.


class Fallback(XmlModel):
XML_TAG = 'Fallback'
children = XmlCollection('wordprocessing.Picture')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what other child types Fallback supports?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will be dealing with that in #204 as I mentioned during stand up :)

I don't plan to push a new release until both this and 203 are done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this comment?
#TODO in #204- actually include all of the children defined by the spec

@kylegibson
Copy link
Contributor Author

Do you have any test cases for when textbox appears without the markup_compatibility? This would happen if you had a word document created in an older version of word that didn't support the markup compatibility syntax.

@jlward
Copy link
Contributor

jlward commented May 12, 2016

Do you have any test cases for when textbox appears without the markup_compatibility? This would happen if you had a word document created in an older version of word that didn't support the markup compatibility syntax.

I do not. I only have it with markup_compatibility because that was the only way I could get it to happen using libreoffice.

@kylegibson
Copy link
Contributor Author

I only have it with markup_compatibility because that was the only way I could get it to happen using libreoffice.

What version of libreoffice?

@jlward
Copy link
Contributor

jlward commented May 12, 2016

What version of libreoffice?

5.1. And I don't have access to 4.x

@kylegibson
Copy link
Contributor Author

5.1. And I don't have access to 4.x

Our vagrant boxes ship with 3.5. Although, ideally we would create the documents using an older version of Word. I wish we had access to Word VMs like we do with IE VMs.

@jlward
Copy link
Contributor

jlward commented May 12, 2016

Our vagrant boxes ship with 3.5. Although, ideally we would create the documents using an older version of Word. I wish we had access to Word VMs like we do with IE VMs.

Using 3.5 and saving as a docx actually removes the contents of the docx. So there's that.

@@ -15,7 +15,7 @@
NumberingSpan,
NumberingSpanBuilder,
)
from pydocx.openxml import wordprocessing, vml
from pydocx.openxml import wordprocessing, vml, markup_compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alphabetize?

def export_textbox_content(self, textbox_content):
return self.yield_nested(textbox_content.children, self.export_node)

# Markup Compatibility exporters
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we no longer need this comment.

@winhamwr
Copy link
Contributor

Everything looks good now except for the paragraph nesting thing. It would be good to include Kyle in that decision, since I feel like I don't have a great understanding of what it would take to implement a fix.

@winhamwr
Copy link
Contributor

I'll wait to review the description of the ticket to fix the nested paragraph before giving this one the ol' 👍

@jlward
Copy link
Contributor

jlward commented May 16, 2016

I'll wait to review the description of the ticket to fix the nested paragraph before giving this one the ol' 👍

#213 has been created to deal with nested paragraphs.

@caffodian
Copy link

@kylegibson has reviewed #213 so this is good

@jlward jlward merged commit e84d282 into master May 16, 2016
@jlward jlward deleted the issue_203 branch May 20, 2016 21:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants