enable equations #49

Open
wants to merge 513 commits into
base: master
ef39906
fixed test
SamPortnow May 21, 2013
9ca241d
updated xml files
SamPortnow May 21, 2013
3a1f9f8
Merge branch 'master' into issue_24
May 21, 2013
aacf1ae
refs #24: added a comment
May 21, 2013
cdd0add
Merge pull request #24 from OpenScienceFramework/issue_24
jlward May 21, 2013
a82dd25
Merge branch 'master' into issue_25
May 21, 2013
0e96655
refs #25: added a test showing that inserted text in lists is still b…
May 21, 2013
2f0ed87
refs #25: refactor and fixed inserted text in lists
May 21, 2013
bc4196b
refs #25: added a test showing that smart tags in a list work fine.
May 21, 2013
3701654
Merge branch 'issue_25' into issue_27
May 21, 2013
e3d7a19
refs #28: update all the tests
May 21, 2013
e4fafa5
refs #28: update the parser
May 21, 2013
404b395
refs #28: removed pre-processing, maybe travis will run this time?
May 21, 2013
855ce76
refs #29: updated the tests for expected output
May 21, 2013
bf2705b
refs #29: updated the xml based tests for the new expected html
May 21, 2013
061d8d1
refs #29: updated white space
May 21, 2013
2aa5922
refs #29: updated the parser for valid values
May 21, 2013
7f78ad4
Merge pull request #25 from OpenScienceFramework/issue_25
jlward May 21, 2013
13a0524
Merge branch 'master' into issue_27
May 21, 2013
6b3cdd0
Merge branch 'master' into issue_28
May 21, 2013
8f387f3
refs #28: updated test based on merged master
May 21, 2013
f44ca81
Merge branch 'master' into issue_29
May 21, 2013
cbea7a9
refs #29: updated tests based on merged master
May 21, 2013
262cdd1
Merge pull request #27 from OpenScienceFramework/issue_27
jlward May 21, 2013
cf15525
Merge branch 'master' into issue_28
May 21, 2013
1c72947
refs #28: updated tests based on merged master
May 21, 2013
eb444f9
Merge branch 'master' into issue_29
May 21, 2013
e605958
refs #29: updated tests based on merged master
May 21, 2013
e254e81
refs #28: split up a line into multiple lines
May 21, 2013
04c407d
refs #28: updated how we are doing underline
May 21, 2013
8c5b39c
refs #28: Added css stuff to the README
May 21, 2013
cc1dd25
Merge branch 'issue_28' into issue_29
May 21, 2013
a0de8a9
refs #29: namespaced all the css classes
May 21, 2013
9142a9f
Merge pull request #28 from OpenScienceFramework/issue_28
jlward May 21, 2013
29a6893
Merge branch 'master' into issue_29
May 21, 2013
0abe0a3
updated main
SamPortnow May 21, 2013
9bfe357
merged with master
SamPortnow May 21, 2013
05f99e9
merged
SamPortnow May 21, 2013
058a0f5
added latex parser
SamPortnow May 21, 2013
3f073e4
flake8 compliant
SamPortnow May 21, 2013
a1d8080
updating
SamPortnow May 21, 2013
faa2d0b
fixed break tag
SamPortnow May 21, 2013
6e8bf4f
updating
SamPortnow May 21, 2013
d584137
refs #30: Updated the tests for expected behaviour
May 21, 2013
c9c76fb
refs #30: no longer adding invalid attributes to insert and delete tags
May 21, 2013
70fc06f
refs #30: stopped break separating tags that were inline like (insert…
May 21, 2013
6dde935
added test; made size search cleaner
SamPortnow May 21, 2013
4095602
updating tests
SamPortnow May 21, 2013
319b8b1
fixed formatting
SamPortnow May 21, 2013
1699da1
updated test
SamPortnow May 21, 2013
988b767
fixed tests
SamPortnow May 22, 2013
315e844
removed not needed files for this branch
SamPortnow May 22, 2013
615bad9
removed unecessary imports
SamPortnow May 22, 2013
07e145b
updating
SamPortnow May 22, 2013
5984952
Merge pull request #29 from OpenScienceFramework/issue_29
jlward May 22, 2013
0bd3e36
Merge branch 'master' into issue_30
May 22, 2013
3b1e083
merged
SamPortnow May 22, 2013
4937a55
updating
SamPortnow May 22, 2013
72bf87a
comment fixes
SamPortnow May 22, 2013
d556e96
added import
SamPortnow May 22, 2013
92e7b7d
fixed error
SamPortnow May 22, 2013
a43be1a
fixed some errors
SamPortnow May 22, 2013
2c91c7e
Merge branch 'master' of https://github.com/OpenScienceFramework/pydo…
SamPortnow May 22, 2013
ecda5d2
fixed test
SamPortnow May 23, 2013
2b896df
refs #32: small refactor
May 23, 2013
31fe2a1
refs #32: code updated based on deprications
May 23, 2013
d0cf19b
refs #32: can now do underline/italics
May 23, 2013
27de3f0
refs #32: it is now possible to test headings
May 23, 2013
a061e3a
Merge pull request #30 from OpenScienceFramework/issue_30
jlward May 23, 2013
480d417
Merge branch 'master' into issue_32
May 23, 2013
add38af
refs #33: Updated the test for expected image behaviour
May 23, 2013
9329b0a
refs #33: updated DocxParser to extract images
May 23, 2013
737d351
refs #20 code cleanup, removed print statment
May 23, 2013
c418599
Merge branch 'master' into localDpi
May 23, 2013
c525515
Merge pull request #20 from OpenScienceFramework/localDpi
jlward May 23, 2013
f134050
Merge branch 'master' into issue_32
May 23, 2013
dc25a5f
Merge branch 'master' into issue_33
May 23, 2013
2cba0a4
refs #33: name change, import clenaup
May 23, 2013
e5975fc
refs #33: removed lying comments, added a comment
May 23, 2013
196522a
refs #33: on the rest of the image test cases, made sure the image wa…
May 23, 2013
a25affc
refs #33: good catch on the KeyError
May 23, 2013
5776117
refs #33: name change, no longer need try/except
May 23, 2013
999c3a2
refs #33: assume zip_path is always set
May 23, 2013
8c03bc2
Merge pull request #33 from OpenScienceFramework/issue_33
jlward May 23, 2013
85b82e4
Merge branch 'master' into issue_32
May 23, 2013
cdd8846
refs #32: removed the != None part for the style
May 23, 2013
c16b298
Merge pull request #32 from OpenScienceFramework/issue_32
jlward May 23, 2013
5dec2fb
Bumped to version 0.1.3
May 23, 2013
8897f03
bumped to version 0.1.4
May 23, 2013
2875d87
bumped to version 0.1.5
May 23, 2013
a620d81
bumped to version 0.1.6
May 23, 2013
89e996f
bumped to version 0.1.7
May 23, 2013
bf79516
refs #35: updated the readme so PyPi likes it.
May 28, 2013
fe0c875
refs #35: hopefully the name change will force github to render with …
May 28, 2013
fb47133
refs #35: Peg PyPi support for 2.6 and 2.7
May 28, 2013
d7b7d09
Merge pull request #35 from OpenScienceFramework/issue_35
jlward May 28, 2013
4ae6ee6
bumped to version 0.1.8
May 28, 2013
766f768
Fixed the manifest
May 28, 2013
6d3a372
Fixed a broken filename
May 28, 2013
8fd0b87
adding test; cleaned up parser
SamPortnow May 28, 2013
b8efd39
updating parser
SamPortnow May 28, 2013
e6de463
refs #34: updated the tests for expected behaviour for base 64 encodi…
May 28, 2013
5f4f649
refs #34: store and pass around the image data, instead of the image …
May 28, 2013
82735a5
refs #34: image handler now deals with image data and base 64 encodes…
May 28, 2013
5056af4
refs #34: since we no longer write the image to disk, we no longer ne…
May 28, 2013
212d866
updated to current version
SamPortnow May 28, 2013
6b63d89
updating
SamPortnow May 28, 2013
9370edd
flake8
SamPortnow May 28, 2013
5cb6028
refs #37: Added tests showing what should happen with upper roman num…
May 28, 2013
eab44ab
refs #37: it is now possible to convert root level upper roman lists …
May 28, 2013
f0842d9
refs #34: passed along the filename, correctly created the src for im…
May 29, 2013
38c6d48
refs #37: updates based on code review
May 29, 2013
b657f7d
Merge pull request #34 from OpenScienceFramework/issue_34
jlward May 29, 2013
f48a013
Merge branch 'master' into issue_37
May 29, 2013
433fe73
Merge pull request #37 from OpenScienceFramework/issue_37
jlward May 29, 2013
8a649f1
bumped to version 0.2.0
May 29, 2013
dcfb4b4
some changes to tables
SamPortnow May 30, 2013
4c15915
some changes to tables
SamPortnow May 30, 2013
6655342
Merge branch 'master' into latex
SamPortnow May 30, 2013
0d11f13
flake8
SamPortnow May 30, 2013
2d1292d
refs #38: refactored the test code to do r tags correctly
May 30, 2013
1db516e
refs #38: added a test showing the duplicated content issue
May 30, 2013
138684a
refs #38: fixed justifications
May 30, 2013
948a4fe
refs #38: removed dead code
May 30, 2013
2cfe7e6
refs #38: added a comment showing what still needs to be done.
May 30, 2013
139dce2
refs #38: code cleanup
May 30, 2013
525857c
minor change
SamPortnow May 31, 2013
3cbee23
Merge branch 'issue_38' of https://github.com/OpenScienceFramework/py…
SamPortnow May 31, 2013
9faa36c
merged with latex
SamPortnow May 31, 2013
6e188f4
refs #38: Added a changelog
May 31, 2013
0ff6f50
Merge pull request #38 from OpenScienceFramework/issue_38
jlward May 31, 2013
a0f1daa
bumped to version 0.2.1
May 31, 2013
4ff701d
refs #41: refactor and started using spans inline instead of divs
May 31, 2013
7bf0028
added page orientation
SamPortnow Jun 3, 2013
3ab2569
refs #43: switched to using lxml
Jun 3, 2013
9db7b51
refs #43: udpated the reqs
Jun 3, 2013
18213fb
fixed init
SamPortnow Jun 3, 2013
f817bd0
benchmark_test
SamPortnow Jun 4, 2013
c9a1304
removed unncessary
SamPortnow Jun 4, 2013
20eb9bc
updating
SamPortnow Jun 4, 2013
7f24073
updating
SamPortnow Jun 4, 2013
7b196af
merged with master
SamPortnow Jun 4, 2013
b2d70b6
refs #43: got the last of the failing unit tests passing
Jun 4, 2013
cb25ea0
refs #43: Big refactor. Moved all the new lxml parser to its own file.
Jun 4, 2013
f55427b
refs #43: No longer storing on attrib, storing on a global dictionary…
Jun 4, 2013
b48e8e3
refs #43: Refactor to no longer need the subclassed parser
Jun 4, 2013
ceabf65
refs #43: switched to using cElementTree
Jun 4, 2013
f78e857
refs #43: no longer need lxml
Jun 4, 2013
7389127
refs #43: small refactor, no longer skipping the test that use to take
Jun 4, 2013
39ccc37
refs #43: change log and updated readme
Jun 4, 2013
7ce47bd
refs #43: removed a dead print statement
Jun 4, 2013
514fb5a
refs #42: Added two tests for how sub/super scripts are supposed to work
Jun 4, 2013
e07b683
refs #42: sub and super scripts are now working
Jun 4, 2013
23533d6
refs #42: it would help to add the fixture for the docx test
Jun 4, 2013
9543d4e
refs #42: added an update note and updated the readme
Jun 4, 2013
898cfd0
refs #42: added a comment
Jun 4, 2013
bc87a89
Revert "refs #41: refactor and started using spans inline instead of …
Jun 5, 2013
1d4b406
Merge pull request #43 from OpenScienceFramework/issue_43
winhamwr Jun 5, 2013
b137c64
Merge branch 'master' into issue_42
Jun 5, 2013
155dc34
refs #42: changed conditional to elif
Jun 5, 2013
e460158
Merge pull request #42 from OpenScienceFramework/issue_42
jlward Jun 5, 2013
07d1da6
bumped to version 0.3.0
Jun 5, 2013
7b088ab
simple lists and simple tables working
SamPortnow Jun 6, 2013
11f196a
merged with master
SamPortnow Jun 6, 2013
2758f4a
merged with master
SamPortnow Jun 6, 2013
76dc1e7
merged with master
SamPortnow Jun 6, 2013
7b4cca9
merged with master
SamPortnow Jun 12, 2013
191a99c
table issue
SamPortnow Jun 12, 2013
bc28d22
removed uncessary import
SamPortnow Jun 12, 2013
52286ae
removed uncessary file
SamPortnow Jun 12, 2013
2531459
updated the test
SamPortnow Jun 12, 2013
4d92d84
Merge branch 'master' into table_fix
SamPortnow Jun 12, 2013
468502e
flake8 compliant
SamPortnow Jun 12, 2013
4145aa7
flake8 compliant
SamPortnow Jun 12, 2013
b4edee2
minor changes
SamPortnow Jun 12, 2013
ddc5600
refs #44: Added tests showing what the different supported r styles s…
Jun 12, 2013
cef0f9e
refs #44: Got several more r styles working correctly
Jun 12, 2013
9741e66
refs #44: update note and updated readme
Jun 12, 2013
137fbc0
made changes based on comments; added test case
SamPortnow Jun 13, 2013
1fb36a2
made changes based on comments; added test case
SamPortnow Jun 13, 2013
7774916
made changes based on comments; added test case
SamPortnow Jun 13, 2013
251a225
refs #44: First step at passing in an rPr instead
Jun 13, 2013
3fcebc6
refs #44: updated the tests to use the rpr instead
Jun 13, 2013
97ab97a
removed uncessary line
SamPortnow Jun 13, 2013
0fcfc7c
changed table tests
SamPortnow Jun 13, 2013
d50cdfd
refs #44: small refactor, not calling find twice per inline call anymore
Jun 13, 2013
e06802f
refs #44: even better performance
Jun 13, 2013
a6807b8
refs #44: no more kwargs abusing
Jun 13, 2013
f5cc362
Merge pull request #44 from OpenScienceFramework/issue_44
jlward Jun 13, 2013
c67efbf
bumped to version 0.3.1
Jun 13, 2013
dff9da1
merged with master
SamPortnow Jun 17, 2013
1a57dd1
updating
SamPortnow Jun 17, 2013
7ac58c9
merged with master
SamPortnow Jun 17, 2013
7493338
updated tests
SamPortnow Jun 17, 2013
721492e
updated tests
SamPortnow Jun 17, 2013
bd6e8e7
change to vmerge
SamPortnow Jun 17, 2013
81ca609
merged with masteR
SamPortnow Jun 17, 2013
83505df
updated changelog
SamPortnow Jun 17, 2013
4e32fd5
updated changelog
SamPortnow Jun 17, 2013
7c1603a
changes based on comments
SamPortnow Jun 17, 2013
ebddee8
Merge pull request #45 from OpenScienceFramework/table_fix
jlward Jun 18, 2013
d06b064
Merge branch 'table_fix' into latex
SamPortnow Jun 18, 2013
7b0517c
merged with master; added latex parser
SamPortnow Jun 18, 2013
227342a
changes based on comments
SamPortnow Jun 18, 2013
be11f10
fixed some more indentation stuff
SamPortnow Jun 18, 2013
331efe9
comments based on changes; fixed more spacing errors
SamPortnow Jun 18, 2013
8b90ce1
flake8 fix
SamPortnow Jun 18, 2013
1ce235d
fixed another spacing issue
SamPortnow Jun 18, 2013
8b43476
fixed up the table method
SamPortnow Jun 21, 2013
1b83a0f
updated init function to include docx2html
SamPortnow Jun 26, 2013
b0236f4
added the py_docx module and the htmlconversion parser
SamPortnow Jun 26, 2013
e2d35ec
added a missing import statement
SamPortnow Jun 26, 2013
6910567
updating
SamPortnow Jun 28, 2013
8f84ebe
refs #47: Added a test showing that not all unicode is handled correctly
Jul 2, 2013
139826f
refs #47: found the encoding of the document, and passed that encodin…
Jul 2, 2013
54b89f0
refs #47: update note
Jul 2, 2013
5ff7077
refs #48: added a test showing that val=0 should not add the style
Jul 2, 2013
0f7dad0
refs #48: val=0 no longer adds a style
Jul 2, 2013
6a14f56
refs #48: update note
Jul 2, 2013
84f7586
refs #47: updates based on code review
Jul 2, 2013
2969ad8
Merge pull request #47 from OpenScienceFramework/issue_47
jlward Jul 2, 2013
6141f9f
Merge branch 'master' into issue_48
Jul 2, 2013
d47be5b
refs #48: refactor
Jul 2, 2013
23f799a
Merge pull request #48 from OpenScienceFramework/issue_48
jlward Jul 2, 2013
c46e864
bumped to version 0.3.2
Jul 2, 2013
2297dca
now testing for math cases
SamPortnow Jul 3, 2013
62cc07c
math functionality
SamPortnow Jul 3, 2013
a730e89
math functionality
SamPortnow Jul 3, 2013
b25c7fd
merged with master
SamPortnow Jul 3, 2013
13a677f
refs #50: Catching the SyntaxError and raising a custom exception
Jul 3, 2013
101f5d2
refs #50: update note
Jul 3, 2013
eafde35
matrix support
SamPortnow Jul 3, 2013
a47c8bc
Merge pull request #50 from OpenScienceFramework/issue_50
jlward Jul 3, 2013
f110526
bumped to version 0.3.3
Jul 3, 2013
62a47d3
matrices in latex now working.
SamPortnow Jul 3, 2013
377af77
matrix test cases added
SamPortnow Jul 3, 2013
1168d5e
matrix test cases now passing
SamPortnow Jul 3, 2013
bcb12b3
matrix test cases now passing
SamPortnow Jul 3, 2013
08360b8
refs #46: Added a test showing the error with self closing t tags
Jul 5, 2013
cbc2feb
refs #46: fixed the problem with el.text being None
Jul 5, 2013
d84b50b
refs #46: update note
Jul 5, 2013
b05e308
refs #46: code cleanup
Jul 5, 2013
5a49778
Merge pull request #46 from OpenScienceFramework/issue_46
jlward Jul 5, 2013
c92792a
merged with master
SamPortnow Jul 8, 2013
8f70590
merged with master
SamPortnow Jul 8, 2013
9cf289e
added latex test case
SamPortnow Jul 8, 2013
1825c84
removed uncessary files
SamPortnow Jul 8, 2013
90e5ea9
flake8 compliant
SamPortnow Jul 8, 2013
338eebd
flake8 compliant
SamPortnow Jul 8, 2013
7d4d618
changes based on comments
SamPortnow Jul 8, 2013
3 changes: 3 additions & 0 deletions .gitignore
@@ -37,3 +37,6 @@ pip-log.txt
nosetests.xml
*.mo
.idea

test.html
testxml.html
6 changes: 5 additions & 1 deletion .travis.yml
@@ -2,9 +2,13 @@ language: python
python:
- "2.6"
- "2.7"
-script: python main.py
+script: ./run_tests.sh
install:
- python setup.py -q install
- pip install -r requirements.txt
env:
- TRAVIS_EXECUTE_PERFORMANCE=1
notifications:
  email:
    - [email protected]
    - [email protected]
2 changes: 2 additions & 0 deletions AUTHORS
@@ -0,0 +1,2 @@
Sam Portnow <[email protected]>
Jason Ward <[email protected]>
38 changes: 38 additions & 0 deletions CHANGELOG
@@ -0,0 +1,38 @@

Changelog
=========
* 0.3.4
    * It is possible for `w:t` tags to have `text` set to `None`. This no longer causes an error when escaping that text.
* 0.3.3
    * In the event that `cElementTree` has a problem parsing the document, a
      `MalformedDocxException` is raised instead of a `SyntaxError`.
* 0.3.2
    * We were not taking into account that vertical merges should have a
      continue attribute, but sometimes they do not, and in those cases Word
      assumes the continue attribute. We updated the parser to handle the
      cases in which the continue attribute is not there.
    * We now correctly handle documents with unicode characters in the
      namespace.
    * In rare cases, some text would be output with a style when it should not
      have been. This issue has been fixed.
* 0.3.1
    * Added support for several more OOXML tags including:
        * caps
        * smallCaps
        * strike
        * dstrike
        * vanish
        * webHidden
      More details in the README.
* 0.3.0
    * We switched from using stock *xml.etree.ElementTree* to using
      *xml.etree.cElementTree*. This has resulted in a fairly significant speed
      increase for Python 2.6.
    * It is now possible to create your own pre-processor to do additional
      pre-processing.
    * Superscripts and subscripts are now extracted correctly.
* 0.2.1
    * Added a changelog
    * Added the version in pydocx.__init__
    * Fixed an issue with duplicating content if there was indentation or
      justification on a p element that had multiple t tags.
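
The 0.3.3 and 0.3.4 entries above can be illustrated with a minimal sketch. It uses the standard-library `xml.etree.ElementTree` rather than `cElementTree`, and `MalformedDocxException`, `parse_document_xml` and `text_of` are defined here as stand-ins for illustration only; they are not taken from the pydocx source.

```python
import xml.etree.ElementTree as ElementTree


class MalformedDocxException(Exception):
    """Stand-in for the exception named in the changelog (defined here for illustration)."""


def parse_document_xml(xml_text):
    # ElementTree reports malformed XML as ParseError, a subclass of SyntaxError,
    # so catching SyntaxError covers the parser failure described in 0.3.3.
    try:
        return ElementTree.fromstring(xml_text)
    except SyntaxError as error:
        raise MalformedDocxException('could not parse document XML: %s' % error)


def text_of(element):
    # A self-closing element such as <t/> has text set to None (the 0.3.4 case);
    # treat it as an empty string so later escaping does not fail.
    return element.text or ''
```

Callers can then catch one library-specific exception instead of a generic `SyntaxError`, and text handling never sees `None`.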
7 changes: 7 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,7 @@
include AUTHORS
include CHANGELOG
include LICENSE
include MANIFEST.in
include README.rst
include pydocx/fixtures/*
include pydocx/tests/templates/*
2 changes: 0 additions & 2 deletions README.md

This file was deleted.

228 changes: 228 additions & 0 deletions README.rst
@@ -0,0 +1,228 @@
======
pydocx
======
.. image:: https://travis-ci.org/OpenScienceFramework/pydocx.png?branch=master
   :align: left
   :target: https://travis-ci.org/OpenScienceFramework/pydocx

pydocx is a parser that breaks down the elements of a docx file and converts them
into different markup languages. Right now, HTML is supported. Markdown and LaTeX
will be available soon. You can extend any of the available parsers to customize it
to your needs. You can also create your own class that inherits from DocxParser
to add support for a markup language that is not yet supported.

Currently Supported
###################

* tables
    * nested tables
    * rowspans
    * colspans
    * lists in tables
* lists
    * list styles
    * nested lists
    * list of tables
    * list of paragraphs
* justification
* images
* styles
    * bold
    * italics
    * underline
    * hyperlinks
* headings

Usage
#####

DocxParser includes abstract methods that each parser overrides to satisfy its own needs. The abstract methods are as follows:

::

    from abc import abstractmethod


    class DocxParser:

        @property
        def parsed(self):
            return self._parsed

        @abstractmethod
        def escape(self, text):
            return text

        @abstractmethod
        def linebreak(self):
            return ''

        @abstractmethod
        def paragraph(self, text):
            return text

        @abstractmethod
        def heading(self, text, heading_level):
            return text

        @abstractmethod
        def insertion(self, text, author, date):
            return text

        @abstractmethod
        def hyperlink(self, text, href):
            return text

        @abstractmethod
        def image_handler(self, path):
            return path

        @abstractmethod
        def image(self, path, x, y):
            return self.image_handler(path)

        @abstractmethod
        def deletion(self, text, author, date):
            return text

        @abstractmethod
        def bold(self, text):
            return text

        @abstractmethod
        def italics(self, text):
            return text

        @abstractmethod
        def underline(self, text):
            return text

        @abstractmethod
        def superscript(self, text):
            return text

        @abstractmethod
        def subscript(self, text):
            return text

        @abstractmethod
        def tab(self):
            return True

        @abstractmethod
        def ordered_list(self, text):
            return text

        @abstractmethod
        def unordered_list(self, text):
            return text

        @abstractmethod
        def list_element(self, text):
            return text

        @abstractmethod
        def table(self, text):
            return text

        @abstractmethod
        def table_row(self, text):
            return text

        @abstractmethod
        def table_cell(self, text):
            return text

        @abstractmethod
        def page_break(self):
            return True

        @abstractmethod
        def indent(self, text, left='', right='', firstLine=''):
            return text

Docx2Html inherits from DocxParser and implements basic HTML handling. For example:

::

    import xml.sax.saxutils


    class Docx2Html(DocxParser):

        # Escape '&', '<', and '>' so we render the HTML correctly
        def escape(self, text):
            return xml.sax.saxutils.quoteattr(text)[1:-1]

        # return a line break
        def linebreak(self, pre=None):
            return '<br />'

        # add paragraph tags
        def paragraph(self, text, pre=None):
            return '<p>' + text + '</p>'


However, let's say you want to add a specific style to your HTML document. To do this, you want to give each paragraph the class `my_implementation`. Simply extend `Docx2Html` and add what you need.

::

    class My_Implementation_of_Docx2Html(Docx2Html):

        def paragraph(self, text, pre=None):
            return '<p class="my_implementation">' + text + '</p>'



Or, let's say FOO is your new favorite markup language. Simply write your own parser, overriding the abstract methods of DocxParser:

::

    class Docx2Foo(DocxParser):

        # because linebreaks are denoted by '!!!!!!!!!!!!' in the FOO markup language :)
        def linebreak(self):
            return '!!!!!!!!!!!!'
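
The override pattern described above can be tried without a docx file. The following is a minimal sketch, not pydocx itself: `MiniParser` is a hypothetical stand-in for `DocxParser` that only joins already-extracted paragraphs, and all of the `Mini*` class names are invented for illustration.

```python
class MiniParser(object):
    """Hypothetical stand-in for DocxParser: it only joins paragraphs."""

    def __init__(self, paragraphs):
        # Render each paragraph through the hooks that subclasses override.
        self._parsed = self.linebreak().join(
            self.paragraph(text) for text in paragraphs
        )

    @property
    def parsed(self):
        return self._parsed

    def linebreak(self):
        return ''

    def paragraph(self, text):
        return text


class MiniHtml(MiniParser):
    """Plays the role of Docx2Html: basic HTML output."""

    def linebreak(self):
        return '<br />'

    def paragraph(self, text):
        return '<p>' + text + '</p>'


class MiniStyledHtml(MiniHtml):
    """Overrides a single hook to add a CSS class to every paragraph."""

    def paragraph(self, text):
        return '<p class="my_implementation">' + text + '</p>'


class MiniFoo(MiniParser):
    """Plays the role of Docx2Foo: only the line break differs from the base."""

    def linebreak(self):
        return '!!!!!!!!!!!!'


print(MiniStyledHtml(['a', 'b']).parsed)
# <p class="my_implementation">a</p><br /><p class="my_implementation">b</p>
print(MiniFoo(['a', 'b']).parsed)
# a!!!!!!!!!!!!b
```

Each subclass changes only the hooks it cares about, and everything else falls through to its parent. This is the same design the README relies on for `Docx2Html` and custom parsers.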

Custom Pre-Processor
####################

When creating your own parser (as described above), you can also add your own custom pre-processor. To do so, set the `pre_processor_class` field on the custom parser, like so:

::

    class Docx2Foo(DocxParser):
        pre_processor_class = FooPrePorcessor


The `FooPrePorcessor` will need a few things to get you going:

::

    class FooPrePorcessor(PydocxPrePorcessor):
        def perform_pre_processing(self, root, *args, **kwargs):
            super(FooPrePorcessor, self).perform_pre_processing(root, *args, **kwargs)
            self._set_foo(root)

        def _set_foo(self, root):
            pass

If you want `_set_foo` to be called, you must call it from `perform_pre_processing`, which the base pydocx parser invokes.

Everything done during pre-processing is executed prior to `parse` being called for the first time.
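
The order of calls described above can be sketched with stand-in classes. Everything in this example is hypothetical: `BasePreProcessor` stands in for `PydocxPrePorcessor`, `MiniParser` stands in for the pydocx base parser, and a plain dict is used in place of the XML root element.

```python
class BasePreProcessor(object):
    """Hypothetical stand-in for PydocxPrePorcessor."""

    def perform_pre_processing(self, root, *args, **kwargs):
        # Record that the base pre-processing step ran.
        root.setdefault('steps', []).append('base')


class FooPreProcessor(BasePreProcessor):
    def perform_pre_processing(self, root, *args, **kwargs):
        # Run the inherited steps first, then the custom one.
        super(FooPreProcessor, self).perform_pre_processing(root, *args, **kwargs)
        self._set_foo(root)

    def _set_foo(self, root):
        root['steps'].append('foo')


class MiniParser(object):
    """Hypothetical stand-in for the base parser."""

    pre_processor_class = BasePreProcessor

    def __init__(self, root):
        # Pre-processing runs before parse is called for the first time.
        self.pre_processor_class().perform_pre_processing(root)
        self.parsed = self.parse(root)

    def parse(self, root):
        return ', '.join(root['steps'])


class MiniFooParser(MiniParser):
    pre_processor_class = FooPreProcessor


print(MiniFooParser({}).parsed)
# base, foo
```

Because the parser only looks up `pre_processor_class`, swapping in a different pre-processor needs no changes to the parsing code.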


Styles
######

The base parser `Docx2Html` relies on certain CSS classes being defined for certain behaviour to occur. Currently these include:

* class `pydocx-insert` -> Turns the text green.
* class `pydocx-delete` -> Turns the text red and draws a line through the text.
* class `pydocx-center` -> Aligns the text to the center.
* class `pydocx-right` -> Aligns the text to the right.
* class `pydocx-left` -> Aligns the text to the left.
* class `pydocx-comment` -> Turns the text blue.
* class `pydocx-underline` -> Underlines the text.
* class `pydocx-caps` -> Makes all text uppercase.
* class `pydocx-small-caps` -> Makes all text uppercase; however, truly lowercase letters will be smaller than their uppercase counterparts.
* class `pydocx-strike` -> Strike a line through.
* class `pydocx-hidden` -> Hide the text.
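
A stylesheet providing the behaviours listed above might look like the following sketch. The CSS declarations and the `build_stylesheet` helper are assumptions made for illustration; the README only describes the intended effect of each class, not the exact rules.

```python
# Assumed CSS declarations for each class named in the README.
PYDOCX_STYLES = [
    ('pydocx-insert', 'color: green;'),
    ('pydocx-delete', 'color: red; text-decoration: line-through;'),
    ('pydocx-center', 'text-align: center;'),
    ('pydocx-right', 'text-align: right;'),
    ('pydocx-left', 'text-align: left;'),
    ('pydocx-comment', 'color: blue;'),
    ('pydocx-underline', 'text-decoration: underline;'),
    ('pydocx-caps', 'text-transform: uppercase;'),
    ('pydocx-small-caps', 'font-variant: small-caps;'),
    ('pydocx-strike', 'text-decoration: line-through;'),
    ('pydocx-hidden', 'display: none;'),
]


def build_stylesheet(styles=PYDOCX_STYLES):
    """Return a <style> element with one CSS rule per class."""
    rules = ['.%s { %s }' % (name, declarations) for name, declarations in styles]
    return '<style>\n' + '\n'.join(rules) + '\n</style>'
```

The returned string can be placed in the `<head>` of the page that embeds the converted HTML.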

Optional Arguments
##################

You can pass in `convert_root_level_upper_roman=True` to the parser and it will convert all root level upper roman lists to headings instead.
12 changes: 0 additions & 12 deletions main.py

This file was deleted.
