As part of testing for my project, I just ran `read_docx` on a test .docx file containing various things, including a textbox.
Examining the resulting `Value::Object`, I can't find the text from my textbox anywhere.
I can see from the crates.io page that at the bottom, under "Features", "Textbox" is left unticked.
Does this mean that the parsing basically ignores all textboxes?
And yet, when I unzip the .docx file, the text is right there near the end of document.xml:
<v:textbox style="mso-fit-shape-to-text:t"><w:txbxContent><w:p w:rsidR="0094123E" w:rsidRPr="00DF617B" w:rsidRDefault="0094123E" w:rsidP="0094123E"><w:pPr><w:ind w:left="0" w:firstLine="0"/></w:pPr><w:r w:rsidRPr="00DF617B"><w:t>Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</w:t></w:r></w:p><w:p w:rsidR="0094123E" w:rsidRDefault="0094123E"/></w:txbxContent></v:textbox>
Have I got this right about omitting textboxes currently?
If so, is there a reason textboxes are currently left out of the parsing? It's slightly irksome, because it means I'll have to cobble together my own code to parse document.xml.
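For anyone in the same position, here is a rough sketch of the kind of workaround I have in mind. It is not part of this crate: `textbox_paragraphs`, `run_text` and `unescape` are names I made up, and it assumes the contents of document.xml have already been read into a string (unzipping the .docx is not shown). It is a naive string scan using only the standard library, not a real XML parser, so it will not cope with CDATA, nested textboxes, or unusual namespace prefixes.

```rust
/// Decode the five predefined XML entities (`&amp;` last, to avoid double decoding).
fn unescape(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&quot;", "\"")
        .replace("&apos;", "'")
        .replace("&amp;", "&")
}

/// Concatenate the contents of every `<w:t>` element found in `chunk`.
fn run_text(chunk: &str) -> String {
    let mut text = String::new();
    let mut rest = chunk;
    while let Some(pos) = rest.find("<w:t") {
        let tail = &rest[pos + 4..];
        // Accept only `<w:t>` or `<w:t ...>`; skip `<w:tab/>`, `<w:tbl>`, `<w:txbxContent>` etc.
        if !(tail.starts_with('>') || tail.starts_with(' ')) {
            rest = tail;
            continue;
        }
        let open_end = match tail.find('>') {
            Some(i) => i,
            None => break,
        };
        // A self-closed `<w:t .../>` carries no text.
        if tail[..open_end].ends_with('/') {
            rest = &tail[open_end + 1..];
            continue;
        }
        let body = &tail[open_end + 1..];
        let close = match body.find("</w:t>") {
            Some(i) => i,
            None => break,
        };
        text.push_str(&unescape(&body[..close]));
        rest = &body[close + "</w:t>".len()..];
    }
    text
}

/// Return one string per non-empty paragraph inside each `<w:txbxContent>` block.
fn textbox_paragraphs(xml: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut rest = xml;
    while let Some(start) = rest.find("<w:txbxContent") {
        let after = &rest[start..];
        let end = match after.find("</w:txbxContent>") {
            Some(i) => i,
            None => break,
        };
        // `</w:p>` closes a paragraph; `</w:pPr>` does not match this pattern.
        for chunk in after[..end].split("</w:p>") {
            let text = run_text(chunk);
            if !text.is_empty() {
                out.push(text);
            }
        }
        rest = &after[end + "</w:txbxContent>".len()..];
    }
    out
}
```

Calling `textbox_paragraphs` on the fragment above should give a single-element vector holding the "Excepteur sint..." sentence, since the second paragraph is empty. One caveat: Word may store the same textbox twice (a DrawingML version plus a VML fallback), so the results might need de-duplicating.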
Edited the title in the hope that you might find time to give this some thought. Omitting text from text-boxes seems a bit of an oversight, which could seemingly be corrected fairly easily...
Mrodent changed the title from "Clarification about Word textboxes?" to "Extract text from Word textboxes [proposed label: enhancement]" on May 17, 2024