<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Synopsis</title>
<style type="text/css">
pre {
background-color: #A0CFCF;
}
code {
background-color: #B4BBFC;
color: #036;
}
li {
line-height: 24px;
}
.comm {
color: #060;
}
.txt {
color: #666;
}
</style>
</head>
<body>
<!-- div main-body is used for insertion in the TBS menu -->
<div id="main-body">
<h1>Synopsis of XML files stored in the archives supported by OpenTBS</h1>
<p> Version 2016-03-24</p>
<p>Extensions: odt, odg, ods, odf, odp, docx, xlsx and pptx</p>
<p>This file is incomplete, feel free to send your own comments to: <a href="http://www.tinybutstrong.com/onlyyou.html">http://www.tinybutstrong.com/onlyyou.html</a></p>
<h2>Table of Contents</h2>
<ul>
<li>1 <a href="#libreoffice">LibreOffice & OpenOffice documents</a></li>
<li>1.1 <a href="#libreoffice_mainfile">Main file</a></li>
<li>1.1.1 <a href="#libreoffice_sections">Sections</a></li>
<li>1.1.2 <a href="#libreoffice_slides">Slides in a presentation</a></li>
<li>1.1.3 <a href="#libreoffice_tables">Tables in a document</a></li>
<li>1.1.3.a Cells merged vertically</li>
<li>1.1.3.b Cells merged horizontally</li>
<li>1.1.4 <a href="#libreoffice_comments">Comments</a></li>
<li>1.1.4.a Comments in ODS workbooks (OpenOffice Calc)</li>
<li>1.1.4.b Comments in ODT documents (OpenOffice Write)</li>
<li>1.1.5 <a href="#libreoffice_headers">Headers and footers in a document</a></li>
<li>1.1.5.a ODT and ODS</li>
<li>1.1.5.b ODP</li>
<li>1.1.6 <a href="#libreoffice_charts">Charts</a></li>
<li>1.1.7 <a href="#libreoffice_textboxes">Textboxes</a></li>
<li>1.1.8 <a href="#libreoffice_odf">Special to ODF</a> (OpenOffice Math Formula)</li>
<li>1.1.9 <a href="#libreoffice_pictures">Pictures</a></li>
<li>2 <a href="#msoffice">Microsoft Office documents</a></li>
<li>2.1 <a href="#msoffice_docx">Word document (DOCX</a>)</li>
<li>2.1.1 <a href="#msoffice_docx_main">The main file</a></li>
<li>2.1.2 <a href="#msoffice_docx_tables">Tables</a></li>
<li>2.1.3 <a href="#msoffice_docx_headers">Headers and footers</a></li>
<li>2.1.4 <a href="#msoffice_docx_comments">Comments, footnotes and endnotes</a></li>
<li>2.1.5 <a href="#msoffice_docx_bookmarks">Bookmarks</a></li>
<li>2.1.6 <a href="#msoffice_docx_textboxes">Textboxes</a></li>
<li>2.1.7 <a href="#msoffice_docx_charts">Charts</a></li>
<li>2.1.8 <a href="#msoffice_docx_pictures">Pictures</a></li>
<li>2.2 <a href="#msoffice_xlsx">Excel spreadsheet (XLSX</a>)</li>
<li>2.2.1 <a href="#msoffice_xlsx">General</a></li>
<li>2.2.2 <a href="#msoffice_xlsx_headers">Headers and footers</a></li>
<li>2.2.3 <a href="#msoffice_xlsx_pictures">Pictures</a></li>
<li>2.3 <a href="#msoffice_pptx">PowerPoint presentation (PPTX</a>)</li>
<li>2.3.1 <a href="#msoffice_pptx">General</a></li>
<li>2.3.2 <a href="#msoffice_pptx_headers">Headers and footers</a></li>
<li>2.3.3 <a href="#msoffice_pptx_pictures">Pictures</a></li>
</ul>
<h2> <a name="libreoffice"></a>LibreOffice & OpenOffice documents</h2>
<p> These are the extensions ODT, ODS, ODG, ODF, ODP and ODM.</p>
<p>All single quotes <code>"'"</code> in text are encoded as <code>"&apos;"</code>, but the OpenTBS plugin replaces them automatically.</p>
<p>The main information is stored in the file <em>'content.xml'</em>. </p>
<p> Pictures are stored in the directory <em>'Pictures'</em> and should be registered in the file <em>'META-INF/manifest.xml'</em> (OpenTBS does this automatically when you use the parameter "addpic").</p>
<p> Since OpenOffice 3.2, a picture that is not registered in the manifest file can cause an error message when the document is opened.</p>
<p> Video and sound cannot be stored in OpenOffice documents.</p>
<h2><a name="libreoffice_mainfile" id="libreoffice_mainfile"></a>Main file (content.xml)</h2>
<h3>Synopsis</h3>
<pre><office:document-content>
...
<office:body>
<office:text>
<text:p text:style-name="Standard">
<span class="comm"> Normal new lines are made with a new paragraph <text:p>...</text:p>
Simple new lines are made with <text:line-break/>
Tabs are made with <text:tab/>
Page-breaks are made with a new paragraph having a style which has the attribute fo:break-before="page" or fo:break-after="page".
Note that the page-break does not work if the attribute is in the paragraph element. A "break-before" at the first page, or a "break-after" on the last page has no effect.
Local styles (bold, color,...) are made with <text:span text:style-name="T1">...</text:span>
</span></text:p>
<text:h>...</text:h> A paragraph typed as a heading
<text:list>...</text:list> A list of items
<table:table>...</table:table> A table
</office:text>
</office:body>
</office:document-content></pre>
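<p>As an illustration of the synopsis above, here is a small Python sketch that flattens one <code><text:p></code> element into plain text. The function name <code>paragraph_text</code> is hypothetical, and the sketch only knows the elements listed above: any other child (such as <code><text:span></code>) is simply traversed for its text.</p>

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

def paragraph_text(node):
    """Flatten a <text:p> element into a plain string."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == f"{{{TEXT_NS}}}line-break":
            parts.append("\n")   # simple new line
        elif child.tag == f"{{{TEXT_NS}}}tab":
            parts.append("\t")   # tabulation
        else:
            # e.g. <text:span>: recurse to keep the locally styled text
            parts.append(paragraph_text(child))
        parts.append(child.tail or "")
    return "".join(parts)

sample = (
    f'<text:p xmlns:text="{TEXT_NS}">first line<text:line-break/>'
    'second<text:tab/><text:span text:style-name="T1">bold</text:span> end</text:p>'
)
print(repr(paragraph_text(ET.fromstring(sample))))
```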
<h3><a name="libreoffice_sections" id="libreoffice_mainfile2"></a>Sections</h3>
<pre><text:p> ... </text:p>
<text:section text:style-name="Sect1" text:name="Section1">
<text:p> ... </text:p>
<text:p> ... </text:p>
</text:section>
<text:p> ... </text:p></pre>
<h3><a name="libreoffice_slides" id="libreoffice_mainfile3"></a>Slides in a presentation</h3>
<pre><draw:page ....>
</draw:page></pre>
<h3><a name="libreoffice_tables" id="libreoffice_mainfile4"></a>Tables in a document</h3>
<pre><table:table>
<table:table-column ... />
<table:table-column ... />
<table:table-column ... />
<table:table-row>
<table:table-cell> ... </table:table-cell>
<table:table-cell table:number-columns-spanned="2"> ... </table:table-cell>
<table:covered-table-cell/> <span class="comm">// virtual column that is covered by the span</span>
<table:table-cell> ... </table:table-cell>
</table:table-row>
</table:table></pre>
<h4>Cells merged vertically</h4>
<pre><table:table-cell table:number-rows-spanned="2" office:value-type="string">
<text:p>
<text:span> <span class="txt">CONTENT OF THE CELL</span> </text:span>
</text:p>
</table:table-cell></pre>
<p>Cells that are not displayed because of vertical or horizontal merging are replaced with <code><table:covered-table-cell/></code>. Such entities seem to be optional, however: LibreOffice displays the table correctly if they are omitted.</p>
<p>The attribute <code>table:number-rows-spanned="1"</code> seems to be supported and has no effect.</p>
<h4> Cells merged horizontally</h4>
<pre><table:table-cell table:number-columns-spanned="2" office:value-type="string">
<text:p>
<text:span> <span class="txt">CONTENT OF THE CELL</span> </text:span>
</text:p>
</table:table-cell>
<table:covered-table-cell/>
</pre>
<h3><a name="libreoffice_comments" id="libreoffice_mainfile5"></a>Comments</h3>
<h4>Comments in ODS workbooks (OpenOffice Calc)</h4>
<pre><table:table-row table:style-name="ro2">
<table:table-cell ...> ... </table:table-cell>
<table:table-cell office:value-type="float" office:value="3.6">
<office:annotation draw:style-name="gr1" draw:text-style-name="P1" svg:width="2.899cm" svg:height="0.991cm" svg:x="9.632cm" svg:y="2.786cm" draw:caption-point-x="-0.61cm" draw:caption-point-y="1.511cm">
<dc:date>2011-02-02T00:00:00</dc:date>
<text:p text:style-name="P1"><span class="txt">Here is my comment.</span></text:p>
</office:annotation>
<text:p>3,60</text:p>
</table:table-cell>
<table:table-cell ...> ... </table:table-cell>
</table:table-row></pre>
<h4>Comments in ODT documents (OpenOffice Write)</h4>
<pre><text:p text:style-name="Standard">
Here is the start of the text and now
<office:annotation>
<dc:creator>Skrol29</dc:creator>
<dc:date>2011-02-02T16:13:58.42</dc:date>
<text:p text:style-name="P1">
<text:span text:style-name="T1"><span class="txt">Here is my comment.</span></text:span>
</text:p>
</office:annotation>
comment is just inserted there.
</text:p></pre>
<h3><a name="libreoffice_headers" id="libreoffice_mainfile6"></a>Headers and footers in a document </h3>
<h4><a name="libreoffice_headers_odt" id="libreoffice_mainfile9"></a>ODT and ODS</h4>
<p> Header and footer contents are saved in the <em>"styles.xml"</em> file. For each page you can choose to hide or display the header and footer; nevertheless, the contents are always the same on every page.</p>
<pre><office:document-styles ...>
<office:master-styles>
<style:master-page ...>
<style:header>
<text:p text:style-name="Header"><span class="txt">here is the header</span></text:p>
</style:header>
<style:footer>
<text:p text:style-name="Footer"><span class="txt">here is the footer</span></text:p>
</style:footer>
</style:master-page>
</office:master-styles>
</office:document-styles></pre>
<h4><a name="libreoffice_headers_odp" id="libreoffice_mainfile8"></a>ODP</h4>
<p> Header and footer contents are saved in the <em>"content.xml"</em> file.
They are defined as available styles and can be used for any slide, and for the whole document in the "handout" view, which is a set of several slides.</p>
<pre><office:presentation>
<presentation:header-decl presentation:name="hdr1"><span class="txt">Here is my header</span></presentation:header-decl>
<presentation:footer-decl presentation:name="ftr1"><span class="txt">Here is my footer for slide 1</span></presentation:footer-decl>
<presentation:footer-decl presentation:name="ftr2"><span class="txt">Here is my footer</span></presentation:footer-decl>
<draw:page ... presentation:use-footer-name="ftr1">
</draw:page>
...
</office:presentation> </pre>
<h3> <a name="libreoffice_charts" id="libreoffice_mainfile7"></a>Charts</h3>
<p> The location of the chart is defined in the main subfile <em>"content.xml"</em>.</p>
<pre><text:p text:style-name="Standard">
<draw:frame draw:name="<span class="txt">My name</span>" text:anchor-type="paragraph" svg:x="2.401cm" svg:y="1.379cm" svg:width="14.102cm" svg:height="7.622cm" draw:z-index="0">
<draw:object xlink:href="./Object 1" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" />
<draw:image xlink:href="./ObjectReplacements/Object 1" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" />
<svg:title><span class="txt">My title</span></svg:title>
<svg:desc><span class="txt">My description</span></svg:desc>
</draw:frame>
</text:p></pre>
<p>An image preview is saved in the file <em>"ObjectReplacements/Object 1"</em>, and this image will be displayed
automatically instead of the real Chart view until the chart is manually changed in the document.</p>
<p> In order to avoid this preview, the following must be deleted:</p>
<ul>
<li> the file <em>"ObjectReplacements/Object 1"</em>,</li>
<li> the reference <code><draw:image></code> in the <em>"content.xml"</em> file,</li>
<li> the corresponding <code><manifest:file-entry></code> entry in the <em>"META-INF/manifest.xml"</em> file.</li>
</ul>
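<p>The three deletions listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name <code>drop_chart_preview</code> is hypothetical, the archive is represented as a plain dictionary mapping member paths to text, the manifest is assumed to list the preview as a <code><manifest:file-entry></code> element, and both references are assumed to be self-closing elements that a regular expression can match.</p>

```python
import re

def drop_chart_preview(members, object_name="Object 1"):
    """Return a copy of `members` (archive path -> text) without the chart preview."""
    preview = f"ObjectReplacements/{object_name}"
    # 1) drop the preview file itself
    result = {path: data for path, data in members.items() if path != preview}
    # 2) drop the <draw:image> element that points at it in content.xml
    image = re.compile(
        r'<draw:image\b[^>]*xlink:href="(?:\./)?' + re.escape(preview) + r'"[^>]*/>'
    )
    result["content.xml"] = image.sub("", result["content.xml"])
    # 3) drop its entry in the manifest
    entry = re.compile(
        r'<manifest:file-entry\b[^>]*manifest:full-path="' + re.escape(preview) + r'"[^>]*/>'
    )
    result["META-INF/manifest.xml"] = entry.sub("", result["META-INF/manifest.xml"])
    return result

# Simplified sample members (attributes reduced to the ones that matter here)
members = {
    "content.xml": (
        '<draw:frame draw:name="My name">'
        '<draw:object xlink:href="./Object 1"/>'
        '<draw:image xlink:href="./ObjectReplacements/Object 1" xlink:type="simple"/>'
        '</draw:frame>'
    ),
    "META-INF/manifest.xml": (
        '<manifest:file-entry manifest:full-path="Object 1/content.xml"/>'
        '<manifest:file-entry manifest:full-path="ObjectReplacements/Object 1"/>'
    ),
    "ObjectReplacements/Object 1": "preview data",
}
cleaned = drop_chart_preview(members)
print(sorted(cleaned))
print(cleaned["content.xml"])
```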
<p>Data and properties of the chart are saved in the corresponding subdirectory <em>"Object 1"</em>.
The data is saved in <em>"Object 1/content.xml"</em>, in a table that groups the data of all series of the chart.</p>
<pre><table:table table:name="local-table">
<table:table-header-columns>
<table:table-column />
</table:table-header-columns>
<table:table-columns>
<table:table-column table:number-columns-repeated="3" />
</table:table-columns>
<table:table-header-rows>
<table:table-row>
<table:table-cell><text:p /></table:table-cell>
<table:table-cell office:value-type="string"><text:p><span class="txt">column 1</span></text:p></table:table-cell>
<table:table-cell office:value-type="string"><text:p><span class="txt">column 2</span></text:p></table:table-cell>
<table:table-cell office:value-type="string"><text:p><span class="txt">column 3</span></text:p></table:table-cell>
</table:table-row>
</table:table-header-rows>
<table:table-rows>
<table:table-row>
<table:table-cell office:value-type="string"><text:p><span class="txt">line 1</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="9.1"><text:p><span class="txt">9.1</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="3.2"><text:p><span class="txt">3.2</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="4.54"><text:p><span class="txt">4.54</span></text:p></table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell office:value-type="string"><text:p><span class="txt">line 2</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="2.4"><text:p><span class="txt">2.4</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="8.8"><text:p><span class="txt">8.8</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="9.65"><text:p><span class="txt">9.65</span></text:p></table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell office:value-type="string"><text:p><span class="txt">line 3</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="3.1"><text:p><span class="txt">3.1</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="1.5"><text:p><span class="txt">1.5</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="3.7"><text:p><span class="txt">3.7</span></text:p></table:table-cell>
</table:table-row>
<table:table-row>
<table:table-cell office:value-type="string"><text:p><span class="txt">line 4</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="4.3"><text:p><span class="txt">4.3</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="9.02"><text:p><span class="txt">9.02</span></text:p></table:table-cell>
<table:table-cell office:value-type="float" office:value="6.2"><text:p><span class="txt">6.2</span></text:p></table:table-cell>
</table:table-row>
</table:table-rows>
</table:table></pre>
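<p>The data table above can be read back into rows of values. Here is a minimal Python sketch; the names <code>cell_value</code> and <code>table_rows</code> are hypothetical, and the sample is a shortened copy of the table above with its namespaces declared explicitly.</p>

```python
import xml.etree.ElementTree as ET

NS = {
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}

def cell_value(cell):
    """Return a float for numeric cells, else the text of the first paragraph."""
    if cell.get(f"{{{NS['office']}}}value-type") == "float":
        return float(cell.get(f"{{{NS['office']}}}value"))
    paragraph = cell.find("text:p", NS)
    return paragraph.text if paragraph is not None else None

def table_rows(table):
    """Return every row (header rows first) as a list of cell values."""
    return [
        [cell_value(cell) for cell in row.findall("table:table-cell", NS)]
        for row in table.iter(f"{{{NS['table']}}}table-row")
    ]

sample = """
<table:table xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0"
             xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
             xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
             table:name="local-table">
  <table:table-header-rows><table:table-row>
    <table:table-cell><text:p/></table:table-cell>
    <table:table-cell office:value-type="string"><text:p>column 1</text:p></table:table-cell>
  </table:table-row></table:table-header-rows>
  <table:table-rows><table:table-row>
    <table:table-cell office:value-type="string"><text:p>line 1</text:p></table:table-cell>
    <table:table-cell office:value-type="float" office:value="9.1"><text:p>9.1</text:p></table:table-cell>
  </table:table-row></table:table-rows>
</table:table>
""".strip()

rows = table_rows(ET.fromstring(sample))
print(rows)
```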
<h3><a name="libreoffice_textboxes" id="libreoffice_mainfile10"></a>Textboxes</h3>
<pre><text:p text:style-name="Standard">
<draw:frame draw:style-name="fr1" draw:name="Frame1" text:anchor-type="paragraph" svg:width="4.535cm" draw:z-index="0">
<draw:text-box fo:min-height="2.461cm">
<text:p text:style-name="Frame_20_contents"><span class="txt">Message in a textbox</span></text:p>
</draw:text-box>
</draw:frame>
<span class="txt">Usual text</span>
</text:p>
</pre>
<h3><a name="libreoffice_odf" id="libreoffice_mainfile11"></a>Special to ODF (OpenOffice Math Formula)</h3>
<p> Any comment in the formula must be entered between text delimiters which are the double quotes (<code>"</code>).
Newlines are made with the keyword <code>'newline'</code> outside the text delimiter. </p>
<h3><a name="libreoffice_pictures" id="libreoffice_mainfile12"></a>Pictures/Images</h3>
<p> The binary content is saved as a file in <em>"Pictures/"</em>.</p>
<p>Short synopsis of the control in the document:</p>
<pre><text:p ...>
<draw:frame ...>
<draw:text-box ...>
<text:p ...>
<draw:a ...>
<draw:frame draw:style-name="fr1" draw:name="images1" text:anchor-type="paragraph" svg:width="0.847cm" svg:height="0.847cm" draw:z-index="0">
<draw:image xlink:href="Pictures/100000000000002000000020A0D29467.jpg" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad"/>
</draw:frame>
</draw:a>
</text:p>
</draw:text-box>
</draw:frame>
</text:p></pre>
<h2> <a name="msoffice" id="msoffice"></a>Microsoft Office documents</h2>
<p>These are documents with the extensions DOCX, XLSX and PPTX.</p>
<h3> <a name="msoffice_docx" id="msoffice2"></a>Microsoft Word document (.DOCX)</h3>
<p>The main file is usually <em>"word/document.xml"</em>, but its actual location is defined in the file <em>"[Content_Types].xml"</em>, in the element: <code><Override PartName="/word/document.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/></code></p>
<p>Note: I tried renaming <em>"word/document.xml"</em> in both the <em>"[Content_Types].xml"</em> file and the archive, but Word 2010 was then unable to open the document,
reporting it as corrupted.</p>
<h4><a name="msoffice_docx_main" id="msoffice11"></a>The main file "word/document.xml" (DOCX)</h4>
<pre><w:document>
<w:body>
<w:p> <span class="comm">// New paragraph</span>
<w:pPr> <span class="comm">// Parameters of the paragraph</span>
<w:rPr> ... </w:rPr> <span class="comm">// Set of parameters for a Run</span>
<w:sectPr> <span class="comm">// Start a new section. Sections are a set of page layout (margin, columns, ...) available until the next section.</span>
<w:type w:val="continuous"/> <span class="comm">// May be present when the section is defined manually.</span>
</w:sectPr>
<w:pageBreakBefore/> <span class="comm">// Page break before the paragraph (way #1)</span>
</w:pPr>
<w:r> <span class="comm">// New run item. A run item is a set of content having common layout properties.</span>
<w:rPr> <span class="comm">// Set of parameters for a Run. Examples: <w:i/> is italic, <w:b/> is bold. </w:rPr></span>
<w:t><span class="txt">Your text is here</span></w:t>
<span class="comm">// Simple new lines are made with <w:br/></span>
<span class="comm">// Page breaks can also be made with <w:br w:type="page"/> (way #2)</span>
</w:r>
<w:r><w:tab/></w:r> <span class="comm">// Tabs are stored as <w:tab/> inside a <w:r> element.</span>
<w:r>
<w:t xml:space="preserve"><span class="txt"> Next text </span></w:t>
</w:r> <span class="comm">// spaces between entities are handled with the attribute xml:space="preserve"</span>
</w:p>
</w:body>
</w:document></pre>
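<p>To make the synopsis concrete, here is a minimal Python sketch that rebuilds the plain text of one <code><w:p></code> element. The function name <code>docx_paragraph_text</code> is hypothetical. The sketch assumes that line breaks and tabs appear as <code><w:br/></code> and <code><w:tab/></code> inside runs, and it only reads the runs placed directly under the paragraph.</p>

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def docx_paragraph_text(paragraph):
    """Concatenate the visible text of the runs directly under a <w:p> element."""
    parts = []
    for run in paragraph.findall(f"{{{W}}}r"):
        for child in run:
            if child.tag == f"{{{W}}}t":
                parts.append(child.text or "")
            elif child.tag == f"{{{W}}}tab":
                parts.append("\t")
            elif child.tag == f"{{{W}}}br":
                # w:type="page" marks a page break, otherwise a simple new line
                is_page_break = child.get(f"{{{W}}}type") == "page"
                parts.append("\f" if is_page_break else "\n")
    return "".join(parts)

sample = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>Your text is here</w:t><w:br/></w:r>'
    '<w:r><w:tab/><w:t xml:space="preserve"> Next text </w:t></w:r>'
    '</w:p>'
)
print(repr(docx_paragraph_text(ET.fromstring(sample))))
```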
<p>What are attributes <code>"w:rsidR"</code> and <code>"w:rsidRPr"</code> for?</p>
<p>Attribute <code>"w:rsidR"</code> is a Revision ID. Each editing session on a document gets a new ID,
and each modification made during that session is marked with its RsID. </p>
<p> More info: <a href="http://blogs.msdn.com/brian_jones/archive/2006/12/11/what-s-up-with-all-those-rsids.aspx">http://blogs.msdn.com/brian_jones/archive/2006/12/11/what-s-up-with-all-those-rsids.aspx</a></p>
<h4><a name="msoffice_docx_tables" id="msoffice12"></a>Tables (DOCX)</h4>
<pre><w:p>...</w:p>
<w:tbl> <span class="comm">// a table takes the same place as a paragraph</span>
<w:tblPr></w:tblPr>
<w:tblGrid>
<w:gridCol ... /> <span class="comm">// mainly defines column widths</span>
<w:gridCol ... />
</w:tblGrid>
<w:tr>
<w:tc> ... </w:tc>
<w:tc> ... </w:tc>
...
</w:tr>
</w:tbl>
<w:p>...</w:p> </pre>
<h5>Cells merged vertically</h5>
<pre><w:tc>
<w:tcPr>
<w:vMerge w:val="restart"/> <span class="comm">// marks the cell as the start of a new merge</span>
<w:vMerge w:val="continue"/> <span class="comm">// marks the cell as continuing the merge (the cell is merged with a previous one having "restart" or "continue")</span>
<w:vMerge/> <span class="comm">// same as above</span>
<span class="comm"> // no <w:vMerge> entity means the cell is not merged. </span>
</w:tcPr>
...
</w:tc></pre>
<h5>Cells merged horizontally</h5>
<pre><w:tc>
<w:tcPr>
<w:gridSpan w:val="2"/> <span class="comm">// It works like "colspan" in HTML.</span>
</w:tcPr>
...
</w:tc></pre>
<h4> <a name="msoffice_docx_headers" id="msoffice13"></a>Headers and footers (DOCX)</h4>
<p>There are 3 types of headers and footers in Microsoft Word: Default, Even (for even-numbered pages only) and First (for the first page only). The Even and First types are optional.</p>
<p>Each section of the document may have its own set of headers/footers of the 3 types, but by default a new section has its headers/footers linked to those of the previous section.</p>
<p>Each header and footer is saved in a separate XML file. If no header/footer is defined for the document, then there are no header or footer XML files. Even and First headers are optional: they may not be defined for a section, and then have no corresponding XML files.</p>
<p>Example of header and footer files: <em>"word/header1.xml"</em> and <em>"word/footer1.xml"</em>.</p>
<p>The actual types and locations of headers and footers are defined in the main document <em>"word/document.xml"</em>, in the section's properties.</p>
<pre><w:sectPr>
<w:headerReference w:type="default" r:id="rId13"/>
<w:footerReference w:type="default" r:id="rId14"/>
<w:headerReference w:type="first" r:id="rId15"/>
<w:footerReference w:type="first" r:id="rId16"/>
<w:headerReference w:type="even" r:id="rId17"/>
<w:footerReference w:type="even" r:id="rId18"/>
</w:sectPr></pre>
<h5>Relation files</h5>
<p>The information related to r:id is stored in the file <em>"word/_rels/document.xml.rels"</em></p>
<pre><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId8" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/>
<Relationship Id="rId13" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/header" Target="header1.xml"/>
...
</Relationships></pre>
<p>Locations also appear in the file <em>"[Content_Types].xml"</em>, in elements like:
</p>
<pre><Override PartName="/word/header1.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.header+xml"/>
// <span class="comm">and</span>
<Override PartName="/word/footer1.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.footer+xml"/></pre>
<p>Some referenced headers/footers may have no actual file because they contain no data.</p>
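<p>The link between a section and its header/footer files can be followed in two steps: read the r:id values in <code><w:sectPr></code>, then look them up in the relation file. The Python sketch below illustrates this; the function name <code>header_footer_targets</code> is hypothetical and the sample XML is a shortened version of the excerpts above.</p>

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
RELS = "http://schemas.openxmlformats.org/package/2006/relationships"

def header_footer_targets(sect_pr_xml, rels_xml):
    """Map pairs such as ('header', 'default') to the target path found in the rels."""
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(rels_xml).iter(f"{{{RELS}}}Relationship")
    }
    result = {}
    for ref in ET.fromstring(sect_pr_xml):
        for kind in ("header", "footer"):
            if ref.tag == f"{{{W}}}{kind}Reference":
                key = (kind, ref.get(f"{{{W}}}type"))
                # Targets are relative to the "word/" folder; None if the id is unknown
                result[key] = targets.get(ref.get(f"{{{R}}}id"))
    return result

sect_pr = (
    f'<w:sectPr xmlns:w="{W}" xmlns:r="{R}">'
    '<w:headerReference w:type="default" r:id="rId13"/>'
    '<w:footerReference w:type="default" r:id="rId14"/>'
    '</w:sectPr>'
)
rels = (
    f'<Relationships xmlns="{RELS}">'
    f'<Relationship Id="rId13" Type="{R}/header" Target="header1.xml"/>'
    f'<Relationship Id="rId14" Type="{R}/footer" Target="footer1.xml"/>'
    '</Relationships>'
)
print(header_footer_targets(sect_pr, rels))
```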
<h5>Example of header source:</h5>
<pre><?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:hdr ...>
<w:p>
<w:pPr><w:pStyle w:val="my_header"/></w:pPr>
<w:r>
<w:t><span class="txt">here is the text of the header</span></w:t>
</w:r>
</w:p>
</w:hdr></pre>
<h4><a name="msoffice_docx_comments" id="msoffice14"></a>Comments, footnotes and endnotes (DOCX)</h4>
<p> Like headers and footers, they are also saved in separate XML files.</p>
<h4><a name="msoffice_docx_bookmarks" id="msoffice15"></a>Bookmarks</h4>
<pre><w:p>
<w:r><w:t xml:space="preserve"><span class="txt">Here is a </span></w:t></w:r>
<w:bookmarkStart w:id="0" w:name="mybookmark"/>
<w:r><w:t><span class="txt">bookmark</span></w:t></w:r>
<w:bookmarkEnd w:id="0"/>
<w:r><w:t>.</w:t></w:r>
</w:p></pre>
<h4><a name="msoffice_docx_textboxes" id="msoffice16"></a>Textboxes</h4>
<pre><w:p>
<w:r>
<mc:AlternateContent> <span class="comm">// indicates a list of different possible choices</span>
<mc:Choice Requires="wps"> <span class="comm">// first choice and its condition</span>
<w:drawing>
<wp:anchor ...>
...
</wp:anchor>
</w:drawing>
</mc:Choice>
<mc:Fallback> <span class="comm">// last choice if no condition is true</span>
<w:pict>
<v:shapetype ...>
...
</v:shapetype>
<v:shape ...>
<v:textbox ...>
<w:txbxContent>
<w:p> <w:r> <w:t><span class="txt">Here is a text box.</span></w:t> </w:r> </w:p>
</w:txbxContent>
</v:textbox>
</v:shape>
</w:pict>
</mc:Fallback>
</mc:AlternateContent>
</w:r>
</w:p></pre>
<h4><a name="msoffice_docx_charts" id="msoffice5"></a>Charts (DOCX)</h4>
<p>The first chart is saved under <em>"word/charts/chart1.xml"</em>, and so on for the next ones. The XML file of the chart contains a copy of the data used for the chart.</p>
<p> If the chart is designed manually, then <em>"chart1.xml"</em> also contains references to cells of an MS Excel file that MS Word uses for managing the series.</p>
<p> The Excel file is embedded in the DOCX file, for example: <em>"word/embeddings/Worksheet_Microsoft_Excel1.xlsx"</em>. The path of the Excel file is saved in <em>"word/charts/_rels/chart1.xml.rels"</em>.
Nevertheless, the references to that Excel file are optional and can be deleted from the XML of the chart.</p>
<p>The titles of the chart, the axes and the series are saved in <em>"chart1.xml"</em>. Other custom text boxes are saved in a shape file, for example: <em>"word/drawings/drawing1.xml"</em>.</p>
<p>Example of a series saved in the XML (the tags are different for an XY series):</p>
<pre><c:ser>
<c:idx val="0"/>
<c:order val="0"/>
<c:tx>
<c:strRef>
<c:f>Sheet1!$A$2</c:f>
<c:strCache><c:ptCount val="1"/><c:pt idx="0"><c:v><span class="txt">Here is the name of the Series</span></c:v></c:pt></c:strCache>
</c:strRef>
</c:tx>
<c:spPr>
<a:solidFill><a:srgbClr val="9999FF"/></a:solidFill>
<a:ln w="12700"><a:solidFill><a:srgbClr val="000000"/></a:solidFill><a:prstDash val="solid"/></a:ln>
</c:spPr>
<c:invertIfNegative val="0"/>
<c:cat>
<c:strRef>
<c:f>Sheet1!$B$1:$E$1</c:f>
<c:strCache>
<c:ptCount val="4"/>
<c:pt idx="0"><c:v><span class="txt">Category A</span></c:v></c:pt>
<c:pt idx="1"><c:v><span class="txt">Category B</span></c:v></c:pt>
<c:pt idx="2"><c:v><span class="txt">Category C</span></c:v></c:pt>
<c:pt idx="3"><c:v><span class="txt">Category D</span></c:v></c:pt>
</c:strCache>
</c:strRef>
</c:cat>
<c:val>
<c:numRef>
<c:f>Sheet1!$B$2:$E$2</c:f>
<c:numCache>
<c:formatCode>General</c:formatCode>
<c:ptCount val="4"/>
<c:pt idx="0"><c:v><span class="txt">20.399999999999999</span></c:v></c:pt>
<c:pt idx="1"><c:v><span class="txt">27.4</span></c:v></c:pt>
<c:pt idx="2"><c:v><span class="txt">90</span></c:v></c:pt>
<c:pt idx="3"><c:v><span class="txt">20.399999999999999</span></c:v></c:pt>
</c:numCache>
</c:numRef>
</c:val>
</c:ser></pre>
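<p>Because the chart XML keeps a cached copy of the data, a series can be read without opening the embedded Excel file. The Python sketch below shows one way to do it for a series shaped like the example above (string categories, numeric values); the names <code>cached_points</code> and <code>read_series</code> are hypothetical, and XY series would need different paths.</p>

```python
import xml.etree.ElementTree as ET

NS = {"c": "http://schemas.openxmlformats.org/drawingml/2006/chart"}

def cached_points(series, path):
    """Return the <c:v> texts of the cached points found under `path`."""
    return [point.findtext("c:v", namespaces=NS) for point in series.findall(path, NS)]

def read_series(series):
    """Return (series name, {category: value}) from the cached data of a <c:ser>."""
    name = series.findtext("c:tx/c:strRef/c:strCache/c:pt/c:v", namespaces=NS)
    categories = cached_points(series, "c:cat/c:strRef/c:strCache/c:pt")
    values = [float(v) for v in cached_points(series, "c:val/c:numRef/c:numCache/c:pt")]
    return name, dict(zip(categories, values))

sample = """
<c:ser xmlns:c="http://schemas.openxmlformats.org/drawingml/2006/chart">
  <c:tx><c:strRef><c:f>Sheet1!$A$2</c:f>
    <c:strCache><c:ptCount val="1"/><c:pt idx="0"><c:v>My series</c:v></c:pt></c:strCache>
  </c:strRef></c:tx>
  <c:cat><c:strRef><c:f>Sheet1!$B$1:$C$1</c:f>
    <c:strCache><c:ptCount val="2"/>
      <c:pt idx="0"><c:v>Category A</c:v></c:pt>
      <c:pt idx="1"><c:v>Category B</c:v></c:pt>
    </c:strCache>
  </c:strRef></c:cat>
  <c:val><c:numRef><c:f>Sheet1!$B$2:$C$2</c:f>
    <c:numCache><c:formatCode>General</c:formatCode><c:ptCount val="2"/>
      <c:pt idx="0"><c:v>20.4</c:v></c:pt>
      <c:pt idx="1"><c:v>27.4</c:v></c:pt>
    </c:numCache>
  </c:numRef></c:val>
</c:ser>
""".strip()

series_name, series_data = read_series(ET.fromstring(sample))
print(series_name, series_data)
```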
<p> </p>
<h4><a name="msoffice_docx_pictures" id="msoffice6"></a>Pictures (DOCX)</h4>
<p>The binary content is saved as a file in <em>"word/media/"</em>. </p>
<p>Image link saved into <em>"word/_rels/document.xml.rels"</em>: <code><Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/></code></p>
<p>There are two ways to insert a picture.</p>
<h5>1) VML (old way)</h5>
<p> (short synopsis)</p>
<pre><w:pict>
<v:shapetype ...>
...
</v:shapetype>
<v:shape style="width:89.25pt;height:119.25pt" ...> <span class="comm">// this element contains the size of the picture</span>
<v:imagedata r:id="rId6" o:title=""/> <span class="comm">// this element contains the link to the picture internal file</span>
</v:shape>
</w:pict>
</pre>
<h5>2) DrawingML (the new way that includes 2D/3D effects)</h5>
<p> (short synopsis)</p>
<pre><w:drawing>
<wp:inline ...>
<wp:extent cx="1130400" cy="1512000" /> <span class="comm">// this element gives the size of the shape box that contains the picture</span>
<wp:docPr id="1" name="Image 1" descr="My description" title="<span class="txt">My title</span>"/> <span class="comm">// this optional element gives the description and title of the image</span>
<wp:docPr id="1" name="Image 0" descr="2884414-my_picture.jpg">
<a:hlinkClick xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" r:id="rId4" tooltip="<span class="txt">My tooltip</span>"/> <span class="comm">// this optional element gives the URL and tooltip of the link if any. In Word 2007 you cannot customize the title and description</span>
</wp:docPr>
<a:graphic ...>
<a:graphicData ...>
<pic:pic ...>
<pic:blipFill>
<a:blip r:embed="rId4" /> <span class="comm">// this element contains the link to the picture internal file</span>
</pic:blipFill>
<pic:spPr ...>
<a:xfrm>
<a:off x="0" y="0" />
<a:ext cx="1130400" cy="1512000" /> <span class="comm">// this element gives the size of the picture inside its shape box, it can be rescaled to fit in the shape box</span>
</a:xfrm>
<a:prstGeom prst="rect"><a:avLst /></a:prstGeom>
<a:noFill />
<a:ln><a:noFill /></a:ln>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing></pre>
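<p>The <code>cx</code> and <code>cy</code> values in DrawingML are expressed in EMU (English Metric Units): 914400 EMU per inch, which gives 12700 EMU per point and 360000 EMU per centimeter. A short Python sketch of the conversion, using the values from the synopsis above (the function names are hypothetical):</p>

```python
EMU_PER_POINT = 12700   # 914400 EMU per inch / 72 points per inch
EMU_PER_CM = 360000     # 914400 EMU per inch / 2.54 cm per inch

def emu_to_points(emu):
    """Convert a DrawingML length in EMU to points."""
    return emu / EMU_PER_POINT

def emu_to_cm(emu):
    """Convert a DrawingML length in EMU to centimeters."""
    return emu / EMU_PER_CM

# Values of <wp:extent cx="1130400" cy="1512000"/> from the synopsis
width, height = 1130400, 1512000
print(f"{emu_to_cm(width):.2f} cm x {emu_to_cm(height):.2f} cm")
print(f"{emu_to_points(width):.2f} pt x {emu_to_points(height):.2f} pt")
```

<p>For this example the result is about 3.14 cm by 4.20 cm, that is roughly 89 pt by 119 pt, close to the size used in the VML example.</p>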
<h4> </h4>
<h3><a name="msoffice_xlsx" id="msoffice3"></a>Microsoft Excel spreadsheet (XLSX)</h3>
<h4>General</h4>
<p>An Excel workbook can have one or several worksheets. The contents of cells are saved in worksheets.
Worksheet files are named <em>"xl/worksheets/sheet1.xml"</em>, then <em>sheet2.xml</em>, <em>sheet3.xml</em>...</p>
<p> The file names are internal names, not the names defined in Excel by the user. It seems that there is always at least one worksheet file named <em>"sheet1.xml"</em>.</p>
<p>All string values of cells are stored in the file <em>"xl/sharedStrings.xml"</em>. A cell actually contains the index of its string in the sharedStrings.xml file. This separation is likely to make merging an Excel sheet difficult. </p>
<p>All sheets of the workbook are listed in the file <em>"xl/workbook.xml"</em>.</p>
<p>Synopsis of a sheet file like <em>"xl/worksheets/sheet1.xml"</em> (XLSX)</p>
<pre><worksheet>
...
<sheetData>
<row r="2" spans="2:2" ht="90">
<span class="comm">// A range of one row in which several cells are defined</span>
<c r="B2" s="1" t="s">
<span class="comm"> /* Definition of a cell:
* Attribute r is the address of the cell in the sheet (format A1). This attribute is optional.
* Attribute s is the style of the cell (the format): a zero-based index into the <cellXfs> list of the file 'xl/styles.xml'.
* Attribute t is the type of data, by default it is numerical
* t="s" means that the displayed value is a string, the saved value is the index of the string taken in file "sharedStrings.xml".
* b: boolean, d: date, e: error, n: number, s: shared string, inlineStr: inline string, str: string as the result of a formula
*/
</span> <f>B13+B14</f> <span class="comm">// The formula if any. If there is no formula, this tag is absent. The type of <c> is the type of the result.</span>
<v>0</v> <span class="comm">// The inner value without formatting.
// If t="s" then the value is in fact the index of the string in the "xl/sharedStrings.xml" file.
// If t="str" then the value is the string result of the formula.</span>
</c>
<c r="C2" s="1" t="inlineStr">
<span class="comm"> /* The type "inlineStr" is a special value that allows the string to be stored in the cell instead of in the file "sharedStrings.xml".
* It is used by OpenTBS to transfer strings containing TBS fields from "sharedStrings.xml" into the XML of the sheet.
*/
</span> <is><t>This is a string</t></is>
</c>
</row>
</sheetData>
</worksheet></pre>
<p>Synopsis of the Shared String file <em>"xl/sharedStrings.xml"</em>: (XLSX)</p>
<pre><sst>
<si>
<t><span class="txt">value or text</span></t>
</si>
<si>
<t><span class="txt">value or text</span></t>
</si>
</sst></pre>
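<p>The link between the two files can be illustrated with a minimal Python sketch. It is not how OpenTBS works internally: the function names and the sample data are hypothetical, and the sample assumes the standard SpreadsheetML namespace, which the synopsis above omits.</p>

```python
import xml.etree.ElementTree as ET

# Namespace used by real SpreadsheetML files (omitted in the synopsis above).
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

def load_shared_strings(sst_xml):
    """Return the strings of sharedStrings.xml, in index order."""
    root = ET.fromstring(sst_xml)
    # An <si> may hold a single <t> or several rich-text runs;
    # joining every <t> descendant covers both cases.
    return ["".join(t.text or "" for t in si.iter("{%s}t" % NS["m"]))
            for si in root.findall("m:si", NS)]

def cell_values(sheet_xml, shared):
    """Map each cell address to its resolved value (kept as text)."""
    values = {}
    root = ET.fromstring(sheet_xml)
    for c in root.iterfind(".//m:sheetData/m:row/m:c", NS):
        cell_type = c.get("t", "n")        # numeric by default
        v = c.find("m:v", NS)
        if cell_type == "s":               # <v> is an index in sharedStrings.xml
            value = shared[int(v.text)]
        elif cell_type == "inlineStr":     # the string is stored in <is><t>
            value = "".join(t.text or "" for t in c.iterfind("m:is//m:t", NS))
        else:                              # b, d, e, n, str: raw content of <v>
            value = v.text if v is not None else None
        values[c.get("r")] = value         # attribute r is optional in real files
    return values

# Hypothetical sample data shaped like the two synopses above.
SST = """<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <si><t>first</t></si>
  <si><t>second</t></si>
</sst>"""

SHEET = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="2">
      <c r="A2" t="s"><v>1</v></c>
      <c r="B2"><f>B13+B14</f><v>0</v></c>
      <c r="C2" t="inlineStr"><is><t>This is a string</t></is></c>
    </row>
  </sheetData>
</worksheet>"""

print(cell_values(SHEET, load_shared_strings(SST)))
```

<p>Cell A2 resolves to the second shared string because its <em>v</em> value is the zero-based index 1, while cell C2 is read directly from the sheet.</p>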
<h4><a name="msoffice_xlsx_headers" id="msoffice10"></a>Headers and footers (XLSX)</h4>
<p> As in DOCX, there can be up to 6 headers/footers for each sheet.<br>
Headers and footers are saved in the sheet file.</p>
<pre><headerFooter differentOddEven="1" differentFirst="1">
<oddHeader><span class="txt">My header for odd page in this sheet</span></oddHeader>
<evenHeader><span class="txt">My header for even page in this sheet</span></evenHeader>
<firstHeader><span class="txt">My header for first page in this sheet</span></firstHeader>
</headerFooter></pre>
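<p>As an illustration only, the following Python sketch collects the headers and footers of a sheet. The function name and the sample sheet are hypothetical. The six element names are those defined by SpreadsheetML; the synopsis above shows only the three header elements.</p>

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# The six possible children of <headerFooter>.
PARTS = ("oddHeader", "oddFooter", "evenHeader", "evenFooter",
         "firstHeader", "firstFooter")

def headers_and_footers(sheet_xml):
    """Return a dict of the headers and footers present in a sheet file."""
    root = ET.fromstring(sheet_xml)
    hf = root.find("{%s}headerFooter" % MAIN)
    if hf is None:
        return {}
    found = {}
    for name in PARTS:
        el = hf.find("{%s}%s" % (MAIN, name))
        if el is not None:
            found[name] = el.text or ""
    return found

# Hypothetical sample sheet with two of the six possible entries.
SHEET = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData/>
  <headerFooter differentOddEven="1" differentFirst="1">
    <oddHeader>My header for odd page in this sheet</oddHeader>
    <firstHeader>My header for first page in this sheet</firstHeader>
  </headerFooter>
</worksheet>"""

print(headers_and_footers(SHEET))
```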
<h4><a name="msoffice_xlsx_pictures" id="msoffice8"></a>Pictures (XLSX)</h4>
<p>The binary content of each picture is saved as a file in <em>"xl/media/"</em>. </p>
<p>The presence of pictures in a sheet is indicated by a single <code><drawing></code> entity at the bottom of the <code><worksheet></code> entity. The <code><drawing></code> entity carries a reference id, which is defined in the Rels file of the sheet. The properties of all the pictures in the sheet are saved in a third XML file.</p>
<h5>File "xl/worksheets/sheet1.xml"</h5>
<pre><worksheet ...>
...
<drawing r:id="rId2"/> <span class="comm">// (only one entity for all pictures in the sheet)</span>
</worksheet> </pre>
<h5>File "xl/worksheets/_rels/sheet1.xml.rels"</h5>
<pre><Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing" Target="../drawings/drawing1.xml"/>
<span class="comm">// (only one entity for all pictures in the sheet)</span></pre>
<h5>File "xl/drawings/drawing1.xml"</h5>
<p> (short synopsis, one entity per picture in the sheet)</p>
<pre><xdr:twoCellAnchor ...>
<xdr:pic>
<xdr:nvPicPr>
<xdr:cNvPr id="2" name="Image 1" descr="xxx my description" title="xxx my title"/>
</xdr:nvPicPr>
<xdr:blipFill>
<a:blip xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" r:embed="rId1">
</a:blip>
</xdr:blipFill>
</xdr:pic>
</xdr:twoCellAnchor>
</pre>
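<p>The chain of three files can be followed with a short Python sketch. The function names and the sample data are hypothetical, the namespace URIs are the standard OOXML ones (most are omitted in the synopses above), and real files contain many more elements.</p>

```python
import posixpath
import xml.etree.ElementTree as ET

NS = {
    "m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main",
    "r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
    "rel": "http://schemas.openxmlformats.org/package/2006/relationships",
    "xdr": "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}

def drawing_path(sheet_path, sheet_xml, rels_xml):
    """Return the archive path of the drawing file of a sheet, or None."""
    drawing = ET.fromstring(sheet_xml).find("m:drawing", NS)
    if drawing is None:
        return None
    rid = drawing.get("{%s}id" % NS["r"])
    for rel in ET.fromstring(rels_xml).iterfind("rel:Relationship", NS):
        if rel.get("Id") == rid:
            # Targets are relative to the folder of the sheet file.
            folder = posixpath.dirname(sheet_path)
            return posixpath.normpath(posixpath.join(folder, rel.get("Target")))
    return None

def pictures(drawing_xml):
    """List (name, description, embed id) for each picture of a drawing file."""
    result = []
    for pic in ET.fromstring(drawing_xml).iterfind(".//xdr:pic", NS):
        props = pic.find("xdr:nvPicPr/xdr:cNvPr", NS)
        blip = pic.find("xdr:blipFill/a:blip", NS)
        result.append((props.get("name"), props.get("descr"),
                       blip.get("{%s}embed" % NS["r"])))
    return result

# Hypothetical sample data shaped like the three synopses above.
SHEET = """<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <sheetData/>
  <drawing r:id="rId2"/>
</worksheet>"""

RELS = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId2"
    Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/drawing"
    Target="../drawings/drawing1.xml"/>
</Relationships>"""

DRAWING = """<xdr:wsDr
  xmlns:xdr="http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing"
  xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <xdr:twoCellAnchor>
    <xdr:pic>
      <xdr:nvPicPr>
        <xdr:cNvPr id="2" name="Image 1" descr="xxx my description" title="xxx my title"/>
      </xdr:nvPicPr>
      <xdr:blipFill>
        <a:blip xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
          r:embed="rId1"/>
      </xdr:blipFill>
    </xdr:pic>
  </xdr:twoCellAnchor>
</xdr:wsDr>"""

print(drawing_path("xl/worksheets/sheet1.xml", SHEET, RELS))
print(pictures(DRAWING))
```

<p>In a real archive, the embed id returned for each picture is itself resolved through the Rels file of the drawing, which points to the image file in <em>"xl/media/"</em>.</p>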
<h3><a name="msoffice_pptx" id="msoffice4"></a>Microsoft PowerPoint presentation (PPTX)</h3>
<h4>General</h4>
<p>Remember to set all text to <em>"Tools » Language » No check"</em> when you edit the PowerPoint presentation, otherwise some TBS fields may be split by the XML tags that PowerPoint inserts for language and spell checking.</p>
<p>Slides are listed in the file <em>"ppt/_rels/presentation.xml.rels"</em>, where an internal id is assigned to each of them.</p>
<p>The first slide almost always corresponds to the file <em>"ppt/slides/slide1.xml"</em>.</p>
<p>Synopsis of a slide file like <em>"ppt/slides/slide1.xml"</em>: (PPTX)</p>
<pre><p:sld>
<p:cSld>
<p:spTree>
<p:sp>
<p:txBody>
<a:p>
<a:pPr eaLnBrk="1" hangingPunct="1"/>
<a:r>
<a:t><span class="txt">Some text here</span></a:t>
</a:r>
</a:p>
</p:txBody>
</p:sp>
</p:spTree>
</p:cSld>
</p:sld></pre>
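<p>The following Python sketch, given for illustration only, reads the text of a slide paragraph by paragraph. The function name and the sample slide are hypothetical. The second paragraph of the sample shows a TBS field that has been split into two runs by language tags, which is the situation described in the General notes above.</p>

```python
import xml.etree.ElementTree as ET

# DrawingML namespace bound to the "a:" prefix in slide files.
A = "http://schemas.openxmlformats.org/drawingml/2006/main"

def slide_paragraphs(slide_xml):
    """Return the text of each <a:p> paragraph, with its runs joined."""
    root = ET.fromstring(slide_xml)
    return ["".join(t.text or "" for t in p.iter("{%s}t" % A))
            for p in root.iter("{%s}p" % A)]

# Hypothetical sample slide shaped like the synopsis above.
SLIDE = """<p:sld
  xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
  xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
  <p:cSld>
    <p:spTree>
      <p:sp>
        <p:txBody>
          <a:p>
            <a:r><a:t>Some text here</a:t></a:r>
          </a:p>
          <a:p>
            <a:r><a:rPr lang="en-US"/><a:t>[onshow.</a:t></a:r>
            <a:r><a:rPr lang="fr-FR"/><a:t>name]</a:t></a:r>
          </a:p>
        </p:txBody>
      </p:sp>
    </p:spTree>
  </p:cSld>
</p:sld>"""

print(slide_paragraphs(SLIDE))
```

<p>Joining the runs gives back the complete field text, but in the XML itself the field is still interrupted by tags, which is why disabling the language check in the template is preferable.</p>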
<h4><a name="msoffice_pptx_headers" id="msoffice9"></a>Headers and footers (PPTX)</h4>
<p>There can be headers and footers for each slide, and also a header and footer for the "handout" view, which is a set of several slides.</p>
<p> The header and footer of each slide are saved in the sub-file of the slide ("ppt/slides/slide1.xml" for example).</p>
<p> The header and footer of the handout are saved in the sub-file <em>"ppt/handoutMasters/handoutMaster1.xml"</em>.</p>
<h4><a name="msoffice_pptx_pictures" id="msoffice7"></a>Pictures (PPTX)</h4>
<p>The binary content of each picture is saved as a file in <em>"ppt/media/"</em>.</p>
<p>(short synopsis)</p>
<pre><p:pic>
<p:nvPicPr>
<p:cNvPr id="4" name="Image 3" descr="<span class="txt">My description</span>" title="<span class="txt">My title</span>"> <span class="comm">// this optional element gives the description and title of the image</span>
<a:hlinkClick r:id="rId2" action="ppaction://hlinkfile" tooltip="<span class="txt">My tooltip</span>"/>
</p:cNvPr>
</p:nvPicPr>
<p:blipFill>
<a:blip r:embed="rId2">
</a:blip>
</p:blipFill>
<p:spPr>
<a:xfrm>
<a:off x="2667000" y="4365104" />
<a:ext cx="3810000" cy="1200150" />
</a:xfrm>
</p:spPr>
</p:pic></pre>
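<p>As a hedged illustration, the Python sketch below reads the name, description and size of a picture. The function name and the sample are hypothetical. It relies on one fact that is not stated above: DrawingML offsets and extents are expressed in EMU (English Metric Units), with 914400 EMU per inch. The 96 dpi used for the pixel conversion is only an assumption.</p>

```python
import xml.etree.ElementTree as ET

NS = {
    "p": "http://schemas.openxmlformats.org/presentationml/2006/main",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
EMU_PER_INCH = 914400  # DrawingML length unit

def picture_info(pic_xml, dpi=96):
    """Return the name, description and pixel size of a <p:pic> entity."""
    pic = ET.fromstring(pic_xml)
    props = pic.find("p:nvPicPr/p:cNvPr", NS)
    ext = pic.find("p:spPr/a:xfrm/a:ext", NS)

    def to_px(emu):
        # Convert an EMU length to pixels at the assumed resolution.
        return round(int(emu) * dpi / EMU_PER_INCH)

    return {
        "name": props.get("name"),
        "descr": props.get("descr"),
        "width_px": to_px(ext.get("cx")),
        "height_px": to_px(ext.get("cy")),
    }

# Hypothetical sample shaped like the synopsis above.
PIC = """<p:pic
  xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
  xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
  xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
  <p:nvPicPr>
    <p:cNvPr id="4" name="Image 3" descr="My description" title="My title"/>
  </p:nvPicPr>
  <p:blipFill>
    <a:blip r:embed="rId2"/>
  </p:blipFill>
  <p:spPr>
    <a:xfrm>
      <a:off x="2667000" y="4365104"/>
      <a:ext cx="3810000" cy="1200150"/>
    </a:xfrm>
  </p:spPr>
</p:pic>"""

print(picture_info(PIC))
```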
</div>
</body>
</html>