<pre class='metadata'>
Title: HTML Ruby Markup Extensions
Shortname: html-ruby-extensions
Level: None
Status: ED
Group: htmlwg
Repository: w3c/html-ruby
TR: https://www.w3.org/TR/html-ruby-extensions/
ED: https://w3c.github.io/html-ruby/
Editor: Florian Rivoal, Invited Expert, https://florian.rivoal.net, w3cid 43241
Abstract:
Ruby, a form of interlinear annotation,
consists of short runs of text presented alongside the base text.
They are typically used in East Asian documents
to indicate pronunciation or to provide a short annotation.
This specification revises and extends the markup model established by HTML to express ruby.
Complain About: accidental-2119 yes, missing-example-ids yes
Markup Shorthands: markdown yes
Status Text:
This document is developed
under the terms of the <a href="https://www.w3.org/2022/02/ruby-agreement">Agreement on HTML Ruby Markup</a>
between W3C and the WHATWG.
</pre>
<pre class="anchors">
spec: html; urlPrefix: https://html.spec.whatwg.org/multipage/
type:dfn; for:/; text:text; url: dom.html#text-content
type:dfn; for:/; text:categories; url: dom.html#concept-element-categories
</pre>
<pre class=link-defaults>
spec:css-ruby-1; type:value; text:ruby-base
</pre>
<pre class=biblio>
{
"QA-RUBY": {
"href": "https://www.w3.org/International/questions/qa-ruby",
"title": "What is ruby?",
"publisher": "W3C",
"authors": [ "Richard Ishida" ]
},
"UNIFIED-RUBY": {
"href": "https://fantasai.inkedblade.net/weblog/2011/ruby/",
"title": "Towards a Unified Ruby Model",
"authors": [ "Elika J. Etemad" ]
}
}
</pre>
<h2 id=intro>
Introduction</h2>
<i>This section is non-normative</i>
<dfn export>Ruby</dfn> is a name for small annotations that are rendered alongside base text.
This is especially useful for Japanese and other East Asian content
(ruby may be called <i>furigana</i> in Japanese).
It is most often used to provide a reading (pronunciation guide).
Ruby text is usually presented alongside the base text,
using a smaller typeface.
The name ruby originated from a named font size
(about half the size of the normal 10 point font)
used by British typesetters.
Typically ruby is used in East Asian scripts
to provide phonetic transcriptions of obscure and little known characters,
characters that the reader
(such as a child or a foreigner learning to write)
is not expected to be familiar with,
or characters that have multiple readings
which can't be determined by the context
(e.g. some Japanese names).
For example it is widely used in educational materials and children’s texts,
but it can also be readily found in many types of literature and signage.
It is also occasionally used to convey information about the meaning of ideographic characters.
<figure>
<img src="images/ruby-shinkansen.png"
width=320 height=70
alt="An example of annotating text with ruby:
the Japanese word for bullet-train is written with 3 kanji characters,
written horizontally, left to right.
Their pronunciation is indicated by 6 hiragana characters
placed immediately above.
The annotation is half the font size of the base it annotates.">
</figure>
Specialized markup, as defined in this document, is necessary
to describe the semantic associations between the base text and its annotations,
to enable its various visual layouts
as well as correct non-visual presentation and processing.
Note: [[CSS-RUBY-1 inline]] defines the ruby layout model in CSS,
enabling the ruby presentation described above
and frequently desired variations.
<h3 id=relations>
Background and Relation to the [[HTML inline]]</h3>
A set of HTML elements to mark up ruby has evolved over the years in multiple specifications,
starting from the 2001 [[RUBY inline]] specification
all the way to the current [[HTML inline]],
with different incarnations varying in flexibility, complexity, or verbosity.
While concise and effective in simple cases,
the ruby model described in the [[HTML inline]]
(at the time of writing this document)
is insufficiently expressive to handle all use cases well.
Moreover, some aspects of it are also not interoperably implemented;
yet implementing them would not completely address the remaining use cases.
Additionally, these aspects are at odds with the CSS layout model.
This specification is written to promote--
and guide implementations of--
a revised and extended model for ruby,
in order to more completely address the needs of ruby on the Web platform.
This effort is undertaken with the <a href="https://www.w3.org/2022/02/ruby-agreement">agreement of W3C and the WHATWG</a>.
[[#diff-html]] summarizes the main differences,
and provides a brief overview of why these differences are desirable.
Note that the semantics of the subset of the [[HTML inline]]
that is interoperably implemented
remain unchanged in this extension specification,
making the ruby model described here
backwards compatible
with any ruby content
supported by existing user agents.
<div class=advisement>
In this document,
advisement blocks like this one
indicate how normative parts of the document
relate to and replace various parts of the [[HTML inline]].
</div>
It is hoped that the changes described here will in time
be adopted by the WHATWG
and integrated into the [[HTML inline]],
reducing the delta between the two documents.
<h2 id=elements>
HTML Elements for Ruby</h2>
<div class=advisement>
This section and its subsections replace and extend
sections [[HTML/text-level-semantics#the-ruby-element]] through [[HTML/text-level-semantics#the-rp-element]]
of the [[HTML inline]].
</div>
<h3 id=the-ruby-element>
The <dfn element><code>ruby</code></dfn> element</h3>
<dl class="def">
<dt><a spec=html>Categories</a>:
<dd><a>Flow content</a>.
<dd><a>Phrasing content</a>.
<dd><a>Palpable content</a>.
<dt><a spec=html>Contexts in which this element can be used</a>:
<dd>Where <a>phrasing content</a> is expected.
<dt><a spec=html>Content model</a>:
<dd>See prose.
<dt><a spec=html>Content attributes</a>:
<dd><a spec=html>Global attributes</a>
<dt> <a spec=html>Accessibility considerations</a>:
<dd><a href="https://w3c.github.io/html-aria/#el-ruby">For authors</a>.
<dd><a href="https://w3c.github.io/html-aam/#el-ruby">For implementers</a>.
<dt><a spec=html>DOM interface</a>:
<dd>Uses {{HTMLElement}}.
</dl>
The <{ruby}> element <a spec=html>represents</a> one or more ranges of phrasing content
paired with associated ruby annotations.
Ruby annotations are short runs of annotation text presented alongside base text.
Although primarily used in East Asian typography as a guide for pronunciation,
they can also be used for other associated information.
Ruby is most commonly presented as interlinear annotations,
although other presentations are also used.
<span class=non-normative>A more complete introduction to ruby and its rendering
can be found in W3C’s [[QA-RUBY inline]] article
and in [[CSS-RUBY-1 inline]].</span>
<div class="example" id=basic-ruby-ex>
This example shows Japanese text,
with ruby markup used to annotate the ideographs with their pronunciation.
<pre><code highlight="html">
<ruby>霧<rt>きり</rt></ruby>とも<ruby>霞<rt>かすみ</rt></ruby>とも
</code></pre>
A typical rendering would be something akin to the following image:
<figure>
<img
src="images/composition.png"
width="280" height="75"
alt="A short piece of horizontal Japanese text,
with the reading of each kanji character
indicated by small hiragana characters above it.
Each group of hiragana is horizontally centered
relative to the kanji it annotates.">
</figure>
</div>
The content model of <{ruby}> elements consists of
one or more of the following [=ruby segment=] sequences:
<ol>
<li>
One or more <a>phrasing content</a> nodes
or <{rb}> elements
(or a combination)
<a spec=html>representing</a> the base-level content being annotated
(the <dfn local-lt="base range">ruby base range</dfn>).
<li>
One or more <{rt}> or <{rtc}> elements
(or a combination)
<a spec=html>representing</a> any annotations associated
with the preceding base content,
where each <{rtc}> element or sequence of <{rt}> elements
<a spec=html>represents</a> one independent level of annotation
(a <dfn local-lt="annotation range">ruby annotation range</dfn>).
Each [=annotation range=],
and the [=annotation units=] within each range,
can optionally be preceded by, followed by, or interleaved with
individual <{rp}> elements.
(The optional <{rp}> element can be used
to add presentational content such as parentheses,
which can be useful when rendering annotations inline,
including as a fallback when ruby layout is not supported.)
</ol>
Note: For authoring convenience,
the internal ruby elements <{rb}>, <{rt}>, <{rtc}>, and <{rp}>
have <a class=allow-2119 href="#optional-tags">optional end tags</a>.
<div class="example" id=ruby-optional-tag-ex>
In Taiwan,
phonetic annotations for Chinese text are typically provided
using Zhuyin characters (also known as Bopomofo).
In mainland China,
phonetic annotations are typically provided
in Latin characters, using Pinyin transcription.
In this example, both are provided:
<figure>
<img
src="images/zhuyin-mei.png"
width="80" height="84"
alt="“Beautiful” in Chinese,
with both pinyin and bopomofo annotations.">
</figure>
<pre><code highlight="html">
<ruby lang=zh-TW>
<rb>美</rb><rtc><rt>ㄇㄟˇ</rt></rtc><rtc lang=zh-Latn><rt>měi</rt></rtc>
</ruby>
</code></pre>
<p>Certain features of HTML ruby allow for simpler markup:
<ul>
<li>End tags can be omitted.
<li>
Text contained directly by a <{ruby}> element
implicitly represents a [=ruby base unit=]
(as if it were contained in an <{rb}> element).
<li>
Consecutive <{rt}> children of a <{ruby}> element
are implicitly grouped into a [=ruby annotation range=]
(as if they were contained in an <{rtc}> element).
<li>
Text contained directly by an <{rtc}> element
implicitly represents a [=ruby annotation unit=].
</ul>
In effect,
the above example is equivalent
(in meaning, though not in the DOM it produces)
to the following:
<pre><code highlight="html">
<ruby lang=zh-TW>美<rt>ㄇㄟˇ<rtc lang=zh-Latn>měi</ruby>
</code></pre>
</div>
Note: The [[CSS-RUBY-1 inline]] enables authors
to control the rendering of the HTML <{ruby}> element and its contents,
supporting a variety of layouts based on the same markup.
<div class="example" id=ruby-bopomofo-ex>
Three rendering styles are commonly used with Zhuyin (Bopomofo) characters.
(Annotations here are shown in blue for clarity,
though in actual uses there would be no color distinction.)
When the text is written vertically,
the phonetic annotations are rendered to the right,
along the base text:
<figure>
<img
src="images/zhuyin-vert.png"
width="87" height="132"
alt="A Chinese word composed of two characters, written vertically.
To the right of each character,
phonetic annotations appear,
written vertically.">
</figure>
In horizontal writing,
they are usually also typeset to the right,
in this case sandwiched between individual base characters:
<figure>
<img src="images/zhuyin.png"
width="174" height="66"
alt="A Chinese word composed of two characters,
written horizontally.
To the right of each character,
phonetic annotations appear,
written vertically.">
</figure>
However, sometimes Zhuyin annotations are instead typeset
above horizontal base text:
<figure>
<img src="images/zhuyin-above.png"
width="125" height="92"
alt="A Chinese word composed of two characters,
written horizontally.
Above each character,
phonetic annotations appear,
written horizontally.">
</figure>
These differences are stylistic,
not semantic,
and therefore share the same markup:
<pre lang="zh-TW"><code highlight="html">
<ruby lang=zh-TW><rb>電<rb>腦<rt>ㄉㄧㄢˋ<rt>ㄋㄠˇ</ruby>
</code></pre>
</div>
<h4 id="ruby-pairing">
Ruby Segmentation and Pairing</h4>
Within a ruby element,
content is parcelled into a series of ruby segments.
Ignoring <a spec=html>inter-element whitespace</a> and <{rp}> elements,
each <dfn>ruby segment</dfn> consists of:
<ul>
<li>
One [=ruby base range=]:
zero or more <dfn local-lt="base unit">ruby base units</dfn>,
each of which is either a DOM range containing a single child <{rb}> element
or a maximal DOM range of child content
that does not contain a child <{rb}> element.
<li>
Zero or more [=ruby annotation ranges=],
each a DOM range corresponding to either
a single <{rtc}> element
or to a maximal sequence of consecutive <{rt}> elements.
The [=ruby annotation range=] is further parcelled
into a sequence of <dfn local-lt="annotation unit">ruby annotation units</dfn>:
if it consists of a sequence of <{rt}> elements,
then each such element is an individual [=ruby annotation unit=];
if it consists of an <{rtc}> element,
then each of its child <{rt}> elements
and each maximal DOM range of non-<{rt}> child content
is a [=ruby annotation unit=].
</ul>
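<div class="example" id=ruby-segmentation-sketch-ex>
The segmentation rules above can be sketched in a few lines of Python.
This is a minimal illustration, not a normative algorithm:
it assumes the children of a <{ruby}> element have already been reduced
to hypothetical `(kind, content)` tuples,
where `content` is a string,
except for `rtc`, where it is a list of child tuples.
Whitespace-only text is treated as inter-element whitespace,
and the function names are made up for this example.

```python
def significant(nodes):
    """Drop <rp> elements and whitespace-only text, which segmentation ignores."""
    return [(kind, content) for kind, content in nodes
            if kind != "rp" and not (kind == "text" and not content.strip())]


def rtc_units(children):
    """Split the children of an <rtc> into ruby annotation units."""
    units, run = [], []
    for kind, content in significant(children):
        if kind == "rt":
            if run:  # close the pending run of non-<rt> content
                units.append("".join(run))
                run = []
            units.append(content)
        else:
            run.append(content)
    if run:
        units.append("".join(run))
    return units


def segment_ruby(children):
    """Parcel the children of a <ruby> element into ruby segments."""
    segments, current, prev = [], None, None
    for kind, content in significant(children):
        is_annotation = kind in ("rt", "rtc")
        # Base content that follows annotations opens a new segment.
        if current is None or (not is_annotation and current["annotations"]):
            current = {"bases": [], "annotations": []}
            segments.append(current)
            prev = None
        if kind == "rb":
            current["bases"].append(content)  # each <rb> is its own base unit
            prev = "rb"
        elif kind == "rt":
            if prev == "rt":  # consecutive <rt> share one annotation range
                current["annotations"][-1].append(content)
            else:
                current["annotations"].append([content])
            prev = "rt"
        elif kind == "rtc":
            current["annotations"].append(rtc_units(content))
            prev = "rtc"
        else:
            if prev == "run":  # extend the maximal run of non-<rb> content
                current["bases"][-1] += content
            else:
                current["bases"].append(content)
            prev = "run"
    return segments
```

Each returned segment lists its base units under `bases`
and one list of annotation units per annotation range under `annotations`.
</div>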
<div class="example" id=ruby-combine-ex>
Annotating text character by character is also typical in Chinese.
In this example,
each character is individually annotated in its own <{ruby}> element:
<code highlight="html" lang=zh>
<ruby>千<rt>qiān</ruby><ruby>里<rt>lǐ</ruby><ruby>之<rt>zhī</ruby><ruby>行<rt>xíng</ruby>﹐<ruby>始<rt>shǐ</ruby><ruby>於<rt>yú</ruby><ruby>足<rt>zú</ruby><ruby>下<rt>xià</ruby>。
</code>
<figure>
<img src="images/ruby-pinyin.png"
width="365" height="75"
alt="A Chinese phrase,
with each character phonetically annotated with a pinyin syllable">
</figure>
Multiple adjacent ruby segments can also be combined into the same <{ruby}> parent:
<code highlight="html" lang=zh>
<ruby>千<rt>qiān</rt>里<rt>lǐ</rt>之<rt>zhī</rt>行<rt>xíng</ruby>﹐<ruby>始<rt>shǐ</rt>於<rt>yú</rt>足<rt>zú</rt>下<rt>xià</ruby>。
</code>
</div>
The process of <dfn export>annotation pairing</dfn> associates [=ruby annotation units=]
with [=ruby base units=].
Within each [=ruby segment=],
each [=ruby base unit=] is paired with a [=ruby annotation unit=]
from each [=ruby annotation range=].
If a [=ruby annotation range=] consists of an <{rtc}> element
that contains no <{rt}> elements,
the single [=ruby annotation unit=] represented by its contents spans
(is paired with)
every [=ruby base unit=] in the [=ruby segment=].
Otherwise,
each [=ruby annotation unit=] in the [=ruby annotation range=] is paired,
in order,
with the corresponding [=ruby base unit=] in the segment’s [=ruby base range=].
<span class=w-nodev>
If there are not enough [=ruby base units=],
any remaining [=ruby annotation units=]
are assumed to be associated
with empty, hypothetical bases
inserted at the end of the [=ruby base range=].
If there are not enough [=ruby annotation units=]
in a [=ruby annotation range=],
the remaining [=ruby base units=]
are assumed to not have an annotation from that annotation level.
</span>
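<div class="example" id=ruby-pairing-sketch-ex>
The pairing rules above can be illustrated with a short Python sketch.
It is not a normative algorithm:
it assumes one segment has already been split into base units (strings)
and annotation ranges,
each range being a hypothetical dictionary with a `units` list
and an optional `spanning` flag
set for an <{rtc}> element that contains no <{rt}> elements.

```python
from itertools import zip_longest


def pair_annotations(bases, annotation_ranges):
    """Return, for each annotation level, a list of (bases, annotation) pairs."""
    levels = []
    for rng in annotation_ranges:
        units = rng["units"]
        if rng.get("spanning"):
            # An <rtc> without <rt> children: its single unit spans every base.
            levels.append([(list(bases), unit) for unit in units[:1]])
            continue
        pairs = []
        for base, unit in zip_longest(bases, units):
            if base is None:
                base = ""  # empty, hypothetical base for a surplus annotation
            # unit is None when this base has no annotation at this level
            pairs.append(([base], unit))
        levels.append(pairs)
    return levels
```

For instance, calling `pair_annotations` with the bases 旧, 金, 山,
a first range of three pinyin units,
and a second, spanning range holding “San Francisco”
pairs each character with its own syllable on the first level
and all three characters with the English name on the second.
</div>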
<div class="example" id="ruby-inlining">
In some contexts,
for example when the font size or line height are too small
for interlinear ruby to be readable,
it is desirable to inline the ruby annotation
such that it appears in parentheses after the text it annotates.
This also provides an appropriate fallback rendering
for user agents that do not support ruby layout.
However,
for compound words in Japanese particularly,
per-character inlined phonetics are awkward.
Instead,
the more natural rendering
is to place the annotation of an entire word
together after its base text.
For example,
when typeset inline,
<span lang="ja">京都市</span> (“Kyoto City”)
is expected to be rendered as
“<span lang="ja">京都市(きょうとし)</span>”,
not “<span lang="ja">京(きょう)都(と)市(し)</span>”.
This can be marked up using consecutive <{rb}> elements followed by consecutive <{rt}> elements:
<pre><code highlight="html">
<ruby><rb>京<rb>都<rb>市<rt>きょう<rt>と<rt>し</ruby>
</code></pre>
If each base character was immediately followed by its annotation in the markup
(each base-annotation pair forming its own segment),
inlining would result in the undesirable and awkward
“<span lang="ja">京(きょう)都(と)市(し)</span>”.
Note that the markup above does not automatically provide the parentheses.
Parentheses can be inserted using CSS generated content
when intentionally typesetting inline,
however they would be missing
when a UA that does not support ruby
falls back to inline layout automatically from interlinear layout.
The <{rp}> element can be inserted
to provide the appropriate punctuation for when ruby is not supported:
<pre><code highlight="html">
<ruby><rb>京<rb>都<rb>市<rp>(<rt>きょう<rt>と<rt>し<rp>)</ruby>
</code></pre>
</div>
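<div class="example" id=ruby-inlining-sketch-ex>
The effect of segmentation on inlined ruby can be shown with a small Python sketch.
It is only an illustration under simplifying assumptions:
segments are given as hypothetical `(bases, annotation_ranges)` tuples of plain strings,
and every annotation range is wrapped in its own pair of parentheses.

```python
def inline_ruby(segments, open_paren="(", close_paren=")"):
    """Render segments inline: each base range followed by its annotations."""
    out = []
    for bases, annotation_ranges in segments:
        out.append("".join(bases))
        for units in annotation_ranges:
            out.append(open_paren + "".join(units) + close_paren)
    return "".join(out)


# One segment: three consecutive bases, then three consecutive annotations.
whole_word = [(["京", "都", "市"], [["きょう", "と", "し"]])]

# Three segments: each base immediately followed by its annotation.
per_character = [(["京"], [["きょう"]]), (["都"], [["と"]]), (["市"], [["し"]])]

print(inline_ruby(whole_word))     # 京都市(きょうとし)
print(inline_ruby(per_character))  # 京(きょう)都(と)市(し)
```

Grouping the bases and annotations into a single segment
is what lets the whole reading follow the whole word.
</div>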
<h4 id="ruby-compound" class=non-normative>
Markup Patterns for Multi-Character Ruby</h4>
<i>This section is non-normative</i>
In the simplest examples,
each [=ruby base unit=] contains only a single character,
a pattern often used for character-per-character phonetic annotations.
However, [=ruby base units=] are not restricted
to containing a single character.
In some cases it may be impossible
to map an annotation to the base characters individually,
and the annotation may need to jointly apply to a group of characters.
<div class="example" id=grou-ruby-ex>
For example,
the Japanese word for “today” is written with the characters 今日,
literally “this”+“day”.
But it's pronounced きょう (kyō),
which can't be broken down
into a “this” part
and a “day” part.
Therefore phonetic ruby indicating the reading of 今日
would be marked up as follows:
<pre><code highlight="html">
<ruby>今日<rt>きょう</ruby>
</code></pre>
<figure>
<img src="images/group.png"
width="87" height="71"
alt="“きょう” annotating “今日”">
</figure>
</div>
<div class="example" id=group-ruby-ex-2>
Ruby can also be used to describe the meaning of the base text,
rather than (or in addition to) the pronunciation.
In such cases,
both the base text and the annotation
are typically made of multiple characters,
with no meaningful subdivision possible.
Here a compound ideographic word
has an English-derived synonym
(written in katakana)
given as an annotation:
<pre><code highlight="html">
<ruby>境界面<rt>インターフェース</ruby>
</code></pre>
<figure>
<img src="images/ruby-interface.png"
width=170 height=70
alt="“インターフェース” annotating “境界面”">
</figure>
Here a compound ideographic word
has its English equivalent
directly provided as an annotation:
<pre><code highlight="html">
<ruby lang="ja">編集者<rt lang="en">editor</ruby>
</code></pre>
<figure>
<img src="images/ruby-editor.png"
width=130 height=70
alt="“editor” annotating “編集者”">
</figure>
</div>
In compound words,
although phonetic annotations might correspond to individual characters,
they are sometimes nonetheless typeset to share space above the base text,
rendering similar to annotations on multi-character bases.
However, there are subtle distinctions in their rendering
that require encoding the pairing relationships within the compound word
as well as its identification as a word.
Furthermore, sharing space in this way
versus rendering each pair in its own visual “column” is a stylistic preference:
the markup needs to provide enough information to allow for both renderings
(as well as correct inlining).
<div class="example" id="jukugo-ruby">
In this example,
we will use the Japanese noun “<span lang="ja">京都市</span>”,
meaning “Kyoto City”.
Its characters are pronounced “きょう”, “と”, and “し”, respectively.
(Distinct colors are shown in these examples for clarity:
in actual usage there would be no color distinction.)
Such compound words could be rendered
with phonetic annotations placed over each character one by one.
In this style,
when an annotation is visually longer than the character it annotates,
surrounding text is pushed apart,
to make the correspondence between each character and its annotation clear.
<figure>
<img src="images/kyoto-s.png"
width="140" height="71"
alt="“Kyoto City” written in horizontal Japanese,
with phonetic annotations over each of the three characters.
The first and second character are pushed apart from each other,
as the annotation over the first one is too long to fit.">
</figure>
However, it is common to present such a word
with its annotations sharing space together
when they would otherwise create a separation in the base text,
to preserve the implication that it is a single word.
This style is called “jukugo ruby”
(“jukugo” meaning “compound word”).
<figure>
<img src="images/kyoto-m.png"
width="120" height="71"
alt="“Kyoto City” written in horizontal Japanese,
with phonetic annotations over the word.
The characters of each annotation are not aligned
to their corresponding base,
instead they are collectively aligned to the whole word.">
</figure>
Even when presenting as “jukugo ruby” though,
the annotations are not always merged.
If a line break occurs in the middle of the word,
the annotations are expected to remain associated with the correct base character.
<figure>
<img src="images/kyoto-lb.png"
width="224" height="123"
alt="“Kyoto City” written in horizontal Japanese,
broken across two lines.
The phonetic annotations displayed over the word
are paired with each base character,
and line break together.">
</figure>
Whether—and how much—the annotations are merged can vary,
and can depend on the font size,
as “jukugo ruby” only merges annotations
when at least one of them is longer than its base.
<figure>
<img src="images/kyoto-33.png"
width="119" height="64"
alt="“Kyoto City” written in horizontal Japanese,
with phonetic annotations over each of the three characters.
At 33% of the base font size,
annotations are small enough to fit their base character,
and are aligned to it.">
<figcaption>Ruby sized at 33%</figcaption>
</figure>
<figure>
<img src="images/kyoto-50.png"
width="120" height="71"
alt="“Kyoto City” written in horizontal Japanese,
with phonetic annotations over each of the three characters.
At 50% of the base font size,
the first annotation doesn't fit over its base character,
so it merges with the second one.
The third remains separate.">
<figcaption>Ruby sized at 50%</figcaption>
</figure>
<figure>
<img src="images/kyoto-60.png"
width="120" height="75"
alt="“Kyoto City” written in horizontal Japanese,
with phonetic annotations over each of the three characters.
At 60% of the base font size,
the first annotation doesn't fit over the first character,
nor do the first and second together fit over the first two characters.
All three are merged and aligned together.">
<figcaption>Ruby sized at 60%</figcaption>
</figure>
Since choosing to render as “jukugo ruby” or not is a stylistic choice,
the same markup needs to enable both--
and it needs to encode both the pairing information within the word
as well as the grouping of these pairs as a single word:
<pre><code highlight="html">
<ruby><rb>京<rb>都<rb>市<rt>きょう<rt>と<rt>し</ruby>
</code></pre>
Correct “jukugo ruby” is not possible
if all the base characters are part of a single <{rb}> element
and all the annotation text in a single <{rt}> element,
as their individual pairings would be lost.
</div>
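<div class="example" id=jukugo-merge-sketch-ex>
The dependency of merging on font size,
as depicted in the three figures above,
can be approximated by the following Python sketch.
This is a deliberately simplified model,
not the layout algorithm of [[CSS-RUBY-1 inline]] or [[JLREQ inline]]:
it assumes every character is one em wide at its own font size,
and it greedily merges an annotation with the following pairs
until the merged annotations fit over the merged bases.
The function name and the integer percentage parameter are made up for this example.

```python
def jukugo_groups(pairs, ruby_percent):
    """Group (base, annotation) pairs whose annotations share space.

    ruby_percent is the annotation font size as a percentage of the base size.
    """
    groups = []
    i = 0
    while i < len(pairs):
        j = i + 1
        while True:
            group = pairs[i:j]
            # Widths in hundredths of a base-size em, so integers suffice.
            base_width = 100 * sum(len(base) for base, _ in group)
            annotation_width = ruby_percent * sum(len(ann) for _, ann in group)
            if annotation_width <= base_width or j == len(pairs):
                break
            j += 1  # does not fit yet: merge with the next pair
        groups.append(pairs[i:j])
        i = j
    return groups


word = [("京", "きょう"), ("都", "と"), ("市", "し")]
for percent in (33, 50, 60):
    print(percent, [len(group) for group in jukugo_groups(word, percent)])
```

Under these assumptions,
the groups are three separate pairs at 33%,
the first two pairs merged at 50%,
and all three merged at 60%.
</div>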
Note: For more details on Japanese and Chinese ruby usage and rendering,
see [[JLREQ inline]]
(particularly
[[JLREQ#ruby_and_emphasis_dots|Ruby and Emphasis Dots]]
and [[JLREQ#positioning_of_jukugoruby|Appendix F]]),
[[SIMPLE-RUBY inline]],
and the section on [[CLREQ#interlinear_annotations|Interlinear annotations]] of [[CLREQ inline]].
<h3 id=the-rb-element>
The <dfn element><code>rb</code></dfn> element</h3>
<dl class="def">
<dt><a spec=html>Categories</a>:
<dd>None.
<dt><a spec=html>Contexts in which this element can be used</a>:
<dd>As a child of a <{ruby}> element.
<dt><a spec=html>Content model</a>:
<dd><a>Phrasing content</a>.
<dt><a spec=html>Content attributes</a>:
<dd><a spec=html>Global attributes</a>
<dt><a spec=html>DOM interface</a>:
<dd>Uses {{HTMLElement}}.
</dl>
An <{rb}> (“ruby base”) element
<span class=w-nodev>that is the child of a <{ruby}> element</span>
<a spec=html>represents</a> a [=ruby base unit=]:
a unitary component of base-level text
annotated by any ruby annotation(s) to which it is paired.
<p class=w-nodev>
An <{rb}> element that is not a child of a <{ruby}> element
<a spec=html>represents</a> the same thing as its children.
<div class="example" id=rb-ex>
When no <{rb}> element is used, the base is implied:
<pre><code highlight="html">
<ruby>base<rt>annotation</ruby>
</code></pre>
The element can also be made explicit:
<pre><code highlight="html">
<ruby><rb>base<rt>annotation</ruby>
</code></pre>
Both markup patterns have identical semantics.
Explicit <{rb}> elements can be useful for styling,
and are necessary
when marking up consecutive bases to pair with consecutive annotations
(for example,
when representing a compound word;
see <span lang="ja">京都市</span> <a href="#ruby-inlining">inlining</a>
and <a href="#jukugo-ruby">jukugo ruby</a> examples above).
</div>
<h3 id=the-rt-element>
The <dfn element><code>rt</code></dfn> element</h3>
<dl class="def">
<dt><a spec=html>Categories</a>:
<dd>None.
<dt><a spec=html>Contexts in which this element can be used</a>:
<dd>
As a child of a <{ruby}>
or of an <{rtc}> element.
<dt><a spec=html>Content model</a>:
<dd><a>Phrasing content</a>.
<dt><a spec=html>Content attributes</a>:
<dd><a spec=html>Global attributes</a>
<dt><a spec=html>Accessibility considerations</a>:
<dd><a href="https://w3c.github.io/html-aria/#el-rt">For authors</a>.
<dd><a href="https://w3c.github.io/html-aam/#el-rt">For implementers</a>.
<dt><a spec=html>DOM interface</a>:
<dd>Uses {{HTMLElement}}.
</dl>
An <{rt}> (“ruby text”) element
<span class=w-nodev>that is the child of a <{ruby}> element
or of an <{rtc}> element
that is itself the child of a <{ruby}> element</span>
<a spec=html>represents</a> a [=ruby annotation unit=]:
a unitary annotation of the [=ruby base unit=] to which it is paired.
<p class=w-nodev>
An <{rt}> element that is not a child of a <{ruby}> element
nor of an <{rtc}> element
that is itself the child of a <{ruby}> element
<a spec=html>represents</a> the same thing as its children.
<h3 id=the-rtc-element>
The <dfn element><code>rtc</code></dfn> element</h3>
<dl class="def">
<dt><a spec=html>Categories</a>:
<dd>None.
<dt><a spec=html>Contexts in which this element can be used</a>:
<dd>
As a child of a <{ruby}> element.
<dt><a spec=html>Content model</a>:
<dd>
Either [=phrasing content=] or a sequence of <{rt}> elements;
optionally preceded, interleaved with, or followed by individual <{rp}> elements.
<dt><a spec=html>Content attributes</a>:
<dd><a spec=html>Global attributes</a>
<dt><a spec=html>DOM interface</a>:
<dd>Uses {{HTMLElement}}.
</dl>
An <{rtc}> (“ruby text container”) element
<span class=w-nodev>that is the child of a <{ruby}> element</span>
<a spec=html>represents</a> one level of annotation
(a [=ruby annotation range=])
for the preceding sequence of [=ruby base units=]
(its <span>ruby base range</span>).
Note: In simple cases,
<{rtc}> elements can be omitted
as a [=ruby annotation range=] is implied
by consecutive <{rt}> elements.
However, they are necessary
in order to associate multiple levels of annotation
with a single [=ruby base range=],
for example to provide both phonetic and semantic information,
phonetic information in different scripts,
or semantic information in different languages.
<div class="example" id=ruby-rtc-ex>
In this example,
the Japanese compound word 上手 ("skillful")
has phonetic annotations in both kana and romaji
while at the same time maintaining the pairing to bases
and annotation grouping information.
<figure>
<img src="images/mono-or-jukugo-double.png"
width="72" height="81"
alt="上手 (skill) annotated in both kana and romaji">
</figure>
This is enabled by the following markup:
<pre><code highlight="html">
<ruby><rb>上<rb>手<rt>じょう<rt>ず<rtc><rt>jou<rt>zu</ruby>
</code></pre>
</div>
Note: Text that is a direct child of the <{rtc}> element
implicitly represents a [=ruby annotation unit=]
as if it were contained in an <{rt}> element,
except that this annotation spans all the bases in the segment.
<div class="example" id=rtc-ex>
In this example, the Chinese word for San Francisco
(<span lang="zh-Hans">旧金山</span>, i.e. “old gold mountain”)
is annotated both using pinyin to give the pronunciation,
and with the original English.
<figure>
<img src="images/group-double.png"
width="113" height="84"
alt="San Francisco in Chinese,
with both pinyin and the original English as annotations.">
</figure>
This is marked up as follows:
<pre><code highlight="html">
<ruby><rb>旧<rb>金<rb>山<rt>jiù<rt>jīn<rt>shān<rtc>San Francisco</ruby>
</code></pre>
Here, a single run of three base characters
is annotated with three pinyin ruby annotations
in a first (implicit) container,
and an <{rtc}> element is introduced
in order to provide a second, single ruby annotation:
the city's English name.
</div>
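<div class="example" id=rtc-explicit-ex>
As an illustrative sketch
(this markup was written for comparison and is not taken from an existing document),
the previous example could also be written
with every end tag present
and with the first, implicit container made explicit as an <{rtc}> element.
Assuming the implied [=ruby annotation range=] described in the note above,
this form should carry the same two levels of annotation:
<pre><code highlight="html">
<ruby><rb>旧</rb><rb>金</rb><rb>山</rb><rtc><rt>jiù</rt><rt>jīn</rt><rt>shān</rt></rtc><rtc>San Francisco</rtc></ruby>
</code></pre>
The shorter form is usually easier to read and write;
the explicit form may be preferable when the markup is generated by tools
or when each level needs its own attributes, such as <code>lang</code>.
</div>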
<p class=w-nodev>
An <{rtc}> element that is not a child of a <{ruby}> element
<a spec=html>represents</a> the same thing as its children.
<h3 id=the-rp-element>
The <dfn element><code>rp</code></dfn> element</h3>
<dl class="def">
<dt><a spec=html>Categories</a>:
<dd>None.
<dt><a spec=html>Contexts in which this element can be used</a>:
<dd>
As a child of a <{ruby}> or <{rtc}> element,
either immediately before or immediately after an <{rtc}> element or a [=ruby annotation unit=].
<dt><a spec=html>Content model</a>:
<dd><a spec=html>Text</a>.
<dt><a spec=html>Content attributes</a>:
<dd><a spec=html>Global attributes</a>
<dt><a spec=html>Accessibility considerations</a>:
<dd><a href="https://w3c.github.io/html-aria/#el-rp">For authors</a>.</dd>
<dd><a href="https://w3c.github.io/html-aam/#el-rp">For implementers</a>.</dd>
<dt><a spec=html>DOM interface</a>:
<dd>Uses {{HTMLElement}}.
</dl>
The <{rp}> (“ruby parenthetical”) element <a spec=html>represents</a> nothing.
It is used to provide presentational content
(such as parentheses)
around [=ruby annotation units=],
to be shown when presenting ruby content inline,
without using ruby-specific layout.
This may happen when using a user agent that does not support ruby layout,
or for stylistic reasons.
In typical ruby layout,
it is not displayed.
<div class="example" id=rp-ex>
In this example,
each ideograph in the text <span lang="ja">漢字</span>
is annotated with its phonetic reading.
Furthermore, it uses <{rp}> so that in legacy user agents the readings are in parentheses:
<pre lang="ja"><code highlight="html">
...<ruby>漢<rb>字<rp>(<rt>かん<rt>じ<rp>)</ruby>...
</code></pre>
In user agents that support ruby layout,
the rendering omits the parentheses,
but in user agents that do not, the rendering would be:
<pre lang="ja">...漢字(かんじ)...</pre>
</div>
<div class="example" id=contrieved-rp-ex>
Here is a contrived example
showing some symbols with names given in English and French
using double-sided annotations,
with <{rp}> elements as well:
<pre><code highlight="html">
<ruby>
<rb>♥<rp>: <rt>Heart<rp>, <rtc lang=fr>Cœur</rtc><rp>.</rp>
<rb>☘<rp>: <rt>Shamrock<rp>, <rtc lang=fr>Trèfle</rtc><rp>.</rp>
<rb>✶<rp>: <rt>Star<rp>, <rtc lang=fr>Étoile</rtc><rp>.</rp>
</ruby>
</code></pre>
This would make the example render as follows in non-ruby-capable user agents:
<pre>♥: Heart, <span lang="fr">Cœur</span>. ☘: Shamrock, <span
lang="fr">Trèfle</span>. ✶: Star, <span lang="fr">Étoile</span>.</pre>
</div>
<h2 id=optional-tags class=non-normative>
Optional Tags</h2>
<div class=advisement>
This section extends the [[HTML/syntax#optional-tags]] section of the [[HTML inline]],
replacing the paragraphs of that section about <{rt}> and <{rp}>,
and adding two more for <{rb}> and <{rtc}>.
</div>
An <{rb}> element's <a spec=html>end tag</a> may be omitted
if the <{rb}> element is immediately followed by
an <{rb}>, <{rt}>, <{rtc}> or <{rp}> element,
or if there is no more content in the parent element.
An <{rt}> element's <a spec=html>end tag</a> may be omitted
if the <{rt}> element is immediately followed by
an <{rb}>, <{rt}>, <{rtc}> or <{rp}> element,
or if there is no more content in the parent element.
An <{rtc}> element's <a spec=html>end tag</a> may be omitted
if the <{rtc}> element is immediately followed by
an <{rb}> or <{rtc}> element,
or if there is no more content in the parent element.
An <{rp}> element's <a spec=html>end tag</a> may be omitted
if the <{rp}> element is immediately followed by
an <{rb}>, <{rt}>, <{rtc}> or <{rp}> element,
or if there is no more content in the parent element.
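<div class="example" id=optional-tags-ex>
As a sketch of how these rules combine
(the markup below is illustrative and was written for this comparison),
the following two fragments should be equivalent.
In the first, every end tag that the rules above allow to be omitted has been left out:
<pre><code highlight="html">
<ruby><rb>上<rb>手<rt>じょう<rt>ず<rtc><rt>jou<rt>zu</ruby>
</code></pre>
In the second, all end tags are written out:
<pre><code highlight="html">
<ruby><rb>上</rb><rb>手</rb><rt>じょう</rt><rt>ず</rt><rtc><rt>jou</rt><rt>zu</rt></rtc></ruby>
</code></pre>
By contrast, an <{rtc}> element that is immediately followed by an <{rp}> element
(as in the symbol-name example in the <{rp}> section)
keeps its end tag,
because <{rp}> is not among the elements listed for <{rtc}> above.
</div>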
<h2 id=rendering>
Rendering</h2>
<div class=advisement>
This section completes the [[html/rendering#non-replaced-elements]] section of the [[HTML inline]],
and in particular its [[HTML/rendering#phrasing-content-3]] subsection,