-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathindex.html
394 lines (310 loc) · 25.8 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
<!DOCTYPE html>
<html dir="ltr" lang="en">
<head>
<meta charset="utf-8">
<title>Developing Localizable Manifests</title>
<script src="https://www.w3.org/Tools/respec/respec-w3c" async="" class="remove"></script>
<script class="remove">
var respecConfig = {
specStatus: "ED",
edDraftURI: "https://w3c.github.io/localizable-manifests",
shortName: "localizable-manifests",
editors: [
{ name: "Addison Phillips",
company: "Invited Expert" , w3cid: 33573
}
],
// if you wish the publication date to be other than today, set this
//publishDate: "2021-08-24",
//previousMaturity: "WD",
//previousPublishDate: "2021-08-24",
noRecTrack: true,
group: "i18n",
github: "w3c/localizable-manifests",
xref: [ "i18n-glossary" ]
};
</script>
<link rel="stylesheet" href="local.css" />
</head>
<body>
<div id="abstract">
<p>This document provides definitions and best practices related to the specification of manifest files and similar document formats on the Web.</p>
</div>
<div id="sotd">
<p>Some specifications on the Web deal with defining sets of files or resources to be consumed together. A common design pattern is to provide a manifest or configuration file that defines which resources are available and how various resources should be used or to provide various kinds of metadata about a collection of resources. This document provides definitions and best practices related to the specification of manifest files and similar document formats on the Web.</p>
</div>
<section id="introduction">
<h2>Introduction</h2>
<p>Some specifications on the Web deal with defining sets of files or resources to be consumed together. A common design pattern is to provide a manifest or configuration file that defines which resources are available and how various resources should be used or to provide various kinds of metadata about a collection of resources.</p>
<p>Some examples of specifications like this include Webapp [[APPMANIFEST]], Miniapp [[MINIAPP-MANIFEST]], and ePub [[EPUB-33]].</p>
<p>Manifest file formats usually contain [=syntactic content=] or [=user-supplied values=] that are not localizable. But most manifest file formats also contain at least some [=localizable content=] fields. A specification that defines a manifest file format that includes localizable content values needs to provide a means for users to localize the manifest into different languages. This document covers the pros and cons of common designs, along with some best practices related to building and managing manifests.</p>
<p>These (and many other) best practices, along with links to supporting materials, can also be found in the <cite>Internationalization Best Practices for Spec Developers</cite> [[INTERNATIONAL-SPECS]]. In addition to the best practices found here, additional best practices relating to language metadata on the Web can be found in [[STRING-META]]. Core terminology can also be found in the <cite>Internationalization Glossary</cite> [[I18N-GLOSSARY]].</p>
<aside class="note">
<p>In this document [[RFC2119]] keywords have their usual meaning. Best practices and definitions are set off from the remainder of the text with special formatting.</p>
<p class="advisement">Best practices appear with a different background color and decoration like this.</p>
<p class="definition">Definitions appear with a different background color and decoration like this.</p>
<p class="gap">Gaps or recommendations for future work appear with a different background color and decoration like this.</p>
</aside>
</section>
<section id="use-cases">
<h2>Use Cases</h2>
<p>There are different ways that manifest files might be used on the Web and these affect the choice of manifest localization strategy.</p>
<p><dfn data-lt="packaged" class="lint-ignore export">Packaged Manifest.</dfn> Some manifests are packaged with other files to form an application or document. Since the manifest is generally consumed as part of the larger set of content and the content is downloaded as a single unit, manifests of this type can use more modular localization strategies, such as using multiple files with names or paths based on the locale.</p>
<p><dfn data-lt="shorthand" class="lint-ignore export">Shorthand Description Manifest.</dfn> Some manifests are meant to be used as a shorthand description of an application, document, or other resource. Manifests of this type are often downloaded by the client, sometimes as a precursor to retrieving the actual resource or its component parts. Since retrieving multiple files has latency, size, and storage implications, this type of manifest usually seeks to use a single file or other less-modular approaches.</p>
<section id="language-preferences">
<h2>Managing Language Preferences</h2>
<p>Another consideration is the <a>language negotiation</a> strategy employed by the specification. If a manifest can be generated on demand (rather than being cached at a well-known URL), then the localization might be applied on a per-request basis. This points more toward a <a>shorthand</a> manifest style. In other cases, where some implementations might be able to negotiate while others cannot, it might be important for the specification to permit different strategies to be used.</p>
<p>Bear in mind that modern operating environments support quite extensive sets of available locales and that application owners often wish to satisfy diverse audiences with a single localized application or document. While examples in specifications necessarily are constrained to perhaps three or four locales, actual applications with 50 or more specific locales are not uncommon.</p>
<p>Manifests need to match a user's language preference to the available localized resources. The user's locale preference can include more than just a primary language subtag. It might include the script (in tags such as <code>zh-Hans</code>, meaning "Chinese as written in the Simplified Han script") or region (in tags such as <code>fr-CA</code>, meaning "French as used in Canada")—or both. Some systems support <a>Unicode Locale Identifiers</a>, an extension to [[BCP47]] that includes tailoring such as calendars, number shaping, collation, and more. For more information, see [[LTLI]].</p>
<section id="locale-fallback">
<h4>Matching Language Tags and Locale Fallback</h4>
<p>Matching localized resources to a user's locale preference usually involves "locale fallback", using a mechanism such as [[BCP47]]'s "lookup" matching scheme or [[CLDR]]'s "best match" scheme. These involve comparing the language tags on each <a>resource</a> seeking the <em>longest</em> tag that completely matches the user's preference. There is also, usually, a default locale to use for cases in which none of the localized resources match the user's preference.</p>
<p>Matching of language tags to available resources (such as found in [[[#path-vs-file]]]) or to items in some form of language map (see [[[#language-maps]]]) can require an implementation to examine all of the available values, as the first potential match might not be the best overall match.</p>
<p>For example, consider the following language map:</p>
<pre class="javascript">
"localized-resource": {
"en": {"value": "This is English"},
"en-GB": {"value": "This is UK English"},
"en-US": {"value": "This is US English"}
}
</pre>
<p>If the user's <a>locale</a> preference were <code>en-GB</code> or <code>en-US</code>, the best match is included in the list. Obviously, the entire list must be read to find the best match, since the entry for <code>en</code> could be considered a match for either value.</p>
<p>If the user's locale preference were <code>en-AE</code> (English as used in the United Arab Emirates), the best match in the list is <code>en</code>, although some implementations of best matching might apply out-of-band information to potentially match a longer tag.</p>
<p>If the user's locale preference were <code>fr-FR</code> (French as used in France), none of the values match and some form of defaulting would be needed to select the value to use.</p>
<p>Finally, the user's locale preference could include additional subtags, such as those used to tailor locale settings or such as a language variant. An example of a tailored locale might be <code>en-US-u-hc-h23</code> (US English but with the "hour cycle" set to 24 hour time). An example of a language variant might be <code>en-GB-oxendict</code> (UK English, but using spelling conventions of the Oxford English Dictionary).</p>
<aside class="note">
<p>For more information about CLDR locale extensions, see [[RFC6067]] or <a href="https://cldr.unicode.org/index/bcp47-extension">Unicode Extensions for BCP47</a> on the Unicode site.</p>
</aside>
<p>Locale fallback is implemented such that the user's preference is matched against the available list of values, and, if no exact match is found, removing the last subtag and repeating. This is described as "lookup" in the language tag matching part of [[BCP47]] (specifically [[RFC4647]]). Thus, the tag <code>en-GB-oxendict</code> would be compared against the above languge map as follows:</p>
<pre>
en-GB-oxendict // no match
en-GB // matches en-GB
en // if en-GB were not present, matches en
(default) // no match, use defaults
</pre>
</section>
</section>
</section>
<section id="indicating">
<h2>Indicating Language and Direction</h2>
<p><a>Natural language</a> values stored in a manifest, like any document format or data structure, require language and direction metadata in order to be used by <a>consumers</a> successfully. For detailed information about the requirements, see [[STRING-META]].</p>
<p>There are four common serializations for string values in a manifest, and these are described in this section. Specifications are intended to use these together to form a complete localization solution.</p>
<section id="non-linguistic">
<h4>Non-Linguistic Fields</h4>
<p>For <a>non-linguistic fields</a> (that is, strings that contain data that is not human language) language or direction metadata SHOULD NOT be associated with the value. Note that this includes <a data-cite="international-specs/#application_internal">application-internal data values</a> [[INTERNATIONAL-SPECS]].</p>
<p>If a <a>consumer</a> is required to assign a language tag to the data, the value <code>zxx</code> (Non-Linguistic) SHOULD be used. If a <a>consumer</a> is required to assign a <a>base direction</a> to the data, the value <code>auto</code> SHOULD be used.</p>
<aside class="example" title="Examples of non-linguistic string values">
<pre class="javascript">
"isbn": "978-123456789-X",
"part-number": "§ABC-123-0094"
</pre>
<p>Note that non-linguistic values are sometimes <em>localized</em>, even though they are not <em>translated</em>.</p>
<pre class="javascript">
"gaining-value-color": "green", // perhaps "red" in another locale?
"background-color": "#ffebdd", // or something else?
"default-level": "medium", // perhaps "large" or "small" in another locale?
"help-file-url": "https://example.org/en-US/help.html"
</pre>
</aside>
</section>
<section id="single-linguistic-field">
<h4>Single-Language Localizable Text Field</h4>
<p>For <a>localizable text</a> fields that appear in a single language, use a data structure to represent the value. The recommended representation is an object with three fields. The field <code>value</code> contains the actual string. The field <code>lang</code> contains a <a>valid</a> [[BCP47]] language tag. The field <code>dir</code> contains the string's <a>base direction</a> (one of the values <code>ltr</code>, <code>rtl</code>, or <code>auto</code>).</p>
<p>See <a data-cite="STRING-META#use-the-localizable-data-structure">The Localizable WebIDL Dictionary</a> in [[STRING-META]].</p>
<aside class="example" title="Example of a localizable text field">
<pre class="json" id="localizable-text-field">
"field-name-goes-here": {
"value": "This is the string value",
"lang": "en-US",
"dir": "ltr"
}
</pre>
</aside>
</section>
<section id="language-defaults">
<h4>Document-Level Defaults</h4>
<p>Single-language localizable text fields are at their best if there are only a few natural language strings in a given document. When a document such as a manifest contains many natural language strings, this becomes inefficient. To reduce the complexity of encoding these strings, one workaround is to establish a document-level default for language and base direction. These are separate values, as language does not imply direction. There should still be the ability to override either value on any given string value by using the single-language <a>localizable text</a> field representation found <a href="#single-linguistic-field">above</a>.</p>
<p>For specifications that can make use of the [[JSON-LD]] <code>@context</code> mechanism, use the <code>@language</code> and <code>@direction</code> fields to supply the document level defaults.</p>
<p>If your specification defines its own document level defaults, provide two optional fields:</p>
<ul>
<li>The document's default language field SHOULD be called <code>language</code> and SHOULD be specified to contain a <a>valid</a> [[BCP47]] language tag. Specifications SHOULD specify that implementations are only require to check if a [[BCP47]] language tag is <a>well-formed</a>.</li>
<li>The document's default <a>base direction</a> field SHOULD be called <code>direction</code> and support the values <code>ltr</code>, <code>rtl</code>, or <code>auto</code>.</li>
</ul>
<aside class="example" title="Example of document level defaults">
<pre class="json">
"language": "en-US",
"direction": "ltr",
//...
"some-field-name": "This string is in U.S. English, thanks to the default",
// the following field overrides the language but not the direction:
"another-field-name": {
"value": "Diese Zeichenfolge ist auf Deutsch",
"lang": "de"
}
</pre>
</aside>
<aside class="note">
<p><strong>Document level defaults are not a complete solution on their own.</strong> Specifications can define a document-level defaulting mechanism to assist users who often create monolingual documents. They are particularly useful when used in conjunction with the <a href="#locale-specific">locale-specific</a> or <a href="#hybrid">hybrid</a> approaches described later in this document. However, document-level defaults should only be used to augment the ability for users to provide string-specific metadata.</p>
</aside>
</section>
<section id="language-maps">
<h4>Language Maps</h4>
<p>The world is not monolingual. Having documents that contain only a single language would mean providing many iterations of the document, one for each language, in order to localize the manifest. This also might require language negotiation when requesting the manifest.</p>
<p>One way to address this is to allow multilingual values for each <a>localizable text</a> field inside the manifest document.</p>
<p>Language selection is not merely the exact matching of language tag string values to the user's preferred locale. The usual <a href="#localizable-text-field">object representation</a> of a <a>localizable text</a> field requires that the object be deserialized in order to discover the language tag associated with the value. This can be inefficient when there are many values in a given manifest file. In these cases, the best practice is to use a language map to organize <a>localizable text</a> values. Such a map exposes the language tag for the purposes of selection, but still uses an object representation on the value side of the map, since both language and direction might need to be overridden for a given string value.</p>
<aside class="note">
<p>The language map structure presented in this section is not the same as the language maps found in <a data-cite="JSON-LD#string-internationalization">JSON-LD String Internationalization</a> [[JSON-LD]].</p>
</aside>
<aside class="example" title="A localizable language map">
<pre class="json">
"field-name-goes-here": {
"en": {"value": "This is English"},
"en-GB": {"value": "This is UK English", "dir": "ltr"},
"fr": {"value": "C'est français", "lang": "fr-CA", "dir": "ltr"},
"ar": {"value": "هذه عربية", "dir": "rtl"}
}
</pre>
</aside>
<aside class="note">
<p>This structure permits a language tag as the key and also as part of the value. These two values do not have to match. The value of the key indicates the <a>locale</a> of the intended audience, while the language tag of the value represents information about the actual language of the string.</p>
<p>These values might not match in cases where additional specificity is needed to get the correct rendering or processing of the value or, occasionally, when a foreign language value <strong><em>is</em></strong> the intended localization (and the language needs to be overridden, such as to select voices or dictionaries in assistive technology):</p>
<pre class="json">
"extra-rendering-help": {
"zh": {"value": "你好世界!", "lang": "zh-Hans"}
}
"hello-in-french": {
"en": {"value": "Bonjour!", "lang": "fr"},
"de": {"value": "Bonjour!", "lang": "fr"}
}
</pre>
</aside>
</section>
</section>
<section id="common-approaches">
<h2>Common Approaches</h2>
<p>There are several common approaches to creating localized manifests. Each offers certain advantages (and suffers from certain drawbacks):</p>
<section id="unitary">
<h3>Unitary Manifest Files</h3>
<p>A unitary manifest approach provides for localization by defining localizable fields using an array of language values.</p>
<aside class="example">
<p>Example of a unitary manifest:</p>
<pre class="json">{
"appId": "myApplication",
"releaseDate": "2021-06-07",
"releaseUrl": "https://example.com/path/to/myApplication",
// localizable values in a map
"title": {
"en": {"value": "My Application"}, // English
"fr": {"value": "Mon application", "lang": "fr-CA"}, // French
"de": {"value": "Mein Bewerbund"}, // German
"zh-Hans": {"value": "我的应用程序"} // Simplified Chinese
},
"author": {
"en": {"value": "Example Corporation"},
"fr": {"value": "Société Exemple"}
}</pre>
</aside>
<p>The advantages of a unitary manifest include:</p>
<ol>
<li>The manifest is a single file, allowing for ease of downloading.
<li>It is easier to ensure that the correct version of the resources in every language are being used.
<li>Language defaulting may be easier to manage.
</ol>
<p>There are also disadvantages to this approach:</p>
<ol>
<li>If the manifest is rendered into many languages, the file's size may grow disproportionately.
<li>Localization processes are generally optimized to create files that are copies of the original, only with the language-bearing materials replaced. Creating, managing, and synchronizing localization in a highly multilingual file may be more complex than using separate files per language. Generating the manifest via scripts or tooling is one approach to managing this complexity.
</ol>
<p>Unitary manifests work best with smaller source files with a limited number of localizable content fields and/or a limited number of languages. Most unitary manifests define some type of “language indexing” scheme (such as found in [[JSON-LD]]) to perform the actual selection.</p>
</section>
<section id="locale-specific">
<h3>Language or Locale-specific Manifest Files</h3>
<p>Locale-specific manifests create a separate file for each supported language or locale. Usually the specification will <a href="#path-vs-file">choose either a <code>path</code> or <code>filename</code> based organizational scheme</a> for allocating and finding the manifest files.</p>
<aside class="example">
<p>Example of language specific manifest files:</p>
<p>Filename: <code>/manifest/MyApplicationManifest.json</code></p>
<pre class="json">{
"appId": "myApplication",
"releaseDate": "2021-06-07",
"language": "en",
"releaseUrl": "https://example.com/path/to/myApplication",
"title": "My Application",
"author": "Example Corporation",
// etc.
</pre>
<p>Filename: <code>/manifest/MyApplicationManifest-fr.json</code></p>
<pre class="json">{
"appId": "myApplication",
"releaseDate": "2021-06-07",
"language": "fr",
"releaseUrl": "https://example.com/path/to/myApplication",
"title": "Mon application",
"author": "Société Exemple",
// etc.
</pre>
</aside>
<p>The advantages of locale-specific manifests include:</p>
<ol>
<li>Managing the creation or addition of locales is generally easier. Localization processes are generally optimized for “creating a localized copy” of a given source file.
</ol>
<p>The disadvantages include:</p>
<ol>
<li>Even simple locale fallback chains can require the client to search for or retrieve multiple files.
<li>Because the files are separate, versioning of the content must be rigorously controlled.
<li>It may be necessary to test many more variations to ensure proper operation.
</ol>
</section>
<section id="hybrid">
<h3>Hybrid Approaches</h3>
<p>A common compromise is to have a central manifest file that contains most or all of the non-localizable content, coupled with separate files that contains the localizable content. In some cases, the central manifest file contains defaults for the localizable content. The central manifest might also contain a directory or list of available locales (or the URL of language-specific manifests) so that clients only need to try to retrieve localized files that are actually available.</p>
<aside class="example">
<p>Example of a hybrid manifest:</p>
<p>Filename: <code>/manifest/MyApplicationManifest.json</code></p>
<pre class="json">{
"appId": "myApplication",
"releaseDate": "2021-06-07",
"defaultLanguage": "en",
"availableLanguages": [ "ar-AE", "ar-EG", "en", "fr", "fr-CA", "zh-Hans-CN" ],
"releaseUrl": "https://example.com/path/to/myApplication",
"title": "My Application",
"author": "Example Corporation",
// etc.
</pre>
<p>Filename: <code>/manifest/MyApplicationManifest-en.json</code></p>
<pre class="json">{
"appId": "myApplication",
"language": "en",
"title": "My Application",
"author": "Example Corporation",
// etc.
</pre>
<p>Filename: <code>/manifest/MyApplicationManifest-fr.json</code></p>
<pre class="json">{
"appId": "myApplication",
"language": "fr",
"title": "Mon application",
"author": "Société Exemple",
// etc.
</pre>
</aside>
</section>
</section>
<section id="path-vs-file">
<h2>Path versus File Name Localization</h2>
<p>Some manifests opt to store localized variants of the manifest using modification of the path. Others prefer to modify the file name.</p>
<aside class="example">
<pre>Path-based: /manifests/en-US/myApplication.json
File-based: /manifests/myApplication-en-US.json
</pre>
</aside>
<p>Path-based storage has the advantage that multiple files can be localized together. For example, the localized manifest can include a localized icon, logo, stylesheet, README, and terms-of-service file together with the manifest. New languages can be added by adding a new directory.</p>
<p>Path-based systems have the disadvantage that the filenames are generally “all the same” and the contents of a given file is not obvious from the filename—only by its location in the application hierarchy.</p>
<p>Filename-based storage has the advantage that the contents of the file is obvious from the filename and managing files can sometimes be easier in this regard.</p>
</section>
<section id="other">
<h2>Other Considerations</h2>
<p>Note that manifest localization does not remove the need for the string content in a manifest to provide language and direction metadata for each separate data value. The manifest file can define a file-wide default for language and direction, but specific files should be able to override either value. See [[STRING-META]] for more information.</p>
<p>Some manifests allow for "sparse population" of values in the localized file. For example, the <code>en-US</code> (English, United States) file might only contain one or two US-specific values, relying on the <code>en</code> (English) file for most of its content. Specifications need to be explicit about whether sparse population is allowed and how it works to avoid misconfiguration errors or missing translations.</p>
</section>
</body>
</html>