Add names for 2 letter codes #572

Merged
209 changes: 108 additions & 101 deletions _data/languages.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion apis/aisa.md
@@ -62,7 +62,7 @@ supported_languages:
base_code: hi
name: Hindi
variant_name: null
- slug: null
- slug: Huyu
code: huyu
normalized_code: Huyu
base_code: Huyu
40 changes: 20 additions & 20 deletions apis/alibaba.md
@@ -176,11 +176,11 @@ supported_languages:
base_code: ceb
name: Cebuano
variant_name: null
- slug: ch
- slug: chamorro
code: ch
normalized_code: ch
base_code: ch
name: null
name: Chamorro
variant_name: null
- slug: chm
code: chm
@@ -470,11 +470,11 @@ supported_languages:
base_code: hy
name: Armenian
variant_name: null
- slug: ia
- slug: interlingua
code: ia
normalized_code: ia
base_code: ia
name: null
name: Interlingua
variant_name: null
- slug: iba
code: iba
@@ -488,11 +488,11 @@ supported_languages:
base_code: id
name: Indonesian
variant_name: null
- slug: ie
- slug: interlingue
code: ie
normalized_code: ie
base_code: ie
name: null
name: Interlingue
variant_name: null
- slug: igbo
code: ig
@@ -512,11 +512,11 @@ supported_languages:
base_code: inh
name: null
variant_name: null
- slug: io
- slug: ido
code: io
normalized_code: io
base_code: io
name: null
name: Ido
variant_name: null
- slug: icelandic
code: is
@@ -596,11 +596,11 @@ supported_languages:
base_code: kk
name: Kazakh
variant_name: null
- slug: kl
- slug: kalaallisut,
code: kl
normalized_code: kl
base_code: kl
name: null
name: Kalaallisut,
variant_name: null
- slug: khmer
code: km
@@ -620,11 +620,11 @@ supported_languages:
base_code: ko
name: Korean
variant_name: null
- slug: kr
- slug: kanuri
code: kr
normalized_code: kr
base_code: kr
name: null
name: Kanuri
variant_name: null
- slug: kashmiri
code: ks
@@ -830,11 +830,11 @@ supported_languages:
base_code: 'no'
name: Norwegian
variant_name: null
- slug: nv
- slug: navajo
code: nv
normalized_code: nv
base_code: nv
name: null
name: Navajo
variant_name: null
- slug: chichewa
code: ny
@@ -848,11 +848,11 @@ supported_languages:
base_code: oc
name: Occitan
variant_name: null
- slug: oj
- slug: ojibwa
code: oj
normalized_code: oj
base_code: oj
name: null
name: Ojibwa
variant_name: null
- slug: oromo
code: om
@@ -1226,17 +1226,17 @@ supported_languages:
base_code: vi
name: Vietnamese
variant_name: null
- slug: vo
- slug: "volap\xFCk"
code: vo
normalized_code: vo
base_code: vo
name: null
name: "Volap\xFCk"
variant_name: null
- slug: wa
- slug: walloon
code: wa
normalized_code: wa
base_code: wa
name: null
name: Walloon
variant_name: null
- slug: waray
code: war
30 changes: 15 additions & 15 deletions apis/baidu.md
@@ -225,11 +225,11 @@ supported_languages:
base_code: co
name: Corsican
variant_name: null
- slug: null
- slug: cree
code: cre
normalized_code: cr
base_code: cr
name: null
name: Cree
variant_name: null
- slug: crimean-tatar
code: crh
@@ -493,19 +493,19 @@ supported_languages:
code: ido
normalized_code: io
base_code: io
name: null
name: Ido
variant_name: null
- slug: inuktitut
code: iku
normalized_code: iu
base_code: iu
name: Inuktitut
variant_name: null
- slug: ina
- slug: interlingua
code: ina
normalized_code: ia
base_code: ia
name: null
name: Interlingua
variant_name: null
- slug: indonesian
code: ind
@@ -549,11 +549,11 @@ supported_languages:
base_code: kab
name: Kabyle
variant_name: null
- slug: kal
- slug: kalaallisut,
code: kal
normalized_code: kl
base_code: kl
name: null
name: Kalaallisut,
variant_name: null
- slug: kannada
code: kan
@@ -567,11 +567,11 @@ supported_languages:
base_code: ks
name: Kashmiri
variant_name: null
- slug: kau
- slug: kanuri
code: kau
normalized_code: kr
base_code: kr
name: null
name: Kanuri
variant_name: null
- slug: kazakh
code: kaz
@@ -753,11 +753,11 @@ supported_languages:
base_code: nap
name: null
variant_name: null
- slug: null
- slug: south-ndebele
code: nbl
normalized_code: nr
base_code: nr
name: null
name: South Ndebele
variant_name: null
- slug: nds
code: nds
@@ -819,11 +819,11 @@ supported_languages:
base_code: oc
name: Occitan
variant_name: null
- slug: oji
- slug: ojibwa
code: oji
normalized_code: oj
base_code: oj
name: null
name: Ojibwa
variant_name: null
- slug: oriya
code: ori
@@ -1161,11 +1161,11 @@ supported_languages:
base_code: vi
name: Vietnamese
variant_name: null
- slug: wln
- slug: walloon
code: wln
normalized_code: wa
base_code: wa
name: null
name: Walloon
variant_name: null
- slug: wolof
code: wol
24 changes: 12 additions & 12 deletions apis/niutrans.md
@@ -369,11 +369,11 @@ supported_languages:
base_code: cfm
name: null
variant_name: null
- slug: cha
- slug: chamorro
code: cha
normalized_code: ch
base_code: ch
name: null
name: Chamorro
variant_name: null
- slug: chechen
code: che
@@ -795,11 +795,11 @@ supported_languages:
base_code: he
name: Hebrew
variant_name: null
- slug: null
- slug: herero
code: her
normalized_code: hz
base_code: hz
name: null
name: Herero
variant_name: null
- slug: hindi
code: hi
@@ -819,11 +819,11 @@ supported_languages:
base_code: hlb
name: null
variant_name: null
- slug: null
- slug: hiri-motu
code: hmo
normalized_code: ho
base_code: ho
name: null
name: Hiri Motu
variant_name: null
- slug: croatian
code: hr
@@ -1233,11 +1233,11 @@ supported_languages:
base_code: lua
name: Luba-Kasai
variant_name: null
- slug: null
- slug: luba-katanga
code: lub
normalized_code: lu
base_code: lu
name: null
name: Luba-Katanga
variant_name: null
- slug: lue
code: lue
@@ -1431,11 +1431,11 @@ supported_languages:
base_code: my
name: Burmese
variant_name: null
- slug: nav
- slug: navajo
code: nav
normalized_code: nv
base_code: nv
name: null
name: Navajo
variant_name: null
- slug: nba
code: nba
@@ -1449,11 +1449,11 @@ supported_languages:
base_code: ndc
name: null
variant_name: null
- slug: null
- slug: ndonga
code: ndo
normalized_code: ng
base_code: ng
name: null
name: Ndonga
variant_name: null
- slug: nepali
code: ne
5 changes: 5 additions & 0 deletions apis/pangeamt.md
@@ -85,6 +85,11 @@ integrations:
plugin: true
custom: true
active: false
- slug: translate5
name: translate5
active: false
urls:
- https://confluence.translate5.net/display/CON/PangeaMT
active: true
seo:
name: The PangeaMT machine translation API
9 changes: 5 additions & 4 deletions generate.py
@@ -213,7 +213,7 @@ def normalize(code):
base_code = base_language_code(normalized_code)
if base_code in language['codes']:
language_name = language.get('names', [None])[0]
-language_slug = slugify(language_name) if language_name else code
+language_slug = slugify(language_name) if language_name else base_code
Collaborator
What does this change mean? For which languages does it change anything, e.g. Chinese?

Contributor Author
At the moment the change makes no difference, because I have also added names for the 2-letter codes, so the fallback is not reached for them. I switched it to base_code only because the validation is done against base_code.
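
To illustrate the difference between the two fallbacks, here is a minimal sketch. The helper bodies and the toy mapping below are assumptions for illustration only; the real `normalize`, `base_language_code`, and `slugify` in generate.py are not shown in this diff, and `language_slug` with its `use_base_code` flag is a hypothetical wrapper.

```python
import re

# Toy 3-letter -> 2-letter mapping, mirroring the Baidu entries in this PR (assumption).
NORMALIZED = {"ina": "ia", "kau": "kr"}


def normalize(code):
    # Stand-in for generate.py's normalize(); the real logic is not shown in the diff.
    return NORMALIZED.get(code, code)


def base_language_code(normalized_code):
    # Assumed behaviour: drop any region or script suffix, e.g. "zh-TW" -> "zh".
    return normalized_code.split("-")[0]


def slugify(name):
    # Simplified stand-in for whichever slugify generate.py imports.
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def language_slug(code, language_name, use_base_code):
    # Hypothetical wrapper around the one-line expression changed in this PR.
    base_code = base_language_code(normalize(code))
    fallback = base_code if use_base_code else code
    return slugify(language_name) if language_name else fallback


print(language_slug("ina", None, use_base_code=False))           # ina
print(language_slug("ina", None, use_base_code=True))            # ia
print(language_slug("ina", "Interlingua", use_base_code=True))   # interlingua
```

Once a name exists for the language, both variants return the same slug, which matches the author's point that the change is currently a no-op for named languages.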

break
if api_id not in [ 'alibaba', 'baidu', 'niutrans' ] and len(base_code) == 2 and not language_name:
Collaborator
Can we make this stricter now?

For example:

if len(base_code) == 2 and not language_name:
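
A rough sketch of what the stricter check would change, assuming the condition is pulled into a helper. `looks_like_typo`, `EXEMPT_APIS`, and the `strict` flag are hypothetical names; generate.py inlines the condition, and what it does when the condition holds is not shown in this diff.

```python
# API ids currently exempted by the condition in generate.py.
EXEMPT_APIS = ["alibaba", "baidu", "niutrans"]


def looks_like_typo(api_id, base_code, language_name, strict=False):
    """Return True when a 2-letter base code has no known language name."""
    unnamed_two_letter = len(base_code) == 2 and not language_name
    if strict:
        # Proposed form: no per-API exemption.
        return unnamed_two_letter
    # Current form: the three exempt APIs are never flagged.
    return api_id not in EXEMPT_APIS and unnamed_two_letter


print(looks_like_typo("baidu", "cr", None))                 # False
print(looks_like_typo("baidu", "cr", None, strict=True))    # True
print(looks_like_typo("baidu", "cr", "Cree", strict=True))  # False
```

With names now filled in for the 2-letter codes, the strict form would flag an exempt API only when a code still lacks a name.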

# This is usually a typo.
@@ -228,10 +228,10 @@ def normalize(code):
'variant_name': variant_name
})
 if not language_slug:
-    if code in UNLISTED_LANGUAGES:
-        UNLISTED_LANGUAGES[code] += 1
+    if base_code in UNLISTED_LANGUAGES:
+        UNLISTED_LANGUAGES[base_code] += 1
     else:
-        UNLISTED_LANGUAGES[code] = 1
+        UNLISTED_LANGUAGES[base_code] = 1

integrations = get_tms_by_id_and_key(api_id, 'api_integrations')

@@ -326,6 +326,7 @@ def normalize(code):
{ content }
''')


### Add unlisted languages to languages.json
for code, _ in UNLISTED_LANGUAGES.items():
unlisted_languages = {
5 changes: 5 additions & 0 deletions integrations/translate5.md
@@ -40,6 +40,11 @@ api_integrations:
name: Microsoft Translator
- slug: language-weaver
name: Language Weaver
- slug: pangeamt
active: false
urls:
- https://confluence.translate5.net/display/CON/PangeaMT
name: PangeaMT
fuzzy_repair: false
open-source: true
quality_estimation_integrations: