Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[$500] Add system for pluralizing strings to our internationalization/localization library #38614

Closed
iwiznia opened this issue Mar 19, 2024 · 77 comments
Assignees
Labels
Engineering External Added to denote the issue can be worked on by a contributor Monthly KSv2 NewFeature Something to build that is a new item. Reviewing Has a PR in review

Comments

@iwiznia
Copy link
Contributor

iwiznia commented Mar 19, 2024

Problem

I've been seeing more and more copy that has singular and plural versions and what we are doing is adding 2 different translations for it, which is less than ideal for a localized app, as some languages have more than 2 forms (plural and singular) and this means we won't be able to support those languages without a major effort

Solution

Add feature to support pluralization in our localization system. How it will work is:

  • You can pass a count param to the translate method
  • When received, it will check if the translation entry is a string, if it is, we will ignore the count param and translate it same as it does now
  • If it is an object, it will:

This way, we have a standard way of pluralizing phrases that works for all languages and avoid having to refactor all this when we decide we want to support localization for real.

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01e4da0f48a9da5a41
  • Upwork Job ID: 1770149691252797440
  • Last Price Increase: 2024-03-19
  • Automatic offers:
    • jayeshmangwani | Reviewer | 0
    • brandonhenry | Contributor | 0
    • shubham1206agra | Contributor | 103171004
    • ZhenjaHorbach | Contributor | 103705648
Issue OwnerCurrent Issue Owner: @twisterdotcom
@iwiznia iwiznia added External Added to denote the issue can be worked on by a contributor Engineering Daily KSv2 NewFeature Something to build that is a new item. labels Mar 19, 2024
Copy link

melvin-bot bot commented Mar 19, 2024

Job added to Upwork: https://www.upwork.com/jobs/~01e4da0f48a9da5a41

@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label Mar 19, 2024
Copy link

melvin-bot bot commented Mar 19, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @jayeshmangwani (External)

Copy link

melvin-bot bot commented Mar 19, 2024

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Daily KSv2 labels Mar 19, 2024
Copy link

melvin-bot bot commented Mar 19, 2024

⚠️ It looks like this issue is labelled as a New Feature but not tied to any GitHub Project. Keep in mind that all new features should be tied to GitHub Projects in order to properly track external CAP software time ⚠️

@ShridharGoel
Copy link
Contributor

Is this open for external contributors? I would like to work on it.

@iwiznia
Copy link
Contributor Author

iwiznia commented Mar 19, 2024

Yes, it is

@ShridharGoel
Copy link
Contributor

ShridharGoel commented Mar 19, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library.

What is the root cause of that problem?

New feature

What changes do you think we should make in order to solve the problem?

Example code to achieve this: (This would be added in the existing getTranslatedPhrase method in Localize/index.ts, after we get the translatedPhrase value)

const count = getCountFromPhraseParameters()

if (translatedPhrase) {
    if (typeof translatedPhrase === 'function') {
        return translatedPhrase(...phraseParameters);
    }

    if (typeof translatedPhrase === 'object' && count !== undefined) {
        const pluralRules = new Intl.PluralRules(language);
        const pluralForm = pluralRules.select(count);

        if (translatedPhrase.hasOwnProperty(pluralForm)) {
            return translatedPhrase[pluralForm]
        } else {
            console.warn(`Plural form '${pluralForm}' not found for phrase '${phraseKey}'`);
            return translatedPhrase.other
        }
    }

    // We set the translated value in the cache only for the phrases without parameters.
    cacheForLocale?.set(phraseKey, translatedPhrase);
    return translatedPhrase;
}

Note: We need to discuss the way to add count in phraseParameters. We can either convert phraseParameters into an object and have different keys for count and the values, or we can keep it as an array and ensure that the first item should be the count.

  • We get the count via phraseParameters if it exists.
  • Inside the method, we first retrieve the translation using the existing logic.
  • If the translation is an object, indicating that there are multiple forms based on count, we perform pluralization.
  • Pluralization is done using the PluralRules API, which selects the appropriate plural category based on the count and language rules.
  • We then look for a matching translation for the selected plural category.
  • If a matching translation is found, we return it. Otherwise, we fallback to the other key or an empty string.
  • Optionally, we log a warning if no matching translation is found for the selected plural category and return the value in other.

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Mar 19, 2024
@ghost
Copy link

ghost commented Mar 19, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library

What is the root cause of that problem?

New Feature

Here :

const translate = useMemo<LocaleContextProps['translate']>(
() =>
(phraseKey, ...phraseParameters) =>
Localize.translate(locale, phraseKey, ...phraseParameters),
[locale],
);

we need to refactor the translate method to add support for Pluralisation

What changes do you think we should make in order to solve the problem?

As discussed above, we need to add do all these things to translate method :

  • You can pass a count param to the translate method
  • When received, it will check if the translation entry is a string, if it is, we will ignore the count param and translate it same as it does now
  • If it is an object, it will:
  • It will then look for an entry in the object that matches what plural rules returned
  • If none is found, log a warn (or throw in dev) and use the translation defined in other

What alternative solutions did you explore? (Optional)

N/A

Fyi If needed I can implement and add test branch as well.

@iwiznia iwiznia self-assigned this Mar 19, 2024
@iwiznia
Copy link
Contributor Author

iwiznia commented Mar 19, 2024

@ShridharGoel I think we can use count if it is passed inside phraseParameters instead of adding a new param (which would also mean updating all usages of that method, since you are changing the param order).

Also you are not doing this correctly:

If none is found, log a warn (or throw in dev) and use the translation defined in other

@ghost
Copy link

ghost commented Mar 19, 2024

hey @iwiznia ! Do you need sample code ? Just asking hope you don't mind asking me this.

@ShridharGoel
Copy link
Contributor

Proposal

Updated CC: @iwiznia

@iwiznia
Copy link
Contributor Author

iwiznia commented Mar 19, 2024

hey @iwiznia ! Do you need sample code ? Just asking hope you don't mind asking me this.

It always helps.

Example code to achieve this: (This would be added in place of the existing translate method in LocaleContextProvider)

I think this is not correct? The code for translating is in Localize/index.ts...

            const count = phraseParameters[0]?.count ?? 1; // Extract count from phraseParameters or default to 1

If no count is passed, it should treat it as such, not default to 1

@ShridharGoel
Copy link
Contributor

Proposal

Updated with code optimisation and example usage.

@ShridharGoel
Copy link
Contributor

@iwiznia I actually made an update just now, can you have a look?

@iwiznia
Copy link
Contributor Author

iwiznia commented Mar 19, 2024

Did you see this?

I think this is not correct? The code for translating is in Localize/index.ts...

Also: I think these are wrong:

                  return translationEntry[pluralForm];
                  return translationEntry.other; // Fallback to 'other' form

Because if the translation has other parameters, they will not be replaced. Plus, a translation entry can be a function too.

@ShridharGoel
Copy link
Contributor

ShridharGoel commented Mar 19, 2024

@iwiznia Yes, it makes more sense to add in Localize/index.ts. I'll update it accordingly.

Can we also change phraseParameters to accept such an object?

{
  "count": 2,
  "values": []
}

Else, we can keep it as an array and make count to be the first element always.

@brandonhenry
Copy link
Contributor

brandonhenry commented Mar 19, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Add system for pluralizing strings to our internationalization/localization library to support languages with more than two pluralization forms.

What is the root cause of that problem?

New Feature
The current getTranslatedPhrase function in Localize/index.ts does not handle pluralization rules for languages with more than singular and plural forms. It needs to be enhanced to support the diverse pluralization categories used by different languages.

What changes do you think we should make in order to solve the problem?

I propose the following changes to the getTranslatedPhrase function in Localize/index.ts:

Handling of translated phrases and count parameter

  • If the translated phrase is a string, it is cached and returned immediately. This is the same behavior as the original function.
  • If the translated phrase is a function, it is evaluated with the provided parameters, including the count parameter. The count parameter is passed as a separate argument to the translation function. This allows the translation function to destructure the count parameter and use it in the translation.
  • If the translated phrase is an object, it is assigned to the phrase variable for further processing. This change was made to handle pluralization, where the translated phrase is an object containing different pluralization forms.
if (translatedPhrase) {
    if (typeof translatedPhrase === 'string') {
        cacheForLocale?.set(phraseKey, translatedPhrase);
        return translatedPhrase;
    }

    let phrase: string | Record<Intl.LDMLPluralRule, string>;

    if (typeof translatedPhrase === 'function') {
        phrase = translatedPhrase(...phraseParameters);
    } else {
        phrase = translatedPhrase;
    }

    if (typeof phrase === 'string') {
        return phrase;
    }
    // ...
}

Handling of phrase variable

If phrase is a string, it is returned as the final translated phrase. This is a new check added to handle the case where the translated phrase is an object, but the evaluated result (phrase) is a string.

if (typeof phrase === 'string') {
    return phrase;
}

Pluralization handling

  • The count is extracted from phraseParameters (if provided) to determine the pluralization form.
  • The Intl.PluralRules instance for the current language is memoized to avoid creating multiple instances. This improves performance by reusing the same instance for subsequent calls with the same language.
  • The pluralization category is determined based on the count and language rules using Intl.PluralRules.
  • If the phrase object has a translation for the pluralization category, it is returned. If the translation is a function, it is evaluated with the provided parameters before returning. This allows for dynamic translations based on the pluralization form.
  • If no translation is found for the exact category, it fallbacks to the 'other' category if it exists. This provides a fallback mechanism for languages that may not have specific translations for all pluralization forms.

Since count can be passed in anywhere in phraseParameters - we need a deserializer to get count from it.

function deserializePhraseParamsForCount(phraseParameters: any[]): number | undefined {
  for (const param of phraseParameters) {
    if (typeof param === 'number') {
      return param;
    } else if (Array.isArray(param)) {
      const countParam = findCountParam(param);
      if (countParam !== undefined) {
        return countParam;
      }
    } else if (typeof param === 'object' && param !== null) {
      const countParam = findCountParam(Object.values(param));
      if (countParam !== undefined) {
        return countParam;
      }
    }
  }
  return undefined;
}
const count = deserializePhraseParamsForCount(phraseParameters);

if (!memoizedPluralRules[language]) {
    memoizedPluralRules[language] = new Intl.PluralRules(language);
}

const pluralCategory = memoizedPluralRules[language].select(count);

if (phrase.hasOwnProperty(pluralCategory)) {
    const translation = phrase[pluralCategory];
    if (typeof translation === 'function') {
        return translation(...phraseParameters);
    }
    return translation;
}

if (phrase.hasOwnProperty('other')) {
    const fallbackTranslation = phrase['other'];
    if (typeof fallbackTranslation === 'function') {
        return fallbackTranslation(...phraseParameters);
    }

    const warningMessage = `Plural form '${pluralCategory}' not found for phrase '${phraseKey}'`;
    Log.warn(warningMessage);
    if (__DEV__) {
        throw new Error(warningMessage);
    }
    // Use the 'other' translation as the fallback
   
    return fallbackTranslation;
}

Error handling

  • If no translation is found for the pluralization category or the 'other' fallback, a warning message is logged using Log.warn.
  • In development mode (__DEV__), an error is thrown with the warning message to alert developers about missing translations.
  • In production, the phrase key itself is returned as a fallback to prevent the application from breaking due to missing translations.
const warningMessage = `Plural form '${pluralCategory}' not found for phrase '${phraseKey}'`;
Log.warn(warningMessage);
if (__DEV__) {
    throw new Error(warningMessage);
}
return phraseKey;

Fallback language handling:

  • If no translated phrase is found for the current language, the function falls back to the specified fallback language by recursively calling getTranslatedPhrase with the fallback language.
  • If no translation is found in the fallback language, it tries the default language (en) and logs an alert if the translation is not found.
if (fallbackLanguage) {
    const fallbackTranslatedPhrase = getTranslatedPhrase(fallbackLanguage, phraseKey, null, ...phraseParameters);
    if (fallbackTranslatedPhrase) {
        return fallbackTranslatedPhrase;
    }
}

if (language !== CONST.LOCALES.DEFAULT) {
    Log.alert(`${phraseKey} was not found in the ${language} locale`);
    return getTranslatedPhrase(CONST.LOCALES.DEFAULT, phraseKey, null, ...phraseParameters);
}

Return value

If no translation is found in the default language, the function returns null instead of the phrase key. This change was made to differentiate between a missing translation and a fallback to the phrase key itself.

return null;

Note: we could also consider extracting the pluralization logic into a separate function to keep the getTranslatedPhrase function focused on translation retrieval and caching.

With these changes, the count parameter is passed as a separate argument to the translation function, allowing it to be destructured and used in the translation. This enables the usage of dynamic translations based on the count value, as shown in the example:

someTranslation: {
    zero: `There's no transactions`,
    one: `There's one transaction`,
    many: ({count}) => `There's ${count} transactions`,
}

The getTranslatedPhrase function will automatically pass the count parameter to the translation function when it is called, so you can use it directly in the translation.

Changes

All changes can be seen in this PR here

@ShridharGoel
Copy link
Contributor

Proposal

Updated

Copy link

melvin-bot bot commented Jul 18, 2024

📣 @shubham1206agra 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job
Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻
Keep in mind: Code of Conduct | Contributing 📖

@shubham1206agra
Copy link
Contributor

If that is true, I think it was not intended at all and probably a mistake when converting it to ts? Yep, it was done here (@BartoszGrajdek can you clarify why please?). Did not check all usages, but a quick look make it seem we don't use it like that anywhere.

@iwiznia At the time we migrated it I remember seeing some usages where translate had more than one parameter, that is why it was implemented like this.

If it's really the case that translate takes one parameter max, this makes things only easier and you can go ahead and refactor it as part of this issue, if you need help please let me know! 😄

Progress update: I am in the process of merging all parameters into one. This is taking some time as there are many keys to check. I will open a PR on Monday.

@jayeshmangwani
Copy link
Contributor

Sounds good

@shubham1206agra
Copy link
Contributor

I have asked @jayeshmangwani to review this first.

@melvin-bot melvin-bot bot added Weekly KSv2 and removed Weekly KSv2 labels Aug 5, 2024
@shubham1206agra
Copy link
Contributor

Waiting on @jayeshmangwani to final review the PR.

Copy link

melvin-bot bot commented Aug 27, 2024

📣 @ZhenjaHorbach 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job
Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻
Keep in mind: Code of Conduct | Contributing 📖

@mallenexpensify
Copy link
Contributor

Swapped C+ assignment per this discussion.

Copy link

melvin-bot bot commented Sep 23, 2024

This issue has not been updated in over 15 days. @iwiznia, @twisterdotcom, @shubham1206agra, @ZhenjaHorbach eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

@melvin-bot melvin-bot bot added the Monthly KSv2 label Sep 23, 2024
@twisterdotcom
Copy link
Contributor

How many people need paying for this big one? I see @jayeshmangwani and @shubham1206agra were involved as C+?

@jayeshmangwani
Copy link
Contributor

@shubham1206agra and @ZhenjaHorbach needed to be paid here.

@ZhenjaHorbach
Copy link
Contributor

Oh, yeah
I forgot about it 😅

@shubham1206agra is Contributor here
And @ZhenjaHorbach is Contributor+ here

I will add BugZero Checklist today or tomorrow !

@shubham1206agra
Copy link
Contributor

@twisterdotcom I want to ask for pay increase here as the work here was more than the initially anticipated.

@twisterdotcom
Copy link
Contributor

Nice, thanks for asking. Just discussing internally, but I see no reason not to increase it - just figuring out what to: https://expensify.slack.com/archives/C01SKUP7QR0/p1728559862581289

@twisterdotcom
Copy link
Contributor

Okay, gonna add another $250 here folks.

@twisterdotcom
Copy link
Contributor

Payment Summary:

@shubham1206agra
Copy link
Contributor

@twisterdotcom Accepted offer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Engineering External Added to denote the issue can be worked on by a contributor Monthly KSv2 NewFeature Something to build that is a new item. Reviewing Has a PR in review
Projects
No open projects
Status: Done
Development

No branches or pull requests