ECMAScript Proposal: Optional ICU Compatibility for Intl API #913

Open
ptu14 opened this issue Jul 31, 2024 · 4 comments
Labels
c: meta (Component: intl-wide issues), s: comment (Status: more info is needed to move forward)

Comments

@ptu14

ptu14 commented Jul 31, 2024

ECMA-402 Proposal: Enhanced ICU Integration for Intl API

Champion(s)

[Names of champions]

Stage

Stage 0

Motivation

The Intl API currently provides a standardized way for JavaScript applications to handle internationalization. However, it doesn't fully expose the capabilities of the International Components for Unicode (ICU) library, which is already included in most JavaScript environments. This proposal aims to bridge that gap, providing developers with more powerful and flexible internationalization tools while leveraging existing resources.

Key Points

  1. Browsers already include full ICU: Major browsers like Chrome, Firefox, and Safari already ship with complete ICU implementations. This proposal seeks to expose these existing capabilities more directly.

  2. Node.js supports configurable ICU data: Node.js can load external ICU data from a path given in the NODE_ICU_DATA environment variable (or the --icu-data-dir command-line flag). This demonstrates the flexibility and existing support for comprehensive ICU functionality in server-side environments.

  3. Underutilized resources: Despite the presence of full ICU capabilities in most JavaScript environments, developers cannot fully leverage these resources through the current Intl API.

  4. Potential for performance improvements: Direct access to ICU functions could reduce the overhead of the current abstraction layer in the Intl API.
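As an illustration of point 2, here is a minimal sketch of how a program might probe at run time whether locale data beyond English is loaded. The month-name check follows the approach described in the Node.js internationalization documentation; the function name `hasFullICU` is illustrative and not part of any API.

```javascript
// Probe whether non-English locale data is available by formatting a Spanish month name.
// `hasFullICU` is an illustrative helper name, not an existing API.
function hasFullICU() {
  try {
    const january = new Date(9e8); // 1970-01-11 UTC, which falls in January in every time zone
    const spanish = new Intl.DateTimeFormat('es', { month: 'long' });
    return spanish.format(january) === 'enero';
  } catch (err) {
    // Treat any failure to construct or format as "data not available".
    return false;
  }
}

console.log(hasFullICU());
```

A runtime built with a reduced data set would fall back to English month names here, so the probe would return false.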

Prior Art

This proposal builds directly on the International Components for Unicode (ICU) library, which is already integrated into:

  1. All major browsers (Chrome, Firefox, Safari, Edge)
  2. Node.js (with configurable data via NODE_ICU_DATA)
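  3. Java, through ICU4J (the JDK's own java.text and java.util classes share ancestry with ICU but are a separate implementation)
  4. .NET's System.Globalization namespace (backed by ICU by default since .NET 5)
  5. C and C++ applications, through ICU4C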

Description

We propose introducing an enhanced mode for the Intl API that provides more direct access to the existing ICU capabilities:

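// Proposed API: `enhancedICU` and `pattern` are not part of Intl today.
const formatter = new Intl.DateTimeFormat('en-US', {
  enhancedICU: true,
  pattern: 'EEEE, MMMM d, y', // ICU date format pattern
  timeZone: 'UTC' // '2023-05-17' parses as UTC midnight, so format in UTC
});
console.log(formatter.format(new Date('2023-05-17')));
// Intended output: Wednesday, May 17, 2023

// Proposed API: `strength` is an ICU collation attribute, not a current Intl.Collator option.
const collator = new Intl.Collator('de-DE', {
  enhancedICU: true,
  strength: 'quaternary' // ICU collation strength
});
console.log(collator.compare('ä', 'a')); // More precise comparison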
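For comparison, the following sketch shows how close the existing Intl API gets to the two examples above without any proposed options. It uses only standard `Intl.DateTimeFormat` and `Intl.Collator` options; the variable names are illustrative. It is meant as a baseline for discussion, not as a claim that the existing options cover every ICU pattern or strength level.

```javascript
// Runs on today's Intl; no proposed options are used.
const date = new Date('2023-05-17'); // date-only ISO strings parse as UTC midnight

const currentFormatter = new Intl.DateTimeFormat('en-US', {
  weekday: 'long',
  month: 'long',
  day: 'numeric',
  year: 'numeric',
  timeZone: 'UTC', // keep the calendar date stable regardless of the host time zone
});
const formatted = currentFormatter.format(date);
console.log(formatted); // Wednesday, May 17, 2023

// `sensitivity` selects which kinds of difference make two strings compare as unequal.
const accentAware = new Intl.Collator('de-DE', { sensitivity: 'accent' });
const baseOnly = new Intl.Collator('de-DE', { sensitivity: 'base' });
const accentResult = accentAware.compare('ä', 'a'); // positive: 'ä' sorts after 'a'
const baseResult = baseOnly.compare('ä', 'a'); // 0: accent differences are ignored
console.log(accentResult > 0, baseResult === 0);
```

The options bag lets the locale decide field order and punctuation, whereas an ICU pattern string fixes them. That difference is the main trade-off between the two styles.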

Expensive to Implement in Userland

Implementing ICU-level functionality in pure JavaScript would be prohibitively expensive:

  1. Data size: Full ICU data is already included in browsers and configurable in Node.js. Reimplementing this in JS would unnecessarily duplicate large amounts of data.
  2. Algorithm complexity: Many ICU algorithms are highly optimized and would be inefficient if reimplemented in JavaScript.
  3. Maintenance burden: Keeping up with Unicode standards and CLDR updates is already handled by ICU maintainers.

Broad Appeal

  1. npm statistics: Popular i18n libraries like moment.js (11M weekly downloads) and date-fns (27M weekly downloads) demonstrate the high demand for advanced date formatting capabilities.
  2. Framework adoption: React-intl (2.5M weekly downloads) and Angular's i18n module showcase the need for powerful i18n tools in major frameworks.
  3. High-profile use cases:
    • Google Calendar requires advanced date/time formatting and calculations.
    • Booking.com needs precise collation for multilingual hotel searches.
    • Twitter needs language detection and sorting for multilingual content.

Detailed Design

  1. Introduce an enhancedICU option to all Intl constructors.
  2. When enhancedICU is true, allow the use of ICU patterns and options directly.
  3. Provide access to additional ICU features like "compound formats" for DateTimeFormat, "alternate handling" for Collator, etc.
  4. Expose an Intl.getICUVersion() method to check the underlying ICU version.
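To make item 4 concrete, here is a hedged userland sketch of what a version query could look like. `Intl.getICUVersion()` does not exist today. The stand-in below reads `process.versions.icu`, which Node.js exposes when built with ICU, and returns undefined in other environments. The names `getICUVersion` and `hasICUAtLeast` are hypothetical.

```javascript
// Hypothetical userland stand-in for the proposed Intl.getICUVersion().
// Returns a version string where the host exposes one, otherwise undefined.
function getICUVersion() {
  const versions = typeof process !== 'undefined' ? process.versions : undefined;
  return versions && typeof versions.icu === 'string' ? versions.icu : undefined;
}

// Gate a code path on a minimum major version; an unknown version counts as unsupported.
function hasICUAtLeast(major) {
  const version = getICUVersion();
  return version !== undefined && Number.parseInt(version, 10) >= major;
}

console.log(getICUVersion());
console.log(hasICUAtLeast(70));
```

Gating behavior on a version number like this ties application code to a particular ICU release, which is worth weighing when deciding whether to expose the version at all.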

Payload Mitigation

This proposal does not increase implementation size because:

  1. Browsers already include full ICU implementations.
  2. Node.js allows configuring ICU data separately.
  3. The proposal exposes existing functionality rather than adding new data or algorithms.
  4. Any additional code would be minimal, primarily consisting of new API surface to expose existing ICU capabilities.

Compatibility

This proposal is fully backwards-compatible. All existing Intl functionality remains unchanged when enhancedICU is not used.
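One consequence of this can be sketched with today's engines: Intl constructors only read the option names they know, so an `enhancedICU` flag passed to a current implementation would be silently ignored rather than rejected. The helper `readsOption` below is an illustrative name. It uses a getter to observe whether a constructor reads a given property, and it only reports what the running engine does.

```javascript
// Reports whether `Ctor` reads the option named `name` during construction.
// `readsOption` is an illustrative helper, not an existing API.
function readsOption(Ctor, name) {
  let wasRead = false;
  const options = Object.defineProperty({}, name, {
    get() {
      wasRead = true;
      return undefined; // behave as if the option was not provided
    },
  });
  try {
    new Ctor('en', options);
  } catch (err) {
    // Construction errors do not matter for this probe.
  }
  return wasRead;
}

const knownOption = readsOption(Intl.DateTimeFormat, 'timeZone'); // a standard option
const proposedOption = readsOption(Intl.DateTimeFormat, 'enhancedICU'); // the proposed flag

// Passing the proposed flag today neither throws nor shows up in the resolved options.
const fmt = new Intl.DateTimeFormat('en-US', { enhancedICU: true });
const ignored = !('enhancedICU' in fmt.resolvedOptions());

console.log(knownOption, proposedOption, ignored);
```

Code written against the proposed flag would therefore fall back to default formatting on engines that lack it, so callers would likely need a detection step of this kind.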

Implementation

The implementation would primarily involve creating new bindings between the existing ICU implementations in JavaScript engines and the Intl API surface. Most of the heavy lifting is already done by the included ICU libraries.

Summary

By providing enhanced ICU integration, this proposal significantly improves ECMA-402's internationalization capabilities, aligns closely with existing implementations, and meets the criteria for addition to the specification. It leverages resources already present in JavaScript environments, provides powerful tools that are expensive to implement in userland, has broad appeal, and does not increase payload size.

@ptu14
Author

ptu14 commented Jul 31, 2024

could potentially solve #891

@eemeli
Member

eemeli commented Aug 1, 2024

Expanding the Intl API to provide everything that's made available by ICU4C/ICU4J would also require any alternative to those libraries (such as ICU4X) to provide the same features.

It's also not clear from the proposal what the extent of the ask here is; what new options and such would be included?

Do I gather right that the set of options enabled by enhancedICU would be defined separately for each ICU version, and potentially include breaking changes?

@srl295
Member

srl295 commented Aug 1, 2024

Speaking as a proponent of expansion, I have to say that some ICU API/implementation has roots going back over 30 years (not a typo). I don't think it's appropriate to just expose everything.

As to the version, there are pros and big cons in "leaking" this level of detail. I'd be more in favor of a way to get at the CLDR version or an equivalent, something such as data = "https://cldr.unicode.org 46.0.0", giving a provenance for the data set (not the implementation). But there are major caveats (customization, which everyone does, and more importantly the potential for misuse).

@sffc
Contributor

sffc commented Aug 22, 2024

ECMA-402 is designed from the ground up to express what Web and client-side developers need as opposed to what ICU4C happens to provide. There is a lot of functionality in ICU4C that is simply not relevant to the Web platform.

Additionally, ICU4C APIs carry cruft and do not always represent modern i18n best practices.

I agree with @srl295 that if we want to work toward a lower-level abstraction than ECMA-402, I'd look into CLDR. One of my first issues on this repo is #210 for exactly this type of functionality.

@ptu14, if you can identify specific gaps that you need in the Web platform, those can be individual proposals. For example, you mention date skeleton strings and "high demand for advanced date formatting capabilities." We could discuss specific date formatting proposals on their own merits.

If you would like to discuss this, please join [email protected] and we can schedule this for an upcoming call. If you are able to travel, many of us will also be here for the Unicode Technology Workshop on October 22-23 and an ECMA-402 Face-to-Face on October 24.

@sffc sffc added c: meta Component: intl-wide issues s: comment Status: more info is needed to move forward labels Aug 22, 2024