Reduce data size for dateTimeSymbolMap and others #697

Open
mosuem opened this issue Aug 10, 2023 · 1 comment
Labels
package:intl type-enhancement A request for a change that isn't a bug

Comments

@mosuem
Member

mosuem commented Aug 10, 2023

The current design of package:intl bundles all data together in one large map. As a result, importing, for example, pkgs/intl/lib/date_symbol_data_local.dart loads or compiles into the app the data for all locales by default, not just the ones the user cares about. The workaround is to include the data for each locale manually through pkgs/intl/lib/date_symbol_data_custom.dart, which is more involved.

To make feature requests such as #696 possible without increasing this default loading size, we could reduce redundancy by storing only the diffs of DateSymbols (or similar classes) relative to their parents, or to a default base.

This would allow variants of a language to be stored compactly.
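
A minimal sketch of what storing diffs against a parent could look like, assuming a heavily simplified stand-in for DateSymbols. The class names `Symbols` and `SymbolsDiff`, the `applyTo` method, and the field values are all hypothetical and only illustrate the idea; they are not part of package:intl.

```dart
// Hypothetical, simplified stand-in for DateSymbols (the real class has many more fields).
class Symbols {
  final String name;
  final List<String> narrowWeekdays;
  final int firstDayOfWeek;

  const Symbols({
    required this.name,
    required this.narrowWeekdays,
    required this.firstDayOfWeek,
  });
}

// Hypothetical diff record: a null field means "same as the parent".
class SymbolsDiff {
  final String name;
  final List<String>? narrowWeekdays;
  final int? firstDayOfWeek;

  const SymbolsDiff({
    required this.name,
    this.narrowWeekdays,
    this.firstDayOfWeek,
  });

  // Resolves the diff against its parent, reusing the parent's value
  // for every field that the diff does not override.
  Symbols applyTo(Symbols parent) => Symbols(
        name: name,
        narrowWeekdays: narrowWeekdays ?? parent.narrowWeekdays,
        firstDayOfWeek: firstDayOfWeek ?? parent.firstDayOfWeek,
      );
}

// Illustrative values only.
const base = Symbols(
  name: 'en',
  narrowWeekdays: ['S', 'M', 'T', 'W', 'T', 'F', 'S'],
  firstDayOfWeek: 6,
);
const variant = SymbolsDiff(name: 'en_GB', firstDayOfWeek: 0);

void main() {
  final resolved = variant.applyTo(base);
  print('${resolved.name}: firstDayOfWeek=${resolved.firstDayOfWeek}');
  // The unchanged list is the parent's list object, not a copy.
  print(identical(resolved.narrowWeekdays, base.narrowWeekdays));
}
```

In this sketch the diff still carries one nullable slot per overridable field, so the actual saving would depend on how compactly the absent fields can be encoded.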

@mosuem mosuem added type-enhancement A request for a change that isn't a bug package:intl labels Aug 10, 2023
@rakudrama
Member

Regarding size:

How is the data size being measured? If there were a change to the supported locales or to one of the compilers, how would we know whether it was an improvement or a regression?

Regarding redundancies:

There is already considerable sharing between DateSymbols objects due to the same const lists being used by different locales.

Locales that are substantially the same as another locale incur the size cost of referencing mostly the same O(20) lists used by the other locale, but not the size of the shared lists. Adding N closely related locales is not as expensive as adding N very different locales.
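
The sharing described above follows from Dart canonicalizing equal const values. A small sketch, using a hypothetical two-field `Symbols` class and made-up locale data rather than the real DateSymbols:

```dart
// Hypothetical, simplified stand-in for DateSymbols.
class Symbols {
  final String name;
  final List<String> narrowWeekdays;
  const Symbols(this.name, this.narrowWeekdays);
}

// Two locales that spell out the same const list independently.
const enUS = Symbols('en_US', ['S', 'M', 'T', 'W', 'T', 'F', 'S']);
const enAU = Symbols('en_AU', ['S', 'M', 'T', 'W', 'T', 'F', 'S']);

void main() {
  // Equal const lists are canonicalized, so both locales reference
  // a single list object and only pay for the reference.
  print(identical(enUS.narrowWeekdays, enAU.narrowWeekdays)); // true
  // The locale objects themselves differ because their names differ.
  print(identical(enUS, enAU)); // false
}
```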

I think it will be hard to come up with a general scheme of diffs against a base where the cost of expressing what differs does not substantially eat into the savings.

One might restructure DateSymbols to get more benefit from the const sharing.
For example, a const class holding NARROWMONTHS, STANDALONENARROWMONTHS, NARROWWEEKDAYS, and STANDALONENARROWWEEKDAYS could replace many clusters of 4 lists with a common object. This saves 3 fields when the cluster is re-used, but costs a separate object when the cluster is unique. It is not clear this would be a win overall.
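
A rough sketch of that restructuring, assuming a hypothetical `NarrowNames` cluster class and a reduced `Symbols` class; neither exists in package:intl, and the data is illustrative:

```dart
// Hypothetical cluster of the four "narrow" lists.
class NarrowNames {
  final List<String> narrowMonths;
  final List<String> standaloneNarrowMonths;
  final List<String> narrowWeekdays;
  final List<String> standaloneNarrowWeekdays;

  const NarrowNames(
    this.narrowMonths,
    this.standaloneNarrowMonths,
    this.narrowWeekdays,
    this.standaloneNarrowWeekdays,
  );
}

// Hypothetical, simplified stand-in for DateSymbols: one field
// for the cluster instead of four list fields.
class Symbols {
  final String name;
  final NarrowNames narrow;
  const Symbols(this.name, this.narrow);
}

const _months = ['J', 'F', 'M', 'A', 'M', 'J', 'J', 'A', 'S', 'O', 'N', 'D'];
const _weekdays = ['S', 'M', 'T', 'W', 'T', 'F', 'S'];
const latinNarrow = NarrowNames(_months, _months, _weekdays, _weekdays);

const en = Symbols('en', latinNarrow);
const enAU = Symbols('en_AU', latinNarrow);

void main() {
  // Both locales hold a single reference to the same cluster object.
  print(identical(en.narrow, enAU.narrow)); // true
}
```

A locale whose cluster is unique would still need its own `NarrowNames` instance, which is the extra object cost mentioned above.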

Regarding the large map:

The large map has ~120 locales. The map is constructed via a function that is passed to initialization code and called on demand. It would probably be best if the entire map were a const object.

Native AOT would initialize the const map quickly and not have the code size tax of a huge function. dart2js would have to initialize the map early, but we are considering some improvements in this area. Either way, for dart2js there is some code to construct the Map, but the constant is more easily understood by the compiler.
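
To make the contrast concrete, here is a small sketch of the two shapes, assuming a hypothetical one-field `Symbols` class and the hypothetical names `buildSymbolMap` and `constSymbolMap`; the real map and its entries are much larger.

```dart
// Hypothetical, simplified stand-in for DateSymbols.
class Symbols {
  final String name;
  const Symbols(this.name);
}

const en = Symbols('en');
const de = Symbols('de');

// Function-built shape: the map is constructed by code each time it is called.
Map<String, Symbols> buildSymbolMap() => {'en': en, 'de': de};

// Const shape: the whole map is a compile-time constant.
const Map<String, Symbols> constSymbolMap = {'en': en, 'de': de};

void main() {
  // Each call runs the function body and allocates a fresh map.
  print(identical(buildSymbolMap(), buildSymbolMap())); // false
  // An equal const map literal resolves to the same canonical object.
  const again = <String, Symbols>{'en': en, 'de': de};
  print(identical(constSymbolMap, again)); // true
  print(constSymbolMap['de']!.name); // de
}
```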

A recent change in how dart2js represents constant Maps with String keys reduced the size of the patterns maps, exploiting the redundancy that each map has the same keys.
