Add GF_Latin_PriAfrican #137
Please hold on a merge until we've completed the review of the data in this PR. I'll update in this thread when we are ready. |
This PR will be reviewed by @moyogo and @NeilSureshPatel |
@chrissimpkins at first glance ebreve, ibreve and obreve look out of place. They don't seem to be used in those languages, and their uppercase counterparts are missing. |
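A gap like the missing uppercase counterparts can be spotted mechanically. The following is a minimal sketch, assuming the glyphset is available as a plain string of characters; the function name and the sample string are made up for illustration and are not the PR's data.

```python
def missing_case_partners(chars):
    """Return characters whose single-character case partner is absent from chars."""
    present = set(chars)
    missing = set()
    for ch in present:
        for partner in (ch.upper(), ch.lower()):
            # Skip uncased characters and multi-character mappings such as "ß" -> "SS".
            if partner != ch and len(partner) == 1 and partner not in present:
                missing.add(partner)
    return sorted(missing)


# Hypothetical sample: Eng in both cases, plus lowercase breve vowels only.
sample = "\u014a\u014b\u0115\u012d\u014f"
print(missing_case_partners(sample))
```

The check relies on Python's built-in case mappings, so it only flags simple one-to-one pairs.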
@chrissimpkins I wanted to confirm that the priority list is meant to cover the languages in the countries Dave mentions in #136. We had discussed different tiers of priorities in the past, so I want to make sure I am looking at the right things. Thanks. |
I shared a Doc with you that details the list of languages. |
@chrissimpkins One more question. Are the glyphsets meant to be exhaustive or additive? There are things missing but if all fonts are required to support levels 1-4, then I can see that some of the missing things are not required if they are already covered in the other Latin coverage levels. |
This is meant to be a superset of the Latin Core set to fully support the target languages. So, the question is: if a family includes Latin Core + Latin PriAfrican coverage, does it fully support the languages of interest? |
Language data
I’ve updated language data for some of the target languages in gflanguages PR 114. The Afrikaans base characters were incorrect; however, the update doesn’t affect this glyphset. Note: Yoruba has been split into Yoruba (Nigeria) As mentioned in the meeting:
GF Latin African Pri
Besides Ŋŋ, needed for Luganda spoken in Uganda (and for several languages not listed in the targeted languages), this is mostly a priority Nigerian language glyph set. The following target languages are already supported by GF Latin Core:
Note: Xhosa (xh) and Zulu (zu), spoken in South Africa, require lowercase-to-uppercase kerning, as they use lowercase prefixes at the beginning of proper nouns. This can be narrowed down to the lowercase vowels aeiou in front of uppercase letters. For example, "eVe" in "eVenda" should kern visually symmetrically in most designs.
This glyphset adds support for:
Characters that should be removed
Some characters should not be in the GF Latin African Pri glyphset:
Missed opportunity
Since several countries listed in the shared document use none of the target languages, it seems there are easy additions that should be recommended. I’d strongly recommend adding 6 characters Ɛ ɛ Ɲ ɲ Ɔ ɔ:
Besides this total of 70M speakers, a lot of additional languages would be supported with these additions; these are the larger ones in the countries listed. Note: like Ŋ, Ɲ can have both an n-form and an N-form; both forms are not necessary at the same time. |
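The Xhosa and Zulu kerning note in this comment can be turned into a concrete checklist. This is only a sketch: it assumes the uppercase side is the basic A-Z alphabet, whereas a real font would use whatever capitals it actually contains, and the names below are hypothetical.

```python
import string

# Lowercase prefix vowels named in the comment.
LOWERCASE_PREFIX_VOWELS = "aeiou"


def prefix_kerning_pairs(uppercase=string.ascii_uppercase):
    """List (lowercase vowel, uppercase letter) pairs worth reviewing for kerning."""
    return [(v, u) for v in LOWERCASE_PREFIX_VOWELS for u in uppercase]


pairs = prefix_kerning_pairs()
print(len(pairs))           # 5 vowels x 26 capitals
print(("e", "V") in pairs)  # the "eVenda" case
```

A designer would still judge each pair visually; the list only narrows down what to look at.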
Let's not miss the opportunity. I think the Eng here is also a locl version, so there are 2 versions of that to be added, with the needed feature code. |
I think you are saying we need both, even though it's split, which I agree with. Thanks to @Black-sage for explaining to me that the ethnic integration across state lines is strong :) |
Revisions based on input from Neil Patel and Denis Jacquerye, including the comments in #137 (comment)
@chrissimpkins Here’s the output for the .nam file with a patched version of Yanone’s
The difference with the latest file in the PR is that the following are missing:
Somehow we missed capital S with dot below (see #141). |
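A comparison like the one described here, between a generated .nam file and the one in the PR, amounts to a set difference on code points. The sketch below assumes each relevant line starts with a 0x-prefixed hexadecimal code point and that other lines (comments, blanks) can be ignored; the two sample texts are invented for illustration.

```python
def read_codepoints(nam_text):
    """Collect code points from lines whose first token is a 0x-prefixed hex value."""
    codepoints = set()
    for line in nam_text.splitlines():
        parts = line.split()
        if parts and parts[0].lower().startswith("0x"):
            codepoints.add(int(parts[0], 16))
    return codepoints


# Invented sample data, not the contents of the real files.
generated = (
    "0x1E62 LATIN CAPITAL LETTER S WITH DOT BELOW\n"
    "0x1E63 LATIN SMALL LETTER S WITH DOT BELOW\n"
)
in_pr = "# sample comment line\n0x1E63 LATIN SMALL LETTER S WITH DOT BELOW\n"

missing = sorted(read_codepoints(generated) - read_codepoints(in_pr))
print([f"U+{cp:04X}" for cp in missing])
```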
@chrissimpkins I opened #144 for those fixes. |
Reviewed and dropped a request there. Thank you very much Denis! |
…tch1 Add Sdotbelow, dotbelowcomb to PriAfrican glyph set, update data.json mapping for PriAfrican glyph set
We merged Denis' #144 into this branch. |
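The two additions are related: capital S with dot below (U+1E62) decomposes into S followed by the combining dot below (U+0323). A small sketch using Python's standard `unicodedata` module shows the relationship; mapping the glyph names Sdotbelow and dotbelowcomb to these code points is the usual convention and is assumed here.

```python
import unicodedata

S_DOT_BELOW = "\u1E62"  # LATIN CAPITAL LETTER S WITH DOT BELOW

# Canonical decomposition: base letter followed by the combining mark.
decomposed = unicodedata.normalize("NFD", S_DOT_BELOW)
print([f"U+{ord(c):04X}" for c in decomposed])

# Recomposition yields the precomposed character again.
print(unicodedata.normalize("NFC", "S\u0323") == S_DOT_BELOW)
```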
I am excited for this to be ready. |
I think that we are nearly ready to merge. Open for any final comments / recommendations before we do so. |
This adds 33 encoded characters; are there any unencoded (locl) forms to go with them?
Also please update the root README to reflect this addition to the sets
Considering the scope of the glyphset, they are not included.
The N-form /Eng, usually encoded as the default /Eng or as /Eng.loclNSM and preferred in Northern Sami, likely belongs to GF Latin Beyond, since Northern Sami is not covered by GF Latin Core or GF Latin PriAfrican: it’s spoken in Finland, Norway and Sweden. Liberia’s /Bhook (used in Gio, Loma and Liberia Kpelle, which are out of scope), similar to /Btopbar, is not included. The N-form and n-form of /Nhookleft are not differentiated here, as the n-form, like /Eng, should be the default when applicable. |
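For readers unfamiliar with what a locl form involves, the sketch below prints the kind of OpenType feature code that would swap the default /Eng for /Eng.loclNSM under the Northern Sami language tag (NSM). It is illustrative only: the helper name is hypothetical, and nothing like this is part of the PR, since the N-form is out of scope here.

```python
def locl_rule(script, language, default_glyph, local_glyph):
    """Build a one-substitution locl block in OpenType feature-file syntax."""
    return "\n".join([
        "feature locl {",
        f"    script {script};",
        f"    language {language};",
        f"    sub {default_glyph} by {local_glyph};",
        "} locl;",
    ])


print(locl_rule("latn", "NSM", "Eng", "Eng.loclNSM"))
```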
Thanks for the quick and thorough response Denis! I'll update the README |
This morning, we discussed the upcoming African Glyphsets definition to clarify their differences. Understanding this one would be a subset of the Latin-SSA, we should review the following both for this PR and the Latin-SSA under development:
cc @RosaWagner |
Waiting to resolve Viv's recommendation in #137 (comment) to merge this. Thoughts @davelab6? The PriAfrican name was as you recommended in #136. The modularity issue is a good point. Should this glyph set be a subset of the pan-African set? Or a separate set and projects must layer pan-African on top of this one to achieve the widest defined African lang support? |
My understanding is if e.g. |
@vv-monsalve @chrissimpkins in #142 scripts/assemble_charactersets.py has been patched to allow glyphset definition YAML files to have an extends list of other glyphsets. The various glyphset files (nam, nice names, production names) will only list the additional glyphs, not the ones in the other glyphsets.
For example, the GF_Latin_PriAfrican.yaml used for #144 was:
extends:
- GF Latin Core
language_codes:
- af_Latn # Afrikaans
- ak_Latn # Akan
- bm_Latn # Bambara
- dyu_Latn # Dioula
- ff_Latn # Fulfulde
- ha_Latn # Hausa
- ig_Latn # Igbo
- lg_Latn # Luganda
- om_Latn # Oromo
- sw_Latn # Swahili
- xh_Latn # Xhosa
- yo_Latn # Yoruba
- zu_Latn # Zulu
Other than that, there will be overlap between some sets: for example, GF Latin Vietnamese overlaps with GF Latin African, as they both extend GF Latin Core but share glyphs and have different purposes. Neither GF Latin African nor GF Latin Vietnamese extends the other. It makes sense that GF Latin Core should extend GF Latin Kernel. But is it convenient? I don’t think the names GF Latin African Core and GF Latin African Kernel make sense. |
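To make the extends idea concrete, here is a rough sketch of how a full character set could be resolved from definitions that list only their own additions. It is not the code in scripts/assemble_charactersets.py: the `own` key, the `resolve` function and the tiny character sets are assumptions made for illustration (the real YAML lists language codes rather than characters).

```python
def resolve(name, definitions, path=()):
    """Return the full character set for a glyphset, following 'extends' recursively."""
    if name in path:
        raise ValueError(f"circular extends involving {name!r}")
    entry = definitions[name]
    chars = set(entry.get("own", ""))  # characters this set adds itself
    for parent in entry.get("extends", []):
        chars |= resolve(parent, definitions, path + (name,))
    return chars


# Hypothetical, heavily shortened definitions.
definitions = {
    "GF Latin Core": {"own": "AaBb"},
    "GF Latin PriAfrican": {"extends": ["GF Latin Core"], "own": "\u014a\u014b"},
}

full = resolve("GF Latin PriAfrican", definitions)
print(sorted(full))
```

Tracking the current path rather than a global visited set lets two sets share a parent without being mistaken for a cycle.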
I agree with you Denis. I can see why one might want sets that don't overlap, but realistically that does seem to militate against utility.
Re: names for the African glyph sets:
1: We could name them Part 1 (Pri) and Part 2 (SSA), since that doesn't imply levels of importance. And we could remove the Pri from SSA and they would just build, I think.
2: We could also call Pri "Partial" and the SSA "Complete" or "Full", especially if we don't remove Pri from SSA.
…-e.
On Fri, Nov 10, 2023 at 1:52 PM Denis Moyogo Jacquerye < ***@***.***> wrote:
@vv-monsalve @chrissimpkins in #142
scripts/assemble_charactersets.py has been patched to allow glyphset
definition yaml files to have an extends list of other glyphsets. The
various glyphset files (nam, nice names, production names) will only list
the additional glyphs, not the ones in the other glyphsets.
For example, the GF_Latin_PriAfrican.yaml used for #144 was:
extends:
- GF Latin Core
language_codes:
- af_Latn # Afrikaans
- ak_Latn # Akan
- bm_Latn # Bambara
- dyu_Latn # Dioula
- ff_Latn # Fulfulde
- ha_Latn # Hausa
- ig_Latn # Igbo
- lg_Latn # Luganda
- om_Latn # Oromo
- sw_Latn # Swahili
- xh_Latn # Xhosa
- yo_Latn # Yoruba
- zu_Latn # Zulu
Other than that, there will be overlap between some sets for example GF
Latin Vietnamese overlaps with GF Latin African as they both extend GF
Latin Core but share glyphs and have different purposes. Neither of GF
Latin African or GF Latin Vietnamese extends the other.
It makes sense that GF Latin Core should extend GF Latin Kernel. But is it
convenient?
I don’t think the names GF Latin African Core and GF Latin African Kernel
make sense.
GF Latin Kernel is meant to be used in all glyphsets, Latin or not, and GF
Latin Core is meant to be used in all Latin glyphsets.
GF Latin PriAfrican is not a great name either as there may be different
levels of priority. I don’t have better names to suggest.
|
I don't have a strong opinion about the details here. The business need is for a priority set of African languages to be supported, or for a kernel Latin set to be included in primarily non-Latin fonts. |
Sorry for the delay. I don't really get this discussion about having non-overlapping glyphsets. If I check out the data.json file (the source of truth), we can see that each glyph also lists the glyphsets it appears in. We can see that the "a" is in Latin Kernel and Latin Core. If we add new glyphs for African Latin which are not in I have no opinions on the actual glyphs included in this PR since you're all much smarter than me. |
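The per-glyph membership described here can be inverted to see what each glyphset contains. The sketch below uses a made-up mapping from character to glyphset names; the actual layout of data.json may differ, so treat the structure and the names as assumptions.

```python
from collections import defaultdict

# Made-up stand-in for per-glyph membership data.
glyph_data = {
    "a": ["GF Latin Kernel", "GF Latin Core"],
    "\u014b": ["GF Latin PriAfrican"],
}


def members_by_glyphset(data):
    """Invert a glyph -> glyphsets mapping into glyphset -> set of glyphs."""
    result = defaultdict(set)
    for glyph, glyphsets in data.items():
        for name in glyphsets:
            result[name].add(glyph)
    return dict(result)


by_set = members_by_glyphset(glyph_data)
print(sorted(by_set["GF Latin Core"]))
```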
Yes, I saw this, but it only happens for those two Glyphsets. If you go to e.g. I was surprised to see a letter like "a" repeated in those Glyphsets (kernel + Core), tbh since the modular idea is precisely for each added definition to be built up over the previous one. But I don't know if there was a particular reason or need to repeat the glyphs in Kernel and Core. Regardless of what we decide (to allow the repetition or not), we should define a consistent approach. And I would advocate refraining from repeating and using the |
The Kernel glyphset is not part of the Latin modular system because it is not the minimal required set for fonts supporting Latin languages; it is the minimal set for non-LCG families. Core is the minimal glyphset tested for Latin support, and all families supporting Latin should have Core+ParticularSet. We do not want Kernel+Core+OtherSet because (1) it starts to be complicated, and (2) it would set Kernel as the minimum required (which is not the case). So Core needs to have all the minimal amount of glyphs required. We may modify Kernel in the future and it shouldn’t affect Core.
African, Beyond, Vietnamese etc. are made in addition to Core, so there is no need to repeat codepoints between files. We chose the modular system to make clear that supporting a non-Core set alone would result in incomplete support for Google Fonts. As a matter of fact, these glyphsets were made in the context of Google Fonts, not for users outside this context to know what characters they need to support a certain region etc. Since all GF Latin families support Core, the modular system also brings clarity on the number of glyphs to add to support a particular set.
Now, if there is still the will to make a comprehensive set to say “if you support this, you support African languages” without caring for the minimal set required (i.e. not working in a modular way and including the basic alphabet everywhere), I honestly don’t see a problem with that. What does it change in the end? Even if all the sets work independently, Core will still be the minimal set required by fontbakery in the googlefonts profile. It would allow us to update Core without affecting the other sets, and would probably simplify Jan’s tooling to build sets. |
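The "Core+ParticularSet" rule and the point about knowing how many glyphs to add can be illustrated with plain set arithmetic. This is a sketch with tiny stand-in sets; the function name and the sample characters are hypothetical and not taken from the repository.

```python
def glyphs_to_add(font_chars, *glyphsets):
    """Return characters required by the given glyphsets that the font lacks."""
    required = set().union(*glyphsets)
    return required - set(font_chars)


core = set("AaBb")                 # stand-in for GF Latin Core
pri_african = set("\u014a\u014b")  # stand-in for the PriAfrican additions
font = set("AaBb\u014b")           # a font that already covers Core plus lowercase eng

print(sorted(glyphs_to_add(font, core, pri_african)))
```

Because the additional set lists only what Core lacks, the size of the result for a Core-complete font is at most the size of that set.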
Makes sense now, thanks. |
@moyogo @EbenSorkin Are design notes around these characters all accumulated and summarized somewhere other than just this pull request? Things like whether one uses an N or n form for the cap form of a letter.... (Not that this should block the pull request.) I think @RosaWagner makes a strong argument that this should include anything needed that is in addition to GF Latin Core. That is already the case with the PR as it stands, right? So the remaining decision is: What name should be used? Dave says it does not matter to him. Seems like there are several options on the table. Is there a time or process for any remaining decisions blocking completion of the pull request? |
I am under the impression that nobody felt strongly enough about Pri and
SSA to rename them. I'm not sure what the answer is to the rest of your
question. I will be coming back to glyph design notes for Pri and SSA stuff
but probably not this week.
|
I can share the documentation with you Thomas. Denis and Neil were involved in defining the final set. The initial attempt was based on my own research on language support targets. |
For Eng, the answer might be both, depending on the languages that you intend to support. Mind looping Marianna and me into your repository tracker to discuss it with you? We just reviewed this in Google Sans. I don't believe that there is a way to document this type of requirement in glyphsets. What we likely need is full-fledged documentation rather than lists of codepoints to address this level of detail. |
The Pri reflects Google business needs and will be used to define some font development contracts. That is why the wording is as it is. It is not intended to indicate priorities for any other development team out there. I wouldn't get hung up on the name. We can revisit it and change if necessary down the road. The key right now is to signal that the data are stable and be able to provide it to development teams. I don't have strong opinions about superset and subset issues. IMO a designer who is briefed to develop to a Fonts lang support standard probably wants as concise of a way to understand that as possible. So, duplication across glyphsets like the SSA areas is likely fine in that sense. If this is a subset of full SSA, then the designer ignores this list and spends their time understanding the full SSA codepoint list if they are commissioned to develop that lang support, and this one if they are spec'd to develop projects intended for a smaller set of languages. I respect all of the feedback here. Let's leave this PR open until next week so that all who have commented to date have an opportunity to weigh in further. If there is no more input we will merge as is next week. It will be open to additional edits in future PR's. |
Many thanks to all for the feedback here. And a big thanks to @NeilSureshPatel and @moyogo for the review on these codepoints. Greatly appreciated! |
Closes #136