
Gain a better sense of the sub-operations involved in font compilation and their timing #34

simoncozens opened this issue Jul 28, 2022 · 15 comments

@simoncozens

Issue #25 gave us an overall picture of the distribution of compilation times, but in order to identify the most profitable spots for optimisation, we need a profile of font compilation based on the distinct operations involved:

  • How much of the build is taken up with Glyphs-to-UFO conversion?
  • How long does cubic-to-quadratic curve conversion take?
  • What percentage of build time is the binary font merge?
  • What percentage of that is gvar generation and optimization?
  • ...
  • And how do all of these timings vary by glyph count and master count?

There have been attempts in the past to profile fontmake, but they have generally worked at the function level rather than at the macro "operation" level ("just throw a profiler at it"). That is relatively difficult to interpret, because the information gets lost in the weeds. I want a high-level report which looks more like this:

Glyphs to UFO conversion: 30s (6%)
...

There is some timing code in fontmake (but not in ufo2ft), so it's a matter of expanding that, adding semantic information ("What operation is this code performing?"), and then running it on our Noto test rig and collating the data.
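A minimal sketch of one way to collect such operation-level timings (the class and operation names here are hypothetical illustrations, not fontmake's actual API):

```python
import time
from contextlib import contextmanager

class BuildTimer:
    """Collects (operation, seconds) pairs and prints a high-level report."""

    def __init__(self):
        self.timings = []  # list of (operation name, elapsed seconds)

    @contextmanager
    def operation(self, name):
        # Wrap each semantic build step, not each function call.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings.append((name, time.perf_counter() - start))

    def report(self):
        total = sum(t for _, t in self.timings)
        return "\n".join(
            "%s: %.1fs (%.0f%%)" % (name, t, 100 * t / total)
            for name, t in self.timings
        )

# Demo with stand-in workloads:
timer = BuildTimer()
with timer.operation("Glyphs to UFO conversion"):
    time.sleep(0.02)
with timer.operation("Compile OpenType features"):
    time.sleep(0.01)
print(timer.report())
```

Instrumenting the existing fontmake timing hooks with semantic labels like this would yield exactly the kind of per-operation breakdown described above.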

@simoncozens commented Jul 28, 2022

Here's some data, for building variable fonts: https://gist.github.com/simoncozens/3b841c88d6cb7de813b7530759e25e44

I'm finding it hard to visualise; here are the Noto fonts with the eight longest build times:
[Figure: Rplot03 — stacked bar chart of per-operation build timings, faceted by format]

library(ggplot2)
library(jsonlite)
library(dplyr)
library(tidyr)   # unnest() comes from tidyr, not dplyr
library(RColorBrewer)
library(stringr)

df <- jsonlite::fromJSON("noto-variable.json") %>%
  unnest(timings) %>%
  mutate(message = timings[, 1], time = as.numeric(timings[, 2])) %>%
  select(-timings)
df2 <- df %>% group_by(name) %>% mutate(total_time = sum(time)) %>%
  filter(total_time > 40 & total_time < 200)

# Timings as absolute seconds
ggplot(df2, aes(y = time, x = name, color = format,
                label = if_else(time > 1, str_wrap(paste0(message, " (", signif(time, 3), "s)"), 30), NULL))) +
  geom_bar(stat = "identity", fill = "transparent", size = 0.1, color = "black") +
  geom_text(size = 2.5, position = position_stack(vjust = 0.5)) +
  facet_wrap(~format, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5), legend.position = "none")

# Timings as percentages of total build time
ggplot(df2, aes(y = time, x = name, color = format,
                label = if_else(time > 1, str_wrap(paste0(message, " (", signif(time / total_time * 100, 3), "%)"), 30), NULL))) +
  geom_bar(stat = "identity", fill = "transparent", size = 0.1, color = "black") +
  geom_text(size = 2.5, position = position_stack(vjust = 0.5)) +
  facet_wrap(~format, scales = "free") +
  theme(axis.text.x = element_text(angle = 45, vjust = 0.5), legend.position = "none")

@simoncozens commented Jul 28, 2022

Here's the equivalent data for static instance generation: https://gist.github.com/simoncozens/9050865f138ae080bee599c9176b61db

[Figure: Rplot04 — per-operation timing breakdown for static instance builds]

A big chunk of instance generation is the UFO instantiation. I have a project to do this in Rust (triangulate) which would make that step incredibly quick; the rest of the font generation is embarrassingly parallel per instance, as they're completely independent font builds. This is something we could use Ninja to orchestrate (in gftools-builder-ninja we already do). I can easily see order-of-magnitude speedups for static instance generation.
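Since each static instance is an independent build, the fan-out could be as simple as a worker pool. A sketch (build_instance is a stand-in for the real per-instance pipeline, not an actual fontmake function):

```python
from concurrent.futures import ThreadPoolExecutor

def build_instance(instance_name):
    # Stand-in for "instantiate UFO, run ufo2ft, write the binary":
    # each instance is a completely independent font build.
    return "%s.ttf" % instance_name

instances = ["Regular", "Bold", "Light", "Condensed"]

# In practice the work is CPU-bound, so a ProcessPoolExecutor (or Ninja
# jobs, as in gftools-builder-ninja) would be used; threads keep this
# sketch simple and dependency-free.
with ThreadPoolExecutor() as pool:
    built = list(pool.map(build_instance, instances))

print(built)  # one output font per instance, in input order
```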

@simoncozens

Why do feature writers take a long time? One of the things they do is compile a GSUB table. They use this to trace substitutions and allocate properties to different glyphs. (For example, we deduce that lam-ar.init is an Arabic glyph, even though it is unencoded, because it is produced by a substitution from lam-ar, which is encoded and has the Arabic script property.)

Once they've done that, they throw away the binary GSUB table, and it gets built again later...
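The classification step is essentially a closure over the substitution graph: every glyph reachable by substitution inherits the script of its encoded source. A toy version (glyph names and rules invented for illustration; the real logic lives in ufo2ft's feature writers):

```python
# Map each glyph to the glyphs it can be substituted by, e.g. derived
# from rules like "sub lam-ar by lam-ar.init;" in the compiled GSUB.
substitutions = {
    "lam-ar": {"lam-ar.init", "lam-ar.fina"},
    "lam-ar.init": set(),
    "lam-ar.fina": set(),
    "a": set(),
}

# Script properties known from the cmap: only encoded glyphs have one.
scripts = {"lam-ar": "Arab", "a": "Latn"}

def propagate_scripts(scripts, substitutions):
    """Assign each substitution product the script of its source glyph."""
    result = dict(scripts)
    queue = list(scripts)
    while queue:
        glyph = queue.pop()
        for product in substitutions.get(glyph, ()):
            if product not in result:
                result[product] = result[glyph]
                queue.append(product)
    return result

print(propagate_scripts(scripts, substitutions))
```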

@anthrotype

That's right, although that temporary GSUB is not serialized (so no overflow resolution is triggered); it is only "built" in the sense that the features.fea is parsed and a GSUB table object is generated by feaLib.builder.

Also bear in mind there could be GSUB feature writers (there aren't any among the built-in writers yet, but a user can define their own writers and plug them into the build, like they do with filters), which I believe are run before all the GPOS-based feature writers. We could in theory reuse the GSUB table we build (to do the closure that classifies glyphs by Unicode properties for the subsequent GPOS writers) and keep it as-is in the final font instead of having feaLib redo the work. Worth trying. I probably thought of that and gave up for some reason I've now forgotten.

@simoncozens

Looking at the variable font builds, can we drop the "save UFO sources" step? Currently the conversion from Glyphs to UFO produces a designspace/UFO object, writes all the files out, and passes a path to run_from_designspace, which then loads all the UFOs from disk again. Obviously in an incremental setup, having the UFOs on disk means you can avoid the conversion next time, but is there any reason run_from_designspace can't just optionally take a designSpaceLib object, so we skip the save/load and keep everything in memory?
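One way to express that (hypothetical signature and helper, not the current fontmake API) is to let the entry point accept either a path or an already-loaded document:

```python
# Hypothetical sketch: accept either a designspace path or an in-memory
# document, so the Glyphs -> UFO conversion can stay entirely in memory.
def run_from_designspace(designspace, **kwargs):
    if isinstance(designspace, str):
        designspace = load_designspace_from_disk(designspace)
    # ... continue the build with the in-memory document ...
    return designspace

def load_designspace_from_disk(path):
    # Stand-in for designspaceLib's real loader.
    return {"path": path, "sources": []}

# Caller with an in-memory object: no save/load round trip.
doc = {"path": None, "sources": []}
assert run_from_designspace(doc) is doc
```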

@anthrotype

can we drop the "save UFO sources" step?

MutatorMath had its own parser that wanted to load from disk again, but we can now finally ditch that since fontmake has its own instantiator that works with in-memory designspace/UFO masters.

I think the other issue I'm working on right now -- broken include statements in features.fea when exporting .glyphs => .ufo to a different directory than the input file -- is somehow related to the current saving of UFO masters to disk. glyphsLib returns an in-memory designspace object populated with ufoLib2.Font objects that have no .path attribute, because they haven't been loaded from disk but generated by code. If their features.fea contains include statements, these must be resolved relative to the UFO's path (which doesn't exist until you save the UFO to disk); lacking that, feaLib can only use the current working directory to resolve includes, which rarely makes sense. So fontmake saves them to disk and things (kinda) work.
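A sketch of a possible workaround: rewrite relative include() paths against an explicitly chosen base directory, so an in-memory UFO (with no .path) can still resolve them. The regex and directory handling here are illustrative only; feaLib has its own include machinery.

```python
import re
from pathlib import Path

INCLUDE_RE = re.compile(r"include\(([^)]+)\)")

def rewrite_includes(fea_text, base_dir):
    """Rewrite relative include() paths to be absolute against base_dir,
    so the .fea text no longer depends on the UFO having a real path."""
    def absolutize(match):
        target = Path(match.group(1).strip())
        if not target.is_absolute():
            target = Path(base_dir) / target
        return "include(%s)" % target
    return INCLUDE_RE.sub(absolutize, fea_text)

fea = "include(family.fea);\nfeature liga { sub f i by f_i; } liga;"
print(rewrite_includes(fea, "/sources"))
```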
I'll experiment with not saving the UFO masters to disk (but I'm sure while I'm writing this Simon has already done the work LOL)

@simoncozens

No, for once I thought I’d ask first if anyone’s tried it already… also I have run out of brain for today. Will try again tomorrow.

@anthrotype

Get some well deserved rest, you did amazing work!

@rsheeter

Great work! QQ, is the high-level reporting stuff integrated into, or on a path to be integrated into, fontmake?

@rsheeter

Noob question: what's the difference between "build OpenType features" and "run feature writers"? They sound mildly like the same thing.

@behdad commented Jul 29, 2022

The latter writes .fea files from glyph anchors and other data. The former compiles .fea files, I think.

@simoncozens

As Behdad says. One makes binary tables, the other decides what should go in them.
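Roughly, the split looks like this: a feature writer turns source data (anchors, kerning) into .fea text, and a separate compile step turns that text into binary tables. A toy writer, with invented anchor data and heavily simplified output (the real ufo2ft writers are far more involved):

```python
def write_mark_feature(anchors):
    """Toy 'feature writer': turn glyph anchor data into .fea text.
    This only shows the shape of the writer -> compiler split."""
    lines = ["feature mark {"]
    for base, (x, y) in sorted(anchors["bases"].items()):
        for mark, cls in sorted(anchors["marks"].items()):
            lines.append(
                "    pos base %s <anchor %d %d> mark %s;" % (base, x, y, cls)
            )
    lines.append("} mark;")
    return "\n".join(lines)

anchors = {
    "bases": {"a": (250, 520)},
    "marks": {"acutecomb": "@TOP_MARKS"},
}
fea_text = write_mark_feature(anchors)
print(fea_text)
# The "build OpenType features" step would then hand fea_text (plus any
# hand-written features) to the compiler to produce binary GPOS/GSUB.
```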

We have some (duelling) PRs for integrating the timing reporting into fontmake/ufo2ft and will try to sort them out today.

@simoncozens

I suppose another thing to draw from this is that the "write/compile features once instead of per master" PR will make quite an impact.

But I can't shake the feeling I had a year ago: we can tinker with various optimisations and maybe get 20% or 30%, but font compilation intrinsically means doing a lot of work, and no order-of-magnitude gain is going to come from tweaking.

@rsheeter

The discussion, and making of tweaks along the way, is helpful to bring some of us (...maybe just me?) up to speed on the wonders of fontmake. I'm very confident that in time we'll have a compiler that's 1–2 orders of magnitude faster, and we can look back on how hilariously slow it used to be over a beer.

Zooming out for a moment, if we ignore our current implementation, which parts of font compilation truly are the majority of the work?

IIUC our prior is that the long poles should be 1) processing glyphs (embarrassingly parallel) and 2) processing layout (...I'm less clear on how parallel this can be). There is other work, but none of it is tremendously expensive. Is that fair? Are we missing things that are non-trivial amounts of work?

@simoncozens

Well, fonticulus is 60x faster than fontmake. I'll do some profiling and see what its hotspots are; would be interesting to compare.
