Refactor #34

diegomura · 2018-11-28T04:04:23Z

No description provided.

devongovett · 2018-11-28T04:56:15Z

You don’t like classes? 😜

diegomura · 2018-11-28T06:48:05Z

Haha it's not that. Maybe I shouldn't have pushed this yet. It's a WIP, but it has shown very good results so far. Let me explain a bit what this is 😄

This refactor was motivated by some issues I was having with the original solution:

Ignoring chars after glyph generation: Since glyph generation was occurring very early in the process, it was pretty hard to handle scenarios in which you wanted to ignore or remove some chars of the string. One case was breaking words based on soft hyphens. Glyph generation created glyphs for these chars, so the only way of dealing with this was to iterate over the glyphs and remove them later in the process.
Highly coupled classes: I like classes (actually react-pdf is full of them), but I ran into many issues whenever I wanted to fix something in the core, partly because the classes were not very independent from each other (especially after adding the dependency injection feature).
Hard to add new features: Related to the previous point, but in the sense that the old algorithm was a bit fragile. Adding new features easily broke old stuff.
Hard to test: It wasn't easy to write accurate tests for the core algorithm, especially because all the logic was inside one big main loop.
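To make the soft-hyphen case concrete, here is a minimal sketch of handling it at the string level, before any glyphs exist. The `stripSoftHyphens` helper and its return shape are hypothetical, written only for illustration; they are not the actual textkit code.

```javascript
// Hypothetical helper, not part of textkit: removes soft hyphens (U+00AD)
// from a plain string and records where a word break would be allowed.
const SOFT_HYPHEN = '\u00AD';

function stripSoftHyphens(text) {
  const breakOffsets = [];
  let clean = '';

  for (const char of text) {
    if (char === SOFT_HYPHEN) {
      // Offsets refer to the cleaned string, so they stay valid afterwards.
      breakOffsets.push(clean.length);
    } else {
      clean += char;
    }
  }

  return { text: clean, breakOffsets };
}
```

With something like this, glyph generation would only ever see the cleaned string, so there are no soft-hyphen glyphs to track down and remove later.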

Don't get me wrong. The current solution is awesome and has proven its value in react-pdf. However, I wanted to take a shot at a refactor that follows two main heuristics:

Try to calculate as many things as early as possible (especially before glyph generation). Word breaking is one example of this. It's faster, easier and more reliable to operate on strings rather than GlyphStrings. It makes it possible to create simpler and more powerful engines that mutate the strings without having to worry about glyphs yet.
Try to divide the problem (which is crazy hard in some ways) into small and testable bits. And imo following a functional approach in a problem such as typesetting can be very beneficial. Each part is now independent (and will be immutable in the future; some already are) and easy to test. This has already made it much simpler to debug the layout pipeline and add new features to it.
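As a rough illustration of the second heuristic, the sketch below composes a layout pipeline out of small pure functions. All names here (`pipe`, `splitWords`, `fillLines`) are made up for the example and the greedy line filling is deliberately naive; the real steps in this PR are different and much more involved.

```javascript
// Hypothetical pipeline, for illustration only. Each step takes a state
// object and returns a new one, without modifying its input.
const pipe = (...steps) => (input) =>
  steps.reduce((value, step) => step(value), input);

// Step 1: split the raw text into words (still plain strings, no glyphs).
const splitWords = (state) => ({ ...state, words: state.text.split(' ') });

// Step 2: naive greedy line filling, measured in characters.
const fillLines = (maxChars) => (state) => {
  const lines = [];
  let current = '';

  for (const word of state.words) {
    const candidate = current ? `${current} ${word}` : word;
    if (candidate.length > maxChars && current) {
      lines.push(current);
      current = word;
    } else {
      current = candidate;
    }
  }
  if (current) lines.push(current);

  return { ...state, lines };
};

const layout = pipe(splitWords, fillLines(10));
```

Since every step is an ordinary function of its input, each one can be tested in isolation (for example, calling `fillLines(10)` with a hand-written `words` array) instead of only through one big loop.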

I didn't want to change the API of the lib (which is not immutable), so there are still some things that can be confusing, but based on some tests (with some other enhancements on the react-pdf side) I observed a performance boost of ~25%. It's pretty visible on large documents.

Anyway, I would love to hear your thoughts! There's still some work to do on this repo, but I plan to release a react-pdf version with this solution hopefully soon, after testing it a bit further. I'm now at a point where 100% of the documents I tried (some of them very complex) rendered successfully.

diegomura · 2019-01-24T18:49:31Z

@devongovett would you be OK with merging this? It's been tested on react-pdf for a while now and it's working very well. Having everything functional like this really makes the entire problem easier to maintain and extend with new features. For me, having this merged would make the development process (both for react-pdf and textkit) way easier 😄

diegomura added 2 commits November 23, 2018 00:05

Refactor core package

2a84c39

Add word-hyphenation engine

d86e6f1
