Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1087 stage preprocessor generated files in temp dir #1144

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

sytsereitsma
Copy link

@sytsereitsma sytsereitsma commented Feb 12, 2020

This PR allows preprocessors to stage output files in a temporary directory (the location of which is passed along by mdBook). The renderer can processes these files to include them in the book.

The hbs renderer simply copies the content of the temp dir to the book output dir as is. I'm not sure how other renderers would handle this.

Should be BW compatible as far as I can see.

@sytsereitsma sytsereitsma changed the title #1087 stage preprocessor files in temp dir #1087 stage preprocessor generated files in temp dir Feb 16, 2020
@sytsereitsma sytsereitsma changed the title #1087 stage preprocessor generated files in temp dir Fix #1087 stage preprocessor generated files in temp dir Aug 23, 2020
@sytsereitsma
Copy link
Author

@Michael-F-Bryan Is there any chance this PR can get merged?

If not what would you like me to change?

@Michael-F-Bryan
Copy link
Contributor

I'm not a maintainer of mdbook any more, but hopefully @ehuss will be able to find someone that can review this for you 🙂

@sytsereitsma
Copy link
Author

@ehuss Can you or someone else have a look at this?

@ehuss
Copy link
Contributor

ehuss commented Aug 28, 2020

Sorry, I don't really understand what this is trying to do. Can you break it down into simple terms? What is the motivation? How is it different from before? Are there any alternative designs that were considered?

@sytsereitsma
Copy link
Author

The problem it solves (#1087) is that some preprocessors actually generate files that need to be staged somewhere. I used to copy them directly to the book output dir, but the HBS rederer now removes the book output dir after the preprocessor has been executed (introduced with #985). Apart from that, copying the files to the book output dir will probably only work for the HBS renderer.

To overcome the above problem I have to stage the files in the src dir now, which is ugly and kind of borks the mdbook serve command, because the files in the src dir keep changing.

The alternative options that were considered:

  • Lifecycle ,methods, e.g. have the renderer call a before_preprocessor() function (where they can clear the book dir). Nice and clean, but still requires direct writing to the book dir.
  • Data URIs. A bit arcane, and I do not know what the URI length limit is for browsers (1, 2). For images they will far exceed 2000 chars (or 65k chars for that matter).
  • Extend the hbs rederer with a config switch, which enables/disables the output dir cleanup (hacky)
  • Transfer the image data using the stdio interface between mdbook and the preprocessor. This requires serialization and deserialization for the image data and will have quite long string values in the json output.
  • Have mdbook copy all src files to a preparation dir, where the preprocessor can copy data to. The renderer then renders the preparation dir instead of src. This is a very nice solution, but I do not know how this affects BW compatibility.

I finally settled on creating a temporary directory from mdbook, whose location is passed to the preprocessor. The preprocessor can copy the generated files into that directory, after which the renderer can copy the directory's contents to the output dir (or deal with it using its own rendering scheme). This solution is still backward compatible for older renderers.

Sytse

@ehuss
Copy link
Contributor

ehuss commented Sep 6, 2020

What's wrong with transferring files via the JSON interface? How big are these files? What are the performance implications like? (Not suggesting that is necessarily preferred, just that it could be simpler.)

I was also wondering if there might be some way to accommodate files that persist between builds. For example, if a preprocessor has something that is very expensive to compute, it would be nice if it could cache the results between builds, and only rebuild files or content if it changes. That may be out of scope here, but I'm wondering what options there would be, and if this could be extended to support that.

As for compatibility, this appears to change the RenderContext API, which is a backwards-incompatible change. Unfortunately mdbook's public API is not well isolated.

This also doesn't appear to have any documentation or tests.

@sytsereitsma
Copy link
Author

The files range between 10kB to 30kB on my books (I do not know large the images are for other users' projects). These files need to be Base64 encoded to be able to be stored in a JSON string on the preprocessor side. On the mdBook side they need to be decoded again and saved to a specific file (for the case of the HBS renderer). This is possible and supported by the JSON parser, it just seemed like a lot of extra effort over simply copying the files. I'm more than happy to employ this scheme if that has your preference.

As for caching, that is the responsibility of the preprocessor (my preprocessor uses caching).

Failing to add unit tests was sloppy, I will add them once we've agreed upon the final solution (are there also e2e integration tests to add?). Same for the documentation (what would be the best place for this?).

Sytse

@Michael-F-Bryan
Copy link
Contributor

it just seemed like a lot of extra effort over simply copying the files

When designing the way plugins work, one of the big benefits was that the files required to render a book were transferred via STDIN/STDOUT. That way you've only got a single source of truth and aren't using additional side-channels like the filesystem to transfer information.

The HTML renderer kinda doesn't follow this when copying non-markdown files across, but that's more because it was created before a clean plugin interface was created and contains a fair amount of legacy code.

I feel like a better solution would be to alter Book so it includes assets that may be embedded in the rendered document as well as normal pages. That way you've got a well-defined mechanism for transferring generated files between preprocessors and the renderer.

@sytsereitsma
Copy link
Author

Ok. If Eric shares this opinion I'll implement the book extension (this PR can be deleted then).

Before I start I'll add a little proposal to the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants