Make docs reproducibility more robust #3951

Closed

DanielG
Contributor

@DanielG DanielG commented Sep 23, 2023

Commits:

  • Replace faketime with SOURCE_DATE_EPOCH for docs

On Debian, some architectures don't support the faketime wrapper, and
it simply bails out. It turns out pdflatex now natively supports
SOURCE_DATE_EPOCH, as long as FORCE_SOURCE_DATE is also set;
otherwise it won't take effect. Graphviz (dot) also supports the former.

  • Introduce GIT_EPOCH for getting SOURCE_DATE_EPOCH date in tarball

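For illustration, a minimal sketch of the environment the first commit message describes. It is written in Python rather than as the actual docs Makefile change, which is not shown in this thread. The helper name `reproducible_env` and the example epoch value are hypothetical; `SOURCE_DATE_EPOCH` and `FORCE_SOURCE_DATE` are the variables named in the commit message.

```python
import os


def reproducible_env(epoch, base=None):
    """Return a copy of the environment with the timestamp overrides set.

    `epoch` is seconds since the Unix epoch. `base` defaults to the
    current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["SOURCE_DATE_EPOCH"] = str(int(epoch))
    # Per the commit message, pdflatex needs this set as well,
    # otherwise SOURCE_DATE_EPOCH does not take effect.
    env["FORCE_SOURCE_DATE"] = "1"
    return env


if __name__ == "__main__":
    # Hypothetical epoch value, chosen only for the example.
    print(reproducible_env(1695427200, base={}))
```

The returned mapping could be passed as `env=` to `subprocess.run` when invoking `pdflatex`. Whether that is enough for a byte-identical PDF is exactly what the rest of this thread questions.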
KrystalDelusion added the "discuss" label (to be discussed at next dev jour fixe; see #devel-discuss at https://yosyshq.slack.com/) on Oct 12, 2023
@mmicko
Member

mmicko commented Oct 18, 2023

The problem that needed solving here is that dot and pdflatex generated the PDF files in the docs/images/011 directory based not only on their content but also on the current time. That caused issues with CI updating https://github.com/YosysHQ-Docs/yosys-cmd-ref/tree/main/images, since all these PDFs were always found to be different.

For creating Debian documentation packages, it would be perfectly fine to get PDF files with any timestamp in them, since you would rebuild them for each version anyway, so I see no need to enforce any specific timestamp.
In this case I would rather define the FAKETIME variable as empty, use it in these two places, and then override it in CI. That way your builds will not be affected, and it will work the same as always for us.
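For illustration, a rough sketch of the "empty by default, overridden in CI" idea. The proposal is about a Makefile variable; this Python version only mirrors the logic. The function name `build_command` and the example wrapper string are hypothetical and not taken from the Yosys build files.

```python
import shlex


def build_command(wrapper, command):
    """Prefix `command` with an optional wrapper.

    `wrapper` plays the role of the FAKETIME variable from the proposal:
    an empty string means "run the tool directly", anything else is split
    like a shell would split it and placed in front of the command.
    """
    return shlex.split(wrapper) + list(command)


if __name__ == "__main__":
    # Default (e.g. a distribution build): no wrapper at all.
    print(build_command("", ["dot", "-Tpdf", "graph.dot"]))
    # CI override: hypothetical wrapper invocation with a fixed date.
    print(build_command("faketime '2022-01-01 00:00:00'", ["dot", "-Tpdf", "graph.dot"]))
```

With an empty wrapper the tool runs unchanged, which is what would keep distribution builds unaffected under this proposal.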

Does this solution look fine to you, and does it solve the issues you are experiencing?

@DanielG
Contributor Author

DanielG commented Oct 18, 2023

Does this solution look fine to you, and does it solve the issues you are experiencing?

Reproducibility isn't just a convenience feature; it's a table-stakes distribution security mechanism. See https://reproducible-builds.org/

The idea is that anyone can check that a Debian package corresponds to its source code by recompiling and comparing the resulting .debs bit by bit. This helps detect compromised build infrastructure.
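For illustration, a minimal sketch of the bit-by-bit comparison described above. The helper names `sha256_of` and `builds_match` are hypothetical, and this is not how Debian's own tooling performs the check; it only shows the principle that two independent builds must yield identical bytes.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 16):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large packages do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(path_a, path_b):
    """True if both build artifacts have identical content."""
    return sha256_of(path_a) == sha256_of(path_b)
```

A single embedded timestamp that differs between two builds, such as a PDF CreationDate, is enough to make `builds_match` return `False`.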

So we really do need to get rid of faketime here.

Currently this patch is still broken due to CreationDate etc.; see https://wiki.debian.org/ReproducibleBuilds/TimestampsInPDFGeneratedByLaTeX. I'll have to do slightly more invasive surgery.

--Daniel

@DanielG
Contributor Author

DanielG commented Oct 18, 2023

OK, my previous analysis was wrong. Graphviz dot doesn't seem to support SOURCE_DATE_EPOCH, and that's the root problem I have in the Debian packaging.

The problem with faketime is this:

faketime: sem_open: Permission denied
The faketime wrapper only works on platforms that support the sem_open()
system call. However, you may LD_PRELOAD libfaketime without using this wrapper.

So instead of removing the use of faketime, I'll switch to setting the LD_PRELOAD and FAKETIME environment variables directly.
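For illustration, a minimal sketch of setting `LD_PRELOAD` and `FAKETIME` directly instead of going through the wrapper. The helper name `faketime_preload_env`, the library path, and the timestamp are assumptions made for the example; the real location of libfaketime differs between systems and architectures.

```python
import os


def faketime_preload_env(lib_path, fake_time, base=None):
    """Return an environment that preloads libfaketime without the wrapper.

    `lib_path` is the path to the libfaketime shared object and `fake_time`
    is the timestamp string handed to it via the FAKETIME variable.
    """
    env = dict(os.environ if base is None else base)
    existing = env.get("LD_PRELOAD", "")
    # Keep any library that is already preloaded instead of overwriting it.
    env["LD_PRELOAD"] = f"{existing}:{lib_path}" if existing else lib_path
    env["FAKETIME"] = fake_time
    return env


if __name__ == "__main__":
    # Assumed path and date, for demonstration only.
    env = faketime_preload_env(
        "/usr/lib/x86_64-linux-gnu/faketime/libfaketime.so.1",
        "2022-01-01 00:00:00",
        base={},
    )
    print(env)
```

The resulting mapping could be passed as `env=` to `subprocess.run` for the `dot` and `pdflatex` calls.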

--Daniel

@KrystalDelusion
Member

As per your own links:

Another option for Debian would be to remove the field completely with dh_installdocs or similar, or at least set its content consistently (package build date, installation directory, and so on).

This is what faketime is doing for us. While the original reason was to prevent git from detecting a change in a PDF when there was no change in content, this also accomplishes the goal of a reproducible build that (should) produce the same result from the same source, rather than the output changing due to differences in the build system.

Unless there is something else affecting the reproducibility (the PDF ID in your link for example), then this is just about building on Debian and the name of the pull request should be updated to reflect that.

@DanielG
Contributor Author

DanielG commented Oct 23, 2023

Turns out the faketime problem (faketime: sem_open: Permission denied) was related to a build chroot misconfiguration, not an architecture-specific limitation of faketime.

So in fact this entire PR is ill-conceived :)

Unless there is something else affecting the reproducibility

I'm still seeing PDF CreationDate differences (debci); I think those are due to $TZ. I'll send an update once I confirm that overriding $TZ gets rid of the non-reproducibility.
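For illustration, a small sketch of why the timezone could matter here. It formats the same instant as a PDF-style date string under two UTC offsets. The function name `pdf_date` is hypothetical, and the thread does not confirm that `$TZ` is the actual cause; the sketch only shows that an identical timestamp rendered in local time yields different bytes.

```python
from datetime import datetime, timedelta, timezone


def pdf_date(epoch, utc_offset_hours):
    """Format `epoch` as a PDF-style date string for a given UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(epoch, tz)
    offset = moment.strftime("%z")  # "+hhmm" or "-hhmm"
    # PDF date strings write the offset as +hh'mm'.
    return moment.strftime("D:%Y%m%d%H%M%S") + f"{offset[:3]}'{offset[3:]}'"


if __name__ == "__main__":
    print(pdf_date(0, 0))
    print(pdf_date(0, 2))
```

Both calls describe the same instant, yet the strings differ, so two builders with different timezone settings would embed different metadata even with a fixed epoch.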
