[Feature]: package version lock files #151

precompute · 2023-06-29T14:19:30Z

Feature Description

Pinning packages makes configs reproducible. Currently, the only way to pin packages is to get the current hash for every package and add it to the appropriate use-package block.

I suggest implementing a new alist, elpaca-package-hash-alist that holds the hash for every package. Elpaca would be able to generate this alist automatically, so users could effortlessly set elpaca-package-hash-alist to this value upon startup. Elpaca would check these values and act accordingly (pull / reset / do nothing, etc). Every package with :elpaca t would be affected, and there could be a user-option for enabling this behavior.

Also, maybe this alist could be written to a .elpaca-pins file or similar?

Example:

Set PACKAGE-NAME to HASH

(add-to-list 'elpaca-package-hash-alist (cons PACKAGE-NAME HASH))

HASH could be set to t to signal an upgrade.
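
A minimal sketch of how such an alist could be consulted, purely for illustration. Nothing here exists in Elpaca: `my/package-hash-alist` and `my/pin-action` are hypothetical names, and the three actions are only one reading of "pull / reset / do nothing".

```elisp
;; Hypothetical sketch; none of these names are part of Elpaca.
(defvar my/package-hash-alist nil
  "Alist of (PACKAGE-NAME . HASH).
PACKAGE-NAME is a symbol.  HASH is a commit hash string, or t to
signal that the package should be upgraded.")

(defun my/pin-action (package current-hash)
  "Return the action the pin for PACKAGE implies, given CURRENT-HASH.
The result is one of the symbols `upgrade', `reset' or `nothing'."
  (let ((pin (alist-get package my/package-hash-alist)))
    (cond ((eq pin t) 'upgrade)                     ; t requests an upgrade
          ((null pin) 'nothing)                     ; package is not pinned
          ((string= pin current-hash) 'nothing)     ; already at the pinned hash
          (t 'reset))))                             ; pinned elsewhere: reset

;; Example with made-up package names and hash:
(add-to-list 'my/package-hash-alist '(foo . "0123abc"))
(add-to-list 'my/package-hash-alist '(bar . t))
(my/pin-action 'foo "0123abc") ; => nothing
(my/pin-action 'bar "0123abc") ; => upgrade
```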

Confirmation

The feature I'm proposing does not already exist in Elpaca

progfolio · 2023-06-29T20:12:57Z

Hi. Thanks for the suggestion. What you're suggesting is referred to as a version lock file.

There is currently the elpaca-write-lockfile function which will create an item menu with the current recipes for each package. For example, (elpaca-write-lockfile "/tmp/test.eld") produces the following for data:

((elpaca :source
   "lockfile" :date (25757 57844 206472 188000) :recipe
   (:protocol https :inherit t :depth 1 :repo
              "https://github.com/progfolio/elpaca.git" :ref
              "272966b864db86604535bced55b3dfa3c7ed8532" :pre-build
              ("git" "remote" "set-url" "origin"
               "git@github.com:progfolio/elpaca.git")
              :files (:defaults (:exclude "extensions")) :build
              (:not elpaca--activate-package) :package "elpaca"))
              ;; Other packages omitted
              )

The full recipe is stored with a computed :ref recipe keyword. What's missing is a way to rebuild packages "from scratch" so the package can be reset to that state. It's trickier than storing a commit ref, though. We'd want to disable inheritance for that recipe, possibly override the :depth keyword, etc.
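
As a rough illustration of the recipe adjustments mentioned above, the sketch below reads a lockfile in the format shown and derives a standalone recipe from one entry. It assumes that dropping `:depth` and setting `:inherit nil` is what a restore would want; how Elpaca would actually treat those keywords is not settled in this thread. All `my/` names are hypothetical.

```elisp
;; Hypothetical sketch; these helpers are not part of Elpaca.
(defun my/lockfile-read (file)
  "Return the Lisp form stored in lockfile FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (read (current-buffer))))

(defun my/plist-remove (plist key)
  "Return a fresh copy of PLIST without KEY and its value."
  (let (result)
    (while plist
      (unless (eq (car plist) key)
        ;; Push the key, then the value; `nreverse' restores the order.
        (setq result (cons (cadr plist) (cons (car plist) result))))
      (setq plist (cddr plist)))
    (nreverse result)))

(defun my/lockfile-pinned-recipe (entry)
  "Return a standalone recipe for lockfile ENTRY, keeping its :ref.
ENTRY has the form (ID :source ... :date ... :recipe RECIPE).
The returned recipe has no :depth and has :inherit set to nil.
ENTRY itself is not modified."
  (let ((recipe (my/plist-remove (plist-get (cdr entry) :recipe) :depth)))
    (plist-put recipe :inherit nil)))

;; Example, using the file written earlier in this comment:
;; (mapcar #'my/lockfile-pinned-recipe (my/lockfile-read "/tmp/test.eld"))
```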

Pinning packages makes configs reproducible.

Lock files work so long as the upstream source still has the commit referenced in the lock file available.
However, if the upstream disappears or overwrites history, the ref is useless.

What I've been experimenting with is keeping the entire Elpaca package store in a repository.
This has the benefit that the entire source code of each library is available despite what happens upstream.
It can also simplify the machinery around restoring package state (by being a thin layer over git).
The trade-off is that a package store repository is obviously larger on disk than a lock file, but I don't think the difference is significant if you consider that the lock file doesn't do anything on its own (it would have to be used to download all the repos, anyhow).

I've experimented with a few backup strategies and I haven't decided which I'll end up with for Elpaca.
I may design it in a way where one could sub out for their own strategy as well.

Related issues: #24 #36

axgfn · 2023-07-15T18:34:12Z

I really like the idea of keeping the entire Elpaca package store in git. Curiously, I think it could even make Elpaca usable in environments without access to git (like the new Android port of Emacs, for example). You would just download main.tar.gz from your package store repository on GitHub or other git forge.

I'm interested in Elpaca and I think it has a lot going for it, but I'm not willing to switch to another package manager until it can match straight in reproducibility. Keeping an eye on this issue.

hammerandtongs · 2023-07-22T20:24:20Z

I'm not a fan of the entire Elpaca store in git.

I absolutely do need a lock file as it would give me a known good config.

I have 4 workstations that I use emacs on with the same git directory holding the config only.

I know some people check their elpa directory in or rsync but I don't like the idea of that for many reasons.

Despite using git and git-annex extensively I've never wanted this solution and resist having another pile to move around.

I'd like a normal Cargo.lock-style text file (very successful for Rust) that I could check into git and easily inspect and edit in emacs or vim (say, in the rare case that someone deletes a remote git repo) without doobedydeeing around in git to fix or alter things.

I don't think archiving other people's git trees is a good problem for a package manager to solve.

Any binary files will start to explode the size of the elpaca git blob.

Without storing binary artifacts the benefits some people imagine won't actually be there.

progfolio · 2023-07-23T16:38:48Z

Any binary files will start to explode the size of the elpaca git blob.

Not many packages include binary blobs. Are there specific packages which come to mind?
It would be good to know so I can build a pessimistic test case.

Without storing binary artifacts the benefits some people imagine won't actually be there.

My hunch is that this scenario is even rarer than git repos disappearing or history being rewritten. A lockfile does not guarantee the presence of any system binaries either, so it's a shared flaw between both approaches.

There are trade-offs between the two approaches and I plan on making things flexible enough to accommodate either.

milanglacier · 2023-07-29T17:57:43Z

I'm not a fan of the entire Elpaca store in git.

I absolutely do need a lock file as it would give me a known good config.

I have 4 workstations that I use emacs on with the same git directory holding the config only.

I know some people check their elpa directory in or rsync but I don't like the idea of that for many reasons.

Despite using git and git-annex extensively I've never wanted this solution and resist having another pile to move around.

I'd like a normal Cargo.lock-style text file (very successful for Rust) that I could check into git and easily inspect and edit in emacs or vim (say, in the rare case that someone deletes a remote git repo) without doobedydeeing around in git to fix or alter things.

I don't think archiving other people's git trees is a good problem for a package manager to solve.

Any binary files will start to explode the size of the elpaca git blob.

Without storing binary artifacts the benefits some people imagine won't actually be there.

Yes, I agree. A Cargo.toml-style package version control system is good enough in 99% of scenarios.

An upstream package changing is rare. When it does happen, the user can manually switch the package's upstream, or point the recipe at a fork created from their local copy.

From my perspective, maintaining the entire Elpaca store in git is not good practice for source management. I seldom see projects include the source of third-party packages in their own source tree. Besides, the git repository will grow rapidly.

I currently use straight, and the directory ~/.emacs/straight/repos has a size of 391M, and that is just a snapshot. If you put this folder under source control, how large will your .git become?

Using git submodules has the same pitfall: once the upstream changes, you cannot initialize all the submodules in a fresh install either.

axgfn · 2023-07-30T00:49:58Z

My ~/.config/emacs/straight/repos/ directory is 713M, but ~/.config/emacs/straight/build/ is only 16M when I exclude .elc files. That's more like what I imagined would be tracked in git. I'm also not worried about it ballooning too much in size over time. Git is pretty good at compression, and we'd only need a new snapshot for each time packages are updated, which I expect for most users is only a weekly or monthly chore.

roshanshariff · 2023-08-20T05:39:47Z

@ajgrf, if I'm not mistaken, the straight/build directory usually just has the compiled .elc files, other binaries and build output, and symlinks to the original source .el files in straight/repos. The symlinks are negligible, and you're excluding the .elc files; that leaves only other miscellaneous binaries in your measurement. Needless to say, it's not enough to just track those if you want working packages.

xendk · 2023-09-08T21:24:54Z

I'll throw my vote for a simple lockfile.

While the idea of having all your packages safely stored in case GitHub blows up sounds tempting, I see it as a solution to a problem I don't have. In the most realistic case of a ref or even a complete repo disappearing, the first thing I'd be looking into is fixing the situation, finding a new package or otherwise dealing with the problem. I use elpaca to install packages, that is, (more or less) maintained packages; I don't need it to deal with dead code that once was a package. Worst case scenario I'll dig it out of elpaca/repos and make my own "package". What I want is to be able to say "this doesn't work. I know it worked last week. Please start up the Delorean." And if it could integrate well with git bisecting my init.el, that would be swell.

As for the size thing.

~/.c/emacs ▶ du -hcs straight/*
0       straight/bootstrap.el
47M     straight/build
516K    straight/build-cache.el
4,0K    straight/modified
2,9G    straight/repos
16K     straight/versions
2,9G    total
~/.c/emacs ▶ du -hcs elpaca/*
32M     elpaca/builds
28M     elpaca/cache
506M    elpaca/repos
565M    total

They're not entirely equal, there's been a bit of package churn since I switched to elpaca, but they're the same ballpark. Obviously elpaca saves quite a bit by doing shallow copies, but it's still 500M to save the source for the build.

I'll admit to being the type that's not afraid to mess around with the source in repos when trying to fix bugs, and then forgetting about getting the changes anywhere. How would storing the packages in git deal with local modifications?

progfolio · 2023-09-09T14:23:04Z

How would storing the packages in git deal with local modifications?

The state of the repos and builds directories are stored as is.
So if you have local modifications, they would be stored.

xendk · 2023-09-10T18:11:28Z

So if you have local modifications, they would be stored.

So how does one tell what is local modification? I assume the history of the individual repo directories isn't part of this.

Is this basically the same as adding elpaca/repos and elpaca/builds to one's .config/emacs repository (with some magic to avoid submodules for repos)?

progfolio · 2023-09-10T18:24:43Z

So how does one tell what is local modification? I assume the history of the individual repo directories isn't part of this.

The git history of each repository would be preserved as well.

Is this basically the same as adding elpaca/repos and elpaca/builds to one's .config/emacs repository (with some magic to avoid submodules for repos)?

It's a similar approach, but the entire store would be in its own repository instead of added to one's config repository. There would also be a minimal API around it so you don't really need to know how to use git to use it. e.g.,

1. User executes `M-x elpaca-backup`. They're prompted to take an optional note for the back up (the commit message). The entire store as is is committed to the package store repository in a way that avoids submodules.
2. User executes `M-x elpaca-restore-backup`. They're prompted to choose a backup point (which is just picking a commit). The store is checked out at that state and all packages are rebuilt.

That's the basic gist of it. I'll have to keep experimenting with it to see how it works in practice.

xendk · 2023-09-10T19:20:40Z

The git history of each repository would be preserved as well.

In my case, that's 3 gigs of data, half a gig if going with shallow checkouts. As with some of the other posters, I'm a bit skeptical...

The entire store as is is committed to the package store repository in a way that avoids submodules.

Oh, care to share your secret sauce? I'm just curious.

There would also be a minimal API around it so you don't really need to know how to use git to use it.

Ah, I think we've got the source of the dissonance in this issue here. You're working on a user-friendly, self-contained solution that can be used by anyone. But those asking for a lock file already have their config in git and are looking for a way to control elpaca from that.

It's two different user-stories, but as you say, they ought to be able to co-exist. It's "just" a matter of someone implementing elpaca-load-lockfile.

progfolio · 2023-09-10T19:36:02Z

In my case, that's 3 gigs of data, half a gig if going with shallow checkouts. As with some of the other posters, I'm a bit skeptical...

I thought you showed 565M total in your store earlier? In any case, there may be other tricks to optimize the storage size.

Oh, care to share your secret sauce? I'm just curious.

If I test more and think it will be a viable solution, I'll push it to a feature branch which can be tested. I'll mention it here if that happens.

Ah, I think we've got the source of the dissonance in this issue here. You're working on a user-friendly, self-contained solution that can be used by anyone.

Yes. I believe that should be offered alongside other solutions.

But those asking for a lock file already have their config in git and are looking for a way to control elpaca from that. It's two different user-stories, but as you say, they ought to be able to co-exist. It's "just" a matter of someone implementing elpaca-load-lockfile.

Some other changes would need to be made, too. For example, you'd need a way to say "rebuild these packages from scratch". There's a naive approach here: master...feat/rebuild-from-scratch but I don't think that will be the final approach. Basically we need a way to get the repo into the declared state prior to rebuilding anything, without losing any possible changes to the repo. It sounds easy until you start implementing it. Backups are the highest priority feature at the moment, so I'll begin working on them again soon.

xendk · 2023-09-10T21:07:27Z

I thought you showed 565M total in your store earlier?

Well, most of that is shallow repos. Looking about, it seems that elpaca will do shallow clones and then fetch new history when updating? I'll admit I've never worked much with shallow clones.

For example, you'd need a way to say "rebuild these packages from scratch".

Why does it need to re-clone? Nuking the build dir seems like a sensible cleanup, but why re-clone if the ref we're updating/downgrading to is in the repo? If you're trying to revert to a working configuration, the needed ref should already be available (unless shallow copies get in the way, of course). I would think that bringing repos to the same version as the lockfile and rebuilding packages that were changed should suffice (well, plus cloning stuff that hasn't been cloned yet, to support the "rebuild from scratch" scenario).

without losing any possible changes to the repo. It sounds easy until you start implementing it.

Well yeah... Would it help if the prerequisite for elpaca-write-lockfile was no uncommitted changes? It does open up new ways to shoot oneself in the foot (if one nukes the local repo and had a local branch with changes for instance), but it could work.

progfolio · 2023-09-11T00:11:06Z

Well, most of that is shallow repos. Looking about, it seems that elpaca will do shallow clones and then fetch new history when updating? I'll admit I've never worked much with shallow clones.

Yes. A shallow clone has a "grafted" root node and will pull in new history.

Why does it need to re-clone?

It needn't in all cases. As I mentioned, that's a naive implementation. A complete solution will not be as simple.

If you're trying to revert to a working configuration, the needed ref should already be available (unless shallow copies get in the way, of course). I would think that bringing repos to the same version as the lockfile and rebuilding packages that were changed should suffice (well, plus cloning stuff that hasn't been cloned yet, to support the "rebuild from scratch" scenario).

It sounds easier than it is. There are many corner cases. You have to consider that the package recipe itself may have been altered between backups, the repo may not contain the history to retrieve a given ref, etc.
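
One of these corner cases, a repo that lacks the history for a locked ref, can at least be detected before deciding whether to fetch or re-clone. A minimal sketch, assuming `git` is on the PATH; `my/git-has-commit-p` is a hypothetical helper, not part of Elpaca:

```elisp
;; Hypothetical sketch; not part of Elpaca.
(defun my/git-has-commit-p (repo ref)
  "Return non-nil if the git repository at REPO contains commit REF.
Runs `git cat-file -e REF^{commit}', which exits with status 0
only when REF resolves to a commit object present in REPO."
  (let ((default-directory (file-name-as-directory repo)))
    (eq 0 (process-file "git" nil nil nil
                        "cat-file" "-e" (concat ref "^{commit}")))))
```

A caller could use this to skip fetching when the ref is already present locally, and fall back to fetching more history (or re-cloning) only when it returns nil. It does not address the other corner case mentioned, a recipe that changed between backups.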

Would it help if the prerequisite for elpaca-write-lockfile was no uncommitted changes? It does open up new ways to shoot oneself in the foot (if one nukes the local repo and had a local branch with changes for instance), but it could work.

There'd have to be a policy similar to that in place for a lockfile solution. Otherwise, it would be too easy to lose un-pushed work.
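
Such a policy could start with a check like the one below: a sketch that refuses to proceed when a repo has uncommitted or untracked changes. It assumes `git` is on the PATH, and `my/git-worktree-clean-p` is a hypothetical name, not an Elpaca function.

```elisp
;; Hypothetical sketch; not part of Elpaca.
(defun my/git-worktree-clean-p (repo)
  "Return non-nil if REPO has no uncommitted or untracked changes.
Runs `git status --porcelain', which prints one line per changed
or untracked path and nothing at all for a clean worktree."
  (with-temp-buffer
    (let ((default-directory (file-name-as-directory repo)))
      (and (eq 0 (process-file "git" nil t nil "status" "--porcelain"))
           (= (buffer-size) 0)))))
```

Note that this only covers the working tree. Commits on local branches that were never pushed, the other foot-gun raised above, would need a separate check.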

psionic-k · 2024-01-02T15:15:31Z

Lock files work so long as the upstream source still has the commit referenced in the lock file available.
However, if the upstream disappears or overwrites history, the ref is useless.

How about creating a nix profile depending on the git sources for that set? As long as you hold onto the resulting profile as a GC root, all the git sources will remain in the store. Since the Guix store is basically the same implementation, both of these systems can be used similarly for holding onto snapshots of all the packages efficiently.

https://nixos.org/manual/nix/stable/package-management/profiles

To rehydrate, you would just copy the immutable git sources into /repos and rebuild everything and maybe do updates.

roshanshariff · 2024-01-02T20:34:46Z

@psionic-k You could probably achieve the same thing without nix by creating git branches in the same repository, one for each upstream repo, pointing at the commit you're using. I suspect this is the approach @progfolio is considering as the "full backup" method? You could check out the individual branches as worktrees to share the git repository and objects between them.

The downside is that git doesn't expect to be used in this way, so it'll be a bit harder to interact with the upstream repos and push patches, etc. But I guess it would work for backups, since you could use the recipe metadata to reconstruct things like upstream URLs and branch names that would normally be in the config of a checked out git repo.

dominicm00 · 2024-07-03T04:58:25Z

I will give my 2¢ that this problem is the entire purpose of https://archive.softwareheritage.org/. There's nothing wrong with giving people the option to create full git backups as described, but frankly I think falling back to software heritage on clone failure is simple and robust. Guix, a project which takes source reproducibility very seriously, takes this approach.

progfolio · 2024-07-03T11:34:46Z

Thanks for chiming in.

I will give my 2¢ that this problem is the entire purpose of https://archive.softwareheritage.org/.

Cool project. However, after kicking the tires, it looks like it's missing quite a few of my github repositories.

There's nothing wrong with giving people the option to create full git backups as described, but frankly I think falling back to software heritage on clone failure is simple and robust. Guix, a project which takes source reproducibility very seriously, takes this approach.

I have an idea for how to implement simple lockfiles which will be at least on par with what straight.el offers (with a better UI). The main hurdle now is time. Money is tight for me right now (unfortunately, I don't pay my bills by writing software) so I've had to pick up two jobs and am working long hours most days. When I get some time I will implement the idea I have.

dominicm00 · 2024-07-03T13:05:41Z

I will give my 2¢ that this problem is the entire purpose of https://archive.softwareheritage.org/.

Cool project. However, after kicking the tires, it looks like it's missing quite a few of my github repositories.

I'm surprised! Usually anything on GitHub is on there. Maybe I'll look into creating a (M)ELPA lister so that published emacs packages are indexed more regularly. It's also possible I can make a submission tool within emacs...will take a look.

I have an idea for how to implement simple lockfiles which will be at least on par with what straight.el offers (with a better UI). The main hurdle now is time. Money is tight for me right now (unfortunately, I don't pay my bills by writing software) so I've had to pick up two jobs and am working long hours most days. When I get some time I will implement the idea I have.

Of course; you've already created more than enough incredible software for free! Thank you so much for what you've done already! IMO elpaca is basically as close to perfect as we have in a package manager ❤️

Martinsos · 2025-01-19T22:02:16Z

What would you suggest as the best solution in the meantime, while waiting for the full support to be implemented? On one hand I like the idea of having something like a lock file, but I would also like to be resistant to network issues like the ones that GNU Savannah seems to often have, where a lock file won't help since I can't download the packages. Would the solution then be to just version control the elpaca/ directory (assuming I am ok committing that many MB into my git repo)?

progfolio · 2025-01-19T23:50:34Z

What would you suggest as the best solution in the meantime, while waiting for the full support to be implemented?

I'll get simple support in sooner rather than later.

On one hand I like the idea of having something like a lock file, but I would also like to be resistant to network issues like the ones that GNU Savannah seems to often have, where a lock file won't help since I can't download the packages.

Bear in mind what's being downloaded from GNU's servers is the package recipes.
Most of the packages are developed elsewhere, so a lockfile would work around the issue of the recipe source going down (as it has for the past couple of days), and can be tracked alongside one's init file in version control.
I've reworked the current ELPA menus so they should be working now.

Would the solution then be to just version control the elpaca/ directory (assuming I am ok committing that many MB into my git repo)?

That's one way. You should be able to discard the builds directory, since that can be rebuilt from the info in the cache directory and the repositories.

Martinsos · 2025-01-21T12:40:38Z

Bear in mind what's being downloaded from GNU's servers is the package recipes. Most of the packages are developed elsewhere, so a lockfile would work around the issue of the recipe source going down (as it has for the past couple of days), and can be tracked alongside one's init file in version control. I've reworked the current ELPA menus so they should be working now.

Got it! I just learned about the whole idea of tarballs vs recipes in the last couple of days so this is starting to make sense now.
But in any case, having the elpaca dir version-controlled should allow me to roll back independently of non-local factors (like a git repo going down) - that is good to know. Btw, for me the builds dir is quite small compared to the rest of the elpaca dir, so I will probably commit that one also, although I guess that depends on the specific packages of course.

Thanks!

precompute added the enhancement New feature or request label Jun 29, 2023

progfolio changed the title ~~[Feature]: Make pinning packages easier~~ [Feature]: package version lock files Jun 29, 2023

kotatsuyaki mentioned this issue Aug 23, 2024

Parallel package install during first install of emacs packages radian-software/straight.el#1055

Closed
