[Feature]: package version lock files #151
Comments
Hi. Thanks for the suggestion. What you're suggesting is referred to as a version lock file. There is currently the
((elpaca :source
"lockfile" :date (25757 57844 206472 188000) :recipe
(:protocol https :inherit t :depth 1 :repo
"https://github.com/progfolio/elpaca.git" :ref
"272966b864db86604535bced55b3dfa3c7ed8532" :pre-build
("git" "remote" "set-url" "origin"
"git@github.com:progfolio/elpaca.git")
:files (:defaults (:exclude "extensions")) :build
(:not elpaca--activate-package) :package "elpaca"))
;; Other packages omitted
)
The full recipe is stored with a computed
Lock files work so long as the upstream source still has the commit referenced in the lock file available. What I've been experimenting with is keeping the entire Elpaca package store in a repository. I've experimented with a few backup strategies and I haven't decided which I'll end up with for Elpaca. |
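To make the shape of the lock file excerpt above concrete, here is a minimal sketch (not Elpaca's own code) that walks data of that shape and lists each package with its pinned ref. The helper name `my/lockfile-refs` is made up for illustration, and it assumes every entry is a list of the form (ID . PLIST) whose :recipe plist carries a :ref, as in the excerpt.

```elisp
;; Hypothetical helper, not part of Elpaca.
(defun my/lockfile-refs (entries)
  "Return an alist of (ID . REF) for each entry in ENTRIES.
ENTRIES is a list of (ID . PLIST) forms.  Each PLIST is assumed to
have a :recipe property list, which may contain a :ref commit."
  (mapcar (lambda (entry)
            (let ((recipe (plist-get (cdr entry) :recipe)))
              ;; REF is nil when the recipe does not pin a commit.
              (cons (car entry) (plist-get recipe :ref))))
          entries))

;; Usage with a trimmed copy of the entry quoted above.
(my/lockfile-refs
 '((elpaca :source "lockfile"
           :recipe (:repo "https://github.com/progfolio/elpaca.git"
                    :ref "272966b864db86604535bced55b3dfa3c7ed8532"))))
```

Because the lock file is plain Lisp data, a listing like this is all it takes to inspect which commit each package is pinned to.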
I really like the idea of keeping the entire Elpaca package store in git. Curiously, I think it could even make Elpaca usable in environments without access to git (like the new Android port of Emacs, for example). You would just download main.tar.gz from your package store repository on GitHub or another git forge. I'm interested in Elpaca and I think it has a lot going for it, but I'm not willing to switch to another package manager until it can match straight in reproducibility. Keeping an eye on this issue. |
I'm not a fan of the entire Elpaca store in git. I absolutely do need a lock file as it would give me a known good config. I have 4 workstations that I use Emacs on with the same git directory holding the config only. I know some people check their elpa directory in or rsync it, but I don't like the idea of that for many reasons. Despite using git and git-annex extensively I've never wanted this solution and resist having another pile to move around. I'd like a normal Cargo.lock-style text file (very successful for Rust) that I could check into git, and easily inspect and edit in Emacs or vim (say, in the rare case that someone deletes a remote git repository) without doobedydeeing around in git to fix or alter things. I don't think archiving other people's git trees is a good problem for a package manager to solve. Any binary files will start to explode the size of the elpaca git blob. Without storing binary artifacts the benefits some people imagine won't actually be there. |
Not many packages include binary blobs. Are there specific packages which come to mind?
My hunch is that this scenario is even rarer than git repos disappearing or history being rewritten. A lockfile does not guarantee the presence of any system binaries either, so it's a shared flaw between both approaches. There are trade-offs between both approaches and I plan on making things flexible enough to accommodate either. |
Yes, I do agree. The In case the upstream package has changed, since that happens less often, the user can just manually switch the package upstream or just reset the upstream to a fork with its local copy. In my own perspective, the approach of maintaining an entire I currently use straight, and the directory Using git submodules runs into the same problem: once the upstream changes, you cannot initialize all the submodules in a fresh install either. |
My |
@ajgrf, if I'm not mistaken, the |
I'll throw my vote for a simple lockfile. While the idea of having all your packages safely stored in case Github blows up sounds tempting, I see it as a solution to a problem I don't have. But in the most realistic case of a ref or even a complete repo disappearing, the first thing I'd be looking into is fixing the situation, finding a new package or otherwise deal with the problem. I use elpaca to install packages, that is (more or less) maintained packages, I don't need it to deal with dead code that once was a package. Worst case scenario I'll dig it out of As for the size thing.
They're not entirely equal, there's been a bit of package churn since I switched to elpaca, but they're the same ballpark. Obviously elpaca saves quite a bit by doing shallow copies, but it's still 500M to save the source for the build. I'll admit to being the type that's not afraid to mess around with the source in |
The state of the repos and builds directories are stored as is. |
So how does one tell what is local modification? I assume the history of the individual repo directories isn't part of this. Is this basically the same as adding |
Thomas Fini Hansen ***@***.***> writes:
> So how does one tell what is local modification? I assume the
> history of the individual repo directories isn't part of this.
The git history of each repository would be preserved as well.
> Is this basically the same as adding elpaca/repos and
> elpaca/builds to one's .config/emacs repository (with some magic
> to avoid submodules for repos)?
It's a similar approach, but the entire store would be in its own
repository instead of added to one's config repository.
There would also be a minimal API around it so you don't really
need to know how to use git to use it.
e.g.,
1. User executes `M-x elpaca-backup`. They're prompted to take an
optional note for the backup (the commit message). The entire
store is committed as is to the package store repository in a way
that avoids submodules.
2. User executes `M-x elpaca-restore-backup`. They're prompted to
choose a backup point (which is just picking a commit). The store
is checked out at that state and all packages are rebuilt.
That's the basic gist of it. I'll have to keep experimenting with
it to see how it works in practice.
|
In my case, that's 3 gigs of data, half a gig if going with shallow checkouts. As with some of the other posters, I'm a bit skeptical...
Oh, care to share your secret sauce? I'm just curious.
Ah, I think we've got the source of the dissonance in this issue here. You're working on a user-friendly, self-contained solution that can be used by anyone. But those asking for a lock file already have their config in git and are looking for a way to control elpaca from that. It's two different user-stories, but as you say, they ought to be able to co-exist. It's "just" a matter of someone implementing |
Thomas Fini Hansen ***@***.***> writes:
> In my case, that's 3 gigs of data, half a gig if going with
> shallow checkouts. As with some of the other posters, I'm a bit
> skeptical...
I thought you showed 565M total in your store earlier?
In any case, there may be other tricks to optimize the storage
size.
> Oh, care to share your secret sauce? I'm just curious.
If I test more and think it will be a viable solution, I'll push
it to a feature branch which can be tested. I'll mention it here
if that happens.
> Ah, I think we've got the source of the dissonance in this issue
> here. You're working on a user-friendly, self-contained solution
> that can be used by anyone.
Yes. I believe that should be offered alongside other solutions.
> But those asking for a lock file already have their config in git
> and are looking for a way to control elpaca from that.
> It's two different user-stories, but as you say, they ought to
> be able to co-exist. It's "just" a matter of someone implementing
> elpaca-load-lockfile.
Some other changes would need to be made, too.
For example, you'd need a way to say "rebuild these packages from
scratch".
There's a naive approach here:
master...feat/rebuild-from-scratch
but I don't think that will be the final approach.
Basically we need a way to get the repo into the declared state
prior to rebuilding anything,
without losing any possible changes to the repo. It sounds easy
until you start implementing it.
Backups are the highest priority feature at the moment, so I'll
begin working on them again soon.
|
Well, that's shallow repos most of it. Looking about, it seems that elpaca will do shallow clones and then fetch new history when updating? I'll admit I've never worked much with shallow clones.
Why does it need to re-clone? Nuking the build dir seems like a sensible cleanup, but why re-clone if the ref we're updating/downgrading to is in the repo? If you're trying to revert to a working configuration, the needed ref should already be available (unless shallow copies get in the way, of course). I would think that bringing repos to the same version as the lockfile and rebuilding packages that were changed should suffice (well, plus cloning stuff that hasn't yet, to support the "rebuild from scratch" scenario).
Well yeah... Would it help if the prerequisite for |
Thomas Fini Hansen ***@***.***> writes:
> Well, that's shallow repos most of it. Looking about, it seems
> that elpaca will do shallow clones and then fetch new history
> when updating? I'll admit I've never worked much with shallow
> clones.
Yes. A shallow clone has a "grafted" root node and will pull in
new history.
> Why does it need to re-clone?
It needn't in all cases. As I mentioned, that's a naive
implementation.
A complete solution will not be as simple.
> to a working configuration, the needed ref should already be
> available (unless shallow copies get in the way, of course). I
> would think that bringing repos to the same version as the
> lockfile and rebuilding packages that were changed should
> suffice (well, plus cloning stuff that hasn't yet, to support
> the "rebuild from scratch" scenario).
It sounds easier than it is. There are many corner cases.
You have to consider that the package recipe itself may have been
altered between backups.
The repo may not contain the history to retrieve a given ref, etc.
> Would it help if the prerequisite for elpaca-write-lockfile was
> no uncommitted changes? It does open up new ways to shoot oneself
> in the foot (if one nukes the local repo and had a local branch
> with changes for instance), but it could work.
There'd have to be a policy similar to that in place for a
lockfile solution.
Otherwise, it would be too easy to lose un-pushed work.
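
A rough sketch of the kind of guard discussed here, assuming git is on the PATH: report which package repositories have uncommitted changes before anything is written or reset. The function names are hypothetical and none of this is Elpaca's API. It only detects uncommitted changes (including untracked files); it does not detect un-pushed commits or local branches, which the policy above would also need to cover.

```elisp
(require 'seq)

;; Hypothetical helpers; not part of Elpaca.
(defun my/repo-dirty-p (dir)
  "Return non-nil if the git worktree at DIR has uncommitted changes.
Untracked files count as changes."
  (with-temp-buffer
    (let ((default-directory (file-name-as-directory (expand-file-name dir))))
      ;; `git status --porcelain' prints one line per changed or untracked
      ;; file and prints nothing when the worktree is clean.
      (unless (eq 0 (process-file "git" nil t nil "status" "--porcelain"))
        (error "git status failed in %s" dir))
      (> (buffer-size) 0))))

(defun my/dirty-repos (dirs)
  "Return the members of DIRS whose worktrees have uncommitted changes."
  (seq-filter #'my/repo-dirty-p dirs))
```

A lock file writer could call something like `my/dirty-repos` on the repository directories first and stop with a message if the result is non-empty.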
|
How about creating a nix profile depending on the git sources for that set? As long as you hold onto the resulting profile as a GC root, all the git sources will remain in the store. Since the Guix store is basically the same implementation, both of these systems can be used similarly for holding onto snapshots of all the packages efficiently. https://nixos.org/manual/nix/stable/package-management/profiles To rehydrate, you would just copy the immutable git sources into /repos and rebuild everything and maybe do updates. |
@psionic-k You could probably achieve the same thing without nix by creating git branches in the same repository, one for each upstream repo, pointing at the commit you're using. I suspect this is the approach @progfolio is considering as the "full backup" method? You could check out the individual branches as worktrees to share the git repository and objects between them. The downside is that git doesn't expect to be used in this way, so it'll be a bit harder to interact with the upstream repos and push patches, etc. But I guess it would work for backups, since you could use the recipe metadata to reconstruct things like upstream URLs and branch names that would normally be in the config of a checked out git repo. |
I will give my 2¢ that this problem is the entire purpose of https://archive.softwareheritage.org/. There's nothing wrong with giving people the option to create full git backups as described, but frankly I think falling back to software heritage on clone failure is simple and robust. Guix, a project which takes source reproducibility very seriously, takes this approach. |
Thanks for chiming in.
Cool project. However, after kicking the tires, it looks like it's missing quite a few of my github repositories.
I have an idea for how to implement simple lockfiles which will be at least on par with what straight.el offers (with a better UI). The main hurdle now is time. Money is tight for me right now (unfortunately, I don't pay my bills by writing software) so I've had to pick up two jobs and am working long hours most days. When I get some time I will implement the idea I have. |
I'm surprised! Usually anything on GitHub is on there. Maybe I'll look into creating a (M)ELPA lister so that published emacs packages are indexed more regularly. It's also possible I can make a submission tool within emacs...will take a look.
Of course; you've already created more than enough incredible software for free! Thank you so much for what you've done already! IMO elpaca is basically as close to perfect as we have in a package manager ❤️ |
What would you suggest as the best solution in the meantime, while waiting for the full support to be implemented? On one hand I like the idea of having something like a lock file, but I would also like to be resistant to network issues like the ones that GNU Savannah often seems to have, where a lock file won't help since I can't download the packages. Would the solution then be to just version control |
I'll get simple support in sooner rather than later.
Bear in mind what's being downloaded from GNU's servers is the package recipes.
That's one way. You should be able to discard the builds directory, since that can be rebuilt from the info in the cache directory and the repositories. |
Got it! I just learned about the whole idea of tarballs vs recipes in the last couple of days so this is starting to make sense now. Thanks! |
Feature Description
Pinning packages makes configs reproducible. Currently, the only way to pin packages is to get the current hash for every package and add it to the appropriate use-package block.
I suggest implementing a new alist, elpaca-package-hash-alist, that holds the hash for every package. Elpaca would be able to generate this alist automatically, so users could effortlessly set elpaca-package-hash-alist to this value upon startup. Elpaca would check these values and act accordingly (pull / reset / do nothing, etc). Every package with :elpaca t would be affected, and there could be a user-option for enabling this behavior.
Also, maybe this alist could be written to a .elpaca-pins file or similar?
Example:
Set PACKAGE-NAME to HASH
(add-to-list 'elpaca-package-hash-alist (cons PACKAGE-NAME HASH))
HASH could be set to t to signal an upgrade.
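
One way the proposed pins file could work, as a sketch only: write the alist to a file with the Lisp printer and read it back at startup. Nothing here is an existing Elpaca API. `my/write-pins` and `my/read-pins` are hypothetical names, `some-package` is a placeholder, and the one hash shown is simply the Elpaca ref quoted elsewhere in this thread.

```elisp
;; Hypothetical helpers; not part of Elpaca.
(defun my/write-pins (alist file)
  "Write ALIST of (PACKAGE . HASH) pairs to FILE as readable Lisp data."
  (with-temp-file file
    ;; Make sure long lists are printed in full, never abbreviated.
    (let ((print-length nil)
          (print-level nil))
      (pp alist (current-buffer)))))

(defun my/read-pins (file)
  "Return the alist of (PACKAGE . HASH) pairs stored in FILE."
  (with-temp-buffer
    (insert-file-contents file)
    (read (current-buffer))))

;; Round trip: the data read back should be `equal' to what was written.
(let ((file (make-temp-file "elpaca-pins"))
      (pins '((elpaca . "272966b864db86604535bced55b3dfa3c7ed8532")
              (some-package . t))))   ; t would signal an upgrade, per the proposal
  (my/write-pins pins file)
  (equal pins (my/read-pins file)))
```

Keeping the file as plain printed Lisp means it can be checked into a config repository and edited by hand, which is what several commenters ask for.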