
Special treatment of pre-installed packages by the solver #9669

Open
Tracked by #10440
hasufell opened this issue Jan 29, 2024 · 58 comments

Comments

@hasufell
Member

hasufell commented Jan 29, 2024

The cabal solver seems to treat pre-installed packages specially (e.g. those shipped with GHC).

To reproduce:

git clone https://github.com/hasufell/toto.git
cd toto
ghcup run --ghc 9.4.8 -- cabal build

This should cause a failure, because ghc-9.4.8 ships with filepath-1.4.2.2, but the package above uses modules from 1.4.100.1. The package has no upper bounds on filepath. For any other non-pre-installed package, the solver would pick the latest.

I understand that this is by design, but I question this design here, because:

  • it makes it harder for core library maintainers to ship bugfixes
  • it's a potential security risk

@mpickering found out that there used to be a --upgrade-dependencies switch, which is now disabled.

I argue that the default should be to pick the latest possible version anyway.


CCing some potentially interested parties: @simonpj @frasertweedale

@mpickering
Collaborator

Branch which re-enables --upgrade-dependencies (in an unfinished way) - https://github.com/mpickering/cabal/tree/wip/upgrade-dependencies

It seems sensible to me to choose pre-installed packages, so you don't have to build them again. If the version constraints don't disallow it, the solver could choose that install plan anyway. If you really don't want a package to be part of the install plan, then perhaps what you want instead is a means to instruct the solver to never choose a particular version, as an additional constraint form.
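Cabal's `cabal.project` constraints can already express something close to that last idea; a sketch (the filepath version is taken from the issue description, and the range syntax assumes `constraints:` accepts the same version-range grammar as `build-depends:`):

```
-- cabal.project (sketch): never choose the GHC-bundled
-- filepath-1.4.2.2, but allow anything else
constraints: filepath <1.4.2.2 || >1.4.2.2
```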

@Mikolaj
Member

Mikolaj commented Jan 29, 2024

Regardless of the outcome of the discussion, it would make sense to find the PR that commented --upgrade-dependencies out, understand how this happened, and prevent it in the future, e.g., by guarding this functionality with tests and also by documenting it (better).

@hasufell
Member Author

@mpickering I'm worried about defaults here. I don't think there's an easy way to tell my users that there was a subtle bug in filepath in splitFileName. No one reads the changelog. GHC already ships the version.

What if this is a security bug? What do I do? I have no communication channels. A cabal update && cabal build should leave your project in the best possible state (bugfixes, security fixes) without further interventions/constraints required by the end users.

This is how all Linux distributions work, afaik. "Saving compilation time" seems like a questionable priority, imo (as a default).

@Mikolaj
Member

Mikolaj commented Jan 29, 2024

@mpickering also says about --upgrade-dependencies: "There is the --prefer-oldest option already and this I suppose is a --prefer-newest ... decide how it should interact with --prefer-oldest".

@Mikolaj
Member

Mikolaj commented Jan 29, 2024

CC @grayjay, @gbaz

@ulysses4ever
Collaborator

I agree that --prefer-newest (so, no special-casing the boot packages) would be a cleaner default. Perhaps, there should be --prefer-installed to get the current default if we change the default to --prefer-newest?

@Mikolaj
Member

Mikolaj commented Jan 29, 2024

While I like the default @hasufell proposed better, for the reasons given and also because it's more uniform, I worry it would hit users with big sets of installed packages hard, which may include Nix users, Linux distribution users and v1-/cabal-env users (whether for teaching purposes or others). So we'd need some good backward-compatibility scheme.

@Mikolaj
Member

Mikolaj commented Jan 29, 2024

Another solution, which unfortunately increases complication, might be to treat libraries installed together with GHC (and perhaps all installed not by the user directly, but by install/upgrade scripts) specially and apply --upgrade-dependencies to them, while keeping user-installed libraries immutable, on the premise that the user knows what the user is doing (and that installing libraries is rare and discouraged).

Edit: which somehow agrees with how we treat local packages even if newer versions are on Hackage, local packages being "user-installed" and so automatic upgrades being disabled (I think?). I remember apt keeps track of which packages are directly requested by the user and which are only installed as dependencies, but this is a very distant analogy to installed Haskell packages and how cabal treats them.

@hasufell
Member Author

CCing @Ericson2314 @angerman wrt Nix

@michaelpj
Collaborator

I don't think there's an easy way to tell my users that there was a subtle bug in filepath in splitFileName. No one reads the changelog. GHC already ships the version.

What happens if you deprecate the version with the bug (the installed version)? Will cabal still prefer it?

@Ericson2314
Collaborator

Ericson2314 commented Jan 29, 2024

Re Nix, I would like to have no notion of "preinstalled dependencies" because one should not be "preinstalling" things with Nix. So I like this.

  • it shouldn't affect my ideal world for Nix, as described above
  • insofar as users might expect "core libraries" to be preinstalled, the less special-casing Cabal does, the easier/less-surprising it is to switch those libraries to be non-pre-installed.

In conclusion, Proud Nix Hater @hasufell has proposed something that I think is actually great for Nix. Thank you! :)

@gbaz
Collaborator

gbaz commented Jan 29, 2024

I'm a bit confused. In fact, the way we use nix at work involves "pre-installing" everything in the sense that everything goes into a package database which nix provides, instead of the cabal store, no?

So with the current behavior, if I am building foo which depends on bar-1 and the latter is in the package database nix provides, then even if bar-1.1 is released, cabal build will still use bar-1. With the new behavior, cabal-build would download and build bar-1.1 even though the nix configuration specified bar-1.

(the workaround for this, which is possible but slightly irritating, is to use a cabal.project or the like to disable hackage or any other package repository for packages developed in a nix provided environment)
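Concretely, that workaround can be a one-line cabal.project stanza (a sketch; `active-repositories` requires a reasonably recent cabal-install):

```
-- cabal.project (sketch): ignore Hackage and all other remote
-- repositories, so the solver can only pick from the package
-- databases the Nix environment provides
active-repositories: none
```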

@gbaz
Collaborator

gbaz commented Jan 29, 2024

All that said, I modestly prefer the current behavior, in part because I'm afraid of changing this sort of stuff given the large and unpredictable effects it may have on many users, and in part because users will expect that if they have a working containers installed, then cabal will just use it.

I do think an explicit flag like upgrade-dependencies (though that is a terrible name, given its semantics, and prefer-newest or the like is better) is useful, to make this behavior more controllable.

@hasufell
Member Author

I don't think cabal has any obligation to "not break nix". It's the Nix packagers' obligation to keep it working.

Changes like the proposed one would be communicated early enough with a migration period, so that users can adapt and opt out of the changed behavior.

This is the same with the v1 vs v2 change that the cabal team executed over several years. Except this one seems much less disruptive.

I can't see how the current behavior is a sensible default from any angle, if it causes average users to miss bugfixes. It is not safe.

@Ericson2314
Collaborator

Ericson2314 commented Jan 29, 2024

@gbaz What I mean is that in Haskell.nix-style approaches planning takes place with no / empty package database. Ideally even for "cabal build in Nix shell" usage, we'd still use the original pure / ex nihilo plan.

The current trick of "re-planning" in the Nix shell and hoping it solves for as few not-already-built things as possible is comparatively gross, and (IIRC) runs into issues when sources are funny (e.g. modified local packages).

(That said, the "already installed" constraint is useful for the above hack, and I imagine also useful for anyone that is wondering why their boot packages aren't being used under this issue's proposal.)

@frasertweedale
Contributor

@mpickering I'm worried about defaults here. I don't think there's an easy way to tell my users that there was a subtle bug in filepath in splitFileName. No one reads the changelog. GHC already ships the version.

What if this is a security bug? What do I do? I have no communication channels. A cabal update && cabal build should leave your project in the best possible state (bugfixes, security fixes) without further interventions/constraints required by the end users.

Speaking as a member of the Haskell Security Response Team, our hope is that cabal-install will be enhanced to directly use the data from the advisory database, and either omit affected packages from build plans by default, or alert users when build plans contain affected packages.

This issue poses some good questions but I don't think the SRT would have an opinion on it one way or the other, given the objective of more explicit cabal-install features/behaviour regarding known security issues.

@Bodigrim
Collaborator

Bodigrim commented Jan 29, 2024

The issue with forcing as many of the newest dependencies as possible is that your library/app might end up with a very different set of dependencies than your tests. If tests involve doctest, which is quite common, their build plan includes ghc-the-package and so sets in stone all boot libraries as shipped with GHC. The lib/app most likely does not depend on ghc and would be free to build against the latest and greatest boot packages. The overall effect would be that you are testing not what you are shipping.

I think making ghc reinstallable would be an important stepping stone.

@simonpj
Collaborator

simonpj commented Jan 30, 2024

What is perplexing for me is the following. Suppose we have

  • filepath which just happens to come with GHC, say filepath-4.3.1
  • wombat which just happens not to come with GHC.
  • I install a package wimwam using cabal, and it turns out that cabal's build plan installs the dependency wombat-2.7.2
  • Now the library authors for wombat and filepath release bug-fixes, say wombat-2.7.3 and filepath-4.3.2
  • Some weeks later I install yet another package, foogle which depends on wombat and filepath, but with very open upper bounds.

Question: when installing foogle which versions of wombat and filepath will cabal pick? I understand @hasufell as saying that it will pick

  • The buggy filepath-4.3.1 because it is "pre-installed"
  • The bug-fixed wombat-2.7.3, because cabal picks the newest if it can, even though wombat-2.7.2 is already installed.

I am baffled about why we could possibly want to treat wombat and filepath differently, just because filepath happens, through some accident of fate, to come with GHC.

  • If you want to use already-installed packages, to avoid compilation time, do that for both.
  • If you want to use the same packages for testing and for the library/app, make sure that build plan for both has the same constraints.

Which choice is best isn't obvious to me. But I can't see any justification for treating the two differently.

@hasufell
Member Author

hasufell commented Jan 30, 2024

@simonpj

I am baffled about why we could possibly want to treat wombat and filepath differently, just because filepath happens, through some accident of fate, to come with GHC.

This is mostly correct, with the caveat that cabal treats any package that is in the global package db specially. It just so happens that cabal v2-build and cabal v2-install (which are now the default) don't touch the global package db anymore, as opposed to cabal v1-install --global (legacy). So for most users, the global package db just contains what GHC ships with.

That, imo, makes it even worse. There are many other mechanisms to avoid cabal rebuilds (e.g. just don't run cabal update, use a freeze file, pass certain flags to cabal, ...). For saving space it seems totally backwards, and what we actually want there is: #3333
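Of those mechanisms, a freeze file is the most direct: running `cabal freeze` writes a `cabal.project.freeze` that pins every dependency, so later builds reuse the same plan regardless of solver defaults. A generated file looks roughly like this (a sketch; the package versions are illustrative):

```
-- cabal.project.freeze (sketch of generated output; versions illustrative)
constraints: any.base ==4.17.2.1,
             any.filepath ==1.4.100.1
```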

@hasufell
Member Author

@Bodigrim

If tests involve doctest, which is quite common, their build plan includes ghc-the-package and so sets in stone all boot libraries as shipped with GHC.

I understand doctest is a special case, but I do not believe this justifies having the current default.

Overall effect would be that you are testing not what you are shipping.

It is the maintainer's responsibility to ensure testing across multiple setups. If they can't do that, then their cabal version bounds are simply wrong or their test suite is sub-par.

For anyone wondering, there are two practical solutions to avoid doctest:

@hasufell
Member Author

hasufell commented Jan 30, 2024

@frasertweedale

Speaking as a member of the Haskell Security Response Team, our hope is that cabal-install will be enhanced to directly use the data from the advisory database, and either omit affected packages from build plans by default, or alert users when build plans contain affected packages.

I don't want to digress too much on this, but I'm rather surprised by this sentiment and disagree rather strongly (wrt this being enough).

I'll put my response in a collapsible section to keep the thread clean. I'm happy to continue that discussion privately or on the security response team issue tracker.

Wrt security updates

To my knowledge, there are currently two main definitions of "software security" (or "insecurity").

  1. "unexpected computation", which was first prominently described in several langsec papers (e.g. "Security Applications of Formal Language Theory" and "The Halting Problems of Network Stack Insecurity") and for which there exist a few practical models.
  2. modeling security risk based on market means (as in: how expensive is it for an attacker to crack a system), e.g. "Computer Security Strength & Risk: A Quantitative Approach" by Stuart Edward Schechter

For the first definition there are a variety of technical models, but most of them are rather hard to apply. E.g. there is "An Attack Surface Metric" by Pratyusa K. Manadhata, which attempts to model a system based on IO automatons and the syscalls a user can trigger.
Several LANGSEC papers suggest treating program input as a protocol, starting with a strict parser and then modeling the system accordingly to understand possible system states.
Several MAC/data-flow approaches have been developed to aid with the relationship of user input and control flow (also in Haskell). But these are very specific techniques that a software engineer applies.

The second definition allows more insights into the whole business of redistribution, software maintenance, supply chain issues, buying 0-day exploits or DDoS attacks on the darkweb, etc.
We assume that it is possible to crack any system, depending on available energy and assets. As such, in order to protect from attacks, we want to make attacking the system more expensive.
This can be achieved by a number of things: strong cryptographic protocols, network architecture, etc.

But most importantly: packaging and update policies. The easiest way to make attacks more expensive is to always update to the latest versions of all packages. The reason is that an attacker has to invest more time in studying/keeping up with new versions of software packages and coming up with attacks (or buying them somewhere) as compared to versions that have been around for 2 years. There has just been less time to adapt. Software updates are disruptive for attackers. In that way, the most secure model is a "rolling release distro".

In addition, there is no clear definition in research of what a "security bug" is, and this idea has also been rejected numerous times by the Linux kernel (see various interviews and ML posts by Greg Kroah-Hartman, the maintainer of the stable Linux branch). The Linux kernel backports any bugfix, because any bug can (as per the "unexpected computation" definition of insecurity) potentially lead to a security vulnerability, even if none is known yet. You are not safe merely because you're running a version with no publicly known vulnerabilities.

All this said... we want to optimize an ecosystem of packages in a way that an attacker has to invest a lot of resources to drive attacks on users. And the most important way to do this is to be as aggressive as possible with software updates, even if there's no CVEs that mandate an update. This conflicts (heavily) with the current default of cabal.

@Mikolaj
Member

Mikolaj commented Jan 30, 2024

@simonpj:

I install a package wimwam using cabal, and it turns out that cabal's build plan installs the dependency wombat-2.7.2

I think modern cabal works rather differently. The new cabal v2-build (or v2-install) builds wombat-2.7.2 locally and does not "install" it, at least not in the same sense that GHC distribution installs the packages it comes with. Modern cabal discourages installing any libraries and encourages building them anew (with smart caching via "store").

In fact, if GHC stopped providing/exposing the bundled packages, the problem of the exceptional treatment of installed packages would be immediately gone (until the user insists on manually installing some other packages, which is discouraged and hard to do properly). If GHC ships with the packages so that the user saves on compilation, it's no wonder cabal tries to accommodate it. However, I'm guessing GHC exposes the packages, because the ghc package (and any other non-reinstallable packages?) depends on them and ghc can't be re-built/reinstalled/relinked (in particular, to depend on different versions/builds of its dependencies). See #9064 (comment) and many related issues. I'm sure @bgamari or @mpickering could easily confirm or deny.

Therefore, the inconsistent cabal behaviour may be caused primarily by ghc and others not being reinstallable/rebuildable (and secondarily, by attempting backward compatibility for old v1-/Setup workflows, such as Nix, Linux distros, old setups for Haskell courses where each student is supposed to have the same versions of dependencies and freeze files were not yet a thing). If so, we can wait until ghc/others are reinstallable and then the problem vanishes (unless the user introduces it independently). Or we could try to limit the special cabal behaviour to build plans that include ghc, but then such build-plans are treated specially. If template-haskell is another case of a non-reinstallable package that depends on reinstallable packages (I remember rebuilding it in the past, but that's no longer possible, I think?), this makes the specially treated build plans much more common and harder to describe succinctly to the user.

Am I anywhere close to the root cause of keeping this old functionality in modern cabal? Can cabal handle GHC in some alternative way without incurring this irregular behaviour? E.g., what can go wrong if GHC renames all the packages (in the package db and/or on Hackage) it bundles so that they can't be reinstalled at all?

Edit: actually, what happens if cabal reinstalls a dependency of ghc and uses it alongside the other copy of this dependency baked into ghc? I guess no outright disaster, but there can be subtle bugs due to subtle changes in behaviour between the versions? Is that why cabal is reluctant to reinstall?

@simonpj
Collaborator

simonpj commented Jan 30, 2024

Therefore, the inconsistent cabal behaviour may be caused primarily by ghc and others not being reinstallable/rebuildable

Yes indeed:

  • IF the build plan uses ghc-the-package
  • THEN you are stuck with particular versions of the packages that ghc-the-package depends on

But that is a simple consequence of depending on ghc-the-package, which in turn depends on a particular wired-in version of filepath.

But let's suppose that your build plan does not depend on ghc-the-package or template-haskell (a very common case). Now filepath has no constraints -- cabal is entirely free to rebuild it locally. The fact that there is a pre-installed version is irrelevant, no? So my question remains: why is filepath (and other packages that happen to come with GHC) treated specially?

@michaelpj
Collaborator

Therefore, the inconsistent cabal behaviour may be caused primarily by ghc and others not being reinstallable/rebuildable

I think this is not quite right. I don't think anyone is asking to change the behaviour when you depend on a non-reinstallable package: there we really do have to go with what GHC ships, and that's that.

I think the request is about packages that are reinstallable, but happen to have versions in the global package-db, like filepath. Then the preference seems less justifiable.

@mpickering
Collaborator

The reason that things in the GlobalPackageDB are treated specially is the definition of corePackageDbs in Distribution.Client.ProjectPlanning. This only looks in the GlobalPackageDB and any extra package databases that a user has configured.

In general, it is a bit of an issue that cabal-install and Cabal assume anything about the structure of package databases (see #3728). There are many assumptions baked into both projects that you want to use the global package database and the things in there are privileged. This is primarily a legacy from the old days when GHC was much less flexible about being able to specify a package database stack, but now it is completely agnostic.

@Mikolaj
Member

Mikolaj commented Jan 30, 2024

why is filepath (and other packages that happen to come with GHC) treated specially?

Then the preference seems less justifiable.

My guess is that cabal covers the case of dependencies of non-reinstallable packages in a lazy way --- by treating specially all packages that reside in the relevant package DB [edit: and regardless of the build plan]. This has several advantages: simplicity of implementation, simplicity of configuration (though, as @mpickering states, this may be too hard-wired at this point), an extra benefit of backward compatibility for other legacy workflows using a central package DB, and simplicity of conveying the behaviour to the user (though it's probably not conveyed yet, or not well enough). Edit: one more advantage: this primitive solution does not increase the coupling of GHC and cabal, because the list of non-reinstallable packages that changes between GHC versions (#9092) does not need to be used for yet another purpose in cabal code.

Which is why I'm considering the other option, changing the behaviour of GHC, not of cabal. Or even of the GHC installer, e.g., making it rename all the packages it installs [edit: a less brutal variant: install them to another package db, if that matters]. That's probably absurd for fundamental reasons, but I'd like to improve my understanding of the situation by learning why exactly.

Edit: to be fair, the improvement of analysing the build plan (and auto-upgrading if the package in question is not a dependency of a non-reinstallable package) would not detract from the ease of configuration (though it would eliminate the backward compatibility side-benefit). However, I'm not able to predict what interplay it could have with the solver (because deciding to auto-upgrade changes the build plan, so perhaps we need to solve anew to verify all constraints are respected? what if the new solution implies the package should not be automatically upgraded?).

@gbaz
Collaborator

gbaz commented Jan 30, 2024

The comments here are swaying me towards supporting a change here. However, I really do feel it needs to be flag-controlled and I'm definitely worried that any deep change like this could well confuse workers and disrupt workflows in a way that is very unexpected and hard to diagnose, especially for those who don't read release notes.

@mpickering
Collaborator

There seems to be some suggestion in https://discourse.haskell.org/t/is-cabal-install-stable-enough-yet-that-ghc-should-be-inflicting-it-on-newbies/9979/52 that cabal install --lib will add pre-installed packages (and hence be relevant to this ticket).

However, to my understanding that is false, because v2- commands don't consult the user package database (see the definition of corePackageDbs in Distribution.Client.ProjectPlanning). Therefore this default only affects the global package db and any package database specified to cabal with the --package-db flag.

If I am mistaken about this, then a reproducer which demonstrates the issue would be much appreciated.

@hasufell
Member Author

If I am mistaken about this then a reproduce which demonstrates the issue would be much appreciated.

There's a clear reproducer for the original topic:

To reproduce:

git clone https://github.com/hasufell/toto.git
cd toto
ghcup run --ghc 9.4.8 -- cabal build

This should cause a failure, because ghc-9.4.8 ships with filepath-1.4.2.2, but the package above uses modules from 1.4.100.1. The package has no upper bounds on filepath. For any other non-pre-installed package, the solver would pick the latest.


that cabal install --lib will add pre-installed packages (and hence be relevant to this ticket).

I'm not sure if anyone suggested that and it seems out of scope of this ticket how install --lib behaves. It is broken anyhow.

This ticket is about the solver.

@mpickering
Collaborator

It seems that I misunderstood that suggestion in the thread (it is quite complicated with all the discussions going on). Thank you for clarifying that.

In particular, in the comment "Do v2 builds even look at the globally installed packages?", "globally installed packages" might mean packages installed in the global package database OR packages installed in the user package database by cabal install --lib.

@TeofilC
Collaborator

TeofilC commented Sep 28, 2024

What's the status of this?

Personally I would prefer a positive flag for preferring packages in the global package db, maybe --prefer-globally-installed, which could be on by default with a deprecation cycle to change the default. --upgrade-dependencies does not jump out to me as related to this issue!

I think @michaelpj's suggestion is great. Though I think it's a bit tempting to just make the entire change in one release. I struggle to see how we could communicate this "deprecation" to users, and I imagine we won't get much usage until it becomes the default -- I think the users who would test this feature would just as easily test it from a nightly build.


I would be really keen to make progress here because this looks to be one of the issues blocking decoupling GHC upgrades from boot library upgrades, and I think that would be a huge improvement (though maybe I'm misunderstanding this issue?).

@phadej
Collaborator

phadej commented Sep 28, 2024

though maybe I'm misunderstanding this issue?)

This issue is about default behavior. cabal-install would solve for newer versions and pick one if forced to. So no, this issue is not blocking anything AFAICT.

The example "reproducer" https://github.com/hasufell/toto.git has no lower bound on filepath (in fact, no bounds at all), so picking any version is a fair choice.
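For comparison, if the reproducer's .cabal file stated a lower bound, the solver's choice would no longer be a matter of preference (a sketch; the bound mirrors the version named in the original report):

```
-- toto.cabal (sketch): an explicit lower bound forces the solver to
-- reject the bundled filepath-1.4.2.2 instead of silently using it
build-depends: base, filepath >=1.4.100.1
```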

@TeofilC
Collaborator

TeofilC commented Sep 28, 2024

This issue is about default behavior. cabal-install would solve for newer version and pick one if forced to. So no, this issue is not blocking anything AFAICT.

Good point. I should've been more precise. Here is the sort of scenario I had in mind:

(Assuming the package uses cabal-install and doesn't depend on ghc:lib)

If I want to submit a PR to bump the bound for a non-boot library, I can just update the bound, and be confident that it will be tested by CI.

If I want to bump a boot library bound, then I need to apply this workaround where I forbid the old version. I think most of the community is understandably not aware that this is even possible. Even if someone was aware, I think most maintainers wouldn't want to deal with this extra complexity. They are likely to instead wait for a version of GHC where that library is bundled.

So, this issue blocks updating the ecosystem to "reinstallable" boot library bumps before a GHC comes out with that version bundled.
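The workaround mentioned above amounts to one extra line in the project's cabal.project for CI (a sketch; package and versions are illustrative):

```
-- cabal.project (sketch): forbid the GHC-bundled version so CI
-- actually tests against the newly released boot library
constraints: filepath >1.4.2.2
```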

@TeofilC
Collaborator

TeofilC commented Sep 28, 2024

@Mikolaj wrote early on in the discussion:

Regardless of the outcome of the discussion, it would make sense to find the PR that commented --upgrade-dependencies out, understand how this happened, and prevent it in the future, e.g., by guarding this functionality with tests and also by documenting it (better).

Git blame points us to this commit by @dcoutts 324b324#diff-e2de3403daa75f77ddd177d0a040f0547097abddb360328293ba46da21673a4e

And as far as I can tell, --upgrade-dependencies is introduced here in the PR that implemented nix-style builds, already commented out, with this comment attached:

       -- Things that only make sense for manual mode, not --local mode
       -- too much control!

So it sounds like these have never been applicable to v2-style commands.

@phadej
Collaborator

phadej commented Sep 28, 2024

--upgrade-dependencies is a legacy from v1-build.

The v2 commands have a lot of flags which are inherited from v1-build and do nothing or don't make sense, because the v2 interface wasn't started from scratch, but partially reuses v1-build command definitions.

You correctly blame Duncan. It might have made sense to get something done quickly, but in the long term that was a bad choice. The v2 CLI interface is not clean.

v1-build used a single version per package database, so --upgrade-dependencies was a thing.

@phadej
Collaborator

phadej commented Sep 28, 2024

be confident that it will be tested by CI.
If I want to bump a boot library bound, then I need to apply this workaround where I forbid the old version.

Well... you have to keep testing against the old versions too, because in a larger setup there might be ghc. --prefer-oldest is a recent addition, which helps a bit, but still doesn't test the in-between versions.

But I agree, having a flag to prefer installed, or not (or oldest; a many-way choice) would be helpful. Then good CI setups could test against all options. (Note to myself to add a --prefer-oldest step to haskell-ci.)

IMHO, the default choice doesn't really make a good argument for CI; you should test against as many "naturally occurring" dependency sets as possible. Building against bundled boot libs will occur as long as ghc is not reinstallable.

So I'm 👍 to having more options for the solver. I actually don't care what the default is. (If I don't like it, I hope I can change it in the global cabal-install config.)
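A CI setup along those lines could exercise both ends with the flags that exist today (a sketch assuming a GitHub Actions workflow; --prefer-oldest requires a recent cabal-install):

```yaml
# .github/workflows/ci.yml (sketch): build once with the default solver
# preferences and once preferring the oldest allowed versions
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        solver-flags: ["", "--prefer-oldest"]
    steps:
      - uses: actions/checkout@v4
      - run: cabal build all ${{ matrix.solver-flags }}
```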

@phadej
Collaborator

phadej commented Sep 28, 2024

This is my general stance about these "let's change the default" proposals. Make options available, make them configurable, and then you can (discuss the) change of the default.

The recent --enable-documentation change which cannot be undone in global config is driving me crazy.

@TeofilC
Collaborator

TeofilC commented Sep 28, 2024

Yes this definitely should be (globally) configurable. That really lowers the cost of changing the default as you say -- users can just override it.
It's especially important for this one because, as people have said earlier, this is load-bearing in some Nix setups.

@Mikolaj
Copy link
Member

Mikolaj commented Sep 30, 2024

This is my general stance about these "let's change the default" proposals. Make options available, make them configurable, and then you can (discuss the) change of the default.

I wholeheartedly agree. Given that this discussion took so long and brought us neither consensus nor even a review of the available options, I'd propose to implement the new behaviour as a flag. I hope we can merge such a PR without as long a delay, and afterwards we can resume discussing the default, with a very concrete and tested flag in hand as the candidate for the new default.

@juhp
Copy link
Collaborator

juhp commented Oct 27, 2024

From the distro perspective I think the current behaviour is good, so I am not at all convinced that changing the default behaviour is desirable, though having a flag is fine of course. If a dependency is already satisfied, why should it have to be updated by default? Modern compiled languages already burn far too much energy on needless constant rebuilding...

@hasufell
Copy link
Member Author

From the distro perspective I think the current behaviour is good,

That's an odd thing to say, since every distro I know has the opposite behavior.

If you install a package and there are updates for its dependencies, they will get pulled in.


Can I get a proper decision from Cabal team what the way forward is?

@geekosaur
Copy link
Collaborator

You might be interested to know that GHC HQ is considering my suggestion to ship a list of packages that can't be upgraded and possibly a list of packages that "freeze" the entire set of bootlibs, which would allow the solver to determine when it has to behave the way it does now and otherwise treat bootlibs like normal. (This would work better if ghcup could retrofit this data into older ghc versions which won't include it.) That said, @grayjay told me this may be difficult for the solver to use; it's possible that the current situation is the best the current solver can do.
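
As an illustration only (no such metadata exists today; the file name and field names here are invented), the shipped lists might look like:

```
# settings-style file in GHC's lib directory -- purely hypothetical
non-upgradeable-packages: base ghc-prim ghc-internal ghc-bignum template-haskell
packages-freezing-boot-libs: ghc
```

The solver could then treat everything outside these lists as an ordinary Hackage package, falling back to today's behaviour only when one of the listed packages enters the install plan.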

@hasufell
Copy link
Member Author

This would work better if ghcup could retrofit this data into older ghc versions which won't include it.

That won't be easily possible due to haskell/ghcup-hs#361, a major rewrite that has stalled.

@Mikolaj
Copy link
Member

Mikolaj commented Oct 28, 2024

Can I get a proper decision from Cabal team what the way forward is?

In the absence of a PR, if you'd like the Cabal team as a whole to make a decision, please add this topic to the agenda of the fortnightly cabal devs chat at https://hackmd.io/X62yS0d6RxW3ybh8AmRqlw or, even better, please come to the meeting. Until we have such a decision or any objection, my proposal from a few comments back stands.

@hasufell
Copy link
Member Author

if you'd like the Cabal team as a whole to make a decision

Yes, please add the topic to the agenda. I don't have capacity to be involved in cabal meetings, thanks.

@TeofilC
Copy link
Collaborator

TeofilC commented Oct 28, 2024

From the distro perspective I think the current behaviour is good, so I am not at all convinced changing the default behaviour is desirable

I think it's reasonable that distro (and nix dev shells) want to provide a set of packages and force cabal to use those. But I don't think that means that other users need to be stuck with the current behaviour. I feel like cabal-install should provide configuration options to allow system config to provide these packages, but shouldn't force boot packages on users. So, I don't think there's necessarily a conflict between these two desires.

@Ericson2314
Copy link
Collaborator

And note that this isn't ideal for Nix either. When we hop into a dev shell we shouldn't be re-planning at all. Forcing cabal to choose installed packages is just a hack around not being able to reuse the original plan from which the Nix shell dependencies were chosen.

@phadej
Copy link
Collaborator

phadej commented Oct 28, 2024

When we hop in a dev shell we shouldn't be re-planning at all.

There could be greedy pick-installed-only solver for Nix (and alike) use cases. But cabal-install developers are going to remove support for different solvers (e.g. #9206), so there's no going back. (Constructing install plan, even from existing one is still "constructing install plan"; arguably stack has a solver as well, though a very naive one). (IIRC there was some older patch, which hard coded modular solver as the solver everywhere, but I couldn't find it: EDIT, found #9282).

@grayjay
Copy link
Collaborator

grayjay commented Oct 29, 2024

You might be interested to know that GHC HQ is considering my suggestion to ship a list of packages that can't be upgraded and possibly a list of packages that "freeze" the entire set of bootlibs, which would allow the solver to determine when it has to behave the way it does now and otherwise treat bootlibs like normal. (This would work better if ghcup could retrofit this data into older ghc versions which won't include it.) That said, @grayjay told me this may be difficult for the solver to use; it's possible that the current situation is the best the current solver can do.

@geekosaur I thought about the freezing behavior that you mentioned, and I realized that I'm not sure I understand. Would the solver need any additional logic beyond the existing requirement that an installed package depend on the exact installed versions of the dependencies that it was built against? I would expect that including ghc in an install plan would already freeze all of the dependencies declared in ghc's build-depends field.

@geekosaur
Copy link
Collaborator

I think the issue is that, since ghc is a "special" package that depends on the exact packages shipped with it (more specifically, on the exact packages that ghc itself was built against), it can't rely on build-depends but must go through the global package db. Isn't that why bootlibs use those versions to begin with?

@grayjay
Copy link
Collaborator

grayjay commented Oct 29, 2024

cabal already only considers the installed version of ghc, because it appears in the list of non-reinstallable packages:

-- | The set of non-reinstallable packages includes those which cannot be
-- rebuilt using a GHC installation and Hackage-published source distribution.
-- There are a few reasons why this might be true:
--
-- * the package overrides its unit ID (e.g. with ghc's @-this-unit-id@ flag),
--   which can result in multiple indistinguishable packages (having potentially
--   different ABIs) with the same unit ID.
--
-- * the package contains definitions of wired-in declarations which tie
--   it to a particular compiler (e.g. we can't build or link against
--   @base-4.18.0.0@ using GHC 9.6.1).
--
-- * the package does not have a complete (that is, buildable) source distribution.
--   For instance, some packages provided by GHC rely on files outside of the
--   source tree generated by GHC's build system.
nonReinstallablePackages :: [PackageName]
nonReinstallablePackages =
  [ mkPackageName "base"
  , mkPackageName "ghc-bignum"
  , mkPackageName "ghc-internal"
  , mkPackageName "ghc-prim"
  , mkPackageName "ghc"
  , mkPackageName "integer-gmp"
  , mkPackageName "integer-simple"
  , mkPackageName "template-haskell"
  ]

Choosing the installed ghc means that all of ghc's dependencies must also be installed (the same versions it was built against, which came with GHC). I should have said that cabal reads the dependencies from the InstalledPackageInfo, not build-depends, because it is an installed package. I was wondering whether cabal needs to do anything more to implement the freezing behavior that you mentioned.
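
The frozen dependencies of an installed package can be inspected directly: `ghc-pkg` records the exact unit ids a package was built against. For example (a sketch; `ghc-pkg` ships with GHC, and its output varies by GHC version):

```
# List the exact unit ids the installed ghc package depends on.
# These are what the solver must pick if it chooses the installed ghc.
ghc-pkg field ghc depends
```

Choosing the installed ghc therefore transitively pins every boot library to the exact instance in the global package db, which is the freezing behaviour under discussion.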

@geekosaur
Copy link
Collaborator

It's already doing the freezing behavior; the problem is that it's freezing too much and we need a way to say "only these packages need to be frozen unless the ghc package is involved".

@geekosaur
Copy link
Collaborator

To be more clear: certain packages are hardcoded into cabal as being completely non-reinstallable. Any other package in the global package db is "soft non-reinstallable", or preferred in cabal-file lingo. They shouldn't be: they should be treated like any other package, unless ghc is being linked. But then we need more than the version, and the global package db is not a store keyed by hash, so we can't use it to say "this exact package, not just the same version" when it comes to ghc. (This is not entirely true as of 9.10.1: there is a very short hash in the package id, which makes the first point in the above comment false, but I don't know how it reflects into ghc's dependencies.)

@TeofilC
Copy link
Collaborator

TeofilC commented Oct 29, 2024

The issue for amending the list of non-reinstallable packages is #10087; perhaps we could move conversation related to that into that issue, considering this thread is already really long.
