Pkg servers: The eager registry flavor is significantly lagging behind the General registry's master branch #120562

Closed
GeorgeR227 opened this issue Dec 2, 2024 · 52 comments

Comments

@GeorgeR227

I released a new version of CombinatorialSpaces.jl a few hours ago, but it seems that release hasn't made it to the General registry yet. To confirm the issue isn't specific to this release, I tried a few unrelated packages that were released even earlier, some as long as 12 hours ago, and I can't update to the newest versions of those either.

There probably isn't, and can't be I suppose, a guarantee on how long this propagation can take but this feels like significantly longer than usual.

@DilumAluthge
Member

If you need the latest package version right now, you can try switching to the "eager" flavor of the registry:

ENV["JULIA_PKG_SERVER_REGISTRY_PREFERENCE"] = "eager"

import Pkg

Pkg.Registry.update()

Try that out, and if you can't get the latest package version when you're using the eager registry, let me know, and I'll reopen this issue.
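For a one-off check from a shell, the preference can also be scoped to a single command instead of being set inside the Julia session. This is a minimal sketch; `with_registry_flavor` is a hypothetical helper name, not something provided by Julia or Pkg.

```shell
# Hypothetical helper: run one command with the registry flavor set only for
# that command, leaving the calling shell's environment untouched.
with_registry_flavor() {
  flavor="$1"
  shift
  JULIA_PKG_SERVER_REGISTRY_PREFERENCE="$flavor" "$@"
}

# Usage (requires julia on PATH):
#   with_registry_flavor eager julia -e 'import Pkg; Pkg.Registry.update()'
```

Because the variable is only set for the child process, later Julia sessions started from the same shell keep whatever default applied before.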

@DilumAluthge closed this as not planned on Dec 2, 2024
@GeorgeR227
Author

I tried running those commands but I still can't get the newest version to download. However, I've noticed that one of those other packages I tested before (Enzyme.jl specifically) that released a longer time ago seems to have finally propagated. In this case I'm expecting version 0.6.8 for CombinatorialSpaces.jl and Enzyme.jl is now on 0.13.18.

julia> ENV["JULIA_PKG_SERVER_REGISTRY_PREFERENCE"] = "eager"
"eager"

julia> import Pkg

julia> Pkg.Registry.update()
    Updating registry at `~/.julia/registries/General.toml`

(@v1.11) pkg> add CombinatorialSpaces
   Resolving package versions...
    Updating `~/.julia/environments/v1.11/Project.toml`
  [b1c52339] + CombinatorialSpaces v0.6.7

...

(@v1.11) pkg> add Enzyme
   Resolving package versions...
   Installed Enzyme ─ v0.13.18
    Updating `~/.julia/environments/v1.11/Project.toml`
  [7da242da] + Enzyme v0.13.18

@GeorgeR227
Author

If this is simply a case of "sit and wait", that's not too much of a problem for me. However, I just wanted to point this out in case there was some kind of unintended slowdown/blockage in the registry.

@DilumAluthge
Member

Well, the goal of the eager registry is to never have that delay. So this may be a bug in the Pkg server.

@DilumAluthge reopened this on Dec 2, 2024
@NicolasRiel

Same issue on my side. I've been waiting 12 hours, and I'm also using "eager".

@DilumAluthge changed the title from "Registry propagation taking a long time" to "Registry propagation taking a long time (even when using eager)" on Dec 3, 2024
@DilumAluthge
Member

DilumAluthge commented Dec 3, 2024

I am able to reproduce this in a clean setup with Julia 1.11.1. Here are the steps to reproduce.

First, run the following commands in Bash:

unset JULIA_NUM_THREADS

unset JULIA_PROJECT

unset JULIA_LOAD_PATH

unset JULIA_PKG_SERVER

export JULIA_DEPOT_PATH="$(mktemp -d)"

julia --startup-file=no --history-file=no

Now, inside Julia, run the following commands:

ENV["JULIA_PKG_PRECOMPILE_AUTO"] = "0"

ENV["JULIA_PKG_SERVER_REGISTRY_PREFERENCE"] = "eager"

import Pkg

Pkg.Registry.add()

Pkg.Registry.update()

Pkg.add("CombinatorialSpaces")

Pkg.status()

Pkg.add(; name = "CombinatorialSpaces", version = "0.6.8")

Here is the output that I get:

$ unset JULIA_NUM_THREADS
$ unset JULIA_PROJECT
$ unset JULIA_LOAD_PATH
$ unset JULIA_PKG_SERVER
$ export JULIA_DEPOT_PATH="$(mktemp -d)"
$ julia --startup-file=no --history-file=no
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.11.1 (2024-10-16)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia> ENV["JULIA_PKG_PRECOMPILE_AUTO"] = "0"
"0"

julia> ENV["JULIA_PKG_SERVER_REGISTRY_PREFERENCE"] = "eager"
"eager"

julia> import Pkg
[ Info: Precompiling Pkg [44cfe95a-1eb2-52ea-b672-e2afdf69b78f]

julia> Pkg.Registry.add()
  Installing known registries into `/var/folders/f4/dd0zhsvj5099rqhl7v7wjq580000gq/T/tmp.wOmv6COwdI`
       Added `General` registry to /tmpdepot/registries
true

julia> Pkg.Registry.update()
    Updating registry at `/tmpdepot/registries/General.toml`

julia> Pkg.add("CombinatorialSpaces")
    Updating registry at `/tmpdepot/registries/General.toml`
   Resolving package versions...

...

    Updating `/private/tmpdepot/environments/v1.11/Project.toml`
  [b1c52339] + CombinatorialSpaces v0.6.7
    Updating `/private/tmpdepot/environments/v1.11/Manifest.toml`

...

julia> Pkg.status()
Status `/private/tmpdepot/environments/v1.11/Project.toml`
  [b1c52339] CombinatorialSpaces v0.6.7

julia> Pkg.add(; name = "CombinatorialSpaces", version = "0.6.8")
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package CombinatorialSpaces [b1c52339]:
 CombinatorialSpaces [b1c52339] log:
 ├─possible versions are: 0.1.0 - 0.6.7 or uninstalled
 ├─restricted to versions * by project [5aa5a18c], leaving only versions: 0.1.0 - 0.6.7
 │ └─project [5aa5a18c] log:
 │   ├─possible versions are: 0.0.0 or uninstalled
 │   └─project [5aa5a18c] is fixed to version 0.0.0
 └─restricted to versions 0.6.8 by an explicit requirement — no versions left
Stacktrace:

...

However, looking at the registry, I see that CombinatorialSpaces v0.6.8 should be available when using the eager registry:

["0.6.8"]
git-tree-sha1 = "046dd1cba7869d566fcfbc19f3fb420d408585b3"
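A quick way to confirm that a version is registered on master is to look for its table header in the package's `Versions.toml`. The following is a rough sketch: `has_version` is a hypothetical helper, and the URL in the usage comment assumes the General registry layout `<first letter>/<package>/Versions.toml`.

```shell
# Hypothetical helper: succeeds if the Versions.toml text on stdin contains
# a ["<version>"] table header on a line of its own.
has_version() {
  grep -Fqx "[\"$1\"]"
}

# Usage (needs network access; the path is an assumption about the layout):
#   curl -fsSL https://raw.githubusercontent.com/JuliaRegistries/General/refs/heads/master/C/CombinatorialSpaces/Versions.toml \
#     | has_version 0.6.8 && echo "registered on master"
```

If this succeeds while `Pkg.add` still reports the older version as the newest one, the gap is in what the Pkg server is serving rather than in the registry itself.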

@DilumAluthge
Member

DilumAluthge commented Dec 3, 2024

I took a look at which commits registries.eager and registries.conservative currently point to. Here is what I get:

Latest commit on origin/master:
commit 4efa4f52673180858dd4c91b0c8df93c4c09f7cd
Author: Registrator <[email protected]>
Date:   Tue Dec 3 09:55:54 2024 +0530
    New version: GasChem v0.8.0 (#120574)

Eager commit:
commit 7aa56ada7e91056215ab4e18a37ba7f6081f0b0f
Author: Registrator <[email protected]>
Date:   Mon Dec 2 15:14:01 2024 +0530
    New version: PkgSkeleton v1.3.2 (#120518)

Conservative commit:
commit 6ec563265b0dcff3a3fbe9b9445e6523de125801
Author: Registrator <[email protected]>
Date:   Mon Dec 2 10:39:24 2024 +0530
    New version: Gridap v0.18.8 (#120508)

In hyperlink format:

@DilumAluthge
Member

So currently, registries.eager is not giving the latest commit.

@DilumAluthge changed the title from "Registry propagation taking a long time (even when using eager)" to "The eager registry flavor is significantly lagging behind the master branch" on Dec 3, 2024
@dlfivefifty
Contributor

I'm also experiencing this trying to use InfiniteArrays v0.15 which is breaking CI in the PR JuliaLinearAlgebra/LazyBandedMatrices.jl#137

@DilumAluthge
Member

I wonder if this is a bug in the Pkg server (or storage server)? cc: @staticfloat @fredrikekre

@DilumAluthge
Member

Okay, here's what I'm getting now. eager has advanced (compared to where it was yesterday), but it still is lagging behind master.


Latest commit on origin/master:
commit 38ac2aa0bb3ce9df8b55227ded841e9838525779
Author: Registrator <[email protected]>
Date:   Wed Dec 4 04:25:34 2024 +0530
    New package: StateSignals v0.1.0 (#120374)

Eager commit:
commit 4572c5eb4b7e77711da0695e90c28a6a8aa07375
Author: Registrator <[email protected]>
Date:   Tue Dec 3 21:30:08 2024 +0530
    New version: SimulationBasedInference v0.1.7 (#120612)

Conservative commit:
commit 6ec563265b0dcff3a3fbe9b9445e6523de125801
Author: Registrator <[email protected]>
Date:   Mon Dec 2 10:39:24 2024 +0530
    New version: Gridap v0.18.8 (#120508)

@DilumAluthge
Member

And note that conservative today is stuck on the same commit that it was stuck on yesterday.

@LilithHafner
Member

I'm hitting this with [email protected], released 11 hours ago.

@maleadt
Contributor

maleadt commented Dec 4, 2024

Something still seems up: Requesting the eager registry sometimes returns outdated responses:

❯ curl -LsI -v https://pkg.julialang.org/registries.eager |& grep -E '(last-modified|Host)'
> Host: eu-central.pkg.julialang.org
< last-modified: Tue, 03 Dec 2024 16:07:51 GMT

❯ curl -LsI -v https://pkg.julialang.org/registries.eager |& grep -E '(last-modified|Host)'
> Host: eu-central.pkg.julialang.org
< last-modified: Wed, 04 Dec 2024 06:28:10 GMT

The same doesn't happen with the default conservative one.

@fredrikekre
Member

I don't think the Last-Modified header is very accurate when hitting the intermediate layer, but going to the storage servers instead:

$ curl -sLI https://us-east.storage.julialang.org/registries.conservative | rg Last
Last-Modified: Mon, 02 Dec 2024 05:18:30 GMT

$ curl -sLI https://kr.storage.julialang.org/registries.conservative | rg Last
Last-Modified: Mon, 02 Dec 2024 05:18:28 GMT

@fredrikekre
Member

.eager is up to date now I hope:

$ curl -sLI https://us-east.storage.julialang.org/registries | rg Last
Last-Modified: Wed, 04 Dec 2024 15:35:03 GMT

$ curl -sLI https://us-east.storage.julialang.org/registries.eager | rg Last
Last-Modified: Wed, 04 Dec 2024 15:35:01 GMT

$ curl -sLI https://us-east.storage.julialang.org/registries.conservative | rg Last
Last-Modified: Mon, 02 Dec 2024 05:18:30 GMT
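To put a number on the gap between two flavors, the `Last-Modified` values can be subtracted. This is a small sketch that assumes GNU `date` (its `-d` option parses these HTTP date strings; BSD/macOS `date` does not accept the same syntax). `lag_hours` is a hypothetical helper name.

```shell
# Hypothetical helper: whole hours between two HTTP-date strings
# (newer first, older second). Assumes GNU date.
lag_hours() {
  newer=$(date -u -d "$1" +%s) || return 1
  older=$(date -u -d "$2" +%s) || return 1
  echo $(( (newer - older) / 3600 ))
}

# Using the us-east values reported above for registries.eager and
# registries.conservative:
lag_hours "Wed, 04 Dec 2024 15:35:01 GMT" "Mon, 02 Dec 2024 05:18:30 GMT"
```

With those two timestamps the result is 58, i.e. conservative trailed eager by a little over two days at that point.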

@NicolasRiel

I could update my package a few minutes ago. Seems to work for me.

Cheers

@MilesCranmer

MilesCranmer commented Dec 4, 2024

Thanks for the detailed descriptions about this issue. For me it is still out-of-date (UK). Do you have any tips on figuring out which registry I default to?

Also, is there a list of all Julia registries somewhere? An extremely useful workflow would be to verify all registries are up-to-date before pushing updates to clients – since any out-of-date registry might break applications until the new packages are available. (e.g., my python library automatically updates the Julia packages - which would fail until the registries are all updated)

Edit: here's an attempt in case others want something:

#!/bin/bash

channel="conservative"
base_url="https://%s.pkg.julialang.org/registries.${channel}"
regions=(
  "au"
  "eu-central" 
  "in"
  "jp"
  "kr"
  "sa"
  "sg"
  "us-east"
  "us-west"
)

urls=()
for region in "${regions[@]}"; do
  urls+=("$(printf "$base_url" "$region")")
done

for url in "${urls[@]}"; do
  # Strip the trailing carriage return that HTTP header lines carry before parsing.
  last_modified=$(curl -sLI "$url" | tr -d '\r' | grep -i "last-modified" | awk '{$1=""; print $0}' | sed 's/^ //')
  echo "$url: $last_modified"
done

Although this only does the pkg servers rather than storage since I don't know those URLs.

@MilesCranmer

My local package server in the UK is still out-of-date. I am trying to push a bug fix to downstream users but it's been 22 hours now without the storage server being updated :/ Is there anything I can do to help push this along? (Downstream users will all be on conservative presumably)

@DilumAluthge
Member

Which package server are you currently being routed to?

That is, what do you get when you run the following on your local machine?

curl -X GET -I pkg.julialang.org

Specifically, what's the Location: value?
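To pull out just that value, the headers can be filtered. This is a minimal sketch; `extract_location` is a hypothetical helper, and it assumes the response carries a single `Location` header.

```shell
# Hypothetical helper: print the value of the first Location header found in
# the HTTP response headers on stdin (header name matched case-insensitively).
extract_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Usage (needs network access):
#   curl -sI pkg.julialang.org | extract_location
```

The hostname in the printed URL indicates the regional Pkg server the request is being routed to.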

@DilumAluthge changed the title from "The eager registry flavor is significantly lagging behind the master branch" to "Pkg servers: The eager registry flavor is significantly lagging behind the master branch" on Dec 4, 2024
@DilumAluthge changed the title from "Pkg servers: The eager registry flavor is significantly lagging behind the master branch" to "Pkg servers: The eager registry flavor is significantly lagging behind the General registry's master branch" on Dec 4, 2024
@DilumAluthge
Member

Hmmm, for me, eu-central.pkg seems to go back and forth between Monday (2 days ago) and Wednesday (today):

$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Mon, 02 Dec 2024 05:18:31 GMT
$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Wed, 04 Dec 2024 06:28:11 GMT
$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Mon, 02 Dec 2024 05:18:31 GMT
$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Wed, 04 Dec 2024 06:28:11 GMT
$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Wed, 04 Dec 2024 06:28:11 GMT
$ curl -sLI https://eu-central.pkg.julialang.org/registries.conservative | grep -i "last"
last-modified: Mon, 02 Dec 2024 05:18:31 GMT

There are two eu-central Pkg server backends (eu-central1.pkg and eu-central2.pkg). I wonder if one is up-to-date (giving Wednesday), and the other is lagging (giving Monday), and presumably the loadbalancer (eu-central.pkg) is round-robinning me between the two.
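One way to test the round-robin hypothesis is to sample the same URL repeatedly and count how often each `Last-Modified` value comes back. The sketch below uses two hypothetical helpers, `last_modified` and `tally`. It only shows that responses differ between requests, not which backend produced them.

```shell
# Hypothetical helper: print the Last-Modified value from the HTTP response
# headers on stdin (first match only, header name matched case-insensitively).
last_modified() {
  tr -d '\r' | awk 'tolower($1) == "last-modified:" { sub(/^[^ ]+ /, ""); print; exit }'
}

# Hypothetical helper: count how often each distinct line appears on stdin.
tally() {
  awk '{ seen[$0]++ } END { for (v in seen) print seen[v] "\t" v }' | sort
}

# Usage (needs network access): sample the load balancer 10 times.
#   url=https://eu-central.pkg.julialang.org/registries.conservative
#   for i in 1 2 3 4 5 6 7 8 9 10; do curl -sLI "$url" | last_modified; done | tally
```

Two distinct timestamps with roughly equal counts would be consistent with one stale and one current backend behind the load balancer.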

@fredrikekre Any idea why one eu-central backend would be up-to-date, but the other backend is lagging?

@LilithHafner
Member

I'm still not seeing 1.0.3 or 1.0.4 of BasicAutoloads, one of which is more than 24 hours old. Requesting eager fixes it, but 24 hours of latency is quite a lot:

shell> curl https://raw.githubusercontent.com/JuliaRegistries/General/refs/heads/master/B/BasicAutoloads/Versions.toml
["1.0.0"]
git-tree-sha1 = "f0b29630f259d7d218b6cfa45ee3d2d67a010da4"

["1.0.1"]
git-tree-sha1 = "f15954212241fe1fcb8b36a52fc2d0913c8863d8"

["1.0.2"]
git-tree-sha1 = "3aaa0991207215dc7d21705d868a1bea5c40a108"

["1.0.3"]
git-tree-sha1 = "8527f3827129547d14164f60c1ce4db06979526e"

["1.0.4"]
git-tree-sha1 = "5854c10828e40cee88c8888f8d91a58888f86789"

(@v1.11) pkg> activate --temp
  Activating new project at `/tmp/jl_z5w6Wz`

(jl_z5w6Wz) pkg> add BasicAutoloads@1.0.4
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package BasicAutoloads [09cdc199]:
 BasicAutoloads [09cdc199] log:
 ├─possible versions are: 1.0.0 - 1.0.2 or uninstalled
 └─restricted to versions 1.0.4 by an explicit requirement — no versions left

(jl_z5w6Wz) pkg> registry update
    Updating registry at `~/.julia/registries/General.toml`

(jl_z5w6Wz) pkg> add BasicAutoloads@1.0.4
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package BasicAutoloads [09cdc199]:
 BasicAutoloads [09cdc199] log:
 ├─possible versions are: 1.0.0 - 1.0.2 or uninstalled
 └─restricted to versions 1.0.4 by an explicit requirement — no versions left

julia> ENV["JULIA_PKG_SERVER_REGISTRY_PREFERENCE"] = "eager"
"eager"

(jl_z5w6Wz) pkg> registry update
    Updating registry at `~/.julia/registries/General.toml`

(jl_z5w6Wz) pkg> add BasicAutoloads@1.0.4
   Resolving package versions...
   Installed BasicAutoloads ─ v1.0.4
    Updating `/tmp/jl_z5w6Wz/Project.toml`
  [09cdc199] + BasicAutoloads v1.0.4
    Updating `/tmp/jl_z5w6Wz/Manifest.toml`
  [09cdc199] + BasicAutoloads v1.0.4
Precompiling project...
  ✓ BasicAutoloads
  1 dependency successfully precompiled in 1 seconds
  1 dependency precompiled but a different version is currently loaded. Restart julia to access the new version

@DilumAluthge
Member

conservative being 24 hours old isn't too bad IMO (as long as eager is up-to-date).

IMO, conservative really only becomes a problem if it's a longer delay, e.g. a week or more (again, assuming eager is up-to-date), given that it's really easy for users to switch to eager temporarily.

@DilumAluthge
Member

Probably a week delay would be excessive, but I think 48 hours or less is probably okay.

@MilesCranmer

Is there a reason conservative is default? Just worried for my downstream users who won’t get my hotfix for this long.

I might be especially screwed because a lot of my users are in Python via PythonCall.jl, and PyPI and conda-forge are updated instantaneously. This means that whenever Julia’s package registry gets out-of-sync, their installs will just completely fail due to not being able to find the new backend.

I guess I could automatically set their env variable but perhaps seems a bit fragile

@DilumAluthge
Member

As far as the default value: If the user doesn't specify the environment variable, then Julia/Pkg will default to conservative.

So, in most cases (e.g. normal local usage), the default will end up being conservative.

However, in certain situations, you might end up observing a different default. For example, in GitHub Actions CI, if you use julia-actions/julia-buildpkg or julia-actions/julia-runtest, then your CI jobs will get eager by default. See here (for julia-buildpkg) and here (for julia-runtest). You can of course override this behavior by setting the environment variable in your GitHub Actions workflow YAML file.

@DilumAluthge
Member

Just saw your comment. Give me a minute, I'll write up a few sentences explaining why conservative is the default.

@DilumAluthge
Member

So, here's the reasoning for why conservative is the default.

When using the conservative registry, the user is guaranteed that all packages and artifacts are available to download from the Pkg servers. In particular, the user will never need to download any packages or artifacts from github.com.

When using the eager registry, this guarantee is not present. Therefore, it is possible that a package or artifact will not be available from the Pkg servers, in which case the user would need to download the package or artifact from another location, such as github.com.

Consider three different groups of users:

  1. Most users.
    • For most users, it doesn't matter if they use eager or conservative.
  2. Users that have network restrictions (e.g. a corporate firewall) that prevent them from downloading from github.com.
    • These users need to use the conservative registry.
  3. Users that need to be able to use the latest version of a package soon after the registration PR has been merged
    • In most cases, these users will be fine with either conservative or eager. However, in some cases, these users need to use eager when they need to use certain latest package versions.

My concern is that if we switch the default to eager, then users in group 2 will, by default, not be able to install any Julia packages. The scenario I have in mind is this: a new Julia user downloads Julia and tries to use it on their corporate network, Julia/Pkg fails to install any Julia packages, and the new user gives up and doesn't pursue Julia further. I'm worried that this kind of experience would have a negative impact on Julia adoption by new users, particularly in the corporate setting.

In contrast, the users in group 3 tend to be more experienced in Julia. For example, they might be Julia package developers that want to install the latest version of the package that they just registered. These group 3 users are more likely to be familiar with Julia and thus are more likely to be participating in JuliaLang communities (such as Discourse, Slack, Zulip, GitHub, etc). So it's more likely that they'll know where to reach out for help on one such platform.

@DilumAluthge
Member

I might be especially screwed because lot of my users are in Python via PythonCall.jl — PyPI and conda-forge are updated instantaneously. This means that whenever Julia’s package registry gets out-of-sync, their installs will just completely fail due to not being able to find the new backend.

I'm not sure I 100% understand your use case - you have both a Julia package and a Python package, and you register a new version of your Julia package (in the General registry) at the same time you upload a new version of your Python package (in PyPI and/or conda-forge), and each specific version of the Julia package only works with a specific version of the Python package?

(We can move this conversation somewhere else, to save people the notifications.)

@dlfivefifty
Contributor

My packages are still not updating (UK)... I don't really know the "eager" or "conservative" distinction; I'm just using the default.

@MilesCranmer

@DilumAluthge Thanks for explaining all of this. My follow-up question would be – could we have eager be the default and then have conservative be used as a fallback for some download failing? I feel like getting bugfixes to users as quickly as possible is really important. If someone hits a download error, then at the very least, their bug will be "loud" and debuggable. But a given user may not know they've been using a package that has a sev 1 for a week when there's already a patch in place. And if conservative could be an automatic fallback for when a URL can't be accessed, that would be nice too.

@dlfivefifty same for me (also UK). Even using eager I am not getting updates.

@DilumAluthge
Member

@dlfivefifty Can you try with the eager registry flavor, and see if that works? The instructions are here (#120562 (comment)). Also, if you still have issues with the eager registry, how long ago was the package version registered (that you are trying to install)?

@MilesCranmer When you use the eager registry, how long ago was the package version registered (that you are trying to install)?


As far as the question about "falling back" to conservative, I don't know if that's possible with the way that Julia/Pkg (and the Pkg Servers) currently work. Someone else (@StefanKarpinski, probably) would know better than me. My impression is that it's not possible to do.

@waltergu

waltergu commented Dec 4, 2024

As far as the default value: If the user doesn't specify the environment variable, then Julia/Pkg will default to conservative.

So, in most cases (e.g. normal local usage), the default will end up being conservative.

However, in certain situations, you might end up observing a different default. For example, in GitHub Actions CI, if you use julia-actions/julia-buildpkg or julia-actions/julia-runtest, then your CI jobs will get eager by default. See here (for julia-buildpkg) and here (for julia-runtest). You can of course override this behavior by setting the environment variable in your GitHub Actions workflow YAML file.

For me, even in CI with the eager option, it takes more than 10 hours for the pkg server to update.

@dlfivefifty
Contributor

Yes eager worked

@DilumAluthge
Member

DilumAluthge commented Dec 4, 2024

For me, even in CI with the eager option, it takes more than 10 hours for the pkg server to update.

This is just my personal opinion, but IMO 10 hours isn't too bad of a delay. I personally think that anything under 48 hours isn't too bad. Again, this is just me - other folks may disagree.

EDIT: Please disregard this comment. I completely misread the previous commenter's comment (I thought the comment said "conservative" instead of "eager".)

@waltergu

waltergu commented Dec 4, 2024

For me, even in CI with the eager option, it takes more than 10 hours for the pkg server to update.

This is just my personal opinion, but IMO 10 hours isn't too bad of a delay. I personally think that anything under 48 hours isn't too bad. Again, this is just me - other folks may disagree.

10 hours is OK for conservative. But for eager, it is too long because it leads to annoying CI failures, which is painful for a package developer.

@DilumAluthge
Member

DilumAluthge commented Dec 4, 2024

I am so sorry - I completely misread your message.

Yes, I agree - in my opinion, if eager is lagging the master branch of General by 10 hours, that's a bug.

Just to double-check: in CI now, the eager registry is now up-to-date for you, right? (Now that the Pkg server team has implemented some fixes.) If not, I should re-open this issue.

@waltergu

waltergu commented Dec 4, 2024

I am so sorry - I completely misread your message.

Yes, in my opinion, if eager is lagging the master branch of General by 10 hours, that's a bug.

Just to double-check: in CI now, the eager registry is now up-to-date for you, right? (Now that the Pkg server team has implemented some fixes.) If not, I should re-open this issue.

Happy to know that the Pkg server team has implemented some fixes. I will check it later.

@MilesCranmer

Thanks for the continued engagement @DilumAluthge, much appreciated.
So the only other points of comparison I have are PyPI (instant), conda-forge (minutes), pipenv (minutes), Rust crates.io (instant), Docker Hub (instant), npm (instant), Ruby (instant), and things like GitHub itself (instant). Maybe Pkg servers just use a different operating mode that results in a different scale for what is considered reasonable? For example, pipenv users even complain about a 30-minute delay (worst case) in pypa/pipenv#2840, which is due to pipenv wrapping PyPI. So even 10 hours, to me, seems quite large.

More info on the Python-Julia thing: usually what I’ll do is wait 24 hours after my Julia package update is registered before pushing my Python wrapper to PyPI, just in case. But once in a while the Julia registry takes longer (usually for only some availability zones), so I’ll get some emails from users that installation failed, since the Python wrapper is versioned to a specific Julia backend.

@DilumAluthge
Member

so I’ll get some emails from users that installation failed

In these cases, are the failures occurring only when a user is installing your Julia package for the first time? If so, the easiest approach might just be to add the instructions for using the eager registry to your Julia package's docs and/or README, and ask the users to perform those steps before they install your Julia package for the first time.

Also, how are the users installing your Python package? Do they manually install it themselves (e.g. they run the pip/poetry/etc command themselves)? Or does your Julia package automatically install the Python package (via e.g. Conda.jl or something like that)?

@MilesCranmer

MilesCranmer commented Dec 5, 2024

PySR uses juliacall (PythonCall.jl). Users will run:

pip install pysr

This will download the current version, say 1.0.3. Then they import it, and juliacall automatically installs Julia and [email protected]. It’s all automatic which is really good for the many novice programmers who use PySR (it’s an end-user tool rather than a framework).

Now if a user is unlucky, they might be in an availability zone that is slow. So even if I wait 24 hours as a precaution before uploading to PyPI (which propagates instantly), the Julia pkg servers might still be out of date. This will instantly fail their install step since it can’t find the right backend.

@DilumAluthge
Member

DilumAluthge commented Dec 5, 2024

Hmmm. So, pysr is your Python package, and it depends on juliacall? Do you have control over the juliacall invocation that does the "install the [email protected] Julia package" step? If so, I think probably the quickest fix is for you to set the JULIA_PKG_SERVER_REGISTRY_PREFERENCE environment variable during that juliacall invocation.

@MilesCranmer

Yes exactly. My one concern is if I force "eager", then it sounds like there would be other issues (group 2 you mentioned above)? In which case a fallback to "conservative" would help a lot.

For a long time I had it attempt to download SymbolicRegression.jl directly from GitHub at the specific tag. The issue with this is precisely that many people could not clone (either due to missing git, or some firewall reason), so I switched back.

@DilumAluthge
Member

Something like MilesCranmer/PySR#765 might work for your use case. It first tries with JULIA_PKG_SERVER_REGISTRY_PREFERENCE=eager, but if an exception is encountered, it automatically falls back to conservative and tries again.

In the case where the user had already set JULIA_PKG_SERVER_REGISTRY_PREFERENCE, my PR doesn't modify JULIA_PKG_SERVER_REGISTRY_PREFERENCE, with the idea being that if the user set it themselves, they probably had a reason to do so.
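The same try-eager-then-fall-back idea can be illustrated in shell. This is only a sketch of the behavior described above, not the code in that PR; `run_with_flavor_fallback` is a hypothetical helper that wraps an arbitrary install command.

```shell
# Hypothetical helper: run a command with the eager registry flavor, retrying
# once with conservative if it fails. If the user already set the variable,
# it is respected and the command is run exactly once.
run_with_flavor_fallback() {
  if [ -n "${JULIA_PKG_SERVER_REGISTRY_PREFERENCE:-}" ]; then
    "$@"
    return
  fi
  JULIA_PKG_SERVER_REGISTRY_PREFERENCE=eager "$@" && return 0
  echo "eager attempt failed; retrying with conservative" >&2
  JULIA_PKG_SERVER_REGISTRY_PREFERENCE=conservative "$@"
}

# Usage (requires julia on PATH; the package name is only an example):
#   run_with_flavor_fallback julia -e 'import Pkg; Pkg.add("CombinatorialSpaces")'
```

The retry runs the whole command again, so the wrapped command should be safe to repeat.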

@MilesCranmer

Thanks! Will try it out. If it is a robust workaround it could probably go in pyjuliapkg directly so others can use it.

Tangentially, I’m curious: why does the package registry take that much time to update? Many other package indexes are virtually instantaneous or, at worst, take minutes, but Julia can take several hours (up to days, although such cases are rare). Are we doing something completely different from the others (and can we switch)? Is there something being computed on the servers that could be computed on the end user's machine?

@DilumAluthge
Member

DilumAluthge commented Dec 5, 2024

In this case, I think there was a bug in the Pkg servers. Fredrik Ekre and others have been working to fix it.

Normally, there's not too much lag between conservative and eager. When there is a lag, it's usually because some big new binary artifacts have just been registered, which causes the Pkg servers and storage servers to spend a lot of time processing the artifacts.

I'm not super familiar with the Pkg server architecture. It's possible that there is room to optimize the performance, particularly when it comes to processing big artifacts.
