WIP: World-age partition bindings #54654

Keno · 2024-06-02T21:49:58Z

This implements world-age partitioning of bindings as proposed in #40399. In effect, much like methods, the global view of bindings now depends on your currently executing world. This means that const bindings can now have different values in different worlds. In principle it also means that regular global variables could have different values in different worlds, but there is currently no case where the system does this.

Motivation

The reasons for this change are manifold:

The primary motivation is to permit Revise to redefine structs. This has been a feature request since the very beginning of Revise (redefining struct timholy/Revise.jl#18) and there have been numerous attempts over the past 7 years to address this, as well as countless duplicate feature requests. A past attempt to implement the necessary Julia support in Support type renaming #22721 failed because the consequences and semantics of re-defining bindings were not sufficiently worked out. One way to think of this implementation (at least with respect to types) is that it provides a well-grounded implementation of Support type renaming #22721.
A secondary motivation is to make const-redefinition no longer UB (although const redefinition will still have a significant performance penalty, so it is not recommended). See e.g. the full discussion in Add devdocs on UB #54099 and Behavior of reassignment to a const #38584.
Not currently implemented, but this mechanism can be used to re-compile code where bindings are introduced after the first compile, which is a common performance trap for new users (Track backedges through not-yet-defined bindings? #53958).
Not currently implemented, but this mechanism can be used to clarify the semantics of bindings import and resolution to address issues like compiling top-level expressions forces global binding resolution #14055.

Implementation

In this PR:

Binding gets min_world/max_world fields like CodeInstance
Various lookup functions walk this linked list using the current task world_age as a key
Inference accumulates world bounds as it would for methods
Upon binding replacement, we walk all methods in the system, invalidating those whose uninferred IR references the replaced GlobalRef
One primary complication is that our IR definition permits const globals in value position, but if binding replacement is permitted, the validity of this may change after the fact. To address this, there is a helper in Core.Compiler that gets invoked in the type inference world and will rewrite the method source to be legal in all worlds.
A new @world macro can be used to access bindings from old world ages. This is used in printing for old objects.
The const-override behavior was changed to only be permitted at toplevel. The warning about it being UB was removed.
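To make the lookup described in the list above concrete, here is a minimal toy sketch of walking a linked list of world-ranged entries. It is not the runtime's actual data structure: `ToyPartition`, `toy_lookup`, and the symbols `:Foo_v1`/`:Foo_v2` are hypothetical names chosen for illustration, and the world numbers are borrowed from the demo output further down.

```julia
# Toy model only; not the real Binding layout in the runtime.
struct ToyPartition
    value::Any
    min_world::Int                        # first world in which this entry is valid
    max_world::Int                        # last world in which this entry is valid
    next::Union{ToyPartition,Nothing}     # older entry, or nothing at the end
end

# Walk the list and return the value whose world range contains `world`.
function toy_lookup(p::Union{ToyPartition,Nothing}, world::Int)
    while p !== nothing
        p.min_world <= world <= p.max_world && return p.value
        p = p.next
    end
    error("no binding entry covers world $world")
end

older = ToyPartition(:Foo_v1, 0, 26898, nothing)
newer = ToyPartition(:Foo_v2, 26899, typemax(Int), older)

toy_lookup(newer, 100)      # resolves to the old definition
toy_lookup(newer, 30000)    # resolves to the replacement
```

The key used for the walk is simply the world age of the caller, which is why the same name can resolve differently for code captured in an older world.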

Of particular note, this PR does not include any mechanism for invalidating methods whose signatures were created using an old Binding (or types whose fields were the result of a binding evaluation). There was some discussion among the compiler team of whether such a mechanism should exist in base, but the consensus was that it should not. In particular, although uncommon, a pattern like:

f() = Any
g(::f()) = 1
f() = Int

Does not redefine g. Thus to fully address the Revise issue, additional code will be required in Revise to track the dependency of various signatures and struct definitions on bindings.
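The pattern above can be checked at top level in a fresh session. This is only a small sketch of the behaviour described (the signature of `g` is computed once, when `g` is defined); the string argument is an arbitrary choice for illustration.

```julia
f() = Any
g(::f()) = 1      # f() is evaluated here, so the signature is g(::Any)
f() = Int         # redefining f does not touch the existing method of g

# g still accepts any argument, because its signature was fixed at definition time.
g("not an Int")

# The stored signature still mentions Any rather than Int.
only(methods(g)).sig
```

This is the dependency that Revise would have to track on its own, since nothing in the method table records that the signature came from a call to `f`.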

Demo

julia> struct Foo
               a::Int
       end

julia> g() = Foo(1)
g (generic function with 1 method)

julia> g()
Foo(1)

julia> f(::Foo) = 1
f (generic function with 1 method)

julia> fold = Foo(1)
Foo(1)

julia> struct Foo
               a::Int
               b::Int
       end

julia> g()
ERROR: MethodError: no method matching Foo(::Int64)
The type `Foo` exists, but no method is defined for this combination of argument types when trying to construct it.

Closest candidates are:
  Foo(::Int64, ::Int64)
   @ Main REPL[6]:2
  Foo(::Any, ::Any)
   @ Main REPL[6]:2

Stacktrace:
 [1] g()
   @ Main ./REPL[2]:1
 [2] top-level scope
   @ REPL[7]:1

julia> f(::Foo) = 2
f (generic function with 2 methods)

julia> methods(f)
# 2 methods for generic function "f" from Main:
 [1] f(::Foo)
     @ REPL[8]:1
 [2] f(::@world(Foo, 0:26898))
     @ REPL[4]:1

julia> fold
@world(Foo, 0:26898)(1)

Performance consideration

On my machine, the validation required upon binding replacement for the full system image takes about 200ms. With CedarSim loaded (I tried OmniPackage, but it's not working on master), this increases about 5x. That's a fair bit of compute, but not the end of the world. Still, Revise may have to batch its validation. There may also be opportunities for performance improvement by operating on the compressed representation directly.

Semantic TODO

Do we want to change the resolution time of bindings to (semantically) resolve them immediately?
Do we want to introduce guard bindings when inference assumes the absence of a binding?
When (if ever) do globals get declared implicitly?

Implementation TODO

Keno · 2024-06-04T18:21:41Z

Summarizing discussion with @JeffBezanson (only part of it) @StefanKarpinski @vtjnash @topolarity @gbaraldi @oscardssmith from today about the remaining semantic questions:

Semantic TODO

When does binding resolution happen semantically?

Example 1

Right now, the precise point-in-time of binding resolution is ill-defined. To illustrate this, consider:

module Exporter
	export foo
	foo = 1
end

using .Exporter
f() = foo

julia> f()
1

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

julia> global foo = 2
2

julia> f()
2

but also

julia> code_typed(f)
1-element Vector{Any}:
 CodeInfo(
1 ─ %1 = Main.foo::Any
└──      return %1
) => Any

julia> global foo = 2
ERROR: cannot assign a value to imported variable Exporter.foo from module Main
Stacktrace:
 [1] top-level scope
   @ REPL[5]:1

Basically, right now bindings get resolved whenever anything in the system happens to look at a binding, but since the running-or-not of inference is outside the semantics of the language, these semantics are ill-defined.

Example 2

Similarly for binding ambiguity:

module Exporter1
	export foo
	foo = 1
end

module Exporter2
	export foo
	foo = 2
end
using .Exporter1

with

julia> foo
1

julia> using .Exporter2
WARNING: using Exporter2.foo in module Main conflicts with an existing identifier.

julia> foo
1

julia> using .Exporter1

julia> using .Exporter2

julia> foo
WARNING: both Exporter2 and Exporter1 export "foo"; uses of it in module Main must be qualified
ERROR: UndefVarError: `foo` not defined in `Main`
Hint: It looks like two or more modules export different bindings with this name, resulting in ambiguity. Try explicitly importing it from a particular module, or qualifying the name with the module it should come from.

(similar to example 1, the explicit reference of foo is not necessary, anything in the system that might resolve the binding counts).

Proposed semantics

The proposed semantics are (and this was not fully spelled out in the discussion, so there may be some further debate on this) that bindings resolve (in decreasing order of priority)

The most recently declared (in world age terms) global/const binding, unless explicitly deleted.
The most recent (in world age terms) explicit import using Foo: x, unless explicitly deleted.
An implicit import from all using'd modules available in the world age.

Note 1: Only toplevel global declares new bindings. A function level global declares a new (Any-typed) binding only if there was no previous top-level global.
Note 2: global x (even at top level) does not declare a new binding if previously declared global, otherwise equivalent to global x::Any.
Note 3: global x::T does not declare a new binding if x is already a global in the current module and T is egal to the currently declared type of x.

To see concrete effects, the assignment in Example 1, global foo = 2 becomes allowed. f() subsequently returns 2. In the first case of Example 2, the second use of foo will give the same ambiguous error as the first example does.
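The priority order can be sketched as a toy resolver. This is an assumption-laden model, not a runtime API: `toy_resolve` and its three tables are hypothetical, and the world-age dimension ("most recent", "unless explicitly deleted") is deliberately left out so that only the ordering of the three rules is shown.

```julia
# Toy model of the proposed priority order for a single world age.
#   declared: names declared global/const in this module => value
#   explicit: names imported explicitly (`using Foo: x`)  => source module
#   usings:   using'd modules                              => their exported names
function toy_resolve(name::Symbol, declared::Dict{Symbol,Any},
                     explicit::Dict{Symbol,Symbol},
                     usings::Dict{Symbol,Vector{Symbol}})
    # 1. A global/const declaration in this module wins.
    haskey(declared, name) && return (:declared, declared[name])
    # 2. Otherwise an explicit import wins.
    haskey(explicit, name) && return (:explicit, explicit[name])
    # 3. Otherwise fall back to implicit imports from using'd modules.
    providers = sort([m for (m, exports) in usings if name in exports])
    length(providers) == 1 && return (:implicit, providers[1])
    isempty(providers) && error("`$name` is not defined")
    error("`$name` is ambiguous between $(join(providers, ", "))")
end

# Example 1: `using .Exporter`, then later `global foo = 2`.
declared = Dict{Symbol,Any}()
explicit = Dict{Symbol,Symbol}()
usings   = Dict(:Exporter => [:foo])

before = toy_resolve(:foo, declared, explicit, usings)
declared[:foo] = 2                     # models the now-permitted `global foo = 2`
after  = toy_resolve(:foo, declared, explicit, usings)
```

In this model the result depends only on which declarations exist, not on whether anything happened to look at `foo` earlier, which is the property the proposal is after.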

Overall, these semantics remove any consideration of code-execution order (with the exception of the ordering of world-age-incrementing top-level declarations) and should be a lot clearer. There should also be no change in binding resolution in cases that are not currently errors or warnings. Cases that are may change slightly, but as discussed above, we currently don't actually guarantee those resolutions, because binding resolution may happen at any time.

The one wrinkle here is that currently there is precisely one case where bindings are introduced by non-toplevel code, as discovered in #54607. We discussed this somewhat extensively. This change was introduced in 1.9 to allow modification of existing globals in other modules, but accidentally also permitted the creation of new bindings. The overall consensus was that, independent of any semantics changes here, we need to correct this oversight. I will be submitting a PR shortly to attempt to correct this issue for 1.11, though we may have to go through a deprecation since it was in the system for two releases. Regardless, it shouldn't be an impediment for this change.

What to do about replacement of mutable bindings?

The semantics of binding replacement for const bindings are fairly clear and match the semantics of method replacement reasonably closely: The values that you see are the values that happened to be assigned when the world age is captured. However, this issue becomes trickier with mutable bindings. Here are some examples (I'll be using opaque closures as the canonical world age capture mechanism, but feel free to substitute tasks or whatever other world-age capture mechanism you prefer).

Example

For example, what does the following do:

global x::Integer = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

Now, consider the following:

@world(x, old_age)
oc();
x
@world(x, old_age)

As implemented in this PR, bindings are fully partitioned, so this would give 1 2 3 2. However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.
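The `1 2 3 2` sequence can be reproduced with a toy model in which every partition owns its own storage cell. This is only a sketch of the fully partitioned semantics: `ToyCell`, `toy_storage`, and the world numbers 1 and 2 are hypothetical stand-ins, and the opaque closure is modelled as a plain read-modify-write against the old world's cell.

```julia
# Toy model only: each binding partition owns a separate storage cell.
struct ToyCell
    min_world::Int                  # first world in which this partition is visible
    storage::Base.RefValue{Any}     # the partition's own value
end

cells = ToyCell[ToyCell(1, Ref{Any}(1))]    # global x::Integer = 1   (world 1)
old_age = 1
push!(cells, ToyCell(2, Ref{Any}(3)))       # global x::Int = 3       (world 2)

# Storage visible from `world`: the newest partition that started at or before it.
toy_storage(world) = cells[findlast(c -> c.min_world <= world, cells)].storage

r1 = toy_storage(old_age)[]                                   # @world(x, old_age)
r2 = (toy_storage(old_age)[] = toy_storage(old_age)[] + 1)    # oc() runs in the old world
r3 = toy_storage(2)[]                                         # x in the latest world
r4 = toy_storage(old_age)[]                                   # @world(x, old_age)
```

The increment lands in the old cell only, so the latest world never observes it. That is exactly the "long running task stops updating globals" effect raised as a concern.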

Other options

For completeness, I will list all the options we discussed, although some of these are probably bad:

Suggestion 1: Merge bindings with egal metadata across world ages.

This is not quite relevant to the example, but there was a proposal that in:

global x::Int # World Age 1
global x::Float64 # World Age 2
global x::Int # World Age 3

the bindings in world ages 1 and 3 should alias. I think we largely discarded this proposal as

Not addressing the problem fully and
Being confusing to users, as narrowing/widening does not get these semantics, so for people who don't know the precise semantics here, when bindings are shared or not can be confusing.

Suggestion 2: setglobal! assigns in every compatible world age

Basically, the semantics here would be that getglobal reads the last setglobal! that assigned a value of a compatible type. So the result in the above example would change to 3 4 4 4, but in this slight modification:

global x::Float64 = 1
old_age = get_world_counter()
oc = @opaque ()->(global x = x+1)
global x::Int = 3

you would get 1.0, 2.0, 3, 2.0 as in the current semantics.

We liked this semantically, but were concerned about the difficulty of implementation.

Suggestion 3: Writing outdated mutable bindings is an error

Relatively straightforward. If you replace a binding, then from that point on, all code running in old world ages will error upon assignment. In our running example, we would have 1 OldBindingError 3 1.
There are two primary downsides here:

an additional pointer-sized load on every global assignment, but that's likely reasonable and if you really wanted to could probably be fixed by rewriting the binding pointer.
Any setglobal! call can never be proven nothrow

@topolarity raised the point that it seems odd to disallow this at top level, while permitting mutations through mutable objects, but that same concern applies to mutable values accessed through const

Suggestion 4: Do that, but also error on reads

Basically, as soon as you replace a mutable binding, the old one becomes toxic and loudly errors. In our running example, this gives OldBindingError, OldBindingError, 3, OldBindingError. This is quite aggressive, but I do think it is acceptable, because it requires the combination of code that:

Runs in an old world
Makes use of global mutable bindings
Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

An open question is whether we want to extend this behavior to const bindings with mutable type.

Suggestion 5 [late submission]: type assert (using the old binding type) on read and write

Similar to suggestion 4, but rather than erroring unconditionally, old world bindings would gain typeasserts (on read with the old type, on write with the new type). global x::T is equivalent to global x::T = convert(T, @world(x, get_world_counter())). Should also still be compatible with suggestion 2.

Conclusion

I think we arrived at starting with suggestion 4. It's the most conservative and we should explicitly reserve the potential of revisiting the error cases in the future. Suggestion 2 was also well liked, but implementation difficulty was a concern. Additionally, with the indicated semantic reservation, switching from suggestion 4 to suggestion 2 is feasible, but not vice versa.

Eager resolution of bindings?

@JeffBezanson is concerned about losing the redefinition error in the following case:

f() = sin(1)
f()
sin = 4

also known as the missing=false issue. There were several proposals to improve this:

Eagerly resolving whether a binding is imported to a global and disallowing switching this until the module is closed (otherwise as above).
Have an opt-in mode that disallows shadowing imported bindings entirely

Should we do guard entries?

I originally made this a semantic question, although it's really more of a performance consideration. That said, with the changes to binding resolution discussed above, I believe guard entries are required for correctness anyway, so I think this question is moot.

As discussed in [1], the implicit creation of bindings through the setglobal! intrinsic was accidentally added in 1.9 and will be removed (ideally) or at the very least deprecated in 1.11. The recommended replacement syntax is `Core.eval(mod, Expr(:global, sym))` to introduce the binding and `invokelatest(setglobal!, mod, sym, val)` to set it. The invokelatest is not presently required, but may be required for JuliaLang/julia#54654, so it's included in the recommendation. [1] JuliaLang/julia#54607

PR #44231 (part of Julia 1.9) introduced the ability to modify globals with `Mod.sym = val` syntax. However, the intention of this syntax was always to modify *existing* globals in other modules. Unfortunately, as implemented, it also implicitly creates new bindings in the other module, even if the binding was not previously declared. This was not intended, but it's a bit of a syntax corner case, so nobody caught it at the time.

After some extensive discussions and taking into account the near future direction we want to go with bindings (#54654 for both), the consensus was reached that we should try to undo the implicit creation of bindings (but not the ability to assign the *value* of globals in other modules). Note that this was always an error until Julia 1.9, so hopefully it hasn't crept into too many packages yet. We'll see what pkgeval says. If use is extensive, we may want to consider a softer removal strategy.

Across base and stdlib, there are two cases affected by this change:

1. A left over debug statement in `precompile` that wanted to assign a new variable in Base for debugging. Removed in this PR.
2. Distributed wanting to create new bindings. This is a legitimate use case for wanting to create bindings in other modules. This is fixed in JuliaLang/Distributed.jl#102.

As noted in that PR, the recommended replacement where implicit binding creation is desired is:

```
Core.eval(mod, Expr(:global, sym))
invokelatest(setglobal!, mod, sym, val)
```

The `invokelatest` is not presently required, but may be needed by #54654, so it's included in the recommendation now.

Fixes #54607
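As a worked instance of the recommended replacement, the two calls can be tried against a throwaway module. `ToyTarget`, `:counter`, and `42` are hypothetical placeholders for `mod`, `sym`, and `val`; the read at the end also goes through `invokelatest`, which is an extra precaution assumed here rather than something the recommendation states.

```julia
module ToyTarget end    # hypothetical module standing in for `mod`

sym, val = :counter, 42

Core.eval(ToyTarget, Expr(:global, sym))          # declare the binding explicitly
invokelatest(setglobal!, ToyTarget, sym, val)     # then assign its value

# Read the value back through the same module/symbol pair.
invokelatest(getglobal, ToyTarget, sym)
```

Splitting declaration from assignment keeps the creation of the binding a top-level operation, which is the part the discussion wants to make explicit.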

vchuravy · 2024-06-05T00:53:11Z

I am not sure about any resolution that will cause code running to error.

I do think it is acceptable, because it requires the combination of code that:

Runs in an old world

Makes use of global mutable bindings

Has those bindings replaced

This situation is not super common, particularly since global mutable state is generally discouraged in Julia (as in other languages).

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

However, the concern is e.g. with Revise, binding replacement may not always be fully obvious and users may not understand why e.g. long running tasks suddenly stop updating globals.

I do understand the concerns, but for me the conclusion seems more to be that globals must be accessed in the world-age of the task / or long running tasks must be restarted.

I am not convinced that the Revise use-case outweighs the statement: "Old code must be able to keep running"

Keno · 2024-06-05T01:07:14Z

I am not convinced that this situation isn't common and it would prohibit any use of global mutable bindings by any code that executes in a separate/frozen world-ages. We might in due time want to be able to execute "compilation unit" that are fully separated from the rest of the system.

Pretty much yes. I will admit that I generally think mutable globals are the wrong tool for anything beyond basic repl usage and I think of frozen world ages as a somewhat advanced feature, so I don't mind putting the complexity there as much. That said, separate compilation units can of course have their own entirely separate semantics and not observe global assignments from outside at all.

"Old code must be able to keep running"

What about Suggestion 5 where old code does mostly keep running, unless you assign a value to the global that is incompatible with the type restriction that was active in the captured world age. That seems like it would mostly let old code keep running unless you're changing something massively about the globals.

vchuravy · 2024-06-05T01:26:09Z

It feels weird to add extra runtime overhead to a language feature. In some sense I would want to keep the current semantics that long-running tasks must opt-in to seeing "new"/"unexpected" state with invokelatest.

global running::Bool =  true

@spawn begin
      while running
          # 
      end
end

global running::Int

That is the scenario we are worried about? If I now change running my task will never finish.

For me that is equivalent to

const running =  Ref{Bool}(true)

@spawn begin
      while running[]
          # 
      end
end

const running = Ref{Int}(0)

Of course you raised this point in your proposal:

An open question is whether we want to extend this behavior to const bindings with mutable type.

For me information shouldn't travel back in time and thus anything we do here can't impact code executing in a prior world-age. I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

The "right" way for me to write long-running tasks for that in a hot reloading scenario is probably:

global running::Bool = true
function execute()
     # work
     return running
end

@spawn while true
           invokelatest(execute) || break
end

Keno · 2024-06-05T01:31:49Z

It feels weird to add extra runtime overhead to a language feature
Yes, but the overhead is actually less than you might expect, because precompile can already unset bindings, which is somewhat equivalent.

That is the scenario we are worried about? If I now change running my task will never finish.

Yes, or even something more benign like global running::Union{Int, Bool}, where the user would be confused that setting running = false after that declaration does not terminate the loop.

For me that is equivalent to
I would rather make the rule simple: "If you modify a binding, that modification is only visible from here on out" and not have complicated rules about unifying bindings across time.

Yes, that was also what I originally wanted to do and is what is implemented in this PR, but you and I both have a very sophisticated understanding of world ages. I fear that most users may not.

Keno · 2024-06-05T01:34:39Z

If you modify a binding

One of the biggest problems I have is that "modifying a binding" is not necessarily an explicit operation. From the user's perspective, they just changed the type on a binding, or even worse in the:

struct Foo
   a::Int # changed to `Integer` by the user using Revise
end
global foo::Foo = Foo(1)

case, the user may not be touching the binding at all at the source level, but Revise still has to do the rebinding.

vchuravy · 2024-06-05T01:53:42Z

Absolutely! For me the question is if the Revise use-case trumps:

1. No additional run-time costs
2. Code that ran once in a world-age will not break, e.g. can we keep freezing world-ages

For me 2. is more important than the Revise use-case. We should empower Revise as much as possible, but also acknowledge that it can't do a perfect job with top-level state.

Keno · 2024-06-05T02:04:08Z

I don't think it's really about Revise. Revise is fine with either semantics. The question is more about the semantics of old-world code, for which there are two competing priorities:

1. It should keep running as is
2. It should match the mental model of the users

For 1, I really don't think that the typeassert solution is that bad. In the pre-typed-globals world (which is still extremely common), people would regularly do things like while running::Bool; end. Doing that implicitly is pretty much the semantics of Suggestion 5 with identical old-world behavior and I think it's a lot simpler of a mental model, because it does not require explaining to users the precise extents of world ages.

Keno · 2024-06-05T02:11:22Z

Said another way, I think the issue is mostly about type restriction on globals. If we didn't have that feature, I think the answer would be fairly obvious that there's only one mutable location for a binding (even if you can shadow it with a const for some subset of the world age). So then the question is if the type restriction feature is compelling enough of a reason to world-age fracture by type restriction semantically, and it just seems too niche for that to be realistic.

I think the semantics of:

There's only one global location for a given binding
Optionally you may declare a (world-age-fractured) type for the global which introduces implicit converts and typeasserts.

seems like a very simple mental model.

StefanKarpinski · 2024-06-05T14:32:01Z

I think I like that last approach, but I want to be sure what it means. Is the idea:

There is exactly one current value for a mutable global binding
If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T
If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)
If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere
In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global
In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Is that what you have in mind here, @Keno? One thing to consider is: if the user writes x::T = v do we consider v to be the true value of the global or do we consider convert(T, v)::T to be the true value?

vchuravy · 2024-06-05T15:06:09Z

Thinking about it some more this morning, I think "Suggestion 5" may be workable. typeassert on read and write with the type being world-age based.

It allows changing global running::Bool to global running::Nothing and then back, once the user realized the mistake.

Keno · 2024-06-05T16:03:47Z

There is exactly one current value for a mutable global binding

Yes

If the binding in the current world looks like x::T then we guarantee that getting x returns a value of type T by e.g. doing convert(T, currentvalue(:x))::T

I wasn't suggesting the additional convert on access (only assignment), since we don't usually convert on access. Although arguably we don't have to, since there can never be a mismatch.

If this conversion is an error in the world age where an assignment happens, it's an immediate error and the assignment fails (the old value is retained)

Yes

If this conversion is not an error in the world age where an assignment happens, a new "true value" is established and used everywhere

Yes

In worlds where the true value can be converted to the expected type, the conversion is performed and the converted value is returned upon access to the global

Yes, modulo above question on convert-on-access

In worlds where the true value cannot be converted to the expected type, accessing the global is an error.

Yes

Keno · 2024-06-05T16:18:16Z

only assignment

Thinking about this some more (and consistent with what I wrote yesterday), I don't think we can introduce old-world converts to/from the new type either on access or on write, because in general the new type may be defined in a new world, so the convert is likely to fail. Of course, we could implicitly transition to the latest world, but I think that's too magical. I think the only semantics that are sensible are:

# M.x
getglobal(M, :x)::get_binding_type(M, :x)
# M.x = val
setglobal!(M, :x, convert(get_binding_type(M, :x), val)::invokelatest(get_binding_type, M, :x))

Where get_binding_type looks up the binding type for the running world and setglobal!/getglobal are semantically untyped (for these purposes anyway, those type asserts are probably implicit in those intrinsics).

StefanKarpinski · 2024-06-05T18:18:29Z

The reason I mentioned convert-on-access is because you could have a situation like this:

global x::Any     # world 1
global x::String  # world 2
global x::Float64 # world 3
global x::Int     # world 4

What happens when one does x = 1 in world 4? If you had conversion on-access (or conversion on assignment to all compatible types), then when accessing x in world 3, you would get 1.0. I think that doing the convert eagerly versus lazily with caching would be largely equivalent. Either way, accessing x after that in world 2 would be an error. But one of the subtleties I wanted to tease out is what happens if you do x = big(1) in world 4 and then access x in world 1? Does world 1 see big(1) or Int(1)?

Keno · 2024-06-05T18:26:32Z

I think doing anything beyond the standard conversion on assignment we do right now is too magical. convert does not preserve object identity, so if you have a mutable object, different accesses to the objects no longer alias, and it all just becomes very confusing. I think it's fine in your example, for world 3 accesses to error as well.

StefanKarpinski · 2024-06-05T18:31:47Z

So basically:

Convert in the world where assignment happens
If the resulting value is type-compatible with an older world, also bind it there
In worlds where the new value is not type compatible, access becomes an error

A couple of clarifying questions:

What if assignment occurs in a world that is not the latest? Error?
Should we stop scanning historical worlds at the first failed assignment or continue?

Keno · 2024-06-05T18:44:18Z

Basically, the way to think about it is that there is only one storage location for Mod.sym, semantically typed by the binding type in the latest world age.

Convert in the world where assignment happens

yes

If the resulting value is type-compatible with an older world, also bind it there

In worlds where the new value is not type compatible, access becomes an error

There's only one storage location - if the actual stored value is not compatible with the binding type that was declared in a previous world age, access becomes an error.

A couple of clarifying questions:

What if assignment occurs in a world that is not the latest? Error?

Convert according to the old binding type and then attempt to assign (without convert) into the global. If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

Should we stop scanning historical worlds at the first failed assignment or continue?

There's no scanning of historical worlds, the typeassert is on access.
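The "one storage location, typeassert on access" model from this comment can be sketched as a toy. Everything here is hypothetical (`ToyGlobal`, `toy_binding_type`, `toy_get`, `toy_set!`, and the world numbers); it only mirrors the rules as stated in this reply and ignores the later refinements discussed below.

```julia
# Toy model only; the names below are hypothetical, not runtime APIs.
mutable struct ToyGlobal
    value::Any                       # the single storage location
    decls::Vector{Pair{Int,Type}}    # first world => declared type, ascending
end

# Declared type as seen from `world`.
function toy_binding_type(g::ToyGlobal, world::Int)
    T = Any
    for (w, S) in g.decls
        w <= world || break
        T = S
    end
    return T
end

toy_latest_type(g::ToyGlobal) = last(g.decls).second

# Read: typeassert against the type declared in the reading world.
toy_get(g::ToyGlobal, world::Int) = typeassert(g.value, toy_binding_type(g, world))

# Write: convert with the writer's declared type, then assert (without convert)
# against the type declared in the latest world.
function toy_set!(g::ToyGlobal, world::Int, val)
    v = convert(toy_binding_type(g, world), val)
    g.value = typeassert(v, toy_latest_type(g))
    return v
end

# x::Integer in world 1, redeclared as x::Int in world 2, currently holding 3.
x = ToyGlobal(3, Pair{Int,Type}[1 => Integer, 2 => Int])
toy_set!(x, 1, 4)     # an old-world write that is still compatible with Int
toy_get(x, 2)         # the latest world sees the old world's write

# y::String in world 1, redeclared as y::Int in world 2, currently holding 7.
y = ToyGlobal(7, Pair{Int,Type}[1 => String, 2 => Int])
```

In this model `toy_set!(x, 1, big(5))` throws, because a `BigInt` passes the old `Integer` declaration but not the latest `Int` one, and `toy_get(y, 1)` throws because the stored `Int` no longer satisfies the type declared in world 1. No historical worlds are scanned in either case.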

topolarity · 2024-06-05T21:28:38Z

> If the type (after convert in the old world) is not compatible with the latest declared type (without any convert), this is an error.

Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

Otherwise, it seems like `global x::Int` has to be an error if the contents of `x` are incompatible with the new type, even though the variable may be initialized before any use in the new world.

Keno · 2024-06-05T21:31:39Z

> Does this have to be an error? I would have expected this to be an error upon access in the new world, rather than modification in the old one.

It is feasible to make it symmetric, but then every new world access will need to have a type-assert on read, even if the binding hasn't been replaced, which has performance implications. I also think I like the semantics of erroring in the old world better, but I'm open to discussion.

> Otherwise, it seems like `global x::Int` has to be an error if the contents of `x` are incompatible with the new type, even though the variable may be initialized before any use in the new world.

It tries to convert, and is an error if not possible.
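For comparison, the convert-or-error behaviour mentioned here is also how ordinary assignment to a typed global works in released Julia (1.8+). The sketch below shows only that existing behaviour, with a hypothetical module and setter name. It does not exercise redeclaring the type of an already-initialized global in a new world, which is what the thread is debating.

```julia
module TypedGlobalDemo          # hypothetical module, for illustration only
global x::Int = 1               # typed global declaration (Julia 1.8+)
set!(v) = (global x = v)        # assignment converts `v` to Int before storing
end

# 2.0 converts exactly to an Int, so the assignment succeeds.
TypedGlobalDemo.set!(2.0)

# 2.5 has no exact Int representation: convert throws and nothing is stored.
err = try
    TypedGlobalDemo.set!(2.5)
    nothing
catch e
    e
end
```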

Keno · 2024-06-05T21:36:34Z

> but then every new world access will need to have a type-assert on read

I guess we could cache the typeassert for the newest world along with the world age and collapse the check into one.

So yeah, I think either is probably feasible. Your proposal does have the advantage of letting old code keep running if the new code never actually touches the global.

Now that I've had a few months to recover from the slog of adding `BindingPartition`, it's time to renew my quest to finish #54654. This adds the basic infrastructure for having multiple partitions, including making the lookup respect the `world` argument, on-demand allocation of missing partitions, `Base.delete_binding` and the `@world` macro. Not included is any inference or invalidation support, or any support for the runtime to create partitions itself (only `Base.delete_binding` does that for now), which will come in subsequent PRs.
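To make the idea of a world-dependent lookup concrete, here is a sketch using methods, which already behave this way in released Julia. It deliberately avoids `@world` and `Base.delete_binding`, whose exact usage depends on the Julia version. The function name `answer` is hypothetical; `Base.get_world_counter` and `Base.invoke_in_world` are existing (internal) Base functions. Binding partitions apply the same world-indexed view to bindings.

```julia
# Run at top level: each definition below creates a new world.
answer() = 1
old_world = Base.get_world_counter()   # a world in which `answer()` returns 1

answer() = 2                           # redefinition, visible only in newer worlds

# Dispatch in the recorded world still sees the old definition...
old_result = Base.invoke_in_world(old_world, answer)
# ...while a normal top-level call uses the latest world.
new_result = answer()
```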

This adds the binding partition revalidation code from #54654. This is the last piece of that PR that hasn't been merged yet - however the TODO in that PR still stands for future work. This PR itself adds a callback that gets triggered by deleting a binding. It will then walk all code in the system and invalidate code instances of Methods whose lowered source referenced the given global. This walk is quite slow. Future work will add backedges and optimizations to make this faster, but the basic functionality should be in place with this PR.
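The per-method check in that walk can be approximated in user code by scanning lowered source for a `GlobalRef`. The sketch below is only an illustration of the idea, not the runtime's implementation; the function names and the `threshold` global are hypothetical, and it assumes globals appear as `GlobalRef`s in the output of `code_lowered`.

```julia
# Does this lowered statement (or any nested argument) mention `mod.name`?
function mentions_global(stmt, mod::Module, name::Symbol)
    stmt isa GlobalRef && return stmt.mod === mod && stmt.name === name
    stmt isa Expr && return any(a -> mentions_global(a, mod, name), stmt.args)
    return false
end

# Does any method of `f` matching `argtypes` mention `mod.name` in its lowered source?
function method_mentions_global(f, argtypes, mod::Module, name::Symbol)
    return any(code_lowered(f, argtypes)) do ci
        any(s -> mentions_global(s, mod, name), ci.code)
    end
end

# Usage: `over` reads the global `threshold` from the current module.
threshold = 10
over(x) = x > threshold
found = method_mentions_global(over, (Int,), @__MODULE__, :threshold)
```

A real invalidation pass would have to visit every method in the system this way, which is why the description calls the walk slow and defers backedges to future work.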

This is the final PR in the binding partitions series (modulo bugs and tweaks), i.e. it closes #54654 and thus closes #40399, which was the original design sketch.

This thus activates the full designed semantics for binding partitions, in particular allowing safe replacement of const bindings. It in particular allows struct redefinitions. This thus closes timholy/Revise.jl#18 and also closes #38584.

The biggest semantic change here is probably that this gets rid of the notion of "resolvedness" of a binding. Previously, a lot of the behavior of our implementation depended on when bindings were "resolved", which could happen at basically an arbitrary point (in the compiler, in REPL completion, in a different thread), making a lot of the semantics around bindings ill-defined or at least implementation-defined. There are several related issues in the bugtracker, so this closes #14055, #44604, #46354 and #30277.

It is also the last step to close #24569. It also supports bindings for undef->defined transitions and thus closes #53958 and #54733; however, this is not activated yet for performance reasons and may need some further optimization.

Since resolvedness no longer exists, we need to replace it with some hopefully more well-defined semantics. I will describe the semantics below, but before I do I will make two notes:

1. There are a number of cases where these semantics will behave slightly differently than the old semantics absent some other task going around resolving random bindings.
2. The new behavior (except for the replacement stuff) was generally permissible under the old semantics if the bindings happened to be resolved at the right time.

With all that said, there are essentially three "strengths" of bindings:

1. Implicit bindings: Anything implicitly obtained from `using Mod`, "no binding", plus slightly more exotic corner cases around conflicts.
2. Weakly declared bindings: Declared using `global sym` and nothing else.
3. Strongly declared bindings: Declared using `global sym::T`, `const sym=val`, `import Mod: sym`, `using Mod: sym` or as an implicit strong global declaration in `sym=val`, where `sym` is known to be global (either by being at toplevel or as `global sym=val` inside a function).

In general, you are always allowed to syntactically replace a weaker binding by a stronger one (although the runtime permits arbitrary binding deletion now, this is just a syntactic constraint to catch errors). Second, any implicit binding can be replaced by other implicit bindings as the result of changing the `using`'ed module. And lastly, any constants may be replaced by any other constants (irrespective of type).

We do not currently allow replacing globals, but may consider changing that in 1.13.

This is mostly how things used to work as well, in the absence of any stray external binding resolutions. The most prominent difference is probably this one:

```
set_foo!() = global foo = 1
```

In the above terminology, this now always declares a "strongly declared binding", whereas before it declared a "weakly declared binding" that would become strongly declared on first write to the global (unless of course somebody had created a different strongly declared global in the meantime). To see the difference, this is now disallowed:

```
julia> set_foo!() = global foo = 1
set_foo! (generic function with 1 method)

julia> const foo = 1
ERROR: cannot declare Main.foo constant; it was already declared global
Stacktrace:
 [1] top-level scope
   @ REPL[2]:1
```

Before, it would depend on the order of binding resolution (although it just crashes on current master for some reason - whoops, probably my fault).

Another major change is the ambiguousness of imports. In:

```
module M1; export x; x=1; end
module M2; export x; x=2; end
using .M1, .M2
```

the binding `Main.x` is now always ambiguous and will throw on access. Before, which binding you got would depend on resolution order. To choose one, use an explicit import (which was the behavior you would previously get if neither binding was resolved before both imports).
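As a sketch of the explicit-import resolution just described, the following wraps the two `using`s in a hypothetical `Consumer` module and then picks `M1`'s binding explicitly. The module names are illustrative only. The explicit import is placed before any access to `x`, which should also work on older Julia versions where `x` is still unresolved at that point.

```julia
module M1; export x; x = 1; end
module M2; export x; x = 2; end

module Consumer                 # hypothetical module, for illustration only
    using ..M1, ..M2            # both export `x`: the implicit binding conflicts
    import ..M1: x              # an explicit import is stronger and selects M1.x
    get_x() = x
end

chosen = Consumer.get_x()
```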


Keno requested a review from timholy June 2, 2024 21:50

giordano changed the title ~~WIP: World-age parition bindings~~ WIP: World-age partition bindings Jun 2, 2024

Keno mentioned this pull request Jun 4, 2024

It is now possible to create globals in a different module #54607

Closed

Keno mentioned this pull request Jun 5, 2024

Don't rely on implicit binding creation by setglobal JuliaLang/Distributed.jl#102

Merged

Keno mentioned this pull request Jun 5, 2024

Don't let setglobal! implicitly create bindings #54678

Merged

Keno mentioned this pull request Oct 18, 2024

Add basic infrastructure for binding replacement #56224

Merged

Keno mentioned this pull request Nov 22, 2024

Add basic code for binding partition revalidation #56649

Merged

vtjnash mentioned this pull request Nov 22, 2024

REPLExt: no method matching repl_init(::REPL.LineEditREPL) when loading packages in REPL mode. #56216

Open

Keno mentioned this pull request Feb 4, 2025

bpart: Fully switch to partitioned semantics #57253

Merged

Keno closed this in #57253 Feb 6, 2025

Keno closed this in 888cf03 Feb 6, 2025
