Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Registry.lookup doesn't always return new item that was just registered #267

Open
jweinkam opened this issue Feb 14, 2024 · 13 comments
Open

Comments

@jweinkam
Copy link

I have in my application start

{Horde.Registry, members: :auto, keys: :unique, name: MyRegistry}

For the code

Horde.Registry.register(MyRegistry, "name", "value")
result = Horde.Registry.lookup(MyRegistry, "name") 

I would expect result to be [{#PID<...>, "value"}]
But most of the time I get []

@jweinkam
Copy link
Author

This is using v0.9.0. Version v0.8.7 always gave the expected result. I believe this is related to the change in 053e1ae

@derekkraan
Copy link
Owner

This is intended behaviour. In local cases, it is expected that you will know the pid of the process (that you have just registered). Is this possible in your case? (Otherwise I would like to know more details)

@jweinkam
Copy link
Author

I am attempting to do

    GenStateMachine.start_link(__MODULE__, args,  name: {:via, Horde.Registry, {MyRegistry, "name"}})

This is frequently failing with error message

{:error, {:process_not_registered_via, Horde.Registry}}

Since internally in GenStateMachine.start_link it does a register, followed immediately by lookup and get [] instead of [{#PID<...>, ...}]

@derekkraan
Copy link
Owner

Thanks for the clarification. Usually I would expect that you wouldn't register and then immediately do a lookup, since theoretically to register you need the PID anyways, but it seems that :gen_statem(?) is doing exactly that.

Do you have a link to the location in the source where this is happening?

I don't have an immediate solution to your problem. I would suggest for now to stick with v0.8.7.

@jweinkam
Copy link
Author

GenStateMachine.start_link, call :gen_statem.start_link, which call :gen.start(?MODULE, link, Module, Args, Opts), when eventually calls :gen.init_it(GenMod, Starter, Parent, Mod, Args, Options). This is in gen.erl at line 188. It calls to register, then calls GenMod:init_it (where GenMod = :gen_statem). In :gen_statem.init_it the first this it does is gen:get_proc_name which calls lookup. This is gen_statem.erl at line 1015. So the lines are side by side, but the calls are very quickly one right after the other.

@arjan
Copy link
Contributor

arjan commented Feb 19, 2024

Maybe we should let register block until the pid has propagated in the crdt? Although that might make things slow if the sync interval is high.. 🤔

@derekkraan
Copy link
Owner

Calling Horde.Registry.register/3 does a GenServer call into Horde.RegistryImpl, which calls into DeltaCrdt.put/4, which results in a GenServer.call/3 to the DeltaCrdt instance. Before it returns, DeltaCrdt does a few Kernel.send/2, back to the Horde.RegistryImpl.

So in theory, these messages have all been sent before Horde.Registry.register/3 returns, which should mean that they are received by Horde.RegistryImpl before the next call.

It is confusing to me that this appears not to be the case. I thought that sent messages were always received in order.

@jweinkam
Copy link
Author

The registry lookup doesn't do a call but instead just reads from ets. Thus the read for ets can happen before the other sends are processed.

@derekkraan
Copy link
Owner

Oh yes that is true. I am also now reading that message ordering is only preserved between two processes, so if a third process is involved, the messages can be interleaved.

@jweinkam
Copy link
Author

I am able to work around the problem by not specifying name: {:via, Horde.Registry, {MyRegistry, "name"}} and instead having my init call Horde.Registry.register(MyRegistry, name, self()). This results in bypassing the register, lookup during gen_statem start, while still getting the process registered so that calls can be made using {:via, Horde.Registry, {MyRegister, "name"}}.

@nifoc
Copy link

nifoc commented May 5, 2024

I'm sorry for what is essentially a +1 comment, but I've just discovered the same issue.

I too am using gen_statem and a via tuple. Everything worked with the plain old Registry, so I expected Horde.Registry to be a drop-in replacement.

@philipgiuliani
Copy link

I've also stumbled on this issue today when debugging a crash in our system. We are also having it with gen_statem.

@derekkraan
Copy link
Owner

We could potentially put a second ets table in front of the existing one, to store new local registrations in. Then we could expire them after some very short period of time (milliseconds). This would cover the gap between registration and propagation of the registration back to the local registry.

We only need to cover the amount of time it takes for the message to be sent to the local process and recorded in the real ETS table. Unless your process is flooded with messages, this will be very quick.

I'm still mulling over how good of an idea this is. [expectation management] I don't have any time to work on this at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants