Kubernetes lease name truncation can lead to non-unique lease names #910
Comments
About possible solutions:
This is a serious problem, as:
I understand that this is a limitation imposed by k8s, and Akka has no way around that. But the current solution is IMHO fragile and error-prone, so not a real solution. @an-tex proposed some workarounds. A fixed-length encoding could be a real solution, at the expense of lease name legibility. But I think that an imperfect working system is better than a "legible" (not really the case with truncated lease names) but broken one. Are we missing something? Is there a workaround that we could employ right now? How is this working in production for others?
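To make the fixed-length idea above concrete, here is a minimal sketch of one possible variant: keep a readable part of the name and append a short hash of the full name, so that names differing only in their tail stay distinct. This is not Akka's implementation. The function name `fixed_length_lease_name`, the 63-character limit taken from this issue, and the 10-character SHA-256 suffix are all assumptions chosen for illustration.

```python
import hashlib


def fixed_length_lease_name(name: str, max_len: int = 63, hash_len: int = 10) -> str:
    """Hypothetical helper: shorten `name` to at most `max_len` characters.

    Names that already fit are returned unchanged. Longer names keep a
    readable prefix and get a hash of the FULL name appended, so two names
    that differ only after the cut-off still map to different results.
    """
    if len(name) <= max_len:
        return name
    # Hash the complete, untruncated name.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:hash_len]
    # Reserve room for the separator and the hash suffix.
    keep = max_len - hash_len - 1
    # Avoid a doubled separator if the cut lands right after a dash.
    readable = name[:keep].rstrip("-")
    return f"{readable}-{digest}"
```

The trade-off raised in this thread still applies: the suffix is opaque, so correlating a lease with log lines requires knowing the original full name.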
The prefix is a concatenation of different bits of information. As described here: https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#lease
Another alternative solution could be to make it configurable. But I think only the
What about going for solution 2 as suggested by @an-tex, with failure in case the final name is longer than 253 chars? We can also apply solution 1 in case it surpasses 253 chars, but the problem I see with hashing is that it will make it harder to follow logs.
Early failure would be preferable to silent truncation. At least we'd have an exception that highlights the underlying problem instead of a failing cluster that can't acquire leases.
Certainly it'd be preferable to have readable names, but a working system with unreadable lease names is still preferable to a failing system with readable ones IMHO.
Yes, previously we weren't using the leasing part, but it's useful (in our context at least).
Readable names and early failures with fallback to no leasing seem to me the best solution so far, if possible.
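The fail-early behavior discussed above can be sketched as a plain length check that raises instead of truncating. This is only an illustration, not the check that was added to Akka Management. The names `validate_lease_name` and `LeaseNameTooLongError` are hypothetical, and the 253-character limit is the figure mentioned in this thread.

```python
MAX_LEASE_NAME_LEN = 253  # limit mentioned in this thread (assumption for this sketch)


class LeaseNameTooLongError(ValueError):
    """Hypothetical error type raised when a lease name cannot be used as-is."""


def validate_lease_name(name: str, max_len: int = MAX_LEASE_NAME_LEN) -> str:
    """Return `name` unchanged, or raise if it exceeds `max_len`.

    Raising at startup surfaces the naming problem immediately, instead of
    silently truncating and producing colliding lease names later.
    """
    if len(name) > max_len:
        raise LeaseNameTooLongError(
            f"Lease name has {len(name)} characters, limit is {max_len}: {name!r}"
        )
    return name
```

Calling this once when the lease name is built would turn the error loop described in the issue into a single, explicit startup failure.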
Fail early sounds good to me. Better to fail hard than error and fall back, because some don't even pay attention to errors. Changing to
* for backwards compatibility it's possible to allow old behavior with config, e.g. to support rolling update
* in case the lease names are too long, see akka/akka-management#910
Fail early check added in #1194
Versions used
Versions:
Akka: 2.6.13
Akka Management: 1.0.10
Expected Behavior
When using a kubernetes lease for cluster sharding (https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#lease), the kubernetes lease names should be unique.
Actual Behavior
In my case I've enabled a projection using a sharded daemon process, using a kubernetes lease for coordination. The lease names end up being prefixed with
application-shard-sharded-daemon-process-[...]
(42 characters) and truncated to 63 characters, which leaves only 22 characters for the actual projection name + tag. When truncated, the tag is the first to disappear, which generates the same lease name for different sharding processes. This in turn leads to an error loop with the following logs:
Lease application-shard-sharded-daemon-process-[...] requested by client [email protected]:25520 is already owned by client. Previous lease was not released due to ungraceful shutdown.
Conflict during heartbeat to lease LeaseResource(Some([email protected]:25520),8609152,1622013422731). Lease assumed to be released
sharded-daemon-process-[...]: Shard id [2] lease lost, stopping shard and killing [1] entities.
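The collision described above can be reproduced with a few lines of string handling. This is a minimal sketch, not Akka code: the projection name `order-events-projection` and the tags are made up, the elided `[...]` part of the prefix is simply left out, and the function name `truncated_lease_names` is hypothetical.

```python
MAX_LEN = 63  # truncation length reported in this issue

# Prefix as reported above, without the elided part.
PREFIX = "application-shard-sharded-daemon-process-"


def truncated_lease_names(tags, projection="order-events-projection", max_len=MAX_LEN):
    """Build one lease name per tag, then cut each to `max_len` characters.

    `projection` and the tags are illustrative values only.
    """
    full_names = [f"{PREFIX}{projection}-{tag}" for tag in tags]
    return [name[:max_len] for name in full_names]


if __name__ == "__main__":
    names = truncated_lease_names(["tag-1", "tag-2", "tag-3"])
    # Print the number of distinct names left after truncation.
    print(len(set(names)))
```

With these inputs the tag sits entirely beyond the cut-off, so the three full names collapse into a single truncated name, which is the situation that produces the "already owned by client" loop in the logs.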