[RW Separation] Treat Regular and Search Replicas Separately to Prevent Allocation Blocking #17421

Open
vinaykpud opened this issue Feb 21, 2025 · 0 comments
Labels
enhancement Enhancement or improvement to existing feature or request Search:Performance untriaged


Problem Statement

Currently, all replicas (regular and search replicas) are treated the same during shard allocation. As a result, failing to assign one type of replica (either regular or search) can block the assignment of the other type.

  • If we are unable to assign a regular replica to a shard, it should not prevent the assignment of a search replica to a node.
  • Similarly, if we are unable to assign a search replica to a shard, it should not block the assignment of a regular replica to a node.
  • However, due to the current implementation, where all replicas are treated the same, this undesired blocking behavior occurs.
Example scenario:
  1. Assume we have sufficient nodes in the cluster.
  2. Create an index with:
    • 2 Primary Shards (2P)
    • 2 Regular Replicas (2R)
    • 2 Search Replicas (2SR)
  3. The allocateUnassigned method in LocalShardsBalancer:
    • First sorts shards based on the defined comparator.
    • Assigns primaries first to available nodes.
    • Then starts allocating replicas only after all primaries are assigned.
  4. If the allocator attempts to allocate a search replica and finds no dedicated search node, it leaves it unassigned.

As a result, this single unassigned search replica causes all other replicas to remain unassigned, even though the regular replicas could have been allocated to available nodes.

Expected Behavior
  • Regular replicas and search replicas should be treated separately during allocation.
  • The failure to assign one should not block the allocation of the other.
  • If regular replicas have available nodes, they should be allocated even if search replicas remain unassigned.

Describe the solution you'd like

Update the comparator in the LocalShardsBalancer class (LocalShardsBalancer.java#L807) to differentiate between regular replicas and search replicas.

This comparator is used for sorting the shards before allocation. Updating it will ensure that regular replicas and search replicas are allocated separately, preventing one type from blocking the other.
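A minimal sketch of what such a comparator could look like, written against a hypothetical stand-in class rather than the real routing types. The `Shard` class, its `searchOnly` flag, and the `typeRank` helper are assumptions made for illustration; the actual comparator in LocalShardsBalancer applies further ordering criteria that are omitted here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ReplicaTypeComparatorSketch {

    /** Hypothetical stand-in for a shard routing entry; not the OpenSearch class. */
    static final class Shard {
        final String index;
        final int id;
        final boolean primary;
        final boolean searchOnly; // true for a search replica

        Shard(String index, int id, boolean primary, boolean searchOnly) {
            this.index = index;
            this.id = id;
            this.primary = primary;
            this.searchOnly = searchOnly;
        }

        /** 0 = primary, 1 = regular replica, 2 = search replica. */
        int typeRank() {
            return primary ? 0 : (searchOnly ? 2 : 1);
        }

        @Override
        public String toString() {
            String type = primary ? "P" : (searchOnly ? "S" : "R");
            return "(" + id + ", " + type + ", " + index + ")";
        }
    }

    // Primaries first, then regular replicas, then search replicas.
    // Ties are broken by index name and shard id.
    static final Comparator<Shard> BY_TYPE_THEN_SHARD =
        Comparator.comparingInt(Shard::typeRank)
            .thenComparing((Shard s) -> s.index)
            .thenComparingInt(s -> s.id);

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<Shard> sorted(List<Shard> shards) {
        List<Shard> copy = new ArrayList<>(shards);
        copy.sort(BY_TYPE_THEN_SHARD);
        return copy;
    }

    public static void main(String[] args) {
        List<Shard> shards = Arrays.asList(
            new Shard("IDX1", 0, false, true),   // search replica
            new Shard("IDX1", 0, false, false),  // regular replica
            new Shard("IDX1", 0, true, false));  // primary
        // prints [(0, P, IDX1), (0, R, IDX1), (0, S, IDX1)]
        System.out.println(sorted(shards));
    }
}
```

Because a regular replica and a search replica of the same shard no longer compare as equal, any logic that groups shards by comparator equality sees them as two separate groups.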

How the Fix Works

  1. When an index with 2P, 2R, 2SR is created (assuming the cluster has sufficient nodes), the first call to allocateUnassigned will sort the shards like this:

    [(0, P, IDX1), (0, P, IDX1), (0, R, IDX1), (0, R, IDX1), (0, R, IDX1), (0, R, IDX1), (0, S, IDX1), (0, S, IDX1), (0, S, IDX1), (0, S, IDX1)]
    
    • First, primaries get assigned.
  2. In the next call to allocateUnassigned, shards will be sorted as:

    [(0, R, IDX1), (0, R, IDX1), (0, R, IDX1), (0, S, IDX1), (0, S, IDX1), (0, S, IDX1), (0, S, IDX1)]
    
    • Allocation will now alternate between regular and search replicas.
    • First iteration: Allocator will try to assign regular replicas.
    • Next iteration: Allocator will try to assign search replicas.

This ensures that if a regular replica cannot be assigned, search replicas can still be allocated, and vice versa.
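The effect can be modelled with a simplified replica pass. This is a sketch under one stated assumption, not the balancer's actual code: when a replica cannot be placed, every following shard that compares equal to it is skipped for that pass. The `Shard` class, both comparators, and the `hasSearchNode` flag are hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class AllocationPassSketch {

    /** Hypothetical replica entry; not the OpenSearch routing class. */
    static final class Shard {
        final int id;
        final boolean searchOnly; // true for a search replica

        Shard(int id, boolean searchOnly) {
            this.id = id;
            this.searchOnly = searchOnly;
        }
    }

    // Models the current ordering: all replicas of one shard compare as equal.
    static final Comparator<Shard> SAME_FOR_ALL_REPLICAS =
        Comparator.comparingInt((Shard s) -> s.id);

    // Models the proposed ordering: regular replicas sort before search replicas.
    static final Comparator<Shard> SPLIT_BY_REPLICA_TYPE =
        Comparator.comparingInt((Shard s) -> s.searchOnly ? 1 : 0)
            .thenComparingInt(s -> s.id);

    /**
     * One simplified replica pass. Assumption: a search replica needs a
     * dedicated search node, and a failed placement causes all following
     * shards that compare equal to the failed one to be skipped.
     */
    static List<Shard> allocate(List<Shard> replicas, Comparator<Shard> order, boolean hasSearchNode) {
        List<Shard> sorted = new ArrayList<>(replicas);
        sorted.sort(order); // stable sort: equal elements keep their input order
        List<Shard> assigned = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            Shard shard = sorted.get(i);
            boolean canPlace = !shard.searchOnly || hasSearchNode;
            if (canPlace) {
                assigned.add(shard);
                continue;
            }
            // Skip every subsequent shard the comparator cannot tell apart.
            while (i < sorted.size() - 1 && order.compare(shard, sorted.get(i + 1)) == 0) {
                i++;
            }
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Replicas of shard 0: two search replicas, then two regular replicas.
        List<Shard> replicas = Arrays.asList(
            new Shard(0, true), new Shard(0, true),
            new Shard(0, false), new Shard(0, false));
        // No search node available in either run.
        System.out.println(allocate(replicas, SAME_FOR_ALL_REPLICAS, false).size()); // prints 0
        System.out.println(allocate(replicas, SPLIT_BY_REPLICA_TYPE, false).size()); // prints 2
    }
}
```

With the shared ordering, the first failed search replica takes the regular replicas down with it. With the split ordering, only the remaining search replicas are skipped and both regular replicas are placed.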

Related component

Search:Performance

Describe alternatives you've considered

No response

Additional context

No response
