Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CASSANDRA-20051 4.1 be able to set seed back to itself #3661

Open
wants to merge 3 commits into
base: cassandra-4.1
Choose a base branch
from

Conversation

smiklosovic
Copy link
Contributor

Thanks for sending a pull request! Here are some tips if you're new here:

  • Ensure you have added or run the appropriate tests for your PR.
  • Be sure to keep the PR description updated to reflect all changes.
  • Write your PR title to summarize what this PR proposes.
  • If possible, provide a concise example to reproduce the issue for a faster review.
  • Read our contributor guidelines
  • If you're making a documentation change, see our guide to documentation contribution

Commit messages should follow the following format:

<One sentence description, usually Jira title or CHANGES.txt summary>

<Optional lengthier description (context on patch)>

patch by <Authors>; reviewed by <Reviewers> for CASSANDRA-#####

Co-authored-by: Name1 <email1>
Co-authored-by: Name2 <email2>

The Cassandra Jira

@rtib
Copy link
Contributor

rtib commented Nov 7, 2024

Unfortunately I haven't the capacity to build and test the whole scenario, but I think I've understood it by just reading the code.
I've left a few comments, but all in all it LGTM.

@smiklosovic
Copy link
Contributor Author

@rtib I do not see any comments from you.

@@ -2023,10 +2018,13 @@ public List<String> reloadSeeds()
return getSeeds();
}

tmp.remove(getBroadcastAddressAndPort());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change? we didn't support self seed in this API and we still don't, but if the only seed is the self seed we now bypass this check? What is the motivation?

Copy link
Contributor Author

@smiklosovic smiklosovic Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcapwell

There is this above that:

        if (tmp.isEmpty())
        {
            logger.trace("New seed node list is empty. Not updating seed list.");
            return getSeeds();
        }

If we removed local address before - so tmp is empty - it would return here. But in that case seeds would not be changed as it would return prematurely - that is the reason we see this behavior described in the ticket.

But we have to remove local address from tmp before going to mutate seeds

        // Add the new entries
        seeds.addAll(tmp);
        // Remove the old entries
        seeds.retainAll(tmp);

This would add nothing with addAll as tmp is empty and it would remove everything when retainAll(empty list). So seeds would end up empty in the end.

@@ -1986,20 +1986,15 @@ void buildSeedsList()
/**
* JMX interface for triggering an update of the seed node list.
*/
public List<String> reloadSeeds()
public synchronized List<String> reloadSeeds()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why synchronized? We don't do that when touching the seeds object so are you just trying to block concurrent nodetool commands? I am cool with that, but to be 100% clear this does not fix the race condition bug with this method mutating seeds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, blocking concurrent nodetool / jmx. What's the problem? How it doesnt fix it? Genuinely curious. Maybe I forgot something.

Copy link
Contributor Author

@smiklosovic smiklosovic Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is because there are other accesses to seeds outside of this method, e.g when we add / remove seed when node is added / removed etc ... ah damn

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that a TON of people have broken gossip by adding and removing synchronization. I haven't seen it in the seed path, but elsewhere in gossiper, folks spent far, far too much time (pre-simulator) on this. If you touch synchronization in this area, you may want to see if the simulator can prove it doesn't cause problems.

Copy link
Contributor Author

@smiklosovic smiklosovic Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffjirsa I may go without synchronized indeed. I dont think that in practice it would be problematic to not have it synchronized. Who is executing reloadSeeds at the exactly same time & from different threads? The chance of executing this twice from nodetool is virtually 0.

@@ -2023,10 +2018,13 @@ public List<String> reloadSeeds()
return getSeeds();
}

tmp.remove(getBroadcastAddressAndPort());

// Add the new entries
seeds.addAll(tmp);
// Remove the old entries
seeds.retainAll(tmp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if you only define the self seed, the new logic tries to handle, but then this step here will make seeds empty... I don't really follow why we want to allow this behavior...

Copy link
Contributor Author

@smiklosovic smiklosovic Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcapwell isn't seeds empty when the address of the node and one in the descriptor is same? I am just following the behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants