Consider moving compactor and scan server path cleaners to a utility #5041

Open
dlmarion opened this issue Nov 7, 2024 · 3 comments

Comments

@dlmarion
Contributor

dlmarion commented Nov 7, 2024

The Manager and CompactionCoordinator classes start cleanup threads (here and here) that remove paths in ZooKeeper that are no longer being used. The accumulo-cluster script uses the ZooZap utility to clean ZooKeeper paths when stopping the cluster (here for example).

If we removed the ZK cleaner threads in the Manager and CompactionCoordinator, then it would be relatively easy to determine how many and which compactors and scan servers were started, but not running.
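To illustrate why this becomes easy once stale paths are no longer removed, here is a minimal sketch of the comparison. It assumes the set of registered server paths and the set of paths currently holding a lock have already been read from ZooKeeper. The class name, method name, and host:port values are hypothetical and are not part of Accumulo.

```java
import java.util.Set;
import java.util.TreeSet;

class DownServerSketch {

  // registered: every server path still present in ZooKeeper (hypothetical "host:port" names).
  // locked: the subset of those paths that currently hold a lock, i.e. servers that are alive.
  // Returns the paths that were started at some point but are not running now.
  static Set<String> startedButNotRunning(Set<String> registered, Set<String> locked) {
    Set<String> down = new TreeSet<>(registered); // sorted copy, inputs are left untouched
    down.removeAll(locked);
    return down;
  }

  public static void main(String[] args) {
    Set<String> registered = Set.of("host1:9133", "host2:9133", "host3:9133");
    Set<String> locked = Set.of("host1:9133", "host3:9133");
    System.out.println(startedButNotRunning(registered, locked)); // prints [host2:9133]
  }
}
```

With the cleaner threads in place, the path for host2 would eventually be deleted, and the difference above would come back empty even though a server is down.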

@keith-turner
Contributor

then it would be relatively easy to determine how many and which compactors and scan servers were started, but not running.

Is this for the case of determining how many server processes died?

@dlmarion
Contributor Author

then it would be relatively easy to determine how many and which compactors and scan servers were started, but not running.

Is this for the case of determining how many server processes died?

Stopped or died. It would make it easier to detect down servers for the monitor or other utilities.

@dlmarion
Contributor Author

@keith-turner and I discussed this. The idea behind this issue is to use ZooKeeper as a mechanism to understand the intended state of the system. If we know the intended state of the system via paths in ZooKeeper, then we can identify which servers are down by looking for paths that don't have an associated lock. For users of accumulo-cluster, ZooKeeper would be cleaned up automatically on a stop operation, since accumulo-cluster calls ZooZap. My idea here was that a utility could be created and run periodically by the user to clean up ZooKeeper paths when their intended deployment layout has changed.

@keith-turner suggested that it might be too easy for ZooKeeper to get polluted with old paths if, for example, the user doesn't run the utility, or if the user is running under an orchestration system like Kubernetes that stops and starts pods as needed, so that servers end up with different hostnames. @keith-turner also suggested allowing the user to specify how many processes they intend to run, by resource group and type, so that we can compare the actual running count against the intended deployment. We might be able to convey this information in a property value (JSON?).
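A rough sketch of the count comparison, assuming the intended deployment has already been parsed into a map keyed by "resourceGroup/serverType". The key format, the class and method names, and the example groups are all hypothetical. No property name or JSON schema has been decided in this discussion, so the sketch skips the parsing step entirely.

```java
import java.util.Map;
import java.util.TreeMap;

class IntendedDeploymentSketch {

  // intended: processes the user says they want, keyed by "resourceGroup/serverType" (hypothetical format).
  // running: processes currently holding a lock, using the same keys.
  // Returns only the keys where fewer processes are running than intended, with the size of the gap.
  static Map<String, Integer> shortfall(Map<String, Integer> intended, Map<String, Integer> running) {
    Map<String, Integer> missing = new TreeMap<>();
    for (Map.Entry<String, Integer> e : intended.entrySet()) {
      int actual = running.getOrDefault(e.getKey(), 0); // a key absent from running means zero live processes
      int gap = e.getValue() - actual;
      if (gap > 0) {
        missing.put(e.getKey(), gap);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Map<String, Integer> intended =
        Map.of("default/compactor", 4, "default/sserver", 2, "batch/compactor", 3);
    Map<String, Integer> running = Map.of("default/compactor", 4, "default/sserver", 1);
    System.out.println(shortfall(intended, running)); // prints {batch/compactor=3, default/sserver=1}
  }
}
```

Because this compares counts rather than individual paths, it would not depend on hostnames staying stable, which is the Kubernetes concern raised above.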
