Support multiple clusters in sync_tron_from_k8s #991

jfongatyelp · 2024-07-13T00:28:33Z

If an action supports retries then we migrate to a new cluster while a job is ongoing, some attempts for this action could live on the old cluster and others on the new.

In order to catch this situation, sync_tron_from_k8s needs to be able to inspect pods in both old & new clusters to determine the correct tron state.

This simply allows specifying multiple kubeconfigs and gathering all pods from all clusters in order to compare against potentially-updateable actions.

This has been verified to work with multiple kubeconfigs for both eks and non-eks infrastage, and I've verified the metadata.creationTimestamps appear to be in UTC so will be consistent across clusters so we are comparing timestamps appropriately.

(Note that this means running from the devbox is not quite as convenient since our kubeconfigs are shared in a single file; I figured just forcing folks to run this somewhere with multiple files [such as the servers themselves, or by asking folks to extract each cluster to its own kubeconfig] would be easier than adding some custom method of specifying file+context combinations in the args)

nemacysts

we could potentially hardcode our paasta.conf kubeconfig (we puppet that everywhere, right?) and then we only need to have folks specify a context name, but this is perfectly fine :)

jfongatyelp · 2024-07-16T17:40:39Z

we could potentially hardcode our paasta.conf kubeconfig (we puppet that everywhere, right?) and then we only need to have folks specify a context name, but this is perfectly fine :)

Nah, I don't think we puppet paasta.conf everywhere, just devboxes. But that said, it seems simple enough to have 2 kind of 'modes', so I've added another arg so folks can either provide a single kubeconfig + multiple kubecontexts (paasta.conf) or multiple kubeconfigs w/ no kubecontext.

This was verified to work and return the same results w/ 2 separate infrastage+eks-infrastage kubeconfigs as well as using paasta.conf w/ 2 separate kubecontexts.

nemacysts · 2024-07-19T03:06:21Z

tools/sync_tron_state_from_k8s.py

@@ -77,6 +89,10 @@ def parse_args():
    parser.add_argument("-v", "--verbose", dest="verbose", action="store_true", default=False, help="Verbose logging")
    args = parser.parse_args()

+    # We can only have multiple kubeconfigs, or multiple contexts with a single config
+    if len(args.kubeconfig_path) > 1 and args.kubecontext:
+        parser.error("You can only specify a single --kubeconfig-path if specifying --kubecontext arguments.")


i'm assuming this will also do a sys.exit() or something similar?

Yup that's been my experience, it's using https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser.error :

$ ./tools/sync_tron_state_from_k8s.py --kubeconfig-path /etc/kubernetes/paasta.conf kubeconfig_eks_infrastage.conf --kubecontext pnw-devc infrastage usage: sync_tron_state_from_k8s.py [-h] [--kubeconfig-path KUBECONFIG_PATH [KUBECONFIG_PATH ...]] [--kubecontext [KUBECONTEXT [KUBECONTEXT ...]]] [--do-work] [--tron-url TRON_URL] [--tronctl-wrapper TRONCTL_WRAPPER] [-n NUM_RUNS] [-v] sync_tron_state_from_k8s.py: error: You can only specify a single --kubeconfig-path if specifying --kubecontext arguments. $ echo $? 2

nemacysts · 2024-07-19T03:14:57Z

tools/sync_tron_state_from_k8s.py

-def fetch_pods(kubeconfig_path: str) -> Dict[str, V1Pod]:
+def fetch_pods(kubeconfig_path: str, kubecontext: Optional[str]) -> Dict[str, V1Pod]:
+    if kubecontext:
+        # KubeClient only uses the environment variable


whoops, sorry - i should have also allowed for passing this through as an arg

Support multiple clusters in sync_tron_from_k8s

165ba1b

jfongatyelp requested review from nemacysts and cuza July 13, 2024 00:28

nemacysts approved these changes Jul 15, 2024

View reviewed changes

Support running on devbox too

2d145c2

Minor fix to help text for kubecontext param

172ad0c

jfongatyelp requested a review from nemacysts July 18, 2024 22:51

nemacysts approved these changes Jul 19, 2024

View reviewed changes

jfongatyelp merged commit d51984e into master Jul 19, 2024
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support multiple clusters in sync_tron_from_k8s #991

Support multiple clusters in sync_tron_from_k8s #991

jfongatyelp commented Jul 13, 2024 •

edited

Loading

nemacysts left a comment

jfongatyelp commented Jul 16, 2024

nemacysts Jul 19, 2024

jfongatyelp Jul 19, 2024

nemacysts Jul 19, 2024

Support multiple clusters in sync_tron_from_k8s #991

Support multiple clusters in sync_tron_from_k8s #991

Conversation

jfongatyelp commented Jul 13, 2024 • edited Loading

nemacysts left a comment

Choose a reason for hiding this comment

jfongatyelp commented Jul 16, 2024

nemacysts Jul 19, 2024

Choose a reason for hiding this comment

jfongatyelp Jul 19, 2024

Choose a reason for hiding this comment

nemacysts Jul 19, 2024

Choose a reason for hiding this comment

jfongatyelp commented Jul 13, 2024 •

edited

Loading