From 4a8baad19cd2441daa3263318b2c766981b74a91 Mon Sep 17 00:00:00 2001
From: Lars Kellogg-Stedman
Date: Fri, 31 Jan 2025 21:23:18 -0500
Subject: [PATCH] [WIP] Add documentation on deploying a hosted cluster

This is a work in progress!

This document describes how to deploy a bare metal hosted cluster using
Red Hat's Hosted Control Plane service, with bare metal nodes and
networking provided by the ESI environment at the MOC.
---
 deploying-a-hosted-cluster.md | 204 ++++++++++++++++++++++++++++++++++
 1 file changed, 204 insertions(+)
 create mode 100644 deploying-a-hosted-cluster.md

diff --git a/deploying-a-hosted-cluster.md b/deploying-a-hosted-cluster.md
new file mode 100644
index 0000000..90428c2
--- /dev/null
+++ b/deploying-a-hosted-cluster.md
@@ -0,0 +1,204 @@
+# Deploying a hosted cluster on ESI-provisioned nodes
+
+## Prerequisites
+
+- You are comfortable working with both OpenShift and OpenStack.
+- You are comfortable with shell scripts.
+- You have cluster admin privileges on the management (hypershift) cluster.
+- You are able to create floating ips both for the hypershift project and for the project that owns the nodes on which you'll deploy your target cluster.
+- You are able to create DNS records on demand for the domain that you are using as your base domain.
+- You have installed the latest version of `python-esiclient`.
+
+## Assumptions
+
+You have an OpenStack [`clouds.yaml`][clouds.yaml] file in the proper location, and it defines the following two clouds:
+
+- `hypershift` -- this is the project that owns the nodes and networks allocated to the hypershift management cluster.
+- `mycluster` -- this is the project that owns the nodes and networks on which you will be deploying a new cluster.
+
+[clouds.yaml]: https://docs.openstack.org/python-openstackclient/pike/configuration/index.html#clouds-yaml
+
+## Allocate DNS and floating ips
+
+You must have DNS records in place before deploying the cluster (the install process will block until the records exist).
+
+- Allocate two floating ip addresses from ESI:
+
+  - One will be for the API and must be allocated from the hypershift project (because it will map to worker nodes on the management cluster).
+  - One will be for the Ingress service and must be allocated from the project that owns your target cluster's worker nodes.
+
+- Create DNS entries that map to those addresses:
+
+  - `api.<clustername>.<basedomain>` should map to the API VIP.
+  - `api-int.<clustername>.<basedomain>` should map to the API VIP.
+  - `*.apps.<clustername>.<basedomain>` should map to the Ingress VIP.
+
+Note that at this point these addresses are not associated with any internal ip address; that association can't be made until after the cluster has been deployed.
+
+## Gather required configuration
+
+- You will need a pull secret, which you can download from <https://console.redhat.com/openshift/downloads>. Scroll to the "Tokens" section and download the pull secret.
+
+- You will probably want to provide an ssh public key. This will be provisioned for the `core` user on your nodes, allowing you to log in for troubleshooting purposes (see the sketches below).
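+
+For example, assuming you save the pull secret as `pull-secret.txt` (the filename passed to `hcp create cluster` below) and want a dedicated keypair rather than reusing an existing one, generating the key is a one-liner (the key path here is arbitrary):
+
+```
+# Generate an ssh keypair; the public half is what you pass to `hcp create cluster`.
+ssh-keygen -t ed25519 -f ~/.ssh/hypershift-mycluster -N ''
+```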
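+
+Similarly, if you have not yet allocated the floating ips and DNS records described under "Allocate DNS and floating ips", the sketch below allocates the two addresses with the same `openstack` commands used later in this document and then checks that the records resolve. It assumes the `mycluster` cluster name and `int.massopen.cloud` base domain used in the examples that follow; substitute your own names.
+
+```
+# The API VIP comes from the hypershift project, the Ingress VIP from the
+# project that owns the target cluster's nodes.
+api_vip=$(openstack --os-cloud hypershift floating ip create external -f value -c floating_ip_address)
+ingress_vip=$(openstack --os-cloud mycluster floating ip create external -f value -c floating_ip_address)
+
+# After creating the DNS records, confirm that they resolve to the VIPs.
+dig +short api.mycluster.int.massopen.cloud
+dig +short api-int.mycluster.int.massopen.cloud
+dig +short test.apps.mycluster.int.massopen.cloud
+```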
+
+## Deploy the cluster
+
+First, create the namespace for your cluster:
+
+```
+oc create ns clusters-mycluster
+```
+
+Now you can use the `hcp` CLI to create the appropriate cluster manifests:
+
+```
+hcp create cluster agent \
+  --name mycluster \
+  --pull-secret pull-secret.txt \
+  --agent-namespace hardware-inventory \
+  --base-domain int.massopen.cloud \
+  --api-server-address api.mycluster.int.massopen.cloud \
+  --etcd-storage-class lvms-vg1 \
+  --ssh-key larsks.pub \
+  --namespace clusters \
+  --control-plane-availability-policy HighlyAvailable \
+  --release-image quay.io/openshift-release-dev/ocp-release:4.17.9-multi \
+  --node-pool-replicas 3
+```
+
+This will create several resources in the `clusters` namespace:
+
+- A HostedCluster resource
+- A NodePool resource
+- Several Secrets:
+  - A pull secret (`<clustername>-pull-secret`)
+  - Your public ssh key (`<clustername>-ssh-key`)
+  - An etcd encryption key (`<clustername>-etcd-encryption-key`)
+
+This will trigger the process of deploying control plane services for your cluster into the `clusters-<clustername>` namespace.
+
+If you would like to see the manifests generated by the `hcp` command, add the options `--render --render-sensitive`; this will write the manifests to *stdout* instead of deploying them to the cluster.
+
+After creating the HostedCluster resource, the hosted control plane will immediately start to deploy. You will find the associated services in the `clusters-<clustername>` namespace. You can track the progress of the deployment by watching the `status` field of the `HostedCluster` resource:
+
+```
+oc -n clusters get hostedcluster mycluster -o json | jq .status
+```
+
+You will also see that an appropriate number of agents have been allocated from the agent pool:
+
+```
+$ oc -n hardware-inventory get agents
+NAME                                   CLUSTER     APPROVED   ROLE          STAGE
+07e21dd7-5b00-2565-ffae-485f1bf3aabc   mycluster   true       worker
+2f25a998-0f1d-c202-4fdd-a2c300c9b7da   mycluster   true       worker
+36c4906e-b96e-2de5-e4ec-534b45d61fa7               true       auto-assign
+384b3b4f-e111-6881-019e-3668abb7cb0f               true       auto-assign
+5180125a-614c-ac90-7adf-9222dc228704               true       auto-assign
+5aed1b72-90c6-da99-0bee-e668ca41b2ff               true       auto-assign
+8542e6ac-41b4-eca3-fedd-6af8edd4a41e   mycluster   true       worker
+b698178a-7b31-15d2-5e20-b2381972cbdf               true       auto-assign
+c6a86022-c6b9-c89d-b6b9-3dd5c4c1063e               true       auto-assign
+d2c0f44b-993c-3e32-4a22-39af4be355b8               true       auto-assign
+```
+
+## Interacting with the control plane
+
+The hosted control plane will be available within a matter of minutes, but in order to interact with it you'll need to complete a few additional steps.
+
+### Set up port forwarding for control plane services
+
+The API service for the new cluster is deployed as a [NodePort] service on the management cluster, as are several other services that need to be exposed in order for the cluster deploy to complete.
+
+1. Acquire a floating ip address from the hypershift project if you don't already have a free one:
+
+   ```
+   api_vip=$(openstack --os-cloud hypershift floating ip create external -f value -c floating_ip_address)
+   ```
+
+1. Pick the address of one of the cluster nodes as a target for the port forwarding:
+
+   ```
+   internal_ip=$(oc get nodes -l node-role.kubernetes.io/worker= -o name |
+     shuf |
+     head -1 |
+     xargs -INODE oc get NODE -o jsonpath='{.status.addresses[?(@.type == "InternalIP")].address}'
+   )
+   ```
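+
+   The next step enumerates every NodePort service in the control plane namespace and creates one forwarding rule per port. If you would like to see those services before creating the rules (optional; `clusters-mycluster` is the namespace used throughout these examples), list them and look at the `TYPE` column:
+
+   ```
+   oc -n clusters-mycluster get service
+   ```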
+
+1. Set up appropriate port forwarding:
+
+   ```
+   openstack --os-cloud hypershift esi port forwarding create "$internal_ip" "$api_vip" $(
+     oc -n clusters-mycluster get service -o json |
+     jq '.items[]|select(.spec.type == "NodePort")|.spec.ports[].nodePort' |
+     sed 's/^/-p /'
+   )
+   ```
+
+   The output of the above command will look something like this:
+
+   ```
+   +--------------------------------------+---------------+---------------+----------+--------------+---------------+
+   | ID                                   | Internal Port | External Port | Protocol | Internal IP  | External IP   |
+   +--------------------------------------+---------------+---------------+----------+--------------+---------------+
+   | 2bc05619-d744-4e8a-b658-714da9cf1e89 | 31782         | 31782         | tcp      | 10.233.2.107 | 128.31.20.161 |
+   | f386638e-eca2-465f-a05c-2076d6c1df5a | 30296         | 30296         | tcp      | 10.233.2.107 | 128.31.20.161 |
+   | c06adaff-e1be-49f8-ab89-311b550182cc | 30894         | 30894         | tcp      | 10.233.2.107 | 128.31.20.161 |
+   | b45f08fa-bbf3-4c1d-b6ec-73b586b4b0a3 | 32148         | 32148         | tcp      | 10.233.2.107 | 128.31.20.161 |
+   +--------------------------------------+---------------+---------------+----------+--------------+---------------+
+   ```
+
+[nodeport]: https://kubernetes.io/docs/concepts/services-networking/service/#type-nodeport
+
+### Update DNS
+
+Ensure that the DNS entry for your API address is correct. The names `api.<clustername>.<basedomain>` and `api-int.<clustername>.<basedomain>` must both point to the `$api_vip` address configured in the previous section.
+
+### Obtain the admin kubeconfig file
+
+The admin `kubeconfig` file is available as a Secret in the `clusters-<clustername>` namespace:
+
+```
+oc -n clusters-mycluster extract secret/admin-kubeconfig --keys kubeconfig
+```
+
+This will extract the file `kubeconfig` into your current directory. You can use that to interact with the hosted control plane:
+
+```
+oc --kubeconfig kubeconfig get namespace
+```
+
+## Set up port forwarding for the ingress service
+
+1. Acquire a floating ip address from the ESI project that owns the bare metal nodes if you don't already have a free one:
+
+   ```
+   ingress_vip=$(openstack --os-cloud mycluster floating ip create external -f value -c floating_ip_address)
+   ```
+
+1. Pick the address of one of the cluster nodes as a target for the port forwarding. Note that here we're using the `kubeconfig` file we extracted in a previous step:
+
+   ```
+   internal_ip=$(oc --kubeconfig kubeconfig get nodes -l node-role.kubernetes.io/worker= -o name |
+     shuf |
+     head -1 |
+     xargs -INODE oc --kubeconfig kubeconfig get NODE -o jsonpath='{.status.addresses[?(@.type == "InternalIP")].address}'
+   )
+   ```
+
+1. Set up appropriate port forwarding (in the bare metal node ESI project):
+
+   ```
+   openstack --os-cloud mycluster esi port forwarding create "$internal_ip" "$ingress_vip" -p 80 -p 443
+   ```
+
+## Wait for the cluster deploy to complete
+
+When the target cluster is fully deployed, the output for the HostedCluster resource will look like this:
+
+```
+$ oc -n clusters get hostedcluster mycluster
+NAME        VERSION   KUBECONFIG                   PROGRESS    AVAILABLE   PROGRESSING   MESSAGE
+mycluster   4.17.9    mycluster-admin-kubeconfig   Completed   True        False         The hosted control plane is available
+```
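+
+Once the HostedCluster reports `Completed` and `Available`, the hosted cluster should be reachable through the API and Ingress addresses you forwarded earlier. As a final check (a sketch using the `kubeconfig` file extracted above), confirm that the worker nodes are ready and the cluster operators have settled:
+
+```
+oc --kubeconfig kubeconfig get nodes
+oc --kubeconfig kubeconfig get clusteroperators
+```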