A service for provisioning and managing fleets of Kafka instances.
For more information on how the service works, see the implementation documentation.
- Golang 1.19+
- Docker - to create the database
- `ocm` CLI - the OCM command line tool
- Node.js v14.17+ and npm
There are a number of prerequisites required for running kas-fleet-manager due to its interaction with external services. All of the following are required to run kas-fleet-manager locally.
NOTE: some of the hyperlinks in the User Account & Organization Setup and Populating Configuration sections are not publicly accessible outside of the Red Hat organization.
- Request additional permissions for your user account in OCM stage. Example MR
- Ensure your user has the role `ManagedKafkaService`. This allows your user to create Syncsets.
- Ensure the organization your account or service account belongs to has quota for installing the Managed Kafka Add-on, see this example
- Find your organization by its `external_id` beneath ocm-resources/uhc-stage/orgs
The Observability stack requires a Personal Access Token to read externalized
configuration from within the bf2 organization. For development cycles, you will
need to generate a personal token for your own GitHub user (with bf2 access)
and place it within the `secrets/observability-config-access.token` file.

To generate a new token:

- Follow the steps found here, making sure to check ONLY the `repo` box at the top of the scopes/permissions list (which will check each of the subcategory boxes beneath it)
- Copy the value of your Personal Access Token to a secure private location. Once you leave the page, you cannot access the value again and you will be forced to reset the token to receive a new value should you lose the original
- Paste the token value into the `secrets/observability-config-access.token` file
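For example (a minimal sketch; the token value below is a placeholder, not a real token), the file can be written from a shell:

```shell
# Create the secrets directory if it does not already exist
mkdir -p secrets

# Write the token without a trailing newline (placeholder value shown)
printf '%s' '<your-github-personal-access-token>' > secrets/observability-config-access.token
```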
Kas-fleet-manager can be started without a data plane OSD cluster; however, no Kafkas will be placed or provisioned. To set up a data plane OSD cluster, please follow the Using an existing OSD cluster with manual scaling enabled option in the data-plane-osd-cluster-options.md guide.
- Add your organization's `external_id` to the Quota Management List Configurations if you need to create STANDARD kafka instances. Follow the guide in Quota Management List Configurations.

- Follow the guide in Access Control Configurations to configure access control as required.

- Retrieve your ocm-offline-token from https://qaprodauth.cloud.redhat.com/openshift/token and save it to `secrets/ocm-service.token`.

- Set up the AWS configuration:

  ```
  make aws/setup
  ```
- (optional) Set up the Google Cloud Platform (GCP) configuration

  If you intend to configure/provision Data Planes in GCP then GCP Service Account JSON credentials need to be provided to Fleet Manager so it can deploy and provision Data Plane Clusters there. To create a GCP Service Account and its corresponding JSON credentials, see the relevant GCP documentation.

  Additionally, the GCP Service Account has to meet the following requirements:

  - In case the Data Plane Clusters are to be provisioned through OCM, the GCP Service Account has to be named `osd-ccs-admin`
  - In case the Data Plane Clusters are to be provisioned through OCM, the following GCP IAM Roles have to be granted to the GCP Service Account: Required GCP IAM roles. See Manage Service Account access for details on how to do it

  To configure the GCP Service Account JSON credentials for Fleet Manager, retrieve them from GCP and configure them using one of the following alternatives:

  - Copy the contents of the JSON credentials into the `secrets/gcp.api-credentials` file
  - Run the `gcp/setup/credentials` Makefile target, providing the JSON credentials content as a base64-encoded string in the `GCP_API_CREDENTIALS` environment variable:

    ```
    GCP_API_CREDENTIALS="<base64-encoded-gcp-serviceaccount-credentials>" make gcp/setup/credentials
    ```

  Finally, make sure that `gcp` is listed as a supported Cloud Provider with at least one configured GCP region in the `config/provider-configuration.yaml` configuration file. See the documentation in that file for details on the configuration schema.
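As a concrete sketch of the base64-encoding step (the file name `gcp-credentials.json` and its contents here are illustrative stand-ins, not real credentials):

```shell
# Stand-in for the JSON key file downloaded from GCP for the service account
echo '{"type":"service_account"}' > gcp-credentials.json

# base64-encode without line wrapping (-w0 is GNU coreutils; on macOS use
# `base64 -i gcp-credentials.json | tr -d '\n'` instead)
GCP_API_CREDENTIALS="$(base64 -w0 gcp-credentials.json)"

# The encoded value would then be passed to the Makefile target, e.g.:
#   GCP_API_CREDENTIALS="$GCP_API_CREDENTIALS" make gcp/setup/credentials
echo "$GCP_API_CREDENTIALS"
```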
- Set up the MAS SSO configuration

  - Keycloak cert:

    ```
    echo "" | openssl s_client -servername identity.api.stage.openshift.com -connect identity.api.stage.openshift.com:443 -prexit 2>/dev/null | sed -n -e '/BEGIN\ CERTIFICATE/,/END\ CERTIFICATE/ p' > secrets/keycloak-service.crt
    ```

  - MAS SSO client ID & client secret:

    ```
    make keycloak/setup MAS_SSO_CLIENT_ID=<mas_sso_client_id> MAS_SSO_CLIENT_SECRET=<mas_sso_client_secret> OSD_IDP_MAS_SSO_CLIENT_ID=<osd_idp_mas_sso_client_id> OSD_IDP_MAS_SSO_CLIENT_SECRET=<osd_idp_mas_sso_client_secret>
    ```

    Values can be found in Vault.
- Set up the Kafka TLS cert

  ```
  make kafkacert/setup
  ```

- Set up the image pull secret

  - The image pull secret for RHOAS can be found in Vault; copy the content of the `config.json` key and paste it into the `secrets/image-pull.dockerconfigjson` file.

- Set up the Observability stack secrets

  ```
  make observatorium/setup
  ```

- Generate the OCM token secret

  ```
  make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
  ```

- Set up the Red Hat SSO secrets

  ```
  make redhatsso/setup
  ```
NOTE: This is only required if your Observatorium instance is authenticated using sso.redhat.com
Run the following make target:

```
make observatorium/token-refresher/setup CLIENT_ID=<client-id> CLIENT_SECRET=<client-secret> [OPTIONAL PARAMETERS]
```

Required parameters:

- `CLIENT_ID`: The client id of a service account that has, at least, permissions to read metrics
- `CLIENT_SECRET`: The client secret of a service account that has, at least, permissions to read metrics

Optional parameters:

- `PORT`: Port to run the token refresher on. Defaults to `8085`
- `IMAGE_TAG`: Image tag of the token-refresher image. Defaults to `latest`
- `ISSUER_URL`: URL of your auth issuer. Defaults to `https://sso.redhat.com/auth/realms/redhat-external`
- `OBSERVATORIUM_URL`: URL of your Observatorium instance. Defaults to `https://observatorium-mst.api.stage.openshift.com/api/metrics/v1/managedkafka`
Please make sure you have followed all of the prerequisites above first.
- Compile the binary

  ```
  make binary
  ```

- Clean up and create the database

  - If you already have the database created, execute:

    ```
    make db/teardown
    ```

  - Create the database tables:

    ```
    make db/setup && make db/migrate
    ```

  - (optional) Verify that the tables and records are created:

    ```
    make db/login
    ```

    ```
    # List all the tables
    serviceapitests# \dt
                     List of relations
     Schema |      Name      | Type  |       Owner
    --------+----------------+-------+-------------------
     public | clusters       | table | kas_fleet_manager
     public | kafka_requests | table | kas_fleet_manager
     public | leader_leases  | table | kas_fleet_manager
     public | migrations     | table | kas_fleet_manager
    ```

- Start the service

  ```
  ./kas-fleet-manager serve
  ```
NOTE: The service has numerous feature flags which can be used to enable/disable certain features of the service. Please see the feature flag documentation for more information
- Verify the local service is working

  ```
  curl -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas
  {"kind":"KafkaRequestList","page":1,"size":0,"total":0,"items":[]}
  ```
Follow this guide on how to deploy the KAS Fleet Manager service to an OpenShift cluster.
```
# Submit a new Kafka cluster creation request
curl -v -XPOST -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas?async=true -d '{ "region": "us-east-1", "cloud_provider": "aws", "name": "test-kafka", "multi_az":true}'

# Get a kafka request
curl -v -XGET -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas/<kafka_request_id> | jq

# List all kafka requests
curl -v -XGET -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas | jq

# Delete a kafka request
curl -v -X DELETE -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas/<kafka_request_id>
```
The locally installed kas-fleet-manager doesn't deploy a TLS-enabled Kafka admin server, but the default URL scheme used by the app-service-cli is HTTPS. The URL scheme therefore needs to be changed to http in the CLI code, and the CLI built locally.

First, run the Fleet Manager locally. Then the RHOAS CLI can be run pointing at the locally running Fleet Manager:

```
./rhoas login --mas-auth-url=stage --api-gateway=http://localhost:8000
```

Now various Kafka-specific operations can be performed as described here.
- The admin-server API is used for managing topics, ACLs, and consumer groups. The API specification can be found here.
- To get the admin-server API endpoint, call the GET Kafka instance endpoint against the Fleet Manager API after the Kafka instance is in the `ready` state. Assuming the Fleet Manager is running in a local process on port 8000, that can be done by executing:

  ```
  curl -H "Authorization: Bearer $(ocm token)" http://localhost:8000/api/kafkas_mgmt/v1/kafkas/<kafka_id> | jq .admin_api_server_url
  ```
```
# Start the Swagger UI container
make run/docs

# Launch Swagger UI and verify from a browser: http://localhost:8082

# Remove the Swagger UI container
make run/docs/teardown
```
Install the podman-docker utility. This creates a symbolic link from `/run/docker.sock` to `/run/podman/podman.sock`:

```
# Fedora and RHEL 8
dnf -y install podman-docker

# Ubuntu 21.10 or higher
apt -y install podman-docker
```
NOTE: As this runs rootless containers, please check the `/etc/subuid` and `/etc/subgid` files and make sure that the configured range includes the UID of the current user. You can find more details here.
In addition to the REST API exposed via `make run`, there are additional commands to interact directly with the service (i.e. cluster creation/scaling, Kafka creation, errors list, etc.) without having to use a REST API client. To use these commands, run `make binary` to create the `./kas-fleet-manager` CLI.

Run `./kas-fleet-manager -h` for information on the additional commands.
The service can be run in a number of different environments. Environments are essentially bespoke sets of configuration that the service uses to make it function differently. Environments can be set using the `OCM_ENV` environment variable. Below is the list of known environments and their details:

- `development` - The `staging` OCM environment is used. Sentry is disabled. Debugging utilities are enabled. This should be used in local development.
- `testing` - The OCM API is mocked/stubbed out, meaning network calls to OCM will fail. The auth service is mocked. This should be used for unit testing.
- `integration` - Identical to `testing` but using an emulated OCM API server to respond to OCM API calls, instead of a basic mock. This can be used for integration testing to mock OCM behaviour.
- `production` - Debugging utilities are disabled and Sentry is enabled. This environment can be ignored in most development and is only used when the service is deployed.
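As a small sketch of how the variable is used (assuming the binary has already been built with `make binary`), the environment is selected by setting `OCM_ENV` for the process being launched:

```shell
# To start the service with the testing environment, for example:
#   OCM_ENV=testing ./kas-fleet-manager serve
# The variable only affects the process it is set for; a quick illustration:
OCM_ENV=testing sh -c 'echo "started with OCM_ENV=$OCM_ENV"'
# prints: started with OCM_ENV=testing
```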
See the contributing guide for general guidelines.
```
make test
```
Integration tests can be executed against a real or "emulated" OCM environment. Executing against an emulated environment can be useful to get fast feedback as OpenShift clusters will not actually be provisioned, reducing testing time greatly.
Both scenarios require a database and an OCM token to be set up before running integration tests; run:

```
make db/setup
make ocm/setup OCM_OFFLINE_TOKEN=<ocm-offline-token> OCM_ENV=development
```
To run a local keycloak container and set up realm configuration:

```
make sso/setup
make sso/config
make keycloak/setup MAS_SSO_CLIENT_ID=kas-fleet-manager MAS_SSO_CLIENT_SECRET=kas-fleet-manager OSD_IDP_MAS_SSO_CLIENT_ID=kas-fleet-manager OSD_IDP_MAS_SSO_CLIENT_SECRET=kas-fleet-manager
```
To run integration tests with an "emulated" OCM environment, run:

```
OCM_ENV=integration make test/integration
```

To run integration tests with a real OCM environment, run:

```
make test/integration
```
NOTE: Make sure that the keycloak service running locally is exposed to the internet using a reverse proxy service such as ngrok.

```
ngrok http 8180
```

Wait for ngrok to run, then copy the generated URL and use it as the mas-sso base URL in the `internal/kafka/internal/environments/development.go` file.
To stop and remove the database container when finished, run:

```
make db/teardown
```

To stop and remove the keycloak container when finished, run:

```
make sso/teardown
```
The current list of integration tests can be found here.
The https://github.com/bf2fc6cc711aee1a0c2a/cos-fleet-manager repository is used to build the `cos-fleet-manager` binary, which is a fleet manager for connectors, similar to how `kas-fleet-manager` is a fleet manager for Kafka instances. The `cos-fleet-manager` imports most of its code from the `kas-fleet-manager`, enabling only the connector APIs that are in this repo's `internal/connector` package.
Connector integration tests require most of the security and access configuration listed in the prerequisites. The connector service uses AWS Secrets Manager as a connector-specific vault service for storing connector secret properties such as usernames, passwords, etc.

Before running integration tests, the required AWS secrets MUST be configured in the following files in the `secrets/vault` directory:

```
secrets/vault/aws_access_key_id
secrets/vault/aws_secret_access_key
```
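As a sketch, the two files can be created from a shell (the values shown are placeholders, not real AWS credentials):

```shell
# Create the vault directory expected by the connector integration tests
mkdir -p secrets/vault

# Write the AWS credentials without trailing newlines (placeholder values shown)
printf '%s' '<aws-access-key-id>' > secrets/vault/aws_access_key_id
printf '%s' '<aws-secret-access-key>' > secrets/vault/aws_secret_access_key
```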