Robert Krawitz's tools for installing etc. OpenShift 4 clusters.
Table of Contents
-
oinst: OpenShift 4.x installer wrapper, currently for AWS, GCE, and libvirt.
You may want to install kubechart (github.com/sjenning/kubechart/kubechart) and oschart (github.com/sjenning/oschart/oschart) to monitor the cluster as it boots and runs.
I welcome PRs to extend this to other platforms. See the
oinst
API below for more information. -
waitfor-pod: wait for a specified pod to make its appearance (used as a helper by
oinst
). -
bastion-ssh and bastion-scp -- use an ssh bastion to access cluster nodes.
-
install-custom-kubelet -- install a custom kubelet into a cluster.
-
set-worker-parameters -- set various kubelet parameters and wait for the operation to complete
-
clean-cluster: clean up a libvirt cluster if
openshift-install destroy cluster
doesn't work. -
get-first-master: find the external IP address of the first master node of a cluster.
-
get-masters: get the external IP addresses of all of the master nodes of a cluster.
-
get-nodes: get the external (if available) or internal IP address of each node in a cluster.
-
clusterbuster -- generate pods, namespaces, and secrets to stress test a cluster. Also optionally do client/server data transfer.
-
clusterbuster-connstat -- retrieve information about client/server clusterbuster run.
-
monitor-pod-status -- periodically print summary information about objects generated by
clusterbuster
.
-
openshift-release-info -- get various information about one or more releases.
-
get-container-status: retrieve the status of each running container on the cluster.
-
get-images: retrieve the image and version of each image used by the cluster.
The API for oinst
consists of a platform plugin handling API calls
through a dispatch function. Platform plugins are bash scripts sourced
by oinst
.
Each supported platform must provide a plugin residing in
installer/platforms/
(or
$OPENSHIFT_OINST_LIBDIR/share/OpenShift/installer/platforms/
).
OPENSHIFT_OINST_LIBDIR
may be a path, in which case each directory on
the path is searched. The name of the file is taken to be the name of
the platform. Autosave/backup files are not searched.
The platform plugin must provide a dispatch function, typically named
_____<platform>_dispatch
, that handles the API calls, which will be
presented below. Responses to API calls are provided by text on
stdout and the status (return) code; errors may be logged to stderr.
Note that all names visible at global scope (i. e. not defined with
local
within a shell function) must start with _____<platform>
or
______<platform>
(five or six underscores). Any other names result
in an error. Any state you want to save must be in variables declared
via declare -g
, as described in the bash man page.
All plugins must call, from top level
add_platform _____<platform>_dispatch
to register the plugin. As noted, the dispatch function is typically
named _____<platform>_dispatch
, but need not be as long as the global scope rule is
followed. If add_platform
is not called, or the platform name does
not match the filename, the plugin is ignored.
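As a minimal sketch of the rules above, a plugin for a hypothetical platform named demo (a file called installer/platforms/demo) might look like the following. The platform name, the state variable, and the domain value are illustrative assumptions, and the add_platform stand-in exists only so the fragment runs outside oinst, which supplies the real function. The operations handled are taken from the list further down.

```bash
#!/bin/bash
# Hypothetical plugin file: installer/platforms/demo
# Every global name starts with _____demo (five underscores).

# Stand-in so this sketch runs outside oinst; oinst provides the real one.
if ! declare -F add_platform >/dev/null; then
    add_platform() { echo "registered: $1"; }
fi

# State that must persist between calls is declared with declare -g.
declare -g _____demo_domain=demo.example.com

_____demo_dispatch() {
    local op=$1      # 'local' names are exempt from the prefix rule
    shift
    case "$op" in
        base_domain)      _____demo_domain=$1 ;;
        default_domain)   echo "$_____demo_domain" ;;
        supports_bastion) return 1 ;;   # this sketch offers no bastion
        *)                ;;            # everything else is ignored here
    esac
}

# Must be called from top level, or the plugin is ignored.
add_platform _____demo_dispatch
```

Because the file is sourced rather than executed, anything left at global scope without the prefix (a helper function, a loop variable) would violate the naming rule, which is why the sketch keeps its working variables local.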
If a platform plugin wishes to make options available to the user via
-X option=value
(or --option=value
), it must call
register_options [options...]
All options must start with the platform name. These options are dispatched as described below.
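For illustration, the sketch below registers two made-up options for a hypothetical platform named demo and stores their values when the set_option operation arrives. The option names, defaults, and the helper _____demo_set_option are assumptions, as is the choice to return 1 for an unrecognized option; the register_options stand-in is only there so the fragment runs outside oinst.

```bash
# Stand-in so the sketch runs outside oinst; oinst provides the real one.
if ! declare -F register_options >/dev/null; then
    register_options() { printf 'option: %s\n' "$@"; }
fi

# Hypothetical options; each starts with the platform name "demo".
register_options demo_zone demo_disk_gb

# Defaults, kept in prefixed globals declared with declare -g.
declare -g _____demo_zone=zone-a
declare -g _____demo_disk_gb=120

# Intended to be called from the dispatch function for
# "set_option option value".
_____demo_set_option() {
    local option=$1 value=$2
    case "$option" in
        demo_zone)    _____demo_zone=$value ;;
        demo_disk_gb) _____demo_disk_gb=$value ;;
        *)            return 1 ;;   # assumption: flag unrecognized options
    esac
}
```

With this in place, a user could pass -X demo_zone=zone-b (or --demo_zone=zone-b) on the oinst command line.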
All routines here may make use of any variables and functions in the
oinst
script that do not start with an underscore. They are all of
the form
operation [args]
-
base_domain domainname -- specify the DNS domain name of the cluster to be created.
-
cleanup -- perform any platform-specific cleanup functions. Generally
openshift-install
will perform cleanup; this may be used for backup or if anything else needs to be done. -
default_install_type -- returns the default installation type. This may be used if e. g. a plugin supports installation to multiple zones, and the plugin wishes to specify one of them as the default (perhaps picked at random).
-
diagnose text -- attempt to recognize any errors in the installer's output stream, to generate later diagnostics if the installer fails. If the line is recognized, the diagnostic routine should call
set_diagnostic <diagnostic-name> <diagnostic-routine>
The
diagnostic-name
is the name that the diagnostic routine will use to recognize that the particular diagnostic was set. The diagnostic-name
and diagnostic-routine
's name must follow the naming requirements above. If installation fails, the
diagnostic-routine
will be invoked with the diagnostic-name
. It should print an appropriate error message to stdout, with an additional newline at the end. If the return status is 1
, the diagnostic is taken as authoritative; default diagnostics related to credentials are not printed in that case. If the diagnosis is less certain, the diagnostic-routine
should return 0
. -
is_install_type install_type -- return a status of 0 if the name of the install type is recognized by this plugin, in which case this plugin will handle all future API calls. If it does not recognize the name of this installation type, it should return 1.
-
machine_cidr -- print the desired machine CIDR value.
-
master -- print any additional YAML that should be supplied in the
controlPlane
definition. The YAML code will be indented appropriately. -
platform -- print any additional YAML that should be supplied in the
platform
definition. -
worker -- print any additional YAML that should be supplied in the
compute
definition. -
postinstall -- perform any additional steps that are needed after installation successfully completes. This may include installation of e. g. extra DNS or routing beyond the normal
oc login
. It does not need to include creation of an ssh bastion. -
replicas node-type -- echo the number of replicas desired. node-type will be either
master
or worker
. This routine may call cmdline_replicas node-type default to use the number of replicas requested on the command line, along with the desired platform-specific default. -
set_option option value -- set the specified option to the desired value.
-
setup -- perform any necessary setup tasks prior to installation (e. g. cleaning additional caches beyond the standard, setting any top level variables to non-default values).
-
supports_bastion -- return a status of 0 (normal return) if the platform supports a bastion ssh host, or 1 (failure return) if it does not.
-
validate -- perform any platform-specific validation. This routine may exit (by calling
fatal
) with an appropriate error message. A typical validation may involve validating the instance type used in the installation. See Validating Instance Types below.
-
platform_help type -- provide platform-specific help information that will be appended to the help message. Copy one of the other help routines for starters. An unknown type should be ignored.
-
install_types -- provide a list of install types supported by the platform (e. g. cloud provider zones). Conventionally, the first line is flush to the left and indicates the default; other supported installation types are indented two spaces.
-
default_domain -- provide the default installation domain to be used.
-
options -- provide text with a help message for platform-specific options registered via
register_options
.
-
If the dispatch function is called with an operation that it does not
know about, it may call dispatch_unknown <platform> <args>
to notify
that it was called invalidly (and that probably the platform needs to
be fixed). This should only be done if it is called with an argument
outside of the list above; operations that it knows about but simply
doesn't do anything with should simply be ignored.
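The sketch below shows how a dispatch function for a hypothetical platform named demo might combine several of these operations: it records a diagnostic, answers is_install_type and replicas, silently ignores operations it has no use for, and reports anything outside the list. The error text being matched, the diagnostic name, the quota message, and the default of 3 replicas are all invented for illustration. The three stand-in functions only let the fragment run outside oinst, which supplies the real ones.

```bash
# Stand-ins so the sketch runs outside oinst; oinst provides the real ones.
declare -F set_diagnostic >/dev/null ||
    set_diagnostic() { echo "diagnostic set: $1 -> $2"; }
declare -F dispatch_unknown >/dev/null ||
    dispatch_unknown() { echo "unknown operation for $1: ${*:2}" >&2; }
declare -F cmdline_replicas >/dev/null ||
    cmdline_replicas() { echo "$2"; }   # stub: always yields the default

# Invoked with the diagnostic name if installation fails.
_____demo_diagnose_quota() {
    case "$1" in
        _____demo_quota)
            # Message plus one additional newline, printed to stdout.
            printf 'The demo account has exceeded its instance quota.\n\n'
            return 1 ;;                 # 1: diagnosis is authoritative
    esac
    return 0                            # 0: diagnosis is less certain
}

_____demo_dispatch() {
    local op=$1
    shift
    case "$op" in
        diagnose)
            # Made-up error text; a real plugin matches installer output.
            [[ $* == *"quota exceeded"* ]] &&
                set_diagnostic _____demo_quota _____demo_diagnose_quota
            return 0 ;;
        is_install_type)  [[ $1 == demo ]] ;;
        replicas)         cmdline_replicas "$1" 3 ;;
        supports_bastion) return 1 ;;
        # Known operations this sketch has nothing to do for: ignore.
        base_domain|cleanup|default_install_type|machine_cidr|master|\
        platform|worker|postinstall|set_option|setup|validate|\
        platform_help|install_types|default_domain|options) ;;
        # Anything outside the documented list is reported.
        *) dispatch_unknown demo "$op" "$@" ;;
    esac
}
```

Listing the ignored operations explicitly, rather than letting them fall through to the final branch, keeps dispatch_unknown reserved for operations the plugin has never heard of.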
Cloud providers typically offer a variety of machine instance types with differing amounts of memory, CPU, storage, network bandwidth, etc. Validating these instance types up front saves considerable time. Validation failure is considered to be a soft error; the user may continue, but is warned that the chosen instance type is not known to be valid. This allows the user to specify e. g. a new instance type that hasn't been added to the validation list yet.
If the platform validation function wishes to validate the instance type, it should call
validate_instance_type "$worker_type" "$master_type" <platform> <option_to_use> <instance_splitter> [instance names...]
The arguments to this function are:
-
worker_type is the worker type specified on the command line; normally it should be passed literally as
"$worker_type"
-
master_type is the master type specified on the command line; normally it should be passed literally as
"$master_type"
-
platform is the name of the platform that should be presented in a help message (which may be different, or capitalized differently, from the defined platform name)
-
option_to_use is the name of the option to use if the user wants to get a list of known valid instance types. Assuming that
option_to_use
is master_type
, the help will suggest specifying --master_type=list
for a short list of instance types, or --master_type=list-all
for the full list. master_type
is usually fine to use here. -
instance_splitter is the name of a shell function that splits each instance name into a type name and instance size or similar (subtype). It should print two lines, the first being the name of the type and the second being the instance size/subtype. The function should do the splitting appropriately for the cloud provider's nomenclature. For example, for AWS
m5.xlarge
would be split into m5 and xlarge.
-
instance names (all other arguments on the command line) is a list of known instance names. Instance type names are used to determine where to split lines, so all instances of a given type should be grouped together. There are some special names that may be provided for grouping; all of these should be prefixed with a space:
-
" X.Family" is the name of a broader family of instance types, e. g. general purpose, compute optimized, etc. If the user specifies
list
, only the members of the first family in the list are printed; if the user specifies list-all
, all instance types are printed. -
" Y.Instance Type" is the name of a particular instance that should be treated as its own family (not split) and always printed at the left on its own line.
If the type name is a single space, the family name is treated as being empty and is not printed.
-
If the validator function recognizes the type name but not the instance size name, it will list the known instance sizes as suggestions.
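A validation routine built on this might be sketched as follows for a hypothetical platform named demo that uses AWS-style instance names. The function names, the instance list, and the family labels are illustrative. The sketch also assumes that the family name simply follows the " X." prefix, which the description above does not spell out. The validate_instance_type stand-in only lets the fragment run outside oinst, which supplies the real function along with worker_type and master_type.

```bash
# Stand-in so the sketch runs outside oinst; the real function is
# supplied by oinst. This stub only reports how many arguments it got.
declare -F validate_instance_type >/dev/null ||
    validate_instance_type() { echo "validating with $# arguments"; }

# Split an AWS-style name such as m5.xlarge at the first dot and print
# the type and the size on separate lines.
_____demo_split_instance() {
    local name=$1
    echo "${name%%.*}"
    echo "${name#*.}"
}

# Intended to be called from the dispatch function for "validate".
_____demo_validate() {
    # Instances of one type are grouped together; names with a leading
    # space are family markers (format assumed, see above).
    validate_instance_type "$worker_type" "$master_type" Demo master_type \
        _____demo_split_instance \
        " X.General Purpose" m5.large m5.xlarge \
        " X.Compute Optimized" c5.large c5.xlarge
}
```

Passing master_type as the option name means the help text would point the user at --master_type=list and --master_type=list-all, as described above.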