📖 Add OLMv1 Overview doc #692
Conversation
docs/olmv1_overview.md
Outdated
### Single-tenant control planes

One choice for customers would be to adopt low-overhead single-tenant control planes (e.g. hypershift?) in which every tenant can have full control over their APIs and controllers and be truly isolated (at the control plane layer at least) from other tenants. With this option, the things OLMv1 cannot do (listed above) are irrelevant, because the purpose of all of those features is to support multi-tenant control planes in OLM.
The Kubernetes docs on multi-tenancy also mention the Cluster API, Kamaji, and vcluster as options for creating virtual control planes per tenant for tenant isolation. It might be worth linking to the kubernetes docs and/or some of the projects that are specifically designed for addressing this (hypershift can be included in this but shouldn't be the only example IMO).
I think including links to some of these other projects trying to solve the multi-tenancy problem will help get the point across that using something specifically for enabling multi-tenancy is a better solution than trying to force multi-tenancy support into OLMv1
Yep, I'll update this. Thanks! Missed the hypershift callout in my RH-specific scrub.
docs/olmv1_overview.md
Outdated
Using the [Operator Capability Levels](https://sdk.operatorframework.io/docs/overview/operator-capabilities/) as a rubric, operators that fall into Level 1 and some that fall into Level 2 are not making full use of the operator pattern. If content authors had the choice to ship their content without also shipping an operator that performs simple installation and upgrades, many supporting these Level 1 and Level 2 operators might make that choice to decrease their overall maintenance and support burden while losing very little in terms of value to their customers.

## What will OLM doo that a generic package manager doesn't?
Suggested change:
## What will OLM doo that a generic package manager doesn't?
## What will OLM do that a generic package manager doesn't?
Overall looks good to me. Had a couple comments, but nothing that I think is worth really holding this PR for.
### Watched namespaces cannot be configured in a first-class API

OLMv1 will not have a first-class API for configuring the namespaces that a controller will watch. |
How is "first-class API" being defined? Does this mean that OLM will not provide this capability itself, but this could be provided by someone else?
My understanding is that not having a "first-class API" means that there will be no field on the APIs introduced by OLMv1 to set that information explicitly. There is nothing stopping the author of a controller from providing configuration options to users for this or hardcoding the namespaces it should watch.
I don't think we have really talked about this, but I could imagine the APIs introduced by OLMv1 having a kind of "pass through" field that users can use to set some arbitrary values on the manifests deployed by OLM, and nothing here prevents that (e.g. setting env vars on a deployment).
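A minimal sketch of what such a pass-through could look like (the kind, API group, and `config` field below are hypothetical illustrations, not an agreed-upon OLMv1 API):

```yaml
# Hypothetical sketch only: the Extension kind, API group, and config field
# are invented for illustration and are not a committed OLMv1 API.
apiVersion: example.olm.io/v1alpha1
kind: Extension
metadata:
  name: my-operator
spec:
  packageName: my-operator
  channel: stable
  config:                 # opaque to OLM; its schema would be defined by the extension author
    watchNamespaces:      # an author-defined knob that OLM itself knows nothing about
      - tenant-a
    env:
      - name: LOG_LEVEL
        value: debug
```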
Rather than saying "not", let's say what it does provide. e.g. "Namespace watching is provided via "
@tmshort This is somewhat nuanced, and I want to make sure that the overall message is clear.
Namespace watching is not provided by OLMv1 at all. However, OLMv1 will support arbitrary configuration with schemas provided by extension authors and values that must match those schemas provided by extension admins. If authors decide to include knobs in their configuration schemas that control scoping, that's fine. But it isn't anything OLMv1 knows about or can build features around.
I call this out with a bit more detail in the Approach -> Don't Fight Kubernetes section further down.
There is one and only one exception. In order to maintain backward compatibility with registry+v1 bundles, the OLM maintainers will define the parameterization schema for registry+v1 bundles and will include the ability to define the watched namespaces. But again, this is particular to this bundle format and is not something that the broader OLM system will have awareness of.
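For reference, the registry+v1 compatibility mentioned above builds on the existing OLMv0 convention, where OLM injects the target namespaces as a pod annotation and the operator reads them through the downward API, roughly like this (illustrative excerpt of a Deployment's container spec):

```yaml
# OLMv0 registry+v1 convention (illustrative): the operator discovers its
# watched namespaces from an annotation injected by OLM, via the downward API.
env:
  - name: WATCH_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.annotations['olm.targetNamespaces']
```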
However, Kubernetes does not assume that a controller will be successful when it reconciles an object.

The Kubernetes design assumptions are:
- CRDs and their controllers are trusted cluster extensions. |
I understand that a CRD (the API) is global, but it is not clear to me why a controller cannot be namespaced. If the controller is running in a namespace, with a service account that has RBAC limited to the namespace, and only reconciles CRs within that namespace -- I'm not sure I would consider that a cluster extension?
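For concreteness, the namespace-confined setup described in this comment might look roughly like the following (the `widgets.example.com` API is made up for illustration):

```yaml
# Illustrative only: RBAC that confines a controller's ServiceAccount to a
# single namespace, so it can only list/watch/reconcile CRs in that namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: widget-controller
  namespace: tenant-a
rules:
  - apiGroups: ["widgets.example.com"]
    resources: ["widgets", "widgets/status"]
    verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: widget-controller
  namespace: tenant-a
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: widget-controller
subjects:
  - kind: ServiceAccount
    name: widget-controller
    namespace: tenant-a
```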
I could be wrong, but in this case I am interpreting "cluster extension" as referring to it literally being something that extends the functionality of the Kubernetes cluster. I don't think there is anything in Kubernetes preventing a controller from running with reconciliation scoped only to a namespace.
I read cluster extension as things that extend the functionality of the cluster, further than what is shipped with a default install of Kubernetes. This has no bearing on the scope that an operator author chooses to implement within a controller. Even if the controller in question is just namespace scoped, it still extends a k8s cluster's functionality.
A controller can be namespaced, but it requires that all controllers for a given CRD manage non-overlapping namespaces. That can be tricky. This is why we are/were considering splitting up "global components" (e.g. CRDs) from potentially "namespace-able components" (i.e. the controllers).
The Kubernetes design assumptions are:
- CRDs and their controllers are trusted cluster extensions.
- If an object for an API exists a controller WILL reconcile it, no matter where it is in the cluster.
If a controller is running with a service account with namespaced RBAC, would the controller even see a CR created in another namespace?
No. My understanding is that the design assumption we are stating here is that Kubernetes assumes that there is a controller somewhere that will reconcile an object for an API no matter where it is in the cluster. As far as I am aware this doesn't necessarily mean it has to be a single instance of a controller. I do think in general it makes more sense to use a single controller over many namespace specific controllers though.
OLMv1 will make the same assumption that Kubernetes does and that users of Kubernetes APIs do. That is: If a user has RBAC to create an object in the cluster, they can expect that a controller exists that will reconcile that object. If this assumption does not hold, it will be considered a configuration issue, not an OLMv1 bug.

This means that it is a best practice to implement and configure controllers to have cluster-wide permission to read and update the status of their primary APIs. It does not mean that a controller needs cluster-wide access to read/write secondary APIs. If a controller can update the status of its primary APIs, it can tell users when it lacks permission to act on secondary APIs.
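A sketch of that best practice (resource names are again made up): the controller gets cluster-wide access to its primary API and its status, while secondary resources such as Secrets or Deployments can be granted separately through namespaced Roles only where they are actually needed:

```yaml
# Illustrative: cluster-wide permissions limited to the primary API and its
# status subresource; secondary APIs are intentionally not listed here.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: widget-controller-primary
rules:
  - apiGroups: ["widgets.example.com"]
    resources: ["widgets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["widgets.example.com"]
    resources: ["widgets/status"]
    verbs: ["get", "update", "patch"]
```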
I wonder if it would be useful to challenge this. Although APIs are inherently cluster-scoped in Kubernetes, can we achieve some level of multi-tenancy by splitting the APIs from the controllers?
Can a cluster admin make an API available -- with no controller installed with cluster-wide RBAC? Can a namespace admin then install a controller in their namespace with limited RBAC to reconcile CRs in that namespace? This may require allowing for the installation of individual components of a bundle.
I don't think we are going to intentionally prevent any of those possible approaches to achieving multi-tenancy, but I don't think we are going to intentionally support them either. A user absolutely should be able to do all of those things by creating individual bundles that way, but going that route fights the design of Kubernetes and comes with a host of its own issues, making it difficult for OLM to effectively handle the automatic life cycling of the controllers. Taking this approach, it will be entirely on the user to understand the impacts of this decision and why OLM may not behave as expected in this case.
For example, I would not see it as a bug if automatic upgrades for a controller were to fail and require manual intervention in this scenario. I would also not see it as a bug if OLM successfully installed multiple controllers that reconcile the same resource in the same namespace in this scenario.
The fact that the APIs are global means that there isn't true multi-tenancy of controllers either. All of the controllers for that global API MUST agree on the single API they will all use. Therefore tenants will be limited by the choices made by other tenants when it comes to lifecycling controllers.
As Bryce said, OLMv1 will not get in the way of multiple controller installations, but it also won't help de-conflict between them.
### Dependencies based on watched namespaces

Since there will be no first-class support for configuration of watched namespaces, OLMv1 cannot resolve dependencies among bundles based on where controllers are watching. |
If we have the following scenario:
- Customer goes to install Operator A at the cluster scope (CRD + controller with cluster-wide RBAC)
- Operator A has a dependency on Operator B (A will be creating CRs from B and expects B to reconcile them)
I'm not sure why the absence of a watched namespaces concept prevents this dependency fulfillment.
OLMv1, given its limited RBAC, can likely only see that CRD B is not present. If it was present, OLMv1 couldn't see if controller B is available and reconciling.
OLMv1 however, if its RBAC was limited to a single namespace, could create a CR for B and see if controller B picks it up and sets some status field. This would give OLMv1 the information it needs about whether controller B is running and reconciling at the cluster scope (which as presented is the recommended -- possibly only? -- install mode).
> Customer goes to install Operator A at the cluster scope (CRD + controller with cluster-wide RBAC)
OLMv1 will be completely unaware of the scoping configuration of Operator A. It doesn't know if Operator A is watching the entire cluster or is watching just a subset.
The fact that an opaque configuration applied to the bundle results in cluster-wide RBAC for the service account tied to a deployment is maybe a good enough proxy for watch namespace. But you are correct that a dependency resolver would also need awareness of the RBAC in use by any dependents, which is not available to OLM and may not be available to a user.
> OLMv1 however, if its RBAC was limited to a single namespace, could create a CR for B and see if controller B picks it up and sets some status field. This would give OLMv1 the information it needs about whether controller B is running and reconciling at the cluster scope (which as presented is the recommended -- possibly only? -- install mode).
This seems fragile and complex and lots could go wrong. Off the top of my head:
- OLM would have to evaluate the schema of each CRD and be able to produce a valid object
- This assumes that all objects have a status
- Operators may have admission webhooks that reject creates unless arbitrary conditions are met.
- Creating a CR may have major implications on the operations of a cluster, and would likely incur costs.
- The fact that a CR for B can be created in a particular namespace provides no signal about whether a controller would reconcile B in another namespace.
The most likely dependency resolver implementation given the constraints is probably:
- Client-based
- Limited to cluster admins who can see RBAC and Extensions cluster-wide.
- Required to use RBAC as a proxy for watch namespaces, which may result in assumptions that a controller is watching a namespace, even if it is not (i.e. it has RBAC, but for whatever reason isn't actually watching there)
1. How would a dependency resolver know which extensions were installed (let alone which extensions were watching which namespaces)? If a user is running the resolver, they would be blind to an installed extension that is watching their namespace if they don't have permission to list extensions in the installation namespace. If a controller is running the resolver, then it might leak information to a user about installed extensions that the user is not otherwise entitled to know.
2. Even if (1) could be overcome, the lack of awareness of watched namespaces means that the resolver would have to make assumptions. If only one controller is installed, is it watching the right set of namespaces to meet the constraint? If multiple controllers are installed, are any of them watching the right set of namespaces? Without knowing the watched namespaces of the parent and child controllers, a correct dependency resolver implementation is not possible to implement.
Note that regardless of the ability of OLMv1 to perform dependency resolution (now or in the future), OLMv1 will not automatically install a missing dependency when a user requests an operator. The primary reasoning is that OLMv1 will err on the side of predictability and cluster-administrator awareness. |
Given it was properly conveyed via the UI (or logging in a CLI), I think you can achieve administrator awareness while still fulfilling dependencies.
Predictability is definitely the harder thing to achieve though. All sorts of inputs go into which dependency is chosen. In other package managers, there is almost always an imperative flow where an admin has a chance to review the chosen dependencies before they are installed.
If it was possible to overcome the difficulty of building a dependency resolver without awareness of controller scope (that's a big if, and not something we're pursuing), then a client-based resolver that presents the user with the chosen set of Extensions to install and lets them decide how to proceed would be the best of both worlds.
With OLMv1's focus on GitOps friendliness and security posture, we have decided not to pursue a controller-based dependency resolver/installer.
And again, this is all fairly moot because we are not pursuing a dependency resolver. However, part of the beauty of this design is that it lends itself more to extensibility. A third party should be able to implement a dependency resolver over the APIs provided by core OLMv1:
- Catalog contents are available to clients
- Catalog metadata is extensible, so third parties could include their own dependency metadata (e.g. about "requires", "provides", "conflicts", etc.); a sketch follows below.
- Extension API is available to clients
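As a sketch of that extensibility (the property type below is invented; only the standard `olm.bundle` fields are real file-based catalog fields), a third party could attach its own dependency metadata to a bundle entry and resolve against it with client-side tooling:

```yaml
# File-based catalog bundle entry carrying a hypothetical third-party
# dependency property; OLM itself would not interpret the custom property type.
schema: olm.bundle
name: operator-a.v1.2.0
package: operator-a
image: quay.io/example/operator-a-bundle:v1.2.0
properties:
  - type: example.com/requires        # made-up property type for illustration
    value:
      package: operator-b
      versionRange: ">=2.0.0 <3.0.0"
```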
OLMv1 will not provide dependency resolution among packages in the catalog (see [Dependencies based on watched namespaces](#dependencies-based-on-watched-namespaces)).

OLMv1 will provide constraint checking based on available cluster state. Constraint checking will be limited to checking whether the existing constraints are met. If so, install proceeds. If not, unmet constraints will be reported and the install/upgrade waits until constraints are met. |
In OLMv0 this seems to exist via `nativeAPIs` defined in CSVs. The way this is built today, if you try to install an operator via the UI and the `nativeAPI` requirements are not fulfilled, the failure is not made apparent to the user with the missing dependencies. The installation appears to just be stuck pending. I'd suggest we make this more visible and easily debuggable in OLMv1.
Yes, there are a few rough edges of OLMv0 like this. Another is `minKubeVersion` in the CSV. The goal is to get all of the constraint-related information into the catalog where it can be evaluated before pulling, extracting, and applying bundle contents.
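For context, both of those OLMv0 fields live in the CSV, i.e. inside the bundle itself, which is why they can only be evaluated after the bundle has been pulled. An illustrative excerpt:

```yaml
# Illustrative OLMv0 ClusterServiceVersion excerpt: these constraint-like
# fields ship inside the bundle, so the catalog alone cannot evaluate them.
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
  name: operator-a.v1.2.0
spec:
  minKubeVersion: 1.27.0
  nativeAPIs:
    - group: ""
      version: v1
      kind: ConfigMap
```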
docs/olmv1_overview.md
Outdated
TL;DR: OLMv1 cannot feasibly support multi-tenancy or any feature that assumes multi-tenancy. All multi-tenancy features end up falling over because of the global API system of Kubernetes. While this short conclusion may be unsatisfying, the reasons are complex and intertwined.

Nearly every engineer in the Operator Framework group contributed to design explorations and prototypes over an entire year. For each of these design explorations, there are complex webs of features and assumptions that are necessary to understand the context that ultimately led to a conclusion of infeasibility that led us to today’s conclusion. |
Suggested change:
Nearly every engineer in the Operator Framework group contributed to design explorations and prototypes over an entire year. For each of these design explorations, there are complex webs of features and assumptions that are necessary to understand the context that ultimately led to a conclusion of infeasibility that led us to today’s conclusion.
Nearly every engineer in the Operator Framework group contributed to design explorations and prototypes over an entire year. For each of these design explorations, there are complex webs of features and assumptions that are necessary to understand the context that ultimately led to a conclusion of infeasibility.
This whole paragraph seems odd. It's describing the process/history but not necessarily the present state of OLMv1. (I.e. the subject of these paragraphs is not OLMv1, but people and tasks.) I think we may want to reconsider this paragraph, or put it into a historical section, even if it means just adding a `### Historical Context` above it.
README.md
Outdated
OLM v1 is the follow-up to OLM v0, located [here](https://github.com/operator-framework/operator-lifecycle-manager).

It consists of four different components, including this one, which are as follows: |
The last "it" referenced was "OLM v0". You may want to distinguish this as OLM v1 (even though it's repetitive, it's more precise).
OLM v1 is the follow-up to OLM v0, located [here](https://github.com/operator-framework/operator-lifecycle-manager).

It consists of four different components, including this one, which are as follows:
* operator-controller
* [deppy](https://github.com/operator-framework/deppy) |
Bye-bye deppy?
It is still in use by the ClusterExtension controller. As is rukpak. Let's leave those in until we actually stop using them?
README.md
Outdated
* operator-controller
* [deppy](https://github.com/operator-framework/deppy)
* [rukpak](https://github.com/operator-framework/rukpak)
* [catalogd](https://github.com/operator-framework/catalogd)

For a more complete overview of OLM v1 and how it will differ from OLM v0, see our [overview](./docs/olmv1_overview.md). |
Future tense ("will differ") vs present tense ("differs")?
docs/olmv1_overview.md
Outdated
## What will OLM doo that a generic package manager doesn't?

OLM will provide multiple features that are absent in generic package managers. Some items listed below are already implemented, while others are most likely planned for the future. |
Remove "are most likely". Prefer simply "are", or "may be".
To sum up my thoughts on reviewing this:
OLM v1 feels like a wasted opportunity in the form presented and I really hoped to see more focus around the previously discussed separation of API (cluster-scoped) from controller (not necessarily cluster-scoped) and the impact on lifecycle that such a split would necessitate.
I thought/hoped that OLM v1 would be trying to provide a solution for that challenging problem, rather than what appears to me to be a view of "that's someone else's problem to solve, we are going to design OLM to not get in the way of whoever tries to solve that"
In a model where the CRD and the controller(s) that implement that CRD are distinct entities I don't think something that wants to be a lifecycle manager can just "opt out" of such a key aspect that drives lifecycle events as the relationship between CRD and controller(s).
I think that the discussion around tenancy in the cluster has got in the way a little here perhaps, because this is not about multi-tenancy really, and - unlike multi-tenancy - this separation of CRD from controller is something that is a natural fit for Kubernetes, certainly a more natural fit than having them bound together as it is today IMO.
OLM v1 should be the glue that binds and manages the CRD and its controllers; that's what I'd expect of a lifecycle manager for cluster extensions. Put the discussion about tenancy aside, this isn't about being able to isolate one namespace from changes in another (which as has been said many times really isn't possible due to the nature of Kubernetes), but as a cluster administrator I should be able to install an API in my cluster, and then control which namespaces can use that API.
- Manage a Kubernetes API extension (CRD)
- Manage a single cluster-scoped controller for an installed API extension
- Manage one or more namespace-scoped controllers for an installed API extension
I'd expect OLM to be the thing that can manage the extension as a whole including:
- Ensuring either use of multiple controllers with namespace scopes or a single cluster-scoped controller, and preventing a mix of the two
- Managing the relationship between controller and CRD, and e.g. preventing controller updates happening which would create incompatibility with the CRD version, or at least flagging them as moving into a warning state if the controller and the CRD are not compatible
- Moving from a model where the CRD and controller are delivered in one package to offering distinct packages for CRDs and controllers, as well as a convenience bundle for both perhaps
The customers with large clusters that I have worked with in the last 2 years do not want to be using cluster-scoped controllers, and they are not seeking to address tenancy concerns through the use of multiple namespace-scoped controllers; they see namespace-scoped controllers as a way to enable the extensions on a namespace-by-namespace basis. Yes, the extension is installed to the cluster, but it's only been enabled for use in namespace1, 2, & 3.
They are seeking to limit the exposure/impact of introducing a new API and updating controllers in their large clusters rather than the ability to independently operate in namespace1 and namespace2 through some form of isolation/tenancy.
This is what I think the next evolution of OLM should be: a transition to a first-class data model and lifecycle built around the separation of CRD from controller.
Example: Strimzi and Red Hat AMQ Streams
Today use of these two major Kafka operators is problematic: if you install both in a cluster, things go bad for you because both respond to the same API (`kafka.kafka.strimzi.io`).
I would expect/hope for OLM v1 to address this with first class support for something like this:
- Install the `kafka.strimzi.io` API extension to say "we support Kafka in this cluster" (no controller(s) included in this action)
- Install the Strimzi controller in `namespace1` (no change to the API included in this action)
- Install the AMQ Streams controller in `namespace2` (no change to the API included in this action, it can not break anything in `namespace1`)
- Update the `kafka.strimzi.io` API extension ... at this point we are performing a cluster-scoped operation that may impact all namespaces with a controller (`namespace1` and `namespace2`)
- Update the AMQ Streams controller in `namespace2` (no need to worry about impacting `namespace1`)
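Sketched as manifests, that flow could look something like the following; both kinds shown here are purely hypothetical, invented to illustrate the proposal, and are not existing or planned OLM APIs:

```yaml
# Hypothetical kinds, used only to illustrate the proposed CRD/controller split.
apiVersion: example.olm.io/v1alpha1
kind: APIExtension              # cluster-scoped: delivers only the CRDs
metadata:
  name: kafka.strimzi.io
spec:
  package: kafka-crds
  version: 1.x
---
apiVersion: example.olm.io/v1alpha1
kind: ControllerInstallation    # namespaced: delivers a controller for an already-installed API
metadata:
  name: strimzi
  namespace: namespace1
spec:
  package: strimzi-kafka-operator
  implements: kafka.strimzi.io  # would have to stay compatible with the installed API version
```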
To me, this is what I think about when I hear people talking about tenancy, because this is what the customers I work with are seeking, and it's a perfect fit for Kubernetes if we just broke apart the delivery mechanism for CRDs and controllers.
I would liken this change in approach to the difference it made when file-based catalogs came around and removed the channel graph information from the operator bundles. It never made sense that an individual operator bundle had to know "what channel will I be in", or "what's the default channel of the package I belong to" and once we were able to define that relationship in the correct place (in the catalog itself) it was a game-changer for managing operator packages. I feel that the same thing needs to happen for CRD/Controllers as part of the evolution of OLM.
However, Kubernetes does not assume that a controller will be successful when it reconciles an object.

The Kubernetes design assumptions are:
- CRDs and their controllers are trusted cluster extensions. |
The example which is a counter-argument is the Ingress API, which has a reference to the ingress class name, which may or may not have a running controller on the cluster.
I am not arguing APIs are global.
I am arguing that APIs could be separated from the controllers, with different lifecycles.
### "Watch namespace"-aware operator discoverability

When operators add APIs to a cluster, these APIs are globally visible. As stated before, there is an assumption in this design that a controller will reconcile an object of that API anywhere it exists in the cluster. |
Such an assumption is dangerous. Controllers might have privileges to read and modify Secrets, to properly manage the operands. Cluster admins will not allow global access to all Secrets on the cluster; they will allow such access only to selected namespaces.
Example: let's assume we are installing a Kafka operator at the cluster scope, which will have Secrets get/update permission. Cluster admins will not let the Kafka operator access namespaces running other workloads where confidential/sensitive data is stored as Secrets.
Therefore, controllers must be provided a way to restrict their access. It could be done such that RBAC is granted only in selected namespaces and it is up to the controller to somehow know which namespaces to watch. But then, it begs for some API to understand what the scope of a controller is, or at least some thought being given to best practices for how controller developers should handle scope discovery.
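One way such a restriction is commonly expressed today with plain Kubernetes RBAC (names below are illustrative; this is not an OLM feature) is to define the Secrets permissions once in a ClusterRole and bind it only in the namespaces the operator is allowed to manage:

```yaml
# Illustrative: the Kafka operator's ServiceAccount can touch Secrets only in
# namespaces where this RoleBinding exists, because the ClusterRole is bound
# with a namespaced RoleBinding instead of a ClusterRoleBinding.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kafka-operator-secrets
  namespace: team-a              # repeat this binding for each allowed namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kafka-operator-secrets   # a ClusterRole granting get/list/watch/update on secrets
subjects:
  - kind: ServiceAccount
    name: kafka-operator
    namespace: operators
```

How the controller then discovers which namespaces those are (configuration, an env var, label selection) is exactly the best-practice question this comment raises.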
RH maintainers had a series of meetings last week and determined some updates/clarifications are needed here, but it would be much easier to discern the updates in a separate PR, so let's merge this and do a follow-up to capture updates.
Reviewers: please interpret this as an attempt to ensure that your comment is interpreted in the updated context