[Draft] Assets generation and Platform Awareness enhancement #210

Conversation

rromannissen
Contributor

Since its first release, the insights that Konveyor could gather from a given application came either from the application source code itself (analysis) or from information provided by the different stakeholders involved in managing the application lifecycle (assessment). This enhancement proposes a third way of surfacing insights about an application: gathering both runtime and deployment configuration from the very platform on which the application is running (discovery), and storing that configuration in a canonical model that can be leveraged by different Konveyor modules or addons.

Aside from that, the support that Konveyor provided for the migration process stopped once the application source code had been modified for the target platform, leaving the application ready to be deployed but without the assets required to actually deploy it on that platform. For example, for an application to be deployed on Kubernetes, it is not only necessary to adapt the application source code to run in containers; it is also necessary to have deployment manifests that define how the application is deployed in a cluster, a Containerfile to build the image, and potentially some runtime configuration files. This enhancement proposes a way to automate the generation of those assets by leveraging the configuration and insights gathered by Konveyor.


- Should there be a dynamic way of registering Platform Types, Discovery Providers and Generator Types? Should that be managed by CRs, or could there be an additional mechanism? That would imply adding some dynamic behavior to the UI to render the different fields associated with each of them.
- How can we store sensitive data retrieved by the Discovery Providers?
- How could we handle the same file being rendered by two different _Generators_ (charts)? Is there a way to calculate the intersection of two different Helm charts?


> How could we handle the same file being rendered by two different Generators (charts)?

One approach may be to use a different release name for each Generator. WDYT?

> Is there a way to calculate the intersection of two different Helm charts?

I'm not aware of a way to intersect two charts; maybe the closest is to use Helm's dependency management.

Contributor Author

We wouldn't be using the Helm release concept, as I wouldn't expect the asset generator to have any direct contact with a k8s cluster (that would be more of a CI/CD pipeline concern). We are mostly using Helm to render assets via the `helm template` command.
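To make the rendering model concrete, here is a minimal sketch of an offline rendering pass under that assumption; the chart name, values and output directory are hypothetical, not part of the enhancement:

```yaml
# Rendered with no cluster connection and no Helm release, e.g.:
#   helm template inventory-service ./openshift-generator-chart \
#     --values values.yaml --output-dir ./generated
#
# Hypothetical values.yaml the hub could inject into the Generator task pod:
application:
  name: inventory-service        # hypothetical application name
  targetNamespace: retail-prod   # hypothetical target namespace
runtime:
  jvmOptions: "-Xms512m -Xmx1024m"
```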

##### Repository Augmentation

- Generated assets could be stored in a branch of the target application repository or, if needed, in a separate configuration repository if the application has adopted a GitOps approach to configuration management.
- Allow architects to seed repositories so migrators can start their work with everything they need to deploy the applications they are working on right away → ease the change, test, repeat cycle.


What do you mean by "seed repositories"?

Contributor Author

Add everything developers need to start deploying the application on the target platform from the very first minute. If a developer can only interact with the source code to adapt the application for the target platform, but is not able to actually deploy the app to see if it works, it becomes difficult for them to know when the migration is done, at least to a point where the organization can test that everything behaves as expected.


Deployment could be done by existing CI/CD infrastructure. We implemented this approach for our customer in a workflow: when Move2Kube generated the Dockerfile and manifests, we triggered Tekton to build the image and deploy. We provided a place for customers to define how the pipeline should be triggered.

Contributor Author

That's the idea: our assets generator leaves the assets in a place where the corporate CI/CD can pick them up and orchestrate the deployment in whatever way they have designed. That last mile, the deployment itself, is delegated to the corporate CI/CD system; Konveyor doesn't have anything to do with it.

Contributor

IIUC @rromannissen, you are saying that nothing is stopping the generator from creating the Tekton Pipeline, but applying and using that pipeline is an exercise left to users outside of Konveyor.

Is that correct?

Contributor Author

@shawn-hurley that's it!
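To make that hand-off concrete, here is a minimal sketch of the kind of pipeline asset a generator could render and commit; the names and task are hypothetical, and applying the manifest would be left entirely to the user's CI/CD tooling:

```yaml
# Hypothetical generated Tekton Pipeline; Konveyor would render it, never apply it.
apiVersion: tekton.dev/v1
kind: Pipeline
metadata:
  name: inventory-service-deploy   # hypothetical name
spec:
  tasks:
    - name: build-image
      taskRef:
        name: buildah              # assumes a buildah Task exists in the target cluster
```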


- Should there be a dynamic way of registering Platform Types, Discovery Providers and Generator Types? Should that be managed by CRs, or could there be an additional mechanism? That would imply adding some dynamic behavior to the UI to render the different fields associated with each of them.
- How can we store sensitive data retrieved by the Discovery Providers?
- How could we handle the same file being rendered by two different _Generators_ (charts)? Is there a way to calculate the intersection of two different Helm charts?
Contributor

Is the open question how you can layer the file changes on top of each other, or merge them together, so that the generators work together?

Contributor Author

If the OpenShift generator (chart) generates a Deployment.yaml and the EAP on OpenShift generator (chart) generates a different Deployment.yaml, how can we merge them? It just came to my mind that we could establish an explicit order of preference when assigning Generators to a Target Platform, so if some resources (files) overlap, the ones with top preference override the others. That would mean no file merging, but the end result would be a composition (should we call this a merge?) of the files rendered by all generators.
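A minimal sketch of that precedence idea, with hypothetical generator names and file sets:

```yaml
# Generators assigned to a Target Platform, ordered by preference:
generators:
  - name: eap-on-openshift      # preference 1 (highest)
    renders: [Deployment.yaml, ConfigMap.yaml]
  - name: openshift-base        # preference 2
    renders: [Deployment.yaml, Route.yaml]

# Resulting composition (no file merging; the highest preference wins per file):
result:
  Deployment.yaml: eap-on-openshift
  ConfigMap.yaml: eap-on-openshift
  Route.yaml: openshift-base
```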


I may be missing some context here, but my understanding is that we would have one or more generators (configured by the users) which may provide one or more ways to deploy the same app. In my opinion we should not merge anything: provide the generated manifests (in different folders) per user request and let the user decide what to do about duplication.

Contributor

I think this makes sense for a first pass. I believe this could get cumbersome, but waiting for actual user pain makes sense to me.

- Documented way of storing configuration:
  - Keys are documented and have a precise meaning.
  - Similar to Ansible facts, but surfacing different concerns related to the application runtime and platform configuration.
  - RBAC protected.
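As a rough sketch of what such a documented dictionary could look like (all keys and values here are hypothetical, purely for illustration):

```yaml
# Hypothetical entries in the canonical Configuration dictionary,
# discovered from the source platform; each key would be documented:
runtime:
  type: jvm
  heap: 1024m                    # discovered JVM heap setting
platform:
  type: cloud-foundry
  instances: 3                   # discovered scaling configuration
datasources:
  - name: inventory-db
    kind: postgresql
    url: jdbc:postgresql://db.internal:5432/inventory
```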
Contributor

I would like to explore this a little more.

Is the whole configuration RBAC protected, or just some fields? How is the RBAC managed from the hub?

Contributor Author

At the moment I think it should be something simple along the lines of "only Admins and/or Architects can see the config", considering how RBAC works now. Once we move authorization over to Konveyor itself (as we've discussed several times in the past), I think we'd have something more flexible that would allow users to have more fine-grained control over this.


- The hub generates a `values.yaml` file based on the intersection of the _Configuration_ dictionary for the target application, the fixed _Variables_ set in the Generator, and the _Parameters_ the user might have provided when requesting the generation, in inverse order of preference (_Parameters_ take precedence over the others, then _Variables_, and finally the _Configuration_ dictionary). That file should also include values inferred from other information stored in the application profile, such as tags.
- The `values.yaml` file is injected by the hub into a _Generator_ task pod that will execute the `helm template` command to render the assets.
- The generated assets are then placed in a branch of the repository associated with the application.
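A minimal sketch of that order of preference when building `values.yaml`, with hypothetical keys (_Parameters_ override _Variables_, which override the _Configuration_ dictionary):

```yaml
# Configuration dictionary (discovered, lowest preference):
#   replicas: 3
#   registry: registry.internal
# Generator Variables (fixed in the Generator):
#   registry: quay.io/acme
# User Parameters (highest preference):
#   replicas: 5

# Resulting values.yaml:
replicas: 5                  # from Parameters (overrides the Configuration dictionary)
registry: quay.io/acme       # from Variables (no Parameter override)
```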
Contributor

This would require giving Konveyor write access to a repository; as far as I know, it only needs read access today.

I wonder if being able to download the generated assets from the hub/UI might be a solution worth exploring.

This would allow users to put the files in a GitOps setup in another repo, or just to use them locally to test with before committing. They could even mutate the resources before committing.

Just something to consider; not tied to it one way or the other.

Contributor Author

@shawn-hurley AFAIK @jortel already has writing to a repository figured out.

Having everything committed to a repo seems cleaner to me, and a user can always make changes in the repo with a clear log of where each of those changes came from. If we were to allow users to download the files, it would be difficult to tell which parts came from Konveyor and which came from a manual change.

In the end, this is all about organizations being able to enforce standards. If someone wants to override some of those standards, then they should be accountable for that.


What about pushing a PR or MR? It would be up to the repo owners to merge the change, and we might not need write permission to the repository.

Contributor Author

That implies having to integrate with the different APIs of the most common Git hosting services out there: GitHub, GitLab, Bitbucket, Gitea... That means not only implementing but also maintaining compatibility with all these APIs over time, which would require a considerable investment. I don't think that is a priority at the moment considering the resources we have.

Contributor

I agree that would not necessarily be hard, but it adds a larger support burden than we would like.

We have talked about this offline, and one of the things we discussed is that this entire flow only works for source applications, not binaries. I think this makes sense for the first pass, and we can pivot if customers bring up issues. There is no need to boil the ocean when something that works can get into users' hands.


There will be platform-related fields in the Application entity. These fields should be considered optional, as applications can still be managed manually or via the CSV import without the need for awareness of the source platform.

A _Source Platform_ section (similar to the Source Code and Binary sections) should be included in the _Application Profile_, including the following fields:

@rromannissen is it possible to have multiple source platforms for a particular application? If we think of the current functionality around cloning the source from a Git repo as a "platform" (Git repository, there is an API, we populate the source code from this info, and the "analyze" phase is strictly the static code analysis), then we'd definitely need multiple. Maybe if there is an EAP app on k8s you would have two different source platforms, each responsible for its own details?

Contributor Author

Code retrieval remains part of the analysis phase, as repositories are not a platform on which the application is deployed, but rather a place where the application source code is stored. The analysis process should be able to surface configuration though, and I think we should (and can) leverage analysis findings (probably coming from Insights) to populate the configuration dictionary, aside from the technology tags used to automate archetype association as we do now. That should remain independent from the discovery process for different platforms described in this document.

For a "compound" scenario like EAP on K8s, I imagine having a dedicated discovery provider that can handle the specifics of that situation and retrieve information for both the k8s objects and the EAP configuration. Bear in mind that using a vanilla EAP discovery provider would not work for an EAP on k8s scenario, as some (if not all) of the EAP management APIs are disabled in the image.

Contributor

If I could add to that: we would probably have to start scanning container layers at that point to get the information out of them. This is not impossible, and there are many ways to do it, but it is not something we have implemented.

We should also consider this, but I think it is outside the scope of this enhancement.

Thoughts?


Do we consider multilayered apps (many source repos, different components deployed on more than one platform) to be in scope for this work?

Contributor Author

As per the cardinality we currently have in Konveyor, each component of a distributed application would be treated as what we call an application in the inventory. All components of the same distributed application could be related via runtime dependencies and common tags.

Contributor

Couldn't they be associated via a migration wave as well, or is that the wrong tool for the job?

Contributor Author

Migration Waves are meant to break the migration effort into different sprints to enable an iterative approach, so that's probably not the best tool for this. We discussed in the past the possibility of having the concepts of application and component as first-class entities, but that would require further changes in the API and UI/UX that I think go beyond the scope of this enhancement.


Based on our discussion yesterday, it seems they are mostly using stateless apps. With that said, it is OK to keep this out of scope for this work.


## Open Questions

- Should there be a dynamic way of registering Platform Types, Discovery Providers and Generator Types? Should that be managed by CRs, or could there be an additional mechanism? That would imply adding some dynamic behavior to the UI to render the different fields associated with each of them.


Dynamic behavior in the UI is solved by products like OCP with frontend plugins for different operators, or by plugins in RHDH (Backstage). I think the question should be: "which mechanism would be a good match for the existing architecture?"

Contributor

If we are just doing dynamic fields, I think it would make sense to focus on that, as it is a much more constrained problem (read the OpenAPI spec for a "thing" to determine the type and render the right field for that type). Having a full frontend plugin system is hard IMO, and if we don't need it we shouldn't focus on it.

In the future we may need it, but I think we should do that work when it becomes an acute problem users are feeling.


I am OK with limiting the scope. Based on this open question it was not clear to me what we intend to provide. Still, "just" dynamic fields may grow beyond our initial design.



- Hypervisors and VMs.
- Others...
- Assets generation:
  - Flexible enough to generate all assets required to deploy an application on k8s (and potentially other platforms in the future)


Applications may be multilayered, with complex deployments running on different platforms, like a stateless web service (PCF) plus a DB or cache (VMs). Should we limit ourselves to only parts of the app, or attempt to generate all the deployment assets? Depending on our choices we may or may not need to think about network layout and corresponding manifests.

Contributor Author

How deep we go would totally depend on the Discovery Provider logic and the Helm charts (and potentially other templating technologies in the future) associated with the generator for the target platform. The goal is to provide a framework that enables us and users to do this in a structured way.


## Proposal

### Personas / Actors


Do we see a place for SREs or platform engineering in this effort?

Contributor Author

That is something I would consider once we expose this functionality via the Backstage plugin.


I raised this question since the original intention was to connect to the source runtime as well as to make sure the application will run in the target runtime without issues. This clearly requires work from SREs to configure access, CI/CD, etc., although based on our discussion with the customer we know it is not the highest priority at the moment.




##### Discovery Providers

Abstraction layer responsible for collecting configuration about an application on a given platform:


This assumes network connectivity to a platform/agent and admin-level permissions. Is this something we can expect? What should be the process for an agent to be deployed/installed?

Contributor Author

Yes, for the live connection approach we'll need some valid credentials and network access. Agents will have to be deployed by the infrastructure teams managing the platforms and exposed to the Hub somehow (TBD).

- *Initial discovery*:
  - The Configuration dictionary gets populated with non-sensitive data. Sensitive data gets redacted or defaults to dummy values.
- *Template instantiation*:
  - A second discovery retrieval happens to obtain the sensitive data and inject it into the instantiated templates (the actual generated assets) without storing the data in the Configuration dictionary.
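A minimal sketch of that two-phase handling with a hypothetical datasource secret (all names illustrative):

```yaml
# Phase 1 - initial discovery: what gets stored in the Configuration dictionary
datasource:
  url: jdbc:postgresql://db.internal:5432/inventory
  username: app_user
  password: "<redacted>"        # sensitive value is never stored

# Phase 2 - template instantiation: what lands in the generated asset
# (the secret is fetched again at generation time, injected, never persisted)
stringData:
  password: s3cr3t-example      # hypothetical injected value
```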


Is there a need to protect access to generated assets containing sensitive data?

Contributor Author

Very likely, but that would be the responsibility of the user generating the assets, meaning that they should take care of storing the assets in a secured repository.




##### Repository Augmentation

- Generated assets could be stored in a branch of the target application repository or, if needed, in a separate configuration repository if the application has adopted a GitOps approach to configuration management.


Generated assets could be stored in a branch, but we need to keep in mind that we may have sensitive information added as part of asset generation. I am not sure whether it is a good idea to store that in a repository.

Contributor Author

That's exactly how things are done in a full GitOps approach, with configuration for different environments being stored in different configuration repositories with different security levels. Nevertheless, I think it might be interesting to add an additional parameter for Template Instantiation that allows the user to prevent sensitive data from being injected into the generated assets.
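For illustration, a hypothetical generation request with such a parameter (the field names are made up, not part of the proposal):

```yaml
# Hypothetical Template Instantiation request:
application: inventory-service
generator: openshift-base
parameters:
  replicas: 5
injectSensitiveData: false    # keep redacted placeholders instead of real secrets
```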


- Hypervisors and VMs.
- Others...
- Assets generation:
  - Flexible enough to generate all assets required to deploy an application on k8s (and potentially other platforms in the future)
@istein1 Dec 10, 2024

There can be all sorts of assets, and it might be tricky to support every kind.
In case discovery detects an asset Konveyor doesn't have in its arsenal, could we consider asking the user to provide that asset's source so that Konveyor could generate it?
Or maybe I'm getting this wrong, and Konveyor is good with any asset: it would propagate it into a Helm chart and then the CI/CD would make sure the asset is installed?

Contributor Author

Discovery providers will discover configuration and have it stored in canonical form, and then the generators will generate assets for different target platforms. Considering we will be in control of the discovery providers and generators we ship out of the box, we should take special care to coordinate them to tackle meaningful migration paths such as Cloud Foundry to Kubernetes (meaning shipping a CF discovery provider and a default Kubernetes generator).

- Managed in the administration perspective
- Potential fields:
  - Name
  - Platform Type (Kubernetes, Cloud Foundry, EAP, WebSphere…)

Does this list contain all the supported platforms?
Asking in terms of the design and infra needed to test this.

Contributor Author

Those are just examples. In a first iteration we should focus on Kubernetes and Cloud Foundry.


### Test Plan

TBD

@rromannissen, could you please suggest one high-level end-to-end test for a common use case? I think that would provide more clarification on what the tests should focus on.


## Design Details

### Test Plan

@mguetta1 @ibragins @nachandr,
Would you please add questions/thoughts/ideas on testing here?
