Replies: 4 comments
-
Cool resources I found: https://cloud.google.com/architecture. I also realize that I am trying to walk a fine line, with suggestions hovering between being more efficient/useful and being overkill for our needs.
-
https://cloud.google.com/architecture/framework/cost-optimization/storage In thinking about migrating from VM storage to Cloud Storage (see also #894), I think it would be nice to have different buckets (or block stores) for different categories of stored files in gEAR. The bucket mount paths can all be configured in the gear.ini file. Determining what can be shared between the production and devel instances, and what needs to be split, can dramatically cut costs by reducing redundancy.
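As a rough illustration (the section and key names below are hypothetical, not the actual gear.ini layout), the per-purpose mount paths could be read with configparser:

```python
# Hypothetical sketch: reading per-purpose bucket mount paths from gear.ini.
# Section and key names are illustrative, not the real gear.ini schema.
from configparser import ConfigParser

config = ConfigParser()
config.read("gear.ini")
storage = config["storage"] if config.has_section("storage") else {}

# Each category of files gets its own mount point (backed by its own bucket),
# so production and devel can share some mounts and split others.
dataset_mount = storage.get("datasets_mount", "/var/www/datasets")
annotation_mount = storage.get("annotations_mount", "/var/www/annotations")
upload_mount = storage.get("uploads_mount", "/var/www/uploads")
```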
As mentioned in the opening comment, being able to separate these from the VM, along with other files the user is not directly or indirectly adding (annotations, feature mappings, etc.), would also help make updating the VM code a more automatic process.
-
https://cloud.google.com/architecture/application-development/dynamic-app-python is a proposed architecture adjustment. Of course we would still use our existing choices instead of the proposed ones, such as Flask instead of Django, but I like the structure. It has a good CI/CD system with Cloud Build and Artifact Registry that we could take advantage of (I already have a Dockerfile set up, albeit monolithic). Interesting to note is that it seems to use Cloud Run to spawn a container for the API (if one is not already up), triggered when the API request comes in from the front end. This means we would have to use Cloud Run logging to monitor potential API errors, but theoretically we could scale API calls horizontally, which isn't really an issue at the moment. Secret Manager could potentially eliminate some of the values added to gear.ini. We also would not use Firebase... we could continue using our existing VM, slimmed down, or explore App Engine, which means less management by us. Cloud Run is even an option, as it scales the frontend based on requests too (see https://cloud.google.com/architecture/application-development/three-tier-web-app). Again, possibly overkill, but it could save costs and set gEAR up for future decoupling. Also found this whole set of migration pages -> https://cloud.google.com/architecture/migration-to-gcp-getting-started
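For context on what running the API on Cloud Run would look like, here is a minimal sketch of a Flask entry point that honors the PORT environment variable Cloud Run injects; the route name and module layout are made up for illustration, not taken from the gEAR codebase.

```python
# Minimal sketch of a Flask service the way Cloud Run expects it:
# listen on 0.0.0.0 and the port given by the PORT env var (default 8080).
# The /api/healthcheck route is a placeholder, not an actual gEAR endpoint.
import os
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/healthcheck")
def healthcheck():
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```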
-
Architecture improvement ideas
This space is going to be a dumping ground of ideas I had about how to improve gEAR architecture.
Automatic deployments
One of the issues that came to light is that our current development areas are deployed manually. This includes manually updating packages on each VM as well as the codebase, which can lead to inconsistencies between each devel environment and the local environments used by each developer. Manual updates can also lead to missed steps and to technical debt when bugs occur.
One solution I thought of was to have a gold image that is updated (automatically or manually) to the latest content and can be cloned to a devel image. Once it is working, that same devel image can become the new production VM via a tag or URL switchover, as provided by Google Cloud configuration options.
Decoupling services
In order to make a clonable VM that can benefit from automatic updates, there are some parts of the current gEAR VM that can be split off into their own services. Ideally we can use the gEAR config file to decide a) whether the service is local or through GCP and b) where the service is located (i.e. the URI route); a rough sketch of that switch follows.
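This is only an illustration of the idea; the section names, keys, and defaults below are hypothetical and not taken from the current gear.ini.

```python
# Hypothetical sketch: decide per service whether to use a local resource or a
# GCP-hosted one, based on gear.ini. Section and key names are illustrative only.
from configparser import ConfigParser

config = ConfigParser()
config.read("gear.ini")

def service_location(name, local_default):
    """Return ("local", path) or ("gcp", uri) for a named service."""
    if config.has_section(name):
        mode = config.get(name, "mode", fallback="local")      # "local" or "gcp"
        uri = config.get(name, "uri", fallback=local_default)  # path or URI route
    else:
        mode, uri = "local", local_default
    return mode, uri

# e.g. the database may point at a local socket or a Cloud SQL instance,
# and storage may point at a local path or a bucket mount
db_mode, db_uri = service_location("database", "localhost:3306")
storage_mode, storage_uri = service_location("storage", "/var/www/datasets")
```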
Services/components that can be split off of the VM to make for easier cloning
EDIT: Cloud Storage FUSE does not support concurrent writes (there is no file locking), so if multiple write events occur, the last one wins. This should not matter for primary datasets we do not intend to change, but it is worth noting.
Quick sources
By storing these independently of the VM, we can connect them to any potential new cloned VM and choose which data sources we want to investigate (devel db vs. prod db for gEAR or NeMO, for instance). And to accommodate easier setups for other groups looking to clone our setup, we can set the configuration file to use a "local" version of each service so the install is more monolithic for that group. I have noticed that the GCP service Python packages generally mirror the design of the "local" ones (i.e. the cloud-sql package uses mysql-connector code)... we would just need an adaptor to switch the package source.
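As a sketch of that adaptor idea (assuming MySQL via mysql-connector-python locally and the Cloud SQL Python Connector on GCP; the instance name and connection parameters are placeholders):

```python
# Hypothetical adaptor: return a MySQL connection either from the local
# mysql-connector driver or from the Cloud SQL Python Connector, depending
# on configuration. Connection details below are placeholders.
def get_db_connection(use_gcp, user, password, database):
    if use_gcp:
        # Cloud SQL Python Connector (cloud-sql-python-connector package)
        from google.cloud.sql.connector import Connector
        connector = Connector()
        return connector.connect(
            "my-project:my-region:my-instance",  # placeholder instance name
            "pymysql",
            user=user,
            password=password,
            db=database,
        )
    # Local/self-hosted MySQL via mysql-connector-python
    import mysql.connector
    return mysql.connector.connect(
        host="localhost", user=user, password=password, database=database
    )
```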
Another benefit is the ability to generate custom logging and monitoring information for the decoupled files and database components. It is also worth noting that GCP now supports some cross-project functionality, so it may be possible to avoid duplicating efforts across gEAR and NeMO Analytics.
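For instance (a minimal sketch assuming the google-cloud-logging client library; the logger name and message are placeholders), a decoupled service could route its standard Python logging to Cloud Logging:

```python
# Minimal sketch: send standard Python logging to Cloud Logging using the
# google-cloud-logging client library. Logger name and message are placeholders.
import logging
import google.cloud.logging

client = google.cloud.logging.Client()
client.setup_logging()  # attaches a Cloud Logging handler to the root logger

logging.getLogger("gear.storage").info("dataset bucket mounted")
```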
Sort of unrelated, but I learned that Google Cloud Run is going to have a feature where it will auto-update packages for your Cloud Run deployment, which takes a lot of the headache out of doing this. Great for security patching and so on.
Setup documentation improvements
Reorganize the documentation into a single document, or at least refactor the "setup_new_server_notes.md" page so it reads better (more wiki-like).
(Personal opinion.) Use the committed Docker requirements.txt file as the ground truth for pip installation, instead of having a separate list of packages listed individually in a document. Installing with
pip install -r requirements.txt
is very handy for installing large numbers of packages. We can even alias (symlink) the requirements.txt file back to the docker directory or the docs directory. I am suggesting the requirements.txt file because I generally keep it more up-to-date, since I keep my Docker image in line with the VMs, and if a package is incorrect I notice it very quickly when sandboxing/testing.
Executable code in the setup markdown files should be moved to separate executable shell scripts, with the markdown referencing those scripts. It is worth noting that if we implement the "clone VM" approach from earlier, server setup/creation should be a one-time affair.
Infrastructure as code for smartly provisioning services
I was made aware of this concept at GitHub Universe, and it was brought up in a couple of talks at Google Cloud Next. The idea is that instead of manually configuring and deploying a service, you have code write the configuration files that are then used to deploy the service. The use case I can think of is uploading gEAR files, where some pre-knowledge of the file can be used to write a configuration file that deploys the right VM to perform the upload. (Yes, I realize the upload code has been improved, so this example may be out of date.) See example at https://www.freecodecamp.org/news/what-is-infrastructure-as-code/
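As a toy illustration of that upload use case (everything here, including the thresholds, machine types, and output format, is hypothetical and only meant to show "code writes the deployment config"):

```python
# Toy illustration of infrastructure-as-code for uploads: pick a machine type
# from pre-knowledge of the upload (here, just its size) and emit a config file
# that a provisioning tool could consume. All values are hypothetical.
import json
import os

def write_upload_config(upload_path, out_path="upload_vm_config.json"):
    size_gb = os.path.getsize(upload_path) / 1e9
    machine_type = "e2-highmem-8" if size_gb > 5 else "e2-standard-2"
    config = {
        "machine_type": machine_type,
        "disk_gb": max(20, int(size_gb * 3)),
        "source_file": upload_path,
    }
    with open(out_path, "w") as fh:
        json.dump(config, fh, indent=2)
    return config
```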
Other things
If I think of new suggestions, I'll add them below; I just wanted to get something on paper. The summary is that I feel we are doing a lot of manual work that could be automated, which would keep environments in sync, reduce the technical debt of debugging errors in "devel", get us onto new package versions more quickly, and move us toward a more modern cloud-based workflow while keeping the flexibility to stay local or hybrid if needed.