jina-ai/terraform-jcloud-aws-infra

Terraform-JCloud-AWS-Infra

Terraform module that creates the JCloud infrastructure on AWS, built on EKS (Kubernetes) resources.

Infrastructure:

The module includes the infrastructure and submodules needed to support various JCloud features.

The examples under examples/ demonstrate different configurations and settings that can be used with this module. However, these examples are not representative of a production cluster.

Components:

Components are the Kubernetes tools and software that support JCloud features.

  • Knative (application autoscaling)
  • Kong (Ingress gateway)
  • Linkerd (Service Mesh)
  • External-dns (External DNS registration)
  • Karpenter (node autoscaling)
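
Each component above appears to map to an enable_* input listed in the Inputs section. The following is a minimal sketch of switching them on, assuming the module is consumed as in the Usage section; the cluster and network values are the same illustrative ones used there, and the comments on defaults are taken from the Inputs table.

```hcl
module "jcloud" {
  source  = "jina-ai/aws-infra/jcloud"
  version = "0.0.1"

  region       = "us-east-1"
  cluster_name = "jcloud-dev"
  vpc_name     = "jcloud-dev-vpc"
  eks_version  = "1.27"

  cidr            = "10.200.0.0/20"
  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  public_subnets  = ["10.200.6.0/24", "10.200.7.0/24", "10.200.8.0/24"]
  private_subnets = ["10.200.0.0/23", "10.200.2.0/23", "10.200.4.0/23"]

  # Component toggles (names and defaults as listed in the Inputs table).
  enable_knative      = true # application autoscaling, default false
  enable_kong         = true # ingress gateway, default true
  enable_linkerd      = true # service mesh, default true
  enable_external_dns = true # external DNS registration, default false
  enable_karpenter    = true # node autoscaling, default false
}
```

Some components may need further inputs before they work as intended (for example domain_filters for External-dns); that is an assumption to verify against the module's variables.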

Usage

data "aws_partition" "current" {}
data "aws_caller_identity" "current" {}

################################################################################
# k8s Module
################################################################################

module "jcloud" {
  source  = "jina-ai/aws-infra/jcloud"
  version = "0.0.1"

  region       = "us-east-1"
  cluster_name = "jcloud-dev"

  vpc_name    = "jcloud-dev-vpc"
  eks_version = "1.27"

  cidr            = "10.200.0.0/20"
  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  public_subnets  = ["10.200.6.0/24", "10.200.7.0/24", "10.200.8.0/24"]
  private_subnets = ["10.200.0.0/23", "10.200.2.0/23", "10.200.4.0/23"]

  kms_key_owners = [data.aws_caller_identity.current.arn]

  eks_admin_users = [data.aws_caller_identity.current.arn]

  enable_cert_manager = false
  enable_kong         = true
  enable_linkerd      = true

  tags = var.tags # var.tags must be declared in the calling configuration
}
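
The usage example reads var.tags, which has to be declared in the calling configuration. A possible declaration is sketched below; the type matches the module's tags input (map(string)), while the description text and the two default tag keys are hypothetical values chosen for illustration.

```hcl
# Declaration backing the var.tags reference in the usage example.
variable "tags" {
  description = "Tags applied to the AWS resources created by the jcloud module"
  type        = map(string)

  # Hypothetical defaults; replace with your own tagging scheme.
  default = {
    Environment = "dev"
    ManagedBy   = "terraform"
  }
}
```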

Examples

  • Minimal: JCloud cluster only with ingress controller.

Requirements

Name Version
terraform >= 1.3.0
aws >= 4.47
helm >= 2.4
kubectl >= 1.14
random >= 2.1.2
tls ~> 3.0
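
The version constraints above can be mirrored in the calling configuration. The block below is a sketch only: the table does not name provider source addresses, so every source value here is an assumption, in particular gavinbunney/kubectl for the kubectl provider. Check the module's versions.tf before pinning.

```hcl
terraform {
  required_version = ">= 1.3.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws" # assumed source address
      version = ">= 4.47"
    }
    helm = {
      source  = "hashicorp/helm" # assumed source address
      version = ">= 2.4"
    }
    kubectl = {
      # Assumption: the community kubectl provider; the table gives no source.
      source  = "gavinbunney/kubectl"
      version = ">= 1.14"
    }
    random = {
      source  = "hashicorp/random" # assumed source address
      version = ">= 2.1.2"
    }
    tls = {
      source  = "hashicorp/tls" # assumed source address
      version = "~> 3.0"
    }
  }
}
```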

Providers

Name Version
aws >= 4.47
helm >= 2.4
kubectl >= 1.14
kubernetes n/a
time n/a

Modules

Name Source Version
alb-controller ./modules/aws/alb-controller n/a
autoscaler ./modules/aws/cluster-autoscaler n/a
cert_manager ./modules/general/cert-manager n/a
eks terraform-aws-modules/eks/aws 19.16.0
eks-ebs-csi ./modules/aws/k8s-ebs-csi n/a
eks-efs-csi ./modules/aws/k8s-efs-csi n/a
eks_managed_node_group terraform-aws-modules/eks/aws//modules/eks-managed-node-group n/a
external-dns ./modules/general/external-dns n/a
karpenter_irsa terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks ~> 5.2.0
knative ./modules/general/knative n/a
kong ./modules/general/kong n/a
kubecost ./modules/general/kubecost n/a
linkerd ./modules/general/linkerd n/a
monitor ./modules/general/monitor n/a
nvidia_plugin ./modules/nvidia n/a
vpc terraform-aws-modules/vpc/aws ~> 4.0
vpc_endpoint_security_group terraform-aws-modules/security-group/aws ~> 4.0
vpc_endpoints terraform-aws-modules/vpc/aws//modules/vpc-endpoints ~> 3.0

Resources

Name Type
aws_iam_instance_profile.karpenter resource
aws_launch_template.gpu resource
aws_launch_template.gpu_shared resource
aws_launch_template.karpenter resource
aws_launch_template.system resource
helm_release.karpenter resource
helm_release.metrics_server resource
kubectl_manifest.karpenter_provisioner resource
kubectl_manifest.karpenter_provisioner_gpu resource
kubectl_manifest.karpenter_provisioner_gpu_shared resource
kubectl_manifest.karpenter_provisioner_privileged resource
kubectl_manifest.karpenter_provisioner_system resource
kubectl_manifest.wolf_resources resource
kubernetes_config_map_v1_data.coredns-domain resource
time_sleep.ng resource
time_sleep.this resource
aws_ami.eks_node data source
aws_ami.eks_node_gpu data source
aws_caller_identity.current data source
aws_eks_cluster.cluster data source
aws_eks_cluster_auth.auth data source
aws_partition.current data source
aws_region.current data source
kubectl_file_documents.wolf data source

Inputs

Name Description Type Default Required
alertmanager_config_yaml_body Prometheus' Alertmanager Values in YAML Format string "" no
app_ref Suffix of Project Name of the AWS Resource string "" no
aws_auth_fargate_profile_pod_execution_role_arns List of Fargate profile pod execution role ARNs to add to the aws-auth configmap list(string) [] no
aws_auth_node_iam_role_arns_non_windows List of non-Windows based node IAM role ARNs to add to the aws-auth configmap list(string) [] no
aws_auth_node_iam_role_arns_windows List of Windows based node IAM role ARNs to add to the aws-auth configmap list(string) [] no
azs A list of availability zones in the region list(string) [] no
certs JCloud ingress certs list(map(string)) [] no
cidr The CIDR block for the VPC. The default value is a valid CIDR but is not accepted by AWS and should be overridden string "0.0.0.0/0" no
cluster_name Project Name of the AWS Resources string "" no
cluster_service_ipv4_cidr The CIDR block to assign Kubernetes service IP addresses from. If you don't specify a block, Kubernetes assigns addresses from either the 10.100.0.0/16 or 172.20.0.0/16 CIDR blocks string null no
create_buckets Whether to create the JCloud monitor buckets bool true no
create_cluster_security_group Determines if a security group is created for the cluster. Note: the EKS service creates a primary security group for the cluster by default bool true no
create_kms_key Controls if a KMS key for cluster encryption should be created bool true no
create_kubecost_metrics_buckets Whether to create the Kubecost metrics bucket bool false no
create_node_security_group Determines whether to create a security group for the node groups or use the existing node_security_group_id bool true no
domain_filters The domain filters for external dns string "{wolf.jina.ai,dev.jina.ai,docsqa.jina.ai}" no
ebs_binding_mode EBS Storage class binding mode string "Immediate" no
efs_binding_mode EFS Storage class binding mode string "Immediate" no
eks_admin_roles eks admin roles list(string) [] no
eks_admin_users eks admin user list(string) ["jcloud-eks-user"] no
eks_custom_roles eks custom roles map(string) {} no
eks_custom_users eks custom user map(string) {} no
eks_managed_node_group_defaults Map of EKS managed node group default configurations any {} no
eks_readonly_roles eks readonly roles list(string) [] no
eks_readonly_users eks readonly user list(string) [] no
eks_version EKS version string "" no
enable_alb_controller Whether to enable the ALB controller in EKS bool false no
enable_cert_manager Whether to create the cert manager role for the service account bool true no
enable_cluster_autoscaler Whether to enable the cluster autoscaler bool false no
enable_ebs Whether to enable ebs bool false no
enable_efs Whether to enable efs bool false no
enable_external_dns Whether to enable external dns bool false no
enable_gpu Whether to enable GPU bool false no
enable_grafana Whether Grafana is Enabled bool false no
enable_karpenter Whether to enable karpenter bool false no
enable_knative Whether to enable Knative bool false no
enable_kong Whether to enable Kong bool true no
enable_kubecost Whether to enable Kubecost bool false no
enable_linkerd Whether to enable Linkerd bool true no
enable_logging If set to true, Loki and Promtail will be enabled, and corresponding toggles (i.e. enable_loki, enable_promtail) will be overwritten bool false no
enable_loki Whether Loki is enabled bool false no
enable_metrics If set to true, Prometheus, Thanos and DCGM Exporter will be enabled, and corresponding toggles (i.e. enable_prometheus, enable_thanos, enable_dcgm_exporter) will be overwritten bool false no
enable_monitor Whether to enable JCloud monitoring, such as Prometheus and Loki bool false no
enable_monitor_store Whether to enable the JCloud monitor S3 store and related IAM roles bool false no
enable_otlp_collector Whether to enable OTLP Collector bool false no
enable_prometheus Whether Prometheus is Enabled bool false no
enable_promtail Whether Promtail is enabled bool false no
enable_tempo Whether to enable Tempo for tracing bool false no
enable_thanos Whether Thanos is Enabled bool false no
enable_tracing If set to true, Tempo and OTLP Collector will be enabled, and corresponding toggles (i.e. enable_tempo, enable_otlp_collector) will be overwritten bool false no
gpu_instance_type A list of EC2 instance types for dedicated GPU usage list(string) ["g5.xlarge", "g5.2xlarge", "g5.4xlarge", "g5.12xlarge"] no
gpu_node_labels Karpenter accelerator type for GPU map(any) {} no
grafana_additional_data_sources_yaml_body (Optional) Grafana Additional Data Sources List in YAML Format. If not provided, use default data sources string "" no
grafana_admin_password Grafana Admin Password string "" no
grafana_database Grafana Database Credentials map(string) {"host": "", "password": "", "type": "", "user": ""} no
grafana_ingress_class_name Grafana Ingress Class Name. Ignored if grafana_ingress_yaml_body is set. string "kong" no
grafana_ingress_tls_secret_name Grafana Ingress TLS Secret Name. Ignored if grafana_ingress_yaml_body is set string "" no
grafana_ingress_yaml_body Grafana Ingress Values in YAML Format. This overwrites grafana_ingress_tls_secret_name and grafana_ingress_class_name string "" no
grafana_server_domain Grafana Server Domain string "" no
init_node_type A list of EC2 instance types for the init node group list(string) ["t3.medium"] no
karpenter_consolidation_enable Whether to enable consolidation on Karpenter bool false no
kms_key_administrators A list of IAM ARNs for key administrators. If no value is provided, the current caller identity is used to ensure at least one key admin is available list(string) [] no
kms_key_owners A list of IAM ARNs for those who will have full key permissions (kms:*) list(string) [] no
kms_key_users A list of IAM ARNs for key users list(string) [] no
kubecost_athena_bucket Kubecost athena bucket url string "" no
kubecost_athena_region Kubecost athena bucket region string "us-east-1" no
kubecost_grafana_host Kubecost grafana host string "" no
kubecost_master Whether this is the Kubecost master bool true no
kubecost_metric_buckets Kubecost metrics bucket string "" no
kubecost_s3_region Kubecost metrics bucket region string "us-east-1" no
log_bucket Jcloud log bucket name string "" no
log_bucket_region Log Bucket Region string "" no
loki_overwrite_values_yaml_body Overwrite Loki Values in YAML. Please refer to https://github.com/grafana/loki/blob/main/production/helm/loki/values.yaml for all possible values you can set. string "" no
metrics_bucket Jcloud metrics bucket name string "" no
metrics_bucket_region Metrics S3 Bucket Region string "" no
monitor_iam_access_key_id Monitor IAM Access Key ID string "" no
monitor_iam_access_key_secret Monitor IAM Access Key Secret string "" no
node_groups Map of EKS managed node group definitions to create any {} no
node_security_group_id ID of an existing security group to attach to the node groups created string "" no
otlp_collector_overwrite_values_yaml_body Overwrite OTLP Collector Values in YAML string "" no
otlp_endpoint OTLP Endpoint string "kube-tempo-distributor:4317" no
private_subnets A list of private subnets inside the VPC list(string) [] no
prometheus_otlp_collector_scrape_endpoint OTLP Collector Scrape Endpoint string "kube-otlp-collector-opentelemetry-collector.monitor.svc.cluster.local:8888" no
prometheus_stack_overwrite_values_yaml_body Overwrite Prometheus-Stack Values in YAML. Please refer to https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml for all possible values you can set. string "" no
promtail_clients_urls Promtail client URLs to push logs to list(string) ["http://kube-loki.monitor.svc.cluster.local:3100/loki/api/v1/push"] no
promtail_overwrite_values_yaml_body Overwrite Promtail Values in YAML string "" no
public_subnets A list of public subnets inside the VPC list(string) [] no
region Region of the AWS resources string "us-east-1" no
remote_cert_manager_role Remote cert manager role string "" no
remote_external_dns_role Remote AWS external DNS role string "" no
shared_gpu_instance_type A list of EC2 instance types for shared GPU usage list(string) ["g5.xlarge", "g5.2xlarge", "g5.4xlarge"] no
shared_gpu_node_labels Karpenter accelerator type for shared GPU map(any) {} no
shared_gpu_slicing_replicas Shared GPU slice number number 3 no
tags Tags for AWS Resource map(string) {} no
tempo_overwrite_values_yaml_body Overwrite Tempo Values in YAML. Please refer to https://github.com/grafana/helm-charts/blob/main/charts/tempo-distributed/values.yaml for all possible values you can set. string "" no
thanos_object_storage_config_key Thanos object storage config key string "objstore.yml" no
thanos_object_storage_config_name Thanos object storage name string "jcloud-monitor-store" no
thanos_overwrite_values_yaml_body Thanos Overwrite Values in YAML string "" no
traces_bucket Jcloud traces bucket name string "" no
traces_bucket_region Traces S3 Bucket Region string "" no
vpc_cni_version EKS VPC CNI addon version string "" no
vpc_name Name to be used on all the resources as identifier string "" no
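
The descriptions of enable_logging, enable_metrics and enable_tracing say that each group toggle overrides its individual toggles. The sketch below turns on all three groups. It assumes that enable_monitor and enable_monitor_store also need to be set for the monitoring stack and its S3 store to be deployed, which the table does not state. The cluster and network values repeat the illustrative ones from the Usage section, and bucket and Grafana inputs are left at their defaults.

```hcl
module "jcloud" {
  source  = "jina-ai/aws-infra/jcloud"
  version = "0.0.1"

  region       = "us-east-1"
  cluster_name = "jcloud-dev"
  vpc_name     = "jcloud-dev-vpc"
  eks_version  = "1.27"

  cidr            = "10.200.0.0/20"
  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  public_subnets  = ["10.200.6.0/24", "10.200.7.0/24", "10.200.8.0/24"]
  private_subnets = ["10.200.0.0/23", "10.200.2.0/23", "10.200.4.0/23"]

  # Assumption: these switches gate the monitoring stack and its S3 store.
  enable_monitor       = true
  enable_monitor_store = true

  # Group toggles; per the Inputs table each one overrides its members.
  enable_logging = true # Loki and Promtail
  enable_metrics = true # Prometheus, Thanos and DCGM Exporter
  enable_tracing = true # Tempo and OTLP Collector

  # Overridden by enable_logging above, according to its description.
  enable_loki = false
}
```

With this configuration, Loki would still be enabled despite enable_loki = false, if the override behaves as the enable_logging description says.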

Outputs

Name Description
aws_auth_configmap_yaml [DEPRECATED - use var.manage_aws_auth_configmap] Formatted yaml output for base aws-auth configmap containing roles used in cluster node groups/fargate profiles
azs A list of availability zones specified as argument to this module
cert_manager_irsa_arn cert manager service account IAM Role ARN
cloudwatch_log_group_arn ARN of the CloudWatch log group created
cloudwatch_log_group_name Name of the CloudWatch log group created
cluster_addons Map of attribute maps for all EKS cluster addons enabled
cluster_arn The Amazon Resource Name (ARN) of the cluster
cluster_certificate_authority_data Base64 encoded certificate data required to communicate with the cluster
cluster_endpoint Endpoint for your Kubernetes API server
cluster_iam_role_arn IAM role ARN of the EKS cluster
cluster_iam_role_name IAM role name of the EKS cluster
cluster_iam_role_unique_id Stable and unique string identifying the IAM role
cluster_id The name/id of the EKS cluster. Will block on cluster creation until the cluster is really ready
cluster_identity_providers Map of attribute maps for all EKS identity providers enabled
cluster_name The name of the cluster
cluster_oidc_issuer_url The URL on the EKS cluster for the OpenID Connect identity provider
cluster_platform_version Platform version for the cluster
cluster_primary_security_group_id Cluster security group that was created by Amazon EKS for the cluster. Managed node groups use this security group for control-plane-to-data-plane communication. Referred to as 'Cluster security group' in the EKS console
cluster_security_group_arn Amazon Resource Name (ARN) of the cluster security group
cluster_security_group_id ID of the cluster security group
cluster_status Status of the EKS cluster. One of CREATING, ACTIVE, DELETING, FAILED
cluster_version The Kubernetes version for the cluster
efs_dns_name The DNS name for the filesystem
efs_id The ID that identifies the file system (e.g., fs-ccfc0d65).
efs_irsa_arn efs service account IAM Role ARN
eks_managed_node_groups Map of attribute maps for all EKS managed node groups created
loki_yaml_body YAML of Loki
monitor_iam_access_key_id The access key ID
monitor_iam_access_key_secret The access key secret
monitor_iam_role_arn ARN of IAM role
monitor_iam_role_name Name of IAM role
monitor_iam_user_arn The ARN assigned by AWS for this user
monitor_iam_user_name The user's name
mount_target_dns_name The DNS name for the given subnet/AZ
node_security_group_arn Amazon Resource Name (ARN) of the node shared security group
node_security_group_id ID of the node shared security group
oidc_provider The OpenID Connect identity provider (issuer URL without leading https://)
oidc_provider_arn The ARN of the OIDC Provider if enable_irsa = true
private_subnet_arns List of ARNs of private subnets
private_subnets List of IDs of private subnets
prometheus_stack_yaml_body YAML of prometheus stack
promtail_yaml_body YAML of Promtail
public_subnet_arns List of ARNs of public subnets
public_subnets List of IDs of public subnets
region Region of the AWS resources
tempo_yaml_body YAML of Tempo
vpc_arn The ARN of the VPC
vpc_cidr_block The CIDR block of the VPC
vpc_id ID of the VPC
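
The outputs above can be consumed from the calling configuration. The following sketch assumes a module "jcloud" block like the one in the Usage section exists in the same root module. It configures a kubernetes provider for the caller's own resources and re-exports two outputs. The data source label "this" and the root output names are hypothetical choices.

```hcl
# Short-lived authentication token for the cluster created by the module.
data "aws_eks_cluster_auth" "this" {
  name = module.jcloud.cluster_name
}

# Kubernetes provider for resources managed outside the module.
provider "kubernetes" {
  host                   = module.jcloud.cluster_endpoint
  cluster_ca_certificate = base64decode(module.jcloud.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}

# Re-export selected module outputs from the root module.
output "cluster_endpoint" {
  description = "Endpoint of the Kubernetes API server"
  value       = module.jcloud.cluster_endpoint
}

output "vpc_id" {
  description = "ID of the VPC created for the cluster"
  value       = module.jcloud.vpc_id
}
```

The certificate data is decoded with base64decode because the cluster_certificate_authority_data output is described as Base64 encoded.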