make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=vanilla
test tagcluster || (echo Please export CLUSTER_NAME variable ; exit 1)
test us-east-1 || (echo Please export CLUSTER_REGION variable ; exit 1)
aws eks update-kubeconfig --name tagcluster --region us-east-1
Updated context arn:aws:eks:us-east-1:xxxxxxxxxxxxx:cluster/tagcluster in /home/karthik/.kube/config
yq e '.cluster.name=env(CLUSTER_NAME)' -i tests/e2e/utils/ack_sm_controller_bootstrap/config.yaml
yq e '.cluster.region=env(CLUSTER_REGION)' -i tests/e2e/utils/ack_sm_controller_bootstrap/config.yaml
cd tests/e2e && PYTHONPATH=.. python3.8 utils/ack_sm_controller_bootstrap/setup_sm_controller_req.py
=================================================================
=================================================================
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
An error occurred (EntityAlreadyExists) when calling the CreateRole operation: Role with name kf-ack-sm-controller-role-tagcluster already exists.
Try running cleanup_sm_controller_req.py
Writing params.env for ACK SageMaker Controller
=================================================================
Params file written to : ../../awsconfigs/common/ack-sagemaker-controller/params.env
Editing ./utils/ack_sm_controller_bootstrap/config.yaml with appropriate values...
Config file written to : ./utils/ack_sm_controller_bootstrap/config.yaml
=================================================================
cd tests/e2e && PYTHONPATH=.. python3.8 utils/kubeflow_installation.py --deployment_option vanilla --installation_option helm --pipeline_s3_credential_option irsa --cluster_name tagcluster
tagcluster
Installing kubeflow vanilla deployment with helm with irsa
=================================================================
==========Installing cert-manager==========
"jetstack" has been added to your repositories
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "eks" chart repository
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
Release "cert-manager" does not exist. Installing it now.
NAME: cert-manager
LAST DEPLOYED: Mon Mar 3 00:41:09 2025
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.10.1 has been deployed successfully!
In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
More information on the different types of issuers and how to configure them
can be found in our documentation:
https://cert-manager.io/docs/configuration/
For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the ingress-shim
documentation:
https://cert-manager.io/docs/usage/ingress/
Waiting for cert-manager pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app.kubernetes.io/instance in (cert-manager)' --timeout=240s -n cert-manager
pod/cert-manager-5f58985b79-nzp9g condition met
pod/cert-manager-cainjector-5cdbcddbc8-7frkq condition met
pod/cert-manager-webhook-5788d8d7c6-gl5mg condition met
All cert-manager pods are running!
==========Installing istio==========
Release "istio" does not exist. Installing it now.
NAME: istio
LAST DEPLOYED: Mon Mar 3 00:41:31 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for istio pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (istio-ingressgateway, istiod)' --timeout=240s -n istio-system
pod/istio-ingressgateway-799fbddb8c-z4fvl condition met
pod/istiod-684894b77-wmkk8 condition met
All istio pods are running!
==========Installing dex==========
Release "dex" does not exist. Installing it now.
NAME: dex
LAST DEPLOYED: Mon Mar 3 00:41:48 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for dex pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (dex)' --timeout=240s -n auth
pod/dex-77bbb9b76c-hxjpb condition met
All dex pods are running!
==========Installing oidc-authservice==========
Release "oidc-authservice" does not exist. Installing it now.
NAME: oidc-authservice
LAST DEPLOYED: Mon Mar 3 00:41:52 2025
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Waiting for oidc-authservice pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (authservice)' --timeout=240s -n istio-system
error: timed out waiting for the condition on pods/authservice-0
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for oidc-authservice pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (authservice)' --timeout=240s -n istio-system
error: timed out waiting for the condition on pods/authservice-0
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Waiting for oidc-authservice pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'app in (authservice)' --timeout=240s -n istio-system
error: timed out waiting for the condition on pods/authservice-0
error: unknown flag: --timeout
See 'kubectl describe --help' for usage.
Traceback (most recent call last):
File "/home/karthik/kubeflow/kubeflow-manifests/tests/e2e/utils/kubeflow_installation.py", line 324, in <module>
install_kubeflow(
File "/home/karthik/kubeflow/kubeflow-manifests/tests/e2e/utils/kubeflow_installation.py", line 101, in install_kubeflow
install_component(
File "/home/karthik/kubeflow/kubeflow-manifests/tests/e2e/utils/kubeflow_installation.py", line 180, in install_component
validate_component_installation(installation_config, component_name)
File "/home/karthik/.local/lib/python3.10/site-packages/retrying.py", line 56, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/home/karthik/.local/lib/python3.10/site-packages/retrying.py", line 266, in call
raise attempt.get()
File "/home/karthik/.local/lib/python3.10/site-packages/retrying.py", line 301, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/usr/lib/python3/dist-packages/six.py", line 719, in reraise
raise value
File "/home/karthik/.local/lib/python3.10/site-packages/retrying.py", line 251, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/home/karthik/kubeflow/kubeflow-manifests/tests/e2e/utils/kubeflow_installation.py", line 192, in validate_component_installation
kubectl_wait_pods(value, namespace, key)
File "/home/karthik/kubeflow/kubeflow-manifests/tests/e2e/utils/utils.py", line 275, in kubectl_wait_pods
raise Exception("Timeout/error waiting for pod condition")
Exception: Timeout/error waiting for pod condition
make: *** [Makefile:104: deploy-kubeflow] Error 1
I checked the pods and found that the PVC used by authservice-0 was not bound.
The PVC has no storage class set, and the only storage class in the cluster (gp2) is not marked as default:
karthik@U-1BB4IDIUHCE8V:~/kubeflow/kubeflow-manifests/awsconfigs$ kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
gp2 kubernetes.io/aws-ebs Delete WaitForFirstConsumer false 10d
karthik@U-1BB4IDIUHCE8V:~/kubeflow/kubeflow-manifests/awsconfigs$ kubectl get pvc -n istio-system
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
authservice-pvc Pending 16m
karthik@U-1BB4IDIUHCE8V:~/kubeflow/kubeflow-manifests/awsconfigs$ kubectl get configmap -n istio-system
kubectl describe pvc -n istio-system
Name: authservice-pvc
Namespace: istio-system
StorageClass:
Status: Pending
Volume:
Labels: app.kubernetes.io/managed-by=Helm
Annotations: meta.helm.sh/release-name: oidc-authservice
meta.helm.sh/release-namespace: default
Finalizers: [kubernetes.io/pvc-protection]
Capacity:
Access Modes:
VolumeMode: Filesystem
Used By: authservice-0
Events:
Type Reason Age From Message
Normal FailedBinding 32s (x82 over 20m) persistentvolume-controller no persistent volumes available for this claim and no storage class is set
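The manual check above (a Pending PVC with an empty StorageClass field) could be automated along these lines. This is only a sketch, not part of the installer: the function name pending_pvcs_without_storage_class is hypothetical, and its input is assumed to be the parsed output of `kubectl get pvc -n istio-system -o json`.

```python
def pending_pvcs_without_storage_class(pvc_list):
    """Return the names of Pending PVCs that have no storageClassName.

    pvc_list is assumed to be the parsed JSON from
    `kubectl get pvc -n <namespace> -o json` (a dict with an "items" list).
    """
    names = []
    for item in pvc_list.get("items", []):
        phase = item.get("status", {}).get("phase")
        storage_class = item.get("spec", {}).get("storageClassName")
        # A claim in this state matches the FailedBinding event in the
        # describe output: nothing to bind to and no class to provision from.
        if phase == "Pending" and not storage_class:
            names.append(item["metadata"]["name"])
    return names
```

For the cluster in this report, the function would be expected to return authservice-pvc, since that claim is Pending and shows no storage class.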
I used the following command to work around the error:
kubectl patch pvc authservice-pvc -n istio-system -p '{"spec":{"storageClassName":"gp2"}}'
The installer should look up the default storage class and set its name on the PVC to prevent this.
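A rough sketch of how such a lookup might work, assuming the installer shells out to kubectl as it already does for `kubectl wait`. The function names are hypothetical and this is not the installer's actual code. Kubernetes marks the default class with the annotation storageclass.kubernetes.io/is-default-class set to "true" (older clusters may use the storageclass.beta.kubernetes.io variant). Since gp2 is not marked as default in the output above, the sketch also falls back to the only class when exactly one exists; that fallback is my assumption about a reasonable behavior.

```python
import json
import subprocess

# Annotations Kubernetes uses to mark a StorageClass as the default.
DEFAULT_ANNOTATIONS = (
    "storageclass.kubernetes.io/is-default-class",
    "storageclass.beta.kubernetes.io/is-default-class",
)


def load_storage_classes():
    """Fetch all StorageClasses as parsed JSON via kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "storageclass", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def find_default_storage_class(sc_list):
    """Return the name of the StorageClass annotated as default, or None."""
    for item in sc_list.get("items", []):
        annotations = item.get("metadata", {}).get("annotations") or {}
        if any(annotations.get(key) == "true" for key in DEFAULT_ANNOTATIONS):
            return item["metadata"]["name"]
    return None


def choose_storage_class(sc_list):
    """Prefer the default class; otherwise use the only class if there is one.

    The single-class fallback is an assumption, not documented installer
    behavior. Returns None when the choice would be ambiguous.
    """
    default = find_default_storage_class(sc_list)
    if default is not None:
        return default
    items = sc_list.get("items", [])
    if len(items) == 1:
        return items[0]["metadata"]["name"]
    return None
```

With the single unannotated gp2 class shown above, choose_storage_class(load_storage_classes()) would be expected to return gp2, which matches the value used in the manual patch.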