Failing to push large image layers #18800
-
Not sure if this should have been an issue instead, let me know if you want me to move it. I have some issues pushing images with large layers to harbor. Using the docker cli I can see the large layers uploading until the "meter" gets full, then it stops for a while until it fails and then retries the upload. Looking at the different logs in Harbor I can see two relevant log lines in core and in registry. Log for core:
As you can see especially the logs from core are not very specific. Adding debug logging did not really reveal anything more relevant to this. I tried going through the network traffic and it seems like the error starts with core returning a 502 http error to the client, then you get some reset messages going to core from the client (or possibly from nginx in between). I'm running harbor in several kubernetes clusters, installed via the helm chart running version 2.6. I can see this problem in several clusters, but it seems to depend a bit on the cloud provider. For one cloud provider I can see errors starting with layers that are ~1 GB, while on another cloud provider I can see errors for layers that are ~7GB. The installations are also not always consistent in what size fails, e.g. sometimes the first cloud fails at 800MB and sometimes it can manage 1.5GB. In one cloud I'm storing images in object storage via s3 (though it's not aws) and in the other it's object storage via swift api. It's very possible that there is some issue with the object storage on these clouds. But I would like to know exactly what happens here, is there some timeout in harbor, is there some actual error talking to the object storage, or something else? Ultimately I want to be able to support larger layers (at least up to ~2GB), but if that is not possible then it would be good to just understand what is happening. |
Beta Was this translation helpful? Give feedback.
Replies: 4 comments 4 replies
-
hi, what's the harbor version? Based on the failure error, it appears that the request was canceled due to a timeout, which could be caused by either the proxy or Harbor components. To troubleshoot this issue, you can start by tuning the request/response timeout by setting a larger value and retrying the operation. |
Beta Was this translation helpful? Give feedback.
-
Hey @viktor-f , I'm encountering the same problem as you do, did you manage to have a workaround for this issue? |
Beta Was this translation helpful? Give feedback.
-
For anyone also encountering this issue, we've discovered that the problem lies on ingress-nginx just randomly closing long HTTP connections after some time. We mitigated this by switching to a different ingress. |
Beta Was this translation helpful? Give feedback.
-
@yunghollow91 doe you have an example/snippet of where you changed that? I cannot find any timeout related parameter in the Helm values.yaml (running Harbor on Kubernetes). Or is this your own ingress controller outside of the Harbor installed ngnix (I use Traefik there)? |
Beta Was this translation helpful? Give feedback.
hi, what's the harbor version? Based on the failure error, it appears that the request was canceled due to a timeout, which could be caused by either the proxy or Harbor components.
To troubleshoot this issue, you can start by tuning the request/response timeout by setting a larger value and retrying the operation.