kubernetes: persistent volumes #19
Has any progress been made on this? I'd be interested in taking a crack at it if it hasn't yet been touched.
no progress really, but would love some assistance :) I added some code to create the pv and pvc data objects but never had time to finish the job
I am checking this currently. However, the problem that I see is that we need to have ReadWriteMany volumes in the kubernetes cluster. At least we do not have those. We'd need to install something like https://github.com/gluster/gluster-kubernetes
@zetaab a ReadWriteMany volume is not required because all Pipeline steps execute on the same node, using a shared workspace (make sure you pass the --kube-node flag when testing with the cli). This means the persistent volume needs to be of type HostPath.
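For illustration, a minimal sketch of what a HostPath persistent volume of this kind might look like (the name, size, and path here are hypothetical, not what Drone actually creates):

```yaml
# Hypothetical example only: a HostPath PersistentVolume for a shared
# build workspace. All steps must run on the node that owns this path.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: drone-workspace-pv        # illustrative name
spec:
  capacity:
    storage: 5Gi                  # illustrative size
  accessModes:
    - ReadWriteOnce               # HostPath is node-local, so RWX is not needed
  hostPath:
    path: /tmp/drone/workspace    # illustrative host directory
```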
Sharing hostPath volumes in Kubernetes is not recommended. On platforms like OpenShift it is not even allowed (without modifying things). This can maybe be used in the future for hostPath things (after dynamic provisioning is supported): https://kubernetes.io/docs/concepts/storage/storage-classes/#local Currently the problem is that if namespace x takes the hostPath
@bradrydzewski btw, how are you planning to use that --kube-node thing? When a new build is going to start, do you just define one node where everything should be executed? Do you need to know beforehand which node has enough resources to execute it? Or should the user define on which node the build always runs?
I think the default volume type should be HostPath, because installing a volume plugin should not be a requirement for using Drone. But we can certainly give teams the option to use alternate volume plugin types if they want or need to.
Yes I agree with you, there should be a hostpath option (in the future this can be moved to pvc using a local hostpath dynamic provisioner) and a pvc option. Also hostpath should maybe be the default one, because installing things like RWX volumes is not that easy. An RWX volume is needed if people are executing two pipeline steps simultaneously; otherwise RWO is enough. However, it might be quite slow to execute pipelines with RWO, because detaching/attaching the volume for each step takes time.
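To illustrate the access-mode distinction, a hedged sketch of a PVC requesting ReadWriteMany (the storage class name is an assumption; it requires an RWX-capable backend such as Gluster or NFS to be installed in the cluster):

```yaml
# Hypothetical example only: a PVC for a workspace shared across nodes.
# ReadWriteMany lets pods on different nodes mount it concurrently;
# ReadWriteOnce would force attach/detach between steps instead.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: drone-shared-workspace      # illustrative name
spec:
  accessModes:
    - ReadWriteMany                 # needed only for truly parallel steps on different nodes
  resources:
    requests:
      storage: 5Gi                  # illustrative size
  storageClassName: glusterfs       # assumption: an RWX-capable class exists
```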
I think this issue is very important to make the Kubernetes runtime fully native. The implementation pins all of a pipeline's pods to a single node, while other aspects of the Kubernetes implementation actually embrace the Kubernetes scheduler. The decision that there is no agent concept in the Kubernetes runtime is a forward-looking one. Currently Drone starts every build in about the time it is able to spin up a container. Practically this means that builds and their steps always start, getting rid of the queue concept. But a running step on the UI sometimes means a pod that is still pending; if you fan out and one of the steps/pods is stuck in that state, the pipeline stalls. Furthermore, limiting all pods of a pipeline to a specific node prevents the cluster from scaling up. Which is a shame, given that the abstraction is able to rely on Kubernetes for fully queueless autoscaling behavior, while the current implementation prevents Kubernetes from doing its job. Plus it prevents me from migrating my (homegrown) autoscaling 0.8 Drone setup to the Kubernetes runtime.
@laszlocph So if you run multiple pipelines, does it still only run on one machine? I can see why you would use hostpath, but not for multiple pipelines
Each pipeline has a dedicated machine. If you have two nodes and two pipelines, Drone can schedule them on different nodes; I've seen that happening. The problem happens when a pipeline fans out to 4, let's say, with each of those steps requiring a single core, and you have a 2 or 4 core machine. Then at least one of those supposedly parallel steps becomes sequential, no matter how many other idle nodes you have.
For sure. But I do think a certain amount of blame for this problem lies with the cluster owner. If you have workloads that require specific amounts of resources, resources in the cluster are at or near the limit, and/or you are relying on the cluster-autoscaler to add the required resources to schedule your loads, then you will end up with some issues at some point. In this situation you can apply some specific configuration to help, for example:
It won't solve all problems, but with some careful consideration you can solve a lot of scheduling problems like this.
Also worth noting that in k8s version 1.12 they improved scheduling with regard to volumes and zones: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/ But I think until the cluster-autoscaler is zone-aware this will always be tricky.
I agree, but overprovisioning and being smart about placing workloads are exactly the problems schedulers are meant to eliminate. So let's say we overprovision each node. How do you know when to add a node to the cluster?
FYI I haven't used Drone prior to version 1.0 so my knowledge there is lacking.
Ideally I would say this isn't Drone's responsibility to know or control. This is scheduling, and scheduling is the responsibility of k8s. But when Drone creates the job, it could set
The cluster-autoscaler will do this. If you want limits around its behaviour then you can set these (to some degree) with min/max, cool down, delays, etc. I'm not saying either of these solutions is ideal for Drone, but normally that's how these issues are solved on k8s 🙂
Drone for Nomad handles this by requesting CPU and RAM resources for each Pipeline. In Nomad, the request does not place any actual limits, and is only used for scheduling. I believe a similar concept could be used for Kubernetes via resource requests vs limits. (nomad task definition)
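A rough sketch of how the requests-without-limits idea could look on a Kubernetes step pod (the container name, image, and values are illustrative): the scheduler uses `requests` for placement, while omitting `limits` leaves the container free to burst, similar to Nomad's scheduling-only resource declaration:

```yaml
# Hypothetical example only: a pipeline-step pod with resource requests
# but no limits. Kubernetes schedules based on `requests`; with no
# `limits` set, the container is not throttled or OOM-capped by them.
apiVersion: v1
kind: Pod
metadata:
  name: drone-step-build          # illustrative name
spec:
  restartPolicy: Never
  containers:
    - name: build                 # illustrative step
      image: golang:1.12          # illustrative image
      resources:
        requests:                 # used only for scheduling decisions
          cpu: "1"
          memory: 512Mi
        # no `limits` section: the step may use spare node capacity
```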
Fundamentally Drone does not care whether steps run on the same or different nodes; however, Drone does care that there is a shared disk (workspace) available to all steps in the pipeline. The current implementation -- running all steps on a single node -- was the easiest way for us to create the initial proof of concept, which we can now build upon. It is possible Drone can support ReadWriteMany volumes in the future (Ceph, Gluster, etc) although we still need to get Drone working well with vanilla Kubernetes and HostPath persistent volume claims, since a cluster may not have ReadWriteMany volume plugins available and this should not be a requirement of running Drone. To support ReadWriteMany volumes, we will need to assign resource limits to every step in the pipeline in order for it to be properly scheduled by Kubernetes. In terms of user-experience, this will sort of suck, so we need to find a way to minimize this or come up with sane defaults that can easily be overridden.
Using the
Ceph and Gluster are a pain to set up, but I just tried ReadWriteMany volumes with NFS and only needed a single pod. Maybe it could be bundled with the Drone yaml. The linked branch contains a drone-runtime implementation that uses NFS-based volumes. Or if you choose the pluggable
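For reference, a minimal NFS-backed PersistentVolume sketch of the kind described (the server address and export path are placeholders, not from the linked branch):

```yaml
# Hypothetical example only: NFS natively supports ReadWriteMany, so a
# single NFS server pod can back a workspace shared across nodes.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: drone-nfs-pv              # illustrative name
spec:
  capacity:
    storage: 10Gi                 # illustrative size
  accessModes:
    - ReadWriteMany               # the property Ceph/Gluster/NFS provide
  nfs:
    server: nfs.example.internal  # placeholder server address
    path: /exports/drone          # placeholder export path
```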
I don't get this part, what kind of limits are needed?
I did my PR once already (#27), but @bradrydzewski closed it. As I see it, storage classes are the correct solution for this. Of course the storage class needs to support RWX or similar, but with a general storageclass implementation we could use more than one storage backend
What about adding a new `kubernetes` config section? That way we could use Kubernetes-native "objects" for configuration by the user, e.g., custom volumes for secrets (the user would specify those volumes in normal yaml format), a storageclass name separated from the Docker config, and more, such as a namespace pattern / which namespace should be used for the jobs. Though I'm not sure how the user would set the configuration "globally", and maybe even per project, to be passed to the Kubernetes Drone Runtime. Example idea:

{
"metadata": {
[...]
},
"steps": [
[...]
],
"docker": {},
"kubernetes": {
// Storage Class name to use for the PersistentVolumeClaims to use per Job
"storageClassName": "my-awesome-rwx-storageclass",
// Attach other volumes to the job pod(s)
"additionalVolumes": [
{
"volumeMount": {
"name": "foo",
"mountPath": "/etc/foo",
"readOnly": true
},
"volume": {
"name": "foo",
"secret": {
"secretName": "mysecret"
}
}
}
],
// How the namespace should be named and / or if separate namespaces per Job should be used
"namespaces": {
"mode": "isolated",
"prefix": "drone-"
// or
"mode": "shared",
"name": "my-droneci-ci-namespace"
},
// Clean up jobs after X time, other items to clean up?
// E.g., adding a label to the created namespaces and making sure that none are hanging on deletion every X interval
"cleanUp": {
"jobDeletionDelay": "24h"
}
}
}

(the example covers more than just persistent volumes) This would be a config section for Kubernetes covering what I personally can think of when looking at the Drone CI Kubernetes Runtime right now and what I would like to see in the future. 😉
I see us going more in the direction of adopting something like Knative or Tekton as a runtime target for Drone (see #65). Drone was designed for Docker, and trying to force-fit this design into Kubernetes is not working very well. I expect we will invest a lot more time into Knative in the coming months, as opposed to investing in the existing experimental implementation.
@bradrydzewski as I see it, #27 could work if some day we have a working local volume provisioner which can dynamically create volumes on the current host machine. It works pretty much the same way as the current solution (except with no hardcoded hostPath mount; a persistentvolumeclaim instead). See https://kubernetes.io/docs/concepts/storage/volumes/#local and https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/
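A sketch of the StorageClass the linked local-volume docs describe: local volumes currently have no dynamic provisioner, and `WaitForFirstConsumer` delays binding until a pod is scheduled, which is what makes node-local volume scheduling work:

```yaml
# Example from the Kubernetes local-volume documentation pattern:
# a class for node-local PVs. `no-provisioner` means PVs must be
# pre-created; binding waits until the consuming pod is scheduled
# so the PV and the pod end up on the same node.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
```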
The kubernetes implementation in this repository was scrapped for the reasons described here. We have a new implementation, created from scratch, that no longer uses a persistent volume, which in turn obsoletes this issue. The new implementation can be found at drone-runners/drone-runner-kube. New kubernetes runner documentation can be found here:
this helps ensure all pods have access to a shared workspace and can run on the same machine. It also helps us implement `temp_dir` volumes (as defined in the drone yaml). Persistent volumes are currently disabled while we try to figure out an approach to scheduling pods on specific nodes.