+Agent
+
+The agent is responsible for the life cycle of the computation, i.e., running the computation and sending events about its status within the TEE. The agent lives inside the VM (TEE), and each computation within the TEE has its own agent. When a computation run request is sent from the manager, the manager creates a VM in which the agent resides and sends the computation manifest to the agent.
+Agent Events
+
+As the computation in the agent undergoes different operations, it sends events to the manager so that the user can monitor the computation from either the UI or another client. Events sent to the manager include computation running, computation finished, computation failed, and computation stopped.
+Vsock Connection Between Agent & Manager
+
+The agent sends events to the manager via vsock. The manager listens on the vsock and forwards the events via gRPC. The agent events show the status of the computation inside the TEE so that the user is aware of what is happening inside the TEE.
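+To make the event path concrete, here is a minimal sketch of how an in-VM process could emit a JSON-encoded status event to the host over vsock. This is an illustration only, not the agent's actual code: the github.com/mdlayher/vsock package, the port number, the newline framing, and the event fields are all assumptions.
+
+```go
+package main
+
+import (
+	"encoding/json"
+	"log"
+
+	"github.com/mdlayher/vsock" // assumed vsock helper library, not necessarily what cocos uses
+)
+
+// Event is a hypothetical status message; the real agent's schema may differ.
+type Event struct {
+	Computation string `json:"computation"`
+	Status      string `json:"status"` // e.g. "running", "finished", "failed", "stopped"
+}
+
+func main() {
+	// CID 2 is VMADDR_CID_HOST (the host side); port 9997 is an assumption.
+	conn, err := vsock.Dial(2, 9997, nil)
+	if err != nil {
+		log.Fatalf("vsock dial: %v", err)
+	}
+	defer conn.Close()
+
+	payload, err := json.Marshal(Event{Computation: "my-computation", Status: "running"})
+	if err != nil {
+		log.Fatalf("marshal event: %v", err)
+	}
+	payload = append(payload, '\n') // newline-delimited framing (an assumption)
+	if _, err := conn.Write(payload); err != nil {
+		log.Fatalf("send event: %v", err)
+	}
+}
+```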
+Security
+
+To run a computation in the agent, a signed certificate is required. The certificate is used to verify the user who is running the computation. The certificate is sent to the agent by the manager, and the agent verifies the certificate before running the computation.
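+As an illustration of this kind of check, the sketch below verifies a user certificate against a trusted CA using Go's standard crypto/x509 package. This is a generic example under assumed file names, not the agent's actual verification logic.
+
+```go
+package main
+
+import (
+	"crypto/x509"
+	"encoding/pem"
+	"log"
+	"os"
+)
+
+func main() {
+	// File names are assumptions for the sake of the example.
+	caPEM, err := os.ReadFile("ca.pem")
+	if err != nil {
+		log.Fatal(err)
+	}
+	userPEM, err := os.ReadFile("user-cert.pem")
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	roots := x509.NewCertPool()
+	if !roots.AppendCertsFromPEM(caPEM) {
+		log.Fatal("failed to parse CA certificate")
+	}
+
+	block, _ := pem.Decode(userPEM)
+	if block == nil {
+		log.Fatal("failed to decode user certificate PEM")
+	}
+	cert, err := x509.ParseCertificate(block.Bytes)
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	// Reject the computation request unless the certificate chains to a trusted root.
+	if _, err := cert.Verify(x509.VerifyOptions{Roots: roots}); err != nil {
+		log.Fatalf("certificate rejected: %v", err)
+	}
+	log.Println("certificate verified; computation may run")
+}
+```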
+Architecture
+
+The Cocos AI system is a distributed platform for running secure multi-party computations.
+It has two parts:
+The system architecture is illustrated in the image below.
+Agent
+
+The Agent defines the firmware which goes into the TEE. It is used to control and monitor the computation within the TEE and to enable secure, encrypted communication with the outside world (in order to fetch the data and provide the result of the computation). The Agent contains a gRPC server that exposes functions to other gRPC clients, such as the CLI. Communication between the Manager and the Agent is done via vsock: the Agent sends events to the Manager, which then forwards them via gRPC.
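+A bare-bones sketch of such a gRPC server is shown below — just a listener and an empty grpc.Server. The port and the commented-out service registration are assumptions; the real agent registers its own generated service implementation.
+
+```go
+package main
+
+import (
+	"log"
+	"net"
+
+	"google.golang.org/grpc"
+)
+
+func main() {
+	// Port 7002 matches the agent port used elsewhere in these docs,
+	// but treat it as an assumption here.
+	lis, err := net.Listen("tcp", ":7002")
+	if err != nil {
+		log.Fatalf("listen: %v", err)
+	}
+
+	srv := grpc.NewServer()
+	// The real agent would register its generated service here, e.g.:
+	// agentpb.RegisterAgentServiceServer(srv, &agentService{})
+	log.Println("agent gRPC server listening on :7002")
+	if err := srv.Serve(lis); err != nil {
+		log.Fatalf("serve: %v", err)
+	}
+}
+```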
+Manager
+
+The Manager receives requests sent through gRPC and relays them to the Agent via vsock. The Manager creates a secure enclave and loads the computation where the agent resides. The connection between the Manager and the Agent is a vsock channel, over which the agent periodically sends events to the manager, which forwards them via gRPC.
+CLI
+
+The CoCoS CLI is used to access the agent within the secure enclave. The CLI communicates with the agent using gRPC, with functions such as algo to provide the algorithm to be run, data to provide the data to be used in the computation, and run to start the computation. It also has functions to fetch and validate the attestation report of the enclave.
+For more information on the CLI, please refer to the CLI docs.
+Agent CLI
+
+The CLI allows you to perform various tasks related to the computation and management of algorithms and datasets. The CLI is a gRPC client for the agent service.
+Build
+
+To build the CLI, follow these steps:
+
+1. go get github.com/ultravioletrs/cocos
+2. cd cocos
+3. make cli
+4. make install-cli
.To run a computation, use the following command:
+./build/cocos-cli run --computation '{"name": "my-computation"}'
+
+To upload an algorithm, use the following command:
+./build/cocos-cli algo /path/to/algorithm
+
+To upload a dataset, use the following command:
+./build/cocos-cli data /path/to/dataset.csv
+
+To retrieve the computation result, use the following command:
+./build/cocos-cli result
+
+Installation
+
+To install the CLI locally, i.e. for the current user, run make install-cli.
+The CLI supports various configuration flags and options. Use the --help flag with any command to see additional information. The CLI uses gRPC for communication with the Agent service.
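+For instance, a client can confirm that the agent's gRPC endpoint is reachable before issuing commands. The sketch below dials the forwarded agent port with a plaintext connection; the address and the use of insecure credentials are assumptions for a local test setup.
+
+```go
+package main
+
+import (
+	"context"
+	"log"
+	"time"
+
+	"google.golang.org/grpc"
+	"google.golang.org/grpc/credentials/insecure"
+)
+
+func main() {
+	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
+	defer cancel()
+
+	// localhost:7020 is the host-forwarded agent port used later in these docs;
+	// adjust it for your environment.
+	conn, err := grpc.DialContext(ctx, "localhost:7020",
+		grpc.WithTransportCredentials(insecure.NewCredentials()),
+		grpc.WithBlock(), // block until connected or the context expires
+	)
+	if err != nil {
+		log.Fatalf("agent not reachable: %v", err)
+	}
+	defer conn.Close()
+	log.Println("connected; generated client stubs (e.g. for algo/data/run) would be used here")
+}
+```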
+Hardware Abstraction Layer (HAL)
+
+HAL is a layer of programming that allows the software to interact with the hardware device at a general level rather than at the detailed hardware level. Cocos uses HAL and AMD SEV-SNP as an abstraction layer for confidential computing.
+AMD SEV-SNP creates secure virtual machines (SVMs). VMs are usually used to run an operating system (e.g., Ubuntu and its applications). To avoid using a whole OS, HAL uses:
+This way, applications can be executed in the SVM, and the whole HAL SVM is entirely in RAM, protected by SEV-SNP. Being a RAM-only SVM means that secrets that are kept in the SVM will be destroyed when the SVM stops working.
+How is HAL constructed?
+
+HAL is made using the tool Buildroot. Buildroot is used to create efficient, embedded Linux systems, and we use it to create the compressed image of the kernel (vmlinuz) and the initial file system (initramfs).
+The HAL configuration for Buildroot also includes a Python runtime and agent software support. You can read more about the Agent software here.
+How does it work?
+
+HAL is combined with AMD SEV-SNP to provide a fully encrypted VM that can be verified using remote attestation. You can read more about the attestation process here.
+Cocos uses QEMU and Open Virtual Machine Firmware (OVMF) to boot the confidential VM. During boot with SEV-SNP, the AMD Secure Processor (AMD SP) measures (calculates the hash of) the contents of the VM and inserts that hash into the attestation report. This measurement is proof of what is currently running inside the VM. The problem with SEV is that it only measures the Open Virtual Machine Firmware (OVMF). To solve this, we have built OVMF so that it contains the hashes of the vmlinuz and initramfs. Once the OVMF is loaded, it will load the vmlinuz and initramfs into memory, but it will continue the boot process only if their hashes match the hashes stored in OVMF. This way, the attestation report will contain the measurement of OVMF, with the hashes, and OVMF will guarantee that the correct kernel and file system are booted. The whole process can be seen in the following diagram. The green color represents the trusted part of the system, while the red is untrusted:
+This process guarantees that the whole VM is secure and can be verified.
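+The hash comparison itself is conceptually simple. Below is a minimal sketch of the idea — hash the kernel and initramfs and compare against expected values — written in ordinary Go rather than firmware code; the expected digests shown are placeholders, not real measurements.
+
+```go
+package main
+
+import (
+	"crypto/sha256"
+	"encoding/hex"
+	"log"
+	"os"
+)
+
+// verify returns true if the file's SHA-256 digest matches the expected hex digest.
+// OVMF performs an equivalent check in firmware before handing control to the kernel.
+func verify(path, expectedHex string) bool {
+	data, err := os.ReadFile(path)
+	if err != nil {
+		log.Fatalf("read %s: %v", path, err)
+	}
+	sum := sha256.Sum256(data)
+	return hex.EncodeToString(sum[:]) == expectedHex
+}
+
+func main() {
+	// Placeholder digests: in reality these are baked into the OVMF build.
+	kernelOK := verify("img/bzImage", "0000000000000000000000000000000000000000000000000000000000000000")
+	initrdOK := verify("img/rootfs.cpio.gz", "0000000000000000000000000000000000000000000000000000000000000000")
+	if !kernelOK || !initrdOK {
+		log.Fatal("measurement mismatch: refusing to boot")
+	}
+	log.Println("kernel and initramfs match the embedded hashes; boot continues")
+}
+```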
+After the kernel boots, the agent is started and ready for work.
+CoCoS.ai is a distributed, microservice-based solution in the cloud that enables confidential and privacy-preserving AI/ML, i.e. execution of model training and algorithm inference on confidential data sets. Privacy-preservation is considered a “holy grail” of AI. It opens many possibilities, among which is a collaborative, trustworthy AI.
+The final product enables data scientists to train AI and ML models on confidential data that is never revealed, and can be used for Secure Multi-Party Computation (SMPC). AI/ML on combined data sets that come from different sources will unlock huge value.
+Features
+
+CoCoS.ai enables the following features:
+Install
+
+Before proceeding, install the following prerequisites:
+
+Once everything is installed, execute the following command from the project root:
+To run CoCoS.ai, first download the cocos git repository:
+git clone git@github.com:ultravioletrs/cocos.git
+cd cocos
+make
+
+Finally, you can run the backend (within the cocos directory):
make run
+
+Manager
+
+Manager acts as the bridge between the computation running in the VM and the user/organization. Once a computation has been created by a user, the invited users have uploaded their public certificates, and a run request has been sent, the manager is responsible for creating the computation in the VM and managing the computation lifecycle. Communication to the Manager is done via gRPC, while communication between the Manager and the Agent is done via vsock.
+Vsock is used to send agent events from the computation in the agent to the manager. The manager then forwards the events via gRPC, and these are visible to the end user.
+Manager <> Agent
+
+Agent runs a gRPC server, and the CLI is a gRPC client of the agent. The manager sends the computation to the agent via gRPC, and the agent runs the computation while sending events about its status back to the manager. The manager then forwards the events it receives from the agent over vsock via gRPC.
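+To illustrate the manager's side of this channel, here is a minimal sketch of a host process accepting vsock connections and logging whatever events arrive. As with the earlier agent-side sketch, the github.com/mdlayher/vsock package and the port are assumptions, not the manager's actual implementation.
+
+```go
+package main
+
+import (
+	"bufio"
+	"log"
+
+	"github.com/mdlayher/vsock" // assumed helper library for AF_VSOCK sockets
+)
+
+func main() {
+	// Port 9997 mirrors the assumed port in the agent-side sketch above.
+	l, err := vsock.Listen(9997, nil)
+	if err != nil {
+		log.Fatalf("vsock listen: %v", err)
+	}
+	defer l.Close()
+
+	for {
+		conn, err := l.Accept()
+		if err != nil {
+			log.Printf("accept: %v", err)
+			continue
+		}
+		go func() {
+			defer conn.Close()
+			// Read newline-delimited event payloads; a real manager would
+			// decode them and forward each event to its gRPC clients.
+			sc := bufio.NewScanner(conn)
+			for sc.Scan() {
+				log.Printf("agent event: %s", sc.Text())
+			}
+		}()
+	}
+}
+```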
+Setup and Test Manager <> Agent
+
+git clone https://github.com/ultravioletrs/cocos
+cd cocos
+
+NB: all relative paths in this document are relative to the cocos repository directory.
QEMU-KVM is a virtualization platform that allows you to run multiple operating systems on the same physical machine. It is a combination of two technologies: QEMU and KVM.
+To install QEMU-KVM on a Debian-based machine, run
+sudo apt update
+sudo apt install qemu-kvm
+
+Create an img directory and a tmp directory in cmd/manager.
The necessary kernel modules must be loaded on the hypervisor.
+sudo modprobe vhost_vsock
+ls -l /dev/vhost-vsock
+# crw-rw-rw- 1 root kvm 10, 241 Jan 16 12:05 /dev/vhost-vsock
+ls -l /dev/vsock
+# crw-rw-rw- 1 root root 10, 121 Jan 16 12:05 /dev/vsock
+
+Prepare Cocos HAL
+
+Cocos HAL for Linux is a framework for building a custom in-enclave Linux distribution. Use the instructions in the Readme.
+Once the image is built, copy the kernel and rootfs images to cmd/manager/img from buildroot/output/images/bzImage and buildroot/output/images/rootfs.cpio.gz, respectively.
cd cmd/manager
+
+sudo find / -name OVMF_CODE.fd
+# => /usr/share/OVMF/OVMF_CODE.fd
+OVMF_CODE=/usr/share/OVMF/OVMF_CODE.fd
+
+sudo find / -name OVMF_VARS.fd
+# => /usr/share/OVMF/OVMF_VARS.fd
+OVMF_VARS=/usr/share/OVMF/OVMF_VARS.fd
+
+KERNEL="img/bzImage"
+INITRD="img/rootfs.cpio.gz"
+
+qemu-system-x86_64 \
+ -enable-kvm \
+ -cpu EPYC-v4 \
+ -machine q35 \
+ -smp 4 \
+ -m 2048M,slots=5,maxmem=10240M \
+ -no-reboot \
+ -drive if=pflash,format=raw,unit=0,file=$OVMF_CODE,readonly=on \
+ -netdev user,id=vmnic,hostfwd=tcp::7020-:7002 \
+ -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= \
+ -device vhost-vsock-pci,id=vhost-vsock-pci0,guest-cid=3 -vnc :0 \
+ -kernel $KERNEL \
+ -append "earlyprintk=serial console=ttyS0" \
+ -initrd $INITRD \
+ -nographic \
+ -monitor pty \
+ -monitor unix:monitor,server,nowait
+
+Once the VM has booted, press Enter and, at the login prompt, use the username root.
Agent is started automatically in the VM.
+# List running processes and use 'grep' to filter for processes containing 'agent' in their names.
+ps aux | grep cocos-agent
+# This command helps verify that the 'agent' process is running.
+# The output shows the process ID (PID), resource usage, and other information about the 'cocos-agent' process.
+# For example: 118 root cocos-agent
+
+We can also check if Agent
is reachable from the host machine:
# Use netcat (nc) to test the connection to localhost on port 7020.
+nc -zv localhost 7020
+# Output:
+# nc: connect to localhost (::1) port 7020 (tcp) failed: Connection refused
+# Connection to localhost (127.0.0.1) 7020 port [tcp/*] succeeded!
+
+Conclusion
+
+Now you are able to use the Manager with the Agent. Namely, the Manager will create a VM with a separate OVMF variables file on a manager /run request.
+We need Open Virtual Machine Firmware. OVMF is a port of Intel's TianoCore firmware - an open source implementation of the Unified Extensible Firmware Interface (UEFI) - used by QEMU virtual machines. We need OVMF in order to run a virtual machine with focal-server-cloudimg-amd64. When we install QEMU, we get two files that we need to start a VM: OVMF_VARS.fd and OVMF_CODE.fd. We will make a local copy of OVMF_VARS.fd, since a VM will modify this file. On the other hand, OVMF_CODE.fd is only used as a reference, so we only record its path in an environment variable.
sudo find / -name OVMF_CODE.fd
+# => /usr/share/OVMF/OVMF_CODE.fd
+MANAGER_QEMU_OVMF_CODE_FILE=/usr/share/OVMF/OVMF_CODE.fd
+
+sudo find / -name OVMF_VARS.fd
+# => /usr/share/OVMF/OVMF_VARS.fd
+MANAGER_QEMU_OVMF_VARS_FILE=/usr/share/OVMF/OVMF_VARS.fd
+
+NB: we set environment variables that we will use in the shell process where we run the manager.
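+Because each VM needs its own writable copy of OVMF_VARS.fd, a per-VM copy must be made before launch. The sketch below shows one way to do that in Go; the destination naming scheme and VM identifier are assumptions for illustration, not the manager's actual behavior.
+
+```go
+package main
+
+import (
+	"fmt"
+	"io"
+	"log"
+	"os"
+)
+
+// copyVars clones the pristine OVMF_VARS.fd so each VM gets a private,
+// writable variables file while OVMF_CODE.fd stays shared and read-only.
+func copyVars(src, dst string) error {
+	in, err := os.Open(src)
+	if err != nil {
+		return err
+	}
+	defer in.Close()
+
+	out, err := os.Create(dst)
+	if err != nil {
+		return err
+	}
+	defer out.Close()
+
+	_, err = io.Copy(out, in)
+	return err
+}
+
+func main() {
+	vmID := 1 // hypothetical VM identifier
+	dst := fmt.Sprintf("img/OVMF_VARS_%d.fd", vmID)
+	if err := copyVars("/usr/share/OVMF/OVMF_VARS.fd", dst); err != nil {
+		log.Fatal(err)
+	}
+	log.Printf("created per-VM vars file %s", dst)
+}
+```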
+To start the service, execute the following shell script (note: a server needs to be running; see here):
+# download the latest version of the service
+go get github.com/ultravioletrs/cocos
+
+cd $GOPATH/src/github.com/ultravioletrs/cocos
+
+# compile the manager
+make manager
+
+# copy binary to bin
+make install
+
+# set the environment variables and run the service
+MANAGER_GRPC_URL=localhost:7001 \
+MANAGER_LOG_LEVEL=debug \
+MANAGER_QEMU_USE_SUDO=false \
+MANAGER_QEMU_ENABLE_SEV=false \
+./build/cocos-manager
+
+To enable AMD SEV support, start the manager like this:
+MANAGER_GRPC_URL=localhost:7001 \
+MANAGER_LOG_LEVEL=debug \
+MANAGER_QEMU_USE_SUDO=true \
+MANAGER_QEMU_ENABLE_SEV=true \
+MANAGER_QEMU_SEV_CBITPOS=51 \
+./build/cocos-manager
+
+Verifying VM Launch
+
+NB: To verify that the manager successfully launched the VM, you need to open two terminals on the same machine. In one terminal, launch go run main.go (with the environment variables of your choice); in the other, run the verification commands.
To verify that the manager launched the VM successfully, run the following command:
+ps aux | grep qemu-system-x86_64
+
+You should get something similar to this:
+darko 324763 95.3 6.0 6398136 981044 ? Sl 16:17 0:15 /usr/bin/qemu-system-x86_64 -enable-kvm -machine q35 -cpu EPYC -smp 4,maxcpus=64 -m 4096M,slots=5,maxmem=30G -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=img/OVMF_VARS.fd -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -drive file=img/focal-server-cloudimg-amd64.img,if=none,id=disk0,format=qcow2 -device scsi-hd,drive=disk0 -netdev user,id=vmnic,hostfwd=tcp::2222-:22,hostfwd=tcp::9301-:9031,hostfwd=tcp::7020-:7002 -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -nographic -monitor pty
+
+If you run the command as sudo, you should get output similar to this:
root 37982 0.0 0.0 9444 4572 pts/0 S+ 16:18 0:00 sudo /usr/local/bin/qemu-system-x86_64 -enable-kvm -machine q35 -cpu EPYC -smp 4,maxcpus=64 -m 4096M,slots=5,maxmem=30G -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=img/OVMF_VARS.fd -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -drive file=img/focal-server-cloudimg-amd64.img,if=none,id=disk0,format=qcow2 -device scsi-hd,drive=disk0 -netdev user,id=vmnic,hostfwd=tcp::2222-:22,hostfwd=tcp::9301-:9031,hostfwd=tcp::7020-:7002 -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -object sev-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 -machine memory-encryption=sev0 -nographic -monitor pty
+root 37989 122 13.1 5345816 4252312 pts/0 Sl+ 16:19 0:04 /usr/local/bin/qemu-system-x86_64 -enable-kvm -machine q35 -cpu EPYC -smp 4,maxcpus=64 -m 4096M,slots=5,maxmem=30G -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=img/OVMF_VARS.fd -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -drive file=img/focal-server-cloudimg-amd64.img,if=none,id=disk0,format=qcow2 -device scsi-hd,drive=disk0 -netdev user,id=vmnic,hostfwd=tcp::2222-:22,hostfwd=tcp::9301-:9031,hostfwd=tcp::7020-:7002 -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -object sev-guest,id=sev0,cbitpos=51,reduced-phys-bits=1 -machine memory-encryption=sev0 -nographic -monitor pty
+
+There are two processes because we run the command /usr/bin/qemu-system-x86_64 as sudo, so there is one process for the sudo command and another for /usr/bin/qemu-system-x86_64.
+If ps aux | grep qemu-system-x86_64 gives you something like this
darko 13913 0.0 0.0 0 0 pts/2 Z+ 20:17 0:00 [qemu-system-x86] <defunct>
+
+it means that the QEMU virtual machine is defunct, i.e., no longer running. The defunct process in the output is also known as a "zombie" process.
+You can troubleshoot the VM launch procedure by running the qemu-system-x86_64 command directly. When you run the manager with the MANAGER_LOG_LEVEL=info env var set, it prints out the entire command used to launch a VM. The relevant part of the log might look like this:
{"level":"info","message":"/usr/bin/qemu-system-x86_64 -enable-kvm -machine q35 -cpu EPYC -smp 4,maxcpus=64 -m 4096M,slots=5,maxmem=30G -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=img/OVMF_VARS.fd -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -drive file=img/focal-server-cloudimg-amd64.img,if=none,id=disk0,format=qcow2 -device scsi-hd,drive=disk0 -netdev user,id=vmnic,hostfwd=tcp::2222-:22,hostfwd=tcp::9301-:9031,hostfwd=tcp::7020-:7002 -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -nographic -monitor pty","ts":"2023-08-14T18:29:19.2653908Z"}
+
+You can run the command - the value of the "message"
key - directly in the terminal:
/usr/bin/qemu-system-x86_64 -enable-kvm -machine q35 -cpu EPYC -smp 4,maxcpus=64 -m 4096M,slots=5,maxmem=30G -drive if=pflash,format=raw,unit=0,file=/usr/share/OVMF/OVMF_CODE.fd,readonly=on -drive if=pflash,format=raw,unit=1,file=img/OVMF_VARS.fd -device virtio-scsi-pci,id=scsi,disable-legacy=on,iommu_platform=true -drive file=img/focal-server-cloudimg-amd64.img,if=none,id=disk0,format=qcow2 -device scsi-hd,drive=disk0 -netdev user,id=vmnic,hostfwd=tcp::2222-:22,hostfwd=tcp::9301-:9031,hostfwd=tcp::7020-:7002 -device virtio-net-pci,disable-legacy=on,iommu_platform=true,netdev=vmnic,romfile= -nographic -monitor pty
+
+and look for possible problems. These problems can usually be solved by using the appropriate env var assignments. Look in the manager/qemu/config.go file to see the recognized env vars. Don't forget to prepend MANAGER_QEMU_ to the names of the env vars.
+qemu-system-x86_64 Processes
+
+To kill any leftover qemu-system-x86_64 processes, use
pkill -f qemu-system-x86_64
+
+The pkill command is used to kill processes by name or by pattern. The -f flag specifies that we want to kill processes whose full command line matches the pattern qemu-system-x86_64. Note that pkill sends the SIGTERM signal by default, which asks the matching processes to terminate.
+If this does not work, i.e. if ps aux | grep qemu-system-x86_64 still outputs qemu-system-x86_64-related process(es), you can kill the unwanted process with kill -9 <PID>, which sends the SIGKILL signal and cannot be ignored by the process.
+AMD SEV and its latest and most secure iteration, AMD Secure Encrypted Virtualization - Secure Nested Paging (SEV-SNP), is the AMD technology that isolates entire virtual machines (VMs). SEV-SNP encrypts the whole VM and provides confidentiality and integrity protection of the VM memory. This way, the hypervisor or any other application on the host machine cannot read the VM memory.
+At Cocos, we use an in-memory VM called the Hardware Abstraction Layer (HAL). You can read more on HAL here.
+One of the critical components of the SEV technology is the remote attestation. Remote attestation is a process in which one side (the attester) collects information about itself and sends that information to the client (or the relying party) for the relying party to assess the trustworthiness of the attester. If the attester is deemed trustworthy, the relying party will send confidential code/data or any secrets to the attester. You can read more on the attestation process here.
+ + + + + + +