Add TDX debugging tools and documentation #1024

Merged: 5 commits, Nov 22, 2024
99 changes: 99 additions & 0 deletions dev-docs/coco/tdx-measurements.md
@@ -0,0 +1,99 @@
# Debugging TDX measurement mismatches

> [!CAUTION]
> This document doesn't claim correctness by any means and shouldn't be seen
> as a source of truth. It should only serve as a helper document for internal
> debugging.

TDX uses a static MRTD and dynamic RTMRs for its (boot) integrity measurements.

We pre-calculate expected values that we later check against when verifying workloads.

This document shows how mismatches in these measurements can be debugged.

## Retrieving the guest's event log

[Get a shell](../aks/serial-console.md) into the pod VM. Then, run the [`tdeventlog`](https://github.com/canonical/tdx/blob/noble-24.04/tests/lib/tdx-tools/src/tdxtools/tdeventlog.py)
tool within the guest to retrieve the event log. If the tool can't be installed in the guest,
the `/sys/firmware/acpi/tables/data/CCEL` and `/sys/firmware/acpi/tables/CCEL` files can also be dumped
by other means and transferred to a machine where they can then be parsed with `tdeventlog`.

## Understanding the event log

The event log consists of multiple entries that look like this:

```
==== TDX Event Log Entry - 17 [0x83AA76BC] ====
RTMR : 2
Type : 0x6 (EV_EVENT_TAG)
Length : 87
Algorithms ID : 12 (TPM_ALG_SHA384)
Digest[0] : efa84d42b931a7454dc770eeeca0d476ac613f432b650515fc26cff088cf206c856c276f8acf435e98560c14fd2e0c67
RAW DATA: ----------------------------------------------
83AA76BC 03 00 00 00 06 00 00 00 01 00 00 00 0C 00 EF A8 ................
83AA76CC 4D 42 B9 31 A7 45 4D C7 70 EE EC A0 D4 76 AC 61 MB.1.EM.p....v.a
83AA76DC 3F 43 2B 65 05 15 FC 26 CF F0 88 CF 20 6C 85 6C ?C+e...&.... l.l
83AA76EC 27 6F 8A CF 43 5E 98 56 0C 14 FD 2E 0C 67 15 00 'o..C^.V.....g..
83AA76FC 00 00 EC 22 3B 8F 0D 00 00 00 4C 69 6E 75 78 20 ...";.....Linux
83AA770C 69 6E 69 74 72 64 00 initrd.
RAW DATA: ----------------------------------------------
```

`RTMR` specifies which register (out of `RTMR {0,1,2,3}`) the measurement was extended into. To debug
only a specific register, `grep` for this line.
The `Type` can help identify which component made the measurement, but it's
out of scope for this document. `Length` and `Algorithms ID` should be self-explanatory.

`Digest[0]` is the SHA384 of the raw measured contents. In the above example, `efa84d...` corresponds to
`sha384sum initrd.zst`.
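
To check a digest by hand, hash the file you suspect was measured and compare the result against `Digest[0]`. The following is a minimal Python sketch that mirrors `sha384sum`. The helper name `sha384_file` is made up for illustration and isn't part of any tool mentioned here.

```python
import hashlib


def sha384_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-384 digest of the file at `path`, like `sha384sum`."""
    h = hashlib.sha384()
    with open(path, "rb") as f:
        # Read in chunks so large initrds don't have to fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

For the example entry, `sha384_file("initrd.zst")` would be expected to start with `efa84d`, assuming the file is the same one the guest booted with.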

`RAW DATA` is the raw data blob for the measurement event, containing the aforementioned information as
well as the informational string (`Linux initrd`, in this case) associated with the event. Note that this
can be misleading, as for some events measured by OVMF, the informational string is actually equal to the
measured data (the input for `sha384sum`) - however, this isn't the case for all measurements.
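
Since only the `RTMR` and `Digest[0]` lines matter for most comparisons, it can help to reduce the log to a list of digests per register. The sketch below parses the textual output shown above. It assumes every entry of interest prints an `RTMR` line before its `Digest[0]` line, as in the example. The function name `digests_by_rtmr` is hypothetical, and the exact column spacing of the real tool output may differ, which is why the regular expression tolerates arbitrary whitespace around the colon.

```python
import re
from collections import defaultdict

# Patterns follow the example entry above; spacing around ':' is left flexible.
ENTRY_HEADER = re.compile(r"^==== TDX Event Log Entry - (\d+)")
FIELD = re.compile(r"^(RTMR|Digest\[0\])\s*:\s*(\S+)")


def digests_by_rtmr(log_text: str) -> dict[int, list[str]]:
    """Map each RTMR index to its digests, in event log order."""
    result: dict[int, list[str]] = defaultdict(list)
    current_rtmr = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if ENTRY_HEADER.match(line):
            # New entry: forget the register of the previous one.
            current_rtmr = None
            continue
        match = FIELD.match(line)
        if match is None:
            continue  # Type, Length, raw hex dump, etc.
        key, value = match.groups()
        if key == "RTMR":
            current_rtmr = int(value)
        elif current_rtmr is not None:
            result[current_rtmr].append(value.lower())
    return dict(result)
```

Feeding it the saved `tdeventlog` output and looking up index `2` would then give the RTMR 2 digests in the order they were extended.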

## Locating mismatches

Usually, the error given by the coordinator, CLI, etc. will already show you which RTMR mismatched.

To narrow it down further, it's recommended to add debug statements to the [`hashAndExtend`](https://github.com/edgelesssys/contrast/blob/a73691e17492b37469e32c7e800c4c0f7a955545/tools/tdx-measure/rtmr/rtmr.go#L45)
function of the measurement precalculator so that it logs the digests corresponding to the `Digest[0]` values in the
event log. Then, one can diff these against the digests in the event log for the RTMR in question to see
which event causes the mismatch.
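
The comparison can also be scripted. The sketch below assumes the usual extend operation, where a register starts as 48 zero bytes and each event updates it to `SHA384(old_value || digest)`. It hasn't been checked against the precalculator's implementation. Both function names are hypothetical. `replay_rtmr` recomputes the final register value from a digest list, and `first_mismatch` reports the index of the first event where two digest lists diverge.

```python
import hashlib

RTMR_SIZE = 48  # SHA-384 output size in bytes


def replay_rtmr(digests_hex: list[str]) -> str:
    """Replay extends over an all-zero register and return the final value as hex."""
    rtmr = bytes(RTMR_SIZE)
    for digest_hex in digests_hex:
        digest = bytes.fromhex(digest_hex)
        if len(digest) != RTMR_SIZE:
            raise ValueError(f"expected {RTMR_SIZE}-byte digest, got {len(digest)}")
        # Assumed extend semantics: new = SHA384(old || digest).
        rtmr = hashlib.sha384(rtmr + digest).digest()
    return rtmr.hex()


def first_mismatch(expected: list[str], actual: list[str]):
    """Return the index of the first differing digest, or None if the lists agree."""
    for index, (exp, act) in enumerate(zip(expected, actual)):
        if exp.lower() != act.lower():
            return index
    if len(expected) != len(actual):
        # One list has extra events after the common prefix.
        return min(len(expected), len(actual))
    return None
```

Here `expected` would hold the digests logged by the precalculator and `actual` the digests from the guest's event log for the same register. The returned index points at the event to investigate.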

Finding the cause of the mismatch is then a matter of code search and working out which component made which
measurement.

The [TDX Virtual Firmware documentation](https://cdrdv2.intel.com/v1/dl/getContent/733585) gives an abstract
overview of what components of the boot chain are generally reflected in the specific registers, but this is
likely not sufficient to find the exact location where things go wrong.

GitHub code search for the informational string of the event seems to be a good general way to find
out what exactly is being measured.

Below is an incomplete list of which component measures into which RTMRs:

### RTMR 0

Measured into by the Firmware(?).

Contains the firmware itself, secure boot EFI variables, ACPI configuration and the `EFI_LOAD_OPTION`
passed by the VMM.

### RTMR 1

Measured into by the Firmware.

Contains a measurement of the loaded EFI application (the kernel, for example), and raw hashes of the aforementioned
informational strings.

### RTMR 2

Measured into by GRUB or the Linux [EFI stub](https://elixir.bootlin.com/linux/v6.11.8/source/drivers/firmware/efi/libstub/efi-stub-helper.c),
as it would do with TPM PCR 8/9.

This contains a measurement for the kernel command line and the initrd, in that order.

### RTMR 3

Reserved, all-0 at the moment of writing.
@@ -1,18 +1,20 @@
From 649d2928a8f6552ba0644e6f9a7f719dfbdabe60 Mon Sep 17 00:00:00 2001
From: Tom Dohrmann <[email protected]>
Date: Mon, 23 Sep 2024 10:57:09 +0200
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Moritz Sanft <[email protected]>
Date: Thu, 21 Nov 2024 09:43:08 +0100
Subject: [PATCH] verify Hobs instead of measuring them

We don't want the measurement to change depending on the amount of
memory assigned to the CVM. Instead verify the Hobs to make sure that
they contain reasonable values and don't measure anything.

Authored-by: Tom Dohrmann <[email protected]>
---
OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelper.c | 106 ++++++++++++++----
.../IntelTdx/TdxHelperLib/SecTdxHelperLib.inf | 19 +++-
2 files changed, 100 insertions(+), 25 deletions(-)

diff --git a/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelper.c b/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelper.c
index 19e9b1bf54..f5cd850b84 100644
index 19e9b1bf5491e004773a4034e0b7664ce49cbbaa..f5cd850b84b7b22745035fac0dc263a1fcff6d11 100644
--- a/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelper.c
+++ b/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelper.c
@@ -873,44 +873,106 @@ TdxHelperMeasureTdHob (
@@ -145,7 +147,7 @@ index 19e9b1bf54..f5cd850b84 100644
return EFI_SUCCESS;
}
diff --git a/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelperLib.inf b/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelperLib.inf
index d17b84c01f..dc2a822b90 100644
index d17b84c01f202abd9860e093d57917d04b6dbce0..dc2a822b902ea47a861ffd0d4513bc769431f2b5 100644
--- a/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelperLib.inf
+++ b/OvmfPkg/IntelTdx/TdxHelperLib/SecTdxHelperLib.inf
@@ -43,11 +43,24 @@
@@ -176,6 +178,3 @@ index d17b84c01f..dc2a822b90 100644

[Guids]
gCcEventEntryHobGuid
--
2.46.0

6 changes: 6 additions & 0 deletions packages/by-name/OVMF-TDX/package.nix
@@ -2,14 +2,20 @@
# SPDX-License-Identifier: AGPL-3.0-only

{
lib,
edk2,
nasm,
acpica-tools,
debug ? false,
}:

edk2.mkDerivation "OvmfPkg/IntelTdx/IntelTdxX64.dsc" rec {
name = "OVMF-TDX";

buildFlags = lib.optionals debug [ "-D DEBUG_ON_SERIAL_PORT=TRUE" ];

buildConfig = if debug then "DEBUG" else "RELEASE";

nativeBuildInputs = [
nasm
acpica-tools
1 change: 1 addition & 0 deletions packages/by-name/mkNixosConfig/package.nix
@@ -43,6 +43,7 @@ lib.makeOverridable (
pause-bundle
nvidia-ctk-oci-hook
nvidia-ctk-with-config
tdx-tools
;
inherit (outerPkgs.kata) kata-agent;
})
30 changes: 30 additions & 0 deletions packages/by-name/tdx-tools/package.nix
@@ -0,0 +1,30 @@
# Copyright 2024 Edgeless Systems GmbH
# SPDX-License-Identifier: AGPL-3.0-only

{
python3Packages,
fetchFromGitHub,
}:

python3Packages.buildPythonApplication rec {
pname = "tdx-tools";
version = "noble-24.04";
pyproject = true;

src = fetchFromGitHub {
owner = "canonical";
repo = "tdx";
rev = version;
sha256 = "sha256-4Uzsnrf/B3awMutSPSF9PeOZ68mstNzQXnaD11nHWD4=";
};

build-system = [ python3Packages.setuptools ];

dependencies = with python3Packages; [
py-cpuinfo
];

preBuild = ''
cd tests/lib/tdx-tools
'';
}
63 changes: 63 additions & 0 deletions packages/debug-qemu-tdx.sh
@@ -0,0 +1,63 @@
#!/usr/bin/env bash
# Copyright 2024 Edgeless Systems GmbH
# SPDX-License-Identifier: AGPL-3.0-only

# This script starts a QEMU TDX VM as Kata would (more or less)
# for debugging purposes.

set -euo pipefail

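# Use ${1:-} so that a missing argument prints the usage text instead of
# tripping `set -u` with an "unbound variable" error.
if [ -z "${1:-}" ]; then
  echo "Usage: $0 <runtime-name> [-bios <path>]" >&2
  exit 1
fi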

runtime_name=$1

bios="/opt/edgeless/${runtime_name}/tdx/share/OVMF.fd"
while [[ $# -gt 0 ]]; do
  key="$1"
  case $key in
  -bios)
    shift
    bios="$1"
    shift
    ;;
  *)
    shift
    ;;
  esac
done

base_cmdline='tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/vda1 rootflags=ro rootfstype=erofs console=hvc0 console=hvc1 debug systemd.show_status=true systemd.log_level=debug panic=1 nr_cpus=1 selinux=0 systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket scsi_mod.scan=none agent.log=debug agent.debug_console agent.debug_console_vport=1026'
kata_cmdline=$(tomlq -r '.Hypervisor.qemu.kernel_params' "/opt/edgeless/${runtime_name}/etc/configuration-qemu-tdx.toml")
extra_cmdline='console=ttyS0 systemd.unit=default.target'

/opt/edgeless/${runtime_name}/tdx/bin/qemu-system-x86_64 \
-name sandbox-testing,debug-threads=on \
-uuid 49ce7d67-eade-4708-a81f-b5b904213207 \
-machine q35,accel=kvm,kernel_irqchip=split,confidential-guest-support=tdx \
-cpu host,-vmx-rdseed-exit,pmu=off \
-m 2148M,slots=10,maxmem=516333M \
-device pci-bridge,bus=pcie.0,id=pci-bridge-0,chassis_nr=1,shpc=off,addr=2,io-reserve=4k,mem-reserve=1m,pref64-reserve=1m \
-device virtio-serial-pci,disable-modern=false,id=serial0 \
-device virtio-blk-pci,disable-modern=false,drive=image-3132ead95475d1bb,config-wce=off,share-rw=on,serial=image-3132ead95475d1bb \
-drive id=image-3132ead95475d1bb,file=/opt/edgeless/${runtime_name}/share/kata-containers.img,aio=threads,format=raw,if=none,readonly=on \
-device virtio-scsi-pci,id=scsi0,disable-modern=false \
-object '{"qom-type":"tdx-guest","id":"tdx","mrconfigid":"XGOgbZcHhD3KKCQ1Z4aeLiAYlCQu6/zTrhgQLkAQg/cAAAAAAAAAAAAAAAAAAAAA","quote-generation-socket":{"type":"vsock","cid":"2","port":"4050"}}' \
-object rng-random,id=rng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=rng0 \
-rtc base=utc,driftfix=slew,clock=host \
-global kvm-pit.lost_tick_policy=discard \
-vga none \
-no-user-config \
-nodefaults \
-nographic \
--no-reboot \
-object memory-backend-ram,id=dimm1,size=2148M \
-kernel /opt/edgeless/${runtime_name}/share/kata-kernel \
-initrd /opt/edgeless/${runtime_name}/share/kata-initrd.zst \
-append "${base_cmdline} ${kata_cmdline} ${extra_cmdline}" \
-serial stdio \
-bios "${bios}" \
-smp 1,cores=1,threads=1,sockets=1,maxcpus=1
1 change: 1 addition & 0 deletions packages/nixos/debug.nix
@@ -27,6 +27,7 @@ in
util-linux
coreutils
strace
tdx-tools
];

services.getty.autologinUser = "root";
4 changes: 4 additions & 0 deletions tools/vale/styles/config/vocabularies/edgeless/accept.txt
@@ -133,3 +133,7 @@ Xeon
xsltproc
Zipkin
# keep-sorted end
RTMR
RTMRs
precalculator
initrd