Towards Trusted Contextualization of Confidential Virtual Machines

Master's Thesis

Florian Lubitz

2025-02-17

Contextualization in Cloud Computing

Rapid scaling
Many individual VMs
Based on a base image
Same image, different context

Confidential Computing in the Cloud

Cloud Provider has access to VMs
Confidential Computing offers VM TEEs
Encryption for Data in Use
Remote Attestation
Direct Measured Boot for Initial Context
Works with Contextualization

Confidential VM

Hypervisor

Host OS

Hardware

Security Shortcomings with Contextualization

Contextualization with Base Image
Contextualization changes Image
Third Party involvement in Software Stack
No Record of Contextualization

Hardware

Hypervisor

Initial Context

VM Image

Contextualization

Dynamic Deployments

Enabling Post Boot Verification

Establish Trust to Image

Runtime Measurements

Trusted Contextualization

Policy Enforcement

Fine Grained Auditing

AMD SEV-SNP CVMs

AMD Secure Encrypted Virtualization TEE
Direct Measured Boot
Remote Attestation
Hardware Root of Trust

Attestation Report

Field	Offset&Bits	Description
Version	00h 31:0	SEV-SNP attestation report version.
Guest SVN	04h 31:0	Security version number of the guest firmware.
Policy	08h 63:0	Defines security settings and VM restrictions.
...	...	...
Custom Data	50h 511:0	User-supplied data
Measurement	90h 383:0	Cryptographic hash of the VM's initial memory state.
...	...	...
Signature (VCEK)	2A0h-49Fh	Cryptographic signature to verify authenticity.
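
The offsets in the table can be turned into a small parser. The following is a minimal sketch, not STIEVM's code: the names `ReportFields`, `REPORT_LEN` and `parse_report` are hypothetical, only the fields listed in the table are read, and integer fields are assumed to be little-endian.

```rust
use std::convert::TryInto;

/// Selected fields of an SEV-SNP attestation report, read at the
/// offsets given in the table. All names here are illustrative.
#[derive(Debug)]
pub struct ReportFields {
    pub version: u32,          // 00h, 32 bits
    pub guest_svn: u32,        // 04h, 32 bits
    pub policy: u64,           // 08h, 64 bits
    pub report_data: [u8; 64], // 50h, 512 bits ("Custom Data")
    pub measurement: [u8; 48], // 90h, 384 bits
}

/// The signature ends at offset 49Fh (inclusive), so a full report
/// is 4A0h = 1184 bytes long.
pub const REPORT_LEN: usize = 0x4A0;

/// Returns `None` if the buffer is too short to be a full report.
pub fn parse_report(raw: &[u8]) -> Option<ReportFields> {
    if raw.len() < REPORT_LEN {
        return None;
    }
    // Assumption: integer fields are stored little-endian.
    let version = u32::from_le_bytes(raw[0x00..0x04].try_into().ok()?);
    let guest_svn = u32::from_le_bytes(raw[0x04..0x08].try_into().ok()?);
    let policy = u64::from_le_bytes(raw[0x08..0x10].try_into().ok()?);
    let report_data: [u8; 64] = raw[0x50..0x90].try_into().ok()?;
    let measurement: [u8; 48] = raw[0x90..0xC0].try_into().ok()?;
    Some(ReportFields { version, guest_svn, policy, report_data, measurement })
}
```

The sketch does not verify the signature; a verifier would still have to check it against the chip's endorsement key before trusting any field.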

Measurement

Kernel
Kernel Commandline
Initramfs
Firmware
VM Image

Custom Data

Given on Report Request
64 Bytes Binary Data
e.g. Workload Metadata

extended Berkeley Packet Filter (eBPF)

Origin as network packet filter
extended to general purpose
compiled to eBPF bytecode
dynamically loaded into kernel

extended Berkeley Packet Filter (eBPF)

Implement LSM Hooks
LSM adds hooks to system calls
additional checks possible

User Level Process

System Call

File Search

Access Control

LSM Hook

Access

path_mkdir, path_chmod, file_open, task_fix_setuid, bpf... (277)

Motivation

Security Risks in Cloud Environment
Contextualization in Confidential Virtual Machines
Gap of Trust between Initial Context and Post Boot State

STIEVM

Immutable Trusted Foundation
Trusted Contextualization
Trusted Recording of System and Workload
Policy Enforcement and Hardened Assurances
Remote Attestation for Stakeholders

Threat Model

Workload Owner/User

Service Provider

Cloud Provider

Workload

Kubernetes

CVM

Host OS

Hardware

Cloud Provider

Provides

Hardware + Infrastructure
Image Repository
Starts VM

Attacks

Alter Base Image
Alter Boot Arguments

Host OS

Hardware

Service Provider

Provides

VM Configuration
Boot Arguments & Base Image
Workload Platform
STIEVM

Attacks

Faulty Config
Nosy Neighbour

Verifies

VM Integrity

Kubernetes

VM OS

Host OS

Hardware

Workload Owner

Provides

Workload
Policies

Verifies

VM Integrity
Workload Integrity

Workload

Kubernetes

VM OS

Host OS

Hardware

Goals revisited

Extend Measured Envelope of Hardware Root of Trust	Verifiable Root File System
System Centered Contextualization	Policy-Parser Contextualizer
Runtime Integrity Verification	Recorder & Reporter HTTP-Server
Target Identity Establishment	Identification
Workload Centered Contextualization	Workload-Listener Policy-Enforcer

Components

STIEVM

Policy-Parser

Identification

Contextualizer

Recorder & Reporter

HTTP-Server

Workload-Listener

Policy-Enforcer

Verifiable Root File System

CVM Life phases

Provisioning

Bootstrap

Runtime

Bootstrap

STIEVM

Policy-Parser

Verifiable Root File System

Bootstrap

Verifiable Root File System

root=UUID=f5d4b3db-cda4-45fa-8689-1540e3cbe989 root_hash=b77ec379b630e9fb88ce5c3439bbe794233bf3b4b95018401a70d7d10d50be2f

veritysetup verify ${root_dev} ${verity_dev} ${root_hash} -v
veritysetup create vroot ${root_dev} ${verity_dev}  ${root_hash} ${salt}

switch_root vroot

Bootstrap

Policy-Parser

Reads policy and hash from kernel command line
Verifies policy against hash
Parses policy for other components

...be2f  stievm_policy="{\"net...de\"}]}}" stievm_policy_hash="1840...50be2f"

pub struct PolicyConfig {
  pub network: Option<NetworkPolicy>,
  pub attestation: Option<AttestationPolicy>,
  pub ssh: Option<SshPolicy>,
  pub luks_devices: Option<LuksDevicePolicy>,
}
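
The Policy-Parser steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `cmdline_param` and `verify_policy` are hypothetical helpers, the value is assumed to be double-quoted with `\"` escapes as on the slide, and the hash function is passed in as a parameter because the slide does not name it.

```rust
/// Extract `key="value"` from a kernel command line and resolve
/// backslash escapes inside the value. Hypothetical helper.
pub fn cmdline_param(cmdline: &str, key: &str) -> Option<String> {
    let needle = format!("{}=\"", key);
    let mut from = 0;
    // Only accept a match at the start of the line or after a space,
    // so that `key` is not matched as the suffix of another parameter.
    let start = loop {
        let pos = from + cmdline[from..].find(&needle)?;
        if pos == 0 || cmdline[..pos].ends_with(' ') {
            break pos + needle.len();
        }
        from = pos + needle.len();
    };
    let mut value = String::new();
    let mut chars = cmdline[start..].chars();
    while let Some(c) = chars.next() {
        match c {
            '\\' => value.push(chars.next()?), // keep the escaped character
            '"' => return Some(value),         // closing quote ends the value
            other => value.push(other),
        }
    }
    None // unterminated value
}

/// Return the policy only if its hash matches `stievm_policy_hash`.
/// `hash_hex` is an assumption: any function producing the hex digest.
pub fn verify_policy(cmdline: &str, hash_hex: impl Fn(&[u8]) -> String) -> Option<String> {
    let policy = cmdline_param(cmdline, "stievm_policy")?;
    let expected = cmdline_param(cmdline, "stievm_policy_hash")?;
    if hash_hex(policy.as_bytes()) == expected {
        Some(policy)
    } else {
        None
    }
}
```

Because the kernel command line is part of the launch measurement, a policy that passes this check is transitively covered by the hardware attestation.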

Runtime

Setup

Configuration

Workload Runtime

Self Setup

Policy-Parser

Identification

Policy-Enforcer

Identification

Uniquely Identify STIEVM
Couples STIEVM with the CVM in Initial Attestation
Provides key to other components
Ephemeral ed25519 key also acts as nonce
Shows freshness for one time measurements
Root of Trust for dynamic attestation collateral

Policy-Enforcer

Implements LSM Hooks with eBPF
Includes eBPF in STIEVM binary
Controls Access to files outside of rootfs
Ensures Integrity of STIEVM

Contextualizer

HTTP-Server

Contextualizer

Recorder & Reporter

Contextualizer

Custom Contextualization Tool
Two stages: Basic and Extended
Logs all changed files
Organized in Modules

Contextualizer

pub trait Configurator {
  fn new() -> Self
  where
    Self: Sized;

  fn configure(&mut self, config: &PolicyConfig);

  fn configure_extended(&mut self, config: &PolicyConfig) {}

  fn get_config_tracker(&self) -> &ConfigurationTracker;
}
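
To illustrate how a module could plug into this trait and log its changed files, here is a minimal sketch. The trait is repeated so the example compiles on its own; `PolicyConfig` and `ConfigurationTracker` are reduced hypothetical stand-ins, and the field `ssh_authorized_key` and the path `/root/.ssh/authorized_keys` are made up for the example.

```rust
use std::path::PathBuf;

// Hypothetical stand-in for the real policy type.
#[derive(Default)]
pub struct PolicyConfig {
    pub ssh_authorized_key: Option<String>,
}

// Hypothetical stand-in: remembers every file a module touched.
#[derive(Default)]
pub struct ConfigurationTracker {
    changed: Vec<PathBuf>,
}

impl ConfigurationTracker {
    pub fn record(&mut self, path: impl Into<PathBuf>) {
        self.changed.push(path.into());
    }
    pub fn changed_files(&self) -> &[PathBuf] {
        &self.changed
    }
}

pub trait Configurator {
    fn new() -> Self
    where
        Self: Sized;
    fn configure(&mut self, config: &PolicyConfig);
    fn configure_extended(&mut self, _config: &PolicyConfig) {}
    fn get_config_tracker(&self) -> &ConfigurationTracker;
}

pub struct SshConfigurator {
    tracker: ConfigurationTracker,
}

impl Configurator for SshConfigurator {
    fn new() -> Self {
        SshConfigurator { tracker: ConfigurationTracker::default() }
    }

    fn configure(&mut self, config: &PolicyConfig) {
        if config.ssh_authorized_key.is_some() {
            // A real module would write the file here; the sketch
            // only records the (made-up) path as changed.
            self.tracker.record("/root/.ssh/authorized_keys");
        }
    }

    fn get_config_tracker(&self) -> &ConfigurationTracker {
        &self.tracker
    }
}
```

Keeping `new()` behind `Self: Sized` leaves the trait object-safe, so modules can be stored as `Box<dyn Configurator>` and their trackers collected after both stages.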

HTTP-Server

Endpoints for initial attestation:

Accessible for Service Provider

GET / to get initial attestation report
POST / to send secrets for context (LUKS keys, ...)

Contextualizer

Second Stage of Contextualization
Merge Secrets into Policy
Extended Configuration

Recorder & Reporter

Create Report with hash for each touched file

Stored as a file for later use

Recorder & Reporter

config.response.json

{
  "/...authorized_keys": "1840...50be2f",
  "/...config.yaml": "44d0...a32c2a",
  "/...stievm.conf": "b77e...be2f"
}

config.response.signature.txt

uRYmc+Whyuc5FySyFzCkRFpoZeJa
HrTpZR0dnLvwBd1woTvKZU6wvbVl
wISqErK1Qa6bvC2z848Bd23nAMmdAg==

config.report.bin

Attestation Report (1184 bytes):

Report Data:                  
4f 53 cd a1 8c 2b aa 0c 03 54 bb 5f 9a 3e cb e5 
ed 12 ab 4d 8e 11 ba 87 3c 2f 11 16 12 02 b9 45 
88 42 f3 6b ca c0 fb 09 14 b1 b8 3d 41 fe 6b a9 
d4 ab cd f9 8c 49 16 72 b4 38 f7 5f c1 55 d6 f6 

Report Data:                  
sha256(config.response.json)
sha256(config.response.signature.txt)
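
The 64-byte Report Data field has room for exactly two SHA-256 digests. A minimal sketch of that layout, assuming the two digests are simply concatenated in the order listed above (the function name `report_data` is hypothetical, and the digests are computed elsewhere because the Rust standard library ships no SHA-256):

```rust
/// Build the 64-byte Report Data value from two 32-byte digests.
/// Assumption: response digest first, signature digest second.
pub fn report_data(response_digest: [u8; 32], signature_digest: [u8; 32]) -> [u8; 64] {
    let mut data = [0u8; 64];
    data[..32].copy_from_slice(&response_digest);
    data[32..].copy_from_slice(&signature_digest);
    data
}
```

A verifier recomputes both digests from the files it received and compares the result with the Report Data inside the signed attestation report.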

HTTP-Server

Workload-Listener

Policy-Enforcer

HTTP-Server

Endpoint	Method	Description
/report/attest	POST	Runtime Measurements
/report/initial	GET	Initial attestation report
/report/config	GET	Configuration Report

HTTP-Server

Endpoint	Method	Description
/workload/:id/policy	POST	Update Workload Policy
/workload/:id/initial	GET	Initial workload report
/workload/:id/measure	POST	Workload Runtime Measurements
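
The system and workload endpoints can be pictured as one routing table. The sketch below is illustrative only: the `Route` enum and the `route` function are hypothetical names, and the real server may use a web framework instead of manual matching.

```rust
/// One variant per endpoint from the two tables. Hypothetical type.
#[derive(Debug, PartialEq)]
pub enum Route {
    Attest,                  // POST /report/attest
    InitialReport,           // GET  /report/initial
    ConfigReport,            // GET  /report/config
    WorkloadPolicy(String),  // POST /workload/:id/policy
    WorkloadInitial(String), // GET  /workload/:id/initial
    WorkloadMeasure(String), // POST /workload/:id/measure
}

/// Map an HTTP method and path to a route; `None` means "not found".
pub fn route(method: &str, path: &str) -> Option<Route> {
    let segments: Vec<&str> = path.trim_matches('/').split('/').collect();
    match (method, segments.as_slice()) {
        ("POST", ["report", "attest"]) => Some(Route::Attest),
        ("GET", ["report", "initial"]) => Some(Route::InitialReport),
        ("GET", ["report", "config"]) => Some(Route::ConfigReport),
        // The `:id` segment must not be empty.
        ("POST", ["workload", id, "policy"]) if !id.is_empty() => {
            Some(Route::WorkloadPolicy(id.to_string()))
        }
        ("GET", ["workload", id, "initial"]) if !id.is_empty() => {
            Some(Route::WorkloadInitial(id.to_string()))
        }
        ("POST", ["workload", id, "measure"]) if !id.is_empty() => {
            Some(Route::WorkloadMeasure(id.to_string()))
        }
        _ => None,
    }
}
```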

User Manager

System User

Linux User
System Config
System Reports

STIEVM User

Workload User
Workload Reports
Initial & Config Report

Authentication

ED25519 Key Pair
System User: authorized_keys
STIEVM User: STIEVM PKey

Workload Listener

kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
    stievmUser: a-user
  annotations:
    stievm/auth: "3oyYsSosKBWf9WaAfBHjCs4n1yD1eLYBCkKj0cZcYl0VA4fudDqFycbSBFoIqavMHSbXY/lFQIJ8Gs1B0z1Nm24iNG0nIorrVsZz4J2GCivrFBEyyHI5vBT6/AfyNn0D"
    stievm/auth-data: "SomeDataToSign"
    stievm/policy: "{\"path_guards\":[{\"/usr/share/nginx/html\":\"/usr/sbin/nginx\"}]}"

spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
        stievmUser: a-user
    spec:

Workload Listener

// Slide excerpt: `depl_id` and `user_manager` are defined outside this snippet.
async fn handle_deployment(
  depl: &Deployment,
  client: Client,
) -> Result<(), Box<dyn std::error::Error>> {
  // Reject deployments that carry no authentication annotations.
  if !(depl.has_annotation("stievm/auth")
    && depl.has_annotation("stievm/auth-data")) {
    error!("no auth");
    return Err("no auth".into());
  }

  // Each deployment is measured only once.
  if already_measured(&depl) {
    println!("Workload already measured");
    return Ok(());
  }

  let user_name = depl.get_label("stievmUser");
  let auth_data = depl.get_annotation("stievm/auth-data").unwrap();
  let auth = depl.get_annotation("stievm/auth").unwrap();

  // Verify the signature over the auth data before touching the workload.
  let auth_result = validate_auth(user_name, auth, auth_data.as_bytes().to_vec());
  if auth_result.is_err() {
    error!("Could not authenticate user");
    return Err("Could not authenticate user".into());
  }

  measure_initial_workload(&depl).await?;

  user_manager.add_workload_to_user(user_name, &depl_id);

  // Parse the workload policy and install guards for every pod.
  let policy = depl.get_annotation("stievm/policy").unwrap();
  let parsed_policy = PolicyParser::parse(policy).unwrap();

  let pods = get_pods(&depl, client).await?;
  for pod in pods {
    let container = pod.get_container();

    enforcer::install_workload_guards(
      container.id,
      &depl_id,
      &parsed_policy,
    )?;
  }

  Ok(())
}

Evaluation

System Specifications

AMD EPYC 7313
8GB VM RAM
4 vCPUs
SEV-SNP enabled
Linux 6.8.0-rc5 with AMD-SEV patches

Verifiable Root File System

Initramfs

20 runs, 2641 MiB

Verification time: ~2.63s

verity dm target creation: ~0.01s

Custom initramfs time: ~2.7s

Policy Enforcer eBPF

Installation (for each workload)

Step	Duration (µs)	Standard Deviation (µs)	% of Total
Setup BPF FS	38.47	11.65	51.53%
Get Inodes for Paths	36.19	9.61	48.47%
Total	74.65

Policy Enforcer eBPF

Access Control

20,000 file accesses

Configuration	Path	Mean Time (µs)	Std. Dev. (µs)	Min Time (µs)	Max Time (µs)
BPF Enabled	Protected	1053.34	157.19	921.97	3843.6
BPF Enabled	Unprotected	1125.78	156.88	932.09	4130.42
BPF Disabled	Protected	1130.09	168.3	911.98	2062.12
BPF Disabled	Unprotected	1142.52	163.87	918.7	2117.58
Overhead	Protected	-76.75		9.99	1781.48
Overhead	Unprotected	-16.74		13.39	2012.84
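
The Overhead rows are the difference between the BPF Enabled and BPF Disabled rows. A small sketch of that arithmetic, which also expresses the difference relative to the disabled baseline (the function `overhead` is a hypothetical helper; the relative value is derived here and is not part of the original table):

```rust
/// Returns (absolute overhead in µs, overhead in percent of the
/// BPF-disabled baseline).
pub fn overhead(enabled_us: f64, disabled_us: f64) -> (f64, f64) {
    let abs = enabled_us - disabled_us;
    (abs, abs / disabled_us * 100.0)
}
```

For the mean times in the table, `overhead(1053.34, 1130.09)` reproduces the -76.75 µs of the Protected row and `overhead(1125.78, 1142.52)` the -16.74 µs of the Unprotected row. Both differences are smaller than the standard deviations reported in the same table.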

HTTP Server

Response times for different requests

Request	Mean Time (µs)
Initial Attestation	5 865.28
Secret Post	271.03
Initial Report	91.44
Configuration Report	91.40
New System Measurement	12 162.50

Average Time to get Attestation Report from FW: 5288 µs

HTTP Server

Concurrent Requests

1-128 concurrent requests, 10 runs each

Space Overhead

The space overhead for the system is 100 MiB

Related Work

Academic

Johnson et al.:
Parma: Confidential containers via attested execution policies
Pecholt and Wessel:
CoCoTPM: Trusted platform modules for virtual machines in confidential computing environments
Wilke and Scopelliti:
SNPGuard: Remote attestation of SEV-SNP VMs using open source tools

Commercial

cloud-init & Ansible
Integrity Measurement Architecture
Tetragon, KubeArmor, AppArmor
Confidential Containers

Conclusion

Secure, Immutable, Integrity protected System
Post Boot Attestation of Contextualized System
Post Boot Attestation of Dynamic Workloads
Policy Enforcer for Contextualization and Hardening
Integrated into Kubernetes
Minimal overhead
Extendable for more features and other systems

Policy-Enforcer

Get Inodes of STIEVM dirs
Load eBPF program into kernel
Populate eBPF maps with Inodes
Attach eBPF program to LSM hooks
Pin eBPF program to filesystem

Logs all actions
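
The first step, resolving paths to inode numbers, needs only the standard library on Linux. A minimal sketch under the assumption that the guarded paths are known up front; `inodes_for_paths` is a hypothetical helper, and loading and attaching the eBPF program is not shown.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::{Path, PathBuf};

/// Resolve each path to its inode number. The result mirrors what
/// would be written into the eBPF inode map (inode -> guarded path).
pub fn inodes_for_paths<P: AsRef<Path>>(paths: &[P]) -> io::Result<HashMap<u64, PathBuf>> {
    let mut map = HashMap::new();
    for p in paths {
        // `metadata` follows symlinks and fails if the path is missing.
        let meta = fs::metadata(p.as_ref())?;
        map.insert(meta.ino(), p.as_ref().to_path_buf());
    }
    Ok(map)
}
```

Inode numbers are unique only within one filesystem, so a guard keyed on the bare inode number implicitly assumes that the guarded directories do not collide with inodes on other mounts.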

Policy-Enforcer

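SEC("lsm/file_open")
int BPF_PROG(file_guard, struct file* file) {
  __u32 current_pid = bpf_get_current_pid_tgid() >> 32;
  struct dentry* dentry = file->f_path.dentry;

  // Walk from the opened file up towards the root directory
  for (int i = 0; i < MAX_DEPTH; i++) {
    __u64 inode = dentry->d_inode->i_ino;
    if (isInodeInMap(inode)) {
      // Guarded path: only the STIEVM process may open it
      if (current_pid != stievm_pid) {
        return -EACCES;
      } else {
        return 0;
      }
    }

    if (dentry->d_parent == dentry) {
      break;  // Reached the root directory
    }
    dentry = dentry->d_parent;
  }

  return 0;  // No guarded inode found on the path
}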

Sending Secrets to HTTP Server

POST / HTTP/1.1
Host: 172.27.16.46:3000
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
User-Agent: python-httpx/0.27.2
Authorization: florian;QjsyxWGSPtFpIGf8cUJbzLodt411WeqVauYFmZYebXvUEgdEXnDOrvWwHFKWam9847Er88oLp+pa59VjI5dafg0+rZUjiWCtFrhYXwtXyiZj1smdEYZNjTOl2pM1dQoE
Content-Type: application/json
Content-Length: 82

{"luks_devices": {"device_keys": [{"id": "my-device-1", "key": "SomeSecretKey"}]}}
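
The Authorization header in this request has the shape `<user>;<base64 data>`. A minimal sketch of splitting it, assuming the first part is the user name and the second part is a Base64-encoded signature (the helper names are hypothetical, and verifying the signature against the user's ED25519 key is not shown):

```rust
/// True if `s` contains only characters of the standard Base64 alphabet.
fn is_base64(s: &str) -> bool {
    s.chars()
        .all(|c| c.is_ascii_alphanumeric() || c == '+' || c == '/' || c == '=')
}

/// Split `user;signature` into its two parts. Returns `None` if the
/// separator is missing, a part is empty, or the second part is not
/// plausible Base64.
pub fn split_authorization(header: &str) -> Option<(&str, &str)> {
    let (user, signature_b64) = header.split_once(';')?;
    if user.is_empty() || signature_b64.is_empty() || !is_base64(signature_b64) {
        return None;
    }
    Some((user, signature_b64))
}
```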

Basic Configuration

pub fn basic_configuration(config: &PolicyConfig) -> Vec<Box<dyn configurator::Configurator>> {
  let mut configurators: Vec<Box<dyn configurator::Configurator>> = vec![
    Box::new(NetworkConfigurator::new()),
    Box::new(AttestationConfigurator::new()),
    Box::new(SshConfigurator::new()),
  ];
  for configurator in &mut configurators {
    configurator.configure(config);
  }
  stievm_log!("stievm", "contextualizer", "Finished basic configuration");
  configurators
}

/report/attest

POST /report/attest HTTP/1.1
Host: 172.27.16.46:3000
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
User-Agent: python-httpx/0.27.2
Authorization: florian;a/HykOzqBVawZQCoDsqKU2828It96tChUxhc3GitvVbr882OWWvPkMml1d7AZ/vkeJNENbb6xzBv+k/n517yIU0UNjOPhXHRvMdovt7CX0KayKgmu1sEfQBIyywmBjEC
Content-Type: application/json
Content-Length: 254

{
  "nonce": "32c9254a-be71-4232-8053-1e4956e114be",
  "evidence": [
    {
      "type": "fs_hash",
      "path": "/home/florian/projects/ttccvm/cvm",
      "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8="
    },
    {
      "type": "log"
    }
  ]
}

/report/attest

response.json

{
  "nonce": "32c9254a-be71-4232-8053-1e4956e114be",
  "evidence": [ 
    {
      "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8=",
      "value": [
        {
          "path": "",
          "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8="
        },
        {
          "path": "create-cvm-clean.sh",
          "hash": "dgphKtWQamZHeGlmyGNlvbSOroCaH2YP5Gadat6fcow="
        }
      ]
    },
    {
      "hash": "pfM6IlgDIG2sFdKq978hQHsWJ9e23Rfd8N1qVDqQ59s=",
      "value": "
      [INFO ] - [stievm][main][stievm] Starting stievm
      [INFO ] - [stievm][policy_parser][stievm::policy_parser::parser] Re
      [INFO ] - [stievm][stievm_guards][stievm::enforcer] stievm_guards A
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Starting
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Finished
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Starting
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Finished
      [INFO ] - [stievm][web_server][stievm::att_web_server::web_server] 
      "
    }
  ]
}

response.signature.txt

uRYmc+Whyuc5FySyFzCkRFpoZeJa
HrTpZR0dnLvwBd1woTvKZU6wvbVl
wISqErK1Qa6bvC2z848Bd23nAMmdAg==

report.bin

Attestation Report (1184 bytes):

Report Data:                  
4f 53 cd a1 8c 2b aa 0c 03 54 bb 5f 9a 3e cb e5 
ed 12 ab 4d 8e 11 ba 87 3c 2f 11 16 12 02 b9 45 
88 42 f3 6b ca c0 fb 09 14 b1 b8 3d 41 fe 6b a9 
d4 ab cd f9 8c 49 16 72 b4 38 f7 5f c1 55 d6 f6

Workload Guard

pub fn install_workload_guards(
  docker_container_id: &str,
  workload_id: &str,
  policy: &WorkloadPolicy,
) -> Result<(), Box<dyn std::error::Error>> {
  let docker_pid = get_docker_pid(docker_container_id)?;
  let container_pid = get_container_pid(docker_pid)?;

  check_bpf_lsm_enabled()?;

  // Get the pid namespace of the container
  let container_pid_ns = container_ns_inode(&docker_pid);

  log_info(...);

  // Resolve every guarded path and its allowed command to inode numbers
  let inodes_to_block = get_inodes(policy.path_guards);

  bpf.container_pid = container_pid;

  // Load the program, then fill the pinned inode map
  bpf.load();
  bpf.inode_map.pin();

  for block_pair in inodes_to_block {
    bpf.inode_map.add(
      &block_pair.path_inode,
      &block_pair.command_inode,
    );
  }

  // Attach to the LSM hook and pin the program
  bpf.attach();
  bpf.pin();

  log_info(...);

  Ok(())
}

Workload Vault BPF

SEC("lsm/file_open")
int BPF_PROG(check_file_open, struct file* file, int ret) {

  if (ret != 0) {
    return 0;
  }

  if (is_in_container()) {

    // Get the inode of the current task's executable
    struct task_struct* task;
    struct mm_struct* mm;
    struct file* exe_file;
    struct dentry* exe_dentry;
    struct inode* exe_inode;

    task = (struct task_struct*)bpf_get_current_task_btf();
    mm = task->mm;
    exe_file = mm->exe_file;
    exe_dentry = exe_file->f_path.dentry;
    exe_inode = exe_dentry->d_inode;
    u64 exe_inode_num = exe_inode->i_ino;

    // Check if the current process may access the specified inode
    struct dentry* dentry = file->f_path.dentry;
    // Traverse up the directory hierarchy to check for restrictions
    for (__u16 i = 0; i < MAX_DIR_DEPTH; i++) {  // Limit traversal depth to avoid verifier issues
      if (!dentry) {
        bpf_printk("No dentry found");
        break;
      }

      u64 inode = dentry->d_inode->i_ino;
      __u32 current_pid = bpf_get_current_pid_tgid() >> 32;
      u32 allowed_inode = is_inode_restricted(inode);
      // Check if this inode (file or directory) is restricted
      if (allowed_inode != 0) {
        bpf_printk("inode: %llu tries to access restricted %llu", exe_inode_num, inode);
        if (exe_inode_num != allowed_inode) {
          return -EACCES;  // Access denied: executable is not the allowed one
        } else {
          return 0;  // Access allowed for the allowed executable
        }
      }

      // Move up to the parent directory
      if (dentry->d_parent == dentry) {
        break;  // Reached the root directory
      }
      dentry = dentry->d_parent;
    }
  }

  return 0;  // Access allowed if no restricted inode found in path
}
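
The decision logic of this hook can be modelled in user space, which makes it easy to test a policy before it is loaded. The sketch below is an analogy, not the kernel code: it works on paths instead of inode numbers, and `may_open`, `MAX_DIR_DEPTH = 32` and the map layout (guarded path -> allowed executable) are assumptions for the example.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Upper bound for the ancestor walk, mirroring the bounded loop
/// in the eBPF program. The value 32 is a made-up example.
pub const MAX_DIR_DEPTH: usize = 32;

/// Decide whether `exe` may open `target`.
/// `guards` maps a guarded path to the only executable allowed to
/// access it (the `path_guards` pairs of a workload policy).
pub fn may_open(guards: &HashMap<PathBuf, PathBuf>, exe: &Path, target: &Path) -> bool {
    // `ancestors()` yields the path itself first, then each parent
    // up to the root, like following `d_parent` in the kernel.
    for dir in target.ancestors().take(MAX_DIR_DEPTH) {
        if let Some(allowed_exe) = guards.get(dir) {
            // The nearest guarded ancestor decides.
            return allowed_exe.as_path() == exe;
        }
    }
    true // no guarded path on the way to the root
}
```

With the guard `/usr/share/nginx/html` -> `/usr/sbin/nginx` from the example deployment, `/usr/sbin/nginx` may open files below the HTML directory, any other executable is denied there, and unguarded paths stay accessible to everyone.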