Towards Trusted Contextualization of Confidential Virtual Machines

Master's Thesis

Florian Lubitz

2025-02-17

Contextualization in Cloud Computing

  • Rapid scaling
  • Many individual VMs
  • Based on a base image
  • Same image, different context

Confidential Computing in the Cloud

  • Cloud Provider has access to VMs
  • Confidential Computing offers VM TEEs
  • Encryption for Data in Use
  • Remote Attestation
  • Direct Measured Boot for Initial Context
  • Works with Contextualization

[Figure: layered stack of Confidential VM, Hypervisor, Host OS, Hardware]

Security Shortcomings with Contextualization

  • Contextualization with Base Image
  • Contextualization changes Image
  • Third Party involvement in Software Stack
  • No Record of Contextualization

[Figure: Hardware, Hypervisor, VM, Initial Context, VM Image, Contextualization, Dynamic Deployments]

Enabling Post Boot Verification

  • Establish Trust to Image
  • Runtime Measurements
  • Trusted Contextualization
  • Policy Enforcement
  • Fine Grained Auditing

AMD SEV-SNP CVMs

  • AMD Secure Encrypted Virtualization TEE
  • Direct Measured Boot
  • Remote Attestation
  • Hardware Root of Trust

Attestation Report

Field             Offset     Bits   Description
Version           00h        31:0   SEV-SNP attestation report version.
Guest SVN         04h        31:0   Security version number of the guest firmware.
Policy            08h        63:0   Defines security settings and VM restrictions.
...               ...        ...    ...
Custom Data       50h        511:0  User-supplied data.
Measurement       90h        383:0  Cryptographic hash of the VM's initial memory state.
...               ...        ...    ...
Signature (VCEK)  2A0h-49Fh         Cryptographic signature to verify authenticity.

Measurement

  • Kernel
  • Kernel Commandline
  • Initramfs
  • Firmware
  • VM Image

Custom Data

  • Given on Report Request
  • 64 Bytes Binary Data
  • e.g. Workload Metadata
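
A verifier mostly cares about two fields of the report: Measurement and Custom Data. The following is a minimal sketch of pulling both out of a raw report buffer, using only the offsets and bit widths from the field table (50h with 512 bits, 90h with 384 bits, total size 4A0h = 1184 bytes). The names `ReportFields` and `extract_fields` are illustrative and not part of STIEVM. Signature verification is deliberately left out.

```rust
/// Sizes and offsets taken from the attestation report field table.
const REPORT_LEN: usize = 0x4A0; // 1184 bytes, the signature ends at 0x49F
const CUSTOM_DATA_OFFSET: usize = 0x50;
const CUSTOM_DATA_LEN: usize = 64; // bits 511:0
const MEASUREMENT_OFFSET: usize = 0x90;
const MEASUREMENT_LEN: usize = 48; // bits 383:0

/// Borrowed views of the two fields a verifier compares against expected values.
/// (Hypothetical helper type for this sketch.)
pub struct ReportFields<'a> {
    pub custom_data: &'a [u8],
    pub measurement: &'a [u8],
}

/// Returns `None` when the buffer does not have the expected report size.
pub fn extract_fields(report: &[u8]) -> Option<ReportFields<'_>> {
    if report.len() != REPORT_LEN {
        return None;
    }
    Some(ReportFields {
        custom_data: &report[CUSTOM_DATA_OFFSET..CUSTOM_DATA_OFFSET + CUSTOM_DATA_LEN],
        measurement: &report[MEASUREMENT_OFFSET..MEASUREMENT_OFFSET + MEASUREMENT_LEN],
    })
}
```

A caller would compare `measurement` against the value expected for the kernel, command line, initramfs and firmware, and `custom_data` against the data supplied when the report was requested.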

extended Berkeley Packet Filter (eBPF)

  • Origin as network packet filter
  • extended to general purpose
  • compiled to eBPF bytecode
  • dynamically loaded into kernel

extended Berkeley Packet Filter (eBPF)

  • Implement LSM Hooks
  • LSM adds hooks to system calls
  • additional checks possible

[Figure: call path from User Level Process via System Call, File Search, Access Control and LSM Hook to Access]

path_mkdir, path_chmod, file_open, task_fix_setuid, bpf... (277)

Motivation

  • Security Risks in Cloud Environment
  • Contextualization in Confidential Virtual Machines
  • Gap of Trust between Initial Context and Post Boot State

STIEVM

  • Immutable Trusted Foundation
  • Trusted Contextualization
  • Trusted Recording of System and Workload
  • Policy Enforcement and Hardened Assurances
  • Remote Attestation for Stakeholders

Threat Model

Workload Owner/User
Service Provider
Cloud Provider

Cloud Provider

Provides

  • Hardware + Infrastructure
  • Image Repository
  • Starts VM

Attacks

  • Alter Base Image
  • Alter Boot Arguments

Service Provider

Provides

  • VM Configuration
  • Boot Arguments & Base Image
  • Workload Platform
  • STIEVM

Attacks

  • Faulty Config
  • Nosy Neighbour

Verifies

  • VM Integrity

Workload Owner

Provides

  • Workload
  • Policies

Verifies

  • VM Integrity
  • Workload Integrity

Goals revisited

  • Extend Measured Envelope of Hardware Root of Trust: Verifiable Root File System
  • System Centered Contextualization: Policy-Parser, Contextualizer
  • Runtime Integrity Verification: Recorder & Reporter, HTTP-Server
  • Target Identity Establishment: Identification
  • Workload Centered Contextualization: Workload-Listener, Policy-Enforcer

Components

STIEVM

Policy-Parser
Identification
Contextualizer
Recorder & Reporter
HTTP-Server
Workload-Listener
Policy-Enforcer
Verifiable Root File System

CVM Life phases

Provisioning

Bootstrap

Runtime

Bootstrap

STIEVM

Policy-Parser
Verifiable Root File System

Verifiable Root File System
root=UUID=f5d4b3db-cda4-45fa-8689-1540e3cbe989 root_hash=b77ec379b630e9fb88ce5c3439bbe794233bf3b4b95018401a70d7d10d50be2f
veritysetup verify ${root_dev} ${verity_dev} ${root_hash} -v
veritysetup create vroot ${root_dev} ${verity_dev}  ${root_hash} ${salt}

switch_root vroot

Policy-Parser
  • Read Policy and hash from kernel command line
  • Parse Policy for other components
  • Verify policy against hash
...be2f  stievm_policy="{\"net...de\"}]}}" stievm_policy_hash="1840...50be2f"
pub struct PolicyConfig {
  pub network: Option<NetworkPolicy>,
  pub attestation: Option<AttestationPolicy>,
  pub ssh: Option<SshPolicy>,
  pub luks_devices: Option<LuksDevicePolicy>,
}
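
Reading the policy from the kernel command line boils down to splitting the line into key=value pairs while keeping the quoted JSON value in one piece. The following is a minimal sketch of that step. It assumes values may be wrapped in double quotes with `\"` for embedded quotes, as the `stievm_policy` excerpt suggests. `parse_cmdline` is an illustrative name, not the STIEVM parser, and the hash comparison is omitted because it needs a SHA-256 implementation outside the standard library.

```rust
use std::collections::HashMap;

/// Splits a kernel command line into key=value pairs.
/// Assumption: double quotes group a value and \" stands for an embedded quote.
pub fn parse_cmdline(cmdline: &str) -> HashMap<String, String> {
    let mut params = HashMap::new();
    let mut token = String::new();
    let mut in_quotes = false;
    let mut chars = cmdline.chars().peekable();

    // Stores a finished token; tokens without '=' (plain flags) are skipped.
    fn flush(token: &mut String, params: &mut HashMap<String, String>) {
        if let Some((key, value)) = token.split_once('=') {
            params.insert(key.to_string(), value.to_string());
        }
        token.clear();
    }

    while let Some(c) = chars.next() {
        match c {
            // Escaped quote inside a quoted value: keep the quote, drop the backslash.
            '\\' if in_quotes && chars.peek() == Some(&'"') => {
                token.push('"');
                chars.next();
            }
            // Unescaped quote toggles grouping and is not part of the value.
            '"' => in_quotes = !in_quotes,
            // Whitespace outside quotes ends the current token.
            c if c.is_whitespace() && !in_quotes => flush(&mut token, &mut params),
            c => token.push(c),
        }
    }
    flush(&mut token, &mut params);
    params
}
```

The resulting map would hold `root`, `root_hash`, `stievm_policy` and `stievm_policy_hash`, so the policy string can be hashed and compared before it is deserialized into `PolicyConfig`.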

Runtime

Setup

Configuration

Workload Runtime

Self Setup
Policy-Parser
Identification
Policy-Enforcer

Identification
  • Uniquely Identify STIEVM
  • Couples STIEVM with the CVM in Initial Attestation
  • Provides key to other components
  • Ephemeral ed25519 key also acts as nonce
  • Shows freshness for one time measurements
  • Root of Trust for dynamic attestation collateral

Policy-Enforcer
  • Implements LSM Hooks with eBPF
  • Includes eBPF in STIEVM binary
  • Controls Access to files outside of rootfs
  • Ensures Integrity of STIEVM

Contextualizer
HTTP-Server
Contextualizer
Recorder & Reporter

Contextualizer
  • Custom Contextualization Tool
  • Two stages: Basic and Extended
  • Logs all changed files
  • Organized in Modules

Contextualizer
pub trait Configurator {
  fn new() -> Self
  where
    Self: Sized;

  fn configure(&mut self, config: &PolicyConfig);

  fn configure_extended(&mut self, config: &PolicyConfig) {}

  fn get_config_tracker(&self) -> &ConfigurationTracker;
}
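
To illustrate how a module could plug into this trait, here is a minimal sketch of one configurator that records which files it would touch. `PolicyConfig`, `SshPolicy` and `ConfigurationTracker` are reduced, hypothetical stand-ins because the slides do not show them in full. The `authorized_keys` field and the path `/root/.ssh/authorized_keys` are assumptions chosen for the example.

```rust
use std::collections::BTreeSet;

// Hypothetical, reduced stand-ins for types the slides only name.
#[derive(Default)]
pub struct SshPolicy {
    pub authorized_keys: Vec<String>,
}

#[derive(Default)]
pub struct PolicyConfig {
    pub ssh: Option<SshPolicy>,
}

/// Assumed shape of the tracker: the set of paths a module has changed.
#[derive(Default)]
pub struct ConfigurationTracker {
    changed: BTreeSet<String>,
}

impl ConfigurationTracker {
    pub fn record(&mut self, path: &str) {
        self.changed.insert(path.to_string());
    }

    pub fn changed_files(&self) -> Vec<&str> {
        self.changed.iter().map(String::as_str).collect()
    }
}

pub trait Configurator {
    fn new() -> Self
    where
        Self: Sized;
    fn configure(&mut self, config: &PolicyConfig);
    fn configure_extended(&mut self, _config: &PolicyConfig) {}
    fn get_config_tracker(&self) -> &ConfigurationTracker;
}

pub struct SshConfigurator {
    tracker: ConfigurationTracker,
}

impl Configurator for SshConfigurator {
    fn new() -> Self {
        SshConfigurator { tracker: ConfigurationTracker::default() }
    }

    fn configure(&mut self, config: &PolicyConfig) {
        if let Some(ssh) = &config.ssh {
            if !ssh.authorized_keys.is_empty() {
                // A real module would write the file here; the sketch only records the path.
                self.tracker.record("/root/.ssh/authorized_keys");
            }
        }
    }

    fn get_config_tracker(&self) -> &ConfigurationTracker {
        &self.tracker
    }
}
```

Because `new` is restricted to `Self: Sized`, the trait stays usable as `Box<dyn Configurator>`, and the default `configure_extended` lets modules without a second stage skip it.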

HTTP-Server

Endpoints for initial attestation:

Accessible for Service Provider

  • GET / to get initial attestation report
  • POST / to send secrets for context (LUKS keys, ...)

Contextualizer
  • Second Stage of Contextualization
  • Merge Secrets into Policy
  • Extended Configuration

Recorder & Reporter

Create Report with hash for each touched file

Stored as a file for later use

Recorder & Reporter
config.response.json
{
  "/...authorized_keys": "1840...50be2f",
  "/...config.yaml": "44d0...a32c2a",
  "/...stievm.conf": "b77e...be2f"
}
config.response.signature.txt
uRYmc+Whyuc5FySyFzCkRFpoZeJa
HrTpZR0dnLvwBd1woTvKZU6wvbVl
wISqErK1Qa6bvC2z848Bd23nAMmdAg==
config.report.bin
Attestation Report (1184 bytes):

Report Data:                  
4f 53 cd a1 8c 2b aa 0c 03 54 bb 5f 9a 3e cb e5 
ed 12 ab 4d 8e 11 ba 87 3c 2f 11 16 12 02 b9 45 
88 42 f3 6b ca c0 fb 09 14 b1 b8 3d 41 fe 6b a9 
d4 ab cd f9 8c 49 16 72 b4 38 f7 5f c1 55 d6 f6 

Report Data is built from:
  • sha256(config.response.json)
  • sha256(config.response.signature.txt)
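
The 64-byte field has room for exactly two SHA-256 digests. A minimal sketch of how the two could be combined and checked again on the verifier side is given below. It assumes plain concatenation in the order listed on the slide, and it takes the digests as inputs because the standard library has no SHA-256. Both function names are illustrative.

```rust
/// Concatenates two SHA-256 digests into the 64-byte report data field.
/// Assumption: JSON digest first, signature digest second.
pub fn compose_report_data(json_digest: &[u8; 32], signature_digest: &[u8; 32]) -> [u8; 64] {
    let mut report_data = [0u8; 64];
    report_data[..32].copy_from_slice(json_digest);
    report_data[32..].copy_from_slice(signature_digest);
    report_data
}

/// Verifier side: recompute both digests from the received files and compare
/// them with the report data taken from the attestation report.
pub fn report_data_matches(
    report_data: &[u8; 64],
    json_digest: &[u8; 32],
    signature_digest: &[u8; 32],
) -> bool {
    *report_data == compose_report_data(json_digest, signature_digest)
}
```

Binding both files into the hardware-signed report means neither the JSON nor its signature can be swapped without the mismatch showing up in this comparison.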

HTTP-Server
Workload-Listener
Policy-Enforcer

HTTP-Server
Endpoint         Method  Description
/report/attest   POST    Runtime Measurements
/report/initial  GET     Initial attestation report
/report/config   GET     Configuration Report

HTTP-Server
Endpoint               Method  Description
/workload/:id/policy   POST    Update Workload Policy
/workload/:id/initial  GET     Initial workload report
/workload/:id/measure  POST    Workload Runtime Measurements
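
The two endpoint tables can be read as one small routing function. The sketch below maps a method and path to a route and extracts the `:id` segment. It uses only the standard library and is not the web framework STIEVM uses; `Route` and `route` are hypothetical names.

```rust
/// One variant per endpoint from the two tables.
#[derive(Debug, PartialEq)]
pub enum Route<'a> {
    ReportAttest,
    ReportInitial,
    ReportConfig,
    WorkloadPolicy(&'a str),
    WorkloadInitial(&'a str),
    WorkloadMeasure(&'a str),
}

/// Resolves method and path to a route; unknown combinations yield `None`.
pub fn route<'a>(method: &str, path: &'a str) -> Option<Route<'a>> {
    let mut parts = path.trim_matches('/').split('/');
    // At most three segments are valid, the fourth slot detects longer paths.
    let segments = (parts.next(), parts.next(), parts.next(), parts.next());
    match (method, segments) {
        ("POST", (Some("report"), Some("attest"), None, None)) => Some(Route::ReportAttest),
        ("GET", (Some("report"), Some("initial"), None, None)) => Some(Route::ReportInitial),
        ("GET", (Some("report"), Some("config"), None, None)) => Some(Route::ReportConfig),
        ("POST", (Some("workload"), Some(id), Some("policy"), None)) if !id.is_empty() => {
            Some(Route::WorkloadPolicy(id))
        }
        ("GET", (Some("workload"), Some(id), Some("initial"), None)) if !id.is_empty() => {
            Some(Route::WorkloadInitial(id))
        }
        ("POST", (Some("workload"), Some(id), Some("measure"), None)) if !id.is_empty() => {
            Some(Route::WorkloadMeasure(id))
        }
        _ => None,
    }
}
```

For example, `route("GET", "/workload/nginx/initial")` resolves to `Route::WorkloadInitial("nginx")`, while the same path with `POST` is rejected.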

User Manager

System User

  • Linux User
  • System Config
  • System Reports

STIEVM User

  • Workload User
  • Workload Reports
  • Initial & Config Report

Authentication

  • ED25519 Key Pair
  • System User: authorized_keys
  • STIEVM User: STIEVM PKey

Workload Listener
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
    stievmUser: a-user
  annotations:
    stievm/auth: "3oyYsSosKBWf9WaAfBHjCs4n1yD1eLYBCkKj0cZcYl0VA4fudDqFycbSBFoIqavMHSbXY/lFQIJ8Gs1B0z1Nm24iNG0nIorrVsZz4J2GCivrFBEyyHI5vBT6/AfyNn0D"
    stievm/auth-data: "SomeDataToSign"
    stievm/policy: "{\"path_guards\":[{\"/usr/share/nginx/html\":\"/usr/sbin/nginx\"}]}"

spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-deployment
  template:
    metadata:
      labels:
        app: nginx-deployment
        stievmUser: a-user
    spec:

Workload Listener
// Excerpt: user_manager and depl_id are defined outside the code shown here.
async fn handle_deployment(
  depl: &Deployment,
  client: Client,
) -> Result<(), Box<dyn std::error::Error>> {
  // Both annotations are required to authenticate the workload owner.
  if !(depl.has_annotation("stievm/auth")
    && depl.has_annotation("stievm/auth-data")) {
    error!("no auth");
    return Err("missing stievm/auth or stievm/auth-data annotation".into());
  }

  if already_measured(depl) {
    println!("Workload already measured");
    return Ok(());
  }

  let user_name = depl.get_label("stievmUser");
  let auth_data = depl.get_annotation("stievm/auth-data").unwrap();
  let auth = depl.get_annotation("stievm/auth").unwrap();

  // Stop here if the signature over auth-data does not verify.
  if validate_auth(user_name, auth, auth_data.as_bytes().to_vec()).is_err() {
    error!("Could not authenticate user");
    return Err("could not authenticate user".into());
  }

  measure_initial_workload(depl).await?;

  user_manager.add_workload_to_user(user_name, &depl_id);

  let policy = depl.get_annotation("stievm/policy").unwrap();
  let parsed_policy = PolicyParser::parse(policy).unwrap();

  // Install the path guards for every container of the deployment.
  let pods = get_pods(depl, client).await?;
  for pod in pods {
    let container = pod.get_container();

    enforcer::install_workload_guards(
      container.id,
      &depl_id,
      &parsed_policy,
    )?;
  }

  Ok(())
}

Evaluation

System Specifications

  • AMD EPYC 7313
  • 8GB VM RAM
  • 4 vCPUs
  • SEV-SNP enabled
  • Linux 6.8.0-rc5 with AMD-SEV patches

Verifiable Root File System

Initramfs

20 runs, 2641MiB

Verification time: ~2.63s

verity dm target creation: ~0.01s

Custom initramfs time: ~2.7s

Policy Enforcer eBPF

Installation (for each workload)

Step                  Duration (µs)  Standard Deviation (µs)  % of Total
Setup BPF FS          38.47          11.65                    51.53%
Get Inodes for Paths  36.19          9.61                     48.47%
Total                 74.65

Policy Enforcer eBPF

Access Control

20,000 file accesses
                           Mean Time (µs)  Std. Dev  Min Time (µs)  Max Time (µs)
BPF Enabled   Protected    1053.34         157.19    921.97         3843.6
              Unprotected  1125.78         156.88    932.09         4130.42
BPF Disabled  Protected    1130.09         168.3     911.98         2062.12
              Unprotected  1142.52         163.87    918.7          2117.58
Overhead      Protected    -76.75                    9.99           1781.48
              Unprotected  -16.74                    13.39          2012.84
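
The Overhead rows are the column-wise difference between the BPF Enabled and BPF Disabled rows. The short sketch below makes that arithmetic explicit and adds a relative mean overhead, which the slide does not report. `Timing` and both function names are illustrative.

```rust
/// One row of the access-time table, all values in microseconds.
#[derive(Clone, Copy, Debug)]
pub struct Timing {
    pub mean_us: f64,
    pub min_us: f64,
    pub max_us: f64,
}

/// Column-wise difference "enabled minus disabled", as in the Overhead rows.
pub fn overhead(enabled: Timing, disabled: Timing) -> Timing {
    Timing {
        mean_us: enabled.mean_us - disabled.mean_us,
        min_us: enabled.min_us - disabled.min_us,
        max_us: enabled.max_us - disabled.max_us,
    }
}

/// Mean overhead relative to the BPF-disabled baseline, in percent.
pub fn relative_mean_overhead_percent(enabled: Timing, disabled: Timing) -> f64 {
    (enabled.mean_us - disabled.mean_us) / disabled.mean_us * 100.0
}
```

For the protected case this gives 1053.34 - 1130.09 = -76.75 µs, roughly -6.8 % of the baseline mean. That difference is smaller than the reported standard deviations of about 157 to 168 µs, so the measurement shows no clear slowdown from the enforcer.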

HTTP Server

Response times for different requests

Request                 Mean Time (µs)
Initial Attestation     5 865.28
Secret Post             271.03
Initial Report          91.44
Configuration Report    91.40
New System Measurement  12 162.50

Average Time to get Attestation Report from FW: 5288 µs

HTTP Server

Concurrent Requests

1-128 concurrent requests, 10 runs each
[Figure: measurements plotted over the number of concurrent requests]

Space Overhead

The space overhead for the system is 100 MiB

Related Work

Academic

  • Johnson et al.:
    Parma: Confidential Containers via Attested Execution Policies
  • Pecholt and Wessel:
    CoCoTPM: Trusted Platform Modules for Virtual Machines in Confidential Computing Environments
  • Wilke and Scopelliti:
    SNPGuard: Remote Attestation of SEV-SNP VMs Using Open Source Tools

Commercial

  • cloud-init & Ansible
  • Integrity Measurement Architecture
  • Tetragon, Kubearmor, AppArmor
  • Confidential Containers

Conclusion

  • Secure, Immutable, Integrity protected System
  • Post Boot Attestation of Contextualized System
  • Post Boot Attestation of Dynamic Workloads
  • Policy Enforcer for Contextualization and Hardening
  • Integrated into Kubernetes
  • Minimal overhead
  • Extendable for more features and other systems

Policy-Enforcer
  1. Get Inodes of STIEVM dirs
  2. Load eBPF program into kernel
  3. Populate eBPF maps with Inodes
  4. Attach eBPF program to LSM hooks
  5. Pin eBPF program to filesystem

Logs all actions

Policy-Enforcer
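SEC("lsm/file_open")
int BPF_PROG(file_guard, struct file *file) {
  // isInodeInMap and stievm_pid come from the surrounding program (not shown).
  __u32 current_pid = bpf_get_current_pid_tgid() >> 32;
  struct dentry *dentry = file->f_path.dentry;

  // Walk from the opened file up towards the root directory.
  for (int i = 0; i < MAX_DEPTH; i++) {
    __u64 inode = dentry->d_inode->i_ino;
    if (isInodeInMap(inode)) {
      // Only the STIEVM process may open guarded paths.
      if (current_pid != stievm_pid) {
        return -EACCES;
      } else {
        return 0;
      }
    }

    if (dentry->d_parent == dentry) {
      break;  // reached the root directory
    }
    dentry = dentry->d_parent;
  }
  return 0;  // no guarded inode on the path
}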

Sending Secrets to HTTP Server

POST / HTTP/1.1
Host: 172.27.16.46:3000
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
User-Agent: python-httpx/0.27.2
Authorization: florian;QjsyxWGSPtFpIGf8cUJbzLodt411WeqVauYFmZYebXvUEgdEXnDOrvWwHFKWam9847Er88oLp+pa59VjI5dafg0+rZUjiWCtFrhYXwtXyiZj1smdEYZNjTOl2pM1dQoE
Content-Type: application/json
Content-Length: 82

{"luks_devices": {"device_keys": [{"id": "my-device-1", "key": "SomeSecretKey"}]}}
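
The `Authorization` value in this request has the form `<user>;<base64 signature>`. A minimal sketch of splitting and sanity-checking that value is shown below. It only validates the shape and does not verify the signature, because the slide does not state what data is signed and the standard library has neither Base64 nor Ed25519. `AuthHeader` and `parse_authorization` are hypothetical names.

```rust
/// Parsed form of an `Authorization: <user>;<signature>` header value.
#[derive(Debug, PartialEq)]
pub struct AuthHeader<'a> {
    pub user: &'a str,
    pub signature_b64: &'a str,
}

/// Splits the header value at the first ';' and checks that the signature part
/// only uses the standard Base64 alphabet. Returns `None` for malformed input.
pub fn parse_authorization(value: &str) -> Option<AuthHeader<'_>> {
    let (user, signature_b64) = value.trim().split_once(';')?;
    let is_b64 = |c: char| c.is_ascii_alphanumeric() || matches!(c, '+' | '/' | '=');
    if user.is_empty() || signature_b64.is_empty() || !signature_b64.chars().all(is_b64) {
        return None;
    }
    Some(AuthHeader { user, signature_b64 })
}
```

The user name would then select the public key (from `authorized_keys` for a System User or the STIEVM key store for a STIEVM User) against which the decoded signature is checked.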


Basic Configuration

pub fn basic_configuration(config: &PolicyConfig) -> Vec<Box<dyn configurator::Configurator>> {
  let mut configurators: Vec<Box<dyn configurator::Configurator>> = vec![
    Box::new(NetworkConfigurator::new()),
    Box::new(AttestationConfigurator::new()),
    Box::new(SshConfigurator::new()),
  ];
  for configurator in &mut configurators {
    configurator.configure(config);
  }
  stievm_log!("stievm", "contextualizer", "Finished basic configuration");
  configurators
}

/report/attest

POST /report/attest HTTP/1.1
Host: 172.27.16.46:3000
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
User-Agent: python-httpx/0.27.2
Authorization: florian;a/HykOzqBVawZQCoDsqKU2828It96tChUxhc3GitvVbr882OWWvPkMml1d7AZ/vkeJNENbb6xzBv+k/n517yIU0UNjOPhXHRvMdovt7CX0KayKgmu1sEfQBIyywmBjEC
Content-Type: application/json
Content-Length: 254

{
  "nonce": "32c9254a-be71-4232-8053-1e4956e114be",
  "evidence": [
    {
      "type": "fs_hash",
      "path": "/home/florian/projects/ttccvm/cvm",
      "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8="
    },
    {
      "type": "log"
    }
  ]
}

/report/attest

response.json
{
  "nonce": "32c9254a-be71-4232-8053-1e4956e114be",
  "evidence": [ 
    {
      "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8=",
      "value": [
        {
          "path": "",
          "hash": "Crz7McEuZ8IZXL2ViXcEVGCHj3Ox1LV5EsmuPhlebn8="
        },
        {
          "path": "create-cvm-clean.sh",
          "hash": "dgphKtWQamZHeGlmyGNlvbSOroCaH2YP5Gadat6fcow="
        }
      ]
    },
    {
      "hash": "pfM6IlgDIG2sFdKq978hQHsWJ9e23Rfd8N1qVDqQ59s=",
      "value": "
      [INFO ] - [stievm][main][stievm] Starting stievm
      [INFO ] - [stievm][policy_parser][stievm::policy_parser::parser] Re
      [INFO ] - [stievm][stievm_guards][stievm::enforcer] stievm_guards A
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Starting
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Finished
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Starting
      [INFO ] - [stievm][contextualizer][stievm::config_modules] Finished
      [INFO ] - [stievm][web_server][stievm::att_web_server::web_server] 
      "
    }
  ]
}
response.signature.txt
uRYmc+Whyuc5FySyFzCkRFpoZeJa
HrTpZR0dnLvwBd1woTvKZU6wvbVl
wISqErK1Qa6bvC2z848Bd23nAMmdAg==
report.bin
Attestation Report (1184 bytes):

Report Data:                  
4f 53 cd a1 8c 2b aa 0c 03 54 bb 5f 9a 3e cb e5 
ed 12 ab 4d 8e 11 ba 87 3c 2f 11 16 12 02 b9 45 
88 42 f3 6b ca c0 fb 09 14 b1 b8 3d 41 fe 6b a9 
d4 ab cd f9 8c 49 16 72 b4 38 f7 5f c1 55 d6 f6

Workload Guard

// Excerpt: `bpf` is the loaded eBPF skeleton, set up outside the code shown here.
pub fn install_workload_guards(
  docker_container_id: &str,
  workload_id: &str,
  policy: &WorkloadPolicy,
) -> Result<(), Box<dyn std::error::Error>> {
  let docker_pid = get_docker_pid(docker_container_id)?;
  let container_pid = get_container_pid(docker_pid)?;

  check_bpf_lsm_enabled()?;

  // Get the pid namespace of the container
  let container_pid_ns = container_ns_inode(&docker_pid);

  log_info(...);

  // Resolve each guarded path and its allowed command to inode numbers
  let inodes_to_block = get_inodes(&policy.path_guards);

  bpf.container_pid = container_pid;

  bpf.load();
  bpf.inode_map.pin();

  for block_pair in inodes_to_block {
    bpf.inode_map.add(
      &block_pair.path_inode,
      &block_pair.command_inode,
    );
  }

  bpf.attach();
  bpf.pin();

  log_info(...);

  Ok(())
}

Workload Vault BPF

SEC("lsm/file_open")
int BPF_PROG(check_file_open, struct file* file, int ret) {

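  // ret holds the verdict of a previously attached BPF LSM program;
  // pass an earlier denial on instead of overriding it.
  if (ret != 0) {
    return ret;
  }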

  if (is_in_container()) {

    // Get the inode of the current task's executable
    struct task_struct* task;
    struct mm_struct* mm;
    struct file* exe_file;
    struct dentry* exe_dentry;
    struct inode* exe_inode;

    task = (struct task_struct*)bpf_get_current_task_btf();
    mm = task->mm;
    exe_file = mm->exe_file;
    exe_dentry = exe_file->f_path.dentry;
    exe_inode = exe_dentry->d_inode;
    u64 exe_inode_num = exe_inode->i_ino;

    // Check if the current process may access the specified inode
    struct dentry* dentry = file->f_path.dentry;
    // Traverse up the directory hierarchy to check for restrictions
    for (__u16 i = 0; i < MAX_DIR_DEPTH; i++) {  // Limit traversal depth to avoid verifier issues
      if (!dentry) {
        bpf_printk("No dentry found");
        break;
      }

      u64 inode = dentry->d_inode->i_ino;
      // Non-zero result: inode of the only executable allowed to access this path
      u32 allowed_inode = is_inode_restricted(inode);
      // Check if this inode (file or directory) is restricted
      if (allowed_inode != 0) {
        bpf_printk("inode: %llu tries to access restricted %llu", exe_inode_num, inode);
        if (exe_inode_num != allowed_inode) {
          return -EACCES;  // Executable is not the one allowed for this path
        } else {
          return 0;  // Executable matches the allowed inode
        }
      }

      // Move up to the parent directory
      if (dentry->d_parent == dentry) {
        break;  // Reached the root directory
      }
      dentry = dentry->d_parent;
    }
  }

  return 0;  // Access allowed if no restricted inode found in path
}
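
The decision rule of this hook can be modelled in user space, which makes it easy to unit-test policies without loading a BPF program. The following is a minimal sketch under stated assumptions: the directory tree is given as an inode-to-parent map, `guards` stands in for the BPF inode map, the value of `MAX_DIR_DEPTH` is assumed, and the `is_in_container()` check is left out. `Verdict` and `check_open` are illustrative names.

```rust
use std::collections::HashMap;

/// Assumed traversal bound; the slides do not give the actual value.
pub const MAX_DIR_DEPTH: usize = 32;

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Allow,
    Deny,
}

/// Models the ancestor walk of the file_open hook.
/// `parents`: inode -> parent inode (the root maps to itself, like d_parent).
/// `guards`:  restricted inode -> inode of the only executable allowed to open it.
pub fn check_open(
    file_inode: u64,
    exe_inode: u64,
    parents: &HashMap<u64, u64>,
    guards: &HashMap<u64, u64>,
) -> Verdict {
    let mut current = file_inode;
    for _ in 0..MAX_DIR_DEPTH {
        // The first guarded inode on the way up decides the outcome.
        if let Some(&allowed) = guards.get(&current) {
            return if allowed == exe_inode { Verdict::Allow } else { Verdict::Deny };
        }
        match parents.get(&current) {
            Some(&parent) if parent != current => current = parent,
            _ => break, // reached the root or an unknown inode
        }
    }
    Verdict::Allow // no guarded inode found on the path
}
```

With a guard on the inode of `/usr/share/nginx/html` that names the inode of `/usr/sbin/nginx`, any file below that directory opens for nginx and is denied for every other executable, while unrelated paths stay unaffected.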