32 changes: 32 additions & 0 deletions examples/confidential_gpu/README.md
@@ -0,0 +1,32 @@
# Confidential Computing with GPU

This example creates a GPU VM with Confidential Computing enabled. The
disk is encrypted with a multi-region (US by default) Cloud HSM key, and
the VM uses a custom service account with the `cloud-platform` scope. The
VM is created with a startup script that installs the NVIDIA H100 drivers
and enables confidential computing on the GPU.
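
The required inputs listed in the Inputs table (`project_id`, `keyring`, `key`, `subnetwork`) can be supplied through a `terraform.tfvars` file. The following is a minimal sketch rather than a prescribed setup: every value is a placeholder to replace with your own, and the `TFVARS_FILE` variable is only a convenience introduced for this illustration.

```shell
#!/bin/bash
# Placeholder values only: replace each one with your own before applying.
# TFVARS_FILE is a hypothetical override; it defaults to terraform.tfvars.
TFVARS_FILE="${TFVARS_FILE:-terraform.tfvars}"

# Write the four required inputs; optional inputs keep their defaults.
cat > "$TFVARS_FILE" <<'EOF'
project_id = "my-project-id"
keyring    = "my-keyring"
key        = "my-key"
subnetwork = "projects/my-project-id/regions/us-central1/subnetworks/my-subnet"
EOF

echo "Wrote $TFVARS_FILE"
```

Terraform loads `terraform.tfvars` from the working directory automatically, so running `terraform init` and `terraform apply` in this example's directory should pick up these values.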

<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| key | Key name. | `string` | n/a | yes |
| keyring | Keyring name. | `string` | n/a | yes |
| location | Location for the resources (keyring, key, network, etc.). | `string` | `"us"` | no |
| project\_id | The Google Cloud project ID. | `string` | n/a | yes |
| region | The GCP region to create and test resources in. | `string` | `"us-central1"` | no |
| service\_account\_roles | Predefined roles for the service account that will be created for the VM. Remember to follow the principle of least privilege with Cloud IAM. | `list(string)` | `[]` | no |
| subnetwork | The subnetwork self-link to host the compute instances in. | `string` | n/a | yes |
| suffix | A suffix to be used as an identifier for resources. (e.g., suffix for KMS Key, Keyring). | `string` | `""` | no |

## Outputs

| Name | Description |
|------|-------------|
| instance\_self\_link | Self-link for compute instance. |
| name | Name of the instance template. |
| self\_link | Self-link to the instance template. |
| suffix | Suffix used as an identifier for resources. |

<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
72 changes: 72 additions & 0 deletions examples/confidential_gpu/confidential_gpu_activator.sh
@@ -0,0 +1,72 @@
#!/bin/bash

# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# This script activates Confidential GPU on a Google Cloud VM instance.
# It installs the required packages and GPU drivers, and configures the Linux Kernel Crypto API (LKCA).
# It runs as a startup script on every boot; flag files ensure that each part runs only once.

FLAG_FILE_1="/var/log/confidential-gpu-script-part-1-ran.flag"

echo "Running startup script to activate Confidential GPU..."

# Check if the flag file exists to determine if the script has already run.
if ! [ -f "$FLAG_FILE_1" ]; then
  echo "Running part 1 of the startup script..."
  # Update package list and install necessary packages.
  sudo apt-get update -y
  sudo apt-get install -y linux-headers-"$(uname -r)"
  sudo apt-get install -y build-essential libxml2 libncurses5-dev pkg-config libvulkan1 gcc-12

  # Install GPU drivers.
  sudo apt-get install -y linux-modules-nvidia-550-server-open-gcp nvidia-driver-550-server-open

  # Create a flag file to indicate that part 1 of the script has run.
  sudo touch "$FLAG_FILE_1"

  # Reboot to load the new driver; exit so that part 2 does not start before the shutdown.
  sudo reboot
  exit 0
fi

FLAG_FILE_2="/var/log/confidential-gpu-script-part-2-ran.flag"

if ! [ -f "$FLAG_FILE_2" ]; then
  echo "Running part 2 of the startup script..."

  # Secure the communication between the GPU and the GPU driver by enabling the Linux Kernel Crypto API (LKCA).
  echo "install nvidia /sbin/modprobe ecdsa_generic; /sbin/modprobe ecdh; /sbin/modprobe --ignore-install nvidia" | sudo tee /etc/modprobe.d/nvidia-lkca.conf
  sudo update-initramfs -u

  # Enable persistence mode.
  sudo test -f /usr/lib/systemd/system/nvidia-persistenced.service && sudo sed -i "s/no-persistence-mode/uvm-persistence-mode/g" /usr/lib/systemd/system/nvidia-persistenced.service
  sudo systemctl daemon-reload

  # Create a flag file to indicate that part 2 of the script has run.
  sudo touch "$FLAG_FILE_2"

  # Reboot the VM instance to apply the LKCA and persistence mode configurations.
  # Exit so that part 3 does not start before the shutdown.
  sudo reboot
  exit 0
fi

FLAG_FILE_3="/var/log/confidential-gpu-script-part-3-ran.flag"
if [ -f "$FLAG_FILE_3" ]; then
  echo "Script has already run. Skipping..."
  exit 0
fi

# Set the GPU to the ready state.
# Note: because of the FLAG_FILE_3 check above, this runs only once, not after every later reboot.
sudo nvidia-smi conf-compute -srs 1

sudo touch "$FLAG_FILE_3"
echo "Confidential GPU activation script has completed successfully."
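
The script above guards each stage with a flag file so that a stage is skipped on later boots. As a rough sketch of that pattern in isolation, the helper below wraps it in a function. The name `run_once` and the temporary demo directory are hypothetical and not part of this change; unlike the script above, the helper records the flag only after the command succeeds, so a stage that reboots would need the reboot issued after `run_once` returns.

```shell
#!/bin/bash
# Hypothetical helper: run a command once, guarded by a flag file.
run_once() {
  local flag_file="$1"
  shift
  if [ -f "$flag_file" ]; then
    echo "Skipping: $flag_file already exists."
    return 0
  fi
  # Run the stage; record the flag only if the command exits cleanly.
  "$@" || return $?
  touch "$flag_file"
}

# Demonstration in a temporary directory instead of /var/log.
demo_dir="$(mktemp -d)"
counter_file="$demo_dir/count"
: > "$counter_file"
increment() { echo "ran" >> "$counter_file"; }

run_once "$demo_dir/stage-1.flag" increment
run_once "$demo_dir/stage-1.flag" increment   # second call is skipped
runs="$(wc -l < "$counter_file" | tr -d ' ')"
echo "stage ran $runs time(s)"
```

Recording the flag after success means a failed stage is retried on the next boot, which may or may not be the behaviour wanted for driver installation.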
117 changes: 117 additions & 0 deletions examples/confidential_gpu/main.tf
@@ -0,0 +1,117 @@
/**
* Copyright 2025 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

locals {
  default_suffix = var.suffix == "" ? random_string.suffix.result : "${random_string.suffix.result}-${var.suffix}"
  key_name       = "${var.key}-${local.default_suffix}"
}

resource "random_string" "suffix" {
  length  = 4
  special = false
  upper   = false
}

module "kms" {
  source  = "terraform-google-modules/kms/google"
  version = "~> 3.0"

  keyring              = "${var.keyring}-${local.default_suffix}"
  location             = var.location
  project_id           = var.project_id
  keys                 = [local.key_name]
  purpose              = "ENCRYPT_DECRYPT"
  key_protection_level = "HSM"
  prevent_destroy      = false
}

resource "google_service_account" "default" {
  project      = var.project_id
  account_id   = "confidential-gpu-sa"
  display_name = "Custom SA for confidential VM Instance"
}

resource "google_project_iam_member" "service_account_roles" {
  for_each = toset(var.service_account_roles)

  project = var.project_id
  role    = each.key
  member  = "serviceAccount:${google_service_account.default.email}"
}

data "google_project" "project" {
  project_id = var.project_id
}

resource "google_compute_address" "ip_address" {
  name    = "external-ip-${local.default_suffix}"
  project = var.project_id
  region  = var.region
}

locals {
  access_config = {
    nat_ip       = google_compute_address.ip_address.address
    network_tier = "PREMIUM"
  }
}

resource "google_kms_crypto_key_iam_binding" "crypto_key" {
  crypto_key_id = module.kms.keys[local.key_name]
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  members = [
    "serviceAccount:service-${data.google_project.project.number}@compute-system.iam.gserviceaccount.com",
  ]
}

module "instance_template" {
  source  = "terraform-google-modules/vm/google//modules/instance_template"
  version = "~> 13.0"

  region        = var.region
  project_id    = var.project_id
  subnetwork    = var.subnetwork
  access_config = [local.access_config]

  name_prefix                = "confidential-gpu-template"
  machine_type               = "a3-highgpu-1g"
  source_image_project       = "ubuntu-os-cloud"
  source_image               = "ubuntu-2204-lts"
  enable_confidential_vm     = true
  confidential_instance_type = "TDX"
  disk_size_gb               = 20
  disk_type                  = "pd-ssd"
  spot                       = true

  startup_script = file("${path.module}/confidential_gpu_activator.sh")

  service_account = {
    email  = google_service_account.default.email
    scopes = ["cloud-platform"]
  }
  disk_encryption_key = module.kms.keys[local.key_name]
}

module "compute_instance" {
  source  = "terraform-google-modules/vm/google//modules/compute_instance"
  version = "~> 13.0"

  region              = var.region
  access_config       = [local.access_config]
  hostname            = "confidential-gpu-instance"
  instance_template   = module.instance_template.self_link
  deletion_protection = false
}
36 changes: 36 additions & 0 deletions examples/confidential_gpu/outputs.tf
@@ -0,0 +1,36 @@
# /**
# * Copyright 2025 Google LLC
# *
# * Licensed under the Apache License, Version 2.0 (the "License");
# * you may not use this file except in compliance with the License.
# * You may obtain a copy of the License at
# *
# * http://www.apache.org/licenses/LICENSE-2.0
# *
# * Unless required by applicable law or agreed to in writing, software
# * distributed under the License is distributed on an "AS IS" BASIS,
# * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# * See the License for the specific language governing permissions and
# * limitations under the License.
# */


output "self_link" {
  description = "Self-link to the instance template."
  value       = module.instance_template.self_link
}

output "name" {
  description = "Name of the instance template."
  value       = module.instance_template.name
}

output "instance_self_link" {
  description = "Self-link for compute instance."
  value       = module.compute_instance.instances_self_links[0]
}

output "suffix" {
  description = "Suffix used as an identifier for resources."
  value       = local.default_suffix
}
59 changes: 59 additions & 0 deletions examples/confidential_gpu/variables.tf
@@ -0,0 +1,59 @@
/**
* Copyright 2024 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

variable "project_id" {
  description = "The Google Cloud project ID."
  type        = string
}

variable "region" {
  description = "The GCP region to create and test resources in."
  type        = string
  default     = "us-central1"
}

variable "subnetwork" {
  description = "The subnetwork self-link to host the compute instances in."
  type        = string
}

variable "location" {
  description = "Location for the resources (keyring, key, network, etc.)."
  type        = string
  default     = "us"
}

variable "suffix" {
  description = "A suffix to be used as an identifier for resources. (e.g., suffix for KMS Key, Keyring)."
  type        = string
  default     = ""
}

variable "keyring" {
  description = "Keyring name."
  type        = string
}

variable "key" {
  description = "Key name."
  type        = string
}

variable "service_account_roles" {
  description = "Predefined roles for the service account that will be created for the VM. Remember to follow the principle of least privilege with Cloud IAM."
  type        = list(string)
  default     = []
}
4 changes: 4 additions & 0 deletions metadata.yaml
@@ -54,6 +54,8 @@ spec:
location: examples/instance_template/confidential_computing
- name: confidential_computing_intel
location: examples/confidential_computing_intel
- name: confidential_gpu
location: examples/confidential_gpu
- name: disk_snapshot
location: examples/compute_instance/disk_snapshot
- name: encrypted_disks
@@ -101,9 +103,11 @@ spec:
- roles/iam.serviceAccountAdmin
- roles/compute.instanceAdmin
- roles/resourcemanager.projectIamAdmin
- roles/cloudkms.admin
services:
- cloudresourcemanager.googleapis.com
- storage-api.googleapis.com
- serviceusage.googleapis.com
- compute.googleapis.com
- iam.googleapis.com
- cloudkms.googleapis.com
4 changes: 4 additions & 0 deletions modules/compute_disk_snapshot/metadata.yaml
@@ -44,6 +44,8 @@ spec:
location: examples/instance_template/confidential_computing
- name: confidential_computing_intel
location: examples/confidential_computing_intel
- name: confidential_gpu
location: examples/confidential_gpu
- name: disk_snapshot
location: examples/compute_instance/disk_snapshot
- name: encrypted_disks
@@ -167,12 +169,14 @@ spec:
- roles/iam.serviceAccountAdmin
- roles/compute.instanceAdmin
- roles/resourcemanager.projectIamAdmin
- roles/cloudkms.admin
services:
- cloudresourcemanager.googleapis.com
- storage-api.googleapis.com
- serviceusage.googleapis.com
- compute.googleapis.com
- iam.googleapis.com
- cloudkms.googleapis.com
providerVersions:
- source: hashicorp/google
version: ">= 3.71, < 7"
4 changes: 4 additions & 0 deletions modules/compute_instance/metadata.yaml
@@ -44,6 +44,8 @@ spec:
location: examples/instance_template/confidential_computing
- name: confidential_computing_intel
location: examples/confidential_computing_intel
- name: confidential_gpu
location: examples/confidential_gpu
- name: disk_snapshot
location: examples/compute_instance/disk_snapshot
- name: encrypted_disks
@@ -178,12 +180,14 @@ spec:
- roles/iam.serviceAccountAdmin
- roles/compute.instanceAdmin
- roles/resourcemanager.projectIamAdmin
- roles/cloudkms.admin
services:
- cloudresourcemanager.googleapis.com
- storage-api.googleapis.com
- serviceusage.googleapis.com
- compute.googleapis.com
- iam.googleapis.com
- cloudkms.googleapis.com
providerVersions:
- source: hashicorp/google
version: ">= 3.88, < 7"