Deploying NixOS to the cloud

Introduction

This blog post demonstrates how to deploy a NixOS machine automatically. There are two main ways to do so: spin up a machine directly from a NixOS image and deploy our configuration to it remotely, or start from another Linux OS and overwrite it using a tool like kexec. You can even build your own ISO that ships with your configuration, but that is the subject of another article. Here I will show you how to overwrite an existing OS with NixOS automatically, with a single command.

Cloud-init

Cloud-init’s role here is to give us SSH access to the freshly created machine. It installs a public SSH key, and therefore also needs to create a user to hold that key. The file must be a template, because it will be rendered by Terraform’s templatefile() function (hence the .tftpl extension). Create a file named templates/user-data.yaml.tftpl:

#cloud-config
# hostname is a top-level cloud-config key, not a per-user option
hostname: ${hostname}

users:
  - name: ${user}
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    ssh_authorized_keys:
      - ${ssh_public_login_key}
    lock_passwd: true

write_files:
  - path: /home/${user}/.ssh/id_rsa
    permissions: '0600'
    content: |
      ${indent(6, nixos_ssh_private_deploy_key)}
    owner: '${user}:${user}'
    defer: true

  - path: /root/.ssh/id_rsa
    permissions: '0600'
    content: |
      ${indent(6, nixos_ssh_private_deploy_key)}
    owner: 'root:root'
    defer: true

  # Optionally, install a deploy key for a GitHub repository, e.g. one containing your NixOS configuration
  - path: /home/${user}/.ssh/id_ed25519
    permissions: '0600'
    content: |
      ${indent(6, playbook_ssh_private_deploy_key)}
    owner: '${user}:${user}'
    defer: true
  - path: /root/.ssh/id_ed25519
    permissions: '0600'
    content: |
      ${indent(6, playbook_ssh_private_deploy_key)}
    owner: 'root:root'
    defer: true

NixOS

First, we need to write the configuration of the NixOS machine. We will use flakes, since they are good practice and not hard to set up. In your templates/flake.nix:

{
  description = "Simple NixOS config for cloud instance";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
    disko.url = "github:nix-community/disko";
  };

  outputs = { self, nixpkgs, disko, ... }: {
    nixosConfigurations.<user> = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      modules = [
        disko.nixosModules.disko
        ./disko-config.nix
        ({ pkgs, ... }: {
          networking.hostName = "<user>";
          time.timeZone = "UTC";

          services.openssh.enable = true;
          services.openssh.settings.PasswordAuthentication = false;

          users.users.<user> = {
            isNormalUser = true;
            extraGroups = [ "wheel" ];
            # fill this field with your ssh public key(s)
            openssh.authorizedKeys.keys = [
              "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIEGoIMYLhnlfKRIbKMrD5ABT+KhtXL4HqVTCrkP36KJS <email>"
            ];
          };

          # Packages used by the cloud-init steps, including cloud-init itself.
          environment.systemPackages = with pkgs; [ git ansible cloud-init ];
          security.sudo.wheelNeedsPassword = false;
          # stateVersion should match the NixOS release first installed (nixos-24.05 above)
          system.stateVersion = "24.05";

          boot.loader.systemd-boot.enable = true;
          boot.loader.efi.canTouchEfiVariables = true;
          boot.kernelPackages = pkgs.linuxPackages_latest;

          # NixOS's declarative configuration lets us define systemd services directly, taking that work off tools like Ansible.
          systemd.services.<user> = {
            description = "<user>'s service";
            wantedBy = [ "multi-user.target" ];
            serviceConfig = {
              ExecStart = "${pkgs.bash}/bin/bash -c 'echo The systemd ran > /home/<user>/log'";
              User = "<user>";
              Group = "wheel";
              Restart = "on-failure";
              RestartSec = "5s";
            };
          };
        })
      ];
    };

    packages.x86_64-linux.<user> =
      self.nixosConfigurations.<user>.config.system.build.toplevel;

    packages.x86_64-linux.<user>-disko =
      self.nixosConfigurations.<user>.config.system.build.diskoScript;
  };
}

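Our user configuration is done; however, we still need to describe the disk partitioning. Fortunately, the Nix community maintains disko, a tool for declarative disk partitioning. We will create a module for disko’s configuration; adjust the device path in it if your instance exposes its disk under a different name. Create a file named templates/disko-config.nix: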

{
  disko.devices.disk.main = {
    device = "/dev/nvme0n1";
    type = "disk";
    content = {
      type = "gpt";
      partitions = {
        boot = {
          size = "1G";
          type = "C12A7328-F81F-11D2-BA4B-00A0C93EC93B";
          content = {
            type = "filesystem";
            format = "vfat";
            mountpoint = "/boot";
          };
        };
        root = {
          size = "100%";
          content = {
            type = "filesystem";
            format = "ext4";
            mountpoint = "/";
          };
        };
      };
    };
  };
}

Terraform

Terraform is what actually deploys everything. It glues together the Linux installation, the cloud-init run, and the NixOS installation, the latter through the nixos-anywhere Terraform module. You can use any cloud or infrastructure provider you want; this example uses an AWS EC2 instance, since the AWS provider is easy to use and has a large ecosystem. In your main.tf set:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.0"
    }
  }
}

provider "aws" {
  region = var.region
}

data "aws_ami" "instance" {
  most_recent = var.most_recent
  filter {
    name   = "name"
    values = [var.ami_filter_name]
  }

  filter {
    name   = "virtualization-type"
    values = [var.virtualization_type]
  }

  filter {
    name   = "is-public"
    values = [var.is_public]
  }

  filter {
    name   = "free-tier-eligible"
    values = [var.is_free_tier_eligible]
  }

  filter {
    name   = "architecture"
    values = [var.architecture]
  }

  owners = var.owners
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.instance.id
  instance_type = var.instance_type

  tags = {
    Name = var.instance_name
  }

  tenancy = var.tenancy

  user_data = templatefile(
    var.template_path,
    {
      user                 = var.user
      ssh_public_login_key = file(var.ssh_public_login_key)
      hostname             = var.hostname

      playbook_ssh_private_deploy_key      = file(var.playbook_ssh_private_deploy_key)
      playbook_ssh_private_deploy_key_path = var.playbook_ssh_private_deploy_key_path
      app_ansible_playbook_name            = var.app_ansible_playbook_name
      app_ansible_playbook_url             = var.app_ansible_playbook_url
      app_inventory_name                   = var.app_inventory_name

      nixos_ssh_private_deploy_key = file(var.nixos_ssh_private_deploy_key)
      nixos_config_url             = var.nixos_config_url
    }
  )
}

output "public_ip" {
  value       = aws_instance.web.public_ip
  description = "The IPv4 address of the instance"
}

output "public_dns" {
  value       = aws_instance.web.public_dns
  description = "The public DNS name of the instance"
}

output "security_groups" {
  value       = aws_instance.web.security_groups
  description = "The security groups associated with the instance"
}

output "arn" {
  value       = aws_instance.web.arn
  description = "The ARN of the instance"
}

output "tags" {
  value       = aws_instance.web.tags
  description = "The tags of the instance"
}

“Make sure that your instance has enough resources to run kexec, which loads the NixOS installer into memory. For example, an AWS t2.micro is too small.”

The user-data file is rendered with Terraform’s built-in templatefile() function and bound to user_data. You can pick any Linux OS you want as the base image, e.g., Ubuntu. It is also handy to define outputs such as the ARN, public IP, or DNS name, so you can connect to the instance without opening the web console.
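
The main.tf above references a number of input variables that are not declared anywhere in this post. A minimal variables.tf could look like the sketch below. The variable names are taken from main.tf; the file name and the types are assumptions, inferred from how each value is used, and may need adjusting for your setup.

```hcl
# variables.tf (assumed file name; types are inferred, not taken from the original setup)

# Provider and instance settings
variable "region" { type = string }
variable "instance_type" { type = string }
variable "instance_name" { type = string }
variable "tenancy" { type = string }

# AMI lookup filters used by data.aws_ami.instance
variable "most_recent" { type = bool }
variable "ami_filter_name" { type = string }
variable "virtualization_type" { type = string }
variable "architecture" { type = string }
variable "is_public" { type = bool }
variable "is_free_tier_eligible" { type = bool }
variable "owners" { type = list(string) }

# cloud-init template inputs
variable "template_path" { type = string }
variable "user" { type = string }
variable "hostname" { type = string }

# Paths to key files; main.tf reads their contents with file()
variable "ssh_public_login_key" { type = string }
variable "playbook_ssh_private_deploy_key" { type = string }
variable "nixos_ssh_private_deploy_key" { type = string }

# Ansible and NixOS configuration sources passed through to the template
variable "playbook_ssh_private_deploy_key_path" { type = string }
variable "app_ansible_playbook_name" { type = string }
variable "app_ansible_playbook_url" { type = string }
variable "app_inventory_name" { type = string }
variable "nixos_config_url" { type = string }
```

None of these variables has a default, so Terraform will ask for every value unless it is supplied by a calling module or a .tfvars file.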

After that, you can either set the variables directly or, as a good practice, keep this file as a reusable module and create another file that imports it, e.g. example/main.tf:

module "ec2" {
  source = "../"

  region        = "<region>"
  instance_type = "<instance-type>"
  instance_name = "<instance-name>"

  most_recent           = true
  ami_filter_name       = "<image-name>"
  architecture          = "x86_64"
  is_public             = true
  is_free_tier_eligible = true
  owners                = ["<image-owner>"]

  virtualization_type = "hvm"

  tenancy              = "default"
  ssh_public_login_key = "<path-to-public-ssh-key>"
  hostname             = "<instance-name>"
  user                 = "<instance-name>"
  app_inventory_name   = "localhost,"

  # This is the path to the user-data template interpreted by Terraform, this will be used in cloud-init
  template_path        = "<path-to-user-data>"

  # Run app
  app_ansible_playbook_url             = "<ansible-playbook-repo>"
  app_ansible_playbook_name            = "<playbook-name>"

  # These are optional: if you run `ansible-pull` against a private Ansible
  # playbook repo, you can authenticate with a deploy key
  playbook_ssh_private_deploy_key      = "<path-to-private-deploy-key>"
  playbook_ssh_private_deploy_key_path = "<private-deploy-key-path>"

  # NixOS config
  nixos_ssh_private_deploy_key = "~/.aws/keys/deploy_keys/nixos"
  nixos_config_url             = "git@<git-host>:<user>/nixos_config.git" # Optionally, fetch your NixOS configuration from a repo
}

output "public_ip" {
  value       = module.ec2.public_ip
  description = "The IPv4 address of the instance"
}

output "public_dns" {
  value       = module.ec2.public_dns
  description = "The public DNS name of the instance"
}

output "security_groups" {
  value       = module.ec2.security_groups
  description = "The security groups associated with the instance"
}

output "arn" {
  value       = module.ec2.arn
  description = "The ARN of the instance"
}

output "user" {
  value       = module.ec2.tags["Name"]
  description = "The user of the instance"
}

module "deploy" {
  # The double slash separates the repository from the module subdirectory
  source                 = "github.com/nix-community/nixos-anywhere//terraform/all-in-one"
  
  nixos_system_attr      = "../templates#packages.x86_64-linux.<your-user>"
  nixos_partitioner_attr = "../templates#packages.x86_64-linux.<your-user>-disko"

  target_host            = module.ec2.public_ip
  instance_id            = module.ec2.public_ip

  install_ssh_key = file("~/.aws/keys/login_keys/instance")

  deployment_ssh_key = file("~/.aws/keys/login_keys/instance")

  target_user = "<your-user>"

  install_user = module.ec2.tags["Name"]
}

Note that there are two steps here, split into two modules. The first deploys the machine; in the second, Terraform runs the nixos-anywhere module against that machine, executing kexec under the hood. The second step therefore takes two private keys: install_ssh_key, which matches the public key already registered on the instance by the cloud-init script and is used during the installation, and deployment_ssh_key, which is used once the NixOS installation is complete; its public key must be listed in the NixOS configuration so that we can access the instance afterwards. The same logic applies to the users (install_user and target_user), with the caveat that we can’t choose the initial username unless we built the image ourselves; verify those details with your image provider.
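
Both steps rely on the instance being reachable over SSH, and the main.tf shown earlier does not attach a security group to it. If the default security group of your VPC does not already allow inbound SSH, something like the following sketch could be added to main.tf. The resource name ssh and the variable ssh_ingress_cidr are hypothetical, and the instance is assumed to live in the default VPC.

```hcl
# Hypothetical variable: the CIDR block allowed to reach the instance over SSH,
# e.g. "203.0.113.10/32" for a single address.
variable "ssh_ingress_cidr" {
  type = string
}

# Security group that allows inbound SSH and all outbound traffic.
# With no vpc_id set, it is created in the default VPC (an assumption).
resource "aws_security_group" "ssh" {
  name        = "${var.instance_name}-ssh"
  description = "Allow inbound SSH for cloud-init login and nixos-anywhere"

  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_ingress_cidr]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Then attach it inside the existing aws_instance.web resource:
#   vpc_security_group_ids = [aws_security_group.ssh.id]
```

Restricting the ingress rule to your own address range, rather than 0.0.0.0/0, keeps the installer's SSH port from being exposed to the whole internet during the deployment.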

Deployment

Run these commands in the directory that contains your main.tf to install the dependencies and run Terraform. Note that the full deployment takes a few minutes.

terraform init
terraform apply