Configuring Ansible Vault for Your Homelab

Ari Kalfus | Mar 17, 2024

I run a small homelab in a Tailscale network, configured with Ansible. There are several methods I could use to manage application secrets in this environment, but I’ve chosen HashiCorp Vault: Vault Community is frequently what I would turn to in a professional environment, so it will serve as a good resource for experimentation in the homelab.

This article will cover how I set up Vault via Ansible to manage secrets in my homelab.

Setup:

  • Single Vault process on a non-dedicated server (other services running on the server) using Raft storage
  • Vault auto-unseal via AWS, auto-start via Systemd
  • Tailscale creates a Wireguard network, thus Vault itself does not need to be configured with TLS
  • “Tier 0” secrets, used to initialize Ansible, are pulled from 1Password
  • “Tier 1” secrets, used to configure services, are pulled from Vault

Install Vault

I installed Vault using Homebrew (yes, on Linux, it’s better than whatever you’re doing). I set up Homebrew itself outside of Ansible, for simplicity, by running:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

I then manage brew packages on the system with:

- name: Packages | Uninstall Homebrew Packages
  community.general.homebrew:
    name: "{{ homebrew_uninstalled_packages }}"
    state: absent

- name: Packages | Install Homebrew Packages
  community.general.homebrew:
    name: "{{ homebrew_installed_packages }}"
    state: present

In my host vars, I have each of these list variables defined. To install Vault, I included:

homebrew_installed_packages:
  - hashicorp/tap/vault

Then I can keep Vault, along with other software, up-to-date with brew upgrade.
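If you’d rather have the playbook handle upgrades too, the same module can do that. A minimal sketch using the module’s update_homebrew and upgrade_all options (this task isn’t in my playbook; I just run brew upgrade by hand):

- name: Packages | Upgrade Homebrew Packages
  community.general.homebrew:
    update_homebrew: true
    upgrade_all: true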

Configure Vault

Installing the Vault binary isn’t enough to run Vault on the system. I need to configure Vault to run as a service, and I need to initialize and unseal the Vault.

I configure Vault by creating a system user and defining a systemd service to run the Vault process.

- name: Vault | Create User
  become: true
  ansible.builtin.user:
    name: vault
    password: '!'  # disabled
    comment: HashiCorp Vault service user
    create_home: false
    shell: /sbin/nologin
    system: true
    state: present
  register: vault_user

- name: Vault | Ensure Directories
  become: true
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: "{{ vault_user.name }}"
    mode: "0755"
  loop:
    - /opt/vault
    - /opt/vault/data

- name: Vault | Ensure Vault Config
  become: true
  ansible.builtin.template:
    src: ../files/vault_config.hcl.j2
    dest: /opt/vault/vault_config.hcl
    owner: "{{ vault_user.name }}"
    mode: "0600"
  notify: Restart Vault

- name: Vault | Configure Vault Service
  become: true
  ansible.builtin.copy:
    src: ../files/vault.service
    dest: /etc/systemd/system/vault.service
    mode: "0644"
    owner: root
  notify: Restart Vault

- name: Vault | Initialize Vault Service
  become: true
  register: vault_initialize
  ansible.builtin.systemd:
    name: vault.service
    enabled: true
    daemon_reload: true
    state: started

The vault_config.hcl.j2 file is a Jinja2 template that I use to configure Vault. I also set the vault_config.hcl file permissions to 0600 since I will end up putting AWS credentials into this file.

vault_config.hcl:

storage "raft" {
    path        = "/opt/vault/data"
    node_id     = "node1"
}

listener "tcp" {
    address         = "{{ ansible_host }}:{{ ports.vault_main }}"
    tls_disable     = true
}

seal "awskms" {
    region      = "{{ unseal_aws_region }}"
    kms_key_id  = "{{ unseal_aws_kms_key_id }}"
    access_key  = "{{ unseal_aws_access_key }}"
    secret_key  = "{{ unseal_aws_secret_key }}"
}

disable_mlock   = true
api_addr        = "http://{{ ansible_host }}:{{ ports.vault_main }}"
cluster_addr    = "http://{{ ansible_host }}:{{ ports.vault_cluster }}"
ui              = true

Vault Raft storage is the “default” storage option for Vault these days, so it is simple to use for this one-node setup. I need to provide the path where I want Vault to persist its data. In this case, it will be /opt/vault/data, which I created in the playbook.

Because I’m using Tailscale for my homelab network, I don’t need to configure TLS for Vault, so I set tls_disable to true. Once this GitHub issue is resolved, I will likely enable TLS for Vault inside my Tailscale network so URLs look nicer. In my .zshrc, I set my Vault address with:

export VAULT_ADDR="http://$(tailscale ip --4):8200"

I do use the Tailscale MagicDNS name for this server when accessing Vault from my other devices, e.g. http://myserver:8200.

Following HashiCorp’s guidance, since I am using Raft storage I also set disable_mlock = true.

Finally, I set the awskms seal section to use AWS KMS for auto-unsealing. This makes Vault available immediately after a reboot, which is important for a homelab environment. I spent a few weeks manually unsealing Vault after every reboot, and the ~$1/month to auto-unseal with KMS is well worth it.

Auto-Unseal with AWS KMS

There are a few short steps to setting up a KMS key and IAM user for Vault to use for auto-unsealing. The HashiCorp auto-unseal documentation is completely useless if you want to understand what is required.

You could do this with Terraform, but I opted to use the AWS console.

1. Create a KMS key in the AWS console

Use the default values - symmetric key for encryption and decryption.

KMS Key Configure screen

Give your key an alias (a display name), then move on to defining key permissions. Select a key administrator who can manage the key (e.g. your personal or administrator IAM user/role). For example, since I log in to my personal AWS account through SSO (shoutout to Jumpcloud’s free tier, which they no longer seem to offer to new accounts, or at least no longer document), I selected my AWSReservedSSO_AdministratorAccess_xxx role.

Next, I need to select Key Users, who will be able to use the KMS key for cryptographic operations. This will need to be our Vault IAM User, but since I haven’t created that yet, I’ll leave this blank for now.

Finish creating the KMS key and copy the ARN of the key for the next step.

2. Create an IAM user for Vault to use for auto-unsealing

Head over to the Users section of IAM. Before I can create a user, however, I need to create a policy that will allow the user to use the KMS key I created. Select the Policies section of IAM and create a new policy.

Here’s the JSON for the policy I need to create. Substitute in the ARN for the KMS key you created in step 1.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VaultUnsealUseKey",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt",
                "kms:DescribeKey"
            ],
            "Resource": "arn:aws:kms:us-east-1:xxx:key/xxx"
        }
    ]
}

Give your policy a name and create it. Now, head over to the Users section of IAM and create a new user.

“Attach policies directly” to the user and select the IAM policy you just created.

IAM user attach policy

Create the user, then head over to Security credentials to create an access and secret key pair. Copy these values.

Don’t forget to go back to your KMS Key and, under the Key users section, add your new IAM user.

KMS add key user

You must grant access both via the IAM policy and on the KMS key policy itself:

No AWS principal, including the account root user or key creator, has any permissions to a KMS key unless they are explicitly allowed, and never denied, in a key policy, IAM policy, or grant.

Unless the key policy explicitly allows it, you cannot use IAM policies to allow access to a KMS key. Without permission from the key policy, IAM policies that allow permissions have no effect.

- https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html

3. Integrate with Vault

In my case, I store the IAM user credentials in 1Password, and pass them to Vault during playbook invocation. To resolve networking issues I experienced running Ansible over Tailscale, I set no_proxy when running the command.

no_proxy='*' poetry run ansible-playbook -i inventory --force-handlers \
	-e ansible_become_password=$$(op read "op://...") \
	-e unseal_aws_access_key=$$(op read "op://...") \
  	-e unseal_aws_secret_key=$$(op read "op://...") \
	server.yml

I then set two host vars on this server:

unseal_aws_region: "us-east-1"
unseal_aws_kms_key_id: "..."

You can get the KMS key ID from the KMS section of the console, or extract it from your KMS key ARN (it is the final segment, after key/).

This completes the configuration to auto-unseal Vault with AWS KMS.
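One thing the tasks above don’t cover is the one-time vault operator init. With an auto-unseal seal configured, init produces recovery keys rather than unseal keys. A hedged sketch of how that could look as a task, assuming the Homebrew binary path and the same VAULT_ADDR as the rest of this setup (running it by hand once works just as well):

- name: Vault | Initialize (one-time, against an uninitialized Vault)
  ansible.builtin.command:
    cmd: /home/linuxbrew/.linuxbrew/bin/vault operator init -recovery-shares=1 -recovery-threshold=1
  environment:
    VAULT_ADDR: "http://{{ ansible_host }}:{{ ports.vault_main }}"
  register: vault_init
  no_log: true  # output contains the initial root token and recovery key

Whatever the mechanism, store the recovery key and initial root token somewhere safe; 1Password fits the Tier 0 pattern here.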

Run Vault as a Service

The systemd service file for Vault is rather straightforward. Since I installed Vault through Homebrew, I use the absolute path for the brew vault binary.

[Unit]
Description="HashiCorp Vault - A tool for managing secrets"
Documentation=https://developer.hashicorp.com/vault/docs
Requires=network-online.target
After=network-online.target
ConditionFileNotEmpty=/opt/vault/vault_config.hcl
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
User=vault
Group=vault
ProtectSystem=full
ProtectHome=read-only
PrivateTmp=true
PrivateDevices=true
SecureBits=keep-caps
AmbientCapabilities=CAP_IPC_LOCK
CapabilityBoundingSet=CAP_SYSLOG CAP_IPC_LOCK
NoNewPrivileges=true
ExecStart=/home/linuxbrew/.linuxbrew/bin/vault server -config=/opt/vault/vault_config.hcl
ExecReload=/bin/kill --signal HUP $MAINPID
KillMode=process
KillSignal=SIGINT
Restart=on-failure
RestartSec=5
TimeoutStopSec=30
LimitNOFILE=65536
LimitMEMLOCK=infinity
LimitCORE=0

[Install]
WantedBy=multi-user.target

The service starts vault server with the config file created by Ansible. Vault needs CAP_IPC_LOCK, and the service runs under the system vault user created by the playbook.
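To confirm auto-unseal actually worked once the service is up, a small check can sit at the end of the Vault tasks. A sketch (vault status exits 0 when unsealed, 2 when sealed, 1 on error):

- name: Vault | Wait for Auto-Unseal
  ansible.builtin.command:
    cmd: /home/linuxbrew/.linuxbrew/bin/vault status
  environment:
    VAULT_ADDR: "http://{{ ansible_host }}:{{ ports.vault_main }}"
  register: vault_status
  until: vault_status.rc == 0
  retries: 10
  delay: 3
  changed_when: false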

Login and read secrets from Vault

With this setup, I can access Vault from the rest of my playbook and pull in secrets needed for other services in my homelab. In a separate Terraform project, I configured Vault’s authentication and secrets backends and created several system policies for my use cases. One of those is an AppRole for this Ansible playbook. I store the role and secret IDs in 1Password and pass them into Ansible as well.

no_proxy='*' poetry run ansible-playbook -i inventory --force-handlers \
	-e ansible_become_password=$$(op read "op://...") \
	-e vault_role_id=$$(op read "op://...") \
	-e vault_secret_id=$$(op read "op://...") \
	-e unseal_aws_access_key=$$(op read "op://...") \
  	-e unseal_aws_secret_key=$$(op read "op://...") \
	server.yml

I log in to Vault in my playbook with this section at the end of my Vault tasks:

- name: Vault | Prompt for AppRole
  ansible.builtin.debug:
    msg: "Please generate an AppRole and re-run the playbook with the following env vars: vault_role_id, vault_secret_id"
  when:
    - vault_role_id is undefined

- name: Vault | Quit and Resubmit with AppRole
  ansible.builtin.meta: end_play
  when:
    - vault_role_id is undefined

- name: Vault | Login
  ansible.builtin.set_fact:
    hashivault_login_data: "{{ lookup('community.hashi_vault.vault_login', url=vault_url, auth_method='approle', role_id=vault_role_id, secret_id=vault_secret_id) }}"  # noqa: yaml[line-length]

I can then read secrets from Vault using the community.hashi_vault lookup plugins. This is super ugly, but seems the most straightforward way to go about it. This example is pulled from one of my homelab systems, which runs https://github.com/monicahq/monica.

# Define however many paths of secrets you want to read in a 'paths' variable
# And use the 'community.hashi_vault.vault_kv2_get' lookup plugin to capture the raw response output from Vault
- name: CRM | Get Secrets
  vars:
    paths:
      - apps/monica-crm
      - apps/monica-crm/db
      - saas/sendgrid/monica-crm
      - saas/weatherapi
  ansible.builtin.set_fact:
    monica_secrets_raw: "{{ lookup('community.hashi_vault.vault_kv2_get', *paths, url=vault_url, auth_method='token', token=hashivault_login_data | community.hashi_vault.vault_login_token) }}"  # noqa: yaml[line-length]

# Parse the raw response output from Vault and collapse the data into a single dictionary
- name: CRM | Parse Secrets
  ansible.builtin.set_fact:
    monica_secrets: "{{ monica_secrets_raw[0].data.data | ansible.builtin.combine(monica_secrets_raw[1].data.data, monica_secrets_raw[2].data.data, monica_secrets_raw[3].data.data, list_merge='append') }}"  # noqa: yaml[line-length]
    # If you only reference one path, you can just reference `monica_secrets_raw.data.data` directly, there will not be an array.
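Downstream tasks can then reference individual keys out of the combined monica_secrets dictionary, for example to template an environment file for the app. A sketch with made-up key names and paths (my actual Vault contents obviously differ):

- name: CRM | Template App Environment
  become: true
  ansible.builtin.template:
    src: ../files/monica_env.j2
    dest: /opt/monica/.env
    owner: root
    mode: "0600"
  # inside monica_env.j2, reference values such as {{ monica_secrets.db_password }}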

Conclusion

Following this setup, I can create however many other services I want and pull their secrets from Vault. I also take Vault’s audit logs and stream them to a 3rd party provider with a great free tier. If you are interested, I’ve included a bonus section below on how this is configured.

Bonus: Stream audit logs to Axiom

I use Axiom to stream my Vault audit logs. I connect Vault’s syslog output to Axiom via Vector.
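This depends on Vault’s syslog audit device being enabled, which the playbook doesn’t show. It is a one-off vault audit enable syslog call; a hedged sketch as a task (vault_admin_token is a hypothetical variable here, and the tag value is what the Vector config below filters on):

- name: Vault | Enable Syslog Audit Device (one-time)
  ansible.builtin.command:
    cmd: /home/linuxbrew/.linuxbrew/bin/vault audit enable syslog tag=vault facility=AUTH
  environment:
    VAULT_ADDR: "http://{{ ansible_host }}:{{ ports.vault_main }}"
    VAULT_TOKEN: "{{ vault_admin_token }}"  # hypothetical: any token allowed to manage audit devices
  changed_when: true  # errors if the device is already enabled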

I install Vector’s apt package:

- name: Packages | Add Vector Repository
  block:
    - name: Packages | Vector | Ensure GPG Key
      become: true
      ansible.builtin.get_url:
        url: "https://repositories.timber.io/public/vector/gpg.3543DB2D0A2BC4B8.key"
        dest: "/usr/share/keyrings/timber-vector-archive-keyring.asc"
        mode: '0644'
        owner: root
      register: vector_gpg_key

    - name: Packages | Vector | Add Repository
      become: true
      ansible.builtin.apt_repository:
        repo: "deb [signed-by={{ vector_gpg_key.dest }}] https://repositories.timber.io/public/vector/deb/ubuntu {{ ansible_distribution_release }} main"
        state: present
        filename: timber-vector
        mode: '0644'

- name: Packages | Install Apt Packages
  become: true
  ansible.builtin.apt:
    name: "{{ apt_packages }}"  # one of which is 'vector'
    cache_valid_time: 3600
    state: present

and configure Vector via the following Ansible tasks:

- name: Logs | Copy Vector Config
  become: true
  ansible.builtin.template:
    src: ../files/vector.yaml.j2
    dest: /etc/vector/vector.yaml
    mode: '0640'
    owner: vector
  register: vector_config
  notify: Restart Vector

- name: Logs | Validate Vector Config
  become: true
  when: vector_config.changed
  ansible.builtin.command:
    cmd: /usr/bin/vector validate /etc/vector/vector.yaml
  register: vector_validate

- name: Logs | Ensure Valid Vector Config
  when: vector_config.changed
  ansible.builtin.assert:
    that:
      - vector_validate.rc == 0

- name: Logs | Ensure Vector Service
  become: true
  ansible.builtin.systemd:
    name: vector.service
    enabled: true
    state: started
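As with Vault, the config task notifies a Restart Vector handler that isn’t shown; a minimal sketch:

- name: Restart Vector
  become: true
  ansible.builtin.systemd:
    name: vector.service
    state: restarted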

Because Vector and Axiom handle Vault’s logs, I consider them Tier 0 services, so the Axiom API token is also pulled from 1Password. My full make invocation, which supports all of the homelab projects in my Ansible playbook, is:

no_proxy='*' poetry run ansible-playbook -i inventory --force-handlers \
	-e ansible_become_password=$$(op read "op://...") \
	-e axiom_api_token=$$(op read "op://...") \
	-e vault_role_id=$$(op read "op://...") \
	-e vault_secret_id=$$(op read "op://...") \
	-e unseal_aws_access_key=$$(op read "op://...") \
  	-e unseal_aws_secret_key=$$(op read "op://...") \
	server.yml

Vector’s apt package installs a default systemd service, which I leverage without modification. It is a basic service file:

/lib/systemd/system/vector.service:

[Unit]
Description=Vector
Documentation=https://vector.dev
After=network-online.target
Requires=network-online.target

[Service]
User=vector
Group=vector
ExecStartPre=/usr/bin/vector validate
ExecStart=/usr/bin/vector
ExecReload=/usr/bin/vector validate
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
AmbientCapabilities=CAP_NET_BIND_SERVICE
EnvironmentFile=-/etc/default/vector
# Since systemd 229, should be in [Unit] but in order to support systemd <229,
# it is also supported to have it here.
StartLimitInterval=10
StartLimitBurst=5
[Install]
WantedBy=multi-user.target

If I wanted to customize Vector’s behavior, I could do so via environment variables in the /etc/default/vector file. The default behavior reads Vector’s configuration from /etc/vector/vector.yaml. This file is where I include the details for how Vector should parse logs from syslog, transform them, and send them to Axiom.

sources:
  # Read syslog entries from auth.log
  authlog:
    type: file
    ignore_older_secs: 600
    include:
      - /var/log/auth.log

transforms:
  # Parse Syslog logs
  parse_authlog:
    type: remap
    inputs:
      - authlog
    source: |
      . = parse_syslog!(string!(.message))      

  # Now that we have syslog parsed, we can filter on the "appname" field.
  # Ignore any logs that are not from the vault service.
  filter_vault:
    type: filter
    inputs:
      - parse_authlog
    condition: |
      .appname == "vault"      

  # Now we can parse the JSON out of the "message" inside the syslog entry.
  # We can't filter on request.path yet because we have to parse the message.
  parse_vault:
    type: remap
    inputs:
      - filter_vault
    # Some paths, so far just 'sys/internal/ui/mounts', break when trying to parse as JSON.
    # This path isn't useful to ingest anyway, so we ignore the parsing error and drop further processing.
    source: |
      message, err = parse_json(.message)
      if err != null {
        log("Unable to parse JSON: " + err, level: "debug")
        abort
      } else {
        .vault = message
        del(.message)
      }      

  # Certain logs we don't want to ingest.
  # Filter them out here, we can now filter on Vault's JSON properties under the "vault" root-level field.
  filter_vault_omissions:
    type: filter
    inputs:
      - parse_vault
    condition: |
      .vault.type != "request" &&
      .vault.request.path != "sys/auth" &&
      .vault.request.path != "sys/internal/ui/mounts" &&
      .vault.request.path != "sys/internal/ui/resultant-acl" &&
      .vault.request.path != "sys/capabilities-self"      

sinks:
  # Stream resulting log entries to Axiom
  axiom:
    type: axiom
    inputs:
      - filter_vault_omissions
    token: "{{ axiom_api_token }}"
    dataset: "{{ axiom_vault_dataset }}"
    org_id: "{{ axiom_org_id }}"

Vector has a ton of components you can work with, and a number of ways you can transform data between sources and sinks. I’ve only scratched the surface in my vector file, but it’s sufficient to parse the Vault logs, exclude some heavily accessed paths I don’t care about, and stream the rest into Axiom.

Inside Axiom I can query, monitor, and visualize the log data similarly to any log management platform, which is another benefit of its inclusion in my homelab setup.

Vault Axiom dashboard

I do wish there were a Personal Plus-style plan for Axiom, as I don’t care about my homelab setup enough to pay for the $25/month plan once I hit the limits of the free-tier Personal plan. For now, it works super smoothly, and I am very happy with it.
