Get started with HashiCorp Nomad & Consul
From 0 to 100 with HashiCorp's Nomad & Consul, including initial server setup, load balancing, service connections, and much more.
Roman Zipp, April 8th, 2023
About this Guide
This post will guide you through the initial setup of Nomad, Consul and Vault.
Additionally, I will cover some common follow-up steps for:
AWS CLI (for ECR)
Docker (+ Authentication)
1. Prerequisites
CNI Bridge
Nomad uses CNI plugins to configure the network namespace used to secure the Consul service mesh sidecar proxy. All Nomad client nodes using network namespaces must have CNI plugins installed. See the Consul CNI Docs for more information.
See the Nomad Install Docs for more information.
curl -L -o cni-plugins.tgz "https://github.com/containernetworking/plugins/releases/download/v1.0.0/cni-plugins-linux-$( [ $(uname -m) = aarch64 ] && echo arm64 || echo amd64)"-v1.0.0.tgz && \
  sudo mkdir -p /opt/cni/bin && \
  sudo tar -C /opt/cni/bin -xzf cni-plugins.tgz
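As a quick sanity check (not part of the official docs), you can list the install directory to confirm the plugins were extracted:

# the bridge, loopback & co. plugin binaries should show up here
ls /opt/cni/bin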
Configure environment values
This script configures the environment variables NOMAD_ADDR, VAULT_ADDR and CONSUL_HTTP_ADDR so we can run CLI commands without appending the address and port every time.
PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)

echo -e "\nexport NOMAD_ADDR=http://$PRIVATE_IP:4646" >> /root/.bashrc
echo -e "export VAULT_ADDR=https://$PRIVATE_IP:8200" >> /root/.bashrc
echo -e "export CONSUL_HTTP_ADDR=http://$PRIVATE_IP:8500" >> /root/.bashrc

source /root/.bashrc
2. Installation
Get private interface IP
You need to obtain the private IP of your chosen network interface. Check whether the command below (which assumes the ens18 interface) fits your needs, or set the IP address manually.
PRIVATE_IP=$(/sbin/ip -o -4 addr list ens18 | awk '{print $4}' | cut -d/ -f1)
echo $PRIVATE_IP
Install required packages
apt-get update && apt-get upgrade -y
apt-get install -y curl wget gpg gnupg coreutils ca-certificates lsb-release

# AWS CLI
apt-get install -y awscli amazon-ecr-credential-helper
Install Docker
See the official Docker install guide for more information.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null

apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io
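Optionally, confirm that the Docker daemon is installed and running before moving on:

docker --version
systemctl is-active docker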
Add HashiCorp's APT repository
See the official install guide if you prefer to use the prebuilt binary.
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
Install Nomad, Consul & Vault
apt-get update
apt-get install -y nomad consul vault

systemctl enable nomad
systemctl enable consul
systemctl enable vault
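If you want to double-check the installation, print the versions; the exact output depends on the package versions available at install time:

nomad version
consul version
vault version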
3. Configuration
3.1 Docker AWS ECR Authentication
We will create a new /etc/docker/config.json file to provide Nomad with our Docker login credentials. Replace <aws_id> and <aws_region> with your own values.
mkdir -p /etc/docker

cat <<EOT >> /etc/docker/config.json
{
  "credHelpers": {
    "public.ecr.aws": "ecr-login",
    "<aws_id>.dkr.ecr.<aws_region>.amazonaws.com": "ecr-login"
  }
}
EOT
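To verify the credential helper works, you can try pulling one of your images. Note that the Docker CLI reads ~/.docker/config.json by default, so point it at /etc/docker for this test; <repository> and <tag> are placeholders for an image in your own registry, and the ecr-login helper needs valid AWS credentials in the environment:

# <repository> and <tag> are placeholders for one of your own ECR images
DOCKER_CONFIG=/etc/docker docker pull <aws_id>.dkr.ecr.<aws_region>.amazonaws.com/<repository>:<tag>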
3.2 Nomad
Before getting started with the Nomad instances, we need to configure some environment variables in /etc/nomad.d/nomad.env.
mkdir -p /etc/nomad.d

cat <<EOT >> /etc/nomad.d/nomad.env
AWS_ACCESS_KEY_ID=******
AWS_SECRET_ACCESS_KEY=******
AWS_DEFAULT_REGION=<aws_region>
VAULT_ADDR=http://127.0.0.1:8200
VAULT_TOKEN=
CONSUL_HTTP_ADDR=$PRIVATE_IP:8500
CONSUL_CACERT=/etc/consul.d/certs/consul-agent-ca.pem
CONSUL_CLIENT_CERT=/etc/consul.d/certs/dc1-server-consul.pem
CONSUL_CLIENT_KEY=/etc/consul.d/certs/dc1-server-consul-key.pem
CONSUL_HTTP_SSL=false
EOT
We will now configure each Nomad instance as a client and/or server. Place this file in /etc/nomad.d/nomad.hcl.
rm -f /etc/nomad.d/nomad.hcl && nano /etc/nomad.d/nomad.hcl
Nomad Client
datacenter = "dc1" data_dir = "/opt/nomad" bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}" server { enabled = false } client { enabled = true network_interface = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"name\" }}" template { disable_file_sandbox = true } } consul {} plugin "docker" { config { volumes { enabled = true } auth { config = "/etc/docker/config.json" } } }
Nomad Client & Server
datacenter = "dc1" data_dir = "/opt/nomad" bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}" server { enabled = true bootstrap_expect = 3 } client { enabled = true network_interface = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"name\" }}" template { disable_file_sandbox = true } } consul {} plugin "docker" { config { volumes { enabled = true } auth { config = "/etc/docker/config.json" } } }
Some values explained
The server.bootstrap_expect value defines how many Nomad server instances need to be running in order to elect a leader.
The client.template.disable_file_sandbox option allows you to mount host files into your job allocations.
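Once the config file is in place, restart the agent and check that the node shows up. A minimal sanity check, assuming the systemd units installed by the packages above:

systemctl restart nomad

# list known servers and client nodes
nomad server members
nomad node status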
3.3 Consul
Set up TLS & encryption (optional)
To enable internal TLS encryption, we first generate a gossip encryption key and a CA certificate using the following commands. See the Consul TLS docs for more information.
mkdir -p /etc/consul.d/certs && cd /etc/consul.d/certs

consul keygen
# UyaZRVMUdoNinDtEDxMZFiqpQmjbsIQXUeGYDWgi=

consul tls ca create -domain consul
# ==> Saved consul-agent-ca.pem
# ==> Saved consul-agent-ca-key.pem
You now need to distribute the generated consul-agent-ca.pem certificate to all Consul agents and place it in /etc/consul.d/certs/consul-agent-ca.pem.
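For example, copying the CA certificate from the machine it was generated on to another agent could look like this; <agent-ip> is a placeholder for each agent's address:

# run on the host holding the CA; repeat for every agent
scp /etc/consul.d/certs/consul-agent-ca.pem <agent-ip>:/etc/consul.d/certs/consul-agent-ca.pem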
Generate agent certificates
On the Consul server host, generate an agent certificate for each Consul agent you want to deploy.
consul tls cert create -server -dc dc1 -domain consul
# ==> WARNING: Server Certificates grants authority to become a
#     server and access all state in the cluster including root keys
#     and all ACL tokens. Do not distribute them to production hosts
#     that are not server nodes. Store them as securely as CA keys.
# ==> Using consul-agent-ca.pem and consul-agent-ca-key.pem
# ==> Saved dc1-server-consul-0.pem
# ==> Saved dc1-server-consul-0-key.pem
Distribute the agent certificates to the respective servers:
scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1-key.pem .
scp 10.1.10.1:/etc/consul.d/certs/dc1-server-consul-1.pem .
Configure each agent
Again, decide whether each host will act as a Consul client or server.
mkdir -p /etc/consul.d/certs
chown -R consul:consul /etc/consul.d
chown -R consul:consul /opt/consul

rm -f /etc/consul.d/consul.hcl && nano /etc/consul.d/consul.hcl
chmod 640 /etc/consul.d/consul.hcl

# Paste the consul certificates to:
# /etc/consul.d/certs/dc1-server-consul.pem
# /etc/consul.d/certs/dc1-server-consul-key.pem

# Bootstrap ACL
consul acl bootstrap
Consul Client
datacenter = "dc1" data_dir = "/opt/consul" bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}" client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}" retry_join = ["<add all o your consul clients & server ip addresses>"] # also include this server ip ca_file = "/etc/consul.d/certs/consul-agent-ca.pem" cert_file = "/etc/consul.d/certs/dc1-server-consul.pem" key_file = "/etc/consul.d/certs/dc1-server-consul-key.pem" tls { grpc { use_auto_cert = false } } ports { grpc = 8502 grpc_tls = -1 } connect { enabled = true } dns_config { allow_stale = true node_ttl = "5s" use_cache = true cache_max_age = "5s" } log_level = "info"
Consul Server
datacenter = "dc1" data_dir = "/opt/consul" server = true bootstrap_expect = 3 # your server count bind_addr = "{{ GetPrivateInterfaces | include \"network\" \"10.0.0.0/8\" | attr \"address\" }}" client_addr = "{{ GetPrivateInterfaces | exclude \"name\" \"docker.*\" | join \"address\" \" \" }} {{ GetAllInterfaces | include \"flags\" \"loopback\" | join \"address\" \" \" }}" retry_join = ["<add all o your consul clients & server ip addresses>"] # also include this server ip ca_file = "/etc/consul.d/certs/consul-agent-ca.pem" cert_file = "/etc/consul.d/certs/dc1-server-consul-0.pem" key_file = "/etc/consul.d/certs/dc1-server-consul-0-key.pem" ui_config { enabled = true } acl { enabled = true default_policy = "allow" # change this enable_token_persistence = true } tls { grpc { use_auto_cert = false } } ports { grpc = 8502 grpc_tls = -1 } connect { enabled = true } dns_config { allow_stale = true node_ttl = "5s" use_cache = true cache_max_age = "5s" } log_level = "info"
4. Cheat-Sheet
Here are some handy commands I commonly use for debugging.
# nomad cleanup allocation history & summary
nomad system gc
nomad system reconcile summaries

# show service logs
nomad monitor

# attach shell to job
nomad alloc exec -task=<task> <alloc> /bin/bash

# show open ports
lsof -i -P -n | grep LISTEN
ss -tulpn

# test tcp connection
nc -z -v -w 2 <host> <port>

# test consul dns
dig @127.0.0.1 -p 8600 _<service-name>._tcp.service.consul

# query container from inside
curl -H "Host: domain.tld" 10.1.10.1:21021

# query some endpoint
curl -H "Host: domain.tld" -X POST <host>/api

# list service instances "address:port"
curl -s http://127.0.0.1:8500/v1/catalog/service/<service-name> | jq -j '.[] | .ServiceAddress,":",.ServicePort,"\n"'

# consul filtering
curl --get http://127.0.0.1:8500/v1/agent/services --data-urlencode 'filter=Service == "<service-name>"' | jq -j '.[] | .Address,":",.Port,"\n"'