Building Elastic Kubernetes Service (EKS) With Terraform — Deploying and Packaging applications with Helm

Toluwase Makanjuola

This project seeks to solidify your skills by focusing on a best-practice setup:

  1. You will use Terraform to create a Kubernetes EKS cluster and dynamically add scalable worker nodes

  2. You will deploy multiple applications using Helm

  3. You will work with more Kubernetes objects and learn how to use them with Helm.

  4. You will improve upon your CI/CD skills with Jenkins

Cloud providers such as AWS offer managed services for Kubernetes. They have done all the hard work, and with a few API calls to AWS, you can have a production-grade cluster ready to go in minutes. Therefore, in this project you begin by focusing on EKS and how to get it up and running using Terraform, before moving on to other things.

Building EKS with Terraform

Note: Use Terraform version v1.0.2

  1. Open up a new directory on your laptop, and name it eks

  2. Use AWS CLI to create an S3 bucket
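For example, the bucket can be created (and versioned) with the AWS CLI as shown below; the bucket name is a placeholder, since S3 bucket names must be globally unique, and the region matches the one used later in provider.tf:

# create the S3 bucket that will hold the Terraform remote state
aws s3api create-bucket \
  --bucket <yourname>-eks-terraform-state \
  --region us-west-1 \
  --create-bucket-configuration LocationConstraint=us-west-1

# optionally enable versioning so that previous state files are retained
aws s3api put-bucket-versioning \
  --bucket <yourname>-eks-terraform-state \
  --versioning-configuration Status=Enabled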

  3. Create a file — backend.tf. Task for you: ensure the backend is configured for remote state in S3.

terraform {
}
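As a pointer, a minimal sketch of an S3 backend configuration is shown below; the bucket, key, and region values are placeholders that you must replace with the bucket you created in step 2:

terraform {
  backend "s3" {
    bucket  = "<yourname>-eks-terraform-state" # placeholder: use the bucket created above
    key     = "eks/terraform.tfstate"          # placeholder: any sensible state path
    region  = "us-west-1"
    encrypt = true
    # a DynamoDB table can also be configured here for state locking if you need it
  }
}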
  4. Create a file — network.tf and provision an Elastic IP for the NAT Gateway, a VPC, and private and public subnets.
# reserve Elastic IP to be used in our NAT gateway
resource "aws_eip" "nat_gw_elastic_ip" {
  vpc = true
  tags = {
    Name            = "${var.cluster_name}-nat-eip"
    iac_environment = var.iac_environment_tag
  }
}

# create VPC using the official AWS module
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "${var.name_prefix}-vpc"
  cidr = var.main_network_block
  azs  = data.aws_availability_zones.available_azs.names

  private_subnets = [
    # this loop will create a one-line list as ["10.0.0.0/20", "10.0.16.0/20", "10.0.32.0/20", ...]
    # with a length depending on how many Zones are available
    for zone_id in data.aws_availability_zones.available_azs.zone_ids :
    cidrsubnet(var.main_network_block, var.subnet_prefix_extension, tonumber(substr(zone_id, length(zone_id) - 1, 1)) - 1)
  ]

  public_subnets = [
    # this loop will create a one-line list as ["10.0.128.0/20", "10.0.144.0/20", "10.0.160.0/20", ...]
    # with a length depending on how many Zones are available
    # there is a zone Offset variable, to make sure no collisions are present with private subnet blocks
    for zone_id in data.aws_availability_zones.available_azs.zone_ids :
    cidrsubnet(var.main_network_block, var.subnet_prefix_extension, tonumber(substr(zone_id, length(zone_id) - 1, 1)) + var.zone_offset - 1)
  ]

  # enable single NAT Gateway to save some money
  # WARNING: this could create a single point of failure, since we are creating a NAT Gateway in one AZ only
  # feel free to change these options if you need to ensure full Availability without the need of running 'terraform apply'
  # reference: https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/2.44.0#nat-gateway-scenarios
  enable_nat_gateway     = true
  single_nat_gateway     = true
  one_nat_gateway_per_az = false
  enable_dns_hostnames   = true
  reuse_nat_ips          = true
  external_nat_ip_ids    = [aws_eip.nat_gw_elastic_ip.id]

  # add VPC/Subnet tags required by EKS
  tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    iac_environment                             = var.iac_environment_tag
  }
  public_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/elb"                    = "1"
    iac_environment                             = var.iac_environment_tag
  }
  private_subnet_tags = {
    "kubernetes.io/cluster/${var.cluster_name}" = "shared"
    "kubernetes.io/role/internal-elb"           = "1"
    iac_environment                             = var.iac_environment_tag
  }
}
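To see how the cidrsubnet() calls above carve the main /16 block into /20 subnets, you can experiment in terraform console; the values below assume subnet_prefix_extension = 4 and zone_offset = 8, which are set later in variables.tfvars:

$ terraform console
> cidrsubnet("10.0.0.0/16", 4, 0)
"10.0.0.0/20"
> cidrsubnet("10.0.0.0/16", 4, 1)
"10.0.16.0/20"
> cidrsubnet("10.0.0.0/16", 4, 8)
"10.0.128.0/20"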

Note: The tags added to the subnets are very important. The Kubernetes Cloud Controller Manager (cloud-controller-manager) and AWS Load Balancer Controller (aws-load-balancer-controller) must be able to identify the cluster's subnets. To do that, they query the subnets, using these tags as a filter.

  • For public and private subnets that use load balancer resources: each subnet must be tagged
Key: kubernetes.io/cluster/cluster-name
Value: shared
  • For private subnets that use internal load balancer resources: each subnet must be tagged
Key: kubernetes.io/role/internal-elb
Value: 1
  • For public subnets that use external load balancer resources: each subnet must be tagged
Key: kubernetes.io/role/elb
Value: 1
  5. Create a file — variables.tf
# create some variables
variable "cluster_name" {
  type        = string
  description = "EKS cluster name."
}
variable "iac_environment_tag" {
  type        = string
  description = "AWS tag to indicate environment name of each infrastructure object."
}
variable "name_prefix" {
  type        = string
  description = "Prefix to be used on each infrastructure object Name created in AWS."
}
variable "main_network_block" {
  type        = string
  description = "Base CIDR block to be used in our VPC."
}
variable "subnet_prefix_extension" {
  type        = number
  description = "CIDR block bits extension to calculate CIDR blocks of each subnetwork."
}
variable "zone_offset" {
  type        = number
  description = "CIDR block bits extension offset to calculate Public subnets, avoiding collisions with Private subnets."
}
  6. Create a file — data.tf. This will pull the available AZs for use.
# get all available AZs in our region
data "aws_availability_zones" "available_azs" {
  state = "available"
}
data "aws_caller_identity" "current" {} # used for accesing Account ID and ARN
  7. Create a file — eks.tf and provision the EKS cluster. (Create the file only if you are not using your existing Terraform code; otherwise you can simply append it to the main.tf from your existing code.) Read more about this module in the official documentation here. Reading it will help you understand more about the rich features of the module.
module "eks-cluster" {
  source           = "terraform-aws-modules/eks/aws"
  version          = "17.1.0"
  cluster_name     = "${var.cluster_name}"
  cluster_version  = "1.20"
  write_kubeconfig = true
  subnets = module.vpc.private_subnets
  vpc_id  = module.vpc.vpc_id

  worker_groups_launch_template = local.worker_groups_launch_template

  # map developer & admin ARNs as kubernetes Users
  map_users = concat(local.admin_user_map_users, local.developer_user_map_users)
}
  8. Create a file — worker-nodes.tf. This is used to set the policies for autoscaling. To save as much as 90% of the cost, we will use Spot Instances. Read more here.
# add spot fleet Autoscaling policy
resource "aws_autoscaling_policy" "eks_autoscaling_policy" {
  count = length(local.worker_groups_launch_template)
  name                   = "${module.eks-cluster.workers_asg_names[count.index]}-autoscaling-policy"
  autoscaling_group_name = module.eks-cluster.workers_asg_names[count.index]
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = var.autoscaling_average_cpu
  }
}
  9. Create a file — locals.tf to create local variables. Terraform does not allow assigning one variable to another; there are good reasons for that, such as avoiding unnecessary repetition in your code. The Terraform way to achieve this is to use locals, so that your code can be kept DRY.
# render Admin & Developer users list with the structure required by EKS module
locals {
  admin_user_map_users = [
    for admin_user in var.admin_users :
    {
      userarn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/${admin_user}"
      username = admin_user
      groups   = ["system:masters"]
    }
  ]
  developer_user_map_users = [
    for developer_user in var.developer_users :
    {
      userarn  = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:user/${developer_user}"
      username = developer_user
      groups   = ["${var.name_prefix}-developers"]
    }
  ]
  worker_groups_launch_template = [
    {
      override_instance_types = var.asg_instance_types
      asg_desired_capacity    = var.autoscaling_minimum_size_by_az * length(data.aws_availability_zones.available_azs.zone_ids)
      asg_min_size            = var.autoscaling_minimum_size_by_az * length(data.aws_availability_zones.available_azs.zone_ids)
      asg_max_size            = var.autoscaling_maximum_size_by_az * length(data.aws_availability_zones.available_azs.zone_ids)
      kubelet_extra_args      = "--node-labels=node.kubernetes.io/lifecycle=spot" # Using Spot instances means we can save a lot of money and scale to have even more instances.
      public_ip               = true
    },
  ]
}
  10. Add more variables to the variables.tf file
# create some variables
variable "admin_users" {
  type        = list(string)
  description = "List of Kubernetes admins."
}
variable "developer_users" {
  type        = list(string)
  description = "List of Kubernetes developers."
}
variable "asg_instance_types" {
  type        = list(string)
  description = "List of EC2 instance machine types to be used in EKS."
}
variable "autoscaling_minimum_size_by_az" {
  type        = number
  description = "Minimum number of EC2 instances to autoscale our EKS cluster on each AZ."
}
variable "autoscaling_maximum_size_by_az" {
  type        = number
  description = "Maximum number of EC2 instances to autoscale our EKS cluster on each AZ."
}
variable "autoscaling_average_cpu" {
  type        = number
  description = "Average CPU threshold to autoscale EKS EC2 instances."
}
  11. Create a file — variables.tfvars to set values for variables.
cluster_name            = "tooling-app-eks"
iac_environment_tag     = "development"
name_prefix             = "darey-io-eks"
main_network_block      = "10.0.0.0/16"
subnet_prefix_extension = 4
zone_offset             = 8
# Ensure that these users already exist in AWS IAM. Another approach is that you can introduce an iam.tf file to manage users separately, get the data source and interpolate their ARN.
admin_users                              = ["dare", "solomon"]
developer_users                          = ["leke", "david"]
asg_instance_types                       = ["t3.small", "t2.small"]
autoscaling_minimum_size_by_az           = 1
autoscaling_maximum_size_by_az           = 10
autoscaling_average_cpu                  = 30
  12. Create a file — provider.tf
provider "aws" {
  region = "us-west-1"
}
provider "random" {
}
  13. Run terraform init

  14. Run terraform plan — your plan should produce an output like this:

Plan: 41 to add, 0 to change, 0 to destroy.
  15. Run terraform apply

This will begin to create cloud resources, but it will fail at some point with the following error:

╷
│ Error: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp [::1]:80: connect: connection refused
│ 
│   with module.eks-cluster.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks-cluster/aws_auth.tf line 63, in resource "kubernetes_config_map" "aws_auth":
│   63: resource "kubernetes_config_map" "aws_auth" {

That is because, for Terraform to connect to the cluster and create the aws-auth ConfigMap, the Kubernetes provider must be configured with the correct cluster endpoint and credentials.

To fix this problem

  • Append to the file data.tf
# get EKS cluster info to configure Kubernetes and Helm providers
data "aws_eks_cluster" "cluster" {
  name = module.eks-cluster.cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
  name = module.eks-cluster.cluster_id
}
  • Append to the file provider.tf
# get EKS authentication for being able to manage k8s objects from terraform
provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}
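Optionally, if you later want Terraform itself to install Helm charts into this cluster (in this project we will use the helm CLI instead), the Helm provider can reuse the same EKS data sources:

# OPTIONAL: configure the Helm provider against the same EKS cluster
provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}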
  • Rerun terraform init and terraform plan — this time you will see
# module.eks-cluster.kubernetes_config_map.aws_auth[0] will be created
  + resource "kubernetes_config_map" "aws_auth" {
      + data = {
          + "mapAccounts" = jsonencode([])
          + "mapRoles"    = <<-EOT
                - "groups":
                  - "system:bootstrappers"
                  - "system:nodes"
                  "rolearn": "arn:aws:iam::696742900004:role/tooling-app-eks20210718113602300300000009"
                  "username": "system:node:{{EC2PrivateDNSName}}"
            EOT
          + "mapUsers"    = <<-EOT
                - "groups":
                  - "system:masters"
                  "userarn": "arn:aws:iam::696742900004:user/dare"
                  "username": "dare"
                - "groups":
                  - "system:masters"
                  "userarn": "arn:aws:iam::696742900004:user/solomon"
                  "username": "solomon"
                - "groups":
                  - "darey-io-eks-developers"
                  "userarn": "arn:aws:iam::696742900004:user/leke"
                  "username": "leke"
                - "groups":
                  - "darey-io-eks-developers"
                  "userarn": "arn:aws:iam::696742900004:user/david"
                  "username": "david"
            EOT
        }
      + id   = (known after apply)
      + metadata {
          + generation       = (known after apply)
          + labels           = {
              + "app.kubernetes.io/managed-by" = "Terraform"
              + "terraform.io/module"          = "terraform-aws-modules.eks.aws"
            }
          + name             = "aws-auth"
          + namespace        = "kube-system"
          + resource_version = (known after apply)
          + uid              = (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.
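Run terraform apply again so the remaining resources are created. Because write_kubeconfig = true is set on the EKS module, a kubeconfig file is written into your working directory; alternatively, you can generate one yourself with the AWS CLI (the output file name below is only an example):

# generate or refresh a kubeconfig for the new cluster
aws eks update-kubeconfig \
  --name tooling-app-eks \
  --region us-west-1 \
  --kubeconfig ./kubeconfig_tooling-app-eks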

Deploy applications with Helm

Earlier on, you experienced the use of manifest files to define and deploy resources like pods, deployments, and services into the Kubernetes cluster. Here, you will do the same thing, except that it will not be passed through kubectl. In the real world, Helm is one of the most popular tools to deploy resources into Kubernetes, because it has a rich set of features that allow deployments to be packaged as a unit, rather than having multiple YAML files managed individually, which can quickly become messy.

A Helm chart is a definition of the resources that are required to run an application in Kubernetes. Instead of having to think about all of the various deployments/services/volumes/configmaps/ etc that make up your application, you can use a command like

helm install my-release stable/mysql

Helm will make sure all the required resources are installed. In addition, you will be able to tweak the Helm configuration by setting a single variable to a particular value, and more or fewer resources will be deployed. For example, enabling replication for MySQL so that it can have read-only replicas.

Behind the scenes, a helm chart is essentially a bunch of YAML manifests that define all the resources required by the application. Helm takes care of creating the resources in Kubernetes (where they don’t exist) and removing old resources.
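You can inspect those manifests yourself: helm template renders a chart's templates locally without installing anything into the cluster. For example, once the Jenkins repository used later in this project has been added, you could run:

# render the chart's manifests locally, without touching the cluster
helm template my-jenkins jenkins/jenkins | less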

Let's begin to gradually walk through how to use Helm.

  1. Parameterizing YAML manifests using Helm templates

Let's consider that our Tooling app has been Dockerised into an image called tooling-app, and that you wish to deploy it with Kubernetes. Without Helm, you would create the YAML manifests defining the deployment, service, and ingress, and apply them to your Kubernetes cluster using kubectl apply. Initially, your application is version 1, so the Docker image is tagged as tooling-app:1.0.0. A simple deployment manifest might look something like the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tooling-app-deployment
  labels:
    app: tooling-app
spec:
  replicas: 3
  strategy: 
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: tooling-app
  template:
    metadata:
      labels:
        app: tooling-app
    spec:
      containers:
      - name: tooling-app
        image: "tooling-app:1.0.0"
        ports:
        - containerPort: 80

Now let's imagine that the developers produce another version of the tooling app, version 1.1.0. How do you deploy that? Assuming nothing needs to change in the service or other Kubernetes objects, it may be as simple as copying the deployment manifest and replacing the image defined in the spec section. You would then re-apply this manifest to the cluster, and the deployment would be updated, performing a rolling update.
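For instance, assuming the manifest above is saved as tooling-app-deployment.yaml (a file name chosen here just for illustration), the new version could be rolled out either by editing the image tag and re-applying the manifest, or by patching the Deployment's image directly:

# option 1: change image to "tooling-app:1.1.0" in the manifest, then re-apply it
kubectl apply -f tooling-app-deployment.yaml

# option 2: update the container image of the existing Deployment in place
kubectl set image deployment/tooling-app-deployment tooling-app=tooling-app:1.1.0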

The main problem with this is that all of the values specific to the tooling app — the labels and the image names etc — are mixed up with the entire definition of the manifest.

Helm tackles this by splitting the configuration of a chart out from its basic definition. For example, instead of hard coding the name of your app or the specific container image into the manifest, you can provide those when you install the “chart” (More on this later) into the cluster.

For example, a simple templated version of the tooling app deployment might look like the following:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}-deployment
  labels:
    app: "{{ template "name" . }}"
spec:
  replicas: 3
  strategy: 
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      app: "{{ template "name" . }}"
  template:
    metadata:
      labels:
        app: "{{ template "name" . }}"
    spec:
      containers:
      - name: "{{ template "name" . }}"
        image: "{{ .Values.image.name }}:{{ .Values.image.tag }}"
        ports:
        - containerPort: 80

This example demonstrates several features of Helm templates:

  • The template is based on YAML, with {{ }} mustache syntax defining dynamic sections. Helm provides various variables that are populated at install time.
  • For example, {{ .Release.Name }} allows you to change the name of the resource at runtime by using the release name. Installing a Helm chart creates a release (this is a Helm concept rather than a Kubernetes concept).
  • You can define helper methods in external files. The {{ template "name" }} call gets a safe name for the app, given the name of the Helm chart (but which can be overridden). By using helper functions, you can reduce the duplication of static values (like tooling-app), and hopefully reduce the risk of typos.

You can manually provide configuration at runtime. The {{.Values.image.name}} value, for example, is taken from a set of default values, or from values provided when you call helm install. There are many different ways to provide the configuration values needed to install a chart using Helm. Typically, you would use two approaches:

A values.yaml file that is part of the chart itself. This typically provides default values for the configuration, as well as serves as documentation for the various configuration values.

When providing configuration on the command line, you can either supply a file of configuration values using the -f flag, or set individual values with the --set flag. We will see a lot more on this later on.
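As an illustration only, and matching the template above, the chart's values.yaml could hold the image defaults:

# values.yaml: default configuration values for the tooling chart
image:
  name: tooling-app
  tag: "1.0.0"

You could then install with the defaults, override them from a file, or set individual values on the command line (the chart path ./tooling-app is hypothetical):

helm install tooling-app ./tooling-app
helm install tooling-app ./tooling-app -f custom-values.yaml
helm install tooling-app ./tooling-app --set image.tag=1.1.0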

Now let's set up Helm and begin to use it.

According to the official documentation here, there are different options for installing Helm, but we will build the binary from the source code.

  1. Download the tar.gz file from the project's GitHub release page, or simply use wget to download version 3.6.3 directly
wget https://github.com/helm/helm/archive/refs/tags/v3.6.3.tar.gz
  2. Unpack the tar.gz file
tar -zxvf v3.6.3.tar.gz
  3. cd into the unpacked directory
cd helm-3.6.3
  4. Build the source code using the make utility
make build

If you do not have make installed, or you cannot build the tool for any other reason, simply use the official documentation here for other installation options.

  5. The Helm binary will be in the bin folder. Simply move it to the bin directory on your system. You can check other tools to know where that is. For example, check where the pwd utility is being called from by running which pwd. Assuming the output is /usr/local/bin, you can move the helm binary there.
sudo mv bin/helm /usr/local/bin/
  6. Check that Helm is installed

helm version

version.BuildInfo{Version:"v3.6+unreleased", GitCommit:"13f07e8adbc57b0e3f96b42340d6f44bdc8d5016", GitTreeState:"dirty", GoVersion:"go1.15.5"}

Deploy Jenkins with Helm

Before we begin to develop our own Helm charts, let's make use of publicly available charts to deploy all the tools that we need.

One of the amazing things about Helm is that you can deploy already-packaged applications directly from a public Helm repository with very minimal configuration. An example is Jenkins.

  1. Visit Artifact Hub to find packaged applications as Helm Charts

  2. Search for Jenkins

  3. Add the repository to Helm so that you can easily download and deploy the chart

helm repo add jenkins https://charts.jenkins.io
  4. Update helm repo
helm repo update
  5. Install the chart
helm install [RELEASE_NAME] jenkins/jenkins --kubeconfig [kubeconfig file]

You should see an output like this

NAME: jenkins
LAST DEPLOYED: Sun Aug  1 12:38:53 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
NOTES:
1. Get your 'admin' user password by running:
  kubectl exec --namespace default -it svc/jenkins -c jenkins -- /bin/cat /run/secrets/chart-admin-password && echo
2. Get the Jenkins URL to visit by running these commands in the same shell:
  echo http://127.0.0.1:8080
  kubectl --namespace default port-forward svc/jenkins 8080:8080
3. Login with the password from step 1 and the username: admin
4. Configure security realm and authorization strategy
5. Use Jenkins Configuration as Code by specifying configScripts in your values.yaml file, see documentation: http:///configuration-as-code and examples: https://github.com/jenkinsci/configuration-as-code-plugin/tree/master/demos

For more information on running Jenkins on Kubernetes, visit:
https://cloud.google.com/solutions/jenkins-on-container-engine

For more information about Jenkins Configuration as Code, visit:
https://jenkins.io/projects/jcasc/
NOTE: Consider using a custom image with pre-installed plugins
  6. Check the Helm deployment
helm ls --kubeconfig [kubeconfig file]

Output:

NAME    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
jenkins default         1               2021-08-01 12:38:53.429471 +0100 BST    deployed        jenkins-3.5.9   2.289.3
  7. Check the pods
kubectl get pods --kubeconfig [kubeconfig file]

Output:

NAME        READY   STATUS    RESTARTS   AGE
jenkins-0   2/2     Running   0          6m14s
  8. Describe the running pod (review the output and try to understand what you see)
kubectl describe pod jenkins-0 --kubeconfig [kubeconfig file]
  9. Check the logs of the running pod
kubectl logs jenkins-0 --kubeconfig [kubeconfig file]

You will notice an error in the output:

error: a container name must be specified for pod jenkins-0, choose one of: [jenkins config-reload] or one of the init containers: [init]

This is because the pod has a sidecar container alongside the Jenkins container. As you can see from the error output, there is a list of containers inside the pod [jenkins config-reload], i.e. the jenkins and config-reload containers. The job of config-reload is mainly to help Jenkins reload its configuration without recreating the pod.

Therefore we need to let kubectl know which container we are interested in so we can see its logs. Hence, the command is updated like this:

kubectl logs jenkins-0 -c jenkins --kubeconfig [kubeconfig file]
  10. Now let's avoid passing the [kubeconfig file] every time. kubectl expects to find the default kubeconfig file in the location ~/.kube/config. But what if you already have another cluster using that same file? It doesn't make sense to overwrite it. What you will do is merge all the kubeconfig files using a kubectl plugin called [konfig](https://github.com/corneliusweig/konfig) and select whichever one you need to be active.

  11. Install krew, a package manager for kubectl that enables you to install plugins to extend kubectl's functionality. Read more about it [here](https://github.com/kubernetes-sigs/krew)

  12. Install the [konfig plugin](https://github.com/corneliusweig/konfig)

  • kubectl krew install konfig
  13. Import the kubeconfig into the default kubeconfig file. Make sure to accept the prompt to override.
  • sudo kubectl konfig import --save [kubeconfig file]
  14. Show all the contexts — meaning all the clusters configured in your kubeconfig. If you have more than one Kubernetes cluster configured, you will see them all in the output.
  • kubectl config get-contexts
  15. Set the current context to use for all kubectl and helm commands
  • kubectl config use-context [name of EKS cluster]
  16. Test that it is working without specifying the --kubeconfig flag
  • kubectl get po
Output:

NAME        READY   STATUS    RESTARTS   AGE
jenkins-0   2/2     Running   0          84m
  17. Display the current context. This will let you know which context you are using to interact with Kubernetes.
  • kubectl config current-context
  18. Now that we can use kubectl without the --kubeconfig flag, let's get access to the Jenkins UI. (In later projects we will further configure Jenkins. For now, the goal is to set up all the tools we need.)

  19. Some commands were printed to the screen when Jenkins was installed with Helm (see the output in step 5 above). Get the password for the admin user:

  • kubectl exec --namespace default -it svc/jenkins -c jenkins -- /bin/cat /run/secrets/chart-admin-password && echo
  20. Use port forwarding to access the Jenkins UI
  • kubectl --namespace default port-forward svc/jenkins 8080:8080
  21. Go to localhost:8080 in your browser and authenticate with the username admin and the password retrieved in step 19 above

Now set up the following tools using Helm

This section will be quite challenging for you because you will need to spend some time researching the charts, reading their documentation, and understanding how to get an application running in your cluster by simply running a helm install command.

  1. Artifactory

  2. Hashicorp Vault

  3. Prometheus

  4. Grafana

  5. Elasticsearch ELK using ECK

Successfully installing all five of these tools is a great experience to have.
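As a starting point, the pattern for each of these tools is the same as with Jenkins: find the chart on Artifact Hub, add its repository, and run helm install. For example, Prometheus is commonly installed from the prometheus-community repository (verify the repository URL and chart name on Artifact Hub before relying on them):

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/prometheus

Each of the other projects (Artifactory, Vault, Grafana, and ECK for Elasticsearch) publishes its own Helm repository, which you will find linked from its page on Artifact Hub.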