Simplifying Continuous Deployment: Exploring Popular CD Tools and Practical Applications

Ansible, Chef, Puppet, Terraform, cloud vendor templates, Pulumi: what works for you?

With the advent of cloud computing and DevOps CI/CD practices, the whole ecosystem of application deployment has changed. Deployment now involves clearly identifiable tasks: packaging the code (building the binaries), managing environment configurations (both infrastructure and application), building the infrastructure, building application images, and finally deploying to clusters.

Infrastructure has moved from carefully curated servers to easily replaceable VMs and serverless compute. The deployment process has shifted with it, from highly customized deployment instructions to repeatable, automated deployment tools and code (yes, I am referring to IaC, infrastructure as code).

There are a lot of mature products in this space, and each of them is capable of managing the end-to-end deployment process on its own. The problem is figuring out what works best for your use case and your application. You will find plenty of material comparing Ansible with Chef and Puppet, or Terraform with ARM templates and Pulumi, to name a few. On top of that, each organization, project, and even architect has an affinity for particular products. The more I read, the more confused I became about what to pick for a given use case. This blog tries to simplify the choice: given a scenario, which set of tools is best placed to solve your problem?

Questions to guide the decision-making

When we analyze the whole build-and-deploy space, we are really trying to answer the following questions:

How do I package the application so that it doesn’t need to be rebuilt when the environment configuration or infrastructure changes?
How do I provision the platform, VMs, or cloud services so that they can be replicated in each environment?
How do I decouple application deployment from infrastructure provisioning, so that each has its own lifecycle and neither depends on the other?

Tool categories to address the questions

Let’s check what deployment tools we have.

Category	Description	Examples
Provisioners	Products and tools responsible for creating infrastructure from scratch.	Tools like Terraform and AWS CloudFormation.
Configurers or App Managers	Products that help you manage the infrastructure once it is created and that support application configuration.	Tools like Ansible, Puppet, and Chef.
Packagers	Language-specific tools that bundle code into a manageable deployment unit. Containerization solutions fall into this category.	Tools like Docker, Kubernetes, and language-specific package managers.

Depending on the nature of your application, a combination of tools from the categories above should come together to cover your deployment automation needs. When you have to create new infrastructure or a platform for the first time, or replicate it across all environments, use something from the provisioners. Lean towards the configurers and app managers when you want the codebase deployed with a few configuration values tweaked per environment, or when the application is deployed incrementally. When you want to package the code in such a way that it is agnostic to the underlying hardware, pick something from the packagers.

CD Pipeline: Tool Suitability and Usage

Ideally, the pipeline should have the following steps.

This article focuses on the provisioning and configuration steps, because the boundaries between them are blurred and the available options largely overlap, which makes it hard to choose a practical approach. For brevity, the other steps are not covered in detail.

The “provisioners” are best suited to creating repeatable infrastructure. They are mostly declarative, i.e. you only describe the desired state and leave it to the tool to figure out how to attain it. The key point is that we want to provision the infrastructure in an immutable way. This means that when something needs to change, we do not modify the running infrastructure by hand; we change the declared state and the tool figures out what must be done to reach it. This keeps IaC code simple, because the code does not have to handle changes based on the current state. For example:

Here is a simple Terraform script that manages a Spark cluster (GCP Dataproc). The same script can be edited to change the base image without worrying about the current state:

provider "google" {
  project = "<projectId>"
  region  = "<Var1>"
}

resource "google_dataproc_cluster" "<Var2>" {
  name       = "<Var2>"
  region     = "<Var3>"

  cluster_config {
    master_config {
      num_instances = 1
      machine_type  = "n1-standard-4"
    }

    # Worker node configuration
    worker_config {
      num_instances = 2
      machine_type  = "n1-standard-4"
    }

    software_config {
      image_version = "<configure image name>"
    }
  }
}

Also notice that these are fairly standard tasks: creating databases and Databricks clusters, setting up networking, creating Kubernetes clusters, deploying an image on a VM. The nuances are mostly driven by the cloud provider. The application team does little more than put the right values into the creation templates. It makes sense to let the IaC tool manage the state and determine how to reach the desired state. Tools like Terraform and Pulumi, or public cloud ones like ARM templates and CloudFormation, work well here.
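To make the declarative idea concrete, here is a minimal Python sketch of what a provisioner does conceptually: compare the declared state with the current state and derive the actions needed. This is not how Terraform is implemented. The `plan` function and the attribute names are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def plan(current: Dict[str, Any], desired: Dict[str, Any]) -> List[str]:
    """Return the actions needed to move `current` to `desired`."""
    actions: List[str] = []
    for key in sorted(desired):
        if key not in current:
            actions.append(f"add {key}={desired[key]}")
        elif current[key] != desired[key]:
            actions.append(f"change {key}: {current[key]} -> {desired[key]}")
    for key in sorted(current):
        if key not in desired:
            actions.append(f"remove {key}")
    return actions


if __name__ == "__main__":
    # Hypothetical cluster attributes, loosely mirroring the Terraform script above.
    current = {"worker_instances": 2, "image_version": "image-a"}
    desired = {"worker_instances": 2, "image_version": "image-b"}
    for action in plan(current, desired):
        print(action)
```

The author of the desired state never writes the "change" logic. Editing the image version in the declaration is enough, which is why the Terraform script above stays the same shape regardless of what is currently deployed.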

Now let's discuss the next step: infrastructure configuration. This means customizing the provisioned infrastructure to your needs, on top of supplying custom values in the provisioning templates. Examples include installing certain dependencies on the VMs, opening only certain ports, setting up keys and certificates, and pushing init scripts for cluster startup. Sometimes this can be achieved by building it into the container image itself, though that approach should be reserved mostly for application configuration.

This step is very specific to the organization and the project. Some projects use the same cluster provisioning with different init scripts. The step also undergoes a lot of updates and incremental development, and it involves a lot of scheduled patching work, such as changing the security scan script. The goal here is to maintain the desired state of the platform, given that the base state is known, hence the name “Maintainers”. You need to say exactly how things are to be done, for example: fetch a password from a vault and install it in a certain directory, or connect to a Nexus location and copy the initial dependencies. This needs scripting support.

The tools that best support this step are Ansible, Puppet, Chef, or even PowerShell scripts. We won't go into the nitty-gritty of how to pick among them, such as which one is agentless, or which uses a DSL, YAML, or a simple extension of shell scripts. The point is to pick a tool that lets you say exactly how something needs to be done and gives you a lot of control over the code. If we pick one of the provisioners here, we will end up writing a lot of complex scripts, or we will try to invoke Puppet modules or shell scripts from the provisioner (e.g. Terraform). We should avoid such interlinking and let the CI pipeline (GitLab or Jenkins) manage it.

This does create one problem: how to take the output state of the provisioners and feed it as input to the configuration scripts in an automated way (maybe a topic for another blog). Here is an example of a Puppet script that configures init script execution for a VM. The same can be done from Terraform, but handling the whole flow of downloading the script from Nexus and running it is easier here.


# Init script path
$init_script_url = '<path>/<initVm>.sh'

# Download the initialization script to a temporary location
exec { 'download_init_script':
  command => "/usr/bin/wget -O /tmp/initVm.sh ${init_script_url}",
  path    => '/usr/bin',
  creates => '/tmp/initVm.sh',
}

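# Make the downloaded script executable before it is run
file { '/tmp/initVm.sh':
  ensure  => file,
  mode    => '0755',
  require => Exec['download_init_script'],
}

# Execute the initialization script (this runs on every Puppet run unless guarded)
exec { 'execute_init_script':
  command => '/tmp/initVm.sh',
  path    => '/bin:/usr/bin:/usr/local/bin',
  require => File['/tmp/initVm.sh'],
}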
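One possible way to pass the provisioner's output to the configuration step is to have the CI pipeline convert it into a host file. The sketch below assumes the provisioner is Terraform and that its outputs were exported with `terraform output -json`. It also assumes a hypothetical output named `vm_ips` that maps VM names to IP addresses. The group name `app_vms` is likewise made up. It produces an Ansible-style INI inventory as a string.

```python
import json
from typing import Dict


def build_inventory(tf_output_json: str, output_name: str = "vm_ips",
                    group: str = "app_vms") -> str:
    """Turn `terraform output -json` text into an INI-style inventory."""
    outputs = json.loads(tf_output_json)
    # Each Terraform output is wrapped as {"value": ..., "type": ..., "sensitive": ...}.
    hosts: Dict[str, str] = outputs[output_name]["value"]
    lines = [f"[{group}]"]
    for name in sorted(hosts):
        lines.append(f"{name} ansible_host={hosts[name]}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample data standing in for a real `terraform output -json` result.
    sample = json.dumps({
        "vm_ips": {
            "sensitive": False,
            "type": ["map", "string"],
            "value": {"app-vm-1": "10.0.0.11", "app-vm-2": "10.0.0.12"},
        }
    })
    print(build_inventory(sample), end="")
```

In a pipeline, a stage after the provisioning stage could write this string to a file and hand it to the configuration tool, so the provisioner and the configurer stay decoupled and only the CI pipeline knows about both.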

After bringing the infrastructure to a ready state, let's deploy and configure the application. From a deployment perspective, our focus should be on managing environment-related properties. Other types of configuration should be managed in the build phase or the containerization phase. Examples of environment-related properties are notification email settings, timeout settings, and the DB URL. All of these should come from cloud vaults, secret managers, or ConfigMaps. Solutions like Helm charts apply at this stage. There is not much difference between the application and infrastructure configuration processes; they differ only in the level at which they apply. Hence the same set of processes and IaC tools (Ansible, Puppet, Chef, PowerShell scripts, cloud-specific configuration managers) works best here. I am not picking a specific example for this step.
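As a rough illustration of environment-related properties from the application's side, the sketch below assumes the deployment step (a ConfigMap, a secret manager integration, or similar) injects them as environment variables. The variable names `DB_URL`, `NOTIFY_EMAIL`, and `REQUEST_TIMEOUT_SECONDS`, and the 30-second default, are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppSettings:
    db_url: str
    notify_email: str
    timeout_seconds: int


def load_settings(env: Mapping[str, str] = os.environ) -> AppSettings:
    """Read environment-specific settings; fail fast if a required one is missing."""
    missing = [k for k in ("DB_URL", "NOTIFY_EMAIL") if k not in env]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return AppSettings(
        db_url=env["DB_URL"],
        notify_email=env["NOTIFY_EMAIL"],
        # The timeout is optional and falls back to 30 seconds.
        timeout_seconds=int(env.get("REQUEST_TIMEOUT_SECONDS", "30")),
    )
```

Because the values arrive from outside, the same package can be deployed to every environment without a rebuild, which is the first question this article set out to answer.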

For more clarity, let's go through some practical scenarios:

We have to deploy a Java application on a cloud VM. The VM must use the organization-approved OS image, and it needs a few utilities and a certificate installed.
- Use provisioner IaC code to create the VM from the image, then use a configurer to install the utilities and deploy the certificate. To pass the VM name created by the provisioner to the configurer, use a host file or infra file.
You have to provision a Kubernetes cluster and create a namespace. A Helm chart is provided to you. You need to configure the pods and the gateway, then deploy the Helm chart. A few secrets need to be pushed to the cloud vault.
- Again, create the cluster and namespace and configure the initializer script using provisioners, and use Puppet or Ansible for the remaining tasks.
You need to create a Dataproc, Databricks, or Spark cluster, predefine some dependencies, and then deploy your Spark jobs.
- Provision the cluster with a provisioner, then use scripts that call the cloud service's REST APIs for the remaining steps.
Database or SQL warehouse deployments with DDL scripts and user permission settings.
- Here the demarcation is clear. Treat DDL scripts as code and deploy them separately. Only the database configuration should be managed with Terraform or a similar tool.
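For the last scenario, treating DDL scripts as code usually means applying versioned scripts in order and recording which ones have already run. The sketch below illustrates that idea with Python's built-in `sqlite3` module as a stand-in for the real database. The `schema_migrations` table name and the script names are hypothetical, and dedicated migration tools handle far more than this.

```python
import sqlite3
from typing import Dict, List


def apply_migrations(conn: sqlite3.Connection, scripts: Dict[str, str]) -> List[str]:
    """Apply DDL scripts not yet recorded, in name order; return those applied."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)")
    done = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    applied: List[str] = []
    for name in sorted(scripts):
        if name in done:
            continue
        conn.executescript(scripts[name])
        # Record the script so that a later run skips it.
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    scripts = {
        "001_create_orders.sql": "CREATE TABLE orders (id INTEGER PRIMARY KEY);",
        "002_add_status.sql": "ALTER TABLE orders ADD COLUMN status TEXT;",
    }
    print(apply_migrations(conn, scripts))  # first run applies both scripts
    print(apply_migrations(conn, scripts))  # second run applies nothing
```

Run from its own pipeline stage, a runner like this gives the schema its own lifecycle, independent of the Terraform code that creates the database instance.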

Concluding Remarks

It's fair to say that with the gradual shift from heavily curated hardware to easily replaceable infrastructure, and with different kinds of application packages (from JAR and ZIP files to serverless modules), we need a combination of deployment tools working in tandem. It is not going to be a single tool but a combination, with each tool aligned to a step in the deployment (infrastructure deployment, application deployment, environment configuration, and so on). There is very little payoff in dwelling on detailed comparisons among the tools that address each step. You will be fine with whatever your organization mandates in each category.
