iep/iep-003
R. Tyler Croy af75459204
Flesh out the remainder of the first draft of IEP-3
2016-11-17 17:00:35 -08:00

README.adoc

<html lang="en"> <head> </head>

IEP-3: Terraform for describing infrastructure as code

Table 1. Metadata

IEP        3
Title      Terraform for describing infrastructure as code
Author     R. Tyler Croy
Status     💬 In-process
Type       Process
Created    2016-11-16

Abstract

The migration to Azure means all infrastructure is (technically) an API call away. More of our infrastructure can therefore be managed via automation, whereas previously some of it was managed via support tickets. In order to develop, modify, and deploy Azure-based infrastructure, the Jenkins project infrastructure should define all infrastructure services and resources programmatically.

Specification

Terraform is a tool which provides Azure-specific abstractions [1] for defining Azure resources programmatically, in addition to support for persisting the state of which resources have or have not yet been created.

This document is not intended to explain all the features of Terraform, but rather describe how it should be used within the Jenkins project infrastructure.

Defining Infrastructure

As a proof-of-concept, an existing Azure Storage account was modeled [2] with Terraform, e.g.:

releases-storage.tf
resource "azurerm_resource_group" "releases" {
    name     = "${var.prefix}jenkinsinfra-releases"
    location = "East US 2"
}
resource "azurerm_storage_account" "releases" {
    name                = "${var.prefix}jenkinsreleases"
    resource_group_name = "${azurerm_resource_group.releases.name}"
    location            = "East US 2"
    account_type        = "Standard_GRS"
}
resource "azurerm_storage_container" "war" {
    name                  = "war"
    resource_group_name   = "${azurerm_resource_group.releases.name}"
    storage_account_name  = "${azurerm_storage_account.releases.name}"
    container_access_type = "container"
}

Logical clusters or groups of resources should be defined in this manner, using a single .tf file in the plans/ directory in the repository [3]. This makes the Terraform plan corresponding to a specific part of the infrastructure easy to find, easier to review, and easy to test in isolation from the other resources.

All identifiers must be prefixed.

Azure contains a number of global identifier namespaces which can cause conflicts between two different contributors, or two different environments, when defining infrastructure. For example, if DevA defines a resource group named "jenkins", DevB cannot also define a resource group named "jenkins". Some identifiers are subscription specific, but in order to avoid potential conflicts, all identifiers in Terraform resources must use the prefix variable:

example.tf
resource "azurerm_resource_group" "example" {
    name     = "${var.prefix}jenkinsinfra-examples" # (1)
    location = "East US 2"
}
  1. Referencing var.prefix pulls in a prefix defined per environment or per developer
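
The example above relies on a prefix variable being declared somewhere in the plans. A minimal sketch of such a declaration follows; the file name variables.tf and the description text are assumptions for illustration, not part of the original proposal:

```hcl
# variables.tf (hypothetical file name)
# Declares the prefix interpolated into every resource identifier.
# No default is set in this sketch, so Terraform asks for a value
# unless one is supplied on the command line or in a variables file.
variable "prefix" {
    description = "Environment- or developer-specific prefix for all identifiers"
}
```

A contributor could then supply a personal value with the -var flag, for example `terraform plan -var 'prefix=devalice' plans`, where devalice is only a placeholder.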

Storing State

Terraform generates state which allows the tool to act in an idempotent fashion. That is to say, without a .tfstate file of some form, Terraform may create redundant infrastructure in Azure.

This state is an important part of what makes Terraform a useful tool for the Jenkins project (see Rationale). It also contains access keys and other semi-confidential information which would not be safe to check into the repository [3].

To reap the benefits of Terraform state without needing a local filesystem or a .tfstate file checked into the repository [3], the proposal is to store production Terraform state in Azure blob storage [4] using Terraform's built-in Remote State functionality.

For production state, this would be configured as follows:

prodstate.tf
data "terraform_remote_state" "prod_tfstate" {
  backend = "azure"
  config {
    # Azure storage account names allow only lowercase letters and digits
    storage_account_name = "jenkinsinfratfstate"
    container_name       = "production"
    key                  = "terraform.tfstate"
  }
}

Requirements

Storing Terraform state in Azure blob storage imposes two requirements on the infrastructure code:

  1. A separate "bootstrap" set of Terraform plans must exist to define the storage containers necessary to store/access production state.

  2. terraform apply statements must be run in a consistent fashion, in order to ensure that the appropriate production state is being referenced during the execution of the Terraform plans.

The first requirement is readily addressed with a separate directory structure and some tooling in the repository [3].
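
As a rough sketch of what such a bootstrap plan might contain, the following reuses the resource types from the proof-of-concept above. The file name, the resource labels, and the resource names are hypothetical. The storage account name is kept short and unhyphenated because Azure restricts storage account names to 3-24 lowercase letters and digits:

```hcl
# bootstrap/tfstate-storage.tf (hypothetical path)
variable "prefix" {
    description = "Environment- or developer-specific prefix for all identifiers"
}

# Resource group holding only the state storage
resource "azurerm_resource_group" "tfstate" {
    name     = "${var.prefix}jenkinsinfra-tfstate"
    location = "East US 2"
}

# Storage account which will hold the remote state blobs
resource "azurerm_storage_account" "tfstate" {
    name                = "${var.prefix}tfstate"
    resource_group_name = "${azurerm_resource_group.tfstate.name}"
    location            = "East US 2"
    account_type        = "Standard_GRS"
}

# Private container: state contains access keys and must not be public
resource "azurerm_storage_container" "production" {
    name                  = "production"
    resource_group_name   = "${azurerm_resource_group.tfstate.name}"
    storage_account_name  = "${azurerm_storage_account.tfstate.name}"
    container_access_type = "private"
}
```

Because these plans create the storage that remote state depends on, their own state cannot live in that storage until after the first apply. How that initial state is handled is left open here.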

The second requirement is addressed with the use of a "proper" Jenkins-based delivery pipeline for the Terraform plans. This would entail a Jenkins environment which has the appropriate credentials for provisioning infrastructure in the production Azure environment, e.g.:

node('azurerm') {
    checkout scm

    stage('Validate') {
        sh 'terraform validate plans/*.tf'
    }
    stage('Plan') {
        sh 'terraform plan plans'
    }
    stage('Apply') {
        input 'Do the plans look good?'
        sh 'terraform apply plans'
    }
}

This approach provides a single point of deployment for Terraform plans which can then be inspected or otherwise interacted with by the entirety of the Jenkins project infrastructure team, instead of relying on individual contributors' laptops.

Motivation

By using Terraform to describe all infrastructure, there will be no "hidden" infrastructure known only to a select few. This contrasts with the current situation, where one or two people might be aware of where certain resources are located or how they relate to others.

Defining all infrastructure resources in Terraform also lowers the bar for new infrastructure contributions, not only by making the actual infrastructure topologies open source, but also by allowing practically anybody to provision infrastructure which resembles Jenkins project infrastructure. Currently there is no way to provision a "dev version of the Jenkins project infrastructure"; this would become feasible with Terraform plans describing the project's infrastructure.

Rationale

The benefits of describing infrastructure as code should be self-evident. Before considering the rationale for choosing Terraform, first consider some other options:

Azure Resource Manager (ARM) templates

ARM templates [5] are conceptually interchangeable with AWS CloudFormation templates: both are templates defined in JSON to describe cloud resources.

Pros

  • Supported for practically all resources on Azure

  • Relatively simple to use for the basic use-cases

Cons

  • No state, and therefore not idempotent

  • Entirely foreign to many within the operations community.

  • Would require an external source of template parameters in order to function. ARM supports template parameters, but using parameterized templates to model multiple environments (local-dev vs. production) would require the Jenkins project infrastructure automation to define those parameters outside the ARM template.

Puppet-defined resources

Pros

  • Jenkins infrastructure already has large amounts of Puppet code implemented, and a well-defined workflow for modifying, testing, and deploying Puppet code.

  • Puppet's graph approach supports idempotency

Cons

  • Puppet must have an "execution context" which for most Puppet catalogues means a node (machine) which the catalogue is being executed against. In order to provision Azure resources a "deployment" node would need to exist whose sole job would be to provision Azure resources from Puppet. Basically, one cannot run Puppet on "Azure" to provision an Azure Load Balancer (for example).

  • The puppetlabs-azure module is not widely used, which means the tooling will lag behind "native" (i.e. supported by Microsoft) toolchains such as ARM templates [5].

Terraform

👍

Terraform is a reasonably popular and well understood tool, which enjoys contributions from Microsoft for its Azure support.

Pros

  • Stateful, and therefore supports idempotent operations

  • Widely used in the "modern operations" community. While the specific resources might not be familiar to newcomers, the tool itself would be.

  • Variable substitution and separation of state files allows development clusters to be created entirely separate from production while still resembling production infrastructure.

Cons

  • "Yet another DSL" to learn in order to effectively contribute to the Jenkins project infrastructure

  • Does not support all resources defined by Azure, which might dictate the use of the azurerm_template_deployment resource in Terraform, and would still mean writing ARM templates [5].
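
To illustrate the fallback named in the last item, here is a minimal sketch of wrapping an ARM template in a Terraform resource. The template path arm/example.json, the resource labels, and the resource names are hypothetical, and the ARM template itself would still need to be written by hand:

```hcl
variable "prefix" {
    description = "Environment- or developer-specific prefix for all identifiers"
}

resource "azurerm_resource_group" "example" {
    name     = "${var.prefix}jenkinsinfra-examples"
    location = "East US 2"
}

# Deploys a hand-written ARM template for a resource type which
# Terraform does not model natively.
resource "azurerm_template_deployment" "example" {
    name                = "${var.prefix}example-deployment"
    resource_group_name = "${azurerm_resource_group.example.name}"
    deployment_mode     = "Incremental"

    # Hypothetical path to the ARM template JSON in the repository
    template_body       = "${file("arm/example.json")}"
}
```

In this arrangement Terraform tracks the deployment as a whole rather than the individual resources inside the template, so changes made within the template are less visible in terraform plan output.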

Costs

There are no additional financial costs associated with using Terraform. There is a learning curve associated with Terraform, but it's safe to assume that there's a learning curve with all things Azure for the infrastructure contributors at this point in time.

Reference implementation

This plan is a reference implementation of the only Azure resources provisioned to date.

</html>