Basic guide to creating resources in AWS with Terraform

Bryan Arellano
13 min read · Dec 22, 2021

When I started working with AWS, I immediately used the Serverless Framework to deploy resources following the principle of Infrastructure as Code. I had heard about Terraform, but I never had the chance to work with it until a few months ago, when I had to learn it for a new project. Here are some tips that I picked up along the way.

First, what is infrastructure as code?

Well, infrastructure as code is the practice of managing and provisioning infrastructure through code instead of manual processes. You don’t administer your servers and resources by hand to deploy your application; instead, you choose a provider like AWS, Azure, or Google Cloud and create your resources with tools like Serverless or Terraform, where each resource is defined as code.

In my case, I have used both the Serverless Framework and Terraform, and I want to talk about those tools.

First, the Serverless Framework gives us an abstraction layer over AWS services in general and AWS Lambda in particular. Through its YAML configuration file, we can describe the deployment to be carried out, including the Lambda functions to create, their access permissions, and how they will interact with the rest of the AWS cloud services such as API Gateway, S3, CloudFront, Route 53, DynamoDB, etc. This framework was designed specifically to work with AWS and create resources on it.

On the other hand, Terraform is a framework that works with many providers: AWS, Azure, Google Cloud, Alibaba, etc. Instead of a single file, it uses many configuration files where you define each resource; literally every resource your application needs must be defined. You have to be careful with each one, and sometimes they are confusing and hard to learn.

I think the main difference between the two frameworks is the learning curve. Serverless is easy to pick up, because you work with a single file and a single provider (AWS). Terraform is harder to learn: it has many configurations and resources, there are different ways to define the same thing, and the documentation is sometimes poor.

How does Terraform work?

First, we need to know the main elements to work with it.

  • Providers: Think of providers as plugins that handle the connection with your cloud provider; you supply the necessary credentials, and Terraform can create resources through that connection.
  • Data: Data sources allow data to be fetched or computed for use elsewhere in the Terraform configuration (see the sketch after this list).
  • Resources: Define the elements to be created in your cloud provider, for example S3 buckets, Lambda functions, VPCs, CloudFront distributions, etc. You can give each resource any configuration it needs (RAM, CPU, timeouts, etc.), and you can use the information of one resource in another; for example, you can use the attributes of your bucket to create a CloudFront distribution.
  • Modules: Imagine you have a lot of resources in a single file. With modules you can group resources that are similar or related, and you can define inputs and outputs for each module (I’m going to talk about this later).
  • Variables: Inputs of information, with or without a default value. Think of variables as the parameters of a function: some are required, and some have defaults.
  • Outputs: Information generated by resources or modules that can be used in other modules. Think of outputs as the values a function returns once it has been executed.
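To make these elements concrete, here is a minimal sketch that combines a data source, a variable, a resource, and an output (the zone and bucket names are hypothetical):

# Data: look up an existing hosted zone instead of creating it.
data "aws_route53_zone" "main" {
  name = "test.com"
}

# Variable: an input with a default value.
variable "bucket_name" {
  default = "my-example-bucket"
}

# Resource: an S3 bucket managed by Terraform.
resource "aws_s3_bucket" "example" {
  bucket = var.bucket_name
}

# Output: expose generated information for use elsewhere.
output "zone_id" {
  value = data.aws_route53_zone.main.zone_id
}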

Those are the main elements you work with in Terraform, but you may be wondering how Terraform stores the information about each resource it has created. Here is another difference between Terraform and the Serverless Framework: Serverless uses CloudFormation under the hood to create each resource in AWS, while Terraform keeps its own state file, terraform.tfstate. This file is a big JSON document that contains the information of every resource you have created.

{
  "version": 4,
  "terraform_version": "1.0.10",
  "serial": 114,
  "lineage": "f96263cc-8331-41ee-999d-e6f770e72aa8",
  "outputs": {},
  "resources": [
    {
      "mode": "managed",
      "type": "aws_route53_zone",
      "name": "route53",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "arn": "arn:aws:route53:::hostedzone/12345",
            "comment": "Managed by Terraform",
            "delegation_set_id": "",
            "force_destroy": null,
            "id": "12345",
            "name": "test.com",
            "name_servers": [],
            "tags": {},
            "tags_all": {},
            "vpc": [],
            "zone_id": "12345"
          },
          "sensitive_attributes": [],
          "private": "2lvbiI6IjAifQ=="
        }
      ]
    },
    {
      "module": "module.buckets",
      "mode": "managed",
      "type": "aws_s3_bucket_object",
      "name": "test_bucket_object",
      "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "acl": "private",
            "bucket": "test_bucket",
            "bucket_key_enabled": false,
            "cache_control": "",
            "content": null,
            "content_base64": null,
            "content_disposition": "",
            "content_encoding": "",
            "content_language": "",
            "content_type": "binary/octet-stream",
            "etag": "ef5897a2-4cd4-11ec-81d3-0242ac130003",
            "force_destroy": false,
            "id": "test_bucket/test.json",
            "key": "test_bucket/test.json",
            "kms_key_id": null,
            "metadata": {},
            "object_lock_legal_hold_status": "",
            "object_lock_mode": "",
            "object_lock_retain_until_date": "",
            "server_side_encryption": "",
            "source": "buckets/../../test/test.json",
            "source_hash": null,
            "storage_class": "STANDARD",
            "tags": {},
            "tags_all": {},
            "version_id": "",
            "website_redirect": ""
          },
          "sensitive_attributes": [],
          "private": "bA=="
        }
      ]
    }
  ]
}

In this example, you can see the information of two resources: an S3 object and a Route 53 hosted zone (with fake data). Whenever you add, modify, or remove a resource, the information is stored here. Now, you have two ways to store this file. The first is the easy way: keep it locally in your repository, which is a good choice while you are learning but a fatal decision if you are working on a real project. To solve this issue, you can store the file in an S3 bucket; you just need to add a backend configuration.

backend "s3" {
  bucket = "test-project"
  key    = "terraform.tfstate"
  region = "us-east-1"
}

In this code, you can see the bucket and the name of the file where I’m going to store the infrastructure state.
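If several people (or a CI pipeline) run Terraform against the same state, the s3 backend also supports locking through a DynamoDB table, which prevents two concurrent applies from corrupting the state. A sketch, assuming a table named terraform-locks that has a LockID string hash key:

backend "s3" {
  bucket         = "test-project"
  key            = "terraform.tfstate"
  region         = "us-east-1"
  dynamodb_table = "terraform-locks" # hypothetical table; its hash key must be LockID (string)
}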

Well, I explained the main elements of Terraform. Now, it’s time to explain how to implement Terraform in your project.

First, you need to install Terraform. The installation depends on your OS, so check the instructions in the official Terraform documentation.

Once Terraform is installed, you need to choose a place to put the files that define your infrastructure. I recommend the following structure.

infrastructure
|_ main.tf
|_ modules.tf
|_ outputs.tf
|_ provider.tf
|_ resources.tf
|_ variables.tf

In this structure, we have a main folder called infrastructure at the root of the project, and here we are going to define the entire infrastructure of the project.

Once you have created the main folder, you need to set up the configuration that Terraform executes. First, add your provider, in this case AWS. The provider information goes in the file provider.tf.

provider "aws" {
  region     = var.region
  access_key = var.aws_access_key_id
  secret_key = var.aws_secret_access_key
}

You need to set the region and the credentials to create the connection with your AWS account. Maybe you are wondering where the variables are. That information lives in variables.tf.

# ----------------------------------------------------------------------------------------------------------------------
# AWS CREDENTIALS
# ----------------------------------------------------------------------------------------------------------------------

variable "aws_access_key_id" {
  description = "AWS access key credential"
}

variable "aws_secret_access_key" {
  description = "AWS secret access key credential"
}

variable "region" {
  default = "us-east-1"
}

The variables file is where you define the inputs used across your resource definitions. There are two scenarios. The first is to define variables in your modules and use them in your resource definitions, for example the region of your S3 bucket. The second is to define variables in your root configuration to start Terraform. For example, you don’t want to commit your credentials to the repository just to let Terraform connect to AWS; if you define those variables in your root folder (in this case infrastructure), Terraform will request their values when you run a command to deploy your infrastructure. The region variable can take another value, but it is not required, because a default has already been defined.
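Besides exporting environment variables or typing values at the prompt, you can also keep non-sensitive defaults in a terraform.tfvars file, which Terraform loads automatically. A minimal sketch (the value is a placeholder):

# terraform.tfvars — loaded automatically; never put credentials in a committed file
region = "us-east-1"

You can also override a single value on the command line, for example terraform plan -var="region=us-west-2".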

Now we need to set the configuration to start Terraform; the file main.tf is the place to do that.

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }

  backend "s3" {
    bucket = "test-project"
    key    = "terraform.tfstate"
    region = "us-east-1"
  }
}

Here you can see the Terraform block with the AWS provider, and the configuration to store the state file in an S3 bucket. The bucket must be created manually: if you create it through Terraform and someone deletes it accidentally, you are going to be in trouble.

Now you have your initial configuration, and it’s time to run the commands that execute Terraform. First, place yourself in the infrastructure folder. Then set the values of the variables you defined at the root that have no default value, in this case the credentials; doing this once avoids typing each value every time you run a command. When you set the value of a variable through the environment, you must always use the TF_VAR prefix.

export TF_VAR_aws_secret_access_key=********
export TF_VAR_aws_access_key_id=********

If you don’t set the AWS credentials, Terraform will use your credentials allocated in your local configuration.

Now you need to initialize your “backend”, which means you are going to create or fetch your state in order to add, update, or remove resources.

terraform init -backend-config="access_key=$TF_VAR_aws_access_key_id" -backend-config="secret_key=$TF_VAR_aws_secret_access_key"

With this command, Terraform creates the state file the first time and, on subsequent runs, fetches the configuration from your bucket. If you didn’t define a bucket and just want to test locally, you don’t need the -backend-config arguments. When the initialization finishes, you may see a file called .terraform.lock.hcl; this is a dependency lock file, and in this case our dependency is the provider. It is similar to package-lock.json in Node.js; this file is important and you should include it in your repo. The second change you will see is a folder called .terraform, where the dependencies are placed, similar to node_modules; you can ignore this folder.
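A minimal .gitignore for this setup could look like the following (assuming you never keep local state in the repository):

# .gitignore
.terraform/       # downloaded providers and modules, like node_modules
*.tfstate         # never commit state, it can contain sensitive data
*.tfstate.backup
# note: .terraform.lock.hcl is NOT ignored, it belongs in the repo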

Each time you start working on the project, you must set your variables (if you defined variables at the root) and then run the init command; you do this when you open the project and begin your work, not before every single Terraform command. Here you can face a problem: if you get an error from AWS, it may be because the user credentials you provided don’t have permission to perform actions on your S3 bucket. This is a common problem when creating resources with Terraform. Sometimes Terraform reports the exact permission you are missing, but sometimes it is just an opaque error. You can set this variable to get the full information.

export TF_LOG=DEBUG

With this flag, we can see the full information of each request Terraform makes to AWS. That way, you don’t need to guess which permission you are missing.
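The debug output is very verbose, so it can also help to redirect it to a file with TF_LOG_PATH:

export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform-debug.log   # write the full trace to a file instead of the console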

Now it’s time to add resources to your infrastructure. A good recommendation is to define a folder for each type of resource; for example, I defined folders for buckets, lambda functions, logs, etc. Each folder represents a module, and each module is isolated, which means you can’t reference resources from other modules directly; for example, you can’t reference a resource from buckets inside functions. To achieve that, you need to define variables (inputs) and outputs (resources generated).

As I said, each module is isolated, and the modules file at the root is the place where all modules interact with each other.

module "buckets" {
  source      = "./buckets/"
  environment = local.environment
}

module "policy" {
  source      = "./policy"
  environment = local.environment
}

module "functions" {
  source                   = "./functions/"
  object_bucket_references = module.buckets.object_references
  lambdas_names            = var.lambdas_names
  environment              = local.environment
  lambdas_exec_roles_arn   = module.policy.lambdas_exec_roles_arn
}

In this example, the initial module is ‘buckets’: there I create a bucket and upload a zip file. Then the module ‘policy’ creates the policy to execute a lambda. Finally, the module ‘functions’ uses the outputs of the other modules to create its resources; in this case, ‘object_bucket_references’ and ‘lambdas_exec_roles_arn’ are the variables that carry the references to the zip file and the policy that the lambda function needs.

Variables from module functions

# ----------------------------------------------------------------------------------------------------------------------
# GENERAL CONFIGURATIONS
# ----------------------------------------------------------------------------------------------------------------------

variable "runtime" {
  default = "python3.6"
}

variable "environment" {}

# ----------------------------------------------------------------------------------------------------------------------
# LAMBDA INPUTS
# ----------------------------------------------------------------------------------------------------------------------

variable "lambdas_exec_roles_arn" {}
variable "object_bucket_references" {}
variable "lambdas_names" {}

Output from module buckets

output "object_references" {
  value = {
    "test_function" : {
      etag : aws_s3_bucket_object.test_function_object.etag,
      key : aws_s3_bucket_object.test_function_object.key,
      bucket : aws_s3_bucket_object.test_function_object.bucket
    }
  }
}

Output from module policy

output "lambdas_exec_roles_arn" {
  value = {
    "test_exec_role_arn" : aws_iam_role.test_lambda_exec_role.arn
  }
}

Definition of a resource in the module functions

resource "aws_lambda_function" "get_company_lambda_function" {
  role             = var.lambdas_exec_roles_arn.test_exec_role_arn
  handler          = "test_handler.handler"
  runtime          = var.runtime
  s3_bucket        = var.object_bucket_references.test_function.bucket
  s3_key           = var.object_bucket_references.test_function.key
  function_name    = "${var.environment}_${var.lambdas_names.test_lambda_function}"
  source_code_hash = base64sha256(var.object_bucket_references.test_function.etag)
}

You can define many types of outputs: strings, objects, numbers, etc. If a module has many outputs, the best choice is to define a single object output, so you can send multiple values to another module with just one variable.
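As a side note, you can add type constraints so Terraform validates what a module receives. A sketch matching the shapes used above (the exact types are my assumption, not from the original project):

variable "lambdas_names" {
  type = map(string)
}

variable "object_bucket_references" {
  # each entry carries the etag, key, and bucket of one uploaded object
  type = map(object({
    etag   = string
    key    = string
    bucket = string
  }))
}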

An important thing to always remember is that Terraform does not create resources strictly in the order you write them: it builds a dependency graph and creates independent resources in parallel. For example, if you define a bucket and then a file to be uploaded to that bucket, Terraform might try to upload the file before the bucket exists, or create both at the same time. That is a problem, but don’t worry: Terraform infers dependencies from references between resources, and you can also add the “depends_on” option, which lists the resources that must be created before this one.

resource "aws_s3_bucket" "bucket" {
  bucket = "${var.environment}-${var.bucket_name}"
  acl    = "private"
}

resource "aws_s3_bucket_object" "test_function_object" {
  bucket     = aws_s3_bucket.bucket.bucket
  key        = "${var.lambda_resource_name}/${var.environment}/test.zip"
  source     = "${path.module}/../../dist/test.zip"
  etag       = filemd5("${path.module}/../../dist/test.zip")
  depends_on = [aws_s3_bucket.bucket]
}

The same idea applies to modules: they are not created in the order you define them either. Terraform orders modules by the references between them; in the definition above, “functions” consumes outputs from “buckets” and “policy”, so those two modules are created first and “functions” last.
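If two modules have no output reference between them but still need an ordering, since Terraform 0.13 you can declare depends_on on the module block itself. A minimal sketch (the other module arguments are omitted):

module "functions" {
  source     = "./functions/"
  # wait for the whole policy module even without consuming its outputs
  depends_on = [module.policy]
}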

Now you can check your infrastructure with the following command.

terraform plan

With this command, Terraform analyzes your changes and shows the resources that are going to be created, updated, or deleted. The ‘plan’ command also checks your definitions for possible errors.

Terraform detected the following changes made outside of Terraform since the
last "terraform apply":

# module.functions.aws_lambda_function.test_lambda_function has been changed
  ~ resource "aws_lambda_function" "test_lambda_function" {
        id            = "test_sample_lambda_function"
      ~ last_modified = "2021-09-30T22:39:39.177+0000" -> "2021-10-01T22:23:35.495+0000"
      ~ runtime       = "python3.8" -> "python3.6"
        tags          = {}
        # (18 unchanged attributes hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.

Also, you can run terraform validate to validate the configuration files in a directory, referring only to the configuration and not accessing any remote services such as remote state, provider APIs, etc.

Finally, to deploy your resources to AWS, you need to run the following command:

terraform apply -auto-approve

Each time you apply your changes, Terraform asks you for confirmation, but you can skip that with the -auto-approve flag.
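If you prefer the confirmation to be based on a plan you have actually reviewed, an alternative is to save the plan to a file and apply exactly that file:

terraform plan -out=tfplan    # save the reviewed plan
terraform apply tfplan        # apply exactly the saved plan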

When you run the plan command, Terraform does not analyze the permissions required to deploy your infrastructure. For that reason, it is important to set the debug variable and inspect each request and its response, so you can adjust the permissions needed to create your resources.

Additional tips

I want to talk about something interesting: in Terraform you can define workspaces, or environments, which lets you avoid creating a folder per environment. To achieve that, you modify your backend and add a prefix used to generate the key of each environment.

backend "s3" {
  bucket               = "test-project"
  key                  = "terraform.tfstate"
  region               = "us-east-1"
  workspace_key_prefix = "env:"
}

Now you need to create your workspace; to do that, run the following command:

terraform workspace new env_name

After creating the new environment, you will see a key per environment in your bucket, each containing its own terraform.tfstate file.

terraform.tfstate
env:
|_ dev
   |_ terraform.tfstate
|_ stg
   |_ terraform.tfstate
|_ prod
   |_ terraform.tfstate

Additionally, Terraform creates a file at the root of the bucket by default, because if you don’t select a workspace before deploying your resources, Terraform needs a file for the default configuration. So it is important to always select a workspace. And how do you select a workspace? Easy, you just need to run the following command:

terraform workspace select env_name
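If you are not sure which workspace is active, you can list them; the current one is marked with an asterisk:

terraform workspace list   # list all workspaces, the active one is marked with *
terraform workspace show   # print only the current workspace name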

Sometimes you need to specify values per environment; for example, you don’t want to create a database with 1TB of storage and 32GB of RAM in every environment. To achieve that, you need the value of the current workspace, and it is advisable to define it at the root of your infrastructure.

# ----------------------------------------------------------------------------------------------------------------------
# LOCAL VARIABLES
# ----------------------------------------------------------------------------------------------------------------------

locals {
  environment   = terraform.workspace
  is_production = local.environment == "prod"
}

You may be wondering why this is defined in ‘locals’ instead of as a normal variable. The reason is that Terraform only allows literal values in variable defaults; expressions such as terraform.workspace or conditionals are not allowed there, so you need “locals”.

Now, in the variables of any module, I map the values according to the environment.

variable "environment" {}
variable "is_production" {}

locals {
  db_name              = var.is_production ? "productiondb" : "${var.environment}db"
  db_subnet_group_name = var.is_production ? "main_subnet" : "${var.environment}_subnet"
  db_instance = {
    "prod" = "db.t3.small"
    "demo" = "db.t3.micro"
  }
}

And finally, the resource uses these locals, so the specifications are set according to the environment when the resource is created.

resource "aws_db_instance" "kpinetwork_db" {
  identifier        = local.db_name
  allocated_storage = 5

  instance_class = local.db_instance[var.environment]
  engine         = "postgres"
  engine_version = "13.3"
  port           = "5432"

  db_subnet_group_name   = aws_db_subnet_group.kpinetwork.name
  vpc_security_group_ids = [var.db_security_group.id]

  skip_final_snapshot = true
}

You can see the code of the project here: https://github.com/ridouku/terraform-aws


Questions? Comments? Contact me at ridouku@gmail.com
