Terraformer - Infrastructure To Code

I would like to tell you about a new CLI tool that I wrote to solve an old problem.

Problem

Terraform has long been a standard in the DevOps/cloud/IT community. It is a very convenient and useful way to deal with infrastructure as code. There are a lot of goodies in Terraform, as well as a lot of forks, sharp knives and rakes to step on.
With Terraform it is very convenient to create new things and then manage, change or delete them. But what about those who already have a huge infrastructure in the cloud that was not created through Terraform? Rewriting and recreating the entire cloud is expensive and unsafe.
I ran into this problem at two jobs. The simplest example: you want everything to live in terraform files, but you have 250+ buckets, and writing them all out for terraform by hand is quite a lot.
There has been an issue about this in terraform since 2014; it was closed in 2016 with the hope that import would appear.

In general, everything is as in the usual infrastructure-as-code picture, only read from right to left: infrastructure to code.


Existing solutions

1. There is a ready-made and rather old solution for AWS: terraforming. When I tried to pull my 250+ buckets through it, I realized that things were bad there. AWS has long since added a lot of new options that terraforming knows nothing about, and in general its Ruby templates look skimpy. After a couple of pull requests to add more features there, I realized that such a solution does not fit at all.
Here is how terraforming works: it takes data from the AWS SDK and generates tf and tfstate files through templates.
There are 3 problems here:
1. There will always be a lag in updates
2. tf files sometimes come out broken
3. tfstate is generated separately from the tf files, and the two do not always agree
In general, it is difficult to get to a result where `terraform plan` says there are no changes

2. `terraform import` is a built-in terraform command. How does it work?
You write an empty TF file with the resource name and type, then run `terraform import` and pass it the resource ID. terraform calls the provider, fetches the data and produces a tfstate file.
There are 3 problems here:
1. We only get the tfstate file, while the tf file stays empty; you have to write it by hand or convert it from the tfstate
2. It can only work with one resource at a time and does not support all resources. And what am I supposed to do with 250+ buckets, again?
3. You need to know the IDs of the resources - that is, you have to wrap it in code that fetches the list of resources (see the sketch after this list)
In general, the result is partial and it does not scale well
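
To make problems 2 and 3 concrete, here is a minimal, hypothetical sketch of the kind of wrapper you end up writing, assuming aws-sdk-go and a working directory that already contains one empty `resource "aws_s3_bucket" "bucket_<i>" {}` block per bucket:

```go
package main

import (
	"fmt"
	"log"
	"os/exec"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func main() {
	// List every bucket in the account...
	svc := s3.New(session.Must(session.NewSession()))
	out, err := svc.ListBuckets(&s3.ListBucketsInput{})
	if err != nil {
		log.Fatal(err)
	}
	// ...and run one `terraform import` per bucket. Each import still
	// needs a matching empty resource block in a .tf file, and it only
	// fills in tfstate - the tf body stays empty.
	for i, b := range out.Buckets {
		addr := fmt.Sprintf("aws_s3_bucket.bucket_%d", i)
		res, err := exec.Command("terraform", "import", addr, *b.Name).CombinedOutput()
		if err != nil {
			log.Printf("%s: %v\n%s", addr, err, res)
		}
	}
}
```

Even with such a wrapper you still get no tf bodies, only state.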

My solution

Requirements:
1. The ability to create tf and tfstate files from existing resources. For example, download all buckets / security groups / load balancers so that `terraform plan` reports no changes
2. It has to support 2 clouds: GCP + AWS
3. A global solution that is easy to keep up to date, without spending 3 days of work on every single resource
4. Make it open source - everyone has this problem

The Go language - that is why I love it: it has a library for creating HCL files that is used in terraform itself, plus a lot of terraform code that can be reused.
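
Just to illustrate what "creating HCL files from Go" looks like, here is a minimal sketch using the modern hclwrite package (an assumption for illustration - the tool itself was written against the HCL libraries of the pre-0.12 era):

```go
package main

import (
	"fmt"

	"github.com/hashicorp/hcl/v2/hclwrite"
	"github.com/zclconf/go-cty/cty"
)

func main() {
	f := hclwrite.NewEmptyFile()

	// One resource block, e.g. a bucket we want to express as code.
	block := f.Body().AppendNewBlock("resource", []string{"aws_s3_bucket", "example"})
	block.Body().SetAttributeValue("bucket", cty.StringVal("my-imported-bucket"))
	block.Body().SetAttributeValue("acl", cty.StringVal("private"))

	// Prints a properly formatted terraform block.
	fmt.Printf("%s", f.Bytes())
}
```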

Path

First attempt
I started simple: call the cloud through the SDK for the desired resource and convert the response into fields for terraform. The attempt died immediately on the security group, because I did not enjoy spending 1.5 days converting the security group alone (and there are a lot of resource types). It takes too long, and then the fields can be changed or added over time.
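
A tiny, hypothetical sketch of that approach with aws-sdk-go shows the tedium: the snippet below maps only three scalar fields, with ingress/egress rules, tags and everything else still to be written by hand:

```go
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/ec2"
)

func main() {
	svc := ec2.New(session.Must(session.NewSession()))
	out, err := svc.DescribeSecurityGroups(&ec2.DescribeSecurityGroupsInput{})
	if err != nil {
		log.Fatal(err)
	}
	for _, sg := range out.SecurityGroups {
		// Every field has to be mapped by hand, one by one...
		fmt.Printf("resource \"aws_security_group\" %q {\n", *sg.GroupId)
		fmt.Printf("  name        = %q\n", *sg.GroupName)
		fmt.Printf("  description = %q\n", *sg.Description)
		fmt.Printf("}\n")
	}
}
```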

Second attempt
Based on the idea described here: just take tfstate and convert it to tf. All the data is there and the fields are the same. But how do you get a full tfstate for many resources? This is where the `terraform refresh` command came to the rescue: terraform takes all the resources in the tfstate, pulls fresh data for them by ID and writes everything back to the tfstate. So: create a stub tfstate containing only names and IDs, run `terraform refresh`, and we get the full tfstate. Hooray!
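
A minimal sketch of the trick, assuming the pre-0.12 "version 3" state layout and a working directory where the matching provider is already configured:

```go
package main

import (
	"encoding/json"
	"log"
	"os"
	"os/exec"
)

func main() {
	// A stub tfstate that knows only the resource address and ID
	// (roughly the pre-0.12 "version 3" layout).
	stub := map[string]interface{}{
		"version":           3,
		"terraform_version": "0.11.14",
		"serial":            1,
		"modules": []interface{}{map[string]interface{}{
			"path":    []string{"root"},
			"outputs": map[string]interface{}{},
			"resources": map[string]interface{}{
				"aws_s3_bucket.example": map[string]interface{}{
					"type": "aws_s3_bucket",
					"primary": map[string]interface{}{
						"id":         "my-bucket",
						"attributes": map[string]string{"id": "my-bucket"},
					},
				},
			},
		}},
	}

	data, _ := json.MarshalIndent(stub, "", "  ")
	if err := os.WriteFile("terraform.tfstate", data, 0o644); err != nil {
		log.Fatal(err)
	}

	// `terraform refresh` now asks the provider for every attribute
	// of every resource listed in the state and writes them all back.
	out, err := exec.Command("terraform", "refresh").CombinedOutput()
	log.Printf("%s", out)
	if err != nil {
		log.Fatal(err)
	}
}
```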
Now for some recursive ugliness: writing a converter from tfstate to tf. For those who have never read a tfstate: it is JSON, but a special kind.
Here is its important part, the attributes map:

 "attributes": {
                            "id": "default/backend-logging-load-deployment",
                            "metadata.#": "1",
                            "metadata.0.annotations.%": "0",
                            "metadata.0.generate_name": "",
                            "metadata.0.generation": "24",
                            "metadata.0.labels.%": "1",
                            "metadata.0.labels.app": "backend-logging",
                            "metadata.0.name": "backend-logging-load-deployment",
                            "metadata.0.namespace": "default",
                            "metadata.0.resource_version": "109317427",
                            "metadata.0.self_link": "/apis/apps/v1/namespaces/default/deployments/backend-logging-load-deployment",
                            "metadata.0.uid": "300ecda1-4138-11e9-9d5d-42010a8400b5",
                            "spec.#": "1",
                            "spec.0.min_ready_seconds": "0",
                            "spec.0.paused": "false",
                            "spec.0.progress_deadline_seconds": "600",
                            "spec.0.replicas": "1",
                            "spec.0.revision_history_limit": "10",
                            "spec.0.selector.#": "1",

There is:
1. id - a string
2. metadata - a list of size 1 containing an object with the fields described below
3. spec - again a list of size 1 with keys and values inside (the `#` suffix stores list lengths, the `%` suffix stores map lengths)
In short, a fun format, and everything can be nested several levels deep too:

                   "spec.#": "1",
                            "spec.0.min_ready_seconds": "0",
                            "spec.0.paused": "false",
                            "spec.0.progress_deadline_seconds": "600",
                            "spec.0.replicas": "1",
                            "spec.0.revision_history_limit": "10",
                            "spec.0.selector.#": "1",
                            "spec.0.selector.0.match_expressions.#": "0",
                            "spec.0.selector.0.match_labels.%": "1",
                            "spec.0.selector.0.match_labels.app": "backend-logging-load",
                            "spec.0.strategy.#": "0",
                            "spec.0.template.#": "1",
                            "spec.0.template.0.metadata.#": "1",
                            "spec.0.template.0.metadata.0.annotations.%": "0",
                            "spec.0.template.0.metadata.0.generate_name": "",
                            "spec.0.template.0.metadata.0.generation": "0",
                            "spec.0.template.0.metadata.0.labels.%": "1",
                            "spec.0.template.0.metadata.0.labels.app": "backend-logging-load",
                            "spec.0.template.0.metadata.0.name": "",
                            "spec.0.template.0.metadata.0.namespace": "",
                            "spec.0.template.0.metadata.0.resource_version": "",
                            "spec.0.template.0.metadata.0.self_link": "",
                            "spec.0.template.0.metadata.0.uid": "",
                            "spec.0.template.0.spec.#": "1",
                            "spec.0.template.0.spec.0.active_deadline_seconds": "0",
                            "spec.0.template.0.spec.0.container.#": "1",
                            "spec.0.template.0.spec.0.container.0.args.#": "3",

In general, if you ever need a programming problem for an interview, just ask the candidate to write a parser for this format 🙂
After trying for a long time to write a bug-free parser, I found part of it - the most important part - in the terraform code itself. And everything seemed to work fine.
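
For reference, terraform's flatmap package does exactly this expansion; a minimal sketch of what it does with the keys above, assuming the pre-0.12 terraform module is importable:

```go
package main

import (
	"fmt"

	"github.com/hashicorp/terraform/flatmap"
)

func main() {
	attrs := map[string]string{
		"metadata.#":            "1",
		"metadata.0.name":       "backend-logging-load-deployment",
		"metadata.0.labels.%":   "1",
		"metadata.0.labels.app": "backend-logging",
	}

	// Expand rebuilds nested Go values from the flat dotted keys:
	// "#" keys become []interface{}, "%" keys become map[string]interface{}.
	fmt.Printf("%#v\n", flatmap.Expand(attrs, "metadata"))
}
```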

Third attempt
Terraform providers are binaries that contain the code with all the resources and all the logic for working with the cloud API. Each cloud has its own provider, and terraform itself only calls them through its RPC protocol between the two processes.
This time I decided to talk to the terraform providers directly via RPC calls. It turned out beautifully and made it possible to swap terraform providers for newer ones and get new features without changing the code. It also turned out that not all fields in tfstate should end up in tf - but how do you find out which ones? Just ask the provider about it. Then another round of recursive ugliness began: assembling regular expressions and a rigmarole of searching for fields inside tfstate at every level of depth.
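
A rough sketch of what "talking to a provider directly" looks like with the pre-0.12 plugin packages (the exact types and handshake details moved around between versions, so treat this as an outline, not the tool's actual code):

```go
package main

import (
	"log"

	tfplugin "github.com/hashicorp/terraform/plugin"
	"github.com/hashicorp/terraform/plugin/discovery"
	"github.com/hashicorp/terraform/terraform"
)

func main() {
	// Launch the provider binary as a child process and connect to it
	// over go-plugin's local RPC, the same way terraform itself does.
	client := tfplugin.Client(discovery.PluginMeta{
		Name: "aws",
		Path: "/path/to/terraform-provider-aws", // hypothetical path
	})
	defer client.Kill()

	rpcClient, err := client.Client()
	if err != nil {
		log.Fatal(err)
	}
	raw, err := rpcClient.Dispense(tfplugin.ProviderPluginName)
	if err != nil {
		log.Fatal(err)
	}
	provider := raw.(terraform.ResourceProvider)

	// Real code must call provider.Configure(...) with credentials first.
	// Refresh reads one resource by ID - the call behind `terraform refresh`.
	state, err := provider.Refresh(
		&terraform.InstanceInfo{Type: "aws_s3_bucket"},
		&terraform.InstanceState{ID: "my-bucket"},
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("got %d attributes", len(state.Attributes))
}
```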

In the end, I got a useful CLI tool with a common infrastructure for all terraform providers, so you can easily add a new one. Adding resources also takes little code. Plus all sorts of goodies, such as connections between resources. Of course there were many other problems along the way that I cannot describe here.
I named the animal Terraformer.

The finale

Using Terraformer, we generated 500-700 thousand lines of tf + tfstate code from two clouds. We were able to take legacy things and start touching them only through terraform, as the best infrastructure-as-code ideas prescribe. It is just magic when you take a huge cloud and, through one command, get it back as working terraform files. And then grep / replace / git and so on.

I combed it, put it in order and got permission to release it. It went up on GitHub for everyone on Thursday (02.05.19): github.com/GoogleCloudPlatform/terraformer
It already got 600 stars and 2 pull requests adding support for OpenStack and Kubernetes, plus good feedback. In general, the project is useful for people.
I advise it to everyone who wants to start working with Terraform without rewriting everything first.
I would be happy to get pull requests, issues and stars.

Demo

Source: habr.com
