by Thierry de Pauw on
Infrastructure Testing doesn't get the attention it deserves, @iSibnZe
working on infrastructure has changed ... a lot!
infrastructure work can now be done inside development teams (in the past, dedicated teams handled it)
=> shift responsibility
but our engineering lags behind
- version control: most teams agree on this
- pipelines: few teams run IaC from pipelines
- steady environment
and Testing is important for automation!
The development cycle:
- develop change
- verify the change locally
increment the infrastructure with changes -> updated infrastructure -> test the resulting behaviour
code repo -> build artefact -> dev -> prod
between code repo and build artefact:
1. build and test infrastructure code: bare minimum is to check the validity of the definition files
2. if the tests fail, no artefact is produced
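A minimal sketch of that build gate, assuming the definition files are JSON (for instance CloudFormation templates or Terraform's `.tf.json` syntax). The function name `build_artefact` and the directory layout are made up for illustration. It only checks that each file parses, which is the bare minimum named above, and it writes the artefact only when every file passes.

```python
import json
import tarfile
from pathlib import Path


def build_artefact(src_dir, artefact_path):
    """Validate every *.json definition file, then package the directory.

    Hypothetical helper: raises ValueError (and writes no artefact)
    if any definition file fails to parse.
    """
    src = Path(src_dir)
    errors = []
    for path in sorted(src.rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            errors.append(f"{path.name}: {exc.msg} (line {exc.lineno})")
    if errors:
        raise ValueError("invalid definition files: " + "; ".join(errors))

    # Only reached when all checks passed: produce the build artefact.
    with tarfile.open(str(artefact_path), "w:gz") as tar:
        tar.add(str(src), arcname=src.name)
    return Path(artefact_path)
```

A real pipeline would likely replace the parse check with the tool's own validation command and add model tests before packaging.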
dev -> pre-prod -> prod
on each transition:
1. provision infrastructure
2. run tests
3. failing tests fail the pipeline
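The three steps per transition can be sketched as a small loop. This is an illustration only: `promote`, `provision` and `run_tests` are hypothetical names, and the two callables stand in for whatever the pipeline really does (for example a `terraform apply` and a test runner).

```python
def promote(stages, provision, run_tests):
    """Walk a change through the stages in order; stop at the first failure.

    provision(stage) applies the infrastructure code to that stage.
    run_tests(stage) returns True when the tests for that stage pass.
    Both are placeholders supplied by the caller.
    """
    reached = []
    for stage in stages:
        provision(stage)
        if not run_tests(stage):
            # Failing tests fail the pipeline: later stages are never touched.
            raise RuntimeError(f"tests failed in {stage}; passed so far: {reached}")
        reached.append(stage)
    return reached
```

Called as `promote(["dev", "pre-prod", "prod"], provision, run_tests)`, a failure in pre-prod raises before prod is provisioned.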
Cloud -> API -> tool using the API
test on all these abstraction levels:
- Unit/Model tests against the tool
- API tests: check via the API if the provisioning did what was expected
Model tests: I don't trust my code
API tests: I don't trust my tool (nor my code)
Real infrastructure tests: I have trust issues :)
Real infrastructure tests are expensive, but the only way to know for sure ...
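An API test in this sense might look like the sketch below: after provisioning, ask the cloud API what exists and compare it with what was intended. `describe` is a hypothetical wrapper around a real API call, and the resource ids and attribute names are invented for the example.

```python
def check_provisioned(describe, expected):
    """Return a list of mismatches between expected and actual state.

    describe(resource_id) is an assumed wrapper around a cloud API call; it
    returns a dict of attributes, or None if the resource does not exist.
    expected maps resource ids to the attributes that must match.
    """
    problems = []
    for resource_id, wanted in expected.items():
        actual = describe(resource_id)
        if actual is None:
            problems.append(f"{resource_id}: missing")
            continue
        for key, value in wanted.items():
            if actual.get(key) != value:
                problems.append(
                    f"{resource_id}: {key} is {actual.get(key)!r}, expected {value!r}"
                )
    return problems


# Stand-in for the real API, so the sketch runs without cloud credentials.
fake_cloud = {"bucket-logs": {"versioning": True, "region": "eu-west-1"}}
problems = check_provisioned(
    fake_cloud.get,
    {"bucket-logs": {"versioning": True}, "bucket-data": {"versioning": True}},
)
```

An empty list means the API agrees with the code. Here `bucket-data` is reported as missing.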
example: Pulumi model tests
what about Terraform?
you will have to apply some hacks to do model testing.
run the plan and perform checks against the plan.
terraform plan -out my.plan
terraform show my.plan
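One way to perform such checks is to render the plan as JSON with `terraform show -json my.plan` and assert on its `resource_changes` entries. The sketch below assumes that approach. The embedded plan is a hand-written, heavily trimmed sample rather than real Terraform output, and the two rules (no deletions, an `Owner` tag on everything created) are example policies, not something the talk prescribes.

```python
import json

# Hand-written sample in the shape of `terraform show -json` output (assumption:
# only the fields used below are included).
SAMPLE_PLAN = """
{
  "resource_changes": [
    {"address": "aws_instance.web", "type": "aws_instance",
     "change": {"actions": ["create"], "after": {"tags": {"Owner": "team-a"}}}},
    {"address": "aws_instance.db", "type": "aws_instance",
     "change": {"actions": ["delete", "create"], "after": {"tags": null}}}
  ]
}
"""


def check_plan(plan):
    """Return a list of policy violations found in a parsed JSON plan."""
    violations = []
    for rc in plan.get("resource_changes", []):
        change = rc["change"]
        actions = change["actions"]
        # A replacement shows up as both "delete" and "create".
        if "delete" in actions:
            violations.append(f"{rc['address']}: plan would delete or replace it")
        if "create" in actions:
            tags = (change.get("after") or {}).get("tags") or {}
            if "Owner" not in tags:
                violations.append(f"{rc['address']}: missing Owner tag")
    return violations


violations = check_plan(json.loads(SAMPLE_PLAN))
```

In a pipeline the JSON would come from the saved plan file instead of the inline sample, and a non-empty `violations` list would fail the build.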
Example 1: Dedicated, Ephemeral, Immutable, Testing
- test an abstraction in the infrastructure codebase
- for every test execution: Create, Test, Destroy
tests run off-site and assert that the module works as expected.
time: 20 - 40min
- no interference with prod infra
- cheap compared to persistent iuT
- high execution times
- glue code might be extensive
- modules are not always a good starting point
example: JUnit with a Before step (terraform init; terraform apply -auto-approve) and a TearDown step that destroys the infrastructure again
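The same Create, Test, Destroy lifecycle can be sketched in Python as a context manager. This is an assumption-laden illustration rather than the speaker's code: `EphemeralInfra` is a made-up name, and the `run` parameter exists only so the command runner can be swapped out, for instance in a dry run.

```python
import subprocess


class EphemeralInfra:
    """Create on enter, destroy on exit, for one Terraform working directory."""

    def __init__(self, workdir, run=subprocess.run):
        self.workdir = workdir
        self._run = run  # injectable; defaults to really invoking the CLI

    def _terraform(self, *args):
        # check=True turns a non-zero exit code into CalledProcessError.
        self._run(["terraform", *args], cwd=self.workdir, check=True)

    def __enter__(self):
        self._terraform("init", "-input=false")
        try:
            self._terraform("apply", "-auto-approve", "-input=false")
        except Exception:
            # A failed apply can leave resources behind; clean up, then re-raise.
            self._terraform("destroy", "-auto-approve")
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        self._terraform("destroy", "-auto-approve")
        return False  # never swallow a test failure
```

A test would wrap its assertions in `with EphemeralInfra("path/to/module"):`. The destroy step runs whether the assertions pass or fail, which keeps the test infrastructure from leaking.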
Example 1b: Dedicated, Persistent, Mutable, Testing
- keep test infrastructure around
- provisioning applies updates: terraform apply changes the existing infrastructure in place
- higher quality because you always update existing infra as you do in production
- more expensive ($$)
Example 2: Reused, Persistent, (Im)mutable, Productive
- test productive infrastructure as it changes
- can enhance a pipeline stage => smoke tests
- mutable: provision infra and then check
- immutable: provision infra, check and then switch
But: may interfere with your production workload!
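The difference between the mutable and immutable variants can be sketched as two small functions. All names here are hypothetical, and the callables stand in for real provisioning, smoke tests and traffic switching.

```python
def release_mutable(provision, check):
    """Mutable: change the live infrastructure in place, then check it.

    A failing check is only discovered after production has already changed.
    """
    provision()
    return check()


def release_immutable(provision, check, switch, destroy):
    """Immutable: build a new copy, check it, and only then switch over.

    Returns the new environment, or None if the check failed and the
    candidate was discarded without ever serving production.
    """
    candidate = provision()
    if not check(candidate):
        destroy(candidate)
        return None
    switch(candidate)
    return candidate
```

In the immutable variant a failed smoke test costs only the discarded candidate, whereas in the mutable variant the checks run against infrastructure that is already serving production.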