Rollbacks on failures
- Stack creation fails (CreateStack API):
  - Default: everything rolls back (gets deleted), and we can still look at the log. OnFailure=ROLLBACK
  - Troubleshoot: disable rollback and troubleshoot manually. OnFailure=DO_NOTHING
  - Delete: get rid of the stack entirely and keep nothing. OnFailure=DELETE
- Stack update fails (UpdateStack API):
  - The stack automatically rolls back to the previous known working state
  - The log shows what happened, including the error messages
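The three OnFailure values map directly onto the CreateStack call. Here is a minimal sketch in Python, assuming `cfn` is a boto3 CloudFormation client (`boto3.client("cloudformation")`). The helper name `create_stack_with_policy` is hypothetical and not part of any library:

```python
VALID_ON_FAILURE = {"ROLLBACK", "DO_NOTHING", "DELETE"}


def create_stack_with_policy(cfn, stack_name, template_body,
                             on_failure="ROLLBACK", **extra):
    """Create a stack and choose what happens if creation fails.

    `cfn` is assumed to be a boto3 CloudFormation client. `extra` is passed
    through unchanged, e.g. Parameters=[...] for templates that need them.
    """
    if on_failure not in VALID_ON_FAILURE:
        raise ValueError(f"OnFailure must be one of {sorted(VALID_ON_FAILURE)}")
    return cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        OnFailure=on_failure,  # ROLLBACK is also the API default
        **extra,
    )
```

Passing `on_failure="DO_NOTHING"` keeps the failed resources in place so they can be inspected. Note that the API accepts either `OnFailure` or `DisableRollback`, not both.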
Let’s use the same template file:
```yaml
---
Parameters:
  SSHKey:
    Type: AWS::EC2::KeyPair::KeyName
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance

Resources:
  MyInstance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: eu-central-1a
      ImageId: ami-00a205cb8e06c3c4e
      InstanceType: t2.micro
      KeyName: !Ref SSHKey
      SecurityGroups:
        - !Ref SSHSecurityGroup
      # we install our web server with user data
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -xe
          # Get the latest CloudFormation package
          yum update -y aws-cfn-bootstrap
          # Start cfn-init
          /opt/aws/bin/cfn-init -s ${AWS::StackId} -r MyInstance --region ${AWS::Region}
          # Start cfn-signal to the wait condition
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackId} --resource SampleWaitCondition --region ${AWS::Region}
    Metadata:
      Comment: Install a simple Apache HTTP page
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              httpd: []
          files:
            "/var/www/html/index.html":
              content: |
                <h1>Hello World from EC2 instance!</h1>
                <p>This was created using cfn-init</p>
              mode: '000644'
          commands:
            hello:
              command: "echo 'boom' && exit 1"
          services:
            sysvinit:
              httpd:
                enabled: 'true'
                ensureRunning: 'true'

  SampleWaitCondition:
    CreationPolicy:
      ResourceSignal:
        Timeout: PT1M
        Count: 1
    Type: AWS::CloudFormation::WaitCondition

  # our EC2 security group
  SSHSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: SSH and HTTP
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          FromPort: 22
          IpProtocol: tcp
          ToPort: 22
        - CidrIp: 0.0.0.0/0
          FromPort: 80
          IpProtocol: tcp
          ToPort: 80
```
Now we can create the stack, but this time with a different option: rollback on failure is disabled.
Creation still fails because CloudFormation failed to receive 1 resource signal(s).
But this time the stack was not rolled back.
When we connect to the created EC2 instance, we can see the cause of the error:
```
[ec2-user@ip-172-31-31-182 ~]$ sudo /var/log/cfn-init.log
sudo: /var/log/cfn-init.log: command not found
[ec2-user@ip-172-31-31-182 ~]$ sudo cat /var/log/cfn-init.log
2021-06-01 16:41:35,941 [INFO] -----------------------Starting build-----------------------
2021-06-01 16:41:35,942 [INFO] Running configSets: default
2021-06-01 16:41:35,943 [INFO] Running configSet default
2021-06-01 16:41:35,944 [INFO] Running config config
2021-06-01 16:41:47,932 [INFO] Yum installed ['httpd']
2021-06-01 16:41:47,947 [ERROR] Command hello (echo 'boom' && exit 1) failed
2021-06-01 16:41:47,947 [ERROR] Error encountered during build of config: Command hello failed
Traceback (most recent call last):
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 573, in run_config
    CloudFormationCarpenter(config, self._auth_config).build(worklog)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 273, in build
    self._config.commands)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/command_tool.py", line 127, in apply
    raise ToolError(u"Command %s failed" % name)
cfnbootstrap.construction_errors.ToolError: Command hello failed
2021-06-01 16:41:47,947 [ERROR] -----------------------BUILD FAILED!------------------------
2021-06-01 16:41:47,947 [ERROR] Unhandled exception during build: Command hello failed
Traceback (most recent call last):
  File "/opt/aws/bin/cfn-init", line 176, in <module>
    worklog.build(metadata, configSets)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 135, in build
    Contractor(metadata).build(configSets, self)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 561, in build
    self.run_config(config, worklog)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 573, in run_config
    CloudFormationCarpenter(config, self._auth_config).build(worklog)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/construction.py", line 273, in build
    self._config.commands)
  File "/usr/lib/python3.7/site-packages/cfnbootstrap/command_tool.py", line 127, in apply
    raise ToolError(u"Command %s failed" % name)
cfnbootstrap.construction_errors.ToolError: Command hello failed
```
To troubleshoot a failed creation, we need to disable rollback on failure so that the resources stay around for inspection.
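Most of the cfn-init log is traceback noise, and the useful part is the handful of `[ERROR]` lines. A small sketch that extracts them, assuming the timestamp and level layout seen in the log above (the function name `error_messages` is hypothetical):

```python
import re

# Matches lines such as:
# 2021-06-01 16:41:47,947 [ERROR] Command hello (echo 'boom' && exit 1) failed
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[(\w+)\] (.*)$"
)


def error_messages(log_text):
    """Return the message part of every [ERROR] line.

    Traceback lines carry no timestamp, so they do not match and are skipped.
    """
    messages = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group(2) == "ERROR":
            messages.append(match.group(3))
    return messages
```

On the instance, this could be fed the contents of `/var/log/cfn-init.log`, which typically needs root privileges to read.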
Nested stacks
- Nested stacks are stacks created as part of other stacks
- They allow you to isolate repeated patterns / common components in separate stacks and call them from other stacks
- Example:
  - Load Balancer configuration that is re-used
  - Security Group that is re-used
- Nested stacks are considered best practice
- To update a nested stack, always update the parent (root stack)
Consider the following CloudFormation template:
```yaml
Parameters:
  SSHKey:
    Type: AWS::EC2::KeyPair::KeyName
    Description: Name of an existing EC2 KeyPair to enable SSH access to the instance

Resources:
  myStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: https://s3.amazonaws.com/cloudformation-templates-us-east-1/LAMP_Single_Instance.template
      Parameters:
        KeyName: !Ref SSHKey
        DBName: "mydb"
        DBUser: "user"
        DBPassword: "pass"
        DBRootPassword: "passroot"
        InstanceType: t2.micro
        SSHLocation: "0.0.0.0/0"

Outputs:
  StackRef:
    Value: !Ref myStack
  OutputFromNestedStack:
    Value: !GetAtt myStack.Outputs.WebsiteURL
```
As we can see, this template pulls in another template through the TemplateURL property. The URL of the nested template is
https://s3.amazonaws.com/cloudformation-templates-us-east-1/LAMP_Single_Instance.template
Let’s create a stack:
Next -> Next
Create stack
Now we have two stacks: the parent stack and the nested stack.
If we go to the outputs of the nested stack, we find the URL of the created LAMP website.
To delete both stacks, we delete the parent stack, which also deletes the nested one.
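To see which nested stacks belong to a parent, we can list the parent's resources and keep those of type `AWS::CloudFormation::Stack`. A minimal sketch, assuming `cfn` is a boto3 CloudFormation client (the helper name `nested_stack_ids` is hypothetical):

```python
def nested_stack_ids(cfn, parent_stack_name):
    """Return the physical IDs of the stacks nested directly in a parent.

    `cfn` is assumed to be a boto3 CloudFormation client. Only direct
    children are returned; deeper nesting levels are not followed.
    """
    ids = []
    kwargs = {"StackName": parent_stack_name}
    while True:
        page = cfn.list_stack_resources(**kwargs)
        for resource in page["StackResourceSummaries"]:
            if resource["ResourceType"] == "AWS::CloudFormation::Stack":
                ids.append(resource["PhysicalResourceId"])
        token = page.get("NextToken")
        if not token:
            return ids
        kwargs["NextToken"] = token  # fetch the next page of resources
```

The returned IDs are meant for inspection only. Deletion should still go through the parent stack.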
ChangeSets
- When you update a stack, you want to know what will change before it happens, for greater confidence
- ChangeSets won’t say if the update will be successful
To begin, we will create a simple stack:
```yaml
---
Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: eu-central-1a
      ImageId: ami-00a205cb8e06c3c4e
      InstanceType: t2.micro
```
Next -> Next -> Create stack
Now let’s create a change set for the created stack. We will use the following template:
```yaml
---
Parameters:
  SecurityGroupDescription:
    Description: Security Group Description
    Type: String

Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: eu-central-1a
      ImageId: ami-00a205cb8e06c3c4e
      InstanceType: t2.micro
      SecurityGroups:
        - !Ref SSHSecurityGroup
        - !Ref ServerSecurityGroup

  # an elastic IP for our instance
  MyEIP:
    Type: AWS::EC2::EIP
    Properties:
      InstanceId: !Ref MyEC2Instance

  # our EC2 security group
  SSHSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Enable SSH access via port 22
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          FromPort: 22
          IpProtocol: tcp
          ToPort: 22

  # our second EC2 security group
  ServerSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: !Ref SecurityGroupDescription
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 80
          ToPort: 80
          CidrIp: 0.0.0.0/0
        - IpProtocol: tcp
          FromPort: 22
          ToPort: 22
          CidrIp: 192.168.1.1/32
```
CloudFormation -> Stacks -> StackCsDemo -> Create change set
Next -> Next -> Create change set -> Create change set
At first, the change set has not been created yet.
Once it is ready, we can see every change that will be made to our stack:
You can delete or execute the change set.
Let’s execute the change set.
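The same review step can be scripted. The sketch below assumes `cfn` is a boto3 CloudFormation client. `summarize_changes` and `preview_update` are hypothetical helper names, and the code only reads the `Changes` / `ResourceChange` fields of a DescribeChangeSet response:

```python
def summarize_changes(change_set):
    """Turn a DescribeChangeSet response into one readable line per change."""
    lines = []
    for change in change_set.get("Changes", []):
        rc = change.get("ResourceChange")
        if rc is None:
            continue  # skip entries that do not describe a resource
        line = f'{rc["Action"]} {rc["LogicalResourceId"]} ({rc["ResourceType"]})'
        if rc.get("Replacement"):
            # Only reported for modified resources.
            line += f' replacement={rc["Replacement"]}'
        lines.append(line)
    return lines


def preview_update(cfn, stack_name, change_set_name, template_body, parameters=()):
    """Create a change set, wait until it is ready, and summarize it.

    Nothing is applied to the stack here. Applying the changes would be a
    separate cfn.execute_change_set(...) call.
    """
    cfn.create_change_set(
        StackName=stack_name,
        ChangeSetName=change_set_name,
        TemplateBody=template_body,
        Parameters=list(parameters),
        ChangeSetType="UPDATE",
    )
    cfn.get_waiter("change_set_create_complete").wait(
        StackName=stack_name, ChangeSetName=change_set_name
    )
    return summarize_changes(
        cfn.describe_change_set(
            StackName=stack_name, ChangeSetName=change_set_name
        )
    )
```

Keeping creation and execution as separate calls mirrors the console flow: the change set can be reviewed and then either executed or deleted.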
Retaining Data on Deletes
- You can put a DeletionPolicy on any resource to control what happens to it when the CloudFormation stack is deleted
- DeletionPolicy=Retain:
  - Specify on resources to preserve / back up in case of CloudFormation deletes
  - To keep a resource, specify Retain (works for any resource / nested stack)
- DeletionPolicy=Snapshot:
  - EBS Volume, ElastiCache Cluster, ElastiCache ReplicationGroup
  - RDS DBInstance, RDS DBCluster, Redshift Cluster
- DeletionPolicy=Delete (default behavior):
  - Note: for AWS::RDS::DBCluster resources, the default policy is Snapshot
  - Note: to delete an S3 bucket, you need to first empty the bucket of its content
Consider the following YAML template:
```yaml
---
Resources:
  MySG:
    Type: AWS::EC2::SecurityGroup
    DeletionPolicy: Retain
    Properties:
      GroupDescription: Enable SSH access via port 22
      SecurityGroupIngress:
        - CidrIp: 0.0.0.0/0
          FromPort: 22
          IpProtocol: tcp
          ToPort: 22

  MyEBS:
    Type: AWS::EC2::Volume
    DeletionPolicy: Snapshot
    Properties:
      AvailabilityZone: eu-central-1a
      Size: 1
      VolumeType: gp2
```
Let’s go and create a stack:
Next -> Next -> Next -> Create stack
The stack has created an EBS volume and a security group.
Now, let’s delete the stack. As we can see, the security group MySG has not been deleted.
The EBS volume has been deleted, but a snapshot of the volume was taken first.
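Before deleting a stack, it can help to check which policy applies to each resource. The sketch below works on a template already loaded into a Python dict, here mirroring only the `Type` and `DeletionPolicy` keys of the template above. The function name is hypothetical, and the defaults encode only the rules stated in this section:

```python
def effective_deletion_policies(template):
    """Map each logical resource ID to the DeletionPolicy that applies to it."""
    policies = {}
    for logical_id, resource in template.get("Resources", {}).items():
        # Per the notes in this section: AWS::RDS::DBCluster defaults to
        # Snapshot, every other resource type defaults to Delete.
        if resource.get("Type") == "AWS::RDS::DBCluster":
            default = "Snapshot"
        else:
            default = "Delete"
        policies[logical_id] = resource.get("DeletionPolicy", default)
    return policies


template = {
    "Resources": {
        "MySG": {"Type": "AWS::EC2::SecurityGroup", "DeletionPolicy": "Retain"},
        "MyEBS": {"Type": "AWS::EC2::Volume", "DeletionPolicy": "Snapshot"},
    }
}
print(effective_deletion_policies(template))
# {'MySG': 'Retain', 'MyEBS': 'Snapshot'}
```

A resource without an explicit `DeletionPolicy` would show up as `Delete`, which is a quick way to spot anything that will be lost with the stack.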
Termination Protection on Stacks
- To prevent accidental deletion of CloudFormation stacks, use Termination Protection
Let’s see this quickly. First we create a simple stack:
```yaml
---
Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      AvailabilityZone: eu-central-1a
      ImageId: ami-00a205cb8e06c3c4e
      InstanceType: t2.micro
```
Now let’s enable termination protection:
Once termination protection is enabled, we cannot delete the stack:
As the message says, we need to disable termination protection before we can delete the stack.
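Termination protection can also be toggled and checked from code. A minimal sketch, assuming `cfn` is a boto3 CloudFormation client. Both helper names are hypothetical, and the check before deleting is just one possible guard:

```python
def set_termination_protection(cfn, stack_name, enabled):
    """Enable or disable termination protection on a stack."""
    return cfn.update_termination_protection(
        StackName=stack_name,
        EnableTerminationProtection=enabled,
    )


def delete_stack_checked(cfn, stack_name):
    """Delete a stack, failing with a clear message if it is protected."""
    stack = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]
    if stack.get("EnableTerminationProtection"):
        raise RuntimeError(
            f"{stack_name} has termination protection enabled; "
            "disable it before deleting the stack"
        )
    cfn.delete_stack(StackName=stack_name)
```

To actually remove a protected stack, call `set_termination_protection(cfn, name, False)` first and then `delete_stack_checked(cfn, name)`.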