Ansible Handling Errors

Ansible evaluates the return codes of modules and commands to determine whether a task succeeded or failed. Normally, when a task fails on a host, Ansible immediately stops the play on that host, skipping all subsequent tasks.
However, an administrator may sometimes want playbook execution to continue even if a task fails. For example, a particular command may be expected to fail.

There are a number of Ansible features that may be used to better manage task errors:

  • Ignore the failed task: By default, if a task fails, the play is aborted on that host; however, this behavior can be overridden by ignoring the failure. To do so, add the ignore_errors keyword to the task. For example, if the notapkg package does not exist, the yum module fails, but setting ignore_errors to yes allows execution to continue.

  • Force execution of handlers: By default, if a task fails on a host, handlers that were already notified do not run on that host. Administrators can override this behavior with the force_handlers keyword, which is set on the play rather than on an individual task. With force_handlers set to yes, notified handlers are called even if a later task in the play fails.

  • Override the failed state: A task itself can succeed, yet administrators may want to mark it as failed based on specific criteria. To do so, use the failed_when keyword in the task. This is often used with modules that execute remote commands and capture the output in a variable. For example, administrators can run a script that outputs an error message and use that message to define the failed state for the task.

  • Override the changed state: When a task updates a managed host, it acquires the changed state. However, if a task does not perform any change on the managed host, the handlers it notifies are skipped. The changed_when keyword overrides the default behavior of what triggers the changed state. For example, if administrators want to restart a service every time a playbook runs, they can add changed_when to the task so that its handler is triggered on every run.
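The four keywords above can be sketched together in one short play. This is a minimal illustration rather than a reference playbook: the script path, the error string, and the handler name are hypothetical, and only notapkg comes from the text.

```yaml
---
# Sketch only: script path, error string and handler name are illustrative.
- name: Demonstrate the task error-handling keywords
  hosts: all
  force_handlers: yes      # play level: run already-notified handlers even if a later task fails
  tasks:
    - name: Try to install a package that does not exist
      yum:
        name: notapkg
        state: latest
      ignore_errors: yes   # record the failure, then continue with the next task

    - name: Run a script and decide failure from its output
      shell: /usr/local/bin/create_users.sh    # hypothetical script
      register: command_result
      failed_when: "'Password missing' in command_result.stdout"

    - name: Ensure the database package is present and always report a change
      yum:
        name: mariadb-server
        state: present
      changed_when: true   # forces the changed state so the handler is notified
      notify: restart_database

  handlers:
    - name: restart_database
      service:
        name: mariadb
        state: restarted
```

Note that failed_when replaces the module's own failure check, so the second task fails only when the condition is true.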

 

Ansible blocks and error handling
In playbooks, blocks are clauses that enclose tasks. Blocks allow for logical grouping of tasks, and can be used to control how tasks are executed. For example, administrators can define a main set of tasks and a set of extra tasks that will only be executed if the first set fails. To do so, blocks in playbooks can be used with three keywords:

  • block: Defines the main tasks to run.
  • rescue: Defines the tasks that will be run if the tasks defined in the block clause fail.
  • always: Defines the tasks that will always run independently of the success or failure of tasks defined in the block and rescue clauses.

The following example shows how to implement a block in a playbook. Even if tasks defined in the block clause fail, tasks defined in the rescue and always clauses will be executed:
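A minimal sketch of the three clauses, assuming two hypothetical scripts for upgrading and reverting a database (the script paths and the service name are illustrative, not taken from the text):

```yaml
---
- name: Upgrade the database with a fallback
  hosts: all
  tasks:
    - name: Attempt the upgrade, revert on failure, always restart
      block:
        - name: Upgrade the database
          shell: /usr/local/lib/upgrade-database    # hypothetical script
      rescue:
        - name: Revert the database upgrade
          shell: /usr/local/lib/revert-database     # hypothetical script
      always:
        - name: Restart the database service
          service:
            name: mariadb
            state: restarted
```

If the upgrade task fails, the rescue task runs; the restart in the always clause runs in either case.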

 

Example 1. Configure Error Handling

The playbook below copies files by URL from the managed hosts.

First, ensure that the http service is started on both servers:

Now run playbook5:

As you can see, index.html is copied into the /tmp directory:

Next, stop the http service on managedhost1:

And remove the index.html files from /tmp:

When we run playbook5 now, we see errors:

Now we will write a playbook with error debugging:
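One way such a playbook could look is sketched below. It assumes that each managed host serves index.html from its own web server and that the file is fetched with the get_url module; the URL pattern, play name, and message text are assumptions, not the original listing.

```yaml
---
- name: Fetch index.html and report failures
  hosts: all
  tasks:
    - name: Download the page and report any problem
      block:
        - name: Copy index.html from the web server into /tmp
          get_url:
            url: "http://{{ inventory_hostname }}/index.html"   # assumed URL pattern
            dest: /tmp/index.html
      rescue:
        - name: Report the problem
          debug:
            msg: "Could not download index.html on {{ inventory_hostname }}; check that the http service is running"
```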

Running the playbook:

Everything went well:

Let’s stop the http service on managedhost2:

Now the playbook prints the message about the problem:

 

Example 2. Handling errors

The playbook playbook.yml targets all hosts in the mailservers group. It initializes four variables: maildir_path with a value of /home/john/Maildir, maildir with a value of /home/student/Maildir, mail_package with a value of postfix, and mail_service with a value of postfix. Its tasks section defines a first task that uses the copy module to copy a directory to the managed host, and a second task that installs the package that the mail_package variable defines.

Run the playbook. Watch as the first task fails, resulting in the failure of the play.

Update the first task by adding the ignore_errors keyword with a value of yes. The updated task should read as follows:
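A sketch of the updated task, assuming that the copy task uses maildir as its source and maildir_path as its destination (the task name, the mode, and this mapping of variables are assumptions):

```yaml
- name: Create {{ maildir_path }}
  copy:
    src: "{{ maildir }}"
    dest: "{{ maildir_path }}"
    mode: "0755"
  ignore_errors: yes   # continue with the next task even if the copy fails
```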

Run the playbook and watch as the second task runs, despite the fact that the first task failed.

Create a block in the tasks section and nest the tasks into it. Remove the ignore_errors: yes line from the first task.

Move the task that installs the postfix package into the rescue clause. Update the task to also install the dovecot package.

Add an always clause. It should create a task to start both the postfix and dovecot services. Register the output of the command in the command_result variable.

Add a debug task outside the block that outputs the command_result variable.

The playbook should now read as follows:
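Following the steps above, the playbook could look roughly like this. The task names and the layout of the copy task are assumptions; the variables, the block structure, and the command_result variable come from the text.

```yaml
---
- hosts: mailservers
  vars:
    maildir_path: /home/john/Maildir
    maildir: /home/student/Maildir
    mail_package: postfix
    mail_service: postfix

  tasks:
    - block:
        - name: Create {{ maildir_path }}
          copy:
            src: "{{ maildir }}"
            dest: "{{ maildir_path }}"
            mode: "0755"
      rescue:
        - name: Install the mail packages
          yum:
            name: "{{ item }}"
            state: latest
          loop:
            - "{{ mail_package }}"
            - dovecot
      always:
        - name: Start the mail services
          service:
            name: "{{ item }}"
            state: started
          loop:
            - "{{ mail_service }}"
            - dovecot
          register: command_result

    - debug:
        var: command_result
```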

Run the playbook, and watch as the rescue and always clauses run despite the fact that the first task failed.

To restart the two services every time, the changed_when keyword can be used. To do so, capture the output of the first task into a variable; the variable will be used to force the state of the task that installs the packages. A handlers section will be added to the playbook.

Update the task in the block section by saving the output of the command into the command_output variable.

Update the task in the rescue section by adding a changed_when condition as well as a notify section for the restart_dovecot handler. The condition will read the command_output variable to forcefully mark the task as changed.

Update the task in the always section to restart the service defined by the mail_service variable.

Append a handlers section to restart the dovecot service.

When updated, the tasks and handlers sections should read as follows:
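A possible shape for the two sections is sketched below. The exact changed_when expression is not given in the text; this sketch assumes the `is failed` test on the registered command_output variable, which is true when the copy task in the block failed.

```yaml
  tasks:
    - block:
        - name: Create {{ maildir_path }}
          copy:
            src: "{{ maildir }}"
            dest: "{{ maildir_path }}"
            mode: "0755"
          register: command_output
      rescue:
        - name: Install the mail packages
          yum:
            name: "{{ item }}"
            state: latest
          loop:
            - "{{ mail_package }}"
            - dovecot
          changed_when: command_output is failed   # assumed condition: force "changed"
          notify: restart_dovecot
      always:
        - name: Restart the mail service
          service:
            name: "{{ mail_service }}"
            state: restarted
          register: command_result

    - debug:
        var: command_result

  handlers:
    - name: restart_dovecot
      service:
        name: dovecot
        state: restarted
```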

Run the playbook. Watch as the handler is called, despite the fact that the dovecot package is already installed:

 

Example 3.

Create the playbook.yml playbook. Initialize three variables: web_package with a value of http, db_package with a value of mariadb-server and db_service with a value of mariadb. The variables will be used to install the required packages and start the server. The http value is an intentional error in the package name. Define two tasks that use the yum module and the two variables, web_package and db_package. The task will install the required packages. The file should read as follows:
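A sketch of the initial file, assuming the play targets the databases host group mentioned later in this example (the task names are illustrative):

```yaml
---
- hosts: databases
  vars:
    web_package: http            # intentionally wrong package name
    db_package: mariadb-server
    db_service: mariadb

  tasks:
    - name: Install {{ web_package }} package
      yum:
        name: "{{ web_package }}"
        state: latest

    - name: Install {{ db_package }} package
      yum:
        name: "{{ db_package }}"
        state: latest
```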

Run the playbook and watch the output of the play.

The task failed because there is no existing package called http. Because the first task failed, the second task was skipped. Update the first task to ignore any errors by adding the ignore_errors keyword. The file should read as follows:

Run the playbook another time and watch the output of the play.

Despite the fact that the first task failed, Ansible executed the second one.

Insert a new task at the beginning of the playbook. The task will execute a remote command and capture the output. The output of the command is used by the task that installs the mariadb-server package to override what Ansible considers as a failure.

In the playbook, insert a task at the beginning of the playbook that executes a remote command and saves the output in the command_result variable. The play should also continue if the task fails; to do so, add the ignore_errors keyword. The playbook should read as follows:

Run the playbook to confirm that the failures of the first two tasks are ignored.

Update the task that installs the mariadb-server package. Add a condition that indicates that the task should be considered as failed if the keyword Error is present in the variable command_result. The playbook should read as follows:

Before running the playbook, run an ad hoc command to remove the mariadb-server package from the databases managed hosts.

Run the playbook and watch how the last task is skipped as well.

Update the task that installs the mariadb-server package by overriding what triggers the changed state. To do so, use the return code saved in the command_result.rc variable.

Update the last task by commenting out the when condition and adding a changed_when condition in order to override the changed state for the task. The condition will use the return code the registered variable contains. The playbook should read as follows:
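The tasks section at this stage might look like the sketch below. The remote command is not named in the text, so `date` is used purely as a placeholder; with it, command_result.rc is 0 and the last task is always reported as changed.

```yaml
  tasks:
    - name: Check local time            # placeholder command, an assumption
      command: date
      register: command_result
      ignore_errors: yes

    - name: Install {{ web_package }} package
      yum:
        name: "{{ web_package }}"
        state: latest
      ignore_errors: yes

    - name: Install {{ db_package }} package
      yum:
        name: "{{ db_package }}"
        state: latest
      # (the condition added in the previous step is commented out at this point)
      changed_when: command_result.rc == 0
```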

Execute the playbook twice; the first execution will reinstall the mariadb-server package. The following output shows the result of the second execution; as you can see, the task is marked as changed despite the fact that the mariadb-server package has already been installed:

 

Update the playbook by nesting the first two tasks in a block clause. Remove the lines that use the ignore_errors keyword.

Nest the task that installs the mariadb-server package in a rescue clause and remove the conditional that overrides the changed result. The task will be executed even if the previous tasks fail.

Finally, add an always clause that will start the database server upon installation using the service module.

Once updated, the task section should read as follows:
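A sketch of the resulting tasks section, again using `date` as a placeholder for the unnamed remote command. The rescue clause runs as soon as any task in the block fails, which here is the installation of the nonexistent http package.

```yaml
  tasks:
    - block:
        - name: Check local time        # placeholder command, an assumption
          command: date
          register: command_result

        - name: Install {{ web_package }} package
          yum:
            name: "{{ web_package }}"
            state: latest
      rescue:
        - name: Install {{ db_package }} package
          yum:
            name: "{{ db_package }}"
            state: latest
      always:
        - name: Start {{ db_service }} service
          service:
            name: "{{ db_service }}"
            state: started
```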

Before running the playbook, remove the mariadb-server package from the databases managed hosts.

Run the playbook and watch as, despite the failure in the block, Ansible installs the mariadb-server package and starts the mariadb service.

 

Example 4. Implementing Task Control

In this lab, you will install the Apache web server and secure it using mod_ssl. You will use various Ansible conditionals to deploy the environment.

Defining tasks for the web server
In the top-level directory for this lab, create the install_packages.yml task file. Start by defining the task that uses the yum module in order to install the latest version of the httpd and mod_ssl packages. For the packages, use a loop and two variables: web_package and ssl_package. The two variables will be set by the main playbook.

Add a when clause in order to install the packages only if:
1. The managed host is in the webservers group.
2. The amount of memory on the managed host is greater than the amount the memory variable defines. For the amount of memory the system has, the Ansible fact ansible_memory_mb.real.total can be used.

Finally, add the task that starts the service defined by the web_service variable. When completed, the file should read as follows:
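A minimal sketch of install_packages.yml under these requirements. The task names are illustrative, and the `int` filter is an added safeguard in case the memory variable arrives as a string.

```yaml
---
- name: Install the web server packages
  yum:
    name: "{{ item }}"
    state: latest
  loop:
    - "{{ web_package }}"
    - "{{ ssl_package }}"
  when:
    - inventory_hostname in groups['webservers']
    - ansible_memory_mb.real.total > (memory | int)

- name: Start the web service
  service:
    name: "{{ web_service }}"
    state: started
```

Listing both conditions under when means that both must be true for the packages to be installed.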

 

Defining tasks for the web server’s configuration
Create the configure_web.yml task file. Start with a task that uses the shell module to determine whether or not the httpd package is installed, and register the output in a variable. Add a failed_when condition so that the task is considered failed based on the return code of the command (the return code is 1 when a package is not installed).

Create a block that executes only if the httpd package is installed (use the return code that has been captured in the first task). The block contains the tasks for configuring the files. Start the block with a task that uses the get_url module to retrieve the Apache SSL configuration file. Use the https_uri variable for the url and /etc/httpd/conf.d/ for the remote path on the managed host.

Define a task that creates the remote /etc/httpd/conf.d/ssl directory with a mode of 0755. The directory will store the SSL certificates. Define a task that creates the remote /var/www/html/logs directory with a mode of 0755. The directory will store the SSL logs. Define a task that uses the stat module to check whether the /etc/httpd/conf.d/ssl.conf file exists. Capture the output in the ssl_file variable using the register statement.

Define a task that renames the /etc/httpd/conf.d/ssl.conf file to /etc/httpd/conf.d/ssl.conf.bak, only if the file exists. To decide whether to rename the file, the task evaluates the content of the ssl_file variable registered by the previous task.

Define another task that uses the unarchive module to retrieve the remote SSL configuration files. Use the ssl_uri variable for the source and /etc/httpd/conf.d/ssl/ as the destination. Instruct the task to notify the restart_services handler when the file has been copied.

Add the last task that creates the index.html file under /var/www/html/ on the managed host. The page should use Ansible facts and should read as follows:

Use the following two Ansible facts to create the page: ansible_fqdn and ansible_default_ipv4.address.

Finally, make sure the block only runs if the httpd package is installed. To do so, add a when clause that parses the return code contained in the rpm_check registered variable. When completed, the file should read as follows:
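A sketch of configure_web.yml that follows the steps above. The task names, the use of `rpm -q` for the package check, the `mv` command for the rename, and the wording of the page are assumptions; the text does not show the exact page content, so the string below is a placeholder.

```yaml
---
- name: Check whether httpd is installed
  shell: rpm -q httpd
  register: rpm_check
  failed_when: rpm_check.rc == 1

- block:
    - name: Retrieve the Apache SSL configuration file
      get_url:
        url: "{{ https_uri }}"
        dest: /etc/httpd/conf.d/

    - name: Create the SSL certificate directory
      file:
        path: /etc/httpd/conf.d/ssl
        state: directory
        mode: "0755"

    - name: Create the logs directory
      file:
        path: /var/www/html/logs
        state: directory
        mode: "0755"

    - name: Check whether ssl.conf exists
      stat:
        path: /etc/httpd/conf.d/ssl.conf
      register: ssl_file

    - name: Rename ssl.conf
      shell: mv /etc/httpd/conf.d/ssl.conf /etc/httpd/conf.d/ssl.conf.bak
      when: ssl_file.stat.exists

    - name: Retrieve and unpack the SSL configuration files
      unarchive:
        src: "{{ ssl_uri }}"
        dest: /etc/httpd/conf.d/ssl/
        remote_src: yes
      notify:
        - restart_services

    - name: Create the web page
      copy:
        # placeholder wording; only the two facts are taken from the text
        content: "{{ ansible_fqdn }} ({{ ansible_default_ipv4.address }}) has been customized by Ansible\n"
        dest: /var/www/html/index.html
  when: rpm_check.rc == 0
```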

 

Defining tasks for the firewall
Create the configure_firewall.yml task file. Define the task that uses the yum module to install the package that the fw_package variable defines (latest version of the firewall service – the variable will be set in the main playbook). Tag the task with the production tag.

Add the task that starts the firewall service using the fw_service variable and tag it as production.

Write the task that uses the firewalld module to add the http and https service rules to the firewall. The rules should be applied immediately as well as persistently. Use a loop for the two rules. Tag the task as production.
When completed, the file should read as follows:
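A minimal sketch of configure_firewall.yml, with illustrative task names. On recent Ansible releases the firewalld module ships in the ansible.posix collection, so this assumes that collection is available.

```yaml
---
- name: Install the firewall package
  yum:
    name: "{{ fw_package }}"
    state: latest
  tags: production

- name: Start the firewall service
  service:
    name: "{{ fw_service }}"
    state: started
  tags: production

- name: Open the web service ports
  firewalld:
    service: "{{ item }}"
    immediate: true    # apply to the running configuration
    permanent: true    # persist across reloads and reboots
    state: enabled
  loop:
    - http
    - https
  tags: production
```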

 

Defining the main playbook
Create the main playbook.yml and start by targeting the hosts in the webservers host group. Define a block that imports the following three task files: install_packages.yml, configure_web.yml, and configure_firewall.yml, using the include statement.

For the first include, use install_packages.yml as the name of the file to import. Define the four variables required by the file: memory with a value of 256, web_package with a value of httpd, ssl_package with a value of mod_ssl, and web_service with a value of httpd.

For the task that imports the configure_web.yml file, define the following variables: https_uri with a value of http://materials.example.com/task_control/https.conf, and ssl_uri with a value of http://materials.example.com/task_control/ssl.tar.gz.

For the task that imports the configure_firewall.yml task file, add a condition to only import the tasks tagged with the production tag. Define the fw_package and fw_service variables with a value of firewalld.

Create the rescue clause for the block that installs the latest version of the httpd package and notifies the restart_services handler upon the package installation. Add a debug statement that reads: Failed to import and run all the tasks; installing the web server manually

Add an always statement that uses the shell module to query the status of the httpd service using systemctl.

Define the restart_services handler that uses a loop to restart both the firewalld and httpd services.

When completed, the playbook should read as follows:
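A sketch of the main playbook. It uses import_tasks in place of the include statement named in the text, on the assumption of a recent Ansible release in which the bare include statement is no longer available; the task names are illustrative.

```yaml
---
- hosts: webservers
  tasks:
    - block:
        - import_tasks: install_packages.yml
          vars:
            memory: 256
            web_package: httpd
            ssl_package: mod_ssl
            web_service: httpd

        - import_tasks: configure_web.yml
          vars:
            https_uri: http://materials.example.com/task_control/https.conf
            ssl_uri: http://materials.example.com/task_control/ssl.tar.gz

        - import_tasks: configure_firewall.yml
          vars:
            fw_package: firewalld
            fw_service: firewalld
          tags: production
      rescue:
        - name: Install the web server
          yum:
            name: httpd
            state: latest
          notify:
            - restart_services

        - debug:
            msg: "Failed to import and run all the tasks; installing the web server manually"
      always:
        - name: Check the status of the web service
          shell: systemctl status httpd

  handlers:
    - name: restart_services
      service:
        name: "{{ item }}"
        state: restarted
      loop:
        - httpd
        - firewalld
```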

 

Executing the playbook.yml playbook
Run the playbook.yml playbook to set up the environment. The playbook should:
• Import and run the tasks that install the web server packages only if there is enough memory on the managed host.
• Import and run the tasks that configure SSL for the web server.
• Import and run the tasks that create the firewall rule for the web server to be reachable.

Ensure the web server has been correctly configured by querying the home page of the web server (https://serverb.example.com) using curl with the -k option to allow insecure connections (bypass any SSL strict checking):