Ansible and Its Use Case: Configuring Hadoop and Starting Cluster Services with an Ansible Playbook

Ayushmilan
Sep 26, 2021

In this blog, I am going to describe how to set up a Hadoop cluster using Ansible.

Task Description📄

Configure Hadoop and start cluster services using Ansible Playbook.

Let’s get a basic idea about Ansible and Hadoop.

What is Ansible?

Ansible is an open-source tool for software provisioning, configuration management, and application deployment that enables infrastructure as code. It runs on many Unix-like systems and can configure both Unix-like systems and Microsoft Windows. It includes its own declarative language to describe system configuration.

What is Hadoop?

The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.

Let’s start…

First, check the Ansible configuration and confirm that the software needed for this Hadoop setup is available on the system. Then add the managed nodes to the inventory; these are the machines that will be turned into the name node and the data node.
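As a rough illustration, an inventory for this setup could look like the sketch below. The group names `namenode` and `datanode`, the IP addresses (taken from the documentation-only range), and the `root` login are all assumptions made for the example, not values from my systems.

```yaml
# inventory.yml (hypothetical file name)
all:
  vars:
    ansible_user: root        # assumed login user on the managed nodes
  children:
    namenode:                 # assumed group for the future name node
      hosts:
        192.0.2.10:           # placeholder address
    datanode:                 # assumed group for the future data node
      hosts:
        192.0.2.11:           # placeholder address
```

Keeping the two roles in separate groups lets each playbook target only the machines it is meant for.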

Check connectivity by pinging the managed nodes.
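The same check can also be written as a tiny playbook. This is only a sketch; note that Ansible's `ping` module is not an ICMP ping, it confirms that Ansible can log in to the node and run a module there.

```yaml
- name: Check connectivity to the managed nodes
  hosts: all
  gather_facts: false         # not needed for a simple reachability check
  tasks:
    - name: Confirm Ansible can log in and run a module
      ansible.builtin.ping:
```

A healthy node answers with `pong`.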

Now, let’s take a look at our name node playbook.
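A minimal sketch of what a name node setup playbook could look like is given below. It is not necessarily identical to mine. It assumes Hadoop 2 or later is already installed with the `hdfs` command on the PATH, an inventory group named `namenode`, and hypothetical values for the config directory, metadata directory, and port. On Hadoop 1.x the property names would be `dfs.name.dir` and `fs.default.name` instead.

```yaml
- name: Configure the Hadoop name node
  hosts: namenode                      # assumed inventory group
  vars:
    hadoop_conf_dir: /etc/hadoop       # assumed Hadoop config directory
    name_dir: /nn                      # assumed metadata directory
    namenode_port: 9001                # assumed name node port
  tasks:
    - name: Create the directory that holds name node metadata
      ansible.builtin.file:
        path: "{{ name_dir }}"
        state: directory
        mode: "0755"

    - name: Point HDFS at the metadata directory
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/hdfs-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.namenode.name.dir</name>
              <value>{{ name_dir }}</value>
            </property>
          </configuration>

    - name: Set the address of the name node
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/core-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.defaultFS</name>
              <value>hdfs://{{ inventory_hostname }}:{{ namenode_port }}</value>
            </property>
          </configuration>

    - name: Format the name node once (skipped when already formatted)
      ansible.builtin.command:
        cmd: hdfs namenode -format -nonInteractive
        creates: "{{ name_dir }}/current/VERSION"
```

The `creates` argument keeps the format step from running again on later runs, which matters because reformatting would wipe the existing HDFS metadata.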

Here, I created a second playbook to start the name node.
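A start playbook along these lines could be as small as the sketch below. It assumes an inventory group named `namenode` and that `hadoop-daemon.sh` and the JDK's `jps` tool are on the PATH of the managed node; it does not guard against the daemon already running.

```yaml
- name: Start the Hadoop name node
  hosts: namenode                      # assumed inventory group
  tasks:
    - name: Start the name node daemon
      ansible.builtin.command:
        cmd: hadoop-daemon.sh start namenode

    - name: List the running Java processes
      ansible.builtin.command:
        cmd: jps
      register: jps_output
      changed_when: false              # read-only check, never a change

    - name: Show the process list
      ansible.builtin.debug:
        var: jps_output.stdout_lines
```

If the start worked, a `NameNode` entry should appear in the process list.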

Now let's run both playbooks one after the other to set up and start the name node.
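As an alternative to running them by hand, the two playbooks could be chained from a small wrapper playbook. This is only a sketch, and the file names `namenode_setup.yml` and `namenode_start.yml` are hypothetical.

```yaml
# site.yml (hypothetical wrapper; the imported file names are assumptions)
- name: Set up the name node
  ansible.builtin.import_playbook: namenode_setup.yml

- name: Start the name node
  ansible.builtin.import_playbook: namenode_start.yml
```

The imports run in the order listed, so the setup always finishes before the start playbook begins.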

Here we can see that the top-right system, which is our name node, has started and is in the running state.

Now, let’s take a look at our data node playbook, which will set up and start the data node.
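A data node playbook could follow the sketch below; again, this is an illustration and not necessarily my exact playbook. It assumes Hadoop 2 or later with `hadoop-daemon.sh` on the PATH, inventory groups named `namenode` and `datanode`, and hypothetical paths and port. The port must match whatever the name node was configured with.

```yaml
- name: Configure and start the Hadoop data node
  hosts: datanode                              # assumed inventory group
  vars:
    hadoop_conf_dir: /etc/hadoop               # assumed Hadoop config directory
    data_dir: /dn                              # assumed block storage directory
    namenode_host: "{{ groups['namenode'][0] }}"   # first host of the assumed group
    namenode_port: 9001                        # assumed name node port
  tasks:
    - name: Create the directory that stores HDFS blocks
      ansible.builtin.file:
        path: "{{ data_dir }}"
        state: directory
        mode: "0755"

    - name: Point HDFS at the block storage directory
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/hdfs-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>dfs.datanode.data.dir</name>
              <value>{{ data_dir }}</value>
            </property>
          </configuration>

    - name: Tell the data node where the name node lives
      ansible.builtin.copy:
        dest: "{{ hadoop_conf_dir }}/core-site.xml"
        mode: "0644"
        content: |
          <?xml version="1.0"?>
          <configuration>
            <property>
              <name>fs.defaultFS</name>
              <value>hdfs://{{ namenode_host }}:{{ namenode_port }}</value>
            </property>
          </configuration>

    - name: Start the data node daemon
      ansible.builtin.command:
        cmd: hadoop-daemon.sh start datanode
```

The only real difference from the name node side is the storage property and the fact that `core-site.xml` points at the name node's address, which is how the data node finds and joins the cluster.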

It’s time to run this playbook to start the data node and connect it to the name node.

Here we can see that the bottom-right system, which is our data node, has started and is in the running state.

Now let's check the Hadoop report to confirm that the cluster is working correctly.
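The report can also be collected through Ansible. The sketch below assumes Hadoop 2 or later with the `hdfs` command on the PATH of every node in the inventory.

```yaml
- name: Check the HDFS cluster report
  hosts: all
  tasks:
    - name: Ask the cluster for its report
      ansible.builtin.command:
        cmd: hdfs dfsadmin -report
      register: hdfs_report
      changed_when: false              # read-only check, never a change

    - name: Show the report
      ansible.builtin.debug:
        var: hdfs_report.stdout_lines
```

A connected data node should be listed in the report together with its configured capacity.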

As we can see, both systems show the expected output, which means our Hadoop cluster is set up on these systems, and we did it all with the help of Ansible.

So, we have done it successfully.
See you in the next blog. You can read more blogs like this on my profile.
Thank you.

Ayushmilan

Associate Data Scientist, Technical Content Writer, GATE CSE(2022, 2023) Qualified