Integrating LVM with Hadoop

Mohd Sabir
3 min read · Mar 15, 2021


What is Hadoop?

Apache Hadoop is an open source framework that is used to efficiently store and process large datasets ranging in size from gigabytes to petabytes of data. Instead of using one large computer to store and process the data, Hadoop allows clustering multiple computers to analyze massive datasets in parallel more quickly.

What is LVM(Logical Volume Management)?

LVM is a storage-management framework in the Linux operating system, introduced to make physical storage devices easier to manage. The idea is much like virtualization: you can create as many logical storage volumes on top of a single storage device as you want, and the logical volumes you create can later be expanded or shrunk as your storage needs grow or decrease.

Elasticity

Elasticity refers to the ability to automatically expand or shrink infrastructure resources in response to sudden increases or decreases in demand, so that the workload can be managed efficiently.

Task Description 📄

🔅 Integrating LVM with Hadoop and providing Elasticity to DataNode Storage.

Let’s start doing our task:-

Prerequisite for this task:-

  • We need a Hadoop Cluster.

I have already configured a Hadoop cluster on top of the AWS cloud, so let’s connect to it.

  • To integrate LVM with Hadoop, we create a logical volume, format and mount it, and share that mount point with the master node through the DataNode. Because it is an LVM partition, we can increase or decrease its size on the fly and thereby get elasticity.

Let’s start:-

To create the LVM logical volume we use:

lvcreate --size 4G --name lv task
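Note that `lvcreate` needs an existing volume group. A minimal sketch of the steps that come before it, assuming the attached disk is `/dev/xvdf` (a hypothetical device name; check yours with `lsblk`) and the volume group is named `task`, matching the command above:

```shell
# Initialize the raw disk as an LVM physical volume
pvcreate /dev/xvdf

# Create a volume group named "task" on that physical volume
vgcreate task /dev/xvdf

# Then the command above creates a 4 GiB logical volume
# named "lv" inside the volume group "task"
lvcreate --size 4G --name lv task
```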

The LVM partition was created successfully; now we will format this partition and mount it.

Partition mounted successfully.
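The format-and-mount step might look like this, assuming an ext4 filesystem (an assumption; any Linux filesystem works) and `/mnt` as the mount point, since that is the directory the DataNode is configured to use:

```shell
# Format the new logical volume with ext4
mkfs.ext4 /dev/task/lv

# Mount it at /mnt, the directory given to the DataNode
mount /dev/task/lv /mnt

# Verify the mount and its size
df -hT /mnt
```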

Now we will provide this directory to our DataNode by editing hdfs-site.xml:

vi hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.data.dir</name>
<value>/mnt</value>
</property>
</configuration>

Save this file, format your DataNode, and start the services.
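Starting the service might look like this, assuming a Hadoop 1.x layout (which matches the `dfs.data.dir` property name used above; Hadoop 2+ renames it to `dfs.datanode.data.dir` and uses `hdfs --daemon start datanode`):

```shell
# Start the DataNode daemon
hadoop-daemon.sh start datanode

# Confirm the DataNode process is running
jps
```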

The DataNode has started; now let’s check how much storage it is sharing with the master node.
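One way to check, run on the master node, is the dfsadmin report:

```shell
# Ask the NameNode for a cluster storage report; the DataNode's
# "Configured Capacity" should be roughly the 4 GB we mounted
hadoop dfsadmin -report
```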

It is sharing only 4 GB of storage; let’s increase the size on the fly.

To increase the size, we have to extend the LVM partition:

lvextend --size +4G /dev/task/lv

Partition resized successfully.
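Keep in mind that `lvextend` grows only the logical volume; the filesystem on it must be resized separately before `df` will show the extra space. Assuming ext4 as above:

```shell
# Grow the ext4 filesystem to fill the extended logical volume;
# resize2fs can do this online, while /mnt stays mounted
resize2fs /dev/task/lv
```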

df -hT

Let’s check whether it is now sharing 8 GB with the master node.

Yeah, it’s working fine.

To learn more about LVM, you can follow the link below:

https://sabir69261.medium.com/increase-or-decrease-the-size-of-static-partition-in-linux-58ffc320f017

Thanks for reading


Written by Mohd Sabir

DevOps Enthusiast || Kubernetes || GCP || Terraform || Jenkins || Scripting || Linux. Don’t hesitate to contact me at: https://www.linkedin.com/in/mohdsabir
