An open-source framework written in Java which allows users to store as much as terabytes or even petabytes of Big Data – both structured and un-structured – across a cluster of computers. The unique storage mechanism which uses a distributed file system (HDFS) to map data across any part of a cluster.
This is the module which offers a key selling point of Hadoop, as it ensures scalability. When data is received by Hadoop it is executed over three different stages:
The structure of Hadoop means that it can scale horizontally, unlike traditional relational databases. This is because the data can be stored across a cluster of servers, from a single server to hundreds.
Faster data processing is made possible by the distributed file and powerful mapping offered by Hadoop.
Both your structured and unstructured data can be used to generate value by Hadoop. It can draw useful insights from sources such as social media, daily logs and emails.
The data stored by Hadoop is stored in replicate form across different servers in multiple locations, which increases reliability.