Skip to main content

One post tagged with "hadoop"

View All Tags

· 6 min read

Apache Hadoop is an open source software framework for storage and large scale processing of data-sets on clusters of commodity hardware. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage.

Hadoop components

Hadoop is divided into two core components

  • HDFS: a distributed file system;
  • YARN: a cluster resource management technology.