blog




  • Essay / Google File System - 581

    Google File System (GFS) was developed by Google to meet its large-scale data processing needs. Hadoop's Distributed File System (HDFS) was originally developed at Yahoo!, but is now maintained as open source by the Apache Software Foundation; it was modeled on Google's GFS and MapReduce papers. As Internet data grew rapidly, it became necessary to store this incoming big data, so distributed file systems like GFS and HDFS were developed to meet those needs. Both are built on commodity hardware, so component failures are common rather than exceptional. To keep the systems reliable, data is replicated across multiple nodes; by default, each chunk has 3 replicas. Workloads on these file systems typically involve millions of files, many of them large. Data is read far more often than it is written, and both large streaming reads and small random reads are supported.

    How GFS works: a GFS cluster consists of a single master node and multiple chunkservers, accessed by many clients. The client asks the master for the location of a chunk by sending the file name and the chunk index it needs. The master stores the name...
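
    The client-to-master lookup described above can be sketched as a minimal in-memory model. This is only an illustration under stated assumptions, not the real GFS API: the names here (ChunkMaster, lookup, the server addresses) are hypothetical, while the 64 MB chunk size and 3-way replication come from the GFS design.

    ```python
    CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses fixed-size 64 MB chunks
    REPLICATION = 3                # default number of replicas per chunk

    class ChunkMaster:
        """Toy model of the GFS master: maps (file, chunk index) to replica locations."""

        def __init__(self):
            # file name -> list of chunk handles, one per chunk index
            self.chunks = {}
            # chunk handle -> chunkserver addresses holding a replica
            self.locations = {}

        def add_chunk(self, filename, handle, servers):
            self.chunks.setdefault(filename, []).append(handle)
            self.locations[handle] = servers

        def lookup(self, filename, chunk_index):
            """What the client asks: file name + chunk index in,
            chunk handle + replica locations out."""
            handle = self.chunks[filename][chunk_index]
            return handle, self.locations[handle]

    def chunk_index_for_offset(offset):
        """Clients translate a byte offset in the file into a chunk index."""
        return offset // CHUNK_SIZE

    # Hypothetical cluster state: one file spanning two chunks.
    master = ChunkMaster()
    master.add_chunk("/logs/web.log", "handle-0", ["cs1", "cs2", "cs3"])
    master.add_chunk("/logs/web.log", "handle-1", ["cs2", "cs4", "cs5"])

    # A client reading at byte offset 70 MB first computes the chunk index...
    idx = chunk_index_for_offset(70 * 1024 * 1024)
    # ...then asks the master, and afterwards contacts one of the returned
    # chunkservers directly for the actual data (the master never serves data).
    handle, replicas = master.lookup("/logs/web.log", idx)
    print(handle, replicas)
    ```

    The key design point this sketch reflects is that the master holds only metadata; file contents flow between clients and chunkservers, keeping the single master from becoming a data-path bottleneck.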