Google File System



EOI: 10.11242/viva-tech.01.04.191

Download Full Text here



Citation

Pramod A. Yadav ,Prof. Krutika Vartak, "Google File System ", VIVA-IJRI Volume 1, Issue 4, Article 191, pp. 1-6, 2021. Published by Computer Engineering Department, VIVA Institute of Technology, Virar, India.

Abstract

Google File System was innovatively created by Google engineers and it is ready for production in record time. The success of Google is to attributed the efficient search algorithm, and also to the underlying commodity hardware. As Google run number of application then Google’s goal became to build a vast storage network out of inexpensive commodity hardware. So Google create its own file system, named as Google File System that is GFS. Google File system is one of the largest file system in operation. Generally Google File System is a scalable distributed file system of large distributed data intensive apps. In the design phase of Google file system, in which the given stress includes component failures , files are huge and files are mutated by appending data. The entire file system is organized hierarchically in directories and identified by pathnames. The architecture comprises of multiple chunk servers, multiple clients and a single master. Files are divided into chunks, and that is the key design parameter. Google File System also uses leases and mutation order in their design to achieve atomicity and consistency. As of there fault tolerance, Google file system is highly available, replicas of chunk servers and master exists.

Keywords

Design, Scalability, Fault tolerance, Performance, Measurements

References

  1. Thomas Anderson, Michael Dahlin, Jeanna Neefe, David Patterson, Drew Roselli, and Randolph Wang. Serverless network file systems. In Proceedings of the 15th ACM Symposium on Operating System Principles, pages 109–126, Copper Mountain Resort, Colorado, December 1995.
  2. Remzi H. Arpaci-Dusseau, Eric Anderson, Noah Treuhaft, David E. Culler, Joseph M. Hellerstein, David Patterson, and Kathy Yelick. Cluster I/O with River: Making the fast case common. In Proceedings of the Sixth Workshop on Input/Output in Parallel and Distributed Systems (IOPADS ’99), pages 10–22, Atlanta, Georgia, May 1999.
  3. Luis-Felipe Cabrera and Darrell D. E. Long. Swift: Using distributed disk striping to provide high I/O data rates. Computer Systems, 4(4):405–436, 1991.
  4. Garth A. Gibson, David F. Nagle, Khalil Amiri, Je? Butler, Fay W. Chang, Howard Gobio?, Charles Hardin, Erik Riedel, David Rochberg, and Jim Zelenka. A cost-e?ective, high-bandwidth storage architecture. In Proceedings of the 8th Architectural Support for Programming Languages and Operating Systems, pages 92–103, San Jose, California, October 1998.
  5. John Howard, Michael Kazar, Sherri Menees, David Nichols, Mahadev Satyanarayanan, Robert Sidebotham, and Michael West. Scale and performance in a distributed file system. ACM Transactions on Computer Systems, 6(1):51–81, February 1988.
  6. InterMezzo. http://www.inter-mezzo.org, 2003.
  7. Barbara Liskov, Sanjay Ghemawat, Robert Gruber, Paul Johnson, Liuba Shrira, and Michael Williams. Replication in the Harp file system. In 13th Symposium on Operating System Principles, pages 226–238, Pacific Grove, CA, October 1991.
  8. Lustre. http://www.lustreorg, 2003.
  9. David A. Patterson, Garth A. Gibson, and Randy H. Katz. A case for redundant arrays of inexpensive disks (RAID). In Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data, pages 109–116, Chicago, Illinois, September 1988.
  10. rank Schmuck and Roger Haskin. GPFS: A shared-disk file system for large computing clusters. In Proceedings of the First USENIX Conference on File and Storage Technologies, pages 231–244, Monterey, California, January 2002.
  11. Steven R. Soltis, Thomas M. Ruwart, and Matthew T. O’Keefe. The Gobal File System. In Proceedings of the Fifth NASA Goddard Space Flight Center Conference on Mass Storage Systems and Technologies, College Park, Maryland, September 1996.
  12. Chandramohan A. Thekkath, Timothy Mann, and Edward K. Lee. Frangipani: A scalable distributed file system. In Proceedings of the 16th ACM Symposium on Operating System Principles, pages 224–237, Saint-Malo, France, October 1997.