Chapter 8

Uploaded by

Ami
Distributed File System

CONTENTS
* Introduction to DFS
* File models
* DFS design
* Semantics of file sharing
* DFS implementation
* File caching in DFS
* Replication in DFS
* Case studies
  > Sun Network File System
  > Google File System

Learning Objectives
This chapter will enable you to understand:
* Basic concepts of a Distributed File System (DFS), its functions, components, and features
* File models, such as structured, unstructured, mutable, and immutable files
* Various issues in file system design, such as the file service interface and the directory service interface
* Concepts of naming transparency and semantics of file sharing
* Techniques of DFS implementation, file caching, and replication

9.1 Introduction to DFS
A file system is a part of the OS that performs file management functions, such as organization, storage, retrieval, naming, sharing, and protection of files. The complexities of space allocation and storage on the secondary device are hidden from the programmer.

9.1.1 Functions of DFS
A Distributed File System (DFS) makes it convenient for the users of a distributed system to use files in a distributed environment. Compared to a traditional file system used on a standalone machine, a DFS introduces more complexity because the users and devices are physically distributed across locations. Apart from permanent storage and information sharing, a DFS should provide additional features, such as remote information sharing, user mobility, availability, and support for diskless workstations.
* The first feature is remote information sharing, which implies that any file should be transparently accessible from any node, irrespective of the location of the file.
* The second feature is user mobility, i.e., the system should be flexible such that one can work from any node at any instant of time without the need to relocate any storage device.
  This facility suits people who need to access files while working from different locations at different times.
* The third feature is availability, which implies that the files should always be available in spite of any temporary failure. The DFS must therefore maintain multiple copies on different nodes, which are called replicas. The number of replicas is hidden from the users.
* A DFS should also support diskless workstations. These workstations are economical, less noisy, and generate less heat because they allow disks to be separated from the workstation. A RAID array can be kept at a central location as a repository of files. A DFS should support remote file access capability transparently.

The major functions of a DFS include permanent storage, remote information sharing, user mobility, availability, and support for diskless workstations.

9.1.2 Components of DFS
As discussed, a DFS facilitates remote access. The major components of a DFS are storage service, true file service, and name service.
* Storage service is related to the allocation and management of space on the secondary storage. It provides a logical view of storage to the users by providing operations for storage and retrieval of data.
* True file service provides operations on individual files, such as access, modification, creation, deletion, etc. To carry out these operations correctly and efficiently, issues such as file access mechanism, file-sharing semantics, replication, concurrency control, data consistency, etc., need to be considered. All these issues are discussed in subsequent sections.
* Name service is another component, which enables the users to identify files easily with text names that are mapped to internal file IDs used to locate the files. This service is also called directory service, and performs operations such as create, delete, add, etc.
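The mapping performed by the name service can be illustrated with a minimal sketch. It assumes a single in-memory table; the class name `NameService`, its method names, and the use of small integers as file IDs are hypothetical choices made for illustration, not part of any particular DFS.

```python
import itertools


class NameService:
    """Toy name (directory) service: maps text names to internal file IDs."""

    def __init__(self):
        self._next_id = itertools.count(1)  # hypothetical ID scheme: 1, 2, 3, ...
        self._table = {}                    # text name -> internal file ID

    def create(self, name):
        """Register a new name and return the file ID assigned to it."""
        if name in self._table:
            raise FileExistsError(name)
        file_id = next(self._next_id)
        self._table[name] = file_id
        return file_id

    def lookup(self, name):
        """Return the file ID for a name; raises KeyError if it is unknown."""
        return self._table[name]

    def delete(self, name):
        """Remove a name and return the file ID it was mapped to."""
        return self._table.pop(name)


ns = NameService()
report_id = ns.create("report.txt")
print(ns.lookup("report.txt") == report_id)  # True
```

The storage service and the true file service would then work only with the file ID, so a file could be renamed without touching its stored data.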
This chapter deals with the design and implementation issues of the true file service component of a DFS. The aspects of a DFS which are different from conventional file systems are also discussed here. However, before understanding the concepts of DFS, it is important to understand the major difference between a file service and a file server.

A file service is a specification of what the file system offers to the users. It describes the primitives available, the parameters taken, and the actions performed. The users can gauge what to expect from the file system, but the service does not specify the methods of implementation. In short, the file service specifies the file system's interface to the user.

A file server, on the other hand, is a process running on a machine which implements the file service. A machine can run one or multiple file servers, but the location and function of each one should be specified. Users call the specified procedures, the work is performed, and the results are returned to the user. Users should not be aware of the fact that the file system is distributed. It provides the view of a conventional (single-processor) file system to the user. Each of the servers running on a system may offer different file services for different operating systems. So a terminal can have multiple windows, and each one can be used to work with a different file server.

9.1.3 Desirable Features of a Good DFS
A DFS enables the users to access files remotely. Such a system should have the following features.

Transparency
One of the most important desirable features is transparency. There are four different types of transparency desired in any DFS. The first is structure transparency, which is useful to achieve performance, scalability, and reliability. A DFS should use multiple file servers to do so. Each file server is a user or a kernel process, controlling a set of secondary storage devices. This multiplicity should be transparent to the users.
Second, access transparency enables local and remote files to be accessed in the same way. It implies that the file system should locate a remote file and arrange for transfers without the user being aware of it. The third is naming transparency, which implies that the name should not indicate the location of the file. File movement should be possible without changing the name of the file. The fourth is replication transparency. When the file is replicated on multiple nodes, the number of copies and their locations should be hidden from the users.

User mobility
Users should be able to work from any node, and the system should exhibit the same behavior regardless of the node. This can be achieved by bringing the user's environment to the node where the user is trying to log in.

Performance
Performance is the next important feature. It is measured by the amount of time needed to satisfy user requests.

Ease of use
A good DFS should be simple and easy to use. The semantics and the interface should be easy and user friendly, and should support a large number of applications.

Scalability
Most distributed systems span across locations. Hence, a good DFS should be scalable and should support the growth of nodes and users without disruption of service or loss of performance.

Availability
A DFS should also be available. This implies that the system should continue to function even in case of partial failure. Some degradation may be allowed, but the entire system should not break down. One of the solutions to tackle such a breakdown is to replicate the files.

Reliability
Closely related to these features is another desired feature: reliability. It is important to minimize the loss of stored data, thus reducing the load on users to create their own backups. The system should create backups automatically, which will prove useful in the event of loss of original files.

Integrity of data
A good DFS should ensure the integrity of data. Concurrency control mechanisms must be used to allow multiple users to access files. Atomic transactions should also be used to maintain the data integrity of files.
Security
Since the DFS will be used by many users, it is important to maintain security. Appropriate security mechanisms must be implemented to prevent unauthorized access, so that the users are confident of the privacy of their data.

Support for heterogeneous systems
A distributed system comprising heterogeneous machines provides flexibility to the users to work on different computing platforms and applications. Hence, a DFS must support users working in a heterogeneous environment. All such machines should be able to share files and integrate different storage media.

If a DFS possesses all the desirable features, the distributed system becomes efficient and easy to use.

The desirable features of a good DFS include transparency, user mobility, performance, ease of use, scalability, availability, reliability, data integrity, security, and the ability to support heterogeneous systems.

9.2 File Models
A DFS allows access to remote files as if the files are stored on the same machine. File models decide the performance of a DFS, since they are based on how the data is structured in the file and what method is used for modification. The file models are classified based on the structure and how they can be modified, as shown in Figure 9-1.

Figure 9-1 File models (by structure: structured and unstructured; by modifiability: mutable and immutable)

9.2.1 Structured and Unstructured Files
Based on the structure, file models can be classified as unstructured and structured. In the former model, as the name suggests, the file server understands the file as a pure sequence of bytes. Modern operating systems use this structure. Since the files have no structure, the applications can interpret the file in their own way. The other file model is the structured file model. Here, the file appears as an ordered sequence of records, possibly of variable sizes.
These records can be indexed so that each record is accessed depending on its specific position in the file. In the indexed file system, a record has a key field, and the file is accessed by specifying this key. The file is maintained as a tree or as a hash table.

9.2.2 Mutable and Immutable Files
Files have attributes, the information which describes the file. Each attribute has a name and a value. Typical attributes are owner, size, permission, date of creation, date of last modification, etc., which are specified by the file system. Some of these attributes can be modified by the user. Based on how the files can be modified, they are classified as mutable and immutable files. Most current operating systems use the mutable model, where updates on a file overwrite the old contents to produce the new contents. In the immutable file model, files cannot be modified once created, but can only be deleted. The Cedar file system used this type of file. Each file can have multiple versions, which are maintained separately. Practical implementation of such a model increases the disk space use and allocation activity. However, one advantage of immutable files is that file caching and replication become easier.

The file model plays a very important role in designing a DFS. In the next section, we describe the issues in designing a DFS.

Based on the structure, file models can be classified as unstructured (where the file is a pure sequence of bytes) and structured (where the file is an ordered sequence of records). Files are also classified as mutable (the contents of a file can be overwritten) and immutable (a file can never be modified once created).

9.3 DFS Design
A DFS is an essential component of a distributed system. A DFS basically consists of two major components: true file service and directory service, which we describe in this section.
The file service provides operations on individual files, like reading, writing, etc., whereas the directory service is concerned with creating and managing directories, adding and deleting files from directories, etc.

9.3.1 File Service Interface
The file service interface enables the users to access files in a distributed environment. File accessing models decide how the user's request for file access is serviced. The file access model depends on two factors: the method for remote file access, and the unit of data access, as shown in Figure 9-2.

Figure 9-2 File service interface (method of remote file access: remote access, data caching; unit of data access: file-level, block-level, byte-level, and record-level transfer)

Based on method of remote file access
Based on the method used for remote file access, the models can be classified as remote access model and data caching model. A distributed file system may use either of the models to access remote files.

Remote access model
In this model, the user's request is performed at the server's node, and the result is returned to the user; the file itself stays on the server. The request and response messages are transferred across the network as data packets along with communication overheads, as shown in Figure 9-3. Therefore, a remote file service model's interface and the communication protocols must be designed carefully to minimize the overheads attached to the number of messages required to satisfy the request. Typical examples of this type of access model are Locus and Network File System (NFS). Remote file access always results in network traffic.

Figure 9-3 Remote file access (requests from the client to access a remote file; the file stays on the server)

Data caching model
The other model for file access is the data caching model or the upload/download model, shown in Figure 9-4.
This model uses the locality feature of data access. On the first request for a file from the user, the data is brought to the user's node and cached, and subsequent requests are satisfied locally. Cache space can be made available using the Least Recently Used (LRU) replacement policy. This access model is implemented in the Sprite distributed system. As compared to the remote access model, the data caching model reduces network traffic. However, caching gives rise to consistency issues in case multiple users start writing to the same file. Caching is commonly used in most DFS because of its advantages, such as increased performance and greater scalability.

Figure 9-4 Data caching model (the file is moved to the client; accesses are done on the client; when the client is done, the new file is returned to the server)

When a user requests a file, the unit of data transfer refers to the part of data transferred in a single read or write operation.

Based on the method used for remote file access, the models can be classified as remote access model and data caching model.

Unit of data access
Based on the unit of data transfer, the various data transfer models are: file-level transfer, block-level transfer, byte-level transfer, and record-level transfer.

File-level transfer
In this model, the entire file is moved when a user makes a request for a file. This model has various advantages. It is more efficient to transmit the entire file at once than to transmit it page by page in response to multiple requests, since the network protocol overhead is incurred only once. This model assists in scalability because fewer accesses are made to the file server, resulting in decreased network traffic and server load. The disk scheduling routine can be optimized because it is known that the entire file has to be transferred. Once the entire file is cached at the user's node, it is immune to server or network failures. This model supports heterogeneous workstations because it is easy to transform the entire file into a form which is compatible with the requesting workstation. The drawback of this model is that more storage space is required on the user systems. This model is also inconvenient for large files.

Block-level transfer
In this model, as the name implies, blocks (contiguous portions of a file, fixed in length) are transferred on user request. If the virtual memory page size tallies with the file block size, it is called page-level transfer. This model has the advantage that it does not require large storage space, because the entire file is not copied when small portions are required. Hence, this is suitable for diskless workstations. It provides them with a large virtual memory when they do not have their own hard disks. But in a worst-case scenario, if the entire file is required, multiple requests lead to network traffic and network protocol overhead. This results in poor performance as compared to the file-level transfer model. Sun's NFS and Sprite use the block-level transfer model.

Table 9-1 Comparison of file transfer models

Type of transfer | Unit of transfer | Advantages | Disadvantages
File-level | File | Fewer accesses to file server and reduces network traffic. Good for small-sized files. Supports heterogeneous environments. | Not suitable for diskless workstations. Not suitable for large-sized files. Network bandwidth and storage space are wasted if only a small part of the file is needed.
Block-level | Block | Storage space is saved. Suitable for diskless workstations. | For large files, there is a need to make multiple requests for accessing the same file. Increases network traffic.
Byte-level | Byte | Flexibility for any range of data storage and retrieval. | Cache management is difficult.
Record-level | Record | Easier to protect data. Ideal for database environment. | Increases network traffic in case a large number of records have to be accessed.
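The trade-off between file-level and block-level transfer can be made concrete with a small sketch that counts server requests. The class `FileServer`, the helper `fetch_whole_file_by_blocks`, and the tiny `BLOCK_SIZE` are hypothetical and chosen only for illustration; a real server would exchange network messages rather than make in-process calls.

```python
BLOCK_SIZE = 4  # hypothetical block length in bytes, kept tiny for illustration


class FileServer:
    """Toy file server that counts how many requests it has served."""

    def __init__(self, files):
        self.files = files      # file name -> bytes
        self.requests = 0

    def read_file(self, name):
        """File-level transfer: the whole file in one request."""
        self.requests += 1
        return self.files[name]

    def read_block(self, name, block_no):
        """Block-level transfer: one fixed-length block per request."""
        self.requests += 1
        start = block_no * BLOCK_SIZE
        return self.files[name][start:start + BLOCK_SIZE]


def fetch_whole_file_by_blocks(server, name, size):
    """Worst case for block-level transfer: the entire file is needed."""
    n_blocks = -(-size // BLOCK_SIZE)  # ceiling division
    return b"".join(server.read_block(name, i) for i in range(n_blocks))


server = FileServer({"notes": b"0123456789"})
whole = server.read_file("notes")
file_level_requests = server.requests

server.requests = 0
pieces = fetch_whole_file_by_blocks(server, "notes", len(whole))
block_level_requests = server.requests

print(file_level_requests, block_level_requests, whole == pieces)  # 1 3 True
```

If only the first block were needed, the block-level client would issue a single request and store 4 bytes instead of 10, which is the case where block-level transfer saves storage space.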
Byte-level transfer
In this model, the unit of data transfer is a byte. This level of transfer provides maximum flexibility because storage and retrieval is possible for any sequential sub-range of a file by specifying the offset within the file and the length. Cache management is difficult due to the variable size of access requests. The Cambridge file server uses the byte-level transfer model.

Record-level transfer
This model suits the structured file model, where the unit of transfer is a record. The Research Storage System uses this level of transfer.

Files can be protected using capabilities and access control lists. Each user has a capability for each object to enable access. The capability describes the type of permissible accesses. An access control list specifies the list of users who can access the file and how they can access it. For example, in the UNIX system, it is possible to provide read, write, and execute bits for the owner, the owner's group, and all other users.

Various file transfer models based on their unit of data access are compared in Table 9-1. After describing the file service interface, we next describe the directory service interface.

Based on the unit of transfer, file data transfer can be carried out at file-level, block-level, byte-level, and record-level.

9.3.2 Directory Service Interface
The directory service provides operations for creating and deleting directories, naming and renaming files, and moving across directories. It is independent of how the file service is implemented, i.e., whether files are transferred in one piece or accessed remotely. The directory service defines the alphabet and the syntax for composing file and directory names.

In a distributed system, users can combine related files together by using directories and subdirectories. The file name consists of letters, numbers, and special characters. The file name may have an extension (traditionally up to three characters), such as doc, txt, etc., which identifies the type of the file. Explicit attributes can be used to define the file type, instead of using extensions.

Various operations can be provided to work with directories, such as create, delete, enter, remove, and look up. Just as in a Windows or Linux environment, when you click on a directory, all the subdirectories under that directory are listed. A subdirectory may be further divided to the next level, leading to a tree structure called the hierarchical file system, depicted in Figure 9-5.

Figure 9-5 Hierarchical file system

In a distributed system, files on different machines can be linked together. As shown in Figure 9-6, links or pointers to arbitrary directories can be created, resulting in trees and directory graphs. There is a distinct difference between trees and graphs in a distributed system. Figure 9-6 depicts various directories, A, B, C, which are linked either to B, C, D, or E. Apart from the hierarchical tree structure, directory D also links to directory B.

Figure 9-6 Linked directory structure (each directory shows the number of directories pointing to it)

In a tree-structured directory, the link from A to B can be removed only if directory B is empty. But in a directed graph, the link from A to B can be removed as long as at least one other link to B remains. The number of links can be tracked and maintained by a reference count, as shown in the upper right-hand corner of the directory in the diagram. If the link from A to B is removed, the reference count of B is changed from 2 to 1. Directory B can then no longer be reached from the root. The directories B, D, and E, and their files, are effectively orphans. To detect orphans in centralized systems, all file activity is stopped, and the graph is traversed from the root to mark all the reachable directories. All unmarked directories are known to be unreachable.
This problem becomes critical in distributed systems because it is expensive to discover orphan directories which are distributed across machines. It is difficult to halt the system to get a snapshot of reachable directories.

This brings us to the key issue in DFS: whether all the machines have the same view of the directory hierarchy. This is explained in Figure 9-7(a), which consists of two servers, each holding directories and some files. The squares are directories and the circles are files. Figure 9-7(b) depicts all clients having the same view of the file system. It is easy to program and understand. Figure 9-7(c) represents a system where different clients have different views of the file system. Such a file system resides on machines that manage file servers by remote mounting. This method is flexible and easy to implement, but does not behave like a time-sharing system.

Figure 9-7 View of directory hierarchy: (a) two file servers; (b) all clients have the same view of the file system; (c) clients have a different view of the file system

Another related design issue is the need for a global root directory which all machines recognize as the root, and which contains one entry for every server. The path defined as /server/path is uniform across the system. Thus, the directory service interface can either be a hierarchical file system or a linked directory structure. The DFS should also support naming transparency, which is described in the next section.

The directory service interface provides operations for creating and deleting directories, naming and renaming files, and moving across directories. It is implemented using either a hierarchical file system or a link-based directory structure.
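The reference counts and the orphan detection discussed for linked directory structures in this section can be sketched as follows. The link layout is an assumption reconstructed from the description of Figure 9-6 (A links to B and C, B links to D and E, and D links back to B); the function names are hypothetical.

```python
def reference_counts(links):
    """Count how many directories point to each directory."""
    counts = {d: 0 for d in links}
    for targets in links.values():
        for target in targets:
            counts[target] += 1
    return counts


def reachable(links, root):
    """Mark every directory that can be reached from the root."""
    marked, stack = set(), [root]
    while stack:
        d = stack.pop()
        if d not in marked:
            marked.add(d)
            stack.extend(links[d])
    return marked


# Assumed layout: A is the root; D also links back to B.
links = {"A": ["B", "C"], "B": ["D", "E"], "C": [], "D": ["B"], "E": []}

before = reference_counts(links)["B"]
links["A"].remove("B")                  # remove the link from A to B
after = reference_counts(links)["B"]
orphans = set(links) - reachable(links, "A")

print(before, after, sorted(orphans))   # 2 1 ['B', 'D', 'E']
```

The reference count of B stays above zero because of the link from D, so a count alone does not reveal the orphans; only the traversal from the root does. In a distributed system, `links` would be spread over several servers, which is why taking a consistent snapshot for the traversal is expensive.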
9.3.3 Naming Transparency
Apart from the file service interface and the directory service interface, the other design issue in DFS design is naming transparency. The principal transparency issue related to naming is location transparency, which implies that the file name does not indicate the location of the file. For example, /server4/dir1/dir4/y tells that the directory is located on Server 4, but the user is unaware of the location of Server 4. The server can be moved without changing the path name. If the file y is very large and cannot be accommodated on Server 4, it can be located on Server 1 directly. This cannot be done automatically because the first component of all path names is the server name; /server1/dir1/dir4/y becomes the new path name of the file y. This is shown in Figure 9-8. Systems in which files can be moved without changing their names are said to have location independence. It is a desirable property, but sometimes is difficult to achieve.

Figure 9-8 Naming transparency

There are three approaches to file and directory naming in a distributed system, as depicted in Figure 9-9: two-level naming, mounting a remote file system onto the local file hierarchy, and a single name specification that looks the same on all machines. As the names suggest, the first two methods are easy to implement, while the latter method requires careful design because it attempts to make the distributed system look and act like a single computer.

Figure 9-9 Approaches to directory naming (two-level naming, mounting remote file system, single name specification)

To achieve naming transparency, a two-level naming scheme can be used, where level one is a symbolic name used by people (because binary names are cumbersome to remember), while level two is a binary name to be used by the system itself. Directories are used to provide a mapping between these two levels.
When the user references a symbolic name, the system looks it up in the appropriate directory to get the binary name, which is used to locate the file. The binary name varies in different systems, but in a system with multiple servers, each of which is self-contained, it can be just a local number. A simple way to form the binary name is to have it indicate both the server and the specific file on that server. This scheme allows a directory on one server to hold a file which is stored on another server.

The other alternative is to use a symbolic link: a directory entry which maps to a (server, file name) string and can be looked up on the named server to locate the binary name. This symbolic name is basically a path name.

Yet another way is to use capabilities as binary names. Looking up ASCII names yields a capability, which can contain either a physical or logical machine number, or a network address of the appropriate server, and a number which indicates the file required. The physical address is used to send messages to the server, and the logical address can be located by broadcasting or by consulting a name server. It is quite likely that looking up ASCII names provides several binary names, which represent replicas and/or backups. From these names, any one file can be located. This method provides fault tolerance through redundancy. These are the various ways in which naming transparency can be achieved.

The principal transparency issue related to naming is location transparency. The three approaches to file and directory naming in a distributed system are machine and path naming, mounting a remote file system onto the local file hierarchy, and a single name specification that looks the same on all machines.

9.4 Semantics of File Sharing
A shared file can be simultaneously accessed by multiple users. It is necessary to define the semantics of reads and writes precisely, i.e., when modifications made by the user to a file will be made visible to other users. Consider an example of a single processor system which permits file sharing. In case a Read operation follows a Write operation, the Read returns the value just written. This is shown in Figure 9-10(a). Sending all file requests to a single server, which processes them sequentially, is not desirable in practice because of poor performance, poor scalability, and poor reliability. This problem can be solved by allowing clients to maintain local copies of frequently used and large-sized files in their caches. As shown in Figure 9-10(b), client machine 1 modifies a local copy of a file. If client machine 2 reads the file from the server shortly afterwards, it will get an obsolete copy from the server.

Figure 9-10 (a) Single machine: a Read that follows a Write gets the most recent value written; (b) a distributed system with caching: obsolete values are accessed

The various types of file-sharing semantics shown in Figure 9-11 are UNIX file semantics, session semantics, immutable shared-file semantics, and transaction-like semantics.

Figure 9-11 File-sharing semantics (UNIX file semantics, session semantics, immutable shared-file semantics, transaction-like semantics)

9.4.1 UNIX File Semantics
First let us understand the UNIX file semantics. In a single processor system, the file-sharing semantics enforces absolute time ordering, and ensures that when a Read operation follows a Write operation, the Read always returns the value last written. Even when two Writes take place in quick succession followed by a Read, the value read is the value which is stored by the last Write.
This model is ideal for single processor systems because it is easy to serialize all Read/Write requests. In a distributed system, the UNIX semantics is achieved if it uses only one file server, and the clients do not cache files. The read or write requests from the clients go directly to the server, and it processes them sequentially. Even then, network delays may cause a Read that was issued after a Write to reach the server first, so that the Read returns an older value. The UNIX semantics model is therefore difficult to achieve in a distributed file system (even where the shared file is handled by a single server). Applications that need to guarantee UNIX semantics for correct operation must use locks or other special means, and not just depend on the underlying file system for providing sharing.

9.4.2 Session Semantics
Ideally, the client must propagate all changes to the server the moment they are made at the client side. However, this is easier said than done, since it may result in increased network congestion. The other way to go about it is to relax the file-sharing semantics, so that a Read need not see the effects of all previous Writes. Instead, a new rule is introduced, namely: changes made to an open file are visible only to the process which modified the file. Once the file is closed, the changes are made visible to other processes. This does not alter the sequence of operations shown in Figure 9-10(b), but it redefines the observed behavior as correct. When the file is closed, the changes are sent to the server, and the subsequent reads get the new values. This rule is called session semantics; the name is derived from session, which is a series of file accesses done between the open and close operations. Other open instances of the file do not reflect the changes.

Using session semantics, multiple clients can perform both Read and Write accesses concurrently on the same file. Each client works with an image of the file. When one client closes the file, the others are still accessing an older copy of the file. When a session is closed, the image is sent back to the server for the update operation.
Hania the final imas* Scanned with CamScanner GED Osut00 ‘Computing 9.5 DFS \mplementation : Iterative lookup Figure 9-12 DFS system |! ~ s] jong users, hy Inthe carlier section, we have discussed how the files can i ee ne ‘ a thy section, we will sce how these systems are implemented. While sane which a useful to understand how the files will be used and the common ope! are be carried out on these files. . a Based on the research work carried out in this area, various obsenstiene _ com: to light, First, for those files which are less than 10 kb, it is feasible to transfer the entir, file, and not just disk blocks, between the client and the server. In rare cases, when th files are too large, they can be treated separately as abnormal cases. Second, most files have short lifetimes. Files are often created, read, and then deleted Take the case of acompiler which creates temporary files. Such files are created on the client and kept there until they are deleted, thus reducing network traffic. Third, few files are normally shared, which means that the client caching is a good option. Using session semantics leads to better performance. There is a clear existence of distinct file classes, and therefore, different mechanisms can be used to handle these classes. System binaries are widely used, they never change, and therefore they can be replicated even though the occasional update is a complex process. Compilers or other temporary files are small in size. never shared, and disappear quickly, and hence, they can be kept locally, As the email boxes are frequently updated, their replication is use~ less, Ordinary files are to be carefully handled, iis they may be shared. In this section, we briefly discuss the structure-related issues in DFS implementation. 9.5.1 DFS System Structure ction, we discuss various approaches to organization of file and directory sev ers, First Ict us understand the difference between a client and a server, In some systems. 
the machine can act either as a client or as a server, depending on the service it offers or the service it receives. In other systems, there may be no distinction between clients and servers. The file server and the directory server may also be user programs, so the system can be configured to run client or server software on the same machine. For example, a file server offers a file service by exporting the names of the selected directories which are open to access. In some systems, however, the client and the server run on fundamentally different machines and may even use different operating systems. Here, the client should know which server holds which directory, and hence more messages are required.

Figure 9-12 shows the methods of DFS lookup. In a distributed environment, mapping a symbolic name to a binary name on a directory server and then accessing the file on a file server can be done on the same machine or on different machines. Keeping them on separate machines is more flexible because the two functions are unrelated, which makes the software simpler. On the other hand, involving two servers in every lookup definitely increases network traffic.

Figure 9-12 DFS system lookup

Figure 9-13 (a) Iterative lookup and (b) Automatic lookup

The other implementation issue depends on how the file and directory services are structured. The client gives a symbolic name to the directory server, which returns a binary name that the file server can understand.
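The two-step access just described, a directory lookup followed by a file access, can be sketched as follows. This is a minimal sketch under assumptions: the classes DirectoryServer and FileServer, the integer binary name 4711, and the path /docs/myfile are all hypothetical, and both servers are modeled as in-process objects rather than remote machines.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryServer:
    """Hypothetical directory service: symbolic name -> binary name."""
    names: dict = field(default_factory=dict)

    def lookup(self, symbolic_name):
        return self.names[symbolic_name]


@dataclass
class FileServer:
    """Hypothetical file service: it understands binary names only."""
    files: dict = field(default_factory=dict)

    def read(self, binary_name):
        return self.files[binary_name]


def read_by_symbolic_name(directory, file_server, symbolic_name):
    # Step 1: the directory server maps the symbolic name to a binary name.
    binary_name = directory.lookup(symbolic_name)
    # Step 2: the file server is accessed with the binary name.
    return file_server.read(binary_name)


directory = DirectoryServer({"/docs/myfile": 4711})
file_server = FileServer({4711: b"hello"})
data = read_by_symbolic_name(directory, file_server, "/docs/myfile")
```

Because the file server never sees symbolic names, the two services can run on the same machine or on different machines without changing this client-side logic.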
In case the directory hierarchy is split across multiple servers, either an iterative lookup or an automatic lookup can be used, as shown in Figure 9-13.

Iterative lookup method

The iterative lookup method shown in Figure 9-13(a) works as follows. The directory entries are arranged so that entry a on directory Server 1 points to entry b on Server 2, which in turn points to c on Server 3. This last directory contains an entry for myfile. To look up this entry, the client has to send messages in turn to Servers 1, 2, and 3, and each exchange consists of a request-reply pair. Access to the last server yields the binary name of the file. Then the file is sent to the client.

Automatic lookup method

The automatic lookup method shown in Figure 9-13(b) works as follows. The client sends a request to directory Server 1, which forwards the request to Server 2, which in turn forwards it to Server 3, where the file is located. Automatic lookup is more efficient than iterative lookup, but it cannot be done using ordinary RPC because the request is made to Server 1 while the reply is obtained from Server 3.

As the number of directory servers increases, the cost of file location also increases. To improve lookup performance, the binary names accessed earlier can be cached locally. Whenever a file is opened, the cache is checked first, and only if the entry is not found is the directory lookup operation carried out.

Stateful and stateless servers

The final structural issue is whether the directory and file servers should maintain state information about clients. This leads to two options, namely stateful and stateless servers. In a stateful server, the state information of all clients is maintained on the server. Consider the following sequence of operations when a file server opens the file.
• The server maintains information about which client has which file open.
• The client is given a file ID for future references. Subsequent requests come with the file ID.
• The server uses the file ID to determine which file is requested.
• The server maintains a state information table which maps file IDs to the files.

Stateless servers do not maintain such state information. Each request must contain the full name of the file and the offset within the file, so that the server knows what to do. This increases the length of the message.

In case a stateful server crashes, the entire state information is lost. When the server is rebooted, it is unaware of which clients have which files open. Subsequent reads and writes to open files fail, and recovery depends on the retransmission of requests from the clients. Hence, stateless servers are more fault-tolerant than stateful servers.

Stateful servers also have their own benefits. They use shorter messages and hence utilize less network bandwidth. Performance is better because information about open files is available in the main memory. Based on the observation that most files are read sequentially, the server can read blocks ahead of time, which decreases delay. Duplicate requests sent by clients on time-out can be tracked from the state table. Idempotent operations can also be easily achieved. Files can be locked, and this information can be maintained along with the state information. Stateless systems lock files using a special lock server.

Table 9-3 Relative advantages of stateful and stateless servers

Stateless servers | Stateful servers
No need for OPEN/CLOSE calls | Shorter messages
No server space wasted to store tables | Better performance
No limitation on the number of files opened | Possible to read ahead of time
No issues if the client crashes | Suitable for idempotent operations
 | Possible to lock files

When clients have multiple files open at the same instant of time, the size of the state table on a stateful server limits the number of files that can be opened. A stateful server also has a problem when a client crashes while it has a file open: the server will hold the entries in the table, and other clients may have to wait or will be refused service for that particular file. In such situations, stateless servers are better.

9.6 File Caching in DFS

File caching improves I/O performance because recently accessed files are retained in the main memory. When requests for these files are repeated, the network traffic is reduced because the files are available locally. Performance improvement of the file system is based on the locality of the file access pattern. Caching also helps in reliability and scalability. Most current DFSs use some type of caching. The design of a file caching scheme depends on various factors, such as the granularity of cached data, cache size (large or small, fixed or dynamic), replacement policy, cache location, techniques to propagate modifications, and validation of cached data.

9.6.1 Cache Location

As shown in Figure 9-14, the simplest approach is to use no cache at all and access the file directly from the server. This method is slow and hence not preferred. In a client-server system in which each machine has main memory and a disk, the cached file can be stored in four locations, namely server's disk, server's memory, client's disk, or client's memory. These four locations are shown in Figure 9-15(a).

Figure 9-14 No caching

Figure 9-15(a) Caching locations

Server's disk

The most standard location for keeping the file is the server's disk, which is the original location where it is always stored. There is enough space available here in case the file is modified and grows in length. It will still be accessible to all the clients. Since only one copy of every file exists, there are no consistency problems. When the client wants to read the file, two transfers are involved: from the server's disk to the server's main memory, and then to the client's main memory. Both these transfers take time. To improve performance, one part of the transfer time can be avoided by caching the file in the server's main memory.
Even then, access remains slow, since the file still has to be transferred from the server to the client's memory.

Server's main memory

The file can be cached in the server's main memory, from where it needs to be transferred to the client's memory. Since the main memory is limited in size, some algorithm is needed to decide which files or parts of files should be kept in the cache. This algorithm has to settle two issues, namely the unit of caching and the replacement technique to be used when the cache fills up.

The first question is whether to cache the entire file or only disk blocks. If the entire file is cached, it can be stored in contiguous locations, and high-speed transfer gives good performance. Disk block caching makes better use of the cache and disk space. To solve the replacement problem, standard caching techniques can be used. Cache references are far fewer than memory references. With Least Recently Used (LRU), the block that has been unused for the longest time is chosen for eviction. If an up-to-date copy exists on the disk, the cached copy can simply be discarded; otherwise, the cached data is written to the disk.

Accessing a cached file in the server's main memory is easy and transparent to the clients. The server can easily keep the copies of the file on the disk and in the main memory consistent. From the client's perspective, only one copy of the file exists in the system.

Client's disk

Another alternative is to store the data on the client's disk. Network transfer time is reduced, but the client's disk needs to be accessed in case of a cache hit. This scheme improves reliability because the modified data will be available in the event of data loss or a crash, and the data can then be recovered from the client's disk.

The other advantage is that the client's disk usually has more storage capacity than the server's main memory. More data can be cached, leading to a higher cache-hit ratio. Most DFSs use the file-level data transfer model, where the file is cached completely. If the file is too large, the disk is a better alternative because it can hold the entire file unless the file is larger than the disk space available. The file can still be accessed even if the client is disconnected from the server. This also improves scalability and reliability because accesses that hit the cache can be serviced locally and there is no need to contact the server.

One disadvantage is that disk caching cannot be supported on diskless workstations. In addition, every cache hit requires a disk access, leading to a significant increase in response time.

A conscious decision has to be made whether to cache the file in the server's main memory or on the client's disk. Server caching eliminates the access to disk, but network transfer will still be required. The solution to eliminate the network transfer time is to cache the data on the client's side. The decision to use the client's main memory or the disk depends on whether the system should save space or perform better. The disk holds more space, but its access is slow. The server's main memory may provide a file faster than the client's disk. In case the file size is very large, caching can be done on the client's disk.

Client's memory

The fastest access to a file is from the client's main memory. Once it is decided that the files should be cached in the client's memory, the options available for caching are within the user process's address space, in the kernel, or in a cache manager running as a user process. One option is to cache the files directly in each user process's address space, as depicted in Figure 9-15(b).

Figure 9-15(b) Caching within each process
The cache is managed by the system call library. During any process execution, files are opened, closed, read, or written. The library keeps the most heavily used files so that they can be reused if needed. When the process execution is complete, the modified files are written back to the server. This scheme is effective in case individual processes open and close files repeatedly. It is suitable for a database manager, but not for program development environments where the files may not be read again.

Figure 9-15(c) Caching in kernel

Instead of the user process's address space, the file can also be cached in the kernel, as shown in Figure 9-15(c). However, this scheme requires a system call to access the file on every cache hit.

Figure 9-15(d) Cache manager as a user process

As shown in Figure 9-15(d), the file can also be cached by a separate user-level cache manager. This relieves the kernel from maintaining the file system code, which becomes more isolated and flexible. The kernel can dynamically decide how much memory to reserve for programs versus the cache. If the cache manager runs in virtual memory, the kernel can store part of the cached files on the disk, and the blocks then have to be brought back into the main memory on a cache hit. With virtual memory, the client caching idea therefore loses its value, but the cache manager can lock in memory the pages which are very frequently accessed.

At a minimum, one RPC is required to make a file request. With caching, it takes one or two RPCs, depending on whether the request can be satisfied from the cache or not. Hence, on average, a caching scheme uses more RPCs, and it is suitable when RPCs are fast and network transfers are slow. Irrespective of the caching location, the performance gain due to caching still depends on the CPU, the network technology, the file size, and the applications which need the file. The relative merits of each cache location policy are listed in Table 9-4.
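Whichever location is chosen, a cache of limited size needs a replacement policy. The LRU eviction rule mentioned in this section for the server's main memory could be sketched as below. This is a hedged illustration, not the design of any specific DFS: the class name LRUBlockCache, the dictionary standing in for the disk, and the dirty flag used to decide whether an evicted block must be written back are all assumptions made for the example.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Hypothetical block cache with LRU eviction and write-back."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # maximum number of cached blocks
        self.disk = disk              # dict standing in for the disk
        self.blocks = OrderedDict()   # block id -> (data, dirty flag)

    def read(self, block_id):
        if block_id in self.blocks:
            # Cache hit: mark the block as most recently used.
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id][0]
        # Cache miss: fetch from disk and cache a clean copy.
        data = self.disk[block_id]
        self._insert(block_id, data, dirty=False)
        return data

    def write(self, block_id, data):
        # The cached copy is now newer than the copy on disk.
        self._insert(block_id, data, dirty=True)

    def _insert(self, block_id, data, dirty):
        self.blocks[block_id] = (data, dirty)
        self.blocks.move_to_end(block_id)
        while len(self.blocks) > self.capacity:
            # Evict the least recently used block (front of the dict).
            victim, (victim_data, victim_dirty) = self.blocks.popitem(last=False)
            if victim_dirty:
                # No up-to-date copy on disk, so write the block back.
                self.disk[victim] = victim_data
            # A clean block is simply discarded.


disk = {1: b"a", 2: b"b", 3: b"c"}
cache = LRUBlockCache(capacity=2, disk=disk)
cache.read(1)
cache.write(2, b"B")
cache.read(1)      # block 1 becomes the most recently used
cache.read(3)      # cache is full: block 2 is evicted and written back
```

In this run, block 2 is the least recently used entry when block 3 arrives, and because it was modified in the cache it is written to the disk before being dropped.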
Table 9-4 Merits of cache location policies

Cache location | Access cost (cache hit) | Merits
Server's disk | One disk access and one network access | Enough space, accessible to all clients; if only one copy exists, no consistency issues
Server's main memory | One network access | Consistency maintained easily between cached data and the original file; UNIX-like file-sharing semantics
(Contd)

Caching becomes all the more simple in case of immutable files because changes in the files are not permitted. The new version is made for an updated file and is bound to the same symbolic name as the cached file. It can be checked when the cached copy is reopened, and it has the same RPC overhead.

Cache consistency issues

There are various approaches to deal with the cache consistency problem. These can be classified as server-initiated and client-initiated approaches.

In the server-initiated approach, servers need to keep track of where the file blocks are cached. The servers inform the client cache managers whenever their cached data become stale.

In the client-initiated approach, the clients validate the data with the servers before using the data. This partially negates the main benefit of caching.

File caching improves I/O performance because recently accessed files are retained in the main memory. Four different cache location policies are possible: server's disk or memory, and client's disk or memory. To maintain consistency among copies in the cache, there are four methods: write-through, delayed-write, write-on-close, and centralized control algorithm.

9.7 Replication in DFS

One of the main goals of a DFS is to improve availability. A replicated file has multiple copies located on separate file servers. The first major reason for providing replication is to increase reliability by having independent backups of each file.
If one server crashes, the copy can be taken from another server. The second reason for replication is to enable file access to continue even if one file server is down. The objective is that the entire system should not break down during a crash of a file server. Replication also allows the workload to be distributed among multiple servers if any one of the servers is overloaded, thus improving performance. To summarize, the reasons for replication relate to reliability, availability, and performance. In this section, we describe the issues in replication.

9.7.1 Unit of Replication

In DFS, the replication unit can vary based on size or group of files, namely complete file or block, volume, or pack.

Complete file or block
A complete file or block is replicated on demand, only when the data is needed. This type of replica management is harder in terms of locating the replicas and ensuring file protection.

Volume
The other unit of file replication is a volume (group) of files. This method is wasteful if some files of the volume are not needed.

Pack
In this method, a pack is a subset of the files in a user's primary pack, and all replicas in the pack are updated together. This ensures mutual consistency among replicas.

9.7.2 Replica Creation

The key issue related to replication is transparency. Should the users be aware of which file is replicated and how many replicas exist? If this entire process of replication is carried out behind the programmer's back, the system is said to be replication transparent. As shown in Figure 9-17, the replication can be carried out in any of the following three ways, namely explicit file replication, lazy file replication, or file replication using a group.
Figure 9-17 Replica creation methods: (a) Explicit file replication, (b) Lazy file replication, (c) File replication using group

Explicit file replication
In this method, the entire process is controlled by the programmer. A process always makes a copy (C) of the file on one server, and then it can make multiple copies to be resident on other servers (S1, S2, S3). The directory server can maintain a list of all replicas and their network addresses for each file. When the name is looked up in the directory, all replicas are listed and each copy can be found. When a file is requested, any one of these copies can be opened.

Lazy file replication
In this method, only one copy (C) is created on a server (S2), and later, this server makes replicas on servers S1 and S3. The system can track all the replicas and retrieve any one of the copies as required. These copies are made in the background, and there is a chance that the file may change before a copy is made.

File replication using a group
The third method is to carry out file replication using groups. In this method, a Write system call is sent to all the servers (S1, S2, S3), and multiple replicas are created when the original is made.

In lazy replication, only one server is addressed, not the entire group, and the replication happens in the background when the server is free, while in the group mechanism, all the copies are made at the same time. Each of these three methods has its own advantages and disadvantages, but all the methods provide transparency.

The Google File System (GFS) is designed for Google applications and workloads. Google needed a good DFS with redundant storage of massive amounts of data on cheap and unreliable computers. The files are huge, of size 100 MB or larger, with each file typically containing many application objects, such as web documents.
There are more than a few million files, and they form fast-growing data sets of many terabytes. Google workloads are basically large streaming reads. The individual operations typically read hundreds of kilobytes or more, and successive operations from the same client often read through a contiguous region of a file.

In the GFS API, the files are organized in a hierarchy of directories and identified by pathnames. The operations on files are create, delete, open, close, read, write, snapshot, and record-append. Files are stored as chunks of a fixed size (64 MB) on chunk servers. One of the benefits of having a large chunk size is the reduced need for client-master interaction. Sequential access is the most common form of access. The client can cache the location information of all chunks even for a multi-terabyte working set. On a large chunk, the client is more likely to perform multiple Reads/Writes. The disadvantage is that there will also be wastage of disk space due to fragmentation.

As shown in Figure 9-23, GFS consists of GFS clients, a master, and many chunk servers. The master maintains only a [file name, chunk server] table in the main memory, resulting in minimal I/O. The files are replicated using a primary-backup scheme, and the master is kept out of the loop. The master is also responsible for metadata storage and namespace management and locking. The master maintains periodic communication with chunk servers to give instructions, collect state, and track cluster health. The master also manages chunk creation, re-replication, and rebalancing, spreading replicas across racks to reduce correlated failures. Additionally, the master manages garbage collection by logging the deletion, renaming the file to a hidden name, and lazily garbage-collecting the hidden files. GFS uses shadow masters for fault tolerance and keeps master involvement to a minimum.
Figure 9-23 Google file system

When a GFS client makes a request, the client never reads or writes file data through the master. Instead, a client asks the master which chunk servers it should contact. Using the fixed chunk size, the client translates the file name and byte offset specified by the application into a chunk index within the file. Then, it sends the master a request containing the file name and the chunk index. The master replies with the corresponding chunk handle and the locations of the replicas. The client typically asks for multiple chunks in the same request, and the master can also include the information for chunks immediately following those requested.
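The translation from a byte offset to a chunk index described above is simple integer arithmetic, which the following sketch works through. The function names are hypothetical, and the example assumes that 64 MB means 64 x 2^20 bytes; only the fixed chunk size itself comes from the text.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # fixed chunk size; MB assumed to be 2**20 bytes


def chunk_index(byte_offset):
    """Index of the chunk that holds the given byte offset of a file."""
    return byte_offset // CHUNK_SIZE


def chunks_for_range(offset, length):
    """All chunk indexes touched by reading `length` bytes at `offset`."""
    first = chunk_index(offset)
    last = chunk_index(offset + length - 1)
    return list(range(first, last + 1))


def chunk_count(total_bytes):
    """Number of chunks needed to hold `total_bytes` (ceiling division)."""
    return -(-total_bytes // CHUNK_SIZE)


MB = 1024 * 1024
index_at_200_mb = chunk_index(200 * MB)
touched = chunks_for_range(60 * MB, 10 * MB)
chunks_in_one_tib = chunk_count(1024 * 1024 * MB)
```

Under these assumptions, a working set of 2^40 bytes maps to 16384 chunks, which gives a feel for why a client can keep the location information for a multi-terabyte working set in its cache.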
