DOS Lectures

  • 8/7/2019 DOS Lectures

    DISTRIBUTED COMPUTING

    INTRODUCTION

    Advancements in microelectronic technology have resulted in the availability of fast, inexpensive processors, and advancements in communication technology have resulted in the availability of cost-effective and highly efficient computer networks.

    Computer architectures consisting of multiple interconnected processors are of two types: tightly coupled systems and loosely coupled systems.

    Tightly Coupled Systems (parallel processing systems)

    There is a single system-wide primary memory (address space) that is shared by all the processors.

    Any communication between the processors usually takes place through the shared memory.

    Loosely Coupled Systems (distributed computing systems)

    The processors do not share memory, and each processor has its own local memory.

    All physical communication between the processors is done by passing messages across the network that interconnects the processors.

    Processors of loosely coupled systems can be located far from each other to cover a wider geographical area.

    In tightly coupled systems, the number of processors that can be usefully deployed is usually small and limited by the bandwidth of the shared memory.

    This is not the case with distributed computing systems, which are more freely expandable and can have an almost unlimited number of processors.

    For a particular processor of a distributed system, its own resources are local, whereas the other processors and their resources are remote.

    A processor together with its resources is referred to as a node, site, or machine of the distributed computing system.

    TIME-SHARING SYSTEMS

    It was in the 1970s that computers started to use the concept of time sharing.

    Parallel advancements in hardware technology allowed a reduction in the size and an increase in the processing speed of computers, causing large computers to be gradually replaced by smaller and cheaper ones that had more processing capability than their predecessors. These systems were called minicomputers.

    The advent of time-sharing systems was the first step toward distributed computing systems.

    It provided two important concepts used in distributed computing systems: the sharing of computer resources simultaneously by many users, and the accessing of computers from a place different from the main computer room.

    Centralized Time-Sharing Systems

    Most of the processing of a user's job could be done at the user's own computer, allowing the main computer to be simultaneously shared by a larger number of users.

    Shared resources such as files, databases, and software libraries were placed on the main computer.

    Limitation: the terminals could not be placed very far from the main computer room, since ordinary cables were used to connect the terminals to the main computer.

    Evolution of Distributed Computing Systems

    Advancements in computer networking technology between the 1960s and the early 1970s led to two key networking technologies: the LAN (local area network) and the WAN (wide area network).

    In the 1990s there was another major advancement in networking technology: ATM (Asynchronous Transfer Mode).

    ATM technology makes very high-speed networking possible, providing data transmission rates of up to 1.2 Gbps in both LAN and WAN environments.

    The availability of such high-bandwidth networks allows distributed computing systems to support multimedia applications.

    The merging of computer and networking technologies gave birth to distributed computing systems in the late 1970s.

    Although the hardware issues of building such systems were fairly well understood, the major stumbling block was the availability of adequate software for making these systems easy to use and for fully exploiting their power.

    MODELS OF DISTRIBUTED COMPUTING SYSTEM

    1. Minicomputer

    2. Workstation

    3. Workstation-server

    4. Processor-pool

    5. Hybrid

    Minicomputer Model

    It is an extension of the centralized time-sharing system and consists of a few minicomputers interconnected by a communication network.

    Each minicomputer has multiple users simultaneously logged on to it.

    Each user is logged on to one specific minicomputer, with remote access to other minicomputers.

    The network allows a user to access remote resources that are available on a machine other than the one the user is currently logged on to.

    This model may be used when resource sharing with remote users is desired.

    The early ARPAnet is an example of a distributed computing system based on the minicomputer model.

    [Figure: Distributed computing system based on the minicomputer model]

    Workstation Model

    It consists of several workstations interconnected by a communication network. Each workstation is equipped with its own disk and serves as a single-user computer.

    In such an environment, at any one time a significant proportion of the workstations are idle, resulting in the waste of CPU time.

    Therefore, the idea of the workstation model is to interconnect all these workstations by a high-speed LAN.

    Advantage: idle workstations may be used to process jobs of users who are logged onto other workstations and do not have sufficient processing power at their own workstations to get their jobs processed efficiently.

    In this model, a user logs onto one of the workstations, called the user's "home" workstation, and submits jobs for execution.

    When the system finds that the user's workstation does not have sufficient processing power for executing the processes of the submitted jobs efficiently, it transfers one or more of the processes from the user's workstation to some other workstation that is currently idle and gets the process executed there. Finally, the result of execution is returned to the user's workstation.

    The Sprite system and an experimental system developed at Xerox PARC are examples of distributed computing systems based on the workstation model.

    [Figure: Distributed computing system based on the workstation model]

    Limitations of Workstation Model

    This model is not so simple to implement because of issues such as:

    1. How does the system find an idle workstation?

    2. How is a process transferred from one workstation to another so that it can be executed there?

    3. What happens to a remote process if a user logs onto a workstation that was idle until now and was being used to execute a process of another workstation?

    Workstation-Server Model

    The workstation model is a network of personal workstations, each with its own disk and a local file system.

    A workstation with its own local disk is called a diskful workstation, and a workstation without a local disk is called a diskless workstation.

    Diskless workstations are more popular in network environments than diskful workstations, making the workstation-server model more popular than the workstation model for building distributed computing systems.

    [Figure: Distributed computing system based on the workstation-server model]

    A distributed computing system based on the workstation-server model consists of a few minicomputers and several workstations interconnected by a communication network.

    Some minicomputers are used for implementing the file system, and others for providing other types of services, such as database service and print service. Therefore, each minicomputer is used as a server machine to provide one or more types of services.

    There are specialized machines (or specialized workstations) for running server processes that manage and provide access to shared resources.

    For higher reliability and better scalability, multiple servers are used for managing the resources of a particular type in a distributed computing system.

    Example: there may be multiple file servers, each running on a separate minicomputer and cooperating via the network, for managing the files of all the users in the system.

    ADVANTAGES OF WORKSTATION-SERVER MODEL

    It is cheaper to use a few minicomputers equipped with large, fast disks that are accessed over the network than a large number of diskful workstations, each having a small, slow disk.

    Diskless workstations are also preferred to diskful workstations from a system maintenance point of view. Backup and hardware maintenance are easier to perform with a few large disks than with many small disks scattered all over a building or campus.

    All files are managed by the file servers, so users have the flexibility to use any workstation and access their files in the same manner, irrespective of which workstation they are currently logged on to.

    The request-response protocol is mainly used to access the services of the server machines.

    A user has guaranteed response time because workstations are not used for executing remote processes.

    LIMITATION: the model does not utilize the processing capability of idle workstations.

    REQUEST-RESPONSE PROTOCOL

    Also known as the client-server model of communication.

    A client process sends a request to a server process for getting some service, such as reading a block of a file. The server executes the request and sends back a reply to the client that contains the result of request processing.

    This model provides an effective general-purpose approach to the sharing of information and resources in distributed computing systems.

    It can also be implemented in a variety of hardware and software environments.

    It is even possible for both the client and server processes to run on the same computer.

    Some processes are both client and server processes. That is, a server process may use the services of another server, appearing as a client to the latter.
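
The request-response exchange described above can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: client and server run as threads on one machine, a `queue.Queue` stands in for the network, and the names `file_server`, `read_block`, and the `None` shutdown signal are hypothetical.

```python
import queue
import threading

def file_server(requests: queue.Queue, blocks: dict) -> None:
    """Server loop: take a request, execute it, send back a reply."""
    while True:
        request = requests.get()            # wait for the next request
        if request is None:                 # hypothetical shutdown signal
            break
        block_id, reply_box = request
        result = blocks.get(block_id, b"")  # "read a block of a file"
        reply_box.put(result)               # the reply carries the result

def read_block(requests: queue.Queue, block_id: int) -> bytes:
    """Client side: send a request, then wait for the reply."""
    reply_box = queue.Queue(maxsize=1)
    requests.put((block_id, reply_box))
    return reply_box.get()

requests = queue.Queue()
server = threading.Thread(
    target=file_server, args=(requests, {0: b"hello", 1: b"world"})
)
server.start()
first = read_block(requests, 0)
second = read_block(requests, 1)
requests.put(None)                          # ask the server to stop
server.join()
```

Because the client only ever sees the request queue and a reply, the same client code would work whether the server runs on the same computer or, with a real transport, on a remote one.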

    Processor-Pool Model

    It is based on the observation that most of the time a user does not need any computing power, but once in a while the user may need a very large amount of computing power for a short time (e.g., when recompiling a program consisting of a large number of files after changing a basic shared declaration).

    Unlike the workstation-server model, in which a processor is allocated to each user, in the processor-pool model the processors are pooled together to be shared by the users as needed.

    The pool of processors consists of a large number of microcomputers and minicomputers attached to the network.

    Each processor in the pool has its own memory to load and run a system program or an application program of the distributed computing system.

    Amoeba, Plan 9, and the Cambridge Distributed Computing System are examples of distributed computing systems based on the processor-pool model.

    [Figure: Distributed computing system based on the processor-pool model]

    Compared to the workstation-server model, this model allows better utilization of the available processing power of a distributed computing system. The entire processing power of the system is available for use by the currently logged-on users, whereas in the workstation-server model several workstations may be idle at a particular time but cannot be used for processing the jobs of other users.

    It provides greater flexibility than the workstation-server model. The system's services can be easily expanded without the need to install any more computers; the processors in the pool can be allocated to act as extra servers to carry any additional load arising from an increased user population or to provide new services.

    Limitation: this model is usually not suitable for high-performance interactive applications, especially those using graphics or window systems. This is due to the slow speed of communication between the computer on which the application program of a user is being executed and the terminal via which the user is interacting with the system. The workstation-server model is generally considered to be more suitable for such applications.

    Hybrid Model

    The workstation-server model is the most widely used model for building distributed computing systems, because a large number of computer users only perform simple interactive tasks such as editing jobs, sending electronic mail, and executing small programs.

    In a working environment that has groups of users who often perform jobs needing massive computation, the processor-pool model is more suitable.

    To combine the advantages of both the workstation-server and processor-pool models, a hybrid model came into existence.

    The hybrid model is based on the workstation-server model with the addition of a pool of processors. The processors in the pool can be allocated dynamically for computations that are too large for workstations or that require several computers concurrently for efficient execution.

    Advantages: 1. Efficient execution of computation-intensive jobs. 2. Guaranteed response to interactive jobs, because they are processed on the local workstations of the users.

    Limitation: more expensive to implement than the workstation-server model or the processor-pool model.

    Limitations of Distributed Computing Systems

    Distributed computing systems are much more complex and difficult to build than traditional centralized systems.

    The complexity is mainly due to using and managing a very large number of distributed resources, and to handling the communication and security problems.

    The performance and reliability of a distributed computing system depend to a great extent on the performance and reliability of the underlying communication network.

    Special software is needed to handle loss of messages during transmission across the network and to prevent overloading of the network, which degrades the performance and responsiveness seen by the users.

    Special software security measures are needed to protect the widely distributed shared resources and services against intentional or accidental violation of access control and privacy constraints.

    Advantages of Distributed Computing Systems

    Despite the increased complexity and the difficulty of building distributed computing systems, the installation and use of distributed computing systems are rapidly increasing.

    The technical needs, the economic pressures, and the major advantages that have led to the emergence and popularity of distributed computing systems are described below.

    Inherently Distributed Applications

    Several applications are inherently distributed in nature and require a distributed computing system for their realization.

    Inherently distributed applications include a computerized worldwide airline reservation system, a computerized banking system, and a factory automation system controlling robots and machines all along an assembly line.

    These applications require that some processing power be available at many distributed locations for collecting, preprocessing, and accessing data, resulting in the need for distributed computing systems.

    INFORMATION SHARING AMONG DISTRIBUTED USERS

    The use of distributed computing systems by a group of users to work cooperatively is known as Computer-Supported Cooperative Working (CSCW), or Groupware.

    Groupware applications depend heavily on the sharing of data objects between programs running on different nodes of a distributed computing system.

    RESOURCE SHARING

    Sharing of software resources such as software libraries and databases, as well as hardware resources such as printers and hard disks, can also be done in a very effective way among all the computers and the users of a single distributed computing system.

    BETTER PRICE-PERFORMANCE RATIO

    A small number of CPUs in a distributed computing system based on the processor-pool model can be effectively used by a large number of users from inexpensive terminals, giving a fairly high price-performance ratio as compared to either a centralized time-sharing system or a personal computer.

    Distributed computing systems also facilitate resource sharing among multiple computers.

    Higher Reliability

    Reliability refers to the degree of tolerance against errors and component failures in a system.

    A reliable system prevents loss of information even in the event of component failures.

    The multiplicity of storage devices and processors in a distributed computing system allows the maintenance of multiple copies of critical information within the system and the execution of important computations redundantly to protect them against catastrophic failures.

    If one of the processors fails, the computation can be successfully completed at the other processor, and if one of the storage devices fails, the information can still be used from the other storage device.

    The geographical distribution of the processors and other resources in a distributed computing system limits the scope of failures caused by natural disasters.
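
A rough calculation shows why multiple copies help. The sketch below assumes (this is an assumption for illustration, not a claim from the lecture) that each copy is usable a fraction `single` of the time and that copies fail independently; the function name is hypothetical.

```python
def chance_some_copy_usable(single: float, copies: int) -> float:
    """Probability that at least one of `copies` independent copies is usable.

    Assumes independent failures, which real systems only approximate.
    """
    return 1 - (1 - single) ** copies

one_copy = chance_some_copy_usable(0.9, 1)
two_copies = chance_some_copy_usable(0.9, 2)
three_copies = chance_some_copy_usable(0.9, 3)
```

Under that assumption, a copy that is usable 90% of the time gives roughly 99% with two copies and roughly 99.9% with three. Failures with a common cause, such as a natural disaster at one site, break the independence assumption, which is one reason geographical distribution matters.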

    Availability refers to the fraction of time for which a system is available for use. In comparison to a centralized system, a distributed computing system also enjoys the advantage of increased availability.

    Example: if the processor of a centralized system fails, the entire system breaks down and no useful work can be performed. In a distributed computing system, a few parts of the system can be down without interrupting the jobs of the users who are using the other parts of the system.

    If a workstation of a distributed computing system that is based on the workstation-server model fails, only the user of that workstation is affected. Other users of the system are not affected by this failure.

    In a distributed computing system based on the processor-pool model, if some of the processors in the pool are down at any moment, the system can continue to function normally, simply with some loss in performance that is proportional to the number of processors that are down.

    In this case, none of the users is affected, and the users may not even notice that some of the processors are down.

    Extensibility and Incremental Growth

    Distributed computing systems are capable of incremental growth.

    It is possible to gradually extend the power and functionality of a distributed computing system by simply adding resources (both hardware and software) to the system as and when the need arises.

    Example: additional processors can be easily added to the system to handle the increased workload of an organization that might have resulted from its expansion.

    Extensibility is also easier in a distributed computing system because the addition of new resources to an existing system can be performed without significant disruption of the normal functioning of the system.

    Properly designed distributed computing systems that have the property of extensibility and incremental growth are called open distributed systems.

    Better Flexibility in Meeting Users' Needs

    Different types of computers are suitable for performing different types of computations.

    Example: computers with ordinary power are suitable for ordinary data processing jobs, whereas high-performance computers are more suitable for complex mathematical computations.

    In a centralized system, the users have to perform all types of computations on the only available computer.

    A distributed computing system may have a pool of different types of computers, in which case the most appropriate one can be selected for processing a user's job depending on the nature of the job.

    In a distributed computing system that is based on the hybrid model, interactive jobs can be processed at a user's own workstation, and the processors in the pool may be used to process non-interactive, computation-intensive jobs.

    DISTRIBUTED OPERATING SYSTEM

    INTRODUCTION

    The operating systems used for Distributed Computing Systems (DCS) are classified into two types: Network Operating Systems (NOS) and Distributed Operating Systems (DOS).

    Difference between NOS and DOS:

    In a NOS, the users view the DCS as a collection of distinct machines connected by a communication subsystem. Users are aware of the fact that multiple computers are being used. A DOS, in contrast, hides the existence of multiple computers and provides a single-system image to its users. It makes a collection of networked machines act as a virtual uniprocessor.

    In a NOS, each computer of the DCS has its own local operating system, and there is essentially no coordination at all among the computers except for the rule that when two processes on different computers communicate with each other, they must use a mutually agreed-on communication protocol. With a DOS, there is a single system-wide operating system, and each computer of the DCS runs a part of this global operating system.

    The fault tolerance capability of a DOS is usually very high compared to that of a NOS.

    What Is a Distributed Operating System?

    A distributed operating system is one that looks to its users like an ordinary centralized operating system but runs on multiple, independent CPUs.

    The key concept here is transparency, i.e., the use of multiple processors should be invisible (transparent) to the user.

    A distributed computing system that uses a network operating system is referred to as a NETWORK SYSTEM, whereas one that uses a distributed operating system is referred to as a TRUE DISTRIBUTED SYSTEM.

    MESSAGE PASSING

    INTRODUCTION

    In a distributed system, processes executing on different computers need to communicate with each other.

    Each computer of a distributed system may have a resource manager process to monitor the current status of usage of its local resources, and the resource managers of all the computers might communicate with each other from time to time to dynamically balance the system load among all the computers.

    Therefore, a distributed operating system needs to provide Inter-Process Communication (IPC) mechanisms to facilitate such communication activities.

    Inter-Process Communication (IPC) methods

    Shared-data approach

    Message-passing approach

    A message-passing system provides a set of message-based IPC protocols and shields programmers from the details of complex network protocols and multiple heterogeneous platforms.

    It enables processes to communicate by exchanging messages.

    It allows programs to be written using simple communication primitives, such as send and receive.

    It serves as a suitable infrastructure for building higher-level IPC systems, such as Remote Procedure Call (RPC) and Distributed Shared Memory (DSM).

    Desirable Features of a Good Message Passing System

    Simplicity (clean and simple semantics of the IPC protocols.)

    Uniform Semantics (the semantics of remote communication should be as close as possible to those of local communication.)

    Efficiency (reduce the number of message exchanges during the communication process; optimizations are normally adopted for efficiency.)

    Reliability (cope with failures and guarantee the delivery of a message, using acknowledgments and retransmissions on the basis of timeouts; handle duplicate messages, which may be sent in the event of failures or because of timeouts.)

    Correctness (Atomicity ensures that every message sent to a group of receivers will be delivered to either all of them or none of them. Ordered delivery ensures that messages arrive at all receivers in an order acceptable to the application. Survivability guarantees that messages will be delivered correctly despite partial failures of processes, machines, or communication links.)

    Flexibility (IPC primitives must have the flexibility to permit any kind of control flow between the cooperating processes.)

    Security (authentication of the receiver or sender of a message by the sender or receiver; encryption of a message before sending it over the network.)

    Portability (it should be possible to easily construct a new IPC facility on another system by reusing the basic design of the existing message-passing system; the applications written using the primitives of the IPC protocols of the message-passing system should be portable.)

    A message is a block of information.

    Important issues to be considered in the design of an IPC protocol

    Who is the sender?

    Who is the receiver?

    Is there one receiver or many receivers?

    Is the message guaranteed to have been accepted by its receiver(s)?

    Does the sender need to wait for a reply?

    What should be done if a catastrophic event such as a node crash or a communication link failure occurs during the course of communication?

    What should be done if the receiver is not ready to accept the message: will the message be discarded or stored in a buffer? In the case of buffering, what should be done if the buffer is full?

    If there are several outstanding messages for a receiver, can it choose the order in which to service the outstanding messages?

    SYNCHRONIZATION

    Synchronization depends on which of two types of semantics, blocking or nonblocking, is used for the send and receive primitives.

    With a blocking send primitive, after execution of the send statement, the sending process is blocked until it receives an acknowledgment from the receiver that the message has been received.

    With a nonblocking send primitive, after execution of the send statement, the sending process is allowed to proceed with its execution as soon as the message has been copied to a buffer.

    With a blocking receive primitive, after execution of the receive statement, the receiving process is blocked until it receives a message.

    With a nonblocking receive primitive, the receiving process proceeds with its execution after execution of the receive statement, which returns control almost immediately, just after telling the kernel where the message buffer is.
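
The difference between the blocking and nonblocking primitives can be sketched with Python's standard `queue.Queue`, used here as a stand-in for the kernel's message buffer. The function names are hypothetical, and the nonblocking receive is simplified: it returns `None` when nothing is buffered instead of registering a buffer address with a kernel.

```python
import queue

mailbox = queue.Queue()   # stands in for the kernel's message buffer

def nonblocking_send(message):
    """Copy the message to the buffer and return immediately."""
    mailbox.put(message)

def nonblocking_receive():
    """Return a buffered message if there is one, else None, without waiting."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None

def blocking_receive(timeout=None):
    """Wait until a message arrives (or until `timeout` seconds have passed)."""
    return mailbox.get(timeout=timeout)

before = nonblocking_receive()        # nothing has been sent yet
nonblocking_send("m1")
nonblocking_send("m2")
after = nonblocking_receive()         # first buffered message
last = blocking_receive(timeout=1.0)  # returns at once: "m2" is already buffered
```

A blocking send additionally needs an acknowledgment from the receiver, so it cannot be shown with a single buffer alone.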

    How does the receiving process know that a message has arrived in the message buffer when a nonblocking receive primitive is used?

    Polling. A test primitive is provided to allow the receiver to check the buffer status. The receiver uses this primitive to periodically poll the kernel to check whether the message is already available in the buffer.

    Interrupt. When the message has been filled in the buffer and is ready for use by the receiver, a software interrupt is used to notify the receiving process. This method is highly efficient and allows maximum parallelism, but its main drawback is that user-level interrupts make programming difficult.
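
The two notification styles can be contrasted with a toy buffer class. This is a sketch under stated assumptions: `MessageBuffer` and its methods are hypothetical, and an ordinary callback stands in for a software interrupt.

```python
class MessageBuffer:
    """Hypothetical stand-in for a kernel-managed message buffer."""

    def __init__(self):
        self._message = None
        self._handler = None

    def test(self) -> bool:
        """Polling style: report whether a message is available."""
        return self._message is not None

    def on_arrival(self, handler) -> None:
        """Interrupt style: register a function to call when a message arrives."""
        self._handler = handler

    def deliver(self, message) -> None:
        """Called by the 'kernel' once the message has been filled in."""
        self._message = message
        if self._handler is not None:
            self._handler(message)

    def take(self):
        """Remove and return the buffered message."""
        message, self._message = self._message, None
        return message

# Polling: the receiver has to keep asking.
polled = MessageBuffer()
checks = [polled.test()]            # checked before arrival
polled.deliver("hello")
checks.append(polled.test())        # checked again after arrival
polled_message = polled.take()

# Interrupt style: the receiver is told when the message is there.
notified = []
interrupt_driven = MessageBuffer()
interrupt_driven.on_arrival(notified.append)
interrupt_driven.deliver("hello")
```

In the polling half, every call to `test()` that finds nothing is wasted work. In the interrupt half, the handler runs at a moment the receiver does not control, which hints at why user-level interrupts make programs harder to reason about.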

    Synchronous Communication

    When both the send and receive primitives of a communication between two processes use blocking semantics, the communication is said to be synchronous; otherwise it is asynchronous.

    The sending process sends a message to the receiving process, then waits for an acknowledgment.

    After executing the receive statement, the receiver remains blocked until it receives the message sent by the sender.

    On receiving the message, the receiver sends an acknowledgment message to the sender.

    The sender resumes execution only after receiving this acknowledgment message.
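
The steps above can be traced with two threads and two queues. This is a minimal sketch, not the lecture's implementation: the function `synchronous_exchange` and the event strings are hypothetical, and `queue.Queue` stands in for the communication link.

```python
import queue
import threading

def synchronous_exchange(message):
    """Run one blocking send / blocking receive exchange and log the steps."""
    channel = queue.Queue()   # carries the message: sender -> receiver
    acks = queue.Queue()      # carries the acknowledgment: receiver -> sender
    events = []               # order in which the steps happen

    def receiver():
        received = channel.get()                 # blocked until the message arrives
        events.append(f"receiver got {received}")
        acks.put("ack")                          # acknowledge receipt

    def sender():
        events.append(f"sender sends {message}")
        channel.put(message)
        acks.get()                               # blocked until the ack arrives
        events.append("sender resumes")

    threads = [threading.Thread(target=receiver), threading.Thread(target=sender)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events

events = synchronous_exchange("hello")
```

The log always comes out in the same order, because each step waits for the one before it. That forced ordering is also the source of the limited concurrency discussed for synchronous communication.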

    ADVANTAGES

    Compared to asynchronous communication, synchronous communication is simple and easy to implement.

    It contributes to reliability because it assures the sending process that its message has been accepted before the sending process resumes execution.

    As a result, if the message gets lost or is undelivered, no backward error recovery is necessary for the sending process to establish a consistent state and resume execution.

    LIMITATIONS

    The main drawback of synchronous communication is that it limits concurrency and is subject to communication deadlocks.

    It is less flexible than asynchronous communication because the sending process always has to wait for an acknowledgment from the receiving process, even when this is not necessary.

    FAILURE HANDLING

    A distributed system offers potential for parallelism but is also prone to partial failures, such as a node crash or a communication link failure. These can lead to the following problems.

    Loss of request message. This may happen either due to the failure of the communication link between the sender and receiver or because the receiver's node is down at the time the request message reaches it.

    Loss of response message. This may happen either due to the failure of the communication link between the sender and receiver or because the sender's node is down at the time the response message reaches it.

    Unsuccessful execution of the request. This happens when the receiver's node crashes while the request is being processed.

    [Figure: Request message is lost]

    [Figure: Response message is lost]

    To cope with these problems, a reliable IPC protocol of a message-passing system is designed around internal retransmission of messages after timeouts and the return of an acknowledgment message to the sending machine's kernel by the receiving machine's kernel.

    The kernel of the sending machine is responsible for retransmitting the message after waiting for a timeout period if no acknowledgment is received from the receiver's machine within this time.

    The kernel of the sending machine frees the sending process only when the acknowledgment is received.

    The time for which the sender waits before retransmitting the request is slightly more than the approximate round-trip time between the sender and the receiver nodes plus the average time required for executing the request.
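
The timeout rule and the retransmission loop can be sketched as follows. Everything here is illustrative: the 10% `margin` is an assumed reading of "slightly more", the limit of five attempts is an assumption, and `make_lossy_link` is a hypothetical simulated link that loses a fixed number of transmissions so the example is deterministic.

```python
def retransmission_timeout(round_trip, avg_execution, margin=0.1):
    """Timeout slightly above round-trip time plus average execution time."""
    return (round_trip + avg_execution) * (1 + margin)

def send_reliably(transmit, message, timeout, max_attempts=5):
    """Retransmit until `transmit` reports an acknowledgment.

    `transmit(message, timeout)` must return True if an acknowledgment
    arrived within `timeout` seconds. Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        if transmit(message, timeout):
            return attempt
    raise TimeoutError(f"no acknowledgment after {max_attempts} attempts")

def make_lossy_link(losses):
    """Simulated link that loses the first `losses` transmissions."""
    state = {"remaining": losses}

    def transmit(message, timeout):
        if state["remaining"] > 0:
            state["remaining"] -= 1
            return False          # message or acknowledgment was lost
        return True               # acknowledged within the timeout

    return transmit

timeout = retransmission_timeout(round_trip=0.020, avg_execution=0.030)
attempts = send_reliably(make_lossy_link(2), "request", timeout)
```

With a 20 ms round trip and 30 ms average execution time, the assumed margin gives a timeout of about 55 ms, and the request gets through on the third attempt after two losses.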

    Four-message IPC protocol for client-server communication

    The client sends a request message to the server.

    When the request message is received at the server's machine, the kernel of that machine returns an acknowledgment message to the kernel of the client machine.

    If the acknowledgment is not received within the timeout period, the kernel of the client machine retransmits the request message.

    When the server finishes processing the client's request, it returns a reply message (containing the result of processing) to the client.

    When the reply message is received at the client's machine, the kernel of that machine returns an acknowledgment message to the kernel of the server machine.

    If the acknowledgment message is not received within the timeout period, the kernel of the server machine retransmits the reply message.

    When the reply itself serves as the acknowledgment of the request, a problem occurs if request processing takes a long time.

    If the request message is lost, it will be retransmitted only after the timeout period, which has been set to a large value to avoid unnecessary retransmissions of the request message.

    On the other hand, if the timeout value is not set properly, taking into consideration the long time needed for request processing, unnecessary retransmissions of the request message will take place.

    SOLUTION

    The client sends a request message to the server.

    When the request message is received at the server's machine, the kernel of that machine starts a timer. If the server finishes processing the client's request and returns the reply message to the client before the timer expires, the reply serves as the acknowledgment of the request message.

    Otherwise, a separate acknowledgment is sent by the kernel of the server machine to acknowledge the request message. If an acknowledgment is not received within the timeout period, the kernel of the client machine retransmits the request message.

    When the reply message is received at the client's machine, the kernel of that machine returns an acknowledgment message to the kernel of the server machine. If the acknowledgment message is not received within the timeout period, the kernel of the server machine retransmits the reply message.
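
    The effect of the server-side timer on the number of messages can be illustrated with a small sketch. It assumes a loss-free exchange, and the function names and the time values used are hypothetical.

```python
def server_messages(processing_time, ack_timer):
    """List the messages the server side sends for one request.

    If the reply is ready before the timer expires, it doubles as the
    acknowledgment of the request; otherwise a separate acknowledgment
    is sent first and the reply follows when processing finishes.
    """
    if processing_time < ack_timer:
        return ["reply"]
    return ["ack", "reply"]


def total_messages(processing_time, ack_timer):
    """Count all messages of a loss-free exchange.

    One request, the server's messages, and the client kernel's final
    acknowledgment of the reply.
    """
    return 1 + len(server_messages(processing_time, ack_timer)) + 1


fast_request = total_messages(processing_time=0.1, ack_timer=0.5)
slow_request = total_messages(processing_time=2.0, ack_timer=0.5)
```

    A short request therefore costs three messages, while a long-running one falls back to four, without forcing the client to use a very large retransmission timeout.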

    Three-message IPC protocol for client-server communication

    The client sends a request message to the server.

    When the server finishes processing the client's request, it returns a reply message (containing the result of processing) to the client.

    The client remains blocked until the reply is received. If the reply is not received within the timeout period, the kernel of the client machine retransmits the request message.

    When the reply message is received at the client's machine, the kernel of that machine returns an acknowledgment message to the kernel of the server machine.

    If the acknowledgment message is not received within the timeout period, the kernel of the server machine retransmits the reply message.

    Two-message IPC protocol for client-server communication

    The client sends a request message to the server and remains blocked until a reply is received from the server.

    When the server finishes processing the client's request, it returns a reply message (containing the result of processing) to the client.

    If the reply is not received within the timeout period, the kernel of the client machine retransmits the request message.

    Fault-tolerant communication between client and server

    QUESTIONS

    Why are MAC protocols needed?

    Write an algorithm to detect the loss of a token in a token ring scheme for MAC.

    Suggest a priority-based token ring scheme for MAC that does not lead to the starvation of a low-priority site when higher-priority sites always have something to transmit.

    IDEMPOTENCY & HANDLING DUPLICATE REQUESTS

    An idempotent operation produces the same results without any side effects no matter how many times it is performed with the same arguments.

    Operations that do not necessarily produce the same results when executed repeatedly with the same arguments are said to be nonidempotent.

    A NONIDEMPOTENT ROUTINE
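
    A routine of this kind might look like the following sketch. The `Account` class, the starting balance of 1000, and the debit amount of 100 are illustrative assumptions. The square-root call is included only as a contrasting idempotent operation.

```python
import math


class Account:
    """Hypothetical server-side state used to illustrate nonidempotency."""

    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        """Nonidempotent: every execution changes the stored balance."""
        if self.balance >= amount:
            self.balance -= amount
            return ("success", self.balance)
        return ("failure", self.balance)


account = Account(1000)
first = account.debit(100)       # the original request
duplicate = account.debit(100)   # the same request executed again after a retransmission

# Idempotent by contrast: repeating the call gives the same result and has no side effects.
roots = (math.sqrt(64), math.sqrt(64))
```

    If a lost reply causes the client kernel to retransmit the debit request and the server executes it again, the account is debited twice, which is why duplicate requests have to be detected.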

    To implement exactly-once semantics, a unique identifier is used for every request that the client makes, and the kernel of the server machine keeps a reply cache of the replies it has already sent.

    Before forwarding a request to a server for processing, the kernel of the server machine checks whether a reply already exists in the reply cache for the request.

    If so, this is a duplicate request that has already been processed. The previously computed result is then extracted from the reply cache and a new response message is sent to the client.

    Otherwise, the request is a new one and is forwarded to the server for processing.
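
    The reply-cache check can be sketched as follows. The `ServerKernel` class, the form of the request identifier (a client name paired with a sequence number), and the debit handler are assumptions made for the example. A real kernel would also have to decide when cache entries may be discarded, which is not modelled here.

```python
class ServerKernel:
    """Sketch of a kernel that filters duplicate requests with a reply cache."""

    def __init__(self, handler):
        self.handler = handler   # the server procedure to run for new requests
        self.reply_cache = {}    # request identifier -> reply already sent
        self.executions = 0      # how many times the handler actually ran

    def deliver(self, request_id, argument):
        if request_id in self.reply_cache:
            # Duplicate request: resend the cached reply without re-executing.
            return self.reply_cache[request_id]
        self.executions += 1
        reply = self.handler(argument)
        self.reply_cache[request_id] = reply
        return reply


balance = {"value": 1000}  # hypothetical state changed by a nonidempotent operation


def debit(amount):
    balance["value"] -= amount
    return balance["value"]


kernel = ServerKernel(debit)
first = kernel.deliver(("client-1", 1), 100)
retry = kernel.deliver(("client-1", 1), 100)   # retransmitted duplicate
second = kernel.deliver(("client-1", 2), 100)  # genuinely new request
```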

    MULTIDATAGRAM MESSAGES

    Networks have an upper bound on the size of data that can be transmitted at a time. This size is known as the maximum transfer unit (MTU) of a network.

    A message whose size is greater than the MTU has to be fragmented into pieces no larger than the MTU, and then each fragment has to be sent separately.

    Each fragment is sent in a packet that has some control information in addition to the message data. Each packet is known as a datagram.

    Messages smaller than the MTU of the network can be sent in a single packet and are known as single-datagram messages.

    Messages larger than the MTU of the network have to be fragmented and sent in multiple packets. Such messages are known as multidatagram messages.
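
    Fragmentation itself is simple slicing, as the following sketch shows. For simplicity it assumes that the MTU counts only message data and ignores the control information carried in each packet. The function names and the sizes used are illustrative.

```python
def fragment(message, mtu):
    """Split a bytes message into chunks of at most `mtu` bytes each."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    chunks = [message[i:i + mtu] for i in range(0, len(message), mtu)]
    return chunks or [b""]  # an empty message still occupies one packet


def is_multidatagram(message, mtu):
    """True if the message does not fit into a single packet."""
    return len(message) > mtu


packets = fragment(b"x" * 2500, mtu=1000)
sizes = [len(packet) for packet in packets]
```

    With these values the 2500-byte message becomes two full packets and one 500-byte packet, while a short message stays a single-datagram message.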

    Keeping Track of Lost and Out-of-Sequence Packets in Multidatagram Messages

    The logical transfer of a message consists of the physical transfer of several packets.

    A message transmission is complete only when all the packets of the message have been received by the process to which it is sent.

    For successful completion of a multidatagram message transfer, reliable delivery of every packet is important.

    A stop-and-wait protocol can be used for this purpose, but a separate acknowledgment packet for each request packet leads to communication overhead.

    A better approach is to use a single acknowledgment packet for all the packets of a multidatagram message (called the blast protocol).

    When the blast protocol is used, a node crash or a communication link failure may lead to problems such as:

    1. One or more packets of the multidatagram message are lost in communication.

    2. The packets are received out of sequence by the receiver.

    MECHANISM TO COPE WITH PROBLEMS

    Use a bitmap to identify the packets of a message.

    The header part of each packet consists of two extra fields: one specifies the total number of packets in the multidatagram message, and the other is the bitmap field that specifies the position of this packet in the complete message.

    All packets carry the total number of packets in the message, so even when packets are received out of sequence, the receiver can set aside a buffer area for the entire message and place each received packet in its proper position inside the buffer area.

    After a timeout, if all packets have not yet been received, a bitmap indicating the unreceived packets is sent to the sender. The sender retransmits only those packets that have not been received by the receiver. This technique is called selective repeat.
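
    The buffer-and-bitmap bookkeeping can be sketched as below. As a simplification, the position of a packet is passed as a 0-based index rather than as a one-bit-set bitmap field, and the class and function names are hypothetical.

```python
class Reassembler:
    """Receiver-side buffer for one multidatagram message."""

    def __init__(self):
        self.buffer = None  # allocated when the first packet arrives

    def receive(self, total, position, data):
        """Store one packet; `total` and `position` come from its header."""
        if self.buffer is None:
            self.buffer = [None] * total  # room for the entire message
        self.buffer[position] = data

    def missing_bitmap(self):
        """Bitmap returned after a timeout: 1 marks a packet still missing."""
        return [1 if slot is None else 0 for slot in self.buffer]

    def message(self):
        if None in self.buffer:
            raise ValueError("message is still incomplete")
        return b"".join(self.buffer)


def selective_repeat(packets, bitmap):
    """Sender side: pick only the packets the bitmap reports as missing."""
    return [(i, packets[i]) for i, bit in enumerate(bitmap) if bit]


packets = [b"AA", b"BB", b"CC", b"DD"]
receiver = Reassembler()
receiver.receive(4, 2, packets[2])   # arrives first, out of sequence
receiver.receive(4, 0, packets[0])   # packets 1 and 3 were lost
bitmap = receiver.missing_bitmap()
for position, data in selective_repeat(packets, bitmap):
    receiver.receive(4, position, data)
reassembled = receiver.message()
```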

    GROUP COMMUNICATION

    Depending on single or multiple senders and receivers, three types of group communication are possible:

    1. One to many (single sender and multiple receivers).

    2. Many to one (multiple senders and single receiver).

    3. Many to many (multiple senders and multiple receivers).

    One-to-Many Communication

    There are multiple receivers for a message sent by a single sender.

    Also known as multicast communication.

    A special case of multicast communication is broadcast communication, in which the message is sent to all processors connected to a network.

    Group Management

    Receiver processes of a message form a group. Such groups are of two types: closed and open.

    In a closed group, only the members of the group can send a message to the group.

    An outside process cannot send a message to the group as a whole, although it may send a message to an individual member of the group.

    In an open group, any process in the system can send a message to the group as a whole.

    A group of processes working on a common problem need not communicate with outside processes and can form a closed group.

    A group of replicated servers meant for distributed processing of client requests must form an open group so that client processes can send their requests to them.

    A flexible message-passing system with a group communication facility should support both types of groups.

    A message-passing system with a group communication facility provides the flexibility to create and delete groups dynamically and to allow a process to join or leave a group at any time.

    The message-passing system needs a mechanism to manage the groups and their membership information. A centralized group server process is used for this purpose.

    Buffered and Unbuffered Multicast

    A multicast send cannot be synchronous for two reasons:

    It is unrealistic to expect a sending process to wait until all the receiving processes that belong to the multicast group are ready to receive the multicast message.

    The sending process may not be aware of all the receiving processes that belong to the multicast group.

    How a multicast message is treated on the receiving side depends on whether the multicast mechanism is buffered or unbuffered.

    For an unbuffered multicast, the message is not buffered for the receiving process and is lost if the receiving process is not in a state ready to receive it.

    The message is received only by those processes of the multicast group that are ready to receive it.

    For a buffered multicast, the message is buffered for the receiving processes, so each process of the multicast group will receive the message.
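
    The difference between the two mechanisms can be seen in a small simulation. The `Receiver` class and its `ready` flag are modelling assumptions. A real system would track whether a process is blocked in a receive call.

```python
from collections import deque


class Receiver:
    """Hypothetical member of a multicast group."""

    def __init__(self, name, ready):
        self.name = name
        self.ready = ready      # True if the process is waiting to receive
        self.delivered = []     # messages the process has actually received
        self.buffer = deque()   # used only by the buffered mechanism

    def become_ready(self):
        """Start receiving and drain anything that was buffered meanwhile."""
        self.ready = True
        while self.buffer:
            self.delivered.append(self.buffer.popleft())


def multicast(message, group, buffered):
    for receiver in group:
        if receiver.ready:
            receiver.delivered.append(message)
        elif buffered:
            receiver.buffer.append(message)
        # Unbuffered and not ready: the message is lost for this receiver.


a = Receiver("a", ready=True)
b = Receiver("b", ready=False)
multicast("m1", [a, b], buffered=False)  # b misses m1
multicast("m2", [a, b], buffered=True)   # m2 is kept for b
b.become_ready()
```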

    Send-to-All Semantics

    A copy of the message is sent to each process of the multicast group, and the message is buffered until it is accepted by the process.

    In bulletin-board semantics, a message to be multicast is addressed to a channel instead of being sent to every individual process of the multicast group, and a receiving process copies the message from the channel.

    Bulletin-board semantics is more flexible than send-to-all semantics for two reasons:

    The relevance of a message to a particular receiver may depend on the receiver's state.

    Messages not accepted within a certain time after transmission may no longer be useful; their value may depend on the sender's state.

    Flexible reliability in multicast communication

    0-reliable: No response is expected by the sender from any of the receivers. Useful for applications using asynchronous multicast in which the sender does not wait for any response after multicasting the message.

    1-reliable: The sender expects a response from any one of the receivers.

    m-out-of-n-reliable: The multicast group consists of n receivers and the sender expects a response from m (1 < m < n) of the n receivers.

    All-reliable: The sender expects a response message from all the receivers of the multicast group.
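
    The four degrees of reliability differ only in how many responses the sender waits for, which the following sketch makes explicit. The string labels and the function name are illustrative choices.

```python
def responses_required(level, n, m=None):
    """Number of responses the sender waits for in a group of n receivers."""
    if level == "0-reliable":
        return 0
    if level == "1-reliable":
        return 1
    if level == "m-out-of-n-reliable":
        if m is None or not 1 < m < n:
            raise ValueError("m must satisfy 1 < m < n")
        return m
    if level == "all-reliable":
        return n
    raise ValueError("unknown reliability level: %r" % level)


group_size = 5
needed = {
    "0-reliable": responses_required("0-reliable", group_size),
    "1-reliable": responses_required("1-reliable", group_size),
    "m-out-of-n-reliable": responses_required("m-out-of-n-reliable", group_size, m=3),
    "all-reliable": responses_required("all-reliable", group_size),
}
```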

    Atomic Multicast

    Atomic multicast has an all-or-nothing property.

    When a message is sent to a group by atomic multicast, it is either received by all the processes that are members of the group or else it is not received by any of them.

    When a process fails, it is no longer a member of the multicast group.

    When the process comes up after failure, it must join the group afresh.

    Applications for which the degree of reliability requirement is 0-reliable, 1-reliable, or m-out-of-n-reliable do not need an atomic multicast facility.

    Applications for which the degree of reliability requirement is all-reliable need an atomic multicast facility.

    A flexible message-passing system:

    1. should support both atomic and nonatomic multicast facilities, and

    2. should provide the flexibility to the sender of a multicast message to specify in the send primitive whether the atomicity property is required or not for the message being multicast.

    Method to implement atomic multicast in all-reliable communication

    The kernel of the sending machine sends the message to all members of the group and waits for an acknowledgment from each member.

    After a timeout period, the kernel retransmits the message to all those members from whom an acknowledgment message has not yet been received.

    The timeout-based retransmission of the message is repeated until an acknowledgment is received from all members of the group.

    When all acknowledgments have been received, the kernel confirms to the sender that the atomic multicast process is complete.

    This method works fine only as long as the machines of the sender process and the receiver processes do not fail during an atomic multicast operation.
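
    The retransmit-until-all-acknowledge loop can be sketched as follows. The `send` callable, the per-member loss counts, and the bound on the number of rounds are assumptions made so that the example is deterministic and terminates. Like the method described above, the sketch does not handle machine failures.

```python
def atomic_multicast(members, send, max_rounds=10):
    """Retransmit to unacknowledged members until every member has acknowledged.

    `send(member)` is a hypothetical callable that transmits the message once
    and returns True if that member's acknowledgment arrived before the
    timeout. Returns the number of rounds that were needed.
    """
    pending = set(members)
    for round_number in range(1, max_rounds + 1):
        pending = {member for member in pending if not send(member)}
        if not pending:
            return round_number  # the kernel can confirm completion to the sender
    raise TimeoutError("members never acknowledged: %s" % sorted(pending))


# Simulated network: number of transmissions lost before each member acknowledges.
lost_transmissions = {"A": 0, "B": 1, "C": 2}
sent_log = []


def send(member):
    sent_log.append(member)
    if lost_transmissions[member] > 0:
        lost_transmissions[member] -= 1
        return False
    return True


rounds = atomic_multicast(["A", "B", "C"], send)
```

    Only the members that have not yet acknowledged receive the retransmission, so member A is contacted once, B twice, and C three times.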

    A fault-tolerant atomic multicast protocol must ensure that a multicast will be delivered to all members of the multicast group even in the event of failure of the sender's machine or a receiver's machine.

    Each message has a message identifier field to distinguish it from all other messages and a field to indicate that it is an atomic multicast message.

    Many-to-One Communication

    Multiple senders send messages to a single receiver.

    The single receiver may be selective or nonselective.

    A selective receiver specifies a unique sender; a message exchange takes place only if that sender sends a message.

    A nonselective receiver specifies a set of senders, and if any one sender in the set sends a message to this receiver, a message exchange takes place.
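
    Both kinds of receiver can be expressed with one receive operation that takes the set of acceptable senders. The mailbox list and the process names in this sketch are hypothetical.

```python
def receive(mailbox, allowed_senders):
    """Remove and return the first (sender, message) pair from an allowed sender.

    Returns None if no message from an allowed sender is waiting.
    """
    for index, (sender, _message) in enumerate(mailbox):
        if sender in allowed_senders:
            return mailbox.pop(index)
    return None


mailbox = [("P1", "a"), ("P2", "b"), ("P3", "c")]

selective = receive(mailbox, {"P2"})           # selective: one unique sender
nonselective = receive(mailbox, {"P1", "P3"})  # nonselective: any sender in the set
nothing = receive(mailbox, {"P2"})             # P2 has no further message waiting
```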

    Many-to-Many Communication

    Multiple senders send messages to multiple receivers.

    The one-to-many and many-to-one schemes are implicit in this scheme.

    The issues related to the one-to-many and many-to-one schemes also apply to the many-to-many communication scheme.

    An important issue related to the many-to-many communication scheme is that of ordered message delivery.

    REMOTE PROCEDURE CALL (RPC)

    An independently developed IPC (interprocess communication) protocol is tailored specifically to one application and does not provide a foundation on which to build a variety of distributed applications.

    A need was felt for a general IPC protocol that can be used for designing several distributed applications.

    RPC provides a valuable communication mechanism that is suitable for building a fairly large number of distributed applications.

    THE RPC MODEL

    The RPC model is similar to the well-known local procedure call model, which works as follows:

    To make a procedure call, the caller places the arguments to the procedure in some well-specified location.

    Control is then transferred to the sequence of instructions that constitutes the body of the procedure.

    The procedure is executed in a newly created execution environment that includes copies of the arguments given in the calling instruction.

    After the procedure's execution is over, control returns to the calling point, returning a result.

    Model of RPC

    Features of RPC

    Simple call syntax.

    Well-defined interface. This property is used to support compile-time type checking and automated interface generation.

    The clean and simple semantics of a procedure call.

    Generality - in single-machine computations, procedure calls are the most important mechanism for communication between parts of the algorithm.

    Efficiency - procedure calls are simple enough for communication to be quite rapid.

    It can be used as an IPC mechanism to communicate between processes on different machines as well as between different processes on the same machine.

    TRANSPARENCY OF RPC

    Syntactic transparency - a remote procedure call should have exactly the same syntax as a local procedure call.

    Semantic transparency - the semantics of a remote procedure call are identical to those of a local procedure call.

    LIMITATIONS

    Achieving exactly the same semantics for remote procedure calls (RPCs) as for local procedure calls (LPCs) is almost impossible, mainly for the following reasons:

    In remote procedure calls, the called procedure is executed in an address space that is disjoint from the calling program's address space.

    For this reason, the called (remote) procedure cannot access any variables or data values in the calling program's environment.

    In the absence of shared memory, it is meaningless to pass addresses in arguments.

    IMPLEMENTING RPC MECHANISM

    RPC involves a client process and a server process. Therefore, to conceal the interface of the RPC system from both the client and server processes, a separate stub procedure is associated with each of the two processes.

    To hide the existence and functional details of the underlying network, an RPC communication package (known as RPCRuntime) is used on both the client and server sides.

    Implementation of an RPC mechanism involves five elements:

    The client

    The client stub

    The RPCRuntime

    The server stub

    The server

    Client

    A user process that initiates a remote procedure call. To make a remote procedure call, the client makes a perfectly normal local call that invokes a corresponding procedure in the client stub.

    Client Stub

    The client stub is responsible for carrying out the following two tasks:

    1. On receipt of a call request from the client, it packs a specification of the target procedure and the arguments into a message and then asks the local RPCRuntime to send it to the server stub.

    2. On receipt of the result of procedure execution, it unpacks the result and passes it to the client.

    RPCRuntime

    The RPCRuntime handles transmission of messages across the network between client and server machines.

    It is responsible for retransmissions, acknowledgments, packet routing, and encryption.

    The RPCRuntime on the client machine receives the call request message from the client stub and sends it to the server machine.

    It also receives the message containing the result of procedure execution from the server machine and passes it to the client stub.

    The RPCRuntime on the server machine receives the message containing the result of procedure execution from the server stub and sends it to the client machine.

    It also receives the call request message from the client machine and passes it to the server stub.

    Server

    On receiving a call request from the server stub, the server executes the appropriate procedure and returns the result of procedure execution to the server stub.

    The beauty of the whole scheme is the total ignorance on the part of the client that the work was done remotely instead of by the local kernel.
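
    The division of work among the five elements can be illustrated with a toy, single-process sketch. Several things here are assumptions: JSON is used as the message encoding, the runtime is a plain function call standing in for the network, and the server stub is assumed to mirror the client stub (unpack the call, invoke the procedure, pack the result). The names `remote_add`, `server_stub`, and `rpc_runtime_send` are hypothetical.

```python
import json


# Server: an ordinary procedure that knows nothing about messages.
def add(a, b):
    return a + b


PROCEDURES = {"add": add}


def server_stub(call_message):
    """Unpack the call message, invoke the server procedure, pack the result."""
    call = json.loads(call_message)
    result = PROCEDURES[call["procedure"]](*call["arguments"])
    return json.dumps({"result": result})


def rpc_runtime_send(call_message):
    """Stand-in for the RPCRuntime pair: carries the call over and the reply back."""
    return server_stub(call_message)


def remote_add(a, b):
    """Client stub: pack the target procedure and arguments, unpack the result."""
    call_message = json.dumps({"procedure": "add", "arguments": [a, b]})
    reply_message = rpc_runtime_send(call_message)
    return json.loads(reply_message)["result"]


# Client: makes what looks like a perfectly normal local call.
total = remote_add(2, 3)
```

    From the client's point of view, `remote_add(2, 3)` is indistinguishable from a local call, which is the syntactic transparency the stubs are meant to provide.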

    Implementation of RPC mechanism

    STUB GENERATION

    Stubs can be generated in two ways:

    1. Manually

    The RPC implementer provides a set of translation functions from which a user can construct his or her own stubs. This method is simple to implement and can handle very complex parameter types.

    2. Automatically

    This method uses an Interface Definition Language (IDL) to define the interface between a client and a server. An interface definition is a list of procedure names supported by the interface, together with the types of their arguments and results. This is sufficient information for the client and server to independently perform compile-time type checking and to generate appropriate calling sequences.

    RPC MESSAGES

    Call messages are sent by the client to the server to request execution of a particular remote procedure.

    Reply messages are sent by the server to the client to return the result of remote procedure execution.

    COMPONENTS NECESSARY IN A CALL MESSAGE

    The identification information of the remote procedure to be executed.

    The arguments necessary for the execution of the procedure.

    A message identification field that consists of a sequence number (for identifying lost messages and duplicate messages in case of system failures and for properly matching reply messages to outstanding call messages, in cases where the replies of several outstanding call messages arrive out of order).

    A message type field that is used to distinguish call messages from reply messages. For example, in an RPC system, this field may be set to 0 for all call messages and set to 1 for all reply messages.

    A client identification field that may be used for two purposes: to allow the server of the RPC to identify the client to whom the reply message has to be returned and to allow the server to check the authentication of the client process for executing the concerned procedure.
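
    These components map naturally onto a record type. The sketch below uses Python dataclasses. The field names, the example procedures, and the `match_replies` helper are illustrative assumptions rather than a defined wire format. The helper shows how the sequence number pairs replies with outstanding calls even when the replies arrive out of order.

```python
from dataclasses import dataclass

CALL, REPLY = 0, 1  # message type values, following the example in the text


@dataclass
class CallMessage:
    message_id: int     # sequence number
    client_id: str      # who should receive the reply, also used for authentication
    procedure_id: str   # identifies the remote procedure to execute
    arguments: tuple
    message_type: int = CALL


@dataclass
class ReplyMessage:
    message_id: int     # copied from the call message being answered
    result: object
    message_type: int = REPLY


def match_replies(calls, replies):
    """Pair each reply with its outstanding call by sequence number."""
    outstanding = {call.message_id: call for call in calls}
    matched = {}
    for reply in replies:
        call = outstanding.get(reply.message_id)
        if call is not None:
            matched[reply.message_id] = (call.procedure_id, reply.result)
    return matched


calls = [CallMessage(1, "c1", "sqrt", (64,)), CallMessage(2, "c1", "add", (2, 3))]
replies = [ReplyMessage(2, 5), ReplyMessage(1, 8.0)]  # arrive out of order
matched = match_replies(calls, replies)
```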

    RPC call message format

    When a server receives a call message, it may face one of the following conditions:

    The call message is not intelligible to the server. This may happen when a call message violates the RPC protocol. The server will reject such calls.

    The server detects by scanning the client's identifier field that the client is not authorized to use the service. The server will return an unsuccessful reply without bothering to make an attempt to execute the procedure.

    The server finds that the remote program, version, or procedure number specified in the remote procedure identifier field of the call message is not available with it. The server will return an unsuccessful reply without bothering to make an attempt to execute the procedure.

    An incompatible RPC interface is being used by the client and server.

    An exception condition (such as division by zero) occurs while executing the specified remote procedure.

    The specified remote procedure is executed successfully.
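
    Most of these conditions can be expressed as a chain of checks on the server side. In the sketch below the call message is a plain dictionary, the status strings are illustrative, and the incompatible-interface case is not modelled. None of these choices comes from a particular RPC system.

```python
def handle_call(call, procedures, authorized_clients):
    """Return a (status, value) pair for one call message given as a dict."""
    required = {"client_id", "procedure", "arguments"}
    if not isinstance(call, dict) or not required.issubset(call):
        return ("rejected", "call message not intelligible")
    if call["client_id"] not in authorized_clients:
        return ("unsuccessful", "client not authorized")
    if call["procedure"] not in procedures:
        return ("unsuccessful", "procedure not available")
    try:
        result = procedures[call["procedure"]](*call["arguments"])
    except Exception as error:
        # An exception condition occurred while executing the procedure.
        return ("unsuccessful", "exception: %s" % type(error).__name__)
    return ("successful", result)


procedures = {"divide": lambda a, b: a / b}
clients = {"c1"}

ok = handle_call({"client_id": "c1", "procedure": "divide", "arguments": [6, 3]}, procedures, clients)
garbled = handle_call({"procedure": "divide"}, procedures, clients)
intruder = handle_call({"client_id": "c9", "procedure": "divide", "arguments": [6, 3]}, procedures, clients)
unknown = handle_call({"client_id": "c1", "procedure": "sqrt", "arguments": [4]}, procedures, clients)
failed = handle_call({"client_id": "c1", "procedure": "divide", "arguments": [1, 0]}, procedures, clients)
```

    The checks are ordered so that the procedure is executed only after the message has been parsed and the client has been authorized, matching the "without bothering to make an attempt to execute" wording above.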

    RPC REPLY MESSAGE FORMAT

    Figure panels: successful reply and unsuccessful reply.