Remote Procedure Calls
Introduction
The IPC part of a distributed system can often be conveniently handled by the message-passing model, but that model does not offer a uniform panacea for all needs. RPC can be regarded as a special case of the message-passing model.
It has become widely accepted because of the following features:
Simple call syntax and similarity to local procedure calls.
A well-defined interface, which supports compile-time type checking and automated interface generation.
Ease of use, efficiency, and generality.
It can be used as an IPC mechanism between processes on different machines and also between different processes on the same machine.
RPC Model
It is similar to the commonly used procedure call model. It works in the following manner:
1. To make a procedure call, the caller places the arguments to the procedure in some well-specified location.
2. Control is then transferred to the sequence of instructions that constitutes the body of the procedure.
3. The procedure body is executed in a newly created execution environment that includes copies of the arguments given in the calling instruction.
4. After the procedure execution is over, control returns to the calling point, returning a result.
The RPC enables a call (either from a local or remote process) to be made to a procedure that does not reside in the address space of the calling process.
Since the caller and the callee processes have disjoint address spaces, the remote procedure has no access to the data and variables of the caller's environment.
Therefore, the RPC facility uses a message-passing scheme for information exchange between the caller and the callee processes.
On arrival of a request message, the server process extracts the procedure's parameters, computes the result, sends a reply message, and then awaits the next call message.
In this basic model, only one of the two processes is active at any given time.
The caller need not always block. Other RPC models are possible, depending on the parallelism available in the caller's and the callee's environments and on other implementation features.
The RPC could be asynchronous, so that the client may do useful work while waiting for the reply from the server.
The server can create a thread to process each incoming request so that it remains free to receive other requests.
Transparency of RPC
A transparent RPC is one in which the local and remote procedure calls are indistinguishable to the programmers. Types of transparencies:
Syntactic transparency
A remote procedure call should have exactly the same syntax as a local procedure call.
Semantic transparency
The semantics of a remote procedure call are identical to those of a local procedure call.
Syntactic transparency is easy to achieve, but semantic transparency is difficult.
RPCs vs. local procedure calls
1. Unlike local procedure calls, with RPCs the called procedure is executed in an address space that is disjoint from the calling program's address space.
There is no shared memory.
So it is meaningless to pass arguments by reference or to pass addresses and pointers as arguments.
2. RPCs are more vulnerable to failure.
Processor crashes and network communication problems are possible.
3. RPCs are much more time-consuming than local procedure calls because of the involvement of a communication network.
For these reasons, total semantic transparency is impossible.
Implementing RPC Mechanism
To achieve the goal of semantic transparency, the implementation of the RPC mechanism is based on the concept of stubs.
Stubs
Stubs provide a normal local procedure call abstraction by concealing the underlying RPC mechanism. A separate stub procedure is associated with each of the client and server processes. To hide the underlying communication network, an RPC communication package known as the RPCRuntime is used on both sides.
Thus the implementation of RPC involves five program elements:
1. Client
2. Client stub
3. RPCRuntime
4. Server stub
5. Server
The client, the client stub, and one instance of RPCRuntime execute on the client machine. The server, the server stub, and one instance of RPCRuntime execute on the server machine.
As far as the client is concerned, remote services are accessed by the user by making ordinary local procedure calls instead of using the send and receive primitives.
Client Stub
It is responsible for the following two tasks:
On receipt of a call request from the client, it packs a specification of the target procedure and the arguments into a message and asks the local RPCRuntime to send it to the server stub.
On receipt of the result of procedure execution, it unpacks the result and passes it to the client.
RPCRuntime
It handles the transmission of messages across the network between the client and the server machines. It is responsible for retransmission, acknowledgement, routing, and encryption.
Server Stub
It is responsible for the following two tasks:
On receipt of a call request message from the local RPCRuntime, it unpacks it and makes a perfectly normal call to invoke the appropriate procedure in the server.
On receipt of the result of procedure execution from the server, it packs the result into a message and then asks the local RPCRuntime to send it to the client stub.
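The division of work between the client stub, the RPCRuntime, and the server stub can be sketched in a few lines. This is only an illustration under stated assumptions: JSON stands in for the message encoding, the names `client_stub_add`, `server_stub`, `rpc_runtime_send`, and `SERVER_PROCEDURES` are hypothetical, and the runtime is a direct function call rather than a network transport.

```python
import json

def add(a, b):
    """Server procedure: in a real system it runs in the server's address space."""
    return a + b

# Hypothetical dispatch table mapping procedure names to server procedures.
SERVER_PROCEDURES = {"add": add}

def server_stub(call_message: bytes) -> bytes:
    """Unpack the call message, make a normal local call, pack the result."""
    call = json.loads(call_message.decode("utf-8"))
    result = SERVER_PROCEDURES[call["procedure"]](*call["args"])
    return json.dumps({"result": result}).encode("utf-8")

def rpc_runtime_send(call_message: bytes) -> bytes:
    """Stand-in for the RPCRuntime; a real one would send over the network."""
    return server_stub(call_message)

def client_stub_add(a, b):
    """Looks like an ordinary local procedure to the client."""
    # Pack the target procedure specification and the arguments into a message.
    call_message = json.dumps({"procedure": "add", "args": [a, b]}).encode("utf-8")
    reply_message = rpc_runtime_send(call_message)
    # Unpack the result and hand it back to the caller.
    return json.loads(reply_message.decode("utf-8"))["result"]
```

The client simply writes `client_stub_add(2, 3)` and never sees the messages.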
Stub Generation
Stubs can be generated in one of the following two ways:
Manually
Automatically
Automatic Stub Generation
An Interface Definition Language (IDL) is used to define the interface between a client and a server.
Interface definition:
It is a list of the procedure names supported by the interface, together with the types of their arguments and results.
It also plays a role in reducing data storage and controlling the amount of data transferred over the network.
It contains information about type definitions, enumerated types, and defined constants.
Export the interface
A server program that implements the procedures in an interface is said to export the interface.
Import the interface
A client program that calls procedures from an interface is said to import the interface.
The interface definition is compiled by the IDL compiler.
The IDL compiler generates:
Components that can be combined with client and server programs without making any changes to the existing compilers.
Client stub and server stub procedures.
The appropriate marshaling and unmarshaling operations.
A header file that supports the data types.
RPC Messages
The RPC system is independent of transport protocols and is not concerned with how a message is passed from one process to another.
Types of messages involved in the implementation of an RPC system:
Call messages
Reply messages
Call Messages
Components necessary in a call message are:
1. The identification information of the remote procedure to be executed.
2. The arguments necessary for the execution of the procedure.
3. A message identification field that consists of a sequence number for identifying lost and duplicate messages.
4. A message type field that is used to distinguish between call and reply messages.
5. A client identification field that allows the server to identify the client to whom the reply message has to be returned and to authenticate the client process.
Call Message Format
Reply Messages
These are sent by the server to the client for returning the result of remote procedure execution.
Conditions under which the server sends an unsuccessful reply:
The server finds that the call message is not intelligible to it.
The client is not authorized to use the service.
The remote procedure identifier is missing.
The remote procedure is not able to decode the supplied arguments.
An exception condition occurs.
Reply message formats
A successful reply message format: message identifier, message type, reply status (successful), result.
An unsuccessful reply message format: message identifier, message type, reply status (unsuccessful), reason for failure.
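The call and reply message fields listed above can be modelled as simple records. The sketch below is illustrative only: the class names, the numeric type codes `CALL` and `REPLY`, and the failure strings are assumptions, not a wire format defined by any particular RPC system.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

CALL, REPLY = 0, 1  # hypothetical message type codes

@dataclass
class CallMessage:
    message_id: int              # sequence number for lost/duplicate detection
    client_id: str               # who to reply to, and who to authenticate
    procedure_id: str            # which remote procedure to execute
    arguments: Tuple[Any, ...]
    message_type: int = CALL

@dataclass
class ReplyMessage:
    message_id: int
    successful: bool             # reply status
    result: Any = None           # present in a successful reply
    failure_reason: Optional[str] = None  # present in an unsuccessful reply
    message_type: int = REPLY

def make_reply(call, procedures, authorized_clients):
    """Build a successful or unsuccessful reply for one call message."""
    if call.client_id not in authorized_clients:
        return ReplyMessage(call.message_id, False, failure_reason="client not authorized")
    if call.procedure_id not in procedures:
        return ReplyMessage(call.message_id, False, failure_reason="unknown procedure")
    try:
        result = procedures[call.procedure_id](*call.arguments)
    except Exception as exc:  # an exception condition during execution
        return ReplyMessage(call.message_id, False, failure_reason=f"exception: {exc}")
    return ReplyMessage(call.message_id, True, result=result)
```

The reply carries the same message identifier as the call so that the client can match the two.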
Marshalling Arguments and Results
The arguments and results in remote procedure calls are language-level data structures, which are transferred in the form of message data. Transfer of data between two computers requires encoding and decoding of the message data.
Encoding and decoding of messages in RPC are known as marshaling and unmarshaling, respectively.
Classification of marshaling procedures:
Those provided as a part of the RPC software.
Those that are defined by the users of the RPC system.
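As a small illustration of marshaling, the sketch below encodes the arguments of a hypothetical read request into bytes and decodes them again, using Python's standard `struct` module. The field layout (two 32-bit integers followed by a length-prefixed file name, in network byte order) is an assumption chosen for the example.

```python
import struct

HEADER = "!IIH"  # network byte order: offset, count, length of the file name

def marshal_read_request(offset: int, count: int, filename: str) -> bytes:
    """Encode language-level values into message data."""
    name = filename.encode("utf-8")
    return struct.pack(HEADER, offset, count, len(name)) + name

def unmarshal_read_request(data: bytes):
    """Decode message data back into language-level values."""
    offset, count, name_len = struct.unpack_from(HEADER, data, 0)
    start = struct.calcsize(HEADER)
    name = data[start:start + name_len].decode("utf-8")
    return offset, count, name
```

A fixed byte order matters because the two machines may represent integers differently.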
The Issue
Connections on the WWW are stateless. Every time a link is followed, it is like the first time to the server: it has no memory of previous connections.
Why Bother To Fix This?
By saving state we can
Save configuration information between sessions.
Make adaptive websites (which change themselves to suit users' behaviour).
Enable e-commerce applications (shopping carts).
Violate users' privacy by tracking which websites and web pages they visit.
Saving State In General
Cookie Concept
Cookie is a computing term from long ago.
Something passed between subroutines or programs that enables the receiver to do something useful.
Basic idea
Client stores data for the server.
Client sends data to server with each request.
Session-level Authentication
Session (from ISO Reference Model)
Logical communication between two network end points. Sessions are composed of requests and responses that occur between applications in different network hosts. In browser terms, a session lasts for the lifetime of the browser's O/S process.
The Steps of Basic Authentication
Browser requests a resource from the server application, usually with the GET method.
Server replies with code 401 (authorization required).
Browser prompts the user for name & password.
Browser resends the request, including the name & password (in the request header).
Every time the browser makes a request for that resource it will send the name & password, until the end of the session.
The name & password are like a cookie that is stored in RAM. Because they are in RAM they will be forgotten when the browser quits (i.e., at the end of the session).
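The header that the browser resends can be built as follows. In HTTP Basic authentication the name and password are joined with a colon and Base64-encoded; the function name `basic_auth_header` is just an illustrative helper.

```python
import base64

def basic_auth_header(name: str, password: str) -> str:
    """Build the Authorization header line for HTTP Basic authentication."""
    credentials = f"{name}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"
```

Base64 is an encoding, not encryption, so the credentials are readable by anyone who can see the request unless the connection itself is protected.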
Server Management
Issues :
Server implementation
Server creation
Server Implementation
Types of servers :
Stateful servers
Stateless servers
Stateful Server
A stateful server maintains clients' state information from one RPC to the next.
Stateful servers provide an easier programming paradigm.
They are typically more efficient than stateless servers.
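Stateful file server
[Figure: stateful file server. The client process sends Open (filename, mode) and the server process returns fid. Read (fid, 100, buf) returns bytes 0 to 99; a second Read (fid, 100, buf) returns bytes 100 to 199. The server process keeps a file table holding fid, mode, and R/W pointer.]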
Stateless Server
Every request from a client must be accompanied by all the parameters necessary to carry out the desired operation.
Stateless servers have a distinct advantage over stateful servers in the event of a failure.
The choice between a stateless and a stateful server is purely application dependent.
Stateless file server
[Figure: stateless file server. The client process keeps the file state information (file name, mode, R/W pointer). Read (filename, 0, 100, buf) returns bytes 0 to 99; Read (filename, 100, 100, buf) returns bytes 100 to 199.]
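The two file-server styles can be contrasted with a minimal in-memory sketch. The class names and the dictionary-backed "file system" are assumptions made for illustration; the point is only where the read/write pointer lives.

```python
class StatefulFileServer:
    """Keeps a file table (fid -> file name, mode, R/W pointer) between calls."""

    def __init__(self, files):
        self.files = files          # file name -> bytes
        self.table = {}             # fid -> [file name, mode, R/W pointer]
        self.next_fid = 1

    def open(self, filename, mode):
        fid = self.next_fid
        self.next_fid += 1
        self.table[fid] = [filename, mode, 0]
        return fid

    def read(self, fid, count):
        entry = self.table[fid]
        filename, _mode, pointer = entry
        data = self.files[filename][pointer:pointer + count]
        entry[2] = pointer + len(data)   # server advances the R/W pointer
        return data


class StatelessFileServer:
    """Keeps nothing between calls; every request carries all parameters."""

    def __init__(self, files):
        self.files = files

    def read(self, filename, position, count):
        return self.files[filename][position:position + count]
```

If the stateful server crashes, its file table is lost and open files become unusable, whereas a restarted stateless server can serve the next request immediately because the client supplies the position.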
Server Creation Semantics
A server process is independent of a client process that makes a remote procedure call to it.
Server processes may either be created and installed before their client processes or be created on a demand basis.
Based on how long they survive, RPC servers are classified as:
Instance-per-call servers
Instance-per-session servers
Persistent servers
Instance-per-call Servers
They exist only for the duration of a single call. These servers are created by the RPCRuntime on the server machine only when a call message arrives. The server is deleted after the call has been executed.
Servers of this type are stateless, and they are more expensive when the same type of service is invoked repeatedly.
Instance-per-session Server
These servers exist for the entire session for which a client and a server interact.
Such a server can maintain intercall state information, and the overhead of server creation and destruction is minimized for a client-server session that involves a large number of calls.
Persistent Servers
Persistent servers remain in existence indefinitely.
Advantages:
Most commonly used.
Improve performance and reliability.
Shared by many clients.
Each server exports its services by registering with the binding agent.
Parameter Passing Semantics
Call-by-value semantics
Call-by-reference semantics
Call-by-value
All parameters are copied into a message that is transmitted from the client to the server through the intervening network. Passing large data structures such as trees and arrays this way is time-consuming.
Call-by-reference
Most RPC mechanisms do not use call-by-reference semantics for parameter passing, because the client and the server exist in different address spaces.
Distributed systems that have distributed shared-memory mechanisms can allow passing of parameters by reference.
Call-by-object-reference
A call-by-reference in which object invocation is used by the RPC mechanism. Here, the value of a variable is a reference to an object.
Call-by-move & call-by-visit
A parameter is passed by reference, as in call-by-object-reference, but at the time of the call the parameter object is moved to the destination node (the callee's node). If the object remains at the callee's node after the call, this is call-by-move; if it returns to the caller's node, it is call-by-visit.
This allows packaging of the argument objects in the same network packet as the invocation message, thereby reducing the network traffic and message count.
Call Semantics
Failure of the communication link between the caller and the callee nodes is possible. Failures can be caused by:
Loss of the call message
Loss of the reply message
Caller or callee node crash
Link failure
Failure-handling code is part of the RPCRuntime.
Types of Call Semantics
Possibly or may-be call semantics
Last-one call semantics
Last-of-many call semantics
At-least-once call semantics
Exactly-once call semantics
Possibly or May-Be Call Semantics
It uses the Request protocol. The caller uses a timeout, but there is no guarantee of a reply. It is an asynchronous RPC. The server need not send back a reply. It is useful for periodic update services.
Last-One Call Semantics
It uses the Request / Reply protocol. The call message is retransmitted, based on timeouts, until the result is received by the caller. It can give rise to orphan calls (calls whose parent has expired due to a node crash). It is useful in designing simple RPCs. It has problems with nested RPCs.
Last-of-many Call Semantic
It is similar to last-one semantic except that the orphan calls are neglected.
When a call is repeated, it is assigned a new call identifier. A caller accepts a response only if the call identifier associated with it matches with the identifier of the most recently repeated call, otherwise it ignores the response message.
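The identifier check described above can be sketched as follows. The class and method names are hypothetical, and message sending is left out; only the bookkeeping of call identifiers is shown.

```python
class LastOfManyCaller:
    """Tracks the identifier of the most recently repeated call."""

    def __init__(self):
        self.current_call_id = 0

    def send_or_repeat_call(self):
        # Every transmission, including a repeat, gets a new call identifier.
        self.current_call_id += 1
        return self.current_call_id

    def accept(self, response_call_id):
        # Only a response to the most recent repetition is accepted;
        # responses to earlier (possibly orphan) calls are ignored.
        return response_call_id == self.current_call_id
```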
At-least-once Call Semantic
It guarantees that the call is executed one or more times but does not specify which results are returned to the caller.
It uses timeout-based retransmissions without caring for the orphan calls. If there are any orphan calls, it takes the result of the first response message and ignores the others, whether or not the accepted response is from an orphan.
Exactly-Once Call Semantics
It uses the Request / Reply / ACK protocol. No matter how many times a call is retransmitted, the procedure is executed only once.
The server deletes information from its reply cache only after receiving an ACK from the client.
Implementation of the RRA protocol requires that the unique message identifiers associated with request messages be ordered.
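A reply cache of the kind mentioned above might look like the sketch below. The names are hypothetical and there is no network or timeout handling; it only shows that a retransmitted request is answered from the cache instead of being executed again, and that the cache entry is dropped on ACK.

```python
class ExactlyOnceServer:
    def __init__(self, procedure):
        self.procedure = procedure
        self.reply_cache = {}    # request identifier -> cached reply
        self.executions = 0      # how many times the procedure really ran

    def handle_request(self, request_id, *args):
        if request_id in self.reply_cache:
            # Duplicate request: resend the cached reply, do not re-execute.
            return self.reply_cache[request_id]
        self.executions += 1
        reply = self.procedure(*args)
        self.reply_cache[request_id] = reply
        return reply

    def handle_ack(self, request_id):
        # The client has the reply, so the cache entry can be deleted.
        self.reply_cache.pop(request_id, None)
```

This sketch does not use the ordering of identifiers; a fuller implementation could use it to discard stale requests that arrive after their cache entry has been deleted.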
Communication protocols for RPCs
1. Request protocol
2. Request / Reply protocol
3. Request / Reply / Acknowledge-Reply protocol
Request (R) protocol
[Figure: Request (R) protocol. For the first RPC, the client sends a request message and the server executes the procedure; the same exchange repeats for the next RPC.]
Request / Reply (RR) protocol
[Figure: Request / Reply (RR) protocol. For each RPC, the client sends a request message, the server executes the procedure and returns a reply message. The reply message also serves as the acknowledgement for the request message.]
Request / Reply / Acknowledge-Reply (RRA) protocol
[Figure: Request / Reply / Acknowledge-Reply (RRA) protocol. For each RPC, the client sends a request message, the server executes the procedure and returns a reply message, and the client then sends a reply acknowledgement message.]
Complicated RPCs
Types of complicated RPCs
RPCs involving long duration calls or Large gaps between calls
RPCs involving arguments and / or results that are too large to fit in a single datagram packet.
RPCs involving Long-Duration Calls & Large Gaps Between Calls
Two methods to handle these RPCs:
Periodic probing of the server by the clients.
Periodic generation of an ACK by the server.
Periodic probing of the server by the clients
After a client sends a request message to a server, it periodically sends a probe packet to the server, which the server is expected to acknowledge.
This allows the client to detect a server crash or a communication link failure and to notify the corresponding user of an exception condition.
Periodic generation of an ACK by the server
If a server is not able to generate the next packet significantly sooner than the expected retransmission time, it spontaneously generates an acknowledgement.
RPCs involving Long Messages
One approach is to use multidatagram messages. A single ACK packet is used for all the packets of a multidatagram message.
Another, cruder method is to break one logical RPC into several physical RPCs, as in Sun Microsystems' RPC, where an RPC message is limited to 8 kilobytes.
Client Server Binding
The client stub must know the location of a server before an RPC can take place between them.
The process by which a client gets associated with a server is known as binding.
Servers export their operations to register their willingness to provide services.
Clients import operations, asking the RPCRuntime to locate a server and establish any state that may be needed at each end.
Issues for client-server binding process:
How does a client specify a server to which it wants to get bound?
How does the binding process locate the specified server?
When is it proper to bind a client to a server?
Is it possible for a client to change a binding during execution?
Can a client be simultaneously bound to multiple servers that provide the same service?
Server Naming
The naming issue is how a client specifies the server with which it wants to communicate.
The interface name of a server is its unique identifier.
It is specified by type and instance.
Type specifies the interface itself, including a version number that distinguishes different versions of the interface.
Instance specifies a server providing the services within that interface.
Interface name semantics are based on an arrangement between the exporter and importer.
Interface names are created by the users, not by the RPC package. The RPC package only dictates the means by which an importer uses the interface name to locate an exporter.
Server locating
Methods are:
1. Broadcasting
2. Binding agent
Broadcasting
A message to locate the desired server is broadcast from the client node to all nodes. A node that has the desired server returns a response message. (The server could be replicated, so more than one node may respond.) This method is suitable for small networks.
Binding Agent
It is basically a name server used to bind a client to a server by providing the client with the location information of the desired server.
It maintains a binding table that maps each server's interface name to its location.
Each table entry contains the server's identification information and a handle that is used to locate it.
The location of the binding agent is known to all nodes, either because it has a fixed address or because it is found by using a broadcast message.
Primitives used by the binding agent:
Register
Deregister
Lookup
1. The server registers itself with the binding agent.
2. The client requests the binding agent for the server's location.
3. The binding agent returns the server's location information to the client.
4. The client calls the server.
[Figure: the client process, the binding agent, and the server process, with arrows numbered 1 to 4 for the steps above.]
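The register, deregister, and lookup primitives of a binding agent can be sketched with an in-memory binding table. The class name, the `(interface type, instance)` key, and the host/port handles are assumptions for illustration; a real binding agent would itself be reached through RPC.

```python
class BindingAgent:
    def __init__(self):
        # Binding table: (interface type, instance) -> server handle
        self.table = {}

    def register(self, interface_type, instance, handle):
        """Called by a server to export its interface."""
        self.table[(interface_type, instance)] = handle

    def deregister(self, interface_type, instance):
        """Called by a server that stops providing the service."""
        self.table.pop((interface_type, instance), None)

    def lookup(self, interface_type):
        """Called by a client; returns handles of all matching servers."""
        return [handle
                for (itype, _instance), handle in sorted(self.table.items())
                if itype == interface_type]
```

Because `lookup` can return several handles for one interface type, the agent (or the client) is free to pick among them, which is what makes load spreading possible.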
Advantages
Supports multiple servers having the same interface type.
Client requests can be spread evenly to balance the load.
Disadvantages
The overhead involved in binding clients to servers is large and becomes significant when many client processes are short-lived.
A binding agent must be robust against failures and should not become a performance bottleneck.
Solution
Distribute the binding function among several binding agents and replicate the information among them.
This again adds overhead in creating and handling replicas.
Binding Time
Binding at compile time
Binding at link time
Binding at call time
Binding at compile time
The server's network address can be compiled into the client code.
Binding at link time
The client makes an import request to the binding agent for the service before making a call. The server's handle is cached by the client to avoid contacting the binding agent for subsequent calls. This method is suitable for situations in which a client calls a server several times once it is bound to it.
Binding at call time
A client is bound to a server at the time when it calls the server for the first time during its execution. The commonly used approach for binding at call time is the indirect call method.
Binding at call time by the method of indirect call
1. The client process passes the server's interface name and the arguments of the RPC call to the binding agent.
2. The binding agent sends an RPC call message to the server, including in it the arguments received from the client.
3. The server returns the result of request processing to the binding agent.
4. The binding agent returns this result to the client along with the server's handle.
5. Subsequent calls are sent directly from the client process to the server process.
[Figure: the client process, the binding agent, and the server process, with numbered arrows for the steps above.]
Security issues
Is the authentication of the server by the client required?
Is the authentication of the client by the server required when the result is returned?
Is it all right if the arguments and results of the RPC are accessible to users other than the caller and the callee?
Some special types of RPCs
Callback RPC
Broadcast RPC
Batch-mode RPC
Callback RPC
It facilitates a peer-to-peer paradigm among the participating processes. It allows a process to be both a client and a server.
Issues:
Providing the server with the client's handle
Making the client process wait for the callback RPC
Handling callback deadlocks
[Figure: callback RPC. The client sends Call (parameter list). The server starts procedure execution, then stops it temporarily and sends Callback (parameter list). The client processes the callback request and sends Reply (result of callback). The server resumes procedure execution and, when procedure execution ends, sends Reply (result of call).]
Broadcast RPC
A client's request is broadcast on the network and is processed by all the servers that have the concerned procedure for processing that request.
Methods for broadcasting a client's request:
1. Use a broadcast primitive to indicate that the client's request message has to be broadcast.
2. Declare broadcast ports.
Back-off algorithm can be used to increase the time between retransmissions.
It helps in reducing the load on the physical network and computers involved.
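One common way to increase the time between retransmissions is exponential back-off, sketched below. The slides do not say which back-off rule is meant, so the doubling factor, the cap, and the function name are assumptions.

```python
def backoff_intervals(initial=1.0, factor=2.0, max_interval=30.0, retries=6):
    """Return the waiting time (in seconds) before each retransmission."""
    intervals = []
    interval = initial
    for _ in range(retries):
        intervals.append(interval)
        # Grow the interval, but never beyond the configured cap.
        interval = min(interval * factor, max_interval)
    return intervals
```

With the default arguments the waits are 1, 2, 4, 8, 16, and 30 seconds, so a client that gets no answer quickly stops adding load to the network.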
Batch-mode RPC
Batch-mode RPC is used to queue separate RPC requests in a transmission buffer on the client side and then send them over the network in one batch to the server.
It reduces the overhead involved in sending each request separately and waiting for a response to it. It is suitable for applications that have lower call rates and in which the client does not need a reply.
It requires a reliable transmission protocol.
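The client-side queueing can be sketched as below. The class name, the `send_batch` callback, and the flush-on-size rule are assumptions for illustration; a real implementation might also flush on a timer or when a call that needs a reply is made.

```python
class BatchingClient:
    def __init__(self, send_batch, batch_size=3):
        self.send_batch = send_batch   # sends a list of requests in one message
        self.batch_size = batch_size
        self.buffer = []               # transmission buffer on the client side

    def call(self, procedure, *args):
        """Queue a request; no reply is expected."""
        self.buffer.append((procedure, args))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send all queued requests over the network in one batch."""
        if self.buffer:
            self.send_batch(list(self.buffer))
            self.buffer.clear()
```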
Lightweight RPC (LRPC)
Communication traffic in operating systems is of two types:
1. Cross-domain: involves communication between domains on the same machine.
2. Cross-machine: involves communication between domains located on separate machines.
LRPC is a communication facility designed and optimized for cross-domain communications.
For cross-domain communication, user-level server processes have their own address spaces. LRPC is safe and transparent and represents a viable communication alternative for microkernel operating systems.
Techniques used by LRPC for better performance:
Simple control transfer
Simple data transfer
Simple stubs
Design for concurrency
Simple control transfer
It uses a special thread-scheduling mechanism, called handoff scheduling, for a direct context switch from the client thread to the server thread of an LRPC.
Simple data transfer
LRPC reduces the cost of data transfer by performing fewer copies of the data during its transfer from one domain to another. An LRPC uses a shared-argument stack that is accessible to both the client and the server.
Pairwise allocation of argument stacks enables LRPC to provide a private channel between the client and the server and also allows parameters and results to be copied as many times as are necessary to ensure correct and safe operation.
Simple stubs
Every procedure has a call stub in the client's domain and an entry stub in the server's domain. To reduce the cost of interlayer crossings, LRPC stubs blur the boundaries between the protocol layers.
Optimizations for better performance
Concurrent access to multiple servers
Serving multiple requests simultaneously
Reducing per-call workload of servers
Reply caching of idempotent remote procedures
Proper selection of timeout values
Proper design of RPC protocol specification