UNIT 2 Part-I
Introduction to API
• API stands for Application Programming Interface. It is an interface
through which two or more computer programs communicate with each
other, just as a user interface lets users communicate with a system.
• A piece of software can also interact with other software through
multiple APIs, so as to provide a one-place-for-everything kind of
support and become a hub for accessing data.
Example:
A common example is paying for an order in a food-delivery application
on a smartphone through an online wallet, say a PayTM wallet or an
Amazon Pay account. When we select the wallet option, the wallet's
interface opens up. We enter our PIN and process the payment. This
switch from the food-delivery application's interface to the wallet's
interface is made possible by an API, which exposes the wallet's
functionality and lets us process the payment conveniently.
• APIs enable communication between services via
request-response cycles. These cycles may or may not require an
internet connection. The application that needs data from
another service or server sends a request, which is conveyed
according to an API protocol, and the server sends back the
required data.
API INTERNET PROTOCOLS
1. SOAP (Simple Object Access Protocol)
It is an API protocol which employs XML to enable API
communication. It is one of the oldest API protocols in use, emerging
in 1998. SOAP uses XML documents to transfer data between web
services. These XML documents are usually sent over HTTP/HTTPS,
as is common on the internet. However, SOAP
also provides flexibility and enables data transmission over other
protocols as well, such as Transmission Control Protocol (TCP),
Simple Mail Transfer Protocol (SMTP), User Datagram Protocol
(UDP), etc.
SOAP messages are encoded in XML and have a well-defined format:
• Envelope : Literally envelopes the entire message/data in tags.
• Header : Defines all extra information that might be needed to
process this data. This is an optional element.
• Body : This is where we actually write the request for data required/
where all the requested data is added.
• Fault : Carries information about any errors that occur while the
message is being processed. It appears inside the Body and is optional.
While SOAP is highly prevalent on the web and flexible in its choice of
transmission channel, its reliance on XML makes it rigid because of
XML's strict formatting. Difficulties in debugging XML also prove to be
a hindrance.
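The message layout described above (Envelope, optional Header, Body) can be sketched with Python's standard XML library. This is only an illustration: the `GetBook` operation, the `AuthToken` header, the application namespace and the ISBN value are hypothetical placeholders, and the envelope namespace shown is the one used by SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://example.com/library"                  # hypothetical application namespace


def build_soap_request(isbn: str, token: str) -> bytes:
    """Build an Envelope that wraps an optional Header and a Body."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # Header: extra information needed to process the message (optional).
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{APP_NS}}}AuthToken").text = token

    # Body: the actual request.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}GetBook")
    ET.SubElement(request, f"{{{APP_NS}}}Isbn").text = isbn

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


message = build_soap_request("placeholder-isbn", "placeholder-token")
print(message.decode("utf-8"))
```

Even this tiny request needs several nested, namespaced tags, which illustrates why SOAP's strict XML format is often considered verbose.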
REST (Representational State Transfer)
• REST overcomes SOAP's dependency on XML by
supporting data transmission in multiple formats such as JSON (most
prominent), XML, HTML, plain text as well as media files.
However, REST in practice relies on HTTP/HTTPS for data transmission,
giving up SOAP's adaptability to other protocols. REST is an
architectural style rather than a strict protocol; APIs which
follow it are called RESTful APIs.
• REST APIs follow a client-server architecture and must
be stateless. Stateless communication implies that the server stores no
client session data between requests. Each request must be
self-contained and independent of earlier ones. REST identifies every
resource by a unique URL, so when the server receives a request, the
URL and the HTTP method together tell it which operation to
perform. REST also supports caching, so the
client can store the response to a request locally and
reuse it as needed, thereby increasing speed and
efficiency.
A typical REST request has the following components:
• Endpoint : The destination URL from which data is being
requested.
• Method : We use predefined HTTP methods such as GET, POST,
PUT or DELETE to read, create, update or delete data. These methods
differ from one another. For example, with GET any parameters are
appended to the end of the URL string, whereas with POST the data is
sent in the body of the HTTP request.
• Headers : They carry details about the request, such as the format of
its body and the format in which the response is expected.
• Body (data) : The actual data sent with the request.
REST APIs provide more freedom during implementation and are
lightweight and scalable.
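The four components listed above can be seen together in a small sketch that builds (but does not send) a request with Python's standard library. The endpoint URL and the payload are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Hypothetical endpoint and payload, used only for illustration.
endpoint = "https://api.example.com/books"
payload = {"title": "Distributed Systems", "isbn": "placeholder-isbn"}

request = urllib.request.Request(
    url=endpoint,                                # Endpoint
    method="POST",                               # Method
    headers={                                    # Headers
        "Content-Type": "application/json",      # format of the body we send
        "Accept": "application/json",            # format we expect in the response
    },
    data=json.dumps(payload).encode("utf-8"),    # Body (data)
)

print(request.get_method(), request.full_url)
print(request.header_items())
print(request.data)

# Sending it would be urllib.request.urlopen(request); that call is left out
# here because the endpoint above does not exist.
```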
gRPC (Google Remote Procedure Call)
• As the name indicates, gRPC was developed at Google and
released publicly in 2015. It is an open source RPC framework
capable of running in most environments.
• Unlike the protocols above, gRPC lets developers define their own
custom functions (procedures) to enable inter-service communication as
required.
• gRPC uses HTTP/2 as its transport and also provides
additional facilities such as authentication, timeouts,
flow control, etc.
• Data is sent as protocol buffers, a language-independent and
platform-independent mechanism for defining how data is
structured and serialized in a simple, compact way.
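To give a feel for why protocol buffers are compact, the sketch below hand-codes the base-128 "varint" encoding that the protocol buffers wire format uses for integers. It is a simplified illustration, not the official library: real applications would use generated code from a `.proto` file, and the helper names here are made up for this example.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)   # more bytes follow: set the continuation bit
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
        if not byte & 0x80:           # continuation bit clear: last byte
            return result
    raise ValueError("truncated varint")


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field: a tag (field number, wire type 0), then the value."""
    tag = (field_number << 3) | 0     # wire type 0 means "varint"
    return encode_varint(tag) + encode_varint(value)


print(encode_varint(300).hex())         # ac02
print(encode_int_field(1, 150).hex())   # 089601
```

A field holding the value 150 takes three bytes on the wire, whereas the same value in XML or JSON also carries the field name as text.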
THANK YOU
UNIT 2 Part-II
Introduction
• Usually, a program represents its data and information as data
structures when dealing with them at runtime. For example, a
library management system might represent the data about a
certain book using a “Book” object which in turn may consist of
primitive data items like “title, ISBN” and complex objects like
“author”.
• However, this library management system cannot send the “Book” data
structure as it is when passing data to another library management
system. For that, the “Book” data structure needs to be flattened
(converted to a sequence of bytes) before transmission and rebuilt at
the destination. This applies to almost every data structure, because
in-memory structures cannot be transmitted directly over media such as
networks.
• This is where concepts like marshalling and external data
representation come into play.
External Data Representation (XDR)
• XDR is a standard data serialization format that can be used to
transmit data among different computer architectures.
Conversion of data from local representation to XDR is
called encoding and conversion from XDR to a local
representation is called decoding. XDR implementations are
portable between different operating systems and independent
of the transport layer.
• Encoding and decoding correspond to the following two processes:
• Marshalling: Marshalling is the process of taking a collection of data
structures and formatting them into an external data representation
suitable for transmission in a message.
• Unmarshalling: The converse of this process is unmarshalling, which
involves reformatting the transferred data upon arrival to recreate the
original data structures at the destination.
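Marshalling and unmarshalling can be sketched in a few lines using XDR-style rules: integers are 4 bytes in big-endian order, and strings are sent as a 4-byte length followed by the bytes, padded to a multiple of 4. The `Book` fields used here (a title and a year) are a simplified, hypothetical version of the library example, and the sketch encodes strings as UTF-8 for convenience.

```python
import struct


def xdr_pack_uint(value: int) -> bytes:
    return struct.pack(">I", value)              # 4 bytes, big-endian


def xdr_pack_string(text: str) -> bytes:
    raw = text.encode("utf-8")
    padding = (4 - len(raw) % 4) % 4             # pad to a multiple of 4 bytes
    return struct.pack(">I", len(raw)) + raw + b"\x00" * padding


def marshal_book(title: str, year: int) -> bytes:
    """Flatten the book into a sequence of bytes for transmission."""
    return xdr_pack_string(title) + xdr_pack_uint(year)


def unmarshal_book(data: bytes) -> tuple[str, int]:
    """Rebuild the original values from the received bytes."""
    (length,) = struct.unpack_from(">I", data, 0)
    title = data[4:4 + length].decode("utf-8")
    offset = 4 + length + (4 - length % 4) % 4   # skip the padding
    (year,) = struct.unpack_from(">I", data, offset)
    return title, year


wire = marshal_book("Hello", 2001)
print(wire.hex())
print(unmarshal_book(wire))   # ('Hello', 2001)
```

Because both sides agree on this external format, the sender and receiver can use different CPU architectures or languages internally.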
Approaches:
There are three common approaches to external data representation and
marshalling when computers exchange different sorts of data:
1. Common Object Request Broker Architecture (CORBA)
2. Java’s Object Serialization
3. Extensible Markup Language (XML)
Common Object Request Broker Architecture
(CORBA):
It allows systems with diverse architectures, operating systems, programming
languages, and computer hardware to work together. It allows software
applications and their objects to communicate with one another. It is a standard
for creating and using distributed objects.
Data Representation in CORBA:
• Common Data Representation (CDR) is used to describe structured or primitive
data types that are supplied as arguments or results during remote invocations on
CORBA distributed objects.
• It allows clients and servers written in different programming
languages and running on different hardware to communicate with
one another. For example, it handles conversion between little-endian
(Least Significant Byte (LSB) is placed first) and big-endian (Most
Significant Byte (MSB) is placed first) byte orders.
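The difference between the two byte orders is easy to see by packing the same 32-bit integer both ways. This is a small illustrative sketch using Python's `struct` module; the value `0x12345678` is arbitrary.

```python
import struct

value = 0x12345678

big = struct.pack(">I", value)      # big-endian: most significant byte first
little = struct.pack("<I", value)   # little-endian: least significant byte first

print(big.hex())      # 12345678
print(little.hex())   # 78563412

# For a single integer, converting between the two orders is a byte reversal.
converted = little[::-1]
print(converted == big)   # True
```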
Java’s Object Serialization
• Java Remote Method Invocation (RMI) allows you to pass both objects
and primitive data values as arguments and results of method calls. In
Java, the term serialization refers to the activity of flattening an
object (an instance of a class) or a set of related objects into a
serial form suitable for saving to disk or sending in a message.
• Java provides a mechanism called object serialization. This allows an
object to be represented as a sequence of bytes containing information
about the object’s data and the type of object and the type of data stored in
the object. After the serialized object is written to the file, it can be read
from the file and deserialized.
Extensible Markup Language (XML)
• Clients communicate with web services using XML, which is also
used to define the interfaces and other aspects of web services. While
an XML archive is larger than a binary archive, it has the advantage of
being readable on any machine.
• In the first two approaches (CORBA and Java's object serialization),
the primitive data types are marshalled into a binary form.
• In the third approach (XML), the primitive data types are expressed
textually. A data value's textual representation will typically be
longer than its binary representation. The HTTP protocol is another
example of the textual approach.
• Type information is included in both Java serialization and XML, but
in different ways. XML documents can refer to namespaces (which keep
tag names unique), which are externally specified groups of names
(with types).
Multicast Communication
• Multicast is a method of group communication
where the sender sends data to multiple
receivers or nodes present in the network
simultaneously.
• Multicasting is a type of one-to-many and
many-to-many communication as it allows
sender or senders to send data packets to
multiple receivers at once across LANs or
WANs.
• Multicasting is considered a special case of
broadcasting: it works in a similar way to
broadcasting, but the
information is sent only to targeted, specific
members of the network.
• This reduces bandwidth consumption, since data is
not sent to nodes that are not members of the group.
Applications
• Multicasting is used in many areas, such as:
1. Internet Protocol (IP) networks
2. Streaming media
3. Video conferencing applications and webcasts
Note: Under classful addressing, multicasting uses class D IP addresses,
which range from 224.0.0.0 to 239.255.255.255.
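The class D range in the note corresponds to the network 224.0.0.0/4. A short sketch with Python's `ipaddress` module can check whether an address falls in it; the sample addresses and the helper name `is_class_d` are chosen only for illustration.

```python
import ipaddress

MULTICAST_RANGE = ipaddress.ip_network("224.0.0.0/4")   # class D


def is_class_d(address: str) -> bool:
    """Return True if the IPv4 address lies in the multicast (class D) range."""
    return ipaddress.ip_address(address) in MULTICAST_RANGE


print(MULTICAST_RANGE[0], MULTICAST_RANGE[-1])   # 224.0.0.0 239.255.255.255
for candidate in ["224.0.0.1", "239.255.255.255", "192.168.1.10"]:
    print(candidate, is_class_d(candidate))
```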
Types of Multicast Communication
• IP Multicast : Multicasting that takes place over the Internet is known
as IP multicasting. It follows the Internet Protocol (IP) to
transmit data. IP multicasting uses a mechanism known as ‘multicast
trees’ to transmit information among the users of the network.
Multicast trees allow a single transmission to branch out to the
desired receivers.
• IP multicast also uses two other essential protocols to function:
Internet Group Management Protocol (IGMP) and Protocol Independent
Multicast (PIM).
Network virtualization
• Network virtualization is a method of combining the available
resources (server, desktop, operating system, file, storage and
network) in a network to consolidate multiple physical networks,
divide a network into segments or create software networks
between VMs.
• Virtualization in general also underlies processes with multiple
threads that do multiple tasks simultaneously, i.e. we can construct
(parts of) programs that appear to run simultaneously.
• A single CPU creates the illusion of parallelism by switching back and
forth quickly between threads and processes.
Conclusion
Resource virtualization extends the separation between having a single
CPU and being able to pretend there are multiple CPUs to other
resources as well.
How does network virtualization work?
• Network virtualization abstracts network services from the physical
hardware and infrastructure. To do this, a network hypervisor creates
an abstraction layer that hosts and supports different virtual
networks.
• The abstraction layer provides a simplified representation of the
nodes and links making up the virtual networks. The hypervisor is not
only responsible for abstraction, but also controls the resources,
bandwidth and capacity for each logical network. While the virtual
networks share the hypervisor platform, they remain independent of
each other and have their own security rules.
Network virtualization typically encompasses the following
components:
• A network hypervisor;
• Controller software;
• Host protocols, such as Virtual Extensible LAN (VXLAN);
• Virtual switching and routing; and
• Management tools.
• Elements within a virtual network, such as VM workloads, can
communicate with each other and with nodes on a separate virtual
network using encapsulated host protocols, virtual switches and
virtual routers. The messages do not travel through the physical
networking devices, which helps reduce latency.
• Network administrators can migrate a workload from one host to
another in real time, with the associated security policies and
networking requirements moving with it. The virtualization platform
also applies security policies to new workloads automatically.
Advantages of Network Virtualization
• Network virtualization helps organizations achieve major
advances in speed, agility, and security by automating and
simplifying many of the processes that go into running a data
center network and managing networking and security in the
cloud. Here are some of the key benefits of network
virtualization:
• Reduce network provisioning time from weeks to minutes
• Achieve greater operational efficiency by automating manual
processes
• Place and move workloads independently of physical topology
• Improve network security within the data center
Network Virtualization Example
• A VLAN is a subsection of a local area network (LAN) created with
software that combines network devices into one group, regardless of
physical location. VLANs can improve the speed and performance of
busy networks and simplify changes or additions to the network.
• Another example is network overlays. One industry-standard
technology is called virtual extensible local area network (VXLAN).
VXLAN provides a framework for overlaying virtualized layer 2
networks over layer 3 networks, defining both an encapsulation
mechanism and a control plane.
• Network overlays enable dynamic and flexible routing, load
balancing, and resource allocation based on the application's needs
rather than the network's constraints. Additionally, overlay networks
can enhance security and privacy of communication by encrypting,
anonymizing, or isolating the traffic from the physical network.
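The encapsulation idea behind VXLAN can be sketched by building the 8-byte VXLAN header that is placed in front of the original layer 2 frame: one flags byte, three reserved bytes, a 24-bit VXLAN Network Identifier (VNI) and one more reserved byte. This sketch covers the header only; the outer UDP/IP packet that carries it is omitted, and the VNI value and the frame bytes are made up.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit network identifier."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits); word 2: VNI (24 bits) + reserved (8 bits).
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prefix a layer 2 frame with the VXLAN header."""
    return vxlan_header(vni) + inner_frame


def decapsulate(packet: bytes) -> tuple[int, bytes]:
    """Recover the VNI and the original frame."""
    _flags_word, vni_word = struct.unpack("!II", packet[:8])
    return vni_word >> 8, packet[8:]


packet = encapsulate(5000, b"inner ethernet frame")
print(packet[:8].hex())       # 0800000000138800
print(decapsulate(packet))    # (5000, b'inner ethernet frame')
```

The VNI plays a role similar to a VLAN ID: frames with different VNIs belong to different virtual networks even though they share the same physical layer 3 network.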
UNIT 2 Part-III
Remote Invocation
Communication Paradigms
There are three modes of communication in distributed systems:
1. Inter-Process Communication
This is low-level support for communication between processes in
distributed systems, including message-passing primitives, direct
access to the API offered by Internet protocols, and support for
multicast communication.
2. Remote Invocation
• It is a mechanism that enables a client to invoke a procedure/method
on the server via communication between client and server.
• It is used in distributed systems and covers the calling of a remote
operation, procedure or method.
3. Indirect Communication, for example group communication, message
queues, etc.
Remote Invocation Protocols
• Request-reply protocols
• Remote procedure calls
• Remote method invocation
Request Reply Protocol
• It works well for systems that involve simple RPCs.
• In simple RPCs, the parameters and result values fit in a single
packet buffer.
• The protocol is based on implicit acknowledgements
instead of explicit acknowledgements.
• Here, a reply from the server is treated as the acknowledgement
(ACK) for the client's request message, and a client's following call is
considered as an acknowledgement (ACK) of the server's reply
message to the previous call made by the client.
• To handle failures such as lost messages, timeout-based
retransmission is used with the RR protocol.
• If a client does not get a response message within the predetermined
timeout period, it retransmits the request message.
• Servers hold responses in a reply cache, which lets them filter out
duplicate request messages and retransmit the reply without processing
the request again. Together with client retransmission, this is how the
protocol aims to provide exactly-once semantics.
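The interplay of timeouts, retransmission and the reply cache can be simulated in a few lines. This is a minimal in-process sketch, not a network implementation: the `LossyChannel` class, the scripted losses and the "double the amount" operation are all invented for the example, and a lost reply is modelled by returning `None`.

```python
class Server:
    def __init__(self):
        self.reply_cache = {}   # request id -> reply already produced
        self.executions = 0     # how many times the operation really ran

    def handle(self, request_id: int, amount: int) -> int:
        if request_id in self.reply_cache:         # duplicate request
            return self.reply_cache[request_id]    # resend reply, do not re-execute
        self.executions += 1
        reply = amount * 2                         # stand-in for the real operation
        self.reply_cache[request_id] = reply
        return reply


class LossyChannel:
    """Drops replies according to a fixed script so the run is repeatable."""

    def __init__(self, server: Server, drop_script: list[bool]):
        self.server = server
        self.drop_script = list(drop_script)

    def send(self, request_id: int, amount: int):
        reply = self.server.handle(request_id, amount)
        lost = self.drop_script.pop(0) if self.drop_script else False
        return None if lost else reply             # None models a timeout


def call(channel: LossyChannel, request_id: int, amount: int, max_tries: int = 5) -> int:
    for _ in range(max_tries):
        reply = channel.send(request_id, amount)
        if reply is not None:                      # the reply doubles as the ACK
            return reply
        # Timeout expired: retransmit the same request with the same id.
    raise TimeoutError("no reply after retransmissions")


server = Server()
channel = LossyChannel(server, drop_script=[True, True, False])
print(call(channel, request_id=1, amount=21))   # 42
print(server.executions)                        # 1
```

The request is transmitted three times, but because the server recognises the repeated request id, the operation itself runs only once.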
Remote Procedure Call
• Remote Procedure Call (RPC) is an inter-process communication
technique used for client-server applications. An RPC occurs when a
computer program causes a procedure or subroutine to execute in a
different address space, commonly on another machine.
Three types of RPC are:
• Callback RPC
• Broadcast RPC
• Batch-mode RPC
Callback RPC
This type of RPC enables a peer-to-peer (P2P) paradigm between
participating processes. It allows a process to act as both a client
and a server.
Functions of Callback RPC:
• Provides the server with the client's handle.
• Makes the client process wait for the callback.
• Manages callback deadlocks.
• Facilitates a peer-to-peer paradigm among participating
processes.
Broadcast RPC
In broadcast RPC, a client's request is broadcast on the
network and processed by all servers which have the method for
processing that request.
Functions of Broadcast RPC:
• Allows you to specify that the client's request message has to
be broadcast.
• Allows you to declare broadcast ports.
• Helps to reduce the load on the physical network.
Batch-mode RPC
Batch-mode RPC queues separate RPC requests in a
transmission buffer on the client side and then sends them over the
network to the server in one batch.
Functions of Batch-mode RPC:
• It minimizes the overhead involved in sending requests, as it sends
them over the network in one batch to the server.
• This type of RPC is efficient only for applications that
need lower call rates.
• It needs a reliable transmission protocol.
Remote Method Invocation
• In a distributed computing environment, remote method
invocation (RMI) refers to calling a method on a remote object.
It is analogous to a remote procedure call. The main role is to
allow objects to access data and invoke methods on remote
objects (objects residing in non-local memory space).
Requirements for distributed applications
• An application that performs the following tasks can be a distributed
application:
1. It needs to locate the remote method.
2. It needs to communicate with the remote objects.
3. It needs to load the class definitions for the objects.
RMI provides remote communication between applications
using two objects: the stub and the skeleton.
Understanding stub and skeleton
• RMI uses stub and skeleton objects for communication with the remote
object.
• A remote object is an object whose method can be invoked from another
JVM.
Stub
The stub is an object that acts as a gateway on the client side. All
outgoing requests are routed through it. It resides on the client side
and represents the remote object. When the caller invokes a method on
the stub object, the stub does the following tasks:
1. It initiates a connection with the remote Java Virtual Machine (JVM).
2. It writes and transmits (marshals) the parameters to the remote JVM.
3. It waits for the result.
4. It reads (unmarshals) the return value or exception.
5. Finally, it returns the value to the caller.
Skeleton
• The skeleton is an object that acts as a gateway for the server-side
object. All incoming requests are routed through it. When the skeleton
receives an incoming request, it does the following tasks:
1. It reads (unmarshals) the parameters for the remote method.
2. It invokes the method on the actual remote object.
3. It writes and transmits (marshals) the result to the caller.
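The division of labour between stub and skeleton can be imitated in a language-neutral way. The sketch below is written in Python rather than Java and is not the Java RMI API: the `Calculator` object, the JSON message format and the direct function call that stands in for the network connection are all assumptions made for illustration.

```python
import json


class Calculator:
    """The actual remote object, living on the server side."""

    def add(self, a: int, b: int) -> int:
        return a + b


class Skeleton:
    """Server-side gateway: unmarshal parameters, invoke, marshal the result."""

    def __init__(self, target):
        self.target = target

    def handle(self, request: bytes) -> bytes:
        message = json.loads(request)                        # 1. read the parameters
        try:
            method = getattr(self.target, message["method"])
            result = {"result": method(*message["args"])}    # 2. invoke the method
        except Exception as exc:
            result = {"error": str(exc)}
        return json.dumps(result).encode("utf-8")            # 3. marshal the result


class Stub:
    """Client-side gateway: looks like the remote object to the caller."""

    def __init__(self, transport):
        self.transport = transport    # callable: request bytes -> reply bytes

    def add(self, a: int, b: int) -> int:
        request = json.dumps({"method": "add", "args": [a, b]}).encode("utf-8")
        reply = json.loads(self.transport(request))          # transmit, then wait
        if "error" in reply:
            raise RuntimeError(reply["error"])               # remote exception
        return reply["result"]                               # return value to the caller


skeleton = Skeleton(Calculator())
stub = Stub(transport=skeleton.handle)   # a direct call stands in for the network
print(stub.add(2, 3))   # 5
```

The caller only ever sees `stub.add(2, 3)`; all marshalling and transport details stay hidden inside the stub and the skeleton.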
UNIT 2 Part-IV
Indirect Communication
Introduction
• Indirect communication is defined as communication between entities in a
distributed system through an intermediary with no direct coupling or link
between the sender and the receiver(s).
• Remote invocation is based on direct (strong) coupling between
senders and receivers, making systems rigid and difficult to change.
• Indirect (loosely coupled) communication is used when change is
anticipated, e.g. mobile environments with users coming and going in
the network.
• Indirect communication is uncoupled in both space and time.
• Disadvantages:
• performance overhead due to the extra level of indirection
• more difficult to manage, precisely because of the lack of
space/time coupling
• Space uncoupling: the sender doesn't know the identity of the
receiver(s), so participants can be replaced, updated, replicated or
migrated.
• Time uncoupling: sender and receiver don't need to exist at
the same time.
• useful in volatile environments where participants come and go
• implies persistence in the communication channel: messages must be
stored
• NB: this is different from asynchronous communication. Asynchronous
communication doesn't imply that the receiver has an independent
lifetime.
Indirect Communication Paradigms
• Group communication
• Publish subscribe
• Message queues
• Shared memory
Group communication
• Processes in a distributed system communicate to exchange data, such
as code or a file. When one source process communicates with multiple
processes at once, it is called group communication. A group is a
collection of interconnected processes treated as a single abstraction.
This abstraction hides the message passing so that the communication
looks like a normal procedure call.
Types of Group Communication
• Broadcast communication
• Multicast communication
• Unicast communication
Characteristics
• Sender is not aware of the identities of the receivers
• Represents an abstraction over multicast communication
• Can be implemented over IP multicast (or an equivalent overlay
network), adding value in terms of:
• managing group membership
• detecting failures and providing reliability and ordering guarantees
Applications
• financial: reliable dissemination of financial information (e.g.
stock tickers) to a large number of clients; institutions need
accurate, up-to-date access to a large number of information sources
• multiuser games
• fault tolerance: consistent update of replicated data
• system monitoring/management, load balancing
Types of Groups formed
Open Groups
• Group membership: processes may join or leave the group.
• A single multicast operation such as aGroup.send(aMessage) is enough
to send a message to each member of a group.
Advantages:
• Convenience for the programmers
• Efficient utilization of bandwidth
• Minimize total time to deliver the message to all
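The group abstraction with its single `send` operation can be sketched as a small class. This is an in-memory illustration only: delivery is done by calling a callback per member, and there are no reliability or ordering guarantees. The class layout, the member ids and the messages are invented for the example.

```python
class Group:
    def __init__(self, name: str):
        self.name = name
        self.members = {}   # member id -> delivery callback

    def join(self, member_id: str, deliver) -> None:
        self.members[member_id] = deliver

    def leave(self, member_id: str) -> None:
        self.members.pop(member_id, None)

    def send(self, message) -> int:
        """One call delivers the message to every current member."""
        for deliver in list(self.members.values()):
            deliver(message)
        return len(self.members)


inboxes = {"p1": [], "p2": [], "p3": []}
a_group = Group("updates")
for member_id, inbox in inboxes.items():
    a_group.join(member_id, inbox.append)

a_group.send("hello")     # reaches p1, p2 and p3
a_group.leave("p2")
a_group.send("world")     # reaches p1 and p3 only
print(inboxes)
```

The sender never names individual receivers; it only knows the group, which is what makes membership changes invisible to it.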
Publish Subscribe Model
A publish-subscribe system is a system where publishers publish events
to an event service and subscribers register interest in particular
events through the subscription process. Publish-subscribe systems are
used in a wide variety of application domains. Examples include:
• Financial information systems
• RSS feeds
• Ubiquitous computing
• Application monitoring, etc.
• This system is also a major component of Google's infrastructure,
related to the need for the dissemination of marketing information.
Google is a large-scale search engine that requires the ability to
disseminate information quickly and on a large scale.
Characteristics
• Publish-subscribe systems have two main
characteristics:
1. Heterogeneity: When event notifications are used as the means
of communication, components in a distributed system that were not
designed to interoperate can be made to work together. The variety of
information disseminated is one of the main characteristics of this
system.
2. Asynchronicity: Notifications are sent asynchronously by publishers
to all subscribers who have expressed interest in them, so publishers
do not need to synchronize with subscribers. Because the system
propagates information as events occur, subscribers can update their
state dynamically when information changes.
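A topic-based event service is one simple way to picture the publish-subscribe model. The sketch below is an in-memory illustration under several assumptions: subscriptions are by topic name, delivery is a direct callback (a real system would deliver asynchronously), and the topics and events are invented.

```python
from collections import defaultdict


class EventService:
    def __init__(self):
        self.subscriptions = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic: str, callback) -> None:
        """A subscriber registers interest in one kind of event."""
        self.subscriptions[topic].append(callback)

    def publish(self, topic: str, event) -> int:
        """A publisher hands an event to the service, not to any receiver."""
        callbacks = self.subscriptions.get(topic, [])
        for callback in callbacks:
            callback(event)
        return len(callbacks)


service = EventService()
stock_events, news_events = [], []
service.subscribe("stocks", stock_events.append)
service.subscribe("news", news_events.append)

service.publish("stocks", {"symbol": "XYZ", "price": 10.5})
service.publish("weather", {"city": "Nowhere"})   # nobody subscribed: not delivered

print(stock_events)   # [{'symbol': 'XYZ', 'price': 10.5}]
print(news_events)    # []
```

Publishers and subscribers know only the event service and the topic names, never each other, which is the space uncoupling that indirect communication aims for.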
Message queue system
• Message queues, or more accurately distributed
message queues, are an important category of indirect
communication. Message queues provide a point-to-point
service that achieves the desired space and time uncoupling.
Point-to-point here means that the sender puts the message
into a queue, and it is then removed by a single
process. Message queues are also referred to as
Message-Oriented Middleware.
• Groups and pub-sub are one-to-many; message queues are point-to-point.
• A number of processes can send messages to the same queue, and likewise
a number of receivers can remove messages from a queue. The queuing
policy is normally first-in-first-out (FIFO), but most message queue
implementations also support the concept of priority, with higher-priority
messages delivered first.
• A message consists of a destination (that is, a unique identifier designating
the destination queue), metadata associated with the message, including
fields such as the priority of the message and the delivery mode, and also
the body of the message.
• One crucial property of message queue systems is that messages are
persistent – that is, message queues will store the messages indefinitely
(until they are consumed) and will also commit the messages to disk to
enable reliable delivery.
• Any message sent is eventually received (validity), the message
received is identical to the one sent, and no messages are delivered
twice (integrity).
Message queue systems therefore guarantee that messages will be
delivered (and delivered once) but cannot say anything about the timing of
the delivery.
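The queuing policy described above (first-in-first-out, with higher-priority messages delivered first) can be sketched with a heap. This is an in-memory illustration only: it does not persist messages to disk, and the class and message names are invented.

```python
import heapq
import itertools


class MessageQueue:
    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()   # arrival order breaks priority ties (FIFO)

    def send(self, body, priority: int = 0) -> None:
        # heapq is a min-heap, so the priority is negated to pop high priorities first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), body))

    def receive(self):
        """Remove and return the next message, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, body = heapq.heappop(self._heap)   # removed: only one consumer gets it
        return body


queue = MessageQueue()
queue.send("first")
queue.send("second")
queue.send("urgent", priority=5)

received = [queue.receive() for _ in range(4)]
print(received)   # ['urgent', 'first', 'second', None]
```

Each message is removed when it is received, so even with several consumers reading from the same queue a given message is delivered to exactly one of them.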
Shared Memory approaches
• It is an indirect communication paradigm that offers
an abstraction of shared memory.
• Distributed shared memory (DSM) is an abstraction
used for sharing data between computers that do
not share physical memory. Processes access DSM
by reads and updates to what appears to be
ordinary memory within their address space.
• The main point of DSM is that it spares the
programmer the concerns of message passing when
writing applications that might otherwise have to
use it. DSM is primarily a tool for parallel
applications or for any distributed application or
group of applications in which individual shared data
items can be accessed directly
• DSM is in general less appropriate in client-server systems, where
clients normally view server-held resources as abstract data and
access them by request (for reasons of modularity and protection).
• DSM systems manage replicated data: each computer has a local
copy of recently accessed data items stored in DSM, for speed of
access.