Distributed Computing
So now, we're going to look at a data center. So, in a data center, we have several racks of
servers. So the racks in a data center will consist of several blade servers. These racks are
interconnected with each other through switches. The switches sit on top of the racks and that's
why these are called top-of-the-rack switches. These switches allow the data to be transferred
from one rack server to another one. So they are used for internal routing of data in a data
center. The switches can also be used to send data out of the data center or to get data from
outside into the data center by using the border gateway router. This router connects the data
center to the external Internet. Data centers also have load balancers that are used in order to
distribute the load across the different servers. As you can see, there is a hierarchy in the
topology inside a data center. This hierarchy is used to partition the different parts of a data
center so that they can be used very efficiently. If the demand is lower, then certain parts of the
data center can be switched off in order to save on energy. So, this hierarchical structure or
topology is typical in data centers. Now, in a data center, if a particular blade server fails, then the software that is running on it can be migrated to another blade server, either within the same rack or in a different rack. So S1, the software or application that was running on the failed blade server, can be migrated to any other server, and S1 will continue running there. Since the entire data center is virtualized, that is, all these blade servers are virtualized so that their underlying hardware resources appear as a shared pool, the operating system and the applications running on top can be easily migrated from one piece of hardware to another if it fails. That is why data centers provide a very resilient service. One of the new
trends in cloud computing, especially in the topology of data centers, is to move away from this
hierarchical topology as we see over here there is a hierarchy of switches that are connecting
different parts of the network. The new trend is to move away from this hierarchy to a flat
topology. What this means is that there is a much higher degree of interconnection between the
switches. So, with this high level of interconnection, the hierarchy is flattened, and it is easier to
move a particular service or an application from one part of the data center to another part of
the data center quite easily without having to go through many hierarchies. So, this is a new
trend in the way data centers are operated. So now that we have looked at the internal
architecture of a data center, let's look at it from the outside. When you think of cloud and data
centers, you might be thinking about some sprawling windowless warehouse kind of facility with
restricted access in some remote areas of Phoenix, San Antonio or Las Vegas. That may well
be quite true for large data centers of big cloud providers. In fact, some of these data fortresses
range from 400,000 square feet to 7 million square feet, like the Citadel Data Center of
Nevada, which is almost twice the size of the Mall of America. But data centers are not
necessarily gigantic and they may be hiding in plain sight in some ordinary looking building, like
this one. It houses the data center of the University of Minnesota. It is a secure facility that
houses and protects the resources that support all the IT functions of the university. So, let's take a peek inside to see how this fancy technology, the cloud, actually looks. But first we need to get permission to get inside one of these. Once inside, you may find the staff busy monitoring the servers and their connectivity. The inside of a data center typically looks like this.
A vast array of parallel corridors each with racks of servers lined up on either side, all connected
with high-speed top-of-the-rack switches and miles of network cables. These powerful servers
with high processing power and large memory units can host everything, from your emails, to
your social networks, to your favorite movies, and even this very course video. If you zoom in
closer, this is what each server rack looks like. This is a front view of a server rack with three
servers from top to bottom. If you notice carefully, in the middle there is a KVM (keyboard, video, and mouse) console that is used to connect to all the servers. Once pulled out, it unfolds into a monitor plus a keyboard that technicians can use to access and switch between the different servers. At the very bottom is a large uninterruptible power supply, or UPS, battery backup that will keep the servers running long enough to shut down without crashing even if there was a power outage. Of course, we make sure there's no power outage in the first place. Now here, our friend Bob is working on a Dell PowerEdge R710 rack-mount server. He's lifting up the two quick-release latches that allow
the server to slide out on rails. This is generally done to service or remove the server or to
replace it with a new one. Here, Bob is releasing a locking latch to remove a hard drive from the R710 server. If a hard drive fails, it can be removed and replaced while the server is running. When a new hard drive is inserted back in, the server detects that a new one was installed and it
starts to use it. The lost data is rebuilt from the data on the other drives. But that's not all. To
keep your data available at all times, all data will be backed up and stored as backup copies on
other mirror servers. Once it's dark and everyone's left for the day, the magic really begins.
These servers come alive and chat with each other throughout the night because a data center
never sleeps. So, data centers provide a centralized solution for computation and storage of
data, but recently there has been an emerging trend of pushing out some of this intelligence
from the core to the edge of the network, and that has led to the emergence of a new area
called edge computing. In the next video, we're going to learn about it.
In data-centered architecture, the data is centralized and accessed frequently by
other components, which may modify it. The main purpose of this style is to achieve integrity of data. Data-centered architecture consists of different components that
communicate through shared data repositories. The components access a shared
data structure and are relatively independent, in that, they interact only through the
data store.
The most well-known example of the data-centered architecture is a database architecture, in which a common database schema is created with a data definition language – for example, a set of related tables with fields and data types in an RDBMS.
Another example of a data-centered architecture is the web architecture, which has a common data schema (i.e. the meta-structure of the Web), follows a hypermedia data model, and has processes that communicate through shared web-based data services.
Types of Components
There are two types of components −
A central data structure or data store or data repository, which is responsible
for providing permanent data storage. It represents the current state.
A data accessor or a collection of independent components that operate on
the central data store, perform computations, and might put back the results.
Interactions or communication between the data accessors is only through the data
store. The data is the only means of communication among clients. The flow of
control differentiates the architecture into two categories: the repository style, in which the data store is passive and the clients (data accessors) control the flow, and the blackboard style, in which the data store notifies registered clients when data of interest changes.
Disadvantages of the repository style
It is more vulnerable to failure and data replication or duplication is possible.
High dependency between data structure of data store and its agents.
Changes in data structure highly affect the clients.
Evolution of data is difficult and expensive.
Cost of moving data on network for distributed data.
Advantages of the blackboard style
Provides scalability, which makes it easy to add or update knowledge sources.
Provides concurrency that allows all knowledge sources to work in parallel as
they are independent of each other.
Supports experimentation for hypotheses.
Supports reusability of knowledge source agents.
Disadvantages of the blackboard style
A change in the structure of the blackboard may have a significant impact on all of its agents, as a close dependency exists between the blackboard and its knowledge sources.
It can be difficult to decide when to terminate the reasoning, as only an approximate solution is expected.
Problems in synchronization of multiple agents.
Major challenges in designing and testing of system.
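As an illustration only (not from the original material), the repository flavour of this style can be sketched in a few lines of Python: two independent accessor components that never call each other and communicate solely through a shared data store. The class and key names below are invented for the example.

# Minimal sketch of the data-centered (repository) style: components only
# interact through the shared data store, never by calling each other.
class DataStore:
    def __init__(self):
        self._data = {}                      # current state of the system
    def write(self, key, value):
        self._data[key] = value
    def read(self, key):
        return self._data.get(key)

def loader(store):
    # data accessor: puts raw records into the repository
    store.write("orders", [10, 25, 40])

def reporter(store):
    # independent accessor: computes a result from the shared data
    store.write("total", sum(store.read("orders") or []))

store = DataStore()
loader(store)
reporter(store)
print(store.read("total"))                   # prints 75

Note how the two accessors could be developed, replaced, or run independently, precisely because the data store is their only point of contact.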
Interpreter style:
The interpreter is an architectural style that is suitable for applications in
which the most appropriate language or machine for executing the
solution is not directly available. The style consists of a few components: the program that we are trying to run, the interpreter engine that interprets it, the current state of the program and of the interpreter, and finally a memory component that holds the program, the current state of the program, and the current state of the interpreter. The connectors for the interpreter architectural style are procedure calls to communicate between the components and direct memory accesses to access memory.
Does this style impose specific topological constraints? The interpreter has four parts:
An interpreter engine: performs the interpretation work
A data store: contains the pseudo-code being interpreted
A data structure: records the current state of the interpreter engine
Another data structure: records the progress of the interpreted source
code.
Input: Input to the interpreted program component is sent to the program
state, where it is read by the program running on the interpreter
Output: Program output is placed in the program state, where it can result
in output from the interpreted program component's interface.
There are some distinct advantages of using an interpreter architecture
style:
By having the behavior of the system defined by a custom language or
data structure, software development becomes easier
This facilitates the portability and flexibility of application or languages
across various platforms
As each line is interpreted, the results of the execution are visible which
makes debugging easier
Errors are caught as they happen since the interpreter stops when it
can’t interpret a line
Portability and flexibility of applications or languages across various platforms
Virtualization: machine code intended for one hardware architecture can be run on another using a virtual machine, which is essentially an interpreter
Behaviour of the system defined by a custom language or data structure, which makes software easier to develop and understand
Supports dynamic change (efficiency): the interpreter usually just needs to translate the code being worked on to an intermediate representation (or not translate it at all), thus requiring much less time before the changes can be tested
"Sandbox" security: an interpreter or virtual machine is not compelled to actually execute all the instructions in the source code it is processing; in particular, it can refuse to execute code that violates any security constraints it is operating under
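As a hedged sketch (the three-instruction pseudo-code and the variable names are invented for illustration), the four parts described above can be mapped onto a few lines of Python:

# Interpreter-style sketch: pseudo-code store, program state, engine state,
# and an interpreter engine that executes one instruction at a time.
program = [("set", "x", 2), ("add", "x", 3), ("print", "x", None)]  # pseudo-code store
program_state = {}                  # current state of the interpreted program
engine_state = {"pc": 0}            # current state of the interpreter engine

def step(instruction):
    # interpreter engine: performs the work for a single instruction
    op, name, value = instruction
    if op == "set":
        program_state[name] = value
    elif op == "add":
        program_state[name] += value
    elif op == "print":
        print(program_state[name])  # prints 5 for the program above

while engine_state["pc"] < len(program):
    step(program[engine_state["pc"]])
    engine_state["pc"] += 1         # the engine records its own progress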
A number of different layers are defined, with each layer performing a well-defined set of operations. Moving inward, each layer performs operations that are progressively closer to the machine instruction set.
At the outer layer, components handle user interface operations, and at the inner layers, components perform the operating system interfacing (communication and coordination with the OS).
Intermediate layers provide utility services and application software functions.
In layered architecture, several layers (components) are defined with each
layer performing a well-defined set of operations. These layers are arranged
in a hierarchical manner, each one built upon the one below it. Each layer
provides a set of services to the layer above it and acts as a client to the layer
below it. The interaction between layers is provided through protocols
(connectors) that define a set of rules to be followed during interaction. One
common example of this architectural style is OSI-ISO (Open Systems
Interconnection-International Organization for Standardization)
communication system.
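To make the client/provider relationship between layers concrete, here is a minimal sketch in Python; the three layers and their method names are invented for illustration, not taken from the OSI model or the course.

# Layered-style sketch: each layer offers services upward and calls only the
# layer directly below it.
class HardwareLayer:                      # inner layer, closest to the machine
    def store_bytes(self, data):
        print(f"writing {len(data)} bytes to disk")

class OSLayer:                            # middle layer: client of HardwareLayer
    def __init__(self):
        self.hw = HardwareLayer()
    def save_file(self, text):
        self.hw.store_bytes(text.encode("utf-8"))

class ApplicationLayer:                   # outer layer: handles user operations
    def __init__(self):
        self.os = OSLayer()
    def save_document(self, text):
        self.os.save_file(text)

ApplicationLayer().save_document("hello")  # the request flows down the layers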
Next, we will take a look at the event based architectural style, which derives from the event
driven programming paradigm. In this architectural style, the fundamental elements in the
system are events. Events can be signals, user inputs, messages, or data from other functions
or programs. Events act as both indicators of change in the system and as triggers to functions.
In this paradigm, functions take the form of event generators and event consumers. Event
generators send events and event consumers receive and process these events. A function can
be both an event generator and an event consumer. You're probably familiar with functions
being explicitly invoked by other functions. In contrast, other architectural styles such as the
main program and subroutine style, functions are not explicitly called in the event-based
architectural style. Instead, event consumers are called implicitly based on event sent from
event generators. We say that event-based functions experience implicit invocation. The
defining feature of implicit invocation is that a functions are not in direct communication with
each other. All communication between them is mediated by an event bus. For now, we can
think of the event bus as the connector between of all event generators and consumers in the
system. Further on in this lesson, we will see how it does this job. To achieve this structure, we
first bind an event and an event consumer via the event bus. That is, each event consumer
registers with the event bus to be notified of certain events. When the event bus detects an
event, it distributes the event to all appropriate event consumers. Because all events must pass
through the event bus, it is a critical part of the system. This structure may seem a little familiar to
you and it should be. Earlier in the specialization, during our study of design patterns, you
learned the observer pattern. The observer design pattern actually manifests the event based
architectural style. One way to implement the event bus is to structure the system to have a
main loop that continually listens for events. When an event is detected, the loop calls all the
functions bound to that event. In many cases, we will want an event consumer to notify other
functions when it has completed its task or to send a state change, so this function must also be
an event generator. This function will implicitly invoke other functions to run after it's completed.
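As a rough sketch of this idea (the event names and the build/test functions below are invented, loosely anticipating the editor example described next), an event bus with implicit invocation might look like this in Python:

# Event-bus sketch: consumers register for named events and are invoked
# implicitly when a generator sends that event to the bus.
bus = {}                                      # event name -> registered consumers

def register(event, consumer):
    bus.setdefault(event, []).append(consumer)

def send(event):
    for consumer in bus.get(event, []):       # implicit invocation
        consumer()

def build_tool():
    print("building project")
    send("build_done")                        # this consumer is also a generator

def test_tool():
    print("running tests")

register("file_saved", build_tool)
register("build_done", test_tool)
send("file_saved")                            # triggers the build, then the tests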
It does this by sending out an event to the event bus. As always, when an event reaches the event bus, the corresponding event consumers are triggered and the next computation can take place. We can use the example of the code editor to help visualize this structure. After editing a file, consider what happens when the save button is clicked. This click generates an
event from the editor that is sent to the event bus to indicate that the project is ready to be
rebuilt. A build tool is triggered by this event to process the file. On completion of the build, the
build tool generates an event that is sent to the event bus. The event bus passes along this
event to the editor and to the test tool. When the editor receives the event from the event bus, it
presents the latest project status. When the test tool receives the event from the event bus, it
begins running tests. When the test tool finishes, it sends a test completion event to the event
bus. The editor receives this event and presents the latest test status. Notice that in this
example, the build tool, test tool, and editor are all event generators and consumers. All of the
indirect communication between functions may not be as efficient as explicit function invocation.
So it's a good idea to take this into consideration as you design your system. In the event
based architectural style, event generators do not necessarily know which functions will be
consuming their events, and likewise, event consumers do not necessarily know who is
generating the events they handle. This loose coupling of functions makes it easier to evolve
and scale the system. Adding new functionality for an existing event is as simple as registering a
new event function pair to the event bus and adding a new event consumer. Next, let's consider
how objects and data will be updated in the system. If objects in the system are globally
accessible then event consumers may change the shared state as they process an event.
However, this design may be risky because event consumers can be called asynchronously.
With asynchronous calls, an event consumer does not need to wait for other event consumers
to finish running before itself running. This means that two event consumers could be running at
the same time on the same shared data. On the one hand, systems that allow asynchronous
function calls may increase the efficiency of the system. But on the other hand asynchronous
calls can result in race conditions, where the shared data may not be updated correctly. We say a system has race conditions when the behavior of the functions depends on the order in which
they are called. One way to coordinate function access to shared data is through a semaphore.
A simple binary semaphore may consist of a variable that toggles between two values, available
and unavailable. Available indicates that the shared data is not in use, and unavailable indicates
that the shared data is in use by a function. To make this concept less abstract, let's consider an
analogy. Imagine that you are at a clothing store and you have selected some clothes to try on.
You make your way to the row of fitting rooms. How do you know if a room is available? Near
each door handle you see either a green or a red indicator that conveys whether a changing
room is available or unavailable, respectively. You enter a room whose indicator is green. When
you lock the door behind you, the indicator switches to red to convey the room is unavailable.
Even if someone tried to enter a room with a red indicator, they would not be able to do so as
it's locked. After you've tried on the clothing as you unlock the door to leave the room, the
indicator switches back to green. In this way, the access to each changing room area is simply
managed through use of the door indicators. Now let's get back to thinking about semaphores in
terms of code. A semaphore has a special operation to check and toggle its value in a single
step. Any function that would like to access the shared data must first check the value of the
semaphore before proceeding. If the value is set to unavailable, then the function must wait and
may check back again later. If the value is set to available, it is, at once, toggled to unavailable
before the function proceeds to access the shared data. When the function no longer needs
access, it will toggle the semaphore value to available to indicate that the shared data is no
longer in use. In this way a semaphore can be used to control access to shared data.
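A minimal sketch of this idea using Python's standard threading module (the shared counter and the number of threads are invented for illustration):

import threading

semaphore = threading.Semaphore(1)   # binary semaphore: 1 = available, 0 = unavailable
shared_total = 0                     # the shared data

def consume_event():
    global shared_total
    semaphore.acquire()              # wait here if the shared data is unavailable
    try:
        shared_total += 1            # update the shared data safely
    finally:
        semaphore.release()          # mark the shared data as available again

threads = [threading.Thread(target=consume_event) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared_total)                  # always 100: no race condition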
Alternatively, if an object is not globally accessible and its state is changed by an event
consumer then some indication of this must be sent back to the event bus before the function
ends. This way, event consumers subscribed to that event will be aware of the state change. In this
architectural style, events and functions do not occur in a predictable way. There are no
guarantees of exactly when an event will be handled, how long it will take to be handled, or
when an event generator will emit an event. Because of this asynchronous behavior, the control flow is entirely based on which events occurred during execution and in what order. Event-driven systems must pay special attention to avoid introducing bugs caused by asynchronous
events. Now that we've learned about the structural and behavioral details of the style, let's
move on to an example so we can see how this will be applied. An example of a system that
could be implemented using an event based architectural style is Cookie Clicker. Cookie Clicker
is an interactive game. The goal of this game is to collect as many cookie points as possible by
clicking on an image of a cookie with your cursor. To collect points in our simplified variation of
the game, you either click the cookie manually, or click the cursor icon to purchase a blue cursor
that will automatically click the cookie for you at regular time intervals. Thus helping you collect
cookie points faster and with less work on your part. In more detail, we will design this system to
do the following. Increase the total cookie points by one each time the player manually clicks the
cookie. When the Buy Clicker icon is clicked, five cookie points are deducted from the total
cookie points and a new blue cursor is added around the cookie. And every five seconds, the
blue cursor surrounding the cookie automatically clicks the cookie and the total cookie points
increases by the number of blue cursors surrounding the cookie. An event-driven way to
achieve this behavior is as follows. The system will use a main event loop to listen for events.
Depending on where you click within a window, a different event consumer will be invoked. For
the sake of simplicity, we will assume all variables in this system are globally accessible and
that access to them is controlled. When you click the cookie, the function registered to handle
this click event is called and the total cookie points is increased by one. When you click the Buy
Clicker icon, the function registered to handle this event is the Buy Clicker function. When
invoked, it first reduces your cookie score because you have made a purchase. Then it adds an
automatic clicker function to the system, and a new blue cursor is added around the cookie.
This means that there will be one automatic clicker function for each blue cursor surrounding the
cookie. The first time a clicker is purchased, a timer function is added to the system by the buy
clicker function. The timer function is an event generator registered to the event bus that sends
a timer event every five seconds. When the timer function emits a timer event, the event bus
detects this and triggers every automatic clicker function to consume this event. Each automatic
clicker function is responsible for making one click to the cookie. So every time a timer event is
sent to the event bus and received by the automatic clicker functions, the total cookie points
increases by the number of automatic clicker functions. That is, every five seconds, the total
cookie points automatically increases by the number of automatic clickers purchased since the
beginning of the game. Together these event consumers, generators, and the event bus form the system of the game. We've left out the details of how the graphics are drawn, but as you can imagine, additional functions can be added to the system to produce the graphics. As we've seen from
our example the event based architectural style is well suited to interactive applications.
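Purely as an illustrative sketch (function and event names invented, and the five-second timer replaced by an explicit timer event), the Cookie Clicker behaviour described above could be wired to a simple event bus like this:

# Cookie Clicker sketch: manual clicks, a purchase, and timer-driven auto clicks
# all flow through one small event bus.
bus = {}
def register(event, fn):
    bus.setdefault(event, []).append(fn)
def send(event):
    for fn in bus.get(event, []):
        fn()

points = 0

def cookie_clicked():                # consumer: each click adds one point
    global points
    points += 1

def buy_clicker():                   # consumer: spend 5 points, add an auto clicker
    global points
    points -= 5
    register("timer", cookie_clicked)   # the auto clicker consumes timer events

register("cookie_click", cookie_clicked)
register("buy_click", buy_clicker)

for _ in range(6):
    send("cookie_click")             # six manual clicks: 6 points
send("buy_click")                    # buy one automatic clicker: 1 point left
send("timer")                        # timer fires: the auto clicker adds a point
print(points)                        # prints 2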
Graphical user interface applications that rely on user input and distributed systems that interact
with other programs are prime candidates for this style. In the event-based architectural style,
event generators and event consumers do not need to know about each other directly. All
communication between them is mediated through an event bus. This decoupling of functions
results in systems that evolve and scale well.
There are software systems where an end user can interact with it through a language.
That is, the user can write scripts, macros or rules that access and compose the basic
features of the system in new ways. Let's take a look at this formula for the Microsoft
Excel spreadsheet application. This simple formula is asking Excel to sum the values in
cells A1 to A4. Excel is able to do this at run time and this functionality is not limited to
just cells A1 to A4. The application is able to carry out the intended calculation because
it is able to interpret the formula into an intermediate representation. And then execute
this representation. Built into Excel, is a component called an interpreter, that has this
responsibility. In fact, the end user doesn't need to know about the underlying
implementation of the function. They don't even need to know about how the formula or
the spreadsheet is represented. The interpreter takes care of properly handling the
requested calculation. In addition to enabling programmable applications, systems
based on interpreters can allow an end user to write scripts, macros or rules that access,
compose, and run the basic features of those systems in new and dynamic ways. The
interpreter-based architecture is used in a variety of commercial systems because it
provides users with flexible and portable functionality. Interpreters can be used to run
scripts and macros. Systems can use an interpreter to drive programmable actions
specified by the user. Scripts are often used for automating common tasks. For example,
users can write scripts to schedule tasks, perform repetitive actions, and compose complex tasks that invoke other commands. Macros are an evolution of scripts, and
became popular with the introduction of graphical user interfaces. A macro records
keyboard and mouse input so that they can be executed later. This allows users to
record interactions with the user interface, which may be either repetitive or complex.
And then, replay the recorded actions in a simple and quick way. An interpreter allows
you to add functionality to a system, or extend existing functionality of a system. This is
done by composing preexisting functions together in a specific sequence in order to
create something new. These preexisting functions are defined by the system
architecture and offered to the user. The developer, thus, avoids the need to implement
all possible combinations of functionality. For example, a web browser extension, not a
plug in, is a component that adds new functionality to the browser and can customize the
pages that the browser renders. Such a component is typically implemented in a
language like Javascript to be run by an interpreter embedded in the browser. Having a
system with a built in interpreter is not only beneficial to developers, it encourages end
users to implement their own customizations. Toward this end, a system can offer an
easier to use language that has domain specific abstractions suited to the needs and
thinking of the end users. This is an advance over requiring end users to use the general
programming languages that professional software developers use. Besides building an
interpreter into a system, the system, itself, can be implemented in a language that runs
upon an interpreter. This can make your system more portable. So, it can work on
platforms that the interpreter supports. The interpreter and its environment essentially abstract the underlying platforms, so your system can be more platform independent.
Portability is becoming more important with the rise of virtual machines and virtual
environments. With more services being hosted in Cloud, it is becoming increasingly
important to be able to develop and deploy software systems onto hardware that you
have no control over. However, interpreters can be slow. Basic implementations spend little time analyzing the source code and use a line-by-line translate-and-execute strategy. This is a classic trade-off: it may be faster and more flexible for developers to use an interpreted language, but slower for the computer to execute it. Let's take a look
at an example of where interpreters are used for Java programs. Java programs are first
translated into an intermediate language that is loaded into a JVM, short for Java Virtual
Machine, which then executes the intermediate language. The JVM will attempt to
optimize the intermediate instructions by monitoring the frequency at which the
instructions are executed. The instructions that are executed frequently, get translated
into machine code and executed immediately. On the next execution of the same
intermediate instructions, the JVM uses Lazy Linking to point the program to the
previous machine code translation. Instructions that are not used frequently are left for
the interpreter of the JVM to execute. This decreases execution times since frequently
used instructions do not need to be constantly translated, and the entire program does
not need to be translated all at once. The JVM also provides portability to Java
programs, allowing them to run on many operating environments. Interpreters support
many different uses such as scripting, creation of macros and enabling programs to work
across different computer architectures. As a designer, it's important that you understand
the role of interpreters in a system's architecture, as they can be used to address needs
for programmability, flexibility, and portability.
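To tie this back to the spreadsheet example, here is a toy formula interpreter in Python; it is an illustration only, the cell values are invented, and it handles nothing beyond a single SUM over a range.

import re

cells = {"A1": 10, "A2": 20, "A3": 30, "A4": 40}   # invented cell contents

def interpret(formula):
    # parse "=SUM(A1:A4)" into an intermediate representation: a list of cell names
    match = re.fullmatch(r"=SUM\((\w)(\d+):\w(\d+)\)", formula)
    col, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    refs = [f"{col}{row}" for row in range(start, end + 1)]
    return sum(cells[ref] for ref in refs)          # execute the representation

print(interpret("=SUM(A1:A4)"))                     # prints 100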
A macro (which stands for
"macroinstruction") is a programmable
pattern which translates a certain
sequence of input into a preset sequence
of output. Macros can make tasks less
repetitive by representing a complicated
sequence
of keystrokes, mouse movements,
commands, or other types of input.
In computer programming, macros are
a tool that allows a developer to re-use
code. For instance, in the C
programming language, this is an
example of a simple macro definition
which incorporates arguments:
#define square(x) ((x) * (x))
A statement such as int num = square(5); then expands to int num = ((5) * (5));
…which declares an integer type variable named num and sets its value to 25.
Note
A macro is not the same as a function. Functions require special instructions and
computational overhead to safely pass arguments and return values. A macro is a way to
repeat frequently-used lines of code. In some simple cases, using a macro instead of a
function can improve performance by requiring fewer instructions and system resources
to execute.
Overview
Design and implementation of a distributed system requires consideration of the
following elements:
Placement of components
Placement of data
Functional roles
Communication patterns
Fortunately, most distributed systems employ one of a small set of common models.
Software Layers
First, consider the software architecture of the components of a distributed system.
The lower two layers comprise the platform, such as Intel x86/Windows or
PowerPC/MacOS X, that provides OS-level services to the upper layers.
The middleware sits between the platform and the application and its purpose is to
mask heterogeneity and provide a consistent programming model for application
developers. Examples of middleware that provide such abstractions include the following:
Java RMI
CORBA
DCOM
Atop the middleware layer sits the application layer. The application layer provides
application-specific functionality. Depending on the application, it may or may not
make sense to take advantage of existing middleware.
A distributed system (often with the help of middleware) aims to provide several forms of transparency:
1. Access − hides the way in which resources are accessed and the differences in data platform.
2. Location − hides where resources are located.
3. Technology − hides different technologies, such as programming language and OS, from the user.
4. Migration / Relocation − hides that resources in use may be moved to another location.
5. Replication − hides that resources may be copied at several locations.
6. Concurrency − hides that resources may be shared with other users.
7. Failure − hides failure and recovery of resources from the user.
8. Persistence − hides whether a resource (software) is in memory or on disk.
Advantages
Resource sharing − Sharing of hardware and software resources.
Openness − Flexibility of using hardware and software of different vendors.
Concurrency − Concurrent processing to enhance performance.
Scalability − Increased throughput by adding new resources.
Fault tolerance − The ability to continue in operation after a fault has
occurred.
Disadvantages
Complexity − They are more complex than centralized systems.
Security − More susceptible to external attack.
Manageability − More effort required for system management.
Unpredictability − Unpredictable responses depending on the system
organization and network load.
The data source can be connected to a number of different filters. Each filter has a
required interface to receive input from data sources and other filters. But each also
has a provided interface that will output transformed data. The data target is
connected to receive the fully transformed result. There are many advantages to
using the pipe and filter architecture. It ensures loose and flexible coupling of
components, the filters. Each filter runs independently of other filters, only
focusing on its input and output data, which allows you to easily add newly
developed filters or move filters around in a system to achieve different results.
Additionally, the loose coupling also allows changes on individual filters to be
made easily without affecting other filters in the system. This is very important to
develop rapidly changing systems. Another advantage is that filters can be treated
as black boxes. Users of the system do not need to know the inner workings of
each filter. So they can simply use a filter to transform different datasets and not
worry about knowing the logic behind that transformation. Finally, a main
advantage of this type of architecture is reusability. Each filter can be called and
used over and over again with different inputs. Filters can be repeatedly used in
different applications for different purposes. However, there are a few drawbacks
to using the pipe and filter architecture. One disadvantage is that it may reduce
performance due to excessive overheads in filters. For this type of architecture,
each filter will receive input, parse that input into some data structure, perform
transformations, and then send data out. The next filter will do the same thing,
parse the input into a data structure, do transformations, and send data to another
filter, which will also do the same thing. If each and every filter has to do this
process with similar parsing and output stages, that's a lot of overhead. As more and
more filters are added, the performance of the system will rapidly diminish.
Additionally, this architecture may cause filters to get overloaded with massive
amounts of data to process. While one filter is working hard to process the large
amount of data, other filters may be starved waiting for their inputs. The last
important disadvantage for this type of architecture is that it cannot be used for
interactive applications. The data transfers and transformations will take time
depending on the amount of data transmitted. So this type of architecture is not
suitable for applications that require rapid responses. Despite the drawbacks that
we just discussed, the many advantages to using the pipe and filter architecture
make it a very popular one used in many systems, such as the text based utilities in
the UNIX operating system. Whenever different datasets need to be manipulated in
different ways, you should consider using the pipe and filter architecture. Breaking
the system down into filters and pipes can provide greater reusability, decrease
coupling, and allow for greater flexibility in your system.
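As a hedged sketch of the style (the filters and sample data are invented), Python generators make convenient filters, with the composition of generators playing the role of the pipes, much like a UNIX shell pipeline:

# Pipe-and-filter sketch: each filter reads items from its input stream and
# yields transformed items, so filters can be reused or rearranged freely.
def source():
    yield from ["  Hello ", "WORLD ", "  pipes  "]

def strip_filter(stream):
    for item in stream:
        yield item.strip()

def lowercase_filter(stream):
    for item in stream:
        yield item.lower()

def sink(stream):
    for item in stream:
        print(item)

sink(lowercase_filter(strip_filter(source())))   # the nesting acts as the pipes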
VM
As you know virtualization is a fairly old technology, but it's still super relevant to
building our cloud computing strategy today. So, first off, what is virtualization?
Simply put, virtualization is the process of creating a software based, or virtual,
version of something, whether that be compute, storage, networking, servers, or
applications. And what makes virtualization feasible, is something called the
hypervisor. We're going to write that here. What a hypervisor is, is it's simply a
piece of software that runs above the physical server, or host. And there are a
couple different types of hypervisors out there. What they do is essentially pull
the resources from the physical server and allocate them to your virtual
environments. There are two main types of hypervisors out there. One being Type
1. Very simple to remember. And 2, you guessed it, Type 2. So let's start with Type
1. A Type 1 hypervisor is a hypervisor that is installed directly on top of the
physical server. These are the most frequently used hypervisors; they're the most secure, they lower latency, and they're the ones that you'll see in the market the most. Some examples would be VMware ESXi or Microsoft Hyper-V,
or even open-source KVM. The other type of hypervisor is a Type 2 hypervisor,
over here. And what makes these different is that there is a layer of host OS that
sits between the physical server and the hypervisor. By that nature they are also
called hosted hypervisors. These are a lot less frequent. They're mostly used for end-user virtualization. You might see some in the market such as Oracle VirtualBox or VMware Workstation. So once you have your hypervisor installed,
you can build virtual environments, or virtual machines. What makes a VM a VM?
A VM is simply a software-based computer. They run like a physical
computer. They have an operating system and applications, and they're
completely independent of one another, but you can run multiple of them on a
hypervisor. And the hypervisor manages the resources that are allocated to
these virtual environments from the physical server. Because they're
independent you can run different operating systems on different virtual
machines. You could run Windows here, or Linux here, or UNIX here for
example. Because they're independent they're also extremely portable. You
can move a virtual machine from one hypervisor to another hypervisor on a
completely different machine almost instantaneously, which gives you a lot of
flexibility and a lot of portability within your environment. So looking at all of this
- this is the core of virtualization as a process. So let's talk about a couple of key benefits that you want to take away from this. 1) Cost savings. The fact that you can run multiple virtual environments from one piece of infrastructure means that you can drastically reduce your physical infrastructure
footprint. And the fact that you don't have to maintain nearly as many servers, run
as much electricity, save on maintenance costs, means that you save on your
bottom line at the end of the day. 2) Would be agility and speed. Like I said,
spinning up a virtual machine is relatively easy and quick - a lot more simple than
provisioning an entire new environment for your developers if they say they want
to spin up a new environment so that they can run a test scenario. Whatever it
might be, virtualization makes that process a lot simpler and quicker. And 3)
lowers your downtime. Let's say that this host goes out unexpectedly. The fact that
you can move virtual machines from one hypervisor to another, on a different
physical server, means that you have a great backup plan in place. Right? So, if
this host goes down you can simply move your VMs very quickly to another
hypervisor on a machine that is working. Virtualization and VMs are at the center
of cloud computing and provide many benefits.
Virtual Machines or VMs are also known as Virtual Servers or Virtual Instances, or simply
Instances, depending on the cloud provider. The various cloud providers make VMs available in
a variety of configurations and deployment options to serve different use cases. When you
create a virtual server in the cloud, you specify the Region and Zone or Data Center you want
the server to be provisioned in and the Operating System you want on it. You can choose
between shared (that is, a multi-tenant) VMs or dedicated (that is, a single-tenant) VMs. You
can also choose between hourly or monthly billing, and select storage and networking options
for the virtual server. Now let’s look at a few different types of VMs that can be provisioned in
the cloud. Shared or Public Cloud VMs are provider-managed, multi-tenant deployments that
can be provisioned on-demand with predefined sizes. Being multi-tenant means that the
underlying physical server is virtualized and is shared across other tenants or users. To satisfy
different workloads, cloud providers offer predefined sizes and configurations ranging from a
single virtual core and a small amount of RAM to multiple virtual cores and much larger amounts
of RAM. For example there can be configurations for Compute Intensive workloads, Memory
intensive workloads, or High Performance I/O. Rather than pick from only pre-defined sizes,
some providers also offer custom configurations that allow users to define the number of cores
and RAM and local storage characteristics. Public VMs are usually priced by the hour (or in
some cases even seconds) and configurations start as low as pennies per hour. Some
providers also let you get monthly VMs, which can result in some cost savings if you know you
will run the VM for at least a month, but if you decide to decommission the VM in the middle of
the month, you will still be charged for the full month. Transient or Spot VMs take advantage of
unused capacity in a cloud data center. Cloud providers make this unused capacity available to
users at a much lower cost than regular VMs of similar sizes. Although the Transient VMs are
available at a huge discount, the Cloud provider can choose to de-provision them at any time
and reclaim the resources for provisioning regular, higher-priced, VMs. Because you run the risk
of losing these VMs when capacity in the data center decreases, these VMs are great for non-
production workloads such as testing and developing applications. They are also useful for
running stateless workloads, testing scalability, or running big data and high performance
computing (HPC) workloads at a low cost. Reserved virtual server instances allow you to
reserve capacity and guarantee resources for future deployments. You reserve the desired amount
of virtual server capacity, provision instances from that capacity when you need them, and
choose a term, such as 1 year or 3 years, for your reserved capacity. You're guaranteed this
capacity within the data center of your choice for the life of the contract term. By committing to a
longer term, you can also lower your costs compared to hourly or monthly instances. This can
be useful when you know you require at least a certain level of cloud capacity for a specific
duration. And if you exceed your reserved capacity, you can always choose to supplement your
unplanned usage and capacity requirements with hourly or monthly VMs. Note however that not
all predefined VM families or configurations may be available as reserved. Dedicated hosts
offer single-tenant isolation. This means that only your VMs run on a given host so they can
make exclusive use of full capacity and resources of the underlying hardware. When
provisioning a dedicated host, you specify the data center and POD in which you want your
host placed. You then assign instances, or virtual machines, to a specific host. This allows for
maximum control over workload placement. Dedicated hosts are typically used for meeting
compliance and regulatory requirements or meeting specific licensing terms. Virtualization and
VMs are at the center of cloud computing and provide many benefits. In the next video, we will
discuss bare metal servers, what they are and what they provide.
System Architectures
The application layer defines the functional role of each component in a distributed
system, and each component may have a different functional role. There are several
common architectures employed by distributed systems. The choice of architecture
can impact the design considerations described below:
Client-Server
The client-server model is probably the most popular paradigm. The server is
responsible for accepting, processing, and replying to requests. It is the producer. The
client is purely the consumer. It requests the services of the server and accepts the
results.
The basic web follows the client-server model. Your browser is the client. It requests
web pages from a server (e.g., google.com), waits for results, and displays them for
the user.
In some cases, a web server may also act as a client. For example, it may act as a
client of DNS or may request other web pages.
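A minimal client-server sketch using only Python's standard library (the response text is invented; the server is started on an arbitrary free port just for the demonstration):

import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):                          # the server processes and replies
        body = b"hello from the server"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

server = HTTPServer(("localhost", 0), Handler)  # port 0 = pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client requests the service, waits for the result, and consumes it.
port = server.server_address[1]
with urlopen(f"http://localhost:{port}/") as response:
    print(response.read().decode())

server.shutdown()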
One difference is centralization. In the traditional client–server model, the server is a centralized
system that serves many clients. The bulk of the computing power resides in a centralized
server that is attached to terminals (clients). In a service oriented architecture, components are
more decentralized. Components are more equal in the sense that one is not subordinate to the
other, for example, the client and server can swap roles in order to achieve an objective.
Components can operate independently but are capable of interacting with each other in order
to achieve an objective.
Another difference is coupling. In the traditional client–server model, the client is typically tightly
coupled to the server. In a service oriented architecture, a component exposes a service so that
the service can be discovered and invoked. The component exposes the service in a way that is
independent of its implementation (via an interface like WSDL or WADL). Other components
that understand the interface can invoke the service via the interface without being tightly
coupled to its implementation. A developer can build an application which uses one or more
services without knowing about their underlying implementations, for example,
a Java application can use a service implemented in C++ and another service implemented in
Ruby.
Multiple Servers
In reality, a web site is rarely supported with only one server. Such an implementation
would not be scalable or reliable. Instead, web sites such as Google or CNN are
hosted on many (many many) machines. Services are either replicated, which means
that each machine can perform the same task, or partitioned, which means that some
machines perform one set of tasks and some machines perform another set of tasks.
For example, a site like CNN might serve images from one set of machines and
HTML from another set of machines.
To reduce latency, load on the origin server, and bandwidth usage, proxies and caches
are also used to deliver content. An end host (your browser) may cache content. In
this case, when you first request content, your browser stores a copy on your local
machine. Subsequent requests for the same content can be fulfilled by using the cache
rather than requesting the content from the origin server.
An organization, like USF, may also deploy a proxy server that can cache content and
deliver it to any client within the organization. Again, this reduces latency, and it also
reduces bandwidth usage. Suppose that several hundred USF students download the
same YouTube video. If a proxy server caches the video after the first student's
request, subsequent requests can be satisfied by using the cached content, thereby
reducing the number of external requests by several hundred.
CDNs, like Akamai, also fall into this category. However, CDNs work a bit
differently than traditional proxy servers. CDNs actively replicate content throughout
the network in a push-based fashion. When a customer (e.g., CNN) updates its
content, the new content is replicated throughout the network. In contrast, a proxy
server will cache new content when it is requested by the first client.
P2P
The peer-to-peer model assumes that each entity in the network has equivalent
functionality. In essence, it can play the role of a client or a server. Ideally, this
reduces bottlenecks and enables each entity to contribute resources to the system.
Unfortunately, it doesn't always work that way. One of the early papers on peer-to-
peer systems was Free Riding on Gnutella, a paper that demonstrated that peers often
free ride by taking resources (downloading files, in this case) and never contributing
resources (uploading files).
Hierarchical or superpeer systems, like Skype, are also widely used. In these systems,
peers are organized in a tree-like structure. Typically, more capable peers are elected
to become superpeers (or supernodes). Superpeers act on behalf of downstream peers
and can reduce communication overhead.
Other
Mobile Code/Agents
The previous models assume that the client/server/peer entities exchange data. The
mobile code model assumes that components may exchange code. An example of this
is Java applets. When your browser downloads an applet, it downloads some Java
code that it then runs locally. The big issue with this model is that it introduces
security risks. No less a security threat are mobile agents -- processes that can move
from machine to machine.
Network Computers and Thin Clients
The network computer model assumes that the end user machine is a low-end
computer that maintains a minimal OS. When it boots, it retrieves the OS and
files/applications from a central server and runs applications locally. The thin client
model is similar, though assumes that the process runs remotely and the client
machine simply displays results (e.g., X-windows and VNC).
This model has been around for quite some time, but has recently received much
attention. Google and Amazon are both promoting "cloud computing". Sun's Sun Ray
technology also makes for an interesting demonstration. Though this model has yet to
see success, it is beginning to look more promising.
Mobile Devices
There is an increasing need to develop distributed systems that can run atop devices
such as cell phones, cameras, and MP3 players. Unlike traditional distributed
computing entities, which communicate over the Internet or standard local area
networks, these devices often communicate via wireless technologies such as
Bluetooth or other low bandwidth and/or short range mechanisms. As a result, the
geographic location of the devices impacts system design. Moreover, mobile systems
must take care to consider the battery constraints of the participating devices. System
design for mobile ad hoc networks (MANETs), sensor networks, and delay/disruption
tolerant networks (DTNs) is a very active area of research.
Fundamental Models
Or, understanding the characteristics that impact distributed system performance and
operation.
Interaction
Latency - the time between the sending of a message at the source and the
receipt of the message at the destination.
Bandwidth - the total amount of information that can be transmitted over a
given time period (e.g., Mbits/second).
Jitter - "the variation int he time taken to deliver a series of messages."
(Coulouris et al)
Generally, it is sufficient to know the order in which events occur. A logical clock is a
counter that allows a system to keep track of when events occur in relation to other
events.
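As a small illustration (the two processes and the single message are invented), a Lamport-style logical clock can be sketched as follows:

# Logical clock sketch: a counter that ticks on local events and jumps past
# the timestamps carried on incoming messages.
class LogicalClock:
    def __init__(self):
        self.time = 0
    def local_event(self):
        self.time += 1
        return self.time
    def send(self):
        return self.local_event()            # stamp outgoing messages
    def receive(self, message_time):
        self.time = max(self.time, message_time) + 1
        return self.time

a, b = LogicalClock(), LogicalClock()
stamp = a.send()          # process A sends a message carrying timestamp 1
print(b.receive(stamp))   # process B's clock moves past it: prints 2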
Failure
Security
There are several potential threats a system designer needs to be aware of: