Network Analysis and Architecture Springer 2023
Yu-Chu Tian
Jing Gao

Network Analysis and Architecture
Signals and Communication Technology
Series Editors
Emre Celebi, Department of Computer Science, University of Central Arkansas,
Conway, AR, USA
Jingdong Chen, Northwestern Polytechnical University, Xi’an, China
E. S. Gopi, Department of Electronics and Communication Engineering, National
Institute of Technology, Tiruchirappalli, Tamil Nadu, India
Amy Neustein, Linguistic Technology Systems, Fort Lee, NJ, USA
Antonio Liotta, University of Bolzano, Bolzano, Italy
Mario Di Mauro, University of Salerno, Salerno, Italy
This series is devoted to fundamentals and applications of modern methods of signal
processing and cutting-edge communication technologies. The main topics are infor-
mation and signal theory, acoustical signal processing, image processing and multi-
media systems, mobile and wireless communications, and computer and communi-
cation networks. Volumes in the series address researchers in academia and industrial
R&D departments. The series is application-oriented. The level of presentation of
each individual volume, however, depends on the subject and can range from practical
to scientific.
Indexing: All books in “Signals and Communication Technology” are indexed by
Scopus and zbMATH.
For general information about this book series, comments or suggestions, please
contact Mary James at [email protected] or Ramesh Nath Premnath at
[email protected].
Yu-Chu Tian
School of Computer Science
Queensland University of Technology
Brisbane, QLD, Australia

Jing Gao
College of Computer and Information Engineering
Inner Mongolia Agricultural University
Hohhot, China
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature
Singapore Pte Ltd. 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.
This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore
To our families and friends
Preface
Computer networks and the Internet have become integral parts of the fundamental
infrastructure in modern industries and societies. Building a new network, upgrading
an existing network, or planning for using a public network requires a profound
understanding of the concepts, principles, approaches, and processes involved in
advanced network planning. This book helps develop such a deep understanding.
The knowledge and skills acquired from this book are relevant to computer networks,
data communication, cybersecurity, and other related disciplines.
There have been numerous books on computer networks and network data commu-
nication, ranging from introductory to more advanced levels. The number of such
books continues to increase. However, from our teaching experience over the last
two decades, we have noticed a dearth of network books that provide detailed discus-
sions on systematic approaches and best practices for high-level network planning
in a structured process. As modern service-based networking becomes increasingly
complex, integrating advanced network technologies, mechanisms, and policies into
network architecture to serve business goals and meet network requirements poses
a significant challenge. Topics relevant to these aspects are scattered throughout the
networking literature. In our teaching and learning of advanced network subjects,
specifically network planning, we have had to gather references from various sources
and compile them into modular teaching materials. This has inspired us to write a
dedicated book on network analysis and architecture design for network planning.
Therefore, this book distinguishes itself from existing network books by
describing systematic approaches and best practices for network planning in a struc-
tured process. It introduces high-level network architecture and component-based
architecture and discusses how advanced network technologies, such as security, are
integrated into network architecture. The book can be used as a textbook for senior
undergraduate students or postgraduate students. It is also valuable as a reference
book for network practitioners seeking to develop or enhance their skills in network
analysis and architecture design, particularly for large-scale computer networks with
complex network service requirements. In our case, we have used the materials from
this book in teaching postgraduate students specializing in computer science and
electrical engineering.
As a textbook, this book compiles materials from various references, including
books, international standards such as Request for Comments (RFCs) from the
Internet Engineering Task Force (IETF), research articles, and network products.
Therefore, the majority of these materials are not our original contributions, although
the book does incorporate some research and development from our own group. We
will cite references whenever possible throughout the book to acknowledge the orig-
inal contributors of these materials that are not authored by us. However, to avoid
distracting readers from the main theme of the topics covered in this book, we will
not provide citations for every sentence. In fact, it would not be practical to do so. It
is safe to assume that all materials discussed in this book are contributed by others
unless explicitly indicated that a contribution is from our own group.
This book covers many advanced topics of computer networks, particularly focusing
on network analysis and architecture. When used as a textbook for senior undergrad-
uate students or postgraduate students, it can easily fit into a one-semester advanced
networks subject. For example, in Australian universities, there are typically 13
teaching weeks each semester. The 13 chapters of the book can be taught within
these 13 teaching weeks. We teach Chaps. 1 and 2 in a single module, and each of
the remaining chapters in separate modules, reserving week 13 as a revision week.
If there are additional teaching weeks available, the following options can be
considered:
• Divide Chap. 6 (Network Addressing Architecture) into two modules, with one
module dedicated to IPv4 addressing and the other module focusing on IPv6
addressing.
• Teach Chap. 10 (Network Security and Privacy Architecture) in two modules to
allow for comprehensive discussions of security architecture.
• Split Chap. 12 (Virtualization and Cloud) into two modules, with one module
focusing on virtualization and the other module dedicated to cloud architecture.
• Extend Chap. 13 (Building TCP/IP Socket Applications) to two teaching modules
to provide insightful practice in developing practical network communication
systems.
If there are fewer teaching weeks, several options are available, by combining
multiple chapters into a single module and/or setting some chapters aside. For
example,
• Combine Chap. 1 (Introduction) and Chap. 2 (Systematic Approaches) into a
single teaching module as we have done in our teaching practice.
• Teach Chap. 11 (Data Centers) and Chap. 12 (Virtualization and Cloud) in a single
module.
• Use Chap. 13 (Building TCP/IP Socket Applications) for setting up assignment
projects.
Contacting us
We have compiled a set of questions and exercises for each chapter in this book to
aid students in better understanding the content. Professors who are using this book
as a textbook are more than welcome to reach out for access to specific sections or
the complete question bank.
We encourage professors, instructors, tutors and teaching assistants, and even
students to create supplementary modules of materials that complement the existing
content of this book. If you have materials in any format that you believe suitable for
potential inclusion in a future edition of this book, we would love to hear from you.
If any part of the materials that you provide is included in a future edition, we will
give you clear and explicit acknowledgment for your valuable contribution.
We also encourage readers of this book to share any comments or insights with us.
Whether you spot typographical errors, grammatical inaccuracies, or inappropriate
expressions, or if you identify any misconstrued descriptions or interpretations, we
would welcome your observations. You may also discuss with us the assertions
made in the book, and let us know what resonates and what may require further
investigation. You may further tell us what you think should be incorporated or
excluded in a future edition.
We sincerely hope that this book is useful to you.
Acknowledgements

Prior to its publication, the majority of the drafts of this book have served as lecture
notes for several years, benefiting hundreds of postgraduate students. We are grateful
to these students and their tutors for their valuable comments and suggestions, which
have contributed to enhancing the content and presentation of the book.
We would like to acknowledge the editors and coordinators of this book from the
publisher for the professional management of the whole process of the publication
of this book. We extend special thanks to Mr. Stephen Yeung, Mr. Ramesh Nath
Premnath, Mr. Karthik Raj Selvaraj, and Ms. Manopriya Saravanan from Springer.
It has been enjoyable to work with them and the publisher.
The contents presented in this book are based on numerous contributions from
the networking community. We are sincerely grateful to all the authors who have
made these valuable contributions, and the organizations that have led the research
and development of many of these areas. For instance, many RFCs from the IETF
serve as the primary sources of references for various networking concepts, mecha-
nisms, and technologies. We would like to express our deep gratitude to the authors
and organizations involved in these contributions from the networking community.
An exhaustive list of these contributors would be too lengthy to include within the
limited space here. All the contributions from the networking community have been
appropriately cited and/or discussed, alongside our own understanding and practical
experience. Consequently, we bear full responsibility for any errors, incorrect or
inappropriate interpretations, and/or inconsistent descriptions that may arise. Any
feedback and suggestions for further improvement of this book are most welcome.
In this book, we have also included some original contributions from our own
group. We would like to thank our colleagues and students who have worked and
collaborated with us to create these contributions. Working with all of you in an
exciting team environment filled with energy, inspiration, and enthusiasm has been
a truly enjoyable experience.
Many of our own contributions cited in this book have received financial support
from various funding agencies through research grants. We are particularly grateful to
the Australian Research Council (ARC) for its support through the Discovery Projects
scheme and Linkage Projects scheme under several grants, such as DP220100580,
DP170103305, and DP160102571. We would also like to acknowledge the Australian
Innovative Manufacturing Cooperative Research Centre (IMCRC), the Australian
CRC for Spatial Information (CRCSI), and other funding agencies and organizations
that have supported our research and development endeavors.
About the Authors
Prof. Jing Gao is a professor and the Dean of the College of Computer and
Information Engineering, Inner Mongolia Agricultural University, Hohhot 010018,
China. She also serves as the director of the Inner Mongolia Autonomous Region
Key Laboratory of Big Data Research and Application for Agricultural and
Animal Husbandry, Hohhot 010018, China. She received the Ph.D. degree in
computer science and technology from Beihang University, Beijing 100191,
China, in 2009. She has previously worked as a visiting professor at the School
of Computer Science, Queensland University of Technology, Brisbane QLD
4000, Australia. Her research interests include computer networks, big data
analytics and computing, knowledge discovery, and agricultural intelligent systems.
Email: [email protected].
Acronyms
RS Router Solicitation
RSA Rivest–Shamir–Adleman
RSPEC Request SPECification
RSTP Rapid Spanning Tree Protocol
RSVP Resource Reservation Protocol
RTE RouTe Entry
RTFM Real-Time Flow Measurement
RTP Real-time Transport Protocol
RTT Round Trip Time
SA Security Association
SaaS Software as a Service
SAN Storage Area Network
SCTP Stream Control Transmission Protocol
SDN Software Defined Networking
SGW Serving Gateway
SLA Service Level Agreement
SLAAC StateLess Address AutoConfiguration
SLM Service Level Management
SLS Service Level Specification
SMS Subscriber Management System
SMTP Simple Mail Transfer Protocol
SNAP SubNetwork Access Protocol
SNMP Simple Network Management Protocol
SPF Sender Policy Framework
SPI Security Parameters Index
SPIN Sensor Protocols for Information via Negotiation
SPT Shortest Path Tree
SSH Secure Shell
SSID Service Set IDentifier
SSL Secure Sockets Layer
SSM Source-specific Multicast
STB Set Top Box
STP Spanning Tree Protocol
TCA Traffic Conditioning Agreement
TCI Tag Control Information
TCS Traffic Conditioning Specification
TFTP Trivial File Transfer Protocol
TIA Telecommunications Industry Association
TLS Transport Layer Security
ToS Type of Service
TPID Tag Protocol Identifier
TSI Tenant System Interface
TSN Time-Sensitive Networking
TSPEC Traffic SPECification
TTL Time to Live
• Chapter 1: Introduction.
• Chapter 2: Systematic Approaches.
• Chapter 3: Requirements Analysis.
• Chapter 4: Traffic Flow Analysis.
The main theme of this book is network analysis and architecture design for network
planning. While there are various types of networks, this book specifically focuses
on computer networks for data communication. Therefore, it uses the term networks
to specifically refer to computer networks unless specified explicitly otherwise.
Computer networks interconnect computing and network devices for data com-
munication and network services, typically utilizing shared network resources. They
have become an integral part of fundamental infrastructure in modern industries and
societies. Organizations rely on computer networks to gather, exchange, store, and
analyze information for business intelligence, which assists in, and supports, busi-
ness decisions. The worldwide system of interconnected computer networks forms
the Internet.
With the increasing complexity of modern computer networks and network ser-
vices, it becomes a challenging task to build a new network, upgrade an existing
network, or use a third-party or public network for an organization, especially when
dealing with a large-scale network. It requires a deep understanding of the concepts,
principles, mechanisms, processes, and methodologies of network planning. This book
will help develop such a profound understanding of network analysis and architecture
design, which are pivotal aspects of high-level network planning.
Network planning involves strategic planning of building a new network, upgrad-
ing an existing network, or utilizing a third-party or public network for an organi-
zation before the network is implemented. It serves business goals within various
constraints and ensures that the network meets current and future requirements in a
cost-effective way. Additionally, network planning encompasses determining how to
provide network connectivity and scalability, as well as how to provision and secure
network functions and services at the expected levels of Quality of Service (QoS).
This introductory chapter will begin with a discussion of the motivation behind
network planning. It will then clarify what is involved in network planning. This
is followed by a brief explanation of how to plan a network. After that, the main
objectives, contents, and organization of this book will be presented.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
Y.-C. Tian and J. Gao, Network Analysis and Architecture, Signals and
Communication Technology, https://doi.org/10.1007/978-981-99-5648-7_1
[Fig. 1.1 Logical network planning for a large-scale network across multiple geographically remote locations with network services from a private data center and a public cloud. The figure shows a headquarters, a branch office, a data center, and a public cloud interconnected through routers R1, R2, R3, and R4.]
center must be provisioned over the third-party WAN links to the whole network.
The network must be able to provide connectivity, scalability, performance, security,
and network services that meet current and future requirements.
However, many questions need to be answered before the network becomes truly
useful. For example, what are the current and future requirements? What SLAs should
be developed with the third-party service providers? What are the security requirements,
and how are they enforced over the third-party infrastructure? What trade-offs should
be made between scalability and connectivity, performance and costs, and new
technologies and complexity? All these and other aspects demand a comprehensive
analysis of network behavior and requirements, followed by systematic planning
of the network architecture to align with the identified requirements.
As depicted in the example presented in Fig. 1.1, network planning needs to
address multiple factors, requirements, and/or objectives, such as:
• Complexity of network topology and behavior,
• Network connectivity versus hierarchy,
• Network scalability,
• Bandwidth capacity and allocation,
• Network services and QoS,
• Network management,
• Confidentiality, Integrity, and Availability (CIA), and
• Costs.
These factors, requirements, and objectives often compete with each other. For example,
improving QoS may result in higher costs, optimizing resource utilization may
lead to more complex topology and/or protocols, and enhancing security may introduce
additional overhead. Therefore, it is important to identify and establish trade-offs
through network planning for a satisfactory solution to the network planning
problem.
A computer network should be able to deal with short-term, medium-term, and
long-term dynamic changes in both the network itself and its requirements. An exam-
ple of such changes is the growth in the number of users, network devices, and/or
network services, leading to a requirement of well-planned network augmentation.
Network augmentation may involve modifying the network topology, introducing
additional QoS mechanisms, designing new Virtual Local Area Networks (VLANs),
adjusting server placement, or adopting additional protocols. To address network
augmentation effectively, it is important to engage in network planning in advance
to adequately prepare for the upcoming network changes.
Overall, network planning is essential to a computer networking project. It offers
a number of benefits. For example,
• It clarifies business goals and constraints,
• It defines network planning problems,
• It identifies technical requirements and trade-offs, network services, and QoS lev-
els,
• It gives a top-level view of network architecture,
• It presents a detailed design for the implementation of network services, and
• It provides component-based architecture, in which security and QoS are at the
heart.
Good network planning fosters a comprehensive understanding of the challenges a
network must address and how the network is used and managed. It enhances network
performance, accounts for future growth, and ensures that the network conforms to
security and QoS requirements.
Ultimately, network planning aims to plan the network to serve business goals
within various constraints. These goals and constraints are transformed into cur-
rent and future requirements that should be fulfilled. To fulfill these requirements,
logical network architecture and physical network connections are developed and
planned incorporating with various mechanisms, protocols, and technologies. From
this understanding, the following is a list of main deliverables in network planning:
• Clarification and documentation of business goals and constraints relevant to net-
work planning,
1.3 Strategic, Tactical and Operational Planning
As in other planning projects, a network planning project can be considered from the
perspectives of strategic, tactical, and operational planning. Strategic planning deals
with long-term requirements and sets up strategic goals for a given period of time.
Tactical planning breaks down long-term strategic planning into short-term objectives
and actions. Operational planning considers the requirements of current network
operation and translates strategic goals into technical ones. Network planning should
investigate all strategic, tactical, and operational requirements through clearly defined
long-term, short-term, and current targets and actions.
More specifically, strategic planning involves the development of strategies to
enable networks to focus on common objectives for a given period of time, typically
three to five years. It provides a big picture of the network in the long term to cast a
vision, and thus requires mission processing and a high-level thinking of the entire
business. Some examples of strategic objectives are listed below:
• In the next three years, move the majority of office network services to cloud
or replace them with cloud-based Software as a Service (SaaS), such as word
processing, spreadsheet, and email.
• In the next four years, replace current circuit-switching voice services with Voice
over IP (VoIP).
• In the next five years, decommission current private data center and move to a
public data center infrastructure.
These strategic targets will have a significant impact on the planning of current and
short-term networking. Tactical and operational planning should fulfill the require-
ments of the strategic targets.
Tactical planning aims to achieve specific and short-term objectives, which are
derived from strategic planning. It presents short-term steps and actions that should
be taken to accomplish strategic objectives described in the strategic planning phase.
The tenure of a tactical plan is typically short, usually one year or shorter. Here are
some examples of tactical objectives and actions:
• In the next three months, initiate testing of cloud-based SaaS for selected services
in a specific network segment.
• In the next six months, expand the testing of SaaS to more services in multiple
network segments.
• In the next nine months, replace local DNS servers with cloud-based DNS servers
from a public cloud and then decommission local DNS servers.
• In the next 12 months, replace local mail servers with cloud-based servers from a
public cloud and then decommission local mail servers.
Operational planning focuses on current and immediate operational issues and
requirements of the network. It makes decisions based on detailed information spe-
cific to network segments, functions, services, technologies, and/or protocols, provid-
ing an opportunity to use network resources effectively and efficiently. The following
are some examples of operational planning:
• Segment a Local Area Network (LAN) into two to reduce the impact of broadcast
traffic on latency performance.
• Create a Virtual Local Area Network (VLAN) for a new work group with staff
members sitting in two different buildings.
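The first operational example above, segmenting a LAN to shrink its broadcast domains, can be sketched with Python's standard ipaddress module. The 192.168.10.0/24 network below is a hypothetical illustration, not an address plan taken from this book:

```python
import ipaddress

# Hypothetical LAN to be segmented; any /24 would serve as an example.
lan = ipaddress.ip_network("192.168.10.0/24")

# Splitting the /24 into two /25 subnets halves each broadcast domain,
# so broadcast frames from one half no longer reach hosts in the other.
subnets = list(lan.subnets(prefixlen_diff=1))

for net in subnets:
    print(net, "->", net.num_addresses, "addresses")
# 192.168.10.0/25 -> 128 addresses
# 192.168.10.128/25 -> 128 addresses
```

In practice, the two subnets would be mapped onto separate VLANs or physical segments, with a router or Layer-3 switch forwarding traffic between them.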
1.4 Structured Network Planning Processes
Overall, network planning converts networking visions and ideas into meaningful
actions and results [1]. It transforms business communication objectives and network-
ing needs into networking requirements, budgets, and project plans. The process of
network planning consists of the following main steps and activities [1]:
(1) Document customer’s business problem and understand what is really needed.
(2) Abstract, formulate, and document a conceptual solution to the business problem,
leading to some results that can be visualized and discussed.
(3) Define a conceptual solution in terms of requirements, such as functional, oper-
ational, administrative, and performance aspects to satisfy the customer’s needs
and address the business problem.
(4) Research and select appropriate product technologies for deploying the solution.
(5) Create a realistic budget for solution deployment by considering both one-time
and recurring expenses.
(6) Develop a project plan for designing and implementing the deployable solution.
This book will not discuss budget issues. Also, as mentioned earlier, it will not
discuss physical network design. Instead, the book will concentrate on high-level
architectural models of large-scale networks.
[Figure: the structured network planning process]
- BEGIN, with inputs: problem statement, initial conditions, workflow data, existing policies.
- Requirements Analysis (requirements analysis, traffic analysis), producing: requirements specifications, sets of services, requirements boundaries, location information, flow specifications, architecture/design trusts.
- Network Architecture (development, selection, and evaluation), producing: network topology, network technologies, equipment requirements, strategic locations, components, architectural boundaries.
- Network Design (selection, evaluation, layout, and others), producing: vendor selection, equipment selection, configuration details, network blueprints, component plans, design boundaries.
- END.
and ensure the manageability of the networking project. The structured process, as
recommended by Oppenheimer, exhibits the following characteristics:
• A top-down design sequence, which begins with gathering and analyzing require-
ments,
• The use of multiple techniques and models to characterize networks, determine
requirements, and propose a structure for future systems,
• A focus on data flow, data types, and processes for accessing or changing the
data,
• An understanding of the location and needs of data access and processing, and
• The development of a logical model ahead of a physical model.
From these characteristics, four main phases are identified for the process of network
design in the system development life cycle [3, p. 6]:
(1) Requirements analysis,
(2) Logical design,
(3) Physical design, and
(4) Testing, validation, and documentation of the design.
While the process of network planning is described from different perspectives, it has
become a common understanding that network planning should follow a structured
process with sequential phases or steps. Each phase should not start until the com-
pletion of its preceding phase. The phases of requirements analysis, logical network
architecture, and physical network design are common to all processes described
above. The general process of network planning is outlined below with multiple
sequential phases:
(1) Business goals analysis: This phase involves understanding and clarifying busi-
ness goals and constraints.
(2) Requirements analysis: In this phase, requirements specifications are identified
and formalized, which take possible trade-offs into consideration.
(3) Top-level network architecture: This phase provides a top-level view of network
components, topology, functions, services, technologies, interconnections, policies,
and possible locations.
(4) Component-based network architecture: Each significant component is addressed
in this phase with the focus on its functions, services, mechanisms, technologies,
protocols, connections, performance, security, and other aspects.
(5) Network design: This phase primarily focuses on physical design.
(6) Implementation and deployment.
(7) Evaluation, testing, and verification.
(8) Operation, Administration, and Maintenance (OAM).
This general process will be formalized later in Chap. 2 in the waterfall model as one
of the systematic approaches for network planning. This book will mainly focus on
the first four phases with an emphasis on network analysis and architecture planning.
As mentioned earlier, it will not cover physical network design.
Network planning can begin with current requirements and then incorporate addi-
tional enhancements and/or technologies for future network growth. Alternatively,
it can start with a strategic perspective on future targets and then narrow down to
current operational requirements. The ways chosen by different planners for network
planning can vary significantly. This makes network planning more of an art than
a science or technology. From this perspective, network planning requires a deep
understanding of the insights into computer networks with regard to [2, p. 3],
• Individual rules on evaluating and choosing network technologies,
• Ideas about how network technologies, services, and protocols work together and
interact,
1.5 Network Planning as an Art
• Experience in determining what works and what does not work, and
• Selecting network topological models, often based on arbitrary factors.
Such a profound understanding of computer networks requires much knowledge that
can only be developed through experience [4].
As an art, network planning largely relies on the expertise and experience of the
planner. There is no standard solution to a network planning problem. This explains
the observation that no two networks are exactly the same in the real world. Different
network planners will likely propose different plans and designs, all of which can
function effectively. The solutions are not differentiated solely by their correctness
or incorrectness. However, a solution may be better than others in the sense that it
provides better trade-offs among competing objectives and requirements.
Good network planning can be achieved by following best practices. For example,
one such practice is to follow a structured process. Utilizing systematic approaches
is also a good practice in network planning. Petryschuk has summarized top five best
practices for network design [5]:
• Integrate security early on. Security is always crucial in all networks, and some-
times even more so than performance. Therefore, it is highly recommended to
consider security as a priority requirement from the beginning of the network
planning project.
• Know when to use top-down versus bottom-up. The top-down methodology is
always recommended for planning a large-scale network from scratch. It allows
us to focus on the fulfillment of business goals and constraints through the develop-
ment of technical requirements and trade-offs. However, for one or more specific
segments of the network, if the requirements and their relationships with business
goals are already clear, the bottom-up methodology may provide a quick solution.
• Standardize everything. Some examples are hostnames (e.g., printer02.area01.lan03),
IP addressing, structured cabling, and security policies.
• Plan for growth. Consider factors such as bandwidth capacity, segmentation, and
IP address allocation to accommodate future expansion.
• Create and maintain network documentation. Documentation plays a vital role in
network management and troubleshooting.
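A naming convention of the kind suggested above can also be checked mechanically. The sketch below is a minimal illustration, assuming a hypothetical device.area.lan pattern modelled on the example hostname in the list:

```python
import re

# Assumed convention: <device-type><2 digits>.<area><2 digits>.<lan><2 digits>,
# modelled on the example hostname printer02.area01.lan03.
HOSTNAME_PATTERN = re.compile(r"^[a-z]+\d{2}\.area\d{2}\.lan\d{2}$")

def is_standard_hostname(name: str) -> bool:
    """Return True if the hostname follows the assumed naming convention."""
    return bool(HOSTNAME_PATTERN.match(name))
```

For example, `is_standard_hostname("printer02.area01.lan03")` accepts the example from the list, while an ad hoc name such as `"Printer-2.office.lan"` is rejected.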
This book will present best practices from various perspectives for network planning.
No matter how a network is planned and who conducts the planning, it is always
imperative to ensure that the needs of the business are met by the resulting network plan.
As network complexity continues to grow, comprehensive analysis and detailed tech-
nical design become increasingly important in meeting the requirements of modern
service-based networking. Therefore, network planning should be approached not
only as an art but also as a science and technology. In this book, it is recommended
to follow a structured process and incorporate systematic approaches for network
planning. This will be discussed in detail throughout the book, providing valuable
insights for effective network planning.
As analyzed previously, the success of a network planning project requires the planner
to follow a structured process incorporating systematic approaches. It also
necessitates a good understanding of customer requirements. It further calls for a high-level
logical view of the network before any physical design is conducted. The logical view
should be hardware- and vendor-independent.
In addition, engaging customers and executives in the network planning project
is critical to the success of the project. This will ensure that the customer require-
ments are well clarified and specified, and that the planning aligns well with the business
goals and constraints. It is important to discuss the network planning project with
customers and executives during each planning phase, seeking their endorsement of
intermediate outputs or gathering clear suggestions for changes and amendments.
Observations indicate that large Information Technology projects that fail often
exhibit the following features [1]:
• Lack of customer (end-user) involvement in planning,
• Focus on product technology and vendor selection,
• Misunderstanding of customer (end-user) business requirements,
• Understatement of project start-up (build) and recurring (run) costs,
• Poor definition of the business value of the proposed project, and/or
• Lack of interest and support from executive management.
The overall aim of this book is to provide systematic approaches and best practices
for network planning in a structured process. It describes how to assemble various
network technologies, services, mechanisms, and policies into network architecture
in a cohesive way. Specific objectives of this book include:
• Establishing a structured process for planning large-scale computer networks with
increasing complexity,
• Introducing systematic approaches that are effective in, and suitable for, network
planning,
• Developing insights into business goals and constraints that network planning
should meet,
• Understanding technical requirements specifications and exploring possible trade-
offs through comprehensive requirements analysis for network planning,
• Presenting a top-level view of network architecture that aligns with business goals
and technical requirements, and
• Investigating logical architecture for important components of large-scale computer networks.
1.8 Book Organization
To achieve these objectives, this book is designed with the following main
contents:
Part I: Network Analysis
– The concepts of network planning in a structured process.
– Systematic approaches for network planning. More specifically, the following
approaches will be discussed: the systems approach, the waterfall model, a generic
network analysis model, the top-down methodology, and service-based network-
ing.
– Requirements analysis for network planning to define what network problems
are, clarify business goals and constraints, and develop technical requirements
specifications with possible trade-offs.
– Traffic flow analysis to identify predictable and guaranteed traffic flows, and the
requirements to serve these flows.
Part II: Network Architecture
– Top-level network architecture covering network topology, functional entities, and
service models.
– Component-based architecture for key components of large-scale networks with
complex service and security requirements. More specifically, architectural models
will be discussed for addressing, routing, performance, management, and security
components.
Part III: Network Infrastructure
– Data centers with various national and international standards on topology, archi-
tecture, security, and design.
– Virtualization and cloud in relation to virtualization mechanisms, virtualized
resources, virtualized network functions, cloud architecture, cloud service models,
cloud security, and other related topics.
– Building practical TCP/IP network communication systems by using sockets.
Comprehensive examples will be provided for different scenarios and require-
ments of TCP/IP communication applications.
The overall structure of this book is shown in Fig. 1.3. The present introduc-
tory chapter introduces basic concepts of network planning through systematic
approaches in a structured process. Then, network analysis is discussed in detail
for network planning. It is covered in three main chapters: Chap. 2 on systematic
approaches for network analysis and architecture design, Chap. 3 on requirements
analysis, and Chap. 4 on traffic flow analysis, respectively.
Next, Chap. 5 is dedicated to providing a top-level view of network architecture
for large-scale networks. It is supported by component-based architectural models
for key network components in Chaps. 6 through 10, which cover addressing, routing,
performance, management, and security, respectively.
After that, two critical network infrastructure components are discussed in
Chaps. 11 and 12 to support various network functions, services, and applications.
Chapter 11 focuses on data centers, covering topics such as topology, architecture,
security, and standards. Chapter 12 explores virtualization and cloud with detailed
discussions on virtualization mechanisms, virtualized resources, virtualized network
functions, cloud architecture, cloud service models, cloud security, and other related
topics.
Moreover, Chap. 13 is devoted to building TCP/IP communication systems via
sockets. It delves into the concepts, principles, and practices of socket programming.
Comprehensive examples are also provided in this chapter.
• When planning to use third-party or public networks for network services and
applications.
Let us begin our discussions below with the systems approach as one of the
effective systematic approaches.
2.1 Systems Approach

The systems approach is a notion from general systems theory, which was originally
developed by Ludwig von Bertalanffy in the 1960s. It refers to the decomposition
of a complex system into smaller, easier-to-understand subsystems for a better
understanding of the complexity of the overall system. The systems approach finds
applications in various areas including network planning. This section discusses the
systems approach and its specific applications within the context of network plan-
ning.
In general systems theory, a system is a unitary whole integrated from interacting
and interdependent subsystems. It can be a natural, human-made, or even conceptual
entity. Depending on the application context, a system can represent an organization,
a software system, a management process, or a computer network. In each of
these examples, the system is treated as a whole, which is composed of inter-related
and interdependent elements or subsystems. In the context of computer networking,
multiple individual Local Area Networks (LANs), functional areas, network services,
and other logical or physical entities work together to form a unified whole, namely
the network.
Most logical or physical systems of practical relevance are open systems, meaning
that they interact with their environment. Therefore, in order to understand or define
the functions of a system as a whole, it is important to clearly delineate the boundary
between the system and its environment. By doing so, the interactions between the
system and its environment can be investigated and understood. This helps address
questions such as:
• What inputs does the system receive from the environment?
• What outputs does the system generate and send to the environment?
• What constraints exist at the boundary of the system?
Figure 2.1 provides a graphical representation of the concepts of a system and its
environment. The system consists of multiple subsystems that are interacting and
interdependent. It also interacts with the environment through inputs and outputs.
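The boundary questions above can be made concrete in a minimal sketch; the subsystem, input, and output names below are arbitrary placeholders:

```python
# Minimal model of an open system: what crosses the boundary is made explicit.
class OpenSystem:
    def __init__(self, subsystems, inputs, outputs, constraints):
        self.subsystems = subsystems    # interacting internal parts
        self.inputs = inputs            # received from the environment
        self.outputs = outputs          # generated and sent to the environment
        self.constraints = constraints  # conditions at the boundary

enterprise_net = OpenSystem(
    subsystems=["LAN-1", "LAN-2", "server farm"],
    inputs=["inbound ISP traffic"],
    outputs=["outbound ISP traffic"],
    constraints=["SLA measured at the border router"],
)
```

Answering the three boundary questions then amounts to filling in the `inputs`, `outputs`, and `constraints` attributes for the system under study.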
The concepts of system and environment are directly applicable to computer
networks as well. In the context of computer networking, it is crucial to define a
clear boundary for each routing area. Border routers, for instance, are installed at the
boundary between an enterprise network and its external ISP. The performance of
network services as specified in Service Level Agreements (SLAs) is often measured
at the boundary of the network. In this scenario, the external ISP can be considered
as part of the network’s environment.
In the context of computer networking, Fig. 2.2 illustrates three perspectives of a
network considered as a system. It shows three topological models for the same net-
work, i.e., geographical topology, functional topology, and component-based topol-
ogy, respectively. This highlights the fact that a system, such as a network in this
example, can be investigated from different perspectives for a comprehensive under-
standing of its functions, behavior, and performance.
In the systems approach, it is essential to have a good understanding of the behavior of each component within the
overall system. In the case of a computer network, this requires a good understanding
of the behavior of each network segment or component, such as LANs, Metropolitan
Area Networks (MANs), Wide Area Networks (WANs), access, distribution, core,
and physical and logical components shown in Fig. 2.2.
However, comprehending each subsystem individually is not sufficient to fully
understand the functions, behavior, and performance of the overall system. This is
because interactions between the subsystems exist, and these interactions give rise
to unique system dynamics that may not exist in the individual subsystems. Having
a deep understanding of each atom does not mean that we understand the entire
universe. Similarly, knowing each individual from a school does not mean that we
know the school’s culture. For a network shown in Fig. 2.2b, a high level of traffic
in each access area may not indicate an issue for that specific area, but collectively,
the traffic from the access areas may cause congestion in the distribution area.
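The aggregation effect described above can be illustrated with simple arithmetic. The figures below are hypothetical: each access area stays under its own (assumed) capacity, yet the combined flows exceed the capacity of the shared distribution link:

```python
# Hypothetical per-access-area traffic loads (Mb/s); each area's own
# uplink capacity is assumed to be 100 Mb/s, so no single area is congested.
access_area_traffic = {"access-1": 60, "access-2": 55, "access-3": 70}
ACCESS_CAPACITY = 100        # Mb/s per access area (assumed)
DISTRIBUTION_CAPACITY = 150  # Mb/s on the shared distribution link (assumed)

per_area_ok = all(t <= ACCESS_CAPACITY for t in access_area_traffic.values())
aggregate = sum(access_area_traffic.values())
congested = aggregate > DISTRIBUTION_CAPACITY  # emergent congestion
```

Here every access area looks healthy in isolation, but the distribution link carries their sum and becomes the bottleneck, which is exactly the system-level behavior that component-by-component analysis misses.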
From this perspective, systems possess an important feature known as emergent
behavior or synergy. This means that a system is more than the sum of its parts.
We can simply express this feature as 1 + 1 > 2. This can be understood from two
perspectives:
• Due to the interactions between components and their collective contributions,
a system can exhibit some functions, behavior, and features that do not exist in
its individual components. Each individual computer can function alone in isola-
tion. But when multiple computers are networked, they can exchange information
through data communication over the network.
• Well-designed system components with appropriate use of their interactions will
make the system stronger than the sum of its individual components. Being stronger
in this context can be interpreted as having richer dynamics, improved perfor-
mance, or additional functions in our favor. Conversely, a poor design of the system
may cause the system to behave worse than the sum of its parts.
The understanding of system synergy from these two perspectives highlights the
importance of designing network components and segments individually and collec-
tively in a systematic way.
Tertiary students are generally well-trained in solving given problems using specific
techniques or tools. For example, when presented with a linear programming problem
that has a well-defined objective function and constraints, they can apply the sim-
plex method to find a solution. However, when it comes to complex systems like
enterprise network planning, the technical specifications for solving such problems
are not always readily available. This implies that the problem itself,
which needs to be solved, is not clearly defined.
Let us consider a scenario where one of the distribution areas in the network shown
in Fig. 2.2b experiences traffic congestion and significant latency. This is a problem
that needs to be solved, but what is the actual underlying technical problem? Could
it be attributed to:
• An inappropriate use of network resources by a user or group of users?
• An inadequate segmentation design in the access areas that interact with the dis-
tribution area?
• An improper topological design of the distribution area?
• Or other underlying issues?
Without clarifying the specific problem, it becomes challenging to address it from
the technical perspective. This highlights a distinctive feature of complex systems
such as network planning: we need to tackle problems that have not yet been clearly
defined.
Therefore, when dealing with complex systems, the first step to solve a problem,
which is typically yet to be defined, is to clarify and clearly define the problem itself.
How can we accomplish this? The answer lies in conducting a comprehensive system
analysis, specifically a requirements analysis. This will aid in gathering and clarifying
the business goals and constraints, which serve as the foundation for developing
technical specifications and trade-offs. Only after we have a complete set of technical
specifications can we begin to explore and develop techniques and tools to lead a
solution satisfying those specifications.
Additionally, the process of system analysis also provides insights into potential
solutions or directions to pursue in the search for a satisfactory solution. For example,
if system analysis reveals that the traffic congestion mentioned earlier in a distribution
area depicted in Fig. 2.2b results from an inappropriate use of network resources by
a group of users in an access area, it may be necessary to develop and enforce a
user access policy. If network access from all access areas is functioning properly,
the issue could be attributed to inadequate segmentation. In such a case, it may be
necessary to break down large segments of the network into smaller ones.
For a network treated initially as a black box, we can apply inputs such as traffic
shaping or restriction measures, observe the behavior of the network, and analyze
the impact of these measures. If necessary,
we can feed the information obtained from the output back to the system, further
adjust the actions of traffic shaping or restriction based on the obtained information,
and observe the resulting improvement or degradation in system performance. This
iterative process enables us to gain a better understanding of the system dynamics and
behavior, as well as uncover potential solutions for improving system performance.
Through this input-output-feedback process, we will acquire more knowledge
about the system. As a result, the black box gradually becomes a gray box, and
potentially even a white box. A white box signifies that the system is fully under-
stood for the purpose of system control and operation. System analysis, including
requirements analysis, in network planning is a process that assists in understanding
the network from an initial black box gradually to a gray box and potentially even a
white box.
2.2 Waterfall Model

The waterfall model typically consists of five to seven phases that proceed strictly in
a linear sequential order, implying that a phase cannot commence until its previous
phase is completed. The names of the waterfall phases may vary depending on the
specific application scenario. In its early version defined by Winston W. Royce, the
waterfall model is composed of five phases, i.e., requirements, design, implementa-
tion, verification, and maintenance. This is shown in Fig. 2.4a.
The five phases in the waterfall model depicted in Fig. 2.4a are briefly described
below:
• Requirements: In this phase, all customer requirements are gathered at the begin-
ning of the project, enabling all other phases to be planned without further inter-
actions with the customer until the conclusion of the project.
• Design: The design phase is divided into two steps: logical design and physi-
cal design. In the logical design step, conceptual and/or theoretical solutions are
Fig. 2.4 (a) Standard waterfall model; (b) Waterfall software development life cycle
developed based on the gathered requirements. In the physical design step, the con-
ceptual and/or theoretical solutions are converted into concrete specifications that
can be implemented. The design phase provides a blueprint for the construction
or implementation of the final product.
• Implementation: In this phase, the concrete specifications developed in the design
phase are implemented. For example, in software development, this involves writ-
ing code based on the design specifications.
• Verification: Once the implementation is complete, the product undergoes thor-
ough review and testing. Both the developer and particularly the customer exam-
ine the product to ensure that it functions as intended and meets the requirements
developed in the initial phase.
• Maintenance: The customer uses the product, maintains it, and discovers any bugs
and deficiencies. The developer applies fixes as necessary.
These five phases provide a structured and sequential process to project management,
enabling a systematic progression from requirements analysis to product delivery and
maintenance.
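The strict phase ordering of the waterfall model can be captured in a toy sketch. The phase names follow Royce's five phases listed above; the enforcement logic itself is only an illustrative assumption:

```python
# Royce's five waterfall phases, in their strict sequential order.
PHASES = ["requirements", "design", "implementation", "verification", "maintenance"]

class WaterfallProject:
    """Toy model of the waterfall process: a phase may be completed
    only after all of its predecessors have been completed."""

    def __init__(self):
        self.completed = []

    def complete(self, phase: str) -> None:
        expected = PHASES[len(self.completed)]
        if phase != expected:
            raise ValueError(f"cannot complete {phase!r}: {expected!r} must finish first")
        self.completed.append(phase)
```

Attempting to jump ahead, say from design straight to maintenance, raises an error, mirroring the rule that a waterfall phase cannot commence before its predecessor finishes.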
To make the waterfall model more effective, feedback can be introduced from the
verification and maintenance phases to the requirements, design, and implementation
phases. This enables the refinement of the first three phases even after the complete
product is delivered to the customer. In order to show the original waterfall model
clearly, the feedback feature has not been visualized in Fig. 2.4a.
In the context of computer networking, the waterfall model typically comprises the
following sequential phases: requirements analysis, logical network design, physical
network design, implementation and deployment, evaluation/testing/verification, and
Operation, Administration, and Maintenance (OAM). It is depicted in Fig. 2.5. The
results from the evaluation/testing/verification and OAM phases could be fed back
to previous phases for their refinement.
For the purpose of network planning, this book will focus more on the first two
phases in the waterfall networking model illustrated in Fig. 2.5, i.e., requirements
analysis, and logical network design.
The requirements analysis phase in waterfall networking requires a good under-
standing of business goals and constraints. From this understanding, comprehensive
specifications of technical requirements and potential trade-offs can be developed.
Therefore, this phase consists of three main steps:
• Analyzing business goals and constraints,
• Analyzing technical requirements and exploring potential trade-offs, and
• Characterizing the existing network and its traffic. This step considers current
and future network traffic, and assesses its impact on the protocol behavior and
Quality of Service (QoS) requirements.
The logical network design phase in the waterfall networking model deals with
architectural and logical design, including topology, functionality, QoS, security,
and other related aspects. More specifically, it starts with a top-level architectural
design. This is followed by component-based architecture for addressing, routing,
performance, management, security, cloud, and others. For each of these components,
to fulfill the requirements of the component, we need to
• Understand what this component can provide,
• Plan and design the topology, and
• Identify, select, and develop appropriate mechanisms.
It is worth noting that network planning can be evaluated before its actual imple-
mentation and deployment. The evaluation can be conducted through modelling and
simulation under typical use cases. The results obtained from the evaluation can
be fed back to the requirements analysis and system design phases for their further
refinement. However, in order to maintain our focus on the main theme of network
planning, this book will not extensively cover network modelling and simulation.
2.3 A Generic Network Analysis Model

As discussed in previous sections, both the systems approach and waterfall model
emphasize the significance of system analysis with a particular focus on requirements
analysis. This section introduces a generic model for network analysis from the
perspective of complex systems comprising multiple components and entities. This
model can be viewed as an application of system analysis in the context of network
planning [3, pp. 27–31].
From the systems approach perspective, the network component depicted in Fig. 2.6b
differs from the OSI network protocol layer illustrated in Fig. 2.6a. It is a subsystem
with functions spanning the bottom three OSI protocol layers (i.e., the network,
data link, and physical layers) in Fig. 2.6a. Its main functions include routing and
Fig. 2.6 (a) OSI's layered architecture; (b) a generic network analysis model [119, p. 28]
end-to-end delivery of data packets at the network layer, media access control at the
data link layer, and bit streaming through NICs and communication medium at the
physical layer.
The device component or subsystem shown in Fig. 2.6b represents an abstraction of
the functionalities performed by hardware devices such as routers, switches, servers,
hosts, and other networking devices. These hardware device functions are imple-
mented within the OS and span across the bottom four OSI protocol layers (i.e., the
transport, network, data link, and physical layers). In general, the device subsys-
tem manages end-to-end routes through the use of transport protocols, end-to-end
data delivery by employing network-layer protocols, media access control with sup-
port from data-link-layer protocols, and the transmission of data bits via NICs and
communication medium by using physical-layer protocols.
Various network services are provisioned over the network through applications
with support from protocols spanning multiple OSI layers. The application com-
ponent or subsystem depicted in Fig. 2.6b is an abstraction of the main functions
performed across the top four OSI protocol layers shown in Fig. 2.6a, i.e., the appli-
cation, presentation, session, and transport layers. It deals with application-layer
protocols and services (e.g., web service via HTTPS), presentation control (e.g.,
data format, encryption, and compression), session control, and the establishment of
end-to-end links through transport protocols.
It is important to realize that network services and applications ultimately serve
users. Therefore, the user component or subsystem illustrated in Fig. 2.6b captures the
functions and requirements of users. It incorporates the top two OSI protocol layers
(i.e., the application and presentation layers) to address application requirements and
data format.
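The layer coverage of the four subsystems described above can be summarized in code, numbering OSI layers 1 (physical) through 7 (application). The spans follow the text, and the overlap of two subsystems indicates roughly where their interface sits:

```python
# OSI layers: 1 physical, 2 data link, 3 network, 4 transport,
# 5 session, 6 presentation, 7 application.
SUBSYSTEM_LAYERS = {
    "network":     {1, 2, 3},        # bottom three layers
    "device":      {1, 2, 3, 4},     # bottom four layers
    "application": {4, 5, 6, 7},     # top four layers
    "user":        {6, 7},           # top two layers
}

def shared_layers(a: str, b: str) -> set:
    """Layers covered by both subsystems -- roughly where their interface sits."""
    return SUBSYSTEM_LAYERS[a] & SUBSYSTEM_LAYERS[b]
```

For instance, the application and device subsystems overlap only at the transport layer, which is consistent with the transport protocols mediating between them in the generic model.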
It is worth noting that depending on how complex a network is and what needs
to be analyzed for the network, each of the four subsystems depicted in Fig. 2.6b
can be further decomposed into two or more components. For example, the user
subsystem can be investigated by further dividing it into specific user groups. In the
case of an enterprise network within a university, user groups could include IT support
staff, general administration staff, academic staff, senior executives, students, and
visitors. Different user groups may have different requirements and security policies,
necessitating specific considerations in network analysis and architectural planning.
In the generic model depicted in Fig. 2.6b, each pair of neighboring subsystems
interacts through appropriate interfaces. The user subsystem interacts with
the application subsystem through displays (e.g., monitors), Graphical User Inter-
faces (GUIs), and general User Interfaces (UIs). Between the application and devices
subsystems, application-device interfaces can be various Application Programming
Interfaces (APIs), QoS configuration and management, and device monitoring and
management systems. The device-network interface between the device and network
For the analysis and design of systems such as computer networks, either a top-down
or a bottom-up approach can be used. These two approaches reflect different styles
and processes of thinking and decision-making. A comparison between top-down
and bottom-up approaches is provided in Fig. 2.7. The top-down methodology has
been described in [4, pp. 3–7] as a general network design methodology.
Fig. 2.7 (comparison of the top-down and bottom-up approaches): big picture — business goals, general network functions and services; some details — top-level architecture and component-based architecture; focused details — hardware, floor plan, and cabling
In the top-down approach, we start with the overall system and divide it into a few
big subsystems. After that, investigate how each of these subsystems is solved and
how these subsystems interact with each other. This process continues recursively,
further dividing the subsystems into smaller ones until sufficient details are achieved
to support the functions of the overall system. This recursive decomposition of the
system into subsystems is similar to that in the systems approach. However, after the
system is decomposed, either top-down or bottom-up approach can be employed. The
top-down methodology emphasizes the recursive process from the top to the bottom.
It also requires considering not only the problem to be solved, but also the way of
solving the problem. In general, the top-down approach is effective for large-scale
and complex systems. In the context of network planning, the top-down and bottom-
up approaches address network problems in different ways. The top-down approach
first considers the upper layers of OSI’s layered network architecture before moving
to lower layers. This means that it focuses on applications, data format, session
control, and transport at upper layers before delving into routing, switching, and bit
streaming at lower layers.
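The recursive decomposition described above can be sketched as a simple tree walk over a hypothetical network hierarchy:

```python
# Hypothetical decomposition of a network into subsystems as nested dicts;
# an empty dict marks a component detailed enough to be designed directly.
network = {
    "core": {},
    "distribution": {
        "dist-area-1": {"access-1a": {}, "access-1b": {}},
        "dist-area-2": {"access-2a": {}},
    },
}

def decompose(system: dict, depth: int = 0):
    """Top-down walk: visit each subsystem before descending into its parts."""
    for name, parts in system.items():
        yield (depth, name)
        yield from decompose(parts, depth + 1)
```

Walking the hierarchy visits each subsystem before its internal parts, mirroring the top-down process of settling the big picture before filling in details.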
The top-down methodology for network planning is also an iterative process.
The business and technical requirements with high priority should be addressed first.
They can be presented in a top-level architecture of the network. Later, more infor-
mation will be gathered regarding specific technical and non-technical requirements,
such as business service models, protocol behavior, connectivity and performance
requirements, access control policies, and security policies. Then, the top-level archi-
tecture can be improved and refined, from which component-based architectural
models can be further developed.
Different from the top-down approach, the bottom-up approach starts from the
specific and moves up to the general. It focuses on dealing with individual compo-
nents first and then integrates logical or physical components with clearly defined
or well understood interactions to form larger subsystems. This process is repeated
recursively, moving up the hierarchy to derive a solution with the expected func-
tions and dynamics of the overall system. Overall, the bottom-up approach is useful
for small-scale and simple systems. In the context of network planning, it primar-
ily concentrates on the hosts, switches, routers, and their interconnections before
considering upper-layer functions and behaviors of the overall network.
The devices can be interconnected with one or two Ethernet LANs using switches.
Then, implement a simple Dynamic Host Configuration Protocol (DHCP) server,
add a border router, configure firewalls, and install and deploy applications. It is not
difficult to configure the network to satisfy the requirements and constraints of the
business. In this bottom-up design process, individual components of the network
are specified in detail. They are then integrated to form larger components, which are
subsequently integrated to create a complete network. This is a process of decision-
making about smaller components first and then deciding how to put the components
together to get a complete system with the desired functions.
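The bottom-up build-out in this paragraph can be sketched as stepwise composition; all component names below are illustrative placeholders:

```python
# Illustrative bottom-up assembly: fully specify the small components
# first, then integrate them into progressively larger subsystems.
def build_lan(name: str, hosts: int) -> dict:
    """Specify one Ethernet LAN in detail (hypothetical attributes)."""
    return {"lan": name, "hosts": hosts, "switch": f"sw-{name}"}

def integrate(lans: list) -> dict:
    """Integrate finished LANs, then add shared services and the network edge."""
    return {
        "lans": lans,
        "dhcp": "dhcp-server-1",      # assumed single DHCP server
        "border_router": "rtr-edge",  # assumed edge device names
        "firewall": "fw-1",
    }

office = integrate([build_lan("eng", 40), build_lan("admin", 15)])
```

Each LAN is fully specified before integration, and the shared services are bolted on afterwards, which is precisely the decision ordering of the bottom-up process.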
For the planning of a large-scale network, connectivity and scalability are among
the major concerns. Meeting the requirements for availability, reliability, security,
performance, and management is also challenging. Due to the interactions among all
such logical or physical components, a design that appears satisfactory for individual
components may not be effective for the overall system. Let us take addressing as an
example, which is a logical component that spans almost all aspects of a network,
e.g., connectivity, scalability, security, performance, and management. Addressing
cannot be adequately tackled without a global view of the entire network. The same
holds true for security, QoS, and network management. As another example, if two
routing areas are designed separately with different routing protocols, each routing
protocol may function perfectly within its own routing area. However, integrating
these two routing areas into the network’s routing system would present a significant
challenge: route redistribution would be required. Hence, the top-down approach
is preferable to the bottom-up approach for complex networks. It helps avoid such
issues resulting from the lack of the global knowledge of the overall network.
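The addressing example lends itself to a concrete sketch: with a global view, an address plan can be carved top-down from a single supernet so that area blocks cannot overlap. The following uses Python's standard ipaddress module with an assumed 10.0.0.0/16 enterprise block and hypothetical area names:

```python
import ipaddress

# Assumed enterprise supernet; each functional area receives one /20
# carved top-down from it, so area blocks cannot overlap by construction.
supernet = ipaddress.ip_network("10.0.0.0/16")
areas = ["core", "dist-1", "dist-2", "servers"]   # hypothetical area names
plan = dict(zip(areas, supernet.subnets(new_prefix=20)))

# Verify the non-overlap property explicitly.
blocks = list(plan.values())
disjoint = all(
    not a.overlaps(b)
    for i, a in enumerate(blocks)
    for b in blocks[i + 1:]
)
```

Allocating each area its block from one supernet is a global, top-down decision; had each area chosen addresses independently (bottom-up), overlaps and renumbering would be a real risk.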
It is observed from the above discussions that the top-down approach has its advantages
and disadvantages. Let us consider the advantages of the top-down approach:
• It facilitates the alignment of business goals and constraints.
• Expectations from the system are unified, while functions and responsibilities are
clearly defined. All of these are independent of hardware devices and vendors.
• The logical correctness of primary system functions and services can be ensured,
and confirmed before system implementation and deployment.
• Having a big picture of the system aids in fulfilling technical requirements and
making trade-offs. As a result, no major logical flaws will exist in the system.
• Implementation is quick after system analysis and high-level design are completed.
However, there are certain drawbacks associated with the top-down approach.
It requires a lengthy process for system analysis and architectural design before
any further actions can be taken. Not all team members may possess the skills or
expertise required for this process, resulting in a potential waste of resources in
the initial phases of top-down network planning. Moreover, starting from high-level
analysis and design may limit the creative thinking of individual team members.
Additionally, during implementation, individual members may encounter difficulties
in implementing the components derived from the high-level analysis and design.
The bottom-up approach turns these issues arising from the top-down approach
into advantages. Starting from the bottom allows for quick progress in finding solu-
tions for small and local components. Team members can make full use of their
respective skills and expertise to work on specific components or areas of the sys-
tem. However, integrating these local solutions to form the overall solution for the
entire system is not a trivial task. It requires a significant effort, fine tuning, and a
lengthy process of configurations. The advantages discussed above for the top-down
approach become questionable in the bottom-up approach.
There have been significant efforts to define and specify services in the OSI reference
model. In general, network services can be understood from different perspectives.
The European Telecommunications Standards Institute (ETSI) NFV Industry Speci-
fication Group (ISG) has defined a network service from its functional and behavioral
is always within 100 ms although it varies over time. An application designed to use
this predictable network service will function well as long as the traffic latency does
not exceed this 100 ms threshold.
In a computer network, there may be a small number of mission-, safety-, and/or
time-critical services that must be provisioned with guaranteed QoS. For example, an
application may require a reservation of 100 kbps bandwidth along the path between
end points. Otherwise, critical data or commands will not reach the destination within
their respective deadlines, leading to functional failure of the application. As another
example, when a server is down, a backup server at hot standby must take over within
3 s to prevent system crashes. Therefore, the switching over to the hot standby server
within 3 s must be guaranteed.
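The hot-standby example above can be sketched as a simple heartbeat check. This is a hypothetical illustration; real deployments typically rely on a cluster manager such as keepalived or Pacemaker:

```python
# Hypothetical hot-standby failover sketch: the backup server promotes
# itself if the primary has been silent longer than the deadline.
FAILOVER_DEADLINE = 3.0  # seconds, from the guaranteed-QoS requirement

def should_fail_over(last_heartbeat: float, now: float,
                     deadline: float = FAILOVER_DEADLINE) -> bool:
    """Return True when the primary's silence exceeds the deadline."""
    return (now - last_heartbeat) > deadline

# Simulated timestamps (seconds): primary last seen at t = 10.0.
assert not should_fail_over(last_heartbeat=10.0, now=12.5)  # 2.5 s silence
assert should_fail_over(last_heartbeat=10.0, now=13.5)      # 3.5 s silence
```

In practice, guaranteeing the 3 s switchover also requires bounding the heartbeat interval and the promotion time itself, not just detecting the failure.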
Predictable and guaranteed network services are configured and managed differ-
ently from best-effort services. They require specific mechanisms, strategies, and
policies in network planning and operation. Detailed discussions on predictable and
guaranteed services will be provided later in the context of network analysis and
architecture.
Network services can be considered from the perspectives of service requests and
service offerings. Briefly speaking, service offerings refer to network services that
are offered by the network to the system. Service requests are requirements that
are requested from the network by users, applications, or devices, and expected to
be fulfilled by the network. They form part of the requirements for the network.
Figure 2.8 illustrates the concepts of service requests and offerings.
Fig. 2.8 Service requests from devices to the network and service offerings from the network to devices
Service Offerings
Network service offerings refer to network services themselves that the network
offers to the system. Well-known examples of network services offered by networks
include DHCP, DNS, email, file sharing, FTP, HTTP and WWW, print, SNMP, SSH,
VoIP, and many more. DHCP assigns IP addresses to hosts dynamically. DNS trans-
lates domain names to IP addresses. HTTP and WWW enable web browsing. SNMP
is deployed for network management. SSH is a secure shell for remote login.
Service offerings are provisioned to meet the requirements of service requests
made by users, applications, and devices. Therefore, in order to understand what
service offerings should be provisioned and how they are provisioned, it is important
to match service offerings with the corresponding service requests. For instance, an
application requiring web service to support its functions would need the provisioning
of the HTTP service.
In computer networking, service offerings are provisioned as the best-effort deliv-
ery by default, as mentioned earlier. Naturally, they will meet the requirements of
best-effort service requests. The network resources that are actually available to a
specific service will change dynamically over time. There may be occasions when
resources such as bandwidth are insufficient for a network service over a period
of time. Therefore, it is understandable that the level of performance of
the service offerings is neither predictable nor guaranteed by default. For example,
the FTP service may be unable to establish a connection with the remote file server
due to insufficient bandwidth. In the case of a VoIP service, the quality of the VoIP
service may become very poor for a few minutes or longer if insufficient bandwidth
is available.
To support predictable and guaranteed service requests, simply providing service
offerings is not sufficient without performance management and traffic differentia-
tion. It is essential to develop and deploy QoS mechanisms, strategies, and policies to
meet the performance requirements specified by the service requests. For example,
in the case of the VoIP service mentioned above, traffic delay and jitter should be
limited within a certain range, thus exhibiting predictable traffic behavior. For time-
critical services that require a guaranteed level of performance, resource reservation
would be essential along the path between the endpoints. In this specific example,
end-to-end support is particularly important because each of the routers along the
path must be able to reserve network resources to ensure the guaranteed service.
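The end-to-end reservation described above can be sketched as follows. This is a greatly simplified illustration in the spirit of resource reservation (cf. the Integrated Services architecture [5]), with hypothetical per-hop bandwidth figures:

```python
def reserve_path(path_spare_kbps, requested_kbps):
    """Simplified end-to-end reservation: admit the request only if every
    router along the path has enough spare bandwidth, then commit on all hops.

    path_spare_kbps: list of spare bandwidth (kbps) per router on the path.
    """
    # Admission check: a single congested hop rejects the whole request.
    if any(spare < requested_kbps for spare in path_spare_kbps):
        return None  # reservation refused end to end
    # Commit phase: deduct the reservation on every hop.
    return [spare - requested_kbps for spare in path_spare_kbps]

# A 100 kbps reservation over a three-hop path:
assert reserve_path([500, 150, 300], 100) == [400, 50, 200]
assert reserve_path([500, 80, 300], 100) is None  # middle hop too busy
```

The all-or-nothing check mirrors why end-to-end support matters: a guarantee holds only if every hop on the path can commit the resources.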
Service Requests
Typical examples of service requests from users, applications, or devices include:
(1) Connecting hosts to the network: this is a requirement for hardware connectivity.
(2) Placing a specific group of users in a virtual network: this is a requirement for
segmentation and management.
(3) Ensuring reliable network connection with traffic latency within 100 ms for a
specific application, e.g., APPLx: this is a requirement for differentiated traffic
management, which should show at least predictable behavior.
(4) Ensuring the availability of the online teaching system during teaching hours:
this is an availability requirement.
(5) Establishing remote connection to the network over the Internet: this is a require-
ment for secure remote connection.
These examples illustrate various types of service requests that users, applications,
or devices may make to the network, outlining their specific requirements and expec-
tations.
It is seen from the aforementioned examples that some service requests correspond
to one or more network services. For instance, connecting hosts to the network
requires multiple network services, e.g., physical connectivity at layers 1 and 2, layer-
3 connection, as well as DHCP and DNS services. Fulfilling this service
request requires the provisioning of multiple service offerings. In this particular
example, clarifying what services should be offered by the network is relatively
straightforward.
In many cases, it is necessary to conduct a detailed analysis to clarify what ser-
vices should be offered by the network in order to meet the requirements of the
service requests. Let us revisit the example discussed earlier: remote connection to
the network over the Internet. There is no single mapping of this request to network
services. Multiple options are available to meet this requirement, e.g., VPN tunnel-
ing, encryption, and other mechanisms. Depending on the architectural design, these
options or mechanisms can be used independently or in combination.
The majority of service requests are addressed through best-effort service offer-
ings. There is no differentiation of network traffic among best-effort services. As
a result, all best-effort services share network resources without prioritization. The
behaviors of these services are neither predictable nor guaranteed.
Some service requests may require predictable or guaranteed service offerings.
A predictable service request may specify a predictable traffic behavior, e.g., end-
to-end traffic latency below 100 ms. In this example, the actual latency value may
be unknown in advance, but it is known to be within the 100 ms threshold. In com-
parison with a predictable service request, a guaranteed service request typically
arises from mission-, safety-, and/or time-critical requirements. Let us consider the
aforementioned example of the online teaching system that must be available during
teaching hours. This implies that the availability of the online teaching system must
be guaranteed for the specified time period. To meet this availability requirement,
multiple network service offerings will be needed. They will be investigated later in
the context of network architecture.
Network resources, such as bandwidth, are always finite for a specific network. Let
us take bandwidth as an example. After a certain amount of bandwidth is reserved
for predictable and guaranteed services, the remaining bandwidth will be available
for best-effort services. There are basically two methods for managing the sharing
of finite bandwidth:
• Serve all: This method admits all new services and shares the bandwidth among
existing and new services, or
• Serve with quality: This method admits new services only when the QoS can be
maintained for both existing and new services.
Consider a scenario where a finite amount of bandwidth, e.g., 10 Mbps, is allocated
for a web service. If there is only one session open, the session will use the entire 10
Mbps bandwidth, which is sufficient to maintain a good quality of the web service.
However, if there are 100 sessions open simultaneously, each session will have an
average bandwidth of 100 kbps, which may still be acceptable despite the possibility
of traffic collisions and increased delays. If an additional 100 sessions are introduced
to the service, the available bandwidth for each session is reduced by half to an
average of 50 kbps, which may cause significant performance degradation or make
the system practically unusable.
However, if the number of sessions is limited to a maximum of 100, any requests
to open additional sessions will be rejected. This ensures that the number of sessions
does not exceed 100 at any given time, thereby maintaining an average bandwidth
of 100 kbps per session. When an existing session is closed, a new session can be
admitted. This will require a well-designed admission control mechanism.
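A minimal sketch of such an admission control mechanism, using the 10 Mbps and 100 kbps figures from the example (the class and method names are illustrative, not a production design):

```python
class AdmissionController:
    """'Serve with quality': admit new sessions only while the per-session
    bandwidth share stays at or above the quality floor."""

    def __init__(self, total_kbps=10_000, floor_kbps=100):
        self.total_kbps = total_kbps   # e.g., 10 Mbps allocated to the web service
        self.floor_kbps = floor_kbps   # minimum acceptable per-session share
        self.max_sessions = total_kbps // floor_kbps  # here: 100 sessions
        self.active = 0

    def request_session(self) -> bool:
        if self.active >= self.max_sessions:
            return False               # reject: quality would drop below the floor
        self.active += 1
        return True

    def close_session(self):
        if self.active > 0:
            self.active -= 1           # a freed slot allows a new admission

ac = AdmissionController()
admitted = sum(ac.request_session() for _ in range(150))
assert admitted == 100        # the 101st and later requests are rejected
ac.close_session()
assert ac.request_session()   # closing a session admits a new one
```

The "serve all" alternative would simply skip the rejection branch and let every session dilute the shared bandwidth.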
An important part of network planning is to describe how to deal with perfor-
mance management with best-effort, predictable, and guaranteed requirements. This
is addressed specifically in performance-component architecture and will be dis-
cussed later in this book.
There are a number of service performance metrics that can be directly measured or
quantified through measurements within a short period of time. Some key metrics
are listed below in alphabetical order:
• Accuracy, which refers to the amount of error-free traffic successfully transmitted,
relative to total traffic.
• Bandwidth capacity, which indicates data-carrying capacity measured in bits per
second (bps).
• Bandwidth usage, which measures how much bandwidth is used over a period
of time. For optimal network operation, one may aim to get as close to the
maximum bandwidth as possible without overloading the network.
• Bandwidth utilization, which is the percentage of total available bandwidth capac-
ity in use.
• Jitter, which is the variation of time delay. For example, if delay varies between 3
ms and 10 ms, the corresponding jitter is 7 ms. If an application is jitter-sensitive,
the jitter must be maintained within a threshold.
• Latency, which quantifies the amount of time taken to transmit data from one point
to another.
• Packet loss, which is the packet dropout during transmission from one point to
another.
• Response time, which measures the amount of time taken to receive a response
after sending a request for a network service.
• Round Trip Time (RTT), which is the amount of time it takes for a data packet
to travel to its destination plus the amount of time it takes for an acknowledgment
of that packet to be received at the origin.
42 2 Systematic Approaches
• Re-transmission, which refers to the number of lost or dropped packets that need
to be re-transmitted to complete a successful data delivery.
• Throughput, which quantifies the rate of data successfully transmitted from one
point to another. For example, the throughput over a link is measured as 300 kbps.
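Several of these metrics can be computed directly from measurement samples. The helper functions below are a hypothetical sketch using the figures from the examples above:

```python
def jitter_ms(delays_ms):
    """Jitter as the spread of the observed delays (max minus min)."""
    return max(delays_ms) - min(delays_ms)

def packet_loss_ratio(sent, received):
    """Fraction of packets dropped during transmission."""
    return (sent - received) / sent

def throughput_kbps(bits_delivered, seconds):
    """Rate of data successfully transmitted, in kbps."""
    return bits_delivered / seconds / 1000

# Delay varying between 3 ms and 10 ms gives a jitter of 7 ms:
assert jitter_ms([3, 5, 10, 4]) == 7
# 1000 packets sent, 990 received: 1% packet loss.
assert packet_loss_ratio(1000, 990) == 0.01
# 3,000,000 bits delivered in 10 s: 300 kbps throughput.
assert throughput_kbps(3_000_000, 10) == 300.0
```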
The metric of accuracy is measured differently in WANs and LANs. In WAN
links, it is measured as the Bit Error Rate (BER), which is typically in the order of
10^-11 for fiber-optic links. In LANs, the measurement of successful transmissions
usually focuses on frames rather than individual bits. Therefore, a BER is not usually
specified for LANs. Instead, the accuracy of data transmission in LANs can be
quantified by a bad frame in a certain number of bytes, e.g., one bad frame in 10^6 bytes
in a typical scenario.
Calculated Metrics
Availability = MTBF / (MTBF + MTTR)
This equation represents the percentage of time the network remains operational
during a given period of time, i.e., uptime divided by the total period of time. Let us
examine some examples to get a sense of what network availability really means in
practical network operations. Consider a full year of 365 days (8,760 h). A network
that is available 99% of the time is actually out of service for 87.6 h (i.e., more than
three days). The availability of 99.9% means 8.76 h of failure downtime each year.
A requirement of 99.99% availability implies 0.876 h (52 min 33.6 s) of downtime
per year due to failures.
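These downtime figures follow directly from the availability formula, as the short sketch below verifies:

```python
def availability(mtbf_hours, mttr_hours):
    """Availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def annual_downtime_hours(avail, hours_per_year=8760):
    """Expected failure downtime per year for a given availability."""
    return (1 - avail) * hours_per_year

# MTBF of 999 h and MTTR of 1 h give 99.9% availability:
assert round(availability(999, 1), 3) == 0.999
# The examples from the text:
assert round(annual_downtime_hours(0.99), 2) == 87.6     # more than 3 days
assert round(annual_downtime_hours(0.999), 2) == 8.76
assert round(annual_downtime_hours(0.9999), 3) == 0.876  # 52 min 33.6 s
assert round(0.876 * 3600, 1) == 3153.6                  # seconds in 0.876 h
```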
In the calculation of network availability, MTBF and MTTR are mean values.
Actual time before failure and actual time to repair can fluctuate around these mean
values. Depending on the requirements of network services offered by the network,
2.5 Service-Based Networking 43
it may be necessary to consider the worst-case scenario for the network or its com-
ponents.
Furthermore, scheduled maintenance of the network should be excluded from the
calculation of the availability metric. The network may undergo planned shutdowns
for maintenance purposes, e.g., once a year during Christmas or New Year. These
planned shutdowns are not considered failures and therefore are not included in the
calculation of MTTR.
Related to availability, maintainability is a statistical measure of the time required
to fully restore system functions after a system failure. It is usually represented by
MTTR.
Network reliability is related to, but different from, availability. It characterizes
how long the network keeps functional without failure interruption. Practically, it
can be quantified in different ways, e.g., by
• the mean service time between two failures, i.e., the total service time divided by
the number of failures (MTBF), or
• failure rate, which is the number of failures divided by the total time in service.
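These two quantifications can be sketched as follows (here MTBF is computed as the total service time divided by the number of failures, with the failure rate as its reciprocal):

```python
def mtbf_hours(total_service_hours, num_failures):
    """Mean service time between failures."""
    return total_service_hours / num_failures

def failure_rate(num_failures, total_service_hours):
    """Failures per hour of service."""
    return num_failures / total_service_hours

# A network in service for 8760 h (one year) that suffered 4 failures:
assert mtbf_hours(8760, 4) == 2190.0          # 2190 h between failures on average
assert failure_rate(4, 8760) == 4 / 8760      # about 0.00046 failures per hour
```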
Mathematically, network reliability is a complex topic, which still attracts active
research and development, e.g., the work presented in [7].
While the concepts of reliability and availability are similar in some aspects,
they are fundamentally distinct. Quite often, they are erroneously used interchange-
ably. Also, reliability is sometimes represented by a percentage value, e.g., 99.99%
reliability. In this case, it should not be confused with availability, but should be
interpreted as the percentage of the time the network is reliable without failures or
interruptions. Moreover, a network can be highly available with a short MTBF, but
not practically reliable because of frequent failure interruptions. This highlights the
distinction between availability, which focuses on the uptime of the network, and reli-
ability, which considers the ability of the network to function without interruptions
over an extended period of time.
Performance metrics discussed previously are typically coupled with each other to
some extent. However, optimizing all performance metrics in a network is not a
realistic task. For example, in practice, achieving a high throughput usually comes
with some packet losses and increased delay. In order to quantify the requirements of
service performance, it is helpful to clarify thresholds or limits for the performance
metrics. By considering these individual thresholds together, a multi-dimensional
view of performance requirements can be formed, which is also known as perfor-
mance envelopes [3, pp. 50–51].
The multi-dimensional view of performance characterizes the acceptable perfor-
mance in a high-dimensional space. Figure 2.9 illustrates a three-dimensional perfor-
mance envelope that considers delay, throughput, and RMA performance. When the
performance metrics fall within the defined thresholds or lower bounds of the limits
in the multi-dimensional view, the service performance meets the requirements. Con-
versely, if the performance exceeds the thresholds or upper bounds of the limits, it no
longer satisfies the requirements. When the performance lies between the lower and
upper bounds, it still conforms to the requirements, but warnings may be triggered
to indicate a risky state with a potential to cross the limits.
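A conformance check against such a performance envelope can be sketched as a threshold test over the metric dimensions. The bounds below are hypothetical values for illustration:

```python
# Hypothetical performance envelope: each metric has an inner (warning) bound
# and an outer (hard limit) bound. Delay must stay low; throughput and RMA
# (represented here by availability) must stay high.
ENVELOPE = {
    "delay_ms":     {"warn": 80.0,  "limit": 100.0, "higher_is_better": False},
    "tput_kbps":    {"warn": 400.0, "limit": 300.0, "higher_is_better": True},
    "availability": {"warn": 0.995, "limit": 0.99,  "higher_is_better": True},
}

def envelope_state(measured):
    """Return 'ok', 'warning', or 'violation' for a set of measurements."""
    state = "ok"
    for name, bound in ENVELOPE.items():
        value = measured[name]
        if bound["higher_is_better"]:
            bad, risky = value < bound["limit"], value < bound["warn"]
        else:
            bad, risky = value > bound["limit"], value > bound["warn"]
        if bad:
            return "violation"       # outside the envelope
        if risky:
            state = "warning"        # inside the limit but past the warning bound
    return state

assert envelope_state({"delay_ms": 50, "tput_kbps": 500, "availability": 0.999}) == "ok"
assert envelope_state({"delay_ms": 90, "tput_kbps": 500, "availability": 0.999}) == "warning"
assert envelope_state({"delay_ms": 120, "tput_kbps": 500, "availability": 0.999}) == "violation"
```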
2.6 Summary
The waterfall model emphasizes the sequential phases involved in a network plan-
ning project. Six essential phases are identified, which are requirements analysis,
logical network design, physical network design, implementation and deployment,
evaluation/testing/verification, and OAM. Each of these phases can be further decom-
posed into multiple sub-phases. With the sequential phases in the waterfall model, a
phase should not commence until the successful completion of its preceding phase.
The decomposition of the entire project into multiple sequential phases in the water-
fall model aligns with the decomposition of a system into multiple subsystems in the
systems approach.
Building upon the concept of system decomposition in network analysis, a generic
network analysis model has been presented. It considers requirements from four
perspectives: users, applications, devices, and the network. Traditional networking
mainly focuses on the interconnection of devices into the network. In comparison,
modern networking additionally considers requirements from users and applications.
All requirements from users, applications, devices, and the network will be trans-
lated to technical network requirements, from which network architecture can be
developed.
The top-down methodology is an effective tool to deal with complex network
analysis and architectural planning, particularly for large-scale networks. It begins
by developing a top-level view of the network to capture business goals, essential
network functions, and critical network services. It then zooms in on specific network
components for the development of top-level architectural models and component-
based architecture. This is followed by detailed network design and physical design at
the bottom level to cover hardware, floor plans, and structured cabling systems. The
top-down methodology shares similarities with the systems approach in terms of
system decomposition and with the waterfall model in terms of sequential phases.
Unlike traditional networking with the focus on capacity planning, modern net-
working places greater emphasis on network services, which are configurable, man-
ageable, and provisioned end-to-end. This shift leads to the concepts of service
requests and service offerings. Service requests are requirements requested from
the network, whereas service offerings are services offered by the network to the
system to fulfill the requirements. The majority of network services are best-effort
services. Some services are required to be predictable. A small number of services
are critical and therefore must be guaranteed. To characterize and measure services
quantitatively, various service performance metrics are used. From these metrics,
a multi-dimensional view of service performance can be developed, which gives a
basic understanding of the services in conformance or non-conformance with the
requirements.
References
1. Fang, Q., Zeitouni, K., Xiong, N., Wu, Q., Camtepe, S., Tian, Y.C.: Nash equilibrium based
semantic cache in mobile sensor grid database systems. IEEE Trans. Syst. Man Cybern. Syst.
47(9), 2550–2561 (2017)
2. Fang, Q., Xiong, N., Zeitouni, K., Wu, Q., Vasilakos, A., Tian, Y.C.: Game balanced multi-factor
multicast routing in sensor grid networks. Inf. Sci. 367–368, 550–572 (2016)
3. McCabe, J.D.: Network Analysis, Architecture, and Design, 3rd edn. Morgan Kaufmann Pub-
lishers, Burlington, MA 01803, USA (2007). ISBN 978-0-12-370480-1
4. Oppenheimer, P.: Top-Down Network Design, 3rd edn. Cisco Press, Indianapolis, IN 46240,
USA (2011). ISBN 978-1-58720-283-4
5. Braden, R., Clark, D., Shenker, S.: Integrated services architecture. RFC 1633, RFC Editor
(1994). https://doi.org/10.17487/RFC1633
6. ETSI: Network functions virtualisation (NFV); terminology for main concepts in NFV. ETSI
GS NFV 003 V1.4.1, ETSI NFV ISG. https://www.etsi.org/deliver/etsi_gs/NFV/001_099/003/
01.04.01_60/gs_nfv003v010401p.pdf (2018). Accessed 3 Jun 2022
7. Chaturvedi, S.K.: Network Reliability: Measures and Evaluation. Wiley (2016). ISBN 978-1-
119-22400-6
Chapter 3
Requirements Analysis
Requirements analysis is the first phase of systematic network planning in the
systems approach, waterfall model, and top-down methodology, which were discussed
in the previous chapter. Subsequent phases will not commence until this phase
is completed. The primary objective of requirements analysis is to clarify and define
the network planning problems that need to be solved but have not been clearly
specified. This chapter will discuss what and how requirements are developed in a
systematic manner.
Fig. 3.1 Requirements analysis in which overlap exists in the technical analysis components. Traffic
flow analysis will be discussed in a separate chapter
project. The resulting network planned from these requirements will effectively sup-
port its users, applications, and devices.
From the perspectives of network functions and technologies, requirements anal-
ysis considers connectivity, scalability, availability, performance, security, manage-
ability, and other related aspects. More specifically, it develops performance thresh-
olds, determines the nature of services that the network must deliver, and decides
where the services must be delivered for a new or existing network.
Requirements analysis produces two types of documents:
• Network requirements specifications, which are integrated from all types of tech-
nical requirements, and
• Requirements map, which describes the location dependencies of various require-
ments.
The requirements map is an extension of the application map, which shows the
location dependencies of applications.
Figure 3.1 illustrates a block diagram of requirements analysis. It shows the main
components, processes, and their dependencies. The discussions of requirements
analysis in this chapter will be structured around this diagram.
over the Internet from anywhere in the world. The system should further provide
network services integrated with text, voice, video, and multimedia communi-
cations. Additionally, the system for teaching and learning should interact with
many other systems, such as those for student management, enrollment manage-
ment, class allocation, timetabling, and grade center. Due to the involvement of
private information in these systems, strict security and privacy policies must be
in place.
(2) Support for research and development is also essential and must be highly reli-
able. It should be able to manage grant applications, and research projects, and
host research data and results. Since some projects and generated data may be
sensitive, a high level of security is also required.
For a research organization, teaching and learning may not be part of its core
business. However, its requirements to support research are similar to those of a
higher-education institution. Depending on the size of the organization, network
services can be provisioned on-premises or off-premises. For small organizations,
third-party cloud services can be considered as a cost-effective alternative without
sacrificing the required level of QoS. In the real world, there are many research
organizations that use cloud services to support their core business. Examples of
such cloud services commonly employed by various organizations include cloud-
based mail services, cloud or public Domain Name System (DNS) services, and
other Software as a Service (SaaS) applications.
A financial organization manages highly-sensitive financial data or databases. The
storage, operation, and transmission of these data must maintain a very high level
of reliability and security. Redundant database servers may be in place with one
serving as the primary server and the others in hot standby mode. Given that such an
organization serves a large customer base across multiple cities, states, and countries,
network services over the Internet are an essential requirement. Any financial trans-
actions conducted over the Internet must adhere to strict security policies, employ
strong encryption, and ensure guaranteed Quality of Service (QoS) management.
An online sales company will have different requirements from those of a higher-
education, research, or financial organization discussed above. Web or web-based
services would become mission-critical, serving as front-end interfaces to customers.
These services must be highly reliable and available, likely operating
seven days a week and 24 h a day. An online payment system will be integrated
into these web-based services. The back-end financial databases supporting the sales
operations must also maintain a high level of reliability, availability, and security.
Overall, understanding the core business of an organization provides valuable
insights into its products, services, as well as internal and external relationships.
From this understanding, basic ideas can be developed regarding the key require-
ments and top priorities for the network planning task. These ideas will be further
supported later through the development of detailed technical requirements specifi-
cations, architectural topology, and various mechanisms and policies.
There are numerous applications that are commonly found across various networks.
Examples of such applications include web services, mail services, File Transfer
Protocol (FTP), Secure Shell (SSH), print services, remote desktop, and many others.
By default, these network services are provisioned as best-effort services unless
designed differently to meet specific QoS requirements.
However, every organization has its own unique core business, which may share
similarities with other organizations but often diverges in specific aspects. Conse-
quently, each organization relies on its own set of key applications to support its
core business. These key applications drive network planning from the perspective
of service-based networking.
As a good practice, it is advisable to identify and list the top N key applications
that are critical for the organization. This helps in prioritizing resource allocation
and implementing effective QoS management strategies. Depending on the scale
and complexity of the network, the value N for a specific network planning project
may vary. For example, it could be 5, 10, or another suitable value. A more detailed
analysis of application requirements will be conducted later in Sect. 3.4.
A network planning project should have a clearly defined scope. However, the scope
is often unclear initially and thus needs to be defined. In order to define the project
scope, a few questions must be answered. For example, where are the boundaries
of the project? What is not part of the project? What aspects must, or should, be
addressed?
In some cases, a project may focus on upgrading specific segments of an existing
network. In such a scenario, other segments are not part of the project and thus
should remain untouched. The project boundaries would be defined by the routers
that separate these segments from others. If any settings of the routers need to be
modified, make sure to assess whether these changes will affect the interactions with
other parts of the network.
In some other cases, a project may focus on migrating an existing local private
data center to a third-party cloud data center. Various options exist, for example,
• Use third-party Infrastructure as a Service (IaaS) to host network services, or
• Use third-party SaaS offered by the same or different cloud service provider.
Through a high-level analysis, the scope and boundaries of this specific project can
be clarified.
Let us analyze various requirements based on the generic network analysis model
introduced in Sect. 2.3 of Chap. 2 (Fig. 2.6). The generic model highlights four main
components: users, applications, devices, and the network itself. Each of these com-
ponents is mapped to multiple layers of the OSI seven-layer architectural model.
Fig. 3.2 General user requirements: security, availability, reliability, performance, functionality,
timeliness, capacity, interactivity, and end-to-end delay
At the top layer of the generic network analysis model, the user component
addresses the requirements from end users including network administrators and
managers. It is associated with functions across layers 7 and 6 of the OSI seven-layer
architectural model. A fundamental question to consider is: what do users need from
the network to perform their tasks? User requirements can be developed from the
end-user perspective in order for the users to perform their tasks successfully.
User requirements can be approached from various aspects. A list of general user
requirements is presented in Fig. 3.2. Undoubtedly, it is not an exhaustive list. Also,
each of the listed requirements is qualitative, and thus needs to be further quantified
through the development of detailed technical requirements. A good reference for
the general user requirements listed in Fig. 3.2 is the book by McCabe [2, pp. 64–
66]. Let us briefly discuss these user requirements in the following based on our
understanding and practice.
Security is listed as the first user requirement because it is one of the main chal-
lenges in networks and should therefore be given top priority. From the user perspec-
tive, it refers to the Confidentiality, Integrity, and Availability (CIA) of users’ infor-
mation and network resources. This entails protecting the information and resources
from unauthorized access and disclosure. User security requirements can also be
characterized from the reliability and availability perspectives. They will affect delay
performance and capacity planning due to the additional overhead introduced by
security enhancements.
Availability and Reliability have been discussed previously in Sect. 2.5.4 of
Chap. 2. The two concepts are fundamentally different. For example, a highly reliable
network service may not be highly available to an end user. From the user
perspective, however, reliability more or less means availability. To the user, network services
should not only be highly available but also have a consistent level of QoS. Meeting
user availability and reliability requirements may need additional network resources,
such as redundant links or servers. As a result, they can have a potentially important
impact on delay performance, capacity planning, and network management.
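The cost of redundancy can be made concrete with a small calculation. The Python sketch below uses the standard steady-state availability formula A = MTBF/(MTBF + MTTR) and the parallel-redundancy formula 1 − (1 − A)^n; the MTBF and MTTR figures are illustrative, not taken from this book.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time a component is operational."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def parallel_availability(a: float, n: int) -> float:
    """Availability of n redundant components; the service is down only if all fail."""
    return 1.0 - (1.0 - a) ** n

# One server with MTBF = 8000 h and MTTR = 8 h is about 99.9% available;
# adding one redundant server raises availability to roughly 99.9999%.
single = availability(8000, 8)
redundant = parallel_availability(single, 2)
```

This illustrates why redundant links or servers improve availability, but at the cost of extra capital, capacity, and management overhead, as noted above.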
56 3 Requirements Analysis
Fig. 3.3 Application requirements in the generic network analysis model: location, RMA, capacity, and delay
There are many applications that apply everywhere and are used by everyone. Examples
include email, web browsing, word processing, and other general office applications.
Some applications are transparent to, but used by, all users, such as DNS
and DHCP services. Some of these applications (e.g., email) are configured as best-effort
services. However, some others may be critical for a specific organization or
network and thus should be identified and configured with predictable and guaranteed
services.
There are also many applications that are specific to some segments of a network
and used by particular groups of users. For these applications, it is useful to clarify
their location dependencies physically and logically, resulting in an application map
as illustrated in Fig. 3.4. The application map helps in determining the flow charac-
teristics of the applications and mapping the traffic flows during traffic flow analysis.
From the application map, a more general requirements map will be developed later
as one of the two main outcomes from the overall requirements analysis.
For the development of the application map, the following questions need to be
answered:
• Where will the application be applied: in the users' environment or within the
environment of the overall network system?
• On which devices will the application be used: general end-user devices or
specific devices?
Answering these questions will assist in clarifying the location information of the
application.
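As a sketch of how such location information might be recorded before drawing the map, the Python fragment below models a toy application map; the application and location names are hypothetical, not taken from Fig. 3.4.

```python
# Toy application map: each application is tagged with where it applies.
# "everywhere" marks location-independent applications such as email.
application_map = {
    "email":     {"locations": "everywhere", "service": "best-effort"},
    "payroll":   {"locations": ["LAN A", "LAN C"], "service": "guaranteed"},
    "streaming": {"locations": ["LAN B"], "service": "predictable"},
}

def is_location_dependent(app: str) -> bool:
    """An application is location-dependent if it applies only to specific segments."""
    return application_map[app]["locations"] != "everywhere"
```

Tagging each application this way makes it straightforward to map traffic flows to network segments during the later traffic flow analysis.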
Fig. 3.4 An application map showing the location dependencies of applications App4, App5, and App6 across LANs A–D and building blocks A–E

3.5 Device Requirements
Fig. 3.5 Device requirements in the generic network analysis model: category, performance, group, and location, with performance characterized by CPU, memory, storage, and other components
The performance of network devices is not easy to determine, for two reasons:
• It is closely tied to the hardware, firmware, and software that connect users, applications,
and other components of the system; and
• The components within a device are often proprietary, implying that detailed
information about their performance may be limited or unavailable.
Consequently, there may be a lack of device performance details.
Recall that in computer networking, network performance, or more generally net-
work QoS, is managed and measured end-to-end. From the end-to-end perspective,
the performance characteristics of a device can be determined by considering the
device’s overall performance. They can be described based on various components
of the device, such as processors, memory, storage, device drivers, and read/write
speed, all of which impact the overall performance of the device.
By clarifying the performance characteristics of a device, potential performance
problems or limitations within the devices can be identified. This enables the devel-
opment of strategies to overcome these limitations. For instance, by identifying bot-
tlenecks in the network interfaces of a device, it becomes possible to find out how to
upgrade the device in order to achieve the required level of performance.
General computing devices, such as personal computers, laptops, and mobile devices,
are typically location-independent. They can be used as plug-and-play devices any-
where on the network or off-premise for remote access to the network via the Inter-
net. Nevertheless, understanding where and how many generic computing devices
are accessing the network is helpful in determining how network resources should
be allocated and what the overall performance of the applications provided through
the devices would look like for their users.
Different from general computing devices, servers are more location-dependent.
For example,
• A computing laboratory is set up on a specific floor of a building to house a
cluster of high-performance computers. This specific location is chosen to provide
a controlled environment, optimized power supply, and efficient networking for
the cluster of computers.
• A storage server is placed in a specific location that is in close proximity to another
server running a database service. This close placement ensures reduced latency
and improved data transfer between the storage server and the database server.
In these examples, the placement of servers or workstations is purposeful and strate-
gic. Factors such as proximity, resource sharing, and specialized services are taken
into account during the decision-making process.
3.6 Network Requirements
Network requirements build upon, and are more technical in nature than, user, application,
and device requirements, which, as we have seen, are subjective to some extent.
All user, application, and device requirements will eventually
be reflected in, and translated into, network requirements. Therefore, for the
development of network requirements, it is essential to conduct a detailed analysis
development of network requirements, it is essential to conduct a detailed analysis
Fig. 3.6 Network requirements in the generic network analysis model: scaling, services, interoperability, upgrading, performance, security, and management, with performance requirements such as capacity, throughput, and delay
from various technical aspects of networks. The derived network requirements will
drive the subsequent development of detailed technical specifications for network
planning.
In most cases, networks are not built from scratch. Rather, network planning
projects typically consider extending or upgrading existing networks. Therefore,
in the analysis of network requirements, it is critical to characterize existing net-
works, integrate new components and technologies into the existing infrastructure,
and establish a clear pathway for migrating or upgrading the existing networks to
the new ones being planned.
In the following, let us discuss how to incorporate existing networks into network
planning. This will be followed by an analysis of network requirements in terms of
management, performance, and security. The topic of characterizing existing net-
works will be addressed later in a separate section. Figure 3.6 provides an overview
of the overall network requirements within the framework of the four-layer generic
network analysis model.
and performance, the lowest possible delay that can be achieved in the network, and
the best possible security protection that the current firewalls can provide. Beyond
these constraints, improved or additional mechanisms will need to be designed and
provided for the desired level of network performance.
Support services are the services that support the functionality of the network
and networked systems. Typical examples include strategies and mechanisms for
addressing, routing, security, performance, and management. When an existing net-
work is upgraded to a new one, it is essential to understand the network requirements
for each of these support services.
Interoperability ensures the smooth transition from the existing network to the
planned new network. If the planned network follows the same addressing and routing
strategies as those in the existing network, no translation will be required at the
boundary between the existing and new networks. Therefore, it is part of the network
requirements analysis to clarify the technologies and media used in the existing
network, as well as any performance or functional requirements for the upgrading
of the existing network to the planned new one.
Location dependency of the existing network may change when it is upgraded to
the planned new one. For example, a LAN-based service in the existing network may
become a WAN-based cloud service in the planned new network. This change can
impact the QoS of the service, and thus needs to be considered in the development
of network requirements.
a routing protocol. Different routing protocols have different processes for making
routing decisions. When developing routing requirements, it is important to take into
account factors such as the network environment and the applications running on the
network. This helps guide the selection of an appropriate routing protocol that aligns
with the specific needs and characteristics of the network.
In some cases, there may be a requirement for multiple routing protocols to be
used within the same network. This could be due to various reasons, such as a large-
scale enterprise network that is integrated from multiple networks running different
routing protocols. For such scenarios, what are the requirements for multiple routing
protocols to work together? Protocol interoperability and route redistribution will
need to be considered.
network events, information flows, integrity and security, and other network performance
metrics. In network analysis, network management requirements can be developed in
relation to the following aspects, which are by no means exhaustive:
• What needs to be monitored and managed?
• Is monitoring intended for event notification or for trend analysis?
• What instrumentation methods, such as Simple Network Management Protocol
(SNMP), should be used?
• To what level of detail should events or trends be monitored?
• Should management be performed in-band or out-of-band?
• Is centralized or distributed monitoring more suitable?
• What is the impact of network management on network QoS?
• How should the management itself and management data be managed?
By addressing these aspects, comprehensive network management requirements can
be developed to ensure effective QoS implementation, management, and control.
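As one concrete illustration of trend monitoring, link utilization can be derived from two readings of a cumulative interface byte counter (for example, SNMP's ifInOctets). The helper below is a minimal sketch; the counter values and link speed are illustrative.

```python
def utilization_from_counters(bytes_start: int, bytes_end: int,
                              interval_s: float, link_bps: float) -> float:
    """Link utilization as a fraction of capacity, computed from two successive
    readings of a cumulative byte counter taken interval_s seconds apart."""
    bits_transferred = (bytes_end - bytes_start) * 8
    return bits_transferred / (interval_s * link_bps)

# 45 MB observed over 60 s on a 100 Mbps link -> 6% utilization.
u = utilization_from_counters(0, 45_000_000, 60.0, 100_000_000)
```

Sampling this value repeatedly over time yields the utilization-versus-time pattern discussed later for characterizing existing networks.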
Characterizing existing networks will help develop realistic goals for network plan-
ning. Any bottlenecks, performance problems, and network devices or components
that require replacement or improvement in the existing network can be identified.
This will give some hints about potential solutions for the network planning project.
The main tasks involved in characterizing existing networks include the under-
standing of the physical and logical network architecture, addressing and routing
architecture, performance baselines, management architecture, and security archi-
tecture. Analyzing protocols used in existing networks is also an important task.
The physical network architecture provides information about the geographical loca-
tions of network components, devices, and services. By using the top-down method,
the top-level view of the physical network architecture can be developed. It shows
physical network sites, and their geographical locations and connectivity, such as
WAN links.
For each network site, one or multiple network maps can be developed to illustrate
more detailed physical information such as:
• Buildings, floors, and rooms or areas.
• The physical location of main servers or server farms, such as web servers, DNS
servers, mail servers, database servers, and storage servers.
• The physical locations of routers and switches, such as border routers and other
routers.
• The physical locations of high-performance computing clusters, computing labo-
ratories, and other specific computing facilities.
• The physical locations of network management components, such as enterprise
edge management components (e.g., VPN servers).
For addressing, it is necessary to clarify whether IPv4 or IPv6 is in use. If the dual-
stack configuration of both IPv4 and IPv6 is already implemented, continue to use it
in the planned new network. If only IPv4 is being used, it is the right time to consider
introducing IPv6 to the new network.
Investigate if IP addresses are allocated hierarchically with good scalability to
accommodate future growth. Identify areas where IP address allocation can be
improved through better subnetting. For example, discontiguous subnets should be
avoided in IP address allocation.
IP addressing is tightly coupled with routing. By analyzing subnetting strategies,
it is relatively easy to characterize how route summarization, i.e., route aggregation
or supernetting, has been implemented in the existing network. Assess whether route
aggregation can be further improved in the planned new network.
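Route summarization can be checked programmatically. For example, the Python standard library's ipaddress module collapses contiguous subnets into their supernet; the prefixes below are illustrative.

```python
import ipaddress

# Four contiguous /24 subnets: 10.1.0.0/24 through 10.1.3.0/24 ...
subnets = [ipaddress.ip_network(f"10.1.{i}.0/24") for i in range(4)]

# ... summarize into a single /22 route advertisement.
summary = list(ipaddress.collapse_addresses(subnets))
# summary == [IPv4Network('10.1.0.0/22')]
```

Discontiguous subnets would not collapse into a single prefix, which is one reason they are best avoided in IP address allocation.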
Evaluate if routers are appropriately placed with the desired security protection
in the existing network. Analyze what routing protocol is being used in the existing
network. Does the routing protocol continue to function effectively with any changes
made to the network architecture and IP addressing in the planned new network? If
more than one routing protocol is employed, how do they work together in the same
network?
Characterizing the performance of the existing network will help establish performance
baselines and determine the improvements required for network upgrading.
The first set of performance metrics to consider is RMA (Reliability, Maintainability,
and Availability), which has been discussed in detail previously. It may be expected
that the current levels of RMA performance are maintained, or improved RMA per-
formance is desired in the planned new network.
Other performance measures that could be characterized include latency, throughput,
network utilization, network efficiency, and network accuracy, which have been
briefly discussed in the previous chapter:
• Latency can be measured in different ways depending on application scenarios,
such as one-way delay, Round Trip Time (RTT), and response time. In some cases,
the boot time of machines, whether they are Physical Machines (PMs) or Virtual
Machines (VMs), should also be considered. For example, if migrating a VM to
a new PM that is currently off, it will take time to boot the new PM first and then
boot the VM hosted on the new PM.
• Throughput reflects the capacity of the network for data communications. Its value
in the existing network can be used as a performance baseline for network upgrad-
ing.
• Network utilization measures, typically as a percentage of capacity, how much
bandwidth is in use during a specific period of time, such as 10 min or an hour. It
is high during peak hours and low during off-peak hours. The pattern of network
utilization versus time helps understand the normal traffic behaviors. The planned
new network should be able to handle the expected traffic pattern.
• Network efficiency is commonly understood as the successfully transferred data
expressed as a percentage of the total transferred data. This understanding counts
protocol overhead as part of the useful data, and thus is not accurate.
More accurately, network efficiency measures how much payload, i.e., user data,
is successfully transferred in comparison with the total transferred data including
overhead, regardless of whether the overhead is caused by collisions, frame headers,
acknowledgments, or re-transmissions. For example, if 10 packets have been successfully
transmitted without re-transmission and the overhead is 20% (e.g., from
frame headers), then the network efficiency is 80%. Therefore, a larger packet size
is generally beneficial, although any transmission error then forces the re-transmission
of a large packet.
• Network accuracy captures how correctly data packets can be transferred over the
network. For WAN links, it is measured by Bit Error Rate (BER), which is typically
around 1 in 10^11 for fibre-optic links. For LAN links, the focus is on data frames
rather than individual data bits. A typical network accuracy threshold for LANs
is one bad frame per 10^6 data frames. If WAN links are provided by a third-party
service provider, the desired network accuracy can be specified in a Service Level
Agreement (SLA). Check if there is such an SLA for the existing network.
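The efficiency and accuracy figures above can be reproduced with a few lines of arithmetic. The Python sketch below uses the worked numbers from the text plus an illustrative frame size.

```python
def network_efficiency(payload_bytes: int, total_bytes: int) -> float:
    """Payload delivered as a fraction of all transferred bytes, where the total
    includes overhead from headers, acknowledgments, and re-transmissions."""
    return payload_bytes / total_bytes

# 10 packets of 1000 B payload each, with 20% overhead (e.g., frame headers):
# 10000 payload bytes out of 12500 total bytes -> 80% efficiency.
eff = network_efficiency(10 * 1000, 10 * 1250)

def expected_bad_frames(frames: int, bits_per_frame: int, ber: float) -> float:
    """Approximate expected number of corrupted frames (small-BER approximation)."""
    return frames * bits_per_frame * ber
```

With an assumed 12000-bit frame and a BER of 10^-11, one million frames would see on the order of 0.1 corrupted frames, well within the LAN threshold mentioned above.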
The security of the existing network can be characterized from various aspects. For
example,
• What assets are being protected?
• What are the security risks to the assets?
• What physical security measures are in place?
• What methods are being used for security awareness?
• What is the security plan currently in use?
• What security policies have been implemented?
• What security procedures are designed?
• How is the DMZ designed for security protection, and which servers are placed in
the DMZ?
• How is the overall security managed?
Through such a process of characterizing the security of the existing network,
it will become clear which security mechanisms and strategies could be inherited
in the planned new network. Also, potential security issues that require attention or
enhancement can be identified. Meanwhile, potential solutions to these issues may
be proposed. For instance, a current firewall should be replaced with a new one that
provides enhanced security protection.
• Are there any other perceived issues, such as the scalability of using a specific
protocol?
It may not be realistic or necessary to list all protocols used in the existing network.
For example, IPv4 or IPv6 is always present regardless of the applications being
executed. Therefore, it is a general practice to focus on the most important protocols
or the protocols with specific use cases. For instance, IntServ is used to provide
end-to-end QoS guarantees for a specific application, highlighting the importance
of that application in the existing network. Similarly, DiffServ is being used by a
group of video streaming applications, showing the soft-real-time QoS nature of
these applications.
(Fig. 3.7 depicts Buildings A–I, each annotated with its numbers of senior staff, academics, administrative staff, and students, together with key applications and facilities such as the Canvas service, 10 HPCs, a GNSS system, logistics and payroll systems, a data center with database servers, and a planned new network.)
Fig. 3.7 A simplified requirements map. A complete requirements map has detailed information
about the location dependencies of comprehensive requirements and a large number of important
applications
• Manageability,
• Security, and
• Affordability.
All of these aspects are important. However, it should be understood that optimizing
all of them simultaneously is practically impossible in a real-world network. Improv-
ing one aspect may lead to the sacrifice of one or more other aspects. For example,
enhancing security through deep packet inspection will inevitably introduce addi-
tional time delay. Similarly, enhancing the reliability of a network may require the
implementation of additional redundancy mechanisms and strategies. Consequently,
this will result in increased capital and operational costs for the network, as well as
more complicated network management. Therefore, in order to achieve a satisfac-
tory solution, trade-offs need to be made for the identified requirements and technical
goals. This can be achieved by carefully evaluating and prioritizing various aspects,
taking into account the specific needs and objectives of the network project.
Categorizing Requirements
With these three categories of requirements, assign top priority to the REQUIRED
requirements, normal priority to the RECOMMENDED requirements, and low prior-
ity to the DESIRABLE/OPTIONAL requirements, respectively. Within the RECOM-
MENDED and DESIRABLE/OPTIONAL categories, prioritize the requirements by
determining the order in which they will be implemented. In this way, high-priority
requirements can always be met and low-priority requirements will be met whenever
possible.
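The categorization above maps naturally onto a simple priority ordering. The sketch below sorts requirements so that REQUIRED items always come first; the requirement identifiers are hypothetical, except that R4.1's 10 s Canvas delay echoes the example in this chapter.

```python
PRIORITY = {"REQUIRED": 0, "RECOMMENDED": 1, "DESIRABLE/OPTIONAL": 2}

# Hypothetical requirements list: (id, description, category).
requirements = [
    ("R6.3", "Guest Wi-Fi portal", "DESIRABLE/OPTIONAL"),
    ("R4.1", "Canvas access delay <= 10 s", "REQUIRED"),
    ("R5.2", "Dual-stack IPv6 on new subnets", "RECOMMENDED"),
]

# Stable sort: within each category, the originally listed order is preserved,
# so per-category prioritization can be expressed simply by list order.
ordered = sorted(requirements, key=lambda r: PRIORITY[r[2]])
# ordered ids: R4.1, R5.2, R6.3
```

In practice the per-category order would itself be negotiated with stakeholders, but the principle is the same: high-priority requirements are always met first.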
When conflicts arise in implementing the requirements, several options exist. Here
are some examples:
• Meet the high-priority requirements and relax the low-priority requirements. For
example, the time delay of 10 s specified in Requirement R4.1 of Table 3.2 might
be slightly relaxed, e.g., 10.5 or 11 s, with a minimal impact on the QoS of the
Canvas application for teaching and learning in a university.
• Investigate whether the conflicting requirements are appropriately specified. If not,
consider re-defining them. For example, find out what causes the delay in accessing
Canvas. Is it because of excessive simultaneous sessions or users, or is it due to an
inefficient authentication process? Can on-campus access to Canvas be streamlined
for faster access? A detailed investigation can help refine the requirements and
potentially eliminate the conflicts.
3.9 Summary
Given the complexity of networks and network services, network requirements can
be analyzed in separate but interconnected components. In this chapter, requirements
analysis has been conducted within the framework of a four-layer generic network
analysis model. Accordingly, requirements are grouped into user, application, device,
and network requirements. As we ascend the layered hierarchy towards the top user
layer, the requirements become more subjective. By contrast, as we move down the
layers towards the bottom network layer, the requirements become more technical
and objective.
The requirements identified from the user, application, device, and network components
are integrated to form a complete set of requirements. The resulting requirements
are further analyzed for prioritization and refinement, which includes
resolving conflicts and making trade-offs. Ultimately, detailed requirements
specifications, along with a requirements map, are developed with technical trade-offs
and constraints. They will serve network planning, specifically in relation to network
architecture and design.
References
1. Oppenheimer, P.: Top-Down Network Design, 3rd edn. Cisco Press, Indianapolis, IN,
USA (2011). ISBN 978-1-58720-283-4
2. McCabe, J.D.: Network Analysis, Architecture, and Design, 3rd edn. Morgan Kaufmann
Publishers, Burlington, MA, USA (2007). ISBN 978-0-12-370480-1
3. Tian, Y.C., Levy, D.C.: Handbook of Real-Time Computing. Springer, Singapore (2022)
4. Bradner, S.: Key words for use in RFCs to indicate requirement levels. RFC 2119, RFC Editor
(1997). https://doi.org/10.17487/RFC2119
5. Leiba, B.: Ambiguity of uppercase vs lowercase in RFC 2119 key words. RFC 8174, RFC Editor
(2017). BCP 14. https://doi.org/10.17487/RFC8174
Chapter 4
Traffic Flow Analysis
Many of the requirements developed from the requirements analysis in the previ-
ous chapter are directly or indirectly related to performance for users, applications,
devices, and the network in the four-layer generic network analysis model. Along
with their location dependencies, they are affected by the patterns and behaviors
of traffic flows. More importantly, the implementation of various Quality of Ser-
vice (QoS) management mechanisms and strategies relies on traffic flow manage-
ment. Therefore, traffic flow analysis is an important step in the development of flow
requirements specifications. It is an integral part of requirements analysis in network
planning projects.
Traffic flow analysis characterizes traffic flows within a network to understand
the following aspects:
• Traffic flows: identifying where the flows will likely occur,
• Traffic QoS: determining what levels of QoS the flows will require,
• Traffic models: understanding which types of traffic flows are well
characterized,
• Traffic measurement: clarifying how traffic flows are measured and quantified,
• Traffic load: assessing how much load the flows carry,
• Traffic behavior: examining how the flows behave, and
• Traffic management: defining how the flows should be managed.
Through the process of flow analysis, a comprehensive set of flow requirements
specifications will be identified. These requirements indicate where traffic flows are
likely to occur and how flow requirements will combine and interact. They also offer
valuable insights into network hierarchy and redundancy, and may even suggest
interconnection strategies. More specifically, the developed flow specifications will
be used later in network architecture planning, particularly for the management of
network performance and QoS.
As we have already understood, the majority of network services are typically provisioned
as best-effort services by default, implying that they do not offer any explicit QoS guarantees.
4.1 Traffic Flows
Traffic flows have been described from different perspectives. Let us examine in the
following how traffic flows have been defined.
In the IETF RFC 2722 [1, p. 5], a traffic flow is defined as “an artificial logical
equivalent to a call or connection”. It is “a portion of traffic, delimited by a start and
stop time”, that belongs to “a user, a host system, a network, a group of networks,
a particular transport address (e.g. an IP port number), or any combination of the
above”. “Attribute values (source/destination addresses, packet counts, byte counts,
etc.) associated with a flow are aggregate quantities reflecting events which take
place in the DURATION between the start and stop times. The start time of a flow is
fixed for a given flow; the stop time may increase with the age of the flow.” [1, p. 5].
In IPv6 networks, the IETF RFC 3697 [2, p. 1] defines a flow as “a sequence of
packets sent from a particular source to a particular unicast, anycast, or multicast
destination that the source desires to label as a flow. A flow could consist of all
packets in a specific transport connection or a media stream. However, a flow is not
necessarily 1:1 mapped to a transport connection.”
In the IETF RFC 3917 [3], which specifies the IP Flow Information eXport
(IPFIX) [3, pp. 3–4], a flow is defined as “a set of IP packets passing an obser-
vation point in a network during a certain time interval. All packets belonging to a
particular flow have a set of common properties.” The flow properties mentioned here
refer to flow attributes that are described in the IETF RFC 2722 [1]. Each property,
or attribute, results from
• The packet header field (e.g., destination IP address), transport header field (e.g.,
destination port number), or application header field,
• The characteristics of the packet itself, such as QoS levels, and/or
• The fields derived from packet treatment, e.g., the next hop IP address.
A packet is considered to belong to a flow if it fully satisfies all the defined properties
of that flow. For example, traffic originating from the same application, e.g., video
streaming or VoIP, can be effectively managed within a single flow. Similarly, traffic
with identical QoS requirements can be grouped together within the same flow,
facilitating streamlined QoS management.
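A minimal illustration of this property-based definition is to group packets by a common attribute set, here the familiar 5-tuple (any other combination of the attributes above would serve equally well); the packet records are illustrative.

```python
from collections import defaultdict

def flow_key(pkt: dict) -> tuple:
    """The common properties chosen to define a flow: the classic 5-tuple
    drawn from the packet and transport headers."""
    return (pkt["src_ip"], pkt["dst_ip"], pkt["proto"],
            pkt["src_port"], pkt["dst_port"])

def group_into_flows(packets):
    """A packet belongs to a flow iff it matches all of that flow's properties."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return flows
```

Choosing a coarser key, such as the QoS class alone, would group all packets with identical QoS requirements into one flow, as described above.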
From the above discussions, it is seen that the concept of traffic flows is applied to
“any protocol, using address attributes in any combination at the adjacent, network,
and transport layers of the network stack” [1, p. 3]. The term Adjacent refers to “the
next layer down in a particular instantiation of protocol layering” though it usually
means the link layer [1, p. 3]. More specifically:
• A traffic flow is a sequence of packets with common attributes or properties,
which “are defined in such a way that they are valid for multiple network protocol
stacks” [1, p. 3],
• It is measured during a time interval and in a single session of an application,
• It is characterized at an observation point, and
• It is end-to-end between source and destination applications/devices/users.
To broaden the scope of Real-Time Flow Measurement (RTFM) beyond simple
traffic volume measurements defined in the IETF RFC 2722 [1], new flow attributes
are introduced in the IETF RFC 2724 [4]. They include performance attributes, such
as throughput, packet loss, delays, jitter, and congestion measures. These perfor-
mance attributes are calculated as extensions to the RTFM flow attributes according
to the following three general classes [4, p. 5]:
• Trace: Attributes of individual packets within a flow or a segment of a flow, e.g.,
last packet size,
• Aggregate: Attributes derived from the flow considered as a whole, e.g., mean rate,
and
• Group: Attributes calculated from groups of packet values within the flow, e.g.,
inter-arrival times.
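These three attribute classes can be computed directly from per-packet observations. The sketch below assumes a flow recorded as (timestamp, size) pairs; the figures are illustrative.

```python
def rtfm_style_attributes(packets):
    """Compute one example attribute from each RFC 2724 class for a flow
    observed as a list of (timestamp_s, size_bytes) tuples."""
    times = [t for t, _ in packets]
    sizes = [s for _, s in packets]
    duration = times[-1] - times[0]
    return {
        "last_packet_size": sizes[-1],                                 # trace class
        "mean_rate_bps": sum(sizes) * 8 / duration,                    # aggregate class
        "inter_arrival_s": [b - a for a, b in zip(times, times[1:])],  # group class
    }

# 3000 bytes over 1 s -> mean rate of 24000 bps; last packet is 500 B.
attrs = rtfm_style_attributes([(0.0, 1000), (0.5, 1500), (1.0, 500)])
```

Jitter, for example, can then be derived from the group-class inter-arrival times, while packet loss and congestion measures require additional observations at the measurement point.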
Figure 4.1 illustrates a traffic flow over a network. Since traffic flows are end-to-
end, they can also be examined and evaluated on a link-to-link or network-to-network
basis. This will help integrate flow requirements at the link or network level.
Most flows are bidirectional, with the same set of attributes for both directions
or a different set of attributes for each direction. However, there are instances where
flows are unidirectional with a single set of attributes. The flow shown in Fig. 4.1
represents an example of such a unidirectional flow.
The traffic flow shown in Fig. 4.1 captures the features of an individual flow. An indi-
vidual flow is defined as a flow of protocol and application information transmitted
over the network during a single session. It is the basic unit of traffic flows, and can
be aggregated with other individual flows to form aggregated traffic flows, such as
composite flows, which will be discussed later.
Fig. 4.1 A traffic flow with attributes (application, transport, and/or packet headers) applied end-to-end over the network

Fig. 4.2 A composite flow aggregated from three individual flows passing routers R1 and R2: (1) individual flow 1: 120 kbps upstream, 500 kbps downstream; (2) individual flow 2: unidirectional, 1 Mbps, 20 ms one-way delay; and (3) individual flow 3: 10 Mbps bidirectional, 100 ms round-trip delay
comparison, when uploading a file to a File Transfer Protocol (FTP) server over the
Internet, the traffic flows upstream from the user. Internet access speeds are typically
asymmetrical with the upstream rate being considerably slower than the downstream
rate.
In the process of flow analysis, not all traffic flows need to be considered. There-
fore, it is important to clarify questions such as:
• Which individual flows should be considered?
• Where and what requirements should be applied to these flows?
• When do these flows contribute to composite flows?
• How are the requirements of individual flows that contribute to the composite flows aggregated to form the requirements for the composite flows?
By addressing these questions, flow analysis can focus on the flows that have the
most significant impact on the network architecture and QoS management.
Critical flows are traffic flows that are considered more important than others and,
as a result, have higher levels of QoS requirements. Traffic flows associated with
mission-critical, safety-critical, and time-critical services and applications are typi-
cal examples of critical flows. Traffic flows that serve more important users, applications, and devices can also be treated as critical flows, even if they are not
as critical as mission-critical, safety-critical, or time-critical flows. Traffic flows that
have high, predictable, and guaranteed performance requirements drive the archi-
tectural design of a network, especially in service-based networking. They dictate
resource allocation, prioritization schemes, and overall network QoS management
to ensure that the required level of network performance is maintained.
Identifying the sources and sinks of traffic flows will help characterize the flows and
determine their directions. A flow source or data source is a network entity where a
traffic flow originates. A flow sink or data sink is a network entity where a traffic flow
terminates. Graphically, a flow source can be represented by a circled dot, whereas
a flow sink can be denoted by a circled asterisk, as shown in Fig. 4.3. Figure 4.4
provides some examples of flow sources and sinks.
In a computer network, some network devices only generate data and traffic flows,
making them pure flow sources. A typical example of such devices is video cameras
used in a networked surveillance system.
There are also network devices that only consume data and terminate traffic flows,
making them pure flow sinks. Video monitors in a network surveillance system are
examples of pure data sinks. Global Positioning System (GPS) devices used in general
vehicles receive satellite signals for positioning and navigation. They are pure data
sinks. However, high-end GPS devices may also communicate with satellite base
stations and other network devices, and thus are not pure flow sinks.
Almost all network devices in a network generate and consume data, and thus play
dual roles as both flow sources and sinks. For example, consider a web server that
receives requests from visitors and responds by providing the requested information.
When receiving requests, the web server functions as a flow sink. However, when
sending out information in response to the requests, it acts as a flow source. As the
traffic of web server responses is generally much larger than that of requests, the web
server primarily acts as a flow source. Similarly, a storage server primarily functions
as a flow sink because the incoming traffic directed towards the server is generally
much higher than the outgoing traffic originating from the server.
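This source/sink asymmetry can be sketched with a simple comparison of traffic counters. The byte counts below are hypothetical illustration values.

```python
# Sketch: classifying a network entity as primarily a flow source or sink
# by comparing the bytes it sends against the bytes it receives.
def primary_role(bytes_out: int, bytes_in: int) -> str:
    if bytes_in == 0 and bytes_out > 0:
        return "pure flow source"
    if bytes_out == 0 and bytes_in > 0:
        return "pure flow sink"
    return "primarily a flow source" if bytes_out > bytes_in else "primarily a flow sink"

# Hypothetical counters for the examples in the text.
print(primary_role(bytes_out=9_000_000, bytes_in=0))        # video camera: pure flow source
print(primary_role(bytes_out=8_500_000, bytes_in=250_000))  # web server: primarily a flow source
print(primary_role(bytes_out=120_000, bytes_in=7_000_000))  # storage server: primarily a flow sink
```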
[Figure: Composite and backbone flows among Site A, Site B, and Site C over a WAN. Backbone flows cross the flow boundaries between each site and the WAN.]
this 80/20 rule. This is because, from the user's perspective, it is often transparent
where resources come from and how much traffic traverses the WAN; users do not
care where the resources come from as long as they are readily available. Consequently, the growing
demand for WAN resources may require a revised rule, such as a 50/50 split or even
a 20/80 distribution. The decision to use local network resources within the LAN
or remote resources through the WAN should consider various factors such as cost,
performance, and other specific requirements.
The concept of QoS can vary in its interpretation across different application domains
or use cases. In telephony systems, QoS refers to the overall performance of the
services that are provided by the system. It is quantified using various performance
metrics. These metrics include packet loss rate, bit rate, throughput, latency, jitter,
reliability, and availability. Quite often, this interpretation of QoS is also applied
in computer networking to indicate the quality of various services provided by a
network.
In computer networking, QoS specifically refers to a set of mechanisms and tech-
nologies to control network traffic and related resources such that the performance
of high-priority and critical applications is ensured within the resource capacity of
the network. This interpretation of QoS is adopted in this chapter for discussing
performance architecture of computer networks.
QoS is a feature of routers and switches. Therefore, QoS mechanisms are imple-
mented in routers and switches to prioritize traffic and control resources so that more
important traffic takes precedence over less important traffic. Traffic with QoS requirements should be marked by the applications that generate the traffic. When routers
and switches receive the marked traffic, they will be able to categorize the traffic into
different groups for QoS management.
Understandably, it is necessary to characterize, quantify, measure, and manage
the achieved performance quality of a computer network with QoS control, as well
as the quality of the applications that the network serves. This should be addressed in
conjunction with performance management as part of overall network management
architecture, which will be discussed later in a separate chapter. Various performance
metrics are used for this purpose, such as packet loss rate, bit rate, throughput, latency,
and jitter, as mentioned above.
In computer networking, network service requests from end users, applications, and
devices are supported through network service offerings by the network. By default,
network service offerings are provisioned as best-effort service responses. This means
that the network does not differentiate traffic flows from different users, applications,
and devices. As a result, the network simply offers whatever network resources
are available at any given time without providing any performance guarantee to
network services. As the available network resources fluctuate over time, the best-
effort allocation of network resources to each user, application, and device also
changes from time to time. Therefore, it becomes the responsibility of the system
or underlying applications to adapt their traffic flows to the available services. The
flow control function of TCP is an example of self-adaptation to a dynamic network
environment.
While the majority of network services are best-effort services, there are also
network services that are time-critical, safety-critical, and mission-critical services.
VoIP and video streaming services integrated with data networks are typical exam-
ples, which are sensitive to delay and jitter. Due to their real-time requirements, both
VoIP and video streaming use UDP for the transport of voice and video datagrams.
In the best-effort service environment, the delay and jitter may exceed their maximum tolerable thresholds, resulting in poor or even unacceptable quality of voice and
video. More severely, if the loss rate of datagrams becomes high, as there is no mech-
anism for re-transmitting the lost datagrams in UDP, the voice and video services
may become completely unusable. Even if a re-transmission strategy is designed at
the application layer, the re-transmitted datagrams may arrive too late to be useful at
the receiver.
Traditionally, over-provisioning network bandwidth has been used to partially
address such problems in network services with specific performance requirements.
When network utilization, which is the used bandwidth relative to the total available
bandwidth, is relatively low, burst traffic in the network can be handled without a
major impact on the performance of voice, video, and other services with performance
requirements. Thus, adding more bandwidth helps mitigate some network
service problems.
However, due to the limited network resources, meeting the performance require-
ments for some network services remains a challenge. This challenge becomes more
severe when more critical network services are integrated into the data network.
Therefore, there is a demand for systematic mechanisms to control and manage the
performance of network services. This is where QoS comes into play.
The most common use cases for QoS in computer networking are voice and video
streams, as discussed earlier. In addition to voice and video services, many other
network services also require QoS, particularly for real-time applications as well as
safety-critical and mission-critical systems. For example, in large-scale manufactur-
ing and agricultural applications, numerous real-time monitoring and control tasks
operate with the support of network communications integrated with industrial Inter-
net and IoT networking. Some tasks have higher priority than others, such as sending
out emergency commands versus receiving regular sensing measurement data. With
QoS management, tasks with higher priority can be executed earlier than other tasks.
As the demand for network connectivity continues to increase, more and more
network services are being provisioned over networks. This trend motivates and
necessitates the increasing deployment of QoS control in computer networks to pro-
vide differentiated services to various end users, applications, and devices. Therefore,
QoS will play an increasingly important role in future network systems, ensuring that
certain data streams are handled with higher priority over others within the given net-
work resource capacity.
For identified traffic flows, it is not sufficient in network planning to know only simple
QoS characteristics such as load (bandwidth) and behavior, which will be discussed
later in this chapter. As network services and QoS requirements drive service-based
networking, it is important to comprehensively understand QoS requirements, par-
ticularly for critical flows and applications.
The majority of network services and applications in a network are served with
best effort by default. No specific QoS requirements will be applied to these best-
effort services. If sufficient bandwidth is available, they may perform well. But if
not, they may function poorly. This is the behavior that we would expect for these
services and applications. For example, web browsing and mail services are generally
deployed as best-effort services.
Some flows or applications can be served with service differentiation such as
the DiffServ mechanism specified in the IETF RFC 2475 [5]. DiffServ is a layer-
3 QoS control mechanism. It does not support end-to-end QoS management for
individual flows. Rather, it aggregates traffic flows between two hops and manages
the aggregated flows in a per-hop behavior (PHB) manner. This is particularly suitable for soft-real-time
services and applications.
There are flows that require end-to-end QoS support. These flows can be served
with Integrated Services (IntServ), which was originally defined in the IETF RFC
1633 [9]. Like DiffServ, IntServ is also a layer-3 QoS control mechanism. But
different from DiffServ that manages flows in a PHB manner, IntServ manages each
individual flow from end to end. Therefore, it requires all routers along the path of
the flow to reach an agreement on reserving resources, e.g., bandwidth, for the flow.
For this purpose, a signaling system is essential for communicating the flow require-
ments with all participating routers. Obviously, it is also essential that all participating
routers must be able to reserve the required resources. Due to its end-to-end flow
support, IntServ is well suitable for hard-real-time services and applications such as
those with QoS requirements for guaranteed flows.
In IntServ, two major classes of QoS services have been defined: guaranteed
service and controlled-load service:
• Guaranteed service is introduced with IntServ. It is defined in the IETF RFC
2212 [10]. Its objective is to provide “firm (mathematically provable) bounds” on
the queuing delays that a packet will experience in a router. As a result, guaranteed
service guarantees both delay and bandwidth for a flow.
• Controlled-load service is defined in the IETF RFC 2211 [11]. It aims to provide
the client traffic flow with a QoS “closely approximating the QoS that same flow
would receive from an unloaded network element”. It operates effectively for the
served flow regardless of the traffic load of the router through which the flow is
passing. Thus, admission control is used to ensure that the controlled-load service
performs well even if the router is heavily loaded or overloaded. Controlled-load
service does not specify any specific performance guarantees, making it suitable
for real-time multimedia applications such as video streaming.
More details of the controlled-load service and guaranteed service developed for
IntServ will be discussed later in the performance component architecture of this
book.
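To make the reservation idea concrete, here is a minimal sketch of IntServ-style admission control. The router names, capacities, and request sizes are hypothetical, and real IntServ signaling uses RSVP with token-bucket traffic specifications rather than a single bandwidth number.

```python
# Sketch: IntServ-style admission control. A flow is admitted only if every
# router along the path can still reserve the requested bandwidth; otherwise
# nothing is reserved anywhere. Capacities and requests are hypothetical (Mbps).
class Router:
    def __init__(self, name: str, capacity_mbps: float):
        self.name = name
        self.capacity_mbps = capacity_mbps
        self.reserved_mbps = 0.0

    def can_reserve(self, mbps: float) -> bool:
        return self.reserved_mbps + mbps <= self.capacity_mbps

def admit_flow(path: list, mbps: float) -> bool:
    # All routers along the path must reach an agreement before reserving.
    if all(r.can_reserve(mbps) for r in path):
        for r in path:
            r.reserved_mbps += mbps
        return True
    return False

path = [Router("R1", 100.0), Router("R2", 10.0), Router("R3", 100.0)]
print(admit_flow(path, 8.0))  # True: every router has enough spare capacity
print(admit_flow(path, 5.0))  # False: R2 has only 2 Mbps left, so nothing is reserved
```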
From the QoS requirements, it is seen that some flows are more important than others.
Therefore, the technique of flow prioritization is used to formally determine which
flows should receive the most resources or which flows should be allocated resources
first. While network engineers can always request more funding for additional net-
work resources, it is important to acknowledge that network resources are not infinite
in the real world, as discussed in the IETF RFC 1633 [9, p. 4]. Consequently, flow
prioritization should be conducted under the constraints of network resources. The
following discussions on flow prioritization do not consider any budget constraints
in relation to the acquisition of additional network resources.
Critical flows, such as mission-critical, safety-critical, and time-critical flows,
should be assigned higher levels of priority than other flows. They may not necessarily
require a significant amount of resources, but will require service guarantees such
as an upper delay bound or a lower bandwidth bound. This necessitates the use of
guaranteed service provided by IntServ. Moreover, among the critical flows, certain
flows may be more important than others.
For traffic flows with soft QoS requirements, lower levels of priority can be
assigned than those for critical flows. Typical examples include non-critical video
streaming and other multimedia flows. Depending on the application scenarios, they
can be managed within the DiffServ framework if end-to-end support is not necessary,
or through the controlled-load service within the IntServ framework if end-to-end
support is essential.
Traffic flows without specific QoS requirements can be treated as best-effort flows,
as we have already understood. They can be managed through best-effort services.
When the path of a flow experiences heavy traffic load, the service provided to the
flow may be much slower than it would be under normal traffic conditions.
There are layer-2 and layer-3 mechanisms for flow prioritization. The basic idea
behind flow prioritization mechanisms is to mark the outgoing traffic flows before
they are transmitted. At layer 2, a 3-bit Priority Code Point (PCP) in the frame
header represents eight levels of priority for data frames. In comparison, at layer
3, a 6-bit Differentiated Services Code Point (DSCP) is embedded in the IP
header to differentiate 64 levels of priority for data packets. What flow prioritization
mechanisms should be chosen and how to control the marked traffic flows with QoS
requirements will be the topics of performance component architecture, which will
be covered later in a separate chapter of this book.
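As a concrete illustration of layer-3 marking, the sketch below sets the DSCP value on a UDP socket. The EF (Expedited Forwarding) code point and the `IP_TOS` socket option are standard, but platform support varies (the option shown works on Linux); treat this as a sketch rather than a portable recipe.

```python
# Sketch: marking outgoing traffic with a layer-3 DSCP value.
# The 6-bit DSCP occupies the upper bits of the former IP TOS byte,
# so the socket option value is dscp << 2. EF (Expedited Forwarding,
# code point 46) is commonly used for voice traffic.
import socket

EF_DSCP = 46              # 0b101110
tos_byte = EF_DSCP << 2   # 184 = 0b10111000

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte)
# Datagrams sent on this socket now carry DSCP 46 in their IP headers,
# so DSCP-aware routers can classify them for QoS management.
print(tos_byte)  # 184
sock.close()
```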
There are a few well-established traffic flow models, each showing specific and
consistent flow behaviors. An effective approach to flow analysis is to map network
flows to one of these flow models. This section introduces the peer-to-peer, client-
server, hierarchical client-server, and distributed-computing flow models. These flow
models have been characterized with examples in the book by McCabe [6, pp. 180–
191].
The peer-to-peer flow model describes fairly consistent traffic flows in a physically
or logically flat network topology. Shown in Fig. 4.6, it exhibits the following two
key features:
• No peers are more important than others.
• There is no manager among the peers.
4.3 Traffic Flow Models 93
As a result, all flows among the peers are considered to be equal in terms of their
importance. Either all or none of the flows are critical. A single set of flow require-
ments, known as a profile, applies to all flows.
Typical examples of peer-to-peer flow behavior include:
• Web browsing across the Internet: A huge number of users browse numerous web-
sites over the Internet. No users or websites are considered to be more important
than others, and there are no managers to coordinate web browsing activities over
the Internet.
• FTP services over the Internet: This is similar to web browsing over the Internet.
A large number of FTP servers serve a vast user base for file transfer across the
Internet. No FTP servers or users are more important than others.
• Email services over the Internet: This is also similar to web browsing over the
Internet. A vast number of email users use a large number of email servers over
the Internet for email services. Nobody on the Internet is able to coordinate such
email services.
• Social networking among peers: Examples include Twitter, Facebook, TikTok,
and WeChat. While these social networks have servers as back-end infrastructure
support, users over the Internet are considered to be equally important. They engage
in peer-to-peer conversations where no one is more important than others. Traffic
flows to or from each user are treated without differentiation.
• A combination of web browsing, FTP, email services, and social networking over
the Internet.
The client-server flow model is graphically depicted in Fig. 4.7. It is the most
generally applicable traffic flow model in computer networking. The client-server
model consists of a centralized server and one or more distributed clients. It has a
request/response communication feature with bidirectional traffic flows:
• Clients send requests to the server for data and services, and the server responds
to the requests by providing data and services to the clients.
Fig. 4.7 Client-server flow model with asymmetric traffic flows [6, p. 184]. The traffic of responses
is much larger than that of requests. Thus, the server acts more like a flow source, whereas the
clients act more like flow sinks
• The downstream traffic of the responses from the server to the clients is typically
much larger than the upstream traffic of the requests from the clients. Therefore,
the bidirectional upstream and downstream traffic flows are asymmetric. As a result,
the server acts more like a flow source, whereas the clients act more like flow sinks.
• In comparison with the requests from the clients, the responses from the server
are more important because they are being expected by the clients.
Many network services and applications fit well into the client-server model. A
typical example is user access to the centralized servers like web server, FTP server,
mail server, and other servers within an enterprise network. In this model, users
send requests to the server for data and services, and the server responds to the
user by providing the requested data and services. This necessitates highly reliable
servers, secure communication for requests and responses, and the ability to handle
large downstream traffic. Therefore, a network must possess the capability to support
client-server services and applications. It is worth noting that certain services and
applications, such as web browsing, FTP downloading, and mail services, can fit
into different flow models depending on specific scenarios. While they operate in a
client-server manner within an enterprise network, they exhibit peer-to-peer traffic
flows across the Internet, as discussed previously in relation to the peer-to-peer flow
model.
Additional examples of the client-server flow model include various web-based or
similar applications, such as Overleaf as an online LaTeX editor, SAP as an Enterprise
Resource Planning (ERP) application, cloud services from cloud service providers,
GitHub as a code hosting platform, arXiv as an open-access repository of electronic
preprints and postprints, and ChatGPT as an online chatbot (launched on 30 November
2022). These applications have centralized servers to offer data and
services to distributed clients worldwide. Communication between the server and
clients occurs through requests and responses, with traffic flows being bidirectional
and typically asymmetric.
Terminal-host traffic flows were popular many years ago. They can be considered
as a special type of client-server traffic flows in modern computer networks. They
appear in the communications between a mainframe computer and its remote ter-
minals, as well as in other Telnet applications. Typically, a terminal sends a single
or a few characters to the host, and the host responds with many characters. Thus,
terminal-host traffic flows are typically asymmetric. However, there are instances
where a terminal sends a character to the host and receives a character in return, such
as in the vi editor. There are also scenarios where a complete screen is updated at
a time, for instance, in some mainframes like the IBM 3270. Therefore, the efficiency
of terminal-host traffic flows may vary depending on the specific application. Further
discussions on terminal-host traffic flows can be found in the book
by Oppenheimer [7, p. 91].
When more tiers of hierarchy are added to the client-server model, the characteristics
of network communication traffic can be better described using a hierarchical client-
server flow model, which is also known as a cooperative computing flow model. This
is illustrated in the logical diagram of Fig. 4.8. In the upper tiers of the hierarchical
client-server flow model, there exist multiple hierarchical tiers of servers. These
servers engage in communication with one another, and function as both flow sources
and sinks. The top-tier server performs as the global manager. At the lowest tier of
servers, each server serves one or more clients within a client-server flow model,
forming multiple client-server systems.
Fig. 4.8 Hierarchical client-server flow model [6, p. 186], also known as the cooperative computing
flow model, with two or more hierarchical tiers of servers acting as both flow sources and sinks.
The distributed clients act more like flow sinks
As in the simple client-server flow model, the server-to-client traffic flows are
considered more important than the client-to-server flows in the hierarchical client-
server flow model. In addition, without more detailed information, the server-to-
server traffic flows are also considered more important than the client-to-server flows.
With the increasing deployment of web applications over the Internet, the demands
for server reliability and performance are growing. As a result, many web servers are
replicated and then distributed across various physical locations. These servers are
often managed and coordinated by a global server. Consequently, numerous applications
that were originally served within the client-server model are now being served within
a hierarchical client-server model. In this model, the traffic flows between servers,
and servers and managers, become more important than before in ensuring the full
functionality, reliability, and security of web applications.
In the hierarchical client-server model, servers at multiple tiers may offer similar
functions and services as discussed earlier. It is also possible for them to provide
different functions and services. For example, one server may primarily be used for
scientific computing, while another may be predominantly used for e-commerce.
Nevertheless, the servers communicate with each other for resource sharing, data
replication, task migration, and other purposes.
The distributed-computing flow model is the most specialized flow model. Briefly
speaking, depending on the application scenarios, it may exhibit traffic flow char-
acteristics that resemble a combination of both peer-to-peer and client-server flow
models, or demonstrate the opposite characteristics of the traffic flows observed in
the client-server flow model. A logical diagram of the distributed-computing flow
model is presented in Fig. 4.9. This model comprises a central task manager and
multiple distributed computing nodes. The task manager assumes responsibility for
managing the overall computing task, dispatching subtasks to the distributed comput-
ing nodes, and collecting computing results from them. The distributed computing
nodes conduct the computations assigned by the task manager.
Let us have a detailed discussion on the features of the distributed-computing
flow model. First of all, distributed computing is a specialized computing method that
deals with the computation of computing tasks across multiple distributed computing
nodes. It decomposes a computing task into multiple smaller subtasks and then
distributes these subtasks to different computing nodes for computation. Then, it
collects the computing results from the distributed computing nodes and derives the
final computing result for the overall computing task.
There are several scenarios that necessitate distributed computing. Here are a few
examples:
• When the data required for a computing task are distributed in multiple nodes and
cannot be consolidated onto a single node due to factors such as data ownership
Fig. 4.9 Distributed-computing flow model with a task manager and multiple computing nodes [6,
p. 189]. Depending on the application scenarios, the task manager can act as a flow source, a flow
sink, or both, as can each of the computing nodes. Direct interactions among the computing
nodes may or may not exist for information exchange
algorithms and speed up the overall execution of the computing. If the subtasks
are additionally fine-grained, the distributed computing system behaves akin to a
parallel computing system. In such cases, the task manager may dynamically dispatch
subtasks to the computing nodes.
When the subtasks dispatched to the distributed computing nodes are loosely cou-
pled, interactions among the computing nodes may not be necessary. Even if some
interactions are needed, they can be achieved indirectly through the task manager.
If the subtasks additionally have a coarse granularity, it is feasible to statically allo-
cate the subtasks to the computing nodes at the beginning of the overall computing
task based on the computing capacity and resources of the nodes. This distributed
computing scenario resembles cluster computing.
Regarding the traffic flows in the distributed-computing flow model, the task man-
ager actively dispatches subtasks to the distributed computing nodes and passively
receives results from them. This is different from the client-server flow model, in
which clients send requests to the server and receive responses from it. However,
in terms of one-to-many communication with asymmetric traffic flows, both the
distributed-computing and client-server flow models behave similarly. As for the
interactions appearing in the distributed-computing flow model, their traffic flows
are similar to those in the peer-to-peer flow model. No single interaction is more
important than others, and none of the computing nodes manage all these interac-
tions.
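The dispatch-and-collect pattern of the task manager can be sketched as follows, with threads standing in for distributed computing nodes and a simple summation as the hypothetical overall task.

```python
# Sketch of the distributed-computing flow model: a task manager decomposes
# a computing task into subtasks, dispatches them to computing nodes (threads
# stand in for remote nodes here), and combines the collected results.
from concurrent.futures import ThreadPoolExecutor

def compute_subtask(chunk):
    # A computing node performs the computation assigned by the task manager.
    return sum(chunk)

def task_manager(data, num_nodes=4):
    # Decompose the overall task into one subtask per node (static allocation).
    chunks = [data[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as nodes:
        results = nodes.map(compute_subtask, chunks)  # dispatch subtasks
    return sum(results)                               # derive the final result

print(task_manager(list(range(1, 101))))  # 5050
```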
The Realtime Traffic Flow Measurement (RTFM) Working Group (WG) has developed a system for measuring and reporting information about traffic flows over the Internet. The system is specified in the
IETF RFC 2722 [1]. Measuring traffic flows serves several purposes. Here are some
use cases:
• To characterize and understand the behavior of existing networks,
• To plan for network development and expansion,
• To quantify network and application performance,
• To verify the quality of network service, and
• To attribute network usage to users and applications.
This section discusses how traffic flows are measured.
The RTFM architecture, as specified in the IETF RFC 2722 [1], is illustrated in
Fig. 4.10. It consists of four main components: meter, meter reader, manager, and
analysis application. Each component has dedicated functions and responsibilities,
which are described in the following.
4.4 Traffic Flow Measurement 99
[Figure 4.10: The RTFM architecture. The manager sends configuration and settings to the meters and meter readers; meter readers collect usage data from the meters for analysis applications.]
Meters are placed at flow measurement points in order to (1) observe packets as
they pass by the points on their way through the network and (2) classify them into
certain groups. A group may correspond to a user, a host, a network, a group of net-
works, a transport address, or any combination of the above. For each of such groups,
a meter will accumulate relevant attributes for the group. Each meter selectively
records network activity as directed by its configuration, which is set by the manager. It
can also process the recorded activity, for example by aggregating and transforming it,
before the data is stored.
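The classify-and-accumulate behavior of a meter can be sketched as follows. The (source, destination) flow key and the two attributes shown are simplifications; RFC 2722 defines a much richer attribute set and rule language.

```python
# Sketch: an RTFM-style meter that classifies observed packets into flow
# groups and accumulates per-group attributes in a flow table.
from collections import defaultdict

flow_table = defaultdict(lambda: {"packets": 0, "bytes": 0})

def observe(src: str, dst: str, size: int) -> None:
    key = (src, dst)                 # classify the packet into a group
    flow_table[key]["packets"] += 1  # accumulate attributes for the group
    flow_table[key]["bytes"] += size

# Hypothetical packets passing the measurement point.
for src, dst, size in [("10.0.0.1", "10.0.0.9", 1500),
                       ("10.0.0.1", "10.0.0.9", 400),
                       ("10.0.0.2", "10.0.0.9", 1500)]:
    observe(src, dst, size)

# A meter reader could now collect a full copy, or a subset, of these records.
print(flow_table[("10.0.0.1", "10.0.0.9")])  # {'packets': 2, 'bytes': 1900}
```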
A meter reader is responsible for reading and transporting a full copy, or a subset,
of the usage data from one or multiple meters so that the data is available to analysis
applications. What, where, when, and how to read will be directed by the configura-
tion set by the manager.
A flow measurement manager is an application that configures meter entities and
meter reader entities. It sends configuration commands to the meters, and supervises
each meter and meter reader for their proper operation. For convenience, the functions
of meter reader and manager may be combined into a single network entity. It is worth
mentioning that the manager of a meter is the master of the meter. Therefore, the
parameters of the meter can only be set by the manager.
An analysis application analyzes and processes the collected usage data, and
reports useful information for network management and engineering. For example,
the following information may be reported: traffic flow matrices (e.g., total flow rates
of many paths), flow rate frequency distribution (i.e., flow rates over a duration of
time), and usage data showing the total amount of traffic to and from specific hosts
or users. These reports assist in network monitoring, planning, and optimization.
To better understand the operation of the flow measurement system, let us consider
the interactions between the flow measurement components as depicted in Fig. 4.10.
The interactions between a meter and a meter reader involve the transfer of usage
data captured by the meter. This data is organized in a Flow Table. The meter reader
can read a full copy or a subset of the usage data by using a file transfer protocol.
The subset of the usage data can be a reduced number of records with all attributes,
or all records with a reduced number of attributes. A meter reader may collect usage
data from one or multiple meters.
The flow measurement manager is responsible for configuring and controlling
flow meters and flow meter readers. It sends configuration and setting information to
the meters and meter readers. The configuration for each meter includes the following
aspects [1, pp. 7-8]:
• Flow specifications indicating which flows are to be measured, how they are aggre-
gated, and what data the meter is required to compute for each flow being measured;
• Meter control parameters such as the inactivity time for flows; and
• Sampling behavior, which determines whether all packets passing through the
measurement point are observed or only a subset of them.
It is worth mentioning that a meter can execute several rule sets concurrently on
behalf of one or multiple managers.
The configuration for each meter reader includes specific information about the
meter from which usage data is to be collected. This information is defined in the
IETF RFC 2722 [1, p. 8]:
• The unique identity of the meter, i.e., the meter’s network name and address,
• How frequently usage data is to be collected from the meter,
• Which flow records are to be collected, and
• What attributes are to be collected for the above flow records.
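The manager-to-meter and manager-to-reader configuration just listed can be modeled with a small sketch. The class and field names below are illustrative, not defined by RFC 2722:

```python
from dataclasses import dataclass, field

@dataclass
class MeterConfig:
    flow_rules: list = field(default_factory=list)  # flows to measure and their aggregation
    inactivity_timeout_s: int = 60                  # meter control parameter for idle flows
    sample_1_in_n: int = 1                          # sampling behavior: 1 = observe every packet

@dataclass
class ReaderConfig:
    meter_address: str                        # the meter's unique network name/address
    poll_interval_s: int                      # how frequently to collect usage data
    flow_filter: str = "all"                  # which flow records to collect
    attributes: tuple = ("packets", "bytes")  # attributes to collect per record

class Manager:
    """The master of its meters and readers: only the manager sets parameters."""
    def configure_meter(self, meter, cfg: MeterConfig) -> None:
        meter.config = cfg
    def configure_reader(self, reader, cfg: ReaderConfig) -> None:
        reader.config = cfg
```

A combined meter reader/manager entity, as mentioned above, would simply hold both roles in one object.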
It is feasible to have multiple managers, meters, and meter readers in traffic flow
measurement for the same network. This allows for more flexibility and scalability
in managing and collecting usage data from different parts of the network. In the
example depicted in Fig. 4.11, Meter 1, Meters 2 and 3, and Meter 4 are placed in
three separate network segments, respectively. Manager A configures and controls
Meters 1, 2, and 4, as well as Meter Reader II. Manager B manages Meters 3 and
4, and Meter Reader I. Meter Reader I collects usage data from Meters 1, 2 and 3,
whereas Meter Reader II collects usage data from Meters 2, 3, and 4.
We have the following observations from this example:
• A manager can manage several separate meters. For example, Meters 1, 2, and 4
are managed by Manager A as shown in Fig. 4.11.
• A meter can have several rule sets from multiple managers. For example, Meter 4
is managed by both Managers A and B in Fig. 4.11.
• Multiple meters can report to one or more meter readers. For example, Meters 2
and 3 report to both Meter Readers I and II as illustrated in Fig. 4.11, providing
redundancy to meter readers. If a meter reader fails, the other can still collect usage
data from both Meters 2 and 3.
• Placing both Meters 2 and 3 within the same network segment also adds redundancy
to the traffic metering of the segment. If one meter fails, the other can still report
the usage data for the network segment.
• In this example, no synchronization is required between the two Meter Readers,
indicating that they can operate independently.
In a flow measurement configuration with multiple Meter Managers, it is necessary
to have communication between the managers. However, the interactions between
Meter Managers cannot be fully addressed solely from the flow measurement per-
spective. They should be explored in the broader context of network management,
which will be covered in later chapters of this book.
There are typically large volumes of traffic flows in a network. Capturing all of
them for flow measurements can be resource-demanding. In many cases, it may
not be feasible or necessary to capture every single flow. The flow granularity of
flow measurements controls the trade-off between the ‘overhead’ associated with
the measurements and the ‘level of detail’ provided by the usage data. A higher
level of detail implies higher overhead, and vice versa.
How is the flow granularity controlled? It is controlled by adjusting the level of
detail of various factors, such as those listed below [1, p. 13]:
• The metered traffic group, which can be based on address attributes,
• The category of packets, such as the attributes other than addresses, and
• The lifetime or duration of flows, i.e., the reporting interval that may need to be
sufficiently short for accurately measuring the flows.
The rule set that determines the traffic group of each packet is known as the current
rule set for the meter. It is an essential part of the reported information. This means
that the reported usage data information cannot be properly interpreted without the
current rule set.
[Figure: internal structure of a meter, in which a packet processor uses a ‘search’
index to locate entries in the flow table, and a meter reader uses a ‘collect’ index to
retrieve them]
The ‘flow key’ is used to locate the entry of the flow in the flow table. If no such
entry is found, one is created and added to the flow table. Then, the data fields of the
entry, e.g., packet and byte counters, are updated. The information shown in
the flow table can be collected at any time by a meter reader. To locate specific flows
to be collected within the flow table, the ‘collect’ index can be used by the meter
reader.
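The search-and-update procedure on the flow table can be sketched as follows; the 5-tuple flow key and the attribute names are illustrative:

```python
def update_flow_table(flow_table: dict, packet: dict) -> None:
    """Locate the packet's flow entry by its flow key, creating the entry
    if absent, then update the data fields (packet and byte counters)."""
    key = (packet["src"], packet["dst"], packet["proto"],
           packet["sport"], packet["dport"])
    entry = flow_table.setdefault(key, {"packets": 0, "bytes": 0})
    entry["packets"] += 1
    entry["bytes"] += packet["length"]

def collect(flow_table: dict, min_bytes: int = 0) -> dict:
    """A meter reader collects a full copy (min_bytes=0) or a subset."""
    return {k: dict(v) for k, v in flow_table.items()
            if v["bytes"] >= min_bytes}
```

The `min_bytes` filter stands in for the ‘collect’ index: it lets the reader retrieve only the flow records of interest rather than the whole table.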
After identifying traffic flows, the next step is to analyze and quantify these flows.
This analysis enables a better understanding of the behavior of the protocols and
applications that generate the flows, providing valuable insights into network traffic
patterns, usage trends, and performance characteristics. It also assists in designing
appropriate network architecture and selecting suitable network technologies for a
network planning project, especially for capacity planning.
• The average data rate is: 30 kB / 10 min = 3 kB/min;
• The traffic load in bits per second (bps) is: 3 kB/min × 1,000 bytes/kB × 8 bits/byte
÷ 60 s/min = 400 bps.
It is worth mentioning that the unit ‘kB’ (kilobyte) can have slightly different
meanings in different contexts. In base 10, which aligns with the International System
of Units (SI), 1 kB = 1,000 bytes, i.e., 10³ bytes. However, 1 kB = 1,024 bytes in base 2,
i.e., 2¹⁰ bytes. The base 2 representation is particularly used to measure the size of data
files and the storage capacity of hard drives and memory. Regardless, in our calculations,
we use the base 10 representation, i.e., 1 kB = 1,000 bytes, for several reasons:
• It is more convenient for calculations;
• The difference in the calculation results is small enough; and
• The small difference can be well accommodated because
– The calculation result is only an approximate estimate under simplified assump-
tions, and
– The result will need to be scaled up in general for capacity planning.
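The arithmetic above can be sketched in a few lines, assuming, as in the example, 30 kB transferred over 10 minutes and the base-10 convention of 1 kB = 1,000 bytes:

```python
def traffic_load_bps(total_kb: float, duration_min: float) -> float:
    """Average traffic load in bits per second, with 1 kB = 1,000 bytes."""
    total_bits = total_kb * 1_000 * 8
    return total_bits / (duration_min * 60)

# 30 kB over 10 minutes (i.e., 3 kB/min) gives 400.0 bps:
load = traffic_load_bps(30, 10)
```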
When the protocols that the application uses for data transmission are considered,
the estimate of the traffic load can be refined by taking into account the protocol
overhead. Table 4.2 tabulates the overhead of some commonly used protocols [7, p.
100].
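To illustrate how protocol overhead refines the estimate, the sketch below scales a raw payload load by per-packet header overhead. The header sizes used here are typical illustrative values (Ethernet 18 B, IPv4 20 B, TCP 20 B), not the figures from Table 4.2:

```python
# Illustrative per-packet header sizes in bytes (not taken from Table 4.2).
HEADER_BYTES = {"ethernet": 18, "ipv4": 20, "tcp": 20}

def load_with_overhead_bps(payload_bytes_per_s: float,
                           payload_per_packet: int,
                           protocols=("ethernet", "ipv4", "tcp")) -> float:
    """Traffic load in bps including per-packet protocol overhead."""
    overhead = sum(HEADER_BYTES[p] for p in protocols)
    packets_per_s = payload_bytes_per_s / payload_per_packet
    return (payload_bytes_per_s + packets_per_s * overhead) * 8
```

With 1,000-byte payloads, a raw 1,000 B/s load becomes (1,000 + 58) × 8 = 8,464 bps rather than 8,000 bps; smaller packets raise the overhead share further.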
Moreover, it is necessary to consider any additional traffic load that may be gener-
ated by running an application. Depending on the application scenario, this additional
traffic load may or may not have an impact on the performance of the application.
Some sources of additional traffic load include:
4.5 Traffic Load and Behavior 105
To demonstrate traffic load analysis, let us consider a few examples of local and
remote database access.
Local Access to a Database
The query rate is: 3000 queries / 10 min duration = 300 queries/min.
(3) The total size of the queries in bytes per minute: 300 queries/min × 1,000
bytes/query = 300,000 bytes/min, taking each query as 1 kB.
In the above calculations, we have used base 10 to convert 1 kB to 1,000 bytes for
the reasons discussed earlier.
Remote Access to Multiple Databases
Now, we extend the single database scenario discussed above to three database sites
as depicted in Fig. 4.14. The user and application requirements for each site are the
same as those for the single site scenario discussed above. Additional assumptions
for each of the three sites are given below:
• 80% of the queries can be answered locally from the local database;
• 20% of the queries have to be answered remotely from another database
– The transfers of the queries and responses are server-to-server, and
– The sizes of the queries and responses are respectively the same as those for
local database access.
From these assumptions and the logical diagram in Fig. 4.14, each of the three sites
for access to the local database fits well into the client-server flow model. However,
the communication among the three database servers is peer-to-peer because no
server is a manager and all flows between any pairs of servers are considered equally
important. Therefore, Fig. 4.14 shows a combination of the client-server flow model
and peer-to-peer flow model.
For each of the three sites, the local database access generates the following traffic
load: 40 kbps for queries and 400 kbps for responses (flow F0 in Fig. 4.16).
For each site, the queries to, and responses from, remote servers generate the
following traffic flows: 20% of the above load, i.e., 8 kbps for queries sent to remote
servers and 80 kbps for the corresponding responses (flow F1 in Fig. 4.16).
For the calculation of backbone flows between remote servers, how are the traffic
flows from and to a server distributed to the paths connecting the other remote
servers? In the absence of additional information about the specific distribution, we
consider the worst case scenario, which is an even distribution of traffic flows to all
connecting paths, as shown in Fig. 4.15. In the given example, the 8 kbps flows for
queries to remote servers are evenly distributed to the two paths connecting the other
two sites, resulting in 4 kbps for each of the two paths. Similarly, the 80 kbps flows
for responses from remote servers are evenly distributed to the two paths, resulting
in 40 kbps for each path.
Fig. 4.16 Calculation results of traffic flows from a single pair of query and response
for the example in Fig. 4.14:
Flow | Query (kbps) | Response (kbps)
F0   | 40           | 400
F1   | 8            | 80
F2   | 4            | 40
The calculation results of the traffic load for the backbone flows generated from
a single pair of query and response are summarized in Fig. 4.16. Traffic flow F0
represents the flow for local database access. Traffic flow F1 is a composite flow,
which carries query information to, and responses from, remote database servers.
Traffic flow F2 is the traffic flow to, and from, a remote database server.
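The F0, F1, and F2 values in Fig. 4.16 can be reproduced with the sketch below. The per-query size of 1 kB and per-response size of 10 kB are assumptions inferred from the example's figures (300 queries/min yielding 40 kbps of query traffic); check them against the original scenario:

```python
def kbps(bytes_per_min: float) -> float:
    """Convert a byte-per-minute rate to kilobits per second (base 10)."""
    return bytes_per_min * 8 / 60 / 1_000

QUERIES_PER_MIN = 300
QUERY_BYTES, RESPONSE_BYTES = 1_000, 10_000   # assumed message sizes
REMOTE_FRACTION, REMOTE_PATHS = 0.2, 2        # 20% remote, split over 2 paths

# F0: local database access (user to local server).
f0_query = kbps(QUERIES_PER_MIN * QUERY_BYTES)      # 40 kbps
f0_resp = kbps(QUERIES_PER_MIN * RESPONSE_BYTES)    # 400 kbps
# F1: composite flow to/from remote servers (20% of the load).
f1_query, f1_resp = f0_query * REMOTE_FRACTION, f0_resp * REMOTE_FRACTION
# F2: per-path flow after even (worst-case) distribution over two paths.
f2_query, f2_resp = f1_query / REMOTE_PATHS, f1_resp / REMOTE_PATHS
```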
Additional Database Synchronization
Let us additionally consider database synchronization. It is assumed that the databases
at the three remote sites are synchronized with each other once every 30 minutes. In
the synchronization process, the amount of data that needs to be transferred is 9 MB.
For constant-rate synchronization, the traffic load is calculated as follows:
9 MB / (30 min × 60 s/min) = 0.005 MB/s,
which corresponds to 5 kB/s, or 40 kbps, on each synchronization path.
Table 4.3 Traffic load with database access and synchronization for the example
shown in Fig. 4.14:
Flow                    | Database access only (kbps) | Database synchronization only (kbps) | Database access and synchronization (kbps)
F0 (query)              | 40          | −  | 40
F0 (response)           | 400         | −  | 400
F1 (outbound)           | 8+40+40=88  | 80 | 88+80=168
F1 (inbound)            | 80+4+4=88   | 80 | 88+80=168
F2 (one direction)      | 4+40=44     | 40 | 44+40=84
F2 (the other direction)| 4+40=44     | 40 | 44+40=84
The calculated traffic load for the overall system is tabulated in Table 4.3. The results
in the table indicate the minimum bandwidth requirements for the network. In net-
work planning, these requirements will need to be scaled up for future growth.
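The synchronization load and the combined figures in Table 4.3 follow from simple addition; a sketch (flow values in kbps, taken from the example):

```python
# Constant-rate synchronization: 9 MB every 30 minutes per path.
SYNC_KBPS = 9 * 1_000_000 * 8 / (30 * 60) / 1_000   # 40.0 kbps per path
PEERS = 2   # each site synchronizes with two remote peers

# Access-only flows (kbps), from Fig. 4.16 and Table 4.3.
f1_out_access = 8 + 40 + 40   # own queries out + responses served to two peers
f2_access = 4 + 40            # per-path queries out and responses back

# Combined access + synchronization loads, as in Table 4.3.
f1_out_total = f1_out_access + PEERS * SYNC_KBPS    # 88 + 80 = 168
f2_total = f2_access + SYNC_KBPS                    # 44 + 40 = 84
```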
A solution to a network planning project depends not only on traffic flows and traffic
load, but also on traffic behavior. Traffic behavior is largely influenced by protocol
behavior, application behavior, and bandwidth usage patterns. Particularly, to a large
extent, broadcast traffic may dictate network architecture, such as LAN topology
If no transmission errors are present, the larger the frame size is, the higher the
network efficiency will be because a higher portion of the frame can be dedicated to
the payload. However, if an error occurs in the transmission of the frame, the frame
may need to be re-transmitted, leading to a waste of bandwidth. In this case, a big-
ger frame size means a higher bandwidth waste. To optimize the use of bandwidth
resources, the concept of Maximum Transmission Unit (MTU) is adopted in com-
puter networking. Some applications allow for MTU configuration. If the frame size
exceeds the MTU, fragmentation occurs in an IP environment. Fragmentation splits
the large frame into multiple smaller frames, each of which is equal to, or shorter
than, the MTU. While fragmentation ensures data transmission in the network, it can
also slow down data communications.
IPv6 supports path MTU discovery, which discovers the largest packet size that can
be used along a path without the need for fragmentation. This improvement over
typical IPv4 networking reduces the need for manual MTU configuration.
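The frame-size trade-off and fragmentation count discussed above can be sketched as below. The formulas are simplified (a geometric re-transmission model, and fragment offsets are not rounded to 8-byte multiples), and the 1500-byte MTU in the usage example is the common Ethernet value:

```python
import math

def frame_efficiency(payload: int, header: int,
                     frame_error_rate: float = 0.0) -> float:
    """Fraction of transmitted bits that carry payload. With a per-frame
    error probability p, each frame needs on average 1/(1-p) transmissions,
    so the effective efficiency scales by (1 - p)."""
    return payload / (payload + header) * (1 - frame_error_rate)

def ipv4_fragments(payload: int, mtu: int, ip_header: int = 20) -> int:
    """Number of IPv4 fragments when a payload exceeds the MTU; each
    fragment carries at most (MTU - IP header) payload bytes."""
    return math.ceil(payload / (mtu - ip_header))
```

For example, `ipv4_fragments(4000, 1500)` yields 3 fragments of up to 1,480 payload bytes each, and a higher error rate pulls the efficiency of large frames down accordingly.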
Flow control in TCP/IP communications plays a critical role in controlling net-
work efficiency within the constraints of communication paths. A TCP sender can
fill a send window, or buffer, with data for transmission without waiting for an ACK
from the receiver. The receiver will place the received data into the receive window, or
buffer, with a maximum size of 65,535 bytes for processing. The bigger the receive
window, the higher the network efficiency will be. But this requires more memory
and CPU resources on the receiver side.
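The receive window bounds achievable TCP throughput at roughly one full window per round-trip time (without window scaling); a sketch, in which the 100 ms RTT is an assumed figure:

```python
def tcp_throughput_limit_bps(window_bytes: int, rtt_s: float) -> float:
    """Without window scaling, at most one full receive window can be
    in flight per round-trip time."""
    return window_bytes * 8 / rtt_s

# Maximum 65,535-byte window over an assumed 100 ms RTT: about 5.24 Mbps.
limit = tcp_throughput_limit_bps(65_535, 0.1)
```

This is why long-delay paths need either a shorter RTT, window scaling, or parallel connections to fill high-bandwidth links.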
Unlike TCP, UDP-based communications do not have built-in flow control at the
transport layer (layer 4). However, user-defined flow control can be implemented in
application programs at higher layers, such as session layer (layer 5) or application
layer (layer 7). This allows for customized flow control mechanisms tailored to
specific requirements. Table 4.4 provides a list of protocols or services that use
either TCP or UDP as the underlying transport protocol.
To evaluate the impact of protocol interactions on network efficiency, it is essential
to understand what protocols are used by an application and what relationships and
dependencies they have. For example, an email client that uses IMAP to retrieve email
messages from the mail server relies on TCP as the underlying transport protocol.
This requires a three-way handshaking process for TCP connection, and a four-
way handshaking process for TCP disconnection. IMAP can be configured without
encryption on port 143 or with encryption using Transport Layer Security (TLS)
on port 993 (Table 4.4). This implies that the email client will interact with TLS in
addition to IMAP when encryption is enabled. It is seen that multiple protocols work
together to enable the functionality of a specific application. The interactions among
these protocols affect traffic load and behavior, and ultimately network efficiency.
Hence, it is worthwhile to check the feature configurations of the application to
ensure their appropriateness without significantly impacting network efficiency.
Effective error handling is important in network communications to ensure the
successful data transmission from one end to the other. Errors in data communications
include packet corruption and dropout. While it is possible to encode a packet to
achieve full error recovery solely from the received data, this will require excessive
resources and thus is impractical in most cases. The most common method of error
handling is error detection combined with re-transmission of lost or corrupted packets.
Fig. 4.17 Weekly bandwidth usage pattern normalized as a percentage of the maximum bandwidth
usage over a week
Figure 4.17 shows a typical weekly bandwidth usage pattern, with markedly
different network traffic volumes during peak and off-peak hours. This bandwidth
usage pattern should be taken into account in network architectural design.
Figure 4.17 also shows that the bandwidth usage on weekends is almost flat without
exhibiting the same day-night pattern observed on weekdays. This is because fewer
users use the network during weekends compared to weekdays.
A network usage analysis conducted for the student residence of a university shows
interesting findings [17]. The study reveals that peak network usage occurs around
midnight, while off-peak hours are in the early morning typically around 6–7 am.
Notably, student residents exhibit similar levels of activity on weekends and
weekdays, indicating a consistent network usage pattern throughout the week. In
terms of overall traffic, incoming traffic dominates, accounting for approximately
80%, while outgoing traffic constitutes around 20%.
From the application perspective, the study mentioned above [17] reports that
HTTP-related applications contribute to one third of the overall traffic. Video ser-
vices including YouTube account for roughly one fifth of the traffic. Social network
applications including Skype represent around one tenth. These statistics are sum-
marized in Table 4.5. As these statistics were measured many years ago, they do not
reflect the current popularity of applications such as TikTok and Zoom. Nevertheless,
these statistics provide some insights into historical application usage patterns.
From the day-to-day perspective, it is seen from Fig. 4.17 that the bandwidth usage on
weekdays from Monday to Friday exhibits a high degree of similarity and consistency.
However, there is a notable distinction between the bandwidth usage on weekdays
and weekends, primarily because a significant portion of network users do not use
the network during weekends.
The traffic analysis for the student residence discussed above [17] shows a con-
sistent day-night pattern across the whole week from Monday through Sunday. This
consistency is attributed to the active network usage by students, even on weekends.
The results from the process of identifying, characterizing, and developing traffic
flows are integrated to form a full traffic flow specification known as flowspec. Let
us describe flowspecs from three perspectives:
• Flow analysis as part of requirements analysis,
• QoS management, and
• Traffic routing for efficient traffic forwarding.
This section primarily focuses on the flowspec from the flow analysis perspective for
the purpose of top-level logical network planning. Flowspecs for QoS management
and routing serve specific component-based network architecture.
4.6 Flow Specification 115
From the flow analysis perspective, a flowspec describes traffic flows of a network,
and the performance requirements and prioritization of the flows.
The performance requirements for a flow can be simply classified into three types:
best-effort, predictable, and guaranteed requirements. They describe how critical the
flow is (e.g., mission-critical, safety-critical, or time-critical). A specific performance
threshold or range can also be given. When specifying traffic flows, the requirements
for a composite flow aggregated from multiple individual flows can be integrated
from the requirements for individual flows. The combined requirements of all flows
can later be used for performance management, e.g., through the implementation of
QoS mechanisms like DiffServ.
To characterize different levels of performance requirements for traffic flows, a
flowspec can be presented in a one-part, two-part, or multi-part form. A one-part
flowspec, also known as a unitary flowspec, provides a basic level of detail and
performance requirements. As we move to two-part and multi-part flowspecs, the
level of detail and performance requirements increases. The three types of flowspecs
and their characteristics are summarized in Table 4.6.
One-Part Flowspec
A one-part flowspec lists traffic flows with best-effort performance requirements.
Best-effort performance implies no specific requirements are specified because all
network services and flows are provisioned with best effort by default in computer
networking. Therefore, a one-part flowspec suits general traffic flows that do not
have specific performance requirements. Typical examples that can be characterized
by using a one-part flowspec include general web browsing, mail services, FTP
downloading, and network printing. These types of applications typically do not
require specific performance guarantees. They can operate effectively with best-
effort services as they can dynamically adapt to available network resources such as
bandwidth.
Two-Part Flowspec
A two-part flowspec describes traffic flows with two parts of requirements:
• One part for predictable performance requirements, and
• The other part for best-effort performance requirements.
Therefore, a two-part flowspec can be seen as a natural extension of the one-part
flowspec by adding traffic flows with predictable performance requirements.
Predictable performance requirements, such as delay and jitter, are necessary for
certain soft real-time applications like video streaming and voice services. These
applications are sensitive to delay and jitter because they require timely delivery of
data packets to maintain their QoS. However, video and voice services can tolerate
a certain range of delay and jitter. Occasional losses of a few video frames may
not significantly impact the user experience in general video applications, such as
YouTube. Therefore, it is not necessary to provide guaranteed performance for delay
and jitter in such cases.
Multi-Part Flowspec
In addition to best-effort and predictable requirements, there are cases where guar-
anteed performance requirements need to be considered. The addition of guaranteed
performance requirements to the two-part flowspec results in a multi-part flowspec.
Guaranteed requirements are specifically developed for critical applications or ser-
vices that demand a certain level of performance assurance. These requirements
define specific thresholds or ranges that must be met to ensure the desired perfor-
mance. For example, a critical application may require a maximum delay of 20 ms
and a minimum bandwidth of 100 kbps at all times.
Flowspec Algorithm
When multiple flowspecs have been developed, how do we combine the requirements
from all these flowspecs? The flowspec algorithm provides a set of rules to combine
the requirements. These rules address the best-effort, predictable, and guaranteed
performance requirements specified in the flowspecs:
R1 Add up the capacity requirements of all flows to form the overall capacity require-
ment for the network.
R2 For performance characteristics other than bandwidth, such as delay, jitter, and
reliability, the best performance requirements among all predictable flows are
selected and applied to these flows. This rule ensures that the flowspecs with
predictable performance requirements are met with optimized performance.
R3 Guaranteed flows must be handled individually by considering their specific
requirements separately.
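Rules R1–R3 can be sketched as a small combining function. The dictionary fields and the choice of the minimum delay/jitter as the "best" requirement are illustrative:

```python
def combine_flowspecs(flows):
    """Apply rules R1-R3 to a list of flow requirement dicts.

    Each flow has a 'type' ('best-effort', 'predictable', or 'guaranteed')
    and a 'capacity_kbps'; predictable flows also carry 'delay_ms' and
    'jitter_ms' requirements.
    """
    # R1: the overall capacity is the sum over all flows.
    capacity = sum(f["capacity_kbps"] for f in flows)
    # R2: apply the best (most stringent) delay/jitter among predictable flows.
    predictable = [f for f in flows if f["type"] == "predictable"]
    best = {}
    if predictable:
        best = {"delay_ms": min(f["delay_ms"] for f in predictable),
                "jitter_ms": min(f["jitter_ms"] for f in predictable)}
    # R3: guaranteed flows are handled individually, so keep them as-is.
    guaranteed = [f for f in flows if f["type"] == "guaranteed"]
    return {"capacity_kbps": capacity, "predictable": best,
            "guaranteed": guaranteed}
```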
By applying these rules to all flowspecs, the flowspec algorithm determines the
overall capacity requirements and performance characteristics needed for QoS man-
agement. Rule R1 applies to all types of flowspecs including one-part, two-part,
and multi-part flowspecs. Rule R2 is used for predictable flows found in two-part
and multi-part flowspecs, whereas Rule R3 applies only to the guaranteed flows in
multi-part flowspecs.
[Figure: the flowspec algorithm, where Rule R1 sums the capacity of all flows, Rule
R2 selects the most stringent requirement (e.g., the maximum) across predictable
flows, and Rule R3 handles each guaranteed flow individually]
In IntServ QoS management, flowspecs help each router that participates in an IntServ
session to determine whether it has sufficient resources to meet the IntServ QoS
requirements. For an identified flow that requires IntServ QoS control, the corre-
sponding flowspec must describe
• What the required QoS is, and
• What the characteristics of the QoS are.
These two aspects are described by the Traffic SPECification (TSPEC) and
Request SPECification (RSPEC), respectively. Both TSPEC and RSPEC are essential
components of flowspecs in IntServ. Detailed technical specifications about TSPEC
and RSPEC can be found in the IETF RFC 2210 [18] and RFC 2215 [19]. These
RFCs provide comprehensive explanations of the TSPEC and RSPEC formats, their
parameters, and their usage within the context of IntServ. The use of TSPEC and
RSPEC in QoS management will be further discussed in a separate chapter on net-
work performance architecture.
Routers are responsible for making routing decisions and forwarding traffic over
networks. In modern IP networks, routers have evolved to be more powerful with
additional capabilities such as traffic management, security policy enforcement, and
other functions beyond basic routing. These capabilities allow network operators
to apply various rules and actions to packets based on specified criteria defined by
network policies. Such rules are known as match rules because their use is through
matching multiple fields of the packet header. Traffic classification and shaping, as
well as other traffic management actions can be associated with these matching rules.
To make traffic routing more efficient with required actions such as those men-
tioned above, a flowspec that a router receives is expected to have well-defined match
criteria. For this purpose, the IETF RFC 8955 [20] has defined a flowspec as “an
n-tuple consisting of several matching criteria that can be applied to IP traffic”. A
packet that matches all the specified criteria is said to match the defined flowspec.
In the IETF RFC 8955 [20], this n-tuple flowspec is encoded into the Network
Layer Reachability Information (NLRI) of the Border Gateway Protocol (BGP). The
encoding format can be used “to distribute (intra-domain and inter-domain) traffic
Flow Specifications for IPv4 unicast and IPv4 BGP/MPLS VPN services”.
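The n-tuple matching can be sketched as below: a packet matches a flowspec only if every specified criterion matches. The field names are illustrative, and the actual RFC 8955 encoding into BGP NLRI is a binary format not modeled here:

```python
import ipaddress

def matches(flowspec: dict, packet: dict) -> bool:
    """True iff the packet satisfies every criterion in the flowspec."""
    for field, criterion in flowspec.items():
        if field.endswith("_prefix"):
            # Prefix criteria: the packet address must fall in the network.
            addr = ipaddress.ip_address(packet[field[:-len("_prefix")]])
            if addr not in ipaddress.ip_network(criterion):
                return False
        elif packet.get(field) != criterion:
            # All other criteria require exact matches.
            return False
    return True

# A flowspec matching UDP traffic to 192.0.2.0/24, destination port 53:
spec = {"dst_prefix": "192.0.2.0/24", "protocol": 17, "dst_port": 53}
```

A router holding such a rule could then attach traffic-management actions (rate limiting, redirecting, discarding) to all packets for which `matches` returns true.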
4.7 Summary
As part of the requirements analysis process, flow analysis identifies and characterizes
traffic flows and further develops a set of flow specifications that highlight the QoS
performance requirements of these flows. The developed flow specifications provide
insights into network architecture and QoS management mechanisms. In addition
to serving network capacity planning, they are particularly important for network
applications and services that have predictable and guaranteed QoS performance
requirements.
From the descriptions provided in various RFCs, this chapter has established that
a traffic flow is a sequence of packets with common attributes, measured during a
period of time in a single session of an application, characterized at a measurement
point, and spans from a source to a destination in an end-to-end manner. In the
presence of network hierarchy, multiple individual flows may be aggregated to form
a composite flow.
If a flow does not have specific performance requirements, it is treated as a best-
effort flow and thus can be managed with best-effort service. This is the default
configuration for traffic flows and network services in computer networking when
no performance requirements are specified. QoS performance requirements for a
flow may include capacity, delay, reliability, manageability, availability, and other
factors. In traditional networking, these requirements are often addressed through
over-provisioning of bandwidth. However, in modern computer networking where
the number of network services is increasing, simply adding more bandwidth is
insufficient to provide predictable and guaranteed services, especially for flows that
require end-to-end IntServ QoS support. Therefore, developing QoS performance
requirements is essential for flow prioritization and overall network QoS manage-
ment.
An effective approach to flow analysis is to map network applications and services
into a well-established traffic flow model. This chapter has discussed four popular
flow models: peer-to-peer, client-server, hierarchical client-server, and distributed
computing flow models. Conventional terminal-host traffic flows are no longer pop-
ular in modern networking although they still appear in some networks. They are
treated as a special type of client-server flows in this chapter.
Traffic load and behavior can affect the performance requirements of traffic flows.
When estimating traffic load, considerations should include protocol overhead as well
as the interactions between protocols used by an application. Also, various patterns
of traffic, users, and applications should be taken into account, such as peak and
off-peak traffic requirements.
The flowspec developed from flow analysis describes the best-effort, predictable,
and guaranteed performance requirements. The flowspec algorithm combines these
requirements using the flowspec rules, forming the overall QoS performance
requirements for the network. These results will be used for network architecture
planning.
References
1. Brownlee, N., Mills, C., Ruth, G.: Traffic flow measurement: Architecture. RFC 2722, RFC
Editor (1999). https://doi.org/10.17487/RFC2722
2. Rajahalme, J., Conta, A., Carpenter, B., Deering, S.: IPv6 flow label specification. RFC 3697,
RFC Editor (2004). https://doi.org/10.17487/RFC3697
3. Quittek, J., Zseby, T., Claise, B., Zander, S.: Requirements for IP flow information export
(IPFIX). RFC 3917, RFC Editor (2004). https://doi.org/10.17487/RFC3917
4. Handelman, S., Stibler, S., Brownlee, N., Ruth, G.: RTFM: New attributes for traffic flow
measurement. RFC 2724, RFC Editor (1999). https://doi.org/10.17487/RFC2724
5. Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., Weiss, W.: An architecture for differ-
entiated services. RFC 2475, RFC Editor (1998). https://doi.org/10.17487/RFC2475
6. McCabe, J.D.: Network Analysis, Architecture, and Design (3rd ed.). Morgan Kaufmann Pub-
lishers, Burlington, MA 01803, USA (2007). ISBN 978-0-12-370480-1
7. Oppenheimer, P.: Top-Down Network Design (3rd ed.). Cisco Press, Indianapolis, IN 46240,
USA (2011). ISBN 978-1-58720-283-4
8. Huston, G.: Next steps for QoS architecture. RFC 2990, RFC Editor (2000). https://doi.org/10.
17487/RFC2990
9. Braden, R., Clark, D., Shenker, S.: Integrated services architecture. RFC 1633, RFC Editor
(1994). https://doi.org/10.17487/RFC1633
10. Shenker, S., Partridge, C., Guerin, R.: Specification of guaranteed quality of service. RFC 2212,
RFC Editor (1997). https://doi.org/10.17487/RFC2212
11. Wroclawski, J.: Specification of the controlled-load network element service. RFC 2211, RFC
Editor (1997). https://doi.org/10.17487/RFC2211
12. Zhang, Y.F., Tian, Y.C., Fidge, C., Kelly, W.: Data-aware task scheduling for all-to-all compari-
son problems in heterogeneous distributed systems. J. Parallel Distrib. Comput. 93–94, 87–101
(2016)
13. Zhang, Y.F., Tian, Y.C., Kelly, W., Fidge, C.: Scalable and efficient data distribution for dis-
tributed computing of all-to-all comparison problems. Futur. Gener. Comput. Syst. 67, 152–162
(2017)
14. Cain, B., Deering, S., Kouvelas, I., Fenner, B., Thyagarajan, A.: Internet group management
protocol, version 3. RFC 3376, RFC Editor (2002). https://doi.org/10.17487/RFC3376
15. Vida, R., Costa, L.: Multicast listener discovery version 2 (MLDv2) for IPv6. RFC 3810, RFC
Editor (2004). https://doi.org/10.17487/RFC3810
16. Holbrook, H., Cain, B., Haberman, B.: Using Internet group management protocol version 3
(IGMPv3) and multicast listener discovery protocol version 2 (MLDv2) for source-specific
multicast. RFC 4604, RFC Editor (2006). https://doi.org/10.17487/RFC4604
17. Lam, A.: Network usage analysis at student residence. Online report (2011). https://www.
cityu.edu.hk/its/news/2011/06/27/network-usage-analysis-student-residence. Accessed 26
Jan 2023
18. Wroclawski, J.: The use of RSVP with IETF integrated services. RFC 2210, RFC Editor (1997).
https://doi.org/10.17487/RFC2210
19. Shenker, S., Wroclawski, J.: General characterization parameters for integrated service network
elements. RFC 2215, RFC Editor (1997). https://doi.org/10.17487/RFC2215
20. Loibl, C., Hares, S., Raszuk, R., McPherson, D., Bacher, M.: Dissemination of flow specification
rules. RFC 8955, RFC Editor (2020). https://doi.org/10.17487/RFC8955
Part II
Network Architecture
Following network analysis in the previous part, this part conducts architectural
planning for large-scale computer networks. It will begin with an overall network
architecture design. Then, component-based network architecture will be investi-
gated by following international standards and best industrial practices. The inves-
tigations will cover important network components including addressing, routing,
performance, management, and security.
Chapter 5
Network Architectural Models
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024
Y.-C. Tian and J. Gao, Network Analysis and Architecture, Signals and
Communication Technology, https://doi.org/10.1007/978-981-99-5648-7_5
The relationships among the various physical and functional components of a network are complex. This
chapter discusses network architecture using the top-down methodology, systems
approach, and hierarchical network architecture models. It also includes geograph-
ical models, functional models, flow-based models, and component-based models
of networks. Enterprise edge topology and redundant network models will also be
discussed in this chapter.
[Figure: the hierarchical core/distribution/access architectural model, with a core layer for bulk transfer, a distribution layer for policy-based connectivity, and an access layer connecting end users across sites.]
The distribution layer typically implements functions such as:
• Traffic forwarding policies, for example, to forward traffic from a specific network
out of one interface of a router whilst forwarding all other traffic out of another
interface of the router;
• Address or area aggregation through supernetting, thus improving routing efficiency;
• The definition of broadcast and multicast domains through appropriate segmenta-
tion of the network;
• Virtual Local Area Network (VLAN) traffic routing;
• Media transition, e.g., from Ethernet to Asynchronous Transfer Mode (ATM);
• Security mechanisms;
• Route redistribution between routing domains if different routing protocols are
used; and
• Demarcation between static and dynamic routing.
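Address aggregation through supernetting, listed above, can be illustrated with Python's standard `ipaddress` module; the /24 prefixes below are arbitrary examples, not addresses from the book:

```python
import ipaddress

# Four contiguous /24 networks used at the access layer (example prefixes).
subnets = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]

# The distribution layer can aggregate them into a single supernet, so one
# route is advertised towards the core instead of four.
aggregated = list(ipaddress.collapse_addresses(subnets))
print(aggregated)  # [IPv4Network('192.168.0.0/22')]
```

Advertising the single /22 instead of four /24s is exactly the routing-efficiency gain the bullet point refers to.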
The distribution layer is the demarcation point between the core and access layers. It generally has no direct connection to end users unless purposely designed otherwise. Therefore, it usually consolidates traffic flows from the access layer.
Performance management for individual and consolidated flows can be implemented
here.
The Access Layer
The access layer in computer networks connects users and their applications to the
network via low-end switches and wireless access points. Most traffic flows origi-
nate and sink at the access layer. They can be treated easily on an individual basis.
Therefore, access lists and packet filters can be implemented here. In a campus network environment, the access layer can also be designed to manage segmentation and micro-segmentation.
Fig. 5.2 Full- and partial-mesh architectural models for core layer
As the core layer provides high reliability and availability for high-speed bulk trans-
port of traffic, it must be designed with redundancy. Two options of the core layer
topology are full-mesh and partial-mesh topology, as shown in Fig. 5.2. Full-mesh
topology offers the highest reliability due to multiple paths between any pair of
routers. But it is relatively expensive and also complex to manage. Also, full-mesh
topology lacks scalability. Therefore, it is not suitable for a core with many sites that
are interconnected through WANs.
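The scalability limit of a full mesh follows from simple combinatorics: n routers need n(n-1)/2 point-to-point links, which grows quadratically. A quick sketch (site counts chosen arbitrarily):

```python
def full_mesh_links(n: int) -> int:
    """Point-to-point links needed to fully mesh n routers: n(n-1)/2."""
    return n * (n - 1) // 2

for n in (4, 10, 50):
    print(f"{n} sites: {full_mesh_links(n)} links")
# 4 sites need 6 links, 10 need 45, and 50 already need 1225,
# which is why a full mesh is impractical for a WAN core with many sites.
```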
A simpler and cost-effective alternative is the partial-mesh topology. In partial
mesh, routers have fewer links to other routers compared to full mesh. The scalability
of partial mesh is also improved over full mesh. Nevertheless, how to develop a partial
mesh still needs to be carefully considered. For example, the partial mesh shown in
Fig. 5.2b is preferable over a loop topology where the four routers form a circular
connection. The loop topology with many sites is not recommended in general. This
is due to the fact that if a link is down there will be many hops between routers on
opposite sides of the broken loop, negatively impacting overall network performance.
The distribution layer handles many servers and specialized devices, and also imple-
ments performance mechanisms. Therefore, it also requires redundant connections to
both the core and access layers. A full or even partial mesh is not common between routers at the distribution layer. Rather, a distribution-layer router is connected to
more than one core-layer router, and an access-layer router is connected to multiple
distribution-layer routers. This forms a hierarchical redundant topology for the dis-
tribution layer, as illustrated in Fig. 5.3. In this figure, if a link failure occurs between
the distribution layer and either the access layer or core layer, there is always an alter-
native link available for transporting traffic. This significantly improves the overall
reliability of the network.
A large-scale network can be viewed from the functional perspective as shown pre-
viously with the core/distribution/access model. It can also be approached from the
geographical perspective with natural physical separation of Local Area Network
(LAN), Metropolitan Area Network (MAN), and WAN. This motivates the popular
use of another high-level topological model: the LAN/MAN/WAN architecture, as
shown in Fig. 5.4. Some comparisons of LAN, MAN, and WAN are tabulated in
Table 5.1.
Similar to the functional core/distribution/access architecture introduced by Cisco, the LAN/MAN/WAN architecture also has a hierarchical topology. At the
bottom layer of this topology, LANs implement local user access to the network.
MANs, on the other hand, interconnect multiple sites or campus networks within
the same metropolitan area through the MAN infrastructure such as twisted-pair
and fibre-optic cables, some of which may be owned by third parties. The WAN
interconnects geographically remote sites through WAN infrastructure like fibre-
optic cables, radio waves, and satellites. WAN infrastructure is typically owned by
third parties.
However, different from the functional core/distribution/access architecture, the
LAN/MAN/WAN architecture concentrates on the physical locations of network components. In particular, it focuses on the boundaries between LANs, MANs, and the WAN, as well as the specific features and requirements associated with these
boundaries. For example, border routers of a campus network are installed on the boundary between the campus network and external networks.
[Figure: the LAN/MAN/WAN architectural model. Sites within a city (e.g., Site A and Site B in City I) are interconnected by MANs, and cities (City I, City II, City III) are interconnected through the WAN.]
In the LAN/MAN/WAN architectural model, the LAN layer does not necessarily
cover LANs only. It often extends to multiple buildings and floors. The resulting
LANs can be interconnected with routers at one or more layers in a hierarchical
manner.
It is common for the LAN/MAN/WAN architecture to be implemented without
a separate MAN layer, leading to a LAN/WAN model. In this LAN/WAN model,
multiple sites of an enterprise network within the same metropolitan area are inter-
connected through WAN connections. Other sites located in different metropolitan
areas are also connected via WAN connections.
The edge of an enterprise network is expected to ensure reliable and secure connectiv-
ity to external networks and the Internet. It should also provide authorized users with
secure access to the enterprise network from outside through a public WAN connec-
tion or the Internet. Therefore, enterprise edge architecture may include redundant
WAN segments, multihomed Internet connectivity, and secure Virtual Private Net-
work (VPN) links.
WAN links are critical for a reliable core layer in the core/distribution/access archi-
tecture. This necessitates redundant WAN connections at the edge of an enterprise
network. As shown previously in Fig. 5.2, a full-mesh or partial-mesh topology
should be designed to provide redundant WAN links.
When a full or partial mesh is provisioned for WAN connections, it is necessary to
ensure that the backup links are indeed functional paths. This requires the implemen-
tation of circuit diversity, which refers to the use of different physical paths. Different
ISPs may use the same WAN infrastructure from a third party. In such cases, if two
such ISPs are chosen for WAN backup, the backup may not work as expected: if the WAN link from one ISP fails, the WAN link from the other ISP may fail at the same time.
Meanwhile, it is also necessary to ensure that the local cabling system of the
WAN segments has different physical paths for backup purposes. If one physical
path encounters a failure, the alternative physical path can serve as a functional
backup. Make sure that the links to the ISPs are reliable with redundant physical
cabling.
Multihoming is a network technology that provides more than one connection for a
system to access and offer network services. In the context of network connectivity, the term refers to multiple network connections, specifically more than one Internet entry point. Multihoming offers Internet
redundancy and thus enhances the reliability and availability of network access. It
is worth mentioning that the same term multihoming is also used to describe multi-
homed servers and multihomed Content Delivery Networks (CDNs). A multihomed
server is connected to multiple networks or has multiple network interfaces, allowing
it to communicate and provide services through different network paths. A multi-
homed CDN has more than one Point of Presence (PoP) or edge server, enabling it to
deliver content and services to end users through the optimal and geographically closest PoP.
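Nearest-PoP selection, as a multihomed CDN might perform it, can be sketched with the haversine distance; the PoP locations and user coordinates below are invented for illustration:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

# Hypothetical PoP locations as (lat, lon).
pops = {"sydney": (-33.87, 151.21), "singapore": (1.35, 103.82), "london": (51.51, -0.13)}

def closest_pop(user):
    """Pick the PoP geographically closest to the user."""
    return min(pops, key=lambda name: haversine_km(user, pops[name]))

print(closest_pop((-27.47, 153.03)))  # a Brisbane user is served from "sydney"
```

Real CDNs weigh network distance, load, and cost as well, but geographic proximity is the core idea described above.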
There are different ways to achieve multihoming for Internet connectivity. The
enterprise edge can be connected to a single ISP through a single edge router or mul-
tiple edge routers. Alternatively, it can be connected to two different ISPs. Therefore,
there are four basic options:
(1) A single edge router connecting to two routers from the same ISP,
(2) Two edge routers connecting to two routers of the same ISP via different links,
(3) A single edge router connecting to two routers each from a different ISP, and
(4) Two edge routers connecting to two routers each from a different ISP.
These options are depicted in Fig. 5.5.
Each of the four options presented above has its own advantages and disadvan-
tages. The selection of the appropriate option depends on the specific requirements
of the network. In general, working with a single ISP is simpler and easier compared
to dealing with multiple ISPs, but it lacks ISP redundancy. Choosing an option with
a single edge router is a cost-effective solution, but it also introduces a single point
of failure. In comparison, using multiple edge routers to connect to multiple ISPs
offers the highest reliability for Internet connectivity, but it is more expensive and
complex.
[Fig. 5.5: the four multihoming options. Option 1: one ISP, one edge router; Option 2: one ISP, two edge routers; Option 3: two ISPs, one edge router; Option 4: two ISPs, two edge routers.]
A VPN is built on tunneling, a technique that encapsulates one network protocol within another. A typical use case of tunneling is to carry IPv6 packets over
IPv4 networks. But for a secure VPN connection, the data packets are encapsulated
and encrypted for their secure transport over third-party networks or the Internet.
VPNs are a cost-effective solution for users and telecommuters to connect to an enterprise intranet or extranet. Because local Internet connectivity is widely available
worldwide, connecting to an enterprise network through VPN over the Internet is
simple. In general, a VPN server needs to be set up within the enterprise network
for users to connect from outside. Also, a VPN client needs to be installed on the
computer of the VPN user. The VPN server and client work together to establish
and maintain a secure point-to-point VPN tunnel between the user and the enterprise
network. A typical scenario of using VPN for secure connections to an enterprise
network is shown in Fig. 5.6. Further details on the VPN technology will be discussed
later in the chapter on security component architecture.
Network architecture can be analyzed from the traffic flow perspective. From the flow
characteristics developed through traffic flow analysis in the previous chapter, this
section focuses on architectural features in flow-based architectural models. Four such models are discussed below.
[Fig. 5.6: a typical VPN scenario, with a VPN server inside the enterprise network accepting secure tunnels from remote users over the Internet.]
From the peer-to-peer traffic flow model discussed in the previous chapter, a peer-
to-peer architectural model can be developed, in which no centralized control exists.
Depending on the application scenarios, requirements, and/or constraints, the peer-
to-peer architecture can be either fully meshed or partially meshed. This is shown in
Fig. 5.7, in which a full-mesh peer-to-peer core network and a partial-mesh peer-to-
peer ad-hoc network are illustrated.
In the peer-to-peer core network architecture, the functions and features of the
architecture are pushed to the edge of the enterprise network. Therefore, architectural
planning should focus on the core layer and edge of the enterprise network.
In the ad-hoc network architecture, since there is a lack of support from a fixed
infrastructure, the architectural features should concentrate on the connectivity of
end nodes to the network to ensure effective network communication. In such cases,
various factors will need to be considered such as bandwidth resources, throughput
performance, latency, packet loss, and energy consumption.
[Fig. 5.7: a full-mesh peer-to-peer core network and a partial-mesh peer-to-peer ad-hoc network, each illustrated with four nodes.]
[Figures: flat and hierarchical client-server architectural models, with clients connecting through one or more networks to servers, a server farm, or a server LAN.]
In the client-server architectural model, architectural features, functions, and characteristics are applied to server locations, interfaces to client LANs, and client-server flows.
However, in hierarchical client-server systems, multiple layers of servers work
together to fulfill the requirements of the clients. This introduces traffic flows and
interactions not only between clients and servers but also between the servers them-
selves. Therefore, when designing the hierarchical client-server architecture, it is
necessary to consider architectural features, functions, and characteristics for both
client-server and server-server interactions.
Efficient communication and coordination between servers at different layers
are essential to ensure smooth operation and optimal performance of the hierar-
chical client-server system. This includes considerations such as load balancing,
resource allocation, data synchronization, and inter-server communication protocols.
By addressing the requirements of both client-server and server-server interactions,
the hierarchical client-server architectural model provides a scalable and organized
approach to handling complex network services and applications.
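Load balancing across a server tier, one of the considerations listed above, can be sketched as a simple round-robin dispatcher; the server names are placeholders:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Dispatch client requests across a tier of servers in turn."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        # Each call returns the next server in rotation.
        return next(self._servers)

tier = RoundRobinBalancer(["app1", "app2", "app3"])
print([tier.pick() for _ in range(5)])  # ['app1', 'app2', 'app3', 'app1', 'app2']
```

Production balancers typically add health checks and weighting, but round robin shows the basic mechanism for spreading client-server flows over a server layer.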
The deployment of distributed computing can vary depending on the use cases
and application scenarios. For example, within a university, distributed computing
laboratories are often built as a cluster of nodes within a LAN. In such cases, Ethernet
networking is typically used to interconnect low-end manager and worker nodes, such
as desktop computers. High-end manager and worker nodes, such as workstations,
HPCs, and storage servers, can be interconnected through Infiniband networking
with fiber-optic Infiniband switching, in addition to Ethernet networking.
This dual-networking configuration allows for high-speed data transfer via Infini-
band, while normal network communication is carried out through Ethernet. By
combining these networking technologies, the distributed computing environment
can achieve efficient data transfer and effective network communication, catering to
the requirements of different types of nodes and distributed computing tasks.
The end-to-end service architectural model focuses on the components that sup-
port end-to-end traffic flows within a network. It recognizes that network services
are provisioned end-to-end. The performance of individual network services is also
measured end-to-end. Some critical network services and applications require end-
to-end performance guarantees.
To achieve end-to-end performance guarantees, end-to-end performance manage-
ment such as Integrated Service (IntServ) can be implemented. IntServ allows for
the measurement, control, and assurance of end-to-end performance. Identifying the
specific services or applications that require end-to-end support is part of the system
requirements analysis process. Designing an architectural model that supports the
end-to-end traffic flows for those services or applications becomes a key task in net-
work architecture design. Figure 5.12 shows an example of end-to-end architectural
model that considers all components along the end-to-end path of the traffic flow.
Fig. 5.12 An example of end-to-end architecture model that considers all components along the
path of the traffic flow
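Since the service is provisioned end-to-end, its latency is the sum of the delays contributed by every component on the path. A toy budget check against an assumed SLA target (all figures invented):

```python
# Hypothetical one-way delay contribution of each path component, in ms.
path_delays_ms = {
    "host NIC": 0.1,
    "access switch": 0.5,
    "distribution router": 1.0,
    "core": 4.0,
    "distribution router (far side)": 1.0,
    "server switch": 0.5,
    "server": 2.0,
}

end_to_end_ms = sum(path_delays_ms.values())
sla_target_ms = 10.0
print(f"end-to-end: {end_to_end_ms:.1f} ms, within SLA: {end_to_end_ms <= sla_target_ms}")
```

The point of the sketch is architectural: an end-to-end guarantee can only be met if every component along the flow's path is accounted for, which is what IntServ-style admission control does per flow.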
Internet Customer’s
Internet
network
Customers Extranet Extranet (logical link)
Suppliers
Firewall
Intranet
Employees Extranet Intranet
server server
Collaborators
Appl.
server
The world-public
Intranet
(a) A logical view (b) An implementation example
Internet
Subscriber Subscriber
Subscriber Subscriber Subscriber Subscriber
The focus of our discussions here is on the service-provider
architectural model within enterprise and campus networks.
The functional architectural models described in the previous section address func-
tions from the perspectives of applications, end-to-end services, intranet and extranet,
and subscriber and provider. Therefore, these models do not concentrate on individual
components of the network, but rather spread across multiple network components.
Component-based architectural models can also be seen as functional architectural
models in the sense that they describe how and where each network function is
applied within the network. However, they have a specific focus on the functions
of individual components of the network for network connectivity, scalability, man-
ageability, performance, and security. These component-based architectural models
provide a detailed understanding of the capabilities and requirements of each network
component.
Each network component represents a major type of capacity within the network.
It is supported by a set of mechanisms. These mechanisms encompass a range of
hardware, software, protocols, policies, and techniques designed and deployed in the
network to achieve its capacity under various constraints.
In this book, we will examine the following network components for network architecture: addressing, routing, performance, management, and security.
Will the network satisfy the SLAs and other performance requirements? Will the addressing and routing components help enhance network performance and management? From the systems approach perspective, not only the decomposed individual components but also their interactions and dependencies determine the overall behavior and performance of the network.
In general, trade-offs or balances are necessary among the functional compo-
nents and the multiple mechanisms within each component. They are a fundamental
requirement for network architecture design and overall network planning. By devel-
oping trade-offs or balances, more important services and their underlying mecha-
nisms can be prioritized with sufficient resource support. Meanwhile, all other ser-
vices and their underlying mechanisms are still functional under various network
constraints.
Working with a layer-3 addressing scheme, routing largely determines the efficiency
of end-to-end delivery of data packets. The routing component architecture con-
centrates on the planning of traffic routing and forwarding across networks and the
Internet from the traffic source to the intended destination. This requires choosing appropriate routing protocols, designing route redistribution if multiple routing protocols are used, and considering the potential separation of routing decisions from data forwarding.
There are different categories of routing protocols, each with its own advantages
and limitations. To choose a routing protocol, it is necessary to examine the suitability
of candidate routing protocols based on the performance requirements of the network.
The size or scale of the network will determine the requirements of scalability and
convergence speed when selecting a routing protocol. Some routing protocols, such
as OSPF, are simpler, while others, like BGP, are more complex. Some routing
protocols are more suitable for larger networks with hierarchical structures.
A single routing protocol is generally recommended for an enterprise or campus
network. However, there are situations where more than one routing protocol must be
used, such as when integrating networks from two different companies each using
a different routing protocol. In such cases, it becomes necessary to design route
redistribution between the two different routing protocols.
Software Defined Networking (SDN) has been developed for programmable networking. It separates routing decisions from data forwarding. For a specific network,
it is worth investigating whether SDN is a better option than traditional routing. If
SDN is considered to be a suitable choice, find out whether or not the network is
SDN-ready.
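The control/data-plane separation can be sketched as a tiny match-action flow table: a controller installs rules, and the data plane only performs lookups. This is a minimal illustration, not a real SDN API; string-prefix matching stands in for true longest-prefix matching:

```python
class FlowTable:
    """Data plane: forwards by table lookup only; no routing logic of its own."""

    def __init__(self):
        self.rules = {}  # destination prefix -> output port

    def install(self, dst_prefix, out_port):
        # Called by the controller (the control plane) to push a decision down.
        self.rules[dst_prefix] = out_port

    def forward(self, dst_ip):
        # Longest matching installed prefix wins; unmatched packets are dropped.
        best = max((p for p in self.rules if dst_ip.startswith(p)),
                   key=len, default=None)
        return self.rules.get(best, "drop")

# The controller computes routes and programs the switch.
switch = FlowTable()
switch.install("10.1.", 1)
switch.install("10.1.2.", 2)   # more specific rule wins

print(switch.forward("10.1.2.7"))    # 2
print(switch.forward("10.1.9.1"))    # 1
print(switch.forward("172.16.0.1"))  # drop
```

The key architectural point is that `forward` contains no routing intelligence: all decisions live in the controller, which is what makes the network programmable.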
[Figure: models of network performance, showing single-tier performance metrics such as latency and jitter.]
The requirements for security and privacy are important for all networks. The concept
of security generally refers to the Confidentiality, Integrity, and Availability (CIA) of
network resources and information, encompassing protection against theft, physical
damage, unauthorized access, DoS, and various cyberattacks. The concept of privacy primarily concentrates on preventing unauthorized access to and disclosure of sensitive information. The security component architecture addresses the requirements for both security and privacy. Therefore, it is
essential to investigate what security threats and risks there might be, what security
and privacy mechanisms are available to mitigate these threats, and how these mech-
anisms can be integrated into the network to provide security and privacy protection.
Security risks in networks can originate from both internal and external sources.
Information leakage and inappropriate uses of network resources by internal users
within the network pose significant risks. These risks should be carefully addressed
during various stages of network planning, design, management, and operation.
While network security technologies continue to evolve, networks are increasingly susceptible to external cyberattacks. Although some attacks can be detected, many others are unknown to existing Intrusion Detection Systems (IDSs). To build an
effective security and privacy protection system, it is necessary to develop a deep
understanding of various cyberattacks.
Mechanisms currently available for network security include firewalls, ACLs,
filters, IPsec and other security protocols, cryptography, and security policies. Some
people consider NAT a security mechanism because private IP addresses behind NAT are not visible to external networks. This is a misconception: NAT is not designed for security and should
not be relied on as a security solution. NAT is primarily adopted in IPv4 networks to
conserve public IP address resources by using private IP addresses.
Integrating security mechanisms into the security component architecture is a
complex task due to the diverse range of security risks that need to be addressed.
There is no one-size-fits-all solution for security and privacy protection. Each security
mechanism may be applied to specific security scenarios or targeted areas of the
network. Multiple mechanisms can work together to provide comprehensive security
and privacy protection for the entire network.
In computer networking, a host relies on its gateway, which is the first-hop router, to
communicate with other hosts outside its LAN. If the gateway fails, the host loses its
network connectivity to all networks external to its LAN. Therefore, implementing
redundant gateways is necessary to ensure uninterrupted network connections.
In an enterprise network, there are typically multiple routers that are strategically
placed in different locations to serve different purposes. For example, an edge router
installed at the network boundary establishes a connection between internal and
external networks. If the edge router malfunctions, the entire enterprise network
may lose its Internet connectivity. To mitigate this risk, having a backup edge router
becomes essential, particularly when Internet connectivity is critical to the operations
of the organization.
To achieve router redundancy, physical backup routers must be deployed at dif-
ferent network locations. They must also be reachable by workstations, switches,
and other routers. Furthermore, mechanisms should be designed to effectively manage multiple routers and provide router redundancy, for example, to find an alternative router to forward traffic in the event of a router failure.
In the following, we will discuss mechanisms related to first-hop redundancy, i.e.,
workstation-to-router redundancy. These mechanisms manage multiple routers and
enable hosts to discover an alternative gateway should the primary router they are
currently using fail.
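The basic first-hop failover idea, where a host switches to a backup gateway once the primary stops responding, can be sketched as follows; the reachability probe is stubbed out and the addresses are examples:

```python
def select_gateway(gateways, is_reachable):
    """Return the first reachable gateway from an ordered primary/backup list."""
    for gw in gateways:
        if is_reachable(gw):
            return gw
    raise RuntimeError("no gateway reachable")

# Stub probe: pretend the primary gateway is down.
down = {"192.168.10.1"}
probe = lambda gw: gw not in down

gw = select_gateway(["192.168.10.1", "192.168.10.2"], probe)
print(gw)  # 192.168.10.2 -- the host fails over to the backup
```

The protocols discussed below (HSRP, VRRP, GLBP) automate exactly this decision so that hosts need no per-host probing or reconfiguration.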
To ensure uninterrupted connectivity and access to external networks and the Internet,
it is critical to establish reliable workstation-to-router connectivity in network archi-
tecture design. This highlights the importance of workstation-to-router redundancy
in network planning.
For workstations to communicate with other networks and the Internet, they need
to discover a suitable router. In scenarios where the currently used router becomes
unreachable for any reason, it becomes necessary for the workstation to identify
an alternative router. Various methods have been developed to facilitate worksta-
tion router discovery within a network. In some implementations, workstations are
configured with an explicit and static setting of default (or primary) gateway and
backup (or secondary) gateway, allowing workstations to automatically switch to the
backup gateway if the primary gateway becomes unavailable. Other implementations
let workstations discover a router automatically by using some protocols. Example
protocols include:
• Address Resolution Protocol (ARP), which is a protocol implemented in operating
systems, allowing workstations to discover and associate IP addresses with MAC
addresses within the same local network.
Fig. 5.16 The physical and logical views of HSRP in a local area network 192.168.10.0/24 with
the default gateway 192.168.10.1/24. All hosts have IP addresses on this network 192.168.10.0/24
Logically, traffic from the hosts to the Internet is always routed through this virtual router. But physically, the traffic is
forwarded by the active router in the HSRP group. As mentioned previously, if a
changeover of the active router occurs, HSRP ensures that network traffic to the
Internet is routed through the new active router. This minimizes interruptions in the
network connectivity of the hosts to the Internet.
HSRP Datagram
The HSRP datagram is 40 bytes long. Its format is summarized in Fig. 5.17 [6].
Some of the octets in the datagram are described below:
• Op Code: The 1-octet Op Code specifies the type of message in the packet. The
values 0, 1, and 2 represent Hello, Coup, and Resign messages, respectively. A
Hello message from a router indicates its capability of becoming the active or
standby router. A Coup message is sent when a router wishes to become the active
router. A Resign message indicates that a router wishes to resign from its active
router role.
• State: Each of the standby group routers maintains a state machine. The 1-octet
State indicates the current state of a router at the time of sending the message.
Possible values of the State are 0 for Initial, 1 for Learn, 2 for Listen, 4 for Speak,
8 for Standby, and 16 for Active.
• Hellotime: The 1-octet Hellotime is used only in Hello messages. It indicates the
approximate period of time measured in seconds between Hello messages. If it is
not configured or learned, it is recommended that a default value of 3 seconds be
set.
• Holdtime: Also meaningful only in Hello messages, the Holdtime field is 1 octet.
It indicates the amount of time measured in seconds that the current Hello message
is considered valid. If the Holdtime is not configured or learned, it is recommended
to set a default value of 10 seconds.
• Priority: The 1-octet Priority field is used to elect the active and standby routers. A
router with a higher numerical value of priority supersedes other routers with lower
numerical priority values. In the case of two routers with the same priority, the
router with a higher IP address wins. This priority setting should not be confused
150 5 Network Architectural Models
with the priority settings in real-time system scheduling, where a smaller integer
value represents a higher task priority.
• Authentication: The Authentication field is 8 octets long and contains a clear-text 8-character reused password. If no authentication data is configured, a default
value is recommended, which is
0x63 0x69 0x73 0x63 0x6F 0x00 0x00 0x00
It is worth mentioning that while the authentication field helps prevent misconfig-
uration of HSRP, it does not provide security. HSRP can be easily subverted on
the LAN, for example, through DoS attacks. However, “it is difficult to subvert
HSRP from outside the LAN as most routers will not forward packets addressed to the all-routers multicast address (224.0.0.2)” [6].
• Virtual IP address: The last field in the HSRP datagram is the 4-byte virtual IP
address used by the group. During HSRP operation, at least one router in a standby
group must know the virtual IP address. If a router in a standby group does not
know the virtual IP address, it stays in the Learn state in the State field. In the Learn
state without knowing the virtual IP address, a router is not allowed to Speak (in
the State field), meaning that it cannot send periodic Hello messages. Otherwise, if
a router knows the virtual IP address but is neither the active router nor the standby
router, it remains in Listen state in the State field.
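To make the datagram layout and the election rule concrete, the following Python sketch packs and parses an HSRP version 1 message and elects the active router. It assumes the field order of RFC 2281 (the version, op code, group, and reserved octets are not detailed above and follow the RFC); the function and variable names are illustrative.

```python
import socket
import struct

# HSRP v1 layout per RFC 2281: version, op code, state, hellotime, holdtime,
# priority, group, and reserved (1 octet each), followed by 8 octets of
# authentication data and the 4-octet virtual IP address (20 bytes in total).
HSRP_FORMAT = "!8B8s4s"

# Recommended default authentication data: "cisco" padded with NUL bytes.
DEFAULT_AUTH = bytes([0x63, 0x69, 0x73, 0x63, 0x6F, 0x00, 0x00, 0x00])

STATES = {0: "Initial", 1: "Learn", 2: "Listen",
          4: "Speak", 8: "Standby", 16: "Active"}

def build_hello(group, priority, virtual_ip, state=16,
                hellotime=3, holdtime=10, auth=DEFAULT_AUTH):
    """Build an HSRP Hello message with the recommended default timers."""
    vip = socket.inet_aton(virtual_ip)
    return struct.pack(HSRP_FORMAT, 0, 0, state, hellotime, holdtime,
                       priority, group, 0, auth, vip)  # version 0, op code 0 (Hello)

def parse(packet):
    """Unpack an HSRP message into a dictionary of named fields."""
    (_version, _opcode, state, hellotime, holdtime, priority,
     group, _reserved, auth, vip) = struct.unpack(HSRP_FORMAT, packet)
    return {"state": STATES.get(state, state), "hellotime": hellotime,
            "holdtime": holdtime, "priority": priority, "group": group,
            "auth": auth, "virtual_ip": socket.inet_ntoa(vip)}

def elect_active(routers):
    """Highest priority wins; ties are broken by the higher IP address."""
    return max(routers, key=lambda r: (
        r["priority"], struct.unpack("!I", socket.inet_aton(r["ip"]))[0]))
```

This sketch covers only packing and election; a real implementation would also send the message over UDP to the all-routers multicast address 224.0.0.2.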
Other First-hop Redundancy Protocols
In addition to the Cisco HSRP, other first-hop redundancy protocols are also available
with different use case scenarios. Two such protocols are VRRP, an open IETF
standard, and GLBP, another Cisco proprietary protocol.
VRRP is an open standard that can be used for first-hop redundancy in networks
with equipment from multiple vendors. It is specified in IETF RFC 5798 (March
2010) [7]. Similar to HSRP, VRRP is also configured for a group of routers, i.e.,
gateways. But VRRP differs from HSRP in several aspects. Firstly, the master router
in VRRP is manually configured by the network administrator. Also, unlike HSRP
that uses a virtual IP address, VRRP uses the real IP address of the master’s interface
that connects to the subnet as the default gateway for clients. The backup members
of the VRRP group keep communicating with the master router. If the master router
is detected to be down, the backup members of the VRRP group will take over the
role of the gateway to forward traffic. When the master router recovers from failure,
it resumes its role as the gateway for forwarding traffic. It is interesting to note that
when a backup gateway is functioning as the active gateway to forward traffic, the
IP address it uses still belongs to the master gateway, which is the owner of the IP
address.
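The takeover behavior just described can be sketched in Python as follows. The class, its timer value, and its method names are illustrative and do not follow the exact state machine or timer arithmetic of RFC 5798; the point is only that a backup forwards traffic when the master's advertisements stop, while the gateway IP address remains the master's own.

```python
import time

class VrrpBackup:
    """Illustrative model of a VRRP backup router (not an RFC 5798
    implementation): it takes over forwarding when the master's
    advertisements stop, but keeps using the master's real IP address."""

    def __init__(self, master_ip, master_down_interval=3.0):
        self.master_ip = master_ip            # owned by the master router
        self.master_down_interval = master_down_interval
        self.last_advert = time.monotonic()
        self.forwarding = False

    def on_advertisement(self, now=None):
        # The master is alive (or has recovered): return to standby.
        self.last_advert = now if now is not None else time.monotonic()
        self.forwarding = False

    def check(self, now=None):
        # Take over the gateway role once the master times out.
        now = now if now is not None else time.monotonic()
        if now - self.last_advert > self.master_down_interval:
            self.forwarding = True
        return self.forwarding

    def gateway_ip(self):
        # Even while forwarding, the address still belongs to the master.
        return self.master_ip
```
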
GLBP is another Cisco proprietary protocol that can provide first-hop redundancy.
While it shares similarities with HSRP and VRRP, it also has several distinct features.
For example, GLBP uses virtual IP and MAC addresses as in HSRP. Also, similar
to HSRP and VRRP, GLBP maintains a group of routers. However, unlike HSRP
and VRRP, all routers in a GLBP group are active and forward traffic. Therefore,
this unique feature of GLBP allows for load balancing, as implied in the name of the
protocol.
5.6 Redundancy Architectural Models 151
always ready to take over the role of the primary server, and a secondary server will
assume the role should the primary server fail. Once the primary server is restored
from failure, it resumes its role, and the secondary server returns to standby.
• In the active-active mode, two redundant servers are configured to be active. This
configuration is typically related to load balancing, i.e., the two servers share
the workload in normal operation. However, when a server fails, all traffic and
workload will be shifted to the operational server.
DHCP Server Redundancy
In general, redundant DHCP servers are deployed in enterprise networks. This means
that there are multiple DHCP servers in a LAN to ensure redundancy. Also, it is
recommended that DHCP servers maintain mirrored copies of the DHCP database,
which contains IP configuration information.
Where should DHCP servers be placed in a network? Here are some general
guidelines:
• For small networks, place redundant DHCP servers at the distribution layer. This
is based on the understanding that small networks typically do not experience
excessive traffic when communicating with DHCP servers.
• For large networks, it is necessary to limit the traffic between the access layer and
distribution layer. Therefore, place redundant servers at the access layer in large
networks. This arrangement allows for localized DHCP service, reducing the need
for DHCP requests to traverse the network core.
• For large campus networks, it is recommended to place DHCP servers on a different
network segment from where the end systems are located. This often implies that
the DHCP servers are located on the other side of a router. To enable DHCP
functionality in such a scenario, the router must be configured to forward DHCP
request broadcasts.
DNS Server Redundancy
DNS servers can be placed at either the access layer or distribution layer to resolve IP
addresses of FQDNs. While they are important for network communications, they are
less critical compared to DHCP servers. This is because we are able to communicate
with a remote host through its IP address if the IP address is known. Many network
servers are configured with static IP addresses, which are already known to us. For
example, some ISPs have assigned static IP addresses to their mail servers and have
publicly advertised these IP addresses. Nevertheless, it is necessary to plan for DNS
server redundancy because it is not a reasonable assumption that the IP addresses of
all remote sites are known.
It is worth mentioning that there are public DNS servers that can be used as backup
solutions for DNS redundancy. Actually, many organizations, especially small busi-
ness companies, use public DNS servers for DNS services. Two examples of public
DNS servers are:
• Cisco OpenDNS: 208.67.222.222 and 208.67.220.220, and
• Google Public DNS: 8.8.8.8 and 8.8.4.4.
But keep in mind that these public or open DNS services may not always be reliable.
These services may be discontinued at any time. For example, Norton ConnectSafe,
previously a well-regarded free public DNS provider, closed its public DNS services
in November 2018.
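A resolver-fallback arrangement of this kind can be sketched as follows. The internal resolver addresses and the `query` callable are hypothetical placeholders (in practice, the query would be a real DNS lookup over UDP port 53); the public addresses are the ones listed above.

```python
def resolve_with_fallback(hostname, resolvers, query):
    """Try each configured DNS resolver in order until one answers.

    `query` is a callable (resolver_ip, hostname) -> IP string or None.
    Returns the resolver that answered together with its answer."""
    for resolver in resolvers:
        answer = query(resolver, hostname)
        if answer is not None:
            return resolver, answer
    raise LookupError(f"no resolver could answer for {hostname}")

RESOLVERS = [
    "10.0.0.53",        # primary internal DNS server (hypothetical)
    "10.0.1.53",        # redundant internal DNS server (hypothetical)
    "208.67.222.222",   # Cisco OpenDNS, public backup
    "8.8.8.8",          # Google Public DNS, public backup
]
```
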
Redundancy of File and Database Servers
Mission-critical file and database servers require full redundancy to ensure uninter-
rupted operation and avoid loss of valuable data. Typical examples include those in
banks and other financial organizations. For such critical file and database servers,
mirrored servers are highly recommended with separate power supplies and sepa-
rate networks. They hold the same data so that if a server is down or physically
damaged, the mirrored server can become active. To ensure data consistency and
integrity, mirrored servers maintain instant synchronization for time-sensitive appli-
cations like stock exchanges. For non-real-time applications such as student grade
databases, data synchronization can be performed in a batch processing manner,
typically overnight.
For non-critical files and databases, if full redundancy is not feasible due to rea-
sons such as budget constraints and other limitations, consider mirrored or duplexed
hard drives of the servers, allowing for redundancy at the storage level. Many enter-
prise networks and data centers use Storage Area Networks (SANs) to enhance the
reliability and availability of data. SANs are designed for highly reliable access to
large amounts of stored data. Therefore, they provide an option for the redundancy
of file and database servers.
Web Server Redundancy
Redundant web servers are useful for minimizing the website downtime and ensuring
continuous service availability. When the primary web server fails or requires mainte-
nance, the redundant web server takes over seamlessly, thus providing uninterrupted
services to website visitors. The failover process is transparent to users.
An easy way to implement web server redundancy is to use a load balancer, which
distributes incoming traffic across multiple web servers. The load balancer can be a
hardware device or a software package. In the event of a failure, the load balancer
automatically redirects traffic to the operational server, ensuring uninterrupted ser-
vice. The failover process is automatic and almost instant. It is important to consider
redundancy for the load balancer itself to avoid a single point of failure.
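The load-balancer behavior described above can be sketched in Python. This is a minimal round-robin model with health flags, not a production load balancer; the server names are illustrative, and real health checks and connection handling are omitted.

```python
from itertools import cycle

class LoadBalancer:
    """Round-robin load balancer sketch with automatic failover:
    traffic is spread over healthy web servers, and a failed server
    is skipped until it is marked healthy again."""

    def __init__(self, servers):
        self.healthy = {s: True for s in servers}
        self._ring = cycle(servers)

    def mark_down(self, server):
        self.healthy[server] = False

    def mark_up(self, server):
        self.healthy[server] = True

    def pick(self):
        # Try at most one full cycle before concluding all servers are down.
        for _ in range(len(self.healthy)):
            server = next(self._ring)
            if self.healthy[server]:
                return server
        raise RuntimeError("all web servers are down")
```
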
If the redundant web servers are located at two different sites, e.g., one on the
enterprise network and the other in an external data center, consider placing the load
balancer off-site. This can be accomplished by one of the following two options:
• Collocate a redundant hardware load balancer in a high up-time data center to
maintain its high reliability and availability, or
• Use a cloud service to deploy a redundant software load balancer.
The choice between these two options depends on specific application requirements
and constraints.
Another way to implement web server failover is through DNS settings. If the
primary web server becomes inaccessible for a certain duration, the DNS record
fails over to the IP address of the secondary web server. When the primary web
server becomes responsive again, it resumes its role as the main server.
In cases where the websites of an organization are hosted on an infrastructure
provided by an external web hosting service provider, the responsibility for ensuring
the reliability and availability of the web servers lies with the service provider. This
should be clearly defined in the Service Level Agreements (SLAs) between the
organization and the service provider.
Mail Server Redundancy
Mail services are an integral part of fundamental IT services. Modern organizations
rely heavily on mail services for their business. In the higher education sector, which
we are most familiar with, mail services, along with other IT and Internet services,
are essential for normal teaching and learning, research, and other activities. Con-
sequently, ensuring the redundancy of mail servers is a crucial requirement in an
enterprise network. Mail servers are also known as mail exchange (MX) servers.
There are several options for redundant or backup MX mail servers, such as
store mail solution, shared storage solution, and server replication solution. They are
briefly discussed below:
• Store mail solution: This solution stores mail on the secondary server while the
primary server is down. If the primary server becomes unavailable, the secondary
server will receive and store mails on behalf of the primary server until the primary
server is restored. However, this solution does not allow users to access and send
mails through the secondary server because the primary server is the only one
configured with the authentication details.
• Shared storage solution: This solution shares mail storage between the primary
and secondary servers. Only one server is active at a time, and the other remains
standby. Thus, the MX servers work in an active-standby mode. The MX-A record
in the DNS settings points to the IP address of the active MX server.
• Server replication solution: This solution involves a cluster of mail servers each
with independent local mail storage. All mail servers within the cluster are active
and automatically synchronize their mail data. Therefore, they work in an active-
active mode, and thus are capable of handling both failover and load balancing
scenarios.
It is worth mentioning that an MX record is a type of resource record in the DNS.
It specifies the host name of the MX server that handles emails for a domain, along
with a preference code. The lower the preference value, the higher the priority for
handling emails for the domain. Moreover, an MX A-record (address record) maps
the host name of the MX server to its IP address. Emails will be routed to the IP
address set in the A-record of the host using the DNS.
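The preference rule can be illustrated with a short sketch. The record values and host names below are hypothetical examples of the kind of answer nslookup returns for a domain.

```python
def order_mx(records):
    """Order MX records by preference: the lowest preference value has
    the highest priority for handling the domain's mail."""
    return sorted(records, key=lambda r: r[0])

# Illustrative MX records in (preference, host) form.
records = [(10, "alt1.example-smtp.com"),
           (5, "mx.example-smtp.com"),
           (20, "alt2.example-smtp.com")]
```
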
To check MX records, we can use the utility nslookup in a terminal window.
Type in nslookup to execute the utility. The default server and its address will be
displayed. Then, issue the command set type=mx to set the query type. After that, type in the
Fig. 5.18 Examples of MX records and CNAME records looked up by using nslookup
domain name to be looked up and press Enter. The MX records of the domain will
appear. Figure 5.18 shows Gmail’s MX records and CNAME records as examples.
As for mail servers, it is also an option to use mail services from an external mail
service provider. In this case, the mail service provider assumes responsibility for
the reliability and availability of the mail servers in accordance with the SLAs.
A final note on the redundancy of web and mail servers is about the configuration
of DNS round robin. DNS round robin configures a list of addresses in a circular
manner for redundant web or mail servers. For each request, it responds with a
different address from the circular list, thus providing a certain degree of redundancy.
However, because other DNS servers on the Internet cache previous name-to-address
mapping information, DNS round robin may not be able to provide fast failover of
the servers. Nevertheless, as it is a simple configuration, DNS round robin can be an
option when other solutions are not readily available for the network.
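A minimal model of DNS round robin is sketched below. It illustrates only the circular answering behavior; the caching by downstream resolvers that slows failover, as noted above, is not modeled, and the addresses are illustrative.

```python
from collections import deque

class RoundRobinDns:
    """DNS round robin sketch: a circular list of server addresses,
    answering each query with the next address in turn."""

    def __init__(self, addresses):
        self._ring = deque(addresses)

    def answer(self):
        # Rotate the circular list and return the address that comes up.
        self._ring.rotate(-1)
        return self._ring[-1]
```
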
On the Internet, multiple routes exist from one router to another. Therefore, in plan-
ning a network, there is typically no need to consider route redundancy on the Internet.
Once network traffic reaches the Internet, routing protocols can effectively find a path
to route the traffic on the Internet.
However, route and media redundancy should be considered within a network and
from the network to the Internet. The connections of hosts to switches, switches to
routers, and routers to other routers and WANs within the network are all essential
to maintain network connectivity.
For WAN links, it is recommended to have a primary link and a backup link
to establish a reliable connection to the Internet. These two links can be provided
by two different ISPs, and may employ different technologies. For example, the
primary WAN link could be a leased line, while the backup link could be an ISDN
connection. Redundant WAN links can be configured in either an active-standby
mode, or an active-active mode with load balancing.
Within a network, backup paths can be considered for connections to routers,
switches, and other network devices. For critical routers and switches, meshed or
partially meshed connections are recommended in order to minimize the impact
of link failures on network performance. When designing backup paths, two key
considerations are:
• The capacity of the backup paths, and
• The time required to switch to backup paths should the primary path fail.
It is noted that in comparison with the primary path, backup paths may use different
technologies and may have lower capacity. Therefore, it is important to test the
backup solution to ensure that it meets the backup requirements. Tested
backup paths can also be combined with load balancing techniques.
Meshed or partially meshed network connections can lead to loops. Loops may
cause communication failure or performance degradation. For example, a router
queries its neighboring routers for a path to an intended destination. If a neighboring
router also does not know such a path, it will query its neighboring routers. Such a
query may come back from a loop to the router that originally sent out the query.
Loops can be avoided by using the Spanning Tree Protocol (STP) specified in
IEEE 802.1D. STP dynamically prunes a meshed or partially meshed topology of
connected layer-2 switches into a spanning tree. The resulting loop-free topology
spans the entire switched domain, with branches spreading out from a stem without
loops or polygons.
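The pruning of a meshed topology into a loop-free tree can be illustrated with a simple breadth-first search. Real STP elects a root bridge and ranks links by bridge IDs and path costs, which this sketch abstracts away; the switch names are illustrative.

```python
from collections import deque

def spanning_tree(links, root):
    """Prune a meshed layer-2 topology into a loop-free spanning tree,
    in the spirit of STP, using breadth-first search from the root."""
    # Build an adjacency map from the list of links.
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    tree, visited, queue = [], {root}, deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(adj.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                tree.append((node, neighbor))   # keep this link
                queue.append(neighbor)
    return tree                                 # redundant links are omitted
```

On a fully meshed set of n switches, the result always keeps exactly n-1 links, which is what makes the topology loop-free.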
In cases where the physical topology of a network changes, the established span-
ning tree may fail to work. STP will respond to the topological change of the network
and build a new spanning tree. As this process takes time, some upper-layer network
services may time out during the convergence period. If this happens, reconnection
to these upper-layer services will be required. To speed up the reconstruction
of the spanning tree, Rapid Spanning Tree Protocol (RSTP) has been developed in
5.7 Integration of Various Architectural Models 157
IEEE 802.1w to supplement IEEE 802.1D. Building upon STP, RSTP provides rapid
convergence of the spanning tree by
• Assigning port roles, and
• Determining the active topology.
This allows for faster recovery in case of topological changes.
To fully capture the functions and features of a network, multiple architectural models
discussed in previous sections need to be integrated, forming an overall architecture of
the network. This integration process typically begins with a hierarchical topological
model, such as the core/distribution/access model or the LAN/MAN/WAN model. Then,
other functional, component-based, and flow-based models are incorporated into the
base model as needed. Once all main functions and features of the network are
characterized, a complete reference architecture is established.
Figure 5.19 shows the use of the core/distribution/access model as the base model
for developing a complete network architecture. Various architectural models such
as flow-based models, redundancy models, functional models, enterprise edge, and
the LAN/MAN/WAN model are added to the base model. This provides a top-level
view of the entire network.