2005, Proceedings of the 5th ACM SIGCOMM conference on Internet measurement - IMC '05
While wide-area Internet traffic has been heavily studied for many years, the characteristics of traffic inside Internet enterprises remain almost wholly unexplored. Nearly all of the studies of enterprise traffic available in the literature are well over a decade old and focus on individual LANs rather than whole sites. In this paper we present a broad overview of internal enterprise traffic recorded at a medium-sized site. The packet traces span more than 100 hours, over which activity from a total of several thousand internal hosts appears. This wealth of data-which we are publicly releasing in anonymized form-spans a wide range of dimensions. While we cannot form general conclusions using data from a single site, and clearly this sort of data merits additional in-depth study in a number of ways, in this work we endeavor to characterize a number of the most salient aspects of the traffic. Our goal is to provide a first sense of ways in which modern enterprise traffic is similar to wide-area Internet traffic, and ways in which it is quite different.
Proceedings of the 2014 Conference on Internet Measurement Conference, 2014
Although traffic between Web servers and Web browsers is readily apparent to many knowledgeable end users, fewer are aware of the extent of server-to-server Web traffic carried over the public Internet. We refer to the former class of traffic as front-office Internet Web traffic and the latter as back-office Internet Web traffic (or just front-office and back-office traffic, for short). Back-office traffic, which may or may not be triggered by end-user activity, is essential for today's Web as it supports a number of popular but complex Web services including large-scale content delivery, social networking, indexing, searching, advertising, and proxy services. This paper takes a first look at back-office traffic, measuring it from various vantage points, including from within ISPs, IXPs, and CDNs. We describe techniques for identifying back-office traffic based on the roles that this traffic plays in the Web ecosystem. Our measurements show that back-office traffic accounts for a significant fraction not only of core Internet traffic, but also of Web transactions in terms of requests and responses. Finally, we discuss the implications and opportunities that the presence of back-office traffic presents for the evolution of the Internet ecosystem.
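As a concrete illustration of the role-based identification idea described in this abstract, the Python sketch below classifies flow records as front-office or back-office depending on whether one or both endpoints behave as Web servers. The flow format, the port-based server heuristic, and all addresses are illustrative assumptions, not the paper's actual method.

# Hypothetical sketch: split flows into front-office vs back-office traffic
# by whether both endpoints act as Web servers. The heuristic (an endpoint
# is a "server" if it has been observed accepting connections on ports
# 80/443) is an assumption for illustration only.

from collections import namedtuple

Flow = namedtuple("Flow", "src_ip src_port dst_ip dst_port bytes")

WEB_PORTS = {80, 443}

def build_server_set(flows):
    """Collect IPs observed receiving connections on Web ports."""
    return {f.dst_ip for f in flows if f.dst_port in WEB_PORTS}

def classify(flow, servers):
    src_is_server = flow.src_ip in servers
    dst_is_server = flow.dst_ip in servers
    if src_is_server and dst_is_server:
        return "back-office"     # server-to-server exchange
    if dst_is_server:
        return "front-office"    # browser-to-server exchange
    return "other"

flows = [
    Flow("10.0.0.1", 51234, "203.0.113.10", 443, 12_000),    # client -> server
    Flow("203.0.113.10", 40000, "198.51.100.7", 80, 80_000),  # server -> server
]
servers = build_server_set(flows)
for f in flows:
    print(classify(f, servers), f.bytes)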
Globally, the Internet has emerged as the key component of commercial and personal communication. Despite the many new devices that have emerged, Internet usage is still increasing, and its flexibility and versatility cannot be questioned. The Internet therefore keeps expanding, and will continue to expand as technology evolves, bringing continuous changes in utilization and traffic behavior. Due to this diversity and these fast-changing properties, the Internet is a moving target, and at present it is far from being well understood in its entirety. Constantly changing Internet characteristics, associated with both time and location, make it imperative for the Internet community to understand the nature and behavior of current Internet traffic in order to support research and further development. Through the measurement and analysis of traffic, the Internet can be better understood. This research presents a successful Internet measurement process, providing guidelines for conducting passive network measurements. Recent large-scale backbone traffic data is analyzed, revealing the current deployment of protocol features at the packet and flow level, including statistics about anomalies and misbehavior. A method to classify packet header data at the transport level according to network application is proposed, resulting in a complete traffic decomposition.
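The transport-level classification mentioned above can be illustrated with a small port-table lookup; the sketch below is a simplified stand-in for such a scheme (the port-to-application mapping is assumed for illustration, not the paper's validated method).

# Illustrative sketch: map flows to application classes using a
# registered-port table, falling back to "unknown" for unregistered ports.

PORT_APP = {
    80: "web", 443: "web", 25: "mail", 110: "mail",
    53: "dns", 21: "ftp", 6881: "p2p",
}

def classify_flow(src_port, dst_port):
    # Prefer the well-known side of the connection; the table above is an assumption.
    for port in (dst_port, src_port):
        if port in PORT_APP:
            return PORT_APP[port]
    return "unknown"

print(classify_flow(51234, 443))   # -> web
print(classify_flow(6881, 52000))  # -> p2p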
Proceedings of the 14th international conference on World Wide Web - WWW '05, 2005
We offer the first large-scale analysis of Web traffic based on network flow data. Using data collected on the Internet2 network, we constructed a weighted bipartite client-server host graph containing more than 18 × 10^6 vertices and 68 × 10^6 edges valued by relative traffic flows. When considered as a traffic map of the World Wide Web, the generated graph provides valuable information on the statistical patterns that characterize the global information flow on the Web. Statistical analysis shows that client-server connections and traffic flows exhibit heavy-tailed probability distributions lacking any typical scale. In particular, the absence of an intrinsic average in some of the distributions implies the absence of a prototypical scale appropriate for server design, Web-centric network design, or traffic modeling. The inspection of the amount of traffic handled by clients and servers and their number of connections highlights non-trivial correlations between information flow and patterns of connectivity as well as the presence of anomalous statistical patterns related to the behavior of users on the Web. The results presented here may impact considerably the modeling, scalability analysis, and behavioral study of Web applications.
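For readers who want to experiment with this kind of analysis, the following Python sketch (using networkx, with invented flow triples) builds a small weighted bipartite client-server graph and ranks servers by total handled traffic, the quantity whose heavy-tailed distribution the paper studies.

# Minimal sketch with an assumed flow-record format: build a weighted
# bipartite client-server graph and list the heaviest servers.

import networkx as nx

# (client, server, bytes) triples; values are made up for illustration.
flows = [
    ("c1", "s1", 10_000), ("c1", "s2", 500),
    ("c2", "s1", 2_000_000), ("c3", "s1", 300),
]

G = nx.Graph()
for client, server, nbytes in flows:
    G.add_node(client, bipartite="client")
    G.add_node(server, bipartite="server")
    # Accumulate traffic on the edge weight if the pair reappears.
    w = G.get_edge_data(client, server, {}).get("weight", 0)
    G.add_edge(client, server, weight=w + nbytes)

servers = [n for n, d in G.nodes(data=True) if d["bipartite"] == "server"]
strength = {s: sum(d["weight"] for _, _, d in G.edges(s, data=True)) for s in servers}
print(sorted(strength.items(), key=lambda kv: -kv[1]))  # heavy hitters first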
arXiv (Cornell University), 2019
The Internet is transforming our society, necessitating a quantitative understanding of Internet traffic. Our team collects and curates the largest publicly available Internet traffic data sets. An analysis of 50 billion packets using 10,000 processors in the MIT SuperCloud reveals a new phenomenon: the importance
Networks and Heterogeneous Media, 2006
The Internet's layered architecture and organizational structure give rise to a number of different topologies, with the lower layers defining more physical and the higher layers more virtual/logical types of connectivity structures. These structures are very different, and successful Internet topology modeling requires annotating the nodes and edges of the corresponding graphs with information that reflects their network-intrinsic meaning. These structures also give rise to different representations of the traffic that traverses the heterogeneous Internet, and a traffic matrix is a compact and succinct description of the traffic exchanges between the nodes in a given connectivity structure. In this paper, we summarize recent advances in Internet research related to (i) inferring and modeling the router-level topologies of individual service providers (i.e., the physical connectivity structure of an ISP, where nodes are routers/switches and links represent physical connections), (ii) estimating the intra-AS traffic matrix when the AS's router-level topology and routing configuration are known, (iii) inferring and modeling the Internet's AS-level topology, and (iv) estimating the inter-AS traffic matrix. We will also discuss recent work on Internet connectivity structures that arise at the higher layers in the IP/TCP protocol stack and are more virtual and dynamic; e.g., overlay networks like the WWW graph, where nodes are web pages and edges represent existing hyperlinks, or P2P networks like Gnutella, where nodes represent peers and two peers are connected if they have an active network connection. Here, an autonomous system (AS) or autonomous domain is a group of routers and networks managed by a single organization. In turn, an Internet Service Provider (ISP) can consist of a single AS or of a group of ASes, but for simplicity, we will use the terms AS and ISP indistinguishably throughout this paper. The TCP/IP protocol stack as a whole and IP in particular are able to hide from the user much of the enormous complexity associated with controlling this diverse set of networked resources and coordinating the actions among the many competing ISPs. By providing the mechanisms necessary to knit together diverse networking technologies and ASes into a single virtual network (i.e., a network of networks, or "Internet"), they ultimately guarantee seamless connectivity and reliable communication between sending and receiving hosts, irrespective of where in the network they are.
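As a small worked example of one topic surveyed above, the sketch below computes a gravity-model estimate of an intra-AS traffic matrix from per-node ingress and egress byte counts. The node names and counts are invented, and the gravity model is a common baseline rather than the specific estimation methods the paper reviews.

# Hedged sketch of a simple gravity-model traffic-matrix estimate.
# Inputs are per-router ingress/egress byte counts (values are invented).

ingress = {"r1": 600, "r2": 300, "r3": 100}   # bytes entering at each node
egress  = {"r1": 200, "r2": 500, "r3": 300}   # bytes leaving at each node

total = sum(ingress.values())

# Gravity model: traffic from i to j is proportional to ingress(i) * egress(j).
tm = {(i, j): ingress[i] * egress[j] / total
      for i in ingress for j in egress}

for (i, j), v in sorted(tm.items()):
    print(f"{i} -> {j}: {v:.1f} bytes")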
2005
Recent spates of cyber-attacks and the frequent emergence of applications affecting Internet traffic dynamics have made it imperative to develop effective techniques that can extract, and make sense of, significant communication patterns from Internet traffic data for use in network operations and security management. In this paper, we present a general methodology for building comprehensive behavior profiles of Internet backbone traffic in terms of communication patterns of end-hosts and services. Relying on data mining and information-theoretic techniques, the methodology consists of significant cluster extraction, automatic behavior classification and structural modelling for in-depth interpretive analyses. We validate our methodology using data sets from the core of the Internet. Our results demonstrate that it indeed can identify common traffic profiles as well as anomalous behavior patterns that are of interest to network operators and security analysts.
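One information-theoretic ingredient typically used in such behavior profiling is the normalized entropy of a feature distribution, for example the destination ports contacted by a host; the sketch below shows the computation on synthetic data. The feature choice and the example values are assumptions, not the paper's exact procedure.

# Illustrative sketch: normalized entropy of the destination ports
# contacted by a host. Low values suggest a focused service, values near
# 1.0 a scanner-like or randomized pattern.

import math
from collections import Counter

def normalized_entropy(values):
    counts = Counter(values)
    n = sum(counts.values())
    if len(counts) <= 1:
        return 0.0
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return h / math.log2(len(counts))   # scale to [0, 1]

focused = [443] * 9 + [80]                       # mostly one service
scattered = list(range(1000, 1100))              # one contact per port
print(round(normalized_entropy(focused), 2))     # ~0.47, concentrated
print(round(normalized_entropy(scattered), 2))   # 1.0, maximally spread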
… LANMAN 2005. The …, 2005
In this paper, we present a comprehensive traffic analysis of the Greek School Network (GSN), a wide area network designed to provide Internet access and services to about 15,000 primary and secondary schools and administration offices. In our analysis, we have used measurements from the PATRAS region node obtained through the Cisco NetFlow and FlowScan tools. We have used classical analysis to obtain protocol and application traffic statistics. Our study revealed that TCP traffic is dominant in the network, while nearly 50% of the outgoing and 37% of the incoming traffic is Peer-to-Peer (P2P) traffic, with a further 25.6% of traffic using unregistered ports and suspected to be P2P as well. Finally, we have also observed a remarkable traffic locality phenomenon in the P2P services, where more than 90% of the traffic was destined for or generated by just 50 hosts.
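The traffic-locality observation reported above can be checked with a few lines of Python; the sketch below computes the share of total bytes involving the top-N hosts from hypothetical flow records (addresses and byte counts are invented).

# Sketch: what fraction of total bytes involves the top-N hosts as
# source or destination.

from collections import Counter

flows = [
    ("10.0.0.1", "10.0.0.9", 5_000_000),
    ("10.0.0.2", "10.0.0.9", 3_000_000),
    ("10.0.0.3", "10.0.0.4", 10_000),
]

per_host = Counter()
total = 0
for src, dst, nbytes in flows:
    total += nbytes
    per_host[src] += nbytes
    per_host[dst] += nbytes

def top_n_share(n):
    top = {h for h, _ in per_host.most_common(n)}
    covered = sum(b for s, d, b in flows if s in top or d in top)
    return covered / total

print(f"top-2 hosts cover {top_n_share(2):.0%} of bytes")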
HotNets, 2021
The impact of Internet phenomena depends on how they impact users, but researchers lack visibility into how to translate Internet events into their impact. Distressingly, the research community seems to have lost hope of obtaining this information without relying on privileged viewpoints. We argue for optimism thanks to new network measurement methods and changes in Internet structure which make it possible to construct an "Internet traffic map". This map would identify the locations of users and major services, the paths between them, and the relative activity levels routed along these paths. We sketch our vision for the map, detail new measurement ideas for map construction, and identify key challenges that the research community should tackle. The realization of an Internet traffic map will be an Internet-scale research effort with Internet-scale impacts that reach far beyond the research community, and so we hope our fellow researchers are excited to join us in addressing this challenge.
GLOBECOM'01. IEEE Global Telecommunications Conference (Cat. No.01CH37270), 2001
In this work the authors show how the behaviour of Web users strongly affects the dynamics of TCP connections on the Internet. By analysing actual and systematically generated HTTP traces, it is shown that the time between the download of two pages is critical in determining the reuse of TCP connections. On the other hand, the study also shows that the use of version 1.1 of the HTTP standard does not significantly alter the traffic relative to HTTP 1.0. In this sense, the heavy-tailed nature of HTTP connection sizes can be considered an invariant property.
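The mechanism studied above can be sketched simply: a persistent HTTP/1.1 connection is reused only when the think time between page downloads is shorter than the server's keep-alive timeout. The timeout value and counting logic below are illustrative assumptions.

# Toy sketch: count TCP connections needed for a sequence of page
# downloads, given the idle gaps ("think times") between them.

KEEPALIVE_TIMEOUT = 15.0   # seconds; typical but assumed here

def connections_needed(think_times, timeout=KEEPALIVE_TIMEOUT):
    conns = 1
    for gap in think_times:
        if gap > timeout:       # idle connection was closed in the meantime
            conns += 1
    return conns

print(connections_needed([2.0, 40.0, 5.0, 120.0]))  # -> 3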
International Journal of Business & Technology
With the rapid increase in demand for data, the Internet has become more complex and harder to analyze. Characterizing Internet traffic can reveal information that is important for network operators to formulate policy decisions, develop techniques for detecting network anomalies, better provision network resources (capacity, buffers), and derive workload characteristics for simulations (typical packet sizes, flow durations, common protocols). In this paper, using passive monitoring and measurement, we analyze traffic collected at Internet backbone routers. First, we present the main observations on the patterns and characteristics of this dataset, including packet sizes, inter- and intra-domain traffic volume, and protocol composition. Second, we investigate the independence structure of packet-size arrivals using both visual and computational statistics. Finally, we show the temporal behavior of the most active destination IP addresses and ports.
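One computational check of the independence structure mentioned above is the sample autocorrelation of the packet-size sequence at small lags; the sketch below applies it to synthetic data (values near zero are consistent with an uncorrelated sequence).

# Sketch: sample autocorrelation of a packet-size sequence at several lags.

import random

def autocorr(xs, lag):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + lag] - mean) for i in range(n - lag)) / n
    return cov / var

sizes = [random.choice([64, 576, 1500]) for _ in range(10_000)]
for lag in (1, 2, 5, 10):
    print(lag, round(autocorr(sizes, lag), 3))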
2001
This paper analyzes two one-week-long traces of all the interdomain traffic of two very different ISPs. The analysis correlates the received traffic with the BGP routing tables of the studied ISPs. We first analyze the topological distribution of the traffic throughout the Internet. Then, we analyze the variability of the interdomain flows by considering their activity and the variability of their traffic volume.
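The correlation step described above amounts to a longest-prefix match of each flow's destination address against the BGP table, followed by aggregation per origin AS; the sketch below illustrates this on a toy routing table (prefixes and AS numbers are invented).

# Minimal sketch: longest-prefix match of flow destinations against
# advertised prefixes, then aggregation of bytes per origin AS.

import ipaddress
from collections import defaultdict

# (prefix, origin AS) pairs standing in for a BGP table snapshot.
bgp = [(ipaddress.ip_network(p), asn) for p, asn in [
    ("203.0.113.0/24", 64500),
    ("203.0.0.0/16",   64501),
    ("198.51.100.0/24", 64502),
]]

def origin_as(addr):
    ip = ipaddress.ip_address(addr)
    matches = [(net, asn) for net, asn in bgp if ip in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]   # longest prefix wins

per_as = defaultdict(int)
for dst, nbytes in [("203.0.113.7", 900), ("203.0.200.1", 400)]:
    per_as[origin_as(dst)] += nbytes
print(dict(per_as))   # {64500: 900, 64501: 400}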
11th IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer Telecommunications Systems (MASCOTS 2003), 2003
Proceeding of the 17th international conference on World Wide Web - WWW '08, 2008
Peer-to-Peer (P2P) applications continue to grow in popularity, and have reportedly overtaken Web applications as the single largest contributor to Internet traffic. Using traces collected from a large edge network, we conduct an extensive analysis of P2P traffic, compare P2P traffic with Web traffic, and discuss the implications of increased P2P traffic. In addition to
IEEE Internet Computing, 2001
The Internet's evolution over the past 30 years has been accompanied by the development of various network applications. These applications range from early text-based utilities such as file transfer and remote login to the more recent advent of the Web, electronic commerce, and multimedia streaming.
Webology, 2022
Network Traffic Monitoring and Analysis (NTMA) is the main element of network management, especially for correctly operating large-scale networks such as the Internet, on which modern academic organizations heavily depend. Their traffic use increases significantly because students, staff members, and research labs use them to search for information. It is necessary to analyze, measure, and classify this Internet traffic according to the needs of different stakeholders such as Internet Service Providers and network administrators. Moreover, bandwidth congestion frequently occurs, causing user dissatisfaction. This study tries to find different characterizations such as data over hosts, countries, cities, companies, top-level domains, and servers. In addition, this is a new study that examines patterns at different levels of analysis, from the individual device to its international requests. Our findings show that the highest traffic use is on Mondays and Wednesdays. Web server and DNS server drop in respo...
Proceedings of the Fourth IEEE International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems (WECWIS 2002)
The World Wide Web has achieved immense popularity in the business world. It is thus essential to characterize the traffic behavior at these sites, a study that will facilitate the design and development of high-performance, reliable e-commerce servers. This paper makes an effort in this direction. Aggregated traffic arriving at a Business-to-Business (B2B) and a Business-to-Consumer (B2C) e-commerce site was collected and analyzed. A high degree of self-similarity was found in the traffic (higher than that observed in the general Web environment). Heavy-tailed behavior of transfer times was established at both sites. Traditionally this behavior has been attributed to the distribution of transfer sizes, which was not the case in the B2C space. This implies that the heavy-tailed transfer times are actually caused by the behavior of the back-end service time. In the B2B space, transfer sizes were found to be heavy-tailed. A detailed study of the traffic and load at the back-end servers was also conducted and the inferences are included in this paper.
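A standard way to check the heavy-tailed behavior discussed above is a tail-index estimate such as the Hill estimator; the sketch below applies it to synthetic Pareto samples. This is a generic technique shown for illustration, not necessarily the estimator used in the paper.

# Hedged sketch: Hill estimator of the tail index from the k largest
# order statistics, applied to synthetic Pareto(alpha=1.2) samples.

import math
import random

def hill_estimator(data, k):
    xs = sorted(data, reverse=True)[: k + 1]
    logs = [math.log(x) for x in xs]
    return 1.0 / (sum(logs[:k]) / k - logs[k])

# Pareto samples via inverse transform; alpha < 2 indicates a heavy tail.
alpha = 1.2
sample = [random.random() ** (-1.0 / alpha) for _ in range(50_000)]
print(round(hill_estimator(sample, k=2000), 2))   # should be near 1.2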
Journal of the American Statistical Association, 2000
Internet engineering and management depend on an understanding of the characteristics of network traffic. Statistical models are needed that can generate traffic that closely mimics the observed behavior on live Internet wires. Models can be used on their own for some tasks and combined with network simulators for others. But the challenge of model development is immense. Internet traffic data are ferocious: their statistical properties are complex, databases are very large, Internet network topology is vast, and the engineering mechanism is intricate and introduces feedback into the traffic. Packet header collection and organization of the headers into connection flows yields data rich in information about traffic characteristics and serves as an excellent framework for modeling. Many existing statistical tools and models, especially those for time series, point processes, and marked point processes, can be used to describe and model the statistical characteristics, taking into account the structure of the Internet, but new tools and models are needed.
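The flow-assembly step mentioned above (organizing packet headers into connection flows) can be sketched as grouping packets by 5-tuple with an idle timeout; the record format and the 60-second timeout in the sketch below are assumptions.

# Sketch: group packet headers into connection flows keyed by the
# 5-tuple, splitting a flow when an idle timeout is exceeded.

from collections import namedtuple

Pkt = namedtuple("Pkt", "ts src dst sport dport proto size")
IDLE_TIMEOUT = 60.0   # seconds; assumed value

def packets_to_flows(packets):
    flows, active = [], {}
    for p in sorted(packets, key=lambda p: p.ts):
        key = (p.src, p.dst, p.sport, p.dport, p.proto)
        f = active.get(key)
        if f is None or p.ts - f["last"] > IDLE_TIMEOUT:
            f = {"key": key, "start": p.ts, "last": p.ts, "bytes": 0, "pkts": 0}
            flows.append(f)
            active[key] = f
        f["last"] = p.ts
        f["bytes"] += p.size
        f["pkts"] += 1
    return flows

pkts = [Pkt(0.0, "a", "b", 1234, 80, "tcp", 60),
        Pkt(0.5, "a", "b", 1234, 80, "tcp", 1500),
        Pkt(200.0, "a", "b", 1234, 80, "tcp", 60)]   # new flow after idle gap
print(len(packets_to_flows(pkts)))   # -> 2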