Web Workload Characterization: Ten Years Later

2005, Web Information Systems Engineering and Internet Technologies Book Series

Abstract

In 1996, Arlitt and Williamson conducted a comprehensive workload characterization study of Internet Web servers. By analyzing access logs from 6 Web sites (3 academic, 2 research, and 1 industrial) collected in 1994 and 1995, the authors identified 10 invariants: workload characteristics common to all the sites that are likely to persist over time. In the present work, we revisit the 1996 work by Arlitt and Williamson, repeating many of the same analyses on new data sets collected in 2004. In particular, we study access logs from the same 3 academic sites used in the 1996 paper. Despite a 30-fold increase in overall traffic volume from 1994 to 2004, our main conclusion is that there have been no dramatic changes in Web server workload characteristics in the last 10 years. Although there have been many changes in Web technologies (e.g., new protocols, scripting languages, caching infrastructures), most of the 1996 invariants still hold true today. We postulate that these invariants will continue to hold in the future, because they represent fundamental characteristics of how humans organize, store, and access information on the Web.