Papers by Luca Pappalardo

Big Data now offer the potential to create a digital nervous system of our society, enabling the measurement, monitoring and prediction of relevant aspects of socio-economic phenomena in quasi real time. This potential has fueled, in the last few years, a growing interest in using Big Data to support official statistics in the measurement of individual and collective economic well-being. In this work we study the relations between human mobility patterns and socio-economic development. Starting from nation-wide mobile phone data we extract a measure of mobility volume and a measure of mobility diversity for each individual. We then aggregate the mobility measures at municipality level and investigate the correlations with external socio-economic indicators independently surveyed by an official statistics institute. We find three main results. First, aggregated human mobility patterns are correlated with these socio-economic indicators. Second, the diversity of mobility, defined in terms of entropy of the individual users' trajectories, exhibits the strongest correlation with the external socio-economic indicators. Third, the volume of mobility and the diversity of mobility show opposite correlations with the socio-economic indicators.
Our results, validated against a null model, open an interesting perspective to study human behavior through Big Data by means of new statistical indicators that quantify and possibly "nowcast" the socio-economic development of our society.
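
As an illustration of the entropy-based diversity measure mentioned in the abstract above, here is a minimal sketch. It assumes one plausible reading, namely the Shannon entropy of how often a user visits each location, and averages it per municipality. The function names, the input layout and the normalization are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def mobility_entropy(visits, normalize=True):
    """Shannon entropy (in bits) of one user's visited locations.

    `visits` is a sequence of location identifiers, one per observation.
    Assumption: diversity is measured over visit frequencies; the paper
    may define it over trips or origin-destination pairs instead.
    """
    counts = Counter(visits)
    total = sum(counts.values())
    if len(counts) <= 1:
        # No visits, or a single location: no diversity at all.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    if normalize:
        # Divide by the maximum entropy for this many distinct locations,
        # so the result lies in [0, 1].
        entropy /= math.log2(len(counts))
    return entropy


def mean_entropy_by_municipality(user_visits, user_municipality):
    """Average the per-user entropy over the users of each municipality."""
    grouped = defaultdict(list)
    for user, visits in user_visits.items():
        grouped[user_municipality[user]].append(mobility_entropy(visits))
    return {m: sum(vals) / len(vals) for m, vals in grouped.items()}
```

A user who always appears at the same location gets entropy 0, while a user who splits observations evenly across all visited locations gets the maximum normalized value of 1.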

Sports analytics in general, and football (soccer in the USA) analytics in particular, have evolved remarkably in recent years, thanks to automated or semi-automated sensing technologies that provide high-fidelity data streams extracted from every game. In this paper we propose a data-driven approach and show that there is a large potential to boost the understanding of football team performance. From observational data of football games we extract a set of pass-based performance indicators and summarize them in the H indicator. We observe a strong correlation between the proposed indicator and the success of a team, and therefore perform a simulation on the four major European championships (78 teams, almost 1500 games). The outcome of each game in the championship was replaced by a synthetic outcome (win, loss or draw) based on the performance indicators computed for each team. We found that the final rankings in the simulated championships are very close to the actual rankings in the real championships, and show that teams with high ranking error show extreme values of a defense/attack efficiency measure, the Pezzali score. Our results are surprising given the simplicity of the proposed indicators, suggesting that a complex systems' view on football data has the potential of revealing hidden patterns and behavior of superior quality.

Nature Communications, 2015
The availability of massive digital traces of human whereabouts has offered a series of novel insights into the quantitative patterns characterizing human mobility. In particular, numerous recent studies have led to an unexpected consensus: the considerable variability in the characteristic travelled distance of individuals coexists with a high degree of predictability of their future locations. Here we shed light on this surprising coexistence by systematically investigating the impact of recurrent mobility on the characteristic distance travelled by individuals. Using both mobile phone and GPS data, we discover the existence of two distinct classes of individuals: returners and explorers. As existing models of human mobility cannot explain the existence of these two classes, we develop more realistic models able to capture the empirical findings. Finally, we show that returners and explorers play a distinct quantifiable role in spreading phenomena and that a correlation exists between their mobility patterns and social interactions.
Traditional approaches to user engagement analysis focus on individual users. In this paper we address user engagement analysis at the level of groups of users (social communities). From the entire Skype social network we extract communities by means of representative community detection methods, each one providing node partitions with their own peculiarities. We then examine user engagement in the extracted communities, highlighting clear relations between topological and geographic features of communities and their mean user engagement. In particular we show that user engagement can be predicted to a great extent from such features. Moreover, the analysis clearly shows that the choice of community definition and granularity deeply affects the predictive performance.
Journal of Official Statistics, 2015

2014 International Conference on Data Science and Advanced Analytics (DSAA), 2014
The large availability of mobility data allows us to investigate complex phenomena about human movement. However, this abundance of data comes with little information about the purpose of movement. In this work we address the issue of activity recognition by introducing Activity-Based Cascading (ABC) classification. This approach departs completely from probabilistic approaches for two main reasons. First, it exploits a set of structural features extracted from the Individual Mobility Network (IMN), a model able to capture the salient aspects of individual mobility. Second, it uses a cascading classification as a way to tackle the highly skewed frequency of activity classes. We show that our approach outperforms existing state-of-the-art probabilistic methods. Since it reaches high precision, ABC classification represents a very reliable semantic amplifier for Big Data.

2013 IEEE 13th International Conference on Data Mining Workshops, 2013
The recent emergence of so-called online social
fitness constitutes a good proxy to study the patterns underlying
success in sport. Through these platforms, users can collect,
monitor and share with friends their sport performance, diet,
and even burned calories, giving an unprecedented opportunity
to answer very fascinating questions: What are the main factors
that shape sport performance? What are the characteristics that
distinguish successful sportsmen? Can we characterize the role
of social influence on fitness behavior?
In the current work, we present the results of a study conducted
on a sample of 29,284 cyclists downloaded via APIs from the
social fitness platform Strava.com. We defined two basic metrics:
a measure of training effort, that is how much a cyclist struggled
during the workout; and a measure of training performance
indicating the results achieved during the training. Analyzing
the relationship between these two metrics, an interesting result
immediately emerges: at a global level, there is no correlation
between effort and performance. This means that, in general, the
performance is not simply a function of training: two athletes
with the same level of training have different performance.
However, by investigating in depth the time evolution of workouts and
cyclists’ training characteristics, we found that athletes that
better improve their performance follow precise training patterns
usually referred to as overcompensation theory, with alternation of
stress peaks and rest periods. Studies and experiments related to
such theory, up to now, have always been conducted by sports
doctors on a few dozen professional athletes. To the best of our
knowledge, our study is the first large-scale corroboration of
this theory, mainly confirming that “engine matters”, but tuning
is fundamental.

One classic problem in social network analysis is the study of diffusion in networks, which enables us to tackle problems like favoring the adoption of positive technologies. Most of the attention has been turned to how to maximize the number of influenced nodes, but this approach misses the fact that different scenarios imply different diffusion dynamics, only slightly related to maximizing the number of nodes involved. In this paper we measure three different dimensions of social prominence: the Width, i.e. the ratio of neighbors influenced by a node; the Depth, i.e. the degrees of separation from a node to the nodes perceiving its prominence; and the Strength, i.e. the intensity of the prominence of a node. By defining a procedure to extract prominent users in complex networks, we detect associations between the three dimensions of social prominence and classical network statistics. We validate our results on a social network extracted from the Last.Fm music platform.
The European Physical Journal Special Topics, 2013
Are the patterns of car travel different from those of general human mobility? Based on a unique dataset consisting of the GPS trajectories of 10 million trips made by 150,000 cars in Italy, we investigate how known mobility models apply to car travel, and illustrate novel analytical findings. We also assess to what extent the sample in our dataset is representative of overall car mobility, and discover how to build an extremely accurate model that, given our GPS data, estimates the real traffic values as measured by road sensors.

2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2012
The advent of social media has allowed us to build massive networks of weak ties: acquaintances and nonintimate ties we use all the time to spread information and thoughts. Conversely, strong ties are the people we really trust, people whose social circles tightly overlap with our own and who are often also the people most like us. Unfortunately, social media do not incorporate tie strength in the creation and management of relationships, and treat all users the same: friend or stranger, with little or nothing in between. In the current work, we address the challenging issue of detecting, on online social networks, the strong and intimate ties among the huge mass of mere social contacts. To do so, we propose a novel multidimensional definition of tie strength which exploits the existence of multiple online social links between two individuals. We test our definition on a multidimensional network constructed over users in Foursquare, Twitter and Facebook, analyzing the structural role of strong and weak links, and the correlations with the most common similarity measures.

2013 BRICS Congress on Computational Intelligence and 11th Brazilian Congress on Computational Intelligence, 2013
In recent years, the emergence of big data has led scientists from diverse disciplines toward the study of the laws underlying human mobility. Although these recent discoveries have shed light on interesting aspects of people's movements, they are generally focused on global and general mobility patterns. For this reason, they do not necessarily capture phenomena related to specific types of mobility, such as mobility by car, by public transportation, on foot and so on. In this work, we aim to compare general human mobility with mobility expressed by a specific conveyance, trying to address the following question: What are the differences between general mobility and mobility by car? To answer this question, we present the results of an analysis performed on a big mobile phone dataset and on a GPS dataset storing information about car travel in Italy.
Books by Luca Pappalardo
Modeling, Management, and Understanding, 2009
Technical Reports by Luca Pappalardo