Papers by Alejandro Noriega

Cornell University - arXiv, Oct 26, 2016
Tourism has been an increasingly important factor in the global economy, society, and environment, accounting for a significant share of GDP and labor force. Policy and research on tourism traditionally rely on surveys and economic datasets, which are based on small samples and depict tourism dynamics at low spatial and temporal granularity. Anonymous call detail records (CDRs) are a novel source of data, showing enormous potential in areas of high societal value, such as epidemics, poverty, and urban development. This study demonstrates the added value of using CDRs for the formulation, analysis, and evaluation of tourism strategies at the national and local levels. In the context of the European country of Andorra, we use CDRs to evaluate marketing strategies in tourism, understand tourists' experiences, and evaluate revenues and externalities generated by touristic events. We do this by extracting novel indicators at high spatial and temporal resolution, such as tourist flows per country of origin, flows of new tourists, tourist revisits, tourist externalities on transportation congestion, spatial distribution, economic impact, and profiling of tourist interests. We exemplify the use of these indicators for the planning and evaluation of high-impact touristic events, such as cultural festivals and sports competitions.
The impact of crime shocks across gender and socioeconomic groups: a large-scale mapping of behavioral disruption
Research Papers in Economics, 2021

BACKGROUND: Diabetes and hypertension are among top public health priorities, particularly in low- and middle-income countries where their health and socioeconomic impact is exacerbated by the quality and accessibility of health care. Moreover, their connection with severe or deadly COVID-19 illness has further increased their societal relevance. Tools for early detection of these chronic diseases enable interventions to prevent high-impact complications, such as loss of sight and kidney failure. Similarly, prognostic tools for COVID-19 help stratify the population to prioritize protection and vaccination of high-risk groups, optimize medical resources and tests, and raise public awareness. METHODS: We developed and validated state-of-the-art risk models for the presence of undiagnosed diabetes, hypertension, visual complications associated with diabetes and hypertension, and the risk of severe COVID-19 illness (if infected). The models were estimated using modern methods from the field o...
This paper presents a methodology for a) modeling the geo-spatial and social interaction factors that drive interference (SUTVA violations) in randomized field experiments; and b) eliciting a set of non-dominated sample options that approximate the Pareto-optimal tradeoff between interference and external representativity as functions of sample choice. We develop and test the methodology in the context of a large-scale health experiment in rural Mexico, involving more than 5,000 pregnant women and 600 health clinics across five states. Of relevance to practitioners, we show that the methodology is computationally tractable and can be implemented using novel open-source geo-spatial data and software tools.

Targeted social programs, such as conditional cash transfers (CCTs), are a major vehicle for poverty alleviation throughout the developing world. In Mexico and Brazil alone, these reach nearly 80 million people (25% of population), distributing more than 8 billion USD yearly. We study the potential efficiency and fairness gains of targeting CCTs by means of artificial intelligence algorithms. In particular, we analyze the targeting decision rules and underlying poverty prediction models used by nationwide CCTs in three middle-income countries (Mexico, Ecuador, and Costa Rica). Our contribution is three-fold: 1) We show that, absent explicit measures aimed at limiting algorithmic bias, targeting rules can systematically disadvantage population subgroups, such as incurring exclusion errors 2.3 times higher on poor urban households compared to their rural counterparts, or exclusion errors 2.2 times higher on poor elderly households compared with poor traditional nuclear families. 2) We constra...

ArXiv, 2018
Today's age of data holds high potential to enhance the way we pursue and monitor progress in the fields of development and humanitarian action. We study the relation between data utility and privacy risk in large-scale behavioral data, focusing on mobile phone metadata as a paradigmatic domain. To measure utility, we survey experts about the value of mobile phone metadata at various spatial and temporal granularity levels. To measure privacy, we propose a formal and intuitive measure of reidentification risk, the information ratio, and compute it at each granularity level. Our results confirm the existence of a stark tradeoff between data utility and reidentifiability, where the most valuable datasets are also most prone to reidentification. When data is specified at ZIP-code and hourly levels, outside knowledge of only 7% of a person's data suffices for reidentification and retrieval of the remaining 93%. In contrast, in the least valuable datas...

Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, 2019
Society increasingly relies on machine learning models for automated decision making. Yet, efficiency gains from automation have come paired with concern for algorithmic discrimination that can systematize inequality. Recent work has proposed optimal postprocessing methods that randomize classification decisions for a fraction of individuals, in order to achieve fairness measures related to parity in errors and calibration. These methods, however, have raised concern due to the information inefficiency, intra-group unfairness, and Pareto sub-optimality they entail. The present work proposes an alternative active framework for fair classification, where, in deployment, a decision-maker adaptively acquires information according to the needs of different groups or individuals, towards balancing disparities in classification performance. We propose two such methods, where information collection is adapted to group- and individual-level needs respectively. We show on real-world datasets that these can achieve: 1) calibration and single error parity (e.g., equal opportunity); and 2) parity in both false positive and false negative rates (i.e., equal odds). Moreover, we show that by leveraging their additional degree of freedom, active approaches can substantially outperform randomization-based classifiers previously considered optimal, while avoiding limitations such as intra-group unfairness.

ACM/IMS Transactions on Data Science, 2021
Tourism has been an increasingly significant contributor to the economy, society, and environment. Policy-making and research on tourism traditionally rely on surveys and economic datasets, which are based on small samples and depict tourism dynamics at a low granularity. Anonymous call detail records (CDRs) are a novel source of data with enormous potential in areas of high societal value, such as epidemics, poverty, and urban development. This study demonstrates the added value of CDRs in event tourism, especially for the analysis and evaluation of marketing strategies, event operations, and externalities at the local and national levels. To achieve this aim, we formalize 14 indicators at high spatial and temporal resolution to measure both the positive and the negative impacts of touristic events. We exemplify the use of these indicators in a tourism country, Andorra, on 22 high-impact events including sports competitions, cultural performances, and music festivals. We analyze these...

Background: The automated screening of patients at risk of developing diabetic retinopathy (DR) represents an opportunity to improve their mid-term outcome and lower the public expenditure associated with direct and indirect costs of common sight-threatening complications of diabetes. Objective: In the present study, we aim to develop and evaluate the performance of an automated deep learning–based system to classify retinal fundus images from international and Mexican patients as referable and non-referable DR cases. In particular, we study the performance of the automated retina image analysis (ARIA) system under an independent scheme (i.e., only ARIA screening) and two assistive schemes (i.e., hybrid ARIA + ophthalmologist screening), using a web-based platform for remote image analysis. Methods: We ran a randomized controlled experiment where 17 ophthalmologists were asked to classify a series of retinal fundus images under three different conditions: 1) screening the fundus image b...
Electricity Demand and Population Dynamics Prediction from Mobile Phone Metadata
Lecture Notes in Computer Science, 2016
Energy efficiency is a key challenge for building modern sustainable societies. The world's energy consumption is expected to grow annually by 1.6%, increasing pressure on utilities and governments to fulfill demand and raising significant challenges in the generation, distribution, and storage of electricity. In this context, accurate predictions and understanding of population dynamics and their relation to electricity demand dynamics are of high relevance.