Papers by Anjali Mazumder
Journal of the Royal Statistical Society Series C: Applied Statistics
Forensic entomology contributes important information to criminal investigations. This paper proposes a novel method to estimate the hatching time of fly larvae based on the temperature profile at the crime scene and on experimental data on larval development, where larvae are exposed to a constant temperature. We develop a dynamic model to estimate the growth curve under time-varying temperature profiles and the corresponding hatching time at the crime scene. Asymptotic properties are provided for the proposed estimators, and we explore their robustness via simulations. The proposed methodology is demonstrated on data from two criminal cases from the UK.

CERN European Organization for Nuclear Research - Zenodo, Mar 22, 2022
whose expertise, wisdom, and lived experiences have provided us with a wide range of insights that proved invaluable throughout this research. We would also like to thank those individuals and communities who engaged with our participatory platform on decidim and whose thoughts and opinions on data justice greatly informed the framing of this project. All of these contributions have demonstrated the pressing need for a relocation of data justice and we hope to have emphasised this throughout our research outputs. Finally, we would like to acknowledge the tireless efforts of our colleagues at the International Centre of Expertise in Montréal and GPAI's Data Governance Working Group. We are grateful, in particular, for the unbending support of Ed Teather, Sophie Fallaha, Jacques Rajotte, and Noémie Gervais from CEIMIA, and for the indefatigable dedication of
EThOS - Electronic Theses Online Service, GB, United Kingdom
SSRN Electronic Journal
whose expertise, wisdom, and lived experiences have provided us with a wide range of insights that proved invaluable throughout this research. We would also like to thank those individuals and communities who engaged with our participatory platform on decidim and whose thoughts and opinions on data justice greatly informed the framing of this project. All of these contributions have demonstrated the pressing need for a relocation of data justice and we hope to have emphasised this throughout our research outputs. Finally, we would like to acknowledge the tireless efforts of our colleagues at the International Centre of Expertise in Montréal and GPAI's Data Governance Working Group. We are grateful, in particular, for the unbending support of Ed Teather, Sophie Fallaha, Jacques Rajotte, and Noémie Gervais from CEIMIA, and for the indefatigable dedication of

EPJ Data Science
Understanding what factors predict whether an urban migrant will end up in a deprived neighbourhood or not could help prevent the exploitation of vulnerable individuals. This study leveraged pseudonymized mobile money interactions combined with cell phone data to shed light on urban migration patterns and deprivation in Tanzania. Call detail records were used to identify individuals who migrated to Dar es Salaam, Tanzania’s largest city. A street survey of the city’s subwards was used to determine which individuals moved to more deprived areas. t-tests showed that people who settled in poorer neighbourhoods had less money coming into their mobile money account after they moved, but not before. A machine learning approach was then utilized to predict which migrants will move to poorer areas of the city, making them arguably more vulnerable to poverty, unemployment and exploitation. Features indicating the strength and location of people’s social connections in Dar es Salaam before th...

Not only is the crime of modern slavery a hidden one, but global definitions also vary, as do cultural norms on issues such as child marriage, child labour and forced marriage. In this nuanced context, this research goes beyond the traditional regression analyses used in this space, to try to understand not just the linear predictive relationships, but also the non-linear effects, thresholds, interaction effects, and potential subgroups of countries. Consequently, this research, in the context of small data (N=70) and many potential (and correlated) predictors (107), utilises machine learning methods with the goal of better predicting and explaining countries' prevalence of MDS, with all its complexities. Data is collated and scraped from multiple open sources including the World Bank, Varieties of Democracy Project (Coppedge et al., 2015), The Woman Stats Project, and the Global Slavery Index, which includes a measure of slavery prevalence measured by the Gallup World Poll. Our metho...

Decision theory, as the name implies, is concerned with the process of making decisions. The extension to statistical decision theory includes decision making in the presence of statistical knowledge which provides some information where there is uncertainty. The elements of decision theory are quite logical and even perhaps intuitive. The classical approach to decision theory facilitates the use of sample information in making inferences about the unknown quantities. Other relevant information includes that of the possible consequences which is quantified by loss and the prior information which arises from statistical investigation. The use of Bayesian analysis in statistical decision theory is natural. Their unification provides a foundational framework for building and solving decision problems. The basic ideas of decision theory and of decision theoretic methods lend themselves to a variety of applications and computational and analytic advances. This initial part of the repor...
Abstracts are ordered in session order for oral presentations (including professional development workshops) followed by poster presentations

40 million people are estimated to be in some form of modern slavery across the globe. Understanding the factors that make any particular individual or geographical region vulnerable to such abuse is essential for the development of effective interventions and policy. Efforts to isolate and assess the importance of individual drivers statistically are impeded by two key challenges: data scarcity and high dimensionality. The hidden nature of modern slavery restricts available datapoints; and the large number of candidate variables that are potentially predictive of slavery inflates the feature space exponentially. The result is a highly problematic "small-n, large-p" setting, where overfitting and multi-collinearity can render more traditional statistical approaches inapplicable. Recent advances in non-parametric computational methods, however, offer scope to overcome such challenges. We present an approach that combines non-linear machine learning models and strict cross-va...

Humanities and Social Sciences Communications, 2021
Forty million people are estimated to be in some form of modern slavery across the globe. Understanding the factors that make any particular individual or geographical region vulnerable to such abuse is essential for the development of effective interventions and policy. Efforts to isolate and assess the importance of individual drivers statistically are impeded by two key challenges: data scarcity and high dimensionality, typical of many “wicked problems”. The hidden nature of modern slavery restricts available data points; and the large number of candidate variables that are potentially predictive of slavery inflates the feature space exponentially. The result is a “small n, large p” setting, where overfitting and significant inter-correlation of explanatory variables can render more traditional statistical approaches problematic. Recent advances in non-parametric computational methods, however, offer scope to overcome such challenges and better capture the complex nature of modern...
BMJ, 2021
Artificial intelligence can help tackle the covid-19 pandemic, but bias and discrimination in its design and deployment risk exacerbating existing health inequity, argue David Leslie and colleagues.

Behaviormetrika, 2020
Forensic science often involves the comparison of crime-scene evidence to a known-source sample to determine if the evidence and the reference sample came from the same source. Even as forensic analysis tools become increasingly objective and automated, final source identifications are often left to individual examiners’ interpretation of the evidence. Each source identification relies on judgements about the features and quality of the crime-scene evidence that may vary from one examiner to the next. The current approach to characterizing uncertainty in examiners’ decision-making has largely centered around the calculation of error rates aggregated across examiners and identification tasks, without taking into account these variations in behavior. We propose a new approach using IRT and IRT-like models to account for differences among examiners and additionally account for the varying difficulty among source identification tasks. In particular, we survey some recent advances (Luby ...
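
As a rough illustration of the kind of IRT model the abstract refers to, the sketch below implements the simplest case, a Rasch (one-parameter logistic) model, in which the probability of a correct source identification depends only on the gap between an examiner's proficiency and a task's difficulty. The function name, the examiner and task labels, and all numeric values are hypothetical and are not taken from the paper, whose models may well differ.

```python
import math


def rasch_probability(proficiency: float, difficulty: float) -> float:
    """Probability of a correct response under a Rasch (1PL) IRT model."""
    return 1.0 / (1.0 + math.exp(-(proficiency - difficulty)))


# Hypothetical examiner proficiencies and task difficulties (illustration only).
examiner_proficiencies = {"examiner_a": 1.0, "examiner_b": -0.5}
task_difficulties = {"easy_task": -1.0, "hard_task": 1.5}

for examiner, theta in examiner_proficiencies.items():
    for task, b in task_difficulties.items():
        p = rasch_probability(theta, b)
        print(f"{examiner} on {task}: P(correct) = {p:.3f}")
```

Under this model, two examiners with the same raw error rate can still differ in estimated proficiency if they were assigned tasks of different difficulty, which is the distinction aggregated error rates cannot make.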

Forensic Science International: Genetics Supplement Series, 2008
Forensic inference from genetic markers uses highly polymorphic multi-locus genotypes. Measures of informativeness can aid in selecting efficient genetic markers. Existing measures do not account for multiple sources of genetic variation (e.g. mutation, silent alleles) and they are not directly applicable to complex identification problems. Using information theoretic principles within a probabilistic expert system (PES) we define a general measure of informativeness, I_q, of a marker for answering a forensic query. I_q gives a slightly different ranking of most genetic markers than comparable measures do. Accounting for sources of variation such as mutation, silent and null alleles reduces I_q and may further affect ranking. This criterion has a solid theoretical basis and can account for multiple sources of genetic variation and other anomalies. It can be directly applied to a variety of planning issues concerning the type, quantity and specific choice of markers for use in paternity testing and more general forensic problems.

Journal of Electromyography and Kinesiology, 2012
Work-related musculoskeletal disorders have been associated with office work, yet exposure quantification is challenging and not measured consistently. Our objective was to examine associations within and across exposure measurements guided by a conceptual model of three measurement locations: external to the body, at the interface, and internal to the body. Forty-one office workers (71% female), mean age 41 years (SD=9.6), mean height 168 cm (SD=10.3), and mean weight 74 kg (SD=19), were recruited from a large urban newspaper. Four methods of quantifying mechanical exposure were used, linked to locations: equipment dimensions (external), relative fit and postures (interface), and EMG (internal). We explored: (1) a within-location analysis of relationships among methods; and (2) a cross-location analysis of relationships among methods. Exposure method comparisons showed mostly weak correlations among equipment variables, moderate correlations among posture variables, and strong or moderate correlations among EMG variables. For the majority of pair-wise comparisons between exposure measures across locations, the correlations were weak or moderate. Comparisons of relative fit revealed some differences in dimensions, postures, and EMG measures. Few strong associations between various exposure measures were found, although worker-reported relative fit holds promise. Future work might link exposure methods (specific measures) with locations for particular purposes.

International Archives of Occupational and Environmental Health, 2011
Purpose: To detect impacts of changes in work environment and worker-equipment interface variables upon surface electromyography (EMG) measures using multivariate, longitudinal analysis. Methods: For 33 office workers, yearly measurements (1999-2001) were taken during normal work. Independent variables were related to work environment (expert-observed equipment dimensions, work organization on questionnaire) and interface (expert-observed postures, self-reported workstation-equipment relative fit, i.e. inside or outside guidelines-informed location, and 30 min video-based task analysis). Internal mechanical exposure (EMG) was recorded bilaterally from extensor carpi radialis brevis (ECRB) and upper trapezius sites, each side, also for 30 min. Dependent variables were amplitude probability distribution functions (APDF 50 and 90%) and gaptime for entire record EMG (over all tasks) and task-specific EMG (for four separate tasks). Multivariate mixed models used independent variables to predict EMG measures (4 muscle sites × (1 entire record + 4 task specific) = 20 models total). Results: Among EMG measures, 9/16 means and 2/16 variances were significantly different across years (p < 0.1). Environment and interface variables explained part of the variation in EMG measures in 13/20 models. The most consistent predictors included: (1) increased monitor distance predicted reduced APDFs and increased gaptimes; (2) wrist extension <20° predicted decreases in left ECRB APDFs; (3) keyboard location within guidelines predicted improvements in all right ECRB EMG measures during keyboarding; and (4) longer task duration predicted higher APDFs and lower gaptimes. Conclusion: Longitudinal analysis with multivariate models can detect the impacts of changes in environment and interface exposures on EMG measures among office workers.

Forensic Science International: Genetics, 2013
Increases in the sensitivity of DNA profiling technology now allow profiles to be obtained from smaller and more degraded DNA samples than was previously possible. The resulting profiles can be highly informative, but the subjective elements in the interpretation make it problematic to achieve the valid and efficient evaluation of evidential strength required in criminal cases. The problems arise from stochastic phenomena such as "dropout" (absence of an allele in the profile that is present in the underlying DNA) and experimental artefacts such as "stutter" that can generate peaks of ambiguous allelic status. Currently in the UK, evidential strength evaluation uses an approach in which the complex signals in the DNA profiles are interpreted in a semi-manual fashion by trained experts aided by a set of guidelines, but also relying substantially on professional judgment. We introduce a statistical model to calculate likelihood ratios for evaluating DNA evidence arising from multiple known and unknown contributors that allows for such stochastic phenomena by incorporating peak heights. Efficient use of peak heights allows for more crime scene profiles to be reported to courts than is currently possible. The model parameters are estimated from experimental data incorporating multiple sources of variability in the profiling system. We report and analyse experimental results from the SGMPlus system, run at 28 amplification cycles with no enhancements, currently used in the UK. Our methods are readily adapted to other DNA profiling systems provided that the experimental data for the parameter estimation is available.
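
For readers unfamiliar with the term, a likelihood ratio compares the probability of the observed evidence under two competing hypotheses. The generic definition is sketched below; the symbols E, H_p, and H_d are conventional notation assumed here rather than taken from the paper, and the paper's peak-height model itself is not reproduced.

```latex
\[
  \mathrm{LR} = \frac{\Pr(E \mid H_p)}{\Pr(E \mid H_d)}
\]
```

Here E stands for the observed DNA profile, H_p for the prosecution hypothesis (for example, that a named individual contributed to the sample), and H_d for the defence hypothesis (for example, that an unknown individual did). Values above 1 support H_p and values below 1 support H_d.
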
The 27th Annual …, 2005
Background: Decision and cost-effectiveness analytic models require estimates of treatment effectiveness as proportion of patients achieving positive treatment outcomes (responder analysis). This approach often means dichotomizing continuous scores into thresholds of ...

ArXiv, 2021
On any given day, tens of millions of people find themselves trapped in instances of modern slavery. The terms "human trafficking," "trafficking in persons," and "modern slavery" are sometimes used interchangeably to refer to both sex trafficking and forced labor. Human trafficking occurs when a trafficker compels someone to provide labor or services through the use of force, fraud, and/or coercion. The wide range of stakeholders in human trafficking presents major challenges. Direct stakeholders are law enforcement, NGOs and INGOs, businesses, local/planning government authorities, and survivors. Viewed from a very high level, all stakeholders share in a rich network of interactions that produce and consume enormous amounts of information. The problems of making efficient use of such information for the purposes of fighting trafficking while at the same time adhering to community standards of privacy and ethics are formidable. At the same time they help us,...