Papers by Michal Ozery-flato

JAMA Network Open
Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22 032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The ove...
Sorting cancer karyotypes by elementary operations
Fast and Efficient Feature Engineering for Multi-Cohort Analysis of EHR Data
Studies in health technology and informatics, 2017
We present a framework for feature engineering, tailored for longitudinal structured data, such as electronic health records (EHRs). To fast-track feature engineering and extraction, the framework combines general-use plug-in extractors, a multi-cohort management mechanism, and modular memoization. Using this framework, we rapidly extracted thousands of features from diverse and large healthcare data sources in multiple projects.
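
The memoization idea in the abstract above can be sketched roughly as follows. This is an illustration only, not the framework's actual API: `FeatureStore`, `mean_value_of`, and the `(patient_id, event_code, value)` record layout are hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout: (patient_id, event_code, value)
Record = Tuple[str, str, float]
Extractor = Callable[[List[Record]], Dict[str, float]]


class FeatureStore:
    """Caches extractor outputs per (cohort name, feature name)."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, str], Dict[str, float]] = {}
        self.computations = 0  # number of cache misses, kept for illustration

    def extract(self, cohort_name: str, records: List[Record],
                feature_name: str, extractor: Extractor) -> Dict[str, float]:
        key = (cohort_name, feature_name)
        if key not in self._cache:
            # Compute once per cohort and feature; later calls reuse the result.
            self.computations += 1
            self._cache[key] = extractor(records)
        return self._cache[key]


def mean_value_of(code: str) -> Extractor:
    """Build a plug-in extractor: per-patient mean of one event code."""
    def extractor(records: List[Record]) -> Dict[str, float]:
        sums: Dict[str, float] = {}
        counts: Dict[str, int] = {}
        for patient_id, event_code, value in records:
            if event_code == code:
                sums[patient_id] = sums.get(patient_id, 0.0) + value
                counts[patient_id] = counts.get(patient_id, 0) + 1
        return {pid: sums[pid] / counts[pid] for pid in sums}
    return extractor


records = [("p1", "glucose", 100.0), ("p1", "glucose", 120.0),
           ("p2", "glucose", 90.0), ("p2", "bmi", 31.0)]
store = FeatureStore()
first = store.extract("cohort_a", records, "mean_glucose", mean_value_of("glucose"))
second = store.extract("cohort_a", records, "mean_glucose", mean_value_of("glucose"))
```

Keying the cache on both cohort and feature name is one simple way to let several cohorts share the same extractors without recomputing features already extracted.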

The Impact of COVID-19 Pandemic on Clinical Findings in Medical Imaging Exams: An Observational Study in a Nationwide Israeli Health Organization (Preprint)
BACKGROUND The outbreak of the COVID-19 pandemic was followed by reduced utilization of routine diagnostic exams. Along with other pandemic-related factors, this may have influenced detected clinical conditions. OBJECTIVE The study aimed to analyze the impact of COVID-19 on the use of outpatient medical imaging services and clinical findings therein, specifically focusing on the time period after the launch of the Israeli COVID-19 vaccination campaign. In addition, the study tested whether the observed gains in clinical findings may be linked to exposure to COVID-19 infection, hospitalization (indicative of COVID-19 complications), or vaccination. METHODS Our dataset included 572,480 ambulatory medical imaging patients in a national health organization, from January 1, 2019 to August 31, 2021. We compared different measures of medical imaging utilization and clinical findings therein, before and after the surge of the pandemic, to identify significant changes. We also inspected the ...

Real-world healthcare data hold the potential to identify therapeutic solutions for progressive diseases by efficiently pinpointing safe and efficacious repurposing drug candidates. This approach circumvents key early clinical development challenges, particularly relevant for neurological diseases, concordant with the vision of the 21st Century Cures Act. However, to date, these data have been utilized mainly for confirmatory purposes rather than as drug discovery engines. Here, we demonstrate the usefulness of real-world data in identifying drug repurposing candidates for disease-modifying effects, specifically candidate marketed drugs that exhibit beneficial effects on Parkinson’s disease (PD) progression. We performed an observational study in cohorts of ascertained PD patients extracted from two large medical databases, Explorys SuperMart (N=88,867) and IBM MarketScan Research Databases (N=106,395); and applied two conceptually different, well-established causal inference metho...

In response to the outbreak of the coronavirus disease 2019 (Covid-19), governments worldwide have introduced multiple restriction policies, known as non-pharmaceutical interventions (NPIs). However, the relative impact of control measures and the long-term causal contribution of each NPI are still a topic of debate. We present a method to rigorously study the effectiveness of interventions on the rate of the time-varying reproduction number Rt and on human mobility, considered here as a proxy measure of policy adherence and social distancing. We frame our model using a causal inference approach to quantify the impact of five governmental interventions introduced until June 2020 to control the outbreak in 113 countries: confinement, school closure, mask wearing, cultural closure, and work restrictions. Our results indicate that mobility changes are more accurately predicted when compared to reproduction number. All NPIs, except for mask wearing, significantly affected human mobility...

Radiology, 2022
Background: Digital breast tomosynthesis (DBT) has higher diagnostic accuracy than digital mammography, but interpretation time is substantially longer. Artificial intelligence (AI) could improve reading efficiency. Purpose: To evaluate the use of AI to reduce workload by filtering out normal DBT screens. Materials and Methods: The retrospective study included 13 306 DBT examinations from 9919 women performed between June 2013 and November 2018 from two health care networks. The cohort was split into training, validation, and test sets (3948, 1661, and 4310 women, respectively). A workflow was simulated in which the AI model classified cancer-free examinations that could be dismissed from the screening worklist and used the original radiologists' interpretations on the rest of the worklist examinations. The AI system was also evaluated with a reader study of five breast radiologists reading the DBT mammograms of 205 women. The area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and recall rate were evaluated in both studies. Statistics were computed across 10 000 bootstrap samples to assess 95% CIs, noninferiority, and superiority tests. Results: The model was tested on 4310 screened women (mean age, 60 years ± 11 [standard deviation]; 5182 DBT examinations). Compared with the radiologists' performance (417 of 459 detected cancers [90.8%], 477 recalls in 5182 examinations [9.2%]), the use of AI to automatically filter out cases would result in 39.6% less workload, noninferior sensitivity (413 of 459 detected cancers; 90.0%; P = .002), and 25% lower recall rate (358 recalls in 5182 examinations; 6.9%; P = .002). In the reader study, AUC was higher in the standalone AI compared with the mean reader (0.84 vs 0.81; P = .002).
Conclusion: The artificial intelligence model was able to identify normal digital breast tomosynthesis screening examinations, which decreased the number of examinations that required radiologist interpretation in a simulated clinical workflow. Published under a CC BY 4.0 license.
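
The abstract above reports 95% CIs computed across bootstrap samples. As a minimal sketch of that general idea, assuming a plain percentile bootstrap over per-case binary outcomes (the study's exact resampling procedure is not described here, and `bootstrap_ci` is a hypothetical helper), sensitivity could be bracketed like this:

```python
import random
from typing import List, Tuple


def bootstrap_ci(outcomes: List[int], n_boot: int = 1000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float, float]:
    """Percentile bootstrap CI for the mean of binary outcomes.

    Returns (point estimate, lower bound, upper bound).
    """
    rng = random.Random(seed)
    n = len(outcomes)
    point = sum(outcomes) / n
    stats = []
    for _ in range(n_boot):
        # Resample n cases with replacement and recompute the statistic.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        stats.append(sum(resample) / n)
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper


# 413 of 459 cancers detected, as reported in the abstract (1 = detected).
detected = [1] * 413 + [0] * 46
sensitivity, ci_low, ci_high = bootstrap_ci(detected)
```

The number of resamples here is kept small for speed; the abstract states that 10 000 bootstrap samples were used.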

Abstract. Chromosomal instability is a hallmark of cancer. The results of this instability can be observed in the karyotypes of many cancerous genomes, which often contain a variety of aberrations. In this study we introduce a new approach for analyzing rearrangement events in carcinogenesis. This approach builds on a new effective heuristic for computing a short sequence of rearrangement events that may have led to a given karyotype. We applied this heuristic on over 40,000 karyotypes reported in the scientific literature. Our analysis implies that these karyotypes have evolved predominantly via four principal event types: chromosome gains and losses, reciprocal translocations, and terminal deletions. We used the frequencies of the reconstructed rearrangement events to measure similarity between karyotypes. Using clustering techniques, we demonstrate that in many cases, rearrangement event frequencies are a meaningful criterion for distinguishing between karyotypes of distinct tum...

Analysis of chromosomal alterations in cancer
Chromosomal instability is a hallmark of cancer. This instability is manifested in the karyotypes of many cancerous genomes, which often contain a variety of chromosomal aberrations. In this study we introduce a new approach for analyzing chromosomal aberrations in carcinogenesis. This approach builds on a new effective heuristic for reconstructing a set of probable alteration events that transforms a normal karyotype into a given cancer karyotype. We applied this heuristic on over 34,000 karyotypes reported in the scientific literature. Our analysis implies that these karyotypes have evolved predominantly via four principal event types: chromosome gains and losses, reciprocal translocations, and terminal deletions. We used the frequencies of the reconstructed events to measure similarity between karyotypes. We demonstrate that in many cases, event frequencies can distinguish the karyotypes of distinct cancer classes, where the latter are defined by cancer morphology and topography....

Biases in observational data pose a major challenge to estimation methods for the effect of treatments. An important technique that accounts for these biases is reweighting samples to minimize the discrepancy between treatment groups. Inverse probability weighting, a popular weighting technique, models the conditional treatment probability given covariates. However, it is overly sensitive to model misspecification and suffers from large estimation variance. Recent methods attempt to alleviate these limitations by finding weights that minimize a selected discrepancy measure between the reweighted populations. We present a new reweighting approach that uses classification error as a measure of similarity between datasets. Our proposed framework uses bi-level optimization to alternately train a discriminator to minimize classification error, and a balancing weights generator to maximize this error. This approach borrows principles from generative adversarial networks (GANs) that aim to...
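
For orientation, the baseline that the abstract above contrasts itself with, inverse probability weighting, can be sketched as follows. This is not the adversarial method the paper proposes; it assumes propensity scores have already been estimated by some model, and `ipw_weights` and `ipw_ate` are hypothetical helper names.

```python
from typing import List


def ipw_weights(treated: List[int], propensity: List[float]) -> List[float]:
    """Weight each sample by the inverse probability of its observed treatment."""
    weights = []
    for t, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if t == 1 else 1.0 / (1.0 - p))
    return weights


def ipw_ate(outcome: List[float], treated: List[int],
            propensity: List[float]) -> float:
    """Normalized (Hajek-style) IPW estimate of the average treatment effect."""
    w = ipw_weights(treated, propensity)
    num1 = sum(wi * y for wi, y, t in zip(w, outcome, treated) if t == 1)
    den1 = sum(wi for wi, t in zip(w, treated) if t == 1)
    num0 = sum(wi * y for wi, y, t in zip(w, outcome, treated) if t == 0)
    den0 = sum(wi for wi, t in zip(w, treated) if t == 0)
    return num1 / den1 - num0 / den0


# Toy data: with equal propensities the estimate reduces to a difference in means.
example_ate = ipw_ate(outcome=[3.0, 5.0, 1.0, 2.0],
                      treated=[1, 1, 0, 0],
                      propensity=[0.5, 0.5, 0.5, 0.5])
```

Propensity scores close to 0 or 1 produce very large weights, which is one source of the high estimation variance the abstract mentions.
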
The understanding of genome rearrangements is an important endeavor in comparative genomics. A major computational problem in this field is finding a shortest sequence of genome rearrangements that "sorts" one genome into another. In this paper we focus on sorting a multi-chromosomal genome by translocations. We reveal new relationships between this problem and the well studied problem of sorting by reversals. Based on these relationships, we develop two new algorithms for sorting by translocations, which mimic known algorithms for sorting by reversals: a score-based method building on Bergeron's algorithm, and a recursive procedure similar to the Berman-Hannenhalli method. Though their proofs are more involved, our procedures for translocations match the complexities of the original ones for reversals.

A central problem in genome rearrangement is finding a most parsimonious rearrangement scenario using certain rearrangement operations. An important problem of this type is sorting a signed genome by reversals and translocations (SBRT). Hannenhalli and Pevzner presented a duality theorem for SBRT which leads to a polynomial time algorithm for sorting a multi-chromosomal genome using a minimum number of reversals and translocations. However, there is one case for which their theorem and algorithm fail. We describe that case and suggest a correction to the theorem and the polynomial algorithm. The solution of SBRT uses a reduction to the problem of sorting a signed permutation by reversals (SBR). The best extant algorithms for SBR require quadratic time. The common approach to solve SBR is by finding a safe reversal using the overlap graph or the interleaving graph of a permutation. We describe a family of signed permutations which proves a quadratic lower bound on the number of affected vertices in the overlap/interleaving graph during any optimal sorting scenario. This implies, in particular, an Ω(n³) lower bound for Bergeron's algorithm.
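
To make the basic operation in the abstract above concrete, the sketch below applies a single signed reversal to a signed permutation and counts breakpoints. It only illustrates the operation; it is neither the Hannenhalli-Pevzner algorithm nor the correction the paper proposes, and the function names are hypothetical.

```python
from typing import List


def reverse_segment(perm: List[int], i: int, j: int) -> List[int]:
    """Signed reversal of perm[i..j] (inclusive): reverse the order and flip signs."""
    return perm[:i] + [-x for x in reversed(perm[i:j + 1])] + perm[j + 1:]


def count_breakpoints(perm: List[int]) -> int:
    """Count consecutive pairs (a, b) of the framed permutation with b - a != 1.

    The permutation is framed with 0 on the left and n + 1 on the right,
    so the identity permutation has zero breakpoints.
    """
    framed = [0] + perm + [len(perm) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


genome = [-3, -2, -1, 4]
sorted_genome = reverse_segment(genome, 0, 2)
```

In this toy case one reversal removes both breakpoints; in general, finding a shortest sequence of reversals requires the more involved machinery the abstract refers to.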

Nucleic Acids Research, Jul 23, 2014
Genomes undergo changes in organization as a result of gene duplications, chromosomal rearrangements and local mutations, among other mechanisms. In contrast to prokaryotes, in which genes of a common function are often organized in operons and reside contiguously along the genome, most eukaryotes show much weaker clustering of genes by function, except for a few concrete functional groups. We set out to check systematically if there is a relation between gene function and gene organization in the human genome. We test this question for three types of functional groups: pairs of interacting proteins, complexes and pathways. We find a significant concentration of functional groups both in terms of their distance within the same chromosome and in terms of their dispersal over several chromosomes. Moreover, using a Hi-C contact map of the tendency of chromosomal segments to appear close in the 3D space of the nucleus, we show that members of the same functional group that reside on distinct chromosomes tend to co-localize in space. The result holds for all three types of functional groups that we tested. Hence, the human genome shows substantial concentration of functional groups within chromosomes and across chromosomes in space.

Diabetology and Metabolic Syndrome, Jul 15, 2013
Objective: To investigate the predictive value of different biomarkers for the incidence of type 2 diabetes mellitus (T2DM) in subjects with metabolic syndrome. Methods: A prospective study of 525 non-diabetic, middle-aged Lithuanian men and women with metabolic syndrome but without overt atherosclerotic diseases during a follow-up period of two to four years. We used logistic regression to develop predictive models for incident cases and to investigate the association between various markers and the onset of T2DM. Results: Fasting plasma glucose (FPG), body mass index (BMI), and glycosylated haemoglobin can be used to predict diabetes onset with a high level of accuracy and each was shown to have a cumulative predictive value. The estimated area under the receiver-operating characteristic curve (AUC) for this combination was 0.92. The oral glucose tolerance test (OGTT) did not show cumulative predictive value. Additionally, progression to diabetes was associated with high values of aortic pulse-wave velocity (aPWV). Conclusion: T2DM onset in middle-aged metabolic syndrome subjects can be predicted with remarkable accuracy using the combination of FPG, BMI, and HbA1c, and is related to elevated aPWV measurements.
Journal of Computational Biology, 2008
A centromere is a special region in the chromosome that plays a vital role during cell division. Every new chromosome created by a genome rearrangement event must have a centromere in order to survive. This constraint has been ignored in the computational modeling and analysis of genome rearrangements to date. Unlike genes, the different centromeres are indistinguishable, they have no
Chronic diseases constitute the leading cause of mortality in the western world, have a major impact on the patients' quality of life, and comprise the bulk of healthcare costs. Nowadays, healthcare data management systems integrate large amounts of medical information on patients, including diagnoses, medical procedures, lab test results, and more. Sophisticated analysis methods are needed for utilizing these data to assist in patient management and to enhance treatment quality at reduced costs.

Proc 1st RECOMB Satellite Workshop in …, 2007
Chromosomal instability is a hallmark of cancer. The results of this instability can be observed in the karyotypes of many cancerous genomes, which often contain a variety of aberrations. In this study we introduce a new approach for analyzing rearrangement events in carcinogenesis. This approach builds on a new effective heuristic for computing a short sequence of rearrangement events that may have led to a given karyotype. We applied this heuristic on over 40,000 karyotypes reported in the scientific literature. Our analysis implies that these karyotypes have evolved predominantly via four principal event types: chromosome gains and losses, reciprocal translocations, and terminal deletions. We used the frequencies of the reconstructed rearrangement events to measure similarity between karyotypes. Using clustering techniques, we demonstrate that in many cases, rearrangement event frequencies are a meaningful criterion for distinguishing between karyotypes of distinct tumor classes. Further investigations of this kind can provide insight on the scenarios by which particular cancer types have evolved.
Pre-biopsy Multi-class Classification of Breast Lesion Pathology in Mammograms
Machine Learning in Medical Imaging
Improving the Performance and Explainability of Mammogram Classifiers with Local Annotations
Interpretable and Annotation-Efficient Learning for Medical Image Computing