Papers by Zafiirah Banon Hosenie

Astronomers require efficient automated detection and classification pipelines when conducting large-scale surveys of the optical sky. Such pipelines are fundamentally important as they permit rapid follow-up and analysis of those detections most likely to be of scientific value. We present a deep learning framework based on a convolutional neural network model known as MeerCRAB. It is designed to filter out the so-called “bogus” detections from true astrophysical sources in the transient detection pipeline of the MeerLICHT telescope. Optical candidates are described using a variety of 2D images and numerical features extracted from those images. The relationship between the input images and the target classes is unclear, since the ground truth is poorly defined and often the subject of debate. This makes it difficult to determine which source of information should be used to train a classification algorithm. To proceed, we deployed variants of MeerCRAB that employed different networ...

The accurate automated classification of variable stars into their respective subtypes is difficult. Machine learning based solutions often fall foul of the imbalanced learning problem, which causes poor generalisation performance in practice, especially on rare variable star sub-types. We attempted to overcome such deficiencies via the development of a hierarchical machine learning classifier. This ‘algorithm-level’ approach to tackling imbalance yielded promising results on Catalina Real-Time Survey (CRTS) data. We attempt to further improve hierarchical classification performance by applying ‘data-level’ approaches to directly augment the training data so that they better describe under-represented classes. We apply and report results for three data augmentation methods in particular: Randomly Augmented Sampled Light curves from magnitude Error (RASLE), augmenting light curves with Gaussian Process modelling (GpFit) and the Synthetic Minority Over-sampling Technique (SMOTE). When ...

Astronomers require efficient automated detection and classification pipelines when conducting large-scale surveys of the (optical) sky for variable and transient sources. Such pipelines are fundamentally important, as they permit rapid follow-up and analysis of those detections most likely to be of scientific value. We therefore present a deep learning pipeline based on the convolutional neural network architecture called MeerCRAB. It is designed to filter out the so-called “bogus” detections from true astrophysical sources in ...
Affiliations: 1. Jodrell Bank Centre for Astrophysics, Department of Physics and Astronomy, The University of Manchester, Manchester M13 9PL, UK. 2. Department of Astrophysics/IMAPP, Radboud University, P.O. 9010, 6500 GL, Nijmegen, The Netherlands. 3. Inter-University Institute for Data Intensive Astronomy & Department of Astronomy, University of Cape Town, Private Bag X3, Rondebosch 7701, South Africa. 4. South African Astronomical Observatory, P...

Upcoming synoptic surveys are set to generate an unprecedented amount of data. This requires an automatic framework that can quickly and efficiently provide classification labels for several new object classification challenges. Using data describing 11 types of variable stars from the Catalina Real-Time Transient Survey (CRTS), we illustrate how to capture the most important information from computed features and describe detailed methods of how to robustly use information theory for feature selection and evaluation. We apply three machine learning algorithms and demonstrate how to optimize these classifiers via cross-validation techniques. For the CRTS data set, we find that the random forest classifier performs best in terms of balanced accuracy and geometric means. We demonstrate substantially improved classification results by converting the multiclass problem into a binary classification task, achieving a balanced-accuracy rate of ∼99 per cent for the classification of δ Scuti...
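The two metrics named in this abstract can be illustrated with a minimal sketch. It assumes the common definitions (balanced accuracy as the mean of per-class recalls, geometric mean as the geometric mean of those recalls); the paper's exact definitions may differ, and the class labels and predictions below are hypothetical, not CRTS results.

```python
from collections import defaultdict
from math import prod


def per_class_recall(y_true, y_pred):
    """Return {class: recall}, i.e. the fraction of each true class predicted correctly."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


def balanced_accuracy(y_true, y_pred):
    """Arithmetic mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition)."""
    recalls = per_class_recall(y_true, y_pred)
    return prod(recalls.values()) ** (1.0 / len(recalls))


# Hypothetical imbalanced example: 8 objects of a common class, 2 of a rare one.
y_true = ["RRab"] * 8 + ["dScuti"] * 2
y_pred = ["RRab"] * 8 + ["RRab", "dScuti"]  # one rare object is misclassified

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
ba = balanced_accuracy(y_true, y_pred)
gm = geometric_mean_score(y_true, y_pred)
print(plain_accuracy, ba, gm)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, which shows why recall-averaged metrics are preferred when rare classes matter.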

The accurate automated classification of variable stars into their respective sub-types is difficult. Machine learning based solutions often fall foul of the imbalanced learning problem, which causes poor generalisation performance in practice, especially on rare variable star sub-types. In previous work, we attempted to overcome such deficiencies via the development of a hierarchical machine learning classifier. This 'algorithm-level' approach to tackling imbalance yielded promising results on Catalina Real-Time Survey (CRTS) data, outperforming the binary and multi-class classification schemes previously applied in this area. In this work, we attempt to further improve hierarchical classification performance by applying 'data-level' approaches to directly augment the training data so that they better describe under-represented classes. We apply and report results for three data augmentation methods in particular: $\textit{R}$andomly $\textit{A}$ugmented $\textit{S...