Practical Manual - Machine Learning Application

Uploaded by

Missaka Sooriyaarachchi

PREDICTING PULSAR STARS USING MACHINE LEARNING

Introduction
In the vast expanse of the universe, pulsars, highly magnetised and rapidly rotating neutron stars,
have captivated the curiosity of astronomers and astrophysicists alike. Identifying these celestial
objects from the vast amount of astronomical data is a challenging task that can benefit from the
power of machine learning.
This experiment gives undergraduate students an opportunity to explore astrophysics and machine
learning together by predicting the presence of pulsar stars with a classification model. Using
Random Forest, a popular machine learning algorithm, participants will work through data analysis,
feature engineering, and model training.

Objectives
The primary objective of this experiment is to introduce students to the application of machine
learning to the identification of pulsar stars. By the end of the experiment, participants should have
gained hands-on experience in:
1. Understanding the nature and significance of pulsar stars in astrophysics
2. Acquiring and preprocessing astronomical data related to pulsar candidates
3. Exploring and analysing data to identify relevant features
4. Implementing a Random Forest classification model to predict pulsar stars
5. Evaluating the performance of the model and interpreting the results
6. Gaining insights into the challenges and considerations when applying machine learning
to real-world astrophysical problems

Experimental Setup
1. Dataset: Participants will be provided with a carefully curated dataset (HTRU2) containing
various features extracted from radio frequency observations of potential pulsar stars.
These features include statistical measures, signal profiles, and other relevant information.
See the dataset description below for more information. (Note that the original dataset
has been modified for the purpose of this experiment.)
2. Data Preprocessing: Participants will explore the dataset, handle missing values and class
imbalance, and apply normalisation methods.
3. Feature Engineering: Students will identify and engineer relevant features that can enhance
the performance of the classification model. This process will involve extracting
meaningful information and selecting appropriate input variables.
4. Model Training: Using the Random Forest algorithm, participants will construct a
classification model that predicts the presence of pulsar stars from the selected features
with high accuracy. They will learn how different features affect the performance of the
algorithm and how to reduce underfitting and overfitting.
5. Model Evaluation: The trained model will be evaluated using various performance metrics,
such as accuracy, precision, recall, F1 and F2 scores. Participants will interpret the results
and gain insights into the effectiveness of the model in distinguishing pulsar stars.
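
The preprocessing, training, and evaluation steps above can be sketched end to end. The following is a minimal illustration only, not the required solution: it runs on synthetic data from scikit-learn's make_classification rather than the real candidate files, and the imputation strategy, the number of trees, max_depth, and class_weight="balanced" are all assumed choices. No scaler is included because tree ensembles are insensitive to feature scaling.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import (
    accuracy_score, f1_score, fbeta_score, precision_score, recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the candidate data: 8 features, about 9% positives.
X, y = make_classification(
    n_samples=2000, n_features=8, n_informative=5, n_redundant=1,
    weights=[0.91, 0.09], random_state=0,
)

# Blank out roughly 2% of the values to mimic missing entries.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.02] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Median imputation, then a forest whose class weights offset the imbalance.
# max_depth limits tree size, one of several knobs for curbing overfitting.
model = make_pipeline(
    SimpleImputer(strategy="median"),
    RandomForestClassifier(
        n_estimators=200, max_depth=8, class_weight="balanced", random_state=0
    ),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
    "f2": fbeta_score(y_test, y_pred, beta=2, zero_division=0),
}
for name, value in scores.items():
    print(f"test {name}: {value:.3f}")

# A large gap between training and test scores hints at overfitting.
train_f2 = fbeta_score(y_train, model.predict(X_train), beta=2, zero_division=0)
print(f"train f2: {train_f2:.3f}")

# Impurity-based importance of each of the 8 input features.
importances = model[-1].feature_importances_
print(np.round(importances, 3))
```

Comparing the training and test F2 scores while varying max_depth or n_estimators is one simple way to see underfitting and overfitting in practice.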

Conclusion
By participating in this experiment, students will gain valuable hands-on experience in the
interdisciplinary fields of astrophysics and machine learning. They will deepen their understanding
of pulsar stars, develop data analysis skills, and learn to apply machine learning algorithms to real-
world problems. This experiment serves as a stepping stone for students interested in exploring
the fascinating intersection of astronomy and data science.

References
Students may use the following references in order to complete the experiment.
• Python (recommended version 3.9.0): https://www.python.org/downloads/release/python-390/
• Pandas (for data handling)
  o Documentation: https://pandas.pydata.org/docs/
  o Tutorials: https://www.w3schools.com/python/pandas/default.asp
• Scikit-learn (to build Random Forest classifiers): https://scikit-learn.org/stable/
• Learn About Random Forest From Scratch (external article):
  https://www.linkedin.com/pulse/random-forest-explained-regression-classification-tasks-sellahewa?utm_source=share&utm_medium=member_android&utm_campaign=share_via

DATASET DESCRIPTION

******************************************************************************
# HTRU2

© Author: Rob Lyon, School of Computer Science & Jodrell Bank Centre for Astrophysics,
University of Manchester, Kilburn Building, Oxford Road, Manchester M13 9PL.
Web: http://www.scienceguyrob.com or http://www.cs.manchester.ac.uk
or alternatively http://www.jb.man.ac.uk
******************************************************************************

1. Overview

HTRU2 is a data set which describes a sample of pulsar candidates collected during the High Time
Resolution Universe Survey (South). Pulsars are a rare type of neutron star that produce radio
emission detectable here on Earth. They are of considerable scientific interest as probes of
space-time, the interstellar medium, and states of matter.
As pulsars rotate, their emission beam sweeps across the sky, and when this crosses our line of
sight, produces a detectable pattern of broadband radio emission. As pulsars rotate rapidly, this
pattern repeats periodically. Thus pulsar search involves looking for periodic radio signals with
large radio telescopes. Each pulsar produces a slightly different emission pattern, which varies
slightly with each rotation. Thus a potential signal detection, known as a 'candidate', is averaged
over many rotations of the pulsar, as determined by the length of an observation. In the absence of
additional information, each candidate could potentially describe a real pulsar. However, in
practice almost all detections are caused by radio frequency interference (RFI) and noise, making
legitimate signals hard to find.
Machine learning tools are now being used to automatically label pulsar candidates to facilitate
rapid analysis. Classification systems in particular are being widely adopted, which treat the
candidate data sets as binary classification problems. Here the legitimate pulsar examples are a
minority positive class, and spurious examples the majority negative class. At present multi-class
labels are unavailable, given the costs associated with data annotation. The data set shared here
contains 16,259 spurious examples caused by RFI/noise, and 1,639 real pulsar examples. These
examples have all been checked by human annotators. Each candidate is described by 8 continuous
variables. The first four are simple statistics obtained from the integrated pulse profile (folded
profile). This is an array of continuous variables that describe a longitude-resolved version of the
signal that has been averaged in both time and frequency. The remaining four variables are
similarly obtained from the DM-SNR curve. These are summarized below:
1. Mean of the integrated profile.
2. Standard deviation of the integrated profile.
3. Excess kurtosis of the integrated profile.
4. Skewness of the integrated profile.
5. Mean of the DM-SNR curve.
6. Standard deviation of the DM-SNR curve.
7. Excess kurtosis of the DM-SNR curve.
8. Skewness of the DM-SNR curve.
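
To make the four statistics concrete, here is a small sketch of how they could be computed from a one-dimensional array such as a folded profile. It assumes population (biased) moments; the PulsarFeatureLab tool may use different conventions, and the example profile is a hypothetical Gaussian pulse, not real data.

```python
import numpy as np

def profile_statistics(values):
    """Mean, standard deviation, excess kurtosis and skewness of a 1-D array."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    centred = x - mean
    m2 = np.mean(centred**2)  # second central moment (population variance)
    m3 = np.mean(centred**3)  # third central moment
    m4 = np.mean(centred**4)  # fourth central moment
    return {
        "mean": mean,
        "std": np.sqrt(m2),
        "excess_kurtosis": m4 / m2**2 - 3.0,  # 0 for a normal distribution
        "skewness": m3 / m2**1.5,
    }

# Hypothetical folded profile: one narrow Gaussian pulse over 64 phase bins.
phase = np.linspace(0.0, 1.0, 64, endpoint=False)
profile = np.exp(-0.5 * ((phase - 0.5) / 0.03) ** 2)

for name, value in profile_statistics(profile).items():
    print(f"{name}: {value:.3f}")
```

A profile dominated by a single narrow pulse has most bins near the baseline and a few high ones, so its skewness is positive. Noise-like profiles tend to give values closer to zero, which is part of why these statistics are useful as features.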

For the purpose of this experiment, the dataset is split into two files, pulsar_data_train.csv and
pulsar_data_test.csv. You must use the training data file to train your Random Forest model. After
training, use the test data file to evaluate the performance of the trained model. In both files,
each candidate is stored on its own row. Each row lists the variables first, and the class label is
the final entry. The class labels used are 0 (negative, not a pulsar star) and 1 (positive, a
pulsar star).
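
The row layout described above (variables first, class label last) can be read with pandas as sketched below. This is an illustration under assumptions: the sample rows are made up, the column names are hypothetical, and whether the real files carry a header row should be checked before reusing the code. To load the actual data, replace the in-memory sample with the path pulsar_data_train.csv.

```python
import io

import pandas as pd

# Made-up sample in the described layout: eight feature columns followed
# by the class label. Column names are hypothetical; one value is left
# empty on purpose to show how a missing entry appears.
SAMPLE_CSV = """mean_ip,std_ip,kurt_ip,skew_ip,mean_dm,std_dm,kurt_dm,skew_dm,target
140.6,55.7,-0.23,-0.70,3.2,19.1,7.98,74.2,0
102.5,58.9,0.47,-0.52,1.7,14.9,10.6,127.4,0
33.1,30.2,5.10,28.3,72.4,68.0,0.60,-0.90,1
119.5,,0.03,-0.11,1.0,9.3,19.2,479.8,0
"""

def split_features_label(df):
    """Split a candidate table into features (all but last column) and label."""
    X = df.iloc[:, :-1]
    y = df.iloc[:, -1].astype(int)
    return X, y

train = pd.read_csv(io.StringIO(SAMPLE_CSV))
X_train, y_train = split_features_label(train)

print("feature matrix shape:", X_train.shape)
print("class counts:", y_train.value_counts().to_dict())
print("missing feature values:", int(X_train.isna().sum().sum()))
```

Selecting columns by position rather than by name keeps the sketch independent of whatever header the modified files actually use.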

Please note that the data contains no positional information or other astronomical details.
It is simply feature data extracted from candidate files using the PulsarFeatureLab tool.
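
Because real pulsars are a small minority of the candidates, accuracy alone can be misleading when evaluating on the test file. The sketch below uses only the class counts quoted in the overview (16,259 spurious and 1,639 real examples; the modified files for this experiment may differ) to show what a trivial "always negative" predictor would score. The fbeta helper is an illustrative function written for this example, not part of any library.

```python
# Class counts quoted in the HTRU2 overview. The modified files used in
# this experiment may contain different numbers.
N_NEGATIVE = 16_259
N_POSITIVE = 1_639

def fbeta(tp, fp, fn, beta):
    """F-beta score from confusion-matrix counts; 0.0 when undefined."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denominator if denominator else 0.0

total = N_NEGATIVE + N_POSITIVE
positive_fraction = N_POSITIVE / total

# A "model" that labels every candidate as class 0 (not a pulsar):
# every real pulsar becomes a false negative.
baseline_accuracy = N_NEGATIVE / total
baseline_f2 = fbeta(tp=0, fp=0, fn=N_POSITIVE, beta=2)

print(f"positive fraction: {positive_fraction:.3f}")
print(f"always-negative accuracy: {baseline_accuracy:.3f}")
print(f"always-negative F2: {baseline_f2:.3f}")
```

With these counts the trivial predictor reaches roughly 91% accuracy while finding no pulsars at all, which is why recall and the F2 score (which weights recall more heavily than precision) are worth reporting alongside accuracy.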

2. Abbreviations
DM: The dispersion measure, which is the integrated column density of free electrons along the line
of sight. It determines the frequency-dependent delay (dispersion) that a radio signal experiences
as it travels through the interstellar medium.
SNR: The signal-to-noise ratio, which is a measure of the strength of the radio signal relative to
the noise.

****************
END OF THE EXPERIMENT
