Air Quality Analysis Using Machine Learning

This report analyzes air quality data using machine learning techniques, focusing on predicting Air Quality Index (AQI) values and categorizing them. Two datasets from Indian cities and global AQI values were utilized, with strong performance achieved in both regression and classification models. The findings highlight the potential of machine learning for effective air pollution monitoring and forecasting.

Abstract
 This report presents a comprehensive analysis of air quality data using machine learning techniques.
 The study utilizes two datasets—one detailing daily air pollutant measurements in various Indian cities
and another containing global AQI values with geographic coordinates.
 By applying regression and classification models, the report aims to predict Air Quality Index (AQI)
values and categorize them into standard AQI buckets.
 The regression model reaches an R² of 0.807 and the classifier an accuracy of 72.6%, which suggests
that machine learning can support air pollution monitoring and forecasting.

Introduction
 Air pollution is one of the most serious environmental concerns worldwide. It harms human health
and ecosystems and contributes to climate change.
 Monitoring and predicting air quality are crucial for public safety and policy-making. This report uses
machine learning (ML) to analyze air quality data and build models to predict AQI and its categories,
which helps understand pollution patterns and anticipate hazardous conditions.

Dataset Description
2.1 city_day.csv
 Total Records: 29,531
 Main Features: PM2.5, PM10, NO, NO2, NOx, NH3, CO, SO2, O3, Benzene, Toluene, Xylene
 Target Variables: AQI (for regression), AQI_Bucket (for classification)
 Cities Covered: Various Indian cities
 Time Range: Includes multiple dates for each city
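The AQI_Bucket target is a categorical view of the numeric AQI target. As a minimal sketch, assuming the buckets follow the standard CPCB National AQI breakpoints (0-50 Good, 51-100 Satisfactory, 101-200 Moderate, 201-300 Poor, 301-400 Very Poor, above 400 Severe), the mapping could be written as follows. The function name `aqi_bucket` is illustrative and does not come from the dataset or the report.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first five buckets (assumed CPCB breakpoints).
UPPER_BOUNDS = [50, 100, 200, 300, 400]
BUCKETS = ["Good", "Satisfactory", "Moderate", "Poor", "Very Poor", "Severe"]


def aqi_bucket(aqi):
    """Return the bucket label for a numeric AQI value (hypothetical helper)."""
    # bisect_left finds the first upper bound that is >= aqi, so a value equal
    # to a bound stays in the lower bucket; anything above 400 is "Severe".
    return BUCKETS[bisect_left(UPPER_BOUNDS, aqi)]


for value in (42, 100, 101, 350, 480):
    print(value, aqi_bucket(value))
```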
2.2 AQI-and-Lat-Long-of-Countries.csv
 Total Records: 16,695
 Main Features: AQI Value, CO AQI Value, Ozone AQI Value, NO2 AQI Value, PM2.5 AQI Value
 Other Data: Latitude and Longitude of monitoring points

Data Preprocessing
 Removed rows with missing AQI or AQI_Bucket
 Filled missing pollutant values using median imputation
 Encoded AQI categories using Label Encoding for classification tasks
 Split data into training and testing sets (80:20 split)
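The four steps above can be sketched with pandas and scikit-learn. This is an illustration rather than the report's actual code: the function name `preprocess`, the added column `AQI_Bucket_Code`, and the random seed are assumptions, and the column names are taken from the feature list in Section 2.1.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Pollutant columns as listed for city_day.csv in Section 2.1.
POLLUTANTS = ["PM2.5", "PM10", "NO", "NO2", "NOx", "NH3",
              "CO", "SO2", "O3", "Benzene", "Toluene", "Xylene"]


def preprocess(df, seed=42):
    """Apply the four preprocessing steps to a city_day-style DataFrame."""
    # 1. Remove rows where either target is missing.
    df = df.dropna(subset=["AQI", "AQI_Bucket"]).copy()
    # 2. Fill missing pollutant readings with each column's median.
    df[POLLUTANTS] = df[POLLUTANTS].fillna(df[POLLUTANTS].median())
    # 3. Encode the AQI category as integers (hypothetical column name).
    encoder = LabelEncoder()
    df["AQI_Bucket_Code"] = encoder.fit_transform(df["AQI_Bucket"])
    # 4. Hold out 20% of the rows for testing.
    train, test = train_test_split(df, test_size=0.2, random_state=seed)
    return train, test, encoder
```

A typical call would be `train, test, encoder = preprocess(pd.read_csv("city_day.csv"))`. The medians here are computed before the split, matching the order of the steps listed above; computing them on the training portion only would avoid leaking test-set statistics into training.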
Machine Learning Techniques
4.1 Regression (AQI Prediction)
 Model: Linear Regression
 Features Used: All pollutant values
 Target: AQI (continuous)
4.2 Classification (AQI Category)
 Model: Decision Tree Classifier
 Features Used: All pollutant values
 Target: AQI_Bucket (encoded)
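
A hedged sketch of how the two models might be fitted with scikit-learn. The report's data is not reproduced here, so the example uses synthetic stand-in data: 12 random feature columns in place of the pollutant readings, a made-up linear AQI, and bucket codes derived from it. All variable names, the synthetic data, and the default hyperparameters are assumptions, so the scores it produces say nothing about the report's results.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the 12 pollutant columns (not real measurements).
X = rng.uniform(0, 200, size=(500, 12))
weights = rng.uniform(0.1, 0.5, size=12)
aqi = X @ weights + rng.normal(0, 10, size=500)   # made-up continuous target
# Integer bucket codes 0..5 from assumed AQI upper bounds (inclusive).
bucket = np.digitize(aqi, [50, 100, 200, 300, 400], right=True)

# One 80:20 split shared by both tasks.
X_train, X_test, aqi_train, aqi_test, b_train, b_test = train_test_split(
    X, aqi, bucket, test_size=0.2, random_state=42
)

# 4.1 Regression: predict the continuous AQI value.
reg = LinearRegression().fit(X_train, aqi_train)
r2 = reg.score(X_test, aqi_test)      # R² on the held-out rows

# 4.2 Classification: predict the encoded AQI bucket.
clf = DecisionTreeClassifier(random_state=42).fit(X_train, b_train)
acc = clf.score(X_test, b_test)       # accuracy on the held-out rows

print(f"R2 = {r2:.3f}, accuracy = {acc:.3f}")
```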

Results
5.1 Regression Model
 R² Score: 0.807
 RMSE: 59.44
The model explains 80.7% of the variance in AQI values.
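For reference, the two regression metrics can be computed directly from their definitions: RMSE is the square root of the mean squared error, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below uses plain Python and made-up numbers; the function names are illustrative, and scikit-learn's `mean_squared_error` and `r2_score` provide equivalent functionality.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (AQI points)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions account for."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total
    return 1 - ss_res / ss_tot


# Made-up AQI values, for illustration only.
actual = [100, 200, 300]
predicted = [110, 190, 300]
print(round(rmse(actual, predicted), 3), round(r_squared(actual, predicted), 3))
```

Read this way, an RMSE of 59.44 means the predictions are off by roughly 59 AQI points in the root-mean-square sense.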
5.2 Classification Model
 Accuracy: 72.6%
Classification Report:

Category       Precision   Recall   F1-Score
Good           0.62        0.58     0.60
Moderate       0.76        0.75     0.75
Poor           0.57        0.57     0.57
Satisfactory   0.77        0.78     0.78
Severe         0.78        0.76     0.77
Very Poor      0.68        0.69     0.69
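
The F1-score in each row is the harmonic mean of precision and recall; for the Good category, 2 × 0.62 × 0.58 / (0.62 + 0.58) ≈ 0.60. As a rough sketch of how such a report is derived from predictions, the plain-Python functions below compute accuracy and the per-class figures. The function names and the toy labels are illustrative; scikit-learn's `classification_report` produces the same kind of summary.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return report


# Toy labels, for illustration only.
truth = ["Good", "Good", "Poor", "Poor", "Poor"]
preds = ["Good", "Poor", "Poor", "Poor", "Good"]
print(accuracy(truth, preds))
print(per_class_metrics(truth, preds))
```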

Conclusion
This project demonstrates the usefulness of machine learning in environmental monitoring. The linear
regression model explains 80.7% of the variance in AQI values (RMSE 59.44), and the decision tree
classifier assigns the correct AQI category in 72.6% of cases. Models of this kind could be integrated into
air pollution tracking systems to provide real-time alerts and long-term insights.
