Rainfall Prediction in Begusarai
Bachelor of Technology
in
Civil Engineering
Submitted by:
PROJECT APPROVAL SHEET
A Major Project Entitled
Supervisor
Signature
Begusarai (Bihar)-851134
External Evaluator
CERTIFICATE
The major project report entitled “Rainfall Prediction in Begusarai District using Seasonal
Autoregressive Integrated Moving Average Model”, prepared by Amlesh Kumar
(20101125008) and Shivshankar Kumar (201011250025), Department of Civil Engineering,
RRSDCE Begusarai, is hereby approved and certified as a creditable study in the trend
analysis and forecasting of rainfall, carried out and presented in a satisfactory manner to warrant
its acceptance as a prerequisite to the degree for which it has been submitted.
ACKNOWLEDGEMENT
We take this opportunity to express our deep gratitude to our project guide, Prof.
Nitya Nand Jha, Assistant Professor, Department of Civil Engineering, RRSDCE Begusarai,
for his excellent guidance and support. His energy, creativity, and perfectionism have
always been a constant source of motivation for us. He is a great person and one of the best
mentors.
We sincerely thank Dr. Sushil Kumar, the Principal of the Institute, Prof. Lakshmi Kant (Head
of the Civil Engineering Department), and all the other faculty and staff members of the Civil
Engineering Department, RRSDCE Begusarai, for their kind support and for providing the required
facilities during the period of this project.
We would also like to express our gratitude to our friends for their valuable suggestions
and helpful discussions, which proved to be of great value.
Date: -
Place: - R.R.S.D.C.E Begusarai
DECLARATION
We hereby declare that the project work entitled "Rainfall Prediction in Begusarai
District using Seasonal Autoregressive Integrated Moving Average Model", submitted to
R.R.S.D.C.E, Begusarai, is a record of original work done by us under the supervision of
Prof. Nitya Nand Jha, Asst. Professor, Department of Civil Engineering, and that this project is
submitted in fulfilment of the requirements for the Degree of Bachelor of Technology in Civil
Engineering. The results embodied in this work have not been submitted to any other University
or Institute.
ABSTRACT
Accurate daily rainfall prediction is required for accurate stream flow prediction, flood risk
analysis and the construction of reliable flood control and early warning systems. However, because
rainfall is nonlinear, predicting daily rainfall with high accuracy and a long lead time is
difficult. In this study, the Seasonal Autoregressive Integrated Moving Average (SARIMA) model was
applied to predict the daily rainfall using one meteorological parameter (precipitation) for the
period 1972-2023. Four SARIMA models were trained and evaluated. The SARIMA models were
developed using the last 30 data for training and the first 30 for testing. SARIMA model 3 has superior
performance in MAE, MSE, R-SQ and RMSE. SARIMA model 4 has MAE, MSE, R-SQ and
RMSE of 43.13, 5346.23, 0.64 and 73.11 for training and 55.29, 6720, 0.20 and 81.97 for testing,
respectively.
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
DECLARATION
ABSTRACT
LIST OF FIGURES
LIST OF ABBREVIATIONS
CHAPTER- 01
INTRODUCTION
1.1 Background
1.2 STUDY AREA
1.3 OBJECTIVES
CHAPTER- 02
LITERATURE REVIEW
CHAPTER- 03
MATERIALS AND METHODS
3.1 Data
3.2 Seasonal ARIMA Model
3.3 Model Identification
3.4 Diagnostic Checking
3.5 Fitting and Prediction
CHAPTER- 04
RESULTS AND DISCUSSION
4.1 Result
4.2 Nature of Time Series Data
4.3 Stationarity
4.4 Test for Auto Correlation
4.5 Model Identification
4.6 Test for Auto Correlation
4.7 Testing Data
4.8 Forecasted Values
CHAPTER 5
CONCLUSION
REFERENCES
ANNEXURE
LIST OF FIGURES
Fig. 3 Rainfall data
Fig. 5 SARIMA Result
Fig. 6 Ljung-box test statistic for fitted models
Fig. 7 Time series plot of monthly rainfall
Fig. 8 ACF and PACF plot of rainfall
Fig. 9 Seasonal First Difference plot of Rainfall
LIST OF ABBREVIATIONS
CHAPTER- 01
INTRODUCTION
1.1 Background
The atmosphere and ocean have warmed, the amounts of snow and ice have diminished and sea
level has risen in the recent past. If the temperature increases beyond 2.5°C, 20 to 30 per
cent of known animal and plant species would be at increased risk of extinction. If the global
average temperature increase exceeds 3.5°C, models suggest that there would be extinctions
of 40 to 70 per cent of known species. India is one of the 27 countries identified as most vulnerable
to the impact of global warming. Earlier work studied the changes in the frequency of rainy days as
well as heavy rainfall days using daily rainfall data for the period 1901-2005 all over India.
Climate change is real and is happening across the world in different magnitudes. The
long-term mean seasonal and annual rainfall analysis showed that the South West Monsoon (SWM)
rainfall was 176.9 mm and the North East Monsoon (NEM) rainfall was 336.9 mm, with an annual
rainfall of 674.8 mm. The agricultural practices and crop yields of India are heavily dependent on
climatic factors such as rainfall. Out of 142 million ha of cultivated land in India, 92 million ha (i.e.
about 65%) are under rain-fed agriculture. Unlike irrigated agriculture, rain-fed
farming is usually diverse and risk prone. The monsoon season is the principal rain-bearing season,
and a substantial part of the annual rainfall over a large part of the country occurs in this
season. Small variations in the timing and the quantity of monsoon rainfall have the potential to
affect agricultural output.
Rainfall is a natural climatic phenomenon whose prediction is challenging and demanding. Its
forecast is of particular relevance to the agriculture sector, which contributes significantly to the
economy of the nation. On a worldwide scale, numerous attempts have been made to predict its
behavioural pattern using various techniques. In the last few decades, time series forecasting has
received tremendous attention from researchers. Time series models have been commonly used in a
broad range of scientific applications. Some of the major advantages of time series models include
their systematic search capability for identification, estimation and diagnostic checking. Time
series models such as the Autoregressive Integrated Moving Average (ARIMA) effectively consider
serial linear correlation among observations, whereas Seasonal Autoregressive Integrated Moving
Average (SARIMA) models can satisfactorily describe time series that exhibit non-stationary
behaviour both within and across seasons. SARIMA models are general forecasting
models that can achieve a high degree of accuracy. An attempt has been made in the present study to
analyse and predict the monthly rainfall patterns for Begusarai district, Bihar, using the SARIMA model.
1.2 STUDY AREA
Begusarai district lies on the northern bank of the river Ganges in the Mithila region of Bihar.
It is located between latitudes 25.15 deg N and 25.45 deg N and longitudes 85.45 deg E and
86.36 deg E.
Fig. 2. Begusarai shapefile
1.3 OBJECTIVES
To understand long-term climatic changes and socioeconomic impacts.
To examine temporal variations in rainfall patterns and distribution.
To evaluate future rainfall patterns and distribution.
To identify monotonic monthly and seasonal patterns in the context of global warming.
CHAPTER- 02
LITERATURE REVIEW
Kokilavani et al. (2020)
Title: SARIMA Modelling and Forecasting of Monthly Rainfall Patterns for Coimbatore,
Tamil Nadu, India.
Research work:
Research Purpose: Forecasting of monthly rainfall patterns for Coimbatore, Tamil Nadu.
Study Area: Coimbatore, Tamil Nadu, India.
Methodology: Time series modelling using the SARIMA model.
Findings: The mean monthly rainfall ranged from 7 mm (January) to 189.7 mm (October).
From April to November, the coefficient of variation was less than 100 per cent, and the
dependability of rainfall for these months was higher than for other months.
Mondal et al. (2012)
Title: Rainfall trend analysis by Mann-Kendall test: a case study of north-eastern part of
Cuttack district, Orissa.
Research work:
Research Purpose: To investigate the rainfall trend of the north-eastern part of Cuttack district,
Orissa, using the non-parametric Mann-Kendall test.
Study Area: North-eastern part of Cuttack district, Orissa.
Methodology: Non-parametric Mann-Kendall test.
Findings: The trend analysis of the Birupa river basin used annual rainfall for 40 years. The
maximum rainfall occurred in 1983, with a total precipitation of approximately 2810 mm,
and the minimum rainfall occurred in 1996, with a total of around 1118 mm. The average
rainfall for these 40 years was 1693.709 mm.
Madane et al. (2024)
Title: Spatio-temporal variations of reference evapotranspiration using Innovative
and Mann-Kendall trend analysis under limited weather data in semi-arid region of Indian
Punjab.
Research work:
Research Purpose: To investigate the spatio-temporal variations of reference
evapotranspiration.
Study Area: Central Punjab (three stations) and South West Punjab (three stations).
Methodology: Non-parametric Innovative and Mann-Kendall trend analysis.
Findings: The mean reference evapotranspiration (RET) estimated using the Hargreaves
method (Hargreaves 1994) for the Central Zone varied from 1661.1 ± 52.2 at Sangrur to
1709.2 ± 39 at Barnala. In the South West Zone, the mean monthly RET varied from 1694.5
± 81.4 at Bathinda to 1694.5 ± 81.4 at Mansa.
Das et al. (2023)
Title: CMIP5 based past and future climate change scenarios over South Bihar, India.
Research work:
Research Purpose: To investigate the past and future rainfall and temperature change
scenarios over three locations (Patna, Gaya and Bhagalpur).
Study Area: Districts of South Bihar.
Methodology: A CMIP5-based multi-model ensemble of GCMs was used for future
prediction, and trend analysis was done using the Mann-Kendall test and Sen's slope
estimator.
Findings: Rainfall for Patna and Gaya showed a declining trend, whereas for Bhagalpur an
increasing trend was found during 1901-2015. Future rainfall showed an increasing trend
for all locations. Most of the GCMs revealed an increase in temperature for all locations.
Swagatam Bora & Abhilash Hazarika (2023)
Title: Rainfall Time Series Forecasting using ARIMA Model
Research work:
Research Purpose: To predict the rainfall distribution pattern over Assam
and Meghalaya for the next five years.
Study Area: Assam and Meghalaya.
Methodology: Time series modelling using the ARIMA model.
Findings: The forecasts allow planning in accordance with predictions of heavy rainfall and
flooding. The prediction is based on an ARIMA model. Five models were compared, and the
model with the lowest AIC value, (0,0,1)(2,1,2), was selected as the model for forecasting.
CHAPTER- 03
MATERIALS AND METHODS
3.1 Data
The shapefile of the India map was downloaded from the Survey of India.
The Begusarai district shapefile was extracted from this India shapefile using QGIS software.
Gridded rainfall data at a resolution of 0.25 deg x 0.25 deg for 1953 to 2023 were taken from
the India Meteorological Department (IMD).
Rainfall data at grid point 1 (25.50 deg N, 86.00 deg E) were analyzed.
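The choice of a grid point on a 0.25 deg x 0.25 deg grid can be sketched as follows. This is a minimal illustration, not the procedure used in this project: the grid origin (6.5 deg N, 66.5 deg E) is an assumption about the gridded product, the function name nearest_grid_point is hypothetical, and the query location is an arbitrary point inside the district.

```python
def nearest_grid_point(lat, lon, lat0=6.5, lon0=66.5, step=0.25):
    """Snap a location to the nearest node of a regular lat/lon grid.

    lat0, lon0 : assumed coordinates of the first grid node (an assumption here)
    step       : grid spacing in degrees (0.25 deg, as stated in the text)
    """
    i = round((lat - lat0) / step)  # index of the nearest grid row
    j = round((lon - lon0) / step)  # index of the nearest grid column
    return lat0 + i * step, lon0 + j * step


# Arbitrary illustrative location inside Begusarai district.
grid_lat, grid_lon = nearest_grid_point(25.40, 86.10)
print(grid_lat, grid_lon)
```

With the assumed origin, this location snaps to the node at 25.50 deg N, 86.00 deg E, the grid point analyzed in this study.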
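3.2 Seasonal ARIMA Model
The general form of the multiplicative seasonal model SARIMA $(p,d,q)\times(P,D,Q)_s$ is given by
$$\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,X_t = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t \qquad (1)$$
Where, $X_t$ is the rainfall at time $t$, $B$ is the backshift operator ($B X_t = X_{t-1}$),
$\phi_p(B)$ and $\theta_q(B)$ are the non-seasonal autoregressive and moving average polynomials
of orders $p$ and $q$, $\Phi_P(B^{s})$ and $\Theta_Q(B^{s})$ are the seasonal autoregressive and
moving average polynomials of orders $P$ and $Q$, $d$ and $D$ are the orders of non-seasonal and
seasonal differencing, $s$ is the seasonal period and $\varepsilon_t$ is white noise.
Monthly rainfall was used for the study, so s = 12. Hence, the above equation (1) can be written as
$$\phi_p(B)\,\Phi_P(B^{12})\,(1-B)^{d}\,(1-B^{12})^{D}\,X_t = \theta_q(B)\,\Theta_Q(B^{12})\,\varepsilon_t \qquad (2)$$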
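As an illustration of this notation, the model selected later in this report, SARIMA (2,0,2)(1,0,1) with s = 12, can be expanded term by term. The sketch below assumes the Box-Jenkins sign convention (minus signs inside every polynomial) and a mean-adjusted series $X_t$; the coefficients $\phi_1, \phi_2, \Phi_1, \theta_1, \theta_2, \Theta_1$ are generic symbols, not fitted values.

```latex
% SARIMA(2,0,2)(1,0,1)_{12}: d = D = 0, so no differencing factors appear.
\[
(1 - \phi_1 B - \phi_2 B^{2})\,(1 - \Phi_1 B^{12})\,X_t
  = (1 - \theta_1 B - \theta_2 B^{2})\,(1 - \Theta_1 B^{12})\,\varepsilon_t
\]
% Multiplying out the left-hand side shows which past months enter the model:
\[
X_t - \phi_1 X_{t-1} - \phi_2 X_{t-2} - \Phi_1 X_{t-12}
  + \phi_1 \Phi_1 X_{t-13} + \phi_2 \Phi_1 X_{t-14}
\]
```

The cross terms at lags 13 and 14 arise only because the seasonal and non-seasonal polynomials are multiplied, which is what "multiplicative" refers to.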
3.3 Model Identification
The first step in model identification is to check whether the series is stationary.
For this purpose, one should construct a time plot of the data and inspect the graph for any
anomalies. Through careful examination of the plot, one can usually get an idea of whether
the series contains a trend, seasonality, outliers, non-constant variance or other non-normal and
non-stationary phenomena. This information helps in choosing a proper data transformation. If
the variance grows with time, variance-stabilizing transformations and differencing should be used.
A series with non-constant variance often needs a logarithmic transformation.
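The two transformations mentioned above can be sketched in a few lines. This is an illustrative example on a made-up monthly series, not the project's data; the function names and the offset of 1.0 (used so that zero-rainfall months do not break the logarithm) are assumptions.

```python
import math


def log_transform(series, offset=1.0):
    """Variance-stabilizing transform log(x + offset).

    The offset is an assumption; it keeps the log defined for zero rainfall.
    """
    return [math.log(x + offset) for x in series]


def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag.

    lag=1 removes a linear trend; lag=12 removes a fixed yearly cycle
    from monthly data (seasonal first difference).
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical two-year monthly series: a fixed seasonal cycle plus a trend.
cycle = [5, 10, 8, 20, 50, 150, 300, 230, 220, 60, 10, 5]
rain = [cycle[m % 12] + 0.5 * m for m in range(24)]

seasonal_diff = difference(rain, lag=12)  # year-on-year change per month
log_rain = log_transform(rain)            # compresses the monsoon peaks
print(seasonal_diff)
```

In this constructed series the seasonal difference is constant, because the only change from one year to the next is the added trend.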
The next step is to identify preliminary values of the autoregressive order p, the order of differencing
d, the moving average order q and their corresponding seasonal parameters P, D and Q. Here, the
autocorrelation function (ACF) and the partial autocorrelation function (PACF) are the most important
tools. The ACF measures the amount of linear dependence between observations in a time
series that are separated by a given lag, and it helps to determine the moving average order q. The
PACF helps to determine how many autoregressive terms p are necessary. The parameter d is the
number of times the series must be differenced to change it from a non-stationary
time series to a stationary time series. Furthermore, a time series plot and the ACF of the data will
typically suggest whether any differencing is needed. If differencing is called for, the time plot will
show some kind of linear trend.
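The sample ACF described above can be computed directly from its definition. The sketch below uses a made-up periodic series rather than the Begusarai data; the function name sample_acf is hypothetical, and the bound of 1.96 divided by the square root of n is the usual approximate 95% limit for judging whether a correlation differs from zero.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag.

    r_k = sum_{t=k}^{n-1} (x_t - mean)(x_{t-k} - mean) / sum_t (x_t - mean)^2
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    acf = []
    for k in range(max_lag + 1):
        ck = sum(dev[t] * dev[t - k] for t in range(k, n))
        acf.append(ck / c0)
    return acf


# Hypothetical monthly series: the same yearly cycle repeated four times.
cycle = [5, 10, 8, 20, 50, 150, 300, 230, 220, 60, 10, 5]
rain = cycle * 4

acf = sample_acf(rain, max_lag=24)
bound = 1.96 / math.sqrt(len(rain))  # approximate 95% significance limit

# Lags whose correlation lies outside the limit are candidates for model terms.
significant = [k for k in range(1, 25) if abs(acf[k]) > bound]
print(significant)
```

A strong positive spike at lag 12 together with a negative value near lag 6 is the typical signature of a yearly cycle in monthly data, and it points to a seasonal term with s = 12.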
When preliminary values of D and d have been fixed, the next step is to check the ACF and PACF
of the differenced series to determine the values of P, Q, p and q. Further, one can choose the
parameters by minimizing Akaike’s Information Criterion (AIC) and the Bayesian Information Criterion
(BIC). Once the model is tentatively established, the parameters and the corresponding standard
errors can be estimated using statistical techniques.
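Selection by information criteria can be illustrated as below. The formulas are the standard ones, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. Everything else is an assumption made for the sketch: the log-likelihood values are invented (they are not the fitted values of this project), k counts the AR, MA, seasonal AR and seasonal MA coefficients plus one noise variance, and n = 624 corresponds to 52 years of monthly data.

```python
import math


def aic(log_lik, k):
    """Akaike's Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def n_params(order, seasonal_order):
    """Count p + q + P + Q coefficients plus one noise variance (assumed convention)."""
    p, _, q = order
    P, _, Q = seasonal_order
    return p + q + P + Q + 1


# Candidate orders with HYPOTHETICAL log-likelihoods, for illustration only.
candidates = {
    ((1, 0, 0), (1, 0, 2)): -3560.0,
    ((2, 0, 3), (2, 0, 3)): -3548.0,
    ((2, 0, 2), (1, 0, 1)): -3549.5,
}
n_obs = 624  # assumed number of monthly observations

scores = {
    spec: (aic(ll, n_params(*spec)), bic(ll, n_params(*spec), n_obs))
    for spec, ll in candidates.items()
}
best_by_aic = min(scores, key=lambda spec: scores[spec][0])
print(best_by_aic, scores[best_by_aic])
```

The example shows why the largest model does not win automatically: its better log-likelihood is offset by the penalty for its extra parameters.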
CHAPTER- 04
4.3 Stationarity
The time series plot showed that the data exhibited stationarity, and the stationarity of the
observations was further tested using the Augmented Dickey-Fuller (ADF) test, the KPSS test and the
PP test (Table 1). The probability values of the ADF and PP tests were less than 0.05, and that of
the KPSS test was greater than 0.05, for the rainfall data. Thus, the data set was considered to be
stationary at 26 lags.
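The decision rule applied to the three tests can be written out explicitly, because the tests have opposite null hypotheses: ADF and PP assume a unit root (non-stationarity), while KPSS assumes stationarity. The sketch below only encodes that logic; the function name and the p-values passed to it are hypothetical, not the values of Table 1.

```python
def is_stationary(adf_p, pp_p, kpss_p, alpha=0.05):
    """Combine ADF, PP and KPSS p-values into one stationarity decision.

    ADF / PP : null hypothesis = unit root, so a small p-value rejects
               non-stationarity.
    KPSS     : null hypothesis = stationarity, so a large p-value means
               stationarity is not rejected.
    """
    unit_root_rejected = adf_p < alpha and pp_p < alpha
    stationarity_not_rejected = kpss_p > alpha
    return unit_root_rejected and stationarity_not_rejected


# Hypothetical p-values matching the pattern described in the text.
print(is_stationary(adf_p=0.01, pp_p=0.01, kpss_p=0.10))  # True
# Hypothetical p-values for a trending series.
print(is_stationary(adf_p=0.30, pp_p=0.20, kpss_p=0.01))  # False
```

When the tests disagree, the function returns False, which would normally prompt differencing or a transformation before the tests are repeated.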
Fig. 5 SARIMA Result
Fig. 7. Time series plot of monthly rainfall
Candidate SARIMA models (p,d,q)(P,D,Q), s = 12:
(1,0,0)(1,0,2)
(2,0,3)(2,0,3)
(2,0,2)(1,0,1)
4.6 Test for Auto Correlation
Fig. 10 Autocorrelation plot of rainfall
Fig. 12 Actual and Forecasted rainfall graph (series: forecasted precipitation; vertical axis: 0 to 350;
horizontal axis: date, 01-01-2024 to 01-11-2028)
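The error measures reported in the abstract (MAE, MSE, R-SQ and RMSE) compare actual and predicted rainfall and can be computed as sketched below. The sketch assumes that R-SQ denotes the coefficient of determination; the function name and the four sample values are hypothetical. Since RMSE is the square root of MSE, the reported training values are mutually consistent (the square root of 5346.23 is about 73.1).

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot  # assumed definition of R-SQ
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical values, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

A large gap between training and testing values of these measures, as in the abstract, indicates that the model fits the data it was trained on better than it predicts unseen months.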
CHAPTER 5
CONCLUSION
The SARIMA model provides a forecast of the rainfall for the next 5 years, i.e. from 2024 to 2028,
for the Begusarai region of Bihar. Using this data, various aspects of the development of the state
whose progress depends on the monsoon rains can be planned. Crops can be harvested earlier, or
the loss of an entire harvest can be prevented, if the monsoon is predicted to be severe. Similarly,
construction activities can be planned so as to maximize the output during times of less rainfall.
The disaster management authority of each state can also plan rescue drives and relief camps in
accordance with predictions of heavy rainfall and flooding. The prediction is based on a SARIMA
model. Three models were compared, and the model with the lowest AIC value, (2,0,2)(1,0,1), was
selected as the model for forecasting.
REFERENCES
1. Praveen, B., Talukdar, S., Shahfahad, Mahato, S., Mondal, J., Sharma, P., ... & Rahman, A.
(2020). Analyzing trend and forecasting of rainfall changes in India using non-parametrical
and machine learning approaches. Scientific Reports, 10(1), 10342.
2. Kokilavani, S., Pangayarselvi, R., Ramanathan, S. P., Dheebakaran, G., Sathyamoorthy, N.
K., Maragatham, N., & Gowtham, R. (2020). SARIMA modelling and forecasting of
monthly rainfall patterns for Coimbatore, Tamil Nadu, India. Current Journal of Applied
Science and Technology, 39(8), 69-76.
3. Mondal, A., Kundu, S., & Mukhopadhyay, A. (2012). Rainfall trend analysis by Mann-
Kendall test: A case study of north-eastern part of Cuttack district, Orissa. International
Journal of Geology, Earth and Environmental Sciences, 2(1), 70-78.
4. Subbaiah Naidu, K. C. H. V. (2016). SARIMA modeling and forecasting of seasonal
rainfall patterns in India. Int J Math Trends Technol (IJMTT), 38(1), 15-22.
5. Madane, D. A., Bankey, H., & Sharda, R. (2024). Spatio-temporal variations of reference
evapotranspiration using Innovative and Mann–Kendall trend analysis under limited
weather data in semi-arid region of Indian Punjab. Theoretical and Applied Climatology,
1-22.
6. Eni, D. (2015). Seasonal ARIMA modeling and forecasting of rainfall in Warri Town,
Nigeria. Journal of Geoscience and Environment Protection, 3(06), 91.
7. Das, L., Bhowmick, S., Meher, J. K., & Mahdi, S. S. (2023). CMIP5 based past and future
climate change scenarios over South Bihar, India. Journal of Earth System Science, 132(1),
8.
8. Bari, S. H., Rahman, M. T., Hussain, M. M., & Ray, S. (2015). Forecasting monthly
precipitation in Sylhet city using ARIMA model. Civil and Environmental Research, 7(1),
69-77.
9. Bora, S., & Hazarika, A. (2023, April). Rainfall time series forecasting using ARIMA
model. In 2023 International Conference on Artificial Intelligence and Applications
(ICAIA) Alliance Technology Conference (ATCON-1) (pp. 1-5). IEEE.
ANNEXURE
Date Forecasted Precipitation
01-01-2024 2.493421
01-02-2024 20.682651
01-03-2024 3.166115
01-04-2024 28.332701
01-05-2024 50.816382
01-06-2024 156.395995
01-07-2024 304.375168
01-08-2024 223.338485
01-09-2024 226.933126
01-10-2024 57.06868
01-11-2024 16.06868
01-12-2024 8.60688
01-01-2025 21.096273
01-02-2025 0.992052
01-03-2025 14.626776
01-04-2025 16.021716
01-05-2025 56.515246
01-06-2025 157.768704
01-07-2025 295.936553
01-08-2025 237.89571
01-09-2025 207.398759
01-10-2025 80.456576
01-11-2025 7.630147
01-12-2025 14.023204
01-01-2026 1.578423
01-02-2026 13.643151
01-03-2026 6.189723
01-04-2026 17.492374
01-05-2026 62.12129
01-06-2026 145.544375
01-07-2026 313.546994
01-08-2026 216.315976
01-09-2026 230.813542
01-10-2026 57.193052
01-11-2026 13.310778
01-12-2026 2.716207
01-01-2027 12.599816
01-02-2027 9.328376
01-03-2027 3.404678
01-04-2027 27.113345
01-05-2027 46.519712
01-06-2027 165.658723
01-07-2027 290.620295
01-08-2027 239.798941
01-09-2027 208.720619
01-10-2027 75.763667
01-11-2027 0.124927
01-12-2027 4.356495
01-01-2028 12.522884
01-02-2028 2.400595
01-03-2028 16.700128
01-04-2028 8.648823
01-05-2028 68.460492
01-06-2028 142.176449
01-07-2028 313.395813
01-08-2028 219.611163
01-09-2028 224.315944
01-10-2028 66.036053
01-11-2028 2.791707