SAVITRIBAI PHULE PUNE UNIVERSITY, PUNE
REPORT ON
PREDICT FUTURE STOCK PRICES
SUBMITTED TO THE SAVITRIBAI PHULE PUNE UNIVERSITY, PUNE
IN THE FULFILLMENT OF THE REQUIREMENT
Of
Laboratory Practice - III
Final Year (Computer Engineering)
Academic Year 2024-25
BY
Name of Students: Roll No :
1) Natesh Khadse 4202008
2) Raj Mankar 4202019
3) Prathamesh Nawale 4202027
4) Ashwin Pawar 4202039
Under the Guidance of
Ms. R.T. Waghmode
DEPARTMENT OF COMPUTER ENGINEERING
STES’s SINHGAD INSTITUTE OF TECHNOLOGY AND SCIENCE
NARHE, PUNE – 411041
CERTIFICATE
This is to certify that, Ms. (Name of Student) Roll No. have successfully completed the
report entitled “Predict Future Stock Prices” under the supervision of Ms. R. T. Waghmode, in the
fulfillment of the requirement of Laboratory Practice-III in Final Year Computer Engineering, in the
academic year 2024-2025, as prescribed by Savitribai Phule Pune University, Pune.
(Ms. R. T. Waghmode ) (Dr. G. S. Navale)
Guide Head
Department of Computer Engineering Department of Computer Engineering
(Dr. S. D. Markande)
Principal
SINHGAD INSTITUTE OF TECHNOLOGY AND SCIENCE, NARHE, PUNE – 411041
Place : Pune
Date :
Acknowledgement
We take this opportunity to acknowledge everyone who contributed to our work.
We express our sincere gratitude to Ms. R. T. Waghmode, Assistant Professor at Sinhgad Institute of
Technology and Science, Narhe, Pune, for her valuable inputs, guidance, and support throughout the
course.
We wish to thank Dr. G. S. Navale, Head of the Computer Engineering Department,
Sinhgad Institute of Technology and Science, Narhe, for all the help and important suggestions
throughout the work. We thank all the teaching staff members for their indispensable support and
valuable suggestions.
We also thank our friends and family for their help in collecting data; without it, this course
report could not have been completed. Finally, our special thanks go to Dr. S. D. Markande, Principal,
Sinhgad Institute of Technology and Science, Narhe, for providing an ambience in the college that
motivated us to work.
Signature
Natesh Khadse
Raj Mankar
Prathamesh Nawale
Ashwin Pawar
Contents
CHAPTER TITLE
1 Introduction
2 Problem Definition
3 Project Architecture
4 Software & Hardware Requirement
5 Project Description
6 Implementation
7 GUI
8 Conclusion
9 Reference
1. INTRODUCTION
The stock market serves as a barometer of a country's economic health, with price fluctuations
reflecting both investor sentiment and macroeconomic factors. In India, the stock market has
experienced considerable volatility from 2000 to 2020, influenced by events like the global financial
crisis, changes in government policies, and the COVID-19 pandemic.
Understanding the market’s ups and downs and predicting future stock prices is crucial for investors,
policymakers, and financial analysts.
In this project, we use Indian stock market data from 2000 to 2020 to analyze historical trends
and build machine learning models that predict future stock prices from that history. The project
also seeks to explore the driving forces behind market movements.
2. PROBLEM DEFINITION
The project aims to analyze Indian stock market data from 2000 to 2020 to understand the factors
influencing stock price movements and predict future stock prices using machine learning models like
Linear Regression, Random Forest, and LSTM. The goal is to provide valuable insights for investors
and analysts.
3. PROJECT ARCHITECTURE
4. SOFTWARE & HARDWARE REQUIREMENT
Software Requirements:
- Operating System: Windows 10/11, macOS, or Linux.
- Programming Languages: Python (Version 3.7 or above)
- Software and Libraries: Jupyter Notebook or Google Colab, Pandas, NumPy, Matplotlib,
Seaborn, Scikit-learn, TensorFlow/Keras, Statsmodels.
- Version Control: Git, GitHub/GitLab/Bitbucket.
Hardware Requirements:
- A computer with a compatible operating system (Windows, macOS, or Linux)
- A stable internet connection
- A minimum of 4 GB RAM (8 GB or more recommended)
- A decent processor (Intel Core i3 or equivalent)
5. PROJECT DESCRIPTION
The dataset used in this project consists of Indian stock market data spanning from 2000 to 2020.
It includes essential information such as the specific trading date, opening and closing prices, high
and low prices, trading volume, and adjusted closing prices. To ensure data quality and consistency,
the dataset underwent rigorous preprocessing, which involved handling missing values and removing
outliers. Additionally, new features were engineered to capture stock price trends, including moving
averages, volatility, and daily returns.
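The engineered features named above can be sketched as follows. This is a minimal illustration, not the project's exact code: the function name `add_trend_features`, the 20-day volatility window, and the definition of volatility as the rolling standard deviation of daily returns are assumptions, and the prices are synthetic.

```python
import numpy as np
import pandas as pd


def add_trend_features(prices, short=50, long=200, vol_window=20):
    """Return a copy of `prices` with moving-average, return and volatility columns."""
    out = prices.copy()
    out["Daily_Return"] = out["Close"].pct_change()
    out[f"{short}_MA"] = out["Close"].rolling(window=short).mean()
    out[f"{long}_MA"] = out["Close"].rolling(window=long).mean()
    # Assumed definition: volatility = rolling standard deviation of daily returns.
    out["Volatility"] = out["Daily_Return"].rolling(window=vol_window).std()
    # The rolling windows leave the earliest rows incomplete, so drop them.
    return out.dropna()


# Synthetic closing prices, used only to show the shape of the result.
rng = np.random.default_rng(0)
demo = pd.DataFrame({"Close": 100 + rng.normal(0, 1, 300).cumsum()})
features = add_trend_features(demo)
print(features.tail())
```

With a 200-day window, the first 199 rows have no long moving average, so a 300-row input yields 101 usable rows.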
Exploratory data analysis (EDA) was conducted to gain valuable insights into the stock market
data. Key events like the 2008 global financial crisis, 2016 demonetization, and the 2020 COVID-19
pandemic were analyzed to understand their impact on stock prices.
Visualizations of stock price trends, moving averages, and daily returns helped identify patterns and
market behavior. Correlation analysis was performed to examine relationships between different
variables, such as opening price, closing price, and volume.
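A correlation analysis of this kind can be sketched with pandas as shown here. The column names `Open`, `Close`, and `Volume` are assumed to match the dataset, the helper `correlation_table` is hypothetical, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def correlation_table(prices, columns=("Open", "Close", "Volume")):
    """Pearson correlation between the selected numeric columns."""
    return prices[list(columns)].corr(method="pearson")


# Synthetic data: the open price tracks the close with a little noise.
rng = np.random.default_rng(1)
close = 100 + rng.normal(0, 1, 250).cumsum()
demo = pd.DataFrame({
    "Open": close + rng.normal(0, 0.5, 250),
    "Close": close,
    "Volume": rng.integers(1_000, 10_000, 250),
})
corr = correlation_table(demo)
print(corr.round(2))
```

Price levels of a trending series tend to be highly correlated with one another, so it may be more informative to run the same table on daily returns.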
Several machine learning models were employed to predict future stock prices. These models
included Linear Regression, Random Forest, and Long Short-Term Memory (LSTM). Linear
Regression, a simple model, predicts stock prices based on linear relationships between input
variables and output values. Random Forest, an ensemble model, combines multiple decision trees to
capture complex non-linear relationships in the data. LSTM, a type of recurrent neural network, is
designed for sequential data and can capture long-term dependencies and trends in a price series.
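One common way to present a price series to an LSTM is as sliding windows, where each sample holds the previous `lookback` closing prices and the target is the next price. The sketch below illustrates that framing only; the helper `make_windows` and the lookback of 3 are hypothetical, and the implementation in Section 6 instead reshapes the engineered features directly.

```python
import numpy as np


def make_windows(series, lookback):
    """Turn a 1-D series into (samples, lookback, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    if len(series) <= lookback:
        raise ValueError("series must be longer than lookback")
    # Each row holds `lookback` consecutive values; the target is the value that follows.
    X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    # Add a trailing axis so the shape is (samples, timesteps, features).
    return X[..., np.newaxis], y


closes = np.arange(10, dtype=float)  # stand-in for a column of closing prices
X, y = make_windows(closes, lookback=3)
print(X.shape, y.shape)  # (7, 3, 1) (7,)
```

In practice the prices would usually be scaled (for example to the 0-1 range, using statistics from the training portion only) before being fed to the network.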
The dataset was divided into training and testing sets, with 80% of the data used for training and
20% for testing. Each model was trained on the historical data to predict future stock prices. The
models were evaluated using Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE)
metrics. RMSE is the square root of the average squared difference between predicted and actual
stock prices, so it penalizes large errors more heavily, while MAE is the average absolute size of
the prediction errors. Both are expressed in the same units as the price, and lower values indicate
more accurate predictions.
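The two metrics can be written out directly: RMSE = sqrt(mean((actual - predicted)^2)) and MAE = mean(|actual - predicted|). The sketch below computes both with NumPy on four made-up prices; the function names `rmse` and `mae` are illustrative.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared error."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute errors."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(err)))


# Made-up prices; the errors are -1, 1, -2 and 4.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 101.0]
print(f"RMSE: {rmse(actual, predicted):.3f}")  # RMSE: 2.345
print(f"MAE:  {mae(actual, predicted):.3f}")   # MAE:  2.000
```

The single error of 4 pulls RMSE above MAE, which shows how RMSE weights large misses more heavily.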
6. IMPLEMENTATION
Data Loading and Preprocessing
import pandas as pd
import numpy as np
# Load the dataset; parse the trading date so plots get a proper time axis
data = pd.read_csv('stock_data_2000_2020.csv', parse_dates=['Date'])
data = data.sort_values('Date').reset_index(drop=True)
# Data preprocessing: forward-fill missing values
# (fillna(method='ffill') is deprecated in recent pandas releases)
data = data.ffill()
# Feature engineering: moving averages and daily returns
data['50_MA'] = data['Close'].rolling(window=50).mean()
data['200_MA'] = data['Close'].rolling(window=200).mean()
data['Daily_Return'] = data['Close'].pct_change()
# Drop the rows left incomplete by the rolling windows
data.dropna(inplace=True)
Exploratory Data Analysis (EDA)
import matplotlib.pyplot as plt
import seaborn as sns
# Visualize stock price trends
plt.figure(figsize=(12, 6))
plt.plot(data['Date'], data['Close'], label='Close Price')
plt.plot(data['Date'], data['50_MA'], label='50-Day Moving Average')
plt.plot(data['Date'], data['200_MA'], label='200-Day Moving Average')
plt.legend()
plt.title('Stock Price Trends')
plt.show()
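Model Implementation
Model 1: Linear Regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error
# Split the data into training and testing sets
X = data[['50_MA', '200_MA', 'Daily_Return']].values
y = data['Close'].values
# shuffle=False keeps the rows in chronological order, so the test set is the most recent 20%
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
# Train the Linear Regression model
model_lr = LinearRegression()
model_lr.fit(X_train, y_train)
# Predict and evaluate with RMSE and MAE
y_pred_lr = model_lr.predict(X_test)
rmse_lr = np.sqrt(mean_squared_error(y_test, y_pred_lr))
mae_lr = mean_absolute_error(y_test, y_pred_lr)
print(f'Linear Regression  RMSE: {rmse_lr:.2f}  MAE: {mae_lr:.2f}')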
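The split above passes shuffle=False so that the model is trained on older data and tested on the most recent data. A minimal NumPy sketch of the same idea is shown below; the helper `chronological_split` is hypothetical and is meant only to make the ordering explicit, not to replace the scikit-learn call.

```python
import numpy as np


def chronological_split(X, y, test_fraction=0.2):
    """Split time-ordered arrays so the last `test_fraction` of rows form the test set."""
    X = np.asarray(X)
    y = np.asarray(y)
    if len(X) != len(y):
        raise ValueError("X and y must have the same number of rows")
    n_test = int(np.ceil(len(X) * test_fraction))
    split = len(X) - n_test
    return X[:split], X[split:], y[:split], y[split:]


# Ten time-ordered rows: the first eight train the model, the last two test it.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10)
X_tr, X_te, y_tr, y_te = chronological_split(X_demo, y_demo)
print(len(X_tr), len(X_te))  # 8 2
```

Shuffling before splitting would let the model see prices from after the dates it is asked to predict, which tends to make test errors look better than they would be in practice.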
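Model 2: Random Forest
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error
# Train the Random Forest model (random_state fixed so results are repeatable)
model_rf = RandomForestRegressor(n_estimators=100, random_state=42)
model_rf.fit(X_train, y_train)
# Predict and evaluate with RMSE and MAE
y_pred_rf = model_rf.predict(X_test)
rmse_rf = np.sqrt(mean_squared_error(y_test, y_pred_rf))
mae_rf = mean_absolute_error(y_test, y_pred_rf)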
Model 3: LSTM (Long Short-Term Memory)
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.metrics import mean_squared_error, mean_absolute_error
# Reshape data for LSTM: (samples, timesteps, features per timestep)
# Here the three engineered features are fed as three timesteps of one value each
X_train_lstm = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))
X_test_lstm = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))
# Build the LSTM model
model_lstm = Sequential()
model_lstm.add(LSTM(50, return_sequences=True, input_shape=(X_train_lstm.shape[1], 1)))
model_lstm.add(LSTM(50, return_sequences=False))
model_lstm.add(Dense(25))
model_lstm.add(Dense(1))
# Compile the model
model_lstm.compile(optimizer='adam', loss='mean_squared_error')
# Train the LSTM model
model_lstm.fit(X_train_lstm, y_train, batch_size=64, epochs=5)
# Predict and evaluate; predict() returns shape (n, 1), so flatten before scoring
y_pred_lstm = model_lstm.predict(X_test_lstm).flatten()
rmse_lstm = np.sqrt(mean_squared_error(y_test, y_pred_lstm))
mae_lstm = mean_absolute_error(y_test, y_pred_lstm)
7. GUI
8. CONCLUSION
This project highlighted the potential of machine learning models, particularly LSTM, for
analyzing stock market data and predicting future stock price movements. While the predictions made
by these models are useful, they are not definitive due to the unpredictability of the market. As a
result, future research should focus on refining these models by incorporating more factors and
developing hybrid approaches to improve prediction accuracy. Nonetheless, the tools and models
discussed in this project provide a valuable foundation for investors looking to make informed
decisions based on historical data and trend analysis.
9. REFERENCE
1. H. Patel and S. Shah, "Forecasting Stock Market Movements Using Machine Learning
Techniques", 2016, ResearchGate.
2. R. Reddy and P. Kumar, "Stock Price Prediction Using Support Vector Machines and
Neural Networks", 2018, IEEE.
3. A. Mehta and R. Gupta, "Time-Series Forecasting of Stock Market Using Machine
Learning Models".
4. R. Shankar and P. Verma, "Stock Market Prediction Using Linear Regression Models", 2017,
IEEE.
5. K. Gupta and A. Mukherjee, "Random Forest for Stock Market Prediction", 2018, IEEE.
6. M. S. Hussain and K. S. Reddy, "Deep Learning for Stock Market Prediction Using
LSTM", 2020, IEEE.