DATA ANALYSIS FOR INVESTMENTS
Submitted to
PROF. SUMANTA PASARI
(Department of Mathematics, BITS Pilani, Pilani Campus)
BY
ABHILASH CHANDRASEKAR 2021B4A31050P
YASHOVARDHAN NAIK 2021AAPS0621P
TANISHQ RULANIA 2021B4A21566P
ADARSH MISHRA 2021B4A42503P
YASH KUMAR 2021B4A32330P
JAYANT AGGARWAL 2021B4AA2324P
KUSHAGRA SINGHVI 2021B4A21718P
VISHNU GOYAL 2021B5PS2068P
NSE and NIFTY 50
1. Basics of Stock Market
The stock market is a platform where shares of publicly listed companies are traded. These
shares represent ownership in a company. Investors buy and sell shares in hopes of making
profits based on the company's performance or overall market trends.
2. What is NSE
The National Stock Exchange (NSE) is one of the leading stock exchanges in India, where
buying and selling of securities like stocks, bonds, and derivatives take place. It provides a
transparent and efficient trading platform to investors. NSE was established to bring
transparency to the Indian capital market and is a hub for trading various financial instruments.
3. NIFTY 50 Index
The Nifty 50 Index is a stock market index that tracks the weighted average of 50 of the
largest and most liquid companies listed on the NSE. It acts as a barometer for the performance
of the Indian equity market and the overall economy. A rising Nifty 50 indicates that the
market is doing well; a falling Nifty 50 indicates that the market is underperforming.
The Nifty 50 includes companies from various sectors like IT, finance, energy, and healthcare,
providing a diversified representation of the Indian economy.
4. Index Methodology (Nifty 50)
An index methodology determines how the index is constructed and maintained. Nifty 50 uses
the following principles:
a) Selection of Stocks:
● The top 50 companies by market capitalization and liquidity are chosen.
● Companies must be listed on NSE and meet certain eligibility criteria.
b) Market Capitalization Weighting:
● Companies are weighted based on their free-float market capitalization:
Free-float Market Capitalization = Market Price per Share × Shares Available to the Public
● This ensures larger companies have a higher impact on the index.
c) Rebalancing:
● The index is reviewed semi-annually to ensure it remains representative of the
market.
d) Liquidity Filter:
● Only companies with high trading volumes and turnover are included.
e) Sector Representation:
● The index ensures that no single sector dominates, maintaining a diversified
view of the economy.
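The free-float weighting described in (b) can be illustrated with a minimal sketch. The company names, prices, and share counts are made up for illustration and are not NSE data; the function name is likewise hypothetical.

```python
# Hypothetical constituents: (price in rupees, shares available to the public).
# These figures are illustrative assumptions, not real NSE data.
constituents = {
    "Company A": (2500.0, 4_000_000),
    "Company B": (1500.0, 3_000_000),
    "Company C": (500.0, 2_000_000),
}


def free_float_weights(stocks):
    """Return each stock's share of the total free-float market capitalization."""
    caps = {name: price * shares for name, (price, shares) in stocks.items()}
    total = sum(caps.values())
    return {name: cap / total for name, cap in caps.items()}


weights = free_float_weights(constituents)
for name, weight in weights.items():
    print(f"{name}: {weight:.1%}")
```

The company with the largest free-float capitalization receives the largest weight, which is why its price moves affect the index the most.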
5. Performance Metrics of Nifty 50
Index Value Calculation: The index value is based on a weighted average:
\[ \text{Index Value} = \frac{\sum \text{Free-float Market Cap of Component Stocks}}{\text{Base Market Capitalization}} \times \text{Base Value} \]
The index value provides a single, representative figure summarizing the overall performance of
the 50 largest companies in the market.
It reflects the health of the market:
● Increase in the index value: Indicates a bullish market where stock prices are rising.
● Decrease in the index value: Indicates a bearish market with declining prices.
If an individual’s portfolio gave a return of 12% while Nifty 50 increased by 10%, the individual
outperformed the market.
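As a rough sketch of the index value formula and the portfolio comparison above, the snippet below uses made-up aggregate capitalizations. The base value of 1000 is passed as a default parameter and is an assumption of this example, as are all the rupee figures.

```python
def index_value(current_free_float_cap, base_market_cap, base_value=1000.0):
    """Index value = current free-float cap / base market cap * base value."""
    return current_free_float_cap / base_market_cap * base_value


# Hypothetical aggregate capitalizations in rupees, chosen for easy arithmetic.
base_cap = 2.0e12
today_cap = 2.2e12

today = index_value(today_cap, base_cap)
index_return = today / 1000.0 - 1      # change relative to the base value

portfolio_return = 0.12                # the 12% portfolio from the text
excess_return = portfolio_return - index_return

print(f"Index value: {today:.1f}")
print(f"Index return: {index_return:.1%}, excess return: {excess_return:.1%}")
```

A positive excess return corresponds to the case in the text where the portfolio (12%) beats the index (10%).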
a. Correlation
Correlation measures how the stocks within the index move relative to each other. A
diversified index like the Nifty 50 aims to include stocks with:
● Low Correlation: This reduces risk, since the collapse of one sector should not drag down other sectors.
● Example: The IT sector and the Oil & Gas sector often have low correlation.
The correlation coefficient (r) ranges from -1 to +1:
● r = +1: Stocks move in perfect sync.
● r = −1: Stocks move in exactly opposite directions.
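The correlation coefficient can be computed directly from two return series. The following is a minimal sketch of the Pearson coefficient; the daily returns are invented for illustration and do not come from the report's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily returns for two stocks (illustrative values only).
stock_a = [0.010, -0.004, 0.007, 0.002, -0.009, 0.005]
stock_b = [0.002, 0.006, -0.003, 0.004, 0.001, -0.005]

r = pearson_r(stock_a, stock_b)
print(f"r = {r:.2f}")
```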
b. Statistical Tools
A moving average (MA) is a statistical tool used in stock markets to analyze price
trends by smoothing out short-term fluctuations. It calculates the average of a stock's
price over a specified period (e.g., 50 days or 200 days). This smoothed value helps
traders and investors identify the direction of the stock’s price movement.
A golden crossover happens when a short-term moving average (e.g., 50-day) crosses
above a long-term moving average (e.g., 200-day), signaling a bullish trend.
A death crossover happens when a short-term moving average crosses below a
long-term moving average, signaling a bearish trend.
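The crossover rules above can be sketched in a few lines. The price list is a toy series, and the 2-day and 4-day windows stand in for the 50-day and 200-day windows only to keep the example short; the function names are hypothetical.

```python
def moving_average(prices, window):
    """Simple moving average; None until `window` prices are available."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages


def crossovers(prices, short=50, long=200):
    """Return (index, 'golden' or 'death') for each short/long MA crossing."""
    s = moving_average(prices, short)
    l = moving_average(prices, long)
    events = []
    for i in range(1, len(prices)):
        if None in (s[i - 1], l[i - 1], s[i], l[i]):
            continue  # not enough history for both averages yet
        if s[i - 1] <= l[i - 1] and s[i] > l[i]:
            events.append((i, "golden"))
        elif s[i - 1] >= l[i - 1] and s[i] < l[i]:
            events.append((i, "death"))
    return events


# Toy prices: a decline, a rally, then another decline.
prices = [10, 9, 8, 7, 6, 7, 9, 12, 15, 14, 11, 8, 5]
events = crossovers(prices, short=2, long=4)
print(events)
```

On this toy series the rally produces a golden crossover and the final decline produces a death crossover.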
Seasonal Decomposition of Nifty 50 closing prices
Need for Investment
1. Context
A crucial decision in personal finance is the choice between market investments and
traditional bank deposits. In "Irrational Exuberance," Shiller (2015) points out that
psychological biases and logical economic considerations frequently affect decisions
about market involvement. According to Thaler (2015) and Statman (2019), the rise of
behavioral finance has completely changed how we think about making financial
decisions.
2. Empirical Analysis
Building on Varma's (2021) work on Indian financial markets:
2.1. Banking Analysis and Market Returns
Returns | Value (% per annum) | Source
Bank deposit, nominal | 4 | RBI Handbook of Statistics, 2014
Bank deposit, real | -2 |
NIFTY 50, nominal | 12 | Bajpai, G. N. (2019). "Development of Capital Markets in India."
NIFTY 50, real | 6 |
Table 1 : NIFTY 50 performance metrics and returns on banking deposits demonstrating
significant divergence
For the given table,
Inflation Rate (p.a) = 6 %
Sharpe Ratio = 0.73 (indicating positive risk-adjusted returns)
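The real returns in Table 1 are consistent with subtracting the 6% inflation rate from the nominal return. As a hedged illustration, the sketch below shows that approximation next to the exact Fisher relation, which is an addition of this example and not a calculation taken from the report.

```python
def real_return_approx(nominal, inflation):
    """Common approximation: real return = nominal return - inflation."""
    return nominal - inflation


def real_return_exact(nominal, inflation):
    """Fisher relation: (1 + nominal) / (1 + inflation) - 1."""
    return (1 + nominal) / (1 + inflation) - 1


inflation = 0.06
for label, nominal in [("Bank deposit", 0.04), ("NIFTY 50", 0.12)]:
    approx = real_return_approx(nominal, inflation)
    exact = real_return_exact(nominal, inflation)
    print(f"{label}: approx {approx:.2%}, exact {exact:.2%}")
```

The two methods agree closely at these rates; the gap widens as inflation or the holding period grows.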
2.2. Purchasing power analysis
Investment Vehicle | Nominal Value (2024) | Real Value (2024) | Real Return
Bank Deposit | ₹148,054 | ₹89,815 | -10.085%
Market Investment | ₹310,584 | ₹188,674 | 88.674%
Table 2: Analysis of Purchasing power based on 6% annual inflation
3. Psychological Factors in Investment
3.1. Risk Perception Framework
3.2. Stress Mitigation Strategies
3.3. Risk Management Framework
4. Optimal Investment Framework Development
4.1. Three-Tier Investment Framework
Emergency Reserve (Tier 1)
● Allocation: 6-12 months of expenses
● Vehicle: High-yield savings accounts
● Purpose: Immediate liquidity needs
Intermediate Portfolio (Tier 2)
● Allocation: 20-30% of investable assets
● Vehicle: Fixed income instruments
● Purpose: Medium-term stability
Growth Portfolio (Tier 3)
● Allocation: Remaining investable assets
● Vehicle: Diversified market investments
● Purpose: Long-term wealth creation
4.2. Implementation Protocol
Initial Position
● Emergency fund establishment
● Risk tolerance assessment
● Goal-based allocation determination
Investment Execution
● Systematic investment planning
● Dollar-cost averaging implementation
● Regular rebalancing schedule
Monitoring and Adjustment
● Quarterly performance review
● Annual strategy reassessment
● Risk metric evaluation
5. Statistical Evidence
Probability of positive returns based on holding period:
Time Horizon Probability of Positive Returns
1 Year 75%
5 Years 88%
10 Years 96%
Table 3: Probability of Positive Returns Across Investment Time Horizons based on
Historical NSE India data
Standard deviation of returns decreases with time:
S.D of returns over a period of years Value
𝜎₁ 18.5%
𝜎₅ 12.3%
𝜎₁₀ 8.7%
Table 4: Historical Volatility Ranges of NIFTY 50
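Statistics like those in Tables 3 and 4 can be derived from rolling holding periods. The sketch below shows one possible way to do this; the annual returns are invented, so its output will not reproduce the tables, and the function name is hypothetical.

```python
from statistics import pstdev


def holding_period_stats(annual_returns, years):
    """Share of positive outcomes and std. dev. of annualized returns
    over all rolling windows of `years` consecutive years."""
    annualized = []
    for start in range(len(annual_returns) - years + 1):
        growth = 1.0
        for r in annual_returns[start:start + years]:
            growth *= 1 + r
        annualized.append(growth ** (1 / years) - 1)
    positive_share = sum(a > 0 for a in annualized) / len(annualized)
    return positive_share, pstdev(annualized)


# Hypothetical annual index returns (not NSE data).
returns = [0.20, -0.10, 0.15, 0.30, -0.25, 0.10, 0.05, 0.18, -0.05, 0.12]

p1, sd1 = holding_period_stats(returns, years=1)
p5, sd5 = holding_period_stats(returns, years=5)
print(f"1-year: P(positive) = {p1:.0%}, sd = {sd1:.1%}")
print(f"5-year: P(positive) = {p5:.0%}, sd = {sd5:.1%}")
```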
6. Conclusions and Recommendations
The empirical evidence strongly supports market investment as a necessary component
of long-term financial planning. While banking deposits serve a crucial role in short-term
liquidity management, they fail to provide adequate protection against inflation and
opportunity cost. The psychological barriers to market investment can be effectively
managed through structured approaches to asset allocation and risk management.
6.1. Key Findings
● Market investments consistently outperform banking deposits on a real return
basis
● Systematic investment approaches significantly reduce psychological stress
● Long-term investment horizons demonstrate strong risk-reduction characteristics
6.2. Recommendations
● Implement structured three-tier investment framework
● Utilize systematic investment protocols
● Maintain disciplined rebalancing schedules
● Regular review and adjustment of investment strategy
Effective Investment Strategy
1. How can one be an effective investor?
An effective investor makes decisions based on data, trends, and principles of risk management. Key
strategies include:
● Diversification:
○ Expanding investments across sectors to balance high-growth opportunities and stability.
○ Examples: IT, Pharma for growth; Consumer Goods, Energy for stability.
● Regular Investments:
○ Investing a fixed amount at regular intervals reduces the impact
of market volatility.
● Market Awareness:
○ Use historical data and technical indicators (e.g., Moving Averages, RSI) to time
investments effectively.
○ Invest consistently rather than attempting to time the market.
● Risk Assessment:
○ Align investments with your risk tolerance, financial goals, and time horizon. Adjust as
your circumstances change.
● Rebalancing:
○ Periodically review and adjust your portfolio to maintain your desired asset allocation
and risk profile.
○ This prevents overexposure to any one asset class due to market fluctuations.
○ Rebalancing ensures your investments stay aligned with your goals.
● Patience and Long-Term Perspective:
○ Historical cumulative returns (from NIFTY 50 data) demonstrate that long-term
investments often outperform short-term trading.
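The effect of regular investments mentioned above can be shown with a small calculation. This is a minimal sketch with made-up prices: investing a fixed amount each period buys more units when prices are low, so the average cost per unit ends up below the average price.

```python
def regular_investment(prices, amount_per_period):
    """Invest a fixed amount at each price; return units held and average cost."""
    units = sum(amount_per_period / price for price in prices)
    invested = amount_per_period * len(prices)
    return units, invested / units


# Hypothetical unit prices on four investment dates.
prices = [100.0, 80.0, 125.0, 100.0]

units, average_cost = regular_investment(prices, amount_per_period=1000.0)
average_price = sum(prices) / len(prices)

print(f"Units bought: {units:.2f}")
print(f"Average cost per unit: {average_cost:.2f} vs average price: {average_price:.2f}")
```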
2. Shall I invest daily, on a specific day of the month, or once a month?
1. Daily Investments:
a. Suitable for active investors with time to monitor markets.
b. Spreads risk but incurs higher transaction costs.
2. Monthly Investments:
a. Ideal for most investors.
b. Historical trends suggest higher returns during the first and fifth week of the month.
● Optimum Investment Strategy
○ In the weekday analysis, Tuesday and Friday show higher average returns, while Monday
shows a trend of negative returns. Thus, to invest on a specific day of the week, the best
strategy is to buy on Mondays, when prices tend to be lower.
○ Historical data suggests that markets are at their lowest during the third week of a given
month. Since Monday is the best day for weekly investments, the best strategy for
investing on a specific day of the month is to invest on the Monday of the third week of
the month.
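A weekday analysis of this kind groups daily returns by the day on which they were realized. The sketch below shows one way to do the grouping; the dates and closing prices are hypothetical and serve only to demonstrate the mechanics.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def average_return_by_weekday(dated_closes):
    """dated_closes: list of (date, close) sorted by date.
    Each return is attributed to the weekday on which it was realized."""
    buckets = defaultdict(list)
    for (_, prev_close), (day, close) in zip(dated_closes, dated_closes[1:]):
        buckets[WEEKDAYS[day.weekday()]].append(close / prev_close - 1)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


# Hypothetical closing values (illustrative only).
sample = [
    (date(2024, 1, 1), 100.0),  # Monday
    (date(2024, 1, 2), 101.0),  # Tuesday
    (date(2024, 1, 3), 100.0),  # Wednesday
    (date(2024, 1, 4), 102.0),  # Thursday
    (date(2024, 1, 5), 103.0),  # Friday
    (date(2024, 1, 8), 101.0),  # Monday
]

averages = average_return_by_weekday(sample)
weakest_day = min(averages, key=averages.get)
print(weakest_day, f"{averages[weakest_day]:.2%}")
```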
3. Is the stock market operational daily, 24 hours?
● Operational Hours:
○ The NSE operates on trading days from 9:15 AM to 3:30 PM (IST).
○ It is closed on weekends and public holidays.
● Best Time to Invest:
○ Early trading hours (9:30 AM–11:00 AM) offer higher liquidity and better price
discovery.
○ Avoid late hours due to reduced trading activity.
4. In which part of the industry (IT, Telecom, etc.) should I invest?
1. High-Growth Sectors:
○ IT: Benefiting from digital transformation and exports.
○ Pharma: Defensive sector with growth during uncertainty.
○ Financial Services: Key driver of the Indian economy.
2. Stable Sectors:
○ Consumer Goods: Reliable returns during economic downturns.
○ Energy: Essential sector with steady demand.
3. Diversification:
○ Allocate 60-70% to high-growth sectors and 30-40% to stable sectors for balanced risk
and returns.
5. Shall I target to become a domestic or an international investor?
Detailed Answer:
● Domestic Investments:
○ Indian markets, like NIFTY 50, offer exposure to diverse, high-growth companies.
○ Easier access and lower risks make this suitable for beginners.
● International Investments:
○ Provides diversification across global markets.
○ Protects against domestic economic downturns and currency depreciation.
○ Invest in resilient economies like the US or emerging markets for higher growth.
The comparison with the MSCI (Morgan Stanley Capital International) index shows that investing
in NIFTY 50 performed better.
Analysis of TCS Stock Prices
The financial time series data for TCS stock prices, sourced from the NSE website, was analyzed with
ARIMA modeling, and the fitted model was used to forecast stock prices. The
process included critical steps like data preprocessing, checking for stationarity and distribution, and
model selection.
The 10-year data (FY 2014 to FY 2023) for TCS is shown below:
1. Data Preprocessing
Time series analysis requires data that is free of discontinuities.
Preprocessing is essential because raw stock market data contains missing values and inconsistencies. In
preparing the stock market data for ARIMA modeling, the primary preprocessing step is ensuring
the continuity of the time series. The stock market is closed on weekends and holidays,
so no data is recorded for these days. To ensure continuity of the dataset, we use the forward
filling method, where the data from the previous day fills in the missing data for the current day.
This approach is preferred because it assumes that the last observed value remains valid until a new
value is recorded, which aligns with the nature of stock price data.
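Forward filling can be sketched with the standard library alone. The closing prices below are hypothetical, and the function name is an assumption of this example; data-frame libraries such as pandas provide an equivalent `ffill` operation.

```python
from datetime import date, timedelta


def forward_fill_daily(closes):
    """closes: dict mapping trading dates to closing prices.
    Returns (date, price) for every calendar day between the first and last
    date, carrying the last observed price forward over non-trading days."""
    trading_days = sorted(closes)
    filled, last_price = [], None
    day = trading_days[0]
    while day <= trading_days[-1]:
        last_price = closes.get(day, last_price)
        filled.append((day, last_price))
        day += timedelta(days=1)
    return filled


# Hypothetical closes around a weekend (illustrative values only).
closes = {
    date(2024, 1, 5): 3700.0,  # Friday
    date(2024, 1, 8): 3720.0,  # Monday
}

for day, price in forward_fill_daily(closes):
    print(day.isoformat(), price)
```

Saturday and Sunday inherit Friday's close, so the series has one value per calendar day.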
Time series can be classified in many ways. One such classification is stationary versus non-stationary.
A stationary time series is one in which the mean, variance, autocorrelation and other statistical
properties are independent of time, while a non-stationary time series is one in which these statistical
properties are time dependent.
Applying time series analysis directly to non-stationary data can produce
unreliable and erroneous results when studying stock market trends, decreasing forecasting accuracy. A non-stationary
time series can be transformed into a stationary one using methods such as differencing,
taking logarithms, taking the nth root, or a combination of these methods.
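Differencing, the first of these transformations, replaces each value with its change from the previous value. A minimal sketch on toy series (not the TCS data):

```python
def difference(series, order=1):
    """Apply first differencing `order` times: y'_t = y_t - y_{t-1}."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


linear_trend = [100, 102, 104, 106, 108]
quadratic_trend = [1, 4, 9, 16, 25]

print(difference(linear_trend))              # constant after one difference
print(difference(quadratic_trend, order=2))  # constant after two differences
```

A linear trend disappears after one round of differencing and a quadratic trend after two, which is why the differencing order d is chosen according to how strongly the series trends.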
The first-order differencing of the 10-year (non-stationary) data is shown below:
2. Stationarity Check Using Augmented Dickey-Fuller (ADF) Test
ARIMA assumes that the data is stationary, meaning that its statistical properties like mean and variance
remain constant over time. The ADF test is a commonly used statistical test to determine whether a given time
series is stationary or not. It is a unit root test, which tests for the presence of a unit root (implying that the
series is non-stationary). It augments the Dickey-Fuller test with lagged difference terms so that it can handle larger and
more complicated time series models.
\[ y_t = c + \beta t + \alpha y_{t-1} + \phi_1 \Delta y_{t-1} + \phi_2 \Delta y_{t-2} + \dots + \phi_p \Delta y_{t-p} + \varepsilon_t \]
The null hypothesis is that a unit root is present (α = 1). Thus, to infer that the series is stationary, we need
to reject the null hypothesis by obtaining a p-value less than the significance level (0.05).
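To make the idea of the test concrete, the sketch below computes the statistic of the plain (non-augmented) Dickey-Fuller regression with a constant and no trend, that is Δy_t = a + γ·y_{t-1} + e_t, where γ = α − 1. This is a simplification of the ADF test described above, not the procedure used in the report: it omits the lagged difference terms, and it compares against the large-sample 5% critical value of about −2.86 instead of computing a p-value. The two series are hypothetical.

```python
from math import sqrt


def dickey_fuller_stat(y):
    """t-statistic of gamma in  dy_t = a + gamma * y_{t-1} + e_t."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mean_x, mean_dy = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    gamma = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy)) / sxx
    intercept = mean_dy - gamma * mean_x
    ssr = sum((di - intercept - gamma * xi) ** 2 for xi, di in zip(x, dy))
    std_err = sqrt(ssr / (n - 2) / sxx)
    return gamma / std_err


CRITICAL_5PCT = -2.86  # large-sample value for the constant-only case

# Hypothetical series: one oscillating around a fixed level, one drifting upward.
mean_reverting = [10, 12, 9, 11] * 5
trending = [100, 101, 103, 102, 104, 107, 106, 108, 111, 110,
            113, 115, 114, 117, 120, 119, 122, 124, 123, 126]

for name, series in [("mean-reverting", mean_reverting), ("trending", trending)]:
    stat = dickey_fuller_stat(series)
    verdict = "reject unit root" if stat < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name}: statistic = {stat:.2f} -> {verdict}")
```

In practice a library routine such as `adfuller` in statsmodels handles lag selection and p-values; for a sample as short as 16 observations the exact critical value is more negative than the large-sample one assumed here.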
To test for stationarity, we took a random sample of 16 days from the dataset. The ADF results are
shown below:
For the TCS stock data, the ADF test yielded a p-value less than 0.05, indicating stationarity.
3. Distribution Check Using Anderson-Darling Test
Understanding the distribution of the data helps in selecting appropriate statistical models and
identifying potential irregularities. The Anderson-Darling test was used to check whether the data
followed a normal distribution, as this distribution is often assumed in many statistical analyses. The null
hypothesis for this test is that the data conforms to a specified distribution, such as the normal
distribution. The Anderson-Darling test calculates its critical values based on the specific distribution being tested.
H0 : The data fits a given distribution
H1 : The data does not fit a given distribution.
If the p-value > 𝛂, we cannot reject the null hypothesis, and the data is consistent with the given
distribution. Otherwise, we reject the null hypothesis.
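The Anderson-Darling statistic itself is straightforward to compute. The sketch below evaluates it against a normal distribution whose mean and standard deviation are estimated from the sample; the two samples are synthetic, and the decision step (comparison with critical values) is left to a library routine such as `scipy.stats.anderson`.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(data):
    """A^2 statistic against a normal distribution fitted to the sample.
    Larger values indicate a worse fit."""
    x = sorted(data)
    n = len(x)
    fitted = NormalDist(mean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    total = sum(
        (2 * i + 1) * (log(cdf[i]) + log(1 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


# Synthetic samples: evenly spaced normal quantiles, and a skewed transform.
normal_like = [NormalDist().inv_cdf((i + 0.5) / 20) for i in range(20)]
skewed = [exp(v) for v in normal_like]

a2_normal = anderson_darling_normal(normal_like)
a2_skewed = anderson_darling_normal(skewed)
print(f"A^2 (normal-like) = {a2_normal:.3f}")
print(f"A^2 (skewed)      = {a2_skewed:.3f}")
```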
The test did not reject the null hypothesis for the normal, Weibull, and logistic distributions, with
the Weibull distribution's test statistic being the lowest. Since normality is not rejected, we can
continue with ARIMA.
4. ARIMA vs SARIMA
The choice between ARIMA and SARIMA depends on the presence of seasonality in the data. SARIMA
extends ARIMA by incorporating seasonal components (P, D, Q, s), which account for repeating patterns
or cycles at fixed intervals (e.g., monthly or quarterly). To determine whether SARIMA was necessary,
the data was visually inspected and analyzed using autocorrelation (ACF) and partial autocorrelation
(PACF) plots. ARIMA was used here to forecast over a shorter duration, whereas SARIMA
is usually used to forecast over a longer period of time.
For TCS stock prices over the 16-day period, no significant seasonal patterns were observed in the plots,
and there were no periodic spikes in the ACF at regular intervals. This absence of seasonality made
ARIMA a more suitable and efficient choice. Additionally, ARIMA's simpler structure avoids overfitting
and reduces computational complexity, which can be a concern with SARIMA models. Therefore,
ARIMA was chosen for its effectiveness in modeling non-seasonal trends and residual correlations.
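The seasonality check described above looks for periodic spikes in the autocorrelation function. A minimal sketch of the sample ACF follows; the series is artificial, with a built-in period of 4, to show what such a spike looks like.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Artificial series that repeats every 4 observations.
seasonal = [1, 3, 2, 0] * 6

for lag, value in enumerate(acf(seasonal, max_lag=8)):
    print(f"lag {lag}: {value:+.2f}")
```

A series with seasonality shows large positive values at multiples of its period (here lags 4 and 8); the absence of such spikes is what supports a non-seasonal ARIMA model.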
5. Forecasting and Prediction of TCS Stock Prices
After the ADF test and the Anderson-Darling test, the next step is to choose the model for forecasting the
data. As stated above, an ARIMA model was fitted once the data was preprocessed and confirmed stationary. The 16-day
sample of TCS stock data was split into two sets: a training set and a test set. The training set is used to
fit the model, while the test set is used to validate it: the deviation of the model's predicted values
from the test set measures the accuracy of the trained model.
The above graph is shown for an ARIMA model trained with parameters (p, d, q) set to (1, 1, 1). This
model was then used to forecast stock prices for the next 4 trading days.
The model order (p, d, q) can be optimized based on ACF and PACF plots. It can also be selected using the
auto_arima function in Python, which leads to ARIMA(1, 0, 0), where:
● p=1: Indicates one lag of the autoregressive term, capturing short-term dependencies in the
data: one lagged value of the series is used for forecasting.
● d=0: Indicates that no differencing is applied to the series to make it stationary.
● q=0: Suggests that the model does not include any lagged forecast errors in the moving average
(MA) part of the model.
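An ARIMA(1, 0, 0) model is an AR(1) model with a constant, y_t = c + φ·y_{t-1} + e_t. The sketch below fits it by ordinary least squares and forecasts four steps ahead. It is a simplified stand-in for a library fit (which would typically use maximum likelihood), the 16 prices are hypothetical, and the 12/4 split between training and test data is an assumption of this example.

```python
def fit_ar1(series):
    """Least-squares fit of y_t = c + phi * y_{t-1} + e_t."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted equation forward, feeding each forecast back in."""
    forecasts = []
    for _ in range(steps):
        last_value = c + phi * last_value
        forecasts.append(last_value)
    return forecasts


# 16 hypothetical closing prices (not TCS data), split 12 / 4.
prices = [3700, 3712, 3695, 3708, 3721, 3702, 3698, 3715,
          3709, 3690, 3704, 3718, 3707, 3699, 3711, 3703]
train, test = prices[:12], prices[12:]

c, phi = fit_ar1(train)
predicted = forecast_ar1(train[-1], c, phi, steps=len(test))
mae = sum(abs(p - a) for p, a in zip(predicted, test)) / len(test)

print(f"c = {c:.2f}, phi = {phi:.3f}")
print("forecast:", [round(p, 1) for p in predicted])
print(f"mean absolute error on the test set: {mae:.2f}")
```

The mean absolute error on the held-out days plays the role of the validation step: it measures how far the forecasts deviate from the test set.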
ARIMA was justified for the shorter period because that data is stationary. For
the 10-year data, which is non-stationary, SARIMA would be more useful. The use of SARIMA
on the 10-year data is shown below:
6. Results and Conclusion
This analysis demonstrated a systematic approach to preparing, analyzing, and modeling TCS stock
prices. Key takeaways include:
1. Data Preprocessing: Addressing missing values, outliers, and non-stationarity ensured the data
was ready for modeling (for example, first-order differencing of the 10-year data).
2. Stationarity: The ADF test confirmed that the data met ARIMA's assumptions: for the TCS stock
data, it yielded a p-value less than 0.05, indicating stationarity.
3. Distribution: The Anderson-Darling test did not reject normality for the sample, so the
normality assumption was retained for modeling.
4. Model Choice: ARIMA was preferred over SARIMA due to the absence of seasonality,
ensuring a simpler and effective model.
5. Forecasting: The ARIMA model provided accurate short-term predictions, showing its utility
for financial decision-making.
This analysis sets the foundation for further exploration and decision-making in financial modeling.
References
1. [Link]
2. [Link]
3. [Link]
4. [Link]
5. Bajpai, G. N. (2019). "Development of Capital Markets in India." Springer.
6. [Link]