This article compares the return on investment equities across different segments like banks, metals and auto. If the average return of one sectors is different than the other. This dataset only compares the intra-day returns and compares the data of last one year and does not take into account external factors at all.

The steps:

  • Read data from excel
  • Each sector data is in different tabs.
  • Calculate the intra-day gains for each sectors
  • Then merge the data into one dataframe
  • Then visualize the distribution plots
  • Conduct hypothesis tests on intr-day returns
  • Plot a time series plot
In [1]:
import pandas as pd
import numpy as np
In [2]:
import os

Read data from Excel

In [3]:
index = pd.ExcelFile( "index.xlsx" )
In [4]:
index.sheet_names
Out[4]:
['BSE', 'BANKEX', 'Auto', 'Metals']
In [5]:
bse = index.parse( "BSE" )
In [6]:
bse.head()
Out[6]:
Date Open High Low Close
0 2015-11-01 26641.69 26824.30 25451.42 26145.67
1 2015-10-01 26344.19 27618.14 26168.71 26656.83
2 2015-09-01 26127.04 26471.82 24833.54 26154.83
3 2015-08-01 28089.09 28417.59 25298.42 26283.09
4 2015-07-01 27823.65 28578.33 27416.39 28114.56
In [7]:
bse.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 71 entries, 0 to 70
Data columns (total 5 columns):
Date     71 non-null datetime64[ns]
Open     71 non-null float64
High     71 non-null float64
Low      71 non-null float64
Close    71 non-null float64
dtypes: datetime64[ns](1), float64(4)
memory usage: 3.3 KB

Calculate Intraday gains for BSE, Auto, Metal and Bankex

In [8]:
bse["gain"] = bse.apply( lambda x: ( x.Close - x.Open ) * 100 / x.Open, axis = 1 )
In [9]:
bse.head()
Out[9]:
Date Open High Low Close gain
0 2015-11-01 26641.69 26824.30 25451.42 26145.67 -1.861819
1 2015-10-01 26344.19 27618.14 26168.71 26656.83 1.186751
2 2015-09-01 26127.04 26471.82 24833.54 26154.83 0.106365
3 2015-08-01 28089.09 28417.59 25298.42 26283.09 -6.429543
4 2015-07-01 27823.65 28578.33 27416.39 28114.56 1.045549
In [10]:
bse["gain"] = bse.apply( lambda x: np.round( ( x.Close - x.Open ) * 100 / x.Open, 2 ), axis = 1 )
In [11]:
auto = index.parse( "Auto" )
auto["gain"] = bse.apply( lambda x: np.round( ( x.Close - x.Open ) * 100 / x.Open, 2 ), axis = 1 )
auto = auto[['Date', 'gain']]
auto.columns = ["Date", 'auto_gain']
In [12]:
auto.head()
Out[12]:
Date auto_gain
0 2015-11-01 -1.86
1 2015-10-01 1.19
2 2015-09-01 0.11
3 2015-08-01 -6.43
4 2015-07-01 1.05
In [13]:
auto.head()
Out[13]:
Date auto_gain
0 2015-11-01 -1.86
1 2015-10-01 1.19
2 2015-09-01 0.11
3 2015-08-01 -6.43
4 2015-07-01 1.05
In [14]:
bankex = index.parse( "BANKEX" )
bankex["gain"] = bankex.apply( lambda x: np.round( ( x.Close - x.Open ) * 100 / x.Open, 2 ), axis = 1 )
bankex = bankex[['Date', 'gain']]
bankex.columns = ["Date", 'bankex_gain']
In [15]:
metal = index.parse( "Metals" )
metal["gain"] = metal.apply( lambda x: np.round( ( x.Close - x.Open ) * 100 / x.Open, 2 ), axis = 1 )
metal = metal[['Date', 'gain']]
metal.columns = ["Date", 'metal_gain']

Merge Dataframes

In [16]:
auto_bank = auto.merge( bankex, on = ['Date'], how = 'inner' )
In [17]:
auto_bank_metal = auto_bank.merge( metal, on = ['Date'], how = 'inner' )
In [18]:
auto_bank_metal.head()
Out[18]:
Date auto_gain bankex_gain metal_gain
0 2015-11-01 -1.86 0.55 -3.21
1 2015-10-01 1.19 -0.37 5.91
2 2015-09-01 0.11 1.01 -7.24
3 2015-08-01 -6.43 -8.91 -14.31
4 2015-07-01 1.05 2.56 -7.41

Visualize the distribution of gains

In [19]:
import seaborn as sn
%matplotlib inline
import matplotlib.pyplot as plt

sn.set(rc={"figure.figsize": (10, 6)});
sn.distplot( auto_bank_metal.auto_gain, hist = False, color='r' )
sn.distplot( auto_bank_metal.bankex_gain, hist = False, color='g' )
sn.distplot( auto_bank_metal.metal_gain, hist = False, color='b' )
Out[19]:
<matplotlib.axes._subplots.AxesSubplot at 0x8e65908>

Visulialize the pair plot between the gains across multiple sectors

In [20]:
sn.pairplot( auto_bank_metal[["auto_gain", "bankex_gain", "metal_gain"]])
Out[20]:
<seaborn.axisgrid.PairGrid at 0x7af5898>

Hypothesis test for average returns of differnet sectors

  • Null Hypothesis : Average intra-day gains of auto and banks are same
  • Alternate Hypothesis : Average intra-day gains of auto and banks are different
In [21]:
from scipy import stats
stats.ttest_ind( auto_bank_metal.auto_gain, auto_bank_metal.bankex_gain, equal_var=False)
Out[21]:
(-0.47418462946747664, 0.63625758361524243)
  • Conclusion: As p-valus is more than 0.05, average intra-day gains of auto and banks are same
  • Null Hypothesis : Average intra-day gains of auto and metal are same
  • Alternate Hypothesis : Average intra-day gains of auto and metas are different
In [22]:
stats.ttest_ind( auto_bank_metal.auto_gain, auto_bank_metal.metal_gain, equal_var=False)
Out[22]:
(1.4478941750387357, 0.15046434670608549)
  • Conclusion: As p-valus is more than 0.05, average intra-day gains of auto and metals are same
  • Null Hypothesis : Average intra-day gains of bank and metals are same
  • Alternate Hypothesis : Average intra-day gains of bank and metals are different
In [23]:
stats.ttest_ind( auto_bank_metal.metal_gain, auto_bank_metal.bankex_gain, equal_var=False)
Out[23]:
(-1.6004442793566971, 0.11176745899569727)
  • Conclusion: As p-valus is more than 0.05, average intra-day gains of bank and metals are same

Plot time series plots

In [24]:
plt.plot(auto_bank_metal.Date, auto_bank_metal.bankex_gain, 'r' )
plt.plot(auto_bank_metal.Date, auto_bank_metal.auto_gain, 'g' )
plt.plot(auto_bank_metal.Date, auto_bank_metal.metal_gain, 'b' )

plt.show()