Introduction

In this tutorial, we will look into

  • How to calculate rolling means and plot them?
  • How to adjust the close based on the stock splits or bonus stock issues.
  • How to calculate monthly, weekly and quarterly returns?

Loading the infosys daily prices dataset

In [104]:
import pandas as pd
import numpy as np
In [105]:
infy_df = pd.read_csv( '../scraping/INFY.csv' )
In [106]:
infy_df.head( 5 )
Out[106]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs)
0 2011-06-29 2868.00 2907.00 2868.0 2880.0 2881.75 600094.0 17320.05
1 2011-06-30 2886.05 2919.00 2881.5 2903.1 2910.45 1417730.0 41163.32
2 2011-07-01 2935.00 2969.95 2924.1 2928.0 2934.15 800308.0 23549.85
3 2011-07-04 2966.65 2977.00 2935.0 2941.0 2938.95 514324.0 15162.29
4 2011-07-05 2957.00 2969.90 2926.0 2959.0 2956.45 802966.0 23701.22

Setting Date Time Index

In [107]:
infy_df = infy_df.set_index(pd.DatetimeIndex(infy_df['Date']) )
In [108]:
infy_df[0:5]
Out[108]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs)
2011-06-29 2011-06-29 2868.00 2907.00 2868.0 2880.0 2881.75 600094.0 17320.05
2011-06-30 2011-06-30 2886.05 2919.00 2881.5 2903.1 2910.45 1417730.0 41163.32
2011-07-01 2011-07-01 2935.00 2969.95 2924.1 2928.0 2934.15 800308.0 23549.85
2011-07-04 2011-07-04 2966.65 2977.00 2935.0 2941.0 2938.95 514324.0 15162.29
2011-07-05 2011-07-05 2957.00 2969.90 2926.0 2959.0 2956.45 802966.0 23701.22

Calculating Gains

In [109]:
infy_df['gain'] = infy_df.Close.pct_change( 1 )
In [110]:
infy_df[0:5]
Out[110]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain
2011-06-29 2011-06-29 2868.00 2907.00 2868.0 2880.0 2881.75 600094.0 17320.05 NaN
2011-06-30 2011-06-30 2886.05 2919.00 2881.5 2903.1 2910.45 1417730.0 41163.32 0.009959
2011-07-01 2011-07-01 2935.00 2969.95 2924.1 2928.0 2934.15 800308.0 23549.85 0.008143
2011-07-04 2011-07-04 2966.65 2977.00 2935.0 2941.0 2938.95 514324.0 15162.29 0.001636
2011-07-05 2011-07-05 2957.00 2969.90 2926.0 2959.0 2956.45 802966.0 23701.22 0.005955

Plotting Gains

In [111]:
import matplotlib.pyplot as plt
import seaborn as sn
%matplotlib inline
In [112]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_df.gain )
Out[112]:
[<matplotlib.lines.Line2D at 0x11c1f73c8>]
In [113]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_df.Close )
Out[113]:
[<matplotlib.lines.Line2D at 0x11c328978>]

Why somedays the stock has fallen almost 50%?

In [114]:
infy_df[infy_df.gain <= -0.49]
Out[114]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain
2014-12-02 2014-12-02 2164.95 2164.95 2115.0 2127.1 2126.60 2663748.0 56883.10 -0.511110
2015-06-15 2015-06-15 976.95 998.80 968.0 994.0 990.45 2618213.0 25924.87 -0.498519

News for bonus stock issues in the ratio of 1:1 on the above two dates

http://www.moneycontrol.com/company-facts/infosys/bonus/IT#IT

Initialize the stock weight

In [115]:
infy_df['weight'] = np.nan

Stock Prices on and before Bonus Issue days

In [116]:
infy_df[infy_df.index == '2015-06-15']
Out[116]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2015-06-15 2015-06-15 976.95 998.8 968.0 994.0 990.45 2618213.0 25924.87 -0.498519 NaN
In [117]:
infy_df[infy_df.index == '2015-06-12']
Out[117]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2015-06-12 2015-06-12 2005.0 2025.55 1955.5 1978.05 1975.05 2463231.0 49050.86 -0.011313 NaN
In [118]:
infy_df[infy_df.index == '2014-12-02']
Out[118]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2014-12-02 2014-12-02 2164.95 2164.95 2115.0 2127.1 2126.6 2663748.0 56883.1 -0.51111 NaN
In [119]:
infy_df[infy_df.index == '2014-12-01']
Out[119]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2014-12-01 2014-12-01 4387.0 4402.2 4292.6 4348.95 4349.85 2292733.0 99748.43 -0.002294 NaN

Adjust for the stock split

In [120]:
infy_df.loc['2015-06-12', 'weight'] = 2
In [121]:
infy_df[infy_df.index == '2015-06-12']
Out[121]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2015-06-12 2015-06-12 2005.0 2025.55 1955.5 1978.05 1975.05 2463231.0 49050.86 -0.011313 2.0
In [122]:
infy_df.loc['2014-12-01', 'weight'] = 4
In [123]:
infy_df[infy_df.index == '2014-12-01']
Out[123]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2014-12-01 2014-12-01 4387.0 4402.2 4292.6 4348.95 4349.85 2292733.0 99748.43 -0.002294 4.0
In [124]:
np.max( infy_df.index )
Out[124]:
Timestamp('2016-12-30 00:00:00')
In [126]:
infy_df.loc['2016-12-30', 'weight'] = 1.0
In [127]:
infy_df[-1:]
Out[127]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2016-12-30 2016-12-30 998.0 1012.0 992.75 1011.0 1010.6 3633884.0 36590.12 0.018391 1.0
In [128]:
infy_df = infy_df.fillna( method = 'bfill')
In [129]:
infy_df[0:5]
Out[129]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2011-06-29 2011-06-29 2868.00 2907.00 2868.0 2880.0 2881.75 600094.0 17320.05 0.009959 4.0
2011-06-30 2011-06-30 2886.05 2919.00 2881.5 2903.1 2910.45 1417730.0 41163.32 0.009959 4.0
2011-07-01 2011-07-01 2935.00 2969.95 2924.1 2928.0 2934.15 800308.0 23549.85 0.008143 4.0
2011-07-04 2011-07-04 2966.65 2977.00 2935.0 2941.0 2938.95 514324.0 15162.29 0.001636 4.0
2011-07-05 2011-07-05 2957.00 2969.90 2926.0 2959.0 2956.45 802966.0 23701.22 0.005955 4.0
In [130]:
infy_df[-5:-1]
Out[130]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2016-12-26 2016-12-26 988.00 994.0 975.10 983.35 982.75 1606914.0 15800.50 -0.005767 1.0
2016-12-27 2016-12-27 982.75 1001.9 980.00 996.90 998.95 2456408.0 24394.26 0.016484 1.0
2016-12-28 2016-12-28 1002.95 1009.8 995.00 999.00 998.50 2738408.0 27470.40 -0.000450 1.0
2016-12-29 2016-12-29 1003.75 1008.0 988.15 990.00 992.35 3120062.0 31134.52 -0.006159 1.0
In [132]:
infy_df[infy_df.index >= '2014-11-30'][0:5]
Out[132]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2014-12-01 2014-12-01 4387.00 4402.20 4292.60 4348.95 4349.85 2292733.0 99748.43 -0.002294 4.0
2014-12-02 2014-12-02 2164.95 2164.95 2115.00 2127.10 2126.60 2663748.0 56883.10 -0.511110 2.0
2014-12-03 2014-12-03 2101.50 2132.95 2084.55 2124.50 2123.50 3193761.0 67357.29 -0.001458 2.0
2014-12-04 2014-12-04 2110.00 2144.90 2095.00 2103.50 2101.80 2349807.0 49608.48 -0.010219 2.0
2014-12-05 2014-12-05 2115.00 2116.60 2059.80 2063.40 2070.30 1839690.0 38243.21 -0.014987 2.0
In [131]:
infy_df[infy_df.index >= '2015-06-10'][0:5]
Out[131]:
Date Open High Low Last Close Total Trade Quantity Turnover (Lacs) gain weight
2015-06-10 2015-06-10 1990.00 2039.35 1985.00 2027.00 2026.50 1596460.0 32121.23 0.017268 2.0
2015-06-11 2015-06-11 2035.95 2044.75 1995.05 2000.00 1997.65 2307366.0 46399.07 -0.014236 2.0
2015-06-12 2015-06-12 2005.00 2025.55 1955.50 1978.05 1975.05 2463231.0 49050.86 -0.011313 2.0
2015-06-15 2015-06-15 976.95 998.80 968.00 994.00 990.45 2618213.0 25924.87 -0.498519 1.0
2015-06-16 2015-06-16 994.35 1003.00 985.00 998.00 999.35 2611722.0 25994.83 0.008986 1.0

Calculate the new close

In [134]:
infy_df['new_close'] = infy_df.Close / infy_df.weight
In [135]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_df.new_close )
Out[135]:
[<matplotlib.lines.Line2D at 0x11c4f42e8>]

Monthly Price Changes

In [153]:
ohlc_dict = {'Open': 'first', 'new_close':'last'}
infy_monthly_df = infy_df.resample('M').apply( ohlc_dict ).reset_index()
infy_monthly_df[0:5]
Out[153]:
index Open new_close
0 2011-06-30 2868.0 727.6125
1 2011-07-31 2935.0 693.9750
2 2011-08-31 2780.1 585.7375
3 2011-09-30 2385.0 633.2625
4 2011-10-31 2490.0 719.3875
In [154]:
infy_monthly_df = infy_monthly_df.set_index(pd.DatetimeIndex(infy_monthly_df['index']) )
infy_monthly_df = infy_monthly_df.sort_index( ascending = True )
infy_monthly_df[0:5]
Out[154]:
index Open new_close
2011-06-30 2011-06-30 2868.0 727.6125
2011-07-31 2011-07-31 2935.0 693.9750
2011-08-31 2011-08-31 2780.1 585.7375
2011-09-30 2011-09-30 2385.0 633.2625
2011-10-31 2011-10-31 2490.0 719.3875
In [156]:
infy_monthly_df['m_gains'] = infy_monthly_df.new_close.pct_change( 1 )
In [157]:
infy_monthly_df[0:5]
Out[157]:
index Open new_close m_gains
2011-06-30 2011-06-30 2868.0 727.6125 NaN
2011-07-31 2011-07-31 2935.0 693.9750 -0.046230
2011-08-31 2011-08-31 2780.1 585.7375 -0.155967
2011-09-30 2011-09-30 2385.0 633.2625 0.081137
2011-10-31 2011-10-31 2490.0 719.3875 0.136002

Plot monthly gains

In [158]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_monthly_df.m_gains )
Out[158]:
[<matplotlib.lines.Line2D at 0x11c7b16d8>]
In [160]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_monthly_df.new_close )
Out[160]:
[<matplotlib.lines.Line2D at 0x11c8d97b8>]
In [170]:
sn.distplot( infy_monthly_df.m_gains.dropna(), hist = True )
Out[170]:
<matplotlib.axes._subplots.AxesSubplot at 0x11cd29588>
In [172]:
sn.boxplot( infy_monthly_df.m_gains.dropna() )
Out[172]:
<matplotlib.axes._subplots.AxesSubplot at 0x11cefb2b0>

Worst month

In [173]:
infy_monthly_df[infy_monthly_df.m_gains < -0.2]
Out[173]:
index Open new_close m_gains rmean_3 rmean_6 rmean_12
2013-04-30 2013-04-30 2910.75 558.8625 -0.226314 669.316667 649.035417 625.040625

Find Rolling means of 3 and 12 months

In [161]:
infy_monthly_df['rmean_3'] = infy_monthly_df.new_close.rolling( 3 ).mean()
In [162]:
infy_monthly_df[0:5]
Out[162]:
index Open new_close m_gains rmean_3
2011-06-30 2011-06-30 2868.0 727.6125 NaN NaN
2011-07-31 2011-07-31 2935.0 693.9750 -0.046230 NaN
2011-08-31 2011-08-31 2780.1 585.7375 -0.155967 669.108333
2011-09-30 2011-09-30 2385.0 633.2625 0.081137 637.658333
2011-10-31 2011-10-31 2490.0 719.3875 0.136002 646.129167
In [166]:
infy_monthly_df['rmean_12'] = infy_monthly_df.new_close.rolling( 12 ).mean()
In [167]:
infy_monthly_df[0:5]
Out[167]:
index Open new_close m_gains rmean_3 rmean_6 rmean_12
2011-06-30 2011-06-30 2868.0 727.6125 NaN NaN NaN NaN
2011-07-31 2011-07-31 2935.0 693.9750 -0.046230 NaN NaN NaN
2011-08-31 2011-08-31 2780.1 585.7375 -0.155967 669.108333 NaN NaN
2011-09-30 2011-09-30 2385.0 633.2625 0.081137 637.658333 NaN NaN
2011-10-31 2011-10-31 2490.0 719.3875 0.136002 646.129167 NaN NaN
In [168]:
plt.figure( figsize = ( 8, 6 ))
plt.plot( infy_monthly_df.new_close, 'r' )
plt.plot( infy_monthly_df.rmean_3, 'g' )
plt.plot( infy_monthly_df.rmean_12, 'b' )
Out[168]:
[<matplotlib.lines.Line2D at 0x11cbd21d0>]

Conclusion:

In this tutorial, we have calculated understood

  • How to adjust for stock splits or bonus issues.
  • How to find monthly returns
  • How to find rolling means for several periods
In [ ]: