Hurst II

In another post we explained how to calculate the Hurst exponent using two different methods.

Now we implement it.

To test the method we’ll use a time series from the Brazilian market: the mini future of the IBovespa index, hourly, from January to April, 2019. It’s available here.

This choice is quite arbitrary, but we know that this time series is mean reverting.

How do we know that? A simple ADF test:


import statsmodels.tsa.stattools as ts
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Hourly bars of the IBovespa mini future, semicolon-separated
win = pd.read_csv('WIN_H1_01_04.csv', sep=';')

plt.plot(win['CLOSE'])

# ADF test with a constant term and a fixed lag of 1 (no automatic lag selection)
results = ts.adfuller(win['CLOSE'], maxlag=1, regression='c', autolag=None)
print(results)

The result reads:

(-3.985491197671518, 0.0014884183818247635, 1, 794, {'1%': -3.4386126789104074, '5%': -2.865186972298872, '10%': -2.5687119871327146})

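That means that, to reject the unit-root hypothesis (and so treat the series as mean reverting) at the 10% significance level, the first value, the test statistic, should be more negative than about -2.57, and likewise about -2.87 for 5% and -3.44 for 1%. Here the statistic is about -3.99, which is below all three critical values.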

So we know it is mean reverting and we can test our implementation safely.
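The comparison between the test statistic and the critical values can be sketched in a few lines. The tuple below is copied from the output above; the helper name rejected_levels is hypothetical and only illustrates the reading of the result, it is not part of statsmodels.

```python
# ADF output copied from the text:
# (statistic, p-value, lags used, observations, critical values)
adf_result = (
    -3.985491197671518,
    0.0014884183818247635,
    1,
    794,
    {'1%': -3.4386126789104074, '5%': -2.865186972298872, '10%': -2.5687119871327146},
)


def rejected_levels(result):
    """Hypothetical helper: levels at which the unit-root null is rejected."""
    statistic, critical_values = result[0], result[4]
    # The null is rejected when the statistic is more negative than the critical value
    return [level for level, value in critical_values.items() if statistic < value]


print(rejected_levels(adf_result))  # ['1%', '5%', '10%']
```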

Calculating the Hurst exponent

The strategy is the following: calculate Var(t) for many values of t and take the log of the equation Var(t) = c*t^{2H}. This gives an affine relation between log Var(t) and log t, and its slope is 2H.
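Written out, the step is just a logarithm applied to both sides. The names x and y are the ones used for the regression later in this post:

```latex
\mathrm{Var}(t) = c\,t^{2H}
\quad\Longrightarrow\quad
\log \mathrm{Var}(t) = \log c + 2H \log t ,
\qquad\text{so with } y = \log \mathrm{Var}(t),\; x = \log t:\quad
y = 2H\,x + \log c .
```

A straight-line fit of y against x therefore has slope 2H and intercept log c.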

So, to calculate the Hurst exponent, the first thing to do is generate a set T of lags t. Meaningful lags are positive and shorter than the time series z. We use numpy’s arange, which here generates the integers from 0 up to a fraction (which we choose) of the length of z; the lag 0 is discarded later, because its logarithm is not finite:

price = np.log(win.CLOSE)

T = np.arange(len(price)/50).astype(int)
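To see what this line produces, here is a minimal sketch with a hypothetical series length of 796, chosen only for illustration. Note that arange starts at 0 when given a single argument, so the first lag in T is 0.

```python
import numpy as np

n = 796  # hypothetical series length, for illustration only

# Same expression as above: arange with a float stop, then cast to int
T = np.arange(n / 50).astype(int)

print(T)  # the lags 0 through 15
```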

The next step is to calculate Var(t). To be extra didactic I’ll make an explicit calculation for a particular t in T: first we have to calculate the following list (writing z for price):

|z(t) - z(0)|
|z(t+1) - z(1)|
|z(t+2) - z(2)|
…
|z(t+N) - z(N)|

And for this set we calculate the variance. We repeat the process for every t in T, thus obtaining a set {Var(t)}_{t in T}.

The best tool for this is pandas’ diff method, called with the lag t as its argument. Numpy’s diff function does not fit here, because its second argument is the order of differencing, not a lag.

So we define y as follows, which is already the log of Var(t):

y = [np.log(price.diff(t).abs().var()) for t in T]
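As a sanity check, the explicit list and the diff-based version can be compared on a small example. The sketch below uses a synthetic random walk as a stand-in for the log price (an assumption, since the CSV is not reproduced here) and an arbitrary lag t = 5.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic stand-in for the log price series (assumption, not the real data)
z = pd.Series(np.cumsum(rng.normal(size=500)))

t = 5  # arbitrary lag chosen for the example

# Explicit list |z(t) - z(0)|, |z(t+1) - z(1)|, ...
explicit = np.abs(z.values[t:] - z.values[:-t])

# pandas version: the first t entries are NaN and var() skips them
with_diff = z.diff(t).abs()

var_explicit = explicit.var(ddof=1)  # ddof=1 matches the pandas default
var_pandas = with_diff.var()

print(np.isclose(var_explicit, var_pandas))  # True
```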

This is it. The rest of the code is merely the mechanics to get H, but the reasoning is over.

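y = np.array(y)
x = np.log(T)  # lag 0 gives -inf here

# Keep only the points where both coordinates are finite (this drops lag 0)
mask = np.isfinite(x) & np.isfinite(y)
x = x[mask]
y = y[mask]

# Fit log Var(t) = 2H log t + log c; the slope is result[0]
result = np.polyfit(x, y, 1)

H = result[0]/2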

The result we get is H ≈ 0.43. Since H < 0.5 indicates mean reversion, this agrees with what we know about the series.
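One way to gain confidence in the estimator is to run it on series whose behaviour is known in advance. The sketch below wraps the same calculation in a hypothetical helper, hurst_from_variance, and applies it to two synthetic series: a random walk, for which H should be close to 0.5, and white noise, which reverts to its mean immediately and should give H close to 0. The series length, seed, and maximum lag are arbitrary choices.

```python
import numpy as np
import pandas as pd


def hurst_from_variance(series, max_lag):
    """Hypothetical helper: H from the slope of log Var(t) against log t."""
    lags = np.arange(1, max_lag + 1)  # start at 1, so no lag 0 to filter out
    log_var = [np.log(series.diff(int(t)).abs().var()) for t in lags]
    slope, _ = np.polyfit(np.log(lags), log_var, 1)
    return slope / 2


rng = np.random.default_rng(42)
noise = rng.normal(size=20000)

random_walk = pd.Series(np.cumsum(noise))  # no mean reversion
white_noise = pd.Series(noise)             # strongly mean reverting

h_walk = hurst_from_variance(random_walk, max_lag=20)
h_noise = hurst_from_variance(white_noise, max_lag=20)

print(round(h_walk, 2), round(h_noise, 2))
```

On real data the estimate depends on the range of lags used, so it is worth checking how stable H is when max_lag changes.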

The Jupyter notebook for this little project can be found here.
