Introduction to Hypothesis Driven Development — Overview of a Simple Strategy and Indicator Hypotheses

This post will begin to apply a hypothesis-driven development framework (that is, the framework written by Brian Peterson on how to do strategy construction correctly, found here) to a strategy I came across on SeekingAlpha. Namely, Cliff Smith posted about a conservative bond rotation strategy, which makes use of short-term treasuries, long-term treasuries, convertibles, emerging market debt, and high-yield corporate debt–that is, SHY, TLT, CWB, PCY, and JNK. This post will try to put a more formal framework around the question of whether this strategy is valid to begin with.

One note: to keep this post succinct for blog consumption and to get to the computational techniques more quickly, I'll be glossing over the background research write-up for this strategy. It is yet another take on time-series/cross-sectional momentum, pared down to something implementable for individual investors, as opposed to something that requires a massive collection of different instruments suited to massive, institutional-class portfolios.

Introduction, Overview, Objectives, Constraints, Assumptions, and Hypotheses to be Tested:

Momentum. It has been documented many times. For the sake of brevity, I'll let readers follow the links if they're so inclined, but among them are Jegadeesh and Titman's seminal 1993 paper, Mark Carhart's 1997 paper, Andreu et al. (2012), Barroso and Santa-Clara (2013), Ilmanen's Expected Returns (which covers momentum), and others. This list, of course, is far from exhaustive, but the point stands. Formation periods of several months (up to a year) should predict returns over some holding period, be it several months or, as is more commonly seen, one month.

Furthermore, momentum applies in two varieties–cross sectional, and time-series. Cross-sectional momentum asserts that assets that outperformed among a group will continue to outperform, while time-series momentum asserts that assets that have risen in price during a formation period will continue to do so for the short-term future.
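To make the distinction concrete, here is a minimal sketch of how each signal might be computed, assuming an xts object of monthly returns called monthRets, one column per asset (one such object is constructed later in this post); the three-month formation window is purely illustrative:

require(xts)

# Time-series momentum: each asset's own trailing three-month return sum;
# a positive value is a buy signal for that asset in isolation.
tsMomentum <- na.omit(rollapplyr(monthRets, width = 3, FUN = sum))

# Cross-sectional momentum: rank the assets against one another each month;
# the top-ranked asset is preferred regardless of its absolute performance.
csRanks <- t(apply(tsMomentum, 1, rank))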

Cliff Smith’s strategy depends on the latter, effectively, among a group of five bond ETFs. I am not certain of the objective of the strategy (he didn’t mention it), as PCY, JNK, and CWB, while they may be fixed-income in name, possess volatility on the order of equities. I suppose one possible “default” objective would be to achieve an outperforming total return against an equal-weighted benchmark, both rebalanced monthly.
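As a minimal sketch of that default objective's benchmark, assuming a 'returns' xts of daily returns for the five funds (built later in this post), PerformanceAnalytics can construct the equal-weighted, monthly-rebalanced comparison directly:

require(PerformanceAnalytics)

# Equal-weighted benchmark of all five funds, rebalanced monthly.
equalWeightBench <- Return.portfolio(returns,
                                     weights = rep(1/ncol(returns), ncol(returns)),
                                     rebalance_on = "months")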

The constraints are that one would need a sufficient amount of capital such that fixed transaction costs are negligible, since the strategy is a single-instrument rotation type, meaning that each month may have two-way turnover of 200% (sell one ETF, buy another). On the other hand, one would assume that the amount of capital deployed is small enough such that execution costs of trading do not materially impact the performance of the strategy. That is to say, moving multiple billions from one of these ETFs to the other is a non-starter. As all returns are computed close-to-close for the sake of simplicity, this creates the implicit assumption that the market impact and execution costs are very small compared to overall returns.
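To put a rough number on that fixed-cost assumption, here is a back-of-the-envelope sketch of the drag from full monthly turnover; the five-basis-point one-way cost is purely hypothetical, not an estimate for these particular ETFs:

# Hypothetical cost drag from switching funds every single month.
oneWayCost <- 0.0005      # assumed 5 bps per one-way trade
monthlyTurnover <- 2      # sell one ETF, buy another: 200% two-way turnover
annualDrag <- oneWayCost * monthlyTurnover * 12
annualDrag                # 0.012, i.e. about 1.2% per year in the worst case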

There are two overarching hypotheses to be tested in order to validate the efficacy of this strategy:

1) Time-series momentum: while it has been documented for equities and even industry/country ETFs, it may not yet have been formally documented for fixed-income ETFs and their corresponding mutual funds. In order to validate this strategy, it should be investigated whether the particular instruments it selects adhere to the same phenomenon.

2) Cross-sectional momentum: again, while this has been heavily demonstrated in the past with regard to equities, ETFs are fairly new, and of the five mutual funds Cliff Smith selected, the latest one only has data going back to 1997, making easy access to diversified fixed-income markets a relatively new innovation for less sophisticated investors.

Essentially, both of these can be tested over a range of parameters (1-24 months).

Another note: with hypothesis-driven strategy development, the backtest is to be *nothing more than a confirmation of all the hypotheses up to that point*. That is, re-optimizing on the backtest itself means overfitting. Any proposed change to a strategy should come in the form of tested hypotheses, as opposed to running a bunch of backtests and selecting the best trials. Put another way, every single proposed element of a strategy needs some form of strong hypothesis accompanying it in order to be justified.

So, here are the two hypotheses I tested on the corresponding mutual funds:

require(quantmod)
require(PerformanceAnalytics)
require(reshape2)

# Mutual fund analogues of the five ETFs, used for their longer histories.
symbols <- c("CNSAX", "FAHDX", "VUSTX", "VFISX", "PREMX")
getSymbols(symbols, from='1900-01-01')

# Collect adjusted closes into one price matrix and compute daily returns.
prices <- list()
for(symbol in symbols) {
  prices[[symbol]] <- Ad(get(symbol))
}
prices <- do.call(cbind, prices)
colnames(prices) <- substr(colnames(prices), 1, 5)
returns <- na.omit(Return.calculate(prices))

# In-sample period: from the inception of the youngest mutual fund through
# March 2009, since the ETF data begins in April 2009.
sample <- returns['1997-08/2009-03']
monthRets <- apply.monthly(sample, Return.cumulative)

returnRegression <- function(returns, nMonths) {
  # n-month momentum signal: running sum of the past nMonths of monthly
  # returns, lagged by one period so that it predicts the *next* month.
  nMonthAverage <- apply(returns, 2, runSum, n = nMonths)
  nMonthAverage <- xts(nMonthAverage, order.by = index(returns))
  nMonthAverage <- na.omit(lag(nMonthAverage))
  returns <- returns[index(nMonthAverage)]
  
  # Cross-sectional ranks of the momentum signal and of the realized returns.
  rankAvg <- t(apply(nMonthAverage, 1, rank))
  rankReturn <- t(apply(returns, 1, rank))
  
  # Pool all five instruments into single vectors for the regressions.
  meltedAverage <- melt(data.frame(nMonthAverage))
  meltedReturns <- melt(data.frame(returns))
  meltedRankAvg <- melt(data.frame(rankAvg))
  meltedRankReturn <- melt(data.frame(rankReturn))
  
  # Time-series test: next month's return on past momentum, with no intercept
  # (returns are roughly zero-centered). Cross-sectional test: rank-rank
  # regression, with an intercept (ranks are not zero-centered).
  lmfit <- lm(meltedReturns$value ~ meltedAverage$value - 1)
  rankLmfit <- lm(meltedRankReturn$value ~ meltedRankAvg$value)
  # The rbind has three rows: 1 = momentum slope, 2 = rank regression
  # intercept, 3 = rank regression slope.
  return(rbind(summary(lmfit)$coefficients, summary(rankLmfit)$coefficients))
}

pvals <- list()
estimates <- list()
rankPs <- list()
rankEstimates <- list()
for(i in 1:24) {
  tmp <- returnRegression(monthRets, nMonths=i)
  # Row 1 holds the momentum slope; row 3 holds the rank-rank slope
  # (row 2 is the rank regression's intercept, which is not of interest).
  pvals[[i]] <- tmp[1,4]
  estimates[[i]] <- tmp[1,1]
  rankPs[[i]] <- tmp[3,4]
  rankEstimates[[i]] <- tmp[3,1]
}
pvals <- do.call(c, pvals)
estimates <- do.call(c, estimates)
rankPs <- do.call(c, rankPs)
rankEstimates <- do.call(c, rankEstimates)

Essentially, in this case, I run a pooled regression (that is, I pool the five instruments together into one giant vector) and regress the next month's return on the running sum of the previous n months' returns. I also do the same thing using cross-sectional ranks for each month, performing a rank-rank regression. The sample I used was the five mutual funds (CNSAX, FAHDX, VUSTX, VFISX, and PREMX) from their common inception through March 2009; since the data for the final ETF begins in April of 2009, the ETF data is set aside for out-of-sample backtesting.

Here are the results:

plot(estimates, type='h', xlab = 'Months regressed on', ylab='momentum coefficient', 
     main='future returns regressed on past momentum')
plot(pvals, type='h', xlab='Months regressed on', ylab='p-value', main='momentum significance')
abline(h=.05, col='green')
abline(h=.1, col='red')

plot(rankEstimates, type='h', xlab='Months regressed on', ylab='Rank coefficient',
     main='future return ranks regressed on past momentum ranks')
plot(rankPs, type='h', xlab='Months regressed on', ylab='p-value',
     main='rank momentum significance')

It is interesting to note that while much of the momentum literature describes a reversion effect in time-series momentum at 12 months or greater, all of the regression coefficients in this case (even up to 24 months!) proved to be positive, with the very long-term coefficients possessing more statistical significance than the short-term ones. Nevertheless, Cliff Smith's chosen parameters (the two- and four-month settings) possess statistical significance at least at the 10% level. If one were to be highly conservative about rejecting strategies, however, that in and of itself may be reason enough to reject the strategy right here.

However, the rank-rank regression (that is, regressing the future month's cross-sectional rank on the past n-month-sum cross-sectional rank) also shows a positive, statistically significant slope (on the order of 0.10, with a p-value around .013 in the example fit reproduced in the comments below). In short, there is evidence for cross-sectional momentum among these five assets. Furthermore, since VFISX, the short-term treasury fund (the mutual fund analogue of SHY), is among the assets chosen and serves as a proxy for the risk-free rate, including it in the cross-sectional rankings means that in order for a risky fund to be invested in (as this is a top-1 asset rotation strategy), it must outperform the risk-free asset; otherwise, by process of elimination, the strategy invests in the risk-free asset itself.
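As a minimal sketch of that selection logic (the two-month formation period is illustrative here, not necessarily Cliff Smith's exact rule):

# Hold the single fund with the best trailing momentum each month; when all
# risky funds rank below VFISX, the strategy defaults to the risk-free proxy.
momentum <- na.omit(lag(rollapplyr(monthRets, width = 2, FUN = sum)))
holdings <- colnames(momentum)[apply(momentum, 1, which.max)]
head(holdings)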

In upcoming posts, I’ll look into testing hypotheses on signals and rules.

Lastly, Volatility Made Simple has just released a blog post on the performance of volatility-based strategies for the month of August. Given the massive volatility spike, the dispersion in performance of strategies is quite interesting. I’m happy that in terms of YTD returns, the modified version of my strategy is among the top 10 for the year.

Thanks for reading.

NOTE: while I am currently consulting, I am always open to networking, meeting up (Philadelphia and New York City both work), consulting arrangements, and job discussions. Contact me through my email at ilya.kipnis@gmail.com, or through my LinkedIn, found here.

Comments

  1. Ilya, good post. I have two questions:
    Why are you not removing the intercept for the rankings as you do for the returns (y~x-1 vs y~x)? The estimates and probabilities actually refer to the intercept in the case of the rankings.
    Why do you use averages of discrete returns rather than cumulative returns or averages of log returns?

    Keep up the good work.

    • Hello Hugo,

      Actually, I do use the p-value for the regression estimate. The second row is the regression estimate, not the intercept, which you can find accessed inside the loop here:

      for(i in 1:24) {
        tmp <- returnRegression(monthRets, nMonths=i)
        pvals[[i]] <- tmp[1,4]
        estimates[[i]] <- tmp[1,1]
        rankPs[[i]] <- tmp[2,4]
        rankEstimates[[i]] <- tmp[2,1]
      }

      As for averages of discrete returns instead of cumulative returns: ROC is the difference between two points, so this gives me more data. But it's most likely very similar in nature.

      And I don't remove the intercept for the rankings because returns are already zero-centered while ranks aren't, so I keep the intercept there.

      • Maybe I am missing something… The second row seems to be the intercept of the rank linear regression:

        rbind(summary(lmfit)$coefficients, summary(rankLmfit)$coefficients)
                               Estimate   Std. Error    t value     Pr(>|t|)
        meltedAverage$value  0.01829089  0.006436298   2.841835 4.643492e-03
        (Intercept)          2.69224138  0.137225579  19.619093 4.568979e-66
        meltedRankAvg$value  0.10258621  0.041375069   2.479421 1.344357e-02

        Thanks for the explanation about why the intercept is needed.

      • Hugo,

        > a <- rnorm(100)
        > b <- rnorm(100)
        > lmfit <- lm(a ~ b)
        > summary(lmfit)

        Call:
        lm(formula = a ~ b)

        Residuals:
             Min       1Q   Median       3Q      Max
        -2.56744 -0.76535  0.06351  0.76057  2.46539

        Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
        (Intercept) -0.002372   0.105546  -0.022    0.982
        b           -0.002547   0.113137  -0.023    0.982

        Residual standard error: 1.047 on 98 degrees of freedom
        Multiple R-squared: 5.17e-06, Adjusted R-squared: -0.0102
        F-statistic: 0.0005067 on 1 and 98 DF, p-value: 0.9821

        The value is the second row of the coefficients.

        Hope this helps.

  2. Thanks for your post. I suggest adding the line of code

    require(reshape2)

    below the other “require” lines. When I first ran your script, R complained about not finding the “melt” function.

  3. I tested the code with a random portfolio and the rank-rank regression looks very similar. Any thoughts about that?

    This was the code to generate the random rankings (I hope I got it right):

    nMonthAverage <- apply(returns, 2, runSum, n = nMonths)
    nMonthAverage <- xts(nMonthAverage, order.by = index(returns))
    nMonthAverage <- na.omit(lag(nMonthAverage))

    random <- returns
    for(i in 1:nrow(random)) {
      random[i,] <- runif(ncol(random))
    }
    nMonthAverage <- random

  4. Why do you subtract 1 when running the regression here?

    lmfit <- lm(meltedReturns$value ~ meltedAverage$value - 1)

      • My stats knowledge isn’t great. How are you sure the intercept is zero here? I checked the qqplot and it looks fine, but I don’t get the intuition. Thanks

  5. I don’t understand your answer to Hugo.
    As you used rbind, the object tmp consists of three rows:
    the first row is the regression coefficient of meltedAverage$value,
    the second row is the intercept of the rank regression, and
    the third row is the regression coefficient of meltedRankAvg$value.

    So I guess tmp[1,] and tmp[3,] are needed to show the regression coefficients.
