Hypothesis-Driven Development Part V: Stop-Loss, Deflating Sharpes, and Out-of-Sample

This post will demonstrate a stop-loss rule inspired by Andrew Lo’s paper “When Do Stop-Loss Rules Stop Losses?” Furthermore, it will demonstrate how to deflate a Sharpe ratio to account for the total number of trials conducted, as presented in a paper by David H. Bailey and Marcos Lopez De Prado. Lastly, the strategy will be tested out-of-sample on the ETFs, rather than the mutual funds that have been used up until now (which actually cannot be traded more than once every two months, but have been used simply for the purpose of demonstration).

First, however, I’d like to fix some code from the last post and append some results.

A reader asked about displaying the max drawdown for each of the previous rule-testing variants based off of volatility control, and Brian Peterson also recommended displaying max leverage, which this post will provide.

Here’s the updated rule backtest code:

ruleBacktest <- function(returns, nMonths, dailyReturns,
                         nSD = 126, volTarget = .1) {
  # momentum signal: rolling nMonths-month sum of monthly returns
  nMonthAverage <- apply(returns, 2, runSum, n = nMonths)
  nMonthAverage <- na.omit(xts(nMonthAverage, order.by = index(returns)))
  nMonthAvgRank <- t(apply(nMonthAverage, 1, rank))
  nMonthAvgRank <- xts(nMonthAvgRank, order.by = index(nMonthAverage))
  selection <- (nMonthAvgRank == 5) * 1 #select highest average performance (rank 5 of 5 assets)
  dailyBacktest <- Return.portfolio(R = dailyReturns, weights = selection)
  # leverage needed to hit the annualized volatility target
  constantVol <- volTarget / (runSD(dailyBacktest, n = nSD) * sqrt(252))
  # on = "months" belongs to endpoints(), not to the subsetting bracket
  monthlyLeverage <- na.omit(constantVol[endpoints(constantVol, on = "months"), ])
  wts <- cbind(monthlyLeverage, 1 - monthlyLeverage)
  constantVolComponents <- cbind(dailyBacktest, 0) # second column is cash
  out <- Return.portfolio(R = constantVolComponents, weights = wts)
  out <- apply.monthly(out, Return.cumulative)
  maxLeverage <- max(monthlyLeverage, na.rm = TRUE)
  return(list(out, maxLeverage))
}

t1 <- Sys.time()
allPermutations <- list()
allDDs <- list()
leverages <- list()
for(i in seq(21, 252, by = 21)) {
  monthVariants <- list()
  ddVariants <- list()
  leverageVariants <- list()
  for(j in 1:12) {
    trial <- ruleBacktest(returns = monthRets, nMonths = j, dailyReturns = sample, nSD = i)
    sharpe <- table.AnnualizedReturns(trial[[1]])[3,]
    dd <- maxDrawdown(trial[[1]])
    monthVariants[[j]] <- sharpe
    ddVariants[[j]] <- dd
    leverageVariants[[j]] <- trial[[2]]
  }
  allPermutations[[i]] <- do.call(c, monthVariants)
  allDDs[[i]] <- do.call(c, ddVariants)
  leverages[[i]] <- do.call(c, leverageVariants)
}
allPermutations <- do.call(rbind, allPermutations)
allDDs <- do.call(rbind, allDDs)
leverages <- do.call(rbind, leverages)
t2 <- Sys.time()
print(t2-t1)
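The regressions further down refer to meltedDDs and meltedLeverage, whose construction is not shown in this post. Here is a minimal sketch of how such long-format data frames could be built, assuming the same layout as meltedSharpes in Part IV (rows are volatility lookbacks, columns are momentum lookbacks). The matrices below are random stand-ins, and meltMatrix is a hypothetical base-R helper rather than the reshape2 call used elsewhere in the series.

```r
# Toy stand-ins for the allDDs and leverages matrices produced by the loop above:
# 12 volatility lookbacks (rows) x 12 momentum lookbacks (columns), random values.
set.seed(42)
volPeriods <- seq(21, 252, by = 21)
allDDs <- matrix(runif(144, 0.05, 0.40), nrow = 12,
                 dimnames = list(as.character(volPeriods), as.character(1:12)))
leverages <- matrix(runif(144, 1, 5), nrow = 12,
                    dimnames = list(as.character(volPeriods), as.character(1:12)))

# Hypothetical helper: base-R equivalent of melting a matrix into long format.
# expand.grid varies its first argument fastest, which matches the
# column-major order of as.numeric(mat).
meltMatrix <- function(mat, valueName) {
  out <- expand.grid(volFormation = as.numeric(rownames(mat)),
                     momentumFormation = as.numeric(colnames(mat)))
  out[[valueName]] <- as.numeric(mat)
  out
}

meltedDDs <- meltMatrix(allDDs, "MaxDD")
meltedLeverage <- meltMatrix(leverages, "MaxLeverage")

# Passing data = keeps the coefficient names short and avoids mixing data frames
ddFit <- lm(MaxDD ~ volFormation + momentumFormation, data = meltedDDs)
print(coef(ddFit))
```

With random inputs the coefficients are meaningless; the point is only the shape of the data frames that the regressions expect.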

Drawdowns:

Leverage:

Here are the results presented as a hypothesis test–a linear regression of drawdowns and leverage against momentum formation period and volatility calculation period:

ddLM <- lm(meltedDDs$MaxDD~meltedDDs$volFormation + meltedDDs$momentumFormation)
summary(ddLM)

Call:
lm(formula = meltedDDs$MaxDD ~ meltedDDs$volFormation + meltedDDs$momentumFormation)

Residuals:
     Min       1Q   Median       3Q      Max
-0.08022 -0.03434 -0.00135  0.02911  0.20077

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
(Intercept)                  0.240146   0.010922   21.99  < 2e-16 ***
meltedDDs$volFormation      -0.000484   0.000053   -9.13  6.5e-16 ***
meltedDDs$momentumFormation  0.001533   0.001112    1.38     0.17
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0461 on 141 degrees of freedom
Multiple R-squared: 0.377, Adjusted R-squared: 0.368
F-statistic: 42.6 on 2 and 141 DF, p-value: 3.32e-15

levLM <- lm(meltedLeverage$MaxLeverage~meltedLeverage$volFormation + meltedDDs$momentumFormation)
summary(levLM)

Call:
lm(formula = meltedLeverage$MaxLeverage ~ meltedLeverage$volFormation +
meltedDDs$momentumFormation)

Residuals:
    Min      1Q  Median      3Q     Max
-0.9592 -0.5179 -0.0908  0.3679  3.1022

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)
(Intercept)                  4.076870   0.164243   24.82   <2e-16 ***
meltedLeverage$volFormation -0.009916   0.000797  -12.45   <2e-16 ***
meltedDDs$momentumFormation  0.009869   0.016727    0.59     0.56
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.693 on 141 degrees of freedom
Multiple R-squared: 0.524, Adjusted R-squared: 0.517
F-statistic: 77.7 on 2 and 141 DF, p-value: <2e-16

The interpretation is straightforward: the shorter-term volatility estimates are unstable because the system rotates into a single asset at a time. The one-month volatility estimate is particularly silly. Imagine the system has just switched from the lowest-volatility instrument to the highest. It would then take on excessive leverage and get blown up that month for no particularly good reason. A longer-term volatility estimate does much better for this system. So, while the Sharpe ratio is generally improved by volatility targeting, the results become far more palatable with a more stable volatility calculation, which brings maximum leverage down to about 2 when targeting an annualized volatility of 10%.

Also of note, the volatility computation period matters far more than the momentum formation period here (the momentum formation coefficient is insignificant in both regressions), which lends credence, at least in this case, to the many people who say “the individual signal rules matter far less than the position-sizing rules!”. According to others, position sizing is often a way to mask only marginally effective (read: bad) strategies with a separate layer that creates a better result. I’m not sure which side of the debate (even assuming there is one) I fall on, but for what it’s worth, there it is.
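To make the blow-up scenario concrete, here is a small sketch with simulated data. The volatility levels (15% and 3% annualized) and the variable names are assumptions chosen for illustration, not values taken from the backtest. The strategy history is 231 days in a high-volatility holding followed by 21 days in a low-volatility one, and the leverage is computed the same way as in ruleBacktest: target volatility divided by trailing annualized volatility.

```r
set.seed(1)
# Hypothetical strategy history: 231 high-vol days, then 21 low-vol days
highVol <- rnorm(231, sd = 0.15 / sqrt(252))
lowVol  <- rnorm(21,  sd = 0.03 / sqrt(252))
stratRets <- c(highVol, lowVol)

volTarget <- 0.1

# Trailing annualized volatility over the last n observations
annVol <- function(x, n) sd(tail(x, n)) * sqrt(252)

lev21  <- volTarget / annVol(stratRets, 21)   # sees only the quiet last month
lev252 <- volTarget / annVol(stratRets, 252)  # sees the whole year

print(c(lev21 = lev21, lev252 = lev252))
```

The 21-day estimate only sees the quiet month, so it calls for several times the leverage of the 252-day estimate. If the next holding is a high-volatility instrument, that leverage gets applied to exactly the wrong asset.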

Moving on, I want to test one more rule, inspired by Andrew Lo’s stop-loss rule. As I interpret it, it works like this: it computes a running standard deviation, and if the drawdown exceeds some multiple of that running standard deviation, the strategy sits out for a fixed period of time and then re-enters. According to Andrew Lo, stop-losses help momentum strategies, so it seems as good a rule to test as any.

However, rather than test different permutations of the stop rule on all 144 prior volatility-adjusted configurations, I’m going to use an ensemble strategy. This was inspired by a conversation with Adam Butler, the CEO of ReSolve Asset Management, who stated that “we know momentum exists, but we don’t know the perfect way to measure it”. From the section I just finished, I take all 12 momentum formation periods, each with a 252-day rolling annualized volatility calculation, and equal-weight them every month.

Here are the base case results from that trial (bringing our total to 169).

strat <- list()
for(i in 1:12) {
  strat[[i]] <- ruleBacktest(returns = monthRets, nMonths = i, dailyReturns = sample, nSD = 252)[[1]]
}
strat <- do.call(cbind, strat)
strat <- Return.portfolio(R = na.omit(strat), rebalance_on="months")

rbind(table.AnnualizedReturns(strat), maxDrawdown(strat), CalmarRatio(strat))

With the following result:

                          portfolio.returns
Annualized Return                   0.12230
Annualized Std Dev                  0.10420
Annualized Sharpe (Rf=0%)           1.17340
Worst Drawdown                      0.09616
Calmar Ratio                        1.27167

Of course, it is also worth noting that the annualized standard deviation is indeed very close to 10%, even with the ensemble. And it’s nice that there is a Sharpe past 1. That said, given that these are mutual funds being backtested, these results are optimistic due to the unrealistic execution assumptions (can’t trade sooner than once every *two* months).

Anyway, let’s introduce our stop-loss rule, inspired by Andrew Lo’s paper.

loStopLoss <- function(returns, sdPeriod = 12, sdScaling = 1, sdThresh = 1.5, cooldown = 3) {
  stratRets <- list()
  count <- 1
  stratComplete <- FALSE
  originalRets <- returns
  ddThresh <- -runSD(returns, n = sdPeriod) * sdThresh * sdScaling
  while(!stratComplete) {
    retDD <- PerformanceAnalytics:::Drawdowns(returns)
    DDbreakthrough <- retDD < ddThresh & lag(retDD) > ddThresh
    firstBreak <- which.max(DDbreakthrough) #first threshold breakthrough, if 1, we have no breakthrough
    #the above line is unintuitive since this is a boolean vector, so it returns the first value of TRUE
    if(firstBreak > 1) { #we have a drawdown breakthrough if this is true
      stratRets[[count]] <- returns[1:firstBreak,] #subset returns through our threshold breakthrough
      nextPoint <- firstBreak + cooldown + 1 #next point of re-entry is the point after the cooldown period
      if(nextPoint <= (nrow(returns)-1)) { #if we can re-enter, subset the returns and return to top of loop
        returns <- returns[nextPoint:nrow(returns),]
        ddThresh <- ddThresh[nextPoint:nrow(ddThresh),]
        count <- count+1
      } else { #re-entry point is after data exhausted, end strategy
        stratComplete <- TRUE
      }
    } else { #there are no more critical drawdown breakthroughs, end strategy
      stratRets[[count]] <- returns
      stratComplete <- TRUE
    }
  }
  stratRets <- do.call(rbind, stratRets) #combine returns
  expandRets <- cbind(stratRets, originalRets) #account for all the days we missed
  expandRets[is.na(expandRets[,1]), 1] <- 0 #cash positions will be zero
  rets <- expandRets[,1]
  colnames(rets) <- paste(cooldown, sdThresh, sep="_")
  return(rets)
}

Essentially, the way it works is like this: the function computes all the drawdowns for a return series, along with its running standard deviation (non-annualized–if you want to annualize it, change the sdScaling parameter to something like sqrt(12) for monthly or sqrt(252) for daily data). Next, it looks for when the drawdown crossed a critical threshold, then cuts off that portion of returns and standard deviation history, and moves ahead in history by the cooldown period specified, and repeats. Most of the code is simply dealing with corner cases (is there even a time to use the stop rule? What about iterating when there isn’t enough data left?), and then putting the results back together again.
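One detail of the function above is worth checking when adapting it. The "no breakthrough" test relies on which.max returning 1 for an all-FALSE vector, but which.max discards NA values. Since a rolling standard deviation should be NA until its window fills, the boolean vector can start with NAs, and an all-FALSE vector with leading NAs then returns the first non-NA position rather than 1. The sketch below shows that behavior on plain logical vectors; firstTrue is a hypothetical helper, not part of the original code, and whether this affects the reported results depends on the data.

```r
# Boolean "breakthrough" vectors with leading NAs, as produced when the
# threshold comes from a rolling window that has not yet filled
noBreak  <- c(NA, NA, FALSE, FALSE, FALSE)
oneBreak <- c(NA, NA, FALSE, TRUE, FALSE)

idxNoBreak  <- which.max(noBreak)   # first non-NA element, NOT 1
idxOneBreak <- which.max(oneBreak)  # position of the first TRUE

# Hypothetical alternative: which() drops NA and FALSE, so the result is
# NA when there is no TRUE at all
firstTrue <- function(x) which(x)[1]

print(c(idxNoBreak, idxOneBreak))
print(c(firstTrue(noBreak), firstTrue(oneBreak)))
```

A test such as is.na(firstTrue(x)) distinguishes "no breakthrough" from "breakthrough at some position" without depending on where the NAs fall.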

In any case, for the sake of simplicity, this function doesn’t use two different time scales (i.e., compute volatility using daily data, make decisions monthly), so I’m sticking with a 12-month rolling volatility, as opposed to a 252-day rolling volatility multiplied by the square root of 21.

Finally, here are another 54 runs to see if Andrew Lo’s stop-loss rule works here. Essentially, the intuition behind this is that if the strategy breaks down, it’ll continue to break down, so it would be prudent to just turn it off for a little while.

Here are the trial runs:


threshVec <- seq(0, 2, by=.25)
cooldownVec <- c(1:6)
sharpes <- list()
params <- expand.grid(threshVec, cooldownVec)
for(i in 1:nrow(params)) {
  configuration <- loStopLoss(returns = strat, sdThresh = params[i,1],
                              cooldown = params[i, 2])
  sharpes[[i]] <- table.AnnualizedReturns(configuration)[3,]
}
sharpes <- do.call(c, sharpes)

loStoplossFrame <- cbind(params, sharpes)
loStoplossFrame$improvement <- loStoplossFrame[,3] - table.AnnualizedReturns(strat)[3,]

colnames(loStoplossFrame) <- c("Threshold", "Cooldown", "Sharpe", "Improvement")

And a plot of the results.

ggplot(loStoplossFrame, aes(x = Threshold, y = Cooldown, fill=Improvement)) +
  geom_tile()+scale_fill_gradient2(high="green", mid="yellow", low="red", midpoint = 0)

Result:

At this level, and at this frequency (retaining the monthly decision-making process), the stop-loss rule does essentially nothing to improve the risk-reward trade-off in the best-case scenarios, and in most scenarios it simply hurts. That’s 54 trials down the drain, bringing us up to 223 trials. So, what does the final result look like?

charts.PerformanceSummary(strat)

Here’s the final in-sample equity curve, and the first one featured in this entire series. This is, of course, a *feature* of hypothesis-driven development. Playing whack-a-mole with equity curve bumps is a textbook case of overfitting. So, without further ado:

And now we can see why stop-loss rules generally didn’t add any value to this strategy. Simply, it had very few periods of sustained losses at the monthly frequency, and thus, very little opportunity for a stop-loss rule to add value. Sure, the occasional negative month crept in, but there was no period of sustained losses. Furthermore, Yahoo Finance may not have perfect fidelity on dividends on mutual funds from the late 90s to early 2000s, so the initial flat performance may also be a rather conservative estimate on the strategy’s performance (then again, as I stated before, using mutual funds themselves is optimistic given the unrealistic execution assumptions, so maybe it cancels out). Now, if this equity curve were to be presented without any context, one may easily question whether or not it was curve-fit. To an extent, one can argue that the volatility computation period may be optimized, though I’d hardly call a 252-day (one-year) rolling volatility estimate a curve-fit.

Next, I’d like to introduce another concept on this blog that I’ve seen colloquially addressed in other parts of the quantitative blogging space, particularly by Mike Harris of Price Action Lab, namely that of multiple hypothesis testing, and about the need to correct for that.

Luckily, Drs. David H. Bailey and Marcos Lopez De Prado wrote a paper that addresses just that. Also, I’d like to note one very cool thing about this paper: it actually has a worked-out numerical example! In my opinion, there are very few things as helpful as a simple result that, in the span of one page, turns a collection of mathematical symbols into a number and so demonstrates what those symbols actually mean. Oh, and it also includes *code* in the appendix (albeit in Python, even though, you know, R is far more developed. If someone can get Marcos Lopez De Prado to switch to R, aka the better research language, that’d be a godsend!).

In any case, here’s the formula for the deflated Sharpe ratio, implemented straight from the paper.

deflatedSharpe <- function(sharpe, nTrials, varTrials, skew, kurt, numPeriods, periodsInYear) {
  emc <- .5772
  sr0_term1 <- (1 - emc) * qnorm(1 - 1/nTrials)
  sr0_term2 <- emc * qnorm(1 - 1/nTrials * exp(-1))
  sr0 <- sqrt(varTrials * 1/periodsInYear) * (sr0_term1 + sr0_term2)

  numerator <- (sharpe/sqrt(periodsInYear) - sr0)*sqrt(numPeriods - 1)

  skewnessTerm <- 1 - skew * sharpe/sqrt(periodsInYear)
  kurtosisTerm <- (kurt-1)/4*(sharpe/sqrt(periodsInYear))^2

  denominator <- sqrt(skewnessTerm + kurtosisTerm)

  result <- pnorm(numerator/denominator)
  pval <- 1-result
  return(pval)
}
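As a sanity check, the function can be run on the worked numerical example from the Bailey and Lopez De Prado paper. The inputs below are that example as recalled (annualized Sharpe of 2.5, 100 trials, variance of trial Sharpes of 0.5, skewness of -3, kurtosis of 10, 1250 daily observations, 250 periods per year), so treat them as an assumption and check them against the paper. The function is repeated so the block runs on its own.

```r
# Same function as above, repeated so this block is self-contained
deflatedSharpe <- function(sharpe, nTrials, varTrials, skew, kurt, numPeriods, periodsInYear) {
  emc <- .5772
  sr0_term1 <- (1 - emc) * qnorm(1 - 1/nTrials)
  sr0_term2 <- emc * qnorm(1 - 1/nTrials * exp(-1))
  sr0 <- sqrt(varTrials * 1/periodsInYear) * (sr0_term1 + sr0_term2)
  numerator <- (sharpe/sqrt(periodsInYear) - sr0) * sqrt(numPeriods - 1)
  skewnessTerm <- 1 - skew * sharpe/sqrt(periodsInYear)
  kurtosisTerm <- (kurt - 1)/4 * (sharpe/sqrt(periodsInYear))^2
  denominator <- sqrt(skewnessTerm + kurtosisTerm)
  1 - pnorm(numerator/denominator)
}

# Inputs assumed from the paper's worked example (verify against the paper)
pval <- deflatedSharpe(sharpe = 2.5, nTrials = 100, varTrials = 0.5,
                       skew = -3, kurt = 10,
                       numPeriods = 1250, periodsInYear = 250)
dsr <- 1 - pval  # the paper reports the deflated Sharpe ratio itself
print(round(c(pValue = pval, DSR = dsr), 4))
```

The deflated Sharpe ratio comes out at roughly 0.90, i.e. a p-value of roughly 0.10. In that example, an annualized Sharpe ratio of 2.5 is not significant at the 5% level once the 100 trials and the non-normal returns are accounted for.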

The inputs are the strategy’s Sharpe ratio, the number of backtest runs, the variance of the Sharpe ratios of those backtest runs, the skewness of the candidate strategy, its non-excess kurtosis, the number of periods in the backtest, and the number of periods in a year. Unlike the De Prado paper, I choose to return the p-value (i.e., 1 minus the deflated Sharpe ratio, as in the last lines of the function).
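For reference, here is the formula the function implements, in notation loosely following the paper. The mapping to the R arguments is an interpretation of the code: N is nTrials, V[SR_n] is varTrials divided by periodsInYear, T is numPeriods, the estimated skewness and kurtosis are skew and kurt, the estimated Sharpe ratio is sharpe divided by the square root of periodsInYear, gamma is the Euler-Mascheroni constant (emc), and Z is the standard normal CDF (pnorm, with qnorm as its inverse).

```latex
\[
SR_0 = \sqrt{V\!\left[\widehat{SR}_n\right]}
       \left( (1-\gamma)\, Z^{-1}\!\left[1 - \frac{1}{N}\right]
            + \gamma\, Z^{-1}\!\left[1 - \frac{1}{N}\, e^{-1}\right] \right)
\]
\[
DSR = Z\!\left[
  \frac{\left(\widehat{SR} - SR_0\right)\sqrt{T-1}}
       {\sqrt{1 - \hat{\gamma}_3\, \widehat{SR} + \frac{\hat{\gamma}_4 - 1}{4}\, \widehat{SR}^{2}}}
\right],
\qquad p = 1 - DSR
\]
```

All Sharpe ratios inside the formula are per-period (non-annualized), which is why the code divides by the square root of periodsInYear before plugging them in.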

Let’s collect all our Sharpe ratios now.

allSharpes <- c(as.numeric(table.AnnualizedReturns(sigBoxplots)[3,]),
meltedSharpes$Sharpe,
as.numeric(table.AnnualizedReturns(strat)[3,]),
loStoplossFrame$Sharpe)

And now, let’s plug and chug!

stratSignificant <- deflatedSharpe(sharpe = as.numeric(table.AnnualizedReturns(strat)[3,]),
                                   nTrials = length(allSharpes), varTrials = var(allSharpes),
                                   skew = as.numeric(skewness(strat)), kurt = as.numeric(kurtosis(strat)) + 3,
                                   numPeriods = nrow(strat), periodsInYear = 12)

And the result!

> stratSignificant
[1] 0.01311

Success! At least at the 5% level: a p-value of 0.013 clears the 5% threshold, but it fails at the 1% level and at any stricter level.

So, one last thing! Out-of-sample testing on ETFs (and mutual funds during the ETF burn-in period)!

symbols2 <- c("CWB", "JNK", "TLT", "SHY", "PCY")
getSymbols(symbols2, from='1900-01-01')
prices2 <- list()
for(tmp in symbols2) {
  prices2[[tmp]] <- Ad(get(tmp))
}
prices2 <- do.call(cbind, prices2)
colnames(prices2) <- substr(colnames(prices2), 1, 3)
returns2 <- na.omit(Return.calculate(prices2))

monthRets2 <- apply.monthly(returns2, Return.cumulative)

oosStrat <- list()
for(i in 1:12) {
  oosStrat[[i]] <- ruleBacktest(returns = monthRets2, nMonths = i, dailyReturns = returns2, nSD = 252)[[1]]
}
oosStrat <- do.call(cbind, oosStrat)
oosStrat <- Return.portfolio(R = na.omit(oosStrat), rebalance_on="months")

symbols <- c("CNSAX", "FAHDX", "VUSTX", "VFISX", "PREMX")
getSymbols(symbols, from='1900-01-01')
prices <- list()
for(symbol in symbols) {
  prices[[symbol]] <- Ad(get(symbol))
}
prices <- do.call(cbind, prices)
colnames(prices) <- substr(colnames(prices), 1, 5)
oosMFreturns <- na.omit(Return.calculate(prices))
oosMFmonths <- apply.monthly(oosMFreturns, Return.cumulative)

oosMF <- list()
for(i in 1:12) {
  oosMF[[i]] <- ruleBacktest(returns = oosMFmonths, nMonths = i, dailyReturns = oosMFreturns, nSD=252)[[1]]
}
oosMF <- do.call(cbind, oosMF)
oosMF <- Return.portfolio(R = na.omit(oosMF), rebalance_on="months")
oosMF <- oosMF["2009-04/2011-03"]

fullOOS <- rbind(oosMF, oosStrat)

rbind(table.AnnualizedReturns(fullOOS), maxDrawdown(fullOOS), CalmarRatio(fullOOS))
charts.PerformanceSummary(fullOOS)

And the results:

                          portfolio.returns
Annualized Return                    0.1273
Annualized Std Dev                   0.0901
Annualized Sharpe (Rf=0%)            1.4119
Worst Drawdown                       0.1061
Calmar Ratio                         1.1996

And one more equity curve (only the second!).

In other words, the out-of-sample statistics are comparable to the in-sample statistics. The Sharpe ratio is higher, the Calmar slightly lower. But on the whole, the performance has kept up. Unfortunately, the strategy is currently in a drawdown, but that’s the breaks.

So, whew. That concludes my first go at hypothesis-driven development, and has hopefully at least demonstrated the process to a satisfactory degree. What started off as a toy strategy went from a rejection to a non-rejection to demonstrating ideas from three separate papers, with out-of-sample statistics that largely matched, if not outperformed, the in-sample statistics. For those thinking about investing in this strategy (again, here is the strategy: take 12 different portfolios, each selecting the asset with the highest momentum over months 1-12, target an annualized volatility of 10%, with volatility defined as the rolling annualized 252-day standard deviation, and equal-weight them every month), what I didn’t cover was turnover and taxes (this is a bond ETF strategy, so dividends will play a large role).

Now, one other request–many of the ideas for this blog come from my readers. I am especially interested in things to think about from readers with line-management responsibilities, as I think many of the questions from those individuals are likely the most universally interesting ones. If you’re one such individual, I’d appreciate an introduction, and knowing who more of the individuals in my reader base are.

Thanks for reading.

NOTE: while I am currently consulting, I am always open to networking, meeting up, consulting arrangements, and job discussions. Contact me through my email at ilya.kipnis@gmail.com, or through my LinkedIn, found here.

Hypothesis Driven Development Part IV: Testing The Barroso/Santa Clara Rule

This post will deal with applying the constant-volatility procedure written about by Barroso and Santa-Clara in their paper “Momentum Has Its Moments”.

The last two posts dealt with evaluating the intelligence of the signal-generation process. The strategy showed itself to be only marginally better than randomly tossing darts against a dartboard, and I was ready to reject it in order to move on to topics that are less of a toy example than a little rotation strategy. However, Brian Peterson told me to see this strategy through to the end, including testing out rule processes.

First off, to make a distinction, rules are not signals. Rules are essentially a way to quantify what exactly to do assuming one acts upon a signal. Things such as position sizing, stop-loss processes, and so on, all fall under rule processes.

This rule deals with using leverage in order to target a constant volatility.

So here’s the idea: in their paper, Pedro Barroso and Pedro Santa-Clara took the Fama-French momentum data, and found that the classic WML strategy certainly outperforms the market, but it has a critical downside, namely that of momentum crashes, in which being on the wrong side of a momentum trade will needlessly expose a portfolio to catastrophically large drawdowns. While this strategy is a long-only strategy (and with fixed-income ETFs, no less), and so would seem to be more robust against such massive drawdowns, there’s no reason to leave money on the table. To note, not only have Barroso and Santa-Clara covered this phenomenon, but so have others, such as Tony Cooper in his paper “Alpha Generation and Risk Smoothing Using Managed Volatility”.

In any case, the setup here is simple: take the previous portfolios, consisting of 1-12 month momentum formation periods, and every month, compute the annualized standard deviation, using a 21-252 (by 21) formation period, for a total of 12 x 12 = 144 trials. (So this will put the total trials run so far at 24 + 144 = 168…bonus points if you know where this tidbit is going to go).

Here’s the code (again, following on from the last post, which follows from the second post, which follows from the first post in this series).

require(reshape2)
require(ggplot2)

ruleBacktest <- function(returns, nMonths, dailyReturns,
                         nSD=126, volTarget = .1) {
  nMonthAverage <- apply(returns, 2, runSum, n = nMonths)
  nMonthAverage <- xts(nMonthAverage, order.by = index(returns))
  nMonthAvgRank <- t(apply(nMonthAverage, 1, rank))
  nMonthAvgRank <- xts(nMonthAvgRank, order.by=index(returns))
  selection <- (nMonthAvgRank==5) * 1 #select highest average performance
  dailyBacktest <- Return.portfolio(R = dailyReturns, weights = selection)
  constantVol <- volTarget/(runSD(dailyBacktest, n = nSD) * sqrt(252))
  monthlyLeverage <- na.omit(constantVol[endpoints(constantVol, on = "months"), ])
  wts <- cbind(monthlyLeverage, 1-monthlyLeverage)
  constantVolComponents <- cbind(dailyBacktest, 0)
  out <- Return.portfolio(R = constantVolComponents, weights = wts)
  out <- apply.monthly(out, Return.cumulative)
  return(out)
}

t1 <- Sys.time()
allPermutations <- list()
for(i in seq(21, 252, by = 21)) {
  monthVariants <- list()
  for(j in 1:12) {
    trial <- ruleBacktest(returns = monthRets, nMonths = j, dailyReturns = sample, nSD = i)
    sharpe <- table.AnnualizedReturns(trial)[3,]
    monthVariants[[j]] <- sharpe
  }
  allPermutations[[i]] <- do.call(c, monthVariants)
}
allPermutations <- do.call(rbind, allPermutations)
t2 <- Sys.time()
print(t2-t1)

rownames(allPermutations) <- seq(21, 252, by = 21)
colnames(allPermutations) <- 1:12

baselineSharpes <- table.AnnualizedReturns(algoPortfolios)[3,]
baselineSharpeMat <- matrix(rep(baselineSharpes, 12), ncol=12, byrow=TRUE)

diffs <- allPermutations - as.numeric(baselineSharpeMat)
require(reshape2)
require(ggplot2)
meltedDiffs <-melt(diffs)

colnames(meltedDiffs) <- c("volFormation", "momentumFormation", "sharpeDifference")
ggplot(meltedDiffs, aes(x = momentumFormation, y = volFormation, fill=sharpeDifference)) + 
  geom_tile()+scale_fill_gradient2(high="green", mid="yellow", low="red")

meltedSharpes <- melt(allPermutations)
colnames(meltedSharpes) <- c("volFormation", "momentumFormation", "Sharpe")
ggplot(meltedSharpes, aes(x = momentumFormation, y = volFormation, fill=Sharpe)) + 
  geom_tile()+scale_fill_gradient2(high="green", mid="yellow", low="red", midpoint = mean(allPermutations))

Again, there’s no parallel code since this is a relatively small example, and I don’t know which OS any given instance of R runs on (windows/linux have different parallelization infrastructure).

So the idea here is to simply compare the Sharpe ratios with different volatility lookback periods against the baseline signal-process-only portfolios. The reason I use Sharpe ratios, and not say, CAGR, volatility, or drawdown is that Sharpe ratios are scale-invariant. In this case, I’m targeting an annualized volatility of 10%, but with a higher targeted volatility, one can obtain higher returns at the cost of higher drawdowns, or be more conservative. But the Sharpe ratio should stay relatively consistent within reasonable bounds.
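The scale-invariance claim can be checked with a few lines of R. This is a minimal sketch on simulated monthly returns, using a simple arithmetic Sharpe ratio (mean over standard deviation, annualized by the square root of 12, risk-free rate of zero); arithSharpe is a hypothetical helper. As far as I know, table.AnnualizedReturns uses a geometric annualized return by default, so with that function the invariance holds only approximately, which fits the "relatively consistent within reasonable bounds" wording above.

```r
set.seed(11)
rets <- rnorm(120, mean = 0.008, sd = 0.03)  # simulated monthly returns

# Hypothetical helper: arithmetic annualized Sharpe ratio with Rf = 0
arithSharpe <- function(x, periodsInYear = 12) {
  mean(x) / sd(x) * sqrt(periodsInYear)
}

baseSharpe    <- arithSharpe(rets)
leveredSharpe <- arithSharpe(2 * rets)  # same strategy at twice the leverage

print(c(base = baseSharpe, levered = leveredSharpe))
print(c(baseVol = sd(rets) * sqrt(12), leveredVol = sd(2 * rets) * sqrt(12)))
```

Doubling the leverage doubles both the mean and the standard deviation, so the ratio is unchanged while the volatility (and with it the drawdowns) scales up.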

So here are the results:

Sharpe improvements:

In this case, the diagram shows that on the whole, once the volatility estimation period becomes long enough, the results are generally positive. Namely, if one uses a very short estimation period, the volatility estimate depends more on last month’s choice of instrument than on the longer-term volatility of the system itself, which can create poor forecasts. Also to note is that the one-month momentum formation period doesn’t seem overly amenable to the constant volatility targeting scheme (there’s basically little improvement if not a slight drag on risk-adjusted performance). This is interesting in that the baseline Sharpe ratio for the one-period formation is among the best of the baseline performances. However, on the whole, the volatility targeting actually does improve risk-adjusted performance of the system, even one as simple as throwing all your money into one asset every month based on a single momentum signal.

Absolute Sharpe ratios:

In this case, the absolute Sharpe ratios look fairly solid for such a simple system. The 3, 7, and 9 month variants are slightly lower, but once the volatility estimation period reaches between 126 and 252 days, the results are fairly robust. The Barroso and Santa-Clara paper uses a period of 126 days to estimate annualized volatility, which looks solid across the entire momentum formation period spectrum.

In any case, it seems the verdict is that a constant volatility target improves results.

Thanks for reading.


Hypothesis Driven Development Part III: Monte Carlo In Asset Allocation Tests

This post will show how to use Monte Carlo to test for signal intelligence.

Although I had rejected this strategy in the last post, I was asked to do a Monte Carlo analysis of a thousand random portfolios to see how the various signal processes performed against that distribution. The process is quite simple: as I’m selecting one asset each month to hold, I simply generate a random number between 1 and the number of assets (5 in this case), and hold that asset for the month. Repeat this for every month, then repeat the whole thing a thousand times, and see where the signal processes fall across that distribution.

I didn’t use parallel processing here since Windows and Linux-based R have different parallel libraries, and in the interest of having the code work across all machines, I decided to leave it off.

Here’s the code:

randomAssetPortfolio <- function(returns) {
  numAssets <- ncol(returns)
  numPeriods <- nrow(returns)
  assetSequence <- sample.int(numAssets, numPeriods, replace=TRUE)
  wts <- matrix(nrow = numPeriods, ncol=numAssets, 0)
  wts <- xts(wts, order.by=index(returns))
  for(i in 1:nrow(wts)) {
    wts[i,assetSequence[i]] <- 1
  }
  randomPortfolio <- Return.portfolio(R = returns, weights = wts)
  return(randomPortfolio)
}

t1 <- Sys.time()
randomPortfolios <- list()
set.seed(123)
for(i in 1:1000) {
  randomPortfolios[[i]] <- randomAssetPortfolio(monthRets)
}
randomPortfolios <- do.call(cbind, randomPortfolios)
t2 <- Sys.time()
print(t2-t1)

algoPortfolios <- sigBoxplots[,1:12]
randomStats <- table.AnnualizedReturns(randomPortfolios)
algoStats <- table.AnnualizedReturns(algoPortfolios)

par(mfrow=c(3,1))
hist(as.numeric(randomStats[1,]), breaks = 20, main = 'histogram of monte carlo annualized returns',
     xlab='annualized returns')
abline(v=as.numeric(algoStats[1,]), col='red')
hist(as.numeric(randomStats[2,]), breaks = 20, main = 'histogram of monte carlo volatilities',
     xlab='annualized vol')
abline(v=as.numeric(algoStats[2,]), col='red')
hist(as.numeric(randomStats[3,]), breaks = 20, main = 'histogram of monte carlo Sharpes',
     xlab='Sharpe ratio')
abline(v=as.numeric(algoStats[3,]), col='red')

allStats <- cbind(randomStats, algoStats)
aggregateMean <- apply(allStats, 1, mean)
aggregateDevs <- apply(allStats, 1, sd)

algoPs <- 1-pnorm(as.matrix((algoStats - aggregateMean)/aggregateDevs))

plot(as.numeric(algoPs[1,])~c(1:12), main='Return p-values',
     xlab='Formation period', ylab='P-value')
abline(h=0.05, col='red')
abline(h=.1, col='green')

plot(1-as.numeric(algoPs[2,])~c(1:12), ylim=c(0, .5), main='Annualized vol p-values',
     xlab='Formation period', ylab='P-value')
abline(h=0.05, col='red')
abline(h=.1, col='green')

plot(as.numeric(algoPs[3,])~c(1:12), main='Sharpe p-values',
     xlab='Formation period', ylab='P-value')
abline(h=0.05, col='red')
abline(h=.1, col='green')

And here are the results:


In short, compared to monkeys throwing darts, to use some phrasing from the Price Action Lab blog, these signal processes are only marginally intelligent, if at all, depending on the variation one chooses. Still, I was advised to see this process through to the end and evaluate rules, so next time, I’ll evaluate one easy-to-implement rule.
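The p-values computed above convert a z-score to a probability with pnorm, which assumes the Monte Carlo statistics are roughly normally distributed. A different technique is the empirical p-value: the fraction of random portfolios that did at least as well as the signal. Here is a minimal sketch with simulated stand-ins; randomSharpes and algoSharpe are hypothetical values, not the results from this post.

```r
set.seed(123)
# Stand-in for the Sharpe ratios of 1000 random portfolios (hypothetical values)
randomSharpes <- rnorm(1000, mean = 0.6, sd = 0.2)
algoSharpe <- 0.9  # hypothetical Sharpe ratio of one signal process

# Normal-approximation p-value, in the spirit of the pnorm-based code above
normalP <- 1 - pnorm((algoSharpe - mean(randomSharpes)) / sd(randomSharpes))

# Empirical p-value: share of random portfolios at least as good as the signal.
# The +1 terms keep the estimate away from exactly zero.
empiricalP <- (sum(randomSharpes >= algoSharpe) + 1) / (length(randomSharpes) + 1)

print(c(normal = normalP, empirical = empiricalP))
```

When the simulated distribution is close to normal, as in this toy data, the two numbers agree closely. They can differ when the distribution of random-portfolio statistics is skewed or fat-tailed.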

Thanks for reading.


Hypothesis-Driven Development Part II

This post will evaluate signals based on the rank regression hypotheses covered in the last post.

The last time around, we saw that rank regression had a very statistically significant result. Therefore, the next step is to evaluate the basic signals, that is, whether or not there is statistical significance in the actual evaluation of the signal. Since the strategy from SeekingAlpha simply selects the top-ranked ETF every month, this is a very easy signal to evaluate.

Simply, using the 1-24 month formation periods for cumulative sum of monthly returns, select the highest-ranked ETF and hold it for one month.

Here’s the code to evaluate the signal (continued from the last post), given the returns, a month parameter, and an EW portfolio to compare with the signal.


signalBacktest <- function(returns, nMonths, ewPortfolio) {
  nMonthAverage <- apply(returns, 2, runSum, n = nMonths)
  nMonthAverage <- xts(nMonthAverage, order.by = index(returns))
  nMonthAvgRank <- t(apply(nMonthAverage, 1, rank))
  nMonthAvgRank <- xts(nMonthAvgRank, order.by=index(returns))
  selection <- (nMonthAvgRank==5) * 1 #select highest average performance
  sigTest <- Return.portfolio(R = returns, weights = selection)
  difference <- sigTest - ewPortfolio
  diffZscore <- mean(difference)/sd(difference)
  sigZscore <- mean(sigTest)/sd(sigTest)
  return(list(sigTest, difference, mean(sigTest), sigZscore, mean(difference), diffZscore))
}

ewPortfolio <- Return.portfolio(monthRets, rebalance_on="months")

sigBoxplots <- list()
excessBoxplots <- list()
sigMeans <- list()
sigZscores <- list()
diffMeans <- list()
diffZscores <- list()
for(i in 1:24) {
  tmp <- signalBacktest(monthRets, nMonths = i, ewPortfolio)
  sigBoxplots[[i]] <- tmp[[1]]
  excessBoxplots[[i]] <- tmp[[2]]
  sigMeans[[i]] <- tmp[[3]]
  sigZscores[[i]] <- tmp[[4]]
  diffMeans[[i]] <- tmp[[5]]
  diffZscores[[i]] <- tmp[[6]]
}

sigBoxplots <- do.call(cbind, sigBoxplots)
excessBoxplots <- do.call(cbind, excessBoxplots)
sigMeans <- do.call(c, sigMeans)
sigZscores <- do.call(c, sigZscores)
diffMeans <- do.call(c, diffMeans)
diffZscores <- do.call(c, diffZscores)

par(mfrow=c(2,1))
plot(as.numeric(sigMeans)*100, type='h', main = 'signal means', 
     ylab = 'percent per month', xlab='formation period')
plot(as.numeric(sigZscores), type='h', main = 'signal Z scores', 
     ylab='Z scores', xlab='formation period')

plot(as.numeric(diffMeans)*100, type='h', main = 'mean difference between signal and EW',
     ylab = 'percent per month', xlab='formation period')
plot(as.numeric(diffZscores), type='h', main = 'difference Z scores',
     ylab = 'Z score', xlab='formation period')

boxplot(as.matrix(sigBoxplots), main = 'signal boxplots', xlab='formation period')
abline(h=0, col='red')
points(sigMeans, col='blue')

boxplot(as.matrix(sigBoxplots[,1:12]), main = 'signal boxplots 1 through 12 month formations', 
        xlab='formation period')
abline(h=0, col='red')
points(sigMeans[1:12], col='blue')

boxplot(as.matrix(excessBoxplots), main = 'difference (signal - EW) boxplots', 
        xlab='formation period')
abline(h=0, col='red')
points(diffMeans, col='blue') #overlay the mean of the differences, not of the raw signal

boxplot(as.matrix(excessBoxplots[,1:12]), main = 'difference (signal - EW) boxplots 1 through 12 month formations', 
        xlab='formation period')
abline(h=0, col='red')
points(diffMeans[1:12], col='blue')

Okay, so what’s going on here is that I compare the signal against the equal weight portfolio, and take means and Z scores of both the signal’s returns on their own and of their difference from the equal weight portfolio. I plot these values, along with boxplots of the distributions of both the signal process and the difference between the signal process and the equal weight portfolio.

Here are the results:




To note, the percents are already multiplied by 100, so in the best cases, the rank strategy outperforms the equal weight strategy by about 30 basis points per month. However, these results are…not even in the same parking lot as statistical significance, let alone in the same ballpark.
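
To make the significance remark concrete: the Z scores above are per-month mean/sd ratios, so a rough t-statistic is that ratio multiplied by the square root of the number of months. Here is a minimal sketch on simulated data; the return series, its mean, its standard deviation, and its length are all made up for illustration and are not taken from the backtest.

```r
# Hypothetical monthly excess returns: 30 bps mean, 3% sd, 180 months.
set.seed(42)
nObs <- 180
excess <- rnorm(nObs, mean = 0.003, sd = 0.03)

# Per-period ratio, the same quantity signalBacktest stores as diffZscore.
zPerMonth <- mean(excess) / sd(excess)

# Scaling by sqrt(n) gives the one-sample t-statistic against a zero mean.
tStat <- zPerMonth * sqrt(nObs)

# Cross-check against base R's t.test.
tCheck <- unname(t.test(excess)$statistic)
print(c(zPerMonth = zPerMonth, tStat = tStat, tCheck = tCheck))
```

As a rule of thumb under this scaling, a per-month ratio of 0.1 would need roughly 400 months of data to reach a t-statistic of 2.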

Now, at this point, in case some people haven’t yet read Brian Peterson’s paper on strategy development, the point of hypothesis-driven development is to *reject* hypothetical strategies ASAP before looking at any sort of equity curve and trying to do away with periods of underperformance. So, at this point, I would like to reject this entire strategy because there’s no statistical evidence to actually continue. Furthermore, because August 2015 was a rather interesting month, especially in terms of volatility dispersion, I want to return to volatility trading strategies, now backed by hypothesis-driven development.

If anyone wants to see me continue to rule testing with this process, let me know. If not, I have more ideas on the way.

Thanks for reading.

NOTE: while I am currently consulting, I am always open to networking, meeting up (Philadelphia and New York City both work), consulting arrangements, and job discussions. Contact me through my email at ilya.kipnis@gmail.com, or through my LinkedIn, found here.

I’m Back, A New Harry Long Strategy, And Plans For Hypothesis-Driven Development

I’m back. For anyone who wants to know “what happened at Graham”: I felt there was very little scaffolding/on-boarding, and Graham’s expectations/requirements changed, though I have a reference from my direct boss, an accomplished quantitative director. In any case, moving on.

Harry Long (of Houston) recently came out with a new strategy posted on SeekingAlpha, and I’d like to test it for robustness to see if it has merit.

Here’s the link to the post.

So, the rules are fairly simple:

ZIV 15%
SPLV 50%
TMF 10%
UUP 20%
VXX 5%

TMF can be approximated with a 3x leveraged TLT. SPLV is also highly similar to XLP — aka the consumer staples SPY sector. Here’s the equity curve comparison to prove it.

So, let’s test this thing.

require(PerformanceAnalytics)
require(downloader)
require(quantmod)

getSymbols('XLP', from = '1900-01-01')
getSymbols('TLT', from = '1900-01-01')
getSymbols('UUP', from = '1900-01-01')
download('https://www.dropbox.com/s/jk3ortdyru4sg4n/ZIVlong.TXT', destfile='ZIVlong.csv')
download('https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT', destfile = 'VXXlong.csv')
ZIV <- xts(read.zoo('ZIVlong.csv', header=TRUE, sep=','))
VXX <- xts(read.zoo('VXXlong.csv', header=TRUE, sep=','))

symbols <- na.omit(cbind(Return.calculate(Cl(ZIV)), Return.calculate(Ad(XLP)), Return.calculate(Ad(TLT))*3,
                         Return.calculate(Ad(UUP)), Return.calculate(Cl(VXX))))
strat <- Return.portfolio(symbols, weights = c(.15, .5, .1, .2, .05), rebalance_on='years')

Here are the results:

compare <- na.omit(cbind(strat, Return.calculate(Ad(XLP))))
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))

Equity curve (compared against buy and hold XLP)

Statistics:

                          portfolio.returns XLP.Adjusted
Annualized Return                 0.0864000    0.0969000
Annualized Std Dev                0.0804000    0.1442000
Annualized Sharpe (Rf=0%)         1.0747000    0.6720000
Worst Drawdown                    0.1349957    0.3238755
Calmar Ratio                      0.6397665    0.2993100

In short, this strategy definitely offers a lot more bang for your risk in terms of drawdown, and volatility, and so, offers noticeably higher risk/reward tradeoffs. However, it’s not something that beats the returns of instruments in the category of twice its volatility.

Here are the statistics from 2010 onwards.

charts.PerformanceSummary(compare['2010::'])
rbind(table.AnnualizedReturns(compare['2010::']), maxDrawdown(compare['2010::']), CalmarRatio(compare['2010::']))

                          portfolio.returns XLP.Adjusted
Annualized Return                0.12050000    0.1325000
Annualized Std Dev               0.07340000    0.1172000
Annualized Sharpe (Rf=0%)        1.64210000    1.1308000
Worst Drawdown                   0.07382878    0.1194072
Calmar Ratio                     1.63192211    1.1094371

Equity curve:

Definitely a smoother ride, and for bonus points, it seems some of the hedges helped with the recent market dip. Again, while aggregate returns aren’t as high as simply buying and holding XLP, the Sharpe and Calmar ratios do better on a whole.

Now, let’s do some robustness analysis. While I do not know how Harry Long arrived at the individual asset weights he did, what can be tested much more easily is what effect offsetting the rebalancing day has on the performance of the strategy. As this is a strategy rebalanced once a year, it can easily be tested for what effect the rebalancing date has on its performance.

yearlyEp <- endpoints(symbols, on = 'years')
rebalanceDays <- list()
for(i in 0:251) {
  offset <- yearlyEp+i
  offset[offset > nrow(symbols)] <- nrow(symbols)
  offset[offset==0] <- 1
  wts <- matrix(rep(c(.15, .5, .1, .2, .05), length(yearlyEp)), ncol=5, byrow=TRUE)
  wts <- xts(wts, order.by=as.Date(index(symbols)[offset]))
  offsetRets <- Return.portfolio(R = symbols, weights = wts)
  colnames(offsetRets) <- paste0("offset", i)
  rebalanceDays[[i+1]] <- offsetRets
}
rebalanceDays <- do.call(cbind, rebalanceDays)
rebalanceDays <- na.omit(rebalanceDays)
stats <- rbind(table.AnnualizedReturns(rebalanceDays), maxDrawdown(rebalanceDays))
stats[5,] <- stats[1,]/stats[4,]

Here are the plots of return, Sharpe, and Calmar vs. offset.

plot(as.numeric(stats[1,])~c(0:251), type='l', ylab='CAGR', xlab='offset', main='CAGR vs. offset')
plot(as.numeric(stats[3,])~c(0:251), type='l', ylab='Sharpe Ratio', xlab='offset', main='Sharpe vs. offset')
plot(as.numeric(stats[5,])~c(0:251), type='l', ylab='Calmar Ratio', xlab='offset', main='Calmar vs. offset')
plot(as.numeric(stats[4,])~c(0:251), type='l', ylab='Drawdown', xlab='offset', main='Drawdown vs. offset')




In short, this strategy seems to be somewhat dependent upon the rebalancing date, which was left unsaid. Here are the quantiles for the five statistics for the given offsets:

rownames(stats)[5] <- "Calmar"
apply(stats, 1, quantile)
     Annualized Return Annualized Std Dev Annualized Sharpe (Rf=0%) Worst Drawdown    Calmar
0%            0.072500             0.0802                  0.881000      0.1201198 0.4207922
25%           0.081925             0.0827                  0.987625      0.1444921 0.4755600
50%           0.087650             0.0837                  1.037250      0.1559238 0.5364758
75%           0.092000             0.0843                  1.090900      0.1744123 0.6230789
100%          0.105100             0.0867                  1.265900      0.1922916 0.8316698

While the standard deviation seems fairly robust, the Sharpe can decrease by about 30% from its best offset to its worst, the Calmar can get cut in half, and the CAGR can also vary fairly substantially. That said, even using conservative estimates, the Sharpe ratio is fairly solid, and the Calmar outperforms that of XLP in any given variation, but nevertheless, performance can vary.
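
For a quick check of those ranges, the best-to-worst drops can be computed straight from the 0% and 100% rows of the quantile table above. This is just arithmetic on the reported numbers; `dropFromBest` is a helper name made up for this sketch.

```r
# Minimum and maximum across the 252 offsets, copied from the quantile table.
sharpeRange <- c(min = 0.881000, max = 1.265900)
calmarRange <- c(min = 0.4207922, max = 0.8316698)
cagrRange   <- c(min = 0.072500, max = 0.105100)

# Fractional drop when going from the best offset to the worst offset.
dropFromBest <- function(x) unname(1 - x["min"] / x["max"])

drops <- c(Sharpe = dropFromBest(sharpeRange),
           Calmar = dropFromBest(calmarRange),
           CAGR   = dropFromBest(cagrRange))
print(round(drops, 3))
```

By this measure the Sharpe falls by roughly 30%, the Calmar by just under half, and the CAGR by about 31% between the most and least favorable rebalancing offsets.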

Is this strategy investible in its current state? Maybe, depending on your standards for rigor. Up to this point, rebalancing sometime in December-early January seems to substantially outperform other rebalance dates. Maybe a Dec/January anomaly effect exists in literature to justify this. However, the article makes no mention of that. Furthermore, the article doesn’t explain how it arrived at the weights it did.

Which brings me to my next topic: a change with this blog going forward, namely hypothesis-driven trading system development. While this process doesn’t require complicated math, it does require statistical justification for multiple building blocks of a strategy, and a change in mindset, which a great deal of publicly available trading system ideas either gloss over, or omit entirely. As one of my most important readers praised this blog for “showing how the sausage is made”, this seems to be the next logical step in this progression.

Here’s the reasoning as to why.

It seems that when presenting trading ideas, there are two schools of thought: those that go off of intuition, build a backtest based off of that intuition, and see if it generally lines up with some intuitively expected result–and those that believe in a much more systematic, hypothesis-driven step-by-step framework, justifying as many decisions (ideally every decision) in creating a trading system. The advantage of the former is that it allows for displaying many more ideas in a much shorter timeframe. However, it has several major drawbacks: first off, it hides many concerns about potential overfitting. If what one sees is one final equity curve, there is nothing said about the sensitivity of said equity curve to however many various input parameters, and what other ideas were thrown out along the way. Secondly, without a foundation of strong hypotheses about the economic phenomena exploited, there is no proof that any strategy one comes across won’t simply fail once it’s put into live trading.

And third of all, which I find most important, is that such activities ultimately don’t sufficiently impress the industry’s best practitioners. For instance, Tony Cooper took issue with my replication of Trading The Odds’ volatility trading strategy, namely how data-mined it was (according to him in the comments section), and his objections seem to have been completely borne out by out-of-sample performance.

So, for those looking for plug-and-crank system ideas, that may still happen every so often if someone sends me something particularly interesting, but there’s going to be some all-new content on this blog.

Thanks for reading.


The JP Morgan SCTO strategy

This post goes over JP Morgan’s SCTO strategy, a basic XL-sector/RWR rotation strategy with the risks and returns typically associated with a momentum equity strategy. It’s nothing spectacular, but if a large bank markets it, it’s worth looking at.

Recently, one of my readers, a managing director at a quantitative investment firm, sent me a request to write a rotation strategy based around the 9 sector spiders and RWR. The way it works (or at least, the way I interpreted it) is this:

Every month, compute the return (not sure how “the return” is defined) and rank. Take the top 5 ranks, and weight them in a normalized fashion to the inverse of their 22-day volatility. Zero out any that have negative returns. Lastly, check the predicted annualized vol of the portfolio, and if it’s greater than 20%, bring it back down to 20%. The cash asset–SHY–receives any remaining allocation due to setting securities to zero.
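
As a toy illustration of the ranking, inverse-volatility weighting, and zeroing-out steps (the volatility cap is left out of this sketch), here is a small example with invented numbers. The asset labels, returns, volatilities, and the choice of topN = 3 are all hypothetical.

```r
# Hypothetical inputs for five assets (labels and numbers are illustrative only).
forecast <- c(A = 0.08, B = 0.05, C = -0.02, D = -0.03, E = -0.01) # lookback returns
annVol   <- c(A = 0.20, B = 0.10, C = 0.15, D = 0.25, E = 0.12)    # 22-day annualized vol
topN <- 3

# Keep only the topN assets by forecast return (here A, B, and E).
inTop <- rank(forecast) > length(forecast) - topN

# Inverse-volatility weights, normalized across the selected assets.
weight <- ifelse(inTop, 1 / annVol, 0)
weight <- weight / sum(weight)

# Zero out selected assets with a negative return. There is no renormalization,
# so the freed-up weight falls through to the cash asset.
weight[forecast < 0] <- 0
cash <- 1 - sum(weight)

print(round(c(weight, CASH = cash), 3))
```

In this example E makes the top three but has a negative return, so its share of the allocation ends up in cash rather than being redistributed to A and B.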

For the reference I used, here’s the investment case document from JP Morgan itself.

Here’s my implementation:

Step 1) get the data, compute returns.

require(quantmod)
require(PerformanceAnalytics)
symbols <- c("XLB", "XLE", "XLF", "XLI", "XLK", "XLP", "XLU", "XLV", "XLY", "RWR", "SHY")
getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-Za-z]*", "", colnames(prices))
returns <- na.omit(Return.calculate(prices))

Step 2) The function itself.

sctoStrat <- function(returns, cashAsset = "SHY", lookback = 4, annVolLimit = .2,
                      topN = 5, scale = 252) {
  ep <- endpoints(returns, on = "months")
  weights <- list()
  cashCol <- grep(cashAsset, colnames(returns))
  
  #remove cash from asset returns
  cashRets <- returns[, cashCol]
  assetRets <- returns[, -cashCol]
  for(i in 2:(length(ep) - lookback)) {
    retSubset <- assetRets[ep[i]:ep[i+lookback]]
    
    #forecast is the cumulative return of the lookback period
    forecast <- Return.cumulative(retSubset)
    
    #annualized (realized) volatility uses a 22-day lookback period
    annVol <- StdDev.annualized(tail(retSubset, 22))
    
    #rank the forecasts (the cumulative returns of the lookback)
    rankForecast <- rank(forecast) - ncol(assetRets) + topN
    
    #weight is inversely proportional to annualized vol
    weight <- 1/annVol
    
    #zero out anything not in the top N assets
    weight[rankForecast <= 0] <- 0
    
    #normalize and zero out anything with a negative return
    weight <- weight/sum(weight)
    weight[forecast < 0] <- 0
    
    #compute forecasted vol of portfolio
    forecastVol <- sqrt(as.numeric(t(weight)) %*% 
                          cov(retSubset) %*% 
                          as.numeric(weight)) * sqrt(scale)
    
    #if forecasted vol greater than vol limit, cut it down
    if(as.numeric(forecastVol) > annVolLimit) {
      weight <- weight * annVolLimit/as.numeric(forecastVol)
    }
    weights[[i]] <- xts(weight, order.by=index(tail(retSubset, 1)))
  }
  
  #replace cash back into returns
  returns <- cbind(assetRets, cashRets)
  weights <- do.call(rbind, weights)
  
  #cash weights are anything not in securities
  weights$CASH <- 1-rowSums(weights)
  
  #compute and return strategy returns
  stratRets <- Return.portfolio(R = returns, weights = weights)
  return(stratRets)      
}

In this case, I took a little bit of liberty with some specifics that the reference was short on. I used the full covariance matrix for forecasting the portfolio variance (not sure if JPM would ignore the covariances and do a weighted sum of individual volatilities instead), and for returns, I used the four-month cumulative. I’ve seen all sorts of permutations on how to compute returns, ranging from some average of 1, 3, 6, and 12 month cumulative returns to some lookback period to some two period average, so I’m all ears if others have differing ideas, which is why I left it as a lookback parameter.
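
To show what the covariance choice means in practice, here is a minimal sketch comparing the full-covariance volatility forecast with a weighted sum of individual volatilities. The returns are simulated and the weights are arbitrary, so the numbers say nothing about the actual sector ETFs.

```r
# Hypothetical daily returns for three assets (simulated, not market data).
set.seed(1)
n <- 500
common <- rnorm(n, sd = 0.008)                 # shared factor to induce correlation
rets <- cbind(A = common + rnorm(n, sd = 0.006),
              B = common + rnorm(n, sd = 0.010),
              C = rnorm(n, sd = 0.012))
w <- c(0.5, 0.3, 0.2)

# Full-covariance forecast: sqrt(w' Sigma w), annualized with sqrt(252).
volFullCov <- as.numeric(sqrt(t(w) %*% cov(rets) %*% w)) * sqrt(252)

# Alternative that ignores diversification: weighted sum of individual vols.
volWeightedSum <- sum(w * apply(rets, 2, sd)) * sqrt(252)

# Scale the weights down if the forecast exceeds the limit, as sctoStrat does.
annVolLimit <- 0.2
wCapped <- if (volFullCov > annVolLimit) w * annVolLimit / volFullCov else w

print(c(fullCov = volFullCov, weightedSum = volWeightedSum))
```

For long-only weights the weighted sum is never below the full-covariance figure, so using it would trigger the 20% cap at least as often as the version implemented above.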

Step 3) Running the strategy.

scto4_20 <- sctoStrat(returns)
getSymbols("SPY", from = "1990-01-01")
spyRets <- Return.calculate(Ad(SPY))
comparison <- na.omit(cbind(scto4_20, spyRets))
colnames(comparison) <- c("strategy", "SPY")
charts.PerformanceSummary(comparison)
apply.yearly(comparison, Return.cumulative)
stats <- rbind(table.AnnualizedReturns(comparison),
               maxDrawdown(comparison),
               CalmarRatio(comparison),
               SortinoRatio(comparison)*sqrt(252))
round(stats, 3)

Here are the statistics:

                          strategy   SPY
Annualized Return            0.118 0.089
Annualized Std Dev           0.125 0.193
Annualized Sharpe (Rf=0%)    0.942 0.460
Worst Drawdown               0.165 0.552
Calmar Ratio                 0.714 0.161
Sortino Ratio (MAR = 0%)     1.347 0.763

               strategy         SPY
2002-12-31 -0.035499564 -0.05656974
2003-12-31  0.253224759  0.28181559
2004-12-31  0.129739794  0.10697941
2005-12-30  0.066215224  0.04828267
2006-12-29  0.167686936  0.15845242
2007-12-31  0.153890329  0.05146218
2008-12-31 -0.096736711 -0.36794994
2009-12-31  0.181759432  0.26351755
2010-12-31  0.099187188  0.15056146
2011-12-30  0.073734427  0.01894986
2012-12-31  0.067679129  0.15990336
2013-12-31  0.321039353  0.32307769
2014-12-31  0.126633020  0.13463790
2015-04-16  0.004972434  0.02806776

And the equity curve:

To me, it looks like a standard rotation strategy. Aims for the highest momentum securities, diversifies to try and control risk, hits a drawdown in the crisis, recovers, and slightly lags the bull run on SPY. Nothing out of the ordinary.

So, for those interested, here you go. I’m surprised that JP Morgan itself markets this sort of thing, considering that they probably employ top-notch quants that can easily come up with products and/or strategies that are far better.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

The Logical Invest Enhanced Bond Rotation Strategy (And the Importance of Dividends)

This post will display my implementation of the Logical Invest Enhanced Bond Rotation strategy. This is a strategy that indeed does work, but is dependent on reinvesting dividends, as bonds pay coupons, which means bond ETFs do likewise.

The strategy is fairly simple — using four separate fixed income markets (long-term US government bonds, high-yield bonds, emerging sovereign debt, and convertible bonds), the strategy aims to deliver a low-risk, high Sharpe profile. Every month, it switches to two separate securities, in either a 60-40 or 50-50 split (that is, a 60-40 one way, or the other). My implementation for this strategy is similar to the ones I’ve done for the Logical Invest Universal Investment Strategy, which is to maximize a modified Sharpe ratio in a walk-forward process.

Here’s the code:

LogicInvestEBR <- function(returns, lowerBound, upperBound, period, modSharpeF) {
  count <- 0
  configs <- list()
  instCombos <- combn(colnames(returns), m = 2)
  for(i in 1:ncol(instCombos)) {
    inst1 <- instCombos[1, i]
    inst2 <- instCombos[2, i]
    rets <- returns[,c(inst1, inst2)]
    weightSeq <- seq(lowerBound, upperBound, by = .1)
    for(j in 1:length(weightSeq)) {
      returnConfig <- Return.portfolio(R = rets, 
                      weights = c(weightSeq[j], 1-weightSeq[j]), 
                      rebalance_on="months")
      colnames(returnConfig) <- paste(inst1, weightSeq[j], 
                                inst2, 1-weightSeq[j], sep="_")
      count <- count + 1
      configs[[count]] <- returnConfig
    }
  }
  
  configs <- do.call(cbind, configs)
  cumRets <- cumprod(1+configs)
  
  #rolling cumulative 
  rollAnnRets <- (cumRets/lag(cumRets, period))^(252/period) - 1
  rollingSD <- sapply(X = configs, runSD, n=period)*sqrt(252)
  
  modSharpe <- rollAnnRets/(rollingSD ^ modSharpeF)
  monthlyModSharpe <- modSharpe[endpoints(modSharpe, on="months"),]
  
  findMax <- function(data) {
    return(data==max(data))
  }
  
  #configs$zeroes <- 0 #zeroes for initial periods during calibration
  weights <- t(apply(monthlyModSharpe, 1, findMax))
  weights <- weights*1
  weights <- xts(weights, order.by=as.Date(rownames(weights)))
  weights[is.na(weights)] <- 0
  weights$zeroes <- 1-rowSums(weights)
  configCopy <- configs
  configCopy$zeroes <- 0
  
  stratRets <- Return.portfolio(R = configCopy, weights = weights)
  return(stratRets)  
}

The one thing different about this code is the way I initialize the return streams. It’s an ugly piece of work, but it takes all of the pairwise combinations (that is, 4 choose 2, or 4c2) along with a sequence going by 10% for the different security weights between the lower and upper bound (that is, if the lower bound is 40% and upper bound is 60%, the three weights will be 40-60, 50-50, and 60-40). So, in this case, there are 18 configurations. 4c2*3. Do note that this is not at all a framework that can be scaled up. That is, with 20 instruments, there will be 190 different combinations, and then anywhere between 3 to 11 (if going from 0-100) configurations for each combination. Obviously, not a pretty sight.
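
The configuration counts can be verified in a few lines of base R. The tickers are the four ETFs used in this post's test script; the 20-instrument universe is the hypothetical from the paragraph above.

```r
# Four ETFs and the 40-60 to 60-40 weight grid in steps of 10%.
instruments <- c("TLT", "JNK", "PCY", "CWB")
weightSeq <- seq(0.4, 0.6, by = 0.1)

pairs <- combn(instruments, m = 2)            # one column per pair of instruments
nConfigs <- ncol(pairs) * length(weightSeq)   # pairs times weight splits
print(nConfigs)

# How the count grows with a hypothetical 20-instrument universe.
nPairs20 <- choose(20, 2)
print(c(pairs = nPairs20,
        narrowGrid = nPairs20 * length(weightSeq),
        fullGrid = nPairs20 * length(seq(0, 1, by = 0.1))))
```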

Beyond that, it’s the same refrain. Bind the returns together, compute an n-day rolling annualized cumulative return (far faster my way than using the rollApply version of Return.annualized), and divide it by the n-day rolling annualized standard deviation raised to the power of the modified Sharpe F factor (1 gives you the Sharpe ratio, 0 gives you pure returns, greater than 1 puts more of a focus on risk). Take the highest modified Sharpe ratio, allocate to that configuration, repeat.
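
In the function above, the rolling standard deviation is raised to the power modSharpeF before the division. A small sketch of how that exponent shifts the selection, using two invented configurations (the names and numbers are hypothetical):

```r
# Two hypothetical configurations: rolling annualized return and annualized sd.
rollAnnRets <- c(aggressive = 0.15, defensive = 0.09)
rollingSD   <- c(aggressive = 0.09, defensive = 0.06)

# Modified Sharpe: return divided by volatility raised to the power F.
modSharpe <- function(ret, vol, modSharpeF) ret / (vol ^ modSharpeF)

picks <- sapply(c(0, 1, 2), function(f) {
  names(which.max(modSharpe(rollAnnRets, rollingSD, f)))
})
names(picks) <- paste0("F=", c(0, 1, 2))
print(picks)
```

With F at 0 or 1 the higher-return configuration wins, while F = 2 penalizes volatility enough to flip the choice to the lower-volatility one.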

So, how does this perform? Here’s a test script, using the same 73-day lookback with a modified Sharpe F of 2 that I’ve used in the previous Logical Invest strategies.

symbols <- c("TLT", "JNK", "PCY", "CWB", "VUSTX", "PRHYX", "RPIBX", "VCVSX")
suppressMessages(getSymbols(symbols, from="1995-01-01", src="yahoo"))
etfClose <- Return.calculate(cbind(Cl(TLT), Cl(JNK), Cl(PCY), Cl(CWB)))
etfAdj <- Return.calculate(cbind(Ad(TLT), Ad(JNK), Ad(PCY), Ad(CWB)))
mfClose <- Return.calculate(cbind(Cl(VUSTX), Cl(PRHYX), Cl(RPIBX), Cl(VCVSX)))
mfAdj <- Return.calculate(cbind(Ad(VUSTX), Ad(PRHYX), Ad(RPIBX), Ad(VCVSX)))
colnames(etfClose) <- colnames(etfAdj) <- c("TLT", "JNK", "PCY", "CWB")
colnames(mfClose) <- colnames(mfAdj) <- c("VUSTX", "PRHYX", "RPIBX", "VCVSX")

etfClose <- etfClose[!is.na(etfClose[,4]),]
etfAdj <- etfAdj[!is.na(etfAdj[,4]),]
mfClose <- mfClose[-1,]
mfAdj <- mfAdj[-1,]

etfAdjTest <- LogicInvestEBR(returns = etfAdj, lowerBound = .4, upperBound = .6,
                             period = 73, modSharpeF = 2)

etfClTest <- LogicInvestEBR(returns = etfClose, lowerBound = .4, upperBound = .6,
                             period = 73, modSharpeF = 2)

mfAdjTest <- LogicInvestEBR(returns = mfAdj, lowerBound = .4, upperBound = .6,
                            period = 73, modSharpeF = 2)

mfClTest <- LogicInvestEBR(returns = mfClose, lowerBound = .4, upperBound = .6,
                           period = 73, modSharpeF = 2)

fiveStats <- function(returns) {
  return(rbind(table.AnnualizedReturns(returns), 
               maxDrawdown(returns), CalmarRatio(returns)))
}

etfs <- cbind(etfAdjTest, etfClTest)
colnames(etfs) <- c("Adjusted ETFs", "Close ETFs")
charts.PerformanceSummary((etfs))

mutualFunds <- cbind(mfAdjTest, mfClTest)
colnames(mutualFunds) <- c("Adjusted MFs", "Close MFs")
charts.PerformanceSummary(mutualFunds)
chart.TimeSeries(log(cumprod(1+mutualFunds)), legend.loc="topleft")

fiveStats(etfs)
fiveStats(mutualFunds)

So, first, the results of the ETFs:

Equity curve:

Five statistics:

> fiveStats(etfs)
                          Adjusted ETFs Close ETFs
Annualized Return            0.12320000 0.08370000
Annualized Std Dev           0.06780000 0.06920000
Annualized Sharpe (Rf=0%)    1.81690000 1.20980000
Worst Drawdown               0.06913986 0.08038459
Calmar Ratio                 1.78158934 1.04078405

In other words, reinvesting dividends accounts for roughly a third of the adjusted returns, or, put differently, boosts the price-only return by nearly 50%.

Let’s look at the mutual funds. Note that these are for the sake of illustration only–you can’t trade out of mutual funds every month.

Equity curve:

Log scale:

Statistics:

                          Adjusted MFs Close MFs
Annualized Return           0.11450000 0.0284000
Annualized Std Dev          0.05700000 0.0627000
Annualized Sharpe (Rf=0%)   2.00900000 0.4532000
Worst Drawdown              0.09855271 0.2130904
Calmar Ratio                1.16217559 0.1332706

In this case, the difference is night and day, though some of it may come down to the data source. Yahoo isn’t the greatest when it comes to data, and I’m not sure how much the data quality deteriorates going back that far. However, the takeaway seems to be this: with bond strategies, dividends will need to be dealt with, and when considering returns data presented to you, keep in mind that those adjusted returns assume the investor stays on top of dividend maintenance. Fail to reinvest the dividends in a timely fashion, and, well, the gap can be quite large.

To put it into perspective, as I was writing this post, I wondered whether or not most of this was indeed due to dividends. Here’s a plot of the difference in returns between adjusted and close ETF returns.

chart.TimeSeries(etfAdj - etfClose, legend.loc="topleft", date.format="%Y-%m",
                 main = "Return differences adjusted vs. close ETFs")

With the resulting image:

While there may be some noise on the order of 1e-5 on most days, there are clear spikes observable in the return differences. Those are dividends, and their compounding makes a sizable difference. In one case for CWB, the difference is particularly striking (Dec. 29, 2014). In fact, here’s a quick little analysis of the effect of the dividends.

dividends <- etfAdj - etfClose
divReturns <- list()
for(i in 1:ncol(dividends)) {
  diffStream <- dividends[,i]
  divPayments <- diffStream[diffStream >= 1e-3]
  divReturns[[i]] <- Return.annualized(divPayments)
}
divReturns <- do.call(cbind, divReturns)
divReturns

divReturns/Return.annualized(etfAdj)

And the result:

> divReturns
                         TLT        JNK        PCY        CWB
Annualized Return 0.03420959 0.08451723 0.05382363 0.05025999

> divReturns/Return.annualized(etfAdj)
                       TLT       JNK       PCY       CWB
Annualized Return 0.453966 0.6939243 0.5405922 0.3737499

In short, the effect of the dividend is massive. In some instances, such as with JNK, the dividend comprises more than 50% of the annualized returns for the security!

Basically, I’d like to hammer the point home one last time–backtests using adjusted data assume instantaneous maintenance of dividends. In order to achieve the optimistic returns seen in the backtests, these dividend payments must be reinvested ASAP. In short, this is the fine print on this strategy, and is a small, but critical detail that the SeekingAlpha article doesn’t mention. (Seriously, do a ctrl + F in your browser for the word “dividend”. It won’t come up in the article itself.) I wanted to make sure to add it.

One last thing: gaudy numbers when using monthly returns!

> fiveStats(apply.monthly(etfs, Return.cumulative))
                          Adjusted ETFs Close ETFs
Annualized Return            0.12150000   0.082500
Annualized Std Dev           0.06490000   0.067000
Annualized Sharpe (Rf=0%)    1.87170000   1.232100
Worst Drawdown               0.03671871   0.049627
Calmar Ratio                 3.30769620   1.662642

Look! A Calmar Ratio of 3.3, and a Sharpe near 2!*

*: Must manage dividends. Statistics reported are monthly.
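
Part of why the monthly numbers look gaudier is mechanical: a monthly equity curve only samples the daily curve at month ends, so it cannot see intra-month peaks and troughs. Here is a minimal sketch on simulated returns; the data, the `maxDD` helper, and the use of 21-day blocks as a stand-in for calendar months are all assumptions made for this illustration.

```r
# Simulated daily returns (hypothetical; 10 years of 252 trading days).
set.seed(7)
daily <- rnorm(2520, mean = 0.0004, sd = 0.005)

# Worst peak-to-trough decline of the equity curve, starting from 1.
maxDD <- function(r) {
  equity <- c(1, cumprod(1 + r))
  max(1 - equity / cummax(equity))
}

# Compound into 21-day blocks as a stand-in for calendar months.
block <- rep(seq_len(length(daily) / 21), each = 21)
monthly <- as.numeric(tapply(daily, block, function(x) prod(1 + x) - 1))

ddDaily <- maxDD(daily)
ddMonthly <- maxDD(monthly)
print(c(daily = ddDaily, monthly = ddMonthly))
```

Because every point on the monthly curve is also a point on the daily curve, the monthly drawdown can never exceed the daily one, which in turn inflates any Calmar ratio computed from monthly data.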

Okay, in all fairness, this is a pretty solid strategy, once one commits to managing the dividends. I just felt that it should have been a topic made front and center considering its importance in this case, rather than simply swept under the “we use adjusted returns” rug, since in this instance, the effect of dividends is massive.

In conclusion, while I will more or less confirm the strategy’s actual risk/reward performance (unlike some other SeekingAlpha strategies I’ve backtested), which, in all honesty, I find really impressive, it comes with a caveat like the rest of them. However, the caveat of “be detail-oriented/meticulous/paranoid and reinvest those dividends!” in my opinion is a caveat that’s a lot easier to live with than 30%+ drawdowns that were found lurking in other SeekingAlpha strategies. So for those that can stay on top of those dividends (whether manually, or with machine execution), here you go. I’m basically confirming the performance of Logical Invest’s strategy, but just belaboring one important detail.

Thanks for reading.


The Logical Invest “Hell On Fire” Replication Attempt

This post is about my replication attempt of Logical Invest’s “Hell On Fire” strategy — which is its Universal Investment Strategy using SPXL and TMF (aka the 3x leveraged ETFs). I don’t match their results, but I do come close.

It seems that some people at Logical Invest have caught whiff of some of the work I did in replicating Harry Long’s ideas. First off, for the record, I’ve actually done some work with Harry Long in private, and the strategies we’ve worked on together are definitely better than the strategies he has shared for free, so if you are an institution hoping to vet his track record, I wouldn’t judge it by the very much incomplete frameworks he posts for free.

This post’s strategy is the Logical Invest Universal Investment Strategy leveraged up three times over. Here’s the link to their newest post. Also, I’m happy to see that they think positively of my work.

In any case, my results are worse than Logical Invest’s, so if anyone sees a reason for the discrepancy, please let me know.

Here’s the code for the backtest–most of it is old, from my first time analyzing Logical Invest’s strategy.

LogicalInvestUIS <- function(returns, period = 63, modSharpeF = 2.8) {
  returns[is.na(returns)] <- 0 #impute any NAs to zero
  configs <- list()
  for(i in 1:11) {
    weightFirst <- (i-1)*.1
    weightSecond <- 1-weightFirst
    config <- Return.portfolio(R = returns, weights=c(weightFirst, weightSecond), rebalance_on = "months")
    configs[[i]] <- config
  }
  configs <- do.call(cbind, configs)
  cumRets <- cumprod(1+configs)
  
  #rolling cumulative 
  rollAnnRets <- (cumRets/lag(cumRets, period))^(252/period) - 1
  rollingSD <- sapply(X = configs, runSD, n=period)*sqrt(252)
  
  modSharpe <- rollAnnRets/(rollingSD ^ modSharpeF)
  monthlyModSharpe <- modSharpe[endpoints(modSharpe, on="months"),]
  
  findMax <- function(data) {
    return(data==max(data))
  }
  
  #configs$zeroes <- 0 #zeroes for initial periods during calibration
  weights <- t(apply(monthlyModSharpe, 1, findMax))
  weights <- weights*1
  weights <- xts(weights, order.by=as.Date(rownames(weights)))
  weights[is.na(weights)] <- 0
  weights$zeroes <- 1-rowSums(weights)
  configCopy <- configs
  configCopy$zeroes <- 0
  
  stratRets <- Return.portfolio(R = configCopy, weights = weights)
  
  weightFirst <- apply(monthlyModSharpe, 1, which.max)
  weightFirst <- do.call(rbind, weightFirst)
  weightFirst <- (weightFirst-1)*.1
  align <- cbind(weightFirst, stratRets)
  align <- na.locf(align)
  chart.TimeSeries(align[,1], date.format="%Y", ylab=paste("Weight", colnames(returns)[1]), 
                                                           main=paste("Weight", colnames(returns)[1]))
  
  return(stratRets)
}

In this case, rather than steps of 5% weights, I used 10% weights after looking at the Logical Invest charts more closely.

Now, let’s look at the instruments.

getSymbols("SPY", from="1990-01-01")

getSymbols("TMF", from="1990-01-01")
TMFrets <- Return.calculate(Ad(TMF))
getSymbols("TLT", from="1990-01-01")
TLTrets <- Return.calculate(Ad(TLT))
tmf3TLT <- merge(TMFrets, 3*TLTrets, join='inner')
charts.PerformanceSummary(tmf3TLT)
Return.annualized(tmf3TLT[,2]-tmf3TLT[,1])
discrepancy <- as.numeric(Return.annualized(tmf3TLT[,2]-tmf3TLT[,1]))
tmf3TLT[,2] <- tmf3TLT[,2] - ((1+discrepancy)^(1/252)-1)
modifiedTLT <- 3*TLTrets - ((1+discrepancy)^(1/252)-1)

rets <- merge(3*Return.calculate(Ad(SPY)), modifiedTLT, join='inner')
colnames(rets) <- gsub("\\.[A-Za-z]*", "", colnames(rets))

leveragedReturns <- rets
colnames(leveragedReturns) <- paste("Leveraged", colnames(leveragedReturns), sep="_")
leveragedReturns <- leveragedReturns[-1,]

Again, more of the same that I did from my work analyzing Harry Long’s strategies to get a longer backtest of SPXL and TMF (aka leveraged SPY and TLT).
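
The adjustment in the code above turns an annualized return gap into a constant per-day drag that gets subtracted from the synthetic 3x TLT returns. Here is a minimal sketch of just that arithmetic, using a made-up 5% annual discrepancy (a hypothetical number; the real one comes from the TMF and 3x TLT data):

```r
# Hypothetical annualized gap between 3x TLT and TMF (not a measured value)
discrepancy <- 0.05

# Constant daily drag that compounds to the annual gap over 252 trading days
dailyDrag <- (1 + discrepancy)^(1/252) - 1

# Compounding the daily drag back up recovers the annual figure
recovered <- (1 + dailyDrag)^252 - 1
print(c(dailyDrag = dailyDrag, recovered = recovered))
```

The daily drag comes out slightly smaller than simply dividing the annual figure by 252, since it compounds.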

Now, let’s look at some configurations.


hof <- LogicalInvestUIS(returns = leveragedReturns, period = 63, modSharpeF = 2.8)
hof2 <- LogicalInvestUIS(returns = leveragedReturns, period = 73, modSharpeF = 3)
hof3 <- LogicalInvestUIS(returns = leveragedReturns, period = 84, modSharpeF = 4)
hof4 <- LogicalInvestUIS(returns = leveragedReturns, period = 42, modSharpeF = 1.5)
hof5 <- LogicalInvestUIS(returns = leveragedReturns, period = 63, modSharpeF = 6)
hof6 <- LogicalInvestUIS(returns = leveragedReturns, period = 73, modSharpeF = 2)

hofComparisons <- cbind(hof, hof2, hof3, hof4, hof5, hof6)
colnames(hofComparisons) <- c("d63_F2.8", "d73_F3", "d84_F4", "d42_F1.5", "d63_F6", "d73_F2")
rbind(table.AnnualizedReturns(hofComparisons), maxDrawdown(hofComparisons), CalmarRatio(hofComparisons))

With the following statistics:

> rbind(table.AnnualizedReturns(hofComparisons), maxDrawdown(hofComparisons), CalmarRatio(hofComparisons))
                           d63_F2.8    d73_F3    d84_F4  d42_F1.5    d63_F6    d73_F2
Annualized Return         0.3777000 0.3684000 0.2854000 0.1849000 0.3718000 0.3830000
Annualized Std Dev        0.3406000 0.3103000 0.3010000 0.4032000 0.3155000 0.3383000
Annualized Sharpe (Rf=0%) 1.1091000 1.1872000 0.9483000 0.4585000 1.1785000 1.1323000
Worst Drawdown            0.5619769 0.4675397 0.4882101 0.7274609 0.5757738 0.4529908
Calmar Ratio              0.6721751 0.7879956 0.5845827 0.2541127 0.6457823 0.8455274

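It seems that the original 73-day lookback with a Sharpe F of 2 had the best performance: the highest annualized return, the highest Calmar ratio, and the smallest worst drawdown, although the 73-day, F of 3 variant edges it out on Sharpe ratio.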

Here are the equity curves (log scale because leveraged or volatility strategies look silly at regular scale):

chart.TimeSeries(log(cumprod(1+hofComparisons)), legend.loc="topleft", date.format="%Y",
                 main="Hell On Fire Comparisons", ylab="Value of $1", yaxis = FALSE)
axis(side=2, at=c(0, 1, 2, 3, 4), label=paste0("$", round(exp(c(0, 1, 2, 3, 4)))), las = 1)

In short, the curves move sort of upwards from 2002 to the crisis, where all the strategies take a dip, and then continue steadily upwards.

Here are the drawdowns:

dds <- PerformanceAnalytics:::Drawdowns(hofComparisons)
chart.TimeSeries(dds, legend.loc="bottomright", date.format="%Y", main="Drawdowns Hell On Fire Variants", 
                 yaxis=FALSE, ylab="Drawdown", auto.grid=FALSE)
axis(side=2, at=seq(from=0, to=-.7, by = -.1), label=paste0(seq(from=0, to=-.7, by = -.1)*100, "%"), las = 1)

Basically, some regular bumps along the road given the CAGRs (that is, if you’re going to leverage something that occasionally has an 8% drawdown three times over, it’s going to have a drawdown of around 24% on those same occasions, if not more), plus the massive hit in the crisis when bonds take a hit, and on we go.
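
To make the leverage arithmetic concrete, here is a small sketch on made-up daily returns (hypothetical numbers, ignoring financing costs and fees). It compares the max drawdown of a return stream against the same stream levered three times daily:

```r
# Max drawdown of a return series, from its compounded equity curve
maxDD <- function(rets) {
  equity <- cumprod(1 + rets)
  max(1 - equity / cummax(equity))
}

# Hypothetical daily returns containing a short losing streak (made-up numbers)
rets <- c(0.01, -0.02, -0.03, -0.01, -0.02, 0.015, 0.01)

dd1x <- maxDD(rets)       # unlevered
dd3x <- maxDD(3 * rets)   # 3x daily leverage, ignoring financing and fees
print(round(c(dd1x = dd1x, dd3x = dd3x, ratio = dd3x / dd1x), 4))
```

The levered drawdown lands in the neighborhood of three times the unlevered one; the exact multiple depends on the path, because the returns compound daily.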

In short, this strategy is basically the same as the original strategy, just leveraged up, so for those with the stomach for it, there you go. Of course, Logical Invest is leaving off some details, since I’m not getting a perfect replica. Namely, their returns seem slightly higher, and their drawdowns slightly lower. I suppose that’s par for the course when selling subscriptions and newsletters.

One last thing, which I think people should be aware of: when people report statistics on their strategies, make sure to ask at which frequency those statistics were computed. Because here’s a quick little modification, going from daily returns to monthly returns:

> betterStatistics <- apply.monthly(hofComparisons, Return.cumulative)
> rbind(table.AnnualizedReturns(betterStatistics), maxDrawdown(betterStatistics), CalmarRatio(betterStatistics))
                           d63_F2.8    d73_F3    d84_F4  d42_F1.5    d63_F6   d73_F2
Annualized Return         0.3719000 0.3627000 0.2811000 0.1822000 0.3661000 0.377100
Annualized Std Dev        0.3461000 0.3014000 0.2914000 0.3566000 0.3159000 0.336700
Annualized Sharpe (Rf=0%) 1.0746000 1.2036000 0.9646000 0.5109000 1.1589000 1.119900
Worst Drawdown            0.4323102 0.3297927 0.4100792 0.6377512 0.4636949 0.311480
Calmar Ratio              0.8602366 1.0998551 0.6855148 0.2856723 0.7894636 1.210563

While the Sharpe ratios don’t improve too much, the Calmar ratios (aka return to drawdown) increase dramatically, because every worst drawdown shrinks. E.g., imagine a month in which there’s a 40% drawdown, but it ends at a new equity high. A monthly return series will sweep that under the rug, or, for my fellow Jewish readers, pass over it. So, be wary.
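
Here is a minimal sketch of that effect on a made-up equity curve (hypothetical numbers, with two five-day "months" standing in for real calendar months):

```r
# Max drawdown computed directly from an equity curve
maxDD <- function(equity) max(1 - equity / cummax(equity))

# Hypothetical daily equity curve (made-up numbers):
# month 2 falls 40% below the prior high intramonth, then closes at a new high
equity <- c(100, 102, 104, 103, 105,   # month 1, ends at 105
            90, 70, 63, 95, 110)       # month 2, trough of 63 is 40% below 105
monthEnds <- c(5, 10)                  # positions of the month-end observations

dailyDD   <- maxDD(equity)             # sees the intramonth trough
monthlyDD <- maxDD(equity[monthEnds])  # only sees 105 and 110
print(c(dailyDD = dailyDD, monthlyDD = monthlyDD))
```

The daily series reports the full intramonth drop, while the month-end series reports no drawdown at all.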

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

The Downside of Rankings-Based Strategies

This post will demonstrate a downside to rankings-based strategies, particularly when using data of questionable quality (which, unless one pays multiple thousands of dollars per month for data, it most likely is). Essentially, making one small change to the way the strategy filters introduces a massive performance drop in terms of drawdown. This exercise demonstrates one more way of throwing a curve-ball at ranking strategies to test for robustness.

Recently, a discussion came up between myself, Terry Doherty, Cliff Smith, and some others on Seeking Alpha regarding what happened when I substituted the 63-day SMA for the three month SMA in Cliff Smith’s QTS strategy (quarterly tactical strategy…strategy).

Essentially, by simply substituting a 63-day SMA (that is, using daily data instead of monthly) for a 3-month SMA, the results were drastically affected.

Here’s the new QTS code, now in a function.

qts <- function(prices, nShort = 20, nLong = 105, nMonthSMA = 3, nDaySMA = 63, wRankShort=1, wRankLong=1.01, 
                movAvgType = c("monthly", "daily"), cashAsset="VUSTX", returnNames = FALSE) {
  cashCol <- grep(cashAsset, colnames(prices))
  
  #start our data off on the security with the least data (VGSIX in this case)
  prices <- prices[!is.na(prices[,7]),] 
  
  #cash is not a formal asset in our ranking
  cashPrices <- prices[, cashCol]
  prices <- prices[, -cashCol]
  
  #compute momentums
  rocShort <- prices/lag(prices, nShort) - 1
  rocLong <- prices/lag(prices, nLong) - 1
  
  #take the endpoints of quarter start/end
  quarterlyEps <- endpoints(prices, on="quarters")
  monthlyEps <- endpoints(prices, on = "months")
  
  #take the prices at quarterly endpoints
  quarterlyPrices <- prices[quarterlyEps,]
  
  #short momentum at quarterly endpoints (20 day)
  rocShortQtrs <- rocShort[quarterlyEps,]
  
  #long momentum at quarterly endpoints (105 day)
  rocLongQtrs <- rocLong[quarterlyEps,]
  
  #rank short momentum, best highest rank
  rocSrank <- t(apply(rocShortQtrs, 1, rank))
  
  #rank long momentum, best highest rank
  rocLrank <- t(apply(rocLongQtrs, 1, rank))
  
  #total rank, long slightly higher than short, sum them
  totalRank <- wRankLong * rocLrank + wRankShort * rocSrank 
  
  #function that takes 100% position in highest ranked security
  maxRank <- function(rankRow) {
    return(rankRow==max(rankRow))
  }
  
  #apply above function to our quarterly ranks every quarter
  rankPos <- t(apply(totalRank, 1, maxRank))
  
  #SMA of securities, only use monthly endpoints
  #subset to quarters
  #then filter
  movAvgType = movAvgType[1]
  if(movAvgType=="monthly") {
    monthlyPrices <- prices[monthlyEps,]
    monthlySMAs <- xts(apply(monthlyPrices, 2, SMA, n=nMonthSMA), order.by=index(monthlyPrices))
    quarterlySMAs <- monthlySMAs[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else if (movAvgType=="daily") {
    smas <- xts(apply(prices, 2, SMA, n=nDaySMA), order.by=index(prices))
    quarterlySMAs <- smas[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else {
    stop("invalid moving average type")
  }
  
  finalPos <- rankPos*smaFilter
  finalPos <- finalPos[!is.na(rocLongQtrs[,1]),]
  cash <- xts(1-rowSums(finalPos), order.by=index(finalPos))
  finalPos <- merge(finalPos, cash, join='inner')
  
  prices <- merge(prices, cashPrices, join='inner')
  returns <- Return.calculate(prices)
  stratRets <- Return.portfolio(returns, finalPos)
  
  if(returnNames) {
    findNames <- function(pos) {
      return(names(pos[pos==1]))
    }
    tmp <- apply(finalPos, 1, findNames)
    assetNames <- xts(tmp, order.by=as.Date(names(tmp)))
    return(list(assetNames, stratRets))
  }
  return(stratRets)
}

The one change I made is this:

  movAvgType = movAvgType[1]
  if(movAvgType=="monthly") {
    monthlyPrices <- prices[monthlyEps,]
    monthlySMAs <- xts(apply(monthlyPrices, 2, SMA, n=nMonthSMA), order.by=index(monthlyPrices))
    quarterlySMAs <- monthlySMAs[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else if (movAvgType=="daily") {
    smas <- xts(apply(prices, 2, SMA, n=nDaySMA), order.by=index(prices))
    quarterlySMAs <- smas[index(quarterlyPrices),]
    smaFilter <- quarterlyPrices > quarterlySMAs
  } else {
    stop("invalid moving average type")
  }

In essence, it allows the function to use either a monthly-calculated moving average, or a daily, which is then subset to the quarterly frequency of the rest of the data.
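
To see how the two filters can disagree on the very same quarter-end price, here is a small sketch on made-up prices. It assumes exactly 21 trading days per month (an assumption for illustration) and uses plain means in place of the SMA function:

```r
# Hypothetical daily prices for one quarter, assuming 21 trading days per month
p <- rep(100, 63)
p[21] <- 120   # month-end spike
p[42] <- 120   # month-end spike
p[63] <- 105   # quarter-end price

monthEnds <- c(21, 42, 63)

sma3Month <- mean(p[monthEnds])  # 3-month SMA on month-end prices only
sma63Day  <- mean(p)             # 63-day SMA on all daily prices

# Same quarter-end price, two different answers from the trend filter
passMonthly <- p[63] > sma3Month
passDaily   <- p[63] > sma63Day
print(c(sma3Month = sma3Month, sma63Day = sma63Day))
print(c(passMonthly = passMonthly, passDaily = passDaily))
```

The monthly average only ever sees three observations, so a couple of unusual month-end prints can move it far away from the daily average.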

(I also allow the function to return the names of the selected securities.)

So now we can do two tests:

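1) The initial parameter settings (20-day short-term momentum, 105-day long-term momentum, equally weighted ranks (tiebreaker to the long-term), and a 3-month SMA filter)
2) The same settings, except with a 63-day SMA for the filter. (The call below also passes rank weights of .95 and 1.05 rather than the defaults of 1 and 1.01, which likewise tilts slightly toward the long-term rank.)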
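
As a quick illustration of the tiebreaker, here is a sketch with made-up ranks for four hypothetical assets (A through D are placeholders, not the funds in this post). Two assets tie on the plain sum of ranks, and the slightly larger long-term weight settles the tie:

```r
# Hypothetical momentum ranks (higher = better), made-up values
rocSrank <- c(A = 4, B = 3, C = 2, D = 1)  # short-term rank
rocLrank <- c(A = 3, B = 4, C = 1, D = 2)  # long-term rank

# Plain sum: A and B are tied at 7
plainSum <- rocSrank + rocLrank

# Function defaults (1 and 1.01) and the weights used in the daily test
totalDefault <- 1.01 * rocLrank + 1 * rocSrank
totalDaily   <- 1.05 * rocLrank + 0.95 * rocSrank

print(rbind(plainSum, totalDefault, totalDaily))
print(names(which.max(totalDefault)))
print(names(which.max(totalDaily)))
```

Both weightings pick the asset with the better long-term rank.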

Here’s the code to do that.

#get our data from yahoo, use adjusted prices
symbols <- c("NAESX", #small cap
             "PREMX", #emerging bond
             "VEIEX", #emerging markets
             "VFICX", #intermediate investment grade
             "VFIIX", #GNMA mortgage
             "VFINX", #S&P 500 index
             "VGSIX", #MSCI REIT
             "VGTSX", #total intl stock idx
             "VUSTX") #long term treasury (cash)

getSymbols(symbols, from="1990-01-01")
prices <- list()
for(i in 1:length(symbols)) {
  prices[[i]] <- Ad(get(symbols[i]))  
}
prices <- do.call(cbind, prices)
colnames(prices) <- gsub("\\.[A-Za-z]*", "", colnames(prices))

monthlySMAqts <- qts(prices, returnNames=TRUE)
dailySMAqts <- qts(prices, wRankShort=.95, wRankLong=1.05, movAvgType = "daily", returnNames=TRUE)

retsComparison <- cbind(monthlySMAqts[[2]], dailySMAqts[[2]])
colnames(retsComparison) <- c("monthly SMA qts", "daily SMA qts")
retsComparison <- retsComparison["2003::"]
charts.PerformanceSummary(retsComparison)
rbind(table.AnnualizedReturns(retsComparison), maxDrawdown(retsComparison))

And here are the results:

Statistics:

                          monthly SMA qts daily SMA qts
Annualized Return               0.2745000     0.2114000
Annualized Std Dev              0.1725000     0.1914000
Annualized Sharpe (Rf=0%)       1.5915000     1.1043000
Worst Drawdown                  0.1911616     0.3328411

With the corresponding equity curves:

Here are the several instances in which the selections do not match thanks to the filters:

selectedNames <- cbind(monthlySMAqts[[1]], dailySMAqts[[1]])
colnames(selectedNames) <- c("Monthly SMA Filter", "Daily SMA Filter")
differentSelections <- selectedNames[selectedNames[,1]!=selectedNames[,2],]

With the results:

           Monthly SMA Filter Daily SMA Filter
1997-03-31 "VGSIX"            "cash"          
2007-12-31 "cash"             "PREMX"         
2008-06-30 "cash"             "VFIIX"         
2008-12-31 "cash"             "NAESX"         
2011-06-30 "cash"             "NAESX"  

Now, of course, many can argue that Yahoo’s data is junk, that my backtest doesn’t reflect reality, etc., which would essentially miss the point: this data, while not a perfect realization of the reality of Planet Earth, may as well have been valid (you know, like all the academics who use various simulation techniques to synthesize more data or explore other scenarios?). All I did here was change the filter to something logically comparable (that is, computing the moving average filter on a different time-scale, which does not in any way change the investment logic). From 2003 onward, this change only affected the strategy in four places. However, those instances were enough to create some noticeable changes (for the worse) in the strategy’s performance. Essentially, the downside of rankings-based strategies is that when the overall number of selected instruments is small (in this case, ONE!), a few small changes in parameters, data, etc. can lead to drastically different results.

As I write this, Cliff Smith already has ideas as to how to counteract this phenomenon. However, in my experience, once a strategy starts getting into “how do we smooth out that one bump on the equity curve” territory, it’s time to go back and re-examine the strategy altogether. In my opinion, while the idea of momentum is, of course, sound, with a great deal of literature devoted to it, the idea of selecting just one instrument at a time as the be-all-end-all strategy does not sit well with me. Nevertheless, QTS presents an interesting framework for analyzing small subgroups of securities, and for using it as one layer of an overarching strategy framework, such that the return streams are sub-strategies instead of raw instruments.

Thanks for reading.

The Logical-Invest “Universal Investment Strategy”–A Walk Forward Process on SPY and TLT

I’m sure we’ve all heard about diversified stock and bond portfolios. In its simplest, most diluted form, such a portfolio can consist of the SPY and TLT ETFs. The concept introduced by Logical Invest, in a Seeking Alpha article written by Frank Grossman (also see link here), essentially uses a walk-forward methodology of maximizing a modified Sharpe ratio, biased heavily in favor of low volatility rather than high returns. That is, over a 72-day moving window, it picks the weighting configuration of a SPY-TLT mix that maximizes total return divided by the standard deviation raised to the power of 5/2. To put it into perspective, at a power of 1, this is the basic Sharpe ratio, and at a power of 0, just a momentum maximization algorithm.

The process for this strategy is simple: rebalance every month to the SPY-TLT split, in multiples of 5%, that maximized the following quantity over the preceding window (returns/vol^2.5 on a 72-day window).
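
To get a feel for what the exponent does, here is a minimal sketch on two made-up mixes (hypothetical numbers, not backtest output). The `modSharpe` helper here is just the ratio described above, written as a standalone function for illustration:

```r
# Modified Sharpe: annualized return divided by volatility raised to a power f
modSharpe <- function(annRet, annVol, f) annRet / annVol^f

# Two hypothetical SPY/TLT mixes (made-up numbers)
annRet <- c(mixA = 0.12, mixB = 0.08)
annVol <- c(mixA = 0.20, mixB = 0.10)

# f = 0 ranks on return alone, f = 1 is the plain Sharpe ratio,
# f = 2.5 penalizes volatility much more heavily
for (f in c(0, 1, 2.5)) {
  scores <- modSharpe(annRet, annVol, f)
  cat("f =", f, "-> picks", names(which.max(scores)), "\n")
}
```

At a power of 0 the higher-return mix wins; once volatility enters the denominator, the calmer mix takes over, and more decisively so as the power rises.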

Here’s the code for obtaining the data and computing the necessary quantities:

require(quantmod)
require(PerformanceAnalytics)
getSymbols(c("SPY", "TLT"), from="1990-01-01")
returns <- merge(Return.calculate(Ad(SPY)), Return.calculate(Ad(TLT)), join='inner')
returns <- returns[-1,]
configs <- list()
for(i in 1:21) {
  weightSPY <- (i-1)*.05
  weightTLT <- 1-weightSPY
  config <- Return.portfolio(R = returns, weights=c(weightSPY, weightTLT), rebalance_on = "months")
  configs[[i]] <- config
}
configs <- do.call(cbind, configs)
cumRets <- cumprod(1+configs)
period <- 72

roll72CumAnn <- (cumRets/lag(cumRets, period))^(252/period) - 1
roll72SD <- sapply(X = configs, runSD, n=period)*sqrt(252)

Next, the code for creating the weights:

sd_f_factor <- 2.5
modSharpe <- roll72CumAnn/roll72SD^sd_f_factor
monthlyModSharpe <- modSharpe[endpoints(modSharpe, on="months"),]

findMax <- function(data) {
  return(data==max(data))
}

weights <- t(apply(monthlyModSharpe, 1, findMax))
weights <- weights*1
weights <- xts(weights, order.by=as.Date(rownames(weights)))
weights[is.na(weights)] <- 0
weights$zeroes <- 1-rowSums(weights)
configs$zeroes <- 0

That is, simply take the setting that maximizes the monthly modified Sharpe Ratio calculation at each rebalancing date (the end of every month).
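
Here is the same selection step in isolation, as a sketch on a tiny made-up score matrix (hypothetical dates and values, and a plain matrix rather than an xts object). The first row mimics the warm-up period, when the rolling window has not filled yet and every score is NA:

```r
findMax <- function(data) {
  return(data == max(data))
}

# Hypothetical month-end scores for three weight configurations (made up)
scores <- rbind("2002-09-30" = c(NA, NA, NA),
                "2002-10-31" = c(0.8, 1.4, 1.1),
                "2002-11-29" = c(2.0, 0.5, 1.9))
colnames(scores) <- c("cfg1", "cfg2", "cfg3")

weights <- t(apply(scores, 1, findMax)) * 1  # TRUE/FALSE -> 1/0
weights[is.na(weights)] <- 0                 # warm-up rows select nothing
weights <- cbind(weights, zeroes = 1 - rowSums(weights))  # remainder sits in the zero-return column
print(weights)
```

Every row sums to one: either a single configuration gets the full weight, or the whole allocation falls into the zero-return placeholder column.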

Next, here’s the performance:

stratRets <- Return.portfolio(R = configs, weights = weights)
rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
charts.PerformanceSummary(stratRets)

Which gives the results:

> rbind(table.AnnualizedReturns(stratRets), maxDrawdown(stratRets))
                          portfolio.returns
Annualized Return                 0.1317000
Annualized Std Dev                0.0990000
Annualized Sharpe (Rf=0%)         1.3297000
Worst Drawdown                    0.1683851

With the following equity curve:

Not perfect, but how does it compare to the ingredients?

Let’s take a look:

stratAndComponents <- merge(returns, stratRets, join='inner')
charts.PerformanceSummary(stratAndComponents)
rbind(table.AnnualizedReturns(stratAndComponents), maxDrawdown(stratAndComponents))
apply.yearly(stratAndComponents, Return.cumulative)

Here are the usual statistics:

> rbind(table.AnnualizedReturns(stratAndComponents), maxDrawdown(stratAndComponents))
                          SPY.Adjusted TLT.Adjusted portfolio.returns
Annualized Return            0.0907000    0.0783000         0.1317000
Annualized Std Dev           0.1981000    0.1381000         0.0990000
Annualized Sharpe (Rf=0%)    0.4579000    0.5669000         1.3297000
Worst Drawdown               0.5518552    0.2659029         0.1683851

In short, it seems the strategy performs far better than either of the ingredients. Let’s see if the equity curve comparison reflects this.

Indeed, it does. While it does indeed have the drawdown in the crisis, both instruments were in drawdown at the time, so it appears that the strategy made the best of a bad situation.

Here are the annual returns:

> apply.yearly(stratAndComponents, Return.cumulative)
           SPY.Adjusted TLT.Adjusted portfolio.returns
2002-12-31  -0.02054891  0.110907611        0.01131366
2003-12-31   0.28179336  0.015936985        0.12566042
2004-12-31   0.10695067  0.087089794        0.09724221
2005-12-30   0.04830869  0.085918063        0.10525398
2006-12-29   0.15843880  0.007178861        0.05294557
2007-12-31   0.05145526  0.102972399        0.06230742
2008-12-31  -0.36794099  0.339612265        0.19590423
2009-12-31   0.26352114 -0.218105306        0.18826736
2010-12-31   0.15056113  0.090181150        0.16436950
2011-12-30   0.01890375  0.339915713        0.24562838
2012-12-31   0.15994578  0.024083393        0.06051237
2013-12-31   0.32303535 -0.133818884        0.13760060
2014-12-31   0.13463980  0.273123290        0.19637382
2015-02-20   0.02773183  0.006922893        0.02788726

2002 was an incomplete year. However, what’s interesting here is that on the whole, while the strategy usually falls short of the better of the two instruments (it beat both only in 2005, 2010, and the partial 2015), it always outperforms the worse of the two. Not only that, but it has delivered a positive performance in every year of the backtest, even when one instrument or the other was taking serious blows to performance, such as SPY in 2008, and TLT in 2009 and 2013.
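
Those observations can be checked directly against the table. Here is a short sketch with the calendar-year returns typed in by hand from the output above (the variable names are mine, for illustration):

```r
# Calendar-year returns copied from the table above (2002 and 2015 are partial)
years <- 2002:2015
spy <- c(-0.02054891, 0.28179336, 0.10695067, 0.04830869, 0.15843880,
         0.05145526, -0.36794099, 0.26352114, 0.15056113, 0.01890375,
         0.15994578, 0.32303535, 0.13463980, 0.02773183)
tlt <- c(0.110907611, 0.015936985, 0.087089794, 0.085918063, 0.007178861,
         0.102972399, 0.339612265, -0.218105306, 0.090181150, 0.339915713,
         0.024083393, -0.133818884, 0.273123290, 0.006922893)
strat <- c(0.01131366, 0.12566042, 0.09724221, 0.10525398, 0.05294557,
           0.06230742, 0.19590423, 0.18826736, 0.16436950, 0.24562838,
           0.06051237, 0.13760060, 0.19637382, 0.02788726)

beatsWorse  <- strat > pmin(spy, tlt)  # better than the weaker instrument?
beatsBetter <- strat > pmax(spy, tlt)  # better than the stronger instrument?

print(all(beatsWorse))     # does it beat the worse instrument every year?
print(all(strat > 0))      # is every year positive?
print(years[beatsBetter])  # years in which it beat both instruments
```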

For the record, here is the weight of SPY in the strategy.

weightSPY <- apply(monthlyModSharpe, 1, which.max)
weightSPY <- do.call(rbind, weightSPY)
weightSPY <- (weightSPY-1)*.05
align <- cbind(weightSPY, stratRets)
align <- na.locf(align)
chart.TimeSeries(align[,1], date.format="%Y", ylab="Weight SPY", main="Weight of SPY in SPY-TLT pair")

Now, while this may serve as a standalone strategy for some people, the takeaway in my opinion is that dynamically re-weighting two return streams that share a negative correlation can lead to some very strong results compared to the ingredients from which they were formed. Furthermore, rather than simply relying on one number to summarize a relationship between two instruments, Frank Grossman’s approach of actually modeling the combined returns is one I find interesting, and it undoubtedly has applications as a general walk-forward process.

Thanks for reading.
