A New Volatility Strategy, And A Heuristic For Analyzing Robustness

This post is motivated by a discussion that arose when I tested a strategy by Frank of Trading The Odds (post here). One point, brought up by Tony Cooper of Double Digit Numerics, the original author of the paper that Trading The Odds now trades (I consider it a huge honor that my blog is read by authors of original trading strategies), is that my heatmap analysis only looked at cross-sectional performance, as opposed to performance over time–that is, performance that was outstanding over the course of the entire backtest could have been the result of a few lucky months. This is a fair point, which I hope this post will address in terms of a heuristic using both visual and analytical outputs.

The strategy for this post is the following, provided to me kindly by Mr. Helmuth Vollmeier (whose help in all my volatility-related investigations cannot be overstated):

Consider VXV and VXMT, the three-month and six-month implied volatility indices on the S&P 500. Define contango as VXV/VXMT < 1, and backwardation as the opposite. Additionally, take an SMA of said ratio. Go long VXX when the ratio is greater than 1 and above its SMA, and go long XIV when the converse holds. Or, in my case, get in at the close when that happens and exit at the next day's close after the converse occurs (that is, my replication is slightly off due to using some rather simplistic coding for illustrative purposes).

In any case, here is the script for setting up the strategy, most of which is just downloading the data–the strategy itself is just a few lines of code:

require(downloader)
require(quantmod) #for Cl(); also loads TTR for SMA()
require(PerformanceAnalytics)

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vxvdailyprices.csv", 
         destfile="vxvData.csv")
download("http://www.cboe.com/publish/ScheduledTask/MktData/datahouse/vxmtdailyprices.csv", 
         destfile="vxmtData.csv")

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
vxmt <- xts(read.zoo("vxmtData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))
ratio <- Cl(vxv)/Cl(vxmt)


download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))

xiv <- merge(xiv, ratio, join="inner")
vxx <- merge(vxx, ratio, join="inner")
colnames(xiv)[5] <- colnames(vxx)[5] <- "ratio"

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

retsList <- list()
for(i in 10:200) {
  ratioSMA <- SMA(ratio, n=i)
  vxxSig <- lag(ratio > 1 & ratio > ratioSMA)
  xivSig <- lag(ratio < 1 & ratio < ratioSMA)
  rets <- vxxSig*vxxRets + xivSig*xivRets
  colnames(rets) <- i
  retsList[[i]]  <- rets
}
retsList <- do.call(cbind, retsList)
colnames(retsList) <- gsub("X", "", colnames(retsList))
charts.PerformanceSummary(retsList)
retsList <- retsList[!is.na(retsList[,191]),]
retsList <- retsList[-1,]

retsList <- retsList["::2014-11-28"] #for monthly aggregation, remove start of Dec 2014

About as straightforward as things get (and the results, as we'll see, are solid as well). In this case, I tested every SMA length from 10 days up to the classic 200 days. And since this is a single-parameter strategy (unless you want to get into adjusting the ratio's critical value up and down away from 1), instead of heatmaps we'll make do with basic scatter plots and line plots (which keep things about as simple as they come).

The heuristic I decided upon was to take some PerfA functions (Return.annualized, SharpeRatio.annualized, for instance), and compare the rank of the average of their monthly ranks (that is, a two-layer rank, very similar to the process in Flexible Asset Allocation) to the aggregate, whole time-period rank. The idea here is that performance based on a few lucky months may have a high aggregate ranking, but a much lower monthly ranking, which would be reflected in a scatter plot. Ideally, the scatter plot would go from lower left to upper right in terms of ranks comparisons, with a correlation of 1, meaning that the strategy with the best overall return will have the best average monthly return rank, and so on down the list. This can also apply to the Sharpe ratio, and so on.

Here is my off-the-cuff implementation of such an idea:

rankComparison <- function(rets, perfAfun="Return.cumulative") {
  fun <- match.fun(perfAfun)
  monthlyFun <- apply.monthly(rets, fun) #the metric for each configuration, month by month
  monthlyRank <- t(apply(monthlyFun, MARGIN=1, FUN=rank)) #rank the configurations within each month
  meanMonthlyRank <- apply(monthlyRank, MARGIN=2, FUN=mean) #average each configuration's monthly rank
  rankMMR <- rank(meanMonthlyRank) #second-layer rank of those averages
  
  aggFun <- fun(rets) #single-pass, whole-period metric
  aggFunRank <- rank(aggFun)
  plot(aggFunRank~rankMMR, main=perfAfun)
  print(cor(aggFunRank, meanMonthlyRank)) #correlation of aggregate rank against average monthly rank
}

So, I get a chart and a correlation of average monthly ranks against a single-pass whole-period rank. Here are the results for cumulative returns and Sharpe ratio:

> rankComparison(retsList)
[1] 0.8485374

Basically, the interpretation is this: the outliers above and to the left of the main cluster can be interpreted as configurations that had a few "lucky months", while those to the lower right consistently perform somewhat well but, for whatever reason, are just stricken with bad luck on aggregate. However, the critical result we're looking for is that the best overall performers (the highest aggregate rank) are also those with the most *consistent* performance (the highest average monthly rank), which is generally what we see.

Furthermore, the correlation of .85 also lends credence to the idea that this is a robust strategy.

Here’s the process repeated with the annualized Sharpe ratio:

> rankComparison(retsList, perfAfun="SharpeRatio.annualized")
[1] 0.8647353

In other words, an even clearer relationship here, and again, we see that the best performers overall are also the best monthly performers, so we can feel safe in the robustness of the strategy.

So what's the punchline? Well, now that we've established that the best results on aggregate are also the strongest results when analyzed across time, let's see whether the various rankings of risk and reward metrics reveal which configurations those are.

Here's a chart of the aggregate rankings of annualized return (whose rank ordering is the same as that of cumulative return), annualized Sharpe, MAR (annualized return over max drawdown), and max drawdown.

aggReturns <- Return.annualized(retsList)
aggSharpe <- SharpeRatio.annualized(retsList)
aggMAR <- Return.annualized(retsList)/maxDrawdown(retsList)
aggDD <- maxDrawdown(retsList)

plot(rank(aggReturns)~as.numeric(colnames(aggReturns)), type="l", ylab="rank", xlab="SMA",
     main="Risk and return rank comparison")
lines(rank(aggSharpe)~as.numeric(colnames(aggSharpe)), col="blue")
lines(rank(aggMAR)~as.numeric(colnames(aggMAR)), col="red")
lines(rank(-aggDD)~as.numeric(colnames(aggDD)), col="green") #negate drawdown so smaller drawdowns rank higher
legend("bottomright", c("Return rank", "Sharpe rank", "MAR rank", "Drawdown rank"), lty=1, col=c("black", "blue", "red", "green"))

And the resulting plot itself:

So, looking at these results, here are some interpretations, moving from left to right:

At the lower end of the SMA, the results are just plain terrible. Sure, the drawdowns are lower, but the returns are in the basement.
The spike around the 50-day SMA makes me question if there is some sort of behavioral bias at work here.
Next, there's a region of fairly solid performance between that and the 100-day SMA, but it is surrounded on both sides by pretty abysmal performance.
Moving on to the 100-day SMA region, the annualized returns and Sharpe ratios are strong, but if the parameter estimate is slightly off going forward, there's a severe risk of incurring heavy drawdowns. The sudden improvement in the drawdown metric is also interesting. Again, is there some sort of bias towards round numbers? (50, 100, etc.)
Lastly, there’s nothing particularly spectacular about the performances until we get to the high 100s and the 200 day SMA, at which point, we see a stable region of configurations with high ranks in all categories.

Let’s look at that region more closely:

truncRets <- retsList[,161:191]
stats <- data.frame(cbind(t(Return.annualized(truncRets)),
                 t(SharpeRatio.annualized(truncRets)),
                 t(maxDrawdown(truncRets))))
colnames(stats) <- c("A.Return", "A.Sharpe", "Worst_Drawdown")
stats$MAR <- stats[,1]/stats[,3]
stats <- round(stats, 3)

And the results:

> stats
    A.Return A.Sharpe Worst_Drawdown   MAR
170    0.729    1.562          0.427 1.709
171    0.723    1.547          0.427 1.693
172    0.723    1.548          0.427 1.694
173    0.709    1.518          0.427 1.661
174    0.711    1.522          0.427 1.665
175    0.711    1.522          0.427 1.665
176    0.711    1.522          0.427 1.665
177    0.711    1.522          0.427 1.665
178    0.696    1.481          0.427 1.631
179    0.667    1.418          0.427 1.563
180    0.677    1.441          0.427 1.586
181    0.677    1.441          0.427 1.586
182    0.677    1.441          0.427 1.586
183    0.675    1.437          0.427 1.582
184    0.738    1.591          0.427 1.729
185    0.760    1.637          0.403 1.886
186    0.794    1.714          0.403 1.970
187    0.798    1.721          0.403 1.978
188    0.802    1.731          0.403 1.990
189    0.823    1.775          0.403 2.042
190    0.823    1.774          0.403 2.041
191    0.823    1.774          0.403 2.041
192    0.819    1.765          0.403 2.031
193    0.822    1.772          0.403 2.040
194    0.832    1.792          0.403 2.063
195    0.832    1.792          0.403 2.063
196    0.802    1.723          0.403 1.989
197    0.810    1.741          0.403 2.009
198    0.782    1.677          0.403 1.941
199    0.781    1.673          0.403 1.937
200    0.779    1.670          0.403 1.934

So starting from SMA 186 through SMA 200, we see some fairly strong performance–returns in the high 70s to the low 80s (percent, annualized), and MARs in the high 1s to low 2s. And since this is about a trading strategy, equity curves are, of course, obligatory. Here is what that looks like:

strongRets <- retsList[,177:191]
charts.PerformanceSummary(strongRets)

Basically, on aggregate, some very strong performance. However, it is certainly not *smooth* performance. New equity highs are followed by strong drawdowns, which are then followed by a recovery and new, higher equity highs.

To conclude (for the moment–I'll have a new post on this next week with a slight wrinkle that gets even better results), I hope that I presented not only a simple but effective strategy, but also a simple but effective (if a bit time consuming, to do all the monthly computations on 191 return streams) heuristic, suggested/implied by Tony Cooper of Double Digit Numerics, for analyzing the performance and robustness of your trading strategies. Certainly, while many professors and theorists elucidate on robustness (with plenty of math that makes stiff bagels look digestible), I believe not a lot of attention is actually paid to it in more common circles, using more intuitive methods. After all, if someone wanted to be an unscrupulous individual selling trading systems or signals (instead of worrying about the strategy's capacity for capital), it's easy to show an overfit equity curve while making up some excuse so as to not reveal the (most likely overfit) strategy. One thing I hope this post inspires is for individuals to not only look at equity curves, but also at plots such as aggregate rankings against average monthly (or higher-frequency, if the strategies are tested over mere months, as with intraday trading) metric rankings when performing parameter optimization.

Is this heuristic the most robust and advanced that can be done? Probably not. Would one need to employ even more advanced techniques if computing time becomes an issue? Probably (bootstrapping and sampling come to mind). Can this be built on? Of course. *Will* someone build on it? I certainly plan on revisiting this topic in the future.

Lastly, on the nature of the strategy itself: while Trading The Odds presented a strategy functioning on a very short time frame, I'm surprised that, instead, this strategy's parameters sit at the much higher end of the numerical spectrum.

Thanks for reading.

NOTE: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

An Update on Flexible Asset Allocation

A few weeks back, after seeing my replication, one of the original authors of the Flexible Asset Allocation paper got in touch with me to suggest a slight adjustment to the code: rather than removing negative-momentum securities before performing any ranking, perform all of the ranking without taking absolute momentum into account, and only remove negative absolute momentum securities at the very end, after allocating weights.

Here’s the new code:

FAA <- function(prices, monthLookback = 4,
                weightMom = 1, weightVol = .5, weightCor = .5, 
                riskFreeName = NULL, bestN = 3,
                stepCorRank = FALSE, stepStartMethod = c("best", "default"),
                geometric = TRUE, ...) {
  stepStartMethod <- stepStartMethod[1]
  if(is.null(riskFreeName)) {
    prices$zeroes <- 0
    riskFreeName <- "zeroes"
    warning("No risk-free security specified. Recommended to use one of: quandClean('CHRIS/CME_US'), SHY, or VFISX. 
            Using vector of zeroes instead.")
  }
  returns <- Return.calculate(prices)
  monthlyEps <- endpoints(prices, on = "months")
  riskFreeCol <- grep(riskFreeName, colnames(prices))
  tmp <- list()
  dates <- list()
  
  for(i in 2:(length(monthlyEps) - monthLookback)) {
    #subset data
    priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
    returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]
    
    #perform computations
    momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
    momentum <- momentum[,!is.na(momentum)]
    #momentum[is.na(momentum)] <- -1 #set any NA momentum to negative 1 to keep R from crashing
    priceData <- priceData[,names(momentum)]
    returnsData <- returnsData[,names(momentum)]
    
    momRank <- rank(momentum)
    vols <- data.frame(StdDev(returnsData))
    volRank <- rank(-vols)
    cors <- cor(returnsData, use = "complete.obs")
    if (stepCorRank) {
      if(stepStartMethod=="best") {
        compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
        maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
        corRank <- stepwiseCorRank(corMatrix=cors, startNames = names(maxRank), 
                                        bestHighestRank = TRUE, ...)
        
      } else {
        corRank <- stepwiseCorRank(corMatrix=cors, bestHighestRank=TRUE, ...)
      }
    } else {
      corRank <- rank(-rowSums(cors))
    }
    
    totalRank <- rank(weightMom*momRank + weightVol*volRank + weightCor*corRank)
    
    upper <- length(names(returnsData))
    lower <- max(upper-bestN+1, 1)
    topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]
    
    #compute weights
    longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
    longs[momentum < 0] <- 0 #in previous algorithm, removed momentums < 0, this time, we zero them out at the end.
    longs <- longs/sum(longs) #equal weight all candidates
    longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
    names(longs) <- names(totalRank)
    
    
    #append removed names (those with momentum < 0)
    removedZeroes <- rep(0, ncol(returns)-length(longs))
    names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
    longs <- c(longs, removedZeroes)
    
    #reorder to be in the same column order as original returns/prices
    longs <- data.frame(t(longs))
    longs <- longs[, names(returns)]
    
    #append lists
    tmp[[i]] <- longs
    dates[[i]] <- index(returnsData)[nrow(returnsData)]
  }
  
  weights <- do.call(rbind, tmp)
  dates <- do.call(c, dates)
  weights <- xts(weights, order.by=as.Date(dates)) 
  weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
  strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = geometric)
  colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
  return(strategyReturns)
}

And here are the new results, both with the original configuration, and using the stepwise correlation ranking algorithm introduced by David Varadi:

require(quantmod)
require(PerformanceAnalytics)
#note: FAAreturns and stepwiseCorRank are defined in earlier posts on this blog

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid-1997 through the end of October 2014
getSymbols(mutualFunds, from="1997-06-30", to="2014-10-30")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

original <- FAA(adPrices, riskFreeName="VFISX")
swc <- FAA(adPrices, riskFreeName="VFISX", stepCorRank = TRUE)
originalOld <- FAAreturns(adPrices, riskFreeName="VFISX")
swcOld <- FAAreturns(adPrices, riskFreeName="VFISX", stepCorRank=TRUE)
all4 <- cbind(original, swc, originalOld, swcOld)
names(all4) <- c("original", "swc", "origOld", "swcOld")
charts.PerformanceSummary(all4)
> rbind(Return.annualized(all4)*100,
+       maxDrawdown(all4)*100,
+       SharpeRatio.annualized(all4))
                                 original       swc   origOld    swcOld
Annualized Return               12.795205 14.135997 13.221775 14.037137
Worst Drawdown                  11.361801 11.361801 13.082294 13.082294
Annualized Sharpe Ratio (Rf=0%)  1.455302  1.472924  1.377914  1.390025

And the resulting equity curve comparison

Overall, it seems that filtering on absolute momentum after applying all weightings (using only relative momentum to rank) actually improves downside risk profiles ever so slightly compared to removing negative-momentum securities ahead of time. In any case, FAAreturns will be the function that removes negative-momentum securities ahead of time, and FAA will be the one that removes them after all else is said and done.

I’ll return to the standard volatility trading agenda soon.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Trading The Odds Volatility Risk Premium: Addressing Data Mining and Curve-Fitting

Several readers, upon seeing the risk and return ratio along with other statistics in the previous post, suggested that the result may have been a product of data mining/over-optimization/curve-fitting/overfitting; in other words, the bad practice of creating an amazing equity curve whose performance will decay out of sample.

Fortunately, there’s a way to test that assertion. In their book “Trading Systems: A New Approach to System Development and Portfolio Optimization”, Urban Jaekle and Emilio Tomasini use the concept of the “stable region” to demonstrate a way of visualizing whether or not a parameter specification is indeed overfit. The idea of a stable region is that going forward, how robust is a parameter specification to slight changes? If the system just happened to find one good small point in a sea of losers, the strategy is likely to fail going forward. However, if small changes in the parameter specifications still result in profitable configurations, then the chosen parameter set is a valid configuration.

As Frank’s trading strategy only has two parameters (standard deviation computation period, aka runSD for the R function, and the SMA period), rather than make line graphs, I decided to do a brute force grid search just to see other configurations, and plotted the results in the form of heatmaps.

Here’s the modified script for the computations (no parallel syntax in use for the sake of simplicity):

download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxmt <- xts(read.zoo("vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

getSymbols("^VIX", from="2004-03-29")

vixvxmt <- merge(Cl(VIX), Cl(vxmt))
vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))

getSymbols("^GSPC", from="1990-01-01")
spyRets <- diff(log(Cl(GSPC)))

t1 <- Sys.time()
MARmatrix <- list()
SharpeMatrix <- list()
for(i in 2:21) {
  
  smaMAR <- list()
  smaSharpe <- list()
  for(j in 2:21){
    spyVol <- runSD(spyRets, n=i)
    annSpyVol <- spyVol*100*sqrt(252)
    vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
    
    
    vols$smaDiff <- SMA(vols[,1] - vols[,2], n=j)
    vols$signal <- vols$smaDiff > 0
    vols$signal <- lag(vols$signal, k = 1)
    
    stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets
    #charts.PerformanceSummary(stratRets)
    #stratRets[is.na(stratRets)] <- 0
    #plot(log(cumprod(1+stratRets)))
    
    stats <- data.frame(cbind(Return.annualized(stratRets)*100, 
                              maxDrawdown(stratRets)*100, 
                              SharpeRatio.annualized(stratRets)))
    
    colnames(stats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
    MAR <- as.numeric(stats[1])/as.numeric(stats[2])    
    smaMAR[[j-1]] <- MAR
    smaSharpe[[j-1]] <- stats[,3]
  }
  rm(vols)
  smaMAR <- do.call(c, smaMAR)
  smaSharpe <- do.call(c, smaSharpe)
  MARmatrix[[i-1]] <- smaMAR
  SharpeMatrix[[i-1]] <- smaSharpe
}
t2 <- Sys.time()
print(t2-t1)

Essentially, just wrap the previous script in a nested for loop over the two parameters.

I chose ggplot2 to plot the heatmaps, for more control over the coloring.

Here’s the heatmap for the MAR ratio (that is, returns over max drawdown):

require(reshape2) #for melt
require(ggplot2)

MARmatrix <- do.call(cbind, MARmatrix)
rownames(MARmatrix) <- paste0("SMA", c(2:21))
colnames(MARmatrix) <- paste0("runSD", c(2:21))
MARlong <- melt(MARmatrix)
colnames(MARlong) <- c("SMA", "runSD", "MAR")
MARlong$SMA <- as.numeric(gsub("SMA", "", MARlong$SMA))
MARlong$runSD <- as.numeric(gsub("runSD", "", MARlong$runSD))
MARlong$scaleMAR <- scale(MARlong$MAR)
ggplot(MARlong, aes(x=SMA, y=runSD, fill=scaleMAR))+geom_tile()+scale_fill_gradient2(high="skyblue", mid="blue", low="red")

Here’s the result:

Immediately, we start to see some answers to questions regarding overfitting. First off, is the parameter set published by TradingTheOdds optimized? Yes. In fact, not only is it optimized, it's by far and away the best value on the heatmap. However, when discussing overfitting, curve-fitting, and the like, the question to ask isn't "is this the best parameter set available?", but rather "is the parameter set in a stable region?" The answer, in my opinion, is yes, as noted by the consistently strong values across the different SMA settings for the 2-day sample standard deviation. Note that this quantity, being a two-observation sample standard deviation, is just the square root of the sum of the two squared deviations from their mean for that period (equivalently, the absolute difference of the two returns divided by the square root of two).
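To make that last identity concrete, here's a quick sanity check with made-up numbers (nothing from the backtest):

x <- c(.01, -.02)    #two hypothetical daily returns
sd(x)                #0.0212132
abs(diff(x))/sqrt(2) #same value: for two observations, the sample standard deviation
                     #is just the absolute difference divided by sqrt(2)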

Here are the MAR values for those configurations:

> MARmatrix[1:10,1]
    SMA2     SMA3     SMA4     SMA5     SMA6     SMA7     SMA8     SMA9    SMA10    SMA11 
2.471094 2.418934 2.067463 3.027450 2.596087 2.209904 2.466055 1.394324 1.860967 1.650588 

In this case, not only is the region stable, but the MAR values are all above 2 until the SMA9 value.

Furthermore, note that aside from the stable region of the 2-day sample standard deviation, a stable region using a standard deviation of ten days with less smoothing from the SMA (because there’s already an average inherent in the sample standard deviation) also exists. Let’s examine those values.

> MARmatrix[2:5, 9:16]
      runSD10  runSD11  runSD12  runSD13  runSD14  runSD15  runSD16   runSD17
SMA3 1.997457 2.035746 1.807391 1.713263 1.803983 1.994437 1.695406 1.0685859
SMA4 2.167992 2.034468 1.692622 1.778265 1.828703 1.752648 1.558279 1.1782665
SMA5 1.504217 1.757291 1.742978 1.963649 1.923729 1.662687 1.248936 1.0837615
SMA6 1.695616 1.978413 2.004710 1.891676 1.497672 1.471754 1.194853 0.9326545

Apparently, a standard deviation between 2 and 3 weeks with minimal SMA smoothing also produced some results comparable to the 2-day variant.

Off to the northeast of the plot, using longer periods for the parameters simply causes the risk-to-reward performance to drop steeply. This is essentially an illustration of the detriments of lag.

Finally, there’s a small rough patch between the two aforementioned stable regions. Here’s the data for that.

> MARmatrix[1:5, 4:8]
       runSD5    runSD6    runSD7   runSD8   runSD9
SMA2 1.928716 1.5825265 1.6624751 1.033216 1.245461
SMA3 1.528882 1.5257165 1.2348663 1.364103 1.510653
SMA4 1.419722 0.9497827 0.8491229 1.227064 1.396193
SMA5 1.023895 1.0630939 1.3632697 1.547222 1.465033
SMA6 1.128575 1.3793244 1.4085513 1.440324 1.964293

As you can see, there are some patches where the MAR is below 1, and many where it’s below 1.5. All of these are pretty detached from the stable regions.

Let’s repeat this process with the Sharpe Ratio heatmap.

SharpeMatrix <- do.call(cbind, SharpeMatrix)
rownames(SharpeMatrix) <- paste0("SMA", c(2:21))
colnames(SharpeMatrix) <- paste0("runSD", c(2:21))
sharpeLong <- melt(SharpeMatrix)
colnames(sharpeLong) <- c("SMA", "runSD", "Sharpe")
sharpeLong$SMA <- as.numeric(gsub("SMA", "", sharpeLong$SMA))
sharpeLong$runSD <- as.numeric(gsub("runSD", "", sharpeLong$runSD))
ggplot(sharpeLong, aes(x=SMA, y=runSD, fill=Sharpe))+geom_tile()+
  scale_fill_gradient2(high="skyblue", mid="blue", low="darkred", midpoint=1.5)

And the result:

Again, the TradingTheOdds parameter configuration lights up, but among a region of strong configurations. This time, we can see that in comparison to the rest of the heatmap, the northern stable region seems to have become clustered around the 10-day standard deviation (or 11) with SMAs of 2, 3, and 4. The regions to the northeast are also more subdued by comparison, with the Sharpe ratio bottoming out around 1.

Let’s look at the numerical values again for the same regions.

Two-day standard deviation region:

> SharpeMatrix[1:10,1]
    SMA2     SMA3     SMA4     SMA5     SMA6     SMA7     SMA8     SMA9    SMA10    SMA11 
1.972256 2.210515 2.243040 2.496178 1.975748 1.965730 1.967022 1.510652 1.963970 1.778401 

Again, numbers the likes of which I myself haven’t been able to achieve with more conventional strategies, and numbers the likes of which I haven’t really seen anywhere for anything on daily data. So either the strategy is fantastic, or something is terribly wrong outside the scope of the parameter optimization.

Two-to-three-week standard deviation region:

> SharpeMatrix[1:5, 9:16]
      runSD10  runSD11  runSD12  runSD13  runSD14  runSD15  runSD16  runSD17
SMA2 1.902430 1.934403 1.687430 1.725751 1.524354 1.683608 1.719378 1.506361
SMA3 1.749710 1.758602 1.560260 1.580278 1.609211 1.722226 1.535830 1.271252
SMA4 1.915628 1.757037 1.560983 1.585787 1.630961 1.512211 1.433255 1.331697
SMA5 1.684540 1.620641 1.607461 1.752090 1.660533 1.500787 1.359043 1.276761
SMA6 1.735760 1.765137 1.788670 1.687369 1.507831 1.481652 1.318751 1.197707

Again, pretty outstanding numbers.

The rough patch:

> SharpeMatrix[1:5, 4:8]
       runSD5   runSD6   runSD7   runSD8   runSD9
SMA2 1.905192 1.650921 1.667556 1.388061 1.454764
SMA3 1.495310 1.399240 1.378993 1.527004 1.661142
SMA4 1.591010 1.109749 1.041914 1.411985 1.538603
SMA5 1.288419 1.277330 1.555817 1.753903 1.685827
SMA6 1.278301 1.390989 1.569666 1.650900 1.777006

All Sharpe ratios are higher than 1, though some are below 1.5.

So, to conclude this post:

Was the replication using optimized parameters? Yes. However, those optimized parameters were found within a stable (and even strong) region. Furthermore, it isn't as though the strategy exhibits poor risk-to-return metrics beyond those regions, either. Aside from configurations that raise the lookback period on both the moving average and the standard deviation to levels that no longer resemble the original replication, performance ranged from solid to stellar.

Does this necessarily mean that there is nothing wrong with the strategy? No. It could be that the performance is an artifact of "observe the close, enter at the close" optimistic execution assumptions. For instance, quantstrat (the go-to backtest engine in R for more trading-oriented statistics) uses a next-bar execution method that defaults to the *next* day's close (so if you look back over my quantstrat posts, I use prefer="open" so as to get the open of the next bar, instead of its close). It could also be that VXMT itself is an instrument that isn't very well known in the public sphere, seeing as how Yahoo Finance barely has any data on it. Lastly, it could simply be that although the risk-to-reward ratios seem amazing, many investors/mutual fund managers/etc. don't want to think "I'm down 40-60% from my peak", even though it's arguably easier to tame a strategy with a good reward-to-risk ratio but excess risk by adding cash (to use a cooking analogy, think of your favorite spice: good in small quantities) than it is to find leverage for a good reward-to-risk strategy with very small returns (not to mention incurring all the other risks that come with leverage to begin with, such as a 50% drawdown wiping out an account leveraged two to one).
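On the first of those caveats, here's roughly what the more conservative execution assumption looks like in quantstrat. This is only a fragment, assuming a strategy object and a "longEntry" signal have already been set up (as in the quantstrat demo further down); the point is simply the prefer="Open" argument.

#fill next-bar orders at the following day's open rather than its close
add.rule(strategy.st, name="ruleSignal",
         arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
                        orderside="long", replace=FALSE, prefer="Open"),
         type="enter", path.dep=TRUE)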

However, to address the question of overfitting, through a modified technique from Jaekle and Tomasini (2009), these are the results I found.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Volatility Risk Premium: Sharpe 2+, Return to Drawdown 3+

First, before starting this post, I’d like to give one last comment about my previous post:

I called Vanguard to inquire about the trading policies on VWEHX and VFISX, and there are two-month cooldown periods (aka frequent-trading policies) on those mutual funds. However, the HYG ETF does indeed pay dividends, so the adjusted ETF variant is most likely the closest to the performance an investor can expect. Still, a Sharpe ratio higher than 1.25 is nothing to scoff at. Of course, no transaction costs are assumed in any of my strategies, so make sure your broker isn't ripping you off if you actually intend on seriously investing in anything I publish on this blog (I hear Interactive Brokers charges $1 per transaction), and once again, remember that none of this constitutes official advice.

Now, onto this post:

Judging by the attention some of my previous volatility posts have garnered through my replication of SeekingAlpha strategies, today, I am going to share a strategy whose statistics boggle my mind.

The strategy was presented by TradingTheOdds in this post. I had to replicate it for myself to be sure it worked as advertised, but unless I have something horrendously incorrect, this strategy works…quite well. Here’s the strategy:

Using the actual S&P 500 index, compute the two-day annualized historical volatility. Subtract that from the VXMT, which is the six-month expected volatility of the S&P 500 (prior to 2008, use the actual VIX). Then, take the 5-day SMA of that difference. If this number is above 0, go long XIV, otherwise go long VXX. In my replication, the strategy uses market-on-close orders (AKA observe “near” the close, buy at the close), so the strategy should be taken with a little bit of a grain of salt. Here’s the code:

download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")

download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxmt <- xts(read.zoo("vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

getSymbols("^VIX", from="2004-03-29")

vixvxmt <- merge(Cl(VIX), Cl(vxmt))
vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

getSymbols("^GSPC", from="1990-01-01")
spyRets <- diff(log(Cl(GSPC)))

spyVol <- runSD(spyRets, n=2)
annSpyVol <- spyVol*100*sqrt(252)

vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
vols$smaDiff <- SMA(vols[,1] - vols[,2], n=5)
vols$signal <- vols$smaDiff > 0
vols$signal <- lag(vols$signal, k = 1)

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))
stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets

The VXMT data is taken from the link I showed earlier. So, as I interpret it, this strategy seems to be saying the following:

Since we are subtracting the near-term historical volatility (which I suppose is meant to be a proxy for the "forecast of current volatility") from the long-term expected volatility (VXMT), the implied hypothesis seems to be that when that difference is positive, volatility is being overestimated by the VXMT, so we should go short volatility. Conversely, if the near-term historical volatility is higher than the expected volatility, it means we should be long volatility instead. So, here's the punchline that is the equity curve:

charts.PerformanceSummary(stratRets)

Yes, you’re looking at that correctly–over approximately ten years (slightly longer), this strategy had a cumulative return of about $10,000 for every $1 invested. Just to put this into perspective, here’s a log-scale equity curve.
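For those following along, the log-scale curve below can be reproduced with something along these lines (the title string is just illustrative):

plot(log(cumprod(1 + na.omit(stratRets))), main="log-scale equity curve")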

There are a few noticeable dips which correspond to around 40% drawdowns on the regular scale. Now, let’s look at the usual statistics:

stats <- data.frame(cbind(Return.annualized(stratRets)*100, 
               maxDrawdown(stratRets)*100, 
               SharpeRatio.annualized(stratRets)))

colnames(stats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
stats$MAR <- as.numeric(stats[1])/as.numeric(stats[2])

With the following result:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          137.7875     45.64011          2.491509 3.019001

Risky, judging from maximum drawdown alone? Yes. But is there reward for the risk? Absolutely.

To put a lower bound on the performance of the strategy, here are the same diagrams and statistics with the signal lagged one more day (that is, since the returns use close-to-close data and work off of the closing prices, the signal is lagged by a day to avoid lookahead bias. Lagging the signal by one more day would mean receiving the signal at the close of day t, but only entering on the close of day t+1).

In effect, this is the coding change:

vols$signal <- lag(vols$signal, k = 1)

Becomes

vols$signal <- lag(vols$signal, k = 2)

Here are the results:

Equity curve/drawdowns:

So, still impressive on the returns, but now there are some very pronounced drawdowns–slightly larger and, more concerning, longer.

Log equity curve:

Again, slightly more pronounced dips.

Here are the statistics:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          84.36546     56.77219          1.521165 1.486035

So still quite respectable for a strategy this understandable.

I'm fairly certain that there's still room for improvement on this strategy, considering that anywhere there's a 5-day SMA, there's often room for better trend-following indicators, and possibly a better proxy for the volatility of the S&P 500 than the 2-day rolling annualized metric. But as it stands, I think that, based on its risk/reward characteristics, this strategy has a place as an aggressive, returns-generating component of a portfolio of strategies.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Predicting High Yield with SPY–a Two Part Post

This post will cover ideas from two individuals: David Varadi of CSS Analytics, with whom I am currently collaborating on some volatility trading strategies (the extent of which I hope will end up as a workable trading strategy–my current replicas of some of VolatilityMadeSimple's publicly displayed "example" strategies (note: example strategies from other blogs, not to be confused with their proprietary strategy) are something I think is too risky to trade as-is), and Cesar Alvarez, of Alvarez Quant Trading. If his name sounds familiar to some of you, that's because it should. He used to collaborate (still does?) with Larry Connors of TradingMarkets.com, and I'm pretty sure that sometime in the future, I'll cover those strategies as well.

The strategy for this post is simple, and taken from this post from CSS Analytics.

Pretty straightforward–compute a 20-day SMA on the SPY (I use unadjusted since that’s what the data would have actually been). When the SPY’s close crosses above the 20-day SMA, buy the high-yield bond index, either VWEHX or HYG, and when the converse happens, move to the cash-substitute security, either VFISX or SHY.

Now, while the above paragraph may make it seem that VWEHX and HYG are perfect substitutes, they aren't, as no two instruments are exactly alike, which, as noted in my last post, is a detail one should be mindful of. Even creating a synthetic "equivalent" is never exactly perfect. Even though I try my best to iron out such issues, over the course of generally illustrating an idea, the numbers won't line up exactly (though hopefully, they'll be close). In any case, it's best to leave an asterisk whenever one is forced to use synthetics for the sake of a prolonged backtest.

The other elephant/gorilla in the room (depending on your preference for metaphorical animals), is whether or not to use adjusted data. The upside to that is that dividends are taken into account. The *downside* to that is that the data isn’t the real data, and also assumes a continuous reinvestment of dividends. Unfortunately, shares of a security are not continuous quantities–they are discrete quantities made so by their unit price, so the implicit assumptions in adjusted prices can be optimistic.

For this particular topic, Cesar Alvarez covered it exceptionally well on his blog post, and I highly recommend readers give that post a read, in addition to following his blog in general. However, just to illustrate the effect, let’s jump into the script.


getSymbols("VWEHX", from="1950-01-01")
getSymbols("SPY", from="1900-01-01")
getSymbols("HYG", from="1990-01-01")
getSymbols("SHY", from="1990-01-01")
getSymbols("VFISX", from="1990-01-01")

spySma20Cl <- SMA(Cl(SPY), n=20)
clSig <- Cl(SPY) > spySma20Cl
clSig <- lag(clSig, 1)

vwehxCloseRets <- Return.calculate(Cl(VWEHX))
vfisxCloseRets <- Return.calculate(Cl(VFISX))
vwehxAdjustRets <- Return.calculate(Ad(VWEHX))
vfisxAdjustRets <- Return.calculate(Ad(VFISX))

hygCloseRets <- Return.calculate(Cl(HYG))
shyCloseRets <- Return.calculate(Cl(SHY))
hygAdjustRets <- Return.calculate(Ad(HYG))
shyAdjustRets <- Return.calculate(Ad(SHY))

mutualAdRets <- vwehxAdjustRets*clSig + vfisxAdjustRets*(1-clSig)
mutualClRets <- vwehxCloseRets*clSig + vfisxCloseRets*(1-clSig)

etfAdRets <- hygAdjustRets*clSig + shyAdjustRets*(1-clSig)
etfClRets <- hygCloseRets*clSig + shyCloseRets*(1-clSig)

Here are the results:


mutualFundBacktest <- merge(mutualAdRets, mutualClRets, join='inner')
charts.PerformanceSummary(mutualFundBacktest)
data.frame(t(rbind(Return.annualized(mutualFundBacktest)*100, 
                   maxDrawdown(mutualFundBacktest)*100,
                   SharpeRatio.annualized(mutualFundBacktest))))

Which produces the following equity curves:

As can be seen, the choice to adjust or not can be pretty enormous. Here are the corresponding three statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         14.675379       2.954519                        3.979383
VWEHX.Close             7.794086       4.637520                        3.034225

Even without the adjustment, the strategy itself is…very very good, at least from this angle. Let’s look at the ETF variant now.

etfBacktest <- merge(etfAdRets, etfClRets, join='inner')
charts.PerformanceSummary(etfBacktest)
data.frame(t(rbind(Return.annualized(etfBacktest)*100, 
                   maxDrawdown(etfBacktest)*100,
                   SharpeRatio.annualized(etfBacktest))))

The resultant equity curve:

With the corresponding statistics:

             Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
HYG.Adjusted         11.546005       6.344801                       1.4674301
HYG.Close             5.530951       9.454754                       0.6840059

Again, another stark difference. Let’s combine all four variants.

fundsAndETFs <- merge(mutualFundBacktest, etfBacktest, join='inner')
charts.PerformanceSummary(fundsAndETFs)
data.frame(t(rbind(Return.annualized(fundsAndETFs)*100, 
                   maxDrawdown(fundsAndETFs)*100,
                   SharpeRatio.annualized(fundsAndETFs))))

The equity curve:

With the resulting statistics:

               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0..
VWEHX.Adjusted         17.424070       2.787889                       4.7521579
VWEHX.Close            11.739849       3.169040                       3.8715923
HYG.Adjusted           11.546005       6.344801                       1.4674301
HYG.Close               5.530951       9.454754                       0.6840059

In short, while the strategy itself seems strong, the particular similar (but not identical) instruments used to implement the strategy make a large difference. So, when backtesting, make sure to understand what taking liberties with the data means. In this case, by turning two levers, the Sharpe Ratio varied from less than 1 to above 4.

Next, I'd like to demonstrate a little trick in quantstrat. Although plenty of example trading strategies only derive indicators (along with signals and rules) from the market data itself, there are also many strategies that incorporate data beyond the price action of the particular security at hand. Examples would be the many SPY strategies that incorporate VIX information, or off-instrument signal strategies like this one.

The way to incorporate off-instrument information into quantstrat simply requires understanding what the mktdata object is, which is nothing more than an xts type object. By default, a security may originally have just the OHLCV and open interest columns. Most demos in the public space generally use data only from the instruments themselves. However, it is very much possible to actually pre-compute signals.

Here’s a continuation of the script to demonstrate, with a demo of the unadjusted HYG leg of this trade:

####### BOILERPLATE FROM HERE
require(quantstrat)

currency('USD')
Sys.setenv(TZ="UTC")
symbols <- "HYG"
stock(symbols, currency="USD", multiplier=1)
initDate="1990-01-01"

strategy.st <- portfolio.st <- account.st <- "preCalc"
rm.strat(portfolio.st)
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD')
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)
######### TO HERE

clSig <- Cl(SPY) > SMA(Cl(SPY), n=20)

HYG <- merge(HYG, clSig, join='inner')
names(HYG)[7] <- "precomputed_signal"

#no parameters or indicators--we precalculated our signal

add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="gt", cross=TRUE),
           label="longEntry")


add.signal(strategy.st, name="sigThreshold",
           arguments=list(column="precomputed_signal", threshold=.5, 
                          relationship="lt", cross=TRUE),
           label="longExit")

add.rule(strategy.st, name="ruleSignal",
         arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
                        orderside="long", replace=FALSE, prefer="Open"),
         type="exit", path.dep=TRUE)

add.rule(strategy.st, name="ruleSignal", 
         arguments=list(sigcol="longExit", sigval=TRUE, orderqty="all", ordertype="market", 
                        orderside="long", replace=FALSE, prefer="Open"), 
         type="exit", path.dep=TRUE)

#apply strategy
t1 <- Sys.time()
out <- applyStrategy(strategy=strategy.st,portfolios=portfolio.st)
t2 <- Sys.time()
print(t2-t1)

#set up analytics
updatePortf(portfolio.st)
dateRange <- time(getPortfolio(portfolio.st)$summary)[-1]
updateAcct(portfolio.st,dateRange)
updateEndEq(account.st)

As you can see, no indicators computed from the actual market data, because the strategy used a pre-computed value to work off of. The lowest-hanging fruit of applying this methodology, of course, would be to append the VIX index as an indicator for trading strategies on the SPY.
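As a minimal sketch of that idea (purely illustrative and not used in the backtest above; it assumes SPY were the symbol being traded, and the column name "vixClose" and the threshold of 20 are placeholders):

getSymbols("^VIX", from="1990-01-01")
SPY <- merge(SPY, Cl(VIX), join='inner') #append the VIX close to SPY's data
names(SPY)[ncol(SPY)] <- "vixClose" #placeholder column name

#the precomputed column could then be referenced like any other, e.g.:
#add.signal(strategy.st, name="sigThreshold",
#           arguments=list(column="vixClose", threshold=20, relationship="gt", cross=TRUE),
#           label="highVol")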

And here are the results, trading a unit quantity:

> data.frame(round(t(tradeStats(portfolio.st)[-c(1,2)]),2))
                      HYG
Num.Txns           217.00
Num.Trades         106.00
Net.Trading.PL      36.76
Avg.Trade.PL         0.35
Med.Trade.PL         0.01
Largest.Winner       9.83
Largest.Loser       -2.71
Gross.Profits       67.07
Gross.Losses       -29.87
Std.Dev.Trade.PL     1.67
Percent.Positive    50.00
Percent.Negative    50.00
Profit.Factor        2.25
Avg.Win.Trade        1.27
Med.Win.Trade        0.65
Avg.Losing.Trade    -0.56
Med.Losing.Trade    -0.39
Avg.Daily.PL         0.35
Med.Daily.PL         0.01
Std.Dev.Daily.PL     1.67
Ann.Sharpe           3.33
Max.Drawdown        -7.24
Profit.To.Max.Draw   5.08
Avg.WinLoss.Ratio    2.25
Med.WinLoss.Ratio    1.67
Max.Equity          43.78
Min.Equity          -1.88
End.Equity          36.76

And the corresponding position chart:

Lastly, here are the Vanguard links for VWEHX and VFISX. Apparently, neither charges a redemption fee. I'm not sure if this means that they can be freely traded in a systematic fashion, however.

In conclusion, hopefully this post showed a potentially viable strategy, demonstrated the importance of understanding the nature of the data you're working with, and illustrated how to pre-compute values in quantstrat.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

Seeking Volatility and Leverage

So Harry Long recently posted several articles, a couple of which are variations on a theme: a combination of leveraged SPY exposure (aka SPXL), leveraged TLT exposure (aka TMF), and some small allocation to the insanely volatile volatility products (VXX, TVIX, ZIV, etc.), which can have absolutely insane drawdowns. Again, before anything else, a special thanks to Mr. Helmuth Vollmeier for his generosity in providing long-dated VXX and ZIV data, both of which will be leveraged for this post (in more ways than one).

In any case, here is the link to the two articles:

A Weird All-Long Strategy That Beats the S&P 500 Every Year
A Refined All-Long Strategy III

As usual, the challenge is that the exact ETFs in question didn't exist prior to the financial crisis, giving a very handy justification as to why not to show the downsides of the strategy/strategies. From a conceptual standpoint, it's quite trivial to realize, upon reading the articles, that when a large chunk of the portfolio consists of a leveraged SPY exposure, one is obviously going to look like a genius outperforming the SPY itself in a bull run. The question, obviously, is what happens when the market doesn't support the strategy. If offered a 50% coin flip, with the outcome of heads winning a million dollars and being told nothing else, the obvious question to ask would be: "and what happens on tails?"

This post aims to address this for three separate configurations of the strategy.

First off, in order to create a believable backtest, the goal is to first create substitutes to the short-dated newfangled ETFs (SPXL and TMF), which will be done very simply: leverage the adjusted returns of SPY and TLT, respectively (I had to use adjusted due to the split in SPXL–normally I don’t like using adjusted data for anything, but splits sort of necessitate this evil).

Here we go:

SPXL vs. SPY:

require(quantmod)
require(PerformanceAnalytics)
require(downloader)

getSymbols("SPXL", from="1990-01-01")
spxlRets <- Return.calculate(Ad(SPXL)) #have to use adjusted due to split 
getSymbols("SPY", from="1990-01-01")
SPYrets <- Return.calculate(Ad(SPY))
spxl3SPY <- merge(spxlRets, 3*SPYrets, join='inner')
charts.PerformanceSummary(spxl3SPY)

So, the adjusted data in use for this simulation will slightly overshoot with regard to the absolute returns. That stated, it isn't so much the returns we care about in this post (we know they're terrific when times are good), but the drawdowns. The drawdowns are basically on top of one another, which is good.
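For a number to put next to that visual claim, a quick check on the merged object above:

maxDrawdown(na.omit(spxl3SPY)) #worst drawdowns of actual SPXL and the synthetic 3x SPY, side by side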

Let’s repeat this with TMF and TLT:

getSymbols("TMF", from="1990-01-01")
TMFrets <- Return.calculate(Ad(TMF))
getSymbols("TLT", from="1990-01-01")
TLTrets <- Return.calculate(Ad(TLT))
tmf3TLT <- merge(TMFrets, 3*TLTrets, join='inner')
charts.PerformanceSummary(tmf3TLT)

The result:

> Return.annualized(tmf3TLT[,2]-tmf3TLT[,1])
                  TLT.Adjusted
Annualized Return   0.03123479

A bit more irritating, as there's clearly a discrepancy to the tune of approximately 3.1% a year in annualized returns in favor of the leveraged TLT vs. the actual TMF (so if you can borrow for less than 3% a year, this may be a good trade for you; though I'm completely in the dark about why this sort of mechanic exists: is it impossible to actually short TMF, or buy TLT on margin? If someone is more intimately familiar with this trade, let me know). So, I'm going to make like an engineer and apply a little patch to remove the bias: subtract the daily equivalent of the annualized discrepancy from the leveraged adjusted TLT returns.

discrepancy <- as.numeric(Return.annualized(tmf3TLT[,2]-tmf3TLT[,1]))
tmf3TLT[,2] <- tmf3TLT[,2] - ((1+discrepancy)^(1/252)-1)

charts.PerformanceSummary(tmf3TLT)

Much better. Let’s save those modified TLT returns (our synthetic TMF):

modifiedTLT <- 3*TLTrets - ((1+discrepancy)^(1/252)-1)

With VXX, luckily, we simply need to compare Mr. Vollmeier's data to Yahoo's return data to verify that the two separate return streams check out.

#get long VXX -- thank you so much, Mr. Helmuth Vollmeier
download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT", 
         destfile="longVXX.txt") #requires downloader package
VXXlong <- read.csv("longVXX.txt", stringsAsFactors=FALSE)
VXXlong <- xts(VXXlong[,2:5], order.by=as.Date(VXXlong$Date))
VXXrets <- Return.calculate(Cl(VXXlong)) #long data only has close

getSymbols("VXX", from="1990-01-01")
vxxYhooRets <- Return.calculate(Ad(VXX))
vxx2source <- merge(VXXrets, vxxYhooRets, join='inner')
charts.PerformanceSummary(vxx2source) #identical

And the result:

No discrepancies here whatsoever. So once again, I am very fortunate to have experienced readers commenting on this blog.

So, with this in mind, let’s attempt to recreate the equity curve of the first strategy, which consists of 50% SPXL, 45% TMF, and 5% VXX.

rSPY_TLT_VXX <- cbind(3*SPYrets, modifiedTLT, VXXrets)
rSPY_TLT_VXX <- rSPY_TLT_VXX[!is.na(rSPY_TLT_VXX[,3]),]
colnames(rSPY_TLT_VXX) <- c("SPY", "TLT", "VXX")

strat <- Return.rebalancing(R = rSPY_TLT_VXX, weights = c(.5, .45, .05), 
                            rebalance_on = "years", geometric=TRUE)
stratAndSPY <- merge(strat, SPYrets, join='inner')
charts.PerformanceSummary(stratAndSPY["2009-04-16::"])

About there.

One other note, on a purely mechanical issue: when using the geometric = TRUE argument with Return.rebalancing to create synthetic leverage, you cannot put the leverage into the weights argument itself, or it will re-lever your capital at every rebalancing period, giving you obviously incorrect results. Furthermore, these results were achieved using geometric = TRUE in two places: once in the Return.rebalancing call (which implies reinvesting the capital), and then once again when calling the PerformanceAnalytics functions. Essentially, the implication of this is reinvesting all gains at the rebalancing period, and not touching any position no matter what. Used inappropriately, this will create results that border on the optimistic.
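As a toy, made-up illustration of why where the leverage lives matters (nothing here comes from the backtest):

r <- c(.10, -.10)     #two hypothetical periods on the underlying
prod(1 + 3*r) - 1     #-0.09: leverage reset every period, which is what 3*dailyReturns models
3*(prod(1 + r) - 1)   #-0.03: leverage applied once to the whole span's return
#putting weights greater than one into Return.rebalancing would instead reset the leverage
#at every rebalance date, which is a different animal from the daily-reset SPXL/TMF being modeled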

Now that we’ve replicated the general shape and pattern of the original equity curve, let’s look at this strategy on a whole.

charts.PerformanceSummary(stratAndSPY)

If you just look at the top chart, it looks pretty amazing, doesn’t it? Now look at the bottom chart. Not only is there a massive drawdown, but there’s a massive spike up, and then *another* massive, larger drawdown. Imagine what would have happened to someone who didn’t follow this strategy to the letter. Get out at the very worst moment, get back in after a run-up, and then get hit *again*.

Here are the usual statistics I use:

> Return.annualized(stratAndSPY)
                  portfolio.returns SPY.Adjusted
Annualized Return         0.2305339   0.07937642
> maxDrawdown(stratAndSPY)
               portfolio.returns SPY.Adjusted
Worst Drawdown         0.4901882    0.5518672
> SharpeRatio.annualized(stratAndSPY)
                                portfolio.returns SPY.Adjusted
Annualized Sharpe Ratio (Rf=0%)         0.9487574    0.3981902

An annualized Sharpe just shy of 1, using adjusted data, with a CAGR/max drawdown ratio of less than one half, and a max drawdown far beyond the levels of acceptable (even 20% may be too much for some people, though I’d argue it’s acceptable over a long enough time frame provided it’s part of a diversified portfolio of other such uncorrelated strategies).

Now, the claim is that this strategy consistently beats the S&P 500 year after year? That can also be tested.

diff <- stratAndSPY[,1] - stratAndSPY[,2]
diffAndModTLT <- cbind(diff, modifiedTLT)
charts.PerformanceSummary(diffAndModTLT)

Essentially, I shorted the SPY against the strategy (which would simply mean still being long the SPY, except at 50% instead of 150%), and this is the result, in comparison to the 3x leveraged TLT (and cut down by the original discrepancy on a daily level).

So even after shorting the SPY and its massive drawdown away, one is still left with what amounts to a diluted TMF position, which has its own issues. Here are the three statistics, once again:

> Return.annualized(diffAndModTLT)
                  portfolio.returns TLT.Adjusted
Annualized Return         0.1181923    0.1356003
> maxDrawdown(diffAndModTLT)
               portfolio.returns TLT.Adjusted
Worst Drawdown         0.3930016    0.6348332
> SharpeRatio.annualized(diffAndModTLT)
                                portfolio.returns TLT.Adjusted
Annualized Sharpe Ratio (Rf=0%)         0.4889822    0.3278975

In short, for a strategy that markets itself on beating the SPY, shorting the SPY against it costs more in upside than is gained on the downside. Generally, anytime I see an article claiming “this strategy does really well against benchmark XYZ”, my immediate intuition is: “so what does the equity curve look like when you short your benchmark against your strategy?” If the performance deteriorates, that once again means some tough questions need asking. That stated, the original strategy handily trounced the SPY benchmark, and the difference trounced the leveraged TLT. Just that my own personal benchmark is an annualized return over max drawdown of 1 or more (meaning that even the worst streak can be made up for within a year’s time–or, more practically, that generally, you don’t go a year without getting paid).

Let’s move on to the second strategy. In this instance, it’s highly similar–50% SPXL (3x SPY), 40% TMF (3x TLT), and 10% TVIX (2x VXX). Again, let’s compare synthetic to actual.

getSymbols("TVIX", from="1990-01-01")
TVIXrets <- Return.calculate(Ad(TVIX))
vxxTvix <- merge(2*VXXrets, TVIXrets, join='inner')
charts.PerformanceSummary(vxxTvix) #about identical

We’re in luck. This chart is about identical, so no tricks necessary.

The other two instruments are identical, so we can move straight to the strategy.

First, let’s replicate an equity curve:

rSPY_TLT_VXX2 <- cbind(3*SPYrets, modifiedTLT,  2*VXXrets)
rSPY_TLT_VXX2 <- rSPY_TLT_VXX2[!is.na(rSPY_TLT_VXX2[,3]),]

stratTwo <- Return.rebalancing(R=rSPY_TLT_VXX2, weights = c(.5, .4, .1), rebalance_on="years", geometric=TRUE)
stratTwoAndSPY <- merge(stratTwo, SPYrets, join='inner')
charts.PerformanceSummary(stratTwoAndSPY["2010-11-30::"], geometric=TRUE)

General shape and pattern of the strategy’s equity curve achieved. What does it look like since the inception of the original VIX futures?

charts.PerformanceSummary(stratTwoAndSPY)

Very similar to the one before. Let’s look at them side by side.

bothStrats <- merge(strat, stratTwo, join='inner')
colnames(bothStrats) <- c("strategy one", "strategy two")
charts.PerformanceSummary(bothStrats)

First of all, let’s do a side by side comparison of the three statistics:

> Return.annualized(bothStrats)
                  strategy one strategy two
Annualized Return    0.2305339    0.2038783
> maxDrawdown(bothStrats)
               strategy one strategy two
Worst Drawdown    0.4901882    0.4721624
> SharpeRatio.annualized(bothStrats)
                                strategy one strategy two
Annualized Sharpe Ratio (Rf=0%)    0.9487574    0.9075242

The second strategy seems to be strictly worse than the first. If we were to short the second against the first, it would essentially mean a 15% exposure to TLT and a -15% exposure to VXX. For a fun tangent, what does such a strategy’s equity curve look like?

stratDiff <- bothStrats[,1] - bothStrats[,2]
charts.PerformanceSummary(stratDiff)

With the following statistics:

> Return.annualized(stratDiff)
                  strategy one
Annualized Return   0.02606221
> maxDrawdown(stratDiff)
[1] 0.1254502
> SharpeRatio.annualized(stratDiff)
                                strategy one
Annualized Sharpe Ratio (Rf=0%)    0.8544455

Basically, about a 1-to-5 annualized return to max drawdown ratio. In short, this may be how a lot of mediocre managers go out of business–see an idea that looks amazing, leverage it up, then have one short period of severe underperformance, where everything goes wrong for a small amount of time (e.g. the equity market-neutral quant meltdown of August 2007, the flash crash, etc.), and then the whole fund keels over. In fact, these spikes of underperformance are the absolute worst type of phenomenon that can happen to many systematic strategies, since they trigger the risk-exit mechanisms, only for performance to recover right before the strategy can get back in.

Finally, we have our third strategy, which introduces one last instrument–ZIV. Here’s the specification for that strategy:

30% SPXL
30% ZIV
30% TMF
10% TVIX

Again, let’s go through the process and get our replicated equity curve.

download("https://www.dropbox.com/s/jk3ortdyru4sg4n/ZIVlong.TXT", destfile="longZIV.txt")
ZIVlong <- read.csv("longZIV.txt", stringsAsFactors=FALSE)
ZIVlong <- xts(ZIVlong[,2:5], order.by=as.Date(ZIVlong$Date))
ZIVrets <- Return.calculate(Cl(ZIVlong))

strat3components <- cbind(3*SPYrets, ZIVrets, modifiedTLT, 2*VXXrets)
strat3components <- strat3components[!is.na(strat3components[,4]),]
stratThree <- Return.rebalancing(strat3components, weights=c(.3, .3, .3, .1), rebalance_on="years", geometric=TRUE)
stratThreeAndSPY <- merge(stratThree, SPYrets, join='inner')
chart.TimeSeries(log(cumprod(1+stratThreeAndSPY["2010-11-30::"])))

With the resulting equity curve replication:

And again, the full-backtest equity curve:

charts.PerformanceSummary(stratThreeAndSPY)

To put it together, let’s combine all three strategies, and the SPY.

threeStrats <- merge(bothStrats, stratThree, join='inner')
threeStratsSPY <- merge(threeStrats, SPYrets, join='inner')
colnames(threeStratsSPY)[3] <- "strategy three"
charts.PerformanceSummary(threeStratsSPY)

stats <- data.frame(cbind(t(Return.annualized(threeStratsSPY))*100, 
               t(maxDrawdown(threeStratsSPY))*100, 
               t(SharpeRatio.annualized(threeStratsSPY))))
stats$returnToDrawdown <- stats[,1]/stats[,2]

The resultant equity curve:

The resultant statistics:

> stats
               Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. returnToDrawdown
strategy.one           23.053387       49.01882                       0.9487574        0.4702967
strategy.two           20.387835       47.21624                       0.9075242        0.4317970
strategy three         15.812291       39.31843                       0.8019835        0.4021597
SPY.Adjusted            7.937642       55.18672                       0.3981902        0.1438325

In short, all of them share the same sort of profile–very strong annualized returns, even scarier drawdowns, Sharpe ratios close to 1 (albeit using adjusted data), and return to drawdown ratios slightly less than .5 (also scary). Are these complete strategies on their own? No. Do they beat the S&P 500? Yes. Does it make sense that they beat the S&P 500? Considering that two of these configurations have a greater than 100% market exposure, and the direction of the equity markets tends to be up over time (at least over the period during which the VIX futures traded), then this absolutely makes sense. Should one short the S&P against these strategies? I wouldn’t say so.

One last thing to note–the period over which these strategies were tested (the inception of VIX futures) had no stagflation, and the Fed’s QE may be partially (or to a great deal) responsible for the rise in the equity markets since the crisis. If the market declines as a result of the Fed raising rates (which it inevitably will have to do at some point), these strategies might be seriously hurt, so I’d certainly advise a great deal of caution, even going forward.

In any case, for better or for worse, here are a few strategies from SeekingAlpha, replicated as far back as the synthetic data allows.

Thanks for reading.

It’s Amazing How Well Dumb Things Work?

Recently, Harry Long posted not one but four new articles on Seeking Alpha called It’s Amazing How Well Dumb Things Work II. The last time I replicated a strategy by Mr. Long, it came up short on the expectations it initially built. For the record, here are the links to the four part series.

First post
Second post (very valuable comments section)
Third post
Fourth post

While the strategy itself wasn’t the holy grail (nothing is, really, and it did outperform the S&P), it did pay off with some long-history ETF data for which I am extremely grateful (a project I’m currently working on involves those exact indices; I’ll see if I can blog about it), and it gave me the chance to show some R functionality that I hadn’t shown before that point, which, in my opinion, made the endeavor more than worth it.

For this particular strategy, I’m not so sure that’s the case.

In short: the S&P 500, gold, and long-term bonds, rebalanced to an equal-weight portfolio annually. Dirt simple. So simple, in fact, that I can backtest this strategy back to 1978!

The data I used was the actual GSPC index from Yahoo Finance, and some Quandl data that I cleaned up with the quandClean function in my IKTrading package. While some of the futures I originally worked with have huge chunks of missing data, the long-term bond and gold futures are relatively intact. Gold had about 44 missing days in the 21st century on which GLD had returns, so there may be some data integrity issues there, though the average GLD return on those days is about -3 bps, so the general concept demonstration is intact. With the bonds, there were 36 missing days for which TLT had returns, but the average return on those days was about -18 bps, so I instead imputed those missing days to zero.

In any case, here’s the script to set up the data:

require(IKTrading)
require(PerformanceAnalytics)
require(quantmod)

getSymbols("^GSPC", from="1800-01-01")
SPrets <- Return.calculate(Cl(GSPC))

goldFutures <- quandClean(stemCode = "CHRIS/CME_GC", verbose = TRUE)
getSymbols("GLD", from="1990-01-01") #quandl's data had a few gaps--let's use GLD to fill them in.
goldGLD <- merge(Cl(goldFutures), Cl(GLD), join='outer')
goldRets <- Return.calculate(goldGLD)
sum(is.na(goldRets[,1]))
mean(goldRets[is.na(goldRets[,1]),2], na.rm=TRUE)
goldRets[is.na(goldRets[,1]),1] <- goldRets[is.na(goldRets[,1]),2] #impute missing returns data with GLD returns data for that day
goldRets <- goldRets[,1]

thirtyBond <- quandClean(stemCode="CHRIS/CME_US", verbose = TRUE)
getSymbols("TLT", from="1990-01-01")
bondTLT <- merge(Cl(thirtyBond), Cl(TLT), join='outer')
bondRets <- Return.calculate(bondTLT)
sum(is.na(bondRets[,1]))
mean(bondRets[is.na(bondRets[,1]),2], na.rm=TRUE) #18 basis points? Just going to impute as zero.
bondRets[is.na(bondRets[,1]),1] <- 0 #there are 259 other such instances prior to this imputing
bondRets <- bondRets[,1]

SPbondGold <- cbind(SPrets, goldRets, bondRets)
SPbondGold <- SPbondGold["1977-12-30::"] #start at the beginning of 1978, since that's when the gold futures data begins
colnames(SPbondGold) <- c("SandP", "Gold", "Bonds") #match the cbind order above

DTW_II_returns <- Return.rebalancing(R = SPbondGold, weights = c(1/3, 1/3, 1/3), geometric = FALSE, rebalance_on = "years")
stratSP <- merge(DTW_II_returns, SPrets, join='inner')
colnames(stratSP) <- c("Harry Long Article", "S&P 500")

First, let’s recreate the original equity curve, to prove that I’m comparing apples to apples.

#Recreate the original equity curve
charts.PerformanceSummary(stratSP["2004-12-01::"])

So, as you can see, everything seems to match, article confirmed, and that’s good.

But I wrote the backtest to go back to 1978, just so we can see as much of the performance as we possibly can. So, let’s take a look. Now, keep in mind that the article stated the following two assertions:

“As we previously observed, the approach has higher returns and lower drawdowns across an entire bull and bear market cycle than the S&P 500…” and that “…a portfolio manager who employed this decidedly humble, simple approach, would actually have been doing well for clients as a fiduciary.”

Let’s see if the full-period equity curve supports these assertions.

charts.PerformanceSummary(stratSP)

Here are some numerical values to put this into perspective:

> Return.cumulative(stratSP)
                  Harry Long Article  S&P 500
Cumulative Return           6.762379 20.25606
> Return.annualized(stratSP)
                  Harry Long Article    S&P 500
Annualized Return         0.05713769 0.08640992
> SharpeRatio.annualized(stratSP)
                                Harry Long Article   S&P 500
Annualized Sharpe Ratio (Rf=0%)           0.583325 0.4913877
> maxDrawdown(stratSP)
               Harry Long Article   S&P 500
Worst Drawdown           0.334808 0.5677539

In short, a 5.7% versus an 8.6% annualized return, in favor of the S&P 500 since 1978, with both sets of returns sporting substantial drawdowns compared to their annualized rates of return. So while this strategy gives a better risk-adjusted return, you get the return you pay for with the risks taken: a smaller max drawdown for a similar Sharpe means a smaller return. Furthermore, the early 80s seem to have hurt this “all-weather” strategy more than they hurt the S&P 500.

For fun, let’s leverage the strategy 2x over and see what happens.

stratSP$leveragedStrat <- stratSP[,1]*2
charts.PerformanceSummary(stratSP)

And this is the resulting equity curve:

Hey, now we’re talking, right?! Well, the late 70s and early 80s say not exactly. But what if I wanted to make myself look better than the strategy justifies? Well, I can simply truncate those drawdowns by leaving off everything before 1985! (And put some marketing spin on it to justify it.)

charts.PerformanceSummary(stratSP["1985::"])

Here are some statistics:

> maxDrawdown(stratSP["1985::"])
               Harry.Long.Article   S.P.500 leveragedStrat
Worst Drawdown          0.2228608 0.5677539      0.4065492
> Return.annualized(stratSP["1985::"])
                  Harry.Long.Article    S.P.500 leveragedStrat
Annualized Return         0.05782724 0.08700208      0.1106892
> SharpeRatio.annualized(stratSP["1985::"])
                                Harry.Long.Article   S.P.500 leveragedStrat
Annualized Sharpe Ratio (Rf=0%)          0.6719875 0.4753468      0.6431376

Now look, higher returns, lower max drawdown, higher Sharpe, we can all go home happy?

Of course not. So what’s the point of this post?

Simply, to not believe everything you see at first glance. There’s often a phrase that’s thrown around that states “you’ll never see a bad backtest”–simply implying that if the backtest is bad, it’s thrown out, and all that’s left are the results with cherry-picked time frames, brute-force curve-fit parameter sets, and who knows what other sort of methodologies baked in to make a strategy look as good as possible.

On my blog, I try my best to be on the opposite side of the spectrum. In instances in which ideas were published years ago, some low-hanging fruit is an out-of-sample test. On the other end, such as with Mr. Harry Long of SeekingAlpha here, when there are backtests with a very small time frame owing to newly released instruments, I’ll try to provide a much larger context. I provide the code, the data, and explanations in plain English, all to the best of my ability here on this blog. If you don’t wish to take my word for it, you can always run the scripts yourself. If worst comes to worst, leave a comment, and I’ll answer to the best of my ability.

I’ll have another Harry Long backtest replication up soon.

Thanks for reading.

Combining FAA and Stepwise Correlation

Since I debuted the stepwise correlation algorithm, I suppose the punchline that people want to see is: does it actually work?

The short answer? Yes, it does.

A slightly longer answer: it works, with the caveat that having a better correlation algorithm that makes up 25% of the total sum of weighted ranks has only a marginal (but nevertheless positive) effect on returns. Furthermore, when comparing a max decorrelation weighting using the default single-pass correlation versus the stepwise version, the stepwise gives a bumpier ride, but one with visibly larger returns. Also, for this universe, the difference between starting with the security ranked highest by the momentum and volatility components and starting with the security that has the smallest aggregate correlation to all securities is very small. Essentially, from my inspection, the answer on using stepwise correlation is: “it’s a perfectly viable alternative, if not better.”

Here are the functions used in the script:

require(quantmod)
require(PerformanceAnalytics)

stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
  #edge cases
  if(dim(corMatrix)[1] == 1) {
    return(corMatrix)
  } else if (dim(corMatrix)[1] == 2) {
    ranks <- c(1.5, 1.5)
    names(ranks) <- colnames(corMatrix)
    return(ranks)
  }
  
  if(is.null(startNames)) {
    corSums <- rowSums(corMatrix)
    corRanks <- rank(corSums)
    startNames <- names(corRanks)[corRanks <= stepSize]
  }
  nameList <- list()
  nameList[[1]] <- startNames
  rankList <- list()
  rankCount <- 1
  rankList[[1]] <- rep(rankCount, length(startNames))
  rankedNames <- do.call(c, nameList)
  
  while(length(rankedNames) < nrow(corMatrix)) {
    rankCount <- rankCount+1
    subsetCor <- corMatrix[, rankedNames]
    if(is.matrix(subsetCor)) {
      subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
      if(is.matrix(subsetCor)) {
        corSums <- rowSums(subsetCor)
        corSumRank <- rank(corSums)
        lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
        nameList[[rankCount]] <- lowestCorNames
        rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
      } else { #1 name remaining
        nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
        rankList[[rankCount]] <- rankCount
      }
    } else {  #first iteration, subset on first name
      subsetCorRank <- rank(subsetCor)
      lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
      nameList[[rankCount]] <- lowestCorNames
      rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
    }    
    rankedNames <- do.call(c, nameList)
  }
  
  ranks <- do.call(c, rankList)
  names(ranks) <- rankedNames
  if(bestHighestRank) {
    ranks <- 1+length(ranks)-ranks
  }
  ranks <- ranks[colnames(corMatrix)] #return to original order
  return(ranks)
}


FAAreturns <- function(prices, monthLookback = 4,
                       weightMom=1, weightVol=.5, weightCor=.5, 
                       riskFreeName="VFISX", bestN=3,
                       stepCorRank = FALSE, stepStartMethod=c("best", "default")) {
  stepStartMethod <- stepStartMethod[1]
  returns <- Return.calculate(prices)
  monthlyEps <- endpoints(prices, on = "months")
  riskFreeCol <- grep(riskFreeName, colnames(prices))
  tmp <- list()
  dates <- list()
  
  for(i in 2:(length(monthlyEps) - monthLookback)) {
    
    #subset data
    priceData <- prices[monthlyEps[i]:monthlyEps[i+monthLookback],]
    returnsData <- returns[monthlyEps[i]:monthlyEps[i+monthLookback],]
    
    #perform computations
    momentum <- data.frame(t(t(priceData[nrow(priceData),])/t(priceData[1,]) - 1))
    priceData <- priceData[, momentum > 0] #remove securities with momentum < 0
    returnsData <- returnsData[, momentum > 0]
    momentum <- momentum[momentum > 0]
    names(momentum) <- colnames(returnsData)
    vol <- as.numeric(-sd.annualized(returnsData))
    
    if(length(momentum) > 1) {
      
      #perform ranking
      if(!stepCorRank) {
        sumCors <- -colSums(cor(returnsData, use="complete.obs"))
        stats <- data.frame(cbind(momentum, vol, sumCors))
        ranks <- data.frame(apply(stats, 2, rank))
        weightRankSum <- weightMom*ranks$momentum + weightVol*ranks$vol + weightCor*ranks$sumCors
        names(weightRankSum) <- rownames(ranks)
      } else {
        corMatrix <- cor(returnsData, use="complete.obs")
        momRank <- rank(momentum)
        volRank <- rank(vol)
        compositeMomVolRanks <- weightMom*momRank + weightVol*volRank
        maxRank <- compositeMomVolRanks[compositeMomVolRanks==max(compositeMomVolRanks)]
        if(stepStartMethod=="default") {
          stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = NULL, 
                                          stepSize = 1, bestHighestRank = TRUE)
        } else {
          stepCorRanks <- stepwiseCorRank(corMatrix=corMatrix, startNames = names(maxRank), 
                                          stepSize = 1, bestHighestRank = TRUE)
        }
        weightRankSum <- weightMom*momRank + weightVol*volRank + weightCor*stepCorRanks
      }
      
      totalRank <- rank(weightRankSum)
      
      #find top N values, from http://stackoverflow.com/questions/2453326/fastest-way-to-find-second-third-highest-lowest-value-in-vector-or-column
      #thanks to Dr. Rob J. Hyndman
      upper <- length(names(returnsData))
      lower <- max(upper-bestN+1, 1)
      topNvals <- sort(totalRank, partial=seq(from=upper, to=lower))[c(upper:lower)]
      
      #compute weights
      longs <- totalRank %in% topNvals #invest in ranks length - bestN or higher (in R, rank 1 is lowest)
      longs <- longs/sum(longs) #equal weight all candidates
      longs[longs > 1/bestN] <- 1/bestN #in the event that we have fewer than top N invested into, lower weights to 1/top N
      names(longs) <- names(totalRank)
      
    } else if(length(momentum) == 1) { #only one security had positive momentum 
      longs <- 1/bestN
      names(longs) <- names(momentum)
    } else { #no securities had positive momentum 
      longs <- 1
      names(longs) <- riskFreeName
    }
    
    #append removed names (those with momentum < 0)
    removedZeroes <- rep(0, ncol(returns)-length(longs))
    names(removedZeroes) <- names(returns)[!names(returns) %in% names(longs)]
    longs <- c(longs, removedZeroes)
    
    #reorder to be in the same column order as original returns/prices
    longs <- data.frame(t(longs))
    longs <- longs[, names(returns)]
    
    #append lists
    tmp[[i]] <- longs
    dates[[i]] <- index(returnsData)[nrow(returnsData)]
  }
  
  weights <- do.call(rbind, tmp)
  dates <- do.call(c, dates)
  weights <- xts(weights, order.by=as.Date(dates)) 
  weights[, riskFreeCol] <- weights[, riskFreeCol] + 1-rowSums(weights)
  strategyReturns <- Return.rebalancing(R = returns, weights = weights, geometric = FALSE)
  colnames(strategyReturns) <- paste(monthLookback, weightMom, weightVol, weightCor, sep="_")
  return(strategyReturns)
}

The FAAreturns function has been modified to transplant the stepwise correlation algorithm I discussed earlier. Essentially, the chunk of code that performs the ranking inside the function got a little bit larger, and some new arguments to the function have been introduced.

First off, there’s the option to use the stepwise correlation algorithm in the first place–namely, the stepCorRank argument, defaulting to FALSE (the default settings replicate the original FAA idea demonstrated in the first post on this topic). The argument that comes after it, stepStartMethod, does the following:

Using the “default” setting, the algorithm will start off with the security that is least correlated to the others in aggregate (that is, the one with the lowest sum of correlations). The “best” setting, on the other hand, will start with the security ranked highest by the weighted sum of ranks of the prior two factors (momentum and volatility). The argument itself defaults to “best”, as opposed to “default”. At the end of the day, I suppose the best way of illustrating the functionality is with some examples of taking this piece of engineering out for a spin. So here goes!
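Before the full runs below, here is a quick toy illustration of the two starting methods at the stepwiseCorRank level, using a hypothetical three-asset correlation matrix (purely for intuition, not part of the backtest):

#toy example: "default" seeds the ranking from the asset with the lowest sum of
#correlations, while passing startNames seeds it from a prespecified asset
#(which is what the "best" setting does inside FAAreturns)
toyCor <- matrix(c( 1, .8, .2,
                   .8,  1, .3,
                   .2, .3,  1), nrow=3,
                 dimnames=list(c("eq1", "eq2", "bond"), c("eq1", "eq2", "bond")))

stepwiseCorRank(toyCor, startNames=NULL, bestHighestRank=TRUE) #default start: bond gets picked first, and thus the highest rank
stepwiseCorRank(toyCor, startNames="eq1", bestHighestRank=TRUE) #"best"-style start: the ranking is seeded from eq1 instead

And now for the full runs: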

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

original <- FAAreturns(adPrices, stepCorRank=FALSE)
originalSWCbest <- FAAreturns(adPrices, stepCorRank=TRUE)
originalSWCdefault <- FAAreturns(adPrices, stepCorRank=TRUE, stepStartMethod="default")
stepMaxDecorBest <- FAAreturns(adPrices, weightMom=.05, weightVol=.025, 
                               weightCor=1, riskFreeName="VFISX", 
                               stepCorRank = TRUE, stepStartMethod="best")
stepMaxDecorDefault <- FAAreturns(adPrices, weightMom=.05, weightVol=.025, 
                                  weightCor=1, riskFreeName="VFISX", 
                                  stepCorRank = TRUE, stepStartMethod="default")
w311 <- FAAreturns(adPrices, weightMom=3, weightVol=1, weightCor=1, stepCorRank=TRUE)
originalMaxDecor <- FAAreturns(adPrices, weightMom=0, weightVol=1, stepCorRank=FALSE)
tmp <- cbind(original, originalSWCbest, originalSWCdefault, 
             stepMaxDecorBest, stepMaxDecorDefault, w311, originalMaxDecor)
names(tmp) <- c("original", "originalSWCbest", "originalSWCdefault", "SMDB", 
                "SMDD", "w311", "originalMaxDecor")
charts.PerformanceSummary(tmp, colorset=c("black", "orange", "blue", "purple", "green", "red", "darkred"))


statsTable <- data.frame(t(rbind(Return.annualized(tmp)*100, maxDrawdown(tmp)*100, SharpeRatio.annualized(tmp))))
statsTable$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]

Same seven securities as the original paper, with the following return streams:

Original: the FAA original replication
originalSWCbest: original weights, stepwise correlation algorithm, using the best security as ranked by momentum and volatility as a starting point.
originalSWCdefault: original weights, stepwise correlation algorithm, using the default (minimum sum of correlations) security as a starting point.
stepMaxDecorBest: a max decorrelation algorithm that sets the momentum and volatility weights at .05 and .025 respectively, compared to 1 for correlation, simply to get the best starting security through the first two factors.
stepMaxDecorDefault: analogous to originalSWCdefault, except with the starting security being defined as the one with minimum sum of correlations.
w311: using a weighting of 3, 1, and 1 on momentum, volatility, and correlation, respectively, while using the stepwise correlation rank algorithm, starting with the best security (the default for the function), since I suspected that not weighting momentum at 1 or higher was the reason none of the other equity curves could top the paper’s original.
originalMaxDecor: max decorrelation using the original 1-pass correlation matrix

Here is the performance chart:

Here’s the way I interpret it:

Does David Varadi’s stepwise correlation ranking algorithm help performance? From this standpoint, the answer leans toward yes. Using the original paper’s parameters, the performance over the paper’s backtest period is marginally better in terms of the equity curves. Comparing the max decorrelation algorithms (SMDB and SMDD stand for stepwise max decorrelation best and default, respectively), the difference is even clearer.

However, I was wondering why I could never actually outdo the original paper’s annualized return, and out of interest, decided to weight the momentum ranking more heavily than the original paper eventually had it set at. The result is a bumpier equity curve, but one with a higher annualized return than any of the others. It’s also something that I didn’t try in my walk-forward example (though interested parties can simply modify the original momentum vector to contain a 1.5 weight, for instance).

Here’s the table of statistics for the different permutations:

> statsTable
                   Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    14.43802       13.15625                        1.489724            1.097427
originalSWCbest             14.70544       13.15625                        1.421045            1.117753
originalSWCdefault          14.68145       13.37059                        1.457418            1.098041
SMDB                        13.55656       12.33452                        1.410072            1.099075
SMDD                        13.18864       11.94587                        1.409608            1.104033
w311                        15.76213       13.85615                        1.398503            1.137555
originalMaxDecor            11.89159       11.68549                        1.434220            1.017637

At the end of the day, all of the permutations exhibit solid results, and they fall along different points of the risk/return curve. The original settings exhibit the highest Sharpe Ratio (barely), but not the highest annualized return to max drawdown ratio (which, surprisingly, belongs to the setting that overweights momentum).

To wrap this analysis up (since there are other strategies that I wish to replicate), here is the out-of-sample performance of these seven strategies (to Oct 30, 2014):

Maybe not the greatest thing in the world considering the S&P has made some spectacular returns in 2013, but nevertheless, the momentum variant strategies established new equity highs fairly recently, and look to be on their way up from their latest slight drawdown. Here are the statistics for 2013-2014:

statsTable <- data.frame(t(rbind(Return.annualized(tmp["2013::"])*100, maxDrawdown(tmp["2013::"])*100, SharpeRatio.annualized(tmp["2013::"]))))
statsTable$ReturnDrawdownRatio <- statsTable[,1]/statsTable[,2]

> statsTable
                   Annualized.Return Worst.Drawdown Annualized.Sharpe.Ratio..Rf.0.. ReturnDrawdownRatio
original                    9.284678       8.259298                       1.1966581           1.1241485
originalSWCbest             8.308246       9.657667                       0.9627413           0.8602746
originalSWCdefault          8.916144       8.985685                       1.0861781           0.9922609
SMDB                        6.406438       9.657667                       0.8366559           0.6633525
SMDD                        5.641980       5.979313                       0.7840507           0.9435833
w311                        8.921268       9.025100                       1.0142871           0.9884953
originalMaxDecor            2.888778       6.670709                       0.4244202           0.4330542

So, the original parameters are working solidly, the stepwise correlation algorithm seems to be in a slight rut, and the variants without any emphasis on momentum simply aren’t that great (they were created purely as illustrative tools to begin with). Whether you prefer to run FAA with these securities or with trading strategies of your own, my only caveat is that transaction costs haven’t been taken into consideration (from what I hear, Interactive Brokers charges $1 per transaction, so it shouldn’t make a world of difference; a rough haircut sketch follows below), but beyond that, I believe these last four posts have shown that FAA is something that works. While it doesn’t always work perfectly (e.g. the S&P 500 had a very good 2013), the logic is sound, and the results are solid, even given some rather plain-vanilla securities.
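Since I brought up costs, here is a rough, purely illustrative haircut sketch. The per-rebalance cost below is a made-up placeholder (not any broker’s actual fee schedule), and it ignores real turnover and position sizes, so treat it as a sanity check rather than a cost model:

#purely illustrative: subtract a hypothetical flat cost (in return terms) at
#each monthly rebalance of the original configuration, then compare
costPerRebalance <- 0.0010 #hypothetical 10 bps of portfolio value per rebalance
monthEnds <- endpoints(original, on="months")[-1] #drop the leading zero index
netOriginal <- original
netOriginal[monthEnds, ] <- netOriginal[monthEnds, ] - costPerRebalance
costCompare <- cbind(original, netOriginal)
colnames(costCompare) <- c("gross", "netOfHypotheticalCosts")
rbind(Return.annualized(costCompare), maxDrawdown(costCompare))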

In any case, I think I’ll conclude with the fact that FAA works, and the stepwise correlation algorithm provides a viable alternative to computing your weights. I’ll update my IKTrading package with some formal documentation regarding this algorithm soon.

Thanks for reading.

A Walk-Forward Attempt on FAA

So in the first post about FAA, I was asked to run a walk-forward test of FAA. While the results here aren’t good, I’d like to share the general process anyway.

Here’s the additional code I wrote, assuming the first post‘s code is still in your environment (the demo will have the function in the namespace as well):

weightMom <- seq(0, 1, by=.5)
weightVol <- c(0, .5, 1)
weightCor <- c(0, .5, 1)
monthLookback=c(3, 4, 6, 10)
permutations <- expand.grid(weightMom, weightVol, weightCor, monthLookback)
colnames(permutations) <- c("wMom", "wVol", "wCor", "monthLookback")

require(doMC)
registerDoMC(detectCores())
t1 <- Sys.time()
out <- foreach(i = 1:nrow(permutations), .combine = cbind) %dopar% {
  FAAreturns(prices=adPrices, 
             monthLookback = permutations$monthLookback[i], 
             weightMom = permutations$wMom[i], 
             weightCor = permutations$wCor[i], 
             weightVol=permutations$wVol[i])
}
t2 <- Sys.time()
print(t2-t1)

out <- out["1998-10::"] #start at 1999 due to NAs with data

FAAwalkForward <- function(portfolios, applySubset = apply.quarterly, applyFUN = Return.cumulative) {
  metrics <- applySubset(portfolios, applyFUN)
  weights <- list()
  for(i in 1:nrow(metrics)) {
    row <- metrics[i,]
    winners <- row==max(row)
    weight <- winners/rowSums(winners) #equal weight all top performers
    weights[[i]] <- weight
  }
  weights <- do.call(rbind, weights)
  returns <- Return.rebalancing(portfolios, weights)
  return(returns)
}

WFQtrRets <- FAAwalkForward(portfolios = out, applySubset = apply.quarterly, applyFUN = Return.cumulative)
WFYrRets <- FAAwalkForward(portfolios = out, applySubset = apply.yearly, applyFUN = Return.cumulative)
WFMoRets <- FAAwalkForward(portfolios = out, applySubset = apply.monthly, applyFUN = Return.cumulative)

WF <- cbind(WFQtrRets, WFYrRets, WFMoRets)
colnames(WF) <- c("quarterly", "annually", "monthly")
WF <- WF["1999::"]

original <- FAAreturns(adPrices)
original <- original["1999::"]
WF <- cbind(WF, original)
colnames(WF)[4] <- "original"
charts.PerformanceSummary(WF)


Return.annualized(WF)
maxDrawdown(WF)
SharpeRatio.annualized(WF)

So what I did was take the 108 permutations (3 x 3 x 3 x 4) and compute them all in parallel using the doMC package (Windows uses a different parallel architecture, but this post explains a more OS-agnostic method; see also the sketch below). Next, I wrote a small function that computes a metric for all of the permutations over some period (monthly, quarterly, or annually) and equal-weights all of the top-performing configurations, of which there were sometimes more than one. Those configurations become the strategy’s holdings over the next period, and the iteration repeats to the end. I started in late 1998, as I used a 10-month lookback period for one of the monthly lookback settings, while the data starts in mid-1997.
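For Windows users, here is a minimal sketch of what I mean by an OS-agnostic setup, using the doParallel package (my suggestion, not something from the linked post); the foreach loop itself runs unchanged:

#OS-agnostic parallel backend registration in place of registerDoMC above
require(doParallel)
cl <- makeCluster(detectCores())
registerDoParallel(cl)
#...run the foreach(...) %dopar% block from above here; on a PSOCK cluster you
#may also need .packages=c("PerformanceAnalytics", "quantmod") in the foreach call
stopCluster(cl)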

Here are the results:

> Return.annualized(WF)
                  quarterly  annually   monthly  original
Annualized Return 0.1303674 0.1164749 0.1204314 0.1516936
> maxDrawdown(WF)
               quarterly  annually   monthly  original
Worst Drawdown 0.1666639 0.1790334 0.1649651 0.1315625
> SharpeRatio.annualized(WF)
                                quarterly annually  monthly original
Annualized Sharpe Ratio (Rf=0%)  1.257753 1.025863 1.194066 1.519912

In short, using a walk-forward with FAA seriously harmed the performance.

After seeing these disappointing results, I mulled over the why, and here’s my intuition:

A) FAA uses a ranking algorithm, which loses the nuance of slightly changing this weight or that, leading to many identical configurations for a given time period. That some or all of these configurations are the best only for that particular period is a very real possibility.

B) Given that there are so few securities, it’s quite possible that the walk-forward process is simply chasing performance–that is, switching into a configuration right after it had its one moment in the sun. Considering that the original strategy is fairly solid to begin with, it certainly seems to be better to pick a decent configuration and stick with it.

C) One last thought that stuck with me is that the original FAA strategy meets all the qualifications *for* a walk-forward strategy already. It is a monthly-rebalanced algorithm that seeks to maximize an objective function (rank of the weighted sum of ranks of momentum, volatility, and correlation) in a robust fashion–it’s simply that instead of strategy configurations, the inputs were seven mutual funds. To me, in fact, the more I think about it, the more FAA looks like an extremely solid walk-forward framework, simply presented as a strategy in and of itself.

In any case, while the results were far from spectacular, I’m hoping that the code has given others some ideas about how to conduct some generalized returns-based walk-forward testing on their own strategies.

On one last note, if any readers have ideas that they’d like me to investigate, I’m always open to input and new ideas.

Thanks for reading.

Introducing Stepwise Correlation Rank

So in the last post, I attempted to replicate the Flexible Asset Allocation paper. I’d like to offer a thanks to Pat of Intelligent Trading Tech (not updated recently, hopefully this will change) for helping me corroborate the results, so that I have more confidence there isn’t an error in my code.

One of the procedures the authors of the FAA paper used is a correlation rank, which I interpreted as the average correlation of each security to the others.

The issue, pointed out to me in a phone conversation I had with David Varadi, is that when considering correlation, the correlations the investor should be concerned about are those between instruments within the portfolio, as opposed to simply all the correlations, including those to instruments not in the portfolio. To that end, when selecting assets (or possibly features in general), it conceptually makes more sense to select in a stepwise fashion–that is, start off with a subset of the correlation matrix, and then rank the remaining assets by their correlation to the heretofore selected assets, as opposed to all of them. This was explained in Mr. Varadi’s recent post.

Here’s a work in progress function I wrote to formally code this idea:

stepwiseCorRank <- function(corMatrix, startNames=NULL, stepSize=1, bestHighestRank=FALSE) {
  #edge cases
  if(dim(corMatrix)[1] == 1) {
    return(corMatrix)
  } else if (dim(corMatrix)[1] == 2) {
    ranks <- c(1.5, 1.5)
    names(ranks) <- colnames(corMatrix)
    return(ranks)
  }
  if(is.null(startNames)) {
    corSums <- rowSums(corMatrix)
    corRanks <- rank(corSums)
    startNames <- names(corRanks)[corRanks <= stepSize]
  }
  nameList <- list()
  nameList[[1]] <- startNames
  rankList <- list()
  rankCount <- 1
  rankList[[1]] <- rep(rankCount, length(startNames))
  rankedNames <- do.call(c, nameList)
  
  while(length(rankedNames) < nrow(corMatrix)) {
    rankCount <- rankCount+1
    subsetCor <- corMatrix[, rankedNames]
    if(is.matrix(subsetCor)) {
      subsetCor <- subsetCor[!rownames(corMatrix) %in% rankedNames,]
      if(is.matrix(subsetCor)) {
        corSums <- rowSums(subsetCor)
        corSumRank <- rank(corSums)
        lowestCorNames <- names(corSumRank)[corSumRank <= stepSize]
        nameList[[rankCount]] <- lowestCorNames
        rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
      } else { #1 name remaining
        nameList[[rankCount]] <- rownames(corMatrix)[!rownames(corMatrix) %in% names(subsetCor)]
        rankList[[rankCount]] <- rankCount
      }
    } else {  #first iteration, subset on first name
      subsetCorRank <- rank(subsetCor)
      lowestCorNames <- names(subsetCorRank)[subsetCorRank <= stepSize]
      nameList[[rankCount]] <- lowestCorNames
      rankList[[rankCount]] <- rep(rankCount, min(stepSize, length(lowestCorNames)))
    }    
    rankedNames <- do.call(c, nameList)
  }
  
  ranks <- do.call(c, rankList)
  names(ranks) <- rankedNames
  if(bestHighestRank) {
    ranks <- 1+length(ranks)-ranks
  }
  ranks <- ranks[colnames(corMatrix)] #return to original order
  return(ranks)
}

So the way the function works is that it takes in a correlation matrix, a starting name (if provided), and a step size (that is, how many assets to select per step, so that the process doesn’t become extremely long when dealing with larger numbers of assets/features). Then it iterates: subset the correlation matrix on the starting name, find the name with the minimum correlation, and add it to the list of already-selected names. Next, subset the correlation matrix columns on the selected names and the rows on the not-yet-selected names, and repeat until all names have been accounted for. Due to R’s habit of dropping labels when a matrix collapses to a vector, I had to write some special-case code, which is the reason for the two nested if/else statements (the first one being for the first column subset, and the second being for when there’s only one row remaining).
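As a quick aside on that label-dropping behavior, here is a two-line illustration; drop=FALSE is the usual way to avoid it, though the function above simply handles the collapsed case explicitly:

#R drops dimensions (and dimnames) when a single column is selected from a matrix
m <- matrix(1:6, nrow=3, dimnames=list(c("a", "b", "c"), c("x", "y")))
m[, "x"] #collapses to a plain named vector--no longer a matrix
m[, "x", drop=FALSE] #stays a one-column matrix, keeping both dimension names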

Also, if there’s an edge case (1 or 2 securities), then there is some functionality to handle those trivial cases.

Here’s a test script I wrote to test this function out:

require(PerformanceAnalytics)
require(quantmod)

mutualFunds <- c("VTSMX", #Vanguard Total Stock Market Index
                 "FDIVX", #Fidelity Diversified International Fund
                 "VEIEX", #Vanguard Emerging Markets Stock Index Fund
                 "VFISX", #Vanguard Short-Term Treasury Fund
                 "VBMFX", #Vanguard Total Bond Market Index Fund
                 "QRAAX", #Oppenheimer Commodity Strategy Total Return 
                 "VGSIX" #Vanguard REIT Index Fund
)

#mid 1997 to end of 2012
getSymbols(mutualFunds, from="1997-06-30", to="2012-12-31")
tmp <- list()
for(fund in mutualFunds) {
  tmp[[fund]] <- Ad(get(fund))
}

#always use a list when intending to cbind/rbind large quantities of objects
adPrices <- do.call(cbind, args = tmp)
colnames(adPrices) <- gsub(".Adjusted", "", colnames(adPrices))

adRets <- Return.calculate(adPrices)

subset <- adRets["2012"]
corMat <- cor(subset)

tmp <- list()
for(i in 1:length(mutualFunds)) {
  rankRow <- stepwiseCorRank(corMat, startNames=mutualFunds[i])
  tmp[[i]] <- rankRow
}
rankDemo <- do.call(rbind, tmp)
rownames(rankDemo) <- mutualFunds
origRank <- rank(rowSums(corMat))
rankDemo <- rbind(rankDemo, origRank)
rownames(rankDemo)[8] <- "Non-Sequential"

heatmap(-rankDemo, Rowv=NA, Colv=NA, col=heat.colors(8), margins=c(6,6))

Essentially, using the 2012 year of returns for the 7 FAA mutual funds, I compared how different starting securities changed the correlation ranking sequence.

Here are the results:

               VTSMX FDIVX VEIEX VFISX VBMFX QRAAX VGSIX
VTSMX              1     6     7     4     2     3     5
FDIVX              6     1     7     4     2     5     3
VEIEX              6     7     1     4     2     3     5
VFISX              2     6     7     1     3     4     5
VBMFX              2     6     7     4     1     3     5
QRAAX              5     6     7     4     2     1     3
VGSIX              5     6     7     4     2     3     1
Non-Sequential     5     6     7     2     1     3     4

In short, the algorithm is rather robust to starting security selection, at least judging by this small example. However, comparing VBMFX start to the non-sequential ranking, we see that VFISX changes from rank 2 in the non-sequential to rank 4, with VTSMX going from rank 5 to rank 2. From an intuitive perspective, this makes sense, as both VBMFX and VFISX are bond funds, which have a low correlation with the other 5 equity-based mutual funds, but a higher correlation with each other, thus signifying that the algorithm seems to be working as intended, at least insofar as this small example demonstrates. Here’s a heatmap to demonstrate this in visual form.

The ranking order (starting security) is the vertical axis, and the horizontal are the ranks, from white being first, to red being last. Notice once again that the ranking orders are robust in general (consider each column of colors descending), but each particular ranking order is unique.

So far, this code still has to be tested in terms of its applications to portfolio management and asset allocation, but for those interested in such an idea, it’s my hope that this provides a good reference point.

Thanks for reading.