Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)

This post will outline an easy-to-make mistake in writing vectorized backtests–namely in using a signal obtained at the end of a period to enter (or exit) a position in that same period. The difference in results one obtains is massive.

Today, I saw two separate posts from Alpha Architect and Mike Harris, both referencing a paper by Valeriy Zakamulin arguing that some previous trend-following research by Glabadanidis was shoddy, and that Glabadanidis’s results were only reproducible through instituting lookahead bias.

The following code shows how to reproduce this lookahead bias.

First, the setup of a basic moving average strategy on the S&P 500 index from as far back as Yahoo data will provide.


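library(quantmod)              # getSymbols, Ad, endpoints, SMA (via TTR)
library(PerformanceAnalytics)  # Return.calculate and the performance tables

getSymbols('^GSPC', src='yahoo', from = '1900-01-01')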
monthlyGSPC <- Ad(GSPC)[endpoints(GSPC, on = 'months')]

# change this line for signal lookback
movAvg <- SMA(monthlyGSPC, 10)

signal <- monthlyGSPC > movAvg
gspcRets <- Return.calculate(monthlyGSPC)

And here is how to institute the lookahead bias.

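# WRONG: the signal computed at the end of month t is multiplied by month t's own return
lookahead <- signal * gspcRets
# RIGHT: lag the signal one period, so the signal from the end of month t-1 earns month t's return
correct <- lag(signal) * gspcRets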
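
To see the off-by-one on a small scale, here is a minimal sketch in base R. The prices are made up, and the toy signal ("this month was up") is only a stand-in for the price-above-moving-average rule, so the effect is deliberately exaggerated. All variable names are hypothetical.

```r
# Hypothetical month-end prices (illustrative numbers, not market data)
prices <- c(100, 110, 99, 108, 97, 107)

# Simple returns; the first period has no prior price, so it is NA
rets <- c(NA, diff(prices) / head(prices, -1))

# Toy signal known only at the END of each period: TRUE if that period was up
# (an assumed stand-in for the moving average rule in the post)
toySignal <- rets > 0

# Lookahead: the end-of-period signal is applied to the SAME period's return
lookaheadRets <- toySignal * rets

# Correct: shift the signal forward one period, so each return is earned
# using only the signal that existed before the period began
laggedSignal <- c(NA, head(toySignal, -1))
correctRets <- laggedSignal * rets

toy <- data.frame(prices,
                  rets = round(rets, 4),
                  signal = toySignal,
                  lookahead = round(lookaheadRets, 4),
                  correct = round(correctRets, 4))
print(toy)
```

In this contrived alternating series, the unlagged version collects only the up months, while the lagged version is invested only in the months that follow an up month. The shift is a single `c(NA, head(x, -1))`, which plays the same role here that `lag(signal)` plays for the xts objects above.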

These are the “results”:

compare <- na.omit(cbind(gspcRets, lookahead, correct))
colnames(compare) <- c("S&P 500", "Lookahead", "Correct")
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
logRets <- log(cumprod(1+compare))
chart.TimeSeries(logRets, legend.loc='topleft')

Of course, the raw equity curve is of no use, so the last two lines above plot one in log scale.

As can be seen, lookahead bias makes a massive difference.

Here are the numerical results:

                            S&P 500  Lookahead   Correct
Annualized Return         0.0740000 0.15550000 0.0695000
Annualized Std Dev        0.1441000 0.09800000 0.1050000
Annualized Sharpe (Rf=0%) 0.5133000 1.58670000 0.6623000
Worst Drawdown            0.5255586 0.08729914 0.2699789
Calmar Ratio              0.1407286 1.78119192 0.2575219

Again, absolutely ridiculous.
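
For anyone who wants to sanity-check a table like this by hand, the rows can be recomputed in base R. This is a sketch under assumed definitions (geometric annualization, Sharpe as annualized return over annualized standard deviation, drawdown measured from the running peak of the equity curve). I believe these match the PerformanceAnalytics defaults, but the post does not spell them out. The returns below are hypothetical.

```r
# Hypothetical monthly returns, chosen so the arithmetic is easy to follow
monthlyRets <- c(0.10, -0.20, 0.05, 0.10)
periodsPerYear <- 12

# Geometric annualized return: compound growth scaled to one year
annReturn <- prod(1 + monthlyRets)^(periodsPerYear / length(monthlyRets)) - 1

# Annualized standard deviation: monthly sd scaled by sqrt(12)
annStdDev <- sd(monthlyRets) * sqrt(periodsPerYear)

# Annualized Sharpe ratio with a zero risk-free rate
annSharpe <- annReturn / annStdDev

# Worst drawdown: largest fractional drop of the equity curve from its running peak
# (the peak starts at the initial wealth of 1)
wealth <- cumprod(1 + monthlyRets)
runningPeak <- cummax(c(1, wealth))[-1]
worstDrawdown <- max(1 - wealth / runningPeak)

# Calmar ratio: annualized return per unit of worst drawdown
calmar <- annReturn / worstDrawdown

print(c(annReturn = annReturn, annStdDev = annStdDev, annSharpe = annSharpe,
        worstDrawdown = worstDrawdown, calmar = calmar))
```

The table above is consistent with these definitions: for the S&P 500 column, 0.0740 / 0.5255586 is about 0.1408, which matches the reported Calmar ratio up to rounding of the annualized return.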

Note that when using Return.Portfolio (the function in PerformanceAnalytics), that function will automatically give you the next period’s return, instead of the current one, for your weights. However, for those writing “simple” backtests that can be quickly done using vectorized operations, an off-by-one error can make all the difference between a backtest in the realm of reasonable and pure nonsense. Conversely, should one wish to test for said nonsense when faced with impossible-to-replicate results, the mechanics demonstrated above are the way to do it.
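
One possible sanity check along these lines (my own assumption, not something from the papers mentioned above) is to run the same two calculations on simulated returns with zero drift. There is nothing for a trend rule to find in such data, so a strong result from the unlagged version can only come from the alignment error. A minimal base R sketch, with hypothetical names and an arbitrary seed:

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# Simulated monthly returns with zero drift: by construction there is
# no trend to exploit (hypothetical data, not the S&P 500)
nMonths <- 1200
simRets <- rnorm(nMonths, mean = 0, sd = 0.04)
simPrices <- 100 * cumprod(1 + simRets)

# Trailing 10-month simple moving average in base R (NA for the first 9 months)
simMovAvg <- as.numeric(stats::filter(simPrices, rep(1 / 10, 10), sides = 1))
simSignal <- simPrices > simMovAvg

# Same-period signal (lookahead) versus signal shifted forward one month
simLookahead <- simSignal * simRets
simCorrect <- c(NA, head(simSignal, -1)) * simRets

# Rough annualized Sharpe ratio (arithmetic mean over sd, Rf = 0)
roughSharpe <- function(x) {
  mean(x, na.rm = TRUE) / sd(x, na.rm = TRUE) * sqrt(12)
}

sharpes <- c(Lookahead = roughSharpe(simLookahead),
             Correct = roughSharpe(simCorrect))
print(round(sharpes, 2))
```

The unlagged column should show a clearly positive Sharpe ratio even though the data are pure noise, while the lagged column should hover around zero. If a published result can only be matched by the unlagged calculation, that is a strong hint about where the numbers came from.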

Now, onto other news: I’d like to thank Gerald M for staying on top of one of the Logical Invest strategies–namely, their simple global market rotation strategy outlined in an article from an earlier blog post.

Up until March 2015 (the date of the blog post), the strategy had performed well. However, after said date?

It has been a complete disaster, which, in hindsight, was evident when I passed it through the hypothesis-driven development framework process I wrote about earlier.

So, while there has been a great deal written about not simply throwing away a strategy because of short-term underperformance, and that anomalies such as momentum and value exist because of career risk due to said short-term underperformance, it’s never a good thing when a strategy creates historically large losses, particularly after being published in such a humble corner of the quantitative financial world.

In any case, this was a post demonstrating some mechanics, and an update on a strategy I blogged about not too long ago.

Thanks for reading.

NOTE: I am always interested in hearing about new opportunities which may benefit from my expertise, and am always happy to network. You can find my LinkedIn profile here.

11 thoughts on “Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)”

  1. There are other examples of problems with LI strategies post publication. To their credit, they post results showing the date of publication. There also appear to be widespread problems with many non-peer reviewed, pseudo-academic publications (SSRN is a breeding ground for this). Whenever I see an inflection point in 2008, for example, I know the strategy has been curve fit through selection bias. The author just selected instruments that worked in that unique period. But there are other biases – some pretty subtle. I’ve been monitoring many of the strategies published on your blog with troubling results. Even EAA, a fantastic idea and paper, has some red flags. EAA never booked a single yearly loss from 1998 through to publication in January 2015 (using MF data from Yahoo). Not even in 2008! Yet it lost nearly 5% in 2015 and is slightly negative YTD. Thank you for creating great code. Your blog is a treasure!

    • To be fair, 2015 was a horrid year for momentum, so I can let that one slide. A 5% loss isn’t the end of the world. This year I think is a bit lukewarm as well.

      Momentum generally looks bad when the markets are in a sort of sustained consolidation wishy-washy phase. I’m sure there’s some technique out there that can more scientifically say what state a market is in a bit more definitively than looking at a chart, though.

  3. Ilya, thanks for this useful article.

    GeraldM, if you invest money with systems you pick from (il)logical invest then you deserve to lose your money to someone who can make better use of it.

    • Tarantino, I have never used LI. It’s an obvious curve fit. It has been entertaining to monitor though. They publish charts that show when they went live which is commendable because it shows that the systems are actually not working. That makes me think that they don’t really appreciate the problems of over optimization. Just an opinion of course.

  4. I think one problem with your code could be that you eliminate rows with NAs (na.omit) from a matrix of *returns*. You have to first merge the price-vectors, then remove the NAs and only after that calculate returns, otherwise you can get distorted results.

  5. Other years in the back test weren’t good for momentum either. Market momentum characteristics in 2015 weren’t much different than prior periods (see A likely problem with momentum is the explosion in papers, blog posts and algorithms detailing every aspect of momentum. Everyone knows about it and so how can there be an edge for something that is so commonly understood? I agree that one year does not make for a robust data point but the fact remains that EAA proceeded to lose once applied to real out-of-sample data whereas it didn’t during the biggest bubble (2000) or the worst financial crisis (2008) in a generation. The defensive portfolio is also in the deepest draw down since start of the back test period (1998). I find that troubling. The end result is that EAA went into draw down immediately after publication and has not recovered since. The problems with Logical Invest are over optimization (the unrealistic CAGR’s are enough to think that). The problems with EAA are not easy to identify. Maybe there isn’t a problem and 2015 (and so far in 2016) are somehow worse than any period since 1998. Or maybe momentum is now fully arbitraged out and will not work for the foreseeable future. I don’t know. I will keep monitoring these strategies though.
