Volatility Risk Premium: Sharpe 2+, Return to Drawdown 3+

First, before starting this post, I’d like to give one last comment about my previous post:

I called Vanguard to inquire about the trading policies on VWEHX and VFISX, and there are two-month cooldown periods (aka frequent-trading policies) on those mutual funds. However, the HYG ETF does indeed pay dividends, so the adjusted ETF variant is most likely the closest performance an investor can expect. Still, a Sharpe ratio higher than 1.25 is nothing to scoff at. Of course, no transaction costs are assumed in any of my strategies, so make sure your broker isn’t ripping you off if you seriously intend to invest in anything I publish on this blog (I hear Interactive Brokers charges $1 per transaction), and once again, remember that none of this constitutes official advice.

Now, onto this post:

Given the attention some of my previous volatility posts have garnered through my replication of SeekingAlpha strategies, today I am going to share a strategy whose statistics boggle my mind.

The strategy was presented by TradingTheOdds in this post. I had to replicate it for myself to be sure it worked as advertised, but unless I have something horrendously incorrect, this strategy works…quite well. Here’s the strategy:

Using the actual S&P 500 index, compute the two-day annualized historical volatility. Subtract that from the VXMT, which is the six-month expected volatility of the S&P 500 (prior to 2008, use the actual VIX). Then, take the 5-day SMA of that difference. If this number is above 0, go long XIV, otherwise go long VXX. In my replication, the strategy uses market-on-close orders (AKA observe “near” the close, buy at the close), so the strategy should be taken with a little bit of a grain of salt. Here’s the code:


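library(downloader)            # download()
library(quantmod)              # getSymbols(), Cl(); also loads xts, zoo and TTR
library(PerformanceAnalytics)  # Return.calculate(), Return.annualized()

# Only the tail of the download() call that creates longVXX.txt survives here:
#          destfile="longVXX.txt") #requires downloader package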

xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxx <- xts(read.zoo("longVXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
vxmt <- xts(read.zoo("vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

getSymbols("^VIX", from="2004-03-29")

vixvxmt <- merge(Cl(VIX), Cl(vxmt))
vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

getSymbols("^GSPC", from="1990-01-01")
spyRets <- diff(log(Cl(GSPC)))

spyVol <- runSD(spyRets, n=2)
annSpyVol <- spyVol*100*sqrt(252)

vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
vols$smaDiff <- SMA(vols[,1] - vols[,2], n=5)
vols$signal <- vols$smaDiff > 0
vols$signal <- lag(vols$signal, k = 1)

xivRets <- Return.calculate(Cl(xiv))
vxxRets <- Return.calculate(Cl(vxx))
stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets
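For readers without the data files at hand, the signal mechanics can be sketched on made-up numbers in base R. Everything below is hypothetical: the toy index closes, the toy implied-volatility series, and the helpers `roll_apply`, `roll_sd`, `roll_mean` and `lag1`, which are plain stand-ins for `runSD`, `SMA` and `lag` on an xts object. It is meant as an illustration of the steps, not as the code behind the results in this post.

```r
# Toy illustration of the signal logic; all numbers are made up.
# roll_sd(), roll_mean() and lag1() are hypothetical base-R stand-ins for
# TTR::runSD(), TTR::SMA() and lag() on an xts object.
roll_apply <- function(x, n, f) {
  out <- rep(NA_real_, length(x))
  if (length(x) >= n) {
    for (i in n:length(x)) out[i] <- f(x[(i - n + 1):i])
  }
  out
}
roll_sd   <- function(x, n) roll_apply(x, n, sd)
roll_mean <- function(x, n) roll_apply(x, n, mean)
lag1      <- function(x) c(NA, x[-length(x)])

index_close <- c(100, 101, 100.5, 102, 101, 103, 102.5, 104, 103, 105, 104.5, 106)
implied_vol <- c(20, 20, 19.5, 19, 19.5, 19, 18.5, 18, 18.5, 18, 17.5, 17)

log_rets     <- c(NA, diff(log(index_close)))          # daily log returns
realized_vol <- roll_sd(log_rets, 2) * 100 * sqrt(252) # 2-day annualized vol, in percent
sma_diff     <- roll_mean(implied_vol - realized_vol, 5)
signal       <- lag1(sma_diff > 0)                     # TRUE: short vol, FALSE: long vol

print(data.frame(implied_vol, realized_vol = round(realized_vol, 2),
                 sma_diff = round(sma_diff, 2), signal))
```

The leading NA values come from the warm-up of the rolling windows plus the one-day lag, which is the same reason the first rows of `stratRets` are NA.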

The VXMT data is taken from the link I showed earlier. As I interpret it, this strategy seems to be stating the following:

Since we are subtracting the near-term historical volatility, which I suppose is meant to be a proxy for the “forecast of current volatility”, from the long-term expected volatility (VXMT), a positive smoothed difference implies that volatility is being overestimated by the VXMT, so we should go short volatility. Conversely, if the near-term historical volatility is higher than the expected volatility, we should be long volatility instead. So, here’s the punchline that is the equity curve:


Yes, you’re looking at that correctly–over approximately ten years (slightly longer), this strategy had a cumulative return of about $10,000 for every $1 invested. Just to put this into perspective, here’s a log-scale equity curve.
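As a rough illustration of how the regular and log-scale views relate, here is a minimal base-R sketch on made-up returns. The return values and the 10.5-year horizon are assumptions chosen for illustration, and this is not the plotting code used for the charts in this post.

```r
# Hypothetical daily strategy returns; the first value is NA, as it would be
# after Return.calculate() or after lagging a signal.
toy_rets <- c(NA, 0.02, -0.01, 0.03, -0.04, 0.05, 0.01)

clean_rets <- toy_rets[!is.na(toy_rets)]  # drop NAs before compounding
equity     <- cumprod(1 + clean_rets)     # growth of $1
log_equity <- log(equity)                 # same as cumsum(log(1 + clean_rets))

# plot(equity, type = "l")                # regular scale
# plot(equity, type = "l", log = "y")     # log-scale y axis

# Growth of $1 at the reported annualized return over an assumed 10.5 years
implied_growth <- (1 + 1.377875)^10.5
print(implied_growth)
```

On a log scale, equal percentage moves take up equal vertical distance, which is why drawdowns early in the history remain visible instead of being flattened by the later growth.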

There are a few noticeable dips which correspond to around 40% drawdowns on the regular scale. Now, let’s look at the usual statistics:

stats <- data.frame(cbind(Return.annualized(stratRets)*100, 
                          maxDrawdown(stratRets)*100, 
                          SharpeRatio.annualized(stratRets)))

colnames(stats) <- c("Annualized Return", "Max Drawdown", "Annualized Sharpe")
stats$MAR <- as.numeric(stats[1])/as.numeric(stats[2])

With the following result:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          137.7875     45.64011          2.491509 3.019001
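To make the four columns concrete, here is a minimal base-R sketch of the same statistics on made-up returns. The formulas reflect my understanding of the usual definitions (geometric annualization, 252 trading days, zero risk-free rate) and are an assumption, not a statement of exactly what the PerformanceAnalytics functions do internally.

```r
# Hypothetical daily returns, for illustration only
toy_rets         <- c(0.02, -0.01, 0.03, -0.04, 0.05, 0.01, -0.02, 0.015)
periods_per_year <- 252  # assumed number of trading days per year

ann_return <- prod(1 + toy_rets)^(periods_per_year / length(toy_rets)) - 1
equity     <- cumprod(1 + toy_rets)
drawdowns  <- 1 - equity / cummax(equity)   # fraction below the running peak
max_dd     <- max(drawdowns)
ann_sharpe <- ann_return / (sd(toy_rets) * sqrt(periods_per_year))
mar        <- ann_return / max_dd           # return to maximum drawdown

# The MAR column in the table is the first column divided by the second
post_mar <- 137.7875 / 45.64011
print(c(max_dd = max_dd, post_mar = post_mar))
```

With only eight observations the annualized figures from the toy series are meaningless in themselves; the point is only how each number is assembled.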

Risky, judging from maximum drawdown alone? Yes. But is there reward for the risk? Absolutely.

To put a lower bound on the performance of the strategy, here are the same diagrams and statistics with the signal lagged one more day. Since the returns use close-to-close data and work off of the closing prices, the signal is already lagged by a day to avoid lookahead bias. Lagging the signal by one more day means receiving the signal at the close of day t, but only entering at the close of day t+1.

In effect, this is the coding change:

vols$signal <- lag(vols$signal, k = 1)

becomes

vols$signal <- lag(vols$signal, k = 2)
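The effect of the extra lag on which return gets paired with which signal can be seen in a small base-R sketch. The signal values, the toy returns, and the helper `lag_k` (a plain-vector stand-in for `lag` on an xts object) are all hypothetical.

```r
# Hypothetical helper: shift a vector forward by k positions, padding with NA
lag_k <- function(x, k) c(rep(NA, k), x[seq_len(length(x) - k)])

signal_at_close <- c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE)  # known at the close of day t
xiv_rets <- c(0.01, 0.02, -0.03, -0.02, 0.04, 0.01)         # made-up close-to-close returns
vxx_rets <- -xiv_rets                                       # toy mirror image, not real data

signal_lag1 <- lag_k(signal_at_close, 1)  # trade at the close the signal is observed
signal_lag2 <- lag_k(signal_at_close, 2)  # trade one close later

strat_lag1 <- signal_lag1 * xiv_rets + (1 - signal_lag1) * vxx_rets
strat_lag2 <- signal_lag2 * xiv_rets + (1 - signal_lag2) * vxx_rets

print(data.frame(signal_at_close, signal_lag1, signal_lag2, strat_lag1, strat_lag2))
```

With `k = 1`, the return earned on day t+1 is decided by the signal from day t; with `k = 2`, it is decided by the signal from day t-1, which is the more conservative assumption.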

Here are the results:

Equity curve/drawdowns:

So, still impressive on the returns, but now there are some very pronounced drawdowns: slightly larger and, more concerning, longer.

Log equity curve:

Again, slightly more pronounced dips.

Here are the statistics:

                  Annualized Return Max Drawdown Annualized Sharpe      MAR
Annualized Return          84.36546     56.77219          1.521165 1.486035

So still quite respectable for a strategy this understandable.

I’m fairly certain that there’s still room for improvement on this strategy, considering that anywhere there’s a 5-day SMA, there’s often room for better trend-following indicators, and possibly a better indicator of S&P 500 volatility than the 2-day rolling annualized metric. But as it stands, I think that based on its risk/reward characteristics, this strategy has a place as an aggressive, returns-generating component of a portfolio of strategies.

Thanks for reading.

Note: I am a freelance consultant in quantitative analysis on topics related to this blog. If you have contract or full time roles available for proprietary research that could benefit from my skills, please contact me through my LinkedIn here.

34 thoughts on “Volatility Risk Premium: Sharpe 2+, Return to Drawdown 3+”

  1. Ilya, great find. The strategy’s returns are, as you said, astounding. They are so good, I can’t help but wonder if they are due to data mining. This answer by Frank in the comments on his blog is interesting:

    “I tried with VXV (3-month term) as well, but performance was less than stellar (than with VXMT (6-month)). Unfortunately I don’t own a crystal ball, but it is always possible that in the future the VXV, VX futures (either pure quotes or merged into constant maturity) or a related underlying may outperform all others due to a change in market regime. Who knows …

    I tried a lot of combinations (underlyings, moving averages, cut offs), and of course it is possible to come to even better results (than by using the VXMT rather than the VIX) by curve-fitting (e.g. by adding a cut-off different than “0”). Currently I’m investigating this topic (the reason behind the VXMT outperformance). Stay tuned …”

    So Frank tried lots of hypotheses before finding this one. I’ve always found the strategies I can trust generally work without trial-and-error, as soon as they are tested, and the logic of why they work is always very simple and transparent from the formation of the hypothesis.

    The issue is of course evaluating non-quantitative risk in the strategy. The unquantified risk all strategies have is whether the future is going to be like the past. The way to get at that is to identify the assumptions behind why the strategy works and judge their durability. Almost all of the SMA-type strategies that I see have opaque assumptions which seem like they could become invalid at any time. That’s a bigger risk than volatility to me, as volatility can be easily mitigated, e.g., by doing what Warren Buffett does and always keeping a big cash pile on hand so you are never forced to sell assets under duress.

    What are your thoughts?

    • Well, to begin with, this wasn’t Frank’s original strategy. It was from a different paper with a separate hypothesis behind it. I do believe he optimized it a bit more, but to me, the logic is apparent. It’s the idea that if the near-term historical volatility of the SPY is lower than the expected (read: forecast) volatility given by the VXMT, then it makes sense to short the VIX volatility, since the VIX is also an estimate of expected volatility, as opposed to realized/historical volatility.

      However, I did go back and check the parameters. Is the SMA5 of a two-day historical volatility optimized? Yes. As in, I could not find a better parameter for the SMA. Are the strategy payoffs still in the realm of solid to impressive even outside these two exact parameters? Yep. Still MARs north of 1, Sharpe north of 1, drawdowns still in the 40s-50s. Is this strategy without risk? By no means. However, the downside is within the realm of any of the volatility trading strategies posted to the internet (aka in the 40s or 50s, if not worse), while the returns are pretty fantastic. Also, when I see an optimized value of a two-period something or other (such as David Varadi’s DV2 indicator, or Larry Connors’s use of the RSI2 or the Connors RSI (3, 2, 100)), that’s usually a sign that there’s a very short-term phenomenon the computation seeks to capture.

      Regarding Buffett’s strategy of keeping cash on hand so as to never be forced to sell at a bad moment, that’s certainly good advice. In fact, I’d highly recommend mixing this strategy with other very low-risk strategies or assets. I mean sure, you can shoot for the moon and go all-in on this strategy, but then you might exit at the bottom of one of those cyclical periods of the strategy being wrong for a little while.

      At the end of the day though, there aren’t too many moving parts to this strategy:

      Compare two computations of volatility (long term forecasted vs. near-term historical). Smooth them out so as to not get whipsawed. Go in the direction of the historical volatility in relation to the forecasted. (EG historical volatility lower than forecasted? Short volatility, and vice versa.)

      Can I be rationalizing away red flags? Possibly. Is this a strategy I’d say going all in on is a good idea? No. However, I wouldn’t say that it makes sense to dismiss the strategy out of hand, since for a small allocation of your portfolio (EG 5%-10%), the returns can get a major shot in the arm.

      Just my two cents.

      • Hi Ilya, the rationale for the strategy is clear. What’s less clear, I guess, is whether the rationale is a logic or a rationalization. There are a couple of questions I’d suggest for getting at the issue.

        1) If you hadn’t seen the data, how closely could you have guessed the returns?
        2) If you hadn’t seen the data, could you have predicted the difference in returns between this strategy and other strategies to which the same “logic” applies? For example, could you have predicted the strategy would work much more poorly using VXV 3-month numbers?
        3) Based on the “logic”, can you predict what market conditions would lead to improvements in or elimination of the returns from this strategy?
        4) How much money would you bet that this strategy’s performance on an out-of-sample data set would be within 10% of the performance on this data set?

        For my part, I can’t see how to deduce answers to these questions from the “logic” of the strategy. If I combine those failures with the unnatural magnitude of the returns, it’s hard to dismiss the idea that there’s a strong data mining effect here.

      • 1) Not closely at all, since this is new to me as well. I would have guessed something substantial, as the volatility ETFs can have severe drawdowns, so I’d guess somewhere in the range of 40-60%. Still not too close.

        2) No way. I generally don’t guess that annualized returns would be above 100%.

        3) I’d have to look more closely at the data to answer that question. The strategy is new to me as well.

        4) I highly doubt that this strategy’s returns would hold up exactly as spectacularly as they have been, but currently, the system seems to be making new equity highs. Even if returns are 50% of what they have been, that still might not be such a bad strategy.

        Certainly, I’m willing to accept that there’s some variation of curve-fitting and over-optimization here. The more pertinent question seems to be “how much”?

  2. Data mining and curve fitting. Using linear algebra and calculus, I can construct a combination of ETFs and stocks that has a very high backtested Sharpe ratio over some backtested period, but correlation does not equal causation. There has to be some fundamental reasoning behind a strategy that ties the components together. Long TQQQ and long TMF rebalanced quarterly has done exceptionally well.

  3. The rationale behind it is very simple: Sell (volatility) risk when there is fear in the market (implied volatility overestimates realized volatility), and vice versa. Even the most simple strategy based on that concept (‘Go long XIV at today’s close if the VIX index will close below the front month VIX futures contract, or go long VXX if it will close above. Hold until a change in position.’) came up with 50+% annualized returns over the course of the last 10 years (including the financial crisis and the most recent bull market).
    See http://volatilitymadesimple.com/backtest-comparing-the-vix-index-to-front-month-vix-futures/

    Of course there were (and will be) severe drawdowns. But is it worth the risk (as one part of your portfolio) ? I think definitely!

  4. I wrote the paper that this strategy was modified from. To a large extent my strategy was “data-mined” and to reflect this fact I gave it two “Grim Reaper” icons in the paper.

    It has now been data-mined to within an inch of its life (if not further). This has been done in two ways: firstly the blogger took the best of the 5 strategies in my paper. That is data-mining by itself. Then he optimised the parameters. So that’s data-mining on top of data-mining on top of further data-mining.

    The VRP has been zero or negative most of the year. This can be seen in chart 4 of http://volatilitymadesimple.com/vix-trading-year-to-date-in-four-graphs-2/ which is an up-to-date version of a chart in my paper. So VRP strategy returns should be zero.

    • It isn’t often that the authors of original papers comment on a small blog like this. So, Mr. Cooper, thank you so much. I’m not sure I agree with the first statement being a case of data mining. That is, if I’d read a book containing several strategies, would it be data mining to pick one that I particularly like? Maybe, maybe not. I’m not sure how one would go about testing that, however.

      Regarding the second statement, however, you are absolutely correct that this specific parameter combination was indeed over-optimized. However, Jaekle and Tomasini (2009) have a certain “finger in the air” method of visually analyzing whether or not a strategy is over-optimized, which, given the response to this blog post, I feel should be shared for further discussion. I’ll have that up soon.


  5. If you read a book containing several strategies and picked the one that performed the best, that would be optimising, which is a form of data-mining.

    [I would technically (being a proponent of statistical learning or machine learning) call it overfitting. But for some reason people in the trading area (and small bloggers) call it “data-mining” – a term with pejorative connotations. But in my field, data mining is the noble art of finding meaningful patterns in data. Similarly the term “curve fitting” has taken a pejorative meaning in the trading world.]

    I tried to convince TradingTheOdds that what worked best in the past isn’t likely to work the best in the future but to no avail. Maybe that’s why authors don’t like to engage with small bloggers. Nevertheless, one has to respect a blogger who uses R!

    • No, the concern for whether or not the strategy was overfit is absolutely valid. And call it what one will (data mining, curve fitting, overfitting, over-optimization, etc…), as someone with an MS in stats, it’s all the same to me: blindly run an optimizer and pick the best past performance, and then say “LOOK AT MY AMAZING RESULTS!”. Well, the results are indeed amazing, sticking to the adage of “you’ll never see a bad backtest”, but just because one person overfit a parameter set doesn’t mean that the strategy itself is bad. After all, given enough parameters and rules, one can potentially overfit anything, and it’s far too easy to fool oneself with trading systems and make massive mistakes.

      However, I’m working on finishing up a demonstration with heatmaps showing that the strategy itself may actually have some value.

  6. Thanks for the post, really great stuff. I’m having trouble getting R to recognize the vxmt variable that was created in line 9. Is there another download statement that’s missing? I only see the two CSVs (XIV, VXX) that were created. Please advise. Thank you!

  7. Tony, Ilya,

    With respect to “I tried to convince TradingTheOdds that what worked best in the past isn’t likely to work the best in the future but to no avail.”: It is certain that what worked best in the past may not (but not: will not!) work the best in the future. But why look for “what works best” if “working fine” could already mean 50+% per year?

    We could’ve discussed the same issue 4 years ago (VRP strategy, curve-fitting, … with a 5-6 year history), and while the theoretician would still be looking for the fly in the ointment, the practitioner would meanwhile have increased his stake more than tenfold. Let’s see what will happen over the course of the next couple of years (live results are available on the blog).

    I totally agree that the VRP significantly declined between 2008 and the present (see my recent posting), but nobody knows when it will increase again. But the concept behind is still able to provide a profitable edge in the market, even with the most simple set of parameters.

    Anyway: Thanks a lot to Tony for making his strategy public.


  8. Pingback: Trading The Odds Volatility Risk Premium: Addressing Data Mining and Curve-Fitting | QuantStrat TradeR

  9. Putting aside the question of data overfitting, having a half dozen 50% drawdowns within a span of a few years makes this strategy unusable for the most part. I find that any VIX trading strategy should strive for, at most, 20% drawdowns.

    • A fair point. Not one that I think everyone would agree with, but I certainly can see why one might hold such views. The downsides to this strategy are immense if one devotes a sizable chunk of one’s holdings to it.

  10. Hello,

    Thank you for the analysis of this strategy. Very good work!
    Can you please let me know how you get the Log Equity Curve?

    Thank you!

      • Do you mean?

        #equity curve

        #log-scale equity curve

        I get this error:
        Error in plot.window(…) : need finite ‘ylim’ values
        In addition: Warning messages:
        1: In min(x) : no non-missing arguments to min; returning Inf
        2: In max(x) : no non-missing arguments to max; returning -Inf


        # destfile="XIV.txt")

        # destfile="VXX.txt") #requires downloader package

        xiv <- xts(read.zoo("Data/XIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
        vxx <- xts(read.zoo("Data/VXX.txt", format="%Y-%m-%d", sep=",", header=TRUE))
        vxmt <- xts(read.zoo("Data/vxmtdailyprices.csv", format="%m/%d/%Y", sep=",", header=TRUE))

        getSymbols("^VIX", from="2004-03-29")

        vixvxmt <- merge(Cl(VIX), Cl(vxmt))
        vixvxmt[is.na(vixvxmt[,2]),2] <- vixvxmt[is.na(vixvxmt[,2]),1]

        getSymbols("^GSPC", from="1990-01-01")
        spyRets <- diff(log(Cl(GSPC)))

        spyVol <- runSD(spyRets, n=2)
        annSpyVol <- spyVol*100*sqrt(252)

        vols <- merge(vixvxmt[,2], annSpyVol, join='inner')
        vols$smaDiff <- SMA(vols[,1] - vols[,2], n=5)
        vols$signal <- vols$smaDiff > 0
        #vols$signal <- lag(vols$signal, k = 1)
        vols$signal <- lag(vols$signal, k = 2)

        xivRets <- Return.calculate(Cl(xiv))
        vxxRets <- Return.calculate(Cl(vxx))
        stratRets <- vols$signal*xivRets + (1-vols$signal)*vxxRets

        #equity curve

        #log-scale equity curve

  11. Hi Ilya,

    Many thanks for sharing this post and the base data!

    I am very new to R and tried to replicate the above with the info from your quantstrat tutorials. I am struggling to get the assets right, as it’s more of a binary (either-or) allocation between the two assets.

    Is that actually possible via quantstrat?

    I would like to model transaction costs as well as trailing stops to look further into this strategy, hence the attempt to press it into the quantstrat system… Happy to share how far I got.

    Many thanks anyway! Great blog!

      • My approach was to compute the indicator columns for VXX & XIV separately to allow the same signals & rules for both. However, doing so gives a completely different equity curve than expected?

        ## ---- Signal Calculation -------------------------------------------------
        VIXVXMT <- merge(Cl(VIX), Cl(VXMT))
        VIXVXMT[is.na(VIXVXMT[,2]),2] <- VIXVXMT[is.na(VIXVXMT[,2]),1]
        spyRets <- diff(log(Cl(GSPC)))
        spyVol <- runSD(spyRets, n=2)
        annSpyVol <- spyVol*100*sqrt(252)
        vols <- merge(VIXVXMT[,2], annSpyVol, join='inner')
        vols$smaDiff <- SMA(vols[,1] - vols[,2], n=5)
        vols$signalXIV <- vols$smaDiff > 0
        vols$signalXIV <- lag(vols$signalXIV, k = 1)
        XIV <- merge(XIV, vols$signalXIV, join='inner')
        names(XIV)[5] <- "precomputed_signal"
        vols$signalVXX <- vols$smaDiff < 0
        vols$signalVXX <- lag(vols$signalVXX, k = 1)
        VXX <- merge(VXX, vols$signalVXX, join='inner')
        names(VXX)[5] <- "precomputed_signal"
        add.signal(strategy.st, name="sigThreshold",
                   arguments=list(column="precomputed_signal", threshold=.5, 
                                  relationship="gt", cross=TRUE),
                   label="longEntry")
        add.signal(strategy.st, name="sigThreshold",
                   arguments=list(column="precomputed_signal", threshold=.5, 
                                  relationship="lt", cross=TRUE),
                   label="longExit")
        add.rule(strategy.st, name="ruleSignal",
                 arguments=list(sigcol="longEntry", sigval=TRUE, orderqty=1, ordertype="market",
                                orderside="long", replace=FALSE, prefer="Close"),
                 type="enter", path.dep=TRUE)
        add.rule(strategy.st, name="ruleSignal", 
                 arguments=list(sigcol="longExit", sigval=TRUE, orderqty="all", ordertype="market", 
                                orderside="long", replace=FALSE, prefer="Close"), 
                 type="exit", path.dep=TRUE)
      • I wouldn’t be able to tell you, as all my volatility trading strategies have been built on the idea of “one lot, invest all capital”, so I’m operating in returns space at the moment.

        Also, when using quantstrat, you don’t need to lag your signals. I lag my signals in returns space because the return at time t is that day’s close divided by yesterday’s close – 1, so I would obtain the return of time t+1 using the signal generated at time t. Quantstrat’s next-bar execution system takes care of that for you.

  12. Tony is very kind to raise warning flags, though most people seem too greedy to care.

    Let me raise a few questions for the people thinking of investing in the strategy.
    Do you know what volatility is, and how it is calculated, used, and traded? (If so, you would never compute historical volatility using 2 days. Traders likely never use anything shorter than 1 week.)
    Do you know how VIX is calculated (not a flavor, but the real computation)? Are you familiar with short convexity, vol of vol, gap risk, tail risk?
    Have you read the prospectus of the XIV and VXX ETFs/ETNs? Do you understand how they work?
    Currently VIX trades around 14. Do you know what happens if VIX gaps and opens at 28, twice its last closing level? (If you think such a gap is impossible, use significant backtests in terms of the period considered, including 1987, the 1994 bond crisis, the 1998 emerging markets and LTCM crisis, the 2008 Lehman day, or whatever gap events you like, and in terms of markets: check the vol of vol in Europe during the 2011-12 euro crisis, or in Japan the day after the tsunami. Do you think a big earthquake in California is impossible during your lifetime?) The answer is likely that XIV gaps open at a price level close to 0 (but do the correct computation yourself). And once there is such a gap and the XIV value is near 0, I think the product description mentions that the XIV product is terminated (but read the contract information yourself).
    There are other big potential issues (replication risk due to the size of XIV/VXX compared to liquidity in SPX option markets).
    Long-term professionals (talented and with amazing track records) got burnt by the depegging of CHF from EUR a few weeks back. CHF realized volatility had been close to 0 due to the peg, and implied volatility was likely very low. Then, within 1 minute, there was a 40% move in the currency. Calculate the resulting vol of vol and the impact on a hypothetical EURCHF VIX index.

    So, trading these strategies (the best or the average one, the optimized one or the standard one) without being aware of all the risks and having a hedging strategy in place is like going to the casino. You have a fun time as long as it lasts, but at the end you likely have less money than at the beginning (if any money is left).
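    The gap arithmetic in the comment above can be sketched for an idealized product. This assumes a hypothetical note that returns exactly -1 times the one-day move of its index, with no fees and no termination clause, and the index moves are made-up inputs. The index behind such a product is built from VIX futures rather than spot VIX, so the size of the index move for a given spot gap is an input here, not something the sketch derives.

```r
# Idealized -1x daily product on a hypothetical index (an assumption for
# illustration; real products have fees and prospectus-specific terms).
inverse_daily_return <- function(index_return) -1 * index_return

index_moves <- c(0.10, 0.50, 0.80, 1.00)  # hypothetical one-day index gains
value_after <- pmax(0, 1 + inverse_daily_return(index_moves))  # per $1 held

print(data.frame(index_move = index_moves, value_after = value_after))
```

    Under these assumptions, a one-day doubling of the index leaves nothing of the position, which is the scenario the comment warns about.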

  13. Hi, I was analyzing your code and have one question. Why do you use the Return.calculate() function for XIV and VXX, and diff(log()) to calculate S&P 500 returns? Is there any significant difference between those two methods?

    I tried both on the S&P 500 and they return slightly different results (about 10^(-5) different).
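    The size of that gap can be checked on made-up prices in base R. Simple returns are P_t/P_{t-1} - 1 and log returns are log(P_t/P_{t-1}), so the two differ by roughly half the squared return, which for a 1% daily move is about 5e-5. The prices below are hypothetical.

```r
# Hypothetical closing prices, for illustration only
prices <- c(100, 101, 99.5, 100.5, 102)

simple_rets <- prices[-1] / prices[-length(prices)] - 1  # arithmetic returns
log_rets    <- diff(log(prices))                         # log returns

gap        <- simple_rets - log_rets  # never negative, since log(1 + r) <= r
approx_gap <- simple_rets^2 / 2       # second-order approximation of the gap

print(data.frame(simple_rets, log_rets, gap, approx_gap))
```

    For daily data the two are close enough that either works as an input to a short-window volatility estimate, while compounding an equity curve calls for simple returns (or summing log returns and exponentiating).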

  14. Pingback: I’m Back, A New Harry Long Strategy, And Plans For Hypothesis-Driven Development | QuantStrat TradeR
