Several Key PerformanceAnalytics Functions From R Now In Python (special thanks to Vijay Vaidyanathan)

Posted on June 21, 2023 by Ilya Kipnis • Posted in AI, Asset Allocation, Basics, Data Analysis, Portfolio Management, Python, R, Replication, Reporting, Trading • Tagged AI, Python, R • 1 Comment

So, thanks to my former boss, and head of direct indexing at BNY Mellon, Vijay Vaidyanathan, and his Coursera course, along with the usual assistance from chatGPT (I officially see it as a pseudo programming language), I have some more software for the Python community now released to my github. As wordpress now makes it very difficult to paste formatted code, I’ll be linking more often to github files.

In today’s post, the code in question is a python file containing quite a few functions from R’s PerformanceAnalytics library outside of Return.portfolio (or Return_portfolio, as it may be in Python, because the . operator is reserved in Python…ugh).

So, here is the “library” of new functions, for now called edhec_perfa.py, as a nod to Vijay Vaidyanathan’s Coursera course(s).

Here’s the link to the latest github file.

Now, while a lot of the functions should be pretty straightforward (annualized return, VaR, CVaR, tracking error/annualized tracking error, etc.), two functions that I think are particularly interesting that I ported over from R with the help of chatGPT (seriously, when prompted correctly, an LLM is quite helpful in translating from one programming language to another, provided the programmer has enough background in both languages to do various debugging) are charts_PerformanceSummary, which does exactly what you might think it does from seeing me use the R function here on the blog, and table_Drawdowns, which gives the highest N drawdowns by depth.

Here’s a quick little Python demo (once you import my file from the github):

import yfinance as yf
import pandas as pd
import numpy as np

symbol = "SPY"
start_date = "1990-01-01"
end_date = pd.Timestamp.today().strftime("%Y-%m-%d")
df_list = []

data = pd.DataFrame(yf.download(symbol, start=start_date, end=end_date))
data = data["Adj Close"]
data.name = symbol
data = pd.DataFrame(data)

returns = Return_calculate(data)

charts_PerformanceSummary(returns)
table_Drawdowns(returns)

This results in the following:

View post on imgur.com

And the following table:

View post on imgur.com

The NaT in the “To” column means that the drawdown is current, as does the NaN in the Recovery–they’re basically Python’s equivalent of NAs, though NaT means an NA on *time*, while an NaN means an NA on a numerical value. So that’s interesting in that Python gives you a little more information regarding your column data types, which is kind of interesting.

Another function included in this library, as a utility function is the infer_trading_periods function. Simply, plug in a time series, and it will tell you the periodicity of the data. Pandas was supposed to have a function that did this, but due to differences in frequency of daily data caused by weekends and holidays, this actually doesn’t work on financial data, so I worked with chatGPT to write a new function that works more generally. R’s checkData function is also in this library (translated over to Python), but I’m not quite sure where it was intended to be used, but left it in for those that want it.

So, for those interested, just feel free to read through some of the functions, and use whichever ones you find applicable to your workflow. That said, if you *do* wind up using it, please let me know, since there’s more value I can add.

Lastly, I’m currently in the job market *and* have a volatility trading signal subscription available to offer for those interested. You can email me at ilya.kipnis@gmail.com or find me on my LinkedIn profile here.

Thanks for reading.

How You Measure Months Matters — A Lot. A Look At Two Implementations of KDA

Posted on November 30, 2019 by Ilya Kipnis • Posted in Asset Allocation, Data Analysis, ETFs, Portfolio Management, R, Replication, Trading • Tagged R • 18 Comments

This post will detail a rather important finding I found while implementing a generalized framework for momentum asset allocation backtests. Namely, that when computing momentum (and other financial measures for use in asset allocation, such as volatility and correlations), measuring formal months, from start to end, has a large effect on strategy performance.

So, first off, I am in the job market, and am actively looking for a full-time role (preferably in New York City, or remotely), or a long-term contract. Here is my resume, and here is my LinkedIn profile. Furthermore, I’ve been iterating on my volatility strategy, and given that I’ve seen other services with large drawdowns, or less favorable risk/reward profiles charge $50/month, I think following my trades can be a reasonable portfolio diversification tool. Read about it and subscribe here. I believe that my body of work on this blog speaks to the viability of employing me, though I am also learning Python to try and port over my R skills over there, as everyone seems to want Python, and R much less so, hence the difficulty transferring between opportunities.

Anyhow, one thing I am working on is a generalized framework for tactical asset allocation (TAA) backtests. Namely, those that take the form of “sort universe by momentum, apply diversification weighting scheme”–namely, the kinds of strategies that the folks over at AllocateSmartly deal in. I am also working on this framework and am happy to announce that as of the time of this writing, I will happily work with individuals that want more customized TAA backtests, as the AllocateSmartly FAQs state that AllocateSmartly themselves do not do custom backtests. The framework I am currently in the process of implementing is designed to do just that. However, after going through some painstaking efforts to compare apples to apples, I came across a very important artifact. Namely, that there is a fairly large gulf in performance between measuring months from their formal endpoints, as opposed to simply approximating months with 21-day chunks (E.G. 21 days for 1 month, 63 for 3, and so on).

Here’s the code I’ve been developing recently–the long story short, is that the default options essentially default to Adaptive Asset Allocation, but depending on the parameters one inputs, it’s possible to get to something as simple as dual momentum (3 assets, invest in top 1), or as complex as KDA, with options to fine-tune it even further, such as to account for the luck-based timing that Corey Hoffstein at Newfound Research loves to write about (speaking of whom, he and the awesome folks at ReSolve Asset Management have launched a new ETF called ROMO–Robust Momentum–I recently bought a bunch in my IRA because a buy-it-and-forget-it TAA ETF is pretty fantastic as far as buy-and-hold investments go). Again, I set a bunch of defaults in the parameters so that most of them can be ignored.

require(PerformanceAnalytics)
require(quantmod)
require(tseries)
require(quadprog)

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}


getYahooReturns <- function(symbols, return_column = "Ad") {
  returns <- list()
  for(symbol in symbols) {
    getSymbols(symbol, from = '1990-01-01', adjustOHLC = TRUE)
    if(return_column == "Ad") {
      return <- Return.calculate(Ad(get(symbol)))
      colnames(return) <- gsub("\\.Adjusted", "", colnames(return))
    } else {
      return <- Return.calculate(Op(get(symbol)))
      colnames(return) <- gsub("\\.Open", "", colnames(return))
      
    }
    returns[[symbol]] <- return
  }
  returns <- na.omit(do.call(cbind, returns))
  return(returns)
}

symbols <- c("SPY", "VGK",   "EWJ",  "EEM",  "VNQ",  "RWX",  "IEF",  "TLT",  "DBC",  "GLD")  

returns <- getYahooReturns(symbols)
canary <- getYahooReturns(c("VWO", "BND"))

# offsets endpoints by a certain amount of days (I.E. 1-21)
dailyOffset <- function(ep, offset = 0) {
  
  ep <- ep + offset
  ep[ep < 1] <- 1
  ep[ep > nrow(returns)] <- nrow(returns)
  ep <- unique(ep)
  epDiff <- diff(ep)
  if(last(epDiff)==1) { 
    # if the last period only has one observation, remove it
    ep <- ep[-length(ep)]
  }
  return(ep)
}

# computes total weighted momentum and penalizes new assets (if desired)
compute_total_momentum <- function(yearly_subset, 
                                   momentum_lookbacks, momentum_weights,
                                   old_weights, new_asset_mom_penalty) {
  
  empty_vec <- data.frame(t(rep(0, ncol(yearly_subset)))) 
  colnames(empty_vec) <- colnames(yearly_subset)
  
  total_momentum <- empty_vec
  for(j in 1:length(momentum_lookbacks)) {
    momentum_subset <- tail(yearly_subset, momentum_lookbacks[j])
    total_momentum <- total_momentum + Return.cumulative(momentum_subset) * 
      momentum_weights[j]  
  }
  
  # if asset returns are negative, penalize by *increasing* negative momentum
  # this algorithm assumes we go long only
  total_momentum[old_weights == 0] <- total_momentum[old_weights==0] * 
    (1-new_asset_mom_penalty * sign(total_momentum[old_weights==0]))
  
  return(total_momentum)
}

# compute weighted correlation matrix
compute_total_correlation <- function(data, cor_lookbacks, cor_weights) {
  
  # compute total correlation matrix
  total_cor <- matrix(nrow=ncol(data), ncol=ncol(data), 0)
  rownames(total_cor) <- colnames(total_cor) <- colnames(data)
  for(j in 1:length(cor_lookbacks)) {
    total_cor = total_cor + cor(tail(data, cor_lookbacks[j])) * cor_weights[j]
  }
  
  return(total_cor)
}

# computes total weighted volatility
compute_total_volatility <- function(data, vol_lookbacks, vol_weights) {
  empty_vec <- data.frame(t(rep(0, ncol(data))))
  colnames(empty_vec) <- colnames(data)
  
  # normalize weights if not already normalized
  if(sum(vol_weights) != 1) {
    vol_weights <- vol_weights/sum(vol_weights)
  }
  
  # compute total volrelation matrix
  total_vol <- empty_vec
  for(j in 1:length(vol_lookbacks)) {
    total_vol = total_vol + StdDev.annualized(tail(data, vol_lookbacks[j])) * vol_weights[j]
  }
  
  return(total_vol)
}

check_valid_parameters <- function(mom_lookbacks, cor_lookbacks, mom_weights, cor_weights, vol_weights, vol_lookbacks) {
  if(length(mom_weights) != length(mom_lookbacks)) {
    stop("Momentum weight length must be equal to momentum lookback length.") }
  
  if(length(cor_weights) != length(cor_lookbacks)) {
    stop("Correlation weight length must be equal to correlation lookback length.")
  }
  
  if(length(vol_weights) != length(vol_lookbacks)) {
    stop("Volatility weight length must be equal to volatility lookback length.")
  }
}


# computes weights as a function proportional to the inverse of total variance
invVar <- function(returns, lookbacks, lookback_weights) {
  var <- compute_total_volatility(returns, lookbacks, lookback_weights)^2
  invVar <- 1/var
  return(invVar/sum(invVar))
}

# computes weights as a function proportional to the inverse of total volatility
invVol <- function(returns, lookbacks, lookback_weights) {
  vol <- compute_total_volatility(returns, lookbacks, lookback_weights)
  invVol <- 1/vol
  return(invVol/sum(invVol))
}

# computes equal weight portfolio
ew <- function(returns) {
  return(StdDev(returns)/(StdDev(returns)*ncol(returns)))
}

# computes minimum volatility portfolio
minVol <- function(returns, cor_lookbacks, cor_weights, vol_lookbacks, vol_weights) {
  vols <- compute_total_volatility(returns, vol_lookbacks, vol_weights)
  cors <- compute_total_correlation(returns, cor_lookbacks, cor_weights)
  covs <- t(vols) %*% as.numeric(vols) * cors
  min_vol_rets <- t(matrix(rep(1, ncol(covs))))
  
  n.col = ncol(covs)
  zero.mat <- array(0, dim = c(n.col, 1))
  one.zero.diagonal.a <- cbind(1, diag(n.col), 1 * diag(n.col), -1 * diag(n.col))
  min.wgt <- rep(0, n.col)
  max.wgt <- rep(1, n.col)
  bvec.1.vector.a <- c(1, rep(0, n.col), min.wgt, -max.wgt)
  meq.1 <- 1
  
  mv.port.noshort.a <- solve.QP(Dmat = covs, dvec = zero.mat, Amat = one.zero.diagonal.a,
                                bvec = bvec.1.vector.a, meq = meq.1)
  
  
  min_vol_wt <- mv.port.noshort.a$solution
  names(min_vol_wt) <- rownames(covs)
  return(min_vol_wt)
}
asset_allocator <- function(returns, 
                           canary_returns = NULL, # canary assets for KDA algorithm and similar
                           
                           mom_threshold = 0, # threshold momentum must exceed
                           mom_lookbacks = 126, # momentum lookbacks for custom weights (EG 1-3-6-12)
                           
                           # weights on various momentum lookbacks (EG 12/19, 4/19, 2/19, 1/19)
                           mom_weights = rep(1/length(mom_lookbacks), 
                                             length(mom_lookbacks)), 
                           
                           # repeat for correlation weights
                           cor_lookbacks = mom_lookbacks, # correlation lookback
                           cor_weights = rep(1/length(mom_lookbacks), 
                                             length(mom_lookbacks)),
                           
                           vol_lookbacks = 20, # volatility lookback
                           vol_weights = rep(1/length(vol_lookbacks), 
                                             length(vol_lookbacks)),
                           
                           # number of assets to hold (if all above threshold)
                           top_n = floor(ncol(returns)/2), 
                           
                           # diversification weight scheme (ew, invVol, invVar, minVol, etc.)
                           weight_scheme = "minVol",
                           
                           # how often holdings rebalance
                           rebalance_on = "months",
                           
                           # how many days to offset rebalance period from end of month/quarter/year
                           offset = 0, 
                           
                           # penalize new asset mom to reduce turnover
                           new_asset_mom_penalty = 0, 
                           
                           # run Return.Portfolio, or just return weights?
                           # for use in robust momentum type portfolios
                           compute_portfolio_returns = TRUE,
                           verbose = FALSE,
                           
                           # crash protection asset
                           crash_asset = NULL,
                           ...
                           ) {
  
  # normalize weights
  mom_weights <- mom_weights/sum(mom_weights)
  cor_weights <- cor_weights/sum(cor_weights)
  vol_weights <- vol_weights/sum(vol_weights)
  
  # if we have canary returns (I.E. KDA strat), align both time periods
  if(!is.null(canary_returns)) {
   smush <- na.omit(cbind(returns, canary_returns))
   returns <- smush[,1:ncol(returns)]
   canary_returns <- smush[,-c(1:ncol(returns))]
   empty_canary_vec <- data.frame(t(rep(0, ncol(canary_returns))))
   colnames(empty_canary_vec) <- colnames(canary_returns)
  }
  
  # get endpoints and offset them
  ep <- endpoints(returns, on = rebalance_on)
  ep <- dailyOffset(ep, offset = offset)
  
  # initialize vector holding zeroes for assets
  empty_vec <- data.frame(t(rep(0, ncol(returns))))
  colnames(empty_vec) <- colnames(returns)
  weights <- empty_vec
  
  # initialize list to hold all our weights
  all_weights <- list()
  
  # get number of periods per year
  switch(rebalance_on,
         "months" = { yearly_periods = 12},
         "quarters" = { yearly_periods = 4},
         "years" = { yearly_periods = 1})
  
  for(i in 1:(length(ep) - yearly_periods)) {
    
    # remember old weights for the purposes of penalizing momentum of new assets
    old_weights <- weights
    
    # subset one year of returns, leave off first day 
    return_subset <- returns[c((ep[i]+1):ep[(i+yearly_periods)]),]

    # compute total weighted momentum, penalize potential new assets if desired
    momentums <- compute_total_momentum(return_subset,  
                                        momentum_lookbacks = mom_lookbacks,
                                        momentum_weights = mom_weights,
                                        old_weights = old_weights, 
                                        new_asset_mom_penalty = new_asset_mom_penalty)
    
    # rank negative momentum so that best asset is ranked 1 and so on
    momentum_ranks <- rank(-momentums)
    selected_assets <- momentum_ranks <= top_n & momentums > mom_threshold
    selected_subset <- return_subset[, selected_assets]
    
    # case of 0 valid assets
    if(sum(selected_assets)==0) {
      weights <- empty_vec
    } else if (sum(selected_assets)==1) {
      
      # case of only 1 valid asset -- invest everything into it
      weights <- empty_vec + selected_assets
      
    } else {
      # apply a user-selected weighting algorithm
      # modify this portion to select more weighting schemes
      if (weight_scheme == "ew") {
        weights <- ew(selected_subset)
      } else if (weight_scheme == "invVol") {
        weights <- invVol(selected_subset, vol_lookbacks, vol_weights)
      } else if (weight_scheme == "invVar"){
        weights <- invVar(selected_subset, vol_lookbacks, vol_weights)
      } else if (weight_scheme == "minVol") {
        weights <- minVol(selected_subset, cor_lookbacks, cor_weights,
                          vol_lookbacks, vol_weights)
      } 
    }
    
    # include all assets
    wt_names <- names(weights) 
    if(is.null(wt_names)){wt_names <- colnames(weights)}
    zero_weights <- empty_vec
    zero_weights[wt_names] <- weights
    weights <- zero_weights
    weights <- xts(weights, order.by=last(index(return_subset)))
    
    # if there's a canary universe, modify weights by fraction with positive momentum
    # if there's a safety asset, allocate the crash protection modifier to it.
    if(!is.null(canary_returns)) {
      canary_subset <- canary_returns[c(ep[i]:ep[(i+yearly_periods)]),]
      canary_subset <- canary_subset[-1,]
      canary_mom <- compute_total_momentum(canary_subset, 
                                           mom_lookbacks, mom_weights,
                                           empty_canary_vec, 0)
      canary_mod <- mean(canary_mom > 0)
      weights <- weights * canary_mod
      if(!is.null(crash_asset)) {
        if(momentums[crash_asset] > mom_threshold) {
          weights[,crash_asset] <- weights[,crash_asset] + (1-canary_mod)
        }
      }
    }
    
    all_weights[[i]] <- weights
  }
  
  # combine weights
  all_weights <- do.call(rbind, all_weights)
  if(compute_portfolio_returns) {
    strategy_returns <- Return.portfolio(R = returns, weights = all_weights, verbose = verbose)
    return(list(all_weights, strategy_returns))
  }
  return(all_weights)
  
}

#out <- asset_allocator(returns, offset = 0)
kda <- asset_allocator(returns = returns, canary_returns = canary, 
                       mom_lookbacks = c(21, 63, 126, 252),
                       mom_weights = c(12, 4, 2, 1),
                       cor_lookbacks = c(21, 63, 126, 252),
                       cor_weights = c(12, 4, 2, 1), vol_lookbacks = 21,
                       weight_scheme = "minVol",
                       crash_asset = "IEF")

The one thing that I’d like to focus on, however, are the lookback parameters. Essentially, assuming daily data, they’re set using a *daily lookback*, as that’s what AllocateSmartly did when testing my own KDA Asset Allocation algorithm. Namely, the salient line is this:

“For all assets across all three universes, at the close on the last trading day of the month, calculate a “momentum score” as follows:(12 * (p0 / p21 – 1)) + (4 * (p0 / p63 – 1)) + (2 * (p0 / p126 – 1)) + (p0 / p252 – 1)Where p0 = the asset’s price at today’s close, p1 = the asset’s price at the close of the previous trading day and so on. 21, 63, 126 and 252 days correspond to 1, 3, 6 and 12 months.”

So, to make sure I had apples to apples when trying to generalize KDA asset allocation, I compared the output of my new algorithm, asset_allocator (or should I call it allocate_smartly ?=] ) to my formal KDA asset allocation algorithm.

Here are the results:

                            KDA_algo KDA_approximated_months
Annualized Return         0.10190000              0.08640000
Annualized Std Dev        0.09030000              0.09040000
Annualized Sharpe (Rf=0%) 1.12790000              0.95520000
Worst Drawdown            0.07920336              0.09774612
Calmar Ratio              1.28656163              0.88392257
Ulcer Performance Index   3.78648873              2.62691398

Essentially, the long and short of it is that I modified my original KDA algorithm until I got identical output to my asset_allocator algorithm, then went back to the original KDA algorithm. The salient difference is this part:

# computes total weighted momentum and penalizes new assets (if desired)
compute_total_momentum <- function(yearly_subset, 
                                   momentum_lookbacks, momentum_weights,
                                   old_weights, new_asset_mom_penalty) {
  
  empty_vec <- data.frame(t(rep(0, ncol(yearly_subset)))) 
  colnames(empty_vec) <- colnames(yearly_subset)
  
  total_momentum <- empty_vec
  for(j in 1:length(momentum_lookbacks)) {
    momentum_subset <- tail(yearly_subset, momentum_lookbacks[j])
    total_momentum <- total_momentum + Return.cumulative(momentum_subset) * 
      momentum_weights[j]  
  }
  
  # if asset returns are negative, penalize by *increasing* negative momentum
  # this algorithm assumes we go long only
  total_momentum[old_weights == 0] <- total_momentum[old_weights==0] * 
    (1-new_asset_mom_penalty * sign(total_momentum[old_weights==0]))
  
  return(total_momentum)
}

Namely, the part that further subsets the yearly subset by the lookback period, in terms of days, rather than monthly endpoints. Essentially, the difference in the exact measurement of momentum–that is, the measurement that explicitly selects *which* instruments the algorithm will allocate to in a particular period, unsurprisingly, has a large impact on the performance of the algorithm.

And lest anyone think that this phenomena no longer applies, here’s a yearly performance comparison.

                KDA_algo KDA_approximated_months
2008-12-31  0.1578348930             0.062776766
2009-12-31  0.1816957178             0.166017499
2010-12-31  0.1779839604             0.160781537
2011-12-30  0.1722014474             0.149143148
2012-12-31  0.1303019332             0.103579674
2013-12-31  0.1269207487             0.134197066
2014-12-31  0.0402888320             0.071784979
2015-12-31 -0.0119459453            -0.028090873
2016-12-30  0.0125302658             0.002996917
2017-12-29  0.1507895287             0.133514924
2018-12-31  0.0747520266             0.062544709
2019-11-27  0.0002062636             0.008798310

Of note: the variant that formally measures momentum from monthly endpoints consistently outperforms the one using synthetic monthly measurements.

So, that will do it for this post. I hope to have a more thorough walk-through of the asset_allocator function in the very near future before moving onto Python-related matters (hopefully), but I thought that this artifact, and just how much it affects outcomes, was too important not to share.

An iteration of the algorithm capable of measuring momentum with proper monthly endpoints should be available in the near future.

Thanks for reading.

Replicating Volatility ETN Returns From CBOE Futures

Posted on January 12, 2018 by Ilya Kipnis • Posted in ETFs, R, Replication, Volatility • Tagged R • 14 Comments

This post will demonstrate how to replicate the volatility ETNs (XIV, VXX, ZIV, VXZ) from CBOE futures, thereby allowing any individual to create synthetic ETF returns from before their inception, free of cost.

So, before I get to the actual algorithm, it depends on an update to the term structure algorithm I shared some months back.

In that algorithm, mistakenly (or for the purpose of simplicity), I used calendar days as the time to expiry, when it should have been business days, which also accounts for weekends, and holidays, which are an irritating artifact to keep track of.

So here’s the salient change, in the loop that calculates times to expiry:

source("tradingHolidays.R")

masterlist <- list()
timesToExpiry <- list()
for(i in 1:length(contracts)) {
  
  # obtain data
  contract <- contracts[i]
  dataFile <- paste0(stem, contract, "_VX.csv")
  expiryYear <- paste0("20",substr(contract, 2, 3))
  expiryMonth <- monthMaps$monthNum[monthMaps$futureStem == substr(contract,1,1)]
  expiryDate <- dates$dates[dates$dateMon == paste(expiryYear, expiryMonth, sep="-")]
  data <- tryCatch(
    {
      suppressWarnings(fread(dataFile))
    }, error = function(e){return(NULL)}
  )
  
  if(!is.null(data)) {
    # create dates
    dataDates <- as.Date(data$`Trade Date`, format = '%m/%d/%Y')
    
    # create time to expiration xts
    toExpiry <- xts(bizdays(dataDates, expiryDate), order.by=dataDates)
    colnames(toExpiry) <- contract
    timesToExpiry[[i]] <- toExpiry
    
    # get settlements
    settlement <- xts(data$Settle, order.by=dataDates)
    colnames(settlement) <- contract
    masterlist[[i]] <- settlement
  }
}

The one salient line in particular, is this:

toExpiry <- xts(bizdays(dataDates, expiryDate), order.by=dataDates)

What is this bizdays function? It comes from the bizdays package in R.

There’s also the tradingHolidays.R script, which makes further use of the bizdays package. Here’s what goes on under the hood in tradingHolidays.R, for those that wish to replicate the code:

easters <- read.csv("easters.csv", header = FALSE)
easterDates <- as.Date(paste0(substr(easters$V2, 1, 6), easters$V3), format = '%m/%d/%Y')-2

nonEasters <- read.csv("nonEasterHolidays.csv", header = FALSE)
nonEasterDates <- as.Date(paste0(substr(nonEasters$V2, 1, 6), nonEasters$V3), format = '%m/%d/%Y')

weekdayNonEasters <- nonEasterDates[which(!weekdays(nonEasterDates) %in% c("Saturday", "Sunday"))]

hurricaneSandy <- as.Date(c("2012-10-29", "2012-10-30"))

holidays <- sort(c(easterDates, weekdayNonEasters, hurricaneSandy))
holidays <- holidays[holidays > as.Date("2003-12-31") & holidays < as.Date("2019-01-01")]


require(bizdays)

create.calendar("HolidaysUS", holidays, weekdays = c("saturday", "sunday"))
bizdays.options$set(default.calendar = "HolidaysUS")

There are two CSVs that I manually compiled, but will share screenshots of–they are the easter holidays (because they have to be adjusted for turning Sunday to Friday because of Easter Fridays), and the rest of the national holidays.

Here is what the easters csv looks like:

eastersScreenshot

And the nonEasterHolidays, which contains New Year’s Day, MLK Jr. Day, President’s Day, Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day (along with their observed dates) nonEasterScreenshot CSV:

Furthermore, we need to adjust for the two days that equities were not trading due to Hurricane Sandy.

So then, the list of holidays looks like this:

> holidays
  [1] "2004-01-01" "2004-01-19" "2004-02-16" "2004-04-09" "2004-05-31" "2004-07-05" "2004-09-06" "2004-11-25"
  [9] "2004-12-24" "2004-12-31" "2005-01-17" "2005-02-21" "2005-03-25" "2005-05-30" "2005-07-04" "2005-09-05"
 [17] "2005-11-24" "2005-12-26" "2006-01-02" "2006-01-16" "2006-02-20" "2006-04-14" "2006-05-29" "2006-07-04"
 [25] "2006-09-04" "2006-11-23" "2006-12-25" "2007-01-01" "2007-01-02" "2007-01-15" "2007-02-19" "2007-04-06"
 [33] "2007-05-28" "2007-07-04" "2007-09-03" "2007-11-22" "2007-12-25" "2008-01-01" "2008-01-21" "2008-02-18"
 [41] "2008-03-21" "2008-05-26" "2008-07-04" "2008-09-01" "2008-11-27" "2008-12-25" "2009-01-01" "2009-01-19"
 [49] "2009-02-16" "2009-04-10" "2009-05-25" "2009-07-03" "2009-09-07" "2009-11-26" "2009-12-25" "2010-01-01"
 [57] "2010-01-18" "2010-02-15" "2010-04-02" "2010-05-31" "2010-07-05" "2010-09-06" "2010-11-25" "2010-12-24"
 [65] "2011-01-17" "2011-02-21" "2011-04-22" "2011-05-30" "2011-07-04" "2011-09-05" "2011-11-24" "2011-12-26"
 [73] "2012-01-02" "2012-01-16" "2012-02-20" "2012-04-06" "2012-05-28" "2012-07-04" "2012-09-03" "2012-10-29"
 [81] "2012-10-30" "2012-11-22" "2012-12-25" "2013-01-01" "2013-01-21" "2013-02-18" "2013-03-29" "2013-05-27"
 [89] "2013-07-04" "2013-09-02" "2013-11-28" "2013-12-25" "2014-01-01" "2014-01-20" "2014-02-17" "2014-04-18"
 [97] "2014-05-26" "2014-07-04" "2014-09-01" "2014-11-27" "2014-12-25" "2015-01-01" "2015-01-19" "2015-02-16"
[105] "2015-04-03" "2015-05-25" "2015-07-03" "2015-09-07" "2015-11-26" "2015-12-25" "2016-01-01" "2016-01-18"
[113] "2016-02-15" "2016-03-25" "2016-05-30" "2016-07-04" "2016-09-05" "2016-11-24" "2016-12-26" "2017-01-02"
[121] "2017-01-16" "2017-02-20" "2017-04-14" "2017-05-29" "2017-07-04" "2017-09-04" "2017-11-23" "2017-12-25"
[129] "2018-01-01" "2018-01-15" "2018-02-19" "2018-03-30" "2018-05-28" "2018-07-04" "2018-09-03" "2018-11-22"
[137] "2018-12-25"

So once we have a list of holidays, we use the bizdays package to set the holidays and weekends (Saturday and Sunday) as our non-business days, and use that function to calculate the correct times to expiry.

So, now that we have the updated expiry structure, we can write a function that will correctly replicate the four main volatility ETNs–XIV, VXX, ZIV, and VXZ.

Here’s the English explanation:

VXX is made up of two contracts–the front month, and the back month, and has a certain number of trading days (AKA business days) that it trades until expiry, say, 17. During that timeframe, the front month (let’s call it M1) goes from being the entire allocation of funds, to being none of the allocation of funds, as the front month contract approaches expiry. That is, as a contract approaches expiry, the second contract gradually receives more and more weight, until, at expiry of the front month contract, the second month contract contains all of the funds–just as it *becomes* the front month contract. So, say you have 17 days to expiry on the front month. At the expiry of the previous contract, the second month will have a weight of 17/17–100%, as it becomes the front month. Then, the next day, that contract, now the front month, will have a weight of 16/17 at settle, then 15/17, and so on. That numerator is called dr, and the denominator is called dt.

However, beyond this, there’s a second mechanism that’s responsible for the VXX looking like it does as compared to a basic futures contract (that is, the decay responsible for short volatility’s profits), and that is the “instantaneous” rebalancing. That is, the returns for a given day are today’s settles multiplied by yesterday’s weights, over yesterday’s settles multiplied by yesterday’s weights, minus one. That is, (S_1_t * dr/dt_t-1 + S_2_t * 1-dr/dt_t-1) / (S_1_t-1 * dr/dt_t-1 + S_2_t-1 * 1-dr/dt_t-1) – 1 (I could use a tutorial on LaTeX). So, when you move forward a day, well, tomorrow, today’s weights become t-1. Yet, when were the assets able to be rebalanced? Well, in the ETNs such as VXX and VXZ, the “hand-waving” is that it happens instantaneously. That is, the weight for the front month was 93%, the return was realized at settlement (that is, from settle to settle), and immediately after that return was realized, the front month’s weight shifts from 93%, to, say, 88%. So, say Credit Suisse (that issues these ETNs ), has $10,000 (just to keep the arithmetic and number of zeroes tolerable, obviously there are a lot more in reality) worth of XIV outstanding after immediately realizing returns, it will sell $500 of its $9300 in the front month, and immediately move them to the second month, so it will immediately go from $9300 in M1 and $700 in M2 to $8800 in M1 and $1200 in M2. When did those $500 move? Immediately, instantaneously, and if you like, you can apply Clarke’s Third Law and call it “magically”.

The only exception is the day after roll day, in which the second month simply becomes the front month as the previous front month expires, so what was a 100% weight on the second month will now be a 100% weight on the front month, so there’s some extra code that needs to be written to make that distinction.

That’s the way it works for VXX and XIV. What’s the difference for VXZ and ZIV? It’s really simple–instead of M1 and M2, VXZ uses the exact same weightings (that is, the time remaining on front month vs. how many days exist for that contract to be the front month), uses M4, M5, M6, and M7, with M4 taking dr/dt, M5 and M6 always being 1, and M7 being 1-dr/dt.

In any case, here’s the code.

syntheticXIV <- function(termStructure, expiryStructure) {
  
  # find expiry days
  zeroDays <- which(expiryStructure$C1 == 0)
  
  # dt = days in contract period, set after expiry day of previous contract
  dt <- zeroDays + 1
  dtXts <- expiryStructure$C1[dt,]
  
  # create dr (days remaining) and dt structure
  drDt <- cbind(expiryStructure[,1], dtXts)
  colnames(drDt) <- c("dr", "dt")
  drDt$dt <- na.locf(drDt$dt)
  
  # add one more to dt to account for zero day
  drDt$dt <- drDt$dt + 1
  drDt <- na.omit(drDt)
  
  # assign weights for front month and back month based on dr and dt
  wtC1 <- drDt$dr/drDt$dt
  wtC2 <- 1-wtC1
  
  # realize returns with old weights, "instantaneously" shift to new weights after realizing returns at settle
  # assumptions are a bit optimistic, I think
  valToday <- termStructure[,1] * lag(wtC1) + termStructure[,2] * lag(wtC2)
  valYesterday <- lag(termStructure[,1]) * lag(wtC1) + lag(termStructure[,2]) * lag(wtC2)
  syntheticRets <- (valToday/valYesterday) - 1
  
  # on the day after roll, C2 becomes C1, so reflect that in returns
  zeroes <- which(drDt$dr == 0) + 1 
  zeroRets <- termStructure[,1]/lag(termStructure[,2]) - 1
  
  # override usual returns with returns that reflect back month becoming front month after roll day
  syntheticRets[index(syntheticRets)[zeroes]] <- zeroRets[index(syntheticRets)[zeroes]]
  syntheticRets <- na.omit(syntheticRets)
  
  # vxxRets are syntheticRets
  vxxRets <- syntheticRets
  
  # repeat same process for vxz -- except it's dr/dt * 4th contract + 5th + 6th + 1-dr/dt * 7th contract
  vxzToday <- termStructure[,4] * lag(wtC1) + termStructure[,5] + termStructure[,6] + termStructure[,7] * lag(wtC2)
  vxzYesterday <- lag(termStructure[,4]) * lag(wtC1) + lag(termStructure[, 5]) + lag(termStructure[,6]) + lag(termStructure[,7]) * lag(wtC2)
  syntheticVxz <- (vxzToday/vxzYesterday) - 1
  
  # on zero expiries, next day will be equal (4+5+6)/lag(5+6+7) - 1
  zeroVxz <- (termStructure[,4] + termStructure[,5] + termStructure[,6])/
    lag(termStructure[,5] + termStructure[,6] + termStructure[,7]) - 1
  syntheticVxz[index(syntheticVxz)[zeroes]] <- zeroVxz[index(syntheticVxz)[zeroes]]
  syntheticVxz <- na.omit(syntheticVxz)
  
  vxzRets <- syntheticVxz
  
  # write out weights for actual execution
  if(last(drDt$dr!=0)) {
    print(paste("Previous front-month weight was", round(last(drDt$dr)/last(drDt$dt), 5)))
    print(paste("Front-month weight at settle today will be", round((last(drDt$dr)-1)/last(drDt$dt), 5)))
    if((last(drDt$dr)-1)/last(drDt$dt)==0){
      print("Front month will be zero at end of day. Second month becomes front month.")
    }
  } else {
    print("Previous front-month weight was zero. Second month became front month.")
    print(paste("New front month weights at settle will be", round(last(expiryStructure[,2]-1)/last(expiryStructure[,2]), 5)))
  }
  
  return(list(vxxRets, vxzRets))
}

So, a big thank you goes out to Michael Kapler of Systematic Investor Toolbox for originally doing the replication and providing his code. My code essentially does the same thing, in, hopefully a more commented way.

So, ultimately, does it work? Well, using my updated term structure code, I can test that.

While I’m not going to paste my entire term structure code (again, available here, just update the script with my updates from this post), here’s how you’d run the new function:

> out <- syntheticXIV(termStructure, expiryStructure)
[1] "Previous front-month weight was 0.17647"
[1] "Front-month weight at settle today will be 0.11765"

And since it returns both the vxx returns and the vxz returns, we can compare them both.

compareXIV <- na.omit(cbind(xivRets, out[[1]] * -1))
colnames(compareXIV) <- c("XIV returns", "Replication returns")
charts.PerformanceSummary(compareXIV)

With the result:

xivComparison

Basically, a perfect match.

Let’s do the same thing, with ZIV.

compareZIV <- na.omit(cbind(ZIVrets, out[[2]]*-1))
colnames(compareZIV) <- c("ZIV returns", "Replication returns")
charts.PerformanceSummary(compareZIV)

So, rebuilding from the futures does a tiny bit better than the ETN. But the trajectory is largely identical.

That concludes this post. I hope it has shed some light on how these volatility ETNs work, and how to obtain them directly from the futures data published by the CBOE, which are the inputs to my term structure algorithm.

This also means that for institutions interested in trading my strategy, that they can obtain leverage to trade the futures-composite replicated variants of these ETNs, at greater volume.

Thanks for reading.

NOTES: For those interested in a retail subscription strategy to trading volatility, do not hesitate to subscribe to my volatility-trading strategy. For those interested in employing me full-time or for long-term consulting projects, I can be reached on my LinkedIn, or my email: ilya.kipnis@gmail.com.

Comparing Some Strategies from Easy Volatility Investing, and the Table.Drawdowns Command

Posted on November 14, 2017 by Ilya Kipnis • Posted in ETFs, R, Replication, Volatility • Tagged R • 6 Comments

This post will be about comparing strategies from the paper “Easy Volatility Investing”, along with a demonstration of R’s table.Drawdowns command.

First off, before going further, while I think the execution assumptions found in EVI don’t lend the strategies well to actual live trading (although their risk/reward tradeoffs also leave a lot of room for improvement), I think these strategies are great as benchmarks.

So, some time ago, I did an out-of-sample test for one of the strategies found in EVI, which can be found here.

Using the same source of data, I also obtained data for SPY (though, again, AlphaVantage can also provide this service for free for those that don’t use Quandl).

Here’s the new code.

require(downloader)
require(quantmod)
require(PerformanceAnalytics)
require(TTR)
require(Quandl)
require(data.table)

download("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vix3mdailyprices.csv", destfile="vxvData.csv")

VIX <- fread("http://www.cboe.com/publish/scheduledtask/mktdata/datahouse/vixcurrent.csv", skip = 1)
VIXdates <- VIX$Date
VIX$Date <- NULL; VIX <- xts(VIX, order.by=as.Date(VIXdates, format = '%m/%d/%Y'))

vxv <- xts(read.zoo("vxvData.csv", header=TRUE, sep=",", format="%m/%d/%Y", skip=2))

ma_vRatio <- SMA(Cl(VIX)/Cl(vxv), 10)
xivSigVratio <- ma_vRatio < 1 
vxxSigVratio <- ma_vRatio > 1 

# V-ratio (VXV/VXMT)
vRatio <- lag(xivSigVratio) * xivRets + lag(vxxSigVratio) * vxxRets
# vRatio <- lag(xivSigVratio, 2) * xivRets + lag(vxxSigVratio, 2) * vxxRets


# Volatility Risk Premium Strategy
spy <- Quandl("EOD/SPY", start_date='1990-01-01', type = 'xts')
spyRets <- Return.calculate(spy$Adj_Close)
histVol <- runSD(spyRets, n = 10, sample = FALSE) * sqrt(252) * 100
vixDiff <- Cl(VIX) - histVol
maVixDiff <- SMA(vixDiff, 5)

vrpXivSig <- maVixDiff > 0 
vrpVxxSig <- maVixDiff < 0
vrpRets <- lag(vrpXivSig, 1) * xivRets + lag(vrpVxxSig, 1) * vxxRets


obsCloseMomentum <- magicThinking # from previous post

compare <- na.omit(cbind(xivRets, obsCloseMomentum, vRatio, vrpRets))
colnames(compare) <- c("BH_XIV", "DDN_Momentum", "DDN_VRatio", "DDN_VRP")

So, an explanation: there are four return streams here–buy and hold XIV, the DDN momentum from a previous post, and two other strategies.

The simpler one, called the VRatio is simply the ratio of the VIX over the VXV. Near the close, check this quantity. If this is less than one, buy XIV, otherwise, buy VXX.

The other one, called the Volatility Risk Premium strategy (or VRP for short), compares the 10 day historical volatility (that is, the annualized running ten day standard deviation) of the S&P 500, subtracts it from the VIX, and takes a 5 day moving average of that. Near the close, when that’s above zero (that is, VIX is higher than historical volatility), go long XIV, otherwise, go long VXX.

Again, all of these strategies are effectively “observe near/at the close, buy at the close”, so are useful for demonstration purposes, though not for implementation purposes on any large account without incurring market impact.

Here are the results, since 2011 (that is, around the time of XIV’s actual inception):
easyVolInvesting

To note, both the momentum and the VRP strategy underperform buying and holding XIV since 2011. The VRatio strategy, on the other hand, does outperform.

Here’s a summary statistics function that compiles some top-level performance metrics.

stratStats <- function(rets) {
  stats <- rbind(table.AnnualizedReturns(rets), maxDrawdown(rets))
  stats[5,] <- stats[1,]/stats[4,]
  stats[6,] <- stats[1,]/UlcerIndex(rets)
  rownames(stats)[4] <- "Worst Drawdown"
  rownames(stats)[5] <- "Calmar Ratio"
  rownames(stats)[6] <- "Ulcer Performance Index"
  return(stats)
}

And the result:

> stratStats(compare['2011::'])
                             BH_XIV DDN_Momentum DDN_VRatio   DDN_VRP
Annualized Return         0.3801000    0.2837000  0.4539000 0.2572000
Annualized Std Dev        0.6323000    0.5706000  0.6328000 0.6326000
Annualized Sharpe (Rf=0%) 0.6012000    0.4973000  0.7172000 0.4066000
Worst Drawdown            0.7438706    0.6927479  0.7665093 0.7174481
Calmar Ratio              0.5109759    0.4095285  0.5921650 0.3584929
Ulcer Performance Index   1.1352168    1.2076995  1.5291637 0.7555808

To note, all of the benchmark strategies suffered very large drawdowns since XIV’s inception, which we can examine using the table.Drawdowns command, as seen below:

> table.Drawdowns(compare[,1]['2011::'], top = 5)
        From     Trough         To   Depth Length To Trough Recovery
1 2011-07-08 2011-11-25 2012-11-26 -0.7439    349        99      250
2 2015-06-24 2016-02-11 2016-12-21 -0.6783    379       161      218
3 2014-07-07 2015-01-30 2015-06-11 -0.4718    236       145       91
4 2011-02-15 2011-03-16 2011-04-20 -0.3013     46        21       25
5 2013-04-15 2013-06-24 2013-07-22 -0.2877     69        50       19
> table.Drawdowns(compare[,2]['2011::'], top = 5)
        From     Trough         To   Depth Length To Trough Recovery
1 2014-07-07 2016-06-27 2017-03-13 -0.6927    677       499      178
2 2012-03-27 2012-06-13 2012-09-13 -0.4321    119        55       64
3 2011-10-04 2011-10-28 2012-03-21 -0.3621    117        19       98
4 2011-02-15 2011-03-16 2011-04-21 -0.3013     47        21       26
5 2011-06-01 2011-08-04 2011-08-18 -0.2723     56        46       10
> table.Drawdowns(compare[,3]['2011::'], top = 5)
        From     Trough         To   Depth Length To Trough Recovery
1 2014-01-23 2016-02-11 2017-02-14 -0.7665    772       518      254
2 2011-09-13 2011-11-25 2012-03-21 -0.5566    132        53       79
3 2012-03-27 2012-06-01 2012-07-19 -0.3900     80        47       33
4 2011-02-15 2011-03-16 2011-04-20 -0.3013     46        21       25
5 2013-04-15 2013-06-24 2013-07-22 -0.2877     69        50       19
> table.Drawdowns(compare[,4]['2011::'], top = 5)
        From     Trough         To   Depth Length To Trough Recovery
1 2015-06-24 2016-02-11 2017-10-11 -0.7174    581       161      420
2 2011-07-08 2011-10-03 2012-02-03 -0.6259    146        61       85
3 2014-07-07 2014-12-16 2015-05-21 -0.4818    222       115      107
4 2013-02-20 2013-07-08 2014-06-10 -0.4108    329        96      233
5 2012-03-27 2012-06-01 2012-07-17 -0.3900     78        47       31

Note that the table.Drawdowns command only examines one return stream at a time. Furthermore, the top argument specifies how many drawdowns to look at, sorted by greatest drawdown first.

One reason I think that these strategies seem to suffer the drawdowns they do is that they’re either all-in on one asset, or its exact opposite, with no room for error.

One last thing, for the curious, here is the comparison with my strategy since 2011 (essentially XIV inception) benchmarked against the strategies in EVI (which I have been trading with live capital since September, and have recently opened a subscription service for):

volPerfBenchmarks

stratStats(compare['2011::'])
                             QST_vol    BH_XIV DDN_Momentum DDN_VRatio   DDN_VRP
Annualized Return          0.8133000 0.3801000    0.2837000  0.4539000 0.2572000
Annualized Std Dev         0.3530000 0.6323000    0.5706000  0.6328000 0.6326000
Annualized Sharpe (Rf=0%)  2.3040000 0.6012000    0.4973000  0.7172000 0.4066000
Worst Drawdown             0.2480087 0.7438706    0.6927479  0.7665093 0.7174481
Calmar Ratio               3.2793211 0.5109759    0.4095285  0.5921650 0.3584929
Ulcer Performance Index   10.4220721 1.1352168    1.2076995  1.5291637 0.7555808

Thanks for reading.

NOTE: I am currently looking for networking and full-time opportunities related to my skill set. My LinkedIn profile can be found here.

Leverage Up When You’re Down?

Posted on September 5, 2017 by Ilya Kipnis • Posted in ETFs, R, Replication, Volatility • Tagged R • 5 Comments

This post will investigate the idea of reducing leverage when drawdowns are small, and increasing leverage as losses accumulate. It’s based on the idea that whatever goes up must come down, and whatever comes down generally goes back up.

I originally came across this idea from this blog post.

So, first off, let’s write an easy function that allows replication of this idea. Essentially, we have several arguments:

One: the default leverage (that is, when your drawdown is zero, what’s your exposure)? For reference, in the original post, it’s 10%.

Next: the various leverage levels. In the original post, the leverage levels are 25%, 50%, and 100%.

And lastly, we need the corresponding thresholds at which to apply those leverage levels. In the original post, those levels are 20%, 40%, and 55%.

So, now we can create a function to implement that in R. The idea being that we have R compute the drawdowns, and then use that information to determine leverage levels as precisely and frequently as possible.

Here’s a quick piece of code to do so:

require(xts)
require(PerformanceAnalytics)

drawdownLev <- function(rets, defaultLev = .1, levs = c(.25, .5, 1), ddthresh = c(-.2, -.4, -.55)) {
  # compute drawdowns
  dds <- PerformanceAnalytics:::Drawdowns(rets)
  
  # initialize leverage to the default level
  dds$lev <- defaultLev
  
  # change the leverage for every threshold
  for(i in 1:length(ddthresh)) {
    
    # as drawdowns go through thresholds, adjust leverage
    dds$lev[dds$Close < ddthresh[i]] <- levs[i]
  }
  
  # compute the new strategy returns -- apply leverage at tomorrow's close
  out <- rets * lag(dds$lev, 2)
  
  # return the leverage and the new returns
  leverage <- dds$lev
  colnames(leverage) <- c("DDLev_leverage")
  return(list(leverage, out))
}

So, let’s replicate some results.

require(downloader)
require(xts)
require(PerformanceAnalytics)


download("https://dl.dropboxusercontent.com/s/jk6der1s5lxtcfy/XIVlong.TXT",
         destfile="longXIV.txt")


xiv <- xts(read.zoo("longXIV.txt", format="%Y-%m-%d", sep=",", header=TRUE))
xivRets <- Return.calculate(Cl(xiv))

xivDDlev <- drawdownLev(xivRets, defaultLev = .1, levs = c(.25, .5, 1), ddthresh = c(-.2, -.4, -.55))
compare <- na.omit(cbind(xivDDlev[[2]], xivRets))
colnames(compare) <- c("XIV_DD_leverage", "XIV")

charts.PerformanceSummary(compare['2011::2016'])

And our results look something like this:

xivddlev

                          XIV_DD_leverage       XIV
Annualized Return               0.2828000 0.2556000
Annualized Std Dev              0.3191000 0.6498000
Annualized Sharpe (Rf=0%)       0.8862000 0.3934000
Worst Drawdown                  0.4870604 0.7438706
Calmar Ratio                    0.5805443 0.3436668

That said, what would happen if one were to extend the data for all available XIV data?

xivddlev2

> rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
                          XIV_DD_leverage       XIV
Annualized Return               0.1615000 0.3319000
Annualized Std Dev              0.3691000 0.5796000
Annualized Sharpe (Rf=0%)       0.4375000 0.5727000
Worst Drawdown                  0.8293650 0.9215784
Calmar Ratio                    0.1947428 0.3601385

A different story.

In this case, I think the takeaway is that such a mechanism does well when the drawdowns for the benchmark in question occur sharply, so that the lower exposure protects from those sharp drawdowns, and then the benchmark spends much of the time in a recovery mode, so that the increased exposure has time to earn outsized returns, and then draws down again. When the benchmark continues to see drawdowns after maximum leverage is reached, or continues to perform well when not in drawdown, such a mechanism falls behind quickly.

As always, there is no free lunch when it comes to drawdowns, as trying to lower exposure in preparation for a correction will necessarily mean forfeiting a painful amount of upside in the good times, at least as presented in the original post.

Thanks for reading.

NOTE: I am currently looking for my next full-time opportunity, preferably in New York City or Philadelphia relating to the skills I have demonstrated on this blog. My LinkedIn profile can be found here. If you know of such opportunities, do not hesitate to reach out to me.

An Out of Sample Update on DDN’s Volatility Momentum Trading Strategy and Beta Convexity

Posted on June 20, 2017 by Ilya Kipnis • Posted in ETFs, R, Replication, Trading, Volatility • Tagged R • 24 Comments

The first part of this post is a quick update on Tony Cooper’s of Double Digit Numerics’s volatility ETN momentum strategy from the volatility made simple blog (which has stopped updating as of a year and a half ago). The second part will cover Dr. Jonathan Kinlay’s Beta Convexity concept.

So, now that I have the ability to generate a term structure and constant expiry contracts, I decided to revisit some of the strategies on Volatility Made Simple and see if any of them are any good (long story short: all of the publicly detailed ones aren’t so hot besides mine–they either have a massive drawdown in-sample around the time of the crisis, or a massive drawdown out-of-sample).

Why this strategy? Because it seemed different from most of the usual term structure ratio trades (of which mine is an example), so I thought I’d check out how it did since its first publishing date, and because it’s rather easy to understand.

Here’s the strategy:

Take XIV, VXX, ZIV, VXZ, and SHY (this last one as the “risk free” asset), and at the close, invest in whichever has had the highest 83 day momentum (this was the result of optimization done on volatilityMadeSimple).

Here’s the code to do this in R, using the Quandl EOD database. There are two variants tested–observe the close, buy the close (AKA magical thinking), and observe the close, buy tomorrow’s close.

require(quantmod)
require(PerformanceAnalytics)
require(TTR)
require(Quandl)

Quandl.api_key("yourKeyHere")

symbols <- c("XIV", "VXX", "ZIV", "VXZ", "SHY")

prices <- list()
for(i in 1:length(symbols)) {
  price <- Quandl(paste0("EOD/", symbols[i]), start_date="1990-12-31", type = "xts")$Adj_Close
  colnames(price) <- symbols[i]
  prices[[i]] <- price
}
prices <- na.omit(do.call(cbind, prices))
returns <- na.omit(Return.calculate(prices))

# find highest asset, assign column names
topAsset <- function(row, assetNames) {
  out <- row==max(row, na.rm = TRUE)
  names(out) <- assetNames
  out <- data.frame(out)
  return(out)
}

# compute momentum
momentums <- na.omit(xts(apply(prices, 2, ROC, n = 83), order.by=index(prices)))

# find highest asset each day, turn it into an xts
highestMom <- apply(momentums, 1, topAsset, assetNames = colnames(momentums))
highestMom <- xts(t(do.call(cbind, highestMom)), order.by=index(momentums))

# observe today's close, buy tomorrow's close
buyTomorrow <- na.omit(xts(rowSums(returns * lag(highestMom, 2)), order.by=index(highestMom)))

# observe today's close, buy today's close (aka magic thinking)
magicThinking <- na.omit(xts(rowSums(returns * lag(highestMom)), order.by=index(highestMom)))

out <- na.omit(cbind(buyTomorrow, magicThinking))
colnames(out) <- c("buyTomorrow", "magicalThinking")

# results
charts.PerformanceSummary(out['2014-04-11::'], legend.loc = 'top')
rbind(table.AnnualizedReturns(out['2014-04-11::']), maxDrawdown(out['2014-04-11::']))

Pretty simple.

Here are the results.

capture

> rbind(table.AnnualizedReturns(out['2014-04-11::']), maxDrawdown(out['2014-04-11::']))
                          buyTomorrow magicalThinking
Annualized Return          -0.0320000       0.0378000
Annualized Std Dev          0.5853000       0.5854000
Annualized Sharpe (Rf=0%)  -0.0547000       0.0646000
Worst Drawdown              0.8166912       0.7761655

Looks like this strategy didn’t pan out too well. Just a daily reminder that if you’re using fine grid-search to select a particularly good parameter (EG n = 83 days? Maybe 4 21-day trading months, but even that would have been n = 82), you’re asking for a visit from, in the words of Mr. Tony Cooper, a visit from the grim reaper.

****

Moving onto another topic, whenever Dr. Jonathan Kinlay posts something that I think I can replicate that I’d be very wise to do so, as he is a very skilled and experienced practitioner (and also includes me on his blogroll).

A topic that Dr. Kinlay covered is the idea of beta convexity–namely, that an asset’s beta to a benchmark may be different when the benchmark is up as compared to when it’s down. Essentially, it’s the idea that we want to weed out firms that are what I’d deem as “losers in disguise”–I.E. those that act fine when times are good (which is when we really don’t care about diversification, since everything is going up anyway), but do nothing during bad times.

The beta convexity is calculated quite simply: it’s the beta of an asset to a benchmark when the benchmark has a positive return, minus the beta of an asset to a benchmark when the benchmark has a negative return, then squaring the difference. That is, (beta_bench_positive – beta_bench_negative) ^ 2.

Here’s some R code to demonstrate this, using IBM vs. the S&P 500 since 1995.

ibm <- Quandl("EOD/IBM", start_date="1995-01-01", type = "xts")
ibmRets <- Return.calculate(ibm$Adj_Close)

spy <- Quandl("EOD/SPY", start_date="1995-01-01", type = "xts")
spyRets <- Return.calculate(spy$Adj_Close)

rets <- na.omit(cbind(ibmRets, spyRets))
colnames(rets) <- c("IBM", "SPY")

betaConvexity <- function(Ra, Rb) {
  positiveBench <- Rb[Rb > 0]
  assetPositiveBench <- Ra[index(positiveBench)]
  positiveBeta <- CAPM.beta(Ra = assetPositiveBench, Rb = positiveBench)
  
  negativeBench <- Rb[Rb < 0]
  assetNegativeBench <- Ra[index(negativeBench)]
  negativeBeta <- CAPM.beta(Ra = assetNegativeBench, Rb = negativeBench)
  
  out <- (positiveBeta - negativeBeta) ^ 2
  return(out)
}

betaConvexity(rets$IBM, rets$SPY)

For the result:

> betaConvexity(rets$IBM, rets$SPY)
[1] 0.004136034

Thanks for reading.

NOTE: I am always looking to network, and am currently actively looking for full-time opportunities which may benefit from my skill set. If you have a position which may benefit from my skills, do not hesitate to reach out to me. My LinkedIn profile can be found here.

Testing the Hierarchical Risk Parity algorithm

Posted on May 26, 2017 by Ilya Kipnis • Posted in Asset Allocation, ETFs, Portfolio Management, R, Replication, Trading • Tagged R • 9 Comments

This post will be a modified backtest of the Adaptive Asset Allocation backtest from AllocateSmartly, using the Hierarchical Risk Parity algorithm from last post, because Adam Butler was eager to see my results. On a whole, as Adam Butler had told me he had seen, HRP does not generate outperformance when applied to a small, carefully-constructed, diversified-by-selection universe of asset classes, as opposed to a universe of hundreds or even several thousand assets, where its theoretically superior properties result in it being a superior algorithm.

First off, I would like to thank one Matthew Barry, for helping me modify my HRP algorithm so as to not use the global environment for recursion. You can find his github here.

Here is the modified HRP code.

covMat <- read.csv('cov.csv', header = FALSE)
corMat <- read.csv('corMat.csv', header = FALSE)

clustOrder <- hclust(dist(corMat), method = 'single')$order

getIVP <- function(covMat) {
  invDiag <- 1/diag(as.matrix(covMat))
  weights <- invDiag/sum(invDiag)
  return(weights)
}

getClusterVar <- function(covMat, cItems) {
  covMatSlice <- covMat[cItems, cItems]
  weights <- getIVP(covMatSlice)
  cVar <- t(weights) %*% as.matrix(covMatSlice) %*% weights
  return(cVar)
}

getRecBipart <- function(covMat, sortIx) {
  w <- rep(1,ncol(covMat))
  w <- recurFun(w, covMat, sortIx)
  return(w)
}

recurFun <- function(w, covMat, sortIx) {
  subIdx <- 1:trunc(length(sortIx)/2)
  cItems0 <- sortIx[subIdx]
  cItems1 <- sortIx[-subIdx]
  cVar0 <- getClusterVar(covMat, cItems0)
  cVar1 <- getClusterVar(covMat, cItems1)
  alpha <- 1 - cVar0/(cVar0 + cVar1)
  
  # scoping mechanics using w as a free parameter
  w[cItems0] <- w[cItems0] * alpha
  w[cItems1] <- w[cItems1] * (1-alpha)
  
  if(length(cItems0) > 1) {
    w <- recurFun(w, covMat, cItems0)
  }
  if(length(cItems1) > 1) {
    w <- recurFun(w, covMat, cItems1)
  }
  return(w)
}


out <- getRecBipart(covMat, clustOrder)
out

With covMat and corMat being from the last post. In fact, this function can be further modified by encapsulating the clustering order within the getRecBipart function, but in the interest of keeping the code as similar to Marcos Lopez de Prado’s code as I could, I’ll leave this here.

Anyhow, the backtest will follow. One thing I will mention is that I’m using Quandl’s EOD database, as Yahoo has really screwed up their financial database (I.E. some sector SPDRs have broken data, dividends not adjusted, etc.). While this database is a $50/month subscription, I believe free users can access it up to 150 times in 60 days, so that should be enough to run backtests from this blog, so long as you save your downloaded time series for later use by using write.zoo.

This code needs the tseries library for the portfolio.optim function for the minimum variance portfolio (Dr. Kris Boudt has a course on this at datacamp), and the other standard packages.

A helper function for this backtest (and really, any other momentum rotation backtest) is the appendMissingAssets function, which simply adds on assets not selected to the final weighting and re-orders the weights by the original ordering.

require(tseries)
require(PerformanceAnalytics)
require(quantmod)
require(Quandl)

Quandl.api_key("YOUR_AUTHENTICATION_HERE") # not displaying my own api key, sorry :(

# function to append missing (I.E. assets not selected) asset names and sort into original order
appendMissingAssets <- function(wts, allAssetNames, wtsDate) {
  absentAssets <- allAssetNames[!allAssetNames %in% names(wts)]
  absentWts <- rep(0, length(absentAssets))
  names(absentWts) <- absentAssets
  wts <- c(wts, absentWts)
  wts <- xts(t(wts), order.by=wtsDate)
  wts <- wts[,allAssetNames]
  return(wts)
}

Next, we make the call to Quandl to get our data.

symbols <- c("SPY", "VGK",	"EWJ",	"EEM",	"VNQ",	"RWX",	"IEF",	"TLT",	"DBC",	"GLD")	

rets <- list()
for(i in 1:length(symbols)) {
  
  # quandl command to download from EOD database. Free users should use write.zoo in this loop.
  
  returns <- Return.calculate(Quandl(paste0("EOD/", symbols[i]), start_date="1990-12-31", type = "xts")$Adj_Close)
  colnames(returns) <- symbols[i]
  rets[[i]] <- returns
}
rets <- na.omit(do.call(cbind, rets))

While Josh Ulrich fixed quantmod to actually get Yahoo data after Yahoo broke the API, the problem is that the Yahoo data is now garbage as well, and I’m not sure how much Josh Ulrich can do about that. I really hope some other provider can step up and provide free, usable EOD data so that I don’t have to worry about readers not being able to replicate the backtest, as my policy for this blog is that readers should be able to replicate the backtests so they don’t just nod and take my word for it. If you are or know of such a provider, please leave a comment so that I can let the blog readers know all about you.

Next, we initialize the settings for the backtest.

invVolWts <- list()
minVolWts <- list()
hrpWts <- list()
ep <- endpoints(rets, on =  "months")
nMonths = 6 # month lookback (6 as per parameters from allocateSmartly)
nVol = 20 # day lookback for volatility (20 ibid)

While the AAA backtest actually uses a 126 day lookback instead of a 6 month lookback, as it trades at the end of every month, that’s effectively a 6 month lookback, give or take a few days out of 126, but the code is less complex this way.

Next, we have our actual backtest.

for(i in 1:(length(ep)-nMonths)) {
  
  # get returns subset and compute absolute momentum
  retSubset <- rets[c(ep[i]:ep[(i+nMonths)]),]
  retSubset <- retSubset[-1,]
  moms <- Return.cumulative(retSubset)
  
  # select top performing assets and subset returns for them
  highRankAssets <- rank(moms) >= 6 # top 5 assets
  posReturnAssets <- moms > 0 # positive momentum assets
  selectedAssets <- highRankAssets & posReturnAssets # intersection of the above
  selectedSubset <- retSubset[,selectedAssets] # subset returns slice
  
  if(sum(selectedAssets)==0) { # if no qualifying assets, zero weight for period
    
    wts <- xts(t(rep(0, ncol(retSubset))), order.by=last(index(retSubset)))
    colnames(wts) <- colnames(retSubset)
    invVolWts[[i]] <- minVolWts[[i]] <- hrpWts[[i]] <- wts
    
  } else if (sum(selectedAssets)==1) { # if one qualifying asset, invest fully into it
    
    wts <- xts(t(rep(0, ncol(retSubset))), order.by=last(index(retSubset)))
    colnames(wts) <- colnames(retSubset)
    wts[, which(selectedAssets==1)] <- 1
    invVolWts[[i]] <- minVolWts[[i]] <- hrpWts[[i]] <- wts
    
  } else { # otherwise, use weighting algorithms
    
    cors <- cor(selectedSubset) # correlation
    volSubset <- tail(selectedSubset, nVol) # 20 day volatility
    vols <- StdDev(volSubset)
    covs <- t(vols) %*% vols * cors
    
    # minimum volatility using portfolio.optim from tseries
    minVolRets <- t(matrix(rep(1, sum(selectedAssets))))
    minVolWt <- portfolio.optim(x=minVolRets, covmat = covs)$pw
    names(minVolWt) <- colnames(covs)
    minVolWt <- appendMissingAssets(minVolWt, colnames(retSubset), last(index(retSubset)))
    minVolWts[[i]] <- minVolWt
    
    # inverse volatility weights
    invVols <- 1/vols 
    invVolWt <- invVols/sum(invVols) 
    invNames <- colnames(invVolWt)
    invVolWt <- as.numeric(invVolWt) 
    names(invVolWt) <- invNames
    invVolWt <- appendMissingAssets(invVolWt, colnames(retSubset), last(index(retSubset)))
    invVolWts[[i]] <- invVolWt
    
    # hrp weights
    clustOrder <- hclust(dist(cors), method = 'single')$order
    hrpWt <- getRecBipart(covs, clustOrder)
    names(hrpWt) <- colnames(covs)
    hrpWt <- appendMissingAssets(hrpWt, colnames(retSubset), last(index(retSubset)))
    hrpWts[[i]] <- hrpWt
  }
}

In a few sentences, this is what happens:

The algorithm takes a subset of the returns (the past six months at every month), and computes absolute momentum. It then ranks the ten absolute momentum calculations, and selects the intersection of the top 5, and those with a return greater than zero (so, a dual momentum calculation).

If no assets qualify, the algorithm invests in nothing. If there’s only one asset that qualifies, the algorithm invests in that one asset. If there are two or more qualifying assets, the algorithm computes a covariance matrix using 20 day volatility multiplied with a 126 day correlation matrix (that is, sd_20′ %*% sd_20 * (elementwise) cor_126. It then computes normalized inverse volatility weights using the volatility from the past 20 days, a minimum variance portfolio with the portfolio.optim function, and lastly, the hierarchical risk parity weights using the HRP code above from Marcos Lopez de Prado’s paper.

Lastly, the program puts together all of the weights, and adds a cash investment for any period without any investments.

invVolWts <- round(do.call(rbind, invVolWts), 3) # round for readability
minVolWts <- round(do.call(rbind, minVolWts), 3)
hrpWts <- round(do.call(rbind, hrpWts), 3)

# allocate to cash if no allocation made due to all negative momentum assets
invVolWts$cash <- 0; invVolWts$cash <- 1-rowSums(invVolWts)
hrpWts$cash <- 0; hrpWts$cash <- 1-rowSums(hrpWts)
minVolWts$cash <- 0; minVolWts$cash <- 1-rowSums(minVolWts)

# cash value will be zero
rets$cash <- 0

# compute backtest returns
invVolRets <- Return.portfolio(R = rets, weights = invVolWts)
minVolRets <- Return.portfolio(R = rets, weights = minVolWts)
hrpRets <- Return.portfolio(R = rets, weights = hrpWts)

Here are the results:

compare <- cbind(invVolRets, minVolRets, hrpRets)
colnames(compare) <- c("invVol", "minVol", "HRP")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))

                             invVol    minVol       HRP
Annualized Return         0.0872000 0.0724000 0.0792000
Annualized Std Dev        0.1208000 0.1025000 0.1136000
Annualized Sharpe (Rf=0%) 0.7221000 0.7067000 0.6968000
Worst Drawdown            0.1548801 0.1411368 0.1593287
Calmar Ratio              0.5629882 0.5131956 0.4968234

In short, in the context of a small, carefully-selected and allegedly diversified (I’ll let Adam Butler speak for that one) universe dominated by the process of which assets to invest in as opposed to how much, the theoretical upsides of an algorithm which simultaneously exploits a covariance structure without needing to invert a covariance matrix can be lost.

However, this test (albeit from 2007 onwards, thanks to ETF inception dates combined with lookback burn-in) confirms what Adam Butler himself told me, which is that HRP hasn’t impressed him, and from this backtest, I can see why. However, in the context of dual momentum rank selection, I’m not convinced that any weighting scheme will realize much better performance than any other.

Thanks for reading.

NOTE: I am always interested in networking and hearing about full-time opportunities related to my skill set. My linkedIn profile can be found here.

The Marcos Lopez de Prado Hierarchical Risk Parity Algorithm

Posted on May 22, 2017 by Ilya Kipnis • Posted in Asset Allocation, Portfolio Management, R, Replication, Trading • Tagged R • 17 Comments

This post will be about replicating the Marcos Lopez de Prado algorithm from his paper building diversified portfolios that outperform out of sample. This algorithm is one that attempts to make a tradeoff between the classic mean-variance optimization algorithm that takes into account a covariance structure, but is unstable, and an inverse volatility algorithm that ignores covariance, but is more stable.

This is a paper that I struggled with until I ran the code in Python (I have anaconda installed but have trouble installing some packages such as keras because I’m on windows…would love to have someone walk me through setting up a Linux dual-boot), as I assumed that the clustering algorithm actually was able to concretely group every asset into a particular cluster (I.E. ETF 1 would be in cluster 1, ETF 2 in cluster 3, etc.). Turns out, that isn’t at all the case.

Here’s how the algorithm actually works.

First off, it computes a covariance and correlation matrix (created from simulated data in Marcos’s paper). Next, it uses a hierarchical clustering algorithm on a distance-transformed correlation matrix, with the “single” method (I.E. friend of friends–do ?hclust in R to read up more on this). The key output here is the order of the assets from the clustering algorithm. Note well: this is the only relevant artifact of the entire clustering algorithm.

Using this order, it then uses an algorithm that does the following:

Initialize a vector of weighs equal to 1 for each asset.

Then, run the following recursive algorithm:

1) Break the order vector up into two equal-length (or as close to equal length) lists as possible.

2) For each half of the list, compute the inverse variance weights (that is, just the diagonal) of the covariance matrix slice containing the assets of interest, and then compute the variance of the cluster when multiplied by the weights (I.E. w’ * S^2 * w).

3) Then, do a basic inverse-variance weight for the two clusters. Call the weight of cluster 0 alpha = 1-cluster_variance_0/(cluster_variance_0 + cluster_variance_1), and the weight of cluster 1 its complement. (1 – alpha).

4) Multiply all assets in the original vector of weights containing assets in cluster 0 with the weight of cluster 0, and all weights containing assets in cluster 1 with the weight of cluster 1. That is, weights[index_assets_cluster_0] *= alpha, weights[index_assets_cluster_1] *= 1-alpha.

5) Lastly, if the list isn’t of length 1 (that is, not a single asset), repeat this entire process until every asset is its own cluster.

Here is the implementation in R code.

First off, the correlation matrix and the covariance matrix for use in this code, obtained from Marcos Lopez De Prado’s code in the appendix in his paper.

> covMat
             V1           V2           V3           V4           V5          V6           V7           V8           V9          V10
1   1.000647799 -0.003050479  0.010033224 -0.010759689 -0.005036503 0.008762563  0.998201625 -0.001393196 -0.001254522 -0.009365991
2  -0.003050479  1.009021349  0.008613817  0.007334478 -0.009492688 0.013031817 -0.009420720 -0.015346223  1.010520047  1.013334849
3   0.010033224  0.008613817  1.000739363 -0.000637885  0.001783293 1.001574768  0.006385368  0.001922316  0.012902050  0.007997935
4  -0.010759689  0.007334478 -0.000637885  1.011854725  0.005759976 0.000905812 -0.011912269  0.000461894  0.012572661  0.009621670
5  -0.005036503 -0.009492688  0.001783293  0.005759976  1.005835878 0.005606343 -0.009643250  1.008567427 -0.006183035 -0.007942770
6   0.008762563  0.013031817  1.001574768  0.000905812  0.005606343 1.064309825  0.004413960  0.005780148  0.017185396  0.011601336
7   0.998201625 -0.009420720  0.006385368 -0.011912269 -0.009643250 0.004413960  1.058172027 -0.006755374 -0.008099181 -0.016240271
8  -0.001393196 -0.015346223  0.001922316  0.000461894  1.008567427 0.005780148 -0.006755374  1.074833155 -0.011903469 -0.013738378
9  -0.001254522  1.010520047  0.012902050  0.012572661 -0.006183035 0.017185396 -0.008099181 -0.011903469  1.075346677  1.015220126
10 -0.009365991  1.013334849  0.007997935  0.009621670 -0.007942770 0.011601336 -0.016240271 -0.013738378  1.015220126  1.078586686
> corMat
             V1           V2           V3           V4           V5          V6           V7           V8           V9          V10
1   1.000000000 -0.003035829  0.010026270 -0.010693011 -0.005020245 0.008490954  0.970062043 -0.001343386 -0.001209382 -0.009015412
2  -0.003035829  1.000000000  0.008572055  0.007258718 -0.009422702 0.012575370 -0.009117080 -0.014736040  0.970108941  0.971348946
3   0.010026270  0.008572055  1.000000000 -0.000633903  0.001777455 0.970485047  0.006205079  0.001853505  0.012437239  0.007698212
4  -0.010693011  0.007258718 -0.000633903  1.000000000  0.005709500 0.000872861 -0.011512172  0.000442908  0.012052964  0.009210090
5  -0.005020245 -0.009422702  0.001777455  0.005709500  1.000000000 0.005418538 -0.009347204  0.969998023 -0.005945165 -0.007625721
6   0.008490954  0.012575370  0.970485047  0.000872861  0.005418538 1.000000000  0.004159261  0.005404237  0.016063910  0.010827955
7   0.970062043 -0.009117080  0.006205079 -0.011512172 -0.009347204 0.004159261  1.000000000 -0.006334331 -0.007592568 -0.015201540
8  -0.001343386 -0.014736040  0.001853505  0.000442908  0.969998023 0.005404237 -0.006334331  1.000000000 -0.011072068 -0.012759610
9  -0.001209382  0.970108941  0.012437239  0.012052964 -0.005945165 0.016063910 -0.007592568 -0.011072068  1.000000000  0.942667300
10 -0.009015412  0.971348946  0.007698212  0.009210090 -0.007625721 0.010827955 -0.015201540 -0.012759610  0.942667300  1.000000000

Now, for the implementation.

This reads in the two matrices above and gets the clustering order.

covMat <- read.csv('cov.csv', header = FALSE)
corMat <- read.csv('corMat.csv', header = FALSE)

clustOrder <- hclust(dist(corMat), method = 'single')$order

This is the clustering order:

> clustOrder
 [1]  9  2 10  1  7  3  6  4  5  8

Next, the getIVP (get Inverse Variance Portfolio) and getClusterVar functions (note: I’m trying to keep the naming conventions identical to Dr. Lopez’s paper)

getIVP <- function(covMat) {
  # get inverse variance portfolio from diagonal of covariance matrix
  invDiag <- 1/diag(as.matrix(covMat))
  weights <- invDiag/sum(invDiag)
  return(weights)
}

getClusterVar <- function(covMat, cItems) {
  # compute cluster variance from the inverse variance portfolio above
  covMatSlice <- covMat[cItems, cItems]
  weights <- getIVP(covMatSlice)
  cVar <- t(weights) %*% as.matrix(covMatSlice) %*% weights
  return(cVar)
}

Next, my code diverges from the code in the paper, because I do not use the list comprehension structure, but instead opt for a recursive algorithm, as I find that style to be more readable.

One wrinkle to note is the use of the double arrow dash operator, to assign to a variable outside the scope of the recurFun function. I assign the initial weights vector w in the global environment, and update it from within the recurFun function. I am aware that it is a faux pas to create variables in the global environment, but my attempts at creating a temporary environment in which to update the weight vector did not produce the updating mechanism I had hoped to, so a little bit of assistance with refactoring this code would be appreciated.

getRecBipart <- function(covMat, sortIx) {
  # keeping track of weights vector in the global environment
  assign("w", value = rep(1, ncol(covMat)), envir = .GlobalEnv)

  # run recursion function
  recurFun(covMat, sortIx)
  return(w)
}

recurFun <- function(covMat, sortIx) {
  # get first half of sortIx which is a cluster order
  subIdx <- 1:trunc(length(sortIx)/2)

  # subdivide ordering into first half and second half
  cItems0 <- sortIx[subIdx]
  cItems1 <- sortIx[-subIdx]

  # compute cluster variances of covariance matrices indexed
  # on first half and second half of ordering
  cVar0 <- getClusterVar(covMat, cItems0)
  cVar1 <- getClusterVar(covMat, cItems1)
  alpha <- 1 - cVar0/(cVar0 + cVar1)
  
  # updating weights outside the function using scoping mechanics 
  w[cItems0] <<- w[cItems0] * alpha
  w[cItems1] <<- w[cItems1] * (1-alpha)
  
  # rerun the function on a half if the length of that half is greater than 1
  if(length(cItems0) > 1) {
    recurFun(covMat, cItems0)
  }
  if(length(cItems1) > 1) {
    recurFun(covMat, cItems1)
  }
}

Lastly, let’s run the function.

out <- getRecBipart(covMat, clustOrder)

With the result (which matches the paper):

> out
 [1] 0.06999366 0.07592151 0.10838948 0.19029104 0.09719887 0.10191545 0.06618868 0.09095933 0.07123881 0.12790318

So, hopefully this democratizes the use of this technology in R. While I have seen a raw Rcpp implementation and one from the Systematic Investor Toolbox, neither of those implementations satisfied me from a “plug and play” perspective. This implementation solves that issue. Anyone here can copy and paste these functions into their environment and immediately make use of one of the algorithms devised by one of the top minds in quantitative finance.

A demonstration in a backtest using this methodology will be forthcoming.

Thanks for reading.

NOTE: I am always interested in networking and full-time opportunities which may benefit from my skills. Furthermore, I am also interested in project work in the volatility ETF trading space. My linkedin profile can be found here.

A Return.Portfolio Wrapper to Automate Harry Long Seeking Alpha Backtests

Posted on June 17, 2016 by Ilya Kipnis • Posted in R, Replication, SeekingAlpha • 3 Comments

This post will cover a function to simplify creating Harry Long (of Houston) type rebalancing strategies from SeekingAlpha for interested readers. As Harry Long has stated, most, if not all of his strategies are more for demonstrative purposes rather than actual recommended investments.

So, since Harry Long has been posting some more articles on Seeknig Alpha, I’ve had a reader or two ask me to analyze his strategies (again). Instead of doing that, however, I’ll simply put this tool here, which is a wrapper that automates the acquisition of data and simulates portfolio rebalancing with one line of code.

Here’s the tool.

require(quantmod)
require(PerformanceAnalytics)
require(downloader)

LongSeeker &lt;- function(symbols, weights, rebalance_on = "years",
                       displayStats = TRUE, outputReturns = FALSE) {
  getSymbols(symbols, src='yahoo', from = '1990-01-01')
  prices &lt;- list()
  for(i in 1:length(symbols)) {
    if(symbols[i] == "ZIV") {
      download("https://www.dropbox.com/s/jk3ortdyru4sg4n/ZIVlong.TXT", destfile="ziv.txt")
      ziv &lt;- xts(read.zoo("ziv.txt", header=TRUE, sep=",", format="%Y-%m-%d"))
      prices[[i]] &lt;- Cl(ziv)
    } else if (symbols[i] == "VXX") {
      download("https://dl.dropboxusercontent.com/s/950x55x7jtm9x2q/VXXlong.TXT",
               destfile="vxx.txt")
      vxx &lt;- xts(read.zoo("vxx.txt", header=TRUE, sep=",", format="%Y-%m-%d"))
      prices[[i]] &lt;- Cl(vxx)
    }
    else {
      prices[[i]] &lt;- Ad(get(symbols[i]))
    }
  }
  prices &lt;- do.call(cbind, prices)
  prices &lt;- na.locf(prices)
  returns &lt;- na.omit(Return.calculate(prices))

  returns$zeroes &lt;- 0
  weights &lt;- c(weights, 1-sum(weights))
  stratReturns &lt;- Return.portfolio(R = returns, weights = weights, rebalance_on = rebalance_on)

  if(displayStats) {
    stats &lt;- rbind(table.AnnualizedReturns(stratReturns), maxDrawdown(stratReturns), CalmarRatio(stratReturns))
    rownames(stats)[4] &lt;- "Max Drawdown"
    print(stats)
    charts.PerformanceSummary(stratReturns)
  }

  if(outputReturns) {
    return(stratReturns)
  }
}

It fetches the data for you (usually from Yahoo, but a big thank you to Mr. Helumth Vollmeier in the case of ZIV and VXX), and has the option of either simply displaying an equity curve and some statistics (CAGR, annualized standard dev, Sharpe, max Drawdown, Calmar), or giving you the return stream as an output if you wish to do more analysis in R.
Here’s an example of simply getting the statistics, with an 80% XLP/SPLV (they’re more or less interchangeable) and 20% TMF (aka 60% TLT, so an 80/60 portfolio), from one of Harry Long’s articles.
LongSeeker(c("XLP", "TLT"), c(.8, .6))

Statistics:

                          portfolio.returns
Annualized Return                 0.1321000
Annualized Std Dev                0.1122000
Annualized Sharpe (Rf=0%)         1.1782000
Max Drawdown                      0.2330366
Calmar Ratio                      0.5670285

Equity curve:

Nothing out of the ordinary of what we might expect from a balanced equity/bonds portfolio. Generally does well, has its largest drawdown in the financial crisis, and some other bumps in the road, but overall, I’d think a fairly vanilla “set it and forget it” sort of thing.
And here would be the way to get the stream of individual daily returns, assuming you wanted to rebalance these two instruments weekly, instead of yearly (as is the default).
tmp &lt;- LongSeeker(c("XLP", "TLT"), c(.8, .6), rebalance_on="weeks",
                    displayStats = FALSE, outputReturns = TRUE)

And now let’s get some statistics.
table.AnnualizedReturns(tmp)
maxDrawdown(tmp)
CalmarRatio(tmp)

Which give:
&gt; table.AnnualizedReturns(tmp)
                          portfolio.returns
Annualized Return                    0.1328
Annualized Std Dev                   0.1137
Annualized Sharpe (Rf=0%)            1.1681
&gt; maxDrawdown(tmp)
[1] 0.2216417
&gt; CalmarRatio(tmp)
             portfolio.returns
Calmar Ratio         0.5990087

Turns out, moving the rebalancing from annually to weekly didn’t have much of an effect here (besides give a bunch of money to your broker, if you factored in transaction costs, which this doesn’t).
So, that’s how this tool works. The results, of course, begin from the latest instrument’s inception. The trick, in my opinion, is to try and find proxy substitutes with longer histories for newer ETFs that are simply leveraged ETFs, such as using a 60% weight in TLT with an 80% weight in XLP instead of a 20% weight in TMF with 80% allocation in SPLV.
For instance, here are some proxies:
SPXL = XLP

SPXL/UPRO = SPY * 3

TMF = TLT * 3
That said, I’ve worked with Harry Long before, and he develops more sophisticated strategies behind the scenes, so I’d recommend that SeekingAlpha readers take his publicly released strategies as concept demonstrations, as opposed to fully-fledged investment ideas, and contact Mr. Long himself about more customized, private solutions for investment institutions if you are so interested.
Thanks for reading.
NOTE: I am currently in the northeast. While I am currently contracting, I am interested in networking with individuals or firms with regards to potential collaboration opportunities.

Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)

Posted on April 22, 2016 by Ilya Kipnis • Posted in ETFs, R, Replication, Trading • Tagged R • 11 Comments

This post will outline an easy-to-make mistake in writing vectorized backtests–namely in using a signal obtained at the end of a period to enter (or exit) a position in that same period. The difference in results one obtains is massive.

Today, I saw two separate posts from Alpha Architect and Mike Harris both referencing a paper by Valeriy Zakamulin on the fact that some previous trend-following research by Glabadanidis was done with shoddy results, and that Glabadanidis’s results were only reproducible through instituting lookahead bias.

The following code shows how to reproduce this lookahead bias.

First, the setup of a basic moving average strategy on the S&P 500 index from as far back as Yahoo data will provide.

require(quantmod)
require(xts)
require(TTR)
require(PerformanceAnalytics)

getSymbols('^GSPC', src='yahoo', from = '1900-01-01')
monthlyGSPC <- Ad(GSPC)[endpoints(GSPC, on = 'months')]

# change this line for signal lookback
movAvg <- SMA(monthlyGSPC, 10)

signal <- monthlyGSPC > movAvg
gspcRets <- Return.calculate(monthlyGSPC)

And here is how to institute the lookahead bias.

lookahead <- signal * gspcRets
correct <- lag(signal) * gspcRets

These are the “results”:

compare <- na.omit(cbind(gspcRets, lookahead, correct))
colnames(compare) <- c("S&P 500", "Lookahead", "Correct")
charts.PerformanceSummary(compare)
rbind(table.AnnualizedReturns(compare), maxDrawdown(compare), CalmarRatio(compare))
logRets <- log(cumprod(1+compare))
chart.TimeSeries(logRets, legend.loc='topleft')

Of course, this equity curve is of no use, so here’s one in log scale.

As can be seen, lookahead bias makes a massive difference.

Here are the numerical results:

                            S&P 500  Lookahead   Correct
Annualized Return         0.0740000 0.15550000 0.0695000
Annualized Std Dev        0.1441000 0.09800000 0.1050000
Annualized Sharpe (Rf=0%) 0.5133000 1.58670000 0.6623000
Worst Drawdown            0.5255586 0.08729914 0.2699789
Calmar Ratio              0.1407286 1.78119192 0.2575219

Again, absolutely ridiculous.

Note that when using Return.Portfolio (the function in PerformanceAnalytics), that package will automatically give you the next period’s return, instead of the current one, for your weights. However, for those writing “simple” backtests that can be quickly done using vectorized operations, an off-by-one error can make all the difference between a backtest in the realm of reasonable, and pure nonsense. However, should one wish to test for said nonsense when faced with impossible-to-replicate results, the mechanics demonstrated above are the way to do it.

Now, onto other news: I’d like to thank Gerald M for staying on top of one of the Logical Invest strategies–namely, their simple global market rotation strategy outlined in an article from an earlier blog post.

Up until March 2015 (the date of the blog post), the strategy had performed well. However, after said date?

It has been a complete disaster, which, in hindsight, was evident when I passed it through the hypothesis-driven development framework process I wrote about earlier.

So, while there has been a great deal written about not simply throwing away a strategy because of short-term underperformance, and that anomalies such as momentum and value exist because of career risk due to said short-term underperformance, it’s never a good thing when a strategy creates historically large losses, particularly after being published in such a humble corner of the quantitative financial world.

In any case, this was a post demonstrating some mechanics, and an update on a strategy I blogged about not too long ago.

Thanks for reading.

NOTE: I am always interested in hearing about new opportunities which may benefit from my expertise, and am always happy to network. You can find my LinkedIn profile here.

QuantStrat TradeR

Trading, QuantStrat, R, and more.

Category Archives: Replication

Several Key PerformanceAnalytics Functions From R Now In Python (special thanks to Vijay Vaidyanathan)

How You Measure Months Matters — A Lot. A Look At Two Implementations of KDA

Replicating Volatility ETN Returns From CBOE Futures

Comparing Some Strategies from Easy Volatility Investing, and the Table.Drawdowns Command

Leverage Up When You’re Down?

An Out of Sample Update on DDN’s Volatility Momentum Trading Strategy and Beta Convexity

Testing the Hierarchical Risk Parity algorithm

The Marcos Lopez de Prado Hierarchical Risk Parity Algorithm

A Return.Portfolio Wrapper to Automate Harry Long Seeking Alpha Backtests

Create Amazing Looking Backtests With This One Wrong–I Mean Weird–Trick! (And Some Troubling Logical Invest Results)