Asset Pricing using Extreme Liquidity with Python (Part-2)


  • Part-1 Recap
  • Part-1 Error Corrections
  • Part-2 Implementation Details, Deviations, Goals
  • Prepare Data
  • Setup PYMC3 Generalized Linear Models (GLM)
  • Evaluate and Interprate Models
  • Conclusions
  • References

part-1 recap

In part 1 We discussed the theorized underpinnings of Ying Wu of Stevens Institute of Technology - School's asset pricing model. Theory links the catalyst of systemic risk events to the funding difficulties of major financial intermediaries. Thus crisis risk is linked to liquidity events. The model proposes a method to estimate a proxy index for the systematic liquidity risk. We use an illiquidity metric calculated across a large group of stocks, then apply a tool called the Hill estimator to measure the average distance of extreme illiquidity events from the tail cutoff. We explored the high level intuition behind the Hill estimator. 

We created an implementation of the Hill estimator, aggregated the stock data, calculated the illiquidity metrics and the ELR index, and finally output the intermediate data into a hdf5 file for quick read/write access. 

We did not get this far in part-1, but the paper asserts that we can use this index as an asset pricing component. This could also be thought of as the primary feature or independent variable in a simple linear regression (think CAPM). The target variable is the expected aggregate returns. From there the paper says we can create long-short portfolios by ranking the stocks according to their factor betas and sorting them into quantiles. 

part-1 error corrections

In part 1 the implementation of the hill estimator was incorrect. The ELR index is supposed to be comprised of the values that exceed 95th percentile. In the original implementation I calculated the average of all values not just those in the tail region. Therefore the quick and dirty observations made previously are for a different index. On the left is the original incorrect index. On the right is the corrected index.



After spending some time rereading the research paper there is a subtle bit of additional complexity I have not included in this implementation that may affect the results I get vs those found in the paper. 

In the paper the threshold value is calculated as the 95th percentile cross-sectionally for the entire month. Then the index is constructed by calculating the average log distance from that threshold for any datapoint located in the tail. To create an index like this requires binning the data by month, getting the threshold value of that month by aggregating the daily illiquidity metrics of a few thousand stocks for that month, then calculating the log average distance between those tail values and the threshold. 

This means we likely need a whole month of data before we can calculate the ELR value. We can potentially use a rolling 21 or 30 day window to simulate a monthly lookback but based on the paper it does not seem that the author used this method. If, instead we go by calendar months, this likely means we need a lot more data before we can draw any conclusions. For example the author's sample period is from 1968-2011 and only includes NYSE stocks among other stock universe selection details. 

In my exploration of ELR index, I prefer to keep it simpler, and calculate the 95th percentile threshold based on the cross sectional daily illiquidity values instead of the whole month.

Part-2 Goals:

  • Import the calculated the daily illiquidity values
  • Resample the illiquidity measures by week, taking the median and max illiquidity values, then calculate the ELR Index
  • Use pymc3's generalized linear models function to fit a model for predicting the cross-sectional scaled returns.
  • Interpret and Evaluate the models.

prepare data

First we need to import packages and get our data ready. 

# import packages

import sys
import os 
# ------------------- % import datasets % ------------------- #
datasets = '/YOUR/DATASET/LOCATION/_Datasets/'

import pandas as pd
import as web
from pandas.tseries.offsets import *
import numpy as np
import scipy.stats as scs
import matplotlib as mpl
import matplotlib.pyplot as plt'bmh')
%matplotlib inline
import seaborn as sns
sns.set_style('white', {"xtick.major.size": 3, "ytick.major.size": 3})

import pymc3 as pm
from scipy import optimize

import time
from tqdm import tqdm

p = print

I created a hdf5 file for the aggregated returns because I want to use them as a proxy for our target variable of expected market returns. I import those here and create a time series consisting of the cross sectional median and average log returns.

## read in data

# log returns for aggregrate mkt proxy
LRET_FILE = datasets + 'LRET_Set_2016-11-22.h5' 
lret_set = pd.read_hdf(LRET_FILE, 'RETURNS_DV_SET')
lret_set = lret_set.loc[:,lret_set.columns.to_series().str.contains('_lret').tolist()]

# calc median and mean cross sectional

mkt = pd.DataFrame({'cross_mdn_rets':lret_set.median(axis='columns'), 
## read in illiquidity data for ELR calculations

ILQ_FILE = datasets + 'Illiquidity_Set_2016-11-22.h5'
ilq = pd.read_hdf(ILQ_FILE, 'Illiquidity_Set')  

After loading the data into our environment, we resample the data to a weekly frequency using both median and max values for comparison. On my outdated laptop this took approximately 7 minutes.

# weekly resample
freq = '1W'
df = ilq.resample(freq).median()
df_max = ilq.resample(freq).max() 

Next we define our convenience functions for calculating our ELR index. Notice that I deviate from the traditional z-score scaling method and implement the Gelman scaler which divides the centered values by 2 times the standard deviation. You can read more details from Andrew Gelman's paper[2] about why we use this method. The high-level intuition is that this scale improves regression coefficient interpretability across binary, discrete, and continuous variables.

# convenience functions for gamma calculation and scaler

# gamma estimate
def _ext_lq_risk(series):
    # threshold is 95th percentile
    p_star = np.nanpercentile(series, 95)
    illiq = series[series > p_star]
    #illiq = series # looks better on chart but less explanatory power
    lg_illiq = np.log(illiq / p_star)
    lg_illiq = lg_illiq[np.isfinite(lg_illiq)]
        gamma = 1./ ((1./len(lg_illiq)) * sum(lg_illiq))
    except ZeroDivisionError:
        gamma = np.nan
    return gamma

# scaler function
gelman_scaler = lambda ser: (ser - ser.mean()) / (2*ser.std())

# calculate elr index
def _calculate_elr(df, cutoff=100, scaler=None):
    gs = {} # gammas dictionary
    nan_dates = []
    for d in df.index:
        # we want at least N nonnull values
        if df.loc[d].notnull().sum() > cutoff:
            gamma = _ext_lq_risk(df.loc[d])
            gs[d] = gamma

    gdf = pd.DataFrame.from_dict(gs, orient='index').sort_index()
    gdfz = scaler(gdf)
    gdfz.columns = ['ELR']
    return gdfz, nan_dates

Now we can set up our main experimental dataframe. We need to make sure the our market proxy dataframe, which consists of the aggregate sample returns, has the same index as our ELR dataframe before we merge them. Also, remember we are going to experiment with two resampled dataframes, one with the weekly median illiquidity, and one with the weekly maximum illiquidity. Our final step after creating our merged dataframes is to add a column for our Gelman scaled aggregate returns.

# calculate ELR index on resampled data
gdfz_mdn, _ = _calculate_elr(df, scaler=gelman_scaler)
gdfz_max, _ = _calculate_elr(df_max, scaler=gelman_scaler)

# market resample must match gdfz before merge
# merge dataframes
mkt_rs = mkt.resample(freq).mean()
mrg_mdn = pd.concat([gdfz_mdn, mkt_rs], join='inner', axis=1)
mrg_max = pd.concat([gdfz_max, mkt_rs], join='inner', axis=1)

# add cross sectional average Gelman scored returns
avg_col = 'cross_avg_rets'
mrg_mdn['cross_avg_zrets'] = gelman_scaler(mrg_mdn[avg_col])
mrg_max['cross_avg_zrets'] = gelman_scaler(mrg_max[avg_col])


Before running our model I define some output convenience functions adapted from the excellent blog Applied AI[3].

# pymc3 convenience functions adapted from

def trace_median(x):
    return pd.Series(np.median(x,0), name='median')

def plot_traces(trcs, retain=1000, varnames=None):
    ''' Convenience fn: plot traces with overlaid means and values '''
    nrows = len(trcs.varnames)
    if varnames is not None:
        nrows = len(varnames)
    ax = pm.traceplot(trcs[-retain:], varnames=varnames, figsize=(12,nrows*1.4)
        ,lines={k: v['mean'] for k, v in 

    for i, mn in enumerate(pm.df_summary(trcs[-retain:], varnames=varnames)['mean']):
        ax[i,0].annotate('{:.2f}'.format(mn), xy=(mn,0), xycoords='data'
                    ,xytext=(5,10), textcoords='offset points', rotation=90
                    ,va='bottom', fontsize='large', color='#AA0022')   
def plot_pm_acf(trace, varnames=None, burn=None):
    pm.autocorrplot(trace, varnames=varnames, burn=burn, figsize=(7,5))

Now we can set up our model. I will gloss over some of the particulars pymc3 and the Generalized Linear Model (glm) functions for now. I'm also skipping over why I'm using a Bayesian methodology vs. a frequentist one. Generally speaking, Bayesian modeling is the preferred methodology due to robustness and explicit modeling of the uncertainty in our point estimates. I plan to revisit this topic in more detail in the future, but there are plenty of tutorials and explanations of why Bayesian is the way to go.

Anyone familiar with R will appreciate the following simplicity of model setup. First we need to define our model formula as a string.

# predicting cross sectional average returns using the ELR index
ft_endog = 'cross_avg_zrets' 
ft_exog =  ['ELR'] # this format allows easy addition of more variables
fml = '{} ~ '.format(ft_endog) + ' + '.join(ft_exog) 

# 'cross_avg_zrets ~ ELR'

Next we follow pymc3's glm model convention and choose the number of samples we wish to draw from the predicted posterior.

# choose samples and run model

samples = 5000
with pm.Model() as mdl:

    ## Use GLM submodule for simplified model specification
    ## Betas are Normal (as per default settings (for Ridge)
    ## Likelihood is Normal (with HalfCauchy for error prior)
    pm.glm.glm(fml, mrg_mdn, family=pm.glm.families.Normal())
    start_MAP = pm.find_MAP(fmin=optimize.fmin_powell)
    ## take samples using NUTS sampler
    trc_ols = pm.sample(samples, start=start_MAP, step=pm.NUTS())  

rvs = [ for rv in mdl.unobserved_RVs]

plot_traces(trc_ols, varnames=rvs)
plot_pm_acf(trc_ols, varnames=rvs, burn=1000)

p(pm.df_summary(trc_ols[-1000:], varnames=rvs))
p('\nMedian Illiquidity ELR Model\nDIC:', pm.dic(trc_ols[-1000:], model=mdl))

We run the model for both the median and max illiquidity estimates. 


First we need to decide how we will evaluate which model is best. For this I have chosen the Deviance Information Criterion (DIC) which is implemented in pymc3 and designed specifically for Bayesian modelling using MCMC. Like similar alternative measures, the smaller the number the better our model. 

First we evaluate the resampled median illiquidity model. 

median model trace

On the left we can examine the distribution of our sample estimate for the intercept, ELR, and model error. On the right we can see the sample trace. This should look like white noise and it does. We can see the intercept is basically zero, the ELR beta is -0.06 and the standard deviation is 0.5

median acf

We plot the ACF of our variables to confirm that the sample traces are white noise. However we can see a strongly negative autocorrelation for each variable at its first lag. 

median model summary and dic

We can see that the both the ELR and sd are significant as their highest posterior density does NOT include zero in the interval. The DIC is 1531. Now let's compare the median model to the max model.

elr max model trace

We observe that the models are similar in their output, however notice the ELR in this instance has a stronger negative correlation with our target variable than does the median model. The traces on the right side appear to resemble white noise. Let's confirm by looking at the ACF plot.

elr max model acf

This confirms our intuition that the series is close to white noise. We can also see a pretty strong negative autocorrelation at lag 1 for each of our variables. This is not ideal but ok for our exploratory purposes.

max model hpd and dic

We can see that the ELR and sd are both significant as neither interval includes zero. The magnitude of the ELR coefficient is larger in the max model which corresponds to a lower better DIC.


We designed an experiment to evaluate the relationship between the ELR index and the cross sectional scaled returns. We deviated from the original paper in a couple notable ways. We used the daily illiquidity measure and resampled to a weekly frequency using both the weekly median, and the weekly max. We then calculated the ELR index using the weekly cross sectional data as opposed to the highly nuanced monthly methodology used in the paper.

We then designed a basic linear model using pymc3 to explore the ELR index's impact on the scaled cross sectional returns. After examining the results I am somewhat disappointed we weren't able to show as strong a link as demonstrated in the paper. The max model is clearly the better model according to the DIC, but even then we can see the ELR index is only weakly related to the cross sectional returns. 

A positive takeaway is that the sign of the relationship is what we would expect. The ELR we calculated is negatively correlated with cross section of returns. 

As currently constructed, using this method to form the basis of an asset pricing model seems dubious at best and definitely lowers my expectations when I simulate a long-short strategy in Quantopian.


  1. Wu, Ying, Asset Pricing with Extreme Liquidity Risk (October 10, 2016). Available at SSRN: or
  2. Gelman, A. (2008), Scaling regression inputs by dividing by two standard deviations. Statist. Med., 27: 2865–2873. doi:10.1002/sim.3107
  3. Sedar, Jonathan. "Bayesian Inference with PyMC3 - Part 2." The Sampler. Applied AI, 06 Sept. 2016. Web. 13 Dec. 2016.