Labeling and Meta-Labeling Returns for ML Prediction

Post Outline

  • Introduction
  • Links
  • Embedded Notebook


This post focuses on Chapter 3 of the new book Advances in Financial Machine Learning by Marcos Lopez De Prado. In this chapter De Prado demonstrates a workflow for improved return labeling for supervised classification models. He introduces multiple concepts but focuses on two: the Triple-Barrier Labeling method, which incorporates profit-taking, stop-loss, and holding-period information, and meta-labeling, a technique designed to address several issues. Those issues include how to improve the F1 score and recall of a primary model, such as a moving average crossover model, and how to reduce the likelihood of overfitting by separating the decision of which side to trade from the decision of whether to trade at all.
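To make the two ideas concrete, here is a minimal sketch of how I understand them. It is not De Prado's implementation: the function names (`triple_barrier`, `meta_label`), the fixed barrier widths, and the choice to label a holding-period expiry as 0 are all my own simplifying assumptions (the book scales the horizontal barriers by a volatility estimate).

```python
import pandas as pd


def triple_barrier(prices, start, pt=0.02, sl=0.02, max_holding=5, side=1):
    """Return (label, ret) for a position opened at integer position `start`.

    label is  1 if the profit-taking barrier is touched first,
             -1 if the stop-loss barrier is touched first,
              0 if neither is touched within `max_holding` bars
                (vertical barrier; labeling this 0 is an assumption).
    ret is the side-adjusted return at the first barrier touched.
    """
    entry = float(prices.iloc[start])
    # Bars after entry, up to and including the vertical barrier.
    window = prices.iloc[start + 1 : start + 1 + max_holding]
    ret = 0.0
    for price in window:
        ret = side * (float(price) / entry - 1.0)
        if ret >= pt:
            return 1, ret
        if ret <= -sl:
            return -1, ret
    return 0, ret


def meta_label(side, prices, start, **barrier_kwargs):
    """Meta-label: 1 if acting on the primary model's `side` made money, else 0."""
    _, ret = triple_barrier(prices, start, side=side, **barrier_kwargs)
    return int(ret > 0)


# Toy price path (made-up numbers, purely illustrative).
prices = pd.Series([100, 101, 103, 99, 95, 96, 97, 98])
label_long, ret_long = triple_barrier(prices, start=0, max_holding=3)
took_short_trade = meta_label(side=-1, prices=prices, start=2, max_holding=3)
```

The primary model (for example, a moving average crossover) supplies `side`; a secondary classifier is then trained on the 0/1 meta-labels to decide whether to act on that signal at all.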

Please note that I am publishing my experimental results in the hope that errors or gaps in my understanding will be corrected by those with a better grasp of the material. Also note that the example dataset in his book is a long history of a continuous S&P 500 E-mini futures series at tick-level resolution, whereas mine is an admittedly somewhat dirty tick series of the IVE ETF.


GitHub Repo:
GitHub Notebook: Link
Parquet dataset for download: Link