That’s Enough Machine Learning – thanks!
Alright – so I’m going to hammer on one specific topic that’s been bothering me in the tech scene, and that’s machine learning being thrown at everything. “Need a t-shirt? Let’s use machine learning to find our different habits and predict our tastes.” Or, you know, you could go to a store and see what appeals to you. OK, that’s an exaggeration, and going to stores and checking merchandise doesn’t scale across the variety the web offers you. But I like this analogy, so I’m going to keep it.
The problem I see with machine learning, and why I think it’s inappropriately overused in markets, is that it cannot explain in the same way human consciousness can. What I mean by that is that traditional science tells us to form a hypothesis before conducting an experiment. The idea is that by forming an explanation before seeing the data, we are forced to take current observations and make a rational expectation. This still leads to biases, of course, which show up quantitatively in the inability to replicate research and in the disproportionate number of papers that just happen to support their own hypothesis. What “big data” (I throw up a little in my mouth when I use that phrase) does give us, though, is instant iterative feedback: A/B testing lets us test our samples in the real world and see if our models hold up.
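To make that “instant feedback” bit concrete, here’s a minimal sketch of the hypothesis-first loop: decide up front that variant B should convert better than A, run the test, and only then look at the numbers. The conversion counts and threshold below are made up purely for illustration.

```python
# Hypothesis stated before looking at the data: "variant B converts better than A."
# A one-sided two-proportion z-test then confirms or rejects it.
from scipy.stats import norm

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """One-sided z-test: H1 is that B's conversion rate exceeds A's."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)               # pooled rate under H0
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return z, norm.sf(z)                                    # one-sided p-value

z, p = two_proportion_ztest(conv_a=480, n_a=10_000, conv_b=540, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # reject H0 only if p is below a threshold chosen in advance
```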
This is how it “should” be done. What happens, though, is that machine learning, instead of being used as an optimization method, gets used as a method of finding explanations. Many of us are using it to find relationships and then backfilling a hypothesis that appears to fit. While the current method of science is far from perfect, this approach seems far, far worse. I have seen some who can master this, but they often have very strict processes in place to ensure the models hold up. Some enforce it via risk management while others run statistical tests – usually a combination of the two.
But do we really need to use advanced machine learning to create explanatory relationships instead of using it as an optimization method? After speaking with many people using it this way and reading papers on it, it seems like many who do drastically overfit, and their live results/trading do not match their out-of-sample. A common response to this is, “machine learning should work if we run proper out-of-sample tests.” Well, something taught to me by Josh + Steve @ AlphaParity (on this list) was that many people run out-of-sample tests inappropriately. What people often do is start with an in-sample and an out-of-sample, but when the out-of-sample doesn’t match the in-sample performance, they re-parameterize the in-sample until the out-of-sample shows what they want. At that point there is effectively just one big in-sample and no out-of-sample at all.
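Here’s a rough sketch of what keeping an honest out-of-sample looks like versus the leak described above. The data, model, and parameter grid are hypothetical stand-ins, not anyone’s actual setup.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 5))
y = X @ rng.normal(size=5) + rng.normal(scale=0.5, size=1_000)

# Chronological split: fit on the past, tune on a validation slice, and keep a
# final out-of-sample slice that gets touched exactly once.
X_tr, X_val, X_test = X[:600], X[600:800], X[800:]
y_tr, y_val, y_test = y[:600], y[600:800], y[800:]

best_alpha, best_val = None, np.inf
for alpha in [0.01, 0.1, 1.0, 10.0]:          # tune ONLY against validation
    m = Ridge(alpha=alpha).fit(X_tr, y_tr)
    err = mean_squared_error(y_val, m.predict(X_val))
    if err < best_val:
        best_alpha, best_val = alpha, err

final = Ridge(alpha=best_alpha).fit(np.vstack([X_tr, X_val]),
                                    np.concatenate([y_tr, y_val]))
print("true out-of-sample MSE:", mean_squared_error(y_test, final.predict(X_test)))

# The failure mode from the paragraph above: put the test error inside the tuning
# loop and iterate until it "matches what you want" -- at that point the test set
# has been optimized against and is in-sample in everything but name.
```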
Using machine learning as an explanatory-relationship finder often leads to more complex models, which further increases the probability of overfitting. A secondary problem with markets is that regime shifts can happen rapidly, making machine learning less effective over longer time periods where new macro drivers emerge. While it absolutely can be done, I know only one person who has pulled it off, and I have no idea how they do it. The question is: is all of this complexity worth it? The largest hedge funds out there, like AQR, do not use it to find explanatory relationships but use it for what it was meant to be: an optimization algorithm that slightly boosts performance. The simplicity of models like this reduces the chances of overfitting and also lets us know when a model will break – when there will be a regime shift. This knowing-when-it-fails lets us assign higher odds as to when to size down risk (or weighting, in non-market cases), or use portfolio construction to provide correlation/diversification benefits.
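To illustrate the knowing-when-it-fails point, here’s a toy example (not AQR’s or anyone else’s actual process) of pairing a simple model with an explicit regime check, so exposure gets cut when conditions drift from the ones the model was built for. The window, target, and return series are invented for the sketch.

```python
import numpy as np

def position_scale(returns, window=20, vol_target=0.01):
    """Scale exposure down as realized volatility rises above a target level."""
    realized_vol = np.std(returns[-window:])
    return min(1.0, vol_target / max(realized_vol, 1e-8))   # 1.0 = full size

rng = np.random.default_rng(1)
calm = rng.normal(scale=0.005, size=60)      # quiet regime
stressed = rng.normal(scale=0.03, size=60)   # after a volatility regime shift

print("scale in calm regime:    ", round(position_scale(calm), 2))
print("scale after regime shift:", round(position_scale(stressed), 2))
```

The point isn’t the volatility rule itself; it’s that a simple model leaves room for an explicit, human-stated condition for when it stops being trusted.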
So before we go crazy trying to make machine learning predictive from the start, I think it’s worthwhile to test the relationships and run studies first, and then consider ML at a “tweaking” stage. When used properly, it can be an effective tool; I just don’t think it’s as effective as the mass adoption of the phrase implies for the vast majority of cases. I think a good example of those who used it properly is the winning team behind the Netflix Prize, whose solution is public. Their initial papers explored the biases and preferences people had when ranking movies. Their final solution combined different ML and statistical methods to push results over the edge. Reading BellKor’s Pragmatic Chaos’s papers in sequential order is good fun: Direct link to final paper. Ignoring the math, their logic and explanations are fantastic displays of the scientific method + optimization methods.
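As a hedged sketch of that study-first, tweak-later workflow (not the Netflix team’s method, just the shape of it): an explainable baseline built from a tested relationship, with a small ML model fit only to the baseline’s residuals to squeeze out a bit more accuracy. The feature and data below are synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, size=(2_000, 1))
y = 1.5 * x[:, 0] + 0.3 * np.sin(3 * x[:, 0]) + rng.normal(scale=0.2, size=2_000)

x_tr, x_te, y_tr, y_te = x[:1_500], x[1_500:], y[:1_500], y[1_500:]

# Step 1: the studied, explainable relationship.
baseline = LinearRegression().fit(x_tr, y_tr)

# Step 2: a small ML model fit only to what the baseline leaves behind.
resid = y_tr - baseline.predict(x_tr)
tweak = GradientBoostingRegressor(max_depth=2, n_estimators=100).fit(x_tr, resid)

mse_base = mean_squared_error(y_te, baseline.predict(x_te))
mse_blend = mean_squared_error(y_te, baseline.predict(x_te) + tweak.predict(x_te))
print(f"baseline MSE: {mse_base:.4f}  baseline + ML tweak MSE: {mse_blend:.4f}")
```

The tweak buys a small improvement without obscuring what the baseline is actually doing, which is the whole point.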