Data Mining
Data Mining By Nick And Zach Radge
For a real example of data mining we need only look at some trading books. The best example is Long-Term Secrets to Short-Term Trading by Larry Williams, written in the 1990s.
The specific pattern we’ll discuss here is the famed Oops! pattern (page 113). The setup occurs when the S&P 500 futures open below the prior day’s low. If prices then rally from that weak open and trade back up through the low of the prior day, a long position is initiated (the opposite is true for a short setup).
But here’s the twist to the setup; any signal generated on a Wednesday or Thursday is ignored.
Why?
Because they were probably losing days, and that’s what data mining is: purposely filtering out the bad parts of a system. A system built this way tends to lack robustness, meaning that its performance will usually deteriorate in the future.
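To make the rule concrete, here is a minimal sketch of how the setup and the day-of-week filter might be coded against day-session bars. The `Bar` class, the `oops_signals` function and the sample prices are hypothetical illustrations, not Larry Williams’ code or real market data, and entry price, stops and exits are left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Bar:
    """One day-session bar (hypothetical structure for this sketch)."""
    day: date
    open: float
    high: float
    low: float
    close: float


def oops_signals(bars, skip_weekdays=frozenset()):
    """Return (date, direction) for each bar that triggers the setup.

    skip_weekdays uses date.weekday() numbering: Monday=0 ... Friday=4.
    """
    signals = []
    for prev, cur in zip(bars, bars[1:]):
        if cur.day.weekday() in skip_weekdays:
            continue  # the data-mined filter: ignore signals on these days
        # Long: open below the prior low, then trade back up through it.
        if cur.open < prev.low and cur.high > prev.low:
            signals.append((cur.day, "long"))
        # Short: open above the prior high, then trade back down through it.
        elif cur.open > prev.high and cur.low < prev.high:
            signals.append((cur.day, "short"))
    return signals


# Made-up prices purely to exercise the logic.
bars = [
    Bar(date(2013, 1, 7), 100, 105, 95, 102),  # Monday
    Bar(date(2013, 1, 8), 94, 99, 93, 98),     # Tuesday: gap down, rallies through 95
    Bar(date(2013, 1, 9), 100, 101, 97, 98),   # Wednesday: gap up, falls through 99
]

all_signals = oops_signals(bars)
filtered = oops_signals(bars, skip_weekdays={2, 3})  # skip Wednesday and Thursday
```

With the filter switched on, the Wednesday short disappears from the results, which is exactly the kind of rule that can only be justified by looking back at the data.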
Unfortunately we can’t test the theory on the original market because the E-mini S&P futures trade almost 24 hours a day, so exhaustion gaps are few and far between. What we can show is how skipping trading days can lead to data mining. For this example we will use the Australian Share Price Index futures, which give us access to day-session-only data. With day-session data, exhaustion gaps will occur more often after US trading.
Step 1: We’ll start with in-sample data from 1983 through 2013 (pretend you picked up the book in 2013 and decided to test this theory for yourself).
Step 2: We’ll optimise for each day of the week to see what the best trading days are, both long and short.
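The day-of-week breakdown in Step 2 amounts to grouping trade results by entry day and direction. A rough sketch follows, assuming each trade is stored as an `(entry_date, direction, pnl)` tuple. That layout, the function name and the numbers are made up for illustration and are not the results of our test.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def pnl_by_weekday(trades):
    """Sum profit and loss per (weekday, direction) pair."""
    totals = defaultdict(float)
    for entry_date, direction, pnl in trades:
        totals[(WEEKDAYS[entry_date.weekday()], direction)] += pnl
    return dict(totals)


# Hypothetical trades, not real results.
trades = [
    (date(2013, 1, 7), "long", 120.0),    # Monday
    (date(2013, 1, 8), "short", 40.0),    # Tuesday
    (date(2013, 1, 11), "long", -90.0),   # Friday
    (date(2013, 1, 14), "long", 30.0),    # Monday
    (date(2013, 1, 18), "short", -60.0),  # Friday
]

breakdown = pnl_by_weekday(trades)
```

Sorting a table like this and dropping the worst row is all it takes to "optimise" a day filter, which is why it is so tempting.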
It’s obvious that trading the setup on a Friday is a losing proposition, so we’ll skip the Friday trades. Here’s what the equity curve looks like from 1983 through 2013:
Looks very good. Smooth growth with both longs and shorts complementing each other. With these results you’d certainly be tempted to trade the strategy going forward.
So let’s now test on out-of-sample data, which is 2013 through to today. To reiterate, we’re trading the Oops! pattern both long and short except on Fridays.
What a disaster. The strategy is barely profitable.
Lastly, the table below shows that over the last 10 years, Friday was actually profitable. Indeed, Tuesday, Wednesday and Thursday are all losing propositions.
The markets are dynamic and constantly changing. The more rules in the strategy, and the more refined those rules are, the less robust the strategy will be in the future. There is no logical reason for skipping a specific day of the week. Indeed, the only reason would be to make the data look better.
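The whole exercise can be sketched as a simple check: pick the worst weekday using only in-sample trades, then see what that filter does to out-of-sample results. The function name, the `(entry_date, pnl)` trade layout and the figures below are assumptions for illustration only, not our actual test output.

```python
from collections import defaultdict
from datetime import date


def check_day_filter(trades, split):
    """Choose the worst in-sample weekday, then score the filter out of sample.

    trades: iterable of (entry_date, pnl); split: first out-of-sample date.
    Returns (worst_weekday, oos_pnl_unfiltered, oos_pnl_filtered),
    with weekdays numbered Monday=0 ... Friday=4.
    """
    trades = list(trades)
    in_sample = [(d, p) for d, p in trades if d < split]
    out_sample = [(d, p) for d, p in trades if d >= split]

    totals = defaultdict(float)
    for d, p in in_sample:
        totals[d.weekday()] += p
    worst = min(totals, key=totals.get)  # the day a data miner would drop

    oos_all = sum(p for _, p in out_sample)
    oos_filtered = sum(p for d, p in out_sample if d.weekday() != worst)
    return worst, oos_all, oos_filtered


# Hypothetical trades: Friday loses in sample but wins out of sample.
trades = [
    (date(2012, 1, 2), 100.0),  # Monday, in sample
    (date(2012, 1, 3), 50.0),   # Tuesday, in sample
    (date(2012, 1, 6), -80.0),  # Friday, in sample
    (date(2014, 1, 3), 70.0),   # Friday, out of sample
    (date(2014, 1, 6), 20.0),   # Monday, out of sample
    (date(2014, 1, 7), -60.0),  # Tuesday, out of sample
]

worst, oos_all, oos_filtered = check_day_filter(trades, split=date(2013, 1, 1))
```

In this toy data the filter drops Friday, and the out-of-sample result goes from a small profit to a loss. A filter chosen from past data carries no guarantee once the data changes.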
If you want to learn more about system design and how to create profitable systematic trading systems, consider our Beginners Guide To Building Trading Systems course, which has been extremely popular and has more chapters like this.