Results obtained from machine learning are in many cases fitted to the price series. This is not necessarily bad in principle, but it is often hard to differentiate the good strategies from random and curve-fitted ones. The data-mining bias introduced by the repeated use of the same data and combinations of features almost guarantees that eventually some random strategies will pass validation tests by luck alone. When market conditions change and the over-fitting no longer works in the strategy's favor, the performance of these fitted strategies deteriorates fast.
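To make the data-mining bias concrete, here is a minimal sketch, not taken from the original text, that tests thousands of purely random entry rules on the same synthetic return series with zero edge. The Sharpe cutoff of 1.0 and all other figures are arbitrary assumptions, yet several rules pass the test by luck alone.

```python
# A minimal sketch of data-mining bias: many random entry rules tested on the
# same zero-edge return series, with some passing a naive validation cutoff.
import numpy as np

rng = np.random.default_rng(7)

n_bars = 2000
returns = rng.normal(0.0, 0.01, n_bars)     # synthetic daily returns, no edge

n_candidates = 5000                          # strategies tried on the same data
best_sharpe = -np.inf
passing = 0

for _ in range(n_candidates):
    signals = rng.choice([-1, 0, 1], size=n_bars)   # a purely random rule
    strat_returns = signals * returns
    sharpe = np.sqrt(252) * strat_returns.mean() / strat_returns.std()
    best_sharpe = max(best_sharpe, sharpe)
    if sharpe > 1.0:                                 # naive "validation" cutoff
        passing += 1

print(f"random strategies with Sharpe > 1.0: {passing} of {n_candidates}")
print(f"best Sharpe found by luck alone: {best_sharpe:.2f}")
```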
Below are some criteria one can use to minimize the possibility of selecting a random strategy due to selection and data-mining bias.
(1) The underlying process that generated the equity curve must be deterministic. If randomness and stochasticity are involved and each run of the process yields a different strategy with the best equity curve, then there is a high probability that the process is not reliable or is not based on sound principles. The justification is that it is impossible for a large number of edges to exist in a market, so most of those strategies must be flukes. DLPAL employs a unique deterministic machine learning algorithm: each time it uses the same data with the same parameters, it generates the same output. A simple reproducibility check is sketched after this list.
(2) The strategy must be profitable with a small profit target and a stop-loss set just outside the 1-bar volatility range; an example of sizing such exits is sketched after this list. If it is not, then the probability is high that the strategy possesses no intelligence in timing entries. This is because a large class of exits, trailing stops for example, curve-fit performance to the price series, and if market conditions change in the future the strategy will fail.
(3) The strategy must not involve indicators with parameters that can be optimized to obtain the final equity curve. If there are such parameters, then the data-mining bias increases with the number of parameters involved, making it extremely unlikely that the strategy possesses any intelligence, because it is most probably fitted to the data. The calculation after this list shows how quickly the chance of a lucky pass grows with the number of parameter combinations tried.
(4) If the results of an out-of-sample test are used to reset the machine learning process and start a fresh run, data-snooping bias is introduced. In this case, validation on that out-of-sample beyond the first run is useless, because the out-of-sample has already become part of the in-sample. If an additional forward sample is used, then this reduces to the original in-sample design problem, with the possibility that performance in the forward sample was obtained by chance. The last sketch after this list illustrates how an out-of-sample set that drives selection loses its validation value.
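For criterion (1), the following toy sketch illustrates the reproducibility property. The search shown is a stand-in exhaustive scan of simple momentum lookbacks, not DLPAL's actual algorithm; the point is only that a deterministic process returns exactly the same output when run twice on the same data with the same parameters.

```python
# A toy illustration of criterion (1): a deterministic search run twice on the
# same data with the same parameters must return exactly the same result.
# The search below is a stand-in, not DLPAL's actual algorithm.
import numpy as np

def deterministic_search(prices, lookbacks):
    """Exhaustively score simple momentum rules; no randomness involved."""
    returns = np.diff(prices) / prices[:-1]
    best = None
    for lb in lookbacks:                       # fixed, ordered enumeration
        signal = np.sign(prices[lb:-1] - prices[:-lb - 1])
        pnl = (signal * returns[lb:]).sum()
        if best is None or pnl > best[1]:
            best = (lb, pnl)
    return best

rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000)))

first = deterministic_search(prices, lookbacks=range(1, 21))
second = deterministic_search(prices, lookbacks=range(1, 21))
assert first == second        # identical output on identical input
print("best lookback:", first[0], "total PnL:", round(first[1], 4))
```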
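For criterion (2), here is a hedged sketch of how the exits could be sized. The 1-bar volatility is proxied by the single-bar average true range, and the 1.2 multiplier that places the target and stop just outside it is an assumption, since the text does not prescribe a specific measure.

```python
# A hedged sketch for criterion (2): size the profit target and stop-loss just
# outside the 1-bar volatility range. Here 1-bar volatility is proxied by the
# average true range over a single bar; the 1.2 multiplier is an assumption.
import numpy as np

def one_bar_volatility(high, low, close):
    """Average true range computed bar by bar (period of 1)."""
    prev_close = np.roll(close, 1)
    prev_close[0] = close[0]
    true_range = np.maximum(high - low,
                            np.maximum(np.abs(high - prev_close),
                                       np.abs(low - prev_close)))
    return true_range.mean()

rng = np.random.default_rng(1)
close = 100 + np.cumsum(rng.normal(0, 0.5, 500))
high = close + np.abs(rng.normal(0, 0.3, 500))
low = close - np.abs(rng.normal(0, 0.3, 500))

vol = one_bar_volatility(high, low, close)
profit_target = 1.2 * vol        # just outside the typical 1-bar move
stop_loss = 1.2 * vol            # symmetric exit, no trailing or optimized stop
print(f"1-bar volatility: {vol:.2f}  target/stop: {profit_target:.2f}")
```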
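For criterion (3), a small calculation shows why optimizable parameters inflate the data-mining bias. Assuming a 5% chance that any single parameter combination passes a test by luck, the chance that at least one combination passes grows rapidly with the number of combinations tried.

```python
# Each optimizable parameter value adds another chance for a random strategy to
# pass a test by luck. With a 5% per-test false-positive rate, the family-wise
# chance of at least one fluke is 1 - (1 - alpha) ** n_combinations.
alpha = 0.05
for n_combinations in (1, 10, 50, 200, 1000):
    p_fluke = 1 - (1 - alpha) ** n_combinations
    print(f"{n_combinations:5d} parameter combinations -> "
          f"P(at least one lucky pass) = {p_fluke:.3f}")
```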
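For criterion (4), the sketch below mimics data snooping: many random rules are scored on an out-of-sample segment and the best one is selected, which is effectively what happens when out-of-sample results drive a fresh run. The selected rule looks profitable on the reused out-of-sample data but shows no edge on a genuinely fresh forward sample. All data here are synthetic with zero edge by construction.

```python
# Once the out-of-sample result is used to pick or restart the search, it is
# effectively in-sample. Selecting the best of many random rules by its
# out-of-sample score looks good, but the edge vanishes on a fresh forward
# sample that was never touched during selection.
import numpy as np

rng = np.random.default_rng(3)
oos_returns = rng.normal(0.0, 0.01, 500)       # "out-of-sample" data, reused
fwd_returns = rng.normal(0.0, 0.01, 500)       # true forward sample, used once

candidates = [rng.choice([-1, 0, 1], size=500) for _ in range(2000)]

# selection driven by out-of-sample performance (this is the snooping step)
oos_pnl = [float((sig * oos_returns).sum()) for sig in candidates]
best = candidates[int(np.argmax(oos_pnl))]

print(f"best rule, out-of-sample PnL: {max(oos_pnl):+.3f}")
print(f"same rule, forward-sample PnL: {(best * fwd_returns).sum():+.3f}")
```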