Deflated Sharpe Ratio: Fix Skewness and Multiple Testing

Every quant eventually faces this moment: a backtest Sharpe Ratio of 2.1 lands on the desk, the strategy looks brilliant, and everyone in the room gets excited. Then it blows up live. The Sharpe Ratio, as most traders use it, is a deceptively incomplete statistic — it quietly ignores the shape of your return distribution and the number of strategies you tested to find it.

Bailey and Lopez de Prado tackled exactly this problem in their 2014 Journal of Portfolio Management paper. Their Deflated Sharpe Ratio (DSR) adjusts the classic metric for three silent killers: negative skewness, excess kurtosis, and the multiple testing problem. The result is a probability: specifically, the probability that the strategy's true Sharpe is above zero after accounting for all the ways you could have fooled yourself.

Think of it like a job interview where the candidate has applied to 200 firms. If one firm hires them based purely on a great interview, is that evidence of talent, or just luck? The DSR framework asks the same question of your strategy. A standalone Sharpe looks impressive; deflated against the number of configurations you tried and the non-normality of returns, it can collapse dramatically.
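To make the selection effect concrete, here is a small simulation sketch. It is not from the original paper: the function names, the 200 trials, the one-year daily window and the 1% daily volatility are all illustrative assumptions. Every simulated strategy has a true Sharpe of exactly zero, yet the best of the batch can still look like a winner.

```python
import random
import statistics


def annualised_sharpe(returns, periods_per_year=252):
    """Annualised Sharpe ratio of a list of per-period excess returns."""
    mean = statistics.fmean(returns)
    stdev = statistics.stdev(returns)  # sample standard deviation
    return mean / stdev * periods_per_year ** 0.5


def best_of_n_skillless(n_trials=200, n_days=252, daily_vol=0.01, seed=42):
    """Simulate n_trials strategies with zero true edge (hypothetical setup).

    Returns the highest and the average annualised Sharpe across the trials.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sharpes = []
    for _ in range(n_trials):
        # Zero-mean Gaussian daily returns: no skill by construction.
        returns = [rng.gauss(0.0, daily_vol) for _ in range(n_days)]
        sharpes.append(annualised_sharpe(returns))
    return max(sharpes), statistics.fmean(sharpes)


best, average = best_of_n_skillless()
print(f"best of 200 skill-less strategies: {best:.2f}, average: {average:.2f}")
```

The average across trials sits near zero, as it should, while the maximum is what a trader who only reports the winning configuration would present.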

The maths behind DSR builds on the Probabilistic Sharpe Ratio (a close relative of the Minimum Track Record Length) and a benchmark Sharpe set to the expected maximum of a sample of IID normal variables, which is essentially the Sharpe you'd expect to see by pure chance given how many strategies you tried. Negative skewness (strategies that occasionally blow up) and excess kurtosis (fat tails) both widen the uncertainty around your observed Sharpe, shrinking the DSR probability further whenever that Sharpe sits above the benchmark. A crash-prone mean-reversion strategy with gorgeous average returns is exactly the kind of candidate this framework punishes most. Traders wanting the full technical derivation can read about the Sharpe Ratio's foundations on Investopedia, explore the broader context of the multiple comparisons problem on Wikipedia, and review the statistical concept of kurtosis on Wikipedia before working through Bailey and Lopez de Prado's original derivation.
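The calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation. It uses the Probabilistic Sharpe Ratio form, PSR(SR*) = Φ[(SR − SR*)·√(T − 1) / √(1 − γ3·SR + ((γ4 − 1)/4)·SR²)], with the benchmark SR* set to the expected maximum Sharpe across N skill-less trials. The function names and every input value (two years of daily data, 200 trials, an annualised dispersion of 0.5 across trial Sharpes, skewness of -1.5, kurtosis of 8) are hypothetical. All Sharpe figures inside the functions are per-period, not annualised, and kurtosis is the raw figure (3 for a normal distribution), not excess kurtosis.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
_STD_NORMAL = NormalDist()        # standard normal: mean 0, stdev 1


def expected_max_sharpe(n_trials, sharpe_variance):
    """Benchmark Sharpe: expected maximum across n_trials with zero true edge.

    sharpe_variance is the variance of the per-period Sharpe estimates
    across the trials that were run.
    """
    if n_trials < 2:
        raise ValueError("need at least two trials to deflate")
    z_first = _STD_NORMAL.inv_cdf(1 - 1 / n_trials)
    z_second = _STD_NORMAL.inv_cdf(1 - 1 / (n_trials * math.e))
    blend = (1 - EULER_GAMMA) * z_first + EULER_GAMMA * z_second
    return math.sqrt(sharpe_variance) * blend


def probabilistic_sharpe(sr_hat, sr_benchmark, n_obs, skew, kurt):
    """Probability that the true Sharpe exceeds sr_benchmark."""
    denom = math.sqrt(1 - skew * sr_hat + (kurt - 1) / 4 * sr_hat ** 2)
    z = (sr_hat - sr_benchmark) * math.sqrt(n_obs - 1) / denom
    return _STD_NORMAL.cdf(z)


def deflated_sharpe(sr_hat, n_obs, skew, kurt, n_trials, sharpe_variance):
    """PSR evaluated against the expected-maximum benchmark."""
    sr_zero = expected_max_sharpe(n_trials, sharpe_variance)
    return probabilistic_sharpe(sr_hat, sr_zero, n_obs, skew, kurt)


# Illustrative inputs only (assumptions, not data from the article).
sr_daily = 2.1 / math.sqrt(252)   # annualised Sharpe of 2.1, per-day terms
var_daily = 0.5 ** 2 / 252        # trial Sharpes with annualised stdev 0.5

psr_zero = probabilistic_sharpe(sr_daily, 0.0, 504, skew=-1.5, kurt=8.0)
dsr_normal = deflated_sharpe(sr_daily, 504, 0.0, 3.0, 200, var_daily)
dsr_fat = deflated_sharpe(sr_daily, 504, -1.5, 8.0, 200, var_daily)

print(f"PSR vs zero benchmark:        {psr_zero:.3f}")
print(f"DSR, normal returns:          {dsr_normal:.3f}")
print(f"DSR, skewed fat-tail returns: {dsr_fat:.3f}")
```

Comparing the three printed figures shows the two separate penalties: moving from a zero benchmark to the expected-maximum benchmark accounts for the number of trials, and moving from normal to skewed, fat-tailed returns accounts for the shape of the distribution.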

The practical takeaway is blunt: before presenting any backtest Sharpe, count how many parameter sets, timeframes, or strategy variants you tested to find it. If that number is large and your return distribution has fat tails, the raw Sharpe is almost certainly flattering you.

Run the deflation. If the probability collapses, the strategy needs more out-of-sample evidence — not a live account.

This content is for educational purposes only and does not constitute financial product advice. Past performance is not indicative of future results. Profit Logic Ltd (ACN 688 669 936) accepts no responsibility for errors or omissions in this content or anywhere on this website. Always seek advice from a licensed financial adviser before making investment decisions.
