William Sharpe’s 1975 article, “Adjusting for Risk in Portfolio Performance Measurement,” applied Harry Markowitz’s established framework to the evaluation of investment performance, and came up with the Sharpe ratio, a statistic that is simple in a way that some see as beautiful, but that others see as awfully misleading. This ratio has been a subject of contention ever since.
AllAboutAlpha has sought to keep our readers up to date on the argument, at least since this piece by our Alpha Male.
Why It’s Good to be Sharpe
Standard deviation, or variance, may be used as proxies of risk. Thus, the Sharpe ratio isn’t a measure of return on capital; it is a measure instead of return on risk. And one of its great advantages, de Prado says, is that it is scale invariant. Thus, it is indifferent to leverage, “so long as it is kept constant for each investment.”
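The scale-invariance point is easy to check numerically. The snippet below is a minimal sketch, not anything taken from de Prado: the return figures are made up, and it assumes that leverage simply multiplies every per-period excess return by the same constant.

```python
import statistics

def sharpe_ratio(excess_returns):
    """Mean excess return divided by its sample standard deviation."""
    return statistics.mean(excess_returns) / statistics.stdev(excess_returns)

# Hypothetical per-period excess returns (returns net of the risk-free rate).
returns = [0.02, -0.01, 0.03, 0.01, -0.02, 0.04]

# Assumption: constant 3x leverage scales every excess return by the same factor.
levered = [3.0 * r for r in returns]

sr_plain = sharpe_ratio(returns)
sr_levered = sharpe_ratio(levered)
print(sr_plain, sr_levered)  # the two values agree up to floating-point error
```

The numerator and the denominator are both multiplied by three, so the factor cancels. If the leverage changed from period to period, it would no longer cancel.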
De Prado points us to the way in which the variance of Sharpe ratios increases with negative skewness and positive excess kurtosis. This is intuitive (or, as de Prado puts it, these “signs associated with the moments make sense.”)
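For readers who want to see those signs, one common way of writing the standard deviation of an estimated Sharpe ratio under IID but non-normal returns is given below. The notation is an assumption of this sketch rather than a quotation from de Prado: n is the number of return observations, γ3 the skewness and γ4 the kurtosis of returns (so γ4 = 3 in the Normal case).

$$
\hat{\sigma}\big(\widehat{SR}\big) = \sqrt{\frac{1 - \hat{\gamma}_3\,\widehat{SR} + \frac{\hat{\gamma}_4 - 1}{4}\,\widehat{SR}^{2}}{n - 1}}
$$

With a positive Sharpe ratio, a negative γ3 makes the middle term positive and a γ4 above 3 enlarges the last term, so both widen the uncertainty around the estimate.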
He tells us, too, that Elmar Mertens of the University of Basel proved an important related point in a 2002 paper. Assuming IID returns, he established that the normality assumption on returns themselves may be dropped, and the Sharpe ratio will still converge toward a Normal distribution. This is important because, since 2002, belief in normal distribution of returns in many asset classes has waned. The talk – among academic students of these matters and to some extent among practitioners as well – has been of various sorts of non-normality, fat tails and fractals. The Sharpe ratio continues to be worth discussing precisely because it isn’t limited to the case of normal returns.
Non-normal returns can produce inflationary effects on an unmodified Sharpe ratio. But the necessary modification is a straightforward one: redefine the SR in probabilistic terms. This produces what de Prado unsurprisingly calls the Probabilistic Sharpe Ratio, or PSR.
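A rough sketch of what that redefinition can look like in code follows. It assumes the usual form of the PSR, in which the estimated ratio is compared with a benchmark ratio and the gap is scaled by a standard error that depends on skewness and kurtosis. The function name, parameter names, and input numbers are illustrative only.

```python
import math
from statistics import NormalDist

def probabilistic_sharpe_ratio(sr_hat, sr_benchmark, n_obs, skew, kurt):
    """Estimated probability that the true SR exceeds sr_benchmark.

    sr_hat and sr_benchmark are per-period (not annualized) ratios;
    kurt is raw kurtosis, so a Normal distribution has kurt == 3.
    """
    # Skew- and kurtosis-adjusted scaling of the estimator's standard error.
    denom = math.sqrt(1.0 - skew * sr_hat + (kurt - 1.0) / 4.0 * sr_hat ** 2)
    z_stat = (sr_hat - sr_benchmark) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z_stat)

# Same estimated SR and sample length, two hypothetical return distributions.
psr_normal = probabilistic_sharpe_ratio(0.1, 0.0, 250, skew=0.0, kurt=3.0)
psr_fat_tailed = probabilistic_sharpe_ratio(0.1, 0.0, 250, skew=-1.5, kurt=8.0)
print(psr_normal, psr_fat_tailed)
```

The fat-tailed, negatively skewed series earns less confidence than the Normal one even though the raw Sharpe ratio is identical, which is the sense in which the probabilistic version removes the inflation.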
IID Observations Not Required
Indeed, the generality of the Sharpe ratio is greater than Mertens established. De Prado makes the further point that the Sharpe ratio will converge toward a Normal distribution, not only when the returns are non-normally distributed, but when the observations are not IID; that is, when they are only “stationary and ergodic,” a less demanding test.
To see the difference, consider two possible sorts of sampling. There are a bunch of black marbles and a few white marbles in a jar, and I want to know the percentage of each. Let’s suppose I start by taking a sample of just four marbles. Of these, three are black and one is white. So in my sample, 75% of the marbles are black.
Suppose I’m nervous about concluding that 75% of the marbles in the large jar as a whole are black, so I decide to take another sample. There are two ways I can proceed: I can put the initial sample back into the jar and mix it up well, or I can put that first sample off to the side and take four more out of the slightly depleted jar.
Now, suppose I again get a 3/1 sample. If I had put the first sample back, re-mixed, and taken the second sample, then we might have called the two samplings IID. But if I put the first sample aside and then picked a separate sample, the two observations clearly aren’t IID. The first observation, by depleting the jar somewhat, has changed the reality I’m trying to understand, so the second observation is not independent of the first.
Nonetheless, it may be ergodic, the next-best thing.
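The marble example can be made concrete with a little counting. The jar contents below are hypothetical (the example above never says how many marbles there are), and the sketch only illustrates the loss of independence; it says nothing about ergodicity.

```python
from math import comb

def prob_three_black_one_white(black, white):
    """Chance that four marbles drawn at once are three black and one white."""
    return comb(black, 3) * comb(white, 1) / comb(black + white, 4)

black, white = 30, 10  # hypothetical jar contents

# First sample, drawn from the full jar.
p_first = prob_three_black_one_white(black, white)

# Second sample after returning the first and remixing: the jar is unchanged.
p_second_replaced = prob_three_black_one_white(black, white)

# Second sample after setting aside a first sample of three black, one white.
p_second_set_aside = prob_three_black_one_white(black - 3, white - 1)

print(p_first, p_second_replaced, p_second_set_aside)
```

With replacement, the odds for the second sample do not depend on what the first one showed. With the first sample set aside, the odds shift according to what came out of the jar, which is exactly the dependence described above.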
Anyway, the Sharpe ratio’s own statistical properties don’t rely on the IID character of underlying returns, a reason to regard the Sharpe ratio as a powerful metric. If it has its difficulties, then it is to be mended in an appropriate way, rather than simply erased from the world’s blackboards.
And it does have its problems. Even when SR has been transformed into PSR and non-normal returns have been rendered non-inflationary, there is another source of performance inflation: selection bias.
Selection bias of the sort de Prado has in mind overlaps with the multiple testing problem, the fact that as more and more strategies are tested against the same data set, the chance of getting at least one invalid result, a false positive, steadily increases.
He quotes ethical guideline #8 of the American Statistical Association: “Running multiple tests on the same data set at the same stage of an analysis increases the chance of obtaining at least one invalid result. Selecting the one ‘significant’ result from a multiplicity of parallel tests poses a grave risk of an incorrect conclusion.”
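A back-of-the-envelope calculation shows how quickly that risk grows. The sketch assumes the tests are independent and that each has the same false-positive rate; real backtests on one data set are usually correlated, so the numbers are indicative only.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n_tests independent tests is a false positive.

    alpha is the per-test false-positive rate (assumed equal across tests).
    """
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (1, 5, 20, 100):
    print(n_tests, round(prob_at_least_one_false_positive(n_tests), 3))
```

At a 5% threshold per test, a researcher who tries twenty independent strategies is more likely than not to find at least one that looks significant by chance alone.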
De Prado proposes to allow for multiple tests, while adjusting the Sharpe Ratio a second time to eliminate the risk this multiplicity otherwise introduces. The first adjustment of the SR, as mentioned above, turned it into the PSR. This second adjustment turns it into the “Deflated Sharpe Ratio,” or DSR. Or, as de Prado puts it in red letters, “DSR is a PSR where the rejection threshold is adjusted to reflect the multiplicity of trials.”
It does this in what seems a rather cumbersome way. After all, the beauty of the original formulation of Sharpe’s ratio was its simplicity: excess return divided by the standard deviation of returns. Now we not only have to understand that ratio probabilistically; we also have to deflate it by taking into account the non-normality of the returns (that matter keeps coming back, like a persistent zombie), the length of the returns’ series, the variance of the SRs tested, and the number of independent trials involved in the selection of the investment strategy.
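Cumbersome or not, the calculation fits in a few lines. The sketch below is one reading of the DSR, not de Prado's own code. It assumes the rejection threshold is the approximate expected maximum Sharpe ratio across N independent trials whose true ratio is zero, and it then evaluates a PSR-style probability against that threshold. All names and input numbers are hypothetical.

```python
import math
from statistics import NormalDist

EULER_MASCHERONI = 0.5772156649015329

def expected_max_sharpe(sr_variance, n_trials):
    """Approximate expected maximum SR over n_trials independent trials
    whose true SR is zero; sr_variance is the variance of the tested SRs."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    z = NormalDist()
    return math.sqrt(sr_variance) * (
        (1.0 - EULER_MASCHERONI) * z.inv_cdf(1.0 - 1.0 / n_trials)
        + EULER_MASCHERONI * z.inv_cdf(1.0 - 1.0 / (n_trials * math.e))
    )

def deflated_sharpe_ratio(sr_hat, sr_variance, n_trials, n_obs, skew, kurt):
    """PSR-style probability with the threshold raised for multiple trials.

    sr_hat is a per-period ratio; kurt is raw kurtosis (3 for a Normal).
    """
    threshold = expected_max_sharpe(sr_variance, n_trials)
    denom = math.sqrt(1.0 - skew * sr_hat + (kurt - 1.0) / 4.0 * sr_hat ** 2)
    z_stat = (sr_hat - threshold) * math.sqrt(n_obs - 1) / denom
    return NormalDist().cdf(z_stat)

# The same selected strategy, judged after 2 trials and after 100 trials.
dsr_few = deflated_sharpe_ratio(0.1, 0.0025, 2, 1250, skew=-1.5, kurt=8.0)
dsr_many = deflated_sharpe_ratio(0.1, 0.0025, 100, 1250, skew=-1.5, kurt=8.0)
print(dsr_few, dsr_many)
```

The four extra inputs are the ones listed above: the moments of the returns, the length of the series, the variance of the tested ratios, and the number of trials. Raising the number of trials lifts the threshold, so the same Sharpe ratio looks less convincing the more strategies were tried before it was picked.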