# Applying Occam’s Razor to Factor Analysis

By Alex Botte, CFA, CAIA, Vice President, Two Sigma

The principle of Occam’s razor states that, in searching for explanations or solutions to problems, “entities should not be multiplied unnecessarily.” If one has various models that arrive at the same outcome, one should choose the one with the fewest assumptions.

In this post, we discuss how this principle can be applied to investment decisions, namely around asset allocation and manager evaluation. For both of these exercises, institutional investors may find it useful to focus on select risk factors that appear to drive the majority of risk and return in their portfolios.

One way to do so is to decompose the returns and risk of an investment portfolio using statistical regression techniques. This can involve developing a risk factor model (or what we call a “lens”) through which to analyze the portfolio. Based on our research, we believe that such a lens should be “parsimonious,” generally meaning:

• Constructed with as few factors as possible, and
• Used in combination with statistical methods that limit the results to the most relevant factors.

We believe that parsimony in factor lens construction can both simplify the investment process and enhance the accuracy of factor-based analysis.

WE BELIEVE THAT A PARSIMONIOUS RISK FACTOR LENS CAN SIMPLIFY THE INVESTMENT PROCESS

We believe that using a small number of the most relevant factors in a “less-is-more” approach can simplify the investment process by allowing an investor to focus on their portfolio’s most significant risk drivers. As discussed in the blog post “The Largest Risk Drivers of Portfolios,” our research suggests that only a handful of core factors drive most of the risk in institutional portfolios. We believe that constructing a valid risk lens with a manageable number of factors, in combination with a factor selection methodology, can help investors better understand what these essential factors are, and how they interact.

Why does this matter? While many investors analyze the historical returns of their investments or portfolios, perhaps an even more important task is estimating how portfolios will perform in the future.

Imagine trying to forecast portfolio performance using an unparsimonious set (e.g., dozens or even hundreds) of factors across asset classes, geographies, industries, sectors, and/or styles. Investors might have a view on how a few key markets, like stocks, government bonds, and credit spreads, will perform in the future. After all, many industry experts publish capital market assumptions for these broad asset classes and indicators. However, investors may not have an informed view on more specific geographical factors, such as how certain regions or countries will perform relative to one another within an asset class like equities. It would likely prove harder still to develop forward-looking expectations for various sectors within all those geographies, such as Consumer Discretionary, Healthcare, Technology, and Utilities. And what about market-neutral “style” factors? Investors may have views on a small number of aggregate factors like equity Value, Momentum, or Quality, but maybe not on single-metric factors, such as how a book-to-price factor will perform versus a sales-to-price factor.

To us, the merits of Occam’s razor in forecasting portfolio performance are clear: using a factor lens intended to boil down elemental risks in most institutional portfolios to a smaller number of factors, say around 10-20, can make forecasting portfolio performance much more manageable.

WE BELIEVE THAT A PARSIMONIOUS LENS CAN ENHANCE THE ACCURACY OF FACTOR-BASED ANALYSIS

The advantages of parsimony extend beyond manageability. We believe that using an excessive number of factors to explain the risk of a dependent return stream can result in overfitting. That is, a model may do a great job of explaining investment performance over a particular analysis period in the past, but may not be a good predictor of future performance.

To demonstrate why this can be the case, imagine a “portfolio” whose returns are from 10 die rolls. The “returns” of the portfolio would be random noise, and therefore should be entirely unexplainable retrospectively. Yet, one could try to model the 10 die-roll returns using a combination of ostensibly explanatory factors (such as the weight of the die, the size of the die, the age of the die roller, etc.) that actually have nothing to do with the outcome of the rolls. The resulting model, however, would be fit to the noise of the past die rolls and wouldn’t have any predictive power in determining what the next die roll would be. The greater the number of explanatory factors used, the worse the problem becomes, since one can always find some combination of variables that happen to “explain” the noise of die rolls, just by luck.

The small number of data points – only 10 die rolls in this example – can also be problematic. We believe that adding more factors to an analysis can increase the risk of spurious results, especially with small sample sizes. Unfortunately, limited sample sizes are quite a common constraint in the institutional space, where many managers report returns infrequently (monthly, and even sometimes quarterly) and/or have a limited track record.
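The die-roll thought experiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper `in_sample_r2`, the random seed, and the use of normally distributed random columns as stand-ins for irrelevant "factors" are our own choices, not part of any particular factor model.

```python
import numpy as np

def in_sample_r2(y, X):
    """In-sample R-squared of an ordinary least squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)                     # arbitrary seed (assumption)
rolls = rng.integers(1, 7, size=10).astype(float)  # the 10 die-roll "returns"
noise_factors = rng.normal(size=(10, 9))           # 9 factors unrelated to the rolls

# Fit the rolls on the first k irrelevant factors, for k = 1..9.
r2_by_k = {k: in_sample_r2(rolls, noise_factors[:, :k]) for k in range(1, 10)}
for k, r2 in r2_by_k.items():
    print(f"{k} irrelevant factors: in-sample R^2 = {r2:.2f}")
```

Because each model nests the previous one, the in-sample R-squared cannot fall as factors are added. With nine factors plus an intercept there are as many fitted parameters as die rolls, so the model reproduces the past rolls essentially exactly, even though none of the factors has anything to do with them.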

When conducting factor analysis, we believe that investors should consider the “degrees of freedom” as an important determinant of how much confidence to put in the output of the analysis. We define the degrees of freedom to be approximately equal to the number of return data points in the analysis minus the number of explanatory factors. We believe that a greater number of degrees of freedom provides correspondingly greater confidence in the estimated factor relationships. In other words, investors should use as many return observations as possible, while limiting the factors included to those that they think really matter. A general rule of thumb among statisticians is to have about 10 times as many return observations as factors.
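As a rough sketch, the degrees-of-freedom definition and the 10-to-1 rule of thumb above can be written as two small helpers. The function names and the example inputs (36 monthly returns, 15 factors) are hypothetical and chosen only for illustration.

```python
def degrees_of_freedom(n_obs: int, n_factors: int) -> int:
    """Approximate degrees of freedom: return observations minus explanatory factors."""
    return n_obs - n_factors

def meets_rule_of_thumb(n_obs: int, n_factors: int, ratio: int = 10) -> bool:
    """True if there are at least `ratio` return observations per factor."""
    return n_obs >= ratio * n_factors

# Hypothetical manager with three years of monthly returns, analyzed with 15 factors.
print(degrees_of_freedom(36, 15))    # 36 - 15 = 21
print(meets_rule_of_thumb(36, 15))   # False: the rule would call for about 150 observations
print(meets_rule_of_thumb(60, 6))    # True: five years of monthly returns, 6 factors
```

Under these assumptions, a short monthly track record supports only a handful of factors, which is one reason a compact lens can be attractive for infrequently reported returns.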

There are statistical methods that can help determine which factors really matter. For example, the Akaike information criterion (AIC) is a method that can be used for variable selection. It does so by weighing the tradeoff between the goodness of fit and the simplicity of various models – or essentially approximating Occam’s razor.1
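To make the tradeoff concrete, here is a minimal sketch of AIC-based model comparison on simulated data. It assumes Gaussian regression errors, for which the AIC can be written (up to an additive constant) as n·ln(RSS/n) + 2k, where RSS is the residual sum of squares and k is the number of fitted coefficients. The function `gaussian_aic`, the simulated factor returns, the coefficients, and the seed are all illustrative assumptions, and this is not a description of how any specific factor lens selects its factors.

```python
import numpy as np

def gaussian_aic(y, X):
    """AIC (up to an additive constant) of an OLS fit of y on X plus an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])           # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    rss = float(np.sum((y - A @ beta) ** 2))       # residual sum of squares
    return n * np.log(rss / n) + 2 * A.shape[1]    # fit term plus complexity penalty

rng = np.random.default_rng(42)                    # arbitrary seed (assumption)
n_obs = 60                                         # e.g., five years of monthly returns
factors = rng.normal(size=(n_obs, 8))              # 8 candidate factors (simulated)

# Simulated portfolio: only the first two candidate factors actually drive returns.
returns = 1.0 * factors[:, 0] - 0.8 * factors[:, 1] + rng.normal(scale=0.5, size=n_obs)

# Compare models that use the first k candidate factors, k = 0..8.
aic_by_k = {k: gaussian_aic(returns, factors[:, :k]) for k in range(9)}
best_k = min(aic_by_k, key=aic_by_k.get)           # lower AIC is preferred
for k, aic in aic_by_k.items():
    print(f"{k} factors: AIC = {aic:.1f}")
print("Number of factors with the lowest AIC:", best_k)
```

Adding a factor always lowers the residual sum of squares, but each one also adds 2 to the penalty term, so a factor is kept only if it improves the fit by more than that. In this simulation, the two true drivers reduce the AIC sharply, while the unrelated candidates have to earn their place against the penalty.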

CONCLUSION

We believe that a parsimonious risk lens can benefit investors by helping them focus on the risks that appear to really matter to their portfolio. We also believe that parsimony in factor analysis can allow investors to simplify investment processes, especially those that require forward-looking assumptions, and can significantly aid the interpretation and accuracy of factor analysis.

Does parsimony come at the expense of comprehensiveness? In our view, not necessarily. With thoughtful construction and variable-selection techniques, we believe that a factor lens with relatively few, yet relevant, factors can still cover the majority of risk in institutional portfolios.

See the blog post “The Largest Risk Drivers of Portfolios” for further exploration of how the Two Sigma Factor Lens, a factor risk model, attempts to provide a parsimonious and holistic view of portfolio risks.

Alex Botte, CFA, CAIA is a Vice President at Two Sigma, working on the research team that is focused on new factors and methodologies for the Two Sigma Venn platform. She can be reached at alex.botte@twosigma.com.

REFERENCES

1 Cavanaugh, J. E. (1997). “Unifying the derivations of the Akaike and corrected Akaike information criteria”, Statistics & Probability Letters, Vol. 33, No. 2, pages 201-208.