By Keith Black, PhD, CFA, CAIA, FDP, Managing Director of Content Strategy, CAIA Association
CAIA’s second webinar starring the authors of papers that are required readings for the FDP exam featured Gene Getman, CAIA, a client portfolio manager and product specialist for Lombard Odier Investment Managers. He detailed how Lombard Odier includes big data in their investment process and how things aren’t as easy or as profitable as they might initially appear.
Quantitative investors have long worked with market data and financial data, such as stock prices and earnings reports. These data sets are relatively easy to work with because they are structured and have long time series histories. Alternative data, by contrast, can seem like the Wild West. Many data sets have only three years of history, and some data collectors have practices that may eventually be regulated out of existence. Alternative data may be like a box of chocolates: you never know which 5% of data sources are going to get you alpha.
Getman categorizes alternative data into four sources, some of which are easier to work with than others. Consumer spending and credit card data can still provide useful signals for some companies, offering insight into corporate revenues before the earnings report. Government statistics can be useful, such as registrations for new vehicles or building permits. Contributory databases can hold some interesting information, such as how BlueDot’s machine learning algorithm discovered the coronavirus on Dec. 31, a week ahead of the public reports from the CDC and WHO, by using airline flight data and online news items across world languages. Online behavior, such as web searches and web scraping, can reveal hiring trends at specific companies or trends in searches for a company’s products. Geolocation data is especially time-consuming to process, as you have to lay a map of business locations on top of cell phone signals and physical street maps.
Alternative data recently passed financial data in the amount of data collected and will continue to grow far more rapidly than traditional data sources. The problem, then, is not how much data is available, but the quality of the data and the effort that it takes to work with these unstructured, big data sources. Picking a data set is as challenging as picking a stock. To make matters even more complex, some data sources only provide information about a single industry or a small number of stocks. Building quantitative coverage of 1,000 stocks in a market may require as many as 40 data sets. Each of these 40 data sets must be evaluated for quality and cleanliness and manually mapped to the stocks to which the data pertains.
It can be expensive and time-consuming to build the infrastructure to access and process alternative data. Not only do you have to buy the data, you also have to build a hardware and software infrastructure and deploy a team that is skilled and efficient in loading, cleaning, evaluating, and testing the data. Not all of the expensive data sets are worth buying, and some of the best data sets can be free. The key is how you mix data sources and signals together in a cost-efficient way and build coverage across a variety of stocks. Along the way, you need to train data scientists to think like fundamental analysts. Of course, CAIA and FDP can assist in this training process.
Alternative data sources are generally high-frequency signals. In many cases, you are trying to predict revenues of companies in the current quarter, which means that once the earnings announcement has come, you will exit that trade and look for a new signal. While alpha can be hard to find, other benefits from alternative data can come from finding uncorrelated signals. By adding uncorrelated signals to your quant models, your alpha might become more stable and, importantly, you can step aside from the crowded trades that can result from everyone using the same-old traditional factors on the same-old stocks.
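To make the point about uncorrelated signals concrete, here is a minimal sketch of the standard diversification arithmetic. It is not taken from the webinar. It assumes, purely for illustration, that each signal has the same volatility and the same pairwise correlation with every other signal, and that the signals are blended with equal weights. The function name `blended_volatility` and the 10% volatility figure are hypothetical.

```python
import math


def blended_volatility(sigma: float, n: int, rho: float) -> float:
    """Volatility of an equal-weighted average of n signals.

    Assumes (for illustration only) that every signal has volatility
    `sigma` and every pair of signals has correlation `rho`.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    # Variance of the average: sigma^2 * (1 + (n - 1) * rho) / n
    scale = (1 + (n - 1) * rho) / n
    if scale < 0:
        raise ValueError("rho is too negative to be valid for n signals")
    return sigma * math.sqrt(scale)


if __name__ == "__main__":
    sigma = 0.10  # hypothetical volatility of each signal's alpha
    for rho in (0.0, 0.8):
        for n in (1, 4, 16):
            vol = blended_volatility(sigma, n, rho)
            print(f"rho={rho:.1f}  signals={n:2d}  blended vol={vol:.4f}")
```

Under these assumptions, four uncorrelated signals cut the volatility of the blended signal in half, while four highly correlated signals barely reduce it. That is one way to read the claim that uncorrelated signals can make alpha more stable, although real signals rarely share identical volatilities or correlations.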
Sometimes people who think Apple is a fruit company become fabulously wealthy. Uncorrelated thoughts, indeed.