In aviation, a black box is an on-board record of flight data that can be crucial in tracing the causes of an aircraft accident. The black box is, in reality, painted bright orange; it is designed to be found, opened and inspected.
By contrast, black boxes in the finance industry are opaque repositories of complex algorithms and coding that are tough to break open and study. With more widespread adoption of artificial intelligence, investment managers are starting to ask themselves whether these hard-to-interpret methods are a risk too far.
“When using a more transparent model, the parameters are clear. The assumptions being made about the relationship are clear. Neural networks have a lot more parameters to estimate, requiring a lot more data to get right,” says Yaz Romahi, JP Morgan Asset Management’s chief investment officer of quantitative beta strategies. “While their non-linearity is an advantage in terms of modelling, the lack of transparency means that we don’t actually know what the learnt relationship is.”
Recurrent neural networks are a type of artificial intelligence that JP Morgan AM uses to analyse the sentiment of phrases in analyst reports and other documents.
Rival asset manager Lazard will only employ machine learning – an application of AI – if it can provide an ex-post justification for the algo in its investment strategy. NB Breton Hill uses machine learning for auxiliary functions such as data cleansing or financial report analysis, but not for picking investments. Data Capital Management scraps three in five AI models over problems with interpretability.
The caution among these firms centres on the tension between the need for transparency and the desire for an investment edge. If a firm cannot explain to itself – let alone to clients – why a strategy is succeeding, then how can it replicate that performance in different market scenarios, or avoid a damaging blow-up? The question has implications for risk management as well as investment decision-making.
Jack Kim, chief risk officer at Data Capital Management, sums up black-box investing: “From the user’s perspective, if something goes wrong, it is hard to determine how to fix the problem.”
In recent years, fundamental managers have begun to deploy artificial intelligence as part of a more disciplined, data science-driven approach to investing, in tandem with existing quant tools such as risk factors, rigorous statistical analysis and reams of data.
Many firms have reported early success with simple interpretable machine learning techniques developed from regression, a core statistical discipline. An example is decision trees, which model the factors that contribute to a particular outcome via a series of branching operations.
For these methods, it is relatively straightforward to explain the model’s output, because it is possible to retrace the steps the algorithm made and the data it considered to be important. But in an effort to get more predictive power, many firms have plunged into the murky realm of deep learning and neural networks, which attempt to mimic the complex decision-making networks of the human brain.
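That traceability can be shown with a toy example: a tree-style prediction that records the exact path of decisions it took. The features and thresholds below are invented purely for illustration, not taken from any firm’s model.

```python
# A toy two-level decision tree whose prediction carries its own audit trail.
# Feature names and cut-off values are invented for illustration only.
def predict_with_trace(pe_ratio, momentum):
    trace = []
    if pe_ratio < 15:                      # split 1: valuation
        trace.append(f"pe_ratio={pe_ratio} < 15")
        if momentum > 0:                   # split 2: recent price trend
            trace.append(f"momentum={momentum} > 0")
            return "buy", trace
        trace.append(f"momentum={momentum} <= 0")
        return "hold", trace
    trace.append(f"pe_ratio={pe_ratio} >= 15")
    return "avoid", trace

decision, path = predict_with_trace(12.0, 0.03)
print(decision, "via", " -> ".join(path))  # every step is recoverable
```

Because each branch is an explicit, human-readable rule, explaining why the model reached a given answer is just a matter of reading the path back.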
Deep learning is the powerhouse behind voice recognition technologies such as Siri and Alexa, and behind the way self-driving cars recognise objects on the road. In asset management, its applications have been wide-ranging: from predicting stock price movements and natural language processing of text data such as news and social media, to image classification of satellite imagery and risk modelling.
What are neural networks?
Neural networks consist of a system of neurons – individual processors – connected by flows of data. Each neuron takes input data, performs a non-linear transformation on it, and passes the transformed data on to the next layer of neurons. A neuron assigns a weight to each set of data passed to it by another neuron. These weights are calibrated according to the entire system’s performance: connections that improve performance are given greater weight, while those having less of an effect are given smaller weights.
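The data flow described above can be sketched in a few lines: inputs are weighted, summed and pushed through a non-linear transformation at each layer. This is a minimal illustration with arbitrary random weights, not any firm’s model.

```python
import numpy as np

def relu(x):
    # The non-linear transformation each neuron applies to its weighted inputs
    return np.maximum(0.0, x)

def forward(x, weights):
    # Pass data through successive layers: each layer weights its inputs,
    # sums them (the matrix product), and applies the non-linearity
    for w in weights:
        x = relu(w @ x)
    return x

rng = np.random.default_rng(0)
# Two layers: 3 inputs -> 4 hidden neurons -> 2 outputs (arbitrary sizes)
layers = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
output = forward(np.array([0.5, -1.0, 2.0]), layers)
print(output.shape)  # (2,)
```

In a real network the weights are not random: they are calibrated (trained) so that the final output improves on some performance measure, which is the step the passage above describes.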
The reason deep learning neural networks are so attractive to quants in finance is because they are especially good at finding complex or non-linear relationships that vary over time, economic cycle, or other conditions.
Quant fund Acadian Asset Management, with $96 billion in assets under management, believes non-linear forecasting is one of the biggest applications for deep learning in quant investing. In recent research, the firm argues it is wrong to assume that many of the widely accepted drivers of stock prices are linear.
Seth Weingram, the firm’s director of client advisory, says: “Historically people have tended to make simplifying assumptions about the relationships between corporate attributes and future returns, and historical stock price performance and future returns. But often those assumptions are there to help to simplify the problem in a computational way.”
These simplifications often take the form of assuming a straight-line relationship between a given variable and future returns, but, according to Weingram, there is little reason to believe that is what the relationship actually looks like.
“One of the things machine learning is useful for is to allow an algorithm to infer what the shape of relationship might be. Do we know that’s a straight line relationship? Could it be curved? Could it depend on the environment we’re in, or a particular time period and so on?” Weingram says.
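Weingram’s point can be made concrete with a minimal sketch on synthetic data, where the true relationship is deliberately curved. A straight-line fit misrepresents it; a more flexible fit lets the data determine the shape. The numbers here are fabricated for illustration.

```python
import numpy as np

# Synthetic example: a curved (quadratic) relationship between a signal
# and returns, plus noise. Entirely made-up data for illustration.
rng = np.random.default_rng(42)
x = rng.uniform(-1, 1, 200)
y = 0.5 * x**2 + 0.05 * rng.standard_normal(200)

linear = np.polyfit(x, y, deg=1)     # assumes a straight line
quadratic = np.polyfit(x, y, deg=2)  # lets the data bend the curve

def mse(coeffs):
    # Mean squared error of a polynomial fit against the observations
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

print(f"linear fit MSE:    {mse(linear):.4f}")
print(f"quadratic fit MSE: {mse(quadratic):.4f}")
```

The flexible fit achieves a markedly lower error because the straight line, by construction, cannot capture the curvature – the simplifying assumption Weingram describes.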
Another area where deep learning could help unearth non-linear relationships is alternative data, which refers to new kinds of data not traditionally in investment managers’ toolkit. Examples include patents, social media data, credit card company information, satellite imagery, geolocation data, and website scraping.
Kathryn Kaminski, chief research strategist and portfolio manager at Natixis-affiliated quant fund AlphaSimplex, describes an example of how rainfall affects corn prices. Too much rainfall hits crop yields, as does too little rainfall. The sweet spot is somewhere in the middle.
“Non-linear approaches are better at picking up some of these complex relationships,” she says.
There is a conventional wisdom that with machine learning, you’re automatically turning out the lights. But it doesn’t have to be the case
Seth Weingram, Acadian Asset Management
But finance’s limited dataset, compared with other industries, raises problems. If a researcher in the medical field were to build an AI algorithm to distinguish normal cells from cancerous ones, it would be possible to find patients from around the world – in other words, fresh datasets – on which to test whether the algorithm has learned correctly.
“The problem with finance is we only have one instance of history, and if you’ve learned on that history, you don’t have another way to cross-validate the learned behaviour,” says Romahi at JP Morgan AM. “So the only way to test it is to run it in real life, and that’s risky.”
Paul Moghtader, a member of Lazard Asset Management’s quant team, which runs $15 billion of assets, stresses the importance of data in justifying the use of AI algorithms.
“All our research projects have to start with an investment rationale. Why should this work? Why should this make sense? And machine learning and deep learning, in some sense, starts with the opposite. It starts with, what does the data tell us? And so, understanding it after the fact is crucial.”
Moghtader adds that if the output of an AI algorithm doesn’t have a clear investment rationale, “then we won’t implement it”.
Pick and choose
In the meantime, funds are getting smart about how to balance interpretability with performance.
Ray Carroll, chief investment officer of NB Breton Hill, Neuberger Berman’s quant group, which manages $4 billion, believes the biggest risk is to depend wholly on opaque machines to generate trades.
“If I didn’t enforce any human insight into the investment process, I’m pretty sure that a machine learning trading system would say ‘buy on the dip’ every time because when you calibrate it to the last decade, that’s what would be successful. But that’s a road to disaster if there’s a recession. So, I am wary about handing over the keys to the machine,” he says.
Instead, NB Breton Hill uses complex machine learning algorithms in non-critical scenarios, such as filling in missing data or updating stale data. For example, Carroll says, before a company releases its full financial report, it will sometimes issue a press release with partial financial information. The firm uses AI to analyse information from the company’s peers that have already reported, and to make updated estimates of the financials.
Carroll says he is more comfortable using complex algorithms when their output can be clearly understood, for example in natural language processing of earnings call transcripts. The firm has developed a widget that displays a transcript of the call, highlighting words or phrases the machine interprets as bullish in green, and bearish in red.
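A heavily simplified sketch of such a widget might look like the following, with invented word lists standing in for a learned sentiment model (a real system would derive these from data rather than hard-code them):

```python
# Toy transcript highlighter. The word lists below are invented
# placeholders for what a trained sentiment model would learn.
BULLISH = {"growth", "beat", "record", "strong"}
BEARISH = {"decline", "miss", "headwinds", "weak"}

def highlight(transcript):
    """Wrap sentiment-bearing words in green/red placeholder tags."""
    out = []
    for word in transcript.split():
        key = word.strip(".,").lower()  # crude normalisation
        if key in BULLISH:
            out.append(f"[GREEN]{word}[/GREEN]")
        elif key in BEARISH:
            out.append(f"[RED]{word}[/RED]")
        else:
            out.append(word)
    return " ".join(out)

print(highlight("Revenue growth was strong despite currency headwinds."))
```

The appeal of this kind of tool is exactly the one Carroll describes: the model’s judgement is shown word by word, so a human can see at a glance why a call scored as bullish or bearish.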
Acadian Asset Management leans towards transparency when it needs to understand the most important predictors of returns. The fund is exploring how different indicators of the financial strength of corporations can be used to predict future returns.
“It could be that there are non-linear relationships between those attributes and future returns, which is one motivation for machine learning. On top of that, certain machine learning algorithms pick out what the most important predictors are. So we choose the algorithm that had the most transparency,” Weingram says.
“There is a conventional wisdom that with machine learning, you’re automatically turning out the lights. But it doesn’t have to be the case,” he adds.
If things are interacting in a very complex way that you don’t understand, you don’t have the comfort to risk-manage on a scenario-by-scenario basis
Jack Kim, Data Capital Management
According to Data Capital Management’s Kim, several schools of thought are emerging as to how asset managers should approach explainability.
One is that quants should only use interpretable machine learning algorithms, which limits the scope to very simple, generalised linear models. Another is that they should stick to models such as gradient boosting trees and random forests, which have built-in ways of tracing which features mattered most in the overall decision-making process. Both techniques use decision trees, but they differ in how the results are combined. Gradient boosting trees are built sequentially, with every new tree constructed to correct the errors made by previously trained trees. In random forests, each tree is trained independently on a random sample of the data.
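The sequential-versus-independent distinction can be caricatured in a short sketch using depth-one “stumps” on synthetic data. This is a deliberate simplification of real gradient boosting and random forest implementations; the data and parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: 300 points, 2 features; only feature 0 actually matters
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sign(X[:, 0])

def fit_stump(X, y):
    """Best depth-1 tree (feature, threshold, left/right value) by squared error."""
    best = None
    for f in range(X.shape[1]):
        for t in np.quantile(X[:, f], [0.25, 0.5, 0.75]):
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            err = left.var() * len(left) + right.var() * len(right)
            if best is None or err < best[0]:
                best = (err, f, t, left.mean(), right.mean())
    return best[1:]

def stump_predict(stump, X):
    f, t, lv, rv = stump
    return np.where(X[:, f] <= t, lv, rv)

# Gradient boosting: stumps fitted *sequentially* to the remaining errors
pred = np.zeros(len(y))
for _ in range(20):
    s = fit_stump(X, y - pred)         # each tree corrects what is left over
    pred += 0.5 * stump_predict(s, X)  # shrunken update

# Random forest: stumps fitted *independently* on random bootstrap samples
forest = []
for _ in range(20):
    idx = rng.integers(0, len(y), len(y))  # random sample with replacement
    forest.append(fit_stump(X[idx], y[idx]))
forest_pred = np.mean([stump_predict(s, X) for s in forest], axis=0)

print("boosting accuracy:", np.mean(np.sign(pred) == y))
print("forest accuracy:  ", np.mean(np.sign(forest_pred) == y))
```

Both ensembles also yield a natural interpretability hook: because every stump records which feature it split on, one can count how often each feature was chosen, which is essentially what embedded feature-importance measures do.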
For Kim, interpretable artificial intelligence is one of the systematic hedge fund’s “main concentrations”. It explains why the firm dumps three in every five of the models it develops.
BlackRock acted similarly last year when it decided to mothball liquidity risk models built using neural networks, even though they outperformed decision tree-based models, because they weren’t explainable.
Other firms, too, are questioning whether the use of non-linear AI can leave them exposed to shadowy risks that are hard, if not impossible, to manage.
“If you have net long US equity factor exposure with linear portfolio construction, and there is an expected major market or geopolitical move, and if you don’t want to take that risk, you can reduce or eliminate your exposure to that particular factor, derisk or hedge your positions,” Kim says. “But if things are interacting in a very complex way that you don’t understand, you don’t have the comfort to risk-manage on a scenario-by-scenario basis.”
Kaminski, too, cautions that it can become very hard to predict how non-linear signals combine together and to determine how much risk is being taken.
“If you have a very complicated neural network, it’s very hard to know that you’re not doubling up on the same features for different markets. It becomes complicated fast.”
As complex machine learning permeates the investment management industry, firms are quickly realising the dangers of getting lost in the random forest.
Editing by Alex Krohn