Buy-side quants have long struggled to understand and master the effect of a firm’s own trading on market prices – so-called ‘market impact’.
In particular, the challenge of trading out of big positions is sometimes compared to pushing an elephant into a swimming pool and hoping to avoid the splash. The standard approach for firms wanting an indication of how big a splash to expect has been to examine the impact of similar trades in the past.
But when firms look closely, they find few trades are really that alike. And where similarities or patterns between new and past trades exist, sometimes they are too subtle or change too quickly for traders to spot.
For that reason, firms such as Bloomberg, JP Morgan and Portware have turned to machine learning for help. “It’s only now we have this convergence of technology, faster machine-learning algorithms and a better understanding of how market impact works that we can assemble these components at scale,” says David Fellah, head of algo linear quant research for Europe, the Middle East and Africa at JP Morgan.
Results so far have been promising.
Machine learning can help quants get to grips with the elephant-splash problem in a couple of ways. On one hand, it can complement conventional market impact models. Firms can use artificial intelligence to squeeze more information from sparse historical data, for example, or help identify non-linear relationships in order flow.
Alternatively, in its bolder applications, machine learning can be used to create trading robots that teach themselves how to react to market changes. Both approaches are in use already. And the savings they promise are striking, particularly for large systematic funds that trade heavily.
According to Jean-Philippe Bouchaud, head of research at Capital Fund Management, a systematic fund, as much as two-thirds of the gain on trades can be lost to market impact costs. One hedge fund execution expert says the cost of adverse market impact at their firm reached about $1 million a year – about a tenth of its profit before tax.
Fellah, meanwhile, says the spread between the lower and upper quartile of trader performance is generally as narrow as two basis points: “If you can improve the performance of an algorithm by even a fraction of a basis point, it makes a large difference.” This is due to the sheer number of orders traded with the algorithm.
Current machine-learning techniques include cluster analysis, supervised learning techniques such as Bayesian regression or Random Forest, and reinforcement learning (see box: Machine-learning techniques).
Cluster analysis, first developed more than 70 years ago as a broad statistical technique, is used to identify hard-to-see similarities in complex data. Bayesian regression and Random Forest are techniques that make predictions, assigning probabilities to defined scenarios. And reinforcement learning aims, through many simulations, to train so-called artificial intelligence (AI) agents to choose the best course of action in a particular environment.
While the techniques themselves are not new, their rising adoption is due to increases in computational power and the growing quantity of available data. Advances in the theoretical understanding of market impact and artificial intelligence are factors, too.
Bloomberg is using cluster analysis to fill gaps in the data used to calibrate conventional parametric models. These relatively basic models continue to dominate the industry even when used alongside more sophisticated tools, but are forced to rely on often-sparse historical information.
Bloomberg’s liquidity assessment tool – LQA – gets round that problem by first grouping bonds into broad, intuitively similar buckets, then using cluster analysis to collect together the most comparable products in each bucket.
Every bond is quantitatively measured against a range of common features such as currency, duration, time to maturity and amount outstanding. Those measurements determine its position within a theoretical multi-dimensional space.
For example, for a trade of 500 lots of an obscure US Treasury bond, LQA will identify the other US Treasury bonds that sit the shortest distance away within that space. LQA will then use their combined pool of data to calibrate the parametric model.
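The idea of finding the bonds "the shortest distance away" can be illustrated with a minimal sketch. It is not Bloomberg's LQA method: the bond names, feature values and the choice of a Euclidean distance on standardised features are all assumptions made for illustration, and categorical features such as currency are left out.

```python
import math

# Hypothetical bonds: (duration in years, time to maturity in years,
# amount outstanding in $bn). All values are invented for illustration.
BONDS = {
    "UST_A": (4.6, 5.0, 60.0),
    "UST_B": (4.8, 5.2, 55.0),
    "UST_C": (8.9, 10.0, 70.0),
    "UST_D": (17.5, 30.0, 40.0),
    "UST_TARGET": (4.7, 5.1, 12.0),  # the thinly traded bond we want to price
}


def standardise(bonds):
    """Rescale each feature to zero mean and unit variance so that no single
    feature dominates the distance simply because of its units."""
    columns = list(zip(*bonds.values()))
    means = [sum(col) / len(col) for col in columns]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return {
        name: tuple((x - m) / s for x, m, s in zip(features, means, stds))
        for name, features in bonds.items()
    }


def nearest(bonds, target, k):
    """Return the k bonds closest to `target` in the standardised feature space."""
    scaled = standardise(bonds)
    distances = {
        name: math.dist(point, scaled[target])
        for name, point in scaled.items()
        if name != target
    }
    return sorted(distances, key=distances.get)[:k]


neighbours = nearest(BONDS, "UST_TARGET", k=2)
print(neighbours)
```

In this toy data set the two five-year bonds come out as the nearest neighbours, and their trading history would be the natural pool to borrow from when calibrating a model for the target bond.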
Initially, Bloomberg had tried cluster analysis by itself, running a linear regression model on the clusters to produce expected costs. But this did not provide the results it had hoped for. Clustering alone can result in instabilities, where a small change in the underlying data causes large unexpected changes in cluster composition.
Instead, the company learned it needed to incorporate a parametric model to introduce stability.
“Some applications of clustering were more useful than others,” says Naz Quadri, head of quant engineering and research at Bloomberg Enterprise Solutions. “Our research suggests clustering is most useful, and results are more stable, when it is used with a structural market impact model.”
Portware and JP Morgan, meanwhile, are looking at the problem from a different angle, using artificial intelligence to help identify how the careful timing of trades can minimise market impact.
Both take market impact models that describe how the effect of a trade depends on previous trades as a starting point. In JP Morgan’s transient model, for example, the impact from each trade decays over time. The aim is to avoid scheduling trades too closely together, which can lead to a market impact greater than the sum of its parts.
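A minimal sketch of the decay idea follows. It is not JP Morgan's model: the exponential decay kernel, the half-life and the linear dependence on trade size are assumptions chosen for simplicity. Because the sketch is linear, it shows only how the residual impact of an earlier trade stacks on top of a later one, not any non-linear amplification.

```python
import math


def impact_at(t, trades, half_life=10.0, impact_per_unit=1.0):
    """Price impact at time t: each earlier trade contributes its size,
    scaled down by an exponential decay in the time elapsed since it traded.

    `trades` is a list of (time, size) pairs. The half-life and the linear
    scaling are hypothetical parameters, not calibrated values.
    """
    rate = math.log(2) / half_life
    return sum(
        impact_per_unit * size * math.exp(-rate * (t - when))
        for when, size in trades
        if when <= t
    )


def peak_impact(trades, **kwargs):
    """With a decaying kernel, impact peaks immediately after one of the trades,
    so it is enough to evaluate the impact at each trade time."""
    return max(impact_at(when, trades, **kwargs) for when, _ in trades)


bunched = [(0.0, 100.0), (1.0, 100.0)]   # second trade lands before impact decays
spaced = [(0.0, 100.0), (30.0, 100.0)]   # second trade waits three half-lives

print(round(peak_impact(bunched), 1))
print(round(peak_impact(spaced), 1))
```

Under these assumptions, the bunched schedule carries most of the first trade's impact into the second, while the spaced schedule lets it decay to an eighth of its original size before trading again.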
Such models work well for liquid assets such as equities, but other asset classes are more difficult. “I know some firms try to apply a transient market impact model to fixed income – it doesn’t seem to do quite as well in lower liquidity securities,” says Quadri.
The firms use these models to set out the best possible trading schedules for a range of scenarios, then tweak the schedule as the real trade progresses, using supervised learning techniques such as Bayesian regression or Random Forest to make the short-term predictions determining those tweaks.
Portware’s Bayesian regression method, for example, uses multiple artificial intelligence agents to simultaneously predict short-term volatility, order flow and market volume as a trade progresses. The firm uses market, news and social media data as input.
“The order flow imbalance is actually quite predictable,” says Henri Waelbroeck, Portware’s director of research. “It can be predicted to a reasonable accuracy by auto-regression models, but we find there’s room for more accurate prediction using non-linear methods.”
The AI agents in Portware’s system use these predictions to produce order flow expectations, and to further predict trade risks such as urgency – a measure of how trading more quickly can reduce costs versus trading more slowly. As these predictions are stored in the system’s memory, each agent can use the predictions of other agents to enhance its own.
“The higher-level agents are able to take advantage of the findings of lower algo agents in the same way you would imagine an executive team at a table – a CEO listening to the advice of a group of other people,” Waelbroeck says.
As the trade progresses, if order flow, volatility or volume deviate from expectations, Portware’s system will alert a human trader to opportunities to switch execution algorithms. The expected execution costs of these algorithms are then computed using the market impact model and compared using another pre-trained AI agent.
JP Morgan’s Fellah says it takes on average 30 operations – limit orders, revisions and cancellations – in the limit order book to achieve a single trade due to the fragmentation of liquidity. That leads to the inefficiency of being sent to the back of the queue whenever an order is adjusted.
The bank uses Random Forest to produce short-term order flow predictions, which means the number of such operations can be drastically reduced. Fellah says Random Forest was chosen for its speed.
His team is also beta-testing another application of machine learning to the problem of market impact, looking to use reinforcement learning to teach a sole AI agent to react to order imbalance and queue position in the limit order book.
The firm runs simulations of the limit order book and the agent will use the simulations to optimise its trading schedule. The impacts of its orders are modelled by the transient model and the simulations allow the agent to learn how trades cause market impact.
“If you consider reinforcement learning algorithms for self-driving vehicles or games, those algorithms have to understand the physics of the system in which they are operating,” says Fellah. “It’s no different in finance.”
The idea is to use this simulation framework to train the agent or robot to take optimal actions throughout its life. The training gives the AI an intuition for how trades cause market impact and how that impact will decay with time. The AI gains a sense of how delaying an order or trading at faster rates will influence its possible actions in future.
“One interesting thing about this approach is that we don’t write a single line of code. In a sense, the machine ‘writes’ the algorithm,” says Fellah.
These successes in using machine learning to assess market impact are prompting further research, with Waelbroeck saying his firm, for example, is looking to extend its short-term forecasting techniques to longer-term portfolio risk management.
“This is perhaps a little bit further down the road for us,” he says. “But we are exploring the applicability of the system we’ve created to a different approach to portfolio risk.”
“It could help portfolio managers better prepare their portfolios for future shocks that might not be evident when looking at risk strictly from the standpoint of trailing correlations.”
However, some quants remain sceptical about how far machine learning can help. Quant funds have described elsewhere the difficulty of developing unsupervised algorithms for money-making over the long term.
Nataliya Bershova, head of execution research at Alliance Bernstein, says she prefers to rely on parametric models: “With machine learning, you cannot say for example that factor X is more influential than factor Y. It’s just a black box that tells you there is a much better fit to your real data through this non-parametric technique.
“With most machine-learning techniques, you cannot clearly separate permanent impact and temporary impact. In parametric models, you can, and that’s an important feature,” she says.
Meanwhile, Capital Fund Management’s Bouchaud believes there is too little data for low-frequency machine-learning analysis to be of much use.
“Machine learning is only interesting if there are non-linear effects, right? If everything is linear you can do linear regression.”
At low frequencies, data is too limited to yield new insights or to avoid misleading results, oversampling and overfitting, he says: “All the nonlinear effects machine learning has found at low frequencies we have already found using more traditional data analysis methods or intuition.”
Higher-frequency trading is different, though, he thinks. “At the high-frequency scale of a few seconds or a few minutes, there’s typically enough data that a sort of automated search for correlation, which machine learning is providing, can work.”
Waelbroeck is more optimistic that machine learning can help at the lower frequencies, and sees potential for machine learning to help solve problems far beyond modelling market impact.
In stressed markets, correlation structures change and risk managers rely on the correlations of previous, similar stress events to estimate this change. This is the crudest of prediction methods, he says, and it’s a problem where machine learning can help.
“More advanced methods identify statements that were true in the past event and are most likely to remain true today. The next crisis is not going to be a repetition of the collapse of the ABS market – but there are truths about past crises that will remain true and can help predict how the next crisis will unfold. Machine learning can help uncover these truths.”
Machine learning uses statistical techniques to infer relationships between data. The artificial intelligence (AI) agent does not have an algorithm to tell it which relationships it should find, but learns from the data using statistical analysis to revise its hypotheses.
In supervised learning, the machine is presented with examples of input data together with the desired output. The agent works out a relationship between the two and uses this relationship to make predictions given further input data.
Supervised learning techniques, such as Bayesian regression or Random Forest, are useful where firms have a flow of input data and would like to make predictions.
Unsupervised learning, in contrast, does without learning examples. The agent instead tries to find relationships between input data by itself. Unsupervised learning can be used for grouping problems, determining which data points are similar to each other, as in cluster analysis.
Portware’s AI agents use Bayesian regression to calculate probabilities of order flow scenarios. The firm first takes data from markets, social media and news sources to determine the probabilities of various degrees of order imbalance.
“You classify the possible outcomes in a collection of scenarios, compute the probability of each scenario and compute expectations as probability-weighted means. If we’re talking about an order flow predictor, for example, we might classify the future order flow as very positive, positive, neutral, negative, or very negative,” says Waelbroeck.
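The probability-weighted mean Waelbroeck describes is simple arithmetic. In the sketch below, the five scenario labels come from his example, while the numeric score attached to each scenario and the probabilities are hypothetical; in practice the probabilities would come from the trained predictor.

```python
# Hypothetical numeric score for each order flow scenario
# (for example, signed imbalance in arbitrary units).
SCENARIO_VALUES = {
    "very positive": 2.0,
    "positive": 1.0,
    "neutral": 0.0,
    "negative": -1.0,
    "very negative": -2.0,
}


def expected_value(probabilities, values=SCENARIO_VALUES):
    """Probability-weighted mean of the scenario values."""
    total = sum(probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, expected 1.0")
    return sum(p * values[name] for name, p in probabilities.items())


# Invented probabilities standing in for a model's output.
predicted = {
    "very positive": 0.10,
    "positive": 0.30,
    "neutral": 0.40,
    "negative": 0.15,
    "very negative": 0.05,
}

expected = expected_value(predicted)
print(round(expected, 4))  # 0.25
```

Here the forecast leans mildly positive: the expected score of 0.25 sits between "neutral" and "positive" on the hypothetical scale.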
JP Morgan takes incoming market data for orders in the limit order book and uses Random Forest to output a single forecast of order flow direction over the next 20 or more ticks, together with its probability.
Cluster analysis takes a simple idea – the notion of physical objects clustering in three-dimensional space – but applies it in a theoretical space of many more dimensions.
Taking bonds, for example, each dimension might represent a feature of the bond: duration, maturity, value outstanding, or currency, and so on. Just as objects have positions in three-dimensional space and a distance between them, so bonds have positions in this theoretical feature-space with distances between them. These distances determine how ‘similar’ the bonds are.
Bloomberg’s liquidity assessment tool – LQA – aims to cluster bonds with sufficiently similar behaviour so their historical data can be shared and used to make general predictions for all bonds in that cluster.
In reinforcement learning, an artificial intelligence agent learns how to choose optimal actions when presented with a particular environment.
JP Morgan runs simulations of the limit order book to teach an AI agent how to trade. The AI agent will perform trades in the simulation with the expected costs of its orders modelled by a transient model.
The robot learns by putting a value on each possible action for a given market environment based on its expected cost, fill probability and order size. During the simulation, actions that reduce impact costs over the entire trading period and allow greater freedom of action in the future receive a higher value. The agent therefore learns which actions maximise future expected value and is then able to make optimal decisions during a real trade.
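The value-learning loop can be sketched with tabular Q-learning on a toy problem. This is an illustration under heavy assumptions, not JP Morgan's system: there is no simulated order book, the agent must sell four lots over four steps, and the cost function (own size plus half of the previous trade's size) is an invented stand-in for a transient impact model. Fill probability is ignored.

```python
import random

TOTAL_LOTS, STEPS, DECAY = 4, 4, 0.5  # hypothetical toy parameters


def step_cost(size, previous_size):
    """Toy transient impact: a trade pays for its own size plus the residual
    impact left over from the previous step's trade."""
    return size * (size + DECAY * previous_size)


def allowed_actions(t, remaining):
    """Lots the agent may sell at step t; the final step must sell the rest."""
    return [remaining] if t == STEPS - 1 else list(range(remaining + 1))


def train(episodes=5000, alpha=0.5, epsilon=0.1, seed=0):
    """Tabular Q-learning over repeated simulated executions."""
    rng = random.Random(seed)
    q = {}  # (step, lots remaining, previous size, action) -> value estimate
    for _ in range(episodes):
        remaining, previous = TOTAL_LOTS, 0
        for t in range(STEPS):
            actions = allowed_actions(t, remaining)
            if rng.random() < epsilon:  # occasionally explore at random
                action = rng.choice(actions)
            else:  # otherwise take the action currently valued highest
                action = max(
                    actions, key=lambda a: q.get((t, remaining, previous, a), 0.0)
                )
            reward = -step_cost(action, previous)
            left = remaining - action
            future = 0.0
            if t + 1 < STEPS:  # best value available from the next state
                future = max(
                    q.get((t + 1, left, action, a), 0.0)
                    for a in allowed_actions(t + 1, left)
                )
            key = (t, remaining, previous, action)
            old = q.get(key, 0.0)
            q[key] = old + alpha * (reward + future - old)
            remaining, previous = left, action
    return q


def greedy_schedule(q):
    """Follow the highest-valued action at every step of a fresh execution."""
    schedule, remaining, previous = [], TOTAL_LOTS, 0
    for t in range(STEPS):
        action = max(
            allowed_actions(t, remaining),
            key=lambda a: q.get((t, remaining, previous, a), 0.0),
        )
        schedule.append(action)
        remaining, previous = remaining - action, action
    return schedule


q_table = train()
schedule = greedy_schedule(q_table)
print(schedule)
```

In this toy setting the agent should settle on spreading the order evenly, one lot per step, because larger or back-to-back trades pay more in impact. Nobody codes that rule in; it emerges from the values the agent assigns to actions during simulation.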