
Machine learning could solve optimal execution problem
Reinforcement learning can be used to optimally execute order flows
Humans and models are usually not good at handling too much information. When the number of factors involved in a decision-making process increases, the modelling of the decision process and its various outcomes becomes unwieldy and time-consuming.
In recent years, machine learning has stepped in to solve that problem.
One area within finance that has consistently attracted a large amount of research from the buy side is market impact, or the effect of large orders on market price.
If a trade cannot be executed in one go, because of a lack of liquidity at the prevailing market price, it is broken into a series of smaller trades. But this exposes the trader to the risk of the market moving against them while those trades are being executed.
Solutions range from a simple limit on the time taken to execute the trade to limits on the price at which the trade is executed. When either limit is breached, the firm stops trading – but these two methods do not always result in the optimal execution, the one that maximises wealth and minimises costs for the trader. More sophisticated institutions use dynamic programming to update the execution algorithm to reflect changing market conditions. This means having to use computationally intensive numerical techniques to find the optimal trading strategy.
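The two simple limits described above can be sketched in a few lines of Python. This is only an illustration for a buy order: the function name, the price path and the limit values are hypothetical and are not taken from any firm's execution logic.

```python
from typing import List, Tuple


def execute_with_limits(prices: List[float], total_qty: int, child_qty: int,
                        max_steps: int, limit_price: float) -> Tuple[int, float]:
    """Buy up to total_qty in child orders; stop when a time or price limit is breached.

    All names and parameters here are illustrative assumptions.
    """
    filled = 0
    cost = 0.0
    for step, price in enumerate(prices):
        if filled >= total_qty:
            break                   # parent order complete
        if step >= max_steps:
            break                   # time limit breached
        if price > limit_price:
            break                   # price limit breached (buy order)
        qty = min(child_qty, total_qty - filled)
        filled += qty
        cost += qty * price
    avg_price = cost / filled if filled else 0.0
    return filled, avg_price


# Hypothetical price path: the fourth price breaches the limit,
# so the order stops with part of the quantity still unfilled.
filled, avg_price = execute_with_limits(
    prices=[100.0, 100.2, 100.5, 101.2, 100.9],
    total_qty=1000, child_qty=300, max_steps=4, limit_price=101.0)
```

In this made-up example the rule stops trading as soon as the price limit is hit, even though the price falls back on the next step, which is one way such static limits can fall short of the best achievable execution.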
In all these cases, the limitations arise from the fact that the various dynamics need to be modelled. But what if there were no need for a model?
In a recent technical article, Machine learning for trading, Gordon Ritter, a senior portfolio manager at GSA Capital Partners in New York, applies a machine learning technique called reinforcement learning to simulate market impact and find an optimal trading strategy that maximises the value of the trade adjusted for its risk.
Common machine learning techniques include cluster analysis, which is used to identify hard-to-see similarities and patterns in complex data. In supervised learning techniques, which include Bayesian regression and random forests, the agent can learn from example data and associated target responses.
Another technique is reinforcement learning, which tries to train the machine, through a large number of simulations, to choose the best course of action in a particular environment, so when the machine is ready to trade in real life, it already knows what the optimal course of action is, based on its training.
In the paper, Ritter applies reinforcement learning to trading by giving the agent the task of maximising the expected utility of the trade – that is, the value of the trade less all associated costs, and adjusted for the risk of the trade. “It allows you to learn the optimal strategy in a way that is fully cognisant of any kind of cost, so your own impact on the price is a really big source of cost for quant traders. But other kinds of costs, such as bid-offer spreads, commissions, borrowing costs – all those would get factored into the reward,” says Ritter.
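One way to picture a reward that is net of costs and adjusted for risk is sketched below. The article does not give the functional form, so the mean-variance style penalty, the function name and the `risk_aversion` parameter are assumptions made purely for illustration.

```python
from typing import Iterable


def risk_adjusted_reward(pnl_change: float, costs: Iterable[float],
                         risk_aversion: float) -> float:
    """Per-step reward: wealth change net of all costs, less a risk penalty.

    The quadratic penalty is an assumed mean-variance style adjustment,
    not a form confirmed by the article.
    """
    net = pnl_change - sum(costs)   # impact, spread, commissions, borrowing, ...
    return net - 0.5 * risk_aversion * net ** 2


# Hypothetical numbers: a gain of 10 with costs of 1 (spread), 2 (impact),
# 0.5 (commission) and 0.5 (borrowing), and a risk aversion of 0.01.
reward = risk_adjusted_reward(10.0, costs=[1.0, 2.0, 0.5, 0.5], risk_aversion=0.01)
# net = 6.0, penalty = 0.18, so the reward is roughly 5.82
```

Because every cost enters through the same number, the agent never needs to be told which cost is which; it only sees that some actions lead to lower rewards.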
What has always restricted traditional optimal execution algorithms is the number of factors that can be used in the models. The larger this number gets, the more difficult the problem is to solve. This restriction does not apply to reinforcement learning, as the machine learns by trial and error, experiencing different states of the world and figuring out the optimal path of execution on its own. “The agent is learning about the optimal strategy and the cost without actually building a model. In the training process, the agent tries all sorts of things and gets to observe the reward and basically correct his algorithm,” says Ritter.
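The trial-and-error idea can be illustrated with tabular Q-learning on a toy liquidation problem. This is a minimal sketch, not a reproduction of the set-up in Ritter's paper: the quadratic impact cost, the noise term, the tiny state space and every parameter value are hypothetical. The agent only ever sees the reward; the cost formula lives inside the simulated environment.

```python
import random


def allowed_actions(t, inv, horizon):
    """Lots that may be sold at step t: up to inv, but all of it at the deadline."""
    return [inv] if t == horizon - 1 else list(range(inv + 1))


def train_execution_agent(inventory=4, horizon=4, impact=1.0, noise=0.5,
                          episodes=20000, alpha=0.1, epsilon=0.2, seed=0):
    """Tabular Q-learning for selling `inventory` lots within `horizon` steps."""
    rng = random.Random(seed)
    # q[t][inv][a]: estimated value of selling a lots at step t while holding inv lots
    q = [[[0.0] * (inv + 1) for inv in range(inventory + 1)]
         for _ in range(horizon)]
    for _ in range(episodes):
        inv = inventory
        for t in range(horizon):
            actions = allowed_actions(t, inv, horizon)
            if rng.random() < epsilon:
                action = rng.choice(actions)                       # explore
            else:
                action = max(actions, key=lambda a: q[t][inv][a])  # exploit
            # Simulated environment (assumed): quadratic impact cost plus
            # zero-mean price noise on the lots sold. The agent sees only `reward`.
            reward = -impact * action ** 2 + rng.gauss(0.0, noise) * action
            next_inv = inv - action
            if t == horizon - 1:
                future = 0.0
            else:
                future = max(q[t + 1][next_inv][a]
                             for a in allowed_actions(t + 1, next_inv, horizon))
            # Standard Q-learning update towards reward plus best future value
            q[t][inv][action] += alpha * (reward + future - q[t][inv][action])
            inv = next_inv
    return q


def greedy_schedule(q, inventory, horizon):
    """Read off the learned schedule by always taking the highest-valued action."""
    schedule, inv = [], inventory
    for t in range(horizon):
        action = max(allowed_actions(t, inv, horizon), key=lambda a: q[t][inv][a])
        schedule.append(action)
        inv -= action
    return schedule


q_table = train_execution_agent()
schedule = greedy_schedule(q_table, inventory=4, horizon=4)
```

With a quadratic impact cost, spreading the order evenly is the cheapest schedule, so the learned schedule would be expected to drift towards equal slices as training proceeds. The exact result depends on the seed and settings, and a real application would involve a far richer state and reward than this toy.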
The author says millions of scenarios can be run during the training process in less than a second. The only caveat is that the approach in the paper applies to single-asset trading, such as trading a single stock. If multiple assets are involved, the training process would be slower, but once training is complete, the technique can be used to trade in real time.
Reaping the benefits
Many have been quick to try and reap the benefits of machine learning in the modelling of market impact. Firms such as Portware and JP Morgan are already using supervised machine learning approaches to model market impact. The latter is also testing the use of reinforcement learning to optimise its trading schedule.
Machine learning has also found its way into many other applications, such as model validation, improving the pitching of trade ideas and credit underwriting.
One common criticism of machine learning, especially from regulators, is that the way it works is not transparent, so when things go wrong, it is difficult to pinpoint the source of the problem. A related concern is whether the machine learning technique itself is a model and hence should be backtested – but it is not very clear how to do that.
For that reason, while developing machine learning applications for trading activities that could potentially affect markets, simultaneous strides must be made in improving the way machine learning approaches can be tested. That way, firms can leverage two things – the diligence and speed with which machines can trawl through large datasets, and the ability of humans to adapt and find solutions when things go wrong.
Copyright Infopro Digital Limited. All rights reserved.