# Data mining, machine learning and problems with autocalls

## The week on Risk.net, January 19–25, 2019

Banks bet on data to rescue research

Barclays, Morgan Stanley, UBS among those using data science to pep up their research offerings

Arnott, Harvey: machine learning dangerous when data thin

Experts warn ML should be used “for its correct purpose” – not for prying long-term strategies from sparse information

Eurostoxx dislocations signal autocall hedging pain

Swings in dividends and volatility reveal year-end stress as European index slump tests “peak vega”

COMMENTARY: Digging too deep

Data is the latest of many hopes for banks and other investors looking for improved returns in a lacklustre environment. Several banks have begun to point their research teams at big data – using internal data, purchased databases or new research to collect huge quantities of data points, which can then be analysed using the new technology of machine learning (ML).

UBS seems to be in the lead at present, but Morgan Stanley, BNP Paribas and many others are following. And this combination is being applied elsewhere as well; last week Risk looked at HSBC’s client intelligence unit, which aims to use internal client data to generate new sales leads for existing customers. Standard Chartered’s data analytics group earned its head, Alexei Kondratyev, the 2019 Risk Award for quant of the year, based on the group’s machine learning work.

Awkwardly, this is also the week in which some serious doubt is cast on the foundations of the whole project. In too many cases the data is simply too thin to produce useful and robust conclusions, according to quant investing experts Rob Arnott and Campbell Harvey, who warned that using powerful machine learning techniques on small data sets was a dangerous habit. Harvey has form here – last year he was one of several academics warning that bad statistical practice was letting spurious machine-learning discoveries pass for real.

As we noted last year, quantitative finance was only one of many fields of research that suffered a crisis of reproducibility in which many keystone results were found to be spurious, the result of poor statistics, wishful thinking and a motivation to publish positive findings. The last is the most serious: if academia is distorted by the pressure to publish, it’s difficult to think that the pressure is not even heavier in finance, where being first with a slightly improved model holds out the prospect of massive financial reward.
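The statistical worry can be made concrete with a small simulation. What follows is a minimal sketch, not anything taken from Arnott, Harvey or the banks mentioned: it assumes a hypothetical researcher who tests 1,000 candidate strategies, each with only 60 months of returns, where every return series is pure noise with zero true mean. The function names, the 4% monthly volatility and the |t| > 2 threshold are all illustrative assumptions.

```python
import math
import random
import statistics


def t_stat(returns):
    """t-statistic of the mean return tested against zero."""
    n = len(returns)
    return statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))


def best_of_noise(n_strategies, n_months, seed=0):
    """Backtest many strategies whose returns are pure noise.

    Returns the largest t-statistic found and the number of strategies
    that clear the conventional |t| > 2 bar, even though no strategy
    has any real edge.
    """
    rng = random.Random(seed)
    t_stats = []
    for _ in range(n_strategies):
        # Hypothetical monthly returns: zero true mean, 4% volatility.
        returns = [rng.gauss(0.0, 0.04) for _ in range(n_months)]
        t_stats.append(t_stat(returns))
    n_significant = sum(1 for t in t_stats if abs(t) > 2.0)
    return max(t_stats), n_significant


best_t, n_sig = best_of_noise(n_strategies=1000, n_months=60)
print(f"best t-stat among noise strategies: {best_t:.2f}")
print(f"strategies with |t| > 2: {n_sig} of 1000")
```

Under these assumptions, roughly one strategy in 20 should look "significant" by chance alone, and the best of them will look impressive. Reporting only the winner, without correcting for the number of strategies tried, is the kind of practice that lets a spurious result pass for a real one.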

The atmosphere at many buy-side institutions is pro-ML, verging on the febrile, with one manager saying: “In other fields like medicine, self-driving cars and language translation, researchers have been able to use machine learning to create complex models that are better than human performance. And we can do the same.”

In reality, self-driving cars are still years away from widespread safe deployment, medical ML has at best provided evidence of matching human performance and catching human error in a few specialist fields, and while translation software can produce a usable translation, it cannot produce anything close to the output of a skilled human translator.

Another pro-ML argument is that natural selection will save the day – institutions with bad software or statistical tests will be outcompeted by those with better ones. This is also misleading: a bad implementation of sales-lead software, for example, could result in liabilities for a bank, including mis-selling costs, which might not be discovered for years. And the failure of a few major institutions is not something that can be brushed off as a natural Darwinian process; it is a system-threatening event. Only good, sceptical, rigorous research practice can save banks from being lured to disaster.

STAT OF THE WEEK

UBS’s income from equity derivatives trading plummeted $47 million (23%) and from cash equities trading $16 million (5%) in the last three months of 2018. Overall, the equities divisions’ revenues slipped $92 million to $792 million.

Stock slump dents income, hikes VAR by 22% at UBS

QUOTE OF THE WEEK

“And when they’re wrong, [fund managers using factor timing] are really wrong, and get hit for about 50 basis points per year, also worse than other funds. So factor timing for these funds, on average, hurts” – Andrew Chin, AllianceBernstein