Data science may have a razzle-dazzle kind of aura, but the terms used to describe the work make it sound like hard, back-breaking, painstaking labour. Data has to be harvested and cleaned; it is processed in factories, and analysed in labs.
Banks are trying to hitch their research units to this buzzy bandwagon – but some cynics already believe they lack the stomach for the hard labour involved.
One investment manager says he expects most banks to “dabble” in data science, but only a few to invest heavily enough to handle and process all the data, make it meaningful, and create room for specialisation among their data scientists.
Those who fall short may be doomed, the buy-sider warns: “If you want to understand why a company’s share price is jumping up or down from one day to the next, you will be flying blind without access to this kind of information. And if your investment strategy is short term, you can’t afford to not have access to these early warning data sources.”
So, what makes it so tough?
For one thing, it demands manpower. UBS’s Evidence Lab has experts in quantitative market research, pricing data, transaction data, geospatial data, climatology, hydrology, employment data, social data, mobile data – the list goes on.
Shortcuts are in short supply. Banks might look to speed up their stockpiling of data assets, for example, by using robots to trawl the internet gathering up data without much human intervention.
But websites change all the time. Robots break. Much of the data gathered is useless in raw form. UBS spent “years” cleaning Chinese geolocation codes, for example, before it could make them useful, Evidence Lab’s Barry Hurewitz says.
For this reason, Evidence Lab runs ‘factories’ in multiple locations to grind through the mix of skilled and unskilled labour that this type of data collection demands.
It takes time. Newer entrants to the space will need years to build up depth in their data. As a benchmark, UBS has built datasets based on survey results for things such as mobile banking, virtual reality headsets and companies’ capital spend intentions, which in some cases go back five years.
Banks have to learn about each new dataset: how best to take data in and organise it, how to structure investigations to yield the most useful results, and how to turn specific data into investable insights. Mostly those are lessons learnt the hard way, through familiarity with the data in question.
Finding new ground to tread could prove tough. Banks might aim to collect information in different areas. But equally they could find the best areas of enquiry have been addressed already.
When they do break new ground, it must be done properly. Asset managers tell Risk.net that today they ignore data generated through some ad hoc bank surveys – the sort of thing that banks were doing before data science became fashionable – because the methodology lacks rigour.
Finally, the handling of data presents another obstacle. Bankers say doing due diligence on the licensing agreements for data from both inside and outside their firms can be a huge time sink. But they are equally fretful about coming unstuck by using data wrongly.
Will they succeed? That will come down to investment.
There probably is gold in those mountains of data. But digging it out and turning it into something useful will be a multi-year, multi-million commitment.