Death of the data warehouse

Two panellists at this year's Buy-Side Technology North American Summit talk about their firms' use of big data lakes in place of data warehouses

Panellists at the Buy-Side Technology North American Summit 2015

Hunger for data isn't going to slow down any time soon. But as the quantity of data used by firms climbs, so too do the issues surrounding it.

A good data governance strategy isn't exactly a sexy topic, but with demand for data climbing, it's a necessary one to tackle.

Scott Burleigh, executive director for JP Morgan Asset Management, said his firm made heavy investments in data governance technology about a year and a half ago. Burleigh, who spoke on a panel at this year's Buy-Side Technology North American Summit on October 7, said the firm had found multiple copies of its data, and places where the same data was processed over and over again.

"What evolved over time was that we didn't have a single version of the truth," Burleigh said. "You had different answers for the same instrument. Different rights and returns for the same security. You had weighted average credit ratings that were different between reports. Multiple answers for the same question."

Trip to the lake

A consolidated area to store the data was the answer, but not a warehouse. Instead, the firm chose to build a big data lake.

Rashmi Gupta, a data manager at MetLife and fellow panellist, said her firm has taken the same approach. Instead of a traditional centralised warehouse, everything is put into a big data lake, which serves as a data acquisition layer.

A semantics layer – a data translation layer that sits on top of the data acquisition layer – maps to the enterprise data model. Gupta said big data lakes are one of the biggest trends she sees in the industry now.

"So you have one set of information, one single version of truth, but you don't have all the cost associated and the work and labour involved in creating one single warehouse," Gupta said.

Big data lakes take very little time to build, according to Gupta, and they scale well. If a firm wants to use a new application, all it has to do is put the application's data in the lake and build a translation layer on top of it.

Gupta said there are some issues around data integrity, which makes the translation layer such a critical part of the entire operation.
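To make the idea concrete, a translation layer of the kind Gupta describes might look something like the Python sketch below. The source names, field mappings and enterprise model here are invented for illustration – this is not MetLife's implementation.

```python
# Hypothetical sketch of a translation (semantic) layer over a data lake.
# Each source lands in the lake in its own raw shape; a per-source mapping
# translates raw field names into the shared enterprise data model.

ENTERPRISE_MODEL = {"security_id", "rating", "weight", "as_of_date"}

SOURCE_MAPPINGS = {
    "accounting_system": {"SecID": "security_id", "CredRating": "rating",
                          "PortWeight": "weight", "AsOf": "as_of_date"},
    "risk_system": {"isin": "security_id", "agency_rating": "rating",
                    "weight_pct": "weight", "valuation_date": "as_of_date"},
}

def translate(source: str, raw_record: dict) -> dict:
    """Map one raw record from a lake source into the enterprise model."""
    mapping = SOURCE_MAPPINGS[source]
    record = {mapping[k]: v for k, v in raw_record.items() if k in mapping}
    missing = ENTERPRISE_MODEL - record.keys()
    if missing:  # the integrity checks Gupta alludes to live here
        raise ValueError(f"{source}: unmapped fields {missing}")
    return record

# Adding a new application is just registering another mapping --
# no warehouse rebuild required.
SOURCE_MAPPINGS["new_trading_app"] = {"ticker": "security_id", "rtg": "rating",
                                      "wgt": "weight", "dt": "as_of_date"}
```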

"It boils down to, very simply put, the whole data warehouse is now being replaced by a high-technology data service layer," Burleigh said.

Tapping at the source

Burleigh used solvency-related data as an example of how this works. With the data lake, a logical data model brings in data from multiple sources. The data is delivered through a service layer, meaning the user can ask for a type of data or specific data elements without specifying the source.

"You just talk to the service layer, tell it what data elements you want and it knows where they are," Burleigh said. "It serves it up to you as though it was one source."

JP Morgan has taken it a step further, according to Burleigh, by governing data at the source, before it enters the data lake. That way, Burleigh said, the firm doesn't have to worry about altering the data once it's in the lake.

"We're identifying the source for the data that goes into the lake and we make changes, or the governance says we need to make changes to the data element," Burleigh said. "We make it at the source and it gets reflected in the data lake."

This article was originally published on sister website WatersTechnology.com.
