Case Study: European Bank

If you’re a leading bank in Europe, with hundreds of billions of euros in assets, how do you keep your competitive edge? Your bank has been widely praised by leading industry experts as the most innovative bank in mobile payment systems. Every other bank in Europe has to be gunning for you. The key, top management decided, is better daily decision making backed by big data analytics. Specifically, the bank set out to:

  • Understand its customers better to segment them along axes of value and risk (high-value/low-risk, high-value/high-risk, low-value/low-risk, low-value/high-risk), identifying opportunities in outliers and clusters (see the segmentation sketch after this list).

  • Analyze spending patterns and use predictive analytics to maximize cross-selling opportunities.
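
A minimal sketch of the value/risk segmentation described above is shown here, using PySpark (the Python API for Spark, one of the Hadoop tools discussed later in this case study). The table name, column names, and median-split thresholds are illustrative assumptions, not the bank’s actual schema or scoring model.

    # Hypothetical sketch: bucket customers into the four value/risk segments.
    # Table and column names (customer_metrics, annual_value, risk_score) are
    # illustrative assumptions, not the bank's actual schema.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("customer-segmentation").getOrCreate()

    customers = spark.read.table("customer_metrics")

    # Median splits stand in for "high" versus "low"; a real model would use
    # richer scoring and could add clustering or outlier detection on top.
    value_cut, risk_cut = customers.approxQuantile(
        ["annual_value", "risk_score"], [0.5], 0.01)

    segments = customers.withColumn(
        "segment",
        F.concat(
            F.when(F.col("annual_value") >= value_cut[0],
                   F.lit("high-value")).otherwise(F.lit("low-value")),
            F.lit("/"),
            F.when(F.col("risk_score") >= risk_cut[0],
                   F.lit("high-risk")).otherwise(F.lit("low-risk")),
        ),
    )

    # Show how many customers fall into each of the four segments.
    segments.groupBy("segment").count().show()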

This turned out to be harder than it first seemed. The plan was to move data daily from the central mainframe, from archival tape systems, and from other sources to a distributed cluster of servers running Hadoop big data analytics. At this bank, that amounted to about 1,500 files in myriad file formats and roughly 10 TB of data each day.

Background

The bank in this case study has moved up in industry rankings to become the #1 retail bank in its region. It serves over 10 million clients, the largest bank customer base in its area. It manages hundreds of billions of euros in loans and even more in client funds. It operates over 5,000 branches along with almost 10,000 ATMs. A highly rated brand with top marks for trust and excellence in service quality, it takes pride in its history of service and its commitment to maintaining a sustainable and socially responsible banking model.

Problem

How does a leading bank move this volume of daily data (up to 10 TB a day) from multiple disparate data sources (VSAM files, with metadata described in COBOL and PL/1), including archival tape, to its growing Hadoop cluster for daily, near real-time analytics that fuel better decision making? And how can it do so quickly, without the costly overhead of traditional ETL, and without driving mainframe monthly license charges (MLC) through the roof?
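
To give a sense of what the conversion problem involves, here is a minimal, hypothetical sketch in Python of decoding a single fixed-width EBCDIC record whose layout would come from a COBOL copybook or PL/1 declaration. The field layout, code page, and sample record are assumptions made purely for illustration, not the bank’s actual formats.

    # Hypothetical sketch: turn one fixed-width EBCDIC record into readable fields.
    # The layout below stands in for metadata that would normally come from a
    # COBOL copybook or PL/1 declaration; it is not the bank's actual format.
    LAYOUT = [               # (field name, offset, length in bytes)
        ("account_id", 0, 10),
        ("branch", 10, 4),
        ("balance", 14, 12),
    ]

    def decode_record(raw: bytes, codepage: str = "cp037") -> dict:
        """Decode a fixed-width EBCDIC record using the layout above."""
        text = raw.decode(codepage)   # cp037 is a common EBCDIC code page
        return {name: text[off:off + length].strip()
                for name, off, length in LAYOUT}

    # Example record, EBCDIC-encoded here just for demonstration.
    # Real VSAM records often also contain packed-decimal (COMP-3) fields,
    # which need binary unpacking rather than simple text decoding.
    sample = ("0001234567" + "MAD1" + "000001234567").encode("cp037")
    print(decode_record(sample))

Multiply this by roughly 1,500 files a day in differing layouts, and the appeal of a tool that reads the source metadata directly becomes clear.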

Complicating matters, the bank could not tolerate the latency and overhead of passing through intermediate stages. It wanted to move the daily data directly into its Hadoop cluster, where the analytics could be performed on the spot and the results made accessible to the bank’s users. This would eliminate clunky ETL processing, speed decision making, and lower costs.

Solution

The traditional options didn’t work. These included IBM’s MQ File Transfer Edition (MQFTE), competing enterprise solutions, and various middleware alternatives. They all involved extra steps, added components, increased latency, and cumbersome multi-stage ETL. For example, MQFTE had difficulty handling PL/1 metadata and could not insert data directly into Hadoop. All of the alternatives also generated unacceptable overhead in the form of costly CPU consumption and other expenses. For files with metadata encoded in PL/1, there were no commercially available data-conversion solutions, and the bank hoped to avoid the expense, delays, and risks of creating and maintaining custom software.

Instead, they wanted a solution that let them extract data from any source system, load it into Hadoop, and use the Hadoop tool set (Impala, Hive, Pig, Spark) to handle the transform step directly in Hadoop: ELT rather than the usual ETL, with the analytics performed right there. They turned to Veristorm’s vStorm Enterprise and a Cloudera x86 cluster, which handled the transform function directly in Hadoop and skipped the intermediate transformations that were now unnecessary.
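
As an illustration of the ELT pattern described above, the sketch below assumes the raw mainframe extracts have already been loaded into a Hadoop table and uses PySpark to transform them inside the cluster; the table names, column names, and aggregation are hypothetical placeholders, not the bank’s actual pipeline.

    # Hypothetical ELT sketch: the data is already loaded into Hadoop (here a
    # table called raw_transactions), so the transform step runs inside the
    # cluster rather than in a separate ETL stage. All names are illustrative.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("elt-transform")
             .enableHiveSupport()
             .getOrCreate())

    raw = spark.read.table("raw_transactions")

    # Derive per-customer daily spending totals for downstream analytics.
    daily_spend = (
        raw.where(F.col("txn_type") == "DEBIT")
           .groupBy("customer_id", F.to_date("txn_ts").alias("txn_date"))
           .agg(F.sum("amount").alias("total_spend"),
                F.count("*").alias("txn_count"))
    )

    # Persist the result as a table that Impala or Hive can query directly.
    daily_spend.write.mode("overwrite").saveAsTable("daily_spend")

Because the transform runs where the data already lives, there is no intermediate staging system to maintain and no extra copy of the data in flight.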

Results

The solution turned out to be surprisingly straightforward and efficient. It delivered:

  • 3x faster performance, end to end

  • 90% reduction in MIPS overhead

Big data analytics provides a powerful management decision-making capability that has been compared to mining for gold. By reducing overhead as much as it did and speeding up the time to results, the bank was able to capture more of the gold embedded in its data and turn it more efficiently into valuable insights its managers could use to make better decisions.

“Only through Veristorm vStorm Enterprise and Cloudera running Hadoop could the bank have met its demanding requirements,” says Yannick Barel, Veristorm Vice President of Worldwide Sales. The results speak for themselves.