Enterprise to Hadoop Data Integration

The easiest way to import mainframe data into Hadoop

The best analytics require detailed data. Leading enterprise organizations mine their transaction data for insights 79% of the time, and at least 60% of enterprise transactions touch the mainframe. It follows that the best predictive analytics require detailed enterprise data from the mainframe.

New sources such as website click-streams yield actionable information, but their value multiplies when they are combined with the enterprise customer and transaction data that gives them context.

What’s the easiest way to copy data from the mainframe to your enterprise analytics platform? A recent look at integration costs showed that on a big data project, 80% of the development effort goes into data integration and only 20% into analytics, which is probably the inverse of how it should be.

In sectors like financial services, banking and insurance, these customer insights help detect fraud, reduce loan risk, and find new opportunities in cross-selling services.

Data Integration Challenges from Mainframe to Hadoop

There are three major hurdles to low-latency data movement into analytics platforms:

  • Mainframe data formats (like VSAM, QSAM, and Datacom/DB) are not well understood outside the mainframe community. Tools like Sqoop and Flume are simply not built to do these kinds of data conversions.

  • Hadoop-side tools like Sqoop overlook the cost of MIPS and storage on the mainframe, and so do not scale.

  • Security and compliance are critical, and they are implemented in ways unique to the mainframe.
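To make the first hurdle concrete, here is a minimal Python sketch of what decoding a single fixed-width mainframe record can involve. It is an illustration only, not how any particular product works. The record layout (a 10-byte name followed by a 4-byte packed-decimal amount) is hypothetical, and the EBCDIC code page is assumed to be cp037.

```python
from decimal import Decimal


def unpack_comp3(raw: bytes, scale: int = 0) -> Decimal:
    """Decode an IBM packed-decimal (COBOL COMP-3) field.

    Each byte holds two decimal digits, except the last byte,
    which holds one digit and a sign nibble.
    """
    if not raw:
        raise ValueError("packed-decimal field is empty")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(d) for d in digits))
    if sign_nibble in (0x0B, 0x0D):  # 0xD (and 0xB) mark a negative value
        value = -value
    return Decimal(value).scaleb(-scale)


def decode_record(record: bytes) -> dict:
    # Hypothetical layout: bytes 0-9 are an EBCDIC name,
    # bytes 10-13 are a COMP-3 amount with two implied decimal places.
    name = record[0:10].decode("cp037").rstrip()  # cp037 is an assumed code page
    amount = unpack_comp3(record[10:14], scale=2)
    return {"name": name, "amount": amount}


# Build a sample record: "SMITH" padded to 10 bytes, then +123.45 packed.
sample = "SMITH".ljust(10).encode("cp037") + bytes([0x00, 0x12, 0x34, 0x5C])
print(decode_record(sample))
```

A generic import tool that treats the record as ASCII text would misread both fields, which is why the field-level layout has to travel with the data.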

Veristorm’s vStorm Enterprise is purpose-built to tackle these problems.

  • It automates data conversion even for mainframe-specific formats like VSAM and Datacom/DB.

  • It includes metadata conversion from COBOL and PL/I.

  • It extracts the data and metadata in native (usually binary) format, which reduces MIPS charges and the heavy compute load of large SQL queries.

  • It streams the data and metadata to the target platform without staging. This eliminates the cost and delay of staging, plus the security risk of having an additional access point for your data.
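The idea of streaming without staging can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not vStorm Enterprise's implementation: the `LAYOUT` table is a hypothetical stand-in for metadata derived from a COBOL copybook, the code page is assumed to be cp037, and in-memory streams stand in for the mainframe source and the Hadoop target.

```python
import io
from typing import BinaryIO, Iterator, TextIO

# Hypothetical field layout (name, offset, length), standing in for
# metadata that would be derived from a COBOL copybook.
LAYOUT = [("cust_id", 0, 6), ("city", 6, 10)]
RECORD_LENGTH = 16


def stream_records(source: BinaryIO) -> Iterator[dict]:
    """Yield one decoded record at a time; nothing is written to disk."""
    while True:
        record = source.read(RECORD_LENGTH)
        if not record:
            return
        if len(record) < RECORD_LENGTH:
            raise ValueError("truncated record at end of stream")
        yield {
            name: record[start:start + length].decode("cp037").rstrip()
            for name, start, length in LAYOUT
        }


def copy_to_target(source: BinaryIO, target: TextIO) -> int:
    """Convert records on the fly and write them straight to the target."""
    count = 0
    for row in stream_records(source):
        target.write("\t".join(row[name] for name, _, _ in LAYOUT) + "\n")
        count += 1
    return count


# In-memory stand-ins for the source data set and the target file.
raw = ("000042" + "BOSTON".ljust(10) + "000043" + "AUSTIN".ljust(10)).encode("cp037")
source = io.BytesIO(raw)
target = io.StringIO()
copied = copy_to_target(source, target)
print(copied, repr(target.getvalue()))
```

Because each record is converted and forwarded as soon as it is read, there is no intermediate copy of the data to store, secure, or wait for. A real network source may return short reads, so production code would need to buffer until a full record is available.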

vStorm Enterprise supports most major vendors as sources and targets for data integration, including IBM, CA, Oracle, Hortonworks, Cloudera and Teradata. You can see a complete list of sources and targets on the vStorm Enterprise page.