Doug Henschen on Analytics, Big Data & Smart Apps
Informatica World 2015 Event Report. Last year at Informatica World 2014, this $1-billion-annual-revenue data-integration company introduced the Intelligent Data Platform (IDP). The goal, in part, was to unify its various products for on-premises and cloud data integration, data cleansing, master data management and so on. The company promised the platform would delve more deeply into big data while also extending into data security, addressing boardroom-level concerns about data breaches and data privacy.
This week at Informatica World 2015, CEO Sohaib Abbasi kicked things off by reviewing IDP accomplishments to date as well as new goals for “the age of engagement.” Contrasting this new age with the old “age of productivity,” in which businesses focused mostly on transactions and efficiency, Abbasi warned that the data challenges are getting harder as we try to correlate mobile, social, machine and other emerging data types with transactions managed both on premises and in the cloud.
“We have to understand not only what products customers bought, but also which products they like, who their friends are and who they influence,” said Abbasi, underscoring the connection between transactions and social engagement.
Informatica Rev: Self-Service Goes Public
Chief among Informatica’s IDP achievements over the last year was the September launch of Informatica Rev (originally called Project Springbok, as covered here by Holger Mueller), a self-service tool designed to help business users prepare their data with guided recommendations for tasks such as curating, cleansing, joining and enriching data. Rev’s Excel-like user interface was demonstrated by Andrew Comstock, director of product management, who showed how a data novice could bring related data sets together with a “blend” feature that spots shared data keys and automatically joins the data without coding. Users can also segment data into categories such as “small, medium and large” customers by providing labels and simple range settings.
The new combinations of data that users create can be shared, so Comstock pointed out that Rev is really a public-service tool, not just a self-service tool. Not only do colleagues find these Rev-generated blendings useful and reusable, but IT can also take note of popular data sets created in Rev and learn from the data-prep steps taken by end users to create hardened production data resources.
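To make the blend and segmentation ideas concrete, here is a minimal Python sketch. It is not Rev's implementation, which was not shown at code level. It assumes that "shared data keys" can be approximated by column names that appear in both data sets, and the function names (shared_keys, blend, segment), the sample customers and the revenue thresholds are all hypothetical.

```python
from bisect import bisect_right


def shared_keys(left, right):
    """Return column names present in both data sets, in left's column order."""
    if not left or not right:
        return []
    right_cols = set(right[0])
    return [col for col in left[0] if col in right_cols]


def blend(left, right):
    """Inner-join two lists of dict rows on every column name they share.

    Assumption: matching column names mark a shared key. A real tool would
    presumably also profile the values before proposing a join.
    """
    keys = shared_keys(left, right)
    if not keys:
        raise ValueError("no shared columns to join on")
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**row, **match})
    return joined


def segment(value, bounds, labels):
    """Map a number to a label using ascending cut points.

    A value equal to a cut point falls into the higher segment.
    """
    return labels[bisect_right(bounds, value)]


# Hypothetical sample data.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
    {"customer_id": 3, "name": "Initech"},
]
revenue = [
    {"customer_id": 1, "annual_revenue": 40_000},
    {"customer_id": 2, "annual_revenue": 250_000},
    {"customer_id": 3, "annual_revenue": 2_000_000},
]

blended = blend(customers, revenue)
for row in blended:
    row["size"] = segment(
        row["annual_revenue"],
        bounds=[100_000, 1_000_000],
        labels=["small", "medium", "large"],
    )
```

The user supplies only labels and range settings; the join key is inferred. That inference is also where a naive approach like this one can go wrong, for example when two unrelated columns happen to share a name.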
MyPOV: Self-service is a mantra these days, but not all such offerings also put an emphasis on wider collaboration and reuse. With self-service approaches there’s a risk that users can be self-centered, creating new silos of data that are only useful to them. With its combination of an Excel-like interface and automation and recommendation features, Informatica is making data access and data preparation more accessible to non-IT users.
Secure@Source Spots What’s Sensitive
Launched in April, Secure@Source is Informatica’s move deeper into the security arena (beyond data masking) with a product that takes advantage of its knowledge of the metadata. The goal is to help companies prevent data breaches while also supporting data-privacy and data-audit initiatives.
Secure@Source analyzes the metadata in PowerCenter repositories and spots sensitive data that might be at risk. The sensitive data it recognizes automatically includes PCI, PII and PHI data, but you can also customize it to spot what’s uniquely sensitive to a particular firm. The system reveals both which groups and departments have the highest data risks, because of the nature of their data, and which lack adequate security controls. For now, Secure@Source is limited to structured data managed by PowerCenter on-premises, but the roadmap calls for support of both unstructured and cloud data external to PowerCenter.
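As a rough illustration of metadata-driven discovery, the Python sketch below flags columns whose names suggest PCI, PII or PHI content and accepts a firm-specific custom rule. It is an assumption-laden toy, not how Secure@Source works: the rule patterns, the catalog layout and the function names (classify, scan) are all hypothetical, and a real product would presumably look at much more than column names.

```python
import re

# Hypothetical rules: category -> pattern matched against column names.
DEFAULT_RULES = {
    "PCI": re.compile(r"card_?(number|num|no)|cvv", re.IGNORECASE),
    "PII": re.compile(r"ssn|email|birth|phone", re.IGNORECASE),
    "PHI": re.compile(r"diagnosis|patient|medical", re.IGNORECASE),
}


def classify(column_name, rules):
    """Return the sorted list of categories whose pattern matches the name."""
    return sorted(cat for cat, pattern in rules.items() if pattern.search(column_name))


def scan(catalog, rules):
    """Map each table to its flagged columns; tables with no hits are omitted."""
    findings = {}
    for table, columns in catalog.items():
        hits = {}
        for col in columns:
            categories = classify(col, rules)
            if categories:
                hits[col] = categories
        if hits:
            findings[table] = hits
    return findings


# Hypothetical metadata catalog: table name -> column names.
catalog = {
    "billing.payments": ["payment_id", "card_number", "amount"],
    "crm.contacts": ["contact_id", "email", "loyalty_tier"],
    "clinic.visits": ["visit_id", "patient_id", "diagnosis_code"],
}

# A firm-specific rule layered on top of the defaults.
rules = {**DEFAULT_RULES, "CUSTOM": re.compile(r"loyalty_tier", re.IGNORECASE)}
findings = scan(catalog, rules)
```

Grouping findings like these by owning department, and comparing them with the controls each department has in place, would give the kind of risk view the product is described as providing.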
MyPOV: To what degree will this product be a tough sell or, at least, a top-down sell in organizations where data-management professionals view security as somebody else’s job and where security professionals likely have their preferred vendors and approaches? I asked Informatica executives just that, and they acknowledged that they have to educate the sales force and reach out to new buyers including chief data officers and chief security officers. I suspect this product may take a while to get rolling, but Informatica is smart to tap into its knowledge of the metadata to move into the security category. Adding unstructured and cloud sources will be essential.
Project Sonoma Tracks Big Data Lineage
Abbasi put deeper big data integration on the roadmap by announcing Project Sonoma, an effort to bring development agility and governance to Hadoop-based data lakes or data hubs. Set for release in the second half of 2015 (read, late this year), Sonoma is designed to help developers make sense of data in Hadoop-based data lakes as well as external sources so they can collect trusted, relevant data for big-data analysis. Amit Walia, senior VP and GM of data integration and data security, demonstrated how users might bring transactional as well as social and mobile data together using a Live Data Map feature. In a user scenario he searched a 4-petabyte data lake for information that might be relevant for customer-churn analysis. A Knowledge Graph component of Sonoma tracks data lineage, giving developers a better idea of whether that data can be trusted. A Social Graph component recommends social network and media sources that might fit the project.
Sonoma’s output is a report that catalogs all relevant and trusted data that might be used in a customer-churn analysis. As with Rev, the report is an enduring, sharable asset, so it can trigger collaboration and reuse and save time the next time a similar project is pursued.
MyPOV: All is good in concept, but this product is a bit over-the-horizon. I have to wonder to what degree it will overlap with data-lineage and data-cataloging efforts being pursued by the Hadoop community. It will be important for Informatica to interoperate with these tools and platform options to give customers flexibility and choice. Informatica has already integrated with Cloudera Navigator, for example, which is that Hadoop distributor’s tool for tracing data lineage within its clusters. Informatica can track lineage more comprehensively.
Social360 and Big Data Relationship Management
Social360 for Informatica MDM, also expected in the second half of this year, is an enhancement of the vendor’s on-premises master data management platform, MDM 10. Informatica says the feature will help organizations match transactions with social network interactions and media mentions to spot influencers of valued customers. Suresh Menon, VP of information quality solutions, demonstrated an interface in which social content pertaining to competitors, product features, channels, and customers was revealed.
Making use of social data isn’t always easy, so Big Data Relationship Management, another coming MDM enhancement, is designed to enrich social data with demographic and sentiment data. The idea is to help spot the hot topics, the big customers, and the big influencers, aiding deeper sentiment-analysis options.
MyPOV: Here, too, I’m wondering to what extent Informatica is working with sentiment-analysis tools, vendors and options. Partner Salesforce, for example, also captures social and sentiment data, but it’s all in the context of sales, marketing or customer service. Informatica’s opportunity is to facilitate and feed broader analysis extending into supply chain and manufacturing. Plenty of apps vendors are talking about linking the social and transactional worlds, so Informatica has to stick to playing the metadata-savvy Switzerland.
My Big Picture POV
With its Intelligent Data Platform, Informatica is raising the level of discourse out of niche data projects and up to five prominent IT initiatives: analytics, total customer relationship, application consolidation, cloud modernization, and data governance. These five initiatives account for the lion’s share of Informatica deployments, and they are tied to larger business objectives (read, funding sources) like driving better decisions, improving cross-selling and up-selling, grappling with mergers and acquisitions, improving business agility, and getting governance, risk and compliance under control.
Out of the 2,400-plus attendees here at Informatica World 2015, most seem to be data-integration veterans who have used the company’s tools for a long time. But given packed halls for cloud and big data topics, it’s clear that demands are changing and customers are looking beyond the same old ETL and ELT workloads. In a briefing with analysts, Abbasi said that a key reason Informatica is planning to go private is to up the innovation.
“With all this disruption going on, now is the time to double down and invest even more in big data, in Rev [self service], in MDM, and in security,” he said. No doubt that will be easier without having to face public financial scrutiny.
One disruptive force Informatica didn’t mention was all the open source options commoditizing baseline capabilities like ETL. Informatica has to do more to deliver value, and it’s clearly moving quickly to do just that. I expect it to move that much more quickly once this corporate transition takes place as expected in the second or third quarter of this year.