The Greenplum division of EMC is building a single data analytics platform that can crunch both structured and unstructured data and give a broad range of users the tools to study an enterprise's information.
Greenplum, which EMC acquired last year, plans to introduce its Unified Analytics Platform in the first quarter of next year. UAP will combine the EMC Greenplum database with EMC Greenplum HD, which uses the Hadoop open-source analysis framework for unstructured data, and EMC Greenplum Chorus 2.0. Chorus is the user interface for setting up queries and creating visualizations, and in the new version it lets users address both structured and unstructured data.
EMC is announcing the Greenplum UAP on Wednesday at an event in Mountain View, California. Pricing will be disclosed next year.
Organizations in many areas have mountains of data from their operations that are becoming too big to analyze with conventional tools, according to Enterprise Strategy Group analyst Julie Lockner. The volume of data, the complexity of a query and the need for quick answers often create a challenge, she said.
Some enterprises, especially in retail and the health sciences, are adopting new technologies like those coming from Greenplum to learn more from the data they already have, she said. For example, online stores can correlate visitor behavior with eventual purchases, and pharmaceutical companies can more easily process results of clinical studies. Insurance, investment and other companies also are starting to embrace new analytics tools to make more accurate predictions.
One of Greenplum's goals has been to make data analytics tools available to business executives and other employees, rather than just a team of dedicated data scientists. Chorus provides a less arcane interface for translating human questions into queries against sets of data, and it includes a social networking environment where people across an organization can collaborate on working with the data.
The UAP brings enterprises two main benefits, said Michael Maxey, senior director of product marketing at Greenplum.
"One is, the scope of data they can address, but also being able to address all of the existing processes and expertise in an organization and extend it over those new data sets," he said.
In addition to gaining access to unstructured data through Greenplum HD, Chorus 2.0 features an enhanced ability to quickly create a virtual "sandbox" in which to develop new analytics processes, Maxey said. That addition draws on technology from EMC's VMware subsidiary, he said.
Customers can deploy the UAP on their own standard computing hardware or order a prepackaged configuration, Maxey said. Enterprises that already have the Greenplum database or Greenplum HD can integrate those into the unified platform.
Gleaning insights from structured data in traditional databases requires different technology than analyzing unstructured data, such as Web pages, images and video. If business managers want answers to questions that require both kinds of information, they typically need two analytics platforms, and the enterprise may be able to afford only one, Lockner said. Greenplum's UAP should be a more economical solution that lets a company answer all types of queries, she said.