Corporate IT operations' need for easier interaction with massive, fast-growing data sets in Hadoop environments is driving a flurry of vendor activity.
For one, Splunk last week rolled out a beta version of an analytics tool that it claims can be used to access, search, analyze and use data in Hadoop environments more efficiently than current technologies like MapReduce, Pig and Hive.
The product, called Hunk, lets companies gain insight into Hadoop data assets without the need for custom development, data migration, batch processing and data modeling, said Clint Sharp, principal product manager for big data at Splunk.
The Hunk tool set lets enterprises explore, query and analyze Hadoop data where it resides, Sharp said.
The technology supports ad hoc querying of Hadoop data and enables users to analyze and correlate petabytes of structured and unstructured data in a distributed Hadoop environment, he added.
Businesses can use Hunk to build graphs, visualize data and create custom dashboards in Hadoop. It also allows them to more easily share insights gathered from Hadoop data with others in the enterprise, the company says.
Hunk is the company's first major foray into the Hadoop business beyond a connector product for sharing data between Splunk and Hadoop and another one for monitoring the health of a Hadoop environment.
Hunk taps into growing enterprise interest in Hadoop technologies and the need for products that are easier to use than those available today, Sharp said.
"It's not that hard getting data into Hadoop. But getting value from the data is incredibly hard," he said.
The open source Hadoop software, distributed by the Apache Software Foundation, and some of the technologies that have grown up around it are mostly optimized for batch processing tasks and do not allow the sort of interactive, ad hoc querying of data that companies are increasingly looking for, Sharp said. "Our goal is to give you a user interface for Hadoop that is easy to use," and allows such interaction, he said.
Splunk has done a good job so far of helping companies tap machine log data for useful information, said Merv Adrian, an analyst with Gartner Inc. "With Hunk, they are taking what they learned with machine data and moving it over to more general purpose data in Hadoop," he said.
Hunk is one of a small but emerging set of tools that enable direct interactive analytics against the Hadoop Distributed File System, Adrian said. "It is part of a new wave" of products, along with Cloudera's Impala, EMC Greenplum's Pivotal HD and the open source Apache Drill project. "The first wave was brute force batch processing of files in Hadoop."