Concurrent uses the Cloudera CDH platform. "Certainly we could take the open-source version without the Cloudera support, but we found a vendor partner that allows us to expand our solution and leverage their expertise, and really understand how the system works, not just hack it together because it's open source," Lazzaro says.
Return Path started working with MapR's commercial distribution last year, a move it made to improve stability and boost performance. "We've been able to see a roughly 2.5- to three-times increase in performance for our workloads," Sautins says. "That means we can either run things twice as fast, which is great, or we can run half the servers, which can also be very compelling."
Along with multiplying options for commercial Hadoop distributions, there are other signs the open-source platform is gathering steam. Venture capital is flowing, and new startups with management add-ons and analytic applications are appearing at a dizzying pace. Hadoop is also getting increasing attention from traditional data management players -- including IBM, Oracle, Microsoft and EMC -- eager to cash in on the action.
On the funding front, 2011 was a huge year for Hadoop vendors: Cloudera landed $40 million in Series D funding; MapR secured $20 million in Series B funding; Datameer, which makes analytics tools built on Hadoop, secured $9.25 million in its second funding round; and in September, $11 million went to DataStax, which offers a commercial version of the Apache Cassandra distributed database management system as well as a new product that couples Cassandra with Hadoop analytics.
Another event that portends increasing financial investment in Hadoop-related startups is Accel Partners' launch of a $100 million big data fund earmarked for startups working in areas including data management, storage, data analytics and business intelligence. To help spend the money, Accel lined up a team of fund advisers, and the Hadoop realm is well represented by Cutting, who's now with Cloudera; Gil Elbaz, founder of Hadoop user Factual; Cloudera Chief Scientist Jeff Hammerbacher, who once led the data team at Facebook; and Facebook's Jay Parikh.
"There's already a second and third generation of startups being created to take advantage of this macro trend. We're the old guys in the room now, after doing this for three years," says Charles Zedlewski, vice president of product at Cloudera.
Choosing workloads, finding talent
Hadoop makes it easier to process big data, but it's no cure-all. One common challenge for enterprises is how to choose the most appropriate technology to handle different kinds of data.
"I think there's still a lot of confusion about what applications, what workloads, should be on Hadoop versus those that should be in a traditional enterprise data warehouse," Aslett says. "Unfortunately at this point, there aren't any easy answers for that."