As 2016 draws to a close, a new study suggests big data is growing in maturity and surging in the cloud.
AtScale, which specializes in BI on Hadoop using OLAP-like cubes, recently conducted a survey of more than 2,550 big data professionals at 1,400 companies across 77 countries. The survey was conducted in conjunction with Cloudera, Hortonworks, MapR, Cognizant, Trifacta and Tableau.
AtScale's 2016 Big Data Maturity Survey found that nearly 70 percent of respondents have been using big data for more than a year (compared with 59 percent last year). Seventy-six percent of respondents are using Hadoop today, and 73 percent say they are now using Hadoop in production (compared with 65 percent last year). Additionally, 74 percent have more than 10 Hadoop nodes and 20 percent have more than 100 nodes.
"The maturity of respondents in this survey is a key consideration," Thomas Dinsmore, big data analytics industry analyst and author of the book "Disruptive Analytics," said in a statement Wednesday. "One in five respondents has more than 100 nodes and 74 percent of them are in production, indicating double-digit growth year-over-year."
Respondents also say they are increasingly turning to the cloud to host their big data analytics. Fifty-three percent of respondents say they have already deployed big data in the cloud, and 14 percent of respondents have all their big data in the cloud. Seventy-two percent plan to use the cloud for a big data deployment in the future.
"There's been a clear surge in use of big data in the cloud over the last year and what's perhaps as interesting is the fact that respondents are far more likely to achieve tangible value when their data is in the cloud," says AtScale CTO and co-founder Matt Baird.
Hadoop is better off-premises
"Hadoop is freaking hard," adds Dave Mariani, CEO and founder of AtScale. "It's really hard to deploy, it's really hard to manage. I see a lot of customers really like not having to worry about managing their Hadoop cluster. Being able to elastically scale, not just add new nodes but also shrink them, and to use object storage as a persistent layer to do that, that is a completely different notion than on-prem Hadoop."
Alongside big data's increasing maturity, the primary workloads are also shifting.
"The number one workload last year was ETL, then business intelligence, then data science," says Bruno Aziza, chief marketing officer of AtScale. "This year, the number one workload was business intelligence."
BI is big
ETL and data science remain popular big data workloads, but business intelligence (BI), which was already trending upward last year, has become the predominant workload, with 75 percent of respondents using or planning to use BI on big data. And that's not slowing down any time soon if the indications are correct. Fully 97 percent of respondents said they would do as much or more with big data over the next three months.