There are lies, damn lies ... and technology industry studies.
Back in May, Gartner released a survey that crapped all over the Hadoop industry: Of the 284 CIOs polled by Gartner, only 26 percent claimed to be "either deploying, piloting, or experimenting with Hadoop." Even with the small sample size and large margin of error, these were depressing numbers -- ones that contradicted the level of adoption people like me are seeing in the real world.
Well, Gartner has been wrong before. Released today, a new survey of more than 2,100 people, commissioned by AtScale, offers results a whole lot closer to what those of us in the field encounter. Most dramatically, 76 percent of respondents said they plan to use Hadoop -- or are already using it and plan to use it more.
Naturally, the AtScale numbers need to be taken with a grain of salt: AtScale is a Hadoop solutions provider and the survey was conducted on its behalf. But anecdotally, at least, I can tell you that the results paint a picture closer to reality than Gartner's doom and gloom.
Hadoop's killer app
Some of you might find this shocking, but the AtScale survey indicates that Hadoop's killer app is business intelligence -- for 69 percent of those planning to use Hadoop and 65 percent of those already using it. If that's a surprise, then you didn't read even the top item in my article "The 7 most common Hadoop and Spark projects."
Most companies don't have "big data" -- only many new unstructured or semistructured data sources -- and they'd like to gain insight by aggregating them and hooking up a visualization tool. In fact, according to the study, most want to gain insight using Tableau or Excel. If they're already using Hadoop, they are probably also working with Tableau (51 percent). If they aren't, they'd like to use Excel (60 percent).
This matches exactly what I see in the field. My company's bread and butter is building data lakes (or enterprise data hubs, if you prefer). As the survey confirms, these new Hadoop-based systems generally don't replace Teradata or Netezza. Instead, customers either want to augment their existing MPP to handle new types of data or they don't have MPP in place at all. In my experience, companies find their MPP systems can't scale the way they hoped -- and discover they can shove Hadoop on regular hardware (or on Amazon) and add more nodes as they grow.
According to the study, the lower cost of Hadoop solutions isn't the main attraction for most companies. But cost and scale are always connected. If you're looking to do BI and analytics today, you probably don't think "let me buy this weird proprietary hardware columnar database thing" as a first step. In fact, if you draw the architecture of a Netezza appliance next to the architecture of Hive or maybe HBase plus Phoenix, you'll see a very similar structure.