During the year, for instance, eBay staffers can see when customers start typing in Halloween queries and Christmas queries. "With that I can tell you the kinds of things people are looking for. We didn't comprehend this use of the data five years ago -- not at all."
Be careful out there
As good as Hadoop is, there are some cautions. First, "don't commit to or standardize on one vendor quite yet," because it's such a "turbulent" space right now, Forrester's Kobielus suggests. "The vendors are all continuing to rapidly evolve." On the other hand, that does create a "vibrant ecosystem," he says.
Marcus Collins, an analyst at Gartner, says it's up to the enterprise to acquire the expertise needed to get the most out of Hadoop. "It's asking for a level of analytics capabilities that many companies don't have today," he says. "You need to train your staff and invest in analytics, and that will put you in the best position to exploit this technology."
Another key consideration: Most shops will need to hire Hadoop specialists, who are in short supply, or will need to train in-house staffers. "It's not trivial to use," eBay's Williams says. "So we've put a lot of training in place so our engineers know how to use Hadoop and can write code. You're going to have to invest in your developers and program manager so they can become proficient users. Don't underestimate that."
Also be prepared for an organizational learning curve when it comes to relying on an open-source system for a mission-critical application. Using it for a few under-the-radar projects is one thing; developing a massive system for all the world to see is another entirely. Be prepared to educate your management about the benefits of open source.
Another tip from Collins is to stay "intimately involved" with the project to make sure it goes as planned. "Don't just give your problems to your Hadoop vendor," he says. At the end of the day, "you're going to be running it."
Also, Kobielus explains, best practices with Hadoop are still evolving, so it's best to identify some short-term benefit you might get from the system and avoid anything too long-term at the start. As you build up expertise, you can find more things to do with the software. In the meantime, the approaches that early adopters are using to build out and scale their clusters "are all over the board," he says.
Adds to, doesn't replace, other databases
Most customers are using Hadoop in addition to, not instead of, other types of software. At eBay, for instance, the company still uses relational databases and does "a lot of custom [database] work," Williams explains. "At eBay, we see value in using multiple technologies to work with our data. Hadoop is a terrific choice for certain uses, while other technologies work alongside it for other purposes."