Yahoo’s done one impressive thing after another since Microsoft threatened to acquire it, and now, in a step that should make the open source community proud, Yahoo has announced that it’s running “the world’s largest Hadoop application, a 10,000 core Linux cluster producing data used by the Yahoo! Search Webmap.”
Hadoop comes from the Apache Software Foundation, and acts as a distributed computing platform. Prior to Yahoo’s announcement, the software wasn’t hurting for footing; other users included Google, IBM, and Last.fm. Still, this development should ensure Hadoop is embraced to an even greater degree in the future.
On the Yahoo Search Blog, Sean Suchter points out, “Using open source software is a win-win situation for Yahoo! and the wider community. We achieve cost savings, faster processing, reduced maintenance, and increased scale and the community can benefit from the myriad improvements it took to make Hadoop viable for such a large-scale commercial implementation.” Quite an endorsement (and invitation), eh?
Yahoo and Hadoop are also taking steps to make sure everyone hears the news, with related posts appearing on the Yahoo Developer Network and Hadoop and Distributed Computing blogs. An interview between Jeremy Zawodny and two team members adds to the buzz.
But as Zawodny reminds everyone, there’s more to this than PR. “It’s not just an experiment or research project,” he writes. “There’s real money on the line.” And with their record-breaking implementation, Yahoo and Hadoop seem to be handling it well.