It’s no secret that Facebook stores a lot of data in Hadoop (100 petabytes, in fact), but how it keeps that data available whenever it’s needed isn’t necessarily common knowledge. Today at the Hadoop Summit, however, Facebook engineer Andrew Ryan highlighted the company’s solution, which it calls AvatarNode. (I’m at Hadoop Summit, but didn’t attend Ryan’s talk; thankfully, he also summarized it in a blog post.)

For those unfamiliar with the availability problem Facebook solved with AvatarNode, here’s the 10,000-foot explanation: The NameNode service in Hadoop’s architecture handles all metadata operations for the Hadoop Distributed File System (HDFS), but it runs on just a single node. If that node goes down, so, for all intents and purposes, does Hadoop, because nothing that relies on HDFS will run properly.
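To make that single point of failure concrete, here is a toy sketch in Python. It is not Hadoop code and does not use any Hadoop API; the names `ToyNameNode`, `ToyCluster` and `NameNodeDown` are invented for illustration, and the block placement is deliberately simplistic. The only thing it tries to show is that data can sit safely on the storage nodes and still be unreachable once the one metadata service is gone.

```python
class NameNodeDown(Exception):
    """Raised when the single metadata service is unreachable."""


class ToyNameNode:
    """Hypothetical stand-in for the NameNode: maps file paths to block locations."""

    def __init__(self):
        self.alive = True
        self.block_map = {}  # path -> list of (datanode_name, block_id)

    def _check(self):
        if not self.alive:
            raise NameNodeDown("metadata service unavailable")

    def register(self, path, locations):
        self._check()
        self.block_map[path] = locations

    def locate(self, path):
        self._check()
        return self.block_map[path]


class ToyCluster:
    """Hypothetical cluster: one metadata node, two storage nodes."""

    def __init__(self):
        self.namenode = ToyNameNode()
        self.datanodes = {"dn0": {}, "dn1": {}}  # name -> {block_id: bytes}

    def write(self, path, chunks):
        locations = []
        for i, chunk in enumerate(chunks):
            name = f"dn{i % 2}"  # naive round-robin placement
            block_id = f"{path}#{i}"
            self.datanodes[name][block_id] = chunk
            locations.append((name, block_id))
        # Every write has to be recorded by the single metadata node.
        self.namenode.register(path, locations)

    def read(self, path):
        # Every read starts with a metadata lookup on the single node.
        locations = self.namenode.locate(path)
        return b"".join(self.datanodes[name][bid] for name, bid in locations)


cluster = ToyCluster()
cluster.write("/logs/a", [b"hello ", b"world"])
first_read = cluster.read("/logs/a")

cluster.namenode.alive = False  # simulate the NameNode host going down
try:
    cluster.read("/logs/a")
    read_failed = False
except NameNodeDown:
    read_failed = True

# The bytes are still on the storage nodes; only the lookup path is gone.
blocks_still_stored = sum(len(blocks) for blocks in cluster.datanodes.values())
```

In this sketch both blocks survive on the storage nodes after the simulated outage, yet the read still fails, which is the situation the paragraph above describes.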

As Ryan explains, Facebook began building AvatarNode about two years ago (hence its James Cameron-inspired name) and…
