Friday, March 27, 2009

Artemis, Scribe, Chukwa

Artemis

A metrics and log aggregation tool built on top of Dryad. It is mostly a debugging tool for Dryad jobs, but it is cast as a tool for cluster and cloud services management in general. The developer can choose which of the thousands of metrics that Microsoft Windows exports to collect and filter. The idea is to witness the impact of a Dryad job on individual nodes, particularly to spot unexpected behavior. Drawbacks: not open source, and built on MS technologies (e.g. .NET for the UI, which is not web based).

Scribe

An open source project created at Facebook. This system shines in its simplicity (it collects logs) and in its effective reuse of the lower-level Thrift RPC project. Once logs are collected, they can be queried in an ad hoc fashion using Hive.

Chukwa

A scalable log, metrics, and generic data collection framework built by Yahoo on top of Hadoop MapReduce and HDFS. The idea is to store everything possible in its raw form and do analysis later. This is in contrast to Artemis, which filters before collection and stores the results in a much smaller database than Chukwa's HDFS archival store.
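
To make the contrast between the two collection strategies concrete, here is a toy sketch in plain Python. It does not use any Artemis or Chukwa API; the record shape, the function names, and the sample data are all made up for illustration. It only shows the trade-off: filtering before collection keeps the store small, while storing raw data leaves room for questions nobody thought to ask up front.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical record shape (an assumption, not from either system):
# (node name, metric name, value).
Record = Tuple[str, str, float]


def collect_filtered(records: Iterable[Record],
                     keep: Callable[[Record], bool]) -> List[Record]:
    """Filter-before-collect: only records matching `keep` are stored."""
    return [r for r in records if keep(r)]


def collect_raw(records: Iterable[Record]) -> List[Record]:
    """Store-everything: keep every record unmodified for later analysis."""
    return list(records)


def analyze_later(store: Iterable[Record],
                  keep: Callable[[Record], bool]) -> List[Record]:
    """Run a query over whatever the store happens to contain."""
    return [r for r in store if keep(r)]


def is_cpu(record: Record) -> bool:
    return record[1] == "cpu"


def is_disk(record: Record) -> bool:
    return record[1] == "disk_io"


# Made-up sample data.
SAMPLE: List[Record] = [
    ("node1", "cpu", 0.91),
    ("node1", "disk_io", 120.0),
    ("node2", "cpu", 0.15),
    ("node2", "net", 33.0),
]

filtered_store = collect_filtered(SAMPLE, is_cpu)  # small store, CPU only
raw_store = collect_raw(SAMPLE)                    # full store

# A new question arrives after collection: what did disk I/O look like?
disk_from_filtered = analyze_later(filtered_store, is_disk)
disk_from_raw = analyze_later(raw_store, is_disk)
```

In this sketch the filtered store cannot answer the disk question because that data was discarded at collection time, while the raw store can, at the cost of holding every record.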
