Release Notes - Hadoop Chukwa - Version 0.3
This is the first public release of Chukwa, a log analysis framework on top of Hadoop. Chukwa has been tested at scale and used in some production settings, and is reasonably robust and well behaved. For instructions on setting up Chukwa, see the administration guide and the rest of the Chukwa documentation.
The collection components of Chukwa -- adaptors, agents, and collectors -- are approaching general-use quality. They've been fairly aggressively tested, and can be counted on to perform properly and recover from failures.
HICC, the visualization component, is "beta" quality. It has been used successfully at multiple sites, but it is still brittle, and work is ongoing. Documentation is still sparse, and error reporting isn't always sufficiently clear.
Chukwa has not been extensively audited for security vulnerabilities. Do not run it except in trusted environments. Never run Chukwa as root: by default, the ExecAdaptor allows arbitrary remote command execution.
Chukwa requires Java 1.6.
The back-end processing requires Hadoop 0.18+.
Collecting Hadoop logs and metrics requires Hadoop 0.20+.
- HICC defaults to assuming data is UTC; if your machines run on local time, HICC graphs will not display properly until you change the HICC timezone. You can do this by clicking the small "gear" icon on the time selection tool.
- As mentioned in the administration guide, the Pig downsampling jobs should be run as an external command.
- The HDFSUsage script, which monitors HDFS usage under /user, must run as the hdfs user in order to access that data. This user should have write access to $CHUKWA_LOG_DIR.
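The two requirements above can be checked before scheduling the script. This is a hedged sketch, not part of Chukwa itself: the function name `hdfsusage_preflight` and the default log directory are assumptions; adjust CHUKWA_LOG_DIR to match your installation.

```shell
#!/bin/sh
# Hypothetical preflight check for the HDFSUsage script (not shipped with
# Chukwa). CHUKWA_LOG_DIR default below is an assumption for illustration.
CHUKWA_LOG_DIR="${CHUKWA_LOG_DIR:-/tmp/chukwa/logs}"

hdfsusage_preflight() {
  # The script must run as the hdfs user to read accounting data in /user.
  if [ "$(id -un)" != "hdfs" ]; then
    echo "warning: running as $(id -un), not hdfs" >&2
  fi
  # It also needs write access to CHUKWA_LOG_DIR for its output.
  mkdir -p "$CHUKWA_LOG_DIR" 2>/dev/null
  if [ -w "$CHUKWA_LOG_DIR" ]; then
    echo "ok: $CHUKWA_LOG_DIR is writable"
  else
    echo "error: $CHUKWA_LOG_DIR is not writable" >&2
    return 1
  fi
}

hdfsusage_preflight
```

Running such a check from the hdfs user's crontab, just before the script itself, surfaces permission problems in the job's own log rather than as silently missing data.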
- System metrics collection may fail or be incomplete if your versions of sar and iostat do not match the ones that Chukwa expects. (See also CHUKWA-260)
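Since missing or mismatched sysstat tools fail quietly, a quick existence check on each collection host can save debugging time. A minimal sketch, assuming only that sar and iostat should be on the PATH (the function name is hypothetical; the exact versions Chukwa expects are distribution-dependent, see CHUKWA-260):

```shell
#!/bin/sh
# Hypothetical helper: report whether the tools Chukwa's system-metrics
# adaptors invoke are installed. Does not check version compatibility.
check_metrics_tools() {
  for tool in sar iostat; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf '%s: present\n' "$tool"
    else
      printf '%s: MISSING (install the sysstat package)\n' "$tool"
    fi
  done
}

check_metrics_tools
```

If a tool is present but metrics are still garbled, compare its output format by hand against what the adaptor parses, since sysstat output columns have changed between releases.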
- In a few of the chukwa agent metrics tables (monthly, quarterly, yearly, and decade), the data is transposed: the recordname column holds host data, and the host column holds recordname data.