Apache Hama

Apache Hama is a pure BSP(Bulk Synchronous Parallel) computing framework on top of HDFS (Hadoop Distributed File System) for massive scientific computations such as matrix, graph and network algorithms. Currently, it has the following features:


Recent News

  • Feb 5, 2012: release 0.4.0 available
  • July 28, 2011: release 0.3.0 available
  • June 2, 2011: release 0.2.0 available
  • Apr 30, 2010: Introduced in the BSP Worldwide

Why Hama and BSP?

Today, many practical data processing applications require a more flexible programming abstraction model that is compatible to run on highly scalable and massive data systems (e.g., HDFS, HBase, etc). A message passing paradigm beyond Map-Reduce framework would increase its flexibility in its communication capability. Bulk Synchronous Parallel (BSP) model fills the bill appropriately. Some of its significant advantages over MapReduce and MPI are:

Getting Started

Start by installing Hama on a Hadoop cluster.

Getting Involved

Hama is an open source volunteer project under the Apache Software Foundation. We encourage you to learn about the project and contribute your expertise. Here are some starter links:

Disclaimer

Apache Hama is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that the infrastructure, communications, and decision making process have stabilized in a manner consistent with other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.