You will at a minimum need the following:

  • Java 6 installed (Java 7 has not been tested)

Set up passphraseless ssh

These instructions are taken from the Hadoop Quick Start Guide.

Now check that you can ssh to the localhost without a passphrase:

ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
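
The check above can also be scripted so that it never stops to prompt. The following is a minimal sketch, assuming OpenSSH's BatchMode and ConnectTimeout options; the function name can_ssh_without_passphrase is hypothetical and is not part of Hadoop or Blur.

```shell
# Hypothetical helper: returns 0 only if ssh to the host succeeds without any prompt.
can_ssh_without_passphrase() {
  host=${1:-localhost}
  # BatchMode=yes disables passphrase and password prompts, so a missing key fails
  # instead of waiting for input. ConnectTimeout bounds the wait on a dead host.
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1
}
```

Running can_ssh_without_passphrase localhost and checking its exit status tells you whether the ssh-keygen step is still needed.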

Heads Up!

You will also need to know the location of the JAVA_HOME directory.

Download Source and Binary Artifacts

Both the source and binary artifacts are provided via mirrors here:

  • Apache Blur 0.2.0 Source
  • Apache Blur 0.2.0 Binary

If building from source, the distribution needs to be compiled before use.

Clone master

git clone https://git-wip-us.apache.org/repos/asf/incubator-blur.git

Build the artifacts (to run the tests, remove "-DskipTests"):

cd incubator-blur/
mvn install -DskipTests -P distribution

The binary artifact is located at distribution/target/apache-blur-0.2.0-incubating-bin.tar.gz.
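
To confirm that the build produced the artifact before moving on, a small check can help. This is a sketch under the assumption that it is run against the root of the cloned source tree; find_blur_tarball is a hypothetical name, not a script shipped with Blur.

```shell
# Hypothetical helper: print the path of the built binary tarball, or fail if none exists.
find_blur_tarball() {
  src_root=${1:-.}
  for f in "$src_root"/distribution/target/apache-blur-*-bin.tar.gz; do
    # An unmatched glob stays literal, so confirm that the file really exists.
    if [ -f "$f" ]; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  echo "no binary artifact found under $src_root/distribution/target" >&2
  return 1
}
```

Called as find_blur_tarball incubator-blur, it prints the tarball path on success and returns a non-zero status otherwise.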

Once a distribution is available, follow these steps to install it.

Extract the contents of the distribution

tar -xzvf apache-blur-*-bin.tar.gz

While it's not required, it is a good idea to set BLUR_HOME in your environment variables.

For bash edit .bash_profile and add:

export BLUR_HOME=<directory where Blur was extracted>
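
A quick sanity check can catch a BLUR_HOME that points at the wrong directory. The sketch below assumes only that the extracted distribution contains bin/start-all.sh and conf/blur-env.sh, the two files this guide uses; check_blur_home is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the directory looks like an extracted Blur distribution.
check_blur_home() {
  dir=${1:-${BLUR_HOME:-}}
  if [ -z "$dir" ]; then
    echo "BLUR_HOME is not set" >&2
    return 1
  fi
  # These two paths are the ones referenced elsewhere in this guide.
  for rel in bin/start-all.sh conf/blur-env.sh; do
    if [ ! -f "$dir/$rel" ]; then
      echo "missing $dir/$rel" >&2
      return 1
    fi
  done
  echo "BLUR_HOME looks valid: $dir"
}
```

With no argument the function falls back to the BLUR_HOME environment variable, so it can be run right after opening a new shell.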

A few settings must be configured before Apache Blur can start.

Edit $BLUR_HOME/conf/blur-env.sh and set JAVA_HOME:

export JAVA_HOME=<Java Home Directory>

Caution

If this variable is not set, then the script will attempt to locate JAVA_HOME by using the location of the "java" command.
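
To see what value such a fallback might arrive at, the idea can be approximated by hand. This sketch is not the logic of Blur's own script; it assumes a conventional JDK layout in which the java binary lives at <JAVA_HOME>/bin/java, and it relies on readlink -f as provided by GNU coreutils. The name java_home_from_binary is hypothetical.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the path of a java binary.
java_home_from_binary() {
  java_bin=$1
  # Resolve symlinks such as /usr/bin/java to the real binary.
  real=$(readlink -f "$java_bin") || return 1
  # Strip the trailing /bin/java to get the installation root.
  dirname "$(dirname "$real")"
}
```

For example, java_home_from_binary "$(command -v java)" prints a candidate directory that you can compare against the value you intend to export.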

Starting Apache Blur takes a single command.

To start Apache Blur run the following command:

$BLUR_HOME/bin/start-all.sh

This will start a single Controller server and a single Shard server on your localhost.

You should see:

blur@blurvm:~$ apache-blur-0.2.0-incubating/bin/start-all.sh 
localhost: ZooKeeper starting as process 6650.
localhost: Shard [0] starting as process 6783.
localhost: Controller [0] starting as process 6933.

To stop the servers, run the stop command. You should see:

blur@blurvm:~$ apache-blur-0.2.0-incubating/bin/stop-all.sh
localhost: Stopping Controller [0] server with pid [6933].
localhost: Stopping Shard [0] server with pid [6783].
localhost: Stopping ZooKeeper with pid [6650].

If running the start command while the servers should already be up starts them again, then there is likely some issue with startup. Look in the $BLUR_HOME/logs directory for log and out files.
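
When troubleshooting, it can be convenient to see the tail of every file in that directory at once. The sketch below assumes the files end in .log and .out, which is a guess based on the description above rather than a documented naming scheme; show_blur_logs is a hypothetical helper.

```shell
# Hypothetical helper: print the last lines of every log and out file in a logs directory.
show_blur_logs() {
  log_dir=${1:-${BLUR_HOME:-.}/logs}
  lines=${2:-20}
  found=1
  # The *.log and *.out patterns are assumptions about how the files are named.
  for f in "$log_dir"/*.log "$log_dir"/*.out; do
    [ -f "$f" ] || continue
    found=0
    echo "==> $f <=="
    tail -n "$lines" "$f"
  done
  # Non-zero status means no matching files were found.
  return "$found"
}
```

Called with no arguments it reads $BLUR_HOME/logs and shows the last 20 lines of each file; a second argument changes the line count.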

Once the servers have been started, you can use the shell to interact with Blur.

The shell command can be found in the bin directory.

To auto-detect the controller servers from the $BLUR_HOME/conf/controllers file, run:

$BLUR_HOME/bin/blur shell

You can also explicitly call out the controller servers.

$BLUR_HOME/bin/blur shell controller1:40010,controller2:40010
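
The explicit host list can be generated from a file instead of typed by hand. This sketch assumes a file with one controller hostname per line (blank lines and lines starting with # are skipped) and uses port 40010 from the example above as the default; both the file layout and the function name controllers_arg are assumptions.

```shell
# Hypothetical helper: turn a file of hostnames into host:port,host:port form.
controllers_arg() {
  file=$1
  port=${2:-40010}
  # Drop blank lines and comments, append the port, then join the lines with commas.
  grep -v -e '^[[:space:]]*$' -e '^#' "$file" \
    | sed "s/\$/:$port/" \
    | paste -sd, -
}
```

The result can be passed straight to the shell, for example $BLUR_HOME/bin/blur shell "$(controllers_arg "$BLUR_HOME/conf/controllers")".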

Once in the shell, tables can be created, enabled, disabled, and removed. Type help to get a list of the commands.

The example below creates a table and stores its contents in a local directory, /data/testtable, which only works when Blur runs as a single instance. On a Hadoop cluster the location would normally be an HDFS URI, for example hdfs://host:port/blur/tables/testTableName.
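
The two kinds of location differ only in the URI scheme and prefix. As a rough illustration, the hypothetical helper below builds either form; the /data and /blur/tables prefixes are simply the ones used in this guide's examples, not required paths.

```shell
# Hypothetical helper: build a table location URI for local or HDFS storage.
table_location() {
  table=$1
  namenode=${2:-}   # host:port of the HDFS namenode; empty means local disk
  if [ -n "$namenode" ]; then
    echo "hdfs://$namenode/blur/tables/$table"
  else
    echo "file:///data/$table"
  fi
}
```

The printed value is what would be passed to the -l option of the create command.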

Create Table

blur> #Creates a table called testtable in the local directory of /data/testtable with 11 shards
blur> create -t testtable -c 11 -l file:///data/testtable

Mutate


blur> #Adds a row to testtable
blur> mutate testtable rowid1 recordid1 fam0 col1:value1

Query


blur> #Runs a query on testtable
blur> query testtable fam0.col1:value1
 - Results Summary -
    total : 1
    time  : 7.874 ms
-----------------------------------------------------------------------------------------------------
      hit : 0
    score : 1.4142135381698608
       id : rowid1
 recordId : recordid1
   family : fam0
     col1 : value1
-----------------------------------------------------------------------------------------------------
 - Results Summary -
    total : 1
    time  : 7.874 ms

Enable Highlighting


blur> #Turns highlighting on
blur> highlight
highlight of query command is now on

Query with Highlights


blur> #Runs a query on testtable with highlighting on, notice <<<value1>>> is highlighted
blur> query testtable fam0.col1:value1
 - Results Summary -
    total : 1
    time  : 13.395 ms
-----------------------------------------------------------------------------------------------------
      hit : 0
    score : 1.4142135381698608
       id : rowid1
 recordId : recordid1
   family : fam0
     col1 : <<<value1>>>
-----------------------------------------------------------------------------------------------------
 - Results Summary -
    total : 1
    time  : 13.395 ms
blur>