org.apache.droids.norobots
Class NoRobotClient

java.lang.Object
  extended by org.apache.droids.norobots.NoRobotClient

public class NoRobotClient
extends java.lang.Object

A client that may be used to decide which URLs on a website may be visited, according to the norobots specification located at: http://www.robotstxt.org/wc/norobots-rfc.html
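The core of the norobots check is prefix matching of request paths against the Disallow rules recorded for a matching User-agent. The following is a minimal, self-contained sketch of that logic (class and method names here are hypothetical illustrations of the spec, not the library's implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the norobots logic NoRobotClient encapsulates.
public class RobotsSketch {
    private final List<String> disallowed = new ArrayList<>();

    // Collect the Disallow prefixes that apply to the given user-agent
    // (or to the wildcard "*" agent).
    public void parseText(String robotsTxt, String userAgent) {
        boolean applies = false;
        for (String raw : robotsTxt.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int colon = line.indexOf(':');
            if (colon < 0) continue;
            String field = line.substring(0, colon).trim().toLowerCase();
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                applies = value.equals("*") || value.equalsIgnoreCase(userAgent);
            } else if (field.equals("disallow") && applies && !value.isEmpty()) {
                disallowed.add(value);
            }
        }
    }

    // A path is allowed unless it starts with a recorded Disallow prefix.
    public boolean isPathAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) return false;
        }
        return true;
    }
}
```

This sketch omits details such as record separation by blank lines and Allow directives; NoRobotClient implements the full rules of the specification.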


Constructor Summary
NoRobotClient(ContentLoader contentLoader, java.lang.String userAgent)
          Create a Client for a particular user-agent name and the given ContentLoader.
NoRobotClient(java.lang.String userAgent)
          Create a Client for a particular user-agent name.
 
Method Summary
 boolean isUrlAllowed(java.net.URI uri)
          Decide if the parsed website will allow this URL to be seen.
static java.util.Map<java.lang.String,org.apache.droids.norobots.RulesEngine> parse(java.io.InputStream instream)
          Parse robots.txt content from a stream into a map of user-agent name to rules.
 void parse(java.net.URI baseUri)
          Head to a website and suck in their robots.txt file.
 void parseText(java.io.InputStream instream)
          Parse robots.txt content for this client's user-agent from a stream.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

NoRobotClient

public NoRobotClient(ContentLoader contentLoader,
                     java.lang.String userAgent)
Create a Client for a particular user-agent name and the given ContentLoader.

Parameters:
contentLoader - the ContentLoader used to fetch the robots.txt file
userAgent - name for the robot

NoRobotClient

public NoRobotClient(java.lang.String userAgent)
Create a Client for a particular user-agent name.

Parameters:
userAgent - name for the robot
Method Detail

parse

public void parse(java.net.URI baseUri)
           throws java.io.IOException,
                  NoRobotException
Head to a website and suck in its robots.txt file. Note that the URI passed in is for the website and does not include the robots.txt file itself.

Parameters:
baseUri - base URI of the site
Throws:
java.io.IOException
NoRobotException
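Since the URI passed to parse identifies the site rather than the file, the client presumably resolves the fixed path /robots.txt against it, along these lines (an illustrative sketch, not the library's code):

```java
import java.net.URI;

public class RobotsLocation {
    // Derive the robots.txt URI from a site's base URI by resolving
    // the absolute path "/robots.txt" against it.
    public static URI robotsTxtFor(URI baseUri) {
        return baseUri.resolve("/robots.txt");
    }
}
```

For example, a base URI of http://www.example.com/some/page resolves to http://www.example.com/robots.txt.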

parseText

public void parseText(java.io.InputStream instream)
               throws java.io.IOException,
                      NoRobotException
Throws:
java.io.IOException
NoRobotException

parse

public static java.util.Map<java.lang.String,org.apache.droids.norobots.RulesEngine> parse(java.io.InputStream instream)
                                                                                    throws java.io.IOException
Throws:
java.io.IOException

isUrlAllowed

public boolean isUrlAllowed(java.net.URI uri)
                     throws java.lang.IllegalStateException,
                            java.lang.IllegalArgumentException
Decide if the parsed website will allow this URL to be seen. Note that parse(URI) must be called before this method is called.

Parameters:
uri - the URL in question
Returns:
true if the URL is allowed, false otherwise
Throws:
java.lang.IllegalStateException - when parse has not been called
java.lang.IllegalArgumentException
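The IllegalStateException contract above amounts to a simple guard: the rules must have been loaded before any query. A hypothetical, self-contained sketch of that pattern:

```java
// Hypothetical sketch of the parse-before-query guard; not the library's code.
public class GuardSketch {
    private boolean parsed = false;

    // Stands in for parse(URI)/parseText(InputStream).
    public void parse() {
        parsed = true;
    }

    // Refuses to answer until the rules have been loaded.
    public boolean isUrlAllowed(String url) {
        if (!parsed) {
            throw new IllegalStateException("parse has not been called");
        }
        return true; // rule evaluation would happen here
    }
}
```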


Copyright © 2007-2009. All Rights Reserved.