Package org.apache.droids.norobots

Using norobots-rfc


Interface Summary
ContentLoader An abstract loader intended for retrieving content identified by a URI.
Rule A robots.txt rule.

Class Summary
NoRobotClient A client that decides which URLs on a website may be visited, according to the norobots specification located at:
SimpleContentLoader A simple implementation of ContentLoader based on URLConnection.

Exception Summary
NoRobotException Application exception for anything that goes wrong while checking a robots.txt file.

Package org.apache.droids.norobots Description

Using norobots-rfc

  1. Import the class: import org.apache.droids.norobots.NoRobotClient;
  2. Create an instance for your user-agent: NoRobotClient nrc = new NoRobotClient("googlebot");
  3. Parse the robots.txt at a site: nrc.parse( new URL( "" ) );
  4. Ask whether a URL is allowed: boolean test = nrc.isUrlAllowed( new URL( "" ) );
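
To make the parse-then-ask flow of the steps above concrete, here is a minimal, library-free sketch of the kind of check such a client performs. It is an illustrative stand-in and not the NoRobotClient implementation: the class RobotsSketch and its methods disallowedFor and isAllowed are hypothetical names, the robots.txt content is passed in as a string instead of being fetched, and only User-agent and Disallow lines are handled, with simple prefix matching on the URL path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this is NOT how NoRobotClient is implemented.
class RobotsSketch {

    // Collects the Disallow prefixes that apply to the given user-agent.
    // Assumption: a group that names the agent replaces the "*" group.
    static List<String> disallowedFor(String robotsTxt, String userAgent) {
        List<String> specific = new ArrayList<>(); // rules from groups naming this agent
        List<String> wildcard = new ArrayList<>(); // rules from "*" groups
        boolean sawSpecific = false;               // did any group name this agent?
        boolean groupNamesAgent = false;           // current group names this agent
        boolean groupIsWildcard = false;           // current group names "*"
        boolean inAgentLines = false;              // reading consecutive User-agent lines
        String agent = userAgent.toLowerCase(Locale.ROOT);

        for (String raw : robotsTxt.split("\\R")) {
            // Strip comments, then split "field: value".
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank line or not a "field: value" record
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    // First User-agent line after rule lines starts a new group.
                    groupNamesAgent = false;
                    groupIsWildcard = false;
                    inAgentLines = true;
                }
                String v = value.toLowerCase(Locale.ROOT);
                if (v.equals("*")) {
                    groupIsWildcard = true;
                } else if (!v.isEmpty() && agent.contains(v)) {
                    groupNamesAgent = true;
                    sawSpecific = true;
                }
            } else {
                inAgentLines = false;
                // An empty Disallow value means "nothing is disallowed", so skip it.
                if (field.equals("disallow") && !value.isEmpty()) {
                    if (groupNamesAgent) {
                        specific.add(value);
                    }
                    if (groupIsWildcard) {
                        wildcard.add(value);
                    }
                }
            }
        }
        return sawSpecific ? specific : wildcard;
    }

    // A path is allowed unless it starts with one of the disallowed prefixes.
    static boolean isAllowed(List<String> disallowed, String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robotsTxt = String.join("\n",
                "User-agent: *",
                "Disallow: /private/",
                "",
                "User-agent: googlebot",
                "Disallow: /tmp/");
        List<String> rules = disallowedFor(robotsTxt, "googlebot");
        System.out.println(isAllowed(rules, "/tmp/cache.html"));  // prints false
        System.out.println(isAllowed(rules, "/private/a.html"));  // prints true
    }
}
```

In this sketch, disallowedFor plays the role of the parse step and isAllowed the role of the isUrlAllowed query. The real client may differ in how it matches user-agents and paths.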

Copyright © 2007-2009. All Rights Reserved.