| Interface Summary | |
|---|---|
| ContentLoader | An abstract loader intended for retrieving content identified by a URI. |
| Rule | A robots.txt rule. |

| Class Summary | |
|---|---|
| NoRobotClient | A client that decides which URLs on a website may be accessed, according to the norobots specification at http://www.robotstxt.org/wc/norobots-rfc.html |
| SimpleContentLoader | A simple implementation of ContentLoader based on URLConnection. |

| Exception Summary | |
|---|---|
| NoRobotException | Application exception thrown when something goes wrong while checking a robots.txt file. |
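
import java.net.URL;
import org.apache.http.norobots.NoRobotClient;

// Create a client for the given user agent and parse the site's robots.txt.
NoRobotClient nrc = new NoRobotClient("googlebot");
nrc.parse( new URL( "http://www.apache.org/" ) );
// Ask whether that user agent may fetch a specific URL.
boolean test = nrc.isUrlAllowed( new URL( "http://www.apache.org/index.html" ) );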
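
This page does not show how the package evaluates rules internally. As a rough illustration of the kind of decision `isUrlAllowed` makes, here is a minimal, self-contained sketch of the matching described in the norobots draft: the user-agent name is compared case-insensitively as a substring, the `*` record is the fallback, and the first matching `Allow` or `Disallow` path prefix decides. `RobotsSketch` and `isAllowed` are hypothetical names and are not part of this package. The sketch also simplifies: it merges all records that match the agent instead of obeying only the first one, and it does no URL decoding.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch for illustration only; not this package's implementation.
class RobotsSketch {

    /** Returns true if userAgent may fetch path under the given robots.txt text. */
    static boolean isAllowed(String robotsTxt, String userAgent, String path) {
        String agent = userAgent.toLowerCase(Locale.ROOT);
        List<String[]> specific = new ArrayList<>(); // rules from records naming this agent
        List<String[]> wildcard = new ArrayList<>(); // rules from "*" records
        List<String[]> target = null;                // list receiving the current record's rules
        boolean matchedSpecific = false;             // true once a record names this agent
        boolean inAgentLines = false;                // true while reading consecutive User-agent lines

        for (String raw : robotsTxt.split("\\r?\\n")) {
            // Strip comments, then split "field: value" at the first colon.
            int hash = raw.indexOf('#');
            String line = (hash >= 0 ? raw.substring(0, hash) : raw).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    target = null; // a User-agent line after rule lines starts a new record
                }
                inAgentLines = true;
                String name = value.toLowerCase(Locale.ROOT);
                if (name.equals("*")) {
                    if (target == null) {
                        target = wildcard;
                    }
                } else if (!name.isEmpty() && agent.contains(name)) {
                    target = specific;
                    matchedSpecific = true;
                }
            } else if (field.equals("allow") || field.equals("disallow")) {
                inAgentLines = false;
                if (target != null) {
                    target.add(new String[] { field, value });
                }
            }
        }

        // Prefer rules that name the agent; otherwise fall back to the "*" record.
        List<String[]> rules = matchedSpecific ? specific : wildcard;
        for (String[] rule : rules) {
            // An empty path matches nothing; otherwise the first matching prefix decides.
            if (!rule[1].isEmpty() && path.startsWith(rule[1])) {
                return rule[0].equals("allow");
            }
        }
        return true; // no rule matched, so access is allowed by default
    }

    public static void main(String[] args) {
        String robots = String.join("\n",
            "User-agent: *",
            "Disallow: /private/",
            "",
            "User-agent: googlebot",
            "Allow: /private/public.html",
            "Disallow: /private/");
        System.out.println(isAllowed(robots, "googlebot", "/private/public.html"));
        System.out.println(isAllowed(robots, "googlebot", "/private/secret.html"));
        System.out.println(isAllowed(robots, "otherbot", "/index.html"));
    }
}
```

Because the first match decides, the order of `Allow` and `Disallow` lines within a record matters: placing the narrower `Allow` path before the broader `Disallow` is what lets the single page through in the sample above.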