org.apache.tika.mime
Class MimeTypes

java.lang.Object
  extended by org.apache.tika.mime.MimeTypes

public final class MimeTypes
extends java.lang.Object

This class is a MimeType repository. It gathers a set of MimeTypes and can retrieve a content type from its name, from a file name, or from a magic character sequence.

The MIME type detection methods that take an InputStream as an argument never read more than getMinLength() bytes from the stream. The given stream is also never closed, marked, or reset by these methods. A client can therefore use the mark feature of the stream (if available) to restore the stream to the state it was in before type detection, for example to process the stream based on the detected type.
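
The mark-and-restore pattern described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Tika code: the Detector interface and toyDetector are hypothetical stand-ins for a call such as getMimeType(InputStream), and the value 8 stands in for whatever getMinLength() would return.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class MarkResetSketch {

    /** Hypothetical stand-in for a detection call such as getMimeType(InputStream). */
    public interface Detector {
        String detect(InputStream stream) throws IOException;
    }

    /** Marks the stream, runs detection, then resets so the caller can re-read from the start. */
    public static String detectAndRewind(InputStream stream, int minLength, Detector detector)
            throws IOException {
        if (!stream.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        stream.mark(minLength); // the detector is assumed to read at most minLength bytes
        try {
            return detector.detect(stream);
        } finally {
            stream.reset(); // rewind to the marked position
        }
    }

    /** Toy detector (assumption): reads at most minLength bytes and looks for the PDF magic prefix. */
    public static Detector toyDetector(int minLength) {
        return stream -> {
            byte[] head = new byte[minLength];
            int n = stream.read(head, 0, minLength);
            String prefix = new String(head, 0, Math.max(n, 0), StandardCharsets.US_ASCII);
            return prefix.startsWith("%PDF-") ? "application/pdf" : "application/octet-stream";
        };
    }

    /** Runs the sketch on an in-memory document; returns the type and the first byte read after reset. */
    public static String demo() {
        byte[] doc = "%PDF-1.4 rest of document".getBytes(StandardCharsets.US_ASCII);
        int minLength = 8; // assumed value; a real caller would ask getMinLength()
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(doc))) {
            String type = detectAndRewind(in, minLength, toyDetector(minLength));
            return type + " " + (char) in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Wrapping the source in a BufferedInputStream is one way to obtain mark support when the underlying stream lacks it. The mark limit only needs to cover the bytes the detector may consume.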


Field Summary
static java.lang.String DEFAULT
          The default application/octet-stream MimeType
 
Constructor Summary
MimeTypes()
           
 
Method Summary
 void addPattern(MimeType type, java.lang.String pattern)
          Adds a file name pattern for the given media type.
 MimeType forName(java.lang.String name)
          Returns the registered media type with the given name (or alias).
 MimeType getMimeType(byte[] data)
          Returns the MIME type that best matches the given first few bytes of a document stream.
 MimeType getMimeType(java.io.File file)
          Find the Mime Content Type of a file.
 MimeType getMimeType(java.io.InputStream stream)
          Returns the MIME type that best matches the first few bytes of the given document stream.
 MimeType getMimeType(java.lang.String name)
          Find the Mime Content Type of a document from its name.
 MimeType getMimeType(java.lang.String name, byte[] data)
          Find the Mime Content Type of a document from its name and its content.
 MimeType getMimeType(java.lang.String name, java.io.InputStream stream)
          Returns the MIME type that best matches the given document name and the first few bytes of the given document stream.
 MimeType getMimeType(java.net.URL url)
          Find the Mime Content Type of a document from its URL.
 int getMinLength()
          Returns the minimum number of bytes to pass to the content-based detection methods so that all known MimeTypes can be checked.
 java.lang.String getType(java.lang.String typeName, java.lang.String url, byte[] data)
           
 java.lang.String getType(java.net.URL url)
          Determines the MIME type of the resource pointed to by the specified URL.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DEFAULT

public static final java.lang.String DEFAULT
The default application/octet-stream MimeType

See Also:
Constant Field Values
Constructor Detail

MimeTypes

public MimeTypes()
Method Detail

getMimeType

public MimeType getMimeType(java.io.File file)
Find the Mime Content Type of a file.

Parameters:
file - to analyze.
Returns:
the Mime Content Type of the specified file, or null if none is found.

getMimeType

public MimeType getMimeType(java.net.URL url)
Find the Mime Content Type of a document from its URL.

Parameters:
url - of the document to analyze.
Returns:
the Mime Content Type of the specified document URL, or null if none is found.

getMimeType

public MimeType getMimeType(java.lang.String name)
Find the Mime Content Type of a document from its name.

Parameters:
name - of the document to analyze.
Returns:
the Mime Content Type of the specified document name

getMimeType

public MimeType getMimeType(byte[] data)
Returns the MIME type that best matches the given first few bytes of a document stream.

The given byte array is expected to be at least getMinLength() long, or shorter only if the document stream itself is shorter.

Parameters:
data - first few bytes of a document stream
Returns:
matching MIME type, or null if no match is found
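
A caller has to prepare a byte array that satisfies the length expectation above. The following is a minimal sketch of one way to do that with the standard library; readPrefix and prefixLength are hypothetical helper names, and the limit of 16 is an assumed stand-in for getMinLength().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class PrefixReader {

    /** Reads up to maxLength bytes; the result is shorter only if the stream ends first. */
    public static byte[] readPrefix(InputStream stream, int maxLength) throws IOException {
        byte[] buffer = new byte[maxLength];
        int total = 0;
        while (total < maxLength) {
            int n = stream.read(buffer, total, maxLength - total);
            if (n == -1) {
                break; // end of stream reached before maxLength bytes
            }
            total += n;
        }
        return total == maxLength ? buffer : Arrays.copyOf(buffer, total);
    }

    /** Convenience for in-memory data; returns the prefix length actually read. */
    public static int prefixLength(byte[] document, int maxLength) {
        try {
            return readPrefix(new ByteArrayInputStream(document), maxLength).length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prefixLength(new byte[100], 16)); // long document: full prefix
        System.out.println(prefixLength(new byte[5], 16));   // short document: everything there is
    }
}
```

The loop matters because a single read call may return fewer bytes than requested even when more data is still to come.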

getMimeType

public MimeType getMimeType(java.io.InputStream stream)
                     throws java.io.IOException
Returns the MIME type that best matches the first few bytes of the given document stream.

Parameters:
stream - document stream
Returns:
matching MIME type, or null if no match is found
Throws:
java.io.IOException - if the stream can not be read
See Also:
getMimeType(byte[])

getType

public java.lang.String getType(java.lang.String typeName,
                                java.lang.String url,
                                byte[] data)

getType

public java.lang.String getType(java.net.URL url)
                         throws java.io.IOException
Determines the MIME type of the resource pointed to by the specified URL. Examines the file's header, and if it cannot determine the MIME type from the header, guesses the MIME type from the URL extension (e.g. "pdf").

Parameters:
url - URL of the resource to examine
Returns:
the MIME type of the resource, as a string
Throws:
java.io.IOException
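
The extension fallback mentioned above depends on pulling an extension out of a URL. The sketch below shows one plausible way to do that with java.net.URL. It is an assumption about the general approach, not the logic this class actually uses, and extensionOf is a hypothetical helper name.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.Locale;

public class UrlExtensionSketch {

    /** Returns the lower-cased extension of the URL path, or "" if there is none. */
    public static String extensionOf(URL url) {
        String path = url.getPath(); // excludes the query string and fragment
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) {
            return ""; // no dot in the last path segment, or a trailing dot
        }
        return path.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Convenience overload that parses the URL from a string. */
    public static String extensionOf(String spec) {
        try {
            return extensionOf(URI.create(spec).toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("http://example.com/docs/report.PDF?x=1"));
        System.out.println(extensionOf("http://example.com/v1.2/readme").isEmpty());
    }
}
```

Only the last path segment is inspected, so a dot in a directory name such as "v1.2" is not mistaken for an extension.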

getMimeType

public MimeType getMimeType(java.lang.String name,
                            byte[] data)
Find the Mime Content Type of a document from its name and its content. The policy used to guess the Mime Content Type is:
  1. Try to find the type based on the provided data.
  2. If a type is found, return it; otherwise try to find the type based on the file name.

Parameters:
name - of the document to analyze.
data - the first bytes of the document's content.
Returns:
the Mime Content Type of the specified document, or null if none is found.
See Also:
getMinLength()
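
The two-step policy above can be sketched as follows. The lookups toyByContent and toyByName are hypothetical stand-ins, not part of this class; they recognise one magic prefix and one file suffix, just enough to show the order of precedence.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class ContentThenNameSketch {

    /** Tries the content-based lookup first; falls back to the name only if that finds nothing. */
    public static String resolve(String name, byte[] data,
                                 Function<byte[], String> byContent,
                                 Function<String, String> byName) {
        String fromContent = byContent.apply(data); // step 1: look at the data
        if (fromContent != null) {
            return fromContent;                     // step 2: a content match wins
        }
        return byName.apply(name);                  // step 2: otherwise use the file name
    }

    /** Toy content lookup (assumption): recognises only the "%PDF-" magic prefix. */
    public static String toyByContent(byte[] data) {
        String head = new String(data, 0, Math.min(data.length, 5), StandardCharsets.US_ASCII);
        return head.equals("%PDF-") ? "application/pdf" : null;
    }

    /** Toy name lookup (assumption): recognises only the ".txt" suffix. */
    public static String toyByName(String name) {
        return name.endsWith(".txt") ? "text/plain" : null;
    }

    /** Runs the policy on a name and ASCII content using the toy lookups. */
    public static String demo(String name, String content) {
        return resolve(name, content.getBytes(StandardCharsets.US_ASCII),
                ContentThenNameSketch::toyByContent, ContentThenNameSketch::toyByName);
    }

    public static void main(String[] args) {
        System.out.println(demo("notes.txt", "%PDF-1.4")); // content overrides the misleading name
        System.out.println(demo("notes.txt", "hello"));    // no content match: name decides
    }
}
```

Checking content first means a misleading file extension does not override what the bytes say.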

getMimeType

public MimeType getMimeType(java.lang.String name,
                            java.io.InputStream stream)
                     throws java.io.IOException
Returns the MIME type that best matches the given document name and the first few bytes of the given document stream.

Parameters:
name - document name
stream - document stream
Returns:
matching MIME type, or null if no match is found
Throws:
java.io.IOException - if the stream can not be read
See Also:
getMimeType(String, byte[])

forName

public MimeType forName(java.lang.String name)
                 throws MimeTypeException
Returns the registered media type with the given name (or alias). The named media type is automatically registered (and returned) if it doesn't already exist.

Parameters:
name - media type name (case-insensitive)
Returns:
the registered media type with the given name or alias
Throws:
MimeTypeException - if the given media type name is invalid
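
The get-or-register behaviour described above can be illustrated with a small stand-alone registry. This is a sketch of the semantics only, not Tika's implementation. The Entry class, the "type/subtype" validity check, and the use of IllegalArgumentException in place of MimeTypeException are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class TypeRegistrySketch {

    /** Hypothetical stand-in for a registered media type. */
    public static final class Entry {
        public final String name;

        Entry(String name) {
            this.name = name;
        }
    }

    private final Map<String, Entry> types = new HashMap<>();

    /** Returns the entry for the name, registering it first if it is not yet known. */
    public Entry forName(String name) {
        // Assumed validity rule: "type/subtype" with no whitespace and exactly one slash.
        if (name == null || !name.matches("[^/\\s]+/[^/\\s]+")) {
            throw new IllegalArgumentException("invalid media type name: " + name);
        }
        String key = name.toLowerCase(Locale.ROOT); // names are treated as case-insensitive
        return types.computeIfAbsent(key, Entry::new);
    }

    /** Number of distinct registered names. */
    public int size() {
        return types.size();
    }

    public static void main(String[] args) {
        TypeRegistrySketch registry = new TypeRegistrySketch();
        Entry first = registry.forName("Text/Plain");
        Entry second = registry.forName("text/plain");
        System.out.println(first == second); // both calls resolve to the same entry
        System.out.println(registry.size());
    }
}
```

Because lookups register missing names, a caller never gets a "not found" result for a valid name; only an invalid name is rejected.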

addPattern

public void addPattern(MimeType type,
                       java.lang.String pattern)
                throws MimeTypeException
Adds a file name pattern for the given media type.

Parameters:
type - media type
pattern - file name pattern
Throws:
MimeTypeException - if the pattern conflicts with existing ones

getMinLength

public int getMinLength()
Returns the minimum number of bytes to pass to the content-based detection methods so that all known MimeTypes can be checked.

Returns:
the minimum length of data to provide.
See Also:
getMimeType(byte[]), getMimeType(String, byte[])


Copyright © 2008 The Apache Software Foundation. All Rights Reserved.