Package | Description |
---|---|
org.apache.tika | Apache Tika. |
org.apache.tika.config | Tika configuration tools. |
org.apache.tika.extractor | Extraction of component documents. |
org.apache.tika.fork | Forked parser. |
org.apache.tika.parser | Tika parsers. |
org.apache.tika.parser.epub | |
org.apache.tika.parser.external | External parser process. |
org.apache.tika.parser.xml | |

Modifier and Type | Method and Description |
---|---|
Parser | Tika.getParser(): Returns the parser instance used by this facade. |

Constructor and Description |
---|
Tika(Detector detector, Parser parser): Creates a Tika facade using the given detector and parser instances. |

Modifier and Type | Method and Description |
---|---|
Parser | TikaConfig.getParser(): Returns the configured parser instance. |
Parser | TikaConfig.getParser(MediaType mimeType): Deprecated. Use the TikaConfig.getParser() method instead. |

Constructor and Description |
---|
ParserContainerExtractor(Parser parser, Detector detector) |

Modifier and Type | Class and Description |
---|---|
class | ForkParser |

Constructor and Description |
---|
ForkParser(ClassLoader loader, Parser parser) |

Modifier and Type | Class and Description |
---|---|
class | AbstractParser: Abstract base class for new parsers. |
class | AutoDetectParser |
class | CompositeParser: Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document. |
class | CryptoParser: Decrypts the incoming document stream and delegates further parsing to another parser instance. |
class | DefaultParser: A composite parser based on all the Parser implementations available through the service provider mechanism. |
class | DelegatingParser: Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser. |
class | EmptyParser: Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream. |
class | ErrorParser: Dummy parser that always throws a TikaException without even attempting to parse the given document stream. |
class | NetworkParser |
class | ParserDecorator: Decorator base class for the Parser interface. |
class | ParserPostProcessor: Parser decorator that post-processes the results from a decorated parser. |

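The CompositeParser description above (delegation by declared content type, with a fallback for everything else) can be pictured with a small self-contained sketch. It deliberately does not use Tika's classes: MiniParser, MiniCompositeParser, and CompositeDemo are hypothetical stand-ins invented for illustration, and content types are plain strings rather than MediaType instances.

```java
import java.util.*;

// Hypothetical stand-in for a parser: converts input text into a result,
// given the content type declared for the document.
interface MiniParser {
    String parse(String input, String declaredType);
}

// Sketch of composite delegation (not Tika's implementation): look up a
// component parser by declared content type, else use the fallback.
class MiniCompositeParser implements MiniParser {
    private final Map<String, MiniParser> byType = new LinkedHashMap<>();
    // Default fallback yields an empty result, similar in spirit to an "empty" parser.
    private MiniParser fallback = (input, type) -> "";

    void register(String contentType, MiniParser parser) {
        byType.put(contentType, parser);
    }

    void setFallback(MiniParser fallback) {
        this.fallback = fallback;
    }

    MiniParser getFallback() {
        return fallback;
    }

    // Returns the component registered for the declared type, or the fallback.
    MiniParser getParser(String declaredType) {
        return byType.getOrDefault(declaredType, fallback);
    }

    @Override
    public String parse(String input, String declaredType) {
        return getParser(declaredType).parse(input, declaredType);
    }
}

class CompositeDemo {
    static MiniCompositeParser build() {
        MiniCompositeParser composite = new MiniCompositeParser();
        composite.register("text/plain", (in, type) -> in.trim());
        composite.register("text/x-upper", (in, type) -> in.toUpperCase());
        return composite;
    }

    public static void main(String[] args) {
        MiniCompositeParser composite = build();
        System.out.println(composite.parse("  hello  ", "text/plain"));
        System.out.println(composite.parse("hello", "text/x-upper"));
        // No component is registered for this type, so the fallback handles it.
        System.out.println(composite.parse("hello", "application/unknown").isEmpty());
    }
}
```

Keeping the fallback as a separate, replaceable parser means unknown types are handled by policy (empty output, an error, or another parser) rather than by a special case inside the lookup.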
Modifier and Type | Method and Description |
---|---|
protected Parser | DelegatingParser.getDelegateParser(ParseContext context): Returns the parser instance to which parsing tasks should be delegated. |
Parser | CompositeParser.getFallback(): Returns the fallback parser. |
protected Parser | CompositeParser.getParser(Metadata metadata): Returns the parser that best matches the given metadata. |
protected Parser | CompositeParser.getParser(Metadata metadata, ParseContext context) |
Parser | ParserDecorator.getWrappedParser(): Gets the parser wrapped by this ParserDecorator. |
static Parser | ParserDecorator.withTypes(Parser parser, Set<MediaType> types): Decorates the given parser so that it always claims to support parsing of the given media types. |

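The withTypes entry above describes a decorator that overrides only the advertised media types while leaving parsing to the wrapped parser. A minimal sketch of that pattern follows. It is a simplified model, not Tika's code: TypedParser, TypedParserDecorator, PlainTextParser, and DecoratorDemo are hypothetical names, and media types are represented as strings.

```java
import java.util.*;

// Hypothetical stand-in for a parser that reports which types it supports.
interface TypedParser {
    Set<String> getSupportedTypes();
    String parse(String input);
}

// Decorator base: forwards every call to the wrapped parser unchanged.
class TypedParserDecorator implements TypedParser {
    private final TypedParser wrapped;

    TypedParserDecorator(TypedParser wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public Set<String> getSupportedTypes() {
        return wrapped.getSupportedTypes();
    }

    @Override
    public String parse(String input) {
        return wrapped.parse(input);
    }

    TypedParser getWrappedParser() {
        return wrapped;
    }

    // Returns a decorator that claims the given types but parses as before.
    static TypedParser withTypes(TypedParser parser, final Set<String> types) {
        return new TypedParserDecorator(parser) {
            @Override
            public Set<String> getSupportedTypes() {
                return types;
            }
        };
    }
}

// Illustrative component parser that only advertises text/plain.
class PlainTextParser implements TypedParser {
    @Override
    public Set<String> getSupportedTypes() {
        return Collections.singleton("text/plain");
    }

    @Override
    public String parse(String input) {
        return input.trim();
    }
}

class DecoratorDemo {
    public static void main(String[] args) {
        TypedParser decorated = TypedParserDecorator.withTypes(
                new PlainTextParser(), Collections.singleton("text/x-log"));
        System.out.println(decorated.getSupportedTypes());
        System.out.println(decorated.parse("  entry  "));
    }
}
```

The wrapped parser is left untouched, so the same instance can still be used directly with its original type list.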
Modifier and Type | Method and Description |
---|---|
Map<MediaType,List<Parser>> | CompositeParser.findDuplicateParsers(ParseContext context): Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support. |
Map<MediaType,Parser> | CompositeParser.getParsers(): Returns the component parsers. |
Map<MediaType,Parser> | CompositeParser.getParsers(ParseContext context) |
Map<MediaType,Parser> | DefaultParser.getParsers(ParseContext context) |

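The findDuplicateParsers description above amounts to grouping parsers by each media type they declare and keeping only the types claimed more than once. The sketch below shows that grouping under simplifying assumptions: parsers are identified by name, media types are strings, and DuplicateFinder with its findDuplicates method is a hypothetical helper, not part of Tika.

```java
import java.util.*;

class DuplicateFinder {
    // Input: parser name -> media types that parser declares support for.
    // Output: media type -> parsers claiming it, only where there are two or more.
    static Map<String, List<String>> findDuplicates(
            Map<String, Set<String>> supportedTypesByParser) {
        Map<String, List<String>> parsersByType = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : supportedTypesByParser.entrySet()) {
            for (String type : entry.getValue()) {
                parsersByType
                        .computeIfAbsent(type, key -> new ArrayList<>())
                        .add(entry.getKey());
            }
        }
        // Drop every media type that has a single owner.
        parsersByType.values().removeIf(parsers -> parsers.size() < 2);
        return parsersByType;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parsers = new LinkedHashMap<>();
        parsers.put("HtmlParserA", new HashSet<>(Arrays.asList("text/html", "text/plain")));
        parsers.put("HtmlParserB", new HashSet<>(Arrays.asList("text/html")));
        parsers.put("PdfParser", new HashSet<>(Arrays.asList("application/pdf")));
        System.out.println(findDuplicates(parsers));
    }
}
```

A report like this is mainly useful when assembling a composite from many components, since a type-to-parser map can hold only one parser per media type.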
Modifier and Type | Method and Description |
---|---|
void | CompositeParser.setFallback(Parser fallback): Sets the fallback parser. |
static Parser | ParserDecorator.withTypes(Parser parser, Set<MediaType> types): Decorates the given parser so that it always claims to support parsing of the given media types. |

Modifier and Type | Method and Description |
---|---|
void | CompositeParser.setParsers(Map<MediaType,Parser> parsers): Sets the component parsers. |

Constructor and Description |
---|
AutoDetectParser(Detector detector, Parser... parsers) |
AutoDetectParser(Parser... parsers): Creates an auto-detecting parser instance using the specified set of parsers. |
CompositeParser(MediaTypeRegistry registry, Parser... parsers) |
ParserDecorator(Parser parser): Creates a decorator for the given parser. |
ParserPostProcessor(Parser parser): Creates a post-processing decorator for the given parser. |
ParsingReader(Parser parser, InputStream stream, Metadata metadata, ParseContext context): Creates a reader for the text content of the given binary stream with the given document metadata. |
ParsingReader(Parser parser, InputStream stream, Metadata metadata, ParseContext context, Executor executor): Creates a reader for the text content of the given binary stream with the given document metadata. |

Constructor and Description |
---|
CompositeParser(MediaTypeRegistry registry, List<Parser> parsers) |

Modifier and Type | Class and Description |
---|---|
class | EpubContentParser: Parser for EPUB OPS *.html files. |
class | EpubParser: EPUB parser. |

Modifier and Type | Method and Description |
---|---|
Parser | EpubParser.getContentParser() |
Parser | EpubParser.getMetaParser() |

Modifier and Type | Method and Description |
---|---|
void | EpubParser.setContentParser(Parser content) |
void | EpubParser.setMetaParser(Parser meta) |

Modifier and Type | Class and Description |
---|---|
class | CompositeExternalParser: A composite parser that wraps all the available external parsers and provides an easy way to access them. |
class | ExternalParser: Parser that uses an external program (like catdoc or pdf2txt) to extract text content and metadata from a given document. |

Modifier and Type | Class and Description |
---|---|
class | DcXMLParser: Dublin Core metadata parser. |
class | FictionBookParser |
class | XMLParser: XML parser. |

Copyright © 2007-2014 The Apache Software Foundation. All Rights Reserved.