Package org.apache.lucene.analysis.in
Class IndicTokenizer
- java.lang.Object
- org.apache.lucene.util.AttributeSource
- org.apache.lucene.analysis.TokenStream
- org.apache.lucene.analysis.Tokenizer
- org.apache.lucene.analysis.CharTokenizer
- org.apache.lucene.analysis.in.IndicTokenizer
- All Implemented Interfaces: Closeable, AutoCloseable
@Deprecated public final class IndicTokenizer extends CharTokenizer
Deprecated. (3.6) Use StandardTokenizer instead.
Simple Tokenizer for text in Indian Languages.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.AttributeFactory, AttributeSource.State
-
-
Constructor Summary
Constructors
Constructor: Description
IndicTokenizer(Version matchVersion, Reader input): Deprecated.
IndicTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input): Deprecated.
IndicTokenizer(Version matchVersion, AttributeSource source, Reader input): Deprecated.
-
Method Summary
All Methods, Instance Methods, Concrete Methods, Deprecated Methods
Modifier and Type, Method: Description
protected boolean isTokenChar(int c): Deprecated. Returns true iff a codepoint should be included in a token.
-
Methods inherited from class org.apache.lucene.analysis.CharTokenizer
end, incrementToken, isTokenChar, normalize, normalize, reset
-
Methods inherited from class org.apache.lucene.analysis.Tokenizer
close, correctOffset
-
Methods inherited from class org.apache.lucene.analysis.TokenStream
reset
-
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
-
Constructor Detail
-
IndicTokenizer
public IndicTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader input)
Deprecated.
-
IndicTokenizer
public IndicTokenizer(Version matchVersion, AttributeSource source, Reader input)
Deprecated.
-
-
Method Detail
-
isTokenChar
protected boolean isTokenChar(int c)
Deprecated.
Description copied from class: CharTokenizer
Returns true iff a codepoint should be included in a token. This tokenizer generates as tokens adjacent sequences of codepoints which satisfy this predicate. Codepoints for which this is false are used to define token boundaries and are not included in tokens.
As of Lucene 3.1 the char based API (CharTokenizer.isTokenChar(char) and CharTokenizer.normalize(char)) has been deprecated in favor of a Unicode 4.0 compatible int based API to support codepoints instead of UTF-16 code units. Subclasses of CharTokenizer must not override the char based methods if a Version >= 3.1 is passed to the constructor.
NOTE: This method will be marked abstract in Lucene 4.0.
- Overrides: isTokenChar in class CharTokenizer
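The predicate-driven splitting described above can be illustrated without Lucene. The following is a minimal sketch, not the library's implementation: the class name CodepointTokenizerSketch, the tokenize helper, and the particular predicate (letters plus combining marks and format characters) are assumptions chosen for illustration, and details of the real CharTokenizer such as buffering, offsets, and any token length limit are left out. It iterates by codepoint rather than by UTF-16 code unit, which is the point of the int based API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class CodepointTokenizerSketch {

    // Hypothetical predicate for illustration only; this page does not state
    // which codepoints IndicTokenizer actually accepts.
    public static boolean isTokenChar(int c) {
        int type = Character.getType(c);
        return Character.isLetter(c)
                || type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.FORMAT;
    }

    // Emits maximal runs of adjacent codepoints that satisfy the predicate.
    // Codepoints that fail it act as boundaries and are dropped.
    public static List<String> tokenize(String text, IntPredicate tokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);           // full codepoint, not a single char
            if (tokenChar.test(c)) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());    // boundary reached: flush the token
                current.setLength(0);
            }
            i += Character.charCount(c);           // advance 1 or 2 UTF-16 units
        }
        if (current.length() > 0) {
            tokens.add(current.toString());        // flush the trailing token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Punctuation, spaces, and digits fail the predicate, so they only
        // separate tokens.
        System.out.println(tokenize("hello, world 42", CodepointTokenizerSketch::isTokenChar));
    }
}
```

With this assumed predicate, combining vowel signs and the virama stay attached to their base consonants, so a Devanagari word is kept as one token instead of being split at each mark.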
-
-