Class RussianLetterTokenizer

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    @Deprecated
    public class RussianLetterTokenizer
    extends org.apache.lucene.analysis.CharTokenizer
    Deprecated.
    Use StandardTokenizer instead, which has the same functionality. This tokenizer will be removed in Lucene 5.0.
    A RussianLetterTokenizer is a Tokenizer that extends LetterTokenizer by also allowing the basic Latin digits 0-9.

    You must specify the required Version compatibility when creating RussianLetterTokenizer:

    • As of 3.1, CharTokenizer uses an int based API to normalize and detect token characters. See CharTokenizer.isTokenChar(int) and CharTokenizer.normalize(int) for details.
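The int based API mentioned above matters for characters outside the Basic Multilingual Plane, which occupy two char values in a Java String. The following is a minimal sketch using only the JDK, not Lucene code; the class name CodePointScanSketch and the method countLetters are hypothetical names chosen for illustration. It walks a string by code point and applies Character.isLetter(int) to each one.

```java
final class CodePointScanSketch {

    // Counts the code points in text that satisfy Character.isLetter(int).
    // Iterating by code point (int) instead of by char keeps a surrogate
    // pair together, so a supplementary letter is tested as one character.
    static int countLetters(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                count++;
            }
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // "a" followed by U+1D400 (MATHEMATICAL BOLD CAPITAL A), written as
        // a surrogate pair. The string holds 3 chars but only 2 code points.
        String s = "a\uD835\uDC00";
        System.out.println(s.length());
        System.out.println(countLetters(s));
    }
}
```

A char based check would see the two surrogate halves separately, and neither half is a letter on its own, which is the kind of case an int based isTokenChar is meant to handle.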
    • Constructor Detail

      • RussianLetterTokenizer

        public RussianLetterTokenizer​(org.apache.lucene.util.Version matchVersion,
                                      Reader in)
        Deprecated.
        Construct a new RussianLetterTokenizer.
        Parameters:
        matchVersion - Lucene version to match; see the Version compatibility note above
        in - the input to split up into tokens
      • RussianLetterTokenizer

        public RussianLetterTokenizer​(org.apache.lucene.util.Version matchVersion,
                                      org.apache.lucene.util.AttributeSource source,
                                      Reader in)
        Deprecated.
        Construct a new RussianLetterTokenizer using a given AttributeSource.
        Parameters:
        matchVersion - Lucene version to match; see the Version compatibility note above
        source - the attribute source to use for this Tokenizer
        in - the input to split up into tokens
      • RussianLetterTokenizer

        public RussianLetterTokenizer​(org.apache.lucene.util.Version matchVersion,
                                      org.apache.lucene.util.AttributeSource.AttributeFactory factory,
                                      Reader in)
        Deprecated.
        Construct a new RussianLetterTokenizer using a given AttributeSource.AttributeFactory.
        Parameters:
        matchVersion - Lucene version to match; see the Version compatibility note above
        factory - the attribute factory to use for this Tokenizer
        in - the input to split up into tokens
    • Method Detail

      • isTokenChar

        protected boolean isTokenChar​(int c)
        Deprecated.
        Collects only characters which satisfy Character.isLetter(int) or are one of the basic Latin digits 0-9.
        Overrides:
        isTokenChar in class org.apache.lucene.analysis.CharTokenizer
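The token rule described for this class (letters plus the basic Latin digits 0-9) can be sketched without Lucene as follows. This is an illustrative approximation under stated assumptions, not the library's implementation: the class RussianLetterSketch and its tokenize helper are hypothetical, and the sketch ignores everything else a real CharTokenizer does, such as attribute handling, offsets, and normalization.

```java
import java.util.ArrayList;
import java.util.List;

final class RussianLetterSketch {

    // Assumed predicate: a letter per Character.isLetter(int),
    // or one of the basic Latin digits '0' through '9'.
    static boolean isTokenChar(int c) {
        return Character.isLetter(c) || (c >= '0' && c <= '9');
    }

    // Hypothetical helper: splits text into maximal runs of token characters.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (isTokenChar(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // A non-token character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

Under this rule, an input such as "abc 123, x9" splits into abc, 123 and x9: whitespace and punctuation end a token, while digits adjacent to letters stay in the same token.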