C D E F G H I L N O P R S T W
C
- clear() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- clone() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
-
Clone method.
- copyTo(AttributeImpl) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- current() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
D
- DefaultICUTokenizerConfig - Class in org.apache.lucene.analysis.icu.segmentation
-
Default ICUTokenizerConfig that is generally applicable to many languages.
- DefaultICUTokenizerConfig() - Constructor for class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
E
- end() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- equals(Object) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
F
- first() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
- following(int) - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
G
- getBreakIterator(int) - Method in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
- getBreakIterator(int) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Return a BreakIterator capable of processing a given script.
- getCode() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the numeric code for this script value.
- getCode() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getName() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the full name.
- getName() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getShortName() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the abbreviated name.
- getShortName() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getText() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
- getType(int, int) - Method in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
- getType(int, int) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Return a token type value for a given script and BreakIterator rule status.
H
- hashCode() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
I
- ICUCollationKeyAnalyzer - Class in org.apache.lucene.collation
-
Filters KeywordTokenizer with ICUCollationKeyFilter.
- ICUCollationKeyAnalyzer(Collator) - Constructor for class org.apache.lucene.collation.ICUCollationKeyAnalyzer
- ICUCollationKeyFilter - Class in org.apache.lucene.collation
-
Converts each token into its CollationKey, and then encodes the CollationKey with IndexableBinaryStringTools, to allow it to be stored as an index term.
- ICUCollationKeyFilter(TokenStream, Collator) - Constructor for class org.apache.lucene.collation.ICUCollationKeyFilter
- ICUFoldingFilter - Class in org.apache.lucene.analysis.icu
-
A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
- ICUFoldingFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.icu.ICUFoldingFilter
-
Create a new ICUFoldingFilter on the specified input
- ICUNormalizer2Filter - Class in org.apache.lucene.analysis.icu
-
Normalize token text with ICU's Normalizer2
- ICUNormalizer2Filter(TokenStream) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
-
Create a new Normalizer2Filter that applies NFKC normalization and case folding and removes default ignorables (NFKC_Casefold)
- ICUNormalizer2Filter(TokenStream, Normalizer2) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
-
Create a new Normalizer2Filter with the specified Normalizer2
- ICUTokenizer - Class in org.apache.lucene.analysis.icu.segmentation
-
Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/)
- ICUTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
-
Construct a new ICUTokenizer that breaks text into words from the given Reader.
- ICUTokenizer(Reader, ICUTokenizerConfig) - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
-
Construct a new ICUTokenizer that breaks text into words from the given Reader, using a tailored BreakIterator configuration.
- ICUTokenizerConfig - Class in org.apache.lucene.analysis.icu.segmentation
-
Class that allows for tailored Unicode Text Segmentation on a per-writing-system basis.
- ICUTokenizerConfig() - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
- ICUTransformFilter - Class in org.apache.lucene.analysis.icu
-
A TokenFilter that transforms text with ICU.
- ICUTransformFilter(TokenStream, Transliterator) - Constructor for class org.apache.lucene.analysis.icu.ICUTransformFilter
-
Create a new ICUTransformFilter that transforms text on the given stream.
- incrementToken() - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
- incrementToken() - Method in class org.apache.lucene.analysis.icu.ICUTransformFilter
- incrementToken() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- incrementToken() - Method in class org.apache.lucene.collation.ICUCollationKeyFilter
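As a rough illustration of what the ICUTokenizer and ICUNormalizer2Filter entries above describe, the sketch below splits text at word boundaries and then applies NFKC normalization plus lowercasing. It deliberately uses the JDK's `java.text.BreakIterator` and `java.text.Normalizer` as stand-ins, not the Lucene or ICU classes listed in this index. The names `SegmentAndNormalizeSketch`, `words`, and `normalize` are hypothetical, and lowercasing with `Locale.ROOT` only approximates the case folding performed by NFKC_Casefold.

```java
import java.text.BreakIterator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: JDK classes stand in for ICUTokenizer and ICUNormalizer2Filter.
class SegmentAndNormalizeSketch {

    // Split text at word boundaries, keeping only segments that contain a letter or digit
    // (this drops the whitespace and punctuation segments the iterator also reports).
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    // NFKC normalization followed by lowercasing; an approximation of NFKC_Casefold.
    static String normalize(String token) {
        return Normalizer.normalize(token, Normalizer.Form.NFKC).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // \uFB01 is the "fi" ligature, which NFKC decomposes into the two letters f and i.
        for (String w : words("The \uFB01rst Test")) {
            System.out.println(normalize(w));
        }
    }
}
```

In the real module the two steps are separate components chained in an analysis pipeline; the sketch keeps them as two static methods only to show the order of operations.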
L
- LaoBreakIterator - Class in org.apache.lucene.analysis.icu.segmentation
-
Syllable iterator for Lao text.
- LaoBreakIterator(RuleBasedBreakIterator) - Constructor for class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
- last() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
N
- next() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
- next(int) - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
O
- org.apache.lucene.analysis.icu - package org.apache.lucene.analysis.icu
-
Analysis components based on ICU
- org.apache.lucene.analysis.icu.segmentation - package org.apache.lucene.analysis.icu.segmentation
-
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
- org.apache.lucene.analysis.icu.tokenattributes - package org.apache.lucene.analysis.icu.tokenattributes
-
Additional ICU-specific Attributes for text analysis.
- org.apache.lucene.collation - package org.apache.lucene.collation
P
- previous() - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
R
- reflectWith(AttributeReflector) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- reset() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- reset(Reader) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- reusableTokenStream(String, Reader) - Method in class org.apache.lucene.collation.ICUCollationKeyAnalyzer
S
- ScriptAttribute - Interface in org.apache.lucene.analysis.icu.tokenattributes
-
This attribute stores the UTR #24 script value for a token of text.
- ScriptAttributeImpl - Class in org.apache.lucene.analysis.icu.tokenattributes
-
Implementation of ScriptAttribute that stores the script as an integer.
- ScriptAttributeImpl() - Constructor for class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- setCode(int) - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Set the numeric code for this script value.
- setCode(int) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- setText(String) - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
- setText(CharacterIterator) - Method in class org.apache.lucene.analysis.icu.segmentation.LaoBreakIterator
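The ScriptAttribute entries above describe a per-token script value with a numeric code, a full name, and an abbreviated name. A minimal sketch of the underlying idea follows, using the JDK's `Character.UnicodeScript` rather than ICU's script codes, so the values and names differ from what ScriptAttributeImpl stores. The `ScriptLookupSketch` class and its `dominantScript` helper are hypothetical, and picking the first non-common script is an assumption made for illustration, not the module's documented rule.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical sketch: the JDK's UnicodeScript stands in for ICU script codes.
class ScriptLookupSketch {

    // Return the script of the first code point that is not COMMON, INHERITED, or UNKNOWN;
    // fall back to COMMON when the token contains no such character (for example, digits only).
    static UnicodeScript dominantScript(String token) {
        return token.codePoints()
                .mapToObj(UnicodeScript::of)
                .filter(s -> s != UnicodeScript.COMMON
                          && s != UnicodeScript.INHERITED
                          && s != UnicodeScript.UNKNOWN)
                .findFirst()
                .orElse(UnicodeScript.COMMON);
    }

    public static void main(String[] args) {
        UnicodeScript script = dominantScript("\uD55C\uAE00"); // two Korean hangul syllables
        // ordinal() is only a stand-in for a numeric script code; it is not the ICU code.
        System.out.println(script.ordinal() + " " + script.name());
    }
}
```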
T
- tokenStream(String, Reader) - Method in class org.apache.lucene.collation.ICUCollationKeyAnalyzer
W
- WORD_HANGUL - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Korean hangul
- WORD_HIRAGANA - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Japanese hiragana
- WORD_IDEO - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing ideographic characters
- WORD_KATAKANA - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Japanese katakana
- WORD_LETTER - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words that contain letters
- WORD_NUMBER - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words that appear to be numbers