Class ICUTokenizerConfig

  • Direct Known Subclasses:
    DefaultICUTokenizerConfig

    public abstract class ICUTokenizerConfig
    extends Object
    Class that allows for tailored Unicode Text Segmentation on a per-writing-system basis.
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Constructor Detail

      • ICUTokenizerConfig

        public ICUTokenizerConfig()
    • Method Detail

      • getBreakIterator

        public abstract com.ibm.icu.text.BreakIterator getBreakIterator(int script)
        Return a BreakIterator capable of processing a given script.
      • getType

        public abstract String getType(int script,
                                       int ruleStatus)
        Return a token type value for a given script and BreakIterator rule status.
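
The two abstract methods above form a small contract: pick a break iterator for a script, then map each (script, rule status) pair to a token type string. The sketch below illustrates that shape only. It is not the Lucene class: `SketchTokenizerConfig`, `SketchConfig`, the `SCRIPT_*` and `STATUS_*` constants, the `segment` helper, and the returned type strings are all hypothetical, and `java.text.BreakIterator` stands in for `com.ibm.icu.text.BreakIterator` so the example compiles with the JDK alone.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in that mirrors the two abstract methods documented above.
// Assumption: java.text.BreakIterator replaces the ICU4J BreakIterator here.
abstract class SketchTokenizerConfig {
    abstract BreakIterator getBreakIterator(int script);
    abstract String getType(int script, int ruleStatus);
}

class SketchConfig extends SketchTokenizerConfig {
    // Hypothetical script codes, invented for this sketch.
    static final int SCRIPT_LATIN = 1;
    static final int SCRIPT_OTHER = 2;

    // Hypothetical rule status values, invented for this sketch.
    static final int STATUS_WORD = 0;
    static final int STATUS_NUMBER = 1;

    @Override
    BreakIterator getBreakIterator(int script) {
        // BreakIterator is stateful, so hand out a fresh instance per call.
        return script == SCRIPT_LATIN
            ? BreakIterator.getWordInstance(Locale.ROOT)
            : BreakIterator.getCharacterInstance(Locale.ROOT);
    }

    @Override
    String getType(int script, int ruleStatus) {
        // Illustrative type labels; the rule status takes priority over the script.
        if (ruleStatus == STATUS_NUMBER) {
            return "<NUM>";
        }
        return script == SCRIPT_LATIN ? "<ALPHANUM>" : "<OTHER>";
    }

    // Helper: walk the boundaries and collect the non-whitespace segments.
    static List<String> segment(BreakIterator it, String text) {
        it.setText(text);
        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (!piece.trim().isEmpty()) {
                out.add(piece);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SketchConfig cfg = new SketchConfig();
        BreakIterator it = cfg.getBreakIterator(SCRIPT_LATIN);
        for (String token : segment(it, "hello world")) {
            System.out.println(token + " " + cfg.getType(SCRIPT_LATIN, STATUS_WORD));
        }
    }
}
```

A real subclass would extend ICUTokenizerConfig itself and return ICU4J break iterators; how script codes and rule status values are defined is left to the ICU4J documentation rather than assumed here.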