Class ICUTokenizerConfig
java.lang.Object
  org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig

Direct Known Subclasses: DefaultICUTokenizerConfig
public abstract class ICUTokenizerConfig extends Object
Class that allows for tailored Unicode Text Segmentation on a per-writing system basis.

WARNING: This API is experimental and might change in incompatible ways in the next release.
Constructor Summary
| Constructor | Description |
| ICUTokenizerConfig() | |
Method Summary
| Modifier and Type | Method | Description |
| abstract com.ibm.icu.text.BreakIterator | getBreakIterator(int script) | Return a BreakIterator capable of processing a given script. |
| abstract String | getType(int script, int ruleStatus) | Return a token type value for a given script and BreakIterator rule status. |
Method Detail
getBreakIterator
public abstract com.ibm.icu.text.BreakIterator getBreakIterator(int script)
Return a BreakIterator capable of processing a given script.
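As a rough illustration of the per-script dispatch this method implies, the sketch below picks a break iterator by script code. It is not the Lucene implementation. To compile with the JDK alone it uses java.text.BreakIterator in place of com.ibm.icu.text.BreakIterator, and ScriptConfigSketch, PerScriptConfigSketch, SCRIPT_LATIN and SCRIPT_THAI are hypothetical names introduced only for this example.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical stand-in for ICUTokenizerConfig so the sketch compiles with the
// JDK alone; the real class returns com.ibm.icu.text.BreakIterator.
abstract class ScriptConfigSketch {
    abstract BreakIterator getBreakIterator(int script);
}

class PerScriptConfigSketch extends ScriptConfigSketch {
    // Placeholder script codes, assumed to match UScript.LATIN and UScript.THAI.
    static final int SCRIPT_LATIN = 25;
    static final int SCRIPT_THAI = 38;

    // One prototype per tailoring; everything else shares the default.
    private final BreakIterator defaultProto = BreakIterator.getWordInstance(Locale.ROOT);
    private final BreakIterator thaiProto =
        BreakIterator.getWordInstance(Locale.forLanguageTag("th"));

    @Override
    BreakIterator getBreakIterator(int script) {
        BreakIterator proto = (script == SCRIPT_THAI) ? thaiProto : defaultProto;
        // Break iterators keep per-text state, so hand out a private copy.
        return (BreakIterator) proto.clone();
    }

    public static void main(String[] args) {
        BreakIterator it = new PerScriptConfigSketch().getBreakIterator(SCRIPT_LATIN);
        String text = "tailored segmentation";
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (!piece.trim().isEmpty()) {
                System.out.println(piece); // print non-whitespace segments only
            }
        }
    }
}
```

Returning a clone rather than the shared prototype is a design choice of this sketch: each caller can call setText on its own instance without disturbing others.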
getType
public abstract String getType(int script, int ruleStatus)
Return a token type value for a given script and BreakIterator rule status.
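To make the (script, ruleStatus) to token type mapping concrete, here is a minimal JDK-only sketch of the decision a getType override might make. The class TokenTypeSketch, the method typeFor, and the returned type strings are illustrative assumptions, not Lucene's actual values. The numeric ranges are assumed to mirror the WORD_* rule-status constants in ICU4J and should be checked against the ICU4J version in use.

```java
// Hypothetical sketch; a real override would live in an ICUTokenizerConfig subclass.
class TokenTypeSketch {
    // Rule-status range starts, assumed to mirror ICU4J's WORD_* constants:
    // NONE 0-99, NUMBER 100-199, LETTER 200-299, KANA 300-399, IDEO 400-499.
    static final int WORD_NUMBER = 100;
    static final int WORD_LETTER = 200;
    static final int WORD_KANA = 300;
    static final int WORD_IDEO = 400;
    static final int WORD_IDEO_LIMIT = 500;

    // Placeholder script code, assumed to match UScript.HANGUL.
    static final int SCRIPT_HANGUL = 18;

    // Maps a script code and a break-rule status to an illustrative type label.
    static String typeFor(int script, int ruleStatus) {
        if (ruleStatus >= WORD_IDEO && ruleStatus < WORD_IDEO_LIMIT) {
            return "<IDEOGRAPHIC>";
        }
        if (ruleStatus >= WORD_KANA && ruleStatus < WORD_IDEO) {
            return "<KANA>";
        }
        if (ruleStatus >= WORD_LETTER && ruleStatus < WORD_KANA) {
            // The script refines the type when the rule status alone is ambiguous.
            return script == SCRIPT_HANGUL ? "<HANGUL>" : "<ALPHANUM>";
        }
        if (ruleStatus >= WORD_NUMBER && ruleStatus < WORD_LETTER) {
            return "<NUM>";
        }
        return "<OTHER>"; // punctuation, whitespace, or an unknown status
    }

    public static void main(String[] args) {
        System.out.println(typeFor(SCRIPT_HANGUL, WORD_LETTER));
        System.out.println(typeFor(0, WORD_NUMBER));
    }
}
```

The point of taking both parameters is visible in the letter branch: the same rule status yields a different token type depending on the script.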