Class |
Description |
AbstractEncoder |
Base class for payload encoders.
|
Among |
|
ArabicAnalyzer |
Analyzer for Arabic.
|
ArabicLetterTokenizer |
Deprecated.
|
ArabicNormalizationFilter |
|
ArabicNormalizer |
Normalizer for Arabic.
|
ArabicStemFilter |
|
ArabicStemmer |
Stemmer for Arabic.
|
ArmenianAnalyzer |
Analyzer for Armenian.
|
ArmenianStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
BasqueAnalyzer |
Analyzer for Basque.
|
BasqueStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
BrazilianAnalyzer |
Analyzer for Brazilian Portuguese language.
|
BrazilianStemFilter |
|
BrazilianStemmer |
A stemmer for Brazilian Portuguese words.
|
BulgarianAnalyzer |
Analyzer for Bulgarian.
|
BulgarianStemFilter |
|
BulgarianStemmer |
Light Stemmer for Bulgarian.
|
ByteVector |
This class implements a simple byte vector with access to the underlying
array.
|
CatalanAnalyzer |
Analyzer for Catalan.
|
CatalanStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
CharArrayIterator |
|
CharVector |
This class implements a simple char vector with access to the underlying
array.
|
ChineseAnalyzer |
Deprecated.
|
ChineseFilter |
Deprecated.
|
ChineseTokenizer |
Deprecated.
|
CJKAnalyzer |
An Analyzer that tokenizes text with StandardTokenizer,
normalizes content with CJKWidthFilter, folds case with
LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter,
and filters stopwords with StopFilter.
|
CJKBigramFilter |
Forms bigrams of CJK terms that are generated from StandardTokenizer
or ICUTokenizer.
|
CJKTokenizer |
Deprecated.
|
CJKWidthFilter |
A TokenFilter that normalizes CJK width differences:
folds fullwidth ASCII variants into the equivalent Basic Latin, and
folds halfwidth Katakana variants into the equivalent kana.
|
CompoundWordTokenFilterBase |
Base class for decomposition token filters.
|
CzechAnalyzer |
Analyzer for Czech language.
|
CzechStemFilter |
A TokenFilter that applies CzechStemmer to stem Czech words.
|
CzechStemmer |
Light Stemmer for Czech.
|
DanishAnalyzer |
Analyzer for Danish.
|
DanishStemmer |
Generated class implementing code defined by a snowball script.
|
DateRecognizerSinkFilter |
Attempts to parse the CharTermAttributeImpl.termBuffer() as a Date using a DateFormat.
|
DelimitedPayloadTokenFilter |
Characters before the delimiter are the "token", those after are the payload.
|
DictionaryCompoundWordTokenFilter |
A TokenFilter that decomposes compound words found in many Germanic languages.
|
DutchAnalyzer |
Analyzer for Dutch language.
|
DutchStemFilter |
Deprecated.
|
DutchStemmer |
Deprecated.
|
DutchStemmer |
Generated class implementing code defined by a snowball script.
|
EdgeNGramTokenFilter |
Tokenizes the given token into n-grams of given size(s).
|
EdgeNGramTokenFilter.Side |
Specifies which side of the input the n-gram should be generated from
|
EdgeNGramTokenizer |
Tokenizes the input from an edge into n-grams of given size(s).
|
EdgeNGramTokenizer.Side |
Specifies which side of the input the n-gram should be generated from
|
ElisionFilter |
Removes elisions from a TokenStream.
|
EmptyTokenStream |
An always exhausted token stream.
|
EnglishAnalyzer |
Analyzer for English.
|
EnglishMinimalStemFilter |
|
EnglishMinimalStemmer |
Minimal plural stemmer for English.
|
EnglishPossessiveFilter |
TokenFilter that removes possessives (trailing 's) from words.
|
EnglishStemmer |
Generated class implementing code defined by a snowball script.
|
FinnishAnalyzer |
Analyzer for Finnish.
|
FinnishLightStemFilter |
|
FinnishLightStemmer |
Light Stemmer for Finnish.
|
FinnishStemmer |
Generated class implementing code defined by a snowball script.
|
FloatEncoder |
Encode a character array Float as a Payload.
|
FrenchAnalyzer |
Analyzer for French language.
|
FrenchLightStemFilter |
|
FrenchLightStemmer |
Light Stemmer for French.
|
FrenchMinimalStemFilter |
|
FrenchMinimalStemmer |
Light Stemmer for French.
|
FrenchStemFilter |
Deprecated.
|
FrenchStemmer |
Deprecated.
|
FrenchStemmer |
Generated class implementing code defined by a snowball script.
|
GalicianAnalyzer |
Analyzer for Galician.
|
GalicianMinimalStemFilter |
|
GalicianMinimalStemmer |
Minimal Stemmer for Galician
|
GalicianStemFilter |
|
GalicianStemmer |
Galician stemmer implementing "Regras do lematizador para o galego".
|
German2Stemmer |
Generated class implementing code defined by a snowball script.
|
GermanAnalyzer |
Analyzer for German language.
|
GermanLightStemFilter |
|
GermanLightStemmer |
Light Stemmer for German.
|
GermanMinimalStemFilter |
|
GermanMinimalStemmer |
Minimal Stemmer for German.
|
GermanNormalizationFilter |
|
GermanStemFilter |
A TokenFilter that stems German words.
|
GermanStemmer |
A stemmer for German words.
|
GermanStemmer |
Generated class implementing code defined by a snowball script.
|
GreekAnalyzer |
Analyzer for the Greek language.
|
GreekLowerCaseFilter |
Normalizes token text to lower case, removes some Greek diacritics,
and standardizes final sigma to sigma.
|
GreekStemFilter |
A TokenFilter that applies GreekStemmer to stem Greek
words.
|
GreekStemmer |
A stemmer for Greek words, according to: Development of a Stemmer for the
Greek Language. Georgios Ntais
|
HindiAnalyzer |
Analyzer for Hindi.
|
HindiNormalizationFilter |
|
HindiNormalizer |
Normalizer for Hindi.
|
HindiStemFilter |
A TokenFilter that applies HindiStemmer to stem Hindi words.
|
HindiStemmer |
Light Stemmer for Hindi.
|
HTMLStripCharFilter |
A CharFilter that wraps another Reader and attempts to strip out HTML constructs.
|
HungarianAnalyzer |
Analyzer for Hungarian.
|
HungarianLightStemFilter |
|
HungarianLightStemmer |
Light Stemmer for Hungarian.
|
HungarianStemmer |
Generated class implementing code defined by a snowball script.
|
HunspellAffix |
Wrapper class representing a hunspell affix
|
HunspellDictionary |
In-memory structure for the dictionary (.dic) and affix (.aff)
data of a hunspell dictionary.
|
HunspellStemFilter |
TokenFilter that uses hunspell affix rules and words to stem tokens.
|
HunspellStemmer |
HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word.
|
HunspellStemmer.Stem |
Stem represents all information known about a stem of a word.
|
HunspellWord |
A dictionary (.dic) entry with its associated flags.
|
Hyphen |
This class represents a hyphen.
|
Hyphenation |
This class represents a hyphenated word.
|
HyphenationCompoundWordTokenFilter |
A TokenFilter that decomposes compound words found in many Germanic languages.
|
HyphenationException |
This class has been taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/).
|
HyphenationTree |
This tree structure stores the hyphenation patterns in an efficient way for
fast lookup.
|
IdentityEncoder |
Does nothing other than convert the char array to a byte array using the specified encoding.
|
IndicNormalizationFilter |
A TokenFilter that applies IndicNormalizer to normalize text
in Indian Languages.
|
IndicNormalizer |
Normalizes the Unicode representation of text in Indian languages.
|
IndicTokenizer |
Deprecated.
|
IndonesianAnalyzer |
Analyzer for Indonesian (Bahasa)
|
IndonesianStemFilter |
|
IndonesianStemmer |
Stemmer for Indonesian.
|
IntegerEncoder |
Encode a character array Integer as a Payload.
|
IrishAnalyzer |
Analyzer for Irish.
|
IrishLowerCaseFilter |
Normalises token text to lower case, handling t-prothesis
and n-eclipsis (i.e., that 'nAthair' should become 'n-athair')
|
IrishStemmer |
This class was automatically generated by a Snowball to Java compiler.
It implements the stemming algorithm defined by a snowball script.
|
ItalianAnalyzer |
Analyzer for Italian.
|
ItalianLightStemFilter |
|
ItalianLightStemmer |
Light Stemmer for Italian.
|
ItalianStemmer |
Generated class implementing code defined by a snowball script.
|
KpStemmer |
Generated class implementing code defined by a snowball script.
|
KStemFilter |
A high-performance kstem filter for English.
|
KStemmer |
This class implements the Kstem algorithm
|
LatvianAnalyzer |
Analyzer for Latvian.
|
LatvianStemFilter |
|
LatvianStemmer |
Light stemmer for Latvian.
|
LovinsStemmer |
Generated class implementing code defined by a snowball script.
|
NGramTokenFilter |
Tokenizes the input into n-grams of the given size(s).
|
NGramTokenizer |
Tokenizes the input into n-grams of the given size(s).
|
NorwegianAnalyzer |
Analyzer for Norwegian.
|
NorwegianLightStemFilter |
|
NorwegianLightStemmer |
Light Stemmer for Norwegian.
|
NorwegianMinimalStemFilter |
|
NorwegianMinimalStemmer |
Minimal Stemmer for Norwegian bokmål (no-nb)
|
NorwegianStemmer |
Generated class implementing code defined by a snowball script.
|
NumericPayloadTokenFilter |
Assigns a payload to a token based on the Token.type()
|
OpenStringBuilder |
A StringBuilder that allows one to access the array.
|
PathHierarchyTokenizer |
Tokenizer for path-like hierarchies.
|
PatternAnalyzer |
Efficient Lucene analyzer/tokenizer that preferably operates on a String rather than a
Reader, that can flexibly separate text into terms via a regular expression Pattern
(with behaviour identical to String.split(String)),
and that combines the functionality of
LetterTokenizer, LowerCaseTokenizer, WhitespaceTokenizer, and StopFilter
into a single efficient multi-purpose class.
|
PatternConsumer |
This interface is used to connect the XML pattern file parser to the
hyphenation tree.
|
PatternParser |
A SAX document handler to read and parse hyphenation patterns from an XML
file.
|
PayloadEncoder |
Mainly for use with the DelimitedPayloadTokenFilter, converts char buffers to Payload.
|
PayloadHelper |
Utility methods for encoding payloads.
|
PersianAnalyzer |
Analyzer for Persian.
|
PersianCharFilter |
CharFilter that replaces instances of Zero-width non-joiner with an
ordinary space.
|
PersianNormalizationFilter |
|
PersianNormalizer |
Normalizer for Persian.
|
PorterStemmer |
Generated class implementing code defined by a snowball script.
|
PortugueseAnalyzer |
Analyzer for Portuguese.
|
PortugueseLightStemFilter |
|
PortugueseLightStemmer |
Light Stemmer for Portuguese
|
PortugueseMinimalStemFilter |
|
PortugueseMinimalStemmer |
Minimal Stemmer for Portuguese
|
PortugueseStemFilter |
|
PortugueseStemmer |
Portuguese stemmer implementing the RSLP (Removedor de Sufixos da Lingua Portuguesa)
algorithm.
|
PortugueseStemmer |
Generated class implementing code defined by a snowball script.
|
PositionFilter |
Sets the positionIncrement of all tokens to the configured "positionIncrement",
except the first returned token, which retains its original positionIncrement value.
|
PrefixAndSuffixAwareTokenFilter |
|
PrefixAwareTokenFilter |
Joins two token streams and leaves the last token of the first stream available
to be used when updating the token values in the second stream based on that token.
|
QueryAutoStopWordAnalyzer |
An Analyzer used primarily at query time to wrap another analyzer and provide a layer of protection
which prevents very common words from being passed into queries.
|
ReversePathHierarchyTokenizer |
Tokenizer for domain-like hierarchies.
|
ReverseStringFilter |
Reverse token string, for example "country" => "yrtnuoc".
|
RomanianAnalyzer |
Analyzer for Romanian.
|
RomanianStemmer |
Generated class implementing code defined by a snowball script.
|
RSLPStemmerBase |
Base class for stemmers that use a set of RSLP-like stemming steps.
|
RSLPStemmerBase.Rule |
A basic rule, with no exceptions.
|
RSLPStemmerBase.RuleWithSetExceptions |
A rule with a set of whole-word exceptions.
|
RSLPStemmerBase.RuleWithSuffixExceptions |
A rule with a set of exceptional suffixes.
|
RSLPStemmerBase.Step |
A step containing a list of rules.
|
RussianAnalyzer |
Analyzer for Russian language.
|
RussianLetterTokenizer |
Deprecated.
|
RussianLightStemFilter |
|
RussianLightStemmer |
Light Stemmer for Russian.
|
RussianLowerCaseFilter |
Deprecated.
|
RussianStemFilter |
Deprecated.
|
RussianStemmer |
Generated class implementing code defined by a snowball script.
|
ShingleAnalyzerWrapper |
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
|
ShingleFilter |
A ShingleFilter constructs shingles (token n-grams) from a token stream.
|
ShingleMatrixFilter |
Deprecated.
|
ShingleMatrixFilter.Matrix |
A column focused matrix in three dimensions:
|
ShingleMatrixFilter.OneDimensionalNonWeightedTokenSettingsCodec |
|
ShingleMatrixFilter.SimpleThreeDimensionalTokenSettingsCodec |
A full featured codec not to be used for something serious.
|
ShingleMatrixFilter.TokenPositioner |
|
ShingleMatrixFilter.TokenSettingsCodec |
Strategy used to code and decode metadata of the tokens from the input stream
regarding how to position the tokens in the matrix, set and retrieve weight, etc.
|
ShingleMatrixFilter.TwoDimensionalNonWeightedSynonymTokenSettingsCodec |
A codec that creates a two dimensional matrix
by treating tokens from the input stream with 0 position increment
as new rows to the current column.
|
SingleTokenTokenStream |
A TokenStream containing a single token.
|
SnowballAnalyzer |
Deprecated.
|
SnowballFilter |
A filter that stems words using a Snowball-generated stemmer.
|
SnowballProgram |
This is rev 502 of the Snowball SVN trunk,
but modified:
made abstract and introduced abstract method stem to avoid expensive reflection in filter class.
|
SolrSynonymParser |
Parser for the Solr synonyms format.
|
SpanishAnalyzer |
Analyzer for Spanish.
|
SpanishLightStemFilter |
|
SpanishLightStemmer |
Light Stemmer for Spanish
|
SpanishStemmer |
Generated class implementing code defined by a snowball script.
|
StemmerOverrideFilter |
Provides the ability to override any KeywordAttribute aware stemmer
with custom dictionary-based stemming.
|
StemmerUtil |
Some commonly-used stemming functions
|
SwedishAnalyzer |
Analyzer for Swedish.
|
SwedishLightStemFilter |
|
SwedishLightStemmer |
Light Stemmer for Swedish.
|
SwedishStemmer |
Generated class implementing code defined by a snowball script.
|
SynonymFilter |
Matches single or multi word synonyms in a token stream.
|
SynonymMap |
A map of synonyms; keys and values are phrases.
|
SynonymMap.Builder |
Builds an FSTSynonymMap.
|
TernaryTree |
Ternary Search Tree.
|
TestApp |
|
ThaiAnalyzer |
Analyzer for Thai language.
|
ThaiWordFilter |
TokenFilter that uses BreakIterator to break each
Token that is Thai into separate Token(s) for each Thai word.
|
TokenOffsetPayloadTokenFilter |
Adds the token's start and end offsets (Token.setStartOffset(int)
and Token.setEndOffset(int)) as a payload.
The first 4 bytes are the start offset.
|
TokenRangeSinkFilter |
Counts the tokens as they go by and saves to the internal list those within the range from lower to upper, exclusive of upper.
|
TokenTypeSinkFilter |
Adds a token to the sink if it has a specific type.
|
TurkishAnalyzer |
Analyzer for Turkish.
|
TurkishLowerCaseFilter |
Normalizes Turkish token text to lower case.
|
TurkishStemmer |
Generated class implementing code defined by a snowball script.
|
TypeAsPayloadTokenFilter |
Makes the Token.type() a payload.
|
WikipediaTokenizer |
Extension of StandardTokenizer that is aware of Wikipedia syntax.
|
WordnetSynonymParser |
Parser for wordnet prolog format
|
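
Several of the one-line descriptions above (NGramTokenizer, EdgeNGramTokenizer, ShingleFilter, DelimitedPayloadTokenFilter, ReverseStringFilter) are easier to picture with concrete input and output. The following is a minimal plain-Java sketch of those transformations on ordinary strings. It is not the Lucene implementation and does not use the Lucene API: the class name `AnalysisSketch` and all method names are hypothetical, the ordering of emitted n-grams is an assumption that may differ from what the real tokenizers produce, and splitting at the first delimiter is likewise an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Lucene.
final class AnalysisSketch {

    // All character n-grams of length minGram..maxGram, emitted by start position.
    static List<String> ngrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < term.length(); start++) {
            for (int n = minGram; n <= maxGram && start + n <= term.length(); n++) {
                out.add(term.substring(start, start + n));
            }
        }
        return out;
    }

    // N-grams anchored at the front edge of the term.
    static List<String> frontEdgeNgrams(String term, int minGram, int maxGram) {
        List<String> out = new ArrayList<>();
        for (int n = minGram; n <= maxGram && n <= term.length(); n++) {
            out.add(term.substring(0, n));
        }
        return out;
    }

    // Shingles: token-level n-grams of exactly `size` tokens, joined by a space.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            out.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return out;
    }

    // Characters before the delimiter are the token, those after are the payload.
    // Returns {token, payload}; payload is null when no delimiter is present.
    static String[] splitPayload(String raw, char delimiter) {
        int idx = raw.indexOf(delimiter);
        if (idx < 0) {
            return new String[] { raw, null };
        }
        return new String[] { raw.substring(0, idx), raw.substring(idx + 1) };
    }

    // Reverses the token text.
    static String reverse(String term) {
        return new StringBuilder(term).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abc", 1, 2));
        System.out.println(frontEdgeNgrams("apple", 1, 3));
        System.out.println(shingles(List.of("please", "divide", "this"), 2));
        System.out.println(String.join(" / ", splitPayload("hello|1.5", '|')));
        System.out.println(reverse("country"));
    }
}
```

In the real classes these operations run over a TokenStream and also maintain offsets and position increments, which this sketch leaves out on purpose.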