Class HunspellStemmer
- java.lang.Object
  - org.apache.lucene.analysis.hunspell.HunspellStemmer
public class HunspellStemmer extends Object
HunspellStemmer uses the affix rules declared in the HunspellDictionary to generate one or more stems for a word. It follows the original hunspell algorithm, including recursive suffix stripping.
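To make "recursive suffix stripping" concrete, here is a minimal, self-contained sketch of the idea. It is not Lucene's implementation and does not read hunspell files: the class SuffixStripSketch, the SuffixRule type, the in-memory dictionary, and the cap of two suffixes per word are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: not Lucene's HunspellStemmer and not the hunspell file format.
final class SuffixStripSketch {

    // Hypothetical rule: a word ending in `append` may come from (word - append + strip).
    static final class SuffixRule {
        final String append;
        final String strip;

        SuffixRule(String append, String strip) {
            this.append = append;
            this.strip = strip;
        }
    }

    // Assumed cap on how many suffixes may be peeled off one word.
    private static final int MAX_SUFFIXES = 2;

    private final Set<String> dictionary;
    private final List<SuffixRule> rules;

    SuffixStripSketch(Set<String> dictionary, List<SuffixRule> rules) {
        this.dictionary = dictionary;
        this.rules = rules;
    }

    // Collects every dictionary entry reachable by stripping suffixes from the word.
    List<String> stem(String word) {
        List<String> stems = new ArrayList<>();
        if (dictionary.contains(word)) {
            stems.add(word); // the word is already a dictionary entry
        }
        stripSuffixes(word, 0, stems);
        return stems;
    }

    private void stripSuffixes(String word, int stripped, List<String> stems) {
        if (stripped >= MAX_SUFFIXES) {
            return; // recursion cap reached
        }
        for (SuffixRule rule : rules) {
            if (word.length() <= rule.append.length() || !word.endsWith(rule.append)) {
                continue; // rule does not apply to this word
            }
            String candidate =
                word.substring(0, word.length() - rule.append.length()) + rule.strip;
            if (dictionary.contains(candidate)) {
                stems.add(candidate);
            }
            // Recurse: the candidate may itself carry another suffix.
            stripSuffixes(candidate, stripped + 1, stems);
        }
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<>(Arrays.asList("drink"));
        List<SuffixRule> rules =
            Arrays.asList(new SuffixRule("s", ""), new SuffixRule("able", ""));
        SuffixStripSketch stemmer = new SuffixStripSketch(dictionary, rules);
        System.out.println(stemmer.stem("drinkables")); // [drink]
    }
}
```

In this sketch "drinkables" first loses "s", the intermediate form "drinkable" is not a dictionary entry, and the recursive call then strips "able" to reach "drink".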
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | HunspellStemmer.Stem | Stem represents all information known about a stem of a word.
-
Constructor Summary
Constructors
Constructor | Description
HunspellStemmer(HunspellDictionary dictionary) | Constructs a new HunspellStemmer which will use the provided HunspellDictionary to create its stems.
-
Method Summary
Modifier and Type | Method | Description
List<HunspellStemmer.Stem> | applyAffix(char[] strippedWord, int length, HunspellAffix affix, int recursionDepth) | Applies the affix rule to the given word, producing a list of stems if any are found.
static void | main(String[] args) | HunspellStemmer entry point.
List<HunspellStemmer.Stem> | stem(char[] word, int length) | Find the stem(s) of the provided word.
List<HunspellStemmer.Stem> | stem(String word) | Find the stem(s) of the provided word.
List<HunspellStemmer.Stem> | uniqueStems(char[] word, int length) | Find the unique stem(s) of the provided word.
-
-
-
Constructor Detail
-
HunspellStemmer
public HunspellStemmer(HunspellDictionary dictionary)
Constructs a new HunspellStemmer which will use the provided HunspellDictionary to create its stems.
Parameters:
dictionary - HunspellDictionary that will be used to create the stems
-
-
Method Detail
-
stem
public List<HunspellStemmer.Stem> stem(String word)
Find the stem(s) of the provided word.
Parameters:
word - Word to find the stems for
Returns:
List of stems for the word
-
stem
public List<HunspellStemmer.Stem> stem(char[] word, int length)
Find the stem(s) of the provided word.
Parameters:
word - Word to find the stems for
Returns:
List of stems for the word
-
uniqueStems
public List<HunspellStemmer.Stem> uniqueStems(char[] word, int length)
Find the unique stem(s) of the provided word.
Parameters:
word - Word to find the stems for
Returns:
List of unique stems for the word
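The page does not say how uniqueness is decided (for example, whether case is ignored). As a rough sketch of what "unique stems" can mean, the snippet below removes duplicates by exact string equality while keeping first-seen order. UniqueStemsSketch and its unique method are hypothetical names, and plain strings stand in for HunspellStemmer.Stem objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Toy illustration only: plain strings stand in for HunspellStemmer.Stem.
final class UniqueStemsSketch {

    // Returns the stems with duplicates removed, keeping first-seen order.
    // Assumption: two stems are "the same" when their strings are equal.
    static List<String> unique(List<String> stems) {
        return new ArrayList<>(new LinkedHashSet<>(stems));
    }

    public static void main(String[] args) {
        List<String> stems = Arrays.asList("walk", "walk", "walking");
        System.out.println(unique(stems)); // [walk, walking]
    }
}
```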
-
applyAffix
public List<HunspellStemmer.Stem> applyAffix(char[] strippedWord, int length, HunspellAffix affix, int recursionDepth)
Applies the affix rule to the given word, producing a list of stems if any are found.
Parameters:
strippedWord - Word from which the affix has been removed and to which the strip has been added
affix - HunspellAffix representing the affix rule itself
recursionDepth - Level of recursion this stemming step is at
Returns:
List of stems for the word, or an empty list if none are found
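The strippedWord parameter is easier to picture with a concrete case. The sketch below shows one way such a value could be built, under two assumptions: that length gives the number of valid characters in the char array, and that a suffix rule pairs an appended ending with a strip string. StrippedWordSketch and its helper are hypothetical and are not part of the Lucene API.

```java
// Toy illustration only: not part of Lucene.
final class StrippedWordSketch {

    // Hypothetical helper: takes the first `length` chars of `word`, removes the
    // ending `append`, and puts the `strip` characters back in its place.
    static String strippedWord(char[] word, int length, String append, String strip) {
        String surface = new String(word, 0, length);
        if (!surface.endsWith(append)) {
            throw new IllegalArgumentException("word does not end with the affix");
        }
        return surface.substring(0, surface.length() - append.length()) + strip;
    }

    public static void main(String[] args) {
        // Only the first 6 chars of the buffer are the word; the rest is padding.
        char[] buffer = "ponies   ".toCharArray();
        System.out.println(strippedWord(buffer, 6, "ies", "y")); // pony
    }
}
```

Here the rule turns "pony" into "ponies" by stripping "y" and appending "ies", so undoing it removes "ies" and restores "y".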
-
main
public static void main(String[] args) throws IOException, ParseException
HunspellStemmer entry point. Accepts two arguments: the location of the affix file and the location of the dic file.
Parameters:
args - Program arguments. Should contain the location of the affix file and the location of the dic file
Throws:
IOException - Can be thrown while reading from the files
ParseException - Can be thrown while parsing the files
-
-