Package org.apache.lucene.misc
Class HighFreqTerms
java.lang.Object
    org.apache.lucene.misc.HighFreqTerms
public class HighFreqTerms extends Object
The HighFreqTerms class extracts the top n most frequent terms (by document frequency) from an existing Lucene index and reports their document frequency. When run with the -t flag, it also reports each term's total tf (total number of occurrences), ordered by highest total tf.
-
-
Field Summary
Fields
Modifier and Type    Field              Description
static int           DEFAULTnumTerms
static int           numTerms
-
Constructor Summary
Constructors
Constructor          Description
HighFreqTerms()
-
Method Summary
Modifier and Type, Method, Description
static org.apache.lucene.misc.TermStats[] getHighFreqTerms(org.apache.lucene.index.IndexReader reader, int numTerms, String field)
static long getTotalTermFreq(org.apache.lucene.index.IndexReader reader, org.apache.lucene.index.Term term)
static void main(String[] args)
static org.apache.lucene.misc.TermStats[] sortByTotalTermFreq(org.apache.lucene.index.IndexReader reader, org.apache.lucene.misc.TermStats[] terms)
    Takes an array of TermStats.
-
-
-
Field Detail
-
DEFAULTnumTerms
public static final int DEFAULTnumTerms
See Also:
    Constant Field Values
-
numTerms
public static int numTerms
-
-
Method Detail
-
getHighFreqTerms
public static org.apache.lucene.misc.TermStats[] getHighFreqTerms(org.apache.lucene.index.IndexReader reader, int numTerms, String field) throws Exception
Parameters:
    reader - reader for the index to examine
    numTerms - number of top terms to return
    field - field to examine
Returns:
    TermStats[] ordered with the highest docFreq first.
Throws:
    Exception
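
The selection that getHighFreqTerms describes (keep the numTerms terms with the highest docFreq, highest first) can be illustrated without Lucene. The sketch below is not the library's implementation: TopDocFreqSketch, TermCount, and topByDocFreq are hypothetical names, and a plain Map of term to document frequency stands in for the index. It uses a bounded min-heap, one common way to do this kind of top-n selection.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class TopDocFreqSketch {

    /** Hypothetical stand-in for TermStats: a term and its document frequency. */
    static final class TermCount {
        final String term;
        final int docFreq;

        TermCount(String term, int docFreq) {
            this.term = term;
            this.docFreq = docFreq;
        }
    }

    /** Returns at most numTerms entries, ordered with the highest docFreq first. */
    static List<TermCount> topByDocFreq(Map<String, Integer> docFreqs, int numTerms) {
        if (numTerms <= 0) {
            return new ArrayList<>();
        }
        // Min-heap on docFreq: the weakest candidate sits at the head,
        // so it can be evicted as soon as the heap grows past numTerms.
        PriorityQueue<TermCount> heap =
                new PriorityQueue<>(Comparator.comparingInt((TermCount t) -> t.docFreq));
        for (Map.Entry<String, Integer> e : docFreqs.entrySet()) {
            heap.add(new TermCount(e.getKey(), e.getValue()));
            if (heap.size() > numTerms) {
                heap.poll(); // drop the entry with the lowest docFreq
            }
        }
        // Copy out the survivors and order them highest docFreq first.
        List<TermCount> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingInt((TermCount t) -> t.docFreq).reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up document frequencies, for illustration only.
        Map<String, Integer> docFreqs =
                Map.of("the", 98, "index", 75, "lucene", 40, "term", 12);
        for (TermCount t : topByDocFreq(docFreqs, 2)) {
            System.out.println(t.term + " docFreq=" + t.docFreq);
        }
    }
}
```

The heap never holds more than numTerms + 1 entries, so memory stays proportional to numTerms rather than to the number of distinct terms.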
-
sortByTotalTermFreq
public static org.apache.lucene.misc.TermStats[] sortByTotalTermFreq(org.apache.lucene.index.IndexReader reader, org.apache.lucene.misc.TermStats[] terms) throws Exception
Takes an array of TermStats. For each term, looks up the tf in each document containing the term and stores the total in the output array of TermStats. The output array is sorted with the highest total tf first.
Parameters:
    reader -
    terms - TermStats[]
Returns:
    TermStats[]
Throws:
    Exception
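
The re-ranking step described above (sum the per-document tf for each term, then order by that total) can be sketched in plain Java. This is an illustration under stated assumptions, not Lucene's code: TotalTfSketch, Stats, and sortByTotalTf are hypothetical names, and a Map from term to an array of per-document tf values stands in for the postings that the reader would supply.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

class TotalTfSketch {

    /** Hypothetical stand-in for TermStats: term, docFreq, and total tf. */
    static final class Stats {
        final String term;
        final int docFreq;
        final long totalTermFreq;

        Stats(String term, int docFreq, long totalTermFreq) {
            this.term = term;
            this.docFreq = docFreq;
            this.totalTermFreq = totalTermFreq;
        }
    }

    /**
     * Builds a new array in which each entry carries the sum of its term's
     * per-document tf values, sorted with the highest total first.
     * perDocTf is an assumed stand-in for the index postings.
     */
    static Stats[] sortByTotalTf(Stats[] terms, Map<String, int[]> perDocTf) {
        Stats[] out = new Stats[terms.length];
        for (int i = 0; i < terms.length; i++) {
            long total = 0;
            // A term with no recorded postings contributes a total of 0.
            for (int tf : perDocTf.getOrDefault(terms[i].term, new int[0])) {
                total += tf;
            }
            out[i] = new Stats(terms[i].term, terms[i].docFreq, total);
        }
        Arrays.sort(out, Comparator.comparingLong((Stats s) -> s.totalTermFreq).reversed());
        return out;
    }

    public static void main(String[] args) {
        // Made-up data: "rare" appears in fewer documents but more often overall.
        Stats[] byDocFreq = {
            new Stats("common", 3, 0),
            new Stats("rare", 2, 0)
        };
        Map<String, int[]> perDocTf = new HashMap<>();
        perDocTf.put("common", new int[] {1, 1, 1});
        perDocTf.put("rare", new int[] {5, 4});

        for (Stats s : sortByTotalTf(byDocFreq, perDocTf)) {
            System.out.println(s.term + " docFreq=" + s.docFreq + " totalTF=" + s.totalTermFreq);
        }
    }
}
```

The example data shows why the two orderings can differ: a term that occurs in few documents can still have the highest total tf if it is repeated many times within them.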
-
-