Class HighFreqTerms


  • public class HighFreqTerms
    extends Object
    HighFreqTerms extracts the top n most frequent terms (by document frequency) from an existing Lucene index and reports their document frequency. If used with the -t flag, it also reports their total tf (total number of occurrences), ordered by highest total tf.
    • Field Detail

      • numTerms

        public static int numTerms
    • Constructor Detail

      • HighFreqTerms

        public HighFreqTerms()
    • Method Detail

      • getHighFreqTerms

        public static org.apache.lucene.misc.TermStats[] getHighFreqTerms(org.apache.lucene.index.IndexReader reader,
                                                                          int numTerms,
                                                                          String field)
                                                                   throws Exception
        Parameters:
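        reader - the IndexReader over the index to inspect
        numTerms - the maximum number of terms to return
        field - the name of the field whose terms are examined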
        Returns:
        TermStats[] ordered by terms with highest docFreq first.
        Throws:
        Exception
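        The selection described above (keep the numTerms terms with the highest document frequency, highest first) can be sketched with a bounded min-heap. This is a minimal, stdlib-only illustration over an in-memory list of tokenized documents, not the Lucene implementation: TopDocFreqSketch, Stat, docFreqs and topByDocFreq are hypothetical names, and the alphabetical tie-break is an assumption made only so the result is deterministic.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Stdlib-only illustration; none of these names are Lucene APIs.
final class TopDocFreqSketch {

    // Hypothetical stand-in for a (term, document frequency) pair.
    static final class Stat {
        final String term;
        final int docFreq;

        Stat(String term, int docFreq) {
            this.term = term;
            this.docFreq = docFreq;
        }
    }

    // Counts, for each term, how many documents contain it at least once.
    static Map<String, Integer> docFreqs(List<List<String>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            // De-duplicate within a document: repeats do not raise docFreq.
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // Returns up to n terms, highest docFreq first (ties: alphabetical order).
    static Stat[] topByDocFreq(List<List<String>> docs, int n) {
        // The heap head is always the weakest candidate kept so far:
        // lowest docFreq, and among equals the alphabetically later term.
        Comparator<Stat> weakestFirst = Comparator
                .comparingInt((Stat s) -> s.docFreq)
                .thenComparing((Stat s) -> s.term, Comparator.<String>reverseOrder());
        PriorityQueue<Stat> heap = new PriorityQueue<>(weakestFirst);
        for (Map.Entry<String, Integer> e : docFreqs(docs).entrySet()) {
            heap.add(new Stat(e.getKey(), e.getValue()));
            if (heap.size() > n) {
                heap.poll(); // evict the weakest so at most n terms remain
            }
        }
        // Drain weakest-first into the array from the back, so the
        // strongest term ends up at index 0.
        Stat[] out = new Stat[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }
}
```

        The heap never holds more than n entries, so memory stays proportional to n rather than to the number of distinct terms that are candidates for the result.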
      • sortByTotalTermFreq

        public static org.apache.lucene.misc.TermStats[] sortByTotalTermFreq(org.apache.lucene.index.IndexReader reader,
                                                                             org.apache.lucene.misc.TermStats[] terms)
                                                                      throws Exception
        Takes an array of TermStats. For each term, looks up the tf in each document containing the term and stores the total in the output array of TermStats. The output array is sorted by highest total tf first.
        Parameters:
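        reader - the IndexReader used to look up term frequencies
        terms - the TermStats[] whose total tf is to be computed
        Returns:
        TermStats[] sorted by highest total tf first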
        Throws:
        Exception
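        The difference between document frequency and total tf, and the re-sort this method performs, can be illustrated without an index. The following is a rough sketch under stated assumptions, not Lucene code: documents are plain token lists, and TotalTermFreqSketch, Stat, totalTermFreq and sortByTotal are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Stdlib-only illustration; none of these names are Lucene APIs.
final class TotalTermFreqSketch {

    // Hypothetical stand-in for a term paired with its total occurrence count.
    static final class Stat {
        final String term;
        final long totalTermFreq;

        Stat(String term, long totalTermFreq) {
            this.term = term;
            this.totalTermFreq = totalTermFreq;
        }
    }

    // Sums the per-document occurrence counts of one term over all documents.
    static long totalTermFreq(List<List<String>> docs, String term) {
        long total = 0;
        for (List<String> doc : docs) {
            total += Collections.frequency(doc, term);
        }
        return total;
    }

    // Builds a new array with totals filled in, highest total first.
    // Arrays.sort on objects is stable, so ties keep their input order.
    static Stat[] sortByTotal(List<List<String>> docs, String[] terms) {
        Stat[] out = new Stat[terms.length];
        for (int i = 0; i < terms.length; i++) {
            out[i] = new Stat(terms[i], totalTermFreq(docs, terms[i]));
        }
        Arrays.sort(out, Comparator.comparingLong((Stat s) -> s.totalTermFreq).reversed());
        return out;
    }
}
```

        A term that appears many times in a single document can rank below another term by document frequency yet above it by total tf, which is why the two orderings may differ.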
      • getTotalTermFreq

        public static long getTotalTermFreq(org.apache.lucene.index.IndexReader reader,
                                            org.apache.lucene.index.Term term)
                                     throws Exception
        Returns the total tf (total number of occurrences) of the given term in the index.
        Throws:
        Exception