Class StopFilter

    • Constructor Detail

      • StopFilter

        @Deprecated
        public StopFilter(boolean enablePositionIncrements,
                          TokenStream input,
                          Set<?> stopWords,
                          boolean ignoreCase)
        Deprecated.
        Constructs a token stream filtering the given input. If stopWords is an instance of CharArraySet (which is the case when makeStopSet() was used to construct the set), it is used directly and ignoreCase is ignored, since CharArraySet controls case sensitivity itself.

        If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.

        Parameters:
        enablePositionIncrements - true if token positions should record the removed stop words
        input - Input TokenStream
        stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
        ignoreCase - if true, all words are lower cased first
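        The effect of enablePositionIncrements can be illustrated with a small model in plain Java. This is a simplified sketch, not Lucene's implementation: the class PositionIncrementSketch and its filter method are hypothetical, and tokens are plain strings rather than a TokenStream. It assumes that, when the flag is true, each surviving token's position increment also counts the stop words removed just before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of stop-word removal; hypothetical, not Lucene's StopFilter.
class PositionIncrementSketch {

    // Returns the surviving tokens rendered as "term+positionIncrement".
    static List<String> filter(List<String> tokens, Set<String> stopWords,
                               boolean enablePositionIncrements) {
        List<String> out = new ArrayList<String>();
        int skipped = 0; // stop words removed since the last emitted token
        for (String term : tokens) {
            if (stopWords.contains(term)) {
                skipped++;
                continue;
            }
            // With the flag on, the gap left by removed stop words is recorded.
            int increment = enablePositionIncrements ? 1 + skipped : 1;
            out.add(term + "+" + increment);
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "and", "the", "dead");
        Set<String> stops = new HashSet<String>(Arrays.asList("the", "and"));
        System.out.println(filter(tokens, stops, true));  // [quick+2, dead+3]
        System.out.println(filter(tokens, stops, false)); // [quick+1, dead+1]
    }
}
```

        With the flag set to false, the surviving tokens appear adjacent, so a phrase query could match across the removed words.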
      • StopFilter

        @Deprecated
        public StopFilter(Version matchVersion,
                          TokenStream input,
                          Set<?> stopWords,
                          boolean ignoreCase)
        Deprecated.
        Constructs a token stream filtering the given input. If stopWords is an instance of CharArraySet (which is the case when makeStopSet() was used to construct the set), it is used directly and ignoreCase is ignored, since CharArraySet controls case sensitivity itself.

        If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.

        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
        input - Input TokenStream
        stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
        ignoreCase - if true, all words are lower cased first
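        The rule for when ignoreCase takes effect can be sketched in plain Java. CaseAwareSet below is a hypothetical stand-in for CharArraySet, and resolve is a hypothetical helper that mirrors the documented behavior; neither is part of Lucene. The sketch normalizes entries through toString(), so unlike the real class it does not handle char[] entries.

```java
import java.util.AbstractSet;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for CharArraySet: the set itself fixes case sensitivity.
class CaseAwareSet extends AbstractSet<Object> {
    private final Set<String> words = new HashSet<String>();
    private final boolean ignoreCase;

    CaseAwareSet(Collection<?> source, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (Object o : source) {
            words.add(normalize(o.toString()));
        }
    }

    private String normalize(String s) {
        return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
    }

    @Override
    public boolean contains(Object o) {
        return o != null && words.contains(normalize(o.toString()));
    }

    @Override
    public Iterator<Object> iterator() {
        return new ArrayList<Object>(words).iterator();
    }

    @Override
    public int size() {
        return words.size();
    }
}

class StopSetResolutionSketch {

    // Reuse a case-aware set as is; otherwise build one that honors ignoreCase.
    static Set<?> resolve(Set<?> stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            return stopWords; // ignoreCase is ignored here
        }
        return new CaseAwareSet(stopWords, ignoreCase);
    }

    public static void main(String[] args) {
        Set<String> plain = new HashSet<String>(Arrays.asList("The", "And"));
        System.out.println(resolve(plain, true).contains("the")); // true

        Set<?> sensitive = new CaseAwareSet(plain, false);
        Set<?> reused = resolve(sensitive, true);
        System.out.println(reused == sensitive);    // true
        System.out.println(reused.contains("the")); // false
    }
}
```

        In the second case the caller asked for ignoreCase = true, but the pre-built set was case sensitive, so lookups stay case sensitive.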
      • StopFilter

        @Deprecated
        public StopFilter(boolean enablePositionIncrements,
                          TokenStream in,
                          Set<?> stopWords)
        Deprecated.
        Constructs a filter which removes words from the input TokenStream that are named in the Set.
        Parameters:
        enablePositionIncrements - true if token positions should record the removed stop words
        in - Input stream
        stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
        See Also:
        makeStopSet(Version, java.lang.String[])
      • StopFilter

        public StopFilter(Version matchVersion,
                          TokenStream in,
                          Set<?> stopWords)
        Constructs a filter which removes words from the input TokenStream that are named in the Set.
        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
        in - Input stream
        stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
        See Also:
        makeStopSet(Version, java.lang.String[])
    • Method Detail

      • makeStopSet

        public static final Set<Object> makeStopSet(Version matchVersion,
                                                    String... stopWords)
        Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
        stopWords - An array of stopwords
        See Also:
        makeStopSet(Version, String[], boolean), passing false to ignoreCase
      • makeStopSet

        @Deprecated
        public static final Set<Object> makeStopSet(List<?> stopWords)
        Deprecated.
        Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
        Parameters:
        stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
        Returns:
        A Set (CharArraySet) containing the words
        See Also:
        makeStopSet(List, boolean), passing false to ignoreCase
      • makeStopSet

        public static final Set<Object> makeStopSet(Version matchVersion,
                                                    List<?> stopWords)
        Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once and cached when an Analyzer is constructed.
        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
        stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
        Returns:
        A Set (CharArraySet) containing the words
        See Also:
        makeStopSet(Version, List, boolean), passing false to ignoreCase
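        The caching pattern described for these overloads might look like the following plain-Java sketch. Everything here is hypothetical and independent of Lucene: makeStopSet is a local stand-in that counts how often it runs, and SketchAnalyzer is a toy analyzer that splits on whitespace. The point is only that the set is built in the constructor and reused for every analyzed text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CachedStopSetSketch {
    static int builds = 0; // how many times a stop set has been constructed

    // Local stand-in for a stop-set factory; not Lucene's makeStopSet.
    static Set<String> makeStopSet(String... stopWords) {
        builds++;
        return Collections.unmodifiableSet(
                new HashSet<String>(Arrays.asList(stopWords)));
    }

    // Toy analyzer: builds its stop set once, then reuses it for every call.
    static class SketchAnalyzer {
        private final Set<String> stopSet;

        SketchAnalyzer(String... stopWords) {
            this.stopSet = makeStopSet(stopWords);
        }

        List<String> analyze(String text) {
            List<String> out = new ArrayList<String>();
            for (String token : text.split("\\s+")) {
                if (!token.isEmpty() && !stopSet.contains(token)) {
                    out.add(token);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        SketchAnalyzer analyzer = new SketchAnalyzer("the", "a", "of");
        System.out.println(analyzer.analyze("the art of war")); // [art, war]
        System.out.println(analyzer.analyze("a tale"));         // [tale]
        System.out.println(builds);                             // 1
    }
}
```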
      • makeStopSet

        @Deprecated
        public static final Set<Object> makeStopSet(String[] stopWords,
                                                    boolean ignoreCase)
        Deprecated.
        Creates a stopword set from the given stopword array.
        Parameters:
        stopWords - An array of stopwords
        ignoreCase - If true, all words are lower cased first.
        Returns:
        a Set containing the words
      • makeStopSet

        public static final Set<Object> makeStopSet(Version matchVersion,
                                                    String[] stopWords,
                                                    boolean ignoreCase)
        Creates a stopword set from the given stopword array.
        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
        stopWords - An array of stopwords
        ignoreCase - If true, all words are lower cased first.
        Returns:
        a Set containing the words
      • makeStopSet

        @Deprecated
        public static final Set<Object> makeStopSet(List<?> stopWords,
                                                    boolean ignoreCase)
        Deprecated.
        Creates a stopword set from the given stopword list.
        Parameters:
        stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
        ignoreCase - if true, all words are lower cased first
        Returns:
        A Set (CharArraySet) containing the words
      • makeStopSet

        public static final Set<Object> makeStopSet(Version matchVersion,
                                                    List<?> stopWords,
                                                    boolean ignoreCase)
        Creates a stopword set from the given stopword list.
        Parameters:
        matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
        stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
        ignoreCase - if true, all words are lower cased first
        Returns:
        A Set (CharArraySet) containing the words
      • getEnablePositionIncrementsVersionDefault

        @Deprecated
        public static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)
        Deprecated.
        Returns version-dependent default for enablePositionIncrements. Analyzers that embed StopFilter use this method when creating the StopFilter. Prior to 2.9, this returns false. On 2.9 or later, it returns true.
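
        The version rule stated above can be written as a one-line comparison. The sketch below uses a hypothetical SketchVersion enum in place of Lucene's Version class, with constants declared oldest first so that enum ordering reflects release order.

```java
// Hypothetical stand-in for Lucene's Version; constants are declared oldest first.
enum SketchVersion { V2_4, V2_9, V3_0 }

class PositionIncrementDefaultSketch {

    // false before 2.9, true on 2.9 or later, as the description states.
    static boolean enablePositionIncrementsDefault(SketchVersion matchVersion) {
        return matchVersion.compareTo(SketchVersion.V2_9) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(enablePositionIncrementsDefault(SketchVersion.V2_4)); // false
        System.out.println(enablePositionIncrementsDefault(SketchVersion.V2_9)); // true
        System.out.println(enablePositionIncrementsDefault(SketchVersion.V3_0)); // true
    }
}
```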