Package org.apache.lucene.search
Class FuzzyQuery
- java.lang.Object
  - org.apache.lucene.search.Query
    - org.apache.lucene.search.MultiTermQuery
      - org.apache.lucene.search.FuzzyQuery
- All Implemented Interfaces:
Serializable, Cloneable
public class FuzzyQuery extends MultiTermQuery
Implements the fuzzy search query. The similarity measurement is based on the Levenshtein (edit distance) algorithm.
Warning: this query does not scale well with its default prefix length of 0. In that case, *every* term will be enumerated and an edit score calculated for each.
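To make the edit distance measure concrete, the following is a minimal sketch of the textbook Levenshtein algorithm. The class name `LevenshteinSketch` and its method are hypothetical and are not part of Lucene; the library's internal implementation may differ.

```java
/** Illustrative only: a textbook Levenshtein distance, not Lucene's internal code. */
final class LevenshteinSketch {
    private LevenshteinSketch() {}

    /**
     * Returns the minimum number of single-character insertions, deletions,
     * or substitutions needed to turn a into b.
     */
    static int distance(String a, String b) {
        // prev holds the previous row of the dynamic-programming table, curr the current row.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning the empty string into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into the empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

For example, `LevenshteinSketch.distance("lucene", "lucent")` counts one substitution. Running a calculation like this for every term in an index is what makes a prefix length of 0 expensive.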
This query uses MultiTermQuery.TopTermsScoringBooleanQueryRewrite by default, so terms are collected and scored according to their edit distance. Only the top terms are used for building the BooleanQuery. It is not recommended to change the rewrite mode for fuzzy queries.
- See Also:
- Serialized Form
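The idea of keeping only the top-scoring terms can be sketched with a bounded min-heap. This is an assumption-laden illustration, not the code of the rewrite method: the class `TopTermsSketch`, the method `topTerms`, and the `limit` parameter are hypothetical names, and the scores are supplied by the caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** Illustrative only: bounded top-N selection, not Lucene's rewrite implementation. */
final class TopTermsSketch {
    private TopTermsSketch() {}

    /** Returns at most limit terms, ordered from highest to lowest score. */
    static List<String> topTerms(Map<String, Float> scoredTerms, int limit) {
        List<String> result = new ArrayList<>();
        if (limit <= 0) {
            return result;
        }
        // Min-heap on score: the weakest retained term sits at the head.
        PriorityQueue<Map.Entry<String, Float>> heap =
            new PriorityQueue<>((x, y) -> Float.compare(x.getValue(), y.getValue()));
        for (Map.Entry<String, Float> entry : scoredTerms.entrySet()) {
            heap.offer(entry);
            if (heap.size() > limit) {
                heap.poll(); // evict the lowest-scoring term
            }
        }
        // Polling yields ascending scores, so reverse to get best-first order.
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }
}
```

In this sketch `limit` stands in for the role of maxExpansions, which the constructor documentation says is capped by BooleanQuery.getMaxClauseCount() when the query is rewritten.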
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.search.MultiTermQuery
MultiTermQuery.ConstantScoreAutoRewrite, MultiTermQuery.RewriteMethod, MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite, MultiTermQuery.TopTermsScoringBooleanQueryRewrite
-
-
Field Summary
Fields
Modifier and Type    Field                   Description
static int           defaultMaxExpansions
static float         defaultMinSimilarity
static int           defaultPrefixLength
protected Term       term
-
Fields inherited from class org.apache.lucene.search.MultiTermQuery
CONSTANT_SCORE_AUTO_REWRITE_DEFAULT, CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE, CONSTANT_SCORE_FILTER_REWRITE, rewriteMethod, SCORING_BOOLEAN_QUERY_REWRITE
-
-
Constructor Summary
Constructors
Constructor    Description
FuzzyQuery(Term term)
FuzzyQuery(Term term, float minimumSimilarity)
FuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
FuzzyQuery(Term term, float minimumSimilarity, int prefixLength, int maxExpansions)
    Create a new FuzzyQuery that will match terms with a similarity of at least minimumSimilarity to term.
-
Method Summary
Modifier and Type    Method    Description
boolean equals(Object obj)
protected FilteredTermEnum getEnum(IndexReader reader)
    Construct the enumeration to be used, expanding the pattern term.
float getMinSimilarity()
    Returns the minimum similarity that is required for this query to match.
int getPrefixLength()
    Returns the non-fuzzy prefix length.
Term getTerm()
    Returns the pattern term.
int hashCode()
String toString(String field)
    Prints a query to a string, with field assumed to be the default field and omitted.
-
Methods inherited from class org.apache.lucene.search.MultiTermQuery
clearTotalNumberOfTerms, getRewriteMethod, getTotalNumberOfTerms, incTotalNumberOfTerms, rewrite, setRewriteMethod
-
Methods inherited from class org.apache.lucene.search.Query
clone, combine, createWeight, extractTerms, getBoost, getSimilarity, mergeBooleanQueries, setBoost, toString, weight
-
-
-
-
Field Detail
-
defaultMinSimilarity
public static final float defaultMinSimilarity
- See Also:
- Constant Field Values
-
defaultPrefixLength
public static final int defaultPrefixLength
- See Also:
- Constant Field Values
-
defaultMaxExpansions
public static final int defaultMaxExpansions
- See Also:
- Constant Field Values
-
term
protected Term term
-
-
Constructor Detail
-
FuzzyQuery
public FuzzyQuery(Term term, float minimumSimilarity, int prefixLength, int maxExpansions)
Create a new FuzzyQuery that will match terms with a similarity of at least minimumSimilarity to term. If a prefixLength > 0 is specified, a common prefix of that length is also required.
- Parameters:
term - the term to search for
minimumSimilarity - a value between 0 and 1 to set the required similarity between the query term and the matching terms. For example, for a minimumSimilarity of 0.5 a term of the same length as the query term is considered similar to the query term if the edit distance between both terms is less than length(term)*0.5
prefixLength - length of common (non-fuzzy) prefix
maxExpansions - the maximum number of terms to match. If this number is greater than BooleanQuery.getMaxClauseCount() when the query is rewritten, then the maxClauseCount will be used instead.
- Throws:
IllegalArgumentException - if minimumSimilarity is >= 1 or < 0 or if prefixLength < 0
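The parameter rules above can be sketched in plain Java. This is an illustration under stated assumptions, not the constructor's actual code: the class `FuzzyParamsSketch` and its methods are hypothetical, and the similarity formula `1 - editDistance / length` is assumed for terms of equal length because it reproduces the documented 0.5 example.

```java
/** Illustrative only: models the documented parameter rules, not Lucene's source. */
final class FuzzyParamsSketch {
    private FuzzyParamsSketch() {}

    /** Mirrors the documented IllegalArgumentException conditions. */
    static void validate(float minimumSimilarity, int prefixLength) {
        if (minimumSimilarity >= 1.0f) {
            throw new IllegalArgumentException("minimumSimilarity >= 1");
        }
        if (minimumSimilarity < 0.0f) {
            throw new IllegalArgumentException("minimumSimilarity < 0");
        }
        if (prefixLength < 0) {
            throw new IllegalArgumentException("prefixLength < 0");
        }
    }

    /**
     * Assumed similarity for two terms of the same length: 1 - editDistance / termLength.
     * A candidate counts as similar when that value exceeds minimumSimilarity.
     */
    static boolean isSimilar(int editDistance, int termLength, float minimumSimilarity) {
        if (termLength <= 0) {
            throw new IllegalArgumentException("termLength must be positive");
        }
        float similarity = 1.0f - ((float) editDistance / termLength);
        return similarity > minimumSimilarity;
    }

    /** Applies the documented cap: the smaller of maxExpansions and maxClauseCount. */
    static int effectiveMaxExpansions(int maxExpansions, int maxClauseCount) {
        return Math.min(maxExpansions, maxClauseCount);
    }
}
```

With a six-character query term and a minimumSimilarity of 0.5, `isSimilar(2, 6, 0.5f)` holds while `isSimilar(3, 6, 0.5f)` does not, which agrees with the "less than length(term)*0.5" wording.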
-
FuzzyQuery
public FuzzyQuery(Term term, float minimumSimilarity, int prefixLength)
-
FuzzyQuery
public FuzzyQuery(Term term, float minimumSimilarity)
-
FuzzyQuery
public FuzzyQuery(Term term)
-
-
Method Detail
-
getMinSimilarity
public float getMinSimilarity()
Returns the minimum similarity that is required for this query to match.
- Returns:
- float value between 0.0 and 1.0
-
getPrefixLength
public int getPrefixLength()
Returns the non-fuzzy prefix length. This is the number of characters at the start of a term that must be identical (not fuzzy) to the query term if the query is to match that term.
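The prefix requirement can be illustrated with a small check. The class `PrefixSketch` and its method are hypothetical, and clamping the prefix length to the length of the query term is an assumption of this sketch rather than documented behavior.

```java
/** Illustrative only: models the non-fuzzy prefix requirement, not Lucene's source. */
final class PrefixSketch {
    private PrefixSketch() {}

    /**
     * Returns true when candidate starts with the first prefixLength characters
     * of queryTerm. Only such candidates would go on to the fuzzy comparison.
     */
    static boolean sharesPrefix(String queryTerm, String candidate, int prefixLength) {
        if (prefixLength < 0) {
            throw new IllegalArgumentException("prefixLength < 0");
        }
        // Assumption of this sketch: a prefix longer than the query term is clamped.
        int effective = Math.min(prefixLength, queryTerm.length());
        return candidate.startsWith(queryTerm.substring(0, effective));
    }
}
```

With a prefix length of 0 every candidate passes this check, which is why the class description warns that every term is then enumerated.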
-
getEnum
protected FilteredTermEnum getEnum(IndexReader reader) throws IOException
Description copied from class: MultiTermQuery
Construct the enumeration to be used, expanding the pattern term.
- Specified by:
getEnum in class MultiTermQuery
- Throws:
IOException
-
getTerm
public Term getTerm()
Returns the pattern term.
-
toString
public String toString(String field)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
The representation used is one that is supposed to be readable by QueryParser. However, there are the following limitations:
- If the query was created by the parser, the printed representation may not be exactly what was parsed. For example, characters that need to be escaped will be represented without the required backslash.
- Some of the more complicated queries (e.g. span queries) don't have a representation that can be parsed by QueryParser.
-
hashCode
public int hashCode()
- Overrides:
hashCode in class MultiTermQuery
-
equals
public boolean equals(Object obj)
- Overrides:
equals in class MultiTermQuery
-
-