Class SegmentReader
java.lang.Object
    org.apache.lucene.index.IndexReader
        org.apache.lucene.index.SegmentReader
- All Implemented Interfaces:
Closeable, AutoCloseable, Cloneable
public class SegmentReader extends IndexReader implements Cloneable
IndexReader implementation over a single segment. Instances pointing to the same segment (but with different deletes, etc.) may share the same core data.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
-
Nested Class Summary
static interface SegmentReader.CoreClosedListener
    Called when the shared core for this SegmentReader is closed.
Nested classes/interfaces inherited from class org.apache.lucene.index.IndexReader
IndexReader.ReaderClosedListener
-
-
Field Summary
protected boolean readOnly
    Deprecated.
Fields inherited from class org.apache.lucene.index.IndexReader
hasChanges
-
-
Constructor Summary
SegmentReader()
-
Method Summary
void addCoreClosedListener(SegmentReader.CoreClosedListener listener)
    Expert: adds a CoreClosedListener to this reader's shared core.
Object clone()
    Efficiently clones the IndexReader (sharing most internal state).
IndexReader clone(boolean openReadOnly)
    Deprecated.
protected BitVector cloneDeletedDocs(BitVector bv)
    Deprecated.
protected byte[] cloneNormBytes(byte[] bytes)
    Deprecated.
Directory directory()
    Returns the directory this index resides in.
int docFreq(Term t)
    Returns the number of documents containing the term t.
protected void doClose()
    Implements close.
protected void doCommit(Map<String,String> commitUserData)
    Deprecated.
Document document(int n, FieldSelector fieldSelector)
    Get the Document at the nth position.
protected void doDelete(int docNum)
    Deprecated.
protected IndexReader doOpenIfChanged()
    If the index has changed since it was opened, open and return a new reader; else, return null.
protected IndexReader doOpenIfChanged(boolean openReadOnly)
    Deprecated.
protected void doSetNorm(int doc, String field, byte value)
    Deprecated.
protected void doUndeleteAll()
    Deprecated.
static SegmentReader get(boolean readOnly, SegmentInfo si, int termInfosIndexDivisor)
static SegmentReader get(boolean readOnly, Directory dir, SegmentInfo si, int readBufferSize, boolean doOpenStores, int termInfosIndexDivisor)
Object getCoreCacheKey()
    Expert.
Object getDeletesCacheKey()
    Expert.
FieldInfos getFieldInfos()
    Get the FieldInfos describing all fields in this reader.
String getSegmentName()
    Return the name of the segment this reader is reading.
TermFreqVector getTermFreqVector(int docNumber, String field)
    Return a term frequency vector for the specified document and field.
void getTermFreqVector(int docNumber, String field, TermVectorMapper mapper)
    Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.
void getTermFreqVector(int docNumber, TermVectorMapper mapper)
    Map all the term vectors for all fields in a Document.
TermFreqVector[] getTermFreqVectors(int docNumber)
    Return an array of term frequency vectors for the specified document.
int getTermInfosIndexDivisor()
    For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
long getUniqueTermCount()
    Returns the number of unique terms (across all fields) in this reader.
boolean hasDeletions()
    Returns true if any documents have been deleted.
boolean hasNorms(String field)
    Returns true if there are norms stored for this field.
boolean isDeleted(int n)
    Returns true if document n has been deleted.
int maxDoc()
    Returns one greater than the largest possible document number.
byte[] norms(String field)
    Returns the byte-encoded normalization factor for the named field of every document.
void norms(String field, byte[] bytes, int offset)
    Read norms into a pre-allocated array.
int numDocs()
    Returns the number of documents in this index.
TermDocs rawTermDocs(Term term)
    Expert: returns an enumeration of the documents that contain term, including deleted documents (which are normally filtered out).
void removeCoreClosedListener(SegmentReader.CoreClosedListener listener)
    Expert: removes a CoreClosedListener from this reader's shared core.
TermDocs termDocs()
    Returns an unpositioned TermDocs enumerator.
TermDocs termDocs(Term term)
    Returns an enumeration of all the documents which contain term.
TermPositions termPositions()
    Returns an unpositioned TermPositions enumerator.
TermEnum terms()
    Returns an enumeration of all the terms in the index.
TermEnum terms(Term t)
    Returns an enumeration of all terms starting at a given term.
String toString()
-
Methods inherited from class org.apache.lucene.index.IndexReader
acquireWriteLock, addReaderClosedListener, close, commit, commit, decRef, deleteDocument, deleteDocuments, document, doOpenIfChanged, doOpenIfChanged, ensureOpen, flush, flush, getCommitUserData, getCommitUserData, getCurrentVersion, getIndexCommit, getRefCount, getSequentialSubReaders, getVersion, incRef, indexExists, isCurrent, isOptimized, lastModified, listCommits, numDeletedDocs, open, open, open, open, open, open, open, open, open, open, open, openIfChanged, openIfChanged, openIfChanged, openIfChanged, removeReaderClosedListener, reopen, reopen, reopen, reopen, setNorm, setNorm, termPositions, tryIncRef, undeleteAll
-
-
-
-
Field Detail
-
readOnly
@Deprecated protected boolean readOnly
Deprecated.
-
-
Method Detail
-
get
public static SegmentReader get(boolean readOnly, SegmentInfo si, int termInfosIndexDivisor) throws CorruptIndexException, IOException
- Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
-
get
public static SegmentReader get(boolean readOnly, Directory dir, SegmentInfo si, int readBufferSize, boolean doOpenStores, int termInfosIndexDivisor) throws CorruptIndexException, IOException
- Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
-
cloneNormBytes
@Deprecated protected byte[] cloneNormBytes(byte[] bytes)
Deprecated. Clones the norm bytes. May be overridden by subclasses. New and experimental.
- Parameters:
bytes - Byte array to clone
- Returns:
New byte array
-
cloneDeletedDocs
@Deprecated protected BitVector cloneDeletedDocs(BitVector bv)
Deprecated. Clones the deletedDocs BitVector. May be overridden by subclasses. New and experimental.
- Parameters:
bv - BitVector to clone
- Returns:
New BitVector
-
clone
public final Object clone()
Description copied from class: IndexReader
Efficiently clones the IndexReader (sharing most internal state).
On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.
Like IndexReader.openIfChanged(IndexReader), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
- Overrides:
clone in class IndexReader
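The "copy on write" behaviour described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `CowReaderSketch` and its members are hypothetical names, it uses `java.util.BitSet` rather than Lucene's `BitVector`, and it does not reproduce the write-lock transfer or SegmentReader's actual internals.

```java
import java.util.BitSet;

/** Hypothetical model (not a Lucene class): clones share deleted-doc bits until one of them writes. */
final class CowReaderSketch implements Cloneable {
    private BitSet deleted;        // may be shared with other clones
    private boolean deletedShared; // true while another reader may reference the same BitSet

    CowReaderSketch(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
    }

    @Override
    public CowReaderSketch clone() {
        try {
            // Shallow copy: both readers now point at the same BitSet.
            CowReaderSketch copy = (CowReaderSketch) super.clone();
            this.deletedShared = true;
            copy.deletedShared = true;
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }

    void delete(int doc) {
        if (deletedShared) {
            // Copy on write: take a private copy before the first change.
            // Conservative: each side copies once, even if the other already did.
            deleted = (BitSet) deleted.clone();
            deletedShared = false;
        }
        deleted.set(doc);
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }
}
```

In this model a deletion made on either reader after cloning is invisible to the other, while deletions made before cloning are visible to both.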
-
clone
@Deprecated public final IndexReader clone(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. Clones the IndexReader and optionally changes readOnly. A readOnly reader cannot open a writeable reader.
- Overrides:
clone in class IndexReader
- Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
-
doOpenIfChanged
protected IndexReader doOpenIfChanged() throws CorruptIndexException, IOException
Description copied from class: IndexReader
If the index has changed since it was opened, open and return a new reader; else, return null.
- Overrides:
doOpenIfChanged in class IndexReader
- Throws:
CorruptIndexException
IOException
- See Also:
IndexReader.openIfChanged(IndexReader)
-
doOpenIfChanged
@Deprecated protected IndexReader doOpenIfChanged(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. If the index has changed since it was opened, open and return a new reader; else, return null.
- Overrides:
doOpenIfChanged in class IndexReader
- Throws:
CorruptIndexException
IOException
- See Also:
IndexReader.openIfChanged(IndexReader, boolean)
-
doCommit
@Deprecated protected void doCommit(Map<String,String> commitUserData) throws IOException
Deprecated. Implements commit.
- Specified by:
doCommit in class IndexReader
- Throws:
IOException
-
doClose
protected void doClose() throws IOException
Description copied from class: IndexReader
Implements close.
- Specified by:
doClose in class IndexReader
- Throws:
IOException
-
hasDeletions
public boolean hasDeletions()
Description copied from class: IndexReader
Returns true if any documents have been deleted.
- Specified by:
hasDeletions in class IndexReader
-
doDelete
@Deprecated protected void doDelete(int docNum)
Deprecated. Implements deletion of the document numbered docNum. Applications should call IndexReader.deleteDocument(int) or IndexReader.deleteDocuments(Term).
- Specified by:
doDelete in class IndexReader
-
doUndeleteAll
@Deprecated protected void doUndeleteAll()
Deprecated. Implements actual undeleteAll() in subclass.
- Specified by:
doUndeleteAll in class IndexReader
-
terms
public TermEnum terms()
Description copied from class: IndexReader
Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), TermEnum.next() must be called on the resulting enumeration before calling other methods such as TermEnum.term().
- Specified by:
terms in class IndexReader
-
terms
public TermEnum terms(Term t) throws IOException
Description copied from class: IndexReader
Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
- Specified by:
terms in class IndexReader
- Throws:
IOException - if there is a low-level IO error
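The positioning rule for terms(Term t) (start at the given term, or at the first greater term if it is absent) can be pictured with a sorted set. A minimal sketch, assuming plain strings stand in for Term objects and natural String ordering stands in for Term.compareTo(); `TermSeekSketch` and `termsFrom` are hypothetical names, not Lucene API.

```java
import java.util.Iterator;
import java.util.TreeSet;

/** Hypothetical model (not a Lucene class) of seeking into a sorted term dictionary. */
final class TermSeekSketch {
    /** Returns an iterator positioned at the first term >= start, in sorted order. */
    static Iterator<String> termsFrom(TreeSet<String> sortedTerms, String start) {
        // tailSet(start, true) is the view of all elements >= start.
        return sortedTerms.tailSet(start, true).iterator();
    }
}
```

Seeking to "b" in a set holding "apple", "banana" and "cherry" would start the iteration at "banana", since "b" itself is not present.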
-
getFieldInfos
public FieldInfos getFieldInfos()
Description copied from class: IndexReader
Get the FieldInfos describing all fields in this reader. NOTE: do not make any changes to the returned FieldInfos!
- Specified by:
getFieldInfos in class IndexReader
-
document
public Document document(int n, FieldSelector fieldSelector) throws CorruptIndexException, IOException
Description copied from class: IndexReader
Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.
NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IndexReader.isDeleted(int) with the requested document ID to verify the document is not deleted.
- Specified by:
document in class IndexReader
- Parameters:
n - Get the document at the nth position
fieldSelector - The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
- Returns:
The stored fields of the Document at the nth position
- Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
- See Also:
Fieldable, FieldSelector, SetBasedFieldSelector, LoadFirstFieldSelector
-
isDeleted
public boolean isDeleted(int n)
Description copied from class: IndexReader
Returns true if document n has been deleted.
- Specified by:
isDeleted in class IndexReader
-
termDocs
public TermDocs termDocs(Term term) throws IOException
Description copied from class: IndexReader
Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
- Overrides:
termDocs in class IndexReader
- Throws:
IOException - if there is a low-level IO error
-
rawTermDocs
public TermDocs rawTermDocs(Term term) throws IOException
Expert: returns an enumeration of the documents that contain term, including deleted documents (which are normally filtered out).
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
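The difference between termDocs(Term), which filters deleted documents, and rawTermDocs(Term), which does not, can be modelled on in-memory postings. This is a sketch under assumptions: `PostingsSketch`, `raw` and `live` are hypothetical names, the postings are two parallel arrays already sorted by document number, and a `java.util.BitSet` stands in for the reader's deleted-docs structure.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical in-memory postings for one term; not a Lucene class. */
final class PostingsSketch {
    /** Raw view: every {docNum, freq} pair in doc order, deleted documents included. */
    static List<int[]> raw(int[] docs, int[] freqs) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            out.add(new int[] {docs[i], freqs[i]});
        }
        return out;
    }

    /** Filtered view: the same pairs, skipping any docNum whose bit is set in deleted. */
    static List<int[]> live(int[] docs, int[] freqs, BitSet deleted) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            if (!deleted.get(docs[i])) {
                out.add(new int[] {docs[i], freqs[i]});
            }
        }
        return out;
    }
}
```

Both views preserve the increasing document-number order of the input; only the filtered view consults the deleted bits.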
-
termDocs
public TermDocs termDocs() throws IOException
Description copied from class: IndexReader
Returns an unpositioned TermDocs enumerator.
Note: the TermDocs returned is unpositioned. Before using it, ensure that you first position it with TermDocs.seek(Term) or TermDocs.seek(TermEnum).
- Specified by:
termDocs in class IndexReader
- Throws:
IOException - if there is a low-level IO error
-
termPositions
public TermPositions termPositions() throws IOException
Description copied from class: IndexReader
Returns an unpositioned TermPositions enumerator.
- Specified by:
termPositions in class IndexReader
- Throws:
IOException - if there is a low-level IO error
-
docFreq
public int docFreq(Term t) throws IOException
Description copied from class: IndexReader
Returns the number of documents containing the term t.
- Specified by:
docFreq in class IndexReader
- Throws:
IOException - if there is a low-level IO error
-
numDocs
public int numDocs()
Description copied from class: IndexReader
Returns the number of documents in this index.
- Specified by:
numDocs in class IndexReader
-
maxDoc
public int maxDoc()
Description copied from class: IndexReader
Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
- Specified by:
maxDoc in class IndexReader
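The array-sizing use of maxDoc() mentioned above, and its relation to numDocs(), can be sketched with plain values. The helper names here are hypothetical, and the sketch assumes the usual relationship that the live document count equals maxDoc minus the number of deleted documents, with a `java.util.BitSet` standing in for the deleted-docs structure.

```java
import java.util.BitSet;

/** Hypothetical helpers (not Lucene API) relating maxDoc, deletions and per-document arrays. */
final class DocCountSketch {
    /** Live documents: maxDoc minus the number of deleted document slots. */
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    /** One slot per document number in [0, maxDoc), deleted documents included. */
    static float[] allocatePerDocArray(int maxDoc) {
        return new float[maxDoc];
    }
}
```

Sizing by maxDoc rather than by the live count keeps every valid document number a legal index into the array, even when some documents are deleted.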
-
hasNorms
public boolean hasNorms(String field)
Description copied from class: IndexReader
Returns true if there are norms stored for this field.
- Overrides:
hasNorms in class IndexReader
-
norms
public byte[] norms(String field) throws IOException
Description copied from class: IndexReader
Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents. Returns null if norms were not indexed for this field.
- Specified by:
norms in class IndexReader
- Throws:
IOException
- See Also:
AbstractField.setBoost(float)
-
doSetNorm
@Deprecated protected void doSetNorm(int doc, String field, byte value) throws IOException
Deprecated. Implements setNorm in subclass.
- Specified by:
doSetNorm in class IndexReader
- Throws:
IOException
-
norms
public void norms(String field, byte[] bytes, int offset) throws IOException
Read norms into a pre-allocated array.
- Specified by:
norms in class IndexReader
- Throws:
IOException
- See Also:
AbstractField.setBoost(float)
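Reading norms into a pre-allocated array at an offset is useful when several segments' norms are gathered into one array. A minimal sketch of that idea, assuming each segment's norms are already available as a byte[] and that offset is the index in the destination where the segment's first document lands; `NormsCopySketch` and `readNorms` are hypothetical names, not the method's actual implementation.

```java
/** Hypothetical helper (not Lucene API) copying one segment's norms into a larger array. */
final class NormsCopySketch {
    /** Copies all of segmentNorms into dest, starting at dest[offset]. */
    static void readNorms(byte[] segmentNorms, byte[] dest, int offset) {
        System.arraycopy(segmentNorms, 0, dest, offset, segmentNorms.length);
    }
}
```

A caller combining a 2-document segment and a 3-document segment would allocate 5 bytes once, then copy with offsets 0 and 2.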
-
getTermFreqVector
public TermFreqVector getTermFreqVector(int docNumber, String field) throws IOException
Return a term frequency vector for the specified document and field. The vector returned contains term numbers and frequencies for all terms in the specified field of this document, if the field had the storeTermVector flag set. If the flag was not set, the method returns null.
- Specified by:
getTermFreqVector in class IndexReader
- Parameters:
docNumber - document for which the term frequency vector is returned
field - field for which the term frequency vector is returned
- Returns:
term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.
- Throws:
IOException
- See Also:
Field.TermVector
-
getTermFreqVector
public void getTermFreqVector(int docNumber, String field, TermVectorMapper mapper) throws IOException
Description copied from class: IndexReader
Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.
- Specified by:
getTermFreqVector in class IndexReader
- Parameters:
docNumber - The number of the document to load the vector for
field - The name of the field to load
mapper - The TermVectorMapper to process the vector. Must not be null
- Throws:
IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.
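The contrast between "parallel arrays" and a "user-defined data structure" can be shown with a simplified callback. This is only an illustration: `TermVectorMapSketch` and its `Mapper` interface are hypothetical and much smaller than Lucene's TermVectorMapper, which also deals with offsets and positions.

```java
/** Hypothetical model (not Lucene API) of feeding a term vector to a caller-supplied mapper. */
final class TermVectorMapSketch {
    /** Simplified stand-in for a mapper: receives one term and its frequency at a time. */
    interface Mapper {
        void map(String term, int freq);
    }

    /** Walks the parallel arrays and hands each (term, freq) pair to the mapper. */
    static void forEachTerm(String[] terms, int[] freqs, Mapper mapper) {
        if (terms.length != freqs.length) {
            throw new IllegalArgumentException("terms and freqs must have the same length");
        }
        for (int i = 0; i < terms.length; i++) {
            mapper.map(terms[i], freqs[i]);
        }
    }
}
```

The caller decides what to build: a mapper that puts each pair into a map, for example, yields lookup by term without ever materialising the two arrays on the caller's side.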
-
getTermFreqVector
public void getTermFreqVector(int docNumber, TermVectorMapper mapper) throws IOException
Description copied from class: IndexReader
Map all the term vectors for all fields in a Document.
- Specified by:
getTermFreqVector in class IndexReader
- Parameters:
docNumber - The number of the document to load the vector for
mapper - The TermVectorMapper to process the vector. Must not be null
- Throws:
IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.
-
getTermFreqVectors
public TermFreqVector[] getTermFreqVectors(int docNumber) throws IOException
Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains term numbers and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null.
- Specified by:
getTermFreqVectors in class IndexReader
- Parameters:
docNumber - document for which term frequency vectors are returned
- Returns:
array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
- Throws:
IOException
- See Also:
Field.TermVector
-
toString
public String toString()
- Overrides:
toString in class IndexReader
-
getSegmentName
public String getSegmentName()
Return the name of the segment this reader is reading.
-
directory
public Directory directory()
Returns the directory this index resides in.
- Overrides:
directory in class IndexReader
-
getCoreCacheKey
public final Object getCoreCacheKey()
Description copied from class: IndexReader
Expert.
- Overrides:
getCoreCacheKey in class IndexReader
-
getDeletesCacheKey
public Object getDeletesCacheKey()
Description copied from class: IndexReader
Expert. Warning: this returns null if the reader has no deletions.
- Overrides:
getDeletesCacheKey in class IndexReader
-
getUniqueTermCount
public long getUniqueTermCount()
Description copied from class: IndexReader
Returns the number of unique terms (across all fields) in this reader. This method returns long, even though internally Lucene cannot handle more than 2^31 unique terms, for a possible future when this limitation is removed.
- Overrides:
getUniqueTermCount in class IndexReader
-
getTermInfosIndexDivisor
public int getTermInfosIndexDivisor()
Description copied from class: IndexReader
For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
- Overrides:
getTermInfosIndexDivisor in class IndexReader
-
addCoreClosedListener
public void addCoreClosedListener(SegmentReader.CoreClosedListener listener)
Expert: adds a CoreClosedListener to this reader's shared core
-
removeCoreClosedListener
public void removeCoreClosedListener(SegmentReader.CoreClosedListener listener)
Expert: removes a CoreClosedListener from this reader's shared core
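Because several SegmentReader instances may share one core, a core-closed listener is naturally tied to the last release of that core rather than to the closing of any single reader. A rough sketch of that idea follows, assuming a simple reference count; `SharedCoreSketch`, `ClosedListener` and the method names are hypothetical and do not reflect SegmentReader's real internals or the actual CoreClosedListener signature.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model (not a Lucene class) of a ref-counted core shared by several readers. */
final class SharedCoreSketch {
    /** Simplified stand-in for a core-closed callback. */
    interface ClosedListener {
        void onClose(SharedCoreSketch core);
    }

    private final AtomicInteger refCount = new AtomicInteger(1); // the first reader holds one reference
    private final List<ClosedListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ClosedListener listener) {
        listeners.add(listener);
    }

    void removeListener(ClosedListener listener) {
        listeners.remove(listener);
    }

    /** Called when another reader starts sharing this core. */
    void incRef() {
        refCount.incrementAndGet();
    }

    /** Called when a reader closes; listeners fire only when the last reference is released. */
    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            for (ClosedListener listener : listeners) {
                listener.onClose(this);
            }
        }
    }
}
```

A typical use of such a hook is evicting cache entries keyed on the core once no reader uses it any more.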
-
-