Class SegmentReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Cloneable

    public class SegmentReader
    extends IndexReader
    implements Cloneable
    IndexReader implementation over a single segment.

    Instances pointing to the same segment (but with different deletes, etc) may share the same core data.

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • readOnly

        @Deprecated
        protected boolean readOnly
        Deprecated.
    • Constructor Detail

      • SegmentReader

        public SegmentReader()
    • Method Detail

      • cloneNormBytes

        @Deprecated
        protected byte[] cloneNormBytes​(byte[] bytes)
        Deprecated.
        Clones the norm bytes. May be overridden by subclasses. New and experimental.
        Parameters:
        bytes - Byte array to clone
        Returns:
        New byte array containing a copy of the norm bytes
      • cloneDeletedDocs

        @Deprecated
        protected BitVector cloneDeletedDocs​(BitVector bv)
        Deprecated.
        Clones the deletedDocs BitVector. May be overridden by subclasses. New and experimental.
        Parameters:
        bv - BitVector to clone
        Returns:
        New BitVector
      • clone

        public final Object clone()
        Description copied from class: IndexReader
        Efficiently clones the IndexReader (sharing most internal state).

        On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.

        Like IndexReader.openIfChanged(IndexReader), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.

        Overrides:
        clone in class IndexReader
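
        The "copy on write" behavior described above can be illustrated with a minimal stand-alone sketch. It does not use Lucene classes: CowReaderSketch, its sharers counter, and its delete method are hypothetical names, and a java.util.BitSet stands in for the deleted-documents bit vector. Write-lock transfer is not modeled, and the sketch is not thread-safe.

```java
import java.util.BitSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader; this is NOT a Lucene class.
class CowReaderSketch implements Cloneable {
    private BitSet deleted;         // shared between clones until one of them writes
    private AtomicInteger sharers;  // number of readers currently sharing 'deleted'

    CowReaderSketch(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
        this.sharers = new AtomicInteger(1);
    }

    @Override
    public CowReaderSketch clone() {
        try {
            // Shallow copy: both readers point at the same BitSet and counter.
            CowReaderSketch copy = (CowReaderSketch) super.clone();
            sharers.incrementAndGet();
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }

    void delete(int doc) {
        if (sharers.get() > 1) {
            // Someone else still sees this BitSet: detach a private copy first.
            sharers.decrementAndGet();
            deleted = (BitSet) deleted.clone();
            sharers = new AtomicInteger(1);
        }
        deleted.set(doc);
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }
}
```

        In this sketch a clone is cheap because nothing is copied until the first delete; after that, each reader sees only its own changes plus those made before the clone.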
      • hasDeletions

        public boolean hasDeletions()
        Description copied from class: IndexReader
        Returns true if any documents have been deleted.
        Specified by:
        hasDeletions in class IndexReader
      • terms

        public TermEnum terms()
        Description copied from class: IndexReader
        Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), TermEnum.next() must be called on the resulting enumeration before calling other methods such as TermEnum.term().
        Specified by:
        terms in class IndexReader
      • terms

        public TermEnum terms​(Term t)
                       throws IOException
        Description copied from class: IndexReader
        Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
        Specified by:
        terms in class IndexReader
        Throws:
        IOException - if there is a low-level IO error
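
        The positioning rule above (start at the given term, or at the first greater term if it is absent) can be modeled with a sorted set. This is a stand-alone sketch, not Lucene code: TermSeekSketch and termsFrom are hypothetical names, and plain strings ordered by String.compareTo stand in for Term objects ordered by Term.compareTo().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of a sorted term dictionary; NOT a Lucene class.
class TermSeekSketch {
    // Returns every term >= start, in ascending order. If 'start' itself is
    // missing, the result begins at the first term greater than it.
    static List<String> termsFrom(TreeSet<String> dictionary, String start) {
        return new ArrayList<>(dictionary.tailSet(start, true));
    }
}
```

        Seeking past the last term yields an empty list in this model, which corresponds to an enumeration with nothing left to return.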
      • document

        public Document document​(int n,
                                 FieldSelector fieldSelector)
                          throws CorruptIndexException,
                                 IOException
        Description copied from class: IndexReader
        Get the Document at the nth position. The FieldSelector may be used to determine which Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.

        NOTE: for performance reasons, this method does not check whether the requested document is deleted, so asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IndexReader.isDeleted(int) with the requested document ID to verify that the document is not deleted.

        Specified by:
        document in class IndexReader
        Parameters:
        n - Get the document at the nth position
        fieldSelector - The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
        Returns:
        The stored fields of the Document at the nth position
        Throws:
        CorruptIndexException - if the index is corrupt
        IOException - if there is a low-level IO error
        See Also:
        Fieldable, FieldSelector, SetBasedFieldSelector, LoadFirstFieldSelector
      • isDeleted

        public boolean isDeleted​(int n)
        Description copied from class: IndexReader
        Returns true if document n has been deleted.
        Specified by:
        isDeleted in class IndexReader
      • termDocs

        public TermDocs termDocs​(Term term)
                          throws IOException
        Description copied from class: IndexReader
        Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:

          Term    =>    <docNum, freq>*

        The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

        Overrides:
        termDocs in class IndexReader
        Throws:
        IOException - if there is a low-level IO error
      • rawTermDocs

        public TermDocs rawTermDocs​(Term term)
                             throws IOException
        Expert: returns an enumeration of the documents that contain term, including deleted documents (which are normally filtered out).
        Throws:
        IOException
        WARNING: This API is experimental and might change in incompatible ways in the next release.
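
        The difference between termDocs and rawTermDocs can be sketched without Lucene as a postings list walked with and without a deleted-documents filter. All names here (PostingsSketch, filtered, raw) are hypothetical; parallel int arrays stand in for the <docNum, freq> pairs and a java.util.BitSet stands in for the deleted-documents bit vector.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of one term's postings; NOT a Lucene class.
class PostingsSketch {
    // Like termDocs: yields {docNum, freq} pairs, skipping deleted documents.
    static List<int[]> filtered(int[] docs, int[] freqs, BitSet deleted) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            if (!deleted.get(docs[i])) {
                out.add(new int[] {docs[i], freqs[i]});
            }
        }
        return out;
    }

    // Like rawTermDocs: yields every {docNum, freq} pair, deleted or not.
    static List<int[]> raw(int[] docs, int[] freqs) {
        List<int[]> out = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            out.add(new int[] {docs[i], freqs[i]});
        }
        return out;
    }
}
```

        Both methods preserve the input order, so if docs is ascending the result is ordered by document number, as the enumeration contract requires.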
      • numDocs

        public int numDocs()
        Description copied from class: IndexReader
        Returns the number of documents in this index.
        Specified by:
        numDocs in class IndexReader
      • maxDoc

        public int maxDoc()
        Description copied from class: IndexReader
        Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
        Specified by:
        maxDoc in class IndexReader
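
        The relationship between maxDoc, numDocs, and isDeleted can be shown with a small stand-alone model. It is a sketch rather than Lucene code: DocCountSketch and its methods are hypothetical, a java.util.BitSet stands in for the deleted-documents set, and it assumes every deleted bit lies in the range 0 to maxDoc - 1.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of document numbering in one segment; NOT a Lucene class.
class DocCountSketch {
    // Live document count: all slots minus the deleted ones.
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    // Document numbers run from 0 to maxDoc - 1; deleted slots are skipped,
    // mirroring an isDeleted(n) check before fetching document n.
    static List<Integer> liveDocs(int maxDoc, BitSet deleted) {
        List<Integer> live = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if (!deleted.get(doc)) {
                live.add(doc);
            }
        }
        return live;
    }

    // A per-document array must be sized by maxDoc, not numDocs, because
    // deleted documents still occupy their document numbers.
    static float[] perDocArray(int maxDoc) {
        return new float[maxDoc];
    }
}
```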
      • hasNorms

        public boolean hasNorms​(String field)
        Description copied from class: IndexReader
        Returns true if there are norms stored for this field.
        Overrides:
        hasNorms in class IndexReader
      • getTermFreqVector

        public TermFreqVector getTermFreqVector​(int docNumber,
                                                String field)
                                         throws IOException
        Return a term frequency vector for the specified document and field. The returned vector contains term numbers and frequencies for all terms in the specified field of this document, if the field had the storeTermVector flag set. If the flag was not set, the method returns null.
        Specified by:
        getTermFreqVector in class IndexReader
        Parameters:
        docNumber - document for which the term frequency vector is returned
        field - field for which the term frequency vector is returned.
        Returns:
        The term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.
        Throws:
        IOException
        See Also:
        Field.TermVector
      • getTermFreqVector

        public void getTermFreqVector​(int docNumber,
                                      String field,
                                      TermVectorMapper mapper)
                               throws IOException
        Description copied from class: IndexReader
        Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.
        Specified by:
        getTermFreqVector in class IndexReader
        Parameters:
        docNumber - The number of the document to load the vector for
        field - The name of the field to load
        mapper - The TermVectorMapper to process the vector. Must not be null
        Throws:
        IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.
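
        The idea of loading a term vector into a user-defined structure instead of parallel arrays can be sketched as a callback. This is a stand-alone illustration only: VectorMapperSketch, TermSink, feed, and toMap are hypothetical names and do not reproduce the TermVectorMapper API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the mapper idea; NOT Lucene's TermVectorMapper.
class VectorMapperSketch {
    // Callback that receives one term and its frequency at a time.
    interface TermSink {
        void accept(String term, int freq);
    }

    // Walks the parallel arrays once and hands each entry to the sink,
    // so the caller decides what data structure to build.
    static void feed(String[] terms, int[] freqs, TermSink sink) {
        for (int i = 0; i < terms.length; i++) {
            sink.accept(terms[i], freqs[i]);
        }
    }

    // Example sink: collect the vector into an insertion-ordered map.
    static Map<String, Integer> toMap(String[] terms, int[] freqs) {
        Map<String, Integer> out = new LinkedHashMap<>();
        feed(terms, freqs, (term, freq) -> out.put(term, freq));
        return out;
    }
}
```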
      • getTermFreqVector

        public void getTermFreqVector​(int docNumber,
                                      TermVectorMapper mapper)
                               throws IOException
        Description copied from class: IndexReader
        Map all the term vectors for all fields in a Document.
        Specified by:
        getTermFreqVector in class IndexReader
        Parameters:
        docNumber - The number of the document to load the vector for
        mapper - The TermVectorMapper to process the vector. Must not be null
        Throws:
        IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.
      • getTermFreqVectors

        public TermFreqVector[] getTermFreqVectors​(int docNumber)
                                            throws IOException
        Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains term numbers and frequencies for all terms in a given vectorized field. If no such fields exist, the method returns null.
        Specified by:
        getTermFreqVectors in class IndexReader
        Parameters:
        docNumber - document for which term frequency vectors are returned
        Returns:
        array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
        Throws:
        IOException
        See Also:
        Field.TermVector
      • getSegmentName

        public String getSegmentName()
        Return the name of the segment this reader is reading.
      • getUniqueTermCount

        public long getUniqueTermCount()
        Description copied from class: IndexReader
        Returns the number of unique terms (across all fields) in this reader. This method returns long, even though internally Lucene cannot handle more than 2^31 unique terms, for a possible future when this limitation is removed.
        Overrides:
        getUniqueTermCount in class IndexReader
      • getTermInfosIndexDivisor

        public int getTermInfosIndexDivisor()
        Description copied from class: IndexReader
        For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
        Overrides:
        getTermInfosIndexDivisor in class IndexReader
      • addCoreClosedListener

        public void addCoreClosedListener​(SegmentReader.CoreClosedListener listener)
        Expert: adds a CoreClosedListener to this reader's shared core.
      • removeCoreClosedListener

        public void removeCoreClosedListener​(SegmentReader.CoreClosedListener listener)
        Expert: removes a CoreClosedListener from this reader's shared core.
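
        A shared core with close listeners can be modeled as a reference-counted object. The sketch below is stand-alone and hypothetical (SharedCoreSketch, ClosedListener, incRef, and decRef are illustrative names, not the SegmentReader API), and it assumes listeners fire once, when the last reader sharing the core releases it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of core data shared by several readers; NOT a Lucene class.
class SharedCoreSketch {
    // Callback invoked when the core is finally closed.
    interface ClosedListener {
        void onClose(SharedCoreSketch core);
    }

    private int refCount = 1;  // the reader that created the core
    private final List<ClosedListener> listeners = new ArrayList<>();

    synchronized void addListener(ClosedListener listener) {
        listeners.add(listener);
    }

    synchronized void removeListener(ClosedListener listener) {
        listeners.remove(listener);
    }

    // Called when another reader starts sharing this core.
    synchronized void incRef() {
        refCount++;
    }

    // Called when a reader closes; the last release notifies the listeners.
    synchronized void decRef() {
        if (--refCount == 0) {
            for (ClosedListener listener : listeners) {
                listener.onClose(this);
            }
        }
    }
}
```

        A typical use for such a listener is evicting per-segment cache entries keyed on the core once no reader can use them any longer.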