Class TermPruningPolicy

    • Constructor Detail

      • TermPruningPolicy

        protected TermPruningPolicy(IndexReader in,
                                    Map<String,Integer> fieldFlags)
        Construct a policy.
        Parameters:
        in - input reader
        fieldFlags - a map whose keys are field names and whose values are bitwise-OR combinations of flags for the operations to perform (see PruningPolicy for more details).
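        As a rough illustration of the fieldFlags argument, the sketch below builds such a map and tests individual flag bits. It does not depend on Lucene: the flag constants and their values are placeholders defined locally for the example (the real ones live in PruningPolicy and may differ), and FieldFlagsSketch, buildFlags and hasFlag are hypothetical names. Treating a field that is missing from the map as having no flags is also an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

public class FieldFlagsSketch {
    // Placeholder flag values, assumed for illustration only; the real
    // constants are defined in PruningPolicy and may have different values.
    static final int DEL_POSTINGS = 0x01;
    static final int DEL_VECTOR = 0x04;

    // Builds a map from field name to a bitwise-OR combination of flags.
    static Map<String, Integer> buildFlags() {
        Map<String, Integer> fieldFlags = new HashMap<>();
        fieldFlags.put("body", DEL_VECTOR);
        fieldFlags.put("debug", DEL_POSTINGS | DEL_VECTOR);
        return fieldFlags;
    }

    // Returns true if the given flag bit is set for the field. Fields that
    // are missing from the map are treated as having no flags (an assumption).
    static boolean hasFlag(Map<String, Integer> fieldFlags, String field, int flag) {
        Integer flags = fieldFlags.get(field);
        return flags != null && (flags & flag) != 0;
    }

    public static void main(String[] args) {
        Map<String, Integer> flags = buildFlags();
        System.out.println("body/DEL_VECTOR: " + hasFlag(flags, "body", DEL_VECTOR));
        System.out.println("body/DEL_POSTINGS: " + hasFlag(flags, "body", DEL_POSTINGS));
        System.out.println("title/DEL_VECTOR: " + hasFlag(flags, "title", DEL_VECTOR));
    }
}
```

        A check of this kind is presumably what pruneWholeTermVector and pruneAllFieldPostings perform against DEL_VECTOR and DEL_POSTINGS, although the actual implementation is not shown on this page.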
    • Method Detail

      • pruneWholeTermVector

        public boolean pruneWholeTermVector(int docNumber,
                                            String field)
                                     throws IOException
        Term vector pruning.
        Parameters:
        docNumber - document number
        field - field name
        Returns:
        true if the complete term vector for this field should be removed (as specified by the PruningPolicy.DEL_VECTOR flag).
        Throws:
        IOException
      • pruneAllFieldPostings

        public boolean pruneAllFieldPostings(String field)
                                      throws IOException
        Pruning of all postings for a field.
        Parameters:
        field - field name
        Returns:
        true if all postings for all terms in this field should be removed (as specified by the PruningPolicy.DEL_POSTINGS flag).
        Throws:
        IOException
      • prunePayload

        public boolean prunePayload(TermPositions in,
                                    Term curTerm)
        Called when checking for the presence of a payload for the current term at the current position.
        Parameters:
        in - positioned term positions
        curTerm - current term associated with these positions
        Returns:
        true if the payload should be removed, false otherwise.
      • pruneTermVectorTerms

        public abstract int pruneTermVectorTerms(int docNumber,
                                                 String field,
                                                 String[] terms,
                                                 int[] freqs,
                                                 TermFreqVector v)
                                          throws IOException
        Pruning of individual terms in term vectors.
        Parameters:
        docNumber - document number
        field - field name
        terms - array of terms
        freqs - array of term frequencies
        v - the original term frequency vector
        Returns:
        0 if no terms are to be removed, or a positive number indicating how many terms are to be removed. The same number of entries in the terms array must be set to null to indicate which terms to remove.
        Throws:
        IOException
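        The return contract of pruneTermVectorTerms (null out the entries to drop and return how many were nulled) can be sketched without Lucene as follows. The frequency threshold is only an example criterion, not the rule used by any particular policy, and TermVectorPruningSketch, pruneLowFrequencyTerms and minFreq are hypothetical names.

```java
public class TermVectorPruningSketch {
    // Sets terms[i] to null for every term whose frequency is below minFreq
    // and returns the number of entries that were nulled. The threshold
    // criterion is an assumption made for this example.
    static int pruneLowFrequencyTerms(String[] terms, int[] freqs, int minFreq) {
        int removed = 0;
        for (int i = 0; i < terms.length; i++) {
            if (terms[i] != null && freqs[i] < minFreq) {
                terms[i] = null;   // mark this term for removal
                removed++;
            }
        }
        return removed;            // 0 means nothing is to be removed
    }

    public static void main(String[] args) {
        String[] terms = {"lucene", "the", "index"};
        int[] freqs = {5, 1, 2};
        int removed = pruneLowFrequencyTerms(terms, freqs, 2);
        System.out.println("removed=" + removed);
        System.out.println(java.util.Arrays.toString(terms));
    }
}
```

        The returned count and the number of null entries must agree, since the caller relies on both.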
      • pruneTermEnum

        public abstract boolean pruneTermEnum(TermEnum te)
                                       throws IOException
        Pruning of all postings for a term (invoked once per term).
        Parameters:
        te - positioned term enum.
        Returns:
        true if all postings for this term should be removed, false otherwise.
        Throws:
        IOException
      • pruneAllPositions

        public abstract boolean pruneAllPositions(TermPositions termPositions,
                                                  Term t)
                                           throws IOException
        Prune all postings per term (invoked once per term per doc).
        Parameters:
        termPositions - positioned term positions. Implementations MUST NOT advance this by calling TermPositions methods that advance either the position pointer (next, skipTo) or term pointer (seek).
        t - current term
        Returns:
        true if the current posting should be removed, false otherwise.
        Throws:
        IOException
      • pruneSomePositions

        public abstract int pruneSomePositions(int docNum,
                                               int[] positions,
                                               Term curTerm)
        Prune some postings per term (invoked once per term per doc).
        Parameters:
        docNum - current document number
        positions - original term positions in the document (and indirectly term frequency)
        curTerm - current term
        Returns:
        0 if no postings are to be removed, or a positive number indicating how many postings are to be removed. The same number of entries in the positions array must be set to -1 to indicate which positions to remove.
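
        A minimal sketch of the pruneSomePositions contract, again independent of Lucene: positions to drop are overwritten with -1 and the method returns how many were marked. Dropping every position past a cutoff is just an example criterion, and PositionPruningSketch, prunePositionsAfter and maxPosition are hypothetical names.

```java
public class PositionPruningSketch {
    // Marks every position greater than maxPosition with -1 and returns the
    // number of positions marked. The cutoff criterion is an assumption
    // made for this example.
    static int prunePositionsAfter(int[] positions, int maxPosition) {
        int removed = 0;
        for (int i = 0; i < positions.length; i++) {
            // Skip entries that are already marked as removed.
            if (positions[i] >= 0 && positions[i] > maxPosition) {
                positions[i] = -1;
                removed++;
            }
        }
        return removed;            // 0 means no postings are to be removed
    }

    public static void main(String[] args) {
        int[] positions = {3, 17, 42, 250};
        int removed = prunePositionsAfter(positions, 100);
        System.out.println("removed=" + removed);
        System.out.println(java.util.Arrays.toString(positions));
    }
}
```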