Class WeightedSpanTermExtractor


  • public class WeightedSpanTermExtractor
    extends Object
    Class used to extract WeightedSpanTerms from a Query based on whether Terms from the Query are contained in a supplied TokenStream.
    • Constructor Detail

      • WeightedSpanTermExtractor

        public WeightedSpanTermExtractor()
      • WeightedSpanTermExtractor

        public WeightedSpanTermExtractor​(String defaultField)
    • Method Detail

      • extract

        protected void extract​(org.apache.lucene.search.Query query,
                               Map<String,​WeightedSpanTerm> terms)
                        throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
        Parameters:
        query - Query to extract Terms from
        terms - Map to place created WeightedSpanTerms in
        Throws:
        IOException
      • extractWeightedSpanTerms

        protected void extractWeightedSpanTerms​(Map<String,​WeightedSpanTerm> terms,
                                                org.apache.lucene.search.spans.SpanQuery spanQuery)
                                         throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied SpanQuery.
        Parameters:
        terms - Map to place created WeightedSpanTerms in
        spanQuery - SpanQuery to extract Terms from
        Throws:
        IOException
      • extractWeightedTerms

        protected void extractWeightedTerms​(Map<String,​WeightedSpanTerm> terms,
                                            org.apache.lucene.search.Query query)
                                     throws IOException
        Fills a Map with WeightedSpanTerms using the terms from the supplied Query.
        Parameters:
        terms - Map to place created WeightedSpanTerms in
        query - Query to extract Terms from
        Throws:
        IOException
      • fieldNameComparator

        protected boolean fieldNameComparator​(String fieldNameToCheck)
        Checks a field name during extraction. Necessary so that queries against defaultField are also matched.
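        The field check can be pictured with a small standalone sketch. It does not use Lucene's classes, and the rule it encodes is an assumption: a null target field matches every name, and otherwise a name matches when it equals either the target field or the default field. The class name FieldMatchSketch and its constructor are hypothetical.

```java
/** Standalone stand-in for the field check; not Lucene's implementation. */
final class FieldMatchSketch {
    private final String fieldName;     // field being highlighted; null is assumed to mean "any field"
    private final String defaultField;  // field assumed for queries that name no field; may be null

    FieldMatchSketch(String fieldName, String defaultField) {
        this.fieldName = fieldName;
        this.defaultField = defaultField;
    }

    /** Returns true when terms on fieldNameToCheck should be kept (assumed rule). */
    boolean fieldNameComparator(String fieldNameToCheck) {
        return fieldName == null
                || fieldName.equals(fieldNameToCheck)
                || (defaultField != null && defaultField.equals(fieldNameToCheck));
    }

    public static void main(String[] args) {
        FieldMatchSketch matcher = new FieldMatchSketch("title", "body");
        System.out.println("title  -> " + matcher.fieldNameComparator("title"));
        System.out.println("body   -> " + matcher.fieldNameComparator("body"));
        System.out.println("author -> " + matcher.fieldNameComparator("author"));
    }
}
```

        Under this rule, a term on the default field is kept even when highlighting is restricted to another field, which is the case the method description refers to.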
      • getReaderForField

        protected org.apache.lucene.index.IndexReader getReaderForField​(String field)
                                                                 throws IOException
        Throws:
        IOException
      • getWeightedSpanTerms

        public Map<String,​WeightedSpanTerm> getWeightedSpanTerms​(org.apache.lucene.search.Query query,
                                                                       org.apache.lucene.analysis.TokenStream tokenStream)
                                                                throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        Returns:
        Map containing WeightedSpanTerms
        Throws:
        IOException
      • getWeightedSpanTerms

        public Map<String,​WeightedSpanTerm> getWeightedSpanTerms​(org.apache.lucene.search.Query query,
                                                                       org.apache.lucene.analysis.TokenStream tokenStream,
                                                                       String fieldName)
                                                                throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream.

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        fieldName - restricts the Terms used, based on field name
        Returns:
        Map containing WeightedSpanTerms
        Throws:
        IOException
      • getWeightedSpanTermsWithScores

        public Map<String,​WeightedSpanTerm> getWeightedSpanTermsWithScores​(org.apache.lucene.search.Query query,
                                                                                 org.apache.lucene.analysis.TokenStream tokenStream,
                                                                                 String fieldName,
                                                                                 org.apache.lucene.index.IndexReader reader)
                                                                          throws IOException
        Creates a Map of WeightedSpanTerms from the given Query and TokenStream. Uses a supplied IndexReader to properly weight terms (for gradient highlighting).

        Parameters:
        query - Query that caused the hit
        tokenStream - TokenStream of the text to be highlighted
        fieldName - restricts the Terms used, based on field name
        reader - IndexReader to use for scoring
        Returns:
        Map of WeightedSpanTerms with quasi tf/idf scores
        Throws:
        IOException
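        To make "quasi tf/idf scores" concrete, here is a standalone sketch of how a term weight might be scaled by an inverse document frequency. The formula, log(totalDocs / (docFreq + 1)) + 1, is an assumption modelled on the classic IDF definition; this page does not state the exact formula. The class IdfWeightSketch and the plain maps standing in for WeightedSpanTerm and IndexReader are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Standalone illustration of IDF-style weighting; not Lucene's implementation. */
final class IdfWeightSketch {

    /** Assumed classic IDF: rarer terms (low docFreq) get a larger factor. */
    static float idf(int docFreq, int totalNumDocs) {
        return (float) (Math.log(totalNumDocs / (double) (docFreq + 1)) + 1.0);
    }

    /**
     * Multiplies each term's weight by its IDF factor.
     * weights      - term text to query weight (stands in for WeightedSpanTerm)
     * docFreqs     - term text to number of documents containing it (stands in for IndexReader)
     * totalNumDocs - number of documents in the index
     */
    static Map<String, Float> applyIdf(Map<String, Float> weights,
                                       Map<String, Integer> docFreqs,
                                       int totalNumDocs) {
        Map<String, Float> scored = new LinkedHashMap<>();
        for (Map.Entry<String, Float> entry : weights.entrySet()) {
            int docFreq = docFreqs.getOrDefault(entry.getKey(), 0);
            scored.put(entry.getKey(), entry.getValue() * idf(docFreq, totalNumDocs));
        }
        return scored;
    }

    public static void main(String[] args) {
        Map<String, Float> weights = new LinkedHashMap<>();
        weights.put("lucene", 1.0f);
        weights.put("the", 1.0f);

        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("lucene", 9);   // rare term
        docFreqs.put("the", 99);     // very common term

        // The rare term ends up with the larger score.
        System.out.println(applyIdf(weights, docFreqs, 100));
    }
}
```

        A highlighter that shades matches by score (gradient highlighting) would then render rare terms more strongly than common ones.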
      • collectSpanQueryFields

        protected void collectSpanQueryFields​(org.apache.lucene.search.spans.SpanQuery spanQuery,
                                              Set<String> fieldNames)
      • mustRewriteQuery

        protected boolean mustRewriteQuery​(org.apache.lucene.search.spans.SpanQuery spanQuery)
      • getExpandMultiTermQuery

        public boolean getExpandMultiTermQuery()
      • setExpandMultiTermQuery

        public void setExpandMultiTermQuery​(boolean expandMultiTermQuery)
      • isCachedTokenStream

        public boolean isCachedTokenStream()
      • getTokenStream

        public org.apache.lucene.analysis.TokenStream getTokenStream()
      • setWrapIfNotCachingTokenFilter

        public void setWrapIfNotCachingTokenFilter​(boolean wrap)
        By default, a TokenStream that is not a CachingTokenFilter is wrapped in a CachingTokenFilter to ensure an efficient reset. If you are already using a different caching TokenStream implementation and do not want it to be wrapped, set this to false.
        Parameters:
        wrap - true to wrap TokenStreams that are not CachingTokenFilters; false to use them as supplied
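        The wrapping decision can be illustrated without Lucene. In the sketch below every type is a hypothetical stand-in: Tokens plays the role of TokenStream, Caching plays the role of CachingTokenFilter, and prepare shows the assumed decision controlled by the wrap flag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Standalone illustration of "wrap unless already caching"; not Lucene's classes. */
final class WrapSketch {

    /** Hypothetical stand-in for a token stream: returns tokens, then null at the end. */
    interface Tokens {
        String next();
        void reset();
    }

    /** Stand-in for a caching filter: reads the input once, then replays it after reset(). */
    static final class Caching implements Tokens {
        private final Tokens input;
        private List<String> cache;  // filled lazily on the first call to next()
        private int pos = 0;

        Caching(Tokens input) {
            this.input = input;
        }

        @Override
        public String next() {
            if (cache == null) {  // first call: drain the wrapped stream exactly once
                cache = new ArrayList<>();
                for (String t = input.next(); t != null; t = input.next()) {
                    cache.add(t);
                }
            }
            return pos < cache.size() ? cache.get(pos++) : null;
        }

        @Override
        public void reset() {
            pos = 0;  // replay from the cache; the input is never re-read
        }
    }

    /** A stream that cannot be rewound, modelling a non-caching TokenStream. */
    static Tokens oneShot(String... words) {
        Iterator<String> it = Arrays.asList(words).iterator();
        return new Tokens() {
            @Override public String next() { return it.hasNext() ? it.next() : null; }
            @Override public void reset() { /* no-op: consumed tokens are gone */ }
        };
    }

    /** Assumed decision: wrap only when the flag is set and the stream is not already caching. */
    static Tokens prepare(Tokens stream, boolean wrapIfNotCaching) {
        if (wrapIfNotCaching && !(stream instanceof Caching)) {
            return new Caching(stream);
        }
        return stream;
    }

    public static void main(String[] args) {
        Tokens stream = prepare(oneShot("quick", "brown", "fox"), true);
        while (stream.next() != null) { /* first pass consumes the stream */ }
        stream.reset();
        System.out.println("first token after reset: " + stream.next());
    }
}
```

        With the flag set to false, prepare returns the stream unchanged, which is the behaviour you would want when the supplied stream already caches its tokens by some other means.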
      • setMaxDocCharsToAnalyze

        protected final void setMaxDocCharsToAnalyze​(int maxDocCharsToAnalyze)