Interface SamReader

    • Method Detail

      • getResourceDescription

        String getResourceDescription()
        Returns:
        a human-readable description of the resource backing this SamReader
      • hasIndex

        boolean hasIndex()
        Returns:
        true if this is a BAM file and has an index
      • iterator

        SAMRecordIterator iterator()
        Iterate through the file in order. For a SamReader constructed from an InputStream, and for any SAM file, a second iteration starts where the first one left off. For a BAM constructed from a SeekableStream or File, each new iteration starts at the first record.

        Only a single open iterator on a SAM or BAM file may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Specified by:
        iterator in interface Iterable<SAMRecord>
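
        The single-open-iterator rule can be illustrated without the library. The following is a minimal sketch, not htsjdk code: SingleIteratorSource and its Cursor are hypothetical names, and rejecting a second iterator with IllegalStateException is an assumption about how a violation might surface. It models the File-backed BAM case, where each new iteration starts at the first record.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a reader that permits one open iterator at a time.
class SingleIteratorSource {
    private final List<String> records;
    private boolean iteratorOpen = false;

    SingleIteratorSource(List<String> records) {
        this.records = records;
    }

    // Starts a new iteration at the first record; fails if one is still open.
    // Assumption: a violation is reported with IllegalStateException.
    Cursor iterator() {
        if (iteratorOpen) {
            throw new IllegalStateException("Close the open iterator first");
        }
        iteratorOpen = true;
        return new Cursor();
    }

    // Closeable cursor; closing it allows the next iteration to start.
    class Cursor implements Iterator<String>, AutoCloseable {
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < records.size();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return records.get(next++);
        }

        @Override
        public void close() {
            iteratorOpen = false;
        }
    }
}
```

        Because Cursor is AutoCloseable, a try-with-resources block closes it on exit, which is one way to make sure the first iteration is closed before a second one starts.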
      • query

        SAMRecordIterator query(String sequence,
                                int start,
                                int end,
                                boolean contained)
        Iterate over records that match the given interval. Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        sequence - Reference sequence of interest.
        start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
        end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
        contained - If true, each SAMRecord returned will have its alignment completely contained in the interval of interest. If false, the alignment of the returned SAMRecords need only overlap the interval of interest.
        Returns:
        Iterator over the SAMRecords matching the interval.
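
        The difference between the two values of contained can be written as a pair of predicates over 1-based, inclusive coordinates. This is a sketch of the conventions described above, not the library's implementation; IntervalMatch and its methods are hypothetical names, and alignment coordinates are assumed to be at least 1.

```java
// Hypothetical helper illustrating the interval conventions; not part of htsjdk.
class IntervalMatch {
    // An end of zero stands for "end of the reference sequence".
    static int effectiveEnd(int end) {
        return end == 0 ? Integer.MAX_VALUE : end;
    }

    // True if the alignment shares at least one base with [start, end].
    // A start of zero needs no special case because coordinates are 1-based.
    static boolean overlaps(int alnStart, int alnEnd, int start, int end) {
        return alnEnd >= start && alnStart <= effectiveEnd(end);
    }

    // True if the alignment lies entirely inside [start, end].
    static boolean contained(int alnStart, int alnEnd, int start, int end) {
        return alnStart >= start && alnEnd <= effectiveEnd(end);
    }

    // Mirrors the meaning of the contained flag.
    static boolean matches(int alnStart, int alnEnd, int start, int end, boolean contained) {
        return contained
                ? contained(alnStart, alnEnd, start, end)
                : overlaps(alnStart, alnEnd, start, end);
    }

    public static void main(String[] args) {
        // Alignment 90-110 against query interval 100-200.
        System.out.println(matches(90, 110, 100, 200, false)); // prints true
        System.out.println(matches(90, 110, 100, 200, true));  // prints false
    }
}
```

        An alignment that straddles the interval boundary therefore matches only when contained is false.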
      • queryOverlapping

        SAMRecordIterator queryOverlapping(String sequence,
                                           int start,
                                           int end)
        Iterate over records that overlap the given interval. Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        sequence - Reference sequence of interest.
        start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
        end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
        Returns:
        Iterator over the SAMRecords overlapping the interval.
      • queryContained

        SAMRecordIterator queryContained(String sequence,
                                         int start,
                                         int end)
        Iterate over records that are contained in the given interval. Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        sequence - Reference sequence of interest.
        start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
        end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
        Returns:
        Iterator over the SAMRecords contained in the interval.
      • query

        SAMRecordIterator query(QueryInterval[] intervals,
                                boolean contained)
        Iterate over records that match one of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords more than once.

        Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match an interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
        contained - If true, each SAMRecord returned will have its alignment completely contained in one of the intervals of interest. If false, the alignment of the returned SAMRecords need only overlap one of the intervals of interest.
        Returns:
        Iterator over the SAMRecords matching any of the intervals.
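
        The "optimized" requirement on intervals (in order, with overlapping and abutting intervals merged) can be sketched as a sort followed by a single merge pass. This is an illustration of that requirement only, not the code behind QueryInterval.optimizeIntervals; IntervalMerge and its Interval type are hypothetical, and the sketch assumes every interval has an explicit positive end rather than the zero shorthand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of interval optimization; not part of htsjdk.
class IntervalMerge {
    // Reference index plus 1-based, inclusive start and end.
    static final class Interval {
        final int referenceIndex;
        final int start;
        final int end;

        Interval(int referenceIndex, int start, int end) {
            this.referenceIndex = referenceIndex;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the intervals sorted, with overlapping and abutting ones merged.
    // Assumption: all ends are explicit positive coordinates.
    static List<Interval> optimize(Interval[] input) {
        Interval[] sorted = input.clone();
        Arrays.sort(sorted, (a, b) -> a.referenceIndex != b.referenceIndex
                ? Integer.compare(a.referenceIndex, b.referenceIndex)
                : Integer.compare(a.start, b.start));

        List<Interval> merged = new ArrayList<>();
        for (Interval next : sorted) {
            if (!merged.isEmpty()) {
                Interval last = merged.get(merged.size() - 1);
                // Overlapping: next.start <= last.end. Abutting: next.start == last.end + 1.
                if (last.referenceIndex == next.referenceIndex && next.start - 1 <= last.end) {
                    merged.set(merged.size() - 1,
                            new Interval(last.referenceIndex, last.start, Math.max(last.end, next.end)));
                    continue;
                }
            }
            merged.add(next);
        }
        return merged;
    }
}
```

        With this sketch, the intervals 100-200 and 201-250 on the same reference collapse into 100-250, while 300-400 stays separate because it neither overlaps nor abuts the merged interval.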
      • queryOverlapping

        SAMRecordIterator queryOverlapping(QueryInterval[] intervals)
        Iterate over records that overlap any of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords more than once.

        Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
      • queryContained

        SAMRecordIterator queryContained(QueryInterval[] intervals)
        Iterate over records that are contained in any of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords more than once.

        Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

        Parameters:
        intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
        Returns:
        Iterator over the SAMRecords contained in any of the intervals.
      • queryAlignmentStart

        SAMRecordIterator queryAlignmentStart(String sequence,
                                              int start)
        Iterate over records that map to the given sequence and start at the given position. Only valid to call this if hasIndex() == true.

        Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

        Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

        Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that matches the arguments.

        Parameters:
        sequence - Reference sequence of interest.
        start - Alignment start of interest.
        Returns:
        Iterator over the SAMRecords with the given alignment start.
      • queryMate

        SAMRecord queryMate(SAMRecord rec)
        Fetch the mate for the given read. Only valid to call this if hasIndex() == true. This will work whether the mate has a coordinate or not, so long as the given read has correct mate information. This method iterates over the SAM file, so no iterator may be open on the SAM file when this method is called.

        Note that it is not possible to call queryMate when iterating over the SamReader, because queryMate requires its own iteration, and there cannot be two simultaneous iterations on the same SamReader. The work-around is to open a second SamReader on the same input file, and call queryMate on the second reader.

        Parameters:
        rec - Record for which mate is sought. Must be a paired read.
        Returns:
        rec's mate, or null if it cannot be found.
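
        Why correct mate information matters can be seen in a small model of the lookup. This is a sketch under stated assumptions, not the library's implementation: MateLookup and Read are hypothetical types, the list stands in for the records found at the mate's position, and the matching rule (same read name, opposite end of the pair, located at the recorded mate position) is an assumption about what a mate lookup has to check. Unmapped mates without a coordinate are not modelled.

```java
import java.util.List;

// Hypothetical model of a mate lookup; not part of htsjdk.
class MateLookup {
    // Minimal paired-read record carrying its own and its mate's position.
    static final class Read {
        final String name;
        final boolean firstOfPair;
        final String reference;
        final int alignmentStart;
        final String mateReference;
        final int mateAlignmentStart;

        Read(String name, boolean firstOfPair, String reference, int alignmentStart,
             String mateReference, int mateAlignmentStart) {
            this.name = name;
            this.firstOfPair = firstOfPair;
            this.reference = reference;
            this.alignmentStart = alignmentStart;
            this.mateReference = mateReference;
            this.mateAlignmentStart = mateAlignmentStart;
        }
    }

    // Returns the mate of rec among reads, or null if it cannot be found.
    static Read findMate(Read rec, List<Read> reads) {
        for (Read candidate : reads) {
            // The candidate must sit where rec says its mate is.
            boolean atMatePosition = candidate.reference.equals(rec.mateReference)
                    && candidate.alignmentStart == rec.mateAlignmentStart;
            // The candidate must be the other end of the same pair.
            boolean otherEnd = candidate.firstOfPair != rec.firstOfPair;
            if (atMatePosition && otherEnd && candidate.name.equals(rec.name)) {
                return candidate;
            }
        }
        return null;
    }
}
```

        In this model, a read whose recorded mate position is wrong yields null even when the mate is present, which matches the documented null return when the mate cannot be found.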