Class CollectInsertSizeMetrics


  • @DocumentedFeature
    public class CollectInsertSizeMetrics
    extends SinglePassSamProgram
    Command-line program to read non-duplicate insert sizes, create a histogram and report distribution statistics.
    • Field Detail

      • Histogram_FILE

        @Argument(shortName="H",
                  doc="File to write insert size histogram chart to.")
        public File Histogram_FILE
      • DEVIATIONS

        @Argument(doc="Generate mean, sd and plots by trimming the data down to MEDIAN + DEVIATIONS*MEDIAN_ABSOLUTE_DEVIATION. This is done because insert size data typically includes enough anomalous values from chimeras and other artifacts to make the mean and sd grossly misleading regarding the real distribution.")
        public double DEVIATIONS
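
The trimming rule described for DEVIATIONS can be illustrated with a minimal sketch. This is not Picard's implementation: the class and method names are hypothetical, it works on a raw array of insert sizes rather than a histogram, it assumes a non-empty input, and the value 10 is only an example setting.

```java
import java.util.Arrays;

public class InsertSizeTrimSketch {

    /** Median of an already sorted, non-empty array. */
    public static double median(double[] sorted) {
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /** Cutoff = MEDIAN + deviations * MEDIAN_ABSOLUTE_DEVIATION. */
    public static double cutoff(int[] insertSizes, double deviations) {
        double[] values = Arrays.stream(insertSizes).asDoubleStream().sorted().toArray();
        double med = median(values);
        // Median absolute deviation: median of |x - median|.
        double[] absDev = Arrays.stream(values).map(v -> Math.abs(v - med)).sorted().toArray();
        return med + deviations * median(absDev);
    }

    /** Returns {mean, sample standard deviation} of the values at or below the cutoff. */
    public static double[] trimmedMeanAndSd(int[] insertSizes, double deviations) {
        double max = cutoff(insertSizes, deviations);
        double[] kept = Arrays.stream(insertSizes).filter(v -> v <= max).asDoubleStream().toArray();
        double mean = Arrays.stream(kept).average().orElse(Double.NaN);
        double sumSq = Arrays.stream(kept).map(v -> (v - mean) * (v - mean)).sum();
        double sd = kept.length > 1 ? Math.sqrt(sumSq / (kept.length - 1)) : 0.0;
        return new double[] {mean, sd};
    }

    public static void main(String[] args) {
        // One chimeric-looking outlier (5000) among otherwise tight insert sizes.
        int[] sizes = {290, 295, 300, 305, 310, 5000};
        double[] stats = trimmedMeanAndSd(sizes, 10.0);
        System.out.println("cutoff = " + cutoff(sizes, 10.0));
        System.out.println("mean = " + stats[0] + ", sd = " + stats[1]);
    }
}
```

With the outlier included, the untrimmed mean of these six values would be well over 1000; after trimming, the mean reflects only the five values near 300, which is the motivation the argument documentation gives.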
      • HISTOGRAM_WIDTH

        @Argument(shortName="W",
                  doc="Explicitly sets the histogram width, overriding automatic truncation of the histogram tail. Also, when calculating mean and standard deviation, only bins <= HISTOGRAM_WIDTH will be included.",
                  optional=true)
        public Integer HISTOGRAM_WIDTH
      • MIN_HISTOGRAM_WIDTH

        @Argument(shortName="MW",
                  doc="Minimum width of histogram plots. In the case when the histogram would otherwise be truncated to a shorter range of sizes, the MIN_HISTOGRAM_WIDTH will enforce a minimum range.",
                  optional=true)
        public Integer MIN_HISTOGRAM_WIDTH
      • MINIMUM_PCT

        @Argument(shortName="M",
                  doc="When generating the histogram, discard any data categories (out of FR, TANDEM, RF) that have fewer than this percentage of overall reads, expressed as a fraction (range: 0 to 1).")
        public float MINIMUM_PCT
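
A minimal sketch of the MINIMUM_PCT filter, assuming each orientation category is compared against the total read count across all categories. The enum, class, and method names are hypothetical stand-ins and do not come from Picard.

```java
import java.util.EnumMap;
import java.util.Map;

public class MinimumPctSketch {

    /** Hypothetical stand-in for the three pair-orientation categories. */
    public enum Orientation { FR, TANDEM, RF }

    /** Keeps only the categories whose share of all reads is at least minimumPct (0 to 1). */
    public static Map<Orientation, Long> keepCategories(Map<Orientation, Long> counts,
                                                        double minimumPct) {
        long total = counts.values().stream().mapToLong(Long::longValue).sum();
        Map<Orientation, Long> kept = new EnumMap<>(Orientation.class);
        for (Map.Entry<Orientation, Long> entry : counts.entrySet()) {
            // Discard categories with fewer than minimumPct of the overall reads.
            if (total > 0 && (double) entry.getValue() / total >= minimumPct) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<Orientation, Long> counts = new EnumMap<>(Orientation.class);
        counts.put(Orientation.FR, 970L);
        counts.put(Orientation.RF, 20L);
        counts.put(Orientation.TANDEM, 10L);
        System.out.println(keepCategories(counts, 0.05).keySet());
    }
}
```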
      • METRIC_ACCUMULATION_LEVEL

        @Argument(shortName="LEVEL",
                  doc="The level(s) at which to accumulate metrics.")
        public Set<MetricAccumulationLevel> METRIC_ACCUMULATION_LEVEL
      • INCLUDE_DUPLICATES

        @Argument(doc="If true, also include reads marked as duplicates in the insert size histogram.")
        public boolean INCLUDE_DUPLICATES
    • Constructor Detail

      • CollectInsertSizeMetrics

        public CollectInsertSizeMetrics()
    • Method Detail

      • customCommandLineValidation

        protected String[] customCommandLineValidation()
        Put any custom command-line validation in an override of this method. clp is initialized at this point and can be used to print usage and access argv. Any options set by the command-line parser can be validated.
        Overrides:
        customCommandLineValidation in class CommandLineProgram
        Returns:
        null if the command line is valid. If the command line is invalid, returns an array of error messages to be written to the appropriate place.
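
The return contract (null when valid, otherwise an array of messages) can be sketched in a standalone class. This does not extend the real CommandLineProgram, and the specific checks are assumptions for illustration: the 0 to 1 range comes from the MINIMUM_PCT documentation, while the DEVIATIONS check is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ValidationContractSketch {

    // Hypothetical stand-ins for the parsed argument fields.
    public float minimumPct;
    public double deviations;

    public ValidationContractSketch(float minimumPct, double deviations) {
        this.minimumPct = minimumPct;
        this.deviations = deviations;
    }

    /** Returns null if all arguments are valid, otherwise one message per problem. */
    public String[] customCommandLineValidation() {
        List<String> errors = new ArrayList<>();
        if (minimumPct < 0f || minimumPct > 1f) {
            errors.add("MINIMUM_PCT must be between 0 and 1, but was " + minimumPct);
        }
        if (deviations <= 0) { // hypothetical check, not taken from the source
            errors.add("DEVIATIONS must be positive, but was " + deviations);
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] errors = new ValidationContractSketch(1.5f, 10.0).customCommandLineValidation();
        if (errors != null) {
            for (String message : errors) {
                System.err.println(message);
            }
        }
    }
}
```

Collecting every problem before returning lets the caller report all invalid arguments in one pass instead of stopping at the first.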
      • setup

        protected void setup(htsjdk.samtools.SAMFileHeader header,
                             File samFile)
        Description copied from class: SinglePassSamProgram
        Should be implemented by subclasses to do one-time initialization work.
        Specified by:
        setup in class SinglePassSamProgram
      • acceptRead

        protected void acceptRead(htsjdk.samtools.SAMRecord record,
                                  htsjdk.samtools.reference.ReferenceSequence ref)
        Description copied from class: SinglePassSamProgram
        Should be implemented by subclasses to accept SAMRecords one at a time. If the read has a reference sequence and a reference sequence file was supplied to the program it will be passed as 'ref'. Otherwise 'ref' may be null.
        Specified by:
        acceptRead in class SinglePassSamProgram
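
The setup/acceptRead pattern can be sketched without htsjdk or Picard on the classpath. Everything below is a hypothetical stand-in: `Read` replaces SAMRecord, `Collector` replaces SinglePassSamProgram, and the use of the absolute insert size is an assumption. The sketch also shows how a flag like INCLUDE_DUPLICATES might gate which reads reach the histogram.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class SinglePassSketch {

    /** Hypothetical stand-in for a SAM record; only the two fields used here. */
    public static final class Read {
        final int insertSize;
        final boolean duplicate;

        public Read(int insertSize, boolean duplicate) {
            this.insertSize = insertSize;
            this.duplicate = duplicate;
        }
    }

    /** Hypothetical stand-in for the single-pass contract: setup once, then one call per read. */
    public abstract static class Collector {
        protected abstract void setup();

        protected abstract void acceptRead(Read read);

        public final void run(List<Read> reads) {
            setup();
            for (Read read : reads) {
                acceptRead(read);
            }
        }
    }

    /** Counts reads per absolute insert size, skipping duplicates unless asked to include them. */
    public static final class InsertSizeHistogram extends Collector {
        private final boolean includeDuplicates;
        private Map<Integer, Long> bins;

        public InsertSizeHistogram(boolean includeDuplicates) {
            this.includeDuplicates = includeDuplicates;
        }

        @Override
        protected void setup() {
            bins = new TreeMap<>(); // one-time initialization
        }

        @Override
        protected void acceptRead(Read read) {
            if (read.duplicate && !includeDuplicates) {
                return;
            }
            bins.merge(Math.abs(read.insertSize), 1L, Long::sum);
        }

        public Map<Integer, Long> bins() {
            return bins;
        }
    }

    public static void main(String[] args) {
        List<Read> reads = Arrays.asList(
                new Read(300, false), new Read(300, true), new Read(-310, false));
        InsertSizeHistogram histogram = new InsertSizeHistogram(false);
        histogram.run(reads);
        System.out.println(histogram.bins());
    }
}
```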