Class CbclReader

  • All Implemented Interfaces:
    htsjdk.samtools.util.CloseableIterator<CbclData>, Closeable, AutoCloseable, Iterator<CbclData>

    public class CbclReader
    extends BaseBclReader
    implements htsjdk.samtools.util.CloseableIterator<CbclData>
    ------------------------------------- CBCL Header -----------------------------------
    Bytes 0 - 1   Version number, current version is 1   unsigned 16-bit little-endian integer
    Bytes 2 - 5   Header size                            unsigned 32-bit little-endian integer
    Byte 6        Number of bits per basecall            unsigned
    Byte 7        Number of bits per q-score             unsigned
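
    The first eight header bytes can be decoded with a little-endian ByteBuffer. The following is a minimal sketch, not the code CbclReader itself uses; the class name CbclHeaderPrefix and its fields are hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical holder for the first eight header bytes; not part of the Picard API. */
final class CbclHeaderPrefix {
    final int version;          // bytes 0-1, unsigned 16-bit
    final long headerSize;      // bytes 2-5, unsigned 32-bit (kept in a long)
    final int bitsPerBasecall;  // byte 6, unsigned
    final int bitsPerQScore;    // byte 7, unsigned

    private CbclHeaderPrefix(int version, long headerSize, int bitsPerBasecall, int bitsPerQScore) {
        this.version = version;
        this.headerSize = headerSize;
        this.bitsPerBasecall = bitsPerBasecall;
        this.bitsPerQScore = bitsPerQScore;
    }

    /** Decodes the fixed-width fields from the start of {@code bytes}. */
    static CbclHeaderPrefix parse(byte[] bytes) {
        if (bytes.length < 8) {
            throw new IllegalArgumentException("Need at least 8 header bytes, got " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        int version = Short.toUnsignedInt(buf.getShort());      // bytes 0-1
        long headerSize = Integer.toUnsignedLong(buf.getInt()); // bytes 2-5
        int bitsPerBasecall = Byte.toUnsignedInt(buf.get());    // byte 6
        int bitsPerQScore = Byte.toUnsignedInt(buf.get());      // byte 7
        return new CbclHeaderPrefix(version, headerSize, bitsPerBasecall, bitsPerQScore);
    }
}
```

    Java has no unsigned primitive types, so the sketch widens each field (short to int, int to long) instead of reading it as a signed value.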

    q-val mapping info
    Bytes 0-3     Number of bins (B), zero indicates no mapping
    B pairs of 4-byte values (if B > 0): {from, to}, {from, to}, {from, to}
      from: quality score bin
      to:   quality score
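
    As an illustration of the q-val mapping layout, the sketch below reads the bin count and then B {from, to} pairs into a map. It assumes the 4-byte values are little-endian like the other header fields, which the description above does not state explicitly. QualityBinMapping is a hypothetical name, not a Picard class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for the q-val mapping section; not part of the Picard API. */
final class QualityBinMapping {

    /**
     * Reads the bin count followed by that many {from, to} pairs, starting at the
     * buffer's current position. Note that this switches the buffer to little-endian.
     * Assumption: every 4-byte value is little-endian.
     */
    static Map<Integer, Integer> parse(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int bins = buf.getInt();
        if (bins < 0) {
            throw new IllegalArgumentException("Bin count does not fit in a signed int");
        }
        Map<Integer, Integer> binToQuality = new LinkedHashMap<>();
        for (int i = 0; i < bins; i++) {
            int from = buf.getInt(); // quality score bin
            int to = buf.getInt();   // quality score
            binToQuality.put(from, to);
        }
        return binToQuality;         // empty when B == 0, i.e. no mapping
    }
}
```

    An empty map corresponds to B = 0, in which case the values stored in the file are used without remapping.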

    Number of tile records   unsigned 32-bit little-endian integer

    gzip virtual file offsets, one record per tile
    Bytes 0-3     Tile number
    Bytes 4-7     Number of clusters that were written into the current block (required due to bit-packed q-scores), unsigned 32-bit integer
    Bytes 8-11    Uncompressed block size of the tile data (useful for sanity check when excluding non-PF clusters), unsigned 32-bit integer
    Bytes 12-15   Compressed block size of the tile data, unsigned 32-bit integer
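
    Each tile record therefore occupies 16 bytes. A minimal sketch of reading one record follows. It assumes all four fields are unsigned 32-bit little-endian integers (the tile number's type is not spelled out above). TileRecord is a hypothetical name, not a Picard class.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical view of one 16-byte tile record; not part of the Picard API. */
final class TileRecord {
    static final int SIZE_BYTES = 16;

    final long tileNumber;        // bytes 0-3
    final long numClusters;       // bytes 4-7
    final long uncompressedSize;  // bytes 8-11
    final long compressedSize;    // bytes 12-15

    private TileRecord(long tileNumber, long numClusters, long uncompressedSize, long compressedSize) {
        this.tileNumber = tileNumber;
        this.numClusters = numClusters;
        this.uncompressedSize = uncompressedSize;
        this.compressedSize = compressedSize;
    }

    /** Reads one record at the buffer's current position and advances it by 16 bytes. */
    static TileRecord read(ByteBuffer buf) {
        if (buf.remaining() < SIZE_BYTES) {
            throw new IllegalArgumentException("Need 16 bytes, only " + buf.remaining() + " remain");
        }
        buf.order(ByteOrder.LITTLE_ENDIAN);
        // Assumption: each field is an unsigned 32-bit value, widened to long.
        long tileNumber = Integer.toUnsignedLong(buf.getInt());
        long numClusters = Integer.toUnsignedLong(buf.getInt());
        long uncompressedSize = Integer.toUnsignedLong(buf.getInt());
        long compressedSize = Integer.toUnsignedLong(buf.getInt());
        return new TileRecord(tileNumber, numClusters, uncompressedSize, compressedSize);
    }
}
```

    Calling read once per tile, for the count given by the "Number of tile records" field, walks the whole table.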

    non-PF clusters excluded flag
      1: non-PF clusters are excluded
      0: non-PF clusters are included

    ------------------------------------- CBCL File Content -----------------------------------

    N blocks of gzip files, where N is the number of tiles.

    Each block consists of C (basecall, quality score) pairs, where C is the number of clusters for the given tile.

    Each basecall, quality score pair has the following format (assuming 2 bits are used for the basecalls):
    Bits 0-1:        Basecalls (respectively [A, C, G, T] for [00, 01, 10, 11])
    Bits 2 and up:   Quality score (unsigned Q-bit little-endian integer, where Q is the number of bits per q-score)
    For a two-bit quality score, this is two clusters per byte, where the bottom 4 bits are the first cluster and the higher 4 bits are the second cluster.
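
    The packing for the 2-bit basecall, 2-bit quality case can be sketched as below. This is an illustration only, operating on an already decompressed block; it is not the decoding code used by CbclReader, and the class and method names are hypothetical. The quality values returned are the raw 2-bit values from the file, before any q-val mapping is applied.

```java
/** Hypothetical decoder for 4-bit clusters (2-bit basecall + 2-bit quality); not part of the Picard API. */
final class ClusterNibbleDecoder {
    private static final char[] BASES = {'A', 'C', 'G', 'T'}; // codes 00, 01, 10, 11

    /** Returns the 4-bit value of cluster i: low nibble for even i, high nibble for odd i. */
    private static int nibble(byte[] block, int i) {
        int b = block[i / 2] & 0xFF;
        return (i % 2 == 0) ? (b & 0x0F) : (b >>> 4);
    }

    private static void checkLength(byte[] block, int numClusters) {
        if (numClusters < 0 || (numClusters + 1) / 2 > block.length) {
            throw new IllegalArgumentException("Block too short for " + numClusters + " clusters");
        }
    }

    /** Decodes bits 0-1 of each cluster into a base character. */
    static char[] bases(byte[] block, int numClusters) {
        checkLength(block, numClusters);
        char[] out = new char[numClusters];
        for (int i = 0; i < numClusters; i++) {
            out[i] = BASES[nibble(block, i) & 0x3];
        }
        return out;
    }

    /** Decodes bits 2-3 of each cluster into its raw 2-bit quality value. */
    static int[] qualities(byte[] block, int numClusters) {
        checkLength(block, numClusters);
        int[] out = new int[numClusters];
        for (int i = 0; i < numClusters; i++) {
            out[i] = (nibble(block, i) >>> 2) & 0x3;
        }
        return out;
    }
}
```

    For example, the byte 0xE4 (binary 1110 0100) holds two clusters: the low nibble 0100 decodes to base A with quality value 1, and the high nibble 1110 decodes to base G with quality value 3.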