Class RangeCoder

java.lang.Object
htsjdk.samtools.cram.compression.range.RangeCoder

public class RangeCoder extends Object
Arithmetic range coder used by the CRAM 3.1 Range (adaptive arithmetic) codec and the FQZComp quality score codec. Implements both encoding and decoding using a 32-bit range, with carry propagation when generating output bytes.

The range coder maintains a probability interval [low, low+range) and narrows it for each symbol according to that symbol's cumulative frequency and its own frequency. Whenever the range drops below 2^24, the coder renormalizes: it shifts out the top byte and scales the range up by 8 bits, repeating until the range is at least 2^24 again.
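The narrowing and renormalization arithmetic described above can be sketched as follows. This is a minimal illustration, not the htsjdk implementation: the class name, method name, and constants are hypothetical, 32-bit unsigned values are held in long, and any carry out of the 32-bit low value is simply discarded, whereas the real coder propagates it into the output.

```java
final class RangeNarrowingSketch {
    static final long TOP = 1L << 24;          // renormalization threshold (2^24)
    static final long MASK32 = 0xFFFFFFFFL;    // keeps values within 32 bits

    /**
     * Narrows [low, low + range) to the sub-interval of one symbol, then renormalizes.
     * Assumes symFreq >= 1 and range >= totFreq, so the range never becomes zero.
     * Returns {newLow, newRange, bytesShiftedOut}. Carry out of bit 31 is ignored here.
     */
    static long[] narrow(long low, long range, int cumFreq, int symFreq, int totFreq) {
        long step = range / totFreq;             // width of one frequency unit
        low = (low + cumFreq * step) & MASK32;   // skip the symbols that come before this one
        range = symFreq * step;                  // keep only this symbol's share
        int shifted = 0;
        while (range < TOP) {                    // range too small: shift out the top byte
            low = (low << 8) & MASK32;
            range <<= 8;
            shifted++;
        }
        return new long[] {low, range, shifted};
    }
}
```

For example, starting from low = 0 and range = 0xFFFFFFFF, a symbol with cumulative frequency 1 and frequency 2 out of a total of 4 yields low = 0x3FFFFFFF and range = 0x7FFFFFFE with no renormalization.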

Encoding output is written to an internal byte[] buffer (set via setOutput(byte[], int)) rather than a ByteBuffer, which avoids ByteBuffer's bounds checking and position tracking overhead in the hot encoding loop.

See Also:
  • Constructor Summary

    Constructors
    Constructor
    Description
    RangeCoder()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getOutputPosition()
    Return the current write position in the output buffer.
    protected void
    rangeDecode(ByteBuffer inBuffer, int cumulativeFrequency, int symbolFrequency)
    Update the decoder state after a symbol has been decoded.
    void
    rangeDecodeStart(ByteBuffer inBuffer)
    Initialize the decoder by reading the first 5 bytes of the compressed stream into the code register.
    protected void
    rangeEncode(int cumulativeFrequency, int symbolFrequency, int totalFrequency)
    Encode a symbol by narrowing the range interval and emitting output bytes as needed.
    void
    rangeEncodeEnd()
    Flush the encoder state by emitting the final 5 bytes.
    protected int
    rangeGetFrequency(int totalFrequency)
    Compute the scaled frequency for symbol lookup during decoding.
    void
    setOutput(byte[] buf, int pos)
    Set the output buffer for encoding.

    Methods inherited from class Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • RangeCoder

      public RangeCoder()
  • Method Details

    • setOutput

      public void setOutput(byte[] buf, int pos)
      Set the output buffer for encoding. Must be called before any encode operations.
      Parameters:
      buf - the byte array to write compressed output to
      pos - the starting write position in the buffer
    • getOutputPosition

      public int getOutputPosition()
      Return the current write position in the output buffer. Call after encoding is complete to determine how many bytes were written.
    • rangeDecodeStart

      public void rangeDecodeStart(ByteBuffer inBuffer)
      Initialize the decoder by reading the first 5 bytes of the compressed stream into the code register. Must be called before any calls to ByteModel.modelDecode(ByteBuffer, RangeCoder).
      Parameters:
      inBuffer - the compressed input stream
    • rangeDecode

      protected void rangeDecode(ByteBuffer inBuffer, int cumulativeFrequency, int symbolFrequency)
      Update the decoder state after a symbol has been decoded.
      Parameters:
      inBuffer - the compressed input stream (for renormalization reads)
      cumulativeFrequency - cumulative frequency of symbols before the decoded symbol
      symbolFrequency - frequency of the decoded symbol
    • rangeGetFrequency

      protected int rangeGetFrequency(int totalFrequency)
      Compute the scaled frequency for symbol lookup during decoding.
      Parameters:
      totalFrequency - the sum of all symbol frequencies
      Returns:
      the scaled frequency value used to identify the decoded symbol
    • rangeEncode

      protected void rangeEncode(int cumulativeFrequency, int symbolFrequency, int totalFrequency)
      Encode a symbol by narrowing the range interval and emitting output bytes as needed. Output is written to the internal byte[] buffer (set via setOutput(byte[], int)).
      Parameters:
      cumulativeFrequency - cumulative frequency of all symbols before this one
      symbolFrequency - frequency of the symbol being encoded
      totalFrequency - sum of all symbol frequencies
    • rangeEncodeEnd

      public void rangeEncodeEnd()
      Flush the encoder state by emitting the final 5 bytes. Must be called after all symbols have been encoded to produce a valid compressed stream.
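
The call sequence documented above (encode each symbol, flush 5 bytes, then start decoding from the first 5 bytes, query the scaled frequency, and update the decoder) can be illustrated with a self-contained toy coder. This is a sketch under stated assumptions, not the htsjdk source: every name below is hypothetical, the 3-symbol model is fixed rather than adaptive, output goes to a ByteArrayOutputStream for brevity, and carries are handled with a cache-and-pending-byte scheme of the kind used by LZMA-style range coders, which may differ in detail from RangeCoder.

```java
import java.io.ByteArrayOutputStream;

final class ToyRangeCoderSketch {
    static final long TOP = 1L << 24, MASK32 = 0xFFFFFFFFL;
    static final int[] FREQ = {5, 2, 1};   // hypothetical fixed model
    static final int TOTAL = 8;            // sum of FREQ

    static final class Encoder {
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();
        private long low = 0, range = MASK32, pending = 1;  // bit 32 of low is an unresolved carry
        private int cache = 0;

        void encode(int cumFreq, int symFreq, int totFreq) {
            range /= totFreq;
            low += cumFreq * range;
            range *= symFreq;
            while (range < TOP) { range <<= 8; shiftLow(); }
        }

        byte[] finish() {                  // flush: emit the final 5 bytes
            for (int i = 0; i < 5; i++) shiftLow();
            return out.toByteArray();
        }

        private void shiftLow() {
            // Emit only once it is known whether a carry reaches the cached byte.
            if (low < 0xFF000000L || low > MASK32) {
                int carry = (int) (low >>> 32);
                int b = cache;
                do { out.write((b + carry) & 0xFF); b = 0xFF; } while (--pending != 0);
                cache = (int) ((low >>> 24) & 0xFF);
            }
            pending++;
            low = (low & 0x00FFFFFFL) << 8;
        }
    }

    static final class Decoder {
        private final byte[] in;
        private int pos = 0;
        private long range = MASK32, code = 0;

        Decoder(byte[] in) {               // start: read the first 5 bytes into the code register
            this.in = in;
            for (int i = 0; i < 5; i++) code = ((code << 8) | next()) & MASK32;
        }

        int getFrequency(int totFreq) { range /= totFreq; return (int) (code / range); }

        void decode(int cumFreq, int symFreq) {
            code -= cumFreq * range;
            range *= symFreq;
            while (range < TOP) { code = ((code << 8) | next()) & MASK32; range <<= 8; }
        }

        private int next() { return pos < in.length ? in[pos++] & 0xFF : 0; }
    }

    static byte[] encode(int[] symbols) {
        Encoder e = new Encoder();
        for (int s : symbols) {
            int cum = 0;
            for (int i = 0; i < s; i++) cum += FREQ[i];   // cumulative frequency before s
            e.encode(cum, FREQ[s], TOTAL);
        }
        return e.finish();
    }

    static int[] decode(byte[] data, int count) {
        Decoder d = new Decoder(data);
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            int f = d.getFrequency(TOTAL);
            int sym = 0, cum = 0;
            while (f >= cum + FREQ[sym]) cum += FREQ[sym++];  // find the symbol whose interval holds f
            d.decode(cum, FREQ[sym]);
            result[i] = sym;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] input = {0, 0, 1, 2, 0, 1, 0, 0, 2, 2, 1, 0};
        byte[] packed = encode(input);
        System.out.println(packed.length + " bytes, round trip ok: "
                + java.util.Arrays.equals(input, decode(packed, input.length)));
    }
}
```

In this sketch the decoder must be told how many symbols to read, because the stream itself carries no end marker; the toy model also stands in for the adaptive model that a caller such as ByteModel would supply.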