PYME.IO.PZFFormat module

Defines a ‘wire’ format for transmitting or saving image frame data, optionally with Huffman compression and/or square-root quantization. The combination of quantization and Huffman coding allows compression ratios of 6- to 10-fold on typical microscopy data. Compression and quantization both require the pyme-compress companion library, which provides optimized C code for these operations. If pyme-compress is compiled and installed on an AVX-capable processor, throughput in excess of 800 MB/s can be achieved.

Most users will just want the dumps() and loads() functions.

PYME.IO.PZFFormat.ChunkedHuffmanCompress(data, quantization=None)

PYME.IO.PZFFormat.ChunkedHuffmanCompress_o(data)

PYME.IO.PZFFormat.ChunkedHuffmanDecompress(datastring)

PYME.IO.PZFFormat.dumps(data, sequenceID=0, frameNum=0, frameTimestamp=0, compression=0, quantization=0, quantizationOffset=0, quantizationScale=1)

Dump an image frame (supplied as a numpy array) into a string in PZF format.

Parameters

data: ndarray: The frame as a 2D (or optionally 3D) numpy array
sequenceID: int: A unique identifier for the sequence to which this frame belongs. This lets us connect the frame with its metadata even if they end up in different directories, etc.
frameNum: int: The position of this frame within the sequence
frameTimestamp: float: A timestamp for the frame (if provided by the camera)
compression: int (enum): The compression method to use. One of PZFFormat.DATA_COMP_RAW, PZFFormat.DATA_COMP_HUFFCODE, or PZFFormat.DATA_COMP_HUFFCODE_CHUNKS, where raw stores the data with no compression, huffcode uses Huffman coding, and huffcode chunks breaks the data into chunks first, with each chunk being encoded by a separate thread.
quantization: int (enum): Whether or not the data is quantized before saving. One of DATA_QUANT_NONE or DATA_QUANT_SQRT. If DATA_QUANT_SQRT is selected, then the data is quantized as follows prior to compression:

\[data_{quant} = \frac{\sqrt{data - quantizationOffset}}{quantizationScale}\]
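As a worked illustration of the formula above, here is a minimal NumPy sketch that is independent of PYME and pyme-compress. The function names `quantize_sqrt` and `dequantize_sqrt` are hypothetical, and the rounding to the nearest integer, the clipping to an 8-bit range, and the inverse transform are assumptions made for the example rather than confirmed behaviour of the library.

```python
import numpy as np

def quantize_sqrt(data, offset=0.0, scale=1.0):
    """Apply data_quant = sqrt(data - offset) / scale, then round.

    Assumptions (not taken from the library): values below the offset are
    clipped to zero, and the result is rounded and stored as uint8.
    """
    shifted = np.clip(np.asarray(data, dtype='float64') - offset, 0, None)
    quantized = np.sqrt(shifted) / scale
    return np.clip(np.rint(quantized), 0, 255).astype('uint8')

def dequantize_sqrt(quantized, offset=0.0, scale=1.0):
    """Approximate inverse: data ~ (data_quant * scale)**2 + offset."""
    return (np.asarray(quantized, dtype='float64') * scale) ** 2 + offset

# Pixel values chosen so that (value - offset) is a perfect square,
# which makes the round trip exact in this particular case.
frame = np.array([[100, 104], [116, 200]], dtype='uint16')
q = quantize_sqrt(frame, offset=100, scale=1.0)
restored = dequantize_sqrt(q, offset=100, scale=1.0)

print(q)
print(restored)
```

In general the round trip is lossy: the quantization step grows with the square root of the signal, which roughly tracks photon shot noise when the offset and scale are matched to the camera.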

PYME.IO.PZFFormat.header_dtype_v3 = [('ID', 'S2'), ('Version', 'u1'), ('DataFormat', 'u1'), ('DataCompression', 'u1'), ('DataQuantization', 'u1'), ('DimOrder', 'S1'), ('RESERVED0', 'S1'), ('SequenceID', 'i8'), ('FrameNum', 'u4'), ('Width', 'u4'), ('Height', 'u4'), ('Depth', 'u4'), ('FrameTimestamp', 'u8'), ('QuantOffset', 'f4'), ('QuantScale', 'f4'), ('DataOffset', 'u4'), ('RESERVED1', 'S12')]

numpy dtype used to define the file header struct.

Most of the entries should be fairly self-explanatory, with the following deserving a bit more explanation:

ID: a 2-character string that we can test to see if the file type is consistent
Version: the version of this format the file uses
DataFormat: what the data type of individual pixels is
DataCompression: whether the data is compressed, and which algorithm is used
SequenceID: A unique identifier for the sequence to which this frame belongs. The most important property of this number is that it is unique to each sequence. A reasonable method of generation would be to use a Unix-format integer timestamp for the first dword, and a random integer for the second. A hash of the first n image pixels could also be used.
FrameNum: The position of this frame within the sequence
FrameTimestamp: Space to save camera derived frame timestamps, if available
Depth: As envisaged, the format is expected to contain individual 2D frames, with multiple frames being pulled together in a higher-level container to construct a sequence or stack. Depth is included just because it doesn’t take a significant amount of extra space, but gives us flexibility for the future.
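The header layout described above can be explored with plain NumPy, without importing PYME. The sketch below copies the field list from header_dtype_v3, fills in a header, and reads it back from raw bytes. The helper `make_sequence_id` is hypothetical, the choice to put the timestamp in the upper 32 bits is one possible reading of "first dword", the `ID` value is a placeholder, and `Version = 3` is inferred only from the dtype's name.

```python
import random
import time

import numpy as np

# Field list copied from header_dtype_v3 as documented above.
header_dtype_v3 = [('ID', 'S2'), ('Version', 'u1'), ('DataFormat', 'u1'),
                   ('DataCompression', 'u1'), ('DataQuantization', 'u1'),
                   ('DimOrder', 'S1'), ('RESERVED0', 'S1'),
                   ('SequenceID', 'i8'), ('FrameNum', 'u4'), ('Width', 'u4'),
                   ('Height', 'u4'), ('Depth', 'u4'), ('FrameTimestamp', 'u8'),
                   ('QuantOffset', 'f4'), ('QuantScale', 'f4'),
                   ('DataOffset', 'u4'), ('RESERVED1', 'S12')]
hdr_dtype = np.dtype(header_dtype_v3)  # packed struct, no padding

def make_sequence_id():
    """Hypothetical helper: timestamp in one dword, random bits in the other.

    Assumption: the timestamp goes in the upper 32 bits. A 32-bit Unix
    timestamp keeps the result within the signed 64-bit 'i8' field.
    """
    return (int(time.time()) << 32) | random.getrandbits(32)

header = np.zeros(1, dtype=hdr_dtype)
header['ID'] = b'XX'        # placeholder, not the library's real magic value
header['Version'] = 3       # assumption, inferred from the name header_dtype_v3
header['SequenceID'] = make_sequence_id()
header['FrameNum'] = 7
header['Width'] = 512
header['Height'] = 256
header['Depth'] = 1

raw = header.tobytes()                                  # serialise to bytes
parsed = np.frombuffer(raw, dtype=hdr_dtype, count=1)   # parse it back

print(hdr_dtype.itemsize, 'bytes per header')
print(parsed['FrameNum'][0], parsed['Width'][0], parsed['Height'][0])
```

Because the dtype is built without alignment padding, the field sizes simply add up, which is why the reserved fields can round the header out to a fixed length.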

PYME.IO.PZFFormat.load_header(datastring)

PYME.IO.PZFFormat.loads(datastring)

Loads image data from a string in PZF format.

Parameters

datastring: string / bytes: The encoded data

Returns

data: ndarray: The image data as a numpy array
header: recarray: The image header, as a numpy record array with the header_dtype dtype.
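A round trip through dumps() and loads() might look like the sketch below. It uses only the signatures and constants documented on this page, and sticks to raw storage with no quantization so that pyme-compress should not be needed. The helper name `roundtrip`, the choice of a uint16 test frame, and the guarded import (so the snippet still runs where PYME is not installed) are assumptions made for illustration. The shape and axis order of the returned array are not documented here, so the sketch prints them rather than assuming them.

```python
import numpy as np

try:
    from PYME.IO import PZFFormat
except ImportError:
    PZFFormat = None  # PYME is not installed; the round trip is skipped

def roundtrip(frame, sequence_id=0, frame_num=0):
    """Encode a frame with dumps() and decode it again with loads()."""
    encoded = PZFFormat.dumps(frame,
                              sequenceID=sequence_id,
                              frameNum=frame_num,
                              compression=PZFFormat.DATA_COMP_RAW,
                              quantization=PZFFormat.DATA_QUANT_NONE)
    decoded, header = PZFFormat.loads(encoded)
    return encoded, decoded, header

# Small synthetic frame; uint16 is assumed to be a supported pixel type.
frame = np.arange(12, dtype='uint16').reshape(3, 4)

if PZFFormat is not None:
    encoded, decoded, header = roundtrip(frame, sequence_id=1, frame_num=7)
    print(len(encoded), 'bytes encoded')
    print('decoded shape:', decoded.shape)
    print('frame number in header:', header['FrameNum'])
```

To enable compression, pass PZFFormat.DATA_COMP_HUFFCODE (or the chunked variant) and, optionally, PZFFormat.DATA_QUANT_SQRT together with quantizationOffset and quantizationScale; this requires pyme-compress to be installed.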