PYME.IO.PZFFormat module
Defines a ‘wire’ format for transmitting or saving image frame data, optionally with Huffman compression and/or sqrt quantization. The combination of quantization and Huffman coding allows compression ratios of 6-10 fold on typical microscopy data. Using compression or quantization requires the pyme-compress companion library, which provides optimized C code for performing the compression and quantization. If pyme-compress is compiled and installed on an AVX-capable processor, a throughput in excess of 800 MB/s can be achieved.
Most users will just want the dumps() and loads() functions.
- PYME.IO.PZFFormat.ChunkedHuffmanCompress(data, quantization=None)
- PYME.IO.PZFFormat.ChunkedHuffmanCompress_o(data)
- PYME.IO.PZFFormat.ChunkedHuffmanDecompress(datastring)
- PYME.IO.PZFFormat.dumps(data, sequenceID=0, frameNum=0, frameTimestamp=0, compression=0, quantization=0, quantizationOffset=0, quantizationScale=1)
Dump an image frame (supplied as a numpy array) into a string in PZF format.
- Parameters
- data: ndarray
The frame as a 2D (or optionally 3D) numpy array
- sequenceID: int
A unique identifier for the sequence to which this frame belongs. This lets us connect the frame with its metadata even if they end up in different directories, etc.
- frameNum: int
The position of this frame within the sequence
- frameTimestamp: float
A timestamp for the frame (if provided by the camera)
- compression: int (enum)
Compression method to use, one of: PZFFormat.DATA_COMP_RAW, PZFFormat.DATA_COMP_HUFFCODE, or PZFFormat.DATA_COMP_HUFFCODE_CHUNKS. Raw stores the data with no compression, huffcode uses Huffman coding, and huffcode chunks breaks the data into chunks first, with each chunk being encoded by a separate thread.
- quantization: int (enum)
Whether or not the data is quantized before saving. One of DATA_QUANT_NONE or DATA_QUANT_SQRT. If DATA_QUANT_SQRT is selected, then the data is quantized as follows prior to compression:
\[data_{quant} = \frac{\sqrt{data - quantizationOffset}}{quantizationScale}\]
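The square-root quantization above can be sketched in plain numpy. This is only an illustration of the formula, not the pyme-compress implementation: the function names `sqrt_quantize` and `sqrt_dequantize` are hypothetical, and the choices to clamp negative values at zero, round to the nearest integer, and store the result as 8-bit integers are assumptions made for this example.

```python
import numpy as np

def sqrt_quantize(data, offset=0.0, scale=1.0):
    """Apply data_quant = sqrt(data - offset) / scale (hypothetical helper)."""
    # Assumption: clamp at zero so pixels below the offset do not give NaNs.
    shifted = np.maximum(data.astype('f4') - offset, 0)
    quantized = np.sqrt(shifted) / scale
    # Assumption: round to nearest and store as unsigned 8-bit integers.
    return np.clip(np.round(quantized), 0, 255).astype('uint8')

def sqrt_dequantize(quantized, offset=0.0, scale=1.0):
    """Invert the formula: data ~ (data_quant * scale)**2 + offset."""
    return (quantized.astype('f4') * scale) ** 2 + offset

# A small made-up frame with a camera offset of 100 counts.
frame = np.array([[100, 104, 200], [1000, 5000, 20000]], dtype='uint16')
q = sqrt_quantize(frame, offset=100, scale=1.0)
restored = sqrt_dequantize(q, offset=100, scale=1.0)

# The reconstruction error grows roughly as sqrt(signal), i.e. it stays
# comparable to the shot noise when the data are in photo-electron units.
print(np.abs(restored - frame))
```

With rounding to the nearest integer, the quantized square root is off by at most half a quantization step, so the error in the restored value is bounded by roughly `scale * sqrt(data - offset)`.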
- PYME.IO.PZFFormat.header_dtype_v3 = [('ID', 'S2'), ('Version', 'u1'), ('DataFormat', 'u1'), ('DataCompression', 'u1'), ('DataQuantization', 'u1'), ('DimOrder', 'S1'), ('RESERVED0', 'S1'), ('SequenceID', 'i8'), ('FrameNum', 'u4'), ('Width', 'u4'), ('Height', 'u4'), ('Depth', 'u4'), ('FrameTimestamp', 'u8'), ('QuantOffset', 'f4'), ('QuantScale', 'f4'), ('DataOffset', 'u4'), ('RESERVED1', 'S12')]
numpy dtype used to define the file header struct.
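As a rough illustration of how such a dtype maps onto bytes, the sketch below builds a header record with numpy, prepends it to some raw pixel data, and parses it back. It does not import PYME. The field list is copied from header_dtype_v3, but the ID value `b'XX'`, the frame dimensions, and the assumption that the pixel data starts immediately after the header (so that DataOffset equals the header size) are placeholders chosen for this example, not values taken from the format.

```python
import numpy as np

# Field list copied from header_dtype_v3.
header_dtype_v3 = [('ID', 'S2'), ('Version', 'u1'), ('DataFormat', 'u1'),
                   ('DataCompression', 'u1'), ('DataQuantization', 'u1'),
                   ('DimOrder', 'S1'), ('RESERVED0', 'S1'),
                   ('SequenceID', 'i8'), ('FrameNum', 'u4'), ('Width', 'u4'),
                   ('Height', 'u4'), ('Depth', 'u4'), ('FrameTimestamp', 'u8'),
                   ('QuantOffset', 'f4'), ('QuantScale', 'f4'),
                   ('DataOffset', 'u4'), ('RESERVED1', 'S12')]
HEADER = np.dtype(header_dtype_v3)
print(HEADER.itemsize)  # packed size of the header struct in bytes

# Fill in a header for a made-up 256 x 128 frame of 16-bit pixels.
header = np.zeros(1, dtype=HEADER)
header['ID'] = b'XX'                    # placeholder, not the real magic ID
header['Width'] = 256
header['Height'] = 128
header['Depth'] = 1
header['FrameNum'] = 7
header['DataOffset'] = HEADER.itemsize  # assumption: data follows the header

pixels = np.arange(256 * 128, dtype='u2')
blob = header.tobytes() + pixels.tobytes()

# Parse the header back out of the byte string, then use DataOffset to
# locate the (here uncompressed) pixel data, kept flat for simplicity.
parsed = np.frombuffer(blob, dtype=HEADER, count=1)[0]
payload = np.frombuffer(blob, dtype='u2', offset=int(parsed['DataOffset']))
```

Because the dtype is built from a plain list of fields, numpy packs it without alignment padding, which is what makes it usable as a fixed-size header struct.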
Most of the entries should be fairly self-explanatory, with the following deserving a bit more explanation:
- ID
a 2-character string that we can test to see if the file type is consistent
- Version
the version of this format the file uses
- DataFormat
what the data type of individual pixels is
- DataCompression
whether the data is compressed, and which algorithm is used
- SequenceID
A unique identifier for the sequence to which this frame belongs. The most important property of this number is that it is unique to each sequence. A reasonable method of generation would be to use a unix-format integer timestamp for the first dword, and a random integer for the second. A hash of the first n image pixels could also be used.
- FrameNum
The position of this frame within the sequence
- FrameTimestamp
Space to save camera derived frame timestamps, if available
- Depth
As envisaged, the format is expected to contain individual 2D frames, with multiple frames being pulled together in a higher-level container to construct a sequence or stack. Depth is included because it doesn’t take a significant amount of extra space, but gives us flexibility for the future.
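The SequenceID scheme suggested above (a unix timestamp in one dword, a random integer in the other) could look something like the following. This is a minimal sketch using only the standard library: the function name `make_sequence_id` is hypothetical, and placing the timestamp in the high 32 bits is an assumption, since the description does not say which dword comes first.

```python
import random
import time

def make_sequence_id(timestamp=None, rng=random):
    """Combine a 32-bit unix timestamp and 32 random bits into a signed 64-bit int."""
    if timestamp is None:
        timestamp = int(time.time())
    high = timestamp & 0xFFFFFFFF   # assumption: timestamp in the high dword
    low = rng.getrandbits(32)       # random integer in the low dword
    value = (high << 32) | low
    # SequenceID is stored as 'i8', so wrap into the signed 64-bit range.
    if value >= 1 << 63:
        value -= 1 << 64
    return value

sequence_id = make_sequence_id()
print(sequence_id, sequence_id >> 32)  # full ID and the embedded timestamp
```

Two sequences started within the same second still get different IDs unless their 32 random bits collide, which is the uniqueness property the field needs.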
- PYME.IO.PZFFormat.load_header(datastring)
- PYME.IO.PZFFormat.loads(datastring)
Loads image data from a string in PZF format.
- Parameters
- datastring: string / bytes
The encoded data
- Returns
- data: ndarray
The image data as a numpy array
- header: recarray
The image header, as a numpy record array with the header_dtype dtype.