PYME Metadata

PYME metadata is stored as a series of key-value pairs. By convention, these pairs are hierarchical, with a high level category separated from a lower level parameter by a dot.

Note

The metatdata was formally stored within a hierarchical structure (and still is within HDF files), and can be accessed in a hierarchical fashion using some metadata handlers. We are slowly moving to a flatter internal representation, with the hierarchical nature being by convention only. You are nevertheless encouraged to follow this convention when defining new keys.

Programmatic interaction with metadata is accomplished using the classes in PYME.IO.MetaDataHandler module. A number of on disk storage formats are supported, as outlined below. The .md format has comments describing some of the key fields.

JSON metadata

This is the format recommended for any new usage. A .json file with the same base file name (the bit excluding the extension) as an image file will be automatically parsed and used as metadata. This is the most restrictive form of metadata (it only supports metadata values which can be easily converted to json, either by consisting of simple python types, or by implementing a to_JSON method, whereas older metadata formats support embedding arbitrary python objects as pickles).

An example .json metadata file is shown below:

{
"AcquiringUser": "David_Baddeley",
"Analysis.BGRange": [-50, 0],
"Analysis.DataFileID": 624268635,
"Analysis.DebounceRadius": 4,
"Analysis.DetectionThreshold": 0.7,
"Analysis.FitModule": "SplitterFitFNR",
"Analysis.PCTBackground": 0.0,
"Analysis.subtractBackground": true,
"Camera.ADOffset": 251.0,
"Camera.CycleTime": 0.051750000566244125,
"Camera.EMGain": 90,
"Camera.ElectronsPerCount": 4.99,
"Camera.IntegrationTime": 0.05000000074505806,
"Camera.Model": "DU897_BV",
"Camera.Name": "Andor IXon DV97",
"Camera.NoiseFactor": 1.41,
"Camera.ROIHeight": 511,
"Camera.ROIPosX": 1,
"Camera.ROIPosY": 1,
"Camera.ROIWidth": 511,
"Camera.ReadNoise": 88.1,
"Camera.SerialNumber": 7863,
"Camera.StartCCDTemp": -70,
"Camera.TrueEMGain": 35.50129002714877,
"EndTime": 1415229790.515,
"EstimatedLaserOnFrameNo": 201,
"NikonTi.FilterCube": "647LP",
"NikonTi.LightPath": "L100",
"Positioning.PIFoc": 166.1875,
"Positioning.Stage_X": 12.527,
"Positioning.Stage_Y": 13.0805,
"Protocol.BleachFrames": [61,200],
"Protocol.DarkFrameRange": [0,20],
"Protocol.DataStartsAt": 201,
"Protocol.Filename": "prebleach642.py",
"Protocol.PrebleachFrames": [21,58],
"Sample.Creator": "Kenny",
"Sample.Labelling": [["TOM20", "A647"],["alpha-tubulin","Cy3B"]],
"Sample.SlideRef": "5_11_2014_A",
"Splitter.Channel0ROI": [0,0,256,512],
"Splitter.Channel1ROI": [256,0,256,512],
"Splitter.Dichroic": "FF700-Di01",
"Splitter.Flip": false,
"Splitter.TransmittedPathPosition": "Left",
"StackSettings.EndPos": 176.0875,
"StackSettings.NumSlices": 100,
"StackSettings.ScanMode": "Middle and Number",
"StackSettings.ScanPiezo": "PIFoc",
"StackSettings.StartPos": 156.2875,
"StackSettings.StepSize": 0.2,
"StartTime": 1415229470.461,
"chroma.ShiftFilename": "G:\\Data\\David_Baddeley\\5_11_series_A.sf",
"chroma.dx": null,
"chroma.dy": null,
"imageID": 624268635,
"voxelsize.units": "um",
"voxelsize.x": 0.07,
"voxelsize.y": 0.07,
"voxelsize.z": 0.2
}

.md Metadata files

This is a slightly older metadata format, and is essentially a python script with a .md extension which is interpreted to fill a MetaDataHandler object. You can do a lot with it, but, as it is a python script, the normal security implications apply. Like the .json metadata files, PYME will search for a .md file with the same base name as an image file and use this as metadata.

an example .md file is shown below:

#PYME Simple Metadata v1
md['AcquiringUser'] = 'David_Baddeley'               # who sat at the microscope to take the image

# Protocol fields are written by the acquisition protocol and generally describe changes in laser powers etc ...
# They are pretty specialized to our acquisition software.
md['Protocol.DataStartsAt'] = 201   # frame where we expect good quality data to start. IE where to start localizing molecules
md['Protocol.PrebleachFrames'] = (21, 58)            # a range of frames taken at low laser power prior to bleaching (for generation of a widefield image)
md['Protocol.Filename'] = u'prebleach642.py'
md['Protocol.DarkFrameRange'] = (0, 20)
md['Protocol.BleachFrames'] = (61, 200)

# Parameters of an image splitting device (e.g. optosplit or custom built) used for ratiometric multi-colour or biplane imaging
md['Splitter.Channel0ROI'] = [0, 0, 256, 512]
md['Splitter.Dichroic'] = 'FF700-Di01'
md['Splitter.Channel1ROI'] = [256, 0, 256, 512]
md['Splitter.TransmittedPathPosition'] = 'Left'
md['Splitter.Flip'] = False

md['NikonTi.LightPath'] = 'L100'
md['NikonTi.FilterCube'] = u'647LP'

# information about a measured vectorial shift field used to correct for chromatic aberrations between channels
# note that specifying the filename should be sufficient (the other fields will then be automatically completed)
md['chroma.dx'] = pickle.loads('''ccopy_reg\n_reconstructor\np1\n(cscipy.interpolate.fitpack2\nSmoothBivariateSpline\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'fp'\np6\nF1746.7117511335171\nsS'degrees'\np7\n(I3\nI3\ntp8\nsS'tck'\np9\n(cnumpy.core.multiarray\n_reconstruct\np10\n(cnumpy\nndarray\np11\n(I0\ntS'b'\ntRp12\n(I1\n(I8\ntcnumpy\ndtype\np13\n(S'f8'\nI0\nI1\ntRp14\n(I3\nS'<'\nNNNI-1\nI-1\nI0\ntbI00\nS'\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@'\ntbg10\n(g11\n(I0\ntS'b'\ntRp15\n(I1\n(I8\ntg14\nI00\nS'%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@'\ntbg10\n(g11\n(I0\ntS'b'\ntRp16\n(I1\n(I16\ntg14\nI00\nS"\xa9\xafZ\xc2\xb75^@Z\xf5\x99]\x91*<@N\x12\xca|\xe1`\xff\xbfs\x15\x87a\xc2\x9bZ@\xa7K\x7f\xff(\xdaM@ \xa0\xe7\xa3\x1e\xe0C\xc0U\x89\xc0\x85.\x1dB\xc0\xb8:\x95\xb7\xd0\xc14@v\xca\x19\x06\xe1\x11J@\xb8';\xea\x1e\xd6G\xc0P\xa8e\x9d\xf6\xfaF\xc0\x93\x89\xa2\xd5\xe5.;@\xbf\xe6\xf2\xc2\x8f\xe0_@\x95\x054\x1a\xca\x9f=@\xf9(\x86\xba\xc93'@`\xcc0\xa6t\xafR@"\ntbtp17\nsb.''')
md['chroma.ShiftFilename'] = u'G:\\Data\\David_Baddeley\\5_11_series_A.sf'
md['chroma.dy'] = pickle.loads('''ccopy_reg\n_reconstructor\np1\n(cscipy.interpolate.fitpack2\nSmoothBivariateSpline\np2\nc__builtin__\nobject\np3\nNtRp4\n(dp5\nS'fp'\np6\nF661.02649656893755\nsS'degrees'\np7\n(I3\nI3\ntp8\nsS'tck'\np9\n(cnumpy.core.multiarray\n_reconstruct\np10\n(cnumpy\nndarray\np11\n(I0\ntS'b'\ntRp12\n(I1\n(I8\ntcnumpy\ndtype\np13\n(S'f8'\nI0\nI1\ntRp14\n(I3\nS'<'\nNNNI-1\nI-1\nI0\ntbI00\nS'\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@\x98k\xb0o\xac\xc80@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@!\x06\xbf\xe1\x13\xb7\xd0@'\ntbg10\n(g11\n(I0\ntS'b'\ntRp15\n(I1\n(I8\ntg14\nI00\nS'%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@%\xcc?\xd3\x8e\xdfs@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@\xea\x88\xaeu\xd4\xb8\xdf@'\ntbg10\n(g11\n(I0\ntS'b'\ntRp16\n(I1\n(I16\ntg14\nI00\nS'gU\x00\xe6?\xa5\xf0?\xe7~\xe4rn\xa76\xc0IH.d*\xa8A\xc03/\x9ap\xe5\x8aT\xc0\xa4\x82\'\x8e\x1bpA\xc0_{\xd7\xc8\xbdsF\xc0)+M\xe2\xfd\x012\xc0\xa8\x858\n\xcb\x07M\xc0\xb4L\xbd\x1dB\x80U\xc0>p\x12\xeb<\xedE\xc0\xae\x81\xf73\x0c\xd5A\xc0\x8b\xaf|\x84m\x1c&\xc0\xe0M\x1b\x98\xb0\xc4^\xc0R\xba\x9f"-\x9cO\xc0\xffD\x11\x952\x8a7\xc0\x11\xa6\xc5U\x8eL!@'\ntbtp17\nsb.''')

# these are the settings which would have been used if we were taking a stack. They are always recorded, whether or not
# a stack is acquired (i.e. this data will still be here for 2D images, but is not meaningful. Stack acquisitions
# will have ProtocolFocus events in their events data.
md['StackSettings.ScanMode'] = 'Middle and Number'
md['StackSettings.StepSize'] = 0.20000000000000001
md['StackSettings.EndPos'] = 176.08750000000001
md['StackSettings.NumSlices'] = 100
md['StackSettings.StartPos'] = 156.28749999999999
md['StackSettings.ScanPiezo'] = 'PIFoc'

# Where the various stages are, in their native units: TODO these should really all be in um
md['Positioning.Stage_X'] = 12.526999999999999
md['Positioning.Stage_Y'] = 13.080500000000001
md['Positioning.PIFoc'] = 166.1875

# settings for the analysis. Added to the raw data if doing real time streaming analysis, otherwise only present in
# the analysed data
md['Analysis.DetectionThreshold'] = 0.69999999999999996
md['Analysis.PCTBackground'] = 0.0
md['Analysis.DebounceRadius'] = 4
md['Analysis.DataFileID'] = 624268635
md['Analysis.BGRange'] = [-50, 0]
md['Analysis.FitModule'] = 'SplitterFitFNR'
md['Analysis.subtractBackground'] = True

# a unique ID ascociated with the RAW image suitable for use as e.g. a public key. This is generated either by taking
# a hash of the part of the first frame data, or by computing a starting timestamp along with a random hash.
md['imageID'] = 624268635

# metadata about the sample. This is written into an SQL database on acquisition. The database schema is significantly
# more structured (especially for the labelling) than the summary data written here, and also includes e.g. species
# and strain info.
md['Sample.Labelling'] = [(u'TOM20', u'A647'), (u'alpha-tubulin', u'Cy3B')]
md['Sample.SlideRef'] = u'5_11_2014_A'
md['Sample.Creator'] = u'Kenny'

# Information about the camera
md['Camera.ROIHeight'] = 511
md['Camera.ROIPosY'] = 1
md['Camera.ROIPosX'] = 1
md['Camera.CycleTime'] = 0.051750000566244125         # time between consecutive frames, in seconds
md['Camera.EMGain'] = 90                              # the EM Gain setting (gain register 8 bit voltage value)
md['Camera.ADOffset'] = 251.0                         # the AD offset (zero point) in counts / ADUs
md['Camera.TrueEMGain'] = 35.501290027148769          # the calibrated EM gain (this is the important one)
md['Camera.StartCCDTemp'] = -70
md['Camera.SerialNumber'] = 7863
md['Camera.ROIWidth'] = 511
md['Camera.ElectronsPerCount'] = 4.9900000000000002
md['Camera.Model'] = 'DU897_BV'
md['Camera.ReadNoise'] = 88.099999999999994           # the read noise in counts / ADUs
md['Camera.NoiseFactor'] = 1.4099999999999999         # the excess noise factor for the multiplcation step 1.4 for EMCCD, 1.0 for sCMOS
md['Camera.IntegrationTime'] = 0.05000000074505806
md['Camera.Name'] = 'Andor IXon DV97'

md['StartTime'] = 1415229470.461          # the start time as a unix style timestamp

md['EstimatedLaserOnFrameNo'] = 201       # synonymous with and redundant to 'Protocol.DataStartsAt'. Added automatically during analysis.

md['EndTime'] = 1415229790.5150001        # the end time as a unix style timestamp

md['voxelsize.units'] = 'um'           # NB voxelsizes should ALWAYS be in um. The units specified here are ignored
md['voxelsize.x'] = 0.070000000000000007
md['voxelsize.z'] = 0.20000000000000001
md['voxelsize.y'] = 0.070000000000000007

XML metadata

We also support an XML version of the metadata. This is principally used when embedding PYME metadata (as a structured annotation) into OME metadata when producing OME TIFFs. It can also be used in a standalone fashion like the .json and .md formats, but this usage is discouraged.

HDF metadata

This is the original hierarchical on disk format and uses HDF groups for the category labels, and pytables attributes to those groups as the individual keys. This is likely to be poorly supported when using any library other than pytables to access the HDF file, although does seem to be readable using the standard h5ls and h5dump tools (see example below) with the exception of a few fields which are stored as python pickles:

DB3:~ david$ h5ls -rv /Users/david/PYMEData/david/2016_11_30/30_11_series_A.h5/MetaData
Opened "/Users/david/PYMEData/david/2016_11_30/30_11_series_A.h5" with sec2 driver.
/Camera                  Group
    Attribute: ADOffset scalar
        Type:      native double
        Data:  967
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: CycleTime scalar
        Type:      native double
        Data:  0.1
    Attribute: EMGain scalar
        Type:      native long
        Data:  0
    Attribute: ElectronsPerCount scalar
        Type:      native double
        Data:  27.32
    Attribute: IntegrationTime scalar
        Type:      native double
        Data:  0.1
    Attribute: Name scalar
        Type:      23-byte null-terminated ASCII string
        Data:  "Simulated EM CCD Camera"
    Attribute: NoiseFactor scalar
        Type:      native double
        Data:  1.41
    Attribute: ROI scalar
        Type:      23-byte null-terminated ASCII string
        Data:  "(I0\nI0\nI1024\nI256\ntp1\n."
    Attribute: ROIHeight scalar
        Type:      native long
        Data:  256
    Attribute: ROIPosX scalar
        Type:      native long
        Data:  0
    Attribute: ROIPosY scalar
        Type:      native long
        Data:  0
    Attribute: ROIWidth scalar
        Type:      native long
        Data:  1024
    Attribute: ReadNoise scalar
        Type:      native double
        Data:  109.8
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:6528
    Links:     1
/Lasers                  Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:8264
    Links:     1
/Lasers/l405             Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: MaxPower scalar
        Type:      native double
        Data:  1000
    Attribute: On scalar
        Type:      native 8-bit field
        Data:  0x00
    Attribute: Power scalar
        Type:      native long
        Data:  10
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:9136
    Links:     1
/Lasers/l488             Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: MaxPower scalar
        Type:      native double
        Data:  1000
    Attribute: On scalar
        Type:      native 8-bit field
        Data:  0x00
    Attribute: Power scalar
        Type:      native long
        Data:  10
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:10520
    Links:     1
/Positioning             Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: Fake_x_piezo scalar
        Type:      native double
        Data:  5
    Attribute: Fake_y_piezo scalar
        Type:      native double
        Data:  5
    Attribute: Fake_z_piezo scalar
        Type:      native double
        Data:  50
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Attribute: x scalar
        Type:      native double
        Data:  5
    Attribute: y scalar
        Type:      native double
        Data:  5
    Attribute: z scalar
        Type:      native double
        Data:  50
    Location:  1:5248
    Links:     1
/Protocol                Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: DarkFrameRange scalar
        Type:      13-byte null-terminated ASCII string
        Data:  "(I0\nI20\ntp1\n."
    Attribute: DataStartsAt scalar
        Type:      native long
        Data:  21
    Attribute: Filename scalar
        Type:      11-byte null-terminated UTF-8 string
        Data:  "simul488.py"
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:80288
    Links:     1
/StackSettings           Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: EndPos scalar
        Type:      native double
        Data:  59.9
    Attribute: NumSlices scalar
        Type:      native long
        Data:  100
    Attribute: ScanMode scalar
        Type:      17-byte null-terminated ASCII string
        Data:  "Middle and Number"
    Attribute: ScanPiezo scalar
        Type:      1-byte null-terminated ASCII string
        Data:  "z"
    Attribute: StartPos scalar
        Type:      native double
        Data:  40.1
    Attribute: StepSize scalar
        Type:      native double
        Data:  0.2
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Location:  1:3648
    Links:     1
/voxelsize               Group
    Attribute: CLASS scalar
        Type:      5-byte null-terminated ASCII string
        Data:  "GROUP"
    Attribute: TITLE scalar
        Type:      1-byte null-terminated ASCII string
        Data:  ""
    Attribute: VERSION scalar
        Type:      3-byte null-terminated ASCII string
        Data:  "1.0"
    Attribute: units scalar
        Type:      2-byte null-terminated ASCII string
        Data:  "um"
    Attribute: x scalar
        Type:      native double
        Data:  0.07
    Attribute: y scalar
        Type:      native double
        Data:  0.07
    Attribute: z scalar
        Type:      native double
        Data:  0.2
    Location:  1:11576
    Links:     1

We are considering replacing this with a somewhat more portable and simple implementation to make it easier to read in 3rd party clients.

Read-only support of 3rd party metadata

We can read limited metadata (e.g. voxelsize) from OME TIFFS, Zeiss .lsm, and anything we open using bioformats