[ome-devel] Prairie View XML changes for upcoming v5.2 Release

Jason Swedlow j.r.swedlow at dundee.ac.uk
Tue Jun 10 10:58:40 BST 2014


Hi Mike

Thanks for this info.  It's great to have this spec and the example data.  We'll look at this and let you know if we have any questions or issues.

I don't know if we can get this into our upcoming release, as we already have a lot of work scheduled for that.  We'll definitely cc you on the relevant tickets so you can follow our progress.

Thanks again for this contribution.  We welcome all info and sample data from the commercial imaging community.  Thanks to all at Bruker for their support of Bio-Formats.

Cheers,

Jason

Centre for Gene Regulation & Expression | Open Microscopy Environment | University of Dundee

On 9 Jun 2014 15:54, Mike Wussow <Mike.Wussow at bruker-nano.com> wrote:
Dear Bio-formats Community,

Prairie View version 5.2 will be released the first week in July 2014.  With this release we have found it necessary to make changes to our XML files.  We realize that these changes will likely break the Bio-Formats reader for our files, but they were necessary and should serve our customers well moving forward.  Therefore we are providing information on the changes and example data sets in advance of our release, in the hope that this community can update the Bio-Formats reader to accommodate them.

A summary of the changes that have occurred to our XML files can be found here: https://www.dropbox.com/s/sf2e634tpuv6ffh/Prairie%20View%20XML%20Evolution.docx as well as in the text below.

Example data sets can be downloaded here: https://www.dropbox.com/sh/rtd73wcu4oboqt4/AABKM1rdhYLWjFVrw-YA5__Sa

Michael C. Wussow
Director Product Line Management

Bruker Nano Surfaces Division    mike.wussow at bruker-nano.com
3030 Laura Lane, Suite 140           http://www.bruker.com/nano
Middleton, WI 53562-0677
Phone: +1 608-662-0022 x167
Cell: +1 608-381-8252
________________________________


Prairie View XML Evolution
Version 4.x and Earlier
This is the format most Prairie View users are accustomed to seeing, and the one most third-party applications are able to handle.  The most common problem customers run into when parsing this format is treating the XML file as plain text (searching for certain strings or assuming a fixed order), when it should be read with whatever XML parser is available in the programming language being used.
Another lesser-known issue with this format is that while the first line in the file says it is XML version 1.0, the file actually contains some characters that aren’t valid for version 1.0 and instead require version 1.1.  Some XML parsing libraries (for example, MATLAB’s) are picky about this distinction and refuse to read the XML files unless the version at the top of the file is changed from 1.0 to 1.1.  However, Microsoft/.NET doesn’t support version 1.1, so if the file is changed, Prairie View can no longer read the file.  Removal of those characters is beyond the scope of Bruker’s current technical development projects.
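One way a third-party reader might cope with this mismatch, without editing the file on disk (so Prairie View can still open it), is to clean the text in memory before handing it to an XML 1.0 parser.  The Python sketch below is only an illustration: the function name parse_prairie_xml is hypothetical, and it assumes the offending characters appear either as raw characters or as numeric character references, which this document does not specify.  Dropping them discards whatever they encoded.

```python
import re
import xml.etree.ElementTree as ET

# Raw characters outside the XML 1.0 "Char" production.
_INVALID_RAW = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)
# Numeric character references such as &#x1; or &#1;
_CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def _is_valid_xml10_char(cp):
    """True if code point cp is allowed in an XML 1.0 document."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def _drop_invalid_ref(match):
    """Keep a character reference only if it names a valid XML 1.0 character."""
    body = match.group(1)
    cp = int(body[1:], 16) if body[0] == "x" else int(body)
    return match.group(0) if _is_valid_xml10_char(cp) else ""


def parse_prairie_xml(text):
    """Parse already-decoded XML text after removing characters XML 1.0 forbids."""
    cleaned = _INVALID_RAW.sub("", text)
    cleaned = _CHAR_REF.sub(_drop_invalid_ref, cleaned)
    return ET.fromstring(cleaned)
```

The function takes decoded text, so a caller would read the file with the encoding named on its first line and pass the resulting string in; the original file is never modified.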
Version 5.0
In addition to the traditional XML file format, Prairie View version 5.0 offers the option to ‘Use Smaller XML File Format’; the user can toggle this option under the ‘Preferences’ menu in the Prairie View software.
The smaller XML file format eliminates the duplicate state key/value pairs by introducing a hierarchical structure where only what has changed is written out.  This saves tens to hundreds of megabytes for larger data sets.  However, most third-party applications designed to read the larger XML files will not read the smaller format because they don’t know where to find data that can exist in any one of three places.  The differences are illustrated in the following example.
In earlier versions each data set XML file looked something like the following:
<?xml version="1.0" encoding="utf-8"?>
<PVScan version="4.3.2.24" date="3/27/2013 5:01:57 PM" … >
    …
    <Sequence … >
        <Frame … >
            <PVStateShard>
                …All state key/value pairs are listed here, for every frame, even if they don’t change…
            </PVStateShard>
        </Frame>
        …There could be more frames here…
    </Sequence>
    …There could be more sequences here…
</PVScan>
XML files using the smaller format look something like the following:
<?xml version="1.0" encoding="utf-8"?>
<PVScan version="5.0.64.100" date="4/15/2014 9:01:26 AM" … >
    …
    <PVStateShard>
        …All state key/value pairs are listed here,
           this can be thought of as the grandparent state…
    </PVStateShard>
    …
    <Sequence … >
        …
        <PVStateShard>
            …Only state key/value pairs which differ from the grandparent state are listed here,
               this can be thought of as the parent state…
        </PVStateShard>
        …
        <Frame … >
            <PVStateShard>
                …Only state key/value pairs which differ from the parent state are listed here,
                   this can be thought of as the child state…
            </PVStateShard>
        </Frame>
        …There could be more frames here…
    </Sequence>
    …There could be more sequences here…
</PVScan>
It is important to note that any XML parser capable of handling the smaller XML format can still read the old format.  The old, larger format is like a special case of the smaller XML file where the child state contains every state key/value pair and it is never necessary to look at the parent or grandparent state.  It is still a good practice to check if those parent and grandparent nodes exist prior to trying to access them.
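The lookup order described above (child, then parent, then grandparent) can be sketched in Python as a layered merge.  This is a minimal illustration rather than a reference implementation: the function names read_shard and frame_states are hypothetical, and it assumes the pre-5.2 <Key key="…" value="…" /> representation of state entries shown later in this document.

```python
import xml.etree.ElementTree as ET


def read_shard(node):
    """Return {key: value} from the PVStateShard directly under node, if any."""
    if node is None:
        return {}
    shard = node.find("PVStateShard")
    if shard is None:
        return {}
    return {k.get("key"): k.get("value") for k in shard.findall("Key")}


def frame_states(root):
    """Yield (sequence_number, frame_number, state) with inherited values resolved."""
    grandparent = read_shard(root)  # state stored directly under <PVScan>
    for s, sequence in enumerate(root.findall("Sequence")):
        # Sequence-level entries override the file-level entries.
        parent = {**grandparent, **read_shard(sequence)}
        for f, frame in enumerate(sequence.findall("Frame")):
            # Frame-level entries override everything above them.
            yield s, f, {**parent, **read_shard(frame)}
```

Because a missing shard simply contributes an empty dictionary, the same code reads the older, larger files, where every key lives in the frame-level shard.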
Version 5.1
There were no significant changes in the XML structure between version 5.0 and 5.1. The option for the user to select either the original or the smaller XML format exists in 5.1 as it did in 5.0.
In Prairie View 5.1, OME TIFF support was updated and extended to allow Fiji/ImageJ to import our datasets using just the TIFF files.  Almost every TIFF file written in version 5.1 is an OME TIFF, as noted by the *.ome.tif file extension.  In version 5.0, and earlier versions supporting OME TIFF, adding OME TIFF metadata to the TIFFs was done in post processing and did not change the file extension; that OME TIFF implementation was rudimentary and designed with a single case in mind.  The changes in 5.1 fully implement the OME TIFF format.
A data set can be imported into Fiji/ImageJ using the OME TIFF metadata by selecting the first *.ome.tif file in the folder, instead of the Prairie View data set XML file that the Bio-Formats import tool would normally use.  The first *.ome.tif file is a little larger than the rest and contains information about all the other related TIFF files.
Version 5.2
This version takes the smaller XML format introduced in version 5.0 and makes it the default format, with no option to go back to writing out the larger data set XML files.  In addition, the format of the state key/value pairs has changed slightly to incorporate indices and sub-indices where applicable.
For example, here is what some state key/value pairs looked like prior to version 5.2:
<Key key="linesPerFrame" permissions="Read, Write, Save" value="186" />
<Key key="pmtGain_0" permissions="Write, Save" value="605" />
<Key key="pmtGain_1" permissions="Write, Save" value="604" />
<Key key="pmtGain_2" permissions="Write, Save" value="0" />
<Key key="positionCurrent_XAxis" permissions="Write, Save" value="0.95" />
<Key key="positionCurrent_YAxis" permissions="Write, Save" value="-4.45" />
<Key key="positionCurrent_ZAxis" permissions="Write, Save" value="-9,62.45" />


Here is what those same state key/value pairs look like in version 5.2:
<PVStateValue key="linesPerFrame" value="186" />
<PVStateValue key="pmtGain">
    <IndexedValue index="0" value="605" description="Ch1 High Voltage" />
    <IndexedValue index="1" value="604" description="Ch2 High Voltage" />
    <IndexedValue index="2" value="0" description="Ch3 High Voltage" />
</PVStateValue>
<PVStateValue key="positionCurrent">
    <SubindexedValues index="XAxis">
        <SubindexedValue subindex="0" value="0.95" />
    </SubindexedValues>
    <SubindexedValues index="YAxis">
        <SubindexedValue subindex="0" value="-4.45" />
    </SubindexedValues>
    <SubindexedValues index="ZAxis">
        <SubindexedValue subindex="0" value="-9" description="Focus" />
        <SubindexedValue subindex="1" value="62.45" description="Piezo" />
    </SubindexedValues>
</PVStateValue>
Notice that in addition to the slight formatting changes, description fields have been enhanced to include some human-recognizable identifiers for key/value pairs which would otherwise only have a programmatically generated index associated with them.  Since this new description field has just been introduced, not all key/value pairs have a description yet; more will be added over time.
It is also worth noting that keys, indices, and sub-indices in version 5.2 are written in alphabetical/numerical order, whereas the order in older versions could vary based on a number of factors.
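A reader for the 5.2 layout could turn each <PVStateValue> into a plain value or a nested mapping, as in the following Python sketch.  The function names read_state_value and read_shard_52 are hypothetical, and the sketch assumes the three shapes in the example above (plain value, IndexedValue children, SubindexedValues groups) are the only ones that occur.  All values are kept as strings, and description attributes are ignored.

```python
import xml.etree.ElementTree as ET


def read_state_value(node):
    """Convert one <PVStateValue> element into a string or nested dict."""
    indexed = node.findall("IndexedValue")
    if indexed:
        # e.g. pmtGain -> {"0": "605", "1": "604", "2": "0"}
        return {iv.get("index"): iv.get("value") for iv in indexed}
    groups = node.findall("SubindexedValues")
    if groups:
        # e.g. positionCurrent -> {"ZAxis": {"0": "-9", "1": "62.45"}, ...}
        return {
            group.get("index"): {
                sv.get("subindex"): sv.get("value")
                for sv in group.findall("SubindexedValue")
            }
            for group in groups
        }
    # Plain key/value pair, e.g. linesPerFrame.
    return node.get("value")


def read_shard_52(shard):
    """Return {key: value-or-dict} for every PVStateValue in a PVStateShard."""
    return {
        sv.get("key"): read_state_value(sv)
        for sv in shard.findall("PVStateValue")
    }
```

Looking values up by key, index, and sub-index in this way avoids depending on the order in which they are written.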
Also new to version 5.2 is XML metadata related to Spectral Mode in Prairie View.  If a sequence was collected in Spectral Mode, the Sequence tag has the SpectralMode attribute set to “True”.
<Sequence SpectralMode="True" …
The rest of the metadata format is consistent with a non-spectral dataset except that instead of having a maximum of four image channels, there can be 16 channels, with each channel corresponding to a subset of the entire emission spectrum.  Each channel is denoted by the <File> tag and has attributes for channel name, channel number, timestamp and filename.  Metadata for a spectral frame with 16 spectral channels is formatted as follows:
<Frame index="1" parameterSet="CurrentSettings" absoluteTime="0.374000000000024" relativeTime="0">
<File absoluteTime="0.374" relativeTime="0" filename="TSeries-06052014-1318-003_Cycle00001_Ch1_000001.ome.tif" wavelengthMax="531" wavelengthMin="526" channelName="SpectralChannel_01" channel="1"/>
…
<File absoluteTime="0.374" relativeTime="0" filename="TSeries-06052014-1318-003_Cycle00001_Ch16_000001.ome.tif" wavelengthMax="828" wavelengthMin="808" channelName="SpectralChannel_16" channel="16"/>
<ExtraParameters lastGoodFrame="0"/>
<PVStateShard/>
</Frame>
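The per-channel attributes in the frame above could be collected with a few lines of Python, as in this sketch.  The function names is_spectral and frame_channels and the output dictionary keys are hypothetical.  The sketch assumes the wavelength attributes may be absent on non-spectral <File> entries, and it makes no assumption about their units, which this document does not state.

```python
import xml.etree.ElementTree as ET


def _opt_float(text):
    """Convert an attribute to float, passing through a missing attribute."""
    return None if text is None else float(text)


def is_spectral(sequence):
    """True if the <Sequence> element was collected in Spectral Mode."""
    return sequence.get("SpectralMode") == "True"


def frame_channels(frame):
    """Return one dict per <File> child of a <Frame>, sorted by channel number."""
    channels = [
        {
            "channel": int(f.get("channel")),
            "name": f.get("channelName"),
            "filename": f.get("filename"),
            "wavelength_min": _opt_float(f.get("wavelengthMin")),
            "wavelength_max": _opt_float(f.get("wavelengthMax")),
        }
        for f in frame.findall("File")
    ]
    return sorted(channels, key=lambda c: c["channel"])
```

Sorting numerically on the channel attribute matters once there are more than nine channels, since a plain string sort would place "16" before "2".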


The University of Dundee is a registered Scottish Charity, No: SC015096