[ome-devel] Super-Resolution standard format

Wed Sep 30 20:03:43 BST 2015

Hi Alex

Thanks for your very valuable thoughts. I think it would be great if our final format incorporated all your ideas. Our thinking was to start with a very basic format and, if it looks workable, to build on it. More comments inline.

On 25 September 2015 at 15:42, Alex Herbert <a.herbert at sussex.ac.uk> wrote:
Hi Simon,

I think that the minimum requirements for a standard format would be
storing the intensity of a localisation at position TXYZ.

I think everyone's agreed on this point!

Intensity and TXYZ would require units. A factor should be provided to
convert the given units to standard (e.g. seconds, nm, photons). This
would allow the format to record using units of choice for the software.

That makes sense. However, for our first version how about we stick to one unit for simplicity. nm? pixels? Hopefully the original image contains enough metadata to convert between units.

It would be good to allow for storing the bounds of the data. This is
relevant to rendering images, density analysis and other data
post-processing.

I think most of us assumed the localisation data would be stored alongside the source images which would provide some (most?) of this information.

Linking localisations together (i.e. marking the same molecule in
different frames) would require an ID field. These requirements mean
the file needs both a header and records. It could look
something like this:

Header:

Dimension,Min,Max,Unit,Conversion
ID,1,59,Count,2.34
T,1,1000,Frames,0.25
X,0,64,Pixels,107
Y,0,64,Pixels,107
Z,-500,500,nm,1
Intensity,10.987,1234,ADUs,0.025

This header corresponds to an image with 59 molecules, with an average
of 2.34 localisations per molecule (about 138 localisations in total). The
image was taken for 1000 frames at a frame interval of 250 milliseconds,
with 107nm pixel pitch on a 64x64 pixel camera, 500nm depth of field for
3D fitting, with a gain of 40.
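
As a rough illustration of how such a header might be consumed, the following Python sketch reads the example table and scales one record into the suggested standard units (seconds, nm, photons). The function names read_header and to_standard_units and the sample record are hypothetical, not part of any agreed format. The ID column is left unscaled because, in the example above, its Conversion value carries the average number of localisations per molecule rather than a unit factor.

```python
import csv
import io

# The example header from the message above, reproduced verbatim.
HEADER = """\
Dimension,Min,Max,Unit,Conversion
ID,1,59,Count,2.34
T,1,1000,Frames,0.25
X,0,64,Pixels,107
Y,0,64,Pixels,107
Z,-500,500,nm,1
Intensity,10.987,1234,ADUs,0.025
"""


def read_header(text):
    """Map each dimension name to its bounds, unit and conversion factor."""
    dims = {}
    for row in csv.DictReader(io.StringIO(text)):
        dims[row["Dimension"]] = {
            "min": float(row["Min"]),
            "max": float(row["Max"]),
            "unit": row["Unit"],
            "conversion": float(row["Conversion"]),
        }
    return dims


def to_standard_units(record, dims):
    """Scale T, X, Y, Z and Intensity by their conversion factors.

    ID is copied unchanged: it is an identifier, not a measurement.
    """
    out = {"ID": record["ID"]}
    for name in ("T", "X", "Y", "Z", "Intensity"):
        out[name] = record[name] * dims[name]["conversion"]
    return out


dims = read_header(HEADER)
# Hypothetical record: molecule 3, frame 40, position in pixels, Z in nm.
record = {"ID": 3, "T": 40, "X": 12.5, "Y": 30.0, "Z": -120.0, "Intensity": 800.0}
standard = to_standard_units(record, dims)
print(standard)
```

With these factors, frame 40 maps to 10 seconds, X = 12.5 pixels maps to 1337.5 nm, and 800 ADUs map to roughly 20 photons.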

A particle ID field also makes sense. Could you (and other people on this list) go into a bit more detail about how this additional summary information would be used? Is there a significant advantage to storing it as opposed to calculating it as necessary?

If tracing of localisations into molecules has not been performed then
the IDs would be sequential from 1 and the Conversion for the ID set to
1. Note that others may have a different take on the use of an ID
column. For example it could be used to record the unique ID of a
fitting candidate. In this case it would not be sequential as candidates
that were rejected will be omitted from the results.

Sounds reasonable.

Also note that results may be written directly by a parallel processing
fitting routine. In such a case they may not necessarily be ordered by
ID or by time T, and the bounds for some dimensions (ID, T, Z,
Intensity) may not be available when the header is written. In this case
the bounds can be optional. The bounds for XY should be available given
the image from the camera is a fixed size, however it is more flexible
if these are optional too.
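
If the bounds are left out of the header, a reader could recover them in a single pass over the records, regardless of the order in which they were written. The sketch below assumes records are held as dictionaries keyed by the column names from this proposal; compute_bounds is a hypothetical helper, not an existing API.

```python
import math

FIELDS = ("ID", "T", "X", "Y", "Z", "Intensity")


def compute_bounds(records, fields=FIELDS):
    """Return {field: (min, max)} from one pass over unordered records.

    For an empty input every field is reported as (inf, -inf).
    """
    lo = {name: math.inf for name in fields}
    hi = {name: -math.inf for name in fields}
    for rec in records:
        for name in fields:
            value = rec[name]
            if value < lo[name]:
                lo[name] = value
            if value > hi[name]:
                hi[name] = value
    return {name: (lo[name], hi[name]) for name in fields}


# Hypothetical records, deliberately not sorted by ID or T.
records = [
    {"ID": 7, "T": 12, "X": 3.2, "Y": 60.1, "Z": 40.0, "Intensity": 250.0},
    {"ID": 2, "T": 900, "X": 55.0, "Y": 1.5, "Z": -310.0, "Intensity": 90.5},
    {"ID": 5, "T": 3, "X": 20.4, "Y": 33.3, "Z": 125.0, "Intensity": 1100.0},
]
bounds = compute_bounds(records)
print(bounds)
```

The cost is one full read of the file, which is the trade-off against storing the bounds up front.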

Records (simple delimited data):

ID,T,X,Y,Z,Intensity

This could be extended by allowing custom columns to be described in the
header. These columns would be appended to the standard ones.

The previously highlighted issue of pixel offset (e.g. positions 0,0 or
0.5,0.5 correspond to the centre of a pixel) could be either included in
the header, or just set mandatory that the coordinates use a standard
offset defined by the file type.

OK. I'm not an SR user, so unless someone else decides between these options I'll just go with (0,0).
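
To make the two offset conventions concrete, here is a small Python sketch that converts between them and finds the pixel containing a localisation. It assumes coordinates are in pixel units and that "centre at (0,0)" means pixel i spans [i - 0.5, i + 0.5), while "centre at (0.5,0.5)" means pixel i spans [i, i + 1). The function names are hypothetical.

```python
import math


def centre_to_corner(x, y):
    """Convert from the (0,0)-is-pixel-centre convention to the
    (0.5,0.5)-is-pixel-centre convention."""
    return x + 0.5, y + 0.5


def corner_to_centre(x, y):
    """Inverse of centre_to_corner."""
    return x - 0.5, y - 0.5


def pixel_index(x, y, centre_at_zero=True):
    """Return the (column, row) of the pixel containing the point."""
    if centre_at_zero:
        x, y = centre_to_corner(x, y)
    return math.floor(x), math.floor(y)


# The same physical point, expressed in each convention, lands in pixel (0, 0).
print(pixel_index(0.0, 0.0, centre_at_zero=True))
print(pixel_index(0.5, 0.5, centre_at_zero=False))
```

Whichever convention is chosen, fixing it in the format avoids a systematic half-pixel shift between tools.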

Also note that the data parser written by the EPFL for the Localisation
Microscopy Challenge 2013 had the facility to specify the delimiter for
records. It may also be required that the locale for numbers be of a
mandatory type or specified in the file given that some formats use
different characters for decimal points. This is something that has
caused a problem for my own software for a French user where the locale
uses commas as the decimal separator.
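
The locale problem can be illustrated with a short Python sketch in which the reader is told both the field delimiter and the decimal separator explicitly, rather than relying on the system locale. parse_record is a hypothetical helper, and the sample line is invented for illustration.

```python
def parse_record(line, delimiter=",", decimal="."):
    """Split one delimited record and parse every field as a float.

    The delimiter and decimal separator are supplied by the caller (for
    example from the file header), so parsing does not depend on the
    locale of the machine reading the file.
    """
    if delimiter == decimal:
        raise ValueError("delimiter and decimal separator must differ")
    fields = line.strip().split(delimiter)
    return [float(field.replace(decimal, ".")) for field in fields]


# A record as it might be written under a locale that uses decimal commas,
# with semicolons as the field delimiter.
values = parse_record("1;12;31,5;40,25;-120,0;800,5", delimiter=";", decimal=",")
print(values)
```

A text format would need to state both characters, or mandate them, for a file written on one machine to be read unambiguously on another.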

You might've noticed we're leaning towards HDF5. This should avoid most of the encoding/locale issues, as well as the need to convert between ASCII and binary.

How the header is composed is open to debate. It could be XML allowing
verbose descriptions of the data columns. Or a simple tabular format
like the example shown above. But I recommend it should be easy to
detect when the header has ended and the rows of data records begin.

If we do go with HDF5 then we can use one or more of the description/attribute fields on tables, columns, etc. Either something really easy and simple (e.g. units for each column), or if it's more complicated, a block of OME-XML.

I hope that provides a few ideas.

It does, thanks very much.

If anyone else has comments on the format we're working towards you should speak up soon!

Simon

Alex

On 19/09/15 12:00, ome-devel-request at lists.openmicroscopy.org.uk wrote:

Message: 1
Date: Fri, 18 Sep 2015 15:03:49 +0100
From: Simon Li <spli at dundee.ac.uk>
To: OME External Developer List
      <ome-devel at lists.openmicroscopy.org.uk>
Subject: Re: [ome-devel] Super-Resolution standard format
Message-ID:
      <CAMvbRBEkvEPyRWH0QvyCTihaYoudRkF_c-xjjFUoWY3p0bfTrQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Dear Alex

Thanks for adding your file formats. As you might've seen earlier in this thread, I made a start on a Python script for importing some SR data into OMERO.tables, which is essentially HDF5 behind the scenes with some OMERO-specific conventions (in terms of metadata inside the HDF5). Our current (tentative) plans are to stick with this, but remove the OMERO-specific requirements.

https://github.com/manics/omero-superresolution-tables

Ian Munro has done some work on getting PALM-siever and ThunderSTORM to work with this format.

Where we could really use your help is in getting everyone to agree on what columns should be mandatory, their types (for example, should x/y/z always be in nm), and anything else we might've missed.

Best wishes

Simon

_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel

The University of Dundee is a registered Scottish Charity, No: SC015096
