[ome-devel] Super-Resolution standard format

Thu Oct 1 13:32:01 BST 2015

Hi Nico,

Thanks for your comments.

I agree that the conversion field I suggested is not perfect. My idea 
was that software could choose to write its data using any units it 
wanted, speeding up data writing by not having to convert during 
output. However, it should provide a method to convert them to a standard.

For conversion of the Intensity, I left out that the gain would be 
ADUs/photon. This is implied by the requirement that the conversion 
field converts the specified unit to the standard. Thus in my example 
the conversion for X is nm/Pixel and Frames is seconds/Frame, since the 
standard units are nm and seconds respectively.
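
As a minimal sketch of that convention, each conversion factor multiplies 
the recorded value to give the standard unit. The calibration values 
(100 nm per pixel, 0.03 seconds per frame) and the helper name 
`to_standard` are assumptions for illustration only, not part of any 
proposal:

```python
# Hypothetical conversion factors: standard units per recorded unit.
CONVERSION = {
    "X": 100.0,     # nm per pixel (assumed pixel pitch)
    "Frame": 0.03,  # seconds per frame (assumed exposure time)
}


def to_standard(field, value):
    """Convert a recorded value to the standard unit for its field."""
    return value * CONVERSION[field]


x_nm = to_standard("X", 12.5)    # 12.5 pixels converted to nm
t_s = to_standard("Frame", 200)  # frame 200 converted to seconds
```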

If you allow the data to be recorded using any units with the 
requirement that the units be included then you are leaving a problem 
for anyone reading the file, i.e. they must be able to understand the 
units. The options so far suggested are:

1. Use standard units
2. Use any units you want and include the units in the file
3. Use a subset of defined units for each field and include the units in 
the file
4. Use any units you want and include a way to convert them to standard 
units

For simplicity option 1 would be the best. It transfers responsibility 
to the writer of the file to record the data correctly. However it does 
not allow the format to be used in the case where a calibration to 
convert to standard units is not available.

Option 2 allows the writer to record the data they currently have. It 
would be the fastest to write.

Option 3 should allow the writer to record their raw data (since the 
subset of defined units for each field would be well chosen). But it 
does not include conversion to other units.

Option 4 allows the writer to record the data they currently have but 
passes responsibility to them to state how the units are to be converted 
into something recognisable.

Perhaps options 3 and 4 can be combined. Allow the units to be recorded, 
but require them to be from a defined set of allowed units. The data can 
then be written directly. An optional section in the header can provide 
a calibration to convert to other units.

An example of the units allowed for each field would be:

ID: N/A
Coordinates: pixel, nm
Time: frame, second
Intensity: count, photon

The writer then has the option to include a calibration stating how the 
chosen units can be converted to one (or all) of the other allowed units 
in the field.
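
A rough sketch of how a reader might handle this combined scheme is 
below. The allowed units are the ones listed above; the calibration 
layout, the factor values and the function names `check_units` and 
`convert` are hypothetical:

```python
# Allowed units per field, taken from the example list above.
ALLOWED_UNITS = {
    "Coordinates": {"pixel", "nm"},
    "Time": {"frame", "second"},
    "Intensity": {"count", "photon"},
}

# Hypothetical optional calibration: (field, from_unit, to_unit) -> factor.
calibration = {
    ("Coordinates", "pixel", "nm"): 100.0,
    ("Time", "frame", "second"): 0.03,
}


def check_units(field, unit):
    """Raise ValueError if the unit is not allowed for the field."""
    if unit not in ALLOWED_UNITS[field]:
        raise ValueError(f"{unit!r} is not an allowed unit for {field}")


def convert(field, value, from_unit, to_unit, calibration):
    """Convert between two allowed units using the optional calibration."""
    check_units(field, from_unit)
    check_units(field, to_unit)
    if from_unit == to_unit:
        return value
    if (field, from_unit, to_unit) in calibration:
        return value * calibration[(field, from_unit, to_unit)]
    if (field, to_unit, from_unit) in calibration:
        # Only the opposite direction was supplied, so divide instead.
        return value / calibration[(field, to_unit, from_unit)]
    raise KeyError(f"no calibration for {field}: {from_unit} -> {to_unit}")
```

A file with no calibration for a field can still be read in its recorded 
units; only the conversion request fails.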

Note that the suggestion for XML in the file was only for the header. I 
agree that data records should be fast to read and write. Binary is 
preferable as long as the binary format can be read on multiple 
platforms. My own file formats see speed differences of 3 orders of 
magnitude when switching to binary. It is the only reasonable way to 
save and load a million records.
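
To illustrate the cross-platform point, here is a small sketch using 
Python's `struct` module with an explicit byte order. The record layout 
(id, x, y, frame, intensity) is an assumption made up for this example, 
not a proposed layout:

```python
import io
import struct

# Hypothetical record layout: id (uint32), x and y (float32),
# frame (uint32), intensity (float32). The "<" prefix fixes little-endian
# byte order and standard sizes, so the bytes are identical on every platform.
RECORD = struct.Struct("<IffIf")


def write_records(stream, records):
    """Write each record tuple as one fixed-size binary record."""
    for rec in records:
        stream.write(RECORD.pack(*rec))


def read_records(stream):
    """Read all fixed-size records from the current stream position."""
    data = stream.read()
    return [RECORD.unpack_from(data, off)
            for off in range(0, len(data), RECORD.size)]


buf = io.BytesIO()
write_records(buf, [(1, 10.5, 20.25, 0, 1500.0), (2, 11.0, 21.5, 1, 900.0)])
buf.seek(0)
records = read_records(buf)
```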

Allowing the header to be at the end of the file is a good option. It 
would require scanning the file first to locate the header, but this can 
be made a standard procedure. For example, this is done for TIFF image 
files, which can interleave binary data with information about the data.
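
One possible way to make locating the header a standard procedure is 
sketched below. It assumes a fixed-size trailer at the very end of the 
file that stores the byte offset of the header; this trailer is my own 
assumption for illustration and the names `write_file` and `read_header` 
are hypothetical:

```python
import io
import struct

# Hypothetical trailer: byte offset of the header as a little-endian uint64.
TRAILER = struct.Struct("<Q")


def write_file(stream, data, header):
    """Write binary records, then the header, then the trailer."""
    stream.write(data)
    header_offset = stream.tell()
    stream.write(header)  # e.g. an XML header, written after the data
    stream.write(TRAILER.pack(header_offset))


def read_header(stream):
    """Locate and return the header without scanning the data records."""
    stream.seek(-TRAILER.size, io.SEEK_END)  # trailer is the last 8 bytes
    trailer_start = stream.tell()
    (header_offset,) = TRAILER.unpack(stream.read(TRAILER.size))
    stream.seek(header_offset)
    return stream.read(trailer_start - header_offset)


buf = io.BytesIO()
write_file(buf, b"\x00" * 40, b"<header units='nm'/>")
header = read_header(buf)
```

With this layout the writer can stream records without knowing the 
record count in advance, and the reader needs only two seeks to find the 
header.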

Hi Simon,

I agree with Nico that the localisation results should be decoupled from 
the original images, which a user may want to discard once processed to 
save storage space. Also note that you may run localisation software 
directly on the output from the camera and not even keep the pixel 
images. This allows a very high frame rate to be used without generating 
a lot of data storage requirements.

As a personal preference I would vote for the centre of a pixel being 
recorded as 0.5,0.5. This means that 0,0 is the top left of a pixel; 1,1 
the bottom right. This allows the bounds of a camera pixel array to be 
specified as 0,0 to width,height. It also allows the floor of any 
sub-pixel floating point coordinate to identify the pixel it lies 
within, which is neater than using rounding.

If you set 0,0 as the centre of a pixel it means that the lower bound 
for data in pixels is -0.5,-0.5. If using nm for coordinates it means 
that the lower edge of your data bounds is -(pixel pitch) / 2.
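
A short sketch of the convention I prefer, with pixel centres at 
(i + 0.5, j + 0.5). The function names are just for illustration:

```python
import math


def pixel_index(x, y):
    """Pixel containing (x, y) when pixel centres are at (i + 0.5, j + 0.5)."""
    return math.floor(x), math.floor(y)


def in_bounds(x, y, width, height):
    """True if (x, y) lies inside a width x height pixel array."""
    return 0 <= x < width and 0 <= y < height
```

With 0,0 as the pixel centre instead, the index would be 
math.floor(x + 0.5) and the bounds test would run from -0.5 to 
width - 0.5.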

Regards,

Alex