<div dir="ltr"><div>Hi all,<br><br></div>Quick background: I'm one of the µManager developers, and recently had to spend some time going through our TIFF file format. We may need to be able to output super-resolution localization data sometime in the future, so I have a potential horse in this race. :)<br><div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Oct 1, 2015 at 5:32 AM, Alex Herbert <span dir="ltr">&lt;<a href="mailto:a.herbert@sussex.ac.uk" target="_blank">a.herbert@sussex.ac.uk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Nico,<br>
<br>
Thanks for your comments.<br>
<br>
I agree that the conversion field I suggested is not perfect. My idea was that software could choose to write its data using any units it wanted, so speeding up data writing by not having to convert during output. However, it should provide a method to convert them to a standard.<br></blockquote><div><br></div>I do not believe this would produce a measurable improvement in file writing speed. Unit conversion is a simple multiplication, maybe with an addition in some cases; adding two arithmetic operations per value to your output code will not make a difference. Meanwhile, allowing the file writer to use arbitrary units puts a significant onus on whoever must read the files.<br></div><div class="gmail_quote"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
For conversion of the Intensity, I left out that the gain would be ADUs/photon. This is implied by the requirement that the conversion field converts the specified unit to the standard. Thus in my example the conversion for X is nm/Pixel and Frames is seconds/Frame, since the standard units are nm and seconds respectively.<br>
<br>
If you allow the data to be recorded using any units with the requirement that the units be included then you are leaving a problem for anyone reading the file, i.e. they must be able to understand the units. The options so far suggested are:<br>
<br>
1. Use standard units<br>
2. Use any units you want and include the units in the file<br>
3. Use a subset of defined units for each field and include the units in the file<br>
4. Use any units you want and include a way to convert them to standard units<br></blockquote><br></div><div class="gmail_quote">My preference would be for 1, with the ability to mark the units as "unknown; described by creator as X". In other words, if I, as a file writer, am *unable* to convert to standard units, then I can say "Look, best I can tell you is that I got 75 counts from the camera" and then the number would be 75 with a unit of "counts" in all representations (i.e. with no attempt at unit conversion). In all other cases, as I *am* able to convert to standard units, I must do so.<br><br></div><div class="gmail_quote">There should be one right way to write data. If users want to use different units to *interpret* the data, well, of course they can do that, but there should be only one standard set of units in the storage. That vastly simplifies the lives of whoever has to read/write this data because they don't have to cope with a bunch of different valid ways things could potentially be done. <br><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Allowing the header to be at the end of the file is a good option. It would require scanning the file first to locate the header, but this can be made a standard procedure; for example, this is done for TIFF image files, which can interlace binary data with information about the data.<br></blockquote><div><br></div><div>You can have a reserved field at the beginning of the file that contains the offset of the header. So when writing is complete, you output the header at offset X, then you go back to that reserved field and write X into it. Thus when reading the file, you can quickly extract the offset of the header and jump to it, instead of having to scan through the entire file. <br><br></div><div>One thing we wish we had done for µManager's multipage TIFF format was to include "flagpost" values with each image in the file, to identify "a new image record starts here". The "flagpost" is simply a likely-to-be-unique value (e.g. a randomly-chosen 64-bit number). Its purpose is to allow manual reconstruction of the data if a file becomes corrupted, e.g. because the program crashed midway through writing. In such situations you do not have a header, or the header is corrupt, so you can't use it to interpret the rest of the file. You may want to consider something similar if there are repetitive structures in this file format that you would like to be able to recover later.<br><br></div><div>-Chris<br></div></div></div></div></div>
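<div><br>To make the two suggestions above concrete (a reserved header-offset field at the start of the file, and a "flagpost" in front of each record), here is a rough Python sketch. Everything in it is made up for illustration: the 8-byte offset field, the 4-byte length prefix, the JSON header, the flagpost value, and the names <code>write_file</code>, <code>read_header</code> and <code>recover_records</code>. It is not µManager's format and not a proposal for the actual layout.<br><br></div>
<pre>
```python
import io
import json

OFFSET_FIELD_SIZE = 8  # reserved bytes at the very start of the file (assumed size)
# Arbitrary, hypothetical 64-bit marker; a real format would pick its own value.
FLAGPOST = bytes.fromhex("9f3c1a7be45d0826")


def write_file(stream, records, header):
    """Write records, append the header, then patch its offset into the reserved field."""
    stream.write(bytes(OFFSET_FIELD_SIZE))  # placeholder; 0 means "no header written yet"
    for payload in records:
        stream.write(FLAGPOST)                            # "a new record starts here"
        stream.write(len(payload).to_bytes(4, "little"))  # payload length in bytes
        stream.write(payload)
    header_offset = stream.tell()
    stream.write(json.dumps(header).encode("utf-8"))
    stream.seek(0)  # go back and fill in the reserved field
    stream.write(header_offset.to_bytes(OFFSET_FIELD_SIZE, "little"))


def read_header(stream):
    """Jump straight to the header using the reserved field; None if it was never written."""
    stream.seek(0)
    header_offset = int.from_bytes(stream.read(OFFSET_FIELD_SIZE), "little")
    if header_offset == 0:
        return None
    stream.seek(header_offset)
    return json.loads(stream.read().decode("utf-8"))


def recover_records(data):
    """Scan raw bytes for flagposts and return the payloads of all complete records."""
    records = []
    pos = data.find(FLAGPOST)
    while pos != -1:
        length_start = pos + len(FLAGPOST)
        payload_start = length_start + 4
        if payload_start > len(data):
            break  # file ends inside the length field
        length = int.from_bytes(data[length_start:payload_start], "little")
        payload_end = payload_start + length
        if payload_end > len(data):
            break  # final record was cut off mid-write
        records.append(data[payload_start:payload_end])
        pos = data.find(FLAGPOST, payload_end)
    return records


# Example: write two records to an in-memory file, then read the header back.
example = io.BytesIO()
write_file(example, [b"frame-0", b"frame-1"], {"x_unit": "nm", "t_unit": "s"})
example_header = read_header(example)
```
</pre>
<div>The recovery function never looks at the header, so it still works on a file whose header was never written. It does trust each record's length field, so a real format would probably also want a checksum per record.<br></div>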