[ome-users] Bio-formats - Bug when recycling TiffDelegateReader

Thu Apr 13 16:49:34 BST 2017

Hi Melissa,

Yes, I understand the problem with the TIFF reader: the delegate reader
is a kind of generic TIFF reader and should be able to open any TIFF
image (in some fashion). I think I will use the idea of a custom reader
list, so that we can obtain the right reader faster than by testing all
available readers.
As I think the problem is specific to the TiffDelegateReader, we can
probably do something like:

if (reader instanceof TiffDelegateReader)
   reader = tiffReaders.getReader(path);
else if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
   reader = mainReader.getReader(path);

Or probably even better:

// was it a TIFF-based reader?
if ((reader instanceof TiffDelegateReader)
      || (reader instanceof BaseTiffReader)
      || (reader instanceof OMETiffReader))
   reader = tiffReaders.getReader(path);
else if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
   reader = mainReader.getReader(path);

This is because reader.isThisType(path, false) may return 'true' for
every TIFF reader as soon as the file has a .tif extension.
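
To make the ordering issue concrete, here is a minimal, self-contained
sketch. It deliberately does not use the Bio-Formats API: ModelReader,
detect() and recycle() are hypothetical names, and a file's content is
modeled as a plain string. It only illustrates the idea that the first
matching reader in list order wins, and that a recycled generic TIFF
reader accepts an OME-TIFF file unless detection is redone against a
TIFF-only list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a format reader (NOT the Bio-Formats API):
// a name, a "TIFF-based" flag, and a content-based detection rule.
final class ModelReader {
    final String name;
    final boolean tiffBased;
    private final Predicate<String> accepts;

    ModelReader(String name, boolean tiffBased, Predicate<String> accepts) {
        this.name = name;
        this.tiffBased = tiffBased;
        this.accepts = accepts;
    }

    // Models a content check such as isThisType(path, true).
    boolean isThisType(String content) {
        return accepts.test(content);
    }
}

final class ReaderRecycling {
    // Full list: the generic TIFF reader comes last, as described for readers.txt.
    static List<ModelReader> allReaders() {
        return Arrays.asList(
            new ModelReader("PNG", false, c -> c.startsWith("PNG")),
            new ModelReader("OME-TIFF", true,
                c -> c.startsWith("TIFF") && c.contains("OME-XML")),
            new ModelReader("Generic TIFF", true, c -> c.startsWith("TIFF")));
    }

    // TIFF-only list: same relative order, non-TIFF readers left out.
    static List<ModelReader> tiffReaders() {
        List<ModelReader> all = allReaders();
        return Arrays.asList(all.get(1), all.get(2));
    }

    // Models ordered detection: the first reader that accepts the file wins.
    static ModelReader detect(List<ModelReader> ordered, String content) {
        for (ModelReader r : ordered) {
            if (r.isThisType(content)) {
                return r;
            }
        }
        throw new IllegalArgumentException("no reader accepts this file");
    }

    // The recycling rule proposed above: always re-detect among TIFF readers
    // when the current reader is TIFF-based, otherwise reuse it if it still fits.
    static ModelReader recycle(ModelReader current, List<ModelReader> all,
                               List<ModelReader> tiffOnly, String content) {
        if (current.tiffBased) {
            return detect(tiffOnly, content);
        }
        if (!current.isThisType(content)) {
            return detect(all, content);
        }
        return current;
    }

    public static void main(String[] args) {
        List<ModelReader> all = allReaders();
        List<ModelReader> tiffOnly = tiffReaders();

        ModelReader reader = detect(all, "TIFF slices=40");
        System.out.println(reader.name);                       // Generic TIFF
        // Naive recycling would keep the generic reader for an OME-TIFF file:
        System.out.println(reader.isThisType("TIFF OME-XML")); // true
        reader = recycle(reader, all, tiffOnly, "TIFF OME-XML");
        System.out.println(reader.name);                       // OME-TIFF
    }
}
```

In the real code the two lists would be ImageReader instances (the
default one and one built from a TIFF-only ClassList), which this model
does not attempt to reproduce.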

Thanks for the clarifications :)

Best,

- Stephane

On 13/04/2017 at 16:46, Melissa Linkert wrote:
> Hi Stephane,
>
>> Thanks for taking the time to investigate the question !
>> OK, I totally understand. I thought both files were generated by ImageJ,
>> so I used that example, but in fact my code snippet was incomplete (to
>> make it simpler). What we really do is first check whether we can reuse
>> the reader, using this code between file openings:
>> if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>        reader = mainReader.getReader(path);
>
>> As you can see, we first want to check without opening the file (again,
>> because it's faster). I thought that was the issue, as the reader
>> probably uses the file extension to assume it can open the file, so I
>> changed the code to
>> if (!reader.isThisType(path, true))
>>       reader = mainReader.getReader(path);
>
> Understood. Using this strategy to minimize file opening makes sense
> and is similar to what has been implemented in the python-bioformats
> bindings by the CellProfiler team [1].
>
>> But the result is the same: the TiffDelegateReader returns 'true' for
>> reader.isThisType('rotated.tif', true).
>> Looking at the mainReader.getReader(path) implementation, it uses the
>> same method internally to determine which reader to return.
>> The only difference is that mainReader (ImageReader) always scans the
>> readers in its list instead of trying to recycle the last used one, which
>> means that the OME-TIFF reader probably appears earlier in the list than
>> TiffDelegateReader, hence the difference.
>
> A couple of additional thoughts about file format detection.
> Generally, identifying the correct format is not a trivial operation
> largely due to the fact that the mapping is not one-to-one between
> files and file formats.
>
> This is especially true for TIFF-based file formats (currently 29 out
> of the 144 supported in Bio-Formats). All the files stored using these
> formats could be detected as generic TIFF files as well. Obviously in
> many cases, the expectation is to read the correct metadata i.e. use
> the correct reader with the appropriate grouping (which can require
> opening the files).
>
> Format detection in ImageReader is determined by the order of readers in
> readers.txt [2]. As you can see TiffDelegateReader is listed at the
> bottom to give a chance for all the other TIFF-based file formats to
> detect the file.  This unfortunately means that calling
> 'isThisType(path, true)' on a TiffDelegateReader alone will not be
> enough to tell you whether a different TIFF-based reader would be more
> appropriate for the file.  Setting 'reader' to
> 'mainReader.getReader(path)' for every new file will be the safest way
> to guarantee that the correct TIFF-based reader is used every time
> (though this will be a little slower).
>
> You might also consider using an ImageReader with a custom ClassList
> to resolve the case where TiffDelegateReader's 'isThisType(path,
> true)' returns true.  See [3] for an example of constructing an
> ImageReader with a custom ClassList.  The ClassList overrides what is
> in readers.txt; if you add only TIFF-based readers to it, you should
> get correct type detection for known TIFF files without the overhead
> of checking non-TIFF readers.
>
> Regards,
> -Melissa
>
> [1] https://github.com/CellProfiler/python-bioformats/issues/23
> [2] https://github.com/openmicroscopy/bioformats/blob/develop/components/formats-api/src/loci/formats/readers.txt
> [3] https://github.com/openmicroscopy/bioformats/blob/develop/components/formats-bsd/src/loci/formats/in/NRRDReader.java#L225
>
> On Wed, Apr 12, 2017 at 5:17 AM, Stephane Dallongeville
> <stephane.dallongeville at pasteur.fr> wrote:
>> Hi Sebastien,
>>
>> Thanks for taking the time to investigate the question !
>>
>>> Looking at the data, I think the issue is related to the unwrapping of
>>> the ImageReader using the getReader() API, while the two files are
>>> actually different formats from the Bio-Formats standpoint:
>>>   - 3dtestbigger.tif is a TIFF file generated by ImageJ. As you
>>>     mentioned, the ImageReader will delegate to the TiffDelegateReader
>>>     and parse the image dimensions from the content of the
>>>     ImageDescription tag (`slices=40`)
>>>   - rotated.tif is a TIFF file with an OME-XML string in its
>>>     ImageDescription tag. An ImageReader will detect such a file as an
>>>     OME-TIFF and read its dimensions using the OME metadata.
>>>     The TiffReader will also read this file but detect it as a regular
>>>     multi-page TIFF and aggregate all the planes along the T dimension.
>>> ....
>>> All understood. As mentioned above, the code snippet above also includes
>>> the assumption that the initial format
>>> reader can be used for all the files. I assume this might be a constraint
>>> in addition to the reader recycling.
>>> Otherwise, it should be possible to recycle the generic ImageReader.
>> OK, I totally understand. I thought both files were generated by ImageJ, so
>> I used that example, but in fact my code snippet was incomplete (to make it
>> simpler). What we really do is first check whether we can reuse the reader,
>> using this code between file openings:
>>
>> if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>        reader = mainReader.getReader(path);
>>
>> As you can see, we first want to check without opening the file (again,
>> because it's faster). I thought that was the issue, as the reader probably
>> uses the file extension to assume it can open the file, so I changed the
>> code to
>>
>> if (!reader.isThisType(path, true))
>>        reader = mainReader.getReader(path);
>>
>> But the result is the same: the TiffDelegateReader returns 'true' for
>> reader.isThisType('rotated.tif', true).
>> Looking at the mainReader.getReader(path) implementation, it uses the same
>> method internally to determine which reader to return.
>> The only difference is that mainReader (ImageReader) always scans the
>> readers in its list instead of trying to recycle the last used one, which
>> means that the OME-TIFF reader probably appears earlier in the list than
>> TiffDelegateReader, hence the difference.
>>
>> Best,
>>
>> --
>> Stephane Dallongeville
>> Unité d'Analyse d'Images Biologiques
>> CNRS UMR 3691
>> Institut Pasteur
>> 25 rue du Dr. Roux
>> 75724 Paris cedex 15 (FRANCE)
>>
>> Tel: +33 (0)1 45 68 87 01
>> Fax: +33 (0)1 40 61 33 30
>>
>> http://www.bioimageanalysis.org/
>>
>> _______________________________________________
>> ome-users mailing list
>> ome-users at lists.openmicroscopy.org.uk
>> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-users