[ome-users] Bio-formats - Bug when recycling TiffDelegateReader

Fri Apr 14 13:12:18 BST 2017

Hi David,

Thanks for the confirmation, TIFF is the only format where we need to do 
that :)
Yeah, the second snippet should be better: it avoids problems such as 
OMETiffReader being used in place of another TIFF reader. The first 
proposed solution is imperfect in that case (or should use 
reader.isThisType(path, true) only for the second test).
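
To make the second variant concrete, here is a minimal, self-contained sketch
of the recycling logic. It does not use the Bio-Formats classes: Probe, detect,
recycle, the toy CONTENT map and the file "volume.nrrd" are all hypothetical
stand-ins that only model the detection behaviour discussed in this thread
(first match in list order wins, and the generic TIFF probe accepts any TIFF).
The fallback to the main list when the TIFF-only list finds nothing is my own
addition and was not part of the snippets below.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReaderRecycling {

    // Toy "file system" (hypothetical data): path -> what the file really contains.
    static final Map<String, String> CONTENT = new HashMap<>();
    static {
        CONTENT.put("3dtestbigger.tif", "imagej-tiff");
        CONTENT.put("rotated.tif", "ome-tiff");
        CONTENT.put("volume.nrrd", "nrrd");
    }

    /** Hypothetical stand-in for a format reader; it only models detection. */
    static final class Probe {
        final String name;
        final boolean tiffBased;
        final String suffix;   // checked when open == false
        final String content;  // checked when open == true; null = accept any TIFF

        Probe(String name, boolean tiffBased, String suffix, String content) {
            this.name = name;
            this.tiffBased = tiffBased;
            this.suffix = suffix;
            this.content = content;
        }

        boolean isThisType(String path, boolean open) {
            if (!open) return path.endsWith(suffix);       // extension only
            String actual = CONTENT.get(path);
            if (actual == null) return false;
            return content == null ? actual.endsWith("tiff") : actual.equals(content);
        }
    }

    static final Probe NRRD = new Probe("NRRD", false, ".nrrd", "nrrd");
    static final Probe OME_TIFF = new Probe("OME-TIFF", true, ".ome.tif", "ome-tiff");
    static final Probe GENERIC_TIFF = new Probe("generic TIFF", true, ".tif", null);

    // Order matters: the generic TIFF probe sits last so specific probes get first chance.
    static final List<Probe> MAIN = Arrays.asList(NRRD, OME_TIFF, GENERIC_TIFF);
    static final List<Probe> TIFF_ONLY = Arrays.asList(OME_TIFF, GENERIC_TIFF);

    /** Simplified detection: the first probe that accepts the opened file wins. */
    static Probe detect(List<Probe> ordered, String path) {
        for (Probe p : ordered) {
            if (p.isThisType(path, true)) return p;
        }
        return null; // unknown format
    }

    /** Decide whether the current probe can be kept for the next file. */
    static Probe recycle(Probe current, String path) {
        if (current.tiffBased) {
            // Never trust a TIFF probe's own answer: re-detect among TIFF probes only.
            Probe p = detect(TIFF_ONLY, path);
            return p != null ? p : detect(MAIN, path); // fallback is an assumption
        }
        if (!current.isThisType(path, false) && !current.isThisType(path, true)) {
            return detect(MAIN, path);
        }
        return current;
    }

    public static void main(String[] args) {
        Probe reader = detect(MAIN, "3dtestbigger.tif");
        System.out.println(reader.name);
        // The naive check wrongly says the generic probe can be kept:
        System.out.println(reader.isThisType("rotated.tif", true));
        reader = recycle(reader, "rotated.tif");
        System.out.println(reader.name);
    }
}
```

In this toy model the generic probe accepts rotated.tif when asked directly,
which is the bug from the subject line, while recycle() switches to the
OME-TIFF probe because it re-runs detection over the TIFF-only list.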

Best,

- Stephane

On 14/04/2017 at 13:05, David Gault (Staff) wrote:
> Hi Stephane,
>
> Both of the solutions you proposed would be suitable. Looking through the readers.txt file, it appears that TIFF is the only format which also uses a base or delegate reader directly in this manner, so your plan of identifying TIFF readers separately should be satisfactory.
>
> Between the two solutions the main benefit of your second option would be to speed up the reader identification time if you are opting to use a separate class list for TIFF based formats. As you already have a TIFF class list prepared it makes sense to use it for all TIFF formats in this manner. The 3 Reader classes TiffDelegateReader, BaseTiffReader and OMETiffReader should also cover all possible TIFF reader combinations.
>
> With Thanks,
> David Gault
>
>> On 13 Apr 2017, at 16:49, Stephane Dallongeville <stephane.dallongeville at pasteur.fr> wrote:
>>
>> Hi Melissa,
>>
>> Yeah, I perfectly understand the problem with the TIFF reader: the delegate reader is a kind of generic TIFF reader and should be able to open any TIFF image (somehow). I think I will go with the idea of a custom reader list, so we can obtain the reader faster than by testing all available readers.
>> And as I think this is very specific to the TiffDelegateReader, we can probably do something like:
>>
>> if (reader instanceof TiffDelegateReader)
>>   reader = tiffReaders.getReader(path);
>> else if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>   reader = mainReader.getReader(path);
>>
>> Or probably even better :
>>
>> // was a TIFF reader ?
>> if ((reader instanceof TiffDelegateReader) || (reader instanceof BaseTiffReader) || (reader instanceof OMETiffReader))
>>   reader = tiffReaders.getReader(path);
>> else if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>   reader = mainReader.getReader(path);
>>
>> This is because reader.isThisType(path, false) may return 'true' for every TIFF reader as soon as the file has a .tif extension.
>>
>> Thanks for the clarifications :)
>>
>> Best,
>>
>> - Stephane
>>
>>
>> On 13/04/2017 at 16:46, Melissa Linkert wrote:
>>> Hi Stephane,
>>>
>>>> Thanks for taking the time to investigate the question !
>>>> Ok, I totally understand. I thought both files were generated by ImageJ,
>>>> so I used that example, but in fact my code snippet was incomplete (to
>>>> make it simpler). What we really do is first check whether we can re-use
>>>> the reader, using this code between file openings:
>>>> if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>>>        reader = mainReader.getReader(path);
>>>
>>>> As you can see, we first want to check without opening the file (again,
>>>> because it's faster). I thought that was the issue, as the reader
>>>> probably used the file extension to assume it can open the file, so I
>>>> changed the code to
>>>> if (!reader.isThisType(path, true))
>>>>       reader = mainReader.getReader(path);
>>> Understood. Using this strategy to minimize file opening makes sense
>>> and is similar to what has been implemented in the python-bioformats
>>> bindings by the CellProfiler team [1].
>>>
>>>> But the result is the same: the TiffDelegateReader returns 'true' for
>>>> reader.isThisType('rotated.tif', true).
>>>> Looking at the mainReader.getReader(path) implementation, it uses the
>>>> same method internally to determine which reader to return.
>>>> The only difference is that mainReader (ImageReader) always scans the
>>>> readers in its list instead of trying to recycle the last used one...
>>>> which means that the OME-TIFF reader probably appears earlier in the
>>>> list than TiffDelegateReader, hence the difference.
>>> A couple of additional thoughts about file format detection.
>>> Generally, identifying the correct format is not a trivial operation
>>> largely due to the fact that the mapping is not one-to-one between
>>> files and file formats.
>>>
>>> This is especially true for TIFF-based file formats (currently 29 out
>>> of the 144 supported in Bio-Formats). All the files stored using these
>>> formats could be detected as generic TIFF files as well. Obviously in
>>> many cases, the expectation is to read the correct metadata i.e. use
>>> the correct reader with the appropriate grouping (which can require
>>> opening the files).
>>>
>>> Format detection in ImageReader is given by the order of readers in
>>> readers.txt [2]. As you can see TiffDelegateReader is listed at the
>>> bottom to give a chance for all the other TIFF-based file formats to
>>> detect the file.  This unfortunately means that calling
>>> 'isThisType(path, true)' on a TiffDelegateReader alone will not be
>>> enough to tell you whether a different TIFF-based reader would be more
>>> appropriate for the file.  Setting 'reader' to
>>> 'mainReader.getReader(path)' for every new file will be the safest way
>>> to guarantee that the correct TIFF-based reader is used every time
>>> (though this will be a little slower).
>>>
>>> You might also consider using an ImageReader with a custom ClassList
>>> to resolve the case where TiffDelegateReader's 'isThisType(path,
>>> true)' returns true.  See [3] for an example of constructing an
>>> ImageReader with a custom ClassList.  The ClassList overrides what is
>>> in readers.txt; if you add only TIFF-based readers to it, you should
>>> get correct type detection for known TIFF files without the overhead
>>> of checking non-TIFF readers.
>>>
>>> Regards,
>>> -Melissa
>>>
>>> [1] https://github.com/CellProfiler/python-bioformats/issues/23
>>> [2] https://github.com/openmicroscopy/bioformats/blob/develop/components/formats-api/src/loci/formats/readers.txt
>>> [3] https://github.com/openmicroscopy/bioformats/blob/develop/components/formats-bsd/src/loci/formats/in/NRRDReader.java#L225
>>>
>>> On Wed, Apr 12, 2017 at 5:17 AM, Stephane Dallongeville
>>> <stephane.dallongeville at pasteur.fr> wrote:
>>>> Hi Sebastien,
>>>>
>>>> Thanks for taking the time to investigate the question !
>>>>
>>>>> Looking at the data, I think the issue is related to the unwrapping of
>>>>> the ImageReader using the getReader() API
>>>>> while the two files are actually different formats from the Bio-Formats
>>>>> standpoint:
>>>>>   - 3dtestbigger.tif is a TIFF file generated by ImageJ. As you mentioned,
>>>>> the ImageReader will delegate to
>>>>>     the TiffDelegateReader and parse the image dimensions from the content
>>>>> of the ImageDescription tag (`slices=40`)
>>>>>   - rotated.tif is a TIFF file with an OME-XML string in its
>>>>> ImageDescription tag. An ImageReader will detect
>>>>>     such a file as an OME-TIFF and read its dimensions using the OME
>>>>> metadata.
>>>>>     The TiffReader will also read this file but detect it as a regular
>>>>> multi-page TIFF and aggregate all the planes alongside the T dimension.
>>>>> ....
>>>>> All understood. As mentioned above, the code snippet above also includes
>>>>> the assumption that the initial format
>>>>> reader can be used for all the files. I assume this might be a constraint
>>>>> in addition to the reader recycling.
>>>>> Otherwise, it should be possible to recycle the generic ImageReader.
>>>> Ok, I totally understand. I thought both files were generated by ImageJ, so I
>>>> used that example, but in fact my code snippet was incomplete (to make it
>>>> simpler). What we really do is first check whether we can re-use the reader,
>>>> using this code between file openings:
>>>>
>>>> if (!reader.isThisType(path, false) && !reader.isThisType(path, true))
>>>>        reader = mainReader.getReader(path);
>>>>
>>>> As you can see, we first want to check without opening the file (again,
>>>> because it's faster). I thought that was the issue, as the reader probably
>>>> used the file extension to assume it can open the file, so I changed the
>>>> code to
>>>>
>>>> if (!reader.isThisType(path, true))
>>>>        reader = mainReader.getReader(path);
>>>>
>>>> But the result is the same: the TiffDelegateReader returns 'true' for
>>>> reader.isThisType('rotated.tif', true).
>>>> Looking at the mainReader.getReader(path) implementation, it uses the same
>>>> method internally to determine which reader to return.
>>>> The only difference is that mainReader (ImageReader) always scans the
>>>> readers in its list instead of trying to recycle the last used one... which
>>>> means that the OME-TIFF reader probably appears earlier in the list than
>>>> TiffDelegateReader, hence the difference.
>>>>
>>>> Best,
>>>>
>>>> --
>>>> Stephane Dallongeville
>>>> Unité d'Analyse d'Images Biologiques
>>>> CNRS UMR 3691
>>>> Institut Pasteur
>>>> 25 rue du Dr. Roux
>>>> 75724 Paris cedex 15 (FRANCE)
>>>>
>>>> Tel: +33 (0)1 45 68 87 01
>>>> Fax: +33 (0)1 40 61 33 30
>>>>
>>>> http://www.bioimageanalysis.org/