[ome-users] File format for HTS/HCS data

Christian Carsten Sachs c.sachs at fz-juelich.de
Thu Oct 13 13:46:49 BST 2016


Hi Mario, hi Kai,

I'm joining the discussion because I am also very interested in an HDF5-based
container format.

Given that the data are to be assembled outside of any existing microscopy
software, I don't think it makes much sense to 'emulate' the structure
of a particular proprietary software's HDF format.
As you say, a new reader would be the way to go.

There has been some discussion about putting OME metadata in HDF5
containers, i.e. creating OME-HDF files (analogous to OME-TIFF), for
example on the mailing list in 2009 [1] and 2012 [2].

Therefore I wonder whether someone from the OME team could say if
progress has been made in this direction. I'd be very interested in
any developments.

Best regards,
Christian Sachs

[1]
http://lists.openmicroscopy.org.uk/pipermail/ome-devel/2009-February/001152.html
[2]
http://lists.openmicroscopy.org.uk/pipermail/ome-devel/2012-May/002220.html

On 10/13/2016 02:37 PM, Mario Emmenlauer wrote:
>
> Dear Kai,
>
> On 13.10.2016 14:01, Kai Schleicher wrote:
>> One limitation for us is that whatever container format we choose must be
>> *read by Bio-Formats*.
>
> This is a very valid point. I'm also only aware of CellH5 and the Imaris format.
> But I would not see this as a big limitation. HDF5 support is already in
> Bio-Formats, so adding support for a specific "formatting" of HDF5 should be
> relatively easy, even if that format is not currently supported in
> Bio-Formats. But that's just me :)
>
> But can you elaborate on what your use case is? Let's take an example: you have
> a container for a plate with 10,368 images. Do you just want to read an image
> with a known name? Or do you need a user interface to select the image from
> the container, for example in Fiji? If a user interface is needed, how
> elaborate does it need to be to pick the images? Would a scrollable listing
> be sufficient, or should it be more comfortable, like a plate layout,
> or something fancier?
>
> Cheers,
>
>    Mario
>
>
>
>> Writing the same format with Bio-Formats would be great too, but we could
>> probably work around that.
>>
>> This requirement is only met by CellH5 if I am not mistaken. Are there
>> alternatives for Bio-Formats?
>>
>> Thanks for your input and cheers,
>> Kai
>>
>>
>>
>> On 10/13/2016 11:30 AM, Mario Emmenlauer wrote:
>>> Dear Kai,
>>>
>>> I have a slightly lengthy reply because we have been doing this for a while,
>>> with certain pros and cons. We have stored a number of single HCS images in
>>> HDF5 containers, to make them better suited for storage and archiving.
>>> Our common data sizes are 384-well plates, 9 sites per well, 3-4 channels.
>>> Therefore we have a total of ~12,000 images per plate. Currently we host some
>>> 5500 plates.
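
The image count quoted above can be checked with a short calculation. This is only a sketch: it assumes one image file per site and channel (no z-planes or time points), and the names WELLS, SITES_PER_WELL and images_per_plate are made up for illustration.

```python
# Images per plate for the layout described above:
# 384 wells, 9 sites per well, 3 or 4 channels.
# Assumption: one image file per site and channel.
WELLS = 384
SITES_PER_WELL = 9


def images_per_plate(channels: int) -> int:
    """Return the number of single-channel images on one plate."""
    return WELLS * SITES_PER_WELL * channels


low = images_per_plate(3)
high = images_per_plate(4)
# Print the 3-channel count, the 4-channel count, and their midpoint.
print(low, high, (low + high) // 2)
```

The 3-channel case gives the 10,368 images used as an example elsewhere in the thread, and the midpoint of the two cases is close to the ~12,000 figure.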
>>>
>>> When we went to HDF5, we observed several significant effects. First,
>>> depending on the block size of the file system of the storage, we use less
>>> storage space! That's because many storage systems use a larger block size,
>>> which effectively "wastes" a bit of space on every image. We could save up to
>>> 20% of the space by combining the images into one container.
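
The block-size effect can be estimated roughly as follows. This is a sketch under stated assumptions, not a measurement: it assumes the storage allocates whole blocks per file, it ignores any overhead of the container itself, and the file and block sizes in the example are invented.

```python
def allocated_size(file_size: int, block_size: int) -> int:
    """Bytes occupied on disk when space is handed out in whole blocks."""
    blocks = -(-file_size // block_size)  # integer ceiling division
    return blocks * block_size


def slack_fraction(file_sizes, block_size: int) -> float:
    """Fraction of the allocated space that holds no data."""
    used = sum(file_sizes)
    allocated = sum(allocated_size(size, block_size) for size in file_sizes)
    return (allocated - used) / allocated


# Hypothetical plate: 10,368 images of 2,200,000 bytes each.
images = [2_200_000] * 10_368
one_container = [sum(images)]  # same data stored as a single large file

for block in (4 * 1024, 1024 * 1024):  # 4 KiB and 1 MiB blocks
    print(block, slack_fraction(images, block), slack_fraction(one_container, block))
```

With small blocks the difference is negligible; with large blocks every single image wastes part of its last block, while a single container wastes at most one block in total.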
>>>
>>> We could also significantly improve network transfer rates with the
>>> containers. It's mostly the storage read/write rates that go up, and we could
>>> improve throughput up to 3-fold! This is also great for archiving and backup.
>>>
>>> However, there are also downsides. First, it's more cumbersome to access the
>>> images, because you add a layer of complexity. For us, the HDF5 support is
>>> built into the storage database, so when we browse images on the web and
>>> download, the HDF5 is transparent. But every once in a while we need to
>>> access the images on the disk. In order to do that, we now use ImageJ, or
>>> an archive extractor, or a Matlab HDF5 reader that we implemented ourselves.
>>> It works, but it's just good to know that the user experience is not the same.
>>>
>>> In my humble opinion the benefits outweigh the downsides. And I'd recommend
>>> an HDF5-based format, because it's broadly supported and very open! Many large
>>> players build on it, like NASA, so I hope it will be supported for a long time.
>>> Also the HDF5 support in Fiji/ImageJ is quite good. However, this still leaves
>>> you with the question of how to "format" your HDF5 file. This is a separate
>>> problem! Think of it like a zip archive: it's up to you what folder layout you
>>> have inside the zip container, and this can still very much impact the
>>> properties of working with the data.
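
The zip-archive analogy can be made concrete with a small sketch. It uses Python's zipfile module rather than HDF5 so that it needs nothing beyond the standard library. The one-folder-per-well layout, the file names and the helpers build_container and images_by_well are hypothetical and not part of any format mentioned in this thread.

```python
import io
import zipfile
from collections import defaultdict


def build_container(layout):
    """Write a dict of {internal_path: bytes} into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for path, payload in layout.items():
            archive.writestr(path, payload)
    buffer.seek(0)
    return buffer


def images_by_well(container):
    """Group member names by their first path component (the well)."""
    wells = defaultdict(list)
    with zipfile.ZipFile(container) as archive:
        for name in archive.namelist():
            well, _, image = name.partition("/")
            wells[well].append(image)
    return dict(wells)


# Hypothetical layout: one "folder" per well, one file per site and channel.
# The payloads are placeholder bytes, not real TIFF data.
layout = {
    "A01/s1_w1.tif": b"placeholder",
    "A01/s1_w2.tif": b"placeholder",
    "A02/s1_w1.tif": b"placeholder",
}
print(images_by_well(build_container(layout)))
```

Because the well is encoded in the internal path, a plate-style listing falls out of the member names alone. A flat layout with opaque file names would need a separate index to offer the same view.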
>>>
>>> We chose the "H5AR" formatting (and library) from SIS, because it's easy to use,
>>> and because they are the providers of Java HDF5 for Fiji/ImageJ. This is also
>>> the format used by openBIS. It's actively developed and quite mature; we did
>>> not encounter any problems. CellH5 would be a different alternative. And there
>>> is the BigDataViewer HDF5 format.
>>> I found H5AR very easy to use; it's really not much more than a container.
>>> In the end, ease of use decided it for us. If you find this format worthwhile,
>>> I can also provide you with Matlab and Python readers for the HDF5
>>> that transparently handle the container. So with the readers, you can work
>>> with the container as if it were a directory:
>>>     % Example: Matlab dir() and imread() wrappers:
>>>     vFileList = dirh5('/path/to/data.h5ar');
>>>     vImage = imreadh5(['/path/to/data.h5ar/' vFileList(1).name]);
>>>
>>>
>>> Cheers,
>>>
>>>     Mario
>>>
>>>
>>>
>>>
>>> On 12.10.2016 16:40, Kai Schleicher wrote:
>>>> Hi,
>>>>
>>>> We are looking for a container file format to store HTS/HCS data. This format
>>>> needs to be read and written by Bio-Formats.
>>>>
>>>> The reason is that our storage file system deals much better with a few large
>>>> files than with many small files. Hence we'd expect to increase usability and
>>>> performance drastically when handling the data.
>>>>
>>>> The output of our microscope is, for example, *.HTD files (metadata / structure)
>>>> next to *.TIF files (images and thumbnails). Here each position, channel,
>>>> time point and plane is saved as an individual *.TIF. This creates a lot of
>>>> images for each multi-well plate screen.
>>>>
>>>> So far we have been looking into CellH5 which looks promising but development
>>>> appears discontinued (correct me if I am wrong). Are there alternatives?
>>>>
>>>> Thanks for your help and cheers,
>>>> Kai
>
>
>
> Best regards,
>
>     Mario Emmenlauer
>
>
> --
> BioDataAnalysis GmbH, Mario Emmenlauer      Tel. Buero: +49-89-74677203
> Balanstr. 43                   mailto: memmenlauer * biodataanalysis.de
> D-81669 München                          http://www.biodataanalysis.de/
>




