<div dir="ltr"><br><div class="gmail_quote"><div dir="ltr">Dear Mario,</div><div dir="ltr"><br></div><div>Thanks a lot for the detailed explanation with concrete measurements, very valuable!</div><div><br></div><div>Did you or someone else try CellH5, or did you use your own HDF5 structure?</div><div><br></div><div>Are there any plans or recommendations from the OME side for Bio-Formats/OMERO integration?</div><div><br></div><div>Regards,</div><div>Manuel</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
Dear Manuel,<br>
<br>
On 10.10.2018 08:46, Manuel Stritt wrote:<br>
> Dear all,<br>
> <br>
> we're currently rethinking our high content screening workflow, and I'm looking for a good way to store all the data.<br>
> My idea is to create one container file per plate (e.g. HDF5) that holds all the images plus metadata.<br>
> Do you have any recommendations on that? (I think Mario once started a discussion on this topic.)<br>
> <br>
> If some kind of HDF5 is considered as a solution, a structure specification would still be needed.<br>
> Kai and Nico mentioned the CellH5 format, a flavor of HDF5.<br>
> As far as I can see, it is supported by Bio-Formats / OMERO. However, it's still unclear how<br>
> to conveniently pack the output of an instrument such as an Opera into the CellH5 format.<br>
> <br>
> In addition there was a discussion about an official OME-HDF5 format, right? <br>
> What's the current status for that?<br>
<br>
In our testing, the containers provide a big benefit for the file<br>
system and storage back end, especially when a large NAS storage is<br>
used. For a typical desktop application with a local spinning disk,<br>
there was virtually no difference in speed or file system overhead.<br>
<br>
On a NAS, we got a 30% reduction in storage space due to reduced chunk<br>
size overhead. This is a tunable parameter, so your mileage may vary!<br>
The reduction may be anywhere from 0% to 50% (or more),<br>
depending on your image file size and the file system chunk size.<br>
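<br>
To illustrate where such savings come from, here is a minimal Python sketch.<br>
It assumes the file system allocates space in whole chunks and ignores the<br>
container's own metadata. The function names, the 700 KiB image size and the<br>
1 MiB chunk size are made up for illustration, they are not our measurements.<br>
<pre>
```python
def allocated_size(file_size, chunk_size):
    """Bytes occupied on disk when space is allocated in whole chunks."""
    return ((file_size + chunk_size - 1) // chunk_size) * chunk_size


def container_saving(file_sizes, chunk_size):
    """Fraction of allocated space saved by packing all files into one container.

    Assumption of this sketch: the container adds no overhead of its own.
    """
    separate = sum(allocated_size(s, chunk_size) for s in file_sizes)
    packed = allocated_size(sum(file_sizes), chunk_size)
    return 1 - packed / separate


# Hypothetical plate: 1000 images of 700 KiB each, 1 MiB allocation chunks
sizes = [700 * 1024] * 1000
saving = container_saving(sizes, 1024 * 1024)
print(f"{saving:.1%}")
```
</pre>
With these made-up numbers the saving is a bit above 30%. Images that fill<br>
their chunks almost completely save close to nothing, and images much<br>
smaller than one chunk save far more.<br>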
<br>
Furthermore, we got significantly faster data transfer rates because<br>
the data is more contiguous. An rsync transfer of a full plate from/to<br>
the NAS was about twice as fast for the container as for<br>
the individual files.<br>
<br>
Last but not least, the container can apply transparent compression without<br>
modifying the original file, so you can apply bzip2 or<br>
other compression schemes to the raw file while keeping the full<br>
(proprietary) file intact and unchanged.<br>
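<br>
The property that matters here is a byte-for-byte round trip. The following<br>
Python sketch shows it with the standard bz2 module instead of a real<br>
container format; the "raw file" is hypothetical stand-in data, and how a<br>
particular container applies its compression filters is not shown.<br>
<pre>
```python
import bz2
import hashlib


def sha256(data):
    """Hex digest used to compare the original and the restored bytes."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the bytes of a proprietary raw image file (hypothetical content)
raw = bytes(range(256)) * 4096

stored = bz2.compress(raw)         # what a container would keep on disk
restored = bz2.decompress(stored)  # what a reader gets back on extraction

print(len(raw), len(stored))
print(sha256(raw) == sha256(restored))
```
</pre>
The checksum comparison is the check I would run after any pack/extract<br>
cycle, whatever compression scheme the container uses.<br>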
<br>
<br>
All this also comes at a price. There were situations where we would<br>
have liked to access images and it was not as easy as we had hoped. It's<br>
good if the database supports download of individual files, to make<br>
image access transparent for end users. But when the database is<br>
down, image access becomes quite hard for end users. Furthermore, there<br>
is an awkward range of ~100-1000 files where access is always a hassle,<br>
because manual download becomes too cumbersome and container extraction<br>
is typically not very comfortable.<br>
<br>
Long story short: containers are great, but I would really love to<br>
see them in combination with a simple, graphical, cross-platform file<br>
management utility that allows adding and extracting files. There are<br>
some such utilities, like HDFView [1], but they are not yet comparable<br>
to something like WinZip/WinRAR/7-Zip/...<br>
<br>
[1] <a href="https://support.hdfgroup.org/HDF5/Tutor/hdfview.html" rel="noreferrer" target="_blank">https://support.hdfgroup.org/HDF5/Tutor/hdfview.html</a><br>
<br>
Best regards,<br>
<br>
Mario Emmenlauer<br>
<br>
<br>
--<br>
BioDataAnalysis GmbH, Mario Emmenlauer Tel. Buero: +49-89-74677203<br>
Balanstr. 43 mailto: memmenlauer * <a href="http://biodataanalysis.de" rel="noreferrer" target="_blank">biodataanalysis.de</a><br>
D-81669 München <a href="http://www.biodataanalysis.de/" rel="noreferrer" target="_blank">http://www.biodataanalysis.de/</a><br>
</blockquote></div><br clear="all"><div><br></div><br></div>