[ome-users] HDF5? Or something like it?

Jason Swedlow (Staff) j.r.swedlow at dundee.ac.uk
Thu Dec 22 12:04:29 GMT 2016


Hi Jake-

As always, thanks for your comments.  Responses in line.

On 21/12/2016 12:28, "ome-users on behalf of Jake Carroll" <ome-users-bounces at lists.openmicroscopy.org.uk on behalf of jake.carroll at uq.edu.au> wrote:

Hi,

Having just read the blog post here:

http://blog.openmicroscopy.org/future-plans/community/2016/12/21/omero-5-3/

Things are looking really wonderful and I look forward to deploying 5.3 into production.

Thanks, 5.3.0 will be a big step, and we’ll have more coming in the following point releases.


I have a couple of questions.

1. Given the data model is changing, what becomes of the existing structures, infrastructure, underlying semantics of the way 5.2.x platforms work? Is there a smart conversion mechanism or table mapper of some description when we go from A to B?


The data model changes should be described in the Model documentation [1]. We published these so that anyone interested could follow our work and prepare for any changes relevant to them. A couple of these model changes will affect OMERO directly, notably the dropped ROI properties and the reduced Marker enumerations. Additionally, shape transforms will be stored as OMERO objects instead of strings; see our blog post [2] for more information.


As always, more information will be available in the relevant section of the upcoming developer documentation [3] as we progress towards OMERO 5.3.0.


2. Key to OMERO’s acceptance (for us at least) when dealing with really (really) big data is throughput, performance of ingest and ability to then download those very large objects again at very high speed. Can you speak to any of the efforts made in this regard, or to progress? I’d love to liberate the data and see the full potential of my hardware platforms utilised by OMERO. Currently, due to certain eccentricities in ICE/Grid, it kind of feels like a giant funnel (a very functional and well designed funnel, I might add (!), but a funnel nonetheless).


Thanks again for raising this. Luke Hammond, Douglas Russell and others brought this up at our 2016 Annual Users Meeting [4], and it’s a very important issue. We run into similar problems with our work on IDR [5]. As always, getting work funded is a challenge; however, we owe huge thanks to Douglas Russell (Harvard Med School) and Damir Sudar (Knight Center, OHSU), who have led an effort to secure supplements to a BD2K award to the NIH COMMONS LINCS project [6] to fund three separate projects directed towards the “funnel problem”. These involve a true read-only OMERO server, portable session storage, and service routing. This work is underway and is being performed by the team at Glencoe Software. The project is public and all generated code will be released under the GPL. Apologies for not writing to the community sooner about this; the last two months have been a whirlwind. We’ll try to get an update on all this out in the New Year. **A huge thanks to Douglas Russell & Peter Sorger (HMS) and Damir Sudar & Joe Gray (OHSU) and the teams at LINCS HMS and LINCS OHSU for securing this funding from the BD2K programme**.


A note in this regard: requests for work to be done are always welcome, and we try to track and respond to all requests. During 2016, several entities (Carl Zeiss, 3i, PerkinElmer, EMBL-EBI, HMS, OHSU) needed work done and either performed the work themselves and contributed it to the community, or actively contributed to the costs of performing the work and delivering it to the community. Of course, we continue to receive contributions from several individuals and other entities [7]. We hope to grow this type of activity in 2017. This is particularly important as we have more work to do; for example, scaling imports is a major unfunded priority. Anyone wanting to make contributions should get in touch.


3. What of formats such as HDF5? From a computational model representation and programmatic manipulation perspective - it seems to be something persistently on everybody’s minds. Might we see progress down this path? If not – I am sure there is probably a good reason which I don’t understand. Would love to understand more about it…


Indeed, we have discussed this at length, for some time. We have been reluctant to take on the challenge of standardising binary containers; we have enough to do dealing with metadata. However, as imaging experiments become more complex, tools that properly store binary data and metadata become more important for our community. They are also becoming critical for making our software work: we are running into several examples where data storage structures are challenging, e.g., proprietary HCS systems are writing data from plate-based imaging as single TIFFs. This was (barely) manageable when users were collecting 2-4 fields and 2-3 channels per well. Now, with Cell Painting [8], 50 fields/well and timelapse imaging, a single plate generates 10^5 - 10^6 files on disk. Simply reading these data is very challenging, purely because of the file format implementation (the proprietary vendors who use these antiquated storage mechanisms know who they are!!!).
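
As a rough illustration of how the file counts above can arise, the sketch below multiplies out the plate dimensions. It is a back-of-envelope estimate only: the 384-well plate, the 5 channels, the 10 timepoints and the "one TIFF per field, channel and timepoint" layout are all assumptions made for the example, and the helper name files_per_plate is hypothetical.

```python
# Back-of-envelope estimate of files per plate when each image is its own TIFF.
# All dimensions below are illustrative assumptions, not measured values.

def files_per_plate(wells, fields_per_well, channels, timepoints=1):
    """Number of TIFFs if one file is written per field, channel and timepoint."""
    return wells * fields_per_well * channels * timepoints

# Older style of acquisition: a few fields and channels per well (384 wells assumed).
legacy = files_per_plate(wells=384, fields_per_well=4, channels=3)

# Dense acquisition: 50 fields/well, 5 channels (assumed), single timepoint.
dense = files_per_plate(wells=384, fields_per_well=50, channels=5)

# The same plate imaged as a 10-point timelapse (assumed).
timelapse = files_per_plate(wells=384, fields_per_well=50, channels=5, timepoints=10)

print(legacy, dense, timelapse)
```

Under these assumptions the count goes from a few thousand files per plate to roughly 10^5 for a single timepoint and close to 10^6 for the timelapse, which is the range quoted above.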


To this end, we have submitted a proposal (BBSRC TRDF [9]) to fund work that combines OME metadata with several established binary containers: a few different variants of HDF5 (e.g., BigDataViewer [10], Cellh5 [11]), the KLB format [12], and a few others. We haven’t found a single binary container that can satisfy all the needs of the broad community we serve, so we hope to support a few, if we get the resources. For example, some members of the community might need lossless compression (e.g., KLB), others might need BDV for Fiji-based visualisation, and so on. We note, with great foreboding, that HDF5 write-locking remains an issue and we are concerned about using HDF5-based formats in large, distributed environments. If anyone in the community has a good solution for this, we’d love to hear it (and use it!).


Thank you and keep up the amazing work. We really appreciate it and are looking forward very much to what is coming.


Thank you for all the usage, feedback and support.


Jason, Josh, Seb, Jean-Marie, Chris and the whole OME Team.


[1] https://www.openmicroscopy.org/site/support/ome-model/schemas/june-2016.html

[2] http://blog.openmicroscopy.org/data-model/future-plans/2016/06/20/shape-transforms/

[3] http://www.openmicroscopy.org/site/support/omero5.3/developers/whatsnew.html

[4] http://www.openmicroscopy.org/site/community/minutes/meetings/11th-annual-users-meeting-2016

[5] http://idr-demo.openmicroscopy.org/

[6] http://www.lincsproject.org/

[7] http://www.openmicroscopy.org/site/about/ome-contributors

[8] http://www.nature.com/nprot/journal/v11/n9/full/nprot.2016.105.html

[9] http://www.bbsrc.ac.uk/funding/filter/2016-tools-resources-development-fund-bioimaging/

[10] http://www.nature.com/nmeth/journal/v12/n6/full/nmeth.3392.html

[11] http://www.cellh5.org/

[12] http://www.nature.com/nprot/journal/v10/n11/abs/nprot.2015.111.html


The University of Dundee is a registered Scottish Charity, No: SC015096