[ome-devel] Fwd: File format for large data sets stored at multiple resolutions
Johan Henriksson
mahogny at areta.org
Wed May 6 22:45:19 BST 2015
Hi Nico!
First of all, have you tried the pyramid compression of JPEG2000? (I have
not!)
Second, last time I tried large datasets in OME-TIFF it was a huge issue. I
tried to convert our 40 GB+ recordings to OME-TIFF and got indexing times
from hell (up to 10 minutes). I never had time to properly investigate
this. Part of the problem might be that I changed the OME writer to add
JPEG-compressed data (since we already had a lot of JPEGs, and I did
not feel like converting those to PNGs).
JPEG2000 pyramids would help you with huge 2D images but not with huge 5D
datasets. I believe the problem (anyone please correct me here) is that the
OME reader first indexes the TIFF file, but in the worst case does so by going
through the entire file to find where each plane is(?). This is in no way
fast in the current implementation, as I suspect it jumps through the entire
dataset: TIFFs are essentially a linked list of planes. If your compressed
output file is 5 GB+ then this alone is really slow.
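
To illustrate the linked-list point, here is a minimal sketch in Python. It is
not the actual OME reader code. It builds a toy little-endian, classic-TIFF-like
byte stream (only two tags per plane, so not a valid image) and then walks the
chain of image file directories (IFDs). The helper names build_toy_tiff and
walk_ifds are made up for this example. The point is that finding plane N needs
one seek per preceding IFD.

```python
import io
import struct

def build_toy_tiff(n_planes, plane_bytes=16):
    """Toy layout: 8-byte header, then [pixel data][IFD] repeated per plane."""
    buf = io.BytesIO()
    buf.write(b"II" + struct.pack("<HI", 42, 0))  # first-IFD offset patched below
    pointer_pos = 4  # file position of the field that must point at the next IFD
    for i in range(n_planes):
        data_off = buf.tell()
        buf.write(bytes([i % 256]) * plane_bytes)
        ifd_off = buf.tell()
        # Patch the previous "next IFD" pointer (or the header) to this IFD.
        buf.seek(pointer_pos)
        buf.write(struct.pack("<I", ifd_off))
        buf.seek(ifd_off)
        buf.write(struct.pack("<H", 2))                          # two entries
        buf.write(struct.pack("<HHII", 273, 4, 1, data_off))     # StripOffsets
        buf.write(struct.pack("<HHII", 279, 4, 1, plane_bytes))  # StripByteCounts
        pointer_pos = buf.tell()
        buf.write(struct.pack("<I", 0))                          # 0 = end of chain
    return buf.getvalue()

def walk_ifds(f):
    """Follow the IFD chain; return ([(offset, length), ...], number_of_seeks)."""
    f.seek(0)
    endian = {b"II": "<", b"MM": ">"}[f.read(2)]
    magic, offset = struct.unpack(endian + "HI", f.read(6))
    if magic != 42:
        raise ValueError("not a classic TIFF header")
    planes, seeks = [], 0
    while offset != 0:
        f.seek(offset)
        seeks += 1
        (n_entries,) = struct.unpack(endian + "H", f.read(2))
        raw = f.read(12 * n_entries)
        tags = {}
        for k in range(n_entries):
            # Simplification: treat every value field as one 32-bit integer.
            tag, _type, _count, value = struct.unpack_from(endian + "HHII", raw, 12 * k)
            tags[tag] = value
        planes.append((tags.get(273), tags.get(279)))
        (offset,) = struct.unpack(endian + "I", f.read(4))
    return planes, seeks

planes, seeks = walk_ifds(io.BytesIO(build_toy_tiff(5)))
print(seeks, planes[:2])
```

On a real multi-gigabyte file each of those seeks lands far from the previous
one, so the cost grows with the number of planes.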
My solution to this would have been an extension with a special data object
containing pointers to most or all planes, in a single place. Thus very few
reads would be needed to map all planes. But then I moved to another lab
and never had time to return to this. I still think it's a problem/solution
worth reconsidering. If a TIFF reader does not understand such a special
data object it would just ignore it, but specialized readers could gain a
lot of speed by reading it.
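
A rough sketch of what I mean, in Python. Everything here is an assumption for
illustration and not an existing OME-TIFF feature: the private tag number 65000,
the table layout (one pair of 32-bit offset and length per plane), and the
function names. The first IFD carries one extra tag that points at a table of
all plane locations, so an aware reader needs a couple of reads instead of one
seek per plane, and an unaware reader simply skips the unknown tag.

```python
import io
import struct

PLANE_INDEX_TAG = 65000  # hypothetical private tag, chosen only for this sketch
TYPE_UNDEFINED = 7       # TIFF field type for opaque bytes

def pack_plane_index(planes):
    """Serialise [(offset, length), ...] as consecutive little-endian uint32 pairs."""
    return b"".join(struct.pack("<II", off, length) for off, length in planes)

def write_toy_file_with_index(planes):
    """Toy file: header, one IFD holding only the index tag, then the table."""
    table = pack_plane_index(planes)
    ifd_off = 8
    table_off = ifd_off + 2 + 12 + 4  # entry count + one entry + next-IFD pointer
    buf = io.BytesIO()
    buf.write(b"II" + struct.pack("<HI", 42, ifd_off))
    buf.write(struct.pack("<H", 1))
    buf.write(struct.pack("<HHII", PLANE_INDEX_TAG, TYPE_UNDEFINED,
                          len(table), table_off))
    buf.write(struct.pack("<I", 0))
    buf.write(table)
    return buf.getvalue()

def read_plane_index(f):
    """Return the plane table if the first IFD has the index tag, else None."""
    f.seek(4)
    (ifd_off,) = struct.unpack("<I", f.read(4))
    f.seek(ifd_off)
    (n_entries,) = struct.unpack("<H", f.read(2))
    for _ in range(n_entries):
        tag, _type, count, value = struct.unpack("<HHII", f.read(12))
        if tag == PLANE_INDEX_TAG:
            f.seek(value)
            table = f.read(count)
            return [struct.unpack_from("<II", table, i) for i in range(0, count, 8)]
    return None  # no index present: fall back to walking the IFD chain

wanted = [(100, 16), (200, 16), (300, 32)]
print(read_plane_index(io.BytesIO(write_toy_file_with_index(wanted))))
```

A real design would also have to cover files over 4 GB (64-bit offsets) and
keeping the table in sync when planes are appended.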
cheers,
Johan
On Mon, May 4, 2015 at 2:08 AM, Nico Stuurman <nico.stuurman at ucsf.edu>
wrote:
>
> Dear all,
>
> I have been running into more and more individual efforts to create new
> file formats to deal with large datasets that need to be stored at
> multiple resolutions to enable fast feedback to the user. Examples are
> the hdf5 format used by the BigDataViewer plugin by Tobias Pietzsch and
> Stephan Preibisch, the hdf5 format used by Chimera (UCSF-based package
> primarily for crystallography and EM that also has amazing capabilities
> for 3D visualization of light microscopy data), the Micro-Manager
> SlideExplorer plugin for which Arthur Edelstein developed his own
> storage system, and the Micro-Manager plugin "Magellan" that Henry
> Pinkard is developing right now, which also stores multiple-resolution
> versions of the data on disk. Doubtlessly, there are many more examples.
>
> Even when conversion between these formats is possible (as long as they
> are reasonably documented), conversion becomes time consuming and takes
> up large amounts of disk space, simply because the data sets have become
> gigantic. The reasons why everyone designs their own formats are also
> clear: there simply is no standard (at least that I am aware of; if
> there is, please do let me know!) that lets one store gigantic datasets
> while giving fast access to the data in multiple resolutions.
>
> Since you guys have created the standard in light microscopy with
> ome.tif, I assume that you have thoughts on what a new standard (hdf5
> based?) should look like. In any case, I am very much looking forward
> to hearing your thoughts and I will be happy to help avoid a wild growth
> of different formats that we will have to live with for years to come if
> we do not take action soon.
>
> Best,
>
> Nico
--
-----------------------------------------------------------
Johan Henriksson, PhD
Karolinska Institutet / European Bioinformatics Institute (EMBL-EBI)
Labstory - Integrated laboratory documentation and databases (
www.labstory.se)
http://mahogny.areta.org http://www.endrov.net