[ome-devel] ome-tiff files: does it really needs to be namedxx.ome.tif ??

Curtis Rueden ctrueden at wisc.edu
Fri Apr 10 19:51:21 BST 2009


Hi Frans & Ghislain,

I know it has been a long time since this thread, but I wanted to point out
that we removed the dependency on .ome.tif extension in Bio-Formats as of
r4904 (03/11).
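
As an illustration only (this is not the Bio-Formats implementation, which is written in Java), extension-independent detection could be sketched in Python roughly as follows. The sketch reads the first IFD of a classic TIFF, pulls out the ImageDescription tag (270), and applies a crude test for an OME root element. The function names, the regular expression, and the decision to skip BigTIFF are assumptions made for this example.

```python
import re
import struct

IMAGE_DESCRIPTION = 270  # standard TIFF tag number for ImageDescription
ASCII = 2                # TIFF field type for NUL-terminated text

# Crude test for an OME root element, with or without a namespace prefix.
OME_ROOT = re.compile(r"<(?:\w+:)?OME[\s>]")


def first_image_description(data):
    """Return the ImageDescription text of the first IFD, or None."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        return None  # not a TIFF byte-order mark
    try:
        magic, ifd_offset = struct.unpack_from(order + "HI", data, 2)
        if magic != 42:
            return None  # classic TIFF only; BigTIFF (43) is not handled
        (count,) = struct.unpack_from(order + "H", data, ifd_offset)
        for i in range(count):
            entry = ifd_offset + 2 + 12 * i
            tag, ftype, n, value = struct.unpack_from(
                order + "HHI4s", data, entry)
            if tag == IMAGE_DESCRIPTION and ftype == ASCII:
                if n <= 4:
                    raw = value[:n]  # short values are stored inline
                else:
                    (offset,) = struct.unpack(order + "I", value)
                    raw = data[offset:offset + n]
                return raw.rstrip(b"\0").decode("utf-8", errors="replace")
    except struct.error:
        return None  # truncated or malformed file
    return None


def looks_like_ome_tiff(data):
    """True if the first ImageDescription appears to hold OME-XML."""
    text = first_image_description(data)
    return text is not None and OME_ROOT.search(text) is not None
```

Calling looks_like_ome_tiff(open(path, "rb").read()) is enough for a quick check. A real reader would seek to the IFD instead of loading the whole file, and would parse the XML properly rather than pattern-match it.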

> JHOVE is an extendable framework (we have added several proprietary file
> format identifications to it)

Bio-Formats has very similar logic to verify internal file structure. It may
be worth using JHOVE instead of our current API, but it is a low priority
issue for us. I have filed ticket #358 so we don't forget:

  https://skyking.microscopy.wisc.edu/trac/java/ticket/358

Regards,
Curtis

On Wed, Jan 14, 2009 at 8:38 AM, Cornelissen, Frans [PRDBE] <
FCORNELI at its.jnj.com> wrote:

>  All,
>
>
>
> A good example of an open-source Java software package that deals with
> these issues is JHOVE (http://hul.harvard.edu/jhove/).
>
> It verifies the internal file structure to establish the file and image
> format.
>
> This is needed anyway, since several completely different image file
> formats use the same extension (e.g. .img for NIfTI, Analyze, …).
>
>
>
> JHOVE is an extendable framework (we have added several proprietary file
> format identifications to it)
>
>
>
> JHOVE checks the well-formedness and the validity of the files, even the
> values of the attribute fields (if you like).
>
> So JHOVE could read the (OME-XML) ImageDescription structure to decide
> on the “real” format.
>
>
>
> See the attached png file
>
>
>
> Regards, frans
>
>
>
> -----Original Message-----
> *From:* Ghislain Bonamy [mailto:GBonamy at gnf.org]
> *Sent:* Wednesday, 14 January 2009 3:52 AM
> *To:* Curtis Rueden
> *Cc:* Cornelissen, Frans [PRDBE]; ome-devel at lists.openmicroscopy.org.uk
> *Subject:* RE: [ome-devel] ome-tiff files: does it really needs to be
> namedxx.ome.tif ??
>
>
>
> Curtis,
>
>  I am glad that this issue has been raised, as .ext1.ext2 is in no way a
> standard extension.
>
>
> Not a common practice perhaps, but I am curious what you see as the
> practical pitfalls of this compound extension. It seems to me that using
> .ome.tif gives us the best of both worlds: 1) an unambiguous extension for
> the OME-TIFF format, and 2) compatibility with existing TIFF software.
>
> I concede this, and my comment was more about the form. In addition, I was
> more focused on whether or not the reader should assume a file format based
> on the extension, and on the fact that other TIFF-based formats may have an
> extension other than .tif, such as the .flex files from Evotec. This can
> be a problem since renaming the file would make it inaccessible to other
> file systems etc. So perhaps it would make sense to suppose that any unknown
> extension is a TIFF-based format, and to first assume that a TIFF is OME
> compliant before treating it as a different flavor. For instance, a flex
> file with a flex extension is considered as such unless it has a tag
> indicating that it is an OME-compliant file (in which case it would be
> handled by the OMETiffReader).
>
>  Perhaps, using simply a .tiff with a format specific IFD would make more
> sense.
>
>
> We have discussed this idea in the past. Do you see any advantages to your
> approach other than those you mention below?
>
> No, these are the advantages that I am referring to.
>
>  In addition, this would allow for a mechanism to transform Tiff based
> file formats more efficiently, and provide backward compatibility with other
> readers.
>
>
> How would a custom IFD entry be more efficient than the current mechanism?
> Currently you could inject the OME-XML metadata extremely efficiently by
> surgically overwriting the ImageDescription tag. The only downside is that a
> TIFF cannot simultaneously be an OME-TIFF and some other flavor of TIFF that
> also uses the ImageDescription tag for its metadata. A custom IFD entry
> would alleviate that issue.
>
> This is precisely my point.
>
>
> How is the current specification not backwardly compatible with other
> readers? The only way I can think of is if those other readers are also
> expecting custom metadata from the same ImageDescription tag.
>
> I meant that the custom IFD being conserved, a TIFF format could still be
> read by the other readers.
>
>  For instance the Opera .flex file is roughly a Tiff file with some
> specific header (and in some cases a specific compression of the pixel
> data). One could think that adding the OME-XML header under an OME specific
> IFD would make the most sense, while the original IFD would remain
> unchanged.
>
>
> Is your concern that you do not wish to overwrite the Flex file's existing
> ImageDescription tag? I checked a bunch of our sample Flex files, and none
> of them have an existing ImageDescription tag. Are you using this field?
>
> Perhaps I am getting confused between IFDs and ImageDescription tags. The
> Opera Flex file contains an IFD = 65200, which points to an XML file
> containing the proprietary metadata. My idea would be to do the same for
> OME-TIFF so that the metadata can be read by OME-compliant or proprietary
> readers.
>
>
> The rationale thus far has been that any actual "image description" --
> i.e., comment -- can go in the OME-XML block's Description tag. But I
> understand that there are cases where this solution is undesirable: like I
> said, if the same field holds metadata in some other structure, it could be
> a problem. Examples include TIFFs saved by ImageJ, and Leica TCS TIFFs.
>
> Being able to combine these forms of metadata within a single TIFF is of
> potential benefit. Is there anyone out there who wants to do this, and wants
> to see the OME-TIFF specification changed?
>
>  On another note and still about metadata. As the metadata can become
> extremely large, would it make sense to provide a mechanism to compress it
> using deflate for instance? Or is there a mechanism for this?
>
>
> For OME-TIFF, there is no mechanism to compress the XML at the moment. You
> can do it with the OME-XML file format, using extension .omez, with zlib.
> However, the metadata is a tiny fraction of the total data size, even when
> represented in a rather inefficient XML format.
>
> I agree with this, which is why this might be a bit easier to implement in
> the TIFF format. One could compress the pixel data using an optimized
> algorithm such as JPEG2000, LZW… and compress the metadata contained in
> the ImageDescription tag (or in an IFD) using one's preferred algorithm.
> Although compressing this kind of data takes some time, decompressing it is
> extremely fast (cf. GZIP compression). The easiest solution would be
> to use an extra IFD to indicate whether the metadata is compressed or not!
>
>
> The major goals of OME-TIFF are performance and compatibility. Uncompressed
> TIFF planes are essentially raw, and storing the metadata in the first
> ImageDescription tag allows quick access (not to say that a custom tag
> wouldn't be just as quick). Compressing the XML block would increase the
> amount of time necessary to parse the metadata, though that is no reason not
> to allow it as an option. If this option would really significantly reduce
> the size of your data, we can discuss how best to add such a facility.
>
> Again, allowing compression using, for instance, GZIP is slow for the
> compression step but almost instantaneous when reading. In addition, right
> now metadata are manageably small, but with ROIs and other flavors of
> metadata they may become more and more substantial. I am merely mentioning
> something which I foresee could be useful in the future, while remaining
> optional depending on the user's needs.
>
> In conclusion, I am resistant to changing the OME-TIFF specification
> without a compelling practical benefit. The spec has already seen some
> breaking changes that have made it difficult to maintain compatibility
> within Bio-Formats, and I would prefer to avoid any further complications in
> the implementation. But it would be foolish not to discuss and evaluate any
> potential benefits to such changes.
>
> In this case, unless someone in the community would directly benefit from
> the custom IFD entry approach, I think the advantages are mostly theoretical
> and would not warrant the disruption caused by changing the specification
> again. Please feel free to argue if you disagree. :-)
>
> Yes, this is still theoretical and it really depends on everyone’s needs.
> Obviously this would be most useful to us and other people who do
> high-throughput imaging and collect millions of images for a screen, where
> the metadata becomes larger and larger. Perhaps this would be nice to
> keep in mind if an HTF format flavor is ever implemented. But once again,
> perhaps I am the only one in this situation… so far ;)
>
>
>
> -Ghislain
>
> On Tue, Jan 13, 2009 at 3:09 PM, Ghislain Bonamy <GBonamy at gnf.org> wrote:
>
> Curtis, Frans,
>
>
>
> I am glad that this issue has been raised, as .ext1.ext2 is in no way a
> standard extension.
>
>
>
> Perhaps, using simply a .tiff with a format specific IFD would make more
> sense. In addition, this would allow for a mechanism to transform Tiff based
> file formats more efficiently, and provide backward compatibility with other
> readers.
>
>
>
> For instance the Opera .flex file is roughly a Tiff file with some specific
> header (and in some cases a specific compression of the pixel data). One
> could think that adding the OME-XML header under an OME specific IFD would
> make the most sense, while the original IFD would remain unchanged.
>
>
>
> On another note and still about metadata. As the metadata can become
> extremely large, would it make sense to provide a mechanism to compress it
> using deflate for instance? Or is there a mechanism for this?
>
>
>
> Best,
>
>
>
> Ghislain Bonamy, PhD
>
> __________________________________________
>
> Research Investigator I
>
>
>
> Genomic Institute of the
>
> Novartis Research
>
> Foundation
>
> Department of Molecular & Cell Biology, room G214
>
> 10675 John Jay Hopkins Drive
>
> San Diego CA 92121
>
> USA
>
>
>
> +1 (858) 812-1534 (W & F)
>
> +1 (757) 941-4194 (H)
>
> +1 (858) 354-7388 (M)
>
> www.gnf.org
>
>
>
> Hudson-Alpha Institute for Biotechnology
>
> www.hudsonalpha.org <http://www.haib.org>
>
>
>
> *From:* ome-devel-bounces at lists.openmicroscopy.org.uk [mailto:
> ome-devel-bounces at lists.openmicroscopy.org.uk] *On Behalf Of *Curtis
> Rueden
> *Sent:* Tuesday, January 13, 2009 1:00 PM
> *To:* Cornelissen, Frans [PRDBE]
> *Cc:* ome-devel at lists.openmicroscopy.org.uk
> *Subject:* Re: [ome-devel] ome-tiff files: does it really needs to be
> namedxx.ome.tif ??
>
>
>
> Hi Frans,
>
> When using Tiff files, we would like to convert them to OME-tiff so that
> they do contain the OME-XML metadata.
>
> Currently the new files have to have .ome.tiff as their extension.
> In our analysis processes, the altered name causes a disruption.
>
>
> Originally, the specification did not require the .ome.tif extension, but
> we decided it would reduce ambiguity to prefer a more specific extension --
> and the .ome.tif extension allows non-OME-aware TIFF programs to continue
> seeing the files as regular TIFFs.
>
> Question: is it really a hard requirement that the .ome. part is in the
> filename?
>
>
> At the moment, for Bio-Formats and hence OMERO, yes it is a hard
> requirement. We are not necessarily opposed to parsing OME-TIFF metadata out
> of files without the .ome.tif extension, but at the moment there are some
> technical barriers to doing so efficiently.
>
> This in itself is no proof of the fact that the file *really* contains a
> valid OME-xml structure, so an application is probably going to check
> internally to decide whether it is an OME file anyway...
>
>
> True. The same is true for every file extension -- the only way to verify
> that the file *really* contains correctly structured data of the indicated
> type is to attempt to fully parse it. However, file extension is an
> extremely useful hint that greatly improves performance. In some cases
> (e.g., certain raw data formats) it might even be impossible to completely
> determine the file format without the filename extension.
>
> Could the .ome. extension requirement be removed for importing ome-tiff
> files into OMERO?
>
>
> Yes, we always parse a TIFF file's ImageDescription block. Ideally, we
> should be properly parsing any OME-XML we find there. However, as I said,
> there are some performance challenges we need to sort out. The fix shouldn't
> be too bad. We'll file a ticket to keep you posted.
>
> -Curtis
>
> On Mon, Jan 12, 2009 at 8:43 AM, Cornelissen, Frans [PRDBE] <
> FCORNELI at its.jnj.com> wrote:
>
> Hi,
>
> When using Tiff files, we would like to convert them to OME-tiff so that
> they do contain the OME-XML metadata.
>
> Currently the new files have to have .ome.tiff as their extension.
> In our analysis processes, the altered name causes a disruption.
>
> Question: is it really a hard requirement that the .ome. part is in the
> filename?
> This in itself is no proof of the fact that the file *really* contains a
> valid OME-xml structure, so an application is probably going to check
> internally to decide whether it is an OME file anyway...
>
> Could the .ome. extension requirement be removed for importing ome-tiff
> files into OMERO?
>
> Best regards, frans cornelissen
> _______________________________________________
> ome-devel mailing list
> ome-devel at lists.openmicroscopy.org.uk
> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
>
>
>
>
>
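
On the metadata-compression question raised in the quoted thread, the trade-off can be tried out with a few lines of Python. This is only a sketch: it uses zlib (the deflate implementation the thread mentions for .omez), it is not part of the OME-TIFF specification, and the repetitive sample XML and the function names are made up for illustration.

```python
import zlib


def pack_metadata(xml_text, level=9):
    """Deflate-compress an XML metadata string into bytes."""
    return zlib.compress(xml_text.encode("utf-8"), level)


def unpack_metadata(blob):
    """Inverse of pack_metadata: inflate and decode back to text."""
    return zlib.decompress(blob).decode("utf-8")


def sample_metadata(n_rois=1000):
    """Build a made-up, highly repetitive XML block (not valid OME-XML)."""
    rois = "".join(
        '<ROI ID="ROI:%d" X="%d" Y="%d"/>' % (i, i, i) for i in range(n_rois)
    )
    return "<OME>" + rois + "</OME>"


if __name__ == "__main__":
    xml_text = sample_metadata()
    blob = pack_metadata(xml_text)
    # Report the sizes before and after compression.
    print(len(xml_text.encode("utf-8")), "bytes ->", len(blob), "bytes")
    assert unpack_metadata(blob) == xml_text
```

A reader would still need some flag (for example a dedicated tag, as suggested in the thread) to know whether the stored block is compressed; that part is not shown here.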