[ome-devel] ome-tiff files: does it really needs to be namedxx.ome.tif ??

Cornelissen, Frans [PRDBE] FCORNELI at its.jnj.com
Wed Jan 14 13:38:52 GMT 2009


All,

 

A good example of an open-source Java software package that deals with
these issues is JHOVE (http://hul.harvard.edu/jhove/).

It verifies the internal file structure to establish the type of
file and image format.

This is needed anyway, since several completely different image file
formats use the same extension (e.g. .img is shared by NIfTI, Analyze, ...).
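
As a rough illustration of identifying a file by its content rather than
by its extension, here is a minimal Python sketch that looks only at the
leading signature bytes. It is not how JHOVE works internally; the
function name and the small signature table are assumptions made for the
example, and a real validator would go on to check the full structure.

```python
import io

# Leading byte signatures for a few container formats (illustrative subset).
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"II+\x00", "BigTIFF (little-endian)"),
    (b"MM\x00+", "BigTIFF (big-endian)"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
]


def sniff_format(stream):
    """Return a format label based on the first bytes of a binary stream.

    Hypothetical helper: returns None when no known signature matches.
    """
    head = stream.read(8)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return None


if __name__ == "__main__":
    # A renamed TIFF is still recognised, whatever its filename says.
    fake_file = io.BytesIO(b"II*\x00\x08\x00\x00\x00")
    print(sniff_format(fake_file))
```

A signature check like this only narrows down the container; telling an
OME-TIFF from a plain TIFF still requires reading the metadata inside.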

 

JHOVE is an extendable framework (we have added several proprietary file
format identifications to it)

 

JHOVE checks the well-formedness and the validity of the files, even the
values of the attribute fields (if you like).

So JHOVE could read the (OME-XML) ImageDescription structure to
decide on the "real" format.

 

See the attached png file

 

Regards, frans

 

-----Original Message-----
From: Ghislain Bonamy [mailto:GBonamy at gnf.org] 
Sent: Wednesday, 14 January 2009 3:52 AM
To: Curtis Rueden
Cc: Cornelissen, Frans [PRDBE]; ome-devel at lists.openmicroscopy.org.uk
Subject: RE: [ome-devel] ome-tiff files: does it really needs to be
namedxx.ome.tif ??

 

Curtis,

	I am glad that this issue is raised, as .ext1.ext2 is in no
way a standard extension.


Not a common practice perhaps, but I am curious what you see as the
practical pitfalls of this compound extension. It seems to me that using
.ome.tif gives us the best of both worlds: 1) an unambiguous extension
for the OME-TIFF format, and 2) compatibility with existing TIFF
software.

I concede this, and my comment was more about the form. In addition, I
was more focused on whether or not the reader should assume a file
format based on the extension, and on the fact that other TIFF-based
formats may have an extension other than .tif, such as the .flex files
from Evotec. This can be a problem, since renaming the file would make
it inaccessible to other systems. So perhaps it would make sense to
suppose that any unknown extension is a TIFF-based format, and to check
TIFFs first for OME compliance before treating them as a different
flavor. For instance, a Flex file with a .flex extension is considered
as such unless it has a tag indicating that it is an OME-compliant file
(in which case it would be handled by the OMETiffReader).
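
The lookup order proposed here can be sketched roughly as follows. This
is only an illustration in Python: `choose_reader`, `is_ome_xml` and the
reader labels are hypothetical names, not part of Bio-Formats, and the
OME check simply tests whether the ImageDescription text parses as XML
with an `OME` root element.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical mapping from extension to a reader label.
EXTENSION_READERS = {".flex": "flex", ".tif": "tiff", ".tiff": "tiff"}


def is_ome_xml(description):
    """Return True if the text is well-formed XML whose root element is OME."""
    if not description:
        return False
    text = description.strip("\x00 \t\r\n")  # TIFF ASCII values end with NUL
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return False
    # Drop a namespace prefix such as "{http://...}" before comparing.
    return root.tag.rsplit("}", 1)[-1] == "OME"


def choose_reader(filename, image_description):
    """Prefer OME metadata when present, else fall back on the extension.

    Unknown extensions are assumed to be plain TIFF, as suggested above.
    """
    if is_ome_xml(image_description):
        return "ome-tiff"
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_READERS.get(ext, "tiff")


if __name__ == "__main__":
    print(choose_reader("plate1.flex", '<OME xmlns="http://example.org/ome"/>'))
    print(choose_reader("plate1.flex", None))
```

The cost of this ordering is that the metadata must be read before the
extension is consulted, which is the performance concern raised elsewhere
in this thread.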

	Perhaps, using simply a .tiff with a format specific IFD would
make more sense.


We have discussed this idea in the past. Do you see any advantages to
your approach other than those you mention below? 

No, these are the advantages that I am referring to.

	In addition, this would allow for a mechanism to transform Tiff
based file formats more efficiently, and provide backward compatibility
with other readers.


How would a custom IFD entry be more efficient than the current
mechanism? Currently you could inject the OME-XML metadata extremely
efficiently by surgically overwriting the ImageDescription tag. The only
downside is that a TIFF cannot simultaneously be an OME-TIFF and some
other flavor of TIFF that also uses the ImageDescription tag for its
metadata. A custom IFD entry would alleviate that issue.

This is precisely my point.


How is the current specification not backwardly compatible with other
readers? The only way I can think of is if those other readers are also
expecting custom metadata from the same ImageDescription tag.

I meant that, as long as the custom IFD is preserved, a TIFF-based
format could still be read by the other readers.

	For instance the Opera .flex file is roughly a Tiff file with
some specific header (and in some cases a specific compression of the
pixel data). One could think that adding the OME-XML header under an OME
specific IFD would make the most sense. While the original IFD would
remain unchanged.


Is your concern that you do not wish to overwrite the Flex file's
existing ImageDescription tag? I checked a bunch of our sample Flex
files, and none of them have an existing ImageDescription tag. Are you
using this field?

Perhaps I am getting confused between IFDs and ImageDescription tags.
The Opera Flex file contains an IFD entry, 65200, which points to an XML
document containing the proprietary metadata. My idea would be to do the
same for OME-TIFF, so that the metadata can be read by OME-compliant or
proprietary readers.
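
To make the IFD/tag distinction concrete, here is a minimal Python sketch
that lists the entries of the first IFD of a classic TIFF, so that the
standard ImageDescription tag (270) and a private tag can be read side by
side. It assumes that the 65200 mentioned above is a private tag number;
`read_first_ifd` is a hypothetical helper, and BigTIFF and later IFDs are
not handled.

```python
import struct

# Size in bytes of one value for each TIFF field type (1=BYTE ... 12=DOUBLE).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1,
              7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8}

IMAGE_DESCRIPTION = 270   # standard TIFF tag
FLEX_XML = 65200          # assumed private tag, taken from the message above


def read_first_ifd(data):
    """Map tag number -> raw value bytes for the first IFD of a classic TIFF."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("missing TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    (n_entries,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    tags = {}
    for i in range(n_entries):
        start = ifd_offset + 2 + 12 * i  # each entry is 12 bytes
        tag, typ, count = struct.unpack(order + "HHI", data[start:start + 8])
        size = TYPE_SIZES.get(typ, 1) * count
        if size <= 4:
            # Small values are stored inline in the entry itself.
            raw = data[start + 8:start + 8 + size]
        else:
            # Larger values live elsewhere; the entry holds their offset.
            (offset,) = struct.unpack(order + "I", data[start + 8:start + 12])
            raw = data[offset:offset + size]
        tags[tag] = raw
    return tags
```

With something like this, a reader could check `tags.get(270)` for
OME-XML and `tags.get(65200)` for the proprietary block without either
one overwriting the other.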


The rationale thus far has been that any actual "image description" --
i.e., comment -- can go in the OME-XML block's Description tag. But I
understand that there are cases where this solution is undesirable: like
I said, if the same field holds metadata in some other structure, it
could be a problem. Examples include TIFFs saved by ImageJ, and Leica
TCS TIFFs.

Being able to combine these forms of metadata within a single TIFF is of
potential benefit. Is there anyone out there who wants to do this, and
wants to see the OME-TIFF specification changed?

	On another note and still about metadata. As the metadata can
become extremely large, would it make sense to provide a mechanism to
compress it using deflate for instance? Or is there a mechanism for
this?


For OME-TIFF, there is no mechanism to compress the XML at the moment.
You can do it with the OME-XML file format, using extension .omez, with
zlib. However, the metadata is a tiny fraction of the total data size,
even when represented in a rather inefficient XML format.

I agree with this, which is why it might be a bit easier to implement
this with the TIFF format. One could compress the pixel data using an
optimized algorithm such as JPEG 2000 or LZW, and compress the metadata
contained in the ImageDescription tag (or in an IFD) using one's
preferred algorithm. Although compressing this kind of data takes some
time, decompressing it is extremely fast (cf. GZIP compression). The
easiest solution would be to use an extra IFD to indicate whether the
metadata is compressed or not!
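
A minimal sketch of that idea in Python, assuming zlib (deflate) as the
algorithm and a simple integer flag standing in for the extra field.
Nothing here is part of the OME-TIFF specification; `pack_metadata` and
`unpack_metadata` are hypothetical names.

```python
import zlib

FLAG_RAW = 0       # payload is plain UTF-8 XML
FLAG_DEFLATE = 1   # payload is zlib-compressed UTF-8 XML


def pack_metadata(xml_text, compress=True):
    """Return (flag, payload) for storing an XML block, optionally deflated."""
    raw = xml_text.encode("utf-8")
    if compress:
        return FLAG_DEFLATE, zlib.compress(raw, 9)
    return FLAG_RAW, raw


def unpack_metadata(flag, payload):
    """Reverse pack_metadata, using the flag to decide whether to inflate."""
    raw = zlib.decompress(payload) if flag == FLAG_DEFLATE else payload
    return raw.decode("utf-8")


if __name__ == "__main__":
    # Repetitive XML, such as many ROI elements, compresses well.
    xml = "<OME>" + "<ROI ID='ROI:0'/>" * 1000 + "</OME>"
    flag, payload = pack_metadata(xml)
    print(len(xml.encode("utf-8")), "->", len(payload), "bytes")
    assert unpack_metadata(flag, payload) == xml
```

Keeping the flag separate from the payload means readers that do not
understand compression can at least detect it and skip the block.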


The major goals of OME-TIFF are performance and compatibility.
Uncompressed TIFF planes are essentially raw, and storing the metadata
in the first ImageDescription tag allows quick access (not to say that a
custom tag wouldn't be just as quick). Compressing the XML block would
increase the amount of time necessary to parse the metadata, though that
is no reason not to allow it as an option. If this option would really
significantly reduce the size of your data, we can discuss how best to
add such a facility.

Again, compression with, for instance, GZIP is slow in the compression
step but almost instantaneous when reading. In addition, right now
metadata are manageably small, but with ROIs and other flavors of
metadata they may become more and more substantial. I am merely
mentioning something which I foresee could be useful in the future,
while remaining optional depending on the user's needs.

In conclusion, I am resistant to changing the OME-TIFF specification
without a compelling practical benefit. The spec has already seen some
breaking changes that have made it difficult to maintain compatibility
within Bio-Formats, and I would prefer to avoid any further
complications in the implementation. But it would be foolish not to
discuss and evaluate any potential benefits to such changes.

In this case, unless someone in the community would directly benefit
from the custom IFD entry approach, I think the advantages are mostly
theoretical and would not warrant the disruption caused by changing the
specification again. Please feel free to argue if you disagree. :-)

Yes, this is still theoretical and it really depends on everyone's needs.
Obviously this would be most useful to us and other people who do
high-throughput imaging and collect millions of images for a screen, and
where metadata becomes larger and larger. Perhaps this would be nice to
keep in mind if an HTF format flavor is ever implemented. But once
again, perhaps I am the only one in this situation... so far ;)

 

-Ghislain

On Tue, Jan 13, 2009 at 3:09 PM, Ghislain Bonamy <GBonamy at gnf.org>
wrote:

Curtis, Frans,

 

I am glad that this issue is raised, as .ext1.ext2 is in no way a
standard extension.

 

Perhaps, using simply a .tiff with a format specific IFD would make more
sense. In addition, this would allow for a mechanism to transform Tiff
based file formats more efficiently, and provide backward compatibility
with other readers.

 

For instance the Opera .flex file is roughly a Tiff file with some
specific header (and in some cases a specific compression of the pixel
data). One could think that adding the OME-XML header under an OME
specific IFD would make the most sense. While the original IFD would
remain unchanged.

 

On another note and still about metadata. As the metadata can become
extremely large, would it make sense to provide a mechanism to compress
it using deflate for instance? Or is there a mechanism for this?

 

Best,

 

Ghislain Bonamy, PhD

__________________________________________

Research Investigator I

 

Genomic Institute of the

Novartis Research

Foundation

Department of Molecular & Cell Biology, room G214

10675 John Jay Hopkins Drive

San Diego CA 92121

USA

 

+1 (858) 812-1534 (W & F)

+1 (757) 941-4194 (H)

+1 (858) 354-7388 (M)

www.gnf.org

 

Hudson-Alpha Institute for Biotechnology

www.hudsonalpha.org <http://www.haib.org> 

 

From: ome-devel-bounces at lists.openmicroscopy.org.uk [mailto:
ome-devel-bounces at lists.openmicroscopy.org.uk] On Behalf Of Curtis
Rueden
Sent: Tuesday, January 13, 2009 1:00 PM
To: Cornelissen, Frans [PRDBE]
Cc: ome-devel at lists.openmicroscopy.org.uk
Subject: Re: [ome-devel] ome-tiff files: does it really needs to be
namedxx.ome.tif ??

 

Hi Frans,

	When using Tiff files, we would like to convert them to OME-tiff
	so that they do contain the OME-XML metadata.
	
	Currently the new files have to contain the .ome.tiff as extension.
	In our analysis processes, the altered name causes a disruption.


Originally, the specification did not require the .ome.tif extension,
but we decided it would reduce ambiguity to prefer a more specific
extension -- and the .ome.tif extension allows non-OME-aware TIFF
programs to continue seeing the files as regular TIFFs.

	Question: is it really a hard requirement that the .ome. part is
	in the filename?


At the moment, for Bio-Formats and hence OMERO, yes it is a hard
requirement. We are not necessarily opposed to parsing OME-TIFF metadata
out of files without the .ome.tif extension, but at the moment there are
some technical barriers to doing so efficiently.

	This in itself is no proof of the fact that the file *really*
	contains a valid OME-xml structure, so an application is probably
	going to check internally to decide whether it is an OME file
	anyway...


True. The same is true for every file extension -- the only way to
verify that the file *really* contains correctly structured data of the
indicated type is to attempt to fully parse it. However, file extension
is an extremely useful hint that greatly improves performance. In some
cases (e.g., certain raw data formats) it might even be impossible to
completely determine the file format without the filename extension.

	Could the .ome. extension requirement be removed for importing
	ome-tiff files into OMERO?


Yes, we always parse a TIFF file's ImageDescription block. Ideally, we
should be properly parsing any OME-XML we find there. However, as I
said, there are some performance challenges we need to sort out. The fix
shouldn't be too bad. We'll file a ticket to keep you posted.

-Curtis

On Mon, Jan 12, 2009 at 8:43 AM, Cornelissen, Frans [PRDBE] <
FCORNELI at its.jnj.com> wrote:

Hi,

When using Tiff files, we would like to convert them to OME-tiff so that
they do contain the OME-XML metadata.

Currently the new files have to contain the .ome.tiff as extension
In our analysis processes, the altered name causes a disruption.

Question: is it really a hard requirement that the .ome. part is in the
filename?
This in itself is no proof of the fact that the file *really* contains a
valid OME-xml structure, so an application is probably going to check
internally to decide whether it is an OME file anyway...

Could the .ome. extension requirement be removed for importing ome-tiff
files into OMERO?

Best regards, frans cornelissen
_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel

 

 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: JHOVE.png
Type: image/png
Size: 210580 bytes
Desc: JHOVE.png
Url : http://lists.openmicroscopy.org.uk/pipermail/ome-devel/attachments/20090114/b4365a15/attachment-0001.png 

