[ome-devel] tif import performance
Jason Swedlow
jason at lifesci.dundee.ac.uk
Sun Nov 14 23:52:02 GMT 2004
Richard-
The problem is that for many TIFF based files, ZWT (focus, channel,
timepoint) or other info (e.g., plate position, in a screen) is encoded
in the file name, and not in the TIFF tags. Note this is not OME, but
the file formats we are provided by the commercial vendors. Anyway, at
least some of the basic info OME needs for import is accessed during
this process.
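
To make the file-name point concrete, here is a minimal Python sketch of pulling Z/W/T indices out of a name. It is not OME's importer code. The naming scheme (`<prefix>_z<focus>_w<channel>_t<timepoint>.tif`), the pattern, and the function name `parse_zwt` are all assumptions made up for this illustration; real vendor conventions vary widely.

```python
import re
from typing import Dict, Optional

# Hypothetical naming scheme: <prefix>_z<focus>_w<channel>_t<timepoint>.tif
# This pattern is an assumption for illustration, not a real vendor format.
ZWT_PATTERN = re.compile(
    r"^(?P<prefix>.+)_z(?P<z>\d+)_w(?P<w>\d+)_t(?P<t>\d+)\.tiff?$",
    re.IGNORECASE,
)


def parse_zwt(filename: str) -> Optional[Dict[str, object]]:
    """Return prefix and integer Z/W/T indices, or None if the name does not match."""
    match = ZWT_PATTERN.match(filename)
    if match is None:
        return None
    return {
        "prefix": match.group("prefix"),
        "z": int(match.group("z")),  # focus position
        "w": int(match.group("w")),  # channel (wavelength)
        "t": int(match.group("t")),  # timepoint
    }


if __name__ == "__main__":
    print(parse_zwt("exp1_z003_w2_t0010.tif"))
    print(parse_zwt("plain.tif"))
```

A name that does not follow the assumed scheme yields `None`, so a caller can fall back to whatever the TIFF tags provide.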
As for the performance issues (your previous email), you can certainly
criticize the approach we have implemented. There are historical
reasons for this, that aren't really worth going into. But OME import
will always be much slower than a simple copy, because a large number
of calcs are performed, and then written to the database. This does
take real time. Our general sense of a use case is that data is
acquired, imported, and then analysed. So far we have not worked on a
system where the time from beginning an import to actually seeing the
data and/or analysing it is critical. In many cases, we expect that an
import of many Gbytes of data can take minutes, depending on the type
of data. If you let us know your requirements -- how many TIFFs, how
big, etc., we can discuss the limitations and see what we can do.
Finally, you should note that OME is open source but not because our
primary goal is to provide a service to the greater community. We are
funded by grants to do our own work, and we are developing OME to
support those purposes. We are currently making the transition from a
proof-of-concept project to a production project, but we aren't there
yet. Our latest difficulties in installs are an example of that. You
will doubtless find all sorts of problems in our system, where our own
requirements do not match or anticipate your own. We welcome your
commentary, but as we are still growing and supporting our own work, we
don't have resources to fix the things you might want, at least right
away (we do endeavor to fix system bugs ASAP). However, we welcome
your, and others', help, and will support that as much as possible. Our
group of developers keeps growing, largely through people who need the
tools we are working on, and have the skills to help.
The problem we are trying to address is huge. We have made a lot of
progress, but still have a long way to go. If you can, please do help
us out -- we'd love it.
Cheers,
Jason
On 14 Nov 2004, at 23:11, Richard Beare wrote:
> Hi,
> I've been thinking about this issue a bit further, so I thought I'd
> better check to see whether I have the order of events correct in my
> head.
>
> When an "ome import x.tif" command is run, does the following happen:
>
> x.tif is given to OMEIS, which presumably converts it to an internal
> representation.
>
> ome then determines the tif tags by calling the readData() function
> which in turn calls the OMEIS ReadFile function.
>
> -------------------------
>
> I'm not completely sure why the tiff tags need to be read at all - is
> it just so that the dimensions and types are known?
>
> However, ome has direct access to the tif file, so it could presumably
> read the tags using tifflib before entering the file into the image
> server. I guess this could be seen as moving the feature creep to
> somewhere else.
>
> Sorry if this is a dumb question, I'm still trying to learn how
> everything fits together.....
>
> Macura, Tomasz (NIH/NIA/IRP) wrote:
>> Dear Richard,
>> Tiff tag reading is indeed a bottleneck. I am not sure what you mean by
>> perl->libtiff, but our tiff tag reading library is
>> OME/src/perl2/OME/ImportEngine/TIFFUtils.pm .
>> Reading tiff tags amounts to doing many reads (readData() calls) on the
>> original file. Each call to readData() is much less efficient than ideal
>> because readData() has to get the data from OMEIS (using the OMEIS method
>> ReadFile(), http://www.openmicroscopy.org/api/omeis/files.html#ReadFile)
>> and there is a lot of surrounding logic written in Perl that deals with
>> caching. As you can see, due to our requirements with regard to OMEIS we
>> can't use Perl's built-in libraries.
>> The end result is that the TIFF Perl command readTiffIFD() takes a lot
>> (10-100 times) longer than libtiff's tiffinfo CLI program.
>> The optimizations I made to improve tiff tag reading were (1) TIFF caching
>> and (2) increased IFD specificity. So right now the tags have to be read
>> only once and it's possible to specify quite explicitly which tags are
>> required. This means that as few readTiffIFD() calls as possible are made
>> per image.
>> In my limited imagination, the only way of improving performance is to get
>> OMEIS (which is written in C) to use libtiff to get the TIFF tags and send
>> them over, en masse, to the ImportEngine.
>> Implementing this is surely very feasible but I am not able to comment on
>> the larger issues (e.g. OMEIS feature creep).
>> Regards,
>> Tom
>
>
> --
> Richard Beare, CSIRO Mathematical & Information Sciences
> Locked Bag 17, North Ryde, NSW 1670, Australia
> Phone: +61-2-93253221 (GMT+~10hrs) Fax: +61-2-93253200
>
> Richard.Beare at csiro.au
>
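
The quoted message attributes the tag-reading cost to many small reads on the original file. As a rough illustration of the alternative (fetch the bytes once, then decode the tags in memory), here is a minimal Python sketch that parses the first IFD of a classic TIFF from a byte buffer. It is not OME's TIFFUtils.pm and it does not use libtiff. The function name `read_first_ifd` is hypothetical, and only single-valued SHORT and LONG entries are decoded; the layout itself (byte-order mark, magic number 42, 12-byte IFD entries) follows the TIFF 6.0 format.

```python
import struct


def read_first_ifd(data: bytes) -> dict:
    """Parse the first IFD of a classic TIFF held entirely in memory.

    Returns {tag_number: value}. Single-valued SHORT and LONG entries are
    decoded; any other entry is returned as its raw 4-byte value/offset field.
    """
    # Bytes 0-1 give the byte order: "II" = little-endian, "MM" = big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF: bad byte-order mark")

    # Bytes 2-3 hold the magic number 42, bytes 4-7 the first IFD offset.
    magic, ifd_offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")

    # The IFD starts with a 2-byte entry count, followed by 12-byte entries.
    (n_entries,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(n_entries):
        entry_at = ifd_offset + 2 + 12 * i
        tag, field_type, count = struct.unpack_from(order + "HHI", data, entry_at)
        value_at = entry_at + 8  # 4-byte value/offset field
        if field_type == 3 and count == 1:    # SHORT, stored inline
            (value,) = struct.unpack_from(order + "H", data, value_at)
        elif field_type == 4 and count == 1:  # LONG, stored inline
            (value,) = struct.unpack_from(order + "I", data, value_at)
        else:                                 # left undecoded in this sketch
            value = data[value_at:value_at + 4]
        tags[tag] = value
    return tags


if __name__ == "__main__":
    # Build a tiny two-tag TIFF header in memory: ImageWidth (256) as SHORT,
    # ImageLength (257) as LONG. There is no pixel data; this only exercises
    # the parser.
    header = b"II" + struct.pack("<HI", 42, 8)
    entries = struct.pack("<HHIHH", 256, 3, 1, 640, 0)
    entries += struct.pack("<HHII", 257, 4, 1, 480)
    ifd = struct.pack("<H", 2) + entries + struct.pack("<I", 0)
    print(read_first_ifd(header + ifd))
```

Once the buffer is in hand, every tag lookup is a local `struct` call, so the number of round trips to wherever the file lives no longer grows with the number of tags requested.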
NOTE NEW EMAIL ADDRESS: jason at lifesci.dundee.ac.uk
**************************
MSI/WTB Complex
The University of Dundee
Dow Street
Dundee DD1 5EH
United Kingdom
phone (01382) 345819
Intl phone: 44 1382 345819
FAX (01382) 348072
email: jason at lifesci.dundee.ac.uk
Lab Page: http://www.dundee.ac.uk/lifesciences/swedlow/
Open Microscopy Environment: http://www.openmicroscopy.org
**************************