[ome-devel] File import help

Thu Jun 6 13:01:03 BST 2019

Hi Brian

On 4 Jun 2019, at 21:06, Brian Bodensteiner <brian at intelligent-imaging.com> wrote:

Hi Sebastian,

Thanks for the note, it was very helpful!

So first, I’ve tracked down the “File not open.” message and it is indeed our message. And exactly as you suspected it means OMERO opens and closes files differently than FIJI, which calls initFile and doesn’t close the file until it’s done working.

Yes, in general, the behavior is to keep the files open only for as long as you need them. This is because OMERO is a different environment from Fiji, where the same assumptions do not hold. One crucial aspect is to make sure the server never runs out of file descriptors due to too many open files.
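
As a minimal sketch of this "open only for as long as needed" pattern, the following plain-Java example opens a file, reads one region, and releases the descriptor before returning. The class and method names (ShortLivedRead, readRegion, demo) are hypothetical and are not part of Bio-Formats or OMERO.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of Bio-Formats or OMERO. */
final class ShortLivedRead {

  /** Opens the file, reads length bytes at offset, and closes the handle before returning. */
  static byte[] readRegion(Path file, long offset, int length) {
    try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
      byte[] buf = new byte[length];
      in.seek(offset);
      in.readFully(buf);
      return buf;
    } catch (IOException e) {
      // try-with-resources has already released the descriptor at this point
      throw new UncheckedIOException(e);
    }
  }

  /** Writes a five-byte temporary file and reads the middle three bytes back. */
  static byte[] demo() {
    try {
      Path tmp = Files.createTempFile("plane", ".bin");
      try {
        Files.write(tmp, new byte[] {10, 20, 30, 40, 50});
        return readRegion(tmp, 1, 3);
      } finally {
        Files.delete(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo())); // prints [20, 30, 40]
  }
}
```

No descriptor outlives a single call, so the number of open files stays bounded by the number of concurrent reads rather than by the number of images a server has touched.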

That means both good news and bad news. The good news is, now that I know OMERO is constantly closing the file, I have a proof of concept work-around that allows data to load: I bracket openBytes with openFile and close. The bad news is, because of the way our reader library works, this can be extremely inefficient. As our format is a single file, each openFile call traverses the entire file and builds up metadata and data file pointers for all of the images. On very large files that have tens of thousands (or hundreds of thousands) of images that can take 2-10 seconds or longer. What happens then is, every openBytes call for every plane of every image now takes that amount of overhead to read what’s often a trivial amount of data. It scales with the size of the file, and the result is very poor performance.

Agreed on the overhead and the performance concerns about reparsing the entire file for metadata. When OMERO was upgraded to use the files directly from the file system back in 2013, this was also a concern we had to address in order to maintain acceptable performance levels while using the imaging data natively.

The technological solution we have been using has been to:

- store a minimal representation of the native data structure within the Bio-Formats reader itself, e.g. filenames, offsets
- serialise the Bio-Formats reader on disk after it is initialised for the first time using the Memoizer API
- let all subsequent data access load the initialised reader from the cache first and use the internal representation to access the data as fast as possible
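
The three steps above can be sketched in plain Java as follows. This is not the Memoizer implementation: IndexCache, PlaneIndex, the ".index" cache file and the fixed-size plane layout are all hypothetical, and ordinary Java serialization stands in for whatever the real cache uses. The point is only that the expensive scan runs once and later loads deserialise the small index instead.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical on-disk cache of reader state; an illustration, not the Memoizer API. */
final class IndexCache {

  /** Minimal representation of the native structure: a file name and plane offsets. */
  static final class PlaneIndex implements Serializable {
    private static final long serialVersionUID = 1L;
    final String fileName;
    final long[] planeOffsets;

    PlaneIndex(String fileName, long[] planeOffsets) {
      this.fileName = fileName;
      this.planeOffsets = planeOffsets;
    }
  }

  /** Counts how often the expensive path runs. */
  static int fullScans = 0;

  /** Stand-in for the expensive traversal; assumes fixed-size planes for simplicity. */
  static PlaneIndex scan(Path data, int planeSize) throws IOException {
    fullScans++;
    int count = (int) (Files.size(data) / planeSize);
    long[] offsets = new long[count];
    for (int i = 0; i < count; i++) {
      offsets[i] = (long) i * planeSize;
    }
    return new PlaneIndex(data.getFileName().toString(), offsets);
  }

  /** Loads the index from the cache file if present, otherwise scans and stores it. */
  static PlaneIndex load(Path data, Path cache, int planeSize) {
    try {
      // A real cache would also check that the data file has not changed.
      if (Files.exists(cache)) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(cache))) {
          return (PlaneIndex) in.readObject();
        }
      }
      PlaneIndex index = scan(data, planeSize);
      try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(cache))) {
        out.writeObject(index);
      }
      return index;
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Loads the same toy file twice and returns the number of full scans performed. */
  static int demo() {
    try {
      Path dir = Files.createTempDirectory("index-cache");
      Path data = dir.resolve("toy.dat");
      Path cache = dir.resolve("toy.dat.index");
      try {
        Files.write(data, new byte[12]); // three planes of four bytes each
        fullScans = 0;
        load(data, cache, 4); // first access: scans and writes the cache
        PlaneIndex second = load(data, cache, 4); // second access: cache hit
        if (second.planeOffsets.length != 3) {
          throw new IllegalStateException("unexpected index");
        }
        return fullScans;
      } finally {
        Files.deleteIfExists(cache);
        Files.deleteIfExists(data);
        Files.deleteIfExists(dir);
      }
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("full scans after two loads: " + demo()); // prints 1
  }
}
```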

To give you a concrete example, the legacy Slidebook reader developed by OME (now superseded by the 3i native reader) stores various file offsets as private fields
https://github.com/openmicroscopy/bioformats/blob/v6.1.0/components/formats-gpl/src/loci/formats/in/SlidebookReader.java#L74
and these offsets can be reused directly when opening planes
https://github.com/openmicroscopy/bioformats/blob/v6.1.0/components/formats-gpl/src/loci/formats/in/SlidebookReader.java#L124
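
To illustrate the idea of keeping offsets as private fields, here is a toy reader written under stated assumptions. The file layout (each plane stored as a 4-byte big-endian length followed by its payload) is invented for this example and has nothing to do with the real SlideBook format, and ToyOffsetReader is a hypothetical class, not the SlidebookReader linked above. initFile walks the file once and records where each payload starts; openBytes then seeks straight to the recorded offset.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical reader for an invented length-prefixed layout; illustration only. */
final class ToyOffsetReader {
  private final Path file;
  private long[] planeOffsets; // start of each payload, recorded once in initFile
  private int[] planeLengths;  // payload size in bytes

  ToyOffsetReader(Path file) {
    this.file = file;
  }

  /** Single traversal of the file that records the offset and length of every plane. */
  void initFile() {
    try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
      List<Long> offsets = new ArrayList<>();
      List<Integer> lengths = new ArrayList<>();
      long pos = 0;
      long size = in.length();
      while (pos + 4 <= size) {
        in.seek(pos);
        int len = in.readInt();
        offsets.add(pos + 4);
        lengths.add(len);
        pos += 4 + len;
      }
      planeOffsets = offsets.stream().mapToLong(Long::longValue).toArray();
      planeLengths = lengths.stream().mapToInt(Integer::intValue).toArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads plane number no by seeking directly to its stored offset; no re-parsing. */
  byte[] openBytes(int no) {
    try (RandomAccessFile in = new RandomAccessFile(file.toFile(), "r")) {
      byte[] buf = new byte[planeLengths[no]];
      in.seek(planeOffsets[no]);
      in.readFully(buf);
      return buf;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes two planes, {1, 2} and {7, 8, 9}, then reads the second one back. */
  static byte[] demo() {
    try {
      Path tmp = Files.createTempFile("toy", ".planes");
      try {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
          out.writeInt(2);
          out.write(new byte[] {1, 2});
          out.writeInt(3);
          out.write(new byte[] {7, 8, 9});
        }
        ToyOffsetReader reader = new ToyOffsetReader(tmp);
        reader.initFile();
        return reader.openBytes(1);
      } finally {
        Files.delete(tmp);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo())); // prints [7, 8, 9]
  }
}
```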

I’m hoping you can help suggest a way to work around this. First question would be, is there a better function than closeFile that I should use to actually shut down our reading library? I see there is a close() method without an argument that my logging shows is called less frequently than close(bool). Basically, my goal would be to have a single call to pair with initFile, followed by all data + metadata reading, and then a close / destructor. I just don’t know if that’s possible, or if the object is created multiple times, perhaps even once per image? If you have any suggestions I can definitely try them out.

Anyway, please let me know if you have any other ideas. Worst-case it looks like each image has a tight openBytes loop where close(bool) is not called, so we could probably get away with loading the metadata once per experiment. Once I get a little more testing done here I’m hoping to send you an updated jar file to test.
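
One way to picture the "load the metadata once" idea is a reader that rebuilds its index lazily and only discards it on a full close. The sketch below is a toy model, not the 3i reader or Bio-Formats code: LazyReopenReader and its counters are hypothetical, an in-memory array stands in for the file, and the fileOnly flag is modelled loosely on the close(boolean) call mentioned above, on the assumption that a file-only close may keep parsed state.

```java
import java.util.Arrays;

/** Hypothetical toy model of lazy re-opening; not the 3i reader or Bio-Formats code. */
final class LazyReopenReader {
  private final byte[][] storedPlanes; // stand-in for the data on disk
  private int[] index;                 // expensive parsed metadata; null until built
  private boolean handleOpen;          // stand-in for an open file descriptor
  int indexBuilds = 0;
  int handleOpens = 0;

  LazyReopenReader(byte[][] storedPlanes) {
    this.storedPlanes = storedPlanes;
  }

  /** Rebuilds only what is missing: the index if it was dropped, the handle if it was closed. */
  private void ensureOpen() {
    if (index == null) {
      index = new int[storedPlanes.length];
      for (int i = 0; i < index.length; i++) {
        index[i] = i;
      }
      indexBuilds++;
    }
    if (!handleOpen) {
      handleOpen = true;
      handleOpens++;
    }
  }

  byte[] openBytes(int no) {
    ensureOpen();
    return storedPlanes[index[no]].clone();
  }

  /** fileOnly = true releases the handle but keeps the index; false drops both. */
  void close(boolean fileOnly) {
    handleOpen = false;
    if (!fileOnly) {
      index = null;
    }
  }

  /** Returns {indexBuilds, handleOpens} after a short open/close sequence. */
  static int[] demo() {
    LazyReopenReader reader = new LazyReopenReader(new byte[][] {{1}, {2}, {3}});
    reader.openBytes(0); // builds the index and opens the handle
    reader.openBytes(1); // tight loop: nothing is rebuilt
    reader.close(true);  // handle released, index kept
    reader.openBytes(2); // only the handle is reopened
    reader.close(false); // full close: index dropped
    reader.openBytes(0); // index rebuilt, handle reopened
    return new int[] {reader.indexBuilds, reader.handleOpens};
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(demo())); // prints [2, 3]
  }
}
```

In this model the cost of a file-only close is one cheap handle reopen rather than a full traversal, which is the behaviour the per-experiment fallback is aiming for.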

Without seeing the code, it is quite hard to make accurate suggestions. We certainly understand the native code cannot be shared, but is the Java reader class wrapping it publicly accessible? Looking at how the various calls are made might help us provide more insightful suggestions.

Best,
Sebastien

Thanks again for your help!

Best,

Brian

Hi Brian,

On 1 Jun 2019, at 01:03, Brian Bodensteiner <brian at intelligent-imaging.com> wrote:

Hi,

We’re working with some of our users to improve SlideBook file import for OMERO but could use some help.

From our side, several community users have been increasingly asking for the ability to load Slidebook data using the native 3i reader in OMERO. So we are very happy to hear that work has started on this front.

First, we tested using stock OMERO.insight 5.4.10. Importing the .SLD files results in the following error (as the default importer does not handle new data):

2019-05-31 16:53:17,563 DEBUG [ org.scijava.nativelib.NativeLibraryUtil] (entQueue-0) processor is INTEL_64 os.arch is amd64
<snip>
2019-05-31 16:53:24,604 INFO  [   ome.formats.importer.ImportCandidates] ( Thread-14) 1 file(s) parsed into 0 group(s) with 1 call(s) to setId in 387ms. (393ms total) [0 unknowns]

That’s also what I expect if the 3i Slidebook reader is not available client-side.

Next, we copied the SlideBook6Reader.jar into the lib folder. This is picked up and executed, and the metadata is read correctly. However, no pixel data is read, and I get the following exception:

2019-05-31 16:55:13,347 WARN  [     o.o.s.a.m.editor.AnnotationTaskPane] (nitializer) UI for displaying ROIS annotations not implemented yet!
<snip>
    serverExceptionClass = "ome.conditions.InternalException"
    message = " Wrapped Exception: (java.lang.AssertionError):
               File not open."

This indicates some error happened server side probably while processing the file during the import steps although the error is somehow wrapped.
I suppose you have access to the test server? The Blitz log file should be more informative about the specific API calls that led to the exception.

I’ve tested the same jars in Fiji and they are working fine, so I am guessing there is something different going on in terms of file open/close flow. I can’t find the source of the "File not open." error message - it is the case that in order to read pixel data the file must be parsed and read, but I’d also presume this happens when the metadata is read.

There are a few import post-processing steps happening server-side which require file access:

- metadata access when populating the database with the new inserts
- pixel data access while reading the minimum/maximum intensities and/or creating thumbnails

I suspect one of these steps is leading to the "File not open" exception. Hopefully the server logs will provide more information.

We have some specific exceptions thrown when a file is closed and then attempted to be read, so it would help to understand where that message is coming from. We also updated the jar on the server, though it’s unclear if that’s required or not (is pixel data stored directly or accessed on the server?).

Since OMERO 5.0, the raw image files are directly accessed on the server using Bio-Formats. As such, adding the reader as a JAR to the server folder is definitely a prerequisite.

Anyway, any direction you could provide would be much appreciated. This feels like the right direction - similar to Fiji, dropping the library into the Bio-Formats installation immediately moves things to the SlideBook6Reader.jar.

Agreed. Please let us know as you get more details from the logs.

Best,
Sebastien

Thanks again for your help,

Brian

Brian Bodensteiner
Vice President Engineering

Intelligent Imaging Innovations (3i)
3509 Ringsby Court
Denver, CO  80216  USA
1-424-744-5941
www.intelligent-imaging.com

_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
https://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel

The University of Dundee is a registered Scottish Charity, No: SC015096