[ome-devel] CellProfiler on the cluster crashes OMERO

Jean-Marie Burel (Staff) j.burel at dundee.ac.uk
Fri Dec 16 15:38:34 GMT 2016


Hi Frederik,
There is no direct way to retrieve the information you need by querying the OMERO API.

Mark Carroll has written a testing application, https://github.com/mtbc/omero-downloader (experimental work in progress),
that will probably get you what you need.

The first part of the following method (lines 232-260)
https://github.com/mtbc/omero-downloader/blob/dev_5_3/src/main/java/org/openmicroscopy/client/downloader/Download.java#L232
shows how to retrieve the original files composing a given image.
You won’t need the downloading part of the method.

The order of the file ids “should” match the channel order, but this will have to be validated with your format.
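
As a rough illustration of that assumption, the sketch below pairs original files with channel indices by sorting on file id. The ids and file names are made up, and `files_by_channel` is a hypothetical helper, not part of OMERO or omero-downloader. Companion files such as the HTD would presumably have to be filtered out first, and whether id order really matches channel order still has to be checked for the format.

```python
from typing import Dict, List, Tuple


def files_by_channel(original_files: List[Tuple[int, str]]) -> Dict[int, str]:
    """Map channel index -> path, ASSUMING ascending file id equals channel order."""
    ordered = sorted(original_files, key=lambda item: item[0])
    return {channel: path for channel, (_file_id, path) in enumerate(ordered)}


# Hypothetical (file id, file name) pairs for a single image
example = [(1203, "site1_w3.TIF"), (1201, "site1_w1.TIF"), (1202, "site1_w2.TIF")]
for channel, path in files_by_channel(example).items():
    print(channel, path)
```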

To test it, you can run the “downloader” app against your OMERO server with the following arguments:
-u username -w password -s hostname -p port -d absolutePathToLocalDirectory Image:image_id -f binary
This will download only the files composing the image into absolutePathToLocalDirectory.


Cheers
Jmarie

From: ome-devel <ome-devel-bounces at lists.openmicroscopy.org.uk> on behalf of Frederik Grüll <frederik.gruell at unibas.ch>
Reply-To: OME External Developer List <ome-devel at lists.openmicroscopy.org.uk>
Date: Thursday, 15 December 2016 10:14
To: OME External Developer List <ome-devel at lists.openmicroscopy.org.uk>
Subject: Re: [ome-devel] CellProfiler on the cluster crashes OMERO


Hello Chris,

Is there maybe a workaround, for example loading the images from the cluster file system rather than through OMERO? When I import a plate I select an HTD file and OMERO.insight uploads all the TIFFs of the plate along with it, one TIFF per channel, and displays them as a multi-channel view as expected. I have four channels and would like to know the filesystem location of the four TIFF files that back each image, and which TIFF belongs to which channel. I could then use this information to generate a CSV file for CellProfiler with paths instead of OMERO URLs.
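
The CSV idea could look roughly like the sketch below. It assumes the per-channel TIFF paths are already known and that the pipeline reads them through CellProfiler's LoadData module with `Image_FileName_<name>` and `Image_PathName_<name>` columns. The channel names, paths, and the helper `write_loaddata_csv` are placeholders for illustration only.

```python
import csv
import io
import os
from typing import Dict, List


def write_loaddata_csv(rows: List[Dict[str, str]], out) -> None:
    """Write one CSV row per image set; each dict maps channel name -> TIFF path."""
    channels = sorted(rows[0])
    header = []
    for name in channels:
        header += [f"Image_FileName_{name}", f"Image_PathName_{name}"]
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        record = []
        for name in channels:
            directory, filename = os.path.split(row[name])
            record += [filename, directory]
        writer.writerow(record)


# Placeholder channel names and paths (assumptions, not taken from the plate)
image_sets = [{"DAPI": "/data/plate_303/A01_s1_w1.TIF",
               "GFP": "/data/plate_303/A01_s1_w2.TIF"}]
buffer = io.StringIO()
write_loaddata_csv(image_sets, buffer)
print(buffer.getvalue())
```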

I had experimented a bit with image.getFileset() to get the original TIFFs. However, when I call the method, it takes a long time and then returns the paths of _all_ 9600 images of the plate, not only the paths of the four TIFFs I wanted. Is there a way to get only the paths of the TIFFs of a single view, plus the information on which TIFF backs which channel?

Cheers,

Frederik

On 13.12.2016 14:57, Chris Allan wrote:

Hello, Frederik:

There are a number of issues with the CellProfiler-OMERO integration that will be felt especially hard on any significantly sized cluster configuration. At Glencoe Software we have been working on some solutions and have been liaising with Anne Carpenter's team here:

        https://github.com/CellProfiler/CellProfiler/issues/1772

We are aiming to release at least one potential solution to the community before the end of the year. Testing such solutions thoroughly is, as you can imagine, quite time consuming.

As things stand I think it's fair to say that (1) if you utilise a large number of workers or are processing a large dataset, CellProfiler-OMERO usage will severely impact your OMERO server performance, and may even exhaust all resources you currently have allocated to OMERO; and (2) even if you utilise some of our forthcoming changes you will still have to be very careful about resource allocation on your server. CellProfiler-OMERO in a cluster configuration can easily request data at a rate in excess of 1 Gbps for short periods of time.

I realise none of the above is particularly helpful to you right this minute but I hope that it at least helps shed some light on why you're seeing what you're seeing. At the moment my only concrete suggestion to you is to keep your CellProfiler executions short-lived and your parallelisation limited.

-Chris

On 13 Dec 2016, at 10:49, Frederik Grüll <frederik.gruell at unibas.ch> wrote:

Dear all,

I am using CellProfiler on our cluster to process plates for screening.
The images are fetched from OMERO with the CellProfiler-OMERO
integration. A typical job consists of a command like this:

cellprofiler -b -p Entry-pipeline_omero.cpproj -c -r -o $OUT_DIR \
    -t $TMPDIR -f $FIRST_IMAGE_SET -l $LAST_IMAGE_SET \
    --data-file plate_303_iids.csv -d $DONE_FILE \
    --omero-credentials host=omero.biozentrum.unibas.ch,port=4064,session-id=33c6118d-f8b2-4ac2-adb2-12d48ae37a2f

When I run about 20 jobs in parallel, performance looks good at the
beginning, only limited by the performance of CellProfiler and not by
the I/O with OMERO. The plate I am processing has 2400 sites with three
channels and the OMERO IDs are in the CSV file plate_303_iids.csv that I
generated before. A job processes 50 image sets, selected with
$FIRST_IMAGE_SET and $LAST_IMAGE_SET. The results of the pipeline are
correct.
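
To make the job layout concrete, here is a minimal sketch of how the $FIRST_IMAGE_SET and $LAST_IMAGE_SET values could be derived. It assumes one image set per site and 1-based, inclusive indices; `image_set_ranges` is a hypothetical helper, not part of CellProfiler.

```python
from typing import Iterator, Tuple


def image_set_ranges(total: int, per_job: int) -> Iterator[Tuple[int, int]]:
    """Yield 1-based, inclusive (first, last) image-set indices for each job."""
    for first in range(1, total + 1, per_job):
        yield first, min(first + per_job - 1, total)


ranges = list(image_set_ranges(2400, 50))
print(len(ranges))            # number of jobs needed for the plate
print(ranges[0], ranges[-1])  # first and last (first, last) pair
```

With 2400 image sets and 50 per job this gives 48 jobs, the last one covering image sets 2351 to 2400.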

However, after about 4/5 of the images have been processed, OMERO
becomes very slow. The load on the OMERO server reaches 10, with the
Java process for Blitz consuming 10 cores. Eventually, my CellProfiler
jobs lose their connection ("JavaException:
Ice.ConnectionLostException"). In a few cases OMERO recovers by itself;
otherwise the CPU load falls back to normal, but OMERO still needs to
be restarted.

If I run more than 20 jobs in parallel, I occasionally get the error
message "ome.conditions.OverUsageException: servantsPerSession reached
for 05dbc314-3030-40af-8e72-68b3688e8c94: 10000" after CellProfiler has
processed only 1665 single-channel images, implying 6 servants per image
per channel.
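
The estimate of 6 servants follows from dividing the limit in the error message by the number of single-channel images processed. The short check below assumes that all servants belong to one session and that none are released in between; `servants_per_image` is a hypothetical helper.

```python
def servants_per_image(servant_limit: int, images_processed: int) -> float:
    """Average number of servants held per processed single-channel image."""
    return servant_limit / images_processed


# 10000 is the limit quoted in the OverUsageException, 1665 the images processed
print(round(servants_per_image(10000, 1665), 2))
```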

I have already had a look into the logs, especially Blitz-0.log, but
could not find a reason why OMERO would become so slow after a while.
Jstat indicates that all the time is spent on garbage collection. Our OMERO
server has 250GB of RAM with omero.jvmcfg.percent.blitz=40.

Where else could I look to find the cause and prevent the
degradation in performance? I use OMERO.server 5.2.5 with OpenJDK
version 1.8.0_65 and CellProfiler 2.2.0 with Oracle Java 1.8.0_92.

Cheers and thank you for your time,
Frederik

--
Dr. Frederik Grüll | Image Analysis Specialist | G1055, Biozentrum,
University of Basel | Klingelbergstr. 50/70 | CH-4056 Basel | Phone: +41
(61) 207 2250 | frederik.gruell at unibas.ch | www.biozentrum.unibas.ch

_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel


The University of Dundee is a registered Scottish Charity, No: SC015096