[ome-devel] CellProfiler on the cluster crashes OMERO

Simon Li spli at dundee.ac.uk
Tue Dec 13 13:41:58 GMT 2016


Hi Frederik

Could you give us your server configuration and diagnostics:

    omero config get --hide-password
    omero admin diagnostics

It would also be helpful if we could see your logs for all OMERO services, not just Blitz. Would you mind uploading them to https://www.openmicroscopy.org/qa2/qa/upload/ and giving us the timestamp of when the problem first arises following a restart?

Best wishes

Simon


On 13 December 2016 at 10:49, Frederik Grüll <frederik.gruell at unibas.ch> wrote:
Dear all,

I am using CellProfiler on our cluster to process plates for screening.
The images are fetched from OMERO with the CellProfiler-OMERO
integration. A typical job consists of a command like this:

cellprofiler -b -p Entry-pipeline_omero.cpproj -c -r -o $OUT_DIR \
  -t $TMPDIR -f $FIRST_IMAGE_SET -l $LAST_IMAGE_SET \
  --data-file plate_303_iids.csv -d $DONE_FILE \
  --omero-credentials host=omero.biozentrum.unibas.ch,port=4064,session-id=33c6118d-f8b2-4ac2-adb2-12d48ae37a2f

When I run about 20 jobs in parallel, performance looks good at the
beginning, only limited by the performance of CellProfiler and not by
the I/O with OMERO. The plate I am processing has 2400 sites with three
channels and the OMERO IDs are in the CSV file plate_303_iids.csv that I
generated before. A job processes 50 image sets, selected with
$FIRST_IMAGE_SET and $LAST_IMAGE_SET. The results of the pipeline are
correct.
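
As a rough illustration of how the per-job ranges could be derived, here is
a minimal Python sketch. It assumes one image set per site (2400 in total)
and 1-based, inclusive values for $FIRST_IMAGE_SET and $LAST_IMAGE_SET; the
function name image_set_ranges is hypothetical and not part of CellProfiler.

```python
def image_set_ranges(total_sets, sets_per_job):
    """Yield (first, last) 1-based inclusive image-set ranges, one per job."""
    for first in range(1, total_sets + 1, sets_per_job):
        # Clamp the last range so it never runs past the final image set.
        yield first, min(first + sets_per_job - 1, total_sets)


# Assumed figures: 2400 sites on the plate, 50 image sets per job.
ranges = list(image_set_ranges(2400, 50))
print(len(ranges), ranges[0], ranges[-1])
```

Under these assumptions the plate splits into 48 jobs, so with about 20
running in parallel the plate would need roughly three waves of jobs.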

However, after about 4/5 of the images have been processed, OMERO
becomes very slow. The load on the OMERO server reaches 10, with the
Java process for Blitz consuming 10 cores. Eventually, my CellProfiler
jobs lose their connection ("JavaException:
Ice.ConnectionLostException"). In a few cases OMERO recovers on its
own; otherwise the CPU load falls back to normal, but OMERO still needs
to be restarted.

If I run more than 20 jobs in parallel, I occasionally get the error
message "ome.conditions.OverUsageException: servantsPerSession reached
for 05dbc314-3030-40af-8e72-68b3688e8c94: 10000" after CellProfiler
has processed only 1665 single-channel images, implying about 6 servants
per image per channel.
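
The arithmetic behind that estimate can be checked with a short Python
sketch. The projection for the whole plate is an assumption: it supposes
that all jobs share one session and that no servants are closed along the
way, neither of which has been confirmed.

```python
SERVANT_LIMIT = 10000        # limit quoted in the OverUsageException
IMAGES_AT_FAILURE = 1665     # single-channel images processed at that point
SITES = 2400                 # sites on the plate
CHANNELS = 3                 # channels per site

# Average number of servants held per single-channel image at failure.
servants_per_image = SERVANT_LIMIT / IMAGES_AT_FAILURE
print(round(servants_per_image, 2))

# Hypothetical total if every image kept ~6 servants open for the whole run.
projected_total = SITES * CHANNELS * round(servants_per_image)
print(projected_total, projected_total > SERVANT_LIMIT)
```

If that assumption held, the full plate would need several times the
10000-servant limit, which would be consistent with the error appearing
part-way through a run.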

I have already looked through the logs, especially Blitz-0.log, but
could not find a reason why OMERO becomes so slow after a while.
jstat indicates that all of the time is spent on garbage collection. Our
OMERO server has 250GB of RAM with omero.jvmcfg.percent.blitz=40.

Where else could I look to find the cause and prevent the
degradation in performance? I use OMERO.server 5.2.5 with OpenJDK
version 1.8.0_65 and CellProfiler 2.2.0 with Oracle Java 1.8.0_92.

Cheers and thank you for your time,
Frederik

--
Dr. Frederik Grüll | Image Analysis Specialist | G1055, Biozentrum,
University of Basel | Klingelbergstr. 50/70 | CH-4056 Basel
Phone: +41 (61) 207 2250 | frederik.gruell at unibas.ch | www.biozentrum.unibas.ch


_______________________________________________
ome-devel mailing list
ome-devel at lists.openmicroscopy.org.uk
http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel



The University of Dundee is a registered Scottish Charity, No: SC015096