<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>Hi Simon,</p>
    <p>Thank you for your offer to have a look. I have uploaded all the
      logs and the output of both commands as a zip file, "debug.zip".</p>
    <p>I started my cluster jobs at 9:14, and OMERO was restarted at
      10:49 on Dec 14 2016. All times are CET.</p>
    <p>Best regards</p>
    <p>Frederik<br>
    </p>
    <br>
    <div class="moz-cite-prefix">On 13.12.2016 14:41, Simon Li wrote:<br>
    </div>
    <blockquote
cite="mid:CAMvbRBGM4sNL4-ieG5+m5VVpBwKgtd+jv=Sd4b__QtriJOzQFQ@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div dir="ltr">
        <div>
          <div>
            <div>
              <div>Hi Frederik<br>
                <br>
              </div>
              Could you give us your server configuration and
              diagnostics:<br>
              <br>
                  omero config get --hide-password<br>
                  omero admin diagnostics<br>
              <br>
            </div>
            It would also be helpful if we could see your logs for all
            OMERO services, not just Blitz. Would you mind uploading
            them to
            <a moz-do-not-send="true"
              href="https://www.openmicroscopy.org/qa2/qa/upload/"
              target="_blank">https://www.openmicroscopy.<wbr>org/qa2/qa/upload/</a>
            and giving us the timestamp of when the problem first arises
            following a restart?<br>
            <br>
          </div>
          Best wishes<br>
          <br>
        </div>
        Simon<br>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On 13 December 2016 at 10:49, Frederik
          Grüll <span dir="ltr">
            &lt;<a moz-do-not-send="true"
              href="mailto:frederik.gruell@unibas.ch" target="_blank">frederik.gruell@unibas.ch</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            Dear all,<br>
            <br>
            I am using CellProfiler on our cluster to process plates for
            screening.<br>
            The images are fetched from OMERO with the
            CellProfiler-OMERO<br>
            integration. A typical job consists of a command like this:<br>
            <br>
            cellprofiler -b -p Entry-pipeline_omero.cpproj -c -r -o
            $OUT_DIR -t<br>
            $TMPDIR -f $FIRST_IMAGE_SET -l $LAST_IMAGE_SET --data-file<br>
            plate_303_iids.csv -d $DONE_FILE --omero-credentials<br>
            host=<a moz-do-not-send="true"
              href="http://omero.biozentrum.unibas.ch" rel="noreferrer"
              target="_blank">omero.biozentrum.unibas.<wbr>ch</a>,port=4064,session-id=<wbr>33c6118d-f8b2-4ac2-adb2-<wbr>12d48ae37a2f<br>
            <br>
            When I run about 20 jobs in parallel, performance looks good
            at the<br>
            beginning, only limited by the performance of CellProfiler
            and not by<br>
            the I/O with OMERO. The plate I am processing has 2400 sites
            with three<br>
            channels and the OMERO IDs are in the CSV file
            plate_303_iids.csv that I<br>
            generated before. A job processes 50 image sets, selected
            with<br>
            $FIRST_IMAGE_SET and $LAST_IMAGE_SET. The results of the
            pipeline are<br>
            correct.<br>
            <br>
            However, after about 4/5 of the images have been processed,
            OMERO<br>
            becomes very slow. The load on the OMERO server reaches 10,
            with the<br>
            Java process for Blitz consuming 10 cores. Eventually, my
            CellProfiler<br>
            jobs lose their connection ("JavaException:<br>
            Ice.ConnectionLostException"). In a few cases OMERO recovers;<br>
            otherwise the CPU load falls back to normal, but OMERO needs
            to be<br>
            restarted anyway.<br>
            <br>
            If I run more than 20 jobs in parallel, I occasionally get
            an error<br>
            message "ome.conditions.<wbr>OverUsageException:
            servantsPerSession reached<br>
            for 05dbc314-3030-40af-8e72-<wbr>68b3688e8c94: 10000" after
            CellProfiler<br>
            processed only 1665 single-channel images, implying 6
            servants per image<br>
            per channel.<br>
            <br>
            I have already had a look into the logs, especially
            Blitz-0.log, but<br>
            could not find a reason why OMERO would become so slow after
            a while.<br>
            Jstat indicates that all the time is spent on garbage
            collection. Our OMERO<br>
            server has 250GB of RAM with omero.jvmcfg.percent.blitz=40.<br>
            <br>
            Where else could I look to find the cause and prevent the<br>
            degradation in performance? I use OMERO.server 5.2.5 with
            OpenJDK<br>
            version 1.8.0_65 and CellProfiler 2.2.0 with Oracle Java
            1.8.0_92.<br>
            <br>
            Cheers and thank you for your time,<br>
            Frederik<br>
            <span class="HOEnZb"><font color="#888888"><br>
                --<br>
                Dr. Frederik Grüll | Image Analysis Specialist | G1055,
                Biozentrum,<br>
                University of Basel | Klingelbergstr. 50/70 | CH-4056
                Basel Phone: +41<br>
                (61) 207 2250 | <a moz-do-not-send="true"
                  href="mailto:frederik.gruell@unibas.ch">frederik.gruell@unibas.ch</a>
                |
                <a moz-do-not-send="true"
                  href="http://www.biozentrum.unibas.ch"
                  rel="noreferrer" target="_blank">www.biozentrum.unibas.ch</a><br>
                <br>
              </font></span><br>
            ______________________________<wbr>_________________<br>
            ome-devel mailing list<br>
            <a moz-do-not-send="true"
              href="mailto:ome-devel@lists.openmicroscopy.org.uk">ome-devel@lists.<wbr>openmicroscopy.org.uk</a><br>
            <a moz-do-not-send="true"
              href="http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel"
              rel="noreferrer" target="_blank">http://lists.openmicroscopy.<wbr>org.uk/mailman/listinfo/ome-<wbr>devel</a><br>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
      <br>
      <span style="font-size:10pt;">The University of Dundee is a
        registered Scottish Charity, No: SC015096</span>
      <br>
      <fieldset class="mimeAttachmentHeader"></fieldset>
      <br>
    </blockquote>
    <br>
    <div class="moz-signature">-- <br>
      Dr. Frederik Grüll | Image Analysis Specialist | G1055,
      Biozentrum, University of Basel | Klingelbergstr. 50/70 | CH-4056
      Basel
      Phone: +41 (61) 207 2250 | <a class="moz-txt-link-abbreviated" href="mailto:frederik.gruell@unibas.ch">frederik.gruell@unibas.ch</a> |
      <a class="moz-txt-link-abbreviated" href="http://www.biozentrum.unibas.ch">www.biozentrum.unibas.ch</a></div>
  </body>
</html>