[ome-users] Network drive slowness

Paul van Schayck paul at vanschayck.nl
Thu Mar 19 15:37:07 GMT 2015


Hi Sebastien,

Ah thank you! Those changes look excellent. I suspect though, that
they are initially only available for the command line importer, and
not the Java client?

I guess it is technically difficult to do both the upload and checksum
in one read, right?
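
For what it's worth, in principle a single pass looks feasible: each chunk read from the network drive can be fed to the hash and passed on to the upload before the next chunk is read. Below is a minimal Python sketch of that idea. It is not the OMERO importer code; the helper name `upload_with_checksum`, the chunk size and the choice of SHA-1 are my own assumptions for illustration.

```python
import hashlib
import io


def upload_with_checksum(source, sink, chunk_size=1024 * 1024):
    """Copy source to sink in one pass and return the SHA-1 hex digest.

    Hypothetical helper: `source` and `sink` are any binary file-like
    objects; in a real importer the sink would be the upload stream.
    """
    digest = hashlib.sha1()
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)  # checksum the bytes already in memory
        sink.write(chunk)     # ...and pass the same bytes on to the upload
    return digest.hexdigest()


# In-memory streams stand in for the network file and the upload target.
data = b"example pixel data" * 1000
uploaded = io.BytesIO()
checksum = upload_with_checksum(io.BytesIO(data), uploaded)
```

The file is read only once here, so the slow network drive is not hit a second time for the checksum. Whether the importer's upload and verify steps can be restructured like this is exactly my question.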

Kind regards,

Paul

On Thu, Mar 19, 2015 at 3:52 PM, Sebastien Besson <seb.besson at gmail.com> wrote:
> Hi Paul,
>
> the checksumming operation has indeed been identified as one of the
> duplicated and time-consuming operations. While converging towards 5.1.0,
> we are putting several solutions into place to selectively disable some
> time-consuming import steps and make the import of large datasets much
> faster. See notably:
>
> https://github.com/openmicroscopy/openmicroscopy/pull/3264
> https://github.com/openmicroscopy/openmicroscopy/pull/3580
> https://github.com/openmicroscopy/openmicroscopy/pull/3610
> https://github.com/openmicroscopy/openmicroscopy/pull/3630
>
> Some of this work is still under review and documentation will follow, but
> I would expect these advanced import options to at least partially solve
> some of your import issues.
>
> Best,
> Sebastien
>
> On 19 Mar 2015, at 14:42, Paul van Schayck <paul at vanschayck.nl> wrote:
>
> Hi Josh and Melissa,
>
> Coming back to this issue. I've got a theory, and I wanted to run it past
> you.
>
> Could it be that the slowdown during import is due to the verify
> step, in which a checksum is calculated both on the importer and the
> server? Could it be that the file is read twice on the importer side,
> once during submit and then a second time to calculate the checksum?
> This would be of less concern on a local disk, but on a network disk
> it is a killer.
>
> I've also observed that a ManagedImport job may make the server run out
> of memory by starting too many jobs that are not left to finish. Here
> is a small piece of the filtered log. Note both the OutOfMemoryError and
> the delay of 30 minutes.
>
> $ grep "40d58dd2" var/log/Blitz-0.log.1
> 2015-03-13 12:07:56,493 DEBUG [o.s.b.r.ManagedImportRequestI.@40d58dd2] (2-thread-2) User callContext: {omero.session.uuid=92c5a250-fd3e-450f-a881-b97b0d5d0e98, omero.client.uuid=b0c821a4-3b8a-4ebe-a7ee-9e485157da5c}
> 2015-03-13 12:07:56,494 INFO  [ome.services.util.ServiceHandler] (2-thread-2)  Executor.doWork -- omero.cmd.HandleI.run(92c5a250-fd3e-450f-a881-b97b0d5d0e98/IHandlec85bb139-737d-46f9-aa0b-33a51e8e6b95, ome.services.blitz.repo.ManagedImportRequestI@40d58dd2)
> 2015-03-13 12:07:57,493 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-54) getRequest: ome.services.blitz.repo.ManagedImportRequestI@40d58dd2
> 2015-03-13 12:07:57,840 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-54) Add callback: ~3SJ7}(JvCFzsbw9oOJ*/00b0bd2e-4cbd-4c43-a67b-b6b72b5d7931
> 2015-03-13 12:07:58,536 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-54) getRequest: ome.services.blitz.repo.ManagedImportRequestI@40d58dd2
> 2015-03-13 12:07:58,536 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (2-thread-2) Cancelled
> 2015-03-13 12:07:58,537 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-54) getResponse: null
> 2015-03-13 12:07:58,537 DEBUG [o.s.b.r.ManagedImportRequestI.@40d58dd2] (2-thread-2) Request cancelled by java.lang.OutOfMemoryError: Java heap space
> 2015-03-13 12:07:58,897 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (2-thread-2) notify cancelled: omero.cmd.ERR@2c5bf6d6/omero.cmd.Status@49aa725a
> 2015-03-13 12:07:59,250 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-47) getRequest: ome.services.blitz.repo.ManagedImportRequestI@40d58dd2
> 2015-03-13 12:39:04,791 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-53) Closing...
> 2015-03-13 12:39:04,791 INFO  [o.s.b.r.ManagedImportRequestI.@40d58dd2] (.Server-53) notify cancelled: omero.cmd.ERR@2c5bf6d6/omero.cmd.Status@49aa725a
>
> Thanks,
>
> Paul
>
> On Tue, Jan 6, 2015 at 9:53 AM, Paul van Schayck <paul at vanschayck.nl> wrote:
>
> Hi Josh,
>
> Thank you for your summary. Just before the holidays I did some
> additional testing, but was only able to post the results now:
>
> High Latency Network drive (HL): 10Mbit connection, typical 40ms
> latency (2% packet loss)
>
> Low Latency Network drive (LL): 100Mbit connection, typical < 10ms
> latency (0% packet loss)
>
> I compared simply transferring the files to my local machine with
> importing them to the OMERO server. This import to the OMERO server of
> course takes place via my local machine, hence the comparison. Note
> that the connection between my local machine and the OMERO server is
> 100 Mbit/s and < 1ms. I ran all tests twice and the two runs agreed to
> within a few percent.
>
> Copying to local machine takes:
> HL: 19m30s
> LL: 2m30s
>
> Importing to OMERO server takes:
> HL: 26m30s
> LL: 12m30s
>
> And importing from my local machine to OMERO takes 6m30s. What these
> results show is that, while there is some overhead to importing, it's
> not too big. Funnily enough, the overhead seems smaller for the
> high-latency situation.
>
> However, what was noticeable is that the importer client appears to
> hang when the latency is higher. The import continues, but the
> client no longer shows any progress. What would help you most to debug
> such an apparent hang? What would show what Java is waiting for?
>
> Thanks,
>
> Paul
>
>
> On Fri, Dec 12, 2014 at 4:47 PM, Josh Moore <josh at glencoesoftware.com>
> wrote:
>
> Hi all,
>
> as a fairly general summary of this thread:
>
> * There will always be hidden costs for Bio-Formats and OMERO to work from
> a mounted disk[1] and as datasets grow in size, we'll probably have to come
> up with more comprehensive solutions to issues like these, but that won't be
> possible for 5.1.0
>
> * Nevertheless, there is obviously still huge room for improvement. A large
> part of the underlying problem is that we don't and probably can't regularly
> test with realistically sized datasets of each file format over various
> mounting options. This is certainly where you can help! If anyone has
> similar issues perhaps with other file formats, let us know the details.
>
> * In the meantime, we'll start working on the formats and components where
> we know there are issues and keep you posted. If anyone is interested in
> getting their hands dirty, likely the most helpful information would be Java
> hprof[2] output or similar from an example run of what you consider slow.
>
> A lovely weekend to one and all,
> ~Josh
>
> [1] http://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
> [2] https://docs.oracle.com/javase/7/docs/technotes/samples/hprof.html
>
>
> On Thu, Dec 11, 2014 at 10:11 AM, Melissa Linkert
> <melissa at glencoesoftware.com> wrote:
>
>
> Hi Paul and Niko,
>
> > Would this bug also explain slowness in the OMERO importer while
> > importing files which are located on a network drive (in Windows)?
>
>
> It's very possible - the easiest way to know for sure is to compare the
> import time for the same file on a network drive vs. on a local drive.
>
> > I doubt this is related - as I understand it, the (current) import
> > process using OMERO.insight just copies the image file(s) onto the
> > server before processing them. So the first time BioFormats kicks in
> > is *after* the actual transfer and therefore the network drive issue
> > should not apply here.
>
>
> That's mostly correct, but Bio-Formats is still used client-side at
> import time to send metadata to the server and to determine which files
> need to be uploaded.  Bio-Formats definitely would be accessing the
> files as they are on the network drive.
>
> > You're probably right that the File.isHidden() in BioFormats isn't
> > called in the OMERO importer at the client side. However, from quickly
> > grepping through the source of openmicroscopy/components/insight I
> > found several calls to File.isHidden(). I was wondering if these are
> > also meant to be optimized as part of this issue.
>
>
> We weren't initially considering the OMERO code base, but once this
> issue is resolved in Bio-Formats it likely makes sense to use the same
> fix in OMERO.
>
> Regards,
> -Melissa
>
> P.S. Niko, you are now CC'd on the ticket as well.
>
> On Thu, Dec 11, 2014 at 09:07:41PM +0100, Paul van Schayck wrote:
>
> Hi Niko,
>
> You're probably right that the File.isHidden() in BioFormats isn't
> called in the OMERO importer at the client side. However, from quickly
> grepping through the source of openmicroscopy/components/insight I found
> several calls to File.isHidden(). I was wondering if these are also
> meant to be optimized as part of this issue.
>
> What we experience in the OMERO importer is that, for very low latency
> network drives, the import is relatively slow (compared to a file copy).
> For higher latency network drives the import is slow and the importer UI
> becomes unresponsive (although the import continues). If you want me to
> quantify these experiences, I could do that.
>
> Thanks,
>
> Paul
>
>
> On Thu, Dec 11, 2014 at 8:22 PM, Niko Ehrenfeuchter <
> nikolaus.ehrenfeuchter at unibas.ch> wrote:
>
> Hi Paul,
>
> I doubt this is related - as I understand it, the (current) import
> process using OMERO.insight just copies the image file(s) onto the
> server before processing them. So the first time BioFormats kicks in
> is *after* the actual transfer and therefore the network drive issue
> should not apply here.
>
> I might be wrong, though...
>
> Cheers
> Niko
>
> On 11.12.2014 17:29, Paul van Schayck wrote:
>
> Hi Melissa,
>
> Would this bug also explain slowness in the OMERO importer while
> importing files which are located on a network drive (in Windows)?
>
> Thanks,
>
> Paul
>
> _______________________________________________
> ome-users mailing list
> ome-users at lists.openmicroscopy.org.uk
> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-users


