[ome-devel] Problem when importing >1000 images
Andrii Iudin
andrii at ebi.ac.uk
Mon May 9 22:34:28 BST 2016
Dear Josh,
Yes, we plan to update to the latest version. We have been waiting for
the OMERO release that includes Bio-Formats 5.2.0, since it will
introduce fixes that we need in order to use OMERO for the EMPIAR
images, and since we expect this to be a major update for our systems.
Please correct me if I am wrong, but is the Bio-Formats version in the
latest OMERO still below 5.2.0?
Best regards,
Andrii
On 09/05/2016 20:15, Josh Moore wrote:
> On Fri, May 6, 2016 at 5:14 PM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>> Dear Josh,
> Hi Andrii,
>
>> The version we are running is 5.1.4-ice35-b55.
> Have you considered upgrading recently? 5.2.3 just came out and with
> it, the 5.2 series will soon be going into maintenance mode, while
> support for 5.1 will be dropped.
>
> With the latest version, I've just attempted importing 700-800
> directories each with 6 images using:
>
> $ for x in $(seq 1 1000); do /opt/ome0/dist/bin/omero import \
>     $(printf "%04d" "$x") ; done
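For a Python driver, the same loop could be sketched roughly as below. This is only an illustration: the helper names directory_names and import_all are hypothetical, the omero path is the one from the shell loop, and a login is assumed to be in place already, as it is for the shell version.

```python
import subprocess

OMERO_BIN = "/opt/ome0/dist/bin/omero"  # path taken from the shell loop


def directory_names(first=1, last=1000, width=4):
    """Return zero-padded names such as '0001', like printf "%04d"."""
    return [str(n).zfill(width) for n in range(first, last + 1)]


def import_all(names, run=subprocess.call, omero_bin=OMERO_BIN):
    """Import each directory in turn; return the names whose import failed."""
    failed = []
    for name in names:
        # A non-zero exit code is treated as a failed import.
        if run([omero_bin, "import", name]) != 0:
            failed.append(name)
    return failed
```

Collecting the failed names instead of stopping at the first error makes it easier to see at which point in a long run the imports start to fail.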
>
> So far I've had no exception with 5.2.3. If you'd like me to try with
> one of your images (assuming they are all similar), feel free to
> upload it to http://qa.openmicroscopy.org.uk/
>
> Cheers,
> ~Josh
>
>
>
>
>> Best regards,
>> Andrii
>>
>>
>> On 06/05/2016 16:11, Josh Moore wrote:
>>> On Fri, May 6, 2016 at 12:49 PM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>>>> Dear Josh,
>>> Hi Andrii,
>>>
>>>
>>>> I have added a logout to the script after each import call. This time
>>>> more than 2000 entries were imported before an error occurred. Please
>>>> could you check the attached log? Is this the same issue with NFS or
>>>> something different? Is it possible that using sessions might help?
>>> It does look like you're still running into the session/service
>>> exhaustion that you were seeing earlier. If using a single session
>>> doesn't solve the problem, the only other thing I can think to try at
>>> this point is a forcible closing of services. What version of OMERO
>>> are you using?
>>>
>>> Cheers,
>>> ~Josh.
>>>
>>>
>>>
>>>
>>>> Thank you and best regards,
>>>> Andrii
>>>>
>>>> On 02/05/2016 06:44, Josh Moore wrote:
>>>>> On Fri, Apr 29, 2016 at 12:41 PM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>>>>>> Dear Josh,
>>>>> Hi Andrii,
>>>>>
>>>>>
>>>>>> Thank you for providing the possible solution to our problem. We will
>>>>>> test the session usage and get back with the results. Please could you
>>>>>> clarify a few things about your suggestions?
>>>>>>
>>>>>> Is it possible to add a wait time somewhere in the code to compensate
>>>>>> for the slower NFS locking?
>>>>> Conceivably, but considering the state the server could be in
>>>>> at that point (shutdown, etc.) it's difficult to know. One option is to
>>>>> put your /OMERO directory on a non-NFS filesystem and then symlink in
>>>>> individual directories from NFS. Ultimately, though, this points to an
>>>>> issue with the remote fileshare that needs to be looked into.
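The symlink layout suggested above could be set up with a few lines of Python. This is a sketch under assumptions: link_directories is a hypothetical helper, and which subdirectories can safely stay on the share is not stated in this thread, so the names are left to the caller.

```python
import os


def link_directories(nfs_root, local_root, names):
    """Create local_root/<name> symlinks that point at nfs_root/<name>."""
    created = []
    for name in names:
        target = os.path.join(nfs_root, name)
        link = os.path.join(local_root, name)
        if not os.path.isdir(target):
            raise ValueError("not a directory on the share: %s" % target)
        # Leave anything that already exists at the link path untouched.
        if not os.path.lexists(link):
            os.symlink(target, link)
            created.append(link)
    return created
```

The server would need to be stopped while the data directory is rearranged.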
>>>>>
>>>>>
>>>>>> As far as I can see we do not call
>>>>>> bin/omero login
>>>>> `bin/omero import` calls `login` if no login is present.
>>>>>
>>>>>
>>>>>> explicitly at this moment. Is it an integral part of the import? There
>>>>>> is also a BlitzGateway.connect() call before the script goes into the
>>>>>> loop over all images.
>>>>> Agreed. There are a couple of different logins in play here which
>>>>> makes it all a bit complicated. One option would be to get everything
>>>>> into the same process with no subprocess calls to `bin/omero import`.
>>>>>
>>>>>
>>>>>> Does this mean then that we should call logout after each import?
>>>>> That's probably the easiest thing to test. Longer-term, it'd be better
>>>>> to use a session key.
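The logout-after-each-import test could look roughly like this in a Python driver. It is a minimal sketch: import_then_logout is a hypothetical helper, and any server or credential arguments that the existing script passes to `bin/omero import` are omitted here.

```python
import subprocess


def import_then_logout(path, omero_bin="bin/omero", run=subprocess.call):
    """Run one import, then always log out so the session is closed."""
    try:
        return run([omero_bin, "import", path])
    finally:
        # Runs even if the import call raises, so no session is left open.
        run([omero_bin, "logout"])
```

The function returns the exit code of the import, so the calling loop can still detect failures.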
>>>>>
>>>>>
>>>>>> Thank you and best regards,
>>>>>> Andrii
>>>>> Cheers,
>>>>> ~Josh.
>>>>>
>>>>>
>>>>>
>>>>>> On 28/04/2016 10:17, Josh Moore wrote:
>>>>>>> On Wed, Apr 27, 2016 at 11:40 AM, Andrii Iudin <andrii at ebi.ac.uk>
>>>>>>> wrote:
>>>>>>>> Dear Josh,
>>>>>>> Hi Andrii,
>>>>>>>
>>>>>>>
>>>>>>>> Thank you for pointing to the documentation on the remote shares.
>>>>>>>> Those .lock files usually appear if we stop the server after one of
>>>>>>>> the "crashes". When stopping and starting the server during normal
>>>>>>>> operation, they do not seem to be created.
>>>>>>> It sounds like a race condition. When the server is under pressure,
>>>>>>> etc., then there's no time for the slower NFS locking implementation
>>>>>>> to do what it should. This is what makes the remote share not behave
>>>>>>> as a posix filesystem should. There has been some success with other
>>>>>>> versions of NFS and lockd tuning.
>>>>>>>
>>>>>>>
>>>>>>>> The run_command definition is as follows:
>>>>>>>> def run_command(self, command, logFile=None):
>>>>>>> Thanks for the definition. I don't see anything off-hand in your code.
>>>>>>> If there's a keep-alive bug in the import code itself, you might
>>>>>>> try running a separate process with:
>>>>>>>
>>>>>>> bin/omero sessions keepalive
>>>>>>>
>>>>>>> You can either do that in a console for testing, or via your Python
>>>>>>> driver itself. If that fixes the problem, then we can help you
>>>>>>> integrate that code into your main script without the need for a
>>>>>>> subprocess. Additionally, the session UUID that is created by that
>>>>>>> method could be used in all of your import subprocesses which would 1)
>>>>>>> protect the use of the password and 2) lower the overhead on the
>>>>>>> server.
>>>>>>>
>>>>>>> (In fact, now that I think of it, if you don't have a call to
>>>>>>> `bin/omero logout` anywhere in your code, this may be exactly the
>>>>>>> problem that you are running into. Each call to `bin/omero login`
>>>>>>> creates a new session which is kept alive for the default session
>>>>>>> timeout.)
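Reusing one session in every import subprocess might be sketched as follows. The helper names are hypothetical, the session key is assumed to come from a session created once elsewhere (for example by the existing BlitzGateway.connect() call), and 4064 is assumed as the server port. The -s, -p and -k options are the standard CLI login arguments for server, port and session key.

```python
import subprocess


def keepalive_command(omero_bin="bin/omero"):
    """Command for the separate keep-alive process mentioned above."""
    return [omero_bin, "sessions", "keepalive"]


def import_command(path, session_key, server, port=4064,
                   omero_bin="bin/omero"):
    """Build an import command that joins an existing session via -k."""
    return [omero_bin, "import", "-s", server, "-p", str(port),
            "-k", session_key, path]


def import_all(paths, session_key, server, run=subprocess.call):
    """Import every path with the same session key; map path -> exit code."""
    return dict((p, run(import_command(p, session_key, server)))
                for p in paths)
```

With this layout no password appears on the command line, and the server only has to track a single session for the whole run.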
>>>>>>>
>>>>>>> Cheers,
>>>>>>> ~Josh.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Andrii
>>>>>>>>
>>>>>>>>
>>>>>>>> On 26/04/2016 21:00, Josh Moore wrote:
>>>>>>>>> Hi Andrii,
>>>>>>>>>
>>>>>>>>> On Tue, Apr 26, 2016 at 10:56 AM, Andrii Iudin <andrii at ebi.ac.uk>
>>>>>>>>> wrote:
>>>>>>>>>> Dear Josh,
>>>>>>>>>>
>>>>>>>>>> Please find attached the import script. For each EMDB entry it
>>>>>>>>>> performs an import of six images - three sides and their
>>>>>>>>>> thumbnails.
>>>>>>>>> Thanks for this. And where's the definition of `run_command`?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> To stop OMERO we use "omero web stop" and then "omero admin stop"
>>>>>>>>>> commands. After this it is necessary to remove
>>>>>>>>>> var/OMERO.data/.omero/repository/*/.lock files before starting
>>>>>>>>>> OMERO again. The system is NFS.
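The manual cleanup step described here could be scripted along these lines. This is a sketch only: remove_repository_locks is a hypothetical helper, data_dir stands for the var/OMERO.data directory, and it assumes the server has been fully stopped before it runs.

```python
import glob
import os


def remove_repository_locks(data_dir):
    """Delete .omero/repository/*/.lock under data_dir; return the paths."""
    pattern = os.path.join(data_dir, ".omero", "repository", "*", ".lock")
    removed = []
    for lock in sorted(glob.glob(pattern)):
        os.remove(lock)
        removed.append(lock)
    return removed
```

Returning the removed paths makes it possible to log how often stale locks are actually left behind after a stop.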
>>>>>>>>> I'd assume then that disconnections & the .lock files are unrelated.
>>>>>>>>> Please see
>>>>>>>>>
>>>>>>>>> https://www.openmicroscopy.org/site/support/omero5.2/sysadmins/unix/server-binary-repository.html#locking-and-remote-shares
>>>>>>>>>
>>>>>>>>> regarding using remote shares.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> ~Josh.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Andrii
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 25/04/2016 16:21, Josh Moore wrote:
>>>>>>>>>>> On Fri, Apr 22, 2016 at 12:41 PM, Andrii Iudin <andrii at ebi.ac.uk>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Dear OMERO developers,
>>>>>>>>>>>>
>>>>>>>>>>>> We are experiencing an issue when importing a large number of
>>>>>>>>>>>> images in a single consecutive run. This usually happens after
>>>>>>>>>>>> importing more than a thousand images. Please see below excerpts
>>>>>>>>>>>> from the logs. Increasing the time period between imports seemed
>>>>>>>>>>>> to help a bit, however this issue ultimately happened anyway.
>>>>>>>>>>> Is this script available publicly? It would be useful to see how
>>>>>>>>>>> it's working.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> To get the OMERO server working after this happens, it is
>>>>>>>>>>>> necessary to stop it, remove the .lock files and start the server
>>>>>>>>>>>> again. It would be much appreciated if you could point out a
>>>>>>>>>>>> possible way to solve this issue.
>>>>>>>>>>> How did you stop OMERO? Is your file system on NFS or another
>>>>>>>>>>> remote share?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Thank you and with best regards,
>>>>>>>>>>>> Andrii
>>>>>>>>>>> Cheers,
>>>>>>>>>>> ~Josh.
> _______________________________________________
> ome-devel mailing list
> ome-devel at lists.openmicroscopy.org.uk
> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel