[ome-devel] Problem when importing >1000 images

Andrii Iudin andrii at ebi.ac.uk
Fri May 6 16:14:52 BST 2016


Dear Josh,

The version we are running is 5.1.4-ice35-b55.

Best regards,
Andrii

On 06/05/2016 16:11, Josh Moore wrote:
> On Fri, May 6, 2016 at 12:49 PM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>> Dear Josh,
> Hi Andrii,
>
>
>> I have added a logout to the script after each import call. This time
>> more than 2000 entries have been imported; however, an error happened. Please
>> could you check the attached log? Is this the same issue with NFS or
>> something different? Is it possible that using sessions might help?
> It does look like you're still running into the session/service
> exhaustion that you were seeing earlier.  If using a single session
> doesn't solve the problem, the only other thing I can think to try at
> this point is a forcible closing of services. What version of OMERO
> are you using?
>
> Cheers,
> ~Josh.
>
>
>
>
>> Thank you and best regards,
>> Andrii
>>
>> On 02/05/2016 06:44, Josh Moore wrote:
>>> On Fri, Apr 29, 2016 at 12:41 PM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>>>> Dear Josh,
>>> Hi Andrii,
>>>
>>>
>>>> Thank you for providing the possible solution to our problem. We will
>>>> test the session usage and get back with the results. Please could you
>>>> clarify a few things about your propositions?
>>>>
>>>> Is it possible to add a wait time somewhere in the code to compensate for
>>>> the slower NFS locking?
>>> Conceivably, but considering the state the server could possibly be in
>>> at that point (shutdown, etc) it's difficult to know. One option is to
>>> put your /OMERO directory on a non-NFS filesystem and then symlink in
>>> individual directories from NFS. Ultimately, though, this points to an
>>> issue with the remote fileshare that needs to be looked into.
>>>
>>>
>>>> As far as I can see we do not call
>>>> bin/omero login
>>> `bin/omero import` calls `login` if no login is present.
>>>
>>>
>>>> explicitly at this moment. Is it an integral part of the import? There is
>>>> also a BlitzGateway.connect() call before the script goes into the loop
>>>> over all images.
>>> Agreed. There are a couple of different logins in play here which
>>> makes it all a bit complicated. One option would be to get everything
>>> into the same process with no subprocess calls to `bin/omero import`.
>>>
>>>
>>>> Does this mean then that we should call logout after each import?
>>> That's probably the easiest thing to test. Longer-term, it'd be better
>>> to use a session key.
>>>
>>>
>>>> Thank you and best regards,
>>>> Andrii
>>> Cheers,
>>> ~Josh.
>>>
>>>
>>>
>>>> On 28/04/2016 10:17, Josh Moore wrote:
>>>>> On Wed, Apr 27, 2016 at 11:40 AM, Andrii Iudin <andrii at ebi.ac.uk> wrote:
>>>>>> Dear Josh,
>>>>> Hi Andrii,
>>>>>
>>>>>
>>>>>> Thank you for pointing to the documentation on the remote shares. Those
>>>>>> .lock files usually appear if we stop the server after one of the
>>>>>> "crashes". When stopping and starting the server during its normal
>>>>>> functioning they do not seem to be created.
>>>>> It sounds like a race condition. When the server is under pressure,
>>>>> etc., then there's no time for the slower NFS locking implementation
>>>>> to do what it should. This is what makes the remote share not behave
>>>>> as a posix filesystem should. There has been some success with other
>>>>> versions of NFS and lockd tuning.
>>>>>
>>>>>
>>>>>> The run_command definition is as follows:
>>>>>>        def run_command(self, command, logFile=None):
>>>>> Thanks for the definition. I don't see anything off-hand in your code.
>>>>> If there's a keep alive bug in the import code itself, you might
>>>>> try running a separate process with:
>>>>>
>>>>>        bin/omero sessions keepalive
>>>>>
>>>>> You can either do that in a console for testing, or via your Python
>>>>> driver itself. If that fixes the problem, then we can help you
>>>>> integrate that code into your main script without the need for a
>>>>> subprocess. Additionally, the session UUID that is created by that
>>>>> method could be used in all of your import subprocesses which would 1)
>>>>> protect the use of the password and 2) lower the overhead on the
>>>>> server.
>>>>>
>>>>> (In fact, now that I think of it, if you don't have a call to
>>>>> `bin/omero logout` anywhere in your code, this may be exactly the
>>>>> problem that you are running into. Each call to `bin/omero login`
>>>>> creates a new session which is kept alive for the default session
>>>>> timeout.)
>>>>>
>>>>> Cheers,
>>>>> ~Josh.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> Best regards,
>>>>>> Andrii
>>>>>>
>>>>>>
>>>>>> On 26/04/2016 21:00, Josh Moore wrote:
>>>>>>> Hi Andrii,
>>>>>>>
>>>>>>> On Tue, Apr 26, 2016 at 10:56 AM, Andrii Iudin <andrii at ebi.ac.uk>
>>>>>>> wrote:
>>>>>>>> Dear Josh,
>>>>>>>>
>>>>>>>> Please find attached the import script. For each EMDB entry it performs
>>>>>>>> an import of six images - three sides and their thumbnails.
>>>>>>> Thanks for this. And where's the definition of `run_command`?
>>>>>>>
>>>>>>>
>>>>>>>> To stop OMERO we use "omero web stop" and then "omero admin stop"
>>>>>>>> commands. After this it is necessary to remove
>>>>>>>> var/OMERO.data/.omero/repository/*/.lock files before starting OMERO
>>>>>>>> again. The system is NFS.
>>>>>>> I'd assume then that disconnections & the .lock files are unrelated.
>>>>>>> Please see
>>>>>>>
>>>>>>>
>>>>>>> https://www.openmicroscopy.org/site/support/omero5.2/sysadmins/unix/server-binary-repository.html#locking-and-remote-shares
>>>>>>> regarding using remote shares.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> ~Josh.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Andrii
>>>>>>>>
>>>>>>>>
>>>>>>>> On 25/04/2016 16:21, Josh Moore wrote:
>>>>>>>>> On Fri, Apr 22, 2016 at 12:41 PM, Andrii Iudin <andrii at ebi.ac.uk>
>>>>>>>>> wrote:
>>>>>>>>>> Dear OMERO developers,
>>>>>>>>>>
>>>>>>>>>> We are experiencing an issue when importing a large number of images
>>>>>>>>>> in a single consecutive go. This usually happens after importing more
>>>>>>>>>> than a thousand images. Please see below excerpts from the logs.
>>>>>>>>>> Increasing the time period between each import seemed to help a bit;
>>>>>>>>>> however, this issue ultimately happened anyway.
>>>>>>>>> Is this script available publicly? It would be useful to see how it's
>>>>>>>>> working.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> To get the OMERO server working after this happens,
>>>>>>>>>> it is necessary to stop it, remove .lock files and start the server
>>>>>>>>>> again. It would be much appreciated if you could point out a possible
>>>>>>>>>> way to solve this issue.
>>>>>>>>> How did you stop OMERO? Is your file system on NFS or another remote
>>>>>>>>> share?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Thank you and with best regards,
>>>>>>>>>> Andrii
>>>>>>>>> Cheers,
>>>>>>>>> ~Josh.
> _______________________________________________
> ome-devel mailing list
> ome-devel at lists.openmicroscopy.org.uk
> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
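
The session-key approach discussed in the thread above could look roughly like the following. This is only a minimal sketch, not the actual EMDB import script: the function names (`import_command`, `logout_command`, `import_all`), the relative `bin/omero` path and the injectable `run` parameter are all hypothetical, and it assumes the driver already holds a valid session key (for example from its existing BlitzGateway connection). It relies on the CLI's `-s` (server) and `-k` (session key) login options so that every `bin/omero import` subprocess joins one session instead of creating a new one, and it calls `bin/omero logout` once at the end.

```python
import subprocess

# Assumed location of the OMERO CLI, relative to the server directory.
OMERO = "bin/omero"


def import_command(server, session_key, image_path):
    """Build a `bin/omero import` call that joins an existing session."""
    # -s and -k are the CLI login options for the server and the session
    # key; with a key, no password is passed on the command line.
    return [OMERO, "import", "-s", server, "-k", session_key, image_path]


def logout_command():
    """Build the `bin/omero logout` call."""
    return [OMERO, "logout"]


def import_all(server, session_key, image_paths, run=subprocess.call):
    """Import each path using one shared session, then log out once.

    `run` takes an argument list and returns an exit code; it defaults to
    subprocess.call and can be swapped out for testing (hypothetical hook).
    Returns the list of paths whose import exited with a non-zero code.
    """
    failed = []
    try:
        for path in image_paths:
            if run(import_command(server, session_key, path)) != 0:
                failed.append(path)
    finally:
        # Log out even if an import raised, so the session is not left
        # to linger until the server-side timeout.
        run(logout_command())
    return failed
```

Whether the final logout belongs in the driver depends on who owns the session: if the same key is still in use elsewhere (for instance by a gateway connection), that call may need to be dropped or moved.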
