[ome-users] Import error

Douglas Russell douglas_russell at hms.harvard.edu
Tue Jan 9 13:49:32 GMT 2018


FYI: Only the last three postgres log lines below relate to the most recent attempt.

On Tue, 9 Jan 2018 at 08:35 Douglas Russell <douglas_russell at hms.harvard.edu>
wrote:

> And this was all there was in the postgres logs:
>
> 01:33:37 LOG: unexpected EOF on client connection with an open transaction
> 02:13:52 LOG: checkpoints are occurring too frequently (21 seconds apart)
> 02:13:52 HINT: Consider increasing the configuration parameter
> "checkpoint_segments".
> 07:52:19 LOG: checkpoints are occurring too frequently (10 seconds apart)
> 07:52:19 HINT: Consider increasing the configuration parameter
> "checkpoint_segments".
> 08:50:05 LOG: unexpected EOF on client connection with an open transaction
>
> My gut feeling is that the database update fails, causing the whole import
> to fail, but it's hard to know what is going on.
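[Note: the "checkpoints are occurring too frequently" warnings usually just reflect the heavy write load of an import rather than the failure itself. A minimal sketch of the postgresql.conf change that HINT refers to, assuming a pre-9.5 PostgreSQL where checkpoint_segments still exists (9.5 and later replaced it with max_wal_size); the values are illustrative, not recommendations:

    # postgresql.conf: space checkpoints out under bulk-write load
    checkpoint_segments = 32            # default is 3 on PostgreSQL <= 9.4
    checkpoint_completion_target = 0.9  # spread checkpoint I/O over the interval
    # On PostgreSQL >= 9.5, raise max_wal_size instead of checkpoint_segments.
]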
>
> D
>
> On Tue, 9 Jan 2018 at 08:20 Douglas Russell <
> douglas_russell at hms.harvard.edu> wrote:
>
>> Hi,
>>
>> Sorry for the delay in following this up.
>>
>> These OMERO instances are in Docker, yes, but otherwise I don't think
>> there is anything remarkable about the configuration. I have allocated
>> postgres 5 GB of RAM and am not seeing any messages about it running out
>> of memory. The OMERO server has 20 GB of RAM.
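[Note: a quick sketch for confirming or ruling out the kernel OOM killer on the Docker host, since the earlier failure looked memory-related; the grep patterns are just the usual kernel messages:

    # On the Docker host: look for OOM-killer activity around the failure time
    dmesg -T | grep -iE 'out of memory|killed process|oom'
    # Watch per-container memory while an import is running
    docker stats --no-stream
]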
>>
>> The only errors in the Blitz log are:
>>
>> /opt/omero/server/OMERO.server/var/log/Blitz-0.log:2018-01-09
>> 00:15:32,910 ERROR [        ome.services.util.ServiceHandler] (l.Server-7)
>> Method interface ome.api.ThumbnailStore.createThumbnailsByLongestSideSet
>> invocation took 26125
>> /opt/omero/server/OMERO.server/var/log/Blitz-0.log:2018-01-09
>> 00:15:33,090 ERROR [o.s.t.interceptor.TransactionInterceptor] (2-thread-4)
>> Application exception overridden by rollback exception
>> /opt/omero/server/OMERO.server/var/log/Blitz-0.log:2018-01-09
>> 00:15:33,090 ERROR [        ome.services.util.ServiceHandler] (2-thread-4)
>> Method interface ome.services.util.Executor$Work.doWork invocation took
>> 17514887
>>
>> The only thing I haven't yet tried is moving postgres into the same
>> container as OMERO. I can try that if it would help, but I doubt it will
>> make any difference, as in this setup a single t2.2xlarge instance runs
>> everything. It was using a load balancer (the easiest way to connect
>> things up should they actually be on different hosts), but I also tried
>> it without that, giving the IP of the postgres Docker container directly
>> to the OMERO instance configuration, and got the same result, so the load
>> balancer's timeout is not at fault.
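[Note: for anyone reproducing this, a sketch of wiring the two containers together on a user-defined Docker network so that OMERO reaches PostgreSQL by container name rather than by raw IP or load balancer; the image and container names are illustrative, not the ones used here:

    docker network create omero-net
    docker run -d --name omero-postgres --network omero-net postgres:9.4
    # Point the OMERO server's omero.db.host at "omero-postgres"
    docker run -d --name omero-server --network omero-net <your-omero-server-image>
]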
>>
>> Thanks,
>>
>> Douglas
>>
>> On Wed, 3 Jan 2018 at 06:56 Mark Carroll <m.t.b.carroll at dundee.ac.uk>
>> wrote:
>>
>>>
>>> On 12/23/2017 12:32 PM, Douglas Russell wrote:
>>> > I'd checked the master log files and there was nothing of interest in
>>> > there. dmesg is more promising though, good idea. It looks like a
>>> > memory issue. I've increased the amount of memory available from 4 GB
>>> > to 20 GB and now it does not fail in the same way. Not sure why so much
>>> > RAM is needed when each image in the screen is only 2.6 MB. Now there
>>> > is a nice new error.
>>>
>>> You have me wondering if the server does the whole plate import in only
>>> one transaction. I also wonder whether the memory issues are due to
>>> PostgreSQL or to Java (e.g., Hibernate) and, assuming it is Java-side,
>>> whether the issue is pixel data size (do the TIFF files use
>>> compression?) or metadata (e.g., tons of ROIs?). Scalability has been an
>>> ongoing focus for us: we have done much, but there is much more yet to
>>> be done.
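[Note: libtiff's tiffinfo gives a quick answer to the compression question; the filename is only an example:

    # Prints the "Compression Scheme" tag for one TIFF from the plate
    tiffinfo example_well_A01.tif | grep -i compression
]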
>>>
>>> > Going by the error that I see when the database tries to rollback, I
>>> > think it is timeout related.
>>>
>>> I'm not seeing an obvious timeout issue here, but I may well be missing
>>> something; perhaps you have noticed more clues yourself over the holiday
>>> period?
>>>
>>> > The import log: https://s3.amazonaws.com/dpwr/pat/import_log.txt
>>> > The server logs (I tried the import twice):
>>> > https://s3.amazonaws.com/dpwr/pat/omero_logs.zip
>>> >
>>> > There are a couple of these in the database logs as you'd expect for
>>> > the two import attempts, but nothing else of interest.
>>> >
>>> > LOG: unexpected EOF on client connection with an open transaction
>>>
>>> Mmmm, late in the import process the EOFException from
>>> PGStream.ReceiveChar looks key. I'm trying to think what in PostgreSQL's
>>> pg_* tables might give some hint as to relevant activity or locks at the
>>> time (if it's a timeout, maybe a deadlock?). I guess there's nothing
>>> particularly exciting about how your OMERO server connects to
>>> PostgreSQL? It's simply across a LAN, perhaps via Docker or somesuch?
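[Note: a sketch of the kind of pg_* queries meant here, run while an import is in flight; the "waiting" column exists on PostgreSQL before 9.6, where it was later replaced by wait_event:

    -- Open transactions, their age, and what they are executing
    SELECT pid, state, waiting, now() - xact_start AS xact_age, query
      FROM pg_stat_activity
     ORDER BY xact_start;

    -- Lock requests that have not been granted (possible blocker or deadlock)
    SELECT l.pid, l.locktype, l.mode, c.relname
      FROM pg_locks l
      LEFT JOIN pg_class c ON c.oid = l.relation
     WHERE NOT l.granted;
]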
>>>
>>> How large is the plate? Given the 5.4 database changes, I am wondering
>>> whether this could be a regression since 5.3.5, and how easy the error
>>> might be to reproduce in a test environment.
>>>
>>> Now that the holiday season is behind us, we at OME are starting to
>>> return to the office. Happy New Year! With luck we'll get this issue
>>> figured out promptly. My apologies if the thread already contains
>>> context that bears on some of my questions and I missed it.
>>>
>>> -- Mark
>>>
>>>
>>