[ome-users] Import error

Mark Carroll m.t.b.carroll at dundee.ac.uk
Wed Jan 3 11:55:51 GMT 2018


On 12/23/2017 12:32 PM, Douglas Russell wrote:
> I'd checked the master log files and there was nothing of interest in
> there. dmesg is more promising though, good idea. It looks like a memory
> issue. I've increased the amount of memory available to 20 GB from 4 GB
> and now it does not fail in the same way. Not sure why so much RAM is
> needed when each image in the screen is only 2.6 MB. Now there is a nice
> new error.

You have me wondering whether the server does the whole plate import in
a single transaction. I also wonder whether the memory issues are due to
PostgreSQL or instead to Java (e.g., Hibernate) and, assuming the Java
side, whether the issue is pixel data size (do the TIFF files use
compression?) or metadata (e.g., tons of ROIs?). Scalability has been an
ongoing focus for us: we have done much but there is much more yet to be
done.
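
On the compression question, one quick way to check is to read the
Compression tag (tag 259) from a file's first IFD. The following is only
a minimal sketch, assuming classic (non-BigTIFF) files; the function name
tiff_compression and the small name table are made up for illustration
and are not part of OMERO or Bio-Formats.

```python
import struct

# A few well-known values of the TIFF Compression tag (tag 259).
COMPRESSION_NAMES = {1: "none", 5: "LZW", 7: "JPEG", 8: "Deflate", 32773: "PackBits"}


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value from the first IFD of a classic TIFF."""
    # The first two bytes give the byte order: "II" little-endian, "MM" big-endian.
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    # The IFD starts with a 2-byte entry count, followed by 12-byte entries.
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        tag, field_type, n_values = struct.unpack(order + "HHI", data[start:start + 8])
        if tag == 259:
            # A single SHORT is stored left-justified in the 4-byte value field.
            (value,) = struct.unpack(order + "H", data[start + 8:start + 10])
            return value
    return 1  # TIFF default when the tag is absent: no compression
```

For one of the plate's files this could be used along the lines of
COMPRESSION_NAMES.get(tiff_compression(open(path, "rb").read()), "other"),
where path is whichever image you want to inspect. A result of "none"
would mean the pixel data occupy about as much memory as they do disk.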

> Going by the error that I see when the database tries to rollback, I
> think it is timeout related.

I'm not seeing an obvious timeout issue here but I may well be missing
something. Maybe over the holiday period you have noticed more clues
yourself too?

> The import log: https://s3.amazonaws.com/dpwr/pat/import_log.txt
> The server logs (I tried the import twice):
> https://s3.amazonaws.com/dpwr/pat/omero_logs.zip
>
> There are a couple of these in the database logs as you'd expect for the
> two import attempts, but nothing else of interest.
>
> LOG: unexpected EOF on client connection with an open transaction

Mmmm, late in the import process the EOFException from
PGStream.ReceiveChar looks key. I'm trying to think what in PostgreSQL's
pg_* tables might give some hint as to relevant activity or locks at the
time (if it's a timeout, maybe a deadlock?). I guess there's nothing
particularly exciting about how your OMERO server connects to
PostgreSQL? It's simply across a LAN, perhaps via Docker or somesuch?
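
As a starting point for the pg_* idea, pg_stat_activity and pg_locks can
be joined on pid to see open transactions and any locks not yet granted
while the import runs. What follows is a rough sketch rather than a
tested diagnostic: the helper name suspicious, the ten-minute threshold,
and the assumption that rows arrive as dicts keyed by column name are all
mine for illustration.

```python
from datetime import datetime, timedelta

# Query to run in psql (or any client) while the import is in progress.
ACTIVITY_SQL = """
SELECT a.pid, a.state, a.xact_start, l.locktype, l.mode, l.granted, a.query
  FROM pg_stat_activity a
  LEFT JOIN pg_locks l ON l.pid = a.pid
 WHERE a.xact_start IS NOT NULL
 ORDER BY a.xact_start;
"""


def suspicious(rows, now, max_age=timedelta(minutes=10)):
    """Flag rows that wait on an ungranted lock or sit in an old transaction.

    rows: dicts with at least the keys pid, xact_start and granted
          (granted is None when the LEFT JOIN found no lock row).
    now:  a datetime comparable with the xact_start values.
    """
    flagged = []
    for row in rows:
        waiting = row["granted"] is False
        old = now - row["xact_start"] > max_age
        if waiting:
            flagged.append((row["pid"], "waiting on lock"))
        elif old:
            flagged.append((row["pid"], "long transaction"))
    return flagged
```

A backend with granted false for a long stretch would point towards lock
contention, whereas a single old transaction with no waiting would fit
better with the whole import running in one transaction.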

How large is the plate? Given the 5.4 database changes I am wondering if
this could possibly be a regression since 5.3.5 and how easy the error
might be to reproduce in a test environment.

Now the holiday season is behind us, at OME we're starting to return to
the office. Happy New Year! With luck we'll get this issue figured out
promptly. My apologies if I missed some existing context from the thread
that I didn't realize already bears on some of my questions.

-- Mark

The University of Dundee is a registered Scottish Charity, No: SC015096

