[ome-users] Can somebody explain the pyramid building concept to me?

Josh Moore josh at glencoesoftware.com
Tue Sep 11 14:10:00 BST 2012


On Sep 5, 2012, at 10:43 PM, Jake Carroll wrote:

> Hi Josh,

Hi Jake,

> OK. Top posting, because this is getting too complicated down there.

:) No problem. Sorry to have let this fall through the cracks. Please do CC anything that's not private or sensitive to ome-users or ome-devel so others can help me (remember to) respond.


>>> And you're still not seeing your images upload successfully, correct?
> 
> Correct. It's taking hours and hours and hours. Specs of the server:
> 
> 4 x vCPU 
> 16GB of vRAM.
> 30GB OS/boot volume, /nfs backing store to high performance 10GbE
> connected SAN disk.
> Ubuntu 12.04.01 - latest kernel
> ICE3.4
> 
> I personally think the server has hung, but I can't be sure. We can't see
> anything "wrong" on the client side, because it does progress - but surely
> it should not take a day to ingest a 9.35GB image?

Agreed. 


> Is there a place I can send you one of these files, so you could try and
> replicate the problem yourself? I have numerous ways I can send you a big
> file..

I'll send you connection information off-list, but my guess is still that the main process is struggling. There's no OutOfMemory exception in the log you sent me, which would have pointed to Blitz-0 needing more memory (though I didn't have all the logs). But the issues with the PostgreSQL connection suggest the machine may be running into swap.
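If it helps to check both of those yourself, here is a rough sketch rather than anything that ships with OMERO. It assumes the logs are plain text under var/log, that a memory failure would show up as the standard Java string "java.lang.OutOfMemoryError", and that the machine is Linux with /proc/meminfo available. The function names are made up for this example.

```python
from pathlib import Path

# Standard JVM error name; assumed to be what an out-of-memory failure logs.
OOM_MARKER = "java.lang.OutOfMemoryError"


def find_oom_lines(log_path):
    """Return (line_number, text) pairs for lines mentioning the OOM marker."""
    hits = []
    with open(log_path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if OOM_MARKER in line:
                hits.append((number, line.rstrip("\n")))
    return hits


def scan_log_dir(log_dir):
    """Map each *.log file name in log_dir to its OOM hits (files without hits are skipped)."""
    results = {}
    for path in sorted(Path(log_dir).glob("*.log")):
        hits = find_oom_lines(path)
        if hits:
            results[path.name] = hits
    return results


def swap_used_kb(meminfo_text):
    """Compute swap in use (kB) from the text of /proc/meminfo."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if key in ("SwapTotal", "SwapFree") and fields:
            values[key] = int(fields[0])
    return values["SwapTotal"] - values["SwapFree"]
```

Calling scan_log_dir("var/log") from the OMERO.server directory and swap_used_kb(open("/proc/meminfo").read()) while an import is running would give a first hint as to which of the two is more likely.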

When uploading your files, you might also upload a tarball of var/log.
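Any way of making the tarball is fine. As one possible sketch, assuming Python is at hand and the archive is written somewhere outside var/log so it does not try to include itself (archive_logs and the output file name are just illustrative):

```python
import tarfile
from pathlib import Path


def archive_logs(log_dir, archive_path):
    """Write a gzip-compressed tarball of log_dir and return the archive path."""
    log_dir = Path(log_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps only the final directory name (e.g. "log") inside the archive.
        tar.add(str(log_dir), arcname=log_dir.name)
    return Path(archive_path)
```

For example, archive_logs("var/log", "/tmp/omero-logs.tar.gz") run from the OMERO.server directory.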

Cheers,
~Josh.

> Further details:
> 
> Last login: Mon Sep  3 10:12:32 2012 from qbi-carroll.qbi.uq.edu.au
> linuxadmin@omero:~$ omero admin diagnostics
> 
> ================================================================================
> OMERO Diagnostics 4.4.3-ice34-b5
> ================================================================================
> 
> Commands:   java -version                  1.6.0     (/usr/bin/java)
> Commands:   python -V                      2.7.3     (/usr/bin/python)
> Commands:   icegridnode --version          3.4.2     (/usr/bin/icegridnode)
> Commands:   icegridadmin --version         3.4.2     (/usr/bin/icegridadmin)
> Commands:   psql --version                 9.1.4     (/usr/bin/psql -- 2 others)
> 
> Server:     icegridnode                    running
> Server:     Blitz-0                        active (pid = 4220, enabled)
> Server:     DropBox                        inactive (disabled)
> Server:     FileServer                     active (pid = 4259, enabled)
> Server:     Indexer-0                      active (pid = 4239, enabled)
> Server:     MonitorServer                  inactive (disabled)
> Server:     OMERO.Glacier2                 active (pid = 4244, enabled)
> Server:     OMERO.IceStorm                 active (pid = 4242, enabled)
> Server:     PixelData-0                    active (pid = 4248, enabled)
> Server:     Processor-0                    active (pid = 4251, enabled)
> Server:     Tables-0                       active (pid = 4257, enabled)
> Server:     TestDropBox                    inactive (enabled)
> 
> Log dir:    /home/linuxadmin/apps/OMERO/OMERO.server/var/log exists
> 
> Log files:  Blitz-0.log                    319.0 MB      errors=35   warnings=69
> Log files:  DropBox.log                    0.0 KB        errors=2    warnings=0
> Log files:  FileServer.log                 0.0 KB
> Log files:  Indexer-0.log                  5.0 KB        errors=1    warnings=2
> Log files:  MonitorServer.log              1.0 KB        errors=1    warnings=0
> Log files:  OMEROweb.log                   957.0 KB      errors=6    warnings=5
> Log files:  OMEROweb_request.log           0.0 KB
> Log files:  PixelData-0.log                1.0 KB
> Log files:  Processor-0.log                3.0 KB        errors=0    warnings=1
> Log files:  Tables-0.log                   3.0 KB        errors=0    warnings=1
> Log files:  TestDropBox.log                n/a
> Log files:  master.err                     0.0 KB
> Log files:  master.out                     0.0 KB
> Log files:  Total size                     320.52 MB
> 
> Parsing Blitz-0.log:[line:30] => Server restarted <=
> 
> Environment:OMERO_HOME=/home/linuxadmin/apps/OMERO/OMERO.server
> Environment:OMERO_NODE=(unset)
> Environment:OMERO_MASTER=(unset)
> Environment:PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/lib/jvm/java-6-sun/bin:/usr/share/Ice-3.4.2:/usr/lib/postgresql/9.1/bin:/home/linuxadmin/apps/OMERO/OMERO.server/bin
> Environment:ICE_HOME=/usr/share/Ice-3.4.2
> Environment:LD_LIBRARY_PATH=/usr/share/java:/usr/lib:
> Environment:DYLD_LIBRARY_PATH=(unset)
> 
> OMERO data dir: '/omero'	Exists? True	Is writable? True
> OMERO.web status... DEVELOPMENT: You will have to check status by hand!
> 
> 
> Any thoughts/way we can progress this?
> 
> Cheers.
> 
> --JC
> 
> 
> 
> 
> 
> 
> 
> On 3/09/12 5:43 PM, "Josh Moore" <josh at glencoesoftware.com> wrote:
> 
>> 
>> On Sep 3, 2012, at 3:20 AM, Jake Carroll wrote:
>> 
>>> Hi Josh,
>>> 
>>> Wade in down below...
>>> 
>>> On 29/08/12 10:14 PM, "Josh Moore" <josh at glencoesoftware.com> wrote:
>>> 
>>>> 
>>>> On Aug 29, 2012, at 11:59 AM, Jake Carroll wrote:
>>>> 
>>>>> Hi Josh,
>>>>> 
>>>>> Thanks for getting back to me!
>>>> 
>>>> Sure, no problem.
>>>> 
>>>> 
>>>>> On 28/08/12 6:58 PM, "Josh Moore" <josh at glencoesoftware.com> wrote:
>>>>> 
>>>>>> 
>>>>>> On Aug 28, 2012, at 7:17 AM, Jake Carroll wrote:
>>>>>> 
>>>>>>> Hi all.
>>>>>> 
>>>>>> Hi Jake,
>>>>>> 
>>>>>>> I'm currently trying to understand some of the time-frames it takes to
>>>>>>> ingest big images into Omero. We're putting a 9.8GB image in currently,
>>>>>>> and it looks like it'll take a very very long time. We've thrown 16GB of
>>>>>>> RAM at it, 4 * vCPU's and copious amounts of high speed hardware.
>>>>>> 
>>>>>> How have you configured the RAM?
>>>>> 
>>>>> What can you tell me about it? Is there a specific tuneable I should be
>>>>> changing here? Something about a JVM heap size allocated to omero in some
>>>>> specific way?
>>>> 
>>>> There are several -Xmx options in etc/grid/templates.xml that you can modify.
>>>> Be sure to restart your server after changing them.
>>> 
>>> Made some tweaks as suggested. Doubled and tripled mem configs in some cases.
>> 
>> And you're still not seeing your images upload successfully, correct?
>> 
>> 
>> 
>>>>>>> drwxrwsr-x 2 linuxadmin staff  4.0K Aug 28 14:45 .
>>>>>>> -rw-rw-r-- 1 linuxadmin staff     0 Aug 28 14:45 .201_pyramid.pyr_lock
>>>>>>> -rw-rw-r-- 1 linuxadmin staff   80M Aug 28  2012 .201_pyramid4066596291236323213.tmp
>>>>>>> 
>>>>>>> Found that in the "Pixels" directory.
>>>>>>> 
>>>>>>> It generates ever so slowly.
>>>>>> 
>>>>>> Could you possibly send us the var/log/PixelData*log files, which should
>>>>>> have timing information?
>>>>> 
>>>>> OK. Check this out:
>>>>> 
>>>>> linuxadmin@omero:~/apps/OMERO/OMERO.server/var/log$ cat PixelData-0.log
>>>>> 2012-08-25 06:55:33,233 INFO  [ome.services.blitz.Entry] (main) Waiting 10000 ms on startup
>>>>> 2012-08-25 06:55:43,242 INFO  [ome.services.blitz.Entry] (main) Creating ome.pixeldata. Please wait...
>>>>> 2012-08-25 06:55:46,012 INFO  [ng.ShutdownSafeEhcacheManagerFactoryBean] (main) Initializing EHCache CacheManager
>>>>> 2012-08-25 06:55:51,665 INFO  [ome.services.fulltext.FullTextAnalyzer] (main) Initialized FullTextAnalyzer
>>>>> 2012-08-25 06:56:02,062 INFO  [ome.io.nio.PixelsService] (main) PixelsService(path=/omero, resolver=ome.services.OmeroFilePathResolver@62a23d38, backoff=ome.io.nio.SimpleBackOff(factor=93.2), sizes=ome.io.nio.ConfiguredTileSizes(w=256,h=256,W=3192,H=3192))
>>>> 
>> 
>> 
>>> Another one. Please see attached logs and a jstack dump as you'd
>>> suggested.
>> 
>> Thanks.
>> 
>> 
>> 
>>>> The backoff here is within reason. Mine are roughly the same:
>>>> 
>>>> grep factor= var/log/Blitz-0.log  | cut -b204- | cut -d, -f1
>>>> (factor=144.4)
>>>> (factor=100.5)
>>>> (factor=91.8)
>>>> (factor=79.9)
>>>> (factor=104.1)
>>>> (factor=82.6)
>>>> (factor=87.6)
>>>> (factor=79.2)
>>>> (factor=81.7)
>>>> (factor=134.9)
>>>> (factor=97.9)
>>>> (factor=88.3)
>>>> (factor=198.1)
>>>> (factor=92.5)
>>>> 
>>>> 
>>>>> 2012-08-25 06:56:02,068 INFO  [ome.services.pixeldata.PixelDataThread] (main) Initializing PixelDataThread (threads=2)
>>>>> 2012-08-25 06:56:02,290 INFO  [ome.services.db.DatabaseIdentity] (main) Using LSID format: urn:lsid:export.openmicroscopy.org:%s:67b0b90f-4476-4a22-9e2f-6cc4530fd46e_%s%s
>>>>> 2012-08-25 06:56:02,448 INFO  [ome.services.fulltext.FullTextThread] (main) Initializing Full-Text Indexer
>>>>> 2012-08-25 06:56:02,456 INFO  [ome.tools.hibernate.ExtendedMetadata] (main) Calculating ExtendedMetadata...
>>>>> 2012-08-25 06:56:02,587 INFO  [.services.scheduler.SchedulerFactoryBean] (main) Starting Quartz Scheduler now
>>>>> 2012-08-25 06:56:02,745 INFO  [ome.services.blitz.Entry] (main) ome.pixeldata now accepting connections.
>>>>> 2012-08-25 06:56:03,008 INFO  [ome.security.basic.CurrentDetails] (2-thread-2) Adding log:INSERT,class ome.model.meta.Session,1073
>>>> 
>>>> 
>>>> Was there nothing else in your log? This doesn't look like the process is
>>>> doing anything.
>>> 
>>> This time I attached the PixelData log and the Blitz log, too.
>>> 
>>> Let me know what you make of it!
>> 
>> 
>> In the PixelData log, nothing's happening at all, which may or may not be
>> ok, depending on the file. In this case, I see 25+ minutes of uploading
>> tiles:
>> 
>> 2012-09-03 10:16:12,805 INFO  [ome.services.util.ServiceHandler] (Server-285)  Meth: interface ome.api.RawPixelsStore.setTile
>> 
>> which means that it's fine for there to be nothing in the PixelData log. But
>> I never see this work continue. Either the client has thrown an
>> exception, or the server is hung. Which of these seems more like what
>> you're experiencing?
>> 
>> Cheers,
>> ~Josh.
>> 
>>> Cheers.
>>> --JC
>>> 
>>> 
>>> 
>>>> 
>>>> 
>>>>>>> So I have a better handle on it all, can somebody explain why it's such
>>>>>>> a slow procedure? It's barely using any CPU time or RAM currently.
>>>>>>> Perhaps it's my lack of understanding as to how the rendering/generation
>>>>>>> works that makes me perceive it as "slow".
>>>>>> 
>>>>>> While it's working, could you possibly attach to the PID with jstack and
>>>>>> send us the output?
>>>>> 
>>>>> OK. *which* specific PID/name from a ps -ef | grep -blah should I be
>>>>> isolating here?
>>>> 
>>>> Either:
>>>> ps -ef | grep PixelData
>>>> 
>>>> Or:
>>>> bin/omero admin ice server pid PixelData-0
>>>> 
>>>> 
>>>> `bin/omero admin diagnostics` also prints it.
>>>> 
>>>> Cheers,
>>>> ~J.
>>>> 
>>>>> Cheers.
>>>>> --JC
>>>>> 
>>>>>>> Thanks!
>>>>>>> --JC
>>>>>> 
>>>>>> Thanks as well,
>>>>>> ~Josh.
>>>> 
>>>> 
>>>> 
>>> 
>>> <omero.outputs.20120309.zip>
>> 
> 



