[ome-users] Stall issues/download issues still, even with gevent...
Jake Carroll
jake.carroll at uq.edu.au
Sun Jan 31 01:31:31 GMT 2016
Hi again
Unfortunately, I'm still seeing large downloads fail via the web interface.
I'm using a startup string such as this:
omero web start --workers 128 --wsgi-args '--worker-class gevent --error-logfile=/home/omero/OMERO.server/var/log/g_error.log'
It doesn't seem to matter what value I pass to --workers; we still see stalls and failures on downloads over the web interface.
I'm trying to download a 9.5GB ims format file.
The g_error.log looks interesting?
root@omero-prod-gen2:~# tail -f ~omero/OMERO.server/var/log/g_error.log
2016-01-31 09:23:53 [4781] [INFO] Booting worker with pid: 4781
2016-01-31 09:23:53 [4794] [INFO] Booting worker with pid: 4794
2016-01-31 09:23:53 [4798] [INFO] Booting worker with pid: 4798
2016-01-31 09:23:53 [4814] [INFO] Booting worker with pid: 4814
2016-01-31 09:23:53 [4808] [INFO] Booting worker with pid: 4808
2016-01-31 09:23:53 [4823] [INFO] Booting worker with pid: 4823
2016-01-31 09:23:53 [4827] [INFO] Booting worker with pid: 4827
2016-01-31 09:23:53 [4838] [INFO] Booting worker with pid: 4838
2016-01-31 09:23:53 [4858] [INFO] Booting worker with pid: 4858
2016-01-31 09:23:53 [4874] [INFO] Booting worker with pid: 4874
2016-01-31 09:26:00 [3852] [CRITICAL] WORKER TIMEOUT (pid:4608)
2016-01-31 09:26:00 [3852] [CRITICAL] WORKER TIMEOUT (pid:4608)
2016-01-31 09:26:01 [5314] [INFO] Booting worker with pid: 5314
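To make excerpts like the one above easier to compare between runs, here is a minimal Python sketch that pulls the WORKER TIMEOUT events out of the error log. It assumes the log line format shown above; the function name `summarize_timeouts` is hypothetical, and collapsing the repeated line for the same timestamp and pid into one event is a choice of this sketch, not gunicorn behaviour.

```python
import re

# Matches lines such as:
# 2016-01-31 09:26:00 [3852] [CRITICAL] WORKER TIMEOUT (pid:4608)
# Trailing text after the closing parenthesis is ignored.
TIMEOUT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[\d+\] "
    r"\[CRITICAL\] WORKER TIMEOUT \(pid:(\d+)\)"
)


def summarize_timeouts(lines):
    """Return distinct (timestamp, worker pid) timeout events in the order seen."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            event = (match.group(1), int(match.group(2)))
            # The log repeats each timeout line, so skip exact repeats.
            if event not in events:
                events.append(event)
    return events


if __name__ == "__main__":
    # Path taken from the startup string; adjust for your install.
    path = "/home/omero/OMERO.server/var/log/g_error.log"
    with open(path) as handle:
        for timestamp, pid in summarize_timeouts(handle):
            print(timestamp, pid)
```

Run against the log after each attempt, this lists when each worker was killed, which can then be lined up against the time the download was started.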
In this configuration I managed to download more than I ever have before (randomly?), 1.7GB of the file, but it is still failing/stalling.
What could I be missing?
I even tried with 256 workers:
omero@omero-prod-gen2:~$ omero web start --workers 256 --wsgi-args '--worker-class gevent --error-logfile=/home/omero/OMERO.server/var/log/g_error.log'
...but the workers still seem to time out at *some* random point early on:
2016-01-31 09:29:24 [7360] [INFO] Booting worker with pid: 7360
2016-01-31 09:29:24 [7371] [INFO] Booting worker with pid: 7371
2016-01-31 09:30:14 [5433] [CRITICAL] WORKER TIMEOUT (pid:7045) <-- happened almost immediately after booting the workers.
2016-01-31 09:30:14 [5433] [CRITICAL] WORKER TIMEOUT (pid:7045)
2016-01-31 09:30:15 [8273] [INFO] Booting worker with pid: 8273
*SO THEN* I tried booting the worker processes with a very long timeout:
omero web start --workers 256 --wsgi-args '-t 360 --worker-class gevent --error-logfile=/home/omero/OMERO.server/var/log/g_error.log'
And, after a much, much longer download, 4.2GB of my 9.5GB ims file, it finally started to show signs of trouble again:
2016-01-31 09:49:32 [8394] [CRITICAL] WORKER TIMEOUT (pid:10451)
2016-01-31 09:49:32 [8394] [CRITICAL] WORKER TIMEOUT (pid:10451)
2016-01-31 09:49:33 [11503] [INFO] Booting worker with pid: 11503
And then it failed again, unfortunately.
So I made the timeout an enormous number:
omero web start --workers 256 --wsgi-args '-t 1440 --worker-class gevent --error-logfile=/home/omero/OMERO.server/var/log/g_error.log'
...and I can finally download my 9.5GB file over the OMERO web interface, without timeout failures.
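For reference, the options in that last startup string map onto standard gunicorn settings. The fragment below is only a sketch of the equivalent gunicorn configuration file (gunicorn config files are plain Python). Whether `omero web start` can be pointed at such a file is an assumption and has not been verified here; the values are simply the ones from the command above.

```python
# Sketch of a gunicorn config file mirroring the flags used above.
# Assumption: loading this file through OMERO.web is untested.

# Number of worker processes (passed above as `--workers 256`).
workers = 256

# Worker type (passed above as `--worker-class gevent`).
worker_class = "gevent"

# Seconds a worker may go silent before the master kills and restarts it
# (passed above as `-t 1440`).
timeout = 1440

# Where gunicorn writes its error log (passed above as `--error-logfile=...`).
errorlog = "/home/omero/OMERO.server/var/log/g_error.log"
```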
Something doesn't feel quite right, does it?
-jc