[ome-devel] Install failure loading Experiment.ome

McCaughey, Michael J michael.j.mccaughey at Vanderbilt.Edu
Fri Oct 6 19:50:27 BST 2006


It's hard to tell how many vendors have this issue; a quick Google shows some compliant, some partially, some not at all. It seems to be improving, however - OS X, for example, was not POSIX-compliant but apparently is now. The vendor in question has a proprietary OS, and their tech support can't even tell me whether their NFS implementation is POSIX-compliant or not. They're using some sort of modified version of Apache strictly for management; there's no way I'd run OME on it even if they'd let me.

As NFS doesn't seem to be a problem for most OME users, there's no reason for you to worry about it.

A quick run of test-concurrent-write returns:
In PID 17617, Error calling test-concurrent-write: NewPixels failed.
System Error: No such file or directory
OMEIS Error: Couldn't get next Pixels ID: No such file or directory

Is that indicative of a file system problem, or something else?

Do CIFS shares show this same problem? Y'all are running large DBs; is it all on local filesystems?

Mike

Michael J. McCaughey, PhD
Molecular Physiology and Biophysics
U9203 MRBIII
6-6175



-----Original Message-----
From: Ilya Goldberg [mailto:igg at nih.gov]
Sent: Fri 10/6/2006 12:31 PM
To: McCaughey, Michael J
Subject: Re: [ome-devel] Install failure loading Experiment.ome
 
I realize that not being able to mount the OMEIS repository as a
share is a potentially huge problem.  I was more wondering about how
common it is for NFS servers not to correctly implement
POSIX-compliant file locking (or fcntl() locking, as it's sometimes
known).  I know that there are NFS servers that are fully
POSIX-compliant with respect to file locking.  Sometimes this is
buried in the NFS server/client itself (the way it's done in OS X, I
believe); other times it is done using separate daemons (statd and
lockd).  I think the best place to have this resolved is by talking
to the vendors of your NFS server and client OSes.
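
For reference, fcntl() record locking looks roughly like this (a
minimal standalone sketch, not OMEIS code; the path is made up):

/* Minimal sketch of POSIX (fcntl) record locking. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main (void)
{
    struct flock fl;
    int fd = open ("/mnt/nfs-share/lockfile", O_RDWR | O_CREAT, 0666);
    if (fd < 0) { perror ("open"); return 1; }

    fl.l_type   = F_WRLCK;   /* exclusive write lock...           */
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;         /* ...on the first 1024 bytes only - */
    fl.l_len    = 1024;      /* fcntl locks cover byte ranges     */

    /* F_SETLKW blocks until the lock is granted.  On a broken NFS
       implementation this can fail with ENOLCK, hang, or "succeed"
       without actually locking anything. */
    if (fcntl (fd, F_SETLKW, &fl) < 0) { perror ("fcntl"); return 1; }

    /* ... read/write the locked region here ... */

    fl.l_type = F_UNLCK;     /* release the lock */
    fcntl (fd, F_SETLK, &fl);
    close (fd);
    return 0;
}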

My bet is that the segmentation fault occurs within Berkeley DB,
because it tends to do that unless file locking is done exactly
right.  It even says so in their documentation.  A little while ago
we discovered an occasional crash there (like 2 or 3 times out of
thousands) that was traced back to a race condition within the
Berkeley DB code.  It could only be replicated with 16 or more
concurrent request loops going to a multi-CPU server.  You might try
running src/C/omeis/test-concurrent-write.  It requires an OMEIS test
directory called "OMEIS-TEST" in the current working directory.  It
forks 64 processes and has each of them issue 1,000 OMEIS writes by
directly calling the OMEIS routines (i.e., not via the normal omeis
HTTP interface).  If locking is the problem, this is generally
sufficient to expose it.
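
Roughly, the test has this shape (an illustrative sketch, not the
actual source - the real code calls the OMEIS routines, e.g.
NewPixels(), where the placeholder comment is):

/* Fork NPROC children; each performs NLOOPS OMEIS writes. */
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NPROC  64
#define NLOOPS 1000

int main (void)
{
    int i, j;

    for (i = 0; i < NPROC; i++) {
        pid_t pid = fork ();
        if (pid < 0)
            exit (1);                 /* fork failed; bail out */
        if (pid == 0) {               /* child process */
            for (j = 0; j < NLOOPS; j++) {
                /* issue one OMEIS write here (NewPixels() etc.) */
            }
            _exit (0);
        }
    }
    while (wait (NULL) > 0)           /* parent: reap all children */
        ;
    return 0;
}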

This was worked around by doing our own locking before handing things
off to Berkeley DB.  Of course, if our locking isn't supported on
NFS, then Berkeley DB will go back to dumping core if it's not happy
with things.  Berkeley DB is used in a great many Unix apps, so
having dodgy file locking on shares will likely have some pretty
widely felt effects.  All we ever wanted out of Berkeley DB was a
balanced B-tree algorithm that works efficiently with lots and lots
of entries in a file shared by multiple concurrent processes.  Can't
get nothin' for free, I tell ya.
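
The pattern is essentially a wrapper like this around every database
call (a sketch of the idea, not the actual OMEIS source; the names
are made up):

/* Take a POSIX lock, do the Berkeley DB operation, release. */
#include <sys/types.h>
#include <fcntl.h>

static int lock_region (int fd, short type, off_t start, off_t len)
{
    struct flock fl;
    fl.l_type   = type;           /* F_RDLCK, F_WRLCK or F_UNLCK  */
    fl.l_whence = SEEK_SET;
    fl.l_start  = start;
    fl.l_len    = len;            /* len 0 means "to end of file" */
    return fcntl (fd, F_SETLKW, &fl);  /* blocks until granted */
}

int db_put_locked (int lock_fd /*, DB *db, DBT *key, DBT *data */)
{
    int ret;
    if (lock_region (lock_fd, F_WRLCK, 0, 0) < 0)
        return -1;                /* couldn't lock - don't touch the DB */
    ret = 0;   /* ret = db->put (db, NULL, key, data, 0); goes here */
    lock_region (lock_fd, F_UNLCK, 0, 0);
    return ret;
}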

It would be a pity if this feature is broken in many NFS server
implementations.  It would have a much larger effect on shared file
resources than just OME; OMEIS isn't the only application out there
that depends on file locking.  It's a big problem for mail spool
files, for example (that's actually exactly the same problem as OMEIS
has - granular shared reading and exclusive writing of specific
portions of potentially very big files).

If POSIX file locking is abandonware as far as most NFS servers are
concerned, then we will need to implement some kind of workaround.
It would not be very difficult, though it would make OMEIS a lot less
efficient.  Probably the most straightforward way to do it is to
establish exclusivity using a sentinel file.  I would like to
preserve fcntl locking if it's available, though (it is part of the
POSIX standard, after all), and ideally not do anything about it if
vendors generally have solutions to comply with standards.
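
Something along these lines (a sketch only; the function names are
made up, and note that O_CREAT|O_EXCL itself was historically not
atomic over NFSv2 - the traditional dodge there was link()):

/* Sentinel-file exclusivity as a fallback when fcntl is broken. */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

int acquire_sentinel (const char *path)
{
    for (;;) {
        /* O_CREAT|O_EXCL: fail if the file already exists */
        int fd = open (path, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd >= 0)
            return fd;            /* we created it; we hold the lock */
        if (errno != EEXIST)
            return -1;            /* real error */
        usleep (10000);           /* someone else holds it; retry */
    }
}

void release_sentinel (int fd, const char *path)
{
    close (fd);
    unlink (path);                /* lets the next process in */
}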

Is it possible to run just OMEIS directly on the machine with the big share?

-Ilya


On Oct 6, 2006, at 12:10 PM, McCaughey, Michael J wrote:

> Well, it has the potential to be a big problem here.
>
> Without the ability to put OMEIS on a share, my OME usage is
> limited by my local disk capacity. I have a backlog of >10TB of
> images to import, but the server has nowhere near enough capacity,
> so I've been trying to use an NFS share from a NAS box that has
> 64TB on it.
>
> The IT people have not been able to make CIFS work with *nix
> clients on the NAS box (don't even start me on that), so I haven't
> been able to try it. Is anyone putting OMEIS out on a CIFS share?
>
> Mike
>
> Michael J. McCaughey, PhD
> Molecular Physiology and Biophysics
> U9203 MRBIII
> 6-6175
>
>
>
> -----Original Message-----
> From: Ilya Goldberg [mailto:igg at nih.gov]
> Sent: Fri 10/6/2006 8:19 AM
> To: McCaughey, Michael J
> Subject: Re: [ome-devel] Install failure loading Experiment.ome
>
> Aha!
> It's the dreaded NFS share that I bet doesn't support fully POSIX-
> compliant file locking.  There's discussion of this online, along
> with some suggested fixes (it's not OME-specific).  I don't know how
> big a problem this is, generally speaking.  Anyone?
>
> -Ilya
>
>
> On Oct 2, 2006, at 4:45 PM, McCaughey, Michael J wrote:
>
>> The problem seems to be related to installing OME/OMEIS on an
>> NFS-mounted share.
>> I can install perfectly well locally, but installs with only the
>> Base OME directory and Base OMEIS directory changed to a directory
>> on the share fail as described. The OME/OMEIS directories are
>> created with correct permissions by install.pl, and are populated.
>> The share itself is owned by root with world rwx (777) permissions,
>> so it doesn't *appear* to be a permission issue.
>>
>> Any suggestions anyone?
>>
>> Mike
>>
>> Michael J. McCaughey, PhD
>> Molecular Physiology and Biophysics
>> U9203 MRBIII
>> 6-6175
>>
>>
>>
>> -----Original Message-----
>> From: ome-devel-bounces at lists.openmicroscopy.org.uk on behalf of
>> Chris Allan
>> Sent: Sun 9/17/2006 4:57 PM
>> To: ome-devel at lists.openmicroscopy.org.uk
>> Subject: Re: [ome-devel] Install failure loading Experiment.ome
>>
>>
>> On 15 Sep 2006, at 15:51, McCaughey, Michael J wrote:
>>
>> ...snip...
>>>
>>>
>>> Apache's error log gives:
>>> [Fri Sep 15 08:25:21 2006] [error] [client 127.0.0.1] In PID 28882,
>>> Error calling OMEIS: Method parameter missing
>>> [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] In PID 30644,
>>> Error calling OMEIS:
>>> [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] Method
>>> parameter missing
>>> [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1]
>>> [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] Premature end
>>> of script headers: omeis
>>>
>>> Anybody seen this before?
>> Ugh, yes. That's likely an OMEIS segfault.
>>
>> No idea what might be causing it unfortunately and getting cores out
>> of Apache can be a bit tricky.
>>
>> Ciao.
>>
>> -Chris
>> _______________________________________________
>> ome-devel mailing list
>> ome-devel at lists.openmicroscopy.org.uk
>> http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel
>>
>>
>> Michael J. McCaughey, PhD
>> Molecular Physiology and Biophysics
>> U9203 MRBIII
>> 6-6175
>>
>
>

