<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7650.28">

<TITLE>RE: [ome-devel] Install failure loading Experiment.ome</TITLE>

</HEAD>

<BODY>

<!-- Converted from text/plain format -->


<P><FONT SIZE=2>It's hard to tell how many vendors have this issue; a quick Google search shows some fully compliant, some partially, and some not at all. It seems to be improving, however: OS X, for example, was not POSIX-compliant but apparently is now. The vendor in question has a proprietary OS, and their tech support can't even tell me whether their NFS implementation is POSIX-compliant. They're using some sort of modified version of Apache strictly for management; there's no way I'd run OME on it even if they'd let me.<BR>

<BR>

As NFS doesn't seem to be a problem for most OME users, there's no reason for you to worry about it.<BR>

<BR>

A quick run of test-concurrent-write returns:<BR>

In PID 17617, Error calling test-concurrent-write: NewPixels failed.<BR>

System Error: No such file or directory<BR>

OMEIS Error: Couldn't get next Pixels ID: No such file or directory<BR>

<BR>

Is that indicative of a file system problem, or something else?<BR>

<BR>

Do CIFS shares show this same problem? Y'all are running large dbs, is it all on local filesystems?<BR>

<BR>

Mike<BR>

<BR>

Michael J. McCaughey, PhD<BR>

Molecular Physiology and Biophysics<BR>

U9203 MRBIII<BR>

6-6175<BR>

<BR>

<BR>

<BR>

-----Original Message-----<BR>

From: Ilya Goldberg [<A HREF="mailto:igg@nih.gov">mailto:igg@nih.gov</A>]<BR>

Sent: Fri 10/6/2006 12:31 PM<BR>

To: McCaughey, Michael J<BR>

Subject: Re: [ome-devel] Install failure loading Experiment.ome<BR>

<BR>

I realize that not being able to mount the OMEIS repository as a&nbsp;<BR>

share is a potentially huge problem.&nbsp; I was more wondering about how&nbsp;<BR>

common it is for NFS servers not to correctly implement POSIX-<BR>

compliant file locking (or fcntl() locking, as it's sometimes known).&nbsp;&nbsp;<BR>

I know that there are NFS servers that are fully POSIX-compliant with&nbsp;<BR>

respect to file locking.&nbsp; Sometimes this is buried in the NFS server/<BR>

client itself (the way it's done in OS X, I believe), other times it&nbsp;<BR>

is done using separate daemons (statd and lockd).&nbsp; I think the best&nbsp;<BR>

place to have this resolved is by talking to the vendors of your NFS&nbsp;<BR>

server and client OSes.<BR>

<BR>

My bet is that the segmentation fault occurs within Berkeley DB,&nbsp;<BR>

because it tends to do that unless file locking is done exactly&nbsp;<BR>

right.&nbsp; It even says so in their documentation.&nbsp; A little while ago&nbsp;<BR>

we discovered an occasional crash there (like 2 or 3 times out of&nbsp;<BR>

thousands) that was traced back to a race condition within the&nbsp;<BR>

Berkeley DB code.&nbsp; It could only be replicated with 16 or more&nbsp;<BR>

concurrent request loops going to a multi-CPU server.&nbsp; You might try&nbsp;<BR>

running src/C/omeis/test-concurrent-write.&nbsp; It requires an OMEIS test&nbsp;<BR>

directory called &quot;OMEIS-TEST&quot; in the &quot;current working directory&quot;.&nbsp; It&nbsp;<BR>

forks 64 processes and has each of them issue 1,000 OMEIS writes by&nbsp;<BR>

directly calling the OMEIS routines (i.e., not via the normal omeis&nbsp;<BR>

http interface).&nbsp; If locking is the problem, this is generally&nbsp;<BR>

sufficient to expose it.<BR>
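<BR>
For anyone who wants to check a share without OMEIS in the picture, here is a minimal sketch of the same idea.<BR>
It is not the test-concurrent-write source; run_lock_test, bump_counter and set_lock are hypothetical names.<BR>
It forks several processes that each increment a counter in one file under a blocking fcntl() write lock.<BR>
If locking works, the final count equals processes times iterations.<BR>

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Take (F_WRLCK) or release (F_UNLCK) a whole-file fcntl() lock.
 * F_SETLKW blocks until the lock is granted; retry if interrupted. */
static int set_lock(int fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0; /* 0 means "through end of file" */
    int rc;
    do {
        rc = fcntl(fd, F_SETLKW, &fl);
    } while (rc == -1 && errno == EINTR);
    return rc;
}

/* Read, increment and write back a long at offset 0 while holding the lock. */
static int bump_counter(const char *path)
{
    int fd = open(path, O_RDWR);
    if (fd < 0) return -1;
    if (set_lock(fd, F_WRLCK) < 0) { close(fd); return -1; }
    long n = 0;
    int ok = pread(fd, &n, sizeof n, 0) == (ssize_t)sizeof n;
    if (ok) {
        n++;
        ok = pwrite(fd, &n, sizeof n, 0) == (ssize_t)sizeof n;
    }
    set_lock(fd, F_UNLCK);
    close(fd);
    return ok ? 0 : -1;
}

/* Hypothetical driver: creates (and truncates!) the file at path, forks
 * nproc children that each bump the counter niter times, and returns the
 * final counter value, or -1 if anything failed along the way. */
long run_lock_test(const char *path, int nproc, int niter)
{
    assert(path != NULL && nproc > 0 && niter > 0);
    long n = 0;
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    ssize_t w = pwrite(fd, &n, sizeof n, 0);
    close(fd);
    if (w != (ssize_t)sizeof n) return -1;

    int failed = 0;
    for (int p = 0; p < nproc; p++) {
        pid_t pid = fork();
        if (pid < 0) { failed = 1; break; }
        if (pid == 0) {
            for (int i = 0; i < niter; i++)
                if (bump_counter(path) < 0) _exit(1);
            _exit(0);
        }
    }
    int status;
    while (wait(&status) > 0) /* reap every child we started */
        if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) failed = 1;
    if (failed) return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t r = pread(fd, &n, sizeof n, 0);
    close(fd);
    return r == (ssize_t)sizeof n ? n : -1;
}
```

Pointed at a scratch file on the share, a call such as run_lock_test(path, 64, 1000) should come back with 64000.<BR>
A smaller number suggests lost updates, and -1 means an open, lock, read or write call failed.<BR>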

<BR>

This was worked around by doing our own locking before handing things&nbsp;<BR>

off to Berkeley DB.&nbsp; Of course, if our locking isn't supported in&nbsp;<BR>

NFS, then Berkeley DB will go back to dumping core if it's not happy&nbsp;<BR>

with things.&nbsp; Berkeley DB is used in a great many Unix apps, so having&nbsp;<BR>

dodgy file locking on shares will likely have some pretty widely felt&nbsp;<BR>

effects.&nbsp; All we ever wanted out of Berkeley DB was a balanced B-tree&nbsp;<BR>

algorithm that works efficiently with lots and lots of entries in a&nbsp;<BR>

file shared by multiple concurrent processes.&nbsp; Can't get nothin for&nbsp;<BR>

free, I tell ya.<BR>

<BR>

It would be a pity if this feature is broken in many NFS server&nbsp;<BR>

implementations.&nbsp; It would have a very much larger effect on shared&nbsp;<BR>

file resources other than just OME.&nbsp; OMEIS isn't the only application&nbsp;<BR>

out there that depends on file-locking.&nbsp; It's a big problem for mail&nbsp;<BR>

spool files, for example (that's actually exactly the same problem as&nbsp;<BR>

OMEIS has - granular shared reading and exclusive writing of specific&nbsp;<BR>

portions of potentially very big files).<BR>

<BR>

If POSIX file locking is abandon-ware as far as most NFS servers are&nbsp;<BR>

concerned, then we will need to implement some kind of work-around.&nbsp;&nbsp;<BR>

It would not be very difficult, though it would make OMEIS a lot less&nbsp;<BR>

efficient.&nbsp; Probably the most straightforward way to do it is to&nbsp;<BR>

establish exclusivity using a sentinel file.&nbsp; I would like to&nbsp;<BR>

preserve fcntl locking if it's available though (it is part of the&nbsp;<BR>

POSIX standard, after all), and ideally not do anything about it if&nbsp;<BR>

vendors generally have solutions to comply with standards.<BR>
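<BR>
To make the sentinel-file idea concrete, here is a rough sketch, not anything that exists in OMEIS.<BR>
The names sentinel_try_acquire and sentinel_release are hypothetical.<BR>
It uses the traditional link() trick rather than O_EXCL, on the assumption that exclusive create may not be dependable on older NFS versions.<BR>
Each process creates a uniquely named file, hard-links it to the sentinel name, and treats a link count of 2 as proof that it won.<BR>

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Try once to take the sentinel at lock_path.
 * Returns 0 if this process now holds it, -1 if it is held or on error. */
int sentinel_try_acquire(const char *lock_path)
{
    assert(lock_path != NULL);

    /* Build a name unique across NFS clients: hostname plus PID. */
    char host[256];
    if (gethostname(host, sizeof host) != 0) return -1;
    host[sizeof host - 1] = '\0';

    char tmp[4096];
    int n = snprintf(tmp, sizeof tmp, "%s.%s.%ld",
                     lock_path, host, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof tmp) return -1;

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return -1;
    close(fd);

    /* link() either creates lock_path or fails because it exists.
     * Its return value is ignored on purpose: the link count of our
     * own file is checked instead, which is 2 only if the link was made. */
    (void)link(tmp, lock_path);
    struct stat st;
    int ok = stat(tmp, &st) == 0 && st.st_nlink == 2;

    unlink(tmp); /* the sentinel name itself stays behind if we won */
    return ok ? 0 : -1;
}

/* Give the sentinel up so another process can take it. */
int sentinel_release(const char *lock_path)
{
    assert(lock_path != NULL);
    return unlink(lock_path);
}
```

A caller would loop on sentinel_try_acquire with a short sleep, do its write, then call sentinel_release.<BR>
That polling is where the lost efficiency comes from, and a crashed holder leaves a stale sentinel that something has to clean up.<BR>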

<BR>

Is it possible to run just OMEIS directly on the big share?<BR>

<BR>

-Ilya<BR>

<BR>

<BR>

On Oct 6, 2006, at 12:10 PM, McCaughey, Michael J wrote:<BR>

<BR>

&gt; Well, it has the potential to be a big problem here.<BR>

&gt;<BR>

&gt; Without the ability to put OMEIS on a share, my OME usage is&nbsp;<BR>

&gt; limited by my local disk capacity. I have a backlog of &gt;10TB of&nbsp;<BR>

&gt; images to import, but the server has nowhere near enough capacity,&nbsp;<BR>

&gt; so I've been trying to use an NFS share from a NAS box that I have&nbsp;<BR>

&gt; 64TB on.<BR>

&gt;<BR>

&gt; The IT people have not been able to make CIFS work with *nix&nbsp;<BR>

&gt; clients on the NAS box (don't even start me on that), so I haven't&nbsp;<BR>

&gt; been able to try it. Is anyone putting OMEIS out on a CIFS share?<BR>

&gt;<BR>

&gt; Mike<BR>

&gt;<BR>

&gt; Michael J. McCaughey, PhD<BR>

&gt; Molecular Physiology and Biophysics<BR>

&gt; U9203 MRBIII<BR>

&gt; 6-6175<BR>

&gt;<BR>

&gt;<BR>

&gt;<BR>

&gt; -----Original Message-----<BR>

&gt; From: Ilya Goldberg [<A HREF="mailto:igg@nih.gov">mailto:igg@nih.gov</A>]<BR>

&gt; Sent: Fri 10/6/2006 8:19 AM<BR>

&gt; To: McCaughey, Michael J<BR>

&gt; Subject: Re: [ome-devel] Install failure loading Experiment.ome<BR>

&gt;<BR>

&gt; Aha!<BR>

&gt; It's the dread NFS share that I bet doesn't support fully POSIX-<BR>

&gt; compliant file-locking.&nbsp; There's discussion of this on-line, along<BR>

&gt; with some suggested fixes (it's not OME-specific).&nbsp; I don't know how<BR>

&gt; big a problem this is generally speaking.&nbsp; Anyone?<BR>

&gt;<BR>

&gt; -Ilya<BR>

&gt;<BR>

&gt;<BR>

&gt; On Oct 2, 2006, at 4:45 PM, McCaughey, Michael J wrote:<BR>

&gt;<BR>

&gt;&gt; The problem seems to be related to installing OME/OMEIS on an NFS-<BR>

&gt;&gt; mounted share.<BR>

&gt;&gt; I can install perfectly well locally, but installs with only the<BR>

&gt;&gt; Base OME directory and Base OMEIS directory changed to a directory<BR>

&gt;&gt; on the share fail as described. The OME/OMEIS directories are<BR>

&gt;&gt; created with correct permissions by install.pl, and are populated.<BR>

&gt;&gt; The share itself is owned by root with world rwx (777) permissions,<BR>

&gt;&gt; so it doesn't *appear* to be a permission issue.<BR>

&gt;&gt;<BR>

&gt;&gt; Any suggestions anyone?<BR>

&gt;&gt;<BR>

&gt;&gt; Mike<BR>

&gt;&gt;<BR>

&gt;&gt; Michael J. McCaughey, PhD<BR>

&gt;&gt; Molecular Physiology and Biophysics<BR>

&gt;&gt; U9203 MRBIII<BR>

&gt;&gt; 6-6175<BR>

&gt;&gt;<BR>

&gt;&gt;<BR>

&gt;&gt;<BR>

&gt;&gt; -----Original Message-----<BR>

&gt;&gt; From: ome-devel-bounces@lists.openmicroscopy.org.uk on behalf of<BR>

&gt;&gt; Chris Allan<BR>

&gt;&gt; Sent: Sun 9/17/2006 4:57 PM<BR>

&gt;&gt; To: ome-devel@lists.openmicroscopy.org.uk<BR>

&gt;&gt; Subject: Re: [ome-devel] Install failure loading Experiment.ome<BR>

&gt;&gt;<BR>

&gt;&gt;<BR>

&gt;&gt; On 15 Sep 2006, at 15:51, McCaughey, Michael J wrote:<BR>

&gt;&gt;<BR>

&gt;&gt; ...snip...<BR>

&gt;&gt;&gt;<BR>

&gt;&gt;&gt;<BR>

&gt;&gt;&gt; Apache's error log gives:<BR>

&gt;&gt;&gt; [Fri Sep 15 08:25:21 2006] [error] [client 127.0.0.1] In PID 28882,<BR>

&gt;&gt;&gt; Error calling OMEIS: Method parameter missing<BR>

&gt;&gt;&gt; [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] In PID 30644,<BR>

&gt;&gt;&gt; Error calling OMEIS:<BR>

&gt;&gt;&gt; [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] Method<BR>

&gt;&gt;&gt; parameter missing<BR>

&gt;&gt;&gt; [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1]<BR>

&gt;&gt;&gt; [Fri Sep 15 08:27:02 2006] [error] [client 127.0.0.1] Premature end<BR>

&gt;&gt;&gt; of script headers: omeis<BR>

&gt;&gt;&gt;<BR>

&gt;&gt;&gt; Anybody seen this before?<BR>

&gt;&gt; Ugh, yes. That's likely an OMEIS segfault.<BR>

&gt;&gt;<BR>

&gt;&gt; No idea what might be causing it unfortunately and getting cores out<BR>

&gt;&gt; of Apache can be a bit tricky.<BR>

&gt;&gt;<BR>

&gt;&gt; Ciao.<BR>

&gt;&gt;<BR>

&gt;&gt; -Chris<BR>

&gt;&gt; _______________________________________________<BR>

&gt;&gt; ome-devel mailing list<BR>

&gt;&gt; ome-devel@lists.openmicroscopy.org.uk<BR>

&gt;&gt; <A HREF="http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel">http://lists.openmicroscopy.org.uk/mailman/listinfo/ome-devel</A><BR>

&gt;&gt;<BR>

&gt;&gt;<BR>

&gt;&gt; Michael J. McCaughey, PhD<BR>

&gt;&gt; Molecular Physiology and Biophysics<BR>

&gt;&gt; U9203 MRBIII<BR>

&gt;&gt; 6-6175<BR>

&gt;&gt;<BR>


&gt;<BR>

&gt;<BR>

<BR>

<BR>

</FONT>

</P>


</BODY>

</HTML>