[ome-users] List expected files in omero data Files/ and omero scripts

Carnë Draug carandraug+dev at gmail.com
Wed Sep 14 16:08:15 BST 2016


On 14 September 2016 at 10:44, "Colin Blackburn"
<C.Blackburn at dundee.ac.uk> wrote:
> Hi Carnë,
>
> On 13/09/2016 18:05, "ome-users on behalf of Carnë Draug"
> <ome-users-bounces at lists.openmicroscopy.org.uk on behalf of
> carandraug+dev at gmail.com> wrote:
>
>>Hi
>>
>>I have been trying to get a list of all files that ought to be in the
>>Files/ directory.  This is a 5.2.5 installation of omero but has been
>>through several upgrades so it still has files from omero 4 releases.
>>
>>At the moment, I'm using this query:
>>
>>  SELECT
>>    id
>>    FROM originalfile
>>    WHERE (repo IS NULL OR repo = '') AND mimetype != 'Repository';
>>
>>but I found out that this includes some omero scripts (some of the rows
>>from that query have a mimetype of 'text/x-python' and their filenames
>>are the omero scripts including some of old scripts that no longer exist
>>such as Movie_Figure.py and Make_Movie.py).
>>
>>I thought scripts were kept where they are and not copied.  Indeed, new
>>scripts have a repo value of 'ScriptRepo'.  Is this because omero scripts
>>used to be copied to Files/ and is my query correct?
>
> Official scripts are at paths relative to the server installation, rather
> than the data directory, so they do not have any entries under Files/.
> This applies to those scripts in the structured folders under
> 'lib/scripts/omero/' on server startup and those uploaded officially (via
> 'bin/omero script upload FooBar.py --official' say). In the latter case
> the path on the OriginalFile table is likely to be '/'.
>
> When an official script is replaced, or modified in-place and the server
> restarted, the old row is left in the database with the repo column
> nulled. The mimetype is left as 'text/x-python' (or 'text/x-jython' or
> 'text/x-matlab' if relevant). A new row is then created with the repo set
> to 'ScriptRepo' and a new hash. (And with the addition of look-up tables
> in 5.3 you will probably see files with 'ScriptRepo' and 'text/x-lut' also
> appear, though this may be subject to change before release.)
>
> When a user script is uploaded, i.e. not an official one, this *is* stored
> under Files/.  Such scripts will not have the repo or path columns set in
> their OriginalFile row.

If I understood correctly, you are saying that the way to differentiate
between a script that was uploaded (not official) and an old official
script is an empty path.  So instead of:

  SELECT
    id
    FROM originalfile
    WHERE (repo IS NULL OR repo = '') AND mimetype != 'Repository';

I should be doing:

  SELECT
    id
    FROM originalfile
    WHERE (repo IS NULL OR repo = '') AND mimetype != 'Repository'
          AND ((mimetype != 'text/x-python' AND mimetype != 'text/x-jython'
                AND mimetype != 'text/x-matlab') OR path IS NULL OR path = '');

to account for the special cases of old official scripts.

But this still won't work.  If a python script is an attachment to an image
it will go into the originalfile table, have a 'text/x-python' mimetype
and an originalfile.path value, be stored in Files/, but would still be
filtered out by that query.  How can I distinguish between an old script
whose repo value has been cleared after an omero upgrade (and therefore
won't be in Files/) and a script file that was attached to an image?
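
To make the problem concrete, here is a minimal sketch that runs the
second query against an in-memory SQLite table.  The table is reduced to
the five columns discussed in this thread and every row is made up for
illustration (names, paths, and mimetypes are assumptions, not values
from a real OMERO database, which would be PostgreSQL with many more
columns).

```python
import sqlite3

# The refined query from this message, with ORDER BY added for a stable result.
QUERY = """
SELECT id
  FROM originalfile
  WHERE (repo IS NULL OR repo = '') AND mimetype != 'Repository'
        AND ((mimetype != 'text/x-python' AND mimetype != 'text/x-jython'
              AND mimetype != 'text/x-matlab') OR path IS NULL OR path = '')
  ORDER BY id;
"""

# Hypothetical rows: (id, name, path, repo, mimetype).
ROWS = [
    (1, "cells.tif", "/data/", None, "image/tiff"),             # uploaded file
    (2, "Movie_Figure.py", "/omero/figure_scripts/", None,
     "text/x-python"),                                          # old official script
    (3, "my_script.py", None, None, "text/x-python"),           # user script
    (4, "Make_Movie.py", "/omero/export_scripts/", "ScriptRepo",
     "text/x-python"),                                          # current official script
    (5, "analysis.py", "/home/user/", None, "text/x-python"),   # attachment to an image
    (6, "repo-root", "/", None, "Repository"),                  # repository row
]


def select_ids(rows):
    """Load the rows into a throwaway SQLite table and run QUERY on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE originalfile "
        "(id INTEGER PRIMARY KEY, name TEXT, path TEXT, repo TEXT, mimetype TEXT)"
    )
    conn.executemany("INSERT INTO originalfile VALUES (?, ?, ?, ?, ?)", rows)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


selected = select_ids(ROWS)
print(selected)
```

With these sample rows the query returns ids 1 and 3.  Row 2 (the old
official script) is correctly dropped, but row 5 (the attachment) is
dropped as well, because nothing in these columns tells the two apart.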

> Other OriginalFile entries that will have actual file entries under Files/
> are uploaded files (using 'bin/omero upload' or the graphical clients) and
> pre-OMERO5 archived files. In addition the server may create some other
> files under Files/ (to capture stderr, for instance) that will then have
> entries in OriginalFiles.
>
> I may have missed some potential candidates for Files/ but I'm sure Josh
> will be along soon to add those if that is the case!

But given an arbitrary omero database, how can I distinguish between rows
in the originalfile table that should be in Files and rows that should be
somewhere else?  I'm facing the issue now with these old scripts but my
problem really is getting a list of files that omero expects to be in Files/.

I understand that many of the files end up in the originalfile table but
that table doesn't have that information.  It seems like the typical use
case is that rows come from another table and sometimes end up in
originalfile.  Should I be getting all the ids from other tables instead?

>>And is my query correct to get a list of all files expected to be under
>>Files/ ? Is there something else to look for?  What should the query be
>>for an arbitrary omero database (the long plan is to write a tool to
>>validate any omero database against an omero data directory -- the
>>opposite of omero cleanse).
>
> Have you looked at investigating the problem the other way around, more
> like cleanse? Collecting the candidate Ids from the file names under
> Files/ and then looking at the rows in the table that don't match these
> files?
>

No, because we are trying to work around an issue with that approach.
We want to validate the omero data directory given an omero database.  If a file
is missing there, for example, because of something like omero 2016-SV1 [1],
how can we identify it?
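
The check I have in mind would look roughly like the sketch below: take
the ids the database expects, map each one to a path under Files/, and
report the ones with no file on disk.  The names find_missing and
flat_layout are hypothetical, and the mapping from id to path is passed
in by the caller because the flat layout used here is only a stand-in
for illustration, not the layout omero actually uses.

```python
import os
import tempfile


def find_missing(expected_ids, files_dir, path_for_id):
    """Return the sorted ids whose expected file is absent under files_dir.

    path_for_id is a caller-supplied function that maps an id to a path
    relative to files_dir.
    """
    missing = []
    for file_id in sorted(expected_ids):
        full_path = os.path.join(files_dir, path_for_id(file_id))
        if not os.path.isfile(full_path):
            missing.append(file_id)
    return missing


def flat_layout(file_id):
    # Hypothetical layout for the demo: one file per id, directly in files_dir.
    return str(file_id)


def demo():
    """Create a temporary directory holding files 1 and 3, then check 1, 3, 5."""
    with tempfile.TemporaryDirectory() as files_dir:
        for file_id in (1, 3):
            with open(os.path.join(files_dir, flat_layout(file_id)), "w") as f:
                f.write("placeholder")
        return find_missing({1, 3, 5}, files_dir, flat_layout)


missing = demo()
print(missing)
```

This only works if the set of expected ids is right in the first place,
which is exactly the part I cannot get from the originalfile table.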

Carnë

[1] http://www.openmicroscopy.org/site/products/omero/secvuln/2016-SV1-cleanse

