[ome-users] issue and fix for MetaXpress multi-line descriptions

Mario Emmenlauer mario at emmenlauer.de
Sat Mar 28 12:00:26 GMT 2015


Dear Roger,

I'll try to add all I know about this (see below). Let me know if
you have more questions. One more thing: I have a huge test set of
MetaXpress images, so if you can provide a different solution, I
might be able to test it for you?

On 27.03.2015 17:20, Roger Leigh wrote:
> On 26/03/15 14:07, Mario Emmenlauer wrote:
>>
>> Dear Bio-Formats developers,
>>
>> I have a small issue with MetaXpress metadata, and a possible fix.
>> When I read the image (link below), there are several global metadata
>> entries that are not correct. I followed up where they come from, and
>> it turns out that MetaXpress has a free-text box where users can enter
>> a 'Description' for their screening experiment. Multi-line entry is
>> supported in MetaXpress, which leads to "wrong" interpretation of the
>> metadata in Bio-Formats.
>>
>> Here is the example from the image below. For the field 'Description'
>> our user had entered in MetaXpress:
>>      siRNA transfection of HeLa cells
>>      Entry 4h
>>      extracellular: red
>>      intracellular: red and green
>>      DAPI staining (1:1000)
>>      [...]
>>
>> Of this Description, several lines are missing in Bio-Formats. For
>> the text that is not missing, I can find fields like:
>>      'extracellular'          'red'
>>      'intracellular'          'red and green'
>>      'DAPI staining (1'       '1000)'
>>
>> This is not how I think the metadata should be reported :-)
>>
>>
>> The best solution I could think of is a defined vocabulary of keys
>> from MetaXpress, to separate them from the free-text entered by the
>> user. Attached is a patch that does exactly this, and works well
>> for me. The patch should apply smoothly to Bio-Formats 5.0.7 file
>> components/formats-gpl/src/loci/formats/in/MetamorphReader.java
>>
>> For me, the patch achieves a second functionality: there are some
>> keys that do not have a colon, in other words they consist only of
>> a key and no value (an example is "Acquired from Photometrics"). For
>> me it makes sense to store them as both key and value, i.e. to have
>> it reported as:
>>      'Acquired from Photometrics' = 'Acquired from Photometrics'
>>
>> This is my personal preference, you can ignore this part of the patch
>> if you do not find it useful.
>>
>> One last thing: I store the free-text in key 'Global Description'.
>> I think for you it's more common to use the key 'Global Comment'? Feel
>> free to change this to your liking.
>>
>>
>> The image I used is here:
>>      http://data.marssoft.de/bBZIX-021_wD13_s3_z0_t1_cCy5_u001.tif
> 
> Thanks for the patch.
> 
> Looking at it, the behaviour seems generally sensible, though I would
> probably just set
>   Acquired from Photometrics = 1
> to avoid the duplication of information.

I slightly prefer my behavior, because in this case the "1" would be
an arbitrary choice to say "the value exists". It's difficult to decide
on an arbitrary value: it could be "1" or "exists" or "present" or "yes"
or something else. Using the key as the value is indeed redundant, but
it solves the problem of choice for me. That's of course just my
personal opinion... :-)
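
To make the two options concrete, here is a minimal Python sketch. It is
not the patch itself (that is Java); the names split_line and
key_only_value are made up for illustration. It splits on the first colon
only, and for a line without a colon it either repeats the key or uses a
fixed marker such as "1":

```python
def split_line(line, key_only_value=None):
    """Split one description line into a (key, value) pair.

    Hypothetical helper: lines without a colon are treated as key-only.
    By default the key is reused as the value; pass key_only_value="1"
    (or any other marker) to get the alternative behaviour.
    """
    key, sep, value = line.partition(":")  # split on the first colon only
    key = key.strip()
    if not sep:
        return key, (key if key_only_value is None else key_only_value)
    return key, value.strip()


print(split_line("Acquired from Photometrics"))
print(split_line("Acquired from Photometrics", key_only_value="1"))
print(split_line("Exposure: 25 ms"))
```

Either way the key ends up in the metadata table; the only difference is
which placeholder value a reader of the table sees.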


> Since the description contains free-form metadata which we can't
> sensibly parse into key-value pairs, one question I have is whether the
> free-text is in a single contiguous block or interspersed with
> MetaXpress keys.  For example, if it's e.g. a leading contiguous block,
> that would mean we could be a bit more intelligent about processing the
> remaining lines--once we've found a known key, we could avoid adding all
> subsequent lines to the Comment.  Or if it's always after a certain key
> like "Exposure:".  In your example:
> 
> ------------------------------------------------------------------------
> Experiment base name:AC20-2-1--TetR-GFPforEntryAssay
> Experiment set:AC20-2-1--TetR-GFPforEntryAssay

The above two "Experiment base name:" and "Experiment set:" are
standard MetaXpress key-value-pairs. They are not mandatory (so not
all images have them), but they exist quite often in my images.


> siRNA transfection of HeLa cells
> infection with ACBr165 (TetR-GFP and const. dsRED
> Entry 4h
> Induction of TetR-GFP for 4 h together with Gentamycin killing of
> extracellular bacteria
> extracellular: red
> intracellular: red and green
> DAPI staining (1:1000)
> Dy-647-phalloidin (1:100)

The block above is the free-text I was referring to. I'm pretty
sure that it always comes as one consecutive block, but I did not
check this carefully. Also, I think it does not have to be at a
fixed position in the Description field. I can say it does not have
to follow the key "Experiment set", because the latter is an
optional key. I don't know if it's always followed by "Exposure:"
for example.
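
One way to check the "one consecutive block" assumption over a larger set
of images would be something like the Python sketch below. It is only an
illustration: free_text_runs is a made-up name, and the key set in the
example is a small subset, not a complete vocabulary. It reports every
run of consecutive lines that do not start with a known key, so more than
one run would mean the free text is interspersed with keys:

```python
def free_text_runs(lines, known_keys):
    """Return (start, end) index pairs, end exclusive, one per run of
    consecutive lines whose text before the first colon is not a known key."""
    runs = []
    start = None
    for i, line in enumerate(lines):
        key = line.partition(":")[0].strip()
        if key in known_keys:
            if start is not None:      # a known key closes the current run
                runs.append((start, i))
                start = None
        elif start is None:            # first free-text line opens a run
            start = i
    if start is not None:              # run that extends to the last line
        runs.append((start, len(lines)))
    return runs


# Subset of keys, for illustration only.
keys = {"Experiment set", "Exposure", "Binning"}
lines = [
    "Experiment set:AC20-2-1--TetR-GFPforEntryAssay",
    "siRNA transfection of HeLa cells",
    "extracellular: red",
    "DAPI staining (1:1000)",
    "Exposure: 25 ms",
    "Binning: 1 x 1",
]
runs = free_text_runs(lines, keys)
print(runs)  # exactly one pair means the free text is one contiguous block
```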


> Exposure: 25 ms
> Binning: 1 x 1
> Region: 1392 x 1040, offset at (0, 0)
> Acquired from Photometrics
> Subtract: Off
> Shading: Off
> Digitizer: 20 MHz
> Gain: Gain 1 (1x)
> Camera Shutter: Always Open
> Clear Count: 2
> Clear Mode: CLEAR PRE SEQUENCE
> Frames to Average: 1
> Trigger Mode: Normal (TIMED)
> Temperature: -29.95

All the remaining lines are again standard MetaXpress key-value-pairs.
I think most of them are again not mandatory (so not all images
have them), but I did not check this in detail. I can say that all
my MetaXpress images of this format have at least 8 of the above
16 standard MetaXpress key-value-pairs. This was an arbitrary check
I implemented after I realized that not all 16 must always be
present.
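
As a rough Python sketch of the idea (not the actual patch, which is Java):
the key list below is just the 16 keys from this one example, not an
official MetaXpress vocabulary, and the names parse_description,
KNOWN_KEYS and MIN_KNOWN_KEYS are made up. Returning None when too few
keys match is also only one possible way to signal "not this format":

```python
# The 16 keys seen in the example above. ASSUMPTION: taken from a single
# image; this is not an official or complete MetaXpress vocabulary.
KNOWN_KEYS = (
    "Experiment base name", "Experiment set", "Exposure", "Binning",
    "Region", "Acquired from Photometrics", "Subtract", "Shading",
    "Digitizer", "Gain", "Camera Shutter", "Clear Count", "Clear Mode",
    "Frames to Average", "Trigger Mode", "Temperature",
)
MIN_KNOWN_KEYS = 8  # arbitrary threshold


def parse_description(text, free_text_key="Global Description"):
    """Split a Description into known key-value pairs plus free text.

    Returns None if fewer than MIN_KNOWN_KEYS known keys are found,
    so the caller can fall back to some other way of parsing.
    """
    metadata = {}
    free_text = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if key in KNOWN_KEYS:
            # Key-only lines (no colon) get the key as their value.
            metadata[key] = value.strip() if sep else key
        elif line.strip():
            # Anything else is user free text, even if it contains a colon.
            free_text.append(line.strip())
    if len(metadata) < MIN_KNOWN_KEYS:
        return None
    if free_text:
        metadata[free_text_key] = "\n".join(free_text)
    return metadata
```

With this, a line such as "DAPI staining (1:1000)" stays intact in the
free-text entry instead of being split at its colon.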


> Another thing is how to handle the parsing if it's not in an XML
> Description element.  Currently the code is duplicated in
> MetamorphReader after "// parse (mangle) TIFF comment".  If the method
> in MetamorphHandler was made static, this could be used here as well, to
> remove the duplication.

Yes, very good observation! In my personal parser I use the same
code to parse the Description element for both the XML and the
non-XML cases; they work 100% identically for me. It would very
likely make sense for you to do the same...
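
Roughly what I mean, as a Python sketch rather than real Bio-Formats code.
All function names are made up, and the XML layout used here (a prop
element with id="Description" and the text in its value attribute) is an
assumption for illustration that should be checked against real files.
The point is only that both entry paths hand the raw text to one shared
function:

```python
import xml.etree.ElementTree as ET


def description_lines(text):
    """Shared step: split raw Description text into non-empty, stripped lines."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def lines_from_tiff_comment(comment):
    # Non-XML case: the TIFF comment is taken to be the description text.
    return description_lines(comment)


def lines_from_xml(xml_text):
    # XML case. ASSUMPTION: the description sits in the "value" attribute
    # of a <prop id="Description"> element; adapt this to the real layout.
    root = ET.fromstring(xml_text)
    for prop in root.iter("prop"):
        if prop.get("id") == "Description":
            return description_lines(prop.get("value", ""))
    return []
```

Any later key/value handling would then operate on the returned lines and
would not need to know which of the two paths they came from.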

Cheers and all the best,

    Mario



> Kind regards,
> Roger


-- 
Mario Emmenlauer BioDataAnalysis             Mobil: +49-(0)151-68108489
Balanstrasse 43                    mailto: mario.emmenlauer * unibas.ch
D-81669 München                          http://www.biodataanalysis.de/


