<div dir="ltr">Hi,<div><br></div><div><div>> > You know what would be really cool? If we could create an</div><div>> > object-store provider backend for OMERO to tap into object storage</div><div>> > APIs. I really *really* like the idea of being able to natively</div><div>> > target OpenStack SWIFT buckets, Amazon S3 buckets and native</div><div>> > Ceph-RADOS-gw stores. Thinking out loud, there is huge potential to</div><div>> > scale omero in the cloud further, massive potential for data reuse</div><div>> > and even further extensibility benefits we can derive from scaling</div><div>> > out like this.</div><div>> </div><div>> Agreed, but again this is equivalent to the "substantially more work"</div><div>> from above.</div></div><div><br></div><div>Lots of work, definitely. But fundamentally feasible. At the Paris meeting, Douglas Russell and I developed a rough prototype of a Bio-Formats I/O handler for Amazon S3 [1]. I know Douglas is working on performance issues (caching / read-ahead and maybe eventually memory-mapped I/O) as his time allows.</div><div><br></div><div>Of course, that is only the first step. But since OMERO5 internally uses Bio-Formats to read planes, it opens a lot of possibilities!</div><div><br></div><div>Regards,</div><div>Curtis</div><div><br></div><div>[1] <a href="https://github.com/dpwrussell/bfs3">https://github.com/dpwrussell/bfs3</a></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 19, 2015 at 5:12 AM, Josh Moore <span dir="ltr"><<a href="mailto:josh@glencoesoftware.com" target="_blank">josh@glencoesoftware.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Jun 18, 2015 at 12:32 PM, Jake Carroll <<a href="mailto:jake.carroll@uq.edu.au">jake.carroll@uq.edu.au</a>> wrote:<br>

> Hi list!<br>

<br>

Hi Jake. Good to hear from you ... you've been busy!<br>

<span class=""><br>

<br>

> We’re running OMERO at a fairly large scale now (big ingest, lots of<br>

> instruments, plenty of IO and lots of compute cycles) across a significant<br>

> network. We’re (as Jason alluded to in a previous post) doing things at a<br>

> cloud scale with OMERO, which still seems to be a bit unusual from what I<br>

> have found.<br>

<br>

</span>You've certainly got everyone interested.<br>

<span class=""><br>

<br>

> Anyway..<br>

><br>

> One thing that has come up recently is the notion of delegated<br>

> administration. Here is an example.<br>

><br>

> Org Unit “A” is the controller/unit that runs an OMERO platform. It has lots<br>

> of users and is the main provider of the OMERO platform.<br>

><br>

> Org Unit “B” says “hey…that is darn cool. We’d like some of the OMERO love,<br>

> too! Can we join you?”<br>

><br>

> Org Unit “A” says: “Of course! We share the love, and we love OMERO!”.<br>

><br>

> In our LDAP binds we then allow said org unit access. But, I got to thinking<br>

> a bit further afield about something better or even nicer. I liked the idea<br>

> of multi-tenancy with my omero-cloud instance. Further, I liked the idea of<br>

> my delegated administrators (as I like to call them) being in control of<br>

> their own destiny, and to an extent, their users, such that, on a large<br>

> omero instance, you’d have an effective waterfall model of administrative<br>

> chains.<br>

><br>

> OU “A” can oversee it all.<br>

><br>

> OU “B” has some selected administrators that can access/modify/work with the<br>

> OU “B” users who belong to that bit of the LDAP container (or some other<br>

> access control mechanism).<br>

><br>

> It would sort of make OMERO very multi-tenancy savvy in many respects.<br>

<br>

</span>Certainly, this depends on what exactly is meant by "control",<br>

"administrate", etc. but I'd assume that with the group mappings[1]<br>

introduced in 5.1.2, you'd be able to perform at least *some* of this.<br>

<br>

Assuming all groups are managed by LDAP, then the logic would be<br>

something roughly equivalent to:<br>

<br>

 * all admins of UnitA are set as owners of all groups<br>

 * all admins of UnitB are set as owners of UnitB groups<br>

<br>

which I think would be doable with an OR-query.<br>
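<br>
As a rough illustration of the OR-query idea, the sketch below composes an LDAP search filter that matches a user who belongs to either unit's admin group. The memberOf attribute, the group DNs, and the helper names are assumptions made for this example; they are not taken from OMERO or from the linked pull request.<br>

```python
def escape_filter_value(value: str) -> str:
    """Escape the characters that RFC 4515 reserves inside a filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def or_filter(attribute: str, values: list[str]) -> str:
    """Build (|(attr=v1)(attr=v2)...), which matches if any clause matches."""
    clauses = "".join(
        f"({attribute}={escape_filter_value(v)})" for v in values
    )
    return f"(|{clauses})"


# Hypothetical group DNs; a real directory will use its own layout.
UNIT_A_ADMINS = "cn=unitA-admins,ou=groups,dc=example,dc=org"
UNIT_B_ADMINS = "cn=unitB-admins,ou=groups,dc=example,dc=org"

# Owners of a UnitB group: admins of UnitA OR admins of UnitB.
unit_b_owner_filter = or_filter("memberOf", [UNIT_A_ADMINS, UNIT_B_ADMINS])
print(unit_b_owner_filter)
```

How such a filter would be wired into the group mappings depends on the server configuration, which this sketch does not cover.<br>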

<span class=""><br>

<br>

> Further, with regards to OMERO.fs, it would be really ideal to be able to<br>

> prefix or specify in the backend multiple points of IO exit for the omero<br>

> data storage location. Such that the omero.data.dir variable could equal or<br>

> have multiple backends for different OU’s from the same OMERO instance. This<br>

> would both logically and physically compartmentalise the OMERO data domain.<br>

> [Which could be a good thing for more reasons than one, much less IO<br>

> scheduling and performance characteristics at a filesystem level, for<br>

> different omero workload types].<br>

<br>

</span>There are a couple of things here. There are a number of things stored<br>

under ${omero.data.dir}, and it would be good to know what's causing<br>

the most trouble. The easiest to compartmentalize as you suggest is<br>

${omero.managed.dir}. That would cover the bulk of the READs and<br>

WRITEs *except for* Pyramids, which would then also need to be moved<br>

to the managed directory.<br>
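<br>
To make the compartmentalization idea concrete, here is a hypothetical sketch of resolving a separate managed directory per organisational unit, with a fallback to a shared default. The paths, unit names, and functions are invented for illustration; this is not an existing OMERO option.<br>

```python
from pathlib import PurePosixPath

# Example locations only; every path here is an assumption.
DEFAULT_MANAGED_DIR = PurePosixPath("/OMERO/ManagedRepository")
MANAGED_DIR_BY_UNIT = {
    "unitB": PurePosixPath("/mnt/unitB/ManagedRepository"),
}


def managed_dir_for(unit: str) -> PurePosixPath:
    """Return the unit's own managed directory, or the shared default."""
    return MANAGED_DIR_BY_UNIT.get(unit, DEFAULT_MANAGED_DIR)


def import_path(unit: str, user: str, filename: str) -> PurePosixPath:
    """Place an imported file under the managed directory of the user's unit."""
    return managed_dir_for(unit) / user / filename


print(import_path("unitA", "alice", "scan.tif"))
print(import_path("unitB", "bob", "scan.tif"))
```

With a mapping like this, each unit's imports could sit on a different filesystem, while anything outside the managed directory would still live under the single data directory.<br>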

<br>

If the OMERO4 /Files or /Pixels directories are causing issues, this<br>

will be substantially more work as would be the case for /FullText.<br>

<span class=""><br>

<br>

> Finally, I have been speaking with a colleague at Harvard about the<br>

> semantics of parallel filesystem access, scale and the limitations of POSIX.<br>

><br>

> You know what would be really cool? If we could create an object-store<br>

> provider backend for OMERO to tap into object storage APIs. I really<br>

> *really* like the idea of being able to natively target OpenStack SWIFT<br>

> buckets, Amazon S3 buckets and native Ceph-RADOS-gw stores. Thinking out<br>

> loud, there is huge potential to scale omero in the cloud further, massive<br>

> potential for data reuse and even further extensibility benefits we can<br>

> derive from scaling out like this.<br>

<br>

</span>Agreed, but again this is equivalent to the "substantially more work"<br>

from above.<br>
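<br>
One reason an object-store backend is plausible at all is that stores such as S3 support ranged reads, so a reader can fetch fixed-size blocks on demand and cache them. The following is a minimal sketch of that pattern under stated assumptions: fetch_range is a hypothetical callable standing in for a ranged GET, and nothing here reflects the actual OMERO or Bio-Formats I/O layer.<br>

```python
from typing import Callable, Dict


class BlockCachedReader:
    """Serve random-access reads from fixed-size blocks fetched on demand."""

    def __init__(
        self,
        fetch_range: Callable[[int, int], bytes],  # hypothetical: (offset, length) -> bytes
        size: int,
        block_size: int = 1024 * 1024,
    ) -> None:
        self._fetch_range = fetch_range
        self._size = size
        self._block_size = block_size
        self._blocks: Dict[int, bytes] = {}

    def _block(self, index: int) -> bytes:
        # Fetch a whole block on first touch; later reads reuse the cached copy.
        if index not in self._blocks:
            start = index * self._block_size
            length = min(self._block_size, self._size - start)
            self._blocks[index] = self._fetch_range(start, length)
        return self._blocks[index]

    def read(self, offset: int, length: int) -> bytes:
        # Clamp the request to the object size, then stitch the blocks it spans.
        end = min(offset + length, self._size)
        if offset >= end:
            return b""
        first = offset // self._block_size
        last = (end - 1) // self._block_size
        data = b"".join(self._block(i) for i in range(first, last + 1))
        skip = offset - first * self._block_size
        return data[skip:skip + (end - offset)]


# Usage with an in-memory stand-in for the remote object.
payload = bytes(range(256)) * 4
calls = []


def fake_fetch(offset: int, length: int) -> bytes:
    calls.append((offset, length))
    return payload[offset:offset + length]


reader = BlockCachedReader(fake_fetch, len(payload), block_size=100)
first_read = reader.read(95, 10)    # spans two blocks, so two fetches
second_read = reader.read(150, 20)  # already cached, no new fetch
```

Fetching whole blocks acts as a crude read-ahead: a small read pulls in neighbouring bytes that later reads are likely to want. A real backend would also need eviction and error handling.<br>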

<span class=""><br>

<br>

> Just a few thoughts. Apologies for the idea overload. Just needed to get it<br>

> down on the page for the list to think about/ponder and tell me “We’ve<br>

> already done that Jake, don’t worry…it is in the pipeline for 5.2 or 5.3”<br>

> etc.<br>

<br>

</span>No worries, always good to have. A tl;dr summary might be:<br>

<br>

 * LDAP: possibly already in 5.1.2<br>

<br>

 * Redirecting managed.dir: conceivable in 5.1, help/testing on your<br>

part appreciated.<br>

<br>

 * New IO backend: was the original priority for 5.2, but we know how<br>

those change.<br>

<br>

<br>

> Talk soon.<br>

> -jc<br>

<br>

<br>

All the best,<br>

~Josh<br>

<br>

[1] <a href="https://github.com/openmicroscopy/openmicroscopy/pull/3798" rel="noreferrer" target="_blank">https://github.com/openmicroscopy/openmicroscopy/pull/3798</a><br>


</blockquote></div><br></div>