<div dir="ltr">Hi,<div><br></div><div><div>> > You know what would be really cool? If we could create an</div><div>> > object-store provider backend for OMERO to tape into object storage</div><div>> > APIs. I really *really* like the idea of being able to natively</div><div>> > target OpenStack SWIFT buckets, Amazon S3 buckets and native</div><div>> > Ceph-RADOS-gw stores. Thinking out loud, there is huge potential to</div><div>> > scale omero in the cloud further, massive potential for data reuse</div><div>> > and even further extensibility benefits we can derive from scaling</div><div>> > out like this.</div><div>> </div><div>> Agreed, but again this is equivalent to the "substantially more work"</div><div>> from above.</div></div><div><br></div><div>Lots of work, definitely. But fundamentally feasible. At the Paris meeting, Douglas Russell and I developed a rough prototype of a Bio-Formats I/O handler for Amazon S3 [1]. I know Douglas is working on performance issues (caching / read-ahead and maybe eventually memory-mapped I/O) as his time allows.</div><div><br></div><div>Of course, that is only the first step. But since OMERO5 internally uses Bio-Formats to read planes, it opens a lot of possibilities!</div><div><br></div><div>Regards,</div><div>Curtis</div><div><br></div><div>[1] <a href="https://github.com/dpwrussell/bfs3">https://github.com/dpwrussell/bfs3</a></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Jun 19, 2015 at 5:12 AM, Josh Moore <span dir="ltr"><<a href="mailto:josh@glencoesoftware.com" target="_blank">josh@glencoesoftware.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On Thu, Jun 18, 2015 at 12:32 PM, Jake Carroll <<a href="mailto:jake.carroll@uq.edu.au">jake.carroll@uq.edu.au</a>> wrote:<br>
> Hi list!<br>
<br>
Hi Jake. Good to hear from you ... you've been busy!<br>
<span class=""><br>
<br>
> We’re running OMERO at a fairly large scale now (big ingest, lots of<br>
> instruments, plenty of IO and lots of compute cycles) across a significant<br>
> network. We’re (as Jason alluded to in a previous post) doing things at a<br>
> cloud scale with OMERO, which still seems to be a bit unusual from what I<br>
> have found.<br>
<br>
You've certainly got everyone interested.<br>
<br>
<br>
> Anyway..<br>
><br>
> One thing that has come up recently is the notion of delegated<br>
> administration. Here is an example.<br>
><br>
> Org Unit “A” is the controller/unit that runs an OMERO platform. It has lots<br>
> of users and is the main provider of the OMERO platform.<br>
><br>
> Org Unit “B” says “hey…that is darn cool. We’d like some of the OMERO love,<br>
> too! Can we join you?”<br>
><br>
> Org Unit “A” says: “Of course! We share the love, and we love OMERO!”.<br>
><br>
> In our LDAP binds we then allow said org unit access. But, I got to thinking<br>
> a bit further afield about something better or even nicer. I liked the idea<br>
> of multi-tenancy with my omero-cloud instance. Further, I liked the idea of<br>
> my delegated administrators (as I like to call them) being in control of<br>
> their own destiny, and to an extent, their users, such that, on a large<br>
> omero instance, you’d have an effective waterfall model of administrative<br>
> chains.<br>
><br>
> OU “A” can oversee it all.<br>
><br>
> OU “B” has some selected administrators that can access/modify/work with the<br>
> OU “B” users who belong to that bit of the LDAP container (or some other<br>
> access control mechanism).<br>
><br>
> It would sort of make OMERO very multi-tenancy savvy in many respects.<br>
<br>
Certainly, this depends on what exactly is meant by "control",<br>
"administrate", etc., but I'd assume that with the group mappings [1]<br>
introduced in 5.1.2, you'd be able to perform at least *some* of this.<br>
<br>
Assuming all groups are managed by LDAP, the logic would be<br>
something roughly equivalent to:<br>
<br>
* all admins of UnitA are set as owners of all groups<br>
* all admins of UnitB are set as owners of UnitB groups<br>
<br>
which I think would be doable with an OR-query.<br>
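<br>
For illustration only (the group names here are invented, and the<br>
exact wiring depends on how your directory is laid out), that<br>
OR-query could be a plain LDAP filter along the lines of:<br>
<br>
  (|(memberOf=cn=unitA-admins,ou=groups,dc=example,dc=org)<br>
   (memberOf=cn=unitB-admins,ou=groups,dc=example,dc=org))<br>
<br>
i.e. the UnitA-admins branch appears in every group's owner query,<br>
while the UnitB-admins branch only appears for UnitB's groups.<br>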
<span class=""><br>
<br>
> Further, with regards to OMERO.fs, it would be really ideal to be able to<br>
> prefix or specify in the backend multiple points of IO exit for the omero<br>
> data storage location. Such that the omero.data.dir variable could equal or<br>
> have multiple backends for different OU’s from the same OMERO instance. This<br>
> would both logically and physically compartmentalise the OMERO data domain.<br>
> [Which could be a good thing for more reasons than one, much less IO<br>
> scheduling and performance characteristics at a filesystem level, for<br>
> different omero workload types].<br>
<br>
There are a couple of things here. Several different kinds of data are<br>
stored under ${omero.data.dir}, and it would be good to know which of<br>
them is causing the most trouble. The easiest to compartmentalize as<br>
you suggest is ${omero.managed.dir}. That would cover the bulk of the<br>
READs and WRITEs *except for* Pyramids, which would then also need to<br>
be moved to the managed directory.<br>
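<br>
As a concrete (if hypothetical) illustration of that split: the<br>
managed repository location is already a separate setting from the<br>
data directory itself, e.g.<br>
<br>
  omero config set omero.data.dir /OMERO<br>
  omero config set omero.managed.dir /fast-storage/OMERO/ManagedRepository<br>
<br>
(paths invented; best done before any data has been imported). Making<br>
that kind of redirection robust, and splitting it per OU, is where<br>
help and testing would be welcome.<br>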
<br>
If the OMERO4 /Files or /Pixels directories are causing issues, this<br>
will be substantially more work, as would be the case for /FullText.<br>
<span class=""><br>
<br>
> Finally, I have been speaking with a colleague at Harvard about the<br>
> semantics of parallel filesystem access, scale and the limitations of POSIX.<br>
><br>
> You know what would be really cool? If we could create an object-store<br>
> provider backend for OMERO to tap into object storage APIs. I really<br>
> *really* like the idea of being able to natively target OpenStack SWIFT<br>
> buckets, Amazon S3 buckets and native Ceph-RADOS-gw stores. Thinking out<br>
> loud, there is huge potential to scale omero in the cloud further, massive<br>
> potential for data reuse and even further extensibility benefits we can<br>
> derive from scaling out like this.<br>
<br>
Agreed, but again this is equivalent to the "substantially more work"<br>
from above.<br>
<span class=""><br>
<br>
> Just a few thoughts. Apologies for the idea overload. Just needed to get it<br>
> down on the page for the list to think about/ponder and tell me “We’ve<br>
> already done that Jake, don’t worry…it is in the pipeline for 5.2 or 5.3”<br>
> etc.<br>
<br>
No worries, ideas like these are always good to have. A tl;dr summary<br>
might be:<br>
<br>
* LDAP: possibly already in 5.1.2<br>
<br>
* Redirecting managed.dir: conceivable in 5.1, help/testing on your<br>
part appreciated.<br>
<br>
* New IO backend: was the original priority for 5.2, but we know how<br>
those change.<br>
<br>
<br>
> Talk soon.<br>
> -jc<br>
<br>
<br>
All the best,<br>
~Josh<br>
<br>
[1] <a href="https://github.com/openmicroscopy/openmicroscopy/pull/3798" rel="noreferrer" target="_blank">https://github.com/openmicroscopy/openmicroscopy/pull/3798</a><br>
</blockquote></div><br></div>
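<div dir="ltr"><div><br></div><div>P.S. For anyone curious, here is a minimal, purely illustrative sketch of the primitive such an S3 handler wraps: a byte-range read against an object (AWS SDK for Java; the bucket, key and class name below are invented, and this is not the actual bfs3 code).</div><div><br></div><pre>
// Sketch only: fetch an arbitrary byte range of an S3 object without
// downloading the whole file. A Bio-Formats-style random-access handle
// would call something like this from its seek/read methods.
import java.io.IOException;
import java.io.InputStream;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.S3Object;

public class S3RangeRead {

    // Credentials come from the SDK's default provider chain.
    private final AmazonS3 s3 = new AmazonS3Client();

    /** Read 'len' bytes starting at 'offset' from s3://bucket/key. */
    public byte[] readRange(String bucket, String key, long offset, int len)
            throws IOException {
        GetObjectRequest req = new GetObjectRequest(bucket, key)
                .withRange(offset, offset + len - 1);  // HTTP Range request
        try (S3Object obj = s3.getObject(req);
             InputStream in = obj.getObjectContent()) {
            byte[] buf = new byte[len];
            int off = 0;
            while (off < len) {
                int n = in.read(buf, off, len - off);
                if (n < 0) break;  // object shorter than requested range
                off += n;
            }
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical object; a caching / read-ahead layer would sit on top.
        byte[] chunk = new S3RangeRead().readRange("my-bucket", "image.ome.tiff", 0, 4096);
        System.out.println("read " + chunk.length + " bytes");
    }
}
</pre><div>The real handler of course has to sit behind Bio-Formats' random-access interface and deal with caching, retries and credentials, but the core operation is just HTTP range requests against the object store, which is what makes S3, Swift and RADOS-gw all plausible targets.</div></div>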