More flexible use of Sun Grid Engine by Galaxy

More flexible use of Sun Grid Engine by Galaxy

Peter van Heusden
Hi there

At the moment Galaxy's choice of Sun Grid Engine settings is made on a
per-tool basis - that is, you can define different queues and projects
for each tool. However, I've got two use cases that currently
aren't supported.

1) Per-user settings. The SGEJobRunner potentially has access to user
information (via the job.session_id), and while all jobs get run as user
galaxy, per-user settings could be simulated by mapping Galaxy users to
SGE projects. Then the SGE admin can do load balancing, etc. on a
per-project basis. For a site with a relatively small and static set of
users this could work. Alternatively, you need to map Galaxy users to SGE
users - this is much trickier, requiring elevated privileges and thus
probably a separate job runner daemon.
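
As a rough illustration of the project-mapping idea in (1), the sketch below maps a Galaxy user's email address to an SGE project and builds the matching qsub arguments. The mapping table, the project names, and the function name project_args_for_user are all hypothetical, and how the runner would obtain the user's email from the job is assumed rather than shown.

```python
# Hypothetical mapping from Galaxy user email to SGE project name.
USER_TO_PROJECT = {
    "alice@example.org": "genomics",
    "bob@example.org": "proteomics",
}
DEFAULT_PROJECT = "galaxy_default"  # assumed fallback project for unmapped users


def project_args_for_user(email, mapping=USER_TO_PROJECT, default=DEFAULT_PROJECT):
    """Return qsub arguments selecting the SGE project for a Galaxy user."""
    project = mapping.get(email.strip().lower(), default)
    # -P is the Grid Engine qsub option that assigns a job to a project.
    return ["-P", project]
```

With a small and static user base, the table could live in a site config file; the SGE admin would then balance load per project as described above.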

2) Queue selection. Again this could be done on a per-tool basis, but
usage patterns don't always support that. So for example, our site has
two queues - a long running one and a short running one. Certain
resources are reserved for only the short running queue's usage. How to
select a different queue for jobs raises a bit of a design problem for
me - effectively you're telling Galaxy "I want to run this workflow, but
with this extra parameter" - it's almost like having a "meta parameter"
since clearly you don't want to have queue selection as an input for any
particular tool.
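
One way to picture the "meta parameter" in (2) is as a single per-run choice that is translated into qsub arguments and applied to every step of a workflow. This is only a sketch: the queue names, the run classes, and both function names are hypothetical, and nothing here reflects how Galaxy actually passes parameters to its job runners.

```python
# Hypothetical queue names for a site with a short and a long queue.
QUEUES = {"short": "short.q", "long": "long.q"}


def qsub_queue_args(run_class="long"):
    """Translate a per-run 'meta parameter' into qsub queue arguments."""
    if run_class not in QUEUES:
        raise ValueError("unknown run class: %r" % (run_class,))
    # -q is the Grid Engine qsub option that requests a queue.
    return ["-q", QUEUES[run_class]]


def apply_to_workflow(tool_ids, run_class):
    """Attach the same queue choice to every step of a workflow."""
    args = qsub_queue_args(run_class)
    return {tool_id: list(args) for tool_id in tool_ids}
```

The point of the design is that no individual tool grows a queue input; the choice is made once per run and fanned out to the steps.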

So I'd be interested in the Galaxy team's input as to how best to
address these two use cases.

Thanks,
Peter

_______________________________________________
galaxy-dev mailing list
[hidden email]
http://lists.bx.psu.edu/listinfo/galaxy-dev
Re: More flexible use of Sun Grid Engine by Galaxy

Edward Kirton
Hi Peter,

I had similar issues and had success setting up Galaxy this way:

(1) Per-group settings:

I wanted per-group settings because different groups in my organization own their own SGE clusters or fair-share allocations (each group either submits to the same cluster but to a different queue as a different user, or has its own cluster entirely).  To do this, you can set up multiple Galaxy sites (i.e. hg clone).  I have each group configured as a subdirectory on my Galaxy domain (e.g. mygalaxy.gov/main, mygalaxy.gov/public, mygalaxy.gov/group1, etc.).  Each site has (a) different environment variables pointing to its desired cluster (e.g. $SGE_CELL), which I added to the run.sh script just to make sure I didn't get confused, and (b) a universe_wsgi.ini with different job runner definition lines for the tools (specifying queue, resource, project).  Each group logs in to its own site, and I use Apache group authentication to enforce this.

A few comments: for these per-group sites, I'm not using load balancing with separate job and web runners, I am not saving job data in the database, and consequently there is no recovery on restart.  Since the group sites have a relatively small number of users, this doesn't seem to be a problem.  Also, I use a hardware-based solution for this (a load-balancing router), but it hardly seems necessary for the workloads I've seen so far (though it is necessary for redundancy in case of server hardware failure, which may not be an issue for you, depending on your SLA).

I should also mention that all group sites use the same PostgreSQL database, so workflows/histories/datasets can be shared across the organization.  However, this requires that updates from galaxy-central be done in concert in case there are database schema changes (I have sudo su access to all other groups' galaxy-xx users), but I allow groups to push/pull other changes anytime (groups are not allowed to change Galaxy internals or the database schema, just add tools and datatypes).

(2) Queue selection:

This functionality has been there for a few months now.  The fields in the job runner defline/URL are:
0 : sge or pbs
1 : ??
2 : cell (only one allowed per Galaxy instance -- that's why I have multiple group sites)
3 : queue
4 : project
5 : params (e.g. resources)

So my default job runner looks like this:

sge:///galaxy.q//-b y -V -l medium.c/

For shorter or longer running jobs (or large RAM requirements), I use different -l options (these must be specified on a per-tool basis).

This is a minor inconvenience because when a group adds a tool, its runner configuration won't automatically appear in the universe file, since that file is not tracked (it cannot be, since the sites have different ports and names).
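
To make the field positions concrete, here is a small sketch that splits a runner URL on "/" and picks out the fields by the indices listed above. The function name parse_runner_url is hypothetical and this is not Galaxy's own parsing code; it only follows the layout described in this message.

```python
def parse_runner_url(url):
    """Split a job runner URL into the fields listed above (by '/' index)."""
    parts = url.split("/")
    # Pad so that short URLs such as "sge:///" still have six fields.
    parts += [""] * (6 - len(parts))
    # In this split, parts[1] is the empty string between the two slashes
    # of "://"; it carries no setting and is ignored here.
    return {
        "runner": parts[0].rstrip(":"),  # field 0: sge or pbs
        "cell": parts[2] or None,        # field 2
        "queue": parts[3] or None,       # field 3
        "project": parts[4] or None,     # field 4
        # field 5: extra parameters; a "/" inside them would cut them short.
        "params": parts[5] or None,
    }


fields = parse_runner_url("sge:///galaxy.q//-b y -V -l medium.c/")
print(fields["queue"], "|", fields["params"])
```

For the default runner shown above, cell and project come back empty, the queue is galaxy.q, and the params are "-b y -V -l medium.c".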

Hope this helps,
Ed


On Mon, Jun 21, 2010 at 6:34 AM, Peter van Heusden <[hidden email]> wrote:
> [...]

Re: More flexible use of Sun Grid Engine by Galaxy

Nate Coraor (nate@bx.psu.edu)
In reply to this post by Peter van Heusden
Peter van Heusden wrote:

> Hi there
>
> At the moment Galaxy's choice of Sun Grid Engine settings is made on a
> per-tool basis - that is, you can define different queues and projects
> on a per-tool basis. However, I've got two use cases that currently
> aren't supported.
>
> 1) Per-user settings. The SGEJobRunner potentially has access to user
> information (via the job.session_id), and while all jobs get run as user
> galaxy, per-user settings could be simulated by mapping Galaxy users to
> SGE projects. Then the SGE admin can do load-balancing, etc on a
> per-project basis. For a site with a relatively small and static set of
> users this could work. Alternately you need to map Galaxy users to SGE
> users - this is much trickier, requiring elevated privileges and thus
> probably a separate job runner daemon.


Hi Peter,

There's an issue in our tracker to implement functionality allowing jobs
to run on the cluster as real users instead of the Galaxy user:

http://bitbucket.org/galaxy/galaxy-central/issue/106

Once implemented, you could then define which resources are used by which
users directly in Grid Engine.  I think this would be the cleanest way
to do #1.

>
> 2) Queue selection. Again this could be done on a per-tool basis, but
> usage patterns don't always support that. So for example, our site has
> two queues - a long running one and a short running one. Certain
> resources are reserved for only the short running queue's usage. How to
> select a different queue for jobs raises a bit of a design problem for
> me - effectively you're telling Galaxy "I want to run this workflow, but
> with this extra parameter" - it's almost like having a "meta parameter"
> since clearly you don't want to have queue selection as an input for any
> particular tool.
>
> So I'd be interested in the Galaxy team's input as to how best to
> address these two use cases.

This has been a pretty difficult one to define, since it's almost
impossible to look at a job and decide how long it's going to run before
you run it.

Your solution of giving the users the choice is interesting, although
it would be a pretty site-specific implementation, since that choice may be
a queue, a project, a different cell, or just different qsub parameters.
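
To illustrate why this ends up site-specific, the sketch below keeps a per-site table of user-facing choices, each of which may set any mix of queue, project, cell, or extra qsub parameters, and turns the selected entry into a runner URL using the field layout Ed described earlier in this thread. The profile names, their contents, and both function names are hypothetical.

```python
# Hypothetical site-specific table: each user-facing choice may set any
# combination of cell, queue, project, and extra qsub parameters.
SITE_PROFILES = {
    "short": {"queue": "short.q"},
    "long": {"queue": "long.q"},
    "group1": {"queue": "long.q", "project": "group1"},
    "medium": {"queue": "galaxy.q", "params": "-b y -V -l medium.c"},
}


def build_runner_url(runner="sge", cell="", queue="", project="", params=""):
    """Assemble a runner URL of the form runner://cell/queue/project/params/."""
    return "%s://%s/%s/%s/%s/" % (runner, cell, queue, project, params)


def runner_url_for_choice(choice, profiles=SITE_PROFILES):
    """Look up a user-facing choice and return the matching runner URL."""
    return build_runner_url(**profiles[choice])
```

Another site would ship a different table entirely, which is the part that is hard to generalize.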

--nate
