Clusters, Runners, and user credentials

Lloyd Brown
I'm a systems administrator for an HPC cluster, and have been asked by a
faculty member here to try to get Galaxy to work on our cluster.
Unfortunately, there are one or two outstanding questions that I can't
seem to find the answer to, and I'm hoping someone here can help me out.

In particular, is Galaxy, and the PBS runner specifically, capable of
submitting jobs under specific user names?  Essentially, if I set up
Galaxy to push jobs to our cluster, will they all show up under one user
credential (e.g. the "galaxy" user), or can we set it up so that the user
logged into Galaxy is used to submit the job?

This one is kind of a show-stopper, since our internal policies require
that all jobs have a specific user credential, with one person per username.

Thanks,
Lloyd


--
Lloyd Brown
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University
http://marylou.byu.edu
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:

  http://lists.bx.psu.edu/

Re: Clusters, Runners, and user credentials

ichorny
Lloyd,

See Nate's email below, titled "Actual user code". We have been working on implementing this feature in Galaxy. The code is still in development, but feel free to test it out and let us know how it works for you.

Best,

Ilya


Re: Clusters, Runners, and user credentials

ichorny
BTW, I am not sure whether PBS works with DRMAA. If not, the code will need to be ported to work with the PBS runner.

Ilya



Re: Clusters, Runners, and user credentials

Glen Beane
Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.

How do you make sure the user's environment is configured properly? We use a Python virtualenv and load specific module files with tested tool versions in our Galaxy user's startup scripts on our cluster.


Re: Clusters, Runners, and user credentials

ichorny
I modified drmaa.py to pass the galaxy user's PATH variable to the actual user. As long as the galaxy user's environment is correct, the actual user's environment should be correct.
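
As a rough illustration of the idea (this is not the actual change made to drmaa.py), the sketch below copies selected variables such as PATH from the Galaxy process's environment into the environment prepared for the actual user's job. The helper name `build_job_environment`, its `passthrough` parameter, and the example paths are all hypothetical.

```python
import os


def build_job_environment(base_env=None, passthrough=("PATH",)):
    """Return the environment to hand to the actual user's job.

    Hypothetical helper: only the variables named in `passthrough` are
    copied from the Galaxy user's environment; everything else is dropped.
    """
    source = os.environ if base_env is None else base_env
    return {name: source[name] for name in passthrough if name in source}


# Example with an explicit, made-up Galaxy environment.
galaxy_env = {"PATH": "/opt/galaxy/venv/bin:/usr/bin", "HOME": "/home/galaxy"}
job_env = build_job_environment(galaxy_env)
```

Passing only an explicit list of variables, rather than the whole environment, keeps things like HOME from leaking from the galaxy account into the actual user's job.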


Re: Clusters, Runners, and user credentials

Fields, Christopher J
I recall at the Galaxy conference there were questions about how secure this is (having the 'galaxy' user submit jobs as someone else).  This would involve switching users on the cluster or would require user login information, correct?

The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users.  I believe our local cluster disallows switching users via PBS unless the submitter has admin privileges, but the accounting string works fine (I suppose one could use the project option as well).
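
A minimal sketch of the accounting-string approach, assuming jobs are submitted by shelling out to qsub: the job stays owned by the submitting account, and only the account string passed with `-A` identifies the person. The function name, the script name, and the whitespace check are assumptions made for illustration, not part of Galaxy or PBS.

```python
def build_qsub_command(script_path, account_string):
    """Return a qsub argument list that charges the job to `account_string`."""
    # Conservative check chosen for this sketch, not a documented PBS rule.
    if not account_string or any(ch.isspace() for ch in account_string):
        raise ValueError("account string must be non-empty and contain no whitespace")
    # -A sets the job's account string; the job is still owned by the submitter.
    return ["qsub", "-A", account_string, script_path]


# Hypothetical job script and account string.
command = build_qsub_command("galaxy_job_42.sh", "lbrown")
```

Building an argument list instead of a shell string means the account string is never interpreted by a shell if the list is later handed to something like `subprocess.run`.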

chris


Re: Clusters, Runners, and user credentials

Nate Coraor (nate@bx.psu.edu)
In reply to this post by Glen Beane
Glen Beane wrote:
> Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.

Technically, using the drmaa runner with TORQUE is supposed to work.  I
just tried it here to test Ilya's code, and Galaxy was segfaulting when
trying to interact with TORQUE's libdrmaa while setting up the job
template.  I didn't look into it any further; I'm using a fairly old
TORQUE client here, so I suspect it may just be due to that.

--nate


Re: Clusters, Runners, and user credentials

Glen Beane
On Nov 2, 2011, at 4:57 PM, Nate Coraor wrote:

> Glen Beane wrote:
>> Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.
>
> Technically, using the drmaa runner with TORQUE is supposed to work.  I
> just tried it here to test Ilya's code, and Galaxy was segfaulting when
> trying to interact with TORQUE's libdrmaa on setting up the job
> template.  I didn't look into it any further, I'm using a fairly old
> TORQUE client here, so I suspect it may just be due to that.

I don't believe TORQUE's DRMAA support receives that much attention these days

--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153



Re: Clusters, Runners, and user credentials

Fields, Christopher J
In reply to this post by Fields, Christopher J
Ilya, Nate,

To add a bit of background to the below: we have several clusters on campus that use very different accounting systems. Some run a regular cron job to process job run info, while others use a qsub wrapper to check service units prior to job submission (a byproduct of being part of TeraGrid/XSEDE).  It seems the most direct route to work around accounting-level differences is to submit the job as a user (so I'm interested in this solution), but the security questions I mentioned below were raised by a number of our local cluster sysadmins as well as (if I'm not mistaken) at the conference.

Were these ever addressed, or is it considered a non-issue?  Apologies for re-sending; I didn't know if this had been answered elsewhere, but this is a serious concern that may block us from using some pretty nice HPC resources.

chris

On Nov 1, 2011, at 4:59 PM, Fields, Christopher J wrote:

> I recall at the Galaxy conf there were questions on how secure this is (having the 'galaxy' user submit jobs as someone else).  This would involve switching users on the cluster or would require user login information, correct?
>
> The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users.  I believe our local cluster disallows switching users via PBS unless the submitter has admin privs, but the accounting string works fine (I suppose one could use the project option as well).
>
> chris

Re: Clusters, Runners, and user credentials

Nate Coraor (nate@bx.psu.edu)
Hi Chris,

Ilya's solution uses sudo to submit the job via drmaa after switching to
the actual user's uid and gid.  This means giving your Galaxy user sudo
rights to run 3 scripts as root:

    * A script to submit jobs
    * A script to kill jobs
    * A script to chown a directory
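
For readers unfamiliar with the uid/gid switch that such a root-run script has to perform, here is a minimal sketch. It is not taken from the scripts in question; the function name `switch_to_user` is hypothetical, and a real script would also need to validate the requested uid and gid.

```python
import os


def switch_to_user(uid, gid):
    """Permanently switch the current process to `uid`/`gid`.

    Must be called as root. The order matters: the supplementary groups
    and the gid are changed first, because after setuid() the process no
    longer has the privilege to change them.
    """
    if os.geteuid() != 0:
        raise PermissionError("switch_to_user() must be called as root")
    os.setgroups([gid])
    os.setgid(gid)
    os.setuid(uid)
```

After the call, the process (and anything it submits) runs with the target user's credentials and cannot regain root.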

This could be tightened up a bit, in the case of the first two by
sudoing directly to the user, rather than to root and then setuid()ing.
In the case of the latter script, a path is passed to the script rather
than a Galaxy job id, so it could be used by the Galaxy user to chown
anything that root can chown.  In addition, if your Galaxy data lives in
NFS with root squashing enabled, this script would fail.
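
One way a site-specific chown script could narrow the path problem is to refuse anything that does not resolve to a location below a fixed job directory. The sketch below shows only that check; the function name and the example root are hypothetical, and it does not address the NFS root-squash issue.

```python
import os


def is_inside_jobs_root(path, jobs_root):
    """Return True only if `path` resolves to a location strictly below `jobs_root`."""
    root = os.path.realpath(jobs_root)
    target = os.path.realpath(path)
    # realpath() collapses ".." and follows symlinks, so neither can be
    # used to step outside the root.
    return target != root and os.path.commonpath([root, target]) == root


# Hypothetical job working directory root.
JOBS_ROOT = "/nonexistent_galaxy/jobs"
ok = is_inside_jobs_root("/nonexistent_galaxy/jobs/42", JOBS_ROOT)
escaped = is_inside_jobs_root("/nonexistent_galaxy/jobs/../../etc", JOBS_ROOT)
```

A check like this is still subject to races if the directory can be modified between the check and the chown, so it reduces the exposure rather than eliminating it.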

Of course, the paths to these scripts are configurable, so they can be
replaced with site-suitable versions.

Another option to avoid sudo entirely would be for Galaxy to start as
root and then drop privileges, but I am not incredibly fond of this
solution, since it allows for the possibility of privilege separation
exploits.  Perhaps a stripped down Galaxy data daemon that runs with
elevated privileges, whose sole job it is to manage permissions and move
data?

As with the existing Galaxy implementation, Galaxy's data is not copied
around at job runtime for tool input, it simply exists in one place and
is expected to be locatable on the cluster resource at the same path.
My next development goal is to remove this limitation.

The assumption is also made that tool inputs are readable by the actual
user, which was a problem in some environments.
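
A site could check this assumption before submitting a job. The sketch below applies the classic POSIX owner/group/other read test to a file's permission bits; it is a simplification that ignores ACLs and the permissions of parent directories, and the function name is hypothetical.

```python
import stat


def is_readable_by(mode, owner_uid, owner_gid, uid, gids):
    """Apply the classic POSIX owner/group/other read check.

    ACLs and the permissions of parent directories are deliberately ignored.
    """
    if uid == 0:
        return True  # root bypasses the permission bits on a local filesystem
    if uid == owner_uid:
        return bool(mode & stat.S_IRUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# A dataset owned by uid 1000 / gid 100 with mode 0640.
same_group = is_readable_by(0o640, 1000, 100, uid=2000, gids={100})
outsider = is_readable_by(0o640, 1000, 100, uid=2000, gids={200})
```

For a real file, the mode and owner would come from `os.stat(path)` (`st_mode`, `st_uid`, `st_gid`), and the actual user's group ids from the system's account database.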

If administrators prefer to give the Galaxy account the permission to
run jobs as other users directly in the DRM, this would certainly solve
the problem.  Galaxy would just need minor modification to take
advantage of the feature.

As you probably recall, there were many people at the GCC brainstorming
this problem, and I don't recall that we ever came up with the perfectly
secure solution.  This solution may be good enough for some sites.

If there's a desire for tightened security, I would be happy to review
and accept any work done on that. =)

--nate

Fields, Christopher J wrote:

> Ilya, Nate,
>
> To add a bit of background to the below, we have several clusters on campus that use very different accounting systems; some run as a regular cron job to process job run info, however others use a qsub wrapper to check service units prior to job submission (a byproduct of being part of teragrid/xcede).  It seems the most direct route to work around accounting-level differences is to submit the job as a user (so I'm interested in this solution), but the below security questions I mentioned were raised by a number of our local cluster sysadmins as well as (if I'm not mistaken) at the conference.  
>
> Were these ever addressed, or is it considered an non-issue?  Apologies about re-sending, I didn't know if this had been answered elsewhere, but this was a serious concern that may block us from using some pretty nice HPC resources.
>
> chris
>
> On Nov 1, 2011, at 4:59 PM, Fields, Christopher J wrote:
>
> > I recall at the Galaxy conf there were questions on how secure this is (having the 'galaxy' user submit jobs as someone else).  This would involve switching users on the cluster or would require user login information, correct?
> >
> > The way we planned on working around this was to just specify a user account string (using '-A') instead of bothering with switching users.  I believe our local cluster disallows switching users via PBS unless the submitter has admin privs, but the accounting string works fine (I suppose one could use the project option as well).
> >
> > chris
> >
> > On Oct 31, 2011, at 6:30 PM, Chorny, Ilya wrote:
> >
> >> I modified drmaa.py to pass the galaxy user's PATH variable to the actual user. As long as the galaxy user's environment is correct, the actual user's environment should be correct.
> >>
> >> -----Original Message-----
> >> From: Glen Beane [mailto:[hidden email]]
> >> Sent: Monday, October 31, 2011 4:20 PM
> >> To: Chorny, Ilya
> >> Cc: Lloyd Brown; Galaxy Dev List
> >> Subject: Re: [galaxy-dev] Clusters, Runners, and user credentials
> >>
> >> Many of us are using the PBS job runner (for TORQUE) and would definitely be interested in a port.
> >>
> >> How do you deal with making sure the user's environment is configured properly? We use a Python virtualenv and load specific module files with tested tool versions in our galaxy user's startup scripts on our cluster.
> >>
> >> Sent from my iPhone
> >>
> >> On Oct 31, 2011, at 6:29 PM, "Chorny, Ilya" <[hidden email]> wrote:
> >>
> >>> BTW, I am not sure if PBS works with drmaa. If not then the code will need to be ported to work with pbs.
> >>>
> >>> Ilya

Re: Clusters, Runners, and user credentials

Fields, Christopher J
On Nov 3, 2011, at 9:50 AM, Nate Coraor wrote:

> Hi Chris,
>
> Ilya's solution uses sudo to submit the job via drmaa after switching to
> the actual user's uid and gid.  This means giving your Galaxy user sudo
> rights to run 3 scripts as root:
>
>    * A script to submit jobs
>    * A script to kill jobs
>    * A script to chown a directory
>
> This could be tightened up a bit, in the case of the first two by
> sudoing directly to the user, rather than to root and then setuid()ing.
> In the case of the latter script, a path is passed to the script rather
> than a Galaxy job id, so it could be used by the Galaxy user to chown
> anything that root can chown.  In addition, if your Galaxy data lives in
> NFS with root squashing enabled, this script would fail.

Yes, we may run into this or worse; we're setting up GPFS locally for our NFS.
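
On the chown concern quoted above: one way a site-specific replacement script could narrow its reach is to accept a job id instead of a path and derive the directory itself. This is only a sketch under assumptions. The flat `base_dir/<job_id>` layout and the function name are hypothetical and may not match how Galaxy lays out its job directories.

```python
import os
import re


def job_directory(base_dir, job_id):
    """Map a numeric job id to its working directory under base_dir.

    Raises ValueError for anything that is not a plain decimal id, or
    whose resolved location is not a direct child of base_dir (for
    example because the entry is a symlink pointing elsewhere).
    """
    if not re.fullmatch(r"[0-9]+", str(job_id)):
        raise ValueError("job id must be a non-negative integer")
    base = os.path.realpath(base_dir)
    path = os.path.realpath(os.path.join(base, str(job_id)))
    if os.path.dirname(path) != base:
        raise ValueError("resolved path escapes the job directory")
    return path
```

A privileged wrapper would then chown only the returned path, so the Galaxy account could no longer point it at arbitrary locations. It does nothing for the root-squash case.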

> Of course, the paths to these scripts are configurable, so they can be
> replaced with site-suitable versions.
>
> Another option to avoid sudo entirely would be for Galaxy to start as
> root and then drop privileges, but I am not incredibly fond of this
> solution, since it allows for the possibility of privilege separation
> exploits.  Perhaps a stripped down Galaxy data daemon that runs with
> elevated privileges, whose sole job it is to manage permissions and move
> data?

That sounds like a feasible option.

> As with the existing Galaxy implementation, Galaxy's data is not copied
> around at job runtime for tool input; it simply exists in one place and
> is expected to be locatable on the cluster resource at the same path.
> My next development goal is to remove this limitation.

This is something we will run into at some point, particularly with some of the NCSA resources (where user paths are quite different from other clusters on campus).
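
As a rough illustration of the path problem (not of how Galaxy will solve it), a prefix-rewrite table is one possible ingredient when the same data is visible under different paths on different clusters. Everything here is hypothetical, including the function name and the example prefixes, and it only rewrites strings. It does not stage or copy any data.

```python
def map_path(path, mappings):
    """Rewrite a Galaxy-side path to its cluster-side location.

    mappings is a list of (local_prefix, remote_prefix) pairs; the first
    prefix that matches on a whole path component wins. Paths that match
    no prefix are returned unchanged.
    """
    for local_prefix, remote_prefix in mappings:
        local_prefix = local_prefix.rstrip("/")
        if path == local_prefix or path.startswith(local_prefix + "/"):
            return remote_prefix.rstrip("/") + path[len(local_prefix):]
    return path


# Hypothetical prefixes for one cluster.
mappings = [("/galaxy/database/files", "/scratch/galaxy/files")]
print(map_path("/galaxy/database/files/000/dataset_1.dat", mappings))
```

Matching on whole components keeps a prefix such as `/galaxy/database/files` from also rewriting `/galaxy/database/files_old`.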

> The assumption is also made that tool inputs are readable by the actual
> user, which was a problem in some environments.
>
> If administrators prefer to give the Galaxy account the permission to
> run jobs as other users directly in the DRM, this would certainly solve
> the problem.  Galaxy would just need minor modification to take
> advantage of the feature.
>
> As you probably recall, there were many people at the GCC brainstorming
> this problem, and I don't recall that we ever came up with the perfectly
> secure solution.  This solution may be good enough for some sites.

Right, I think this is more of a problem when the cluster is not under our control and has already been configured.  Not that it's impossible, but there is definitely an additional level of sysadmin concerns we have to deal with.  And with multiple clusters (with multiple configurations, sysadmins, etc.) this becomes more complex.  We're deploying step-wise (on one cluster initially, then others down the road) for this reason.

> If there's a desire for tightened security, I would be happy to review
> and accept any work done on that. =)
>
> --nate

That is a possibility; we have had initial talks with a few of the MyProxy folks here re: security concerns and possible solutions for user job submissions (there wasn't much added yet beyond what you already covered, unfortunately).

chris
