SLURM configuration problem

Leonor Palmeira-2
Dear all,

we have set up a Galaxy instance on a virtual machine, and we want to be
able to submit jobs to our HPC system (SLURM).

Currently, we do not understand how to configure Galaxy so that jobs are
sent to the HPC cluster.

We have set:

export DRMAA_LIBRARY_PATH=/var/lib/libdrmaa.so

This is our config/job_conf.xml:

<?xml version="1.0"?>
<!-- A sample job config that explicitly configures job running the way
it is configured by default (if there is no explicit config). -->
<job_conf>
    <plugins>
        <plugin id="drmaa" type="runner"
load="galaxy.jobs.runners.drmaa:DRMAAJobRunner" />
    </plugins>
    <handlers default="handlers">
        <handler id="handler0" tags="handlers" />
        <handler id="main" />
    </handlers>
    <destinations default="slurm">
        <destination id="slurm" runner="drmaa">
            <param id="nativeSpecification">-P all_5hrs</param>
        </destination>
    </destinations>
</job_conf>

And this is the output of "sh run.sh":


galaxy.jobs.manager DEBUG 2017-02-07 15:50:39,962 Starting job handler
galaxy.jobs INFO 2017-02-07 15:50:39,962 Handler 'main' will load all configured runner plugins
galaxy.jobs.runners.state_handler_factory DEBUG 2017-02-07 15:50:39,971 Loaded 'failure' state handler from module galaxy.jobs.runners.state_handlers.resubmit
pulsar.managers.util.drmaa DEBUG 2017-02-07 15:50:39,975 Initializing DRMAA session from thread MainThread
Traceback (most recent call last):
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/webapps/galaxy/buildapp.py", line 55, in paste_app_factory
    app = galaxy.app.UniverseApplication( global_conf=global_conf, **kwargs )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/app.py", line 170, in __init__
    self.job_manager = manager.JobManager( self )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/manager.py", line 23, in __init__
    self.job_handler = handler.JobHandler( app )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line 32, in __init__
    self.dispatcher = DefaultJobDispatcher( app )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line 723, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/__init__.py", line 687, in get_job_runner_plugins
    rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 'kwds', {} ) )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/runners/drmaa.py", line 88, in __init__
    self.ds = DrmaaSessionFactory().get()
  File "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py", line 31, in get
    return DrmaaSession(session_constructor, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py", line 49, in __init__
    DrmaaSession.session.initialize()
  File "/usr/local/lib/python2.7/dist-packages/drmaa/session.py", line 257, in initialize
    py_drmaa_init(contactString)
  File "/usr/local/lib/python2.7/dist-packages/drmaa/wrappers.py", line 73, in py_drmaa_init
    return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File "/usr/local/lib/python2.7/dist-packages/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
InternalException: code 1: cell directory "/usr/lib/gridengine-drmaa/default" doesn't exist

Could anyone point us in the right direction?
This would be greatly appreciated.

Best regards
Leonor

--
Leonor Palmeira | PhD
Associate Scientist
Department of Human Genetics
CHU de Liège | Domaine Universitaire du Sart-Tilman
4000 Liège | BELGIUM
Tel: +32-4-366.91.41
Fax: +32-4-366.72.61
e-mail: [hidden email]
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/
Re: SLURM configuration problem

Leonor Palmeira-2
Dear all,

we are struggling with the Galaxy documentation to understand how our VM
(where our Galaxy instance runs perfectly locally) should be configured
so that it can submit jobs to our SLURM cluster.

We have a shared filesystem, /home/mass/GAL, between the cluster and
the VM. Galaxy is installed in /home/mass/GAL/APP/ and 'drmaa' is
installed on the SLURM cluster.

The following variables need to be specified, but we are struggling to
find out which paths we should give them. We have currently set them
like this, and it clearly does not work:

$DRMAA_LIBRARY_PATH=/var/lib/libdrmaa.so
$SGE_ROOT=/usr/lib/gridengine-drmaa (very wild guess)

We would greatly appreciate some help.
Thank you in advance,
Leo


Re: SLURM configuration problem

Marius van den Beek

Hello Leonor,

One thing you should avoid is setting anything related to SGE (Sun Grid
Engine) if you're trying to interface with Slurm. The error message

"/usr/lib/gridengine-drmaa/default"

points to a problem with SGE; I don't understand where that comes into
play if you're trying to submit jobs to SLURM.

A good bet is to set the path to the drmaa library like so:

...
<plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
    <param id="drmaa_library_path">/var/lib/libdrmaa.so</param>
</plugin>
...
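
For the destination to pick up this plugin, the destination's runner attribute has to match the plugin id. Putting it together, a complete job_conf.xml might look like the sketch below (based on the config you posted; the ids here are just examples, and the "-P all_5hrs" native specification is carried over unchanged even though slurm-drmaa normally expects sbatch-style options such as "-p <partition>", so that value may need adjusting):

<?xml version="1.0"?>
<job_conf>
    <plugins>
        <!-- Slurm-specific runner; drmaa_library_path should point at a slurm-drmaa build -->
        <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
            <param id="drmaa_library_path">/var/lib/libdrmaa.so</param>
        </plugin>
    </plugins>
    <handlers default="handlers">
        <handler id="handler0" tags="handlers" />
        <handler id="main" />
    </handlers>
    <destinations default="slurm_cluster">
        <!-- runner="slurm" refers to the plugin id declared above -->
        <destination id="slurm_cluster" runner="slurm">
            <param id="nativeSpecification">-P all_5hrs</param>
        </destination>
    </destinations>
</job_conf>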

If you’re still having trouble, let us know.

Best,
Marius


Re: SLURM configuration problem

Leonor Palmeira-2
Hi,

we modified our configuration as Marius suggested, but we still get the
following error. It is the same error we had seen earlier, which we had
been trying to fix by setting an $SGE_ROOT variable.

I don't know why this error pops up, as we are trying to use SLURM, not
SGE...

galaxy.jobs.runners.state_handler_factory DEBUG 2017-02-20 14:58:59,768 Loaded 'failure' state handler from module galaxy.jobs.runners.state_handlers.resubmit
pulsar.managers.util.drmaa DEBUG 2017-02-20 14:58:59,807 Initializing DRMAA session from thread MainThread
Traceback (most recent call last):
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/webapps/galaxy/buildapp.py", line 55, in paste_app_factory
    app = galaxy.app.UniverseApplication( global_conf=global_conf, **kwargs )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/app.py", line 170, in __init__
    self.job_manager = manager.JobManager( self )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/manager.py", line 23, in __init__
    self.job_handler = handler.JobHandler( app )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line 32, in __init__
    self.dispatcher = DefaultJobDispatcher( app )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line 723, in __init__
    self.job_runners = self.app.job_config.get_job_runner_plugins( self.app.config.server_name )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/__init__.py", line 687, in get_job_runner_plugins
    rval[id] = runner_class( self.app, runner[ 'workers' ], **runner.get( 'kwds', {} ) )
  File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/runners/drmaa.py", line 88, in __init__
    self.ds = DrmaaSessionFactory().get()
  File "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py", line 31, in get
    return DrmaaSession(session_constructor, **kwds)
  File "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py", line 49, in __init__
    DrmaaSession.session.initialize()
  File "/usr/local/lib/python2.7/dist-packages/drmaa/session.py", line 257, in initialize
    py_drmaa_init(contactString)
  File "/usr/local/lib/python2.7/dist-packages/drmaa/wrappers.py", line 73, in py_drmaa_init
    return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File "/usr/local/lib/python2.7/dist-packages/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
InternalException: code 1: Please set the environment variable SGE_ROOT.

Thanks a lot in advance
Leonor


Re: SLURM configuration problem

Marius van den Beek
Hi Leonor,

Are you sure that you are using a drmaa library that is compatible with
Slurm? This one, http://apps.man.poznan.pl/trac/slurm-drmaa, should work,
IIRC, or alternatively you can use Nate Coraor's fork here:
https://github.com/natefoo/slurm-drmaa.
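
One quick way to check which DRM system a given libdrmaa.so was actually built for, independently of Galaxy, is to load it with drmaa-python (a sketch; adjust the library path to wherever your slurm-drmaa is installed):

export DRMAA_LIBRARY_PATH=/var/lib/libdrmaa.so
python -c "import drmaa; s = drmaa.Session(); s.initialize(); print(s.drmaaImplementation); s.exit()"

If that prints something mentioning GridEngine/SGE rather than Slurm, the library itself is the culprit.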

Best,
Marius

Re: SLURM configuration problem

Leonor Palmeira-2
Hi Marius,

yes, we are using the one from Poznan. Should we try the fork instead?

Best
Leonor


Re: SLURM configuration problem

Marius van den Beek
It doesn't hurt to try this, but I don't think that will solve the problem.

Just to be sure, are the basics working? Can you submit jobs via sbatch?
How did you compile/install slurm-drmaa?

Also, it looks like drmaa-python is being used from /usr/local/... .
Are you running Galaxy in a virtualenv? It's strongly recommended to do
that. Starting Galaxy through run.sh will handle the creation of the
virtualenv and the installation of all necessary dependencies for you.

Finally, it looks like you're loading pulsar from /usr/local as well;
this is a bit messy. Please try getting the cluster submission to work
using run.sh first.
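
As a minimal sanity check from the Galaxy VM, something like the sketch below should show whether the basic plumbing works (the --wrap command is just a trivial placeholder job; paths are taken from your earlier messages):

# 1. Can this machine submit to the cluster at all?
sbatch --wrap="hostname"   # submit a trivial one-command job
squeue -u "$USER"          # watch it get scheduled and run

# 2. Start Galaxy via run.sh so it creates its own virtualenv and
#    installs its pinned dependencies (including drmaa-python),
#    instead of picking them up from /usr/local.
cd /home/mass/GAL/APP/galaxy
sh run.sh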


On 20 February 2017 at 15:24, Leonor Palmeira <[hidden email]> wrote:
Hi Marius,

yes, we are using the one from Poznan. Should we give it a try with the
fork?

Best
Leonor

Leonor Palmeira | PhD
Associate Scientist
Department of Human Genetics
CHU de Liège | Domaine Universitaire du Sart-Tilman
4000 Liège | BELGIQUE
Tél: <a href="tel:%2B32-4-366.91.41" value="+3243669141">+32-4-366.91.41
Fax: <a href="tel:%2B32-4-366.72.61" value="+3243667261">+32-4-366.72.61
e-mail: [hidden email]

On 02/20/2017 03:13 PM, Marius van den Beek wrote:
> Hi Leonor,
>
> Are you sure that you are using a drmaa library that is compatible with
> slurm?
> This http://apps.man.poznan.pl/trac/slurm-drmaa should work, IIRC,
> or alternatively you can use Nate Coraor's fork
> here https://github.com/natefoo/slurm-drmaa.
>
> Best,
> Marius
>
> On 20 February 2017 at 15:06, Leonor Palmeira <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>     Hi,
>
>     we modified our configuration as Marius suggested, but we still get the
>     following error. This is an error we had just before, and we were trying
>     to fix it by specifying an $SGE_ROOT variable.
>
>     I don't know why this error pops up, as we are trying to use SLURM, not
>     SGE...
>
>     galaxy.jobs.runners.state_handler_factory DEBUG 2017-02-20 14:58:59,768
>     Loaded 'failure' state handler from module
>     galaxy.jobs.runners.state_handlers.resubmit
>     pulsar.managers.util.drmaa DEBUG 2017-02-20 14:58:59,807
>     Initializing DRMAA
>     session from thread MainThread
>     Traceback (most recent call last):
>       File
>     "/home/mass/GAL/APP/galaxy/lib/galaxy/webapps/galaxy/buildapp.py",
>     line 55, in paste_app_factory
>         app = galaxy.app.UniverseApplication( global_conf=global_conf,
>     **kwargs )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/app.py", line 170, in
>     __init__
>         self.job_manager = manager.JobManager( self )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/manager.py", line
>     23, in
>     __init__
>         self.job_handler = handler.JobHandler( app )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line
>     32, in
>     __init__
>         self.dispatcher = DefaultJobDispatcher( app )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/handler.py", line
>     723, in
>     __init__
>         self.job_runners = self.app.job_config.get_job_runner_plugins(
>     self.app.config.server_name )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/__init__.py", line
>     687, in
>     get_job_runner_plugins
>         rval[id] = runner_class( self.app, runner[ 'workers' ],
>     **runner.get(
>     'kwds', {} ) )
>       File "/home/mass/GAL/APP/galaxy/lib/galaxy/jobs/runners/drmaa.py",
>     line
>     88, in __init__
>         self.ds = DrmaaSessionFactory().get()
>       File
>     "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py",
>
>     line 31, in get
>         return DrmaaSession(session_constructor, **kwds)
>       File
>     "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py",
>
>     line 49, in __init__
>         DrmaaSession.session.initialize()
>       File "/usr/local/lib/python2.7/dist-packages/drmaa/session.py",
>     line 257,
>     in initialize
>         py_drmaa_init(contactString)
>       File "/usr/local/lib/python2.7/dist-packages/drmaa/wrappers.py",
>     line 73,
>     in py_drmaa_init
>         return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
>       File "/usr/local/lib/python2.7/dist-packages/drmaa/errors.py",
>     line 151,
>     in error_check
>         raise _ERRORS[code - 1](error_string)
>     InternalException: code 1: Please set the environment variable SGE_ROOT.
>
>     Thanks a lot in advance
>     Leonor

Re: SLURM configuration problem

Leonor Palmeira-2
Dear all,

we are struggling with the basics of our Galaxy/SLURM configuration.

- Galaxy is installed on a virtual machine that is physically
independent of our cluster, but sits on a shared filesystem that is also
mounted on the Cluster

- Our Cluster is running SLURM and has 'slurm-drmaa' (Poznan version)
installed; a typical source build is sketched below. The shared filesystem
is mounted at the same mount point as on the VM, so the paths are identical
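
A typical source build of the Poznan slurm-drmaa is a standard autotools
sequence. A minimal sketch, assuming the Slurm development headers and
libraries are present on the build host; the version number and install
prefix are illustrative:

    tar xzf slurm-drmaa-1.0.7.tar.gz      # version illustrative
    cd slurm-drmaa-1.0.7
    ./configure --prefix=/usr/local       # prefix illustrative
    make
    sudo make install                     # installs libdrmaa.so under /usr/local/lib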

What do we need so that the Galaxy VM is able to submit jobs to the Cluster?
Currently, running "./run.sh" from the VM as root leads to the $SGE_ROOT
error I posted in my previous email, which ends like this:

  File "/usr/local/lib/python2.7/dist-packages/pulsar/managers/util/drmaa/__init__.py", line 49, in __init__
    DrmaaSession.session.initialize()
  File "/usr/local/lib/python2.7/dist-packages/drmaa/session.py", line 257, in initialize
    py_drmaa_init(contactString)
  File "/usr/local/lib/python2.7/dist-packages/drmaa/wrappers.py", line 73, in py_drmaa_init
    return _lib.drmaa_init(contact, error_buffer, sizeof(error_buffer))
  File "/usr/local/lib/python2.7/dist-packages/drmaa/errors.py", line 151, in error_check
    raise _ERRORS[code - 1](error_string)
InternalException: code 1: Please set the environment variable SGE_ROOT.
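
The SGE_ROOT message is emitted by the Grid Engine flavour of libdrmaa, which
suggests $DRMAA_LIBRARY_PATH is still resolving to an SGE library rather than
to the slurm-drmaa build. A quick check from the VM (library path illustrative):

    # Point at the slurm-drmaa library explicitly, then ask drmaa-python
    # which DRMAA implementation it actually loaded.
    export DRMAA_LIBRARY_PATH=/usr/local/lib/libdrmaa.so.1
    python -c "import drmaa; s = drmaa.Session(); s.initialize(); print(s.drmaaImplementation); s.exit()"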

Any help would be greatly appreciated.

Best
Leonor

Leonor Palmeira | PhD
Associate Scientist
Department of Human Genetics
CHU de Liège | Domaine Universitaire du Sart-Tilman
4000 Liège | BELGIQUE
Tél: +32-4-366.91.41
Fax: +32-4-366.72.61
e-mail: [hidden email]


Re: SLURM configuration problem

Marius van den Beek
Hello Leonor,

The log you've sent indicates that you're picking up pulsar from /usr/local/lib.
That should not happen if you're running Galaxy in a virtualenv.
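
A minimal sketch of that setup, using the install path mentioned earlier in
the thread (Galaxy's run.sh normally creates and activates ./.venv itself
when the virtualenv tool is available):

    cd /home/mass/GAL/APP/galaxy
    virtualenv .venv   # optional: run.sh creates .venv itself if it is missing
    sh run.sh          # installs Galaxy's dependencies into .venv, then starts Galaxy

Once Galaxy runs from .venv, the pulsar and drmaa packages under
/usr/local/lib/python2.7/dist-packages should no longer be picked up.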

Apart from that, you did not mention whether you are able to submit Slurm jobs from the command line.
That is a prerequisite for launching jobs through Galaxy.
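
A quick smoke test, run on the VM as the user Galaxy runs under (this assumes
the Slurm client tools are installed and configured on the VM):

    sbatch --wrap="hostname"   # submit a trivial job to the cluster
    squeue -u "$USER"          # the job should appear, run, and disappear

If this fails, the problem is in the Slurm client setup on the VM rather than
in Galaxy.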

Could you post the full startup logs and job_conf.xml file somewhere?
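
For comparison, a job_conf.xml along the lines suggested earlier in the thread
might look like the sketch below. The library path and partition name are
illustrative; note that Slurm's partition flag is a lowercase -p (the -P used
in the earlier config is a Grid Engine-style project flag):

    <?xml version="1.0"?>
    <job_conf>
        <plugins>
            <plugin id="slurm" type="runner" load="galaxy.jobs.runners.slurm:SlurmJobRunner">
                <!-- must point at the slurm-drmaa library, not a Grid Engine one -->
                <param id="drmaa_library_path">/usr/local/lib/libdrmaa.so.1</param>
            </plugin>
        </plugins>
        <handlers>
            <handler id="main"/>
        </handlers>
        <destinations default="slurm">
            <destination id="slurm" runner="slurm">
                <param id="nativeSpecification">-p all_5hrs</param>
            </destination>
        </destinations>
    </job_conf>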

Best,
Marius

___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/