Running MAKER using MPI over SGE scheduler with multiple nodes

Lior Glick
Dear MAKER users,

I am running MAKER to annotate a large plant genome. To improve performance, I use the MPI option as described in the documentation (specifically OpenMPI). The machines I currently have access to are part of a cluster that uses SGE as the job scheduler. There are about 15 machines, each with 20 cores. To run MAKER, I create job scripts that look something like this:

#!/bin/bash
#$ -N try_MAKER
#$ -S /bin/bash
#$ -e /path/to/err
#$ -o /path/to/out
#$ -pe openmpi-x86_64 20
cd /path/to/run_dir
mpiexec -n 20 --mca btl tcp,self maker

I then just qsub the file.
This works fine, but I'd like to use more than 20 cores, which means I need to use multiple nodes of the cluster. Simply increasing the number of requested cores (e.g. mpiexec -n 100) does not work; it keeps using 20 cores of a single node.
I see this is possible when the scheduler is PBS (using the nodes:ppn option), but I couldn't find any examples or instructions for SGE.
Can anyone help me figure it out? Has anyone done this on SGE?

Thanks a lot and best regards,
Lior


_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
Re: Running MAKER using MPI over SGE scheduler with multiple nodes

Carson Holt-2
Hi Lior,

I know it can be done, but I have never had to do it on SGE. It requires both that SGE be set up to allow multi-node jobs and that OpenMPI be set up to work with SGE. If that is not already the case, you may have to involve your IT manager.

There are a number of documentation sources on how to do this.
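
As a starting point, here is a minimal sketch of a multi-node job script, not a tested recipe. It assumes the parallel environment name from the original post (openmpi-x86_64), that the PE's allocation_rule lets slots spread over several hosts (qconf -sp openmpi-x86_64 shows the rule; a value of $pe_slots confines a job to one host), and that OpenMPI was built with Grid Engine support (ompi_info | grep gridengine should list a component). Note that the -pe request itself has to be raised, not only the -n value passed to mpiexec. The helper name summarize_hostfile is made up for illustration.

```shell
#!/bin/bash
# Sketch only: PE name copied from the original post; 100 slots is an example.
#$ -N try_MAKER
#$ -S /bin/bash
#$ -pe openmpi-x86_64 100

# Hypothetical helper: print one "host slots" line per granted host,
# followed by the total. Each $PE_HOSTFILE line has the form
# "hostname slots queue processor-range".
summarize_hostfile() {
    awk '{ printf "%s %d\n", $1, $2; total += $2 }
         END { printf "total %d\n", total }' "$1"
}

# SGE sets PE_HOSTFILE and NSLOTS only inside a parallel-environment job,
# so nothing below runs when the script is started outside the scheduler.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    # Shows whether the slots really span more than one node.
    summarize_hostfile "$PE_HOSTFILE"
    cd /path/to/run_dir || exit 1
    # Use the slot count SGE actually granted instead of a hard-coded number.
    mpiexec -n "$NSLOTS" --mca btl tcp,self maker
fi
```

If the summary in the job's output file lists a single host, or the job never leaves the queue, the PE configuration is the first thing to raise with the cluster administrator.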

Alternatively, submit multiple MAKER jobs, each to a single node. All jobs can write to the same output directory while using different input FASTA files (i.e. chunks of the original FASTA) if you pass the -base and -g command line options to maker. You can chunk the FASTA using fasta_tool (bundled with MAKER) and its --chunk option.

Example:
fasta_tool --chunk 10 assembly.fasta

# job 1
mpiexec -n 20 maker -base assembly -g assembly_00.fasta

# job 2
mpiexec -n 20 maker -base assembly -g assembly_01.fasta

# job 3
mpiexec -n 20 maker -base assembly -g assembly_02.fasta

# and so on …
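
The per-chunk jobs above could be generated rather than written by hand. The sketch below is one possible way to do that, under stated assumptions: the chunk files match assembly_*.fasta as in the example, the PE name and the --mca flags are copied from the original post, and write_job_script is a hypothetical helper, not part of MAKER or SGE.

```shell
#!/bin/bash
# Hypothetical helper: print an SGE job script that runs MAKER on one chunk.
# Arguments: chunk FASTA file, slots per job, shared -base name.
write_job_script() {
    local chunk=$1 slots=$2 base=$3
    cat <<EOF
#!/bin/bash
#\$ -N maker_$(basename "$chunk" .fasta)
#\$ -S /bin/bash
#\$ -cwd
#\$ -pe openmpi-x86_64 $slots
mpiexec -n $slots --mca btl tcp,self maker -base $base -g $chunk
EOF
}

# Write one job script per chunk found in the current directory.
for chunk in assembly_*.fasta; do
    [ -e "$chunk" ] || continue   # the glob matched nothing
    script="job_$(basename "$chunk" .fasta).sh"
    write_job_script "$chunk" 20 assembly > "$script"
    echo "wrote $script (submit with: qsub $script)"
done
```

Run it from the directory that holds the chunks and the MAKER control files, then qsub each generated script; every job uses the same -base name, so they share one output directory as described above.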


Thanks,
Carson


On Aug 8, 2018, at 5:51 AM, Lior Glick <[hidden email]> wrote:
> [original message, quoted in full above]