Fwd: maker MPI problem


Carson Holt-2
Just forwarding this to the list.

--Carson

From: Carson Holt <[hidden email]>
Date: August 14, 2017 at 2:00:11 PM MDT
To: zl c <[hidden email]>
Subject: Re: [maker-devel] maker MPI problem

Yes. You can delete them.

Also, I notice this library is mentioned in the segfault: libpthread.so

MAKER doesn’t use pthreads, so I’m surprised it’s showing up in an error. You could try installing a separate version of perl without pthread support and running MAKER with that (pthreads is optional for perl). It may remove an OpenMPI/perl incompatibility happening on your system.
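A quick way to see how a given perl was built is to ask perl itself. The sketch below is only a first check: `perl -V:usethreads` reports the build-time thread setting, which is not necessarily the same as whether the binary links libpthread. The helper name `perl_has_threads` and the `PERL_BIN` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given perl reports a threaded build.
# `perl -V:usethreads` prints a line such as usethreads='define'; or usethreads='undef';
perl_has_threads() {
    case "$("$1" -V:usethreads 2>/dev/null)" in
        *"'define'"*) return 0 ;;
        *) return 1 ;;
    esac
}

# PERL_BIN is an assumed variable; it defaults to the perl found on PATH.
perl_bin=${PERL_BIN:-perl}
if perl_has_threads "$perl_bin"; then
    echo "$perl_bin: built with thread support"
else
    echo "$perl_bin: no thread support reported"
fi
```

Running it with `PERL_BIN` pointed at a separately installed perl shows whether that build differs from the system one.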

—Carson


On Aug 14, 2017, at 1:50 PM, zl c <[hidden email]> wrote:

Maker dies.

I've set LD_PRELOAD before install.

I'll try the option.

Can I remove the .NFS files before rerunning?

Thanks,
Zelin



On Mon, Aug 14, 2017 at 3:35 PM, Carson Holt <[hidden email]> wrote:
Is the issue that your cluster dies or that MAKER dies? (i.e. I want to know if this is an issue with your cluster or just an issue running MAKER)

I see in the file that you are getting segfaults, which should not crash the cluster but would kill MAKER. They would indicate either an installation problem or a problem with the command-line configuration.

You may need to recompile while the LD_PRELOAD value is set (it must be set during MAKER install and whenever you run with OpenMPI). Or you may still have the native infiniband communication active (causes segfaults with system calls).
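Since the value has to be set both at install time and at run time, a small guard at the top of the job script can catch a missing setting early. This is a minimal sketch, assuming LD_PRELOAD holds a single library path; `check_preload` is a hypothetical helper, not part of MAKER or OpenMPI.

```shell
#!/bin/sh
# Hypothetical helper: takes the LD_PRELOAD value as an argument and
# checks that it is non-empty and names an existing file.
check_preload() {
    if [ -z "$1" ]; then
        echo "LD_PRELOAD is not set" >&2
        return 1
    fi
    if [ ! -f "$1" ]; then
        echo "LD_PRELOAD points to a missing file: $1" >&2
        return 1
    fi
    echo "LD_PRELOAD ok: $1"
}

# Warn (but do not abort) if the current environment fails the check.
check_preload "${LD_PRELOAD:-}" ||
    echo "set LD_PRELOAD before installing or running MAKER under OpenMPI" >&2
```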

You can try this (to use IP over InfiniBand instead; it works only if ib0 exists, or set it to eth0 if eth0 exists): '--mca btl vader,tcp,self --mca btl_tcp_if_include ib0'

That would replace the '-mca btl ^openib'
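Put together with the original command from the first message, the substitution might look like the sketch below. It only prints the command (a dry run) rather than launching it. `pick_iface` is a hypothetical helper, and looking for interfaces under /sys/class/net is a Linux-specific assumption.

```shell
#!/bin/sh
# Hypothetical helper: print ib0 if that interface exists, else eth0;
# fail if neither is present. The directory argument exists for testing.
pick_iface() {
    netdir=${1:-/sys/class/net}
    for ifc in ib0 eth0; do
        if [ -e "$netdir/$ifc" ]; then
            echo "$ifc"
            return 0
        fi
    done
    return 1
}

# Dry run: show the mpiexec line with the TCP flags in place of '-mca btl ^openib'.
if ifc=$(pick_iface); then
    echo mpiexec --mca btl vader,tcp,self --mca btl_tcp_if_include "$ifc" \
        -n "${SLURM_NTASKS:-1}" maker -c 1 -base genome -g genome.fasta
else
    echo "neither ib0 nor eth0 found" >&2
fi
```

Dropping the `echo` runs the command for real; LD_PRELOAD and OMPI_MCA_mpi_warn_on_fork would still need to be exported as in the original job script.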

Also make sure you can run maker on a single node under MPI before trying to work across nodes, then try on two nodes for your first test.

The NFSLock files are file locks that are not cleaned up on a hard failure.
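For clearing those leftovers before a rerun, something like the following could work. It is a sketch that assumes the locks are regular files whose names start with `.NFSLock` (as in the example filename quoted in this thread) and that no MAKER job is still running on the directory. The function names and the `RUN_DIR` variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: list or remove stale lock files under a directory.
list_nfs_locks() {
    find "$1" -type f -name '.NFSLock*' -print
}

remove_nfs_locks() {
    find "$1" -type f -name '.NFSLock*' -exec rm -f {} +
}

# By default only list what would be removed; RUN_DIR is an assumed variable
# naming the directory MAKER was run in (defaults to the current directory).
list_nfs_locks "${RUN_DIR:-.}"
```

After reviewing the listing, calling `remove_nfs_locks "$RUN_DIR"` deletes the same set of files.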

—Carson





On Aug 14, 2017, at 1:22 PM, zl c <[hidden email]> wrote:

It's in the attached file.

Besides, I see there are lots of .NFS... files, like:
.NFSLock..NFSLock.genomedb.NFSLock.share.tmp.2247.26272.7466.34069868502337

--------------------------------------------
Zelin Chen [[hidden email]]

NIH/NHGRI
Building 50, Room 5531
50 SOUTH DR, MSC 8004 
BETHESDA, MD 20892-8004

On Mon, Aug 14, 2017 at 3:18 PM, Carson Holt <[hidden email]> wrote:
This is rather vague: “crashed the computer cluster”

Do you have a specific error?

—Carson



On Aug 14, 2017, at 12:59 PM, zl c <[hidden email]> wrote:

Hello,

 

I ran maker 3.0 with openmpi 2.0.2 and it crashed the computer cluster. I attached the log file. Could you help me to solve the problem?

 

CMD:

export LD_PRELOAD=/usr/local/OpenMPI/2.0.2/gcc-6.3.0/lib/libmpi.so

export OMPI_MCA_mpi_warn_on_fork=0

mpiexec -mca btl ^openib -n $SLURM_NTASKS maker -c 1 -base genome -g genome.fasta

 

Thanks,

Zelin Chen

 
--------------------------------------------
Zelin Chen [[hidden email]]  Ph.D.

NIH/NHGRI
Building 50, Room 5531
50 SOUTH DR, MSC 8004 
BETHESDA, MD 20892-8004
<run05.mpi.o47346077>
_______________________________________________
maker-devel mailing list
[hidden email]
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

