From: Carson Holt <[hidden email]>
Date: August 14, 2017 at 2:00:11 PM MDT
To: zl c <[hidden email]>
Subject: Re: [maker-devel] maker MPI problem
Yes. You can delete them.
Also, I notice this library mentioned in the segfault: libpthread.so
MAKER doesn’t use pthreads, so I’m surprised it’s showing up in an error. You could try installing a separate version of perl without pthread support and running MAKER with that (pthreads is optional for perl). It may remove an OpenMPI/perl incompatibility happening on your system.
Is the issue that your cluster dies or that MAKER dies? (i.e. I want to know if this is an issue with your cluster or just an issue running MAKER)
I see in the file that you are getting segfaults, which should not crash the cluster but would kill MAKER. They would indicate either an installation problem or a misconfigured command-line option.
You may need to recompile while the LD_PRELOAD value is set (it must be set during MAKER install and whenever you run with OpenMPI). Or you may still have the native InfiniBand communication active (it causes segfaults with system calls).
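A quick way to rule out the LD_PRELOAD problem is to check the variable in the same shell before installing and before each run. The sketch below is only an illustration: the function name check_preload and the libmpi.so path in the comment are placeholders, and it assumes LD_PRELOAD holds a single library path rather than a list.

```shell
#!/bin/sh
# Report whether LD_PRELOAD is set and names an existing file.
# Assumes a single path in LD_PRELOAD (not a space- or colon-separated list).
check_preload() {
    if [ -z "${LD_PRELOAD:-}" ]; then
        echo "LD_PRELOAD is not set"
        return 1
    fi
    if [ ! -e "$LD_PRELOAD" ]; then
        echo "LD_PRELOAD points to a missing file: $LD_PRELOAD"
        return 1
    fi
    echo "LD_PRELOAD ok: $LD_PRELOAD"
}

# Placeholder path; substitute the real location of libmpi.so on your system.
# export LD_PRELOAD=/path/to/openmpi/lib/libmpi.so
check_preload || echo "set LD_PRELOAD before installing and before running MAKER" >&2
```

If the check fails in the shell that launches the job, a recompile with the variable exported is the first thing to try.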
You can try this to do IP over InfiniBand instead (it works only if ib0 exists; set it to eth0 if eth0 exists instead): '--mca btl vader,tcp,self --mca btl_tcp_if_include ib0'
That would replace the '-mca btl ^openib'
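To make the interface choice concrete, here is a rough sketch that picks ib0 when it exists, falls back to eth0, and prints the resulting command line instead of running it. The helper names pick_iface and build_cmd and the process count of 4 are placeholders for illustration, and the /sys/class/net lookup assumes a Linux node.

```shell
#!/bin/sh
# Pick the interface for Open MPI's TCP transport: ib0 if present, else eth0.
# /sys/class/net is Linux-specific; adjust the check for other systems.
pick_iface() {
    if [ -e /sys/class/net/ib0 ]; then
        echo ib0
    elif [ -e /sys/class/net/eth0 ]; then
        echo eth0
    else
        return 1
    fi
}

# Build the mpiexec command line for a given interface and process count.
build_cmd() {
    iface=$1
    nprocs=$2
    echo "mpiexec -n $nprocs --mca btl vader,tcp,self --mca btl_tcp_if_include $iface maker"
}

# Print the command for review; the process count (4) is a placeholder.
if iface=$(pick_iface); then
    build_cmd "$iface" 4
else
    echo "neither ib0 nor eth0 found; set the interface by hand" >&2
fi
```

Printing the command first makes it easy to confirm which interface was chosen before launching anything.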
Also make sure you can run MAKER on a single node under MPI before trying to work across nodes, then try on two nodes for your first multi-node test.
The NFSLock files are file locks that are not cleaned up on a hard failure.
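If you want to see what was left behind before deleting anything, a dry-run listing along these lines may help. It is a sketch under an assumption: it matches any file with "NFSLock" in its name, so check that pattern against the files you actually see. The function name list_stale_locks is a placeholder.

```shell
#!/bin/sh
# List regular files whose names contain "NFSLock" under a directory.
# The name pattern is an assumption; verify it against your output directory.
list_stale_locks() {
    find "$1" -type f -name '*NFSLock*' -print
}

# Dry run on the current directory: review the list before deleting anything.
list_stale_locks .
```

Only remove the listed files once no MAKER process is still running against that directory, since a live job may hold some of those locks.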